textscan skips desired whitespace at the beginning of each line - MATLAB

To run a certain program I use .txt input files, which I need to manipulate with MATLAB.
I know how to do that and didn't expect problems. Since it was not working, I reduced my manipulation script to a minimum, so that nothing is actually changed except some whitespace, and the other program seems to be very sensitive to that.
Parts of my file look like this:
...
*CONTROL_TERMINATION
$# endtim endcyc dtmin endeng endmas
1.000000 0 0.000 0.000 0.000
*CONTROL_TIMESTEP
$# dtinit tssfac isdo tslimt dt2ms lctm erode ms1st
0.000 0.900000 0 0.000 -1.000E-4 0 0 0
$# dt2msf dt2mslc imscl
0.000 0 0
...
I load it into MATLAB and save it again directly, without changes:
% read original file
fid = fopen('filename.txt','r');
param = textscan(fid,'%s','delimiter','\n');
rows = param{1,1};
fclose(fid);
% overwrite to new file
fid = fopen('filename.txt','w');
fprintf(fid, '%s\r\n', rows{:});
fclose(fid);
The output file lacks the whitespace at the beginning of every line; that seems to be the only difference between the input and output files (at least I hope so).
...
*CONTROL_TERMINATION
$# endtim endcyc dtmin endeng endmas
1.000000 0 0.000 0.000 0.000
*CONTROL_TIMESTEP
$# dtinit tssfac isdo tslimt dt2ms lctm erode ms1st
0.000 0.900000 0 0.000 -1.000E-4 0 0 0
$# dt2msf dt2mslc imscl
0.000 0 0
...
It seems weird to me that this should be the reason, but what can I change so that both files look 100% identical? The difficulty is that the leading whitespace has a different length from line to line.

You can use the 'whitespace' option of textscan and set it to an empty string:
param = textscan(fid,'%s','delimiter','\n','whitespace','');
By default, textscan does not include leading whitespace characters in the processing of any data field, which is why they are stripped (see the textscan documentation).
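A quick way to check the effect is a round trip on a small scratch file. The sketch below is only an illustration: the file name comes from tempname, and the two sample lines (with made-up leading blanks) are assumptions, not the asker's data.

```matlab
% Round-trip check on a scratch file (hypothetical sample lines).
original = {'*CONTROL_TERMINATION'; '  1.000000         0     0.000'};
src = [tempname '.txt'];
fid = fopen(src, 'w');
fprintf(fid, '%s\n', original{:});
fclose(fid);

% Read every line as one string; with 'whitespace' set to '' nothing
% is treated as skippable, so leading blanks stay in the field.
fid = fopen(src, 'r');
param = textscan(fid, '%s', 'delimiter', '\n', 'whitespace', '');
fclose(fid);
delete(src);
rows = param{1};

% Compare what was read with what was written.
same = isequal(rows, original);
disp(same)
```

If the files have to be byte-identical, note that the script in the question writes '\r\n' line endings, which would differ from the input if that file uses '\n' alone.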

Related

How to extract data set from a text file?

I am quite new to Unix and am currently trying to extract data sets from a text file. I tried sed, grep and awk, but they only seem to work for extracting single lines, whereas I want to extract an entire data set. Here is an example of a file from which I'd like to extract the two data sets (the figures after the lines starting with "R.Time"):
[Header]
Application Name LabSolutions
Version 5.87
Data File Name C:\LabSolutions\Data\Antoine\170921_AC_FluoSpectra\069_WT3a derivatized lignin LiCl 430_GPC_FOREVER_430_049.lcd
Output Date 2017-10-12
Output Time 12:07:32
[Configuration]
Instrument Name BOTAN127-Instrument1
Instrument # 1
Line # 1
# of Detectors 3
Detector ID Detector A Detector B PDA
Detector Name Detector A Detector B PDA
# of Channels 1 1 2
[LC Chromatogram(Detector A-Ch1)]
Interval(msec) 500
# of Points 9603
Start Time(min) 0,000
End Time(min) 80,017
Intensity Units mV
Intensity Multiplier 0,001
Ex. Wavelength(nm) 405
Em. Wavelength(nm) 430
R.Time (min) Intensity
0,00000 -709779
0,00833 -709779
0,01667 17
0,02500 3
0,03333 7
0,04167 19
0,05000 9
0,05833 5
0,06667 2
0,07500 24
0,08333 48
[LC Chromatogram(Detector B-Ch1)]
Interval(msec) 500
# of Points 9603
Start Time(min) 0,000
End Time(min) 80,017
Intensity Units mV
Intensity Multiplier 0,001
R.Time (min) Intensity
0,00000 149
0,00833 149
0,01667 -1
I would greatly appreciate any idea. Thanks in advance.
Antoine
awk '/^[^0-9]/&&d{d=0} /R.Time/{d=1}d' file
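Brief explanation:
d is a flag that determines whether the current line is printed
/^[^0-9]/&&d{d=0}: if the line does not start with a digit and d is set, clear d
/R.Time/{d=1}: if the line contains "R.Time", set d
the trailing d is a pattern without an action, so the line is printed whenever d is non-zero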
awk '/R.Time/,/LC/' file|grep -v -E "R.Time|LC"
The grep part removes the R.Time and LC lines that the awk range prints along with the data.
I think it's a job for sed.
sed '/R.Time/!d;:A;N;/\n$/!bA' infile

Matlab - read unstructured file

I'm quite new to MATLAB and I've been searching, unsuccessfully, for a solution to the following issue: I have an unstructured txt file with several rows I don't need, but a number of rows inside that file have a structured format. I've been researching how to "load" the file to edit it, but cannot find anything.
Since I don't know if I was clear, let me show you the content of the file:
8782 PROJCS["UTM-39",GEOGC.......
1 676135.67755473056 2673731.9365976951 -15 0
2 663999.99999999302 2717629.9999999981 -14.00231124135486 3
3 709999.99999999162 2707679.2185399458 -10 2
4 679972.20003752434 2674637.5679516452 0.070000000000000007 1
5 676124.87132483651 2674327.3183533219 -18.94794942571912 0
6 682614.20527054626 2671000.0000000549 -1.6383425512446661 0
...........
8780 682247.4593014461 2676571.1515358146 0.1541080392180566 0
8781 695426.98657108378 2698111.6168302582 -8.5039945992245904 0
8782 674723.80100125563 2675133.5486935056 -19.920312922947179 0
16997 3 21
1 2147 658 590
2 1855 2529 5623
.........
I'd appreciate it if someone could tell me whether it is possible to open the file and then load only the rows from the one starting with 1 to the one starting with 8782. The first row and all the others are not important.
I know that manually copying and pasting into a new file would be a solution, but I'd like to know how to read the file and edit it, for other ideas I have.
Thanks!
% Now lines{i} is the string of the i'th line.
lines = strsplit(fileread('filename'), '\n')
% Now elements{i}{j} is the j'th field of the i'th line.
elements = arrayfun(@(x){strsplit(x{1}, ' ')}, lines)
% Remove the first row:
elements(1) = []
% Take the first several rows:
n_rows = 8782
elements = elements(1:n_rows)
Or if the number of rows you need to take is not fixed, you can replace the last two statements above by:
firsts = arrayfun(@(x)str2num(x{1}{1}), elements)
n_rows = find((firsts(2:end) - firsts(1:end-1)) ~= 1, 1, 'first')
elements = elements(1:n_rows)
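If only the numeric block is needed, a shorter route may be to let textscan parse it directly. This is a sketch under assumptions: that the first number on the header line (8782 in the question) is the count of rows to read, and that each of those rows has exactly five numeric fields. The scratch file below is a shortened, made-up stand-in for the real one.

```matlab
% Build a small stand-in file: header, 3 node rows, then unrelated rows.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '3 PROJCS["UTM-39"]\n');
fprintf(fid, '1 676135.67 2673731.93 -15 0\n');
fprintf(fid, '2 663999.99 2717629.99 -14.0 3\n');
fprintf(fid, '3 709999.99 2707679.21 -10 2\n');
fprintf(fid, '5 3 21\n');
fprintf(fid, '1 2147 658 590\n');
fclose(fid);

fid = fopen(fname, 'r');
header = fgetl(fid);                 % consume the first (unwanted) line
n_rows = sscanf(header, '%d', 1);    % ASSUMPTION: leading number = row count
% Apply the five-field format exactly n_rows times, then stop reading.
data = textscan(fid, '%f %f %f %f %f', n_rows, 'CollectOutput', true);
fclose(fid);
delete(fname);

nodes = data{1};                     % n_rows-by-5 numeric matrix
disp(size(nodes))
```

Compared with splitting every line into strings, this yields a numeric matrix directly, at the cost of relying on the row count in the header.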

MATLAB avoid matrix wrapping in command window

Is there a way to prevent MATLAB from wrapping matrices into multiple chunks when displaying them in the command window? Here's what I mean:
>> x = rand(10,1);
>> y = rand(10,1);
>> c = squareform(pdist([x y]))
c =
Columns 1 through 6
0 0.9160 0.4707 0.7161 0.6093 0.1555
0.9160 0 0.8495 0.8984 0.6463 1.0714
0.4707 0.8495 0 0.2459 0.2477 0.5541
0.7161 0.8984 0.2459 0 0.2603 0.7970
0.6093 0.6463 0.2477 0.2603 0 0.7306
0.1555 1.0714 0.5541 0.7970 0.7306 0
0.0881 0.9695 0.4311 0.6762 0.6012 0.1295
0.4698 0.4566 0.4587 0.6057 0.3612 0.6245
0.2442 1.1079 0.7006 0.9460 0.8534 0.1629
0.8282 0.1355 0.7200 0.7629 0.5114 0.9832
Columns 7 through 10
0.0881 0.4698 0.2442 0.8282
0.9695 0.4566 1.1079 0.1355
0.4311 0.4587 0.7006 0.7200
0.6762 0.6057 0.9460 0.7629
0.6012 0.3612 0.8534 0.5114
0.1295 0.6245 0.1629 0.9832
0 0.5156 0.2700 0.8736
0.5156 0 0.6857 0.3588
0.2700 0.6857 0 1.0359
0.8736 0.3588 1.0359 0
I'd like to be able to copy and paste the matrix c (into a LaTeX document, say, or a MATLAB script) but this is obviously cumbersome with the current output format, especially for larger matrices.
You could do fprintf([repmat('%f\t', 1, size(c, 2)) '\n'], c');, which gave this output:
0.000000 0.818064 1.054641 0.342287 0.668041 0.717356 0.597756 0.804045 0.650459 0.815819
0.818064 0.000000 0.778921 0.485276 0.322136 1.157594 0.833495 0.363079 0.185730 0.060130
1.054641 0.778921 0.000000 0.917058 0.529164 0.815812 0.556431 0.421934 0.846744 0.837905
0.342287 0.485276 0.917058 0.000000 0.422061 0.885196 0.638057 0.565268 0.309989 0.476907
0.668041 0.322136 0.529164 0.422061 0.000000 0.848242 0.518164 0.143653 0.325248 0.368679
0.717356 1.157594 0.815812 0.885196 0.848242 0.000000 0.333280 0.894846 1.078174 1.191962
0.597756 0.833495 0.556431 0.638057 0.518164 0.333280 0.000000 0.562174 0.773488 0.871944
0.804045 0.363079 0.421934 0.565268 0.143653 0.894846 0.562174 0.000000 0.428803 0.420291
0.650459 0.185730 0.846744 0.309989 0.325248 1.078174 0.773488 0.428803 0.000000 0.167448
0.815819 0.060130 0.837905 0.476907 0.368679 1.191962 0.871944 0.420291 0.167448 0.000000
But it's probably easier to use the variable explorer as mentioned in the comments.
As I mentioned in my comment, I don't think there's a way to change the command line output. If you don't need a programmatic solution you can utilize the variable explorer to interact with your data using a slightly Excel-ish interface.
You can access the variable explorer by double clicking on your variable in the workspace browser, right clicking on your variable and selecting Open, selecting your variable and hitting ctrl+D (on Windows), or programmatically using openvar.
If you do need a programmatic solution, you can use one of the many exporting functions (sprintf, fprintf, save, etc.), one example being the answer that @badjr posted.
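For the two paste targets named in the question, a minimal sketch along the same lines might look as follows. The 3-by-3 matrix is a made-up stand-in for c, and the four-digit precision is an arbitrary choice.

```matlab
% Small stand-in for the distance matrix c (hypothetical values).
c = [0 0.916 0.4707; 0.916 0 0.8495; 0.4707 0.8495 0];

% One-line literal that can be pasted back into a MATLAB script.
s = mat2str(c, 4);
disp(s)

% LaTeX-style rows: ' & ' between columns, '\\' at the end of each row.
% fprintf/sprintf consume the data column by column, hence the transpose.
rowfmt = [repmat('%.4f & ', 1, size(c, 2) - 1) '%.4f \\\\\n'];
tex = sprintf(rowfmt, c');
disp(tex)
```

Both strings are produced without any column wrapping, regardless of the width of the command window.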

Gnuplot: reading a file that is not in the locale encoding

I want to plot the data of an ISO_8859_1 encoded file (two columns of numbers). These are the first 10 data points of the file:
#Pe2
1 0.8000
2 0.8000
3 0.8000
4 0.8000
5 0.8000
6 0.8000
7 0.8000
8 0.8000
9 0.8000
10 0.8000
The original file has 15000 data points. I create this data with MATLAB, explicitly setting ISO_8859_1 encoding, so I am sure that is the encoding. This is a snippet of the MATLAB code:
slCharacterEncoding('ISO-8859-1'); %Instruction before writing anything to the file.
fprintf(fileID,' %7d %7.4f',Tempo(i),y(i)); %For loop in this instruction
fprintf(fileID,'\r'); %Closing the file
fclose(fileID);
This is the script that I run. This file is encoded with the default Windows txt files encoding:
set encoding iso_8859_1
set terminal wxt size 1000,551
# Line width of the axes
set border linewidth 1.5
# Line styles
set style line 1 lc rgb '#dd181f' lt 1 lw 1 pt 0 # red
# Axes label
set xlabel 'tiempo'
set ylabel 'valor'
plot 'Pe2.txt' with lines ls 1
This is the output of the gnuplot console when I run the script. After that I input "show encoding":
G N U P L O T
Version 4.6 patchlevel 5 last modified February 2014
Build System: MS-Windows 32 bit
Copyright (C) 1986-1993, 1998, 2004, 2007-2014
Thomas Williams, Colin Kelley and many others
gnuplot home: http://www.gnuplot.info
faq, bugs, etc: type "help FAQ"
immediate help: type "help" (plot window: hit 'h')
Terminal type set to 'wxt'
gnuplot> cd 'C:\Example'
gnuplot> load 'script.txt'
"script.txt", line 10: warning: Skipping data file with no valid points
gnuplot> plot 'Pe2.txt' with lines ls 1
^
"script.txt", line 10: x range is invalid
gnuplot> show encoding
nominal character encoding is iso_8859_1
however LC_CTYPE in current locale is Spanish_Spain.1252
gnuplot>
If I open the file, make a change, undo it and save the file, gnuplot plots it. I guess that is because the editor saves it with the local encoding, which is the one gnuplot uses to read files.
How do I plot files with gnuplot that are not in the local encoding?
I also have what seems to be a similar problem when I write a file from C# (VS2010). If I don't explicitly set the culture with:
Thread.CurrentThread.CurrentUICulture = CultureInfo.GetCultureInfo("en-US");
Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo("en-US");
I am not able to save a file which gnuplot is able to plot. I believe this last problem is caused by the decimal separator ("," versus ".").
In C# I save the files with this:
StreamWriter Writer = new StreamWriter(dir + @"\" + (k+1) + "_" + nombre + extension);
Writer.WriteLine("#" + (k+1) + "_" + nombre);
Writer.WriteLine();
Writer.WriteLine("{0,32} {1,32}", "#tiempo", "#valor");
for (int i = 0; i < tiempo.GetLength(0); i++)
{
Writer.WriteLine("{0,32} {1,32}", tiempo[i].ToString(), valor[i, k]);
}
Thank you.
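Your file uses only carriage returns (\r, 0x0D) as line breaks, which does not work with gnuplot. Use line feeds (\n, 0x0A) instead; \r\n also works. Re-saving the file from an editor presumably helps because the editor rewrites the line endings, not because of the encoding.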
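An already written file can be repaired by rewriting its line breaks. The following is a sketch, not the asker's code: the scratch file stands in for Pe2.txt and holds three made-up rows written with CR-only line breaks, as the snippet in the question appears to do.

```matlab
% Stand-in for Pe2.txt, written with CR-only line breaks (hypothetical data).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '#Pe2\r');
for i = 1:3
    fprintf(fid, ' %7d %7.4f\r', i, 0.8);
end
fclose(fid);

% Convert line breaks: CRLF first, then any remaining lone CR, to LF.
txt = fileread(fname);
txt = strrep(txt, char([13 10]), char(10));
txt = strrep(txt, char(13), char(10));
fid = fopen(fname, 'w');
fprintf(fid, '%s', txt);
fclose(fid);

fixed = fileread(fname);   % re-read to inspect the result
delete(fname);             % remove the scratch file
```

When generating the file in the first place, ending each row with '\n' (or '\r\n') in the fprintf loop avoids the conversion step altogether.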

CSV import, only one column (Matlab/Octave)

For several days I have had problems reading my measurement CSV files and doing some simple calculations with them. I hope someone can help me.
My aim
Read a CSV data file like the following:
Opened with Excel:
date: 20140202 time: 083736 Cycles total: 74127 T_zer: 56 T_op1: 90.000
Actu state: stoppes ! T1: -23 T2: -12 T3: -32 T4: -65
*-*
324203 0 34724 0 0 0 2
431040 0 0 0 0 0 1
230706 0 0 0 0 0 1
340810 0 0 0 0 0 1
..............
....
.
--> Here is my first question: if I open the file with an editor, I can only see one delimiter, ";". But shouldn't there be two, one for rows and one for columns? How can Excel separate the data correctly into rows and columns if there is only ";"?
Anyway, I then tried to csvread this file with Octave. The data gets into Octave, but everything ends up in a single column. It would be very convenient if Octave could read it into a matrix with 7 columns, because then I could handle the data easily.
Here is my code:
clc
clear all
[fname,pname] =uigetfile();
fname;
extra="/";
pname;
b=strcat(pname,extra,fname);
m = csvread(b);
Result:
m is a 4003x1 double. 4003 is correct, but everything is in one column :/
m =
0
0
0
454203
561040
340706
I have been trying to solve this problem for several days, without result.
I am not an Octave expert, but it looks like you can use the dlmread function to read CSV files. It has several parameters that can help you read the file correctly, so that you can:
start reading the data from row X (and not from the start)
read only Y columns
define the separator between fields
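As a sketch of how those parameters fit together, assuming the separator is ";" (as the asker sees in the editor) and that the numeric data starts after three header lines (as in the Excel view), the call could look like this. The scratch file and its contents are a made-up stand-in for the measurement file.

```matlab
% Small stand-in file: three text header lines, then ';'-separated numbers.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'date:;20140202;time:;083736\n');
fprintf(fid, 'Actu state:;stoppes !\n');
fprintf(fid, '*-*\n');
fprintf(fid, '324203;0;34724;0;0;0;2\n');
fprintf(fid, '431040;0;0;0;0;0;1\n');
fclose(fid);

% Separator ';', start at row offset 3 and column offset 0 (both zero-based),
% so the three header lines are skipped.
m = dlmread(fname, ';', 3, 0);
delete(fname);
disp(size(m))
```

The single column in the question is consistent with csvread splitting on "," while the file is separated by ";".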