Data separation by particular rows in Matlab - matlab

I am relatively new to using Matlab and I don't have much knowledge about programming either. For a project I am working on currently I need to process a lot of data which is logged using the following format.
$GPRMC,202124.985,V,,,,,,,091112,,,N*44
2038,4674,4667,5593,3379
2087,5133,5111,6084,3372
2138,5134,5114,6080,3376
2188,5133,5114,6084,3377
2238,5130,5113,6084,3410
2287,5134,5113,6080,3416
2337,5133,5110,6080,3417
2387,5133,5110,6084,3416
2438,5130,5113,6081,3396
2487,5132,5110,6080,3410
$GPRMC,202125.985,V,,,,,,,091112,,,N*45
2985,5130,5113,6085,3988
3035,5130,5118,6084,4541
3085,5138,5113,6082,5186
3135,5130,5114,6081,6001
3185,5134,5110,6084,6311
3234,5134,5113,6084,6319
3284,5131,5114,6084,6316
3339,5131,5110,6084,6260
3389,5130,5114,6080,6178
3438,5134,5110,6085,6077
$GPRMC,202126.985,V,,,,,,,091112,,,N*46
3942,5131,5114,6085,5916
3992,5130,5110,6084,5917
4042,5133,5110,6084,5950
4091,5131,5114,6080,5996
4142,5134,5114,6085,6062
4192,5134,5114,6084,6129
4242,5134,5110,6080,6150
4291,5130,5110,6079,6186
4341,5130,5110,6089,6246
4391,5130,5118,6083,6266
It continues like this until the end of the file. What I want to do is separate the data so that all the '$GPRMC' strings (rows) are listed together as text (not separated) in one file or array, while all the other (numerical) rows are listed together in another file or array (comma separated is desirable). Is this even possible? If it is, can you please give me some pointers?

Not quite sure what you mean by separated or not separated. If you copy the text you posted into some file like testf.dat, a simple script like this using fopen, fprintf, and fgets might be what you're looking for:
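infile = fopen('testf.dat');
outf1 = fopen('GPRMC.dat','w');
outf2 = fopen('nums.dat','w');
tline = fgets(infile); % fgets keeps the newline, so lines can be written back as-is
while ischar(tline) % fgets returns -1 (not a char) at end of file
    if strncmp(tline,'$GPRMC',6) % safe even when the line is shorter than 6 characters
        fprintf(outf1,'%s',tline); % '%s' keeps tline from being parsed as a format string
    else
        fprintf(outf2,'%s',tline);
    end
    tline = fgets(infile);
end
fclose(infile);
fclose(outf1);
fclose(outf2);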
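If arrays are preferred over files, the same split can be sketched in memory. This is only an illustration under a few assumptions: the sample text below stands in for what fileread('testf.dat') would return, every numeric row has the same number of values, and the variable names (rows, gprmc, nums) are made up for this example.

```matlab
% Stand-in for txt = fileread('testf.dat'), so the sketch runs on its own.
txt = sprintf('%s\n', ...
    '$GPRMC,202124.985,V,,,,,,,091112,,,N*44', ...
    '2038,4674,4667,5593,3379', ...
    '2087,5133,5111,6084,3372', ...
    '$GPRMC,202125.985,V,,,,,,,091112,,,N*45', ...
    '2985,5130,5113,6085,3988');

rows   = regexp(txt, '\r?\n', 'split');      % one cell per line
rows   = rows(~cellfun('isempty', rows));    % drop empty entries (e.g. after the last newline)
isGps  = strncmp(rows, '$GPRMC', 6);         % true for the $GPRMC rows
gprmc  = rows(isGps);                        % cell array of unsplit text rows
parsed = cellfun(@(s) sscanf(s, '%f,').', rows(~isGps), 'UniformOutput', false);
nums   = vertcat(parsed{:});                 % numeric matrix, one logged row per matrix row
```

Here gprmc keeps each sentence as a single string, while nums can be written out comma separated with dlmwrite or csvwrite if a file is needed.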

Related

How to read a number from text file via Matlab

I have 1000 text files and want to read a number from each file.
The format of each text file is:
af;laskjdf;lkasjda123241234123
$sakdfja;lskfj12352135qadsfasfa
falskdfjqwr1351
##alskgja;lksjgklajs23523,
asdfa#####1217653asl123654fjaksj
asdkjf23s#q23asjfklj
asko3
I need to read the number ("1217653") that comes after "#####" in each txt file.
The number directly follows the "#####" in every text file.
"#####" and the number following it appear only once in each file.
clc
clear
MyFolderInfo = dir('yourpath/folder');
fidin = fopen(file_name,'r','n','utf-8');
while ~feof(fidin)
tline=fgetl(fidin);
disp(tline)
end
fclose(fidin);
It is not finished yet. I am stuck because it cannot read past the blank line.
This is another approach, using the function regexp. It allows more advanced matching and does not require reading the full file in one go. The main difference from the example already given is that I read the file line by line, but since the example uses this approach I believe it is worth answering. This will return all occurrences of "#####NUMBER".
function test()
h = fopen('myfile.txt');
res = {};
k = 1;
str = fgetl(h);
while ischar(str) % fgetl returns '' for an empty line and -1 (not a char) at EOF
    res{k} = regexp(str,'#####\d+','match');
    k = k+1;
    str = fgetl(h);
end
fclose(h);
for k=1:length(res)
    disp(res{k});
end
end
EDIT
Using the expression '#####(\d+)' and the argument 'tokens' instead of 'match' will return just the digits after the "#####" as a string. Apart from showing another way to read the file, the intent of this post was to show how to use regexp with a simple example. Both alternatives can be used with suitable conversion.
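To make the difference between 'match' and 'tokens' concrete, here is a small sketch run on one line from the question's sample file. The variable names are illustrative only.

```matlab
str = 'asdfa#####1217653asl123654fjaksj';

m = regexp(str, '#####\d+', 'match');     % whole match, hashes included
t = regexp(str, '#####(\d+)', 'tokens');  % only the parenthesised part

numFromMatch  = str2double(strrep(m{1}, '#####', ''));  % strip the hashes, then convert
numFromTokens = str2double(t{1}{1});                    % tokens are nested one level deeper
```

With 'match' the hashes have to be removed before conversion; with 'tokens' the digits come back on their own, at the cost of an extra level of cell indexing.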
Assuming the following:
All files are ASCII files.
The number you are looking to extract is directly following #####.
The number you are looking for is a natural number.
##### followed by a number only occurs once per file.
You can use this code snippet inside a for loop to extract each number:
regx='#####(\d+)';
str=fileread(fileName);
num=str2double(regexp(str,regx,'tokens','once'));
Example of for loop
This code will iterate through ALL files in yourpath/folder and save the numbers into num.
regx='#####(\d+)'; % Create regex
folderDir='yourpath/folder';
files=dir(folderDir); % List the contents of folderDir
files=files(~[files.isdir]); % Drop . and .. (and any subfolders)
num=zeros(1,length(files)); % Pre allocate
for i=1:length(files) % Iterate through files
    str=fileread(fullfile(folderDir,files(i).name)); % Read the whole file as text
    num(i)=str2double(regexp(str,regx,'tokens','once')); % Extract number using regex
end
If you want to extract more ''advanced'' numbers e.g. Integers or Real numbers, or handle several occurrences of #####NUMBER in a file you will need to update your question with a better representation of your text files.

Octave: create .csv files with varying file names stored in a sub folder

I have multiple arrays with string data. All of them should be exported into a .csv file. The file should be saved in a subfolder. The file name is variable.
I used the code as follows:
fpath = ('./Subfolder/');
m_date = inputdlg('Date of measurement [yyyymmdd_exp]');
m_name = inputdlg('Characteristic name of the experiment');
fformat = ('.csv');
fullstring = strcat(fpath, m_date,'_', m_name, fformat);
dlmwrite(fullstring,measurement);
However, I get an error that FILE must be a filename string or numeric FID
What's the reason?
Best
Andreas
What you are asking to do is fairly straightforward in Matlab or Octave. The first part is creating a file with a filename that changes. The best way to do this is to concatenate strings to build the one you want.
You can use: fullstring = strcat('string1','string2')
Or specifically: filenameandpath = strcat('./Subfolder/FixedFileName_',fname)
Note that because strings are essentially character arrays, you can also just use:
fullstring = ['string1','string2']
Now, if you want to create CSV data, you'll first have to read in the file, possibly parse the data in some way, and then save it. As Andy mentioned above, you may be able to use dlmwrite to create the output file. We'll need to see a sample of the string data to know whether any more work is needed before dlmwrite can handle it.
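One likely cause of the "FILE must be a filename string" error is that inputdlg returns a cell array, and strcat returns a cell array whenever any of its inputs is one. The sketch below uses hard-coded cells as stand-ins for the dialog results (the values '20240101_exp' and 'trial' are invented for illustration) and shows the difference.

```matlab
fpath   = './Subfolder/';
fformat = '.csv';
m_date  = {'20240101_exp'};   % stand-in for the 1x1 cell that inputdlg returns
m_name  = {'trial'};          % stand-in, hypothetical value

asCell     = strcat(fpath, m_date, '_', m_name, fformat);     % still a cell array
fullstring = [fpath, m_date{1}, '_', m_name{1}, fformat];     % plain character array

% fullstring (or asCell{1}) is what a function such as dlmwrite expects as FILE.
```

Indexing the dialog results with {1} before concatenating is the only change needed; the subfolder itself still has to exist before anything is written into it.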

Reading large amount of data stored in lines from csv

I need to read in a lot of data (~10^6 data points) from a *.csv-file.
the data is stored in lines
I can't know how many data points per line and how many lines are there before I read it in
the amount of data points per line can be different for each line
So the *.csv-file could look like this:
x Header
x1,x2
y Header
y1,y2,y3, ...
z Header
z1,z2
...
Right now I read in every line as string and split it at every comma. This is what my code looks like:
index = 1;
headerLine = textscan(csvFileHandle,'%s',1,'Delimiter','\n');
while ~isempty(headerLine{1})
dummy = textscan(csvFileHandle,'%s',1,'Delimiter','\n', ...
'BufSize',2^31 - 1);
rawData(index) = textscan(dummy{1}{1},'%f','Delimiter',',');
headerLine = textscan(csvFileHandle,'%s',1,'Delimiter','\n');
index = index + 1;
end
It's working, but it's pretty slow. Most of the time is used while splitting the string with textscan. (~95%).
I preallocated rawData with sample data, but it brought next to nothing for the speed.
Is there a better way than mine to read in something like this?
If not, is there a faster way to split this string?
First suggestion: to read a single line as a string when looping over a file, just use fgetl (returns a nice single string so no faffing with cell arrays).
Also, you might consider (if possible), reading everything in a single go rather than making repeating reads from file:
output = textscan(fid, '%*s%s','Delimiter','\n'); % skips headers with *
If the file is so big that you can't do everything at once, try to read in blocks (e.g. tackle 1000 lines at a time, parsing data as you go).
For converting the string, there are the options of str2num or strsplit+str2double, but the only thing I can think of that might be slightly quicker than textscan is sscanf. Since sscanf doesn't accept the delimiter as a separate input, put it in the format string (true, the last value doesn't end with a comma, but sscanf can handle that).
lines = output{1}; % textscan returns a 1x1 cell holding the cell array of lines
data = cell(size(lines));
for n = 1:numel(lines)
    data{n} = sscanf(lines{n},'%f,');
end
Tests with a limited batch of test data suggest sscanf is a bit quicker (but this might depend on machine/version/data sizes).
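As a quick check of the two conversion routes on a single line, here is a minimal sketch. The sample string is made up, and strsplit is assumed to be available (it was added in fairly recent Matlab releases).

```matlab
row = '1.5,2,3.25';

viaSscanf = sscanf(row, '%f,');                  % comma is part of the format string
viaSplit  = str2double(strsplit(row, ',')).';    % split first, then convert; transposed to a column
```

Both give the same column vector here, so the choice between them can be made on speed alone, for example by timing each on a representative chunk of the real file.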

Create more than one text file using Matlab's fopen in a for-loop

I'm quite new to Matlab and programming in general and would love to get some help with the following. I've looked here on the website, but couldn't find an answer.
I am trying to use a for-loop and fprintf to give me a bunch of separate text files, whose file names contain the index I use for my for-loop. See for example this piece of code to get the idea of what I'd like to do:
for z=1:20
for x=1:z;
b=[x exp(x)];
fid = fopen('table z.txt','a');
fprintf(fid,'%6.2f, %6.2f\n',b);
fclose(fid);
end
end
What I'm looking for, is a script that (in this case) gives me 20 separate .txt files with names 'table i.txt' (i is 1 through 20) where
table 1.txt only contains [1, exp(1)],
table 2.txt contains [1, exp(1)] \newline [2, exp(2)]
and so on.
If I run the script above, I get only one text file (named 'table z.txt') with all the data appended underneath. So the file name given to fopen doesn't 'feel' the z values, but interprets z as a letter (which, seeing the quotation marks, doesn't really surprise me).
I think there must be an elegant way of doing this, but I haven't been able to find it. I hope someone can help.
Best,
L
Use num2str and string concatenation [ ... ].
fid = fopen( ['table ' num2str(z) '.txt'],'a');
Opening your file in the innermost loop is inefficient; you should create the file as soon as you know z (see the example below). To format a string the same way as fprintf, you can use sprintf.
for z=1:20
fname = sprintf('table %d.txt',z);
fid = fopen(fname,'w');
for x=1:z
fprintf(fid,'%6.2f, %6.2f\n', x, exp(x));
end
fclose(fid);
end
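For comparison, the two ways of building the file name can be checked side by side without touching any files. The zero-padded variant is an optional extra, not something the question asked for; it only helps the files sort in numeric order in a directory listing.

```matlab
z = 7;

nameConcat  = ['table ' num2str(z) '.txt'];   % 'table 7.txt'
nameSprintf = sprintf('table %d.txt', z);     % 'table 7.txt'
namePadded  = sprintf('table %02d.txt', z);   % 'table 07.txt'
```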

Saving Multiple Arrays to Text File in Matlab

I need to save multiple arrays to a text file with the filename the same as the variable name. I have created a vector of all the variables required using the following lines.
all_var={};
vars=whos;
for(i=1:size(vars,1))
if(~isempty(regexp(vars(i).name,'A[0-9]','match')))
all_var{end+1}=vars(i).name;
end
end
I am now struggling to find a way to save all of these variable to file. Any help would be appreciated.
Thank you
I'm not sure I understood correctly. Do you want to save each variable in a different file? Assuming you want to save all variables in the same file, with, let's say, the name of the first variable in the filename, you could try something like:
filename = sprintf('vector_starting_with%s.mat', vars(1).name); % %s, because the name is text
save(filename)
In case you want a separate file for each element in the vector, you could try:
all_var={};
vars=whos;
for i=1:size(vars,1)
    if ~isempty(regexp(vars(i).name,'A[0-9]','match'))
        all_var{end+1}=vars(i).name;
        varsave=sprintf('vector_%s.mat', vars(i).name); % %s, because the name is text
        save(varsave, vars(i).name); % save only this variable, not the whole workspace
    end
end
Sorry if it has some bugs; I don't have MATLAB at hand right now. Nevertheless, try to go over this documentation.
Edit: let me know how it goes if you try this instead:
all_var={};
vars=whos;
for i=1:size(vars,1)
    if ~isempty(regexp(vars(i).name,'A[0-9]','match'))
        all_var{end+1}=vars(i).name;
        filename = sprintf('%s.txt', vars(i).name); % %s, because the name is text
        % Write the variable's values (not just its name) as plain text
        save(filename, vars(i).name, '-ascii');
    end
end
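A shorter variant, offered as a sketch rather than a tested drop-in: who accepts a '-regexp' filter, so the list of names can be built without looping over whos. The variables A1, A2 and B1 and the use of tempdir as output folder are assumptions made only so the example runs on its own.

```matlab
A1 = [1 2 3; 4 5 6];     % example data, invented for this sketch
A2 = [7 8 9];
B1 = 0;                  % should not be picked up by the pattern

names  = who('-regexp', '^A[0-9]');   % names of variables starting with A and a digit
outDir = tempdir;                     % assumption: write to the temporary folder

for k = 1:numel(names)
    fname = fullfile(outDir, [names{k} '.txt']);
    save(fname, names{k}, '-ascii');  % one text file per variable, named after it
end

readBack = load(fullfile(outDir, 'A1.txt'));   % plain-text files load back as a matrix
```

Note that '-ascii' stores a limited number of significant digits, so '-ascii','-double' may be the safer choice when full precision matters.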
end