Matlab - read a specific format line - matlab

I have a file which contains data in the following format: 0,"20 300 40 12".
How can I read this data with the sscanf function so that 0 is stored in one variable and 20 300 40 12 in another? The problem is that the array within the " " changes its size, so I cannot use a fixed-length array. My file can contain lines like these:
0,"20 300 40 12"
0,"20 300 43 40 12"
1,"22 40 12"
Can you give me a hint of how to read this?

Have you tried with this:
fid = fopen(filename,'r');
A = textscan(fid,'%d,%q','Delimiter','\n');

Here's another way to do it:
[a,b] = textread('ah.txt','%d,"%[^"]"');
fun = @(x) split(' ',x);
resb = cellfun(fun,b,'UniformOutput',false)
res = {a resb};
function l = split(d,s)
%split string s on string d
out = textscan(s,'%s','delimiter',d,'multipleDelimsAsOne',1);
l = out{1};
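Since the question asks about sscanf specifically, here is a minimal sketch of that route. It assumes every line has exactly the form shown in the question; the sample cell array stands in for the file contents, and the names sampleLines, ids and vals are made up for illustration.

```matlab
% Hypothetical sample lines standing in for the contents of the file.
sampleLines = {'0,"20 300 40 12"', '0,"20 300 43 40 12"', '1,"22 40 12"'};

ids  = zeros(numel(sampleLines), 1);   % the number before the comma
vals = cell(numel(sampleLines), 1);    % the variable-length arrays

for k = 1:numel(sampleLines)
    % Capture the leading integer and the text between the quotes.
    parts = regexp(sampleLines{k}, '^(\d+),"([^"]*)"$', 'tokens', 'once');
    ids(k) = str2double(parts{1});
    % sscanf keeps reading %d until the text runs out, so the length can vary.
    vals{k} = sscanf(parts{2}, '%d').';
end
```

A cell array is used for vals because the rows have different lengths and therefore cannot be stacked into one numeric matrix.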

Related

MATLAB how to split columns without space or operators?

I read data from a text file with 20000 rows:
0000
1000
0110
0000
0110
1101
1010
0200
0011
....
I want to split the columns into four 20000x1 matrices.
How can I do it? What is the code? Thanks!
Rather than manipulating the data in MATLAB, I would read it in directly in the format you want. Use textscan with the format spec %1d, which reads a single integer with a field width of one character.
If there are 4 integers per row then this should work.
data = textscan(fid,'%1d%1d%1d%1d')
The resulting data variable should be a 1x4 cell array, with each cell holding one of the required columns.
Using the data you supplied I get
data =
1×4 cell array
{9×1 int32} {9×1 int32} {9×1 int32} {9×1 int32}
Where for example the 2nd column is
>> data{2}
ans =
9×1 int32 column vector
0
0
1
0
1
1
0
2
0
I interpret this question as follows:
The data:
Data = ...
["0000"
"1000"
"0110"
"0000"
"0110"
"1101"
"1010"
"0200"
"0011"];
Proposal for your code:
% Initializing variables
Column1 = string(zeros(length(Data),1));
Column2 = string(zeros(length(Data),1));
Column3 = string(zeros(length(Data),1));
Column4 = string(zeros(length(Data),1));
% Looping through Data and extracting the columns
for i = 1:length(Data)
DataPerRow = Data(i);
Column1(i) = extractBetween(DataPerRow,1,1);
Column2(i) = extractBetween(DataPerRow,2,2);
Column3(i) = extractBetween(DataPerRow,3,3);
Column4(i) = extractBetween(DataPerRow,4,4);
end
The results:
Column1 =
"0"
"1"
"0"
"0"
.
Column2 =
"0"
"0"
"1"
"0"
.
Column3 =
"0"
"0"
"1"
"0"
.
Column4 =
"0"
"0"
"0"
"0"
.
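If the rows are already in memory as text, the split can also be done without a loop. This is a different technique from the two answers above, shown only as a sketch: it assumes every row consists of exactly four digit characters, and the names raw, digitMatrix and col1 to col4 are illustrative.

```matlab
% Hypothetical sample: each row is one line of the file, stored as a char matrix.
raw = ['0000'; '1000'; '0110'; '0200'];

% Subtracting the character '0' converts each digit character to its numeric value.
digitMatrix = raw - '0';

% Each column of the numeric matrix is one of the requested vectors.
col1 = digitMatrix(:, 1);
col2 = digitMatrix(:, 2);
col3 = digitMatrix(:, 3);
col4 = digitMatrix(:, 4);
```

For a string array such as Data above, char(Data) should give the same kind of char matrix, as long as all rows have equal length.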

get values from a text file with a mix of floats and strings

I am struggling with a text file that I have to read in. In this file, there are two types of line:
133 0102764447 44 11 54 0.4 0 0.89 0 0 8 0 0 7 Attribute_Name='xyz' Type='string' 02452387764447 884
134 0102256447 44 1 57 0.4 0 0.81 0 0 8 0 0 1 864
What I want to do here is to textscan all the lines and then try to determine the number of 'xyz' (and the total number of lines).
I tried to use:
fileID = fopen('test.txt','r') ;
data = textscan(fileID, '%d %d %d %d %d %d %d %d %d %d %d %d %d %s %s %d %d', '\n');
Then I would access data{i,16} to count how many entries are equal to Attribute_Name='xyz', but that doesn't seem efficient.
What is a proper way to read the data? (What interests me is counting how many Attribute_Name='xyz' entries I have.) Thanks
You could simply use count which is referenced here.
In your case you could use it in this way:
filetext = fileread("test.txt");
A = count(filetext , "xyz")
fileread will read the whole text file into a single string. Afterwards you can process that string using count which will return the occurrences from the given pattern.
An alternative for older versions of MATLAB is the following, which should work with R2006a and above. It uses single-quoted character vectors, since double-quoted strings are not available in those releases.
filetext = fileread('test.txt');
A = length(strfind(filetext, 'xyz'));
strfind returns an array of start indices, one per occurrence of the specified string, so the length of that array is the number of occurrences.
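The question also asks for the total number of lines. A minimal sketch of getting both counts from the same text follows; the two shortened sample lines stand in for the result of fileread, and the names nXyz, fileLines and nLines are assumptions made for illustration.

```matlab
% Hypothetical two-line sample standing in for fileread('test.txt').
filetext = sprintf('%s\n%s\n', ...
    '133 44 11 54 0.4 Attribute_Name=''xyz'' Type=''string'' 884', ...
    '134 44 1 57 0.4 864');

% Count occurrences of the full attribute text rather than just 'xyz',
% so that an unrelated 'xyz' elsewhere in the file is not counted.
nXyz = numel(strfind(filetext, 'Attribute_Name=''xyz'''));

% Drop the trailing line break, then split on line breaks to count lines.
fileLines = regexp(strtrim(filetext), '\r?\n', 'split');
nLines = numel(fileLines);
```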
There is the option of strsplit. You may do something like the following:
count = 0;
fid = fopen('test.txt','r');
while ~feof(fid)
line = fgetl(fid);
words = strsplit(line);
% Assume only one instance per line; remove the 1 and adapt the rest of the code to allow more
ind = find(strcmpi(words,'Attribute_Name=''xyz'''), 1);
if ~isempty(ind)
count = count + 1;
end
end
fclose(fid);
So at the end count will give you the number.

gnuplot: how to sum over an arbitrary list

For gnuplot, I have a large list of (randomly generated) numbers which I want to use as indices in a sum. How do I do it?
Here is what I mean. Let's say the list of numbers is
list = [81, 37, 53, 22, 72, 74, 44, 46, 96, 27]
I have a function
f(x,n) = cos(n*x)
I now want to plot the function, on the interval (-pi,pi) which is the sum of the f(x,n) as n runs through the numbers in list.
If you can control what your list looks like, try the following:
num = 10
# Let the numbers be in a space separated string.
# We can access the individual numbers with the word(string, index) function.
list = "81 37 53 22 72 74 44 46 96 27"
f(x,n) = cos(n*x)
set terminal pngcairo
set output "sum_cos.png"
set xrange [-pi:pi]
set samples 1000
# Build the plot command as a macro.
plt_cmd = ""
do for [n=1:num] {
plt_cmd = sprintf("%s + f(x,%s)", plt_cmd, word(list,n))
}
# Check what we have done so far.
print plt_cmd
titlestring = "{/Symbol S} cos(n_i*x), i = 1 ...".num
# Finally plot the sum by evaluating the macro
# (gnuplot versions before 5.0 need `set macros` first).
plot @plt_cmd title titlestring

how can I import multiple csv files with selected columns using textscan?

I have a large number of csv files to process. I only want selected columns from each file, and I want to load all the files from a certain folder and output them as one combined file. Here is my code, which runs with errors. Could anyone help me solve this problem?
data_directory = 'C:\Users\...\data';
numfiles = 17;
for n = 1:numfiles
filepath = [data_directory,'data_', num2str(n),'_output.csv'];
fid = fopen (filepath, 'rt');
wanted_columns= [2 3 4 5 10 11 12 13 14 15 16 17 35 36 41 42 44 45 59 61];
format = [];
columns = 109;
for i = 1 : columns;
if any (i == wanted_columns)
format = [format '%s'];
else
format = [format '%*s'];
end
end
data = textscan(fid, format, 'Delimiter',',','HeaderLines',1);
fclose(fid);
end
I think you should check whether the file is opened correctly. The error message seems to indicate that this is not the case. If it is not, check if the filepath is correct.
fid = fopen (filepath, 'rt');
if fid == -1
error('Failed to open file');
end
If the error is thrown here, you know that there was a problem with 'fopen'.
Of course I don't know which files are on your computer, but I assume the '...' in the filename is not in your actual MATLAB file, only in your question on SO.
But could it be that you repeat the word 'data', while the actual filename only contains 'data' once? Your code will now produce filenames like 'C:\Users\...\datadata_1_output.csv'. Maybe 'data' should be removed in data_directory or in filepath = ...?
Here is another way to set up the format string in a vectorized manner (n_columns is the total number of columns, 109 in your code):
fcell = repmat({'%*s '},1,n_columns);
fcell(wanted_columns) = {'%s '};
formatstr = [fcell{:}];
Note that format is a built-in function in MATLAB, so it is better not to use it as a variable name.
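Putting the two suggestions together, a small sketch might look as follows. The folder name my_data_folder, the six-column layout and the sample line are hypothetical stand-ins chosen so the snippet runs without any files; fullfile is used so the separator between folder and filename cannot be forgotten.

```matlab
% Hypothetical values; the real folder and column counts come from the question.
data_directory = 'my_data_folder';
n = 3;
n_columns = 6;
wanted_columns = [2 3 5];

% fullfile inserts the path separator between the folder and the filename.
filepath = fullfile(data_directory, sprintf('data_%d_output.csv', n));

% Skip every column by default, then switch the wanted ones to '%s'.
fcell = repmat({'%*s'}, 1, n_columns);
fcell(wanted_columns) = {'%s'};
formatstr = [fcell{:}];

% textscan also accepts text directly, which makes the format easy to check.
sampleLine = 'a,b,c,d,e,f';
cols = textscan(sampleLine, formatstr, 'Delimiter', ',');
```

Here cols holds one cell per wanted column, so the same formatstr can then be passed to textscan together with the file identifier inside the loop.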

Prefix match in MATLAB

Hey guys, I have a very simple problem in MATLAB:
I have some strings which are like this:
Pic001
Pic002
Pic003
004
Not every string starts with the prefix "Pic". How can I cut off the "Pic" part so that only the numbers at the end remain, giving all my strings the same format?
Greets, poeschlorn
If 'Pic' only ever occurs as a prefix in your strings and nowhere else within the strings then you could use STRREP to remove it like this:
>> x = {'Pic001'; 'Pic002'; 'Pic003'; '004'}
x =
'Pic001'
'Pic002'
'Pic003'
'004'
>> x = strrep(x, 'Pic', '')
x =
'001'
'002'
'003'
'004'
If 'Pic' can occur elsewhere in your strings and you only want to remove it when it occurs as a prefix then use STRNCMP to compare the first three characters of your strings:
>> x = {'Pic001'; 'Pic002'; 'Pic003'; '004'}
x =
'Pic001'
'Pic002'
'Pic003'
'004'
>> for ii = find(strncmp(x, 'Pic', 3))'
x{ii}(1:3) = [];
end
>> x
x =
'001'
'002'
'003'
'004'
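Another option is a regular expression that keeps only the trailing digits:
strings = {'Pic001'; 'Pic002'; 'Pic003'; '004'};
% '\d+$' matches the run of digits at the end of each string
numbers = regexp(strings, '\d+$', 'match');
for cc = 1:length(numbers)
fprintf('%s\n', char(numbers{cc}));
end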
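The prefix-only removal discussed in the STRNCMP answer can also be written without a loop. This is a sketch using regexprep with an anchored pattern; the extra entry 'PicPic7' is a made-up test case to show that only the leading occurrence is removed.

```matlab
% Sample strings from the question plus one hypothetical edge case.
x = {'Pic001'; 'Pic002'; 'Pic003'; '004'; 'PicPic7'};

% '^Pic' only matches at the start of each string, so a later 'Pic' is kept.
y = regexprep(x, '^Pic', '');
```

Strings without the prefix, such as '004', pass through unchanged.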