I have a text file which has the contents as follows.. I need to read this file column wise (ies, 2 columns here). I have tried many ways.. but cannot do as it contains "(" , "," , ")" etc... please guide..
(-2.714141687294326, 0.17700122506478025)
(-2.8889905690592976, 0.1449494260855578)
(-2.74534285564141, 0.3182989792519164)
(-2.728716536554531, -0.3267545129349194)
(-2.280859632844493, -0.7413304490629143)
(-2.8205377507406095, 0.08946138452856946)
(-2.6261449731466335, -0.16338495969832847)
(-2.8863827317805537, 0.5783117541867042)
(-2.6727557978209546, 0.11377424587411682)
(-2.5069470906518565, -0.6450688986485736)
(-2.6127552309087236, -0.01472993916137419)
(-2.7861092661880185, 0.23511200020171835)
(-3.2238037438656533, 0.5113945870063824)
(-2.6447503899420304, -1.1787646364375748)
Try this:
x = importdata('filename.txt');
x = regexp(x,'-?\d+\.?\d*','match'); %// detect numbers as [-]a[.][bcd]
x = cellfun(#str2num, vertcat(x{:}));
If the file can contain numbers in decimal form ("1.234") and in scientific notation ("1.234e-56"):
x = importdata('filename.txt');
x = regexp(x,'-?\d+\.?\d*(e-?\d+)?','match');
x = cellfun(#str2num, vertcat(x{:}));
You can use fscanf and specify the desired format:
fid = fopen('filename.txt');
x = fscanf(fid,'(%f, %f)\n',[2,inf]).';
fclose(fid);
The format spec '(%f, %f)\n' reads to float values inside brackets, separated by , per line. With [2,inf] you specify to put it into a 2 x n array, where n is as large as needed. To have the same format as before, you'll have to transpose it again .'.
Related
When I save a table as PDF (using report generator) I do not get the numeric fields (Index1, Index2, Index3) shortG format (total of 5 digits). What is the problem and how can I fix it?
The code:
function ButtonPushed(app, event)
import mlreportgen.dom.*;
import mlreportgen.report.*
format shortG;
ID = [1;2;3;4;5];
Name = {'San';'John';'Lee';'Boo';'Jay'};
Index1 = [71.1252;69.2343245;64.345345;67.345322;64.235235];
Index2 = [176.23423;163.123423654;131.45364572;133.5789435;119.63575647];
Index3 = [176.234;16.123423654;31.45364572;33.5789435;11.6647];
mt = table(ID,Name,Index1,Index2,Index3);
d = Document('myPDF','pdf');
d.OutputPath = ['E:/','temp'];
append(d,'Table 1: ');
append(d,mt);
close(d);
rptview(d.OutputPath);
end
To fix it, format your numeric arrays to character arrays with 5 significant digits before writing to PDF.
mt = table(ID,Name,f(Index1),f(Index2),f(Index3));
where,
function FivDigsStr = f(x)
%formatting to character array with 5 significant digits and then splitting.
%at each tab. categorical is needed to remove ' 's that appear around char
%in the output PDF file with newer MATLAB versions
%e.g. with R2018a, there are no ' ' in the output file but ' ' appears with R2020a
FivDigsStr = categorical(split(sprintf('%0.5G\t',x)));
%Removing the last (<undefined>) value (which is included due to \t)
FivDigsStr = FivDigsStr(1:end-1);
end
The above change gives the following result:
Edit:
To bring back the headers:
mt.Properties.VariableNames(3:end) = {'Index1', 'Index2', 'Index3'};
or in a more general way to extract the variable names instead of hardcoding them, you can use inputnames to extract the variable names.
V = #(x) inputname(1);
mt.Properties.VariableNames(3:end) = {V(Index1), V(Index2), V(Index3)};
which gives:
I am generating 2500 values in Matlab in format (time,heart_rate, resp_rate) by using below code
numberOfSeconds = 2500;
time = 1:numberOfSeconds;
newTime = transpose(time);
number0 = size(newTime, 1)
% generating heart rates
heart_rate = 50 +(70-50) * rand (numberOfSeconds,1);
intHeartRate = int64(heart_rate);
number1 = size(intHeartRate, 1)
% hist(heart_rate)
% generating resp rates
resp_rate = 50 +(70-50) * rand (numberOfSeconds,1);
intRespRate = int64(resp_rate);
number2 = size(intRespRate, 1)
% hist(heart_rate)
% joining time and sensor data
joinedStream = strcat(num2str(newTime),{','},num2str(intHeartRate),{','},num2str(intRespRate))
dlmwrite('/Users/amar/Desktop/geenrated/rate.txt', joinedStream,'delimiter','');
The data shown in the console is alright, but when I save this data to a .txt file, it contains extra spaces in beginning. Hence I am not able to parse the .txt file to generate input stream. Please help
Replace the last two lines of your code with the following. No need to use strcat if you want a CSV output file.
dlmwrite('/Users/amar/Desktop/geenrated/rate.txt', [newTime intHeartRate intRespRate]);
πβπ π πππ’π‘πππ π π’ππππ π‘ππ ππ¦ ππΎπ ππ π‘βπ π ππππππ π‘ πππ π¦ππ’π πππ π. πβππ πππ π€ππ ππ₯ππππππ π€βπ¦ π¦ππ’ πππ‘ π‘βπ π’πππ₯ππππ‘ππ ππ’π‘ππ’π‘.
The data written in the file is exactly what is shown in the console.
>> joinedStream(1) %The exact output will differ since 'rand' is used
ans =
cell
' 1,60,63'
num2str basically converts a matrix into a character array. Hence number of characters in its each row must be same. So for each column of the original matrix, the row with the maximum number of characters is set as a standard for all the rows with less characters and the deficiency is filled by spaces. Columns are separated by 2 spaces. Take a look at the following smaller example to understand:
>> num2str([44, 42314; 4, 1212421])
ans =
2Γ11 char array
'44 42314'
' 4 1212421'
I have a .txt file with rows consisting of three elements, a word and two numbers, separated by commas.
For example:
a,142,5
aa,3,0
abb,5,0
ability,3,0
about,2,0
I want to read the file and put the words in one variable, the first numbers in another, and the second numbers in another but I am having trouble with textscan.
This is what I have so far:
File = [LOCAL_DIR 'filetoread.txt'];
FID_File = fopen(File,'r');
[words,var1,var2] = textscan(File,'%s %f %f','Delimiter',',');
fclose(FID_File);
I can't seem to figure out how to use a delimiter with textscan.
horchler is indeed correct. You first need to open up the file with fopen which provides a file ID / pointer to the actual file. You'd then use this with textscan. Also, you really only need one output variable because each "column" will be placed as a separate column in a cell array once you use textscan. You also need to specify the delimiter to be the , character because that's what is being used to separate between columns. This is done by using the Delimiter option in textscan and you specify the , character as the delimiter character. You'd then close the file after you're done using fclose.
As such, you just do this:
File = [LOCAL_DIR 'filetoread.txt'];
f = fopen(File, 'r');
C = textscan(f, '%s%f%f', 'Delimiter', ',');
fclose(f);
Take note that the formatting string has no spaces because the delimiter flag will take care of that work. Don't add any spaces. C will contain a cell array of columns. Now if you want to split up the columns into separate variables, just access the right cells:
names = C{1};
num1 = C{2};
num2 = C{3};
These are what the variables look like now by putting the text you provided in your post to a file called filetoread.txt:
>> names
names =
'a'
'aa'
'abb'
'ability'
'about'
>> num1
num1 =
142
3
5
3
2
>> num2
num2 =
5
0
0
0
0
Take note that names is a cell array of names, so accessing the right name is done by simply doing n = names{ii}; where ii is the name you want to access. You'd access the values in the other two variables using the normal indexing notation (i.e. n = num1(ii); or n = num2(ii);).
I have a txt file that I want to read into Matlab. Data format is like below:
term2 2015-07-31-15_58_25_612 [0.9934343, 0.3423043, 0.2343433, 0.2342323]
term0 2015-07-31-15_58_25_620 [12]
term3 2015-07-31-15_58_25_625 [2.3333, 3.4444, 4.5555]
...
How can I read these data in the following way?
name = [term2 term0 term3] or namenum = [2 0 3]
time = [2015-07-31-15_58_25_612 2015-07-31-15_58_25_620 2015-07-31-15_58_25_625]
data = {[0.9934343, 0.3423043, 0.2343433, 0.2342323], [12], [2.3333, 3.4444, 4.5555]}
I tried to use textscan in this way 'term%d %s [%f, %f...]', but for the last data part I cannot specify the length because they are different. Then how can I read it? My Matlab version is R2012b.
Thanks a lot in advance if anyone could help!
There may be a way to do that in one single pass, but for me these kind of problems are easier to sort with a 2 pass approach.
Pass 1: Read all the columns with a constant format according to their type (string, integer, etc ...) and read the non constant part in a separate column which will be processed in second pass.
Pass 2: Process your irregular column according to its specificities.
In a case with your sample data, it looks like this:
%% // read file
fid = fopen('Test.txt','r') ;
M = textscan( fid , 'term%d %s %*c %[^]] %*[^\n]' ) ;
fclose(fid) ;
%% // dispatch data into variables
name = M{1,1} ;
time = M{1,2} ;
data = cellfun( #(s) textscan(s,'%f',Inf,'Delimiter',',') , M{1,3} ) ;
What happened:
The first textscan instruction reads the full file. In the format specifier:
term%d read the integer after the literal expression 'term'.
%s read a string representing the date.
%*c ignore one character (to ignore the character '[').
%[^]] read everything (as a string) until it finds the character ']'.
%*[^\n] ignore everything until the next newline ('\n') character. (to not capture the last ']'.
After that, the first 2 columns are easily dispatched into their own variable. The 3rd column of the result cell array M contains strings of different lengths containing different number of floating point number. We use cellfun in combination with another textscan to read the numbers in each cell and return a cell array containing double:
Bonus:
If you want your time to be a numeric value as well (instead of a string), use the following extension of the code:
%% // read file
fid = fopen('Test.txt','r') ;
M = textscan( fid , 'term%d %f-%f-%f-%f_%f_%f_%f %*c %[^]] %*[^\n]' ) ;
fclose(fid) ;
%% // dispatch data
name = M{1,1} ;
time_vec = cell2mat( M(1,2:7) ) ;
time_ms = M{1,8} ./ (24*3600*1000) ; %// take care of the millisecond separatly as they are not handled by "datenum"
time = datenum( time_vec ) + time_ms ;
data = cellfun( #(s) textscan(s,'%f',Inf,'Delimiter',',') , M{1,end} ) ;
This will give you an array time with a Matlab time serial number (often easier to use than strings). To show you the serial number still represent the right time:
>> datestr(time,'yyyy-mm-dd HH:MM:SS.FFF')
ans =
2015-07-31 15:58:25.612
2015-07-31 15:58:25.620
2015-07-31 15:58:25.625
For comlicated string parsing situations like such it is best to use regexp. In this case assuming you have the data in file data.txt the following code should do what you are looking for:
txt = fileread('data.txt')
tokens = regexp(txt,'term(\d+)\s(\S*)\s\[(.*)\]','tokens','dotexceptnewline')
% Convert namenum to numeric type
namenum = cellfun(#(x)str2double(x{1}),tokens)
% Get time stamps from the second row of all the tokens
time = cellfun(#(x)x{2},tokens,'UniformOutput',false);
% Split the numbers in the third column
data = cellfun(#(x)str2double(strsplit(x{3},',')),tokens,'UniformOutput',false)
I want to load a text file whit ASCII characters and put all the content into a varialbe in MATLAB, This is the code i tried :
f =fopen('tp.txt')
The result I get is 1, then it increments each time I excute this code line.
However, when I try :
f =load('tp.txt')
I get this error :
??? Error using ==> load
Number of columns on line 1 of ASCII file D:\Cours\TP\tp.txt
must be the same as previous lines.
Error in ==> TPTI at 3
f =load('tp.txt')
My goal is to count the occurance of each charachter then calculate the propabilites and the entropies.
Any idea?
Try this:
%// Import file contents as a string
str = importdata('tp.txt'); %// each line is a cell
str = [str{:}]; %// concat all lines into a single string
%// Remove spaces. Maybe you want to remove other characters as well: punctuation...
str = regexprep(str, '\s', '');
%// Count frequency of each character:
freq = histc(double(str), 0:255); %// convert characters to ASCII codes and count