I have a text file containing a random sequence of 0s and 1s, and I want to read it in MATLAB and get each element into an array.
Goal:
I have two text files that I want to compare to see whether they are identical and, if not, how much they differ. The two files are:
1) the original file that I send over a communication line
2) the received file, which should be identical to the sent file
Example of my code:
errors = 0;
for i = 1:numel(send)
    if send(i) ~= received(i)
        errors = errors + 1; % count each mismatching position
    end
end
But I need to know how to obtain these two arrays from the text files, where all the "0" and "1" characters are on one line.
Since you only want to check that the contents of the two files are the same, you do not need to worry about the format of their contents or the particular sequence of zeros and ones; the files should simply be identical. You can use the following code to read an entire text file and store it in a char vector:
C = char(join(readlines(filename), ''));
To compare the contents of two files and find the error rate (the fraction of characters that differ), you can do the following:
act = char(join(readlines(actualfilename), ''));
exp = char(join(readlines(expectedfilename), ''));
err = (sum(act~=exp))/length(act);
You should also handle the case where the two files contain different numbers of characters:
act = char(join(readlines(actualfilename), ''));
exp = char(join(readlines(expectedfilename), ''));
al = length(act); % actual length
el = length(exp); % expected length
dl = abs(al-el);
if (dl>0)
ml = min(al, el); % min length
act = act(1:ml); % shorten act if needed
exp = exp(1:ml); % shorten exp if needed
end
err = (sum(act~=exp)+dl)/al % error
Note that in the second case, if a character is added or lost in the middle of the file, all subsequent characters are shifted and most of them will be counted as errors.
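For releases that do not have readlines, the same comparison can be sketched with fileread. This is only an illustration under stated assumptions: the two demo files and their contents are hypothetical and are created in tempdir so the snippet runs on its own, and anything that is not a '0' or '1' (such as line breaks) is simply discarded before comparing.

```matlab
% Hypothetical demo files, created here only so the sketch is self-contained.
sentFile = fullfile(tempdir, 'sent_demo.txt');
recvFile = fullfile(tempdir, 'received_demo.txt');
fid = fopen(sentFile, 'w'); fprintf(fid, '0101101\n'); fclose(fid);
fid = fopen(recvFile, 'w'); fprintf(fid, '0111100\n'); fclose(fid);

% Read each file as text and keep only the '0' and '1' characters.
sentBits = fileread(sentFile);
sentBits = sentBits(sentBits == '0' | sentBits == '1');
recvBits = fileread(recvFile);
recvBits = recvBits(recvBits == '0' | recvBits == '1');

% Compare over the common length; count any length difference as errors.
n = min(numel(sentBits), numel(recvBits));
lengthDiff = abs(numel(sentBits) - numel(recvBits));
numErrors = sum(sentBits(1:n) ~= recvBits(1:n)) + lengthDiff;
errorRate = numErrors / numel(sentBits);   % fraction of the sent bits
```

With these demo contents the files differ in positions 3 and 7, so numErrors is 2 and errorRate is 2/7.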
Reading in the Text Files:
If the text file is configured with spaces or line breaks:
Text.txt (line breaks)
0
1
0
1
1
Text.txt (spaces)
1 0 1 0 1 1
The data can be read with the fscanf() function, using the format specification %d to read the file contents as integers.
File_Name = "Text.txt";
File_ID = fopen(File_Name);
Binary = fscanf(File_ID,'%d');
fclose(File_ID);
If the text file has the characters beside/concatenated on the same line without spaces:
Text.txt (single line, no spaces)
01011
The text file can be read using the format specification %s, which reads the file as a string. This string can be split and converted into an array by using split(), cell2mat() and str2num().
split() - Splits the string into a cell array of individual bits
cell2mat() - Converts the cell array to a character array
str2num() - Converts the character array to a numerical double array
File_Name = "Text.txt";
File_ID = fopen(File_Name);
Binary = fscanf(File_ID,'%s');
fclose(File_ID);
Binary = split(Binary,'');
Binary = str2num(cell2mat(Binary(2:end-1))).';
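As a possible shortcut (not part of the approach above), a character vector of '0' and '1' characters can also be turned into numbers by subtracting the character '0', because MATLAB does arithmetic on the character codes. The variable bitString below is a stand-in for the string returned by fscanf(File_ID,'%s').

```matlab
bitString = '01011';        % example content, as fscanf(File_ID,'%s') would return it
Binary = bitString - '0';   % character codes 48/49 minus 48 give the doubles 0/1
% Binary is now the row vector [0 1 0 1 1]
```

This only gives meaningful results if the string contains nothing but the characters '0' and '1'.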
Comparing to Evaluate Amount of Errors:
Error checking can be done by comparing the arrays element-wise with a logical condition. The nnz() (number of non-zeros) function then counts the number of times the condition is true ("1"). Here the condition is that the two binary signals Binary_1 and Binary_2 are not equal to each other.
Code Snippet:
Error = nnz(Binary_1 ~= Binary_2);
Error
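A small worked example may make the counting clearer. The two vectors below are made-up sample data, not values from any real file; the sketch also shows where the mismatches are and what fraction of the bits they represent.

```matlab
Binary_1 = [0 1 0 1 1 0 1 0];         % made-up "sent" bits
Binary_2 = [0 1 1 1 1 0 0 0];         % made-up "received" bits

Mismatch  = (Binary_1 ~= Binary_2);   % logical vector, true where the bits differ
Error     = nnz(Mismatch);            % number of differing bits: 2
Positions = find(Mismatch);           % indices of the differing bits: [3 7]
ErrorRate = Error / numel(Binary_1);  % fraction of bits in error: 0.25
```

Note that ~= requires both vectors to have the same size, so files of different lengths need to be trimmed or padded first.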
Full Script Option 1 (line breaks/spaces text file):
File_Name = "Text_1.txt";
File_ID = fopen(File_Name);
Binary_1 = fscanf(File_ID,'%d');
fclose(File_ID);
File_Name = "Text_2.txt";
File_ID = fopen(File_Name);
Binary_2 = fscanf(File_ID,'%d');
fclose(File_ID);
clearvars -except Binary_1 Binary_2
Error = nnz(Binary_1 ~= Binary_2);
Error
Full Script Option 2 (single line, no spaces text file):
File_Name = "Text_1.txt";
File_ID = fopen(File_Name);
Binary_1 = fscanf(File_ID,'%s');
Binary_1 = split(Binary_1,'');
Binary_1 = str2num(cell2mat(Binary_1(2:end-1))).';
fclose(File_ID);
File_Name = "Text_2.txt";
File_ID = fopen(File_Name);
Binary_2 = fscanf(File_ID,'%s');
Binary_2 = split(Binary_2,'');
Binary_2 = str2num(cell2mat(Binary_2(2:end-1))).';
fclose(File_ID);
clearvars -except Binary_1 Binary_2
Error = nnz(Binary_1 ~= Binary_2);
Error
Ran using MATLAB R2019b
How can I go about doing this? So far I've opened the file like this
fileID = fopen('hamlet.txt','r');
[A,count] = fscanf(fileID, '%s');
fclose(fileID);
Getting spaces from the file
First, if you want to capture spaces, you'll need to change your format specifier. %s reads only non-whitespace characters.
>> fileID = fopen('space.txt','r');
>> A = fscanf(fileID, '%s');
>> fclose(fileID);
>> A
A = Thistexthasspacesinit.
Instead, we can use %c:
>> fileID = fopen('space.txt','r');
>> A = fscanf(fileID, '%c');
>> fclose(fileID);
>> A
A = This text has spaces in it.
Mapping between characters and values (array indices)
We could create a character array that contains all of the target characters to look for:
search_chars = ['A':'Z', 'a':'z', ',', '.', ' '];
That would work, but to map the character to a position in the array you'd have to do something like:
>> char_pos = find(search_chars == 'q')
char_pos = 43
You could also use containers.Map, but that seems like overkill.
Instead, let's use the ASCII value of each character. For convenience, we'll use only values 1:126 (0 is NUL, and 127 is DEL. We should never encounter either of those.) Converting from characters to their ASCII code is easy:
>> c = 's'
c = s
>> a = uint8(c) % MATLAB actually does this using double(). Seems wasteful to me.
a = 115
>> c2 = char(a)
c2 = s
Note that by doing this, you're counting characters that are not in your desired list like ! and *. If that's a problem, then use search_chars and figure out how you want to map from characters to indices.
Looping solution
The most intuitive way to count each character is a loop. For each character in A, find its ASCII code and increment the counter array at that index.
char_count = zeros(1, 126);
for current_char = A
c = uint8(current_char);
char_count(c) = char_count(c) + 1;
end
Now you've got an array of counts for each character with ASCII codes from 1 to 126. To find out how many instances of 's' there are, we can just use its ASCII code as an index:
>> char_count(115)
ans = 4
We can even use the character itself as an index:
>> char_count('s')
ans = 4
Vectorized solution
As you can see with that last example, MATLAB's weak typing makes characters and their ASCII codes pretty much equivalent. In fact:
>> 's' == 115
ans = 1
That means that we can use implicit expansion (MATLAB's form of broadcasting, available since R2016b) and == to create a logical 2D array where L(c,a) == 1 if character c in our string A has an ASCII code of a. Then we can get the count for each ASCII code by summing along the columns.
L = (A.' == [1:126]);
char_count = sum(L, 1);
A one-liner
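Just for fun, I'll show one more way to do this: histcounts. This is meant to put values into bins, but as we said before, characters can be treated like values. Note that N bin edges define N-1 bins, so getting one bin for each of the codes 1 to 126 takes the edges 1:127.
char_count = histcounts(uint8(A), 1:127);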
There are dozens of other possibilities, for instance you could use the search_chars array and ismember(), but this should be a good starting point.
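As a rough sketch of the ismember() route mentioned above: the second output of ismember gives, for each character of A, its position in search_chars, and accumarray can then tally those positions. The sample string is the one from the space.txt example; using accumarray for the tally is my own choice here, not something the approach requires.

```matlab
A = 'This text has spaces in it.';                    % sample text from space.txt above
search_chars = ['A':'Z', 'a':'z', ',', '.', ' '];

% idx(k) is the position of A(k) in search_chars (0 if A(k) is not listed).
[found, idx] = ismember(A, search_chars);

% Tally how often each position occurs; characters not in the list are skipped.
char_count = accumarray(idx(found).', 1, [numel(search_chars), 1]).';

num_s = char_count(search_chars == 's');              % 4, matching the loop result
```

Here char_count(k) is the count for search_chars(k), so only the characters of interest are counted.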
With [A,count] = fscanf(fileID, '%s'); you only get a single total, not a count per letter. You can use regexp instead: pass it a cell array of the letters you want to search for, and it returns a cell array whose cells contain the indices at which each letter occurs. In the end you only need to count the indices in each cell to get the count for each letter:
fileID = fopen('hamlet.txt','r');
A = fscanf(fileID, '%s');
indexCellArray = regexp(A,{'A','B','C','D',... %I'm too lazy to add the other letters now^^
'a','b','c','d',...
',','\.',' '}); % '.' must be escaped, otherwise it matches any character
letterCount = cellfun(@(x) numel(x),indexCellArray);
fclose(fileID);
You may want to put the counts in a struct with a field name for each letter; otherwise you might lose track of which count belongs to which letter.
There may be a much easier solution, since typing all the letters into the regexp call is rather tedious, but this one works.
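One way to keep each count attached to its letter, sketched here with containers.Map instead of a struct because ',' , '.' and ' ' are not valid struct field names. The sample string and the variable names letters and countsByLetter are made up for illustration, and the counting is done with a plain comparison rather than regexp.

```matlab
A = 'Hamlet, Prince of Denmark.';                     % made-up sample text
letters = ['A':'Z', 'a':'z', ',', '.', ' '];          % characters to count

% Count the occurrences of each listed character in A.
letterCount = arrayfun(@(ch) sum(A == ch), letters);

% Map each character (key) to its count (value).
countsByLetter = containers.Map(num2cell(letters), num2cell(letterCount));

count_e = countsByLetter('e');                        % 3
count_a = countsByLetter('a');                        % 2
```

If the text is read with fscanf(fileID, '%s'), the count for ' ' will always be 0, because %s skips whitespace.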
When I save a table as a PDF (using Report Generator), the numeric fields (Index1, Index2, Index3) do not appear in shortG format (5 digits in total). What is the problem and how can I fix it?
The code:
function ButtonPushed(app, event)
import mlreportgen.dom.*;
import mlreportgen.report.*
format shortG;
ID = [1;2;3;4;5];
Name = {'San';'John';'Lee';'Boo';'Jay'};
Index1 = [71.1252;69.2343245;64.345345;67.345322;64.235235];
Index2 = [176.23423;163.123423654;131.45364572;133.5789435;119.63575647];
Index3 = [176.234;16.123423654;31.45364572;33.5789435;11.6647];
mt = table(ID,Name,Index1,Index2,Index3);
d = Document('myPDF','pdf');
d.OutputPath = ['E:/','temp'];
append(d,'Table 1: ');
append(d,mt);
close(d);
rptview(d.OutputPath);
end
To fix it, format your numeric arrays to character arrays with 5 significant digits before writing to PDF.
mt = table(ID,Name,f(Index1),f(Index2),f(Index3));
where,
function FivDigsStr = f(x)
%formatting to character array with 5 significant digits and then splitting.
%at each tab. categorical is needed to remove ' 's that appear around char
%in the output PDF file with newer MATLAB versions
%e.g. with R2018a, there are no ' ' in the output file but ' ' appears with R2020a
FivDigsStr = categorical(split(sprintf('%0.5G\t',x)));
%Removing the last (<undefined>) value (which is included due to \t)
FivDigsStr = FivDigsStr(1:end-1);
end
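As a quick check of the format specifier used in f, the sketch below applies '%0.5G' to a few of the values from the question, one at a time, and collects the results in a cell array. The use of arrayfun is just for illustration and is not how f itself is written.

```matlab
x = [71.1252; 69.2343245; 176.23423; 16.123423654];   % values taken from Index1..Index3

% Format each value with 5 significant digits.
strs = arrayfun(@(v) sprintf('%0.5G', v), x, 'UniformOutput', false);
% strs is {'71.125'; '69.234'; '176.23'; '16.123'}
```

Each value keeps 5 significant digits in total, regardless of where the decimal point falls.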
The above change gives the following result:
Edit:
To bring back the headers:
mt.Properties.VariableNames(3:end) = {'Index1', 'Index2', 'Index3'};
or, more generally, you can use inputname to extract the variable names instead of hardcoding them.
V = @(x) inputname(1);
mt.Properties.VariableNames(3:end) = {V(Index1), V(Index2), V(Index3)};
which gives:
I want to read all WAV files contained in a folder and save the samples of each file to separate CSV files in another folder.
This is my code:
dirMask = 'inputFolder\*.wav';
wavRoot = fileparts(dirMask);
Files=dir(dirMask);
for k=1:length(Files)
FileNames = fullfile(wavRoot, Files(k).name);
[s,fs] = audioread(FileNames);
end
fid = fopen('\filename.xls','a');
fprintf(fid,'%f\n',num2str(s));
fclose(fid);
This code doesn't work. How can I do this?
Firstly, note that you are using %f, which is for floats, but you are converting "s" into a string with num2str. Also, \n jumps to the next line, so if you need several columns you will have to decide where to use \n, \t (tab) or just ";", for example. In any case, check the MATLAB documentation on formatting text.
So if you know the exact number of columns (e.g. 3 columns) you can write:
fprintf(fid,'%s \t %s \t %s \n',string1, string2, string3);
If you want to do it within a loop you can check your iterator and add "\n" every X strings.
E.g.
ncols = 3
for i = 1:21
mystring = num2str(i)
if mod(i,ncols) == 0
fprintf(fid,'%s \n',mystring);
else
fprintf(fid,'%s ', mystring);
end
end
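For the original goal of one CSV file per WAV file, a minimal sketch could look like the following. It assumes that the write has to happen inside the loop (in the question it comes after the loop, so only the last file's samples would be written), that writematrix is available (R2019a or later; dlmwrite or csvwrite could be substituted in older releases), and it uses hypothetical folders under tempdir plus a generated test tone so that it runs on its own.

```matlab
% Hypothetical input/output folders, created only for this demonstration.
inputFolder  = fullfile(tempdir, 'wav_in_demo');
outputFolder = fullfile(tempdir, 'csv_out_demo');
if ~exist(inputFolder, 'dir'),  mkdir(inputFolder);  end
if ~exist(outputFolder, 'dir'), mkdir(outputFolder); end

% Generate a one-second 440 Hz test tone so there is something to read.
fsDemo = 8000;
tone = 0.5 * sin(2*pi*440*(0:fsDemo-1).' / fsDemo);
audiowrite(fullfile(inputFolder, 'tone.wav'), tone, fsDemo);

% Read every WAV file and write its samples to a CSV file of the same name.
files = dir(fullfile(inputFolder, '*.wav'));
for k = 1:numel(files)
    [s, fs] = audioread(fullfile(inputFolder, files(k).name));
    [~, baseName] = fileparts(files(k).name);
    writematrix(s, fullfile(outputFolder, [baseName '.csv']));  % one row per sample
end
```

Each CSV file gets one row per sample and one column per audio channel.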
I am generating 2500 values in Matlab in format (time,heart_rate, resp_rate) by using below code
numberOfSeconds = 2500;
time = 1:numberOfSeconds;
newTime = transpose(time);
number0 = size(newTime, 1)
% generating heart rates
heart_rate = 50 +(70-50) * rand (numberOfSeconds,1);
intHeartRate = int64(heart_rate);
number1 = size(intHeartRate, 1)
% hist(heart_rate)
% generating resp rates
resp_rate = 50 +(70-50) * rand (numberOfSeconds,1);
intRespRate = int64(resp_rate);
number2 = size(intRespRate, 1)
% hist(heart_rate)
% joining time and sensor data
joinedStream = strcat(num2str(newTime),{','},num2str(intHeartRate),{','},num2str(intRespRate))
dlmwrite('/Users/amar/Desktop/geenrated/rate.txt', joinedStream,'delimiter','');
The data shown in the console looks fine, but when I save it to a .txt file, each line contains extra spaces at the beginning, so I am not able to parse the .txt file to generate the input stream. Please help.
Replace the last two lines of your code with the following. No need to use strcat if you want a CSV output file.
dlmwrite('/Users/amar/Desktop/geenrated/rate.txt', [newTime intHeartRate intRespRate]);
The solution suggested in the other answer is the simplest for your case. This one explains why you got the unexpected output.
The data written in the file is exactly what is shown in the console.
>> joinedStream(1) %The exact output will differ since 'rand' is used
ans =
cell
    '   1,60,63'
num2str converts a matrix into a character array, so every row must have the same number of characters. For each column of the original matrix, the entry with the most characters sets the width, and shorter entries are padded with leading spaces. Columns are separated by 2 spaces. The following smaller example illustrates this:
>> num2str([44, 42314; 4, 1212421])
ans =
  2×11 char array
    '44    42314'
    ' 4  1212421'
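If text lines without padding are needed, one option is to let fprintf do the formatting instead of num2str. The sketch below is only an illustration: the values are made up, the output file name is hypothetical, and the matrix is transposed because fprintf consumes its data column by column.

```matlab
newTime      = (1:5).';                        % made-up sample data
intHeartRate = int64([60; 55; 70; 61; 58]);
intRespRate  = int64([63; 50; 52; 69; 57]);

outFile = fullfile(tempdir, 'rate_demo.txt');  % hypothetical output file
fid = fopen(outFile, 'w');
% Each row of the matrix becomes one "time,heart_rate,resp_rate" line.
fprintf(fid, '%d,%d,%d\n', double([newTime, intHeartRate, intRespRate]).');
fclose(fid);

written = fileread(outFile);                   % starts with '1,60,63'
```

Because %d writes each number with exactly as many characters as it needs, no leading spaces appear in the file.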
I have a txt file that I want to read into Matlab. Data format is like below:
term2 2015-07-31-15_58_25_612 [0.9934343, 0.3423043, 0.2343433, 0.2342323]
term0 2015-07-31-15_58_25_620 [12]
term3 2015-07-31-15_58_25_625 [2.3333, 3.4444, 4.5555]
...
How can I read these data in the following way?
name = [term2 term0 term3] or namenum = [2 0 3]
time = [2015-07-31-15_58_25_612 2015-07-31-15_58_25_620 2015-07-31-15_58_25_625]
data = {[0.9934343, 0.3423043, 0.2343433, 0.2342323], [12], [2.3333, 3.4444, 4.5555]}
I tried to use textscan in this way 'term%d %s [%f, %f...]', but for the last data part I cannot specify the length because they are different. Then how can I read it? My Matlab version is R2012b.
Thanks a lot in advance if anyone could help!
There may be a way to do that in a single pass, but for me this kind of problem is easier to solve with a two-pass approach.
Pass 1: Read all the columns with a constant format according to their type (string, integer, etc ...) and read the non constant part in a separate column which will be processed in second pass.
Pass 2: Process your irregular column according to its specificities.
In a case with your sample data, it looks like this:
%% // read file
fid = fopen('Test.txt','r') ;
M = textscan( fid , 'term%d %s %*c %[^]] %*[^\n]' ) ;
fclose(fid) ;
%% // dispatch data into variables
name = M{1,1} ;
time = M{1,2} ;
data = cellfun( @(s) textscan(s,'%f',Inf,'Delimiter',',') , M{1,3} ) ;
What happened:
The first textscan instruction reads the full file. In the format specifier:
term%d read the integer after the literal expression 'term'.
%s read a string representing the date.
%*c ignore one character (to ignore the character '[').
%[^]] read everything (as a string) until it finds the character ']'.
%*[^\n] ignore everything until the next newline ('\n') character (to not capture the last ']').
After that, the first 2 columns are easily dispatched into their own variables. The 3rd column of the result cell array M contains strings of different lengths, each holding a different number of floating point numbers. We use cellfun in combination with another textscan to read the numbers in each cell and return a cell array containing doubles.
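To see what the second pass does to a single entry, here is a small sketch that runs the inner textscan call on one string by hand. The string is the bracket content of the first sample line; the variable names s, c and values are only for illustration.

```matlab
% Bracket content of the first sample line, as captured by %[^]] in pass 1.
s = '0.9934343, 0.3423043, 0.2343433, 0.2342323';

% textscan also accepts text instead of a file identifier.
c = textscan(s, '%f', Inf, 'Delimiter', ',');

values = c{1};   % 4x1 column vector of doubles
```

textscan returns a 1x1 cell here, which is why each element of data ends up holding a column vector of doubles.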
Bonus:
If you want your time to be a numeric value as well (instead of a string), use the following extension of the code:
%% // read file
fid = fopen('Test.txt','r') ;
M = textscan( fid , 'term%d %f-%f-%f-%f_%f_%f_%f %*c %[^]] %*[^\n]' ) ;
fclose(fid) ;
%% // dispatch data
name = M{1,1} ;
time_vec = cell2mat( M(1,2:7) ) ;
time_ms = M{1,8} ./ (24*3600*1000) ; %// take care of the milliseconds separately as they are not handled by "datenum"
time = datenum( time_vec ) + time_ms ;
data = cellfun( @(s) textscan(s,'%f',Inf,'Delimiter',',') , M{1,end} ) ;
This will give you an array time containing MATLAB serial date numbers (often easier to use than strings). To show that the serial numbers still represent the right times:
>> datestr(time,'yyyy-mm-dd HH:MM:SS.FFF')
ans =
2015-07-31 15:58:25.612
2015-07-31 15:58:25.620
2015-07-31 15:58:25.625
For complicated string parsing situations like this one, it is best to use regexp. Assuming you have the data in the file data.txt, the following code should do what you are looking for:
txt = fileread('data.txt')
tokens = regexp(txt,'term(\d+)\s(\S*)\s\[(.*)\]','tokens','dotexceptnewline')
% Convert namenum to numeric type
namenum = cellfun(@(x)str2double(x{1}),tokens)
% Get time stamps from the second row of all the tokens
time = cellfun(@(x)x{2},tokens,'UniformOutput',false);
% Split the numbers in the third column
data = cellfun(@(x)str2double(strsplit(x{3},',')),tokens,'UniformOutput',false)
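The same regexp approach can be tried without a file by building the sample text in memory. In this sketch the three lines are the ones from the question; the variable txt simply stands in for the output of fileread('data.txt').

```matlab
% Sample lines from the question, joined with newlines to mimic the file content.
txt = sprintf('%s\n', ...
    'term2 2015-07-31-15_58_25_612 [0.9934343, 0.3423043, 0.2343433, 0.2342323]', ...
    'term0 2015-07-31-15_58_25_620 [12]', ...
    'term3 2015-07-31-15_58_25_625 [2.3333, 3.4444, 4.5555]');

% One token group per field: number after 'term', time stamp, bracket content.
tokens = regexp(txt, 'term(\d+)\s(\S*)\s\[(.*)\]', 'tokens', 'dotexceptnewline');

namenum = cellfun(@(x) str2double(x{1}), tokens);                    % [2 0 3]
time    = cellfun(@(x) x{2}, tokens, 'UniformOutput', false);
data    = cellfun(@(x) str2double(strsplit(x{3}, ',')), tokens, ...
                  'UniformOutput', false);                           % row vectors
```

The 'dotexceptnewline' option keeps the greedy (.*) from running past the end of a line, so each line yields exactly one match.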