Matlab fscanf read two column character/hex data from text file - matlab

Need to read in data stored as two columns of hex values in text file temp.dat into a Matlab variable with 8 rows and two columns.
Would like to stick with the fscanf method.
temp.dat looks like this (8 rows, two columns):
0000 7FFF
30FB 7641
5A82 5A82
7641 30FB
7FFF 0000
7641 CF05
5A82 A57E
30FB 89BF
% Matlab code
fpath = './';
fname = 'temp.dat';
fid = fopen([fpath fname],'r');
% Matlab treats hex as a character string
formatSpec = '%s %s';
% Want the output variable to be 8 rows two columns
sizeA = [8,2];
A = fscanf(fid,formatSpec,sizeA)
fclose(fid);
Matlab is producing the following which I don't expect.
A = 8×8 char array
'03577753'
'00A6F6A0'
'0F84F48F'
'0B21F12B'
'77530CA8'
'F6A00F59'
'F48F007B'
'F12B05EF'
In another variation, I attempted changing the format string like this
formatSpec = '%4c %4c';
Which produced this output:
A =
8×10 char array
'0↵45 F7↵78'
'031A3F65E9'
'00↵80 4A↵B'
'0F52F0183F'
'7BA7B0C20 '
'F 86↵0F F '
'F724700AB '
'F6 1F↵55 '
Still another variation like this:
formatSpec = '%4c %4c';
sizeA = [8,16];
A = fscanf(fid,formatSpec);
Produces a one by 76 character array:
A =
'00007FFF
30FB 7641
5A82 5A827641 30FB
7FFF 0000
7641CF05
5A82 A57E
30FB 89BF'
Would like and expect Matlab to produce a workspace variable with 8 rows and 2 columns.
Have followed the example on the Matlab help area here:
https://www.mathworks.com/help/matlab/ref/fscanf.html
My Matlab code is based on the 'read file contents into an array' section about 1/3 of the way down the page. The example I reference is doing something very similar except that the two columns are one int and one float rather than two characters.
Running Matlab R2017a on Redhat.
Here is the complete code with the solution provided by Azim and comments about
what I learned as a result of posting the question.
fpath = './';
fname = 'temp.dat';
fid = fopen([fpath fname],'r');
formatSpec = '%9c\n';
% specify the output size as the input transposed, NOT the input.
sizeA = [9,8];
A = fscanf(fid,formatSpec,sizeA);
% A' is an 8 by 9 character array, which is the goal matrix size.
% B is an 8 by 1 cell array, each member has this format 'dead beef'.
%
% Cell arrays are data types with indexed data containers called cells,
% where each cell can contain any type of data.
B = cellstr(A');
% split divides each string in B at whitespace characters.
S = split(B)
fclose(fid);
S =
8×2 cell array
'0000' '7FFF'
'30FB' '7641'
'5A82' '5A82'
'7641' '30FB'
'7FFF' '0000'
'7641' 'CF05'
'5A82' 'A57E'
'30FB' '89BF'

Your 8x2 MATLAB variable will most likely end up being a cell array. This can be done in two steps.
First, your lines have 9 characters, so you can use formatSpec = '%9c\n' to read each line. Next, adjust the size parameter to read 9 rows and 8 columns: sizeA = [9 8]. fscanf fills its output column by column, so each 9-character line lands in one column of the output; transposing the output then gives one line per row.
In the second step you need to convert the result of fscanf into your 8x2 cell array. Since you have R2017a you can then use cellstr and split to get your result.
Finally, if you need the integer values of each hex value you can use hex2dec on each cell in the cell-array.
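As a rough sketch of that last step: hex2dec accepts a cell array of character vectors, so the whole cell array can be converted in one call. The block also shows an alternative that stays with fscanf by using the %x conversion, which reads hexadecimal text directly as numbers. The file name hex_demo.dat and the three sample rows are made up for illustration, and the signed 16-bit interpretation at the end is only an assumption about what the data represents.

```matlab
% A few rows of the cell array produced by split (values taken from temp.dat).
S = {'0000' '7FFF'; '30FB' '7641'; '7641' 'CF05'};

% Flatten, convert, then restore the original shape.
V = reshape(hex2dec(S(:)), size(S));

% Alternative: let fscanf parse the hex digits itself with %x.
% hex_demo.dat is a hypothetical scratch file written only for this sketch.
fname = fullfile(tempdir, 'hex_demo.dat');
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', '0000 7FFF', '30FB 7641', '7641 CF05');
fclose(fid);

fid = fopen(fname, 'r');
% fscanf fills column by column, so read 2-by-N and transpose to N-by-2.
A = fscanf(fid, '%x %x', [2 Inf])';
fclose(fid);
delete(fname);

% Assumption: the words are 16-bit two's-complement values.
As = reshape(double(typecast(uint16(A(:)), 'int16')), size(A));
```

Both routes give the same numeric matrix here; the %x route skips the character array entirely, at the cost of losing the original hex strings.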

Related

Converting csv of strings into a matrix

I just started using Octave (No money for Matlab :/) and I'm also new to Stack Overflow, so please pardon any error I make with conventions.
Problem: I have a csv of strings like so:
Bob Marley,Kobe Bryant,Michael Jackson,Kevin Hart
I would like to make this into a 1 column matrix (I need it in a matrix so that I can combine it with data that are in other matrices).
My approach: I have tried doing textread, but this gives me a cell array. I tried converting the resulting cell array to a matrix by using cell2mat, but I suspect that I cannot do this because my strings are of varying lengths.
Let me know if any other information is necessary.
You can convert the cell array to a char array, which is an ordinary matrix of characters:
fid = fopen('strings.csv');
A = textscan(fid, '%s', 'delimiter', ',');
B = char(A{:})
[rows, cols] = size(B)
Output is the following:
B =
Bob Marley
Kobe Bryant
Michael Jackson
Kevin Hart
rows = 4
cols = 15
As you can see, the number of columns of B is the maximum length of all "strings" (Michael Jackson, 15). All other "strings" get whitespaces appended.
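The padding behaviour can be checked without a file. This is a small sketch using the names from the question; cellstr is used at the end to show that the trailing padding can be stripped again if the original strings are needed.

```matlab
names = {'Bob Marley', 'Kobe Bryant', 'Michael Jackson', 'Kevin Hart'};

% char pads every row with trailing spaces to the length of the longest name.
B = char(names{:});
[rows, cols] = size(B);

% cellstr converts back to a cell array and removes the trailing padding.
C = cellstr(B);
```

Here rows is 4 and cols is 15, the length of 'Michael Jackson'.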
Assuming you are in the directory that contains the file "strings.csv" with the content you mentioned in the question, your code would look like this:
fid=fopen('strings.csv');
A=textscan(fid,'%s','delimiter',',');
fclose(fid);
A=A{1};
A=cellfun(@(x) string(x),A,'uni',0);
B=[A{:}];
If your data is that simple, you can do it in a one-liner. Use fileread to slurp all the data in, and then strsplit to separate the elements, and a ' transpose to convert it to a column vector.
x = strsplit(fileread('myfile.txt'), ',')'
If you end up with spaces around the commas in your data, upgrade to regexp.
x = regexp(fileread('myfile.txt'), ' *, *', 'split')
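A minimal sketch of the regexp variant, using an in-memory string instead of a file so it can be run as is. One assumption added here: text returned by fileread usually ends with a newline, which would otherwise stick to the last name, so the sketch applies strtrim first.

```matlab
% Stand-in for fileread('myfile.txt'): stray spaces around some commas
% and a trailing newline, as a file on disk would typically have.
txt = sprintf('Bob Marley , Kobe Bryant,Michael Jackson ,Kevin Hart\n');

% Trim the ends, split on commas with optional surrounding spaces,
% and transpose to get a column.
x = regexp(strtrim(txt), ' *, *', 'split')';
```

The result is a 4x1 cell array of names with no surrounding whitespace.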

Read multiple data from a MATLAB file

I am currently trying to read data from a text file written exactly like this:
Height = 10
Length = 10
NodeX = 11
NodeY = 11
K = 10
I've written a small code like this
fileID = fopen('input.dat','r');
[a, b] = fscanf(fileID, '%s %f')
And I get the following answer:
a =
72
101
105
103
104
116
b =
1
It seems quite obvious I am not managing to specify the format correctly.
I would like to know how to pick a string along with a float multiple times in the same file.
As the documentation for fscanf states:
If formatSpec contains a combination of numeric and character
specifiers, then fscanf converts each character to its numeric
equivalent. This conversion occurs even when the format explicitly
skips all numeric values (for example, formatSpec is '%*d %s').
MATLAB can be annoyingly bad at reading mixed data types. One possible alternative is to read each line and split up your data using a simple regular expression:
fileID = fopen('results.txt','r');
mydata = {};
ii = 1;
while ~feof(fileID) % While we're not at the end of the file
    tline = fgetl(fileID); % Get next line
    % Quantifiers go inside the groups so each token captures the whole name/number
    mydata(ii,:) = regexp(tline, '([a-zA-Z]*) = (\d*)', 'tokens');
    ii = ii + 1;
end
fclose(fileID);
This returns a 5 x 1 cell array where each cell contains 2 cells (slightly annoying, but you can pull them out) that match your data. In this case, mydata{1}{1} is Height and mydata{1}{2} is 10.
Edit:
And you can flatten your cell array with a reshape call:
mydata = reshape([mydata{:}], 2, [])';
Which turns mydata in this case into a 5x2 cell array.
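If the goal is to look values up by name, the same line-by-line idea can feed a struct instead of a cell array. This is only a sketch: the lines are given in memory rather than read with fgetl, the struct name params is made up, and it assumes every name is a valid MATLAB field name.

```matlab
% Stand-in for the lines returned by fgetl.
lines = {'Height = 10', 'Length = 10', 'NodeX = 11', 'NodeY = 11', 'K = 10'};

params = struct();
for ii = 1:numel(lines)
    % 'once' returns the tokens of the first match as a flat 1x2 cell.
    tok = regexp(lines{ii}, '^\s*(\w+)\s*=\s*(\S+)', 'tokens', 'once');
    if ~isempty(tok)
        % Dynamic field name: params.Height, params.Length, ...
        params.(tok{1}) = str2double(tok{2});
    end
end
```

After the loop, params.Height and the other fields hold the values as doubles.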
The fscanf function is a low-level I/O function and is often not the best choice for such rather high-level file input. One alternative would be to use the textscan function, which allows quite advanced format specifications:
fileID = fopen('input.dat','r');
C = textscan(fileID,'%s = %d')
which creates a 1x2 cell array. The first cell C{1} contains another 5x1 cell, where each field contains the name of the field, e.g. 'Height'. The second cell C{2} contains a 5x1 vector containing all integer values from the file.
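A sketch of how the textscan output might be used afterwards. textscan also accepts a character vector in place of a file identifier, which keeps the example self-contained. The %f conversion is used instead of %d here so the values come back as doubles, and the containers.Map lookup is just one possible way to pair names with values, not part of the answer above.

```matlab
% Stand-in for the contents of input.dat.
txt = sprintf('Height = 10\nLength = 10\nNodeX = 11\nNodeY = 11\nK = 10\n');

% The literal '=' in the format is matched and discarded.
C = textscan(txt, '%s = %f');
names  = C{1};   % 5x1 cell array of names
values = C{2};   % 5x1 double vector

% Optional: name-to-value lookup table.
lookup = containers.Map(names, values);
heightValue = lookup('Height');
```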

Exporting blank values into a .txt file - MATLAB

I'm currently trying to export multiple matrices of unequal lengths into a delimited .txt file thus I have been padding the shorter matrices with 0's such that dlmwrite can use horzcat without error:
dlmwrite(filename{1},[a,b],'delimiter','\t')
However ideally I do not want the zeroes to appear in the .txt file itself - but rather the entries are left blank.
Currently the .txt file looks like this:
55875 3.1043e+05
56807 3.3361e+05
57760 3.8235e+05
58823 4.2869e+05
59913 4.3349e+05
60887 0
61825 0
62785 0
63942 0
65159 0
66304 0
67509 0
68683 0
69736 0
70782 0
But I want it to look like this:
55875 3.1043e+05
56807 3.3361e+05
57760 3.8235e+05
58823 4.2869e+05
59913 4.3349e+05
60887
61825
62785
63942
65159
66304
67509
68683
69736
70782
Is there any way I can do this? Is there an alternative to dlmwrite that does not require the matrices to have equal lengths?
If a is always longer than b you could split vector a into two vectors of same length as vector b and the rest:
a = [1 2 3 4 5 6 7 8]';
b = [9 8 7 ]';
len = numel(b);
dlmwrite( 'foobar.txt', [a(1:len), b ], 'delimiter', '\t' );
dlmwrite( 'foobar.txt', a(len+1:end), 'delimiter', '\t', '-append');
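An alternative that avoids dlmwrite altogether is to write each row with fprintf and simply leave the second field empty once b runs out. This is a sketch under the same assumption that a is at least as long as b; the file name blank_demo.txt is made up, and the file is read back and deleted only to show the result.

```matlab
a = [1 2 3 4 5]';
b = [9 8 7]';

outfile = fullfile(tempdir, 'blank_demo.txt');  % hypothetical scratch file
fid = fopen(outfile, 'w');
for k = 1:numel(a)
    if k <= numel(b)
        fprintf(fid, '%g\t%g\n', a(k), b(k));   % both columns present
    else
        fprintf(fid, '%g\t\n', a(k));           % second column left blank
    end
end
fclose(fid);

txt = fileread(outfile);   % read back to inspect what was written
delete(outfile);
```

Keeping the tab on the short rows means every line still has two delimited fields, which some readers expect; drop it if a bare first column is preferred.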
You can read in the numeric data and convert to string and then add proper whitespaces to have the final output as string based cell array, which you can easily write into the output text file.
Stage 1: Get the cell of strings corresponding to the numeric data from column vector inputs a, b, c and so on -
%// Concatenate all arrays into a cell array with numeric data
A = [{a} {b} {c}] %// Edit this to add more columns
%// Create a "regular" 2D shaped cell array to store the cells from A
lens = cellfun('length',A)
max_lens = max(lens)
A_reg = cell(max_lens,numel(lens))
A_reg(:) = {''}
A_reg(bsxfun(@le,[1:max_lens]',lens)) = cellstr(num2str(vertcat(A{:}))) %//'
%// Create a char array that has string data from input arrays as strings
wsp = repmat({' '},max_lens,1) %// Create whitespace cell array
out_char = [];
for iter = 1:numel(A)
    out_char = [out_char char(A_reg(:,iter)) char(wsp)]
end
out_cell = cellstr(out_char)
Stage 2: Now that you have out_cell as the cell array of strings to be written to the text file, you have two options for the writing operation itself.
Option 1 -
dlmwrite('results.txt',out_cell(:),'delimiter','')
Option 2 -
outfile = 'results.txt';
fid = fopen(outfile,'w');
for row = 1:numel(out_cell)
    fprintf(fid,'%s\n',out_cell{row});
end
fclose(fid);

read text files containing binary data as a single matrix in matlab

I have a text file which contains binary data in the following manner:
00000000000000000000000000000000001011111111111111111111111111111111111111111111111111111111110000000000000000000000000000000
00000000000000000000000000000000000000011111111111111111111111111111111111111111111111000111100000000000000000000000000000000
00000000000000000000000000000000000011111111111111111111111111111111111111111111111111111111100000000000000000000000000000000
00000000000000000000000000000000000111111111111111111111111111111111111111111111111111111111100000000000000000000000000000000
00000000000000000000000000000000000011111111111111111111111111111111111111111111111111111111100000000000000000000000000000000
00000000000000000000000000000000000000011111111111111111111111111111111111111111111111111111100000000000000000000000000000000
00000000000000000000000000000000000000011111111111111111111111111111111111111111111111000111110000000000000000000000000000000
00000000000000000000000000000000000000111111111111111111111111111111111111111111111111111111110000000000000000000000000000000
00000000000000000000000000000000000000000000111111111111111111111111111111111111110000000011100000000000000000000000000000000
00000000000000000000000000000000000000011111111111111111111111111111111111111111111111100111110000000000000000000000000000000
00000000000000000000000000000000000111111111111111111111111111111111111111111111111111110111110000000000000000000000000000000
00000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111100000000000000000000000000000000
00000000000000000000000000000000000000001111111111111111111111111111111111111111111111000011100000000000000000000000000000000
00000000000000000000000000000000000000001111111111111111111111111111111111111111111111000011100000000000000000000000000000000
00000000000000000000000000000000000001111111111111111111111111111111111111111111111111111111000000000000000000000000000000000
00000000000000000000000000000000000000011111111111111111111111111111111111111111111110000011100000000000000000000000000000000
00000000000000000000000000000000000000000000011111111111111111111111111111111111100000000011100000000000000000000000000000000
00000000000000000000000000000000000000111111111111111111111111111111111111111111111111110111100000000000000000000000000000000
Please note that each 1 or 0 is independent i.e the values are not decimal. I need to find the column wise sum of the file. There are 125 columns in all and there are 840946 rows.
I have tried textread, fscanf and a few other matlab commands, but the result is that they all read each row in decimal format and create a 840946x1 array. I want to create a 840946x125 matrix to compute a column wise sum.
You can use textread to do it. Just read strings and later process them with sscanf, one digit at a time
A = textread('data.txt', '%s');
ncols = size(A, 1);
nrows = size(A{1}, 2);
A = reshape(sscanf([A{:}], '%1d'), nrows, ncols);
Note that A is now transposed relative to the file, i.e. it has 125 rows, one for each column of the file.
The column-wise sum of the file is therefore the sum along the second dimension of A:
colsum = sum(A, 2)';
Here's a slightly hack-ish approach:
A = textread('data.txt', '%s');
colsum = sum(cat(1,A{:})-'0')
Breakdown:
textread will read each line of 0's and 1's as a single string. A will therefore be a cell-string, with each element equal to a string of length 125.
cat(1,A{:}) will concatenate the cell string into a "normal" Matlab character array of size 840946-by-125.
Subtracting the character '0' (ASCII value 48) from a character array consisting of 0's and 1's returns their numeric values, since '0'-'0' = 0 and '1'-'0' = 1. The arithmetic operates on ASCII codes; for example, 'a'-0 = 97, the ASCII value of lower-case 'a'.
sum will finally sum over the columns of this array.
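Both approaches can be compared on a tiny stand-in for the file. This is a sketch with three made-up lines of four digits each, given directly as the cell array that textread would return. Note that the sscanf route produces the transposed matrix, so its sum has to be taken along the second dimension to match.

```matlab
% Stand-in for A = textread('data.txt', '%s'): one string per line.
A = {'0101'; '0111'; '1100'};

% Route 1: stack the strings into a char matrix and subtract '0'.
M = cat(1, A{:}) - '0';          % 3x4 numeric matrix of 0s and 1s
colsum1 = sum(M, 1);             % sum down each column of the file

% Route 2: parse one digit at a time with sscanf, then reshape.
nlines  = numel(A);
ndigits = numel(A{1});
T = reshape(sscanf([A{:}], '%1d'), ndigits, nlines);  % 4x3, transposed
colsum2 = sum(T, 2)';            % sum across lines for each digit position
```

Both colsum1 and colsum2 come out as [1 3 1 2] for these three lines.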

Reading text values into matlab variables from ASCII files

Consider the following file
var1 var2 variable3
1 2 3
11 22 33
I would like to load the numbers into a matrix, and the column titles into a variable that would be equivalent to:
variable_names = char('var1', 'var2', 'variable3');
I don't mind to split the names and the numbers in two files, however preparing matlab code files and eval'ing them is not an option.
Note that there can be an arbitrary number of variables (columns)
I suggest importdata for operations like this:
d = importdata('filename.txt');
The return is a struct with the numerical fields in a member called 'data', and the column headers in a field called 'colheaders'.
Another useful interface for importing and manipulating data like these is the 'dataset' class available in the Statistics Toolbox.
If the header is on the first row then
A = dlmread(filename,delimString,1,0);
will skip the header row and read the numeric data into the matrix A (the row and column offsets are zero-based).
You can then use
fid = fopen(filename);
headerString = fgetl(fid); % reads the header line into a string (fgetl is used here because fscanf with '%s' would run on past the first line)
fclose(fid);
You can then use strtok to split headerString into a cell array of names. This is one approach I can think of to deal with an unknown number of columns.
Just use textscan with different format specifiers.
fid = fopen(filename,'r');
heading = textscan(fid,'%s %s %s',1);
fgetl(fid); %advance the file pointer one line
data = textscan(fid,'%n %n %n');%read the rest of the data
fclose(fid);
In this case 'heading' will be a cell array containing cells with each column heading inside, so you will have to change them into cell array of strings or whatever it is that you want. 'data' will be a cell array containing a numeric array for each column that you read, so you will have to cat them together to make one matrix.
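For an arbitrary number of columns, the format strings do not have to be hard-coded. The following is a sketch of one way to do it, not the answer's exact method: read the header with fgetl, split it with strsplit, build the numeric format from the number of names, and let the 'CollectOutput' option of textscan join the columns into one matrix. The file name hdr_demo.txt is made up, and the file is written here only so the example runs on its own.

```matlab
% Write the sample file from the question (hypothetical scratch file).
fname = fullfile(tempdir, 'hdr_demo.txt');
fid = fopen(fname, 'w');
fprintf(fid, 'var1 var2 variable3\n1 2 3\n11 22 33\n');
fclose(fid);

fid = fopen(fname, 'r');
% Header line -> cell array of names (strsplit splits at whitespace by default).
names = strsplit(strtrim(fgetl(fid)));
% One %f per column, however many there are.
fmt = repmat('%f', 1, numel(names));
% CollectOutput concatenates the numeric columns into a single matrix.
C = textscan(fid, fmt, 'CollectOutput', true);
fclose(fid);
delete(fname);

data = C{1};                        % numeric matrix, one column per name
variable_names = char(names{:});    % padded char matrix, as in the question
```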