Read binary matrix from a file in Matlab - matlab

I have a binary square matrix with complex values, stored in a .bin format file. I have tried to read this 100-by-100 matrix with a Matlab script:
i=fopen('matrix.bin','r')
A=fread(i,[100 100])
This code does not correctly read the complex values contained in A. I only get a 100-by-100 matrix of integers.

MATLAB's fread supports the ANSI C types, but there is no native ANSI C type that represents complex numbers. Most likely, each complex number is stored as a pair of real and imaginary values.
Without information on how the binary file was saved, you can still perform some tests to figure this out. If each complex number is stored as a real part and an imaginary part, both in double precision, then a single complex number takes up 8 + 8 = 16 bytes. We can test this by seeking to the end of the file and checking how many bytes there are.
fID = fopen('matrix.bin','r')
fseek(fID, 0, 'eof') % Go to the end of file
ftell(fID) % Tell current position in the open file
fclose(fID)
If this number is equal to 16 * 100 * 100 = 160000, then you're in luck. There is no extra stuff saved in this file, and you can simply read the data by this code:
fID = fopen('matrix.bin','r');
% Read all 10000 (real, imaginary) pairs in one call, then transpose to 10000-by-2
data = fread(fID, [2 10000], 'double')';
fclose(fID);
You'll end up with a 10000-by-2 array, with each row holding the real and imaginary parts of one complex number. If the file size is 80000 bytes instead, then the real and imaginary parts are probably stored in single precision (4 + 4 = 8 bytes per number). If the file size is some other number, then it probably means some additional information is stored in the file. You'll have to know what that additional information is so you can read the file correctly.
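Once the layout is confirmed, the pairs can be combined into the complex 100-by-100 matrix. A minimal sketch, assuming the real and imaginary parts alternate as doubles and the elements were written in column-major order (both are assumptions about the file, not facts from the question); the file written here is only a stand-in so the example runs on its own:

```matlab
% Write a stand-in file: interleaved [re im re im ...] doubles (assumed layout)
n = 100;
Aref = complex(rand(n), rand(n));
fname = [tempname '.bin'];
fID = fopen(fname, 'w');
fwrite(fID, [real(Aref(:)).'; imag(Aref(:)).'], 'double');
fclose(fID);

% Read all pairs in one call: row 1 = real parts, row 2 = imaginary parts
fID = fopen(fname, 'r');
raw = fread(fID, [2 n*n], 'double');
fclose(fID);

% Combine and reshape (column-major element order assumed)
A = reshape(complex(raw(1,:), raw(2,:)), [n n]);
delete(fname);
```

If the matrix comes out with rows and columns swapped, the file was probably written row by row; A.' (the non-conjugate transpose) would fix that.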

Related

Retrieve data from .rec binary file

The question may be naive, but answers could help me.
A measurement is recorded in binary format, with a header that contains all information about the data and the data itself (i.e. a series of doubles).
The measurement data can be exported in csv format from the application, but it takes ages.
What do you have to pay attention to when trying to read data from a binary file? Is it even feasible to import the data as an array using Matlab, or to use LabVIEW (maybe exporting as .txt)?
The binary .rec file format may refer to various things (the Topfield audio/video encoding format based on MPEG4-TS, a proprietary audio encoding, or even MRI scanner data from Philips) ...
If it refers to an MRI scanner, you may find a direct reader on the File Exchange: Matlab PAR REC Reader
If it refers to something else, you can parse the binary file header and data yourself using the low-level routine fread
Edit
Not knowing the exact file format for your recorded sensor displacement, here is a dummy example that uses fread to read a large rec file block by block, supposing the header contains just the length of the data and the data is just a series of double values:
function [] = DummyReadRec()
%[
% Open rec file for reading
[fid, errmsg] = fopen('dummy.rec', 'r');
if (fid < 0), error(errmsg); end
cuo = onCleanup(@()fclose(fid));
% Read header (here supposing it is only an integer giving length of data)
recLength = fread(fid, 1, 'uint32');
% Read data block-by-block (here supposing it is only double values)
MAX_BLOCK_LENGTH = 512;
blockCount = ceil(recLength / MAX_BLOCK_LENGTH);
for bi = 1:blockCount,
% Will read a maximum of 'MAX_BLOCK_LENGTH' (or less if we're on the last block)
[recdata, siz] = fread(fid, [1 MAX_BLOCK_LENGTH], 'double');
% Do something with this block (fft or whatever)
offset = (bi-1)*MAX_BLOCK_LENGTH;
position = (offset+1):(offset+siz);
plot(position, 20*log10(abs(fft(recdata))));
drawnow();
end
%]
end
The answer is going to depend on the format of your binary file and how large it is.
I have done many conversions of various binary files, all with differing layouts. If the file will fit into memory, then you can just use fread as long as you know the layout of the binary file. Below is an example of reading a header and a simple data block. It would of course have to be modified depending on the layout of your file. Depending on the recording equipment and computer type, you may also need to use the machinefmt option ('ieee-le' or 'ieee-be') of fread ... that has burned me before.
%Open the File for reading
fid = fopen(yourRECfile,'r');
%Read the Header ... your layout will be different
header.MajorRel = fread(fid,1,'uint16'); %Major File Rev #
header.MinorRel = fread(fid,1,'uint16'); %Minor File Rev #
header.IRIGStart = fread(fid,1,'double'); %Start time in secs
header.Flags = fread(fid,1,'uint32'); %Flags
%Read everything else from there until end of file as a series of doubles.
data = fread(fid,inf,'double');
fclose(fid);
If the file does not fit into memory you will either need to process it in blocks or look into using memmapfile.
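For the header layout in the example above, the data would start 2 + 2 + 8 + 4 = 16 bytes into the file, so a memmapfile sketch could skip the header with the 'Offset' option. The file name and contents below are stand-ins written only so the example runs on its own, and native byte order is assumed:

```matlab
% Build a stand-in file with the same 16-byte header followed by doubles
fname = [tempname '.rec'];
fid = fopen(fname, 'w');
fwrite(fid, [1 0], 'uint16');      % major/minor revision (2 + 2 bytes)
fwrite(fid, 12.5, 'double');       % start time (8 bytes)
fwrite(fid, 0, 'uint32');          % flags (4 bytes)
samples = (1:1000).';
fwrite(fid, samples, 'double');
fclose(fid);

% Map only the data region; data is paged in as m.Data is indexed
m = memmapfile(fname, 'Offset', 16, 'Format', 'double');
nSamples = numel(m.Data);

% Process in blocks without loading the whole file at once
blockLen = 256;
total = 0;
for first = 1:blockLen:nSamples
    last = min(first + blockLen - 1, nSamples);
    total = total + sum(m.Data(first:last));
end

clear m            % release the mapping before deleting the file
delete(fname);
```

The block loop is where the per-block processing (an fft, a running statistic, and so on) would go; the sum is only a placeholder.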

Why can't MATLAB save uint8 type matrix?

Here is the code:
x = rand(5)*100;
save('pqfile.txt','x','-ascii','-tabs')
The above works, but:
x = rand(5)*100;
x = uint8(x);
save('pqfile.txt','x','-ascii','-tabs')
says:
Warning: Attempt to write an unsupported data type to an ASCII file.
Variable 'x' not written to file.
Does anyone know why this happens? Why can't I save the data when it is uint8? I have to read the data into a VHDL testbench, so I was experimenting. I guess the only option is to write my 8-bit unsigned integer values from the 2-D array using fprintf and then read them into the testbench.
ASCII option
The save function is somewhat restrictive in what it supports with the -ascii option, and it writes your numbers in floating-point notation, which bloats the file when the values span a limited range as yours do (i.e. uint8, 0 to 255).
Check out dlmwrite as an alternative.
It takes the filename to write/save to, the variable to store, and some additional parameters, like the delimiter you want to separate your values with.
For your example, it looks like this
x = rand(5)*100;
x = uint8(x);
dlmwrite('pqfile.txt',x,'\t');
Binary option
If you are looking to store your uint8 data as single bytes, then you probably want to go with a custom binary file instead of ASCII. (Yes, you can convert uint8 to single ASCII characters, but you run into issues with some values being interpreted as your delimiters, i.e. newlines or tabs.)
fid=fopen('pqfile.dat','wb');
if(fid>2)
fwrite(fid,size(x),'uint8'); % Note: change the data type here if you are dealing with more than 255 rows or columns
fwrite(fid,x','uint8'); % Transpose x (with x') so it is stored in row order.
fclose(fid);
else
fprintf(1,'Could not open the file for writing.\n');
end
I'm not sure what type of parser you are using for your VHDL, but this will pack your data into a file with a short header of the expected dimensions followed by one long row of your serialized data.
To read it back in with MATLAB, you can do this:
fid = fopen('pqfile.dat','rb');
szX = fread(fid,2,'uint8');
x = fread(fid,[szX(2) szX(1)],'*uint8')'; % data was written transposed, so read cols-by-rows and transpose back
fclose(fid);
The transpose operations are necessary for MATLAB because it stores arrays in column-major order and therefore reads and writes data column-wise, whereas most other languages (in my experience) work row-wise.
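For a non-square matrix, the read has to use the swapped dimensions before transposing back. A small round-trip sketch, using a hypothetical 3-by-4 matrix and a temporary file name:

```matlab
x = uint8(magic(4));
x = x(1:3, :);                       % 3-by-4, deliberately non-square

fname = [tempname '.dat'];
fid = fopen(fname, 'w');
fwrite(fid, size(x), 'uint8');       % header: [rows cols]
fwrite(fid, x.', 'uint8');           % transpose so the bytes land in row order
fclose(fid);

fid = fopen(fname, 'r');
sz = fread(fid, [1 2], 'uint8');     % [rows cols], returned as doubles
y = fread(fid, [sz(2) sz(1)], '*uint8').';   % read cols-by-rows, then transpose back
fclose(fid);
delete(fname);
```

The '*uint8' precision on the second fread keeps the result as uint8 instead of converting it to double.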

reading a fortran binary file into matlab

in my fortran code i am outputting the results into a binary file.
open(21,file=anum('press',itime),form=format_mode)
write(21) rtime,itime,dt,nx0,ny0,nz,deltax,deltay,rlenz
write(21) rw
close(21)
the above is the fortran code that writes and saves the file.
i now want to open and analyse it in matlab:
fid = fopen('press.420000');
A = fread(fid);
fclose(fid);
This, however, only creates a 1-D array, which I am guessing includes all the header information too.
I want Matlab to read the header values but not include them in the final array. I intend to reshape the array into a 3-D array, as the data is from a CFD simulation with a grid of 256x512x390 = 51,118,080 points.
The Matlab code gives me a 1-D array of 411,343,976 elements, which cannot be correct.
Thus I am struggling with how to read the binary file. I need some guidance on how to code a Matlab script to read the binary file.
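You can read the data into a byte vector (note the '*uint8', which keeps the bytes as uint8 instead of converting them to double, as typecast needs the raw bytes):
bytevec = fread(fid, inf, '*uint8');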
Then you can look at and manually arrange elements by their indices, for example - single precision (float) data:
vec = typecast(bytevec(i1:i2), 'single');
And then convert it to default matlab double type without changing data values:
vec = cast(vec, 'double');
Finally, you can reshape raw vector to 3d matrix:
M = reshape(vec, [d1, d2, d3]);
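Note that Fortran unformatted sequential files usually wrap each WRITE in record markers: a byte count before and after the payload. With many compilers these markers are 4-byte integers, but that is compiler-dependent and an assumption here, as are the header type (double) and data type (single) in the stand-in file below. A sketch that splits such a file into records and then typecasts each one:

```matlab
% Write a stand-in file that mimics two Fortran records (assumed layout)
fname = [tempname '.bin'];
hdr = [0.5 420000 0.001];                 % hypothetical header values
rw  = single(reshape(1:24, [2 3 4]));     % hypothetical 3-D field
fid = fopen(fname, 'w');
fwrite(fid, 8*numel(hdr), 'int32'); fwrite(fid, hdr, 'double'); fwrite(fid, 8*numel(hdr), 'int32');
fwrite(fid, 4*numel(rw),  'int32'); fwrite(fid, rw,  'single'); fwrite(fid, 4*numel(rw),  'int32');
fclose(fid);

% Split the file into records using the leading/trailing byte counts
fid = fopen(fname, 'r');
records = {};
while true
    n = fread(fid, 1, 'int32');           % leading marker: payload size in bytes
    if isempty(n), break; end             % nothing left to read
    payload = fread(fid, n, '*uint8');    % raw bytes of this record
    tail = fread(fid, 1, 'int32');        % trailing marker should repeat the size
    assert(isequal(tail, n), 'Record markers do not match');
    records{end+1} = payload;
end
fclose(fid);
delete(fname);

% Interpret the raw bytes; the types must match what the Fortran code wrote
header = typecast(records{1}, 'double').';
M = reshape(double(typecast(records{2}, 'single')), [2 3 4]);
```

Fortran, like MATLAB, stores arrays in column-major order, so the reshape uses the dimensions in the same order as the Fortran declaration. If the markers do not match, the file may use a different marker size or byte order, or may not be a sequential unformatted file at all.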

Memory map file in MATLAB?

I have decided to use memmapfile because my data (typically 30Gb to 60Gb) is too big to fit in a computer's memory.
My data files consist of two columns of data that correspond to the outputs of two sensors, and I have them in both .bin and .txt formats.
m=memmapfile('G:\E-Stress Research\Data\2013-12-18\LD101_3\EPS/LD101_3.bin','format','int32')
m.data(1)
I used the above code to memory map my data to a variable "m" but I have no idea what data format to use ('int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64', 'single', and 'double').
In fact I tried all of the data formats listed that MATLAB supports, but when I used the m.data(index number) I never get a pair of numbers (2 columns of data) which is what I expected, also the number will be different depending on the format I used.
If anyone has experience with memmapfile please help me.
Here are some smaller versions of my data files so people can understand how my data is structured:
cheers
James
memmapfile is designed for reading binary files, that's why you are having trouble with your text file. The data in there is characters, so you'll have to read them as characters and then parse them into numbers. More on that below.
The binary file appears to contain more than just a stream of floating point values written in binary format. I see identifiers (strings) and other things in the file as well. Your only hope of reading that is to contact the manufacturer of the device that created the binary file and ask them about how to read in such files. There'll probably be an SDK, or at least a description of the format. You might want to look into this as the floating point numbers in your text file might be truncated, i.e., you have lost precision compared to directly reading the binary representation of the floats.
Ok, so how to read your file with memmapfile? This post provides some hints.
So first we open your file as 'uint8' (note there is no 'char' option, so as a workaround we read the content of the file into a datatype of the same size):
m = memmapfile('RTL5_57.txt','Format','uint8'); % uint8 is default, you could leave that off
We can render the data read in as uint8 as characters by casting it to char:
c = char(m.Data(1:57)).' % read the first three lines (3*19 chars). NB: transpose just for getting nice output, don't use it in your code
c =
0.398516 0.063440
0.399611 0.063284
0.398985 0.061253
As each line in your file has the same length (2*8 chars for the numbers, 1 tab and 2 chars for newline = 19 chars), we can read N lines from the file by reading N*19 values. So m.Data(1:19) gets you the first line, m.Data(20:38), the second line, and m.Data(20:57) the second and third lines. Read as much as you want at once.
Then we'll have to parse the read-in data into floating point numbers:
f = sscanf(c,'%f')
f =
0.3985
0.0634
0.3996
0.0633
0.3990
0.0613
All that's left now is to reshape them into your two column format
d = reshape(f,2,[]).'
d =
0.3985 0.0634
0.3996 0.0633
0.3990 0.0613
Easier ways than using memmapfile:
You don't need to use memmapfile to solve your problem, and I think it makes things more complicated. You can simply use fopen followed by fread:
fid = fopen('RTL5_57.txt');
c = fread(fid,Nlines*19,'*char');
% now sscanf and reshape as above
% NB: one can read the values from the text file directly with f = fscanf(fid,'%f',Nlines*2).
% However, in testing, I have found calling fread followed by sscanf to be faster
% which will make a significant difference when reading such large files.
Using this you can read Nlines pairs of values at a time, process them, and simply call fread again to read the next Nlines. fread remembers where it is in the file (as does fscanf), so simply use the same call to get the next lines. It's thus easy to write a loop to process the whole file, testing with feof(fid) whether you are at the end of the file.
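Such a loop could look like the sketch below. It assumes the fixed 19-character line layout described above and generates a small stand-in file with hypothetical values. Since feof only becomes true after a read has actually hit the end of the file, the sketch stops on an empty read instead:

```matlab
% Create a small stand-in file with the same fixed-width layout
vals = [0.398516 0.063440; 0.399611 0.063284; 0.398985 0.061253; ...
        0.400001 0.062000; 0.401234 0.060000];
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%8.6f\t%8.6f\r\n', vals.');   % 8 + 1 + 8 + 2 = 19 chars per line
fclose(fid);

Nlines = 2;            % lines per chunk (tiny here; use thousands for real files)
bytesPerLine = 19;
total = zeros(1, 2);
nRows = 0;

fid = fopen(fname, 'r');
while true
    c = fread(fid, Nlines*bytesPerLine, '*char');
    if isempty(c), break; end               % nothing left to read
    d = reshape(sscanf(c.', '%f'), 2, []).';  % one row per line, two columns
    % process d; here we just accumulate column sums
    total = total + sum(d, 1);
    nRows = nRows + size(d, 1);
end
fclose(fid);
delete(fname);

columnMeans = total / nRows;
```

The last chunk may hold fewer than Nlines lines; the reshape handles that because it infers the number of rows from the values actually parsed.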
An even easier way is suggested here: use textscan. To slightly adapt their example code:
Nlines = 10000;
% describe the format of the data
% for more information, see the textscan reference page
format = '%f\t%f';
fid = fopen('RTL5_57.txt');
while ~feof(fid)
C = textscan(fid, format, Nlines, 'CollectOutput', true);
d = C{1}; % immediately clear C at this point if you need the memory!
% process d
end
fclose(fid);
Note again, however, that fread followed by sscanf will be fastest. The drawback is that the fread method breaks as soon as one line in the text file doesn't exactly match the fixed 19-character layout, whereas textscan is forgiving of whitespace changes and thus more robust.

What is the best way to store a 16 × (2^20) matrix in MATLAB?

I am thinking of writing the data to a file. Does anyone have an example of how to write a big amount of data to a file?
Edit: Most elements in the matrix are zeroes, others are uint32. I guess the simplest save() and load() would work, as @Jonas suggested.
I guess nobody's seen the edit about the zeroes :)
If they're mostly zeroes, you should convert your matrix to its sparse representation and then save it. You can do that with the sparse function.
Code
z = zeros(10000,10000);
z(123,456) = 1;
whos z
z = sparse(z);
whos z
Output
Name Size Bytes Class Attributes
z 10000x10000 800000000 double
Name Size Bytes Class Attributes
z 10000x10000 40016 double sparse
I don't think the sparse implementation is designed to handle uint32.
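MATLAB's sparse type holds only double or logical values, so one workaround (a sketch, and an assumption about what fits your use case) is to store the uint32 values as doubles in a sparse matrix, save that, and cast back after loading. Every uint32 value is exactly representable as a double, so the round trip is lossless. The matrix contents and the reduced column count below are hypothetical:

```matlab
% Mostly-zero uint32 matrix (use 2^20 columns for the real thing)
nCols = 2^12;
A = zeros(16, nCols, 'uint32');
A(3, 1234) = 4000000000;
A(16, nCols) = 7;

% sparse() accepts double or logical input, so convert first
S = sparse(double(A));
fname = [tempname '.mat'];
save(fname, 'S');

% Load and restore the original class
L = load(fname);
B = uint32(full(L.S));
delete(fname);
```

Only the non-zero entries and their indices are stored in S, so the saved file stays small no matter how many columns of zeroes there are.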
If you're concerned with keeping the size of the data file as small as possible, here are some suggestions:
Write the data to a binary file (i.e. using FWRITE) instead of to a text file (i.e. using FPRINTF).
If your data contains all integer values, convert it to or save it as a signed or unsigned integer type instead of the default double precision type MATLAB uses.
If your data contains floating point values, but you don't need the range or resolution of the default double precision type, convert it to or save it as a single precision type.
If your data is sufficiently sparse (i.e. there are many more zeroes than non-zeroes in your matrix), then you can use the FIND function to get the row and column indices of the non-zero values, then just save these to your file.
Here are a couple of examples to illustrate:
data = double(rand(16,2^20) <= 0.00001); %# A large but very sparse matrix
%# Writing the values as type double:
fid = fopen('data_double.dat','w'); %# Open the file
fwrite(fid,size(data),'uint32'); %# Write the matrix size (2 values)
fwrite(fid,data,'double'); %# Write the data as type double
fclose(fid); %# Close the file
%# Writing the values as type uint8:
fid = fopen('data_uint8.dat','w'); %# Open the file
fwrite(fid,size(data),'uint32'); %# Write the matrix size (2 values)
fwrite(fid,data,'uint8'); %# Write the data as type uint8
fclose(fid); %# Close the file
%# Writing out only the non-zero values:
[rowIndex,columnIndex,values] = find(data); %# Get the row and column indices
%# and the non-zero values
fid = fopen('data_sparse.dat','w'); %# Open the file
fwrite(fid,numel(values),'uint32'); %# Write the length of the vectors (1 value)
fwrite(fid,rowIndex,'uint32'); %# Write the row indices
fwrite(fid,columnIndex,'uint32'); %# Write the column indices
fwrite(fid,values,'uint8'); %# Write the non-zero values
fclose(fid); %# Close the file
The files created above will differ drastically in size. The file 'data_double.dat' will be about 131,073 KB, 'data_uint8.dat' will be about 16,385 KB, and 'data_sparse.dat' will be less than 2 KB.
Note that I also wrote the data/vector sizes to the files so that the data can be read back in (using FREAD) and reshaped properly. Note also that if I did not supply a 'double' or 'uint8' argument to FWRITE, it would fall back to its default precision of 'uint8' and write only 8 bits per value, which happens to be enough here since the values are all 0 and 1.
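Reading 'data_sparse.dat' back is the reverse: read the count, then the three vectors, and rebuild the matrix. The sparse file above stores only the vector length, not the matrix size, so this sketch assumes the 16-by-2^20 size is known. The file is regenerated here under a temporary name so the example runs on its own:

```matlab
nRows = 16; nCols = 2^20;                      % assumed known; not stored in the file
data = double(rand(nRows, nCols) <= 0.00001);  % large but very sparse matrix

% Write in the same layout as the sparse example
[rowIndex, columnIndex, values] = find(data);
fname = [tempname '.dat'];
fid = fopen(fname, 'w');
fwrite(fid, numel(values), 'uint32');
fwrite(fid, rowIndex, 'uint32');
fwrite(fid, columnIndex, 'uint32');
fwrite(fid, values, 'uint8');
fclose(fid);

% Read it back in the same order it was written
fid = fopen(fname, 'r');
n = fread(fid, 1, 'uint32');       % number of non-zero entries
r = fread(fid, n, 'uint32');       % row indices
c = fread(fid, n, 'uint32');       % column indices
v = fread(fid, n, 'uint8');        % non-zero values
fclose(fid);
delete(fname);

% Rebuild as a sparse matrix; full(restored) would give the dense form
restored = sparse(r, c, v, nRows, nCols);
```

Writing size(data) as two extra uint32 values at the start of the file would remove the need to know the dimensions in advance.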
How is the data generated? How do you need to access the data?
If I calculate correctly, the variable is less than 200MB if it's all double. Thus, you can easily save and load it as a single .mat file if you need to access it from Matlab only.
%# create data
data = zeros(16,2^20);
%# save data
save('myFile.mat','data');
%# clear data to test everything works
clear data
%# load data
load('myFile.mat')