Using Matlab to read binary data from file - matlab

I'm currently trying to read data from a .surf file using Matlab. (I realise there will probably be quite a lot of other questions similar to this, but each is specific to its own problem, and I wasn't able to find a duplicate of what I'm about to ask; if someone else does and marks this as duplicate... apologies!)
This is what the readme for the data files says:
All of the files consist of 1-byte values, except for the
surface (surf) files, which consist of 4-byte floating-point
values. The 1-byte phase values denote the phase scaled and
quantized to the range 0-255. The 1-byte correlation and
mask values denote the weights 0-1 scaled and quantized to
the range 0-255.
I have been able to read a .phase file without any problems using
fn = 'spiral.257x257.phase';
f = fopen(fn);
fin = fread(f, [257, 257], 'uint8');
fclose(f);
However, I'm unable to read the surf file. I have tried single, float32 and real*4, the three options under 'Floating-point numbers' in the Matlab documentation for fread, as well as uint, uint32, int, int32 and long, which are the other options for 4-byte data. None of these give the correct solution at all - not even just the wrong scaling; they're completely off.
I'm pretty stuck for ideas; any advice (including general advice) would be most appreciated.
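One thing worth checking when every 4-byte type gives nonsense is byte order. fread uses the machine's native ordering unless told otherwise, and the readme quoted above does not say which ordering the files use. The sketch below is only a test of that idea, not a confirmed diagnosis: it writes a tiny big-endian float32 file (the name demo_be.surf is made up) and reads it back with both orderings.

```matlab
% Hypothetical demo file, not part of the original data set.
fn = fullfile(tempdir, 'demo_be.surf');
vals = single([1.5 -2.25; 1000 0.125]);

% Write the values as big-endian 4-byte floats.
f = fopen(fn, 'w', 'ieee-be');
fwrite(f, vals, 'float32');
fclose(f);

% Reading with the wrong byte order gives values that are completely off.
f = fopen(fn, 'r', 'ieee-le');
wrong = fread(f, [2, 2], 'float32=>single');
fclose(f);

% Reading with the matching byte order recovers the data.
f = fopen(fn, 'r', 'ieee-be');
right = fread(f, [2, 2], 'float32=>single');
fclose(f);
delete(fn);

disp(isequal(right, vals));
disp(isequal(wrong, vals));
```

If the surf file does turn out to be big-endian, the same third argument to fopen, followed by fread(f, [257, 257], 'float32'), would be the thing to try on the real data.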

Related

Why can't MATLAB save uint8 type matrix?

Here is the code:
x = rand(5)*100;
save('pqfile.txt','x','-ascii','-tabs')
The above works, but:
x = rand(5)*100;
x = uint8(x);
save('pqfile.txt','x','-ascii','-tabs')
says:
Warning: Attempt to write an unsupported data type to an ASCII file.
Variable 'x' not written to file.
Does anyone know why this happens? Why can't I save the data when it is uint8? I have to read the data into a VHDL testbench, so I was experimenting. I guess the only option is to write my 8-bit unsigned integer values from the 2D array using fprintf and then read them into the testbench.
ASCII option
The save function is somewhat restrictive in what it can support with -ascii, and it writes your numbers in floating-point notation, which bloats the file when the values have a limited range like yours (uint8, 0 to 255).
Check out dlmwrite as an alternative.
It takes the filename to write/save to, the variable to store, and some additional parameters, like the delimiter you want to separate your values with.
For your example, it looks like this
x = rand(5)*100;
x = uint8(x);
dlmwrite('pqfile.txt',x,'\t');
Binary option
If you are looking to store your uint8 data as single bytes, then you probably want to go with a custom binary file instead of ASCII. (Yes, you can convert uint8 values to single ASCII characters, but you run into problems when a value coincides with one of your delimiters, such as a newline or a tab.)
fid = fopen('pqfile.dat','wb');
if (fid > 2)
    fwrite(fid,size(x),'uint8'); % Note: change the data type here if you are dealing with more than 255 rows or columns
    fwrite(fid,x','uint8'); % Transpose x (with x') so it is stored in row order.
    fclose(fid);
else
    fprintf(1,'Could not open the file for writing.\n');
end
I'm not sure what type of parser you are using for your VHDL, but this will pack your data into a file with a short header of the expected dimensions followed by one long row of your serialized data.
To read it back in with MATLAB, you can do this:
fid = fopen('pqfile.dat','rb');
szX = fread(fid,[1,2],'uint8'); % header: [rows, cols]
x = fread(fid,fliplr(szX),'*uint8')'; % data is in row order: read it as cols-by-rows, then transpose back
fclose(fid);
The transpose operations are necessary because MATLAB stores and reads matrices column-wise, whereas most other languages (in my experience) work row-wise.
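A quick way to check that the header and the transposes agree is to round-trip a non-square matrix, since with a square one a mix-up of rows and columns goes unnoticed. The following is a minimal sketch of that check; the file name pq_demo.dat is made up, and the data is read with the two dimensions swapped before being transposed back.

```matlab
% Round-trip a non-square uint8 matrix through a small binary file.
x = uint8(magic(4));
x = x(1:3, :);                          % 3-by-4 test matrix
fn = fullfile(tempdir, 'pq_demo.dat');  % hypothetical file name

fid = fopen(fn, 'w');
fwrite(fid, size(x), 'uint8');          % header: [rows cols]
fwrite(fid, x', 'uint8');               % x' in column order is x in row order
fclose(fid);

fid = fopen(fn, 'r');
sz = fread(fid, [1, 2], 'uint8');       % [rows cols] as a row vector
y = fread(fid, fliplr(sz), '*uint8')';  % read as cols-by-rows, then transpose
fclose(fid);
delete(fn);

disp(isequal(x, y));
```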

marshalling arbitrary-length binary data from udp() object

I am reading binary data from instrumentation using the Matlab udp() object.
I am surprised by the apparent lack of support for reading arbitrary length data types. How does one read a 24-bit integer? Or a 24-bit float? These are not that strange in instrumentation, and I have found only 8/16/32/64 data types in the documentation.
Have you tried help fread? The documentation shows it supports reading up to 64 bits at a time using bitN, where N is a value between 1 and 64.
fid = udp(<your parameters here>); % use fopen to open the stream.
...
A = fread(fid,1,'bit24=>int32'); % stream 24 bits to a 32 bit integer.
B = fread(fid,1,'ubit24=>uint32'); % stream 24 bits to a 32 bit unsigned integer.
Since floating-point specs vary, this may or may not work for your situation:
C = fread(fid,1,'bit24=>float32'); % transcode 24bits to 32 bit float (MATLAB spec)
UPDATE
Seeing that the udp/fread implementation does not support this casting there are a couple, not-so-pretty, workarounds you can try.
Read in the uchar data in multiples of three and then weight each byte by its place value directly. For example:
% First determine the number of bytes on the stream and make sure you
% have at least 3 bytes to read so you can calculate thirdOfBytesExpected.
% fread fills its output column by column, so read a 3xM block and
% transpose it to get one 3-byte sample per row.
[raw, packetCount] = fread(fid,[3,thirdOfBytesExpected]);
anMx3result = raw';
unsigned24bitInt = anMx3result*(2.^(0:8:16))'; % least significant byte first
To be precise, unsigned24bitInt is actually stored as a MATLAB double here. So if you need to write it elsewhere, you will need to bring it back to the individual uchar values it came from.
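The byte-weighting step can be tried offline on a plain vector before involving the udp object. This is a minimal sketch under two assumptions that the original does not confirm: that the instrument sends the least significant byte first, and that signed values use two's complement. The variable names are illustrative.

```matlab
% Six bytes standing in for two 24-bit samples, least significant byte first
% (the byte order is an assumption; check the instrument's documentation).
bytes = uint8([1 0 0, 255 255 255]);

% One sample per row: reshape fills column-wise, so build 3-by-N and transpose.
triples = double(reshape(bytes, 3, []))';

% Weight the three bytes by 2^0, 2^8 and 2^16.
u24 = triples * (2 .^ (0:8:16))';

% Two's-complement conversion for signed 24-bit data.
s24 = u24;
s24(s24 >= 2^23) = s24(s24 >= 2^23) - 2^24;

disp([u24, s24]);
```

For most-significant-byte-first data, the weights would simply be reversed to 2.^(16:-8:0).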
The not so pretty option is to eat the overhead of streaming the data back to a binary file format as an interim step so that you can then use the base fread method mentioned above. Not an ideal solution, but perhaps worth considering if you just need something to work.
% your original code for opening the udp handle
....
tmpFid = fopen('tmp.bin','w+'); % 'w+' opens for both writing and reading
[ucharVec, bytesRead] = fread(udpFid,bytesExpected,'uchar=>uchar');
bytesWritten = fwrite(tmpFid,ucharVec,'uchar');
% Do some quality control on bytes read vs written ...
fseek(tmpFid,-bytesWritten,'cof');
% in theory you should be able to just go to the beginning
% of the file each time like this fseek(tmpFid, 0, 'bof');
% provided you also reset to the beginning prior writing or after reading
% Read in the data as described originally
num24BitIntsToRead = bytesWritten/3;
A = fread(tmpFid,num24BitIntsToRead,'bit24=>int32');

Quantization of .wav file

I am attempting to quantize a 16 bit .wav file to a lower bit rate using Matlab. I've opened the file using wavread() but I am unsure of how to proceed from here. I know that somehow I need to "round" each sample value to (for example) a 7 bit number. Here's the code that's reading the file:
[file,rate,bits] = wavread('smb.wav');
file is a 1 column matrix containing the values of each sample. I can loop through each item in that matrix like so:
for i=1 : length(file)
% not sure what to put here..
end
Could you point me in the right direction to quantize the data?
If you have int16 data, varying from -32768 to +32767, it can be as simple as
new_data = int8(old_data./2^8);
That won't even require a for loop.
For scaled doubles it would be
new_data = int8(old_data.*2^7);
The wavread documentation suggests that you might even be able to retrieve the data as integers to begin with, by asking for the file's native format (int16 for a 16-bit file):
[file,rate,bits] = wavread('smb.wav','native');
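For a word length that is not a multiple of 8, such as the 7 bits mentioned in the question, the same idea can be written with an explicit scale factor. The following is a minimal sketch, assuming the samples are doubles scaled to [-1, 1), which is what wavread returns by default; the short vector x stands in for the audio data and the variable names are illustrative.

```matlab
nBits = 7;                               % target word length
x = [-1, -0.5, 0, 0.3, 0.999];           % stand-in for samples in [-1, 1)

levels = 2^(nBits - 1);                  % 64 steps on each side of zero
q = round(x * levels);                   % integer codes
q = min(max(q, -levels), levels - 1);    % clamp to -64 .. 63
xq = q / levels;                         % back to the [-1, 1) scale for playback

disp(q);
```

Here q holds the 7-bit integer codes, while xq keeps the original scale so that it can be played back and compared with the unquantized signal.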
EDIT: Changing the bit rate:
After rereading the question, I realize that you also mentioned a lower bit rate, which may mean reducing the sample rate rather than the quantization of the data. If that is the case, you should look at the documentation for downsample, decimate, and/or resample. They are all Signal Processing Toolbox functions that change the sample rate.
downsample(file,2)
would halve the sample rate, and with it the bit rate, for example.

basic - Trying to add noise to an audio file and trying to reduce errors using basic coding such as a repetition code

We were recently taught the concepts of error control coding: basic codes such as the Hamming code, the repetition code, etc.
I thought of trying out these concepts in MATLAB. My goal was to compare how an audio file plays when corrupted by noise and in the case when the file is protected by basic codes and then corrupted by noise.
So I opened a small audio clip of 20-30 seconds in MATLAB using audioread function. I used 16 bit encoded PCM wave file.
If opened in 'native' format, the data is int16; otherwise it opens as double.
I then added two types of noise to it: AWGN (using the double format) and binary symmetric channel noise (by converting the int16 to uint16 and then converting that to binary using the dec2bin function). Converting back to the original int16 format does add a lot of noise to it.
Now my goal is to try out a basic repetition code. So I converted the 2-D audio matrix, which consists of binary data, into a 3-D matrix by adding redundancy. I used the following command:
cat(3,x,x,x,x,x) ;
It created a 3-D matrix such that it had 5 versions of x along the 3rd dimension.
Now I wish to add noise to it using bsc function.
Then I wish to do the decoding of the redundant data by removing the repetition bits using a mode() function on the vector which contains the redundant bits.
My whole problem in this task is that MATLAB is taking too long to do the computation. I guess a 30-second file creates quite a big matrix, so maybe that is why it's taking so long. Moreover, I suspect that what I am doing is not the most efficient way to do it with regard to the various data types.
Can you suggest a way in which I may improve the computation times? Are there functions that can help do this basic task in a better way?
Thanks.
(This is my first post on this site about MATLAB, so bear with me if the posting format is not up to the mark.)
Edit - posting the code here :-
[x,Fs] = audioread('sample.wav','native'); % 'native' loads it as int16; Fs of the sample is 44 kHz; size of x is 1796365x1
x1 = x - min(x); % to make all values non-negative
s = dec2bin(x1); % s is a 1796365x15 char matrix, one binary string of length 15 per sample. The BSC channel needs double as the data type
s1 = double(s) - 48; % to get 0s and 1s in double format
%% Now I wish to compare how noise affects s1 itself versus a matrix which is error control coded.
s2 = bsc(s1,0.15); % this adds errors with probability 0.15
s3 = cat(3,s1,s1,s1,s1,s1); % the goal here is to add repetition redundancy. I will try out other, more efficient codes such as the Hamming code later.
s4 = bsc(s3,0.15); % this step is taking forever and my PC is unresponsive because of it.
s5 = mode(s4(,,:)) ; % I wish to know if this is proper syntax; what I want to do is calculate the mode along the 3rd dimension to remove the redundancy and thereby reduce errors.
%% I will show what I did after s1 was corrupted by bsc errors in s2:
d = char(s2 + 48);
d1 = bin2dec(d) + min(x);
sound(d1,Fs); % this plays the noisy file. I wish to do the same with the error control coded matrix, but as I said in a previous step it is highly unresponsive.
I suppose what is mostly wrong with my task is that I took a large sampling rate and hence the vector was very big.
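The repetition scheme described above can be tried on a small stand-in matrix first, which makes it easier to see where the time goes. The following is a minimal sketch and not a drop-in replacement: it models the binary symmetric channel with rand and xor instead of the bsc function (an assumption that each bit flips independently with probability p), keeps the bits as logical arrays rather than doubles to save memory, and decodes by majority vote. With 0/1 data and an odd repetition factor, the vote gives the same result as mode(s4, 3), which is the syntax for taking the mode along the third dimension.

```matlab
rng(0);                                  % fixed seed so the run is repeatable
p = 0.15;                                % crossover probability
nRep = 5;                                % repetition factor
bits = rand(2000, 16) < 0.5;             % stand-in for the bit matrix

% Uncoded: flip each bit independently with probability p.
rxPlain = xor(bits, rand(size(bits)) < p);

% Coded: repeat along the 3rd dimension, add noise, majority-vote.
tx = repmat(bits, [1, 1, nRep]);
rx = xor(tx, rand(size(tx)) < p);
decoded = sum(rx, 3) > nRep / 2;

berPlain = mean(rxPlain(:) ~= bits(:));
berCoded = mean(decoded(:) ~= bits(:));
fprintf('BER uncoded %.4f, coded %.4f\n', berPlain, berCoded);
```

For a full-length clip, processing the samples in blocks of rows would keep the 3-D array from growing too large.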

Save 4D matrix to a file with high precision (%1.40f) in Matlab

I need to write a 4D matrix M (16x101x101x6) to a file with high precision ('precision', '%1.40f') in MATLAB.
I've found save('filename.mat', 'M' ); for multidimensional matrices, but the precision cannot be set (only -double). On the other hand, I've found dlmwrite('filename.txt', M, 'delimiter', '\t', 'precision', '%1.40f'); to set the precision, but it is limited to 2-D arrays.
Can somebody suggest a way to tackle my problem?
What is the point in storing 40 digits of fractional part if double precision number in MATLAB keeps only 16 of them?
Try this code:
t=pi
whos
fprintf('%1.40f\n',t)
The output is
Name Size Bytes Class Attributes
t 1x1 8 double
3.1415926535897931000000000000000000000000
The command save('filename.mat', 'M' ); will store the numbers in their binary representation (8 bytes per double-precision number). This is unbeatable in terms of space compared with a plain-text representation.
As for the 4D shape the way j_kubik suggested seems simple enough.
I always thought that save will store exactly the same numbers you already have, with the precision that is already used to store them in matlab - you are not losing anything. The only problems might be disk space consumption (too precise numbers?) and closed format of .mat files (cannot be read by outside programs). If I wanted to just store the data and read them with matlab later on, I would definitely go with save.
save can also print ascii data, but it is (as dlmwrite) limited to 2D arrays, so using dlmwrite will be better in your case.
Another solution:
tmpM = [size(M), 0, reshape(M, 1, [])]; % row vector: dimensions, a 0 separator, then the data
dlmwrite('filename.txt', tmpM, 'delimiter', '\t', 'precision', '%1.40f');
reading will be a bit more difficult, but only a bit ;)
Alternatively, you can write your own function that writes to a file using fopen and fprintf (just as dlmwrite does); there you can control every aspect of your file format, including precision.
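The fopen and fprintf route can be sketched as follows. This is only an illustration with a made-up file name and layout (dimensions on the first line, then one value per line in MATLAB's column order). It uses '%.17g' instead of '%1.40f' because 17 significant digits are enough to identify a double uniquely.

```matlab
M = rand(2, 3, 4, 5);                         % small stand-in for the 4D array
fn = fullfile(tempdir, 'm4d_demo.txt');       % hypothetical file name

fid = fopen(fn, 'w');
fprintf(fid, '%d ', size(M));                 % first line: the dimensions
fprintf(fid, '\n');
fprintf(fid, '%.17g\n', M);                   % one value per line, column-major
fclose(fid);

fid = fopen(fn, 'r');
dims = sscanf(fgetl(fid), '%d')';             % dimensions as a row vector
vals = fscanf(fid, '%f');                     % all remaining values
fclose(fid);
delete(fn);

M2 = reshape(vals, dims);
disp(max(abs(M(:) - M2(:))));                 % largest round-trip difference
```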
Something I would have done if I really cared about precision, file-size and execution time (this is probably not the way for you) would be to write a mex function that takes a matrix parameter and stores it in a binary file by just copying raw data buffer from matlab. It would also need some indication of array dimensions, and would probably be the quickest (not sure if save doesn't already do something similar).