Resampling, scaling, and truncating a MATLAB data file

I need to apply the following processing steps to the data file below in MATLAB. Can you help me with how to do that?
Data File = 2x460800
Conditions:
1.) The scaling defined in the PhysioNet .info files for each PhysioNet data file was applied to each raw data file.
2.) The data were resampled to a common rate of 128 hertz.
3.) Each data file consisting of two ECG recordings was separated into two separate data records.
4.) The data length was truncated to a common length of 65536 samples.
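
The four steps above can be sketched in a few lines. This is only an illustration in Python, not a MATLAB answer: `base`, `gain` and `fs_in` are placeholders that would have to be read from the matching PhysioNet .info file (which describes the conversion as subtracting the base and dividing by the gain), a single base/gain pair is assumed for both channels, and the resampler is plain linear interpolation with no anti-aliasing filter, so a proper polyphase resampler would be preferable for real data.

```python
# Sketch of the four steps. base, gain and fs_in are hypothetical
# placeholders to be taken from the matching PhysioNet .info file.

def scale(raw, base, gain):
    """Convert raw units to physical units: (raw - base) / gain."""
    return [(x - base) / gain for x in raw]

def resample_linear(x, fs_in, fs_out):
    """Resample by linear interpolation (no anti-aliasing filter)."""
    n_out = int(len(x) * fs_out / fs_in)
    out = []
    for i in range(n_out):
        t = i * fs_in / fs_out              # position in input-sample units
        k = int(t)
        frac = t - k
        nxt = x[min(k + 1, len(x) - 1)]     # clamp at the last sample
        out.append(x[k] * (1 - frac) + nxt * frac)
    return out

def process(record, base, gain, fs_in, fs_out=128, length=65536):
    """record: list of two channels -> two separate, truncated records."""
    result = []
    for channel in record:
        y = resample_linear(scale(channel, base, gain), fs_in, fs_out)
        result.append(y[:length])           # truncate to the common length
    return result
```

Each element of the returned list is one of the two separated ECG records, already scaled, resampled and truncated.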

Related

How can I save audio wav file without data clipping?

I am using MATLAB to extract the silent parts from audio WAV files. After extracting the silent parts, I want to save the new audio as a WAV file.
For this, I use the 'audiowrite' function. However, MATLAB gives this warning:
Warning: Data clipped when writing file.
I tried setting 'BitsPerSample' to 32 (single-precision format), and the warning goes away. That saves the audio files as 32-bit, but the WAV files should be 16-bit.
How can I fix this problem?
audiowrite(filename,y,fs,'BitsPerSample',32);
Note: I also normalized the data and the problem is the same.
Thanks for your help!
UPDATE:
I want to normalize the audio samples to mean 0 and standard deviation (or variance) 1, so I use z-score normalization.
The y/max(abs(y)) method normalizes the data to between -1 and 1, but then the mean and variance are not 0 and 1 respectively. The two techniques normalize the data in different ways.
My actual question is: how can I save z-score-normalized samples without data clipping?
MATLAB's audiowrite uses different normalizations for different data types. So if you want a 16-bit WAV file, you should normalize your data to the [-32768, 32767] range and convert it to int16:
% Scale in double precision: an int16 * double product saturates before the division
y_normalized = double(intmax('int16')) * y / (max(abs(y(:))) * 1.001);
audiowrite(filename, int16(y_normalized), fs)
Similarly, for floating-point data you should normalize to the [-1,+1] range:
y_normalized = y / max(abs(y(:)));
audiowrite(filename, y_normalized, fs)
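
The same peak normalization and 16-bit quantization can be sketched outside MATLAB. The Python below is a minimal illustration using only the standard library; `to_int16` and `write_wav16` are hypothetical helper names, and mono audio is assumed.

```python
import struct
import wave

def to_int16(samples):
    """Peak-normalize floats and quantize to the signed 16-bit range."""
    peak = max(abs(s) for s in samples) or 1.0   # avoid dividing by zero
    return [round(s / peak * 32767) for s in samples]

def write_wav16(path, samples, fs):
    """Write mono 16-bit PCM; path and fs are supplied by the caller."""
    pcm = to_int16(samples)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)            # 2 bytes = 16 bits per sample
        w.setframerate(fs)
        w.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
```

Dividing by the peak is a pure rescaling, so z-scored data keeps its zero mean and its shape but no longer has unit variance in the file. Standardizing again after reading the file back recovers the z-scores up to quantization error.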

How to get the dataset size of a Caffe net in python?

I looked at the Python example for LeNet and saw that the number of iterations needed to run over the entire MNIST test dataset is hard-coded. Can this value be determined instead of being hard-coded? How can I get the number of samples in the dataset that a network points to in Python?
You can use the lmdb library to access the LMDB directly:
import lmdb
db = lmdb.open('/path/to/lmdb_folder')  # requires the lmdb Python package
num_examples = int(db.stat()['entries'])
That should do the trick for you.
It seems that you mixed up iterations and number of samples in one question. In the provided example we can see only the number of iterations, i.e. how many times the training phase will be repeated. There is no direct relationship between the number of iterations (a network training parameter) and the number of samples in the dataset (the network input).
Some more detailed explanation:
EDIT: Caffe will load (batch size x iterations) samples in total for training or testing, but this has no relation to the actual database size: after reaching the last record it starts reading from the beginning again. In other words, the database in Caffe acts like a circular buffer.
The mentioned example points to this configuration. We can see that it expects lmdb input, and sets the batch size to 64 (some more info about batches and BLOBs) for the training phase and 100 for the testing phase. We really don't make any assumption about the input dataset size, i.e. the number of samples in the dataset: batch size is only the processing chunk size, and iterations is how many batches Caffe will take. It won't stop after reaching the end of the database.
In other words, the network itself (i.e. the protobuf config files) doesn't specify the number of samples in the database, only the dataset name and format and the desired number of samples. As far as I know, there is currently no way to determine the database size with Caffe.
Thus, if you want to load the entire dataset for testing, your only option is to first determine the number of samples in mnist_test_lmdb or mnist_train_lmdb manually, and then specify corresponding values for batch size and iterations.
You have some options for this:
Look at the ./examples/mnist/create_mnist.sh console output: it prints the number of samples while converting from the initial format (I believe that you followed this tutorial);
Follow @Shai's advice (read the lmdb file directly).
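
Once the entry count is known, the iteration count follows from the batch size. The sketch below combines the two; `count_lmdb_entries` and `iterations_for_full_pass` are hypothetical helper names, and the first one needs the third-party lmdb package at call time.

```python
import math

def count_lmdb_entries(path):
    """Return the number of records in an LMDB (needs the lmdb package)."""
    import lmdb                      # imported lazily; third-party package
    env = lmdb.open(path, readonly=True)
    try:
        return int(env.stat()["entries"])
    finally:
        env.close()

def iterations_for_full_pass(num_examples, batch_size):
    """Smallest iteration count whose batches cover every example once."""
    return math.ceil(num_examples / batch_size)
```

With the test batch size of 100 from the example and 10000 test entries, this gives 100 iterations. When the count is not a multiple of the batch size, the last batch wraps around to the start of the database because of the circular-buffer behaviour described above, so a few samples are seen twice.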

Partition a large scale HDF5 dataset into sub-files

I have a pretty large HDF5 dataset of size [1 12672 1 228020], in the format [height width channel N]. The file occupies about 22G on disk.
I want to partition this file into smaller parts, say 2G files.
I have tried h5repart, but it does not work well, because I'm not able to display the partitioned files in MATLAB using h5disp('...').
One solution would be for you to use the 'chunk' capability of the HDF5 format.
Using the MATLAB low-level HDF5 functions you should be able to read the chunks you require.
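
Whichever reader is used, the first step is deciding which index ranges along N go into each part. A minimal sketch of that arithmetic follows; `partition_ranges` is a hypothetical helper, and the 8-byte element size is an assumption inferred from the file size, not something stated in the question.

```python
def partition_ranges(n_total, bytes_per_sample, max_bytes):
    """Split [0, n_total) into (start, stop) ranges of at most max_bytes."""
    per_part = max(1, max_bytes // bytes_per_sample)
    return [(start, min(start + per_part, n_total))
            for start in range(0, n_total, per_part)]

# Hypothetical numbers: 12672 values per sample stored as 8-byte doubles,
# with a 2 GiB limit per output file.
ranges = partition_ranges(228020, 12672 * 8, 2 * 1024**3)
```

Each (start, stop) pair can then be read as one partial read along the last dimension and written to its own file. Note that MATLAB indices are 1-based, so the start values need shifting by one there.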

get integer representation of .SPH audio files

I am trying to train a neural network using audio files that are originally in .SPH format. I need integers that represent the amplitude of the sound waves for the neural net, so I used sox to convert the files to .wav format by calling sox infile.SPH outfile.wav remix 1-2 (remix to convert 2 channels into 1), and then tried to use
[y, Fs, nbits, opts] = wavread('outfile.wav') in MATLAB to get the integer representation.
However, MATLAB threw: Data compression format (CCITT mu-law) is not supported.
So I used sox infile.SPH -b 16 -e signed-integer -c 1 outfile.wav
which I think puts the WAV file in a linear format instead of mu-law. But now MATLAB threw another error: Invalid Wave File. Reason: Cannot open file.
My audio files are 8000 Hz u-law, single or dual channel, and all 8-bit, I think (8-bit for single channel for sure).
Is there a way to get the integer representation out of the audio files using MATLAB or any other program? Either u-law or linear is fine, unless one would be better for neural net training. Preferably 8-bit, since the source files are 8-bit.
I don't really understand .SPH. For the uncompressed ones (ignoring headers), do the files store amplitudes (I guess they have to somehow)? Can I extract numbers out of those files directly without bothering with WAV? Are the signals stored sequentially, such that it would make sense to split the audio files?
I am new to audio processing in general, so any pointers would be appreciated!
You need to clearly identify the main task: feeding the neural net with vectors or matrices. So the first step is to work on the audio files (without MATLAB!) in order to get WAV files. The second step is setting up and training the neural net with MATLAB.
I would try to decompress 'sph' files, then convert them into 'wav' (for example see the instructions here and here).
Finally, using sox in a command/terminal window is better than using it in the matlab console.
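
For uncompressed u-law files, the integers can also be pulled out directly. The Python sketch below assumes a NIST SPHERE file whose header starts with "NIST_1A" followed by the header size on the second line, and whose payload is plain 8-bit mu-law; it does not handle shorten-compressed files, which need decompressing first. `ulaw_to_linear` follows the standard G.711 expansion, and `decode_sph_ulaw` is a hypothetical helper name.

```python
def ulaw_to_linear(byte):
    """Decode one G.711 mu-law byte to a signed linear integer."""
    u = ~byte & 0xFF                  # mu-law bytes are stored inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample

def decode_sph_ulaw(data):
    """Skip the SPHERE header and decode the remaining mu-law bytes."""
    if not data.startswith(b"NIST_1A"):
        raise ValueError("not a NIST SPHERE file")
    # The second header line holds the header size, typically 1024.
    header_size = int(data[8:16].decode("ascii").strip())
    return [ulaw_to_linear(b) for b in data[header_size:]]
```

The samples are stored sequentially in time, so the resulting list can be split into segments. For dual-channel files the two channels are interleaved sample by sample, so `values[0::2]` and `values[1::2]` separate them.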

How does MATLAB read and interpret binary digits from a .bin file?

I have a binary file with a .bin extension, created by data acquisition software. Basically, a "Measurement Computing" 16-bit data-acquisition device receives signals from a transducer (after an amplifier) and sends them to the PC over USB. The software then generates a .bin file from the serial data received from the DAQ hardware. There are several ways to read this .bin file and plot the signal in MATLAB.
When I open this .bin file with a hex editor I can see the ASCII or ones and zeros (binary), but I don't know how to interpret it. There are 208000 bytes in the file, obtained in 16 seconds. I was thinking each 2 bytes corresponds to one sample, since the DAQ device has 16-bit resolution. So I thought that, for example, a 16-bit value such as 1000100111110010 is converted by MATLAB to a corresponding voltage level. But I tried opening two different .bin files recorded at different voltage levels, such as 1 V and 9 V, and still the numbers do not seem to be related in the way I expected.
How does MATLAB read and interpret binary digits from a .bin file?
Thanks,
Assuming your .bin file is literally just a dump of the values recorded, you can read the data using fread (see the documentation for more info):
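fid = fopen('path_to_your_file', 'r');   % open the file for binary reading
nSamples = 104000;                       % 208000 bytes / 2 bytes per sample
data = fread(fid, nSamples, 'int16');    % interpret each 2 bytes as a signed 16-bit integer
fclose(fid);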
You will also need to know, however, whether this data is signed or unsigned: if it's unsigned you can use 'uint16' as the third argument to fread instead. You should also find out if it's big-endian or little-endian, which you can then pass to fopen as the machine-format argument ('ieee-be' or 'ieee-le')... You should check the original program's source code.
It's a good idea to record the sample rate at which you make acquisitions like this, because you'll be hard pressed to do anything but trivial analysis on it afterwards without knowing this information. Often this kind of data is stored in .wav files, so that both the data and its sample rate (and the bit depth, in fact) are stored in the file. That way you don't need a separate bit of paper to go along with your file (also, reading .wav files in MATLAB is extremely easy).
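
To make the byte interpretation concrete, here is a small Python sketch of what a 16-bit read does, together with the sample rate implied by the numbers in the question. `read_int16` and `sample_rate` are hypothetical helper names; the endianness and signedness defaults are assumptions that must be checked against the acquisition software, and a single channel is assumed.

```python
import struct

def read_int16(data, little_endian=True, signed=True):
    """Unpack raw bytes as consecutive 16-bit integers."""
    n = len(data) // 2
    fmt = ("<" if little_endian else ">") + str(n) + ("h" if signed else "H")
    return list(struct.unpack(fmt, data[:2 * n]))

def sample_rate(n_bytes, seconds, bytes_per_sample=2):
    """Implied sample rate for a single-channel recording."""
    return n_bytes / bytes_per_sample / seconds
```

For the file described above, 208000 bytes over 16 seconds at 2 bytes per sample works out to 6500 samples per second. If the decoded numbers look unrelated to the applied voltage, trying the other endianness or the unsigned interpretation is a quick check, since the same two bytes give very different values under each.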