What is the content of an .aac audio file? - MATLAB

I may sound like a rookie, so please excuse me. When I read an .aac audio file in MATLAB using the audioread function, the output is a 256000x6 matrix. How do I know what each column contains?
filename = 'sample1.aac';
[y,Fs] = audioread(filename,'native');
When I write only the first column using audiowrite as below, I can hear the whole sound. So what are the other columns?
audiowrite('sample2.wav',y(:,1),Fs);

Documentation:
https://uk.mathworks.com/help/matlab/ref/audioread.html
Output Arguments
y - Audio Data
Audio data in the file, returned as an m-by-n matrix, where m is the number of audio samples read and n is the number of audio channels in the file.
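So each of the six columns is one audio channel. A six-channel file is often a 5.1 surround mix, though that depends on how the file was encoded. If you can hear the entire recording in the first channel, it just means most of the content is carried by that single channel. From Wikipedia re AAC audio channels: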
AAC supports inclusion of 48 full-bandwidth (up to 96 kHz) audio channels in one stream plus 16 low frequency effects (LFE, limited to 120 Hz) channels, up to 16 "coupling" or dialog channels, and up to 16 data streams
https://en.wikipedia.org/wiki/Advanced_Audio_Coding

Related

Reading WAV audio files and getting amplitude data over time in Dart and Flutter for plotting

I want to read WAV files in Dart and get amplitude data over time so I can plot it in a chart using Flutter.
At first I converted .mp3 and .m4a files to WAV files using flutter_ffmpeg, but now I want to read those files and get the sample rate and amplitude,
and then plot an existing file on a chart with amplitude on the y axis and time on the x axis.
My question is about extracting amplitude and other WAV info from .wav or .pcm files, as I found no documentation on the web.
Natural audio in the wild is a continuous wobble of a curve: think of your eardrum, the membrane of a microphone, or a drum head. Digital audio is that same curve recorded as a progression of audio samples. Typically 44100 samples are recorded per second, and each sample records the height of the curve with 16 bits of resolution, meaning 2^16 == 65,536 possible distinct height values for a given point on the raw audio curve (for more detail, research PCM audio). So a single audio sample, call it s1, represents the curve height at a specific point in time, and that height is the amplitude for that sample.
When reading a WAV file, the first 44 bytes are typically a header (files with extra chunks can have a longer one), followed by the payload, which contains the raw audio curve of each channel (mono is one channel, stereo is two channels, etc.). Typically audio is recorded using many microphones; to create an audio CD, however, the music studio mixes the multitrack audio (possibly dozens of channels originally) down into two channels, one for the left speaker and one for the right, which is stereo. The header tells you these critical details about the payload: sample_rate (number of audio samples captured per second per channel), bit_depth (number of bits used to store each audio sample for a given channel), payload size in bytes, and number of channels. You can write a WAV parser yourself (it takes about two pages of code) or use a library to retrieve these data structures. Once parsed, the payload gives you the raw audio curve s1, s2, s3, etc. for each channel.
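As a rough illustration of the header fields described above, here is a minimal Python sketch that walks the RIFF chunks of an uncompressed PCM WAV file. The function name `read_wav_header` and the dictionary keys are made up for this example, and it assumes a well-formed little-endian RIFF/WAVE file; a Dart version would follow the same byte layout.

```python
import struct

def read_wav_header(path):
    """Return basic format fields of a WAV file by walking its RIFF chunks."""
    with open(path, "rb") as f:
        riff, _size, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        info = {}
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                break  # ran out of chunks without finding audio data
            chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
            if chunk_id == b"fmt ":
                fmt = f.read(chunk_size)
                (audio_format, channels, sample_rate,
                 _byte_rate, _block_align, bit_depth) = struct.unpack("<HHIIHH", fmt[:16])
                info.update(audio_format=audio_format, channels=channels,
                            sample_rate=sample_rate, bit_depth=bit_depth)
                if chunk_size % 2:
                    f.seek(1, 1)  # chunks are padded to an even length
            elif chunk_id == b"data":
                info["data_bytes"] = chunk_size   # payload size in bytes
                info["data_offset"] = f.tell()    # where the samples start
                break
            else:
                # Skip chunks we do not care about (LIST, fact, ...)
                f.seek(chunk_size + (chunk_size % 2), 1)
        return info
```

For a canonical file with no extra chunks, `data_offset` comes out as 44, which is where the "44 byte header" figure comes from.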
Typically, when folks need to identify the amplitude, they mean an aggregate of the curve height over many audio samples s1, s2, s3, ... One way to skin this cat is to calculate the Root Mean Square of a set of audio samples to get one aggregate value, currAmplitude, then slide forward in time and repeat for the next set of samples. The number of samples in a given RMS calculation is up to you, perhaps 1k or 2k, more or less depending on your appetite for CPU consumption and the resolution you want from this aggregated amplitude measurement.
currAmplitude = square_root_of( ( s1*s1 + s2*s2 + s3*s3 + ... + sn*sn ) / n ) // this is the RMS formula
Keep in mind each audio sample has its own amplitude, so you can simply plot those (s1, s2, s3, ...), or instead repeatedly apply the RMS above to get a set of aggregate amplitudes, which is more helpful if a general ballpark amplitude is desired rather than the instantaneous amplitude of each sample.
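The sliding RMS described above could be sketched like this in Python. The function name `rms_envelope` and the default window of 1024 samples are assumptions for illustration; the input is taken to be one channel of already-decoded samples.

```python
import math

def rms_envelope(samples, sample_rate, window=1024):
    """Return (time_in_seconds, rms) pairs, one per block of `window` samples."""
    envelope = []
    for start in range(0, len(samples), window):
        block = samples[start:start + window]
        # Root Mean Square: square each sample, average, take the square root
        rms = math.sqrt(sum(s * s for s in block) / len(block))
        envelope.append((start / sample_rate, rms))
    return envelope
```

Each pair gives a point for the chart: time on the x axis and aggregate amplitude on the y axis. A smaller window gives finer time resolution at the cost of more points to compute and plot.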

Different length of sound files with different sampling frequencies

I'm currently struggling to understand what is happening. I created a sound using the audiowrite function in MATLAB (it is built from two different sounds, but I don't think that matters), first with a sampling frequency of 44100 Hz, and then again with the same data but a sampling frequency of 48000 Hz. Now I observe that the sound written at 44100 Hz is approx. 30 sec longer than the one written at 48000 Hz. It looks like phase shifting of some sort, but I'm not sure. Any help/explanation is appreciated. I also made an amplitude/time plot for better understanding:
(I set the x axis to 350sec to see where the signal ends).
EDIT: here is the code for how I create the sound file:
[y1,F1] = audioread(cave_file); %cave and forest files are mp3 files loaded earlier both have samp.freq of 48000Hz
[y2,F2] = audioread(forest_file);
samp_freq=44100;
%samp_freq=48000;
a = max(size(y1),size(y2));
z = [[y1;zeros(abs([a(1),0]-size(y1)))],[y2;zeros(abs([a(1),0]- size(y2)))]]
audiowrite('test_sound.wav', z,samp_freq);
What is the storage format? More specifically, is the information about sampling rate and number of channels stored in the file metadata, which is then used during playback?
If so, there are three possible explanations for this behavior:
1) The sampling rate in the metadata of the 44.1 kHz file is incorrect, while the audio was sampled at the correct rate, i.e. 44.1 kHz. Since the 44.1 kHz file plays longer than the 48 kHz file, which I'm assuming produces the correct sound and plays for the correct duration, the rate stored in the 44.1 kHz file's metadata would have to be lower than 44.1 kHz.
Could you please check the metadata, or attach the files here so that I can take a look?
2) The sampling didn't happen at the correct rate, while the metadata has 44.1 kHz as the sampling rate.
3) The number of channels is stored incorrectly.
If the files are raw PCM, then probably the correct sampling rate and/or number of channels was not selected when playing the 44.1 kHz file.
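When checking the metadata, it may help to keep in mind that playback duration is simply the number of samples per channel divided by the sampling rate stored in the header, so the same samples labelled with a lower rate play for longer. A small Python sketch of that arithmetic follows; the sample count is a hypothetical value chosen for illustration, not taken from the files in the question.

```python
def duration_seconds(num_samples, sample_rate):
    """Playback duration of `num_samples` samples per channel at `sample_rate` Hz."""
    return num_samples / sample_rate

num_samples = 16_000_000  # hypothetical sample count per channel
for rate in (44100, 48000):
    print(rate, "Hz ->", round(duration_seconds(num_samples, rate), 1), "s")
```

With this hypothetical count, the two durations differ by roughly half a minute, which is the same order as the difference described in the question.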
Hope this helps

How can I record an audio file in Qt and read it in MATLAB?

I'm recording an audio file in Qt. After that, I have to read the file with MATLAB and analyse it. Qt likes to save audio files in .pcm format (i.e. .wav format without header) and I can't read .pcm audio files with MATLAB (format is not supported).
What is the best solution to transfer audio from Qt to MATLAB?
Firstly, since your .pcm file has no header information, you'll need to know the number of bits per sample you used to create it in Qt. A typical value would be 16 bits per sample, or a data type of int16. Then you can use fread to read the audio waveform from the file like so:
fid = fopen('your_file.pcm', 'r');          % open the raw PCM file for reading
audioWaveform = fread(fid, Inf, 'int16');   % read every sample as a 16-bit signed integer
fclose(fid);
If you then want to do any processing, you will likely need to provide other pieces of information from when you created it in Qt, like the sampling frequency.
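For comparison, the same idea outside MATLAB could look like the following Python sketch. It assumes 16-bit signed little-endian samples that are interleaved when there is more than one channel; those are assumptions about how the recording was configured in Qt, not something the file itself can confirm. The function name `read_pcm16` is made up for this example.

```python
import struct

def read_pcm16(path, channels=1):
    """Read headerless 16-bit little-endian PCM and return one list per channel."""
    with open(path, "rb") as f:
        raw = f.read()
    count = len(raw) // 2  # ignore a trailing odd byte, if any
    samples = struct.unpack("<%dh" % count, raw[:count * 2])
    # Interleaved layout: sample 0 of ch0, sample 0 of ch1, sample 1 of ch0, ...
    return [list(samples[c::channels]) for c in range(channels)]
```

If the values look like noise, the bit depth, byte order, or channel count assumed here probably does not match the recording settings.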

How can I save audio wav file without data clipping?

I am using MATLAB for extracting the silent parts from audio WAV files. After extracting the silent parts, I want to save the new audio as a WAV file.
For this, I use the 'audiowrite' function. However, the program warns me with this message:
Warning: Data clipped when writing file.
I tried setting 'BitsPerSample' to 32 (single-precision format), and the warning no longer appears. The audio files were then saved as 32-bit, but the WAV files should be 16-bit.
How can I fix this problem?
audiowrite(filename,y,fs,'BitsPerSample',32);
Note: I also normalized the data and the problem is the same.
Thanks for your help!
UPDATE:
I want to normalize the audio samples to mean 0 and standard deviation (or variance) 1, so I use the z-score normalization technique.
The y/max(abs(y)) method also normalizes the data, to between -1 and 1; however, the mean and variance are then not equal to 0 and 1 respectively. The two techniques normalize the data in different ways.
Actually, my question is: how can I save samples normalized with the z-score technique without data clipping?
MATLAB's audiowrite expects a different value range for each data type. So if you want a 16-bit WAV file, scale your data to the [-32768,32767] range and convert it to int16:
peak = max(abs(y(:)));   % largest absolute sample over all channels
y_normalized = double(intmax('int16')) * y / (peak*1.001);   % scale in double to avoid int16 saturation
audiowrite(filename, int16(y_normalized), fs)
Similarly, for floating-point data, normalize to the [-1,+1] range:
y_normalized = y / max(abs(y(:)));
audiowrite(filename, y_normalized, fs)
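The scaling step itself is language independent. As a minimal Python sketch of the same peak normalization (the helper name `to_int16` is hypothetical, and it uses 32767 as the full-scale value so that both signs stay in range):

```python
def to_int16(samples):
    """Peak-normalize float samples and scale them into the int16 range."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return [0] * len(samples)  # all-silent input: nothing to scale
    scale = 32767 / peak           # map the largest magnitude to full scale
    return [int(round(s * scale)) for s in samples]
```

Because every value is divided by the peak, the result cannot exceed the 16-bit range regardless of how large the input values were, for example after z-scoring. Note that this rescaling changes the standard deviation of the data; only the waveform shape is preserved.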

How to frequency shift an audio file in MATLAB

I am reading two audio files (mp3/wav) which are bandlimited to 15kHz.
[audio1 fs]=audioread('test1.mp3');
[audio2 fs]=audioread('test2.mp3');
I want to combine audio1 and audio2 into a single audio file in such a way that the first audio file (audio1) is band limited between 1-15kHz and the second (audio2) is band limited between 16-30kHz.