I am looking for the end of the AAC stream in an MP4 file. On the internet, I found the suggestion that one needs to look for the binary value 111 (7 in decimal) at the end.
However, after analysing the AAC, some AAC samples do not seem to end with the END element (7).
Any idea why that is? How can I find the end of the AAC stream in those cases?
I'm recording an audio file in Qt. After that, I have to read the file with MATLAB and analyse it. Qt likes to save audio files in .pcm format (i.e. .wav format without a header) and I can't read .pcm audio files with MATLAB (the format is not supported).
What is the best solution to transfer audio from Qt to MATLAB?
Firstly, since your .pcm file has no header information, you'll need to know the number of bits per sample you used to create it in Qt. A typical value would be 16 bits per sample, or a data type of int16. Then you can use fread to read the audio waveform from the file like so:
fid = fopen('your_file.pcm', 'r');
audioWaveform = fread(fid, Inf, 'int16');
fclose(fid);
If you then want to do any processing, you will likely need to provide other pieces of information from when you created it in Qt, like the sampling frequency.
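The same headerless read can be sketched outside MATLAB too. Below is a minimal Python illustration of the idea, assuming 16-bit native-endian samples; the filename and sample values are made up for the example:

```python
import array

# Read a headerless .pcm file of signed 16-bit samples
# ('h' = signed 16-bit, native byte order -- adjust to match
# whatever Qt actually wrote).
def read_pcm16(path):
    samples = array.array('h')
    with open(path, 'rb') as f:
        samples.frombytes(f.read())
    return samples

# Demo: write two known samples, then read them back.
with open('your_file.pcm', 'wb') as f:
    f.write(array.array('h', [1000, -1000]).tobytes())

print(list(read_pcm16('your_file.pcm')))  # → [1000, -1000]
```

As in the MATLAB version, nothing in the file tells you the sample rate or bit depth; you have to carry those over from the Qt side.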
I may sound too much like a rookie, please excuse me. When I read a .aac audio file in MATLAB using the audioread function, the output is a 256000x6 matrix. How do I know what the content of each column is?
filename = 'sample1.aac';
[y,Fs] = audioread(filename,'native');
Writing the first column using audiowrite as below, I can hear the whole sound. So what are the other columns?
audiowrite('sample2.wav',y(:,1),Fs);
Documentation:
https://uk.mathworks.com/help/matlab/ref/audioread.html
Output Arguments
y - Audio Data
Audio data in the file, returned as an m-by-n matrix, where m is the number of audio samples read and n is the number of audio channels in the file.
If you can hear the entire file in the first channel, it just means most of that file is contained in a mono channel. From Wikipedia, regarding AAC audio channels:
AAC supports inclusion of 48 full-bandwidth (up to 96 kHz) audio channels in one stream plus 16 low frequency effects (LFE, limited to 120 Hz) channels, up to 16 "coupling" or dialog channels, and up to 16 data streams
https://en.wikipedia.org/wiki/Advanced_Audio_Coding
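To make the m-by-n layout concrete, here is a toy illustration (values invented) of why "the first column" is the audible mono channel:

```python
# Each row is one sample instant, each column one channel --
# the same layout audioread's y matrix uses.
frames = [
    [0.5, 0.0, 0.0],    # sample 1: only channel 1 carries audio
    [0.25, 0.0, 0.0],   # sample 2
    [-0.5, 0.0, 0.0],   # sample 3
]

# "First column" = the mono channel the asker can hear.
channel1 = [row[0] for row in frames]
print(channel1)  # → [0.5, 0.25, -0.5]
```

In MATLAB the equivalent selection is simply `y(:,1)`.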
I'm decoding an H.264 video bitstream. I have PPS and SPS first; after that come I, P, B, SI, and SP slices. I used ffmpeg to convert it into MP4 format, but the video is not correct (it's playable but can't be seen clearly).
I think my I-slices have been fragmented. Do you have any idea how to merge them? Among the slices, only frames beginning with 0181 have frame_num from 1 to x. The other frames have non-sequential frame_num values. What does that mean?
Thanks for reading.
I have solved my problem. For the fragmented slices, I cut the slice headers (for example 00000101) and kept only the header of the starting slice (for example 00000181), and now I can convert to MP4 and watch the video :D
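The fix above amounts to splitting the stream on start codes and dropping the extra headers. A hedged sketch of the splitting step, using a made-up byte sequence and the three-byte 0x000001 Annex B start code (real streams also use the four-byte form):

```python
# Split an H.264 Annex B byte stream into NAL unit payloads
# on the 0x000001 start code.
def split_nals(stream: bytes):
    units = []
    i = stream.find(b'\x00\x00\x01')
    while i != -1:
        j = stream.find(b'\x00\x00\x01', i + 3)
        units.append(stream[i + 3: j if j != -1 else len(stream)])
        i = j
    return units

# Two toy "NAL units" for illustration only.
data = b'\x00\x00\x01\x81\xAA\x00\x00\x01\x01\xBB'
print(split_nals(data))  # → [b'\x81\xaa', b'\x01\xbb']
```

Once the units are separated, the merge described above is just concatenating the fragment payloads after the first fragment's slice header.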
I need to know if it is possible to create a 30-second sample MP3 from a WAV file. The generated MP3 file must feature a fade at the start and end.
I'm currently using ffmpeg, but cannot find any documentation that would support being able to do such a thing.
Could someone please provide the name of software (CLI, *nix only) that could achieve this?
This will
trim out 30 seconds starting at position 45 seconds (trim 0:45.0 30),
fade in over the first 5 seconds (0:5) and fade out over the last 5 seconds (0 0:5), and
convert from WAV to MP3:
sox infile.wav outfile.mp3 trim 0:45.0 30 fade h 0:5 0 0:5
Check out SoX - Sound eXchange
I have not used it myself but one of my friends speaks highly of it.
From the web page (emphasis mine):
SoX is a cross-platform (Windows, Linux, MacOS X, etc.) command line utility that can convert various formats of computer audio files into other formats. It can also apply various effects to these sound files, and, as an added bonus, SoX can play and record audio files on most platforms.
The best way to do this is to apply the 30-second truncation, fade in and fade out to the WAV audio data before converting it to an MP3. If your conversion library has a method that takes an array of samples, this is very easy to do. If the method only accepts a WAV file (either in-memory or on disk), then this is slightly less easy as you have to learn the WAV file format (which is easy to write but somewhat more difficult to read). Either way, applying gain and/or attenuation to time-domain sample data (as in a WAV file) is much easier than trying to apply these effects to frequency-domain data (as in an MP3 file).
Of course, if your conversion library already does all this, it's best to just use that and not worry about it yourself.
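The "apply gain to time-domain samples" part is genuinely simple. Here is a minimal sketch of a linear fade-in/fade-out over a sample array (invented values; real code would typically use a smoother curve, such as the half-cosine of sox's `fade h`):

```python
# Apply a linear fade-in over the first fade_len samples and a
# linear fade-out over the last fade_len samples.
def apply_fades(samples, fade_len):
    out = list(samples)
    n = len(out)
    for i in range(min(fade_len, n)):
        gain = i / fade_len
        out[i] = out[i] * gain                  # fade in
        out[n - 1 - i] = out[n - 1 - i] * gain  # fade out
    return out

faded = apply_fades([1.0] * 8, 4)
print(faded)  # → [0.0, 0.25, 0.5, 0.75, 0.75, 0.5, 0.25, 0.0]
```

Doing the same thing to an already-encoded MP3 would require decoding the frequency-domain frames first, which is exactly the difficulty the answer above describes.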
A voice recorder doesn't need uncompressed Linear PCM audio. Compressed AMR would do fine. The iPhone framework built for recording audio is simple enough, but the only examples I've found for setting up the audio format (which come from Apple) use LinearPCM. I've tried various other combinations of values, but can't seem to get anything to work.
Does anybody have any code that actually records AMR?
Edit:
The AMR format is one of the options for setting the data type, but the other options (packet size, frame size, etc.) don't seem to match up no matter what I set them to.
Edit: Here's what I have for the PCM version:
/*
If we want to use AMR instead of PCM:
AMR Format:
Sampling Frequency: 8 kHz/13-bit (160 samples for 20 ms frames), filtered to 200-3400 Hz
eight source codecs: 12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbit/s
generated frame length: 244, 204, 159, 148, 134, 118, 103, 95 bits per frame
*/
format->mFormatID = kAudioFormatLinearPCM;
format->mSampleRate = 8000.0; //8 kHz
format->mFramesPerPacket = 1; //1 frame per packet
format->mChannelsPerFrame = 1; //Mono
format->mBytesPerFrame = 2; //2 bytes per frame (16 bits, mono)
format->mBytesPerPacket = 2; //Same as bytes per frame
format->mBitsPerChannel = 16; //16-bit audio
format->mReserved = 0; //always 0
format->mFormatFlags = kLinearPCMFormatFlagIsBigEndian |
kLinearPCMFormatFlagIsSignedInteger |
kLinearPCMFormatFlagIsPacked;
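As a sanity check on the values above: for packed linear PCM, bytes per frame is channels times bits per channel over eight, and bytes per packet is frames per packet times bytes per frame. A quick illustration of that arithmetic:

```python
# Sanity-check arithmetic for the Linear PCM fields above.
channels = 1            # mChannelsPerFrame
bits_per_channel = 16   # mBitsPerChannel
frames_per_packet = 1   # mFramesPerPacket

bytes_per_frame = channels * bits_per_channel // 8
bytes_per_packet = frames_per_packet * bytes_per_frame
print(bytes_per_frame, bytes_per_packet)  # → 2 2
```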
The AMR codec is NOT supported for encoding/recording on the iPhone, although it is supported for playback: this is the reason the kAudioFormatAMR constant exists.
The official API documentation says the supported encoding formats are:
ALAC (Apple Lossless) ~> kAudioFormatAppleLossless
iLBC (internet Low Bitrate Codec, for speech) ~> kAudioFormatiLBC
IMA/ADPCM (IMA4) ~> kAudioFormatAppleIMA4
linear PCM ~> kAudioFormatLinearPCM
µ-law ~> kAudioFormatULaw
a-law ~> kAudioFormatALaw
You may try one of these formats, or use an open-source AMR encoder as goldenmean suggests.
edit: updated official API link
To update olegueret's link to the official documentation (why do they hide this stuff?)
http://developer.apple.com/library/ios/#qa/qa2008/qa1615.html
I guess the AMR codec format is not supported by the iPhone voice recorder app.
Maybe one can try integrating some open-source implementation of an AMR encoder into Apple's iPhone application framework, and make the voice recorder store the audio in AMR-encoded format. (I don't know if that's allowed by Apple under their NDA/license.)
-AD
You can record audio to an uncompressed Linear PCM buffer (circular or ring) and, in another thread, convert the data in this buffer using your own AMR (or other) compression engine, before saving the compressed audio data to a file.
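That producer/consumer pattern can be sketched language-agnostically. Below, a recorder thread pushes raw PCM chunks into a queue and a worker thread "compresses" them; `fake_compress` is a placeholder standing in for a real AMR encoder, and the chunk size of 320 bytes matches AMR's 160 16-bit samples per 20 ms frame at 8 kHz:

```python
import queue
import threading

def fake_compress(chunk):
    # Placeholder for a real AMR encoder: just halves the size.
    return bytes(len(chunk) // 2)

pcm_queue = queue.Queue()
compressed = []

def worker():
    # Consumer: drain PCM chunks and compress them off the
    # recording thread.
    while True:
        chunk = pcm_queue.get()
        if chunk is None:          # sentinel: recording stopped
            break
        compressed.append(fake_compress(chunk))

t = threading.Thread(target=worker)
t.start()
for _ in range(3):                 # "record" three PCM buffers
    pcm_queue.put(b'\x00' * 320)   # 160 16-bit samples = 20 ms @ 8 kHz
pcm_queue.put(None)
t.join()
print(len(compressed), len(compressed[0]))  # → 3 160
```

The queue decouples capture timing from encoder speed, which is the whole point of doing the compression in a separate thread.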