Is there a way to get the total number of PCM samples in an Ogg file containing Opus audio using libogg and libopus?
Thanks,
I made a similar modification to opus-tools' opusinfo command:
issue: Show total samples used to calculate duration in opusinfo
commit: opusinfo: Add total playback samples to output
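For reference, the count can be derived from the container alone: per RFC 7845, the playable length at 48 kHz is the granule position of the last page minus the pre-skip value from the OpusHead header. Below is a minimal Python sketch of that arithmetic, not the opusinfo implementation. It assumes a single, unchained logical stream whose granule positions start at zero, and the function name `opus_total_samples` is made up for illustration. With libogg, the same granule values are available through `ogg_page_granulepos()`.

```python
import struct

def opus_total_samples(data: bytes) -> int:
    """Return the playable PCM sample count (at 48 kHz) of an Ogg Opus stream.

    Assumes one unchained logical stream whose granule positions start at 0.
    """
    pre_skip = None
    last_granule = None
    pos = 0
    while pos + 27 <= len(data):
        if data[pos:pos + 4] != b"OggS":
            raise ValueError("lost Ogg page sync")
        # Granule position: signed 64-bit little-endian at offset 6 of the page header.
        granule = struct.unpack_from("<q", data, pos + 6)[0]
        n_segments = data[pos + 26]
        lacing = data[pos + 27:pos + 27 + n_segments]
        body_start = pos + 27 + n_segments
        body_len = sum(lacing)
        body = data[body_start:body_start + body_len]
        if pre_skip is None and body[:8] == b"OpusHead":
            # OpusHead layout: magic(8) version(1) channels(1) pre_skip(2, LE) ...
            pre_skip = struct.unpack_from("<H", body, 10)[0]
        if granule >= 0:  # -1 means no packet finishes on this page
            last_granule = granule
        pos = body_start + body_len
    if pre_skip is None or last_granule is None:
        raise ValueError("not an Ogg Opus stream")
    return max(last_granule - pre_skip, 0)
```

Dividing the result by 48000 gives the duration in seconds, since Opus granule positions always count 48 kHz samples regardless of the original input rate.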
I'm currently struggling to understand what is happening. I created a sound using the audiowrite function in Matlab (the sound is built from two different sounds, but I don't think that matters), first with a sampling frequency of 44100 Hz and then again, from the same data, with a sampling frequency of 48000 Hz. I'm observing that the sound written at 44100 Hz is approx. 30 sec longer than the one written at 48000 Hz. It looks like phase shifting of some sort, but I'm not sure. Any help/explanation is appreciated. I also made an amplitude/time plot for better understanding:
(I set the x axis to 350sec to see where the signal ends).
EDIT: here is the code for how I create the sound file:
% cave and forest files are MP3 files loaded earlier; both have a sampling frequency of 48000 Hz
[y1,F1] = audioread(cave_file);
[y2,F2] = audioread(forest_file);
samp_freq = 44100;
%samp_freq = 48000;
% zero-pad the shorter signal so both have the same number of rows, then place them side by side
a = max(size(y1),size(y2));
z = [[y1;zeros(abs([a(1),0]-size(y1)))],[y2;zeros(abs([a(1),0]-size(y2)))]];
audiowrite('test_sound.wav', z, samp_freq);
What is the storage format? More specifically, is the information about the sampling rate and number of channels stored in the file metadata, which is then used during playback?
If so, then there are 3 possibilities for this behavior:
1) The sampling rate metadata of the 44.1 kHz file is incorrect, while the audio was sampled at the correct rate, i.e. 44.1 kHz. Because the 44.1 kHz file plays longer than the 48 kHz file, which I'm assuming produces the correct sound and plays for the correct duration, it can be concluded that the sampling rate stored in the metadata of the 44.1 kHz file is much lower than 44.1 kHz.
Could you please check the metadata, or attach the files here so that I can take a look?
2) The sampling didn't happen at the correct rate, while the metadata has 44.1 kHz as the sampling rate.
3) The number of channels is incorrectly stored.
If the files are raw PCM, then the correct sampling rate and/or number of channels was probably not selected when playing the 44.1 kHz file.
Hope this helps
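One more possibility that fits the code in the question: audiowrite stores the samples unchanged and only records the given rate in the file, so the same 48 kHz data labelled as 44.1 kHz plays back slower and lasts longer. The sketch below shows the arithmetic in Python. The 340-second length is a hypothetical figure (the question only says the plot ends before 350 sec), and `playback_seconds` is a made-up helper name.

```python
def playback_seconds(num_samples: int, declared_rate_hz: int) -> float:
    """Duration a player reports when it trusts the declared sampling rate."""
    return num_samples / declared_rate_hz

# Hypothetical: 340 s of audio that was decoded at 48 kHz.
num_samples = 340 * 48000

at_48k = playback_seconds(num_samples, 48000)  # samples labelled with their true rate
at_44k = playback_seconds(num_samples, 44100)  # same samples labelled as 44.1 kHz
extra = at_44k - at_48k                        # how much longer the 44.1 kHz file plays

print(f"48 kHz label: {at_48k:.1f} s, 44.1 kHz label: {at_44k:.1f} s, extra: {extra:.1f} s")
```

Under that assumption the difference comes out at roughly 30 seconds, which matches the observation. To actually change the rate without changing the duration, the data would have to be resampled before writing.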
I may sound like a rookie, so please excuse me. When I read an .AAC audio file in Matlab using the audioread function, the output is a 256000x6 matrix. How do I know what the content of each column is?
filename = 'sample1.aac';
[y,Fs] = audioread(filename,'native');
Writing the first column using audiowrite as below, I can hear the whole sound. So what are the other columns?
audiowrite('sample2.wav',y(:,1),Fs);
Documentation:
https://uk.mathworks.com/help/matlab/ref/audioread.html
Output Arguments
y - Audio Data
Audio data in the file, returned as an m-by-n matrix, where m is the number of audio samples read and n is the number of audio channels in the file.
If you can hear the entire file in the first channel, it just means most of that file's content is carried in a single (mono) channel. From Wikipedia regarding AAC audio channels:
AAC supports inclusion of 48 full-bandwidth (up to 96 kHz) audio channels in one stream plus 16 low frequency effects (LFE, limited to 120 Hz) channels, up to 16 "coupling" or dialog channels, and up to 16 data streams
https://en.wikipedia.org/wiki/Advanced_Audio_Coding
The problem I have is that when using ffmpeg to encode a YUV file with libx264, I don't get all the frame information in the -vstats output. It raises the question of how reliable ffmpeg is, and therefore whether any 'codec benchmark' review based on ffmpeg can be trusted.
I am analysing codecs to determine how they perform. I am using ffmpeg and its -vstats option to look at an encoded movie frame by frame. The process I use:
RAW YUV -> bar-code each frame with frame number -> Bar-coded YUV
Bar-coded YUV -> encoded (e.g. with libx264) -> MKV -> Decoded to YUV
I can compare the two outputs ('Bar-coded YUV' & 'Decoded to YUV') using the bar-code in each frame. I can then compare, exactly, an original frame with an encoded frame using PSNR etc.
When encoding using libx264 and libdirac, some frame information is missing. Other codecs, such as mpeg2video or even libvpx, don't have this problem.
I initially found that the libx264 vstats were missing for the first 40 to 50 frames. I have since proved that the missing information is actually for the last 40 to 50 frames.
It also looks like ffmpeg calculates the average bitrate based on the information in vstats. But as there are missing frames, the average bitrate is less than what it should be.
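To illustrate why missing frames would pull the average down, here is a small Python sketch. It assumes (this is a guess at the mechanism, not something confirmed from ffmpeg's source) that the byte total only covers the frames that were logged while the duration covers the whole clip. The frame sizes, frame rate and the 45-frame gap are hypothetical numbers.

```python
def average_bitrate_kbps(total_bytes: int, duration_s: float) -> float:
    """Average bitrate in kbit/s: total bits divided by duration."""
    return total_bytes * 8 / duration_s / 1000

fps = 25
frame_sizes = [40_000] * 500   # hypothetical: 500 frames of 40 kB each
missing = 45                   # hypothetical: stats for the last 45 frames never logged

clip_duration_s = len(frame_sizes) / fps

# All frames counted over the full clip duration.
full = average_bitrate_kbps(sum(frame_sizes), clip_duration_s)

# Same duration, but the bytes of the last frames are not in the total.
short = average_bitrate_kbps(sum(frame_sizes[:-missing]), clip_duration_s)

print(full, short)
```

The second figure is lower by exactly the share of bytes that went unlogged, so a shortfall of this kind scales with how many frames are missing and how large they are.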
Below are links to the average bitrate error example:
http://dl.dropbox.com/u/6743276/ffmpeg_probs/ffmpeg_av_bitrate_error.png
http://dl.dropbox.com/u/6743276/ffmpeg_probs/ffmpeg_av_bitrate_error.xlsx
Below is a link to the PSNR & f_size graph:
http://dl.dropbox.com/u/6743276/ffmpeg_probs/frame_mismatch.png
Below is a link to the output & command line options:
http://dl.dropbox.com/u/6743276/ffmpeg_probs/stderr.txt
I think this is also a bug; anyone clever enough to work it out might want to follow this tracker:
http://roundup.ffmpeg.org/issue2248
I have just discovered something which makes me very red in the face!! Quite annoying, but never mind :)
A fellow ffmpeg user pointed out that ffprobe should output more frame info, which it did. Here is a link to his handy tip:
http://forums.creativecow.net/thread/291/71
Using this I found the following:
Actual average bitrate (ffprobe data): 8355.2776056338
Actual average bitrate (ffmpeg vstats data): 8406.23275471698
Ffmpeg -vstats avg_br: 7816.3
Reproduced above: 7816.32168421053
Ffmpeg standard error output 'bitrate=': 8365.8
Below is a link to my workings out:
http://dl.dropbox.com/u/6743276/ffmpeg_probs/ffprobe_vs_ffmpeg-vstats.xlsx
What I have discovered is that I should have been using the average bitrate info from ffmpeg's standard error output; it looks like the most reliable!
I am streaming an MP3 over the network using custom feeding code (APIs like AudioFileStreamOpen etc.), not AVAudioPlayer, which only works with URLs.
Is there any way to estimate the length of the stream? I know that I can get an 'elapsed' property using:
if (AudioQueueGetCurrentTime(queue.audioQueue, NULL, &t, &b) < 0)
    return 0;
return t.mSampleTime / dataFormat.mSampleRate;
But what about total duration to create a progress bar? Is that possible?
P.S. Clarification - I do know the actual size of the MP3 file, but don't know if that can be used... I'll even settle for a solution that just gives me a progress bar, not the actual time of play/duration.
If you know the total size of the MP3 file, you can determine the bitrate (bits per second) and from that calculate the duration of the stream: duration = file size in bits / bitrate. For CBR, you can simply use the bitrate of one packet. If it's VBR, you'll probably have to average the bitrate over several MPEG frames.
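The arithmetic can be sketched as follows (in Python for brevity; the function names and the example numbers are made up for illustration). It ignores ID3 tags and other non-audio bytes, so treat the result as an estimate, and for VBR the averaged bitrate is only as good as the frames sampled.

```python
def average_bitrate_bps(frame_bitrates_bps):
    """Mean bitrate over the MPEG frames seen so far (useful for VBR)."""
    return sum(frame_bitrates_bps) / len(frame_bitrates_bps)

def estimate_duration_seconds(file_size_bytes: int, bitrate_bps: float) -> float:
    """Estimated stream length: total bits divided by bits per second."""
    return file_size_bytes * 8 / bitrate_bps

def progress_fraction(elapsed_seconds: float, duration_seconds: float) -> float:
    """Progress bar position in the range 0..1."""
    return min(elapsed_seconds / duration_seconds, 1.0)

# CBR: use the bitrate of one packet (hypothetical 4.8 MB file at 128 kbps).
cbr_duration = estimate_duration_seconds(4_800_000, 128_000)

# VBR: average the bitrate over several frames first (hypothetical values).
vbr_rate = average_bitrate_bps([128_000, 192_000, 160_000])
vbr_duration = estimate_duration_seconds(4_800_000, vbr_rate)

print(cbr_duration, vbr_duration, progress_fraction(75.0, cbr_duration))
```

The elapsed time from AudioQueueGetCurrentTime divided by this estimate gives the progress bar value; for VBR the estimate can be refined as more frames arrive.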
I'm making an iPhone game. We need to use a compressed format for sound, and we want to be able to loop SEAMLESSLY back to a specific sample in the audio file (so there is an intro, then it loops back to an offset).
Currently THE ONLY export process I have found that allows seamless looping (reports the right priming and padding frame numbers, no clicking when looping, etc.) is using Apple's afconvert to encode to AAC in a CAF file.
But when we try to encode to lower bitrates, it automatically resamples the sound! We do NOT want the sound resampled. Every other encoder I have encountered has an option to set the output sample rate, but I can't find it for this one.
On another note, if anyone has had any luck with seamless looping of a compressed file format using audio queues, let me know.
Currently I'm working off the information found at:
http://developer.apple.com/mac/library/qa/qa2009/qa1636.html
Note that this DID work PERFECTLY when I left the bitrate for the encode at the default (~128 kbps), but when I set it to 32 kbps with the -b option, it resampled, and the loop now clicks.
It needs to be at least 48kbps. 32kbps will downsample to a lower sample rate.
I think you are confusing sample rate (typical values: 32kHz, 44.1kHz, 48kHz) and bit rate (typical values: 128kbps, 160kbps, 192kbps).
For a bit rate, 32kbps is extremely low. Sound will have bad quality at this bit rate. You probably intended to set the sample rate to 32kHz instead, which is also not outright typical, but makes more sense.
When compressing to AAC and uncompressing back to WAV, you will not get the same audio file back, because in AAC the audio data is represented in a completely different format than in raw wave. E.g. you can have shifts by a few microseconds, which are necessary to convert to the compressed format. You cannot completely get around this with any highly compressed format.
The clicking sound originates from the sudden change between two samples which are played in direct succession. This is likely taking place because the offset to which you jump back in your loop does not end up at exactly the same position in the AAC file as it was in the WAV file (as explained above, there can be shifts by microseconds).
You will not get around these slight changes when compressing. Instead, you have to compensate for them after compression by adjusting the offset. That means you have to open the compressed sound file in an audio editor, e.g. Audacity, and manually find another offset close to the original one, which is suitable for looping.
How to find an offset which is suitable for looping?
Zoom in to the waveform's end. Look at how the waveform looks there. Then zoom in to the waveform at the original offset and search in its neighbourhood for an offset at which the waveform connects seamlessly to the end of the waveform.
For an example of how this should look, open the uncompressed audio file in the audio editor and examine the end of the waveform and the offset there.
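The manual search described above could also be automated. Here is a rough Python sketch under stated assumptions: the decoded audio is available as a list of mono float samples, and "connects seamlessly" is approximated by matching the value and slope at the end of the waveform. The function name `find_loop_offset` and the cost measure are illustrative choices, not an established method.

```python
import math

def find_loop_offset(samples, original_offset, search_radius):
    """Return the offset near original_offset that best continues the waveform's end."""
    end_value = samples[-1]
    end_slope = samples[-1] - samples[-2]
    # Value the next sample would have if the end of the waveform simply continued.
    expected = end_value + end_slope

    lo = max(1, original_offset - search_radius)
    hi = min(len(samples) - 1, original_offset + search_radius)

    best_offset, best_cost = original_offset, float("inf")
    for i in range(lo, hi + 1):
        jump = abs(samples[i] - expected)                          # step in amplitude
        slope_mismatch = abs((samples[i] - samples[i - 1]) - end_slope)  # kink in direction
        cost = jump + slope_mismatch
        if cost < best_cost:
            best_offset, best_cost = i, cost
    return best_offset

# Hypothetical test signal: a sine wave with a period of 100 samples.
samples = [math.sin(2 * math.pi * n / 100) for n in range(1000)]
print(find_loop_offset(samples, 205, 10))
```

On real music the candidate with the lowest cost should still be checked by ear, since matching one sample and its slope does not guarantee that the loop sounds continuous.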