I want to record a video conference. I can receive RTP media from the video conferencing server, and I want to output the fragmented MP4 file format for live streaming. So, how do I write a fragmented MP4 file programmatically using Bento4?
MP4Box (part of GPAC, not Bento4) supports DASH. Here is a simple example:
MP4Box -dash 4000 -frag 4000 -rap -segment-name test_ input.mp4
'-dash 4000' to segment the input mp4 file into 4000ms chunks
'-frag 4000' to split each segment into 4000 ms fragments; since the fragment duration equals the segment duration here, each segment contains a single fragment and segments are not fragmented further.
'-rap' to force each segment to start at a random access point, i.e. at a keyframe. In that case the segment duration may differ from 4000 ms depending on the distribution of keyframes.
'-segment-name' to specify the pattern for segment names. In this case the segments will be named test_1.m4s, test_2.m4s, ...
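Since the question asks for Bento4 specifically: Bento4 ships a command-line tool, mp4fragment, that converts a regular MP4 into a fragmented one. A minimal sketch of driving it from Python follows; the --fragment-duration flag takes milliseconds, but check `mp4fragment --help` for your Bento4 version, as the exact flag name is an assumption here.

```python
import subprocess

def fragment_cmd(src, dst, frag_ms=4000):
    """Build the Bento4 mp4fragment command line.

    --fragment-duration is assumed to take milliseconds; verify
    against `mp4fragment --help` for your Bento4 version.
    """
    return ["mp4fragment", "--fragment-duration", str(frag_ms), src, dst]

def fragment(src, dst, frag_ms=4000):
    # Requires the Bento4 tools to be on PATH.
    subprocess.run(fragment_cmd(src, dst, frag_ms), check=True)
```

For DASH-style segmenting on top of that, Bento4 also provides mp4dash, which consumes fragmented files.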
I would like to get data from an audio file based on microphone input (both Android and iOS). Currently I'm using audioplayers and recordMp3 to record the microphone input, which results in an MP3 file with a local file path. In order to use the audio data, I want an uncompressed format like WAV. Would ffmpeg help with this conversion? I eventually want to use this data for visualization.
MP3 to WAV
ffmpeg -i input.mp3 output.wav
Note that any encoding artifacts in the MP3 will be included in the WAV.
Piping from ffmpeg to your visualizer
I'm assuming you need WAV/PCM because your visualizer only accepts that format and does not accept MP3. You can create a WAV file as shown in the example above, but if your visualizer accepts a pipe as input you can avoid creating a temporary file:
ffmpeg -i input.mp3 -f wav - | yourvisualizer …
Using ffmpeg for visualization
See examples at How do I turn audio into video (that is, show the waveforms in a video)?
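Once you have WAV/PCM data, the analysis step for a visualizer doesn't need ffmpeg at all. As a sketch of the idea (not Flutter code): Python's standard wave module can read the samples directly; wav_peaks and the bucket count below are illustrative names, and 16-bit PCM is assumed.

```python
import struct
import wave

def wav_peaks(src, buckets=64):
    """Return per-bucket peak amplitudes (0.0..1.0) from a 16-bit
    PCM WAV -- a typical input for a simple waveform visualizer.
    `src` may be a path or a file-like object."""
    with wave.open(src, "rb") as w:
        assert w.getsampwidth() == 2, "expects 16-bit PCM"
        n = w.getnframes()
        samples = struct.unpack("<%dh" % (n * w.getnchannels()),
                                w.readframes(n))
    step = max(1, len(samples) // buckets)
    return [max(abs(s) for s in samples[i:i + step]) / 32768.0
            for i in range(0, len(samples), step)]
```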
I have to play various mp3 files in a given piece of software without showing any information that could lead to recognition of the played track (it's a kind of quiz). To that end, I want to change the displayed track length to an arbitrary value. I can easily change standard ID3 tags like "name", "artist" and so on, but changing the displayed track length seems to be trickier...
Edit (after Vikram's response):
So far, I was able to manipulate the displayed track length by modifying the 'Xing' header in a VBR-encoded mp3 file. More precisely, I changed the bytes in the 'number of frames' field with a hex editor, which led to an mp3 that showed a modified track length according to:
Track length = Number of Frames * Samples Per Frame / Sampling Rate
with the file still being played correctly. This approach works for Winamp, VLC and Windows in general. Unfortunately, it does not seem to work for the proprietary software I have to use: that software still identifies the original track duration, presumably because it applies a different calculation method.
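The formula above, as a one-liner (1152 samples per frame is the MPEG-1 Layer III value; MPEG-2/2.5 Layer III uses 576):

```python
def mp3_duration_seconds(num_frames, sampling_rate, samples_per_frame=1152):
    """Track length implied by a Xing 'number of frames' field.

    samples_per_frame defaults to 1152 (MPEG-1 Layer III);
    MPEG-2/2.5 Layer III uses 576.
    """
    return num_frames * samples_per_frame / sampling_rate

# e.g. a 44.1 kHz file whose Xing header reports 9188 frames
# implies roughly a 240-second track
```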
Any other ideas on how the track duration could be calculated, or how it could be fooled into displaying an arbitrary value?
Thanks!
YES and NO.
Most .mp3 files have this extra info, besides ID3, in the XING header, which contains the duration of the file.
You can modify this header to put wrong info.
Or you can simply remove this XING header!
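A sketch of automating that hex edit. The layout assumed here is the common Xing/Info header: a 4-byte ASCII tag ("Xing" for VBR, "Info" for CBR), a 4-byte big-endian flags word, then a 4-byte big-endian frame count when the 0x1 flag is set; treat these offsets as assumptions and verify them against your own files.

```python
import struct

def patch_xing_frames(data: bytes, new_frames: int) -> bytes:
    """Overwrite the frame count in a Xing/Info header, if present.

    Assumed layout after the tag: 4-byte big-endian flags; if flag
    0x1 (FRAMES) is set, a 4-byte big-endian frame count follows.
    """
    for tag in (b"Xing", b"Info"):
        pos = data.find(tag)
        if pos < 0:
            continue
        flags = struct.unpack_from(">I", data, pos + 4)[0]
        if flags & 0x1:  # FRAMES flag: frame count present
            patched = bytearray(data)
            struct.pack_into(">I", patched, pos + 8, new_frames)
            return bytes(patched)
    return data  # no Xing/Info header, or no frame count stored
```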
There are two types of .mp3 files: CBR and VBR.
CBR is the most common. So, using the bitrate info, players can still estimate the length of the audio for CBR.
For VBR this is not always correct!
So, the audio file you had was most likely a VBR-encoded mp3 without a XING header.
The track length is calculated by parsing the whole MPEG audio stream, a fairly straightforward process. XING (or similar) headers (correctly: frames) exist only as a help, like an additional index for seeking in the file, and are not mandatory; they exist because, 25 years ago when VBR was "invented", fully parsing files and keeping the relevant data in memory would in most cases have cost too much performance. Metadata in which you could define an incorrect track length (like the TLEN frame through ID3v2) isn't mandatory either.
So: it's not possible. You merely found software/players that opt for performance without verifying everything they encounter. No other file/stream format comes to mind in which a track length is mandatory and cannot be calculated by parsing the file.
We know the movie atom in an MP4 container file stores information that describes the movie's data. I am wondering: can a single MP4 file contain more than one movie atom (atom type 'moov')?
Does anyone know? Thanks!
A valid MP4 file may only contain one single 'moov' box. This is stated in 8.2.1 of the ISO 14496-12 specification:
The metadata for a presentation is stored in the single Movie Box which occurs at the top-level of a file.
Normally this box is close to the beginning or end of the file, though this is not required.
If you are using fragmented MP4 files, the role of the 'moov' is mostly played by 'moof' boxes, and those typically occur multiple times.
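Since every top-level box starts with a 4-byte big-endian size followed by a 4-byte type, it is easy to check this yourself. A sketch that scans the top-level boxes of a file's bytes and counts occurrences of a given type (function names are illustrative):

```python
import struct

def top_level_boxes(data: bytes):
    """Yield (type, size) for each top-level box of an MP4 file.

    Each box starts with a 4-byte big-endian size and a 4-byte type;
    size == 1 means a 64-bit size follows, size == 0 means the box
    extends to the end of the file (ISO 14496-12, 4.2).
    """
    pos = 0
    while pos + 8 <= len(data):
        size, btype = struct.unpack_from(">I4s", data, pos)
        if size == 1:
            size = struct.unpack_from(">Q", data, pos + 8)[0]
        elif size == 0:
            size = len(data) - pos
        if size < 8:
            break  # malformed box; stop scanning
        yield btype.decode("latin-1"), size
        pos += size

def count_boxes(data: bytes, box_type: str) -> int:
    return sum(1 for t, _ in top_level_boxes(data) if t == box_type)
```

On a valid non-fragmented file, count_boxes(data, "moov") should come out as 1; on a fragmented file you would additionally see one 'moof' per fragment.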
I would like to implement a kind of text-to-speech for numbers only. I can record 10 wav files, but how can I combine them programmatically?
For instance, the user types 1234, and the text-to-speech combines 1.wav with 2.wav, 3.wav and 4.wav to produce 1234.wav that plays "one two three four".
1) Create a new destination sample buffer (you will need to know the sizes).
2) Read the samples (e.g. using the AudioFile and ExtAudioFile APIs) and write them in sequence to the buffer. You may want to add silence between the files.
It will help if your files are all the same bit depth (the destination bit depth - 16 should be fine) and sample rate.
Alternatively, if all files have fixed, known sample rates and bit depths, you could just save them as raw sample data and be done much faster, because you could simply append the data as-is without writing all the audio-file-reading code.
The open source project wavtools provides a good reference for this sort of work, if you're ok with perl. Otherwise there is a similar question with some java examples.
The simplest common .wav (RIFF) file format just has a 44-byte header in front of raw PCM samples. So, for these simple .wav files, you could just read the files as raw bytes, remove the 44-byte header from all but the first file, and concatenate the samples. Or play the concatenated samples directly using the Audio Queue API.
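The read-and-append approach described above can be sketched with Python's standard wave module, which handles the header bookkeeping; concat_wavs is an illustrative name, and all inputs are assumed to share channels, sample width and sample rate.

```python
import wave

def concat_wavs(inputs, output):
    """Concatenate PCM WAV files that share the same format.

    The output header is written once, so the per-file headers are
    effectively stripped; wave patches the chunk sizes on close.
    `inputs`/`output` may be paths or file-like objects.
    """
    out = None
    try:
        for src in inputs:
            with wave.open(src, "rb") as w:
                if out is None:
                    out = wave.open(output, "wb")
                    out.setparams(w.getparams())
                else:
                    assert w.getparams()[:3] == out.getparams()[:3], \
                        "all inputs must share channels/width/rate"
                out.writeframes(w.readframes(w.getnframes()))
    finally:
        if out is not None:
            out.close()
```

So for the "1234" example, you would call it with the paths to 1.wav, 2.wav, 3.wav and 4.wav in order.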
I am using AVAudioRecorder in my app in order to create a .wav file based on user voice input. I want to be able to "stuff" silence at the beginning of the audio file for some amount of time before the actual recording starts.
How can I do this using AVAudioRecorder? Can I specify a duration for which I want the "silence" to be recorded?
Thanks.
I haven't worked with the AVAudioRecorder in particular. But silence in signed PCM audio is just zero-valued samples. You could set the encoding of the AVAudioRecorder to PCM, store the file and then edit it, prepending the desired number of zero samples at the beginning. E.g. at 44100 Hz, 16-bit mono, you'd add 88200 zero bytes (44100 samples) to the beginning of the data for each second of silence you would like to have. (Note that 8-bit WAV samples are unsigned, so there silence is the byte value 0x80, not 0x00.) Hope this helps as an idea.
Note: keep the file header of the PCM (WAV) file intact, edit only the "data" chunk, and remember to update the RIFF and data chunk sizes to account for the added samples.
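A sketch of the zero-prepending idea using Python's standard wave module, which rewrites the RIFF/data chunk sizes on close so no manual header editing is needed; signed 16-bit PCM is assumed, and prepend_silence is an illustrative name.

```python
import wave

def prepend_silence(src, dst, seconds):
    """Copy a PCM WAV, inserting `seconds` of silence at the start.

    For signed 16-bit PCM, silence is zero-valued samples; the wave
    module patches the chunk sizes on close, so the header stays
    consistent. `src`/`dst` may be paths or file-like objects.
    """
    with wave.open(src, "rb") as r:
        params = r.getparams()
        audio = r.readframes(r.getnframes())
    nbytes = int(params.framerate * seconds) * params.sampwidth * params.nchannels
    with wave.open(dst, "wb") as w:
        w.setparams(params)
        w.writeframes(b"\x00" * nbytes + audio)
```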
I'm sure there's probably a better way to do it, but how about simply pre-recording the silent stream you want and then concatenating that file with the file of user-generated content before saving out the final version?