I have read MP4Box Doc about Mpeg-Dash, but I don't clearly understand about "MP4Box -dash 10000 -frag 2000 largeFile.mp4" and MP4Box -dash 10000 -frag 1000 largeFile.mp4. When I open the *.mpd File I found the duration of SegmentList is 10023(about 10 sec). If the -frag 2000 or 1000 is no used?
I'm designing a HTML5 Video Player(like this sample), and I using MP4Box tool to create DASH Video.
But I don't clearly understand what's the difference when I convert my video with -frag 2000 and 1000. For example: I don't the mean about my video with 10 second segments and 1 second fragments. maybe My Video Player do not need to set this option?
GPAC contributor here. It is difficult to help you without a full example. I stronlgy recommend to describe bugs on our bug tracker (https://github.com/gpac/gpac/issues).
When I open the *.mpd File I found the duration of SegmentList is 10023(about 10 sec). If the -frag 2000 or 1000 is no used?
Three points:
1) You probably get 10023 ms (instead of 10000 ms) because you may use an old version of MP4Box. Please consider using the latest version.
2) Fragments are an MP4 feature and is not seen at the MPEG-DASH level. Segments is also an MP4 feature (basically a segment contains fragments) that is seen by MPEG-DASH. Therefore you can't see it in the MP4 but it may have consequences on your playback.
3) The blog article you mention (http://gpac.io/2011/02/02/mp4box-fragmentation-segmentation-splitting-and-interleaving/ contains all the information you need. If you think we can improve it, please leave a message there. Thanks!
Related
I have a simple script that uses music21 to process the notes in a midi file:
import music21
score = music21.converter.parse('www.vgmusic.com/music/console/nintendo/nes/zanac1a.mid')
for i in score.flat.notes:
print(i.offset, i.quarterLength, i.pitch.midi)
Is there a way to also obtain a note's voicing / midi program using a flat score? Any pointers would be appreciated!
MIDI channels and programs are stored on Instrument instances, so use getContextByClass(instrument.Instrument) to find the closest Instrument in the stream, and then access its .midiProgram.
Be careful:
.midiChannel and .midiProgram are 0-indexed, so MIDI channel 10 will be 9 in music21, etc., (we're discussing changing this behavior in the next release)
Some information might be missing if you're not running the bleeding edge version (we merged a patch yesterday on this topic), so I advise pulling from git: pip install git+https://github.com/cuthbertLab/music21
.flat is going to kill you, though, if the file is multitrack. If you follow my advice you'll just get the last instrument on every track. 90% of the time people doing .flat actually want .recurse().
I'm attempting to stream a H.264 video feed to a web browser. Media Foundation is used for encoding a fragmented MPEG4 stream (MFCreateFMPEG4MediaSink with MFTranscodeContainerType_FMPEG4, MF_LOW_LATENCY and MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS enabled). The stream is then connected to a web server through IMFByteStream.
Streaming of the H.264 video works fine when it's being consumed by a <video src=".."/> tag. However, the resulting latency is ~2sec, which is too much for the application in question. My suspicion is that client-side buffering causes most of the latency. Therefore, I'm experimenting with Media Source Extensions (MSE) for programmatic control over the in-browser streaming. Chrome does, however, fail with the following error when consuming the same MPEG4 stream through MSE:
Failure parsing MP4: TFHD base-data-offset not allowed by MSE. See
https://www.w3.org/TR/mse-byte-stream-format-isobmff/#movie-fragment-relative-addressing
mp4dump of a moof/mdat fragment in the MPEG4 stream. This clearly shows that the TFHD contains an "illegal" base data offset parameter:
[moof] size=8+200
[mfhd] size=12+4
sequence number = 3
[traf] size=8+176
[tfhd] size=12+16, flags=1
track ID = 1
base data offset = 36690
[trun] size=12+136, version=1, flags=f01
sample count = 8
data offset = 0
[mdat] size=8+1624
I'm using Chrome 65.0.3325.181 (Official Build) (32-bit), running on Win10 version 1709 (16299.309).
Is there any way of generating a MSE-compatible H.264/MPEG4 video stream using Media Foundation?
Status Update:
Based on roman-r advise, I managed to fix the problem myself by intercepting the generated MPEG4 stream and perform the following modifications:
Modify Track Fragment Header Box (tfhd):
remove base_data_offset parameter (reduces stream size by 8bytes)
set default-base-is-moof flag
Add missing Track Fragment Decode Time (tfdt) (increases stream size by 20bytes)
set baseMediaDecodeTime parameter
Modify Track fragment Run box (trun):
adjust data_offset parameter
The field descriptions are documented in https://www.iso.org/standard/68960.html (free download).
Switching to MSE-based video streaming reduced the latency from ~2.0 to 0.7 sec. The latency was furthermore reduced to 0-1 frames by calling IMFSinkWriter::NotifyEndOfSegment after each IMFSinkWriter::WriteSample call.
There's a sample implementation available on https://github.com/forderud/AppWebStream
I was getting the same error (Failure parsing MP4: TFHD base-data-offset not allowed by MSE) when trying to play a fmp4 via MSE. The fmp4 had been created from a mp4 using the following ffmpeg comand:
ffmpeg -i myvideo.mp4 -g 52 -vcodec copy -f mp4 -movflags frag_keyframe+empty_moov myfmp4video.mp4
Based on this question I was able to find out that to have the fmp4 working in Chrome I had to add the "default_base_moof" flag. So, after creating the fmp4 with the following command:
ffmpeg -i myvideo.mp4 -g 52 -vcodec copy -f mp4 -movflags frag_keyframe+empty_moov+default_base_moof myfmp4video.mp4
I was able to play successfully the video using Media Source Extensions.
This Mozilla article helped to find out that missing flag:
https://developer.mozilla.org/en-US/docs/Web/API/Media_Source_Extensions_API/Transcoding_assets_for_MSE
The mentioned 0.7 sec latency (in your Status Update) is caused by the Media Foundation's MFTranscodeContainerType_FMPEG4 containterizer which gathers and outputs each roughly 1/3 seconds (from unknown reason) of frames in one MP4 moof/mdat box pair. This means that you need to wait 19 frames before getting any output from MFTranscodeContainerType_FMPEG4 at 60 FPS.
To output single MP4 moof/mdat per each frame, simply lie that MF_MT_FRAME_RATE is 1 FPS (or anything higher than 1/3 sec). To play the video at the correct speed, use Media Source Extensions' <video>.playbackRate or rather update timescale (i.e. multiply by real FPS) of mvhd and mdhd boxes in your MP4 stream interceptor to get the correctly timed MP4 stream.
Doing that, the latency can be squeezed to under 20 ms. This is barely recognizable when you see the output side by side on localhost in chains such as Unity (research) -> NvEnc -> MFTranscodeContainerType_FMPEG4 -> WebSocket -> Chrome Media Source Extensions display.
Note that MFTranscodeContainerType_FMPEG4 still introduces 1 frame delay (1st frame in, no output, 2nd frame in, 1st frame out, ...), hence the 20 ms latency at 60 FPS. The only solution to that seems to be writing own FMPEG4 containerizer. But that is order of magnitude more complex than intercepting of Media Foundation's MP4 streams.
The problem was solved by following roman-r's advise, and modifying the generated MPEG4 stream. See answer above.
Another way to do this is again using the same code #Fredrik mentioned but I write my own IMFByteStream and and I check the chunks written to the IMFByteStream.
FFMpeg writes the atoms almost once at a time. So you can check the atom name and do the mods. It is the same thing. I wish there was an MSE compliant windows sinker.
Is there one that can generate .ts files for HLS?
I'm trying to use GDCL MP4 Muxer with my RTSP Source Filter. They work fine together except after stopping the graph, muxer doesn't finilize the file and write the reqiured tables to the end of file via file writer (some parts are written starting from moov but not the time table values). When I try another RTSP source filter (which I don't have its source codes), table values are created with GDCL MP4 Muxer.
But when I try Elecard's MP4 Muxer, it works fine with my RTSP Source Filter. So, there is an incompatibility. I examined GDCL's source codes but couldn't find what it was expecting from me. I already calculate and set timestamp values to samples using SetTime method. But GDCL still doesn't finilaze file. Is it caused by missing information or missing signal when graph stops? What can be the problem, any ideas?
One thing you should be aware of regarding Geraint's MP4 Mux is that it is checking incoming media samples to have both start and stop time. You might be having only .tStart/AM_SAMPLE_TIMEVALID which still makes sense for video, but this would be a problem.
So the samples have to have stop time, or you need to fix this in multiplexer code.
A typical symptom for the problem is that generated files are empty or of zero duration.
I am trying to encode video in h.264 that when split with Apples HTTP Live Streaming tools media file segmenter will pass the media file validator I am getting two errors on the split MPEG-TS file
WARNING: Media segment contains a video track but does not contain any IDR access unit with a SPS and a PPS.
WARNING: 7 samples (17.073 %) do not have timestamps in track 257 (avc1).
After hours of research I think the "IDR" warning relates to not having keyframes in the right place on the segmented MPEG-TS file so in my ffmpeg command I set -keyint_min 1 to ensure keyframes where at every frame, but this didn't work.
Although it would be great to get an answer, if anyone can shed any light on what a "IDR access unit with a SPS and a PPS" is or what the timestamps warning means I would be very grateful, thanks.
Fix can be found on this thread https://devforums.apple.com/thread/45830?tstart=15
I have a 2giga mpeg file of people runnig,jogging,walking etc. in it. I will use it in a image classification project but I need to segmentate the video depending on per person an per action.
for example;
there are 25 people in video which repeat these actions in order
1st person
-runs
-walks
2nd person
-runs
-walks
and goes on....
and what I want is to have 2 different mpeg file for each person
such as;
firstperson_runs.mpeg
firstperson_waves.mpeg
so I need a tool to split big file into these files. Splitting shall be due to time.
such as;
pick t1:start of action
pick t2:end of action
create a new video from big file for the interval t1 and t2
of course I will select time intervals for each video.
OS:Winxp pro
if it can be done by matlab ,can you describe it?
any help???
I imagine there are a number of tools available to do this without MATLAB, but if you really want to use MATLAB I would check out these submissions on The MathWorks File Exchange:
Gerald Dalley's videoIO Toolbox for Matlab
Micah Richert's mmread
David Foti's mpgread and mpgwrite
EDIT:
As mentioned by M456, you can also use the built-in function MMREADER for creating a multimedia reader object for your movie file (and subsequently reading selected movie frames from it with the READ method). However, I don't know which version of MATLAB this function was introduced in. It is in versions 7.7 and 7.8 (R2008b and R2009a, respectively), but it is not in version 7.1.
Matlab can do such video split operations. There are two built in functions (aviread and mmreader) for reading video files. Both will create objects which contain the individual frames of the video. You can save these as separate frames or make a new video out of by using avifile.