I am a complete newbie to video encoding. I am trying to encode a series of .dpx files into a single encoded video output file in any of the following formats: .mp4, .avi, .h264, .mkv, etc.
I have tried two different approaches. The first one works and the second one does not.
I would like to know the difference between the two. Any help / input would be much appreciated.
1) Using ffmpeg with the libx264 library. This works well; I am able to produce the desired output:
ffmpeg -start_number 0 -i frame%4d.dpx -pix_fmt yuv420p -c:v libx264 -crf 28
-profile:v baseline fromdpx.h264
2) I first try to concatenate all the DPX files into a single file using the concat protocol in ffmpeg, and then use x264 to encode the concatenated file.
Here I see that the size of the concatenated file is the sum of the sizes of all the files concatenated. But when I use the x264 command to encode the concatenated file, I get a green screen (basically not the desired output).
ffmpeg -i "concat:frame0.dpx|frame01.dpx|frame2.dpx etc" -c copy output.dpx
then
x264 --crf 28 --profile baseline -o encoded.mp4 --input-res 1920x1080 --demuxer raw
output.dpx
I also tried to encode the concatenated file using ffmpeg as follows:
ffmpeg -i output.dpx -pix_fmt yuv420p -c:v libx264 -crf 28 -profile:v baseline fromdpx.h264
This also gives me a blank video.
Could someone please point out to me what is going on here? Why does the first method work and the second does not?
Thank you.
In the second approach you feed the DPX file to x264 as raw input (--demuxer raw), but DPX is not a raw format: it is more of a container, with headers, metadata, and optional RLE compression, so it needs to be decoded first. x264 supports only these raw formats (--input-csp): i420, yv12, nv12, i422, yv16, nv16, i444, yv24, bgr, bgra, rgb. All of these formats can have 8 to 16 bits per component (--input-depth).
Also, I doubt the DPX format supports concatenation at all, because it is an image format, not a video format, so your result after concat is probably already broken.
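If you want to keep x264 as a separate step, the usual approach is to let ffmpeg decode the DPX frames and pipe them to x264 in a format its demuxers understand. A rough sketch, assuming the same frame naming as above:
ffmpeg -start_number 0 -i frame%4d.dpx -pix_fmt yuv420p -f yuv4mpegpipe - | x264 --demuxer y4m --crf 28 --profile baseline -o encoded.mp4 -
Here ffmpeg decodes each DPX image and writes a yuv4mpeg stream to stdout, and x264 reads it from stdin with its y4m demuxer, so no guessing with --input-res or --demuxer raw is needed.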
Related
I am having trouble showing ffmpeg's progress from my script. I compile my script to an exe with ps2exe, and ffmpeg writes its output to standard error instead of standard out,
so I used the pipe:1 option.
My script.ps1 is now:
# $nb_of_frames= #some_int
& $ffmpeg_path -progress pipe:1 -i input.mp4 -c:v libx264 -pix_fmt yuv420p -crf 25 -preset fast -an output.mp4
Then I compile it with ps2exe. (To reproduce this you don't need to compile; just run the above command with pipe:1 directly in cmd or PowerShell and you will get the same behavior.)
Normally ffmpeg gives you interactive progress reporting: a single line containing the information that keeps getting updated in place, without spamming the console with hundreds of lines. It looks like this:
frame= 7468 fps=115 q=22.0 size= 40704kB time=00:05:10.91 bitrate=1072.5kbits/s speed= 4.8x
But this does not appear in the compiled version of my script, so after some digging I added -progress pipe:1 to make the progress appear on standard out.
Now I get a continuous output every second that looks like this:
frame=778
fps=310.36
stream_0_0_q=16.0
bitrate= 855.4kbits/s
total_size=3407872
progress=continue
...
frame=1092
fps=311.04
stream_0_0_q=19.0
bitrate= 699.5kbits/s
total_size=3932160
progress=continue
I would like to print some sort of updatable percentage out of this. I can compute a percentage easily once I capture that frame number, but I don't know how to capture real-time output like this, nor how to make my progress reporting update a single line of percentage in real time (or some progress bar via symbols) instead of spamming many lines.
(Or, if there is a way to make the default progress of ffmpeg appear in the compiled version of my script, that would work too.)
Edit: a suggestion based on the answer below:
#use the following lines instead of write-progress if using with ps2exe
#$a=($frame * 100 / $maxFrames)
#$str = "#"*$a
#$str2 = "-"*(100-$a)
#Write-Host -NoNewLine "`r$a% complete | $str $str2|"
Thanks
Here is an example of how to capture the current frame number from the ffmpeg output, calculate a percentage, and pass it to Write-Progress:
$maxFrames = 12345
& $ffmpeg_path -progress pipe:1 -i input.mp4 -c:v libx264 -pix_fmt yuv420p -crf 25 -preset fast -an output.mp4 |
  Select-String 'frame=(\d+)' | ForEach-Object {
      $frame = [int] $_.Matches.Groups[1].Value
      Write-Progress -Activity 'ffmpeg' -Status 'Converting' -PercentComplete ($frame * 100 / $maxFrames)
  }
Remarks:
The Select-String parameter is a regular expression that captures the frame number with the group (\d+) (where \d means a digit and + requires at least one digit). See this Regex101 demo.
ForEach-Object runs the given script block for each match from Select-String. Here $_.Matches.Groups[1].Value extracts the value matched by the first regex group; we then convert it to an integer so it can be used in calculations.
Finally, we calculate the percentage and pass it to Write-Progress.
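If you don't know the total frame count up front, one option (an addition on my part, assuming ffprobe ships alongside your ffmpeg) is to ask ffprobe for it before starting the encode:
# count the video packets in the source to get the total frame count
$maxFrames = [int] (& $ffprobe_path -v error -select_streams v:0 -count_packets -show_entries stream=nb_read_packets -of csv=p=0 input.mp4)
This assumes an $ffprobe_path variable defined like your $ffmpeg_path; counting packets is usually much faster than decoding the whole file with -count_frames.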
I'm trying to do automatic chapter detection with ffmpeg's blackdetect filter.
When I use blackdetect I get a result, but what is the result? It's not frames? Also, is it possible to write a script/bat file (for Windows 10, PowerShell or cmd) to convert the result to an "mkv XML file" so it can be imported with MKVToolNix?
ffmpeg -i "movie.mp4" -vf blackdetect=d=0.232:pix_th=0.1 -an -f null - 2>&1 | findstr black_duration > output.txt
result:
black_start:2457.04 black_end:2460.04 black_duration:3
black_start:3149.46 black_end:3152.88 black_duration:3.41667
black_start:3265.62 black_end:3268.83 black_duration:3.20833
black_start:3381.42 black_end:3381.92 black_duration:0.5
black_start:3386.88 black_end:3387.38 black_duration:0.5
black_start:3390.83 black_end:3391.33 black_duration:0.5
black_start:3824.29 black_end:3824.58 black_duration:0.291667
black_start:3832.71 black_end:3833.08 black_duration:0.375
black_start:3916.29 black_end:3920.29 black_duration:4
Please see the documentation for this filter here. Specifically this line:
Output lines contains the time for the start, end and duration of the detected black interval expressed in seconds.
If you read further down the page you will see another filter, blackframe, which does a similar thing but outputs frame numbers rather than seconds.
Since the MKVToolNix XML format has a chapter definition facility, you can create a script that takes the ffmpeg output and dumps it into the correct format in any language of your choice; a rough sketch follows.
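For example, here is a rough PowerShell sketch (untested; the chapter names and the choice of black_end as each chapter's start are my assumptions) that turns the output.txt produced above into a chapter XML file MKVToolNix can import:
# parse black_end timestamps from output.txt and emit MKVToolNix chapter XML
$i = 1
$atoms = foreach ($line in Get-Content output.txt) {
    if ($line -match 'black_end:([\d\.]+)') {
        # convert seconds to the HH:MM:SS.mmm form the XML expects
        $start = [TimeSpan]::FromSeconds([double]$Matches[1]).ToString('hh\:mm\:ss\.fff')
        "    <ChapterAtom><ChapterTimeStart>$start</ChapterTimeStart><ChapterDisplay><ChapterString>Chapter $i</ChapterString><ChapterLanguage>eng</ChapterLanguage></ChapterDisplay></ChapterAtom>"
        $i++
    }
}
@"
<?xml version="1.0"?>
<!DOCTYPE Chapters SYSTEM "matroskachapters.dtd">
<Chapters>
  <EditionEntry>
$($atoms -join "`n")
  </EditionEntry>
</Chapters>
"@ | Set-Content chapters.xml
You can then import chapters.xml in the MKVToolNix GUI or pass it on the command line, e.g. mkvmerge -o movie.mkv --chapters chapters.xml movie.mp4. Whether a chapter should start at black_end, black_start, or somewhere in between is a judgment call.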
With FFmpeg how can I use AV1 codec in a webm container?
I get the error:
Only VP8 or VP9 video and Vorbis or Opus audio and WebVTT subtitles are supported for WebM.
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Error initializing output stream 0:0 --
However Wikipedia says WebM supports AV1.
https://en.wikipedia.org/wiki/AV1
AV1 is intended to be able to be used together with the audio format Opus in a future version of the WebM container format for HTML5 web video
Or can FFmpeg simply not encode this new version?
My settings:
ffmpeg -y
-i "C:\Users\Matt\video.mp4"
-c:v libaom-av1 -strict experimental
-cpu-used 1 -crf 28
-pix_fmt yuv420p
-map 0:v:0? -map_chapters -1
-sn
-c:a libopus
-map 0:a:0?
-map_metadata 0
-f webm
-threads 0
"C:\Users\Matt\video.webm"
ffmpeg currently doesn't support muxing AV1 in WebM. The error you're getting comes from this code:
if (mkv->mode == MODE_WEBM && !(par->codec_id == AV_CODEC_ID_VP8 ||
                                par->codec_id == AV_CODEC_ID_VP9 ||
                                par->codec_id == AV_CODEC_ID_OPUS ||
                                par->codec_id == AV_CODEC_ID_VORBIS ||
                                par->codec_id == AV_CODEC_ID_WEBVTT)) {
    av_log(s, AV_LOG_ERROR,
           "Only VP8 or VP9 video and Vorbis or Opus audio and WebVTT subtitles are supported for WebM.\n");
    return AVERROR(EINVAL);
}
Note the lack of AV_CODEC_ID_AV1 in the expression.
This isn't too surprising, though. AV1 in Matroska (and therefore WebM) hasn't been finalized yet. If you want to follow progress on AV1 in Matroska (and WebM), follow the discussion here on the IETF CELLAR mailing list.
Update: FFmpeg does support AV1 in WebM now!
if (!native_id) {
    av_log(s, AV_LOG_ERROR,
           "Only VP8 or VP9 or AV1 video and Vorbis or Opus audio and WebVTT subtitles are supported for WebM.\n");
    return AVERROR(EINVAL);
}
Source code here.
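With a build that includes that change, a minimal sketch along these lines should work (-b:v 0 puts libaom-av1 into constant-quality mode so that -crf behaves as expected, and recent builds no longer require -strict experimental):
ffmpeg -i video.mp4 -c:v libaom-av1 -crf 28 -b:v 0 -c:a libopus output.webm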
Wondering if anyone has any insight about the H.264 byte stream:
The ffmpeg command line is:
fmpeg\" -s 320x240 -f avfoundation -r 30.00 -i \"0:none\" -c:v libx264 -preset ultrafast -tune zerolatency -x264opts crf=20:vbv-maxrate=3000:vbv-bufsize=100:intra-refresh=1:slice-max-size=1500:keyint=30:ref=1 -b:v 1000 -an -f mpegts -threads 8 -profile:v baseline -level 3.0 -pix_fmt yuv420p udp://127.0.0.1:5564"
In theory, the elementary stream in h.264 should be like this: (view image)
So the key is to extract individual NALUs from the H.264 stream, which means we should get a bitstream like this: (view image).
To get the real NALU type we mask the header byte: 0x1F & (NALU header byte). So 0x27 yields the same type as 0x67 (see the short check after the list below).
Normally we should only see these NALU types (after the 0x1F & NALU-type operation):
1: slice of a non-IDR picture. (P frame)
5: slice of an IDR picture. (I frame)
6: Supplemental enhancement information. (SEI)
7: Sequence parameter set. (SPS parameter)
8: Picture parameter set. (PPS parameter)
9: Access unit delimiter.
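The masking itself can be verified with a toy snippet (PowerShell here purely for illustration; any language works the same way):
# the low 5 bits of the NALU header byte are nal_unit_type
0x27 -band 0x1F   # 7 (SPS, nal_ref_idc = 1)
0x67 -band 0x1F   # 7 (SPS, nal_ref_idc = 3)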
But what I get from UDP in the first packet is like this:
(source: artsmesh.io)
In this UDP datagram something doesn't make sense: after the 0x00000001 start code header, the NALU type byte is 0xff, and the second one is 0xf0, both of which are undefined in H.264.
So I'm having trouble finding out why the H.264 stream is not working.
Also, is it true that the start code header is always either four bytes (0x00000001) or three bytes (0x000001) within the same UDP packets (or the same streaming session)?
This is not a raw H.264 stream. It is a transport stream; some of the 0x000001 sequences come from PES headers and are not part of the AVC payload. https://en.wikipedia.org/wiki/MPEG_transport_stream
Also, 3-byte and 4-byte start codes can be mixed in the same elementary stream. The reason is covered in my answer Possible Locations for Sequence/Picture Parameter Set(s) for H.264 Stream.
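If you want ffmpeg to emit a raw Annex B elementary stream instead, so that every start code really does belong to a NALU, one option (keeping the rest of your command as-is) is to swap the muxer:
ffmpeg ... -f h264 udp://127.0.0.1:5564
-f h264 selects ffmpeg's raw H.264 muxer, so no PES/TS framing bytes will be mixed into the payload.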
I have Twitter data (usernames and their tweets) which I am trying to cluster. The text file is 151.7 MB in size.
I converted the raw txt text data to a mahout sequence file.
I inspected this sequence file, it's full of data. It's also 151.7 MB.
I tried to convert the sequence file to sparse vectors.
At this point something has clearly gone wrong. It claims success, but it only creates vector files that are a few bytes in size. My TFIDF vector file is only 90 bytes, which is obviously wrong when the original txt file and the sequence file are both 151 MB.
What confuses me most is that I can't see what's so different between the data I have and the reuters dataset which is used in the clustering example in 'Mahout in Action'. They're both just text.
Here are the exact commands I used:
--- Turned the raw text txt file into a Mahout sequence file. I've also checked the sequence file using seqdumper; it's full of username/tweet data. ---
sudo /opt/mahout/bin/mahout seqdirectory -c UTF-8 -i /home/efx/Desktop/tweetQueryOutput.txt -o /home/efx/Desktop/allNYCdataseqfiles
(Inspect the sequence file, it's full of username/tweet data)
sudo /opt/mahout/bin/mahout seqdumper -i /home/efx/Desktop/allNYCdataseqfiles/chunk-0 -o /home/efx/Desktop/allNYCdataseqfiles/sequenceDumperOutput
--- Then tried to convert the sequence file to sparse vectors. ---
sudo /opt/mahout/bin/mahout seq2sparse -o /home/efx/Desktop/allNYC_DataVectors -i /home/efx/Desktop/allNYCdataseqfiles/ -seq
Under Mahout 0.8+cdh5.0.2, you have to do the following :
sudo /opt/mahout/bin/mahout seq2sparse \
    -o /home/efx/Desktop/allNYC_DataVectors \
    -i /home/efx/Desktop/allNYCdataseqfiles/ \
    -seq \
    --maxDFPercent 100
The --maxDFPercent option represents the maximum percentage of documents for the DF. It can be used to remove really high frequency terms. By default the value is 99, but if you also use --maxDFSigma, that will override this value.
This works fine for me, but I'm not sure about the 0.7 version of Mahout.
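As a quick sanity check (assuming the default seq2sparse output layout), you can dump the generated vectors and confirm they are non-empty:
sudo /opt/mahout/bin/mahout seqdumper -i /home/efx/Desktop/allNYC_DataVectors/tfidf-vectors/part-r-00000
With --maxDFPercent 100 you should see one sparse vector per tweet document instead of a near-empty file.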