Edit: In the previous version of this question I was using a very old ffmpeg API; I now use the newest libraries. The problem has only changed slightly, from "Main" to "High".
I am using the ffmpeg C API to create an MP4 video in C++.
I want the resulting video to have the "Constrained Baseline" profile so that it plays on as many platforms as possible, especially mobile, but I get the "High" profile every time, even though I hard-coded the codec profile to FF_PROFILE_H264_CONSTRAINED_BASELINE. As a result, the video does not play on all of our testing platforms.
This is what "ffprobe video.mp4 -show_streams" reports about my video streams:
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
creation_time : 1970-01-01 00:00:00
encoder : Lavf53.5.0
Duration: 00:00:13.20, start: 0.000000, bitrate: 553 kb/s
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 320x180, 424 kb/s, 15 fps, 15 tbr, 15 tbn, 30 tbc
Metadata:
creation_time : 1970-01-01 00:00:00
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, s16, 12 kb/s
Metadata:
creation_time : 1970-01-01 00:00:00
handler_name : SoundHandler
-------VIDEO STREAM--------
[STREAM]
index=0
codec_name=h264
codec_long_name=H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
profile=High <-- This should be "Constrained Baseline"
codec_type=video
codec_time_base=1/30
codec_tag_string=avc1
codec_tag=0x31637661
width=320
height=180
has_b_frames=0
sample_aspect_ratio=N/A
display_aspect_ratio=N/A
pix_fmt=yuv420p
level=30
timecode=N/A
is_avc=1
nal_length_size=4
id=N/A
r_frame_rate=15/1
avg_frame_rate=15/1
time_base=1/15
start_time=0.000000
duration=13.200000
bit_rate=424252
nb_frames=198
nb_read_frames=N/A
nb_read_packets=N/A
TAG:creation_time=1970-01-01 00:00:00
TAG:language=und
TAG:handler_name=VideoHandler
[/STREAM]
-------AUDIO STREAM--------
[STREAM]
index=1
codec_name=aac
codec_long_name=Advanced Audio Coding
profile=unknown
codec_type=audio
codec_time_base=1/44100
codec_tag_string=mp4a
codec_tag=0x6134706d
sample_fmt=s16
sample_rate=44100
channels=2
bits_per_sample=0
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/44100
start_time=0.000000
duration=13.165714
bit_rate=125301
nb_frames=567
nb_read_frames=N/A
nb_read_packets=N/A
TAG:creation_time=1970-01-01 00:00:00
TAG:language=und
TAG:handler_name=SoundHandler
[/STREAM]
This is the function I use to add a video stream. All the values that come from ptr-> are set from outside; do any of them have to be specific values to get the correct profile?
static AVStream *add_video_stream( Cffmpeg_dll * ptr, AVFormatContext *oc, enum CodecID codec_id )
{
    AVCodecContext *c;
    AVStream *st;
    AVCodec* codec;

    // Get correct codec
    codec = avcodec_find_encoder(codec_id);
    if (!codec) {
        av_log(NULL, AV_LOG_ERROR, "%s","Video codec not found\n");
        exit(1);
    }

    // Create stream
    st = avformat_new_stream(oc, codec);
    if (!st) {
        av_log(NULL, AV_LOG_ERROR, "%s","Could not alloc stream\n");
        exit(1);
    }

    c = st->codec;

    /* Get default values */
    codec = avcodec_find_encoder(codec_id);
    if (!codec) {
        av_log(NULL, AV_LOG_ERROR, "%s","Video codec not found (default values)\n");
        exit(1);
    }
    avcodec_get_context_defaults3(c, codec);

    c->codec_id = codec_id;
    c->codec_type = AVMEDIA_TYPE_VIDEO;
    c->bit_rate = ptr->video_bit_rate;
    av_log(NULL, AV_LOG_ERROR, " Bit rate: %i", c->bit_rate);
    c->qmin = ptr->qmin;
    c->qmax = ptr->qmax;
    c->me_method = ptr->me_method;
    c->me_subpel_quality = ptr->me_subpel_quality;
    c->i_quant_factor = ptr->i_quant_factor;
    c->qcompress = ptr->qcompress;
    c->max_qdiff = ptr->max_qdiff;

    // We need to set the level and profile to get videos that play (hopefully) on all platforms
    c->level = 30;
    c->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;

    c->width = ptr->dstWidth;
    c->height = ptr->dstHeight;
    c->time_base.den = ptr->fps;
    c->time_base.num = 1;
    c->gop_size = ptr->fps;
    c->pix_fmt = STREAM_PIX_FMT;
    c->max_b_frames = 0;

    // some formats want stream headers to be separate
    if (oc->oformat->flags & AVFMT_GLOBALHEADER)
        c->flags |= CODEC_FLAG_GLOBAL_HEADER;

    return st;
}
Additional info:
As a reference video, I use the gizmo.mp4 that Mozilla serves as an example that plays on every platform/browser. It definitely has the "Constrained Baseline" profile, and definitely works on all our testing smartphones. You can download it here. Our self-created video doesn't work on all platforms and I'm convinced this is because of the profile.
I am also using qt-faststart.exe to move the headers to the start of the file after creating the mp4, as this cannot be done in a good way in C++ directly. Could that be the problem?
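(Aside: the mov/mp4 muxer reportedly has its own "movflags"/"faststart" option that performs the same relocation of the moov atom when the file is finalized. Whether it is available in the library version used here is an assumption; a minimal sketch of passing it when writing the header would be:)

// Hedged sketch: ask the mov/mp4 muxer to move the moov atom to the front
// of the file when it is finalized, instead of running qt-faststart afterwards.
// Availability of the "faststart" flag depends on the libavformat version.
AVDictionary *opts = NULL;
av_dict_set(&opts, "movflags", "faststart", 0);
if (avformat_write_header(oc, &opts) < 0) {
    av_log(NULL, AV_LOG_ERROR, "%s", "Could not write header\n");
    exit(1);
}
av_dict_free(&opts);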
Obviously, I am doing something wrong, but I don't know what it could be. I'd be thankful for every hint ;)
I have the solution. After spending some time in the ffmpeg bug tracker, discussing, and browsing for profile-setting examples, I finally figured it out.
One needs to use av_opt_set(codecContext->priv_data, "profile", "baseline", AV_OPT_SEARCH_CHILDREN) (or any other desired profile).
So in my case that would be:
Wrong:
// We need to set the level and profile to get videos that play (hopefully) on all platforms
c->level = 30;
c->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;
Correct:
// Set profile to baseline
av_opt_set(c->priv_data, "profile", "baseline", AV_OPT_SEARCH_CHILDREN);
Completely unintuitive and contrary to the rest of the API usage, but that's ffmpeg philosophy. You don't need to understand it, you just need to understand how to use it ;)
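For context, here is a minimal sketch of how that call slots into the stream setup from the question. The field names (ptr->..., STREAM_PIX_FMT) are the same placeholders used above; error handling and the remaining settings are omitted:

// Minimal sketch: set the encoder's private "profile" option instead of
// c->profile, then open the encoder.
c->codec_id   = codec_id;
c->codec_type = AVMEDIA_TYPE_VIDEO;
c->width      = ptr->dstWidth;
c->height     = ptr->dstHeight;
c->pix_fmt    = STREAM_PIX_FMT;
c->time_base.num = 1;
c->time_base.den = ptr->fps;

// Ask libx264 for the baseline profile via its private options.
av_opt_set(c->priv_data, "profile", "baseline", AV_OPT_SEARCH_CHILDREN);

if (avcodec_open2(c, codec, NULL) < 0) {
    av_log(NULL, AV_LOG_ERROR, "%s", "Could not open video codec\n");
    exit(1);
}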
Related
I've configured AVAudioSinkNode attached to AVAudioEngine's inputNode like so:
let sinkNode = AVAudioSinkNode() { (timestamp, frames, audioBufferList) -> OSStatus in
    print("SINK: \(timestamp.pointee.mHostTime) - \(frames) - \(audioBufferList.pointee.mNumberBuffers)")
    return noErr
}

audioEngine.attach(sinkNode)
audioEngine.connect(audioEngine.inputNode, to: sinkNode, format: nil)
audioEngine.prepare()

do {
    try audioEngine.start()
    print("AudioEngine started.")
} catch {
    print("AudioEngine did not start!")
}
I've separately configured it to use the "Built-in Microphone" device (which I am sure it does use).
If I set the mic's sample rate to 44100 (using the "Audio MIDI Setup" app provided by Apple on all Macs), everything works as expected:
AudioEngine started.
SINK: 692312319180567 - 512 - 2
SINK: 692312348024104 - 512 - 2
SINK: 692312359634082 - 512 - 2
SINK: 692312371244059 - 512 - 2
SINK: 692312382854036 - 512 - 2
...
However, if I use the "Audio MIDI Setup" app and change the mic's sample rate to anything other than 44100 (say 48000), then the sink node doesn't seem to do anything (it doesn't print anything).
Originally I was trying to modify the mic's sample rate programmatically, but I later discovered that the same thing happens when I simply change the device's sample rate via the standard "Audio MIDI Setup" app, so the code I use for setting the sample rate isn't relevant here.
Does anyone know if AVAudioSinkNode has an allowed sample rate hard-coded into it?
I cannot find any other explanation...
I've been toying around with AVAudioSinkNodes and it doesn't appear to me to be restricted to a 44100 sampling rate.
In my case, when I check the sampling rate of my input and sink nodes after attaching them, I get the following:
Input node sample rates:
IF <AVAudioFormat 0x600002527700: 2 ch, 48000 Hz, Float32, non-inter>
OF <AVAudioFormat 0x60000250fac0: 2 ch, 48000 Hz, Float32, non-inter>
Sink node sample rates:
IF <AVAudioFormat 0x60000250fbb0: 2 ch, 44100 Hz, Float32, non-inter>
OF <AVAudioFormat 0x60000250fb60: 2 ch, 44100 Hz, Float32, non-inter>
But once I connected them together, I got the following:
Input node sample rates:
IF <AVAudioFormat 0x600002527980: 1 ch, 48000 Hz, Float32>
OF <AVAudioFormat 0x600002506760: 2 ch, 48000 Hz, Float32, non-inter>
Sink node sample rates:
IF <AVAudioFormat 0x600002506710: 2 ch, 48000 Hz, Float32, non-inter>
OF <AVAudioFormat 0x600002505db0: 2 ch, 48000 Hz, Float32, non-inter>
I'm new to working with audio frameworks, but this does seem to suggest that the sink node's sample rate isn't hardcoded.
Your connection,
audioEngine.connect(audioEngine.inputNode, to: sinkNode, format: nil)
seems to differ from mine. Rightly or wrongly, I explicitly specified the format as audioEngine.inputNode.outputFormat(forBus: 0), which led to the settings shown. Not sure if that makes a difference.
I am trying to encode audio to AAC with the FF_PROFILE_AAC_LOW profile using the following setting:
oc_cxt->profile = FF_PROFILE_AAC_LOW;
Also, in the output of av_dump_format I get this:
Metadata:
encoder : Lavf57.36.100
Stream #0:0: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 192 kb/s
But the actual output file is different: everything is OK except that the stream is detected as plain AAC, not AAC (LC). Probing it with ffprobe gives:
$ ffprobe o.m4a
...
Stream #0:0(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 195 kb/s (default)
...
AAC (LC) is the result I need.
From the command line, however, ffmpeg can generate AAC (LC) output. Below is a small test:
$ ffmpeg -f lavfi -i aevalsrc="sin(440*2*PI*t):d=5" aevalsrc.m4a
$ ffprobe aevalsrc.m4a
...
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 69 kb/s (default)
...
How can I select FF_PROFILE_AAC_LOW so that I get AAC (LC) output?
This was caused by a change in the newer ffmpeg API that I hadn't noticed.
The extradata needs to be copied back to AVStream->codecpar->extradata after avcodec_open2(). After that, ffprobe detects the output as the format I need, AAC (LC).
The following is a code snippet from ffmpeg.c
if (!ost->st->codecpar->extradata && avctx->extradata) {
    ost->st->codecpar->extradata = av_malloc(avctx->extradata_size + FF_INPUT_BUFFER_PADDING_SIZE);
    if (!ost->st->codecpar->extradata) {
        av_log(NULL, AV_LOG_ERROR, "Could not allocate extradata buffer to copy parser data.\n");
        exit_program(1);
    }
    ost->st->codecpar->extradata_size = avctx->extradata_size;
    memcpy(ost->st->codecpar->extradata, avctx->extradata, avctx->extradata_size);
}
Hopefully this is helpful to anyone using the latest version of ffmpeg (3.x).
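As a rough sketch of how the same copy might slot into one's own muxing code (the names enc_ctx, st, fmt_ctx and codec are placeholders, not from the question; current headers call the padding constant AV_INPUT_BUFFER_PADDING_SIZE, older ones FF_INPUT_BUFFER_PADDING_SIZE):

// Hedged sketch: once the encoder is opened it has generated its extradata
// (the AudioSpecificConfig for AAC). Copy it into the stream parameters
// before avformat_write_header() so the muxer can write the LC profile info.
if (avcodec_open2(enc_ctx, codec, NULL) < 0)
    exit(1);

if (!st->codecpar->extradata && enc_ctx->extradata) {
    st->codecpar->extradata = av_mallocz(enc_ctx->extradata_size + AV_INPUT_BUFFER_PADDING_SIZE);
    if (!st->codecpar->extradata)
        exit(1);
    memcpy(st->codecpar->extradata, enc_ctx->extradata, enc_ctx->extradata_size);
    st->codecpar->extradata_size = enc_ctx->extradata_size;
}

if (avformat_write_header(fmt_ctx, NULL) < 0)
    exit(1);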
The following Python code, which adds three files to a GES timeline, throws the following error that others have also had:
(GError('Your GStreamer installation is missing a plug-in.',), 'gstdecodebin2.c(3928): gst_decode_bin_expose (): /GESPipeline:gespipeline0/GESTimeline:gestimeline0/GESVideoTrack:gesvideotrack0/GnlComposition:gnlcomposition1/GnlSource:gnlsource0/GstBin:videosrcbin/GstURIDecodeBin:uridecodebin0/GstDecodeBin:decodebin4:\nno suitable plugins found')
from gi.repository import GES
from gi.repository import GstPbutils
from gi.repository import Gtk
from gi.repository import Gst
from gi.repository import GObject
import sys
import signal
VIDEOPATH = "file:///path/to/my/video/folder/"
class Timeline:
    def __init__(self, files):
        print Gst._version # prints 1
        self.pipeline = GES.Pipeline()
        container_caps = Gst.Caps.new_empty_simple("video/quicktime")
        video_caps = Gst.Caps.new_empty_simple("video/x-h264")
        audio_caps = Gst.Caps.new_empty_simple("audio/mpeg")
        self.container_profile = GstPbutils.EncodingContainerProfile.new("jane_profile", "mp4 concatation", container_caps, None) #Gst.Caps("video/mp4", None))
        self.video_profile = GstPbutils.EncodingVideoProfile.new(video_caps, None, None, 0)
        self.audio_profile = GstPbutils.EncodingAudioProfile.new(audio_caps, None, None, 0)
        self.container_profile.add_profile(self.video_profile)
        self.container_profile.add_profile(self.audio_profile)
        self.bus = self.pipeline.get_bus()
        self.bus.add_signal_watch()
        self.bus.connect("message", self.busMessageCb)
        self.timeline = GES.Timeline.new_audio_video()
        self.layer = self.timeline.append_layer()
        signal.signal(signal.SIGINT, self.handle_sigint)
        self.start_on_timeline = 0
        for file in files:
            asset = GES.UriClipAsset.request_sync(VIDEOPATH + file)
            print asset.get_duration()
            duration = asset.get_duration()
            clip = self.layer.add_asset(asset, self.start_on_timeline, 0, duration, GES.TrackType.UNKNOWN)
            self.start_on_timeline += duration
            print 'start:' + str(self.start_on_timeline)
        self.timeline.commit()
        self.pipeline.set_timeline(self.timeline)

    def handle_sigint(self, sig, frame):
        Gtk.main_quit()

    def busMessageCb(self, unused_bus, message):
        print message
        print message.type
        if message.type == Gst.MessageType.EOS:
            print "eos"
            Gtk.main_quit()
        elif message.type == Gst.MessageType.ERROR:
            error = message.parse_error()
            print (error)
            Gtk.main_quit()

if __name__=="__main__":
    GObject.threads_init()
    Gst.init(None)
    GES.init()
    gv = GES.version() # prints 1.2
    timeline = Timeline(['one.mp4', 'two.mp4', 'two.mp4'])
    done = timeline.pipeline.set_render_settings('file:///home/directory/output.mp4', timeline.container_profile)
    print 'done: {0}'.format(done)
    timeline.pipeline.set_mode(GES.PipelineFlags.RENDER)
    timeline.pipeline.set_state(Gst.State.PAUSED)
    Gtk.main()
I have set the GST_PLUGIN_PATH_1_0 environment variable to "/usr/local/lib:/usr/local/lib/gstreamer-1.0:/usr/lib/x86_64-linux-gnu:/usr/lib/i386-linux-gnu/gstreamer-1.0"
I compiled and installed gstreamer1.0-1.2.4, together with the base, good, bad and ugly packages for that version. GES is installed with version 1.2.1, as this was the nearest to the gstreamer version I found. I also installed gst-libav 1.2.4.
The decodebin2 should be in base according to the make install log for plugin-base and is linked into libgstplayback, which is part of my GST_PLUGIN_PATH_1_0:
/usr/local/lib/gstreamer-1.0 libgstplayback_la-gstdecodebin2.lo
I do have gstreamer0.10 installed, and decodebin2 shows up there as a blacklisted version when I run 'gst-inspect-1.0 -b', since it sits in the gstreamer0.10 library path rather than the one for 1.0.
I tried clearing the ~/.cache/gstreamer files and running gst-inspect-1.0 again to regenerate the plugin registry, but I still keep getting the error in the Python code. This sample code might be wrong, as it is my first stab at writing a timeline using GStreamer Editing Services. I am on Ubuntu Trusty (14.04).
The file is an MP4 file, which is why I installed gst-libav for the required libraries.
The output of MP4Box -info on the file is:
Movie Info *
Timescale 90000 - Duration 00:00:08.405
Fragmented File no - 2 track(s)
File suitable for progressive download (moov before mdat)
File Brand mp42 - version 0
Created: GMT Mon Aug 17 17:02:26 2015
File has no MPEG4 IOD/OD
Track # 1 Info - TrackID 1 - TimeScale 50000 - Duration 00:00:08.360
Media Info: Language "English" - Type "vide:avc1" - 209 samples
Visual Track layout: x=0 y=0 width=1920 height=1080
MPEG-4 Config: Visual Stream - ObjectTypeIndication 0x21
AVC/H264 Video - Visual Size 1920 x 1080
AVC Info: 1 SPS - 1 PPS - Profile Main @ Level 4.2
NAL Unit length bits: 32
Pixel Aspect Ratio 1:1 - Indicated track size 1920 x 1080
Self-synchronized
Track # 2 Info - TrackID 2 - TimeScale 48000 - Duration 00:00:08.405
Media Info: Language "English" - Type "soun:mp4a" - 394 samples
MPEG-4 Config: Audio Stream - ObjectTypeIndication 0x40
MPEG-4 Audio MPEG-4 Audio AAC LC - 2 Channel(s) - SampleRate 48000 Synchronized on stream 1
A log is at pastebin.com/BjJ8Z5Bd from when I run 'GST_DEBUG=3,gnl*:5 python ./timeline1.py > timeline1.log 2>&1'.
There is no "decodebin2" in GStreamer 1.x, which is what you're using here. It's just called "decodebin" now and is equivalent to "decodebin2" in 0.10.
Your problem here, however, is not that decodebin is not found. Your problem is that you're missing a plugin to play this specific media file. What kind of media file is it?
I am making a screen-share Java-based application. I am done with encoding frames into H.264 using the JCodec Java library, and I have the picture data in a ByteBuffer.
How will I send these encoded frames to Wowza through an RTMP client?
Can Wowza recognize H.264 frames encoded by the JCodec library?
Pretty much any of the "flash" media servers will understand H.264 data in a stream. You'll need to encode your frames with the baseline or main profile and then "package" the encoded bytes into the FLV streaming format. The first step is creating an AMF video data item, which means prefixing and suffixing the H.264-encoded byte array based on its "NALU" content; in pseudocode it looks something like this:
if idr
    flv[0] = 0x17  // 0x10 key frame; 0x07 h264 codec id
    flv[1] = 0x01  // 0 sequence header; 1 nalu; 2 end of seq
    flv[2] = 0     // pres offset
    flv[3] = 0     // pres offset
    flv[4] = 0     // pres offset
    flv[5] = 0     // size
    flv[6] = 0     // size cont
    flv[7] = 0     // size cont
    flv[8] = 0     // size cont
else if coded slice
    flv[0] = 0x27
    flv[1] = 0x01
    flv[2] = 0     // pres offset
    flv[3] = 0     // pres offset
    flv[4] = 0     // pres offset
    flv[5] = 0     // size
    flv[6] = 0     // size cont
    flv[7] = 0     // size cont
    flv[8] = 0     // size cont
else if PPS or SPS
    .... skipping this here as its really complicated, this is the h264/AVC configuration data

copy(encoded, 0, flv, 9, encoded.length)
flv[flv.length - 1] = 0
The next step is packaging the AMF video data into an RTMP message. I suggest you look at flazr or one of the Android RTMP libraries for details on this step.
I've got a small example project that takes raw encoded H.264 and writes it to an FLV here, if you want to see how it's done.
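To make the byte layout above concrete, here is a small hedged sketch of the same framing (written in C++ only for illustration; a Java/ByteBuffer version would be analogous, and the exact details should be verified against the FLV spec and your server):

#include <cstdint>
#include <cstring>
#include <vector>

// Sketch: wrap one H.264 NAL unit (without an Annex B start code) into the
// body of an FLV VideoData tag, mirroring the pseudocode above.
std::vector<uint8_t> wrapNalForFlv(const uint8_t* nal, size_t len, bool keyFrame)
{
    const uint32_t n = static_cast<uint32_t>(len);
    std::vector<uint8_t> flv(9 + len);
    flv[0] = keyFrame ? 0x17 : 0x27;        // frame type (1 = key, 2 = inter) | codec id 7 (AVC)
    flv[1] = 0x01;                          // AVCPacketType: 1 = NALU (0 = sequence header, 2 = end of seq)
    flv[2] = flv[3] = flv[4] = 0;           // 24-bit composition time offset, 0 here
    flv[5] = static_cast<uint8_t>(n >> 24); // 32-bit big-endian NALU length
    flv[6] = static_cast<uint8_t>(n >> 16);
    flv[7] = static_cast<uint8_t>(n >> 8);
    flv[8] = static_cast<uint8_t>(n);
    std::memcpy(flv.data() + 9, nal, len);
    return flv;
}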
I'm trying to automate video conversion with PowerShell and the ffmpeg tool.
ffmpeg prints detailed information about a video if it is called without all the necessary parameters: it reports an error and displays the input file info if one is specified.
For example, I interactively executed this command:
d:\video.Enc\ffmpeg.exe -i d:\video.Enc\1.wmv
This is the PowerShell console output:
ffmpeg.exe : FFmpeg version SVN-r20428, Copyright (c) 2000-2009 Fabrice Bellard, et al.
row:1 char:24
+ d:\video.Enc\ffmpeg.exe <<<< -i d:\video.Enc\1.wmv
+ CategoryInfo : NotSpecified: (FFmpeg version ...Bellard, et al.:String) [], RemoteException
+ FullyQualifiedErrorId : NativeCommandError
built on Nov 1 2009 04:03:50 with gcc 4.2.4
configuration: --enable-memalign-hack --prefix=/mingw --cross-prefix=i686-mingw32- --cc=ccache-i686-mingw32-gcc --target-os=mingw32 --arch=i686 --cpu=i686 --enable-avisynth --enable-gpl --enable-version3 --enable-zlib --enable-bzlib --enable-libgsm --enable-libfaad --enable-pthreads --enable-libvorbis --enable-libtheora --enable-libspeex --enable-libmp3lame --enable-libopenjpeg --enable-libxvid --enable-libschroedinger --enable-libx264 --enable-libopencore_amrwb --enable-libopencore_amrnb
libavutil 50. 3. 0 / 50. 3. 0
libavcodec 52.37. 1 / 52.37. 1
libavformat 52.39. 2 / 52.39. 2
libavdevice 52. 2. 0 / 52. 2. 0
libswscale 0. 7. 1 / 0. 7. 1
[wmv3 @ 0x144dc00]Extra data: 8 bits left, value: 0
Seems stream 1 codec frame rate differs from container frame rate: 1000.00 (1000/1) -> 15.00 (15/1)
Input #0, asf, from 'd:\video.Enc\1.wmv':
  Duration: 00:12:02.00, start: 5.000000, bitrate: 197 kb/s
    Stream #0.0(eng): Audio: wmav2, 44100 Hz, 1 channels, s16, 48 kb/s
    Stream #0.1(eng): Video: wmv3, yuv420p, 1024x768, 137 kb/s, 15 tbr, 1k tbn, 1k tbc
  Metadata
title : Silverlight 2.0 Hello World Application
author : Sergey Pugachev
copyright :
comment :
WMFSDKVersion : 11.0.6001.7000
WMFSDKNeeded : 0.0.0.0000
IsVBR : 1
ASFLeakyBucketPairs:
VBR Peak : 715351
Buffer Average : 127036
At least one output file must be specified
But I can't figure out how to script this and capture the output into any kind of PowerShell objects.
I tried a direct script, where the .ps1 file contained the exact expression "d:\video.Enc\ffmpeg.exe -i d:\video.Enc\1.wmv"; it didn't work. I also tried Invoke-Command and Invoke-Expression: the first one returns just a string containing the command, and the second one dumps the error to the output console but not to the -ErrorVariable I specified (I did set all the output variables, not only the error one; all of them were empty).
Can anyone point me to the correct syntax for invoking console applications in PowerShell and capturing their output?
A second question is about parsing that output: I'll need the video resolution to calculate the correct aspect ratio for the conversion, so it would be great if someone could also show how to work with the captured error output and parse a string like
Stream #0.1(eng): Video: wmv3, yuv420p, 1024x768,
Try redirecting the error stream to stdout like so; then you should be able to capture both stdout and stderr in a single variable, e.g.:
$res = d:\video.Enc\ffmpeg.exe -i d:\video.Enc\1.wmv 2>&1
To capture the data try this:
$res | Select-String '(?ims)^Stream.*?(\d{3,4}x\d{3,4})' -all |
%{$_.matches} | %{$_.Groups[1].Value}
I'm not sure if $res will be one string or multiple but the above should work for both cases.