AVAudioSinkNode with non-default, but still device-native sample rates - swift

I've configured AVAudioSinkNode attached to AVAudioEngine's inputNode like so:
let sinkNode = AVAudioSinkNode { (timestamp, frames, audioBufferList) -> OSStatus in
    print("SINK: \(timestamp.pointee.mHostTime) - \(frames) - \(audioBufferList.pointee.mNumberBuffers)")
    return noErr
}
audioEngine.attach(sinkNode)
audioEngine.connect(audioEngine.inputNode, to: sinkNode, format: nil)
audioEngine.prepare()
do {
    try audioEngine.start()
    print("AudioEngine started.")
} catch {
    print("AudioEngine did not start!")
}
I've separately configured it to use the "Built-in Microphone" device (which I am sure it does use).
If I set the mic's sample rate to 44100 (using the "Audio MIDI Setup" app provided by Apple on all Macs), everything works as expected:
AudioEngine started.
SINK: 692312319180567 - 512 - 2
SINK: 692312348024104 - 512 - 2
SINK: 692312359634082 - 512 - 2
SINK: 692312371244059 - 512 - 2
SINK: 692312382854036 - 512 - 2
...
However, if I use Audio MIDI Setup to change the mic's sample rate to anything other than 44100 (say, 48000), then the sink node doesn't seem to do anything (nothing is printed).
Originally I was trying to modify the mic's sample rate programmatically, but I later discovered that the same thing happens when I simply change the device sample rate via the standard Audio MIDI Setup app, so there is no need to post my sample-rate-setting code here.
Does anyone know whether AVAudioSinkNode has its allowed sample rate hard-coded?
I cannot find any other explanation...

I've been toying around with AVAudioSinkNode, and it doesn't appear to be restricted to a 44100 Hz sample rate.
In my case, when I check the sampling rate of my input and sink nodes after attaching them, I get the following:
Input node sample rates:
IF <AVAudioFormat 0x600002527700: 2 ch, 48000 Hz, Float32, non-inter>
OF <AVAudioFormat 0x60000250fac0: 2 ch, 48000 Hz, Float32, non-inter>
Sink node sample rates:
IF <AVAudioFormat 0x60000250fbb0: 2 ch, 44100 Hz, Float32, non-inter>
OF <AVAudioFormat 0x60000250fb60: 2 ch, 44100 Hz, Float32, non-inter>
But once I connected them together, I got the following:
Input node sample rates:
IF <AVAudioFormat 0x600002527980: 1 ch, 48000 Hz, Float32>
OF <AVAudioFormat 0x600002506760: 2 ch, 48000 Hz, Float32, non-inter>
Sink node sample rates:
IF <AVAudioFormat 0x600002506710: 2 ch, 48000 Hz, Float32, non-inter>
OF <AVAudioFormat 0x600002505db0: 2 ch, 48000 Hz, Float32, non-inter>
I'm new to working with audio frameworks, but this does seem to suggest that the sink node's sample rate isn't hardcoded.
Your connection,
audioEngine.connect(audioEngine.inputNode, to: sinkNode, format: nil)
seems to differ from mine. Rightly or wrongly, I explicitly specified the format as audioEngine.inputNode.outputFormat(forBus: 0), i.e.:
audioEngine.connect(audioEngine.inputNode, to: sinkNode, format: audioEngine.inputNode.outputFormat(forBus: 0))
which led to the settings shown. Not sure if that makes a difference.

Related

AudioKit: how to record the iPhone 11 microphone with AKClipRecorder? (crash)

For the purposes of this issue, I edited AudioKit's iOS example project LoopbackRecording.
I wanted to add an effect before recording the microphone:
let mic = AKMicrophone()
let reverb = AKReverb(mic)
reverb.loadFactoryPreset(.cathedral)
// Set up recorders
loopBackRecorder = AKClipRecorder(node: reverb)
This code works fine on the simulator (iPhone 11, iOS 13) but crashes on an iPhone 11 Pro when AudioKit tries to start, with this error log:
2020-02-17 13:49:05.854838+0100 LoopbackRecording[456:33365] [avae]
AVAudioEngine.mm:160 Engine#0x283c94f30: could not initialize, error = -10868
2020-02-17 13:49:05.955946+0100 LoopbackRecording[456:33365] [avae]
AVAEInternal.h:109 [AVAudioEngineGraph.mm:1397:Initialize: (err = AUGraphParser::InitializeActiveNodesInInputChain(ThisGraph, *GetInputNode())): error -10868
Fatal error: The operation couldn’t be completed. (com.apple.coreaudio.avfaudio error -10868.): file AudioKit-iOS/Examples/LoopbackRecording/LoopbackRecording/ViewController.swift, line 104
2020-02-17 13:49:05.967861+0100 LoopbackRecording[456:33365] Fatal error: The operation couldn’t be completed. (com.apple.coreaudio.avfaudio error -10868.)
I tried to debug by removing the reverb, and the app still crashes. The AVFAudio error code -10868 corresponds to a more explicit error: kAudioUnitErr_FormatNotSupported.
I found this error thrown in only one AudioKit file, AUBuffer.cpp, line 95:
...
if (format.IsInterleaved()) {
    nStreams = 1;
    channelsPerStream = format.mChannelsPerFrame;
} else {
    nStreams = format.mChannelsPerFrame;
    channelsPerStream = 1;
    if (nStreams > mAllocatedStreams)
        COMPONENT_THROW(kAudioUnitErr_FormatNotSupported);
}
So I compared the formats of the AudioKit engine's inputNode and of mic.avAudioNode (AKMicrophone):
(lldb) po AudioKit.engine.inputNode.inputFormat(forBus: 0)
<AVAudioFormat 0x2803eaa80: 1 ch, 48000 Hz, Float32>
(lldb) po AudioKit.engine.inputNode.outputFormat(forBus: 0)
<AVAudioFormat 0x2803dc0f0: 1 ch, 48000 Hz, Float32>
(lldb) po mic?.avAudioNode.inputFormat(forBus: 0)
<AVAudioFormat 0x2803e4190: 1 ch, 48000 Hz, Float32>
(lldb) po mic?.avAudioNode.outputFormat(forBus: 0)
<AVAudioFormat 0x2803dc1e0: 2 ch, 44100 Hz, Float32, non-inter>
I would really appreciate any help or tips that would allow me to debug this situation.

How to encode audio to AAC with profile FF_PROFILE_AAC_LOW

I'm trying to encode audio to AAC with the FF_PROFILE_AAC_LOW profile using the following setting:
oc_cxt->profile = FF_PROFILE_AAC_LOW;
Also from the output of av_dump_format, I got this
Metadata:
encoder : Lavf57.36.100
Stream #0:0: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 192 kb/s
But the actual output is different. Everything is OK, except that the output is plain AAC, not AAC (LC). Probing with ffprobe, the output information is:
$ ffprobe o.m4a
...
Stream #0:0(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 195 kb/s (default)
...
AAC (LC) is the desired result I need.
But from the command line, ffmpeg can generate AAC (LC) output. Below is a small test.
$ ffmpeg -f lavfi -i aevalsrc="sin(440*2*PI*t):d=5" aevalsrc.m4a
$ ffprobe aevalsrc.m4a
...
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 69 kb/s (default)
...
How can I select FF_PROFILE_AAC_LOW to get AAC (LC) output?
This was caused by a change in the new ffmpeg API which I hadn't noticed.
The extradata needs to be copied back to AVStream->codecpar->extradata after avcodec_open2(). After that, ffprobe detects that the output is the format I need, AAC (LC).
The following is a code snippet from ffmpeg.c
if (!ost->st->codecpar->extradata && avctx->extradata) {
    ost->st->codecpar->extradata = av_malloc(avctx->extradata_size + FF_INPUT_BUFFER_PADDING_SIZE);
    if (!ost->st->codecpar->extradata) {
        av_log(NULL, AV_LOG_ERROR, "Could not allocate extradata buffer to copy parser data.\n");
        exit_program(1);
    }
    ost->st->codecpar->extradata_size = avctx->extradata_size;
    memcpy(ost->st->codecpar->extradata, avctx->extradata, avctx->extradata_size);
}
Hopefully this is helpful to anyone using the latest version of ffmpeg (3.x).

ffmpeg API h264 encoded video does not play on all platforms

Edit: In the previous version I used a very old ffmpeg API. I now use the newest libraries. The problem has only changed slightly, from "Main" to "High".
I am using the ffmpeg C API to create a mp4 video in C++.
I want the resulting video to have the "Constrained Baseline" profile, so that it can be played on as many platforms as possible, especially mobile, but I get the "High" profile every time, even though I hard-coded the codec profile to FF_PROFILE_H264_CONSTRAINED_BASELINE. As a result, the video does not play on all our testing platforms.
This is what "ffprobe video.mp4 -show_streams" tells about my video streams:
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
creation_time : 1970-01-01 00:00:00
encoder : Lavf53.5.0
Duration: 00:00:13.20, start: 0.000000, bitrate: 553 kb/s
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 320x180,
424 kb/s, 15 fps, 15 tbr, 15 tbn, 30 tbc
Metadata:
creation_time : 1970-01-01 00:00:00
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, s16, 12
kb/s
Metadata:
creation_time : 1970-01-01 00:00:00
handler_name : SoundHandler
-------VIDEO STREAM--------
[STREAM]
index=0
codec_name=h264
codec_long_name=H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
profile=High <-- This should be "Constrained Baseline"
codec_type=video
codec_time_base=1/30
codec_tag_string=avc1
codec_tag=0x31637661
width=320
height=180
has_b_frames=0
sample_aspect_ratio=N/A
display_aspect_ratio=N/A
pix_fmt=yuv420p
level=30
timecode=N/A
is_avc=1
nal_length_size=4
id=N/A
r_frame_rate=15/1
avg_frame_rate=15/1
time_base=1/15
start_time=0.000000
duration=13.200000
bit_rate=424252
nb_frames=198
nb_read_frames=N/A
nb_read_packets=N/A
TAG:creation_time=1970-01-01 00:00:00
TAG:language=und
TAG:handler_name=VideoHandler
[/STREAM]
-------AUDIO STREAM--------
[STREAM]
index=1
codec_name=aac
codec_long_name=Advanced Audio Coding
profile=unknown
codec_type=audio
codec_time_base=1/44100
codec_tag_string=mp4a
codec_tag=0x6134706d
sample_fmt=s16
sample_rate=44100
channels=2
bits_per_sample=0
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/44100
start_time=0.000000
duration=13.165714
bit_rate=125301
nb_frames=567
nb_read_frames=N/A
nb_read_packets=N/A
TAG:creation_time=1970-01-01 00:00:00
TAG:language=und
TAG:handler_name=SoundHandler
[/STREAM]
This is the function I use to add a video stream. All the values that come from ptr-> are defined externally; do those values have to be something specific to get the correct profile?
static AVStream *add_video_stream(Cffmpeg_dll *ptr, AVFormatContext *oc, enum CodecID codec_id)
{
    AVCodecContext *c;
    AVStream *st;
    AVCodec *codec;

    // Get correct codec
    codec = avcodec_find_encoder(codec_id);
    if (!codec) {
        av_log(NULL, AV_LOG_ERROR, "%s", "Video codec not found\n");
        exit(1);
    }

    // Create stream
    st = avformat_new_stream(oc, codec);
    if (!st) {
        av_log(NULL, AV_LOG_ERROR, "%s", "Could not alloc stream\n");
        exit(1);
    }

    c = st->codec;

    /* Get default values */
    codec = avcodec_find_encoder(codec_id);
    if (!codec) {
        av_log(NULL, AV_LOG_ERROR, "%s", "Video codec not found (default values)\n");
        exit(1);
    }
    avcodec_get_context_defaults3(c, codec);

    c->codec_id = codec_id;
    c->codec_type = AVMEDIA_TYPE_VIDEO;
    c->bit_rate = ptr->video_bit_rate;
    av_log(NULL, AV_LOG_ERROR, " Bit rate: %i", c->bit_rate);
    c->qmin = ptr->qmin;
    c->qmax = ptr->qmax;
    c->me_method = ptr->me_method;
    c->me_subpel_quality = ptr->me_subpel_quality;
    c->i_quant_factor = ptr->i_quant_factor;
    c->qcompress = ptr->qcompress;
    c->max_qdiff = ptr->max_qdiff;

    // We need to set the level and profile to get videos that play (hopefully) on all platforms
    c->level = 30;
    c->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;

    c->width = ptr->dstWidth;
    c->height = ptr->dstHeight;
    c->time_base.den = ptr->fps;
    c->time_base.num = 1;
    c->gop_size = ptr->fps;
    c->pix_fmt = STREAM_PIX_FMT;
    c->max_b_frames = 0;

    // some formats want stream headers to be separate
    if (oc->oformat->flags & AVFMT_GLOBALHEADER)
        c->flags |= CODEC_FLAG_GLOBAL_HEADER;

    return st;
}
Additional info:
As a reference video, I use the gizmo.mp4 that Mozilla serves as an example that plays on every platform/browser. It definitely has the "Constrained Baseline" profile, and definitely works on all our testing smartphones. You can download it here. Our self-created video doesn't work on all platforms and I'm convinced this is because of the profile.
I am also using qt-faststart.exe to move the headers to the start of the file after creating the mp4, as this cannot be done in a good way in C++ directly. Could that be the problem?
Obviously, I am doing something wrong, but I don't know what it could be. I'd be thankful for every hint ;)
I have the solution. After spending some time in the ffmpeg bug tracker discussions and browsing for profile-setting examples, I finally figured it out.
One needs to use av_opt_set(codecContext->priv_data, "profile", "baseline" (or any other desired profile), AV_OPT_SEARCH_CHILDREN)
So in my case that would be:
Wrong:
// We need to set the level and profile to get videos that play (hopefully) on all platforms
c->level = 30;
c->profile = FF_PROFILE_H264_CONSTRAINED_BASELINE;
Correct:
// Set profile to baseline
av_opt_set(c->priv_data, "profile", "baseline", AV_OPT_SEARCH_CHILDREN);
Completely unintuitive and contrary to the rest of the API usage, but that's ffmpeg philosophy. You don't need to understand it, you just need to understand how to use it ;)

Add kAudioUnitSubType_Varispeed to AUGraph in iOS5

I am trying to add a kAudioUnitSubType_Varispeed audio unit to an AUGraph in iOS 5, but when I add it and call AUGraphInitialize, the error code -10868 (kAudioUnitErr_FormatNotSupported) is returned.
Here is my AudioComponentDescription:
AudioComponentDescription varispeedDescription;
varispeedDescription.componentType = kAudioUnitType_FormatConverter;
varispeedDescription.componentSubType = kAudioUnitSubType_Varispeed;
varispeedDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
varispeedDescription.componentFlags = 0;
varispeedDescription.componentFlagsMask = 0;
If I print the state of the graph right before I initialise the graph I get the following:
AudioUnitGraph 0x918000:
Member Nodes:
node 1: 'auou' 'rioc' 'appl', instance 0x16dba0 O
node 2: 'aumx' 'mcmx' 'appl', instance 0x1926f0 O
node 3: 'aufc' 'vari' 'appl', instance 0x193b00 O
Connections:
node 2 bus 0 => node 3 bus 0 [ 2 ch, 44100 Hz, 'lpcm' (0x00000C2C) 8.24-bit little-endian signed integer, deinterleaved]
node 3 bus 0 => node 1 bus 0 [ 2 ch, 0 Hz, 'lpcm' (0x00000029) 32-bit little-endian float, deinterleaved]
Input Callbacks:
{0x39a5, 0x172e24} => node 2 bus 0 [2 ch, 44100 Hz]
{0x39a5, 0x172e24} => node 2 bus 1 [2 ch, 44100 Hz]
CurrentState:
mLastUpdateError=0, eventsToProcess=F, isRunning=F
As you can see the connection from the Varispeed unit to the Remote IO unit shows a very strange format. What is even more odd is that if I run this on the simulator as opposed to on my development device the format that shows is 16-bit little endian integer, deinterleaved. If I try to set the stream format on the input or output scopes of the Varispeed unit I get the same error code -10868.
The ASBD that I am setting as the stream format on each Multichannel Mixer bus on the input scope is as follows:
AudioStreamBasicDescription stereoFormat = {0};
stereoFormat.mSampleRate = sampleRate;
stereoFormat.mFormatID = kAudioFormatLinearPCM;
stereoFormat.mFormatFlags = kAudioFormatFlagsCanonical;
stereoFormat.mFramesPerPacket = 1;
stereoFormat.mChannelsPerFrame = 2;
stereoFormat.mBitsPerChannel = 16;
stereoFormat.mBytesPerPacket = 4;
stereoFormat.mBytesPerFrame = 4;
The reason I am not using the canonical format for audio units is because I am reading samples from an AVAssetReader which I cannot configure to output 8.24 Signed Integer samples.
If I take out the Varispeed unit from the graph and connect the Multichannel Mixer unit to the input of the Remote IO unit the graph initialises and plays fine. Any idea what I'm doing wrong?
Your IO unit is asking for 32-bit float and receiving 8.24 fixed-point from the Varispeed unit, hence the format error.
You can set the stream format on the output of the Varispeed unit to match the input format of the IO unit like this:
AudioStreamBasicDescription asbd;
UInt32 asbdSize = sizeof (asbd);
memset (&asbd, 0, sizeof (asbd));
AudioUnitGetProperty(ioaudiounit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0, &asbd, &asbdSize);
AudioUnitSetProperty(varispeedaudiounit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 0, &asbd, sizeof(asbd));

Using FFMPEG to reliably convert videos to mp4 for iphone/ipod and flash players

I need to convert videos for use in both a flash player and the iphone/ipod touch. I'm using the following batch script with ffmpeg:
#echo off
ffmpeg.exe -i %1 -s qvga -acodec libfaac -ar 22050 -ab 128k -vcodec libx264 -threads 0 -f ipod %2
This always outputs an mp4 file, and I can always play it on my PC. The videos also seem to play fine on my iPhone 3GS. But with some input files it won't work on older devices (iPhone 3G and iPod touch).
Here's the ffmpeg output from one such file:
D:\ffmpeg>encode.bat d:\temp\recording.flv d:\temp\out.m4v
FFmpeg version SVN-r18709, Copyright (c) 2000-2009 Fabrice Bellard, et al.
configuration: --enable-memalign-hack --prefix=/mingw --cross-prefix=i686-ming
w32- --cc=ccache-i686-mingw32-gcc --target-os=mingw32 --arch=i686 --cpu=i686 --e
nable-avisynth --enable-gpl --enable-zlib --enable-bzlib --enable-libgsm --enabl
e-libfaac --enable-libfaad --enable-pthreads --enable-libvorbis --enable-libtheo
ra --enable-libspeex --enable-libmp3lame --enable-libopenjpeg --enable-libxvid -
-enable-libschroedinger --enable-libx264
libavutil 50. 3. 0 / 50. 3. 0
libavcodec 52.27. 0 / 52.27. 0
libavformat 52.32. 0 / 52.32. 0
libavdevice 52. 2. 0 / 52. 2. 0
libswscale 0. 7. 1 / 0. 7. 1
built on Apr 28 2009 04:04:42, gcc: 4.2.4
[flv # 0x187d650]skipping flv packet: type 18, size 164, flags 0
Input #0, flv, from 'd:\temp\recording.flv':
Duration: 00:00:07.17, start: 0.001000, bitrate: N/A
Stream #0.0: Video: flv, yuv420p, 320x240, 1k tbr, 1k tbn, 1k tbc
Stream #0.1: Audio: nellymoser, 44100 Hz, mono, s16
[libx264 # 0x13518b0]using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE
4.2
[libx264 # 0x13518b0]profile Baseline, level 4.2
Output #0, ipod, to 'd:\temp\out.m4v':
Stream #0.0: Video: libx264, yuv420p, 320x240, q=2-31, 200 kb/s, 1k tbn, 1k
tbc
Stream #0.1: Audio: libfaac, 22050 Hz, mono, s16, 128 kb/s
Stream mapping:
Stream #0.0 -> #0.0
Stream #0.1 -> #0.1
Press [q] to stop encoding
frame= 90 fps= 0 q=-1.0 Lsize= 128kB time=6.87 bitrate= 152.4kbits/s
video:92kB audio:32kB global headers:1kB muxing overhead 2.620892%
[libx264 # 0x13518b0]slice I:8 Avg QP:29.62 size: 7047
[libx264 # 0x13518b0]slice P:82 Avg QP:30.83 size: 467
[libx264 # 0x13518b0]mb I I16..4: 17.9% 0.0% 82.1%
[libx264 # 0x13518b0]mb P I16..4: 0.6% 0.0% 0.0% P16..4: 23.1% 0.0% 0.0%
0.0% 0.0% skip:76.3%
[libx264 # 0x13518b0]final ratefactor: 57.50
[libx264 # 0x13518b0]SSIM Mean Y:0.9544735
[libx264 # 0x13518b0]kb/s:8412.6
My suspicion is that it has something to do with the audio encoding. If so, does anyone know how to force it to reencode the audio to the proper format?
Any other ideas?
WARNING: this answer is 10 years old and reported not to work anymore.
I think the issue is the H.264 level being level 4.2.
Some of the Apple devices only support up to 3.0.
Here are the FFmpeg settings I usually use:
ffmpeg -i YOUR-INPUT.wmv -s qvga -b 384k -vcodec libx264 -r 23.976 -acodec libfaac -ac 2 -ar 44100 -ab 64k -vpre baseline -crf 22 -deinterlace YOUR-OUTPUT.MP4
You can adjust the rate, size and bitrate as needed. The important settings are in the baseline config param.
The ffmpeg wiki provides some useful, up-to-date guidance on how to encode H.264 for particular devices. Here's an excerpt from Apple's docs with the corresponding profiles:
iOS Compatibility
Profile Level Devices Options
Baseline 3.0 All devices -profile:v baseline -level 3.0
Baseline 3.1 iPhone 3G and later, iPod touch 2nd generation and later -profile:v baseline -level 3.1
Main 3.1 iPad (all vers), Apple TV 2 and later, iPhone 4 and later -profile:v main -level 3.1
Main 4.0 Apple TV 3 and later, iPad 2 and later, iPhone 4s and later -profile:v main -level 4.0
High 4.0 Apple TV 3 and later, iPad 2 and later, iPhone 4s and later -profile:v high -level 4.0
High 4.1 iPad 2 and later, iPhone 4s and later, iPhone 5c and later -profile:v high -level 4.1
High 4.2 iPad Air and later, iPhone 5s and later -profile:v high -level 4.2
ffmpeg -i test.mov -profile:v baseline -level 3.0 test.mp4
This disables some features but offers greater compatibility.
Also, here are some useful optional tags to add for working with the quality and file size:
-preset: ultrafast, superfast, veryfast, faster, fast, medium, slow, slower, veryslow, placebo
-crf: 0-51
(preset modifies how long it takes to compress your video, with faster getting a bigger file size, and slower getting a smaller file size, whereas crf modifies the video quality, with higher quality having a bigger file size, and lower quality having a smaller file size.)
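The compatibility table above lends itself to a small lookup helper. Here is a sketch (the function name, dictionary, and device keys are my own invention, not part of ffmpeg; only a few rows of the table are shown):

```python
# Hypothetical helper mapping a target device generation to the
# -profile:v/-level ffmpeg flags from the compatibility table above.
H264_TARGETS = {
    "all":      ("baseline", "3.0"),  # plays on every iOS device
    "iphone3g": ("baseline", "3.1"),
    "ipad":     ("main", "3.1"),
    "appletv3": ("main", "4.0"),
    "iphone5s": ("high", "4.2"),
}

def h264_flags(target):
    """Return the ffmpeg profile/level flags for a target key."""
    profile, level = H264_TARGETS[target]
    return "-profile:v {} -level {}".format(profile, level)

print(h264_flags("all"))  # -profile:v baseline -level 3.0
```

The widest-compatibility choice is always the "all" entry, matching the baseline/3.0 command shown above.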
The listed ffmpeg settings didn't work for me (I don't seem to have the "baseline" preset). I posted ffmpeg settings that don't reference baseline over here: iPhone "cannot play" .mp4 H.264 video file
Spoiler:
ffmpeg -i INPUT -s 320x240 -r 30000/1001 -b 200k -bt 240k -vcodec libx264 -coder 0 -bf 0 -refs 1 -flags2 -wpred-dct8x8 -level 30 -maxrate 10M -bufsize 10M -acodec libfaac -ac 2 -ar 48000 -ab 192k OUTPUT.mp4
The official Apple reference on the subject: http://developer.apple.com/library/safari/#documentation/AppleApplications/Reference/SafariWebContent/CreatingVideoforSafarioniPhone/CreatingVideoforSafarioniPhone.html
Try this Python script. I wrote it for myself; maybe you will find it useful too. It converts files to mp4.
Because of SO rules, here is the complete source code:
#!/usr/bin/python
# Copyright (C) 2007-2010 CDuke
# This program is free software. You may distribute it under the terms of
# the GNU General Public License as published by the Free Software
# Foundation, version 2.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
# Public License for more details.
#
# This program converts video files to mp4, suitable to be played on an iPod
# or an iPhone. It is careful about maintaining the proper aspect ratio.

from __future__ import division
from datetime import datetime
import sys
import argparse
import os
import re
import shlex
import time
from subprocess import Popen, PIPE

DEFAULT_ARGS = '-f mp4 -y -vcodec libxvid -maxrate 1000k -mbd 2 -qmin 3 -qmax 5 -g 300 -bf 0 -acodec libfaac -ac 2 -flags +mv4 -trellis 2 -cmp 2 -subcmp 2'
#DEFAULT_ARGS = '-f mp4 -y -vcodec mpeg4 -vtag xvid -maxrate 1000k -mbd 2 -qmin 3 -qmax 5 -g 300 -bf 0 -acodec libfaac -ac 2 -r 30000/1001 -flags +mv4 -trellis 2 -cmp 2 -subcmp 2'
#DEFAULT_ARGS = '-y -f mp4 -vcodec libxvid -acodec libfaac'
DEFAULT_BUFSIZE = '4096k'
DEFAULT_AUDIO_BITRATE = '128k'
DEFAULT_VIDEO_BITRATE = '400k'
FFMPEG = '/usr/bin/ffmpeg'

class device:
    '''Describe properties of a device'''
    def __init__(self, name, width, height):
        self.name = name
        self.width = width
        self.height = height

class videoFileInfo:
    def __init__(self, width, height, duration):
        self.width = width
        self.height = height
        self.duration = duration

devices = [device('ipod', 320, 240), device('iphone', 480, 320),
           device('desire', 800, 480)]

def getOutputFileName(inputFileName, outDir):
    if outDir is None:
        outFileName = os.path.splitext(inputFileName)[0] + '.mp4'
    else:
        # keep the .mp4 extension when writing into the output directory
        outFileName = os.path.join(outDir, os.path.splitext(os.path.basename(inputFileName))[0] + '.mp4')
    return outFileName

def getVideoFileInfo(fileName):
    p = Popen([FFMPEG, '-i', fileName], stdout=PIPE, stderr=PIPE)
    fileInfo = p.communicate()[1]
    videoRes = re.search(b'Video:.+ (\d+)x(\d+)', fileInfo)
    w = float(videoRes.group(1))
    h = float(videoRes.group(2))
    duratMatch = re.search(b'Duration:\s+(\d+):(\d+):(\d+)\.(\d+)', fileInfo)
    duration = float(duratMatch.group(1)) * 3600
    duration += float(duratMatch.group(2)) * 60
    duration += float(duratMatch.group(3))
    duration += float(duratMatch.group(4)) / 100  # ffmpeg reports centiseconds
    fileInfo = videoFileInfo(w, h, duration)
    return fileInfo

def getArguments(width, height, aspect):
    args = {}
    w = width
    h = w // aspect
    h -= (h % 2)
    if h <= height:
        pad = (height - h) // 2
        pad -= (pad % 2)
        pady = pad
        padx = 0
    else:
        # recalculate using the height as the baseline rather than the width
        h = height
        w = int(h * aspect)
        width -= (width % 2)
        pad = (width - w) // 2
        pad -= (pad % 2)
        padx = pad
        pady = 0
    args['width'] = w
    args['height'] = h
    args['padx'] = padx
    args['pady'] = pady
    return args

def getProgressBar(perc):
    convInfo = 'Converted: [{}] {:.2%} \r'
    num_hashes = round(perc * 100 // 2)
    bar = '=' * num_hashes + ' ' * (50 - num_hashes)
    return convInfo.format(bar, perc)

def convert(inputFileName, outputFileName, args, audioBitrate, videoBitrate, devWidth, devHeight, aspect, duration):
    cmd = '{ffmpeg} -i {inFile} {defaultArgs} -bufsize {bufsize} -s {width}x{height} -vf "pad={devWidth}:{devHeight}:{padx}:{pady},aspect={aspect}" -ab {audioBitrate} -b {videoBitrate} {outFile}'.format(ffmpeg=FFMPEG, inFile=inputFileName, defaultArgs=DEFAULT_ARGS, bufsize=DEFAULT_BUFSIZE, devWidth=devWidth, devHeight=devHeight, padx=args['padx'], pady=args['pady'], width=args['width'], height=args['height'], aspect=aspect, audioBitrate=audioBitrate, videoBitrate=videoBitrate, outFile=outputFileName)
    # cmd = '{ffmpeg} -i {inFile} {defaultArgs} -bufsize {bufsize} -s {width}x{height} -ab {audioBitrate} -b {videoBitrate} {outFile}'.format(ffmpeg=FFMPEG, inFile=inputFileName, defaultArgs=DEFAULT_ARGS, bufsize=DEFAULT_BUFSIZE, width=args['width'], height=args['height'], audioBitrate=audioBitrate, videoBitrate=videoBitrate, outFile=outputFileName)
    print(cmd)
    print()
    start = datetime.today()
    print('Converting started at ' + str(start))
    conv = Popen(shlex.split(cmd), shell=False, stdout=PIPE, stderr=PIPE)
    while conv.poll() is None:
        out = os.read(conv.stderr.fileno(), 2048)
        last = out.splitlines()[-1]
        timeMatch = re.search(b'time=([^\s]+)', last)
        if timeMatch:
            timeDone = float(timeMatch.group(1))
            perc = timeDone / duration
            if sys.version_info > (3, 0):
                exec("print(getProgressBar(perc), end='')")
            else:
                exec("print getProgressBar(perc),")
            sys.stdout.flush()
        # else:
        #     print(out)
        time.sleep(0.5)
    print(getProgressBar(1))
    end = datetime.today()
    print('Converting ended at ' + str(end))
    print('Spent time: ' + str(end - start))

class mp4Converter(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        outdir = namespace.outdir
        for f in values:
            outFileName = getOutputFileName(f.name, outdir)
            fileInfo = getVideoFileInfo(f.name)
            aspect = fileInfo.width / fileInfo.height
            dev = next(d for d in devices if d.name == namespace.device)
            args = getArguments(dev.width, dev.height, aspect)
            convert(f.name, outFileName, args, namespace.AUDIO_BITRATE, namespace.VIDEO_BITRATE, dev.width, dev.height, aspect, fileInfo.duration)
            print('file "{0}" converted successfully'.format(f.name))

opts = argparse.ArgumentParser(
    description='Converter to MP4',
    epilog='made by CDuke 2010')
opts.add_argument('-V', '--version',
                  action='version',
                  version='0.0.1')
opts.add_argument('-v', '--verbose',
                  action='store_true',
                  default=False,
                  help='verbose')
opts.add_argument('-a', '--audio',
                  dest='AUDIO_BITRATE',
                  default=DEFAULT_AUDIO_BITRATE,
                  help='override default audio bitrate {0}'.format(DEFAULT_AUDIO_BITRATE))
opts.add_argument('-b', '--video',
                  dest='VIDEO_BITRATE',
                  default=DEFAULT_VIDEO_BITRATE,
                  help='override default video bitrate {0}'.format(DEFAULT_VIDEO_BITRATE))
opts.add_argument('-d', '--device',
                  choices=[d.name for d in devices],
                  default='ipod',
                  help='device that will play video')
opts.add_argument('-o', '--outdir',
                  help='write files to given directory')
opts.add_argument('file',
                  nargs='+',
                  type=argparse.FileType('r'),
                  action=mp4Converter,
                  help='file that will be converted')
opts.parse_args()
ffmpeg -i input.mov -c:v libx264 -pix_fmt yuv420p -profile:v main -crf 1 -preset medium -c:a aac -movflags +faststart output.mp4
ffmpeg.exe -i "Video.mp4" -vcodec libx264 -preset fast -profile:v baseline -lossless 1 -vf "scale=720:540,setsar=1,pad=720:540:0:0" -acodec aac -ac 2 -ar 22050 -ab 48k "Video (SD).mp4"
I had the same problem. I wanted to convert videos mainly for an iPod 5G. Every bit of info I found was either outdated or did not work for me.
So I finally stumbled upon some parameters that worked:
ffmpeg -i "INPUTFILE" \
-f mp4 -vcodec mpeg4 \
-vf scale=-2:320 \
-maxrate 1536k -b:v 768k -qmin 3 -qmax 5 -bufsize 4096k -g 300 \
-c:a aac -b:a 128k -ar 44100 -ac 2 \
"OUTPUTFILE.mp4"
Some remarks:
I let ffmpeg scale to a height of 320 because for me this is the sweet spot. The files are not too big, and they also sync to an iPod touch, which has a screen of 480x320 and then uses the full video. The screen of the iPod is 320x240, so it could also be scaled to a height of 240. The iPod seems to support up to 480p, which is nice if you want to output the video to some other source.
The audio quality can be changed if needed.
I did not try touching the other parameters, as they work fine and produce decent results.
EDIT
I modified the command a tiny bit, as I had problems with some files whose audio tracks were encoded at 48 kHz with more than 2 channels.
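For reference, the -2 in scale=-2:320 asks ffmpeg to derive the width from the source aspect ratio and force it to be even (required by yuv420p). A rough Python sketch of that calculation (the exact rounding ffmpeg uses may differ slightly):

```python
def even_width(src_w, src_h, target_h):
    """Approximate what scale=-2:<target_h> computes: keep the aspect
    ratio, then round the width down to an even number."""
    w = round(src_w * target_h / src_h)
    return w - (w % 2)

print(even_width(1280, 720, 320))  # 568 for a 16:9 source
```

So a 1280x720 source scaled to a height of 320 ends up 568 pixels wide rather than the non-even 568.9 that the aspect ratio alone would give.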
Got here because the simplest ffmpeg conversion approach was not producing an mp4 that would play on iOS for some reason.
Found settings that work for me in 2019 here:
https://gist.github.com/jaydenseric/220c785d6289bcfd7366
ffmpeg -i input.mov -c:v libx264 -pix_fmt yuv420p -profile:v baseline -level 3.0 -crf 22 -preset veryslow -vf scale=1280:-2 -c:a aac -strict experimental -movflags +faststart -threads 0 output.mp4