kAudioUnitProperty_MatrixLevels in Swift

To query a MatrixMixer AudioUnit you do the following:
// code from the MatrixMixerTest sample project, in C++
UInt32 dims[2];
UInt32 theSize = sizeof(UInt32) * 2;
Float32 *theVols = NULL;
OSStatus result;

ca_require_noerr (result = AudioUnitGetProperty (au, kAudioUnitProperty_MatrixDimensions,
                                                 kAudioUnitScope_Global, 0, dims, &theSize), home);

theSize = ((dims[0] + 1) * (dims[1] + 1)) * sizeof(Float32);
theVols = static_cast<Float32*> (malloc (theSize));

ca_require_noerr (result = AudioUnitGetProperty (au, kAudioUnitProperty_MatrixLevels,
                                                 kAudioUnitScope_Global, 0, theVols, &theSize), home);
According to the documentation and the sample code, the data returned by AudioUnitGetProperty for kAudioUnitProperty_MatrixLevels is an array of Float32.
I am trying to find the matrix levels in Swift and can get the matrix dimensions without an issue, but I am not sure how to create an empty array of Float32 elements to pass where the API expects an UnsafeMutablePointer<Void>. Here's what I have tried, without success:
var size = ((dims[0] + 1) * (dims[1] + 1)) * UInt32(sizeof(Float32))
var vols = UnsafeMutablePointer<Float32>.alloc(Int(size))
In the MatrixMixerTest the array is used like: theVols[0]

This may need to be modified depending on how you converted the other parts, but the last part of your C++ code can be written in Swift like this:
theSize = ((dims[0] + 1) * (dims[1] + 1)) * UInt32(sizeof(Float32))
var theVols: [Float32] = Array(count: Int(theSize) / sizeof(Float32), repeatedValue: 0)
result = AudioUnitGetProperty(au, kAudioUnitProperty_MatrixLevels,
                              kAudioUnitScope_Global, 0, &theVols, &theSize)
guard result == noErr else {
    //...
    fatalError()
}
When a C-function based API expects an UnsafeMutablePointer<Void>, you just need an Array variable of an arbitrary element type, and you pass it as an inout argument.
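For reference, here is a minimal sketch of the same two queries in current Swift syntax (sizeof and alloc are gone in Swift 3 and later). It assumes au is a valid, initialized matrix mixer AudioUnit:

import AudioToolbox

var dims = [UInt32](repeating: 0, count: 2)
var size = UInt32(MemoryLayout<UInt32>.size * 2)
var status = AudioUnitGetProperty(au, kAudioUnitProperty_MatrixDimensions,
                                  kAudioUnitScope_Global, 0, &dims, &size)
guard status == noErr else { fatalError("MatrixDimensions failed: \(status)") }

// The levels form an (inputs + 1) x (outputs + 1) matrix of Float32;
// the extra row and column carry the per-channel and master gains.
let levelCount = Int((dims[0] + 1) * (dims[1] + 1))
var levels = [Float32](repeating: 0, count: levelCount)
size = UInt32(levelCount * MemoryLayout<Float32>.size)
status = AudioUnitGetProperty(au, kAudioUnitProperty_MatrixLevels,
                              kAudioUnitScope_Global, 0, &levels, &size)
guard status == noErr else { fatalError("MatrixLevels failed: \(status)") }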

Related

Matching Torch STFT with Accelerate

I'm trying to re-implement Torch's STFT code in Swift with Accelerate / vDSP, to produce a log Mel spectrogram by post-processing the STFT, so I can use the Mel spectrogram as an input for a CoreML port of OpenAI's Whisper.
PyTorch's native STFT / Mel code produces this spectrogram (it's clipped due to importing raw float 32s into Photoshop lol)
and mine:
Obviously the two things to notice are the values, and the lifted frequency components.
The STFT docs here https://pytorch.org/docs/stable/generated/torch.stft.html define:
X[ω, m] = ∑_{k = 0}^{win_length − 1} window[k] · input[m × hop_length + k] · exp(−j · (2π · ω · k) / win_length)
I believe I'm properly handling window[k] · input[m × hop_length + k], but I'm a bit lost as to how to calculate the exponent, what −j refers to in the documentation, and how to convert the final exponential to vDSP. Also, if it's a sum, how do I get the 200 elements I need!?
My Log Mel Spectrogram
My code follows:
func processData(audio: [Int16]) -> [Float]
{
    assert(self.sampleCount == audio.count)

    var audioFloat: [Float] = [Float](repeating: 0, count: audio.count)
    vDSP.convertElements(of: audio, to: &audioFloat)
    vDSP.divide(audioFloat, 32768.0, result: &audioFloat)

    // Up to this point, Python and Swift are numerically identical.

    // Insert numFFT/2 samples before and numFFT/2 after, so we have an extra numFFT amount to process.
    // TODO: Is this strictly necessary?
    audioFloat.insert(contentsOf: [Float](repeating: 0, count: self.numFFT/2), at: 0)
    audioFloat.append(contentsOf: [Float](repeating: 0, count: self.numFFT/2))

    // Split-complex arrays holding the FFT results
    var allSampleReal = [[Float]](repeating: [Float](repeating: 0, count: self.numFFT/2), count: self.melSampleCount)
    var allSampleImaginary = [[Float]](repeating: [Float](repeating: 0, count: self.numFFT/2), count: self.melSampleCount)

    // Step 2 - we need to create a 200 x 3000 matrix of STFTs - note we appear to want to output complex numbers (?)
    for m in 0 ..< self.melSampleCount
    {
        // Slice out numFFT samples every hopCount (barf) and make a mel spectrum out of it.
        // audioFrame ends up holding split-complex numbers.
        var audioFrame = Array<Float>(audioFloat[(m * self.hopCount) ..< ((m * self.hopCount) + self.numFFT)])

        // Copy of audioFrame's original samples
        let audioFrameOriginal = audioFrame

        assert(audioFrame.count == self.numFFT)

        // Split-complex arrays holding a single FFT result of our audio frame,
        // which gets appended to the allSample split-complex arrays
        var sampleReal: [Float] = [Float](repeating: 0, count: self.numFFT/2)
        var sampleImaginary: [Float] = [Float](repeating: 0, count: self.numFFT/2)

        sampleReal.withUnsafeMutableBytes { unsafeReal in
            sampleImaginary.withUnsafeMutableBytes { unsafeImaginary in

                vDSP.multiply(audioFrame,
                              hanningWindow,
                              result: &audioFrame)

                var complexSignal = DSPSplitComplex(realp: unsafeReal.bindMemory(to: Float.self).baseAddress!,
                                                    imagp: unsafeImaginary.bindMemory(to: Float.self).baseAddress!)

                audioFrame.withUnsafeBytes { unsafeAudioBytes in
                    vDSP.convert(interleavedComplexVector: [DSPComplex](unsafeAudioBytes.bindMemory(to: DSPComplex.self)),
                                 toSplitComplexVector: &complexSignal)
                }

                // Step 3 - creating the FFT
                self.fft.forward(input: complexSignal, output: &complexSignal)
            }
        }

        // We need to match: https://pytorch.org/docs/stable/generated/torch.stft.html
        // At this point, I'm unsure how to continue?

        // let twoπ = Float.pi * 2
        // let freqstep: Float = Float(16000 / (self.numFFT/2))
        //
        // var w: Float = 0.0
        // for k in 0 ..< self.numFFT/2
        // {
        //     let j: Float = sampleImaginary[k]
        //     let sample = audioFrame[k]
        //
        //     let exponent = -j * ((twoπ * freqstep * Float(k)) / Float(self.numFFT/2))
        //
        //     w += powf(sample, exponent)
        // }

        allSampleReal[m] = sampleReal
        allSampleImaginary[m] = sampleImaginary
    }

    // We now have allSample split-complex arrays holding 3000 FFT results of 200 dimensions each.
    // We create flattened 3000 x 200 arrays backing DSPSplitComplex values.
    var flattenedReal: [Float] = allSampleReal.flatMap { $0 }
    var flattenedImaginary: [Float] = allSampleImaginary.flatMap { $0 }
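For what it's worth, the −j in the PyTorch formula is the imaginary unit, so exp(−j·θ) = cos(θ) − j·sin(θ) by Euler's formula, and the sum is a plain DFT bin. A naive, unoptimized Swift sketch of the quoted formula for a single bin, useful only as a reference to validate a vDSP path against (the names stftBin, winLength, hopLength are illustrative, not from the code above):

import Foundation

// Direct evaluation of X[ω, m]; O(n^2) per frame, for testing only.
func stftBin(input: [Float], window: [Float],
             m: Int, omega: Int, hopLength: Int, winLength: Int) -> (re: Float, im: Float) {
    var re: Float = 0
    var im: Float = 0
    for k in 0..<winLength {
        let sample = window[k] * input[m * hopLength + k]
        // exp(-j·θ) = cos(θ) - j·sin(θ), with θ = 2π·ω·k / winLength
        let theta = 2 * Float.pi * Float(omega * k) / Float(winLength)
        re += sample * cosf(theta)
        im -= sample * sinf(theta)
    }
    return (re, im)
}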

What causes the iteration of an array to conclude prematurely in Metal?

What I'm trying to do
I'm testing out Metal's capability to work with loops. Since I can't define new constants in Metal, I'm passing a uint into the buffer and using it to iterate over an array filled with integers. It looks like this in Swift:
let array1: [Int] = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
The problem(s)
However, when reading the result array buffer in Swift after completing the loop in Metal, it seems like not every element has been assigned.
#include <metal_stdlib>
using namespace metal;

kernel void shader(constant int *arr [[ buffer(0) ]],
                   device int *resultArray [[ buffer(1) ]],
                   constant uint &iter [[ buffer(2) ]]) // value of 12
{
    for (uint i = 0; i < iter; i++) {
        resultArray[i] = arr[i];
    }
}
out
1
2
3
4
5
6
0
0
0
0
0
0
Similarly, using the iterator to assign each element of resultArray yields strange results:
for (uint i = 0; i < iter; i++) {
    resultArray[i] = i;
}
out
4294967296
12884901890
21474836484
30064771078
38654705672
47244640266
0
0
0
0
0
0
Multiplication seems to work
for (uint i = 0; i < iter; i++) {
    resultArray[i] = arr[i] * i;
}
out
0
4
12
24
40
60
0
0
0
0
0
0
Addition does not
for (uint i = 0; i < iter; i++) {
    resultArray[i] = arr[i] + i;
}
out
4294967297
12884901892
21474836487
30064771082
38654705677
47244640272
0
0
0
0
0
0
However, when I set iter to a value of, for example, 24 or higher, it at least iterates over the whole array of size 12.
for (uint i = 0; i < iter; i++) { // iter now value of 100
    resultArray[i] = arr[i] * iter;
}
100
200
300
400
500
600
100
200
300
400
500
600
What is going on here?
MCVE
yes, it's a lot of code to get a simple loop running in Metal, please bear with me
main.swift
import MetalKit

let array1: [Int] = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]

func gpuProcess(arr1: [Int]) {
    let size = arr1.count // value of 12

    // GPU we want to use
    let device = MTLCreateSystemDefaultDevice()

    // Fifo queue for sending commands to the gpu
    let commandQueue = device?.makeCommandQueue()

    // The library for getting our metal functions
    let gpuFunctionLibrary = device?.makeDefaultLibrary()

    // Grab gpu function
    let additionGPUFunction = gpuFunctionLibrary?.makeFunction(name: "shader")

    var additionComputePipelineState: MTLComputePipelineState!
    do {
        additionComputePipelineState = try device?.makeComputePipelineState(function: additionGPUFunction!)
    } catch {
        print(error)
    }

    // Create buffers to be sent to the gpu from our array
    let arr1Buff = device?.makeBuffer(bytes: arr1,
                                      length: MemoryLayout<Int>.size * size,
                                      options: .storageModeShared)
    let resultBuff = device?.makeBuffer(length: MemoryLayout<Int>.size * size,
                                        options: .storageModeShared)

    // Create the buffer to be sent to the command queue
    let commandBuffer = commandQueue?.makeCommandBuffer()

    // Create an encoder to set values on the compute function
    let commandEncoder = commandBuffer?.makeComputeCommandEncoder()
    commandEncoder?.setComputePipelineState(additionComputePipelineState)

    // Set the parameters of our gpu function
    commandEncoder?.setBuffer(arr1Buff, offset: 0, index: 0)
    commandEncoder?.setBuffer(resultBuff, offset: 0, index: 1)

    // Set parameters for our iterator
    var count = size
    commandEncoder?.setBytes(&count, length: MemoryLayout.size(ofValue: count), index: 2)

    // Figure out how many threads we need to use for our operation
    let threadsPerGrid = MTLSize(width: 1, height: 1, depth: 1)
    let maxThreadsPerThreadgroup = additionComputePipelineState.maxTotalThreadsPerThreadgroup // 1024
    let threadsPerThreadgroup = MTLSize(width: maxThreadsPerThreadgroup, height: 1, depth: 1)
    commandEncoder?.dispatchThreads(threadsPerGrid,
                                    threadsPerThreadgroup: threadsPerThreadgroup)

    // Tell encoder that it is done encoding. Now we can send this off to the gpu.
    commandEncoder?.endEncoding()

    // Push this command to the command queue for processing
    commandBuffer?.commit()

    // Wait until the gpu function completes before working with any of the data
    commandBuffer?.waitUntilCompleted()

    // Get the pointer to the beginning of our data
    var resultBufferPointer = resultBuff?.contents().bindMemory(to: Int.self,
                                                                capacity: MemoryLayout<Int>.size * size)

    // Print out all of our new added together array information
    for _ in 0..<size {
        print("\(Int(resultBufferPointer!.pointee) as Any)")
        resultBufferPointer = resultBufferPointer?.advanced(by: 1)
    }
}

// Call function
gpuProcess(arr1: array1)
compute.metal
#include <metal_stdlib>
using namespace metal;

kernel void shader(constant int *arr [[ buffer(0) ]],
                   device int *resultArray [[ buffer(1) ]],
                   constant uint &iter [[ buffer(2) ]]) // value of 12
{
    for (uint i = 0; i < iter; i++) {
        resultArray[i] = arr[i] * iter;
    }
}
You are using 64-bit Int in Swift and 32-bit int in MSL. Your GPU threads are also overlapping their work. Instead, use Int32 in Swift and make each thread process its own piece of data, like this:
import MetalKit

let array1: [Int32] = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]

func gpuProcess(arr1: [Int32]) {
    let size = arr1.count // value of 12

    // GPU we want to use
    let device = MTLCreateSystemDefaultDevice()

    // FIFO queue for sending commands to the GPU
    let commandQueue = device?.makeCommandQueue()

    // The library for getting our Metal functions
    let gpuFunctionLibrary = device?.makeDefaultLibrary()

    // Grab GPU function
    let additionGPUFunction = gpuFunctionLibrary?.makeFunction(name: "shader")

    var additionComputePipelineState: MTLComputePipelineState!
    do {
        additionComputePipelineState = try device?.makeComputePipelineState(function: additionGPUFunction!)
    } catch {
        print(error)
    }

    // Create buffers to be sent to the GPU from our array
    let arr1Buff = device?.makeBuffer(bytes: arr1,
                                      length: MemoryLayout<Int32>.stride * size,
                                      options: .storageModeShared)
    let resultBuff = device?.makeBuffer(length: MemoryLayout<Int32>.stride * size,
                                        options: .storageModeShared)

    // Create the buffer to be sent to the command queue
    let commandBuffer = commandQueue?.makeCommandBuffer()

    // Create an encoder to set values on the compute function
    let commandEncoder = commandBuffer?.makeComputeCommandEncoder()
    commandEncoder?.setComputePipelineState(additionComputePipelineState)

    // Set the parameters of our GPU function
    commandEncoder?.setBuffer(arr1Buff, offset: 0, index: 0)
    commandEncoder?.setBuffer(resultBuff, offset: 0, index: 1)

    // Set parameters for our iterator; use a 4-byte value to match the kernel's uint
    var count = UInt32(size)
    commandEncoder?.setBytes(&count, length: MemoryLayout.size(ofValue: count), index: 2)

    // Figure out how many threads we need to use for our operation: one thread per element
    let threadsPerGrid = MTLSize(width: size, height: 1, depth: 1)
    let maxThreadsPerThreadgroup = additionComputePipelineState.maxTotalThreadsPerThreadgroup // 1024
    let threadsPerThreadgroup = MTLSize(width: maxThreadsPerThreadgroup, height: 1, depth: 1)
    commandEncoder?.dispatchThreads(threadsPerGrid,
                                    threadsPerThreadgroup: threadsPerThreadgroup)

    // Tell the encoder that it is done encoding. Now we can send this off to the GPU.
    commandEncoder?.endEncoding()

    // Push this command to the command queue for processing
    commandBuffer?.commit()

    // Wait until the GPU function completes before working with any of the data
    commandBuffer?.waitUntilCompleted()

    // Get the pointer to the beginning of our data (capacity is in elements, not bytes)
    var resultBufferPointer = resultBuff?.contents().bindMemory(to: Int32.self,
                                                                capacity: size)

    // Print out our result array
    for _ in 0..<size {
        print(resultBufferPointer!.pointee)
        resultBufferPointer = resultBufferPointer?.advanced(by: 1)
    }
}

// Call function
gpuProcess(arr1: array1)
Kernel:
#include <metal_stdlib>
using namespace metal;

kernel void shader(constant int *arr [[ buffer(0) ]],
                   device int *resultArray [[ buffer(1) ]],
                   constant uint &iter [[ buffer(2) ]], // value of 12
                   uint gid [[ thread_position_in_grid ]]) // thread index in the grid; since height and depth of the dispatch are 1 in the CPU code, you can use a 1D index here
{
    // Early out if gid is out of array bounds
    if (gid >= iter)
    {
        return;
    }

    // Each thread processes its own piece of data
    resultArray[gid] = arr[gid] * iter;
}
For more information on how to use Metal for compute, refer to the developer docs, and for information about attributes such as thread_position_in_grid, refer to the Metal Shading Language specification.
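As a side note, the question's "strange" numbers are exactly what the size mismatch predicts: each Swift Int spans eight bytes, so every 64-bit slot read back on the CPU contains two of the kernel's 32-bit writes. A quick check of the arithmetic in plain Swift (no Metal involved):

// With resultArray[i] = i, the 32-bit writes for i = 2 and i = 3 land in
// the second 8-byte Int that Swift reads: low word 2, high word 3.
let packed = (UInt64(3) << 32) | UInt64(2)
print(packed) // 12884901890, the second value in the question's output

The first value, 4294967296, is likewise (1 << 32) | 0.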

vDSP_conv occasionally returns NANs

I'm using vDSP_conv to perform autocorrelation. Mostly it works just fine but every so often it's filling the output array with NaNs.
The code:
func corr_test() {
    var pass = 0
    var x = [Float]()
    for i in 0..<2000 {
        x.append(Float(i))
    }
    while true {
        print("pass \(pass)")
        let corr = autocorr(x)
        if corr[1].isNaN {
            print("!!!")
        }
        pass += 1
    }
}

func autocorr(a: [Float]) -> [Float] {
    let resultLen = a.count * 2 + 1
    let padding = [Float].init(count: a.count, repeatedValue: 0.0)
    let a_pad = padding + a + padding
    var result = [Float].init(count: resultLen, repeatedValue: 0.0)
    vDSP_conv(a_pad, 1, a_pad, 1, &result, 1, UInt(resultLen), UInt(a_pad.count))
    return result
}
The output:
pass ...
pass 169
pass 170
pass 171
(lldb) p corr
([Float]) $R0 = 4001 values {
[0] = 2.66466637E+9
[1] = NaN
[2] = NaN
[3] = NaN
[4] = NaN
...
I'm not sure what's going on here. I think I'm handling the zero padding correctly, since if I weren't, I don't think I'd be getting correct results 99% of the time.
Ideas? Gracias.
Figured it out. The key was this comment from https://developer.apple.com/library/mac/samplecode/vDSPExamples/Listings/DemonstrateConvolution_c.html :
// “The signal length is padded a bit. This length is not actually passed to the vDSP_conv routine; it is the number of elements
// that the signal array must contain. The SignalLength defined below is used to allocate space, and it is the filter length
// rounded up to a multiple of four elements and added to the result length. The extra elements give the vDSP_conv routine
// leeway to perform vector-load instructions, which load multiple elements even if they are not all used. If the caller did not
// guarantee that memory beyond the values used in the signal array were accessible, a memory access violation might result.”
“Padded a bit.” Thanks for being so specific. Anyway, here's the final working product:
func autocorr(a: [Float]) -> [Float] {
    let filterLen = a.count
    let resultLen = filterLen * 2 - 1
    let signalLen = ((filterLen + 3) & 0xFFFFFFFC) + resultLen
    let padding1 = [Float].init(count: a.count - 1, repeatedValue: 0.0)
    let padding2 = [Float].init(count: (signalLen - padding1.count - a.count), repeatedValue: 0.0)
    let signal = padding1 + a + padding2
    var result = [Float].init(count: resultLen, repeatedValue: 0.0)
    vDSP_conv(signal, 1, a, 1, &result, 1, UInt(resultLen), UInt(filterLen))
    // Remove the first n-1 values, which are just mirrored from the end, so that [0] always has the autocorrelation.
    result.removeFirst(filterLen - 1)
    return result
}
Note that the results here aren't normalized.
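For anyone using this today, a minimal sketch of the same routine in current Swift syntax (the snippet above uses Swift 2 era initializers); the padding logic is unchanged:

import Accelerate

func autocorr(_ a: [Float]) -> [Float] {
    let filterLen = a.count
    let resultLen = filterLen * 2 - 1
    // Round the filter length up to a multiple of 4, then add the result
    // length, so vDSP_conv's vector loads never touch unmapped memory.
    let signalLen = ((filterLen + 3) & ~3) + resultLen
    let padding1 = [Float](repeating: 0, count: filterLen - 1)
    let padding2 = [Float](repeating: 0, count: signalLen - padding1.count - filterLen)
    let signal = padding1 + a + padding2
    var result = [Float](repeating: 0, count: resultLen)
    vDSP_conv(signal, 1, a, 1, &result, 1, vDSP_Length(resultLen), vDSP_Length(filterLen))
    // Drop the mirrored first n-1 values so result[0] is the zero-lag term.
    result.removeFirst(filterLen - 1)
    return result
}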

Type conversion - string of characters to integer

Hello, I am writing my program in C, using PSoC tools to program my Cypress development kit. I am facing an issue regarding type conversion of a string of characters collected in my circular buffer (buffer) to a local variable input_R, and ultimately to a global variable st_input_R. The event in my FSM calling this action function is given below:
void st_state_5_event_0(void) //S6 OR S4
{
    char buffer[ST_NODE_LIMIT] = {0};
    st_copy_buffer(buffer);

    uint32 input_R = {0};
    mi_utoa(input_R, buffer);

    if ((input_R >= 19000) && (input_R <= 26000))
    {
        st_input_R = input_R;
        _st_data.state = ST_STATE_6;
    }
    else
    {
        _st_data.status = ST_STATE_4;
    }

    UART_1_Stop();
    st_stop();
    st_empty_buffer();
}
ST_NODE_LIMIT = 64
st_copy_buffer copies the numbers I type in using HyperTerminal to the circular buffer named "buffer".
input_R is the 32 bit integer I want the buffer content to be converted to.
mi_utoa is the function I am using for converting the contents in the buffer to input_R and is detailed below:
uint8 mi_utoa(uint32 number, char *string)
{
    uint8 result = MI_BAD_ARGUMENT;
    if (string != NULL)
    {
        uint8 c = 0;
        uint8 i = 0;
        uint8 j = 0;
        do
        {
            string[i++] = number % 10 + '0';
        } while ((number /= 10) > 0);
        string[i] = '\0';

        for (i = 0, j = strlen(string) - 1; i < j; i++, j--)
        {
            c = string[i];
            string[i] = string[j];
            string[j] = c;
        }
        result = MI_SUCCESS;
    }
    return result;
}
The problem is: suppose I enter 21500(+\r). The mi_utoa function converts the first digit to 0 and the second digit to \000, while the other digits, including the carriage return "\r", remain unaltered. As a result, input_R is NOT 21500. This happens for any string of digits I input, so the condition "if ((input_R >= 19000) && (input_R <= 26000))" is never satisfied. Hence the FSM returns to state 4 all the time and I am going in circles.
Can you please advise where the bug is in the mi_utoa function? Let me know if you want to know any other details.
Your function st_state_5_event_0() sets the value of input_R to zero. Then you call mi_utoa(), which converts the value of input_R to the ASCII string "0".
void st_state_5_event_0(void) //S6 OR S4
{
    char buffer[ST_NODE_LIMIT] = {0};
    //what is the value of buffer after this statement?
    st_copy_buffer(buffer);
    //the value of input_R after the next statement is 0
    uint32 input_R = {0};
    //conversion of input_R to string will give "0"
    mi_utoa(input_R, buffer);
    if ((input_R >= 19000) && (input_R <= 26000))
    {
        st_input_R = input_R;
        _st_data.state = ST_STATE_6;
    }
    //...
}
You probably want a function which converts your ASCII buffer to a number.
uint8
mi_atou(uint32* number, char *string)
{
    uint8 result = MI_BAD_ARGUMENT;
    if (!string) return result;
    if (!number) return result;

    uint8 ndx = 0;
    uint32 accum = 0;
    for (ndx = 0; string[ndx]; ++ndx)
    {
        if ((string[ndx] >= '0') && (string[ndx] <= '9'))
        {
            accum = accum*10 + (string[ndx]-'0');
            //printf("[%d] %s -> %d\n", ndx, string, accum);
        }
        else break;
    }
    //printf("[%d] %s -> %d\n", ndx, string, accum);
    *number = accum;
    result = MI_SUCCESS;
    return result;
}
Which you would call by providing the address of the number that receives the result:
mi_atou(&input_R, buffer);
For example, if the buffer holds "21500\r", the loop accumulates 2, 21, 215, 2150, 21500 and stops at the '\r', so input_R ends up as 21500.

How to encode using the FFMpeg in Android (using H263)

I am trying to follow the sample code on encoding in the ffmpeg documentation, and I have successfully built an application that encodes and generates an mp4 file, but I face the following problems:
1) I am using H263 for encoding, but I can only set the width and height of the AVCodecContext to 176x144; for other sizes (like 720x480 or 640x480) it will return fail.
2) I can't play the output mp4 file using the default Android player. Doesn't it support H263 mp4 files? P.S. I can play it using other players.
3) Is there any sample code on encoding other video frames to make a new video (meaning decode a video and encode it back with different quality settings; I would also like to modify the frame content)?
Here is my code, thanks!
JNIEXPORT jint JNICALL Java_com_ffmpeg_encoder_FFEncoder_nativeEncoder(JNIEnv* env, jobject thiz, jstring filename){

    LOGI("nativeEncoder()");

    avcodec_register_all();
    avcodec_init();
    av_register_all();

    AVCodec *codec;
    AVCodecContext *codecCtx;
    int i;
    int out_size;
    int size;
    int x;
    int y;
    int output_buffer_size;
    FILE *file;
    AVFrame *picture;
    uint8_t *output_buffer;
    uint8_t *picture_buffer;

    /* Manual Variables */
    int l;
    int fps = 30;
    int videoLength = 5;

    /* find the H263 video encoder */
    codec = avcodec_find_encoder(CODEC_ID_H263);
    if (!codec) {
        LOGI("avcodec_find_encoder() run fail.");
    }

    codecCtx = avcodec_alloc_context();
    picture = avcodec_alloc_frame();

    /* put sample parameters */
    codecCtx->bit_rate = 400000;
    /* resolution must be a multiple of two */
    codecCtx->width = 176;
    codecCtx->height = 144;
    /* frames per second */
    codecCtx->time_base = (AVRational){1,fps};
    codecCtx->pix_fmt = PIX_FMT_YUV420P;
    codecCtx->codec_id = CODEC_ID_H263;
    codecCtx->codec_type = AVMEDIA_TYPE_VIDEO;

    /* open it */
    if (avcodec_open(codecCtx, codec) < 0) {
        LOGI("avcodec_open() run fail.");
    }

    const char* mfileName = (*env)->GetStringUTFChars(env, filename, 0);

    file = fopen(mfileName, "wb");
    if (!file) {
        LOGI("fopen() run fail.");
    }

    (*env)->ReleaseStringUTFChars(env, filename, mfileName);

    /* alloc image and output buffer */
    output_buffer_size = 100000;
    output_buffer = malloc(output_buffer_size);

    size = codecCtx->width * codecCtx->height;
    picture_buffer = malloc((size * 3) / 2); /* size for YUV 420 */

    picture->data[0] = picture_buffer;
    picture->data[1] = picture->data[0] + size;
    picture->data[2] = picture->data[1] + size / 4;
    picture->linesize[0] = codecCtx->width;
    picture->linesize[1] = codecCtx->width / 2;
    picture->linesize[2] = codecCtx->width / 2;

    for (l = 0; l < videoLength; l++) {
        //encode 1 second of video
        for (i = 0; i < fps; i++) {
            //prepare a dummy image YCbCr
            //Y
            for (y = 0; y < codecCtx->height; y++) {
                for (x = 0; x < codecCtx->width; x++) {
                    picture->data[0][y * picture->linesize[0] + x] = x + y + i * 3;
                }
            }
            //Cb and Cr
            for (y = 0; y < codecCtx->height/2; y++) {
                for (x = 0; x < codecCtx->width/2; x++) {
                    picture->data[1][y * picture->linesize[1] + x] = 128 + y + i * 2;
                    picture->data[2][y * picture->linesize[2] + x] = 64 + x + i * 5;
                }
            }

            //encode the image
            out_size = avcodec_encode_video(codecCtx, output_buffer, output_buffer_size, picture);
            fwrite(output_buffer, 1, out_size, file);
        }

        //get the delayed frames
        for (; out_size; i++) {
            out_size = avcodec_encode_video(codecCtx, output_buffer, output_buffer_size, NULL);
            fwrite(output_buffer, 1, out_size, file);
        }
    }

    //add sequence end code to have a real mpeg file
    output_buffer[0] = 0x00;
    output_buffer[1] = 0x00;
    output_buffer[2] = 0x01;
    output_buffer[3] = 0xb7;
    fwrite(output_buffer, 1, 4, file);
    fclose(file);
    free(picture_buffer);
    free(output_buffer);
    avcodec_close(codecCtx);
    av_free(codecCtx);
    av_free(picture);

    LOGI("finish");

    return 0;
}
H263 accepts only certain resolutions:
128 x 96
176 x 144
352 x 288
704 x 576
1408 x 1152
It will fail with anything else.
The code supplied in the question (I used it myself at first) seems to generate only a very rudimentary container format, if any.
I found that this example, http://cekirdek.pardus.org.tr/~ismail/ffmpeg-docs/output-example_8c-source.html, worked much better, as it creates a real container for the video and audio streams. My video is now displayable on the Android device.