flvmux not pulling video at same rate as audio - raspberry-pi

I have a pipeline intended to capture audio and video from a C920 camera, do some very simple processing on it (low CPU requirements), then recompress it and mux it to file.
This is a general outline of the pipeline (the diagram is not reproduced here; the queues referenced below are numbered Q1 through Q6):
Platform:
- Raspberry Pi 3
- Debian Jessie
- GStreamer 1.8
Don't worry about my 'simple processing' area; my overall CPU usage sits below 25%.
What I find is that Q3 and Q4 slowly fill up until one hits its threshold, and then my audio goes all choppy (and I get warnings from alsasrc that downstream is not consuming buffers fast enough).
I can make the queues leaky, but that hardly resolves the underlying issue.
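(By 'leaky' I mean the queue element's leaky property, which drops old buffers instead of blocking once the queue fills; a minimal, hypothetical illustration rather than my actual pipeline:)
gst-launch-1.0 videotestsrc ! queue max-size-time=1000000000 leaky=downstream ! fakesink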
While the pipeline is running, this is what my queues look like (current-level-time, in milliseconds):
QUEUE CONTENTS IN MILLISECONDS
TIME(s) Q1 Q2 Q3 Q4 Q5 Q6
0 0 0 0 0 0 0
5 0 0 252 380 0 0
10 0 0 293 460 0 0
15 0 0 332 470 0 0
20 0 0 378 451 0 0
25 0 0 333 460 0 0
30 0 0 383 480 0 0
35 0 0 500 550 0 0
40 0 0 500 610 0 0
45 0 0 539 630 0 0
50 0 0 584 670 0 0
=== EXPERIMENT ===
I removed the yellow (audio) leg of the pipeline, so that I was capturing only video, and the result was better: no queue kept growing, and the output video was perfect.
QUEUE CONTENTS IN MILLISECONDS
TIME(s) Q1 Q2 Q3 Q4 Q5 Q6
0 0 0 0 0 0 0
5 0 0 2 0 0 0
10 0 0 5 0 0 0
15 0 0 8 0 0 0
20 0 0 8 0 0 0
25 0 0 8 0 0 0
30 0 0 8 0 0 0
35 0 0 8 0 0 0
40 0 0 8 0 0 0
45 0 0 8 0 0 0
50 0 0 8 0 0 0
I also tried another pipeline (diagram not shown; I omitted the queues from it), with complete success: video recorded for at least 10 minutes with no issues.
=== THE QUESTION ===
What is going on?
My guess is that because Q3 (the video output queue) is filling up, the audio leg must be slowing things down.
Because Q4 is filling up and Q5 is NOT, that must mean that ALSA is producing audio more quickly than the AAC encoder can compress it. Is that correct?
However, my CPU usage is very low. I've tried two AAC encoders (voaacenc and avenc_aac) and an MP3 encoder, all with the same issue.
======== UPDATE =========
I put an identity element directly after the audio source and another directly after the video source, and charted the PTS values of the buffers they pass. You can see that the two very quickly start drifting apart: by the time the video is at 30 seconds, the audio is well behind, at 21 seconds.
[Chart omitted: audio and video buffer PTS plotted over time, steadily diverging.]
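For reference, this drift can be watched from the command line: with silent=false and -v, gst-launch prints each buffer's PTS via identity's last-message property. A minimal sketch with test sources, not the actual capture pipeline:
gst-launch-1.0 -v videotestsrc ! identity name=vid silent=false ! fakesink \
               audiotestsrc ! identity name=aud silent=false ! fakesink | grep "pts:"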
======== UPDATE 2 =========
I had a second camera, so I swapped it in, and the problem went away: the audio and video PTS values stayed in sync for at least 25 minutes.
The difference with this new camera is that it's a modified C920 with a custom lens fitted. The lens coincidentally happened to be pulled completely out of focus, and that is what fixed the PTS drift (if I bring the custom lens into focus, I get the same PTS drift).
So the question has changed a little: why does an in-focus C920 camera drift its PTS so badly?
Note: I am turning off auto-exposure and setting the exposure-absolute value to its default of 250.
I would prefer to be able to use auto-exposure, however...
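For reference, this is roughly how those exposure settings can be applied from the command line (the control names are the usual UVC ones; verify them with v4l2-ctl -l on your device):
v4l2-ctl -d /dev/video0 --set-ctrl=exposure_auto=1        # 1 = manual mode
v4l2-ctl -d /dev/video0 --set-ctrl=exposure_absolute=250  # the default exposure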

OK, I have resolved the issue. For anybody reading :)
If you are using a Raspberry Pi, even a v3, ensure you have configured peak-bitrate to no more than 3650000 (3.65 Mbps) on your uvch264src. I am also capturing audio at 24 kHz; if you're not doing that, you might be able to eke out a little more.
If you set it higher, or omit it entirely, you will experience the same odd issues I had: movement and high detail in the video footage will push the encoded H.264 beyond what the Pi can handle, so your symptoms will be odd and sporadic.
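As an illustration, a pipeline along these lines applies the fix; treat it as a sketch only (the device path, caps, and the rest of the topology are assumptions, not my exact pipeline):
gst-launch-1.0 -e uvch264src device=/dev/video0 name=src auto-start=true \
    rate-control=vbr initial-bitrate=3000000 peak-bitrate=3650000 \
  src.vfsrc ! queue ! video/x-raw,width=320,height=240,framerate=10/1 ! fakesink \
  src.vidsrc ! queue ! video/x-h264,width=1280,height=720,framerate=30/1 ! h264parse ! mux. \
  alsasrc ! queue ! audio/x-raw,rate=24000 ! audioconvert ! voaacenc ! aacparse ! mux. \
  flvmux name=mux ! filesink location=out.flv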
I can only think that the C920 is saturating the USB bus, which is odd, because USB 2.0 is supposed to be good for up to 480 Mbps, and the limit I've set is 3.65 Mbps.
I have heard that the Raspberry Pi has a very flawed USB firmware blob, but I had never encountered it until now.
Problem solved. I have been considering moving to a DragonBoard... this might have given me the best reason yet.

Related

What is the easiest way on a binary image to check if two pixels are connected? (In Matlab)

Consider this binary image:
0 1 0 0 0
0 1 0 0 0
0 1 1 0 0
0 0 0 0 0
0 0 0 1 0
I am looking for a function that takes two coordinates as parameters and returns a boolean value stating whether the two pixels are connected (by 4- or 8-connectivity), like this:
f([1,2],[3,3]) -> true;
f([1,2],[5,4]) -> false;
I know there must be an easy algorithm, and there are functions in Matlab that do much more (bwdist, bwconncomp), but I'm looking for a simpler way.
Thanks for the help!
Your alternatives are to flood fill from one pixel and then check the other, to label all connected components and compare labels, or to do A* pathfinding. A* will probably produce the fastest results if most of the pairs are close together but sit in large shapes; it's also the most complicated of the three methods.
Matlab has connected-component labelling built in (bwlabel). It's not a particularly complicated algorithm. If you check my binary image processing library, you can find C implementations of all three methods.
https://github.com/MalcolmMcLean/binaryimagelibrary
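For instance, the labelling approach is only a few lines in Matlab; a sketch ('connected' is a hypothetical name, and coordinates are taken as [row, col]):
function tf = connected(bw, p1, p2, conn)
    if nargin < 4, conn = 8; end              % default to 8-connectivity
    L = bwlabel(bw, conn);                    % label every connected component
    tf = L(p1(1), p1(2)) ~= 0 && ...          % both pixels must be foreground
         L(p1(1), p1(2)) == L(p2(1), p2(2));  % and carry the same label
end
With the image above, connected(bw, [1 2], [3 3]) returns true and connected(bw, [1 2], [5 4]) returns false, as requested.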

Matlab: I/O Delay detection

I have a continuous process with 3 inputs and 1 output. The 3 inputs act consecutively in time: the output lags Input 1 by 30 minutes, Input 2 by 15, and so on.
My dataset below shows a startup for the system after a shutdown:
I1 I2 I3 Out
0 0 0 0
3 0 0 0
8 4 0 0
13 8 6 0
22 13 9 3.2
You can see how Input 1 started up and everything else followed.
My question: what should I look for in Matlab in order to determine such I/O delays in more complex datasets?
You should take a close look at xcorr.
xcorr computes the cross-correlation between two vectors (typically time signals), measuring their similarity as a function of a time shift between them. A constant I/O lag should show up as a local maximum of the correlation coefficient at the corresponding shift.
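A minimal sketch (assumes the Signal Processing Toolbox; i1 and y are hypothetical vectors holding Input 1 and the output, sampled every 15 minutes):
[c, lags] = xcorr(y - mean(y), i1 - mean(i1));  % cross-correlate de-meaned signals
[~, k] = max(c);                                % find the best-matching shift
d = lags(k);                                    % d > 0 means y lags i1 by d samples
delayMinutes = d * 15;                          % one sample every 15 minutes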

comparing images matlab

OK, so let's say I have a binary image containing the pixel representation for 1, 2, A, B, or whatever. But for now let's just consider 1:
0 0 0 0
0 1 1 0
0 1 1 0
0 1 1 0
0 1 1 0
0 0 0 0
and then I have another image containing the standard representation of 1.
Now what I want is to compare these two images and decide whether my first image contains the pixel values for 1 or not.
What kinds of algorithms are at my disposal?
Please note: I do not require the name of a Matlab function for image comparison, as has been the answer to similar questions. Rather, I need the names of algorithms that can be used to solve this problem, so that I can implement one myself in C#.
What you need to compute is the distance between your image and the ground truth. This distance can be defined in many different ways; search Google for similarity measures on binary data. See here for a review.
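Two of the simplest such measures, sketched in Matlab and easy to port to C# (the template B here is hypothetical):
A = logical([0 0 0 0; 0 1 1 0; 0 1 1 0; 0 1 1 0; 0 1 1 0; 0 0 0 0]);  % the "1" above
B = logical([0 1 1 0; 0 1 1 0; 0 1 1 0; 0 1 1 0; 0 1 1 0; 0 0 0 0]);  % a template of "1"
hammingDist = nnz(A ~= B) / numel(A)   % normalized Hamming distance (0 = identical)
jaccardSim  = nnz(A & B) / nnz(A | B)  % Jaccard index (1 = identical)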

Setting up an ANN to classify Tic-Tac-Toe End-Games

I'm having a hard time setting up a neural network to classify Tic-Tac-Toe board states (final or intermediate) as "X wins", "O wins" or "Tie".
I will describe my current solution and results. Any advice is appreciated.
* DATA SET *
Dataset = 958 possible end-games + 958 random-games = 1916 board states
(random games might be incomplete but are all legal, i.e. they do not have both players winning simultaneously).
Training set = 1600 random sample of Dataset
Test set = remaining 316 cases
In my current pseudo-random development scenario the dataset has the following characteristics.
Training set:
- 527 wins for "X"
- 264 wins for "O"
- 809 ties
Test set:
- 104 wins for "X"
- 56 wins for "O"
- 156 ties
* Modelling *
Input Layer: 18 input neurons, one for each (board position, player) pair. Therefore,
the board (B=blank):
x x o
o x B
B o x
is encoded as:
1 0 1 0 0 1 0 1 1 0 0 0 0 0 0 1 1 0
Output Layer: 3 output neurons which correspond to each outcome (X wins, O wins, Tie).
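That encoding is mechanical; a sketch in Matlab (the function name is hypothetical; the board is given row-major with 1 = X, 2 = O, 0 = blank):
function v = encodeBoard(board)
    v = zeros(1, 18);
    v(2*find(board == 1) - 1) = 1;   % X marks occupy the odd positions
    v(2*find(board == 2))     = 1;   % O marks occupy the even positions
end
% encodeBoard([1 1 2 2 1 0 0 2 1]) reproduces the vector above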
* Architecture *
Based on: http://www.cs.toronto.edu/~hinton/csc321/matlab/assignment2.tar.gz
1 Single Hidden Layer
Hidden Layer activation function: Logistic
Output Layer activation function: Softmax
Error function: Cross-Entropy
* Results *
No combination of parameters seems to achieve 100% correct classification rate. Some examples:
NHidden  LRate   InitW  MaxEpoch  Epochs  FMom  Errors  TestErrors
8        0.0025  0.01   10000     4500    0.8   0       7
16       0.0025  0.01   10000     2800    0.8   0       5
16       0.0025  0.1    5000      1000    0.8   0       4
16       0.0025  0.5    5000      5000    0.8   3       5
16       0.0025  0.25   5000      1000    0.8   0       5
16       0.005   0.25   5000      1000    0.9   10      5
16       0.005   0.25   5000      5000    0.8   15      5
16       0.0025  0.25   5000      1000    0.8   0       5
32       0.0025  0.25   5000      1500    0.8   0       5
32       0.0025  0.5    5000      600     0.9   0       5
8        0.0025  0.25   5000      3500    0.8   0       5
Important: I am aware that the following could be improved:
- The dataset characteristics (source and quantities of training and test cases) aren't the best.
- An alternative problem modelling might be more suitable (a different encoding of the input/output neurons).
- A better network architecture might exist (number of hidden layers, activation/error functions, etc.).
Assuming that my current choices in these regards, even if not optimal, should not prevent the system from reaching a 100% correct classification rate, I would like to focus on other possible issues.
In other words, considering the simplicity of the game, this dataset/modelling/architecture should do it; therefore, what am I doing wrong regarding the parameters?
I do not have much experience with ANNs, and my main question is the following:
Using 16 hidden neurons, the ANN could learn to associate each hidden unit with "a certain player winning in a certain way":
(3 different rows + 3 different columns + 2 diagonals) * 2 players = 16
In this setting, an "optimal" set of weights is pretty straightforward: each hidden unit has "large" connection weights from the 3 input units corresponding to one line (row, column, or diagonal) of one player, and a "large" connection weight to the output unit corresponding to a win for that player.
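That hand-crafted solution can be written down directly; a sketch using the 18-input encoding above (variable names and the weight magnitude are arbitrary; biases and the Tie output are left out):
lines = [1 2 3; 4 5 6; 7 8 9; 1 4 7; 2 5 8; 3 6 9; 1 5 9; 3 5 7];  % the 8 winning lines
W1 = zeros(16, 18);                             % input -> hidden weights
for p = 0:1                                     % p = 0 for X, p = 1 for O
    for k = 1:8
        h = p*8 + k;                            % one hidden unit per (player, line)
        W1(h, 2*(lines(k,:) - 1) + 1 + p) = 5;  % large weights from that line's cells
    end
end
W2 = zeros(3, 16);                              % hidden -> output weights
W2(1, 1:8)  = 5;                                % X-line detectors feed "X wins"
W2(2, 9:16) = 5;                                % O-line detectors feed "O wins"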
No matter what I do, I cannot decrease the number of test errors, as the table above shows.
Any advice is appreciated.
You are doing everything right; you're simply trying to tackle a difficult problem here, namely generalizing from some examples of tic-tac-toe configurations to all the others.
Unfortunately, the simple neural network you use neither perceives the spatial structure of the input (neighbourhood) nor can it exploit the symmetries. So in order to get perfect test error, you can either:
increase the size of the dataset to include most (or all) possible configurations -- which the network will then be able to simply memorize, as indicated by the zero training error in most of your setups;
choose a different problem, where there is more structure to generalize from;
use a network architecture that can capture symmetry (e.g. through weight-sharing) and/or spatial relations of the inputs (e.g. different features). Convolutional networks are just one example of this.

PyTables table.where equivalent in matlab

I'm trying to find something in MATLAB similar to PyTables' table.where, which selects a subset of a dataset based on criteria (such as col1 == 4). So far, my searching has been completely fruitless. I can't believe such a useful feature wouldn't be supported somehow; can anyone help?
MATLAB ver R2011b.
EDIT: In case it wasn't clear from the question, I'm using an HDF5 file for data storage in MATLAB, hence my desire to find functionality similar to PyTables.
I think what you're trying to do involves either load-ing the file into memory (or you might give the HDF5 Diskmap Class a try if it's too big for memory).
Once you have access to your data in Matlab as a matrix, it's as easy as:
a = [
0 0 0 0 1;
0 1 0 0 1;
1 0 1 1 1;
0 1 1 1 1;
1 0 1 0 1];
a(a(:,1) == 1, :)   % rows where column 1 equals 1; logical indexing, no find() needed
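And if the data lives in an HDF5 file (as with PyTables), MATLAB R2011a and later can read it directly; a sketch ('/mytable' is a hypothetical dataset path):
a = h5read('data.h5', '/mytable');  % note: 2-D data may come back transposed
subset = a(a(:,1) == 1, :);         % rows where column 1 equals 1, like table.where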