Stream video with custom data - streaming

The end goal is to live stream video with custom data in the stream, from a mobile device to another client. I need to record data from sensors, and this data has to be time-aligned with the video. If I understood correctly, I can have, let's say, two tracks/streams: video and custom data. Then I can mux them into one stream with GStreamer. The custom data is just a basic class with 5 properties (strings and an int), and I will gather this data roughly every millisecond. The question is: how can I represent my custom data in order to combine the video and the custom data into one stream?
I am very new to all of this multimedia stuff.
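One common way to do this with GStreamer is to treat the sensor samples as their own timestamped stream and let the muxer interleave it with the video. Below is a minimal Python sketch under a few assumptions: the pipeline layout, the port and the element name sensorsrc are made up for illustration, JSON is only a placeholder serialization (a real KLV payload uses a 16-byte key plus length plus value), and bus/main-loop handling is omitted. It relies on mpegtsmux accepting meta/x-klv buffers as a data track, which keeps the sensor data and the video in one MPEG-TS stream with shared timestamps.

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import json

Gst.init(None)

# Test video plus an appsrc carrying the sensor samples, muxed into a single
# MPEG-TS stream; mpegtsmux accepts meta/x-klv buffers as a data track.
pipeline = Gst.parse_launch(
    "mpegtsmux name=mux ! udpsink host=127.0.0.1 port=5000 "
    "videotestsrc is-live=true ! videoconvert ! x264enc tune=zerolatency "
    "! h264parse ! mux. "
    'appsrc name=sensorsrc is-live=true format=time '
    'caps="meta/x-klv,parsed=true" ! mux.'
)
sensorsrc = pipeline.get_by_name("sensorsrc")
pipeline.set_state(Gst.State.PLAYING)

def push_sensor_sample(sample: dict, pts_ns: int):
    # Serialize the custom class (5 string/int properties); JSON is just a
    # placeholder, real KLV wraps the value in a key/length/value triplet.
    payload = json.dumps(sample).encode("utf-8")
    buf = Gst.Buffer.new_wrapped(payload)
    buf.pts = pts_ns            # this timestamp is what aligns data and video
    buf.duration = Gst.MSECOND  # one sample per millisecond
    sensorsrc.emit("push-buffer", buf)

# e.g. push_sensor_sample({"name": "gyro", "x": 12}, pts_ns=0)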

Related

How can you add custom metadata to a video (.mov, .mp4) with AVFoundation?

Although this question has been asked before, I still have not found a single working example of how to do this.
I have a golf swing analysis app that allows the user to save 10 positions in the video clip that can be navigated to with a shortcut. Until now I have saved the position data separately from the video file, which is less than ideal for obvious reasons, so I would like to save the position data as metadata in the video file.
Can anyone give me a working example of saving a String (I'll use Codable to encode the data structure, which is not very large) to a custom metadata key (e.g. "posi") given an AVAsset?

Adaptive bitrate streaming protocol (for any data)

I'm looking for some ideas/hints about a streaming protocol (similar to video/audio streaming) for sending arbitrary data in so-called real time.
In simple words:
I'm producing some data each second (let's say one array with 1 MB of data per second) and I'm sorting that data from most important to least important (e.g. putting it into priority queues or similar).
I would like to keep streaming that data via some protocol, and in the perfect case I would like to send all of it.
If that is not possible (bandwidth, dropped packets, etc.), I would like to send as much of each produced array as possible (the first n bytes), just to keep the data flowing (it is important to start sending each newly produced array every second).
And now I'm looking for a protocol/library that will handle the adaptive bitrate part for arbitrary data. I would expect it to tell me how much data I can send (put into send buffers, or a similar approach). The most similar thing is video/audio streaming, where in poor network conditions the (en)coder changes the quality depending on those conditions.
It is also OK if I lose some of the sent data (so UDP underneath all of this is fine), but preferably I would like to send as much data as possible per second without losing anything (from those first n bytes sent).
Do you have any ideas about which protocols/libraries I could use for the client/server? (Hopefully some libraries in Python, C or C++.)
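There may well be an existing library for this, but the per-second behaviour you describe is easy to sketch: keep a rate estimate and, each second, send only as many leading bytes of the freshly produced, priority-sorted array as the estimate allows, then move on. This Python sketch is only an illustration of that loop; the destination, packet header layout, chunk size and the fixed rate_estimate are placeholders, and a real implementation would adapt the estimate from receiver feedback (ACKs/loss reports) the way video encoders do.

import socket, time

MTU_PAYLOAD = 1200           # stay under a typical path MTU
DEST = ("127.0.0.1", 9000)   # hypothetical receiver
rate_estimate = 200_000      # bytes/second; a real sender would adapt this

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def produce_array() -> bytes:
    # Placeholder for the ~1 MB/s priority-sorted data source.
    return bytes(1_000_000)

seq = 0
while True:
    deadline = time.monotonic() + 1.0
    data = produce_array()
    budget = min(len(data), rate_estimate)
    sent = 0
    while sent < budget and time.monotonic() < deadline:
        chunk = data[sent:sent + MTU_PAYLOAD]
        header = seq.to_bytes(4, "big") + sent.to_bytes(4, "big")
        sock.sendto(header + chunk, DEST)   # best effort, losses are accepted
        sent += len(chunk)
    seq += 1
    # Sleep out the rest of the second so the next array starts on schedule.
    time.sleep(max(0.0, deadline - time.monotonic()))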
I think IPFIX (the generic NetFlow standard) has everything you need.
You can avoid a per-sample timestamp by sending a samplingInterval update every time you change your rate; the change in sampling can also be announced asynchronously.
As for where to put your data: you can create a new field or just use an existing one that has the datatype you want. E.g., if you are just sending uint64 sample values, it might be easier to use packetDeltaCount than to create your own field definition.
There are plenty of IPFIX libraries.
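To get a feel for what that looks like on the wire, here is a hand-rolled sketch of a single IPFIX message (RFC 7011) carrying uint64 samples in the existing packetDeltaCount element (IE 2), as suggested above. A real exporter would use one of those libraries instead of packing structs by hand; the template ID and observation domain below are arbitrary.

import struct, time

TEMPLATE_ID = 256  # data-set template IDs start at 256

def ipfix_message(samples, sequence, domain=1):
    # Template set: set id 2, length 12, one template with a single
    # field specifier (IE 2 = packetDeltaCount, 8 bytes).
    template_set = struct.pack("!HHHHHH", 2, 12, TEMPLATE_ID, 1, 2, 8)
    # Data set: set id == template id, then the fixed-size records.
    records = b"".join(struct.pack("!Q", s) for s in samples)
    data_set = struct.pack("!HH", TEMPLATE_ID, 4 + len(records)) + records
    body = template_set + data_set
    # Message header: version 10, total length, export time, sequence,
    # observation domain id.
    header = struct.pack("!HHIII", 10, 16 + len(body),
                         int(time.time()), sequence, domain)
    return header + body

msg = ipfix_message([42, 43, 44], sequence=0)  # ready to hand to a UDP socket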

Is it possible to store multiple video segments, each with its own parameters, in one track of an MP4 file?

I want to encode a sequence of video frames (FHD) into an H.264 stream like this:
From time t1 to time t2: encode with "main" profile, FHD and at 30fps.
From time t3 to time t4: encode with "high" profile, HD (scaled) and at 15fps.
From time t5 to time t6: encode with "main" profile, FHD and at 30fps.
Note: t1 < t2 < t3 < t4 < t5 < t6.
My question is: while complying with the MP4 standard, is it possible to put video streams encoded with different parameters into the same video track of an MP4 file? If it is impossible, what is the best alternative?
Yes, at least according to the specification. If you look at ISO/IEC 14496-15 (3rd edition), it contains a definition of a parameter set track:
A sync sample in a parameter set track indicates that all parameter sets needed from that time forward in the video elementary stream are in that or succeeding parameter stream samples. Also there shall be a parameter set sample at each point a parameter set is updated. Each parameter set sample shall contain exactly the sequence and picture parameter sets needed to decode the relevant section of the video elementary stream.
As I understand it, in this case, instead of writing the initial SPS/PPS data into the avcC box in stbl, you write a separate track containing the changing SPS/PPS data as sync samples. So at least according to the spec, you would have samples in that stream with presentation times t1, t2, t3, t4, t5, and the samples themselves would contain the updated SPS/PPS data. This quote from the same standard seems to agree:
Parameter sets: If a parameter set elementary stream is used, then the sample in the parameter stream shall have a decoding time equal or prior to when the parameter set(s) comes into effect instantaneously. This means that for a parameter set to be used in a picture it must be sent prior to the sample containing that picture or in the sample for that picture.
NOTE Parameter sets are stored either in the sample descriptions of the video stream or in the parameter set stream, but never in both. This ensures that it is not necessary to examine every part of the video elementary stream to find relevant parameter sets. It also avoids dependencies of indefinite duration between the sample that contains the parameter set definition and the samples that use it. Storing parameter sets in the sample descriptions of a video stream provides a simple and static way to supply parameter sets. Parameter set elementary streams on the other hand are more complex but allow for more dynamism in the case of updates. Parameter sets may be inserted into the video elementary stream when the file is streamed over a transport that permits such parameter set updates.
ISO/IEC 14496-15 (3rd edition) also defines the additional avc3 / avc4 sample entries, which, when used, should allow you to actually write the parameter sets in-band with the video NAL units:
When the sample entry name is 'avc3' or 'avc4', the following applies:
If the sample is an IDR access unit, all parameter sets needed for decoding that sample shall be included either in the sample entry or in the sample itself.
Otherwise (the sample is not an IDR access unit), all parameter sets needed for decoding the sample shall be included either in the sample entry or in any of the samples since the previous random access point to the sample itself, inclusive.
A different question is: even though the standard allows at least two ways to achieve this (in-band with avc3, out-of-band with a parameter set track), how many players actually honor them? I'd assume that looking at least into the sources of ffmpeg, to find out whether this is supported there, is a good start.
The answers to this question also lean towards the fact that many demuxers honor only the avcC box and not a separate parameter set track, but a couple of quick Google searches show that at least the VLC/ffmpeg forums and mailing lists mention these terms, so I'd say it's best to try to mux such a file and simply check what happens.
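If you do mux such a file, one quick sanity check is to look at the sample entry type inside each track's stsd box: avc1 means the parameter sets live out-of-band in avcC, while avc3/avc4 permit them in-band. The Python sketch below walks the MP4 box structure by hand; it assumes the whole file fits in memory and only reports the first sample entry of each stsd, which is enough to tell avc1 from avc3.

import struct

CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}

def walk(buf, offset, end, found):
    while offset + 8 <= end:
        size, box = struct.unpack_from(">I4s", buf, offset)
        header = 8
        if size == 1:   # 64-bit box size follows the type
            size = struct.unpack_from(">Q", buf, offset + 8)[0]
            header = 16
        if size == 0:   # box extends to the end of the file
            size = end - offset
        if size < header:
            break       # malformed box, stop rather than loop forever
        if box in CONTAINERS:
            walk(buf, offset + header, offset + size, found)
        elif box == b"stsd":
            # stsd is a full box: 4 bytes version/flags + 4 bytes entry count,
            # then the sample entries, each of which is again a box.
            _, entry_type = struct.unpack_from(">I4s", buf, offset + header + 8)
            found.append(entry_type.decode("ascii", "replace"))
        offset += size

def sample_entries(path):
    data = open(path, "rb").read()
    found = []
    walk(data, 0, len(data), found)
    return found  # e.g. ['avc1', 'mp4a'] or ['avc3']

# print(sample_entries("test.mp4"))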

Features of GStreamer

Does GStreamer have the following functionalities/features, or is it possible to implement them on top of GStreamer:
Time windows: set up the graph such that a sink pad of one element does not just receive the current frame, but also the n previous frames and the m future frames, including when seeking to a new position.
No data copies when passing data between elements, instead reusing the same buffer.
Having shared data between multiple elements on different branches that changes with time, but is buffered in such a way that all elements get the same value for it at the same frame index.
Q1) Time windows
You need to write your plugin using GstAdapter.
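As a rough illustration of what such a plugin's chain function could do with the adapter (Python bindings shown, element and pad boilerplate omitted): accumulate incoming frames and only emit a window once n past and m future frames are available. The fixed frame size and window sizes are assumptions for raw video; on a flushing seek you would call adapter.clear() and start refilling.

import gi
gi.require_version("Gst", "1.0")
gi.require_version("GstBase", "1.0")
from gi.repository import Gst, GstBase

Gst.init(None)

FRAME_SIZE = 1280 * 720 * 3 // 2   # hypothetical I420 720p frame
N_PAST, M_FUTURE = 2, 2
WINDOW_FRAMES = N_PAST + 1 + M_FUTURE

adapter = GstBase.Adapter.new()

def on_incoming_buffer(buf):
    adapter.push(buf)  # the adapter takes ownership of the buffer
    windows = []
    # Emit one window per "current" frame once enough future frames arrived.
    while adapter.available() >= WINDOW_FRAMES * FRAME_SIZE:
        window = adapter.get_buffer(WINDOW_FRAMES * FRAME_SIZE)  # non-flushing
        windows.append(window)
        adapter.flush(FRAME_SIZE)  # drop the oldest frame, slide by one
    return windows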
Q2) No data copies when passing data between elements
It's done by default. No data is copied from element to element unless required; only a pointer to a GstBuffer instance is passed along. If an element such as an encoder or a filter needs to work on a buffer to produce new data, a new GstBuffer instance is created with the newly generated data in GstMemory, obviously.
Q3) Having shared data between mutiple elements
Not sure exactly what you mean. Is it possible to achieve what you want by using GstMemory sharing? Take a look at gst_memory_share(), gst_buffer_copy_region(), or gst_adapter_get_buffer().
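A tiny demonstration of those primitives in the Python bindings: a region obtained with gst_buffer_copy_region() (without a deep-copy flag) or gst_memory_share() references the parent's GstMemory rather than duplicating the bytes, which is the same mechanism that keeps ordinary buffer passing between elements copy-free.

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

parent = Gst.Buffer.new_wrapped(bytes(range(16)) * 64)  # 1 KiB of sample data

# Share the first 256 bytes: buffer metadata is new, the memory is shared.
child = parent.copy_region(Gst.BufferCopyFlags.MEMORY, 0, 256)

mem = parent.peek_memory(0)
sub = mem.share(0, 256)  # gst_memory_share(): another view on the same bytes

print(child.get_size(), sub.get_sizes())  # 256 and (size, offset, maxsize)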

Mix two audios using Xuggler's MediaTool API

I want to read the data from two audio files and, after mixing it, write it to a new audio file. Both audios have the same duration, sample rate and number of channels, and I want to take the left channel of audio1 and the right channel of audio2 to make a stereo output from that.
From what I can see in the MediaTool API demos, onAudioSamples gets called whenever a packet is read and decoded, but for this scenario I need to have the data of both audio1 and audio2 available when onAudioSamples is called, in order to replace the right channel of audio1's samples with the right channel of audio2.
Can I achieve this using the MediaTool API, or should I use the lower-level API?
Should I read all the packets from both audio files (like it is done in the ConcatenateAudioAndVideo demo) before modifying the sample data, or should I read a packet from each?
while (reader1.readPacket() == null && reader2.readPacket() == null);
Thanks.
I couldn't find any way to split audio channels in the provided documentation, but IAudioSampler and IAudioResampler can be used for audio chunks. You could start with those classes and manipulate the lower-level data in IMediaData.
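For what it's worth, Xuggler's IAudioSamples typically hands you interleaved 16-bit samples, so whichever API level you end up using, the mixing step itself boils down to picking alternating samples from the two decoded buffers. This is only a language-agnostic illustration of that per-sample combination (written in Python here, with made-up input arrays; the Xuggler decoding/encoding around it is not shown).

import array

def combine_left_right(stereo1, stereo2):
    # Inputs are interleaved int16 stereo frames: L0 R0 L1 R1 ...
    # Output takes its left channel from stereo1 and its right from stereo2.
    frames = min(len(stereo1), len(stereo2)) // 2
    out = array.array("h", bytes(4 * frames))  # 2 channels * 2 bytes, zeroed
    for i in range(frames):
        out[2 * i] = stereo1[2 * i]          # left sample from clip 1
        out[2 * i + 1] = stereo2[2 * i + 1]  # right sample from clip 2
    return out

a = array.array("h", [100, 200, 300, 400])  # decoded samples of audio1
b = array.array("h", [-1, -2, -3, -4])      # decoded samples of audio2
print(list(combine_left_right(a, b)))       # [100, -2, 300, -4]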