ffmpeg: What is the best practice to keep a live connection/socket with a camera, and save time on ffprobe - sockets

Today I used the following command (with subprocess.PIPE and subprocess.Popen in Python 3):
ffmpeg -i udp://{address_of_camera} \
-vf "select='if(eq(pict_type,I),st(1,t),gt(t,ld(1)))',setpts=N/FRAME_RATE/TB" \
-f rawvideo -an -vframes {NUM_WANTED_FRAMES} pipe:
This command helps me to capture NUM_WANTED_FRAMES frames from a live camera at a given moment.
However... it takes me about 4 seconds to read the frames, and about 2.5 seconds to open a socket between my computer and the camera's computer.
Is there a way to have a socket/connection always open between my computer and the camera's computer, to save the 2.5 seconds?
I read something about fifo_size and overrun_fatal. I thought that maybe I could set fifo_size equal to NUM_WANTED_FRAMES and overrun_fatal to True. Would this solve my problem? Or is there a different, simpler/better solution?
Should I just record always (no -vframes flag), store the frames in a queue (with a max size), and, when I want to slice the video, read from my queue buffer? Will it work well with the keyframes?
Also... what should I do when ffmpeg fails? Restart the ffmpeg command?

FFmpeg itself is a one-and-done type of app, so to keep the camera connection alive, the best option is to "record always (no -vframes flag)" and handle whether to drop or keep frames in Python.
So, a rough sketch of the idea:
import subprocess as sp
from threading import Thread, Event
from queue import Queue

NUM_WANTED_FRAMES = 4  # whatever it is
width = 1920
height = 1080
ncomp = 3  # rgb
framesize = width * height * ncomp  # one frame, in bytes
nbytes = framesize * NUM_WANTED_FRAMES

proc = sp.Popen(<ffmpeg command>, stdout=sp.PIPE)
stdout = proc.stdout
buffer = Queue(NUM_WANTED_FRAMES)
req_frame = Event()  # set to record, default (cleared) to drop

def reader():
    while True:
        if req_frame.is_set():
            # frames requested, push the next batch to the buffer
            buffer.put(stdout.read(nbytes))
            req_frame.clear()
        else:
            # frames not requested, drop one frame
            stdout.read(framesize)

rd_thread = Thread(target=reader)
rd_thread.start()

...

# elsewhere in your program, do this when you need the camera data
req_frame.set()
framedata = buffer.get()
...
Will it work well with the keyframes?
Yes, if your FFmpeg command has -discard nokey it'll read just keyframes.
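For example, something along these lines (an untested sketch based on your original command; note that -discard is an input option, so it goes before -i):

ffmpeg -discard nokey -i udp://{address_of_camera} \
-f rawvideo -an -vframes {NUM_WANTED_FRAMES} pipe: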
What should I do when ffmpeg fails? Restart the ffmpeg command?
Have another thread monitor the health of proc (the Popen object); if it is dead, restart the subprocess with the same command and swap in the new stdout. You probably want to protect your code with try/except blocks as well, and adding timeouts to the queue operations would be a good idea, too.
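A rough sketch of such a watchdog thread, continuing the example above (ffmpeg_cmd is just an illustrative name for the same argument list passed to Popen):

import subprocess as sp
import time
from threading import Thread

def restart_ffmpeg():
    # relaunch ffmpeg with the same arguments used for the original Popen call
    return sp.Popen(ffmpeg_cmd, stdout=sp.PIPE)

def monitor():
    global proc, stdout
    while True:
        if proc.poll() is not None:      # ffmpeg died
            try:
                proc = restart_ffmpeg()
                stdout = proc.stdout     # the reader picks up the new pipe on its next read
            except OSError as err:
                print("ffmpeg restart failed:", err)
        time.sleep(1.0)

Thread(target=monitor, daemon=True).start()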

Related

Cannot identify image file io.BytesIO on raspberry Pi using PiCamera library and PIL

I am having trouble using the output from the PiCamera capture function (directed into a BytesIO stream) and opening it using the PIL library. Here is the code (based on the PiCamera basic examples):
import io
from time import sleep

from picamera import PiCamera
from PIL import Image

# Camera stuff
camera = PiCamera()
camera.resolution = (640, 480)
stream = io.BytesIO()
sleep(2)
try:
    for frame in camera.capture_continuous(stream, format="jpeg", use_video_port=True):
        frame.seek(0)
        image = Image.open(frame)  # THIS IS WHERE IT CRASHES
        # OTHER STUFF THAT IS NON IMPORTANT GOES HERE
        frame.truncate(0)
finally:
    camera.close()
    stream.close()
The error is: PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0xaa01cf00>
Any help would be greatly appreciated :)
Have a nice day!
The problem is simple but I am wondering why the io library works that way.
One simply needs to seek the stream back to 0 after truncating it, or seek to 0 and then call truncate() with no parameter (all after you are done with the image). Like so:
for frame in camera.capture_continuous(stream, format="jpeg", use_video_port=True):
    stream.seek(0)
    image = Image.open(stream)
    # Do stuff with image
    stream.seek(0)
    stream.truncate()
Basically, when you open the image and do some operations on it, the position of the BytesIO can move around and end up somewhere other than zero. After that, when you call truncate(0), it does not move the position back to zero as I thought it would (it seemed logical to me that the pointer would move back to where the truncation occurs). When the code runs once more, the capture writes into the stream, but this time it does not start writing at the beginning, and everything breaks after that.
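For illustration, a minimal standalone snippet showing that truncate() does not move the stream position:

import io

buf = io.BytesIO()
buf.write(b"first image data")
print(buf.tell())       # 16 -- the position sits at the end of the written data
buf.truncate(0)
print(buf.tell())       # still 16 -- truncate() does not move the position
buf.write(b"x")
print(buf.getvalue())   # b'\x00' * 16 + b'x' -- the new write landed at offset 16
buf.seek(0)             # the missing step: rewind first...
buf.truncate()          # ...then truncate, so the next write starts at 0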
Hope this can help someone in the future :)

I want to stop packet capture while sniffing continuously once a condition is met

Problem
I have written a script that sniffs packets from a host; however, I am sniffing in continuous mode and would like to stop on a timeout. I have written the following code to stop the packet sniffing, but it doesn't seem to stop even when the time has clearly exceeded the timeout. What could I be doing wrong here?
import time

import pyshark

prog_start = time.time()
capture = pyshark.LiveCapture(interface='en0')
capture.sniff(timeout=10)

start_time = capture[0].frame_info.time_epoch
end_time = capture[-1].frame_info.time_epoch
print("Capture lasted:", float(end_time) - float(start_time))

pkt_num = 0
for pkt in capture:
    pkt_num += 1
    print("Time", time.time() - prog_start, "Pkt#", pkt_num)
We then get this output, with thousands of additional packets a second, past when the capture should have stopped:
Capture lasted: 9.148329019546509
Time 10.346031188964844 Pkt# 1
Time 10.348641157150269 Pkt# 2
Time 10.351708889007568 Pkt# 3
Time 10.353564977645874 Pkt# 4
Time 10.35555100440979 Pkt# 5
...
Question
Why does PyShark continue to capture packets after the timeout?
I was having this same issue and managed to find a bit of a solution for it. It isn't perfect, but it works by telling the capture loop to stop on the next packet and then sending an empty packet that the loop will see, so it ends. I made it a UDP packet on a high port in my case, because I use a capture filter that drops most traffic, so this solution worked for me.
import socket
import threading

import pyshark

class PacketCapture(threading.Thread):
    capture = 1

    def __init__(self, interface_name):
        threading.Thread.__init__(self)
        self.interface_name = interface_name

    def stop(self):
        self.capture = 0

    def run(self):
        capture = pyshark.LiveCapture(interface=self.interface_name)
        try:
            for packet in capture.sniff_continuously():
                if not self.capture:
                    capture.close()
                    break
        except pyshark.capture.capture.TSharkCrashException:
            self.exited = 1
            print("Capture has crashed")

# start capture
pcap = PacketCapture(interface_name)
pcap.start()

# stop capture
pcap.stop()
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
msg = bytes("", "UTF-8")
sock.sendto(msg, ("external IP", 12345))
sock.close()
I'm relatively new to Python myself, but I think this should be a somewhat acceptable solution given the scenario.
Problems with PyShark
It looks like you're running into a known issue with PyShark that hasn't been fixed in years. Per the thread, the author wrote:
You can subclass LiveCapture and override the get_parameters() function, adding your own parameters.
You could modify the parameters sent to tshark, but at this point, why not just use a tshark command directly?
Using Tshark Instead
PyShark is just a wrapper for tshark on your system. If you want to use subprocess with Python, the equivalent tshark command is tshark -a duration:5. The other advantage of using tshark directly is that subprocess gives you a pid that you can kill on an arbitrary condition.
See the manpage for more details.
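For example, a rough sketch of driving tshark from Python with subprocess (untested; 'en0' and the stop condition are just placeholders taken from the question and the answer below):

import subprocess

# run tshark for at most 10 seconds, but kill it earlier if a condition is met
proc = subprocess.Popen(
    ["tshark", "-i", "en0", "-a", "duration:10", "-l"],
    stdout=subprocess.PIPE,
    text=True,
)
try:
    for line in proc.stdout:
        print(line, end="")
        if "some protocol" in line:   # your stop condition
            proc.terminate()
            break
finally:
    proc.wait()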
You can use the following line in Bash, or even from Python via os.system, os.popen, or subprocess:
while IFS= read -r line; do if [[ $line =~ 'some protocol' ]]; then <SOME_ACTION>; break; fi; done < <(sudo tshark)

MFCreateFMPEG4MediaSink does not generate MSE-compatible MP4

I'm attempting to stream a H.264 video feed to a web browser. Media Foundation is used for encoding a fragmented MPEG4 stream (MFCreateFMPEG4MediaSink with MFTranscodeContainerType_FMPEG4, MF_LOW_LATENCY and MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS enabled). The stream is then connected to a web server through IMFByteStream.
Streaming of the H.264 video works fine when it's being consumed by a <video src=".."/> tag. However, the resulting latency is ~2sec, which is too much for the application in question. My suspicion is that client-side buffering causes most of the latency. Therefore, I'm experimenting with Media Source Extensions (MSE) for programmatic control over the in-browser streaming. Chrome does, however, fail with the following error when consuming the same MPEG4 stream through MSE:
Failure parsing MP4: TFHD base-data-offset not allowed by MSE. See
https://www.w3.org/TR/mse-byte-stream-format-isobmff/#movie-fragment-relative-addressing
Here is an mp4dump of a moof/mdat fragment in the MPEG4 stream, which clearly shows that the tfhd contains an "illegal" base data offset parameter:
[moof] size=8+200
  [mfhd] size=12+4
    sequence number = 3
  [traf] size=8+176
    [tfhd] size=12+16, flags=1
      track ID = 1
      base data offset = 36690
    [trun] size=12+136, version=1, flags=f01
      sample count = 8
      data offset = 0
[mdat] size=8+1624
I'm using Chrome 65.0.3325.181 (Official Build) (32-bit), running on Win10 version 1709 (16299.309).
Is there any way of generating a MSE-compatible H.264/MPEG4 video stream using Media Foundation?
Status Update:
Based on roman-r's advice, I managed to fix the problem myself by intercepting the generated MPEG4 stream and performing the following modifications:
Modify the Track Fragment Header Box (tfhd):
  remove the base_data_offset parameter (reduces stream size by 8 bytes)
  set the default-base-is-moof flag
Add the missing Track Fragment Decode Time box (tfdt) (increases stream size by 20 bytes):
  set the baseMediaDecodeTime parameter
Modify the Track Fragment Run box (trun):
  adjust the data_offset parameter
The field descriptions are documented in https://www.iso.org/standard/68960.html (free download).
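For reference, a small Python sketch (a diagnostic aid only, not the interceptor itself, and assuming a complete moof fragment is already in memory and no 64-bit box sizes are used) that locates the tfhd and reports the two relevant flag bits, following the ISO/IEC 14496-12 box layout:

import struct

def walk(data, start, end):
    # yield (box type, payload start, box end) for each box in data[start:end]
    off = start
    while off + 8 <= end:
        size, btype = struct.unpack_from(">I4s", data, off)
        if size < 8:          # size == 1 would mean a 64-bit size; not handled here
            break
        yield btype, off + 8, off + size
        off += size

def check_tfhd(fragment):
    # report the tfhd flags of every traf inside every moof of an fMP4 fragment
    for btype, s, e in walk(fragment, 0, len(fragment)):
        if btype != b"moof":
            continue
        for btype2, s2, e2 in walk(fragment, s, e):
            if btype2 != b"traf":
                continue
            for btype3, s3, e3 in walk(fragment, s2, e2):
                if btype3 == b"tfhd":
                    flags = int.from_bytes(fragment[s3 + 1:s3 + 4], "big")
                    print("base-data-offset present:", bool(flags & 0x000001))
                    print("default-base-is-moof set :", bool(flags & 0x020000))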
Switching to MSE-based video streaming reduced the latency from ~2.0 to 0.7 sec. The latency was furthermore reduced to 0-1 frames by calling IMFSinkWriter::NotifyEndOfSegment after each IMFSinkWriter::WriteSample call.
There's a sample implementation available on https://github.com/forderud/AppWebStream
I was getting the same error (Failure parsing MP4: TFHD base-data-offset not allowed by MSE) when trying to play an fmp4 via MSE. The fmp4 had been created from an mp4 using the following ffmpeg command:
ffmpeg -i myvideo.mp4 -g 52 -vcodec copy -f mp4 -movflags frag_keyframe+empty_moov myfmp4video.mp4
Based on this question I was able to find out that to have the fmp4 working in Chrome I had to add the "default_base_moof" flag. So, after creating the fmp4 with the following command:
ffmpeg -i myvideo.mp4 -g 52 -vcodec copy -f mp4 -movflags frag_keyframe+empty_moov+default_base_moof myfmp4video.mp4
I was able to successfully play the video using Media Source Extensions.
This Mozilla article helped me find the missing flag:
https://developer.mozilla.org/en-US/docs/Web/API/Media_Source_Extensions_API/Transcoding_assets_for_MSE
The 0.7 sec latency mentioned in your Status Update is caused by Media Foundation's MFTranscodeContainerType_FMPEG4 containerizer, which (for an unknown reason) gathers roughly 1/3 second of frames and outputs them as one MP4 moof/mdat box pair. This means that at 60 FPS you need to wait 19 frames before getting any output from MFTranscodeContainerType_FMPEG4.
To output a single MP4 moof/mdat pair per frame, simply lie that MF_MT_FRAME_RATE is 1 FPS (or anything longer than 1/3 sec). To play the video at the correct speed, use Media Source Extensions' <video>.playbackRate, or better, update the timescale (i.e. multiply it by the real FPS) of the mvhd and mdhd boxes in your MP4 stream interceptor to get a correctly timed MP4 stream.
Doing that, the latency can be squeezed to under 20 ms. This is barely recognizable when you see the output side by side on localhost in chains such as Unity (research) -> NvEnc -> MFTranscodeContainerType_FMPEG4 -> WebSocket -> Chrome Media Source Extensions display.
Note that MFTranscodeContainerType_FMPEG4 still introduces a 1-frame delay (1st frame in, no output; 2nd frame in, 1st frame out; ...), hence the 20 ms latency at 60 FPS. The only solution to that seems to be writing your own FMPEG4 containerizer, but that is an order of magnitude more complex than intercepting Media Foundation's MP4 stream.
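For the timescale adjustment mentioned above, a rough Python sketch of the in-place patch (an illustration only, assuming the buffer holds the complete moov box as a bytearray and that no 64-bit box sizes are involved):

import struct

CONTAINERS = {b"moov", b"trak", b"mdia"}   # boxes we need to descend into

def patch_timescales(buf, fps):
    # multiply the timescale of every mvhd/mdhd box in buf (a bytearray), in place
    def walk(start, end):
        off = start
        while off + 8 <= end:
            size, btype = struct.unpack_from(">I4s", buf, off)
            if size < 8:                   # 64-bit sizes not handled in this sketch
                break
            if btype in CONTAINERS:
                walk(off + 8, off + size)
            elif btype in (b"mvhd", b"mdhd"):
                version = buf[off + 8]
                # payload: version(1) + flags(3), creation/modification times
                # (4+4 bytes in version 0, 8+8 bytes in version 1), then timescale
                ts_off = off + 12 + (16 if version == 1 else 8)
                timescale, = struct.unpack_from(">I", buf, ts_off)
                struct.pack_into(">I", buf, ts_off, timescale * fps)
            off += size
    walk(0, len(buf))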
The problem was solved by following roman-r's advice and modifying the generated MPEG4 stream. See the answer above.
Another way to do this is to use the same code @Fredrik mentioned, but write your own IMFByteStream and check the chunks written to it.
FFmpeg writes the atoms almost one at a time, so you can check the atom name and do the modifications. It is the same thing. I wish there were an MSE-compliant Windows sink.
Is there one that can generate .ts files for HLS?

Avformat cannot seek to beginning of file

I need to quickly seek through an H.264-encoded video stream in an MP4 container. I am using libav to decode frames, so I stumbled upon the avformat_seek_file() function.
My problem is: assuming the H.264 stream begins with a keyframe, when I seek to timestamp 0 (regardless of time_base), I should be at the beginning of the stream. But I'm not. I usually end up a few seconds into the video. Also, if I seek to, for example, 10 seconds, I usually land around 12 or so. Is it possible for keyframes to be so "rare"? AVSEEK_FLAG_ANY seems to have no impact on the seek result. Tested on multiple Full HD H.264 MP4 videos.
Code:
unsigned long seekTo = 0;
// Doesn't actually matter for 0, since it will also be 0
seekTo = av_rescale_q(seekTo, AVRational{1, AV_TIME_BASE}, pFormatCtx->streams[videoStream]->time_base);
int result = avformat_seek_file(pFormatCtx, videoStream, INT_FAST64_MIN, seekTo, seekTo, AVSEEK_FLAG_ANY);
avcodec_flush_buffers(pCodecCtx);
Try using av_seek_frame instead. Read here for some gotchas about using that and seeking around.
My problem is: assuming the H.264 stream begins with a keyframe, when I seek to timestamp 0 (regardless of time_base), I should be at the beginning of the stream
Note that some files can have their first keyframe at a negative DTS, e.g. you need to seek to timestamp -1 or something like this.
You can set the AVFMT_SEEK_TO_PTS flag in AVInputFormat::flags before opening the AVFormatContext to use PTS, which will be 0-based.

Near Real Time Video Upload from iPhone

I am trying to find the best way to upload video from an iPhone (iOS5) as fast as possible - real time if possible.
I found this previous question and answer very useful.
streaming video FROM an iPhone
But it has left me with several unanswered questions. I don't have enough rep to post comments on that question, and I think my questions are getting beyond the scope of the original question anyway.
So:
1) Is using AVCaptureSession/AVAssetWriter and chopping the video into short clips the best way to rapidly move (compressed) video off of the iPhone in near realtime?
2) If so, could someone supply more details on how to use two AVAssetWriters and a background queue to avoid dropouts (as user Steve McFarlin mentions in the referenced question above)? I am unclear how the handoff from one AVAssetWriter to another would work...
3) (Critical) Is there an easy way to append the chopped video files back into a full-length video... or at least be able to play them as if they were one complete video? I would need to merge the smaller files to look like one file both on the server AND on the iPhone (for preview).
Thanks for any help...
Well, you can try to do the buffering on the phone, but that seems counter-productive to me given that it has limited memory. I would try setting up an AVCaptureSession and using AVCaptureVideoDataOutput, which will vend the frames to you on a separate dispatch_queue thread (if set up that way, it will vend them as MPEG frames). That thread can hand the frames off to an async socket to transmit, possibly with a small header that indicates the frame number and video format. Alternatively, you can hand the data off to a sending thread via a queue, which would let you monitor how many frames are waiting to be transmitted.
On the receiving server, you'd want to deal with creating a small buffer (say a few seconds) and doing the frame reordering if they arrive out of order.
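As a rough illustration of that server-side buffer (a Python sketch, assuming each frame arrives tagged with the frame number from the small header mentioned above, starting at 0):

import heapq

class ReorderBuffer:
    # buffers out-of-order frames and releases them in sequence
    def __init__(self, max_pending=90):    # roughly a few seconds at 30 fps
        self._heap = []                    # (frame_number, payload) pairs
        self._next = 0                     # next frame number to release
        self._max_pending = max_pending

    def push(self, frame_number, payload):
        heapq.heappush(self._heap, (frame_number, payload))

    def pop_ready(self):
        # yield frames that are in order, skipping ahead if too many are pending
        while self._heap:
            num, payload = self._heap[0]
            if num == self._next:
                heapq.heappop(self._heap)
                self._next += 1
                yield payload
            elif len(self._heap) > self._max_pending:
                self._next = num           # a frame was lost; skip to the oldest we have
            else:
                break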
The big issue will be detecting the bandwidth and knowing when to drop the quality down so you don't end up with a backlog of packets waiting to go out. That's an entirely different and complicated topic :) The key will be in your selection of codec, quality, and video size... that is going to directly determine the bandwidth required to transmit the frames in real time. AVVideoCodecH264 is supported in hardware in certain modes and is probably the only realistic option for real-time encoding.
I don't think you are going to find a ready-made example for this though as it represents a lot of work to get working just right.
2) Here's how I chunked the files without dropping too many frames:
- (void) segmentRecording:(NSTimer*)timer {
    AVAssetWriter *tempAssetWriter = self.assetWriter;
    AVAssetWriterInput *tempAudioEncoder = self.audioEncoder;
    AVAssetWriterInput *tempVideoEncoder = self.videoEncoder;
    self.assetWriter = queuedAssetWriter;
    self.audioEncoder = queuedAudioEncoder;
    self.videoEncoder = queuedVideoEncoder;
    //NSLog(@"Switching encoders");

    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0), ^{
        [tempAudioEncoder markAsFinished];
        [tempVideoEncoder markAsFinished];
        if (tempAssetWriter.status == AVAssetWriterStatusWriting) {
            if(![tempAssetWriter finishWriting]) {
                [self showError:[tempAssetWriter error]];
            }
        }

        if (self.readyToRecordAudio && self.readyToRecordVideo) {
            NSError *error = nil;
            self.queuedAssetWriter = [[AVAssetWriter alloc] initWithURL:[self newMovieURL] fileType:(NSString *)kUTTypeMPEG4 error:&error];
            if (error) {
                [self showError:error];
            }
            self.queuedVideoEncoder = [self setupVideoEncoderWithAssetWriter:self.queuedAssetWriter formatDescription:videoFormatDescription bitsPerSecond:videoBPS];
            self.queuedAudioEncoder = [self setupAudioEncoderWithAssetWriter:self.queuedAssetWriter formatDescription:audioFormatDescription bitsPerSecond:audioBPS];
            //NSLog(@"Encoder switch finished");
        }
    });
}
https://github.com/chrisballinger/FFmpeg-iOS-Encoder/blob/master/AVSegmentingAppleEncoder.m
3) Here's a script to concatenate the files on the server
import glob
import os

run = os.system  # convenience alias

files = glob.glob('*.mp4')
out_files = []
n = 0

for file in files:
    out_file = "out-{0}.ts".format(n)
    out_files.append(out_file)
    full_command = "ffmpeg -i {0} -f mpegts -vcodec copy -acodec copy -vbsf h264_mp4toannexb {1}".format(file, out_file)
    run(full_command)
    n += 1

out_file_concat = ''
for out_file in out_files:
    out_file_concat += ' {0} '.format(out_file)

cat_command = 'cat {0} > full.ts'.format(out_file_concat)
print cat_command
run(cat_command)

run("ffmpeg -i full.ts -f mp4 -vcodec copy -acodec copy -absf aac_adtstoasc full.mp4")
https://github.com/chrisballinger/FFmpeg-iOS-Encoder/blob/master/concat-mp4.py