sox (Sound eXchange): how do you stop recording when there is silence longer than 20 seconds? (command-line)

I cannot get sox to stop recording when silence occurs.
I have successfully gotten the result I needed when using sox with input/output files:
sox original.mp3 trimmed.mp3 silence 1 0.3 1% 1 20.0 1%
sox starts recording after the first 0.3 seconds of noise, then records until 20 seconds of silence are detected.
Issue: When I try to do the same with the -d parameter (to record from the default device, a microphone, rather than from an input file), it doesn't work: the recording never stops, even though there is absolute silence:
sox -d temp_new.mp3 silence 1 0.3 1% 1 20.0 1%
Thank you!
Added info:
I am not using a microphone, but a virtual audio cable that routes system audio to sox. The machine in question does not have a microphone, so there cannot be any local background noise in the recording.
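For reference, here is how I read the silence arguments in the commands above (my own annotation, based on the sox documentation):

# silence <above-periods> <duration> <threshold> [<below-periods> <duration> <threshold>]
#   1 0.3 1%  -> start output once 0.3 s of sound above the 1% threshold is detected
#   1 20.0 1% -> stop output after 20.0 s of sound below the 1% threshold
sox -d temp_new.mp3 silence 1 0.3 1% 1 20.0 1%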

Related

Record short audio output using parec

I've stumbled upon a problem: if a program outputs only a short burst of audio, parec is not able to record it.
Here's the recorder:
parec --rate 48000 -d alsa_output.pci-0000_00_0e.0.analog-stereo.monitor | hexdump -C
Here is a program producing short audio
play -q -n -c1 synth -n 0.3 triangle 500
If I change 0.3 to 0.4 then hexdump starts to produce output. So there is no output if the audio is shorter than 400 ms. Maybe it also depends on the computer's performance, I don't know.
Is there anything I can do about that?
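One thing that may be worth trying (an assumption on my part, not something verified in this thread) is asking PulseAudio for a smaller capture latency and writing the capture to a file, so a short burst is less likely to disappear into a large server-side buffer:

# request ~10 ms of source latency; output is raw PCM (s16, stereo by default)
parec --rate 48000 --latency-msec=10 -d alsa_output.pci-0000_00_0e.0.analog-stereo.monitor > short_clip.raw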

Flutter FFmpeg: processing time increases with the size of the video even when only cutting the same-length clip

I am by no means an expert with ffmpeg, but I find it strange that the time to create a GIF from a trimmed section increases so much with the size of the source video, since I am always grabbing only three seconds.
I am using Flutter FFmpeg.
-ss 0:00:01.000000 -i /data/user/0/com.example.example/cache/image_picker1475407716366431469.mp4 -t 0:00:03.000000 -avoid_negative_ts make_zero -vf fps=10,scale=320:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse -loop 0
Is there a command to make sure that ffmpeg doesn't work over the entire video and only concentrates on the three seconds selected by -t, so that the time doesn't increase greatly with the video size? Or is this just normal for ffmpeg? Does it have to parse the entire video before creating the GIF?
I found a solution for anyone else looking. For some reason it's much faster to just trim the original video first and then make the GIF. Running '-c:v copy' on the original video and then running the above command cut the processing time on a 3-minute video from 1 minute 40 seconds down to 10 seconds.
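A sketch of that two-step approach (file names are placeholders, and I'm assuming the same trim window and palette filter chain as above; note that with a stream copy the cut may snap to the nearest keyframe):

# Step 1: trim the 3-second window with a stream copy - no re-encoding, so it is fast
ffmpeg -ss 0:00:01 -i input.mp4 -t 0:00:03 -avoid_negative_ts make_zero -c:v copy -c:a copy trimmed.mp4

# Step 2: build the GIF from the short trimmed clip only
ffmpeg -i trimmed.mp4 -vf "fps=10,scale=320:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop 0 output.gif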

How to get FFmpeg to use more GPU when encoding

So the situation is as follows:
I'm receiving 20-30 uncompressed images per second. The format is either PNG or bitmap. Each individual image is between 40 and 50 MB (they all have the same size since they are uncompressed).
I want to encode them to a lossless H.265 video and stream it to an HTTP server using FFmpeg.
The output video is 1920x1080, so there is some downsampling.
Compression is allowed, but nothing may be lost other than the downsampling.
For now I'm still in the testing phase. I have 500 sample images, and I'm trying to encode them as efficiently as possible.
I'm using commands such as:
ffmpeg -hwaccel cuvid -f image2 -i "0(%01d).png" -framerate 30 -pix_fmt p010le -c:v hevc_nvenc -preset lossless -rc vbr_hq -b:v 6M -maxrate:v 10M -vf scale=1920:1080 -c:a aac -b:a 240k result.mp4
I have a powerful modern Quadro GPU, a 6-core Intel CPU, and an NVMe drive.
GPU usage while encoding is exactly 10%; CPU usage is around 30-40%.
How can I get GPU usage to 80%? The machine that will run this code will have at least a Quadro 4000 (maybe stronger), and I want to use it to the fullest.
That's not how it works. GPUs do not use the standard vector processing units for video encoding. (Well, they are used a little, for things like color conversion and scaling, but not for everything.) The GPU has dedicated circuitry for video encoding primitives. When that circuitry is saturated, it doesn't matter how many GPU cores you have; they will be idle.
So to use "more" GPU, you don't buy a beefier GPU, you buy a card that has more NVENC cores.
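As a side note (my addition, not part of the original answer): the overall GPU utilization figure mostly tracks the general-purpose cores, not the NVENC block, so it can be more informative to watch the dedicated encoder utilization, for example with nvidia-smi:

# compare overall GPU utilization with dedicated NVENC utilization, sampled every second
nvidia-smi --query-gpu=utilization.gpu,utilization.encoder --format=csv -l 1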
If your ffmpeg was compiled with --enable-libnpp then consider using the GPU-based scale_npp filter instead of scale, which is CPU only. Example from FFmpeg Wiki: Hardware Acceleration:
ffmpeg -hwaccel cuda -i input -vf scale_npp=1920:1080 -c:v hevc_nvenc output.mp4
You may see an improvement in performance or GPU utilization.
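For the image-sequence command above, a rough, untested sketch of how that could look (my adaptation; it assumes an ffmpeg build with --enable-libnpp and NVENC support, and note that PNG decoding and the pixel-format conversion still run on the CPU before the frames are uploaded):

# convert/upload on input, then scale on the GPU and encode with NVENC
ffmpeg -f image2 -framerate 30 -i "0(%01d).png" \
       -vf "format=nv12,hwupload_cuda,scale_npp=1920:1080" \
       -c:v hevc_nvenc -preset lossless result.mp4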

ffplay: keep video/audio in sync when using the select filter

I'm trying to play/skip some clips of a video using ffplay. My first approach to skip say frames 100 to 400 was:
ffplay -vf "select='lte(n\,100)+gte(n\,400)'" -i INPUT
This skips the desired frames; however, it also freezes the video during the skipped frames. I tried to fix this by modifying the video presentation timestamps (PTS) with the setpts filter:
ffplay -vf "select='lte(n\,100)+gte(n\,400)',setpts='PREV_OUTPTS'" -i INPUT
This seems to work (it still freezes a bit, I guess because of buffering), but now the audio is out of sync. I've tried applying a select filter and modifying the PTS on the audio as well:
ffplay -vf "select='lte(n\,100)+gte(n\,400)',setpts='PREV_OUTPTS'" -af "aselect='lte(n\,100)+gte(n\,400)',asetpts='PREV_OUTPTS'" -i INPUT
This skips some audio frames, but the audio is still out of sync. I've tried the aresample=async=10000 option with similar results. Moving some or all of the filters to the output side (placing them after -i INPUT) doesn't work either.
Does someone know how to skip parts of a video using ffplay? Many thanks.
Audio frame numbers != video frame numbers. AAC audio generated by FFmpeg's encoder is 1024 samples per frame, so a 48kHz stream has 48000/1024 = 46.875 audio frames per second. Other codecs may have different rates.
Use t instead of n, and generate a continuous series of timestamps.
ffplay \
  -vf "select='lte(t\,4)+gte(t\,16)',setpts=N/FRAME_RATE/TB" \
  -af "aselect='lte(t\,4)+gte(t\,16)',asetpts=N/SR/TB" \
  -i INPUT
I assume a video frame rate of 25 fps. Modify accordingly.
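If the actual frame rate of INPUT is unknown, it can be queried first, for example with ffprobe (my addition, standard ffprobe usage):

# print the video stream's frame rate, e.g. 25/1
ffprobe -v error -select_streams v:0 -show_entries stream=r_frame_rate -of default=noprint_wrappers=1:nokey=1 INPUT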

iOS: Bad Mic input latency measurement result

I'm running a test to measure the basic latency of my iPhone app, and the result was disappointing: 50 ms for a play-through test app. The app just picks up mic input and plays it out using the same render callback, with no other audio units or processing involved. The result therefore seemed too bad for such a basic scenario. I need some pointers to see whether the result makes sense or whether there are design flaws in my test.
The basic idea of the test was to have three roles:
1. My finger snap as the reference sound source.
2. A simple iOS play-thru app (using the built-in mic) as the first listener to #1.
3. A Mac (with a USB mic and Audacity) as the second listener to #1 and the only listener to the iOS output (through a speaker connected via the iOS headphone jack).
Then, with Audacity in recording mode, the Mac would pick up both the sound from my fingers and its "clone" from the iOS speaker in close range. Finally I simply visually observe the waveform in Audacity's recorded track and measure the time interval between the peaks of the two recorded snaps.
This was by no means a super accurate measurement, but at least the inherent latency of the Mac recording pipeline should have been cancelled out this way, so the error should mainly come from the peak-distance measurement, which I assume is much smaller than the audio pipeline latency and can be ignored.
I was expecting a latency of 20 ms or lower, but the result was clearly 50-60 ms.
My ASBD uses kAudioFormatFlagsCanonical and kAudioFormatLinearPCM as format.
50 ms is about 4 ms more than the duration of 2 audio buffers (one output, one input) of size 1024 at a sample rate of 44.1 kHz.
17 ms is around 5 ms more than the duration of 2 buffers of length 256.
So it looks like the iOS audio latency is around 5 ms plus the duration of the two buffers (the audio output buffer duration plus the time it takes to fill the input buffer) ... on your particular iOS device.
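For reference, the arithmetic behind those figures: two 1024-sample buffers at 44.1 kHz last 2 x 1024 / 44100 ≈ 46.4 ms, leaving roughly 4 ms of the measured 50 ms unaccounted for; two 256-sample buffers last 2 x 256 / 44100 ≈ 11.6 ms, leaving roughly 5 ms of the 17 ms.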
A few iOS devices may support even shorter audio buffer sizes of 128 samples.
You can use Core Audio and set up the audio session to have very low latency.
You can set the buffer size to be smaller using AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration,...
Using smaller buffers causes the audio callback to happen more often while grabbing smaller chunks of audio. Keep in mind that this is merely a suggestion to the audio system; iOS will pick an actual callback interval it considers suitable, based on your sample rate and integer powers of 2.
Once you set the buffer duration, you can query the actual buffer duration that the system will use via AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareIOBufferDuration,...
I'll summarize Paul R's comments as the answer, which has solved my problem:
50 ms corresponds to a total buffer size of around 2048 at a 44.1 kHz sample rate, which doesn't seem unreasonable given that you have both a record and a playback path.
I don't know that the buffer size is 2048, and there may be more than one buffer in your record-playback loopback test, but it seems that the effective total buffer size in your test is probably on the order of 2048, which doesn't seem unreasonable. Of course, if you're only interested in record latency, as the title of your question suggests, then you'll need to find a way to tease that out separately from the playback latency.