Is there any paid/licensed video converter that supports GPU acceleration and command-line operation? I need the command line for automation purposes and GPU acceleration to convert 4K video for commercial use.
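For illustration only, this is roughly what a scriptable, GPU-accelerated 4K conversion looks like on the command line using ffmpeg's NVENC encoder (ffmpeg itself is free rather than paid/licensed, and the input/output names, codec, preset and bitrate below are placeholder assumptions, not a recommendation):

# Hypothetical 4K HEVC transcode on an NVIDIA GPU, suitable for calling from a script.
ffmpeg -i input_4k.mov -c:v hevc_nvenc -preset slow -b:v 40M -c:a copy output_4k.mp4

Any converter you evaluate for commercial use would need to expose comparable switches (codec selection, bitrate, hardware encoder choice) to be usable for automation.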
Just started digging around Google's Resonance Audio for Unity, and it shows promise over headphones.
But I am interested in using it for a speaker setup. I have an ambisonic decoder interface and a speaker array that takes b-format signals. Is there a way to output a 4-channel / b-format signal directly from Unity so I can monitor Resonance's soundfield on loudspeakers?
At the moment, I am using SuperCollider / ATK with Unity via OSC as a custom sound engine to allow ambisonic playback in a speaker array. This works well, but I would like to take advantage of Google's new tools.
Outputting a 4-channel b-format signal directly from Unity using the Resonance Audio SDK is only supported when saving directly to a file (.ogg). Streaming the signal directly from Unity to an external ambisonic decoder interface is not currently supported.
If you are interested in the option to record a scene in ambisonic format and save it to a file, there are some scripting functions labeled "SoundfieldRecorder" that may help automate the monitoring process, e.g. saving to a custom file path. Ambisonic soundfield recording can also be done manually using the Resonance Audio Listener component in Unity, which has a Soundfield Recorder UI button to start and stop/save a recording.
As we know, the BeagleBone Black doesn't have a DSP on its SoC dedicated to video processing, but is there any way we can achieve that by adding some extra DSP board?
I mean, the Raspberry Pi has video processing hardware, so has anyone tried to integrate the two so that both capabilities can be used together?
I know this isn't the optimal approach and the two boards are quite different, but I only have one BBB and one Raspberry Pi, and I am trying to achieve 1080p video streaming with better quality.
There is no DSP on the BeagleBone Black, so you need to implement the DSP functions in software.
If your input is audio, you can use ALSA.
When you say "dont have a DSP on SoC specific for the Video processing" - I think you mean what is usually called a VPU (Video Processing Unit), and indeed Beaglebone Black's AM3358 processor doesn't have it (source: http://www.ti.com/lit/gpn/am3358)
x264 has ARM NEON optimizations, so it can encode video reasonably well in software; 640x480@30fps should be fine, but 1920x1080@30fps is likely out of reach (you may get 8-10 fps).
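As a rough sketch of what a pure software encode on the BBB could look like (the V4L2 device path, resolution, bitrate and destination address are placeholders, not tested values):

# Capture from a V4L2 camera and encode with libx264 in software, streaming over UDP.
ffmpeg -f v4l2 -framerate 30 -video_size 640x480 -i /dev/video0 \
       -c:v libx264 -preset ultrafast -tune zerolatency -b:v 1M \
       -f mpegts udp://192.168.1.10:5000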
On Raspberry Pi, you can use gstreamer with omxh264enc to take advantage of the onboard VPU to encode video. I think it is a bit rough (not as solid as raspivid etc) but this should get you started: https://blankstechblog.wordpress.com/2015/01/25/hardware-video-encoding-progess-with-the-raspberry-pi/
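A minimal pipeline along those lines might look like the following (an untested sketch: the v4l2 source, the bitrate properties and the RTP/UDP sink are assumptions, and omxh264enc property names can differ between gst-omx versions):

# Capture on the Pi, encode on the VPU with omxh264enc, and stream H.264 over RTP/UDP.
gst-launch-1.0 -v v4l2src device=/dev/video0 \
    ! video/x-raw,width=1920,height=1080,framerate=30/1 \
    ! omxh264enc target-bitrate=4000000 control-rate=variable \
    ! h264parse ! rtph264pay config-interval=1 pt=96 \
    ! udpsink host=192.168.1.10 port=5000

On the receiving end you would depayload and decode with a matching pipeline or a player that accepts an RTP/H.264 stream.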
I am working on a screen-capture-to-H.264-bitstream solution using the Intel Media SDK.
I read that the new 2nd-generation Intel processors have a hardware-accelerated encoder, so I am expecting the encode latency to drop enough to make it real-time.
Using the 32-bit version of ffmpeg for screen capture with x264, I get an end-to-end latency of 200 ms on the Pi. The Raspberry Pi has a hardware decoder, so I am guessing it does the decode in around 80 ms. When I used an Intel i5 520M and a 1st-gen i7 to do the decoding, the end-to-end latency was 250-350 ms; after switching to the Raspberry Pi, that went down to 150-200 ms.
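For reference, the ffmpeg baseline described above looks roughly like this on Windows (the gdigrab capture and zerolatency tuning are assumptions about the exact command used):

# Capture the desktop with GDI, encode with x264 tuned for low latency, stream over UDP.
ffmpeg -f gdigrab -framerate 30 -i desktop \
       -c:v libx264 -preset ultrafast -tune zerolatency \
       -f mpegts udp://192.168.1.20:5000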
How do I link the DirectShow screen capture filter to the Intel Media SDK input?
There is no documentation I can follow; if anyone can shed some light on this, it would be appreciated.
I had success with H.264 screen encoding using DirectX capture plus the H.264 hardware encoder from the Intel Media SDK.
Screen capture by DirectX: 55 ms
RGB4 -> NV12 conversion by Intel Media SDK / VPP: 1 ms
H.264 encoding by Intel Media SDK / hardware encoder: 7 ms
Refer to this link:
https://software.intel.com/en-us/forums/topic/358602
I am trying to use the SoX vad (voice activity detection) feature to analyze a WAV file to determine whether it contains speech (unsurprisingly). However, I am using it on the command line on a Linux server that has no audio device. I would expect that I should be able to run the command and capture the output somehow, but it seems like the vad feature is dependent on using the "play" command, and that appears to be dependent on an audio device.
Is there a way that I can do this without an audio device?
It works here; how did you run it? Here's what I did:
sox infile.wav outfile.wav vad
outfile.wav is trimmed at the front until voice is detected.
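If the goal is a scripted yes/no speech check with no audio device at all, one approach (a sketch, assuming the soxi utility that ships with SoX is available) is to compare durations before and after applying the vad effect:

# No audio device is needed: sox reads and writes files directly.
sox infile.wav trimmed.wav vad
soxi -D infile.wav     # duration of the original, in seconds
soxi -D trimmed.wav    # duration after leading non-speech is trimmed

If vad trims away (almost) the whole file, it likely found no speech. Appending "reverse vad reverse" after the vad effect trims trailing non-speech as well.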
I've googled around a few times, but I have not gotten a straight answer. I have a matrix that I would like to convolve with a discrete filter (e.g. the Sobel operator for edge detection). Is it possible to do this in an accelerated way with OpenGL ES on the iPhone?
If it is, how? If it is not, are there other high-performance tricks I can use to speed up the operation? Wizardly ARM assembly operations that can do it fast? Ultimately, I want to perform as fast a convolution as possible on an iPhone's ARM processor.
You should be able to do this using programmable shaders under OpenGL ES 2.0. I describe OpenGL ES 2.0 shaders in more detail in the video for my class on iTunes U.
Although I've not done image convolution myself, I describe some GPU-accelerated image processing for Mac and iOS here. I present a sample application that uses GLSL shaders (based on Core Image filters developed by Apple) that does realtime color tracking from the iPhone's camera feed.
Since I wrote this, I've created an open source framework based on the above example which has built-in image convolution filters, ranging from Sobel edge detection to custom 3x3 convolution kernels. These can run up to 100X faster than CPU-bound implementations.
However, if you were to do this on the CPU, you might be able to use the Accelerate framework to run some of the operations on the iPhone's NEON SIMD unit. In particular, FFT operations (which are usually a key component in image convolution filters, or so I've heard) can get a ~4-5X speedup by using the routines Apple provides here.