I've built a Unity plugin for my UWP app which converts raw H.264 packets to RGB data and renders it to a texture. I've used FFmpeg to do this and it works fine.
int ret = avcodec_send_packet(m_pCodecCtx, &packet);   // feed one raw H.264 packet; returns 0 or an AVERROR code
ret = avcodec_receive_frame(m_pCodecCtx, m_pFrame);    // 0 when a decoded frame is available
// YUV to RGB conversion and render to texture after this
Now, I'm trying to shift to hardware-based decoding using DirectX 11 / DXVA 2.0.
Using this: https://learn.microsoft.com/en-us/windows/desktop/medfound/supporting-direct3d-11-video-decoding-in-media-foundation
I was able to create a decoder (ID3D11VideoDecoder), but I don't know how to supply it the raw H.264 packets and get the YUV or NV12 data as output.
(Or whether it's possible to render the output directly to the texture, since I can get the ID3D11Texture2D pointer.)
So my question is: how do you send the raw H.264 packets to this decoder and get the output from it?
Also, this is for real-time operation, so I'm trying to achieve minimal latency.
Thanks in advance!
Since you already have this working with FFmpeg, I'd suggest using FFmpeg's D3D11 hardware decoding directly.
Check the FFmpeg hardware-decode example for details (a condensed sketch follows below): github.com/FFmpeg/FFmpeg/blob/master/doc/examples/hw_decode.c
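In case it helps, here is roughly what that example reduces to for your case. This is a condensed sketch, not production code: error handling, cleanup, and the rest of your plugin (where the packet comes from, how you render) are omitted, and the function names (open_hw_decoder, decode_packet) are just for illustration; verify against hw_decode.c for your FFmpeg version.
#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext.h>

static enum AVPixelFormat get_hw_format(AVCodecContext *ctx, const enum AVPixelFormat *fmts)
{
    for (const enum AVPixelFormat *p = fmts; *p != AV_PIX_FMT_NONE; p++)
        if (*p == AV_PIX_FMT_D3D11)          // ask for D3D11 hardware surfaces
            return *p;
    return AV_PIX_FMT_NONE;
}

// One-time setup: create a D3D11VA device and attach it to the decoder context.
static AVCodecContext *open_hw_decoder(void)
{
    AVBufferRef *hw_device_ctx = NULL;
    av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_D3D11VA, NULL, NULL, 0);

    const AVCodec  *codec = avcodec_find_decoder(AV_CODEC_ID_H264);
    AVCodecContext *ctx   = avcodec_alloc_context3(codec);
    ctx->get_format       = get_hw_format;
    ctx->hw_device_ctx    = av_buffer_ref(hw_device_ctx);
    avcodec_open2(ctx, codec, NULL);
    return ctx;
}

// Per packet: the same send/receive loop you already use, except the frame
// now lives in GPU memory.
static void decode_packet(AVCodecContext *ctx, AVPacket *packet,
                          AVFrame *frame, AVFrame *sw_frame)
{
    avcodec_send_packet(ctx, packet);
    while (avcodec_receive_frame(ctx, frame) == 0) {
        if (frame->format == AV_PIX_FMT_D3D11) {
            // frame->data[0] is an ID3D11Texture2D*, frame->data[1] its array slice,
            // so you can copy it straight into your Unity texture on the GPU, or:
            av_hwframe_transfer_data(sw_frame, frame, 0);  // download NV12 to system memory
        }
        av_frame_unref(frame);
    }
}
The zero-copy path (keeping the decoded ID3D11Texture2D on the GPU and converting NV12 to RGB in a shader) is what keeps latency lowest; av_hwframe_transfer_data is the simpler but slower route.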
I want to render a scene with its depth image, but I don't know how to render a 16-bit or 32-bit single-channel depth image. I can only save an 8-bit, 4-channel PNG image.
PNGCS writes 16-bit-per-channel PNG files. With some work it can be extended to support 32 bits per channel as well.
I have downloaded a .raw file of depth data from this website:
3D Video Download
In order to get a depth data image, I wrote a script in Unity as below:
However, this is the texture I got.
How can I get a depth data texture like the one below?
RAW is not a standardized format. While most of the variants are pretty easy to read (there's rarely any compression), it might not be just one call to LoadRawTextureData.
I am assuming you have tried texture formats other than PVRTC_RGBA4 and they all failed?
First off, if you have the resolution of your image and the file size, you can try to guess the format. For depth it's common to use 8-bit or 16-bit values; if you need 16 bits, you take two bytes and do
(a << 8) | b
or, equivalently,
a * 256 + b
But sometimes another operation is required (e.g. for 18-bit formats).
Once you have your values, getting the texture is as easy as calling SetPixel enough times; a minimal sketch of the byte-combining step follows below.
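To make that concrete, here is a small C sketch of the byte-combining and normalization step. The function name, byte order, and maximum depth are assumptions; adjust them to whatever your .raw file actually uses, then push the resulting gray values into the texture with SetPixel and Apply.
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: turn a raw 16-bit big-endian depth buffer into
   8-bit gray values suitable for building a preview texture. */
void depth_to_gray(const uint8_t *raw, size_t num_pixels,
                   uint8_t *gray, uint16_t max_depth)
{
    for (size_t i = 0; i < num_pixels; i++) {
        uint8_t  a = raw[2 * i];                       /* high byte */
        uint8_t  b = raw[2 * i + 1];                   /* low byte  */
        uint16_t d = (uint16_t)((a << 8) | b);         /* the (a<<8)|b step, same as a*256+b */
        if (d > max_depth) d = max_depth;
        gray[i] = (uint8_t)((d * 255u) / max_depth);   /* normalize for display */
    }
}
If the file turns out to be little-endian, just swap a and b. Depending on the format, loading the bytes into a 16-bit single-channel texture (e.g. TextureFormat.R16) may also work without any manual conversion.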
I want to create a hologram that is captured via the Kinect and exported to a HoloLens, but it's very slow.
I use this tutorial to collect point cloud data, and this library to export my data as a 3D object in .obj format. The library that exports .obj doesn't accept points, so I had to draw little triangles. I save the .obj, .png and .mtl files on my local XAMPP server.
Next, I download the files with a Unity script and a WWW object. I also use the Runtime OBJ Importer from Unity's Asset Store to create a 3D object at runtime.
The last part is to deploy the Unity app to a HoloLens (I will do that next).
But before that: the process works, but it is very slow. I want the hologram to be fluid. A lot of time is wasted on:
taking depth and RGB data from the Kinect
exporting the data to .obj, .png and .mtl files
downloading the files in Unity as frequently as possible
rendering the files
I'm thinking of streaming, but does Unity need a complete .obj file to render? If I compress the .png to .jpg, will I gain some time?
Do you have some pointers to help me?
Currently the way your question is phrased is confusing: it's unclear whether you want to record point clouds that you later load and render in Unity, or whether you want to somehow stream the point cloud with the aligned RGB texture to Unity in close to real time.
Your initial attempts are using Processing.
In terms of recording data, I recommend using the SimpleOpenNI library which can record both depth and RGB data to an .oni file (see the RecorderPlay example).
Once you have a recording, you can loop through each frame and for each frame store the vertices to a file.
In terms of saving to .obj you'll need to convert the point cloud to a mesh (triangulate the vertices in 3D).
Another option would be to store the point cloud to a format like .ply.
You can find more info on writing to a .ply file in Processing in this answer.
In terms of streaming the data, this will be complicated:
if you stream all the vertices, that's a lot of data: up to 921,600 floats ((640 × 480 = 307,200 points) × 3 coordinates)
if you stream both the depth (11-bit 640×480) and RGB (8-bit 640×480) images, that will be even more data.
One option might be to send only the vertices that have a valid depth, and to skip points overall (e.g. send every 3rd point); a rough sketch of that subsampling follows below. In terms of sending the data you can try OSC.
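Just to make the subsampling idea concrete, here is a rough sketch in C (the names, grid step, and packing scheme are made up; the same logic is easy to port to Processing/Java before sending over OSC):
#include <stdint.h>
#include <stddef.h>

#define W    640
#define H    480
#define STEP 3          /* keep every 3rd point in x and y */

/* Hypothetical packing step: keep only valid depths, on a coarse grid.
   Returns the number of uint16 values written into out[] as (x, y, depth) triplets. */
size_t thin_depth_frame(const uint16_t depth[W * H], uint16_t out[])
{
    size_t n = 0;
    for (int y = 0; y < H; y += STEP) {
        for (int x = 0; x < W; x += STEP) {
            uint16_t d = depth[y * W + x];
            if (d == 0)                 /* 0 = no reading on the Kinect v1 */
                continue;
            out[n++] = (uint16_t)x;
            out[n++] = (uint16_t)y;
            out[n++] = d;
        }
    }
    return n;   /* at most ~34k points per frame instead of 307k */
}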
Once you get the points into Unity, you should be able to render them as a point cloud there.
What would be ideal in terms of network performance is a codec (compressor/decompressor) for the depth data. I haven't used one thus far, but doing a quick search I see there are options like this one (very dated).
You'll need to do a bit of research to see what Kinect v1 depth-streaming libraries are already out there, then test and see what works best for your scenario.
Ideally, if the library is written in C#, there's a chance you'll be able to use it to decode the received data in Unity.
When I use ffmpeg to scale an H.264 video, it seems that the video is decoded to raw frames, then scaled, then encoded again. But if speed is critical, is there a faster way if I specify a "good" ratio like 2:1, as if I want to pick one pixel out of every four?
I know a bit about how H.264 works: 8×8 / 4×4 blocks of pixels are coded as a group, so it's not easy to pick out 1/4 of the pixels within a block. But is there a way to merge 4 blocks into one quickly?
When you use ffmpeg for scaling, there is no way to avoid re-encoding any part of the video. For a scale operation, ffmpeg works as a pipeline, as below:
Decoder ----> Scaler -----> Encoder
The scaler performs the scale operation only after a completely decoded frame is available to it. Since every packet passes through this pipeline, the encoder receives video frames only in decompressed (YUV) form, so every YUV frame gets re-encoded after the scale operation. I hope that clarifies why there is no way to avoid re-encoding.
The scaling ratio does play a role in complexity. A 2:1 scale ratio is OK; the ratio affects the number of taps (filter coefficients) used by the scaling algorithm. The scaling algorithm you choose also adds another layer of complexity: the least complex scaling algorithm in ffmpeg is "fast_bilinear", but be aware of the video-quality trade-off. A condensed sketch of this scaler stage follows below.
Of course, encoding speed is another factor to consider; it seems you know a fair bit about it. One thing: see if you can make use of a hardware decoder and encoder that may be available on your system. If a hardware codec is available, it greatly improves the speed of the entire pipeline. You can try the -hwaccel dxva2 option for ffmpeg.
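For reference, the Scaler stage above corresponds to libswscale. The following is a condensed sketch only (no error handling, assuming already-decoded YUV420P frames, and a made-up function name) of a 2:1 downscale with the fast_bilinear filter:
#include <stdint.h>
#include <libswscale/swscale.h>
#include <libavutil/frame.h>

/* src is a decoded AVFrame (e.g. 1280x720, YUV420P); dst receives the 2:1 downscale. */
static AVFrame *downscale_2to1(const AVFrame *src)
{
    AVFrame *dst = av_frame_alloc();
    dst->format  = AV_PIX_FMT_YUV420P;
    dst->width   = src->width  / 2;
    dst->height  = src->height / 2;
    av_frame_get_buffer(dst, 0);

    struct SwsContext *sws = sws_getContext(src->width, src->height, AV_PIX_FMT_YUV420P,
                                            dst->width, dst->height, AV_PIX_FMT_YUV420P,
                                            SWS_FAST_BILINEAR, NULL, NULL, NULL);
    sws_scale(sws, (const uint8_t * const *)src->data, src->linesize,
              0, src->height, dst->data, dst->linesize);
    sws_freeContext(sws);
    return dst;   /* hand this frame to the H.264 encoder (avcodec_send_frame) */
}
From the command line, the rough equivalent is ffmpeg -i in.mp4 -vf scale=iw/2:ih/2 -sws_flags fast_bilinear out.mp4, optionally with -hwaccel dxva2 placed before -i to use a hardware decoder.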
I want to find a method to eliminate repeated frames from a video. Consider a video that shows the same frame for 5 seconds: I want to include only one copy of that frame in the video and keep it visible for 5 seconds. Here I am looking to minimize the file size by eliminating duplicate frames. Is there a method to do this using Matlab?
If your movie is just a series of stills that you wish to show as a slideshow/presentation with a fixed five-second delay, then you should be able to use the 'FrameRate' property of the VideoWriter class. Try something like this example:
writerObj = VideoWriter('peaks.mp4','MPEG-4');
writerObj.FrameRate = 0.2; % One frame every 5 seconds
open(writerObj);
Z = peaks;
surf(Z);
for k = 1:4:20
    surf(sin(2*pi*k/20)*Z,Z);        % draw the next still
    writeVideo(writerObj,getframe);  % grab the current figure as one frame
end
close(writerObj);
However, the frame-rate property cannot be varied over the course of your movie, so the more general form of your question is fundamentally an issue of encoder support for variable frame-rate encoding. Most modern encoders (e.g., H.264 implementations) are not designed to handle this explicitly, but rather have heuristics that can detect when content is not changing and encode the data efficiently (particularly if multi-pass encoding is used). Unfortunately, Matlab (I'm assuming that you've been using the VideoWriter class) doesn't really provide a great deal of control in this respect. I'm not even sure what inter-frame encoding settings are used for MPEG-4 with H.264 videos.
If the MPEG-4/H.264 videos produced by VideoWriter are unacceptable, I'd recommend exporting your video in the highest quality possible (or lossless) and then learning to use a full-fledged encoding framework/library (ffmpeg, libav, x264) or application to encode to the quality and size you desire. Apparently HandBrake has support for variable frame-rate encoding, though it's not necessarily designed for what you may want and I've not tested it. Or export your individual still frames and use actual video-editing software (e.g., iMovie on OS X). There are also likely dedicated applications that can create a movie from a slideshow/presentation (both PowerPoint and Keynote can do this).
Within Matlab, another alternative is to use a codec that explicitly supports variable frame rates: QuickTime's image-based codecs, i.e. Photo JPEG (not to be confused with Motion-JPEG), Photo PNG (a.k.a. Apple PNG), and Photo TIFF (a.k.a. Apple TIFF). You can encode content directly with these codecs using my QTWriter, which is available on GitHub. Note, however, that on OS X 10.9+, QuickTime Player converts lossless variable frame-rate Photo PNG and Photo TIFF movies to lossy fixed frame-rate H.264 (Photo JPEG movies are not converted). See my note towards the bottom of this page for further details and a workaround.