Distored image when convert RGB into YUV. How to fix??? - yuv

From .ppm image, I extracted R,G,B data and saved following structures:
RGB RGB RGB RGB RGB RGB RGB..........
corresponding to each pixel position.
Then I used formula to convert R,G,V into Y,U,V.With each pixel, I obtained YUV correspondingly.
R0G0B0->Y0U0V0 , R1G1B1 ->Y1U1B1, ........
I saved data following YUV422 Data Format:The YUV422 data format shares U and V values between two pixels. As a result, these values are transmitted to the PC image buffer only once for every two pixels.
R0G0B0R1G1B1->Y0UY1V
a,How to calculate U and V from U0,U1 and V0,V1????????
b,In my case , I used this formula:
U=(U0+U1)/2; V=(V0+V1)/2;
Obtained data was saved following structure to create .yuv file:
YUYV YUYV YUYV YUYV YUYV YUYV......
But when I used YUV tools to read new .yuv file, that image is not similar to original image. What did I do wrong here ????

The formula what you have employed is the right one. But a small correction in the arrangement of output data of YUV422 Planar (Horizontal sampling).
Luminance data of Width * Height size followed by Cb data of Width * Height/2 and Cr data of width * Height/2.
It should be as mentioned below:
[Y1 Y2 Y3 ... (Width * Height)] [Cb1 Cb2 .... (Width * Height/2)] [Cr1 Cr2 ...(Width * Height/2)]

Related

Geometrical transformation of a polygon to a higher resolution image

I'm trying to resize and reposition a ROI (region of interest) correctly from a low resolution image (256x256) to a higher resolution image (512x512). It should also be mentioned that the two images cover different field of view - the low and high resolution image have 330mm x 330mm and 180mm x 180mm FoV, respectively.
What I've got at my disposal are:
Physical reference point (in mm) in the 256x256 and 512x512 image, which are refpoint_lowres=(-164.424,-194.462) and refpoint_highres=(-94.3052,-110.923). The reference points are located in the top left pixel (1,1) in their respective images.
Pixel coordinates of the ROI in the 256x256 image (named pxX and pxY). These coordinates are positioned relative to the reference point of the lower resolution image, refpoint_lowres=(-164.424,-194.462).
Pixel spacing for the 256x256 and 512x512 image, which are 0.7757 pixel/mm and 2.8444 pixel/mm respectively.
How can I rescale and reposition the ROI (the binary mask) to correct pixel location in the 512x512 image? Many thanks in advance!!
Attempt
% This gives correctly placed and scaled binary array in the 256x256 image
mask_lowres = double(poly2mask(pxX, pxY, 256., 256.));
% Compute translational shift in pixel
mmShift = refpoint_lowres - refpoint_highres;
pxShift = abs(mmShift./pixspacing_highres)
% This produces a binary array that is only positioned correctly in the
% 512x512 image, but it is not upscaled correctly...(?)
mask_highres = double(poly2mask(pxX + pxShift(1), pxY + pxShift(2), 512.,
512.));
So you have coordinates pxX, and pxY in pixels with respect to the low-resolution image. You can transform these coordinates to real-world coordinates:
pxX_rw = pxX / 0.7757 - 164.424;
pxY_rw = pxY / 0.7757 - 194.462;
Next you can transform these coordinates to high-res coordinates:
pxX_hr = (pxX_rw - 94.3052) * 2.8444;
pxY_hr = (pxY_rw - 110.923) * 2.8444;
Since the original coordinates fit in the low-res image, but the high-res image is smaller (in physical coordinates) than the low-res one, it is possible that these new coordinates do not fit in the high-res image. If this is the case, cropping the polygon is a non-trivial exercise, it cannot be done by simply moving the vertices to be inside the field of view. MATLAB R2017b introduces the polyshape object type, which you can intersect:
bbox = polyshape([0 0 180 180] - 94.3052, [180 0 0 180] - 110.923);
poly = polyshape(pxX_rw, pxY_rw);
poly = intersect([poly bbox]);
pxX_rw = poly.Vertices(:,1);
pxY_rw = poly.Vertices(:,2);
If you have an earlier version of MATLAB, maybe the easiest solution is to make the field of view larger to draw the polygon, then crop the resulting image to the right size. But this does require some proper calculation to get it right.

How to get depth value of specific RGB pixels in Kinect v2 images using Matlab

I'm working with the Kinect v2 and I have to map the depth information onto the RGB images to process them: in particular, I need to know which pixels in the RGB images are in a certain range of distance (depth) along the Z axis; I'm acquiring all the data with a C# program and saving them as images (RGB) and txt files (depth).
I've followed the instruction from here and here (and I thank them for sharing), but I still have some problems I don't know how to solve.
I have calculated the rotation (R) and translation (T) matrix between the depth sensor and the RGB camera, as well as their intrinsic parameters.
I have created P3D_d (depth pixels in world coordinates related to depth sensor) and P3D_rgb (depth pixels in world coordinates related to rgb camera).
row_num = 424;
col_num = 512;
P3D_d = zeros(row_num,col_num,3);
P3D_rgb = zeros(row_num,col_num,3);
cont = 1;
for row=1:row_num
for col=1:col_num
P3D_d(row,col,1) = (row - cx_d) * depth(row,col) / fx_d;
P3D_d(row,col,2) = (col - cy_d) * depth(row,col) / fy_d;
P3D_d(row,col,3) = depth(row,col);
temp = [P3D_d(row,col,1);P3D_d(row,col,2);P3D_d(row,col,3)];
P3D_rgb(row,col,:) = R*temp+T;
end
end
I have created P2D_rgb_x and P2D_rgb_y.
P2D_rgb_x(:,:,1) = (P3D_rgb(:,:,1)./P3D_rgb(:,:,3))*fx_rgb+cx_rgb;
P2D_rgb_y(:,:,2) = (P3D_rgb(:,:,2)./P3D_rgb(:,:,3))*fy_rgb+cy_rgb;
but now I don't understand how to continue.
Assuming that the calibration parameters are correct, I've tried to click on a defined point in both the depth (coordinates: row_d, col_d) and rgb (coordinates: row_rgb, col_rgb) images, but P2D_rgb_x(row_d, col_d) is totally different from row_rgb, as well as P2D_rgb_y(row_d, col_d) is totally different from col_rgb.
So, what do exactly mean P2D_rgb_x and P2D_rgb_y? How can I use them to map depth value onto rgb images or just to get the depth of a certain RGB pixel?
I'll apreciate any suggest or help!
PS: I've also a related post on MathWorks at this link

Align already captured rgb and depth images

I am trying to allign two images - one rgb and another depth using MATLAB. Please note that I have checked several places for this - like here , here which requires a kinect device, and here here which says that camera parameters are required for calibration. I was also suggested to use EPIPOLAR GEOMETRY to match the two images though I do not know how. The dataset I am referring to is given in rgb-d-t face dataset. One such example is illustrated below :
The ground truth which basically means the bounding boxes which specify the face region of interest are already provided and I use them to crop the face regions only. The matlab code is illustrated below :
I = imread('1.jpg');
I1 = imcrop(I,[218,198,158,122]);
I2 = imcrop(I,[243,209,140,108]);
figure, subplot(1,2,1),imshow(I1);
subplot(1,2,2),imshow(I2);
The two cropped images rgb and depth are shown below :
Is there any way by which we can register/allign the images. I took the hint from
here where basic sobel operator has been used on both the rgb and depth images to generate an edge map and then keypoints will need to be generated for matching purposes. The edge maps for both the images are generated here.
.
However they are so noisy that I do not think we will be able to do keypoint matching for this images.
Can anybody suggest some algorithms in matlab to do the same ?
prologue
This answer is based on mine previous answer:
Does Kinect Infrared View Have an offset with the Kinect Depth View
I manually crop your input image so I separate colors and depth images (as my program need them separated. This could cause minor offset change by few pixels. Also as I do not have the depths (depth image is 8bit only due to grayscale RGB) then the depth accuracy I work with is very poor see:
So my results are affected by all this negatively. Anyway here is what you need to do:
determine FOV for both images
So find some measurable feature visible on both images. The bigger in size the more accurate the result. For example I choose these:
form a point cloud or mesh
I use depth image as reference so my point cloud is in its FOV. As I do not have the distances but 8bit values instead I converted that to some distance by multiplying by constant. So I scan whole depth image and for every pixel I create point in my point cloud array. Then convert the dept pixel coordinate to color image FOV and copy its color too. something like this (in C++):
picture rgb,zed; // your input images
struct pnt3d { float pos[3]; DWORD rgb; pnt3d(){}; pnt3d(pnt3d& a){ *this=a; }; ~pnt3d(){}; pnt3d* operator = (const pnt3d *a) { *this=*a; return this; }; /*pnt3d* operator = (const pnt3d &a) { ...copy... return this; };*/ };
pnt3d **xyz=NULL; int xs,ys,ofsx=0,ofsy=0;
void copy_images()
{
int x,y,x0,y0;
float xx,yy;
pnt3d *p;
for (y=0;y<ys;y++)
for (x=0;x<xs;x++)
{
p=&xyz[y][x];
// copy point from depth image
p->pos[0]=2.000*((float(x)/float(xs))-0.5);
p->pos[1]=2.000*((float(y)/float(ys))-0.5)*(float(ys)/float(xs));
p->pos[2]=10.0*float(DWORD(zed.p[y][x].db[0]))/255.0;
// convert dept image x,y to color image space (FOV correction)
xx=float(x)-(0.5*float(xs));
yy=float(y)-(0.5*float(ys));
xx*=98.0/108.0;
yy*=106.0/119.0;
xx+=0.5*float(rgb.xs);
yy+=0.5*float(rgb.ys);
x0=xx; x0+=ofsx;
y0=yy; y0+=ofsy;
// copy color from rgb image if in range
p->rgb=0x00000000; // black
if ((x0>=0)&&(x0<rgb.xs))
if ((y0>=0)&&(y0<rgb.ys))
p->rgb=rgb2bgr(rgb.p[y0][x0].dd); // OpenGL has reverse RGBorder then my image
}
}
where **xyz is my point cloud 2D array allocated t depth image resolution. The picture is my image class for DIP so here some relevant members:
xs,ys is the image resolution in pixels
p[ys][xs] is the image direct pixel access as union of DWORD dd; BYTE db[4]; so I can access color as single 32 bit variable or each color channel separately.
rgb2bgr(DWORD col) just reorder color channels from RGB to BGR.
render it
I use OpenGL for this so here the code:
glBegin(GL_QUADS);
for (int y0=0,y1=1;y1<ys;y0++,y1++)
for (int x0=0,x1=1;x1<xs;x0++,x1++)
{
float z,z0,z1;
z=xyz[y0][x0].pos[2]; z0=z; z1=z0;
z=xyz[y0][x1].pos[2]; if (z0>z) z0=z; if (z1<z) z1=z;
z=xyz[y1][x0].pos[2]; if (z0>z) z0=z; if (z1<z) z1=z;
z=xyz[y1][x1].pos[2]; if (z0>z) z0=z; if (z1<z) z1=z;
if (z0 <=0.01) continue;
if (z1 >=3.90) continue; // 3.972 pre vsetko nad .=3.95m a 4.000 ak nechyti vobec nic
if (z1-z0>=0.10) continue;
glColor4ubv((BYTE* )&xyz[y0][x0].rgb);
glVertex3fv((float*)&xyz[y0][x0].pos);
glColor4ubv((BYTE* )&xyz[y0][x1].rgb);
glVertex3fv((float*)&xyz[y0][x1].pos);
glColor4ubv((BYTE* )&xyz[y1][x1].rgb);
glVertex3fv((float*)&xyz[y1][x1].pos);
glColor4ubv((BYTE* )&xyz[y1][x0].rgb);
glVertex3fv((float*)&xyz[y1][x0].pos);
}
glEnd();
You need to add the OpenGL initialization and camera settings etc of coarse. Here the unaligned result:
align it
If you notice I added ofsx,ofsy variables to copy_images(). This is the offset between cameras. I change them on arrows keystrokes by 1 pixel and then call copy_images and render the result. This way I manually found the offset very quickly:
As you can see the offset is +17 pixels in x axis and +4 pixels in y axis. Here side view to better see the depths:
Hope It helps a bit
Well I have tried doing it after reading lots of blogs and all. I am still not sure whether I am doing it correct or not. Please feel free to give comments if something is found amiss. For this I used a mathworks fex submission that can be found here : ginputc function.
The matlab code is as follows :
clc; clear all; close all;
% no of keypoint
N = 7;
I = imread('2.jpg');
I = rgb2gray(I);
[Gx, Gy] = imgradientxy(I, 'Sobel');
[Gmag, ~] = imgradient(Gx, Gy);
figure, imshow(Gmag, [ ]), title('Gradient magnitude')
I = Gmag;
[x,y] = ginputc(N, 'Color' , 'r');
matchedpoint1 = [x y];
J = imread('2.png');
[Gx, Gy] = imgradientxy(J, 'Sobel');
[Gmag, ~] = imgradient(Gx, Gy);
figure, imshow(Gmag, [ ]), title('Gradient magnitude')
J = Gmag;
[x, y] = ginputc(N, 'Color' , 'r');
matchedpoint2 = [x y];
[tform,inlierPtsDistorted,inlierPtsOriginal] = estimateGeometricTransform(matchedpoint2,matchedpoint1,'similarity');
figure; showMatchedFeatures(J,I,inlierPtsOriginal,inlierPtsDistorted);
title('Matched inlier points');
I = imread('2.jpg'); J = imread('2.png');
I = rgb2gray(I);
outputView = imref2d(size(I));
Ir = imwarp(J,tform,'OutputView',outputView);
figure; imshow(Ir, []);
title('Recovered image');
figure,imshowpair(I,J,'diff'),title('Difference with original');
figure,imshowpair(I,Ir,'diff'),title('Difference with restored');
Step 1
I used the sobel edge detector to extract the edges for both the depth and rgb images and then used a thresholding values to get the edge map. I will be primarily working with the gradient magnitude only. This gives me two images as this :
Step 2
Next I use the ginput or ginputc function to mark keypoints on both the images. The correspondence between the points are established by me beforehand. I tried using SURF features but they do not work well on depth images.
Step 3
Use the estimategeometrictransform to get the transformation matrix tform and then use this matrix to recover the original position of the moved image. The next set of images tells this story.
Granted I still believe the results can be further improved if the keypoint selections in either of the images are more judiciously done. I also think #Specktre method is better. I just noticed that I used a separate image-pair in my answer compared to that of the question. Both images come from the same dataset to be found here vap rgb-d-t dataset.

Applying temporal median filter to a video

I want to apply Temporal Median Filter to a depth map video to ensure temporal consistency and prevent the flickering effect.
Thus, I am trying to apply the filter on all video frames at once by:
First loading all frames,
%%% Read video sequence
numfrm = 5;
infile_name = 'depth_map_1920x1088_80fps.yuv';
width = 1920; %xdim
height = 1088; %ydim
fid_in = fopen(infile_name, 'rb');
[Yd, Ud, Vd] = yuv_import(infile_name,[width, height],numfrm);
fclose(fid_in);
then creating a 3-D depth matrix (height x width x number-of-frames),
%%% Build a stack of images from the video sequence
stack = zeros(height, width, numfrm);
for i=1:numfrm
RGB = yuv2rgb(Yd{i}, Ud{i}, Vd{i});
RGB = RGB(:, :, 1);
stack(:,:,i) = RGB;
end
and finally applying the 1-D median filter along the third direction (time)
temp = medfilt1(stack);
However, for some reason this is not working. When I try to view each frame, I get white images.
frame1 = temp(:,:,1);
imshow(frame1);
Any help would be appreciated!
My guess is that this is actually working but frame1 is of class double and contains values, e.g. between 0 and 255. As imshow represents double images by default on a [0,1] scale, you obtain a white, saturated image.
I would therefore suggest:
caxis auto
after imshow to fix the display problem.
Best,

How do I convert the whole image to grayscale except for a sub image which should be in color?

I have an image and a subimage which is cropped out of the original image.
Here's the code I have written so far:
val1 = imread(img);
val2 = imread(img_w);
gray1 = rgb2gray(val1);%grayscaling both images
gray2 = rgb2gray(val2);
matchingval = normxcorr2(gray1,gray2);%normalized cross correlation
[max_c,imax]=max(abs(matchingval(:)));
After this I am stuck. I have no idea how to change the whole image grayscale except for the sub image which should be in color.
How do I do this?
Thank you.
If you know what the coordinates are for your image, you can always just use the rgb2gray on just the section of interest.
For instance, I tried this on an image just now:
im(500:1045,500:1200,1)=rgb2gray(im(500:1045,500:1200,1:3));
im(500:1045,500:1200,2)=rgb2gray(im(500:1045,500:1200,1:3));
im(500:1045,500:1200,3)=rgb2gray(im(500:1045,500:1200,1:3));
Where I took the rows (500 to 1045), columns (500 to 1200), and the rgb depth (1 to 3) of the image and applied the rgb2gray function to just that. I did it three times as the output of rgb2gray is a 2d matrix and a color image is a 3d matrix, so I needed to change it layer by layer.
This worked for me, making only part of the image gray but leaving the rest in color.
The issue you might have though is this, a color image is 3 dimensions while a gray scale need only be 2 dimensions. Combining them means that the gray scale must be in a 3d matrix.
Depending on what you want to do, this technique may or may not help.
Judging from your code, you are reading the image and the subimage in MATLAB. What you need to know are the coordinates of where you extracted the subimage. Once you do that, simply take your original colour image, convert that to grayscale, then duplicate this image in the third dimension three times. You need to do this so that you can place colour pixels in this image.
For RGB images, grayscale images have the RGB components to all be the same. Duplicating this image in the third dimension three times creates the RGB version of the grayscale image. Once you do that, simply use the row and column coordinates of where you extracted the subimage and place that into the equivalent RGB grayscale image.
As such, given your colour image that is stored in img and your subimage stored in imgsub, and specifying the rows and columns of where you extracted the subimage in row1,col1 and row2,col2 - with row1,col1 being the top left corner of the subimage and row2,col2 is the bottom right corner, do this:
img_gray = rgb2gray(img);
img_gray = cat(3, img_gray, img_gray, img_gray);
img_gray(row1:row2, col1:col2,:) = imgsub;
To demonstrate this, let's try this with an image in MATLAB. We'll use the onion.png image that's part of the image processing toolbox in MATLAB. Therefore:
img = imread('onion.png');
Let's also define our row1,col1,row2,col2:
row1 = 50;
row2 = 90;
col1 = 80;
col2 = 150;
Let's get the subimage:
imgsub = img(row1:row2,col1:col2,:);
Running the above code, this is the image we get:
I took the same example as rayryeng's answer and tried to solve by HSV conversion.
The basic idea is to set the second layer i.e saturation layer to 0 (so that they are grayscale). then rewrite the layer with the original saturation layer only for the sub image area, so that, they alone have the saturation values.
Code:
img = imread('onion.png');
img = rgb2hsv(img);
sPlane = zeros(size(img(:,:,1)));
sPlane(50:90,80:150) = img(50:90,80:150,2);
img(:,:,2) = sPlane;
img = hsv2rgb(img);
imshow(img);
Output: (Same as rayryeng's output)
Related Answer with more details here