How to get transformation affine from ITK registration?

Given 3D MRI scans A, B, and C I want to perform an affine (co)registration of B onto A, take the transformation affine matrix of the registration and apply it on C.
My problem is that the affine matrix of the registration transform has the wrong signs. Maybe due to wrong orientation?
The TransformParameters contain 12 values of which the first 9 are the rotation matrix in row-major order and the last 3 are the translation values.
TransformParameters = [R1, R2, R3, R4, R5, R6, R7, R8, R9, Tx, Ty, Tz]
registration_affine = [[R1, R2, R3, Tx],
                       [R4, R5, R6, Ty],
                       [R7, R8, R9, Tz],
                       [ 0,  0,  0,  1]]
I know that ITK holds images in LPS orientation and nibabel in RAS.
So I tried to adjust the transform affine for this orientation difference, but that did not work out.
I cannot reproduce the registration output of ITK; below are some example numbers and my minimal code example.
To test this, I applied a known affine transformation to an existing image. The inverse of that transformation matrix is the true affine the registration should recover.
array([[ 1.02800583, 0.11462834, -0.11426342, -0.43383606],
[ 0.11462834, 1.02800583, -0.11426342, 0.47954143],
[-0.11426342, -0.11426342, 1.02285268, -0.20457054],
[ 0. , 0. , 0. , 1. ]])
But the affine constructed as explained above yields:
array([[ 1.02757335, 0.11459412, 0.11448339, 0.23000557],
[ 0.11410441, 1.02746452, 0.11413955, -0.20848751],
[ 0.11398788, 0.11411115, 1.02255042, -0.04884404],
[ 0. , 0. , 0. , 1. ]])
You can see that the values are quite close but that only the signs are wrong.
In fact, if I manually set the same signs as in the "true" matrix, the transformation matrix is good.
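For reference, a synthetic test like the one described above could be set up roughly as follows. This is a hypothetical sketch: the actual affine used to create moving_2mm.nii.gz is not shown in the question, so known_affine below is a made-up placeholder; only the file names and the MONAI Affine usage match the minimal example further down.
import nibabel
import numpy as np
from monai.transforms import Affine

# Hypothetical known affine used to warp the fixed image into a synthetic moving image.
known_affine = np.array([[ 0.97, -0.10,  0.10,  0.40],
                         [-0.10,  0.97,  0.10, -0.45],
                         [ 0.10,  0.10,  0.98,  0.20],
                         [ 0.00,  0.00,  0.00,  1.00]])

fixed_ni = nibabel.load('fixed_2mm.nii.gz')
fixed_np = fixed_ni.get_fdata()

# Warp with the known affine to produce the synthetic "moving" image.
warp = Affine(affine=known_affine, image_only=True)
moving_np = np.squeeze(warp(fixed_np[np.newaxis, ...]))
nibabel.save(nibabel.Nifti1Image(moving_np, fixed_ni.affine, header=fixed_ni.header),
             'moving_2mm.nii.gz')

# The registration should then recover (approximately) the inverse of known_affine.
true_affine = np.linalg.inv(known_affine)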
In the ITK loader of MONAI I found code suggesting the following to convert an ITK affine to a nibabel affine:
np.diag([-1, -1, 1, 1]) @ registration_affine
If I use nibabel's ornt_transform method to get the orientation transform from LPS to RAS, it returns [-1, -1, 1], which matches what is done in the ITK loader of MONAI.
But applying this to the affine from above does not actually yield the correct signs (only in the translation bit):
array([[-1.02757335, -0.11459412, -0.11448339, -0.23000557],
[-0.11410441, -1.02746452, -0.11413955, 0.20848751],
[ 0.11398788, 0.11411115, 1.02255042, -0.04884404],
[ 0. , 0. , 0. , 1. ]])
So I am a bit stuck here.
Here is a complete minimal code example of what I'm doing / trying to do. See below also for the example data and package versions.
import nibabel
import numpy as np
from monai.transforms import Affine
from nibabel import Nifti1Image
import itk
# Import Images
moving_image = itk.imread('moving_2mm.nii.gz', itk.F)
fixed_image = itk.imread('fixed_2mm.nii.gz', itk.F)
# Import Default Parameter Map
parameter_object = itk.ParameterObject.New()
affine_parameter_map = parameter_object.GetDefaultParameterMap('affine', 4)
affine_parameter_map['FinalBSplineInterpolationOrder'] = ['1']
parameter_object.AddParameterMap(affine_parameter_map)
# Call registration function
result_image, result_transform_parameters = itk.elastix_registration_method(
    fixed_image, moving_image, parameter_object=parameter_object)
parameter_map = result_transform_parameters.GetParameterMap(0)
transform_parameters = np.array(parameter_map['TransformParameters'], dtype=float)
itk.imwrite(result_image, 'reg_itk.nii.gz', compression=True)
# Convert ITK params to affine matrix
rotation = transform_parameters[:9].reshape(3, 3)
translation = transform_parameters[-3:][..., np.newaxis]
reg_affine: np.ndarray = np.append(rotation, translation, axis=1) # type: ignore
reg_affine = np.append(reg_affine, [[0, 0, 0, 1]], axis=0) # type: ignore
# Apply affine transform matrix via MONAI
moving_image_ni: Nifti1Image = nibabel.load('moving_2mm.nii.gz')
fixed_image_ni: Nifti1Image = nibabel.load('fixed_2mm.nii.gz')
moving_image_np: np.ndarray = moving_image_ni.get_fdata() # type: ignore
LPS = nibabel.orientations.axcodes2ornt(('L', 'P', 'S'))
RAS = nibabel.orientations.axcodes2ornt(('R', 'A', 'S'))
ornt_transform = nibabel.orientations.ornt_transform(LPS, RAS)[:, -1] # type: ignore
affine_transform = Affine(affine=np.diag([*ornt_transform, 1]) @ reg_affine, image_only=False)
out_img, out_affine = affine_transform(moving_image_np[np.newaxis, ...])
reg_monai = np.squeeze(out_img)
out = Nifti1Image(reg_monai, fixed_image_ni.affine, header=fixed_image_ni.header)
nibabel.save(out, 'reg_monai.nii.gz')
Input data:
fixed_2mm.nii.gz
moving_2mm.nii.gz
Output data:
reg_itk.nii.gz
reg_monai.nii.gz
Package versions:
itk-elastix==0.12.0
monai==0.8.0
nibabel==3.1.1
numpy==1.19.2
I asked this question before on the ITKElastix project on GitHub (#145) but could not resolve my issue there. Thanks to dzenanz and mstaring, who tried to help.

After a lot of trying and discussing with my team, we came to a realization of what is going on.
We already established how to read the ITK TransformParameters: the first 9 numbers form the rotation matrix in row-major order and the last three are the translation vector.
rot00, rot01, rot02, rot10, rot11, rot12, rot20, rot21, rot22, tx, ty, tz = parameter_map['TransformParameters']
affine = np.array([
    [rot00, rot01, rot02, tx],
    [rot10, rot11, rot12, ty],
    [rot20, rot21, rot22, tz],
    [    0,     0,     0,  1],
], dtype=np.float32)  # yapf: disable
We already knew that nibabel holds images in RAS orientation and ITK in LPS orientation.
We also knew that changing the orientation of an image means flipping the respective axes.
LPS to RAS means flipping L->R and P->A, i.e. the first two axes.
Representing a flip by -1 and no flip by 1, a flip of the first two axes can be described by [-1, -1, 1].
We can construct an affine transformation matrix for this flip with np.diag([-1, -1, 1, 1]) (the trailing 1 acts on the homogeneous coordinate).
The affine transformation matrix to flip between LPS and RAS is therefore:
flip_LPS_RAS = np.array([[-1,  0,  0,  0],
                         [ 0, -1,  0,  0],
                         [ 0,  0,  1,  0],
                         [ 0,  0,  0,  1]])
Note that this flip works both ways: LPS -> RAS and RAS -> LPS.
If you have a 3D image and its affine matrix, you can flip the axes of the image by applying flip_LPS_RAS.
If you want to calculate the new affine of the flipped image, you can do:
flipped = flip_LPS_RAS @ image_affine
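As a quick sanity check (a small sketch; the image file is just the one from the example above), the flip matrix is its own inverse, which is why the same matrix works in both directions:
import nibabel
import numpy as np

flip_LPS_RAS = np.diag([-1, -1, 1, 1])

# Applying the flip twice gives the identity, so LPS -> RAS and RAS -> LPS use the same matrix.
assert np.allclose(flip_LPS_RAS @ flip_LPS_RAS, np.eye(4))

# Re-expressing an image affine after flipping the first two axes:
image_affine = nibabel.load('moving_2mm.nii.gz').affine
flipped = flip_LPS_RAS @ image_affine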
With the foundation laid out, let's now look at what we failed to figure out.
We were aware that the affine matrix of the registration transform is based on an image in LPS orientation, and that the nibabel image is in RAS.
Our thought was that we needed to convert the transformation affine from LPS orientation into RAS orientation, similar to the image reorientation mentioned above.
So we applied the flip_LPS_RAS affine to the registration affine.
What we got wrong is that this does not turn the affine into a RAS-oriented transformation.
The point is that the registration affine expects to be applied to an image in LPS orientation and outputs an image in LPS orientation.
Let's recap: the image affine is in RAS orientation, while the registration affine expects an image in LPS orientation.
Now it becomes easier to see that to apply the registration transform to an image in RAS orientation, we first need to change the orientation of the image to LPS, and after the registration change it back to RAS.
image -> flip_LPS_RAS -> registration_lps -> flip_LPS_RAS
We are only interested in the affine matrix for the registration transform, so let's ignore the image in the above chain of transformations.
Writing this affine transformation chain in code:
registration_ras = flip_LPS_RAS @ registration_lps @ flip_LPS_RAS
This yields a single affine matrix that takes a RAS-oriented image, changes it to LPS, performs the registration in LPS orientation, and changes the orientation back to RAS - in other words, it performs the ITK registration on a RAS-oriented image.
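To make this concrete, here is a small numpy sketch using rounded values of the matrix constructed from the TransformParameters in the question (registration_lps stands in for that 4x4 matrix). The conjugation flips exactly the signs that differed from the "true" matrix:
import numpy as np

flip_LPS_RAS = np.diag([-1, -1, 1, 1])

# Rounded version of the matrix built from the TransformParameters (LPS convention).
registration_lps = np.array([[1.028, 0.115, 0.114,  0.230],
                             [0.114, 1.027, 0.114, -0.209],
                             [0.114, 0.114, 1.023, -0.049],
                             [0.000, 0.000, 0.000,  1.000]])

registration_ras = flip_LPS_RAS @ registration_lps @ flip_LPS_RAS

# Entry (i, j) is multiplied by s[i] * s[j] with s = (-1, -1, 1, 1), so the conjugation
# negates the x/z and y/z rotation couplings and the x/y translation components --
# the same sign pattern as the "true" matrix shown earlier in the question.
print(registration_ras)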
From the minimal code example above, the following should now work:
affine_transform = Affine(affine=registration_ras, image_only=True)
out_img = affine_transform(moving_image_np[np.newaxis, ...])

Taking a look at this diff, you might be more interested in the old way of doing it, which directly constructs an ITK transform from the 4x4 matrix.
But beware: I think there is a bug somewhere in that code. I added it recently and it decreased test accuracy, which is what makes me suspect the bug.

Related

Understanding ARKit World Transform Matrices

In ARKit, when I perform a hit-test, I get back an instance of ARHitTestResult. One of the properties of this is worldTransform, which I understand contains a 4x4 transformation matrix of the position of the object – simd_float4x4.
As someone who is very unfamiliar with linear algebra and 3D graphics, how would I edit this matrix to, say, increase its y coordinate by 0.05?
If there is a blog post or something I could look at that would help me wrap my head around this, please let me know. I am not sure what terms I should be googling.
Sorry if my question is full of misunderstandings! As you can probably tell, I am not too familiar with these concepts.
Thank you to anyone who helps.
EDIT: The original question is best addressed by just adding 0.05 to the y component of the node's position. However, the original answer below does address a bit about composing transformation matrices, if that is something you are interested in.
======================================================================
If you want to apply an operation to a matrix, the most immediately simple way is to make a matrix that does that operation, and then multiply your original matrix by that new matrix.
For a translation, assuming you want to translate by x, y, z, you can do this:
let translation = simd_float4x4(
    float4(1, 0, 0, 0),
    float4(0, 1, 0, 0),
    float4(0, 0, 1, 0),
    float4(x, y, z, 1)
)
Note that this is just an identity matrix (1 down the diagonal) with the last column (!!!important, the float4s above are COLUMNS, not ROWS, as they would visually seem) set to contain the x/y/z values. You can research further into homogeneous coordinates, but think of this as just how a translation is represented.
Then, in simd, just do this: let newWorldTransform = translation * oldWorldTransform and you will have the old world transform translated by your x/y/z translation values (in your example, [x, y, z] = [0, 0.05, 0]).
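If it helps to see the same idea outside of simd, here is a small numpy sketch (purely illustrative, not ARKit/simd API) of a homogeneous translation matrix acting on a point, and of the fact that multiplication order matters:
import numpy as np

def translation(x, y, z):
    # 4x4 homogeneous translation matrix (column-vector convention, like simd).
    t = np.eye(4)
    t[:3, 3] = [x, y, z]
    return t

point = np.array([1.0, 2.0, 3.0, 1.0])   # homogeneous point
lift = translation(0.0, 0.05, 0.0)
print(lift @ point)                      # -> [1.0, 2.05, 3.0, 1.0]

# Order matters when composing with another transform, e.g. a 90-degree rotation about z:
c, s = np.cos(np.pi / 2), np.sin(np.pi / 2)
rot_z = np.array([[c, -s, 0, 0],
                  [s,  c, 0, 0],
                  [0,  0, 1, 0],
                  [0,  0, 0, 1]])
assert not np.allclose(lift @ rot_z, rot_z @ lift)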
However, it may be worth exploring why you want to edit your hit test results. I cannot think of a practical use case for that, so maybe if you explain a bit more about what you are trying to do I could suggest a more intuitive way to do it.
Matrices are the standard way in 3D graphics to translate, rotate, scale and shear 3D objects. In ARKit, RealityKit and SceneKit, for consistent linear transformations you need to use simd_float4x4-like matrices:
var myMatrix: simd_float4x4
A translation 4x4 matrix has 16 elements inside – 4 columns of 4 elements (float4) each, with column indices 0, 1, 2 and 3. The translation matrix uses the fourth column, at index 3.
SceneKit example
Use the following code to place your model 25 cm above its default position SCNVector3(0,0,0):
let sphereNode = SCNNode()
sphereNode.geometry = SCNSphere(radius: 1.0)
sphereNode.geometry?.firstMaterial?.diffuse.contents = UIColor.red
scene.rootNode.addChildNode(sphereNode)
var translation = matrix_identity_float4x4
translation.columns.3.y = 0.25
sphereNode.simdWorldTransform = translation
RealityKit example
let model = ModelEntity(mesh: .generateBox(size: 0.3))
let anchor = AnchorEntity()
anchor.addChild(model)
let currentMatrix = anchor.transform.matrix
var positionMatrix = matrix_identity_float4x4   // start from identity so only the translation component is changed
positionMatrix.columns.3.y = 0.25
let transform = simd_mul(currentMatrix, positionMatrix)
anchor.move(to: transform, relativeTo: model, duration: 1.0)

RGB Depth Alignment [duplicate]

I am trying to align two images - one rgb and another depth - using MATLAB. Please note that I have checked several places for this - like here, here which requires a Kinect device, and here, here which say that camera parameters are required for calibration. I was also suggested to use EPIPOLAR GEOMETRY to match the two images, though I do not know how. The dataset I am referring to is given in the rgb-d-t face dataset. One such example is illustrated below:
The ground truth, which basically means the bounding boxes that specify the face region of interest, is already provided and I use it to crop the face regions only. The MATLAB code is illustrated below:
I = imread('1.jpg');
I1 = imcrop(I,[218,198,158,122]);
I2 = imcrop(I,[243,209,140,108]);
figure, subplot(1,2,1),imshow(I1);
subplot(1,2,2),imshow(I2);
The two cropped images, rgb and depth, are shown below:
Is there any way by which we can register/align the images? I took the hint from
here, where a basic Sobel operator has been used on both the rgb and depth images to generate an edge map, and then keypoints would need to be generated for matching purposes. The edge maps for both the images are generated here.
However, they are so noisy that I do not think we will be able to do keypoint matching for these images.
Can anybody suggest some algorithms in MATLAB to do the same?
prologue
This answer is based on my previous answer:
Does Kinect Infrared View Have an offset with the Kinect Depth View
I manually cropped your input image to separate the color and depth images (as my program needs them separated). This could cause a minor offset change of a few pixels. Also, as I do not have the real depths (the depth image is only 8-bit, from a grayscale RGB), the depth accuracy I work with is very poor; see:
So my results are negatively affected by all this. Anyway, here is what you need to do:
determine FOV for both images
Find some measurable feature visible in both images. The bigger it is, the more accurate the result. For example, I chose these:
form a point cloud or mesh
I use the depth image as the reference, so my point cloud is in its FOV. As I do not have the distances but 8-bit values instead, I convert them to some distance by multiplying by a constant. I scan the whole depth image and for every pixel I create a point in my point cloud array, then convert the depth pixel coordinate to the color image FOV and copy its color too. Something like this (in C++):
picture rgb,zed; // your input images
struct pnt3d { float pos[3]; DWORD rgb; pnt3d(){}; pnt3d(pnt3d& a){ *this=a; }; ~pnt3d(){}; pnt3d* operator = (const pnt3d *a) { *this=*a; return this; }; /*pnt3d* operator = (const pnt3d &a) { ...copy... return this; };*/ };
pnt3d **xyz=NULL; int xs,ys,ofsx=0,ofsy=0;
void copy_images()
    {
    int x,y,x0,y0;
    float xx,yy;
    pnt3d *p;
    for (y=0;y<ys;y++)
     for (x=0;x<xs;x++)
        {
        p=&xyz[y][x];
        // copy point from depth image
        p->pos[0]=2.000*((float(x)/float(xs))-0.5);
        p->pos[1]=2.000*((float(y)/float(ys))-0.5)*(float(ys)/float(xs));
        p->pos[2]=10.0*float(DWORD(zed.p[y][x].db[0]))/255.0;
        // convert depth image x,y to color image space (FOV correction)
        xx=float(x)-(0.5*float(xs));
        yy=float(y)-(0.5*float(ys));
        xx*=98.0/108.0;
        yy*=106.0/119.0;
        xx+=0.5*float(rgb.xs);
        yy+=0.5*float(rgb.ys);
        x0=xx; x0+=ofsx;
        y0=yy; y0+=ofsy;
        // copy color from rgb image if in range
        p->rgb=0x00000000; // black
        if ((x0>=0)&&(x0<rgb.xs))
         if ((y0>=0)&&(y0<rgb.ys))
          p->rgb=rgb2bgr(rgb.p[y0][x0].dd); // OpenGL has reversed RGB order compared to my image
        }
    }
where **xyz is my point cloud 2D array, allocated at the depth image resolution. picture is my image class for DIP, so here are some relevant members:
xs,ys is the image resolution in pixels
p[ys][xs] is direct pixel access to the image as a union of DWORD dd; BYTE db[4]; so I can access the color as a single 32-bit variable or each color channel separately.
rgb2bgr(DWORD col) just reorders the color channels from RGB to BGR.
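For readers who prefer Python, the per-pixel FOV correction above can be restated roughly like this (a sketch; the FOV ratios and offsets are the ones hard-coded in the C++ snippet, and image shapes are assumed to be (rows, cols)):
def depth_to_color_pixel(x, y, depth_shape, color_shape, ofsx=0, ofsy=0):
    # Map a depth-image pixel (x, y) to color-image coordinates using the same
    # FOV ratios and manual offsets as the C++ code above.
    ys, xs = depth_shape          # depth image (rows, cols)
    rgb_ys, rgb_xs = color_shape  # color image (rows, cols)
    xx = (x - 0.5 * xs) * (98.0 / 108.0) + 0.5 * rgb_xs + ofsx
    yy = (y - 0.5 * ys) * (106.0 / 119.0) + 0.5 * rgb_ys + ofsy
    return int(xx), int(yy)

# Example: the center of the depth image maps (up to the offsets) to the center of the color image.
print(depth_to_color_pixel(64, 48, depth_shape=(96, 128), color_shape=(120, 160)))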
render it
I use OpenGL for this, so here is the code:
glBegin(GL_QUADS);
for (int y0=0,y1=1;y1<ys;y0++,y1++)
 for (int x0=0,x1=1;x1<xs;x0++,x1++)
    {
    float z,z0,z1;
    z=xyz[y0][x0].pos[2]; z0=z; z1=z0;
    z=xyz[y0][x1].pos[2]; if (z0>z) z0=z; if (z1<z) z1=z;
    z=xyz[y1][x0].pos[2]; if (z0>z) z0=z; if (z1<z) z1=z;
    z=xyz[y1][x1].pos[2]; if (z0>z) z0=z; if (z1<z) z1=z;
    if (z0 <=0.01) continue;
    if (z1 >=3.90) continue; // 3.972 for everything above ~3.95 m, 4.000 if nothing is hit at all
    if (z1-z0>=0.10) continue;
    glColor4ubv((BYTE* )&xyz[y0][x0].rgb);
    glVertex3fv((float*)&xyz[y0][x0].pos);
    glColor4ubv((BYTE* )&xyz[y0][x1].rgb);
    glVertex3fv((float*)&xyz[y0][x1].pos);
    glColor4ubv((BYTE* )&xyz[y1][x1].rgb);
    glVertex3fv((float*)&xyz[y1][x1].pos);
    glColor4ubv((BYTE* )&xyz[y1][x0].rgb);
    glVertex3fv((float*)&xyz[y1][x0].pos);
    }
glEnd();
You need to add the OpenGL initialization and camera settings etc., of course. Here is the unaligned result:
align it
If you noticed, I added the ofsx,ofsy variables to copy_images(). This is the offset between the cameras. I change them by 1 pixel on arrow keystrokes and then call copy_images() and render the result. This way I manually found the offset very quickly:
As you can see, the offset is +17 pixels on the x axis and +4 pixels on the y axis. Here is a side view to better show the depths:
Hope it helps a bit.
Well, I have tried doing it after reading lots of blogs. I am still not sure whether I am doing it correctly or not; please feel free to comment if something seems amiss. For this I used a MathWorks File Exchange submission that can be found here: the ginputc function.
The MATLAB code is as follows:
clc; clear all; close all;
% no of keypoint
N = 7;
I = imread('2.jpg');
I = rgb2gray(I);
[Gx, Gy] = imgradientxy(I, 'Sobel');
[Gmag, ~] = imgradient(Gx, Gy);
figure, imshow(Gmag, [ ]), title('Gradient magnitude')
I = Gmag;
[x,y] = ginputc(N, 'Color' , 'r');
matchedpoint1 = [x y];
J = imread('2.png');
[Gx, Gy] = imgradientxy(J, 'Sobel');
[Gmag, ~] = imgradient(Gx, Gy);
figure, imshow(Gmag, [ ]), title('Gradient magnitude')
J = Gmag;
[x, y] = ginputc(N, 'Color' , 'r');
matchedpoint2 = [x y];
[tform,inlierPtsDistorted,inlierPtsOriginal] = estimateGeometricTransform(matchedpoint2,matchedpoint1,'similarity');
figure; showMatchedFeatures(J,I,inlierPtsOriginal,inlierPtsDistorted);
title('Matched inlier points');
I = imread('2.jpg'); J = imread('2.png');
I = rgb2gray(I);
outputView = imref2d(size(I));
Ir = imwarp(J,tform,'OutputView',outputView);
figure; imshow(Ir, []);
title('Recovered image');
figure,imshowpair(I,J,'diff'),title('Difference with original');
figure,imshowpair(I,Ir,'diff'),title('Difference with restored');
Step 1
I used the Sobel edge detector to extract the edges of both the depth and rgb images and then used thresholding to get the edge map. I will be working primarily with the gradient magnitude only. This gives me two images like this:
Step 2
Next I use the ginput or ginputc function to mark keypoints on both images. The correspondence between the points is established by me beforehand. I tried using SURF features, but they do not work well on depth images.
Step 3
Use estimateGeometricTransform to get the transformation matrix tform, then use this matrix to recover the original position of the moved image. The next set of images tells this story.
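If you would rather avoid the MATLAB toolbox route, a rough Python/OpenCV analogue of steps 2-3 could look like the sketch below. The keypoint lists are placeholders you would pick by hand, and the file names follow the example above; this is an illustration of the same similarity-fit-plus-warp idea, not a drop-in replacement.
import cv2
import numpy as np

# Hand-picked corresponding keypoints (placeholder values), depth image -> rgb image.
pts_depth = np.array([[52, 40], [118, 38], [60, 102], [115, 100]], dtype=np.float32)
pts_rgb   = np.array([[35, 45], [128, 43], [47, 110], [126, 108]], dtype=np.float32)

rgb   = cv2.imread('2.jpg', cv2.IMREAD_GRAYSCALE)
depth = cv2.imread('2.png', cv2.IMREAD_GRAYSCALE)

# Similarity fit (rotation + scale + translation), analogous to
# estimateGeometricTransform(..., 'similarity') in MATLAB.
M, inliers = cv2.estimateAffinePartial2D(pts_depth, pts_rgb)

# Warp the depth image into the rgb frame, analogous to imwarp with an OutputView.
aligned = cv2.warpAffine(depth, M, (rgb.shape[1], rgb.shape[0]))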
Granted, I still believe the results can be further improved if the keypoints in either image are selected more judiciously. I also think @Specktre's method is better. I just noticed that I used a different image pair in my answer than in the question; both images come from the same dataset, which can be found here: vap rgb-d-t dataset.

Swift Scenekit - Multiple rotations

I have an issue with rotating a node multiple times. I am working on a game with a rolling ball, and while I can rotate the ball about one axis, or about two axes by the same amount, I cannot rotate by different angles about each axis.
example:
// Roll right 90 -
SCNNode.pivot = SCNMatrix4MakeRotation(Float(M_PI_2), 0, 1, 0)
// Roll right 180 -
SCNNode.pivot = SCNMatrix4MakeRotation(Float(M_PI_2) * 2, 0, 1, 0)
// Roll up 90 -
SCNNode.pivot = SCNMatrix4MakeRotation(Float(M_PI_2), 1, 0, 0)
// Roll up & right 90 -
SCNNode.pivot = SCNMatrix4MakeRotation(Float(M_PI_2), 1, 1, 0)
All of these work; however, if I need to roll the ball right 180 and up 90, I'm stuck.
Even if there were some way to add the vectors together, that would do me.
Any help greatly appreciated.
To combine the effects of rotation matrices, use matrix multiplication.
To do that in SceneKit, you can either:
Create separate rotation matrices and multiply them together using SCNMatrix4Mult.
Apply a rotation directly to an existing matrix using SCNMatrix4Rotate. (This is equivalent to the SCNMatrix4MakeRotation + SCNMatrix4Mult option; it just combines those steps into a single function call.)
If the order of transformations is important to your app, remember that matrix multiplication order is the reverse of transformation order.
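The same idea in plain matrix terms (an illustrative numpy sketch of the underlying linear algebra, not SceneKit API): build one rotation per axis and multiply them, keeping in mind that the order of multiplication changes the result.
import numpy as np

def rot_x(angle):
    # Rotation about the x axis ("roll up" in the question's examples).
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(angle):
    # Rotation about the y axis ("roll right" in the question's examples).
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

# "Roll right 180 and up 90": compose the two single-axis rotations.
combined = rot_x(np.pi / 2) @ rot_y(np.pi)

# Matrix multiplication is not commutative, so the other order is a different rotation.
assert not np.allclose(combined, rot_y(np.pi) @ rot_x(np.pi / 2))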

Matlab gradient equivalent in opencv

I am trying to migrate some code from MATLAB to OpenCV and need an exact replica of the gradient function. I have tried the cv::Sobel function, but for some reason the values in the resulting cv::Mat are not the same as the values in the MATLAB version. I need the X and Y gradients in separate matrices for further calculations.
Any workaround that could achieve this would be great.
Sobel can only compute the second derivative of the image pixel, which is not what we want:
(f(i+1,j) + f(i-1,j) - 2f(i,j)) / 2
What we want is:
(f(i+1,j) - f(i-1,j)) / 2
So we need to apply
Mat kernelx = (Mat_<float>(1,3)<<-0.5, 0, 0.5);
Mat kernely = (Mat_<float>(3,1)<<-0.5, 0, 0.5);
filter2D(src, fx, -1, kernelx);
filter2D(src, fy, -1, kernely);
MATLAB treats border pixels differently from inner pixels, so the code above is wrong at the border values. One can use BORDER_CONSTANT to extend the border with a constant value; unfortunately that constant is -1 in OpenCV and cannot be changed to 0 (which is what we want).
So as to the border values, I do not have a very neat answer. Just try to compute the first derivative there by hand...
You have to call Sobel 2 times, with arguments:
xorder = 1, yorder = 0
and
xorder = 0, yorder = 1
You have to select the appropriate kernel size.
See documentation
It might still be that the MATLAB implementation is different; ideally you should retrieve which kernel was used there...
Edit:
If you need to specify your own kernel, you can use the more generic filter2D. Your destination depth will be CV_16S (16bit signed).
Matlab computes the gradient differently for interior rows and border rows (the same is true for the columns of course). At the borders, it is a simple forward difference gradY(1) = row(2) - row(1). The gradient for interior rows is computed by the central difference gradY(2) = (row(3) - row(1)) / 2.
I think you cannot achieve the same result with just running a single convolution filter over the whole matrix in OpenCV. Use cv::Sobel() with ksize = 1, then treat the borders (either manually or by applying a [ 1 -1 ] filter).
Pei's answer is partly correct. Matlab uses these calculations for the borders:
G(:,1) = A(:,2) - A(:,1);
G(:,N) = A(:,N) - A(:,N-1);
so I used the following OpenCV code to complete the gradient:
static cv::Mat kernelx = (cv::Mat_<double>(1, 3) << -0.5, 0, 0.5);
static cv::Mat kernely = (cv::Mat_<double>(3, 1) << -0.5, 0, 0.5);
cv::Mat fx, fy;
cv::filter2D(Image, fx, -1, kernelx, cv::Point(-1, -1), 0, cv::BORDER_REPLICATE);
cv::filter2D(Image, fy, -1, kernely, cv::Point(-1, -1), 0, cv::BORDER_REPLICATE);
fx.col(fx.cols - 1) *= 2;
fx.col(0) *= 2;
fy.row(fy.rows - 1) *= 2;
fy.row(0) *= 2;
Jorrit's answer is partly correct.
In some cases the value of the directional derivative may be negative, and MATLAB will retain these negative numbers, but an OpenCV Mat with an unsigned depth (for example when ddepth = -1 and the source is 8-bit) will saturate them to 0.
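Putting the pieces together, a Python/OpenCV version of the same idea might look like this. It is a sketch, assuming the goal is MATLAB's gradient() behaviour: central differences in the interior, one-sided differences at the borders, and a signed float output so negative values are kept.
import cv2
import numpy as np

def matlab_gradient(img):
    # Approximate MATLAB's gradient(): central differences in the interior,
    # forward/backward differences at the borders, returned as signed float64.
    src = img.astype(np.float64)
    kx = np.array([[-0.5, 0.0, 0.5]])   # d/dx kernel
    ky = kx.T                           # d/dy kernel

    # CV_64F output keeps negative derivatives instead of saturating them to 0.
    fx = cv2.filter2D(src, cv2.CV_64F, kx, borderType=cv2.BORDER_REPLICATE)
    fy = cv2.filter2D(src, cv2.CV_64F, ky, borderType=cv2.BORDER_REPLICATE)

    # With BORDER_REPLICATE the border result is half the one-sided difference,
    # so doubling the first/last column (row) reproduces MATLAB's border formula.
    fx[:, 0] *= 2
    fx[:, -1] *= 2
    fy[0, :] *= 2
    fy[-1, :] *= 2
    return fx, fy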