Kinect (Microsoft SDK) skeleton (recorded) data from pixel to 3D real-world coordinates - MATLAB

I have a dataset of partial joints (right elbow, shoulder and wrist) given to me by a colleague who acquired the data with OpenNI.
The joint x and y coordinates are in pixels, while z is in mm. I have to convert them to real-world space to match them with data I acquired myself (using the Microsoft SDK) for a gesture recognition application. I'm working in MATLAB.
Searching the web and papers, I found that a floor reference is necessary for the conversion, but I don't have one. How could this conversion be done, preferably in MATLAB, and which reference should I pick (maybe the height of the Kinect from the floor)?

Here's a not-so-awesome solution:
Plot the 3D points you have of both data sets
Look for a pose where the arm and forearm seem to be making a similar pose in both data sets (as L-shaped as possible).
Use this to compute the transformation.
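If you do find such a matched pose, a minimal MATLAB sketch of that last step could look like this (jointsA and jointsB are placeholders for your own matched 3D joints, both expressed in the same units):

% Minimal sketch (MATLAB): estimate the rigid transform between the two
% data sets from matched joint positions. jointsA and jointsB are
% hypothetical N-by-3 matrices (e.g. shoulder, elbow, wrist of the
% L-shaped pose) expressed in each data set's frame.
cA = mean(jointsA, 1);  cB = mean(jointsB, 1);
H  = bsxfun(@minus, jointsA, cA)' * bsxfun(@minus, jointsB, cB);
[U, ~, V] = svd(H);                              % Kabsch / absolute orientation
R = V * diag([1 1 sign(det(V*U'))]) * U';        % rotation (reflection-safe)
t = cB' - R * cA';                               % translation
mappedA = bsxfun(@plus, R * jointsA', t)';       % jointsA expressed in frame B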

Related

Point cloud on an STL surface, integrate per element

I have a cloud of points that lie randomly on a 3D object surface. The object is a CAD model that can be saved as STL. The point cloud is obtained from ray tracing, each point representing the power of light absorbed when a ray partially reflects off the surface. I would like to visualize the absorbed intensity on the object using ParaView.
So, input: [x, y, z, p] + STL. (x, y, z) are guaranteed to lie on the object surface, but are likely slightly off the STL due to it being an approximation of the real surface.
Desired output: colored STL image, with each surface element coloured according to the total absorbed power in that element divided by its area.
Optional: Ideally, the data should be smoothed, something like a "sliding average" or Gaussian blur.
Difficulty: The main problem I am facing, independent of using ParaView, is that I don't know the intensity, only the power. I can calculate the intensity myself, e.g. in MATLAB, but I get poor MATLAB graphics (compared to ParaView) and a very noisy image (because of random fluctuation of intensity between pixels due to the finite number of rays). ParaView seems to be doing magic; I hope to solve this problem with it.
Can I do the above with ParaView, without programming a new filter / with minimum programming?
I have just discovered ParaView, so please excuse a very novice question. Googling for an answer didn't help; hopefully I didn't miss it due to poor wording.
The ResampleToDataset filter should let you resample your point cloud onto the STL.

Measuring objects in a photo taken by calibrated cameras, knowing the size of a reference object in the photo

I am writing a program that captures real-time images of a scene with two calibrated cameras (so the internal parameters of the cameras are known to us). Using two-view geometry, I can find the essential matrix and use OpenCV or MATLAB to find the relative position and orientation of one camera with respect to the other. Having the essential matrix, it is shown in Hartley and Zisserman's Multiple View Geometry that one can reconstruct the scene using triangulation up to scale. Now I want to use a reference length to determine the scale of the reconstruction and resolve the ambiguity.
I know the height of the front wall and I want to use it for determining the scale of reconstruction to measure other objects and their dimensions or their distance from the center of my first camera. How can it be done in practice?
Thanks in advance.
Edit: To add more information, I have already done linear triangulation (minimizing the algebraic error), but I am not sure how useful it is because there is still a scale ambiguity that I don't know how to get rid of. My ultimate goal is to recognize an object (like a Pepsi can) and separate it in a rectangular area (which is going to be written as a separate module by someone else) and then find the distance of each pixel in this rectangular area, i.e. the region of interest, to the camera. Then the distance from the camera to the object will be the minimum of the distances from the camera to the 3D coordinates of the pixels in the region of interest.
Might be a bit late, but this may at least help someone struggling with the same stuff.
As far as I remember, it is actually a linear problem. You have the essential matrix, which gives you the rotation matrix and a normalized translation vector specifying the relative position of the cameras. If you followed Hartley and Zisserman, you probably chose one of the cameras as the origin of the world coordinate system, meaning all your triangulated points are at a normalized distance from this origin. What is important is that the direction of every triangulated point is correct.
If you have some reference in the scene (let's say the height of the wall), then you just have to find this reference (2 points are enough - e.g. opposite ends of the wall) and calculate a "normalization coefficient" (sorry for the terminology) as
coeff = realWorldDistanceOf2Points / distanceOfTriangulatedPoints
Once you have this coeff, just multiply all your triangulated points by it and you get real-world points.
Example:
You know that opposite corners of the wall are 5 m from each other. You find these corners in both images, triangulate them (let's call the triangulated points c1 and c2), calculate their distance in the "normalized" world as ||c1 - c2||, and get
coeff = 5 / ||c1 - c2||
and you get the real 3D world points as triangulatedPoint * coeff.
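As a minimal MATLAB sketch of that scaling step (c1, c2 and triangulatedPoints stand in for your own triangulation output):

% c1, c2: triangulated 3-by-1 wall corners; triangulatedPoints: 3-by-N
% matrix of all triangulated points (placeholders for your own data).
realWorldDistance = 5;                       % known corner-to-corner distance, metres
coeff = realWorldDistance / norm(c1 - c2);   % scale factor
worldPoints = coeff * triangulatedPoints;    % metric 3D points, in metres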
Maybe an easier option is to keep both cameras in a fixed relative position and calibrate them together with the stereoCalibrate OpenCV/MATLAB function (there is actually a pretty nice GUI in MATLAB for that) - it returns not just the intrinsic parameters, but also the extrinsics. But I don't know if this is your case.
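If you do go the MATLAB route, a rough sketch with the Computer Vision Toolbox could look like this (the image lists and square size are placeholders on my side):

% leftImages / rightImages: cell arrays of checkerboard image file names;
% squareSize: checker size in millimetres (both placeholders).
[imagePoints, boardSize] = detectCheckerboardPoints(leftImages, rightImages);
worldPoints  = generateCheckerboardPoints(boardSize, squareSize);
stereoParams = estimateCameraParameters(imagePoints, worldPoints);
% stereoParams holds both intrinsics and the extrinsics between the cameras
% (translation in the same units as squareSize), so triangulated points
% come out metric directly.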

How to convert 3D point cloud (extracted from 3D sparse reconstruction) to millimeters?

Using stereo vision and based on the Multiple View Geometry book (http://www.robots.ox.ac.uk/~vgg/hzbook/), I have created a 3D point cloud in MATLAB. To do that, I first calibrated the cameras and rectified the stereo images, then did feature extraction and matching, then eliminated the noisy matches based on the camera locations, and finally created the 3D point cloud using triangulation.
Now my question is how to convert this 3D point cloud from pixel domain to actual millimeter/centimeter domain knowing my focal length and camera calibration matrices?
The goal is to find DEPTH IN MILLIMETERS.
I know how to do it in the disparity/depth-map case using the formula Z = (t*f)/d.
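For concreteness, a tiny numeric example of that rectified-case formula (the numbers are made up):

% Rectified stereo: depth from disparity (illustrative values only)
f = 700;            % focal length in pixels
t = 60;             % baseline in mm
d = 14;             % disparity in pixels
Z = (t * f) / d;    % = 3000 mm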
But here in the sparse case, can I do something like this? http://matlab.wikia.com/wiki/FAQ#How_do_I_measure_a_distance_or_area_in_real_world_units_instead_of_in_pixels.3F
Or is there a more sophisticated method with a more in-depth explanation?
Thanks.
The formula you wrote is valid only in the special case when the image planes of the two cameras are on the same geometrical plane, and the motion from one to the other is a translation parallel to one of the image axes.
In the general case you'll need to triangulate actual rays in 3D space, using one of the techniques described in that book (it has a whole chapter on reconstruction). The reconstruction will be metrical if your calibration is - in particular, if the coordinate transform between the cameras has a translation vector whose units are meters (or millimeters, or inches, ...).
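As a hedged illustration, here is a minimal linear (DLT) triangulation of a single match in MATLAB, assuming you already have the two 3-by-4 projection matrices P1 = K1*[I 0] and P2 = K2*[R t] with t in metric units, and matched pixel coordinates x1, x2:

% x1, x2: matched pixel coordinates [u; v] in image 1 and 2 (placeholders).
A = [x1(1)*P1(3,:) - P1(1,:);
     x1(2)*P1(3,:) - P1(2,:);
     x2(1)*P2(3,:) - P2(1,:);
     x2(2)*P2(3,:) - P2(2,:)];
[~, ~, V] = svd(A);
X = V(:, end);             % homogeneous solution
X = X(1:3) / X(4);         % 3D point, in the same units as the baseline t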

How to calculate perspective transformation using ellipse

I'm very new to 3D image processing. I'm working on a project to find the perspective angle of a circle.
A plate has a set of white circles; using those circles I want to find the 3D rotation angles of that plate.
For that I have finished the camera calibration part and obtained the camera parameters and errors. As the next step I captured an image and applied Sobel edge detection.
After that I am a bit confused about the ellipse fitting algorithm. I saw a lot of ellipse-fitting algorithms. Which one is the best and fastest method?
After finishing the ellipse fit, I don't know how to proceed further. How do I calculate the rotation and translation matrices using that ellipse?
Can you tell me which algorithm is most suitable and easy? I need some MATLAB code to understand the concept.
Thanks in advance.
Sorry for my English.
First, find the ellipse/circle centres (e.g. as Eddy_Em described in other comments).
You can then refer to Zhang's classic paper
https://research.microsoft.com/en-us/um/people/zhang/calib/
which allows you to estimate the camera pose from a single image if some camera parameters are known, e.g. the centre of projection. Note that the method fails for frontal recordings: the stronger the perspective effect, the more accurate your estimate will be. The algorithm is fairly simple; you'll need an SVD and some cross products.
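For orientation, here is a hedged MATLAB sketch of the standard pose-from-planar-homography step in Zhang's method, assuming you have already fitted a homography H from the known circle-centre layout on the plate to the detected ellipse centres, and have the intrinsic matrix K:

Hn = K \ H;                         % remove the intrinsics
lambda = 1 / norm(Hn(:,1));         % normalise so column 1 has unit length
r1 = lambda * Hn(:,1);
r2 = lambda * Hn(:,2);
r3 = cross(r1, r2);                 % third rotation axis via cross product
t  = lambda * Hn(:,3);              % plate origin in the camera frame
R  = [r1 r2 r3];
[U, ~, V] = svd(R);                 % project onto the nearest true rotation matrix
R  = U * V';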

Calculating magnetic heading using raw accelerometer and magnetometer data

I have an accelerometer and magnetometer each producing raw X, Y and Z readouts. From this I need to determine the magnetic heading of an object.
I'm not that great at trig, but I've put together a formula that responds pretty well to the rotation of the device, but also responds to movement that one would not think is relevant, such as angling the device in a way that has no impact on the direction it is pointing - for example, laying it flat and "rolling" the device.
I think the formula I have for calculating the magnetic heading is fine, but I think my pitch and roll radians for input are wrong.
So I guess the core of my question (unless someone actually has a formula around that does this) is: how do you calculate pitch and roll angles, in radians, using an accelerometer?
Then secondly, any info on the heading formula itself would be great.
Depending on the accuracy your application requires, you may need to solve several problems:
Are the accelerometer axes calibrated? I've seen MEMS accelerometers that had axes that were not mutually perpendicular, and had significantly different response characteristics for each axis (typically X and Y would match, and Z would differ). You will need to synthesize ideal XYZ axes from whatever physical readings your device provides. (Google 'accelerometer calibration'.)
Are the magnetometer axes calibrated? Similar problem as above, except much harder to check: It is very difficult to generate uniform calibrated magnetic fields. If you use the ambient geomagnetic field, you will need to carefully control the local magnetism of your work environment and your tools. (Google 'magnetometer calibration'.)
After the accelerometer and magnetometer have been individually calibrated, their axes need to be calibrated relative to each other. Since both of these devices are typically soldered to a PCB, there is almost guaranteed to be significant misalignment. In many cases, the board layout and device parameters may not even permit the XYZ axes to correspond with each other! This may be the hardest part to do from a lab perspective, so I'd recommend you do a direct comparison using other hardware that has both kinds of sensors and is already calibrated (such as an iPhone or Android phone - but verify the device before use). Normally, this is accomplished by adjusting the prior two calibration matrices until they generate vectors that are correctly aligned relative to each other.
Only after you are generating mutually calibrated magnetic and accelerometer vectors can you apply the solutions suggested by the other respondents.
I've only described the static solution, where both the magnetometer and accelerometer are motionless relative to the local gravitational and magnetic fields. If you need to generate responses in real-time while the system is rapidly moving, you will need to account for the time behavior of each sensor. There are two basic ways to do this: 1) Apply time-domain filters to each sensor so that their outputs share a common time domain (generally adding some delay). 2) Use predictive modeling to modify the sensor outputs in real-time (less delay, but more noise).
I've seen Kalman filters used for such applications, but applying them in a vector domain may require using quaternions instead of Euler matrices. Quaternions are easier to use computationally (fewer operations needed compared to matrices), but I found them to be much more difficult to comprehend and get right.
Or, you may choose a completely different path, and use statistics and data fitting to do all the above work in one giant step. Consider the problem as follows: Given 6 input values (XYZ each from uncalibrated magnetometer and accelerometer) and a reference to the device (assuming it is hand-held, and there is an arrow painted on the case), output a single angle representing the magnetic bearing toward which the arrow on the case is pointing, and the elevation of the arrow relative to the gravity vector (tilt of the case).
Using a calibrated reference device (as mentioned above), pair it with the device to be calibrated and take several hundred data points, with the device at different orientations. Then use a powerful math package such as MATLAB, MathCAD, R or SciPy to set up and solve the nonlinear equations to create the transformation matrices.
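As a rough illustration of that last fitting step (all names are placeholders, and this only fits the magnetometer against the reference), a nonlinear least-squares setup in MATLAB might look like:

% rawMag, refMag: hypothetical N-by-3 matrices of paired magnetometer
% samples from the uncalibrated device and the calibrated reference,
% taken at many orientations. Fit an offset b and a 3-by-3 matrix A so
% that A*(raw - b) matches the reference.
p0 = [zeros(3,1); reshape(eye(3), 9, 1)];   % initial guess: zero offset, identity matrix
residual = @(p) reshape( ...
    bsxfun(@minus, rawMag, p(1:3)') * reshape(p(4:12), 3, 3)' - refMag, [], 1);
p = lsqnonlin(residual, p0);                % Optimization Toolbox
b = p(1:3);                                 % hard-iron style offset
A = reshape(p(4:12), 3, 3);                 % soft-iron / alignment correction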
I would point to Euler angles and roll-pitch-yaw.
You're not thinking in enough dimensions. This would be the answer in only 2 dimensions, and it works great if you can find a way to ensure "Z" always aligns with gravity.
int heading = 180 - atan2(mag_datX, mag_datY) / 0.0174532925; // atan2 is in radians; 0.0174532925 = pi/180. 0/359 = N, 90 = E, 180 = S, 270 = W
(If you're reading directly from the device - beware that it probably returns X, Z, Y - not X, Y, Z!)
However - this is not a 2D compass problem. Imagine you take the needle out of the compass and balance it so that gravity plays no part in keeping it "level"; you'll find that "north" will point a bit up or down, depending on where on Earth you are (or, if at the poles, directly up or down!).
So you need to try and compute the THREE DIMENSIONAL vector from all 3 values - which is a matrix operation.
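For reference, a hedged MATLAB sketch of the usual static, tilt-compensated heading computation, assuming the accelerometer and magnetometer vectors are already calibrated and axis-aligned (variable names are placeholders, and the exact signs depend on your device's axis conventions):

% ax, ay, az: accelerometer; mx, my, mz: magnetometer (placeholders).
roll  = atan2(ay, az);                          % radians
pitch = atan2(-ax, sqrt(ay^2 + az^2));          % radians

% Project the magnetic vector back onto the horizontal plane
Xh = mx*cos(pitch) + mz*sin(pitch);
Yh = mx*sin(roll)*sin(pitch) + my*cos(roll) - mz*sin(roll)*cos(pitch);

heading    = atan2(Yh, Xh);                     % radians, relative to magnetic north
headingDeg = mod(heading * 180/pi, 360);        % 0..360 degrees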