I have the x, y coordinates of a feature from a single photograph, and I know the camera parameters. How can I get the 3D coordinates of that feature (in MATLAB)? Please help me.
Take a look at this: http://www.cim.mcgill.ca/~langer/558/4-cameramodel.pdf
Systems like this that I've seen before require that you know where the camera is (latitude and longitude) and in which direction it is pointing (azimuth and elevation), together with the field of view. Then you project this onto the geodata of the environment, and from there you can do all kinds of things, like finding the 3D position of an object based on its location in the photograph.
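To make the projection step concrete: a single photograph only constrains the feature to a ray through the camera centre, so something else (here, a known ground plane) has to pick the point on that ray. The following is a minimal sketch in Python rather than MATLAB, assuming an ideal pinhole camera with no lens distortion; the function names, the intrinsics `fx, fy, cx, cy`, the camera-to-world rotation `R` and the flat ground at z = 0 are all illustrative assumptions, not something given in the thread.

```python
def pixel_to_ray(u, v, fx, fy, cx, cy):
    """Viewing-ray direction in camera coordinates (x right, y down, z forward)."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)

def rotate(R, d):
    """Apply a 3x3 camera-to-world rotation (list of rows) to a 3-vector."""
    return tuple(sum(R[i][j] * d[j] for j in range(3)) for i in range(3))

def intersect_ground(camera_pos, d_world, ground_z=0.0):
    """Point where the ray camera_pos + t * d_world meets the plane z = ground_z."""
    if abs(d_world[2]) < 1e-12:
        return None  # ray is parallel to the ground plane
    t = (ground_z - camera_pos[2]) / d_world[2]
    if t <= 0:
        return None  # the plane is behind the camera
    return tuple(camera_pos[i] + t * d_world[i] for i in range(3))

# Hypothetical setup: camera 10 m above the ground, looking straight down.
# Camera x maps to world x, camera y to world -y, camera z to world -z.
R_down = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]
camera_pos = (0.0, 0.0, 10.0)
ray = rotate(R_down, pixel_to_ray(420.0, 240.0, 500.0, 500.0, 320.0, 240.0))
point = intersect_ground(camera_pos, ray)
```

With terrain data instead of a flat plane, the same ray would be intersected with the elevation model rather than with z = 0.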
I'm trying to make a Unity game that allows the user to explore the surface of an Earth shaped spheroid, based on WGS84.
The project so far is on Github, and there's a YouTube video of this behaviour.
A shape the size of Earth is way too big for Unity, so I just spawn tiles near the user, offset so that the first tile is at Unity's origin point. This bit works.
The issue is moving around. I've been using an approach where I get the user's position in ECEF coordinates, then normalise that to provide the global orientation for the player, then I translate the player forward based on that and their rotation.
The issue with this is that normalising the ECEF coordinate means that the player is moving over a sphere, but the WGS84 spheroid is not perfectly spherical. So the player sinks into the floor, or flies up, if you go south or north, respectively.
My question is, how can I allow the user to move around the surface of the spheroid by way of translation? I feel like there might be some way of taking the major/minor axis of the spheroid into account as the player moves, but I'm not sure how to do that.
I have no experience with Unity or computer graphics; I'm approaching it purely from the navigation point of view.
Let's look at the real world.
We want to travel either by walking/driving on the surface or by flying at some altitude. When we do, we move in a local coordinate system (North-South, East-West, Up-Down) and can't see any curvature. We assume the Earth is flat.
The problem arises when we try to do it on a computer, which is ruthlessly precise and knows the shape of the Earth. We can't assume the Earth is flat, and we can't assume the Earth is a sphere. The Earth is a geoid. Fortunately, for some purposes we can simplify things and assume the Earth is an ellipsoid. You chose WGS84. Good!
So how do we move around an ellipsoid? Solving the problem analytically is a nightmare. We have to cheat ;)
We should assume the Earth is flat for a moment, make a move in a chosen direction in the local coordinate system, write down the altitude of the new position, calculate the global geodetic coordinates (Lat, Long, Alt) of that new point and then replace the altitude with the one obtained while using the local coordinate system. In other words: each time we move forward along a perfectly straight line and diverge from the ellipsoid (just a tiny bit), we force the altitude not to change in relation to the ellipsoid.
Implementation.
You need to be able to freely translate coordinates between geodetic (Lat, Long, Alt) and ECEF. Going from geodetic to ECEF is easy. Finding geodetic coordinates for a given ECEF position is much more complex; there are many different algorithms, and you should be able to find a ready-to-use implementation somewhere.
What you also need is a local tangent plane; to be precise, you are going to use NED (North-East-Down).
Let's assume your object is initially at some geodetic position. You write down the altitude (relative to the ellipsoid). Then you create a local NED coordinate system with its origin at that point. Then you move the object in that local coordinate system. You write down how much the altitude (or rather the Down coordinate) changed. Then you must calculate the ECEF coordinates of that new position and transform it to geodetic (Lat, Long, Alt). You have the old altitude, you have the altitude change in the NED coordinates, which means you know the new altitude. You then apply that altitude to your new geodetic coordinates (brutally replace the Alt in Lat/Long/Alt with a new value).
Then you make another move in the NED coordinates defined for that new position. And so on...
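The procedure above can be sketched as follows. This is a minimal illustration in Python rather than Unity C#, and the names (`geodetic_to_ecef`, `ecef_to_geodetic`, `move`) are hypothetical. The ECEF-to-geodetic conversion uses a simple fixed-point iteration chosen for brevity; it is only one of the many algorithms mentioned above and is not robust near the poles.

```python
import math

A = 6378137.0            # WGS84 semi-major axis in metres
F = 1.0 / 298.257223563  # WGS84 flattening
E2 = F * (2.0 - F)       # first eccentricity squared

def geodetic_to_ecef(lat, lon, alt):
    """lat/lon in radians, alt in metres above the ellipsoid."""
    n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
    return ((n + alt) * math.cos(lat) * math.cos(lon),
            (n + alt) * math.cos(lat) * math.sin(lon),
            (n * (1.0 - E2) + alt) * math.sin(lat))

def ecef_to_geodetic(x, y, z, iterations=10):
    """Fixed-point iteration; assumes the point is not close to a pole."""
    lon = math.atan2(y, x)
    p = math.hypot(x, y)
    lat = math.atan2(z, p * (1.0 - E2))  # exact when alt == 0
    for _ in range(iterations):
        n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
        alt = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1.0 - E2 * n / (n + alt)))
    n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
    return lat, lon, p / math.cos(lat) - n

def move(lat, lon, alt, north, east, down=0.0):
    """One step in the NED frame at (lat, lon, alt), then force the altitude."""
    x, y, z = geodetic_to_ecef(lat, lon, alt)
    sl, cl = math.sin(lat), math.cos(lat)
    so, co = math.sin(lon), math.cos(lon)
    # Rotate the NED displacement into ECEF axes.
    dx = -sl * co * north - so * east - cl * co * down
    dy = -sl * so * north + co * east - cl * so * down
    dz = cl * north - sl * down
    new_lat, new_lon, _ = ecef_to_geodetic(x + dx, y + dy, z + dz)
    # Discard the computed altitude: old altitude plus the change made in NED.
    return new_lat, new_lon, alt - down

# Walk 100 km north in 100 m steps, starting at 45N 10E, 100 m above the ellipsoid.
lat, lon, alt = math.radians(45.0), math.radians(10.0), 100.0
for _ in range(1000):
    lat, lon, alt = move(lat, lon, alt, north=100.0, east=0.0)
```

Without the altitude replacement, each 100 m straight-line step would lift the object on the order of a millimetre above its previous height, and the error would accumulate over many steps.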
I'm not sure if it is clear, the process is quite complicated. If you can't understand - shout :)
Hi everyone.
I'm trying to record the movement of a person, frame to frame, using the Microsoft Kinect API. For that I'm saving all the joints' positions, and I would also like to get the direction of the vector of each joint. I've seen that the API has something about joint orientation with quaternion matrices, but I don't know how to use it to get the direction. Or should I simply calculate the direction from the coordinates?
Thanks
Thanks to the answer from Carmine Si (MSFT):
"To determine the direction a joint is travelling, you should just calculate the vector based on the point locations from frame to frame. Typically the other values are for mapping your skeleton between different coordinate spaces so you can do things like the Avateering sample."
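As a minimal sketch of that frame-to-frame calculation (written in Python for brevity rather than against the Kinect SDK; the function name, the `min_move` threshold and the sample positions are hypothetical):

```python
import math

def joint_direction(prev_pos, curr_pos, min_move=1e-4):
    """Unit vector of travel between two frames, or None if the joint barely moved."""
    delta = [c - p for p, c in zip(prev_pos, curr_pos)]
    length = math.sqrt(sum(d * d for d in delta))
    if length < min_move:
        return None  # below the noise threshold, direction is meaningless
    return tuple(d / length for d in delta)

# Hypothetical hand positions (metres, sensor space) in two consecutive frames.
hand_frame_1 = (0.10, 1.20, 2.00)
hand_frame_2 = (0.10, 1.20, 1.70)
direction = joint_direction(hand_frame_1, hand_frame_2)
```

Here the hand moved 0.3 m toward the sensor, so the direction points along negative z. The threshold is worth tuning, since joint positions from a depth sensor jitter even when the person stands still.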
I'm building an augmented reality game, and working with CLLocation is rather cumbersome.
Is there some way to locally approximate CLLocation as XYZ coordinate, expressed in meters with the origin starting at some arbitrary point (for example the initial position when the game was started)?
Let's say I'm working within a 1 mile radius and do not really care about the curvature of the earth. Is it possible to approximate or somehow simplify the location-based calculations for local position tracking?
Alternatively, is there a coordinate system that can be used with CLLocation that also incorporates the roll, pitch, yaw of the CMAttitude as well as compass orientation?
Clarification: As far as I understand, the problem with latitude and longitude is that their units vary in size, depending on the position on the globe. I should've specified that X,Y,Z should be in standard units, like meters or feet.
Thank you!
The Haversine formula may be useful.
I found a good article on it at http://www.jaimerios.com/?p=39 with code examples.
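For reference, a minimal sketch of the Haversine formula in Python (not taken from the linked article). It treats the Earth as a sphere; the mean radius constant and the function name are assumptions of this sketch.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; the spherical model is an approximation

def haversine_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1_deg), math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2_deg - lon1_deg)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# One degree of latitude along a meridian.
one_degree = haversine_m(37.0, -122.0, 38.0, -122.0)
```

Note that this yields a distance only; it does not by itself give the separate X and Y offsets the question asks for.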
You could get the initial point at the app's launch and calculate the relative points based on the user coordinates as he or she moves. Admittedly, this is not super elegant, but if you are just trying to do some simple comparisons based on the user's location relative to an arbitrary origin, this should work. For the Z, Alex Stone's suggestion of calculating it based on the altitude should be fine.
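One way to get relative points in metres over a small area is an equirectangular (flat-earth) approximation around the launch position. The sketch below is in Python and assumes a spherical Earth and tuples of (latitude in degrees, longitude in degrees, altitude in metres), such as the values read from a CLLocation's `coordinate.latitude`, `coordinate.longitude` and `altitude`; the function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation

def to_local_xyz(origin, location):
    """Return (east, north, up) in metres of `location` relative to `origin`.

    Both arguments are (lat_deg, lon_deg, alt_m). Reasonable only for short
    distances (a mile or so) and away from the poles.
    """
    lat0, lon0, alt0 = origin
    lat, lon, alt = location
    north = math.radians(lat - lat0) * EARTH_RADIUS_M
    # A degree of longitude shrinks with the cosine of the latitude.
    east = math.radians(lon - lon0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    up = alt - alt0
    return east, north, up

origin = (37.0, -122.0, 10.0)    # position when the game started
current = (37.001, -122.0, 15.0) # a later fix, 0.001 degrees further north
east, north, up = to_local_xyz(origin, current)
```

The cosine factor is what compensates for longitude units varying in size with position on the globe.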
I am creating a 2d sidescroller game. I have a point in space (where the mouse is) and I need the weapon to look and "follow" that point.
Does anyone know where to begin?
wikihow: How to Find the Angle Between Two Vectors
After you have the angle, you can appropriately rotate the thing to be rotated.
JavaScript has a Math.atan2(y, x) function (note that y comes first), which could be used to get the angle directly.
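A minimal sketch of the atan2 approach, written in Python; `math.atan2` takes its arguments in the same (y, x) order as the JavaScript function. The function name and the sample positions are hypothetical.

```python
import math

def aim_angle(weapon_pos, target_pos):
    """Angle in radians from the positive x axis to the line weapon -> target."""
    dx = target_pos[0] - weapon_pos[0]
    dy = target_pos[1] - weapon_pos[1]
    return math.atan2(dy, dx)  # y first, then x

weapon = (100.0, 200.0)
mouse = (150.0, 250.0)
angle_deg = math.degrees(aim_angle(weapon, mouse))
```

In screen coordinates, where y grows downward, a positive angle appears as a clockwise rotation, so the sign may need flipping depending on how the sprite rotation is defined.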
The Screen-to-world problem on the iPhone
I have a 3D model (CUBE) rendered in an EAGLView and I want to be able to detect when I am touching the center of a given face (From any orientation angle) of the cube. Sounds pretty easy but it is not...
The problem:
How do I accurately relate screen-coordinates (touch point) to world-coordinates (a location in OpenGL 3D space)? Sure, converting a given point into a 'percentage' of the screen/world-axis might seem the logical fix, but problems would arise when I need to zoom or rotate the 3D space. Note: rotating & zooming in and out of the 3D space will change the relationship of the 2D screen coords with the 3D world coords...Also, you'd have to allow for 'distance' in between the viewpoint and objects in 3D space. At first, this might seem like an 'easy task', but that changes when you actually examine the requirements. And I've found no examples of people doing this on the iPhone. How is this normally done?
An 'easy' task?:
Sure, one might undertake the task of writing an API to act as a go-between between screen and world, but the task of creating such a framework would require some serious design and would likely take 'time' to do -- NOT something that can be one-manned in 4 hours...And 4 hours happens to be my deadline.
The question:
What are some of the simplest ways to know if I touched specific locations in 3D space in the iPhone OpenGL ES world?
You can now find gluUnProject in http://code.google.com/p/iphone-glu/. I've no association with the iphone-glu project and haven't tried it yet myself, just wanted to share the link.
How would you use such a function? This PDF mentions that:
The Utility Library routine gluUnProject() performs this reversal of the transformations. Given the three-dimensional window coordinates for a location and all the transformations that affected them, gluUnProject() returns the world coordinates from where it originated.
int gluUnProject(GLdouble winx, GLdouble winy, GLdouble winz,
const GLdouble modelMatrix[16], const GLdouble projMatrix[16],
const GLint viewport[4], GLdouble *objx, GLdouble *objy, GLdouble *objz);
Map the specified window coordinates (winx, winy, winz) into object coordinates, using transformations defined by a modelview matrix (modelMatrix), projection matrix (projMatrix), and viewport (viewport). The resulting object coordinates are returned in objx, objy, and objz. The function returns GL_TRUE, indicating success, or GL_FALSE, indicating failure (such as a noninvertible matrix). This operation does not attempt to clip the coordinates to the viewport or eliminate depth values that fall outside of glDepthRange().
There are inherent difficulties in trying to reverse the transformation process. A two-dimensional screen location could have originated from anywhere on an entire line in three-dimensional space. To disambiguate the result, gluUnProject() requires that a window depth coordinate (winz) be provided and that winz be specified in terms of glDepthRange(). For the default values of glDepthRange(), winz at 0.0 will request the world coordinates of the transformed point at the near clipping plane, while winz at 1.0 will request the point at the far clipping plane.
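To illustrate what winx, winy and winz mean, here is a small Python sketch of the first step such an unprojection performs: mapping window coordinates to normalized device coordinates in [-1, 1], assuming the default glDepthRange() of 0 to 1. The function names are hypothetical, and the touch conversion assumes touch coordinates are in the same pixel units as the viewport with the origin at the top-left.

```python
def window_to_ndc(winx, winy, winz, viewport):
    """Map window coordinates to normalized device coordinates.

    viewport is (x, y, width, height); winz is a depth in [0, 1]
    (0 = near clipping plane, 1 = far clipping plane).
    """
    vx, vy, vw, vh = viewport
    return (2.0 * (winx - vx) / vw - 1.0,
            2.0 * (winy - vy) / vh - 1.0,
            2.0 * winz - 1.0)

def touch_to_window(touch_x, touch_y, viewport):
    """Flip a top-left-origin touch point into OpenGL's bottom-left-origin window space."""
    vx, vy, vw, vh = viewport
    return vx + touch_x, vy + vh - touch_y

viewport = (0, 0, 320, 480)
wx, wy = touch_to_window(160, 240, viewport)   # centre of the screen
near_ndc = window_to_ndc(wx, wy, 0.0, viewport)
far_ndc = window_to_ndc(wx, wy, 1.0, viewport)
```

Calling the unprojection once with winz = 0.0 and once with winz = 1.0 gives the two ends of the line mentioned above.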
Example 3-8 (again, see the PDF) demonstrates gluUnProject() by reading the mouse position and determining the three-dimensional points at the near and far clipping planes from which it was transformed. The computed world coordinates are printed to standard output, but the rendered window itself is just black.
In terms of performance, I found this quickly via Google as an example of what you might not want to do using gluUnProject, with a link to what might lead to a better alternative. I have absolutely no idea how applicable it is to the iPhone, as I'm still a newb with OpenGL ES. Ask me again in a month. ;-)
You need to have the OpenGL projection and modelview matrices. Multiply them to obtain the modelview-projection matrix. Invert this matrix to get a matrix that transforms clip space coordinates into world coordinates. Transform your touch point so it corresponds to clip coordinates: the center of the screen should be zero, while the edges should be +1/-1 for X and Y respectively.
Construct two points, one on the near plane at (touch_x, touch_y, -1) and one on the far plane at (touch_x, touch_y, +1), and transform both by the inverse modelview-projection matrix.
Divide each result by its w component; this is the inverse of the perspective divide.
You should get two points describing a line from just in front of the camera (the near plane) into "the far distance" (the far plane).
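A minimal sketch of this unprojection in plain Python, using the common variant that places the touch point on the near plane (NDC z = -1) and on the far plane (NDC z = +1). Matrices are written as lists of rows acting on column vectors, so the combined matrix is projection times modelview; all function names are hypothetical, and `perspective` builds the standard OpenGL-style perspective matrix only to have something to test against.

```python
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def invert(m):
    """Gauss-Jordan inverse of a 4x4 matrix with partial pivoting."""
    aug = [[float(x) for x in m[i]] + [1.0 if i == j else 0.0 for j in range(4)]
           for i in range(4)]
    for col in range(4):
        pivot = max(range(col, 4), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is not invertible")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(4):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[4:] for row in aug]

def perspective(fovy_deg, aspect, near, far):
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [[f / aspect, 0, 0, 0],
            [0, f, 0, 0],
            [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
            [0, 0, -1, 0]]

def touch_ray(ndc_x, ndc_y, projection, modelview):
    """World-space points on the near and far planes under the touch point."""
    inverse = invert(mat_mul(projection, modelview))
    points = []
    for ndc_z in (-1.0, 1.0):
        x, y, z, w = mat_vec(inverse, [ndc_x, ndc_y, ndc_z, 1.0])
        points.append((x / w, y / w, z / w))  # undo the perspective divide
    return points[0], points[1]

identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
near_pt, far_pt = touch_ray(0.0, 0.0, perspective(60.0, 1.0, 0.1, 100.0), identity)
```

With an identity modelview matrix and a touch at the centre of the screen, the ray runs straight down the negative z axis from the near plane to the far plane.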
Do picking based on simplified bounding boxes of your models. You should be able to find ray/box intersection algorithms aplenty on the web.
Another solution is to paint each of the models in a slightly different color into an offscreen buffer and read the color at the touch point from there, which tells you which model was touched.
Here's source for a cursor I wrote for a little project using bullet physics:
// Map the mouse position to normalized device coordinates in [-1, 1]; y is flipped.
float x=((float)mpos.x/screensize.x)*2.0f -1.0f;
float y=((float)mpos.y/screensize.y)*-2.0f +1.0f;
// Unproject a point at depth 1.0 and divide by w to get a Cartesian point.
p2=renderer->camera.unProject(vec4(x,y,1.0f,1));
p2/=p2.w;
// Start the ray a short distance from the camera position, along the direction to p2.
vec4 pos=activecam.GetView().col_t;
p1=pos+(((vec3)p2 - (vec3)pos) / 2048.0f * 0.1f);
p1.w=1.0f;
// Cast the ray through the Bullet world and keep the closest hit.
btCollisionWorld::ClosestRayResultCallback rayCallback(btVector3(p1.x,p1.y,p1.z),btVector3(p2.x,p2.y,p2.z));
game.dynamicsWorld->rayTest(btVector3(p1.x,p1.y,p1.z),btVector3(p2.x,p2.y,p2.z), rayCallback);
if (rayCallback.hasHit())
{
    btRigidBody* body = btRigidBody::upcast(rayCallback.m_collisionObject);
    if(body==game.worldBody)
    {
        renderer->setHighlight(0);
    }
    else if (body)
    {
        Entity* ent=(Entity*)body->getUserPointer();
        if(ent)
        {
            renderer->setHighlight(dynamic_cast<ModelEntity*>(ent));
            //cerr<<"hit ";
            //cerr<<ent->getName()<<endl;
        }
    }
}
Imagine a line that extends from the viewer's eye through the screen touch point into your 3D model space. If that line intersects any of the cube's faces, then the user has touched the cube.
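A minimal sketch of that intersection test in Python, assuming the cube is axis-aligned in the space the ray is expressed in (for a rotated cube, transform the ray into the cube's local space first). It uses the standard slab method and also reports which face was entered, so the hit point can be compared with the face centre; the function names and the `radius` tolerance are hypothetical.

```python
import math

def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test. Returns (t, axis, side) for the entry face, or None.

    axis is 0/1/2 for x/y/z; side is -1 for the min face, +1 for the max face.
    Returns None if the ray misses or starts inside the box.
    """
    t_near, t_far = -math.inf, math.inf
    hit_axis, hit_side = None, 0
    for axis in range(3):
        o, d = origin[axis], direction[axis]
        if abs(d) < 1e-12:
            if o < box_min[axis] or o > box_max[axis]:
                return None  # parallel to this slab and outside it
            continue
        t1 = (box_min[axis] - o) / d
        t2 = (box_max[axis] - o) / d
        side = -1
        if t1 > t2:
            t1, t2 = t2, t1
            side = 1
        if t1 > t_near:
            t_near, hit_axis, hit_side = t1, axis, side
        t_far = min(t_far, t2)
        if t_near > t_far or t_far < 0:
            return None
    if hit_axis is None or t_near < 0:
        return None
    return t_near, hit_axis, hit_side

def touched_face_center(origin, direction, box_min, box_max, radius):
    """Return (axis, side) if the ray enters a face within `radius` of its centre."""
    hit = ray_hits_box(origin, direction, box_min, box_max)
    if hit is None:
        return None
    t, axis, side = hit
    point = [origin[i] + t * direction[i] for i in range(3)]
    center = [(box_min[i] + box_max[i]) / 2.0 for i in range(3)]
    center[axis] = box_max[axis] if side > 0 else box_min[axis]
    dist = math.sqrt(sum((p - c) ** 2 for p, c in zip(point, center)))
    return (axis, side) if dist <= radius else None

cube_min, cube_max = (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0)
center_touch = touched_face_center((0.0, 0.0, 5.0), (0.0, 0.0, -1.0), cube_min, cube_max, 0.25)
corner_touch = touched_face_center((0.9, 0.9, 5.0), (0.0, 0.0, -1.0), cube_min, cube_max, 0.25)
```

The ray origin and direction would come from the two unprojected points of the touch, with the direction being the far point minus the near point.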
Two solutions present themselves. Both of them should achieve the end goal, albeit by a different means: rather than answering "what world coordinate is under the mouse?", they answer the question "what object is rendered under the mouse?".
One is to draw a simplified version of your model to an off-screen buffer, rendering the center of each face using a distinct color (and adjusting the lighting so color is preserved identically). You can then detect those colors in the buffer (e.g. pixmap), and map mouse locations to them.
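The bookkeeping side of this colour-coding approach can be sketched as below, in Python for brevity. It only shows how an integer identifier might be packed into an 8-bit-per-channel RGB colour and recovered from the pixel read back at the touch point; the rendering and the pixel read itself are not shown, and the names and the face table are hypothetical.

```python
def id_to_rgb(object_id):
    """Pack an identifier (0 .. 2**24 - 1) into an (r, g, b) tuple of bytes."""
    if not 0 <= object_id < (1 << 24):
        raise ValueError("identifier does not fit in 24 bits")
    return (object_id >> 16) & 0xFF, (object_id >> 8) & 0xFF, object_id & 0xFF

def rgb_to_id(r, g, b):
    """Recover the identifier from a pixel read back from the off-screen buffer."""
    return (r << 16) | (g << 8) | b

# Identifier 0 is reserved for the background in this sketch.
FACE_NAMES = {1: "front", 2: "back", 3: "left", 4: "right", 5: "top", 6: "bottom"}

def face_under_pixel(pixel_rgb):
    """Name of the face whose colour was read, or None for background."""
    return FACE_NAMES.get(rgb_to_id(*pixel_rgb))

picked = face_under_pixel(id_to_rgb(5))
```

This only works if the colours reach the buffer unchanged, which is why lighting, blending, dithering and anti-aliasing need to be off for the picking pass.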
The other is to use OpenGL picking. There's a decent-looking tutorial here. The basic idea is to put OpenGL in select mode, restrict the viewport to a small (perhaps 3x3 or 5x5) window around the point of interest, and then render the scene (or a simplified version of it) using OpenGL "names" (integer identifiers) to identify the components making up each face. At the end of this process, OpenGL can give you a list of the names that were rendered in the selection viewport. Mapping these identifiers back to original objects will let you determine what object is under the mouse cursor.
Google for opengl screen to world (for example there’s a thread where somebody wants to do exactly what you are looking for on GameDev.net). There is a gluUnProject function that does precisely this, but it’s not available on iPhone, so you have to port it (see this source from the Mesa project). Or maybe there’s already some publicly available source somewhere?