iPhone OpenGL: glOrthof vs glFrustumf. Is glOrthof not 3D?

I'm having a bad coding day.
I need to make a 3D cube that spins around via user interaction. Hey, no biggie.
All the examples I've found for making a 3D cube seem to use glOrthof, and when I demo one to people they say it doesn't look 3D.
The problem is that glFrustumf seems to put me inside the cube instead of in front of it. I can't move the cube back using glTranslatef because my code re-uses the modelview matrix (I even tried modifying it manually):
/* save current rotation state */
GLfloat matrix[16];
glGetFloatv(GL_MODELVIEW_MATRIX, matrix);
/* re-center cube, apply new rotation */
glLoadIdentity();
glRotatef(self.angle, self.dy,self.dx,0);
/* reapply other rotations so far */
glMultMatrixf(matrix);
So my questions are:
To do a 3D cube must I use glFrustumf, and if so, how do I step the cube back 5 units while still re-using the model matrix (which is what keeps the cube spinning in whatever direction the user last moved it)?

I'm not sure what you mean by glOrthof() "not being 3-D". The rotating cube example I have here (using both OpenGL ES 1.1 and 2.0 to render the textured cube) certainly works in 3-D, and I use glOrthof() in the OpenGL ES 1.1 side of the renderer. Shading and other effects can be applied independently of the glOrthof() usage.
In that example, I don't read back the model view matrix to manipulate the cube. Instead, I keep a copy of the matrix locally and modify that using some Core Animation helper functions. In addition to the CATransform3DRotate() that I perform on the cube, you should be able to throw in a CATransform3DTranslate() to displace it in a certain direction, while still being able to spin it.
I keep a local copy of the model view matrix for performance (reading back the model view matrix halts the rendering pipeline on OpenGL ES 1.1), and for compatibility with 2.0 (where you need to send the matrix as a uniform to the shaders).
Also, in an answer to your later question (which might get closed), you can't just arbitrarily change values within the model view matrix and expect to see linear displacements from that. You need to get the math right, and matrix math was never one of my strong points. I find it best to let a transform operation (like those provided in Core Animation) do the math for you when manipulating matrices.
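If you'd rather keep driving the fixed-function matrix stack directly (accepting the glGetFloatv readback cost mentioned above), the trick is to keep the accumulated rotation separate from the camera offset, so the translation never ends up in the matrix you read back and re-apply. Here is a minimal sketch, not the answerer's code; the variable name is just illustrative:
/* keep only rotations in here; it starts out as the identity matrix */
#include <OpenGLES/ES1/gl.h>

static GLfloat accumulatedRotation[16] = {
    1, 0, 0, 0,   0, 1, 0, 0,   0, 0, 1, 0,   0, 0, 0, 1
};

static void applySpinAndOffset(GLfloat angle, GLfloat dx, GLfloat dy)
{
    glMatrixMode(GL_MODELVIEW);

    /* fold the newest user rotation into the rotation-only matrix */
    glLoadIdentity();
    glRotatef(angle, dy, dx, 0.0f);
    glMultMatrixf(accumulatedRotation);
    glGetFloatv(GL_MODELVIEW_MATRIX, accumulatedRotation);

    /* build the full modelview: step back 5 first, then apply the rotations,
       so the cube spins about its own centre in front of the camera */
    glLoadIdentity();
    glTranslatef(0.0f, 0.0f, -5.0f);
    glMultMatrixf(accumulatedRotation);

    /* submit the cube's vertices here */
}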


Why does merging geometries improve rendering speed?

In my web application I only need to add static objects to my scene. It was rendering slowly, so I started searching and found that merging geometries and merging vertices were the solution. When I implemented it, it indeed worked a lot better. All the articles said that the reason for this improvement is the reduced number of WebGL calls. As I am not very familiar with things like OpenGL and WebGL (I use Three.js to avoid their complexity), I would like to know why exactly it reduces the WebGL calls.
Because you send one large object instead of many little ones, the overhead is reduced, so I understand that loading one big mesh into the scene is faster than loading many small meshes.
BUT I do not understand why merging geometries also has a positive influence on the rendering itself. I would also like to know the difference between merging geometries and merging vertices.
Thanks in advance!
three.js is a framework that helps you work with the WebGL API.
What a "mesh" is to three.js, to webgl, it's a series of low level calls that set up state and issue calls to the GPU.
Let's take a sphere for example. With three.js you would create it with a few lines:
var sphereGeometry = new THREE.SphereGeometry(10);
var sphereMaterial = new THREE.MeshBasicMaterial({color:'red'});
var sphereMesh = new THREE.Mesh( sphereGeometry, sphereMaterial);
myScene.add( sphereMesh );
You have your renderer.render() call, and poof, a sphere appears on screen.
A lot of stuff happens under the hood though.
The first line creates the sphere "geometry": the CPU does a bunch of math and logic to describe a sphere with points and triangles. Points are vectors, three floats grouped together; triangles are structures that group these points by indices (groups of integers).
Somewhere there is a loop that calculates the vectors using trigonometry (sin, cos), and another that weaves the resulting array of vectors into triangles (take every N, N + M, N + 2M, make a triangle, and so on). A rough sketch of that kind of loop follows.
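For the curious, here is what such a loop can look like, written in C for brevity (three.js does the equivalent in JavaScript): walk latitude and longitude angles with sin/cos to produce positions, then group neighbouring points into triangle indices. The segment counts and memory layout are arbitrary choices for this sketch.
#include <math.h>

#define SEGMENTS 16
#define RINGS    16

static float          positions[(RINGS + 1) * (SEGMENTS + 1) * 3];
static unsigned short indices[RINGS * SEGMENTS * 6];

static void buildSphere(float radius)
{
    /* positions: one vertex per (ring, segment) pair */
    int v = 0;
    for (int ring = 0; ring <= RINGS; ring++) {
        float phi = (float)M_PI * ring / RINGS;                 /* 0 .. pi  */
        for (int seg = 0; seg <= SEGMENTS; seg++) {
            float theta = 2.0f * (float)M_PI * seg / SEGMENTS;  /* 0 .. 2pi */
            positions[v++] = radius * sinf(phi) * cosf(theta);
            positions[v++] = radius * cosf(phi);
            positions[v++] = radius * sinf(phi) * sinf(theta);
        }
    }

    /* indices: weave neighbouring points into two triangles per quad */
    int i = 0;
    for (int ring = 0; ring < RINGS; ring++) {
        for (int seg = 0; seg < SEGMENTS; seg++) {
            unsigned short a = (unsigned short)(ring * (SEGMENTS + 1) + seg);
            unsigned short b = (unsigned short)(a + SEGMENTS + 1);  /* one ring down */
            indices[i++] = a;                     indices[i++] = b;  indices[i++] = (unsigned short)(a + 1);
            indices[i++] = (unsigned short)(a + 1); indices[i++] = b; indices[i++] = (unsigned short)(b + 1);
        }
    }
}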
Now these numbers exist in javascript land, it's just a bunch of floats and ints, grouped together in a specific way to describe shapes such as cubes, spheres and aliens.
You need a way to draw this construct on a screen - a two dimensional array of pixels.
WebGL does not actually know much about 3D. It knows how to manage memory on the GPU and how to compute things in parallel (or rather, it gives you the tools to), and it knows the mathematical operations that are crucial for 3D graphics, but the same math could just as well be used to mine bitcoins without ever drawing anything.
In order for WebGL to draw something on screen, it first needs the data put into the appropriate buffers; it needs the shader programs; it needs to be set up for that specific call (will there be blending, i.e. transparency in three.js land, depth testing, stencil testing, and so on); then it needs to know what it's actually drawing (you provide strides, attribute sizes and the like, so it knows where a 'mesh' actually sits in memory), how it's drawing it (triangle strips, fans, points...), and what to draw it with, i.e. which shaders it will apply to the data you provided.
So, you need a way to 'teach' WebGL to do 3d.
I think the best way to get familiar with this concept is to look at this tutorial, re-reading it if necessary, because it explains what happens for pretty much every 3D object drawn in perspective, ever.
To sum up the tutorial:
a perspective camera is basically two 4x4 matrices: a perspective matrix, which puts things into perspective, and a view matrix, which moves the entire world into camera space. Every camera you make consists of these two matrices.
Every object exists in its own object space. A TRS (translate-rotate-scale) matrix, the world matrix in three.js terms, is used to transform the object into world space.
So this stuff, a concept such as the projection matrix, is what teaches WebGL how to draw perspective.
Three.js abstracts this further and gives you things like "field of view" and "aspect ratio" instead of left, right, top and bottom planes.
Three.js also abstracts the transformation matrices (the view matrix on the camera, and a world matrix on every object): it lets you set "position" and "rotation" and computes the matrices from those under the hood.
Since every mesh has to be processed by the vertex shader and the pixel shader in order to appear on the screen, every mesh needs to have all this information available.
When a draw call is issued for a specific mesh, that mesh will have the same perspective matrix and view matrix as any other object being rendered with the same camera. They will each have their own world matrix, the numbers that move them around your scene.
This is transformation alone, happening in the vertex shader. These results are then rasterized, and go to the pixel shader for processing.
Let's consider two materials: black plastic and red plastic. They will have the same shader, perhaps one you wrote using THREE.ShaderMaterial, or maybe one from three's library. It's the same shader, but it has one uniform value exposed: color. This allows you to have many instances of a plastic material (green, blue, pink), but it means that each of these requires a separate draw call.
WebGL has to issue specific calls to change that uniform from red to black before it's ready to draw anything with that 'material'.
So now imagine a particle system displaying a thousand cubes, each with a unique color. If you treat them as separate meshes and change the color via a uniform, you have to issue a thousand draw calls to draw them all.
If, on the other hand, you assign vertex colors to each cube, you no longer rely on the uniform but on an attribute. Now, if you merge all the cubes together, you can issue a single draw call, processing all the cubes with the same shader, as the sketch below shows.
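To make the uniform-versus-attribute difference concrete, here is roughly the shape of what ends up being issued at the GL level, written as OpenGL ES 2.0 C calls (WebGL's JavaScript API mirrors these almost one for one). The handles, struct, and buffer names are made up for this sketch, and each cube is assumed to be 36 unindexed vertices.
#include <GLES2/gl2.h>

typedef struct { GLuint vbo; GLfloat r, g, b; } Cube;

/* one mesh per cube: a uniform update and a draw call for every single cube */
static void drawCubesSeparately(const Cube *cubes, int count,
                                GLint colorUniform, GLuint positionAttrib)
{
    for (int i = 0; i < count; i++) {
        glUniform3f(colorUniform, cubes[i].r, cubes[i].g, cubes[i].b);  /* state change */
        glBindBuffer(GL_ARRAY_BUFFER, cubes[i].vbo);
        glEnableVertexAttribArray(positionAttrib);
        glVertexAttribPointer(positionAttrib, 3, GL_FLOAT, GL_FALSE, 0, 0);
        glDrawArrays(GL_TRIANGLES, 0, 36);                              /* count draw calls total */
    }
}

/* all cubes merged: xyz + rgb interleaved in one buffer, color read from an
   attribute, so a single draw call covers everything */
static void drawCubesMerged(GLuint mergedVbo, int count,
                            GLuint positionAttrib, GLuint colorAttrib)
{
    GLsizei stride = 6 * sizeof(GLfloat);
    glBindBuffer(GL_ARRAY_BUFFER, mergedVbo);
    glEnableVertexAttribArray(positionAttrib);
    glEnableVertexAttribArray(colorAttrib);
    glVertexAttribPointer(positionAttrib, 3, GL_FLOAT, GL_FALSE, stride, (const void *)0);
    glVertexAttribPointer(colorAttrib,    3, GL_FLOAT, GL_FALSE, stride,
                          (const void *)(3 * sizeof(GLfloat)));
    glDrawArrays(GL_TRIANGLES, 0, 36 * count);                          /* one draw call */
}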
You can see why this is more efficient simply by taking a glance at WebGLRenderer in three.js and all the work it has to do to translate your 3D calls into WebGL. Better done once than a thousand times.
Back to those few lines above: sphereMaterial can take a color argument. If you look at the source, this translates into a uniform vec3 in the shader. However, you can achieve the same result with vertex colors, assigning the color you want beforehand.
sphereMesh wraps the computed geometry into an object that three's WebGLRenderer understands, which in turn sets up WebGL accordingly.

Shader-coding: nonlinear projection models

As I understand it, the standard projection model places an imaginary grid in front of the camera, and for each triangle in the scene, determines which 3 pixels its 3 corners project onto. The color is determined for each of these points, and the fragment shader fills in the rest using interpolation.
My question is this: is it possible to gain control over this projection model? For example, create my own custom distorted uv-grid? Or even just supply my own algorithm:
xyPixelPos_for_Vector3( Vector3 v ) {...}
I'm working in Unity3D, so I think that limits me to Cg or OpenGL.
I did once write a GLES2 shader, but I don't remember ever performing any kind of "ray hits quad" type test to resolve the pixel position of a particular 3D point in space.
I'm going to assume that you want to render 3d images based upon 3d primitives that are defined by vertices. This is not the only way to render images with OpenGL but it is the most common. The technique that you describe sounds much more like Ray-Tracing.
How OpenGL Typically Works:
I wouldn't say that OpenGL creates an imaginary grid. Instead, it takes the position of each of your vertices and converts it into a different space using linear algebra (matrices).
If you want to start playing around with this, it would be best to do some reading on Matrices, to understand what the graphics card is doing.
You can easily start warping the positions of Vertices by making a vertex shader. However, there is some setup involved. See the Lighthouse tutorials (http://www.lighthouse3d.com/tutorials/glsl-tutorial/hello-world-in-glsl/) to get started with that! You will also want to read their tutorials on lighting (http://www.lighthouse3d.com/tutorials/glsl-tutorial/lighting/), to create a fully functioning vertex shader which includes a lighting model.
Thankfully, once the shader is set up, you can distort your entire scene to your heart's content. Just remember to do your distortions in the right 'space': world coordinates are very different from eye coordinates!
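To make that concrete, here is a hypothetical GLSL vertex shader, held in a C string as you would when compiling it yourself with glShaderSource. It does the standard matrix transform first and then warps the projected position. The uniform and attribute names and the particular fisheye-style bend are illustrations only, not a specific Unity or Cg recipe.
/* illustrative GLSL ES 1.0 vertex shader: standard transform, then a
   nonlinear tweak of the projected xy coordinates */
static const char *warpedVertexShader =
    "uniform mat4 u_modelViewMatrix;\n"
    "uniform mat4 u_projectionMatrix;\n"
    "attribute vec4 a_position;\n"
    "void main() {\n"
    "    vec4 eyePos  = u_modelViewMatrix * a_position;   // object -> eye space\n"
    "    vec4 clipPos = u_projectionMatrix * eyePos;       // standard projection\n"
    "    float r = length(clipPos.xy / clipPos.w);         // distance from centre\n"
    "    clipPos.xy *= 1.0 / (1.0 + 0.25 * r);             // pull points inward\n"
    "    gl_Position = clipPos;\n"
    "}\n";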

Creating a sprite class in OpenGL ES 2.0

I am working on a sprite class in OpenGL ES 2.0 and have succeeded up to a point. Currently I have a render method for the sprite, and it's called by the render method in my EAGL layer at intervals. I was creating a new vertex buffer and index buffer every time render was called; since that isn't efficient, I deleted them again each frame with glDeleteBuffers. Unfortunately, when I do that the frame rate slows down significantly.
So currently I have the VBO and IBO created at initialization, which works fine in terms of frame rate and memory consumption, but I am unable to update the sprite's position.
I'm at a bit of a loss, as I'm just beginning with OpenGL; any help is appreciated.
Typically you want to create your sprite's VBOs and IBOs once, with the geometry located at the model origin. To translate, rotate, and scale the sprite, you then use the model matrix to transform it into the desired location.
I'm fairly certain the iPhone SDK provides some nice functions for this, but I don't know them :) Basically, in your shader you take your position coordinates and multiply them by one or more matrices; one of those is the model matrix, which you can set to a translation, rotation, scale, or any combination of those (in fact, it can be any matrix you want, and it will produce different results).
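As a concrete, hand-rolled sketch of that idea: build a small translate-and-scale model matrix on the CPU each frame and hand it to the shader as a uniform, so the sprite's VBO and IBO never have to change. The uniform location and function name here are made up; column-major layout is what OpenGL expects.
#include <OpenGLES/ES2/gl.h>
#include <string.h>

/* write a translate * scale matrix for a 2D sprite into a shader uniform */
static void setSpriteModelMatrix(GLint modelMatrixUniform,
                                 GLfloat x, GLfloat y,
                                 GLfloat scaleX, GLfloat scaleY)
{
    GLfloat m[16];
    memset(m, 0, sizeof(m));
    m[0]  = scaleX;   /* column 0: scaled x axis */
    m[5]  = scaleY;   /* column 1: scaled y axis */
    m[10] = 1.0f;     /* column 2: z axis left alone */
    m[12] = x;        /* column 3: translation */
    m[13] = y;
    m[15] = 1.0f;
    glUniformMatrix4fv(modelMatrixUniform, 1, GL_FALSE, m);
}
In the vertex shader you would then compute something like gl_Position = projection * model * position, and moving the sprite becomes a matter of updating the uniform rather than rebuilding any buffers.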
There's a lot of resources out there that explain these transformation matrices. Here's one for instance:
http://db-in.com/blog/2011/04/cameras-on-opengl-es-2-x/
My advice is to find a tutorial that speaks at the same level as your understanding and learn from there.

Why is my object disappearing after using gluLookat for OpenGL ES 2.0?

I'm putting together a simple game for the iPhone, and am trying to implement the effect of moving the camera around the GLView.
I'm drawing about a hundred objects using glDrawArrays with vertex and color pointers. After this, I want to move the camera to the right by 1 unit. This is the snippet of code I have in my drawView method. I change the matrix mode to the projection stack, and then change back to modelview mode after the projection manipulation is complete (I may be getting this wrong; I'm a newbie to OpenGL).
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glTranslatef(1.0, 0.0, 0.0);
glMatrixMode(GL_MODELVIEW);
In any case, the result is definitely not what I expected. I see my objects very briefly (for perhaps a frame), and then they disappear. The same thing happens if I take out the glTranslatef in the block above.
What am I doing wrong?
Thanks in advance!
(Before and after screenshots omitted.)
One of the confusing aspects of OpenGL for beginners is the distinction between what happens in the projection matrix versus what happens on the modelview matrix (and why would you even need both of them?)
The projection matrix is in charge of transforming the coordinates of your vertices to points in a 2D coordinate system (it splats the world onto your virtual film - the viewport). The projection matrix only specifies the behavior of your camera (for example: should it be a wide-angle lens, or telephoto, or a completely orthogonal one, like architectural oblique drawings?).
The modelview matrix, on the other hand, is in charge of specifying where in 3D space your vertices go to. So for example, to specify where a character's arm is with respect to the character's body, or where this character is with respect to the world, you will want to change the modelview matrix. It is important to notice, in particular, that changes in the position and orientation of the camera belong on the modelview matrix (it is the "view" part of it)
The reason it gets confusing is that at the end of the day, the vertices you give OpenGL are multiplied by the modelview matrix and then by the projection matrix. That is, given a modelview matrix M, a projection matrix P, and a vertex v, the final coordinate of the vertex is given by PMv. This means that some transformations seem to work regardless of which matrix you use. You should be careful about this - when you get to fancier OpenGL techniques, you will run into situations in which using the correct matrices makes a difference.
Until you get to that point, however, let me give you a good rule of thumb. Until you get used to the distinction between the two matrices, only use glOrtho or glFrustum on projection matrices (gluPerspective and their friends are ok too). All other calls (glTranslate, glScale, glRotate, etc) belong to the class of things you should be doing to the modelview matrix.
You probably want to read http://www.opengl.org/resources/faq/technical/viewing.htm (especially section 8.080).
Use something along the following lines:
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
// setup your projection matrix here, e.g. with glFrustum or gluPerspective
glFrustum(...)
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(...) // with your gluLookAt parameters, of course
glTranslatef(1.0, 0.0, 0.0);
// your drawing code here
Just for clarification: gluLookAt creates a matrix which is meant to be multiplied with the modelview matrix. If you use the order of function calls suggested above, things should work as expected.
I hope that helps.

Screen-to-World coordinate conversion in OpenGLES an easy task?

The Screen-to-world problem on the iPhone
I have a 3D model (CUBE) rendered in an EAGLView and I want to be able to detect when I am touching the center of a given face (From any orientation angle) of the cube. Sounds pretty easy but it is not...
The problem:
How do I accurately relate screen coordinates (the touch point) to world coordinates (a location in OpenGL 3D space)? Sure, converting a given point into a 'percentage' of the screen/world axis might seem like the logical fix, but problems arise when I need to zoom or rotate the 3D space. Note: rotating and zooming in and out of the 3D space will change the relationship between the 2D screen coords and the 3D world coords... Also, you have to allow for the 'distance' between the viewpoint and objects in 3D space. At first this might seem like an 'easy task', but that changes once you actually examine the requirements. And I've found no examples of people doing this on the iPhone. How is this normally done?
An 'easy' task?:
Sure, one might undertake the task of writing an API to act as a go-between between screen and world, but the task of creating such a framework would require some serious design and would likely take 'time' to do -- NOT something that can be one-manned in 4 hours...And 4 hours happens to be my deadline.
The question:
What are some of the simplest ways to know if I touched specific locations in 3D space in the iPhone OpenGL ES world?
You can now find gluUnProject in http://code.google.com/p/iphone-glu/. I've no association with the iphone-glu project and haven't tried it yet myself, just wanted to share the link.
How would you use such a function? This PDF mentions that:
The Utility Library routine gluUnProject() performs this reversal of the transformations. Given the three-dimensional window coordinates for a location and all the transformations that affected them, gluUnProject() returns the world coordinates from where it originated.
int gluUnProject(GLdouble winx, GLdouble winy, GLdouble winz,
const GLdouble modelMatrix[16], const GLdouble projMatrix[16],
const GLint viewport[4], GLdouble *objx, GLdouble *objy, GLdouble *objz);
Map the specified window coordinates (winx, winy, winz) into object coordinates, using transformations defined by a modelview matrix (modelMatrix), projection matrix (projMatrix), and viewport (viewport). The resulting object coordinates are returned in objx, objy, and objz. The function returns GL_TRUE, indicating success, or GL_FALSE, indicating failure (such as a noninvertible matrix). This operation does not attempt to clip the coordinates to the viewport or eliminate depth values that fall outside of glDepthRange().
There are inherent difficulties in trying to reverse the transformation process. A two-dimensional screen location could have originated from anywhere on an entire line in three-dimensional space. To disambiguate the result, gluUnProject() requires that a window depth coordinate (winz) be provided and that winz be specified in terms of glDepthRange(). For the default values of glDepthRange(), winz at 0.0 will request the world coordinates of the transformed point at the near clipping plane, while winz at 1.0 will request the point at the far clipping plane.
Example 3-8 (again, see the PDF) demonstrates gluUnProject() by reading the mouse position and determining the three-dimensional points at the near and far clipping planes from which it was transformed. The computed world coordinates are printed to standard output, but the rendered window itself is just black.
In terms of performance, I found this quickly via Google as an example of what you might not want to do using gluUnProject, with a link to what might lead to a better alternative. I have absolutely no idea how applicable it is to the iPhone, as I'm still a newb with OpenGL ES. Ask me again in a month. ;-)
You need the OpenGL projection and modelview matrices. Multiply them to get the modelview-projection matrix. Invert this matrix to get a matrix that transforms clip-space coordinates into world coordinates. Transform your touch point so it corresponds to clip coordinates: the center of the screen should be zero, while the edges should be +1/-1 for X and Y respectively.
Construct two points, one at (0,0,0) and one at (touch_x, touch_y, -1), and transform both by the inverse modelview-projection matrix.
Undo the perspective divide (divide each transformed point by its w component).
You should get two points describing a line from the center of the camera into "the far distance" (the far plane).
Do picking based on simplified bounding boxes of your models. Ray/box intersection algorithms are easy to find on the web (one is sketched below).
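In case it saves a search, one common ray/box test is the "slab" method; here is a plain C sketch, generic and not tied to the code further down. The ray is origin + t*dir, the box is given by its min/max corners, and on a hit the entry distance is returned through tHit.
#include <math.h>

static int rayHitsBox(const float origin[3], const float dir[3],
                      const float boxMin[3], const float boxMax[3], float *tHit)
{
    float tNear = -INFINITY, tFar = INFINITY;
    for (int axis = 0; axis < 3; axis++) {
        if (fabsf(dir[axis]) < 1e-8f) {
            /* ray parallel to these slabs: it must already lie between them */
            if (origin[axis] < boxMin[axis] || origin[axis] > boxMax[axis])
                return 0;
        } else {
            float t1 = (boxMin[axis] - origin[axis]) / dir[axis];
            float t2 = (boxMax[axis] - origin[axis]) / dir[axis];
            if (t1 > t2) { float tmp = t1; t1 = t2; t2 = tmp; }
            if (t1 > tNear) tNear = t1;
            if (t2 < tFar)  tFar  = t2;
            if (tNear > tFar || tFar < 0.0f)
                return 0;   /* slabs don't overlap, or the box is behind the ray */
        }
    }
    *tHit = (tNear >= 0.0f) ? tNear : tFar;   /* tFar if the ray starts inside */
    return 1;
}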
Another solution is to paint each of the models in a slightly different color into an offscreen buffer and read the color back at the touch point, which tells you which brick was touched.
Here's source for a cursor I wrote for a little project using bullet physics:
// convert the touch/mouse position to clip-space coordinates (-1..1, y flipped)
float x = ((float)mpos.x / screensize.x) *  2.0f - 1.0f;
float y = ((float)mpos.y / screensize.y) * -2.0f + 1.0f;

// unproject a point on the far plane and undo the perspective divide
p2 = renderer->camera.unProject(vec4(x, y, 1.0f, 1));
p2 /= p2.w;

// ray start: nudged a short distance out from the camera position
vec4 pos = activecam.GetView().col_t;
p1 = pos + (((vec3)p2 - (vec3)pos) / 2048.0f * 0.1f);
p1.w = 1.0f;

// cast the ray through the physics world and see what it hits first
btCollisionWorld::ClosestRayResultCallback rayCallback(btVector3(p1.x, p1.y, p1.z),
                                                       btVector3(p2.x, p2.y, p2.z));
game.dynamicsWorld->rayTest(btVector3(p1.x, p1.y, p1.z),
                            btVector3(p2.x, p2.y, p2.z), rayCallback);
if (rayCallback.hasHit())
{
    btRigidBody* body = btRigidBody::upcast(rayCallback.m_collisionObject);
    if (body == game.worldBody)
    {
        renderer->setHighlight(0);
    }
    else if (body)
    {
        Entity* ent = (Entity*)body->getUserPointer();
        if (ent)
        {
            renderer->setHighlight(dynamic_cast<ModelEntity*>(ent));
            //cerr << "hit ";
            //cerr << ent->getName() << endl;
        }
    }
}
Imagine a line that extends from the viewer's eye through the screen touch point into your 3D model space. If that line intersects any of the cube's faces, then the user has touched the cube.
Two solutions present themselves. Both of them should achieve the end goal, albeit by a different means: rather than answering "what world coordinate is under the mouse?", they answer the question "what object is rendered under the mouse?".
One is to draw a simplified version of your model to an off-screen buffer, rendering the center of each face using a distinct color (and adjusting the lighting so color is preserved identically). You can then detect those colors in the buffer (e.g. pixmap), and map mouse locations to them.
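Here is a rough sketch of that first approach, assuming OpenGL ES 1.1 fixed-function rendering with the usual modelview/projection already set up. drawFace() and the color table stand in for your own data, and you would read the pixel back before presenting the frame (or render into an off-screen framebuffer).
#include <OpenGLES/ES1/gl.h>

typedef struct { GLubyte r, g, b; } FaceColor;

/* returns the index of the cube face under the touch point, or -1 */
static int pickFace(const FaceColor faceColors[6],
                    void (*drawFace)(int face),
                    int touchX, int touchY, int viewportHeight)
{
    /* flat, exact colors: no lighting, texturing, or dithering */
    glDisable(GL_LIGHTING);
    glDisable(GL_TEXTURE_2D);
    glDisable(GL_DITHER);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    for (int face = 0; face < 6; face++) {
        glColor4ub(faceColors[face].r, faceColors[face].g, faceColors[face].b, 255);
        drawFace(face);
    }

    /* y is flipped because GL's window origin is the lower-left corner */
    GLubyte pixel[4];
    glReadPixels(touchX, viewportHeight - touchY, 1, 1,
                 GL_RGBA, GL_UNSIGNED_BYTE, pixel);

    for (int face = 0; face < 6; face++) {
        if (pixel[0] == faceColors[face].r &&
            pixel[1] == faceColors[face].g &&
            pixel[2] == faceColors[face].b)
            return face;
    }
    return -1;
}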
The other is to use OpenGL picking. There's a decent-looking tutorial here. The basic idea is to put OpenGL into select mode, restrict the viewport to a small (perhaps 3x3 or 5x5) window around the point of interest, and then render the scene (or a simplified version of it) using OpenGL "names" (integer identifiers) to identify the components making up each face. At the end of this process, OpenGL can give you a list of the names that were rendered in the selection viewport. Mapping these identifiers back to the original objects lets you determine what object is under the touch point. (Be aware, though, that select-mode picking is a desktop OpenGL feature and is not available in OpenGL ES, so on the iPhone the color-buffer approach above is usually the practical one.)
Google for opengl screen to world (for example there's a thread where somebody wants to do exactly what you are looking for on GameDev.net). There is a gluUnProject function that does precisely this, but it's not available on the iPhone, so you have to port it (see this source from the Mesa project). Or maybe there's already some publicly available port somewhere?