Testing point in the alpha channel - iphone

Is there a way to detect if the alpha of a pixel after drawing is not 0 when using OpenGLES on the iphone?
I would like to test multiple points to see if they are inside the area of a random polygon drawn by the user. If you know Flash, something equivalent to BitmapData::getPixel32 is what I'm looking for.

The framebuffer is kept by the GPU and is not immediately CPU accessible. I think the thing you'd most likely want from full OpenGL is the occlusion query; you can request geometry be drawn and be told how many pixels were actually plotted. Sadly that isn't available on the iPhone.
I think what you probably want is glReadPixels, which can be used to read a single pixel if you prefer, e.g. (written here, as I type, not tested)
GLubyte pixelValue[4];
glReadPixels(x, y, 1, 1, GL_RGBA, GL_UNSIGNED_BYTE, pixelValue);
NSLog(@"alpha was %d", pixelValue[3]);
Using glReadPixels causes a pipeline flush, so is generally a bad idea from a GL performance point of view, but it'll do what you want. Unlike iOS, OpenGL uses graph paper order for pixel coordinates, so (0, 0) is the lower left corner.
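Because of that flipped origin, a touch location has to be converted before it is handed to glReadPixels. A minimal sketch in plain C, assuming the touch arrives in UIKit points with a top-left origin and that the view's scale factor is known; the names touch_to_gl_pixel and PixelCoord are made up for illustration:

```c
/* Hypothetical helper: convert a UIKit touch (points, origin top-left)
   into the pixel coordinates glReadPixels expects (origin bottom-left). */
typedef struct { int x; int y; } PixelCoord;

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

PixelCoord touch_to_gl_pixel(float touch_x, float touch_y, float scale,
                             int backing_width, int backing_height)
{
    PixelCoord p;
    int px = (int)(touch_x * scale);   /* points -> pixels */
    int py = (int)(touch_y * scale);
    p.x = clamp_int(px, 0, backing_width - 1);
    /* Flip vertically: row 0 in GL is the bottom row of the view. */
    p.y = clamp_int(backing_height - 1 - py, 0, backing_height - 1);
    return p;
}
```

The resulting x and y would then be the first two arguments of the glReadPixels call above.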

Related

In OpenGL ES2.0 for iOs, how can I use a CVPixelBufferRef to update a cubemap texture?

I have managed to get a CVPixelBufferRef from an AVPlayer to feed pixel data that I can use to texture a 2D object. When my pixelbuffer has data in it I do:
CVReturn err = CVOpenGLESTextureCacheCreateTextureFromImage(
kCFAllocatorDefault,
videoTextureCache_,
pixelBuffer, //this is a CVPixelBufferRef
NULL,
GL_TEXTURE_2D,
GL_RGBA,
frameWidth,
frameHeight,
GL_BGRA,
GL_UNSIGNED_BYTE,
0,
&texture);
I would like to use this buffer to create a GL_TEXTURE_CUBE_MAP. My video frame data is actually 6 sections in one image (e.g. a cubestrip) that in total makes the sides of a cube. Any thoughts on a way to do this?
I had thought to just pretend my GL_TEXTURE_2D was a GL_TEXTURE_CUBE_MAP and replace the texture on my skybox with the texture generated by the code above, but this creates a distorted mess (as I suppose should be expected when trying to force a skybox to be textured with a GL_TEXTURE_2D).
The other idea was to set up unpacking using glPixelStorei and then read from the pixel buffer:
glPixelStorei(GL_UNPACK_ROW_LENGTH, width);
glPixelStorei(GL_UNPACK_SKIP_PIXELS, X);
glPixelStorei(GL_UNPACK_SKIP_ROWS, Y);
glTexImage2D(...,&pixelbuffer);
But unbelievably, GL_UNPACK_ROW_LENGTH is not supported in OpenGL ES 2.0 on iOS.
So, is there:
- Any way to split up the pixel data in my CVPixelBufferRef by indexing the buffer to some pixel subset before using it to make a texture?
- Any way to make six new GL_TEXTURE_2D textures as indexed subsets of the GL_TEXTURE_2D that is created by the code above?
- Any way to convert a GL_TEXTURE_2D to a valid GL_TEXTURE_CUBE_MAP? (E.g. GLKit has a Skybox effect that loads a GL_TEXTURE_CUBE_MAP from a single cubestrip file. It doesn't have a method to load a texture from memory, though, or I would be sorted.)
- Any other ideas?
If it were impossible any other way (which is unlikely, there probably is an alternate way -- so this is probably not the best answer & involves more work than necessary) here is a hack I'd try:
How a cube map works is it projects the texture for each face from a point in the center of the geometry out toward each of the cube faces. So you could reproduce that behavior yourself; you could use Projective Texturing to make six draw calls, one for each face of your cube. Each time, you'd first draw the face you're interested in to the stencil buffer, then calculate the projection matrix for your texture (this technique is used a lot for 'spotlight' effects in games), then figure out the transform matrix required to augment the fragment shader's texture read so that for each face, only the portion of the texture that corresponds to that face winds up within the (0..1) texture lookup range. If everything has gone right, anything outside the 0..1 range should be discarded by the stencil buffer, and you'd be left with a DIY cube map out of a TEXTURE_2D.
The above method is actually really similar to what I'm doing for an app right now, except I'm only using projective texturing to mask off & replace a small portion of the cube map. I need to pixel-match the edges of the small square I'm projecting so that it's seamlessly applied to the skybox, so that's why I feel confident that this method will actually reproduce the cube map behavior -- otherwise, pixel-matching wouldn't be possible.
Anyway, I hope you find a way to simply transition your 2D to CUBEMAP, because that would probably be much easier and cleaner.
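For the first option in the question (splitting the pixel data on the CPU), here is a rough sketch of a different technique from the projective-texturing hack above: copy each face out of the strip row by row into a tightly packed buffer, which sidesteps the missing GL_UNPACK_ROW_LENGTH. It assumes the six faces sit side by side in one horizontal strip, 4 bytes per pixel, and that the source stride may include padding. The function name extract_strip_face is hypothetical and the code has not been run against a real CVPixelBufferRef:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy face `face` (0..5) out of a horizontal strip
   of six square faces.
   src               first byte of the strip
   src_bytes_per_row stride of the strip in bytes (may include padding)
   face_size         width and height of one face, in pixels
   dst               tightly packed output, face_size * face_size * 4 bytes */
void extract_strip_face(const uint8_t *src, size_t src_bytes_per_row,
                        size_t face_size, size_t face, uint8_t *dst)
{
    const size_t bytes_per_pixel = 4;              /* assumed BGRA/RGBA 8888 */
    const size_t face_row_bytes = face_size * bytes_per_pixel;

    for (size_t row = 0; row < face_size; ++row) {
        /* Start of this row inside the strip, offset to the wanted face. */
        const uint8_t *s = src + row * src_bytes_per_row + face * face_row_bytes;
        memcpy(dst + row * face_row_bytes, s, face_row_bytes);
    }
}
```

The source pointer and stride would come from CVPixelBufferGetBaseAddress and CVPixelBufferGetBytesPerRow while the buffer is locked with CVPixelBufferLockBaseAddress, and each extracted face could then be uploaded with glTexImage2D to GL_TEXTURE_CUBE_MAP_POSITIVE_X + face (the six face targets are consecutive enums). Which strip position maps to which cube face is an assumption you would need to check against your video layout. Note that this gives up the zero-copy benefit of the texture cache, so it costs a CPU copy per frame.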

Warping an image on the iphone with OpenGL

I am fairly new to programming and I'm doing it, at this point, just to educate myself and have fun.
I'm having a lot of trouble understanding some OpenGL stuff despite having read this great article here. I've also downloaded and played around with an example from the apple developer site that uses a .png image for a sprite. I do eventually want to use an image.
All I want to do is take an image and warp it such that its four corners end up at four different x,y coordinates that I supply. This would be on a timer of sorts (CADisplayLink?) with one or more of these points changing at each moment. I just want to stretch it between these dynamic points.
I'm just having trouble understanding exactly how this works. As I've understood some example code over at the developer center, I can use:
glVertexPointer(2, GL_FLOAT, 0, spriteVertices);
where spriteVertices is something like:
const GLfloat spriteVertices[] = {
-0.90f, -.85f,
0.95f, -0.83f,
-0.85f, 0.85f,
0.80f, 0.80f,
};
The problem is that I don't understand what the numbers actually mean, why some have negatives in front of them, and where they are counting from to get the four corners. How would I need to change the normal x,y coordinates that I get in order to plug them into this? (The numbers I would have for x,y wouldn't look like numbers between 1 and 0, would they?) I would like something akin to per-pixel accuracy.
Any help is greatly appreciated even if it's just a link to more reading. I'm having trouble finding resources for a newb.
It isn't as complicated as it seems at first. Each pair of numbers is an x,y position on the screen. With the default (identity) matrices, both axes run from -1 to 1 with (0, 0) at the center of the drawable area. So 0.80f, 0.80f says go 80% of the way from the center to the right edge and 80% of the way from the center to the top edge, while -0.80, -0.80 says go 80% of the way toward the left and bottom edges. The negatives just switch the sides. A point of note: OpenGL draws down to up (as if you were looking up a building from the ground), while the iPhone draws up to down (as though you were reading a book).
To get pixels, map the -1..1 range onto the drawable size: for a 1024-pixel-wide area, (0.8 + 1) / 2 × 1024 = 921.6.
This tutorial is for textures, but it is amazing and really helps you learn the coordinate systems:
http://iphonedevelopment.blogspot.com/2009/05/opengl-es-from-ground-up-part-6_25.html
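To go the other way (from the pixel positions you have to the numbers glVertexPointer wants), the mapping can be sketched in a few lines of C. This assumes no projection has been set, so coordinates run from -1 to 1 with 0 at the center of the view; the function names are made up for illustration:

```c
/* Hypothetical helpers, assuming identity projection/modelview matrices. */

/* Pixel position (0..size) -> coordinate in the -1..1 range. */
float pixel_to_ndc(float pixel, float size)
{
    return pixel / size * 2.0f - 1.0f;
}

/* Same, for a y value measured from the top of the screen (UIKit style),
   flipped so that the top of the screen becomes +1. */
float pixel_to_ndc_y_flipped(float pixel_y, float height)
{
    return 1.0f - pixel_y / height * 2.0f;
}

/* Coordinate in the -1..1 range -> pixel position (0..size). */
float ndc_to_pixel(float ndc, float size)
{
    return (ndc + 1.0f) * 0.5f * size;
}
```

Another option is to set up an orthographic projection such as glOrthof(0, width, height, 0, -1, 1), after which vertex coordinates can be given directly in pixels with the origin at the top left.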

Screen-to-World coordinate conversion in OpenGLES an easy task?

The Screen-to-world problem on the iPhone
I have a 3D model (CUBE) rendered in an EAGLView and I want to be able to detect when I am touching the center of a given face (From any orientation angle) of the cube. Sounds pretty easy but it is not...
The problem:
How do I accurately relate screen-coordinates (touch point) to world-coordinates (a location in OpenGL 3D space)? Sure, converting a given point into a 'percentage' of the screen/world-axis might seem the logical fix, but problems would arise when I need to zoom or rotate the 3D space. Note: rotating & zooming in and out of the 3D space will change the relationship of the 2D screen coords with the 3D world coords...Also, you'd have to allow for 'distance' in between the viewpoint and objects in 3D space. At first, this might seem like an 'easy task', but that changes when you actually examine the requirements. And I've found no examples of people doing this on the iPhone. How is this normally done?
An 'easy' task?:
Sure, one might undertake the task of writing an API to act as a go-between between screen and world, but the task of creating such a framework would require some serious design and would likely take 'time' to do -- NOT something that can be one-manned in 4 hours...And 4 hours happens to be my deadline.
The question:
What are some of the simplest ways to know if I touched specific locations in 3D space in the iPhone OpenGL ES world?
You can now find gluUnProject in http://code.google.com/p/iphone-glu/. I've no association with the iphone-glu project and haven't tried it yet myself, just wanted to share the link.
How would you use such a function? This PDF mentions that:
The Utility Library routine gluUnProject() performs this reversal of the transformations. Given the three-dimensional window coordinates for a location and all the transformations that affected them, gluUnProject() returns the world coordinates from where it originated.
int gluUnProject(GLdouble winx, GLdouble winy, GLdouble winz,
const GLdouble modelMatrix[16], const GLdouble projMatrix[16],
const GLint viewport[4], GLdouble *objx, GLdouble *objy, GLdouble *objz);
Map the specified window coordinates (winx, winy, winz) into object coordinates, using transformations defined by a modelview matrix (modelMatrix), projection matrix (projMatrix), and viewport (viewport). The resulting object coordinates are returned in objx, objy, and objz. The function returns GL_TRUE, indicating success, or GL_FALSE, indicating failure (such as a noninvertible matrix). This operation does not attempt to clip the coordinates to the viewport or eliminate depth values that fall outside of glDepthRange().
There are inherent difficulties in trying to reverse the transformation process. A two-dimensional screen location could have originated from anywhere on an entire line in three-dimensional space. To disambiguate the result, gluUnProject() requires that a window depth coordinate (winz) be provided and that winz be specified in terms of glDepthRange(). For the default values of glDepthRange(), winz at 0.0 will request the world coordinates of the transformed point at the near clipping plane, while winz at 1.0 will request the point at the far clipping plane.
Example 3-8 (again, see the PDF) demonstrates gluUnProject() by reading the mouse position and determining the three-dimensional points at the near and far clipping planes from which it was transformed. The computed world coordinates are printed to standard output, but the rendered window itself is just black.
In terms of performance, I found this quickly via Google as an example of what you might not want to do using gluUnProject, with a link to what might lead to a better alternative. I have absolutely no idea how applicable it is to the iPhone, as I'm still a newb with OpenGL ES. Ask me again in a month. ;-)
You need to have the OpenGL projection and modelview matrices. Multiply them to get the modelview projection matrix. Invert this matrix to get a matrix that transforms clip space coordinates into world coordinates. Transform your touch point so it corresponds to clip coordinates: the center of the screen should be zero, while the edges should be +1/-1 for X and Y respectively.
Construct two points, one at (0,0,0) and one at (touch_x,touch_y,-1), and transform both by the inverse modelview projection matrix.
Do the inverse of a perspective divide.
You should get two points describing a line from the center of the camera into "the far distance" (the farplane).
Do picking based on simplified bounding boxes of your models. You should be able to find ray/box intersection algorithms aplenty on the web.
Another solution is to paint each of the models in a slightly different color into an offscreen buffer and read the color at the touch point from there, which tells you which model was touched.
Here's source for a cursor I wrote for a little project using bullet physics:
float x=((float)mpos.x/screensize.x)*2.0f -1.0f;
float y=((float)mpos.y/screensize.y)*-2.0f +1.0f;
p2=renderer->camera.unProject(vec4(x,y,1.0f,1));
p2/=p2.w;
vec4 pos=activecam.GetView().col_t;
p1=pos+(((vec3)p2 - (vec3)pos) / 2048.0f * 0.1f);
p1.w=1.0f;
btCollisionWorld::ClosestRayResultCallback rayCallback(btVector3(p1.x,p1.y,p1.z),btVector3(p2.x,p2.y,p2.z));
game.dynamicsWorld->rayTest(btVector3(p1.x,p1.y,p1.z),btVector3(p2.x,p2.y,p2.z), rayCallback);
if (rayCallback.hasHit())
{
    btRigidBody* body = btRigidBody::upcast(rayCallback.m_collisionObject);
    if(body==game.worldBody)
    {
        renderer->setHighlight(0);
    }
    else if (body)
    {
        Entity* ent=(Entity*)body->getUserPointer();
        if(ent)
        {
            renderer->setHighlight(dynamic_cast<ModelEntity*>(ent));
            //cerr<<"hit ";
            //cerr<<ent->getName()<<endl;
        }
    }
}
Imagine a line that extends from the viewer's eye
through the screen touch point into your 3D model space.
If that line intersects any of the cube's faces, then the user has touched the cube.
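That intersection test can be sketched in plain C with the usual "slab" method for a ray against an axis-aligned box. This is only an illustration: the Vec3 type and ray_hits_box name are made up here, and it assumes the ray origin and direction have already been obtained (for example from two unprojected points) and are expressed in the same space as the box:

```c
#include <math.h>
#include <stdbool.h>

typedef struct { float x, y, z; } Vec3;

/* Hypothetical helper: slab test of a ray against an axis-aligned box.
   Returns true on a hit and, if t_hit is non-NULL, stores the distance
   along `dir` (in units of its length) at which the ray enters the box. */
bool ray_hits_box(Vec3 origin, Vec3 dir, Vec3 box_min, Vec3 box_max,
                  float *t_hit)
{
    const float o[3]  = { origin.x,  origin.y,  origin.z  };
    const float d[3]  = { dir.x,     dir.y,     dir.z     };
    const float lo[3] = { box_min.x, box_min.y, box_min.z };
    const float hi[3] = { box_max.x, box_max.y, box_max.z };

    float t_near = 0.0f;       /* the ray starts at its origin */
    float t_far  = INFINITY;

    for (int i = 0; i < 3; ++i) {
        if (d[i] == 0.0f) {
            /* Ray is parallel to this pair of planes: must start between them. */
            if (o[i] < lo[i] || o[i] > hi[i]) return false;
            continue;
        }
        float t0 = (lo[i] - o[i]) / d[i];
        float t1 = (hi[i] - o[i]) / d[i];
        if (t0 > t1) { float tmp = t0; t0 = t1; t1 = tmp; }
        if (t0 > t_near) t_near = t0;
        if (t1 < t_far)  t_far  = t1;
        if (t_near > t_far) return false;   /* slabs do not overlap */
    }
    if (t_hit) *t_hit = t_near;
    return true;
}
```

For a rotated cube, one way to keep the box axis-aligned is to transform the ray into the cube's model space before testing. Which face was hit can then be worked out from the hit point, since it lies on one of the six box planes.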
Two solutions present themselves. Both of them should achieve the end goal, albeit by a different means: rather than answering "what world coordinate is under the mouse?", they answer the question "what object is rendered under the mouse?".
One is to draw a simplified version of your model to an off-screen buffer, rendering the center of each face using a distinct color (and adjusting the lighting so color is preserved identically). You can then detect those colors in the buffer (e.g. pixmap), and map mouse locations to them.
The other is to use OpenGL picking. There's a decent-looking tutorial here. The basic idea is to put OpenGL in select mode, restrict the viewport to a small (perhaps 3x3 or 5x5) window around the point of interest, and then render the scene (or a simplified version of it) using OpenGL "names" (integer identifiers) to identify the components making up each face. At the end of this process, OpenGL can give you a list of the names that were rendered in the selection viewport. Mapping these identifiers back to original objects will let you determine what object is under the mouse cursor.
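For the first of these two approaches, the bookkeeping amounts to turning an object or face index into a unique color and back again. A minimal C sketch, with hypothetical function names, assuming an 8-bits-per-channel color buffer and reserving id 0 for "nothing here":

```c
#include <stdint.h>

/* Hypothetical helpers for color-coded picking.
   Assumes the pick buffer stores 8 bits per channel; with a 16-bit
   (e.g. RGB565) buffer the low bits would be lost. */

/* Spread a 24-bit id over the R, G and B bytes. */
void id_to_rgb(uint32_t id, uint8_t rgb[3])
{
    rgb[0] = (uint8_t)((id >> 16) & 0xFF);
    rgb[1] = (uint8_t)((id >> 8) & 0xFF);
    rgb[2] = (uint8_t)(id & 0xFF);
}

/* Recover the id from a pixel read back from the pick buffer. */
uint32_t rgb_to_id(const uint8_t rgb[3])
{
    return ((uint32_t)rgb[0] << 16) | ((uint32_t)rgb[1] << 8) | (uint32_t)rgb[2];
}
```

Each face would be drawn with its id color while lighting, texturing, blending and dithering are disabled, and the pixel under the touch read back with glReadPixels. Note that selection mode (glRenderMode) is not part of OpenGL ES, so on the iPhone the color approach is the one of the two that applies directly.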
Google for opengl screen to world (for example there’s a thread where somebody wants to do exactly what you are looking for on GameDev.net). There is a gluUnProject function that does precisely this, but it’s not available on iPhone, so you have to port it (see this source from the Mesa project). Or maybe there’s already some publicly available source somewhere?

Positioning elements in 2D space with OpenGL ES

In my spare time I like to play around with game development on the iPhone with OpenGL ES. I'm throwing together a small 2D side-scroller demo for fun, and I'm relatively new to OpenGL, and I wanted to get some more experienced developers' input on this.
So here is my question: does it make sense to specify the vertices of each 2D element in model space, then translate each element to its final view space each time a frame is drawn?
For example, say I have a set of blocks (squares) that make up the ground in my side-scroller. Each square is defined as:
const GLfloat squareVertices[] = {
-1.0, 1.0, -6.0, // Top left
-1.0, -1.0, -6.0, // Bottom left
1.0, -1.0, -6.0, // Bottom right
1.0, 1.0, -6.0 // Top right
};
Say I have 10 of these squares that I need to draw together as the ground for the next frame. Should I do something like this, for each square visible in the current scene?
glPushMatrix();
{
glTranslatef(currentSquareX, currentSquareY, 0.0);
glVertexPointer(3, GL_FLOAT, 0, squareVertices);
glEnableClientState(GL_VERTEX_ARRAY);
// Do the drawing
}
glPopMatrix();
It seems to me that doing this for every 2D element in the scene, for every frame, gets a bit intense and I would imagine the smarter people who use OpenGL much more than I do may have a better way of doing this.
That all being said, I'm expecting to hear that I should profile the code and see where any bottlenecks may be: to those people, I say: I haven't written any of this code yet, I'm simply in the process of wrapping my mind around it so that when I do go to write it it goes smoother.
On the subject of profiling and optimization, I'm really not trying to prematurely optimize here, I'm just trying to wrap my mind around how one would set up a 2D scene and render it. Like I said, I'm relatively new to OpenGL and I'm just trying to get a feel for how things are done. If anyone has any suggestions on a better way to do this, I'd love to hear your thoughts.
Please keep in mind that I'm not interested in 3D, just 2D for now. Thanks!
You are concerned with the overhead it takes to transform a model (in this case a square) from model coordinates to world coordinates when you have a lot of models. This seems like an obvious optimization for static models.
If you build your square's vertices in world coordinates, then of course it is going to be faster as each square will avoid the extra cost of these three functions (glPushMatrix, glPopMatrix, and glTranslatef) since there is no need to translate from model to world coordinates at render time. I have no idea how much faster this will be, I suspect that it won't be a humongous optimization, and you lose the modularity of keeping the squares in model coordinates: What if in the future you decide you want these squares to be moveable? That will be a lot harder if you're keeping their vertices in world coordinates.
In short, it's a tradeoff:
World Coordinates
- More memory: each square needs its own set of vertices.
- Less computation: no need to perform glPushMatrix, glPopMatrix, or glTranslatef for each square at render time.
- Less flexible: lacks support for (or complicates) dynamically moving these squares.
Model Coordinates
- Less memory: the squares can share the same vertex data.
- More computation: each square must perform three extra functions at render time.
- More flexible: squares can easily be moved by manipulating the glTranslatef call.
I guess the only way to know what is the right decision is by doing and profiling. I know you said you haven't written this yet, but I suspect that whether your squares are in model or world coordinates it won't make much of a difference - and if it does, I can't imagine an architecture that you could create where it would be hard to switch your squares from model to world coordinates or vice-versa.
Good luck to you and your adventures in iPhone game development!
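To make the "world coordinates" side of that tradeoff concrete, here is a small C sketch of baking the translation into a copy of the vertex data once, instead of calling glTranslatef every frame. It assumes three floats per vertex, as in the question's squareVertices; the function name bake_translation is made up:

```c
#include <stddef.h>

/* Hypothetical helper: write a translated copy of `model` into `world`.
   Both arrays hold vertex_count vertices of 3 floats (x, y, z).
   Only x and y are offset, mirroring glTranslatef(tx, ty, 0.0). */
void bake_translation(const float *model, size_t vertex_count,
                      float tx, float ty, float *world)
{
    for (size_t i = 0; i < vertex_count; ++i) {
        world[i * 3 + 0] = model[i * 3 + 0] + tx;
        world[i * 3 + 1] = model[i * 3 + 1] + ty;
        world[i * 3 + 2] = model[i * 3 + 2];   /* depth unchanged */
    }
}
```

With ten ground squares this means ten baked vertex arrays (or one larger array holding all of them) and no push/translate/pop per square at draw time, at the cost of re-baking whenever a square moves.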
If you are only using screen-aligned quads it might be easier to use the OES Draw Texture extension. Then you can use a single texture to hold all your game "sprites". First specify the crop rectangle by setting the GL_TEXTURE_CROP_RECT_OES TexParameter. This is the boundary of the sprite within the larger texture. To render, call glDrawTexiOES, passing in the desired position and size in viewport coordinates.
int rect[4] = {0, 0, 16, 16};
glBindTexture(GL_TEXTURE_2D, sprites);
glTexParameteriv(GL_TEXTURE_2D, GL_TEXTURE_CROP_RECT_OES, rect);
glDrawTexiOES(x, y, z, width, height);
This extension isn't available on all devices, but it works great on the iPhone.
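If the sprites sit in a regular grid inside the texture, the crop rectangle for sprite number N can be computed rather than hard-coded. A small sketch in C, assuming equal-sized cells numbered row by row from the first row of texel data; the function name and the grid layout are assumptions, not part of the extension:

```c
/* Hypothetical helper: fill rect with {u, v, width, height} in texels
   for sprite `index` in an atlas laid out as a grid with `columns`
   equal cells per row. */
void sprite_crop_rect(int index, int columns, int sprite_w, int sprite_h,
                      int rect[4])
{
    rect[0] = (index % columns) * sprite_w;   /* left edge of the cell */
    rect[1] = (index / columns) * sprite_h;   /* first row of the cell */
    rect[2] = sprite_w;
    rect[3] = sprite_h;
}
```

The filled array would be passed to glTexParameteriv exactly as rect is in the snippet above. If sprites come out upside down, a commonly used fix is to start at the other edge of the cell and pass a negative height, but check that against how your image data was loaded.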
You might also consider using a static image and just scrolling that instead of drawing each individual block of the floor, and translating its position, etc.

Why is this OpenGL ES code slow on iPhone?

I've slightly modified the iPhone SDK's GLSprite example while learning OpenGL ES and it turns out to be quite slow. It is slow even in the simulator (and worse on the hardware), so I must be doing something wrong, since it's only 400 textured triangles.
const GLfloat spriteVertices[] = {
0.0f, 0.0f,
100.0f, 0.0f,
0.0f, 100.0f,
100.0f, 100.0f
};
const GLshort spriteTexcoords[] = {
0,0,
1,0,
0,1,
1,1
};
- (void)setupView {
    glViewport(0, 0, backingWidth, backingHeight);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrthof(0.0f, backingWidth, backingHeight,0.0f, -10.0f, 10.0f);
    glMatrixMode(GL_MODELVIEW);
    glClearColor(0.3f, 0.0f, 0.0f, 1.0f);
    glVertexPointer(2, GL_FLOAT, 0, spriteVertices);
    glEnableClientState(GL_VERTEX_ARRAY);
    glTexCoordPointer(2, GL_SHORT, 0, spriteTexcoords);
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);
    // sprite data is preloaded. 512x512 rgba8888
    glGenTextures(1, &spriteTexture);
    glBindTexture(GL_TEXTURE_2D, spriteTexture);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, spriteData);
    free(spriteData);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glEnable(GL_TEXTURE_2D);
    glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
    glEnable(GL_BLEND);
}
- (void)drawView {
    ..
    glClear(GL_COLOR_BUFFER_BIT);
    glLoadIdentity();
    glTranslatef(tx-100, ty-100,10);
    for (int i=0; i<200; i++) {
        glTranslatef(1, 1, 0);
        glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
    }
    ..
}
drawView is called every time the screen is touched or the finger on the screen is moved and tx,ty are set to the x,y coordinates where that touch happened.
I've also tried using GLBuffer, where the translation was pre-generated and there was only one DrawArray call, but it gave the same performance (~4 FPS).
===EDIT===
Meanwhile I've modified this so that much smaller quads are used (sized: 34x20) and much less overlapping is done. There are ~400 quads->800 triangles spread on the whole screen. Texture size is 512x512 atlas and RGBA_8888 while the texture coordinates are in float.
The code is very ugly in terms of API efficiency: there are two MatrixMode changes along with two loads and two translations, then a DrawArrays call for a triangle strip (quad).
Now this produces ~45 FPS.
(I know this is very late, but I couldn't resist. I'll post anyway, in case other people come here looking for advice.)
This has nothing to do with the texture size. I don't know why people rated up Nils. He seems to have a fundamental misunderstanding of the OpenGL pipeline. He seems to think that for a given triangle, the entire texture is loaded and mapped onto that triangle. The opposite is true.
Once the triangle has been mapped into the viewport, it is rasterized. For every on-screen pixel that your triangle covers, the fragment shader is called. The default fragment shader (OpenGL ES 1.1, which you are using) will look up the texel that most closely maps (GL_NEAREST) to the pixel you are drawing. It might look up 4 texels, since you are using the higher-quality GL_LINEAR method to average the best texel. Still, if the pixel count in your triangle is, say, 100, then the most texture bytes you will have to read is 4 (lookups) * 100 (pixels) * 4 (bytes per color). Far, far less than what Nils was saying. It's amazing that he can make it sound like he actually knows what he's talking about.
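The upper bound described here is simple enough to write down as code. This is just the arithmetic from the paragraph above as a C function (the name is made up), and it ignores texture caching, which would normally make the real figure lower:

```c
#include <stdint.h>

/* Hypothetical helper: worst-case texture bytes fetched for one primitive,
   assuming every covered pixel does `lookups_per_pixel` uncached texel
   reads of `bytes_per_texel` bytes each. */
uint64_t max_texel_bytes(uint64_t covered_pixels,
                         uint32_t lookups_per_pixel,
                         uint32_t bytes_per_texel)
{
    return covered_pixels * lookups_per_pixel * bytes_per_texel;
}
```

For the 100-pixel example that is 1,600 bytes. Applied to one of the question's 100x100 quads (10,000 covered pixels, assuming the whole quad is on screen) the same bound gives 160,000 bytes per quad.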
WRT the tiled architecture, this is common in embedded OpenGL devices to preserve locality of reference. I believe that each tile gets exposed to each drawing operation, quickly culling most of them. Then the tile decides what to draw on itself. This is going to be much slower when you have blending turned on, as you do. Because you are using large triangles that might overlap and blend with other tiles, the GPU has to do a lot of extra work. If, instead of rendering the example square with alpha edges, you were to render an actual shape (instead of a square picture of the shape), then you could turn off blending for this part of the scene and I bet that would speed things up tremendously.
If you want to try it, just turn off blending and see how much things speed up, even if they don't look right. glDisable(GL_BLEND);
Your texture is 512*512*4 bytes per pixel. That's a megabyte of data. If you render it 200 times per frame you generate a bandwidth load of 200 megabytes per frame.
With roughly 4 fps you consume 800 MB/second for texture reads alone. Frame- and Z-buffer writes need bandwidth as well. Then there is the CPU, and don't underestimate the bandwidth requirements of the display either.
RAM on embedded systems (e.g. your iphone) is not as fast as on a Desktop-PC. What you see here is a bandwidth starvation effect. The RAM simply can't handle the data faster.
How to cure this problem:
- Pick a sane texture size. On average you should have 1 texel per pixel. This gives crisp-looking textures. I know, it's not always possible. Use common sense.
- Use mipmaps. This takes up 33% extra space but allows the graphics chip to use a lower-resolution mipmap where possible.
- Try smaller texture formats. Maybe you can use the ARGB4444 format. This would double the rendering speed. Also take a look at the compressed texture formats. Decompression does not cause a performance drop as it's done in hardware. In fact the opposite is true: due to the smaller size in memory the graphics chip can read the texture data faster.
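The 33% figure for mipmaps can be checked by summing the texel counts of the whole mip chain. A small C sketch (the function name is made up), halving each dimension until the 1x1 level is reached:

```c
#include <stdint.h>

/* Hypothetical helper: total texels in a full mip chain, from the
   base level down to 1x1. Returns 0 for an empty texture. */
uint64_t mip_chain_texels(uint32_t width, uint32_t height)
{
    if (width == 0 || height == 0) return 0;

    uint64_t total = 0;
    for (;;) {
        total += (uint64_t)width * height;
        if (width == 1 && height == 1) break;
        if (width > 1)  width  /= 2;
        if (height > 1) height /= 2;
    }
    return total;
}
```

For the 512x512 texture in the question the chain holds 349,525 texels against 262,144 for the base level alone, which is 87,381 extra texels, or about one third more.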
I guess my first try was just a bad (or very good) test.
iPhone has a PowerVR MBX Lite, which has a tile-based graphics processor. It subdivides the screen into smaller tiles and renders them in parallel. Now in the first case above the subdivision might have got a bit exhausted because of the very high overlapping. Moreover, the quads couldn't be clipped because they were at the same distance, and so all texture coordinates had to be calculated (this could be easily tested by changing the translation in the loop).
Also, because of the overlapping, the parallelism couldn't be exploited: some tiles were sitting doing nothing and the rest (1/3) were working a lot.
So I think, while memory bandwidth could be a bottleneck, this wasn't the case in this example. The problem is more because of how the graphics HW works and the setup of the test.
I'm not familiar with the iPhone, but if it doesn't have dedicated hardware for handling floating point numbers (I suspect it doesn't) then it'd be faster to use integers whenever possible.
I'm currently developing for Android (which uses OpenGL ES as well) and for instance my vertex array is int instead of float. I can't say how much of a difference it makes, but I guess it's worth a try.
Apple is very tight-lipped about the specific hardware specs of the iPhone, which seems very strange to those of us coming from a console background. But people have been able to determine that the CPU is a 32-bit RISC ARM1176JZF. The good news is that it has a full floating-point unit, so we can continue writing math and physics code the way we do in most platforms.
http://gamesfromwithin.com/?p=239