Normally, you'd use something like:
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
glEnable(GL_BLEND);
glEnable(GL_LINE_SMOOTH);
glLineWidth(2.0f);
glVertexPointer(2, GL_FLOAT, 0, points);
glEnableClientState(GL_VERTEX_ARRAY);
glDrawArrays(GL_LINE_STRIP, 0, num_points);
glDisableClientState(GL_VERTEX_ARRAY);
It looks good in the iPhone simulator, but on the iPhone the lines get extremely thin and w/o any anti aliasing.
How do you get AA on iPhone?
One can achieve the effect of anti aliasing very cheaply using vertices with opacity 0.
Here's an image example to explain:
Comparison with AA:
You can read a paper about this here:
http://research.microsoft.com/en-us/um/people/hoppe/overdraw.pdf
You could do something along this way:
// Colors is a pointer to unsigned bytes (4 per color).
// Should alternate in opacity.
glColorPointer(4, GL_UNSIGNED_BYTE, 0, colors);
glEnableClientState(GL_COLOR_ARRAY);
// points is a pointer to floats (2 per vertex)
glVertexPointer(2, GL_FLOAT, 0, points);
glEnableClientState(GL_VERTEX_ARRAY);
glDrawArrays(GL_TRIANGLE_STRIP, 0, points_count);
glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_COLOR_ARRAY);
Starting in iOS Version 4.0 you have an easy solution, it's now possible to use Antialiasing for the whole OpenGL ES scene with just a few lines of added code. (And nearly no performance loss, at least on the SGX GPU).
For the code please read the following Apple Dev-Forum Thread.
There are also some sample pictures how it looks for me on my blog.
Using http://answers.oreilly.com/topic/1669-how-to-render-anti-aliased-lines-with-textures-in-ios-4/ as a starting point, I was able to get anti-aliased lines like these:
They aren't perfect nor are they as nice as the ones that I had been drawing with Core Graphics, but they are pretty good. I am actually drawing same lines (vertices) twice - once with bigger texture and color, then with smaller texture and translucent white.
There are artifacts when lines overlap too tightly and alphas start to accumulate.
One approach around this limitation is tessellating your lines into textured triangle strips (as seen here).
The problem is that on the iPhone OpenGl renders to a frame buffer object rather than the main frame buffer and as I understand it FBO's don't support multisampling.
There are various tricks that can be done, such as rendering to another FBO at twice the display size and then relying on texture filtering to smooth things out, not something that I've tried though so can't comment on how well this works.
I remember very specifically that I tried this and there is no simple way to do this using OpenGL on the iPhone. You can draw using CGPaths and a CGContextRef, but that will be significantly slower.
Put this in your render method and setUpFrame buffer...
You will get anti-aliased appearance.
/*added*/
//[_context presentRenderbuffer:GL_RENDERBUFFER];
//Bind both MSAA and View FrameBuffers.
glBindFramebuffer(GL_READ_FRAMEBUFFER_APPLE, msaaFramebuffer);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER_APPLE, framebuffer );
// Call a resolve to combine both buffers
glResolveMultisampleFramebufferAPPLE();
// Present final image to screen
glBindRenderbuffer(GL_RENDERBUFFER, _colorRenderBuffer);
[_context presentRenderbuffer:GL_RENDERBUFFER];
/*added*/
Related
once again I'm asking for help after quite a bit of research.
I need to create a view where the user can place an image to the background and draw lines/dots(touch events) on top of it and then save the "sketch" by pressing save button.
So after research I decide to pick up this code and build the thing on top of it because it already does half of what I want(it does the drawing).
The sample I have is using OpenGL for drawing and basically I don't care if it is OpenGL or CoreGraphics as soon as it does it.
The problem I have is how to put an image as a background of EAGLView I have in this sample code. My research gave me only suggestions for OpenGL experienced developers but not the working code snippet/solution.
If somebody can help me with this I would be very appreciate.
What I need is just a sample of how to put a UIImage to EAGLView background so then the user can draw(already have the code) on top of it and save the result.
One usually doesn't mix OpenGL with ordinary UI... views. Also drawing a background image using OpenGL is trivial:
First you need to load the Image into a texture. In GLPaint a image file is loaded as brush-texture
https://github.com/omeryavuz/glpaint/blob/master/Classes/PaintingView.m function initWithCoder
To draw a background, the first thing you draw after framebuffer clear is a fullscreen quad with that texture. If you build upong GLPaint, then the projection and modelview matrix and the vertex array state are set properly already. So it boils down to
GLfloat vert[] = {0,0, frame.size.width,0, frame.size.width,frame.size.height, 0,frame.size.height};
GLfloat tex[] = {0,0, 1,0, 1,1, 0,1};
GLuint indexes[] = {0, 1, 2, 2, 3, 0};
glBindTexture(GL_TEXTURE_2D, backgroundTexture);
glEnable(GL_TEXTURE_2D);
glVertexPointer(2, GL_FLOAT, 0, vert);
glTexCoordPointer(2, GL_FLOAT, 0, tex);
glDrawElements(GL_TRIANGLES, 2, GL_UNSIGNED_INT, indexes);
In PaintingView.m, on line 89, set eaglLayer.opaque = NO;.
In your viewController, put a UIImageView or whatever behind the paintingView.
Note: This will probably decrese performance.
Note: It might not initially work; the OpenGL layer may overwrite itself with some sort of default background color before rendering a frame. EDIT: Line 304 in PaintingView.m: Try setting the color to glClearColor(1.0, 1.0, 1.0, 0.0);. I am not sure this works, and don't have time to test this. If it doesn't work, wait till Brad Larson comes around, sees your question, and answers it perfectly ;)
I've got basically a 2d game on the iPhone and I'm trying to set up multiple backgrounds that scroll at different speeds (known as parallax backgrounds).
So my thought was to just stick the backgrounds BEHIND the foreground using different z-coordinate planes, and just make them bigger than the foreground (in size) to accommodate, so that the whole thing can be scrolled (just at a different speed).
And (as far as I know) I basically implemented that. The only problem is that it seems to entirely ignore whatever z-value I give it, or rather it just zeroes all of them. I see the background (I've only tested ONE background so far, to keep it simple...so for now I just have a foreground and I want one background scrolling at a different speed), but it scrolls 1:1 with my foreground, so it obviously doesn't look right, and most of it is cut off (cause it's bigger). And I've tried various z-values for the background and various near/far clipping planes...it's always the same. I'm probably just doing one simple thing wrong, but I can't figure it out. I'm wondering if it has to do with me using only 2 coordinates in glVertexPointer for the foreground? (Of course for the background I AM passing in 3)
I'll post some code:
This is some initial setup:
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrthof(-1.0f, 1.0f, -1.5f, 1.5f, -10.0f, 10.0f);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glEnableClientState(GL_VERTEX_ARRAY);
//glEnableClientState(GL_COLOR_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
//transparency
glEnable (GL_BLEND);
glBlendFunc (GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
A little bit about my foreground's float array....it's interleaved. For my foreground it goes vertex x, vertex y, texture x, texture y, repeat. This all works just fine.
This is my FOREGROUND rendering:
glVertexPointer(2, GL_FLOAT, 4*sizeof(GLfloat), texes); <br>
glTexCoordPointer(2, GL_FLOAT, 4*sizeof(GLfloat), (GLvoid*)texes + 2*sizeof(GLfloat)); <br>
glDrawArrays(GL_TRIANGLES, 0, indexCount / 4);
BACKGROUND rendering:
Same drill here except this time it goes vertex x, vertex y, vertex z, texture x, texture y, repeat. Note the z value this time. I did make sure the data in this array was correct while debugging (getting the right z values). And again, it shows up...it's just not going far back in the distance like it should.
glVertexPointer(3, GL_FLOAT, 5*sizeof(GLfloat), b1Texes);
glTexCoordPointer(2, GL_FLOAT, 5*sizeof(GLfloat), (GLvoid*)b1Texes + 3*sizeof(GLfloat));
glDrawArrays(GL_TRIANGLES, 0, b1IndexCount / 5);
And to move my camera, I just do a simple glTranslatef(x, y, 0.0f);
I'm not understanding what I'm doing wrong cause this seems like the most basic 3D function imaginable...things further away are smaller and don't move as fast when the camera moves. Not the case for me. Seems like it should be pretty basic and not even really be affected by my projection and all that (though I've even tried doing glFrustum just for fun, no success). Please help, I feel like it's just one dumb thing. I will post more code if necessary.
Shot in the dark...
You may have to forgotten to setup the Depth-Buffering within the framebuffer initializer.
Copy&Paste from Apple's older EAGLView templates:
glGenRenderbuffersOES(1, &depthRenderbuffer);
glBindRenderbufferOES(GL_RENDERBUFFER_OES, depthRenderbuffer);
glRenderbufferStorageOES(GL_RENDERBUFFER_OES, GL_DEPTH_COMPONENT16_OES, backingWidth, backingHeight);
glFramebufferRenderbufferOES(GL_FRAMEBUFFER_OES, GL_DEPTH_ATTACHMENT_OES, GL_RENDERBUFFER_OES, depthRenderbuffer);
If you are depending of blending you must draw in depth order, meaning draw the furthest (deepest) layer first. Otherwise they will be covered by the layer on top as the z-buffer value is written even though the area is 100% transparent.
See here
I've figured out that I am using orthographic projections which are incapable of displaying things being further away (please correct me if I'm wrong on this). When I tried glFrustum earlier (as I stated in my question), I was doing something wrong with the setup of it. I was using a negative value for the near-clipping value, and I basically got the 1:1 scrolling problem, same as orthographic. But I have changed this to 0.01, and it finally started displaying correctly (backgrounds displayed further away).
My issue is resolved but just as a side idea, I'm now wondering if I can mix orthographic and perspective within the same frame, and what that would require. Because I'd rather keep the foreground very simple and orthographic (2d), but I want my backgrounds to display with the perspective depth.
My idea was something like:
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrthof(-1.0f, 1.0f, -1.5f, 1.5f, -10.0f, 10.0f);
//render foreground
glLoadIdentity();
glFrustum(-1.0f, 1.0f, -1.5f, 1.5f, 0.01f, 1000.0f);
//render backgrounds
I will play around with this and comment with my results, in case anyone is curious. Feedback on this would be appreciated, though technically I have no pressing need on this issue anymore (from here on out it would just be idea discussion).
I'm using OpenGL ES to draw things in my iPhone game. Sometimes I like to change the alpha of the textures I'm drawing. Here is the (working) code I use. What, if anything, in this code sample is unnecessary? Thanks!
// draw texture with alpha "myAlpha"
glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE,GL_COMBINE );
glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB,GL_MODULATE);
glTexEnvf(GL_TEXTURE_ENV, GL_SRC0_RGB,GL_PRIMARY_COLOR);
glTexEnvf(GL_TEXTURE_ENV, GL_OPERAND0_RGB,GL_SRC_COLOR);
glTexEnvf(GL_TEXTURE_ENV, GL_SRC1_RGB,GL_TEXTURE);
glTexEnvf(GL_TEXTURE_ENV, GL_OPERAND1_RGB,GL_SRC_COLOR);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
glEnable(GL_BLEND);
glTexEnvf(GL_TEXTURE_ENV, GL_OPERAND1_RGB, GL_SRC_COLOR);
glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
glColor4f(myAlpha, myAlpha, myAlpha, myAlpha);
glPushMatrix();
glTranslatef(xLoc, yLoc, 0);
[MyTexture drawAtPoint:CGPointZero];
glPopMatrix();
glColor4f(1.0, 1.0, 1.0, 1.0);
Edit:
The above code sample is for drawing w/ a modified alpha value (so I can fade things in and out). When I just want to draw w/o modifying the alpha value I'd use the last 5 lines of the above sample just without the last call to glColor4f.
My drawing looks like:
glBindTexture(GL_TEXTURE_2D, tex->name); // bind to the name
glVertexPointer(3, GL_FLOAT, 0, vertices); // verices is the alignment in triangles
glTexCoordPointer(2, GL_FLOAT, 0, coordinates); // coordinates describes the rectangle
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4); //draw
It's hard to know what you can remove without knowing exactly what you're trying to do.
edit:
Unless you are disabling blend at some point, you can get rid of all of the blend calls:
glEnable(GL_BLEND);
glBlendFunc...
Put them in the initialization of your GL state if you need to set it once. If you need to enable and disable it is better to set the state once for everything you need to draw blended and then set the state again (once) for the unblended.
OpenGL is a state machine so this general idea applies anywhere you need to change the GL state (like setting texture environment with glTexEnvf).
About state machines:
A current state is determined by past
states of the system. As such, it can
be said to record information about
the past, i.e., it reflects the input
changes from the system start to the
present moment. A transition indicates
a state change and is described by a
condition that would need to be
fulfilled to enable the transition.
I'm not a expert but I was reading the Optimizing OpenGL ES for iPhone OS and it had a section on "Optimizing Texturing" which may help you out.
You have calls to configure both the Texture Environment and Framebuffer Blending, which are entirely different features. You are also calling glTexEnvf(GL_TEXTURE_ENV, GL_OPERAND1_RGB, x) twice before drawing, and so only the second call matters.
If all you really want to do is multiply the primary color by the texture color, then it is as simple as glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE). The rest of the calls to glTexEnv are unnecessary.
I'm testing my simple OpenGL ES implementation (a 2D game) on the iPhone and I notice a high render utilization while using the profiler. These are the facts:
I'm displaying only one preloaded large texture (512x512 pixels) at 60fps and the render utilization is around 40%.
My texture is blended using GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA, the only GL function I'm using.
I've tried to make the texture smaller and tiling it, which made no difference.
I'm using a PNG texture atlas of 1024x1024 pixels
I find it very strange that this one texture is causing such an intense GPU usage.
Is this to be expected? What am I doing wrong?
EDIT: My code:
// OpenGL setup is identical to OpenGL ES template
// initState is called to setup
// timer is initialized, drawView is called by the timer
- (void) initState
{
//usual init declarations have been omitted here
glEnable(GL_BLEND);
glBlendFunc(GL_ONE,GL_ONE_MINUS_SRC_ALPHA);
glEnableClientState (GL_VERTEX_ARRAY);
glVertexPointer (2,GL_FLOAT,sizeof(Vertex),&allVertices[0].x);
glEnableClientState (GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer (2,GL_FLOAT,sizeof(Vertex),&allVertices[0].tx);
glEnableClientState (GL_COLOR_ARRAY);
glColorPointer (4,GL_UNSIGNED_BYTE,sizeof(Vertex),&allVertices[0].r);
}
- (void) drawView
{
[EAGLContext setCurrentContext:context];
glBindFramebufferOES(GL_FRAMEBUFFER_OES, viewFramebuffer);
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
GLfloat width = backingWidth /2.f;
GLfloat height = backingHeight/2.f;
glOrthof(-width, width, -height, height, -1.f, 1.f);
glMatrixMode(GL_MODELVIEW);
glClearColor(0.f, 0.f, 0.f, 1.f);
glClear(GL_COLOR_BUFFER_BIT);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
glBindRenderbufferOES(GL_RENDERBUFFER_OES, viewRenderbuffer);
[context presentRenderbuffer:GL_RENDERBUFFER_OES];
[self checkGLError];
}
EDIT: I've made a couple of improvements, but none managed to lower the render utilization. I've divided the texture in parts of 32x32, changed the type of the coordinates and texture coordinates from GLfloat to GLshort and added extra vertices for degenerative triangles.
The updates are:
initState:
(vertex and texture pointer are now GL_SHORT)
glMatrixMode(GL_TEXTURE);
glScalef(1.f / 1024.f, 1.f / 1024.f, 1.f / 1024.f);
glMatrixMode(GL_MODELVIEW);
glScalef(1.f / 16.f, 1.f/ 16.f, 1.f/ 16.f);
drawView:
glDrawArrays(GL_TRIANGLE_STRIP, 0, 1536); //(16*16 parts * 6 vertices)
I'm writing an app which displays five 512x512 textures on top of each other in a 2D environment using GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA, and I can get about 14fps. Do you really need 60fps? For a game, I'd think 24-30 would be fine. Also, use PVR texture compression if at all possible. There's an example that does it included with the SDK.
I hope you didn't forget to disable GL_BLEND when you don't need it already.
You can make an attempt at memory bandwidth optimization - use 16 bpp formats or PVRTC. IMHO with your texture size texture cache doesn't help at all.
Don't forget that your framebuffer is being used as texture by iPhone UI. If it is created as 32 bit RGBA it will be alpha-blended one more time. For optimal performance 16 bit 565 framebuffers are the best (but graphics quality suffers).
I don't know all the details, such as cache size, but, I suppose, textures pixels are already swizzled when uploaded into video memory and triangles are split by PVR tile engine. Therefore your own splitting appears to be redundant.
And finally. This is only a mobile low-power GPU, not designed for huge screens and high fillrates. Alpha-blending is costly, maybe 3-4 times difference on PowerVR chips.
Read this post.
512x512 is probably a little over optimistic for the iPhone to deal with.
EDIT:
I assume you have already read this, but if not check Apples guide to optimal OpenGl ES performance on iPhone.
What is exactly is the problem?
You're getting your 60fps, which is silky smooth.
Who cares if render utilization is 40%?
The issue could be because of the iPhone's texture cache size. It may simply come down to how much of the texture is on each individual triangle, quad, or tristrip, depending on how you're setting state.
Try this: subdivide your quad and repeat your tests. So if you're 1 quad, make it 4. Then 16. and so on, and see if that helps. The key is to reduce the actual number of pixels that each primitive references.
When the texture cache gets blown, then the hardware will thrash texture lookups from main memory into whatever vram is set aside for the texture buffer for each pixel. This can kill performance mighty quick.
OR - I am completely wrong because I really don't know the iPhone hardware, and I also know that the PVR chip is a strange beast in comparison to what I'm used to (PS2, PSP). Still it's an easy test to try and I'm curious if it helps.
I've slightly modified the iPhone SDK's GLSprite example while learning OpenGL ES and it turns out to be quite slow. Even in the simulator (on the hw worst) so I must be doing something wrong since it's only 400 textured triangles.
const GLfloat spriteVertices[] = {
0.0f, 0.0f,
100.0f, 0.0f,
0.0f, 100.0f,
100.0f, 100.0f
};
const GLshort spriteTexcoords[] = {
0,0,
1,0,
0,1,
1,1
};
- (void)setupView {
glViewport(0, 0, backingWidth, backingHeight);
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrthof(0.0f, backingWidth, backingHeight,0.0f, -10.0f, 10.0f);
glMatrixMode(GL_MODELVIEW);
glClearColor(0.3f, 0.0f, 0.0f, 1.0f);
glVertexPointer(2, GL_FLOAT, 0, spriteVertices);
glEnableClientState(GL_VERTEX_ARRAY);
glTexCoordPointer(2, GL_SHORT, 0, spriteTexcoords);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
// sprite data is preloaded. 512x512 rgba8888
glGenTextures(1, &spriteTexture);
glBindTexture(GL_TEXTURE_2D, spriteTexture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, spriteData);
free(spriteData);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glEnable(GL_TEXTURE_2D);
glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
glEnable(GL_BLEND);
}
- (void)drawView {
..
glClear(GL_COLOR_BUFFER_BIT);
glLoadIdentity();
glTranslatef(tx-100, ty-100,10);
for (int i=0; i<200; i++) {
glTranslatef(1, 1, 0);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
}
..
}
drawView is called every time the screen is touched or the finger on the screen is moved and tx,ty are set to the x,y coordinates where that touch happened.
I've also tried using GLBuffer, when translation was pre-generated and there was only one DrawArray but gave the same performance (~4 FPS).
===EDIT===
Meanwhile I've modified this so that much smaller quads are used (sized: 34x20) and much less overlapping is done. There are ~400 quads->800 triangles spread on the whole screen. Texture size is 512x512 atlas and RGBA_8888 while the texture coordinates are in float.
The code is very ugly in terms of API efficiency: there are two MatrixMode change along with two loads and two translation then a drawarrays for a triangle strip (quad).
Now this produces ~45 FPS.
(I know this is very late, but I couldn't resist. I'll post anyway, in case other people come here looking for advice.)
This has nothing to do with the texture size. I don't know why people rated up Nils. He seems to have a fundamental misunderstanding of the OpenGL pipeline. He seems to think that for a given triangle, the entire texture is loaded and mapped onto that triangle. The opposite is true.
Once the triangle has been mapped into the viewport, it is rasterized. For every on-screen pixel the your triangle covers, the fragment shader is called. The default fragment shader (OpenGL ES 1.1, which you are using) will lookup the texel that most closely maps (GL_NEAREST) to the pixel you are drawing. It might look up 4 texels since you are using the higher quality GL_LINEAR method to average the best texel. Still, if the pixel count in your triangle is, say 100, then the most texture bytes you will have to read is 4(lookups) * 100(pixels) * 4(bytes per color. Far far less than what Nils was saying. It's amazing that he can make it sound like he actually knows what he's talking about.
WRT the tiled architecture, this is common in embedded OpenGL devices to preserve locality of reference. I believe that each tile gets exposed to each drawing operation, quickly culling most of them. Then the tile decides what to draw on itself. This is going to be much slower when you have blending turned on, as you do. Because you are using large triangles that might overlap and blend with other tiles, the GPU has to do a lot of extra work. If, instead of rendering the example square with alpha edges, you were to render an actual shape (instead of a square picture of the shape), then you could turn off blending for this part of the scene and I bet that would speed things up tremendously.
If you want to try it, just turn off blending and see how much things speed up, even if the don't look right. glDisable(GL_BLEND);
Your texture is 512*512*4 bytes per pixel. That's a megabyte of data. If you render it 200 times per frame you generate a bandwidth load of 200 megabytes per frame.
With roughly 4 fps you consume 800mb/second just for texture reads alone. Frame- and Zbuffer writes need bandwidth as well. Then there is the CPU, and don't underestimate the bandwidth requirements of the display as well.
RAM on embedded systems (e.g. your iphone) is not as fast as on a Desktop-PC. What you see here is a bandwidth starvation effect. The RAM simply can't handle the data faster.
How to cure this problem:
pick a sane texture-size. On average you should have 1 texel per pixel. This gives crisp looking textures. I know - it's not always possible. Use common sense.
use mipmaps. This takes up 33% of extra space but allows the graphic chip to pick use a lower resolution mipmap if possible.
Try smaller texture formats. Maybe you can use the ARGB4444 format. This would double the rendering speed. Also take a look at the compressed texture formats. Decompression does not cause a performance drop as it's done in hardware. Infact the opposite is true: Due to the smaller size in memory the graphic chip can read the texture-data faster.
I guess my first try was just a bad (or very good) test.
iPhone has a PowerVR MBX Lite which has a tile based graphics processor. It subdivides the screen into smaller tiles and renders them parallel. Now in the first case above the subdivision might got a bit exhausted because of the very high overlapping. More over, they couldn't be clipped because of the same distance and so all texture coordinates had to calculated (This could be easily tested by changing the translation in the loop).
Also because of the overlapping the parallelism couldn't be exploited and some tiles were sitting doing nothing and the rest (1/3) were working a lot.
So I think, while memory bandwidth could be a bottleneck, this wasn't the case in this example. The problem is more because of how the graphics HW works and the setup of the test.
I'm not familiar with the iPhone, but if it doesn't have dedicated hardware for handling floating point numbers (I suspect it doesn't) then it'd be faster to use integers whenever possible.
I'm currently developing for Android (which uses OpenGL ES as well) and for instance my vertex array is int instead of float. I can't say how much of a difference it makes, but I guess it's worth a try.
Apple is very tight-lipped about the specific hardware specs of the iPhone, which seems very strange to those of us coming from a console background. But people have been able to determine that the CPU is a 32-bit RISC ARM1176JZF. The good news is that it have a full floating-point unit, so we can continue writing math and physics code the way we do in most platforms.
http://gamesfromwithin.com/?p=239