My iPhone game, which I am currently working on, was developed using cocos2d. The game crashes with the error:
Program received 0, Data Formatters, Debugging cannot continue......
After doing some research I've found out that it is running out of memory; I get the following warning:
Received memory warning. Level=1 etc.
The source of the problem seems to be the loading of .plist files. It uses 4.0 MB just to load about 23 .plist files that run different animations.
I would like to know how to load a bunch of .plist files that run different animations. The image is a screenshot of the code that loads the .plist files along with its memory usage; I used Instruments to get that result.
On further debugging, here is the assembler code I got:
pop {r4, r5, r7, pc}
adds r0, #100 ; 0x64
lsls r3, r1, #0
--Error--
lsls r2, r1, #0
add r7, sp, #720
lsls r4, r1, #0
cbz r4, <0x7a>
lsls r4, r1, #0
- (void)applicationDidReceiveMemoryWarning:(UIApplication *)application {
//[[CCDirector sharedDirector] purgeCachedData];
}
Try commenting out purgeCachedData and only calling it when you exit your gameScene.
For every call to addSpriteFramesWithFile, Cocos2d loads the associated image file (.png), and you appear to have quite a lot of sprite sheets. I'm going to assume that each of these sheets is not huge, because obviously loading this many large textures would create memory warnings.
You should combine the smaller sprite sheets into one or more larger sprite sheets, as there is still a penalty for loading multiple textures, which internally will be padded to the next-highest power-of-two dimensions. Not to mention the performance savings from less texture switching during drawing.
Also note that your change to applicationDidReceiveMemoryWarning does not stop Cocos2d from removing textures, you need to comment out the call to removeAllTextures as well in order to test that.
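A minimal sketch of what that could look like, assuming your game scene is a CCScene subclass (the class name and the exact cache calls here are illustrative, not taken from the question):

// Sketch: purge caches when leaving the game scene instead of on every memory warning.
// GameScene is a hypothetical CCScene subclass.
@implementation GameScene

- (void)onExit {
    [super onExit];
    // Drop sprite frames and textures that are no longer referenced,
    // or purge everything with [[CCDirector sharedDirector] purgeCachedData].
    [[CCSpriteFrameCache sharedSpriteFrameCache] removeUnusedSpriteFrames];
    [[CCTextureCache sharedTextureCache] removeUnusedTextures];
}

@end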
Are memoryless textures available for use with multiple MTLRenderCommandEncoders? For example (in theory): I create command encoder #1 and memoryless texture #1 and use it as the render target, then create command encoder #2 with memoryless texture #2 as the render target but use texture #1 as an argument in the fragment shader (read-only access). Would this work?
Short answer: No, this wouldn't work. You have to do it in a single render command encoder.
I'm guessing you want to read the contents of the whole texture in render encoder #2, which is not possible on tile-based Apple GPUs (the only GPUs that run Metal that will actually support memoryless render targets). If you want to read anything apart from contents of the current tile, you have to store the attachment out to system memory, that's just how tile-based deferred renderers work. For more info refer to this talk and other WWDC talks about tile shaders and game optimizations.
Long answer: At the end of a render encoder, Metal has to execute a store action of your choosing, which you pass through the MTLRenderPassDescriptor. The reason it has to do this is that there is a bunch of internal synchronization, including fences and barriers, that ensures the next encoder that uses the previous encoder's attachments as render targets or sampled textures can read whatever was written there.
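As a rough Objective-C sketch (commandBuffer, intermediateTexture, and pass2 are placeholder names, not from the question): if a later encoder needs to sample an attachment, that attachment has to be a regular texture with a store action, not a memoryless one.

// Pass 1 renders into a regular (private, NOT memoryless) texture and stores it out.
MTLRenderPassDescriptor *pass1 = [MTLRenderPassDescriptor renderPassDescriptor];
pass1.colorAttachments[0].texture = intermediateTexture;        // storageMode = MTLStorageModePrivate
pass1.colorAttachments[0].loadAction = MTLLoadActionClear;
pass1.colorAttachments[0].storeAction = MTLStoreActionStore;    // write the result to memory

id<MTLRenderCommandEncoder> enc1 = [commandBuffer renderCommandEncoderWithDescriptor:pass1];
// ... draw ...
[enc1 endEncoding];

// Pass 2 can now read the whole texture as a fragment shader argument.
id<MTLRenderCommandEncoder> enc2 = [commandBuffer renderCommandEncoderWithDescriptor:pass2];
[enc2 setFragmentTexture:intermediateTexture atIndex:0];
// ... draw ...
[enc2 endEncoding];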
Hope this answers your question.
I am working on a painting program where I draw interactive strokes via an MTKView. If I set the renderPassDescriptor loadAction to 'clear':
renderPassDescriptor?.colorAttachments[0].loadAction = .clear
The frame buffer, as expected, shows the latest contents of renderCommandEncoder?.drawPrimitives, which in this case is the leading edge of the brushstroke.
If I set loadAction to 'load':
renderPassDescriptor?.colorAttachments[0].loadAction = .load
The frame buffer flashes like crazy and shows a patchy trail of what I've just drawn. I now understand that the flashing is likely caused by MTKView's default triple buffering. Thus, each time I write to the currentDrawable, I'm likely writing to one of three cycling buffers. Please correct me if I'm wrong.
My question is, what do I need to do to draw a clean brushstroke without the frame buffer flashing as it does now? In other words, is there a way to have a master buffer that gets updated with the latest contents of commandEncoder?
You can use a texture of your own as the color attachment of a render pass. You don't have to use the texture of a drawable. In that way, you can use the .load action without getting garbage or weird flashing or whatever. You will have full control over which texture you're rendering to and what its contents are.
After rendering to that texture for a render pass, you then need to blit that to the drawable's texture for display.
The main complication here is that you won't have the benefits of double- or triple-buffering. You'll lose a certain amount of performance, since everything will have to be synced to that one texture's state. I suspect, though, that you don't need that much performance, since this is interactive and only has to keep up with the speed of a human.
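A rough Objective-C sketch of that flow (canvasTexture, commandBuffer, and view are placeholder names; it assumes the canvas and drawable share the same pixel format and size):

// Accumulate strokes in a persistent offscreen texture.
MTLRenderPassDescriptor *strokePass = [MTLRenderPassDescriptor renderPassDescriptor];
strokePass.colorAttachments[0].texture = canvasTexture;            // your own persistent texture
strokePass.colorAttachments[0].loadAction = MTLLoadActionLoad;     // keep previous strokes
strokePass.colorAttachments[0].storeAction = MTLStoreActionStore;

id<MTLRenderCommandEncoder> encoder = [commandBuffer renderCommandEncoderWithDescriptor:strokePass];
// ... draw the new stroke segment ...
[encoder endEncoding];

// Copy the canvas into the current drawable for display.
id<MTLBlitCommandEncoder> blit = [commandBuffer blitCommandEncoder];
[blit copyFromTexture:canvasTexture
          sourceSlice:0 sourceLevel:0
         sourceOrigin:MTLOriginMake(0, 0, 0)
           sourceSize:MTLSizeMake(canvasTexture.width, canvasTexture.height, 1)
            toTexture:view.currentDrawable.texture
     destinationSlice:0 destinationLevel:0
    destinationOrigin:MTLOriginMake(0, 0, 0)];
[blit endEncoding];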
I'm trying to implement a state-preserving particle system on the iPhone using OpenGL ES 2.0. By state-preserving, I mean that each particle is integrated forward in time, having a unique velocity and position vector that changes with time and cannot be calculated from the initial conditions at every rendering call.
Here's one possible way I can think of.
1. Set up particle initial conditions in a VBO.
2. Integrate particles in the vertex shader and write the result to a texture in the fragment shader. (1st rendering call)
3. Copy data from the texture to the VBO.
4. Render the particles from the data in the VBO. (2nd rendering call)
5. Repeat steps 2-4.
The only thing I don't know how to do efficiently is step 3. Do I have to go through the CPU? I wonder if it is possible to do this entirely on the GPU with OpenGL ES 2.0. Any hints are greatly appreciated!
I don't think this is possible without simply using glReadPixels -- ES 2.0 doesn't have the same flexible buffer management that desktop OpenGL has to allow you to copy buffer contents using the GPU (where, for example, you could copy data between the texture and the VBO, or simply use transform feedback, which is basically designed to do exactly what you want).
I think your only option if you need to use the GPU is to use glReadPixels to copy the framebuffer contents back out after rendering. You probably also want to check and use EXT_color_buffer_float or related if available to make sure you have high precision values (RGBA8 is probably not going to be sufficient for your particles). If you're intermixing this with normal rendering, you probably want to build in a bunch of buffering (wait a frame or two) so you don't stall the CPU waiting for the GPU (this would be especially bad on PowerVR since it buffers a whole frame before rendering).
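For step 3, a rough C sketch of that readback path (particleFBO, particleVBO, and particleCount are placeholder names; it assumes the particle state lives in a particleCount x 1 RGBA8 texture attached to the FBO):

GLubyte *pixels = malloc(particleCount * 4);                       // one RGBA8 texel per particle
glBindFramebuffer(GL_FRAMEBUFFER, particleFBO);                    // FBO whose color attachment holds the new state
glReadPixels(0, 0, particleCount, 1, GL_RGBA, GL_UNSIGNED_BYTE, pixels);

glBindBuffer(GL_ARRAY_BUFFER, particleVBO);
glBufferSubData(GL_ARRAY_BUFFER, 0, particleCount * 4, pixels);    // re-upload as vertex data
free(pixels);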
ES 3.0 will have support for transform feedback, which doesn't help you right now but hopefully gives you some hope for the future.
Also, if you are running on an ARM CPU, it seems like it'd be faster to use NEON to quickly update all your particles. It can be quite fast and will skip all the overhead you'd incur from the CPU+GPU method.
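If you go the CPU route, a minimal NEON sketch of the position update (assuming tightly packed float arrays whose length is a multiple of four; names are illustrative):

#include <arm_neon.h>

// pos[i] += vel[i] * dt, processing four floats per iteration
void update_positions(float *pos, const float *vel, int count, float dt)
{
    float32x4_t vdt = vdupq_n_f32(dt);
    for (int i = 0; i < count; i += 4) {
        float32x4_t p = vld1q_f32(pos + i);
        float32x4_t v = vld1q_f32(vel + i);
        p = vmlaq_f32(p, v, vdt);      // p = p + v * dt
        vst1q_f32(pos + i, p);
    }
}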
I am developing a 2D tile-based game and am currently struggling with a performance issue: I am getting around 10-15 FPS even when running on an iPad 3. OpenGL ES Frame Capture reveals that I am calling glDrawElements 689 times per frame! Is that a lot? Could that be the cause of the low performance?
Should I stack everything into one huge array and perform one draw call? Would it make any difference?
At this point, you are most likely limited by command issue (draw-call overhead); you can confirm this by running the OpenGL ES Performance Detective (it's under Xcode: right click and open the developer tools; you may have to download it through Preferences).
At the end of the day, your goal is to be limited by fill rate. Here are some tips to help you get there:
Sort all sprites by:
- Draw Depth
- Blend Mode
- Texture ID
Once sorted, pack all sprites into one vertex buffer object and one index buffer object. Whenever the draw depth, blend mode, or texture ID changes, it's time to make a new draw call and bind those resources.
Also keep in mind that your sprites' vertices should be flattened on the CPU side (position multiplied by the MVP matrix), so you should not be sending over matrices; any other attributes, such as color, should be part of the vertex.
A typical vertex:

typedef struct {
    float pos[3];
    int   color;
    float uv[2];
} Vertex;
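To illustrate the "flatten on the CPU" part, here is a rough C sketch that fills vertices of the kind shown above; the Sprite struct and the transform helper are placeholders, not from the original answer:

typedef struct {
    float corners[4][3];     // untransformed quad corners
    float uv[4][2];
    int   color;
} Sprite;                    // hypothetical sprite representation

/* Multiply a point by a column-major 4x4 MVP matrix (w assumed to stay 1, e.g. an orthographic projection). */
static void transformPoint(const float m[16], const float p[3], float out[3])
{
    for (int r = 0; r < 3; r++)
        out[r] = m[r] * p[0] + m[4 + r] * p[1] + m[8 + r] * p[2] + m[12 + r];
}

/* Append one sprite's four pre-transformed vertices to the batch. */
void appendSprite(Vertex *batch, int *count, const Sprite *s, const float mvp[16])
{
    for (int i = 0; i < 4; i++) {
        Vertex *v = &batch[(*count)++];
        transformPoint(mvp, s->corners[i], v->pos);
        v->color = s->color;
        v->uv[0] = s->uv[i][0];
        v->uv[1] = s->uv[i][1];
    }
}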
I'm developing a game for iPhone in OpenGL ES 1.1. I have a lot of textured quads in a data structure where each node has a list of children nodes, so I traverse the structure from the root and render each quad, then its children, and so on.
The thing is, for each quad I'm calling glVertexPointer to set the vertices.
Should I avoid calling it for each quad? Would calling it just once improve performance, for example?
Does glVertexPointer copy the vertices to GPU memory, or does it just save the pointer?
Trying to minimize the number of calls will not be easy since each node may have a different quad. I have a lot of equal sprites with the same vertex data, but I'm not necessarily rendering one after another since I may be drawing a different sprite between them.
Thanks.
glVertexPointer keeps just the pointer, but incurs a state change in the OpenGL driver and an explicit synchronisation, so costs quite a lot. Normally when you say 'here's my data, please draw', the GPU starts drawing and continues to do so in parallel to whatever is going on on the CPU for as long as it can. When you change rendering state, it needs to finish whatever it was doing in the old state. So by changing once per quad, you're effectively forcing what could be concurrent processing to be consecutive. Hence, avoiding glVertexPointer (and, presumably, a glDrawArrays or glDrawElements?) per quad should give you a significant benefit.
An immediate optimisation is simply to keep a count of the number of quads in total in the data structure, allocate a single target buffer for vertices that is at least that size and have all quads copy their geometry into the target buffer rather than calling glVertexPointer each time. Then call glVertexPointer and your drawing calls (condensed to just one call also, hopefully) with the one big array at the end. It's a bit more costly on the CPU side but the parallelism and lack of repeated GPU/CPU synchronisations should save you a lot.
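A rough C sketch of that idea (Node, firstNode, nextNode, quadVertices, and quadCount are placeholders for your own tree and quad layout; texture coordinates are omitted to keep it short):

// Gather every quad (two triangles = 6 vertices, x/y only) into one array, then draw once.
GLfloat *vertices = malloc(quadCount * 6 * 2 * sizeof(GLfloat));
size_t offset = 0;
for (Node *node = firstNode(root); node != NULL; node = nextNode(node)) {
    memcpy(vertices + offset, node->quadVertices, 6 * 2 * sizeof(GLfloat));
    offset += 6 * 2;
}
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(2, GL_FLOAT, 0, vertices);
glDrawArrays(GL_TRIANGLES, 0, (GLsizei)(quadCount * 6));
free(vertices);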
While tiptoeing around topics currently under NDA, I strongly suggest you look at the Xcode 4 beta. Amongst other features Apple have stated publicly to be present is an OpenGL ES profiler. So you can easily compare approaches.
To copy data to the GPU, you need to use a vertex buffer object. That means creating a buffer with glGenBuffers, pushing data to it with glBufferData and then posting a glVertexPointer with an address of e.g. 0 if the first byte in the data you uploaded is the first byte of your vertices. In ES 1.x, you can upload data as GL_DYNAMIC_DRAW to flag that you intend to update it quite often and draw from it quite often. It's probably worth doing if you can get into a position where you're drawing more often than you're uploading.
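A sketch of that VBO setup (identifiers like vertexBytes and vertexCount are illustrative):

GLuint vbo;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, vertexBytes, vertices, GL_DYNAMIC_DRAW);  // updated often, drawn often
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(2, GL_FLOAT, 0, (const GLvoid *)0);                     // 0 = first byte of the bound buffer
glDrawArrays(GL_TRIANGLES, 0, vertexCount);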
If you ever switch to ES 2.x there's also GL_STREAM_DRAW, which may be worth investigating but isn't directly relevant to your question. I mention it as it'll likely come up if you Google for vertex buffer objects, being available on desktop OpenGL. Options for ES 1.x are only GL_STATIC_DRAW and GL_DYNAMIC_DRAW.
I've just recently worked on an iPad ES 1.x application with objects that change every frame but are drawn twice per the rendering pipeline in use. There are only five such objects on screen, each 40 vertices, but switching from the initial implementation to the VBO implementation cut 20% off my total processing time.