Simple Painting:
Touching the iPhone screen paints temporary graphics onto a top UIView. When a touch ends, these temporary graphics are handed off to an underlying UIView for storage and then erased.
The process is simple:
1) Touch starts and moves >>
2) Paint temporary graphics on the top UIView >>
3) Touch ends >>
4) Pass the temporary graphics to the underlying UIView >>
5) Underlying UIView adds the temporary graphics to its stored graphics >>
6) Underlying UIView redraws all stored graphics >>
7) Delete the temporary graphics on the top UIView.
In this manner, I can accumulate graphics on the underlying UIView while maintaining responsive painting of the temporary graphics on the top UIView.
(Sidenote: each "drawing" is simply an NSArray of custom "Point" objects, which are just NSObject containers for CGPoints. The underlying UIView keeps a separate NSArray in which it stores these NSArrays of Points.)
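To make the hand-off concrete, here is a minimal sketch of steps 3 through 7, assuming both views are reachable from the responder handling the touch; the view and method names are hypothetical:

    - (void)touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event {
        // Steps 4-5: pass the temporary points to the underlying view,
        // which appends them to its stored NSArray of drawings.
        [storedView addDrawing:[topView currentDrawing]];
        // Step 6: ask the underlying view to redraw everything it has stored.
        [storedView setNeedsDisplay];
        // Step 7: erase the temporary graphics from the top view.
        [topView clearTemporaryGraphics];
    }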
The Problem Is:
When a great deal of graphics has accumulated on the underlying UIView, it takes time to draw it all out on the screen. And any new drawings on the top UIView will not be displayed until the drawing of the underlying stored graphics is complete. Thus, there is a noticeable lag when many graphics are on the screen.
Question:
Can anyone think of a good way to improve performance here, so that there is no noticeable lag between drawings when there are a lot of graphics on the screen?
An NSArray of CGPoints? You mean an NSArray of NSValues holding CGPoints? That's an incredibly time-expensive way to hold what has to be a huge number of values that you access constantly. You could store this information in many better ways. A 2-dimensional C-array representing the entire screen is the most obvious. You may also want to look into bitmap image representations, and draw directly into a CGImage rather than maintaining a bunch of CGPoints. Take a look at the Quartz 2D Programming Guide.
EDIT:
Your object (below) is the equivalent of an NSValue, just a little more specialized. There's a lot of overhead going on here when you have many, many objects (~100,000 I'm guessing when the screen is nearly full; more if you're not removing duplicates; run Instruments to profile it). Old-style C data structures are likely to be much faster for this, because you can avoid all the retains/releases, allocations, etc. There are other options, though. Duplicate point checking would be much faster with an NSMutableSet if you pixel-align your CGPoints and override -isEqual: (and -hash) on your Point object.
Do make sure you're pixel-aligning your data. Drawing on fractional pixels (and storing them all) could dramatically increase the number of objects involved and the amount of drawing you're doing. Even if you want the anti-aliasing, at least round the coordinates to .5 (or .25 or something). A CGPoint is made up of two CGFloats; you don't need that kind of precision to draw to the screen.
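On the pixel-alignment point, a minimal sketch of snapping points to 0.5-point increments before storing them (the helper name is hypothetical):

    #include <math.h>

    // Snap a touch location to 0.5 increments so near-duplicate fractional
    // points collapse into a single stored value.
    static inline CGPoint AlignToHalfPixel(CGPoint p) {
        return CGPointMake(roundf(p.x * 2.0f) / 2.0f,
                           roundf(p.y * 2.0f) / 2.0f);
    }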
Why not just draw everything into a bitmap CGContextRef (created with CGBitmapContextCreate) so the drawing operations accumulate, and then draw that to the screen in your drawRect:? You will be able to perform arbitrary graphics operations without slowing down as the total number of operations increases.
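A minimal sketch of that approach, assuming a UIView subclass with a bitmap context ivar named offscreen (all names here are hypothetical):

    // Create the accumulation buffer once.
    - (void)setUpBuffer {
        CGColorSpaceRef space = CGColorSpaceCreateDeviceRGB();
        offscreen = CGBitmapContextCreate(NULL,
                                          (size_t)self.bounds.size.width,
                                          (size_t)self.bounds.size.height,
                                          8, 0, space,
                                          kCGImageAlphaPremultipliedLast);
        CGColorSpaceRelease(space);
    }

    // Called from touchesMoved:; strokes accumulate in the buffer, so the
    // drawing cost no longer grows with the number of past operations.
    - (void)strokeFrom:(CGPoint)a to:(CGPoint)b {
        CGContextMoveToPoint(offscreen, a.x, a.y);
        CGContextAddLineToPoint(offscreen, b.x, b.y);
        CGContextStrokePath(offscreen);
        [self setNeedsDisplay];
    }

    - (void)drawRect:(CGRect)rect {
        CGImageRef image = CGBitmapContextCreateImage(offscreen);
        // Quartz's flipped coordinate system may need a transform here.
        CGContextDrawImage(UIGraphicsGetCurrentContext(), self.bounds, image);
        CGImageRelease(image);
    }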
If undo support is necessary, you could keep a copy of the buffer for each change made and invalidate the oldest copies when a memory warning is received. (Or, for an even fancier solution, store the operations as you do now, but keep a cached bitmap copy every dozen user operations or so.)
I'm developing a game for iPhone in OpenGL ES 1.1. I have a lot of textured quads in a data structure where each node has a list of child nodes, so I traverse the structure from the root, rendering each quad, then its children, and so on.
The thing is, for each quad I'm calling glVertexPointer to set the vertices.
Should I avoid calling it for each quad? Would calling it just once improve performance, for example?
Does glVertexPointer copy the vertices to GPU memory, or does it just save the pointer?
Trying to minimize the number of calls will not be easy since each node may have a different quad. I have a lot of equal sprites with the same vertex data, but I'm not necessarily rendering one after another since I may be drawing a different sprite between them.
Thanks.
glVertexPointer keeps just the pointer, but incurs a state change in the OpenGL driver and an explicit synchronisation, so costs quite a lot. Normally when you say 'here's my data, please draw', the GPU starts drawing and continues to do so in parallel to whatever is going on on the CPU for as long as it can. When you change rendering state, it needs to finish whatever it was doing in the old state. So by changing once per quad, you're effectively forcing what could be concurrent processing to be consecutive. Hence, avoiding glVertexPointer (and, presumably, a glDrawArrays or glDrawElements?) per quad should give you a significant benefit.
An immediate optimisation is simply to keep a count of the number of quads in total in the data structure, allocate a single target buffer for vertices that is at least that size and have all quads copy their geometry into the target buffer rather than calling glVertexPointer each time. Then call glVertexPointer and your drawing calls (condensed to just one call also, hopefully) with the one big array at the end. It's a bit more costly on the CPU side but the parallelism and lack of repeated GPU/CPU synchronisations should save you a lot.
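A minimal sketch of that idea in plain C, expanding each quad into two triangles; texture coordinates and the glEnableClientState setup are omitted, and all names are hypothetical:

    #import <OpenGLES/ES1/gl.h>

    enum { kMaxQuads = 512 };
    static GLfloat batchVertices[kMaxQuads * 6 * 2]; // 6 vertices x (x, y) per quad
    static int quadCount = 0;

    // Copy one quad (corners x0,y0 ... x3,y3) into the shared buffer.
    // Caller must keep quadCount below kMaxQuads.
    void BatchQuad(const GLfloat c[8]) {
        GLfloat *v = batchVertices + quadCount * 12;
        v[0]  = c[0]; v[1]  = c[1]; // triangle 1: corners 0, 1, 2
        v[2]  = c[2]; v[3]  = c[3];
        v[4]  = c[4]; v[5]  = c[5];
        v[6]  = c[0]; v[7]  = c[1]; // triangle 2: corners 0, 2, 3
        v[8]  = c[4]; v[9]  = c[5];
        v[10] = c[6]; v[11] = c[7];
        quadCount++;
    }

    // One state change and one draw call for the whole batch.
    void FlushBatch(void) {
        glVertexPointer(2, GL_FLOAT, 0, batchVertices);
        glDrawArrays(GL_TRIANGLES, 0, quadCount * 6);
        quadCount = 0;
    }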
While tiptoeing around topics currently under NDA, I strongly suggest you look at the Xcode 4 beta. Amongst other features Apple have stated publicly to be present is an OpenGL ES profiler. So you can easily compare approaches.
To copy data to the GPU, you need to use a vertex buffer object. That means creating a buffer with glGenBuffers, pushing data to it with glBufferData, and then calling glVertexPointer with a byte offset instead of a pointer (e.g. 0 if the first byte in the data you uploaded is the first byte of your vertices). In ES 1.x, you can upload data as GL_DYNAMIC_DRAW to flag that you intend to update it quite often and draw from it quite often. It's probably worth doing if you can get into a position where you're drawing more often than you're uploading.
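A minimal sketch of that VBO path for ES 1.x, reusing the hypothetical batch buffer from the earlier sketch:

    GLuint vbo;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);

    // Upload (or re-upload, when the geometry changes) the batched vertices.
    glBufferData(GL_ARRAY_BUFFER, quadCount * 12 * sizeof(GLfloat),
                 batchVertices, GL_DYNAMIC_DRAW);

    // With a buffer bound, the "pointer" argument is a byte offset into the VBO.
    glVertexPointer(2, GL_FLOAT, 0, (const GLvoid *)0);
    glDrawArrays(GL_TRIANGLES, 0, quadCount * 6);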
If you ever switch to ES 2.x there's also GL_STREAM_DRAW, which may be worth investigating but isn't directly relevant to your question. I mention it as it'll likely come up if you Google for vertex buffer objects, being available on desktop OpenGL. Options for ES 1.x are only GL_STATIC_DRAW and GL_DYNAMIC_DRAW.
I've just recently worked on an iPad ES 1.x application with objects that change every frame but are drawn twice per the rendering pipeline in use. There are only five such objects on screen, each 40 vertices, but switching from the initial implementation to the VBO implementation cut 20% off my total processing time.
I'm new to game programming, and I have a question. I want to draw a dotted circle on the screen. I can use one big sprite (for example, 256x256 pixels) that contains the whole circle, or I can use many small sprites representing the dots.
I use the cocos2d libs and I'm able to render using a batch. So what is the best way to perform such a task?
In my opinion your best bet (if all the dots are the same) is to have one sprite of the dot, and repeat it in the shape you are looking for.
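For example, a minimal cocos2d sketch, inside a CCLayer subclass, that repeats one dot sprite around a circle in a batch node so all the dots render in a single draw call (the file name "dot.png" and the counts are hypothetical):

    CCSpriteBatchNode *batch = [CCSpriteBatchNode batchNodeWithFile:@"dot.png"];
    [self addChild:batch];

    int dotCount = 24;
    CGFloat radius = 100.0f;
    CGPoint center = ccp(160.0f, 240.0f);
    for (int i = 0; i < dotCount; i++) {
        CGFloat angle = (2.0f * (CGFloat)M_PI * i) / dotCount;
        // Same texture as the batch node, so it can be batched.
        CCSprite *dot = [CCSprite spriteWithFile:@"dot.png"];
        dot.position = ccp(center.x + radius * cosf(angle),
                           center.y + radius * sinf(angle));
        [batch addChild:dot];
    }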
Generally you'll want a single asset for each unique graphic. You can combine those assets into a single sprite sheet and reuse them. This allows for more flexibility as well as speed.
Most of today's graphics hardware is optimized for texture dimensions that are a power of two, and your sprites are likely to have other dimensions. By using small sprites, you minimize the padding needed to fill out those dimensions (and thus the CPU/GPU cycles spent on correcting for it internally). Besides that, the file size will be smaller, since you need less overhead and compression is likely to be more effective.
Go with one large sprite. It's fewer calls into the rendering engine, and adds flexibility to change the look (for example, if you decide to have the circle made of dashed lines rather than dots).
Apps like iDraft and Penultimate perform undo and redo very well, without any delay.
I have tried many approaches. Currently, my test app writes raw pixel data directly to a file after each undo using [NSData writeToFile:atomically:], but I am getting a 0.6-second delay.
Can anyone give me some hints?
I don't know iDraft or Penultimate, but chances are they have a simpler drawing model than yours. When writing a drawing app, you choose between two essential drawing representations: either you track raw pixels, or you track drawing objects like lines, circles and so on. (In other words, you choose between a pixel and a vector representation.)
When you draw using vectors, you don't track the individual pixels. Instead you know there should be a line between points X and Y of a given width, color and other parameters, and when you are to draw such a representation, you call Quartz to stroke the line. In this case the model (the drawing representation) consists of a few numbers, takes little memory, and therefore you can keep many versions of a single drawing in memory, allowing for quick and convenient undo and redo.
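A minimal sketch of such a vector model, where undo is just removing the last stroke and redrawing (the Stroke class and array names are hypothetical, and the methods live in the canvas UIView):

    @interface Stroke : NSObject
    @property (nonatomic, retain) UIBezierPath *path;
    @property (nonatomic, retain) UIColor *color;
    @end

    // In the canvas view, with NSMutableArray ivars `strokes` and `redoStack`:
    - (void)drawRect:(CGRect)rect {
        for (Stroke *s in strokes) { // redrawing is just re-stroking each path
            [s.color setStroke];
            [s.path stroke];
        }
    }

    - (void)undo {
        if ([strokes count] == 0) return;
        [redoStack addObject:[strokes lastObject]];
        [strokes removeLastObject];
        [self setNeedsDisplay];      // undo costs one redraw, no file I/O
    }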
Keep your undo stack in memory. Don't write to disk for every operation. Whether you keep around bitmaps or vectors, your file ops shouldn't be on the critical path for every paint operation you do.
If your data model is full bitmaps, keep just the changed rect for undo/redo.
As previously said, you probably don't need to write the data to disk for every operation. Even in a pixel-based case, unless you are trying to undo a full-screen filter, all you need to keep is the data contained within the bounding rectangle of the brush stroke the user performed.
You can double buffer your drawing, i.e. keep a copy of the image before the draw, draw into the copy, determine the bounding rect of the user operation, copy and retain the appropriate data from the original (with size and location information). On undo you take that copy and paste it over the modified area.
This method extends to redo: on undo, take the area that you are about to overwrite and store it first.
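A minimal sketch of that rect-scoped snapshot, assuming a bitmap accumulation context named offscreen and a known stroke bounding rect (all names hypothetical):

    // Before applying the stroke: copy just the pixels it will touch.
    CGRect dirty = CGRectIntegral(strokeBounds);
    CGImageRef whole = CGBitmapContextCreateImage(offscreen); // copy-on-write, cheap
    CGImageRef patch = CGImageCreateWithImageInRect(whole, dirty);
    CGImageRelease(whole);
    // Push (patch, dirty) onto the undo stack, then apply the stroke.

    // On undo: paste the saved patch back over the modified area.
    // (Watch for Quartz's flipped y-axis when converting rects.)
    CGContextDrawImage(offscreen, dirty, patch);
    CGImageRelease(patch);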
I have a bunch of identical CALayers that I want to reuse. Often, a few of them should disappear, and then get reused in another position within the same superlayer (half a second or so later).
What is the best way (performance-wise) to keep them around while they have disappeared from the screen? setHidden:YES, setOpacity:0, or removeFromSuperlayer? Or something else I am not thinking of?
(There are about 12 identical circle-shaped CALayers with contents from a UIImage, and about 30 CAShapeLayers, each holding just a line segment, though usually in different orientations.)
You should use an NSMutableSet or NSMutableArray to maintain a pool of unused CALayers. The process would be similar to what you do when reusing table cells.
As each CALayer is removed with removeFromSuperlayer, put it into your set, and pull one out of the set when you need one.
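A minimal sketch of such a reuse pool, mirroring table-cell dequeueing (layerPool and the method names are hypothetical):

    NSMutableSet *layerPool; // created once, e.g. [[NSMutableSet alloc] init]

    - (void)recycleLayer:(CALayer *)layer {
        [layer removeFromSuperlayer];
        [layerPool addObject:layer]; // the set retains it while off screen
    }

    - (CALayer *)dequeueLayer {
        CALayer *layer = [[[layerPool anyObject] retain] autorelease];
        if (layer) {
            [layerPool removeObject:layer];
            return layer;
        }
        return [CALayer layer];      // pool empty: create a fresh one
    }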
All three approaches you mentioned seem reasonable to try. You really should test each one and see which gives your application the best performance; the results might surprise you.
Alright, I have a UIView that displays a rather wide CGPath. It can be scrolled horizontally within a UIScrollView. Right now the view uses a CATiledLayer, because it's more than 1024 pixels wide.
So my question is: is this efficient?
- (void)drawRect:(CGRect)rect {
    CGContextRef g = UIGraphicsGetCurrentContext();
    CGContextAddPath(g, path);          // adds the entire path, every tile
    CGContextSetStrokeColor(g, color);  // `color` is a CGFloat component array
    CGContextDrawPath(g, kCGPathStroke);
}
Essentially, I'm drawing the whole path every time a tile in the layer is drawn. Does anybody know if this is a bad idea, or is CGContext relatively smart about only drawing the parts of the path that are within the clipping rect?
The path is mostly set up in such a way that I could break it up into blocks that are similar in size and shape to the tiles, but it would require more work on my part, would require some redundancy amongst the paths (for shapes that cross tile boundaries), and would also take some calculating to find which path or paths to draw.
Is it worth it to move in this direction, or is CGPath already drawing relatively quickly?
By necessity Quartz must clip everything it draws against the dimensions of the CGContext it's drawing into. But it will still save CPU if you only send it geometry that is visible within that tile. If you do this by preparing multiple paths, one per tile, you're talking about doing that clipping once (when you create your multiple paths) versus Quartz doing it every time you draw a tile. The efficiency gain will come down to how complex your path is: if it's a few simple shapes, no big deal; if it's a vector drawn map of the national road network, it could be huge! Of course you have to trade this speedup off against the increased complexity of your code.
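If you do go that route, a minimal sketch of the per-tile lookup, assuming the path has been pre-split into a plain C array tilePaths of CGPathRef, one per tile column of width kTileWidth (all names hypothetical):

    - (void)drawRect:(CGRect)rect {
        CGContextRef g = UIGraphicsGetCurrentContext();
        int first = (int)floorf(CGRectGetMinX(rect) / kTileWidth);
        int last  = (int)floorf((CGRectGetMaxX(rect) - 1.0f) / kTileWidth);
        for (int i = MAX(first, 0); i <= last && i < kTileCount; i++) {
            CGContextAddPath(g, tilePaths[i]); // only geometry near this tile
        }
        CGContextSetStrokeColorWithColor(g, strokeColor);
        CGContextDrawPath(g, kCGPathStroke);
    }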
What you could do is use Instruments to see how much time is being spent in Quartz before you go crazy optimizing stuff.