How to push pixels faster on the iPhone?

I asked before about pixel-pushing, and have now managed to get far enough to get noise to show up on the screen. Here's how I init:
// bitmap (unsigned char *) and ir (CGImageRef, used below) are presumably instance variables
CGDataProviderRef provider;
bitmap = malloc(320*480*4);
provider = CGDataProviderCreateWithData(NULL, bitmap, 320*480*4, NULL);
CGColorSpaceRef colorSpaceRef;
colorSpaceRef = CGColorSpaceCreateDeviceRGB();
ir = CGImageCreate(
    320,                        // width
    480,                        // height
    8,                          // bits per component
    32,                         // bits per pixel
    4 * 320,                    // bytes per row
    colorSpaceRef,
    kCGImageAlphaNoneSkipLast,
    provider,
    NULL,                       // decode array
    NO,                         // should interpolate
    kCGRenderingIntentDefault
);
Here's how I render each frame:
for (int i=0; i<320*480*4; i++) {
bitmap[i] = rand()%256;
}
CGRect rect = CGRectMake(0, 0, 320, 480);
CGContextDrawImage(context, rect, ir);   // context: presumably the CGContextRef from drawRect:
The problem is that this is awfully slow, around 5 fps. I think my path for publishing the buffer must be wrong. Is it even possible to do full-screen pixel-based graphics that I could update at 30 fps, without using the 3D chip?

The slowness is almost certainly in the noise generation. If you run this in Instruments you'll probably see that a ton of time is spent sitting in your loop.
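If it is the noise loop, one hedged option is to write 32 bits of noise at a time with a cheap inline PRNG instead of calling rand() % 256 once per byte. A minimal sketch (fillNoise, xorshift32 and noiseState are made-up names):
#include <stdint.h>
#include <stddef.h>

static uint32_t noiseState = 0x12345678;              // any non-zero seed works

static inline uint32_t xorshift32(void) {             // Marsaglia xorshift: a few shifts and xors
    noiseState ^= noiseState << 13;
    noiseState ^= noiseState >> 17;
    noiseState ^= noiseState << 5;
    return noiseState;
}

static void fillNoise(uint32_t *pixels, size_t pixelCount) {
    for (size_t i = 0; i < pixelCount; i++) {
        pixels[i] = xorshift32();                      // fills R, G, B and the skipped byte in one store
    }
}

// usage in the render loop: fillNoise((uint32_t *)bitmap, 320 * 480);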
Another smaller issue is your colorspace. If you use the screen's colorspace, you'll avoid a colorspace conversion which is potentially expensive.
If you can use CoreGraphics routines for your drawing, you'd be better served by creating a CGLayer for the drawing context instead of creating a new object each time.
The bytesPerRow value is also important for performance. It should be a multiple of 32, IIRC. There's some code available that shows how to compute it.
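For illustration, a minimal sketch of that kind of computation, assuming the goal is simply to round bytes-per-row up to the alignment mentioned above (the helper name is made up):
#include <stddef.h>

// Round width * bytesPerPixel up to the next multiple of `alignment` (e.g. 32 bytes).
static size_t alignedBytesPerRow(size_t width, size_t bytesPerPixel, size_t alignment) {
    size_t raw = width * bytesPerPixel;
    return ((raw + alignment - 1) / alignment) * alignment;
}

// alignedBytesPerRow(320, 4, 32) == 1280 (already aligned); alignedBytesPerRow(330, 4, 32) == 1344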
And yeah, for raw performance, OpenGL.

I suspect that doing 614,400 (320*480*4) memory writes, generating random numbers, and creating a new object each frame is what's slowing you down.
Have you tried just writing a static bitmap to screen and seeing how fast that is? Have you perhaps tried profiling the code? Do you also need to make a new CGRect each time?
If you just want to give the effect of randomness, there is probably no need to regenerate the entire bitmap each time.

To my knowledge, OpenGL is supposed to be the fastest way to do graphics on the iPhone. This includes 2D as well as 3D. A UIView is backed by a Core Animation layer, which ends up drawing with OpenGL anyway, so why not skip the middleman?

You can avoid the trip through CGContextDrawImage by assigning your CGImageRef to -[CALayer setContents:]; just be sure not to free bitmap while you're still using it.
[[view layer] setContents:(id)ir];
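For what it's worth, a hedged sketch of how that might look per frame, driven by a CADisplayLink (fillBitmap and the ivars are illustrative; recreating the CGImage each frame is one way to make sure Core Animation picks up the new pixels):
- (void)drawFrame:(CADisplayLink *)link {
    fillBitmap(bitmap);                                   // write the new frame into the malloc'd buffer
    CGImageRef frame = CGImageCreate(320, 480, 8, 32, 4 * 320,
                                     colorSpaceRef, kCGImageAlphaNoneSkipLast,
                                     provider, NULL, NO, kCGRenderingIntentDefault);
    [[view layer] setContents:(id)frame];                 // hand the image straight to Core Animation
    CGImageRelease(frame);                                // the layer retains what it needs
    // Note: the provider wraps the same buffer, so frames can tear; double-buffering would avoid that.
}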
Yes, I know this is old; I stumbled upon it from Google.

Related

Memory consumption spikes when resizing, rotating and cropping images on iPhone

I have an "ImageManipulator" class that performs some cropping, resizing and rotating of camera images on the iPhone.
At the moment, everything works as expected but I keep getting a few huge spikes in memory consumption which occasionally cause the app to crash.
I have managed to isolate the problem to a part of the code where I check the image's current orientation property and rotate it to UIImageOrientationUp accordingly. I then get the image from the bitmap context and save it to disk.
This is currently what I am doing:
CGAffineTransform transform = CGAffineTransformIdentity;
// Check for orientation and set transform accordingly...
transform = CGAffineTransformTranslate(transform, self.size.width, 0);
transform = CGAffineTransformScale(transform, -1, 1);
// Create a bitmap context with the image that was passed so we can perform the rotation
CGContextRef ctx = CGBitmapContextCreate(NULL, self.size.width, self.size.height,
CGImageGetBitsPerComponent(self.CGImage), 0,
CGImageGetColorSpace(self.CGImage),
CGImageGetBitmapInfo(self.CGImage));
// Rotate the context
CGContextConcatCTM(ctx, transform);
// Draw the image into the context
CGContextDrawImage(ctx, CGRectMake(0,0,self.size.height,self.size.width), self.CGImage);
// Grab the bitmap context and save it to the disk...
Even after trying to scale the image down to half or even a quarter of its size, I am still seeing the spikes, so I am wondering if there is a different / more efficient way to get the rotation done?
Thanks in advance for the replies.
Rog
If you are saving to JPEG, I guess an alternative approach is to save the image as-is and then set the rotation to whatever you'd like by manipulating the EXIF metadata? See for example this post. Simple but probably effective, even if you have to hold the image payload bytes in memory ;)
Things you can do:
1. Scale down the image even more (which you probably don't want)
2. Remember to release everything as soon as you finish with it
3. Live with it
I would choose options 2 and 3.
Image editing is very resource-intensive, as it loads the entire raw, uncompressed image data into memory for processing. This is inevitable: there is no way to modify an image other than to load the complete raw data into memory. Memory consumption spikes don't really matter unless the app receives a memory warning; in that case, quickly get rid of everything before it crashes. It is rare that you would get a memory warning, though. My app regularly loads a single >10 MB file into memory and I don't get a warning, even on older devices. So you'll be fine with the spikes.
Have you tried checking for memory leaks and analyzing allocations?
If the image is still too big, try rotating the image in pieces instead of as a whole.
As Anomie mentioned, CGBitmapContextCreate creates a context. We should release it with:
CGContextRelease(ctx);
Anything else created with a Create or Copy function should also be released. For a CFData, for example:
CFRelease(cfdata);
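Putting the answers together, a rough sketch of the rotation code from the question with the Create/Release pairs balanced (as in the question, self is assumed to be a UIImage):
CGContextRef ctx = CGBitmapContextCreate(NULL, self.size.width, self.size.height,
                                         CGImageGetBitsPerComponent(self.CGImage), 0,
                                         CGImageGetColorSpace(self.CGImage),
                                         CGImageGetBitmapInfo(self.CGImage));
CGContextConcatCTM(ctx, transform);
CGContextDrawImage(ctx, CGRectMake(0, 0, self.size.height, self.size.width), self.CGImage);
CGImageRef rotated = CGBitmapContextCreateImage(ctx);     // +1: we own this
UIImage *result = [UIImage imageWithCGImage:rotated];     // UIImage keeps what it needs
CGImageRelease(rotated);                                  // balance the CreateImage
CGContextRelease(ctx);                                    // balance CGBitmapContextCreate
// ...then save `result` (e.g. as JPEG) and let it go as soon as possible.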

CVPixelBufferLockBaseAddress why? Capture still image using AVFoundation

I'm writing an iPhone app that creates still images from the camera using AVFoundation.
Reading the programming guide, I found some code that does almost everything I need, so I'm trying to "reverse engineer" it and understand it.
I'm having some difficulty understanding the part that converts a CMSampleBuffer into an image.
So here is what I understood, followed by the code.
The CMSampleBuffer represents a buffer in memory where the image is stored along with additional data. Later I call CMSampleBufferGetImageBuffer() to get back a CVImageBuffer with just the image data.
Now there is a function that I don't understand and whose purpose I can only guess at: CVPixelBufferLockBaseAddress(imageBuffer, 0); I can't tell whether it is a "thread lock" to avoid multiple operations on the buffer, or a lock on the buffer's address to prevent changes during the operation (and why should it change? Another frame? Isn't the data copied to another location?). The rest of the code is clear to me.
I tried searching on Google but still didn't find anything helpful.
Can someone shed some light on this?
-(UIImage*) getUIImageFromBuffer:(CMSampleBufferRef) sampleBuffer{
// Get a CMSampleBuffer's Core Video image buffer for the media data
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
// Lock the base address of the pixel buffer
CVPixelBufferLockBaseAddress(imageBuffer, 0);
void *baseAddress = CVPixelBufferGetBaseAddress(imageBuffer);
// Get the number of bytes per row for the pixel buffer
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
// Get the pixel buffer width and height
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
// Create a device-dependent RGB color space
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
// Create a bitmap graphics context with the sample buffer data
CGContextRef context = CGBitmapContextCreate(baseAddress, width, height, 8,
bytesPerRow, colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
// Create a Quartz image from the pixel data in the bitmap graphics context
CGImageRef quartzImage = CGBitmapContextCreateImage(context);
// Unlock the pixel buffer
CVPixelBufferUnlockBaseAddress(imageBuffer,0);
// Free up the context and color space
CGContextRelease(context);
CGColorSpaceRelease(colorSpace);
// Create an image object from the Quartz image
UIImage *image = [UIImage imageWithCGImage:quartzImage];
// Release the Quartz image
CGImageRelease(quartzImage);
return (image);
}
Thanks,
Andrea
The header file says that CVPixelBufferLockBaseAddress makes the memory "accessible". I'm not sure what that means exactly, but if you don't do it, CVPixelBufferGetBaseAddress fails, so you'd better do it.
EDIT
The short answer is: just do it. As for why, consider that the image may not live in main memory; it may live in a texture on some GPU somewhere (CoreVideo works on the Mac too), or even be in a different format from what you expect, so the pixels you get are actually a copy. Without Lock/Unlock, or some kind of Begin/End pair, the implementation has no way to know when you've finished with the duplicated pixels, so they would effectively be leaked. CVPixelBufferLockBaseAddress simply gives CoreVideo scope information; I wouldn't get too hung up on it.
Yes, they could have simply returned the pixels from CVPixelBufferGetBaseAddress and eliminated CVPixelBufferLockBaseAddress altogether. I don't know why they didn't do that.
I'd like to add a few more hints about this function based on some tests I've made. When you get the base address, you are probably getting the address of some shared memory resource. This becomes clear if you print the base address: you'll see the same addresses repeat as you receive video frames. In my app I take frames at specific intervals and pass the CVImageBufferRef to an NSOperation subclass that converts the buffer into an image and saves it on the phone. I do not lock the pixel buffer until the operation starts converting the CVImageBufferRef; even when pushing higher frame rates, the base address of the pixels and the CVImageBufferRef address are the same both before the creation of the NSOperation and inside it. I just retain the CVImageBufferRef. I was expecting to see mismatched references, and even though I didn't, I guess the best description is that CVPixelBufferLockBaseAddress locks the memory region where the buffer is located, making it inaccessible to other resources so it keeps the same data until you unlock it.
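For reference, a minimal sketch of the pattern described above (a hypothetical structure; the point is only the bracketing: retain the buffer when it arrives, lock only while reading the pixels, then unlock and release):
CFRetain(imageBuffer);                              // keep the CVImageBufferRef alive for the operation

// ...later, inside the NSOperation's main:
CVPixelBufferLockBaseAddress(imageBuffer, 0);       // make the pixels addressable
void *pixels = CVPixelBufferGetBaseAddress(imageBuffer);
size_t rowBytes = CVPixelBufferGetBytesPerRow(imageBuffer);
// ...read/convert `pixels` here (e.g. into a CGImage, as in the question's code)...
CVPixelBufferUnlockBaseAddress(imageBuffer, 0);     // tell CoreVideo we're done with the pixels
CFRelease(imageBuffer);                             // balance the earlier CFRetain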

iPhone: How to use CGContextConcatCTM for saving a transformed image properly?

I am making an iPhone application that loads an image from the camera, and then the user can select a second image from the library, move/scale/rotate that second image, and then save the result. I use two UIImageViews in IB as placeholders, and then apply transformations while touching/pinching.
The problem comes when I have to save both images together. I use a rect of the size of the first image and pass it to UIGraphicsBeginImageContext. Then I tried to use CGContextConcatCTM but I can't understand how it works:
CGRect rect = CGRectMake(0, 0, img1.size.width, img1.size.height); // img1 from camera
UIGraphicsBeginImageContext(rect.size); // Start drawing
CGContextRef ctx = UIGraphicsGetCurrentContext();
CGContextClearRect(ctx, rect); // Clear whole thing
[img1 drawAtPoint:CGPointZero]; // Draw background image at 0,0
CGContextConcatCTM(ctx, img2.transform); // Apply the transformations of the 2nd image
But what do I need to do next? What information is held in the img2.transform matrix? The documentation for CGContextConcatCTM doesn't help me that much, unfortunately.
Right now I'm trying to solve it by calculating the points and the angle using trigonometry (with the help of this answer), but since the transformation is already there, there has to be an easier and more elegant way to do this, right?
Take a look at this excellent answer: you need to create a bitmap/image context, draw into it, and get the resulting image out. You can then save that. See: iOS UIImagePickerController result image orientation after upload
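A minimal sketch of that approach using the names from the question (img2Transform standing in for the transform accumulated on the second image view). Note that a UIView's transform is applied around the view's center, while CGContextConcatCTM works around the context origin, so an extra translation (not shown) is usually needed to make the placement match exactly:
UIGraphicsBeginImageContext(img1.size);                 // bitmap context the size of the background
CGContextRef ctx = UIGraphicsGetCurrentContext();
[img1 drawAtPoint:CGPointZero];                         // background image at 0,0
CGContextConcatCTM(ctx, img2Transform);                 // apply the second image's transform to the CTM
[img2 drawAtPoint:CGPointZero];                         // this draw is now subject to the transform
UIImage *combined = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
// combined can then be saved, e.g. with UIImageJPEGRepresentation(combined, 0.9)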

iPhone: Transforming an image using Quartz 2D

I am trying to apply some transformations to images using a CGContextRef. I am using the CGContextTranslateCTM, CGContextScaleCTM and CGContextRotateCTM functions, but to keep things simple let's focus on just the first. I was wondering why the following code produces exactly the original image. Am I missing something?
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef g = CGBitmapContextCreate((void*) pixelData,
width,
height,
RGBA_8_BIT,
bytesPerRow,
colorSpace,
kCGImageAlphaPremultipliedLast);
CGContextSetShouldAntialias(g, YES);
CGContextSetInterpolationQuality(g, kCGInterpolationHigh);
CGContextTranslateCTM( g,translateX, translateY );
CGImageRef tempImg = CGBitmapContextCreateImage (g);
CGContextDrawImage( g, CGRectMake (0, 0, width, height), tempImg );
CGContextRelease(g);
CGColorSpaceRelease( colorSpace );
Also, after translating, how do I draw another image over this one, but with partial transparency (e.g. alpha = 0.5)?
I searched a lot but didn't find an answer; any help is appreciated... :)
Please note that I am creating the context from pixelData, and that tempImg is created after the translation. There is nothing wrong with the initialization, as the original image is currently being produced correctly; the issue is with the translation, I suppose.
Transformations to the graphics state only affect subsequent drawing operations - they don't change the existing image data. If you want to apply transforms to an image, try something like this:
1. Create an empty CGContext (on iPhone, use UIGraphicsBeginImageContext).
2. Translate, scale, or rotate its graphics state.
3. Draw the existing image into it.
4. Get the image from the new CGContext (on iPhone, use UIGraphicsGetImageFromCurrentImageContext).
When you perform step 3, the existing image is drawn into your new graphics context with the transformations applied. The trick here is that in order to apply the transformations, we have to actually draw something.
You can do some really cool things with transformations this way. You can draw half your image, apply some transforms, and draw some more.
As noted in other answers, transformations only apply to subsequent drawing operations; they don't affect the pixel buffer you started with.
So you need a drawing operation. The solution is to create a CGImage; drawing that image is a drawing operation, so it will be subject to the current transformation matrix (CTM).
Step-by-step:
1. Create the context with empty pixel data. (If you pass NULL for the buffers, Quartz should create them for you. That works on the Mac, anyway.)
2. Create the image with the pixel data you want to draw transformed.
3. Transform the CTM in the context.
4. Draw the image.
You have to call CGBitmapContextCreateImage() after you draw the image.
Then you can draw another image on top of the first one and call CGBitmapContextCreateImage() again to get the second image. You can set the alpha using CGContextSetAlpha(ctx, alphaValue);
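Putting those steps together, a rough sketch that stays with CGBitmapContextCreate as in the question (originalImg and overlayImg are assumed CGImageRefs built from your pixel data; error checking omitted):
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef g = CGBitmapContextCreate(NULL, width, height, 8, 0,       // NULL/0: let Quartz allocate
                                       colorSpace, kCGImageAlphaPremultipliedLast);
CGContextTranslateCTM(g, translateX, translateY);                       // affects the draws below only
CGContextDrawImage(g, CGRectMake(0, 0, width, height), originalImg);    // drawn translated
CGContextSetAlpha(g, 0.5);                                              // subsequent draws at 50% opacity
CGContextDrawImage(g, CGRectMake(0, 0, width, height), overlayImg);     // composited over the first image
CGImageRef resultImg = CGBitmapContextCreateImage(g);                   // read the result back only now
CGContextRelease(g);
CGColorSpaceRelease(colorSpace);
// remember to CGImageRelease(resultImg) when you're done with it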

Why is this OpenGL ES code slow on iPhone?

I've slightly modified the iPhone SDK's GLSprite example while learning OpenGL ES, and it turns out to be quite slow. It's slow even in the simulator (and worse on the hardware), so I must be doing something wrong, since it's only 400 textured triangles.
const GLfloat spriteVertices[] = {
0.0f, 0.0f,
100.0f, 0.0f,
0.0f, 100.0f,
100.0f, 100.0f
};
const GLshort spriteTexcoords[] = {
0,0,
1,0,
0,1,
1,1
};
- (void)setupView {
glViewport(0, 0, backingWidth, backingHeight);
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrthof(0.0f, backingWidth, backingHeight,0.0f, -10.0f, 10.0f);
glMatrixMode(GL_MODELVIEW);
glClearColor(0.3f, 0.0f, 0.0f, 1.0f);
glVertexPointer(2, GL_FLOAT, 0, spriteVertices);
glEnableClientState(GL_VERTEX_ARRAY);
glTexCoordPointer(2, GL_SHORT, 0, spriteTexcoords);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
// sprite data is preloaded. 512x512 rgba8888
glGenTextures(1, &spriteTexture);
glBindTexture(GL_TEXTURE_2D, spriteTexture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, spriteData);
free(spriteData);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glEnable(GL_TEXTURE_2D);
glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
glEnable(GL_BLEND);
}
- (void)drawView {
..
glClear(GL_COLOR_BUFFER_BIT);
glLoadIdentity();
glTranslatef(tx-100, ty-100,10);
for (int i=0; i<200; i++) {
glTranslatef(1, 1, 0);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
}
..
}
drawView is called every time the screen is touched or the finger on the screen is moved and tx,ty are set to the x,y coordinates where that touch happened.
I've also tried using a GL buffer, where the translations were pre-generated and there was only one glDrawArrays call, but it gave the same performance (~4 FPS).
===EDIT===
Meanwhile I've modified this so that much smaller quads are used (34x20) and there is much less overlap. There are ~400 quads (800 triangles) spread across the whole screen. The texture is a 512x512 RGBA_8888 atlas, and the texture coordinates are floats.
The code is very ugly in terms of API efficiency: there are two MatrixMode changes, along with two loads and two translations, then a drawArrays call for a triangle strip (quad).
Now this produces ~45 FPS.
(I know this is very late, but I couldn't resist. I'll post anyway, in case other people come here looking for advice.)
This has nothing to do with the texture size. I don't know why people rated up Nils. He seems to have a fundamental misunderstanding of the OpenGL pipeline. He seems to think that for a given triangle, the entire texture is loaded and mapped onto that triangle. The opposite is true.
Once the triangle has been mapped into the viewport, it is rasterized. For every on-screen pixel your triangle covers, the fragment shader is called. The default fragment shader (OpenGL ES 1.1, which you are using) will look up the texel that most closely maps to the pixel you are drawing (GL_NEAREST). It might look up 4 texels, since you are using the higher-quality GL_LINEAR method, which averages the closest texels. Still, if the pixel count in your triangle is, say, 100, then the most texture bytes you will have to read is 4 (lookups) * 100 (pixels) * 4 (bytes per color) = 1,600 bytes. Far, far less than what Nils was saying. It's amazing that he can make it sound like he actually knows what he's talking about.
WRT the tiled architecture, this is common in embedded OpenGL devices to preserve locality of reference. I believe that each tile gets exposed to each drawing operation, quickly culling most of them. Then the tile decides what to draw on itself. This is going to be much slower when you have blending turned on, as you do. Because you are using large triangles that might overlap and blend with other tiles, the GPU has to do a lot of extra work. If, instead of rendering the example square with alpha edges, you were to render an actual shape (instead of a square picture of the shape), then you could turn off blending for this part of the scene and I bet that would speed things up tremendously.
If you want to try it, just turn off blending and see how much things speed up, even if they don't look right: glDisable(GL_BLEND);
Your texture is 512*512 at 4 bytes per pixel. That's a megabyte of data. If you render it 200 times per frame, you generate a bandwidth load of 200 megabytes per frame.
At roughly 4 fps, you consume 800 MB/second just for texture reads alone. Frame- and Z-buffer writes need bandwidth as well. Then there is the CPU, and don't underestimate the bandwidth requirements of the display either.
RAM on embedded systems (e.g. your iPhone) is not as fast as on a desktop PC. What you see here is a bandwidth-starvation effect. The RAM simply can't handle the data fast enough.
How to cure this problem:
Pick a sane texture size. On average you should have 1 texel per pixel. This gives crisp-looking textures. I know it's not always possible; use common sense.
Use mipmaps. They take up 33% extra space but allow the graphics chip to use a lower-resolution mipmap where possible.
Try smaller texture formats. Maybe you can use the ARGB4444 format; this would double the rendering speed. Also take a look at the compressed texture formats. Decompression does not cause a performance drop because it's done in hardware. In fact the opposite is true: due to the smaller size in memory, the graphics chip can read the texture data faster. (A sketch of these last two changes follows below.)
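As a hedged illustration of those last two points, the texture setup from the question might change roughly like this (spriteData4444 is an assumed buffer already repacked into 16-bit 4444 pixels; the conversion itself is not shown):
glBindTexture(GL_TEXTURE_2D, spriteTexture);
glTexParameteri(GL_TEXTURE_2D, GL_GENERATE_MIPMAP, GL_TRUE);            // ES 1.1: build mipmaps automatically
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_NEAREST);
// RGBA4444 halves the texture's memory footprint compared to RGBA8888.
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0,
             GL_RGBA, GL_UNSIGNED_SHORT_4_4_4_4, spriteData4444);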
I guess my first try was just a bad (or very good) test.
The iPhone has a PowerVR MBX Lite, which is a tile-based graphics processor. It subdivides the screen into smaller tiles and renders them in parallel. In the first case above, the subdivision probably got a bit overwhelmed because of the very heavy overlapping. Moreover, the quads couldn't be clipped because they were at the same depth, so all the texture coordinates had to be calculated (this could easily be tested by changing the translation in the loop).
Also, because of the overlapping, the parallelism couldn't be exploited: some tiles were sitting doing nothing while the rest (about a third) were doing all the work.
So I think that while memory bandwidth could be a bottleneck, it wasn't the case in this example. The problem is more about how the graphics hardware works and the setup of the test.
I'm not familiar with the iPhone, but if it doesn't have dedicated hardware for handling floating point numbers (I suspect it doesn't) then it'd be faster to use integers whenever possible.
I'm currently developing for Android (which uses OpenGL ES as well) and for instance my vertex array is int instead of float. I can't say how much of a difference it makes, but I guess it's worth a try.
Apple is very tight-lipped about the specific hardware specs of the iPhone, which seems very strange to those of us coming from a console background. But people have been able to determine that the CPU is a 32-bit RISC ARM1176JZF. The good news is that it has a full floating-point unit, so we can continue writing math and physics code the way we do on most platforms.
http://gamesfromwithin.com/?p=239