resizing a camera UIImage returned by the UIImagePickerController takes a ridiculously long time if you do it the usual way as in this post.
[update: last call for creative ideas here! my next option is to go ask Apple, I guess.]
Yes, it's a lot of pixels, but the graphics hardware on the iPhone is perfectly capable of drawing lots of 1024x1024 textured quads onto the screen in 1/60th of a second, so there really should be a way of resizing a 2048x1536 image down to 640x480 in a lot less than 1.5 seconds.
So why is it so slow? Is the underlying image data the OS returns from the picker somehow not ready to be drawn, so that it has to be swizzled in some fashion that the GPU can't help with?
My best guess is that it needs to be converted from RGBA to ABGR or something like that; can anybody think of a way that it might be possible to convince the system to give me the data quickly, even if it's in the wrong format, and I'll deal with it myself later?
As far as I know, the iPhone doesn't have any dedicated "graphics" memory, so there shouldn't be a question of moving the image data from one place to another.
So, the question: is there some alternative drawing method besides just using CGBitmapContextCreate and CGContextDrawImage that takes more advantage of the GPU?
Something to investigate: if I start with a UIImage of the same size that's not from the image picker, is it just as slow? Apparently not...
Update: Matt Long found that it only takes 30ms to resize the image you get back from the picker in [info objectForKey:#"UIImagePickerControllerEditedImage"], if you've enabled cropping with the manual camera controls. That isn't helpful for the case I care about where I'm using takePicture to take pictures programmatically. I see that that the edited image is kCGImageAlphaPremultipliedFirst but the original image is kCGImageAlphaNoneSkipFirst.
Further update: Jason Crawford suggested CGContextSetInterpolationQuality(context, kCGInterpolationLow), which does in fact cut the time from about 1.5 sec to 1.3 sec, at a cost in image quality--but that's still far from the speed the GPU should be capable of!
Last update before the week runs out: user refulgentis did some profiling which seems to indicate that the 1.5 seconds is spent writing the captured camera image out to disk as a JPEG and then reading it back in. If true, very bizarre.
Seems that you have made several assumptions here that may or may not be true. My experience is different than yours. This method seems to only take 20-30ms on my 3Gs when scaling a photo snapped from the camera to 0.31 of the original size with a call to:
CGImageRef scaled = CreateScaledCGImageFromCGImage([image CGImage], 0.31);
(I get 0.31 by taking the width scale, 640.0/2048.0, by the way)
I've checked to make sure the image is the same size you're working with. Here's my NSLog output:
2009-12-07 16:32:12.941 ImagePickerThing[8709:207] Info: {
UIImagePickerControllerCropRect = NSRect: {{0, 0}, {2048, 1536}};
UIImagePickerControllerEditedImage = <UIImage: 0x16c1e0>;
UIImagePickerControllerMediaType = "public.image";
UIImagePickerControllerOriginalImage = <UIImage: 0x184ca0>;
}
I'm not sure why the difference and I can't answer your question as it relates to the GPU, however I would consider 1.5 seconds and 30ms a very significant difference. Maybe compare the code in that blog post to what you are using?
Best Regards.
Use Shark, profile it, figure out what's taking so long.
I have to work a lot with MediaPlayer.framework and when you get properties for songs on the iPod, the first property request is insanely slow compared to subsequent requests, because in the first property request MobileMediaPlayer packages up a dictionary with all the properties and passes it to my app.
I'd be willing to bet that there is a similar situation occurring here.
EDIT: I was able to do a time profile in Shark of both Matt Long's UIImagePickerControllerEditedImage situation and the generic UIImagePickerControllerOriginalImage situation.
In both cases, a majority of the time is taken up by CGContextDrawImage. In Matt Long's case, the UIImagePickerController takes care of this in between the user capturing the image and the image entering 'edit' mode.
Scaling the percentage of time taken to CGContextDrawImage = 100%, CGContextDelegateDrawImage then takes 100%, then ripc_DrawImage (from libRIP.A.dylib) takes 100%, and then ripc_AcquireImage (which it looks like decompresses the JPEG, and takes up most of its time in _cg_jpeg_idct_islow, vec_ycc_bgrx_convert, decompress_onepass, sep_upsample) takes 93% of the time. Only 7% of the time is actually spent in ripc_RenderImage, which I assume is the actual drawing.
I have had the same problem and banged my head against it for a long time. As far as I can tell, the first time you access the UIImage returned by the image picker, it's just slow. As an experiment, try timing any two operations with the UIImage--e.g., your scale-down, and then UIImageJPEGRepresentation or something. Then switch the order. When I've done this in the past, the first operation gets a time penalty. My best hypothesis is that the memory is still on the CCD somehow, and transferring it into main memory to do anything with it is slow.
When you set allowsImageEditing=YES, the image you get back is resized and cropped down to about 320x320. That makes it faster, but it's probably not what you want.
The best speedup I've found is:
CGContextSetInterpolationQuality(context, kCGInterpolationLow)
on the context you get back from CGBitmapContextCreate, before you do CGContextDrawImage.
The problem is that your scaled-down images might not look as good. However, if you're scaling down by an integer factor--e.g., 1600x1200 to 800x600--then it looks OK.
Here's a git project that I've used and it seems to work well. The usage is pretty clean as well - one line of code.
https://github.com/AliSoftware/UIImage-Resize
DO NOT USE CGBitmapImageContextCreate in this case! I spent almost a week in the same situation you are in. Performance will be absolutely terrible and it will eat up memory like crazy. Use UIGraphicsBeginImageContext instead:
// create a new CGImage of the desired size
UIGraphicsBeginImageContext(desiredImageSize);
CGContextRef c = UIGraphicsGetCurrentContext();
// clear the new image
CGContextClearRect(c, CGRectMake(0,0,desiredImageSize.width, desiredImageSize.height));
// draw in the image
CGContextDrawImage(c, rect, [image CGImage]);
// return the result to our parent controller
UIImage * result = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
In the above example (from my own image resize code), "rect" is significantly smaller than the image. The code above runs very fast, and should do exactly what you need.
I'm not entirely sure why UIGraphicsBeginImageContext is so much faster, but I believe it has something to do with memory allocation. I've noticed that this approach requires significantly less memory, implying that the OS has already allocated space for an image context somewhere.
Related
I have a roughly 1000x1000 image, which will be mutated at 60fps, each time changing color of only dozens of pixels.
Then, I need to display this image in Flutter. Of course I can use Image.memory(the_uint8list_data), but it will be extremely costly since it will create 60 new big images in one second. I have also checked ui.Image, but it is opaque and I do not see any handles to change pixel data. I originally used Path-based solutions, but that was also too slow because the mutation can happen for a long time and the Path will be very huge.
The question is, how can I display this image with high performance?
I have an "ImageManipulator" class that performs some cropping, resizing and rotating of camera images on the iPhone.
At the moment, everything works as expected but I keep getting a few huge spikes in memory consumption which occasionally cause the app to crash.
I have managed to isolate the problem to a part of the code where I check for the current image orientation property and rotate it accordingly to UIImageOrientationUp. I then get the image from the bitmap context and save it to disk.
This is currently what I am doing:
CGAffineTransform transform = CGAffineTransformIdentity;
// Check for orientation and set transform accordingly...
transform = CGAffineTransformTranslate(transform, self.size.width, 0);
transform = CGAffineTransformScale(transform, -1, 1);
// Create a bitmap context with the image that was passed so we can perform the rotation
CGContextRef ctx = CGBitmapContextCreate(NULL, self.size.width, self.size.height,
CGImageGetBitsPerComponent(self.CGImage), 0,
CGImageGetColorSpace(self.CGImage),
CGImageGetBitmapInfo(self.CGImage));
// Rotate the context
CGContextConcatCTM(ctx, transform);
// Draw the image into the context
CGContextDrawImage(ctx, CGRectMake(0,0,self.size.height,self.size.width), self.CGImage);
// Grab the bitmap context and save it to the disk...
Even after trying to scale the image down to half or even 1/4 of the size, I am still seeing the spikes to I am wondering if there is a different / more efficient way to get the rotation done as above?
Thanks in advance for the replies.
Rog
If you are saving to JPEG, I guess an alternative approach is to save the image as-is and then set the rotation to whatever you'd like by manipulating the EXIF metadata? See for example this post. Simple but probably effective, even if you have to hold the image payload bytes in memory ;)
Things you can do:
Scale down the image even more (which you probably don't want)
Remember to release everything as soon as you finish with it
Live with it
I would choose option 2 and 3.
Image editing is very resource intensive, as it loads the entire raw uncompressed image data into the memory for processing. This is inevitable as there is absolutely no other way to modify an image other than to load the complete raw data into the memory. Having memory consumption spikes doesn't really matter unless the app receives a memory warning, in that case quickly get rid of everything before it crashes. It is very rare that you would get a memory warning, though, because my app regularly loads a single > 10 mb file into the memory and I don't get a warning, even on older devices. So you'll be fine with the spikes.
Have you tried checking for memory leaks and analyzing allocations?
If the image is still too big, try rotating the image in pieces instead of as a whole.
As Anomie mentioned, CGBitmapContextCreate creates a context. We should release that by using
CGContextRelease(ctx);
If you have any other objects created using create or copy, that should also be released. If it is CFData, then
CFRelease(cfdata);
I'm currently working on an iphone app that lets the user take a picture with the camera and then process it using Quartz 2D.
With Quartz 2D I transform the context to make the picture appear with the right orientation (scale and translate because it's mirrored) and then I stack a bunch of layers whith blending modes to process the picture.
The initial (and the final result) picture is 3mp or 5mp depending on device and it takes a great amount of memory once drawn. Reminder : it's not a jpeg in memory, it's bitmap data.
My layers are the same size as the initial picture so every time i draw a new layer on top of my picture i need the current picture state in memory (A) + the layer to blend memory space (B) + the space in memory to write the result (C).
When i get the result i ditch "A" and "B", takes "C" to the next stage of processing where it become the new "A"...
I need 4 pass like this to obtain the finale picture.
Giving the resolution of these pictures my memory usage can climb high.
I can see a peek at 14Mo-15Mo and most of the time i only get level 1 warnings but level 2s sometimes wave at me and kill my app.
Am i doing this the right way regarding general process ?
Is there a way to speed up the processing ?
Why oh why memory warnings spawn randomly ?
Why the second picture process is longer than the first as you can see in this pic:
Because the duration looks to be about twice as long; I'd say it was doing twice as much processing. Does the third photo taken take three times as long?
If so, that would seem to indicate it's processing all previous photos / layers taken. Which - of course - is a bug in your code somewhere.
I'm having some performance issues calling CGContextDrawLayerAtPoint in iOS 4 that didn't seem to exist in previous version of the OS.
I'm copying a layer obtained from a bitmap context created with CGBitmapContextCreate to my view's context during a drawRect call. The view and the bitmap are the same size.
The bitmap was created with:
CGBitmapContextCreate(NULL, width, height, 8, width * 4, genericRGBSpace, kCGBitmapByteOrder32Host | kCGImageAlphaNoneSkipFirst);
Instruments in indicating I'm spending more time in CGContextDrawLayerAtPoint than I was on devices running OS 3.2. In fact the stack trace indicates the following stack trace taking a higher percentage of time:
argb32_sample_argb32
argb32_image_mark
argb32_image
ripl_Mark
ripl_BltImage
RIPLayerBltImage
ripc_RenderImage
ripc_DrawLayer
CGContextDelegateDrawlayer
CGContextDrawLayerAtPoint
whereas the same code running under 3.2 shows
argb32_image_mark_rgb32
argb32_image
ripl_Mark
ripl_BltImage
ripc_RenderImage
ripc_DrawLayer
CGContextDelegateDrawLayer
CGContextDrawLayerAtPoint
with a much lower percentage of time. I'm not sure exactly why argb32_sample_argb32 is being called on iOS 4 and what it's doing - sampling the existing data into a new buffer? I'm not sure why this would be necessary. The view and the bitmap are the same size and no scaling has been performed.
Any insight into this would be appreciated.
Well, no one had any insight in to this, so I'll recount our theories and what we ended up doing. The theory is that since iOS 4 now has to deal with different screen resolutions, it's taking a more "generalized" approach to blitting and is a bit more inefficient than it was.
In the end, it doesn't really matter - it's slower and we have to deal with it. We discovered that there were places where we were copying redundantly, so we were able to reduce the amount copying we had to do. This bought us the speed we needed.
I'm working on an iPhone App that relies heavily on OpenGL. Right now it runs a bit slow on the iPhone 3G, but looks snappy on the new 32G iPod Touch. I assume this is hardware related. Anyway, I want to get the iPhone performance to resemble the iPod Touch performance. I believe I'm doing a lot of things sub-optimally in OpenGL and I'd like advice on what improvements will give me the most bang for the buck.
My scene rendering goes something like this:
Repeat 35 times
glPushMatrix
glLoadIdentity
glTranslate
Repeat 7 times
glBindTexture
glVertexPointer
glNormalPointer
glTexCoordPointer
glDrawArrays(GL_TRIANGLES, ...)
glPopMatrix
My Vertex, Normal and Texture Coords are already interleaved.
So, what steps should I take to speed this up? What step would you try first?
My first thought is to eliminate all those glBindTexture() calls by using a Texture Atlas.
What about some more efficient matrix operations? I understand the gl*() versions aren't too efficient.
What about VBOs?
Update
There are 8260 triangles.
Texture sizes are 64x64 pngs. There are 58 different textures.
I have not run instruments.
Update 2
After running the OpenGL ES Instrument on the iPhone 3G I found that my Tiler Utilization is in the 90-100% range, and my Render Utilization is in the 30% range.
Update 3
Texture Atlasing had no noticeable affect on the problem. Utilization ranges are still as noted above.
Update 4
Converting my Vertex and Normal pointers to GL_SHORT seemed to improve FPS, but the Tiler Utilization is still in the 90% range a lot of the time. I'm still using GL_FLOAT for my texture coordinates. I suppose I could knock those down to GL_SHORT and save four more bytes per vertex.
Update 5
Converting my texture coordinates to GL_SHORT yielded another performance increase. I'm now consistently getting >30 FPS. Tiler Utilization is still around 90%, but frequently drops down in the the 70-80% range. The Renderer Utilization is hovering around 50%. I suppose this might have something to do with scaling the texture coordinates from GL_TEXTURE Matrix Mode.
I'm still seeking additional improvements. I'd like to get closer to 40 FPS, as that's what my iPod Touch gets and it's silky smooth there. If anyone is still paying attention, what other low-hanging fruit can I pick?
With a tiler utilization still above 90%, you’re likely still vertex throughput-bound. Your renderer utilization is higher because the GPU is rendering more frames. If your primary focus is improving performance on older devices, then the key is still to cut down on the amount of vertex data needed per triangle. There are two sides to this:
Reducing the amount of data per vertex: Now that all of your vertex attributes are already GL_SHORTs, the next thing to pursue is finding a way to do what you want using fewer attributes or components. For example, if you can live without specular highlights, using DOT3 lighting instead of OpenGL ES fixed-function lighting would replace your 3 shorts (+ 1 short of padding) for normals with 2 shorts for an extra texture coordinate. As an additional bonus, you’d be able to light your models per-pixel.
Reducing the number of vertices needed per triangle: When drawing with indexed triangles, you should make sure that your indices are sorted for maximum reuse. Running your geometry through Imagination Technologies’ PVRTTriStrip tool would probably be your best bet here.
If you only have 58 different 64x64 textures, a texture atlas seems like a good idea, since they'd all fit in a single 512x512 texture... if you don't rely on texture wrap modes, I'd certainly at least try this.
What format are your textures in? You might try using a compressed PVRTC texture; I think that's less load on the Tiler, and I've been pleasantly surprised by the image quality even for 2-bit-per-pixel textures. (Good for natural images, not good if you're doing something that looks like an 8-bit video game)
The first thing I would do is run Instruments profiling on the hardware device that is slow. It should show you pretty quickly where the bottlenecks are for your particular case.
Update after instruments results:
This question has a similar result in Instruments to you, perhaps the advice is also applicable in your case (basically reducing number vertex data)
The biggest win in graphics programming comes down to this:
Batch, Batch, Batch
TextureAtlasing will make a bigger difference than most anything else you can do. Switching textures is like stopping a speeding train to let on new passengers every time.
Combine all those textures into an atlas and cut your draw calls down a lot.
This web-based tool may be helpful: http://zwoptex.zwopple.com/
Have you looked over the "OpenGL ES Programming Guide for iPhone OS" in the dev center? There are sections on Best Practices for Vertex Data and Texture Data.
Is your data formatted to be able to use triangle strips?
In terms of least effort, the modification sequence for you would probably be:
Reducing vertex attribute size
VBOs
Note that when you do these, you need to make sure that components are aligned on their native alignment, i.e. the floats or full ints are on 4-byte boundaries, the shorts are on 2-byte boundaries. If you don't do this it will tank your performance. It might be helpful to mentally map it by typing out your attribute ordering as a struct definition so you can sanity check your layout and alignment.
making sure your data is stripped to share vertices
using a texture atlas to reduce texture swaps
To try converting your textures to 16-bit RGB565 format, see this code in Apple's venerable Texture2D.m, search for kTexture2DPixelFormat_RGB565
http://code.google.com/p/cocos2d-iphone/source/browse/branches/branch-0.1/OpenGLSupport/Texture2D.m
(this code loads PNGs and converts them to RGB565 at texture creation time; I don't know if there's an RGB565 file format as such)
For more information on PVRTC compressed textures (which looked way better than I expected when I used them, even at 2 bits per pixel) see Apple's PVRTextureLoader sample:
http://developer.apple.com/iPhone/library/samplecode/PVRTextureLoader/index.html
it has both the code for loading PVRTC textures in your app and also instructions for using the texturetool to convert your .png files into .pvr files.