iPhone Image Processing with Accelerate Framework and vDSP


UPDATE: Please see the additional question below, with more code.
I am trying to code a category for blurring an image. My starting point is Jeff LaMarche's sample here. Whilst this (after the fixes suggested by others) works fine, it is an order of magnitude too slow for my requirements - on a 3GS it takes maybe 3 seconds to do a decent blur and I'd like to get this down to under 0.5 sec for a full screen (faster is better).
He mentions the Accelerate framework as a performance enhancement, so I've spent the last day looking at this, and in particular at vDSP_f3x3, which according to the Apple documentation:
Filters an image by performing a two-dimensional convolution with a 3x3 kernel; single precision.
Perfect - I have a suitable filter matrix, and I have an image ... but this is where I get stumped.
vDSP_f3x3 assumes the image data is (float *), but my image comes from:
srcData = (unsigned char *)CGBitmapContextGetData (context);
and the context comes from CGBitmapContextCreate with kCGImageAlphaPremultipliedFirst, so my srcData is really ARGB with 8 bits per component.
I suspect what I really need is a context with float components, but according to the Quartz documentation here, kCGBitmapFloatComponents is only available on Mac OS and not iOS :-(
Is there a really fast way using the accelerate framework of converting the integer components I have into the float components that vDSP_f3x3 needs? I mean I could do it myself, but by the time I do that, then the convolution, and then convert back, I suspect I'll have made it even slower than it is now since I might as well convolve as I go.
Maybe I have the wrong approach?
Does anyone who has done some image processing on the iPhone using vDSP have some tips for me? The documentation I can find is very reference orientated and not very newbie friendly when it comes to this sort of thing.
If anyone has a reference for really fast blurring (and high quality, not the reduce resolution and then rescale stuff I've seen and looks pants) that would be fab!
EDIT:
Thanks @Jason. I've done this and it is almost working, but now my problem is that although the image does blur, on every invocation it shifts left 1 pixel. It also seems to make the image black and white, but that could be something else.
Is there anything in this code that leaps out as obviously incorrect? I haven't optimised it yet and it's a bit rough, but hopefully the convolution code is clear enough.
CGImageRef CreateCGImageByBlurringImage(CGImageRef inImage, NSUInteger pixelRadius, NSUInteger gaussFactor)
{
    unsigned char *srcData, *finalData;
    CGContextRef context = CreateARGBBitmapContext(inImage);
    if (context == NULL)
        return NULL;
    size_t width = CGBitmapContextGetWidth(context);
    size_t height = CGBitmapContextGetHeight(context);
    size_t bpr = CGBitmapContextGetBytesPerRow(context);
    int componentsPerPixel = 4; // ARGB
    CGRect rect = {{0,0},{width,height}};
    CGContextDrawImage(context, rect, inImage);
    // Now we can get a pointer to the image data associated with the bitmap
    // context.
    srcData = (unsigned char *)CGBitmapContextGetData (context);
    if (srcData != NULL)
    {
        size_t dataSize = bpr * height;
        finalData = malloc(dataSize);
        memcpy(finalData, srcData, dataSize);
        //Generate Gaussian kernel
        float *kernel;
        // Limit the pixelRadius
        pixelRadius = MIN(MAX(1,pixelRadius), 248);
        int kernelSize = pixelRadius * 2 + 1;
        kernel = malloc(kernelSize * sizeof *kernel);
        int gauss_sum =0;
        for (int i = 0; i < pixelRadius; i++)
        {
            kernel[i] = 1 + (gaussFactor*i);
            kernel[kernelSize - (i + 1)] = 1 + (gaussFactor * i);
            gauss_sum += (kernel[i] + kernel[kernelSize - (i + 1)]);
        }
        kernel[(kernelSize - 1)/2] = 1 + (gaussFactor*pixelRadius);
        gauss_sum += kernel[(kernelSize-1)/2];
        // Scale the kernel
        for (int i=0; i<kernelSize; ++i) {
            kernel[i] = kernel[i]/gauss_sum;
        }
        float * srcAsFloat,* resultAsFloat;
        srcAsFloat = malloc(width*height*sizeof(float)*componentsPerPixel);
        resultAsFloat = malloc(width*height*sizeof(float)*componentsPerPixel);
        // Convert uint source ARGB to floats
        vDSP_vfltu8(srcData,1,srcAsFloat,1,width*height*componentsPerPixel);
        // Convolve (hence the -1) with the kernel
        vDSP_conv(srcAsFloat, 1, &kernel[kernelSize-1],-1, resultAsFloat, 1, width*height*componentsPerPixel, kernelSize);
        // Copy the floats back to ints
        vDSP_vfixu8(resultAsFloat, 1, finalData, 1, width*height*componentsPerPixel);
        free(resultAsFloat);
        free(srcAsFloat);
    }
    size_t bitmapByteCount = bpr * height;
    CGDataProviderRef dataProvider = CGDataProviderCreateWithData(NULL, finalData, bitmapByteCount, &providerRelease);
    CGImageRef cgImage = CGImageCreate(width, height, CGBitmapContextGetBitsPerComponent(context),
                                       CGBitmapContextGetBitsPerPixel(context), CGBitmapContextGetBytesPerRow(context), CGBitmapContextGetColorSpace(context), CGBitmapContextGetBitmapInfo(context),
                                       dataProvider, NULL, true, kCGRenderingIntentDefault);
    CGDataProviderRelease(dataProvider);
    CGContextRelease(context);
    return cgImage;
}
I should add that if I comment out the vDSP_conv line, and change the line following it to:
vDSP_vfixu8(srcAsFloat, 1, finalData, 1, width*height*componentsPerPixel);
Then as expected, my result is a clone of the original source. In colour and not shifted left. This implies to me that it IS the convolution that is going wrong, but I can't see where :-(
THOUGHT: Actually, thinking about this, it seems to me that the convolve needs to know the input pixels are in ARGB format, as otherwise the convolution will be multiplying the values together with no knowledge about their meaning (i.e. it will multiply R * B etc). This would explain why I get a B&W result, I think, but not the shift. Again, I think there might need to be more to it than my naive version here ...
FINAL THOUGHT: I think the shifting left is a natural result of the filter and I need to look at the image dimensions and possibly pad it out ... so I think the code is actually working OK given what I've fed it.

While the Accelerate framework will be faster than simple serial code, you'll probably never see the greatest performance by blurring an image using it.
My suggestion would be to use an OpenGL ES 2.0 shader (for devices that support this API) to do a two-pass box blur. Based on my benchmarks, the GPU can handle these kinds of image manipulation operations at 14-28X the speed of the CPU on an iPhone 4, versus the maybe 4.5X that Apple reports for the Accelerate framework in the best cases.
Some code for this is described in this question, as well as in the "Post-Processing Effects on Mobile Devices" chapter in the GPU Pro 2 book (for which the sample code can be found here). By placing your image in a texture, then reading values in between pixels, bilinear filtering on the GPU gives you some blurring for free, which can then be combined with a few fast lookups and averaging operations.
If you need a starting project to feed images into the GPU for processing, you might be able to use my sample application from the article here. That sample application passes AVFoundation video frames as textures into a processing shader, but you can modify it to send in your particular image data and run your blur operation. You should be able to use my glReadPixels() code to then retrieve the blurred image for later use.
Since I originally wrote this answer, I've created an open source image and video processing framework for doing these kinds of operations on the GPU. The framework has several different blur types within it, all of which can be applied very quickly to images or live video. The GPUImageGaussianBlurFilter, which applies a standard 9-hit Gaussian blur, runs in 16 ms for a 640x480 frame of video on the iPhone 4. The GPUImageFastBlurFilter is a modified 9-hit Gaussian blur that uses hardware filtering, and it runs in 2.0 ms for that same video frame. Likewise, there's a GPUImageBoxBlurFilter that uses a 5-pixel box and runs in 1.9 ms for the same image on the same hardware. I also have median and bilateral blur filters, although they need a little performance tuning.
In my benchmarks, Accelerate doesn't come close to these kinds of speeds, especially when it comes to filtering live video.

You definitely want to convert to float to perform the filtering since that is what the accelerated functions take, plus it is a lot more flexible if you want to do any additional processing. The computation time of a 2-D convolution (filter) will most likely dwarf any time spent in conversion. Take a look at the function vDSP_vfltu8() which will quickly convert the uint8 data to float. vDSP_vfixu8() will convert it back to uint8.
To perform a blur, you are probably going to want a bigger convolution kernel than 3x3 so I would suggest using the function vDSP_imgfir() which will take any kernel size.
Response to edit:
A few things:
You need to perform the filtering on each color channel independently. That is, you need to split the R, G, and B components into their own images (of type float), filter them, then remultiplex them into the ARGB image.
vDSP_conv computes a 1-D convolution, but to blur an image, you really need a 2-D convolution. vDSP_imgfir essentially computes the 2-D convolution. For this you will need a 2-D kernel as well. You can look up the formula for a 2-D Gaussian function to produce the kernel.
Note: You actually can perform a 2-D convolution using 1-D convolutions if your kernel is separable (which a Gaussian is). I won't go into what that means, but you essentially have to perform a 1-D convolution across the columns and then perform a 1-D convolution across the resulting rows. I would not go this route unless you know what you are doing.
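To make the separable route in the note above a little more concrete, here is a minimal plain-C sketch of it, written without Accelerate so it can be read and run anywhere. It is not the vDSP API: the function names are made up for illustration, the kernel is a binomial one (a common discrete approximation to a Gaussian, chosen here only to keep the sketch short), and edge pixels are simply repeated at the border. It works on one float plane, so it would be called once per colour channel.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill k[0..2*radius] with normalized binomial weights (row 2*radius of
   Pascal's triangle divided by its sum). Hypothetical helper. */
void binomial_kernel_1d(float *k, int radius)
{
    assert(radius >= 0);
    int n = 2 * radius;
    double c = 1.0, sum = 0.0;
    for (int i = 0; i <= n; i++) {
        k[i] = (float)c;
        sum += c;
        c = c * (n - i) / (i + 1);   /* C(n, i+1) = C(n, i) * (n - i) / (i + 1) */
    }
    for (int i = 0; i <= n; i++)
        k[i] = (float)(k[i] / sum);
}

/* Clamp an index into [0, n-1] so that edge pixels are repeated. */
int clamp_index(int i, int n)
{
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

/* Blur one float plane in place with a 1-D kernel: rows first, then columns.
   Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int separable_blur(float *plane, int width, int height,
                   const float *k, int radius)
{
    float *tmp = malloc((size_t)width * (size_t)height * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    /* Pass 1: horizontal, plane -> tmp */
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++) {
            float acc = 0.0f;
            for (int i = -radius; i <= radius; i++)
                acc += k[i + radius] * plane[y * width + clamp_index(x + i, width)];
            tmp[y * width + x] = acc;
        }

    /* Pass 2: vertical, tmp -> plane */
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++) {
            float acc = 0.0f;
            for (int i = -radius; i <= radius; i++)
                acc += k[i + radius] * tmp[clamp_index(y + i, height) * width + x];
            plane[y * width + x] = acc;
        }

    free(tmp);
    return 0;
}
```

The point of the two passes is cost: for a kernel that is N taps wide, each pixel needs about 2N multiplies instead of the N*N that a full 2-D kernel requires.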

So answering my own question with Jason's excellent help, the final working code fragment is provided here for reference in case it helps anyone else. As you can see, the strategy is to split the source ARGB (I'm ignoring A for performance and assuming the data is XRGB) into 3 float arrays, apply the filter and then re-multiplex the result.
It works a treat - but it is achingly slow. I'm using a large kernel of 16x16 to get a heavy blur and on my 3GS it takes about 5 seconds for a full screen image so that's not going to be a viable solution.
Next step is to look at alternatives ... but thanks for getting me up and running.
vDSP_vfltu8(srcData+1,4,srcAsFloatR,1,pixels);
vDSP_vfltu8(srcData+2,4,srcAsFloatG,1,pixels);
vDSP_vfltu8(srcData+3,4,srcAsFloatB,1,pixels);
// Now apply the filter to each of the components. For a gaussian blur with a 16x16 kernel
// this turns out to be really slow!
vDSP_imgfir (srcAsFloatR, height, width, kernel,resultAsFloatR, frows, fcols);
vDSP_imgfir (srcAsFloatG, height, width, kernel,resultAsFloatG, frows, fcols);
vDSP_imgfir (srcAsFloatB, height, width, kernel,resultAsFloatB, frows, fcols);
// Now re-multiplex the final image from the processed float data
vDSP_vfixu8(resultAsFloatR, 1, finalData+1, 4, pixels);
vDSP_vfixu8(resultAsFloatG, 1, finalData+2, 4, pixels);
vDSP_vfixu8(resultAsFloatB, 1, finalData+3, 4, pixels);
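For anyone reading this without Accelerate to hand, the strided conversion calls in the fragment above amount to something like the following plain-C loops. This is only a reference sketch with made-up function names, not the vDSP implementation. In particular, clamping to 0..255 and truncating toward zero on the way back are assumptions of this sketch; check the Accelerate documentation for the exact rounding and out-of-range behaviour of the real calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read every `stride`-th byte of src into a contiguous float array.
   With stride 4 and src pointing at byte 1 of an XRGB pixel, this pulls
   out the red plane. Hypothetical helper. */
void bytes_to_floats(const uint8_t *src, size_t stride, float *dst, size_t count)
{
    assert(stride > 0);
    for (size_t i = 0; i < count; i++)
        dst[i] = (float)src[i * stride];
}

/* Write a contiguous float array back as bytes placed `stride` apart.
   Assumption of this sketch: clamp to [0, 255], then truncate. */
void floats_to_bytes(const float *src, uint8_t *dst, size_t stride, size_t count)
{
    assert(stride > 0);
    for (size_t i = 0; i < count; i++) {
        float v = src[i];
        if (v < 0.0f)   v = 0.0f;
        if (v > 255.0f) v = 255.0f;
        dst[i * stride] = (uint8_t)v;
    }
}
```

Calling bytes_to_floats(srcData + 1, 4, plane, pixels) then mirrors the first vDSP_vfltu8 line above, and floats_to_bytes(plane, finalData + 1, 4, pixels) mirrors the matching vDSP_vfixu8 line.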

For future reference if you're considering implementing this DON'T: I've done it for you!
see:
https://github.com/gdawg/uiimage-dsp
for a UIImage Category which adds Gaussian/Box Blur/Sharpen using vDSP and the Accelerate framework.

Why are you using vDSP to do image filtering? Try vImageConvolve_ARGB8888(). vImage is the image processing component of Accelerate.framework.
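The attraction of that route is that it works directly on interleaved 8-bit pixels with an integer kernel and a divisor, so no float conversion or channel splitting is needed. The sketch below is a plain-C illustration of that idea only. It is not vImage code, the function names are invented, and the choices it makes (repeat edge pixels, round to nearest, no kernel flip, which makes no difference for a symmetric blur kernel) are assumptions of the sketch rather than a description of what vImage does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp v into [lo, hi]. */
int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Apply a ksize x ksize integer kernel to an interleaved 4-channel,
   8-bit image, then divide by `divisor`. Hypothetical helper. */
void convolve_8888(const uint8_t *src, uint8_t *dst, int width, int height,
                   size_t row_bytes, const int16_t *kernel, int ksize,
                   int32_t divisor)
{
    assert(ksize % 2 == 1 && divisor > 0);
    int r = ksize / 2;
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            for (int c = 0; c < 4; c++) {
                int32_t sum = 0;
                for (int ky = -r; ky <= r; ky++) {
                    /* Rows above or below the image repeat the edge row */
                    const uint8_t *row =
                        src + (size_t)clamp_int(y + ky, 0, height - 1) * row_bytes;
                    for (int kx = -r; kx <= r; kx++)
                        sum += kernel[(ky + r) * ksize + (kx + r)]
                             * row[(size_t)clamp_int(x + kx, 0, width - 1) * 4 + c];
                }
                /* Round to nearest for non-negative sums, then saturate */
                int32_t v = (sum + divisor / 2) / divisor;
                dst[(size_t)y * row_bytes + (size_t)x * 4 + c] =
                    (uint8_t)clamp_int(v, 0, 255);
            }
}
```

A 3x3 box blur is then a kernel of nine ones with a divisor of 9. Note that src and dst must be different buffers, since each output pixel reads its neighbours.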

Related

What is the best way to smoothen a noisy image filter?

I am currently trying to recreate the watercolor effect of Instagram in Unity.
Instagram: https://i.imgur.com/aMyEhjS.jpg
My approach: https://i.imgur.com/9zIOQ7k.jpg
My approach is rather noisy. This is the main code which creates the effect:
float3 stepColor(float3 col){
const float3 lumvals = float3(0.5,0.7,1.0);
float3 hsv = rgb2hsv(col);
if(hsv.z <= 0.33){
hsv.z = lumvals.x;
}
else if(hsv.z <= 0.55){
hsv.z = lumvals.y;
}
else{
hsv.z = lumvals.z;
}
return hsv2rgb(hsv);
}
Which algorithm would be suitable here to denoise and smoothen the end result as Instagram is achieving it?

Watercolor filters use something called mean shift analysis to average out the image while preserving features. It is an iterative approach where you make clusters of pixels gravitate towards their mean value.
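As a rough illustration of the "gravitate towards the mean" idea, here is a heavily simplified sketch in plain C that smooths a single scanline of grey values. It is an assumption-laden toy, not the algorithm from the links below: it uses a flat kernel, keeps the spatial window fixed on the pixel (full mean shift also moves the window centre), and the function name and parameters are invented for the example.

```c
#include <assert.h>

/* For each sample, repeatedly replace its value with the mean of the
   neighbours (within spatial_radius) whose values lie within `range` of
   the current estimate. Neighbours on the far side of a strong edge are
   ignored, which is what preserves the edge. Hypothetical helper. */
void mean_shift_scanline(const float *in, float *out, int n,
                         int spatial_radius, float range, int max_iter)
{
    assert(n > 0 && spatial_radius >= 0 && range >= 0.0f);
    for (int p = 0; p < n; p++) {
        float v = in[p];
        for (int it = 0; it < max_iter; it++) {
            float sum = 0.0f;
            int count = 0;
            for (int q = p - spatial_radius; q <= p + spatial_radius; q++) {
                if (q < 0 || q >= n)
                    continue;
                float d = in[q] - v;
                if (d < 0.0f)
                    d = -d;
                if (d <= range) {
                    sum += in[q];
                    count++;
                }
            }
            if (count == 0)
                break;
            float next = sum / (float)count;
            float step = next - v;
            if (step < 0.0f)
                step = -step;
            v = next;
            if (step < 1e-4f)   /* converged */
                break;
        }
        out[p] = v;
    }
}
```

On a noisy step such as 10, 12, 11, 9, 100, 102, 98, 101 with a spatial radius of 2 and a range of 20, the low side settles near 11 and the high side near 100, while the jump between them stays sharp.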
Here is a Java code example:
https://imagej.nih.gov/ij/plugins/mean-shift.html
Here is a paper which describes the watercolor effect and its components (including edge darkening):
http://maverick.inria.fr/Publications/2006/BKTS06/watercolor.pdf
There is a GitHub project with CUDA and OpenCL implementations, but if you want to actually understand the algorithm, I'd refer you to this page, which explains it quite neatly using Python code:
http://www.chioka.in/meanshift-algorithm-for-the-rest-of-us-python/
Another option, off the top of my head, is to use a Sobel/Roberts cross filter to detect all the borders in the image, and then use the inverse of this value as a mask for a Gaussian blur. It won't give you the same nice layering effect though.

Detect objects of any shape in an image and color each object individually

I am new to OpenCV and am trying to detect different objects in an image and apply effects to each individual object. I find the edges and use the following code to get the contours, but I don't know how to proceed from there. Any help?
Thanks in advance
cv::Mat edges;
cv::Canny(gray, edges, 50, 150);
std::vector< std::vector<cv::Point> > c;
std::vector<cv::Point> points;
cv::findContours(edges, c, CV_RETR_LIST, CV_CHAIN_APPROX_NONE);
cv::Mat mask = cv::Mat::zeros(edges.rows, edges.cols, CV_8UC1);
for (size_t i=0; i<c.size(); i++)
{
    for (size_t j = 0; j < c[i].size(); j++)
    {
        cv::Point p = c[i][j];
        points.push_back(p);
        // printf(" %d \t",p.x);
    }
}
cv::Mat crop(inputFrame.rows, inputFrame.cols, CV_8UC3);
inputFrame.copyTo(outputFrame, mask);

Since you have chosen to identify the objects through their contours, I suggest that you continue with the "Generalized Hough Transform" (PDF). You will have to create reference contours for the objects that you want to recognize (from every conceivable viewpoint).
Another option that might be interesting to you is to look into segmentation algorithms in order to select certain objects in the image. Without knowing anything about the objects that you are looking for and the images that you are processing, it is impossible to give good recommendations. There is no general-purpose algorithm that works on every image (at least as far as I know).
To give you an idea of the state of the art in object class recognition, you can have a look at the PASCAL VOC Challenge. If your problem is simpler than the challenge (e.g. a small set of immutable objects that stand in front of a single-colored background), you should specify it in your question, and maybe someone can give you better suggestions.

Fragment Shader - Average Luminosity

Does anybody know how to find the average luminosity of a texture in a fragment shader? I have access to both RGB and YUV textures. The Y component in YUV is an array, and I want to get an average number from this array.

I recently had to do this myself for input images and video frames that I had as OpenGL ES textures. I didn't go with generating mipmaps for these due to the fact that I was working with non-power-of-two textures, and you can't generate mipmaps for NPOT textures in OpenGL ES 2.0 on iOS.
Instead, I did a multistage reduction similar to mipmap generation, but with some slight tweaks. Each step down reduced the size of the image by a factor of four in both width and height, rather than the normal factor of two used for mipmaps. I did this by sampling from four texture locations that were in the middle of the four squares of four pixels each that made up a 4x4 area in the higher-level image. This takes advantage of hardware texture interpolation to average the four sets of four pixels, then I just had to average those four pixels to yield a 16X reduction in pixels in a single step.
I converted the image to luminance at the very first stage using a dot product of the RGB values with a vec3 of (0.2125, 0.7154, 0.0721). This allowed me to just read the red channel for each subsequent reduction stage, which really helps on iOS hardware. Note that you don't need this if you are starting with a Y channel luminance texture already, but I was dealing with RGB images.
Once the image had been reduced to a sufficiently small size, I read the pixels from that back onto the CPU and did a last quick iteration over the remaining few to arrive at the final luminosity value.
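The arithmetic of that reduction can be sketched on the CPU in plain C, which may help when checking a shader version against known values. This is only an illustration under stated assumptions, not the GPUImage code: the function names are invented, the image is a float luminance plane, and a stage is only run while both sides are divisible by 4 (on the GPU, texture interpolation handles the averaging instead).

```c
#include <assert.h>

/* Luminance of one RGB pixel using the weights quoted above. */
float luminance(float r, float g, float b)
{
    return 0.2125f * r + 0.7154f * g + 0.0721f * b;
}

/* One reduction stage, in place: each output pixel is the mean of a 4x4
   block. Writing in place is safe because output index oy*(width/4)+ox is
   never larger than the first input index of the block it summarizes.
   Assumes width and height are multiples of 4. */
void reduce_4x4_in_place(float *img, int width, int height)
{
    assert(width % 4 == 0 && height % 4 == 0);
    int ow = width / 4, oh = height / 4;
    for (int oy = 0; oy < oh; oy++)
        for (int ox = 0; ox < ow; ox++) {
            float sum = 0.0f;
            for (int dy = 0; dy < 4; dy++)
                for (int dx = 0; dx < 4; dx++)
                    sum += img[(oy * 4 + dy) * width + (ox * 4 + dx)];
            img[oy * ow + ox] = sum / 16.0f;
        }
}

/* Reduce while both sides divide by 4, then average the few values left,
   mirroring the final read-back step. The buffer is overwritten. */
float average_by_reduction(float *img, int width, int height)
{
    assert(width > 0 && height > 0);
    while (width >= 4 && height >= 4 && width % 4 == 0 && height % 4 == 0) {
        reduce_4x4_in_place(img, width, height);
        width /= 4;
        height /= 4;
    }
    float sum = 0.0f;
    for (int i = 0; i < width * height; i++)
        sum += img[i];
    return sum / (float)(width * height);
}
```

For a 640x480 plane this runs two stages (to 160x120, then 40x30) and leaves 1200 values for the final loop.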
For a 640x480 video frame, this process yields a luminosity value in ~6 ms on an iPhone 4, and I think I can squeeze out a 1-2 ms reduction in that processing time with a little tuning. In my experience, that seems faster than the iOS devices normally generate mipmaps for power-of-two images at around that size, but I don't have solid numbers to back that up.
If you wish to see this in action, check out the code for the GPUImageLuminosity class in my open source GPUImage framework (and the GPUImageAverageColor superclass). The FilterShowcase example demonstrates this luminosity extractor in action.

You generally don't do this just with a shader.
One of the more common methods is to create a buffer texture with full mip-maps (down to 1x1, this is important). When you want to find luminosity, you copy the backbuffer to this buffer, then regenerate mips with a nearest neighbor algorithm. The bottom pixel will then have the average color of the entire surface and can be used to find average lum through something like (c.r * 0.6) + (c.g * 0.3) + (c.b * 0.1) (edit: if you have a YUV, then do similar and use the Y; the trick is just averaging the texture down to a single value, which is what mips do).
This isn't a precise technique, but is reasonably fast, especially on hardware that can generate mipmaps internally.

I'm presenting a solution for the RGB texture here as I'm not sure mip map generation would work with a YUV texture.
The first step is to create mipmaps for the texture, if not already present:
glGenerateMipmap(GL_TEXTURE_2D);
Now we can access the RGB value of the smallest mipmap level from the fragment shader by using the optional third argument of the sampler function texture2D, the "bias":
vec4 color = texture2D(sampler, vec2(0.5, 0.5), 8.0);
This will shift the mipmap level up eight levels, resulting in sampling a far smaller level.
If you have a 256x256 texture and render it with a scale of 1, a bias of 8.0 will effectively reduce the picked mipmap to the smallest 1x1 level (256 / 2^8 == 1). Of course you have to adjust the bias for your conditions to sample the smallest level.
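To adjust the bias for other texture sizes, the number of halvings needed to reach the 1x1 level can be computed on the CPU and passed to the shader (for example as a uniform). A small sketch in plain C, with an invented function name, assuming the texture is rendered at a scale of 1 as in the example above:

```c
/* Number of times the larger side must be halved (rounding down, as mipmap
   levels do) before it reaches 1 pixel. Hypothetical helper. */
int mip_levels_to_1x1(unsigned width, unsigned height)
{
    unsigned side = width > height ? width : height;
    int levels = 0;
    while (side > 1) {
        side /= 2;
        levels++;
    }
    return levels;
}
```

For the 256x256 case this gives 8, matching the bias used above.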
OK, now we have the average RGB value of the whole image. The third step is to reduce RGB to a luminosity:
float lum = dot(vec3(0.30, 0.59, 0.11), color.xyz);
The dot product is just a fancy (and fast) way of calculating a weighted sum.

Testing point in the alpha channel

Is there a way to detect if the alpha of a pixel after drawing is not 0 when using OpenGLES on the iphone?
I would like to test multiple points to see if they are inside the area of a random polygon drawn by the user. If you know Flash, something equivalent to BitmapData::getPixel32 is what I'm looking for.

The framebuffer is kept by the GPU and is not immediately CPU accessible. I think the thing you'd most likely want from full OpenGL is the occlusion query; you can request geometry be drawn and be told how many pixels were actually plotted. Sadly that isn't available on the iPhone.
I think what you probably want is glReadPixels, which can be used to read a single pixel if you prefer, e.g. (written here, as I type, not tested)
GLubyte pixelValue[4];
glReadPixels(x, y, 1, 1, GL_RGBA, GL_UNSIGNED_BYTE, pixelValue);
NSLog(@"alpha was %d", pixelValue[3]);
Using glReadPixels causes a pipeline flush, so is generally a bad idea from a GL performance point of view, but it'll do what you want. Unlike iOS, OpenGL uses graph paper order for pixel coordinates, so (0, 0) is the lower left corner.
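Because of that difference in origin, a touch location measured from the top of the view has to be flipped before it is passed as y. A tiny sketch in plain C, with an invented function name, assuming both values are already in framebuffer pixels (so any screen scale factor has been applied beforehand):

```c
#include <assert.h>

/* Convert a row counted from the top edge into the row counted from the
   bottom edge that glReadPixels expects. Hypothetical helper. */
int gl_row_from_top_row(int y_from_top, int framebuffer_height)
{
    assert(y_from_top >= 0 && y_from_top < framebuffer_height);
    return framebuffer_height - 1 - y_from_top;
}
```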

How would I draw something like this in Core Graphics

I want to be able to draw using this as my stroke. How would I do this as efficiently as possible, drawing on the fly? I was thinking of CGPatternRef, but I really don't know how to do that.
Edit:
It does not need to warp to the path. I just couldn't fix that issue in Illustrator.

Does the Apple doc help?
See the section in Quartz 2D Programming Guide, How Patterns Work, or Patterns in general.
Here is how to draw a star (from the above docs); PSIZE is the star size:
static void MyDrawStencilStar (void *info, CGContextRef myContext)
{
    int k;
    double r, theta;
    r = 0.8 * PSIZE / 2;
    theta = 2 * M_PI * (2.0 / 5.0); // 144 degrees
    CGContextTranslateCTM (myContext, PSIZE/2, PSIZE/2);
    CGContextMoveToPoint(myContext, 0, r);
    for (k = 1; k < 5; k++) {
        CGContextAddLineToPoint (myContext,
                                 r * sin(k * theta),
                                 r * cos(k * theta));
    }
    CGContextClosePath(myContext);
    CGContextFillPath(myContext);
}
Just add the curve transformation and at each point draw the star.
Here is some simple C code to calculate points on a cubic Bézier curve.
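The code that sentence pointed to is not included on this page. As a stand-in, here is a minimal sketch of the standard cubic Bézier formula in plain C. The type and function names are made up for the example; the tangent function is included because its direction is what you would use to rotate each star along the curve.

```c
typedef struct {
    double x, y;
} Point2D;

/* Point on the cubic Bezier curve with control points p0..p3 at
   parameter t in [0, 1], using the Bernstein form. */
Point2D cubic_bezier_point(Point2D p0, Point2D p1, Point2D p2, Point2D p3,
                           double t)
{
    double u = 1.0 - t;
    double b0 = u * u * u;
    double b1 = 3.0 * u * u * t;
    double b2 = 3.0 * u * t * t;
    double b3 = t * t * t;
    Point2D r = {
        b0 * p0.x + b1 * p1.x + b2 * p2.x + b3 * p3.x,
        b0 * p0.y + b1 * p1.y + b2 * p2.y + b3 * p3.y
    };
    return r;
}

/* First derivative of the curve at t. It is not normalized; its direction
   is the tangent direction at that point. */
Point2D cubic_bezier_tangent(Point2D p0, Point2D p1, Point2D p2, Point2D p3,
                             double t)
{
    double u = 1.0 - t;
    double a = 3.0 * u * u;
    double b = 6.0 * u * t;
    double c = 3.0 * t * t;
    Point2D r = {
        a * (p1.x - p0.x) + b * (p2.x - p1.x) + c * (p3.x - p2.x),
        a * (p1.y - p0.y) + b * (p2.y - p1.y) + c * (p3.y - p2.y)
    };
    return r;
}
```

Stepping t in equal increments does not give equally spaced points along the curve, so for even spacing the points would still need to be resampled by distance.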

You could try importing your Illustrator document into this application: Opacity, and then doing an "Export as source code".
See here: http://likethought.com/opacity/workflow/ for more information.

You will need to walk the path and compute coordinates of the curve at equal distances (along the path). This is inexpensive. Rotation gets a little hairy (rotating about the tangent), but overall, this is a basic bézier curve problem. Simply solve the curve equation and render a star at each vertex.
There's nothing built-in that will do it all. You can query points for intersection, but only rendering solves the curve. This is encapsulated by CoreGraphics, so you can't pull it out or take advantage of what they already have.
This is something you should consider writing yourself. It's not difficult; honest. Only a basic understanding of calculus is required... if at all. If you write it yourself, you can even add in the warping effects, if you like.

This looks like a job for OpenGL. CoreGraphics doesn't offer any simple way that I know of to warp the stars according to their proximity to a path.
