I'd like to efficiently create an up-scaled CIImage from a minimally sized one, using Nearest Neighbour scaling.
Say I want to create an image at arbitrary resolutions such as these EBU Color Bars:
In frameworks like OpenGL we can store this as a tiny 8x1 pixel texture and render it to arbitrarily sized quads, and as long as we use Nearest Neighbour sampling the resulting image is sharp.
Our options with CIImage appear to be limited to .transformed(by: CGAffineTransform(scaleX:y:)) and .applyingFilter("CILanczosScaleTransform"), both of which use smooth sampling. That's a good choice for photographic images, but it blurs the edges of line art such as these color bars - I specifically want a pixellated effect.
Because I'm trying to take advantage of GPU processing in the Core Image backend, I'd rather not provide an already upscaled bitmap image to the process (using a CGImage, for example).
Is there some way of either telling Core Image to use Nearest Neighbour sampling, or perhaps write a custom subclass of CIImage to achieve this?
I think you can use samplingNearest() for that:
let scaled = image.samplingNearest().transformed(by: …)
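As a concrete sketch of how that could look when blowing an 8x1 image up to an arbitrary output size (the 1920x1080 target and the solid-colour stand-in for the bars are just placeholders):

import CoreImage

// Stand-in for the 8x1 colour-bar image (use your real CIImage here).
let tinyBars = CIImage(color: CIColor(red: 1, green: 1, blue: 0))
    .cropped(to: CGRect(x: 0, y: 0, width: 8, height: 1))

// Arbitrary target resolution, e.g. 1920x1080.
let scale = CGAffineTransform(scaleX: 1920.0 / 8.0, y: 1080.0 / 1.0)

// samplingNearest() makes the subsequent transform use nearest-neighbour
// sampling, so each source pixel becomes a sharp block instead of being blurred.
let scaled = tinyBars.samplingNearest().transformed(by: scale)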
I need some help with corner detection.
I printed a checkerboard and captured an image of it with a webcam. The problem is that the webcam has a low resolution, so it does not find all the corners. I therefore increased the number of corners to search for. Now it finds all the corners, but it returns several different points for the same corner.
All the points are stored in a matrix, so I don't know which element belongs to which corner.
(I cannot use the checkerboard function because it is not available in my MATLAB version.)
I am currently using the MATLAB function corner.
My Question:
Is it possible to find the extrema of each cluster of points to get one point per corner? Or does somebody have an idea what else I could do? Please see the attached photo.
Thanks for your help!
Looking at the image, my guess is that the false positives from the corner detection are caused by compression artifacts introduced by the lossy compression algorithm used by your webcam's image acquisition software. You can clearly spot ringing artifacts around the edges of the checkerboard fields.
You could try two different things:
Check in your webcam's acquisition software whether you can disable the compression or change to a lossless compression
Working with the image you already have, you could try to reduce the impact of the compression by binarising the image using a simple thresholding operation (which, in the case of a checkerboard, would not even mean losing information, since the image is intrinsically binary).
In case you want to go for option 2, I would suggest the following steps. Let's assume the variable storing your image is called img:
Look at the distribution of grey values using e.g. the imhist function, like so: imhist(img)
Ideally you would see a clean bimodal distribution with no overlap. Choose an intensity value I in the middle of the two peaks.
Then simply binarise by assigning img(img<I) = 0; img(img>I) = 255 (assuming img is of type uint8).
Then run the corner algorithm again and see if the outliers have disappeared.
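Putting those steps together, a minimal sketch could look like this (the threshold value 128 is only a placeholder; read the real value off the histogram):

% Inspect the grey-value distribution and pick a threshold between the two peaks
imhist(img);
I = 128;              % placeholder threshold, adjust to your histogram

% Binarise the image (assuming img is uint8)
img(img < I) = 0;
img(img > I) = 255;

% Re-run the corner detection on the cleaned-up image
C = corner(img);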
I'm working with the Kinect camera and trying to display real-life depth imaging using the ptCloud method, combining the RGB and depth sensors. However, using just the initial setup, my image is disfigured and missing pertinent information. Is there any way to improve this so that it captures more data? I have also attached an image of what I mean. Any help would be great, thank you!
% Set up the Kinect colour and depth streams
colorDevice = imaq.VideoDevice('kinect',1);
depthDevice = imaq.VideoDevice('kinect',2);

% Warm up both devices
step(colorDevice);
step(depthDevice);

colorImage = step(colorDevice);
depthImage = step(depthDevice);

gridstep = 0.1;
ptCloud = pcfromkinect(depthDevice,depthImage,colorImage);

player = pcplayer(ptCloud.XLimits,ptCloud.YLimits,ptCloud.ZLimits,...
    'VerticalAxis','y','VerticalAxisDir','down');
xlabel(player.Axes,'X (m)');
ylabel(player.Axes,'Y (m)');
zlabel(player.Axes,'Z (m)');

% Stream point clouds continuously
for i = 1:1000
    colorImage = step(colorDevice);
    depthImage = step(depthDevice);
    ptCloud = pcfromkinect(depthDevice,depthImage,colorImage);
    ptCloudOut = pcdenoise(ptCloud);
    view(player,ptCloudOut);
end

release(colorDevice);
release(depthDevice);
From the looks of the image, you are trying to capture a cabinet with a TV screen in the middle. In cases like these, the TV screen absorbs the IR emitted by the sensor, or reflects it at oblique angles or through multiple reflections, so the Kinect is unable to capture depth data for it. Furthermore, when you display the RGB data on top of the point cloud, the two have to be aligned, and any depth data that is not aligned with the RGB image pixels is rejected.
So, in order to improve your depth data acquisition, you could either make sure there are no reflective surfaces such as screens or mirrors in the scene, or try displaying the depth data without the RGB overlay, which will hopefully improve the point cloud shown.
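As a rough sketch of the depth-only route (reusing the devices from the question; pcfromkinect also accepts just the depth image):

% Show the point cloud without the RGB overlay, so no depth samples are
% discarded for failing to align with colour pixels.
depthImage = step(depthDevice);
ptCloud = pcfromkinect(depthDevice, depthImage);   % no colorImage argument

player = pcplayer(ptCloud.XLimits, ptCloud.YLimits, ptCloud.ZLimits, ...
    'VerticalAxis', 'y', 'VerticalAxisDir', 'down');
view(player, ptCloud);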
UPDATE: Please see the additional question below with more code.
I am trying to code a category for blurring an image. My starting point is Jeff LaMarche's sample here. Whilst this (after the fixes suggested by others) works fine, it is an order of magnitude too slow for my requirements - on a 3GS it takes maybe 3 seconds to do a decent blur and I'd like to get this down to under 0.5 sec for a full screen (faster is better).
He mentions the Accelerate framework as a performance enhancement, so I've spent the last day looking at it, and in particular at vDSP_f3x3, which according to the Apple documentation:
Filters an image by performing a two-dimensional convolution with a 3x3 kernel; single precision.
Perfect - I have a suitable filter matrix, and I have an image ... but this is where I get stumped.
vDSP_f3x3 assumes image data is (float *), but my image comes from:
srcData = (unsigned char *)CGBitmapContextGetData (context);
and the context comes from CGBitmapContextCreate with kCGImageAlphaPremultipliedFirst, so my srcData is really ARGB with 8 bits per component.
I suspect what I really need is a context with float components, but according to the Quartz documentation here, kCGBitmapFloatComponents is only available on Mac OS and not on iOS :-(
Is there a really fast way, using the Accelerate framework, of converting the integer components I have into the float components that vDSP_f3x3 needs? I mean, I could do it myself, but by the time I've done that, then the convolution, and then the conversion back, I suspect I'll have made it even slower than it is now, since I might as well convolve as I go.
Maybe I have the wrong approach?
Does anyone have some tips for me, having done some image processing on the iPhone using vDSP? The documentation I can find is very reference-oriented and not very newbie friendly when it comes to this sort of thing.
If anyone has a reference for really fast blurring (and high quality, not the reduce-resolution-and-then-rescale stuff I've seen, which looks pants) that would be fab!
EDIT:
Thanks @Jason. I've done this and it's almost working, but now my problem is that although the image does blur, on every invocation it shifts left by 1 pixel. It also seems to make the image black and white, but that could be something else.
Is there anything in this code that leaps out as obviously incorrect? I haven't optimised it yet and it's a bit rough, but hopefully the convolution code is clear enough.
CGImageRef CreateCGImageByBlurringImage(CGImageRef inImage, NSUInteger pixelRadius, NSUInteger gaussFactor)
{
    unsigned char *srcData, *finalData;

    CGContextRef context = CreateARGBBitmapContext(inImage);
    if (context == NULL)
        return NULL;

    size_t width = CGBitmapContextGetWidth(context);
    size_t height = CGBitmapContextGetHeight(context);
    size_t bpr = CGBitmapContextGetBytesPerRow(context);

    int componentsPerPixel = 4; // ARGB

    CGRect rect = {{0,0},{width,height}};
    CGContextDrawImage(context, rect, inImage);

    // Now we can get a pointer to the image data associated with the bitmap
    // context.
    srcData = (unsigned char *)CGBitmapContextGetData(context);

    if (srcData != NULL)
    {
        size_t dataSize = bpr * height;
        finalData = malloc(dataSize);
        memcpy(finalData, srcData, dataSize);

        // Generate Gaussian kernel
        float *kernel;

        // Limit the pixelRadius
        pixelRadius = MIN(MAX(1,pixelRadius), 248);
        int kernelSize = pixelRadius * 2 + 1;

        kernel = malloc(kernelSize * sizeof *kernel);

        int gauss_sum = 0;
        for (int i = 0; i < pixelRadius; i++)
        {
            kernel[i] = 1 + (gaussFactor*i);
            kernel[kernelSize - (i + 1)] = 1 + (gaussFactor * i);
            gauss_sum += (kernel[i] + kernel[kernelSize - (i + 1)]);
        }
        kernel[(kernelSize - 1)/2] = 1 + (gaussFactor*pixelRadius);
        gauss_sum += kernel[(kernelSize-1)/2];

        // Scale the kernel
        for (int i = 0; i < kernelSize; ++i) {
            kernel[i] = kernel[i]/gauss_sum;
        }

        float *srcAsFloat, *resultAsFloat;

        srcAsFloat = malloc(width*height*sizeof(float)*componentsPerPixel);
        resultAsFloat = malloc(width*height*sizeof(float)*componentsPerPixel);

        // Convert uint source ARGB to floats
        vDSP_vfltu8(srcData,1,srcAsFloat,1,width*height*componentsPerPixel);

        // Convolve (hence the -1) with the kernel
        vDSP_conv(srcAsFloat, 1, &kernel[kernelSize-1], -1, resultAsFloat, 1, width*height*componentsPerPixel, kernelSize);

        // Copy the floats back to ints
        vDSP_vfixu8(resultAsFloat, 1, finalData, 1, width*height*componentsPerPixel);

        free(resultAsFloat);
        free(srcAsFloat);
    }

    size_t bitmapByteCount = bpr * height;

    CGDataProviderRef dataProvider = CGDataProviderCreateWithData(NULL, finalData, bitmapByteCount, &providerRelease);
    CGImageRef cgImage = CGImageCreate(width, height, CGBitmapContextGetBitsPerComponent(context),
                                       CGBitmapContextGetBitsPerPixel(context), CGBitmapContextGetBytesPerRow(context),
                                       CGBitmapContextGetColorSpace(context), CGBitmapContextGetBitmapInfo(context),
                                       dataProvider, NULL, true, kCGRenderingIntentDefault);
    CGDataProviderRelease(dataProvider);
    CGContextRelease(context);

    return cgImage;
}
I should add that if I comment out the vDSP_conv line, and change the following line to:
vDSP_vfixu8(srcAsFloat, 1, finalData, 1, width*height*componentsPerPixel);
Then, as expected, my result is a clone of the original source - in colour and not shifted left. This implies to me that it IS the convolution that is going wrong, but I can't see where :-(
THOUGHT: Actually, thinking about this, it seems to me that the convolution needs to know that the input pixels are in ARGB format, as otherwise it will be multiplying the values together with no knowledge of their meaning (i.e. it will multiply R * B etc.). This would explain why I get a B&W result, I think, but not the shift. Again, I think there might need to be more to it than my naive version here ...
FINAL THOUGHT: I think the shifting left is a natural result of the filter, and I need to look at the image dimensions and possibly pad the image out ... so I think the code is actually working OK given what I've fed it.
While the Accelerate framework will be faster than simple serial code, you'll probably never see the greatest performance by blurring an image using it.
My suggestion would be to use an OpenGL ES 2.0 shader (for devices that support this API) to do a two-pass box blur. Based on my benchmarks, the GPU can handle these kinds of image manipulation operations at 14-28X the speed of the CPU on an iPhone 4, versus the maybe 4.5X that Apple reports for the Accelerate framework in the best cases.
Some code for this is described in this question, as well as in the "Post-Processing Effects on Mobile Devices" chapter in the GPU Pro 2 book (for which the sample code can be found here). By placing your image in a texture, then reading values in between pixels, bilinear filtering on the GPU gives you some blurring for free, which can then be combined with a few fast lookups and averaging operations.
If you need a starting project to feed images into the GPU for processing, you might be able to use my sample application from the article here. That sample application passes AVFoundation video frames as textures into a processing shader, but you can modify it to send in your particular image data and run your blur operation. You should be able to use my glReadPixels() code to then retrieve the blurred image for later use.
Since I originally wrote this answer, I've created an open source image and video processing framework for doing these kinds of operations on the GPU. The framework has several different blur types within it, all of which can be applied very quickly to images or live video. The GPUImageGaussianBlurFilter, which applies a standard 9-hit Gaussian blur, runs in 16 ms for a 640x480 frame of video on the iPhone 4. The GPUImageFastBlurFilter is a modified 9-hit Gaussian blur that uses hardware filtering, and it runs in 2.0 ms for that same video frame. Likewise, there's a GPUImageBoxBlurFilter that uses a 5-pixel box and runs in 1.9 ms for the same image on the same hardware. I also have median and bilateral blur filters, although they need a little performance tuning.
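As a rough illustration of how the framework is used (method and property names here are from a later GPUImage release and may differ slightly from the version available at the time; the blur radius property in particular has been renamed across versions):

#import "GPUImage.h"

// Sketch: blur a UIImage on the GPU with GPUImage's Gaussian filter.
UIImage *inputImage = [UIImage imageNamed:@"photo.png"];  // hypothetical asset name
GPUImageGaussianBlurFilter *blurFilter = [[GPUImageGaussianBlurFilter alloc] init];
blurFilter.blurRadiusInPixels = 4.0;                      // example radius
UIImage *blurredImage = [blurFilter imageByFilteringImage:inputImage];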
In my benchmarks, Accelerate doesn't come close to these kinds of speeds, especially when it comes to filtering live video.
You definitely want to convert to float to perform the filtering since that is what the accelerated functions take, plus it is a lot more flexible if you want to do any additional processing. The computation time of a 2-D convolution (filter) will most likely dwarf any time spent in conversion. Take a look at the function vDSP_vfltu8() which will quickly convert the uint8 data to float. vDSP_vfixu8() will convert it back to uint8.
To perform a blur, you are probably going to want a bigger convolution kernel than 3x3 so I would suggest using the function vDSP_imgfir() which will take any kernel size.
Response to edit:
A few things:
You need to perform the filtering on each color channel independently. That is, you need to split the R, G, and B components into their own images (of type float), filter them, then remultiplex them into the ARGB image.
vDSP_conv computes a 1-D convolution, but to blur an image, you really need a 2-D convolution. vDSP_imgfir essentially computes the 2-D convolution. For this you will need a 2-D kernel as well. You can look up the formula for a 2-D Gaussian function to produce the kernel (a small sketch of building one appears after this list).
Note: You actually can perform a 2-D convolution using 1-D convolutions if your kernel is separable (which a Gaussian is). I won't go into what that means, but you essentially have to perform 1-D convolution across the columns and then perform 1-D convolution across the resulting rows. I would not go this route unless you know what you are doing.
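For reference, here is a small sketch in plain C of producing a normalised 2-D Gaussian kernel of the row-major float form that vDSP_imgfir takes; the size and sigma values are arbitrary examples:

#include <math.h>

// Fill a size x size float kernel with a normalised 2-D Gaussian.
// size should be odd (e.g. 9); sigma controls the blur strength.
static void BuildGaussianKernel2D(float *kernel, int size, float sigma)
{
    int half = size / 2;
    float sum = 0.0f;
    for (int y = -half; y <= half; y++) {
        for (int x = -half; x <= half; x++) {
            float value = expf(-(x * x + y * y) / (2.0f * sigma * sigma));
            kernel[(y + half) * size + (x + half)] = value;
            sum += value;
        }
    }
    // Normalise so the kernel sums to 1 and image brightness is preserved
    for (int i = 0; i < size * size; i++)
        kernel[i] /= sum;
}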
So answering my own question with Jason's excellent help, the final working code fragment is provided here for reference in case it helps anyone else. As you can see, the strategy is to split the source ARGB (I'm ignoring A for performance and assuming the data is XRGB) into 3 float arrays, apply the filter and then re-multiplex the result.
It works a treat - but it is achingly slow. I'm using a large kernel of 16x16 to get a heavy blur and on my 3GS it takes about 5 seconds for a full screen image so that's not going to be a viable solution.
Next step is to look at alternatives ... but thanks for getting me up and running.
// De-interleave the R, G and B channels (stride 4 through the ARGB data)
// into separate float planes
vDSP_vfltu8(srcData+1, 4, srcAsFloatR, 1, pixels);
vDSP_vfltu8(srcData+2, 4, srcAsFloatG, 1, pixels);
vDSP_vfltu8(srcData+3, 4, srcAsFloatB, 1, pixels);

// Now apply the filter to each of the components. For a gaussian blur with a 16x16 kernel
// this turns out to be really slow!
vDSP_imgfir(srcAsFloatR, height, width, kernel, resultAsFloatR, frows, fcols);
vDSP_imgfir(srcAsFloatG, height, width, kernel, resultAsFloatG, frows, fcols);
vDSP_imgfir(srcAsFloatB, height, width, kernel, resultAsFloatB, frows, fcols);

// Now re-multiplex the final image from the processed float data
vDSP_vfixu8(resultAsFloatR, 1, finalData+1, 4, pixels);
vDSP_vfixu8(resultAsFloatG, 1, finalData+2, 4, pixels);
vDSP_vfixu8(resultAsFloatB, 1, finalData+3, 4, pixels);
For future reference, if you're considering implementing this yourself: DON'T - I've done it for you!
See:
https://github.com/gdawg/uiimage-dsp
for a UIImage category which adds Gaussian/Box Blur/Sharpen using vDSP and the Accelerate framework.
Why are you using vDSP to do image filtering? Try vImageConvolve_ARGB8888(). vImage is the image processing component of Accelerate.framework.
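A rough sketch of what that might look like against the bitmap data from the original question (the 3x3 box kernel is purely illustrative; the divisor must equal the kernel sum so the output stays normalised):

#include <Accelerate/Accelerate.h>

// srcData, finalData, width, height and bpr are the same values obtained from
// the CGBitmapContext in the question's code.
vImage_Buffer src  = { .data = srcData,   .height = height, .width = width, .rowBytes = bpr };
vImage_Buffer dest = { .data = finalData, .height = height, .width = width, .rowBytes = bpr };

// Simple 3x3 box-blur kernel; replace with your Gaussian weights as needed.
int16_t kernel[9] = { 1, 1, 1,
                      1, 1, 1,
                      1, 1, 1 };

vImage_Error err = vImageConvolve_ARGB8888(&src, &dest, NULL, 0, 0,
                                           kernel, 3, 3, 9 /* divisor */,
                                           NULL, kvImageEdgeExtend);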
I'm working with an image processing tool in MATLAB. How can I convert MATLAB code to Objective-C?
Here are some of the tasks I want to do:
I want to rotate an oblique line to normal.
I have an algorithm that converts a colored image to black & white. (How can I access pixel color values in Objective-C?)
In the function getrgbfromimage, how can I output the pixel values to the console?
How can I run a function (getrotatedımage) on each element of an array?
Quartz 2D (aka. Core Graphics) is the 2D drawing API in iOS. Quartz will, most likely, do everything you're looking for. I recommend checking out the Quartz 2D Programming Guide in the documentation. For your specific requests check out these sections:
Colors and Color Spaces - for color and b&w images
Transforms - for rotating or performing any affine transform
Bitmap Images and Image Masks - for information on the underlying image data
As for running a function on each element of an array, you can use the block iteration API (as long as your app can require iOS 4.0 or higher). An example:
[myArray enumerateObjectsUsingBlock:^(id item, NSUInteger index, BOOL *stop) {
    doSomethingWith(item);
}];
If you just want to call a method on each item in the array, there is also:
[myArray makeObjectsPerformSelector:@selector(doSomething)];
CGBitmapContext will get pixel values from an image.
http://developer.apple.com/library/ios/#documentation/GraphicsImaging/Reference/CGBitmapContext/Reference/reference.html
This has some demo code:
Retrieving a pixel alpha value for a UIImage (MonoTouch)
printf will dump the RGB values to the console.
NSArray or NSMutableArray will hold your images and a simple for loop will let you iterate through them.
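For example, a small sketch of that loop (processImage() and the image1/image2 variables are hypothetical stand-ins for your own per-image routine and UIImage objects):

NSArray *images = [NSArray arrayWithObjects:image1, image2, nil]; // your UIImages
for (UIImage *image in images) {
    UIImage *result = processImage(image);   // e.g. your rotation or b&w conversion step
    // ... store or display the result
}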