How could I implement pixel data as Array[i][j] within an iPhone app?

How could I access an image's pixel data as
Array[i][j] = 255;
without calculating the byte index into the raw data from bytesPerPixel and bytesPerRow, like this:
int byteIndex = (bytesPerRow * j) + i * bytesPerPixel;
i and j are the x, y coordinates of the image.
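To make the index formula concrete: it selects the first byte of the pixel at column i, row j in an interleaved raw buffer. A minimal C sketch of a helper that hides that calculation (the buffer and function names here are my own, not from the question) could look like this:

#include <stdint.h>
#include <stddef.h>

// Hypothetical helper: return a pointer to the first byte of the pixel at
// column i, row j in a raw interleaved buffer (e.g. the red channel for RGBA).
static inline uint8_t *pixel_at(uint8_t *rawData, size_t bytesPerRow,
                                size_t bytesPerPixel, size_t i, size_t j)
{
    size_t byteIndex = (bytesPerRow * j) + i * bytesPerPixel;
    return rawData + byteIndex;
}

// Usage, assuming 4 bytes per pixel (RGBA): set one channel to 255.
// pixel_at(rawData, bytesPerRow, 4, i, j)[0] = 255;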

Well, you can do this by allocating N rows of K elements, where N is the height of the image and K is its width.
Like this:
char **image;
image = new char*[height];
for (size_t i = 0; i < height; i++)
{
    image[i] = new char[width];
}
But don't do this! It's the worst idea to operate on pixels this way. First of all, when the image data is stored in a single contiguous block, it is processed more efficiently, since there are fewer cache misses. Another reason: almost all image-processing libraries store image data in a contiguous memory layout.
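If the Array[i][j] syntax from the question is still wanted, one common compromise (just a sketch, not part of the original answer) is to keep all pixels in one contiguous block and build an array of row pointers into it:

#include <stdlib.h>

// Sketch: one contiguous pixel block plus an array of row pointers into it,
// so that rows[j][i] works for a single-channel (grayscale) image.
unsigned char **make_image(size_t width, size_t height)
{
    unsigned char *pixels = malloc(width * height);          // one contiguous block
    unsigned char **rows  = malloc(height * sizeof *rows);   // row pointers
    if (pixels == NULL || rows == NULL) {
        free(pixels);
        free(rows);
        return NULL;
    }
    for (size_t row = 0; row < height; row++) {
        rows[row] = pixels + row * width;                     // point into the block
    }
    return rows;   // free later with: free(rows[0]); free(rows);
}

The pixel data stays cache-friendly, and rows[j][i] gives the 2D indexing the question asks for (for a single-channel image; an interleaved RGBA image would need the bytesPerPixel factor again).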
I suggest using a ready-made image container from either OpenCV (the cv::Mat type) or the Boost.GIL library. I prefer GIL, since it provides explicitly typed image containers, like:
boost::gil::bgra8_image_t
boost::gil::gray32f_image_t
and so on. It also provides an efficient and flexible way to do color conversions (CMYK <-> HSV <-> RGBA <-> GRAY) and image transformations (90-degree rotation, flipping, cropping, etc.).

Related

Wrong background subtraction

I'm trying to do background subtraction using two images.
Image A is the background and image B is an image with things over the background.
I'm normalizing the images but I don't get the expected result.
Here's the code:
a = rgb2gray(im);
b = rgb2gray(im2);
resA = ((a - min(a(:)))./(max(a(:))-min(a(:))));
resB = ((b - min(b(:)))./(max(b(:))-min(b(:))));
resAbs = abs(resB-resA);
imshow(resAbs);
The resulting image is completely dark. Thanks to the answer from user saeed masoomi, I realized that this was because of the data type, so now I have the following code:
a = rgb2gray(im);
b = rgb2gray(im2);
resA = im2double(a);
resB = im2double(b);
resAbs = imsubtract(resB,resA);
imshow(resAbs,[]);
The resulting image is not well filtered, and there are parts of image B that should appear but don't.
If I try doing this without normalizing, I still have the same problem.
The only difference between images A and B is the arms, which appear only in image B, so they should show up without being cut off.
Can you see something wrong? Maybe I should filter with a threshold?
Do not normalize the two images. Background subtraction is typically done with identical camera settings, so the two images are directly comparable. If the background image doesn't have a bright object in it, normalizing like you do would brighten it w.r.t. the second image. The intensities are no longer comparable, and you'd see differences where there were none.
If you recorded the background image with different camera settings (different exposure time, illumination, etc) then background subtraction is a lot more complicated than you think. You'd have to apply an optimization scheme to make the two images comparable, such that their difference is sparse. You'd have to look through the literature for that, it's not at all trivial.
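To illustrate the plain, un-normalized comparison described above, here is a minimal sketch in C (not part of the original answer; the buffer layout and threshold value are assumptions): take the per-pixel absolute difference of the two grayscale images and flag pixels whose difference exceeds a small threshold.

#include <stdint.h>
#include <stdlib.h>

// Minimal background-subtraction sketch: background and frame are two 8-bit
// grayscale images of identical size, taken with identical camera settings.
// mask[i] is set to 255 wherever the frame differs noticeably.
void subtract_background(const uint8_t *background, const uint8_t *frame,
                         uint8_t *mask, size_t pixelCount, uint8_t threshold)
{
    for (size_t i = 0; i < pixelCount; i++) {
        int diff = abs((int)frame[i] - (int)background[i]);
        mask[i] = (diff > threshold) ? 255 : 0;
    }
}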
Hi, please pay attention to your data type: images in MATLAB are stored as unsigned 8-bit integers (uint8, 0 to 255), so there is no 0.1 or 0.2 or any other fractional value; 1.2 becomes 1.
You get a wrong computation with uint8 data, like below:
max=uint8(255); %uint8
min=uint8(20); %uint8
data=uint8(40); %uint8
normalized=(data-min)/(max-min) %uint8
The output will be:
normalized =

  uint8

   0

Oops! You might expect this output to be 0.0851, but it isn't, because the data type is uint8, so the result is 0. I guess that is why all of your data is zero (the result image is dark). To prevent this mistake, MATLAB has a handy function named im2double, which converts uint8 to double and scales all values into the range 0 to 1.
I2 = im2double(I) converts the intensity image I to double precision, rescaling the data if necessary. I can be a grayscale intensity image, a truecolor image, or a binary image.
so we can rewrite your code like below
a = rgb2gray(im);
b = rgb2gray(im2);
resA = im2double(a);
resB = im2double(b);
resAbs = abs(imsubtract(resB, resA)); % edited: subtract the im2double images
imshow(resAbs,[])
Edited:
If the output image is still dark, you should check whether the two images actually contain different pixels, with code like below:
if isempty(nonzeros(resAbs))
    disp('The two images are identical -> something is wrong')
else
    disp('The two images differ -> normal')
end

How to get the depth value of specific RGB pixels in Kinect v2 images using MATLAB

I'm working with the Kinect v2 and I have to map the depth information onto the RGB images to process them: in particular, I need to know which pixels in the RGB images are in a certain range of distance (depth) along the Z axis; I'm acquiring all the data with a C# program and saving them as images (RGB) and txt files (depth).
I've followed the instructions from here and here (and I thank the authors for sharing), but I still have some problems I don't know how to solve.
I have calculated the rotation (R) and translation (T) matrix between the depth sensor and the RGB camera, as well as their intrinsic parameters.
I have created P3D_d (depth pixels in world coordinates related to depth sensor) and P3D_rgb (depth pixels in world coordinates related to rgb camera).
row_num = 424;
col_num = 512;

P3D_d = zeros(row_num, col_num, 3);
P3D_rgb = zeros(row_num, col_num, 3);

cont = 1;
for row = 1:row_num
    for col = 1:col_num
        P3D_d(row,col,1) = (row - cx_d) * depth(row,col) / fx_d;
        P3D_d(row,col,2) = (col - cy_d) * depth(row,col) / fy_d;
        P3D_d(row,col,3) = depth(row,col);

        temp = [P3D_d(row,col,1); P3D_d(row,col,2); P3D_d(row,col,3)];
        P3D_rgb(row,col,:) = R*temp + T;
    end
end
I have created P2D_rgb_x and P2D_rgb_y.
P2D_rgb_x(:,:,1) = (P3D_rgb(:,:,1)./P3D_rgb(:,:,3))*fx_rgb+cx_rgb;
P2D_rgb_y(:,:,2) = (P3D_rgb(:,:,2)./P3D_rgb(:,:,3))*fy_rgb+cy_rgb;
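For reference, these two lines are the standard pinhole projection. A minimal C sketch of the same step (the variable names mirror the MATLAB code and are assumptions) is:

// Standard pinhole projection: map a 3-D point (X, Y, Z), already expressed
// in the RGB camera frame, onto the RGB image plane. fx, fy, cx, cy are the
// RGB camera intrinsics.
void project_to_rgb(double X, double Y, double Z,
                    double fx, double fy, double cx, double cy,
                    double *u, double *v)
{
    *u = (X / Z) * fx + cx;   // horizontal pixel coordinate (x / column)
    *v = (Y / Z) * fy + cy;   // vertical pixel coordinate (y / row)
}

The resulting (u, v) is generally non-integer, so it has to be rounded and checked against the RGB image bounds before it can be used as a pixel index.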
But now I don't understand how to continue.
Assuming that the calibration parameters are correct, I've tried clicking on the same point in both the depth image (coordinates: row_d, col_d) and the RGB image (coordinates: row_rgb, col_rgb), but P2D_rgb_x(row_d, col_d) is totally different from row_rgb, and P2D_rgb_y(row_d, col_d) is totally different from col_rgb.
So, what exactly do P2D_rgb_x and P2D_rgb_y mean? How can I use them to map depth values onto the RGB images, or just to get the depth of a certain RGB pixel?
I'll appreciate any suggestion or help!
PS: I also have a related post on MathWorks at this link

How are Kinect depth images created? Can simple RGB images be converted to images like those depth images?

My primary motive is to detect a hand in simple RGB images (images from my webcam).
I found some sample code, find_hand_point:
function [result, depth] = find_hand_point(depth_frame)
% function result = find_hand_point(depth_frame)
%
% returns the coordinate of a pixel that we expect to belong to the hand.
% very simple implementation, we assume that the hand is the closest object
% to the sensor.
max_value = max(depth_frame(:));
current2 = depth_frame;
current2(depth_frame == 0) = max_value;
blurred = imfilter(current2, ones(5, 5)/25, 'symmetric', 'same');
minimum = min(blurred(:));
[is, js] = find(blurred == minimum);
result = [is(1), js(1)];
depth = minimum;
The result variable is the co-ordinate of the nearest thing to the camera (the hand).
A depth image from the Kinect device was passed to this function, and the result is:
http://img839.imageshack.us/img839/5562/testcs.jpg
The green rectangle shows the closest thing to the camera (the hand).
PROBLEM:
The images that my laptop camera captures are not depth images, but simple RGB images.
Is there a way to convert my RGB images to depth images like those?
Is there a simple alternative technique to detect a hand?
Kinect uses extra sensors to retrieve the depth data. There is not enough information in a single webcam image to reconstruct a 3D picture. But it is possible to make far-reaching estimates based on a series of images. This is the principle behind XTR-3D and similar solutions.
A much simpler approach can be found in http://www.andol.info/hci/830.htm
There the author converts the RGB image to HSV and keeps only specific ranges of the H, S, and V values, which he assumes are hand-like (skin) colors.
In Matlab:
function [hand] = get_hand(rgb_image)
    % Note: MATLAB's rgb2hsv returns H, S and V in the range [0, 1], so
    % thresholds like 20, 30, 150, 80, 255 only make sense if the channels
    % are first rescaled (e.g. multiplied by 255).
    hsv_image = rgb2hsv(rgb_image);
    hand = ( (hsv_image(:,:,1) >= 0)  & (hsv_image(:,:,1) < 20)  ) & ...
           ( (hsv_image(:,:,2) >= 30) & (hsv_image(:,:,2) < 150) ) & ...
           ( (hsv_image(:,:,3) >= 80) & (hsv_image(:,:,3) < 255) );
end
The hand = ... line will give you a matrix that has 1s at the pixels where
0 <= H < 20 AND 30 <= S < 150 AND 80 <= V < 255
A better technique I found to detect hand via skin color :)
http://www.edaboard.com/thread200673.html

How to get the real RGBA or ARGB color values without premultiplied alpha?

I'm creating a bitmap context using CGBitmapContextCreate with the kCGImageAlphaPremultipliedFirst option.
I made a 5 x 5 test image with some major colors (pure red, green, blue, white, black), some mixed colors (i.e. purple) combined with some alpha variations. Every time when the alpha component is not 255, the color value is wrong.
I found that I could re-calculate the color when I do something like:
almostCorrectRed = wrongRed * (255 / alphaValue);
almostCorrectGreen = wrongGreen * (255 / alphaValue);
almostCorrectBlue = wrongBlue * (255 / alphaValue);
But the problem is, that my calculations are sometimes off by 3 or even more. So for example I get a value of 242 instead of 245 for green, and I am 100% sure that it must be exactly 245. Alpha is 128.
Then, for the exact same color just with different alpha opacity in the PNG bitmap, I get alpha = 255 and green = 245 as it should be.
If alpha is 0, then red, green and blue are also 0. Here all data is lost and I can't figure out the color of the pixel.
How can I avoid or undo this alpha premultiplication altogether, so that I can modify pixels in my image based on the true R, G, B pixel values as they were when the image was created in Photoshop? How can I recover the original values for R, G, B and A?
Background info (probably not necessary for this question):
What I'm doing is this: I take a UIImage and draw it to a bitmap context in order to perform some simple image manipulation algorithms on it, shifting the color of each pixel depending on what color it was before. Nothing really special. But my code needs the real colors. When a pixel is transparent (meaning it has alpha less than 255), my algorithm shouldn't care about this; it should just modify R, G, B as needed while alpha remains whatever it is. Sometimes, though, it will shift alpha up or down too. But I see them as two separate things: alpha controls transparency, while R, G, B control the color.
This is a fundamental problem with premultiplication in an integral type:
245 * (128/255) = 122.98
122.98 truncated to an integer = 122
122 * (255/128) = 243.046875
I'm not sure why you're getting 242 instead of 243, but this problem remains either way, and it gets worse the lower the alpha goes.
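To see the loss concretely, here is a minimal C sketch of that round trip, assuming simple truncation (the exact rounding Quartz applies may differ):

#include <stdio.h>

// Round trip of a single color component through 8-bit premultiplication.
int main(void)
{
    unsigned original = 245, alpha = 128;

    unsigned premultiplied = (original * alpha) / 255;      // 122
    unsigned recovered     = (premultiplied * 255) / alpha; // 243, not 245

    printf("premultiplied = %u, recovered = %u\n", premultiplied, recovered);
    return 0;
}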
The solution is to use floating-point components instead. The Quartz 2D Programming Guide gives the full details of the format you'll need to use.
Important point: You'd need to use floating-point from the creation of the original image (and I don't think it's even possible to save such an image as PNG; you might have to use TIFF). An image that was already premultiplied in an integral type has already lost that precision; there is no getting it back.
The zero-alpha case is the extreme version of this, to such an extent that even floating-point cannot help you. Anything times zero (alpha) is zero, and there is no recovering the original unpremultiplied value from that point.
Pre-multiplying alpha with an integer color type is a lossy operation. Data is destroyed during the quantization process (rounding to 8 bits).
Since some data is destroyed (by rounding), there is no way to recover the exact original pixel color (except for some lucky values). You have to save the colors of your Photoshop image before you draw it into a bitmap context, and use that original color data, not the premultiplied color data from the bitmap.
I ran into this same problem when trying to read image data, render it to another image with CoreGraphics, and then save the result as non-premultiplied data. The solution I found that worked for me was to save a table that contains the exact mapping CoreGraphics uses to map non-premultiplied data to premultiplied data. Then, estimate what the original unpremultiplied value would be with a multiply and a floor() call. Then, if the estimate and the result from the table lookup do not match, just check the value below the estimate and the one above the estimate in the table for the exact match.
// Note: extern_alphaTablesPtr and PREMULT_TABLEMAX are defined along with
// the static table of values linked below.
#include <stdint.h>
#include <math.h>
#include <assert.h>

// Execute premultiply logic on RGBA components split into components.
// For example, a pixel RGB (255, 0, 0) with A = 128
// would return (128, 0, 0) with A = 128.
static inline
uint32_t premultiply_bgra_inline(uint32_t red, uint32_t green, uint32_t blue, uint32_t alpha)
{
    const uint8_t* const restrict alphaTable = &extern_alphaTablesPtr[alpha * PREMULT_TABLEMAX];
    uint32_t result = (alpha << 24) | (alphaTable[red] << 16) | (alphaTable[green] << 8) | alphaTable[blue];
    return result;
}

static inline
int unpremultiply(const uint32_t premultRGBComponent, const float alphaMult, const uint32_t alpha)
{
    float multVal = premultRGBComponent * alphaMult;
    float floorVal = floor(multVal);
    uint32_t unpremultRGBComponent = (uint32_t)floorVal;
    assert(unpremultRGBComponent >= 0);
    if (unpremultRGBComponent > 255) {
        unpremultRGBComponent = 255;
    }

    // Pass the unpremultiplied estimated value through the
    // premultiply table again to verify that the result
    // maps back to the same rgb component value that was
    // passed in. It is possible that the result of the
    // multiplication is smaller or larger than the
    // original value, so this will either add or subtract
    // one from the result rgb component to account
    // for the error possibility.
    uint32_t premultPixel = premultiply_bgra_inline(unpremultRGBComponent, 0, 0, alpha);
    uint32_t premultActualRGBComponent = (premultPixel >> 16) & 0xFF;

    if (premultRGBComponent != premultActualRGBComponent) {
        if ((premultActualRGBComponent < premultRGBComponent) && (unpremultRGBComponent < 255)) {
            unpremultRGBComponent += 1;
        } else if ((premultActualRGBComponent > premultRGBComponent) && (unpremultRGBComponent > 0)) {
            unpremultRGBComponent -= 1;
        } else {
            // This should never happen
            assert(0);
        }
    }

    return unpremultRGBComponent;
}
You can find the complete static table of values at this github link.
Note that this approach will not recover information "lost" when the original unpremultiplied pixel was premultiplied. But, it does return the smallest unpremultiplied pixel that will become the premultiplied pixel once run through the premultiply logic again. This is useful when the graphics subsystem only accepts premultiplied pixels (like CoreGraphics on OSX). If the graphics subsystem only accepts premultipled pixels, then you are better off storing only the premultipled pixels, since less space is consumed as compared to the unpremultiplied pixels.

AVFoundation buffer comparison to a saved image

I am a long time reader, first time poster on StackOverflow, and must say it has been a great source of knowledge for me.
I am trying to get to know the AVFoundation framework.
What I want to do is save what the camera sees and then detect when something changes.
Here is the part where I save the image to a UIImage :
if (shouldSetBackgroundImage) {
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

    // Create a bitmap graphics context with the sample buffer data
    CGContextRef context = CGBitmapContextCreate(rowBase, bufferWidth,
        bufferHeight, 8, bytesPerRow, colorSpace,
        kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);

    // Create a Quartz image from the pixel data in the bitmap graphics context
    CGImageRef quartzImage = CGBitmapContextCreateImage(context);

    // Free up the context and color space
    CGContextRelease(context);
    CGColorSpaceRelease(colorSpace);

    // Create an image object from the Quartz image
    UIImage *image = [UIImage imageWithCGImage:quartzImage];
    [self setBackgroundImage:image];
    NSLog(@"reference image actually set");

    // Release the Quartz image
    CGImageRelease(quartzImage);

    // Signal that the image has been saved
    shouldSetBackgroundImage = NO;
}
and here is the part where I check if there is any change in the image seen by the camera :
else {
    CGImageRef cgImage = [backgroundImage CGImage];
    CGDataProviderRef provider = CGImageGetDataProvider(cgImage);
    CFDataRef bitmapData = CGDataProviderCopyData(provider);
    unsigned char *data = (unsigned char *)CFDataGetBytePtr(bitmapData);
    if (data != NULL)
    {
        int64_t numDiffer = 0, pixelCount = 0;
        NSMutableArray *pointsMutable = [NSMutableArray array];

        for (int row = 0; row < bufferHeight; row += 8) {
            for (int column = 0; column < bufferWidth; column += 8) {
                // we get one pixel from each source (buffer and saved image)
                unsigned char *pixel = rowBase + (row * bytesPerRow) + (column * BYTES_PER_PIXEL);
                unsigned char *referencePixel = data + (row * bytesPerRow) + (column * BYTES_PER_PIXEL);

                pixelCount++;
                if (!match(pixel, referencePixel, matchThreshold)) {
                    numDiffer++;
                    [pointsMutable addObject:[NSValue valueWithCGPoint:CGPointMake(SCREEN_WIDTH - (column / (float)bufferHeight) * SCREEN_WIDTH - 4.0, (row / (float)bufferWidth) * SCREEN_HEIGHT - 4.0)]];
                }
            }
        }
        numberOfPixelsThatDiffer = numDiffer;
        points = [pointsMutable copy];
    }
For some reason, this doesn't work, meaning that the iPhone detects almost everything as being different from the saved image, even though I set a very low threshold for detection in the match function...
Do you have any idea of what I am doing wrong?
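For reference, the match() function isn't shown in the question; a typical per-channel threshold comparison (a hypothetical sketch, not the poster's actual code) might look like this:

#include <stdlib.h>
#include <stdbool.h>

// Hypothetical sketch: two pixels "match" if every channel differs by no
// more than `threshold`. BYTES_PER_PIXEL channels are compared (e.g. 4 for BGRA).
#define BYTES_PER_PIXEL 4

static bool match(const unsigned char *pixel,
                  const unsigned char *referencePixel,
                  int threshold)
{
    for (int c = 0; c < BYTES_PER_PIXEL; c++) {
        if (abs((int)pixel[c] - (int)referencePixel[c]) > threshold) {
            return false;
        }
    }
    return true;
}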
There are three possibilities I can think of for why you might be seeing nearly every pixel be different: colorspace conversions, incorrect mapping of pixel locations, or your thresholding being too sensitive for the actual movement of the iPhone camera. The first two aren't very likely, so I think it might be the third, but they're worth checking.
There might be some color correction going on when you place your pixels within a UIImage, then extract them later. You could try simply storing them in their native state from the buffer, then using that original buffer as the point of comparison, not the UIImage's backing data.
Also, check to make sure that your row / column arithmetic works out for the actual pixel locations in both images. Perhaps generate a difference image from the absolute difference of the two images, then use a simple black / white divided area as a test image for the camera.
The most likely case is that the overall image is shifting by more than one pixel simply through the act of a human hand holding it. These whole-frame image shifts could cause almost every pixel to be different in a simple comparison. You may need to adjust your thresholding or do more intelligent motion estimation, like is used in video compression routines.
Finally, when it comes to the comparison operation, I'd recommend taking a look at OpenGL ES 2.0 shaders for performing this. You should see a huge speedup (14-28X in my benchmarks) over doing this pixel-by-pixel comparison on the CPU. I show how to do color-based thresholding using the GPU in this article, which has this iPhone sample application that tracks colored objects in real time using GLSL shaders.
Human eyes are very different from a camera (even a very expensive one) in that we don't perceive minimal light changes or small motion changes. Cameras DO; they are very sensitive, but not smart at all!
With your current approach (it seems you are comparing each pixel):
What would happen if the frame were shifted only 1 pixel to the right?! You can imagine the result of your algorithm, right? Humans would perceive nothing, or almost nothing.
There is also the camera shutter problem: every frame might not have the same amount of light. Hence, a pixel-by-pixel comparison method is too prone to fail.
You want to at least pre-process your image and extract some basic features, maybe edges or corners. OpenCV makes that easy, but I am not sure such processing will be fast on the iPhone (it depends on your image size).
Alternatively, you can try the naive template-matching algorithm, with a template size a little smaller than your whole view size.
Image processing is computationally expensive, so don't expect it to be fast the first time, especially on a mobile device, and even more so if you don't have experience in image processing / computer vision.
Hope it helps ;)