I have two images: a mannequin with and without a garment.
Please refer to the sample images below. Ignore the jewels and footwear on the mannequin, and imagine the second mannequin is wearing only the dress.
I want to extract only the garment from the two images for further processing.
The complication is that there is a slight displacement in the camera position between the two pictures. Because of this, a simple subtraction to generate the garment mask will not work.
Can anyone tell me how to handle it?
I think I need to perform registration between the two images so that I can extract only the garment from the image. Is that the right approach?
Any references to blogs, articles, and code are highly appreciated.
--
Thanks
Idea
Here is an idea of how you could do it. I haven't tested it, but my gut tells me it might work. I'm assuming there will be slight differences in the pose of the mannequin as well as in the camera attitude.
Let the original image be A, and the clothed image be B.
Take the difference D = |A - B| and apply a median filter whose kernel size is proportional to the largest deviation you expect from the pose and camera-attitude error: Dmedian = Median(D, kernelsize).
Quantize Dmedian into a binary mask Dmask = Q(Dmedian, threshold) using an appropriate threshold value to obtain an approximate mask for the garment (this will be smaller than the garment itself because of the median filter). Reject any shapes in Dmask whose area is too small by setting their pixels to 0.
Expand the shape(s) in Dmask proportionally to the size of the median kernel into Emask = expand(Dmask, k*kernelsize). Then construct the difference of the masks, Fmask = |Dmask - Emask|, which now contains the areas of pixels where the garment edge is expected to be. For every pixel of Fmask in this area, find the correlation Cxy between A and B over a small neighbourhood and store the inverted correlations in an image C = 1.0 - Corr(A, B, Fmask, n).
Your final garment mask will be M = C + Dmask.
Explanation
Since your image has nice, continuous swatches of colour, the difference between the two similar images will consist of thin lines and small gradients where the pose and camera attitude differ. When you take a median filter of the difference image over a sufficiently large kernel, these lines are removed because they make up only a minority of the pixels.
The garment, on the other hand, will (hopefully) differ significantly from the colours of the unclothed version and will generate a bigger difference. Thresholding the difference after the median filter should give you a rough mask of the garment that is undersized, because some of the pixels at the edge are rejected when their median values come out too low. You could stop here if the approximation is good enough for you.
By expanding the mask obtained above, we get a probable region for the "true" edge. The process so far has narrowed the search region for the true edge considerably, so we can apply a more costly correlation search between the images along this edge to find where the garment is. High correlation means no garment; low correlation means garment.
We use the inverted correlation as an alpha value, together with the initially smaller mask, to obtain an alpha-valued mask of the garment that can be used for extracting it.
Clarification
Expand: What I mean by "expanding the mask" is finding the contour of the mask region and outsetting/growing/enlarging it to make it larger.
Corr(A, B, Fmask, n): just an arbitrarily chosen correlation function that gives the correlation between the pixels of A and B selected by the mask Fmask, computed over a region of size n. The function returns 1.0 for a perfect match and 0.0 for an anti-match at each pixel tested. A reasonable choice is this pseudocode:
foreach px_pos in Fmask where Fmask[px_pos] == 1
    Ap = subregion(A, px_pos, n) - mean(mean(A));
    Bp = subregion(B, px_pos, n) - mean(mean(B));
    Cxy = sum(sum(Ap .* Bp))^2 / (sum(sum(Ap .* Ap)) * sum(sum(Bp .* Bp)));
    C[px_pos] = 1.0 - Cxy;
end
where subregion selects a region of size n around the pixel at position px_pos.
You can see that if Ap == Bp, then Cxy = 1.
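As a rough MATLAB sketch of the whole pipeline (untested; A and B are assumed to be grayscale doubles in the 0-255 range, and kernelsize, threshold, minArea, k and n are placeholder values I made up, not part of the original idea):

kernelsize = 25;  threshold = 30;  minArea = 500;    % placeholder tuning values
k = 2;  n = 9;                                       % expansion factor and neighbourhood size
D     = abs(A - B);                                  % difference image
Dmed  = medfilt2(D, [kernelsize kernelsize]);        % suppress thin misalignment lines
Dmask = Dmed > threshold;                            % rough, undersized garment mask
Dmask = bwareaopen(Dmask, minArea);                  % reject shapes with too small an area
Emask = imdilate(Dmask, strel('disk', round(k*kernelsize)));   % expanded mask
Fmask = Emask & ~Dmask;                              % band where the true edge should lie
C     = zeros(size(A));
half  = floor(n/2);
[rows, cols] = find(Fmask);
for i = 1:numel(rows)
    r = rows(i);  c = cols(i);
    if r <= half || c <= half || r > size(A,1)-half || c > size(A,2)-half
        continue;                                    % skip image borders for simplicity
    end
    Ap = A(r-half:r+half, c-half:c+half) - mean(A(:));
    Bp = B(r-half:r+half, c-half:c+half) - mean(B(:));
    Cxy = sum(Ap(:).*Bp(:))^2 / (sum(Ap(:).^2)*sum(Bp(:).^2) + eps);
    C(r,c) = 1.0 - Cxy;                              % low correlation -> likely garment
end
M = C + double(Dmask);                               % alpha-valued garment mask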
Related
I am trying to write a program that uses computer vision techniques to detect (and track) tiny blobs in a stream of very noisy images. The image stream comes from a dual X-ray imaging setup, which outputs left and right views (of different sizes because the two views are collimated differently). My data is of two types: one set of images is not so noisy, and I am using it just to try out different techniques; the other set is noisier, and that is where the detection ultimately needs to work. The image stream is at 60 Hz. This is an example of a raw image from the X-ray imager:
Here are some cropped-out samples of the regions of interest. The blobs that need to be detected are the small black spots near the center of the image.
Initially I started off with simple contour/blob detection techniques in OpenCV, which were not very helpful. Eventually I moved on to techniques such as "opening" the image with morphological operators and then performing Laplacian-of-Gaussian blob detection to find areas of interest. This gave me better results for the low-noise versions of the images, but it fails on the high-noise ones: it gives me too many false positives. Here is a result from a low-noise image (please note the input image was inverted).
The code for my current LoG-based approach in MATLAB is as follows:
while ~isDone(videoReader)
    frame = step(videoReader);
    roi_frame = imcrop(frame, [660 410 120 110]);   % crop the region of interest
    I_roi = rgb2gray(roi_frame);
    I_roi = imcomplement(I_roi);                    % invert so the blobs become bright
    I_roi = wiener2(I_roi, [5 5]);                  % adaptive noise reduction
    background = imopen(I_roi, strel('disk', 3));   % estimate background by opening
    I2 = imadjust(I_roi - background);              % background subtraction + contrast stretch
    K = imgaussfilt(I2, 5);
    level = graythresh(K);                          % Otsu threshold from the smoothed image
    bw = im2bw(I2, level);
    sigma = 3;
    % Filter image with LoG
    I = double(bw);
    h = fspecial('log', sigma*30, sigma);
    Ifilt = -imfilter(I, h);
    % Threshold for points of interest
    Ifilt(Ifilt < 0.001) = 0;
    % Dilate to obtain local maxima
    Idil = imdilate(Ifilt, strel('disk', 50));
    % This is the final image
    P = (Ifilt == Idil) .* Ifilt;
end
Is there any way I can improve my current detection technique to make it work for images with a lot of background noise? Or are there techniques better suited for images like this?
The approach I would take:
-Average background subtraction
-Aggressive Gaussian smoothing (the filter should be shaped to your target object; off the top of my head I think you want the sigma to be about half the smallest cross-section of your object, but you may want to fiddle with this). Basically the goal is to blur the noise as much as possible without completely losing your target objects (based on shape and size).
-Edge detection. Try to be specific to the object if possible (basically, look at what the object's edge looks like after Gaussian smoothing and set your edge detection to look for that width and contrast shift).
-You may consider running a closing operation here.
-Search the whole image for islands (fully enclosed regions), then filter based on size and then on shape. (A rough sketch of these steps follows at the end of this answer.)
My hunch is that despite the incredibly low signal-to-noise ratio, the granularity of your noise is hopefully significantly smaller than your object size. (If your noise has both equivalent contrast and roughly the same size as your object, you are sunk and need to re-evaluate your acquisition, in my opinion.)
Another note based on your speed needs: huge processing savings can be made by tracking last known positions and searching locally, and by knowing where new targets can enter the image.
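Putting those steps together, here is a rough MATLAB sketch (untested; the file name and all numeric parameters are placeholders I made up, not values from the question, and would have to be tuned to your blob size and noise level):

frames = double(read(VideoReader('xray_roi.avi')));   % hypothetical file name
frames = squeeze(frames(:,:,1,:));                    % H-by-W-by-numFrames grayscale stack
sigma   = 2;                                          % ~ half the smallest blob cross-section
minArea = 5;  maxArea = 200;  maxEcc = 0.8;           % size/shape limits (placeholders)
bg = mean(frames, 3);                                 % average background estimate
for k = 1:size(frames, 3)
    I = mat2gray(frames(:,:,k) - bg);                 % average background subtraction
    Ismooth = imgaussfilt(I, sigma);                  % aggressive Gaussian smoothing
    E = edge(Ismooth, 'canny');                       % edge detection
    E = imclose(E, strel('disk', 2));                 % optional closing to seal small gaps
    islands = imfill(E, 'holes');                     % fully enclosed regions
    stats = regionprops(islands, 'Area', 'Eccentricity', 'Centroid');
    keep = [stats.Area] >= minArea & [stats.Area] <= maxArea & ...
           [stats.Eccentricity] <= maxEcc;            % filter by size, then by shape
    blobCentroids = cat(1, stats(keep).Centroid);     % detected blob positions for frame k
end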
I want to detect edges (with sub-pixel accuracy) in images like the one displayed:
The resolution would be around 600 x 1000.
I came across a comment by Mark Ransom here which mentions edge detection algorithms for vertical edges. I haven't come across any yet. Would such an algorithm be useful in my case (since the edge isn't strictly a straight line)? It will always be a vertical edge, though. I want the result to be accurate to at least 1/100th of a pixel, and I also want access to these sub-pixel coordinate values.
I have tried "Accurate subpixel edge location" by Agustin Trujillo-Pino. But this does not give me a continuous edge.
Are there any other algorithms available? I will be using MATLAB for this.
I have attached another similar image which the algorithm has to work on:
Any inputs will be appreciated.
Thank you.
Edit:
I was wondering if I could do this:
Apply Canny/Sobel in MATLAB and get the edges of this image (note that it won't be a continuous line). Then somehow interpolate these Sobel edges and get the coordinates at sub-pixel level. Is that possible?
A simple approach would be to project your image vertically and fit the projected profile with an appropriate function.
Here is a try, with an atan shape:
% Load image
Img = double(imread('bQsu5.png'));
% Project
x = 1:size(Img,2);
y = mean(Img,1);
% Fit
f = fit(x', y', 'a+b*atan((x0-x)/w)', 'StartPoint', [150 50 10 150])  % coefficients in alphabetical order: a, b, w, x0
% Display
figure
hold on
plot(x, y);
plot(f);
legend('Projected profile', 'atan fit');
And the result:
I get x_0 = 149.6 pix for your first image.
However, I doubt you will be able to achieve a subpixel accuracy of 1/100th of pixel with those images, for several reasons:
As you can see in the profile, your whites are saturated (grey levels at 255). Because this clips the real atan profile, the fit is biased. If you have control over the experiments, I suggest you repeat them with a smaller exposure time, for instance.
There are not many points on the transition, so there is not much information on where the transition is. Typically, your resolution will be the square root of the width of the atan (or whatever shape you prefer). In your case this limits the sub-pixel resolution to 1/5th of a pixel, at best.
Finally, your edges are not strictly vertical; they are slightly tilted. If you choose to use this projection method, you should look for a way to correct this tilt before projecting in order to increase the accuracy. This won't improve your accuracy by several orders of magnitude, though.
Best,
There is a problem with your image. At pixel level, it seems like there are four interlaced subimages (odd and even rows and columns). Look at this zoomed area close to the edge.
In order to avoid this artifact, I have just taken the even rows and columns of your image and computed the subpixel edges. Finally, I look for the best-fitting straight line, using the function clsq whose code is on this page:
% Load image
url = 'http://i.stack.imgur.com/bQsu5.png';
img = imread(url);                      % renamed from "image" to avoid shadowing the built-in image()
imageEvenEven = img(1:2:end, 1:2:end);  % keep every other row and column
imshow(imageEvenEven, 'InitialMagnification', 'fit');
% Subpixel detection
threshold = 25;
edges = subpixelEdges(imageEvenEven, threshold);
visEdges(edges);
% Compute fit line
A = [ones(size(edges.x)) edges.x edges.y];
[c, n] = clsq(A, 2);
y = [1, 200];
x = -(n(2)*y + c) / n(1);
hold on;
plot(x, y, 'g');
When executing this code, you can see the green line that best approximates all the edge points. The line is given by the equation c + n(1)*x + n(2)*y = 0.
Take into account that the image has been scaled by 1/2 by taking only the even rows and columns, so the resulting coordinates must be scaled back accordingly.
Besides, you can try the other three subimages (imageEvenOdd, imageOddEven and imageOddOdd) and combine the four straight lines to obtain the best solution.
I'm looking for a quick way to combine overlapping blocks into one image. Assume the size of the full image and the coordinates of each block within the full image are known. Also assume the blocks are regularly spaced both horizontally and vertically.
The catch: in the overlapping region, a pixel in the output image should get its value from a weighted average of the corresponding pixels in the overlapping blocks, with the weights depending on the distance from each block's center (closer to a block's center means a larger weight, as in the example below).
So, for example, take a pixel location p (in full-image coordinates) in the overlapping region between blocks B1 and B2. Assume the overlap region is due to a horizontal shift only, of size h. If B1(p) and B2(p) are the values at that location as they appear in blocks B1 and B2, and d1 and d2 are the respective distances of p from the centers of B1 and B2, then in the output image O the location p will get O(p) = (h-d1)/h*B1(p) + (h-d2)/h*B2(p).
Note that generally, there can be up to 4 overlapping blocks in any region.
I'm looking for the best way to do this in Matlab. Hopefully, for any choice of distance function.
blockproc and the like can help split an image into blocks, but they only allow very basic combination of the results. imfuse comes close to what I need, but offers only simple non-weighted alpha blending. bwdist seems useful, but I haven't figured out the most efficient way to put it to use.
You should use the command im2col.
Once you have all your patches as column vectors aligned in one matrix, you'll be able to work on the columns (filtering per patch) and on the rows (filtering between patches).
It will be trickier than the classic usage of im2col, but it should work.
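Alternatively, if the im2col route feels too awkward, here is a direct accumulation sketch of the weighted average described in the question (untested; combineBlocks and distFun are illustrative names, not existing functions):

% Untested sketch: recombine equally sized overlapping blocks by a
% distance-weighted average. blocks is a cell array of patches, pos(k,:)
% holds the [row col] of block k's top-left corner, fullSize is the size
% of the output image, and distFun is any distance function, e.g. @hypot.
function O = combineBlocks(blocks, pos, fullSize, distFun)
    acc = zeros(fullSize);                        % weighted sum of block values
    wgt = zeros(fullSize);                        % sum of weights
    [bh, bw] = size(blocks{1});
    [X, Y] = meshgrid(1:bw, 1:bh);
    d = distFun(X - (bw+1)/2, Y - (bh+1)/2);      % distance from the block centre
    w = max(d(:)) - d + eps;                      % nearer to the centre -> larger weight
    for k = 1:numel(blocks)
        r = pos(k,1) : pos(k,1)+bh-1;
        c = pos(k,2) : pos(k,2)+bw-1;
        acc(r,c) = acc(r,c) + w .* blocks{k};
        wgt(r,c) = wgt(r,c) + w;
    end
    O = acc ./ wgt;                               % normalised weighted average
end

With distFun = @hypot you get the Euclidean distance from the centre; substituting another function changes the blend, which also covers the "any choice of distance function" requirement.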
Here I have attached two consecutive frames captured by a CMOS camera with an IR filter. The object, a checkerboard, was stationary while the images were captured, but the difference between the two images is nearly 31000 pixels. This could affect my result. Can you tell me what kind of noise this is and how I can remove it? Please suggest any algorithms or functions that could remove this noise.
Thank you. Sorry for my poor English.
Image 1: http://i45.tinypic.com/2wptqxl.jpg
Image 2: http://i45.tinypic.com/v8knjn.jpg
That noise appears to result from the camera sensor (Bayer-to-RGB conversion); a checkerboard pattern is still left in it.
Lossy JPEG compression also contributes a lot. You should first get access to the raw images.
For those particular images I'd first try edge detection filters (horizontal and vertical Sobel) to make a mask that selects between some median / local histogram equalization for the flat areas and some checkerboard-reducing filter for the edges. The point is that probably no single filter can do well on both the JPEG ringing artifacts and the jagged edges. Then the real question is: what other kinds of images need to be processed?
From the comments: if the corner points are to be made exact, then the solution is more likely to search for features (corner points with subpixel resolution), map one image's set of points to the other image's set of corners, and search for the best affine transformation matrix that converts one set to the other. With this matrix one can then resample the other image.
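If you go that route, a minimal MATLAB sketch could look like this (untested; it assumes the Computer Vision Toolbox, that both frames show the same checkerboard, and the file names are placeholders):

% Untested sketch: register frame b onto frame a via checkerboard corners
% and an affine transform, then resample b into a's coordinate frame.
a = imread('frame1.jpg');                             % placeholder file names
b = imread('frame2.jpg');
ptsA = detectCheckerboardPoints(a);                   % subpixel corner locations in a
ptsB = detectCheckerboardPoints(b);                   % subpixel corner locations in b
tform = fitgeotrans(ptsB, ptsA, 'affine');            % best affine mapping b -> a
bAligned = imwarp(b, tform, 'OutputView', imref2d([size(a,1) size(a,2)]));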
Fortunately, one can estimate motion vectors with subpixel resolution without brute-force searching all possible subpixel locations: when calculating a matched filter, one gets local maxima as candidates for exact matches. But that is not all. One can calculate a more precise approximation of the peak location by studying the matched-filter outputs at the nearby pixels. For an exact match the output should be symmetric; otherwise the 'energies' of the matched filter are biased towards the second-best location. (A 2nd-degree polynomial fit plus finding its maximum can work.)
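For instance, here is a minimal (untested) sketch of that parabolic refinement in one dimension, where r is a matched-filter or cross-correlation response sampled at integer positions:

% Untested sketch: refine the integer peak of r to subpixel accuracy by
% fitting a parabola through the peak sample and its two neighbours.
[~, i0] = max(r);                                   % integer peak (assumed not at the border)
y  = r(i0-1 : i0+1);
dx = (y(1) - y(3)) / (2*(y(1) - 2*y(2) + y(3)));    % vertex offset, in [-0.5, 0.5]
peakSubpixel = i0 + dx;

On a 2-D correlation surface the same fit can be applied separately along the row and the column that pass through the peak.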
Looking closely at these images, I must agree with @Aki Suihkonen.
In my view, the main noise comes from the JPEG compression, which causes sharp edges to "ring". I'd try a "de-speckle" type of filter on the images and see if this makes a difference. Some information that can help you implement this can be found at this link.
In a more quick-and-dirty fashion, you can apply one of the many standard tools. For example, given that the images are a and b:
(i) Just smooth the images with a Gaussian filter; this can reduce noise differences between the images by an order of magnitude. For example:
h=fspecial('gaussian',15,2);
a=conv2(a,h,'same');
b=conv2(b,h,'same');
(ii) Reduce Noise By Adaptive Filtering
a = wiener2(a,[5 5]);
b = wiener2(b,[5 5]);
(iii) Adjust Intensity Values Using Histogram Equalization
a = histeq(a);
b = histeq(b);
(iv) Adjust Intensity Values to a Specified Range
a = imadjust(a,[0 0.2],[0.5 1]);
b = imadjust(b,[0 0.2],[0.5 1]);
If your images are supposed to be black and white but you have captured them in grayscale, there could be differences due to noise.
You can convert the images to black and white by defining a threshold: any pixel with a value less than the threshold is assigned 0, and anything larger is assigned 1, or whatever the top of your grayscale range is (maybe 255).
Assume your image is I, its grayscale levels run from 0 to 255, and you choose a threshold of 100:
ind = find(I < 100);
I(ind) = 0;
ind = find(I >= 100);
I(ind) = 255;
Now you have a black-and-white image. Do the same for the other image, and you should see a very small difference if the camera and the subject have not moved.
Does anyone know why the pseudomedian filter is faster than the median filter?
I used medfilt2 for median filtering, and I implemented my own pseudomedian filter, which is:
b = strel('square',3);
psmedIm = (0.5*imclose(noisedIm,b)) + (0.5*imopen(noisedIm,b));
where b is a flat square structuring element and noisedIm is an image corrupted by salt-and-pepper noise.
Also, I don't understand why the image generated by the pseudomedian filter isn't denoised.
Thank you!
In terms of your speed query, I'd propose that your pseudomedian filter is faster because it doesn't involve sorting. The true median filter requires that you sort elements and find the central value, which takes a fair bit of time.
The reason your salt-and-pepper noise isn't removed is that you always keep its effects, because you're always using both the min and the max values inside the structuring element when you use imclose and imopen. Since you're just weighting each by half, if there's a white pixel, the 0.5 contribution from the max operation will bump the pixel value up, and vice versa for black pixels.
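As a concrete example with hypothetical values: at a salt pixel, imclose keeps the bright value (say 255) while imopen corrects it to the local level (say 60), so the 0.5/0.5 blend gives 0.5*255 + 0.5*60 = 157.5, which is still far too bright; pepper pixels are dragged down symmetrically.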
EDIT: Here's a quick demo I did that helps your pseudomedian behave a little more nicely with salt and pepper noise. The big difference is that it tries to use the 'best parts' of the opened and closed images rather than making them fight it out. I think it works quite well for eliminating the salt and pepper noise you used as an example.
img = imread('cameraman.tif');
img = imnoise(img, 'salt & pepper', 0.01);       % add salt-and-pepper noise
subplot(2,2,1); imshow(img);                     % noisy input
b = strel('square', 3);
closed = double(imclose(img, b));                % closing removes the pepper (dark) noise
opened = double(imopen(img, b));                 % opening removes the salt (bright) noise
subplot(2,2,2); imshow(closed,[]);
subplot(2,2,3); imshow(opened,[]);
img = double(img);
img = img + (closed - img) + (opened - img);     % apply both corrections to the original
subplot(2,2,4); imshow(img,[]);                  % denoised result
EDIT: Here's the result of running the code:
EDIT 2: Here's the underlying theory (it's not overly mathematical and based entirely on intuition!)
Salt and pepper noise appears as pure white and pure black pixels scattered randomly. The idea is that the 'closed' and 'opened' images will each eliminate one of the two halves -- either the white salt noise or the black pepper noise -- so the pixel value at that location should be corrected by one of the two operations; we just don't know which one. We do know that one of the two images, 'closed' or 'opened', is 'correct' for that pixel, because the corresponding operation should have effectively 'median-ed' that pixel correctly. Since the 'incorrect' one should have exactly the same value at that pixel (white or black) as the original image, subtracting its value has no effect. Only the 'correct' one (which differs by exactly the amount required to return the pixel to its supposedly correct value) changes anything, so the image at that pixel is adjusted by the corresponding amount. Thus, taking the noisy original image and adding both differences to it gives us something with much of the noise removed.
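To make that concrete with hypothetical numbers: at a salt pixel the original value is 255, the closed image stays at 255 and the opened image drops to the local level, say 60, so the result is 255 + (255 - 255) + (60 - 255) = 60. At a pepper pixel the original is 0, the closed image rises to, say, 60 and the opened image stays at 0, giving 0 + (60 - 0) + (0 - 0) = 60.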