I have a binary image with 4 blobs. 3 of them have an aspect ratio of more than 1, and 1 has an aspect ratio of 1. Now I want to remove the blobs whose aspect ratio is more than 1 from the binary image. How could I do this? Can someone please provide code?
Here is a link to the binary image. I want to remove the 3 blobs that have an aspect ratio of more than 1 and only keep the triangle shape.
https://www.dropbox.com/s/mngjlcsin46fgim/demo.png?dl=0
You can use regionprops for that, for example:
s=regionprops(bw,'BoundingBox','PixelIdxList');
where bw is your binary image.
Each s(i).BoundingBox is an [x, y, width, height] vector.
You can loop over s:
ar = zeros(1, numel(s));                                % aspect ratio of each blob
for i = 1:numel(s)
    ar(i) = s(i).BoundingBox(3) / s(i).BoundingBox(4);  % width / height
end
and see whether the width/height ratio ar (or however you define aspect ratio) is above 1 or not (because of noise I'd take a value of ar > 1.2). Then, for those components, you can use the pixel list s(i).PixelIdxList:
bw(vertcat(s(ar > 1.2).PixelIdxList)) = 0;
to zero these intensities...
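Putting it together, a minimal runnable sketch (assuming bw is your binary image and 1.2 is the aspect-ratio threshold mentioned above):
s = regionprops(bw, 'BoundingBox', 'PixelIdxList');
ar = zeros(1, numel(s));
for i = 1:numel(s)
    ar(i) = s(i).BoundingBox(3) / s(i).BoundingBox(4);  % width / height
end
bw(vertcat(s(ar > 1.2).PixelIdxList)) = 0;              % remove the elongated blobs
imshow(bw)                                              % only the near-square (triangle) blob remains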
I have created depth labels using a LiDAR point cloud. Originally, the labels are of size 1536 x 1536. All pixels where LiDAR points are projected contain depth values, and the remaining pixels are filled with NaNs. See mask_from_original_depth_label. I want to resize these labels to size 512 x 512 without aliasing.
I am using PIL to resize them; however, all interpolations except for nearest neighbor fail. And with nearest neighbor interpolation, it appears that the labels have aliasing. See mask_from_resized_image. How should I resize these labels without causing aliasing? Any hints would be appreciated.
I am training a CNN. Many authors have mentioned randomly cropping images from the center of the original image with a data augmentation factor of 2048. Can anyone please elaborate on what this means?
I believe you are referring to the ImageNet Classification with Deep Convolutional Neural Networks data augmentation scheme. The 2048x aspect of their data augmentation scheme goes as follows:
First all images are rescaled down to 256x256
Then for each image they take random 224x224 sized crops.
For each random 224x224 crop, they additionally augment by taking horizontal reflections of these 224x224 patches.
So my guess as to how they get to the 2048x data augmentation factor:
There are 32*32 = 1024 possible 224x224 sized image crops of a 256x256 image. To see this simply observe that 256-224=32, so we have 32 possible horizontal indices and 32 possible vertical indices for our crops.
Doing horizontal reflections of each crop doubles the size.
1024 * 2 = 2048.
The center crop aspect of your question stems from the fact that the original images are not all the same size. So what the authors did was rescale each rectangular image so that its shortest side was of size 256, and then they took the center crop from this, thereby rescaling the entire dataset to 256x256. Once they had rescaled all the images to 256x256, they could perform the above (up to) 2048x data augmentation scheme.
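To illustrate a single augmented sample, here is a minimal MATLAB sketch of the random-crop-plus-reflection step (the filename and variable names are just illustrative, not from the original work's pipeline):
img      = imread('example_256x256.png');             % hypothetical 256x256 input image
cropSize = 224;
maxOff   = 256 - cropSize;                            % 32 offsets per axis, matching the 32*32 count above
r = randi([0 maxOff - 1]);                            % random vertical offset
c = randi([0 maxOff - 1]);                            % random horizontal offset
patch = img(r+1 : r+cropSize, c+1 : c+cropSize, :);   % 224x224 crop
if rand < 0.5
    patch = flip(patch, 2);                           % horizontal reflection doubles the crop count
end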
I do binary thresholding on a 16-bit grayscale image. This helps me segment the region of interest. After binary thresholding, I would like to get the individual pixel intensities, i.e. the intensities of the original 16-bit image and not the binary values such as 0 and 65535.
How can I do this?
Find the region of interest using the binary image from the segmentation. After this, use those pixel locations in the 16-bit image for further processing.
To get an image from your original image I and a binary (logical) segmented image BW:
I2 = I.*BW;
I2 should have the original values in the ROI and 0 elsewhere. Or, to get just a list of pixels and their values, via logical indexing:
I2 = I(BW);
Alternatively, depending on what you're doing, you may want to use regionprops:
stats = regionprops(BW,I,'MeanIntensity','PixelValues');
For a BW image showing the regions of interest and a grayscale image I, this will return the mean intensity and a list of all pixel values in I for each separate region (defined as a connected area in BW).
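A small usage sketch (the loop and the fprintf formatting are just illustrative):
stats = regionprops(BW, I, 'MeanIntensity', 'PixelValues');
for k = 1:numel(stats)
    vals = stats(k).PixelValues;                 % original 16-bit intensities in region k
    fprintf('Region %d: %d pixels, mean intensity %.1f\n', ...
            k, numel(vals), stats(k).MeanIntensity);
end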
I have two images – mannequin with and without garment.
Please refer to the sample images below. Ignore the jewels and footwear on the mannequin, and imagine the second mannequin has only the dress.
I want to extract only the garment from the two images for further processing.
The complexity is that there is a slight displacement in the camera position between the two pictures. Because of this, simple subtraction to generate the garment mask will not work.
Can anyone tell me how to handle it?
I think I need to do registration between the two images so that I can extract only the garment from the image.
Any references to blogs, articles and code are highly appreciated.
Idea
This is an idea of how you could do it; I haven't tested it, but my gut tells me it might work. I'm assuming that there will be slight differences in the pose of the mannequin as well as in the camera attitude.
Let the original image be A, and the clothed image be B.
Take the difference D = |A - B|, apply a median filter that is proportional to the largest deviation you expect from pose and camera attitude error: Dmedian = Median(D, kernelsize).
Quantize Dmedian into a binary mask Dmask = Q(Dmedian, threshold) using an appropriate threshold value to obtain an approximate mask for the garment (this will be smaller than the garment itself due to the median filter). Reject any shapes in Dmask that have too small an area by setting their pixels to 0.
Expand the shape(s) in Dmask proportionally to the size of the median kernel into Emask = expand(Dmask, k*kernelsize). Then construct the difference of the masks Fmask = |Dmask - Emask|, which now contains the band of pixels where the garment edge is expected to be. For every pixel in Fmask within this band, find the correlation Cxy between A and B using a small neighbourhood, and store the correlations into an image C = 1.0 - Corr(A, B, Fmask, n).
Your final garment mask will be M=C+Dmask.
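For reference, here is a rough MATLAB sketch of the masking steps above, as I would interpret them with Image Processing Toolbox functions (assuming A and B are same-size RGB images; kernelsize, threshold, minArea and k are values you would have to tune):
D       = imabsdiff(A, B);                                      % D = |A - B|
Dmedian = medfilt2(rgb2gray(D), [kernelsize kernelsize]);       % median filter suppresses thin misalignment lines
Dmask   = Dmedian > threshold;                                  % quantize into an approximate garment mask
Dmask   = bwareaopen(Dmask, minArea);                           % reject shapes with too small an area
Emask   = imdilate(Dmask, strel('disk', round(k*kernelsize)));  % expand the mask
Fmask   = Emask & ~Dmask;                                       % band where the true garment edge should lie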
Explanation
Since your image has nice, continuous swatches of colour, the difference between the two similar images will be thin lines and small gradients where the pose and camera attitude differ. When taking a median filter of the difference image over a sufficiently large kernel, these lines will be removed because they are in a minority of the pixels.
The garment, on the other hand, will (hopefully) differ significantly from the colours in the unclothed version and will generate a bigger difference. Thresholding the difference after the median filter should give you a rough mask of the garment that is undersized due to some of the pixels on the edge being rejected because their median values are too low. You could stop here if the approximation is good enough for you.
By expanding the mask we obtained above, we get a probable region for the "true" edge. The above process has served to narrow our search region for the true edge considerably, and we can apply a more costly correlation search between the images along this edge to find where the garment is. High correlation means no garment and low correlation means garment.
We use the inverted correlation as an alpha value together with the initially smaller mask to obtain an alpha-valued mask of the garment that can be used for extracting it.
Clarification
Expand: What I mean by "expanding the mask" is to find the contour of the mask region and outsetting/growing/enlarging it to make it larger.
Corr(A, B, Fmask, n): just an arbitrarily chosen correlation function that gives the correlation between pixels in A and B selected by the mask Fmask, using a region of size n. The function returns 1.0 for a perfect match and 0.0 for an anti-match at each pixel tested. A good function is this pseudocode:
foreach px_pos in Fmask where Fmask[px_pos] == 1
    Ap = subregion(A, px_pos, n) - mean(mean(A));    % zero-mean patch from A
    Bp = subregion(B, px_pos, n) - mean(mean(B));    % zero-mean patch from B
    Cxy = sum(sum(Ap.*Bp))^2 / (sum(sum(Ap.*Ap)) * sum(sum(Bp.*Bp)));  % squared normalized cross-correlation
    C[px_pos] = 1.0 - Cxy;
end
where subregion selects a region of size n around the pixel with position px_pos.
You can see that if Ap == Bp then Cxy=1
Suppose this is the word "S I D" in an image. I have to find out the aspect ratio of all the connected components (in this example there will be 3 components).
Use bwlabel and regionprops to get the 'BoundingBox' property for each connected component.
Then you can get the aspect ratio by dividing the width by the height of the bounding box (the last 2 entries of the 4-vector describing each bounding box).
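A minimal sketch, assuming bw is the binary image containing the letters:
cc    = bwlabel(bw);                        % label the connected components
stats = regionprops(cc, 'BoundingBox');     % each BoundingBox is [x y width height]
for k = 1:numel(stats)
    bb = stats(k).BoundingBox;
    fprintf('Component %d: aspect ratio %.2f\n', k, bb(3)/bb(4));   % width / height
end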
Best of luck...