I am currently trying to build a CNN that takes in images. The label I want to predict is a pair of (i,j) coordinates. I know that the image data layer uses a file with the filename and the label in the following format:
folder/file1.jpg label
Is it possible to have a label that isn't a number but two numbers?
I'm using U-Net for image segmentation.
The model was trained on images that can contain up to 4 different classes; the training classes never overlap.
The output of the U-Net is a heatmap (with float values between 0 and 1) for each of these 4 classes.
Now, I have 2 problems:
1. For a certain class, how do I segment (draw contours) in the original image only where the heatmap has significant values? (In the image below, for example, the values in the centre are significant while the values on the left are not; if I draw the segmentation of the entire image without any additional operation, both are included.)
2. Following on from the first point, how do I avoid drawing the contours of two overlapping classes in the original image? (Perhaps by drawing only the class that has the higher values in the corresponding heatmap.)
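Both problems can be illustrated with a minimal NumPy sketch (all arrays, shapes, and the 0.5 threshold below are made up for illustration): keep only pixels whose heatmap value exceeds a threshold, and where two class masks overlap, keep the class with the higher heatmap value, as the question itself suggests.

```python
import numpy as np

# Hypothetical stack of 4 per-class heatmaps, shape (4, H, W), values in [0, 1]
heatmaps = np.zeros((4, 5, 5))
heatmaps[0, 1:3, 1:3] = 0.9   # class 0: strong response
heatmaps[1, 2:4, 2:4] = 0.6   # class 1: weaker, overlaps class 0 at (2, 2)

# Problem 1: keep only "significant" values (the threshold is an arbitrary choice)
threshold = 0.5
masks = heatmaps > threshold            # boolean mask per class

# Problem 2: where classes overlap, keep the one with the higher heatmap value
winner = np.argmax(heatmaps, axis=0)    # per-pixel winning class
final = np.zeros_like(masks)
for c in range(heatmaps.shape[0]):
    final[c] = masks[c] & (winner == c)

# The overlapping pixel (2, 2) is assigned to class 0 only (0.9 > 0.6)
print(final[0, 2, 2], final[1, 2, 2])   # True False
```

Contours can then be drawn from each `final[c]` mask instead of the raw heatmap.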
I have recognized and labeled objects in my image, which consists entirely of text. You can see the objects labeled in red in the attached image. I want to separate the objects in the second line (or further lines) from the first line and give them different colors (each line would have a different color), but I can't work out how to do that. Do you have any ideas? Thanks for all answers.
This is the part of my MATLAB code that does the labeling:
%% Label connected components
[L, Ne]=bwlabel(imagen);
%% Measure properties of image regions
propied=regionprops(L,'BoundingBox');
hold on
%% Plot Bounding Box
for n=1:size(propied,1)
rectangle('Position',propied(n).BoundingBox,'EdgeColor','r','LineWidth',2)
end
And this is the labeled image, in which all the objects on different lines have the same label (same color, red).
I think the following methods should work if the lines are not too curvy.
Find the centroids of the bounding boxes (or get them directly from regionprops), then cluster their y coordinates using k-means with k = 2.
The result is not perfect, but reasonable. You could then fit a curve to the clustered points, with outlier removal (e.g. RANSAC).
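The clustering step can be sketched in a few lines of NumPy (the centroid y-coordinates below are made up; a hand-rolled 1-D k-means with k = 2 is used so no extra toolbox is needed):

```python
import numpy as np

# Hypothetical bounding-box centroid y-coordinates: two text lines near y=20 and y=60
ys = np.array([18.0, 21.0, 19.5, 22.0, 58.0, 61.0, 59.5, 62.0])

# 1-D k-means with k = 2: initialize centers at the extremes, then iterate
centers = np.array([ys.min(), ys.max()])
for _ in range(10):
    labels = np.abs(ys[:, None] - centers[None, :]).argmin(axis=1)  # assign
    centers = np.array([ys[labels == k].mean() for k in range(2)])  # update

# Each centroid is now assigned to a text line (0 = upper line, 1 = lower line)
print(labels)   # [0 0 0 0 1 1 1 1]
```

With the line index per bounding box, each rectangle can be drawn in a per-line color.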
OR
Prepare a new image by filling in the bounding boxes.
Prepare a rectangular structuring element whose height is 1 and whose width is the width of the widest bounding box.
Perform a morphological closing of the filled image using this structuring element. This will connect the regions horizontally. Now you get a mask separating the two regions.
The resulting images were obtained using opencv (I'm not posting the code because it's too untidy. Hope the instructions are clear enough).
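Since the answer's OpenCV code isn't posted, here is a hedged NumPy-only sketch of the second recipe on a toy mask (all data is made up; the 1-by-width structuring element is applied as a dilation followed by an erosion, i.e. a horizontal morphological closing):

```python
import numpy as np

def horizontal_closing(mask, width):
    """Morphological closing with a 1-by-width structuring element."""
    pad = width // 2
    def slide(img, op, fill):
        # Sliding horizontal window of the given width, padded with `fill`
        padded = np.pad(img, ((0, 0), (pad, pad)), constant_values=fill)
        windows = [padded[:, i:i + img.shape[1]] for i in range(width)]
        return op(np.stack(windows), axis=0)
    dilated = slide(mask, np.max, 0)   # grow regions horizontally
    return slide(dilated, np.min, 1)   # shrink back: narrow gaps stay closed

# Toy "filled bounding boxes": two boxes on row 1 with a 2-pixel gap between them
mask = np.zeros((3, 10), dtype=int)
mask[1, 2:5] = 1
mask[1, 7:9] = 1

closed = horizontal_closing(mask, 3)
print(closed[1])   # [0 0 1 1 1 1 1 1 1 1] -- the gap is bridged
```

The closed mask now has one connected region per text line, which can be used to assign each original component to a line.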
I am trying to import the SUN RGB-D dataset into LMDB format so that Caffe can train for bounding box regression. I see that for the ImageNet conversion there is a file listing the filename and the class label on each row. How can I prepare the data so that an object is labeled by four point coordinates? About 10 objects are recognized in the ground-truth image, so one image should contribute around 10 * 8 values for the regression target.
I have a medical imaging matrix of size [200x200x200].
In order to display it, I am currently using the imshow3D function, an excellent tool built by Maysam Shahedi.
This tool displays the 3D image slice by slice, with mouse-based slice browsing.
In my current project, I generate an RGB image for each z-layer from the original input image. The output is a 3D color image of size [200x200x200x3] (each layer is now represented by 3 channels).
The imshow3D function works great on grayscale images. Is it possible to use it to display RGB images?
I took a look at this nice imshow3D function from MATLAB FileExchange, and it is quite straightforward to change it to allow working with a stack of RGB images.
The magic part of the function is
imshow(Img(:,:,S))
which displays slice S of the image Img. We can simply change it to show all 3 channels of slice S by using Img(:,:,S,:). The result will be of size 200-by-200-by-1-by-3, while MATLAB expects RGB images to be of size 200-by-200-by-3. Simply squeeze this array to drop the singleton dimension. This results in:
imshow(squeeze(Img(:,:,S,:)))
So to show RGB images, do a search-and-replace inside the function imshow3D, replacing all occurrences of Img(:,:,S) with squeeze(Img(:,:,S,:)), and it works!
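The dimension bookkeeping behind that fix is easy to check outside MATLAB; here is a NumPy sketch with a made-up 200x200x200x3 volume (note that plain NumPy indexing already drops the indexed axis, so the squeeze is only needed when the singleton axis is kept, as in MATLAB):

```python
import numpy as np

# Hypothetical RGB volume: 200 x 200 slices, 200 z-layers, 3 color channels
vol = np.zeros((200, 200, 200, 3))

S = 42                        # pick one z-slice
rgb_slice = vol[:, :, S, :]   # NumPy drops the indexed axis automatically
print(rgb_slice.shape)        # (200, 200, 3)

# MATLAB's Img(:,:,S,:) keeps a singleton third dimension instead, which is
# why the answer needs squeeze(); the NumPy analogue of that situation:
kept = vol[:, :, S:S + 1, :]           # shape (200, 200, 1, 3)
print(np.squeeze(kept, axis=2).shape)  # (200, 200, 3)
```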
I am trying to create a model of background from multiple images of the same size. I want to use the model to segment moving objects from the roller belt (background model), but I'm not sure how to achieve this.
My concern is how to compute a background model from the 4 images of the roller belt.
I was thinking of calculating the mean of each belt image, adding the means together, and dividing by the number of images, but then I would end up with a single value for the whole background. How can I compute the mean of each pixel across the 4 images provided?
%some example data
A{1}=imread('http://dummyimage.com/600x400/00a/fff.jpg&text=a');
A{2}=imread('http://dummyimage.com/600x400/00a/fff.jpg&text=b');
A{3}=imread('http://dummyimage.com/600x400/00a/fff.jpg&text=c');
A{4}=imread('http://dummyimage.com/600x400/00a/fff.jpg&text=d');
%make sure every image is of type double. Any uint type works as well, but requires casting back at the end.
for ix=1:numel(A),A{ix}=im2double(A{ix});end
%concatenate on ndims+1, which means the 4th dimension for color images and the 3rd dimension for grayscale images.
I=mean(cat(ndims(A{1})+1,A{:}),ndims(A{1})+1);
imshow(I);
If the images are im1, im2, im3 and im4, calculate the mean along the third dimension.
% concatenate the images to a single 3-D stack.
im = cat(3,im1,im2,im3,im4);
% find the mean of each pixel.
meanIm = mean(im,3);
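Both answers compute a per-pixel mean over the stack axis; for reference, the same operation in NumPy (toy grayscale data, made up for illustration):

```python
import numpy as np

# Four hypothetical grayscale "belt" images of the same size
im1, im2, im3, im4 = (np.full((4, 6), v, dtype=float) for v in (1.0, 2.0, 3.0, 4.0))

# Stack along a new third axis and average it away: one mean per pixel
stack = np.stack([im1, im2, im3, im4], axis=2)   # shape (4, 6, 4)
mean_im = stack.mean(axis=2)                     # shape (4, 6)

print(mean_im[0, 0])   # 2.5 = (1.0 + 2.0 + 3.0 + 4.0) / 4
```

The result keeps the original image shape, so it can be subtracted from a new frame to highlight moving objects.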