I would like to identify clusters of a nano structure.
Here is the raw image: http://i.imgur.com/PDEcW4G.png
The clusters should look like this: http://i.imgur.com/ZVPaf7p.gif
Preferred tool would be Matlab.
Background information
Overall goal is to distinguish between foreground and background structures. To reconstruct a 3D model of porous media, SEM images are taken from slices, then binarized into solid and pore, and finally stacked in the z-direction. While identifying solid is easy, the pores unfortunately show solid from the subsequent slice.
Hence, the idea is to identify which structures change between slices (i.e. solid of the particular slice) and which do not (solid of the subsequent slices). Since pixel-wise comparison is inaccurate due to (nano) drift of the structure, comparing whole clusters promises better recognition.
Suggestions and criticism regarding the overall approach are very welcome!
Image segmentation is a hard problem. Different approaches are suited well to different conditions, and I'm not entirely sure what the "optimum" segmentation you're actually after is. If you want to separate "reflecting" from "not reflecting" in the SEM, then you're right -- you're probably better off using the morphological threshold-based operations like you've said in the comment above. You can always use imopen and imclose (in Matlab) to morphologically open and close the image (i.e. connect or shrink structures).
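For the morphological route, a rough sketch might look like the following (assuming the Image Processing Toolbox; the threshold and structuring-element size are guesses you would have to tune for your SEM data):

    I = imread('PDEcW4G.png');           % the raw SEM slice
    if size(I, 3) == 3
        I = rgb2gray(I);                 % work on intensity only
    end
    BW = imbinarize(I);                  % global (Otsu) threshold: solid vs. pore

    se = strel('disk', 3);
    BW = imclose(imopen(BW, se), se);    % open then close: drop speckle, bridge small gaps

    CC    = bwconncomp(BW);              % connected components = candidate clusters
    stats = regionprops(CC, 'Area', 'Centroid');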
However, in general, no algorithm will work perfectly across a heterogeneous set of images. Your best bet -- ultimately -- is to use the nuclear option and take a machine learning approach with a large amount of training data. Looking at those images, it isn't immediately clear to me what the ideal solution should be -- and that's probably a bad sign.
Good luck!
Related
When it comes to convolutional neural networks, there are normally many papers recommending different strategies. I have heard people say that it is an absolute must to add padding to the images before a convolution, otherwise too much spatial information is lost. On the other hand, they are happy to use pooling, normally max-pooling, to reduce the size of the images. I guess the thought here is that max pooling reduces the spatial information but also reduces the sensitivity to relative positions, so it is a trade-off?
I have heard other people saying that zero-padding does not keep more information, just more empty data. This is because by adding zeros you will not get a reaction from your kernel anyway when part of the information is missing.
I can imagine that zero-padding works if you have big kernels with "scrap values" in the edges and the source of activation centered in a smaller region of the kernel?
I would be happy to read some papers about the effect of down-sampling using pooling versus not using padding, but I can't find much about it. Any good recommendations or thoughts?
Figure: spatial down-sampling using convolution versus pooling (source: ResearchGate)
Adding padding is NOT an "absolute must". Sometimes it is useful to control the size of the output so that it is not reduced by the convolution (it can also enlarge the output, depending on padding and kernel size). The only information that zero padding adds is the condition of the border (or near-border) features: pixels at the limits of the input, again depending on kernel size. (You can think of it as a "passe-partout" in a picture frame.)
Pooling is of MUCH MORE IMPORTANCE in convnets. Pooling is not exactly "down-sampling" or "losing spatial information". Consider first that the kernel calculations have been made previous to pooling, with full spatial information. Pooling reduces dimension but keeps (hopefully) the information learnt by the kernels beforehand. And, by doing so, it achieves one of the most interesting things about convnets: robustness to displacement, rotation or distortion of the input. A feature, once learnt, is still detected even if it appears in another location or with distortions. Pooling also implies learning across increasing scale, discovering (again, hopefully) hierarchical patterns at different scales. And of course, also necessary in convnets, pooling makes computation feasible as the number of layers grows.
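To make the size bookkeeping concrete, here is a tiny Matlab illustration (plain conv2 plus a hand-rolled 2x2 max pool; the output sizes are the only point, not a real network):

    A = rand(8, 8);                      % toy 8x8 feature map
    k = ones(3, 3) / 9;                  % 3x3 kernel

    validOut = conv2(A, k, 'valid');     % 6x6: no padding, border rows/cols lost
    sameOut  = conv2(A, k, 'same');      % 8x8: zero padding preserves the size

    % 2x2 max pooling with stride 2: halves each spatial dimension (8x8 -> 4x4)
    blocks = reshape(sameOut, 2, 4, 2, 4);            % split into 2x2 blocks
    pooled = squeeze(max(max(blocks, [], 1), [], 3)); % max over each block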
This question has bothered me for a while too, and I have also seen some papers mention the same issue. Here is a recent paper I found: Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation. I have not fully read the paper, but it seems to address your question. I can update this answer as soon as I fully grasp the paper.
I would like to train a conv neural network to detect the presence of hands in images.
The difficulty is that:
1/ the images will contain other objects than the hands, just like a picture of a group of people where the hands are just a small part of the image
2/ hands can have many orientations / shapes etc. (whether they are open or not, depending on the angle, etc.)
I was thinking of training the convnet on a big set of cropped hand images (+ random images without hands) and then apply the classifier on all the subsquares of my images. Is this a good approach?
Are there other examples of complex 2-class convnets / RNNs I could use for inspiration?
Thank you!
I was thinking of training the convnet on a big set of cropped hand images (+ random images without hands) and then apply the classifier on all the subsquares of my images. Is this a good approach?
Yes, I believe this would be a good approach. However, note that when you say random, you should perhaps sample it from images where "hands are most likely to appear". It really depends on your use case, and you have to tune the data set to fit what you're doing.
You would build your data set something like this:
Crop images of hands from a big image.
Sample X number of images from that same image, but not anywhere near the hand/hands.
If, however, you choose to do something like this:
Crop images of hands from a big image.
Download 1 million images (an exaggeration) that definitely don't have hands: for example, deserts, oceans, skies, caves, mountains, basically lots of scenery. If you then use these as your "random images without hands", you might get bad results.
The reason for this is that there is already an underlying distribution. I assume that most of your images would be pictures of groups of friends having a party at a house, or perhaps the backgrounds would be buildings. Hence, introducing scenery images could corrupt this distribution, assuming the above holds.
Therefore, be really careful when using "random images"!
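As a purely hypothetical sketch of the first recipe in Matlab, assuming you already have one [x y w h] bounding box per hand in a variable called handBoxes (both the variable and the patch size are illustrative assumptions):

    I = imread('group_photo.jpg');               % hypothetical source image
    patchSize = [64 64];                          % illustrative patch size

    % Positives: crop each hand box and resize to a fixed patch size
    positives = cell(1, size(handBoxes, 1));
    for b = 1:size(handBoxes, 1)
        positives{b} = imresize(imcrop(I, handBoxes(b, :)), patchSize);
    end

    % Negatives: random patches from the same image that do not touch any hand box
    negatives = {};
    while numel(negatives) < 5 * numel(positives)
        x = randi(size(I, 2) - patchSize(2));
        y = randi(size(I, 1) - patchSize(1));
        box = [x y patchSize(2) patchSize(1)];
        if ~any(rectint(box, handBoxes) > 0)      % no overlap with any hand
            negatives{end+1} = imcrop(I, box);    %#ok<AGROW>
        end
    end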
on all the subsquares of my images
As to this part of your question, you are essentially running a sliding window on the entire image. Yes, practically, it would work. But if you're looking for performance, this may not be a good idea. You might want to run some segmentation algorithms, to narrow down the search space.
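For illustration, a naive sliding-window loop in Matlab could look like the sketch below; classifyPatch is a placeholder for whatever hand / no-hand classifier you end up training, and the window size, stride and threshold are arbitrary:

    win = 64; stride = 16; thresh = 0.5;
    [h, w, ~] = size(I);
    detections = [];
    for y = 1:stride:(h - win + 1)
        for x = 1:stride:(w - win + 1)
            patch = I(y:y+win-1, x:x+win-1, :);
            score = classifyPatch(patch);             % hypothetical classifier call
            if score > thresh
                detections(end+1, :) = [x y win win]; %#ok<AGROW>
            end
        end
    end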
Are there other examples of complex 2-class convnets / RNNs I could use for inspiration?
I'm not sure what you mean by complex 2-class convnets, and I'm not familiar with RNNs, so let me focus on convnets. You can basically define the convolutional net yourself: the size of the convolutional layers, how many layers, what your max pooling method is, how big your fully connected layer is going to be, etc. The last layer is basically a softmax layer, where the net decides what class the input belongs to. If you have 2 classes, your last layer has 2 nodes. If you have 3, then 3. And so on. So it can range from 2 to perhaps even 1000. I've not heard of convnets that have more than 1000 classes, but I could be ill-informed. I hope this helps!
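If it helps, a minimal 2-class layer stack in Matlab's Deep Learning Toolbox (just one plausible configuration, not a recommendation) could look like this:

    layers = [
        imageInputLayer([64 64 3])
        convolution2dLayer(3, 16, 'Padding', 'same')
        reluLayer
        maxPooling2dLayer(2, 'Stride', 2)
        convolution2dLayer(3, 32, 'Padding', 'same')
        reluLayer
        maxPooling2dLayer(2, 'Stride', 2)
        fullyConnectedLayer(2)    % two nodes: hand / no hand
        softmaxLayer
        classificationLayer];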
This seems more a matter of finding good labeled training data than of choosing a network. A neural network can learn the difference between "pictures of hands" and "pictures which incidentally include hands", but it needs some labeled examples to figure out which category an image belongs to.
You might want to take a look at this: http://www.socher.org/index.php/Main/ParsingNaturalScenesAndNaturalLanguageWithRecursiveNeuralNetworks
I have applied two different image enhancement algorithms to a particular image and got two resultant images. Now I want to compare the quality of those two images in order to find the effectiveness of the two algorithms and pick the more appropriate one, based on a comparison of the feature vectors of the two images. So which feature vectors should I compare in this case?
I am asking in the context of comparing the texture features of the images, and which feature vector would be more suitable.
I need mathematical support for verifying the effectiveness of either algorithm based on an evaluation of the images, for example using contrast and variance. So are there any more approaches to do that?
Would a better approach be to compute some noise/signal ratio by comparing image spectra?
Slayton is right: you need a metric and a way to measure against it, which can be an academic project in itself. However, I can think of one approach straight away; I'm not sure if it makes sense for your specific task at hand:
Metric:
The sum of abs(colour difference) across all pixels. The lower it is, the more similar the images are.
Method:
For each pixel, get the absolute colour difference (or distance, to be precise) in LAB space between the original and the processed image, and sum that up. Don't ruin your day trying to understand the full Wikipedia article and coding that; this has been done before. Try re-using the methods getDistanceLabFrom(Color color) or getDistanceRgbFrom(Color color) from this PHP implementation. It worked like a charm for me when I needed a way to match the colour of pixels in a JPG picture, which is basically the same principle.
The theory behind it (as far as my limited understanding goes): it treats RGB or (better) LAB colour space as a three-dimensional space and then calculates the distance within it. That's why it works well, and why it hardly worked for me when looking at a colour code from a one-dimensional perspective.
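A minimal Matlab sketch of the same idea, assuming the Image Processing Toolbox (for rgb2lab) and two same-sized RGB images with made-up variable names:

    lab1 = rgb2lab(im2double(originalImg));      % hypothetical variable names
    lab2 = rgb2lab(im2double(processedImg));

    perPixel = sqrt(sum((lab1 - lab2).^2, 3));   % Euclidean LAB distance per pixel
    score    = sum(perPixel(:));                 % lower = more similar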
The usual way is to start with a reference image (a good one), then add some noise on it (in a controlled way).
Then, your algorithm should remove as much as possible of the added noise. The results are easy to compare with a signal-to-noise ratio (see Wikipedia).
Now, the approach is easy to apply with simple noise models, but if you aim to improve more complex appearance issues, you must devise a way to apply the noise, which is not easy.
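A bare-bones Matlab sketch of that reference-plus-controlled-noise comparison; yourEnhancementAlgorithm is a placeholder for whatever algorithm is under test, and the noise level is arbitrary:

    ref   = im2double(imread('reference.png'));  % clean reference image
    noisy = imnoise(ref, 'gaussian', 0, 0.01);   % add noise in a controlled way
    out   = yourEnhancementAlgorithm(noisy);     % placeholder for the algorithm under test

    mseVal  = mean((ref(:) - out(:)).^2);
    psnrVal = 10 * log10(1 / mseVal);            % PSNR in dB, images scaled to [0,1]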
Another, quite common way to do it is the one recommended by slayton: ask all your colleagues to rate the output of your algorithm, then average their impressions.
If you have only the 2 images and no reference (highest quality) image, then you can see my crude solution/bash script here: https://photo.stackexchange.com/questions/75995/how-do-i-compare-two-similar-images-sharpness/117823#117823
It gets the 2 filenames and outputs the higher quality filename. It assumes the content of the images is identical (same source image).
It can be fooled though.
I am currently doing a project in Matlab on liver segmentation. I used region growing for that. I need to compare the region growing method with some other method. Can you suggest a segmentation method? (It must be worse than region growing, because I need to prove mine is best.) Kindly help me out.
If your intent is to show a really bad method, try e.g. Otsu segmentation (graythresh in Matlab), which will fail on most complex images.
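For reference, that Otsu baseline is only a couple of lines (assuming a greyscale image and the Image Processing Toolbox):

    level = graythresh(I);        % Otsu's method, returns a level in [0,1]
    BW    = imbinarize(I, level); % on older releases: im2bw(I, level)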
However, setting out to find a really bad method just to show that yours is good is intellectually dishonest. Instead, you should test your method against either some state-of-the-art algorithm (would clustering of grey values be useful in your case?) or against manual segmentation, where you have somebody draw an outline onto the image.
It is possible that the other method is better. However, when you compare to manual segmentation, your method is most likely faster, and then you can discuss whether the trade-off between speed and quality is acceptable.
Can you enlarge a feature so that, rather than taking up a certain number of pixels, it takes up one or two times that many, to make it easier to analyze? Would there be a way to generalize that in MATLAB?
This sounds an awful lot like a fictitious "zoom, enhance!" procedure that you'd hear about on CSI. In general, "blowing up" a feature doesn't make it any easier to analyze, because no additional information is created when you do this. Generally you would apply other, different transformations like noise reduction to make analysis easier.
As John F has stated, you are not adding any information. In fact, with more pixels to crunch through, you are making it "harder" in the sense of requiring more processing.
You might be able to intelligently increase the resolution of an image using Compressed Sensing. It will require some work (or at least some serious thought), though, as you'll have to determine how best to sample the image you already have. There's a large number of papers referenced at Rice University Compressive Sensing Resources.
The challenge is that the image is already sampled using Nyquist-Shannon constraints. You essentially have to re-sample it using a linear basis function (with IID random elements) in such a way that the estimate is at the desired resolution and find some surrogate for the original image at that same resolution that doesn't bias the estimate.
The function imresize is useful for, well, resizing images, larger or smaller. And imcrop is useful for cropping images.
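A quick example of those two functions (the file name and crop rectangle are made up):

    I     = imread('feature.png');        % hypothetical input
    big   = imresize(I, 2);               % twice the size (bicubic by default)
    small = imresize(I, 0.5);             % half the size
    roi   = imcrop(I, [30 40 100 80]);    % crop rectangle: [xmin ymin width height]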
You might get other more useful answers if you tag the question image-processing too.