I also understand that a saliency map is a form of image segmentation.
But it has been used very widely for interpretable deep learning (see Grad-CAM etc.).
I also came across this paper (http://img.cs.uec.ac.jp/pub/conf16/161011shimok_0.pdf),
which talks about class saliency maps, something that rings a bell when it comes to image segmentation. Please tell me whether this concept exists for image segmentation or whether I need to read more on this subject.
Class saliency maps, as described in Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, describe per pixel how much changing that pixel will influence a prediction. Hence I see no reason why this could not be applied to image segmentation tasks.
The resulting images from the segmentation task and the saliency map have to be interpreted differently, however. In an image segmentation task the output is a per-pixel prediction of whether or not a pixel belongs to a class, sometimes in the form of a certainty score.
A class saliency map describes per pixel how much changing that pixel would change the score of the classifier. Or, to quote the above paper: "which pixels need to be changed the least to affect the class score the most".
Edit: Added example.
Say that a pixel gets a score of 99% for being of the class "Dog"; we can then be rather certain that this pixel actually is part of a dog. The saliency map can still show a low score for this same pixel. This means that changing this pixel slightly would not influence the prediction of that pixel belonging to the class "Dog". In my experience so far, the per-pixel class probability map and the saliency map show somewhat similar patterns, but this does not mean they are to be interpreted in the same way.
A piece of code I came across that can be applied to PyTorch models (from Nikhil Kasukurthi, not mine) can be found on GitHub.
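To make the "Dog" example concrete, here is a minimal sketch. It assumes a toy per-pixel classifier (a logistic function of pixel intensity, with made-up `weight` and `bias`) and estimates the saliency by finite differences. In a real network every output depends on many input pixels and the gradient would come from backpropagation, so this only illustrates the difference in interpretation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dog_probability(image, weight=4.0, bias=-2.0):
    """Toy per-pixel 'Dog' classifier (hypothetical): logistic function of intensity."""
    return sigmoid(weight * image + bias)

def saliency(image, eps=1e-4):
    """|d probability / d pixel|, estimated with central finite differences.

    Valid here because each output pixel depends only on its own input pixel.
    """
    up = dog_probability(image + eps)
    down = dog_probability(image - eps)
    return np.abs(up - down) / (2 * eps)

image = np.array([[0.0, 0.5, 2.0]])   # three example pixel intensities
prob = dog_probability(image)          # class probability per pixel
sal = saliency(image)                  # sensitivity of that probability per pixel

for p, s in zip(prob.ravel(), sal.ravel()):
    print(f"probability={p:.3f}  saliency={s:.3f}")
```

The last pixel gets a probability above 99% but the lowest saliency of the three, because the classifier is saturated there. The pixel at the decision boundary (probability 0.5) has the highest saliency.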
I am capturing static images of particulate biological materials on the millimeter scale, and then processing them in MATLAB. My routine is working well so far, but I am using a rudimentary calibration procedure where I include some coins in the image, automatically find them based on their size and circularity, count their pixels, and then remove them. This allows me to generate a calibration line with input "area-mm^2" and output "Area- pixels," which I then use to convert the pixel area of the particles into physical units of millimeters squared.
My question is: is there a better calibrant object that I can use, such as a stage graticule or "phantom" as some people seem to call them? Do you know where I could purchase such a thing? I can't even seem to find a possible vendor. Is there another rigorous way to approach this problem without using calibrant objects in the field of view?
Thanks in advance.
Clay
Image calibration is always done using features of known size or distance.
You could calculate the scale based on nominal specifications, but your imaging equipment will always have some production tolerances, your object distance is only known to a certain accuracy...
So it's always safer and simpler to actually calibrate your scale.
As a calibrant you can use anything that meets your requirements. If you know its size well enough and you are able to extract its dimensions in pixels properly, you can use it.
I don't know your requirements and your budget, but if you want something very precise and fancy you can use glass masks.
There are, for example, temperature-stable glass slides that are coated with chrome. Many companies produce such masks to order (IMT AG, BVM maskshop, ...). Most optics lab equipment suppliers (Edmund Optics, Newport, ...) also have such things in stock.
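Whatever calibrant is used, the calibration line described in the question can be sketched as a least-squares fit. The areas and pixel counts below are made-up placeholder values, and `fit_calibration` / `pixels_to_mm2` are hypothetical helper names, not part of any library.

```python
import numpy as np

def fit_calibration(area_mm2, area_px):
    """Fit area_px = slope * area_mm2 + intercept by least squares."""
    slope, intercept = np.polyfit(area_mm2, area_px, deg=1)
    return slope, intercept

def pixels_to_mm2(area_px, slope, intercept):
    """Invert the calibration line to convert pixel areas into mm^2."""
    return (np.asarray(area_px, dtype=float) - intercept) / slope

# Placeholder calibrants: known areas (mm^2) and measured pixel counts.
known_area_mm2 = np.array([250.0, 400.0, 550.0])
measured_area_px = np.array([25100.0, 39900.0, 55050.0])

slope, intercept = fit_calibration(known_area_mm2, measured_area_px)
particle_area_mm2 = pixels_to_mm2([1200.0, 3400.0], slope, intercept)
print(slope, intercept, particle_area_mm2)
```

With several calibrants of different sizes, the residuals of the fit also give a rough idea of how trustworthy the scale is.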
Given are two monochromatic images of the same size. Both are prealigned/anchored to one common point. Some points of the original image moved to a new position in the new image, but not in a linear fashion.
Below you see an overlay of the original (red) and the transformed image (green). What I am looking for now is a measure of how much the individual points shifted.
At first I thought of a simple average correlation of the whole matrix or some kind of phase correlation, but I was wondering whether there is a better way of doing so.
I already found that link, but it didn't help that much. Currently I am implementing this in Matlab, but that shouldn't matter, I guess.
Update, for clarity: I have hundreds of these image pairs and I want to measure, for each pair, how similar the two images are. It doesn't have to be the fanciest algorithm; it should rather be easy to implement and yield a good estimate of similarity.
An unorthodox approach uses RASL to align an image pair. A Python implementation is here: https://github.com/welch/rasl and it also provides a link to the RASL authors' original MATLAB implementation.
You can give RASL a pair of related images, and it will solve for the transformation (scaling, rotation, translation, you choose) that best overlays the pixels in the images. A transformation parameter vector is found for each image, and the difference in parameters tells how "far apart" they are (in terms of transform parameters).
This is not the intended use of RASL, which is designed to align large collections of related images while being indifferent to changes in alignment and illumination. But I just tried it out on a pair of jittered images and it worked quickly and well.
I may add a shell command that explicitly does this (I'm the author of the python implementation) if I receive encouragement :) (today, you'd need to write a few lines of python to load your images and return the resulting alignment difference).
You can try using optical flow: http://www.mathworks.com/discovery/optical-flow.html.
It is usually used to measure the movement of objects from frame T to frame T+1, but you can also use it in your case. You would get a map that tells you the "offset" by which each point in Image1 moved in Image2.
Then, if you want a metric that gives you a "distance" between the images, you can perhaps average the offset magnitudes over the map, or something similar.
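As a rough sketch of that last step, assume a dense flow field has already been computed and stored as an array of shape (H, W, 2) holding (dx, dy) per pixel. That layout and the function name `mean_displacement` are assumptions for illustration. The "distance" is then the mean offset magnitude, optionally restricted to the pixels that actually contain points.

```python
import numpy as np

def mean_displacement(flow, mask=None):
    """Average length of the per-pixel offset vectors in a dense flow field.

    flow: array of shape (H, W, 2) with (dx, dy) per pixel (assumed layout).
    mask: optional boolean (H, W) array selecting the pixels to average over.
    """
    flow = np.asarray(flow, dtype=float)
    magnitude = np.hypot(flow[..., 0], flow[..., 1])
    if mask is not None:
        magnitude = magnitude[np.asarray(mask, dtype=bool)]
    return float(magnitude.mean())

# Tiny example: only one pixel moved, by (3, 4), i.e. 5 pixels.
flow = np.zeros((4, 4, 2))
flow[1, 1] = (3.0, 4.0)
moved = np.zeros((4, 4), dtype=bool)
moved[1, 1] = True

whole_image_score = mean_displacement(flow)
points_only_score = mean_displacement(flow, mask=moved)
print(whole_image_score, points_only_score)
```

Averaging over the whole image dilutes the score when only a few points move, so restricting the average to the foreground points is probably the more useful number for comparing image pairs.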
I am looking for code or an application that can extract the salient object from a video, considering both context and motion,
or
an algorithm just for motion saliency map detection (motion contrast), so that I can fuse it with a context-aware salient object detector that I have.
I have already tested the context-aware saliency map detector, but in some frames it detects part of the background as the salient object. I want to involve motion and time in this detection so that I can extract the salient object as exactly as possible.
Can anyone help me?
One of the most popular approaches (although a bit dated) in the computer vision community is the graph-based visual saliency (GBVS) model.
It uses a graph-based method to compute visual saliency. First, the same feature maps as in the fsm model are extracted, which gives three multiscale feature maps: colors, intensity and orientations. Then, a fully connected graph is built over all grid locations of each feature map, and a weight is assigned to the edge between each pair of nodes. This weight depends on the spatial distance between the nodes and on the difference between their feature-map values. Each graph is then treated as a Markov chain to build an activation map in which nodes that are highly dissimilar to their surrounding nodes are assigned high values. Finally, all activation maps are merged into the final saliency map.
You can find Matlab source code here: http://www.vision.caltech.edu/~harel/share/gbvs.php
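The graph step can be illustrated with a small sketch for a single feature map. This is not the reference implementation linked above: the dissimilarity measure (absolute difference), the Gaussian distance falloff `sigma`, and the lazy power iteration are assumptions chosen to keep the example short. Multiscale feature extraction, normalization and map merging are left out.

```python
import numpy as np

def gbvs_activation(feature_map, sigma=2.0, n_iter=200):
    """Equilibrium distribution of a Markov chain on a fully connected pixel graph."""
    fmap = np.asarray(feature_map, dtype=float)
    h, w = fmap.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    values = fmap.ravel()

    # Edge weight = feature dissimilarity * closeness (assumed functional form).
    dissimilarity = np.abs(values[:, None] - values[None, :])
    dist2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    weights = dissimilarity * np.exp(-dist2 / (2.0 * sigma ** 2))
    weights += 1e-9  # avoids all-zero rows on flat maps

    # Row-normalise the weights into transition probabilities.
    transition = weights / weights.sum(axis=1, keepdims=True)

    # Lazy power iteration: same equilibrium, but no oscillation on
    # (nearly) periodic chains.
    state = np.full(h * w, 1.0 / (h * w))
    for _ in range(n_iter):
        state = 0.5 * state + 0.5 * (state @ transition)
    return state.reshape(h, w)

# A flat map with one outlier: the outlier should collect most of the mass.
feature_map = np.zeros((5, 5))
feature_map[2, 2] = 1.0
activation = gbvs_activation(feature_map)
print(np.round(activation, 3))
```

The graph has one node per pixel and an edge between every pair of nodes, so memory grows with the square of the pixel count. This is only practical on heavily downsampled feature maps.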
I have an image with noise. I want to remove all background variation from the image and get a plain image. My image is a retinal image and I want only the blood vessels and the retinal ring to remain. How do I do it? Image 1 is my original image and image 2 is how I want it to be.
This is my convoluted image with noise.
There are multiple approaches for blood vessel extraction in retina images.
You can find a thorough overview of different approaches in Review of Blood Vessel Extraction Techniques and Algorithms. It covers prominent works from many approaches.
As Martin mentioned, we have the Hessian-based Multiscale Vessel Enhancement Filtering by Frangi et al., which has been shown to work well for many vessel-like structures in both 2D and 3D. There is a Matlab implementation, FrangiFilter2D, that works on 2D vessel images. The overview fails to mention Frangi but covers other works that use Hessian-based methods. I would still recommend trying Frangi's vesselness approach since it is both powerful and simple.
Aside from the Hessian-based methods, I would recommend looking into morphology-based methods, since Matlab provides a good base for morphological operations. One such method is presented in An Automatic Hybrid Method for Retinal Blood Vessel Extraction. It uses a morphological approach with openings/closings together with the top-hat transform. It then complements the morphological approach with fuzzy clustering and some post-processing. I haven't tried to reproduce their method, but the results look solid and the paper is freely available online.
This is not an easy task.
Detecting the boundaries of blood vessels: try edge( I, 'canny' ) and play with the threshold parameters to see what you can get.
A more advanced option is to use this method for detecting faint curves in noisy images.
Once you have reasonably good edges of the blood vessels, you can do the segmentation using watershed / NCuts or a boundary-sensitive version of mean shift.
Some pointers:
- The blood vessels seem to have roughly the same thickness, much like text strokes. Would you consider using the Stroke Width Transform (SWT) to identify them? A mex implementation of SWT can be found here.
- If you have reasonably good boundaries, you can consider this approach for segmentation.
Good luck.
I think you'll be better served by a filter based on tubes. There is a filter available that is based on the work of a man called Frangi, and it is often dubbed the Frangi filter. This can help you identify the vasculature in the retina. The filter is already written for Matlab and a public version is available here. If you would like to read about the underlying research, search for 'Multiscale vessel enhancement' by Frangi (1998). Another group that has done work in the same field is Sato et al.
Sorry for the lack of a link for the last one; on this computer I could only find paid sites for viewing the research paper.
Hope this helps
Here is what I would do: basically, traditional image arithmetic to extract the background and then subtract it from the input image. This will give you the desired result without the background. The steps are:
1. Use a median filter with a large kernel as the first step. This will estimate the background.
2. Divide the input image by the output of step 1 (you may have to shift the denominator a little, e.g. by +1, to avoid division by zero).
3. Quantize the result to an 8-bit or n-bit integer, depending on the bit depth of the original image.
4. The output of step 3 is the background. Subtract it from the original image to get the desired result, clipping all negative values as well.
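A simplified sketch of this idea is shown below. It deliberately deviates from the steps as written: the median-filtered image is used directly as the background estimate, and the division and quantization steps are skipped. The median filter is written in plain NumPy, and the function names and kernel size are placeholders.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def estimate_background(image, kernel=7):
    """Median filter with a large (odd) square kernel and reflected borders."""
    image = np.asarray(image, dtype=float)
    pad = kernel // 2
    padded = np.pad(image, pad, mode="reflect")
    windows = sliding_window_view(padded, (kernel, kernel))
    return np.median(windows, axis=(-2, -1))

def remove_background(image, kernel=7):
    """Subtract the estimated background and clip negative values to zero."""
    image = np.asarray(image, dtype=float)
    return np.clip(image - estimate_background(image, kernel), 0.0, None)

# Synthetic test image: a slow left-to-right brightness ramp plus a thin bright line.
columns = np.arange(20, dtype=float)
image = np.tile(10.0 + columns, (20, 1))
image[:, 10] += 50.0

result = remove_background(image, kernel=7)
print(result[0].round(1))
```

The kernel has to be clearly wider than the structures you want to keep, otherwise they end up in the background estimate. For structures that are darker than their surroundings, the subtraction order would need to be swapped (background minus image), because clipping at zero would otherwise remove them.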
I'm using Itti's saliency map. Given an example image, I can get a saliency map as shown below (compare the saliency map with the color photo):
The problem is that although the algorithm pinpoints roughly where the salient object is, it fails to reliably capture the dimensions of the object. Thus, if I want my program to automatically crop out the most salient object in an image, I can only guess the dimensions based on the shape of the salient region for the object. This is pretty unreliable since the salient shape can vary greatly.
Are there more reliable methods to do this?
In order to find a better estimate of the salient objects' boundaries, I suggest you use a foreground extraction algorithm. One popular algorithm for that task is called GrabCut. It requires an initial estimate of the object boundaries. For the task at hand, the initial estimate would be the boundaries of the blobs from the saliency map.
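One way to obtain such an initial estimate is sketched below. It thresholds the saliency map relative to its maximum and takes the bounding box of everything above the threshold. The threshold value and the function name are assumptions, and for simplicity the box covers all salient pixels rather than a single blob. The (x, y, width, height) layout matches the rectangle that OpenCV's `cv2.grabCut` accepts in `cv2.GC_INIT_WITH_RECT` mode. The GrabCut call itself is not shown.

```python
import numpy as np

def saliency_bounding_box(saliency, rel_threshold=0.5):
    """Bounding box (x, y, width, height) of pixels above a relative threshold."""
    saliency = np.asarray(saliency, dtype=float)
    mask = saliency >= rel_threshold * saliency.max()
    rows = np.flatnonzero(mask.any(axis=1))   # rows containing salient pixels
    cols = np.flatnonzero(mask.any(axis=0))   # columns containing salient pixels
    x, y = int(cols[0]), int(rows[0])
    width = int(cols[-1]) - x + 1
    height = int(rows[-1]) - y + 1
    return x, y, width, height

# Synthetic saliency map with one salient blob.
saliency_map = np.zeros((10, 12))
saliency_map[3:6, 4:9] = 1.0

rect = saliency_bounding_box(saliency_map)
print(rect)
```

It is usually worth padding the rectangle by a few pixels before handing it to GrabCut, since the saliency blob tends to be smaller than the object.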