Better Sliding Window, Feature Extraction - neural-network

I am currently trying to implement a digit recognizer on a video stream that can scan handwritten digits on objects; these digits represent an inventory number. The digits are written on a homogeneous background, so there are no hard contrast changes except in the regions with digits.
First I implemented the classical sliding window approach, but then I realized it is too slow for real time. Next I tried to compute a spatial C-dimensional map of the image (C = number of classes), following the "OverFeat" approach described in this paper: https://arxiv.org/abs/1312.6229
Now I ask myself whether this is the right approach to this problem. Is there maybe a solution where I crop all image areas with big contrast changes, because obviously the written digits would be contained in those regions (a rough sketch of what I mean follows)?
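To make the cropping idea concrete, here is a rough Matlab sketch (the 0.2 threshold and the 9x9 dilation window are placeholders I would still have to tune):
frame = imread('frame.png');                            % hypothetical video frame
G = imgradient(im2double(rgb2gray(frame)));             % contrast = gradient magnitude
mask = imdilate(G > 0.2*max(G(:)), strel('square', 9)); % merge digit strokes into blobs
stats = regionprops(mask, 'BoundingBox');               % candidate digit regions
% each stats(k).BoundingBox would then be cropped and fed to the classifier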
Thank you in advance for your help and time.

Related

Compare two nonlinearly transformed (monochromatic) images

Given are two monochromatic images of the same size. Both are prealigned/anchored to one common point. Some points of the original image moved to a new position in the new image, but not in a linear fashion.
Below you see a picture of an overlay of the original (red) and transformed (green) image. What I am looking for is a measure of how much the individual points shifted.
At first I thought of a simple average correlation of the whole matrix, or some kind of phase correlation, but I was wondering whether there is a better way of doing so.
I already found that link, but it didn't help that much. I am currently implementing this in Matlab, but that shouldn't matter much, I guess.
Update, for clarity: I have hundreds of these image pairs and I want to compare how similar each pair is. It doesn't have to be the fanciest algorithm; rather something easy to implement that yields a good estimate of similarity.
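To make "easy to implement" concrete, a rough Matlab baseline could look like this (corr2 and phase correlation are standard; treating the peak height as a similarity score is my own assumption):
A = im2double(imread('orig.png'));    % hypothetical file names;
B = im2double(imread('moved.png'));   % the pair is the same size and prealigned
s = corr2(A, B);                      % global 2-D correlation coefficient
F = fft2(A) .* conj(fft2(B));         % cross-power spectrum
P = real(ifft2(F ./ (abs(F) + eps))); % phase-correlation surface
peak = max(P(:));                     % a sharper/higher peak suggests a more similar pair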
An unorthodox approach uses RASL to align an image pair. A Python implementation is here: https://github.com/welch/rasl and it also provides a link to the RASL authors' original MATLAB implementation.
You can give RASL a pair of related images, and it will solve for the transformation (scaling, rotation, translation; you choose) that best overlays the pixels in the images. A transformation parameter vector is found for each image, and the difference in parameters tells how "far apart" they are (in terms of transform parameters).
This is not the intended use of RASL, which is designed to align large collections of related images while being indifferent to changes in alignment and illumination. But I just tried it out on a pair of jittered images and it worked quickly and well.
I may add a shell command that explicitly does this (I'm the author of the Python implementation) if I receive encouragement :) (today, you'd need to write a few lines of Python to load your images and return the resulting alignment difference).
You can try using optical flow: http://www.mathworks.com/discovery/optical-flow.html
It is usually used to measure the movement of objects from frame T to frame T+1, but you can also use it in your case. You would get a map that tells you the offset by which each point in Image1 moved to reach its position in Image2.
Then, if you want a metric that gives you a "distance" between the images, you can average the values of that offset map or something similar.
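A minimal Matlab sketch of that idea, assuming the Computer Vision Toolbox is available (Farneback is just one choice of estimator):
flowObj = opticalFlowFarneback;   % dense optical flow estimator
estimateFlow(flowObj, I1);        % prime the estimator with the first (grayscale) image
flow = estimateFlow(flowObj, I2); % per-pixel offsets from Image1 to Image2
d = mean(flow.Magnitude(:));      % average shift as a single distance metric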

Find correspondence between two sets of 2D points

I have two sets of 2D Points (shown in images below).
I would like to find some high-confidence correspondences between these dots.
These dots are feature points extracted from two camera images taken from different angles. The two images are relatively well rectified, though not perfectly. However, there will be distortion/warp caused by depth in the scene, the number of points might not be the same, there might be outliers, etc.
One approach could be a sliding window that contains multiple dots combined with block matching, but that might be rather slow. I feel like there should be a relatively straightforward solution to this problem.
For example, this paper might be addressing a similar problem.
You can use each dot/point in one of the images, and search for its "neighbors" in the other image.
Just a few days ago someone asked a similar question here, and got a very sophisticated (accepted) answer:
How to calculate the nearest neighbors using weka from the command line?
But maybe your problem is common enough in image processing that there are even better solutions; still, you might try this one (implemented in Java).
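If Java is a detour, the same nearest-neighbour search is a few lines in Matlab; this sketch also keeps only mutual nearest neighbours as the "high confidence" subset (that filtering rule is my own heuristic):
% P1, P2: N1-by-2 and N2-by-2 arrays of point coordinates
[idx12, ~] = knnsearch(P2, P1);           % nearest point in P2 for each point of P1
[idx21, ~] = knnsearch(P1, P2);           % and in the other direction
mutual = idx21(idx12) == (1:size(P1,1))'; % keep pairs that agree both ways
matches = [find(mutual), idx12(mutual)];  % rows: [index into P1, index into P2]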

How to detect if an image is a texture or a pattern-based image?

I have a question regarding computer vision; it seems to be a general question, but I am wondering if you might have a clue. Is there an efficient way to distinguish texture images (or photos with repetitive patterns) from, say, realistic photos? The patterns could have exact repetitions, or just strong similarity. Given an image, I am trying to detect whether it is a texture or a pattern-based image, ideally in real time.
For instance these three are considered textures in our context:
http://www.bigchrisart.com/sites/default/files/video/TR_Texture_RockWall.jpg
http://www.colourbox.com/preview/4440275-144135-seamless-geometric-op-art-texture.jpg
Thank you!
I cannot open your first image. I applied the Fourier transform to your second one, and you can see frequency responses at specific points:
You can process the image further by extracting the local maxima of the magnitude; they share the same distance to the center (zero frequency). This can be taken as a sign of repetitive patterns.
For the case where patterns share major similarity instead of exact repetition, it is hard to tell whether the frequency magnitude still shows such an evident response; it depends on what the pattern looks like.
Another possible approach is the auto-correlation on your image.
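A small Matlab sketch of both suggestions (the peak interpretation in the comments is only a crude illustration):
I = im2double(rgb2gray(imread('texture.jpg'))); % hypothetical input file
I = I - mean(I(:));                             % drop the DC component
M = fftshift(abs(fft2(I)));                     % centered magnitude spectrum
% repetitive patterns show up as isolated off-center local maxima in M,
% often at the same distance from the center (zero frequency)
R = fftshift(real(ifft2(abs(fft2(I)).^2)));     % autocorrelation via the FFT
% strong secondary peaks in R away from the center also indicate repetition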

Remove the background of an object in an image using Matlab

I have an image with noise. I want to remove all background variation from the image and get a plain image. My image is a retinal image, and I want only the blood vessels and the retinal ring to remain. How do I do it? Image 1 is my original image and image 2 is how I want it to be.
This is my convolved image with noise.
There are multiple approaches for blood vessel extraction in retina images.
You can find a thorough overview of different approaches in Review of Blood Vessel Extraction Techniques and Algorithms. It covers prominent works of many approaches.
As Martin mentioned, there is the Hessian-based Multiscale Vessel Enhancement Filtering by Frangi et al., which has been shown to work well for many vessel-like structures, both in 2D and 3D. There is a Matlab implementation, FrangiFilter2D, that works on 2D vessel images. The overview fails to mention Frangi but covers other works that use Hessian-based methods. I would still recommend trying Frangi's vesselness approach, since it is both powerful and simple.
Aside from the Hessian-based methods, I would recommend looking into morphology-based methods, since Matlab provides a good base for morphological operations. One such method is presented in An Automatic Hybrid Method for Retinal Blood Vessel Extraction. It uses a morphological approach with openings/closings together with the top-hat transform, and then complements the morphological approach with fuzzy clustering and some post-processing. I haven't tried to reproduce their method, but the results look solid and the paper is freely available online.
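If you want to try the morphological core before committing to a full pipeline, a minimal Matlab sketch could look like this (the disk radius is a guess to tune to the vessel width; the vessels are darker than the background, so a bottom-hat highlights them):
I = rgb2gray(imread('retina.png')); % hypothetical input file
se = strel('disk', 8);              % radius roughly matching the vessel width
V = imbothat(I, se);                % dark, thin structures come out bright
BW = im2bw(V, graythresh(V));       % crude vessel mask as a starting point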
This is not an easy task.
Detecting the boundaries of blood vessels: try edge(I, 'canny') and play with the threshold parameters to see what you can get.
A more advanced option is to use this method for detecting faint curves in noisy images.
Once you have reasonably good edges of the blood vessels, you can do the segmentation using watershed/NCuts or a boundary-sensitive version of mean shift.
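For the edge step, the threshold pair and the smoothing sigma are the main knobs; a small sketch (the values are placeholders to play with):
I = rgb2gray(imread('retina.png'));   % hypothetical input file
E = edge(I, 'canny', [0.05 0.20], 2); % [low high] thresholds, sigma = 2
% lower the high threshold to keep fainter vessel boundaries,
% raise sigma to suppress more noise before the edge detection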
Some pointers:
- The blood vessels seem to have roughly the same thickness, much like text strokes. Would you consider using the Stroke Width Transform (SWT) to identify them? A mex implementation of SWT can be found here.
- If you have reasonably good boundaries, you can consider this approach for segmentation.
Good luck.
I think you'll be better served using a filter based on tubes. There is a filter based on the work of Frangi, often dubbed the Frangi filter, which can help you identify the vasculature in the retina. The filter is already written for Matlab and a public version is available here. If you would like to read the underlying research, search for 'Multiscale vessel enhancement filtering' by Frangi et al. (1998). Another group that has done work in the same field is Sato et al.
Sorry for the lack of a link for the last one; I could only find paywalled versions of the research paper on this computer.
Hope this helps
Here is what I would do: basically, traditional image arithmetic to extract the background and then subtract it from the input image. This gives you the desired result without the background. The steps are below (a sketch follows the list):
Use a median filter with a large kernel as the first step. This estimates the background.
Divide the input image by the output of step 1. You may have to shift the denominator a little (+1) to avoid division by zero.
Quantize to 8 (or n) bit integers, depending on the bit depth of the original image.
The output of step 3 above represents the background. Subtract it from the original image to get the desired result; this also clips all the negative values.
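A minimal Matlab sketch of those steps as I read them (the 61x61 kernel is a placeholder; it just has to be much larger than the structures you want to keep):
I = rgb2gray(imread('retina.png'));     % hypothetical input file
bg = medfilt2(I, [61 61], 'symmetric'); % step 1: large median filter = background estimate
flat = double(I) ./ (double(bg) + 1);   % step 2: divide, +1 guards against division by zero
bg8 = im2uint8(mat2gray(flat));         % step 3: quantize back to 8 bit
out = imsubtract(I, bg8);               % step 4: subtract; uint8 arithmetic clips negatives to 0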

Matlab video processing of heart beating, code supplemented

I'm trying to write code that helps me in my biology work.
The idea is to analyze a video file of contracting cells in a tissue
Example 1
Example 2: youtube.com/watch?v=uG_WOdGw6Rk
And plot out the following:
Count of beats per min.
Strength of beat
Regularity of beating
So I wrote Matlab code that loops through a video, compares each frame with the one that follows it, checks whether anything changed between the frames, and plots these changes as a curve.
Example of my code's results
Core of the current code I wrote:
amp = [];                                    % differences between consecutive frames
time = [];                                   % matching time stamps in seconds
first = rgb2gray(read(vidObj, 1));
vid = im2bw(first, graythresh(first));       % binarized first frame as the initial reference
for i = 2:totalframes
    compared = read(vidObj, i);
    ref = rgb2gray(compared);                % convert to gray
    level = graythresh(ref);                 % compute Otsu threshold for this frame
    compared = im2bw(ref, level);            % convert to binary
    differ = sum(sum(imabsdiff(vid, compared))); % summed difference between the 2 frames
    if (differ ~= 0) && ~any(amp == differ)  % 0 = no change happened, so don't record that
        amp(end+1) = differ;                 % save difference to array amp
        time(end+1) = i/framerate;           % save time in seconds to a parallel array, so both can be filtered later
    end
    vid = compared;                          % current frame becomes the reference for the next frame
end
figure, plot(time, amp);                     % time on the x-axis, amplitude on the y-axis
=====================
So that's my code, but is there a way I can improve it to get better results?
I get the feeling that imabsdiff is not exactly what I should use, because my video contains a lot of noise that strongly affects my results, and I think all my amp data is actually spurious!
Also, right now I can only extract the beating rate from this by counting peaks. How can I improve my code so it can produce all of the required data?
Thanks, I really appreciate your help and time. This is a small portion of the code; if you need more info, please let me know.
You say you are trying to write a "simple code", but this is not really a simple problem. If you want to measure the motion accurately, you should use an optical flow algorithm or look at the deformation field from a registration algorithm.
EDIT: As Matt says, and as we see from your curve, your method is suitable for extracting the number of beats and the regularity. To accurately find the strength of the beats, however, you need to calculate the movement of the cells (more movement = stronger beat). Unfortunately, this is not straightforward, and that is why I gave you links to two algorithms that can calculate the movement for you.
A few fairly simple things to try that might help:
I would look in detail at what your thresholding is doing, and whether that's really what you want to do. I don't know what graythresh does exactly, but it's possible it's lumping different features that you would want to distinguish into the same pixel values. Have you tried plotting the differences between images without thresholding? Or you could threshold into multiple classes, rather than just black and white.
If noise is the main problem, you could try smoothing the images before taking the difference, so that differences due to noise are evened out but differences in large features, caused by motion, remain (see the sketch at the end of this answer).
You could try edge-detecting your images before taking the difference.
As a previous answerer mentioned, you could also look into motion-tracking and registration algorithms, which would estimate the actual motion between each image, rather than just telling you whether the images are different or not. I think this is a decent summary on Wikipedia: http://en.wikipedia.org/wiki/Video_tracking. But they can be rather complicated.
I think if all you need is to find the time and period of contractions, though, then you wouldn't necessarily need to do a detailed motion tracking or deformable registration between images. All you need to know is when they change significantly. (The "strength" of a contraction is another matter, to define that rigorously you probably would need to know the actual motion going on.)
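A minimal sketch of that smoothing idea dropped into your existing loop (kernel size and sigma are placeholders to tune):
h = fspecial('gaussian', [9 9], 2);              % Gaussian smoothing kernel
prev = imfilter(rgb2gray(read(vidObj, i-1)), h); % blurred previous frame
curr = imfilter(rgb2gray(read(vidObj, i)), h);   % blurred current frame
differ = sum(sum(imabsdiff(prev, curr)));        % noise evens out, motion survives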
What are the structures we see in the video? For example, what is the big dark object in the lower part of the image? This object would be relatively easy to track, but would data from it be relevant to cell contraction?
Is this image from a light microscope? At what magnification? What is the scale?
From the video it looks like there are several motions and regions of motion. So should you focus on a smaller or a larger area for your measurements? Per-cell contraction or region contraction? From experience I know that changing what you do at the microscope can be much better than complex image processing ;)
I had success with Gunn and Nixon's dual snake for a similar problem:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.64.6831
I placed the first approximation in the first frame by hand and used the segmentation result as the starting curve for the next frame, and so on. My implementation is from 2000 and I only have it on paper, but if you find Gunn and Nixon's paper interesting I can probably find my code and scan it.
Matt suggested smoothing and edge detection to improve your results. This is good advice. You can combine smoothing, thresholding and edge detection in one function call: the Canny edge detector. Then you can dilate the edges to get greater overlap between frames. Little overlap will probably mean a big movement between frames. You can use this the same way as before to find the beat. You can then make a second pass and add up all the dilated edge images belonging to one beat. This should give you an idea of the area traced out by the cells as they move through a contraction, which might be a useful measure for the contraction of a large cluster of cells.
I don't have access to Matlab and the Image Processing Toolbox right now, so I can't give you tested code. Here are some hints: http://www.mathworks.se/help/toolbox/images/ref/edge.html , http://www.mathworks.se/help/toolbox/images/ref/imdilate.html and http://www.mathworks.se/help/toolbox/images/ref/imadd.html.
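Untested, but the shape of it would be roughly this (the structuring-element size is a guess; g1 and g2 stand for consecutive grayscale frames):
acc = zeros(size(g1), 'uint8');                  % accumulator for the dilated edges of one beat
se = strel('disk', 3);                           % dilation radius to tune
e1 = imdilate(edge(g1, 'canny'), se);            % dilated edges of frame i
e2 = imdilate(edge(g2, 'canny'), se);            % dilated edges of frame i+1
overlap = sum(sum(e1 & e2)) / sum(sum(e1 | e2)); % little overlap = big movement
acc = imadd(acc, uint8(e1));                     % running sum traces the area swept by the cells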