I am studying a fingerprint recognition system.
There are many algorithms to choose from.
Described in my own words, one algorithm (specifically a minutiae-based approach on binarized images) consists of the following steps:
STEP 1. Segmentation. This step separates the foreground from the background, most often with thresholding.
STEP 2. Normalization. This step redistributes the intensity, most often with histogram equalization.
STEP 3. Filtering. This step fills gaps along the ridges and enhances the contrast between ridges and valleys, most often with a Gabor filter.
STEP 4. Binarization. This step binarizes the filtered image.
STEP 5. Thinning. This step skeletonizes the binarized image.
STEP 6. Minutiae Extraction. This step extracts minutiae (ridge endings and ridge bifurcations) from the skeletonized image.
STEP 7. Minutiae Matching. This step matches the extracted minutiae of the input against the stored minutiae template.
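For the steps I do understand, like STEP 4 and STEP 5, I imagine the MATLAB code would be something like this (just my own sketch; the function choices, and the variable enhanced standing for the output of STEP 3, are my assumptions):
% Hypothetical sketch of STEP 4 and STEP 5 (assumes Image Processing Toolbox)
bw = imbinarize(enhanced);        % STEP 4: threshold the filtered image
skel = bwmorph(bw, 'thin', Inf);  % STEP 5: thin ridges to a 1-pixel skeleton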
I still do not understand STEP 3, especially the part about the Gabor filter.
I really need a step-by-step explanation of the Gabor filter.
Can you help me?
The Gabor filter is in fact a bank of filters. Each filter has a modulated Gaussian function as its convolution kernel; the difference between the filters is their orientation. See an example here (page 17), here (section 2.2.2), or here. Each filter is applied to the image, and the maximum response is kept.
Because each filter can have an elongated shape with a specific direction, and because you keep the maximum response, the Gabor filter will:
Find (thin) oriented patterns, like lines, edges, etc. (see page 17 too).
Reconnect broken lines: the middle of the kernel will sit in the gap, but the filter with the right orientation will have both extremities touching the line segments. You can perform a similar operation with mathematical morphology (oriented opening/closing).
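As an illustration, here is a minimal MATLAB sketch of such a filter bank applied this way (it assumes the Image Processing Toolbox; the wavelength and the eight orientations are example values, not tuned for any particular image):
% Build a bank of Gabor filters differing only in orientation,
% apply all of them, and keep the maximum response per pixel.
I = im2double(imread('fingerprint.png'));  % hypothetical input image
wavelength = 8;                            % assumed ridge period in pixels
orientations = 0:22.5:157.5;               % 8 evenly spaced orientations
g = gabor(wavelength, orientations);       % the filter bank
mag = imgaborfilt(I, g);                   % H x W x 8 stack of magnitudes
enhanced = max(mag, [], 3);                % maximum response across filters
imshow(enhanced, []);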
Here are other examples:
link.
link.
You should read this.
Is that not what you want to do?
I used the script below to reconstruct an image of the Shepp-Logan phantom.
Basically, it simply uses radon to get the sinogram and iradon to transform it back.
However, I found that a very obvious moiré pattern appears when I adjust the contrast. This is even more obvious if I use my CT image dataset.
Can anyone help me understand this? Thanks!
img = phantom(512)*1000;                % Shepp-Logan phantom, scaled
views = 576;                            % number of projection angles
angles = 0:180/views:180-180/views;     % equally spaced over [0, 180)
sino = radon(img, angles);              % forward projection (sinogram)
img_rec = iradon(sino, angles);         % filtered back-projection
imshow(img_rec, []);
Full image after contrast adjustment:
Regions with an obvious moiré pattern:
This may be happening because of a few factors:
From the MATLAB documentation, iradon uses the 'Ram-Lak' filter (known as the ramp filter) by default and does not apply any windowing to de-emphasize noise at high frequencies. You stated "This is even more obvious if I use my CT image dataset"; that is because those images contain real noise. The documentation itself advises using some windowing:
"Because this filter is sensitive to noise in the projections, one of the filters listed below might be preferable. These filters multiply the Ram-Lak filter by a window that de-emphasizes higher frequencies."
Another inconvenience is related to the projector itself. The built-in functions radon and iradon in MATLAB do not take into account the detector size or the length of the x-ray path crossing each pixel. These functions are just pixel-driven methods, i.e., they basically project the pixels geometrically onto the detector and interpolate.
Possible solutions:
There are more sophisticated projectors today, such as [1] and [2]. As I stated here, I implemented the distance-driven projector for 2D Computed Tomography (CT) and 3D Digital Breast Tomosynthesis (DBT), so feel free to use it for your experiments.
For example, I generated 3600 equally spaced projections of the phantom with the distance-driven method, and reconstructed it with the iradon function using this line of code:
slice = iradon(sinogram', rad2deg(geo.theta));  % geo.theta holds the angles in radians, hence rad2deg
I am trying to solve this problem using the Monte-Carlo flooding algorithm. As a result I receive a set of semicircles (pictured below), but the requested solution calls for trapezoid-like polygons. Can you please suggest an algorithm by which I can transform these semicircles into polygons?
First, extract your pieces as contours using the Suzuki-Abe algorithm (Suzuki, S. and Abe, K., Topological Structural Analysis of Digitized Binary Images by Border Following. CVGIP 30(1), pp. 32-46 (1985)). You'll get all the contours in your image as it traces them.
Then, approximate the contours into polygons using the Ramer-Douglas-Peucker algorithm.
There is a well-known library that does it all, OpenCV; see this link for details: https://docs.opencv.org/2.4.13.2/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html
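If you are working in MATLAB instead, here is a rough sketch of the same two-step idea (bwboundaries does the border following, by Moore-neighbour tracing rather than Suzuki-Abe but with the same end result of one contour per piece, and reducepoly is a Ramer-Douglas-Peucker type simplification; reducepoly needs Image Processing Toolbox R2019b or newer, and the mask name and tolerance are just examples):
bw = myimage > 0;                       % hypothetical binary mask of the pieces
contours = bwboundaries(bw);            % one [row col] contour per piece
tol = 0.02;                             % example simplification tolerance (0 to 1)
polys = cellfun(@(b) reducepoly(fliplr(b), tol), contours, 'UniformOutput', false);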
I am currently looking into an image processing project and wondering how to obtain the low-, middle- and high-frequency components of an image? For example, as this picture shows (I got it from googling without a detailed description of how it was obtained, but presumably some filtering was used).
Also, I came across this post about using the discrete cosine transform (DCT), which can help us get the low- and high-frequency components of an image. Just wondering how to use the DCT to get the middle-frequency component?
Link to the DCT post.
I also have only very basic knowledge about filtering. I think there are also Gaussian high/low-pass filters available, and wavelet-based filtering as well. Just wondering what the differences between Gaussian, wavelet and DCT-based filtering are? Which one should I use?
Typical steps would be:
use a Fourier transform to bring the image into the frequency domain
apply filtering by zeroing out areas of the FFT image
reverse the Fourier transform to bring the image back to the spatial domain (a sketch follows below)
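A minimal MATLAB sketch of these three steps (the cutoff radii of 20 and 60 are arbitrary example values; choose them according to what counts as "low", "mid" and "high" for your image):
I = im2double(imread('cameraman.tif'));
F = fftshift(fft2(I));                             % step 1: FFT, DC moved to center
[h, w] = size(I);
[x, y] = meshgrid(1:w, 1:h);
r = hypot(x - w/2, y - h/2);                       % distance of each bin from DC
low  = ifft2(ifftshift(F .* (r < 20)));            % steps 2+3: keep low frequencies
mid  = ifft2(ifftshift(F .* (r >= 20 & r < 60)));  % band-pass for mid frequencies
high = ifft2(ifftshift(F .* (r >= 60)));           % keep high frequencies
montage({mat2gray(real(low)), mat2gray(real(mid)), mat2gray(real(high))});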
This is a really good example of high/low/mid-pass filters in the frequency domain: http://paulbourke.net/miscellaneous/imagefilter/
You will want to use MATLAB's built-in fft (fast Fourier transform) function, or fft2 for 2D images. Fourier transforms are an extremely powerful way to filter frequencies. http://www.mathworks.com/help/matlab/ref/fft.html has some great examples of how to use fft. Once you find the frequencies that make up the image, you can remove the undesired ones and then inverse Fourier transform to obtain the filtered image.
First of all, this is my first question here, so I hope I can explain it clearly.
My goal is to detect different classes of traffic signs in images. For that purpose I have trained binary SVMs following these steps:
First I got a database of cropped traffic signs like the one in the link below. I considered different classes (prohibition, danger, etc.) and negative images. All of them were scaled to 40x40 pixels.
http://i.imgur.com/Hm9YyZT.jpg
I trained linear-SVM models for each class (1-vs-all), using HOG as the feature. Each image is described by a 1728-dimensional feature vector (I concatenate the feature vectors of the three image planes). I did cross-validation to set the parameter C, and tested on previously unseen 40x40 images, getting very accurate results (F1 score over 0.9 for all classes). I used libsvm for training and testing.
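For reference, the training stage looks roughly like this with libsvm's MATLAB interface (feats would be my N x 1728 feature matrix and labels the +1/-1 vector for one class; both names and the C grid are illustrative):
for logC = -3:3                                      % grid search for C
    acc = svmtrain(labels, feats, sprintf('-t 0 -v 5 -c %g', 2^logC));  % 5-fold CV accuracy
end
model = svmtrain(labels, feats, '-t 0 -c 1');        % retrain with the chosen C (1 as an example)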
Now I want to detect signs in full road images by sliding a window over different image scales. The problem I'm facing is that I couldn't find any function that does this for me (like detectMultiScale in OpenCV), and my solution is very slow and rudimentary (I'm just doing a triple for loop, and for each scale I crop consecutive, overlapping 40x40 windows, obtain HOG features and apply svmpredict to each one).
Can someone give me a clue about a faster way to do it? I also thought about computing the HOG feature vector of the whole input image and then reordering that vector into a matrix where each row holds the features of one 40x40 window, but I couldn't find a straightforward way of doing it. A sketch of my current approach is below.
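This is roughly what my slow version looks like (names are illustrative; in my real code the HOG descriptor is built from all three image planes):
scales = [1 0.8 0.64 0.5];                  % example scale pyramid
win = 40; step = 10;                        % window size and stride
for s = scales
    im = imresize(gray, s);                 % gray: the input road image
    for r = 1:step:size(im,1)-win+1
        for c = 1:step:size(im,2)-win+1
            patch = im(r:r+win-1, c:c+win-1);
            f = extractHOGFeatures(patch);            % HOG of this window
            lbl = svmpredict(0, double(f), model);    % one libsvm call per window
        end
    end
end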
Thanks,
I would suggest using SURF feature detection; however, I don't know if this would also be too slow for your needs.
See http://morf.lv/modules.php?name=tutorials&lasit=2 for more information on how to implement it and whether it is a viable solution for you.
I was reading up on the DWT for the first time, and the document stated that it is used to represent the time-frequency content of a signal, which other transforms do not provide.
But when I look for a usage example of the DWT in MATLAB, I see the following code:
X = imread('cameraman.tif');
X = im2double(X);
[F1,F2] = wfilters('db1', 'd');
[LL,LH,HL,HH] = dwt2(X,'db1');
I am unable to understand the implementation of dwt2, or rather what it is and when and where to use it. What does dwt2 actually return, and what does the above code do?
The first two statements simply read in the image and convert it so that its dynamic range lies within [0,1], via im2double.
Now, the third statement, wfilters, constructs the wavelet filter banks for you. These filter banks are what the DWT uses. The procedure of the DWT is always the same, but you can use different kinds of filters to achieve specific results.
Basically, with wfilters you choose what kind of wavelet you want (in your case, you chose db1: Daubechies), and you can optionally specify which filters you want returned. Different filters provide different results and have different characteristics. There are a lot of different wavelet filter banks you could use, and I'm not enough of an expert to list the advantages and disadvantages of each one. Traditionally, Daubechies-type filters are used, so stick with those if you don't know which ones to pick.
Not specifying the type will output both the decomposition and the reconstruction filters. Decomposition is the forward transformation where you are given the original image / 2D data and want to transform it using the DWT. Reconstruction is the reverse transformation where you are given the transform data and want to recreate the original data.
The fourth statement, dwt2, computes the 2D DWT for you, but we will get into that later.
You specified the flag 'd' in wfilters, so you get only the decomposition filters. You can feed the output of wfilters into the 2D DWT if you wish, as this explicitly specifies the low-pass and high-pass filters used to decompose your image. You don't have to do it like this, though: you can simply give dwt2 the wavelet name and let it construct the filters internally, which is essentially what your code is doing. In other words, you can do this:
[F1,F2] = wfilters('db1', 'd');
[LL,LH,HL,HH] = dwt2(X,F1,F2);
... or you can just do this:
[LL,LH,HL,HH] = dwt2(X,'db1');
The two calls are equivalent. Note that dwt2 itself takes no 'd' flag: it always computes the forward (decomposition) transform, so the decomposition filters are all it needs.
Now, dwt2 is the 2D DWT (Discrete Wavelet Transform). I won't go into the DWT in detail here because this isn't the place to talk about it, but I would definitely check out this link for better details. They also have fully working MATLAB code and their own implementation of the 2D DWT so you can fully understand what exactly the DWT is and how it's computed.
However, the basic idea behind the 2D DWT is that it is a multi-resolution transform: it decomposes your signal into multiple scales / sizes, and each scale carries a bunch of features that describe something about the signal not seen at the other scales.
One thing to know about the DWT is that it naturally subsamples your image by a factor of 2 (i.e., it halves each dimension) once the analysis is done; hence the multi-resolution bit I was talking about. In MATLAB, dwt2 outputs four subbands, which correspond to the output variable names in your code:
LL - Low-Low. This means that the vertical direction of your 2D image / signal is low-pass filtered as well as the horizontal direction.
LH - Low-High. This means that the vertical direction of your 2D image / signal is low-pass filtered while the horizontal direction is high-pass filtered.
HL - High-Low. This means that the vertical direction of your 2D image / signal is high-pass filtered while the horizontal direction is low-pass filtered.
HH - High-High. This means that the vertical direction of your 2D image / signal is high-pass filtered as well as the horizontal direction.
Roughly speaking, LL corresponds to the structural / predominant information of your image, while HH corresponds to the edges. I'm not too familiar with the LH and HL components, but they're sometimes used in feature analysis. Note that dwt2 performs only a single-level decomposition: if you want to decompose further, you apply the DWT again, on the LL component only. Depending on your analysis, though, the other components can also be used; it just depends on what you want to do with them!
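Concretely, a second level is just another dwt2 call on the LL subband (wavedec2 wraps this up if you want several levels in one call):
[LL1,LH1,HL1,HH1] = dwt2(X,'db1');    % level 1: four half-size subbands
[LL2,LH2,HL2,HH2] = dwt2(LL1,'db1');  % level 2, computed from the level-1 LL
[C,S] = wavedec2(X,2,'db1');          % the same two levels in one call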
Applications
Now, for your specific question about applications. The DWT for images is mostly used in image compression and image analysis. One application of the 2D DWT is in JPEG 2000. The core of the algorithm is that the image is broken down into its DWT components, and trees of the coefficients generated by the DWT are built to determine which components can be omitted before the image is saved. This way, you eliminate extraneous information, and there is the great benefit that the compression can be lossless: the lossless path of JPEG 2000 uses the reversible LeGall 5/3 wavelet, which means you can reconstruct the original data without any artifacts or quantization errors. JPEG 2000 also has a lossy option (built on the CDF 9/7 wavelet), where you can reduce the file size even more by discarding more of the DWT coefficients in a way that is imperceptible to the average user.
Another application is in watermarking images. You can embed information in the wavelet coefficients to deter people from stealing your images without acknowledgement. The DWT is also heavily used in medical image analysis and compression, as the images generated in this domain are high resolution and quite large. It is extremely useful to be able to represent the images just as faithfully while occupying less physical space than the standard image compression algorithms allow (which are also lossy if you want high compression ratios).
One more application I can think of would be the dynamic delivery of video content over networks. Depending on what your connection speed is or the resolution of your screen, you get a lower or higher quality video. If you specifically use the LL component of each frame, you would stream / use a particular version of the LL component depending on what device / connection you have. So if you had a bad connection or if your screen has a low resolution, you would most likely show the video with the smallest size. You would then keep increasing the resolution depending on the connection speed and/or the size of your screen.
This is just a taste of what the DWT is used for (personally, I don't use it, as the DWT belongs to domains I have no experience in), but there are many more quite useful applications where it appears.