How can I generate quantization matrices with different sizes and quality? Is there a function in MATLAB for this?
Please explain your context: a quantization matrix for what? If you are dealing with JPEG image compression (image blocks + DCT + quantization + Huffman coding), the compressor is free to use its own quantization matrix, or rather a family of matrices, one for each "quality factor".
Conceptually, one usually wants to assign many bits to the low-frequency components and few to the high frequencies, but that's about all that can be said in general.
Also, be aware that JPEG compresses luminance and chroma separately (with chroma usually subsampled), so one can use different matrices for each.
I believe the standard suggests a typical example matrix, including a scaling factor for different qualities, but this is not required at all. You can also find (by googling!) many matrices used by different cameras and image applications.
Update: From here:
Tuning the quantization tables for best results is something of a black art, and is an active research area. Most existing encoders use simple linear scaling of the example tables given in the JPEG standard, using a single user-specified "quality" setting to determine the scaling multiplier. This works fairly well for midrange qualities (not too far from the sample tables themselves) but is quite nonoptimal at very high or low quality settings.
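For what it's worth, the linear scaling the excerpt describes is easy to reproduce. Below is a minimal Python sketch (not MATLAB) of the IJG-style quality convention used by libjpeg, assuming the example luminance table from Annex K of the JPEG standard; the function name is mine:

```python
# Example luminance quantization table from Annex K of the JPEG standard.
BASE_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def scale_quant_table(base, quality):
    """IJG-style linear scaling for a quality factor in 1..100.
    quality 50 leaves the base table unchanged; higher quality means
    smaller quantization steps (more bits), lower quality means larger."""
    quality = min(max(quality, 1), 100)
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Scale each entry, rounding, and clamp to the valid 1..255 range.
    return [[min(max((q * scale + 50) // 100, 1), 255) for q in row]
            for row in base]

q50 = scale_quant_table(BASE_LUMA, 50)  # identical to the base table
q90 = scale_quant_table(BASE_LUMA, 90)  # much finer quantization steps
```

As the quoted text says, this simple scaling works well only in the midrange; it is not an optimal family of matrices.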
During unsupervised learning we do cluster analysis (like k-means) to bin the data into a number of clusters.
But what is the use of this clustered data in a practical scenario?
I think during clustering we are losing information about the data.
Are there some practical examples where clustering could be beneficial?
The information loss can be intentional. Here are three examples:
PCM signal quantization (Lloyd's k-means publication). You know that a certain number (say 10) of different signals are transmitted, but with distortion. Quantizing removes the distortion and re-extracts the original 10 different signals. Here, you lose the error and keep the signal.
Color quantization (see Wikipedia). To reduce the number of colors in an image, a quite nice method uses k-means (usually in HSV or Lab space). k is the number of desired output colors. Information loss here is intentional, to better compress the image. k-means attempts to find the least-squared-error approximation of the image with just k colors.
When searching motifs in time series, you can also use quantization such as k-means to transform your data into a symbolic representation. The bag-of-visual-words approach that was the state of the art for image recognition prior to deep learning also used this.
Explorative data mining (clustering; one may argue that the above use cases are not data mining / clustering, but quantization). If you have a data set of a million points, which points are you going to investigate? Clustering methods try to split the data into groups that are supposed to be more homogeneous within and more different from one another. Then you don't have to look at every object, but only at a few from each cluster, to hopefully learn something about the whole cluster (and your whole data set). Centroid methods such as k-means can even provide a "prototype" for each cluster, although it is a good idea to also look at other points within the cluster. You may also want to do outlier detection and look at some of the unusual objects. This scenario is somewhere in between sampling representative objects and reducing the data set size to become more manageable. The key difference from the above points is that the result is usually not "operationalized" automatically; because explorative clustering results are too unreliable (and thus require many iterations), they need to be analyzed manually.
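To make the PCM example above concrete, here is a hedged pure-Python sketch of Lloyd's algorithm on 1-D data (the function names are mine; centroids are seeded deterministically from order statistics rather than at random):

```python
import random

def kmeans_1d(xs, k, iters=50):
    """Plain Lloyd's algorithm on 1-D data; returns k sorted centroids.
    Starting centroids are evenly spaced order statistics, so the
    result is deterministic for a given input."""
    xs_sorted = sorted(xs)
    centroids = [xs_sorted[i * (len(xs) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        buckets = [[] for _ in range(k)]
        for x in xs:
            i = min(range(k), key=lambda j: abs(x - centroids[j]))
            buckets[i].append(x)
        # Update step: move each centroid to the mean of its bucket.
        centroids = [sum(b) / len(b) if b else centroids[i]
                     for i, b in enumerate(buckets)]
    return sorted(centroids)

# Two noisy "signal levels" around 0.0 and 5.0, as in Lloyd's PCM example:
# quantization throws away the noise and keeps the two levels.
rng = random.Random(1)
data = ([rng.gauss(0.0, 0.1) for _ in range(100)] +
        [rng.gauss(5.0, 0.1) for _ in range(100)])
levels = kmeans_1d(data, 2)
```

The same loop, run on pixel colors instead of scalar levels, is the color-quantization use case from the second example.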
I was reading this particular paper http://www.robots.ox.ac.uk/~vgg/publications/2011/Chatfield11/chatfield11.pdf and I find the Fisher Vector with GMM vocabulary approach very interesting and I would like to test it myself.
However, it is totally unclear (to me) how do they apply PCA dimensionality reduction on the data. I mean, do they calculate Feature Space and once it is calculated they perform PCA on it? Or do they just perform PCA on every image after SIFT is calculated and then they create feature space?
Is this supposed to be done for both training and test sets? To me the answer is 'obviously yes', however it is not clear.
I was thinking of creating the feature space from training set and then run PCA on it. Then, I could use that PCA coefficient from training set to reduce each image's sift descriptor that is going to be encoded into Fisher Vector for later classification, whether it is a test or a train image.
EDIT 1:
Simplistic example:
[coef , reduced_feat_space]= pca(Feat_Space','NumComponents', 80);
and then (for both test and train images)
reduced_test_img = test_img * coef; (And then choose the first 80 dimensions of the reduced_test_img)
What do you think? Cheers
It looks to me like they do SIFT first and then do PCA. The article states in section 2.1: "The local descriptors are fixed in all experiments to be SIFT descriptors..."
It also says in the introduction: "the following three steps: (i) extraction of local image features (e.g., SIFT descriptors), (ii) encoding of the local features in an image descriptor (e.g., a histogram of the quantized local features), and (iii) classification ... Recently several authors have focused on improving the second component". So it looks to me like the dimensionality reduction occurs after SIFT, and the paper is simply talking about a few different methods of doing this and the performance of each.
I would also guess (as you did) that you would have to run it on both sets of images. Otherwise you would be using two different metrics to classify the images; it really is like comparing apples to oranges. Comparing a reduced-dimensional representation to the full one (even for the exact same image) will show some variation. In fact, that is the whole premise of PCA: you are giving up some smaller features (usually) for computational efficiency. The real question with PCA, or any dimensionality reduction algorithm, is how much information you can give up and still reliably classify/segment different data sets.
And as a last point, you would have to treat both sets of images the same way, because your end goal is to use the Fisher Feature Vector for classification, whether an image is test or training. Now imagine you decided training images don't get PCA and test images do. Now I give you some image X: what would you do with it? How could you treat one set of images differently from another BEFORE you've classified them? Using the same technique on both sets means you'd process my image X and then decide where to put it.
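To make the fit-on-train, apply-to-both idea concrete, here is a hedged Python sketch (tiny 2-D points stand in for SIFT descriptors, and a closed-form 2x2 eigendecomposition stands in for a real PCA routine; all helper names are mine, not from the paper):

```python
import math

def fit_pca_2d(train):
    """Fit a 1-component PCA on 2-D points; returns (mean, unit direction).
    Uses the closed-form eigendecomposition of the 2x2 covariance matrix."""
    n = len(train)
    mx = sum(p[0] for p in train) / n
    my = sum(p[1] for p in train) / n
    a = sum((p[0] - mx) ** 2 for p in train) / n
    c = sum((p[1] - my) ** 2 for p in train) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in train) / n
    # Largest eigenvalue of [[a, b], [b, c]] and its eigenvector.
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if abs(b) > 1e-12:
        vx, vy = b, lam - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return (mx, my), (vx / norm, vy / norm)

def project(points, mean, direction):
    """Apply the projection learned on the TRAINING set to any points."""
    (mx, my), (dx, dy) = mean, direction
    return [(p[0] - mx) * dx + (p[1] - my) * dy for p in points]

# Tiny stand-ins for SIFT descriptors, lying near the line y = 2x.
train = [(float(t), 2.0 * t + 0.01 * (-1) ** t) for t in range(10)]
test = [(2.0, 4.0), (7.0, 14.0)]

mean, direction = fit_pca_2d(train)         # fit on the training set only...
train_1d = project(train, mean, direction)  # ...then apply the SAME
test_1d = project(test, mean, direction)    # coefficients to both sets
```

This mirrors the MATLAB snippet in the question: `pca(Feat_Space', ...)` plays the role of `fit_pca_2d`, and multiplying by `coef` plays the role of `project` for both test and train images.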
Anyway, I hope that helped and wasn't too rant-like. Good luck :-)
I was reading up on the DWT for the first time and the document stated that it is used to represent time-frequency data of a signal which other transforms do not provide.
But when I look for a usage example of the DWT in MATLAB I see the following code:
X=imread('cameraman.tif');
X=im2double(X);
[F1,F2]= wfilters('db1', 'd');
[LL,LH,HL,HH] = dwt2(X,'db1','d');
I am unable to understand the implementation of dwt2, or rather what it is and when and where we use it. What does dwt2 actually return, and what does the above code do?
The first two statements simply read in the image, and convert it so that the dynamic range of each channel is between [0,1] through im2double.
Now, the third statement, wfilters constructs the wavelet filter banks for you. These filter banks are what are used in the DWT. The method of the DWT is the same, but you can use different kinds of filters to achieve specific results.
Basically, with wfilters, you get to choose what kind of filter you want (in your case, you chose db1: Daubechies), and you can optionally specify the type of filter that you want. Different filters provide different results and have different characteristics. There are a lot of different wavelet filter banks you could use and I'm not quite the expert as to the advantages and disadvantages for each filter bank that exists. Traditionally, Daubechies-type filters are used so stick with those if you don't know which ones to use.
Not specifying the type will output both the decomposition and the reconstruction filters. Decomposition is the forward transformation where you are given the original image / 2D data and want to transform it using the DWT. Reconstruction is the reverse transformation where you are given the transform data and want to recreate the original data.
The fourth statement, dwt2, computes the 2D DWT for you, but we will get into that later.
You specified the flag d, so you want only the decomposition filters. You can use wfilters as input into the 2D DWT if you wish, as this will specify the low-pass and high-pass filters that you want to use when decomposing your image. You don't have to do it like this. You can simply specify what filter you want to use, which is how you're calling the function in your code. In other words, you can do this:
[F1,F2]= wfilters('db1', 'd');
[LL,LH,HL,HH] = dwt2(X,F1,F2);
... or you can just do this:
[LL,LH,HL,HH] = dwt2(X,'db1','d');
The above statements are the same thing. Note that there is a 'd' flag on the dwt2 function because you want the forward transform as well.
Now, dwt2 is the 2D DWT (Discrete Wavelet Transform). I won't go into the DWT in detail here because this isn't the place to talk about it, but I would definitely check out this link for better details. They also have fully working MATLAB code and their own implementation of the 2D DWT so you can fully understand what exactly the DWT is and how it's computed.
However, the basics behind the 2D DWT is that it is known as a multi-resolution transform. It analyzes your signal and decomposes your signal into multiple scales / sizes and features. Each scale / size has a bunch of features that describe something about the signal that was not seen in the other scales.
One thing about the DWT is that it naturally subsamples your image by a factor of 2 (i.e. halves each dimension) after the analysis is done - hence the multi-resolution bit I was talking about. For MATLAB, dwt2 outputs four different variables, and these correspond to the variable names of the output of dwt2:
LL - Low-Low. This means that the vertical direction of your 2D image / signal is low-pass filtered as well as the horizontal direction.
LH - Low-High. This means that the vertical direction of your 2D image / signal is low-pass filtered while the horizontal direction is high-pass filtered.
HL - High-Low. This means that the vertical direction of your 2D image / signal is high-pass filtered while the horizontal direction is low-pass filtered.
HH - High-High. This means that the vertical direction of your 2D image / signal is high-pass filtered as well as the horizontal direction.
Roughly speaking, LL corresponds to just the structural / predominant information of your image while HH corresponds to the edges of your image. The LH and HL components I'm not too familiar with, but they're used in feature analysis sometimes. If you want to do a further decomposition, you would apply the DWT again on the LL only. However, depending on your analysis, the other components are used.... it just depends on what you want to use it for! dwt2 only performs a single-level DWT decomposition, so if you want to use this again for the next level, you would call dwt2 on the LL component.
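To see where the four sub-bands come from, here is a minimal pure-Python sketch of a single-level 2D DWT using the Haar (db1) filters (assuming even image dimensions; the helper names are mine, and MATLAB's dwt2 handles boundaries more carefully than this):

```python
import math

S = 1 / math.sqrt(2)  # Haar (db1) filter taps: low = [S, S], high = [S, -S]

def haar_pairs(vec):
    """One level of the 1-D Haar DWT: (lowpass, highpass), each half length.
    Filtering and downsampling by 2 are fused into a single step."""
    low = [(vec[i] + vec[i + 1]) * S for i in range(0, len(vec), 2)]
    high = [(vec[i] - vec[i + 1]) * S for i in range(0, len(vec), 2)]
    return low, high

def dwt2_haar(img):
    """Single-level 2-D Haar DWT: filter rows (horizontal direction),
    then columns (vertical direction) -> LL, LH, HL, HH sub-bands."""
    rows = [haar_pairs(r) for r in img]
    L = [r[0] for r in rows]  # horizontally low-pass half
    H = [r[1] for r in rows]  # horizontally high-pass half

    def filter_columns(mat):
        lows, highs = [], []
        for j in range(len(mat[0])):
            lo, hi = haar_pairs([row[j] for row in mat])
            lows.append(lo)
            highs.append(hi)
        # Transpose back to row-major order.
        return ([list(t) for t in zip(*lows)],
                [list(t) for t in zip(*highs)])

    LL, HL = filter_columns(L)  # vertical low / high of the low half
    LH, HH = filter_columns(H)  # vertical low / high of the high half
    return LL, LH, HL, HH

# For a constant image, every detail band is zero and LL is a scaled copy
# (each LL entry is 2.0 here) at half the resolution in each dimension.
LL, LH, HL, HH = dwt2_haar([[1.0] * 4 for _ in range(4)])
```

Note how each output is half the size of the input in both dimensions, which is exactly the multi-resolution subsampling described above; calling `dwt2_haar` again on `LL` would give the next decomposition level.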
Applications
Now, for your specific question of applications. The DWT for images is mostly used in image compression and image analysis. One application of the 2D DWT is in JPEG 2000. The core of the algorithm is that it breaks the image down into the DWT components, then constructs trees of the coefficients generated by the DWT to determine which components can be omitted before you save the image. This way you eliminate extraneous information, but there is also the great benefit that the DWT supports a lossless mode: JPEG 2000 uses a reversible integer wavelet filter (LeGall 5/3) for this, which means you will be able to reconstruct the original data exactly, without any artifacts or quantization errors. JPEG 2000 also has a lossy option (using a different, non-reversible filter), where you can reduce the file size even more by eliminating more of the DWT coefficients in a way that is imperceptible to the average user.
Another application is in watermarking images. You can embed information in the wavelet coefficients so that it prevents people from trying to steal your images without acknowledgement. The DWT is also heavily used in medical image analysis and compression as the images generated in this domain are quite high resolution and quite large. It would be extremely useful if you could represent the images in the same way but occupying less physical space in comparison to the standard image compression algorithms (that are also lossy if you want high compression ratios) that exist.
One more application I can think of would be the dynamic delivery of video content over networks. Depending on what your connection speed is or the resolution of your screen, you get a lower or higher quality video. If you specifically use the LL component of each frame, you would stream / use a particular version of the LL component depending on what device / connection you have. So if you had a bad connection or if your screen has a low resolution, you would most likely show the video with the smallest size. You would then keep increasing the resolution depending on the connection speed and/or the size of your screen.
This is just a taste of what the DWT is used for (personally, I don't use it much, as it tends to appear in domains I don't have experience in), but there are a lot more quite useful applications where the DWT is used.
I am registering multi-modal MRI slices that are 512x512 greyscale (each normalised to 0..1 range). The slices are of the same object but taken with different sequences and have very different intensities. I am currently finding the translation-only transformation between the two slices using imregister(moving,fixed,'translation',optimizer,metric) where optimizer and metric are from imregconfig('multimodal').
However, the transformation it finds (inspecting tform) is like '2.283' in the x and '-0.019' in the y, and actually I only wish for whole value translations i.e. '2' and '0' in this case.
How to modify imregister (or a similar function) to check only whole-pixel translations? This would save a lot of computation and it suits my needs better.
Without modifying imregister, I assume the easiest solution is to just round the x and y translations?
I'm not sure how imregister is implemented for the 'multimodal' case, but pure translation estimation for conventional image registration is done using image gradients and a Taylor approximation, and gives sub-pixel accuracy at the same cost as pixel-level accuracy.
So, in that case, limiting yourself to pixel-wise translation does not seem to benefit you in any way.
If you do not want to bother with sub-pixel shifts, I suppose rounding would be the simplest approach.
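If you do go the rounding route, the application step is simple. A hedged Python sketch (nested lists stand in for the image; the helper function is hypothetical, not a MATLAB or toolbox API):

```python
def shift_image(img, dx, dy, fill=0.0):
    """Apply a whole-pixel translation (dx to the right, dy down),
    padding the uncovered border with `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out

# Round the sub-pixel estimate (e.g. from imregister's tform) to pixels.
tx, ty = 2.283, -0.019
dx, dy = round(tx), round(ty)  # whole-pixel translation: 2 and 0
img = [[0.0, 1.0, 0.0, 0.0],
       [0.0, 2.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 0.0]]
shifted = shift_image(img, dx, dy)
```

Since the shift is whole-pixel, no interpolation is needed, which is exactly the computational saving the question is after.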
I am trying to plot a 2 GB matrix using MATLAB hist on a computer with 4 GB RAM. The operation is taking hours. Are there ways to increase the performance of the computation, by pre-sorting the data, pre-determining bin sizes, breaking the data into smaller groups, deleting the raw data as the data is added to bins, etc?
Also, after the data is plotted, I need to adjust the binning to ensure the curve is smooth. This requires starting over and re-binning the raw data. I assume the strategy involving the least computation would be to first bin the data using very small bins and then manipulate the bin size of the output, rather than re-binning the raw data. What is the best way to adjust bin sizes post-binning (assuming the bin sizes can only grow and not shrink)?
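For reference, the fine-bin-then-merge strategy described above does work and needs only one pass over the raw data; here is a hedged pure-Python sketch (the function names are mine), assuming the coarse bin width is an integer multiple of the fine width:

```python
def fine_histogram(data, lo, hi, nbins):
    """One pass over the raw data into nbins equal-width bins on [lo, hi)."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for x in data:
        if lo <= x < hi:
            counts[min(int((x - lo) / width), nbins - 1)] += 1
    return counts

def coarsen(counts, factor):
    """Merge each run of `factor` adjacent fine bins into one coarse bin.
    No second pass over the raw data is needed."""
    assert len(counts) % factor == 0
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]

data = list(range(1000))                  # toy stand-in for the 2 GB matrix
fine = fine_histogram(data, 0, 1000, 1000)  # bin once, very finely
coarse = coarsen(fine, 10)                # re-bin without touching the data
```

After the first pass, the raw data can be discarded and the smoothness of the plot can be tuned by calling `coarsen` with different factors, which only ever touches the (tiny) array of fine counts.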
I don't like answers to Stack Overflow questions of the form "well even though you asked how to do X, you don't really want to do X, you really want to do Y, so here's a solution to Y".
But that's what I am going to do here. I think such an answer is justified in this rare instance because the answer below is in accord with sound practices in statistical analysis, and because it avoids the problem currently in front of you, which is crunching 2 GB of data.
If you want to represent the distribution of a population using a non-parametric density estimator, and you wish to avoid poor computational performance, a kernel density estimator (KDE) will do the job far better than a histogram.
To begin with, there's a clear preference for KDEs over histograms among the majority of academic and practicing statisticians. Among the numerous texts on this topic, one that I think is particularly good is An Introduction to Kernel Density Estimation.
Reasons why KDE is preferred to histogram
1. The shape of a histogram is strongly influenced by the choice of the total number of bins, yet there is no authoritative technique for calculating or even estimating a suitable value. (Any doubts about this? Just plot a histogram from some data, then watch the entire shape of the histogram change as you adjust the number of bins.)
2. The shape of the histogram is strongly influenced by the choice of the location of the bin edges.
3. A histogram gives a density estimate that is not smooth.
KDE completely eliminates histogram properties 2 and 3. Although KDE doesn't produce a density estimate with discrete bins, an analogous parameter, the "bandwidth", must still be supplied.
To calculate and plot a KDE, you need to pass in two parameter values along with your data:
kernel function: the most common options (all available in the MATLAB kde function) are: uniform, triangular, biweight, triweight, Epanechnikov, and gaussian (normal). Among these, gaussian is probably the most often used.
bandwidth: the choice of value for the bandwidth will almost certainly have a huge effect on the quality of your KDE. Therefore, sophisticated computation platforms like MATLAB, R, etc. include utility functions (e.g., based on minimizing the MISE) to estimate the bandwidth given the other parameters.
KDE in MATLAB
kde.m is the function in MATLAB that implements KDE:
[h, fhat, xgrid] = kde(x, 401);
Notice that the bandwidth and kernel are not supplied when calling kde.m. For the bandwidth, kde.m wraps a function for bandwidth selection; for the kernel function, gaussian is used.
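To make the two parameters concrete, here is a hedged pure-Python sketch of a gaussian-kernel KDE with Silverman's rule-of-thumb bandwidth (an illustration of the method, not the internals of kde.m, whose bandwidth selector is more sophisticated):

```python
import math

def silverman_bandwidth(xs):
    """Silverman's rule of thumb for a gaussian kernel:
    h = 1.06 * std * n^(-1/5)."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return 1.06 * std * n ** (-1 / 5)

def gaussian_kde(xs, grid, h=None):
    """Evaluate a gaussian-kernel density estimate on the given grid:
    the average of one gaussian bump per data point."""
    h = h or silverman_bandwidth(xs)
    norm = 1 / (len(xs) * h * math.sqrt(2 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in xs)
            for g in grid]

# Toy bimodal sample; fhat plays the role of kde.m's density output.
xs = ([0.1 * i for i in range(-20, 21)] +
      [5 + 0.1 * i for i in range(-20, 21)])
grid = [i * 0.05 for i in range(-100, 200)]  # from -5.0 to 9.95
fhat = gaussian_kde(xs, grid)
```

Note that the output is just one smooth curve evaluated on a grid, which is exactly why plotting it is so much cheaper than rendering every raw data point.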
But will using KDE in place of a histogram solve or substantially eliminate the very slow performance given your 2 GB dataset?
It certainly should.
In your question, you stated that the lagging performance occurred during plotting. A KDE does not require mapping thousands (millions?) of data points to a symbol, color, and specific location on a canvas; instead it plots a single smooth line. And because the entire data set doesn't need to be rendered one point at a time on the canvas, the data points don't need to be stored (in memory!) while the plot is created and rendered.