What are the ranges in HSV image representation in MATLAB?

I have to compare several images of the same scene taken from different devices/positions. To do so, I want to quantize the colors in order to remove some color representation differences due to device and illumination.
If I work in RGB, I know that MATLAB represents each channel in the range [0 255]; if I work in YCbCr, I know the ranges are [16 235] for Y and [16 240] for Cb and Cr. But if I want to work in the HSV color space, I just know that converting with rgb2hsv gives an image in which each channel is a double... I don't know whether the whole range between 0 and 1 is used for all three channels, so I cannot make a quantization without this information.

Parag basically answered your question, but if you want physical proof, you can do what chappjc suggested and just try it yourself! Read in an image, convert it to HSV using rgb2hsv, and take a look at the distribution of values. For example, using onion.png, which ships with MATLAB, try something like:
im = imread('onion.png');
out = rgb2hsv(im);
str = 'HSV';
for idx = 1 : 3
    disp(['Range of ', str(idx)]);
    disp([min(min(out(:,:,idx))) max(max(out(:,:,idx)))]);
end
The above code will read in each channel and display the minimum and maximum in each (Hue, Saturation and Value). This is what I get:
Range of H
0 0.9991
Range of S
0.0791 1.0000
Range of V
0.0824 1.0000
As you can see, the values range between [0,1]. Have fun!
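Since all three channels live in [0,1], the colour quantization the question asks about is straightforward. Here is a minimal sketch, assuming uniform quantization with an illustrative nLevels of 8 per channel:
% Uniformly quantize each HSV channel into nLevels levels
im = imread('onion.png');
hsv = rgb2hsv(im);                 % every channel lies in [0,1]
nLevels = 8;                       % illustrative choice
q = floor(hsv * nLevels);          % bin indices 0..nLevels
q = min(q, nLevels - 1);           % make sure 1.0 falls into the top bin
hsv_quant = (q + 0.5) / nLevels;   % replace each value by its bin centre
rgb_quant = hsv2rgb(hsv_quant);    % back to RGB if needed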

Related

How to change pixel values of an RGB image in MATLAB?

So what I need to do is to apply an operation like
(x(i,j)-min(x)) / max(x(i,j)-min(x))
which basically converts each pixel value such that the values range between 0 and 1.
First of all, I realised that MATLAB saves our image (rows * cols * colour) in a 3D matrix when using imread:
Image = imread('image.jpg')
So a simple max operation on the image doesn't give me the max pixel value, and I'm not quite sure what it returns (another multidimensional array?). So I tried something like
max_pixel = max(max(max(Image)))
I thought it worked fine. Similarly, I used min thrice. My logic was that I was getting the min pixel value across all 3 colour planes.
After performing the above scaling operation I got an image which seemed to have only 0s and 1s and no values in between, which doesn't seem right. Has it got something to do with integer/float rounding?
I = imread('cat.jpg')   % (renamed from "image", which shadows a built-in)
maxI = max(max(max(I)))
minI = min(min(min(I)))
new_image = (I - minI) ./ max(max(max(I - minI)))
This gives an output of only 1s and 0s, which doesn't seem correct.
The other approach I'm trying is to work on all colour planes separately, as done here. But is that the correct way to do it?
I could also loop through all pixels, but I'm assuming that will be time-consuming. I'm very new to this; any help will be great.
If you are not sure what a MATLAB function returns or why, you should always do the following first:
Type help functionName or doc functionName in the Command Window; in your case, doc max. This will show you the essential must-know information about that specific function, such as what needs to be put in and what will be output.
In the case of the max function, this yields the following results:
M = max(A) returns the maximum elements of an array.
If A is a vector, then max(A) returns the maximum of A.
If A is a matrix, then max(A) is a row vector containing the maximum
value of each column.
If A is a multidimensional array, then max(A) operates along the first
array dimension whose size does not equal 1, treating the elements as
vectors. The size of this dimension becomes 1 while the sizes of all
other dimensions remain the same. If A is an empty array whose first
dimension has zero length, then max(A) returns an empty array with the
same size as A.
In other words, if you use max() on a matrix, it will output a vector that contains the maximum value of each column (the first non-singleton dimension). If you use max() on a matrix A of size m x n x 3, it will result in a matrix of maximum values of size 1 x n x 3. So this answers your question:
I'm not quite sure what it returns (another multidimensional array?)
Moving on:
I thought it worked fine. Similarly I used min thrice. My logic was that I was getting the min pixel value across all 3 colour planes.
This is correct. Alternatively, you can use max(A(:)) and min(A(:)), which is equivalent if you are just looking for the value.
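For instance, with a toy m x n x 3 array (a minimal sketch; the array contents are illustrative):
A = reshape(1:24, [2 4 3]);   % 2 x 4 x 3 array holding the values 1..24
size(max(A))                  % [1 4 3]: max taken along the first dimension
max(A(:))                     % 24: the single global maximum
% On newer releases (R2018b+), max(A, [], 'all') is equivalent to max(A(:))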
And after performing the above operation I got an image which seemed to have only 0s and 1s and no values in between, which doesn't seem right. Has it got something to do with integer/float rounding?
There is no way for us to know why this happens if you do not post a minimal, complete and verifiable example of your code. It could be that it is because your variables are of a certain type, or it could be because of an error in your calculations.
The other approach I'm trying is to work on all colour planes separately, as done here. But is that the correct way to do it?
This depends on what the intended end result is. Normalizing each colour (red, green, blue) separately will give a different result than normalizing the values all at once (in 99% of cases, anyway).
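A sketch of the two options, assuming I is already a double image (variable names are illustrative):
% Global normalization: one min/max over all channels
g = (I - min(I(:))) ./ (max(I(:)) - min(I(:)));
% Per-channel normalization: usually a different, colour-shifted result
p = zeros(size(I));
for c = 1 : 3
    ch = I(:,:,c);
    p(:,:,c) = (ch - min(ch(:))) ./ (max(ch(:)) - min(ch(:)));
end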
You have a uint8 RGB image.
Just convert it to a double image by
I = imread('https://upload.wikimedia.org/wikipedia/commons/thumb/0/0b/Cat_poster_1.jpg/1920px-Cat_poster_1.jpg');
I = double(I) ./ 255;
or alternatively
I = im2double(I); % does the scaling if needed
Read about image data types.
What are you doing wrong?
If what you want to do is convert an RGB image to the [0-1] range, you are approaching the problem badly, regardless of the correctness of your MATLAB code. Let me give you an example of why:
Say you have an image with 2 colors.
A dark red (20,0,0)
A medium blue (0,0,128)
Now you want this changed to [0-1]. How do you scale it? Your suggested approach is to map 128 -> 1 and either 20 -> 20/128 or 20 -> 1 (not relevant which). However, when you do this, you are changing the color! You turn the medium blue into an intense blue (maximum B channel) and make the red far more intense (20/128 instead of 20/255: double the brightness!). This is bad, and that is with simple colors; with mixed RGB values you may even change the hue itself, not only the intensity. Therefore, the only correct way to convert to the [0-1] range is to assume your min and max are [0, 255].
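To see this numerically, here is a toy sketch of the two-colour example above (the variable names are illustrative):
% A 1x2 image: dark red (20,0,0) and medium blue (0,0,128)
px = uint8(cat(3, [20 0], [0 0], [0 128]));
correct = im2double(px);   % divides by 255: red -> 0.0784, blue -> 0.5020
% Min-max scaling instead divides by 128, brightening both colours:
mn = double(min(px(:)));
mx = double(max(px(:)));
wrong = (double(px) - mn) ./ (mx - mn);   % red -> 0.1563, blue -> 1.0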

Confusion in different HOG codes

I have downloaded three different HOG codes.
Using an image of 64x128:
1) Using the MATLAB function extractHOGFeatures:
[hog, vis] = extractHOGFeatures(img,'CellSize',[8 8]);
The size of hog is 3780.
How to calculate:
HOG feature length, N, is based on the image size and the function parameter values.
N = prod([BlocksPerImage, BlockSize, NumBins])
BlocksPerImage = floor((size(I)./CellSize - BlockSize)./(BlockSize - BlockOverlap) + 1)
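Plugging in the function's documented defaults (BlockSize [2 2], BlockOverlap [1 1], NumBins 9) for the 64x128 image:
% size(I) = [128 64] (rows, columns), CellSize = [8 8]
BlocksPerImage = floor(([128 64]./[8 8] - [2 2]) ./ ([2 2] - [1 1]) + 1)  % [15 7]
N = prod([BlocksPerImage, [2 2], 9])  % 15*7 * 2*2 * 9 = 3780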
2) The second HOG function is downloaded from here. The same image is used:
H = hog( double(rgb2gray(img)), 8, 9 );
% I - [mxn] color or grayscale input image (must have type double)
% sBin - [8] spatial bin size
% oBin - [9] number of orientation bins
The size of H is 3024.
How to calculate:
H - [m/sBin-2 n/sBin-2 oBin*4] computed hog features
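With m = 128, n = 64, sBin = 8 and oBin = 9, that comes out as:
% [m/sBin-2, n/sBin-2, oBin*4] = [128/8-2, 64/8-2, 9*4] = [14 6 36]
prod([14 6 36])  % 3024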
3) HoG code from vl_feat.
cellSize = 8;
hog = vl_hog(im2single(rgb2gray(img)), cellSize, 'verbose','variant', 'dalaltriggs') ;
vl_hog: image: [64 x 128 x 1]
vl_hog: descriptor: [8 x 16 x 36]
vl_hog: number of orientations: 9
vl_hog: bilinear orientation assignments: no
vl_hog: variant: DalalTriggs
vl_hog: input type: Image
The output length is 4608.
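That matches the reported descriptor shape:
% descriptor [8 x 16 x 36]: 64/8 = 8 cells across, 128/8 = 16 cells down,
% and 36 = 4*9 values per cell in the DalalTriggs variant
prod([8 16 36])  % 4608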
Which one is correct?
All are correct. The thing is, the default parameters of HOG feature extraction functions vary between packages (e.g. OpenCV, MATLAB, scikit-image, etc.). By parameters I mean winsize, stride, blocksize, scale, etc.
Usually HOG descriptor length is :
Length = Number of Blocks x Cells in each Block x Number of Bins in each Cell
Since all are correct, which one you should use can be answered in many ways.
You can experiment with different parameter values and choose the ones that suit you. Since there is no fixed way to find the right values, it helps to know how a change in each parameter affects the result:
Cell-size: If you increase this, you may not capture small details.
Block-size: Again, a large block with a large cell size may not help you capture the small details. Also, a large block means more illumination variation within it, and because of the gradient normalization step a lot of detail will be lost. So choose accordingly.
Overlap/Stride: Choosing overlapping blocks again helps you capture more information about the image patch. Usually the overlap is set to half the block size.
You can capture a lot of information by choosing these parameter values accordingly, but the descriptor length will become unnecessarily long.
Hope this helps :)

How do I plot Precision-Recall graphs for Content-Based Image Retrieval in MATLAB?

I am accessing 10 images from a folder "c1" and I have a query image. I have implemented code for loading the images into a cell array, and then I calculate the histogram intersection between the query image and each image from folder "c1" one by one. Now I want to draw a precision-recall curve, but I am not sure how to write code that produces it from the data obtained through histogram intersection.
My code:
Inp1 = rgb2gray(imread('D:\visionImages\c1\1.ppm'));
figure, imshow(Inp1), title('Input image 1');
srcFiles = dir('D:\visionImages\c1\*.ppm'); % the folder in which the images exist
for i = 1 : length(srcFiles)
    filename = strcat('D:\visionImages\c1\', srcFiles(i).name);
    I = imread(filename);
    I = rgb2gray(I);
    Seq{i} = I;
end
for i = 1 : length(srcFiles) % loop for calculating histogram intersections
    A = Seq{i};
    B = Inp1;
    a = size(A,2); b = size(B,2);
    K = zeros(a, b);
    for j = 1:a
        Va = repmat(A(:,j), 1, b);
        K(j,:) = 0.5*sum(Va + B - abs(Va - B));
    end
end
Precision-Recall graphs measure the accuracy of your image retrieval system. They're used to evaluate the performance of any search engine really, be it for text or documents. They also appear in machine learning evaluation, though ROC curves are more commonly used there.
Precision-Recall graphs are more suitable for document and data retrieval. For the case of images here, given your query image, you measure how similar it is to the rest of the images in your database. You then have a similarity measure for each database image in relation to your query image, and you sort these similarities in descending order. For a good retrieval system, you would want the most relevant images (i.e. what you are searching for) to all appear at the beginning, while the irrelevant images appear after.
Precision
The definition of Precision is the ratio of the number of relevant images retrieved to the total number of images retrieved, relevant and irrelevant. Suppose A is the number of relevant images retrieved and B is the number of irrelevant images retrieved. When calculating precision, you look at the first several retrieved images; that amount is A + B, as it is the total number of images you are considering at this point. In other words, Precision is the fraction of the images you have grabbed so far that are relevant:
Precision = A / (A + B)
Recall
The definition of Recall is slightly different. It evaluates how many of the relevant images you have retrieved so far out of a known total: the total number of relevant images that exist in the database. Say you again take a look at the first several retrieved images; you count how many of the relevant images have been retrieved so far out of all of the relevant images in the database. Supposing that A is again the number of relevant images you have retrieved, and C is the total number of relevant images in your database, Recall is defined as:
Recall = A / C
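As a quick toy sketch before the full example below (rel is an illustrative logical vector marking whether each retrieved image, in ranked order, is relevant):
rel = logical([1 0 1 1 0 0 1 0 0 1]); % toy ranked relevance judgements
A = cumsum(rel);                      % relevant images retrieved after each rank
precision_at_k = A ./ (1:numel(rel)); % A / (A + B) at every cut-off
recall_at_k = A / sum(rel);           % A / C at every cut-off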
How you calculate this in MATLAB is actually quite easy. You first need to know how many relevant images are in your database. After that, you need the similarity measure assigned to each database image with respect to the query image. Once you compute these, you need to know which similarity measures map to which relevant images in your database. I don't see that in your code, so I will leave it to you. Once you have done this, you sort the similarity values, then work out where in the sorted order these relevant images occur. You then use these positions to calculate your precision and recall.
I'll provide a toy example so I can show you what the graph looks like, as it isn't quite clear how you're calculating your similarities here. Let's say I have 5 relevant images in a database of 20, and a bunch of similarity values between them and a query image:
rng(123); %// Set seed for reproducibility
num_images = 20;
sims = rand(1,num_images);
sims =
Columns 1 through 13
0.6965 0.2861 0.2269 0.5513 0.7195 0.4231 0.9808 0.6848 0.4809 0.3921 0.3432 0.7290 0.4386
Columns 14 through 20
0.0597 0.3980 0.7380 0.1825 0.1755 0.5316 0.5318
Also, I know that images [1 5 7 9 12] are my relevant images.
relevant_IDs = [1 5 7 9 12];
num_relevant_images = numel(relevant_IDs);
Now let's sort the similarity values in descending order, as higher values mean higher similarity. You'd reverse this if you were calculating a dissimilarity measure:
[sorted_sims, locs] = sort(sims, 'descend');
locs now contains the rank ordering of the images: locs(k) is the index of the image in position k of the similarity ranking. sorted_sims has the similarities sorted in descending order:
sorted_sims =
Columns 1 through 13
0.9808 0.7380 0.7290 0.7195 0.6965 0.6848 0.5513 0.5318 0.5316 0.4809 0.4386 0.4231 0.3980
Columns 14 through 20
0.3921 0.3432 0.2861 0.2269 0.1825 0.1755 0.0597
locs =
7 16 12 5 1 8 4 20 19 9 13 6 15 10 11 2 3 17 18 14
Therefore, the 7th image is the highest ranked, followed by the 16th image as the second highest, and so on. What you need to do now is figure out, for each of the images you know to be relevant, where it is located after sorting. We go through each of the relevant image IDs and find their positions in the locations array above:
locations_final = arrayfun(@(x) find(locs == x, 1), relevant_IDs)
locations_final =
5 4 1 10 3
Let's sort these to get a better understanding of what this is saying:
locations_sorted = sort(locations_final)
locations_sorted =
1 3 4 5 10
These locations now tell you the order in which the relevant images appear: the first relevant image appears first, the second relevant image appears in the third position, the third relevant image in the fourth position, and so on. These positions correspond precisely to the denominator in the definition of Precision. For example, looking at the last entry of locations_sorted, it takes ten retrieved images to collect all five relevant images in our database; similarly, it takes five retrieved images to collect four of them. As such, you would compute precision like this:
precision = (1:num_relevant_images) ./ locations_sorted;
Similarly for recall, it's simply the cumulative number of relevant images retrieved over the total number of relevant images, so it is just:
recall = (1:num_relevant_images) / num_relevant_images;
Your Precision-Recall graph would now look like the following, with Recall on the x-axis and Precision on the y-axis:
plot(recall, precision, 'b.-');
xlabel('Recall');
ylabel('Precision');
title('Precision-Recall Graph - Toy Example');
axis([0 1 0 1.05]); %// Adjust axes for better viewing
grid;
This is the graph I get:
You'll notice that between a recall of 0.4 and 0.8 the precision increases a bit. This is because you managed to retrieve a successive chain of relevant images without touching any irrelevant ones, so your precision naturally increases. It drops sharply at the last point, because you had to retrieve many irrelevant images before finally hitting the last relevant one.
You'll also notice that precision and recall are inversely related: broadly, as one increases, the other decreases.
The first part makes sense because if you retrieve only a few images at the beginning, you have a greater chance of not including irrelevant images in your results, but at the same time the number of relevant images retrieved is small. This is why recall is low while precision is high.
The second part also makes sense because as you keep retrieving more images from your database, you'll inevitably retrieve all of the relevant ones, but you'll most likely start to include more and more irrelevant images, which drives your precision down.
In an ideal world, if you had N relevant images in your database, you would want to see all of them in the top N most similar spots. This would make your precision-recall graph a flat horizontal line hovering at y = 1, meaning that you've retrieved all of your relevant images in the top spots without hitting any irrelevant images. Unfortunately, that's never going to happen (or at least not for now...), as finding the best features for CBIR is still ongoing research, and no image search engine that I have seen gets this perfect. This is still one of the broadest unsolved computer vision problems today!
Edit
You retrieved the code to compute histogram intersection from this post. They have a neat way of computing it: for two histograms a and b with n bins each, the intersection is
K(a,b) = sum_i min(a_i, b_i) = 0.5 * sum_i (a_i + b_i - |a_i - b_i|)
n is the total number of bins in your histogram. You'll have to play around with this to get good results, but we can leave that as a parameter in your code. The code assumes that you have two matrices A and B where each column is a histogram; it generates a matrix of size a x b, where a is the number of columns in A and b is the number of columns in B. Entry (i,j) of this matrix tells you the similarity between the ith column of A and the jth column of B. In your case, A would be a single column holding the histogram of your query image, and B would be a 10-column matrix holding the histograms of the database images. Therefore, we will get a 1 x 10 array of similarity measures through histogram intersection.
As such, we need to modify your code so that you're using imhist for each of the images. We can also specify an additional parameter that gives you how many bins each histogram will have. Therefore, your code will look like this. Each new line that I have placed will have a %// NEW comment beside each line.
Inp1 = rgb2gray(imread('D:\visionImages\c1\1.ppm'));
figure, imshow(Inp1), title('Input image 1');
num_bins = 32; %// NEW - I'm specifying 32 bins here. Play around with this yourself
A = imhist(Inp1, num_bins); %// NEW - Calculate histogram
srcFiles = dir('D:\visionImages\c1\*.ppm'); % the folder in which the images exist
B = zeros(num_bins, length(srcFiles)); %// NEW - Store histograms of database images
for i = 1 : length(srcFiles)
    filename = strcat('D:\visionImages\c1\', srcFiles(i).name);
    I = imread(filename);
    I = rgb2gray(I);
    B(:,i) = imhist(I, num_bins); %// NEW - Put each histogram in a separate column
end
%// NEW - Taken directly from the website,
%// but modified for only one histogram in `A`
b = size(B,2);
Va = repmat(A, 1, b);
K = 0.5*sum(Va + B - abs(Va - B));
Take note that I have copied the code from the website, but I have modified it because there is only one image in A and so there is some code that isn't necessary.
K should now be a 1 x 10 array of histogram intersection similarities. You would then use K and assign sims to this variable (i.e. sims = K;) in the code I have written above, then run through your images. You also need to know which images are relevant images, and you'd have to change the code I've written to reflect that.
Hope this helps!

Get a vector whose mean is zero from an arbitrary vector

As I know, to get a zero-mean vector from a given vector, we should subtract the mean of the given vector from each member of that vector. For example, let us look at the following:
r=rand(1,6)
we get
0.8687 0.0844 0.3998 0.2599 0.8001 0.4314
let us create another vector s by the following operation:
s = r - mean(r(:));
after this we get
0.3947 -0.3896 -0.0743 -0.2142 0.3260 -0.0426
if we calculate the mean of s with
mean(s)
we get
ans =
-5.5511e-017
Actually, as I have checked, this number is very small:
-5.5511*exp(-017)
ans =
-2.2981e-007
So should we think that our vector has mean zero? Does it mean that the small deviation from 0 is because of round-off error? For example, when we are creating white noise or a similar random uncorrelated sequence of data, it is already assumed that it has zero mean, even if the computed mean is some small number away from 0. So is it assumed in this case that
-5.5511e-017 = 0 ?
approximately, of course
e-017 means 10 to the power of -17 (10^-17); the number is very small and for all practical purposes it is 0. And if you type
format long;
you will see the real precision used by MATLAB.
Actually, you can refer to the eps function. Although MATLAB uses doubles, which can encode numbers down to about 2.2251e-308, the precision is determined by the magnitude of the number.
Use it in the form eps(number); it tells you how large the influence of the least significant bit is at that magnitude.
On my machine, e.g., eps(0.3) returns 5.5511e-17 - exactly the number you report.
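A quick way to convince yourself (a minimal sketch; the exact residual is machine-dependent):
format long
r = rand(1,6);
s = r - mean(r);
mean(s)            % something on the order of 1e-17, not exactly 0
eps(max(abs(r)))   % spacing of doubles at this magnitude
% The residual mean is within a few eps of zero: pure round-off, not bias.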

How can I find quantized coefficients from MATLAB using Sallee's code?

First, I admit that this is a homework question. However, I seem to be stuck. I need to get all the quantized coefficients from a JPEG image using Phil Sallee's JPEG Toolbox (link listed at the bottom of the table under an "update" heading). I'll be building a histogram, but that part I can handle once I can get to the data I need. I have a JPEG image that is about 5 MB in size and get back this data when I run it through Sallee's code:
image_width: 3000
image_height: 4000
image_components: 3
image_color_space: 2
jpeg_components: 3
jpeg_color_space: 3
comments: {}
coef_arrays: {[4000x3000 double] [2000x3000 double] [2000x3000 double]}
quant_tables: {[8x8 double] [8x8 double]}
ac_huff_tables: [1x2 struct]
dc_huff_tables: [1x2 struct]
optimize_coding: 0
comp_info: [1x3 struct]
progressive_mode: 0
How do I get the quantized coefficients from this image? At first I tried something like this to just spit out the coefficients so I could see what I was dealing with:
pic = jpeg_read(image)
img_coef = pic.quant_tables{pic.comp_info(1).quant_tbl_no}
img_coef = pic.quant_tables{pic.comp_info(2).quant_tbl_no}
img_coef is run twice because there are two elements in the quant_tables field above. However, this seems like a very small number of coefficients for such a large image. Can someone more knowledgeable than me point me in the right direction? Where/how do I pull the quantized coefficients from a JPEG image?
It appears that you have the information you need. From the data you've provided, it looks like the JPEG toolkit decodes the coefficients and loads them into coef_arrays. Your image has chroma subsampling: the color coefficient arrays are half the height of the luminance array (2000x3000 versus 4000x3000). The 3 arrays represent (Y, Cb, Cr). There are 2 quantization tables because one is for the Y component and the other is shared by the Cb and Cr components. To de-quantize the coefficients, you need to multiply each coefficient by the corresponding element of the quant_tables array; the 8x8 quantization table is re-used across every 8x8 block of coefficients. For example, element [8, 10] of a coefficient array (zero-based) is multiplied by element [0, 2] of the matching quant table. Normally these coefficients are stored in zig-zag order, but it appears that your toolkit has laid them out like a complete image.
This will open a file, pull out the luminance, Cb and Cr coefficient arrays and the two quantization tables, and then de-quantize the luminance, Cb and Cr planes into their own variables.
im = jpeg_read(image);
% Pull the quantized DCT coefficient planes - Y, Cb, Cr
lum = im.coef_arrays{im.comp_info(1).component_id};
cb = im.coef_arrays{im.comp_info(2).component_id};
cr = im.coef_arrays{im.comp_info(3).component_id};
% Pull the quantization tables (one for luma, one shared by both chroma planes)
lqtable = im.quant_tables{im.comp_info(1).quant_tbl_no};
cqtable = im.quant_tables{im.comp_info(2).quant_tbl_no};
% De-quantize: tile each 8x8 table across the plane and multiply element-wise
lqcof = lum .* repmat(lqtable, size(lum)/8);
bqcof = cb .* repmat(cqtable, size(cb)/8);
rqcof = cr .* repmat(cqtable, size(cr)/8);
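Since the assignment ultimately needs a histogram of the quantized coefficients, here is a minimal sketch of that last step (the bin range is an illustrative choice):
% Histogram of the quantized luminance coefficients themselves
% (the values stored in coef_arrays, before de-quantization)
edges = -50.5 : 1 : 50.5;      % illustrative integer-centred bins
counts = histc(lum(:), edges); % on newer MATLAB: histogram(lum(:), edges)
bar(edges, counts, 'histc');
xlabel('Quantized DCT coefficient value');
ylabel('Count');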