Cosine distance range interpretation - matlab

I am trying to use the cosine distance in pdist2. I am confused about its output. As far as I know it should be between 0 and 1, and since MATLAB uses 1 - cosine, 1 would be the highest variability while 0 would be the lowest. However, the output seems to range from 0.5 to 1.5, or something along those lines!
Can somebody please advise me on how to interpret this output?

From help pdist2:
'cosine' - One minus the cosine of the included angle
between observations (treated as vectors)
Since the cosine varies between -1 and 1, the result of pdist2(..., 'cosine') varies between 0 and 2. If you want the cosine itself, use 1 - pdist2(matrix1, matrix2, 'cosine').
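To see why the range runs from 0 to 2, here is a quick NumPy sketch of the same 1 - cosine computation (the function name cosine_distance is just for illustration):

```python
import numpy as np

def cosine_distance(u, v):
    """One minus the cosine of the angle between u and v,
    i.e. the same quantity as pdist2(..., 'cosine')."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

cosine_distance(np.array([1.0, 0.0]), np.array([1.0, 0.0]))   # 0 (same direction)
cosine_distance(np.array([1.0, 0.0]), np.array([0.0, 1.0]))   # 1 (orthogonal)
cosine_distance(np.array([1.0, 0.0]), np.array([-1.0, 0.0]))  # 2 (opposite direction)
```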

Related

Increasing precision of polyeig in Matlab

I am using the polyeig command in MATLAB to solve a polynomial eigenvalue problem of order 2. I know that the system has a single 0 eigenvalue (this is due to the form of the zero-order coefficient matrix, where each diagonal element is -1 times the sum of the off-diagonal elements in the same row, so the vector (1 1 1 ... 1) is an eigenvector with eigenvalue 0).
The system is about 150 by 150.
When I use the polyeig command, the lowest eigenvalue I get is of order 1e-4 (which is supposed to be the 0 eigenvalue) and the second lowest is of order 1e-1. As the system size decreases, the lowest eigenvalue drops to something of order 1e-14, which is reasonable, but 1e-4 is too much.
Is there any way to achieve better accuracy, or any other library you would suggest?
I could also turn the polynomial eigenvalue problem into a generalized eigenvalue problem in higher dimensions (twice the given dimension), but I am not sure how that affects speed and accuracy. I would like to see if there is a simpler solution before reformulating the problem, so I would welcome any suggestions on these matters.
EDIT: The problem is resolved. It was actually the precision of the INPUT file I was using, which was printed only to 4 digits. Having found more precise input files, the accuracy has improved. Thanks in any case.
The problem turned out to be with the input file I was using, which was printed only to 4 decimal places. Now even with 800x800 matrices I only get errors of order 1e-11, which is good.
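For reference, the reformulation mentioned above is mechanical: a quadratic eigenvalue problem (lam^2*M + lam*C + K)x = 0 can be linearized to a generalized eigenproblem of twice the dimension via the companion form. A NumPy sketch, assuming M is invertible (the function name polyeig2 is made up for illustration):

```python
import numpy as np

def polyeig2(K, C, M):
    """Linearize (lam^2*M + lam*C + K) x = 0 to the companion form
    A z = lam B z with z = [x; lam*x], then solve it as an ordinary
    eigenproblem (assumes M, and hence B, is invertible)."""
    n = K.shape[0]
    Z, I = np.zeros((n, n)), np.eye(n)
    A = np.block([[Z, I], [-K, -C]])
    B = np.block([[I, Z], [Z, M]])
    lam, V = np.linalg.eig(np.linalg.solve(B, A))
    return lam, V[:n, :]   # eigenvalues and the x-part of the eigenvectors

# Scalar sanity check: lam^2 - 3*lam + 2 = 0 has roots 1 and 2.
lam, _ = polyeig2(np.array([[2.0]]), np.array([[-3.0]]), np.array([[1.0]]))
```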

Approximation for mean(abs(fft(vector)))?

Some MATLAB code I'm trying to simplify goes through the effort of computing an FFT, only to take the absolute value, and then the mean:
> vector = [0 -1 -2 -1 0 +1 +2 +1];
> mean(abs(fft(vector)))
ans = 2
All these coefficients are built up, then chopped back down to a single value. The platform I'm working on is really limited, and if I can get by without the FFT, all the better. Is there a way to approximate these operations without an FFT?
You can assume the vector will be at most 64 values in length.
I think this is just a very inefficient way of estimating an RMS-like measure of the signal. See Parseval's theorem: it relates the sum of the squared magnitudes, sum(abs(fft(x)).^2), exactly to length(x)*sum(x.^2); the mean of the unsquared magnitudes is only approximately related. If an approximation is acceptable, you can probably just simplify the expression to:
sqrt(mean(vector.*vector))
If your vector is real and has zero mean, as in your example, you can take advantage of the fact that the two halves of the DFT are complex conjugates of each other (basic signal processing) and save half the abs computations. Also, use sum, which is faster than mean:
fft_vector = fft(vector);
len = length(fft_vector)/2;
sum(abs(fft_vector(1:len)))/len
(Strictly, this also assumes the Nyquist bin fft_vector(len+1) is zero, as it is for your example vector.)
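As a sanity check on both answers, here is the example worked in NumPy. The half-spectrum shortcut reproduces the full mean here because both the DC and Nyquist bins are zero, while Parseval's theorem holds exactly only for the squared magnitudes (sqrt(mean(x.^2)) comes out near 1.22, not 2):

```python
import numpy as np

x = np.array([0, -1, -2, -1, 0, 1, 2, 1], dtype=float)
F = np.fft.fft(x)

full = np.mean(np.abs(F))                      # the original computation -> 2.0

# Half-spectrum shortcut: the DFT of a real vector is conjugate-symmetric,
# and here the DC bin F[0] (zero-mean input) and Nyquist bin F[4] are zero.
half = np.sum(np.abs(F[: len(x) // 2])) / (len(x) // 2)

# Parseval's theorem is exact for the squared magnitudes:
assert np.isclose(np.mean(np.abs(F) ** 2), len(x) * np.mean(x ** 2))

rms = np.sqrt(np.mean(x ** 2))                 # ~1.2247: only a rough proxy
```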

Mahalanobis distance

I would like to apply the Mahalanobis distance method to data obtained from observations.
Each observation is a time response of the system. I have 30 observations, each with 14000 points.
I would like to use the MAHAL command in MATLAB, but it tells me that the number of rows in the variable X must be greater than the number of columns. By the nature of my observations, however, each observation gives 1 row and 14000 columns (time points).
I don't know how to overcome this problem.
If anybody knows, please help me.
You can't do that. The Mahalanobis distance of a point x from a group of values with mean mu and covariance matrix sigma is defined as sqrt((x-mu)'*sigma^-1*(x-mu)). If sigma is not invertible (and it will not be if you have 30 observations and 14000 variables, since the sample covariance then has rank at most 29), the Mahalanobis distance is not defined.
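A quick way to see the problem: a sample covariance built from n observations has rank at most n - 1, so it is singular whenever there are fewer observations than variables. A NumPy sketch (100 variables stand in for 14000 just to keep it fast):

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_vars = 30, 100            # same situation as 30 x 14000, scaled down

X = rng.standard_normal((n_obs, n_vars))
S = np.cov(X, rowvar=False)        # n_vars-by-n_vars sample covariance

# Rank is at most n_obs - 1 = 29, far short of the n_vars = 100 needed
# for S to be invertible, so sigma^-1 (and the distance) does not exist.
rank = np.linalg.matrix_rank(S)
```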

How to calculate probability in normal distribution by Matlab?

I'm new to MATLAB and would appreciate it if someone could help.
The problem:
IQ coefficients are normally distributed with a mean of 100 and a standard deviation of 15. Calculate the probability that a randomly drawn person from this population has an IQ greater than 110 but smaller than 130. You can achieve this using one line of MATLAB code.
What does this look like?
I tried this:
>> max(normpdf(linspace(110,130,100),100,15))
ans =
0.0213
But I'm not sure if it's correct.
I would be thankful for any help!
This is most efficiently handled using the normal cumulative distribution function.
normcdf(130,100,15) - normcdf(110,100,15)
Or, if you prefer to manually convert these to Z-scores, you can use the single-argument version of the CDF.
normcdf(30/15) - normcdf(10/15)
In either case the answer is 0.2297, so about 23%.
Let's check:
N=1e7; %Number of "experimental" samples
iq = randn(1,N)*15 + 100; %Create a set of IQ values
p = sum(iq>=110 & iq<=130)/N %Determine how many are in range of interest.
This returns a number around 23%.
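Outside MATLAB the same one-liner needs only the error function, in terms of which the normal CDF can be written as Phi(z) = (1 + erf(z/sqrt(2)))/2. A plain-Python sketch (the helper normcdf just mirrors MATLAB's three-argument form):

```python
from math import erf, sqrt

def normcdf(x, mu=0.0, sigma=1.0):
    """Normal CDF via the error function: Phi(z) = (1 + erf(z/sqrt(2))) / 2."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

p = normcdf(130, 100, 15) - normcdf(110, 100, 15)   # about 0.2297
```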

Comparing numbers in MATLAB

I am trying to classify some data based on Euclidean distances in MATLAB. The only problem is that MATLAB is giving me numbers that look like these as distances:
0 + 4.9713i
0 + 7.8858i
num1<num2
num2<num1
both return 0. How is this possible?
The numbers you're getting are imaginary. You should never obtain imaginary numbers when you calculate Euclidean distances. (Incidentally, that is why both comparisons return 0: MATLAB's relational operators compare only the real parts of complex numbers, and both real parts here are 0.)
Check that your Euclidean distances are correct, such as
distance=sqrt(deltaX.^2 + deltaY.^2)
If you're really sure that your distances should be complex numbers, make the comparison using the magnitude, e.g.
norm(num2) > norm(num1)
This evaluates to true for me.
Complex numbers (those with nonzero imaginary parts) are not orderable. Maybe you mean to order by distance from the origin?
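A small Python illustration of both points: an imaginary "distance" is what you get when a negative number ends up under the square root, and complex values can only be compared by magnitude:

```python
import cmath

# A negative value under the square root produces an imaginary "distance":
d1 = cmath.sqrt(-24.714)      # roughly 0 + 4.9713i
d2 = cmath.sqrt(-62.186)      # roughly 0 + 7.8858i

# Complex numbers have no natural ordering; compare magnitudes instead:
smaller = abs(d1) < abs(d2)   # True
```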