Say I have two column vectors of same size: P and Q. What I need to do is find the least squares-functional distance D = ||P - Q||^2. What does this mean and how do I implement it in matlab. Should I use the norm() function?
You can use
norm(P-Q)^2
or
sum((P-Q).^2)
A small test:
P = randn(1e7,1);
Q = randn(1e7,1);
tic
norm(P-Q)^2;
toc
tic
sum((P-Q).^2);
toc
Results:
Elapsed time is 0.130086 seconds.
Elapsed time is 0.098494 seconds.
So manually squaring and summing is a tad faster, and perhaps more intuitive.
If you are just trying to find the square of the Euclidian norm, or in other words sum((P(i) - Q(i))^2), then you can use sumsqr(P - Q).
EDIT: The Euclidian norm is defined as the square root of the sum of the squares, so in your case it would be sqrt(sum((P(i) - Q(i))^2)). That is what ||P - Q|| means. So ||P - Q||^2 would just be the sum of the squares of (P - Q), and MATLAB has a built-in function for that, as stated above.
Related
I am trying to increase the speed of code that operates on large datasets. I need to perform the function out = sinc(x), where x is a 2048-by-37499 matrix of doubles. This is very expensive and is the bottleneck of my program (even when computed on the GPU).
I am looking for any solution which improves the speed of this operation.
I expect that this might be achieved by pre-computing a vector LookUp = sinc(y) where y is the vector y = min(min(x)):dy:max(max(x)), i.e. a vector spanning the whole range of expected x elements.
How can I efficiently generate an approximation of sinc(x) from this LookUp vector?
I need to avoid generating a three dimensional array, since this would consume more memory than I have available.
Here is a test for the interp1 solution:
a = -15;
b = 15;
rands = (b-a).*rand(1024,37499) + a;
sincx = -15:0.000005:15;
sincy = sinc(sincx);
tic
res1 = interp1(sincx,sincy,rands);
toc
tic
res2 = sinc(rands);
toc'
sincx = gpuArray(sincx);
sincy = gpuArray(sincy);
r = gpuArray(rands);
tic
r = interp1(sincx,sincy,r);
toc
r = gpuArray(rands);
tic
r = sinc(r);
toc
Elapsed time is 0.426091 seconds.
Elapsed time is 0.472551 seconds.
Elapsed time is 0.004311 seconds.
Elapsed time is 0.130904 seconds.
Corresponding to CPU interp1, CPU sinc, GPU interp1, GPU sinc respectively
Not sure I understood completely your problem.
But once you have LookUp = sinc(y) you can use the Matlab function interp1
out = interp1(y,LookUp,x)
where x can be a matrix of any size
I came to the conclusion, that your code can not be improved significantly. The fastest possible lookup table is based on simple indexing. For a performance test, lets just perform the test based on random data:
%test data:
x=rand(2048,37499);
%relevant code:
out = sinc(x);
Now the lookup based on integer indices:
a=min(x(:));
b=max(x(:));
n=1000;
x2=round((x-a)/(b-a)*(n-1)+1);
lookup=sinc(1:n);
out2=lookup(x2);
Regardless of the size of the lookup table or the input data, the last lines in both code blocks take roughly the same time. Having sinc evaluate roughly as fast as a indexing operation, I can only assume that it is already implemented using a lookup table.
I found a faster way (if you have a NVIDIA GPU on your PC) , however this will return NaN for x=0, but if, for any reason, you can deal with having NaN or you know it will never be zero then:
if you define r = gpuArray(rands); and actually evaluate the sinc function by yourself in the GPU as:
tic
r=rdivide(sin(pi*r),pi*r);
toc
This generally is giving me about 3.2x the speed than the interp1 version in the GPU, and its more accurate (tested using your code above, iterating 100 times with different random data, having both methods similar std).
This works because sin and elementwise division rdivide are also GPU implemented (while for some reason sinc isn't) . See: http://uk.mathworks.com/help/distcomp/run-built-in-functions-on-a-gpu.html
m = min(x(:));
y = m:dy:max(x(:));
LookUp = sinc(y);
now sinc(n) should equal
LookUp((n-m)/dy + 1)
assuming n is an integer multiple of dy and lies within the range m and max(x(:)). To get to the LookUp index (i.e. an integer between 1 and numel(y), we first shift n but the minimum m, then scale it by dy and finally add 1 because MATLAB indexes from 1 instead of 0.
I don't know what that wll do for you efficiency though but give it a try.
Also you can put this into an anonymous function to help readability:
sinc_lookup = #(n)(LookUp((n-m)/dy + 1))
and now you can just call
sinc_lookup(n)
I've been given a stepsize h that to be used in the implementation of a Runge-Kutta method for higher order ODEs. My problem comes when dividing up the interval from the starting time t_0 to the final time t_f. I thought of using N = ceil((t_f-t_i)/h), and then using
t = linspace(t_0, t_f, N)
But I want to keep the points used by the Runge-Kutta algorithm spaced by h for most of the process, is there a way to include the endpoint t_f while keeping the stepsize at h for the first n-1 steps? I tried using
t = t_0:h:t_f
But this does not always include the endpoint t_f.
Assuming that you want use the largest N such that your range is t_0, t_0 + h, t_0 + 2*h, ..., t_0 + h*(N-2), t_f, you would use t = [t_0 : h : t_f-eps t_f], where eps is a MATLAB function that gives the smallest distance between two floating point numbers.
Be aware that this means your last two points can be very close together. In general, if you write t = [t_0 : h : t_f-a t_f] with a < h, then the distance between the final two points will be in the interval [a, h+a).
let say that from given function f(t), we want to construct new function which is given from existed function by this way
where T is some constant let say T=3; of course k can't be from -infinity to infinity in reality because we can't do infinity summation using computer,so it is first my afford
first let us define our function
function y=f(t);
y=-1/(t^2);
end
and second program
k=-1000:1:999;
F=zeros(1,length(k));
T=3;
for t=1:length(k)
F(t)=sum(f(t+k*T));
end
but when i am running second program ,i am getting
>> program
Error using ^
Inputs must be a scalar and a square matrix.
To compute elementwise POWER, use POWER (.^) instead.
Error in f (line 2)
y=-1/(t^2);
Error in program (line 5)
F(t)=sum(f(t+k*T));
so i have two question related to this program :
1.first what is error why it shows me mistake
how can i do it in excel? can i simplify it somehow? thanks in advance
EDITED :
i have changed my code by this way
k=-1000:1:999;
F=zeros(1,length(k));
T=3;
for t=1:length(k)
result=0;
for l=1:length(k)
result=result+f(t+k(l)*T);
end
F(t)=result;
end
is it ok?
To solve your problem in a vectorized way, you'll have to change the function f such that it can be called with vectors as input. This is, as #patrik suggested, achieved by using the element-wise operators .* ./ .^ (Afaik, no .+ .- exist). Unfortunately the comment of #rayryeng is not entirely correct, which may have lead to confusion. The correct way is to use the element-wise operators for both the division ./ and the square .^:
function y = f(t)
y = -1 ./ (t.^2);
end
Your existing code (first version)
k = -1000:1:999;
F = zeros(1,length(k));
T = 3;
for t=1:length(k)
F(t) = sum(f(t+k*T));
end
then works as expected (and is much faster then the version you posted in the edit).
You can even eliminate the for loop and use arrayfun instead. For simple functions f, you can also use function handles instead of creating a separate file. This gives
f = #(t) -1 ./ (t.^2);
k = -1000:1:999;
t = 1:2000;
T = 3;
F = arrayfun(#(x)sum(f(x+k*T)), t);
and is even faster and a simple one-liner. arrayfun takes any function handle as first input. We create a function handle which takes an argument x and does the sum over all k: #(x) sum(f(x+k*T). The second argument, the vector t, contains all values for which the function handle is evaluated.
As proposed by #Divakar in comments, you can also use the bsxfun function:
f = #(t) -1 ./ (t.^2);
k = -1000:1:999;
t = 1:2000;
T = 3;
F = sum(f(bsxfun(#plus,k*T,t.')),2);
where bsxfun creates a matrix containing all combinations between t and k*T, they are all evaluated using f(...) and last, the sum along the second dimension sums over all k's.
Benchmarking
Lets compare these solutions:
Combination of for loop and sum (original question):
Elapsed time is 0.043969 seconds.
Go through all combinations in 2 for loops (edited question):
Elapsed time is 1.367181 seconds.
Vectorized approach with arrayfun:
Elapsed time is 0.063748 seconds.
Vectorized approach with bsxfun as proposed by #Divakar:
Elapsed time is 0.099399 seconds.
So (sadly) the first solution including a for loop beats both vectorized approaches. For larger k vectors (-10000:1:9999), this behavior can be reproduced. The conclusion seems to be that MATLAB has indeed learned how to optimize for loops.
In my application, I need to find the "closest" (minimum Euclidean distance vector) to an input vector, among a set of vectors (i.e. a matrix)
Therefore every single time I have to do this :
function [match_col] = find_closest_column(input_vector, vectors)
cmin = 99999999999; % current minimum distance
match_col = -1;
for col=1:width
candidate_vector = vectors(:,c); % structure of the input is not important
dist = norm(input_vector - candidate_vector);
if dist < cmin
cmin = dist;
match_col = col;
end
end
is there a built-in MATLAB function that does this kind of thing easily (with a short amount of code) for me ?
Thanks for any help !
Use pdist2. Assuming (from your code) that your vectors are columns, transposition is needed because pdist2 works with rows:
[cmin, match_col] = min(pdist2(vectors.', input_vector.' ,'euclidean'));
It can also be done with bsxfun (in this case it's easier to work directly with columns):
[cmin, match_col] = min(sum(bsxfun(#minus, vectors, input_vector).^2));
cmin = sqrt(cmin); %// to save operations, apply sqrt only to the minimizer
norm can't be directly applied to every column or row of a matrix, so you can use arrayfun:
dist = arrayfun(#(col) norm(input_vector - candidate_vector(:,col)), 1:width);
[cmin, match_col] = min(dist);
This solution was also given here.
HOWEVER, this solution is much much slower than doing a direct computation using bsxfun (as in Luis Mendo's answer), so it should be avoided. arrayfun should be used for more complex functions, where a vectorized approach is harder to get at.
I need to calculate the euclidean distance between 2 matrices in matlab. Currently I am using bsxfun and calculating the distance as below( i am attaching a snippet of the code ):
for i=1:4754
test_data=fea_test(i,:);
d=sqrt(sum(bsxfun(#minus, test_data, fea_train).^2, 2));
end
Size of fea_test is 4754x1024 and fea_train is 6800x1024 , using his for loop is causing the execution of the for to take approximately 12 minutes which I think is too high.
Is there a way to calculate the euclidean distance between both the matrices faster?
I was told that by removing unnecessary for loops I can reduce the execution time. I also know that pdist2 can help reduce the time for calculation but since I am using version 7. of matlab I do not have the pdist2 function. Upgrade is not an option.
Any help.
Regards,
Bhavya
Here is vectorized implementation for computing the euclidean distance that is much faster than what you have (even significantly faster than PDIST2 on my machine):
D = sqrt( bsxfun(#plus,sum(A.^2,2),sum(B.^2,2)') - 2*(A*B') );
It is based on the fact that: ||u-v||^2 = ||u||^2 + ||v||^2 - 2*u.v
Consider below a crude comparison between the two methods:
A = rand(4754,1024);
B = rand(6800,1024);
tic
D = pdist2(A,B,'euclidean');
toc
tic
DD = sqrt( bsxfun(#plus,sum(A.^2,2),sum(B.^2,2)') - 2*(A*B') );
toc
On my WinXP laptop running R2011b, we can see a 10x times improvement in time:
Elapsed time is 70.939146 seconds. %# PDIST2
Elapsed time is 7.879438 seconds. %# vectorized solution
You should be aware that it does not give exactly the same results as PDIST2 down to the smallest precision.. By comparing the results, you will see small differences (usually close to eps the floating-point relative accuracy):
>> max( abs(D(:)-DD(:)) )
ans =
1.0658e-013
On a side note, I've collected around 10 different implementations (some are just small variations of each other) for this distance computation, and have been comparing them. You would be surprised how fast simple loops can be (thanks to the JIT), compared to other vectorized solutions...
You could fully vectorize the calculation by repeating the rows of fea_test 6800 times, and of fea_train 4754 times, like this:
rA = size(fea_test,1);
rB = size(fea_train,1);
[I,J]=ndgrid(1:rA,1:rB);
d = zeros(rA,rB);
d(:) = sqrt(sum(fea_test(J(:),:)-fea_train(I(:),:)).^2,2));
However, this would lead to intermediary arrays of size 6800x4754x1024 (*8 bytes for doubles), which will take up ~250GB of RAM. Thus, the full vectorization won't work.
You can, however, reduce the time of the distance calculation by preallocation, and by not calculating the square root before it's necessary:
rA = size(fea_test,1);
rB = size(fea_train,1);
d = zeros(rA,rB);
for i = 1:rA
test_data=fea_test(i,:);
d(i,:)=sum( (test_data(ones(nB,1),:) - fea_train).^2, 2))';
end
d = sqrt(d);
Try this vectorized version, it should be pretty efficient. Edit: just noticed that my answer is similar to #Amro's.
function K = calculateEuclideanDist(P,Q)
% Vectorized method to compute pairwise Euclidean distance
% Returns K(i,j) = sqrt((P(i,:) - Q(j,:))'*(P(i,:) - Q(j,:)))
[nP, d] = size(P);
[nQ, d] = size(Q);
pmag = sum(P .* P, 2);
qmag = sum(Q .* Q, 2);
K = sqrt(ones(nP,1)*qmag' + pmag*ones(1,nQ) - 2*P*Q');
end