Hope title isn't confusing. It's simple to show by example. I have a row vector like so: [1 5 6]. I want to find the average difference between each element. The differences in this example are 4 and 1 so the average is 2.5. This is a small example. My row vectors might be very large. I'm new to MatLab so is there some efficient way of using MATLAB's efficient matrix/array manipulation to do this nicely?
There is a similar question on SOF already but this question is specifically for MATLAB!
Thanks :)
EDIT: As queried by #gnovice, I wanted the absolute difference.
Simple solution using diff and mean
aveDiff = mean(diff(myVector)) %#(1)
Example
>> v=[1 5 6]
v =
1 5 6
>> mean(diff(v))
ans =
2.5000
This works but #Jonas' answer is the correct solution.
Edit
From #gnovice, #vivid-colours and #sevenless comments.
The mean of the absolute value of the difference can be found by inserting abs into (1)
aveDiff = mean(abs(diff(myVector))) %#(2)
If you have an array array, then the average difference is
(array(end) - array(1))/(length(array)-1)
because diff(array), where array = [a b c d], is [b-a c-b d-c]. The average of that is (b-a+c-b+d-c)/3, which simplifies to (d-a)/3.
In your example
array = [1 5 6];
(array(end)-array(1))/2
ans =
2.5
If X is your vector, you can do
mean( X(2:end) - X(1:end-1) )
Related
I am trying to resolve why the following Matlab syntax does not work.
I have an array
A = [2 3 4 5 8 9...]
I wish to create an indexed cumulative, for example
s(1) = 2; s(2)=5, s(3)=9; ... and so on
Can someone please explain why the following does not work
x = 1:10
s(x) = sum(A(1:x))
The logic is that if a vector is created for s using x, why would not the sum function behave the same way? The above returns just the first element (2) for all x.
For calculating the cumulative sum, you should be using cumsum:
>> A = [2 3 4 5 8 9]
A =
2 3 4 5 8 9
>> cumsum(A)
ans =
2 5 9 14 22 31
The issue is that 1:x is 1 and that sum reduces linear arrays. To do this properly, you need a 2d array and then sum the rows:
s(x)=sum(triu(repmat(A,[prod(size(A)) 1])'))
You are asking two questions, really. One is - how do I compute the cumulative sum. #SouldEc's answer already shows how the cumsum function does that. Your other question is
Can someone please explain why the following does not work
x = 1:10
s(x) = sum(A(1:x))
It is reasonable - you think that the vector expansion should turn
1:x
into
1:1
1:2
1:3
1:4
etc. But in fact the arguments on either side of the colon operator must be scalars - they cannot be vectors themselves. I'm surprised that you say Matlab isn't throwing an error with your two lines of code - I would have expected that it would (I just tested this on Freemat, and it complained...)
So the more interesting question is - how would you create those vectors (if you didn't know about / want to use cumsum)?
Here, we could use arrayfun. It evaluates a function with an array as input element-by-element; this can be useful for a situation like this. So if we write
x = 1:10;
s = arrayfun(#(n)sum(A(1:n)), x);
This will loop over all values of x, substitute them into the function sum(A(1:n)), and voila - your problem is solved.
But really - the right answer is "use cumsum()"...
Actually what you are doing is
s(1:10)= sum(A(1:[1,2,3...10]))
what you should do is
for i=1:10
s(i)=sum(A(1:i))
end
hope it will help you
I am trying to resolve why the following Matlab syntax does not work.
I have an array
A = [2 3 4 5 8 9...]
I wish to create an indexed cumulative, for example
s(1) = 2; s(2)=5, s(3)=9; ... and so on
Can someone please explain why the following does not work
x = 1:10
s(x) = sum(A(1:x))
The logic is that if a vector is created for s using x, why would not the sum function behave the same way? The above returns just the first element (2) for all x.
For calculating the cumulative sum, you should be using cumsum:
>> A = [2 3 4 5 8 9]
A =
2 3 4 5 8 9
>> cumsum(A)
ans =
2 5 9 14 22 31
The issue is that 1:x is 1 and that sum reduces linear arrays. To do this properly, you need a 2d array and then sum the rows:
s(x)=sum(triu(repmat(A,[prod(size(A)) 1])'))
You are asking two questions, really. One is - how do I compute the cumulative sum. #SouldEc's answer already shows how the cumsum function does that. Your other question is
Can someone please explain why the following does not work
x = 1:10
s(x) = sum(A(1:x))
It is reasonable - you think that the vector expansion should turn
1:x
into
1:1
1:2
1:3
1:4
etc. But in fact the arguments on either side of the colon operator must be scalars - they cannot be vectors themselves. I'm surprised that you say Matlab isn't throwing an error with your two lines of code - I would have expected that it would (I just tested this on Freemat, and it complained...)
So the more interesting question is - how would you create those vectors (if you didn't know about / want to use cumsum)?
Here, we could use arrayfun. It evaluates a function with an array as input element-by-element; this can be useful for a situation like this. So if we write
x = 1:10;
s = arrayfun(#(n)sum(A(1:n)), x);
This will loop over all values of x, substitute them into the function sum(A(1:n)), and voila - your problem is solved.
But really - the right answer is "use cumsum()"...
Actually what you are doing is
s(1:10)= sum(A(1:[1,2,3...10]))
what you should do is
for i=1:10
s(i)=sum(A(1:i))
end
hope it will help you
I have two vectors of values and I want to compare them statistically. For simplicity assume A = [2 3 5 10 15] and B = [2.5 3.1 4.8 10 18]. I want to compute the standard deviation, the root mean square error (RMSE), the mean, and present conveniently, maybe as histogram. Can you please help me how to do it so that I understand? I know question is probably simple, but I am new into this. Many thanks!
edited:
This is how I wanted to implement RMSE.
dt = 1;
for k=1:numel(A)
err(k)=sqrt(sum(A(1,1:k)-B(1,1:k))^2/k);
t(k) = dt*k;
end
However it gives me bigger values than I expect, since e.g. 3 and 3.1 differ only in 0.1.
This is how I calculate error between reference value of each cycle with corresponding estimated in that cycle.
Can you tell me, am I doing right, or what's wrong?
abs_err = A-B;
The way you are looping through the vectors is not element by element but rather by increasing the vector length, that is, you are comparing the following at each iteration:
A(1,1:k) B(1,1:k)
-------- --------
k=1 [2] [2.5]
=2 [2 3] [2.5 3.1]
=3 [2 3 5] [2.5 3.1 4.8]
....
At no point do you compare only 2 and 2.1!
Assuming A and B are vectors of identical length (and both are either column or row vectors), then you want functions std(A-B), mean(A-B), and if you look in matlab exchange, you will find a user-contributed rmse(A-B) but you can also compute the RMSE as sqrt(mean((A-B).^2)). As for displaying a histogram, try hist(A-B).
In your case:
dt = 1;
for k=1:numel(A)
stdab(k) = std(A(1,1:k)-B(1,1:k));
meanab(k) = mean(A(1,1:k)-B(1,1:k));
err(k)=sqrt(mean((A(1,1:k)-B(1,1:k)).^2));
t(k) = dt*k;
end
You can also include hist(A(1,1:k)-B(1,1:k)) in the loop if you want to compute histograms for every vector pair difference A(1,1:k)-B(1,1:k).
It should return lowest index i if a number a lies between x(i) and x(i+1). I know it's not hard to write a function which would do this but is there any built-in Matlab function for this?
Assuming the vector elements are sorted it would be a trivial search O(logn) I guess but is there any better way to do it if the elements are not sorted without going through sorting?
Thanks in advance!
Logical indices are well suited for these kind of comparisons:
x = [6 2 6 7 3 5];
a = 4;
find(a > x(1:end-1) & a < x(2:end), 1)
ans = 2
Try
a=rand(1);
b=rand(1,10);
c=a-b;
find(c(2:end).*c(1:end-1)<0,1)
In MATLAB, is there a more concise way to handle discrete conditional indexing by column than using a for loop? Here's my code:
x=[1 2 3;4 5 6;7 8 9];
w=[5 3 2];
q=zeros(3,1);
for i = 1:3
q(i)=mean(x(x(:,i)>w(i),i));
end
q
My goal is to take the mean of the top x% of a set of values for each column. The above code works, but I'm just wondering if there is a more concise way to do it?
You mentioned that you were using the function PRCTILE, which would indicate that you have access to the Statistics Toolbox. This gives you yet another option for how you could solve your problem, using the function NANMEAN. In the following code, all the entries in x less than or equal to the threshold w for a column are set to NaN using BSXFUN, then the mean of each column is computed with NANMEAN:
x(bsxfun(#le,x,w)) = nan;
q = nanmean(x);
I don't know of any way to index the columns the way you want. This may be faster than a for loop, but it also creates a matrix y that is the size of x.
x=[1 2 3;4 5 6;7 8 9];
w=[5 3 2];
y = x > repmat(w,size(x,1),1);
q = sum(x.*y) ./ sum(y)
I don't claim this is more concise.
Here's a way to solve your original problem: You have an array, and you want to know the mean of the top x% of each column.
%# make up some data
data = magic(5);
%# find out how many rows the top 40% are
nRows = floor(size(data,1)*0.4);
%# sort the data in descending order
data = sort(data,1,'descend');
%# take the mean of the top 20% of values in each column
topMean = mean(data(1:nRows,:),1);