Using MATLAB, how can I find the 3-day moving average of a specific column of a matrix and append the moving average to that matrix? I am trying to compute the 3-day moving average from bottom to top of the matrix. I have provided my code:
Given the following matrix a and mask:
a = [1,2,3;4,5,6;7,8,9;10,11,12;13,14,15;16,17,18];
mask = ones(3,1);
I have tried implementing the conv command but I am receiving an error. Here is the conv command I have been trying to use on the 2nd column of matrix a:
a(:,4) = conv(a(:,2),mask,'valid');
The output I desire is given in the following matrix:
desiredOutput = [1,2,3,5;4,5,6,8;7,8,9,11;10,11,12,14;13,14,15,0;16,17,18,0;]
If you have any suggestions, I would greatly appreciate it. Thank you!
In general it would help if you would show the error. In this case you are doing two things wrong:
First, your convolution needs to be divided by three (or, in general, by the length of the moving average) to turn the windowed sum into a mean:
c = conv(a(:,2),mask,'valid')/3
c =
5
8
11
14
Second, notice the size of c: it has only 4 elements, while a has 6 rows, so you cannot assign c directly to a(:,4). The typical way of getting a moving average with the same length as the input would be to use 'same':
a(:,4) = conv(a(:,2),mask,'same')/3
a =
1.0000 2.0000 3.0000 2.3333
4.0000 5.0000 6.0000 5.0000
7.0000 8.0000 9.0000 8.0000
10.0000 11.0000 12.0000 11.0000
13.0000 14.0000 15.0000 14.0000
16.0000 17.0000 18.0000 10.3333
but that doesn't look like what you want.
Instead you are forced to use a couple of lines:
c = conv(a(:,2),mask,'valid')/3;
a(1:length(c),4) = c
a =
1 2 3 5
4 5 6 8
7 8 9 11
10 11 12 14
13 14 15 0
16 17 18 0
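If movmean is available (it was introduced in R2016a), the same forward-looking average can be sketched without conv. The window [0 k-1] and the zero padding are choices made here to reproduce the desired output; they are not part of the answer above:

```matlab
a = [1,2,3;4,5,6;7,8,9;10,11,12;13,14,15;16,17,18];
k = 3;                                   % window length in rows
% The window [0 k-1] averages each element with the k-1 elements below it;
% 'discard' drops the incomplete windows at the bottom of the column.
c = movmean(a(:,2), [0 k-1], 'Endpoints', 'discard');
% Pad with zeros so the result has one entry per row of a.
a(:,4) = [c; zeros(size(a,1) - numel(c), 1)];
```

Replacing zeros with nan in the padding may be preferable if the trailing rows should be read as "no value" rather than as an average of 0.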
Is there a way to calculate a moving mean in a way that the values at the beginning and at the end of the array are averaged with the ones at the opposite end?
For example, instead of this result:
A=[2 1 2 4 6 1 1];
movmean(A,2)
ans = 2.0 1.5 1.5 3.0 5 3.5 1.0
I want to obtain the vector [1.5 1.5 1.5 3 5 3.5 1.0], as the initial array element 2 would be averaged with the ending element 1.
Generalizing to an arbitrary window size N, this is how you can add circular behavior to movmean in the way you want:
movmean(A([(end-floor(N./2)+1):end 1:end 1:(ceil(N./2)-1)]), N, 'Endpoints', 'discard')
For the given A and N = 2, you get:
ans =
1.5000 1.5000 1.5000 3.0000 5.0000 3.5000 1.0000
For an arbitrary window size n, you can use circular convolution (cconv, from the Signal Processing Toolbox) with an averaging mask defined as [1/n ... 1/n] (with n entries; in your example n = 2):
result = cconv(A, repmat(1/n, 1, n), numel(A));
Convolution offers some nice ways of doing this, though you may need to tweak your input slightly if you are only going to wrap one end (i.e., in your example the first element is averaged with the last, but the last is not averaged with the first).
conv([A(end),A],[0.5 0.5],'valid')
ans =
1.5000 1.5000 1.5000 3.0000 5.0000 3.5000 1.0000
The generalized case here, for a moving average of size N, is:
conv(A([end-N+2:end, 1:end]),repmat(1/N,1,N),'valid')
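For readers without the Signal Processing Toolbox, the same wrap-around average can be sketched with circshift alone. This assumes the trailing-window convention used by the cconv and conv answers (each element is averaged with the N-1 elements before it, wrapping past the start); the variable names are only illustrative:

```matlab
A = [2 1 2 4 6 1 1];
N = 2;                                % window size
acc = zeros(size(A));
for s = 0:N-1
    % Shift the row vector right by s positions, wrapping around the end.
    acc = acc + circshift(A, [0 s]);
end
result = acc / N;                     % circular moving mean
```

For N = 2 this reproduces the vector requested in the question; for larger N the window trails the current element rather than being centered on it.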
I have some matrices and I'd like to print them using printmat. I only have column names for them, so I don't need the row labels (also, I don't know how many rows there are).
So I tried:
printmat(test,'test name','','test1 test2 test3 test4')
But it told me
Error using printmat (line 66)
Not enough row labels.
What can I do now? Thx.
You could do it manually, but then you need to set number formatting and label spacing to match the columns. For example:
>> A = magic(4) %// example data
A =
16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1
>> disp([' test1 test2 test3 test4'; num2str(A, '%0.4f ')])
test1 test2 test3 test4
16.0000 2.0000 3.0000 13.0000
5.0000 11.0000 10.0000 8.0000
9.0000 7.0000 6.0000 12.0000
4.0000 14.0000 15.0000 1.0000
Note that incorrect spacing may give a concatenation error. Spacing has to be set manually.
By the way, printmat seems to be obsolete.
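One way to avoid matching the spacing by hand is to build both the header and the rows with the same field width. This is only a sketch; the width w and the label list names are assumptions chosen for the example:

```matlab
A = magic(4);                                   % example data
names = {'test1', 'test2', 'test3', 'test4'};   % one label per column
w = 10;                                         % field width shared by labels and numbers
% sprintf recycles the format over all labels, giving one right-aligned field each.
header = sprintf(['%' num2str(w) 's'], names{:});
% One fixed-width numeric field per column, then a newline.
rowfmt = [repmat(['%' num2str(w) '.4f'], 1, size(A,2)) '\n'];
% sprintf consumes data column by column, so pass the transpose to print row by row.
body = sprintf(rowfmt, A.');
fprintf('%s\n%s', header, body);
```

Because labels and numbers share one width, no concatenation of char arrays is involved and the number of rows does not need to be known in advance.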
Let's say I have a 6x5 matrix (my actual data is way bigger):
A B C D E
1 5 7 2 3
2 1 9 8 5
3 1 2 3 1
4 1 3 4 2
5 2 9 0 1
6 5 3 4 3
I have to make a plot with A on the x-axis and B,C,D on the y-axis. If I want to reduce the data points by half (by averaging each adjacent pair of data points), how do I do it? What if I want to decrease the points even further by averaging every five (or n) points?
I looked at the MATLAB help document but I'm still confused
I got what I needed, thanks for the input guys, it really helped
There you go:
M = [1 5 7 2 3
2 1 9 8 5
3 1 2 3 1
4 1 3 4 2
5 2 9 0 1
6 5 3 4 3]; % data
result = (M(1:2:end-1,:) + M(2:2:end,:))/2
result =
1.5000 3.0000 8.0000 5.0000 4.0000
3.5000 1.0000 2.5000 3.5000 1.5000
5.5000 3.5000 6.0000 2.0000 2.0000
An even number of rows scenario is straightforward, using mean to do the work:
>> M = magic(4)
M =
16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1
>> reshape(mean(reshape(M,2,[]),1),[],size(M,2))
ans =
10.5000 6.5000 6.5000 10.5000
6.5000 10.5000 10.5000 6.5000
For the odd number of rows scenario, let's say you want to retain the last row. Here's a general even/odd solution:
>> M = magic(5) % 5 rows!
M =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
>> Mp = [M; repmat(M(end,:),mod(size(M,1),2),1)]; % replicate last row if odd
>> Mthin = reshape(mean(reshape(Mp,2,[]),1),[],size(Mp,2))
Mthin =
20.0000 14.5000 4.0000 11.0000 15.5000
7.0000 9.0000 16.0000 20.5000 12.5000
11.0000 18.0000 25.0000 2.0000 9.0000
Alternatively, if you want to throw away the last row when encountered with an odd number of rows:
>> Mp = M(1:end-mod(size(M,1),2),:);
>> Mthin = reshape(mean(reshape(Mp,2,[]),1),[],size(Mp,2))
Mthin =
20.0000 14.5000 4.0000 11.0000 15.5000
7.0000 9.0000 16.0000 20.5000 12.5000
Now for averaging n points, retaining the average of the mod(size(M,1),n) last rows:
n = 5;
M = rand(972,5); % or whatever
p = mod(size(M,1),n);
r = repmat(mean(M(end-p+1:end,:),1),(p>0)*(n-p),1);
Mp = [M; r];
Mthin = reshape(mean(reshape(Mp,n,[]),1),[],size(Mp,2));
And for throwing out the last mod(size(M,1),n) rows:
Mp = M(1:end-mod(size(M,1),n),:);
Mthin = reshape(mean(reshape(Mp,n,[]),1),[],size(Mp,2));
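Applied to the 6x5 matrix from the question, the "throw out the remainder" variant might look like the following sketch; n = 3 is picked here only so that the six rows split into two complete groups:

```matlab
M = [1 5 7 2 3
     2 1 9 8 5
     3 1 2 3 1
     4 1 3 4 2
     5 2 9 0 1
     6 5 3 4 3];                       % columns A..E from the question
n = 3;                                 % number of rows averaged into one point
Mp = M(1:end-mod(size(M,1),n), :);     % drop leftover rows that do not fill a group
% Each column of the n-by-[] reshape holds one group of n values from one column of Mp.
Mthin = reshape(mean(reshape(Mp,n,[]),1), [], size(Mp,2));
```

The first column of Mthin then holds the averaged x values and columns 2 to 4 the averaged B, C and D series, so plot(Mthin(:,1), Mthin(:,2:4)) would draw the thinned data.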
>>> A= magic(5) %some "random" data
A =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
>>> B=(A(1:2:end-1,:)+A(2:2:end,:))/2
B =
20.0000 14.5000 4.0000 11.0000 15.5000
7.0000 9.0000 16.0000 20.5000 12.5000
Takes average of each pair of rows, ignores the last row if row count is not even.
And a more general solution:
% input data
X = randi(30,30,5);
step = 7;
% extend the matrix by repeating the last row until the row count
% is a multiple of step (could be done faster using repmat)
while mod(size(X,1),step) ~= 0
    X(end+1,:) = X(end,:);
end
% split the data into segments of "step" rows
C = mat2cell(X, repmat(step, size(X,1)/step, 1), size(X,2));
% average over each segment
AVG = cell2mat(cellfun(@(x) mean(x,1), C, 'UniformOutput', false))
So I have quite a few (over 60000) data points
f(x_k) = k, here k=0,1,2,...,N.
The function is monotonically increasing and visually looks pretty smooth. I would love to be able to find a fit F(x) such that k <= F(x_k) < k+1 for every x_k.
How should I approach this problem?
Data example
x 0 1 3 5 8 10 14 16 20 23 27 29 35 37 41
f(x) 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
(This looks a bit like a lookup table. Maybe an image processing application of some sort? I did some tools in my past life where an unrounding was needed.)
Is this a one time problem, or will you be doing it often, so you have a need for speed?
I'd throw it into SLM. Since I don't have the data, I cannot test it out or give you any results myself, but there is certainly no problem with an assured fit of the quality you wish as long as you use sufficient number of knots. You would need additional knots on the right hand side, as it appears to approach a vertical asymptote, thus a singularity. Splines in general tend not to like singularities, as they are still polynomials at heart.
Better yet, swap the x and y axes to do the fit, thus fitting x = f(y). The left end point is not an asymptote, so there is no longer a singularity. Now all you need do is constrain the result to be monotonic increasing, and concave down (thus everywhere a negative second derivative.) You will require far fewer knots for the inverse fit, but use enough knots that the fit is of adequate quality for your goals.
To use the inverse fit, simply interpolate in the reverse direction, something that SLMEVAL is capable of doing. I'll see how it does on the little bit of test data you have provided (with just the default number of knots):
x = [0 1 3 5 8 10 14 16 20 23 27 29 35 37 41];
y = [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14];
slm = slmengine(y,x,'plot','on','increasing','on');
So the fit seems reasonable, but I note that your data seems a bit bumpy. It may indeed be difficult to get a solution that is smooth, yet fits entirely within your requirements.
Let's see how well it did:
[x;y;slmeval(x,slm,-1)]'
ans =
0 0 0.0190
1.0000 1.0000 0.9656
3.0000 2.0000 2.0522
5.0000 3.0000 2.9239
8.0000 4.0000 4.1096
10.0000 5.0000 4.8419
14.0000 6.0000 6.1963
16.0000 7.0000 6.8331
20.0000 8.0000 8.0638
23.0000 9.0000 8.9699
27.0000 10.0000 10.1459
29.0000 11.0000 10.7088
35.0000 12.0000 12.2942
37.0000 13.0000 12.8285
41.0000 14.0000 NaN
It misses the last point completely, refusing to extrapolate. But the remainder are not far off. They do fail your requirement though, as it is not true that
k <= F(x_k) < k+1
Of course, I did not build the spline with such a requirement in the specs. Were I to try to solve this problem in general, I might write code that would estimate the values on the curve directly, with no spline intermediary. Then I could easily enforce your constraints, finding the smoothest set of points that satisfies your error bar requirements and monotonicity, that also lies as close to the original data as is possible. Of course, that would involve a large system solve, with 60k unknowns. I don't know how lsqlin would handle that large of a problem, but there are other solvers that might be able to do so if time was an issue.
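Written out, the direct approach described above amounts to a bound-constrained, regularized least-squares problem. The notation is introduced here only for illustration: \hat{y} is the vector of fitted values, k the vector of target integers, D a second-difference matrix for the unequally spaced x, and r a smoothing weight. This is a restatement of what lsqlin is asked to minimize in this answer, not an additional method:

```latex
\min_{\hat{y}} \;\; \tfrac{1}{2}\,\lVert \hat{y} - k \rVert_2^2
  \;+\; \tfrac{r^2}{2}\,\lVert D\,\hat{y} \rVert_2^2
\quad \text{subject to} \quad
\hat{y}_i \le \hat{y}_{i+1}, \qquad
k_i \le \hat{y}_i \le k_i + 1 .
```

Note that the upper bound is enforced as a non-strict inequality, whereas the question asks for F(x_k) < k+1; a solver working with closed bounds cannot express the strict version directly.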
Again, with your test data as a small scale example:
x = [0 1 3 5 8 10 14 16 20 23 27 29 35 37 41]';
n = numel(x);
k = (0:(n-1))';
% The "unrounding" bound constraints
LB = k;
UB = k+1;
% The best fit possible
Afit = speye(n,n);
% And as smooth as possible
ind = 1:(n-2);
% could do this with diff of course
dx1 = x(ind+1) - x(ind);
dx2 = x(ind+2) - x(ind + 1);
% central second finite difference, for unequal spacing
den = dx1.*dx2.*(dx1 + dx2)/2;
Areg = spdiags([dx2./den,-(dx1 + dx2)./den,dx1./den],[0 1 2],n-2,n);
rhs = [k;zeros(n-2,1)];
% monotonicity constraints...
Amono = spdiags(repmat([1 -1],n-1,1),[0 1],n-1,n);
bmono = zeros(n-1,1);
% choose a value for r, that allows you to control the smoothness
% larger values of r will make the curve smoother, but the bounds
% will always be enforced. I played with it, and r = 5 seemed a
% reasonable compromise here.
r = 5;
yhat = lsqlin([Afit;r*Areg],rhs,Amono,bmono,[],[],LB,UB);
lsqlin is a bit unhappy, since it does not handle sparse problems of this form at this time, so it throws a warning that it is converting the problem to a full one.
Warning: Large-scale algorithm can handle bound constraints only;
using medium-scale algorithm instead.
> In lsqlin at 270
Warning: This problem formulation not yet available for sparse matrices.
Converting to full to solve.
> In lsqlin at 320
Optimization terminated.
Of course, this conversion will be TOTALLY unacceptable for a problem with 60k unknowns: a single full 60000-by-60000 double matrix needs about 29 GB. DO NOT TRY IT ON 60k data points! Your computer will go into a deep freeze.
How did it do though?
disp([x,k,yhat,k+1])
0 0 0.4356 1.0000
1.0000 1.0000 1.0000 2.0000
3.0000 2.0000 2.0504 3.0000
5.0000 3.0000 3.0000 4.0000
8.0000 4.0000 4.2026 5.0000
10.0000 5.0000 5.0000 6.0000
14.0000 6.0000 6.2739 7.0000
16.0000 7.0000 7.0000 8.0000
20.0000 8.0000 8.0916 9.0000
23.0000 9.0000 9.0000 10.0000
27.0000 10.0000 10.2497 11.0000
29.0000 11.0000 11.0000 12.0000
35.0000 12.0000 12.2994 13.0000
37.0000 13.0000 13.0000 14.0000
41.0000 14.0000 14.0594 15.0000
It worked nicely, although it would be a hog of obscene proportions for large problems as you have. Perhaps there is another optimizer (maybe in TOMLAB or some other package) that can handle a large scale sparse linear problem, subject to linear and bound constraints. You also might wish to force the first point through zero, but that is trivial to do.
A final option, if say 1000 points is doable, is to recreate the curve in batches of 1010 at a time using the above scheme. lsqlin should be able to handle problems of that size with no problem. Leave some overlap at the ends; 5 points in each overlap region should be sufficient. Then average the results in the overlap regions.
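The bookkeeping for such a batched fit could be sketched as follows. Everything here is illustrative: n is a made-up problem size, and solveBatch is a hypothetical stand-in that just returns the indices, where a real version would run the lsqlin fit on the points x(idx):

```matlab
n       = 5000;                  % total number of points (illustrative)
batch   = 1010;                  % points per batch
overlap = 5;                     % points shared by consecutive batches
step    = batch - overlap;       % distance between batch start indices
starts  = 1:step:max(n - overlap, 1);

solveBatch = @(idx) idx(:);      % HYPOTHETICAL stand-in for the per-batch fit

ysum = zeros(n,1);               % running sum of fitted values per point
ycnt = zeros(n,1);               % number of batches covering each point
for s = starts
    idx = s:min(s + batch - 1, n);
    ysum(idx) = ysum(idx) + solveBatch(idx);
    ycnt(idx) = ycnt(idx) + 1;
end
yhat = ysum ./ ycnt;             % averages the results where batches overlap
```

Averaging in the overlap does not by itself guarantee monotonicity across a seam, so the stitched result may need a final check.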
I have a matrix of roughly 100000 rows by 2 columns. I'd like to go down the columns, average each pair of vertically adjacent values, and insert that average in between the two values. For example...
1 2
3 4
4 6
7 8
would become
1 2
2 3
3 4
3.5 5
4 6
5.5 7
7 8
I'm not sure if there is a terse way to do this in matlab. I took a look at http://www.mathworks.com/matlabcentral/fileexchange/9984 but it seems to insert all of the rows in a matrix into the other one at a specific point. Obviously it can still be used, but just wondering if there is a simpler way.
Any help is appreciated, thanks.
Untested:
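% Take the mean of adjacent pairs; row i+1 of x_mean holds (x(i,:) + x(i+1,:))/2
x_mean = ([x; 0 0] + [0 0; x]) / 2;
% Interleave the two matrices: x on the odd rows, the means on the even rows
y = kron(x, [1;0]) + kron(x_mean(2:end,:), [0;1]);
% Drop the trailing row, which only holds half of the last row of x
y = y(1:end-1,:);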
%# works for any 2D matrix of size N-by-M
X = rand(100,2);
adjMean = mean(cat(3, X(1:end-1,:), X(2:end,:)), 3);
Y = zeros(2*size(X,1)-1, size(X,2));
Y(1:2:end,:) = X;
Y(2:2:end,:) = adjMean;
octave-3.0.3:57> a = [1,2; 3,4; 4,6; 7,8]
a =
1 2
3 4
4 6
7 8
octave-3.0.3:58> b = (circshift(a, -1) + a) / 2
b =
2.0000 3.0000
3.5000 5.0000
5.5000 7.0000
4.0000 5.0000
octave-3.0.3:60> reshape(vertcat(a', b'), 2, [])'(1:end-1, :)
ans =
1.0000 2.0000
2.0000 3.0000
3.0000 4.0000
3.5000 5.0000
4.0000 6.0000
5.5000 7.0000
7.0000 8.0000
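Since the midpoint of two values is what linear interpolation gives halfway between them, the same result can also be sketched with interp1. Treating the row index as the interpolation coordinate is an assumption made here, not something taken from the answers above:

```matlab
X = [1 2; 3 4; 4 6; 7 8];        % example data from the question
N = size(X,1);
% Sample every column at the original row positions 1..N and at the
% half-way points in between; linear interpolation at i+0.5 is the
% mean of rows i and i+1.
Y = interp1(1:N, X, 1:0.5:N);
```

Y has 2*N-1 rows, with the original rows on the odd positions and the inserted means on the even ones.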