Average values in one file based upon values in another - matlab

I have a problem which I hope contributors can help me solve. I think it is best just to provide a working example:
I have two cells both of which consist of the same number of matrices (the result of reading a series of data files followed by some loop calculations). Each matrix is a column of decimal year days followed by a series of columns of data. Here is the dummy data:
A = [ 186.356 1 2 3 4;186.364 2 3 4 5;186.372 3 4 5 6]
B = [ 187.356 1 2 3 4;187.364 2 3 4 5;187.372 3 4 5 6]
C = [ 188.356 1 2 3 4;188.364 2 3 4 5;188.372 3 4 5 6]
x = {A,B,C}
D = [ 186.3568 1 2 3 4; 186.3576 2 3 4 5; 186.3584 3 4 5 6; 186.3592 4 5 6 7; 186.36 5 6 7 8; 186.3608 6 7 8 9; 186.3616 7 8 9 10; 186.3624 8 9 10 11; 186.3632 9 10 11 12; 186.364 10 11 12 13; 186.3648 11 12 13 14; 186.3656 12 13 14 15]
E = [ 187.3568 1 2 3 4; 187.3576 2 3 4 5; 187.3584 3 4 5 6; 187.3592 4 5 6 7; 187.36 5 6 7 8; 187.3608 6 7 8 9; 187.3616 7 8 9 10; 187.3624 8 9 10 11; 187.3632 9 10 11 12; 187.364 10 11 12 13; 187.3648 11 12 13 14; 187.3656 12 13 14 15]
F = [ 188.3568 1 2 3 4; 188.3576 2 3 4 5; 188.3584 3 4 5 6; 188.3592 4 5 6 7; 188.36 5 6 7 8; 188.3608 6 7 8 9; 188.3616 7 8 9 10; 188.3624 8 9 10 11; 188.3632 9 10 11 12; 188.364 10 11 12 13; 188.3648 11 12 13 14; 188.3656 12 13 14 15]
y = {D,E,F}
My intention is to sum the data columns contained within both x and y. However, the time resolution of the data in y is much higher than that of x, so I would first like to average the data in y over the time steps of x.
As an example, the first time period which matches between x and y corresponds to row 1 in matrix A but only the first 10 rows in matrix D. The sum of the first row in A is 10:
sumA = sum(A(1,2:end),2)
and the average of the first 10 rows in D is
sumD = sum(mean(D(1:10,2:end)),2)
resulting in a total of 38.
This is a simple example; I have many rows of data in two large cells. I suspect I need to extract the data from the cells, loop through the data whilst rewriting to another cell of the same dimensions as the first two cells, x and y but am at a loss as to where to start. Any help would be great.
Edit
In looking to clarify my problem I realise I made a mistake in the original question. This is no doubt the cause of the confusion.
Everything above is correct; however, the sum of the column means of the first 10 rows of D:
sumD = sum(mean(D(1:10,2:end)),2)
sumD =
28
should actually be added to the sum of the second row in A:
sumA = sum(A(2,2:end),2)
sumA =
14
This is because all the values in rows 1-10 of column 1 in matrix D are larger than the value in row 1, column 1 of matrix A but smaller than or equal to the value in row 2, column 1 of matrix A. It might be easier if I increase the dummy data in matrix D:
D = [ 186.3568 1 2 3 4; 186.3576 2 3 4 5; 186.3584 3 4 5 6; 186.3592 4 5 6 7; 186.36 5 6 7 8; 186.3608 6 7 8 9; 186.3616 7 8 9 10; 186.3624 8 9 10 11; 186.3632 9 10 11 12; 186.364 10 11 12 13; 186.3648 11 12 13 14; 186.3656 12 13 14 15; 186.3664 13 14 15 16; 186.3672 14 15 16 17; 186.368 15 16 17 18; 186.3688 16 17 18 19; 186.3696 17 18 19 20; 186.3704 18 19 20 21; 186.3712 19 20 21 22; 186.372 20 21 22 23]
Now the result would be a two-value vector. The first value would be 28+14, the sum of the second row in A (sumA) plus the sum of the means of the first 10 rows of data in matrix D (sumD). The second value would be the sum of the third row in A, let's say sumA2:
sumA2 = sum(A(3,2:end),2)
sumA2 =
18
and sumD2:
sumD2 = sum(mean(D(11:end,2:end)),2)
sumD2 =
68
sumA2+sumD2
ans =
86
I would like this process to be automated so that I can go through each matrix in the cell. i.e. if I start with cells x and y with dims:
x =
[300x5 double] [300x5 double] [300x5 double]
y =
[2000x5 double] [2000x5 double] [2000x5 double]
I would like the result to be
z =
[300x1 double] [300x1 double] [300x1 double]
I am not sure if that makes things any clearer, but let's see!

Well, if I managed to properly get all your tricky specifications, here is the code:
function z = foo(x, y)
% For each pair x{i}, y{i}: row sums of x{i} plus the summed column means
% of the y{i} rows whose time falls in (previous x time, current x time].
z = x;
for i = 1:length(x)
    z{i} = sum(z{i}(:, 2:end), 2);
    t = y{i}(:, 1);
    dmin = 0;
    for j = 1:size(x{i}, 1)
        dmax = x{i}(j, 1);
        mask = t > dmin & t <= dmax;
        if any(mask)
            % mean(..., 1) averages down the columns even if only one row matches
            z{i}(j) = z{i}(j) + sum(mean(y{i}(mask, 2:end), 1), 2);
        end
        dmin = dmax;
    end
end
end
For the given x and y from the question, for the answer z I have z{1} == z{2} == z{3}, and
>> z{1}
ans =
10
42
70
If I substitute D from your "Edit" section, I get z{1}(3) == 86, as you claimed.
Nothing special about the code. dmin and dmax hold the current time range, taken from the first column of a matrix in x (i.e. A, B, etc.). The if any(mask) check is needed to avoid taking the mean of an empty array, which yields a vector of NaNs and corrupts the sum.
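The inner loop could also be replaced by binning. What follows is only a sketch under stated assumptions, not the answerer's code: it handles a single pair of matrices (A and the original 12-row D from the question), assumes discretize is available (R2015a or later), and the names bin, counts, avg and z1 are illustrative. Unlike the loop, the first bin is open down to -Inf rather than starting at 0.

```matlab
% Dummy data from the question. Times are built from integers so that equal
% timestamps in A and D compare as exactly equal.
A = [[1863560; 1863640; 1863720] / 10000, bsxfun(@plus, (1:3)', 0:3)];
D = [(1863568:8:1863656)' / 10000, bsxfun(@plus, (1:12)', 0:3)];

n = size(A, 1);
% Bin k collects the rows of D with A(k-1,1) < t <= A(k,1).
bin = discretize(D(:, 1), [-Inf; A(:, 1)], 'IncludedEdge', 'right');
ok = ~isnan(bin);                        % rows of D after the last time in A are ignored
counts = accumarray(bin(ok), 1, [n 1]);  % number of D rows per bin

avg = zeros(n, size(D, 2) - 1);
for c = 2:size(D, 2)
    % Per-bin column mean; empty bins contribute 0 instead of NaN.
    avg(:, c - 1) = accumarray(bin(ok), D(ok, c), [n 1]) ./ max(counts, 1);
end

z1 = sum(A(:, 2:end), 2) + sum(avg, 2);
```

For the dummy data this gives z1 = [10; 42; 70], which matches z{1} above. Applying it to whole cells would still need a loop (or cellfun) over the pairs x{i}, y{i}.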

Related

Merging two three column matrices: uniques in cols 1&2, maximum in col 3

I have two matrices:
a = [ 1 10 20; 2 11 22; 3 12 34; 4 13 12];
b = [ 3 12 1; 4 13 25; 5 14 60; 6 15 9 ];
I want to merge them into a single matrix; where columns 1 and 2 are identical, the row with the maximum in column 3 should be used, i.e. the resulting matrix should look like this:
c = [ 1 10 20; 2 11 22; 3 12 34; 4 13 25; 5 14 60; 6 15 9];
Any suggestions how to do this easily in MATLAB would be greatly appreciated. I've banged my head on a wall trying to use intersect but to no avail.
When merging the arrays you should make sure that they end up sorted (by col1 then col2 then col3). Fortunately, the union function does exactly that.
In your example, the value in column 1 alone identifies each group, so we only need to look at column 1 to choose the correct rows. We keep a row wherever diff returns a nonzero value (which means it is the bottom row of a group):
a = [ 1 10 20; 2 11 22; 3 12 34; 4 13 12];
b = [ 3 12 1; 4 13 25; 5 14 60; 6 15 9];
c = [ 1 10 20; 2 11 22; 3 12 34; 4 13 25; 5 14 60; 6 15 9 ];
u = union(a,b,'rows'); % this merges and sorts the arrays
r = u(logical([diff(u(:,1)); 1]),:); % since the array is sorted, the last entry will have
% the maximum value in column 3
assert(isequal(r,c));
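If column 1 alone does not identify a group, the same idea can be applied to both key columns. This is a hedged sketch of that variation rather than part of the original answer; lastOfGroup and r2 are illustrative names.

```matlab
a = [1 10 20; 2 11 22; 3 12 34; 4 13 12];
b = [3 12 1; 4 13 25; 5 14 60; 6 15 9];

u = union(a, b, 'rows');   % merged and sorted by column 1, then 2, then 3

% A row is the last of its group when the (col 1, col 2) key changes on the
% next row; the final row always closes a group.
lastOfGroup = [any(diff(u(:, 1:2), 1, 1) ~= 0, 2); true];
r2 = u(lastOfGroup, :);
```

Because the rows are sorted, the last row of each group carries the largest value in column 3.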
You can also use a mix between unique and accumarray.
Use unique to create an index based on the 2 first columns
Use accumarray to find the maximum value in the third column according to the index.
The code:
a = [ 1 10 20;2 11 22; 3 12 34; 4 13 12];
b = [3 12 1; 4 13 25; 5 14 60; 6 15 9];
M = [a;b];
[res,~,ind] = unique(M(:,1:2),'rows');
c = [res,accumarray(ind,M(:,3),[],@max)]
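Read about unique. Sort the rows by column 3 in descending order first, so that the first occurrence of each (column 1, column 2) pair is the row with the maximum:
a = [ 1 10 20;2 11 22; 3 12 34; 4 13 12];
b = [3 12 1; 4 13 25; 5 14 60; 6 15 9];
A = sortrows([a;b], -3) ; % largest column-3 values first
[~,ia] = unique(A(:,1:2),'rows','first') ; % first occurrence of each key
C = A(ia,:)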

If A is a vector subset of B, how can I find the indices of A within B in MATLAB?

Consider a row vector A and row vector B. For example:
A = [1 2 3 7 8 10 12];
B = [1 1 2 2 2 3 5 6 6 7 7 7 8 8 10 10 10 11 12 12 12 13 15 16 18 19];
A has previously been checked to be a subset of B. By subset, I specifically mean that all elements in A can be found in B. I know that elements in A will not ever repeat. However, the elements in B are free to repeat as many or as few times as they like. I checked this condition using:
is_subset = all(ismember(A,B));
With all that out of the way, I need to know the indices of the elements of A within B including the times when these elements repeat within B. For the example A and B above, the output would be:
C = [1 2 3 4 5 6 10 11 12 13 14 15 16 17 19 20 21];
Use ismember to find the relevant logical indices. Then convert them to linear indices using find.
C = find(ismember(B,A));
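If it is also useful to know which element of A each index corresponds to, the second output of ismember can provide that. A small sketch using the vectors from the question; whichA is an illustrative name.

```matlab
A = [1 2 3 7 8 10 12];
B = [1 1 2 2 2 3 5 6 6 7 7 7 8 8 10 10 10 11 12 12 12 13 15 16 18 19];

[tf, loc] = ismember(B, A);  % loc(k) is the position of B(k) in A, or 0 if absent
C = find(tf);                % indices into B of elements that occur in A
whichA = loc(tf);            % for each of those indices, the matching index into A
```

Here B(C(k)) equals A(whichA(k)) for every k.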
You can find the difference of each element of A with B, and get the indices you want. Something like below:
A = [1 2 3 7 8 10 12];
B = [1 1 2 2 2 3 5 6 6 7 7 7 8 8 10 10 10 11 12 12 12 13 15 16 18 19];
C = [1 2 3 4 5 6 10 11 12 13 14 15 16 17 19 20 21];
tol = 10^-3 ;
N = length(A) ;
iwant = cell(N,1) ;
for i = 1:N
idx = abs(A(i)-B)<=tol ;
iwant{i} = find(idx) ;
end
iwant = [iwant{:}] ;
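When a tolerance is really needed (for example with non-integer data), the loop might be replaced by ismembertol. This sketch assumes R2015a or later; setting 'DataScale' to 1 makes tol an absolute tolerance, which mirrors the abs(A(i)-B)<=tol test above. C_tol is an illustrative name.

```matlab
A = [1 2 3 7 8 10 12];
B = [1 1 2 2 2 3 5 6 6 7 7 7 8 8 10 10 10 11 12 12 12 13 15 16 18 19];

tol = 1e-3;
% True where an element of B lies within tol of some element of A.
C_tol = find(ismembertol(B, A, tol, 'DataScale', 1));
```

The indices come back in ascending order of position in B, whereas the loop groups them by element of A; for sorted vectors like these the two orders coincide.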

How to create a matrix B from a matrix A using conditions in MATLAB

If I have this matrix:
A:
X Y Z
1 1 2
0 3 4
0 5 6
2 7 8
7 9 10
8 11 12
3 13 14
12 14 16
15 17 18
How could I create new matrix B, C, D and E which contains:
B:
0 3 4
0 5 6
C:
X Y Z
1 1 2
2 7 8
3 13 14
D:
7 9 10
8 11 12
E:
12 14 16
15 17 18
The idea is to construct a loop asking if 0<A<1 else 1<A<5 else 6<A<10 else 11<A<15. and create new matrix from that condition. Any idea about how to store the results of the loop?
I suggest an approach that uses the discretize function to group the matrix rows into different categories based on their range. Here is the full implementation:
A = [
1 1 2;
0 3 4;
0 5 6;
2 7 8;
7 9 10;
8 11 12;
3 13 14;
12 14 16;
15 17 18
];
A_range = [0 1 5 10 15];
bin_idx = discretize(A(:,1),A_range);
A_split = arrayfun(@(bin) A(bin_idx == bin,:),1:(numel(A_range) - 1),'UniformOutput',false);
celldisp(A_split);
Since you want to consider four different ranges based on the first-column values, the arguments passed to discretize must be the first matrix column and a vector containing the five group limits. Each bin includes its left limit and excludes its right limit, except the last bin, which includes both. Since your ranges are a little inconsistent, feel free to adjust the limits to get the output you need. The result is returned as a cell array of double matrices in which every element contains the rows belonging to a distinct group:
A_split{1} =
0 3 4
0 5 6
A_split{2} =
1 1 2
2 7 8
3 13 14
A_split{3} =
7 9 10
8 11 12
A_split{4} =
12 14 16
15 17 18
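To see how discretize treats values that fall exactly on a limit, a few probe values can be binned with the same limits. This is only an illustration; probe and bins are made-up names.

```matlab
edges = [0 1 5 10 15];
probe = [0 1 4.9 5 10 15 16];

% Each bin is [left, right), except the last one, which is [10, 15].
% Values outside all bins map to NaN.
bins = discretize(probe, edges);
% bins = 1 2 2 3 4 4 NaN
```

This is why a first-column value of 15 still lands in the fourth group, while anything above 15 would not be assigned to any group.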
Instead of using a loop, use logical indexing to achieve what you want. Use the first column of A and check for the ranges that you want to look for, then use this to subset into the final matrix A to get what you want.
For example, to create the matrix C, find all locations in the first column of A that are between 1 and 5, then subset the matrix along the rows using these locations:
m = A(:,1) >= 1 & A(:,1) <= 5;
C = A(m,:);
You can repeat this in a similar way for the rest of the matrices you want to create.
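To store all the pieces at once, the masks can be built in a short loop over the limits and the results kept in a cell array. A sketch, assuming the limits 0, 1, 5, 10 and 15 with the left limit inclusive and the right limit exclusive, except for the last range; these limits are an interpretation of the question's ranges, and edges and parts are illustrative names.

```matlab
A = [1 1 2; 0 3 4; 0 5 6; 2 7 8; 7 9 10; 8 11 12; 3 13 14; 12 14 16; 15 17 18];

edges = [0 1 5 10 15];
nParts = numel(edges) - 1;
parts = cell(1, nParts);             % parts{1} plays the role of B, parts{2} of C, ...
for k = 1:nParts
    if k < nParts
        m = A(:, 1) >= edges(k) & A(:, 1) <  edges(k + 1);
    else
        m = A(:, 1) >= edges(k) & A(:, 1) <= edges(k + 1);  % last range closed on the right
    end
    parts{k} = A(m, :);              % rows keep their original order
end
```

The loop runs once per range rather than once per row, so it stays cheap even for a large A.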

How to efficiently reshape one column matrix to many specific length columns by moving specific interval

The input is an N-by-1 matrix. I need to reshape it to L-by-M matrix. The following is an example.
Input:
b =
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Set length = 18, Output:
X =
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
7 8 9
8 9 10
9 10 11
10 11 12
11 12 13
12 13 14
13 14 15
14 15 16
15 16 17
16 17 18
17 18 19
18 19 20
Because I have a very big matrix, using a loop to reshape is very inefficient. How can I improve the reshape speed?
Your example output matrix X is the perfect matrix to index a vector of length N to get what you want. It's also very easy to create using bsxfun:
N = 20;
b = rand(N,1);
M = 3; %// number of columns
L = N-M; %// Note that N-M is an upper limit for L!
idx = bsxfun(@plus, (0:L)', 1:M)
X = b(idx)
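Another toolbox-free possibility is the built-in hankel function, since the desired output has constant anti-diagonals. This is a sketch rather than a benchmarked solution; len is an illustrative name for the number of rows in the output.

```matlab
b = (1:20).';    % example data from the question
len = 18;        % number of rows in the output

% hankel(c, r) builds a matrix whose first column is c and whose last row is r,
% so element (i, j) is b(i + j - 1).
X = hankel(b(1:len), b(len:end));
```

Note that b(len) is both the last element of the first argument and the first element of the second, which is what hankel expects.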
That's exactly what im2col (from the Image Processing Toolbox) does:
b = (1:20).'; %'// example data
L = 18; % // desired length of sliding blocks
x = im2col(b, [L 1]); % // result
I'd use horzcat. For example:
function X = reshaper(b,len)
nCols = length(b) - len + 1; % number of sliding windows (avoids shadowing the built-in diff)
X = b(1:len);
for i = 2:nCols
    X = horzcat(X, b(i:len+(i-1)));
end
end
You could probably remove the for loop with some further thought.

How can I discard some unwanted rows from a matrix in Matlab?

I have a matrix
A= [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16; 17 18 19 20]
I want to do some calculations on this matrix, but I do not need all the rows, so I have to discard some of them first. After discarding two rows (rows 2 and 4), we have a new matrix:
B= [1 2 3 4; 9 10 11 12; 17 18 19 20];
Now I have to use B to make some other calculations. So how can I discard some of the unwanted rows from a matrix in matlab? Any suggestion will be helpful. Thanks.
Try this (useful when the number of rows to keep is small):
%// Input A
A = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16; 17 18 19 20];
%// Rows (1-3,5) you wanted to keep
B = A([1:3, 5],:)
Output:
B =
1 2 3 4
5 6 7 8
9 10 11 12
17 18 19 20
Alternative (useful when the number of rows to discard is small):
%// rows 2 and 3 discarded
A([2,3],:) = [];
Output:
>> A
A =
1 2 3 4
13 14 15 16
17 18 19 20
Note: In the alternative method, the output replaces the original A, so back up A first if you need it afterwards:
%// Input A is backed up in B
B = A;
You can select the indices of the rows you want to keep:
A([1,3,5],:)
ans =
1 2 3 4
9 10 11 12
17 18 19 20
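If it is more natural to list the rows to discard while leaving A untouched, a logical mask is one option. A minimal sketch; discard and keep are illustrative names.

```matlab
A = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16; 17 18 19 20];

discard = [2 4];                 % rows that are not needed
keep = true(size(A, 1), 1);      % start by keeping every row
keep(discard) = false;

B = A(keep, :);                  % A itself is left unchanged
```

This reproduces the B from the question, and A remains available for later use.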