Group values in different rows by their first-column index - matlab

This question is an outgrowth of MatLab (or any other language) to convert a matrix or a csv to put 2nd column values to the same row if 1st column value is the same?
If
A = [2 3 234 ; 2 44 33; 2 12 22; 3 123 99; 3 1232 45; 5 224 57]
1st column | 2nd column | 3rd column
2 3 234
2 44 33
2 12 22
3 123 99
3 1232 45
5 224 57
then running
[U ix iu] = unique(A(:,1) );
r = accumarray( iu, A(:,2:3), [], @(x) {x'} )
will show me the error
Error using accumarray
Second input VAL must be a vector with one element for each row in SUBS, or a
scalar.
I want to make
1st col | 2nd col | 3rd col | 4th col | 5th col | 6th col| 7th col
2 3 234 44 33 12 22
3 123 99 1232 45
5 224 57
I know how to do it using for and if, but that spends too much time for big data.
How can I do this?
Thank you in advance!

You're misusing accumarray in the solution provided to your previous question. The first parameter iu is the vector of indices, and the second parameter should be a vector of values of the same length. What you did here is pass a matrix as the second parameter, which has twice as many values as there are indices in iu.
What you need to do in order to make it work is create a vector of indices both for the second column and for the third column (they are the same indices, not coincidentally!) and specify a matching column vector of values, like so:
[U, ix, iu] = unique(A(:,1));
vals = reshape(A(:, 2:end).', [], 1); %'// Columnize values
subs = reshape(iu(:, ones(size(A, 2) - 1, 1)).', [], 1); %'// Replicate indices
r = accumarray(subs, vals, [], @(x){x'});
This solution is generalized for any number of columns that you want to pass to accumarray.
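As a quick check, here is a minimal sketch that runs this approach on the sample A from the question and then pads the groups with NaN into a rectangular matrix, which is close to the layout the question asks for. The padding step and the names n and out are illustrative additions, not part of the answer above, and repelem (R2015a or later) is used only as a shorter way to replicate the indices. It assumes A is already sorted by its first column, since accumarray only preserves the order of values within a group when the subscripts are sorted.

```matlab
A = [2 3 234; 2 44 33; 2 12 22; 3 123 99; 3 1232 45; 5 224 57];

[U, ~, iu] = unique(A(:,1));              % U: group keys, iu: group index per row
vals = reshape(A(:, 2:end).', [], 1);     % values listed row by row as one column
subs = repelem(iu, size(A, 2) - 1);       % one group index per value
r = accumarray(subs, vals, [], @(x) {x.'});

% Pad each group with NaN so all rows have the same length (illustrative)
n = cellfun(@numel, r);
out = nan(numel(r), 1 + max(n));
out(:, 1) = U;
for k = 1:numel(r)
    out(k, 2:1 + n(k)) = r{k};
end
```

The loop runs once per group rather than once per row, so it should stay cheap when there are many rows but comparatively few distinct keys.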

Related

Calculating mean over column with condition

So my question is as follows: I have a matrix (let's take
A = [1 11 22 33; 44 13 12 33; 1 14 33 44]
as an example) where I want to calculate the mean for all columns separately. The tricky part is that I only want to calculate the mean for those numbers in each column which are greater than the column 25th percentile.
I was thinking to simply create the 25th percentile and then use this as a criterion for selecting rows. This, unfortunately, does not work.
In order to further clarify: What should happen is to go through each column and calculate the 25th percentile
prctile(A,25,1)
And then calculate the mean only over the numbers that are greater than their own column's percentile.
Any help?
Thanks!
You can create a version of A which is NaN for values at or below the 25th percentile, then use the 'omitnan' flag in mean to exclude those points:
A = [1 11 22 33; 44 13 12 33; 1 14 33 44];
B = A; % copy to leave A unaltered
B( B <= prctile(B,25,1) ) = NaN; % Turn values to NaN which we want to exclude
C = mean( B, 1, 'omitnan' ); % Omit the NaN values from the calculation
% C >>
% [ 44.00 13.50 27.50 44.00 ]

How can I use values within a MATLAB matrix as indices to determine the location of data in a new matrix?

I have a matrix that looks like the following.
I want to take the column 3 values and put them in another matrix, according to the following rule.
The value in Column 5 is the row index for the new matrix, and the value in Column 6 is the column index. Therefore 20 (taken from 29,3) should be in Row 1, Column 57 of the new matrix, 30 (from 30,3) should be in Row 1, Column 4 of the new matrix, and so on.
If the value in column 3 is NaN then I want NaN to be copied over to the new matrix.
Example:
% matrix of values and row/column subscripts
A = [
20 1 57
30 1 4
25 1 16
nan 1 26
nan 1 28
25 1 36
nan 1 53
50 1 56
nan 2 1
nan 2 2
nan 2 3
80 2 5
];
% fill new matrix
B = zeros(5,60);
idx = sub2ind(size(B), A(:,2), A(:,3));
B(idx) = A(:,1);
There are a couple of other ways to do this, but I think the above code is easy to understand. It uses linear indexing.
Assuming you don't have duplicate subscripts, you could also use:
B = full(sparse(A(:,2), A(:,3), A(:,1), m, n));
(where m and n are the output matrix size)
Another one:
B = accumarray(A(:,[2 3]), A(:,1), [m,n]);
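A small sketch comparing the three variants on made-up data may help here. The matrix A, the size m by n, and the names B1 to B3 are hypothetical and chosen only for illustration. The subscripts are unique, which is the case in which all three are expected to agree.

```matlab
A = [20 1 3; 30 1 2; 80 2 1];   % [value, row, column], hypothetical data
m = 2; n = 3;                   % assumed output size

% Variant 1: linear indexing
B1 = zeros(m, n);
B1(sub2ind([m n], A(:,2), A(:,3))) = A(:,1);

% Variant 2: sparse construction, converted back to a full matrix
B2 = full(sparse(A(:,2), A(:,3), A(:,1), m, n));

% Variant 3: accumarray with 2-D subscripts
B3 = accumarray(A(:,[2 3]), A(:,1), [m n]);

disp(isequal(B1, B2, B3))       % compare the three results
```

If the same row/column pair occurs more than once, the variants no longer match: sparse and accumarray add the values together, while indexed assignment keeps only one of them.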
I am not sure if I understood your question clearly but this might help:
(Assuming your main matrix is A)
nRows = max(A(:,5));
nColumns = max(A(:,6));
FinalMatrix = zeros(nRows,nColumns);
for i=1:size(A,1)
FinalMatrix(A(i,5),A(i,6))=A(i,3);
end
Note that above code sets the rest of the elements equal to zero.

Matlab loop to convert N x 1 matrix to 60 x 4718 matrix

I am a novice at Matlab and am struggling a bit with creating a loop that will convert a 283080 x 2 matrix into a 60 x 4718 matrix. Column 1 of the input lists all stockID numbers (each repeated 60 times) and column 2 contains all lagged monthly returns (60 observations for each stock). The output should have a column for each stockID, with its corresponding lagged returns in the 60 rows underneath each ID number.
My aim is to then try to calculate a variance-covariance matrix of the returns.
I believe I need a loop because I will be repeating this process over 70 times, as I have multiple data sets in this same format.
Thanks so much for the help!
Let data denote your matrix. Then:
aux = sortrows(data,1); %// sort rows according to value in column 1
result = reshape(aux(:,2),60,[]); %// reshape second column as desired
If you need to insert the stockID values as headings (first row of result), add this as a last line:
result = [ unique(aux(:,1)).'; result ];
A simple example, replacing 60 by 2:
>> data = [1 100
2 200
1 101
2 201
4 55
3 0
3 33
4 56];
>> aux = sortrows(data,1);
>> result = reshape(aux(:,2),2,[]);
>> result = [ unique(aux(:,1)).'; result ]
result =
1 2 3 4
100 200 0 55
101 201 33 56
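Since the stated goal is a variance-covariance matrix, a minimal sketch of that last step might look like the following. The toy data and the names nObs, ids, g and S are illustrative. The sketch assumes every stockID appears exactly nObs times and that the rows for each ID are already in time order, and it applies cov to the returns before any heading row is added.

```matlab
data = [1 0.10; 2 0.30; 1 0.20; 2 0.10; 1 0.30; 2 0.20];  % [stockID, return]
nObs = 3;                          % observations per stock (60 in the question)

aux = sortrows(data, 1);           % group rows by stockID, keeping their order

% Guard against IDs with a different number of observations
[ids, ~, g] = unique(aux(:,1));
assert(all(accumarray(g, 1) == nObs), 'Each stockID must appear nObs times');

result = reshape(aux(:,2), nObs, []);   % one column of returns per stock
S = cov(result);                        % columns are treated as variables
```

Wrapping these lines in a function that takes data and nObs would be one way to reuse them across the 70 or so data sets without writing an explicit loop over rows.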

MatLab to convert a matrix with respect to 1st col

This question is an outgrowth of MatLab (or any other language) to convert a matrix or a csv to put 2nd column values to the same row if 1st column value is the same? and Group values in different rows by their first-column index
If
A = [2 3 234 ; 2 44 99999; 2 99999 99999; 3 123 99; 3 1232 45; 5 99999 57]
1st column | 2nd column | 3rd column
--------------------------------------
2 3 234
2 44 99999
2 99999 99999
3 123 99
3 1232 45
5 99999 57
I want to make
1st col | 2nd col | 3rd col | 4th col | 5th col | 6th col| 7th col
--------------------------------------------------------------------
2 3 234 44
3 123 99 1232 45
5 57
That is, for each number in the 1st col of A, I want to collect the numbers EXCEPT "99999".
If we disregard the "except 99999" part, we can code as Group values in different rows by their first-column index
[U, ix, iu] = unique(A(:,1));
vals = reshape(A(:, 2:end).', [], 1); %'// Columnize values
subs = reshape(iu(:, ones(size(A, 2) - 1, 1)).', [], 1); %'// Replicate indices
r = accumarray(subs, vals, [], @(x){x'});
But obviously this code won't ignore 99999.
I guess there are two ways
1. first make r, and then remove 99999
2. remove 99999 first, and then make r
Whichever, I just want faster one.
Thank you in advance!
I think option 1 is better, i.e. first make r, and then remove 99999. Given r, you can remove 99999 as follows:
r2 = cell(size(r)); % new cell array without 99999
for i = 1:numel(r)
    rCell = r{i};
    whereIs99999 = rCell == 99999;
    rCell(whereIs99999) = []; % remove 99999
    r2{i} = rCell;
end
Or, more concisely:
r2 = cellfun(@(c) {c(c~=99999)}, r);
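For comparison, option 2 from the question (remove 99999 first, then group) can be sketched without a loop by masking the value and subscript vectors before calling accumarray. This is an illustrative alternative, not the answer's method. The name keep is hypothetical, repelem requires R2015a or later, and A is assumed to be sorted by its first column.

```matlab
A = [2 3 234; 2 44 99999; 2 99999 99999; 3 123 99; 3 1232 45; 5 99999 57];

[U, ~, iu] = unique(A(:,1));
vals = reshape(A(:, 2:end).', [], 1);     % values listed row by row
subs = repelem(iu, size(A, 2) - 1);       % one group index per value

keep = vals ~= 99999;                     % drop the sentinel before grouping
% Fix the output size so a group holding only 99999 still gets a cell
r2 = accumarray(subs(keep), vals(keep), [numel(U) 1], @(x) {x.'});
```

Which of the two options is faster probably depends on the data, so timing both on a realistic input (for example with timeit) is the safest way to decide.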

find sorting index per row of a 2D matrix in MatLab and populate a new matrix

I have a challenge to order my matrix. The provided functions like sortrows work in the opposite way...
Take this 2D matrix
M =
40 45 68
50 65 58
60 55 48
57 67 44
,
The objective is to find matrix O that indicates the sorting index (rank) per row, i.e.:
O =
1 2 3
1 3 2
3 2 1
2 3 1
.
So for the second row 50 is the smallest element (1), 65 the largest (3), and 58 is the second largest (2), therefore row vector [1 3 2].
[~,sorted_inds] = sort(M,2);
will do.
I think you're looking for the second output of the regular sort function:
[~,I] = sort(M,2)
This syntax suppresses the actual sorted matrix Msorted, and returns the indices I such that
for j = 1:size(M,1), Msorted(j,:) = M(j,I(j,:)); end
Type doc sort for more information.
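One caveat: the second output of sort is the sorting permutation (which column comes first, second, and so on), which is not always the same as the rank of each element. For the last row of M, sort returns [3 1 2], while the O shown in the question is [2 3 1]. A minimal sketch of one common way to get ranks is to sort the index matrix a second time, which inverts each row's permutation:

```matlab
M = [40 45 68; 50 65 58; 60 55 48; 57 67 44];

[~, I] = sort(M, 2);   % I(j,:) lists the columns of M(j,:) in ascending order
[~, O] = sort(I, 2);   % inverting each row's permutation gives the ranks
% O is [1 2 3; 1 3 2; 3 2 1; 2 3 1], matching the question
```

With tied values in a row, this assigns distinct ranks in order of appearance rather than a shared rank.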