MatLab to convert a matrix with respect to 1st col - matlab

This question is an outgrowth of MatLab (or any other language) to convert a matrix or a csv to put 2nd column values to the same row if 1st column value is the same? and Group values in different rows by their first-column index
If
A = [2 3 234 ; 2 44 99999; 2 99999 99999; 3 123 99; 3 1232 45; 5 99999 57]
1st column | 2nd column | 3rd column
--------------------------------------
2 3 234
2 44 99999
2 99999 99999
3 123 99
3 1232 45
5 99999 57
I want to make
1st col | 2nd col | 3rd col | 4th col | 5th col | 6th col| 7th col
--------------------------------------------------------------------
2 3 234 44
3 123 99 1232 45
5 57
That is, for each number in the 1st column of A, I want to collect the values from the other columns into one row, EXCEPT 99999.
If we disregard the "except 99999" part, we can use the code from Group values in different rows by their first-column index:
[U, ix, iu] = unique(A(:,1));
vals = reshape(A(:, 2:end).', [], 1); %'// Columnize values
subs = reshape(iu(:, ones(size(A, 2) - 1, 1)).', [], 1); %'// Replicate indices
r = accumarray(subs, vals, [], @(x){x'});
But obviously this code won't ignore 99999.
I guess there are two ways
1. first make r, and then remove 99999
2. remove 99999 first, and then make r
Either way, I just want the faster one.
Thank you in advance!

I think option 1 is better, i.e. first make r, and then remove 99999. Once you have r, you can remove 99999 as follows:
r2 = {}; % new cell array without 99999
for i = 1:numel(r)
rCell = r{i};
whereIs9999 = rCell == 99999;
rCell(whereIs9999) = []; % remove 99999
r2{i} = rCell;
end
Or, more compactly:
r2 = cellfun(@(c) {c(c~=99999)}, r);
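For comparison, option 2 from the question (drop 99999 before grouping) could be sketched as follows. This is only an illustration, not a benchmark of which option is faster. The names sentinel and keep are introduced here, and the sketch assumes the first column of A is already sorted, so that accumarray passes each group's values in their original order.

```matlab
A = [2 3 234; 2 44 99999; 2 99999 99999; 3 123 99; 3 1232 45; 5 99999 57];
sentinel = 99999;                         % value to ignore (name chosen for this sketch)

[U, ~, iu] = unique(A(:,1));              % iu maps each row to its group number
vals = reshape(A(:, 2:end).', [], 1);     % columnize values row by row
subs = reshape(repmat(iu(:).', size(A,2) - 1, 1), [], 1);  % one group index per value

keep = vals ~= sentinel;                  % logical mask of values to retain
r = accumarray(subs(keep), vals(keep), [numel(U) 1], @(x){x.'});
```

Here r{1} is [3 234 44], r{2} is [123 99 1232 45] and r{3} is 57, which matches the desired table.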

Related

Extract rows having a [maximum / minimum] on a certain column grouped by another column

I would like to extract rows having a maximum value on col3 and grouped by col2.
So for example if I have:
% col1 col2 col3
M = [112 1 78
112 2 2
120 2 77
101 1 86
112 3 103]
The result of MAX on col3 GROUP BY col2 would be (the row order doesn't matter):
% col1 col2 col3
R = [120 2 77
101 1 86
112 3 103]
Actually I'm using:
M = sortrows(M,[2,3])
[~,ind] = unique(M(:,2),'last')
R = M(ind,:)
But I find this solution overcomplicated. Is there a simpler one? I would like to avoid using a MATLAB table.
This may be faster but seems more complicated than the original solution:
function out = getmax (x)
[~, out] = max (x);
end
idx = accumarray(M(:,2), 1:size(M,1), [], @(x) x(getmax(M(x,3))));
R = M(idx, :);
EDIT:
Note that if the values in M(:,2) are not positive integers covering a contiguous range starting from 1 (repetitions are fine), they should first be converted to group indices with unique:
[~, ~, col2] = unique (M(:,2));
idx = accumarray(col2, 1:size(M,1), [], @(x) x(getmax(M(x,3))));
R = M(idx, :);
Using findgroups and splitapply:
>> G = findgroups(M(:,2));
>> Y = arrayfun(@(x) find(M(:,3)==x),splitapply(@max,M(:,3),G));
>> M(Y,:)
ans =
101 1 86
120 2 77
112 3 103
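The find(M(:,3)==x) lookup above only returns a single row per group when each group maximum occurs exactly once in column 3. A possible variant, sketched here under the assumption that the first row with the maximum is the one wanted in case of ties, passes the row numbers to splitapply alongside the values. The names rowIdx and pickMaxRow are introduced for this sketch only.

```matlab
M = [112 1  78
     112 2   2
     120 2  77
     101 1  86
     112 3 103];

G = findgroups(M(:,2));                   % group number for each row
rowIdx = (1:size(M,1)).';                 % original row numbers

% For one group: return the row number of the first maximum in that group
pickMaxRow = @(v, idx) idx(find(v == max(v), 1));

Y = splitapply(pickMaxRow, M(:,3), rowIdx, G);
R = M(Y,:);
```

For the example matrix, Y is [4; 3; 5] and R contains the same three rows as the output shown above.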

Matlab: How I can make this transformation on the matrix A? (part 2)

N.B: This question is more complex than my previous question: Matlab: How I can make this transformation on the matrix A?
I have a matrix A 4x10000, I want to use it to find another matrix C, based on a predefined vector U.
I'll simplify my problem with a simple example:
from a matrix A
20 4 4 74 20 20 4
36 1 1 11 36 36 1
77 1 1 15 77 77 1
3 4 2 6 7 8 15
and
U=[2 3 4 6 7 8 2&4&15 7&8 4|6].
& : AND
| : OR
I want, first, to find an intermediate entity B:
             2  3  4  6  7  8  2&4&15  7&8  4|6  sum
[20 36 77]   0  1  0  0  1  1    0      1    0    4
[4 1 1]      1  0  1  0  0  0    1      0    1    4
[74 11 15]   0  0  0  1  0  0    0      0    1    2
We put a 1 if the value in the header row, together with the vector on the left, forms a column of the matrix A.
The last column of the entity B is the number of 1s in each row.
In the end I want a matrix C made up of the vectors on the left of the entity B, but only those whose sum of 1s is greater than or equal to 3.
for my example:
    20  4
C = 36  1
    77  1
This was a complex one indeed and because of the many restrictions and labeling processes involved, it won't be as efficient as the solution to the previous problem. Here's the code to solve the posted problem -
find_labels1 = 2:8; %// Labels to be detected - main block
find_labels2 = {[2 4 15],[7 8],[4 6]}; %// ... side block
A1 = A(1:end-1,:); %// all of A except the last row
A2 = A(end,:); %// last row of A
%// Find unique columns and their labels for all of A except the last row
[unqmat_notinorder,row_ind,inv_labels] = unique(A1.','rows'); %//'
[tmp_sortedval,ordered_ind] = sort(row_ind);
unqcols = unqmat_notinorder(ordered_ind,:);
[tmp_matches,labels] = ismember(inv_labels,ordered_ind);
%// Assign labels to each group
ctl = numel(unique(labels));
labelgrp = arrayfun(@(x) find(labels==x),1:ctl,'un',0);
%// Work for the main comparisons
matches = bsxfun(@eq,A2,find_labels1'); %//'
maincols = zeros(ctl,numel(find_labels1));
for k = 1:ctl
maincols(k,:) = any(matches(:,labelgrp{k}),2);
end
%// Work for the extra comparisons added that made this problem extra-complex
lens = cellfun('length',find_labels2);
lens(end) = 1;
extcols = nan(ctl,numel(find_labels2));
for k = 1:numel(find_labels2)
idx = find(ismember(A2,find_labels2{k}));
extcols(:,k)=arrayfun(@(n) sum(ismember(labelgrp{n},idx))>=lens(k),1:ctl).'; %//'
end
C = unqcols(sum([maincols extcols],2)>=3,:).' %//'# Finally the output
I will give you a partial answer. I think you can take it from here. The idea is to concatenate the first 3 rows of A with each element of U replicated as the last column. After you get the 3D matrix, replicate your original A and then just compare the rows. Rows that are equal correspond to putting a 1 in your table.
B=A(1:3,:).';
B1=repmat(B,[1 1 length(U)]);
C=permute(U,[3 1 2]);
D=repmat(C,[size(B1,1),1,1]);
E=[B1 D];
F=repmat(A',[1 1 size(E,3)]);
Now compare F and E, row-wise. If the rows are equal, then you put 1 in your table. For replicating & and |, you can form some kind of indicator vector.
Say,
indU=[1 2 3 4 5 6 7 7 7 8 8 -9 -9];
The same positive value indicates &, the same negative value indicates |. Different values indicate different entries.
I hope you can take from here.

Tracking changes in a cell in Matlab

I have a cell with 3 different columns. The first is a simple ranking, the second is a code composed by X elements and the third is a code composed by Y elements that usually is the same for a certain combination of numbers in column two. So if in column two you have the number 345, it is likely that in column three you will always have 798. The thing is that sometimes it changes. So what I have, for instance, is:
1 453 4789
1 56 229
1 453 1246 %here the corresponding code has changed
2 43 31
2 453 1246 %here the code did not change
3 56 31 %here the corresponding code has changed (it was 229 previously)
What I want to have at the end is a new cell with three columns, containing only the cases in which the code in the third column (corresponding to the code from the second column) was observed to change. For instance, in this simple example I would get:
1 453 1246
3 56 31
If you have your data in a matrix A you can use sorting:
[~, I] = sort(A(:,2));
B = A(I,:);
code_diff = logical(diff(B(:, 2)));
value_diff = logical(diff(B(:, 3)));
value_diff(code_diff) = false;
rows = sort(I([false; value_diff]));
ans = A(rows, :);
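As a check, here is the same sorting idea applied to the example data from the question. The variable names sameCode, changed and result are chosen for this sketch. It relies on sort keeping equal elements in their original order, so rows with the same code stay in chronological order.

```matlab
A = [1 453 4789
     1  56  229
     1 453 1246
     2  43   31
     2 453 1246
     3  56   31];

[~, I] = sort(A(:,2));                    % group equal codes together
B = A(I,:);

sameCode = diff(B(:,2)) == 0;             % consecutive rows share the same code
changed  = sameCode & diff(B(:,3)) ~= 0;  % ...but the third-column value differs

rows = sort(I([false; changed]));         % back to original row numbers, in order
result = A(rows, :);
```

For this input, rows is [3; 6], so result holds the rows 1 453 1246 and 3 56 31.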
If the "codes" in the second column are all smallish integers, another possibility is to use a lookup table:
n = size(A, 1);
m = max(A(:, 2));
mask = false(n, 1);
lookup = inf(m, 1);
for i = 1:n
code = A(i,2);
if isinf(lookup(code))
lookup(code) = A(i,3);
elseif lookup(code) ~= A(i,3)
mask(i) = true;
lookup(code) = A(i,3);
end
end
ans = A(mask, :);
Assuming the values are in a matrix, this is a possible solution:
CJ2 = [1 453 4789
1 56 229
1 453 1246
2 43 31
2 453 1246
3 56 31];
changes = zeros(size(CJ2));
nChanges = 0;
for i = 2:size(CJ2,1)
pos = find(CJ2(1:i-1,2) == CJ2(i,2), 1, 'last');
if ~isempty(pos) && CJ2(pos,3) ~= CJ2(i,3)
nChanges = nChanges + 1;
changes(nChanges, :) = CJ2(i,:);
end
end
changes = changes(1:nChanges, :);
changes
Results:
>> changes
changes =
1 453 1246
3 56 31

Matlab loop to convert N x 1 matrix to 60 x 4718 matrix

I am a novice at Matlab and am struggling a bit with creating a loop that will convert a 283080 x 2 matrix into a 60 x 4718 matrix. In the input, column 1 lists all stockID numbers (each repeated 60 times) and column 2 contains all lagged monthly returns (60 observations for each stock). The output should have one column per stockID, with its corresponding lagged returns in the 60 rows underneath each ID number.
My aim is to then try to calculate a variance-covariance matrix of the returns.
I believe I need a loop because I will be repeating this process over 70 times, as I have multiple data sets in this same format.
Thanks so much for the help!
Let data denote your matrix. Then:
aux = sortrows(data,1); %// sort rows according to value in column 1
result = reshape(aux(:,2),60,[]); %// reshape second column as desired
If you need to insert the stockID values as headings (first row of result), add this as a last line:
result = [ unique(aux(:,1)).'; result ];
A simple example, replacing 60 by 2:
>> data = [1 100
2 200
1 101
2 201
4 55
3 0
3 33
4 56];
>> aux = sortrows(data,1);
>> result = reshape(aux(:,2),2,[]);
>> result = [ unique(aux(:,1)).'; result ]
result =
1 2 3 4
100 200 0 55
101 201 33 56
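For the stated aim of a variance-covariance matrix, one possible continuation is sketched below on the same toy data. It keeps the ID row separate from the returns, because cov treats rows as observations and columns as variables, so a heading row would be counted as an observation. The names nObs, returns, ids and S are assumptions made for this sketch.

```matlab
data = [1 100
        2 200
        1 101
        2 201
        4  55
        3   0
        3  33
        4  56];
nObs = 2;                                  % observations per stock (60 in the question)

aux = sortrows(data, 1);                   % sort rows by stock ID
returns = reshape(aux(:,2), nObs, []);     % one column per stock
ids = unique(aux(:,1)).';                  % stock IDs, in the same column order

S = cov(returns);                          % numel(ids)-by-numel(ids) covariance matrix
```

With the real data, returns would be 60 x 4718 and S would be 4718 x 4718. To process the other data sets, the same lines could be wrapped in a function and called once per data set.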

Group values in different rows by their first-column index

This question is an outgrowth of MatLab (or any other language) to convert a matrix or a csv to put 2nd column values to the same row if 1st column value is the same?
If
A = [2 3 234 ; 2 44 33; 2 12 22; 3 123 99; 3 1232 45; 5 224 57]
1st column | 2nd column | 3rd column
2 3 234
2 44 33
2 12 22
3 123 99
3 1232 45
5 224 57
then running
[U ix iu] = unique(A(:,1) );
r= accumarray( iu, A(:,2:3), [], @(x) {x'} )
will show me the error
Error using accumarray
Second input VAL must be a vector with one element for each row in SUBS, or a
scalar.
I want to make
1st col | 2nd col | 3rd col | 4th col | 5th col | 6th col| 7th col
2 3 234 44 33 12 22
3 123 99 1232 45
5 224 57
I know how to do it using for and if, but that spends too much time for big data.
How can I do this?
Thank you in advance!
You're misusing accumarray in the solution provided to your previous question. The first parameter iu is the vector of indices and the second parameter should be a vector of values of the same length. What you did here is specify a matrix as the second parameter, which in fact has twice as many values as there are indices in iu.
What you need to do in order to make it work is create a vector of indices both for the second column and for the third column (they are the same indices, not coincidentally!) and specify a matching column vector of values, like so:
[U, ix, iu] = unique(A(:,1));
vals = reshape(A(:, 2:end).', [], 1); %'// Columnize values
subs = reshape(iu(:, ones(size(A, 2) - 1, 1)).', [], 1); %'// Replicate indices
r = accumarray(subs, vals, [], @(x){x'});
This solution is generalized for any number of columns that you want to pass to accumarray.
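The result r is a cell array with one row vector per group. If a rectangular layout like the table in the question is wanted, a minimal sketch is to copy the cells into a matrix, with the group key in the first column. Padding the short rows with NaN is an assumption of this sketch, as is the name out. The first column of A is sorted here, so each group's values arrive in their original order.

```matlab
A = [2 3 234; 2 44 33; 2 12 22; 3 123 99; 3 1232 45; 5 224 57];

[U, ~, iu] = unique(A(:,1));
vals = reshape(A(:, 2:end).', [], 1);                       % columnize values
subs = reshape(repmat(iu(:).', size(A,2) - 1, 1), [], 1);   % replicate indices
r = accumarray(subs, vals, [], @(x){x.'});

% Copy each group's values into one row, padding short rows with NaN
maxLen = max(cellfun(@numel, r));
out = nan(numel(r), maxLen + 1);
for k = 1:numel(r)
    out(k, 1) = U(k);                      % group key in the first column
    out(k, 2:numel(r{k}) + 1) = r{k};      % grouped values after it
end
```

For the example, the first row of out is 2 3 234 44 33 12 22, and the rows for keys 3 and 5 end in NaN padding.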