Making a match-and-append code more efficient without 'for' loop - matlab

I am trying to match 1st column of A with 1st to 3rd columns of B, and append corresponding 4th column of B to A.
For example,
A=
1 2
3 4
B=
1 2 4 5 4
1 2 3 5 3
1 1 1 1 2
3 4 5 6 5
I compare A(:,1) and B(:, 1:3)
1 and 3 are in A(:,1)
1 is in the 1st, 2nd, 3rd rows of B(:, 1:3), so append B([1 2 3], 4:end)' to A's 1st row.
3 is in the 2nd and 4th rows of B(:,1:3), so append B([2 4], 4:end)' to A's 2nd row.
So that it becomes:
1 2 5 4 5 3 1 2
3 4 5 3 6 5 0 0
I could code this using only for and if.
clearvars AA A B mem mem2 mem3
A = [1 2 ; 3 4]
B = [1 2 4 5 4; 1 2 3 5 3; 1 1 1 1 2; 3 4 5 6 5]
for n=1:1:size(A,1)
mem = ismember(B(:,[1:3]), A(n,1));
mem2 = mem(:,1) + mem(:,2) + mem(:,3);
mem3 = find(mem2>0);
AA{n,:} = horzcat( A(n,:), reshape(B(mem3,[4,5])',1,[]) ); %'
end
maxLength = max(cellfun(#(x)numel(x),AA));
out = cell2mat(cellfun(#(x)cat(2,x,zeros(1,maxLength-length(x))),AA,'UniformOutput',false))
I am trying to make this code efficient, by not using for and if, but couldn't find an answer.

Try this
a = A(:,1);
b = B(:,1:3);
z = size(b);
b = repmat(b,[1,1,numel(a)]);
ab = repmat(permute(a,[2,3,1]),z);
row2 = mat2cell(permute(sum(ab==b,2),[3,1,2]),ones(1,numel(a)));
AA = cellfun(#(x)(reshape(B(x>0,4:end)',1,numel(B(x>0,4:end)))),row2,'UniformOutput',0);
maxLength = max(cellfun(#(x)numel(x),AA));
out = cat(2,A,cell2mat(cellfun(#(x)cat(2,x,zeros(1,maxLength-length(x))),AA,'UniformOutput',false)))
UPDATE Below code runs in almost same time as the iterative code
a = A(:,1);
b = B(:,1:3);
z = size(b);
b = repmat(b,[1,1,numel(a)]);
ab = repmat(permute(a,[2,3,1]),z);
df = permute(sum(ab==b,2),[3,1,2])';
AA = arrayfun(#(x)(B(df(:,x)>0,4:end)),1:size(df,2),'UniformOutput',0);
AA = arrayfun(#(x)(reshape(AA{1,x}',1,numel(AA{1,x}))),1:size(AA,2),'UniformOutput',0);
maxLength = max(arrayfun(#(x)(numel(AA{1,x})),1:size(AA,2)));
out2 = cell2mat(arrayfun(#(x,i)((cat(2,A(i,:),AA{1,x},zeros(1,maxLength-length(AA{1,x}))))),1:numel(AA),1:size(A,1),'UniformOutput',0));

How about this:
%# example data
A = [1 2
3 4];
B = [1 2 4 5 4
1 2 3 5 3
1 1 1 1 2
3 4 5 6 5];
%# rename for clarity & reshape for algorithm's convenience
needle = permute(A(:,1), [2 3 1]);
haystack = B(:,1:3);
data = B(:,4:end).';
%# Get the relevant rows of 'haystack' for each entry in 'needle'
inds = any(bsxfun(#eq, haystack, needle), 2);
%# Create data that should be appended to A
%# All data and functionality in this loop is local and static, so speed
%# should be optimal.
append = zeros( size(A,1), numel(data) );
for ii = 1:size(inds,3)
newrow = data(:,inds(:,:,ii));
append(ii,1:numel(newrow)) = newrow(:);
end
%# Now append to A, stripping unneeded zeros
A = [A append(:, ~all(append==0,1))]

Related

Find and replace the rows of an array having repeated number by a fixed given row

I have a matrix having rows with repeated numbers. I want to find those rows and replace them with a dummy row so as to keep the number of rows of the matrix constant.
Dummy_row = [1 2 3]
(5x3) Matrix A
A = [2 3 6;
4 7 4;
8 7 2;
1 3 1;
7 8 2]
(5x3) Matrix new_A
new_A = [2 3 6;
1 2 3;
8 7 2;
1 2 3;
7 8 2]
I tried the following which deleted the rows having repeated numbers.
y = [1 2 3]
w = sort(A,2)
v = all(diff(t,1,2)~=0|w(:,1:2)==0,2) % When v is zero, the row has repeated numbers
z = A(w,:)
Can you please help?
bsxfun based solution -
%// Create a row mask of the elements that are to be edited
mask = any(sum(bsxfun(#eq,A,permute(A,[1 3 2])),2)>1,3);
%// Setup output variable and set to-be-edited rows as copies of [1 2 3]
new_A = A;
new_A(mask,:) = repmat(Dummy_row,sum(mask),1)
Code run -
A =
2 3 6
4 7 4
8 7 2
1 3 1
7 8 2
new_A =
2 3 6
1 2 3
8 7 2
1 2 3
7 8 2
You could use the following:
hasRepeatingNums = any(diff(sort(A, 2), 1, 2)==0, 2);
A(hasRepeatingNums,:) = repmat(Dummy_row, nnz(hasRepeatingNums), 1);
See if this works for you,
A= [ 2 3 6;
4 7 4;
8 7 2;
5 5 5;
1 8 8;
1 3 1;
7 8 2 ];
Dummy_row = [1 2 3];
b = diff(sort(A,2),1,2);
b = sum(b == 0,2);
b = b > 0;
c = repmat(Dummy_row,sum(b),1);
b = b' .* (1:length(b));
b = b(b > 0);
newA = A;
newA(b,:) = c;
gives,
newA =
2 3 6
1 2 3
8 7 2
1 2 3
1 2 3
1 2 3
7 8 2
Edit
Not much change is needed, try this,
Dummy_row = [1 2 3];
b = sum(A == 0,2);
b = b > 0;
c = repmat(Dummy_row,sum(b),1);
b = b' .* (1:length(b));
b = b(b > 0);
newA = A;
newA(b,:) = c;

Divide list of numbers into 3 groups in matlab

I have a list of numbers, [1:9], that I need to divide three groups. Each group must contain at least one number. I need to enumerate all of the combinations (i.e. order does not matter). Ideally, the output is a x by 3 array. Any ideas of how to do this in matlab?
Is this what you want:
x = 1:9;
n = length(x);
T=3;
out = {};
%// Loop over all possible solutions
for k=1:T^n
s = dec2base(k, T, n);
out{k}{T} = [];
for p=1:n
grpIndex = str2num(s(p))+1;
out{k}{grpIndex} = [out{k}{grpIndex} x(p)];
end
end
%// Print result. size of out is the number of ways to divide the input. out{k} contains 3 arrays with the values of x
out
Maybe this is what you want. I'm assuming that the division in groups is "monotonous", that is, first come the elements of the first group, then those of the second etc.
n = 9; %// how many numbers
k = 3; %// how many groups
b = nchoosek(1:n-1,k-1).'; %'// "breaking" points
c = diff([ zeros(1,size(b,2)); b; n*ones(1,size(b,2)) ]); %// result
Each column of c gives the sizes of the k groups:
c =
Columns 1 through 23
1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 4 4 4 4 5
1 2 3 4 5 6 7 1 2 3 4 5 6 1 2 3 4 5 1 2 3 4 1
7 6 5 4 3 2 1 6 5 4 3 2 1 5 4 3 2 1 4 3 2 1 3
Columns 24 through 28
5 5 6 6 7
2 3 1 2 1
2 1 2 1 1
This produces what I was looking for. The function nchoosekr_rec() is shown below as well.
for x=1:7
numgroups(x)=x;
end
c=nchoosekr_rec(numgroups,modules);
i=1;
d=zeros(1,modules);
for x=1:length(c(:,1))
c(x,modules+1)=sum(c(x,1:modules));
if c(x,modules+1)==length(opt_mods)
d(i,:)=c(x,1:modules);
i=i+1;
end
end
numgroups=[];
for x=1:length(opt_mods)
numgroups(x)=x;
end
count=0;
for x=1:length(d(:,1))
combos=combnk(numgroups,d(x,1));
for y=1:length(combos(:,1))
for z=1:nchoosek(9-d(x,1),d(x,2))
new_mods{count+z,1}=combos(y,:);
numgroups_temp{count+z,1}=setdiff(numgroups,new_mods{count+z,1});
end
count=count+nchoosek(9-d(x,1),d(x,2));
end
end
count=0;
for x=1:length(d(:,1))
for y=1:nchoosek(9,d(x,1))
combos=combnk(numgroups_temp{count+1},d(x,2));
for z=1:length(combos(:,1))
new_mods{count+z,2}=combos(z,:);
new_mods{count+z,3}=setdiff(numgroups_temp{count+z,1},new_mods{count+z,2});
end
count=count+length(combos(:,1));
end
end
function y = nchoosekr_rec(v, n)
if n == 1
y = v;
else
v = v(:);
y = [];
m = length(v);
if m == 1
y = zeros(1, n);
y(:) = v;
else
for i = 1 : m
y_recr = nchoosekr_rec(v(i:end), n-1);
s_repl = zeros(size(y_recr, 1), 1);
s_repl(:) = v(i);
y = [ y ; s_repl, y_recr ];
end
end
end

Expand matrix based on first row value (MATLAB)

My input is the following:
X = [1 1; 1 2; 1 3; 1 4; 2 5; 1 6; 2 7; 1 8];
X =
1 1
1 2
1 3
1 4
2 5
1 6
2 7
1 8
I am looking to output a new matrix based on the value of the first column. If the value is equal to 1 -- the output will remain the same, when the value is equal to 2 then I would like to output two of the values contained in the second row. Like this:
Y =
1
2
3
4
5
5
6
7
7
8
Where 5 is output two times because the value in the first column is 2 and the same for 7
Here it is (vectorized):
C = cumsum(X(:,1))
A(C) = X(:,2)
D = hankel(A)
D(D==0) = inf
Y = min(D)
Edit:
Had a small bug, now it works.
% untested code:
Y = []; % would be better to pre-allocate
for ii = 1:size(X,1)
Y = [Y; X(ii,2)*ones(X(ii,1),1)];
end

Expand a matrix with polynomials

Say I have a matrix A with 3 columns c1, c2 and c3.
1 2 9
3 0 7
3 1 4
And I want a new matrix of dimension (3x3n) in which the first column is c1, the second column is c1^2, the n column is c1^n, the n+1 column is c2, the n+2 column is c2^2 and so on. Is there a quickly way to do this in MATLAB?
Combining PERMUTE, BSXFUN, and RESHAPE, you can do this quite easily such that it works for any size of A. I have separated the instructions for clarity, you can combine them into one line if you want.
n = 2;
A = [1 2 9; 3 0 7; 3 1 4];
[r,c] = size(A);
%# reshape A into a r-by-1-by-c array
A = permute(A,[1 3 2]);
%# create a r-by-n-by-c array with the powers
A = bsxfun(#power,A,1:n);
%# reshape such that we get a r-by-n*c array
A = reshape(A,r,[])
A =
1 1 2 4 9 81
3 9 0 0 7 49
3 9 1 1 4 16
Try the following (don't have access to Matlab right now), it should work
A = [1 2 9; 3 0 7; 3 1 4];
B = [];
for i=1:n
B = [B A.^i];
end
B = [B(:,1:3:end) B(:,2:3:end) B(:,3:3:end)];
More memory efficient routine:
A = [1 2 9; 3 0 7; 3 1 4];
B = zeros(3,3*n);
for i=1:n
B(3*(i-1)+1:3*(i-1)+3,:) = A.^i;
end
B = [B(:,1:3:end) B(:,2:3:end) B(:,3:3:end)];
Here is one solution:
n = 4;
A = [1 2 9; 3 0 7; 3 1 4];
Soln = [repmat(A(:, 1), 1, n).^(repmat(1:n, 3, 1)), ...
repmat(A(:, 2), 1, n).^(repmat(1:n, 3, 1)), ...
repmat(A(:, 3), 1, n).^(repmat(1:n, 3, 1))];

Split matrix based on number in first column

I have a matrix which has the following form:
M =
[1 4 56 1;
1 3 5 1;
1 3 6 4;
2 3 5 0;
2 0 0 0;
3 1 2 3;
3 3 3 3]
I want to split this matrix based on the number given in the first column. So I want to split the matrix into this:
A =
[1 4 56 1;
1 3 5 1;
1 3 6 4]
B =
[2 3 5 0;
2 0 0 0]
C =
[3 1 2 3;
3 3 3 3]
I tried this by making the following loop, but this gave me the desired matrices with rows of zeros:
for i = 1:length(M)
if (M(i,1) == 1)
A(i,:) = M(i,:);
elseif (M(i,1) == 2)
B(i,:) = M(i,:);
elseif (M(i,1) == 3)
C(i,:) = M(i,:);
end
end
The result for matrix C is then for example:
C =
[0 0 0 0;
0 0 0 0;
0 0 0 0;
2 3 5 0;
2 0 0 0]
How should I solve this issue?
Additional information:
The actual data has a date in the first column in the form yyyymmdd. The data set spans several years and I want to split this dataset in matrices for each year and after that for each month.
You can use arrayfun to solve this task:
M = [
1 4 56 1;
1 3 5 1;
1 3 6 4;
2 3 5 0;
2 0 0 0;
3 1 2 3;
3 3 3 3]
A = arrayfun(#(x) M(M(:,1) == x, :), unique(M(:,1)), 'uniformoutput', false)
The result A is a cell array and its contents can be accessed as follows:
>> a{1}
ans =
1 4 56 1
1 3 5 1
1 3 6 4
>> a{2}
ans =
2 3 5 0
2 0 0 0
>> a{3}
ans =
3 1 2 3
3 3 3 3
To split the data based on an yyyymmdd format in the first column, you can use the following:
yearly = arrayfun(#(x) M(floor(M(:,1)/10000) == x, :), unique(floor(M(:,1)/10000)), 'uniformoutput', false)
monthly = arrayfun(#(x) M(floor(M(:,1)/100) == x, :), unique(floor(M(:,1)/100)), 'uniformoutput', false)
If you don't know how many outputs you'll have, it is most convenient to put the data into a cell array rather than into separate arrays. The command to do this is MAT2CELL. Note that this assumes your data is sorted. If it isn't use sortrows before running the code.
%# count the repetitions
counts = hist(M(:,1),unique(M(:,1));
%# split the array
yearly = mat2cell(M,counts,size(M,2))
%# if you'd like to split each cell further, but still keep
%# the data also grouped by year, you can do the following
%# assuming the month information is in column 2
yearByMonth = cellfun(#(x)...
mat2cell(x,hist(x(:,2),unique(x(:,2)),size(x,2)),...
yearly,'uniformOutput',false);
You'd then access the data for year 3, month 4 as yearByMonth{3}{4}
EDIT
If the first column of your data is yyyymmdd, I suggest splitting it into three columns yyyy,mm,dd, like below, to facilitate grouping afterward:
ymd = 20120918;
yymmdd = floor(ymd./[10000 100 1])
yymmdd(2:3) = yymmdd(2:3)-100*yymmdd(1:2)