Outerjoin is not merging as expected, are my specifications wrong?

I am trying to merge three tables using outerjoin() but I am not getting the result I want/expect. Below is the code I am using, the result I am getting, and the result I want. Using Matlab R2018a.
Code
%%% set up dummy data tables
Key1 = [1 1 1 2 2 3 3 3 3 3];
Key2 = [1 2 3 1 2 1 2 3 4 5];
Val1 = [0 NaN NaN 0 NaN 0.09 NaN NaN NaN NaN];
Val2 = [NaN 0.55 0.55 0.04 0.04 0.58 0.634 0.668 0.6950 0.7560];
mytable = array2table([Key1', Key2', Val1', Val2']);
mytable.Properties.VariableNames = {'Key1', 'Key2', 'Val1', 'Val2'};
temp1 = array2table([1 4 0; 2 3 0; 3 6 0.09]);
temp1.Properties.VariableNames = {'Key1', 'Key2', 'Val1'};
temp2 = array2table([1 4 0.55; 2 3 0.04; 3 6 0.07560]);
temp2.Properties.VariableNames = {'Key1', 'Key2', 'Val2'};
%%% try to join mytable, temp1, and temp2 together
Tout = outerjoin(mytable, temp1, 'MergeKeys', true);
Tout = outerjoin(Tout, temp2, 'MergeKeys', true);
Result from code
I want the highlighted rows to be combined, so that no Key1-Key2 pair is duplicated in the output table. I tried various combinations of options such as 'MergeKeys', true, 'LeftVariables', {'Key1', 'Key2', 'Val1', 'Val2'}, 'RightVariables', {'Key1', 'Key2', 'Val2'}, but I couldn't get it to work.
Desired result

Solved by reversing the order:
Tout = outerjoin(temp1, temp2, 'MergeKeys',true);
Tout = outerjoin(mytable, Tout, 'MergeKeys',true);
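A possible explanation for why the order matters, assuming outerjoin's documented default of treating every variable name shared by both tables as a key: in the original order, the second join matches on Key1, Key2 and Val2, and the rows that came from temp1 carry NaN in Val2, so they never match the rows of temp2 and each of those Key1-Key2 pairs shows up twice. Joining temp1 and temp2 first shares only Key1 and Key2. The sketch below reuses the question's data and checks that no pair is duplicated (the names small and pairs are introduced here for illustration).

```matlab
% Dummy data from the question
Key1 = [1 1 1 2 2 3 3 3 3 3]';
Key2 = [1 2 3 1 2 1 2 3 4 5]';
Val1 = [0 NaN NaN 0 NaN 0.09 NaN NaN NaN NaN]';
Val2 = [NaN 0.55 0.55 0.04 0.04 0.58 0.634 0.668 0.6950 0.7560]';
mytable = table(Key1, Key2, Val1, Val2);
temp1 = array2table([1 4 0; 2 3 0; 3 6 0.09], ...
    'VariableNames', {'Key1', 'Key2', 'Val1'});
temp2 = array2table([1 4 0.55; 2 3 0.04; 3 6 0.07560], ...
    'VariableNames', {'Key1', 'Key2', 'Val2'});

% Step 1: temp1 and temp2 share only Key1 and Key2, so those are the keys
small = outerjoin(temp1, temp2, 'MergeKeys', true);

% Step 2: mytable and small share all four variables, so all four are keys
Tout = outerjoin(mytable, small, 'MergeKeys', true);

% Check: every Key1-Key2 pair should appear exactly once
pairs = unique(Tout(:, {'Key1', 'Key2'}));
fprintf('%d rows, %d distinct Key1-Key2 pairs\n', height(Tout), height(pairs));
```

If the two counts printed at the end agree, no key pair is duplicated.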

Related

union of two matrices with their common attribute added together in matlab

I have two structures and I'd like to take the union of the two matrices, adding the third row (the attribute) together wherever the first two rows match. The result should be insensitive to the order of the first two rows, and duplicate entries should be avoided, i.e.
% struct1
row1 = [1 1 1 2 4 3];
col1 = [1 2 3 1 2 4];
att1 = [2 3 4 6 2 5];
% struct2
row2 = [2 2 1 3 3];
col2 = [1 2 3 1 4];
att2 = [1 0 1 1 5];
% result
resultRow = [1 1 1 2 2 3]
resultCol = [1 2 3 2 4 4]
resultAtt = [2 10 6 0 2 10]
I previously asked about the intersection of two structures, but it seems accumarray works for rows, not matrices. Any help is appreciated.

Based on the solution to intersection part of this question, I think we can proceed as follows:
% struct1
row1 = [1 1 1 2 4 3];
col1 = [1 2 3 1 2 4];
att1 = [2 3 4 6 2 5];
% struct2
row2 = [2 2 1 3 3];
col2 = [1 2 3 1 4];
att2 = [1 0 1 1 5];
% sort in 2nd dimension to get row-column indexes insensitive for order
idx1 = sort([row1(:) col1(:)],2);
idx2 = sort([row2(:) col2(:)],2);
%find union
[idx,~,bins] = unique([idx1;idx2],'rows','stable');
att = accumarray(bins,[att1,att2]);
ResultUnion= [idx att]';
disp(ResultUnion)
and you get
ResultUnion =
1 1 1 2 3 2
1 2 3 4 4 2
2 10 6 2 10 0
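The 'stable' flag keeps the pairs in order of first appearance, which is why the columns above are ordered differently from the expected result in the question. As a small variation (an assumption about what the asker wants, not part of the original answer), dropping 'stable' lets unique sort the pairs, which reproduces the question's ordering:

```matlab
row1 = [1 1 1 2 4 3];  col1 = [1 2 3 1 2 4];  att1 = [2 3 4 6 2 5];
row2 = [2 2 1 3 3];    col2 = [1 2 3 1 4];    att2 = [1 0 1 1 5];

% Sort each (row, col) pair so that (2,1) and (1,2) count as the same key
idx1 = sort([row1(:) col1(:)], 2);
idx2 = sort([row2(:) col2(:)], 2);

% Without 'stable', unique returns the pairs in sorted row order
[idx, ~, bins] = unique([idx1; idx2], 'rows');

% Sum the attributes that fall into the same pair
att = accumarray(bins, [att1(:); att2(:)]);

disp([idx att]')
```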

How to create a four-dimensional matrix (from a vector) and reset its 'lower triangle'

I am trying to get a 4-dimensional matrix out of a vector and then reset its 'lower triangle'.
For example, if my original vector is two-dimensional: A = [1 2]' then I would like my initial matrix to be:
C(:,:,1,1) = [1*1*1*1 1*1*1*2 ; 1*1*2*1 1*1*2*2] = [ 1 2 ; 2 4]
C(:,:,2,1) = [2*1*1*1 2*1*1*2 ; 2*1*2*1 2*1*2*2] = [ 2 4 ; 4 8]
C(:,:,1,2) = [1*2*1*1 1*2*1*2 ; 1*2*2*1 1*2*2*2] = [ 2 4 ; 4 8]
C(:,:,2,2) = [2*2*1*1 2*2*1*2 ; 2*2*2*1 2*2*2*2] = [ 4 8 ; 8 16]
So C is:
C(:,:,1,1) = [ 1 2 ; 2 4] C(:,:,2,1) = [ 2 4 ; 4 8]
C(:,:,1,2) = [ 2 4 ; 4 8] C(:,:,2,2) = [ 4 8 ; 8 16]
and after reset I would like it to be:
C(:,:,1,1) = [ 1 2 ; 2 4] C(:,:,2,1) = [ 0 0 ; 0 0]
C(:,:,1,2) = [ 0 0 ; 4 8] C(:,:,2,2) = [ 0 0 ; 8 16]
In short, I want no repeated rows.
I tried the following code:
A = [1 2]';
C = bsxfun(@times, permute(A, [4 3 2 1]), A*A');
disp('C before reset is:');
disp(C);
for k = 2:size(C, 4)
C(1:k-1,:,k) = 0;
end
disp('C after reset is:');
disp(C);
disp('The size of C is:');
disp(size(C));
But the output is:
C before reset is:
(:,:,1,1) =
1 2
2 4
(:,:,1,2) =
2 4
4 8
C after reset is:
(:,:,1,1) =
1 2
2 4
(:,:,1,2) =
0 0
4 8
The size of C is:
2 2 1 2
What did I miss?
I think I don't understand what is behind the line:
C = bsxfun(@times, permute(A, [4 3 2 1]), A*A');
what is the meaning of each number in the row [4 3 2 1]?
Thanks!
Edit note: The matrix represents correlations between neurons. I am trying to look at the correlation structure of groups of 4 neurons, so each group of 4 neurons should only be measured once. I think the matrix before the reset contains every group of 4 neurons 4! times, because the groups appear in all orders. I could leave it like this, but I think it might slow the program down.

Permute exchanges dimensions, so for example
C = [1:3;4:6];
permute(C, [2 1])
This computes a simple transpose by swapping rows and columns. The [2 1] argument means that the 2nd and 1st dimensions of C are mapped to the 1st and 2nd dimensions of the result. Each 'new' dimension is specified in order, so [3 2 1] takes the 3rd, 2nd and 1st dimensions to be the new 1st, 2nd and 3rd dimensions.
permute(C, [3 2 1])
ans =
ans(:,:,1) =
1 2 3
ans(:,:,2) =
4 5 6
Elements of C with row = 1 are found where the 3rd dimension = 1 in the result. Similarly, elements of C with row = 2 are found where the 3rd dimension = 2 in the result.
Elements of C with column = 1 are still found where column = 1 in the result (and so on), as the column dimension was mapped to itself.
The row dimension of the result is the interesting one: it is a singleton (i.e. there is only one row) because C has no 3rd dimension.
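A short sketch that checks these statements on the same 2-by-3 example (the name P is introduced here for illustration):

```matlab
C = [1:3; 4:6];              % 2-by-3 matrix
P = permute(C, [3 2 1]);     % old dim 3 -> new dim 1, old dim 2 stays, old dim 1 -> new dim 3

disp(size(P))                % size of the permuted array

% Row k of C ends up in page k of P
disp(isequal(P(1, :, 1), C(1, :)))
disp(isequal(P(1, :, 2), C(2, :)))

% [2 1] is an ordinary transpose
disp(isequal(permute(C, [2 1]), C.'))
```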
Addressing the first part of your problem, the correct output for C can be obtained by
A = [1 2]'*[1 2];
C = bsxfun(@times, permute(A, [4,3,1,2]), A);
I would need more information on what you want the final behaviour to be ('resetting the lower triangle') as it is unclear to me what you desire.
A function that might be useful to you is the triu function which extracts upper triangular components of a matrix.
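As a quick check of the construction above, the sketch below builds C from the vector [1 2]' and compares a couple of slices with the values listed in the question. It also shows what triu does on a plain 2-D matrix; how to extend that to the 4-D 'lower triangle' depends on the intended definition, so that part is left open. The names a and U are introduced here for illustration.

```matlab
a = [1 2]';
A = a * a';                                    % A(i,j) = a(i)*a(j)

% Move A into dimensions 3 and 4, then multiply: C(i,j,k,l) = A(i,j)*A(k,l)
C = bsxfun(@times, permute(A, [4 3 1 2]), A);

disp(size(C))                                  % four dimensions of length 2
disp(C(:, :, 2, 1))                            % compare with C(:,:,2,1) in the question
disp(C(:, :, 2, 2))                            % compare with C(:,:,2,2) in the question

% triu keeps the main diagonal and everything above it, zeroing the rest
U = triu(A);
disp(U)
```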

Making a match-and-append code more efficient without 'for' loop

I am trying to match the 1st column of A against the 1st to 3rd columns of B, and append the corresponding 4th and later columns of B to A.
For example,
A=
1 2
3 4
B=
1 2 4 5 4
1 2 3 5 3
1 1 1 1 2
3 4 5 6 5
I compare A(:,1) and B(:, 1:3)
1 and 3 are in A(:,1)
1 is in the 1st, 2nd, 3rd rows of B(:, 1:3), so append B([1 2 3], 4:end)' to A's 1st row.
3 is in the 2nd and 4th rows of B(:,1:3), so append B([2 4], 4:end)' to A's 2nd row.
So that it becomes:
1 2 5 4 5 3 1 2
3 4 5 3 6 5 0 0
I could code this using only for and if.
clearvars AA A B mem mem2 mem3
A = [1 2 ; 3 4]
B = [1 2 4 5 4; 1 2 3 5 3; 1 1 1 1 2; 3 4 5 6 5]
for n=1:1:size(A,1)
mem = ismember(B(:,[1:3]), A(n,1));
mem2 = mem(:,1) + mem(:,2) + mem(:,3);
mem3 = find(mem2>0);
AA{n,:} = horzcat( A(n,:), reshape(B(mem3,[4,5])',1,[]) );
end
maxLength = max(cellfun(@(x)numel(x),AA));
out = cell2mat(cellfun(@(x)cat(2,x,zeros(1,maxLength-length(x))),AA,'UniformOutput',false))
I am trying to make this code efficient, by not using for and if, but couldn't find an answer.

Try this
a = A(:,1);
b = B(:,1:3);
z = size(b);
b = repmat(b,[1,1,numel(a)]);
ab = repmat(permute(a,[2,3,1]),z);
row2 = mat2cell(permute(sum(ab==b,2),[3,1,2]),ones(1,numel(a)));
AA = cellfun(@(x)(reshape(B(x>0,4:end)',1,numel(B(x>0,4:end)))),row2,'UniformOutput',0);
maxLength = max(cellfun(@(x)numel(x),AA));
out = cat(2,A,cell2mat(cellfun(@(x)cat(2,x,zeros(1,maxLength-length(x))),AA,'UniformOutput',false)))
UPDATE: The code below runs in almost the same time as the iterative code.
a = A(:,1);
b = B(:,1:3);
z = size(b);
b = repmat(b,[1,1,numel(a)]);
ab = repmat(permute(a,[2,3,1]),z);
df = permute(sum(ab==b,2),[3,1,2])';
AA = arrayfun(@(x)(B(df(:,x)>0,4:end)),1:size(df,2),'UniformOutput',0);
AA = arrayfun(@(x)(reshape(AA{1,x}',1,numel(AA{1,x}))),1:size(AA,2),'UniformOutput',0);
maxLength = max(arrayfun(@(x)(numel(AA{1,x})),1:size(AA,2)));
out2 = cell2mat(arrayfun(@(x,i)((cat(2,A(i,:),AA{1,x},zeros(1,maxLength-length(AA{1,x}))))),1:numel(AA),1:size(A,1),'UniformOutput',0));

How about this:
%# example data
A = [1 2
3 4];
B = [1 2 4 5 4
1 2 3 5 3
1 1 1 1 2
3 4 5 6 5];
%# rename for clarity & reshape for algorithm's convenience
needle = permute(A(:,1), [2 3 1]);
haystack = B(:,1:3);
data = B(:,4:end).';
%# Get the relevant rows of 'haystack' for each entry in 'needle'
inds = any(bsxfun(@eq, haystack, needle), 2);
%# Create data that should be appended to A
%# All data and functionality in this loop is local and static, so speed
%# should be optimal.
append = zeros( size(A,1), numel(data) );
for ii = 1:size(inds,3)
newrow = data(:,inds(:,:,ii));
append(ii,1:numel(newrow)) = newrow(:);
end
%# Now append to A, stripping unneeded zeros
A = [A append(:, ~all(append==0,1))]

Matlab: Add vectors not in the same length to a matrix

Is it possible to automatically combine vectors of different lengths into a matrix?
i.e:
a = [1 2 3 4]
b = [1 2]
How can I make c be:
c = [1 2 3 4 ; 1 2 0 0]
or
c = [1 2 3 4 ; 1 2 NaN NaN]
or something like that
Thanks

This might help
a = [1 2 3 4];
b = [1 2];
c = a;
c(2,1:length(b)) = b;
c =
1 2 3 4
1 2 0 0
then, if you'd rather have NaN than 0, you could do what Dennis Jaheruddin suggests in a comment below.
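The referenced comment is not reproduced here. One way to get NaN padding, offered as an assumption rather than as the commenter's suggestion, is to preallocate the result with nan and then write each vector into its row:

```matlab
a = [1 2 3 4];
b = [1 2];

% Start from an all-NaN matrix wide enough for the longer vector
c = nan(2, max(numel(a), numel(b)));

% Fill in only the positions each vector actually covers
c(1, 1:numel(a)) = a;
c(2, 1:numel(b)) = b;

disp(c)
```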

Make a function like this
function out = cat2(a, b)
d = length(a) - length(b); % positive when a is longer (avoids shadowing the built-in diff)
if d > 0
b = [b, nan(1, d)];
else
a = [a, nan(1, -d)];
end
out = [a;b];
end
(but also add a check to handle column vectors too)
cat2([1 2 3 4], [1 2])
ans =
1 2 3 4
1 2 NaN NaN

match IDs + find a number within a matrix in Matlab

I am facing a problem in matching elements in 2 matrices. The first element can be matched using ismember but the second element should be within a range. Please see the example below:
% col1 is integerID, col2 is a number. -->col1 is Countrycode, col2 is date
bigmat = [ 600 2
600 4
800 1
800 5
900 1] ;
% col1 is integerID, col2 is VALUE, col3 is a range -->value is Exchange rate
datamat = {...
50 0.1 [2 3 4 5 6] % 2:6
50 0.2 [9 10 11] % 9:11
600 0.01 [1 2 3 4] % 1:4
600 0.2 [8 9 10] % 8:10
800 0.12 [1] % 1:1
800 0.13 [3 4] % 3:4
900 0.15 [1 2] } ; % 1:2
I need the answer as:
ansmat = [ 600 2 0.01
600 4 0.01
800 1 0.12
800 5 nan % even deleting this row is fine
900 1 0.15 ] ;
For simplicity:
All integer IDs in bigmat exist in datamat.
The numbers in each range are dates. Within a range, these numbers are consecutive: [1 2...5]
For any ID, the dates in the next row do not continue from the previous row. E.g., you can see [1 2 3 4] and then [8 9 10] in the next row.
bigmat is a huge matrix (300,000-500,000 rows), so vectorized code would be appreciated. datamat is roughly 5000 rows or fewer. You can convert the cell to a matrix. For each row, I have the minimum and maximum; the 3rd column is minimum:maximum. Thanks!

Here is one possible implementation:
%# data matrices
q = [
600 2
600 4
800 1
800 5
900 1
];
M = {
50 0.1 [2 3 4 5 6]
50 0.2 [9 10 11]
600 0.01 [1 2 3 4]
600 0.2 [8 9 10]
800 0.12 [1]
800 0.13 [3 4]
900 0.15 [1 2]
};
%# build matrix: ID,value,minDate,maxDate
M = [cell2mat(M(:,1:2)) cellfun(@min,M(:,3)) cellfun(@max,M(:,3))];
%# preallocate result
R = zeros(size(M,1),3);
%# find matching rows
c = 1; %# counter
for i=1:size(q,1)
%# rows indices matching ID
ind = find( ismember(M(:,1),q(i,1)) );
%# out of those, keep only those where date number is in range
ind = ind( M(ind,3) <= q(i,2) & q(i,2) <= M(ind,4) );
%# check if any
num = numel(ind);
if num==0, continue, end
%# extract matching rows
R(c:c+num-1,:) = [M(ind,1) repmat(q(i,2),[num 1]) M(ind,2)];
c = c + num;
end
%# remove excess
R(c:end,:) = [];
The result as expected:
>> R
R =
600 2 0.01
600 4 0.01
800 1 0.12
900 1 0.15
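Since the question asks for vectorized code, here is a loop-free sketch of the same matching. It assumes the ranges have already been reduced to minDate and maxDate columns (written out by hand below), and it relies on the question's statement that the ranges for one ID do not overlap, so each query row matches at most one row of M. It builds a size(q,1)-by-size(M,1) logical mask, so for several hundred thousand rows it would likely have to be applied in chunks. The names sameID, inRange and val are introduced here for illustration.

```matlab
q = [600 2; 600 4; 800 1; 800 5; 900 1];

% Columns: ID, value, minDate, maxDate
M = [ 50 0.10 2  6
      50 0.20 9 11
     600 0.01 1  4
     600 0.20 8 10
     800 0.12 1  1
     800 0.13 3  4
     900 0.15 1  2];

% Compare every query row against every row of M at once
sameID  = bsxfun(@eq, q(:,1), M(:,1).');
inRange = bsxfun(@ge, q(:,2), M(:,3).') & bsxfun(@le, q(:,2), M(:,4).');

% r: query rows that matched, c: the row of M each one matched
[r, c] = find(sameID & inRange);

% Unmatched queries keep NaN, as allowed in the question
val = nan(size(q,1), 1);
val(r) = M(c, 2);

ansmat = [q val];
disp(ansmat)
```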

I'm not completely sure I understand .. should the second entry be '600 4 0.02'?
Anyways, you may be able to try something like:
% grab first column
col = bigmat(:, 1);
% find all entries in column that are equal to ID
rel = (col == id);
% retrieve just those rows
rows = bigmat(rel, :);
Then once you have the rows you need from your matrices, you can concatenate them together like so:
result = [rowsA(1:3) rowsB(2) rowsC(5:6)];