Join Matrices in MATLAB

I have two matrices like the following ones:
'01/01/2010' 1
'02/01/2010' 2
'03/01/2010' 3
'05/01/2010' 11
'06/01/2010' 17
'01/01/2010' 4
'02/01/2010' 5
'04/01/2010' 6
'05/01/2010' 7
After some processing in MATLAB, I want to create the following two matrices (the first keeps every date and fills the gaps with NaN, the second keeps only the dates present in both inputs):
'01/01/2010' 1 4
'02/01/2010' 2 5
'03/01/2010' 3 NaN
'04/01/2010' NaN 6
'05/01/2010' 11 7
'06/01/2010' 17 NaN
'01/01/2010' 1 4
'02/01/2010' 2 5
'05/01/2010' 11 7
Any idea on how to join these tables?
Cheers.
EDIT: Really sorry for my typos, guys. I updated both the question and the input/output data. Please, feel free to provide suggestions.

I believe what you are trying to achieve is called an inner join and a full outer join in the database world.
First we start with the two datasets:
d1 = {
'01/01/2010' 1
'02/01/2010' 2
'03/01/2010' 3
'05/01/2010' 11
'06/01/2010' 17
};
d2 = {
'01/01/2010' 4
'02/01/2010' 5
'04/01/2010' 6
'05/01/2010' 7
};
Here is the code to perform the two types of join:
%# get all possible dates, and convert them to indices starting at 1
[keys,~,ind] = unique( [d1(:,1);d2(:,1)] );
%# full outer join
ind1 = ind(1:size(d1,1));
ind2 = ind(size(d1,1)+1:end);
fullOuterJoin = cell(numel(keys),3);
fullOuterJoin(:) = {NaN}; %# fill with NaNs
fullOuterJoin(:,1) = keys; %# union of dates
fullOuterJoin(ind1,2) = d1(:,2); %# insert 1st dataset values
fullOuterJoin(ind2,3) = d2(:,2); %# insert 2nd dataset values
%# inner join
loc1 = ismember(ind1, ind2);
loc2 = ismember(ind2, ind1);
innerJoin = cell(sum(loc1),3);
innerJoin(:,1) = d1(loc1,1); %# intersection of dates
innerJoin(:,2) = d1(loc1,2); %# insert 1st dataset values
innerJoin(:,3) = d2(loc2,2); %# insert 2nd dataset values
Alternatively, we could have extracted the inner join from the outer join dataset by simply removing rows with any NaN values:
idx = all(~isnan(cell2mat(fullOuterJoin(:,2:end))), 2);
innerJoin = fullOuterJoin(idx,:);
Either way, the result:
>> fullOuterJoin
fullOuterJoin =
'01/01/2010' [ 1] [ 4]
'02/01/2010' [ 2] [ 5]
'03/01/2010' [ 3] [NaN]
'04/01/2010' [NaN] [ 6]
'05/01/2010' [ 11] [ 7]
'06/01/2010' [ 17] [NaN]
>> innerJoin
innerJoin =
'01/01/2010' [ 1] [4]
'02/01/2010' [ 2] [5]
'05/01/2010' [11] [7]
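To make the role of the third output of unique more concrete, here is a minimal sketch that only inspects keys and ind for the sample data above. Note that unique sorts the date strings as text; that happens to be chronological here because the sample dates differ only in the leading day field, which is an assumption about the data rather than a general property.

```matlab
% Sample data copied from the answer above
d1 = {'01/01/2010' 1; '02/01/2010' 2; '03/01/2010' 3; '05/01/2010' 11; '06/01/2010' 17};
d2 = {'01/01/2010' 4; '02/01/2010' 5; '04/01/2010' 6; '05/01/2010' 7};

% keys: sorted union of dates; ind: position of every input date within keys
[keys, ~, ind] = unique([d1(:,1); d2(:,1)]);
ind = ind(:);                 % force a column regardless of MATLAB version

disp(keys.');                 % the six distinct dates
disp(ind.');                  % 1 2 3 5 6 1 2 4 5
```

The first five entries of ind are the target rows for d1 and the last four are the target rows for d2, which is why indexing with ind1 and ind2 places each value on the right row.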

In MATLAB, you cannot have strings as matrix elements. For that you need to use a cell array. This is a solution using cell arrays and containers.Maps.
FirstCellArray = {
'01/01/2010', 1;
'02/01/2010', 2;
'03/01/2010', 3;
'05/01/2010', 11;
'06/01/2010', 17
};
SecondCellArray = {
'01/01/2010', 4;
'02/01/2010', 5;
'04/01/2010', 6;
'05/01/2010', 7;
};
AllDatesCellArray = union(FirstCellArray(:,1), SecondCellArray(:,1));
% Create containers.Maps for both cell arrays. containers.Maps are hash tables.
DateToFirstNumberMap = containers.Map(FirstCellArray(:,1), FirstCellArray(:,2));
DateToSecondNumberMap = containers.Map(SecondCellArray(:,1), SecondCellArray(:,2));
WithNaNsCellArray = AllDatesCellArray;
for Index = 1:size(WithNaNsCellArray, 1)
Key = AllDatesCellArray{Index, 1};
try
NumberOne = cell2mat(values(DateToFirstNumberMap, cellstr(Key)));
catch
NumberOne = NaN;
end
WithNaNsCellArray{Index, 2} = NumberOne;
try
NumberTwo = cell2mat(values(DateToSecondNumberMap, cellstr(Key)));
catch
NumberTwo = NaN;
end
WithNaNsCellArray{Index, 3} = NumberTwo;
end
WithoutNaNsCellArray = WithNaNsCellArray;
NaNIndicesVector = (isnan([WithNaNsCellArray{:,2}]) | isnan([WithNaNsCellArray{:,3}]));
WithoutNaNsCellArray(NaNIndicesVector == 1, :) = [];
Then WithNaNsCellArray contains the result with NaN rows and WithoutNaNsCellArray contains the result without NaN rows.
WithNaNsCellArray =
'01/01/2010' [ 1] [ 4]
'02/01/2010' [ 2] [ 5]
'03/01/2010' [ 3] [NaN]
'04/01/2010' [NaN] [ 6]
'05/01/2010' [ 11] [ 7]
'06/01/2010' [ 17] [NaN]
WithoutNaNsCellArray =
'01/01/2010' [ 1] [4]
'02/01/2010' [ 2] [5]
'05/01/2010' [11] [7]
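As a possible variation on the lookup above, the try/catch blocks could be replaced by an explicit isKey test. This is only a sketch: the helper variable names (M, key, v) are made up for illustration and are not part of the answer.

```matlab
% Hypothetical small map: date string -> number
M = containers.Map({'01/01/2010', '02/01/2010'}, {1, 2});

key = '03/01/2010';           % a date that is not in the map
if isKey(M, key)
    v = M(key);               % key present: read the stored value
else
    v = NaN;                  % key missing: use NaN as the placeholder
end
disp(v);                      % NaN

w = M('02/01/2010');          % a key that is present
disp(w);                      % 2
```

Testing with isKey makes the "missing key" case explicit instead of relying on an error being thrown.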

The statistics toolbox contains a function called JOIN that basically does what you want.
http://www.mathworks.de/de/help/stats/dataset.join.html
Unfortunately, it probably can't handle strings and polytyped matrices. But you might be able to use JOIN to shorten the solutions proposed by the other answers.
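For readers on a MATLAB release that has the table data type (R2013b or later), a sketch along the following lines may be shorter. It uses the built-in outerjoin and innerjoin functions for tables rather than the dataset JOIN mentioned above; the variable names Date, V1 and V2 are chosen here purely for illustration.

```matlab
% Build two tables from the sample data; column names are illustrative
dates1 = {'01/01/2010'; '02/01/2010'; '03/01/2010'; '05/01/2010'; '06/01/2010'};
dates2 = {'01/01/2010'; '02/01/2010'; '04/01/2010'; '05/01/2010'};
T1 = table(dates1, [1; 2; 3; 11; 17], 'VariableNames', {'Date', 'V1'});
T2 = table(dates2, [4; 5; 6; 7],      'VariableNames', {'Date', 'V2'});

% Full outer join: every date, missing numeric values become NaN
fullT = outerjoin(T1, T2, 'Keys', 'Date', 'MergeKeys', true);

% Inner join: only dates present in both tables
innerT = innerjoin(T1, T2, 'Keys', 'Date');

disp(fullT);
disp(innerT);
```

Both results come back sorted by the key column, so the same caveat about text-sorted date strings applies.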

Related

How to create a four-dimensional matrix (from a vector) and reset its 'lower triangle'

I am trying to get a 4-dimensional matrix out of a vector and then reset its 'lower triangle'.
For example, if my original vector has two elements, A = [1 2]', then I would like my initial matrix to be:
C(:,:,1,1) = [1*1*1*1 1*1*1*2 ; 1*1*2*1 1*1*2*2] = [ 1 2 ; 2 4]
C(:,:,2,1) = [2*1*1*1 2*1*1*2 ; 2*1*2*1 2*1*2*2] = [ 2 4 ; 4 8]
C(:,:,1,2) = [1*2*1*1 1*2*1*2 ; 1*2*2*1 1*2*2*2] = [ 2 4 ; 4 8]
C(:,:,2,2) = [2*2*1*1 2*2*1*2 ; 2*2*2*1 2*2*2*2] = [ 4 8 ; 8 16]
So C is:
C(:,:,1,1) = [ 1 2 ; 2 4] C(:,:,2,1) = [ 2 4 ; 4 8]
C(:,:,1,2) = [ 2 4 ; 4 8] C(:,:,2,2) = [ 4 8 ; 8 16]
and after reset I would like it to be:
C(:,:,1,1) = [ 1 2 ; 2 4] C(:,:,2,1) = [ 0 0 ; 0 0]
C(:,:,1,2) = [ 0 0 ; 4 8] C(:,:,2,2) = [ 0 0 ; 8 16]
In short, I want no repeated rows.
I tried the following code:
A = [1 2]';
C = bsxfun(@times, permute(A, [4 3 2 1]), A*A');
disp('C before reset is:');
disp(C);
for k = 2:size(C, 4)
C(1:k-1,:,k) = 0;
end
disp('C after reset is:');
disp(C);
disp('The size of C is:');
disp(size(C));
But the output is:
C before reset is:
(:,:,1,1) =
1 2
2 4
(:,:,1,2) =
2 4
4 8
C after reset is:
(:,:,1,1) =
1 2
2 4
(:,:,1,2) =
0 0
4 8
The size of C is:
2 2 1 2
What did I miss?
I think I don't understand what is behind the line:
C = bsxfun(@times, permute(A, [4 3 2 1]), A*A');
What is the meaning of each number in the vector [4 3 2 1]?
Thanks!
Edit note: The matrix represents correlations between neurons. I am trying to look at the correlation structure of groups of 4 neurons, so each group of 4 neurons should only be measured once. I think that the matrix before the reset contains every group of 4 a total of 4! times, because the groups appear in all orders. I could leave it like this, but I think it might slow the program down.
permute rearranges the dimensions of an array. For example,
C = [1:3;4:6];
permute(C, [2 1])
computes a simple transpose by swapping rows and columns. The [2 1] argument means that the 2nd and 1st dimensions of C are mapped to the 1st and 2nd dimensions of the result. Each 'new' dimension is specified in order, so [3 2 1] would take the 3rd, 2nd and 1st dimensions to be the new 1st, 2nd and 3rd dimensions.
permute(C, [3 2 1])
ans =
ans(:,:,1) =
1 2 3
ans(:,:,2) =
4 5 6
Elements of C with row = 1 are found where the 3rd dimension = 1 in the result. Similarly, elements of C with row = 2 are found where the 3rd dimension = 2 in the result.
Elements of C with column = 1 are still found where column = 1 in the result (and so on), as the column dimension was mapped to itself.
The row dimension of the result is the interesting one: it is a singleton (i.e. there is only one row) because C has no 3rd dimension.
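The mapping described above can be checked with a few lines. This is a small sketch using the same C as in the example; the name P for the permuted array is introduced here only for illustration.

```matlab
C = [1:3; 4:6];                 % 2-by-3 matrix
P = permute(C, [3 2 1]);        % old dim 3 -> new dim 1, old dim 1 -> new dim 3

disp(size(C));                  % 2 3
disp(size(P));                  % 1 3 2

% Row k of C ends up in the k-th page (3rd dimension) of P
disp(P(1, :, 2));               % 4 5 6, i.e. the second row of C
```

The size changes from 2-by-3 to 1-by-3-by-2: the old (implicit, singleton) third dimension becomes the row dimension.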
Addressing the first part of your problem, the correct output for C can be obtained by
A = [1 2]'*[1 2];
C = bsxfun(@times, permute(A, [4,3,1,2]), A);
I would need more information on what you want the final behaviour to be ('resetting the lower triangle') as it is unclear to me what you desire.
A function that might be useful to you is the triu function which extracts upper triangular components of a matrix.
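As a quick sanity check of the construction above, the sketch below builds C for the vector [1 2]' and compares one slice with the values listed in the question. The triu call at the end only illustrates what triu does on a single 2-D matrix; it is not claimed to be the 'reset' the question asks for. The names v and U are introduced here for illustration.

```matlab
v = [1 2]';
A = v * v';                                    % outer product: [1 2; 2 4]
C = bsxfun(@times, permute(A, [4 3 1 2]), A);  % C(i,j,k,l) = A(i,j) * A(k,l)

disp(size(C));                                 % 2 2 2 2
disp(C(:, :, 2, 1));                           % [2 4; 4 8]

% triu keeps the main diagonal and everything above it, zeroing the rest
U = triu(A);
disp(U);                                       % [1 2; 0 4]
```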

Using nested for-loop to generate a table of numbers

I am trying to write a function that prints this output:
1
2 4
3 6 9
4 8 12 16
5 10 15 20 25
I wrote this code, but I'm not getting the desired output:
rows = 5;
% there are 5 rows
for i=1:rows
for j=1:i
b=i*j;
end
fprintf('%d\n',b)
end
How do I need to correct this algorithm? Alternatively, can you tell me if there are other methods to solve this?
I don't know what you mean by "print", but this is how you could start:
%// initial vector
a = 1:5;
A = tril( bsxfun(@plus,a(:)*[0:numel(a)-1],a(:)) )
%// or
A = tril(a.'*a) %'// thanks to Daniel!
mask = A == 0
out = num2cell( A );
out(mask) = {[]}
A =
1 0 0 0 0
2 4 0 0 0
3 6 9 0 0
4 8 12 16 0
5 10 15 20 25
out =
[1] [] [] [] []
[2] [ 4] [] [] []
[3] [ 6] [ 9] [] []
[4] [ 8] [12] [16] []
[5] [10] [15] [20] [25]
To print it to a file, you can use:
n = numel(a); %// number of columns
out = out.'; %'
fid = fopen('output.txt','w');
fprintf(fid,[repmat('%d \t',1,n) '\r\n'],out{:});
fclose(fid);
For the command window alone,
n = numel(a); %// number of columns
out = out.'; %'
fprintf([repmat('%d \t',1,n) '\r\n'],out{:})
will be sufficient. Choose your desired delimiter, if you don't like '\t'.
If you insist on a nested for loop, you can do it like this:
rows = 5;
% there are 5 rows
for ii = 1:rows
for jj = 1:ii
b = ii*jj;
if ii <= jj
fprintf('%d \n',b)
else
fprintf('%d ',b)
end
end
end
displays:
1
2 4
3 6 9
4 8 12 16
5 10 15 20 25
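A slightly shorter variant is sketched below, assuming it is acceptable to build each row as a vector instead of using an inner loop. fprintf and sprintf repeat the format for every element of a vector argument, so one call handles a whole row; the cell array named lines is only there so the rows can be inspected afterwards.

```matlab
rows = 5;
lines = cell(rows, 1);                 % keep each row as text for inspection
for ii = 1:rows
    % ii * (1:ii) is the row [ii, 2*ii, ..., ii*ii]; '%d ' repeats per element
    lines{ii} = strtrim(sprintf('%d ', ii * (1:ii)));
    fprintf('%s\n', lines{ii});
end
```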

Grouping elements of one column of matrix according to values of another column into cell array

I've been trying to come up with a smart way of doing this for a while. Given a matrix (or cell) with the following structure:
A = [-1 1
-1 2
1 3
3 5
2 3
2 4
2 7
4 5
5 6
6 7
7 -2 ]
(Note that the above matrix/cell is unsorted in both columns and contains negative numbers).
How could one group it by the unique values of a particular column? For example, the desired output for grouping by the second column would be something like:
B{1} = [-1]
B{2} = [-1]
B{3} = [1,2]
B{4} = [2]
B{5} = [3,4]
B{6} = [5]
B{7} = [2,6]
B{-2} = [7]
Thanks in advance!
You can use accumarray:
[~,~,subs] = unique(A(:,2));
values = accumarray(subs,A(:,1),[],@(x) {x});
ofGroup = accumarray(subs,A(:,2),[],@(x) {x(1)});
out = [ofGroup values]
out =
[-2] [ 7]
[ 1] [ -1]
[ 2] [ -1]
[ 3] [2x1 double]
[ 4] [ 2]
[ 5] [2x1 double]
[ 6] [ 5]
[ 7] [2x1 double]
If you REALLY insist on your order proposed, you could do the following, but I don't think that should be necessary.
% positives
pos = A( A(:,2) >= 0 , :);
[~,~,subs] = unique(pos(:,2));
posvalues = accumarray(subs,pos(:,1),[],@(x) {x});
posofGroup = accumarray(subs,pos(:,2),[],@(x) {x(1)});
% negatives
neg = A( A(:,2) < 0 , :);
[~,~,subs] = unique(neg(:,2));
negvalues = flipud( accumarray(subs,neg(:,1),[],@(x) {x}) );
negofGroup = flipud( accumarray(subs,neg(:,2),[],@(x) {x(1)}) );
out = [posofGroup posvalues; negofGroup negvalues ]
out =
[ 1] [ -1]
[ 2] [ -1]
[ 3] [2x1 double]
[ 4] [ 2]
[ 5] [2x1 double]
[ 6] [ 5]
[ 7] [2x1 double]
[-2] [ 7]
How about:
[group, ~, subs] = unique(A(:,2))
B = accumarray(subs, A(:,1), [], @(x){x'})
Results in
B=
[ 7]
[ -1]
[ -1]
[2,1]
[ 2]
[4,3]
[ 5]
[2,6]
Here, group maps each index of B to the value of the group it represents.
If you are attached to your ordering, you can do this instead:
[group, ~, subs] = unique(A(end:-1:1,2), 'stable');
B = flipud(accumarray(subs, A(end:-1:1,1), [], @(x){x'}));
group = flipud(group);
B =
[ -1]
[ -1]
[1x2 double]
[ 2]
[1x2 double]
[ 5]
[1x2 double]
[ 7]
group =
1
2
3
4
5
6
7
-2
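If the order of the elements inside each group matters, note that accumarray does not guarantee the order in which values reach the function unless the subscripts are sorted. A plain loop over the groups, sketched below, keeps the original row order within every group; it is offered as an alternative illustration, not as a replacement for the answers above.

```matlab
A = [-1 1; -1 2; 1 3; 3 5; 2 3; 2 4; 2 7; 4 5; 5 6; 6 7; 7 -2];

% groups: sorted unique values of column 2; subs: group index of every row
[groups, ~, subs] = unique(A(:, 2));

B = cell(numel(groups), 1);
for k = 1:numel(groups)
    % logical indexing preserves the row order of A within the group
    B{k} = A(subs == k, 1).';
end

disp(groups.');     % -2 1 2 3 4 5 6 7
disp(B{4});         % group 3 -> 1 2
```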

Joining data from different cell arrays in Matlab

I have data in Matlab that is in cell array format with columns representing different items. The cell arrays have different columns, as in the following example:
a = {'A', 'B', 'C' ; 1, 1, 1; 2, 2, 2 }
a =
'A' 'B' 'C'
[1] [1] [1]
[2] [2] [2]
b = {'C', 'D'; 3, 3; 4, 4}
b =
'C' 'D'
[3] [3]
[4] [4]
I would like to be able to join the different cell arrays in the following manner:
c =
'A' 'B' 'C' 'D'
[1] [1] [1] [NaN]
[2] [2] [2] [NaN]
[NaN] [NaN] [3] [3]
[NaN] [NaN] [4] [4]
In the real example I have hundreds of columns and several rows, so creating a new cell array manually is not an option for me.
If you were willing to store your data in dataset arrays (or convert them to dataset arrays for this purpose), you could do the following:
>> d1
d1 =
A B C
1 1 1
2 2 2
>> d2
d2 =
C D
3 3
4 4
>> join(d1,d2,'Keys','C','type','outer','mergekeys',true)
ans =
A B C D
1 1 1 NaN
2 2 2 NaN
NaN NaN 3 3
NaN NaN 4 4
I'm assuming you want to join the two arrays based on their first row only.
% get the list of all keys
keys = unique([a(1,:) b(1,:)]);
lena = size(a,1)-1; lenb = size(b,1)-1;
% allocate space for the joined array
joined = cell(lena+lenb+1, length(keys));
joined(1,:) = keys;
% add a
tf = ismember(keys, a(1,:));
joined(2:(2+lena-1),tf) = a(2:end,:);
% add b
tf = ismember(keys, b(1,:));
joined((lena+2):(lena+lenb+1),tf) = b(2:end,:);
This will give you the joined array, except that it has empty cells instead of NaNs. I hope this is OK.
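If NaN placeholders are needed after all, the empty cells can be filled in one extra step. The sketch below repeats the code above on the sample data and then replaces every empty cell; like that code, it assumes the header of a is already in sorted order.

```matlab
a = {'A', 'B', 'C'; 1, 1, 1; 2, 2, 2};
b = {'C', 'D'; 3, 3; 4, 4};

% Same steps as in the answer above
keys = unique([a(1,:) b(1,:)]);
lena = size(a,1) - 1;  lenb = size(b,1) - 1;
joined = cell(lena + lenb + 1, length(keys));
joined(1,:) = keys;
joined(2:(lena+1), ismember(keys, a(1,:))) = a(2:end,:);
joined((lena+2):(lena+lenb+1), ismember(keys, b(1,:))) = b(2:end,:);

% Extra step: replace every empty cell with NaN
joined(cellfun(@isempty, joined)) = {NaN};
disp(joined);
```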
Here is my solution, adapted from an old answer to a similar question (simply transpose rows/columns):
%# input cell arrays
a = {'A', 'B', 'C' ; 1, 1, 1; 2, 2, 2 };
b = {'C', 'D'; 3, 3; 4, 4};
%# transpose rows/columns
a = a'; b = b';
%# get all key values, and convert them to indices starting at 1
[allKeys,~,ind] = unique( [a(:,1);b(:,1)] );
indA = ind(1:size(a,1));
indB = ind(size(a,1)+1:end);
%# merge the two datasets (key,value1,value2)
c = cell(numel(allKeys), size(a,2)+size(b,2)-1);
c(:) = {NaN}; %# fill with NaNs
c(:,1) = allKeys; %# available keys from both
c(indA,2:size(a,2)) = a(:,2:end); %# insert 1st dataset values
c(indB,size(a,2)+1:end) = b(:,2:end); %# insert 2nd dataset values
Here is the result (transposed to match original orientation):
>> c'
ans =
'A' 'B' 'C' 'D'
[ 1] [ 1] [1] [NaN]
[ 2] [ 2] [2] [NaN]
[NaN] [NaN] [3] [ 3]
[NaN] [NaN] [4] [ 4]
Also here is the solution using the DATASET class from the Statistics Toolbox:
aa = dataset([cell2mat(a(2:end,:)) a(1,:)])
bb = dataset([cell2mat(b(2:end,:)) b(1,:)])
cc = join(aa,bb, 'Keys',{'C'}, 'type','fullouter', 'MergeKeys',true)
with
cc =
A B C D
1 1 1 NaN
2 2 2 NaN
NaN NaN 3 3
NaN NaN 4 4
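On releases where the table type is available (R2013b or later), roughly the same join can be sketched with cell2table and outerjoin instead of dataset. The names Ta, Tb and Tc are introduced here for illustration only.

```matlab
a = {'A', 'B', 'C'; 1, 1, 1; 2, 2, 2};
b = {'C', 'D'; 3, 3; 4, 4};

% First row holds the column names, the remaining rows hold the data
Ta = cell2table(a(2:end,:), 'VariableNames', a(1,:));
Tb = cell2table(b(2:end,:), 'VariableNames', b(1,:));

% Full outer join on the shared column C, merging the two key columns into one
Tc = outerjoin(Ta, Tb, 'Keys', 'C', 'MergeKeys', true);
disp(Tc);
```

The result is sorted by the key column C, and the numeric columns that have no match are filled with NaN.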