Intersecting several tables of different lengths in Matlab

I have 8+ tables in Matlab of different lengths. They all include dates in their first column. I would like to get the intersection of all these tables on the date columns. The following small example with 3 tables shows what I want:
Date1=datenum(2011,6,7,0:24:240,0,0).';
Date2=datenum(2011,6,8,0:24:240,0,0).';
Date3=datenum(2011,6,5,0:24:240,0,0).';
T1 = table(Date1,ones(11,1),'VariableNames',{'Date','Var1'})
T2 = table(Date2,ones(11,1)*2,'VariableNames',{'Date','Var2'})
T3 = table(Date3,ones(11,1)*3,'VariableNames',{'Date','Var3'})
Thus, I would like the following output:
Date Var1 Var2 Var3
______ ____ ____ ____
734662 1 2 3
734663 1 2 3
734664 1 2 3
734665 1 2 3
734666 1 2 3
734667 1 2 3
734668 1 2 3
734669 1 2 3
Is there a function in Matlab that can do this?

The intersect function may be of use in your case.
I am not sure how you use the tables, but the code below should let you find the indices of the intersection for each DateN vector. Once you have the indices, you can rebuild a global table which only incorporates the rows common to all tables.
[C,ia,ib ] = intersect(Date1,Date2) ; %// get indices of intersection of first 2 vectors (Date1&2)
[D,ja,ix3] = intersect( C ,Date3) ; %// get indices of intersection of last result (Date1&2) with last vector (Date 3)
ix1 = ia(ja) ; %// take only the common indices of the 2 intersection operations (For Date1)
ix2 = ib(ja) ; %// take only the common indices of the 2 intersection operations (For Date2)
%// Build the "common" table
intersectionTable = [ Date1(ix1) T1.Var1(ix1) T2.Var2(ix2) T3.Var3(ix3) ] ; %// Var1..Var3 are read from the tables T1..T3
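As an alternative to chaining intersect by hand, the tables themselves can be joined on the Date column. The following is only a sketch, assuming a MATLAB release that provides table and innerjoin; the names tables and result are made up for illustration. Because the loop runs over a cell array, it should extend to 8 or more tables of different lengths.

```matlab
% Rebuild the example tables from the question
Date1 = datenum(2011,6,7,0:24:240,0,0).';
Date2 = datenum(2011,6,8,0:24:240,0,0).';
Date3 = datenum(2011,6,5,0:24:240,0,0).';
T1 = table(Date1, ones(11,1),   'VariableNames', {'Date','Var1'});
T2 = table(Date2, ones(11,1)*2, 'VariableNames', {'Date','Var2'});
T3 = table(Date3, ones(11,1)*3, 'VariableNames', {'Date','Var3'});

% Collect all tables in a cell array (hypothetical name) and join them
% one at a time; innerjoin keeps only rows whose Date occurs in both inputs
tables = {T1, T2, T3};
result = tables{1};
for k = 2:numel(tables)
    result = innerjoin(result, tables{k}, 'Keys', 'Date');
end
disp(result)
```

Each table needs distinct names for its non-key variables (Var1, Var2, ...), otherwise innerjoin has to rename them to keep them unique.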

I wrote closely related code for another question, so I am putting down a polished function here that finds the intersecting elements and their corresponding indices for several arrays at the same time, without resorting to any kind of looping in the computation part. Here's the code -
function [out_val,out_idx] = intersect_arrays(varargin)
%// Concatenate all vector arrays into a 2D array
if isrow(varargin{1})
M = vertcat(varargin{:});
else
M = horzcat(varargin{:}).'; %//'
end
%// Find unique values for all elements in all arrays
unqvals = unique(M(:),'stable')'; %//'
%// Find unique elements common across all arrays (intersecting elements)
out_val = unqvals(all(any(bsxfun(@eq,M,permute(unqvals,[1 3 2])),2),1));
%// Find first indices across all arrays holding the intersecting elements
[~,idx] = max(bsxfun(@eq,M,permute(out_val,[1 3 2])),[],2);
out_idx = squeeze(idx).'; %//'
return
Now, to solve your case, we can use the function code like so -
num_arrays = 3; %// Number of arrays to be used
%// Find intersecting elements and their corresponding indices in each array
[int_ele,int_idx] = intersect_arrays(T1.Date,T2.Date,T3.Date) %// Add inputs here
%// Create an array of all Var data (the Var columns must all have the same length)
all_idx = cat(2,T1.Var1,T2.Var2,T3.Var3) %// Add inputs here
%// Select Var data based on the intersecting indices
select_idx = all_idx(bsxfun(@plus,int_idx,[0:num_arrays-1]*size(T1.Var1,1)))
%// Output results as a numeric array and table
out_array = [int_ele(:) select_idx]
out_table = cell2table(num2cell(out_array),'VariableNames',...
{'Date','Var1','Var2','Var3'})
Output -
out_table =
Date Var1 Var2 Var3
______ ____ ____ ____
734662 1 2 3
734663 1 2 3
734664 1 2 3
734665 1 2 3
734666 1 2 3
734667 1 2 3
734668 1 2 3
734669 1 2 3

Related

loop a matrix through disorder columns indices

I have a matrix A with 100 rows and 5 columns. I would like to iterate over the matrix using the column indices in a given, unsorted order, [2;5;3;4;1], and save the result of each iteration.
1st iteration: get all the rows of A with column 2, then do some processing.
2nd iteration: get all the rows of A with columns 2 and 5.
.....
Last iteration: get all the rows and columns of A.
Can anyone help me implement this in Matlab, please?
Define the column indices as cols = [2 5 3 4 1]; and iterate through each submatrix of A like subA = A(:,cols(1:i)).
A = rand(100,5);
cols = [2 5 3 4 1];
for i = 1:length(cols)
subA = A(:,cols(1:i));
% do calculations on subA ..
end
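Since the question also asks to save the result of each iteration, one possible sketch is to store every submatrix in a cell array, because the submatrices have different widths. The name subs is made up for illustration.

```matlab
A = rand(100,5);
cols = [2 5 3 4 1];

% One cell per iteration; cell k holds the first k columns in the given order
subs = cell(1, numel(cols));
for k = 1:numel(cols)
    subs{k} = A(:, cols(1:k));
    % do calculations on subs{k} ..
end

disp(size(subs{3}))   % third iteration: columns 2, 5 and 3 of A
```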

How to efficiently find in some dataset the number of occurrences of a given list of items, without using loops?

I have a dataset, M, where some items and their category types are stored in columns 1 and 2 respectively. The vector cat stores the unique category types present in M. Vector Y is a subset of items in M. I want to find how many times each category type is associated with the items in Y. This is the code I have written to do this:
cat(:,1) = unique(M(:,2)); % Unique category types in M
cat(:,2) = zeros(size(cat,1),1); % initialize column 2 of cat to 0s
N = size(Y,1);
for i=1:N
item = Y(i,1);
temp = M(M(:,1)==item,:);
C(:,1) = unique(temp(:,2));
C(:,2) = histc(temp(:,2), unique(temp(:,2))); % Frequency of items in temp(:,2)
for j=1:size(cat,1)
for k=1:size(C,1)
if cat(j,1)==C(k,1)
cat(j,2) = cat(j,2)+C(k,2);
end
end
end
clear C; clear temp; clear item;
end
But this is obviously slow for even moderately sized M, Y and cat. How do I make it faster?
To illustrate with an example, say:
M=[3 2
4 12
1 7
3 4
2 10
1 6
4 19
4 6
3 12
1 10
2 12];
And,
Y=[2;3];
Then I want the output cat to be the following:
cat=[2 1
4 1
6 0
7 0
10 1
12 2
19 0];
If I understand correctly, you want a histogram of the categories of those items from M that also appear in Y.
Using ismember you can find the indices of the items of M that also appear in Y:
idx = ismember(M(:,1), Y);
Use that index to select the desired items and save them to temp:
temp = M(idx, :);
Form the histogram of temp, using the unique category values in Cat(:,1) as bins:
Cat(:,2) = histc(temp(:, 2), Cat(:, 1));
Without saving intermediate results, the above code can be simplified:
idx = ismember(M(:,1),Y);
Cat(:,2) = histc(M(idx, 2), Cat(:,1));
Or all in one line:
Cat(:,2) = histc(M(ismember(M(:,1),Y), 2), Cat(:,1));
Note: cat is the name of a built-in function in MATLAB, so I renamed your variable cat to Cat.
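For completeness, here is the one-liner applied to the example data from the question, as a minimal sketch. It assumes Cat is first initialized with the sorted unique category types, as in the question's own code.

```matlab
M = [3 2; 4 12; 1 7; 3 4; 2 10; 1 6; 4 19; 4 6; 3 12; 1 10; 2 12];
Y = [2; 3];

% Column 1: unique category types in M (sorted, so they can serve as bin edges)
Cat = unique(M(:,2));

% Column 2: how often each category occurs among the rows of M whose item is in Y
Cat(:,2) = histc(M(ismember(M(:,1), Y), 2), Cat(:,1));

disp(Cat)
```

With this data the result matches the output requested in the question: categories 2, 4 and 10 are counted once, 12 twice, and 6, 7 and 19 not at all.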

Matlab - insert/append rows into matrix iteratively

How can I iteratively append rows to a matrix in Matlab?
For example, let's say I have an empty matrix:
m = [];
and when I run the for loop, I get rows that I need to insert into matrix.
For example:
for i=1:5
row = v - x; % for example getting 1 2 3
% m.append(row)?
end
so after inserting it should look something like:
m = [
1 2 3
3 2 1
1 2 3
4 3 2
1 1 1
]
In most programming languages you can simply append rows into array/matrix. But I find it hard to do it in matlab.
Use m = [m ; new_row]; in your loop. If you already know the total number of rows, define m=zeros(row_num,column_num); first, and then use m(i,:) = new_row; in your loop.
Just use
m = [m; row];
Take into account that extending a matrix is slow, as it involves memory reallocation. It's better to preallocate the matrix to its full size,
m = NaN(numRows,numCols);
and then fill the row values at each iteration:
m(ii,:) = row;
Also, it's better not to use i as a variable name, because by default it represents the imaginary unit (that's why I'm using ii here as iteration index).
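Put together, the preallocation approach might look like the sketch below. The sizes and the way row is computed are placeholders for illustration only; in the question the rows come from v - x.

```matlab
numRows = 5;   % assumed to be known in advance
numCols = 3;

% Preallocate once, so the matrix is not reallocated at every iteration
m = NaN(numRows, numCols);

for ii = 1:numRows
    row = ii * [1 2 3];   % hypothetical stand-in for the real computation
    m(ii,:) = row;        % fill row ii in place
end

disp(m)
```

Preallocating with NaN rather than zeros makes rows that were never filled easy to spot afterwards.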
To build a complete matrix like yours value by value, you can do the following.
Here there are 5 rows and 3 columns, hence the two for loops.
Assigning to M(i, j) inserts the value at that location in the matrix.
for i=1:5
for j=1:3
M(i, j) = input('Enter a value = ')
end
fprintf('Row %d inserted successfully\n', i)
end
disp('Full Matrix is = ')
disp(M)
If you enter the same values as given in the question, the output will be like yours:
Full Matrix is =
1 2 3
3 2 1
1 2 3
4 3 2
1 1 1

Subtotal Calculation in Matlab

I would like to compute subtotals of a table in Matlab. Where several rows have equal values in the first two columns, I want to keep those values once and sum the third column over these rows.
As an example, the source matrix is as follows:
A = [1 2 3;
1 2 2;
1 4 1;
2 2 1;
2 2 3];
The output would look like this:
B = [1 2 5;
1 4 1;
2 2 4];
If the first two columns are equal, sum the third column. Is there a simple way of doing this, without having to loop several times?
You can do this with a combination of unique and accumarray:
%# find unique rows and their corresponding indices in A
[uniqueRows,~,rowIdx]=unique(A(:,1:2),'rows');
%# for each group of unique rows, sum the values of the third column of A
subtotal = accumarray(rowIdx,A(:,3),[],@sum);
B = [uniqueRows,subtotal];
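As a quick check, here is the same approach run on the matrix from the question. This is only a sketch. It relies on the fact that accumarray sums by default, so the function handle can be left out.

```matlab
A = [1 2 3;
     1 2 2;
     1 4 1;
     2 2 1;
     2 2 3];

% Group rows by their first two columns
[uniqueRows, ~, rowIdx] = unique(A(:,1:2), 'rows');

% accumarray adds up the third column within each group (sum is the default)
B = [uniqueRows, accumarray(rowIdx, A(:,3))];

disp(B)
```

This reproduces the matrix B requested in the question.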
You can use unique to get all of the groups, then splitapply to sum them
[u, ~, iu] = unique( A(:,1:2), 'rows' ); % Get unique rows & their indices
sums = splitapply( @sum, A(:,3), iu ); % Sum all values according to unique indices
output = [u, sums]
% For the matrix A from the question this gives:
% output =
%      1     2     5
%      1     4     1
%      2     2     4
This is a late answer because a duplicate question has just been asked so I posted here instead. Note that splitapply was introduced in R2015b, so wasn't around when the accumarray solution was posted.

Sort Coordinates Points in Matlab

What I want to do is to sort these coordinate points:
Measured coordinates (x,y) = (2,2),(2,3),(1,2),(1,3),(2,1),(1,1),(3,2),(3,3),(3,1)
I need to get sequences or trajectories of these points so that I can follow them iteratively.
data = [2,2 ; 2,3 ; 1,2 ; 1,3 ; 2,1 ; 1,1 ; 3,2 ; 3,3 ; 3 ,1]
% corresponding sort-value, pick one out or make one up yourself:
sortval = data(:,1); % the x-value
sortval = data(:,2); % y-value
sortval = (data(:,1)-x0).^2 + (data(:,2)-y0).^2; % squared distance from a point (x0,y0) of your choice
% sortval = ... (any other criterion)
[~,sortorder] = sort(sortval);
sorted_data = data(sortorder,:);
But from your comment, I understand you actually need something to reconstruct a path and iteratively find the closest neighbour of the last found point (of the reconstructed path so far).
The following is how I would solve this problem (using pdist2 to calculate the distances between all the points, for convenience):
data = [2,2 ; 2,3 ; 1,2 ; 1,3 ; 2,1 ; 1,1 ; 3,2 ; 3,3 ; 3 ,1];
dist = pdist2(data,data);
N = size(data,1);
result = NaN(1,N);
result(1) = 1; % first point is first row in data matrix
for ii=2:N
dist(:,result(ii-1)) = Inf;
[~, closest_idx] = min(dist(result(ii-1),:));
result(ii) = closest_idx;
end
which results in:
result =
1 2 4 3 6 5 9 7 8
being the indices to consecutive points on the curve. Here's a plot of this result:
As @mathematician1975 already mentioned, there can be equal distances to a point. This is solved here by using min, which just finds the first occurrence of the minimum in an array. This means that if you order your input data differently, you can get different results; this is inherent to the equal-distance issue.
2nd remark: I don't know how this will behave with large input data matrices. It will probably be a bit slow because of the loop, which you can't avoid. I still see room for improvement, but that's up to you ;)
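pdist2 ships with the Statistics and Machine Learning Toolbox. If that toolbox is not available, the distance matrix can be built by hand. The following is a sketch of the same nearest-neighbour loop under that assumption; dx, dy and ordered are names introduced here for illustration.

```matlab
data = [2,2 ; 2,3 ; 1,2 ; 1,3 ; 2,1 ; 1,1 ; 3,2 ; 3,3 ; 3,1];
N = size(data,1);

% Pairwise Euclidean distances without pdist2
dx = bsxfun(@minus, data(:,1), data(:,1).');
dy = bsxfun(@minus, data(:,2), data(:,2).');
dist = sqrt(dx.^2 + dy.^2);

result = NaN(1,N);
result(1) = 1;                       % start at the first row of data
for ii = 2:N
    dist(:,result(ii-1)) = Inf;      % never return to a visited point
    [~, result(ii)] = min(dist(result(ii-1),:));
end

ordered = data(result,:);            % points in path order
disp(result)                         % 1 2 4 3 6 5 9 7 8
```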
Create a matrix from your points so that you have something like
A = [2 2 1 1 2 1 3 3 3;
2 3 2 3 1 1 2 3 1]';
then try
B = sortrows(A,1);
to get a matrix with rows that are your points ordered by their 'x' value or
B = sortrows(A,2)
to get a matrix with rows that are your points ordered by their 'y' value. If your points are ordered with respect to some other ordering parameter (such as time) then sorting will not work unless you remember the order that they were created in.
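sortrows also accepts a vector of columns, which breaks ties in the first column by the second one. A minimal sketch with the points from the question:

```matlab
A = [2 2 1 1 2 1 3 3 3;
     2 3 2 3 1 1 2 3 1]';

% Sort by x (column 1) first, then by y (column 2) within equal x
B = sortrows(A, [1 2]);

disp(B)
```

With these points, B runs through (1,1), (1,2), (1,3), then (2,1) and so on up to (3,3).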