Ignoring similar columns when concating matrixes vertically - matlab

In matlab I have a 128 by n matrix, which we can call
[A B C]
where each letter is an 128 by 1 matrix.
So what I want to do is concat the above matrix with another matrix,
[A~ D E].
Where A~ is similar in its values to A.
What I want to get as the result of the concat would be:
[A B C D E],
where A~ is omitted.
What is the best way to do this? Note that I do not know beforehand that A~ is similar.
To clarify, my problem is how would I determine if two columns are similar? By similar I mean where between two columns, many of the row values are close in value.
Maybe an illustration would help as well
Vector A: [1 2 3 4 5 6 7 8 9]'
| | | | | | | | |
Vector B: [20 2.4 4 5 0 7 7 7.6 10]'
where there are some instances where the values are completely different, but for the most part the values are close. I don't have a defined threshold for this, but ideally it would be something that I could experiment with.

If you want to omit only identical columns, this is one way to do it:
%# Define the example matrices.
Matrix1 = [ 1 2 3; 4 5 6; 7 8 9 ]';
Matrix2 = [ 4 5 6; 7 8 10 ]';
%# Concatenate the matrices and keep only unique columns.
OutputMatrix = unique([ Matrix1, Matrix2 ]', 'rows')';

To solve this, a matching algorithm called vl_ubcmatch can be used.
[matches, scores] = vl_ubcmatch(da, db) ; For each descriptor in da,
vl_ubcmatch finds the closest descriptor in db (as measured by the L2
norm of the difference between them). The index of the original match
and the closest descriptor is stored in each column of matches and the
distance between the pair is stored in scores.
source:
http://www.vlfeat.org/overview/sift.html
Thus, the solution is to find the matched columns with the highest scores and eliminate them before concatenating.

I think it's pdist2 you need.
Consider the following example:
>> X = rand(25, 5);
>> Y = rand(100, 5);
>> Y(22, : ) = 0.99*X(22,:);
>> D = pdist2(X,Y, 'euclidean');
>> [~,ind] = min(D(:));
>> [i,j]=ind2sub(size(D),ind)
i =
22
j =
22
which is indeed the entry we manipulated to be similar. Read help pdist2 or doc pdist2 for more background.

Related

How to get all the possible combinations of elements in a matrix, but don't allow exchange of elements inbetween columns?

Lets say I have this matrice A: [3 x 4]
1 4 7 10
2 5 8 11
3 6 9 12
I want to permute the element of in each column, but they can't change to a different column, so 1 2 3 need to always be part of the first column. So for exemple I want:
3 4 8 10
1 5 7 11
2 6 9 12
3 4 8 11
1 6 7 10
2 5 9 12
1 6 9 11
. . . .
So in one matrix I would like to have all the possible permutation, in this case, there are 3 different choices 3x3x3x3=81possibilities.So my result matrixe should be 81x4, because I only need each time one [1x4]line vector answer, and that 81 time.
An other way to as the question would be (for the same end for me), would be, if I have 4 column vector:
a=[1;2;3]
b=[4;5;6]
c=[7;8;9]
d=[10;11;12;13]
Compare to my previous exemple, each column vector can have a different number of row. Then is like I have 4 boxes, A, B C, D and I can only put one element of a in A, b in B and so on; so I would like to get all the permutation possible with the answer [A B C D] beeing a [1x4] row, and in this case, I would have 3x3x3x4=108 different row. So where I have been missunderstood (my fault), is that I don't want all the different [3x4] matrix answers but just [1x4]lines.
so in this case the answer would be:
1 4 7 10
and 1 4 7 11
and 1 4 7 12
and 1 4 7 13
and 2 4 8 10
and ...
until there are the 108 combinations
The fonction perms in Matlab can't do that since I don't want to permute all the matrix (and btw, this is already a too big matrix to do so).
So do you have any idea how I could do this or is there is a fonction which can do that? I, off course, also could have matrix which have different size. Thank you
Basically you want to get all combinations of 4x the permutations of 1:3.
You could generate these with combvec from the Neural Networks Toolbox (like #brainkz did), or with permn from the File Exchange.
After that it's a matter of managing indices, applying sub2ind (with the correct column index) and rearranging until everything is in the order you want.
a = [1 4 7 10
2 5 8 11
3 6 9 12];
siz = size(a);
perm1 = perms(1:siz(1));
Nperm1 = size(perm1,1); % = factorial(siz(1))
perm2 = permn(1:Nperm1, siz(2) );
Nperm2 = size(perm2,1);
permidx = reshape(perm1(perm2,:)', [Nperm2 siz(1), siz(2)]); % reshape unnecessary, easier for debugging
col_base_idx = 1:siz(2);
col_idx = col_base_idx(ones(Nperm2*siz(1) ,1),:);
lin_idx = reshape(sub2ind(size(a), permidx(:), col_idx(:)), [Nperm2*siz(1) siz(2)]);
result = a(lin_idx);
This avoids any loops or cell concatenation and uses straigh indexing instead.
Permutations per column, unique rows
Same method:
siz = size(a);
permidx = permn(1:siz(1), siz(2) );
Npermidx = size(permidx, 1);
col_base_idx = 1:siz(2);
col_idx = col_base_idx(ones(Npermidx, 1),:);
lin_idx = reshape(sub2ind(size(a), permidx(:), col_idx(:)), [Npermidx siz(2)]);
result = a(lin_idx);
Your question appeared to be a very interesting brain-teaser. I suggest the following:
in = [1,2,3;4,5,6;7,8,9;10,11,12]';
b = perms(1:3);
a = 1:size(b,1);
c = combvec(a,a,a,a);
for k = 1:length(c(1,:))
out{k} = [in(b(c(1,k),:),1),in(b(c(2,k),:),2),in(b(c(3,k),:),3),in(b(c(4,k),:),4)];
end
%and if you want your result as an ordinary array:
out = vertcat(out{:});
b is a 6x3 array that contains all possible permutations of [1,2,3]. c is 4x1296 array that contains all possible combinations of elements in a = 1:6. In the for loop we use number from 1 to 6 to get the permutation in b, and that permutation is used as indices to the column.
Hope that helps
this is another octave friendly solution:
function result = Tuples(A)
[P,n]= size(A);
M = reshape(repmat(1:P, 1, P ^(n-1)), repmat(P, 1, n));
result = zeros(P^ n, n);
for i = 1:n
result(:, i) = A(reshape(permute(M, circshift((1:n)', i)), P ^ n, 1), i);
end
end
%%%example
A = [...
1 4 7 10;...
2 5 8 11;...
3 6 9 12];
result = Tuples(A)
Update:
Question updated that: given n vectors of different length generates a list of all possible tuples whose ith element is from vector i:
function result = Tuples( A)
if exist('repelem') ==0
repelem = #(v,n) repelems(v,[1:numel(v);n]);
end
n = numel(A);
siz = [ cell2mat(cellfun(#numel, A , 'UniformOutput', false))];
tot_prd = prod(siz);
cum_prd=cumprod(siz);
tot_cum = tot_prd ./ cum_prd;
cum_siz = cum_prd ./ siz;
result = zeros(tot_prd, n);
for i = 1: n
result(:, i) = repmat(repelem(A{i},repmat(tot_cum(i),1,siz(i))) ,1,cum_siz(i));
end
end
%%%%example
a = {...
[1;2;3],...
[4;5;6],...
[7;8;9],...
[10;11;12;13]...
};
result =Tuples(a)
This is a little complicated but it works without the need for any additional toolboxes:
You basically want a b element 'truth table' which you can generate like this (adapted from here) if you were applying it to each element:
[b, n] = size(A)
truthtable = dec2base(0:power(b,n)-1, b) - '0'
Now you need to convert the truth table to linear indexes by adding the column number times the total number of rows:
idx = bsxfun(#plus, b*(0:n-1)+1, truthtable)
now you instead of applying this truth table to each element you actually want to apply it to each permutation. There are 6 permutations so b becomes 6. The trick is to then create a 6-by-1 cell array where each element has a distinct permutation of [1,2,3] and then apply the truth table idea to that:
[m,n] = size(A);
b = factorial(m);
permutations = reshape(perms(1:m)',[],1);
permCell = mat2cell(permutations,ones(b,1)*m,1);
truthtable = dec2base(0:power(b,n)-1, b) - '0';
expandedTT = cell2mat(permCell(truthtable + 1));
idx = bsxfun(#plus, m*(0:n-1), expandedTT);
A(idx)
Another answer. Rather specific just to demonstrate the concept, but can easily be adapted.
A = [1,4,7,10;2,5,8,11;3,6,9,12];
P = perms(1:3)'
[X,Y,Z,W] = ndgrid(1:6,1:6,1:6,1:6);
You now have 1296 permutations. If you wanted to access, say, the 400th one:
Permutation_within_column = [P(:,X(400)), P(:,Y(400)), P(:,Z(400)), P(:,W(400))];
ColumnOffset = repmat([0:3]*3,[3,1])
My_permutation = Permutation_within_column + ColumnOffset; % results in valid linear indices
A(My_permutation)
This approach allows you to obtain the 400th permutation on demand; if you prefer to have all possible permutations concatenated in the 3rd dimension, (i.e. a 3x4x1296 matrix), you can either do this with a for loop, or simply adapt the above and vectorise; for example, if you wanted to create a 3x4x2 matrix holding the first two permutations along the 3rd dimension:
Permutations_within_columns = reshape(P(:,X(1:2)),3,1,[]);
Permutations_within_columns = cat(2, Permutations_within_columns, reshape(P(:,Y(1:2)),3,1,[]));
Permutations_within_columns = cat(2, Permutations_within_columns, reshape(P(:,Z(1:2)),3,1,[]));
Permutations_within_columns = cat(2, Permutations_within_columns, reshape(P(:,W(1:2)),3,1,[]));
ColumnOffsets = repmat([0:3]*3,[3,1,2]);
My_permutations = Permutations_within_columns + ColumnOffsets;
A(My_permutations)
This approach enables you to collect a specific subrange, which may be useful if available memory is a concern (i.e. for larger matrices) and you'd prefer to perform your operations by blocks. If memory isn't a concern you can get all 1296 permutations at once in one giant matrix if you wish; just adapt as appropriate (e.g. replicate ColumnOffsets the right number of times in the 3rd dimension)

Matlab: find second argmax [duplicate]

How do I find the index of the 2 maximum values of a 1D array in MATLAB? Mine is an array with a list of different scores, and I want to print the 2 highest scores.
You can use sort, as #LuisMendo suggested:
[B,I] = sort(array,'descend');
This gives you the sorted version of your array in the variable B and the indexes of the original position in I sorted from highest to lowest. Thus, B(1:2) gives you the highest two values and I(1:2) gives you their indices in your array.
I'll go for an O(k*n) solution, where k is the number of maximum values you're looking for, rather than O(n log n):
x = [3 2 5 4 7 3 2 6 4];
y = x; %// make a copy of x because we're going to modify it
[~, m(1)] = max(y);
y(m(1)) = -Inf;
[~, m(2)] = max(y);
m =
5 8
This is only practical if k is less than log n. In fact, if k>=3 I would put it in a loops, which may offend the sensibilities of some. ;)
To get the indices of the two largest elements: use the second output of sort to get the sorted indices, and then pick the last two:
x = [3 2 5 4 7 3 2 6 4];
[~, ind] = sort(x);
result = ind(end-1:end);
In this case,
result =
8 5

Matching by Matlab

A =[1;2;3;4;5]
B= [10 1 ;11 2;19 5]
I want to get
D = [1 10 ;2 11; 3 -9; 4 -9; 5 19]
That is, if something in A doesn't exist in B(:,2), 2nd column of D should be -9.
If something in A exists in B(:,2), I want to put the corresponding row of 1st column of B in 2nd column of D.
I know how to do it with a mix of ismember and for and if. But I need a more elegant method which doesn't use "for" to speed it up.
Unless I'm missing something (or A is not a vector of indices), this is actually much simpler and doesn't require ismember or find at all, just direct indexing:
D = [A zeros(length(A),1)-9];
D(B(:,2),2) = B(:,1)
which for your example matrices gives
D =
1 10
2 11
3 -9
4 -9
5 19
For general A:
A =[1;2;3;4;6]; % Note change in A
B= [10 1 ;11 2;19 5];
[aux, where] = ismember(A,B(:,2));
b_where = find(where>0);
D = [(1:length(A)).' repmat(-9,length(A),1)];
D(b_where,2) = B(where(b_where),1);
This gives
D = [ 1 10
2 11
3 -9
4 -9
5 -9 ]
I am not sure how efficient this solution is, but it avoids using any loops so at least it would be a place to start:
D = [A -9*ones(size(A))]; % initialize your result
[tf, idx] = ismember(A, B(:,2)); % get indices of matching elements
idx(idx==0) = []; % trim the zeros
D(find(tf),2) = B(idx,1); % set the matching entries in D to the appropriate entries in B
disp(E)
You allocate your matrix ahead of time to save time later on (building up matrices dynamically is really slow in MATLAB). The ismember call is returning the true-false vector tf showing which elements of A correspond to something in B, as well as the associated index of what they correspond to in idx. The problem with idx is that it contains a zero any time ft is zero, which is why we have the line idx(idx==0) = []; to clear out these zeros. Finally, find(tf) is used to get the indices associated with a match on the destination (rows of D in this case), and we set the values at those indices equal to the corresponding value in B that we want.
First create a temporary matrix:
tmp=[B; -9*ones(size(A(~ismember(A,B)))), A(~ismember(A,B))]
tmp =
10 1
11 2
19 5
-9 3
-9 4
Then use find to obtain the first row index with a match to elements of A in the second column of tmp (there is always a match by design).
D=[A, arrayfun(#(x) tmp(find(tmp(:,2)==x,1,'first'),1),A)];
D =
1 10
2 11
3 -9
4 -9
5 19
As an alternative for the second step, you could simply sort tmp based on its second column:
[~,I]=sort(tmp(:,2));
D=[tmp(I,2), tmp(I,1)]
D =
1 10
2 11
3 -9
4 -9
5 19

Averaging every n elements of a vector in matlab

I would like to average every 3 values of an vector in Matlab, and then assign the average to the elements that produced it.
Examples:
x=[1:12];
y=%The averaging operation;
After the operation,
y=
[2 2 2 5 5 5 8 8 8 11 11 11]
Therefore the produced vector is the same size, and the jumping average every 3 values replaces the values that were used to produce the average (i.e. 1 2 3 are replaced by the average of the three values, 2 2 2). Is there a way of doing this without a loop?
I hope that makes sense.
Thanks.
I would go this way:
Reshape the vector so that it is a 3×x matrix:
x=[1:12];
xx=reshape(x,3,[]);
% xx is now [1 4 7 10; 2 5 8 11; 3 6 9 12]
after that
yy = sum(xx,1)./size(xx,1)
and now
y = reshape(repmat(yy, size(xx,1),1),1,[])
produces exactly your wanted result.
Your parameter 3, denoting the number of values, is only used at one place and can easily be modified if needed.
You may find the mean of each trio using:
x = 1:12;
m = mean(reshape(x, 3, []));
To duplicate the mean and reshape to match the original vector size, use:
y = m(ones(3,1), :) % duplicates row vector 3 times
y = y(:)'; % vector representation of array using linear indices

Generating all combinations containing at least one element of a given set in Matlab

I use combnk to generate a list of combinations. How can I generate a subset of combinations, which always includes particular values. For example, for combnk(1:10, 2) I only need combinations which contain 3 and/or 5. Is there a quick way to do this?
Well, in your specific example, choosing two integers from the set {1, ..., 10} such that one of the chosen integers is 3 or 5 yields 9+9-1 = 17 known combinations, so you can just enumerate them.
In general, to find all of the n-choose-k combinations from integers {1, ..., n} that contain integer m, that is the same as finding the (n-1)-choose-(k-1) combinations from integers {1, ..., m-1, m+1, ..., n}.
In matlab, that would be
combnk([1:m-1 m+1:n], k-1)
(This code is still valid even if m is 1 or n.)
For a brute force solution, you can generate all your combinations with COMBNK then use the functions ANY and ISMEMBER to find only those combinations that contain one or more of a subset of numbers. Here's how you can do it using your above example:
v = 1:10; %# Set of elements
vSub = [3 5]; %# Required elements (i.e. at least one must appear in the
%# combinations that are generated)
c = combnk(v,2); %# Find pairwise combinations of the numbers 1 through 10
rowIndex = any(ismember(c,vSub),2); %# Get row indices where 3 and/or 5 appear
c = c(rowIndex,:); %# Keep only combinations with 3 and/or 5
EDIT:
For a more elegant solution, it looks like Steve and I had a similar idea. However, I've generalized the solution so that it works for both an arbitrary number of required elements and for repeated elements in v. The function SUBCOMBNK will find all the combinations of k values taken from a set v that include at least one of the values in the set vSub:
function c = subcombnk(v,vSub,k)
%#SUBCOMBNK All combinations of the N elements in V taken K at a time and
%# with one or more of the elements in VSUB as members.
%# Error-checking (minimal):
if ~all(ismember(vSub,v))
error('The values in vSub must also be in v.');
end
%# Initializations:
index = ismember(v,vSub); %# Index of elements in v that are in vSub
vSub = v(index); %# Get elements in v that are in vSub
v = v(~index); %# Get elements in v that are not in vSub
nSubset = numel(vSub); %# Number of elements in vSub
nElements = numel(v); %# Number of elements in v
c = []; %# Initialize combinations to empty
%# Find combinations:
for kSub = max(1,k-nElements):min(k,nSubset)
M1 = combnk(vSub,kSub);
if kSub == k
c = [c; M1];
else
M2 = combnk(v,k-kSub);
c = [c; kron(M1,ones(size(M2,1),1)) repmat(M2,size(M1,1),1)];
end
end
end
You can test this function against the brute force solution above to see that it returns the same output:
cSub = subcombnk(v,vSub,2);
setxor(c,sort(cSub,2),'rows') %# Returns an empty matrix if c and cSub
%# contain exactly the same rows
I further tested this function against the brute force solution using v = 1:15; and vSub = [3 5]; for values of N ranging from 2 to 15. The combinations created were identical, but SUBCOMBNK was significantly faster as shown by the average run times (in msec) displayed below:
N | brute force | SUBCOMBNK
---+-------------+----------
2 | 1.49 | 0.98
3 | 4.91 | 1.17
4 | 17.67 | 4.67
5 | 22.35 | 8.67
6 | 30.71 | 11.71
7 | 36.80 | 14.46
8 | 35.41 | 16.69
9 | 31.85 | 16.71
10 | 25.03 | 12.56
11 | 19.62 | 9.46
12 | 16.14 | 7.30
13 | 14.32 | 4.32
14 | 0.14 | 0.59* #This could probably be sped up by checking for
15 | 0.11 | 0.33* #simplified cases (i.e. all elements in v used)
Just to improve Steve's answer : in your case (you want all combinations with 3 and/or 5) it will be
all k-1/n-2 combinations with 3 added
all k-1/n-2 combinations with 5 added
all k-2/n-2 combinations with 3 and 5 added
Easily generalized for any other case of this type.