MATLAB: ismember vs isequal

MATLAB: ismember vs isequal - matlab

If A and B are tables (or datasets) having the same the same columns (and in the same order), then an expression like ismember(A(:, somecols), B(:, somecols)) will produce a boolean array suitable for indexing A with, as in
A(ismember(A(:, somecols), B(:, somecols)), :)
The line above evaluates to a table (or dataset, depending on the class of A) consisting of those rows of A that match some row of B at the columns specified in somecols.
But now suppose that B has exactly one row. More realistically, suppose that the criterion for selecting rows from A is simply to match this one single row of B, say the first one.
One could do this:
A(ismember(A(:, somecols), B(1, somecols)), :)
The main quibble I have with this is that it is not "semantically clear", because ismember is being used, in effect, to test for equality.
It would be semantically clearer if one could write
A(isequal(A(:, somecols), B(1, somecols)), :)
but this does line not produce the desired results. (Specifically, it returns no matches even when A(:, ...) contains rows matching B(1, ...).)
My question is, what is the predicate that will correctly produce the logical vector corresponding to the question "does this row of A match this reference row at somecols"?

For the table data type you can also use innerjoin, but ismember is fairly clear in this case. Consider the tables At and Bt, where Bt has two common rows and one unique row:
>> A = randi(7,4,5);
>> commonRows = [1 3];
>> B = [A(commonRows,:); randi(2,1,5)+7];
>> At = array2table(A,'VariableNames',sprintfc('C%d',1:size(A,2)))
At =
C1 C2 C3 C4 C5
__ __ __ __ __
4 1 5 7 7
2 6 5 1 4
4 4 6 7 4
2 7 7 5 6
>> Bt = array2table(B,'VariableNames',sprintfc('C%d',1:size(A,2)))
Bt =
C1 C2 C3 C4 C5
__ __ __ __ __
4 1 5 7 7
4 4 6 7 4
8 8 9 9 9
The second output argument of innerjoin, IA, gives you the indexes of rows in A that are also in B. As in your example, consider a subset of the columns, specified by somecols:
>> somecols = [2 5]
somecols =
2 5
>> [Ct,IA] = innerjoin(At(:,somecols), Bt(1,somecols))
Ct =
C2 C5
__ __
1 7
IA =
1
>> [Ct,IA] = innerjoin(At(:,somecols), Bt(2,somecols))
Ct =
C2 C5
__ __
4 4
IA =
3
>> [Ct,IA] = innerjoin(At(:,somecols), Bt(3,somecols))
Ct =
empty 0-by-2 table
IA =
[]
If IA is empty (or not) is a suitable test:
>> [~,IA] = innerjoin(At, Bt(3,:));
>> isempty(IA)
ans =
1
>> [~,IA] = innerjoin(At, Bt(2,:));
>> isempty(IA)
ans =
0
Or just test the first output, the common table rows:
>> isempty(innerjoin(At, Bt(3,:)))
ans =
1
>> isempty(innerjoin(At, Bt(1,:)))
ans =
0

I agree that with the ismember option it may not be immediately clear what you are intending (though there is nothing wrong with it exactly). Another way you could do which I guess might be more semantically clear (though potentially less efficient) is to use bsxfun like so:
all(bsxfun(#eq,A(:,somecols),B(1,somecols)),2);
If you were to expand this into what is essentially happening under the hood, it would be something like:
a = A(:,somecols);
b = repmat(B(1,somecols),size(A,1),1);
abeq = all(a == b,2);
A(abeq,:);
Basically you're replicating the one B row so that it is the same size as A(:,somecols) and then comparing each value in each array. Finally, you're checking which rows have a whole row of true (by using all), which indicates it matches the single row of B.
EDIT: Sorry, apparently I misunderstood the question - if you're using the table datatype (which I didn't actually know existed until a few minutes ago - thanks horchler), then this approach probably won't work.
EDIT2: Notlikethat pointed out the existence of the function rowfun, which acts on each row in a table. I can't test this (my version of MATLAB isn't new enough) but I assume that something like this would do what you are wanting:
A(rowfun(#(x) isequal(B(1,somecols),x),A(:,somecols)),:);

Related

How to get all the possible combinations of elements in a matrix, but don't allow exchange of elements inbetween columns?

Lets say I have this matrice A: [3 x 4]
1 4 7 10
2 5 8 11
3 6 9 12
I want to permute the element of in each column, but they can't change to a different column, so 1 2 3 need to always be part of the first column. So for exemple I want:
3 4 8 10
1 5 7 11
2 6 9 12
3 4 8 11
1 6 7 10
2 5 9 12
1 6 9 11
. . . .
So in one matrix I would like to have all the possible permutation, in this case, there are 3 different choices 3x3x3x3=81possibilities.So my result matrixe should be 81x4, because I only need each time one [1x4]line vector answer, and that 81 time.
An other way to as the question would be (for the same end for me), would be, if I have 4 column vector:
a=[1;2;3]
b=[4;5;6]
c=[7;8;9]
d=[10;11;12;13]
Compare to my previous exemple, each column vector can have a different number of row. Then is like I have 4 boxes, A, B C, D and I can only put one element of a in A, b in B and so on; so I would like to get all the permutation possible with the answer [A B C D] beeing a [1x4] row, and in this case, I would have 3x3x3x4=108 different row. So where I have been missunderstood (my fault), is that I don't want all the different [3x4] matrix answers but just [1x4]lines.
so in this case the answer would be:
1 4 7 10
and 1 4 7 11
and 1 4 7 12
and 1 4 7 13
and 2 4 8 10
and ...
until there are the 108 combinations
The fonction perms in Matlab can't do that since I don't want to permute all the matrix (and btw, this is already a too big matrix to do so).
So do you have any idea how I could do this or is there is a fonction which can do that? I, off course, also could have matrix which have different size. Thank you

Basically you want to get all combinations of 4x the permutations of 1:3.
You could generate these with combvec from the Neural Networks Toolbox (like #brainkz did), or with permn from the File Exchange.
After that it's a matter of managing indices, applying sub2ind (with the correct column index) and rearranging until everything is in the order you want.
a = [1 4 7 10
2 5 8 11
3 6 9 12];
siz = size(a);
perm1 = perms(1:siz(1));
Nperm1 = size(perm1,1); % = factorial(siz(1))
perm2 = permn(1:Nperm1, siz(2) );
Nperm2 = size(perm2,1);
permidx = reshape(perm1(perm2,:)', [Nperm2 siz(1), siz(2)]); % reshape unnecessary, easier for debugging
col_base_idx = 1:siz(2);
col_idx = col_base_idx(ones(Nperm2*siz(1) ,1),:);
lin_idx = reshape(sub2ind(size(a), permidx(:), col_idx(:)), [Nperm2*siz(1) siz(2)]);
result = a(lin_idx);
This avoids any loops or cell concatenation and uses straigh indexing instead.
Permutations per column, unique rows
Same method:
siz = size(a);
permidx = permn(1:siz(1), siz(2) );
Npermidx = size(permidx, 1);
col_base_idx = 1:siz(2);
col_idx = col_base_idx(ones(Npermidx, 1),:);
lin_idx = reshape(sub2ind(size(a), permidx(:), col_idx(:)), [Npermidx siz(2)]);
result = a(lin_idx);

Your question appeared to be a very interesting brain-teaser. I suggest the following:
in = [1,2,3;4,5,6;7,8,9;10,11,12]';
b = perms(1:3);
a = 1:size(b,1);
c = combvec(a,a,a,a);
for k = 1:length(c(1,:))
out{k} = [in(b(c(1,k),:),1),in(b(c(2,k),:),2),in(b(c(3,k),:),3),in(b(c(4,k),:),4)];
end
%and if you want your result as an ordinary array:
out = vertcat(out{:});
b is a 6x3 array that contains all possible permutations of [1,2,3]. c is 4x1296 array that contains all possible combinations of elements in a = 1:6. In the for loop we use number from 1 to 6 to get the permutation in b, and that permutation is used as indices to the column.
Hope that helps

this is another octave friendly solution:
function result = Tuples(A)
[P,n]= size(A);
M = reshape(repmat(1:P, 1, P ^(n-1)), repmat(P, 1, n));
result = zeros(P^ n, n);
for i = 1:n
result(:, i) = A(reshape(permute(M, circshift((1:n)', i)), P ^ n, 1), i);
end
end
%%%example
A = [...
1 4 7 10;...
2 5 8 11;...
3 6 9 12];
result = Tuples(A)
Update:
Question updated that: given n vectors of different length generates a list of all possible tuples whose ith element is from vector i:
function result = Tuples( A)
if exist('repelem') ==0
repelem = #(v,n) repelems(v,[1:numel(v);n]);
end
n = numel(A);
siz = [ cell2mat(cellfun(#numel, A , 'UniformOutput', false))];
tot_prd = prod(siz);
cum_prd=cumprod(siz);
tot_cum = tot_prd ./ cum_prd;
cum_siz = cum_prd ./ siz;
result = zeros(tot_prd, n);
for i = 1: n
result(:, i) = repmat(repelem(A{i},repmat(tot_cum(i),1,siz(i))) ,1,cum_siz(i));
end
end
%%%%example
a = {...
[1;2;3],...
[4;5;6],...
[7;8;9],...
[10;11;12;13]...
};
result =Tuples(a)

This is a little complicated but it works without the need for any additional toolboxes:
You basically want a b element 'truth table' which you can generate like this (adapted from here) if you were applying it to each element:
[b, n] = size(A)
truthtable = dec2base(0:power(b,n)-1, b) - '0'
Now you need to convert the truth table to linear indexes by adding the column number times the total number of rows:
idx = bsxfun(#plus, b*(0:n-1)+1, truthtable)
now you instead of applying this truth table to each element you actually want to apply it to each permutation. There are 6 permutations so b becomes 6. The trick is to then create a 6-by-1 cell array where each element has a distinct permutation of [1,2,3] and then apply the truth table idea to that:
[m,n] = size(A);
b = factorial(m);
permutations = reshape(perms(1:m)',[],1);
permCell = mat2cell(permutations,ones(b,1)*m,1);
truthtable = dec2base(0:power(b,n)-1, b) - '0';
expandedTT = cell2mat(permCell(truthtable + 1));
idx = bsxfun(#plus, m*(0:n-1), expandedTT);
A(idx)

Another answer. Rather specific just to demonstrate the concept, but can easily be adapted.
A = [1,4,7,10;2,5,8,11;3,6,9,12];
P = perms(1:3)'
[X,Y,Z,W] = ndgrid(1:6,1:6,1:6,1:6);
You now have 1296 permutations. If you wanted to access, say, the 400th one:
Permutation_within_column = [P(:,X(400)), P(:,Y(400)), P(:,Z(400)), P(:,W(400))];
ColumnOffset = repmat([0:3]*3,[3,1])
My_permutation = Permutation_within_column + ColumnOffset; % results in valid linear indices
A(My_permutation)
This approach allows you to obtain the 400th permutation on demand; if you prefer to have all possible permutations concatenated in the 3rd dimension, (i.e. a 3x4x1296 matrix), you can either do this with a for loop, or simply adapt the above and vectorise; for example, if you wanted to create a 3x4x2 matrix holding the first two permutations along the 3rd dimension:
Permutations_within_columns = reshape(P(:,X(1:2)),3,1,[]);
Permutations_within_columns = cat(2, Permutations_within_columns, reshape(P(:,Y(1:2)),3,1,[]));
Permutations_within_columns = cat(2, Permutations_within_columns, reshape(P(:,Z(1:2)),3,1,[]));
Permutations_within_columns = cat(2, Permutations_within_columns, reshape(P(:,W(1:2)),3,1,[]));
ColumnOffsets = repmat([0:3]*3,[3,1,2]);
My_permutations = Permutations_within_columns + ColumnOffsets;
A(My_permutations)
This approach enables you to collect a specific subrange, which may be useful if available memory is a concern (i.e. for larger matrices) and you'd prefer to perform your operations by blocks. If memory isn't a concern you can get all 1296 permutations at once in one giant matrix if you wish; just adapt as appropriate (e.g. replicate ColumnOffsets the right number of times in the 3rd dimension)

Reshape Matlab table

I have the following table
name = ['A' 'A' 'A' 'B' 'B' 'C' 'C' 'C' 'C' 'D' 'D' 'E' 'E' 'E']';
value = randn(14, 1);
T = table(name, value);
i,e.
T =
name value
____ _________
A 0.0015678
A -0.76226
A 0.98404
B -1.0942
B 0.71249
C 1.688
C 1.4001
C -0.9278
C -1.3725
D 0.11563
D 0.076776
E 1.0568
E 1.1972
E 0.29037
I want to transform it in the following way: take the first two cells in value corresponding to different values in name and put it in the 5x2 matrix. This matrix would have rows corresponding to different names A,B,C,D,E and columns corresponding to values, e.g. the first two rows are
0.0015678 -0.76226
-1.0942 0.71249

This can be done with accumarray using a custom function. The first step is to convert the name column of T into a numeric vector; and then accumarray can be applied.
This approach requires T being sorted according to column 1, because only in this case is accumarray guaranteed to preserve order (as indicated in its documentation). So if T may not be sorted (although it is in your example), sort it first using sortrows.
T = sortrows(T, 1); %// you can remove this line if T is guaranteed to be sorted
[~, ~, names] = unique(T(:,1)); %// names as a numeric vector
result = cell2mat(accumarray(names, T.value, [], #(x) {x([1 2]).'}));

First figure out where each name has values located in the table, then cycle through each name and place the first two values encountered for each name into individual cell arrays. Once you're done, reshape the matrix to 5 x 2 as you have said. As such, do something like this:
names = unique(T.name); %// 1
ind = arrayfun(#(x) find(T.name == x), names, 'uni', 0); %// 2
vals = cellfun(#(x) T.value(x(1:2)), ind, 'uni', 0); %// 3
m = [vals{:}].'; %// 4
Let's go through each line of code slowly.
Line #1
The first line finds all unique names through unique and we store them into names.
Line #2
The next line goes through all of the unique names and finds those locations / rows in the table that share that particular name. I use arrayfun and go through each name in names, find those rows that share the same name as one we are looking for, and place those row locations into individual cells; these are stored into ind. To find the locations of each valid name in our table, I use find and the locations are placed into a column vector. As such, we will have five column vectors where each column vector is placed into an individual cell. These column vectors will tell us which rows match a particular name located in your table.
Line #3
The next line uses cellfun to go through each of the cells in ind and extracts the first two row locations that share a particular name, indexes into the value field for your table to pull those two values, and these are placed as two-element vectors into individual cells for each name.
Line #4
The last line of code simply unrolls each two-element vector. The first two elements of each name get stored into columns. To get them into rows, I simply transpose the unrolling. The output matrix is stored into m.
If you want to see what the output looks like, this is what I get when I run the above code with your example table:
m =
0.0016 -0.7623
-1.0942 0.7125
1.6880 1.4001
0.1156 0.0768
1.0568 1.1972
Be advised that I only showed the first 5 digits of precision so there is some round-off at the end. However, this is only for display purposes and so what I got is equivalent to what your expect for the output.
Hope this helps!

If you want use the tables, you could try something like this:
count = 1;
U = unique(table2array(T(:,1)));
for ii = 1:size(U,1)
A = find(table2array(T(:,1)) == U(ii));
A = A(1:2);
B(count,1:2) = table2array(T(A,2));
count = count + 1;
end
Personally, I would find this simpler to do with your name and value arrays and forget about the table. If it is a requirement then I understand, however I will provide my solution still. It may provide some insight either way.
count = 1;
U = unique(name);
for ii = 1:size(U,1)
A = find(name == U(ii));
A = A(1:2);
B(count,1:2) = value(A);
count = count + 1;
end
Quick and dirty, but hopefully it's good enough. Good luck.

Another solution that is more manageable and easily scalable exists. Since MATLAB R2013b you can use a specialized function for pivoting a table (which is what you want to do): unstack.
In order to get exactly what you wanted, you need to add an extra variable to your table that will indicate replications:
name = ['A' 'A' 'A' 'B' 'B' 'C' 'C' 'C' 'C' 'D' 'D' 'E' 'E' 'E']';
value = randn(14, 1);
rep = [1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 1, 2, 3];
T = table(name, value, rep);
T =
name value rep
____ _________ ___
A 0.53767 1
A 1.8339 2
A -2.2588 3
B 0.86217 1
B 0.31877 2
C -1.3077 1
C -0.43359 2
C 0.34262 3
C 3.5784 4
D 2.7694 1
D -1.3499 2
E 3.0349 1
E 0.7254 2
E -0.063055 3
Then you just use unstack like this:
pivotTable = unstack(T, 'value','name')
pivotTable =
rep A B C D E
___ _______ _______ ________ _______ _________
1 0.53767 0.86217 -1.3077 2.7694 3.0349
2 1.8339 0.31877 -0.43359 -1.3499 0.7254
3 -2.2588 NaN 0.34262 NaN -0.063055
4 NaN NaN 3.5784 NaN NaN
Afterwards, it's a matter of re-arranging the table if you still want to.

The easiest way is to first convert the table into a matrix form and then reshape it by using the "reshape" function in Matlab.
matrix = t{:,:};% t-- your table variable
reshape_matrix = reshape(matrix ,[2,3]) % [2,3]--> the size of the matrix you desire
These two steps can be done by one line of code
reshape_matrix = reshape(t{:,:},[2,3]);

Writing sums of product taken two or three at a time

I have the variables a1 a2 a3 a4.... an etc
I want to find out F1 F2 F3 ...Fk.... Fn
where
F1 = a1+a2+a3+a4+.....an
F2 = - (sum of products of a's taken two at a time )
F3 = (sum of products of a's taken three at a time)
Fk = (-1)^(k+1)* (sum of products of a's taken k at a time)
and so on till Fn
Can anyone help me how to get Fk through pseudo code or matlab code. Is there any built-in function in matlab which can help me in doing this ?

Use the in-built function poly(r) in matlab which takes roots of polynomial
r as an array:
function Fk = get_comb(r,k)
p=poly(r)
Fk=(-1)^(k)*p(k+1);
end

This might be what you need -
A = round(10.*rand(13,1)); %// Input data, which you need to replace with your data
out = NaN(numel(A),1); %// Stores the output
for d = 1:numel(A)
extended_len = ceil(numel(A)/d)*d;
A_ext = ones(extended_len,1);
A_ext(1:numel(A)) = A;
A_ext_res = reshape(A_ext,d,numel(A_ext)/d);
A_prod = cumprod(A_ext_res,1);
out(d) = sum(A_prod(end,:));
end
out(2:2:end)=-1.*out(2:2:end);
NOTE: You may create a function out of the above script, which would have input as 'A' and output as 'out'.
Printing the input and output -
A =
5
8
1
4
9
8
10
7
0
8
9
7
8
>> out
out =
84
-257
840
-5208
1944
-11528
115200
-806400
4032
-504
56
-8
0
The above results could be verified from the fact that the first element of output is the sum of all elements and the last one is the product of them, with all alternate elements being multiplied by -1.

You might take a look at the kron() function. It's not a complete solution to your problem, but it may help you.

Ignoring similar columns when concating matrixes vertically

In matlab I have a 128 by n matrix, which we can call
[A B C]
where each letter is an 128 by 1 matrix.
So what I want to do is concat the above matrix with another matrix,
[A~ D E].
Where A~ is similar in its values to A.
What I want to get as the result of the concat would be:
[A B C D E],
where A~ is omitted.
What is the best way to do this? Note that I do not know beforehand that A~ is similar.
To clarify, my problem is how would I determine if two columns are similar? By similar I mean where between two columns, many of the row values are close in value.
Maybe an illustration would help as well
Vector A: [1 2 3 4 5 6 7 8 9]'
| | | | | | | | |
Vector B: [20 2.4 4 5 0 7 7 7.6 10]'
where there are some instances where the values are completely different, but for the most part the values are close. I don't have a defined threshold for this, but ideally it would be something that I could experiment with.

If you want to omit only identical columns, this is one way to do it:
%# Define the example matrices.
Matrix1 = [ 1 2 3; 4 5 6; 7 8 9 ]';
Matrix2 = [ 4 5 6; 7 8 10 ]';
%# Concatenate the matrices and keep only unique columns.
OutputMatrix = unique([ Matrix1, Matrix2 ]', 'rows')';

To solve this, a matching algorithm called vl_ubcmatch can be used.
[matches, scores] = vl_ubcmatch(da, db) ; For each descriptor in da,
vl_ubcmatch finds the closest descriptor in db (as measured by the L2
norm of the difference between them). The index of the original match
and the closest descriptor is stored in each column of matches and the
distance between the pair is stored in scores.
source:
http://www.vlfeat.org/overview/sift.html
Thus, the solution is to find the matched columns with the highest scores and eliminate them before concatenating.

I think it's pdist2 you need.
Consider the following example:
>> X = rand(25, 5);
>> Y = rand(100, 5);
>> Y(22, : ) = 0.99*X(22,:);
>> D = pdist2(X,Y, 'euclidean');
>> [~,ind] = min(D(:));
>> [i,j]=ind2sub(size(D),ind)
i =
22
j =
22
which is indeed the entry we manipulated to be similar. Read help pdist2 or doc pdist2 for more background.

MATLAB: Conditional summation

I have two arrays of the following form:
v1 = [ 1 2 3 4 5 6 7 8 9 ... ]
c2 = { 'a' 'a' 'a' 'b' 'b' 'c' 'c' 'c' 'c' ... }
(all values are examples only, no pattern can be assumed in the real data. v1 and c2 have the same size)
I want to obtain a vector containing the summation of the components of v1 corresponding to equal values in c2. In the example above, the first component of the resulting vector would be 1+2+3, the second 4+5, and so on.
I know I can do it in a loop of the form:
uni_c2 = unique(c2);
result = zeros(size(uni_c2));
for i = 1:numel(uni_c2)
result(i) = sum( v1(strcmp(uni_c2(i),c2)) );
end
Is there a single command or a vectorized way of doing the same operation?

You can do this in two lines:
[b, m, n] = unique(c2)
result = accumarray(n', v1)
The elements of result correspond to the strings in the cell array b.

This is vectorized but a bad idea for very large vectors. For some problems a "vectorized" solution is worse than a for loop.
>> v1 = [ 1 2 3 4 5 6 7 8 9];
>> c2 = 'aaabbcccc'-'a'
c2 =
0 0 0 1 1 2 2 2 2
>> N = repmat(c2',1,max(c2)-min(c2)+1) == repmat([min(c2):max(c2)],size(c2,2),1);
>> v1*N
ans =
6 9 30

I think a very general (and vectorized) solution is something like this:
v1 = [ 1 2 3 4 5 6 7 8 9 ]
c2 = { 'a' 'a' 'a' 'b' 'b' 'c' 'c' 'c' 'c' }
uniqueValuesInC2 = unique(c2);
conditionalSumOfV1 = #(x)(sum(v1(strcmp(c2, x))));
result = cellfun(conditionalSumOfV1, uniqueValuesInC2)
Perhaps my solution needs a bit of an explanation to the untrained eye:
So first you actually need to compute the different possible values in c2, which is done by unique.
The conditionalSumOfV1 function takes an argument x, it compares every element in c2 with x and selects the corresponding elements in v1 and sums them.
Finally cellfun is comparable to a foreach construct in some other languages: the function conditionalSum is evaluated for every value in the cell array you provide (in this case: every unique value in c2) and stores it in the output array. For other types of container variables (arrays, structs), MATLAB has equivalent foreach-like constructs: arrayfun, structfun.
This will work for contents of c2 that are longer than a single character and it does not require a large repmat operation as stardt's solution. I do however have my doubts when it comes to long arrays where c2 has only a few duplicate values., but I guess that will be a hard case for most algorithms. If you are in such a case, you might need to take a look at the extra outputs of unique or write your own alternative to unique (i.e. write for loops, preferably in a compiled language/MEX).

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse