Generate all possible subsets from a character array in MATLAB

I need to generate all possible subsets of a character array in MATLAB with reduced execution time.
For example:
input='ABCA';
output ='A',
'B',
'C',
'AB',
'BC',
'CA',
'ABC',
'BCA',
'ABCA'

You can find all these subsets using straightforward loops. I don't know if it is worthwhile vectorizing these, as any vectorization will require large intermediate arrays.
With a random input of 500 characters, and maxLen at 20, I got 4207817 unique substrings. It took my computer (with MATLAB R2017a) 12 seconds to find these. Whether that is fast enough or not is up to you, but I would not bother further optimizing this.
input = 'ABCA';
maxLen = 4;
subsets = {};
for len = 1:maxLen
subs = cell(1,numel(input)-len+1);
for start = 1:numel(subs)
subs{start} = input(start:start+len-1);
end
subs = unique(subs);
subsets = [subsets,subs];
end
disp(subsets)
Output:
'A' 'B' 'C' 'AB' 'BC' 'CA' 'ABC' 'BCA' 'ABCA'
If it is important to preserve the order of the substrings, then add the 'stable' argument to the unique call:
subs = unique(subs,'stable');
For example, for input = 'AFCB';, the output without 'stable' is:
'A' 'B' 'C' 'F' 'AF' 'CB' 'FC' 'AFC' 'FCB' 'AFCB'
and with 'stable' it is:
'A' 'F' 'C' 'B' 'AF' 'FC' 'CB' 'AFC' 'FCB' 'AFCB'
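For comparison, here is a minimal sketch of what vectorizing the inner loop might look like. It is only an illustration under assumptions, not something that was benchmarked: the index matrix idx is exactly the kind of intermediate array mentioned above, bsxfun is used so that it also runs on releases without implicit expansion, and cellstr strips trailing whitespace, so the sketch assumes the input contains no spaces.

```matlab
input = 'ABCA';
maxLen = 4;
n = numel(input);
subsets = {};
for len = 1:maxLen
    % Row r of idx holds the positions r, r+1, ..., r+len-1
    idx = bsxfun(@plus, (1:n-len+1)', 0:len-1);
    % reshape guarantees one substring per row, also when len == 1
    subs = reshape(input(idx), [], len);
    % unique rows of the char matrix (sorted; 'stable' can be added here too)
    subs = unique(subs, 'rows');
    subsets = [subsets, cellstr(subs)'];
end
disp(subsets)
```

For the 'ABCA' example this should give the same nine substrings as the loop version.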

Related

How would you perform inter-row operations based on multiple columns? MATLAB

I am a novice programmer that is primarily self-taught. I am new to MATLAB and relational mathematics. Currently, I am attempting to perform math operations between rows. I would like to normalize the exp by the corresponding con and then multiply by the constant.
This constant is a laboratory measurement that could be subject to change in future experiments. Thus, I have given it a column.
Below is some sample code that I have generated to exemplify my problem and solution. I am trying to get from myTable to rTable.
I recognize my solution is very sloppy and there must be a way to perform these operations that is human-readable and uses fewer temporary variables. To put it shortly, there must be a simpler way.
rTable = table();
myTable = table(transpose(1:8), ...
transpose({'Con1', 'Con2', 'Exp1', 'Exp2',...
'Con1', 'Con2', 'Exp1', 'Exp2'}),...
transpose({'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'}),...
ones(8, 1) * 2,...
'VariableNames', {'Values' , 'Condition', 'Group', 'Constant'});
[r, c] = size(myTable)
a = myTable(strcmp(myTable.Group, 'A'), :);
b = myTable(strcmp(myTable.Group, 'B'), :);
aexp1 = a.Values(strcmp(a.Condition, 'Exp1'), :) / a.Values(strcmp(a.Condition, 'Con1'), :) * mean(a.Constant);
aexp2 = a.Values(strcmp(a.Condition, 'Exp2'), :) / a.Values(strcmp(a.Condition, 'Con2'), :) * mean(a.Constant);
bexp1 = b.Values(strcmp(b.Condition, 'Exp1'), :) / b.Values(strcmp(b.Condition, 'Con1'), :) * mean(b.Constant);
bexp2 = b.Values(strcmp(b.Condition, 'Exp2'), :) / b.Values(strcmp(b.Condition, 'Con2'), :) * mean(b.Constant);
aT = table(transpose({aexp1, aexp2}),...
transpose({'Exp1', 'Exp2'}),...
transpose({'A', 'A'}),...
transpose({2, 2,}),...
'VariableNames', {'Values', 'Condition', 'Group', 'Constant'});
bT = table(transpose({bexp1, bexp2}),...
transpose({'Exp1', 'Exp2'}),...
transpose({'B', 'B'}),...
transpose({2, 2,}),...
'VariableNames', {'Values', 'Condition', 'Group', 'Constant'});
rTable = [aT; bT]
Thank you for any input or suggestions. Perhaps the data structure I am handling is poorly organized.
Here's one solution:
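% Build the sample table. The Constant column is included so that the four
% variable names match four data columns.
myTable = table((1:8)', ...
    {'Con1', 'Con2', 'Exp1', 'Exp2', 'Con1', 'Con2', 'Exp1', 'Exp2'}', ...
    {'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'}', ...
    ones(8, 1) * 2, ...
    'VariableNames', {'Values', 'Condition', 'Group', 'Constant'})
% Logical masks for the 'Con' rows and the 'Exp' rows
conditionrows = contains(myTable.Condition, 'Con')
exprows = contains(myTable.Condition, 'Exp')
conditionTable = myTable(conditionrows, :)
expTable = myTable(exprows, :)
constant = 2
% Normalize each Exp value by its Con value, then scale by the constant
rValues = expTable.Values ./ conditionTable.Values * constant
rTable = expTable
rTable.Values = rValues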
Since you are trying to get a table of only the Exp rows, you separate your original table into a conditionTable and an expTable. I'm assuming you have one condition row for each exp row, and also that the rows of the two tables correspond in order (if not, it will require more processing). You can then calculate rValues with a one-line expression. The ./ operator is element-wise division. Also note that you can use ' to perform a transpose in MATLAB. If you want a column vector of 1:10, for example, you have to write (1:10)'; 1:10' gives you a row vector from 1 to 10, since it is interpreted as a vector from 1 to the transpose of 10.
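If the Con and Exp rows are not guaranteed to be in matching order, the extra processing could look roughly like the sketch below. It assumes that each Exp row pairs with the Con row that has the same Group and the same numeric suffix (Exp1 with Con1, and so on); that pairing rule, and the names expKey, conKey and loc, are illustrative assumptions rather than part of the original answer.

```matlab
% Sample data as in the question (without the Constant column)
Values = (1:8)';
Condition = {'Con1'; 'Con2'; 'Exp1'; 'Exp2'; 'Con1'; 'Con2'; 'Exp1'; 'Exp2'};
Group = {'A'; 'A'; 'A'; 'A'; 'B'; 'B'; 'B'; 'B'};
myTable = table(Values, Condition, Group);

isExp = strncmp(myTable.Condition, 'Exp', 3);
expTable = myTable(isExp, :);
conTable = myTable(~isExp, :);

% Build a key such as 'A_1' from the group and the numeric suffix
expKey = strcat(expTable.Group, '_', strrep(expTable.Condition, 'Exp', ''));
conKey = strcat(conTable.Group, '_', strrep(conTable.Condition, 'Con', ''));

% loc(k) is the row of conTable that matches row k of expTable
[found, loc] = ismember(expKey, conKey);
assert(all(found), 'Every Exp row needs a matching Con row');

constant = 2;
rTable = expTable;
rTable.Values = expTable.Values ./ conTable.Values(loc) * constant;
disp(rTable)
```

With this lookup the result no longer depends on the row order of the input table.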

If statement with strings of letters in Matlab

I have in Matlab the following cells containing various combinations of the letters a,b,c,d
%all combinations containing 'a' and/or 'b'
G1={'a', 'ab', 'ac', 'ad', 'abc', 'acd', 'abd', 'abcd', 'b', 'bc', 'bd', 'bcd'};
%all combinations containing 'c' and/or 'd'
G2={'c', 'ac', 'bc', 'cd', 'abc', 'acd', 'bcd', 'abcd', 'd', 'ad', 'bd', 'abd'};
%all combinations containing 'c'
G3={'c', 'ac', 'bc', 'cd', 'acd', 'abd', 'bcd', 'abcd'};
I then construct a cell all of dimension
allsize=size(G1,2)*size(G2,2)*size(G3,2);
containing all possible ways to match one element of G1 with one element of G2 with one element of G3.
all=cell(allsize,3);
count=0;
for h=1:size(G1,2)
for k=1:size(G2,2);
for j=1:size(G3,2);
count=count+1;
all(count,1)=G1(h);
all(count,2)=G2(k);
all(count,3)=G3(j);
end
end
end
Question: I want to construct a vector check of dimension allsize x 1 such that check(l)=1 if [all(l,1) contains a and all(l,2) contains c] or [all(l,1) contains b and all(l,2) contains d], and zero otherwise.
I am having problems in writing the if condition
check=zeros(allsize,1);
for l=1:allsize
%if [all(l,1) contains a and all(l,2) contains c] or [all(l,1) contains b and all(l,2) contains d]
check(l)=1;
%end
end
Could you kindly provide some help?
(For the if statement, it is always best to show what you tried rather than some pseudocode, however...)
Firstly using all as a variable name is bad - it's an important built-in function and one you may want to use... I've renamed it allG below. But you probably want something like this:
check(l) = (any(allG{l,1}=='a') && any(allG{l,2}=='c')) || ...
(any(allG{l,1}=='b') && any(allG{l,2}=='d'))
Note I haven't used an if statement, since the right hand side evaluates to a logical value (a true/false value) which can be generally used in the same way as 1 and 0...
Also above we're treating the strings as arrays of characters, so something like 'abcd'=='b' returns a [0 1 0 0] logical array... We then use any() to see if any of the values are 1 (true).
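To make the character comparison concrete, here is a small sketch that runs the same test inside the loop. The sets G1, G2 and G3 are deliberately shortened, hypothetical stand-ins for the ones in the question so that the result is easy to check by hand.

```matlab
% Comparing a char array with a single character gives a logical mask
mask = ('abcd' == 'b');      % [0 1 0 0]
hasB = any(mask);            % true

% Shortened, hypothetical sets (not the full ones from the question)
G1 = {'a', 'bc'};
G2 = {'c', 'ad'};
G3 = {'c'};

allsize = numel(G1) * numel(G2) * numel(G3);
allG = cell(allsize, 3);
count = 0;
for h = 1:numel(G1)
    for k = 1:numel(G2)
        for j = 1:numel(G3)
            count = count + 1;
            allG(count, :) = [G1(h), G2(k), G3(j)];
        end
    end
end

check = false(allsize, 1);
for l = 1:allsize
    check(l) = (any(allG{l,1} == 'a') && any(allG{l,2} == 'c')) || ...
               (any(allG{l,1} == 'b') && any(allG{l,2} == 'd'));
end
disp(check')
```

Here the rows are ('a','c'), ('a','ad'), ('bc','c') and ('bc','ad') in the first two columns, so only the first and last rows satisfy the condition.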

Create variables from an array of cells in Matlab

I have an array of cells, for example,
cells = {'a', 'b', 'c', 'd', 'e'};
which is inside a for loop of 1 to 5.
I want to create a variable from a to e depending on the loop index, as 1 to a, 2 to b...
When I try (i is the for index),
eval(cells{i}) = values; it gives me the error,
Undefined function or method 'eval' for input arguments of type 'a'
Here is the answer:
eval(sprintf([cells{i} '=values;']))
And you can remove the ; if you want to see the display in command window.
In answer to your comment :
cells = {'a', 'b', 'c', 'd', 'e'};
values = 4;
i = 1;
eval(sprintf([cells{i} '=values;']))
This works perfectly fine on my computer, and I get no warning or error messages.
When calling eval, the argument must be a string, so convert your cell elements to strings first and concatenate them into one complete statement.
eval([char(cells{i}) ' = values;'])
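As a sketch of how this fits into the loop from the question, the snippet below assigns one value per name. The values vector is a made-up example. The second half shows a swapped-in alternative that is not part of the answers above: dynamic field names on a struct, which avoid eval altogether.

```matlab
cells = {'a', 'b', 'c', 'd', 'e'};
values = [10 20 30 40 50];   % hypothetical data, one value per name

% eval approach: build the full statement as a string, then evaluate it
for i = 1:numel(cells)
    eval(sprintf('%s = values(%d);', cells{i}, i));
end

% Alternative without eval (an assumption about what may suit you better):
% store the values as fields of a struct using dynamic field names
s = struct();
for i = 1:numel(cells)
    s.(cells{i}) = values(i);
end
disp(s)
```

After the first loop the workspace contains the variables a to e; after the second loop the same values are available as s.a to s.e.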

Matlab - how can I label variables?

I have a matrix with population data and a vector that makes reference to each type of data, for example, age, country, gender, height, ethnicity.
In one part of the code I need to refer to those strings with a 1x1 char. I thought of defining a relation such as
variables = {'age', 'a';
'gender', 'b';
'country', 'c';
'height', 'd';
'ethnicity', 'e'};
I would like something such that any time I use the letters 'a', 'b', 'c', 'd' or 'e', the code understands that I want to use 'age', 'gender', 'country', 'height' or 'ethnicity', respectively.
How could I do this?
thanks!
You have two options:
A more common method is to use a structure:
codes.a = 'age';
codes.b = 'gender';
...
So anytime you need a code, just get the value of the equivalent structure member:
character_you_typed = 'a';
getfield(codes, character_you_typed)
or (based on @Amro's comment below):
codes.(character_you_typed)
This method does not restrict you to one-character keys. Another method is to use the recently added Map container with a 'char' key:
codes = containers.Map('KeyType', 'char', 'ValueType', 'any');
codes('a') = 'age';
codes('b') = 'gender';
...
Then:
character_you_typed = 'a';
codes(character_you_typed)
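The second method looks much better. Note that a 'char' key type means a character vector of any length, so the keys are not restricted to a single character, nor to valid field names as in the structure approach.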
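Both options can be built directly from the two-column cell array in the question, as in the following sketch. The use of cell2struct and of the two-argument containers.Map constructor is my own choice for illustration, and the key 'ht' is a hypothetical extra entry added only to show a multi-character key.

```matlab
variables = {'age',       'a';
             'gender',    'b';
             'country',   'c';
             'height',    'd';
             'ethnicity', 'e'};

% Option 1: struct whose field names are the one-letter codes
codes = cell2struct(variables(:,1), variables(:,2), 1);

% Option 2: Map from code (key) to label (value)
codeMap = containers.Map(variables(:,2), variables(:,1));
codeMap('ht') = 'height';    % hypothetical multi-character key

character_you_typed = 'c';
fromStruct = codes.(character_you_typed);
fromMap = codeMap(character_you_typed);
disp({fromStruct, fromMap})
```

Both lookups return 'country' for the key 'c'.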

MATLAB: Combinations of an arbitrary number of cell arrays

Is there a command or one-line strategy in MATLAB that will return all the combinations of the components of n cell arrays, taken n at a time?
An example of what I want to accomplish:
A = {'a1','a2'};
B = {'b1','b2','b3'};
C = combinations(A,B)
C = {'a1','b1' ;
'a1','b2' ;
'a1','b3' ;
'a2','b1' ;
'a2','b2' ;
... }
The command would be able to accept an arbitrary number of arguments and the result in the example would have as many columns as there are arguments to the function. (Of course, the syntax above is just meant for illustration and any method that would generate the results whatever the format would fit the bill)
EDIT: Similar questions have been asked for matrices instead of cells, e.g. link. Many solutions point to the FEX submission allcomb, but all such solutions are just wrappers around ndgrid, which only works with doubles. Any suggestions for non-numeric sets?
Although I address this in my answer to a related/near duplicate question, I'm posting a different version of my solution here since it appears you want a generalized solution, and my other answer is specific for the case of three input sets. Here's a function that should do what you want for any number of cell array inputs:
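function combMat = allcombs(varargin)
  % Number of elements in each input cell array
  sizeVec = cellfun('prodofsize', varargin);
  % Index ranges, reversed so that the last input varies fastest
  indices = fliplr(arrayfun(@(n) {1:n}, sizeVec));
  [indices{:}] = ndgrid(indices{:});
  % Pick the elements of each input according to its index grid
  combMat = cellfun(@(c,i) {reshape(c(i(:)), [], 1)}, ...
                    varargin, fliplr(indices));
  combMat = [combMat{:}];
end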
And here's how you would call it:
>> combMat = allcombs(A, B)
combMat =
'a1' 'b1'
'a1' 'b2'
'a1' 'b3'
'a2' 'b1'
'a2' 'b2'
'a2' 'b3'
A 2-line strategy:
A = {'a1','a2'};
B = {'b1','b2','b3'};
[a b]=ndgrid(1:numel(A),1:numel(B));
C= [A(a(:))' B(b(:))']
C =
'a1' 'b1'
'a2' 'b1'
'a1' 'b2'
'a2' 'b2'
'a1' 'b3'
'a2' 'b3'
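The two-line strategy is tied to exactly two inputs. A rough sketch of how the same ndgrid indexing idea could be extended to any number of sets is shown below. The variable sets (a cell array holding the input cell arrays) and the third set are assumptions made for illustration; the ranges are passed to ndgrid in reverse so that the last column varies fastest, as in the allcombs output.

```matlab
% Hypothetical inputs: any number of cell arrays collected in one cell
sets = {{'a1','a2'}, {'b1','b2','b3'}, {'c1','c2'}};
n = numel(sets);

% One index range per set
ranges = cellfun(@(s) 1:numel(s), sets, 'UniformOutput', false);

% Index grids, filled in reverse order so the last set varies fastest
idx = cell(1, n);
[idx{end:-1:1}] = ndgrid(ranges{end:-1:1});

% Column k holds the elements of set k selected by its index grid
C = cell(numel(idx{1}), n);
for k = 1:n
    C(:, k) = reshape(sets{k}(idx{k}(:)), [], 1);
end
disp(C)
```

For these three sets C has 2*3*2 = 12 rows, starting with 'a1' 'b1' 'c1' and ending with 'a2' 'b3' 'c2'.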