Convert matlab symbol to array of products - matlab

Can I convert a symbol that is a product of products into an array of products?
I tried to do something like this:
syms A B C D;
D = A*B*C;
factor(D);
but it doesn't factor it out (mostly because that isn't what factor is designed to do).
ans =
A*B*C
I need it to work if A B or C is replaced with any arbitrarily complicated parenthesized function, and it would be nice to do it without knowing what variables are in the function.
For example (all variables are symbolic):
D = x*(x-1)*(cos(z) + n);
factoring_function(D);
should be:
[x, x-1, (cos(z) + n)]
It seems like a string parsing problem, but I'm not confident that I can convert back to symbolic variables afterwards (also, string parsing in matlab sounds really tedious).
Thank you!

Use regexp on the string to split based on *:
>> str = 'x*(x-1)*(cos(z) + n)';
>> factors_str = regexp(str, '\*', 'split')
factors_str =
'x' '(x-1)' '(cos(z) + n)'
The result factor_str is a cell array of strings. To convert to a cell array of sym objects, use
N = numel(factors_str);
factors = cell(1,N); %// each cell will hold a sym factor
for n = 1:N
factors{n} = sym(factors_str{n});
end

I ended up writing the code to do this in python using sympy. I think I'm going to port the matlab code over to python because it is a more preferred language for me. I'm not claiming this is fast, but it serves my purposes.
# Factors a sum of products function that is first order with respect to all symbolic variables
# into a reduced form using products of sums whenever possible.
# #params orig_exp A symbolic expression to be simplified
# #params depth Used to control indenting for printing
# #params verbose Whether to print or not
def factored(orig_exp, depth = 0, verbose = False):
# Prevents sympy from doing any additional factoring
exp = expand(orig_exp)
if verbose: tabs = '\t'*depth
terms = []
# Break up the added terms
while(exp != 0):
my_atoms = symvar(exp)
if verbose:
print tabs,"The expression is",exp
print tabs,my_atoms, len(my_atoms)
# There is nothing to sort out, only one term left
if len(my_atoms) <= 1:
terms.append((exp, 1))
break
(c,v) = collect_terms(exp, my_atoms[0])
# Makes sure it doesn't factor anything extra out
exp = expand(c[1])
if verbose:
print tabs, "Collecting", my_atoms[0], "terms."
print tabs,'Seperated terms with ',v[0], ', (',c[0],')'
# Factor the leftovers and recombine
c[0] = factored(c[0], depth + 1)
terms.append((v[0], c[0]))
# Combines trivial terms whenever possible
i=0
def termParser(thing): return str(thing[1])
terms = sorted(terms, key = termParser)
while i<len(terms)-1:
if equals(terms[i][1], terms[i+1][1]):
terms[i] = (terms[i][0]+terms[i+1][0], terms[i][1])
del terms[i+1]
else:
i += 1
recombine = sum([terms[i][0]*terms[i][1] for i in range(len(terms))])
return simplify(recombine, ratio = 1)

Related

find string in non-scalar structure matlab

Here's a non-scalar structure in matlab:
clearvars s
s=struct;
for id=1:3
s(id).wa='nko';
s(id).test='5';
s(id).ad(1,1).treasurehunt='asdf'
s(id).ad(1,2).treasurehunt='as df'
s(id).ad(1,3).treasurehunt='foobar'
s(id).ad(2,1).treasurehunt='trea'
s(id).ad(2,2).treasurehunt='foo bar'
s(id).ad(2,3).treasurehunt='treasure'
s(id).ad(id,4).a=magic(5);
end
is there an easy way to test if the structure s contains the string 'treasure' without having to loop through every field (e.g. doing a 'grep' through the actual content of the variable)?
The aim is to see 'quick and dirtily' whether a string exists (regardless of where) in the structure. In other words (for Linux users): I'd like to use 'grep' on a matlab variable.
I tried arrayfun(#(x) any(strcmp(x, 'treasure')), s) with no success, output:
ans =
1×3 logical array
0 0 0
One general approach (applicable to any structure array s) is to convert your structure array to a cell array using struct2cell, test if the contents of any of the cells are equal to the string 'treasure', and recursively repeat the above for any cells that contain structures. This can be done in a while loop that stops if either the string is found or there are no structures left to recurse through. Here's the solution implemented as a function:
function found = string_hunt(s, str)
c = reshape(struct2cell(s), [], 1);
found = any(cellfun(#(v) isequal(v, str), c));
index = cellfun(#isstruct, c);
while ~found && any(index)
c = cellfun(#(v) {reshape(struct2cell(v), [], 1)}, c(index));
c = vertcat(c{:});
found = any(cellfun(#(c) isequal(c, str), c));
index = cellfun(#isstruct, c);
end
end
And using your sample structure s:
>> string_hunt(s, 'treasure')
ans =
logical
1 % True!
This is one way to avoid an explicit loop
% Collect all the treasurehunt entries into a cell with strings
s_cell={s(1).ad.treasurehunt, s(2).ad.treasurehunt, s(3).ad.treasurehunt};
% Check if any 'treasure 'entries exist
find_treasure=nonzeros(strcmp('treasure', s_cell));
% Empty if none
if isempty(find_treasure)
disp('Nothing found')
else
disp(['Found treasure ',num2str(length(find_treasure)), ' times'])
end
Note that you can also just do
% Collect all the treasurehunt entries into a cell with strings
s_cell={s(1).ad.treasurehunt, s(2).ad.treasurehunt, s(3).ad.treasurehunt};
% Check if any 'treasure 'entries exist
find_treasure=~isempty(nonzeros(strcmp('treasure', s_cell)));
..if you're not interested in the number of occurences
Depending on the format of your real data, and if you can find strings that contain your string:
any( ~cellfun('isempty',strfind( arrayfun( #(x)[x.ad.treasurehunt],s,'uni',0 ) ,str)) )

How to Give int-string-int Input as Parameter for Matlab's Matrix?

I would like to have short-hand form about many parameters which I just need to keep fixed in Matlab 2016a because I need them in many places, causing many errors in managing them separately.
Code where the signal is 15x60x3 in dimensions
signal( 1:1 + windowWidth/4, 1:1 + windowWidth,: );
Its pseudocode
videoParams = 1:1 + windowWidth/4, 1:1 + windowWidth,: ;
signal( videoParams );
where you cannot write videoParams as string but should I think write ":" as string and everything else as integers.
There should be some way to do the pseudocode.
Output of 1:size(signal,3) is 3 so it gives 1:3. I do not get it how this would replace : in the pseudocode.
Extension for horcler's code as function
function videoParams = fix(k, windowWidth)
videoParams = {k:k + windowWidth/4, k:k + windowWidth};
end
Test call signal( fix(1,windowWidth){:}, : ) but still unsuccessful giving the error
()-indexing must appear last in an index expression.
so I am not sure if such a function is possible.
How can you make such a int-string-int input for the matrix?
This can be accomplished via comma-separated lists:
signal = rand(15,60,3); % Create random data
windowWidth = 2;
videoParams = {1:1+windowWidth/4, 1:1+windowWidth, 1:size(signal,3)};
Then use the comma-separated list as such:
signal(videoParams{:})
which is equivalent to
signal(1:1+windowWidth/4, 1:1+windowWidth, 1:size(signal,3))
or
signal(1:1+windowWidth/4, 1:1+windowWidth, :)
The colon operator by itself is shorthand for the entirety of a dimension. However, it is only applicable in a direct context. The following is meaningless (and invalid code) as the enclosing cell has no defined size for its third element:
videoParams = {1:1+windowWidth/4, 1:1+windowWidth, :};
To work around this, you could of course use:
videoParams = {1:1+windowWidth/4, 1:1+windowWidth};
signal(videoParams{:},:)

Replacing Several Different Character of a string

I have to write a function to replace the characters of a string with those letters.
A=U
T=A
G=C
C=G
Example:
Input: 'ATAGTACCGGTTA'
Therefore, the output should be:
'UAUCAUGGCCAAU'
I can replace only one character. However, I have no how to do several. I could replace several if '"G=C and C=G" this condition was not there.
I use:
in='ATAGTACCGGTTA'
check=in=='A'
in(check)='U'
ans='UTUGTUCCGGTTU'
if I keep doing this at some point G will be replaced by C then then all the C will be replaced by G. How can I stop this?? Any help will be appreciated.
Just for fun, here's probably the absolute simplest way, via indexing:
key = 'UGCA';
[~, ~, idx] = unique(in);
out = key(idx'); % transpose idx since unique() returns a column vector
I do love indexing :D
Edit: As rightly pointed out, this is very optimised for the question as stated. Since [a, ~, idx] = unique(in); returns a and idx such that a(idx) == in, and by default a is sorted, we can just assume that a == 'ACGT' and pre-construct key to be the appropriate translation of indices into a.
If some characters from the known alphabet never appear in the input string, or if other unknown characters appear, then the indices don't match and the assumption breaks. In that case, we have to calculate the appropriate key explicitly - filling in the step that was optimised out above:
alph = 'ACGT';
trans = 'UGCA';
[key, ~, idx] = unique(in);
[~, alphidx, keyidx] = intersect(alph, key); % find which elements of alph
% appear at which points in key
key(keyidx) = trans(alphidx); % translate the elements of key that we can
out = key(idx');
The simplest way would be to use an intermediary letter. For instance:
in='ATAGTACCGGTTA'
in(in == 'A')='U'
in(in == 'T')='A'
in(in == 'C')='X'
in(in == 'G')='C'
in(in == 'X')='G'
This way you keep the 'C' and 'G' characters separate.
EDIT:
As others have mentioned, there are a few things other things you could do to improve this approach (though personally I think Notlikethat's way is cleanest). For instance, if you use a second variable, you don't have to worry about keeping 'C' and 'G' separate:
in='ATAGTACCGGTTA'
out=in;
out(in == 'A')='U';
out(in == 'T')='A';
out(in == 'C')='G';
out(in == 'G')='C';
Alternatively, you could make your indices first, then index after:
in='ATAGTACCGGTTA'
inA=in=='A';
inT=in=='T';
inC=in=='C';
inG=in=='G';
in(inA)='U';
in(inT)='A';
in(inC)='G';
in(inG)='C';
Finally, my personal favourite for sheer idiocy:
out=char(in+floor((68-in).*(in<70)*7/4)*4-round(ceil((in-67)/4)*3.7));
(Seriously, that last one works)
You can perform multiple character translation with bsxfun.
Inputs:
in = 'ATAGTACCGGTTA';
pat = ['A','T','G','C'];
subst = ['U','A','C','G'];
out0 ='UAUCAUGGCCAAU';
Translate all characters simultaneously:
>> ii = (1:numel(pat))*bsxfun(#eq,in,pat.'); %' instead of repmat and .*
>> out = subst(ii)
out =
UAUCAUGGCCAAU
>> isequal(out,out0)
ans =
1
Say you only want to translate a subset of the characters, leaving part of the sequence intact, it is easily solved with logical indexing and a few extra lines:
% Leave the Gs and Cs in place
pat = ['A','T'];
subst = ['U','A'];
ii = (1:numel(pat))*bsxfun(#eq,in,pat.'); %' same
out = char(zeros(1,numel(in)));
nz = ii>0;
out(nz) = subst(ii(nz));
out(~nz) = in(~nz)
out =
UAUGAUCCGGAAU
The original Gs and Cs are unchanged; A became U, and T became A (T is gone).
I would suggest to use containter.Map:
m=containers.Map({'A','T','G','C'},{'U','A','C','G'})
mapfkt=#(input)(cell2mat(m.values(num2cell(input))))
Usage:
mapfkt('ATAGTACCGGTTA')
Here is another method that should be fairly efficient, general, and in the line of thought of your original attempt:
%Suppose this is your input
myString = 'abcdeabcde';
fromSting = 'ace';
toString = 'xyz';
%Then it just takes this:
[idx fromLocation] = ismember(myString,fromSting)
myString(idx)=toString(fromLocation(idx))
If you know that all letters need to be replaced, the last line can be slightly simplified as you wont need to use idx.

Creating a translator in MatLab

I am trying to create a simple program in Matlab where the user can input a string (such as "A", "B", "AB" or "A B") and the program will output a word corresponding to my letter.
Input | Output
A Hello
B Hola
AB HelloHola
A B Hello Hola
This is my code:
A='Hello'; B='Hola';
userText = input('What is your message: ', 's');
userText = upper(userText);
for ind = 1:length(userText)
current = userText(ind);
X = ['The output is ', current];
disp(X);
end
Currently I don't get my desired results. I instead get this:
Input | Output
A The output is A
B The output is B
I'm not totally sure why X = ['The output is ', current]; evaluates to The output is A instead of The output is Hello.
Edit:
How would this program be able to handle numbers... such as 1 = "Goodbye"
What's going on:
%// intput text
userText = input('What is your message: ', 's');
%// ...and some lines later
X = ['The output is ', userText];
You never map what you type to what is contained by the variables A and B.
The fact that they are called A and B has nothing to do with what you type. You could call them C and blargh and still get the same result.
Now, you could use eval, but that's really not advisable here. In this case, using eval would force the one typing in the strings to know the exact names of your variables...that's a portability, maintainability, security, etc. disaster waiting to happen.
There are better solutions possible, for instance, create a simple map:
map = {
'A' 'Hello'
'B' 'Hola'
'1' 'Goodbye'
};
userText = input('What is your message: ', 's');
str = map{strcmpi(map(:,1), userText), 2};
disp(['The output is ', str]);
I would recommend using a map object to contain what you want. This will circumvent the eval function (which I suggest avoiding like the plague). This is pretty simple to read and understand, and is pretty efficient especially in the case of a long input string.
translation = containers.Map()
translation('A') = 'Hola';
translation('B') = 'Hello';
translation('1') = 'Goodbye';
inputString = 'ABA1BA1B11ABBA';
resultString = '';
for i = 1:length(inputString)
if translation.isKey(inputString(i))
% get mapped string if it exists
resultString = [resultString,translation(inputString(i))];
else
% if mapping does not exist, simply put the input string in (covers space case)
resultString = [resultString,inputString(i)];
end
end
Take a look at the command eval. Currently, you are displaying the name of the variable that contains the string you want. eval will help you in actually accessing and printing it.
What you need to do it :
X = ['The output is ', eval(current)];
Here the documentation : eval

MATLAB -> struct.field(1:end).field?

Is there a way that I get all the structure subsubfield values of a subfield in one line ? Something like this :
struct.field(1:end).field
If I understand your question aright, you want to collect all the fields of the second-level structure, with the name 'field', into a single output array. It doesn't quite meet your request for a one-liner, but you can do it like this:
a.field1.a = 1;
a.field1.b = 2;
a.field2.a = 3;
a.field2.b = 4;
result = [];
for x = fieldnames(a)'
result = horzcat(result, a.(x{:}).a);
end
The ending value of result is [1 3]
Simple Structure Example
aStruct.subField = struct('subSubField', {1;2;3;4})
So that
aStruct.subField(1).subSubField == 1
aStruct.subField(1).subSubField == 2
Etc. Then the values of the leaf nodes can be obtained via a one-liner as
valueLeafs = [aStruct.subField.subSubField];
Which can be checked via assert(all(valueLeafs == [1,2,3,4])).
Non-Scalar Structure Example
The above one-liner also works when the leaf node values are non-scalar such that they can be horizontally concatenated. For example
bStruct.subField = struct('subSubField', {[1,2];[3,4]})
valueLeafs_b = [bStruct.subField.subSubField]; % works okay
cStruct.subField = struct('subSubField', {[1,2];[3;4]})
valueLeafs_c = [cStruct.subField.subSubField]; % error: bad arg dims
Distinct Class Structure Example
The one-line solution given previously does not work whenever the leaf node values are different class since they cannot - in general, be concatenated. However, use of arrayfun and a tricky anonymous function typically provide the required indexing technique:
dStruct.subField = struct('subSubField', {[1;2];'myString'});
valueLeafs_d = arrayfun(#(x) x.subSubField, dStruct.subField, 'UniformOutput', false)