Can MATLAB (or MuPAD) evaluate symbolic expressions containing non-commuting operators?

Say I give something like AB + AB + BA to MATLAB (or MuPAD) and ask it to simplify it. The answer should be 2AB + BA. Can this be done in MATLAB or MuPAD?
Edit:
OK, this is feeling ridiculous. I'm trying to do this in either MATLAB or MuPAD, and it's frustrating not knowing how to do what should be the simplest things, and not being able to find the answers right away via Google.
I want to expand the following, multiplied together, as a Taylor series:
eq1 := exp(g*l*B):
eq2 := exp(l*A):
eq3 := exp((1-g)*l*B):
g is gamma, l is lambda (I don't know how to represent either of these in MATLAB or MuPAD). A and B don't commute. I want to multiply the three exponentials together, expand, select all terms of a given power in lambda, and simplify the result. Is there a simple way to do this? Or should I give up and go to another system, like Maple?

This is MuPAD, not MATLAB:
operator("x", _vector_product, Binary, 1999):
A x B + A x B + B x A
returns
2 A x B + B x A
The vector product is used simply because it matches the described requirements.
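For what it's worth, here is a sketch of a MATLAB-side alternative, assuming a recent Symbolic Math Toolbox (R2021a or newer) whose symbolic matrix variables do not commute:
syms A B [2 2] matrix
simplify(A*B + A*B + B*A)   % expected to return 2*A*B + B*A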

Related

Solving for [A] to satisfy [A]*[B]=[C] when [C] is known and [B] is randomly generated with fewer rows than columns

My goal is to solve for a matrix [A] that satisfies [A]*[B]=[C] where [C] is known and [B] is randomly generated. Below is an example:
C=[1/3 1/3 1/3]'*[1/3 1/6 1/6 1/6 1/6];
B=rand(5,5);
A=C*pinv(B);
C_test=A*B;
norm(C-C_test)
ans =
4.6671e-16
Here the elements of [C_test] are within 1e-15 of the original [C], but when [B] has fewer rows than columns, the error increases dramatically (I'm not sure if norm() is the best way to show the error, but I think it illustrates the problem). For example:
B=rand(4,5);
A=C*pinv(B);
C_test=A*B;
norm(C-C_test)
ans =
0.0173
Additional methods:
QR-Factorization
[Q,R,P]=qr(B);
A=((C*P)/R)*Q';
norm(C-A*B)
ans =
0.0173
/ Operator
A=C/B;
norm(C-A*B)
ans =
0.0173
Why does this happen? In both cases [B]*pinv([B])=[I] so it seems like the process should work.
If this is a numerical or algebraic fact of life associated with pinv() or the other methods, is there another way I can generate [A] to satisfy the equation? Thank you!
Since C is 3×5, the number of elements in C and hence the number of equations is equal to 15. If B is 5×5, the number of unknowns (the elements in A) equals 3×5 = 15 as well, and the solution will be accurate.
If on the other hand B is for instance 3×5, the number of elements in A is equal to 3×3 = 9 and hence the system is overdetermined, which means that the resulting A will be the least-squares solution.
For general information, see Wikipedia's article on systems of linear equations and MATLAB's documentation on overdetermined systems.
The resulting matrix A is the best fit and there is no way to improve it (in a least-squares sense).
In response to your second question: you are measuring the quality of A*B as an approximation of C by applying the 2-norm to A*B-C, which is equivalent to least-squares fitting. In this measure, all the approaches that you use provide the optimal answer.
If you would, however, prefer some other measure, such as the 1-norm, the infinity-norm or any other measure (for instance by picking different weights per column, row or element), the answers obtained from the original approach will of course not necessarily be optimal with respect to this new measure.
The most general approach would be to use some optimization routine, like this:
x = fminunc(f, zeros(3*size(B,1),1));
A = reshape(x,3,size(B,1));
where f is some (any) measure. The least-square measure should result in the same A. So if you try this one:
f = @(x) norm(reshape(x,3,size(B,1))*B - C);
A should match the results in your approaches.
But you could use any f here. For instance, try the 1-norm:
f = @(x) norm(reshape(x,3,size(B,1))*B - C, 1);
Or something crazy like:
f = @(x) sum(abs(reshape(x,3,size(B,1))*B - C)*[1 10 100 1000 10000]');
This will give different results, which are optimal according to the new measure f. That being said, I would stick to the least squares ;)
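For reference, a minimal end-to-end sketch of the fminunc approach above, reusing C and the 4x5 B from the question (fminunc requires the Optimization Toolbox, and the exact residual will vary with the random B):
C = [1/3 1/3 1/3]'*[1/3 1/6 1/6 1/6 1/6];    % known 3x5 target
B = rand(4,5);                               % fewer rows than columns
f = @(x) norm(reshape(x,3,size(B,1))*B - C); % least-squares measure
x = fminunc(f, zeros(3*size(B,1),1));        % minimize over the vectorized A
A = reshape(x,3,size(B,1));                  % recover the 3x4 matrix A
norm(C - A*B)                                % comparable to the pinv()/mrdivide results above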

Double sqrt solution in Matlab?

I would like to know how I can get both the positive and the negative solution from a sqrt in MATLAB.
For example if I have:
sin(a) = sqrt(1-cos(a)^2);
The docs don't say anything specific about always providing only the positive square root, but it does seem like a fair assumption, in which case you can get the negative square root pretty easily like this:
p = sqrt(1-cos(a)^2);
n = -sqrt(1-cos(a)^2);
By the way, assigning to sin(a) like that is going to create a variable called sin, which will shadow the sin function and lead to many possible errors, so I would highly recommend choosing a different variable name.
MATLAB (and every other programming language that I know of) only returns the principal square root of x when calling sqrt(x) or equivalent.
How you'd write the square root of x mathematically is
s = ±√x
which is just shorthand for writing the whole solution set
s = {+√x, -√x}
In MATLAB, you'd write it the same as this last case, but with slightly different syntax,
s = [+sqrt(x) -sqrt(x)]
which can be computed more efficiently if you "factor out" the sqrt:
s = sqrt(x) * [1 -1]
So, for your case,
s = sqrt(1-cos(a)^2) * [1 -1]
or, if you so desire,
s = sin(acos(cos(a))) * [1 -1]
which is a tad slower, but perhaps more readable (and actually a bit more accurate as well).
Now of course, if you can somehow find the components whose quotient results in the value of your cosine, then you wouldn't have to deal with all this messy business.
sqrt does not solve equations; it only gives numerical output. You will need to formulate your equation as you need it, and then you can use sqrt(...) and -1*sqrt(...) to get your positive and negative outputs.
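A small numeric sketch of the above, with an arbitrary value for a:
a = 0.7;                         % arbitrary angle
s = sqrt(1 - cos(a)^2) * [1 -1]  % approximately [0.6442 -0.6442], i.e. +sin(a) and -sin(a)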

MATLAB: how to stack up arrays "shape-agnostically"?

Suppose that f is a function of one parameter whose output is an n-dimensional (m1 × m2 ×…× mn) array, and that B is a vector of length k whose elements are all valid arguments for f.
I am looking for a convenient, and more importantly, "shape-agnostic", MATLAB expression (or recipe) for producing the (n+1)-dimensional (m1 × m2 ×…× mn × k) array obtained by "stacking" the k n-dimensional arrays f(b), where the parameter b ranges over B.
To do this in numpy, I would use an expression like this one:
C = concatenate([f(b)[..., None] for b in B], -1)
In case it's of any use, I'll unpack this numpy expression below (see APPENDIX), but the feature of it that I want to emphasize now is that it is entirely agnostic about the shapes/sizes of f(b) and B. For the types of applications I have in mind, the ability to write such "shape-agnostic" code is of utmost importance. (I stress this point because much MATLAB code I come across for doing this sort of manipulation is decidedly not "shape-agnostic", and I don't know how to make it so.)
APPENDIX
In general, if A is a numpy array, then the expression A[..., None] can be thought of as "reshaping" A so that it gets one extra, trivial, dimension. Thus, if f(b) is an n-dimensional (m1 × m2 ×…× mn) array, then f(b)[..., None] is the corresponding (n+1)-dimensional (m1 × m2 ×…× mn × 1) array. (The reason for adding this trivial dimension will become clear below.)
With this clarification out of the way, the meaning of the first argument to concatenate, namely:
[f(b)[..., None] for b in B]
is not too hard to decipher. It is a standard Python "list comprehension", and it evaluates to the sequence of the k (n+1)-dimensional (m1 × m2 ×…× mn × 1) arrays f(b)[..., None], as the parameter b ranges over the vector B.
The second argument to concatenate is the "axis" along which the concatenation is to be performed, expressed as the index of the corresponding dimension of the arrays to be concatenated. In this context, the index -1 plays the same role as the end keyword does in MATLAB. Therefore, the expression
concatenate([f(b)[..., None] for b in B], -1)
says "concatenate the arrays f(b)[..., None] along their last dimension". It is in order to provide this "last dimension" to concatenate over that it becomes necessary to reshape the f(b) arrays (with, e.g., f(b)[..., None]).
One way of doing that is:
% input:
f=@(x) x*ones(2,2)
b=1:3;
%%%%
X=arrayfun(f,b,'UniformOutput',0);
X=cat(ndims(X{1})+1,X{:});
Maybe there are more elegant solutions?
Shape agnosticism is an important difference between the philosophies underlying NumPy and MATLAB; it's a lot harder to accomplish in MATLAB than it is in NumPy. And in my view, shape agnosticism is a bad thing, too -- the shape of matrices has mathematical meaning. If some function or class were to completely ignore the shape of its inputs, or change them in a way that is not in accordance with mathematical notation, that function would destroy part of the language's functionality and intent.
In programmer terms, it's an actually useful feature designed to prevent shape-related bugs. Granted, it's often a "programmatic inconvenience", but that's no reason to adjust the language. It's really all in the mindset.
Now, having said that, I doubt an elegant solution for your problem exists in Matlab :) My suggestion would be to stuff all of the requirements into the function, so that you don't have to do any post-processing:
f = @(x) bsxfun(@times, permute(x(:), [2:numel(x) 1]), ones(2,2, numel(x)) )
Now obviously this is not quite right, since f(1) doesn't work and f(1:2) does something other than f(1:4), so obviously some tinkering has to be done. But as the ugliness of this oneliner already suggests, a dedicated function might be a better idea. The one suggested by Oli is pretty decent, provided you lock it up in a function of its own:
function y = f(b)
    g = @(x) x*ones(2,2); % or whatever else you want
    y = arrayfun(g, b, 'uni', false);
    y = cat(ndims(y{1})+1, y{:});
end
so that f(b) for any b produces the right output.
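A quick usage sketch of this function (the 2-by-2 slices come from the placeholder g above):
C = f(1:3);   % stacks three 2x2 arrays along a new third dimension
size(C)       % ans = [2 2 3]
C(:,:,2)      % the slice for b = 2, i.e. 2*ones(2,2)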

Understanding colon notation in MATLAB

So I'm completely new to MATLAB and I'm trying to understand colon notation within mathematical operations. So, in this book I found this statement:
w(1:5)=j(1:5) + k(1:5);
I do not understand what it really does. I know that w(1:5) is pretty much iterating through the w array from index 1 through 5, but in the statement above, shouldn't all indexes of w be equal to j(5) + k(5) in the end? Or am I completely wrong on how this works? It'd be awesome if someone posted the equivalent in Java to that up there. Thanks in advance :-)
I am pretty sure this means
"The first 5 elements of w shall be the first 5 elements of j + the first 5 elements of k" (I am not sure if matlab arrays start with 0 or 1 though)
So:
w(1) = j(1) + k(1)
w(2) = j(2) + k(2)
w(3) = j(3) + k(3)
w(4) = j(4) + k(4)
w(5) = j(5) + k(5)
Think "Vector addition" here.
w(1:5)=j(1:5) + k(1:5);
is the same that:
for i=1:5
    w(i)=j(i)+k(i);
end
MATLAB uses vectors and matrices, and is heavily optimized to handle operations on them efficiently.
The expression w(1:5) means a vector consisting of the first 5 elements of w; the expression you posted adds two 5 element vectors (the first 5 elements of j and k) and assigns the result to the first five elements of w.
I think your problem comes from what you are calling this statement. It is not an iteration, but rather a simple assignment. Now we only need to understand what is assigned to what.
I will assume j, k, and w are all 1-by-N vectors.
j(1:5) - means elements 1 through 5 of the vector j
j(1:5) + k(1:5) - will result in the element-wise sum of both operands
w(1:5) = ... - will assign the result, again element-wise, to the first 5 elements of w
Writing your code using colon notation makes it less verbose and more efficient, so it is highly recommended to do so. Colon notation is a basic and very powerful feature of MATLAB; make sure you understand it well before you move on. MATLAB is very well documented, so you can read more on this topic in the documentation.
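To make the assignment concrete, here is a small sketch with arbitrary values:
j = [10 20 30 40 50 60];
k = [ 1  2  3  4  5  6];
w = zeros(1,6);
w(1:5) = j(1:5) + k(1:5);   % w is now [11 22 33 44 55 0]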

Minimizing objective function by changing a variable - in Matlab?

I have a 101x82 size matrix called A. Using this variable matrix, I compute two other variables called:
1) B, a 1x1 scalar, and
2) C, a 50x6 matrix.
I compare 1) and 2) with their analogous variables 3) and 4), whose values are fixed:
3) D, a 1x1 scalar, and
4) E, a 50x6 matrix.
Now, I want to perturb/change the values of A matrix, such that:
1) ~ 3), i.e. B is nearly equal to D , and
2) ~ 4), i.e. C is nearly equal to E
Note that on perturbing A, B and C will change, but not D and E.
Any ideas how to do this? Thanks!
I can't run your code as it requires loading data (which I don't have), and it's not immediately obvious how to calculate B or C.
Fortunately I may be able to answer your problem. You're describing an optimization problem, and the solution would be to use fminsearch (or something of that variety).
What you do is define a function that returns a vector with two elements:
y1 = (B - D)^weight1;
y2 = norm((C - E), weight2);
with the weights determining how strongly you penalize deviations (weight = 2 is usually sufficient).
Your function variable would be A.
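As a rough sketch of how this could be wired up with fminsearch: fminsearch needs a single scalar objective, so the two terms are summed here, and computeB, computeC, D and E below are my own placeholders for however B, C and the fixed targets are actually obtained:
computeB = @(A) sum(A(:));                  % placeholder for the real mapping A -> B
computeC = @(A) A(1:50, 1:6);               % placeholder for the real mapping A -> C
D = 42;                                     % placeholder for the fixed 1x1 target
E = ones(50, 6);                            % placeholder for the fixed 50x6 target
A0 = rand(101, 82);                         % starting guess for A
cost = @(A) (computeB(A) - D)^2 + norm(computeC(A) - E, 'fro')^2;
costVec = @(v) cost(reshape(v, size(A0)));  % fminsearch works on a vector of unknowns
opts = optimset('MaxFunEvals', 2e4, 'MaxIter', 2e4);
Aopt = reshape(fminsearch(costVec, A0(:), opts), size(A0));
With 101*82 unknowns a derivative-free simplex search like this will make little progress, so treat it purely as a sketch of the setup; the gradient-based approach described in the next answer scales better.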
From my understanding you have a few functions.
fb(A) = B
fc(A) = C
Do you know the functions listed above, that is do you know the mappings from A to each of these?
If you want to try to optimize, so that B is close to D, you need to pick:
What close means. You can look at some vector norm for the B and D case, like minimizing ||B-D||^2. The standard sum of the squares of the elements of this difference will probably do the trick and is computationally nice.
How to optimize. This depends a lot on your functions, whether you want local or global minima, etc.
So basically, now we've boiled the problem down to minimizing:
Cost = ||fb(A) - D||^2
One thing you can certainly do is to compute the gradient of this cost function with respect to the individual elements of A, and then perform minimization steps with the forward Euler method and a suitable "time step". This might not be fast, but with a small enough time step and well-behaved enough functions it will at least get you to a local minimum.
Computing the gradient of this:
grad_A(cost) = 2*(fb(A) - D)*grad_A(fb)(A)
where grad_A means the gradient with respect to A, and grad_A(fb)(A) means the gradient with respect to A of the function fb, evaluated at A; the constant D contributes nothing to the gradient.
Computing the grad_A(fb)(A) depends on the form of fb, but here are some pages that have "Matrix calculus" identities and explanations.
Matrix calculus identities
Matrix calculus explanation
Then you simply perform gradient descent on A by doing forward Euler updates:
A_next = A_prev - timestep * grad_A(cost)
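A minimal sketch of this loop, using a small stand-in for fb and a crude finite-difference gradient (the real problem would use the 101x82 A, the actual fb, and a carefully chosen time step):
fb   = @(A) sum(A(:).^2);        % stand-in for the real mapping fb(A) = B
D    = 5;                        % stand-in for the fixed scalar target
cost = @(A) (fb(A) - D)^2;
A    = rand(5, 4);               % small initial guess (101x82 in the real problem)
h    = 1e-6;                     % finite-difference step
dt   = 1e-3;                     % "time step" of the forward Euler update
for iter = 1:500
    g = zeros(size(A));          % finite-difference approximation of grad_A(cost)
    for idx = 1:numel(A)
        Ah = A;  Ah(idx) = Ah(idx) + h;
        g(idx) = (cost(Ah) - cost(A)) / h;
    end
    A = A - dt*g;                % A_next = A_prev - timestep * grad_A(cost)
end
cost(A)                          % should have decreased toward a local minimum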