Convert symbolic expression into a rational polynomial - matlab

I have a lengthy symbolic expression that involves rational polynomials (basic arithmetic and integer powers). I'd like to simplify it into a single (simple) rational polynomial.
numden does it, but it seems to use some expensive optimization, which probably addresses a more general case. When tried on my example below, it crashed after a few hours--out of memory (32GB).
I believe something more efficient is possible even if I don't have a cpp access to matlab functionality (e.g. children).
Motivation: I have an objective function that involves polynomials. I manually derived it, and I'd like to verify and compare the derivatives: I subtract the two expressions, and the result should vanish.
Currently, my interest in this is academic since practically, I simply substitute some random expression, get zero, and it's enough for me.
I'll try to find the time to play with this as some point, and I'll update here about it, but I posted in case someone finds it interesting and would like to give it a try before that.
To run my function:
x = sym('x', [1 32], 'real')
e = func(x)
The function (and believe it or not, this is just the Jacobian, and I also have the Hessian) can't be pasted here since the text limit is 30K:
https://drive.google.com/open?id=1imOAa4VS87WDkOwAK0NoFCJPTK_2QIRj

Related

Matlab Symbolic expression creates overflow

I am using Matlab symbolic toolbox to create a function of high complexity. This function is then written to a .m-file (using matlabFunction). For some reason, after simplifying the function, the function is returned on a form that looks like fun = (A*1.329834759483753e310 + B*5.873798798237459e305 + ...)*7.577619127319697e-320, where A and B are functions of my variables (too complex to repeat here). That is, all the terms within the parenthesis are in the order of about 1e280 to 1e300. The problems arises when the exponents become larger than about 1.79e308, as this causes an overflow for doubles (when calling the generated .m-function). The real size of the function is nowhere close to create an overflow, but this way of expressing the function does. This would be solved if the simplify function multiplied the 1e-320 into the parenthesis, but for some reason it doesn't.
Any idea why the symbolic toolbox chooses to represent my function this way?
I have found that I can call call expand(fun) to multiply 1e-320 into the parenthesis. The resulting expression then has exponents with the expected sizes (in the range -1 to -30) but I would prefer to know the reason why the expression looks like this in the first place, and if there are better options than calling expand to avoid the problem. Besides, calling expand seems to create a more complex function than the one I have, and I am trying to obtain a function that evaluates very fast here.
Large exponent multipliers are probably due to some floats in the formula. Try to avoid them and use rationals (1/2 instead of 0.5).
The best reason I've read is this one:
The floating point numbers tend to get converted to rational numbers, which usually involves multiplying by 2^53 and then factoring out the gcd from the top and bottom of the ratio. Square such a value and you are working with numbers on the order of 2^100... and so on.

MATLAB: using a minimum function within symsum

I am trying to run code similar to the following, I replaced the function I had with one much smaller, to provide a minimum working example:
clear
syms k m
n=2;
symsum(symsum(k*m,m,0,min(k,n-k)),k,0,n)
I receive the following error message:
"Error using sym/min (line 86)
Input arguments must be convertible to floating-point numbers."
I think this means that the min function cannot be used with symbolic arguments. However, I was hoping that MATLAB would be substituting in actual numbers through its iterations of k=0:n.
Is there a way to get this to work? Any help much appreciated. So far I the most relevant page I found was here, but I am somewhat hesitant as I find it difficult to understand what this function does.
EDIT following #horchler, I messed around putting it in various places to try and make it work, and this one did:
clear
syms k m
n=2;
symsum(symsum(k*m,m,0,feval(symengine, 'min', k,n-k)),k,0,n)
Because I do not really understand this feval function, I was curious to whether there was a better, perhaps more commonly-used solution. Although it is a different function, there are many pieces online advising against the eval function, for example. I thought perhaps this one may also carry issues.
I agree that Matlab should be able to solve this as you expect, even though the documentation is clear that it won't.
Why the issue occurs
The problem is due the inner symbolic summation, and the min function itself, being evaluated first:
symsum(k*m,m,0,min(k,n-k))
In this case, the input arguments to sym/min are not "convertible to floating-point numbers" as k is a symbolic variable. It is only after you wrap the above in another symbolic summation that k becomes clearly defined and could conceivably be reduced to numbers, but the inner expression has already generated an error so it's too late.
I think that it's a poor choice for sym/min to return an error. Rather, it should just return itself. This is what the sym/int function does when it can't evaluate an integral symbolically or numerically. MuPAD (see below) and Mathematica 10 also do something like this as well for their min functions.
About the workaround
This directly calls a MuPAD's min function. Calling MuPAD functions from Matlab is discussed in more detail in this article from The MathWorks.
If you like, you can wrap it in a function or an anonymous function to make calling it cleaner, e.g.:
symmin = #(x,y)feval(symengine,'min',x,y);
Then, you code would simply be:
syms k m
n = 2;
symsum(symsum(k*m,m,0,symmin(k,n-k)),k,0,n)
If you look at the code for sym/min in the Symbolic Math toolbox (type edit sym/min in your Command Window), you'll see that it's based on a different function: symobj::maxmin. I don't know why it doesn't just call MuPAD's min, other than performance reasons perhaps. You might consider filing a service request with The MathWorks to ask about this issue.

Turn off "smart behavior" in Matlab

There is one thing I do not like on Matlab: It tries sometimes to be too smart. For instance, if I have a negative square root like
a = -1; sqrt(a)
Matlab does not throw an error but switches silently to complex numbers. The same happens for negative logarithms. This can lead to hard to find errors in a more complicated algorithm.
A similar problem is that Matlab "solves" silently non quadratic linear systems like in the following example:
A=eye(3,2); b=ones(3,1); x = A \ b
Obviously x does not satisfy A*x==b (It solves a least square problem instead).
Is there any possibility to turn that "features" off, or at least let Matlab print a warning message in this cases? That would really helps a lot in many situations.
I don't think there is anything like "being smart" in your examples. The square root of a negative number is complex. Similarly, the left-division operator is defined in Matlab as calculating the pseudoinverse for non-square inputs.
If you have an application that should not return complex numbers (beware of floating point errors!), then you can use isreal to test for that. If you do not want the left division operator to calculate the pseudoinverse, test for whether A is square.
Alternatively, if for some reason you are really unable to do input validation, you can overload both sqrt and \ to only work on positive numbers, and to not calculate the pseudoinverse.
You need to understand all of the implications of what you're writing and make sure that you use the right functions if you're going to guarantee good code. For example:
For the first case, use realsqrt instead
For the second case, use inv(A) * b instead
Or alternatively, include the appropriate checks before/after you call the built-in functions. If you need to do this every time, then you can always write your own functions.

MATLAB: alternatives to calling feval in ode45

I hope I am on topic here. I'm asking here since it said on the faq page: a question concerning (among others) a software algorithm :) So here it goes:
I need to solve a system of ODEs (like $ \dot x = A(t) x$. The Matrix A may change and is given as a string in the function call (Calc_EDS_v2('Sys_EDS_a',...)
Then I'm using ode45 in a loop to find my x:
function [intervals, testing] = EDS_calc_v2(smA,options,debug)
[..]
for t=t_start:t_step:t_end)
[Te,Qe]=func_int(#intQ_2_v2,[t,t+t_step],q);
q=Qe(end,:);
[..]
end
[..]
with func_int being ode45 and #intQ_2_v2 my m-file. q is given back to the call as the starting vector. As you can see I'm just using ode45 on the intervall [t, t+t_step]. That's because my system matrix A can force ode45 to use a lot of steps, leading it to hit the AbsTol or RelTol very fast.
Now my A is something like B(t)*Q(t), so in the m-file intQ_2_v2.m I need to evaluate both B and Q at the times t.
I first done it like so: (v1 -file, so function name is different)
function q=intQ_2_v1(t,X)
[..]
B(1)=...; ... B(4)=...;
Q(1)=...; ...
than that is naturally only with the assumption that A is a 2x2 matrix. With that setup it took a basic system somewhere between 10 and 15 seconds to compute.
Instead of the above I now use the files B1.m to B4.m and Q1.m to B4.m (I know that that's not elegant, but I need to use quadgk on B later and quadgk doesn't support matrix functions.)
function q=intQ_2_v2(t,X)
[..]
global funcnameQ, funcnameB, d
for k=1:d
Q(k)=feval(str2func([funcnameQ,int2str(k)]),t);
B(k)=feval(str2func([funcnameB,int2str(k)]),t);
end
[..]
funcname (string) referring to B or Q (with added k) and d is dimension of the system.
Now I knew that it would cost me more time than the first version but I'm seeing the computing times are ten times as high! (getting 150 to 160 seconds) I do understand that opening 4 files and evaluate roughly 40 times per ode-loop is costly... and I also can't pre-evalute B and Q, since ode45 uses adaptive step sizes...
Is there a way to not use that last loop?
Mostly I'm interested in a solution to drive down the computing times. I do have a feeling that I'm missing something... but can't really put my finger on it. With that one taking nearly three minutes instead of 10 seconds I can get a coffee in between each testrun now... (plz don't tell me to get a faster computer)
(sorry for such a long question )
I'm not sure that I fully understand what you're doing here, but I can offer a few tips.
Use the profiler, it will help you understand exactly where the bottlenecks are.
Using feval is slower than using function handles directly, especially when using str2func to build the handle each time. There is also a slowdown from using the global variables (and it's a good habit to avoid these unless absolutely necessary). Each of these really adds up when using them repeatedly (as it looks like here). Store function handles to each of your mfiles in a cell array and either pass them directly to the function or use nested function for the optimization so that the cell array of handles is visible to the function being optimized. Personally, I prefer the nested method, but passing is better if you will use those mfiles elsewhere.
I expect this will get your runtime back to close to what the first method gave. Be sure to tell us if this was the problem or if you found another solution.

MATLAB interview questions?

I programmed in MATLAB for many years, but switched to using R exclusively in the past few years so I'm a little out of practice. I'm interviewing a candidate today who describes himself as a MATLAB expert.
What MATLAB interview questions should I ask?
Some other sites with resources for this:
"Matlab interview questions" on Wilmott
"MATLAB Questions and Answers" on GlobaleGuildLine
"Matlab Interview Questions" on CoolInterview
This is a bit subjective, but I'll bite... ;)
For someone who is a self-professed MATLAB expert, here are some of the things that I would personally expect them to be able to illustrate in an interview:
How to use the arithmetic operators for matrix or element-wise operations.
A familiarity with all the basic data types and how to convert effortlessly between them.
A complete understanding of matrix indexing and assignment, be it logical, linear, or subscripted indexing (basically, everything on this page of the documentation).
An ability to manipulate multi-dimensional arrays.
The understanding and regular usage of optimizations like preallocation and vectorization.
An understanding of how to handle file I/O for a number of different situations.
A familiarity with handle graphics and all of the basic plotting capabilities.
An intimate knowledge of the types of functions in MATLAB, in particular nested functions. Specifically, given the following function:
function fcnHandle = counter
value = 0;
function currentValue = increment
value = value+1;
currentValue = value;
end
fcnHandle = #increment;
end
They should be able to tell you what the contents of the variable output will be in the following code, without running it in MATLAB:
>> f1 = counter();
>> f2 = counter();
>> output = [f1() f1() f2() f1() f2()]; %# WHAT IS IT?!
We get several new people in the technical support department here at MathWorks. This is all post-hiring (I am not involved in the hiring), but I like to get to know people, so I give them the "Impossible and adaptive MATLAB programming challenge"
I start out with them at MATLAB and give them some .MAT file with data in it. I ask them to analyze it, without further instruction. I can very quickly get a feel for their actual experience.
http://blogs.mathworks.com/videos/2008/07/02/puzzler-data-exploration/
The actual challenge does not mean much of anything, I learn more from watching them attempt it.
Are they making scripts, functions, command line or GUI based? Do they seem to have a clear idea where they are going with it? What level of confidence do they have with what they are doing?
Are they computer scientists or an engineer that learned to program. CS majors tend to do things like close their parenthesis immediately, and other small optimizations like that. People that have been using MATLAB a while tend to capture the handles from plotting commands for later use.
How quickly do they navigate the documentation? Once I see they are going down the 'right' path then I will just change the challenge to see how quickly they can do plots, pull out submatrices etc...
I will throw out some old stuff from Project Euler. Mostly just ramp up the questions until one of us is stumped.
Floating Point Questions
Given that Matlab's main (only?) data type is the double precision floating point matrix, and that most people use floating point arithmetic -- whether they know it or not -- I'm astonished that nobody has suggested asking basic floating point questions. Here are some floating point questions of variable difficulty:
What is the range of |x|, an IEEE dp fpn?
Approximately how many IEEE dp fpns are there?
What is machine epsilon?
x = 10^22 is exactly representable as a dp fpn. What are the fpns xp
and xs just below and just above x ?
How many dp fpns are in [1,2)? How many atoms are on an edge of a
1-inch sugar cube?
Explain why sin(pi) ~= 0, but cos(pi) = -1.
Why is if abs(x1-x2) < 1e-10 then a bad convergence test?
Why is if f(a)*f(b) < 0 then a bad sign check test?
The midpoint c of the interval [a,b] may be calculated as:
c1 = (a+b)/2, or
c2 = a + (b-a)/2, or
c3 = a/2 + b/2.
Which do you prefer? Explain.
Calculate in Matlab: a = 4/3; b = a-1; c = b+b+b; e = 1-c;
Mathematically, e should be zero but Matlab gives e = 2.220446049250313e-016 = 2^(-52), machine epsilon (eps). Explain.
Given that realmin = 2.225073858507201e-308, and Matlab's u = rand gives a dp fpn uniformly distributed over the open interval (0,1):
Are the floating point numbers [2^(-400), 2^(-100), 2^(-1)]
= 3.872591914849318e-121, 7.888609052210118e-031, 5.000000000000000e-001
equally likely to be output by rand ?
Matlab's rand uses the Mersenne Twister rng which has a period of
(2^19937-1)/2, yet there are only about 2^64 dp fpns. Explain.
Find the smallest IEEE double precision fpn x, 1 < x < 2, such that x*(1/x) ~= 1.
Write a short Matlab function to search for such a number.
Answer: Alan Edelman, MIT
Would you fly in a plane whose software was written by you?
Colin K would not hire me (and probably fire me) for saying "that
Matlab's main (only?) data type is the double precision floating
point matrix".
When Matlab started that was all the user saw, but over the years
they have added what they coyly call 'storage classes': single,
(u)int8,16,32,64, and others. But these are not really types
because you cannot do USEFUL arithmetic on them. Arithmetic on
these storage classes is so slow that they are useless as types.
Yes, they do save storage but what is the point if you can't do
anything worthwhile with them?
See my post (No. 13) here, where I show that arithmetic on int32s is 12 times slower than
double arithmetic and where MathWorkser Loren Shure says "By
default, MATLAB variables are double precision arrays. In the olden
days, these were the ONLY kind of arrays in MATLAB. Back then even
character arrays were stored as double values."
For me the biggest flaw in Matlab is its lack of proper types,
such as those available in C and Fortran.
By the way Colin, what was your answer to Question 14?
Ask questions about his expertise and experience in applying MATLAB in your domain.
Ask questions about how he would approach designing an application for implementation in MATLAB. If he refers to recent features of MATLAB, ask him to explain them, and how they are different from the older features they replace or supplement, and why they are preferable (or not).
Ask questions about his expertise with MATLAB data structures. Many of the MATLAB 'experts' I've come across are very good at writing code, but very poor at determining what are the best data structures for the job in hand. This is often a direct consequence of their being domain experts who've picked up MATLAB rather than having been trained in computerism. The result is often good code which has to compensate for the wrong data structures.
Ask questions about his experience, if any, with other languages/systems and invite him to expand upon his observations about the relative strengths and weaknesses of MATLAB.
Ask for top tips on optimising MATLAB programs. Expect the answers: vectorisation, pre-allocation, clearing unused variables, etc.
Ask about his familiarity with the MATLAB profiler, debugger and lint tools. I've recently discovered that the MATLAB 'expert' over in the corner here had never, in 10 years using the tool, found the profiler.
That should get you started.
I. I think this recent SO question
on indexing is a very good question
for an "expert".
I have a 2D array, call it 'A'. I have
two other 2D arrays, call them 'ix'
and 'iy'. I would like to create an
output array whose elements are the
elements of A at the index pairs
provided by x_idx and y_idx. I can do
this with a loop as follows:
for i=1:nx
for j=1:ny
output(i,j) = A(ix(i,j),iy(i,j));
end
end
How can I do this without the loop? If
I do output = A(ix,iy), I get the
value of A over the whole range of
(ix)X(iy).
II. Basic knowledge of operators like element-wise multiplication between two matrices (.*).
III. Logical indexing - generate a random symmetric matrix with values from 0-1 and set all values above T to 0.
IV. Read a file with some properly formatted data into a matrix (importdata)
V. Here's another sweet SO question
I have three 1-d arrays where elements
are some values and I want to compare
every element in one array to all
elements in other two.
For example:
a=[2,4,6,8,12]
b=[1,3,5,9,10]
c=[3,5,8,11,15]
I want to know if there are same
values in different arrays (in this
case there are 3,5,8)
Btw, there's an excellent chance your interviewee will Google "MATLAB interview questions" and see this post :)
Possible question:
I have an array A of n R,G,B triplets. It is a 3xn matrix. I have another array B in the form 1xn which stores an index value (association to a cluster) for each triplet.
How do I plot the triplets of A in 3D space (using plot3 function), coloring each triplet according to its index in B? (The goal is to qualitatively evaluate my clustering)
Really, really good programmers who are MATLAB novices won't be able to give you an efficient (== MATLAB style) solution. However, it is a very simple problem if you do know your MATLAB.
Depends a bit what you want to test.
To test MATLAB fluency, there are several nice Stack Overflow questions that you could use to test e.g. array manipulations (example 1, example 2), or you could use fix-this problems like this question (I admit, I'm rather fond of that one), or look into this list for some highly MATLAB-specific stuff. If you want to be a bit mean, throw in a question like this one, where the best solution is a loop, and the typical MATLAB-way-of-thinking solution would just fill up the memory.
However, it may be more useful to ask more general programming questions that are related to your area of work and see whether they get the problem solved with MATLAB.
For example, since I do image analysis, I may ask them to design a class for loading images of different formats (a MATLAB expert should know how to do OOP, after all, it has been out for two years now), and then ask follow-ups as to how to deal with large images (I want to see a check on how much memory would be used - or maybe they know memory.m - and to hear about how MATLAB usually works with doubles), etc.