Generating multivariate normal random numbers with zero covariances in MATLAB

Suppose I want to generate an n-dimensional normal random vector with distribution N(u, diag(sigma_1^2, ..., sigma_n^2)) in Matlab, where u is a column vector.
There are two ways.
randn(n,1).*[sigma_1, ..., sigma_n]' + u;
mvnrnd(u', diag(sigma_1^2, ..., sigma_n^2))';
I think both are correct, but is there any reason to prefer one over the other? I ask because I saw someone else always choose the first way, while I chose the second without having thought about it.
Thanks and regards!

They are equivalent methods. Personally, I would prefer the second option because it's one function that can be used to generate this sort of data for arbitrarily-shaped arrays. If all of a sudden you wanted a whole matrix of Gaussian values, you can get that more easily from the second function call, without doing any calls to reshape(). I also think the second example is easier to read because it relies on a built-in of Matlab's that has been ubiquitous for a long time.
I suppose that if n is large, one could argue that it's inefficient to actually form diag(sigma_1^2, ..., sigma_n^2). But if you're needing to make random draws from a matrix that large, then Matlab is already the wrong tool for the job and you should use Boost::Probability in C++, or perhaps SciPy / scikits.statsmodels in Python.

If there are correlations between the random variables, then the covariance matrix is no longer diagonal. In that case you can use mvnrnd, or use randn with a Cholesky decomposition as follows.
U = chol(SIGMA);
x = U'*randn(n,1);
Whenever possible, use basic functions instead of toolbox functions. Basic functions are faster and more portable.
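As a minimal sketch of both approaches using only base functions: the mean u, the standard deviations sigma and the covariance SIGMA below are made-up illustration values, not taken from the question. The last lines draw many samples at once so the sample covariance can be compared with SIGMA.

```matlab
n = 3;
u = [1; 2; 3];                 % assumed mean (column vector)
sigma = [0.5; 1; 2];           % assumed standard deviations

% Diagonal covariance: scale independent standard normals element-wise.
x1 = randn(n, 1) .* sigma + u;

% Full covariance: assumed symmetric positive definite matrix.
SIGMA = [4 1 0; 1 2 0.5; 0 0.5 1];
U = chol(SIGMA);               % upper triangular factor, U'*U equals SIGMA
x2 = U' * randn(n, 1) + u;     % one draw from N(u, SIGMA)

% Many draws at once (one sample per column), then the sample covariance.
N = 1e5;
X = U' * randn(n, N) + repmat(u, 1, N);
C = cov(X');                   % should be close to SIGMA for large N
```

Because chol returns the upper triangular factor, the transpose U' is the one that multiplies the standard normal vector.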

Related

Dot product with huge vectors

I am facing the following problem: I have a system of 160000 linear equations in 160000 variables. I am going to write two programs, one using the conjugate gradient method and one using the steepest descent method, to solve it. The matrix is block tridiagonal with only 5 nonzero diagonals, so it is not necessary to create and store the full matrix. But I have the following problem: in the iteration step, dot products of vectors are involved. I have tried the commonly used commands dot(u,v) and u'*v. But when I run the program, MATLAB tells me the data size is too large for the memory.
To resolve this problem, I tried to decompose the huge vector into sparse vectors with small support, then calculate the dot products of small vectors and finally glue them together. But it seems that this method is more complicated and not very efficient, and it is easy (especially for beginners like me) to make mistakes. I wonder if there're any more efficient ways to deal with this problem. Thanks in advance.
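A dot product of two length-160000 vectors needs very little memory, so one possible cause (an assumption, since the question's code is not shown) is a row/column orientation mix-up: if u is a column and v is a row, u*v forms a dense 160000-by-160000 matrix of roughly 200 GB. The sketch below forces both operands to columns with (:) and stores the five diagonals with spdiags. The diagonal values and the offset m = 400 are illustrative placeholders, not the question's actual matrix.

```matlab
n = 160000;
u = rand(n, 1);                % column vector
v = rand(1, n);                % row vector, deliberately mixed orientation

% (:) reshapes both operands to columns, so the product is always 1-by-1.
s = u(:)' * v(:);

% Store only the five nonzero diagonals in a sparse matrix.
m = 400;                       % assumed block size, with n = m^2
e = ones(n, 1);
A = spdiags([-e -e 4*e -e -e], [-m -1 0 1 m], n, n);

% One steepest descent step length, using only dot products and A*r.
b = v(:);                      % placeholder right-hand side
r = b - A * u;                 % residual
alpha = (r' * r) / (r' * (A * r));
```

With this storage, A*r costs on the order of 5n operations and no dense n-by-n array is ever created.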

Large linear programs in Matlab

I have a linear program with order N^4 variables and order N^4 constraints. If I want to solve this in AMPL, I define the constraints one by one without having to bother about the exact coefficient matrices. No memory issues arise. When using the standard LP solver in Matlab, however, I need to define the matrices explicitly.
When I have variables with four subscripts, this will lead to a massively sparse matrix of dimension order N^4 x N^4. This matrix won't even fit in memory for non trivial problem sizes.
Is there a way to get around this problem using Matlab, apart from various column generation/cutting plane techniques? Since AMPL manages to solve it, I suppose they're either automating some kind of decomposition, or they somehow solve the LP without explicitly working with this sparse monster matrix.
Apart from sparse, mentioned by m.s., you can also use the AMPL API for MATLAB. It is especially useful if you already have an AMPL model and want to work with it from MATLAB.
Converting my comment into an answer:
MATLAB supports sparse matrices using the sparse command which allows you to build your constraint matrix without exceeding memory limits.
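A minimal sketch of building such a matrix from index triplets, so that only the nonzeros are ever stored. The constraint used here (for each (i,j,k), the sum over l of x(i,j,k,l) is at most 1) is a hypothetical example, not the constraint set from the question, and N is kept small for illustration.

```matlab
N = 3;                                   % assumed small size for illustration
nVars = N^4;                             % one variable per (i,j,k,l)
nCons = N^3;                             % one hypothetical constraint per (i,j,k)

% Enumerate all index combinations once.
[I, J, K, L] = ndgrid(1:N, 1:N, 1:N, 1:N);

% Column index: linear position of variable x(i,j,k,l).
col = sub2ind([N N N N], I(:), J(:), K(:), L(:));
% Row index: linear position of constraint (i,j,k).
row = sub2ind([N N N], I(:), J(:), K(:));

% Assemble the constraint matrix directly in sparse form.
A = sparse(row, col, 1, nCons, nVars);
b = ones(nCons, 1);
```

The resulting A and b can then be passed to an LP solver that accepts sparse input; memory use grows with the number of nonzeros rather than with nCons*nVars.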

Function with errors in numerical integration

I'm looking for a function that generates significant errors in numerical integration using Gaussian quadrature or Simpson quadrature.
Since Simpson's and Gaussian's methods are trying to fit a supposedly smooth function with pieces of simple smooth functions, such as 2nd-order polynomials, and otherwise make use of low-order polynomials and other simple algebraic functions such as $$a+5/6$$, it makes sense that the biggest challenges would be functions that aren't 2nd order polynomials or resembling those simple functions.
Step functions, or more generally functions that are constant for short runs then jump to another value. A staircase, or the Walsh functions (used for a kind of binary Fourier transform) should be interesting. Just a plain simple single step does not fit any polynomial approximation very well.
Try a high-order polynomial. Just x^n for a large n should be interesting. Maybe subtract x^n - x^(n-1) for some large n. How large is "large"? For Simpson, perhaps 4 or more. For Gaussian using k points, n>k. (Don't go nuts trying n beyond modest two digit numbers; that just becomes nasty calculation apart from any integration.)
Few numerical integration methods like poles, that is, functions resembling 1/(x-a) for some neighborhood around a. Since it may be trouble to deal with actual infinity, try pushing it off the real line, or a complex conjugate pair. Make a big but finite spike using 1/( (x-a)^2 + b) where b>0 is small. Or the square root of that expression, or the sine or exponential of it. You could replace the "2" with a bigger power, I bet that'll be nasty.
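To illustrate the finite spike, here is a small sketch that applies composite Simpson's rule to 1/((x-a)^2 + b) on [0,1] and compares it with the closed-form value obtained from the arctangent antiderivative. The values a = 0.5, b = 1e-6 and the 10 subintervals are arbitrary choices for the demonstration.

```matlab
a = 0.5;                       % assumed spike location
b = 1e-6;                      % assumed small positive width parameter
f = @(x) 1 ./ ((x - a).^2 + b);

% Exact value: the antiderivative is atan((x - a)/sqrt(b)) / sqrt(b).
exact = (atan((1 - a) / sqrt(b)) - atan((0 - a) / sqrt(b))) / sqrt(b);

% Composite Simpson's rule with N subintervals (N must be even).
N = 10;
x = linspace(0, 1, N + 1);
h = x(2) - x(1);
w = ones(1, N + 1);            % end points have weight 1
w(2:2:N) = 4;                  % odd-numbered interior nodes
w(3:2:N-1) = 2;                % even-numbered interior nodes
simpson = h / 3 * sum(w .* f(x));

relErr = abs(simpson - exact) / abs(exact);
```

Here a node falls exactly on the peak, so the coarse rule greatly overestimates the integral; shifting a so that every node misses the peak would instead make it underestimate.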
Once upon a time I wanted to test a numerical integration routine. I started with a stairstep function, or train of rectangular pulses, sampled on some set of points.
I computed an approximate derivative using a Savitzky-Golay filter. SG can differentiate numerical data using a finite window of neighboring points, though normally it's used for smoothing. It takes a window size (number of points), polynomial order (2 or 4 in practice, but you may want to go nuts with higher), and differentiation order (normally 0 to smooth, 1 to get derivatives).
The result was a series of pulses, which I then integrated. A good routine will recreate the original stairstep or rectangular pulses. I imagine if the SG parameters are chosen right, you will make Simpson and Gauss roll over in their graves.
If you are looking for a difficult function to integrate as a test method, you could consider the one in the CS Stack Exchange question:
Method for numerical integration of difficult oscillatory integral
In this question, one of the answers suggests using the chebfun library for Matlab, which contains an implementation of a basic Levin-type method. This suggests to me that a simpler method such as Simpson's rule would fail on this function.

Functional form of 2D interpolation in Matlab

I need to construct an interpolating function from a 2D array of data. The reason I need something that returns an actual function is, that I need to be able to evaluate the function as part of an expression that I need to numerically integrate.
For that reason, "interp2" doesn't cut it: it does not return a function.
I could use "TriScatteredInterp", but that's heavy-weight: my grid is equally spaced (and big); so I don't need the delaunay triangularisation.
Are there any alternatives?
(Apologies for the 'late' answer, but I have some suggestions that might help others if the existing answer doesn't help them)
It's not clear from your question how accurate the resulting function needs to be (or how big 'big' is), but one approach that you could adopt is to regress the data points that you have using a least-squares or Kalman filter-based method. You'd need to do this with a number of candidate function forms and then choose the one that is 'best', for example by using a measure such as MAE or MSE.
Of course this requires some idea of what form the underlying function could take, but your question isn't clear as to whether you have this kind of information.
Another approach that could work (and requires no knowledge of what the underlying function might be) is the use of the fuzzy transform (F-transform) to generate line segments that provide local approximations to the surface.
The method for this would be:
Define a 2D universe that includes the x and y domains of your input data
Create a 2D fuzzy partition of this universe, choosing partition sizes that give the accuracy you require
Apply the discrete F-transform using your input data to generate fuzzy data points in a 3D fuzzy space
Pass the inverse F-transform as a function handle (along with the fuzzy data points) to your integration function
If you're not familiar with the F-transform then I posted a blog a while ago about how the F-transform can be used as a universal approximator in a 1D case: http://iainism-blogism.blogspot.co.uk/2012/01/fuzzy-wuzzy-was.html
To see the mathematics behind the method and extend it to a multidimensional case, the University of Ostrava has published a PhD thesis that explains its application to various engineering problems and also provides an example of how it is constructed for the case of a 2D universe: http://irafm.osu.cz/f/PhD_theses/Stepnicka.pdf
If you want a function handle, why not define f=@(xi,yi)interp2(X,Y,Z,xi,yi) ?
It might be a little slow, but I think it should work.
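A minimal sketch of that idea, wrapping interp2 in an anonymous function and passing it to quad2d. The grid and the data Z = X + Y are made up so that the result can be checked by hand: bilinear interpolation reproduces this surface exactly, and its integral over the unit square is 1.

```matlab
% Assumed equally spaced grid and sample data for illustration.
[X, Y] = meshgrid(linspace(0, 1, 21), linspace(0, 1, 21));
Z = X + Y;

% Function handle that evaluates the interpolant at arbitrary points.
f = @(xi, yi) interp2(X, Y, Z, xi, yi);

% The handle can be used inside a larger expression as well.
g = @(xi, yi) f(xi, yi) .^ 2;

% quad2d passes matrices of query points, which interp2 accepts.
I = quad2d(f, 0, 1, 0, 1);
```

interp2 returns NaN for points outside the grid, so the integration limits should stay within the range of X and Y.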
If I understand you correctly, you want to perform a surface/line integral of 2-D data. There are ways to do it but maybe not the way you want it. I had the exact same problem and it's annoying! The only way I solved it was using the Surface Fitting Tool (sftool) to create a surface then integrating it.
After you create your fit using the tool (it has a GUI as well), it will generate an sfit object (sfobject), which you can then integrate in 2-D using quad2d.
I also tried your method of using interp2 and got results (which were similar to the sfobject), but I had no idea how to do a numerical integration (line/surface) with the data. Creating the sfobject and then integrating it was much faster.
It was the first time I did something like this, so I confirmed it using a numerically evaluated line integral. According to Stokes' theorem, the surface integral and the line integral should be the same, and they did turn out to be the same.
I asked this question in the mathematics stackexchange, wanted to do a line integral of 2-d data, ended up doing a surface integral and then confirming the answer using a line integral!

Integration with matlab

I want to solve this problem:
[Image of the integral: http://img265.imageshack.us/img265/6598/greenshot20100727091025.png]
I don't want to use "int"; I want to use the "quad" family (quad, dblquad, triplequad),
but I can't.
Can you help me?
I assume that your real problem is more complex than this trivial one. The best solution is just to use a symbolic integral. Why is numerical integration difficult?
Numerical integration in ONE dimension typically requires on the order of say 100 function evaluations. (The exact number will be very dependent on the accuracy required, the limits, etc.) This makes a 2-d integral typically require on the order of 100^2 = 10000 function evals. So an adaptive, 5-d integral will require on the order of 100^5 = 1e10 function evaluations. (This is only a very rough order of magnitude estimate here.) My point is, you simply don't want to do that!
Better is to reduce the problem in complexity. If your integral is separable (as is this one) then do so! Reduce a 5-d problem into multiple 1-d problems.
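As a sketch of that reduction for a sum of five variables over a box: each term integrates to a 1-D integral times the widths of the other four intervals. The limits below are the ones that appear in the symbolic int call elsewhere in this thread; since the integrand is symmetric in its variables, it does not matter which variable gets which interval. integral is used for the 1-D pieces, but quad would serve equally.

```matlab
% Integration limits, one row per variable (taken from the symbolic example).
lims = [1 5; -2 3; 0 1; -1 1; 0 1];
widths = lims(:, 2) - lims(:, 1);
vol = prod(widths);            % volume of the 5-D box

total = 0;
for k = 1:5
    % 1-D integral of the k-th variable over its own interval.
    Ik = integral(@(t) t, lims(k, 1), lims(k, 2));
    % The other four variables contribute only their interval widths.
    total = total + Ik * vol / widths(k);
end
```

This uses five cheap 1-D integrations in place of one adaptive 5-D integration.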
Also, in many cases I see people wanting to do a numerical integration of a Gaussian PDF. See that this is easily solved using a call to erf or erfc, coupled with a transformation. The point is that in many cases special functions are defined to greatly reduce the complexity of a problem.
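For example, the probability that a normal variable with mean mu and standard deviation s falls in [lo, hi] follows directly from erf after the substitution z = (x - mu)/(s*sqrt(2)). The parameter values below are arbitrary, and the numerical integral is included only as a cross-check.

```matlab
mu = 0;  s = 1;                % assumed mean and standard deviation
lo = -1; hi = 1;               % assumed integration limits

% Closed form via erf.
p = 0.5 * (erf((hi - mu) / (s * sqrt(2))) - erf((lo - mu) / (s * sqrt(2))));

% Cross-check by integrating the Gaussian PDF numerically.
pdf = @(x) exp(-(x - mu).^2 / (2 * s^2)) / (s * sqrt(2 * pi));
pnum = integral(pdf, lo, hi);
```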
I should add that in many cases, the key to solving a difficult problem in mathematics is to use mathematics to reduce the problem to something simpler. If you can find a way to reduce the dimensionality of your problem just a bit, it will become much more tractable.
The integral you show is
Analytically solvable: always do analytically what you can
equal to a number: constant expressions should be eliminated from numerical calculations
not easy to calculate in MATLAB (or at least not very accurately).
You can use cumtrapz to integrate over each variable alone, and call trapz for the final integration. Remember that this will blow up the error on any problem that is more complicated than a simple sum of linear functions.
Mathematica is more suited to nD integrations, if you have access to that.
MATLAB can do symbolic integration:
>> x = sym('x'); y = sym('y'); z = sym('z'); u = sym('u'); v = sym('v');
>> int(int(int(int(int(x+y+z+u+v,1,5),-2,3),0,1),-1,1),0,1)
ans =
180
I just noticed you want to do numeric, not symbolic, integration.
If you look at the source of dblquad and triplequad
>> edit dblquad
you will see that they just call the lower-dimensional versions.
It should be possible for you to add a quadquad and a quintquad (or, recursively, an n-quad).