Why is Matlab function interpn being modified? - matlab

The Matlab function interpn does n-grid interpolation. According to the documentation page:
In a future release, interpn will not accept mixed combinations of row and column vectors for the sample and query grids.
This page provides a bit more information but is still kind of cryptic.
My question is this: Why is this modification being implemented? In particular, are there any pitfalls to using interpn?
I am writing a program in fortran that is supposed to produce similar results to a Matlab program that uses interpn as a crucial component. I'm wondering if the Matlab program might have a problem that is related to this modification.

No, I don't think this indicates that there is any sort of problem with using interpn, or any of the other MATLAB interpolation functions.
Over the last few releases MathWorks has been introducing some new/better functionality for interpolation (for example the griddedInterpolant, scatteredInterpolant and delaunayTriangulation classes). This has been going on in small steps since R2009a, when they replaced the underlying QHULL libraries for computational geometry with CGAL.
It seems likely to me that interpn has for a long time supported an unusual form of input arguments (i.e. mixed row and column vectors to define the sample grid) that is probably a bit confusing for people, hardly ever used, and a bit of a pain for MathWorks to support. So as they move forward with the newer functionality, they're just taking the opportunity to simplify some of the syntaxes supported: it doesn't mean that there is any problem with interpn.

Related

Matlab: Fit a custom function to xy-data with given x-y errors

I have been looking for a Matlab function that can do a nonlinear total least square fit, basically fit a custom function to data which has errors in all dimensions. The easiest case being x and y data-points with different given standard deviations in x and y, for every single point. This is a very common scenario in all natural sciences and just because most people only know how to do a least square fit with errors in y does not mean it wouldn't be extremely useful. I know the problem is far more complicated than a simple y-error, this is probably why most (not even physicists like myself) learned how to properly do this with multidimensional errors.
I would expect that a software like matlab could do it but unless I'm bad at reading the otherwise mostly useful help pages I think even a 'full' Matlab license doesn't provide such fitting functionality. Other tools like Origin, Igor, Scipy use the freely available fortran package "ODRPACK95", for instance. There are few contributions about total least square or deming fits on the file exchange, but they're for linear fits only, which is of little use to me.
I'd be happy for any hint that can help me out
kind regards
First I should point out that I haven't practiced MATLAB much since I graduated last year (also as a Physicist). That being said, I remember using
lsqcurvefit()
in MATLAB to perform non-linear curve fits. Now, this may, or may not work depending on what you mean by custom function? I'm assuming you want to fit some known expression similar to one of these,
y = A*sin(x)+B
y = A*e^(B*x) + C
It is extremely difficult to perform a fit without knowning the form, e.g. as above. Ultimately, all mathematical functions can be approximated by polynomials for small enough intervals. This is something you might want to consider, as MATLAB does have lots of tools for doing polynomial regression.
In the end, I would acutally reccomend you to write your own fit-function. There are tons of examples for this online. The idea is to know the true solution's form as above, and guess on the parameters, A,B,C.... Create an error- (or cost-) function, which produces an quantitative error (deviation) between your data and the guessed solution. The problem is then reduced to minimizing the error, for which MATLAB has lots of built-in functionality.

Complex inverse and complex pseudo-inverse in Scala?

I'm considering to learn Scala for my algorithm development, but first need to know if the language has implemented (or is implementing) complex inverse and pseudo-inverse functions. I looked at the documentation (here, here), and although it states these functions are for real matrices, in the code, I don't see why it wouldn't accept complex matrices.
There's also the following comment left in the code:
pinv for anything that can be transposed, multiplied with that transposed, and then solved
Is this just my wishful thinking, or will it not accept complex matrices?
Breeze implementer here:
I haven't implemented inv etc. for complex numbers yet, because I haven't figured out a good way to store complex numbers unboxed in a way that is compatible with blas and lapack and doesn't break the current API. You can set the call up yourself using netlib java following a similar recipe to the code you linked.

Can someone tell me about the kNN search algo that Matlab uses?

I wrote a basic O(n^2) algorithm for a nearest neighbor search. As usual Matlab 2013a's knnsearch(..) method works a lot faster.
Can someone tell me what kind of optimization they used in their implementation?
I am okay with reading any documentation or paper that you may point me to.
PS: I understand the documentation on the site mentions the paper on kd trees as a reference. But as far as I understand kd trees are the default option when column number is less than 10. Mine is 21. Correct me if I'm wrong about it.
The biggest optimization MathWorks have made in implementing nearest-neighbors search is that all the hard stuff is implemented in a MEX file, as compiled C, rather than MATLAB.
With an algorithm such as kNN that (in my limited understanding) is quite recursive and difficult to vectorize, that's likely to give such an improvement that the O() analysis will only be relevant at pretty high n.
In more detail, under the hood the knnsearch command uses createns to create a NeighborSearcher object. By default, when X has less than 10 columns, this will be a KDTreeSearcher object, and when X has more than 10 columns it will be an ExhaustiveSearcher object (both KDTreeSearcher and ExhaustiveSearcher are subclasses of NeighborSearcher).
All objects of class NeighbourSearcher have a method knnsearch (which you would rarely call directly, using instead the convenience command knnsearch rather than this method). The knnsearch method of KDTreeSearcher calls straight out to a MEX file for all the hard work. This lives in matlabroot\toolbox\stats\stats\#KDTreeSearcher\private\knnsearchmex.mexw64.
As far as I know, this MEX file performs pretty much the algorithm described in the paper by Friedman, Bentely, and Finkel referenced in the documentation page, with no structural changes. As the title of the paper suggests, this algorithm is O(log(n)) rather than O(n^2). Unfortunately, the contents of the MEX file are not available for inspection to confirm that.
The code builds a KD-tree space-partitioning structure to speed up nearest neighbor search, think of it like building indexes commonly used in RDBMS to speed up lookup operations.
In addition to nearest neighbor(s) searches, this structure also speeds up range-searches, which finds all points that are within a distance r from a query point.
As pointed by #SamRoberts, the core of the code is implemented in C/C++ as a MEX-function.
Note that knnsearch chooses to build a KD-tree only under certain conditions, and falls back to an exhaustive search otherwise (by naively searching all points for the nearest one).
Keep in mind that in cases of very high-dimensional data (and few instances), the algorithm degenerates and is no better than an exhaustive search. In general as you go with dimensions d>30, the cost of searching KD-trees will increase to searching almost all the points, and could even become worse than a brute force search due to the overhead involved in building the tree.
There are other variations to the algorithm that deals with high dimensions such as the ball trees which partitions the data in a series of nesting hyper-spheres (as opposed to partitioning the data along Cartesian axes like KD-trees). Unfortunately those are not implemented in the official Statistics toolbox. If you are interested, here is a paper which presents a survey of available kNN algorithms.
(The above is an illustration of searching a kd-tree partitioned 2d space, borrowed from the docs)

Algorithm used by the matlab function linopt::minimize

What algorithm is used by the matlab function linopt::minimize ? I have to write the source code for solving a Mixed Integer Linear Programming problem. Please tell me the algorithms I can work upon.
It is not specified in the documentation of Mathlab (c.f. http://www.mathworks.fr/fr/help/symbolic/mupad_ref/linopt-minimize.html). However, they do provide some references at the end of the page (you should have a look there).
My guesses are the following (based on my humble knowledge regarding state-of-the-art methods for solving LP/MILP) :
1-If the problem is not MILP, we usually use the simplex algorithm (http://en.wikipedia.org/wiki/Simplex_algorithm).
2-If the problem is MILP (i.e. there is some integer variables), one usually use a Branch and Bound algorithm in which we might use simplex for improving the search (we call it Branch and Cut)
P.S. This is not an exhaustive response since other methods can be used but the above are the most known ones.

Functional form of 2D interpolation in Matlab

I need to construct an interpolating function from a 2D array of data. The reason I need something that returns an actual function is, that I need to be able to evaluate the function as part of an expression that I need to numerically integrate.
For that reason, "interp2" doesn't cut it: it does not return a function.
I could use "TriScatteredInterp", but that's heavy-weight: my grid is equally spaced (and big); so I don't need the delaunay triangularisation.
Are there any alternatives?
(Apologies for the 'late' answer, but I have some suggestions that might help others if the existing answer doesn't help them)
It's not clear from your question how accurate the resulting function needs to be (or how big, 'big' is), but one approach that you could adopt is to regress the data points that you have using a least-squares or Kalman filter-based method. You'd need to do this with a number of candidate function forms and then choose the one that is 'best', for example by using an measure such as MAE or MSE.
Of course this requires some idea of what the form underlying function could be, but your question isn't clear as to whether you have this kind of information.
Another approach that could work (and requires no knowledge of what the underlying function might be) is the use of the fuzzy transform (F-transform) to generate line segments that provide local approximations to the surface.
The method for this would be:
Define a 2D universe that includes the x and y domains of your input data
Create a 2D fuzzy partition of this universe - chosing partition sizes that give the accuracy you require
Apply the discrete F-transform using your input data to generate fuzzy data points in a 3D fuzzy space
Pass the inverse F-transform as a function handle (along with the fuzzy data points) to your integration function
If you're not familiar with the F-transform then I posted a blog a while ago about how the F-transform can be used as a universal approximator in a 1D case: http://iainism-blogism.blogspot.co.uk/2012/01/fuzzy-wuzzy-was.html
To see the mathematics behind the method and extend it to a multidimensional case then the University of Ostravia has published a PhD thesis that explains its application to various engineering problems and also provides an example of how it is constructed for the case of a 2D universe: http://irafm.osu.cz/f/PhD_theses/Stepnicka.pdf
If you want a function handle, why not define f=#(xi,yi)interp2(X,Y,Z,xi,yi) ?
It might be a little slow, but I think it should work.
If I understand you correctly, you want to perform a surface/line integral of 2-D data. There are ways to do it but maybe not the way you want it. I had the exact same problem and it's annoying! The only way I solved it was using the Surface Fitting Tool (sftool) to create a surface then integrating it.
After you create your fit using the tool (it has a GUI as well), it will generate an sftool object which you can then integrate in (2-D) using quad2d
I also tried your method of using interp2 and got the results (which were similar to the sfobject) but I had no idea how to do a numerical integration (line/surface) with the data. Creating thesfobject and then integrating it was much faster.
It was the first time I do something like this so I confirmed it using a numerically evaluated line integral. According to Stoke's theorem, the surface integral and the line integral should be the same and it did turn out to be the same.
I asked this question in the mathematics stackexchange, wanted to do a line integral of 2-d data, ended up doing a surface integral and then confirming the answer using a line integral!