Complex inverse and complex pseudo-inverse in Scala?

I'm considering learning Scala for my algorithm development, but first I need to know whether the language has implemented (or is implementing) complex inverse and pseudo-inverse functions. I looked at the documentation (here, here), and although it states these functions are for real matrices, looking at the code I don't see why it wouldn't accept complex matrices.
There's also the following comment left in the code:
pinv for anything that can be transposed, multiplied with that transposed, and then solved
Is this just my wishful thinking, or will it not accept complex matrices?

Breeze implementer here:
I haven't implemented inv etc. for complex numbers yet, because I haven't figured out a good way to store complex numbers unboxed in a way that is compatible with BLAS and LAPACK and doesn't break the current API. You can set up the call yourself using netlib-java, following a recipe similar to the code you linked.
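For anyone who wants to follow that recipe by hand: the comment describes the normal-equations pseudo-inverse, pinv(A) = (A^H A)^{-1} A^H for a matrix A with full column rank, with the conjugate transpose A^H playing the role of the plain transpose in the complex case. Below is a minimal, self-contained Scala sketch of that recipe (top-level definitions, Scala 3 / REPL style) using plain boxed complex numbers and a hand-rolled Gauss-Jordan solve. It is only illustrative: the Complex class and all helpers are hypothetical, not Breeze or netlib-java API, and it deliberately sidesteps the unboxed-storage question the answer raises.

// Sketch only: complex pseudo-inverse via the "transpose, multiply,
// solve" recipe, pinv(A) = solve(A^H * A, A^H). Boxed arithmetic for
// clarity -- not the unboxed BLAS/LAPACK route recommended above.
case class Complex(re: Double, im: Double) {
  def +(o: Complex) = Complex(re + o.re, im + o.im)
  def -(o: Complex) = Complex(re - o.re, im - o.im)
  def *(o: Complex) = Complex(re * o.re - im * o.im, re * o.im + im * o.re)
  def /(o: Complex) = {
    val d = o.re * o.re + o.im * o.im
    Complex((re * o.re + im * o.im) / d, (im * o.re - re * o.im) / d)
  }
  def conj = Complex(re, -im)
  def abs = math.hypot(re, im)
}

type CMat = Array[Array[Complex]]

// Conjugate (Hermitian) transpose.
def ctranspose(a: CMat): CMat =
  Array.tabulate(a(0).length, a.length)((i, j) => a(j)(i).conj)

def multiply(a: CMat, b: CMat): CMat =
  Array.tabulate(a.length, b(0).length) { (i, j) =>
    b.indices.foldLeft(Complex(0, 0))((s, k) => s + a(i)(k) * b(k)(j))
  }

// Solve A * X = B by Gauss-Jordan elimination with partial pivoting.
def solve(a0: CMat, b0: CMat): CMat = {
  val n = a0.length
  val a = a0.map(_.clone); val b = b0.map(_.clone)
  for (col <- 0 until n) {
    val p = (col until n).maxBy(r => a(r)(col).abs)   // pivot row
    val (ra, rb) = (a(p), b(p)); a(p) = a(col); b(p) = b(col)
    a(col) = ra; b(col) = rb
    val piv = a(col)(col)                             // assumed nonzero
    for (j <- a(col).indices) a(col)(j) = a(col)(j) / piv
    for (j <- b(col).indices) b(col)(j) = b(col)(j) / piv
    for (r <- 0 until n if r != col) {
      val f = a(r)(col)
      for (j <- a(r).indices) a(r)(j) = a(r)(j) - f * a(col)(j)
      for (j <- b(r).indices) b(r)(j) = b(r)(j) - f * b(col)(j)
    }
  }
  b
}

// pinv(A) = (A^H A)^{-1} A^H, valid when A has full column rank.
def pinv(a: CMat): CMat = {
  val ah = ctranspose(a)
  solve(multiply(ah, a), ah)
}

For a tall, full-column-rank A, multiply(pinv(a), a) should come out numerically as the identity. Note that the normal-equations route squares the condition number, which is one reason production implementations prefer an SVD- or QR-based pinv.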

Related

Why is Matlab function interpn being modified?

The Matlab function interpn does n-grid interpolation. According to the documentation page:
In a future release, interpn will not accept mixed combinations of row and column vectors for the sample and query grids.
This page provides a bit more information but is still kind of cryptic.
My question is this: Why is this modification being implemented? In particular, are there any pitfalls to using interpn?
I am writing a program in Fortran that is supposed to produce results similar to a Matlab program that uses interpn as a crucial component. I'm wondering if the Matlab program might have a problem related to this modification.
No, I don't think this indicates that there is any sort of problem with using interpn, or any of the other MATLAB interpolation functions.
Over the last few releases MathWorks has been introducing some new/better functionality for interpolation (for example the griddedInterpolant, scatteredInterpolant and delaunayTriangulation classes). This has been going on in small steps since R2009a, when they replaced the underlying QHULL libraries for computational geometry with CGAL.
It seems likely to me that interpn has for a long time supported an unusual form of input arguments (i.e. mixed row and column vectors to define the sample grid) that is probably a bit confusing for people, hardly ever used, and a bit of a pain for MathWorks to support. So as they move forward with the newer functionality, they're just taking the opportunity to simplify some of the syntaxes supported: it doesn't mean that there is any problem with interpn.

Diagonalization of Hermitian matrices: Julia vs Fortran

I have a program written in both Fortran and Julia. In one case I have symmetric matrices, and I get more or less similar results with both programs. When I switch to a case where I have Hermitian matrices, the Julia program and the Fortran program give me different results. I would guess that the difference comes from the diagonalization procedure; in Fortran I use:
ZHEEVD(..)
while in Julia I simply use:
eig(matrix)
The first thing I notice is that ZHEEVD fixes the first row of the eigenvector matrix to real numbers (no imaginary part), while eig fixes the last row to real numbers.
Any idea how to overcome these tiny differences? Any more info that would be useful when dealing with Julia's linear algebra built-ins?
Digging into the Julia methods (the @less macro is very handy for this), you'll find that it eventually calls the LAPACK.syevr! method, which in the Complex128 case is a wrapper for the ZHEEVR LAPACK routine (scroll down a bit to see the actual definition).
If you'd prefer to keep using ZHEEVD, you can access it via the ccall interface: see the manual section on Calling C and Fortran code. The LAPACK wrappers linked above should provide plenty of examples (LAPACK comes as part of OpenBLAS, which is included in Julia, so you shouldn't need to install anything else).
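A cheap way to reconcile the two programs, whichever routine you call, is to note that each eigenvector of a Hermitian matrix is determined only up to a unit-modulus phase factor e^{i*theta}, so you can rescale every column after the fact to enforce whichever convention you like. Here is a minimal Scala sketch of that phase-fixing idea (the Complex class and the fixPhase helper are hypothetical; the same few lines translate directly to Julia or Fortran):

// Sketch: eigenvectors of a Hermitian matrix are unique only up to a
// per-column phase e^{i*theta}. To compare ZHEEVD output with Julia's
// eig, rescale each column so the entry in a chosen row (here the
// first) becomes real and non-negative.
case class Complex(re: Double, im: Double) {
  def *(o: Complex) = Complex(re * o.re - im * o.im, re * o.im + im * o.re)
  def conj = Complex(re, -im)
  def abs = math.hypot(re, im)
}

// Multiply each column by the conjugate phase of its entry in `row`.
def fixPhase(vecs: Array[Array[Complex]], row: Int = 0): Unit =
  for (col <- vecs(0).indices) {
    val pivot = vecs(row)(col)
    if (pivot.abs > 1e-12) {                        // skip (near-)zero entries
      val phase = Complex(pivot.re / pivot.abs, pivot.im / pivot.abs)
      for (r <- vecs.indices)
        vecs(r)(col) = vecs(r)(col) * phase.conj    // e^{-i*theta} * column
    }
  }

Applying fixPhase to both the ZHEEVD and the eig eigenvector matrices should make them agree column by column up to numerical noise (modulo eigenvalue ordering and degenerate-eigenvalue caveats).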

Can someone tell me about the kNN search algo that Matlab uses?

I wrote a basic O(n^2) algorithm for a nearest neighbor search. As usual Matlab 2013a's knnsearch(..) method works a lot faster.
Can someone tell me what kind of optimization they used in their implementation?
I am okay with reading any documentation or paper that you may point me to.
PS: I understand the documentation on the site mentions the paper on kd trees as a reference. But as far as I understand kd trees are the default option when column number is less than 10. Mine is 21. Correct me if I'm wrong about it.
The biggest optimization MathWorks have made in implementing nearest-neighbors search is that all the hard stuff is implemented in a MEX file, as compiled C, rather than MATLAB.
With an algorithm such as kNN that (in my limited understanding) is quite recursive and difficult to vectorize, that's likely to give such an improvement that the O() analysis will only be relevant at pretty high n.
In more detail, under the hood the knnsearch command uses createns to create a NeighborSearcher object. By default, when X has less than 10 columns, this will be a KDTreeSearcher object, and when X has more than 10 columns it will be an ExhaustiveSearcher object (both KDTreeSearcher and ExhaustiveSearcher are subclasses of NeighborSearcher).
All objects of class NeighborSearcher have a knnsearch method (which you would rarely call directly; you'd normally use the convenience command knnsearch instead). The knnsearch method of KDTreeSearcher calls straight out to a MEX file for all the hard work. This lives in matlabroot\toolbox\stats\stats\@KDTreeSearcher\private\knnsearchmex.mexw64.
As far as I know, this MEX file implements pretty much the algorithm described in the paper by Friedman, Bentley, and Finkel referenced in the documentation page, with no structural changes. As the title of the paper suggests, that algorithm is O(log(n)) rather than O(n^2). Unfortunately, the contents of the MEX file are not available for inspection to confirm this.
The code builds a KD-tree space-partitioning structure to speed up nearest neighbor search, think of it like building indexes commonly used in RDBMS to speed up lookup operations.
In addition to nearest neighbor(s) searches, this structure also speeds up range-searches, which finds all points that are within a distance r from a query point.
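To make the index analogy concrete, here is a minimal Scala sketch of the classic kd-tree scheme: build the tree by splitting on cycling coordinates at the median, then answer a query by descending towards the query point and backtracking only into branches whose splitting plane is closer than the best match found so far. This is the textbook structure in the spirit of Friedman, Bentley, and Finkel, not a reconstruction of MathWorks' MEX file:

// Sketch of a kd-tree with branch-and-bound nearest-neighbour search.
sealed trait KdTree
case object Empty extends KdTree
case class Node(point: Array[Double], axis: Int,
                left: KdTree, right: KdTree) extends KdTree

// Build: cycle through the axes, split at the median point.
def build(pts: Seq[Array[Double]], depth: Int = 0): KdTree =
  if (pts.isEmpty) Empty
  else {
    val axis   = depth % pts.head.length
    val sorted = pts.sortBy(_(axis))
    val mid    = sorted.length / 2
    Node(sorted(mid), axis,
         build(sorted.take(mid), depth + 1),
         build(sorted.drop(mid + 1), depth + 1))
  }

def sqDist(a: Array[Double], b: Array[Double]): Double =
  a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

def nearest(tree: KdTree, q: Array[Double]): Option[Array[Double]] = {
  def go(t: KdTree, best: Option[Array[Double]]): Option[Array[Double]] =
    t match {
      case Empty => best
      case Node(p, axis, left, right) =>
        val better =
          if (best.forall(b => sqDist(q, p) < sqDist(q, b))) Some(p) else best
        val diff = q(axis) - p(axis)
        val (near, far) = if (diff < 0) (left, right) else (right, left)
        val b1 = go(near, better)
        // Pruning step: only visit the far branch if the splitting
        // plane is closer than the best match found so far.
        if (b1.forall(b => diff * diff < sqDist(q, b))) go(far, b1) else b1
    }
  go(tree, None)
}

The pruning step is what yields the O(log n) expected query time on low-dimensional data, and it is also what fails in high dimensions: the best-so-far ball almost always intersects the splitting plane, so nearly every branch gets visited, which is the degeneration described below.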
As pointed out by @SamRoberts, the core of the code is implemented in C/C++ as a MEX-function.
Note that knnsearch chooses to build a KD-tree only under certain conditions, and falls back to an exhaustive search otherwise (by naively searching all points for the nearest one).
Keep in mind that in cases of very high-dimensional data (and few instances), the algorithm degenerates and is no better than an exhaustive search. In general, once you go beyond d > 30 dimensions, the cost of searching a KD-tree rises until it is close to searching almost all the points, and it can even become worse than a brute-force search due to the overhead of building the tree.
There are other variations of the algorithm that deal with high dimensions, such as ball trees, which partition the data into a series of nested hyper-spheres (as opposed to partitioning the data along the Cartesian axes as KD-trees do). Unfortunately those are not implemented in the official Statistics Toolbox. If you are interested, here is a paper which presents a survey of available kNN algorithms.
(The original answer included an illustration of searching a kd-tree-partitioned 2D space, borrowed from the docs.)

Is there a MATLAB version of partial Schur decomposition?

I'm quite surprised not to find one in the standard library. Is there some reason it is missing, or do I just need to use a specific toolbox?
Implementing it myself would be very problematic because of the complexity of the algorithm involved.

solve multiobjective optimization: CPLEX or Matlab?

I have to solve a multiobjective problem, but I don't know if I should use CPLEX or Matlab. Can you explain the advantages and disadvantages of both tools?
Thank you very much!
This is really a question about choosing the most suitable modeling approach in the presence of multiple objectives, rather than deciding between CPLEX or MATLAB.
Multi-criteria Decision making is a whole sub-field in itself. Take a look at: http://en.wikipedia.org/wiki/Multi-objective_optimization.
Once you have decided on the approach and formulated your problem (either by collapsing your multiple objectives into a weighted one, or as a series of linear programs), either tool will do the job for you.
Since you are familiar with MATLAB, you can start by using it to solve a series of linear programs (a goal programming approach). This page by MathWorks has a few examples with step-by-step details to get you started: http://www.mathworks.com/discovery/multiobjective-optimization.html
This question is probably no longer a matter of current concern for you. However, my answer is rather universal, so let me post it here.
If solving a multiobjective problem means deriving a specific Pareto optimal solution, then you need to solve a single-objective problem obtained by scalarizing (aggregating) the objectives. The type of scalarization and the values of its parameters (if any) depend on the decision maker's preferences, e.g. how he/she/you want(s) to prioritize different objectives when they conflict with each other. Weighted sum, achievement scalarization (a.k.a. weighted Chebyshev), and lexicographic optimization are the most widespread types. They have different advantages and disadvantages, so there is no universal recommendation here.
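To make these concrete, here is a small Scala sketch evaluating the two most common scalarizations over a finite set of candidate objective vectors (all names and numbers are illustrative; in practice the scalarized problem would be handed to a single-objective solver rather than minimized over a list):

// Sketch: weighted-sum vs. achievement (weighted Chebyshev)
// scalarization for a minimization problem, over illustrative data.
val candidates = Seq(Vector(1.0, 9.0), Vector(4.0, 4.0), Vector(9.0, 1.0))
val weights    = Vector(0.5, 0.5)
val ideal      = Vector(1.0, 1.0)  // componentwise best (ideal point)

// Weighted sum: sum_i w_i * f_i. Simple, but cannot reach Pareto optima
// lying in non-convex parts of the Pareto front.
def weightedSum(f: Vector[Double]): Double =
  f.zip(weights).map { case (fi, wi) => wi * fi }.sum

// Achievement / weighted Chebyshev: max_i w_i * (f_i - ideal_i).
// Can reach every Pareto optimum, at the cost of a non-smooth objective.
def chebyshev(f: Vector[Double]): Double =
  f.zip(weights).zip(ideal).map { case ((fi, wi), zi) => wi * (fi - zi) }.max

println(candidates.minBy(weightedSum))  // Vector(4.0, 4.0)
println(candidates.minBy(chebyshev))    // Vector(4.0, 4.0)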
CPLEX is preferred in the case where (A) your scalarized problem belongs to the class solved by CPLEX (obviously), e.g. it is a [mixed integer] linear/quadratic problem, and (B) the problem is complex enough for computational time to be essential. CPLEX is specialized in this narrow class of problems and should be much faster than Matlab in complex cases.
You do not have to limit the choice of multiobjective methods to the ones offered by Matlab/CPLEX or other solvers (which are usually narrow). It is easy to formulate a scalarized problem yourself and then run an appropriate single-objective optimization (source: this is one of my main research fields; see e.g. an implementation for the class of knapsack problems). The issue boils down to finding a suitable single-objective solver.
If you want to obtain some general information about the whole Pareto optimal set, I recommend starting by deriving the nadir and the ideal objective vectors.
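As a worked detail: the ideal vector collects the componentwise best objective values (obtainable from one single-objective optimization per objective), while the nadir vector collects the componentwise worst values over the Pareto optimal set; the exact nadir is hard to compute for three or more objectives, so it is usually estimated. A tiny Scala sketch, assuming you already have a finite approximation of the Pareto set:

// Sketch: ideal and nadir objective vectors from a finite
// approximation of the Pareto optimal set (minimization). For k >= 3
// objectives this only estimates the true nadir.
def idealAndNadir(pareto: Seq[Vector[Double]]): (Vector[Double], Vector[Double]) = {
  val k = pareto.head.length
  (Vector.tabulate(k)(i => pareto.map(_(i)).min),   // best per objective
   Vector.tabulate(k)(i => pareto.map(_(i)).max))   // worst per objective
}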
If you want to derive a representation of the Pareto optimal set, then besides the population-based heuristics mentioned elsewhere, such as GAs, there are exact methods developed for specific classes of problems. Examples: a library implemented in Julia, and a recently published method.
All concepts mentioned here are described in the comprehensive book by Miettinen (1999).
Can CPLEX solve a Pareto-type multiobjective problem? All I know is that it can solve simple goal programming by defining lexicographical objectives, or it can use the weighted sum and change the weights gradually with sensitivity information to "enumerate" the Pareto front, which depends highly on the weights and looks very subjective.
You can refer here for how CPLEX solves the bi-objective case, which does not seem great.
For a true Pareto approach that includes ranking, I only know that some GA variants such as NSGA-II can do it.
A different approach would be to use a domain-specific modeling language for mathematical optimization, like YALMIP (or JuMP.jl if you'd like to give Julia a try). There you can write your optimization problem in Matlab with some extra YALMIP functionality and use CPLEX (or any other supported solver) as a backend, without being restricted to one solver.