How can I access a MATLAB (interpolated) spline from another program? - matlab

If I were to create interpolated splines from a large amount of data (about 400 charts, 500,000 values each), how could I then access the coordinates of those splines from another program quickly and efficiently?
Initially I intended to run a regression on the data and use the resulting formula in my Delphi program, but that turned out to be a bigger pain than I thought.
I am currently using Matlab but I can use another software if need be.
Edit: It is probably relevant that this data represents the empirical cumulative distribution of some other data (which I already have in a database).
Here is what one of these charts would look like.
The emphasis is on speed of access. I intend to use this data to run simulations on financial data.

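MATLAB can convert a spline into piecewise-polynomial (pp) form: fn2fm(sp,'pp') in the Curve Fitting Toolbox does the conversion, and spline in base MATLAB already returns a pp form. You can then extract the breaks and the coefficients of each polynomial piece with unmkpp, and evaluate them in another program.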
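A minimal sketch of that workflow, assuming made-up sample data and placeholder file names. The hand evaluation shows what the other program has to do: each row of coefs holds the coefficients of one piece in descending powers of the local coordinate (x - breaks(k)), not of x itself.

```matlab
% Sample data standing in for one empirical CDF (hypothetical values).
x = linspace(0, 1, 11);
y = x.^2;

pp = spline(x, y);                            % cubic spline, already in pp form
[breaks, coefs, nPieces, order] = unmkpp(pp); % coefs is nPieces-by-order

% Evaluate at xq by hand, the way an external program would.
xq = 0.37;
k  = find(breaks <= xq, 1, 'last');           % piece whose left break is <= xq
k  = min(k, nPieces);                         % xq == breaks(end) uses the last piece
t  = xq - breaks(k);                          % local coordinate within the piece
yq = polyval(coefs(k, :), t);                 % descending powers of t

% Export for the other program (file names are placeholders).
dlmwrite('breaks.csv', breaks(:), 'precision', '%.17g');
dlmwrite('coefs.csv',  coefs,     'precision', '%.17g');
```

On the Delphi side, the same three steps apply: locate the piece (a binary search over breaks), subtract the left break, and evaluate the polynomial with Horner's rule.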

If you are also familiar with C, you could use MATLAB Coder or something similar to build an intermediate library that connects your Delphi program to MATLAB. Interfacing Delphi with C code is a tad tedious but certainly possible (or it was back in the days of Delphi 7). You could even write the algorithm in MATLAB, convert it to C with MATLAB Coder, and call the generated C library from within Delphi.
Perhaps a bit overkill, but you can store your data in a database (e.g. MySQL) from MATLAB and retrieve it from Delphi.
Finally: is Delphi a real constraint? You could also run the simulations in MATLAB, which probably offers the same tools as Delphi, if not more. Afterwards you can just share the results, which I suppose is less speed-critical.

My initial guess at doing this efficiently would be to create a memory-mapped file in MATLAB using memmapfile, stuff a look-up table with your data into it, then open the same memory-mapped file in your Delphi code and read the data from there.
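A rough sketch of the MATLAB half, with a hypothetical file name and a dummy table. memmapfile maps an existing file, so the table is written once as raw doubles first.

```matlab
% Hypothetical file name and table contents.
fname = fullfile(tempdir, 'cdf_lut.bin');
lut   = linspace(0, 1, 1001)';

% Write the table once as raw little-endian doubles.
fid = fopen(fname, 'w', 'ieee-le');
fwrite(fid, lut, 'double');
fclose(fid);

% Map the file; m.Data then behaves like an ordinary column vector.
m = memmapfile(fname, 'Format', 'double', 'Writable', true);
m.Data(1) = 0;                 % writes go straight to the mapped file
firstVal  = m.Data(1);
nVals     = numel(m.Data);
clear m                        % release the mapping
```

The Delphi program would then map the same file (on Windows, presumably through CreateFileMapping and MapViewOfFile) and read it as an array of 8-byte doubles.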

The fastest option is most likely a look-up table that you save to disk and then load and use in your simulation code (although: why not run the simulation in MATLAB?)
You can evaluate the spline on a fine grid of x values using fnval, and then use the grid point closest to x to look up the cdf.
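Something along these lines, assuming a uniform grid so the nearest entry can be found by index arithmetic instead of a search. The data and grid size are made up, and ppval is used so the sketch runs without the Curve Fitting Toolbox; fnval would be called the same way.

```matlab
% Hypothetical spline standing in for one empirical CDF.
x  = linspace(0, 1, 11);
y  = x.^2;
pp = spline(x, y);

% Tabulate the spline on a fine, uniform grid.
n     = 1001;
xGrid = linspace(x(1), x(end), n);
lut   = ppval(pp, xGrid);               % fnval(pp, xGrid) with the toolbox

% Nearest-grid-point lookup in constant time.
xq   = 0.3702;
step = xGrid(2) - xGrid(1);
idx  = round((xq - xGrid(1)) / step) + 1;
idx  = min(max(idx, 1), n);             % clamp queries outside the grid
cdfq = lut(idx);
```

The lookup error is bounded by how much the curve changes over half a grid step, so the grid size trades memory for accuracy.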

Related

Using output of Matlab protected file as input in Matlab

I am aware that .p files are Matlab protected files. So, I am not trying to access them. However, I was wondering if I could use their output onto the Matlab shell as input to a Matlab program.
What I mean is the following: I have to simulate a dynamic system in Matlab using a controller. Afterwards, I need to assess its performance. This is done by the .p file. Now, the controller behaviour is defined by six distinct variables. I pretty much know their range. So, what I did was create an optimization to find the optimal coefficients. However, when I run the .p file I see that the coefficients I obtained as optimal are in fact not optimal, i.e. my cost function is biased in some way.
So, what I would like to do is to use the output of the .p file (there are always six strings with only two numerical values - so they would be easy to extract if it were a text file) to run a new optimization so that I can understand what I did wrong in my original cost function.
The alternative is finding the parameters starting from my values by trial and error, but considering there are six variables I would prefer a more mathematically pure approach.
Basically, the question is how I can read the output onto the command prompt of a Matlab .p function, and use it as input in a Matlab function.
Thanks for the help!

Solving Ax=b where A is too big to be stored in a single array

Problem: A is square, full rank, sparse and banded. It has way too many elements to be stored as a single matrix in Matlab (at least ~4.6*10^18 and ideally ~10^40, both of which exceed max array size. EDIT: A is stored as sparse, and the problem is not with limited memory but with limited number of elements). Therefore I have to store it as a collection of smaller arrays (rows/diagonals/columns/blocks).
Looking for: a way to solve Ax=b, with A given as a collection of smaller arrays. Ideally in Matlab but not a must.
Alternatively, if not in Matlab: maybe there's a program that can store and solve such a big A?
Found so far: methods if A is tri/pentadiagonal, but my A has N diagonals. Also found something about partitioning A to blocks, but couldn't find a way to then solve a linear system with these blocks.
p.s. The system is 64-bit.
Thanks everyone!
Not using Matlab would allow you to store larger arrays. ROOT is an open source framework developed at CERN that has C++ and Python interfaces and a variety of solvers. It is also capable of handling huge datasets and has a variety of visualization and analysis tools as well.
If you are willing to write C or Fortran, BLAS (Basic Linear Algebra Subprograms) and CBLAS would be good options. There are many open source and proprietary implementations of BLAS, and they should be available for most Linux/UNIX distributions. There are also plenty of examples online showing how to use the BLAS routines in C and Fortran code.
If you have access to MATLAB's Parallel Computing Toolbox together with MATLAB Distributed Computing Server, you may be able to store A as a distributed array, in other words a single array whose elements are distributed across the memories of multiple machines in a cluster. You can call MATLAB's backslash command directly on a distributed array, and MATLAB handles the parallelization for you.
I wanted to put this as a comment, but I think it is better to state it as an answer.
You have a serious problem. It is not only a problem of indexing, it is also a problem of memory: 4.6x10^18 is huge. That is 4.6 exa-elements. If you store them as single-precision reals, you need 4 x 4.6 = 18.4 exabytes of memory. To my knowledge, a computer with that much memory does not yet exist. You would need to gather the storage (hard disk, not RAM) of a significant proportion of all the computers in the world to store such a matrix. Think about it. Going to 10^40 elements is simply impractical for the time being. On your 64-bit computer, the 64-bit address space can barely address 4.6x10^18 elements: a 64-bit address (or integer) can directly index 2^64 elements, which is roughly 18x10^18. So you have to think twice.
Going back to the problem itself, there is a good chance that you can turn your matrix into an implicit operator. By implicit operator, I mean that you do not need to store it, because it has a pattern that you know how to reproduce, or because you can apply it to a vector without actually forming the matrix. If you have the matrix in hand, you are very likely in this situation, considering what I said above.
If that is the case, to solve your problem you simply need to use an iterative solver and provide a black box that does your matrix-vector multiplication. Going in other directions might be a waste of your time.
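To illustrate the idea on a toy case: MATLAB's iterative solvers accept a function handle in place of the matrix. The operator below is an assumed stand-in (a symmetric tridiagonal band with 4 on the diagonal and -1 beside it), and n is kept small enough to run; your own A would get its own black box.

```matlab
% Toy size; the point is that A itself is never formed.
n = 1e5;

% Black box that returns A*x for A = tridiag(-1, 4, -1).
applyA = @(x) 4*x - [x(2:end); 0] - [0; x(1:end-1)];

b = ones(n, 1);

% Conjugate gradients: suitable here because this A is symmetric positive definite.
[x, flag, relres] = pcg(applyA, b, 1e-10, 200);

residual = norm(applyA(x) - b) / norm(b);   % check the answer independently
```

For a non-symmetric banded A, gmres or bicgstab take a function handle in the same way. Note that x and b still have to fit in memory, so this removes the limit on the number of matrix elements, not on the number of unknowns.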

Use a dataset array without Statistics Toolbox

At my workplace I have one license of MATLAB on a virtual machine, which has Statistics Toolbox included with it. I like to use that instance of MATLAB to import csv data into dataset arrays, because of the convenience it provides.
However, I'd like to use the imported data on my local machine, which has its own license for MATLAB but (unfortunately) no Statistics Toolbox.
What is the best way to convert the dataset object to something that can be used with only base MATLAB? dataset2struct? It seems that if I'm just converting it back to a structure, I might as well just write a function that imports the data directly to a structure. Or is there any other way to work with dataset array in a MATLAB instance that lacks Statistics Toolbox?
In MATLAB R2013b (out this September, prerelease is available now), there will be something similar to a dataset array in base MATLAB, called a table data container (I haven't tried it yet, and can't be sure it will be exactly the same). There will also be a categorical array similar to the one currently in Statistics Toolbox.
Until then, there's not really a way to use a dataset array without Statistics Toolbox, and I would suggest either of the two methods you mention (personally I'd go with just using a structure throughout, as I find the convenience of dataset arrays to be overrated - but that's just my experience, yours may differ).
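For the dataset2struct route, a sketch of the round trip might look like this. The variable names, the values and the file name are invented, and the first half needs Statistics Toolbox, so it has to run on the virtual machine.

```matlab
% --- On the machine WITH Statistics Toolbox ---
% Stand-in for a dataset array imported from CSV.
ds = dataset({[1; 2; 3], 'id'}, {[0.5; 0.7; 0.9], 'score'});

% One scalar struct with one field per variable, instead of a struct array.
s = dataset2struct(ds, 'AsScalar', true);
save('imported_data.mat', 's');

% --- On the machine WITHOUT the toolbox ---
loaded    = load('imported_data.mat');
meanScore = mean(loaded.s.score);
```

Saving the dataset object itself would not help, because loading it requires the class definition that ships with the toolbox.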

Matlab versus simulation products such as ANSYS and COMSOL

This may be the wrong place to ask this, but I can't find a better place on the SE network.
I've briefly worked with both Matlab and Ansys, and from what I have learnt/can gather, Matlab is a programming environment that has functions that perform common math, visualization and analysis operations. You primarily write programs in a textual fashion (.m files) or use Simulink to generate flow graphs (model-based development). Ansys, on the other hand, is primarily a simulation environment where quite a lot can be done simply with the GUI (3D models, physics domains, configuration, display settings), and you can add equations at various points in the simulation engine in order to modify the simulation flow.
My understanding is cursory and only serves as an overview. Can anyone give me a suitable real-world comparison between Matlab and Ansys (or any other simulation product such as COMSOL) that would show when to use which, and the weaknesses of each system?
I haven't used Ansys, but Ansys is often compared with Comsol, and I've used Comsol and Matlab for years.
Matlab:
A programming language and the environment that runs it, which means it can do anything that any other programming language can do. What are its highlights, compared to other languages?
Hundreds of built-in functions to work with matrices. For example, in one project I needed to do simple matrix algebra (add, multiply, scale matrices), and also needed singular value decomposition. SVD is not something you could write in 50 lines of code, so I needed a ready-made library. At the time I used a library for Java, and wrote my own code for representing matrices and doing matrix algebra on them. That came to a few hundred lines of code. Had I used Matlab, it would have been about ten lines of code, because all of it is there. I would have needed only to type help svd to find out how to use it. However, if you don't need any of that, stay away from Matlab at all costs! There are much better languages that are free.
Great to use as a calculator that is always open on the desktop, and can do back-of-the-envelope style calculations.
Plotting graphs. Many academics recommend Matlab as the tool of choice for producing publication-quality graphics. These can be exported as PDF and imported into Inkscape for further editing. The best thing is that commands for plotting a graph could be put into a script file, and then parts of it can be changed later as needed, which can save a lot of work compared to manually drawing a graph (imagine you wanted to change the axes or symbols used to present the data points).
Personally, I also use it for curve-fitting. It has many toolboxes, one of which is a neat tool that allows me to find equations that model a set of data points.
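To give a feel for the "about ten lines" claim above, here is the kind of matrix algebra plus SVD mentioned there, on small made-up matrices:

```matlab
% Two small example matrices (arbitrary values).
A = [4 0; 3 -5];
B = [1 2; 0 1];

C = 2*A + A*B;              % scale, add and multiply in one expression

[U, S, V] = svd(C);         % singular value decomposition, C = U*S*V'
sigma     = diag(S);        % singular values, largest first
reconErr  = norm(U*S*V' - C);   % should be at round-off level
```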
Comsol:
Specialised tool for solving partial differential equations (PDEs) on complicated domains using the finite element method (FEM). This might sound obscure, but many real-world engineering needs reduce to this. Such things as:
Finding loads, stresses and strains in civil engineering structures with complicated real-world geometry (what happens when there is gusty wind blowing onto a building or bridge?)
How do currents flow in particular conductive objects?
Chemical reactions in various industrial reactors.
What is the power efficiency of a generator (magnet spinning in coil) design?
How to place aircon outlets in a nontrivially-shaped room to achieve both good temperature distribution and good efficiency?
Comsol, like any other FEM tool that can work with arbitrary equations, can do multiphysics, which means, for example, that one could solve for the chemistry of a battery, as well as the temperature and pressure, and how those feed back into the chemical reaction (speeding it up or slowing it down). Unlike a tool where you need to provide the equations yourself, Comsol already contains most of what is needed to solve most problems; it just needs to be selected and applied to the geometry, which is also built inside Comsol. Arbitrary equations can be introduced as well.
The PDEs are the mathematical descriptions of how these physical quantities behave.
Once Comsol has finished solving a problem, the data could be exported for post-processing into Matlab, which has much more versatile tools for manipulating data and making various plots.

Functional form of 2D interpolation in Matlab

I need to construct an interpolating function from a 2D array of data. The reason I need something that returns an actual function is that I need to be able to evaluate the function as part of an expression that I need to numerically integrate.
For that reason, "interp2" doesn't cut it: it does not return a function.
I could use "TriScatteredInterp", but that's heavy-weight: my grid is equally spaced (and big), so I don't need the Delaunay triangulation.
Are there any alternatives?
(Apologies for the 'late' answer, but I have some suggestions that might help others if the existing answer doesn't help them)
It's not clear from your question how accurate the resulting function needs to be (or how big 'big' is), but one approach that you could adopt is to regress the data points that you have using a least-squares or Kalman filter-based method. You'd need to do this with a number of candidate function forms and then choose the one that is 'best', for example by using a measure such as MAE or MSE.
Of course this requires some idea of what the form of the underlying function could be, but your question isn't clear as to whether you have this kind of information.
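As an illustration of the least-squares variant only: the data and the three candidate forms below are assumptions of mine, and the candidates are scored on held-out points rather than on the points used for fitting.

```matlab
% Hypothetical gridded data; replace with your own X, Y, Z.
[X, Y] = meshgrid(linspace(-1, 1, 21));
Z = 1 + 2*X - Y + 0.5*X.*Y;
x = X(:);  y = Y(:);  z = Z(:);

% Candidate functional forms, each returning a design matrix (assumed choices).
planar     = @(x, y) [ones(size(x)), x, y];
bilinear   = @(x, y) [ones(size(x)), x, y, x.*y];
quadratic  = @(x, y) [ones(size(x)), x, y, x.*y, x.^2, y.^2];
candidates = {planar, bilinear, quadratic};

% Fit on every other point and score on the rest, so that larger
% models are not favoured merely for having more terms.
fitIdx = 1:2:numel(z);
valIdx = 2:2:numel(z);
mse  = zeros(size(candidates));
coef = cell(size(candidates));
for k = 1:numel(candidates)
    basis   = candidates{k};
    coef{k} = basis(x(fitIdx), y(fitIdx)) \ z(fitIdx);          % least squares
    resid   = basis(x(valIdx), y(valIdx)) * coef{k} - z(valIdx);
    mse(k)  = mean(resid.^2);
end
[~, best] = min(mse);

% Function handle that accepts arrays, as integral2 and quad2d require.
bestBasis = candidates{best};
bestCoef  = coef{best};
f = @(xi, yi) reshape(bestBasis(xi(:), yi(:)) * bestCoef, size(xi));
```

The resulting f can be passed straight to integral2 or quad2d, or used inside a larger integrand.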
Another approach that could work (and requires no knowledge of what the underlying function might be) is the use of the fuzzy transform (F-transform) to generate line segments that provide local approximations to the surface.
The method for this would be:
Define a 2D universe that includes the x and y domains of your input data
Create a 2D fuzzy partition of this universe - choosing partition sizes that give the accuracy you require
Apply the discrete F-transform using your input data to generate fuzzy data points in a 3D fuzzy space
Pass the inverse F-transform as a function handle (along with the fuzzy data points) to your integration function
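The steps above can be sketched roughly as follows. This assumes gridded data and uniform triangular membership functions, which is only one possible choice of fuzzy partition; the data, node counts and tolerances are made up, and the tri helper relies on implicit expansion (R2016b or later).

```matlab
% Gridded sample data (hypothetical): Z(j,i) = f(x(i), y(j)), meshgrid layout.
x = linspace(0, 1, 41);
y = linspace(0, 2, 81);
[X, Y] = meshgrid(x, y);
Z = sin(pi*X) .* cos(Y);

% Steps 1-2: uniform triangular fuzzy partitions with K and L nodes.
K = 11;  L = 21;
xk = linspace(x(1), x(end), K)';  hx = xk(2) - xk(1);
yl = linspace(y(1), y(end), L)';  hy = yl(2) - yl(1);
% Membership degrees, one row per node and one column per query point.
tri = @(nodes, h, q) max(0, 1 - abs(nodes - q(:)') / h);

% Step 3: discrete (direct) F-transform, i.e. membership-weighted local averages.
A = tri(xk, hx, x);                              % K-by-numel(x)
B = tri(yl, hy, y);                              % L-by-numel(y)
F = (B * Z * A') ./ (sum(B, 2) * sum(A, 2)');    % L-by-K components

% Step 4: inverse F-transform as a function handle that accepts arrays.
fInv = @(xq, yq) reshape( ...
    sum(tri(yl, hy, yq) .* (F * tri(xk, hx, xq)), 1), size(xq));

% Loose tolerances, since the reconstruction is only piecewise smooth.
I = integral2(fInv, 0, 1, 0, 2, 'RelTol', 1e-4, 'AbsTol', 1e-6);
```

Finer partitions (larger K and L) follow the data more closely, while coarser ones smooth it more.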
If you're not familiar with the F-transform then I posted a blog a while ago about how the F-transform can be used as a universal approximator in a 1D case: http://iainism-blogism.blogspot.co.uk/2012/01/fuzzy-wuzzy-was.html
To see the mathematics behind the method and extend it to a multidimensional case, the University of Ostrava has published a PhD thesis that explains its application to various engineering problems and also provides an example of how it is constructed for the case of a 2D universe: http://irafm.osu.cz/f/PhD_theses/Stepnicka.pdf
If you want a function handle, why not define f=@(xi,yi)interp2(X,Y,Z,xi,yi) ?
It might be a little slow, but I think it should work.
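For instance, with a made-up grid whose data is linear in x and y (so that linear interpolation is exact and the integral is easy to check by hand):

```matlab
% Hypothetical equally spaced grid and data.
[X, Y] = meshgrid(linspace(0, 1, 21), linspace(0, 2, 41));
Z = X + Y;

% Anonymous function wrapping interp2; it captures X, Y and Z.
f = @(xi, yi) interp2(X, Y, Z, xi, yi);

% The handle can be used inside any larger integrand.
I = integral2(f, 0, 1, 0, 2);                       % analytically 3
J = integral2(@(u, v) f(u, v).^2, 0, 1, 0, 2);      % analytically 17/3
```

integral2 and quad2d both call the handle with matrices of query points, which interp2 handles without extra work. On a regular grid, griddedInterpolant (R2011b and later) is another option worth trying if the repeated interp2 calls turn out to be too slow.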
If I understand you correctly, you want to perform a surface/line integral of 2-D data. There are ways to do it, but maybe not the way you want. I had the exact same problem and it's annoying! The only way I solved it was by using the Surface Fitting Tool (sftool) to create a surface and then integrating it.
After you create your fit using the tool (it has a GUI as well), it will generate an sfit object, which you can then integrate in 2-D using quad2d.
I also tried your method of using interp2 and got results similar to those from the sfit object, but I had no idea how to do a numerical integration (line/surface) with the data. Creating the sfit object and then integrating it was much faster.
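From the command line, the same thing might look like the sketch below. It needs the Curve Fitting Toolbox, the data is made up, and the 'poly11' (planar) model is just an assumption that happens to match this data exactly.

```matlab
% Hypothetical samples of z = x + y on a grid.
[X, Y] = meshgrid(linspace(0, 1, 11), linspace(0, 2, 21));
Z = X + Y;

% Fit a surface; fit expects column vectors and returns an sfit object.
sf = fit([X(:), Y(:)], Z(:), 'poly11');

% Integrate the fitted surface over 0 <= x <= 1, 0 <= y <= 2.
I = quad2d(sf, 0, 1, 0, 2);     % analytically 3 for this data
```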
It was the first time I had done something like this, so I confirmed the result using a numerically evaluated line integral. According to Stokes' theorem, the surface integral and the line integral should be the same, and they did turn out to be the same.
I asked this question on the mathematics Stack Exchange: I wanted to do a line integral of 2-D data, ended up doing a surface integral, and then confirmed the answer using a line integral!