Coupling Lua and MATLAB - matlab

I am in the situation where I have a part of the codebase written in MATLAB and another part in Lua (which is used for scripting of a 3rd party program). As of now the exchange of data between them is makeshift, using the file I/O system. This evolved to be a substantial part of the code, even though that wasn't really planned.
The program is structured so that some Lua scripts are run, then some MATLAB evaluation is done, based on which some more Lua is run, and so on. It handles simulations and evaluations (scientific code) and creates new simulations based on that. It handles thousands of files and sims.
To streamline the process I started looking into possibilities to change the data I/O and make easy calls from one to another.
I wanted to hear some opinions on how to solve the problem. The optimal solution would be one where I could call everything from MATLAB or Lua, and organize the large datasets in a more consistent and accessible way.
Solutions:
1. Use the Lua C API to create bindings for the Lua modules, and add this to MATLAB as a C library. In this way I should hopefully be able to achieve my goals and reduce the system complexity.
2. Use some smarter data format for the exchange of datasets (HDF?), and some functions which read the needed workspace variables. This way the parts of the program remain independent, but the data exchange gets solved.
3. Create wrappers for Lua/MATLAB functions, so they can be called more easily. Data exchange could be done through the return parameters of the functions.
Suggestions?

I would suggest 1 or, if you aren't averse to spending a lot of money, use MATLAB Coder to generate C functions from the MATLAB side of the analysis, compile the generated code as a shared library, import the library with the LuaJIT FFI, and run everything from Lua. With this solution you would not have to change any of the MATLAB code and not much of the Lua code, thanks to LuaJIT's semantics regarding array indexing. Solution 1 is free, but it is not as efficient because of the constant marshaling between the two languages' data structures. It would also be a lot of work writing the interface. But either solution would be more efficient than file I/O.
As an easy performance boost, have you tried keeping the files in memory using a RAM disk or tmpfs?

Related

What is the obvious advantage of using AMPL?

I am doing a project using CPLEX solver, on Netbeans with Java. We have several optimization problems to solve, I have already solved one of them by coding in Java all the constraints, objective and variables, without using AMPL. However, some people in my team want to use AMPL.
Thus, as I don't want to read the whole AMPL book to find the answer, is there an obvious reason to use AMPL rather than coding all the constraints "manually"? Moreover, can AMPL be integrated into Netbeans? I did not find any documentation about that.
Is AMPL useful when the constraints need to be "flexible"? (I mean, we can't guess in advance the exact number of constraints; it depends on the parameters fixed by the user, and modularity is a highly important factor...)
I am really curious to hear about that soon!
Thanks for help
AMPL is an algebraic modeling language and quoting from that link:
One advantage of AMPL is the similarity of its syntax to the
mathematical notation of optimization problems.
For example, this can allow you to define groups of constraints without knowing in advance the dimensions of the model. And, perhaps, you can make big changes to your model more quickly. (You'll have to think about how often you will actually do that.)
However, one could argue that the "obvious advantage" of AMPL is that it supports dozens of different solvers. You can create your model and solve it with CPLEX, but then decide that you want to use a different solver (e.g., Gurobi, Xpress, etc.). On the AMPL Solvers web page, they have the following recommendation:
We recommend that you then test alternative solvers to determine which
offers the best tradeoff of price and performance for your needs.
The AMPL API web page says that there is a Java API, so that should allow you to include it in a Netbeans project, but I have no experience with that.
At the end of the day, you could also argue that these "advantages" are a matter of taste. Using the CPLEX Java API directly, as you have already done, is certainly a valid solution if it meets your requirements. It may allow you to build the model more efficiently, use solver-specific/advanced features that might not be supported by AMPL, and to have more fine-grained control over the model formulation.
You have just coded an optimisation model to optimise your company's production of widgets. Your company got a really good deal on $SOLVER1 so that's what you're using.
Over the next ten years, you improve and extend that model as your bosses throw new requirements at you. By the end of that time, you may have tens of thousands of lines of optimisation code as part of a system that, by now, is absolutely critical to your company's operations.
Your company's original licensing deal has expired, and the manufacturers of $SOLVER1 have massively increased the licensing fees, so you're now paying hundreds of thousands a year in licensing costs.
Meanwhile, the boffins at a rival company have just released a new version of $SOLVER2. It has fancy new algorithms that could solve the widget optimisation problem 20% faster and find better solutions than $SOLVER1 is giving you. It doesn't cost any more than $SOLVER1 and the performance is better.
Meanwhile, the open-source community has released $FREESOLVER. It might not be quite as powerful as the top commercial options, but it's as good as $SOLVER1 was ten years ago, and if you weren't paying $100k/year for licensing you could rent an awful lot of server time to make up for it.
...so, did you write your optimisation model on a platform that lets you switch to a new solver and take advantage of these opportunities without having to jettison ten years' worth of code?
There are huge advantages to being able to switch solvers quickly and easily. I know of one company that uses three different solvers for their work: they try two different open-source solvers, both running in the cloud, and if neither of those can find an adequate solution then they throw it to an expensive solver with smarter algorithms. The open-source solvers handle 90% of their problems, so they only have to use the commercial solver for the last 10%, which allows them to make significant savings on their licensing costs.
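The "cheap solvers first, expensive solver last" strategy described above can be sketched as a simple fallback chain. This is only an illustration in Python: the three solver functions are hypothetical stand-ins (not real solver APIs), and the assumption is that each one returns `None` when it cannot find an adequate solution.

```python
from typing import Callable, Optional, Sequence, Tuple

# A "solver" here is any callable that takes a model and returns an
# objective value, or None if it could not find an adequate solution.
Solver = Callable[[dict], Optional[float]]


def solve_with_fallback(model: dict,
                        solvers: Sequence[Tuple[str, Solver]]) -> Tuple[str, float]:
    """Try each solver in order and return (name, result) of the first success."""
    for name, solver in solvers:
        result = solver(model)
        if result is not None:
            return name, result
    raise RuntimeError("no solver found an adequate solution")


# Hypothetical stand-ins: the free solvers give up on larger models.
def free_solver_a(model: dict) -> Optional[float]:
    return model["optimum"] if model["size"] <= 100 else None


def free_solver_b(model: dict) -> Optional[float]:
    return model["optimum"] if model["size"] <= 1000 else None


def commercial_solver(model: dict) -> Optional[float]:
    return model["optimum"]


# Order matters: cheapest option first, licensed solver last.
CHAIN = [
    ("free_a", free_solver_a),
    ("free_b", free_solver_b),
    ("commercial", commercial_solver),
]
```

With a modeling layer that is solver-independent, the chain only has to swap the solver, not rewrite the model.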
One option we've discussed at my work is to use a commercial solver for mission-critical work, and open-source alternatives for applications like training or small-scale prototyping where we don't have the same requirements. That way we can minimise the number of concurrent users we need to license for the commercial solver.
(And, yes, there is still an issue of lock-in with the platform, but platforms like AMPL are significantly cheaper than a high-end commercial solver.)
Totally agree with everything that rkersh says. Also note that you should never write your model in a way that hard-codes details of your problem sizes etc. whether you write in an algebraic modelling language or through one of the more direct APIs.
Also, working with a modelling language gives you an extra level/layer of abstraction which can help, especially in sharing or explaining your model to others, comparing with a range of standard problem types etc., but I prefer the more nuts-and-bolts 'feel' of working with the more direct APIs, and almost never need (or have time & budget) to reformulate my models that deeply.
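The advice above about never hard-coding problem sizes applies whichever route you take. As a minimal sketch (plain Python, no solver library; the dictionary layout and the names `usage`, `capacity`, and `build_capacity_constraints` are made up for illustration), the number of constraints can be driven entirely by the data:

```python
def build_capacity_constraints(usage, capacity):
    """Build one constraint per resource:
    sum over products of usage[resource][product] * x[product] <= capacity[resource].
    The number of constraints follows from the data, not from the code."""
    constraints = []
    for resource, limit in capacity.items():
        constraints.append({
            "name": f"cap_{resource}",
            "coeffs": dict(usage.get(resource, {})),
            "sense": "<=",
            "rhs": limit,
        })
    return constraints


def is_feasible(plan, constraints):
    """Check a candidate production plan against every constraint."""
    for c in constraints:
        lhs = sum(coef * plan.get(var, 0.0) for var, coef in c["coeffs"].items())
        if lhs > c["rhs"]:
            return False
    return True


# Illustrative data; adding a resource adds a constraint with no code change.
usage = {
    "machine": {"widget": 2, "gadget": 1},
    "labour": {"widget": 1, "gadget": 3},
}
capacity = {"machine": 100, "labour": 90}
constraints = build_capacity_constraints(usage, capacity)
```

An algebraic modeling language expresses the same idea declaratively, as a constraint indexed over a set whose members come from the data.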
Even GPL means "general", yet newer and newer GPLs keep coming to life, so a given GPL is "more general" for some tasks than others... :-) In theory, whether the most efficient compiler is written for Pascal or for Perl should not matter, so you could write in whatever language you want and still not lose expressivity or efficiency (e.g. for C#, which is now in the same league as Java, MS writes a better compiler than the open-source equivalent).
Humans are specializing - this is why we have gotten this far :-) . It is no different when the task is to convert a business problem into a math model (aka modeling). The whole idea of having a dedicated modeling layer is that
A. you have the utmost expressivity for that particular task (aka math modeling)
B. it enforces some best practices for modeling that in a GPL you are not "forced" to follow (1. you are free to do as you like 2. it is marketed to you as such = flexibility). E.g. AMPL, GAMS, and others mix declarative code (aka model code) and procedural code (aka flow-control-like), which is not a good practice. On the other hand, e.g. separating data from an abstract model is making its way into ALL modeling languages, but interestingly enough very slowly...
C. through no. A you can maintain the code more efficiently than otherwise (contrary to API modeling - I have clients who say they turned to a modeling language because API modeling is a liability for rapid model revamp)
D. in theory you could be solver independent.
If you look around, all modeling languages are trying to maintain no. C except OPL (that's for historical reasons). But even in the case of OPL, you get constraint programming and constraint-based scheduling (besides math programming), which you don't with AMPL/GAMS, however solver-independent they are...
The $Solver1 and $Solver2 + $Freesolver comparison is a bit broken for 4 reasons:
A. open-source solvers are still very far away from commercial solvers in terms of performance when it comes to large/complex problems (LP is probably becoming the exception) - I have clients - the fastest ever sales in my memory - who tested commercial solvers after their "free ride".
B. while indeed the scenario described in relation to $Solver1 and $Solver2 seems plausible ($Solver1, the incumbent, is getting more expensive over time), we could witness just the opposite, where $Solver2 (a newcomer) actually increased its pricing 4x in 7 years and in some cases doubled it, while $Solver1 (the incumbent) has had no change.
C. mixing up modeling capabilities and solvers is a mistake. Writing models in APIs IS the way to get stuck with a solver, much more so than writing them in modeling languages. At a minimum, as the Hungarians say, "what you gain on the customs you lose on the ferry"; in other words, "freedom (i.e. flexibility) comes with using it responsibly".
D. owning a solver for development is NOT expensive at all, i.e. a company can maintain a large number of solvers (for less than $10k a company could have 4+ solvers for development) to test which is the fastest for any given model and then choose the best suited for deployment.
In addition, the solver is just one piece of the puzzle. E.g. I have a client who has disparate data sources, and it takes 8 hours to create a model and 4 hours to solve it. Would this client welcome a more efficient data handling suite, or would it insist that the solver should be faster? Modelers are too isolated from the business in most cases, and while in their mind a given model is perfect and how it is populated with data is secondary, it is the data handling that makes or breaks good performance.
I witness that API modelers are moving to modeling languages, not the other way around, for various reasons...
But as somebody wrote above, there are lots of "tastes in the game", so eventually if you feel more comfortable with a given approach then nobody can blame you for choosing it... :-) After all, it is very difficult to compare with another approach since it's almost never there in a given case... so eventually what counts is the speed from business problem to a model which solves fast in the given application context :-)
phew, it was long... but I gave all my shots... :-)
To keep it short: to illustrate the advantage/disadvantage of using AMPL, just compare using Java (AMPL) with using assembly language (CPLEX).

MATLAB pointers that access memory

Is it possible to make pointers in MATLAB that access the actual memory locations? I would like to use pointers to reference certain structures that I've made, but I want to be able to modify the structures through the pointer. I would use C++ but I can't use C++ on the servers I'm working with.
This is the best thing I've found so far, but it doesn't look like what I want.
http://www.mathworks.com/help/matlab/matlab_external/working-with-pointers.html
If it's not possible I have other ways around it, but it makes my code significantly less extensible.

How does logical indexing work?

In some high-level languages like Matlab, you can use "logical indexing" to select a whole set of entries in an array for operating on.
I understand what logical indexing is and how to use it.
Instead, I am asking:
How does it work ("behind the scenes")?
Does it not boil down to just a for-loop?
If so, why is it so much faster than for-looping?
Interpreted languages can be thought of as a variation on assembler running on an emulated core. They have stacks and commands that work in ways like the assembler without actually being the assembler. They are a virtual machine.
A for loop can be thought of as telling the system, set a value, run a sequence of tasks, and when you are done then come back and check on that value. If it is not at a threshold, then change it in a prescribed way, and go repeat those tasks and come back. In assembler you are running screaming fast, but in the "VM" not so much. Consider the demonstration between 13:50 and 15:30 of this link: (link)
This means that what appears to be a for loop, isn't actually a for loop. It is operating system interrupts, and virtualized memory. It is virus-scans in the background and megasloth bloatware.
If you had a virtual system, could you make a short-cut for addressing memory that didn't use the virtualized for loop and that was reasonably efficient? MATLAB tries to major on data processing, so it has to have very efficient ways of storing, sorting, and selecting data within its virtual machine.
MathWorks is not going to make the details of this accessible to the public. If it has a great idea, then they don't want it implemented in Python and R tomorrow. If it has a mediocre idea, then they don't want to be beaten in execution by Python and R tomorrow. Either way, making the nuts and bolts of that particular approach accessible to the public without an NDA is likely a losing proposition for them.
Bottom lines:
it's not a real "for", even for a for loop, because it's running virtually
they are opening up some of the internals of their data handling to improve usability
they aren't likely to disclose actual code because of negative business consequences
It is worth noting that vectorized code can outperform for loops while doing the same thing. This means they are likely applying more of those internals to the execution of the "sequence of tasks" to get a performance improvement.
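To make the "interpreted loop versus internal loop" contrast concrete, here is a small sketch in Python rather than MATLAB. It says nothing about MATLAB's actual internals, which the answer above notes are not public. Both functions select the elements where the mask is true; the first runs its loop step by step in the interpreter, while the second hands the same selection to `itertools.compress`, whose loop runs inside the interpreter's compiled C code in CPython.

```python
from itertools import compress


def select_with_loop(values, mask):
    """Explicit for loop: every index, test, and append is interpreted."""
    out = []
    for i in range(len(values)):
        if mask[i]:
            out.append(values[i])
    return out


def select_with_mask(values, mask):
    """Same selection, but the element-by-element loop runs in compiled code."""
    return list(compress(values, mask))


values = [3, -1, 4, -1, 5, -9]
mask = [v > 0 for v in values]  # the "logical index": one boolean per element
```

Both versions still visit every element once, so the work is a loop either way; the difference is where that loop executes.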

How to combine version control with data analysis

I do a lot of solo data analysis, using a combination of tools such as R, Python, PostgreSQL, and whatever I need to get the job done. I use version control software (currently Subversion, though I'm playing around with git on the side) to manage all of my scripts, but the data is perpetually a challenge. My scripts tend to run for a long period of time (hours, or occasionally days) to generate small or large datasets, which I in turn use as input for more scripts.
The challenge I face is in how to "rollback" what I do if I want to check out my scripts from an earlier point in time. Getting the old scripts is easy. Getting the old data would be easy if I put my data into version control, but conventional wisdom seems to be to keep data out of version control because it's so darned big and cumbersome.
My question: how do you combine and/or manage your processed data with a version control system on your code?
Subversion, and maybe other [d]vcs as well, supports symbolic links. The idea is to store raw data 'well organized' on a filesystem, while tracking the relation between 'script' and 'generated data' with symbolic links under version control.
data -> data-1.2.3
All your scripts will call load data to retrieve a given dataset, being linked through versioned symbolic link to a given dataset.
Using this approach, code and calculated datasets are tracked within one tool, without bloating your repository with binary data.
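A minimal sketch of the symlink approach, in Python, assuming a POSIX filesystem where symbolic links are available. The file name `result.txt` and the helper names are made up for illustration; only the `data -> data-1.2.3` layout comes from the answer.

```python
import os
import tempfile


def point_data_at(workdir, version):
    """Make workdir/data a symlink to the sibling directory data-<version>."""
    link = os.path.join(workdir, "data")
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    # Relative target, so the link stays valid if the working copy is moved.
    os.symlink(f"data-{version}", tmp)
    os.replace(tmp, link)  # swaps the link itself, not the directory behind it
    return link


def load_data(workdir):
    """Scripts always read through the 'data' link, never a versioned path."""
    with open(os.path.join(workdir, "data", "result.txt")) as fh:
        return fh.read()


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        # Two generated datasets living outside version control.
        for version in ("1.2.3", "1.3.0"):
            dataset = os.path.join(workdir, f"data-{version}")
            os.mkdir(dataset)
            with open(os.path.join(dataset, "result.txt"), "w") as fh:
                fh.write(f"v{version}")
        point_data_at(workdir, "1.2.3")
        first = load_data(workdir)
        point_data_at(workdir, "1.3.0")
        second = load_data(workdir)
        return first, second
```

Only the small `data` link would be committed; checking out an older revision of the scripts then also restores which dataset they point at.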

Learning PLC programming [closed]

Closed 9 years ago.
How do I learn PLC programming? Would it differ greatly for different brands of PLCs? Is ladder programming the same as PLC programming?
I did a lot of PLC programming, and now do quite a bit of .NET programming. It's very dangerous to make the switch either way, because a lot of the skills that you think should be transferrable (patterns and such) lead you very far astray.
The biggest difference that I tell people is that PC program code should be written as if other programmers are the audience, but PLC programs (ladder logic) must be written as if maintenance people are the audience. Maintenance people in most facilities (particularly manufacturing) frequently connect directly to PLCs, and in online mode they can watch the code execute graphically to figure out what's wrong.
For instance, if an output isn't turning on, they'll type the output electrical device ID into the find function of the programming software, find that output coil, and start tracing back from there looking for issues. One of the frequent mistakes that some PLC programmers make is to "map" their I/O into a structure (in PLCs, these are called user-defined types), and they use a copy instruction to move all the inputs or outputs over to the structure at once. Makes sense from a PC programming perspective, but it makes the maintenance person want to kill you. Typically the programming software provides a cross reference feature where they can specify that output coil, and it will tell them everywhere in the program that it's used. If you use a copy instruction to move 10 words of I/O into a 10 word data structure, he's got to sit there and count bits to figure out which bit in the source of the copy maps to which bit on the destination side of the copy. True, comments can help, but there's a problem with that too... PLCs store the whole program and allow you to upload the program from it in an emergency if you need to troubleshoot and you don't have a copy of the original program. The problem is that for space reasons, the PLC doesn't store the comments. So if the line is down, it's costing $5000 per minute in downtime, and a guy runs out there with a laptop, he might have to do a quick upload without comments and try to troubleshoot it. Having those copy instructions in there, wasting 10 minutes of his time, just cost the company $50,000 in downtime. These are the things you have to be aware of when writing PLC programs.
Some other tips: some PLCs have support for FOR loops. Never use them. For the same reason as above, they make the code very difficult to troubleshoot for a maintenance person. This is because if you have one piece of code in the PLC that gets scanned more than once per scan (like the contents of a loop), then when you go into online debugging mode, the software can't show you the values for each of the 10 loops that executed this scan, so you really have no idea what value you're looking at. Then you have to write all this tricky code to pull the loop values for a specific loop index out into some other tags (variables) that you can monitor. That's just one more impediment to fixing the problem in an emergency. Using a subroutine more than once per scan suffers from the same problem.
Indirect addressing (what we would call arrays) is very difficult for maintenance people to understand. It's generally OK to use it when you're dealing with recipe management (storing and retrieving values for how to build your part), but you should try to stay away from it in the control part of the program.
In PC programming, of course we seek to re-use code as much as possible. However, in PLCs and control systems, downtime is extremely expensive, and hardware is expensive. Memory is cheap, and actually PLC programmers are cheap. Therefore, it's expected that if you have 10 identical things on your machine (like conveyor drives or something) that you will have 10 different files (subroutines), one for each drive, and each drive will have its own variables associated with them: e.g. Drive1_Run, Drive2_Run, Drive3_Run, etc. This is going to feel very "wrong" to you when you come from a PC programming background, but this is all because of the points I've made above. When you're in a downtime situation, and someone says that Drive 3 isn't working, you crack open the laptop, go to the file for Drive 3 and you look at the Run output rung. You start troubleshooting from there, while the program is executing. There's no breakpoints (the program never stops).
Good luck on your endeavors. I wrote up some more insights from my years of programming PLCs, if you want to check them out.
You can learn PLC programming from various sources on the internet, one of which is this (wikibooks) or this
The program that you write will be pretty much the same across different brands of PLCs for LLDs (Ladder Logic Diagrams) unless you use PLC-specific functions. But there will be many more differences if you use a language like IL (Instruction List). And once you have written the program, the format of storage and execution differs widely across brands.
Ladder logic is one of the 5 programming languages for PLCs, the others being FBD (Function block diagram), ST (Structured text, similar to the Pascal programming language), IL (Instruction list, similar to assembly language) and SFC (Sequential function chart). These are just various representations of the programming language, various flavours if you will. But usually, a given brand supports only one of these. In the USA, LLDs are widely used, while in Europe, ILs are more popular.
Ladder, often called LD, is one of several language styles defined in the IEC 61131-3 automation programming standard. Others are SFC (sequential function chart), FBD (function block diagram), ST (structured text), and IL (instruction list). IL is similar to assembler and very few people use it. ST is a text-based programming language much like early versions of BASIC. It is not often used either. LD is designed to resemble relay contacts from an electrical control panel (which many PLCs replaced). FBD looks more like a circuit diagram. SFC is basically a flow chart.
Some PLCs support all of them, others only some, or even just one. While LD is the most common, FBD and SFC are gaining popularity.
Different brands do use slightly different programming languages. They are usually similar enough that once you understand one brand, you can work with any of them, but you cannot directly take code from one PLC and use it on another brand.
The answers given so far are pretty on target. One thing I found is that PLCs have a split personality when it comes to their languages and setup. Their core design is to give the electrical guys a flexible means of setting up control logic for their overall design. PLCs are basically a bunch of inputs and a bunch of outputs, and how they are connected is controlled by the software you load into the device.
One emphasis of the languages used for PLCs is that they are accessible to people coming from an electrical background. So the idioms and structures seem counterintuitive to a person used to high-level languages or even assembly languages. Ladder Logic, for example, is very accessible for electrical folks.
In recent years, however, PLCs have been supporting a multitude of languages for maximum flexibility. Still, in my opinion the handful of PLCs I worked with are very lacking as a programming environment. Simple things like assigning variable names to memory locations are often not designed into the language being used. The ones that are easy to work with are often not the most cost-effective for the job.
Despite these handicaps, they are excellent for simplifying complex electrical systems. If you are working with others on a project, you will find that your knowledge of programming will help the project solve thorny problems. I was able to take a 100-rung ladder logic program and rewrite it in a third of the rungs. Once I had learned the ladder logic language, I was able to implement various optimizations that reduced the complexity of the program.
One tip is that you will need to learn about latching. Sometimes you will need to store or hold some output, and unless you latch it the result will disappear on the next cycle. Once you understand the issue it becomes clear, but at first it was a great source of frustration for me.
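The latching tip can be illustrated with a small simulation, written in Python rather than ladder logic. The classic start/stop seal-in rung is assumed as the example, and the names `start_pb`, `stop_pb`, and `motor` are made up for illustration. Each tuple in `inputs` is the state of the two push buttons during one scan.

```python
def scan(start_pb, stop_pb, motor):
    """One scan of a seal-in rung: the output holds itself on until stop is pressed."""
    return (start_pb or motor) and not stop_pb


def run_scans(inputs, latched=True):
    """Run the rung once per scan and record the motor output after each scan."""
    motor = False
    history = []
    for start_pb, stop_pb in inputs:
        if latched:
            motor = scan(start_pb, stop_pb, motor)
        else:
            # Without the seal-in branch the output only follows the button.
            motor = start_pb and not stop_pb
        history.append(motor)
    return history


# The start button is pressed during scan 2 only; stop is pressed in scan 5.
inputs = [
    (False, False),
    (True, False),
    (False, False),
    (False, False),
    (False, True),
    (False, False),
]
```

With the latch, the motor stays on from scan 2 until the stop button is seen in scan 5; without it, the output is on for scan 2 only and "disappears" on the next cycle, as described above.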
PLC programming should be viewed as implementation activity of PLC software engineering output, unless you are using PLC as purely part of alternative components to mechanical or electrical solutions.
With this as a basis, the PLC programming environment is typically IEC61131 driven: guaranteed cycle time, "pre-emptive" realtime, no need to handle realtime-OS-related issues, continuous code scanning, no program pointer, and a different concept from the task-spawning kind of multi-tasking on a typical computer. Code execution is naturally atomic, so there is no need to use monitors between tasks.
The languages differ in how closely your code can mirror the logic model you want to implement.
Ladder has its basic concept in the electrical power flow interlocking style. Code resolution within a single network uses either horizontal or vertical scanning (you can find resources on this topic from the manufacturer or other sites). If your code is meant to resolve in a single scan and is within one network, some unexpected behavior can be due to the scanning type (it is important to remember that ladder is only an emulation of an electrical circuit; it is still sequential in execution).
FBD or function block diagram was electronic signal flow but today can be data flow depending on type of PLC. FBD shows clearer execution sequence quite similar to horizontal scanning ladder in scanning sequence. Today, FBD is typically used as container for object function blocks, although dependency implementation and visual similarity to process model is dependent on PLC type.
Literal is very similar to BASIC, but syntax only; execution is still scan-through. Literal language is good for mathematical calculation. For high level implementation, methods or derivation of attributes within object can be easier using Literal. State machine programming using English-like state representation or constants makes program very readable.
Statement list looks similar to assembly mnemonics but again execution is still scan-through and not program pointer. It is strong in bit operation and parenthesis-styled discrete logics. It can be a very efficient language to use with proper structuring and commenting.
SFC or sequential function chart is a complementary language for sequence implementation. SFC has inherent rules on action block activation, state transitions, parallel sequence activation and merging. However, complex exception branching or concurrent action management can make implementation complicated and the flow chart difficult to read.
PLC system management on IO handling, communication, hot-standby is hardware configuration effort, and is product dependent. Generally, can be treated separately from software engineering. However, data related to PLC system management are of "located" (independent data addressing area) type, good data modeling approach in software engineering can help in manageability of system data.
The Online PLC Simulator may be useful.
You can use Structured Text (ST), which consists of a series of instructions that, as in high-level languages, can be executed conditionally ("IF..THEN..ELSE") or in loops ("WHILE..DO").
I find it better than Ladder as it is close to a standard programming language.
I did a little PLC programming at university. It seemed to me to be one level lower than assembly, but the device we were using wasn't the newest one.
I believe you need to have a PLC driver, but I would first look for simulators and read more about it before buying.
Allen-Bradley has a free DOS-based software PLC, specifically for training. You can probably find it if you go to their site, or Google it. It's used to teach PLC programming in schools.
For a beginner trying to learn ladder logic, the best way is to attend free online training at http://plcs.net
PLC is the term used for the devices that use ladder logic. The devices that are programmed in more typical programming languages are generally called microcontrollers. However, there are some of us that on occasion lump them all under the PLC name. :-) Not sure how much ladder logic varies, but microcontroller code can vary significantly.