Flow control instructions in a virtual machine

I've been implementing my own scripting language + virtual machine from scratch for a small experiment. A script reader parses the script and translates it to a stream of instructions that a runtime engine will execute.
At the beginning I didn't think about it, but now I'd like to include flow control (loops, branching, etc.). I'm not well versed in language theory and have just looked at some examples for inspiration.
But both x86 and the Java Virtual Machine have a plethora of instructions for flow control. In x86 there are plenty of instructions that jump based on the state of flags, and other instructions that manipulate the relevant flags one way or another. In Java there seem to be 16 instructions that make some sort of comparison and a conditional jump.
This might be efficient or motivated by hardware-specific reasons, but it's not what I'm looking for.
I'm looking for a lean, elegant solution to flow control that only requires a few dedicated instructions and isn't too complicated to implement and maintain.
I'm pretty confident I could come up with something that works but I'd rather improve my knowledge instead of reinventing the wheel. Any explanations or links to relevant material are very welcome!

Generally, the minimum primitives required for flow control are:
unconditional jump
conditional jump
Of these, the conditional jump is the complex one, and at a minimum it needs to support the following atomically:
test a binary variable/flag
if the flag is set, cause instruction execution to jump to some specified location
if the flag is unset, allow instruction execution to continue uninterrupted
However with such a primitive conditional jump, you would need ways to set that binary variable/flag to the appropriate value for every type of boolean expression that could be used in the flow control structures of your language.
This would therefore either lead to the need for various primitives of varying complexity for setting the binary variable/flag, or the need to emit complex sequences of instructions to get the desired effect.
The other alternative is to introduce more complex conditional jump primitives.
Generally there will be a trade-off between the number and complexity of each of: conditional jump primitives; condition (variable/flag) setting primitives; emitted instructions.
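To make the trade-off concrete, here is a minimal sketch of a stack-based interpreter whose only flow-control instructions are an unconditional JMP and a conditional JZ. The opcode names, the tuple encoding and the choice of "top of stack is zero" as the flag are all assumptions made for illustration, not something from the question. The point is that any comparison can leave its result on the stack, so one conditional jump is enough, at the cost of emitting a few more instructions per condition.

```python
# Minimal stack VM sketch: the only control-flow instructions are JMP and JZ.
# Opcode names and the (opcode, operand) encoding are hypothetical.

def run(program):
    """Execute a list of (opcode, operand) tuples; return the values printed."""
    stack, output, pc = [], [], 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1                       # default: fall through to the next instruction
        if op == "PUSH":
            stack.append(arg)
        elif op == "DUP":
            stack.append(stack[-1])
        elif op == "SUB":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == "PRINT":
            output.append(stack[-1])  # record the top of stack without popping it
        elif op == "JMP":             # unconditional jump
            pc = arg
        elif op == "JZ":              # conditional jump: pop the flag, jump if it is zero
            if stack.pop() == 0:
                pc = arg
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return output

# while (n != 0) { print n; n = n - 1 }
countdown = [
    ("PUSH", 3),      # 0: n = 3
    ("DUP", None),    # 1: loop test: copy n so JZ can consume the copy
    ("JZ", 7),        # 2: leave the loop when n == 0
    ("PRINT", None),  # 3: print n
    ("PUSH", 1),      # 4
    ("SUB", None),    # 5: n = n - 1
    ("JMP", 1),       # 6: back to the loop test
    ("HALT", None),   # 7
]

print(run(countdown))
```

Richer conditions (less-than, equality and so on) would be extra instructions that push 0 or 1, which is the "condition setting primitives" side of the trade-off.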


Convert MIndiGolog fluents to the IndiGolog causes_val format

I am using Eclipse (version: Kepler Service Release 1) with the Prolog Development Tool (PDT) plug-in for Prolog development in Eclipse. I used these installation instructions: http://sewiki.iai.uni-bonn.de/research/pdt/docs/v0.x/download.
I am working with Multi-Agent IndiGolog (MIndiGolog) 0 (the preliminary Prolog version of MIndiGolog), which I downloaded from here: http://www.rfk.id.au/ramblings/research/thesis/. I want to use MIndiGolog because it represents time and duration of actions very nicely (I want to do temporal planning), and it supports planning for multiple agents (including concurrency).
MIndiGolog is a high-level programming language based on situation calculus. Everything in the language is exactly according to situation calculus. This however does not fit with the project I'm working on.
Another high-level programming language, Incremental Deterministic (Con)Golog (IndiGolog) (download from here: http://sourceforge.net/p/indigolog/code/ci/master/tree/), which is also written in Prolog, is also (loosely) based on situation calculus, but uses fluents in a very different way. It makes use of causes_val predicates to denote which action changes which fluent in what way, and it does not include the situation in the fluent!
However, this is what the rest of the team actually wants. I need to rewrite MIndiGolog so that it is still an offline planner, with the nice representation of time and duration of actions, but with the causes_val predicate of IndiGolog to change the values of the fluents.
I find this extremely hard to do, as my knowledge of Prolog and of situation calculus only covers the basics, but they see me as the expert. I feel like I'm in over my head and could use all the help and/or advice I can get.
I already removed the situations from my fluents, made a planning domain with causes_val predicates, and tried to add IndiGolog code into MIndiGolog, but with no luck. Running the planner just returns "false." I can make little sense of the trace, even when I use the GUI-tracer version of the SWI-Prolog debugger or when I try to place spy points as strategically as possible.
Thanks in advance,
Best, PJ
If you are still interested (sounds like you might not be): this isn't actually very hard.
If you look at Reiter's book, you will find that causes_val clauses are just effect axioms, while the fluents that mention the situation are usually successor-state axioms. There is a deterministic way to convert from the former to the latter, and the correct interpretation of the causes_val clauses is done in the implementation of regression. This is always the same, and you can just copy that part of the Prolog code from IndiGolog to your flavor.

Why doesn't a primitive `call-with-current-continuations` exist in Common Lisp

Scheme offers a primitive call-with-current-continuation, commonly abbreviated call/cc, which has no equivalent in the ANSI Common Lisp specification (although there are some libraries that try to implement it).
Does anybody know why the decision was made not to include a similar primitive in the ANSI Common Lisp specification?
Common Lisp has a detailed file compilation model as part of the standard language. The model supports compiling the program to object files in one environment, and loading them into an image in another environment. There is nothing comparable in Scheme. No eval-when, or compile-file, load-time-value or concepts like what is an externalizable object, how semantics in compiled code must agree with interpreted code. Lisp has a way to have functions inlined or not to have them inlined, and so basically you control with great precision what happens when a compiled module is re-loaded.
By contrast, until a recent revision of the Scheme report, the Scheme language was completely silent on the topic of how a Scheme program is broken into multiple files. No functions or macros were provided for this. Look at R5RS, under 6.6.4 System Interface. All that you have there is a very loosely defined load function:
optional procedure: (load filename)
Filename should be a string naming an existing file containing Scheme source code. The load procedure reads expressions and definitions from the file and evaluates them sequentially. It is unspecified whether the results of the expressions are printed. The load procedure does not affect the values returned by current-input-port and current-output-port. Load returns an unspecified value.
Rationale: For portability, load must operate on source files. Its operation on other kinds of files necessarily varies among implementations.
So if that is the extent of your vision about how applications are built from modules, and all details beyond that are left to implementors to work out, of course the sky is the limit regarding inventing programming language semantics. Note in particular the Rationale: if load is defined as operating on source files (with all else being a bonus courtesy of the implementors) then it is nothing more than a textual inclusion mechanism like #include in the C language, and so the Scheme application is really just one body of text that is physically spread into multiple text files pulled together by load.
If you're thinking about adding any feature to Common Lisp, you have to think about how it fits into its detailed dynamic loading and compilation model, while preserving the good performance that users expect.
If the feature you're thinking of requires global, whole-program optimization (whereby the system needs to see the structural source code of everything) in order that users' programs not run poorly (and in particular programs which don't use that feature) then it won't really fly.
Specifically with regard to the semantics of continuations, there are issues. In the usual semantics of a block scope, once we leave a scope and perform cleanup, that is gone; we cannot go back to that scope in time and resume the computation. Common Lisp is ordinary in that way. We have the unwind-protect construct which performs unconditional cleanup actions when a scope terminates. This is the basis for features like with-open-file which provides an open file handle object to a block scope and ensures that this is closed no matter how the block scope terminates. If a continuation escapes from that scope, that continuation no longer has a valid file. We cannot simply not close the file when we leave the scope because there is no assurance that the continuation will ever be used; that is to say, we have to assume that the scope is in fact being abandoned forever and clean up the resource in a timely way. The band-aid solution for this kind of problem is dynamic-wind, which lets us add handlers on entry and exit to a block scope. Thus we can re-open the file when the block is restarted by a continuation. And not only re-open it, but actually position the stream at exactly the same position in the file and so on. If the stream was half way through decoding some UTF-8 character, we must put it into the same state. So if Lisp got continuations, either they would be broken by various with- constructs that perform cleanup (poor integration) or else those constructs would have to acquire much more hairy semantics.
There are alternatives to continuations. Some uses of continuations are non-essential. Essentially the same code organization can be obtained with closures or restarts. Also, there is a powerful language/operating-system construct that can compete with the continuation: namely, the thread. While continuations have aspects that are not modeled nicely by threads (and not to mention that they do not introduce deadlocks and race conditions into the code) they also have disadvantages compared to threads: like the lack of actual concurrency for utilization of multiple processors, or prioritization. Many problems expressible with continuations can be expressed with threads almost as easily. For instance, continuations let us write a recursive-descent parser which looks like a stream-like object which just returns progressive results as it parses. The code is actually a recursive descent parser and not a state machine which simulates one. Threads let us do the same thing: we can put the parser into a thread wrapped in an "active object", which has some "get next thing" method that pulls stuff from a queue. As the thread parses, instead of returning a continuation, it just throws objects into a queue (and possibly blocks for some other thread to remove them). Continuation of execution is provided by resuming that thread; its thread context is the continuation. Not all threading models suffer from race conditions (as much); there is for instance cooperative threading, under which one thread runs at a time, and thread switches only potentially take place when a thread makes an explicit call into the threading kernel. Major Common Lisp implementations have had light-weight threads (typically called "processes") for decades, and have gradually moved toward more sophisticated threading with multiprocessing support.
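The "active object" arrangement described above can be sketched roughly as follows, in Python rather than Lisp so that it runs with only the standard library. The class name, the toy grammar (whitespace-separated atoms and parentheses) and the get_next method are all made up for illustration. The recursive-descent code hands each atom to a one-slot queue from deep inside the recursion and blocks there until the consumer asks for it, so the thread's own stack plays the role the continuation would play.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the result stream


class ActiveParser:
    """Runs a recursive-descent parser in its own thread (hypothetical sketch).

    Input is assumed to be whitespace-separated tokens: atoms, "(" and ")".
    """

    def __init__(self, text):
        self._tokens = text.split()
        self._pos = 0
        # One slot only: the parser blocks in put() until the consumer takes the item.
        self._queue = queue.Queue(maxsize=1)
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        try:
            while self._pos < len(self._tokens):
                self._parse_expr(depth=0)
        finally:
            self._queue.put(_DONE)  # always signal the end, even on malformed input

    def _parse_expr(self, depth):
        # expr := atom | "(" expr* ")"   (ordinary recursion, no hand-written state machine)
        tok = self._tokens[self._pos]
        self._pos += 1
        if tok == "(":
            while self._tokens[self._pos] != ")":
                self._parse_expr(depth + 1)
            self._pos += 1  # consume ")"
        else:
            # Emit a result mid-recursion; the suspended thread is the "continuation".
            self._queue.put((depth, tok))

    def get_next(self):
        """Return the next (depth, atom) pair, or None once the input is exhausted."""
        item = self._queue.get()
        if item is _DONE:
            self._queue.put(_DONE)  # keep the sentinel so later calls also return None
            return None
        return item


parser = ActiveParser("a ( b ( c ) ) d")
results = [parser.get_next() for _ in range(5)]
print(results)
```

Because producer and consumer only ever meet at the queue, this particular arrangement behaves much like the cooperative case mentioned above: there is no shared mutable state for the two threads to race on.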
The support for threads lessens the need for continuations, and is a greater implementation priority because language run-times without thread support are at technological disadvantage: inability to take full advantage of the hardware resources.
This is what Kent M. Pitman, one of the designers of Common Lisp, had to say on the topic: from comp.lang.lisp
The design of Scheme was based on using function calls to replace most common control structures. This is why Scheme requires tail-call elimination: it allows a loop to be converted to a recursive call without potentially running out of stack space. And the underlying approach of this is continuation-passing style.
Common Lisp is more practical and less pedagogic. It doesn't dictate implementation strategies, and continuations are not required to implement it.
Common Lisp is the result of a standardization effort on several flavors of practical (applied) Lisps (thus "Common"). CL is geared towards real life applications, thus it has more "specific" features (like handler-bind) instead of call/cc.
Scheme was designed as a small, clean language for teaching CS, so it has the fundamental call/cc, which can be used to implement other tools.
See also Can call-with-current-continuation be implemented only with lambdas and closures?

VHDL beta function

A friend of mine needs to implement some statistical calculations in hardware.
She wants it to be accomplished using VHDL.
(cross my heart, I haven't written a line of code in VHDL and know nothing about its subtleties)
In particular, she needs a direct analogue of MATLAB's betainc function.
Is there a good package around for doing this?
Any hints on the implementation are also highly appreciated.
If it's not a good idea at all, please tell me about it as well.
Thanks a lot!
There isn't a core available that performs an incomplete beta function in the Xilinx toolset. I can't speak for the other toolsets available, although I would doubt that there is such a thing.
What Xilinx does offer is a set of signal processing blocks, like multipliers, adders and RAM Blocks (amongst other things, filters, FFTs), that can be used together to implement various custom signal transforms.
In order for this to be done, there needs to be a complete understanding of the inner workings of the transform to be applied.
A good first step is to implement the function "manually" in MATLAB as a proof of concept:
Instead of using the built-in function in MATLAB, your friend can try to implement the function just using fundamental operators like multipliers and adders.
The results can be compared with those produced by the built-in function for verification.
The concept can then be moved to VHDL using the building blocks that are provided.
Doing this for the incomplete beta function isn't something for the faint-hearted, but it can be done.
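As a rough idea of what such a "manual" proof of concept could look like, here is a sketch in Python (used only because it is easy to run; the same structure carries over to MATLAB). It evaluates the regularized incomplete beta function, which is what I understand MATLAB's betainc to return, with the continued-fraction method described in Numerical Recipes. The function names and tolerances are my own choices. The inner loop uses only multiplies, adds and divides, but note that the prefactor still relies on log, exp and lgamma, which would each need their own hardware implementation.

```python
import math


def _beta_continued_fraction(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300  # guard against division by zero
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            return h
    raise ArithmeticError("continued fraction did not converge")


def betainc(x, a, b):
    """Regularized incomplete beta function I_x(a, b); argument order mirrors MATLAB."""
    if not 0.0 <= x <= 1.0 or a <= 0.0 or b <= 0.0:
        raise ValueError("need 0 <= x <= 1 and a, b > 0")
    if x == 0.0 or x == 1.0:
        return x
    # Prefactor x^a (1-x)^b / B(a, b), computed in log space.
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where the fraction converges faster.
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b


print(betainc(0.7, 2.0, 2.0))
```

The closed forms I_x(1,1) = x and I_x(2,2) = 3x^2 - 2x^3 make handy reference values before comparing against the built-in function over a grid of inputs.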
As far as I know, there is no tool that interfaces VHDL with MATLAB directly.
But interfacing VHDL with C is fairly easy, so if you can implement your code (MATLAB's betainc function) in C, then it can be done easily with the FLI (foreign language interface).
If you are using ModelSim, the link below may be helpful.
First of all a word of warning, if you haven't done any VHDL/FPGA work before, this is probably not the best place to start. With VHDL (and other HDL languages) you are basically describing hardware, rather than a sequential line of commands to execute on a processor (as you are with C/C++, etc.). You thus need a completely different skill- and mind-set when doing FPGA-development. Just because something can be written in VHDL, it doesn't mean that it actually can work in an FPGA chip (that it is synthesizable).
With that said, Xilinx (one of the major manufacturers of FPGA chips and development tools) does provide the System Generator package, which interfaces with MATLAB and can automatically generate code for FPGA chips from it. I haven't used it myself, so I'm not at all sure if it's usable in your friend's case - but it's probably a good place to start.
The System Generator User guide (link is on the previously linked page) also provides a short introduction to FPGA chips in general, and in the context of using it with MATLAB.
You COULD write it yourself. However, the incomplete beta function is an integral. For many values of the parameters (as long as both are greater than 1) it is fairly well behaved. However, when either parameter is less than 1, a singularity arises at an endpoint, making the problem a bit nasty. The point is, don't write it yourself unless you have a solid background in numerical analysis.
Anyway, there are surely many versions in C available. Netlib must have something, or look in Numerical Recipes. Or compile it from MATLAB. Then link it in as nav_jan suggests.
As an alternative to VHDL, you could use MyHDL to write and test your beta function - that can produce synthesisable (i.e. can go into an FPGA chip) VHDL (or Verilog, as you wish) out of the back end.
MyHDL is an extra set of modules on top of Python which allow hardware to be modelled, verified and generated. Python will be a much more familiar environment to write validation code in than VHDL (which is missing many of the abstract data types you might take for granted in a programming language).
The code under test will still have to be written with a "hardware mindset", but that is usually a smaller piece of code than the test environment, so in some ways less hassle than figuring out how to work around the verification limitations of VHDL.

What are "not so well defined problems" that LISP is supposed to solve?

Most people agree that LISP helps to solve problems that are not well defined, or that are not fully understood at the beginning of the project.
"Not fully understood" might indicate that we don't know what problem we are trying to solve, so the developer refines the problem domain continuously. But isn't this process language independent?
All this refinement does not take away the need for, say, developing algorithms/solutions for the final problem that does need to be solved. And that is the actual work.
So, I'm not sure what advantage LISP provides if the developer has no idea where he's going i.e. solving a problem that is not finalised yet.
Lisp (not "LISP") has a number of advantages when you're facing problems that are not well-defined. First of all, you have a REPL where you can quickly experiment -- that helps in sketching out quick functions and trying to play with them, leading to a very rapid development cycle. Second, having a dynamically typed language works well in this context too: with a statically typed language you need to "design more" before you begin, and changing the design leads to changing more code -- in contrast, with Lisps you just write the code and the data it operates on can change as needed. In addition to these, there are the usual benefits of a functional language -- one with first class lambda functions, etc. (e.g., garbage collection).
In general, these advantages have been finding their way into other languages. For example, JavaScript has everything that I listed so far. But there is one more advantage for Lisps that is still not present in other languages -- macros. This is an important tool to use when your problem calls for a domain specific language. Basically, in Lisp you can extend the language with constructs that are specific to your problem -- even if these constructs lead to a completely different language.
Finally, you need to plan ahead for what happens when the code becomes more than a quick experiment. In this case you want your language to cope with "growing scripts into applications" -- for example, having a module system means that you can get a more "serious" application. For example, in Racket you can get your solution separated into such modules, where each can be written in its own language -- it even has a statically typed language which makes it possible to start with a dynamically typed development cycle and once the code becomes more stable and/or big enough that maintenance becomes difficult, you can switch some modules into the static language and get the usual benefits from that. Racket is actually unique among Lisps and Schemes in this kind of support, but even with others the situation is still far more advanced than in non-Lisp languages.
In AI (Artificial Intelligence) historically Lisp was seen as the AI assembly language. It was used to build higher-level languages which help to work with the problem domain in a more direct way. Many of these domains need a lot of 'knowledge' for finding usable answers.
A typical example is an expert system for, say, oil exploration. The expert system gets as inputs (geological) observations and gives information about the chances to find oil, what kind of oil, in what depths, etc. To do that it needs 'expert knowledge' how to interpret the data. When you start such a project to develop such an expert system it is typically not clear what kind of inferences are needed, what kind of 'knowledge' experts can provide and how this 'knowledge' can be written down for a computer.
In this case one typically develops new languages on top of Lisp and you are not working with a fixed predefined language.
As an example see this old paper about Dipmeter Advisor, a Lisp-based expert system developed by Schlumberger in the 1980s.
So, Lisp does not solve any problems. But it was originally used to solve problems that are complex to program, by providing new language layers which should make it easier to express the domain 'knowledge', rules, constraints, etc. to find solutions which are not straight forward to compute.
The "big" win with a language that allows for incremental development is that you (typically) have a read-eval-print loop (or "listener" or "console") that you interact with, plus you tend to not need to lose state when you compile and load new code.
The ability to keep state around from test run to test run means that lengthy computations that are untouched by your changes can simply be kept around instead of being re-computed.
This allows you to experiment and iterate faster. Being able to iterate faster means that exploration is less of a hassle. This is very useful for exploratory programming, which is typical when dealing with less well-defined problems.

How to write a X86_64 _assembler_?

Goal: I want to write an X86_64 assembler. Note: marked as community wiki
Background: I'm familiar with C. I've written MIPS assembly before. I've written some x86 assembly. However, I want to write an x86_64 assembler -- it should output machine code that I can jump to and start executing (like in a JIT).
Question is: what is the best way to approach this? I realize this problem looks kind of large to tackle. I want to start out with a basic minimum set:
Load into register
Arithmetic ops on registers (just integers is fine, no need to mess with FPU yet)
Just a basic set to make it Turing complete. Anyone done this? Suggestions / resources?
An assembler, like any other "compiler", is best written as a lexical analyser feeding into a language grammar processor.
Assembly language is usually easier than the regular compiled languages since you don't need to worry about constructs crossing line boundaries and the format is usually fixed.
I wrote an assembler for a (fictional) CPU some two years ago for educational purposes and it basically treated each line as:
optional label (e.g., :loop).
operation (e.g., mov).
operands (e.g., ax,$1).
The easiest way to do it is to ensure that tokens are easily distinguishable.
That's why I made the rule that labels had to begin with : - it made the analysis of the line so much easier. The process for handling a line was:
strip off comments (first ; outside a string to end of line).
extract label if present.
first word is then the operation.
rest are the operands.
You can easily insist that different operands have special markers as well, to make your life easier. All this is assuming you have control over the input format. If you're required to use Intel or AT&T format, it's a little more difficult.
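The line-handling steps listed above could be sketched like this. It is written in Python purely for brevity, the function name split_line is made up, and it assumes the input format described here (labels start with ":", comments start with ";", strings use double quotes, operands are separated by commas). It deliberately ignores commas inside string operands.

```python
def split_line(line):
    """Split one source line into (label, operation, operands). Hypothetical helper."""
    # 1. Strip the comment: the first ';' that is not inside a double-quoted string.
    in_string = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_string = not in_string
        elif ch == ';' and not in_string:
            line = line[:i]
            break

    words = line.split(None, 1)  # first word plus the untouched remainder

    # 2. Extract the label if present (labels are marked with a leading ':').
    label = None
    if words and words[0].startswith(':'):
        label = words[0][1:]
        words = words[1].split(None, 1) if len(words) > 1 else []

    # 3. The first remaining word is the operation.
    operation = words[0].lower() if words else ''

    # 4. Everything after it is the comma-separated operand list.
    rest = words[1].strip() if len(words) > 1 else ''
    operands = [part.strip() for part in rest.split(',')] if rest else []
    return label, operation, operands


print(split_line(':loop mov ax,$1 ; copy the constant'))
```

A line that holds only a comment or a label comes back with an empty operation, which a dispatcher can treat as "emit nothing".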
The way I approached it is that there was a simple per-operation function that got called (e.g., doJmp, doCall, doRet) and that function decided on what was allowed in the operands.
For example, doCall only allows a numeric or label, doRet allows nothing.
For example, here's a code segment from the encInstr function:
    // Dispatch on the mnemonic; each hlpr... helper returns the encoded bytes.
    private static MultiRet encInstr(
        boolean ignoreVars,
        String opcode,
        String operands)
    {
        if (opcode.length() == 0) return hlprNone(ignoreVars);
        if (opcode.equals("defb")) return hlprByte(ignoreVars,operands);
        if (opcode.equals("defbr")) return hlprByteR(ignoreVars,operands);
        if (opcode.equals("defs")) return hlprString(ignoreVars,operands);
        if (opcode.equals("defw")) return hlprWord(ignoreVars,operands);
        if (opcode.equals("defwr")) return hlprWordR(ignoreVars,operands);
        if (opcode.equals("equ")) return hlprNone(ignoreVars);
        if (opcode.equals("org")) return hlprNone(ignoreVars);
        if (opcode.equals("adc")) return hlprTwoReg(ignoreVars,0x0a,operands);
        if (opcode.equals("add")) return hlprTwoReg(ignoreVars,0x09,operands);
        if (opcode.equals("and")) return hlprTwoReg(ignoreVars,0x0d,operands);
        // ... (excerpt ends here)
    }
The hlpr... functions simply took the operands and returned a byte array containing the instructions. They're useful when many operations have similar operand requirements, such as adc, add and and all requiring two register operands in the above case (the second parameter controlled what opcode was returned for the instruction).
By making the types of operands easily distinguishable, you can check what operands are provided, whether they are legal and which byte sequences to generate. The separation of operations into their own functions provides for a nice logical structure.
In addition, most CPUs follow a reasonably logical translation from opcode to operation (to make the chip designers' lives easier), so there will be very similar calculations on all opcodes that allow, for example, indexed addressing.
In order to properly create code for a CPU that allows variable-length instructions, you're best off doing it in two passes.
In the first pass, don't generate code, just generate the lengths of instructions. This allows you to assign values to all labels as you encounter them. The second pass will generate the code and can fill in references to those labels since their values are known. The ignoreVars in that code segment above was used for this purpose (byte sequences of code were returned so we could know the length but any references to symbols just used 0).
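A stripped-down sketch of that two-pass scheme, in Python for brevity: the three-instruction encoding (nop, defb, and a 3-byte jmp with a 16-bit little-endian address) is invented for the example and is not the instruction set of the assembler described above. The ignore_vars flag plays the same role as ignoreVars: in the first pass unknown symbols encode as 0, so only the lengths matter.

```python
def encode(op, operands, symbols, ignore_vars):
    """Return the bytes for one instruction of a made-up toy instruction set."""
    def value(text):
        if text.isdigit():
            return int(text)
        if ignore_vars:
            return 0           # pass 1: only the length matters, not the value
        return symbols[text]   # pass 2: every label has a known address

    if op == 'nop':
        return bytes([0x00])
    if op == 'defb':
        return bytes([value(operands[0]) & 0xFF])
    if op == 'jmp':
        return bytes([0x01]) + (value(operands[0]) & 0xFFFF).to_bytes(2, 'little')
    raise ValueError(f'unknown operation {op!r}')


def assemble(lines):
    """lines is a list of (label, operation, operands) tuples; returns (code, symbols)."""
    # Pass 1: walk the program, tracking addresses so each label gets a value.
    symbols, address = {}, 0
    for label, op, operands in lines:
        if label is not None:
            symbols[label] = address
        if op:
            address += len(encode(op, operands, symbols, ignore_vars=True))

    # Pass 2: generate the real bytes now that forward references can be resolved.
    code = bytearray()
    for label, op, operands in lines:
        if op:
            code += encode(op, operands, symbols, ignore_vars=False)
    return bytes(code), symbols


program = [
    (None, 'jmp', ['end']),    # forward reference, resolved in pass 2
    ('loop', 'nop', []),
    (None, 'jmp', ['loop']),
    ('end', 'defb', ['255']),
]
code, symbols = assemble(program)
print(code.hex(), symbols)
```

This works because every instruction here has a length that does not depend on the operand's value. On a real x86_64 encoder, where a jump can be short or near depending on distance, two passes may not be enough and the sizing step may need to be repeated until lengths stop changing.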
Not to discourage you, but there are already many assemblers with various bells and whistles. Please consider contributing to an existing open source project like elftoolchain.