Libraries for external DSL evaluation in Scala

What are the steps required for evaluating an external DSL in Scala, and what libraries are available for them?
After digging around I am able to create an AST out of case classes using parser combinators. What are the next steps in the process? I looked at Kiama (https://code.google.com/p/kiama/), but it is unclear from the documentation (maybe due to my limited language-processing knowledge) how to maintain symbol tables, how to bind actions to DSL statements, etc.
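For context, here is a minimal end-to-end sketch of the first two steps, assuming the scala-parser-combinators module is on the classpath (all names below are illustrative): parse a tiny arithmetic DSL into case-class AST nodes, then evaluate the AST by pattern matching.

    import scala.util.parsing.combinator.JavaTokenParsers

    // The AST as case classes
    sealed trait Expr
    case class Num(value: Double) extends Expr
    case class Add(l: Expr, r: Expr) extends Expr
    case class Mul(l: Expr, r: Expr) extends Expr

    // External syntax -> AST
    object ArithParser extends JavaTokenParsers {
      def expr: Parser[Expr] = term ~ rep("+" ~> term) ^^ {
        case first ~ rest => rest.foldLeft(first)(Add(_, _))
      }
      def term: Parser[Expr] = factor ~ rep("*" ~> factor) ^^ {
        case first ~ rest => rest.foldLeft(first)(Mul(_, _))
      }
      def factor: Parser[Expr] =
        floatingPointNumber ^^ (s => Num(s.toDouble)) | "(" ~> expr <~ ")"
    }

    // Evaluation: a fold over the AST
    def eval(e: Expr): Double = e match {
      case Num(v)    => v
      case Add(l, r) => eval(l) + eval(r)
      case Mul(l, r) => eval(l) * eval(r)
    }

    // ArithParser.parseAll(ArithParser.expr, "1 + 2 * 3").map(eval)  // 7.0

Later phases (name analysis, type checking, code generation) are then further functions over the same case classes.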

I agree that it would be good to have more tutorial-style documentation for common language processing tasks in Kiama. We are working on it, but I have nothing concrete to report at the moment.
In the meantime, all I can offer is the examples in the Kiama distribution. In particular, the minijava example is a reasonably accessible compiler for a non-trivial subset of Java. It does name and type analysis (see SemanticAnalysis.scala) and generates JVM bytecode. The semantic analysis uses a simple model of passing around an environment from declarations to uses of names. Feel free to contact us here or on the Kiama mailing list if you have specific questions about how the example works.
The Oberon-0 example is also a complete compiler from an imperative language to C, including semantic analysis.
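To illustrate the environment-passing model mentioned above (this is a hand-rolled sketch, not Kiama's actual API): a symbol table can simply be a map from names to declarations, threaded from declarations to uses.

    sealed trait Stmt
    case class Decl(name: String) extends Stmt
    case class Use(name: String)  extends Stmt

    // Thread an environment (name -> declaration) through the statements,
    // collecting an error for every use of an undeclared name.
    def check(stmts: List[Stmt],
              env: Map[String, Decl] = Map.empty,
              errors: List[String] = Nil): List[String] = stmts match {
      case Nil                                => errors.reverse
      case (d @ Decl(n)) :: rest              => check(rest, env + (n -> d), errors)
      case Use(n) :: rest if !env.contains(n) => check(rest, env, s"undeclared name: $n" :: errors)
      case _ :: rest                          => check(rest, env, errors)
    }

    // check(List(Decl("x"), Use("x"), Use("y")))  // List("undeclared name: y")

Binding actions to DSL statements works the same way: write another function (or, in Kiama, an attribute or rewrite rule) that pattern-matches on the statement nodes.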

Related

How to avoid writing confusing DSLs in Scala

I've read comments stating that Scala's flexibility makes it easy for developers to write DSLs that are difficult to understand and reason about.
DSLs are possible because
we can sometimes omit . and parentheses (e.g. List(1) map println)
we can sometimes interchange () and {}
we have implicit values, parameters, and classes (also conversions, which are now discouraged)
there is a relatively small number of reserved symbols in the language (e.g. I can define + for my class)
and possibly other language features.
How can I avoid writing confusing DSLs ... what are the common antipatterns? Where is a DSL not appropriate?
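To make the list above concrete, here is each feature in a couple of illustrative lines (as written for the REPL; in compiled code the implicit class would need to sit inside an object):

    List(1) map println                 // dot and parentheses omitted
    List(1, 2, 3).map { x => x * 2 }    // {} used where () would also work

    case class Vec(x: Int, y: Int) {
      def +(o: Vec): Vec = Vec(x + o.x, y + o.y)   // user-defined operator
    }

    implicit class TimesOps(n: Int) {   // implicit class grafting syntax onto Int
      def times(body: => Unit): Unit = (1 to n).foreach(_ => body)
    }
    3 times { println("hi") }           // reads like a built-in construct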
Whenever you create a DSL of your own, you're embedding a new language into Scala. It's not standard, so it doesn't follow standard code guides, conventions, etc.
I would say there's nothing wrong with adding a new DSL as long as you add proper documentation, explain the purpose of creating it, and add examples of usage. If you feel a new DSL would increase the readability of your code, just go for it, but remember that whenever anyone encounters your DSL and it isn't documented well enough, they will be very confused.
Good examples of well-documented DSLs that serve a clear purpose are ScalaTest's matchers and Scala's duration syntax.
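For instance, here is what those two look like in use. The duration example is from the standard library; the ScalaTest lines are shown as comments since they need the scalatest dependency and the exact import path varies across ScalaTest versions:

    import scala.concurrent.duration._

    val timeout = 5.seconds             // implicit class on Int
    val total   = timeout + 2.minutes   // FiniteDuration arithmetic

    // ScalaTest matchers, equally discoverable from the docs:
    // import org.scalatest.matchers.should.Matchers._
    // List(1, 2, 3) should have length 3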

Compilation / Code Generation of External Scala DSL

My understanding is that it is quite simple to create and parse an external DSL in Scala (e.g. one representing rules). Is my assumption correct that the DSL can only be interpreted at runtime, and that there is no support for code generation (as with ANTLR) for achieving better performance?
EDIT: To be more precise, my question is whether I could achieve this (create an external domain-specific language and generate Java/Scala code) with built-in Scala tools/libraries (e.g. http://www.artima.com/pins1ed/combinator-parsing.html), not by writing a whole parser / code generator completely by myself in Scala. It's also clear that you can achieve this with third-party tools, but then you have to learn additional stuff and take on additional dependencies. I'm new to the area of implementing DSLs, so I have no gut feeling yet for when to use external tools like ANTLR and what you can do (with reasonable effort) with Scala's on-board facilities.
Is my assumption correct that the DSL can only be interpreted at runtime, and that there is no support for code generation (as with ANTLR) for achieving better performance?
No, this is wrong. It is possible to write a compiler in Scala; after all, Scala is Turing-complete (i.e. you can write anything in it), and you don't even need Turing-completeness to write a compiler.
Some examples of compilers written in Scala include
the Scala compiler itself (in all its variations, Scala-JVM, Scala.js, Scala-native, Scala-virtualized, Typelevel Scala, the abandoned Scala.NET, …)
the Dotty compiler
Scalisp
Scalispa
… and many others …
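And for the narrower question about on-board tools: once parser combinators have produced an AST, "code generation" can be as simple as a function from AST nodes to Scala source text, compiled later by scalac as a separate build step. A trivial sketch (every name below is made up for illustration):

    sealed trait Rule
    case class GreaterThan(field: String, value: Int) extends Rule

    // AST -> Scala source text
    def genScala(r: Rule): String = r match {
      case GreaterThan(f, v) =>
        s"""def check(record: Map[String, Int]): Boolean = record("$f") > $v"""
    }

    // genScala(GreaterThan("age", 18))
    // => def check(record: Map[String, Int]): Boolean = record("age") > 18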

BDD in Scala - Does it have to be ugly?

I've used lettuce for Python in the past. It is a simple BDD framework where specs are written in an external plain-text file. The implementation uses regexes to identify each step, providing reusable code for each sentence in the specification.
Using Scala, with either specs2 or ScalaTest, I'm being forced to write the specification alongside the implementation, making it impossible to reuse the implementation in another test (sure, we could implement it in a function somewhere) and making it impossible to separate the test implementation from the specification itself (something that I used to do, providing acceptance tests to clients for validation).
To conclude, my question: considering the importance of having clients validate the tests, is there a way in BDD frameworks for Scala to load the tests from an external file, raising an exception if a sentence in the test is not implemented yet and executing the test normally if all sentences have been implemented?
I've just discovered a Cucumber plugin for sbt. Tests would be implemented under test/scala and specifications would be kept in test/resources as plain-text files. I'm just not sure how reliable the library is or whether it will be supported in the future.
Edit:
The above is a wrapper for the following plugin, which solves the problem perfectly and supports Scala.
https://github.com/cucumber/cucumber-jvm
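With cucumber-jvm the plain-text spec stays in test/resources and the step definitions live in test/scala. A sketch of what that looks like (the import path and DSL details differ between cucumber-scala versions, so treat this as an approximation):

    // features/login.feature (plain text, editable by clients):
    //   Feature: Login
    //     Scenario: Valid user logs in
    //       Given a registered user "alice"
    //       When she logs in with the correct password
    //       Then she sees her dashboard

    import io.cucumber.scala.{EN, ScalaDsl}

    class LoginSteps extends ScalaDsl with EN {
      var user: String = _
      var loggedIn     = false

      Given("""^a registered user "([^"]*)"$""") { (name: String) =>
        user = name
      }
      When("""^she logs in with the correct password$""") { () =>
        loggedIn = true   // drive the real application here
      }
      Then("""^she sees her dashboard$""") { () =>
        assert(loggedIn, s"$user is not logged in")
      }
    }

A scenario sentence with no matching step definition is reported as undefined and fails the run, which is exactly the behaviour asked for above.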
This is all about trade-offs. The Cucumber style of specification is great because it is pure text, easily editable and readable by non-coders.
However, such specifications are also pretty rigid, because they impose a strict format based on features and Given-When-Then. In specs2, for example, we can write any text we want and annotate only the lines which are meant to be actions on the system or verifications. The drawback is that the text becomes annotated, and that pending must be explicitly specified to indicate what hasn't been implemented yet. Also, an annotation is just a reference to some code living somewhere, so you can of course use the usual programming techniques to get reusability.
BTW, the link above is an interesting example of a trade-off: in this file, the first spec is "uglier", but there are more compile-time checks that the When step uses the information from a Given step, or that we don't have a sequence of Then -> When steps. The second specification is nicer but also more error-prone.
Then there is the issue of maintaining the regular expressions. If there is a strict separation between the people writing the features and the people implementing them, then it's very easy to break the implementation even if nothing substantial changes.
Finally, there is the question of version control. Who owns the document? How can we be sure that the code is in sync with the spec? Who refactors the specification when required?
There is no perfect solution by far. My own conclusion is that BDD artifacts should be in the hands of developers and verified by the other stakeholders, who read the code directly if it's readable or read an HTML/PDF output. And if the BDD artifacts are owned by developers, they might as well use their own tools to make their life easier with verification (using a compiler when possible) and maintenance (using automated refactorings).
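For comparison, a specs2 acceptance specification keeps the text and the code in one file but clearly separated. A minimal sketch (the example methods are placeholders for calls to the real system):

    import org.specs2.Specification

    class TransferSpec extends Specification { def is = s2"""
      A transfer between two accounts
        debits the source account    $e1
        credits the target account   $e2
        rejects overdrafts           $pending
      """

      def e1 = (100 - 30) must_== 70
      def e2 = (50 + 30)  must_== 80
    }

The unannotated lines are pure text; $pending marks what is not implemented yet, and e1/e2 are ordinary methods that can be shared between specifications.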
You said yourself that it is easy to make the implementation reusable via the normal means Scala provides for this kind of stuff (methods, functions, traits, classes, types, ...), so there isn't really a problem there.
If you want to give your customer a version without code, you can still give them the code files, and if they can't ignore a little syntax, you could probably write a custom reporter that writes all the text out to a file, maybe even formatted as HTML or something.
Another option would be to use JBehave or any other JVM based framework, they should work with Scala without a problem.
Eric's main design criterion was the sustainability of executable-specification development (through refactoring), not the initial convenience of the "beauty" of simple text.
see http://etorreborre.github.io/specs2/
The features of specs2 are:
Concurrent execution of examples by default
ScalaCheck properties
Mocks with Mockito
Data tables
AutoExamples, where the source code is extracted to describe the example
A rich library of matchers
Easy to create and compose
Usable with must and should
Returning "functional" results or throwing exceptions
Reusable outside of specs2 (in JUnit tests for example)
Forms for writing Fitnesse-like specifications (with Markdown markup)
Html reporting to create documentation for acceptance tests, to create a User Guide
Snippets for documenting APIs with always up-to-date code
Integration with sbt and JUnit tools (maven, IDEs,...)
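As a small taste of the matcher DSL, here is a sketch in the mutable style:

    import org.specs2.mutable.Specification

    class ListSpec extends Specification {
      "a list" should {
        "report its size"    in { List(1, 2, 3) must haveSize(3) }
        "contain an element" in { List(1, 2, 3) must contain(2) }
      }
    }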
Specs2 is quite impressive in both design and implementation.
If you look closely, you will see that the DSL can be extended while you keep type safety and strong command of the domain code under development.
Anyone who sets aside the "it is uglier" argument and tries this seriously will find real power.
Check out the structured forms and snippets.

Convert Scala AST to source code

Given a Scala AST, is there a way to generate Scala source code?
I'm looking into ways to autogenerate Scala source by parsing/analyzing other Scala source. Any tips would be appreciated!
I have been successfully using Scala-Refactoring by Mirko Stocker for this task.
For synthetically constructing ASTs, it relies strongly on the existing Tree DSL of Scala's NSC.
Although the code is a bit messy, you can find an example usage in my project ScalaCollider-UGens.
I have also come across a very useful class by Johannes Rudolph.
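A further on-board option, for what it's worth (not from the answers above): if the AST in question is a scala.reflect Tree, showCode pretty-prints it back to compilable source (Scala 2.11+, with scala-reflect on the classpath; the exact output formatting may differ):

    import scala.reflect.runtime.universe._

    val tree = q"def double(x: Int): Int = x * 2"   // quasiquote builds the AST
    println(showCode(tree))
    // prints something like: def double(x: Int): Int = x.*(2)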
See our DMS Software Reengineering Toolkit.
DMS provides a complete ecosystem for parsing/analyzing/optimizing/transforming source code in many languages. It achieves this by providing generic machinery for these tasks as its core capabilities, and specializing them according to explicitly supplied language definitions ("front ends"). DMS has front ends for many languages (C, C++, C#, Java, COBOL, ...) that have been used in anger, and a process for defining others very quickly.
We work on expanding the language set more or less continuously. DMS already has parts of a Scala front end implemented, and we know how to finish it based on the other 30+ front ends we have built, with special emphasis on knowledge of Java.

Use of Clojure macros for DSLs

I am working on a Clojure project and I often find myself writing Clojure macros for DSLs, but I was watching a Clojure video about how a company uses Clojure in their real work, and the speaker said that in practical use they do not use macros for their DSLs; they only use macros to add a little syntactic sugar. Does this mean I should write my DSL using standard functions and then add a few macros at the end?
Update:
After reading the many varied (and entertaining) responses to this question I have realized that the answer is not as clear cut as I first thought, for many reasons:
There are many different types of API in an application (internal, external)
There are many types of user of the API (business user who just wants to get something done fast, Clojure expert)
Is the macro there to hide boilerplate code?
I will go away and think about the question more deeply, but thanks for your answers, as they have given me lots to think about. I also noticed that Paul Graham takes the opposite view to the Christophe video and thinks macros should be a large part of the codebase (25%):
http://www.paulgraham.com/avg.html
To some extent I believe this depends on the use / purpose of your DSL.
If you are writing a library-like DSL to be used in Clojure code and want it to be used in a functional way, then I would prefer functions over macros. Functions are "nice" for Clojure users because they can be composed dynamically into higher order functions etc. For example, you are writing a functional web framework like Ring.
If you are writing an imperative DSL that will be used pretty independently of other Clojure code and you have decided that you definitely don't need higher-order functions, then the usage will be pretty similar either way and you can choose whichever makes most sense. For example, you might be creating some kind of business rules engine.
If you are writing a specialised DSL that needs to produce highly performant code, then you will probably want to use macros most of the time, since they will be expanded at compile time for maximum efficiency. For example, you might be writing graphics code that needs to expand to exactly the right sequence of OpenGL calls.
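To make the first case (the library-like, functional DSL) concrete: the thread is about Clojure, but the composability argument is language-independent, so here it is sketched in Scala with made-up names, in the Ring style. When handlers are plain functions, middleware is just a function from handler to handler, and everything composes:

    type Handler = Map[String, String] => String   // request => response

    def logging(next: Handler): Handler = req => {
      println(s"request: $req")                    // cross-cutting concern
      next(req)
    }

    val app: Handler     = req => s"hello, ${req.getOrElse("name", "world")}"
    val wrapped: Handler = logging(app)
    // wrapped(Map("name" -> "alice"))  // logs the request, returns "hello, alice"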
Yes!
Write functions whenever possible. Never write a macro when a function will do. If you write too many macros, you end up with something that is much harder to extend. Macros, for example, can't be applied or passed around.
Christophe Grand: (not= DSL macros)
http://clojure.blip.tv/file/4522250/
No!
Don't be afraid of using macros extensively. Always write a macro when in doubt. Functions are inferior for implementing DSLs: they push the burden onto the runtime, whereas macros allow you to do many heavyweight computations at compile time. Just think of the difference between implementing, say, an embedded Prolog as an interpreter function and as a macro which compiles Prolog into some form of a WAM.
And do not listen to those who say that "macros can't be applied or passed around"; this argument is entirely a strawman. Those people are advocating interpreters over compilers, which is simply ridiculous.
A couple of tips on how to implement DSLs using macros:
Do it in stages. Define a long chain of languages from your DSL to the underlying Clojure. Keep each transform as simple as possible - this way you'd be able to easily maintain and debug your DSL compiler.
Prepare a toolbox of DSL components that you will reuse when implementing your DSLs. It should include target languages of different semantics (e.g., untyped eager functional, which is Clojure itself; untyped lazy functional; first-order logic; typed imperative; Hindley-Milner typed eager functional; dataflow; etc.). With macros it is trivial to combine properties of all these target semantics seamlessly.
Maintain a set of compiler-building tools. It should include parser generators (useful even if your DSLs are entirely in S-expressions), term rewriting engines, pattern matching engines, implementations for some common algorithms on graphs (e.g., graph colouring), etc.
Here's an example of a DSL in Haskell that uses functions rather than macros:
http://contracts.scheming.org/
Here is a video of Simon Peyton Jones giving a talk about this implementation:
http://ulf.wiger.net/weblog/2008/02/29/simon-peyton-jones-composing-contracts-an-adventure-in-financial-engineering/
Leverage the characteristics of Clojure and FP before going down the path of implementing your own language. I think SK-logic's tips give you a good indication of what is needed to implement a full-blown language. There are times when it's worth the effort, but those are rare.