Where can I find a C-like grammar for Xtext to use as inspiration?

I am working with a DSP toolchain named CrossCore Embedded Studio for SHARC processors. The IDE is well implemented, apart from its very poor support for the assembly language.
I am not sure whether Analog Devices will improve that support in the near future.
Currently there is no code folding, no outline, and only minimal syntax colouring. I would like to quickly implement something better, so I looked for a way to add a new language definition to Eclipse Luna.
What I found is Xtext. I followed the 5-minute and 15-minute tutorials and read some articles about it.
I am now ready to implement my language. Because this assembly language shares many features with C (preprocessor directives, C/C++ comments, arithmetic operations and semicolon-terminated statements), I have been actively looking for a C grammar example for Xtext.
Unfortunately I have not found anything yet. I did discover that C and C++ cannot easily be described in Xtext because of their multiple language layers (the C preprocessor on top of C itself). I will not admit defeat yet: I think I can implement something sufficient, as I only need approximate syntax colouring, some code folding and outline support.
Where can I find helpful examples to implement such language?
Here is what the SHARC assembly language looks like:
#include <foo.h>
#include "foo.h"
#ifdef BAR
.segment/dm slow;
#else
.segment/dm fast;
#endif
label:
r1 = r2; /* Another comment */
r3 = dm( _symbol + 0 );
r5 = r3 + 1; // A comment
jump(db);
nop;
r8 = pass r8;
another_label:
{ // Not currently recognized by the assembler, but useful for readability; I would like to enable code folding here...
r2 = 3;
}
final_label.end: nop;
.endseg;

Although I do not have long experience with Xtext, I would recommend searching for a grammar close to the one you are trying to write, without limiting yourself to Xtext. As far as I know, Xtext is built on top of ANTLR, which (according to its website) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. There are many contributed example grammars for ANTLR, and you can find them here.
In that repository you can find a C grammar written for ANTLR, which should be a good starting point for an Xtext grammar; the two notations do not differ too much.

Related

Advantages Jetbrains MPS has over Xtext

I want to ask what advantages MPS and Xtext have over each other, and what their main features are when writing a language. I know that when working with MPS you edit the AST directly, whereas Xtext uses a parser. I have read that one advantage of working on the AST is that it allows for multiple languages to be extended for the language you are making. I don't really understand what this means; could this be explained further, and why would someone want to extend multiple languages?
I have also read that the AST cuts out ambiguous code; how does it do this?
I know that both MPS and Xtext have features like underlining and highlighting code. Are there any other features relating to code validation?
Any other main differences and general features are welcome.

I have no practical experience with Xtext, so I will talk mostly about MPS.
LWB
Both Xtext and MPS are language workbenches, so each has its own schema for metamodelling the abstract syntax (the structure of concepts), some way to define the concrete syntax (notations), and some way to define generators (M2M or M2T transformations) or, less commonly, interpreters. They then provide the IDE itself, with highlighting, smart actions like refactoring and contextual error fixes, advanced search and navigation (go to declaration etc.), and error checking (type errors, static code analysis, checking of defined constraints and rules, cardinality checks, dataflow analysis), ... So yes, there are lots of options for validation. I have listed the things that are in MPS; I am not sure whether Xtext provides all of them. In MPS, all of these features are organised in so-called aspects, which you can check out in a summary table that briefly describes each aspect.
Projectional editor
As you have mentioned, MPS uses a projectional editor: you manipulate the AST directly. Parser-based smart IDEs of the post-IntelliJ kind can offer intelligent actions like refactoring and go to declaration only because they parse the code in memory and construct an AST behind the scenes anyway. Projectional editors skip the parsing step.
Dodging ambiguity
MPS uses no parser at all, so all of the downsides of having a parser are gone. First of all, the language developer does not need to be an expert in syntax analysis, so you don't need to hire one specifically. But the biggest win is infinite language composability. This is achieved, as you have mentioned, by totally avoiding the ambiguities which could appear in grammars (MPS does not use a grammar, but a model). Let's say you use language A and language B. For demonstration, let's say both languages extend BaseLanguage (abbr. BL, the MPS equivalent of Java) and both define a statement for logging. Concept a logs to stderr and concept b logs to a file. However, a and b have an identical concrete syntax (i.e. editor definition in MPS) which just says log. If you had a parser and it encountered the token log, it could not decide which language the concept comes from, so it is ambiguous; not even a look-ahead parser can do it. In a projectional editor this cannot happen, because only the projection is identical, and under the hood the AST holds an instance of either a or b (you can think of it as always using the whole FQN of a class in Java, with the package merely hidden in the IDE, so you can use identically named classes from different packages). The "ambiguity" is resolved by the user at the time of writing: when they type log, a dropdown menu appears clearly showing that one of the options is a and the other is b (it may even show a description such as "Log to file" / "Log to stderr").
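The difference can be sketched in a few lines of C++. This is only an illustration of the idea, not how MPS is implemented: the type names (LangConcept, AstNode) and the concept names LogToStderr and LogToFile are made up for the two log statements described above.

```cpp
#include <string>
#include <vector>

// A language concept. Its identity is the qualified name,
// not the text shown in the editor.
struct LangConcept {
    std::string language;  // owning language, e.g. "A"
    std::string name;      // concept name, e.g. "LogToStderr" (hypothetical)
    std::string notation;  // concrete syntax shown to the user, e.g. "log"

    std::string qualifiedName() const { return language + "." + name; }
};

// Parser's view: it only has the token text and must find the concept from it.
std::vector<const LangConcept*> candidatesForToken(
        const std::vector<LangConcept>& known, const std::string& token) {
    std::vector<const LangConcept*> hits;
    for (const LangConcept& c : known) {
        if (c.notation == token) hits.push_back(&c);
    }
    return hits;  // more than one hit means the text alone is ambiguous
}

// Projectional view: the stored node already records which concept it instantiates.
struct AstNode {
    const LangConcept* kind;
};

// Rendering a node only prints the notation; the identity stays in the node.
std::string render(const AstNode& node) { return node.kind->notation; }

// The two log statements from the example, one per language.
std::vector<LangConcept> exampleConcepts() {
    return {
        {"A", "LogToStderr", "log"},
        {"B", "LogToFile", "log"},
    };
}
```

Looking up the token log yields two candidates, which is the parser's problem. Two AstNode values pointing at the two concepts render identically but keep distinct qualified names, which is why the projectional editor never has to guess.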
Modularity
Consequently, MPS has very good modularity, composability and extensibility of languages. You have mentioned
allows for multiple languages to be extended for the language you are making [...]
why would someone want to extend multiple languages
You need to differentiate between using a language and extending it (if you are interested in more, Völter describes four kinds of language composition: referencing, extension, reuse and embedding). Using a language is just the ability to write programs in it. Extending a language is a bit like inheritance: you add new concepts to it, e.g. a new type of Java (BL) statement. This has been done in the standard languages shipped with MPS too. For example, the checkedDots language extends BL with an operation .? which is null-safe (similar to the null-conditional operator ?. in C#). So why extend a language? Because you can use new constructs and add new functionality or syntactic sugar. Another ready-to-use extension of BL is the tuples language, which has both indexed and named tuples. Then there is the collections language, which more or less replaces the Java Stream API. All of these little languages are extensions which you can start using with a simple Ctrl+L. You could also embed another language in your language, e.g. use a regex inside an SQL statement inside your Java code.
Generation
Another kind of language dependency in MPS is to have a "generation target" language. Generators in MPS work by transforming your language sentence (i.e. model) into another MPS language. You can invent your own little language, or implement LOLcode, and set up the generator to transform it into valid Java code. However, the target language must already exist in MPS, so you cannot generate to Python if there is no Python implementation in MPS. The alternative is to generate text (M2T); this way you could theoretically generate Python source code, or just print the LOLcode as-is.
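To make the M2T idea concrete, here is a minimal sketch in C++ of a model-to-text generator. It is not MPS generator code (MPS generators are defined inside MPS itself); the model types SayStatement and Program are invented for illustration, and the target is Python source text as in the example above.

```cpp
#include <string>
#include <vector>

// A tiny hypothetical model: a program is a list of "say" statements.
struct SayStatement {
    std::string message;
};

struct Program {
    std::vector<SayStatement> statements;
};

// Escape a message so it can be emitted as a double-quoted Python string literal.
std::string pythonStringLiteral(const std::string& text) {
    std::string out = "\"";
    for (char c : text) {
        if (c == '"' || c == '\\') {
            out += '\\';
            out += c;
        } else if (c == '\n') {
            out += "\\n";
        } else {
            out += c;
        }
    }
    out += '"';
    return out;
}

// Model-to-text generator: walk the model and emit one line of Python per node.
std::string generatePython(const Program& program) {
    std::string out;
    for (const SayStatement& s : program.statements) {
        out += "print(" + pythonStringLiteral(s.message) + ")\n";
    }
    return out;
}
```

The generator never parses anything: it only walks the model and prints text, so the target language does not need to exist as a model anywhere. The price is that the output is plain text with no structure the tool can check.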
Multiple notations
The second great difference between projectional and parser-based editors is that the latter inherently support only textual notation, although there may be some external tools you can use. MPS, on the other hand, provides textual, tabular, symbolic (math symbols) and graphical (diagrams) notations. You can swap your view from one notation to another, per concept or for the whole "file" (program).
Drawbacks
It's not all roses though. Projectional editors have some limitations, or challenges to solve. There is an analysis of challenges in projectional editors which points mainly to usability and infrastructure integration. These are mostly solved in MPS; e.g. regarding infrastructure, you have a good VCS diff/merge tool. For automated/command-line builds there is a language that generates Ant. Gradle and Maven do not work with MPS directly, only through Ant. Regarding usability, "MPS takes a while to get used to, but then its usability is comparable to ParEs." [3] You should use a language called GrammarCells (available through MPS-extensions or mbeddr.platform), which makes it easy to build good editors (mainly for arithmetic expressions); otherwise, by default, you must enter concepts in prefix order (+ first, not the number). Comments in MPS cannot be placed willy-nilly. You cannot establish references to non-existing nodes... (see Table 1 in [3])
MPS currently does not have a web-based version. There are some planned, though: JetBrains is working on WebMPS, and there is also modelix.
Portability
Generally, you are stuck with working in MPS. By default it is not really portable, unless you explicitly define generators which produce portable output. If you want to import an existing program, you can code a paste handler into which you put your parser, or you can change the format in which the AST is stored (from XML to, say, your language directly, but this would again require a parser to read it). I am currently working on a solution that allows importing an MPS language from a YAJCo model (YAJCo is a model-based parser generator, where the input is not a grammar but Java classes representing the semantic model). You can then import a sentence (file), which creates and populates a model (AST). From the program in MPS you can generate Java source code which fills the original Java classes, if you need it.
BTW, the mbeddr project has implemented importing from Ecore; check here.
Dictionary
M2M = model to model
M2T = model to text

Auto-Wrap huge C++ libs to C for import in Swift / Go

Let's say I have a huge library in C++ (with tons of dependencies; a full build under GCC takes about 3 hours). I want to build upon that library, not in C++ but in a more productive language. How can I bridge or wrap that external library so that I can access it from another language and program on top of it?
Languages considered:
Swift
Go
What I found is that both languages provide automatic bridging or wrapping for C libraries and code (I don't actually know what the difference between wrapping and bridging is). So, if I have some C code, I can just drop it into the same Swift or Go project and use it with a simple import.
Neither language supports this for C++ code, however. So I googled how to transform C++ libraries into C code or generate wrappers automatically. I found the following:
swig.org - auto wrapper for C++ libs
Comeau C++ compiler - automatically transfers C++ to C code
LLVM - should be able to take any input and transform it to any output that LLVM is capable of.
Question:
1. Is it even in the realm of usable / realistic / manageable to build on top of such a huge lib in other languages like Swift / Go, if using auto wrapping or auto bridging?
2. Which of the 3 listed libs / programs / frameworks works best for the C++ -> C step (because Swift and Go both provide C auto wrapping)?
3. Are there better alternatives than what I have considered so far?
4. Would it be better to just "stick with C++", as using any other tools to do the wrapping / bridging would be far too much work to offset the benefit of using a more productive language like Swift / Go?
Thanks:)
Disclaimer: There is also the possibility to manually wrap a C++ lib in C but that would take an unbearable amount of work for such a huge lib.

Q1: Is it realistic?
Not really, because interop with any large, complex C++ library gets too complicated: automatic tools are likely to fail, and wrapping everything manually is too much work.
Q2: What's best?
I don't know and given A1 it does not seem to matter.
Q3: Alternative?
Q4: Is C++-only the best alternative?
If you want to take advantage of existing C++ code from another language, then regardless of the language involved, the best option in complex scenarios is a hybrid approach.
Most languages provide interop with C but not with C++, because C++ name mangling (and the C++ ABI in general) is not standardized across compilers. In other words, just about every language provides access to plain C functions, but C++ is frequently not supported.
Since your library is complex, the best solution would be based on the "Facade" pattern. Create a new C library and implement application-specific logic in it that uses the C++ library. Try to design this library to be as thin as possible. The goal is not to write all the business logic there, but to provide C functions that hold on to C++ objects and call C++ functions. The Go-level code would then call this library to use the C++ library underneath. This approach differs from the one in Q1. In Q1 you attempt to have one interop call per C++ function or object method. With a facade you implement the C++ usage scenarios that are specific to your application.
With a facade you reduce the scope of the interop work, because you target your application's scenarios. At the same time you move the C++ complexity away from the Go language level.
For example, you need to read a temperature sensor using C++ library.
In C++ you'd have to do:
open file
read stream until you find SLIP terminator
read one "record"
close file
With a facade you create a single function called "readTemperature(deviceFileName)", and that C function performs all 4 steps in one call.
That's a fake example, just to show the point.
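A minimal sketch of such a facade follows. Everything inside the biglib namespace is a made-up stand-in for the real C++ library, and the error codes and the out_celsius parameter are assumptions added for the illustration. Only the extern "C" function at the bottom is what the Go or Swift side would see.

```cpp
#include <string>
#include <vector>

// --- Stand-in for the large C++ library (hypothetical, for illustration only) ---
namespace biglib {
class SensorStream {
public:
    explicit SensorStream(const std::string& path) : open_(!path.empty()) {}
    bool isOpen() const { return open_; }
    // Pretends to read bytes up to and including a terminator byte.
    std::vector<unsigned char> readUntilTerminator() { return {0x15, 0xC0}; }
    // Pretends to decode one record into degrees Celsius.
    double decodeRecord(const std::vector<unsigned char>& raw) const {
        return raw.empty() ? 0.0 : static_cast<double>(raw.front());
    }
    void close() { open_ = false; }

private:
    bool open_;
};
}  // namespace biglib

// --- Facade: a plain C function, so no C++ types or exceptions cross the boundary ---
extern "C" int readTemperature(const char* deviceFileName, double* out_celsius) {
    if (deviceFileName == nullptr || out_celsius == nullptr) return 1;  // bad arguments
    try {
        biglib::SensorStream stream(deviceFileName);  // 1. open file
        if (!stream.isOpen()) return 2;               //    could not open
        std::vector<unsigned char> raw =
            stream.readUntilTerminator();             // 2. read until terminator
        *out_celsius = stream.decodeRecord(raw);      // 3. read one "record"
        stream.close();                               // 4. close file
        return 0;
    } catch (...) {
        return 3;  // translate any C++ exception into an error code
    }
}
```

The signature uses only C types and an integer status, so the foreign side needs a single declaration, int readTemperature(const char*, double*), and never has to know that SensorStream exists.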
With a facade you might want to hide the original C++ objects, at which point it becomes a small layer of its own. The goal here is to stay focused and to balance your application's needs against generalization.
Interestingly enough, the facade approach is also a way to improve interop performance. Interop in just about every language is more expensive than normal operations, because of the need to marshal data from the language runtime environment and keep the runtime protected. Lots of interop calls slow down an application (we are talking about millions here). For example, combining 10 interop calls into 1 improves performance, because the number of interop operations is reduced.

I was successful wrapping a large (although perhaps not "huge") C++ library (hundreds of header files) in Swift using a relatively simple process. You directly link your project to the library. The only thing you have to wrap are any new functions that you write (to be invoked in Swift) that actually use the library (in the C++ wrapper file). The verbose stuff can be left in the wrapper file, mostly without any modification. There is a simple little tutorial which helped me: https://www.swiftprogrammer.info/swift_call_cpp.html
(FYI, there is one step he omitted: Set your library search paths in Build Settings => Search Paths => Library Search Paths (both Debug and Release) )

Convert Scala AST to source code

Given a Scala AST, is there a way to generate Scala source code?
I'm looking into ways to autogenerate Scala source by parsing/analyzing other Scala source. Any tips would be appreciated!

I have been successfully using Scala-Refactoring by Mirko Stocker for this task.
For synthetically constructing ASTs, it relies strongly on the existing Tree DSL of Scala's NSC.
Although the code is a bit messy, you can find an example usage in my project ScalaCollider-UGens.
I have also come across a very useful class by Johannes Rudolph.

See our DMS Software Reengineering Toolkit.
DMS provides a complete ecosystem for parsing/analyzing/optimizing/transforming source code in many languages. It achieves this by providing generic machinery for these tasks as its core capabilities, and specializing it according to explicitly supplied language definitions ("front ends"). DMS has front ends for many languages (C, C++, C#, Java, COBOL, ...) that have been used in anger, and a process for defining others very quickly.
We work on expanding the language set more or less continuously. DMS already has parts of a Scala front end implemented, and we know how to finish it based on the other 30+ front ends we have built, with special emphasis on knowledge of Java.

What second language to use besides Scala for LowLevel? [closed]

I am absolutely happy with Scala and just love it :)
But sometimes I really want to go a bit more "low level", without a JVM and using "cool" CPU-Features like SSE etc.
So what would be a good second language besides Scala?
It should be:
Compiled to machine code
Easy usage of C-libraries
Possible to program very close to the hardware
Possible to program in a very highlevel-way when I want to
So basically I want a Scala where I can just throw in inline assembler when I want to :) I assume, that such a language does not exist, but maybe there are some that come close.
So what would be a good choice?
C++?, D?, OCaml?
I programmed a bit in C++ (15 years ago) and very little in OCaml. In both cases, I only solved a few problems and never got very "deep" into the language itself.

You're pretty much describing D.
Compiled to machine code: Check. There is an experimental .NET VM implementation, but all three major implementations (DMD, LDC, GDC) compile directly to native code and the language is designed to make native compilation feasible.
Easy usage of C libraries: D supports the C ABI and all C types. Pretty much all you have to do is translate the header file and link in the C object file. This can even be partially automated.
Possible to program very close to the hardware: Check. D is what I'd call an idiomatic superset of C. It does not support every piece of C syntax, its module system is completely different, static arrays are value types in D2, etc. However, for any construct in the C language proper (i.e. excluding the preprocessor) there is an equivalent construct in D or the standard library. For any piece of C code (excluding preprocessor abuse) there is a canonical D translation that looks roughly the same and should generate the same assembly language instructions if you're using the same compiler backend. In other words, every C idiom (excluding preprocessor abuse) can be translated to D in a straightforward way.
The reference implementation of D also supports inline ASM, so you can mess with SSE, etc.
Possible to program in a very highlevel-way when I want to: Check. D is designed to be a primarily garbage-collected language (though you can use manual memory management if you insist and are careful not to use library/runtime features that assume GC). Other than that, high-level programming is mostly implemented via template metaprogramming. Before you run away, please understand that template metaprogramming in D is greatly improved compared to C++. Doing template metaprogramming in D vs. C++ is like doing object-oriented programming in C++ vs. C. In D, template metaprogramming is designed into the language, whereas in C++ there are just enough features that you can use clever hackishness to make it barely work. The same could be said for object-oriented programming in C++ vs. C. The std.algorithm and std.range modules of Phobos are good examples of the high-level subset of D.

Here are some that satisfy the criteria mentioned in your question:
BitC
Clay
D
Rust
Go

I'm thinking about this, too, as I'm currently doing a C project and feeling very unproductive, also missing Scala. (I also did a lot of C++ in the Pleistocene...) I may switch to Go. D also looks attractive.
Another option, if it makes sense for the problem, is to use C + a scripting language, like Lua or Ruby. It's what Unix+shells and emacs have done forever. You get performance and low-level bit twiddling when you need it and productivity when that's more important.

C++0X, Erlang and maybe Haskell and Go. C++ and Erlang have a strong user base and there are many jobs available with C++0x and Erlang. (I am uncertain how good the C/C++ interop is with Go.)
C++0X ("cee plus plus oh ex") is a good option. It has lambda functions and other good stuff.
Walkthrough of C++0X TechDays 2010: Modern Programming with C++0x
Also, C++0X has good generics support, as documented in Type Classes as Objects and Implicits, Oliveira, Moors, Odersky, OOPSLA 2010. See their Figure 12.

Something that fits your requirement is C/C++, as you can inline assembly language with regular code. Calling C libraries will be natural :)
Another thing that fits is the HLA implementation of assembly language (wiki article here) - it is assembly with a lot of high level constructs to make things easier (and faster) for beginners to learn (it compiles to "proper" native code).

Like D and BitC, ooc (http://www.ooc-lang.org/) has a lot of features that appeal to a Scala (or Haskell) fan.

I think Nimrod is also a valid candidate here based on your requirements.

You should take a look at Go.

It's still very new, but take a look at Vala. It's a sweet layer of syntactic frosting upon the GObject cake and compiled to pure C.
It supports features like closures and limited type inference.

Think about using C or C++ for the very lowest level programming, and then wrapping that with JNI or JNA in a Scala library. In some cases, you can have your cake and eat it too this way.

What is practical use of IDEA MPS and Eclipse Xtext

Both of those frameworks deal with metamodels:
Xtext (Eclipse)
MPS (JetBrains)
Do you have example of practical applications based on meta-model transformation with those tools?

We created a whole bug tracker using MPS. Code generation is not the goal but a means to get executable code. The goal is to give developers a tool that allows creating DSLs with minimum effort.
A cool thing about MPS is that it also provides you with an IDE for your language. And the different DSLs you create are compatible, i.e. you can create a DSL that extends Java with closures and another DSL that enables external methods, and these extensions will work together.

They differ in terms of the document that stores the metamodel.
Regarding Xtext, this article illustrates one usage, namely creating your own programming languages and domain-specific languages (DSLs).
Once you have a language, you want to process it, and this usually means transforming your model into another representation.
The facility responsible for this transformation is called a generator and consists of a bunch of transformation templates (e.g. Xpand) and some code executing them. On some event, the model is read in and the transformations are applied to produce code.
Example of such a model transformation:
dot3zest, which comes with a DOT to Zest interpreter (which now uses the Xtext switch API generated for the DOT grammar) is support for ad-hoc DOT edge definitions.
Regarding MPS, you have here a series of practical examples, such as code generation to a general-purpose language (Java, C#, C++) or to XML.

I think the main use of Xtext is, firstly, to create a DSL from the grammar you define, with an Eclipse workbench auto-generated for you. Secondly, it can transform a script written in your DSL into Java. The built-in expressions in Xtext 2 are a plus.
The framework gives you a free IDE to support writing in the DSL you created. And the DSL is the ultimate product you provide. It can be used to abstract rules and logic from the real world, for example, in our project, the product configuration rules. Only specialists know them, so they write them in the DSL you create.
