Xtext grammar of negative numbers: terminal vs datatype rules - eclipse

What I would like to achieve is an Xtext grammar that can distinguish between negative numbers of type int and float.
As I ran into the same terminal-rule problems as others in the Eclipse community, I followed their recommendation to write both as datatype rules:
SignedInteger returns ecore::EIntegerObject:
'-'? INT;
SignedFloat returns ecore::EFloatObject:
'-'? INT* '.' INT+;
But the above gives me the following error (and in the end I have the same problem with the leading minus sign):
Decision can match input such as "RULE_INT" using multiple alternatives: 1, 2
To solve this I could write both as terminal rules, but then the grammar will conflict in:
The following token definitions can never be matched because prior tokens
match the same input: RULE_INT
because both rules are hidden behind the INT rule of the Xtext common terminals.
It seems that solving one of the problems forces a conflict with the other. Any recommendations on how to solve this?
A second question, regarding Ecore datatypes: which return type would you recommend, and what is the difference between EInt and EIntegerObject? (Is the second the wrapper class of the primitive type?)

I solved the problem by removing the "with common terminals" statement and copying the rules I need (everything except the INT rule) into my own grammar. So there is no conflict any more.
But I guess that is not really the root of the problem.
If anyone can explain what's going on here, I would be very thankful.
(I hope this approach does not cause problems later.)

Related

How do purely functional compilers annotate the AST with type info?

In the syntax analysis phase, an imperative compiler can build an AST out of nodes that already contain a type field that is set to null during construction, and then later, in the semantic analysis phase, fill in the types by assigning the declared/inferred types into the type fields.
How do purely functional languages handle this, where you do not have the luxury of assignment? Is the type-less AST mapped to a different kind of type-enriched AST? Does that mean I need to define two types per AST node, one for the syntax phase, and one for the semantic phase?
Are there purely functional programming tricks that help the compiler writer with this problem?
I usually rewrite the source AST (or one that has already been lowered by several steps) into a new form, replacing each expression node with a pair (tag, expression).
Tags are unique numbers or symbols which are then used by the next pass which derives type equations from the AST. E.g., a + b will yield something like { numeric(Tag_a). numeric(Tag_b). equals(Tag_a, Tag_b). equals(Tag_e, Tag_a).}.
Then the type equations are solved (e.g., by simply running them as a Prolog program), and, if successful, all the tags (which are variables in this program) are now bound to concrete types; any that are not are left as type parameters.
In a next step, our previous AST is rewritten again, this time replacing tags with all the inferred type information.
The whole process is a sequence of pure rewrites; there is no need to replace anything in your AST destructively. A typical compilation pipeline may take a couple of dozen rewrites, some of them changing the AST datatype.
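The tag, equation and rewrite steps described above can be sketched roughly as follows. This is only an illustration in Python, not the answerer's actual code: the names Var, Add, Tagged, tag_tree, equations, solve and annotate are hypothetical, a small union-find stands in for the Prolog solver, and the only typing rule assumed is that both operands and the result of + share one type.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Tagged:
    tag: object   # an int tag at first, the solved type after annotate()
    node: object

def tag_tree(expr, next_tag=0):
    """Wrap every node in a (tag, expression) pair; returns (tree, next free tag)."""
    if isinstance(expr, Add):
        left, next_tag = tag_tree(expr.left, next_tag)
        right, next_tag = tag_tree(expr.right, next_tag)
        expr = Add(left, right)
    return Tagged(next_tag, expr), next_tag + 1

def equations(t, env):
    """Derive ('is', tag, type) and ('eq', tag, tag) constraints from a tagged tree."""
    if isinstance(t.node, Var):
        return [("is", t.tag, env[t.node.name])] if t.node.name in env else []
    left, right = t.node.left, t.node.right
    return (equations(left, env) + equations(right, env)
            + [("eq", left.tag, right.tag), ("eq", t.tag, left.tag)])

def solve(eqs):
    """Union-find over tags; returns {tag: type or None}, raises TypeError on a clash."""
    parent, known = {}, {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x
    for kind, a, b in eqs:
        if kind == "eq":
            parent[find(a)] = find(b)
    for kind, a, ty in eqs:
        if kind == "is":
            root = find(a)
            if known.setdefault(root, ty) != ty:
                raise TypeError(f"{known[root]} vs {ty}")
    return {t: known.get(find(t)) for t in parent}

def annotate(t, types):
    """Final rewrite: build a new tree with each tag replaced by its solved type."""
    node = t.node
    if isinstance(node, Add):
        node = Add(annotate(node.left, types), annotate(node.right, types))
    return Tagged(types.get(t.tag), node)

# a + b, where only a has a declared type
tagged, _ = tag_tree(Add(Var("a"), Var("b")))
types = solve(equations(tagged, {"a": "int"}))
typed = annotate(tagged, types)
```

Every step returns a new value and the frozen dataclasses cannot be mutated, so the tagged tree is still available unchanged after annotate has run. Tags that the solver cannot bind come back as None, which plays the role of a leftover type parameter.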
There are several options to model this. You may use the same kind of nullable data fields as in your imperative case:
data Exp = Var Name (Maybe Type) | ...
parse :: String -> Maybe Exp -- types are Nothings here
typeCheck :: Exp -> Maybe Exp -- turns Nothings into Justs
or even, using a more precise type
data Exp ty = Var Name ty | ...
parse :: String -> Maybe (Exp ())
typeCheck :: Exp () -> Maybe (Exp Type)
I can't speak for how it is supposed to be done, but I did do this in F# for a C# compiler here
The approach was basically: build an AST from the source, leaving things like type information unconstrained. So AST.fs is basically the AST with strings for the type names, function names, etc.
As the AST starts to be compiled to (in this case) .NET IL, we end up with more type information (we create the types in the source; let's call these type-stubs). This then gives us the information needed to create method-stubs (the code may have signatures that include type-stubs as well as built-in types). From here we have enough type information to resolve any of the type names or method signatures in the code.
I store that in the file TypedAST.fs. I do this in a single pass; however, the approach may be naive.
Now that we have a fully typed AST, you could compile it, fully analyze it, or do whatever you like with it.
So in answer to the question "Does that mean I need to define two types per AST node, one for the syntax phase, and one for the semantic phase?", I can't say definitively that this is the case, but it is certainly what I did, and it appears to be what MS have done with Roslyn (although they have essentially decorated the original tree with type info, IIRC).
"Are there purely functional programming tricks that help the compiler writer with this problem?"
Given the ASTs are essentially mirrored in my case, it would be possible to make it generic and transform the tree, but the code may end up (more) horrendous.
i.e.
type 'ty AST =
    | MethodInvoke of 'ty * Name * 'ty list
| ....
Like in the case when dealing with relational databases, in functional programming it is often a good idea not to put everything in a single data structure.
In particular, there may not be a data structure that is "the AST".
Most probably, there will be data structures that represent parsed expressions. One possible way to deal with type information is to assign a unique identifier (like an integer) to each node of the tree already during parsing and have some suitable data structure (like a hash map) that associates those node-ids with types. The job of the type inference pass, then, would be just to create this map.
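A minimal sketch of this side-table idea, in Python for brevity. The node classes Lit and Add, the infer function, and the toy typing rule (int + int is int, anything involving a float is float) are assumptions made up for the illustration; the parser is assumed to have already assigned the node_id values.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class Lit:
    node_id: int      # unique id assigned during parsing
    value: object     # assumed to be an int or a float

@dataclass(frozen=True)
class Add:
    node_id: int
    left: object
    right: object

def infer(node) -> Dict[int, str]:
    """Return a map node_id -> type name; the tree itself is never modified."""
    if isinstance(node, Lit):
        return {node.node_id: type(node.value).__name__}
    left = infer(node.left)
    right = infer(node.right)
    operand_types = (left[node.left.node_id], right[node.right.node_id])
    result = "float" if "float" in operand_types else "int"
    # Merge the children's maps into a new map instead of updating in place.
    return {**left, **right, node.node_id: result}

tree = Add(2, Lit(0, 1), Lit(1, 2.5))
type_of = infer(tree)
```

Later passes take the untouched tree together with type_of and look types up by node id, so there is no need for a second, typed variant of each AST node.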

Int extension not applied to raw negative values

My extensions to the Int type do not work for raw, negative values. I can work around it but the failure seems to be a type inference problem. Why is this not working as expected?
I first encountered this within the application development environment but I have recreated a simple form of it here on the Playground. I am using the latest version of Xcode; Version 6.2 (6C107a).
That's because - is interpreted as the minus operator applied to the integer 2, and not as part of the -2 numeric literal.
To prove that, just try this:
-(1.foo())
which generates the same error
Could not find member 'foo'
The message is probably misleading, because the error is about trying to apply the minus operator to the return value of the foo method.
I don't know if that is an intentional behavior or not, but it's how it works :)
This is likely a compiler bug (report on radar if you like). Use brackets:
println((-2).foo())
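The same precedence rule can be observed in Python, which is used here only as an analogy: it shows how a leading minus can bind more loosely than a method call, and says nothing about the Swift compiler's internals. The built-in method __abs__ stands in for the foo extension.

```python
import ast

# The minus is applied to the result of the call: -(2 .__abs__())
outside = -2 .__abs__()

# With parentheses the method is called on the negative value itself.
inside = (-2).__abs__()

# The parse tree confirms it: the outermost node is the unary minus,
# and its operand is the method call.
tree = ast.parse("-2 .__abs__()", mode="eval").body
```

Here outside is -2 and inside is 2, which mirrors why the parenthesised form (-2).foo() is the one that reaches the extension method.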

Scala: when to use explicit type annotations

I've been reading a lot of other people's Scala code recently, and one of the things that I have difficulty with (coming from Java) is the lack of explicit type annotations.
It's certainly convenient when writing code to be able to leave out type annotations -- however when reading code I often find that explicit type annotations help me to understand at a glance what code is doing more easily.
The Scala style guide (http://docs.scala-lang.org/style/types.html) doesn't seem to provide any definitive guidance on this, stating:
Use type inference where possible, but put clarity first, and favour explicitness in public APIs.
To my mind, this is a bit contradictory. While it's clearly obvious what type this variable is:
val tokens = new HashMap[String, Int]
It's not so obvious what type this one is:
val tokens = readTokens()
So, if I was putting clarity first I would probably annotate all variables where the type is not already declared on the same line.
Do any Scala practitioners have guidance on this? Am I crazy to be considering adding type annotations to my local variables? I'm particularly interested in hearing from folks who spend a lot of time reading scala code (for example, in code reviews), as well as writing it.
It's not so obvious what type this one is:
val tokens = readTokens()
Good names are important: the name is plural, ergo it returns some collection of some kind. The most general collection types in Scala are Traversable and Iterator, and they mostly share a common interface, so it's not really important which one of the two it is. The name also talks about "reading tokens", ergo it obviously should return Tokens in some fashion. And last but not least, the method call has parentheses, which according to the style guide means it has side-effects, so I wouldn't count on being able to traverse the collection more than once.
Ergo, the return type is something like
Traversable[Token]
or
Iterator[Token]
and which of the two it is doesn't really matter because their client interfaces are mostly identical.
Note also that the latter constraint (only traversing the collection once) isn't even captured in the type: even if you provided an explicit type, you would still have to look at the name and the style!

JPA 2 criteria provide runtime type for gt operator

I am building a highly generic query mechanism on top of the JPA Criteria. I get as input an XML describing the query, something like this:
<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
<Criteria xmlns='criteria' maxResults='2'>
  <Expression>
    <CompareRestriction propertyType='Date' operator='GREATER_THAN_OR_EQUALS' propertyName='deliveryDate'>2010-07-02</CompareRestriction>
    <CompareRestriction propertyType='Float' operator='GREATER_THAN_OR_EQUALS' propertyName='weight'>10f</CompareRestriction>
    <Restriction operator='NOT_NULL' propertyName='maxDiameter'/>
    <LogicalExpression operator='OR'>
      <LeftHandSideCompare propertyType="Integer" operator="EQUALS" propertyName="weight">31</LeftHandSideCompare>
      <RightHandSide operator='NOT_NULL' propertyName='lastChangedDate'/>
    </LogicalExpression>
    <LogicalExpression operator='OR'>
      <LeftHandSideCompare propertyType="Integer" operator="EQUALS" propertyName="weight">31</LeftHandSideCompare>
      <RightHandSide operator='NOT_NULL' propertyName='lastChangedDate'/>
    </LogicalExpression>
  </Expression>
  <Order propertyName='deliveryDate' type='DESC'/>
</Criteria>
I parse this and build the corresponding criteria. Currently I am facing a problem when it comes to the comparison operators (<, >, <=, >=), as I deal with different numerical types: I have fields with Float, Integer or Long values. So when I am mapping I do something like this:
switch (leftHandSideCompareRestriction.getOperator().value()) {
...
// each case needs its own break, otherwise it falls through to the next one
case "LESS_THAN":
    innerPredicates.add(criteriaBuilder.lt(rootQuery.<Number>get(propName), NumberUtils.createNumber(value)));
    break;
case "LESS_THAN_OR_EQUALS":
    innerPredicates.add(criteriaBuilder.le(rootQuery.<Number>get(propName), NumberUtils.createNumber(value)));
    break;
...
}
NumberUtils is the Apache Commons utility class that returns a numerical type based on the input provided (Float, Integer, Long or Double). Now I need a mechanism to also provide the type for
rootQuery.<T>get(propName)
at runtime; otherwise JPA complains that I provided a Float instead of an Integer, or the other way around. I tried several things and have now run out of ideas. I would highly appreciate any thoughts, ideas or suggestions about how to accomplish this in a robust fashion.
It seems that I initially missed something, though I am not sure how; there was some kind of issue in a different part of the code. Doing the query as shown above does work. I tried it with the following types:
Integer
Float
And it worked as expected for the following operations: >, <, <=, >=. In conclusion, using Number together with NumberUtils fixes this issue quite elegantly: NumberUtils creates the appropriate type, and JPA accepts Number, the top of the numeric type hierarchy.

How do I read this OCaml type signature?

I'm currently experimenting with using OCaml and GTK together (using the lablgtk bindings). However, the documentation isn't the best, and while I can work out how to use most of the features, I'm stuck with changing notebook pages (switching to a different tab).
I have found the function that I need to use, but I don't know how to use it. The documentation seems to suggest that it is in a sub-module of GtkPackProps.Notebook, but I don't know how to call this.
Also, this function has a type signature different to any I have seen before.
val switch_page : ([> `notebook ], Gpointer.boxed option -> int -> unit) GtkSignal.t
I think it returns a GtkSignal.t, but I have no idea how to pass the first parameter to the function (the whole part in brackets).
Has anyone got some sample code showing how to change the notebook page, or can perhaps give me some tips on how to do this?
What you have found is not a function but a signal. The function type you see in its type is the type of the callback that will be called when the page switch happens; calling it won't cause the switch.
By the way, the type of switch_page is read as: a signal (GtkSignal.t) raised by a notebook [> `notebook ], whose callbacks have type Gpointer.boxed option -> int -> unit
Generally speaking, with lablgtk you'd better stay away from the low-level Gtk* modules and use the higher-level G[A-Z] modules. The API of those modules looks like the C Gtk one, and I always use the main Gtk documentation to help myself.
In your case you want to use the GPack.notebook object and its goto_page method.
You've found a polymorphic variant; they're described in the manual in Section 4.2, and the typing rules always break my head. I believe what the signature says is that the function switch_page expects as argument a GtkSignal.t, which is an abstraction parameterized by two types:
The first type parameter,
[> `notebook]
includes as values any polymorphic variant including `notebook (that's what the greater-than means).
The second type parameter is an ordinary function.
If I'm reading the documentation for GtkSignal.t correctly, it's not a function at all; it's a record with three fields:
name is a string.
classe is a polymorphic variant which could be `notebook or something else.
marshaller is a marshaller for the function type Gpointer.boxed option -> int -> unit.
I hope this helps. If you have more trouble, section 4.2 of the manual, on polymorphic variants, might sort you out.