Is it possible to express existence in a grammar? - scala

I currently have the following grammar below:
COMPONENT = HEADER BODY
BODY = ELEMENT+
ELEMENT = EXPRESSION | DECLARATION | DESCRIPTION | NAME
I would like to assert that the body must have one of each of the ELEMENT in any order. Currently I am checking this after parsing, I am however curious if it's possible to express this in the grammar using Parser Combinators particularly guards
I tried doing further research regarding this, but nothing seems to come up.

Yes, it is possible. You simply write down all the valid permutations. This gets out of hand, fast.
It isn't worth the trouble to express in the grammar, because in essence you are trying to establish a constraint across the syntax, and grammars are really best at expressing context-free constraints.
Better to parse with grammar rules simply allowing all of the possibles clauses as options, and build your semantic pass (you will have one anyway) to check the additional constraints.

When writing a grammar, you should move as many constraints as possible to the semantic pass instead of the syntactic pass. This greatly improves the ability of the parser to recover from errors during parsing, and allows you to fully control the manner and wording used when reporting the error to users. For the case of permutations, the increased flexibility in the parser will likely reduce the size of the parsing tables as well.

Related

Is there a way to parse SMT-LIB2 strings through the CVC4 C++ API?

I have a program that can dynamically generate expressions in SMT-LIB format and I am trying to connect these expressions to CVC4 to test satisfiability and get the models. I am wondering if there is a convenient way to parse these strings through the CVC4 C++ API or if it would be best to just store the generated SMT-LIB code in a file and redirect input to the cvc4 executable.
A cursory look at their API doesn't reveal anything obvious, so I don't think they support this mode of operation. In general, loading such statements "on the fly" is tricky, since an expression by itself doesn't make much sense: You'd have to be in a context that has all the relevant sorts defined, along with all the definitions that your expressions rely on, including the selection of the proper logic. That is, for instance, why the corresponding function in z3 has extra arguments: https://z3prover.github.io/api/html/classz3_1_1context.html#af2b9bef14b4f338c7bdd79a1bb155a0f
Having said that, your best bet might be to ask directly at https://github.com/CVC4/CVC4/issues to see if they've something similar.

How could I encode "implies" logic in LogicBlox?

I would like to encode "implies" logic in LogicBlox.
I have a predicate:
Number(n),hasNumberName(n:i)->int(i).
isTrue[n] = i -> Number(n), boolean(i).
And I add some data in that predicate:
+Number(1).
Now, I want to create number 2 and number 3, and the truth value for these two number following this logic rule:
If isTrue[1] is true, then isTrue[2] is true or isTrue[3] is true. (isTrue[1] implies (isTrue[2] or isTrue[3]))
So I create a predicate:
implies[n1,n2,n3] = e -> Number(n1), Number(n2), Number(n3),boolean(e).
Then I try to create a rule like that:
isTrue[n2] = true;isTrue[n3] = true <- isTrue[n1] = true,implies[n1,n2,n3] = true.
But LogicBlox reports:"error: disjunction is not supported in the head of a rule "
So how can I encoding this implies logic in LogicBlox?
From your question it looks like you're asking this question with a Prolog background. If so, then it might be helpful to read a Datalog introduction, for example "What you always wanted to know about Datalog (and never dared to ask)".
The logic you want to express is on purpose not allowed in Datalog, because it requires a solving or search strategy. As opposed to Prolog, Datalog is on purpose restricted in the computational complexity of the programs you can express. As a result of these restrictions it meets important requirement for use in a database management system, most importantly supporting very large data sets. The computational complexity restrictions will be more clear after reading a good introduction to Datalog.
People have studied extensions of Datalog to allow more programs to be expressed (without going to full Prolog, which would result in a more procedural semantics). This particular example is called "Disjunctive Datalog". The hits on Google look good for this if you want to read more. LogicBlox does (at least currently) not implement Disjunctive Datalog because our primary objective is to be a scalable database management system.
LogicBlox does support using a solver for specific programs. A typical example is the knapsack problem. If your problem is expressible as an optimization problem (it almost certainly is, but the formulation usually requires some creativity for things that are not conventional optimization problems), then you could use this feature. The solver functionality is not very well documented in publicly available material yet. Please reach out to us directly if you would like to give this a try.
I assume you are trying to enforce a constraint that 1 -> 2 or 3 ? If so, trying to derive a value using <- is not going to work: if neither 2 nor 3 is present, which one(s) are you telling the system create? Instead, just write the constraint using -> syntax. Constraints are implications, after all (the right arrow syntax is no accident!), and that puts the disjunction on the right hand side where the language allows it. Then, if you ever try to create 1 and neither 2 nor 3 exists, the system will report a constraint failure because the implication was not found to hold.
Also, you don't usually need boolean-valued functions in logic languages; isTrue(x) can just be the set of x which you consider to be "true" (and any not present are "false").

Why does Scala separate symbols and letters in identifiers, and can I change this?

I'm thinking of using Scala for an internal DSL (ie, the DSL is really Scala) for musical composition. I'd want to use identifiers with characters like the sharp sign, ♯ (U+266F). Looks like this is not possible:
val c♯0 = 13
does not parse. The closest I can get is:
val c0_♯ = 13
I don't like it. What's the rationale for forcing identifiers to be named like this?
Scala's grammar tries to cater to a lot of conflicting requirements. In order to make optional any white-space between "operator-like" method names (such as +, / or ::) and their arguments on either side, it's necessary to have a way of signifying that a given name contains a mixture of punctuation and alphanumeric characters.
Keep in mind that internal DSLs are not new languages! They're at best illusions of being new languages. They are still Scala.
If you want complete freedom to control the language you provide to your users, you can write a combinator parser or use an external parser generator.
For anyone involved in language processing in Scala, I highly recommend you check out Kiama before rolling your own narrowly focused, single-purpose language processing code. Kiama is a gem!

POST/GET bindings in Racket

Is there a built-in way to get at POST/GET parameters in Racket? extract-binding and friends do what I want, but there's a dire note attached about potential security risks related to file uploads which concludes
Therefore, we recommend against their
use, but they are provided for
compatibility with old code.
The best I can figure is (and forgive me in advance)
(bytes->string/utf-8 (binding:form-value (bindings-assq (string->bytes/utf-8 "[field_name_here]") (request-bindings/raw req))))
but that seems unnecessarily complicated (and it seems like it would suffer from some of the same bugs documented in the Bindings section).
Is there a more-or-less standard, non-buggy way to get the value of a POST/GET-variable, given a field name and request? Or better yet, a way of getting back a collection of the POST/GET values as a list/hash/a-list? Barring either of those, is there a function that would do the same, but only for POST variables, ignoring GETs?
extract-binding is bad because it is case-insensitive, is very messy for inputs that return multiple times, doesn't have a way of dealing with file uploads, and automatically assumes everything is UTF-8, which isn't necessarily true. If you can accept those problems, feel free to use it.
The snippet you wrote works when the data is UTF-8 and when there is only one field return. You can define it is a function and avoid writing it many times.
In general, I recommend using formlets to deal with forms and their values.
Now your questions...
"Is there a more-or-less standard, non-buggy way to get the value of a POST/GET-variable, given a field name and request?"
What you have is the standard thing, although you wrongly assume that there is only one value. When there are multiple, you'll want to filter the bindings on the field name. Similarly, you don't need to turn the value into a string, you can leave it as bytes just fine.
"Or better yet, a way of getting back a collection of the POST/GET values as a list/hash/a-list?"
That's what request-bindings/raw does. It is a list of binding? objects. It doesn't make sense to turn it into a hash due to multiple value returns.
"Barring either of those, is there a function that would do the same, but only for POST variables, ignoring GETs?"
The Web server hides the difference between POSTs and GETs from you. You can inspect uri and raw post data to recover them, but you'd have to parse them yourself. I don't recommend it.
Jay

Writing programs in dynamic languages that go beyond what the specification allows

With the growth of dynamically typed languages, as they give us more flexibility, there is the very likely probability that people will write programs that go beyond what the specification allows.
My thinking was influenced by this question, when I read the answer by bobince:
A question about JavaScript's slice and splice methods
The basic thought is that splice, in Javascript, is specified to be used in only certain situations, but, it can be used in others, and there is nothing that the language can do to stop it, as the language is designed to be extremely flexible.
Unless someone reads through the specification, and decides to adhere to it, I am fairly certain that there are many such violations occuring.
Is this a problem, or a natural extension of writing such flexible languages? Or should we expect tools like JSLint to help be the specification police?
I liked one answer in this question, that the implementation of python is the specification. I am curious if that is actually closer to the truth for these types of languages, that basically, if the language allows you to do something then it is in the specification.
Is there a Python language specification?
UPDATE:
After reading a couple of comments, I thought I would check the splice method in the spec and this is what I found, at the bottom of pg 104, http://www.mozilla.org/js/language/E262-3.pdf, so it appears that I can use splice on the array of children without violating the spec. I just don't want people to get bogged down in my example, but hopefully to consider the question.
The splice function is intentionally generic; it does not require that its this value be an Array object.
Therefore it can be transferred to other kinds of objects for use as a method. Whether the splice function
can be applied successfully to a host object is implementation-dependent.
UPDATE 2:
I am not interested in this being about javascript, but language flexibility and specs. For example, I expect that the Java spec specifies you can't put code into an interface, but using AspectJ I do that frequently. This is probably a violation, but the writers didn't predict AOP and the tool was flexible enough to be bent for this use, just as the JVM is also flexible enough for Scala and Clojure.
Whether a language is statically or dynamically typed is really a tiny part of the issue here: a statically typed one may make it marginally easier for code to enforce its specs, but marginally is the key word here. Only "design by contract" -- a language letting you explicitly state preconditions, postconditions and invariants, and enforcing them -- can help ward you against users of your libraries empirically discovering what exactly the library will let them get away with, and taking advantage of those discoveries to go beyond your design intentions (possibly constraining your future freedom in changing the design or its implementation). And "design by contract" is not supported in mainstream languages -- Eiffel is the closest to that, and few would call it "mainstream" nowadays -- presumably because its costs (mostly, inevitably, at runtime) don't appear to be justified by its advantages. "Argument x must be a prime number", "method A must have been previously called before method B can be called", "method C cannot be called any more once method D has been called", and so on -- the typical kinds of constraints you'd like to state (and have enforced implicitly, without having to spend substantial programming time and energy checking for them yourself) just don't lend themselves well to be framed in the context of what little a statically typed language's compiler can enforce.
I think that this sort of flexibility is an advantage as long as your methods are designed around well defined interfaces rather than some artificial external "type" metadata. Most of the array functions only expect an object with a length property. The fact that they can all be applied generically to lots of different kinds of objects is a boon for code reuse.
The goal of any high level language design should be to reduce the amount of code that needs to be written in order to get stuff done- without harming readability too much. The more code that has to be written, the more bugs get introduced. Restrictive type systems can be, (if not well designed), a pervasive lie at worst, a premature optimisation at best. I don't think overly restrictive type systems aid in writing correct programs. The reason being that the type is merely an assertion, not necessarily based on evidence.
By contrast, the array methods examine their input values to determine whether they have what they need to perform their function. This is duck typing, and I believe that this is more scientific and "correct", and it results in more reusable code, which is what you want. You don't want a method rejecting your inputs because they don't have their papers in order. That's communism.
I do not think your question really has much to do with dynamic vs. static typing. Really, I can see two cases: on one hand, there are things like Duff's device that martin clayton mentioned; that usage is extremely surprising the first time you see it, but it is explicitly allowed by the semantics of the language. If there is a standard, that kind of idiom may appear in later editions of the standard as a specific example. There is nothing wrong with these; in fact, they can (unless overused) be a great productivity boost.
The other case is that of programming to the implementation. Such a case would be an actual abuse, coming from either ignorance of a standard, or lack of a standard, or having a single implementation, or multiple implementations that have varying semantics. The problem is that code written in this way is at best non-portable between implementations and at worst limits the future development of the language, for fear that adding an optimization or feature would break a major application.
It seems to me that the original question is a bit of a non-sequitor. If the specification explicitly allows a particular behavior (as MUST, MAY, SHALL or SHOULD) then anything compiler/interpreter that allows/implements the behavior is, by definition, compliant with the language. This would seem to be the situation proposed by the OP in the comments section - the JavaScript specification supposedly* says that the function in question MAY be used in different situations, and thus it is explicitly allowed.
If, on the other hand, a compiler/interpreter implements or allows behavior that is expressly forbidden by a specification, then the compiler/interpreter is, by definition, operating outside the specification.
There is yet a third scenario, and an associated, well defined, term for those situations where the specification does not define a behavior: undefined. If the specification does not actually specify a behavior given a particular situation, then the behavior is undefined, and may be handled either intentionally or unintentionally by the compiler/interpreter. It is then the responsibility of the developer to realize that the behavior is not part of the specification, and, should s/he choose to leverage the behavior, the developer's application is thereby dependent upon the particular implementation. The interpreter/compiler providing that implementation is under no obligation to maintain the officially undefined behavior beyond backwards compatibility and whatever commitments the producer may make. Furthermore, a later iteration of the language specification may define the previously undefined behavior, making the compiler/interpreter either (a) non-compliant with the new iteration, or (b) come out with a new patch/version to become compliant, thereby breaking older versions.
* "supposedly" because I have not seen the spec, myself. I go by the statements made, above.