Introducing termination as a precondition in Stainless

I'm trying to prove that the evaluation of the untyped lambda calculus by the following function:
def eval(t: Term): Option[Term] = t match {
  case App(t1, t2) => eval(t1) match {
    case Some(Abs(x, body)) => eval(t2) match {
      case Some(v2) => eval(subst(x, v2, body))
      case None() => None[Term]()
    }
    case _ => None[Term]() // stuck
  }
  case _ => Some(t) // Abs or Var, already a value
}
returns either None or a value. However, it was pointed out to me that this function might not terminate. My question is: how can I introduce, as a precondition in Leon/Stainless, the requirement that a function must terminate?

I'm unaware of a way to introduce a precondition that specifically says "this function terminates (at the given arguments)". You would have to find a higher-level predicate equivalent to that. In your case, that's not going to work, because there is no computable predicate that determines whether a term of the untyped lambda calculus has a normal form.
Not all is lost, though: the usual approach here is to introduce an additional "fuel" argument of type BigInt. It represents the maximum number of evaluation steps to be performed; in each step you decrement the fuel by one, and when the fuel reaches zero you abort the recursion and return None. This trivially makes your function terminate.
However, you then always need to supply a "big enough" fuel. Usually the fuel will be a parameter of your lemmas, which will have a precondition of the form eval(fuel, t) == Some(u).
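A minimal sketch of the fueled evaluator, assuming the Term ADT and the subst helper from the question (the exact decrement discipline is a matter of taste):

def eval(fuel: BigInt, t: Term): Option[Term] = {
  if (fuel <= 0) None[Term]() // out of fuel: give up
  else t match {
    case App(t1, t2) => eval(fuel - 1, t1) match {
      case Some(Abs(x, body)) => eval(fuel - 1, t2) match {
        case Some(v2) => eval(fuel - 1, subst(x, v2, body))
        case None() => None[Term]()
      }
      case _ => None[Term]() // stuck
    }
    case _ => Some(t) // Abs or Var, already a value
  }
}

Since fuel strictly decreases on every recursive call, termination follows by induction on fuel; if the checker needs a hint, Stainless also lets you state the measure explicitly with decreases(fuel).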

Related

Scala match case with multiple branches with if

I have a match with if guards where the expression is always the same.
Here is some pseudocode:
value match {
  case A => same expression
  case B(_) if condition1 => same expression
  case _ if condition2 => same expression
  ...
  case _ => different expression // similar to an else
}
The match contains both case object matching (case A) and case class matching (case B(_)).
Is this best practice?
Try to explain this code in words. "This function returns one of two values. The first is returned if the input is A. Or if the input is of type B and a condition holds. Oh, or if a different condition holds. Otherwise, it's the other value". That sounds incredibly complex to me.
I have to recommend breaking this down at least a bit. At minimum, you've got two target expressions, and which one is chosen depends on some predicate of value. That sounds like a Boolean to me. Assuming value is of some trait type Foo (which A.type and B extend), you could write
sealed trait Foo {
  def isFrobnicated: Boolean = this match {
    case A => true
    case B(_) if condition1 => true
    case _ => condition2
  }
}
...
if (value.isFrobnicated) {
  same expression
} else {
  different expression
}
Now the cognitive load is split into two different, smaller chunks of code to digest, and presumably isFrobnicated will be given a self-documenting name and a chunk of comments explaining why this distinction is important. Anyone reading the bottom snippet can simply understand "Oh, there's two options, based on the frobnication status", and if they want more details, there's some lovely prose they can go read in the isFrobnicated docs. And all of the complexity of "A or B if this or anything if that" is thrown into its own function, separate from everything else.
If A and B don't have a common supertype that you control, then you can always write a standalone function, an implicit class, or (if you're in Scala 3) a proper extension method. Take your pick.
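If, say, Foo came from a library, the Scala 3 version could look like this (a sketch reusing the hypothetical A, B, condition1 and condition2 from above):

// Adds isFrobnicated to Foo without modifying Foo itself.
extension (value: Foo)
  def isFrobnicated: Boolean = value match {
    case A => true
    case B(_) if condition1 => true
    case _ => condition2
  }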
Depending on your actual use case, there may be more that can be done, but this should be a start.

Why doesn't Scala optimize calls to the same Extractor?

Take the following example: why is the extractor called multiple times, as opposed to temporarily storing the result of the first call and matching against that? Wouldn't it be reasonable to assume that the results from unapply would not change given the same string?
object Name {
  val NameReg = """^(\w+)\s(?:(\w+)\s)?(\w+)$""".r
  def unapply(fullName: String): Option[(String, String, String)] =
    fullName match {
      // Return None instead of throwing a MatchError when the regex fails.
      case NameReg(fname, mname, lname) =>
        Some((fname, if (mname == null) "" else mname, lname))
      case _ => None
    }
}
"John Smith Doe" match {
  case Name("Jane", _, _) => println("I know you, Jane.")
  case Name(f, "", _) => println(s"Hi ${f}")
  case Name(f, m, _) => println(s"Howdy, ${f} ${m}.")
  case _ => println("Don't know you")
}
Wouldn't it be reasonable to assume that the results from unapply would not change given the same string?
Unfortunately, assuming isn't good enough for a (static) compiler. In order for memoizing to be a legal optimization, the compiler has to prove that the expression being memoized is pure and referentially transparent. However, in the general case, this is equivalent to solving the Halting Problem.
It would certainly be possible to write an optimization pass which tries to prove the purity of certain expressions and memoizes them if and only if it succeeds, but that may be more trouble than it's worth. Such proofs get very hard very quickly, so they are only likely to succeed for very trivial expressions, which execute very quickly anyway.
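What you can always do is run the extractor once yourself and then match on the captured values; a sketch using the Name extractor above:

// unapply runs exactly once; the inner match only inspects the tuple.
"John Smith Doe" match {
  case Name(f, m, _) => (f, m) match {
    case ("Jane", _) => println("I know you, Jane.")
    case (_, "")     => println(s"Hi ${f}")
    case _           => println(s"Howdy, ${f} ${m}.")
  }
  case _ => println("Don't know you")
}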
What is a pattern match? The spec says it matches the "shape" of the value and binds vars to its "components."
In the realm of mutation, you have questions like, if I match on case class C(var v: V), does a case C(x) capture the mutable field? Well, the answer was no:
https://issues.scala-lang.org/browse/SI-5158
The spec says (sorry) that order of evaluation may be changed, so it recommends against side-effects:
In the interest of efficiency the evaluation of a pattern matching
expression may try patterns in some other order than textual sequence.
This might affect evaluation through side effects in guards.
That's in relation to guard expressions, presumably because extractors were added after case classes.
There's no special promise to evaluate extractors exactly once. (Explicitly in the spec, that is.)
The semantics are only that "patterns are tried in sequence".
Also, your example for regexes can be simplified, since a regex will not be re-evaluated when unapplied to its own Match. See the example in the doc. That is,
Name.NameReg findFirstMatchIn s map {
  case Name("Jane", _, _) => ...
}
where
object Name { def unapply(m: Regex.Match) = ... }
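Spelled out, that might look as follows (a sketch; the group numbers refer to the regex above):

import scala.util.matching.Regex

object Name {
  val NameReg = """^(\w+)\s(?:(\w+)\s)?(\w+)$""".r
  // Extracting from an already-computed Match just reads the groups;
  // the regex is not evaluated a second time.
  def unapply(m: Regex.Match): Option[(String, String, String)] =
    Some((m.group(1), Option(m.group(2)).getOrElse(""), m.group(3)))
}

Name.NameReg.findFirstMatchIn("John Smith Doe").foreach {
  case Name("Jane", _, _) => println("I know you, Jane.")
  case Name(f, "", _)     => println(s"Hi ${f}")
  case Name(f, m, _)      => println(s"Howdy, ${f} ${m}.")
}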

PartialFunction That Isn't Partial

Is there a reason to use a PartialFunction on a function that's not partial?
scala> val foo: PartialFunction[Int, Int] = {
| case x => x * 2
| }
foo: PartialFunction[Int,Int] = <function1>
foo is defined as a PartialFunction, but of course the case x will catch all input.
Is this simply bad code as the PartialFunction type indicates to the programmer that the function is undefined for certain inputs?
There is no advantage in using a PartialFunction instead of a Function, but if you have to pass a PartialFunction, then you have to pass a PartialFunction.
Note that, because of the inheritance between these two, overloading a method to accept both results in something difficult to use, as the type inference won't work.
The thing is, there are many cases where what you need to accept in a trait/object/function definition is a PartialFunction, but in reality the actual implementation may not be partial at all. Case in point: take a look at def collect[B](f: PartialFunction[A,B]):
val myList = thatList collect {
  case Right(value) => value
  case Left(other) => other.toInt
}
It's clearly not a "real" partial as it is defined for all input. That said, if I wanted to, I could just have the Right match.
However, if I had written collect to take a plain function, then I'd miss out on the desired behavior (that is, being both a filter and a map rolled into one, based on where the function is defined). That's nice behavior, and it allows for a lot of flexibility when writing my own code.
So I guess the better question is, will you ever want behavior to reflect that a function might not be defined everywhere? If the answer is no, then don't do it.
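To make the filter-and-map behavior concrete (thatList is made up for illustration):

val thatList: List[Either[String, Int]] = List(Right(1), Left("two"), Right(3))
// Lefts are filtered out and Rights are mapped, all in one pass.
val onlyRights = thatList collect { case Right(value) => value }
// onlyRights == List(1, 3)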
PartialFunction literals allow pattern matching directly on arguments (e.g. { case (a, b) => ... } instead of _ match { case (a, b) => ... }), which makes code more readable (see #wheaties' answer for another example).
EDIT: apparently this is wrong, see Daniel C. Sobral's comment on his answer. Not deleting, so that the comments still make sense.

How can I omit this Nil Case

I am mucking around with Scala, implementing some common algorithms, and while attempting to recreate a bubble sort I ran into this issue.
Here is an implementation of the inner loop that bubbles a value to the top:
def pass(xs: List[Int]): List[Int] = xs match {
  case Nil => Nil
  case x :: Nil => x :: Nil
  case l :: r :: xs if l > r => r :: pass(l :: xs)
  case l :: r :: xs => l :: pass(r :: xs)
}
My issue is with case Nil => Nil. I understand that I need it because Nil can be passed to this function. Is there a way to ensure that Nil can't be provided as an argument, in a manner that satisfies the compiler, so I can eliminate this case?
List has two subtypes, Nil and ::, so :: represents a list that has at least one element.
def pass(xs: ::[Int]): List[Int] = xs match {
  case x :: Nil => x :: Nil
  case l :: r :: xs if l > r => r :: pass(new ::(l, xs))
  case l :: r :: xs => l :: pass(new ::(r, xs))
}
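A hypothetical caller then performs the Nil check exactly once, at the boundary:

def bubbleOnce(xs: List[Int]): List[Int] = xs match {
  case nonEmpty @ (_ :: _) => pass(nonEmpty) // nonEmpty has type ::[Int]
  case Nil => Nil
}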
Otherwise, you can simply play with the order of the case clauses:
def pass(xs: List[Int]): List[Int] = xs match {
  case l :: r :: xs if l > r => r :: pass(l :: xs)
  case l :: r :: xs => l :: pass(r :: xs)
  case xs => xs
}
The first two clauses only match lists with two or more elements; the last clause matches everything else.
This would roughly correspond to a refinement of the original type, where you would write a type whose members are a subset of the initial type. You would then show that every input x to your function is non-Nil. As this requires a good amount of proof (you could implement it in Coq with dependent types, using a subset type), the better thing to do in this situation might be to introduce a new type: a list with no Nil constructor, only a cons constructor and a single-element constructor.
EDIT: As Scala allows you to use subtyping over the List type to enforce this, you can prove it decidably in the type system using that subtype. This is still a proof, in the sense that any type check corresponds to a proof that the program does indeed inhabit some type; it's just something the compiler can prove completely automatically.

Costly computation occurring in both isDefinedAt and apply of a PartialFunction

It is quite possible that, to know whether a function is defined at some point, a significant part of the computation of its value has to be done. In a PartialFunction, both isDefinedAt and apply then have to do that work. What can one do when this common work is costly?
There is the possibility of caching its result, hoping that apply will be called after isDefined. Definitely ugly.
I often wish that PartialFunction[A,B] were Function[A, Option[B]], to which it is clearly isomorphic. Or maybe there could be another method in PartialFunction, say applyOption(a: A): Option[B]. With some mixins, implementors would have a choice of implementing either isDefinedAt and apply, or applyOption (or all of them, to be on the safe side performance-wise). Clients that test isDefinedAt just before calling apply would be encouraged to use applyOption instead.
However, this is not so. Some major methods in the library, among them collect in the collections, require a PartialFunction. Is there a clean (or not so clean) way to avoid paying for computations repeated between isDefinedAt and apply?
Also, is the applyOption(a: A): Option[B] method reasonable? Does it sound feasible to add it in a future version? Would it be worth it?
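The proposal could be sketched as a mixin like this (hypothetical, not in the standard library):

trait OptionPartialFunction[A, B] extends PartialFunction[A, B] {
  def applyOption(a: A): Option[B] // implementors provide this once
  def isDefinedAt(a: A): Boolean = applyOption(a).isDefined
  def apply(a: A): B = applyOption(a).get
}

(Since Scala 2.10, PartialFunction offers applyOrElse, which can be overridden to the same effect: one method that does the work once.)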
Why is caching such a problem? In most cases, you have a local computation, so as long as you write a wrapper for the caching, you needn't worry about it. I have the following code in my utility library:
class DroppedFunction[-A, +B](f: A => Option[B]) extends PartialFunction[A, B] {
  private[this] var tested = false
  private[this] var arg: A = _
  private[this] var ans: Option[B] = None
  // Evaluate f at most once per (consecutive) distinct argument.
  private[this] def cache(a: A): Unit = {
    if (!tested || a != arg) {
      tested = true
      arg = a
      ans = f(a)
    }
  }
  def isDefinedAt(a: A) = {
    cache(a)
    ans.isDefined
  }
  def apply(a: A) = {
    cache(a)
    ans.get
  }
}
class DroppableFunction[A, B](f: A => Option[B]) {
  def drop = new DroppedFunction(f)
}
implicit def function_is_droppable[A, B](f: A => Option[B]) = new DroppableFunction(f)
and then if I have an expensive computation, I write a method of type A => Option[B] and do something like (f _).drop to use it in collect or whatnot. (If you want to do it inline, you can create a method that takes an A => Option[B] and returns a partial function.)
(The opposite transformation, from PartialFunction to A => Option[B], is called lifting, hence the name "drop"; "unlift" is, I think, the more widely used term for the drop operation.)
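The standard library now ships this under the name Function.unlift; a small sketch (assuming Scala 2.13+ for toIntOption):

// Function.unlift turns an A => Option[B] into a PartialFunction[A, B];
// its applyOrElse calls the underlying function only once per element.
def parse(s: String): Option[Int] = s.toIntOption
val pf: PartialFunction[String, Int] = Function.unlift(parse)
List("1", "x", "3") collect pf // List(1, 3)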
Have a look at this thread, Rethinking PartialFunction. You're not the only one wondering about this.
This is an interesting question, and I'll give my 2 cents.
First off, resist the urge for premature optimization: make sure the partial function really is the problem. I was amazed at how fast they are in some cases.
Now assuming there is a problem, where would it come from?
Could be a large number of case clauses
Complex pattern matching
Some complex computation in the if guards
One option is to find ways to fail fast: break the pattern matching into layers, then chain the partial functions. This way you can fail the match early. Also, extract repeated sub-matching. For example:
Let's assume OddEvenList is an extractor that breaks a list into an odd list and an even list:
val pf1: PartialFunction[List[Int], R] = {
  case OddEvenList(1 :: ors, 2 :: ers) => R1
  case OddEvenList(3 :: ors, 4 :: ers) => R2
}
Break it into two parts: one that matches the split, and one that tries to match the result (to avoid repeated computation). However, this may require some re-engineering:
val pf2: PartialFunction[(List[Int], List[Int]), R] = {
  case (1 :: ors, 2 :: ers) => R1
  case (3 :: ors, 4 :: ers) => R2
}
val pf1: PartialFunction[List[Int], R] = {
  case OddEvenList(ors, ers) if pf2.isDefinedAt((ors, ers)) => pf2((ors, ers))
}
I have used this when progressively reading XML files that had a rather inconsistent format.
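For experimentation, a hypothetical OddEvenList might split by position, so that List(1, 2, 3, 4) yields (List(1, 3), List(2, 4)):

object OddEvenList {
  // Elements at even indices form the first list, odd indices the second.
  def unapply(xs: List[Int]): Option[(List[Int], List[Int])] = {
    val (evens, odds) = xs.zipWithIndex.partition(_._2 % 2 == 0)
    Some((evens.map(_._1), odds.map(_._1)))
  }
}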
Another option is to compose partial functions using andThen, although a quick test here seemed to indicate that only the first one is actually tested.
There is absolutely nothing wrong with a caching mechanism inside the partial function, provided that:
the function always returns the same output when passed the same argument
it has no side effects
it is completely hidden from the rest of the world
Such a cached function is indistinguishable from a plain old pure partial function...