Cannot prove in Dafny that variance(x)>=0 - covariance

I would like to prove in Dafny that variance(x) (i.e., covariance(x,x)) is >=0. Note that we define covariance(x,y) as: 1/N*Summation{i=1 to N}{(x_i-x_mean)*(y_i-y_mean)}, where N is the number of elements of x and y.
Let us define covariance in Dafny as in the question "Proving a covariance inequality in Dafny, use contradiction?":
//calculates the sum of all elements of a sequence
function method sum_seq(s:seq<real>): (res:real)
ensures (forall i :: 0 <= i < |s| ==> s[i] >= 0.0) ==> res>=0.0
decreases s;
{
if s == [] then 0.0 else s[0] + sum_seq(s[1..])
}
//calculates the mean of a sequence
function method mean_fun(s:seq<real>): (res:real)
requires |s| >= 1;
decreases |s|;
ensures (forall i :: 0 <= i < |s| ==> s[i] >= 0.0) ==> res>=0.0
{
sum_seq(s) / (|s| as real)
}
//from a sequence x, it constructs a=[x[0]-x_mean, x[1]-x_mean...]
function construct_list (x:seq<real>, m:real) : (a:seq<real>)
requires |x| >= 1
ensures |x|==|a|
{
if |x| == 1 then [x[0]-m]
else [x[0]-m] + construct_list(x[1..],m)
}
//it performs the Summation of the covariance
//note that a=[x[0]-x_mean, x[1]-x_mean...] and b=[y[0]-y_mean, y[1]-y_mean...]
function {:fuel 2} product(a: seq<real>, b: seq<real>) : real
requires |a| == |b|
{
if |a| == 0 then 0.0
else a[0] * b[0] + product(a[1..], b[1..])
}
//covariance is the Summation divided by the number of elements
function cov(x: seq<real>, y: seq<real>) : (res:real)
requires |x| == |y|
requires |x| >= 1
ensures cov(x,y) == product(construct_list(x, mean_fun(x)),construct_list(y, mean_fun(y))) / (|x| as real)
//i.e., ensures cov(x,y) == product(a,b) / (|x| as real)
{
var x_mean := mean_fun(x);
var y_mean := mean_fun(y);
var a := construct_list(x, x_mean);
var b := construct_list(y, y_mean);
product(a,b) / (|a| as real)
}
Thus, to prove cov(x,x) >= 0, I first tried the following directly:
//variance (i.e., cov(x,x)) is always non-negative
lemma covarianceItself_positive(a:seq<real>)
requires |a| >= 1
requires forall i :: 0 <= i < |a| ==> a[i] >= 0.0
ensures cov(a,a) >= 0.0
{}
This does not verify on its own. So I verified its base case and then realized that it suffices to prove that product(a,a) >= 0.0, where a is the mean-centered list. This is stated in the lemma productItself_positive(a):
lemma covarianceItself_positive(a:seq<real>)
requires |a| >= 1
requires forall i :: 0 <= i < |a| ==> a[i] >= 0.0
ensures cov(a,a) >= 0.0
{
if (|a|==1){
assert cov(a,a) >= 0.0;
}
else{
productItself_positive(a);
}
}
Where productItself_positive is the following:
lemma productItself_positive(a:seq<real>)
requires |a| >= 1
requires forall i :: 0 <= i < |a| ==> a[i] >= 0.0
ensures product(construct_list(a, mean_fun(a)),construct_list(a, mean_fun(a))) >= 0.0
{}
I am trying to write a proof for this lemma, so I started a calculation. My problems are that (1) Dafny reports that a calculation step might not hold; and (2) Dafny easily gets stuck while verifying, so I do not know whether I am following the right idea. My progress so far is as follows:
lemma productItself_positive(a:seq<real>)
requires |a| >= 1
requires forall i :: 0 <= i < |a| ==> a[i] >= 0.0
ensures product(construct_list(a, mean_fun(a)),construct_list(a, mean_fun(a))) >= 0.0
{
if (|a|==1){
assert product(construct_list(a, mean_fun(a)),construct_list(a, mean_fun(a))) >= 0.0;
}
else {
calc >= {
product(construct_list(a, mean_fun(a)),construct_list(a, mean_fun(a)));
{
assert forall x:real :: forall y:real :: (x>=0.0 && y>=0.0) ==> (x*y>=0.0);
assert forall x:real :: forall y:real :: (x>=0.0 && y>=0.0) ==> (x+y>=0.0);
assert construct_list(a, mean_fun(a))[0] * construct_list(a, mean_fun(a))[0] >= 0.0;
productItself_positive(a[1..]);
//assert product(a[1..],a[1..]) >= 0.0; //Loops forever
}
//construct_list(a, mean_fun(a))[0] * construct_list(a, mean_fun(a))[0] + product(a[1..],a[1..]); //This should hold, but says that previous calculation does not hold
//{assume construct_list(a, mean_fun(a))[0] * construct_list(a, mean_fun(a))[0] + product(a[1..],a[1..]) >= 0.0;}
//construct_list(a, mean_fun(a))[0] * construct_list(a, mean_fun(a))[0] + product(a[1..],a[1..]) >= 0.0;
//Knowing that construct_list(a, mean_fun(a))[0] * construct_list(a, mean_fun(a))[0] >= 0.0 and that product(a[1..],a[1..]), their sum should be >=0.0
0.0;
}
}
}
Any help?

There is a straightforward way to do this. First, add a postcondition to product stating that the result is non-negative when both sequences are the same. Dafny is able to verify it without any help:
function {:fuel 2} product(a: seq<real>, b: seq<real>) : real
requires |a| == |b|
ensures a == b ==> product(a, b) >= 0.0
{
if |a| == 0 then 0.0
else a[0] * b[0] + product(a[1..], b[1..])
}
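Then use the fact that cov(a, a) is the product of two equal sequences divided by a positive length. Note that the precondition requiring every a[i] >= 0.0 is no longer needed: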
lemma self_cov_is_positive(a: seq<real>)
requires |a| >= 1
ensures cov(a, a) >= 0.0
{
var a_list := construct_list(a, mean_fun(a));
assert cov(a, a) == product(a_list, a_list) / (|a| as real);
}
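If you would rather leave the contract of product untouched, the same fact can be stated as a separate lemma proved by induction on the sequence. The following is only a minimal sketch: the lemma name ProductSelfNonNegative is hypothetical (it does not appear in the original post), product is repeated without its {:fuel 2} attribute so that the snippet is self-contained, and it relies on the verifier discharging the nonlinear fact a[0] * a[0] >= 0.0, which may depend on the Dafny version.

```dafny
// Same definition as in the question, minus the {:fuel 2} attribute.
function product(a: seq<real>, b: seq<real>): real
  requires |a| == |b|
{
  if |a| == 0 then 0.0
  else a[0] * b[0] + product(a[1..], b[1..])
}

// Hypothetical helper lemma (not from the original post):
// the element-wise product of a sequence with itself sums to a non-negative value.
lemma ProductSelfNonNegative(a: seq<real>)
  ensures product(a, a) >= 0.0
  decreases |a|
{
  if |a| > 0 {
    // Induction hypothesis for the tail.
    ProductSelfNonNegative(a[1..]);
    // The head contributes a square, which is never negative.
    assert a[0] * a[0] >= 0.0;
  }
}
```

Calling ProductSelfNonNegative(a_list) inside self_cov_is_positive, just before the assert, would then play the role of the extra postcondition.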

Related

How to simplify and extend verified algorithm?

I implemented and verified a quicksort function based on the implementation here on page 22.
It verifies (hurray!) but I'm not too happy with the proof, quicksortSorted, that the quicksort function is sorted for a couple reasons that lead to the following questions.
1. For whatever reason, the proof will not verify because of the two recursive calls needed by the lemma. I got around this by making the quicksort function opaque and using a helper lemma, quicksortDef, whenever I needed to assert something about its definition. I feel that making the function opaque should not be required. Is there another way to call the recursive subcases that does not explode (a better inductive setup)?
2. Following from the above, I saw that the {:fuel} attribute exists. Can it be used to improve this situation?
3. I was able to verify the lemma that the result of quicksort is sorted, but it would be nice if that fact were part of the ensures clauses of the quicksort function itself. My attempts to add it there always seemed to trigger endless recursion. Is there a better way to define quicksort, or to prove it, which allows this?
4. When I tried to make filter and quicksort function methods, I got an error about filter's parameter being a ghost variable. Why is a generic predicate function parameter a ghost variable?
5. Finally, the quicksortSorted lemma feels over-complicated. Any suggestions on how to simplify it?
Whole implementation on github:
function {:opaque} quicksort(xs: seq<int>): seq<int>
// ensures sortedRec(quicksort(xs))
ensures multiset(xs) == multiset(quicksort(xs))
ensures xs == [] ==> quicksort(xs) == []
decreases multiset(xs)
{
if xs == [] then [] else
assert xs == [xs[0]] + xs[1..];
filterPreservesMultiset(xs);
// var ln := y => y < xs[0];
// var gn := y => y >= xs[0];
var ln := lessThanFirst(xs);
var gn := greaterOrEqualFirst(xs);
filterMultiSetSlice(xs, xs[1..], ln);
filterMultiSetSlice(xs, xs[1..], gn);
quicksort(filter(xs[1..], ln)) + [xs[0]] + quicksort(filter(xs[1..], gn))
}
lemma quicksortDef(xs: seq<int>)
requires |xs| > 0
ensures quicksort(xs) == quicksort(filter(xs[1..], lessThanFirst(xs))) + [xs[0]] + quicksort(filter(xs[1..], greaterOrEqualFirst(xs)))
{
reveal quicksort();
}
lemma quicksortSorted(xs: seq<int>)
ensures sortedRec(quicksort(xs))
decreases multiset(xs)
{
if xs == [] {
assert quicksort(xs) == [];
assert sortedRec(quicksort(xs));
}else{
assert xs == [xs[0]] + xs[1..];
var ln := lessThanFirst(xs);
var gn := greaterOrEqualFirst(xs);
if xs[1..] == [] {
assert xs == [xs[0]];
assert sortedRec([xs[0]]);
}else{
// filterPreservesMultiset(xs[1..]);
filterMultiSetSlice(xs, xs[1..], ln);
filterMultiSetSlice(xs, xs[1..], gn);
var lessThan := filter(xs[1..], ln);
var greaterThan := filter(xs[1..], gn);
var sortedLt := quicksort(lessThan);
var sortedGt := quicksort(greaterThan);
// assert multiset(lessThan) == multiset(sortedLt);
assert multiset(greaterThan) == multiset(sortedGt);
assert forall y :: y in multiset(sortedGt) ==> y in multiset(greaterThan) && y in sortedGt && y in greaterThan;
assert listPartition(lessThan, [xs[0]]);
quicksortPreservesRelations(lessThan, [xs[0]]);
assert listPartition(sortedLt, [xs[0]]);
assert forall x :: x in sortedLt ==> forall y :: y in [xs[0]] ==> x < y ==> x < xs[0];
forall x | x in sortedLt + [xs[0]]
ensures forall y :: y in sortedGt ==> x <= y
{
forall y | y in sortedGt
ensures x <= y
{
assert y in greaterThan;
assert y >= xs[0];
if x in sortedLt {
assert forall z :: z in [xs[0]] ==> x < z ==> x < xs[0];
}else if x in [xs[0]] {
assert x == xs[0];
assert y >= xs[0];
}
}
}
quicksortSorted(lessThan);
quicksortSorted(greaterThan);
assert listPartition(sortedLt + [xs[0]], sortedGt);
// assert sortedRec(sortedLt);
// assert sortedRec(sortedGt);
sortedConcat(sortedLt, [xs[0]]);
assert sortedRec(sortedLt+[xs[0]]);
sortedConcat(sortedLt + [xs[0]], sortedGt);
assert sortedRec((sortedLt + [xs[0]]) + sortedGt);
quicksortDef(xs);
assert quicksort(xs) == sortedLt + [xs[0]] + sortedGt;
assert sortedRec(quicksort(xs));
}
}
}
predicate listPartition(xs: seq<int>, ys: seq<int>)
{
forall x :: x in xs ==> forall y :: y in ys ==> x <= y
}
predicate sortedRec(list: seq<int>) {
if list == [] then true else (forall y :: y in list[1..] ==> list[0] <= y) && sortedRec(list[1..])
}
lemma sortedConcat(xs: seq<int>, ys: seq<int>)
requires sortedRec(xs)
requires sortedRec(ys)
requires listPartition(xs,ys)
ensures sortedRec(xs + ys)
{
if xs == [] || ys == [] {
if xs == [] {
assert xs + ys == ys;
assert sortedRec(xs + ys);
} else if ys == [] {
assert xs + ys == xs;
assert sortedRec(xs+ys);
}
}else{
assert sortedRec([xs[0]]);
assert sortedRec([ys[0]]);
var sum := xs + ys;
assert xs == [xs[0]] + xs[1..];
assert ys == [ys[0]] + ys[1..];
assert xs[0] in xs;
assert forall y :: y in ys ==> xs[0] <= y;
assert forall xz :: xz in xs[1..] ==> xz in xs && forall y :: y in ys ==> xz <= y;
sortedConcat(xs[1..], ys);
assert xs+ys == [xs[0]] + (xs[1..]+ys);
assert sortedRec(xs + ys );
}
}

How to write a specification for a method that converts a char array to an integer in Dafny?

method atoi(a:array<char>) returns(r:int)
requires a.Length>0
requires forall k :: 0<= k <a.Length ==> (a[k] as int) - ('0' as int) <= 9
ensures ??
{
var j:int := 0;
while j < a.Length
invariant ??
{
r := r*10 + (a[j] as int) - ('0' as int);
j := j + 1;
}
}
How should I write the "ensures" clause for the atoi method and the "invariant" for the while loop in Dafny?
I expressed the idea "each digit of the return value corresponds to one character of the array" as follows:
// Ten to the NTH power
// e.g.: ten_pos_pow(2) == 10*10 == 100
function ten_pos_pow(p:int):int
requires p>=0
ensures ten_pos_pow(p) >= 1
{
if p==0 then 1 else
10*ten_pos_pow(p-1)
}
// Count from right to left, the ith digit of integer v (i starts from zero)
// e.g.: num_in_int(123,0) == 3 num_in_int(123,1) == 2 num_in_int(123,2) == 1
function num_in_int(v:int,i:int) : int
requires i>=0
{
(v % ten_pos_pow(i+1))/ten_pos_pow(i)
}
method atoi(a:array<char>) returns(r:int)
requires a.Length>0
requires forall k :: 0<= k <a.Length ==> (a[k] as int) - ('0' as int) <= 9
ensures forall k :: 0<= k < a.Length ==> ((a[k] as int) - ('0' as int)) == num_in_int(r,a.Length-k-1)
{
var i:int := 0;
r := 0;
while i < a.Length
invariant 0<= i <= a.Length
invariant forall k :: 0<= k < i ==> ((a[k] as int) - ('0' as int)) == num_in_int(r,i-k-1) // loop invariant violation
{
r := r*10 + (a[i] as int) - ('0' as int);
i := i + 1;
}
}
But Dafny reports a loop invariant violation. How can I write a correct and provable specification?
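One common way to make such a specification provable is to describe the whole result with a recursive function whose shape mirrors the loop, instead of describing each digit with % and /. The following is only a sketch under stated assumptions: digit_value and seq_value are hypothetical names introduced here, the characters are assumed to be decimal digits (a precondition saying so would be needed for the result to be meaningful, but not for the proof), and the two assert statements are hints that the verifier may or may not need, depending on the Dafny version.

```dafny
// Hypothetical helper: numeric value of one character, assuming it is a decimal digit.
function digit_value(c: char): int
{
  (c as int) - ('0' as int)
}

// Hypothetical helper: value of a digit string, most significant digit first.
// Peeling off the LAST character mirrors the loop step r := r*10 + digit.
function seq_value(s: seq<char>): int
  decreases |s|
{
  if |s| == 0 then 0
  else seq_value(s[..|s|-1]) * 10 + digit_value(s[|s|-1])
}

method atoi(a: array<char>) returns (r: int)
  ensures r == seq_value(a[..])
{
  var i := 0;
  r := 0;
  while i < a.Length
    invariant 0 <= i <= a.Length
    invariant r == seq_value(a[..i])   // r is the value of the prefix read so far
  {
    // Hint: dropping the last element of a[..i+1] gives a[..i].
    assert a[..i+1][..i] == a[..i];
    r := r * 10 + (a[i] as int) - ('0' as int);
    i := i + 1;
  }
  // At loop exit i == a.Length, so the prefix is the whole array.
  assert a[..i] == a[..];
}
```

The design choice is that the invariant talks about the prefix a[..i] as a whole, so each iteration only has to justify one unfolding of seq_value rather than re-establish a property for every earlier digit.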

Trigger Dafny with multisets

This lemma verifies, but Dafny raises a warning that no triggers were found:
lemma multisetPreservesGreater (a:seq<int>, b:seq<int>, c:int, f:int, x:int)
requires |a|==|b| && 0 <= c <= f + 1 <= |b|
requires (forall j :: c <= j <= f ==> a[j] >= x)
requires multiset(a[c..f+1]) == multiset(b[c..f+1])
ensures (forall j :: c <= j <= f ==> b[j] >= x)
{
assert (forall j :: j in multiset(a[c..f+1]) ==> j in multiset(b[c..f+1]));
}
I do not know how to instantiate this trigger (cannot instantiate it as a function, or can I?). Any help?
Edit: Maybe I can define a function f that takes an array and puts its elements into a multiset, so that I can trigger on f(a), but that does not mention i. I will try.
Here's one way to transform the program so that there are no trigger warnings.
function SeqRangeToMultiSet(a: seq<int>, c: int, f: int): multiset<int>
requires 0 <= c <= f + 1 <= |a|
{
multiset(a[c..f+1])
}
lemma multisetPreservesGreater (a:seq<int>, b:seq<int>, c:int, f:int, x:int)
requires |a|==|b| && 0 <= c <= f + 1 <= |b|
requires (forall j :: c <= j <= f ==> a[j] >= x)
requires multiset(a[c..f+1]) == multiset(b[c..f+1])
ensures (forall j :: c <= j <= f ==> b[j] >= x)
{
assert forall j :: j in SeqRangeToMultiSet(a, c, f) ==> j in SeqRangeToMultiSet(b, c, f);
forall j | c <= j <= f
ensures b[j] >= x
{
assert b[j] in SeqRangeToMultiSet(b, c, f);
}
}
The point is that we introduce the function SeqRangeToMultiSet to stand for a subexpression that is not a valid trigger (because it contains arithmetic). Then SeqRangeToMultiSet itself can be the trigger.
The downside of this approach is that it decreases automation. You can see that we had to add a forall statement to prove the postcondition. The reason is that we need to mention the trigger, which does not appear in the postcondition.

Dafny fails to prove max element in integer array

I'm trying to prove a simple program in Dafny that finds the maximum element of an integer array. Dafny succeeds in a few seconds proving the program below. When I remove the comments from the last two ensures specifications, Dafny fires error messages saying that
a postcondition might not hold on this return path
This is probably caused by the fact that index is guaranteed to be <= a.Length. However, max_index < a.Length is correct, and I'm having a hard time proving it. I tried writing a nested invariant in the if statement, but Dafny rejected that syntax. Any possible solution?
Here is my code:
method FindMax(a: array<int>) returns (max: int, max_index : int)
requires a.Length > 0
ensures forall k :: 0 <= k < a.Length ==> a[k] <= max
ensures 0 <= max_index
// ensures max_index < a.Length
// ensures a[max_index] == max
{
max := 0;
var index := 0;
max_index := 0;
while index < a.Length
invariant 0 <= index <= a.Length
invariant forall k :: 0 <= k < index ==> a[k] <= max
{
if (max < a[index])
// invariant 0 <= index < a.Length
{
max := a[index];
max_index := index;
}
index := index + 1;
}
}
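It turns out my loop invariants needed more careful planning. The loop must also maintain that max_index is a valid index and that a[max_index] == max, and max has to start as a[0] rather than 0 (with max := 0, an array containing only negative numbers would break a[max_index] == max). Here is the corrected version: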
method FindMax(a: array<int>) returns (max: int, max_index : int)
requires a.Length > 0
ensures forall k :: 0 <= k < a.Length ==> a[k] <= max
ensures 0 <= max_index
ensures max_index < a.Length
ensures a[max_index] == max
{
var index := 0;
max_index := 0;
max := a[max_index];
while index < a.Length
invariant 0 <= max_index < a.Length
invariant 0 <= index <= a.Length
invariant forall k :: 0 <= k < index ==> a[k] <= max
invariant a[max_index] == max
{
if (max < a[index])
{
max := a[index];
max_index := index;
}
index := index + 1;
}
}
It takes Dafny a little over 10 seconds to prove.

fast power function in scala

I tried to write a function for fast power in Scala, but I keep getting java.lang.StackOverflowError. I think it has something to do with the two slashes that I use in the third line, where I recursively call the function for n/2.
Can someone explain why this is happening?
def fast_power(x:Double, n:Int):Double = {
if(n % 2 == 0 && n > 1)
fast_power(x, n/2) * fast_power(x, n /2)
else if(n % 2 == 1 && n > 1)
x * fast_power(x, n - 1)
else if(n == 0) 1
else 1 / fast_power(x, n)
}
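Your code doesn't terminate because there is no case for n = 1: it falls through to the final else branch, which calls fast_power(x, n) again with the same n. The same happens for any negative n.
Moreover, your fast_power has linear runtime, because the even case evaluates fast_power(x, n/2) twice.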
If you write it down like this instead:
def fast_power(x:Double, n:Int):Double = {
if(n < 0) {
1 / fast_power(x, -n)
} else if (n == 0) {
1.0
} else if (n == 1) {
x
} else if (n % 2 == 0) {
val s = fast_power(x, n / 2)
s * s
} else {
val s = fast_power(x, n / 2)
x * s * s
}
}
then it is immediately obvious that the runtime is logarithmic, because
n is at least halved in every recursive invocation.
I don't have any strong opinions on if-vs-match, so I just sorted all the cases in ascending order.
Prefer the match construct instead of multiple if/else blocks. This will help you isolate the problem you have (wrong recursive call), and write more understandable recursive functions. Always put the termination conditions first.
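def fastPower(x: Double, m: Int): Double = m match {
  case 0 => 1.0
  case 1 => x
  // negative exponents: invert the result for the positive exponent
  case n if n < 0 => 1 / fastPower(x, -n)
  case n if n % 2 == 0 =>
    // compute the half power once so the runtime stays logarithmic
    val half = fastPower(x, n / 2)
    half * half
  case n => x * fastPower(x, n - 1)
}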