scala filter of 2 sets - scala

def intersect(s: Set, t: Set): Set = (x => s(x) && t(x))
def filter(s: Set, p: Int => Boolean): Set = intersect(s, p)
I have this code.
I don't understand how does "filter" function be a valid one.
First of all, it uses intersect, but second parameter p is a method, not a Set as intersect function prototype requres.
Secondly, how does intersect(s, p) work as filter?
thanks

These are two different ways of looking at the same thing. As I mentioned in my previous answer, representing sets as their indicator functions makes lots of things more convenient, and one of those things is filtering.
Usually when we have a collection of some type A we can filter it using a predicate function A => Boolean that tells us whether or not we want to keep each element. In this case, the type of the predicate function is the same type we're using to represent the collection, and filtering is the same thing as taking the intersection of two sets.
To address your second question: intersect needs to return a function that will return true if the item is in both set s and set t and false otherwise. We can check that using the apply method on each (or in this case its syntactic sugar). The implementation is then a simple function literal (x => s(x) && t(x)) that takes an argument x and returns true if and only if x is in both sets.

Related

How to model if-expressions with actor systems?

I'm trying to emulate a simple functional language using an actor based execution model where issues arose modelling if-expression.
Actor systems nowadays are used basically for speeding up all kind of stuff by avoiding OS locks and stalled threads or to make microservices less painful, but initially it was supposed to be an alternative model of computation in general [1][2], a contemporary take on it may be propagation networks. So this should be capable to cover any programming language construct and certainly an if, right?
While I'm aware that this is occasionally met with irritation, I saw one timid attempt to move towards recursive algorithms represented using akka actors (I've refurbished it and added further examples including the one given below). That attempt halted at function calls, but why not go further and also model operators and if conditions with actors? In fact the smalltalk language applies this model and yields a precursor of the actor concept, as has been pointed out in the accepted answers below.
Surprisingly recursive function calls aren't much of an issue, but if1 is, due to it's potentially stateful nature.
Given the clause C: if a then x else y, here's the problem:
My initial idea was, that C is an actor behaving like a function with 3 parameters (a,x,y) that returns either x or y depending on a. Being maximally parallel [2] a,x and y would be evaluated simultaneously and passed as messages to C. Now, this isn't really good, if C is the exit condition of a recursive function f, sending one branch of f in an infinite recursion. Also if x or y have side effects one can't just evaluate both of them. Let's take this recursive sum (which is not the usual factorial, stupid as such and could be made tail recursive, but that's not the point)
f(n) = if n <= 0
0
else
n + f(n -1)
Note, that I'd like to create an if-expression resembling the one of Scala, see the (spec, p. 88), or Haskell, for that matter, rather than an if-statement that relies on side-effects.
f(0) would cause 3 concurrent evaluations
n <= 0 (ok)
0 (ok)
n + f(n -1) (bad, introducing the weird behavior that the call to f(n) actually will return (yielding 0) but the evaluation of its branches continues infinitely)
I can see these options from here:
The whole computation becomes stateful and the evaluation of either x
or y only happens after a has been calculated (mandatory if x or y have side effects).
Some guarding mechanism gets introduced that renders x or y not
applicable for arguments outside a certain range upon the call of f. They might evaluate to some "not applicable" marker instead of a value, which will not be used in C anyway, since it comes from a branch which isn't relevant.
I'm not sure at this point, If I didn't miss out on the question at a fundamental level and there are obvious other approaches that I just don't see. Input appreciated :)
Btw. see this for an exhaustive list of conditional branching in different languages, without giving their semantics, and, not exhaustive, the wiki page on conditionals, with semantics, and this for a discussion how the question at hand is an issue till down to the level of hardware.
1 I'm aware that an if could be seen as a special case of pattern matching, but then the question is, how to model the different cases of a match expression using actors. But maybe that wasn't even intended in the first place, matching is just something that every actor can do without referring to other specialized "match-actors". On the other hand it has been stated that "everything is an actor", rather confusing[2]. Btw. does anybody have a clear notion what the [#message whatever] notation is meant to be in that paper? # is irritatingly undefined. Maybe smalltak gives a hint, there it indicates a symbol.
There is a little bit of a misconception in your question. In functional languages, if is not necessarily a function of three parameters. Rather, it is sometimes two functions of two parameters.
In particular, that is how the Church Encoding of Booleans works in λ-calculus: there are two functions, let's call them True and False. Both functions have two parameters. True simply returns the first argument, False simply returns the second argument.
First, let's define two functions called true and false. We could define them any way we want, they are completely arbitrary, but we will define them in a very special way which has some advantages as we will see later (I will use ECMAScript as a somewhat reasonable approximation of λ-calculus that is probably readable by a bigger portion of visitors to this site than λ-calculus itself):
const tru = (thn, _ ) => thn,
fls = (_ , els) => els;
tru is a function with two parameters which simply ignores its second argument and returns the first. fls is also a function with two parameters which simply ignores its first argument and returns the second.
Why did we encode tru and fls this way? Well, this way, the two functions not only represent the two concepts of true and false, no, at the same time, they also represent the concept of "choice", in other words, they are also an if/then/else expression! We evaluate the if condition and pass it the then block and the else block as arguments. If the condition evaluates to tru, it will return the then block, if it evaluates to fls, it will return the else block. Here's an example:
tru(23, 42);
// => 23
This returns 23, and this:
fls(23, 42);
// => 42
returns 42, just as you would expect.
There is a wrinkle, however:
tru(console.log("then branch"), console.log("else branch"));
// then branch
// else branch
This prints both then branch and else branch! Why?
Well, it returns the return value of the first argument, but it evaluates both arguments, since ECMAScript is strict and always evaluates all arguments to a function before calling the function. IOW: it evaluates the first argument which is console.log("then branch"), which simply returns undefined and has the side-effect of printing then branch to the console, and it evaluates the second argument, which also returns undefined and prints to the console as a side-effect. Then, it returns the first undefined.
In λ-calculus, where this encoding was invented, that's not a problem: λ-calculus is pure, which means it doesn't have any side-effects; therefore you would never notice that the second argument also gets evaluated. Plus, λ-calculus is lazy (or at least, it is often evaluated under normal order), meaning, it doesn't actually evaluate arguments which aren't needed. So, IOW: in λ-calculus the second argument would never be evaluated, and if it were, we wouldn't notice.
ECMAScript, however, is strict, i.e. it always evaluates all arguments. Well, actually, not always: the if/then/else, for example, only evaluates the then branch if the condition is true and only evaluates the else branch if the condition is false. And we want to replicate this behavior with our iff. Thankfully, even though ECMAScript isn't lazy, it has a way to delay the evaluation of a piece of code, the same way almost every other language does: wrap it in a function, and if you never call that function, the code will never get executed.
So, we wrap both blocks in a function, and at the end call the function that is returned:
tru(() => console.log("then branch"), () => console.log("else branch"))();
// then branch
prints then branch and
fls(() => console.log("then branch"), () => console.log("else branch"))();
// else branch
prints else branch.
We could implement the traditional if/then/else this way:
const iff = (cnd, thn, els) => cnd(thn, els);
iff(tru, 23, 42);
// => 23
iff(fls, 23, 42);
// => 42
Again, we need some extra function wrapping when calling the iff function and the extra function call parentheses in the definition of iff, for the same reason as above:
const iff = (cnd, thn, els) => cnd(thn, els)();
iff(tru, () => console.log("then branch"), () => console.log("else branch"));
// then branch
iff(fls, () => console.log("then branch"), () => console.log("else branch"));
// else branch
Now that we have those two definitions, we can implement or. First, we look at the truth table for or: if the first operand is truthy, then the result of the expression is the same as the first operand. Otherwise, the result of the expression is the result of the second operand. In short: if the first operand is true, we return the first operand, otherwise we return the second operand:
const orr = (a, b) => iff(a, () => a, () => b);
Let's check out that it works:
orr(tru,tru);
// => tru(thn, _) {}
orr(tru,fls);
// => tru(thn, _) {}
orr(fls,tru);
// => tru(thn, _) {}
orr(fls,fls);
// => fls(_, els) {}
Great! However, that definition looks a little ugly. Remember, tru and fls already act like a conditional all by themselves, so really there is no need for iff, and thus all of that function wrapping at all:
const orr = (a, b) => a(a, b);
There you have it: or (plus other boolean operators) defined with nothing but function definitions and function calls in just a handful of lines:
const tru = (thn, _ ) => thn,
fls = (_ , els) => els,
orr = (a , b ) => a(a, b),
nnd = (a , b ) => a(b, a),
ntt = a => a(fls, tru),
xor = (a , b ) => a(ntt(b), b),
iff = (cnd, thn, els) => cnd(thn, els)();
Unfortunately, this implementation is rather useless: there are no functions or operators in ECMAScript which return tru or fls, they all return true or false, so we can't use them with our functions. But there's still a lot we can do. For example, this is an implementation of a singly-linked list:
const cons = (hd, tl) => which => which(hd, tl),
car = l => l(tru),
cdr = l => l(fls);
You may have noticed something peculiar: tru and fls play a double role, they act both as the data values true and false, but at the same time, they also act as a conditional expression. They are data and behavior, bundled up into one … uhm … "thing" … or (dare I say) object! Does this idea of identifying data and behavior remind us of anything?
Indeed, tru and fls are objects. And, if you have ever used Smalltalk, Self, Newspeak or other pure object-oriented languages, you will have noticed that they implement booleans in exactly the same way: two objects true and false which have method named if that takes two blocks (functions, lambdas, whatever) as arguments and evaluates one of them.
Here's an example of what it might look like in Scala:
sealed abstract trait Buul {
def apply[T, U <: T, V <: T](thn: ⇒ U)(els: ⇒ V): T
def &&&(other: ⇒ Buul): Buul
def |||(other: ⇒ Buul): Buul
def ntt: Buul
}
case object Tru extends Buul {
override def apply[T, U <: T, V <: T](thn: ⇒ U)(els: ⇒ V): U = thn
override def &&&(other: ⇒ Buul) = other
override def |||(other: ⇒ Buul): this.type = this
override def ntt = Fls
}
case object Fls extends Buul {
override def apply[T, U <: T, V <: T](thn: ⇒ U)(els: ⇒ V): V = els
override def &&&(other: ⇒ Buul): this.type = this
override def |||(other: ⇒ Buul) = other
override def ntt = Tru
}
object BuulExtension {
import scala.language.implicitConversions
implicit def boolean2Buul(b: ⇒ Boolean) = if (b) Tru else Fls
}
import BuulExtension._
(2 < 3) { println("2 is less than 3") } { println("2 is greater than 3") }
// 2 is less than 3
Given the very close relationship between OO and actors (they are pretty much the same thing, actually), which is not historically surprising (Alan Kay based Smalltalk on Carl Hewitt's PLANNER; Carl Hewitt based Actors on Alan Kay's Smalltalk), I wouldn't be surprised if this turned out to be a step in the right direction to solve your problem.
Q : didn't ( I ) miss out on the question at a fundamental level?
Yes, you happened to have missed a cardinal point: even the functional-languages, that may otherwise enjoy the forms of AND- and/or OR-based fine-grain sorts of parallelism, do not grow as wild as not to respect the strictly [SERIAL] nature of the if ( expression1 ) expression2 [ else expression3 ]
You have spend many efforts on argumentation about recursion-case(s) whereas the principal property was left out of your view. State-fullness is the Mother Nature of the computing ( these toys are nothing but finite state automata, no matter how large the state-space might be, it is and always will principally remain finite ).
Even the cited Scala p.88 confirms this: "The conditional expression is evaluated by evaluating first e1. If this evaluates to true, the result of evaluating e2 is returned, otherwise the result of evaluating e3 is returned." - which is a pure-[SERIAL] process-recipe ( one step after another ).
One may remember, that even the evaluation of expression1 may have (and does have ) state-change effects ( not only "side-effects" ), but indeed state-change effects ( PRNG-steps into a new state whenever a random-number was asked to get generated and many similar situations )
Thus the if e1 then e2 else e3 has to obey a pure-[SERIAL] implementation, no matter what benefits may be brought into action from fine-grain {AND|OR}-based-parallelism ( may see many working examples thereof in languages that can use 'em right since late 70-ies early 80-ies )

How does this Scala function work?

def flatMap[A,B](f: Rand[A])(g: A => Rand[B]): Rand[B] =rng => {
val (a, r1) = f(rng)
g(a)(r1)
}
I am confused by g(a)(r1) because g is supposed to take only one argument, so why r1?
I don't exactly know the Rand[A] structure, so this is partially guessing. What you are returning in this function is a Rand[B] and when you implement the function you start of with defining an argument rng of an anonymous function. This tells me that Rand is probably some kind of function itself.
In the second line (val (a, r1) = f(rng)) you apply the value rng to the instance f: Rand[A]. In the resulting tuple, a has type A.
Note with this that f(rng) is equal to explicitly calling apply: f.apply(rng).
You can use the value a to get a Rand[B] by applying this value to the function g. So, val rb: Rand[B] = g(a) (or g.apply(a)).
Now, you don't want to return an instance of Rand[B] in this anonymous function! Instead you need to apply the previous result r1 to this instance rb. So, you get rb(r1) or rb.apply(r1). Substituting rb with g(a) or g.apply(a) gives you g(a)(r1) or g.apply(a).apply(r1).
To summarize, as you said, g is supposed to only take one argument, which results in an instance of Rand[B], but that is not the expected return type here. You need to apply the result of the previous computation to this new computation to get the expected result.

Partial Functions in Scala

I just wanted to clarify something about partially defined functions in Scala. I looked at the docs and it said the type of a partial function is PartialFunction[A,B], and I can define a partial function such as
val f: PartialFunction[Any, Int] = {...}
I was wondering, for the types A and B, is A a parameter, and B a return type? If I have multiple accepted types, do I use orElse to chain partial functions together?
In the set theoretic view of a function, if a function can map every value in the domain to a value in the range, we say that this function is a total function. There can be situations where a function cannot map some element(s) in the domain to the range; such functions are called partial functions.
Taking the example from the Scala docs for partial functions:
val isEven: PartialFunction[Int, String] = {
case x if x % 2 == 0 => x+" is even"
}
Here a partial function is defined since it is defined to only map an even integer to a string. So the input to the partial function is an integer and the output is a string.
val isOdd: PartialFunction[Int, String] = {
case x if x % 2 == 1 => x+" is odd"
}
isOdd is another partial function similarly defined as isEven but for odd numbers. Again, the input to the partial function is an integer and the output is a string.
If you have a list of numbers such as:
List(1,2,3,4,5)
and apply the isEven partial function on this list you will get as output
List(2 is even, 4 is even)
Notice that not all the elements in the original list have been mapped by the partial function. However, there may be situations where you want to apply another function in those cases where a partial function cannot map an element from the domain to the range. In this case we use orElse:
val numbers = sample map (isEven orElse isOdd)
And now you will get as output:
List(1 is odd, 2 is even, 3 is odd, 4 is even, 5 is odd)
If you are looking to set up a partial function that, in effect, takes multiple parameters, define the partial function over a tuple of the parameters you'll be feeding into it, eg:
val multiArgPartial: PartialFunction[(String, Long, Foo), Int] = {
case ("OK", _, Foo("bar", _)) => 0 // Use underscore to accept any value for a given parameter
}
and, of course, make sure you pass arguments to it as tuples.
In addition to other answers, if by "multiple accepted types" you mean that you want the same function accept e.g. String, Int and Boolean (and no other types), this is called "union types" and isn't supported in Scala currently (but is planned for the future, based on Dotty). The alternatives are:
Use the least common supertype (Any for the above case). This is what orElse chains will do.
Use a type like Either[String, Either[Int, Boolean]]. This is fine if you have two types, but becomes ugly quickly.
Encode union types as negation of intersection types.

How to check in Scala if a specific function returns empty?

I have this function:
type CustomSet = Int => Boolean
If I want to make the intersection I do something like:
def intersection(s: CustomSet, t: CustomSet): CustomSet = {
(x: Int) => contains(s, x) && contains(t, x)
}
Now, I don't see any way to check if intersection of two sets is empty...
I tried a lot of ways:
- if (intersection(s, t) == CustomSet())
- if (intersection(s, t) == None)
etc but it's not working...
Can you please tell me where I am wrong in this checking?
Just putting all the comments together:
The result of intersection is just a function. You can compare two functions for referential equality, i.e. if they are one and the same function.
There is no way to test if two functions return the same result (and have the same side effects) for all possible parameters (and system states), so all you can do is define a range of input parameters you care about and compare the results of two functions with all the results for all the interesting input parameters.
So in your case you could do something like
(-1000 to 1000).forall(!intersection(s,t)(_))
which would test if all the numbers from -1000 to 1000 are Not in the intersection of s and t

Unsure how this union function on Sets works

I'm trying to understand this def method :
def union(a: Set, b: Set): Set = i => a(i) || b(i)
Which is referred to at question : Scala set function
This is my understanding :
The method takes two parameters of type Set - a & b
A Set is returned which the union of the two sets a & b.
Here is where I am particularly confused : Set = i => a(i) || b(i)
The returned Set itself contains the 'or' of Set a & b . Is the Set 'i' being populated by an implicit for loop ?
Since 'i' is a Set why is it possible to or a 'set of sets', is this something like whats being generated in the background :
a(i) || b(i)
becomes
SetA(Set) || SetB(Set)
Maybe what's confusing you is the syntax. We can rewrite this as:
type Set = (Int => Boolean)
def union(a: Set, b: Set): Set = {
(i: Int) => a(i) || b(i)
}
So this might be easier to sort out. We are defining a method union that takes to Sets and returns a new Set. In our implementation, Set is just another name for a function from Int to Boolean (ie, a function telling us if the argument is "in the set").
The body of the the union method creates an anonymous function from Int to Boolean (which is a Set as we have defined it). This anonymous function accepts a parameter i, an Int, and returns true if, and only if, i is in set a (a(i)) OR i is in set b (b(i)).
If you look carefully, that question defines a type Set = Int => Boolean. So we're not talking about scala.collection.Set here; we're talking Int => Booleans.
To write a function literal, you use the => keyword, e.g.
x => someOp(x)
You don't need to annotate the type if it's already known. So if we know that the r.h.s. is Int => Boolean, we know that x is type Int.
No the set is not populated by a for loop.
The return type of union(a: Set, b: Set): Set is a function. The code of the declaration a(i) || b(i) is not executed when you call union; it will only be executed when you call the result of union.
And i is not a set it is an integer. It is the single argument of the function returned by union.
What happens here is that by using the set and union function you construct a binary tree of functions by combining them with the logical-or-operator (||). The set function lets you build leafs and the union lets you combine them into bigger function trees.
Example:
def set_one = set(1)
def set_two = set(2)
def set_three = set(2)
def set_one_or_two = union(set_one, set_two)
def set_one_two_three = union(set_three, set_one_or_two)
The set_one_two_three will be a function tree which contains two nodes: the left is a function checking if the passed parameter is equal to 3; the right is a node that contains two functions itself, checking if the parameter is equal to 1 and 2 respectively.