Any concise way to define a constant function (mapping) in scala - scala

I am a newbie in Scala. I have the following code to define a constant function that returns true for 1,2,3 and false for the other Integers.( actually the function defines a set {1,2,3} of integers):
val a= Node1 _
def Node1(x:Int):Boolean={
if (x==1 || x ==2 || x==3){true}
else{false}
}
Is there any way to define this function more concisely?

val a: Int => Boolean = Set(1, 2, 3).contains(_)
which is effectively same as
val a: Int => Boolean = Set(1, 2, 3)
This is because Set's apply method is same as its contains method.

On lookup performance, although as aforementioned Set.apply conveys the desired semantics, note that Set is a trait that may implement for instance ListSet, HashSet or TreeSet, each designed for specific uses.
For the aim here of looking up for an element, consider TreeSet (see Collections Performance Characteristics for details) outperforms other Set implementations. This proves relevant specially for large collections.

Related

Can I mutate a variable in place in a purely functional way?

I know I can use state passing and state monads for purely functional mutation, but afaik that's not in-place and I want the performance benefits of doing it in-place.
An example would be great, e.g. adding 1 to a number, preferably in Idris but Scala will also be good
p.s. is there a tag for mutation? can't see one
No, this is not possible in Scala.
It is however possible to achieve the performance benefits of in-place mutation in a purely functional language. For instance, let's take a function that updates an array in a purely functional way:
def update(arr: Array[Int], idx: Int, value: Int): Array[Int] =
arr.take(idx) ++ Array(value) ++ arr.drop(idx + 1)
We need to copy the array here in order to maintain purity. The reason is that if we mutated it in place, we'd be able to observe that after calling the function:
def update(arr: Array[Int], idx: Int, value: Int): Array[Int] = {
arr(idx) = value
arr
}
The following code will work fine with the first implementation but break with the second:
val arr = Array(1, 2, 3)
assert(arr(1) == 2)
val arr2 = update(arr, 1, 42)
assert(arr2(1) == 42) // so far, so good…
assert(arr(1) == 2) // oh noes!
The solution in a purely functional language is to simply forbid the last assert. If you can't observe the fact that the original array was mutated, then there's nothing wrong with updating the array in place! The means to achieve this is called linear types. Linear values are values that you can use exactly once. Once you've passed a linear value to a function, the compiler will not allow you to use it again, which fixes the problem.
There are two languages I know of that have this feature: ATS and Haskell. If you want more details, I'd recommend this talk by Simon Peyton-Jones where he explains the implementation in Haskell:
https://youtu.be/t0mhvd3-60Y
Support for linear types has since been merged into GHC: https://www.tweag.io/blog/2020-06-19-linear-types-merged/

What's the difference between a set and a mapping to boolean?

In Scala, I sometimes use a Map[A, Boolean], sometimes a Set[A]. There's really not much difference between these two concepts, and an implementation might use the same data structure to implement them. So why bother to have Sets? As I said, this question occurred to me in connection with Scala, but it would arise in any programming language whose library implements a Set abstraction.
The are some specific convenient methods defined on Set (intersect, diff and more). Not a big deal, but often useful.
My first thoughts are two:
efficiency: if you only want to signal presence, why bothering with a flag that can either be true or false?
meaning: a set is about the existence of something, a map is about a correlation between a key and value (generally speaking); these two ideas are quite different and should be used accordingly to simplify reading and understanding the code
Furthermore, the semantics of application change:
val map: Map[String, Bool] = Map("hello" -> true, "world" -> false)
val set: Set[String] = Set("hello")
map("hello") // true
map("world") // false
map("none!") // EXCEPTION
set("hello") // true
set("world") // false
set("none!") // false
Without having to actually store an extra pair to indicate absence (not to mention the boolean that actually indicates such absence).
Sets are very good to indicate the presence of something, which makes them very good for filtering:
val map = (0 until 10).map(_.toString).zipWithIndex.toMap
val set = (3 to 5).map(_.toString).toSet
map.filterKeys(set) // keeps pairs 3rd through 5th
Maps, in terms of processing collections, are good to indicate relationships, which makes them very good for collecting:
set.collect(map) // more or less equivalent as above, but only values are returned
You can read more about using collections as functions to process other collections here.
There are several reasons:
1) It is easier to think/work with a data structure that only has single elements as opposed to mapping to dummy true,
For example, it is easier to convert a list to Set, then to Map:
scala> val myList = List(1,2,3,2,1)
myList: List[Int] = List(1, 2, 3, 2, 1)
scala> myList.toSet
res9: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> myList.map(x => (x, true)).toMap
res1: scala.collection.immutable.Map[Int,Boolean] = Map(1 -> true, 2 -> true, 3 -> true)
2) As Kombajn zbożowy pointed out, Sets have additional helper methods, union, intersect, diff, subsetOf.
3) Sense Set does not have mapping to dummy variable the size of a set in memory is smaller, this is more noticeable for small sized keys.
Having said the above, not all languages have Set data structure, Go for example does not.

Partial Functions in Scala

I just wanted to clarify something about partially defined functions in Scala. I looked at the docs and it said the type of a partial function is PartialFunction[A,B], and I can define a partial function such as
val f: PartialFunction[Any, Int] = {...}
I was wondering, for the types A and B, is A a parameter, and B a return type? If I have multiple accepted types, do I use orElse to chain partial functions together?
In the set theoretic view of a function, if a function can map every value in the domain to a value in the range, we say that this function is a total function. There can be situations where a function cannot map some element(s) in the domain to the range; such functions are called partial functions.
Taking the example from the Scala docs for partial functions:
val isEven: PartialFunction[Int, String] = {
case x if x % 2 == 0 => x+" is even"
}
Here a partial function is defined since it is defined to only map an even integer to a string. So the input to the partial function is an integer and the output is a string.
val isOdd: PartialFunction[Int, String] = {
case x if x % 2 == 1 => x+" is odd"
}
isOdd is another partial function similarly defined as isEven but for odd numbers. Again, the input to the partial function is an integer and the output is a string.
If you have a list of numbers such as:
List(1,2,3,4,5)
and apply the isEven partial function on this list you will get as output
List(2 is even, 4 is even)
Notice that not all the elements in the original list have been mapped by the partial function. However, there may be situations where you want to apply another function in those cases where a partial function cannot map an element from the domain to the range. In this case we use orElse:
val numbers = sample map (isEven orElse isOdd)
And now you will get as output:
List(1 is odd, 2 is even, 3 is odd, 4 is even, 5 is odd)
If you are looking to set up a partial function that, in effect, takes multiple parameters, define the partial function over a tuple of the parameters you'll be feeding into it, eg:
val multiArgPartial: PartialFunction[(String, Long, Foo), Int] = {
case ("OK", _, Foo("bar", _)) => 0 // Use underscore to accept any value for a given parameter
}
and, of course, make sure you pass arguments to it as tuples.
In addition to other answers, if by "multiple accepted types" you mean that you want the same function accept e.g. String, Int and Boolean (and no other types), this is called "union types" and isn't supported in Scala currently (but is planned for the future, based on Dotty). The alternatives are:
Use the least common supertype (Any for the above case). This is what orElse chains will do.
Use a type like Either[String, Either[Int, Boolean]]. This is fine if you have two types, but becomes ugly quickly.
Encode union types as negation of intersection types.

The easiest way to write {1, 2, 4, 8, 16 } in Scala

I was advertising Scala to a friend (who uses Java most of the time) and he asked me a challenge: what's the way to write an array {1, 2, 4, 8, 16} in Scala.
I don't know functional programming that well, but I really like Scala. However, this is a iterative array formed by (n*(n-1)), but how to keep track of the previous step? Is there a way to do it easily in Scala or do I have to write more than one line of code to achieve this?
Array.iterate(1, 5)(2 * _)
or
Array.iterate(1, 5)(n => 2 * n)
Elaborating on this as asked for in comment. Don't know what you want me to elaborate on, hope you will find what you need.
This is the function iterate(start,len)(f) on object Array (scaladoc). That would be a static in java.
The point is to fill an array of len elements, from first value start and always computing the next element by passing the previous one to function f.
A basic implementation would be
import scala.reflect.ClassTag
def iterate[A: ClassTag](start: A, len: Int)(f: A => A): Array[A] = {
val result = new Array[A](len)
if (len > 0) {
var current = start
result(0) = current
for (i <- 1 until len) {
current = f(current)
result(i) = current
}
}
result
}
(the actual implementation, not much different can be found here. It is a little different mostly because the same code is used for different data structures, e.g List.iterate)
Beside that, the implementation is very straightforward . The syntax may need some explanations :
def iterate[A](...) : Array[A] makes it a generic methods, usable for any type A. That would be public <A> A[] iterate(...) in java.
ClassTag is just a technicality, in scala as in java, you normally cannot create an array of a generic type (java new E[]), and the : ClassTag asks the compiler to add some magic which is very similar to adding at method declaration, and passing at call site, a class<A> clazz parameter in java, which can then be used to create the array by reflection. If you do e.g List.iterate rather than Array.iterate, it is not needed.
Maybe more surprising, the two parameters lists, one with start and len, and then in a separate parentheses, the one with f. Scala allows a method to have severals parameters lists. Here the reason is the peculiar way scala does type inference : Looking at the first parameter list, it will determine what is A, based on the type of start. Only afterwards, it will look at the second list, and then it knows what type A is. Otherwise, it would need to be told, so if there had been only one parameter list, def iterate[A: ClassTag](start: A, len: Int, f: A => A),
then the call should be either
Array.iterate(1, 5, n : Int => 2 * n)
Array.iterate[Int](1, 5, n => 2 * n)
Array.iterate(1, 5, 2 * (_: int))
Array.iterate[Int](1, 5, 2 * _)
making Int explicit one way or another. So it is common in scala to put function arguments in a separate argument list. The type might be much longer to write than just 'Int'.
A => A is just syntactic sugar for type Function1[A,A]. Obviously a functional language has functions as (first class) values, and a typed functional language has types for functions.
In the call, iterate(1, 5)(n => 2 * n), n => 2 * n is the value of the function. A more complete declaration would be {n: Int => 2 * n}, but one may dispense with Int for the reason stated above. Scala syntax is rather flexible, one may also dispense with either the parentheses or the brackets. So it could be iterate(1, 5){n => 2 * n}. The curlies allow a full block with several instruction, not needed here.
As for immutability, Array is basically mutable, there is no way to put a value in an array except to change the array at some point. My implementation (and the one in the library) also use a mutable var (current) and a side-effecting for, which is not strictly necessary, a (tail-)recursive implementation would be only a little longer to write, and just as efficient. But a mutable local does not hurt much, and we are already dealing with a mutable array anyway.
always more than one way to do it in Scala:
scala> (0 until 5).map(1<<_).toArray
res48: Array[Int] = Array(1, 2, 4, 8, 16)
or
scala> (for (i <- 0 to 4) yield 1<<i).toArray
res49: Array[Int] = Array(1, 2, 4, 8, 16)
or even
scala> List.fill(4)(1).scanLeft(1)(2*_+0*_).toArray
res61: Array[Int] = Array(1, 2, 4, 8, 16)
The other answers are fine if you happen to know in advance how many entries will be in the resulting list. But if you want to take all of the entries up to some limit, you should create an Iterator, use takeWhile to get the prefix you want, and create an array from that, like so:
scala> Iterator.iterate(1)(2*_).takeWhile(_<=16).toArray
res21: Array[Int] = Array(1, 2, 4, 8, 16)
It all boils down to whether what you really want is more correctly stated as
the first 5 powers of 2 starting at 1, or
the powers of 2 from 1 to 16
For non-trivial functions you almost always want to specify the end condition and let the program figure out how many entries there are. Of course your example was simple, and in fact the real easiest way to create that simple array is just to write it out literally:
scala> Array(1,2,4,8,16)
res22: Array[Int] = Array(1, 2, 4, 8, 16)
But presumably you were asking for a general technique you could use for arbitrarily complex problems. For that, Iterator and takeWhile are generally the tools you need.
You don't have to keep track of the previous step. Also, each element is not formed by n * (n - 1). You probably meant f(n) = f(n - 1) * 2.
Anyway, to answer your question, here's how you do it:
(0 until 5).map(math.pow(2, _).toInt).toArray

Should Scala's map() behave differently when mapping to the same type?

In the Scala Collections framework, I think there are some behaviors that are counterintuitive when using map().
We can distinguish two kinds of transformations on (immutable) collections. Those whose implementation calls newBuilder to recreate the resulting collection, and those who go though an implicit CanBuildFrom to obtain the builder.
The first category contains all transformations where the type of the contained elements does not change. They are, for example, filter, partition, drop, take, span, etc. These transformations are free to call newBuilder and to recreate the same collection type as the one they are called on, no matter how specific: filtering a List[Int] can always return a List[Int]; filtering a BitSet (or the RNA example structure described in this article on the architecture of the collections framework) can always return another BitSet (or RNA). Let's call them the filtering transformations.
The second category of transformations need CanBuildFroms to be more flexible, as the type of the contained elements may change, and as a result of this, the type of the collection itself maybe cannot be reused: a BitSet cannot contain Strings; an RNA contains only Bases. Examples of such transformations are map, flatMap, collect, scanLeft, ++, etc. Let's call them the mapping transformations.
Now here's the main issue to discuss. No matter what the static type of the collection is, all filtering transformations will return the same collection type, while the collection type returned by a mapping operation can vary depending on the static type.
scala> import collection.immutable.TreeSet
import collection.immutable.TreeSet
scala> val treeset = TreeSet(1,2,3,4,5) // static type == dynamic type
treeset: scala.collection.immutable.TreeSet[Int] = TreeSet(1, 2, 3, 4, 5)
scala> val set: Set[Int] = TreeSet(1,2,3,4,5) // static type != dynamic type
set: Set[Int] = TreeSet(1, 2, 3, 4, 5)
scala> treeset.filter(_ % 2 == 0)
res0: scala.collection.immutable.TreeSet[Int] = TreeSet(2, 4) // fine, a TreeSet again
scala> set.filter(_ % 2 == 0)
res1: scala.collection.immutable.Set[Int] = TreeSet(2, 4) // fine
scala> treeset.map(_ + 1)
res2: scala.collection.immutable.SortedSet[Int] = TreeSet(2, 3, 4, 5, 6) // still fine
scala> set.map(_ + 1)
res3: scala.collection.immutable.Set[Int] = Set(4, 5, 6, 2, 3) // uh?!
Now, I understand why this works like this. It is explained there and there. In short: the implicit CanBuildFrom is inserted based on the static type, and, depending on the implementation of its def apply(from: Coll) method, may or may not be able to recreate the same collection type.
Now my only point is, when we know that we are using a mapping operation yielding a collection with the same element type (which the compiler can statically determine), we could mimic the way the filtering transformations work and use the collection's native builder. We can reuse BitSet when mapping to Ints, create a new TreeSet with the same ordering, etc.
Then we would avoid cases where
for (i <- set) {
val x = i + 1
println(x)
}
does not print the incremented elements of the TreeSet in the same order as
for (i <- set; x = i + 1)
println(x)
So:
Do you think this would be a good idea to change the behavior of the mapping transformations as described?
What are the inevitable caveats I have grossly overlooked?
How could it be implemented?
I was thinking about something like an implicit sameTypeEvidence: A =:= B parameter, maybe with a default value of null (or rather an implicit canReuseCalleeBuilderEvidence: B <:< A = null), which could be used at runtime to give more information to the CanBuildFrom, which in turn could be used to determine the type of builder to return.
I looked again at it, and I think your problem doesn't arise from a particular deficiency of Scala collections, but rather a missing builder for TreeSet. Because the following does work as intended:
val list = List(1,2,3,4,5)
val seq1: Seq[Int] = list
seq1.map( _ + 1 ) // yields List
val vector = Vector(1,2,3,4,5)
val seq2: Seq[Int] = vector
seq2.map( _ + 1 ) // yields Vector
So the reason is that TreeSet is missing a specialised companion object/builder:
seq1.companion.newBuilder[Int] // ListBuffer
seq2.companion.newBuilder[Int] // VectorBuilder
treeset.companion.newBuilder[Int] // Set (oops!)
So my guess is, if you take proper provision for such a companion for your RNA class, you may find that both map and filter work as you wish...?