Why the variation in operators? - scala

Long time lurker, first time poster.
In Scala, I'm looking for the reasons it was considered advantageous to vary operators depending on operand type. For example, why was this:
Vector(1, 2, 3) :+ 4
determined to be an advantage over:
Vector(1, 2, 3) + 4
Or:
4 +: Vector(1,2,3)
over:
Vector(4) + Vector(1,2,3)
Or:
Vector(1,2,3) ++ Vector(4,5,6)
over:
Vector(1,2,3) + Vector(4,5,6)
So, here we have :+, +:, and ++ when + alone could have sufficed. I'm new to Scala, and I'll go along with it. But this seems unnecessary and obfuscated for a language that tries to keep its syntax clean.
I've done quite a few Google and Stack Overflow searches and have only found questions about specific operators and operator overloading in general, but no background on why it was necessary to split +, for example, into multiple variations.
FWIW, I could overload the operators using implicit classes, as below, but I imagine that would only cause confusion (and tsk-tsks) from experienced Scala programmers using/reading my code.
object AddVectorDemo {
  implicit class AddVector(vector: Vector[Any]) {
    def +(that: Vector[Any]) = vector ++ that
    def +(that: Any) = vector :+ that
  }
  def main(args: Array[String]): Unit = {
    val u = Vector(1, 2, 3)
    val v = Vector(4, 5, 6)
    println(u + v)
    println(u + v + 7)
  }
}
Outputs:
Vector(1, 2, 3, 4, 5, 6)
Vector(1, 2, 3, 4, 5, 6, 7)

The answer requires a surprisingly long detour through variance. I'll try to make it as short as possible.
First, note that you can add anything to an existing Vector:
scala> Vector(1)
res0: scala.collection.immutable.Vector[Int] = Vector(1)
scala> res0 :+ "fish"
res1: scala.collection.immutable.Vector[Any] = Vector(1, fish)
Why can you do this? Well, if B extends A and we want to be able to use Vector[B] where Vector[A] is called for, we need to allow Vector[B] to add the same sorts of things that Vector[A] can add. But everything extends Any, so we need to allow addition of anything that Vector[Any] can add, which is everything.
Making Vector and most other non-Set collections covariant is a design decision, but it's what most people expect.
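Here is a minimal sketch of what that covariance buys, and what the invariant Set forbids:
val vi = Vector(1, 2, 3)
val va: Vector[Any] = vi   // compiles: Vector is covariant, so Vector[Int] <: Vector[Any]
val si = Set(1, 2, 3)
// val sa: Set[Any] = si   // does not compile: Set is invariant in its element type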
Now, let's try adding a vector to a vector.
scala> res0 :+ Vector("fish")
res2: scala.collection.immutable.Vector[Any] = Vector(1, Vector(fish))
scala> res0 ++ Vector("fish")
res3: scala.collection.immutable.Vector[Any] = Vector(1, fish)
If we only had one operation, +, we wouldn't be able to specify which one of these things we meant. And we really might mean to do either. They're both perfectly sensible things to try. We could try to guess based on types, but in practice it's better to just ask the programmer to explicitly say what they mean. And since there are two different things to mean, there need to be two ways to ask.
Does this come up in practice? With collections of collections, yes, all the time. For example, using your + method:
scala> Vector(Vector(1), Vector(2))
res4: Vector[Vector[Int]] = Vector(Vector(1), Vector(2))
scala> res4 + Vector(3)
res5: Vector[Any] = Vector(Vector(1), Vector(2), 3)
That's probably not what I wanted.
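For contrast, the two distinct operators make the intent unambiguous (a sketch, continuing from res4 above):
res4 :+ Vector(3)   // Vector(Vector(1), Vector(2), Vector(3)): Vector[Vector[Int]] -- append as one element
res4 ++ Vector(3)   // Vector(Vector(1), Vector(2), 3): Vector[Any] -- concatenate the elements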

It's a fair question, and I think it has a lot to do with legacy code and Java compatibility. Scala copied Java's + for String concatenation, which has complicated things.
This + allows us to do:
(new Object) + "foobar" //"java.lang.Object#5bb90b89foobar"
So what should we expect if we had + for List and wrote List(1) + "foobar"? One might expect List(1, "foobar") (of type List[Any]), just like we get with :+, but the Java-inspired String concatenation muddies this: the same expression could just as plausibly mean the String "List(1)foobar", which is in fact what the any2stringadd conversion produces today.
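You can see the concatenation conversion in action in Scala 2 (Predef.any2stringadd, deprecated since 2.13): even though List has no + method, this compiles:
scala> List(1) + "foobar"
res0: String = List(1)foobar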
Odersky even once commented:
One should never have a + method on collections that are covariant in their element type. Sets and maps are non-variant, that's why they can have a + method. It's all rather delicate and messy. We'd be better off if we did not try to duplicate Java's + for String concatenation. But when Scala got designed the idea was to keep essentially all of Java's expression syntax, including String +. And it's too late to change that now.
There is some discussion (although in a different context) on the answers to this similar question.

Related

Lists in Scala - plus colon vs double colon (+: vs ::)

I am a little bit confused about the +: and :: operators that are available.
It looks like both of them gives the same results.
scala> List(1,2,3)
res0: List[Int] = List(1, 2, 3)
scala> 0 +: res0
res1: List[Int] = List(0, 1, 2, 3)
scala> 0 :: res0
res2: List[Int] = List(0, 1, 2, 3)
To my novice eye, the source code for both methods looks similar (the plus-colon method has an additional generic constraint that uses builder factories).
Which one of these methods should be used and when?
+: works with any kind of collection, while :: is specific implementation for List.
If you look at the source for +: closely, you will notice that it actually calls :: when the expected return type is List. That is because :: is implemented more efficiently for the List case: it simply attaches the new head to the existing list and returns the result, which is a constant-time operation, as opposed to the linear-time copy of the entire collection that the generic case of +: performs.
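A quick sketch of the structural sharing that makes the List case constant-time:
val xs = List(2, 3)
val ys = 1 :: xs   // O(1): allocates a single cons cell whose tail is xs itself
val zs = 1 +: xs   // on a List this dispatches to ::, so it is also O(1)
ys.tail eq xs      // true: nothing was copied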
+:, on the other hand, takes a CanBuildFrom, so you can do fancy (albeit not as nice-looking in this case) things like:
import scala.collection.breakOut
val foo: Array[String] = List("foo").+:("bar")(breakOut)
(It's pretty useless in this particular case, as you could start with the needed type to begin with, but the idea is that you can prepend an element to a collection and change its type in one "go", avoiding an additional copy.)

Functional Programming In Scala - reversing lists exercise

While working through the exercises and concepts of the book Functional Programming in Scala by Paul Chiusano and Rúnar Bjarnason, I stumbled upon the exercise of writing my own function to reverse a list.
The "suggestion" the authors gave to motivate readers to learn more, was to see if we can write such a function using a fold.
My "non-fold" version is the following:
def myrev(arr: List[Int]): List[Int] = if (arr.length > 0) myrev(arr.tail) :+ arr.head else arr
However, can someone give me some pointers on how to at least start the logic to reverse a list via a fold?
I know that:
List(1,2,3,4).foldLeft(0)({case(x,y)=>x})
gives me the initial "seed" value, which is 0 above, and that:
List(1,2,3,4).foldLeft(0)({case(x,y)=>y})
gives me 4 which is the last element of the list.
So I would need to supply to foldLeft a sort of "identity" function that can give me the element at a certain position, and maybe use that to reverse the list somehow, but I feel at a loss.
I didn't fully understand some Haskell code I found online, and I want to "struggle" and get there on my own with only some pointers, instead of blindly copying some solution, so any help would be greatly appreciated.
scala> List(1, 2, 3, 4).foldLeft(List.empty[Int])((result, currentElem) => currentElem :: result)
res2: List[Int] = List(4, 3, 2, 1)
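To see why this works, here is the same fold as a standalone function (the name reverse is just illustrative), with the accumulator unrolled in comments:
def reverse[A](xs: List[A]): List[A] =
  xs.foldLeft(List.empty[A])((acc, x) => x :: acc)
// For List(1, 2, 3, 4) the accumulator evolves as:
// List() -> List(1) -> List(2, 1) -> List(3, 2, 1) -> List(4, 3, 2, 1)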
An implementation of a function that takes a list and reverses it using foldLeft:
def reverse(l: List[String]) =
  l.foldLeft(List[String]())((x: List[String], y: String) => y :: x)
Create a list and call the reverse function
val l = List("1", "2", "3")
reverse(l)
Result:
res2: List[String] = List(3, 2, 1)

Inconsistent behaviour for xs.sliding(n) if n is less than size?

According to scaladoc, sliding() returns...
"An iterator producing iterable collections of size size, except the last and the only element will be truncated if there are fewer elements than size."
To me, intuitively, sliding(n) would return a sliding window of n elements if available. With the current implementation, I need to perform an extra check to make sure I don't get a list of 1 or 2 elements.
scala> val xs = List(1, 2)
xs: List[Int] = List(1, 2)
scala> xs.sliding(3).toList
res2: List[List[Int]] = List(List(1, 2))
I expected an empty list here instead. Why is sliding() implemented this way?
It was a mistake, but one that wasn't fixed as of 2.9. Everyone occasionally makes design errors, and once one gets into the library it's a nontrivial task to remove it.
Workaround: add a filter.
xs.sliding(3).filter(_.size==3).toList
You can "work around" this by using the GroupedIterator#withPartial modifier.
scala> val xs = List(1, 2)
xs: List[Int] = List(1, 2)
scala> xs.iterator.sliding(3).withPartial(false).toList
res7: List[Seq[Int]] = List()
(I don't know why you need to go through xs.iterator, but xs.sliding(3).withPartial(false) does not work because xs.sliding(3) gives you a plain Iterator instead of a GroupedIterator.)
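If you need this in several places, you could wrap the pattern in a small extension; slidingStrict is a hypothetical name, not a library method:
implicit class StrictSliding[A](xs: Iterable[A]) {
  // Only full windows of size n; empty when xs has fewer than n elements.
  def slidingStrict(n: Int): Iterator[Seq[A]] =
    xs.iterator.sliding(n).withPartial(false)
}
List(1, 2).slidingStrict(3).toList       // List()
List(1, 2, 3, 4).slidingStrict(3).toList // List(List(1, 2, 3), List(2, 3, 4))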
EDIT:
Check Rex's answer (which is the correct one). I'm leaving this just because (as Rex said on the comments) it was the original (wrong) idea behind that design decision.
I don't know why you would expect an empty list there; returning the full list seems like the best result. Consider this example:
def slidingWindowsThing(windows: Iterator[Seq[Int]]): Unit = {
  // do your thing
}
For this method you probably want all of these calls to work:
slidingWindowsThing((1 to 10).sliding(3))
slidingWindowsThing((1 to 3).sliding(3))
slidingWindowsThing((1 to 1).sliding(3))
This is why the method falls back to a single window of size list.length instead of Nil (the empty list).

Should Scala's map() behave differently when mapping to the same type?

In the Scala Collections framework, I think there are some behaviors that are counterintuitive when using map().
We can distinguish two kinds of transformations on (immutable) collections: those whose implementation calls newBuilder to recreate the resulting collection, and those that go through an implicit CanBuildFrom to obtain the builder.
The first category contains all transformations where the type of the contained elements does not change. They are, for example, filter, partition, drop, take, span, etc. These transformations are free to call newBuilder and to recreate the same collection type as the one they are called on, no matter how specific: filtering a List[Int] can always return a List[Int]; filtering a BitSet (or the RNA example structure described in this article on the architecture of the collections framework) can always return another BitSet (or RNA). Let's call them the filtering transformations.
The second category of transformations needs CanBuildFroms to be more flexible, as the type of the contained elements may change, and, as a result, the collection type itself may not be reusable: a BitSet cannot contain Strings; an RNA contains only Bases. Examples of such transformations are map, flatMap, collect, scanLeft, ++, etc. Let's call them the mapping transformations.
Now here's the main issue to discuss. No matter what the static type of the collection is, all filtering transformations will return the same collection type, while the collection type returned by a mapping operation can vary depending on the static type.
scala> import collection.immutable.TreeSet
import collection.immutable.TreeSet
scala> val treeset = TreeSet(1,2,3,4,5) // static type == dynamic type
treeset: scala.collection.immutable.TreeSet[Int] = TreeSet(1, 2, 3, 4, 5)
scala> val set: Set[Int] = TreeSet(1,2,3,4,5) // static type != dynamic type
set: Set[Int] = TreeSet(1, 2, 3, 4, 5)
scala> treeset.filter(_ % 2 == 0)
res0: scala.collection.immutable.TreeSet[Int] = TreeSet(2, 4) // fine, a TreeSet again
scala> set.filter(_ % 2 == 0)
res1: scala.collection.immutable.Set[Int] = TreeSet(2, 4) // fine
scala> treeset.map(_ + 1)
res2: scala.collection.immutable.SortedSet[Int] = TreeSet(2, 3, 4, 5, 6) // still fine
scala> set.map(_ + 1)
res3: scala.collection.immutable.Set[Int] = Set(4, 5, 6, 2, 3) // uh?!
Now, I understand why this works like this. It is explained there and there. In short: the implicit CanBuildFrom is inserted based on the static type, and, depending on the implementation of its def apply(from: Coll) method, may or may not be able to recreate the same collection type.
Now my only point is, when we know that we are using a mapping operation yielding a collection with the same element type (which the compiler can statically determine), we could mimic the way the filtering transformations work and use the collection's native builder. We can reuse BitSet when mapping to Ints, create a new TreeSet with the same ordering, etc.
Then we would avoid cases where
for (i <- set) {
val x = i + 1
println(x)
}
does not print the incremented elements of the TreeSet in the same order as
for (i <- set; x = i + 1)
println(x)
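The difference comes from how the two loops desugar; roughly (a sketch, modulo the exact tuple plumbing):
// First loop: plain foreach, so elements come out in the TreeSet's order
set.foreach { i => val x = i + 1; println(x) }
// Second loop: the mid-stream definition routes through map first, and map on
// the static type Set[Int] may rebuild an unordered Set before foreach runs
set.map { i => val x = i + 1; (i, x) }.foreach { case (i, x) => println(x) }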
So:
Do you think this would be a good idea to change the behavior of the mapping transformations as described?
What are the inevitable caveats I have grossly overlooked?
How could it be implemented?
I was thinking about something like an implicit sameTypeEvidence: A =:= B parameter, maybe with a default value of null (or rather an implicit canReuseCalleeBuilderEvidence: B <:< A = null), which could be used at runtime to give more information to the CanBuildFrom, which in turn could be used to determine the type of builder to return.
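For illustration, the idea might look roughly like this as a signature (purely hypothetical, not the real collections API; MyCollOps is an invented name):
import scala.collection.generic.CanBuildFrom
trait MyCollOps[A, Repr] {
  // When sameType evidence is available (non-null), a smarter CanBuildFrom
  // could fall back to the callee's own newBuilder and so preserve the
  // dynamic collection type.
  def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That],
                              sameType: B <:< A = null): That = ???
}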
I looked at it again, and I think your problem doesn't arise from a particular deficiency of Scala collections, but rather from a missing builder for TreeSet, because the following does work as intended:
val list = List(1,2,3,4,5)
val seq1: Seq[Int] = list
seq1.map( _ + 1 ) // yields List
val vector = Vector(1,2,3,4,5)
val seq2: Seq[Int] = vector
seq2.map( _ + 1 ) // yields Vector
So the reason is that TreeSet is missing a specialised companion object/builder:
seq1.companion.newBuilder[Int] // ListBuffer
seq2.companion.newBuilder[Int] // VectorBuilder
treeset.companion.newBuilder[Int] // Set (oops!)
So my guess is, if you take proper provision for such a companion for your RNA class, you may find that both map and filter work as you wish...?
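In the meantime, one way to force a sorted result with the 2.x collections is to pass a builder explicitly via breakOut (a workaround sketch, not a fix to map itself; breakOut was removed in 2.13):
import scala.collection.breakOut
import scala.collection.immutable.TreeSet
val set: Set[Int] = TreeSet(1, 2, 3, 4, 5)
val mapped: TreeSet[Int] = set.map(_ + 1)(breakOut) // TreeSet(2, 3, 4, 5, 6), ordering kept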

How to do something like this in Scala?

Sorry for the lack of a descriptive title; I couldn't think of anything better. Edit it if you think of one.
Let's say I have two Lists of Objects, and they are always changing. They need to remain as separate lists, but many operations have to be done on both of them. This leads me to doing stuff like:
//assuming A and B are the lists
A.foo(params)
B.foo(params)
In other words, I'm doing the exact same operation on two different lists in many places in my code. I would like a way to reduce them down to one list without explicitly having to construct another list. I know that just combining lists A and B into a list C would solve all my problems, but then we'd be back to the same operation if I needed to add a new object to the list (because I'd have to add it to C as well as to its respective list).
It's in a tight loop and performance is very important. Is there any way to construct an iterator or something that would iterate A and then move on to B, all transparently? I know another solution would be to construct the combined list (C) every time I'd like to perform some kind of function on both of these lists, but that is a huge waste of time (computationally speaking).
Iterator is what you need here. Turning a List into an Iterator and concatenating 2 Iterators are both O(1) operations.
scala> val l1 = List(1, 2, 3)
l1: List[Int] = List(1, 2, 3)
scala> val l2 = List(4, 5, 6)
l2: List[Int] = List(4, 5, 6)
scala> (l1.iterator ++ l2.iterator) foreach (println(_)) // use List.elements for Scala 2.7.*
1
2
3
4
5
6
I'm not sure I understand your meaning.
Anyway, this is my solution:
scala> var listA :List[Int] = Nil
listA: List[Int] = List()
scala> var listB :List[Int] = Nil
listB: List[Int] = List()
scala> def dealWith(op : List[Int] => Unit){ op(listA); op(listB) }
dealWith: ((List[Int]) => Unit)Unit
Then, if you want to perform an operation on both listA and listB, you can use it like this:
scala> listA ::= 1
scala> listB ::= 0
scala> dealWith{ _ foreach println }
1
0