flatMap function in scala - scala

In the below two scenarios, I have flatMap function invoked on a list. In both the cases, the map portion of the flatMap function returns an array which has a iterator. In the first case, the code errors out where in the second case, it produces the expected result.
Scenario-1
val x = List("abc","cde")
x flatMap ( e => e.toArray)
<console>:13: error: polymorphic expression cannot be instantiated to expected type;
found : [B >: Char]Array[B]
required: scala.collection.GenTraversableOnce[?]
x flatMap ( e => e.toArray)
Scenario-2
val x = List("abc,def")
x flatMap ( e => e.split(",") )
res1: List[String] = List(abc, def) //Result
Can you please help why in the first case, it is not behaving as expected ?

I think the difference is that in scenario 1 you actually have a Array[B] where B is some yet-to-be-decided supertype of Char. Normally the compiler would look for an implicit conversion to GenTraversableOnce but because B is not yet known you run into a type inference issue / limitation.
You can help type inference by filling in B.
List("abc", "cde").flatMap(_.toArray[Char])
Or even better, you don't need flatMap in this case. Just call flatten.
List("abc", "cde").flatten

It is worth keeping in mind that Array is not a proper Scala collection so compiler has to work harder to first convert it to something that fits with the rest of Scala collections
implicitly[Array[Char] <:< GenTraversableOnce[Char]] // error (Scala 2.12)
implicitly[Array[Char] <:< IterableOnce[Char]] // error (Scala 2.13)
Hence because flatMap takes a function
String => GenTraversableOnce[Char]
but we are passing in
String => Array[Char]
the compiler first has to find an appropriate implicit conversion of Array[Char] to GenTraversableOnce[Char]. Namely this should be wrapCharArray
scala.collection.immutable.List.apply[String]("abc", "cde").flatMap[Char](((e: String) => scala.Predef.wrapCharArray(scala.Predef.augmentString(e).toArray[Char]((ClassTag.Char: scala.reflect.ClassTag[Char])))))
or shorter
List("abc", "cde").flatMap(e => wrapCharArray(augmentString(e).toArray))
however due to type inference reasons explained by Jasper-M, the compiler is unable to select wrapCharArray conversion. As Daniel puts it
Array tries to be a Java Array and a Scala Collection at the same
time. It mostly succeeds, but fail at both for some corner cases.
Perhaps this is one of those corner cases. For these reasons it is best to avoid Array unless performance or compatibility reasons dictate it. Nevertheless another approach that seems to work in Scala 2.13 is to use to(Collection) method
List("abc","cde").flatMap(_.to(Array))

scala> val x = List("abc","cde")
x: List[String] = List(abc, cde)
scala> x.flatMap[Char](_.toArray)
res0: List[Char] = List(a, b, c, c, d, e)

As the error mentions, your type is wrong.
In the first case, if you appy map not flatmap, you obtain List[Array[Char]]. If you apply flatten to that, you get a List[Char]
In the second case, if you appy map not flatmap, you obtain List[Array[String]]. If you apply flatten to that, you get a List[String]
I suppose you need to transform String to Char in your Array, in order to make it work.
I use Scala 2.13 and still have the same error.

Related

What does the ++: operator do to a list?

Alright, Scala has me feeling pretty dense. I'm finding the docs pretty impenetrable -- and worse, you can't Google the term "Scala ++:" because Google drops the operator terms!
I was reading some code and saw this line:
Seq(file) ++: children.flatMap(walkTree(_))
But couldn't figure it out. The docs for Seq show three things:
++
++:
++:
Where the latter two are over loaded to do.. something. The actual explanation in the doc says that they do the same thing as ++. Namely, add one list to another.
So, what exactly is the difference between the operators..?
++ and ++: return different results when the operands are different types of collection. ++ returns the same collection type as the left side, and ++: returns the same collection type as the right side:
scala> List(5) ++ Vector(5)
res2: List[Int] = List(5, 5)
scala> List(5) ++: Vector(5)
res3: scala.collection.immutable.Vector[Int] = Vector(5, 5)
There are two overloaded versions of ++: solely for implementation reasons. ++: needs to be able to take any TraversableOnce, but an overloaded version is provided for Traversable (a subtype of TraversableOnce) for efficiency.
Just to make sure:
A colon (:) in the end of a method name makes the call upside-down.
Let's make two methods and see what's gonna happen:
object Test {
def ~(i: Int) = null
def ~:(i: Int) = null //putting ":" in the tail!
this ~ 1 //compiled
1 ~: this //compiled
this.~(1) //compiled
this.~:(1) //compiled.. lol
this ~: 1 //error
1 ~ this //error
}
So, in seq1 ++: seq2, ++: is actually the seq2's method.
edited: As #okiharaherbst mentions, this is called as right associativity.
Scala function naming will look cryptic unless you learn a few simple rules and their precedence.
In this case, a colon means that the function has right associativity as opposed to the more usual left associativity that you see in imperative languages.
So ++: as in List(10) ++: Vector(10) is not an operator on the list but a function called on the vector even if it appears on its left hand-side, i.e., it is the same as Vector(10).++:(List(10)) and returns a vector.
++ as in List(10) ++ Vector(10) is now function called on the list (left associativity), i.e., it is the same as List(10).++(Vector(10)) and returns a list.
what exactly is the difference between the operators..?
The kind of list a Seq.++ operates on.
def ++[B](that: GenTraversableOnce[B]): Seq[B]
def ++:[B >: A, That](that: Traversable[B])(implicit bf: CanBuildFrom[Seq[A], B, That]): That
def ++:[B](that: TraversableOnce[B]): Seq[B]
As commented in "What is the basic collection type in Scala?"
Traversable extends TraversableOnce (which unites Iterator and Traversable), and
TraversableOnce extends GenTraversableOnce (which units the sequential collections and the parallel.)

Iteration over Scala tuples [duplicate]

This question already has answers here:
Iterate Over a tuple
(4 answers)
Closed 8 years ago.
In scala, we can get an iterator over a tuple as follows
val t = (1, 2)
val it = t.productIterator
and even
it.foreach( x => println(x.isInstanceOf[Int]) )
returns true, we cannot do simple operations on the iterator values without using asInstanceOf[Int], since
it.foreach( x => println(x+1) )
returns an error: type mismatch; found : Int(1) required: String
I understand the issue with Integer vs. Int, but still the validity of isInstanceOf[Int] is somewhat confusing.
What is the best way to do these operations over tuples? Notice that the tuple can have a mix of types like integers with doubles, so converting to a list might not always work.
A tuple does not have to be homogenous and the compiler didn't try to apply magic type unification across the elements1. Take (1, "hello") as an example of such a a heterogenous tuple (Tuple2[Int,String]).
This means that x is typed as Any (not Int!). Try it.foreach( (x: Int) => println(x) ), with the original tuple, to get a better error message indicating that the iterator is not unified over the types of the tuple elements (it is an Iterators[Any]). The error reported should be similar to:
error: type mismatch;
found : (Int) => Unit
required: (Any) => ?
(1, 2).productIterator.foreach( (x: Int) => println(x) )
In this particular case isInstanceOf[Int] can be used to refine the type - from the Any that the type-system gave us - because we know, from manual code inspection, that it will "be safe" with the given tuple.
Here is another look at the iterators/types involved:
(1, 2) // -> Tuple2[Int,Int]
.productIterator // -> Iterator[Any]
.map(_.asInstanceOf[Int]) // -> Iterator[Int]
.foreach(x => println(x+1))
While I would recommend treating tuples as finite sets of homogenous elements and not a sequence, the same rules can be used as when dealing with any Iterator[Any] such as using pattern matching (e.g. match) that discriminates by the actual object type. (In this case the code is using an implicit PartialFunction.)
(1, "hello").productIterator
.foreach {
case s: String => println("string: " + s)
case i: Int => println("int: " + i)
}
1 While it might be possible to make the compiler unify the types in this scenario, it sounds like a special case that requires extra work for minimal gain. Normally sequences like lists - not tuples - are used for homogenous elements and the compiler/type-system correctly gives us a good refinement for something like List(1,2) (which is typed as List[Int] as expected).
There is another type HList, that is like tuple and List all-in-one. See shapeless.
I think, you can get close to what you want:
import shapeless._
val t = 1 :: 2 :: HNil
val lst = t.toList
lst.foreach( x => println(x+1) )

type inference in fold left one-liner?

I was trying to reverse a List of Integers as follows:
List(1,2,3,4).foldLeft(List[Int]()){(a,b) => b::a}
My question is that is there a way to specify the seed to be some List[_] where the _ is the type automatically filled in by scala's type-inference mechanism, instead of having to specify the type as List[Int]?
Thanks
Update: After reading a bit more on Scala's type inference, I found a better answer to your question. This article which is about the limitations of the Scala type inference says:
Type information in Scala flows from function arguments to their results [...], from left to right across argument lists, and from first to last across statements. This is in contrast to a language with full type inference, where (roughly speaking) type information flows unrestricted in all directions.
So the problem is that Scala's type inference is rather limited. It first looks at the first argument list (the list in your case) and then at the second argument list (the function). But it does not go back.
This is why neither this
List(1,2,3,4).foldLeft(Nil){(a,b) => b::a}
nor this
List(1,2,3,4).foldLeft(List()){(a,b) => b::a}
will work. Why? First, the signature of foldLeft is defined as:
foldLeft[B](z: B)(f: (B, A) => B): B
So if you use Nil as the first argument z, the compiler will assign Nil.type to the type parameter B. And if you use List(), the compiler will use List[Nothing] for B.
Now, the type of the second argument f is fully defined. In your case, it's either
(Nil.type, Int) => Nil.type
or
(List[Nothing], Int) => List[Nothing]
And in both cases the lambda expression (a, b) => b :: a is not valid, since its return type is inferred to be List[Int].
Note that the bold part above says "argument lists" and not "arguments". The article later explains:
Type information does not flow from left to right within an argument list, only from left to right across argument lists.
So the situation is even worse if you have a method with a single argument list.
The only way I know how is
scala> def foldList[T](l: List[T]) = l.foldLeft(List[T]()){(a,b) => b::a}
foldList: [T](l: List[T])List[T]
scala> foldList(List(1,2,3,4))
res19: List[Int] = List(4, 3, 2, 1)
scala> foldList(List("a","b","c"))
res20: List[java.lang.String] = List(c, b, a)

difference between foldLeft and reduceLeft in Scala

I have learned the basic difference between foldLeft and reduceLeft
foldLeft:
initial value has to be passed
reduceLeft:
takes first element of the collection as initial value
throws exception if collection is empty
Is there any other difference ?
Any specific reason to have two methods with similar functionality?
Few things to mention here, before giving the actual answer:
Your question doesn't have anything to do with left, it's rather about the difference between reducing and folding
The difference is not the implementation at all, just look at the signatures.
The question doesn't have anything to do with Scala in particular, it's rather about the two concepts of functional programming.
Back to your question:
Here is the signature of foldLeft (could also have been foldRight for the point I'm going to make):
def foldLeft [B] (z: B)(f: (B, A) => B): B
And here is the signature of reduceLeft (again the direction doesn't matter here)
def reduceLeft [B >: A] (f: (B, A) => B): B
These two look very similar and thus caused the confusion. reduceLeft is a special case of foldLeft (which by the way means that you sometimes can express the same thing by using either of them).
When you call reduceLeft say on a List[Int] it will literally reduce the whole list of integers into a single value, which is going to be of type Int (or a supertype of Int, hence [B >: A]).
When you call foldLeft say on a List[Int] it will fold the whole list (imagine rolling a piece of paper) into a single value, but this value doesn't have to be even related to Int (hence [B]).
Here is an example:
def listWithSum(numbers: List[Int]) = numbers.foldLeft((List.empty[Int], 0)) {
(resultingTuple, currentInteger) =>
(currentInteger :: resultingTuple._1, currentInteger + resultingTuple._2)
}
This method takes a List[Int] and returns a Tuple2[List[Int], Int] or (List[Int], Int). It calculates the sum and returns a tuple with a list of integers and it's sum. By the way the list is returned backwards, because we used foldLeft instead of foldRight.
Watch One Fold to rule them all for a more in depth explanation.
reduceLeft is just a convenience method. It is equivalent to
list.tail.foldLeft(list.head)(_)
foldLeft is more generic, you can use it to produce something completely different than what you originally put in. Whereas reduceLeft can only produce an end result of the same type or super type of the collection type. For example:
List(1,3,5).foldLeft(0) { _ + _ }
List(1,3,5).foldLeft(List[String]()) { (a, b) => b.toString :: a }
The foldLeft will apply the closure with the last folded result (first time using initial value) and the next value.
reduceLeft on the other hand will first combine two values from the list and apply those to the closure. Next it will combine the rest of the values with the cumulative result. See:
List(1,3,5).reduceLeft { (a, b) => println("a " + a + ", b " + b); a + b }
If the list is empty foldLeft can present the initial value as a legal result. reduceLeft on the other hand does not have a legal value if it can't find at least one value in the list.
For reference, reduceLeft will error if applied to an empty container with the following error.
java.lang.UnsupportedOperationException: empty.reduceLeft
Reworking the code to use
myList foldLeft(List[String]()) {(a,b) => a+b}
is one potential option. Another is to use the reduceLeftOption variant which returns an Option wrapped result.
myList reduceLeftOption {(a,b) => a+b} match {
case None => // handle no result as necessary
case Some(v) => println(v)
}
The basic reason they are both in Scala standard library is probably because they are both in Haskell standard library (called foldl and foldl1). If reduceLeft wasn't, it would quite often be defined as a convenience method in different projects.
From Functional Programming Principles in Scala (Martin Odersky):
The function reduceLeft is defined in terms of a more general function, foldLeft.
foldLeft is like reduceLeft but takes an accumulator z, as an additional parameter, which is returned when foldLeft is called on an empty list:
(List (x1, ..., xn) foldLeft z)(op) = (...(z op x1) op ...) op x
[as opposed to reduceLeft, which throws an exception when called on an empty list.]
The course (see lecture 5.5) provides abstract definitions of these functions, which illustrates their differences, although they are very similar in their use of pattern matching and recursion.
abstract class List[T] { ...
def reduceLeft(op: (T,T)=>T) : T = this match{
case Nil => throw new Error("Nil.reduceLeft")
case x :: xs => (xs foldLeft x)(op)
}
def foldLeft[U](z: U)(op: (U,T)=>U): U = this match{
case Nil => z
case x :: xs => (xs foldLeft op(z, x))(op)
}
}
Note that foldLeft returns a value of type U, which is not necessarily the same type as List[T], but reduceLeft returns a value of the same type as the list).
To really understand what are you doing with fold/reduce,
check this: http://wiki.tcl.tk/17983
very good explanation. once you get the concept of fold,
reduce will come together with the answer above:
list.tail.foldLeft(list.head)(_)
Scala 2.13.3, Demo:
val names = List("Foo", "Bar")
println("ReduceLeft: "+ names.reduceLeft(_+_))
println("ReduceRight: "+ names.reduceRight(_+_))
println("Fold: "+ names.fold("Other")(_+_))
println("FoldLeft: "+ names.foldLeft("Other")(_+_))
println("FoldRight: "+ names.foldRight("Other")(_+_))
outputs:
ReduceLeft: FooBar
ReduceRight: FooBar
Fold: OtherFooBar
FoldLeft: OtherFooBar
FoldRight: FooBarOther

Why does Scala warn about type erasure in the first case but not the second?

I have two functions (not these have been edited since the original -- some of the answers below are responding to the original ones which returned a sequence of ()):
def foo1[A](ls: Iterable[A]) : Iterator[A] =
for (List(a, b) <- ls sliding 2) yield a
def foo2[A](ls: Iterable[A]) : Iterator[A] =
for (a::b::Nil <- ls sliding 2) yield a
which I naively thought were the same. But Scala gives this waning only for the first one:
warning: non variable type-argument A in type pattern List[A]
is unchecked since it is eliminated by erasure
I think I understand why it gives that error for the first one: Scala thinks that I'm trying to use the type as a condition on the pattern, ie a match against List[B](_, _) should fail if B does not inherit from A, except that this can't happen because the type is erased in both cases.
So two questions:
1) Why does the second one not give the same warning?
2) Is it possible to convince Scala that the type is actually known at compile time, and thus can't possibly fail to match?
edit: I think this answers my first question. But I'm still curious about the second one.
edit: agilesteel mentioned in a comment that
for (List(a, b) <- List(1,2,3,4) sliding 2) yield ()
produces no warning. How is that different from foo1 (shouldn't the [Int] parameter be erased just the same as the [A] parameter is)?
I'm not sure what is happening here, but the static type of Iterable[A].sliding is Iterator[Iterable[A]], not Iterator[List[A]] which would be the static type of List[A].sliding.
You can try receiving Seq instead of Iterable, and that work too. EDIT Contrary to what I previously claimed, both Iterable and Seq are co-variant, so I don't know what's different. END EDIT The definition of sliding is pretty weird too:
def sliding [B >: A] (size: Int): Iterator[Iterable[A]]
See how it requires a B, superclass of A, that never gets used? Contrast that with an Iterator.sliding, for which there's no problem:
def sliding [B >: A] (size: Int, step: Int = 1): GroupedIterator[B]
Anyway, on to the second case:
for (a::b::Nil <- ls sliding 2) yield a
Here you are decomposing the list twice, and for each decomposition the type of head is checked against A. Since the type of head is not erased, you don't have a problem. This is also mostly a guess.
Finally, if you turn ls into a List, you won't have a problem. Short of that, I don't think there's anything you can do. Otherwise, you can also write this:
def foo1[A](ls: Iterable[A]) : Iterator[A] =
for (Seq(a, b) <- ls.iterator sliding 2) yield a
1) The second one does not produce a warning probably because you are constructing the list (or the pattern) by prepending elements to the Nil object, which extends List parameterising it with Nothing. And since everything is Nothing, there is nothing to be worried about ;) But I'm not sure, really guessing here.
2) Why don't you just use:
def foo[A](ls: Iterable[A]) =
for (list <- ls sliding 2) yield ()