How to avoid calling toVector on each Scala for comprehension / yield? - scala

I have several methods that operate on Vector sequences and the following idiom is common when combining data from multiple vectors into a single one with the use of a for comprehension / yield:
(for (i <- 0 until y.length) yield y(i) + 0.5*dy1(i)) toVector
Notice the closing toVector and the enclosing parentheses around the for comprehension. I want to get rid of it because it's ugly, but removing it produces the following error:
type mismatch;
found : scala.collection.immutable.IndexedSeq[Double]
required: Vector[Double]
Is there a better way of achieving what I want that avoids explicitly calling toVector many times to essentially perform a no-op (converting an indexed sequence... to an indexed sequence)?

One way to avoid collection conversions such as toVector is to call, where possible, only those methods that return the same collection type:
y.zipWithIndex.map{case (yv,idx) => yv + 0.5*dy1(idx)}
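In the same spirit, here is a small sketch of two other ways to stay inside Vector without an index-based for/yield. The sample values and the object name CombineVectors are made up for illustration; zip, map and Vector.tabulate are standard library methods.

```scala
object CombineVectors {
  // Hypothetical sample data standing in for the question's y and dy1.
  val y: Vector[Double]   = Vector(1.0, 2.0, 3.0)
  val dy1: Vector[Double] = Vector(2.0, 4.0, 6.0)

  // zip pairs the elements positionally; map on a Vector is statically a Vector.
  val viaZip: Vector[Double] = y.zip(dy1).map { case (yv, d) => yv + 0.5 * d }

  // Vector.tabulate builds a Vector directly from an index function.
  val viaTabulate: Vector[Double] =
    Vector.tabulate(y.length)(i => y(i) + 0.5 * dy1(i))
}
```

Both values type-check as Vector[Double] without a trailing toVector. Note that zip silently truncates to the shorter input, while the indexed version throws if dy1 is shorter than y.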

A for/yield over a Range, which is what your example uses, already builds a Vector[T] at runtime; only its static type is IndexedSeq[T].
For example:
scala> val squares= for (x <- Range(1, 3)) yield x * x
squares: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 4)
Check the runtime type:
scala> squares.isInstanceOf[Vector[Int]]
res14: Boolean = true
Note that Vector[T] also extends IndexedSeq[T].
@SerialVersionUID(-1334388273712300479L)
final class Vector[+A] private[immutable] (private[collection] val startIndex: Int, private[collection] val endIndex: Int, focus: Int)
extends AbstractSeq[A]
with IndexedSeq[A]
with GenericTraversableTemplate[A, Vector]
with IndexedSeqLike[A, Vector[A]]
with VectorPointer[A @uncheckedVariance]
with Serializable
with CustomParallelizable[A, ParVector[A]]
That is why the result above is also an instance of IndexedSeq[T]:
scala> squares.isInstanceOf[IndexedSeq[Int]]
res15: Boolean = true
You can declare the type of your result as IndexedSeq[T] and still get a Vector underneath without explicitly calling .toVector:
scala> val squares: IndexedSeq[Int] = for (x <- Range(1, 3)) yield x * x
squares: IndexedSeq[Int] = Vector(1, 4)
scala> squares == Vector(1, 4)
res16: Boolean = true
A for/yield over a Seq[T], however, gives a List[T] by default.
scala> val squares = for (x <- Seq(1, 3)) yield x * x
squares: Seq[Int] = List(1, 9)
Only in that case, if you want a Vector, do you have to call .toVector on the result.
scala> squares.isInstanceOf[Vector[Int]]
res21: Boolean = false
scala> val squares = (for (x <- Seq(1, 3)) yield x * x).toVector
squares: Vector[Int] = Vector(1, 9)
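Following on from this, a sketch of how the static type can be Vector when the generator itself is a Vector rather than a Range. The sample values and the name VectorGenerator are assumptions for illustration only.

```scala
object VectorGenerator {
  // Hypothetical sample data standing in for the question's y and dy1.
  val y: Vector[Double]   = Vector(1.0, 2.0)
  val dy1: Vector[Double] = Vector(2.0, 4.0)

  // The generator is a Vector (the zipped pairs), so the yielded
  // collection is statically a Vector[Double]; no toVector is needed.
  val combined: Vector[Double] =
    for ((yv, d) <- y.zip(dy1)) yield yv + 0.5 * d
}
```

The type annotation on combined compiles as written, which is the point: the result type follows the first generator, not the body of the yield.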

Related

Scala: map(f) and map(_.f)

I thought that in Scala map(f) is the same as map(_.f), i.e. map(x => x.f), but it turns out it is not:
scala> val a = List(1,2,3)
val a: List[Int] = List(1, 2, 3)
scala> a.map(toString)
val res7: List[Char] = List(l, i, n)
scala> a.map(_.toString)
val res8: List[String] = List(1, 2, 3)
What happens when a.map(toString) is called? Where did the three characters l, i, and n come from?
map(f) is not the same as map(_.f()). It's the same as map(f(_)). That is, it's going to call f(x), not x.f(), for each x in the list.
So a.map(toString) should be an error because the normal toString method does not take any arguments. My guess is that in your REPL session you've defined your own toString method that takes an argument and that's the one that's being called.
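To make the difference concrete, here is a minimal sketch. The function describe and the object name MapForms are hypothetical, introduced only to contrast the forms.

```scala
object MapForms {
  val a = List(1, 2, 3)

  // A standalone function that takes the element as its argument.
  def describe(n: Int): String = s"#$n"

  // map(f) passes each element TO f: the same as map(describe(_)).
  val viaName: List[String] = a.map(describe)
  val viaArg: List[String]  = a.map(describe(_))

  // map(_.f) calls the method f ON each element instead.
  val viaMember: List[String] = a.map(_.toString)
}
```

Here viaName and viaArg are equal, while viaMember goes through Int's own toString.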

How to select elements of collection based on another of different type?

I know I can do this:
scala> val a = List(1,2,3)
a: List[Int] = List(1, 2, 3)
scala> val b = List(2,4)
b: List[Int] = List(2, 4)
scala> a.filterNot(b.toSet)
res0: List[Int] = List(1, 3)
But I'd like to select elements of a collection based on their integer key, as in the following:
case class N (p: Int , q: Int)
val x = List(N(1,100), N(2,200), N(3,300))
val y = List(2,4)
val z = .... ?
z // want z to be List(N(1,100), N(3,300)) after removing the items of type N with 'p'
// matching any item in list y.
I know one way to do it is something like the following, which makes the broken code above work:
val z = x.filterNot(e => y.contains(e.p))
but this seems very inefficient. Is there a better way?
Just do
val z = y.toSet
x.filterNot(e => z.contains(e.p))
That's linear.
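Put together as a self-contained sketch (the object name FilterByKey and the value name excluded are mine; the data is the question's):

```scala
object FilterByKey {
  case class N(p: Int, q: Int)

  val x = List(N(1, 100), N(2, 200), N(3, 300))
  val y = List(2, 4)

  // Build the Set once; each membership test is then effectively constant time.
  val excluded: Set[Int] = y.toSet

  // Keep only the items whose key p is NOT in the excluded set.
  val z: List[N] = x.filterNot(e => excluded.contains(e.p))
}
```

Note that the lambda has to be written out as e => excluded.contains(e.p); the placeholder form excluded.contains(_.p) would expand to a function passed to contains and not type-check.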
The problem with contains on a List is that each lookup is a linear search, so you are looking at an O(N^2) solution (which is still OK if the data set is not large).
Anyway, a simple alternative is to use binary search to get an O(N log N) solution. You can convert y from a List to an Array and then use Java's binary search method. Keep in mind that binarySearch requires the array to be sorted.
scala> case class N(p: Int, q: Int)
defined class N
scala> val x = List(N(1, 100), N(2, 200), N(3, 300))
x: List[N] = List(N(1,100), N(2,200), N(3,300))
scala> val y = Array(2, 4) // Using Array directly.
y: Array[Int] = Array(2, 4)
scala> val z = x.filterNot(e => java.util.Arrays.binarySearch(y, e.p) >= 0)
z: List[N] = List(N(1,100), N(3,300))
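Since java.util.Arrays.binarySearch only gives meaningful results on a sorted array, a slightly more defensive sketch sorts the keys first. The object name FilterBySortedArray and the deliberately unsorted input are assumptions for illustration.

```scala
object FilterBySortedArray {
  case class N(p: Int, q: Int)

  val x = List(N(1, 100), N(2, 200), N(3, 300))
  val y = List(4, 2) // deliberately unsorted

  // Sort once up front; binarySearch assumes ascending order.
  val sortedKeys: Array[Int] = y.sorted.toArray

  // binarySearch returns a non-negative index when the key is found.
  val z: List[N] =
    x.filterNot(e => java.util.Arrays.binarySearch(sortedKeys, e.p) >= 0)
}
```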

Index with Many Indices

Is there a quick Scala idiom to retrieve multiple elements of a traversable using indices?
I am looking for something like
val L=1 to 4 toList
L(List(1,2)) //doesn't work
I have been using map so far, but I am wondering if there is a more "Scala" way:
List(1,2) map {L(_)}
Thanks in advance
Since a List is a Function you can write just
List(1,2) map L
Although, if you're going to be looking things up by index, you should probably use an IndexedSeq like Vector instead of a List.
You could add an implicit class that adds the functionality:
implicit class RichIndexedSeq[T](seq: IndexedSeq[T]) {
def apply(i0: Int, i1: Int, is: Int*): Seq[T] = (i0+:i1+:is) map seq
}
You can then use the sequence's apply method with one index or multiple indices:
scala> val data = Vector(1,2,3,4,5)
data: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4, 5)
scala> data(0)
res0: Int = 1
scala> data(0,2,4)
res1: Seq[Int] = ArrayBuffer(1, 3, 5)
You can do it with a for comprehension but it's no clearer than the code you have using map.
scala> val indices = List(1,2)
indices: List[Int] = List(1, 2)
scala> for (index <- indices) yield L(index)
res0: List[Int] = List(2, 3)
I think the most readable would be to implement your own function takeIndices(indices: List[Int]) that takes a list of indices and returns the values of a given List at those indices. e.g.
L.takeIndices(List(1,2))
List[Int] = List(2,3)
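A minimal sketch of such a takeIndices, written as an extension method. The wrapper name IndexOps is hypothetical, and there is no bounds checking: an out-of-range index throws, just as L(i) would.

```scala
object TakeIndicesSketch {
  // Adds a takeIndices method to any Seq via an implicit class.
  implicit class IndexOps[A](private val seq: Seq[A]) {
    // A Seq is itself a function Int => A, so it can be passed to map.
    def takeIndices(indices: Seq[Int]): Seq[A] = indices.map(seq)
  }

  val L: List[Int] = (1 to 4).toList
  val picked: Seq[Int] = L.takeIndices(List(1, 2))
}
```

On a List each lookup walks the list from the head, so for many indices a Vector is the better receiver, as noted above.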

Why does this complex for comprehension fail? [duplicate]

Why does this construction cause a Type Mismatch error in Scala?
for (first <- Some(1); second <- List(1,2,3)) yield (first,second)
<console>:6: error: type mismatch;
found : List[(Int, Int)]
required: Option[?]
for (first <- Some(1); second <- List(1,2,3)) yield (first,second)
If I switch the Some with the List it compiles fine:
for (first <- List(1,2,3); second <- Some(1)) yield (first,second)
res41: List[(Int, Int)] = List((1,1), (2,1), (3,1))
This also works fine:
for (first <- Some(1); second <- Some(2)) yield (first,second)
For comprehensions are converted into calls to the map or flatMap method. For example this one:
for(x <- List(1) ; y <- List(1,2,3)) yield (x,y)
becomes that:
List(1).flatMap(x => List(1,2,3).map(y => (x,y)))
Therefore, the first loop value (in this case, List(1)) will receive the flatMap method call. Since flatMap on a List returns another List, the result of the for comprehension will of course be a List. (This was new to me: For comprehensions don't always result in streams, not even necessarily in Seqs.)
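The equivalence can be checked directly. A small sketch (the object and value names are mine):

```scala
object Desugar {
  // The for comprehension form...
  val viaFor: List[(Int, Int)] =
    for (x <- List(1); y <- List(1, 2, 3)) yield (x, y)

  // ...and the flatMap/map chain it is translated into.
  val viaCalls: List[(Int, Int)] =
    List(1).flatMap(x => List(1, 2, 3).map(y => (x, y)))
}
```

Both values are List[(Int, Int)] because the outermost call is flatMap on a List.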
Now, take a look at how flatMap is declared in Option:
def flatMap [B] (f: (A) ⇒ Option[B]) : Option[B]
Keep this in mind. Let's see how the erroneous for comprehension (the one with Some(1)) gets converted to a sequence of map calls:
Some(1).flatMap(x => List(1,2,3).map(y => (x, y)))
Now, it's easy to see that the parameter of the flatMap call is something that returns a List, but not an Option, as required.
In order to fix the thing, you can do the following:
for(x <- Some(1).toSeq ; y <- List(1,2,3)) yield (x, y)
That compiles just fine. It is worth noting that Option is not a subtype of Seq, as is often assumed.
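A variation on the same fix, sketched with toList instead of toSeq so that the static result type is List (the names here are illustrative):

```scala
object OptionAsList {
  // Converting the Option first makes the outer flatMap a List#flatMap,
  // so the inner List is an acceptable result for the function passed to it.
  val pairs: List[(Int, Int)] =
    for (x <- Some(1).toList; y <- List(1, 2, 3)) yield (x, y)

  // With None the first generator is empty, so the result is empty too.
  val none: List[(Int, Int)] =
    for (x <- Option.empty[Int].toList; y <- List(1, 2, 3)) yield (x, y)
}
```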
An easy tip to remember, for comprehensions will try to return the type of the collection of the first generator, Option[Int] in this case. So, if you start with Some(1) you should expect a result of Option[T].
If you want a result of List type, you should start with a List generator.
Why have this restriction and not assume you'll always want some sort of sequence? You can have a situation where it makes sense to return an Option. Maybe you have an Option[Int] that you want to combine with something to get an Option[List[Int]], say with the following function: (i:Int) => if (i > 0) Some(List.range(0, i)) else None; you could then write this and get None when things don't "make sense":
val f = (i:Int) => if (i > 0) Some(List.range(0, i)) else None
for (i <- Some(5); j <- f(i)) yield j
// returns: Option[List[Int]] = Some(List(0, 1, 2, 3, 4))
for (i <- None; j <- f(i)) yield j
// returns: Option[List[Int]] = None
for (i <- Some(-3); j <- f(i)) yield j
// returns: Option[List[Int]] = None
In the general case, for comprehensions are in fact a fairly general mechanism to combine an object of type M[T] with a function (T) => M[U] to get an object of type M[U]. In your example, M can be Option or List. In general it has to be the same type M. So you can't combine Option with List. For examples of other things that can be M, look at subclasses of this trait.
Why did combining List[T] with (T) => Option[T] work, though, when you started with the List? In this case the library uses a more general type where it makes sense. So you can combine List with Traversable, and there is an implicit conversion from Option to Traversable.
The bottom line is this: think about what type you want the expression to return and start with that type as the first generator. Wrap it in that type if necessary.
It probably has something to do with Option not being an Iterable. The implicit Option.option2Iterable will handle the case where compiler is expecting second to be an Iterable. I expect that the compiler magic is different depending on the type of the loop variable.
I always found this helpful:
scala> val foo: Option[Seq[Int]] = Some(Seq(1, 2, 3, 4, 5))
foo: Option[Seq[Int]] = Some(List(1, 2, 3, 4, 5))
scala> foo.flatten
<console>:13: error: Cannot prove that Seq[Int] <:< Option[B].
foo.flatten
^
scala> val bar: Seq[Seq[Int]] = Seq(Seq(1, 2, 3, 4, 5))
bar: Seq[Seq[Int]] = List(List(1, 2, 3, 4, 5))
scala> bar.flatten
res1: Seq[Int] = List(1, 2, 3, 4, 5)
scala> foo.toSeq.flatten
res2: Seq[Int] = List(1, 2, 3, 4, 5)
Since Scala 2.13 Option was made IterableOnce
sealed abstract class Option[+A] extends IterableOnce[A] with Product with Serializable
so the following for comprehension works without the use of option2Iterable implicit conversion
scala> for {
| a <- List(1)
| b <- Some(41)
| } yield (a + b)
val res35: List[Int] = List(42)
scala> List(1).flatMap
final override def flatMap[B](f: Int => scala.collection.IterableOnce[B]): List[B]
where we see List#flatMap takes a function to IterableOnce. Desugaring above for comprehension we get something like
List(1).flatMap(a => Some(41).map(b => a + b))
which shows the absence of the implicit conversion.
However in Scala 2.12 and before Option was not a traversable/iterable entity
sealed abstract class Option[+A] extends Product with Serializable
so the above for comprehension would desugar to something like
List(1).flatMap(a => option2Iterable(Some(41)).map(b => a + b))(List.canBuildFrom[Int])
where we see the implicit conversion.
The reason it does not work the other way around where for comprehension begins with Option and then we try to chain a List
scala> for {
| a <- Option(1)
| b <- List(41)
| } yield (a + b)
b <- List(41)
^
On line 3: error: type mismatch;
found : List[Int]
required: Option[?]
scala> Option(1).flatMap
final def flatMap[B](f: Int => Option[B]): Option[B]
is because Option#flatMap takes a function to Option and converting a List to Option probably does not make sense because we would lose elements for Lists with more than one element.
As szeiger explains
I think the recent Option changes actually make the for comprehensions
use case easier to understand because you do not need an implicit
conversion anymore. Option can be used on the RHS of a flatMap of any
collection type because it is IterableOnce (but not the opposite
because the RHS of Option#flatMap requires Option).
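If you do want to start from an Option and bring a List in, you have to decide explicitly which side to convert. A sketch of both choices, with made-up values and names:

```scala
object OptionFirst {
  // Keep the Option shape: reduce the List to an Option, dropping
  // every element after the first (this is the loss mentioned above).
  val viaHead: Option[Int] =
    for (a <- Option(1); b <- List(41, 99).headOption) yield a + b

  // Keep every element: widen the Option to a List instead.
  val viaList: List[Int] =
    for (a <- Option(1).toList; b <- List(41, 99)) yield a + b
}
```

Here viaHead only ever sees 41, while viaList combines 1 with both 41 and 99.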
