What is the actual return type of the filter method on a range? How do I make it a Vector? - scala

I'm completely new to Scala so bear with me please! I am going through some Scala exercises and one of them is to create a list of odd numbers from 1 to 20. This is pretty straight forward but I'm a bit confused about the return type of the filter method on a range.
I have the following block:
val lst2 = (1 to 20).filter(_ % 2 != 0)
println(lst2)
The output of this is:
Vector(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)
However when I explicitly set the type of lst2 to Vector[Int] like the following:
val lst2: Vector[Int] = (1 to 20).filter(_ % 2 != 0)
println(lst2)
I get the this:
16: error: type mismatch;
found : scala.collection.immutable.IndexedSeq[Int]
required: Vector[Int]
val lst2: Vector[Int] = (1 to 20).filter(_ % 2 != 0)
^
one error found
So what's gonig on here? Is the println method just not giving me the correct type? How do I get the filter method to return a Vector?

The only guarantee that is provided by filter of Range is that it returns a collection.immutable.IndexedSeq[A], so this compiles:
val lst2: collection.immutable.IndexedSeq[Int] = (1 to 20).filter(_ % 2 == 1)
At runtime, lst2 just happens to be a Vector[Int], but this is not guaranteed by the interface, so the authors of the filter method effectively reserve the right to change the concrete implementation to another IndexedSeq whenever they like. The type Vector is an implementation detail you shouldn't rely on.
The reason why it's printed as Vector(...) is that it depends on the implementation of toString of a concrete instance that is present at runtime, not on the statically know type (dynamic dispatch).
If you really want something with type Vector, just append .toVector:
val lst2: Vector[Int] = (0 to 20).filter(_ % 2 == 1).toVector

Related

Scala: Detect and extract something more specific from a collection of `Any` values

Scala: Detect and extract something more specific from a collection of Any values.
(Motivation: The Saddle library -- the only Scala library I have found that provides a Frame type, which is critical for data science -- leads me to this puzzle. See last section for details.)
The problem
Imagine a collection c of type C[Any]. Suppose that some of the elements of c are of a type T which is strictly more specific than Any.
I would like a way to find all the elements of type T, and to then create an object d of type C[T], rather than C[Any].
Some code demonstrating the problem
scala> val c = List(0,1,"2","3")
<console>:11: warning: a type was inferred to be `Any`;
this may indicate a programming error.
val c = List(0,1,"2","3")
^
c: List[Any] = List(0, 1, 2, 3)
scala> :t c(0)
Any // I wish this were more specific
// Scala can convert some elements to Int.
scala> val c0 = c(0) . asInstanceOf[Int]
c0: Int = 0
// But how would I detect which?
scala> val d = c.slice(0,2)
d: List[Any] = List(0, 1) // I wish this were List[Int]
Motivation: Why the Saddle library leads me to this problem
Saddle lets you manipulate "Frames" (tables). Frames can have columns of various types. Some systems (e.g. Pandas) assign a separate type to each column. Every Frame in Saddle, however, has exactly three type parameters: The type of row labels, the type of column labels, and the type of cells.
Real world data is typically a mix of strings and numbers. The way such tables are represented in Saddle is as a Frame with a cell type of Any. I'd like to downcast (upcast? polymorphism is hard) a column to something more specific than a Series of Any values. I'd also like to be able to test a column, to be sure that the cast is appropriate.
I posted an issue on Saddle's Github site about the puzzle.
You could do something like this
scala> val c = List(0,1,"2","3")
c: List[Any] = List(0, 1, 2, 3)
scala> c.collect { case x: Int => x; case s: String => s.toInt }
res0: List[Int] = List(0, 1, 2, 3)
If you just want the Int types you can simply drop the second case.

Scala: How to find the minimum of more than 2 elements?

Since the Math.min() function only allows for the use of 2 elements, I was wondering if there is maybe another function which can calculate the minimum of more than 2 elements.
Thanks in advance!
If you have multiple elements you can just chain calls to the min method:
Math.min(x, Math.min(y, z))
Since scala adds the a min method to numbers via implicits you could write the following which looks much fancier:
x min y min z
If you have a list of values and want to find their minimum:
val someNumbers: List[Int] = ???
val minimum = someNumbers.min
Note that this throws an exception if the list is empty. From scala 2.13.x onwards, there will be a minOption method to handle such cases gracefully. For older versions you could use the reduceOption method as workaround:
someNumbers.reduceOption(_ min _)
someNumbers.reduceOption(Math.min)
Add all numbers in the collection like the list and find a minimum of it.
scala> val list = List(2,3,7,1,9,4,5)
list: List[Int] = List(2, 3, 7, 1, 9, 4, 5)
scala> list.min
res0: Int = 1

Scala: How to multiply a List of Lists by value

Now studying Scala and working with list of lists. Want to multiply array by an element(for example, 1).
However I get the following error:
identifier expected but integer constant found
Current code:
def multiply[A](listOfLists:List[List[A]]):List[List[A]] =
if (listOfLists == Nil) Nil
else -1 * listOfLists.head :: multiply(listOfLists.tail)
val tt = multiply[List[3,4,5,6];List[4,5,6,7,8]]
print(tt);;
There are a few issues with your code:
In general, you can't perform arithmetic on unconstrained generic types; somehow, you have to communicate any supported arithmetic operations.
Multiplying by 1 will typically have no effect anyway.
As already pointed out, you don't declare List instances using square brackets (they're used for declaring generic type arguments).
The arguments you're passing to multiply are two separate lists (using an invalid semicolon separator instead of a comma), not a list of lists.
In the if clause the return value is Nil, which matches the stated return type of List[List[A]]. However the else clause is trying to perform a calculation which is multiplying List instances (not the contents of the lists) by an Int. Even if this made sense, the resulting type is clearly not a List[List[A]]. (This also makes it difficult for me to understand exactly what it is you're trying to accomplish.)
Here's a version of your code that corrects the above, assuming that you're trying to multiply each member of the inner lists by a particular factor:
// Multiply every element in a list of lists by the specified factor, returning the
// resulting list of lists.
//
// Should work for any primitive numeric type (Int, Double, etc.). For custom value types,
// you will need to declare an `implicit val` of type Numeric[YourCustomType] with an
// appropriate implementation of the `Numeric[T]` trait. If in scope, the appropriate
// num value will be identified by the compiler and passed to the function automatically.
def multiply[A](ll: List[List[A]], factor: A)(implicit num: Numeric[A]): List[List[A]] = {
// Numeric[T] trait defines a times method that we use to perform the multiplication.
ll.map(_.map(num.times(_, factor)))
}
// Sample use: Multiply every value in the list by 5.
val tt = multiply(List(List(3, 4, 5, 6), List(4, 5, 6, 7, 8)), 5)
println(tt)
This should result in the following output:
List(List(15, 20, 25, 30), List(20, 25, 30, 35, 40))
However, it might be that you're just trying to multiply together all of the values in the lists. This is actually a little more straightforward (note the different return type):
def multiply[A](ll: List[List[A]])(implicit num: Numeric[A]): A = ll.flatten.product
// Sample use: Multiply all values in all lists together.
val tt = multiply(List(List(3, 4, 5, 6), List(4, 5, 6, 7, 8)))
println(tt)
This should result in the following output:
2419200
I'd recommend you read a good book on Scala. There's a lot of pretty sophisticated stuff going on in these examples, and it would take too long to explain it all here. A good start would be Programming in Scala, Third Edition by Odersky, Spoon & Venners. That will cover List[A] operations such as map, flatten and product as well as implicit function arguments and implicit val declarations.
To make numeric operations available to type A, you can use context bound to associate A with scala.math.Numeric which provides methods such as times and fromInt to carry out the necessary multiplication in this use case:
def multiply[A: Numeric](listOfLists: List[List[A]]): List[List[A]] = {
val num = implicitly[Numeric[A]]
import num._
if (listOfLists == Nil) Nil else
listOfLists.head.map(times(_, fromInt(-1))) :: multiply(listOfLists.tail)
}
multiply( List(List(3, 4, 5, 6), List(4, 5, 6, 7, 8)) )
// res1: List[List[Int]] = List(List(-3, -4, -5, -6), List(-4, -5, -6, -7, -8))
multiply( List(List(3.0, 4.0), List(5.0, 6.0, 7.0)) )
// res2: List[List[Double]] = List(List(-3.0, -4.0), List(-5.0, -6.0, -7.0))
For more details about context bound, here's a relevant SO link.

Constructing BitSets in Scala from a predicate?

Suppose I want to construct a BitSet containing all integers from 0 until n satisfying some predicate f: Int => Boolean.
I could write something like
BitSet((0 until n):_*).filter(f)
which of course works. But it feels rather inefficient! I'm planning on doing this inside a pretty tight loop, and would like suggestions for more efficient ways.
This is the best I could come up with at the moment
BitSet((0 until n).view.filter(f):_*)
The view part makes the filter method lazy. This makes sure that when the BitSet is created from the given sequence, it will filter on the fly. Your original suggestion creates a new BitSet after the first one is created.
If performance is truly your major concern, the best option is probably to use a mutable.BitSet and a while loop, and then call toImmutable on the result.
val bitSet = {
val tmp = new scala.collection.mutable.BitSet(n)
var i = 0;
while (i < n) {
if (f(i)) {
tmp += i
}
i = i + 1
}
tmp.toImmutable
}
I think the most efficient "functional" way is to use foldLeft:
(1 to 5).foldLeft(BitSet())((s,i) => if (f(i)) s + i else s)
It doesn't create an intermediate collection but construct the collection from scratch while filtering.
The first thing I thought is to use breakOut, but it doesn't work for filter:
scala> val set: BitSet = (0 until 10).filter(f)(collection.breakOut)
<console>:11: error: polymorphic expression cannot be instantiated to expected type;
found : [From, T, To]scala.collection.generic.CanBuildFrom[From,T,To]
required: Int
val set: BitSet = (0 until 10).filter(f)(collection.breakOut)
^
scala> val set: BitSet = (0 until 10).map(_+1)(collection.breakOut)
set: scala.collection.immutable.BitSet = BitSet(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
breakOut doesn't create an intermediate collection too, but because filter doesn't have a second parameter list it can't work.

Should Scala's map() behave differently when mapping to the same type?

In the Scala Collections framework, I think there are some behaviors that are counterintuitive when using map().
We can distinguish two kinds of transformations on (immutable) collections. Those whose implementation calls newBuilder to recreate the resulting collection, and those who go though an implicit CanBuildFrom to obtain the builder.
The first category contains all transformations where the type of the contained elements does not change. They are, for example, filter, partition, drop, take, span, etc. These transformations are free to call newBuilder and to recreate the same collection type as the one they are called on, no matter how specific: filtering a List[Int] can always return a List[Int]; filtering a BitSet (or the RNA example structure described in this article on the architecture of the collections framework) can always return another BitSet (or RNA). Let's call them the filtering transformations.
The second category of transformations need CanBuildFroms to be more flexible, as the type of the contained elements may change, and as a result of this, the type of the collection itself maybe cannot be reused: a BitSet cannot contain Strings; an RNA contains only Bases. Examples of such transformations are map, flatMap, collect, scanLeft, ++, etc. Let's call them the mapping transformations.
Now here's the main issue to discuss. No matter what the static type of the collection is, all filtering transformations will return the same collection type, while the collection type returned by a mapping operation can vary depending on the static type.
scala> import collection.immutable.TreeSet
import collection.immutable.TreeSet
scala> val treeset = TreeSet(1,2,3,4,5) // static type == dynamic type
treeset: scala.collection.immutable.TreeSet[Int] = TreeSet(1, 2, 3, 4, 5)
scala> val set: Set[Int] = TreeSet(1,2,3,4,5) // static type != dynamic type
set: Set[Int] = TreeSet(1, 2, 3, 4, 5)
scala> treeset.filter(_ % 2 == 0)
res0: scala.collection.immutable.TreeSet[Int] = TreeSet(2, 4) // fine, a TreeSet again
scala> set.filter(_ % 2 == 0)
res1: scala.collection.immutable.Set[Int] = TreeSet(2, 4) // fine
scala> treeset.map(_ + 1)
res2: scala.collection.immutable.SortedSet[Int] = TreeSet(2, 3, 4, 5, 6) // still fine
scala> set.map(_ + 1)
res3: scala.collection.immutable.Set[Int] = Set(4, 5, 6, 2, 3) // uh?!
Now, I understand why this works like this. It is explained there and there. In short: the implicit CanBuildFrom is inserted based on the static type, and, depending on the implementation of its def apply(from: Coll) method, may or may not be able to recreate the same collection type.
Now my only point is, when we know that we are using a mapping operation yielding a collection with the same element type (which the compiler can statically determine), we could mimic the way the filtering transformations work and use the collection's native builder. We can reuse BitSet when mapping to Ints, create a new TreeSet with the same ordering, etc.
Then we would avoid cases where
for (i <- set) {
val x = i + 1
println(x)
}
does not print the incremented elements of the TreeSet in the same order as
for (i <- set; x = i + 1)
println(x)
So:
Do you think this would be a good idea to change the behavior of the mapping transformations as described?
What are the inevitable caveats I have grossly overlooked?
How could it be implemented?
I was thinking about something like an implicit sameTypeEvidence: A =:= B parameter, maybe with a default value of null (or rather an implicit canReuseCalleeBuilderEvidence: B <:< A = null), which could be used at runtime to give more information to the CanBuildFrom, which in turn could be used to determine the type of builder to return.
I looked again at it, and I think your problem doesn't arise from a particular deficiency of Scala collections, but rather a missing builder for TreeSet. Because the following does work as intended:
val list = List(1,2,3,4,5)
val seq1: Seq[Int] = list
seq1.map( _ + 1 ) // yields List
val vector = Vector(1,2,3,4,5)
val seq2: Seq[Int] = vector
seq2.map( _ + 1 ) // yields Vector
So the reason is that TreeSet is missing a specialised companion object/builder:
seq1.companion.newBuilder[Int] // ListBuffer
seq2.companion.newBuilder[Int] // VectorBuilder
treeset.companion.newBuilder[Int] // Set (oops!)
So my guess is, if you take proper provision for such a companion for your RNA class, you may find that both map and filter work as you wish...?