Cost of using repeated parameters - scala

I consider refactoring few method signatures that currently take parameter of type List or Set of concrete classes --List[Foo]-- to use repeated parameters instead: Foo*.
Update: Following reasoning is flawed, move along...
This would allow me to use the same method name and overload it based on the parameter type. This was not possible using List or Set, because List[Foo] and List[Bar] have same type after erasure: List[Object].
In my case the refactored methods work fine with scala.Seq[Foo] that results from the repeated parameter. I would have to change all the invocations and add a sequence argument type annotation to all collection parameters: baz.doStuffWith(foos:_*).
Given that switching from collection parameter to repeated parameter is semantically equivalent, does this change have some performance impact that I should be aware of?
Is the answer same for scala 2.7._ and 2.8?

When Scala is calling a Scala varargs method, the method will receive an object that extends Seq. When the call is made with : _*, the object will be passed as is*, without copying. Here are examples of this:
scala> object T {
| class X(val self: List[Int]) extends SeqProxy[Int] {
| private val serial = X.newSerial
| override def toString = serial.toString+":"+super.toString
| }
| object X {
| def apply(l: List[Int]) = new X(l)
| private var serial = 0
| def newSerial = {
| serial += 1
| serial
| }
| }
| }
defined module T
scala> new T.X(List(1,2,3))
res0: T.X = 1:List(1, 2, 3)
scala> new T.X(List(1,2,3))
res1: T.X = 2:List(1, 2, 3)
scala> def f(xs: Int*) = xs.toString
f: (Int*)String
scala> f(res0: _*)
res3: String = 1:List(1, 2, 3)
scala> f(res1: _*)
res4: String = 2:List(1, 2, 3)
scala> def f(xs: Int*): Seq[Int] = xs
f: (Int*)Seq[Int]
scala> def f(xs: Int*) = xs match {
| case ys: List[_] => println("List")
| case _ => println("Something else")
| }
f: (Int*)Unit
scala> f(List(1,2,3): _*)
List
scala> f(res0: _*)
Something else
scala> import scala.collection.mutable.ArrayBuffer
import scala.collection.mutable.ArrayBuffer
scala> def f(xs: Int*) = xs match {
| case ys: List[_] => println("List")
| case zs: ArrayBuffer[_] => zs.asInstanceOf[ArrayBuffer[Int]] += 4; println("Array Buffer")
| case _ => println("Something else")
| }
f: (Int*)Unit
scala> val ab = new ArrayBuffer[Int]()
ab: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer()
scala> ab + 1
res11: scala.collection.mutable.Buffer[Int] = ArrayBuffer(1)
scala> ab + 2
res12: scala.collection.mutable.Buffer[Int] = ArrayBuffer(1, 2)
scala> ab + 3
res13: scala.collection.mutable.Buffer[Int] = ArrayBuffer(1, 2, 3)
scala> f(ab: _*)
Array Buffer
scala> ab
res15: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(1, 2, 3, 4)
Note
An Array is passed as a WrappedArray. There's no copying of elements involved, however, and changes to the WrappedArray will be reflected in the Array.

Your reason for replacing List[T] with T* is flawed: Scala will not allow overloading like
class Foo
{
def t1(x : Int*) = println("Ints")
def t1(x : Strings*) = println("Strings")
}
This will result in the same compiler error as using List[Int]/List[String] here.
Although a bit clumsy you could use
class Foo
{
def t1(x0 : Int,x : Int*) = println("Ints")
def t1(x0 : String,x : Strings*) = println("Strings")
}
but that requires special treatment of the first parameter versus the rest.
Gr. Silvio

In the simplest terms, all the arguments that correspond to a repeated formal parameters, regardless of their origin, must be copied to a sequential collection of some sort for presentation to the method. The details of exactly what kind of sequence is used vary with Scala version and possibly with the source of the arguments. But regardless of those details, it is an O(n) operation, though the cost per item is pretty low. There will be at least one and sometimes more instance allocations for the sequence itself.
Randall Schulz

Related

Why does scala allow concating Strings with Option[Strings](or any other type)?

I mistakenly concatted a string with an Option[String] while coding in scala.
I expected as a strongly typed language, scala would not allow me to do such operation.
This is what I tried.
This works
scala> val a:String = "aaa"
val a: String = aaa
scala> val b:Option[String] = Some("bbbb")
val b: Option[String] = Some(bbbb)
scala> a + b
val res0: String = aaaSome(bbbb)
scala> val c:Option[String] = None
val c: Option[String] = None
scala> val d = a + c
val d: String = aaaNone
scala> val e = 1
val e: Int = 1
scala> a + e
val res2: String = aaa1
while this does not work
scala> val f:Option[String] = Some("ffff")
val f: Option[String] = Some(ffff)
scala> val g:Option[String] = None
val g: Option[String] = None
scala> f + g
^
error: type mismatch;
found : Option[String]
required: String
Why does scala allow such behavior? Dynamically typed languages like python will stop me from adding strings to int types, None types or any type other than strings. Curious if this design is intentional? If so why?
Scala contains an implicit class any2stringadd in it's Predef package. This class is responsible for these concatenation operations.
implicit final class any2stringadd[A](private val self: A) extends AnyVal {
def +(other: String): String = String.valueOf(self) + other
}
What it means is that, by default, scope contains a method + which can concatenate value of any type A with string by converting value of this type to string via String.valueOf(...).
I can't speak of design choices, I agree that this might be an unexpected behavior. The same applies to Scala's native == method. For example, this code compiles just ok: Some("a") == "b". This can lead to nasty bugs in filtering and other methods.
If you want to eliminate this behavior I suggest you take a look at https://typelevel.org/cats/ library, which introduces different typeclasses that can solve this problem.
For example, for string concatenation you can use Semigroup typeclass (which has tons of other useful use-cases as well):
import cats.Semigroup
Semigroup[String].combine("a", "b") // works, returns "ab"
Semigroup[String].combine("a", Some("b")) // won't work, compilation error
This looks tedious, but there is a syntactic sugar:
import cats.implicits._
"a" |+| "b" // works, returns "ab"
"a" |+| Some("b") // won't work, compilation error
// |+| here is the same as Semigroup[String].combine
The same thing applies to == method. Instead you can use Eq typeclass:
import cats.implicits._
"a" == Some("b") // works, no error, but could be unexpected
"a" === Some("b") // compilation error (Cats Eq)
"a" === "b" // works, as expected
"a" =!= "b" // same as != but type safe

Scala Either : simplest way to get a property that exists on right and left

I have a template using a valueObject that might be one of two flavours depending on where it is used in our app. So I am importing it as an Either:
valueObject: Either[ ObjectA, ObjectB ]
Both objects have an identically named property on them so I would like to retrieve it just by calling
valueObject.propertyA
Which doesn't work.
What is the most concise/ best way of doing this?
Assuming the two objects have the same type (or a supertype / trait) that defines that property - you can use merge which returns left if it exists and right otherwise, with the lowest common type of both:
scala> class MyClass {
| def propertyA = 1
| }
defined class MyClass
scala> val e1: Either[MyClass, MyClass] = Left(new MyClass)
e1: Either[MyClass,MyClass] = Left(MyClass#1e51abf)
scala> val e2: Either[MyClass, MyClass] = Right(new MyClass)
e2: Either[MyClass,MyClass] = Right(MyClass#b4c6d0)
scala> e1.merge.propertyA
res0: Int = 1
scala> e2.merge.propertyA
res1: Int = 1
Using fold
Assuming the two objects do not share a common supertype that holds the property/method, then you have to resort to fold:
scala> case class A(a: Int)
defined class A
scala> case class B(a: Int)
defined class B
scala> def foldAB(eab: Either[A,B]): Int = eab.fold(_.a,_.a)
foldAB: (eab: Either[A,B])Int
scala> foldAB(Left(A(1)))
res1: Int = 1
scala> foldAB(Right(B(1)))
res2: Int = 1
Pattern matching
Another possibility is to use pattern matching:
scala> def matchAB(eab: Either[A,B]): Int = eab match { case Left(A(i)) => i; case Right(B(i)) => i}
matchAB: (eab: Either[A,B])Int
scala> matchAB(Left(A(1)))
res3: Int = 1
scala> matchAB(Right(B(1)))
res4: Int = 1

How to flatten Tree in Scala?

It is pretty easy to write flatten(lol: List[List[T]]): List[T] which transforms a list of lists to a new list. Other "flat" collections (e.g. Set) seem to provide flatten too.
Now I wonder if I can define a flatten for Tree[T](defined as a T and list of Tree[T]s).
This is not perfect, just serves an example. All you need to do is to traverse a tree in depth-first or breadth-first manner and collect results. Pretty much the same as flatten for lists.
1) Define a tree structure (I know, I know it's not the best way to do it :)):
scala> case class Node[T](value: T, left: Option[Node[T]] = None,
| right: Option[Node[T]] = None)
defined class Node
2) Create a little tree:
scala> val tree = Node(13,
| Some(Node(8,
| Some(Node(1)), Some(Node(11)))),
| Some(Node(17,
| Some(Node(15)), Some(Node(25))))
| )
tree: Node[Int] = Node(13,Some(Node(8,Some(Node(1,None,None)),Some(Node(11,None,None)))),Some(Node(17,Some(Node(15,None,None)),Some(Node(25,None,None)))))
3) Implement a function that can traverse a tree:
scala> def observe[T](node: Node[T], f: Node[T] => Unit): Unit = {
| f(node)
| node.left foreach { observe(_, f) }
| node.right foreach { observe(_, f) }
| }
observe: [T](node: Node[T], f: Node[T] => Unit)Unit
4) Use it to define a function that prints all values:
scala> def printall = observe(tree, (n: Node[_]) => println(n.value))
printall: Unit
5) Finally, define that flatten function:
scala> def flatten[T](node: Node[T]): List[T] = {
| def flatten[T](node: Option[Node[T]]): List[T] =
| node match {
| case Some(n) =>
| n.value :: flatten(n.left) ::: flatten(n.right)
| case None => Nil
| }
|
| flatten(Some(node))
| }
flatten: [T](node: Node[T])List[T]
6) Let's test. First print all elems:
scala> printall
13
8
1
11
17
15
25
7) Run flatten:
scala> flatten(tree)
res1: List[Int] = List(13, 8, 1, 11, 17, 15, 25)
It's a sort of general purpose tree algorithm like tree traversal. I made it return Ts instead of Nodes, change it as you like.
I'm not sure how you want to define that flatten exactly, but you can look at the Scalaz Tree implementation:
https://github.com/scalaz/scalaz/blob/scalaz-seven/core/src/main/scala/scalaz/Tree.scala
If you want flatten to return you list of all Tree nodes, then Scalaz already provides you what you want:
def flatten: Stream[A]
Result type is Stream instead of List, but this is not a problem I guess.
If you want something more sophisticated, then you can probably implement it using existing flatMap:
def flatMap[B](f: A => Tree[B]): Tree[B]
Let's say you have Tree of type Tree[Tree[A]] and want to flatten it to Tree[A]:
def flatten1: Tree[A] = flatMap(identity)
This will work also for other, more weird scenarios. For example, you can have Tree[List[A]], and want to flatten everything inside that Lists without affecting Tree structure itself:
def flatten2[B]: Tree[List[B]] = flatMap(l => leaf(l.flatten))
Looks like it works as expected:
scala> node(List(List(1)), Stream(node(List(List(2)), Stream(leaf(List(List(3, 4), List(5))))), leaf(List(List(4)))))
res20: scalaz.Tree[List[List[Int]]] = <tree>
scala> res20.flatMap(l => leaf(l.flatten)).drawTree
res23: String =
"List(1)
|
+- List(2)
| |
| `- List(3, 4, 5)
|
`- List(4)
"
It could be worth noting that scalaz Tree is also a Monad. If you will look at the scalaz/tests/src/test/scala/scalaz/TreeTest.scala you will see which laws are fulfilled for Tree:
checkAll("Tree", equal.laws[Tree[Int]])
checkAll("Tree", traverse1.laws[Tree])
checkAll("Tree", applicative.laws[Tree])
checkAll("Tree", comonad.laws[Tree])
I don't know why monad is not here, but if you will add checkAll("Tree", monad.laws[Tree]) and run tests again, they will pass.
If I'm understanding the question correctly, you want to define Tree like so:
case class Tree[T]( value:T, kids:List[Tree[T]] )
First, I wouldn't want to use ::: in the solution because of the performance implications. Second, I'd want to do something much more general -- define a fold operator for the type, which can be used for all sorts of things -- and then simply use a fold to define flatten:
case class Tree[T]( value:T, kids:List[Tree[T]] ) {
def /:[A]( init:A )( f: (A,T) => A ):A =
( f(init,value) /: kids )( (soFar,kid) => ( soFar /: kid )(f) )
def flatten =
( List.empty[T] /: this )( (soFar,value) => value::soFar ).reverse
}
Here's a test:
scala> val t = Tree( 1, List( Tree( 2, List( Tree(3,Nil), Tree(4,Nil) ) ), Tree(5,Nil), Tree( 6, List( Tree(7,Nil) ) ) ) )
t: Tree[Int] = Tree(1,List(Tree(2,List(Tree(3,List()), Tree(4,List()))), Tree(5,List()), Tree(6,List(Tree(7,List())))))
scala> t.flatten
res15: List[Int] = List(1, 2, 3, 4, 5, 6, 7)

Scala: Filtering based on type

I'm learning Scala as it fits my needs well but I am finding it hard to structure code elegantly. I'm in a situation where I have a List x and want to create two Lists: one containing all the elements of SomeClass and one containing all the elements that aren't of SomeClass.
val a = x collect {case y:SomeClass => y}
val b = x filterNot {_.isInstanceOf[SomeClass]}
Right now my code looks like that. However, it's not very efficient as it iterates x twice and the code somehow seems a bit hackish. Is there a better (more elegant) way of doing things?
It can be assumed that SomeClass has no subclasses.
EDITED
While using plain partition is possible, it loses the type information retained by collect in the question.
One could define a variant of the partition method that accepts a function returning a value of one of two types using Either:
import collection.mutable.ListBuffer
def partition[X,A,B](xs: List[X])(f: X=>Either[A,B]): (List[A],List[B]) = {
val as = new ListBuffer[A]
val bs = new ListBuffer[B]
for (x <- xs) {
f(x) match {
case Left(a) => as += a
case Right(b) => bs += b
}
}
(as.toList, bs.toList)
}
Then the types are retained:
scala> partition(List(1,"two", 3)) {
case i: Int => Left(i)
case x => Right(x)
}
res5: (List[Int], List[Any]) = (List(1, 3),List(two))
Of course the solution could be improved using builders and all the improved collection stuff :) .
For completeness my old answer using plain partition:
val (a,b) = x partition { _.isInstanceOf[SomeClass] }
For example:
scala> val x = List(1,2, "three")
x: List[Any] = List(1, 2, three)
scala> val (a,b) = x partition { _.isInstanceOf[Int] }
a: List[Any] = List(1, 2)
b: List[Any] = List(three)
Just wanted to expand on mkneissl's answer with a "more generic" version that should work on many different collections in the library:
scala> import collection._
import collection._
scala> import generic.CanBuildFrom
import generic.CanBuildFrom
scala> def partition[X,A,B,CC[X] <: Traversable[X], To, To2](xs : CC[X])(f : X => Either[A,B])(
| implicit cbf1 : CanBuildFrom[CC[X],A,To], cbf2 : CanBuildFrom[CC[X],B,To2]) : (To, To2) = {
| val left = cbf1()
| val right = cbf2()
| xs.foreach(f(_).fold(left +=, right +=))
| (left.result(), right.result())
| }
partition: [X,A,B,CC[X] <: Traversable[X],To,To2](xs: CC[X])(f: (X) => Either[A,B])(implicit cbf1: scala.collection.generic.CanBuildFrom[CC[X],A,To],implicit cbf2: scala.collection.generic.CanBuildFrom[CC[X],B,To2])(To, To2)
scala> partition(List(1,"two", 3)) {
| case i: Int => Left(i)
| case x => Right(x)
| }
res5: (List[Int], List[Any]) = (List(1, 3),List(two))
scala> partition(Vector(1,"two", 3)) {
| case i: Int => Left(i)
| case x => Right(x)
| }
res6: (scala.collection.immutable.Vector[Int], scala.collection.immutable.Vector[Any]) = (Vector(1, 3),Vector(two))
Just one note: The partition method is similar, but we need to capture a few types:
X -> The original type for items in the collection.
A -> The type of items in the left partition
B -> The type of items in the right partition
CC -> The "specific" type of the collection (Vector, List, Seq etc.) This must be higher-kinded. We could probably work around some type-inference issues (see Adrian's response here: http://suereth.blogspot.com/2010/06/preserving-types-and-differing-subclass.html ), but I was feeling lazy ;)
To -> The complete type of collection on the left hand side
To2 -> The complete type of the collection on the right hand side
Finally, the funny "CanBuildFrom" implicit paramters are what allow us to construct specific types, like List or Vector, generically. They are built into to all the core library collections.
Ironically, the entire reason for the CanBuildFrom magic is to handle BitSets correctly. Because I require CC to be higher kinded, we get this fun error message when using partition:
scala> partition(BitSet(1,2, 3)) {
| case i if i % 2 == 0 => Left(i)
| case i if i % 2 == 1 => Right("ODD")
| }
<console>:11: error: type mismatch;
found : scala.collection.BitSet
required: ?CC[ ?X ]
Note that implicit conversions are not applicable because they are ambiguous:
both method any2ArrowAssoc in object Predef of type [A](x: A)ArrowAssoc[A]
and method any2Ensuring in object Predef of type [A](x: A)Ensuring[A]
are possible conversion functions from scala.collection.BitSet to ?CC[ ?X ]
partition(BitSet(1,2, 3)) {
I'm leaving this open for someone to fix if needed! I'll see if I can give you a solution that works with BitSet after some more play.
Use list.partition:
scala> val l = List(1, 2, 3)
l: List[Int] = List(1, 2, 3)
scala> val (even, odd) = l partition { _ % 2 == 0 }
even: List[Int] = List(2)
odd: List[Int] = List(1, 3)
EDIT
For partitioning by type, use this method:
def partitionByType[X, A <: X](list: List[X], typ: Class[A]):
Pair[List[A], List[X]] = {
val as = new ListBuffer[A]
val notAs = new ListBuffer[X]
list foreach {x =>
if (typ.isAssignableFrom(x.asInstanceOf[AnyRef].getClass)) {
as += typ cast x
} else {
notAs += x
}
}
(as.toList, notAs.toList)
}
Usage:
scala> val (a, b) = partitionByType(List(1, 2, "three"), classOf[java.lang.Integer])
a: List[java.lang.Integer] = List(1, 2)
b: List[Any] = List(three)
If the list only contains subclasses of AnyRef, becaus of the method getClass. You can do this:
scala> case class Person(name: String)
defined class Person
scala> case class Pet(name: String)
defined class Pet
scala> val l: List[AnyRef] = List(Person("Walt"), Pet("Donald"), Person("Disney"), Pet("Mickey"))
l: List[AnyRef] = List(Person(Walt), Pet(Donald), Person(Disney), Pet(Mickey))
scala> val groupedByClass = l.groupBy(e => e.getClass)
groupedByClass: scala.collection.immutable.Map[java.lang.Class[_],List[AnyRef]] = Map((class Person,List(Person(Walt), Person(Disney))), (class Pet,List(Pet(Donald), Pet(Mickey))))
scala> groupedByClass(classOf[Pet])(0).asInstanceOf[Pet]
res19: Pet = Pet(Donald)
Starting in Scala 2.13, most collections are now provided with a partitionMap method which partitions elements based on a function returning either Right or Left.
That allows us to pattern match a given type (here Person) that we transform as a Right in order to place it in the right List of the resulting partition tuple. And other types can be transformed as Lefts to be partitioned in the left part:
// case class Person(name: String)
// case class Pet(name: String)
val (pets, persons) =
List(Person("Walt"), Pet("Donald"), Person("Disney")).partitionMap {
case person: Person => Right(person)
case pet: Pet => Left(pet)
}
// persons: List[Person] = List(Person(Walt), Person(Disney))
// pets: List[Pet] = List(Pet(Donald))

Using lazy evaluation functions in varargs

What is wrong is the following method?
def someMethod(funcs: => Option[String]*) = {
...
}
That actually "works" under 2.7.7 if you add parens:
scala> def someMethod(funcs: => (Option[String]*)) = funcs
someMethod: (=> Option[String]*)Option[String]*
except it doesn't actually work at runtime:
scala> someMethod(Some("Fish"),None)
scala.MatchError: Some(Fish)
at scala.runtime.ScalaRunTime$.boxArray(ScalaRunTime.scala:136)
at .someMethod(<console>:4)
at .<init>(<console>:6)
at .<clinit>(<console>) ...
In 2.8 it refuses to let you specify X* as the output of any function or by-name parameter, even though you can specify it as an input (this is r21230, post-Beta 1):
scala> var f: (Option[Int]*) => Int = _
f: (Option[Int]*) => Int = null
scala> var f: (Option[Int]*) => (Option[Int]*) = _
<console>:1: error: no * parameter type allowed here
var f: (Option[Int]*) => (Option[Int]*) = _
But if you try to convert from a method, it works:
scala> def m(oi: Option[Int]*) = oi
m: (oi: Option[Int]*)Option[Int]*
scala> var f = (m _)
f: (Option[Int]*) => Option[Int]* = <function1>
scala> f(Some(1),None)
res0: Option[Int]* = WrappedArray(Some(1), None)
So it's not entirely consistent.
In any case, you can possibly achieve what you want by passing in an Array and then sending that array to something that takes repeated arguments:
scala> def aMethod(os: Option[String]*) { os.foreach(println) }
aMethod: (os: Option[String]*)Unit
scala> def someMethod(funcs: => Array[Option[String]]) { aMethod(funcs:_*) }
someMethod: (funcs: => Array[Option[String]])Unit
scala> someMethod(Array(Some("Hello"),Some("there"),None))
Some(Hello)
Some(there)
None
If you really want to (easily) pass a bunch of lazily evaluated arguments, then you need a little bit of infrastructure that as far as I know doesn't nicely exist in the library (this is code for 2.8; view it as inspiration for a similar strategy in 2.7):
class Lazy[+T](t: () => T, lt: Lazy[T]) {
val params: List[() => T] = (if (lt eq null) Nil else t :: lt.params)
def ~[S >: T](s: => S) = new Lazy[S](s _,this)
}
object Lz extends Lazy[Nothing](null,null) {
implicit def lazy2params[T : Manifest](lz: Lazy[T]) = lz.params.reverse.toArray
}
Now you can easily create a bunch of parameters that are lazily evaluated:
scala> import Lz._ // To get implicit def
import Lz._
scala> def lazyAdder(ff: Array[()=>Int]) = {
| println("I'm adding now!");
| (0 /: ff){(n,f) => n+f()}
| }
lazyAdder: (ff: Array[() => Int])Int
scala> def yelp = { println("You evaluated me!"); 5 }
yelp: Int
scala> val a = 3
a: Int = 3
scala> var b = 7
b: Int = 7
scala> lazyAdder( Lz ~ yelp ~ (a+b) )
I'm adding now!
You evaluated me!
res0: Int = 15
scala> val plist = Lz ~ yelp ~ (a+b)
plist: Lazy[Int] = Lazy#1ee1775
scala> b = 1
b: Int = 1
scala> lazyAdder(plist)
I'm adding now!
You evaluated me!
res1: Int = 9
Evidently repeated arguments are not available for by-name parameters.