When should .empty be used versus the singleton empty instance? - scala

Working with collections in Scala, it's common to need to use the empty instance of the collection for a base case. Because the default empty instances extend the collection type class with a type parameter of Nothing, it sometimes defeats type inference to use them directly. For example:
scala> List(1, 2, 3).foldLeft(Nil)((x, y) => x :+ y.toString())
<console>:8: error: type mismatch;
found : List[String]
required: scala.collection.immutable.Nil.type
List(1, 2, 3).foldLeft(Nil)((x, y) => x :+ y.toString())
^
fails, but the following two corrections succeed:
scala> List(1, 2, 3).foldLeft(Nil: List[String])((x, y) => x :+ y.toString())
res9: List[String] = List(1, 2, 3)
scala> List(1, 2, 3).foldLeft(List.empty[String])((x, y) => x :+ y.toString())
res10: List[String] = List(1, 2, 3)
Another place I've run into a similar dilemma is in defining default parameters. These are the only examples I could think of off the top of my head, but I know I've seen others. Is one method of providing the correct type hinting preferable to the other in general? Are there places where each would be advantageous?

I tend to use Nil (or None) in combination with telling the type parameterized method the type (like Kigyo) for the specific use case given. Though I think using explicit type annotation is equally OK for the use case given. But I think there are use cases where you want to stick to using .empty, for example, if you try to call a method on Nil: List[String] you first have to wrap it in braces, so that's 2 extra characters!!.
Now the other argument for using .empty is consistency with the entire collections hierarchy. For example you can't do Nil: Set[String] and you can't do Nil: Option[String] but you can do Set.empty[String] and Option.empty[String]. So unless your really sure your code will never be refactored into some other collection then you should use .empty as it will require less faff to refactor. Furthermore it's just generally nice to be consistent right? :)
To be fair I often use Nil and None as I'm often quite sure I'd never want to use a Set or something else, in fact I'd say it's better to use Nil when your really sure your only going to deal with lists because it tells the reader of the code "I'm really really dealing with lists here".
Finally, you can do some cool stuff with .empty and duck typing check this simple example:
def printEmptyThing[K[_], T <: {def empty[A] : K[A]}](c: T): Unit =
println("thing = " + c.empty[String])
printEmptyThing[List, List.type](List)
printEmptyThing[Option, Option.type](Option)
printEmptyThing[Set, Set.type](Set)
will print:
> thing = List()
> thing = None
> thing = Set()

Related

Scala: isInstanceOf followed by asInstanceOf

In my team, I often see teammates writing
list.filter(_.isInstanceOf[T]).map(_.asInstanceOf[T])
but this seems a bit redundant to me.
If we know that everything in the filtered list is an instance of T then why should we have to explicitly cast it as such?
I know of one alternative, which is to use match.
eg:
list.match {
case thing: T => Some(thing)
case _ => None
}
but this has the drawback that we must then explicitly state the generic case.
So, given all the above, I have 2 questions:
1) Is there another (better?) way to do the same thing?
2) If not, which of the two options above should be preferred?
You can use collect:
list collect {
case el: T => el
}
Real types just work (barring type erasure, of course):
scala> List(10, "foo", true) collect { case el: Int => el }
res5: List[Int] = List(10)
But, as #YuvalItzchakov has mentioned, if you want to match for an abstract type T, you must have an implicit ClassTag[T] in scope.
So a function implementing this may look as follows:
import scala.reflect.ClassTag
def filter[T: ClassTag](list: List[Any]): List[T] = list collect {
case el: T => el
}
And using it:
scala> filter[Int](List(1, "foo", true))
res6: List[Int] = List(1)
scala> filter[String](List(1, "foo", true))
res7: List[String] = List(foo)
collect takes a PartialFunction, so you shouldn't provide the generic case.
But if needed, you can convert a function A => Option[B] to a PartialFunction[A, B] with Function.unlift. Here is an example of that, also using shapeless.Typeable to work around type erasure:
import shapeless.Typeable
import shapeless.syntax.typeable._
def filter[T: Typeable](list: List[Any]): List[T] =
list collect Function.unlift(_.cast[T])
Using:
scala> filter[Option[Int]](List(Some(10), Some("foo"), true))
res9: List[Option[Int]] = List(Some(10))
but this seems a bit redundant to me.
Perhaps programmers in your team are trying to shield that piece of code from someone mistakenly inserting a type other then T, assuming this is some sort of collection with type Any. Otherwise, the first mistake you make, you'll blow up at run-time, which is never fun.
I know of one alternative, which is to use match.
Your sample code won't work because of type erasure. If you want to match on underlying types, you need to use ClassTag and TypeTag respectively for each case, and use =:= for type equality and <:< for subtyping relationships.
Is there another (better?) way to do the same thing?
Yes, work with the type system, not against it. Use typed collections when you can. You haven't elaborated on why you need to use run-time checks and casts on types, so I'm assuming there is a reasonable explanation to that.
If not, which of the two options above should be preferred?
That's a matter of taste, but using pattern matching on types can be more error-prone since one has to be aware of the fact that types are erased at run-time, and create a bit more boilerplate code for you to maintain.

Scala flatten list of String and List[String]

Need some help with Scala flatten.
I have a list of String and List[String].
Example: List("I", "can't", List("do", "this"))
Expecting result: List("I", "can't", "do", "this")
I've done a lot of experiments, and most compact solution is:
val flattenList = list.flatten {
case list: List[Any] => list
case x => List(x)
}
But it seems very tricky and hard to understand. Any suggestions for more naive code?
Thanks.
What's "tricky and hard to understand" is your mixing elements of different type in the same list. That's the root cause of your problem. Once you have that, there is no way around having to scan the list, and inspect the type of each element to correct it, and your solution to that is as good as any (certainly, better, than the one, suggested in the other answer :)).
I would really rethink the code path that leads to having a heterogeneous list like this in the first place though if I were you. This is not really a good approach, because you subvert the type safety this way, and end up with a List[AnyRef], that can contain ... well, anything.
I don't think you can avoid having to deal with 2 cases: single element vs list. One way or another you would have to tell your program what to do. Here is a more general implementation that deals with a list of any depth:
def flattenList(xs: List[Any]): List[Any] =
xs match {
case Nil => Nil
case (ys:List[_]) :: t => flattenList(ys) ::: flattenList(t)
case h :: t => h :: flattenList(t)
}
Example:
scala> flattenList(List("I", "can't", List("do", "this")))
res1: List[Any] = List(I, can't, do, this)
scala> flattenList(List("I", "can't", List("do", List("this", "and", "this"))))
res2: List[Any] = List(I, can't, do, this, and, this)
This does not look very type safe though. Try to use a Tree or something else.

Enforcing non-emptyness of scala varargs at compile time

I have a function that expects a variable number of parameters of the same type, which sounds like the textbook use case for varargs:
def myFunc[A](as: A*) = ???
The problem I have is that myFunc cannot accept empty parameter lists. There's a trivial way of enforcing that at runtime:
def myFunc[A](as: A*) = {
require(as.nonEmpty)
???
}
The problem with that is that it happens at runtime, as opposed to compile time. I would like the compiler to reject myFunc().
One possible solution would be:
def myFunc[A](head: A, tail: A*) = ???
And this works when myFunc is called with inline arguments, but I'd like users of my library to be able to pass in a List[A], which this syntax makes very awkward.
I could try to have both:
def myFunc[A](head: A, tail: A*) = myFunc(head +: tail)
def myFunc[A](as: A*) = ???
But we're right back where we started: there's now a way of calling myFunc with an empty parameter list.
I'm aware of scalaz's NonEmptyList, but in as much as possible, I'd like to stay with stlib types.
Is there a way to achieve what I have in mind with just the standard library, or do I need to accept some runtime error handling for something that really feels like the compiler should be able to deal with?
What about something like this?
scala> :paste
// Entering paste mode (ctrl-D to finish)
def myFunc()(implicit ev: Nothing) = ???
def myFunc[A](as: A*) = println(as)
// Exiting paste mode, now interpreting.
myFunc: ()(implicit ev: Nothing)Nothing <and> [A](as: A*)Unit
myFunc: ()(implicit ev: Nothing)Nothing <and> [A](as: A*)Unit
scala> myFunc(3)
WrappedArray(3)
scala> myFunc(List(3): _*)
List(3)
scala> myFunc()
<console>:13: error: could not find implicit value for parameter ev: Nothing
myFunc()
^
scala>
Replacing Nothing with a class that has an appropriate implicitNotFound annotation should allow for a sensible error message.
Let's start out with what I think is your base requirement: the ability to define myFunc in some way such that the following occurs at the Scala console when a user provides literals. Then maybe if we can achieve that, we can try to go for varargs.
myFunc(List(1)) // no problem
myFunc(List[Int]()) // compile error!
Moreover, we don't want to have to force users either to split a list into a head and tail or have them convert to a ::.
Well when we're given literals, since we have access to the syntax used to construct the value, we can use macros to verify that a list is non-empty. Moreover, there's already a library that'll do it for us, namely refined!
scala> refineMV[NonEmpty]("Hello")
res2: String Refined NonEmpty = Hello
scala> refineMV[NonEmpty]("")
<console>:39: error: Predicate isEmpty() did not fail.
refineMV[NonEmpty]("")
^
Unfortunately this is still problematic in your case, because you'll need to put refineMV into the body of your function at which point the literal syntactically disappears and macro magic fails.
Okay what about the general case that doesn't rely on syntax?
// Can we do this?
val xs = getListOfIntsFromStdin() // Pretend this function exists
myFunc(xs) // compile error if xs is empty
Well now we're up against a wall; there's no way a compile time error can happen here since the code has already been compiled and yet clearly xs could be empty. We'll have to deal with this case at runtime, either in a type-safe manner with Option and the like or with something like runtime exceptions. But maybe we can do a little better than just throw our hands up in the air. There's two possible paths of improvement.
Somehow provide implicit evidence that xs is nonempty. If the compiler can find that evidence, then great! If not, it's on the user to provide it somehow at runtime.
Track the provenance of xs through your program and statically prove that it must be non-empty. If this cannot be proved, either error out at compile time or somehow force the user to handle the empty case.
Once again, unfortunately this is problematic.
I strongly suspect this is not possible (but this is still only a suspicion and I would be happy to be proved wrong). The reason is that ultimately implicit resolution is type-directed which means that Scala gets the ability to do type-level computation on types, but Scala has no mechanism that I know of to do type-level computation on values (i.e. dependent typing). We require the latter here because List(1, 2, 3) and List[Int]() are indistinguishable at the type level.
Now you're in SMT solver land, which does have some efforts in other languages (hello Liquid Haskell!). Sadly I don't know of any such efforts in Scala (and I imagine it would be a harder task to do in Scala).
The bottom line is that when it comes to error checking there is no free lunch. A compiler can't magically make error handling go away (although it can tell you when you don't strictly need it), the best it can do is yell at you when you forget to handle certain classes of errors, which is itself very valuable. To underscore the no free lunch point, let's return to a language that does have dependent types (Idris) and see how it handles non-empty values of List and the prototypical function that breaks on empty lists, List.head.
First we get a compile error on empty lists
Idris> List.head []
(input):1:11:When checking argument ok to function Prelude.List.head:
Can't find a value of type
NonEmpty []
Good, what about non-empty lists, even if they're obfuscated by a couple of leaps?
Idris> :let x = 5
-- Below is equivalent to
-- val y = identity(Some(x).getOrElse(3))
Idris> :let y = maybe 3 id (Just x)
-- Idris makes a distinction between Natural numbers and Integers
-- Disregarding the Integer to Nat conversion, this is
-- val z = Stream.continually(2).take(y)
Idris> :let z = Stream.take (fromIntegerNat y) (Stream.repeat 2)
Idris> List.head z
2 : Integer
It somehow works! What if we really don't let the Idris compiler know anything about the number we pass along and instead get one at runtime from the user? We blow up with a truly gargantuan error message that starts with When checking argument ok to function Prelude.List.head: Can't find a value of type NonEmpty...
import Data.String
generateN1s : Nat -> List Int
generateN1s x = Stream.take x (Stream.repeat 1)
parseOr0 : String -> Nat
parseOr0 str = case parseInteger str of
Nothing => 0
Just x => fromIntegerNat x
z : IO Int
z = do
x <- getLine
let someNum = parseOr0 x
let firstElem = List.head $ generateN1s someNum -- Compile error here
pure firstElem
Hmmm... well what's the type signature of List.head?
Idris> :t List.head
-- {auto ...} is roughly the same as Scala's implicit
head : (l : List a) -> {auto ok : NonEmpty l} -> a
Ah so we just need to provide a NonEmpty.
data NonEmpty : (xs : List a) -> Type where
IsNonEmpty : NonEmpty (x :: xs)
Oh a ::. And we're back at square one.
Use scala.collection.immutable.::
:: is the cons of the list
defined in std lib
::[A](head: A, tail: List[A])
use :: to define myFunc
def myFunc[A](list: ::[A]): Int = 1
def myFunc[A](head: A, tail: A*): Int = myFunc(::(head, tail.toList))
Scala REPL
scala> def myFunc[A](list: ::[A]): Int = 1
myFunc: [A](list: scala.collection.immutable.::[A])Int
scala> def myFunc[A](head: A, tail: A*): Int = myFunc(::(head, tail.toList))
myFunc: [A](head: A, tail: A*)Int

When should I use Option.empty[A] and when should I use None in Scala?

In Scala, when I want to set something to None, I have a couple of choices: using None or Option.empty[A].
Should I just pick one and use it consistently, or are there times when I should be using one over the other?
Example:
scala> def f(str: Option[String]) = str
f: (str: Option[String])Option[String]
scala> f(None)
res0: Option[String] = None
scala> f(Option.empty)
res1: Option[String] = None
I would stick to None whenever possible, which is almost always. It is shorter and widely used. Option.empty allows you to specify the type of underlying value, so use it when you need to help type inference. If the type is already known for the compiler None would work as expected, however while defining new variable
var a = None
would cause infering a as None.type which is unlikely what you wanted.
You can then use one of the couple ways to help infer what you need
# var a = Option.empty[String]
a: Option[String] = None
# var a: Option[String] = None
a: Option[String] = None
# var a = None: Option[String] // this one is rather uncommon
a: Option[String] = None
Another place when compiler would need help:
List(1, 2, 3).foldLeft(Option.empty[String])((a, e) => a.map(s => s + e.toString))
(Code makes no sense but just as an example) If you were to omit the type, or replace it with None the type of accumulator would be infered to Option[Nothing] and None.type respectively.
And for me personally this is the place I would go with Option.empty, for other cases I stick with None whenever possible.
Short answer use None if talking about a value for example when passing parameter to any function, use Option.empty[T] when defining something.
var something = Option.empty[String] means something is None for now but can become Some("hede") in the future. On the other hand var something = None means nothing. you can't reassign it with Some("hede") compiler will be angry:
found : Some[String]
required: None.type
So, this means None and Option.empty[T] are not alternatives. You can pass None to any Option[T] but you can't pass Some[T] to None.type
Given that Option[A].empty simply returns None:
/** An Option factory which returns `None` in a manner consistent with
* the collections hierarchy.
*/
def empty[A] : Option[A] = None
I'd say:
As you said, be consistent throughout the codebase. Making it consistent would mean that programmers entrying your codebase have one less thing to worry about. "Should I use None or Option.empty? Well, I see #cdmckay is using X throughout the call base, I'll use that as well"
Readability - think what conveys the point you want the most. If you were to read a particular method, would it make more sense to you if it returned an empty Option (let's disregard for a moment the fact that the underlying implementation is simply returning None) or an explicit None? IMO, I think of None as a non-existent value, as the documentation specifies:
/** This case object represents non-existent values.
*
* #author Martin Odersky
* #version 1.0, 16/07/2003
*/
Following are worksheet exports using Scala and Scalaz .
def f(str: Option[String]) = str //> f: (str: Option[String])Option[String]
f(None) //> res1: Option[String] = None
var x:Option[String]=None //> x : Option[String] = None
x=Some("test")
x //> res2: Option[String] = Some(test)
x=None
x
Now using Scalaz ,
def fz(str: Option[String]) = str //> fz: (str: Option[String])Option[String]
fz(none) //> res4: Option[String] = None
var xz:Option[String]=none //> xz : Option[String] = None
xz=some("test")
xz //> res5: Option[String] = Some(test)
xz=none
xz
Note that all the statements evaluate in the same way irrespective of you use None or Option.Empty. How ?
As you can see it is important to let Scala know of your intentions via the return type in the var x:Option[String]=None statement. This allows a later assignment of a Some. However a simple var x=None will fail in later lines because this will make the variable x resolve to None.type and not Option[T].
I would think that one should follow the convention. For assignments i would go for the var x:Option[String]=None option. Also whenever using None it is good to use a return type (in this case Option[String]) so that the assignment does not resolve to None.type.
Only in cases where i have no way to provide a type and i need some assignment done will i go for Option.empty
As everyone else pointed out, it's more a matter of personal taste, in which most of the people prefer None, or, in some cases, you explicitly need to put the type because the compiler can't infer.
This question can be extrapolated to other Scala classes, such as Sequences, Map, Set, List and so on. In all of them you have several ways to define empty state. Using sequence:
Seq()
Seq.empty
Seq.empty[Type]
From the 3, I prefer the second, because:
The first (Seq()) is error prone. It looks like if someone wanted to create a sequence and forgot to add the elements
The second (Seq.empty) is explicit about the desire of having an empty sequence
While the third (Seq.empty[Type]) is as explicit as the second, it is more verbose, so I don't use typically

Best practice: "If not immutable create copy"-pattern

I have function which gets a Seq[_] as an argument and returns an immutable class instance with this Seq as a val member. If the Seq is mutable I obviously want to create a defensive copy to guarantee that my return class instance cannot be modified.
What are the best practice for this pattern? First I was surprised that it is not possible to overload the function
def fnc(arg: immutable.Seq[_]) = ...
def fnc(arg: mutable.Seq[_]) = ...
I could also pattern-match:
def fnc(arg: Seq[_]) = arg match {
case s: immutable.Seq[_] => { println("immutable"); s}
case s: mutable.Seq[_] => {println("mutable"); List()++s }
case _: ?
}
But I am not sure about the _ case. Is it guaranteed that arg is immutable.Seq or mutable.Seq? I also don't know if List()++s is the correct way to convert it. I saw many posts on SO, but most of them where for 2.8 or earlier.
Are the Scala-Collections "intelligent" enough that I can just always (without pattern matching) write List()++s and I get the same instance if immutable and a deep copy if mutable?
What is the recommend way to do this?
You will need to pattern match if you want to support both,. The code for Seq() ++ does not guarantee (as part of its API) that it won't copy the rest if it's immutable:
scala> val v = Vector(1,2,3)
v: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3)
scala> Seq() ++ v
res1: Seq[Int] = List(1, 2, 3)
It may pattern-match itself for some special cases, but you know the cases you want. So:
def fnc[A](arg: Seq[A]): Seq[A] = arg match {
case s: collection.immutable.Seq[_] => arg
case _ => Seq[A]() ++ arg
}
You needn't worry about the _; this just says you don't care exactly what the type argument is (not that you could check anyway), and if you write it this way, you don't: pass through if immutable, otherwise copy.
What are the best practice for this pattern?
If you want to guarantee immutability, the best practice is to make a defensive copy, or require immutable.Seq.
But I am not sure about the _ case. Is it guaranteed that arg is immutable.Seq or mutable.Seq?
Not necessarily, but I believe every standard library collection that inherits from collection.Seq also inherits from one of those two. A custom collection, however, could theoretically inherit from just collection.Seq. See Rex's answer for an improvement on your pattern-matching solution.
Are the Scala-Collections "intelligent" enough that I can just always (without pattern matching) write List()++s and I get the same instance if immutable and a deep copy if mutable?
It appears they are in certain cases but not others, for example:
val immutableSeq = Seq[Int](0, 1, 2)
println((Seq() ++ immutableSeq) eq immutableSeq) // prints true
val mutableSeq = mutable.Seq[Int](0, 1, 2)
println((Seq() ++ mutableSeq) eq mutableSeq) // prints false
Where eq is reference equality. Note that the above also works with List() ++ s, however as Rex pointed out, it does not work for all collections, like Vector.
You certainly can overload in that way! E.g., this compiles fine:
object MIO
{
import collection.mutable
def f1[A](s: Seq[A]) = 23
def f1[A](s: mutable.Seq[A]) = 42
def f2(s: Seq[_]) = 19
def f2(s: mutable.Seq[_]) = 37
}
In the REPL:
Welcome to Scala version 2.10.0 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_37).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import rrs.scribble.MIO._; import collection.mutable.Buffer
import rrs.scribble.MIO._
import collection.mutable.Buffer
scala> f1(List(1, 2, 3))
res0: Int = 23
scala> f1(Buffer(1, 2, 3))
res1: Int = 42
scala> f2(List(1, 2, 3))
res2: Int = 19
scala> f2(Buffer(1, 2, 3))
res3: Int = 37