Why Some(null) isn't considered None? - scala

I am curious:
scala> Some(null) == None
res10: Boolean = false
Why isn't Some(null) transformed to None?

You should use Option(null) to reach the desired effect and return None.
Some(null) just creates a new Option with a defined value (hence Some) which is actually null, and there are few valid reasons to ever create one like this in real code.
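To make the difference concrete, here is a minimal sketch contrasting the two constructors. The object and value names are only illustrative and do not come from the question:

```scala
object NullInOption {
  // Option.apply checks for null and collapses it to None.
  val viaApply: Option[String] = Option(null: String)

  // Some.apply performs no null check, so the null is wrapped as-is.
  val viaSome: Option[String] = Some(null: String)
}
```

Here viaApply is empty, while viaSome is a defined Option whose content happens to be null.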

Unfortunately, null is a valid value for any AnyRef type -- a consequence of Scala's interoperability with Java. So a method that takes an object of type A and, internally, stores it inside an Option, might well need to store a null inside that option.
For example, let's say you have a method that takes the head of a list, checks if that head corresponds to a key in a store, and then returns true if it does. One might implement it like this:
def isFirstAcceptable(list: List[String], keys: Set[String]): Boolean =
list.headOption map keys getOrElse false
So, here's the thing... if the data inside list and keys comes from some Java API, they both may well contain null! If Some(null) wasn't possible, then isFirstAcceptable(List[String](null), Set[String](null)) would return false instead of true.
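As a rough sketch of that argument, the snippet below puts the method next to a hypothetical variant that collapses a null head to None, which is roughly what would happen if Some(null) could not exist. The name isFirstAcceptableNoNull and the wrapper object are assumptions made for illustration:

```scala
object HeadLookup {
  // The method from the answer, written with explicit dots and parentheses.
  def isFirstAcceptable(list: List[String], keys: Set[String]): Boolean =
    list.headOption.map(keys).getOrElse(false)

  // Hypothetical variant: a null head is turned into None before the lookup.
  def isFirstAcceptableNoNull(list: List[String], keys: Set[String]): Boolean =
    list.headOption.flatMap(Option(_)).map(keys).getOrElse(false)

  val withNull: Boolean = isFirstAcceptable(List[String](null), Set[String](null))
  val collapsed: Boolean = isFirstAcceptableNoNull(List[String](null), Set[String](null))
}
```

The first call finds the null key, the second one never gets to look it up.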

I think the others in the thread do a good job explaining why Some(null) "should" exist, but if you happen to be getting Some(null) somewhere and want a quick way to turn it into None, I've done this before:
scala> val x: Option[String] = Some(null)
x: Option[String] = Some(null)
scala> x.flatMap(Option(_))
res8: Option[String] = None
And when the starting Option is a legit non-null value things work as you probably want:
scala> val y: Option[String] = Some("asdf")
y: Option[String] = Some(asdf)
scala> y.flatMap(Option(_))
res9: Option[String] = Some(asdf)

Many of Scala's WTFs can be attributed to its need for compatibility with Java. null is often used in Java as a value, perhaps indicating the absence of a value. For example, hashMap.get(key) will return null if the key is not matched.
With this in mind, consider the following possible values from wrapping a null returning method in an Option:
if (b) Some(hashMap.get(key)) else None
// becomes -->
None // the method was not invoked;
Some(value) // the method was invoked and a value returned; or
Some(null) // the method was invoked and null was returned.
Some(null) seems sufficiently distinct from None in this case to warrant allowing it in the language.
Of course if this is not desirable in your case then simply use:
if (b) Option(hashMap.get(key)) else None
// becomes -->
None // the method was not invoked or the mapped value was null; or
Some(value) // the method was invoked and a value returned
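The two variants can be tried side by side. This is only a sketch: the map contents and the helper names keepNull and dropNull are made up for the example, with java.util.HashMap standing in for the hashMap above:

```scala
import java.util.{HashMap => JHashMap}

object NullableLookup {
  val hashMap: JHashMap[String, String] = {
    val h = new JHashMap[String, String]()
    h.put("present", "value")
    h
  }

  // Some(...) keeps "invoked but got null" distinct from "not invoked".
  def keepNull(b: Boolean, key: String): Option[String] =
    if (b) Some(hashMap.get(key)) else None

  // Option(...) merges "invoked but got null" into None.
  def dropNull(b: Boolean, key: String): Option[String] =
    if (b) Option(hashMap.get(key)) else None
}
```

With a missing key, keepNull(true, "missing") is Some(null) while dropNull(true, "missing") is None.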

As a simple thought experiment, consider two lists of Strings, one of length 5 and one of length 20.
Because we're running on the JVM, it's possible to insert null as a valid element into one of these lists - so put that in the long list as element #10
What, then, should the difference be in the values returned from the two following expressions?
EDIT: Exchanged get for lift, I was thinking of maps...
shortList.lift(10) //this element doesn't exist
longList.lift(10) //this element exists, and contains null
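For anyone who wants to run the thought experiment, here is a small sketch. The list contents are placeholders; only the lengths and the null at index 10 matter:

```scala
object LiftDemo {
  val shortList: List[String] = List.fill(5)("x")

  // Element #10 of the long list exists, but its value is null.
  val longList: List[String] = List.fill(20)("x").updated(10, null)

  val fromShort: Option[String] = shortList.lift(10) // index out of range
  val fromLong: Option[String] = longList.lift(10)   // index in range, value is null
}
```

The first lookup gives None and the second gives Some(null), so the two situations stay distinguishable.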

Because Option is considered to be a Functor, and being a Functor means:
1. Has a unit function (apply or just Option("blah") in Scala)
2. Has a map function which transforms the value from T => B but not the context
3. Obeys the 2 Functor laws - the identity law and the associative (composition) law
In this topic the main part is #2 - Option(1).map(t => null) cannot transform the context. Some should remain. Otherwise it breaks the associative law!
Just consider the following laws example:
def identity[T](v: T) = v
def f1(v: String) = v.toUpperCase
def f2(v: String) = v + v
def fNull(v: String): String = null
val opt = Option("hello")
//identity law
opt.map(identity) == opt //Some(hello) == Some(hello)
//associative law
opt.map(f1 _ andThen f2) == opt.map(f1).map(f2) //Some(HELLOHELLO) == Some(HELLOHELLO)
opt.map(fNull _ andThen f2) == opt.map(fNull).map(f2) //Some(nullnull) == Some(nullnull)
But what if Option("hello").map(t=>null) produced None? Associative law would be broken:
opt.map(fNull _ andThen f2) == opt.map(fNull).map(f2) //Some(nullnull) != None
That is my thought, might be wrong

Related

scala mutable.Map put/get handles null's in an unexpected way?

I know one is not supposed to use nulls in scala but sometimes when interoperating with Java it happens. The way a scala mutable map handles this seems off though:
scala> import scala.collection.mutable
import scala.collection.mutable
scala> val m: mutable.Map[String, String] = mutable.Map.empty
m: scala.collection.mutable.Map[String,String] = Map()
scala> m.put("Bogus", null)
res0: Option[String] = None
scala> m.get("Bogus")
res1: Option[String] = Some(null)
scala> m.getOrElse("Bogus", "default")
res2: String = null
I would have expected m.get to return None in this case. It almost seems like a bug, as if somewhere in the code there was a Some(v) instead of Option(v).
Is there any discussion about changing this behavior?
I would have expected m.get to return None in this case.
Why? None would mean the key "Bogus" is not in the map, but you just put it in (with value null).
Java's Map API has problems distinguishing "the value for this key is null" from "this key is not in the map", but Scala's doesn't.
Null is a subtype of String:
scala> implicitly[Null <:< String]
res3: Null <:< String = <function1>
Therefore null is a valid String value:
scala> val s: String = null
s: String = null
If you want to store a null as a String in a map, that is your right.
Compared to Java's Map#get (let's call it javaLikeGet), Scala's get behaves roughly as follows:
def get(k: K) = if (containsKey(k)) {
Some(this.javaLikeGet(k))
} else {
None
}
and not like what you have assumed:
def get(k: K) = Option(this.javaLikeGet(k))
The latter version (presumably what you thought) would get a null for an existing key, pass it to Option(...), and return None. But the former version (which imitates how the real implementation works) would notice that the key exists, and wrap the null returned by javaLikeGet into a Some.
The simple and consistent rule is:
If the key k exists, then get(k) returns Some[V], otherwise it returns None.
This is much less surprising than the strange behavior of Java's get that returns null in two entirely different situations.
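A short sketch of that rule in action, reusing the map from the question. The flatMap(Option(_)) line is just one way to get the Java-like result if that is really what you want, and the value names are illustrative:

```scala
import scala.collection.mutable

object MapGetDemo {
  val m: mutable.Map[String, String] = {
    val fresh = mutable.Map.empty[String, String]
    fresh.put("Bogus", null)
    fresh
  }

  val present: Option[String] = m.get("Bogus")   // key exists, value is null
  val absent: Option[String] = m.get("Missing")  // key does not exist
  val hasKey: Boolean = m.contains("Bogus")

  // Opt-in collapse of a null value to None.
  val collapsed: Option[String] = m.get("Bogus").flatMap(Option(_))
}
```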
This is the Billion-Dollar Mistake, but Scala is not the language that is likely to fix it, because it has to interop with Java. My guess is that there is and will be no discussion about changing this behavior, at least not until something fundamentally changes in the entire programming landscape.

Why does Some(x).map(_ => null) not evaluate to None?

I have recently faced a confusing issue in Scala. I expect the following code to result in None, but it results in Some(null):
Option("a").map(_ => null)
What is the reasoning behind this? Why does it not result in None?
Note: This question is not a duplicate of Why Some(null) isn't considered None?, as that question asks about explicitly using Some(null). My question is about using Option.map.
Every time we add an exception to a rule, we deprive ourselves of a tool for reasoning about code.
Mapping over a Some always evaluates to a Some. That's a simple and useful law. If we were to make the change you propose, we would no longer have that law. For example, here's a thing we can say with certainty. For all f, x, and y:
Some(x).map(f).map(_ => y) == Some(y)
If we were to make the change you propose, that statement would no longer be true; specifically, it would not hold for cases where f(x) == null.
Moreover, Option is a functor. Functor is a useful generalization of things that have map functions, and it has laws that correspond well to intuition about how mapping should work. If we were to make the change you propose, Option would no longer be a functor.
null is an aberration in Scala that exists solely for interoperability with Java libraries. It is not a good reason to discard Option's validity as functor.
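The law about mapping over a Some can be checked with a quick sketch. collapsingMap below is a hypothetical stand-in for the proposed behaviour (it is not how the library works), and the other names are illustrative too:

```scala
object MapLawDemo {
  val f: String => String = _ => null
  val x = "a"
  val y = 42

  // With the real Option.map the law holds even though f(x) == null.
  val lawful: Option[Int] = Some(x).map(f).map(_ => y)

  // Hypothetical map that turns a null result into None.
  def collapsingMap[A, B](opt: Option[A])(g: A => B): Option[B] =
    opt.flatMap(a => Option(g(a)))

  // The same chain with the hypothetical map loses the final value.
  val unlawful: Option[Int] = collapsingMap(collapsingMap(Option(x))(f))(_ => y)
}
```

lawful is Some(42), while unlawful ends up as None because the intermediate null wiped out the Some.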
Here is the code for Option map method:
/** Returns a $some containing the result of applying $f to this $option's
 * value if this $option is nonempty.
 * Otherwise return $none.
 *
 * @note This is similar to `flatMap` except here,
 * $f does not need to wrap its result in an $option.
 *
 * @param f the function to apply
 * @see flatMap
 * @see foreach
 */
@inline final def map[B](f: A => B): Option[B] =
  if (isEmpty) None else Some(f(this.get))
So, as you can see, if the option is not empty, it will map to Some with the value returned by the function. And here is the code for Some class:
/** Class `Some[A]` represents existing values of type
 * `A`.
 *
 * @author Martin Odersky
 * @version 1.0, 16/07/2003
 */
@SerialVersionUID(1234815782226070388L) // value computed by serialver for 2.11.2, annotation added in 2.11.4
final case class Some[+A](x: A) extends Option[A] {
  def isEmpty = false
  def get = x
}
So, as you can see, Some(null) will actually create a Some object containing null. What you probably want to do is use Option.apply, which returns None if the value is null. Here is the code for the Option.apply method:
/** An Option factory which creates Some(x) if the argument is not null,
 * and None if it is null.
 *
 * @param x the value
 * @return Some(value) if value != null, None if value == null
 */
def apply[A](x: A): Option[A] = if (x == null) None else Some(x)
So, you need to write your code like this:
Option("a").flatMap(s => Option.apply(null))
Of course, this code makes no sense, but I will consider that you are just doing some kind of experiment.
Option is a kind of replacement for null, but in general you see null in Scala when you are talking to some Java code. Option is not supposed to handle nulls whenever possible; it is designed to be used instead of nulls, not with them. There is, however, a convenience method, Option.apply, similar to Java's Optional.ofNullable, that handles the null case, and that's mostly all there is to nulls and Options in Scala. In all other cases it works on Some and None without caring whether null is inside or not.
If you have some nasty method returning null that comes from Java and you want to use it directly, use the following approach:
def nastyMethod(s: String): String = null
Some("a").flatMap(s => Option(nastyMethod(s)))
// or
Some("a").map(nastyMethod).flatMap(Option(_))
Both output Option[String] = None
So, a nastyMethod that can return a String or null is conceptually an Option, so wrap its result in an Option and use it as an Option. Don't expect null magic to happen whenever you need it.
To understand what's going on, we can use the functional substitution principle to explore the given expression step by step:
Option("a").map(s => null) // through Option.apply
Some("a").map(s => null) // let's name the anonymous function as: f(x) = null
Some("a").map(x => f(x)) // following Option[A].map(f:A=>B) => Option[B]
Some(f("a")) // apply f(x)
Some(null)
The confusion expressed in the question comes from the assumption that the map would apply to the argument of the Option before the Option.apply is evaluated: Let's see how that couldn't possibly work:
Option("a").map(x=> f(x)) // !!! can't evaluate map before Option.apply. This is the key to understand !
Option(f(a)) // !!! we can't get here
Option(null) // !!! we can't get here
None // !!! we can't get here
Why would it be None? The signature of map takes a function from a value A to B and yields an Option[B]. Nowhere in that signature does it indicate that B may be null, as it would by saying B is an Option[B]. flatMap, however, does indicate that the value returned is also optional. Its signature is Option[A] => (A => Option[B]) => Option[B].

When should I use Option.empty[A] and when should I use None in Scala?

In Scala, when I want to set something to None, I have a couple of choices: using None or Option.empty[A].
Should I just pick one and use it consistently, or are there times when I should be using one over the other?
Example:
scala> def f(str: Option[String]) = str
f: (str: Option[String])Option[String]
scala> f(None)
res0: Option[String] = None
scala> f(Option.empty)
res1: Option[String] = None
I would stick to None whenever possible, which is almost always. It is shorter and widely used. Option.empty allows you to specify the type of the underlying value, so use it when you need to help type inference. If the type is already known to the compiler, None works as expected. However, when defining a new variable,
var a = None
would cause a to be inferred as None.type, which is unlikely to be what you wanted.
You can then use one of a couple of ways to help infer what you need:
@ var a = Option.empty[String]
a: Option[String] = None
@ var a: Option[String] = None
a: Option[String] = None
@ var a = None: Option[String] // this one is rather uncommon
a: Option[String] = None
Another place where the compiler would need help:
List(1, 2, 3).foldLeft(Option.empty[String])((a, e) => a.map(s => s + e.toString))
(Code makes no sense but just as an example) If you were to omit the type, or replace it with None, the type of the accumulator would be inferred as Option[Nothing] and None.type respectively.
And for me personally this is the place I would go with Option.empty; for other cases I stick with None whenever possible.
Short answer: use None when talking about a value, for example when passing a parameter to a function; use Option.empty[T] when defining something.
var something = Option.empty[String] means something is None for now but can become Some("hede") in the future. On the other hand, var something = None means nothing else can ever be assigned to it: you can't reassign it with Some("hede"), and the compiler will be angry:
found : Some[String]
required: None.type
So, this means None and Option.empty[T] are not alternatives. You can pass None to any Option[T] but you can't pass Some[T] to None.type
Given that Option.empty[A] simply returns None:
/** An Option factory which returns `None` in a manner consistent with
* the collections hierarchy.
*/
def empty[A] : Option[A] = None
I'd say:
As you said, be consistent throughout the codebase. Being consistent means that programmers entering your codebase have one less thing to worry about. "Should I use None or Option.empty? Well, I see @cdmckay is using X throughout the codebase, I'll use that as well"
Readability - think what conveys the point you want the most. If you were to read a particular method, would it make more sense to you if it returned an empty Option (let's disregard for a moment the fact that the underlying implementation is simply returning None) or an explicit None? IMO, I think of None as a non-existent value, as the documentation specifies:
/** This case object represents non-existent values.
 *
 * @author Martin Odersky
 * @version 1.0, 16/07/2003
 */
Following are worksheet exports using Scala and Scalaz.
def f(str: Option[String]) = str //> f: (str: Option[String])Option[String]
f(None) //> res1: Option[String] = None
var x:Option[String]=None //> x : Option[String] = None
x=Some("test")
x //> res2: Option[String] = Some(test)
x=None
x
Now using Scalaz,
def fz(str: Option[String]) = str //> fz: (str: Option[String])Option[String]
fz(none) //> res4: Option[String] = None
var xz:Option[String]=none //> xz : Option[String] = None
xz=some("test")
xz //> res5: Option[String] = Some(test)
xz=none
xz
Note that all the statements evaluate in the same way irrespective of whether you use None or Option.empty. How?
As you can see, it is important to let Scala know of your intentions via the declared type in the var x:Option[String]=None statement. This allows a later assignment of a Some. However, a simple var x=None will fail in later lines because this will make the variable x resolve to None.type and not Option[T].
I would think that one should follow the convention. For assignments I would go for the var x:Option[String]=None option. Also, whenever using None it is good to declare a type (in this case Option[String]) so that the assignment does not resolve to None.type.
Only in cases where I have no way to provide a type and I need some assignment done will I go for Option.empty.
As everyone else pointed out, it's more a matter of personal taste, in which most people prefer None, or, in some cases, you explicitly need to put the type because the compiler can't infer it.
This question can be extrapolated to other Scala classes, such as Sequences, Map, Set, List and so on. In all of them you have several ways to define empty state. Using sequence:
Seq()
Seq.empty
Seq.empty[Type]
From the 3, I prefer the second, because:
The first (Seq()) is error prone. It looks as if someone wanted to create a sequence and forgot to add the elements
The second (Seq.empty) is explicit about the desire of having an empty sequence
While the third (Seq.empty[Type]) is as explicit as the second, it is more verbose, so I don't typically use it

Scala Syntactic Sugar for converting to `Option`

When working in Scala, I often want to parse a field of type A and convert it to an Option[A], with a single case (for example, "NA" or "") being converted to None, and the other cases being wrapped in Some.
Right now, I'm using the following matching syntax.
match {
case "" => None
case s: String => Some(s)
}
// converts an empty String to None, and otherwise wraps it in a Some.
Is there any more concise / idiomatic way to write this?
There are more concise ways. One of:
Option(x).filter(_ != "")
Option(x).filterNot(_ == "")
will do the trick, though it's a bit less efficient since it creates an Option and then may throw it away.
If you do this a lot, you probably want to create an extension method (or just a method, if you don't mind having the method name first):
implicit class ToOptionWithDefault[A](private val underlying: A) extends AnyVal {
def optNot(not: A) = if (underlying == not) None else Some(underlying)
}
Now you can
scala> 47.toString optNot ""
res1: Option[String] = Some(47)
(And, of course, you can always create a method whose body is your match solution, or an equivalent one with if, so you can reuse it for that particular case.)
I'd probably use filterNot here:
scala> Option("hey").filterNot(_ == "NA")
res0: Option[String] = Some(hey)
scala> Option("NA").filterNot(_ == "NA")
res1: Option[String] = None
It requires you to think of Option as a collection with one or zero elements, but if you get into that habit it's reasonably clear.
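If several sentinel strings should all map to None (the question mentions both "NA" and ""), the same filterNot idea extends naturally. The helper below is a hypothetical sketch; its name and default sentinel set are assumptions, not part of any library:

```scala
object SentinelToOption {
  // A Set[String] is also a String => Boolean, so it can be passed to filterNot directly.
  def nonSentinel(value: String, sentinels: Set[String] = Set("", "NA")): Option[String] =
    Option(value).filterNot(sentinels)
}
```

For example, SentinelToOption.nonSentinel("hey") gives Some(hey), while "NA", "" and null all give None.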
A simple and intuitive approach includes this expression,
if (s.isEmpty) None else Some(s)
This assumes s labels the value to be otherwise matched (thanks to @RexKerr for the note).

Ending a for-comprehension loop when a check on one of the items returns false

I am a bit new to Scala, so apologies if this is something a bit trivial.
I have a list of items which I want to iterate through. I want to execute a check on each of the items, and if just one of them fails I want the whole function to return false. So you can see this as an AND condition. I want it to be evaluated lazily, i.e. the moment I encounter the first false, return false.
I am used to the for - yield syntax which filters items generated through some generator (list of items, sequence etc.). In my case however I just want to break out and return false without executing the rest of the loop. In normal Java one would just do a return false; within the loop.
In an inefficient way (i.e. not stopping when I encounter the first false item), I could do it like this:
(for {
item <- items
if !satisfiesCondition(item)
} yield item).isEmpty
Which is essentially saying that if no items make it through the filter all of them satisfy the condition. But this seems a bit convoluted and inefficient (consider you have 1 million items and the first one already did not satisfy the condition).
What is the best and most elegant way to do this in Scala?
Stopping early at the first false for a condition is done using forall in Scala. (A related question)
Your solution rewritten:
items.forall(satisfiesCondition)
To demonstrate short-circuiting:
List(1,2,3,4,5,6).forall { x => println(x); x < 3 }
1
2
3
res1: Boolean = false
The opposite of forall is exists which stops as soon as a condition is met:
List(1,2,3,4,5,6).exists{ x => println(x); x > 3 }
1
2
3
4
res2: Boolean = true
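To put numbers on the difference, here is a small sketch that counts how often the predicate runs. The filter-based check does the same work as the for/yield version in the question; countCalls and the other names are made up for this illustration:

```scala
object ShortCircuitDemo {
  val items: List[Int] = (1 to 1000).toList

  // Runs a strategy with a counting predicate and reports (result, number of predicate calls).
  def countCalls(strategy: (Int => Boolean) => Boolean): (Boolean, Int) = {
    var calls = 0
    val result = strategy { x => calls += 1; x < 3 }
    (result, calls)
  }

  // Collect every failing item, then check that there are none: visits all items.
  val viaFilter: (Boolean, Int) = countCalls(p => items.filterNot(p).isEmpty)

  // forall stops at the first item that fails the predicate.
  val viaForall: (Boolean, Int) = countCalls(p => items.forall(p))
}
```

Both strategies answer false, but the filter-based one calls the predicate 1000 times and forall only 3 times.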
Scala's for comprehensions are not general iterations. That means they cannot produce every possible result that one can produce out of an iteration, as, for example, the very thing you want to do.
There are three things that a Scala for comprehension can do, when you are returning a value (that is, using yield). In the most basic case, it can do this:
Given an object of type M[A], and a function A => B (that is, which returns an object of type B when given an object of type A), return an object of type M[B];
For example, given a sequence of characters, Seq[Char], get the UTF-16 integer for each character:
val codes = for (char <- "A String") yield char.toInt
The expression char.toInt converts a Char into an Int, so the String (which is implicitly converted into a Seq[Char] in Scala) becomes a Seq[Int] (actually, an IndexedSeq[Int], through some Scala collection magic).
The second thing it can do is this:
Given objects of type M[A], M[B], M[C], etc, and a function of A, B, C, etc into D, return an object of type M[D];
You can think of this as a generalization of the previous transformation, though not everything that could support the previous transformation can necessarily support this transformation. For example, we could produce all the coordinates of a battleship game like this:
val coords = for {
column <- 'A' to 'L'
row <- 1 to 10
} yield s"$column$row"
In this case, we have objects of the types Seq[Char] and Seq[Int], and a function (Char, Int) => String, so we get back a Seq[String].
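It may help to see what the compiler roughly turns this into. The sketch below places the comprehension next to a hand-written flatMap/map chain; the wrapper object and the name desugared are illustrative:

```scala
object CoordsDemo {
  val coords = for {
    column <- 'A' to 'L'
    row <- 1 to 10
  } yield s"$column$row"

  // Every generator except the last becomes a flatMap, the last one becomes a map.
  val desugared = ('A' to 'L').flatMap(column => (1 to 10).map(row => s"$column$row"))
}
```

Both values hold the same 120 strings, from A1 to L10.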
The third, and final, thing a for comprehension can do is this:
Given an object of type M[A], such that the type M[T] has a zero value for any type T, a function A => B, and a condition A => Boolean, return either the zero or an object of type M[B], depending on the condition;
This one is harder to understand, though it may look simple at first. Let's look at something that looks simple first, say, finding all vowels in a sequence of characters:
def vowels(s: String) = for {
letter <- s
if Set('a', 'e', 'i', 'o', 'u') contains letter.toLower
} yield letter.toLower
val aStringVowels = vowels("A String")
It looks simple: we have a condition, we have a function Char => Char, and we get a result, and there doesn't seem to be any need for a "zero" of any kind. In this case, the zero would be the empty sequence, but it hardly seems worth mentioning it.
To explain it better, I'll switch from Seq to Option. An Option[A] has two sub-types: Some[A] and None. The zero, evidently, is the None. It is used when you need to represent the possible absence of a value, or the value itself.
Now, let's say we have a web server where users who are logged in and are administrators get extra JavaScript on their web pages for administration tasks (like WordPress does). First, we need to get the user, if there's a user logged in. Let's say this is done by this method:
def getUser(req: HttpRequest): Option[User]
If the user is not logged in, we get None, otherwise we get Some(user), where user is the data structure with information about the user that made the request. We can then model that operation like this:
def adminJs(req: HttpRequest): Option[String] = for {
user <- getUser(req)
if user.isAdmin
} yield adminScriptForUser(user)
Here it is easier to see the point of the zero. When the condition is false, adminScriptForUser(user) cannot be executed, so the for comprehension needs something to return instead, and that something is the "zero": None.
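Here is a runnable sketch of the same idea. User, HttpRequest, getUser and adminScriptForUser are stubbed out with made-up definitions so the snippet compiles on its own; a real web framework would look different:

```scala
object AdminJsDemo {
  // Hypothetical stand-ins for the types and helpers mentioned in the answer.
  final case class User(name: String, isAdmin: Boolean)
  final case class HttpRequest(user: Option[User])

  def getUser(req: HttpRequest): Option[User] = req.user
  def adminScriptForUser(user: User): String = s"admin.js?user=${user.name}"

  def adminJs(req: HttpRequest): Option[String] = for {
    user <- getUser(req)
    if user.isAdmin
  } yield adminScriptForUser(user)

  // Roughly what the comprehension expands to: the guard becomes withFilter,
  // and a failed guard yields the zero of Option, which is None.
  def adminJsDesugared(req: HttpRequest): Option[String] =
    getUser(req).withFilter(_.isAdmin).map(adminScriptForUser)
}
```

A request from an admin yields Some(script), while a request from a non-admin or from nobody yields None.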
In technical terms, Scala's for comprehensions provide syntactic sugar for operations on monads, with an extra operation for monads with zero (see list comprehensions in the same article).
What you actually want to accomplish is called a catamorphism, usually represented as a fold method, which can be thought of as a function of M[A] => B. You can write it with fold, foldLeft or foldRight in a sequence, but none of them would actually short-circuit the iteration.
Short-circuiting arises naturally out of non-strict evaluation, which is the default in Haskell, in which most of these papers are written. Scala, like most other languages, is strict by default.
There are three solutions to your problem:
Use the special methods forall or exists, which target your precise use case, though they don't solve the generic problem;
Use a non-strict collection; there's Scala's Stream, but it has problems that prevent its effective use. The Scalaz library can help you there;
Use an early return, which is how the Scala library solves this problem in the general case (in specific cases, it uses better optimizations).
As an example of the third option, you could write this:
def hasEven(xs: List[Int]): Boolean = {
for (x <- xs) if (x % 2 == 0) return true
false
}
Note as well that this is called a "for loop", not a "for comprehension", because it doesn't return a value (well, it returns Unit), since it doesn't have the yield keyword.
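For the second option, one lightweight way to get non-strict behaviour from a standard collection is a view. This swaps in view where the answer talks about Stream and Scalaz, so take it as a sketch of the idea only. The counter and names are illustrative:

```scala
object LazyCheckDemo {
  var evaluated = 0

  // Stand-in for an expensive per-item check; counts how often it runs.
  def expensiveCheck(x: Int): Boolean = { evaluated += 1; x < 3 }

  val items = List(1, 2, 3, 4, 5, 6)

  // view makes map non-strict, so forall pulls mapped elements one at a time
  // and stops at the first false.
  val allPass: Boolean = items.view.map(expensiveCheck).forall(identity)
}
```

Without the view, map would run expensiveCheck on all six items before forall even started. With it, only the first three are checked.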
You can read more about real generic iteration in the article The Essence of The Iterator Pattern, which is a Scala experiment with the concepts described in the paper by the same name.
forall is definitely the best choice for the specific scenario but for illustration here's good old recursion:
import scala.annotation.tailrec
@tailrec def hasEven(xs: List[Int]): Boolean = xs match {
  case head :: tail if head % 2 == 0 => true
  case Nil => false
  case _ => hasEven(xs.tail)
}
I tend to use recursion a lot for loops w/short circuit use cases that don't involve collections.
UPDATE:
DO NOT USE THE CODE IN MY ANSWER BELOW!
Shortly after I posted the answer below (after misinterpreting the original poster's question), I discovered a far superior generic answer (to the list of requirements below) here: https://stackoverflow.com/a/60177908/501113
It appears you have several requirements:
Iterate through a (possibly large) list of items doing some (possibly expensive) work
The work done to an item could return an error
At the first item that returns an error, short circuit the iteration, throw away the work already done, and return the item's error
A for comprehension isn't designed for this (as is detailed in the other answers).
And I was unable to find another Scala collections pre-built iterator that provided the requirements above.
While the code below is based on a contrived example (transforming a String of digits into a BigInt), it is the general pattern I prefer to use; i.e. process a collection and transform it into something else.
def getDigits(shouldOnlyBeDigits: String): Either[IllegalArgumentException, BigInt] = {
  @scala.annotation.tailrec
  def recursive(
    charactersRemaining: String = shouldOnlyBeDigits
    , accumulator: List[Int] = Nil
  ): Either[IllegalArgumentException, List[Int]] =
    if (charactersRemaining.isEmpty)
      Right(accumulator) //All work completed without error
    else {
      val item = charactersRemaining.head
      val isSuccess =
        item.isDigit //Work the item
      if (isSuccess)
        //This item's work completed without error, so keep iterating
        recursive(charactersRemaining.tail, (item - 48) :: accumulator)
      else {
        //This item hit an error, so short circuit
        Left(new IllegalArgumentException(s"item [$item] is not a digit"))
      }
    }
  recursive().map(digits => BigInt(digits.reverse.mkString))
}
When it is called as getDigits("1234") in a REPL (or Scala Worksheet), it returns:
val res0: Either[IllegalArgumentException,BigInt] = Right(1234)
And when called as getDigits("12A34") in a REPL (or Scala Worksheet), it returns:
val res1: Either[IllegalArgumentException,BigInt] = Left(java.lang.IllegalArgumentException: item [A] is not a digit)
You can play with this in Scastie here:
https://scastie.scala-lang.org/7ddVynRITIOqUflQybfXUA