Contains of Option[String] in Scala not working as expected?

I just discovered something weird. This statement:
Some("test this").contains("test")
Evaluates to false. While this evaluates to true:
Some("test this").contains("test this")
How does this make sense? I thought the Option would run the contains on the wrapped object if possible.
EDIT:
I'm also thinking about this from a code readability perspective. Imagine you are seeing this code:
person.name.contains("Roger")
Must name be equal to Roger? Or can it contain Roger? The behavior depends on whether it's a String or an Option[String].

There's a principle in typed functional programming called "parametric reasoning". Broadly stated, the principle is that it's desirable to be able to have intuitions about what a function does just from looking at its type signature.
If we "devirtualize" (effectively turning it into a static method... this is actually a fairly common optimization step in object-oriented runtimes) Option's contains method has the signature:
def contains[A, A1 >: A](opt: Option[A], elem: A1): Boolean
That is, it takes an Option[A] and an A1 (where A1 is a supertype of A, if it's not an A) and returns a Boolean. Implicitly in Scala's typesystem, of course, we know that A and A1 are both subtypes of Any.
Without knowing anything more about what the types A and A1 are (A might be String and A1 might be AnyRef, or A and A1 might both be Int: whatever our intuition, it has to apply as much in either situation), what could we possibly do? We're basically limited to combinations of operations involving an Option[Any] and an Any which eventually get us to a Boolean (and, ideally, won't throw an exception).
For instance, opt.nonEmpty && opt.get == elem works: we can always call nonEmpty on an Option[Any] and then compare the contents using equality. We could also do something like opt.isEmpty || (opt.get.## % 43) == (elem.## % 57), but knowing that the contents of the Option and some other object have equal remainders in two different bases doesn't strike one as useful.
Note that Option cannot simply delegate to the wrapped value's contains in your specific case, because there's no contains method on Any. What should the behavior be if we have an Option[Int]?
It might actually be useful, since we do have the ability to convert arbitrary objects into Strings via the toString method (thank you Java!), to implement a containsSubstring method on Option[A]:
def containsSubstring(substring: String): Boolean =
nonEmpty && get.toString.contains(substring)
You could implement an enrichment class along these lines:
object Enrichments {
implicit class OptionOps[A](opt: Option[A]) extends AnyVal {
def containsSubstring(substring: String): Boolean =
opt.nonEmpty && opt.get.toString.contains(substring)
}
}
then you only need:
import Enrichments.OptionOps
Some("test this").containsSubstring("test") // evaluates true
case class Person(name: Option[String], age: Int)
// Option(p).containsSubstring("Roger") would also work, assuming Person doesn't override toString...
def isRoger(p: Person): Boolean = p.name.containsSubstring("Roger")
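If an extra enrichment feels too heavy, the standard library's Option.exists may already be enough, since it applies an arbitrary predicate to the wrapped value. A minimal sketch (the object and value names are made up for illustration):

```scala
object OptionContainsDemo {
  val name: Option[String] = Some("test this")

  // contains compares the wrapped value with == (whole-value equality)
  val equalsPart: Boolean  = name.contains("test")       // false
  val equalsWhole: Boolean = name.contains("test this")  // true

  // exists runs a predicate on the wrapped value, here String.contains
  val hasSubstring: Boolean = name.exists(_.contains("test"))  // true

  // on None, exists is false and the predicate is never evaluated
  val emptyCase: Boolean = (None: Option[String]).exists(_.contains("test"))  // false
}
```

This also makes the intent explicit at the call site: person.name.exists(_.contains("Roger")) can only mean a substring test.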

I recommend checking the docs, and if necessary the code, of the API that you are using. The docs detail what Option's contains does and how it works:
/** Tests whether the option contains a given value as an element.
*
* This is equivalent to:
*
* option match {
* case Some(x) => x == elem
* case None => false
* }
*
* // Returns true because Some instance contains string "something" which equals "something".
* Some("something") contains "something"
*
* // Returns false because "something" != "anything".
* Some("something") contains "anything"
*
* // Returns false when method called on None.
* None contains "anything"
*
* @param elem the element to test.
* @return `true` if the option has an element that is equal (as
* determined by `==`) to `elem`, `false` otherwise.
*/
final def contains[A1 >: A](elem: A1): Boolean =
!isEmpty && this.get == elem

Related

Scala - Collection comparison - Why is Set(1) == ListSet(1)?

Why does this comparison output true?
import scala.collection.immutable.ListSet
Set(1) == ListSet(1) // Expect false
//Output
res0: Boolean = true
And in a more general sense, how is the comparison actually done?
Since the inheritance chain Set <: GenSet <: GenSetLike is a bit lengthy, it might be not immediately obvious where to look for the code of equals, so I thought maybe I quote it here:
GenSetLike.scala:
/** Compares this set with another object for equality.
*
* '''Note:''' This operation contains an unchecked cast: if `that`
* is a set, it will assume with an unchecked cast
* that it has the same element type as this set.
* Any subsequent ClassCastException is treated as a `false` result.
* @param that the other object
* @return `true` if `that` is a set which contains the same elements
* as this set.
*/
override def equals(that: Any): Boolean = that match {
case that: GenSet[_] =>
(this eq that) ||
(that canEqual this) &&
(this.size == that.size) &&
(try this subsetOf that.asInstanceOf[GenSet[A]]
catch { case ex: ClassCastException => false })
case _ =>
false
}
Essentially, it checks whether the other object is also a GenSet, and if yes, it attempts to perform some fail-fast checks (like comparing size and invoking canEqual), and if the sizes are equal, it checks whether this set is a subset of another set, presumably by checking each element.
So the exact class used to represent the set at runtime is irrelevant; what matters is that the compared object is also a GenSet and has the same elements.
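The core of that check can be restated in a few lines. This is only a simplified sketch under my own names (SetEqualitySketch, sameElements); the real method also deals with non-set arguments, canEqual and the ClassCastException case:

```scala
import scala.collection.immutable.ListSet

object SetEqualitySketch {
  // Same size first (cheap), then check that every element of a is in b.
  def sameElements[A](a: Set[A], b: Set[A]): Boolean =
    a.size == b.size && a.subsetOf(b)

  val defaultSet: Set[Int] = Set(1, 2, 3)
  val listBased: Set[Int]  = ListSet(3, 2, 1)

  val viaSketch: Boolean     = sameElements(defaultSet, listBased)      // true
  val viaEquals: Boolean     = defaultSet == listBased                  // true
  val differentSize: Boolean = sameElements(defaultSet, ListSet(1, 2))  // false
}
```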
From Scala collections equality:
The collection libraries have a uniform approach to equality and hashing. The idea is, first, to divide collections into sets, maps, and sequences.
...
On the other hand, within the same category, collections are equal if and only if they have the same elements
In your case, both collections are considered sets and they contain the same elements, hence, they're equal.
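A quick way to see the category rule in action is to compare a few collections from the standard library. The object and value names below are mine, and the values are typed as Iterable only so that the cross-category comparison compiles without complaints:

```scala
import scala.collection.immutable.{ListSet, TreeSet}

object CollectionCategoriesDemo {
  val plainSet: Iterable[Int] = Set(1, 2)
  val listSet: Iterable[Int]  = ListSet(2, 1)
  val treeSet: Iterable[Int]  = TreeSet(2, 1)
  val list: Iterable[Int]     = List(1, 2)
  val vector: Iterable[Int]   = Vector(1, 2)

  val setVsListSet: Boolean = plainSet == listSet  // true: both are sets
  val setVsTreeSet: Boolean = plainSet == treeSet  // true: both are sets
  val listVsVector: Boolean = list == vector       // true: both are sequences, same order
  val setVsList: Boolean    = plainSet == list     // false: different categories
}
```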
Scala 2.12.8 documentation:
This class implements immutable sets using a list-based data
structure.
So ListSet is a set too, but with a concrete (list-based) implementation.

Understanding Sets and Sequences using String checking as an example

I have a string which I would like to cross check if it is purely made of letters and space.
val str = "my long string to test"
val purealpha = " abcdefghijklmnopqrstuvwxyz".toSet
if (str.forall(purealpha(_))) println("PURE") else println("NOTPURE")
The above CONCISE code does the job. However, if I run it this way:
val str = "my long string to test"
val purealpha = " abcdefghijklmnopqrstuvwxyz" // not converted toSet
str.forall(purealpha(_)) // CONCISE code
I get an error (found: Char ... required: Boolean) and it can only work using the contains method this way:
str.forall(purealpha.contains(_))
My question is how can I use the CONCISE form without converting the string to a Set. Any suggestions on having my own String class with the right combination of methods to enable the nice code; or maybe some pure function(s) working on strings.
It's just a fun exercise I'm doing, so I can understand the intricate details of various methods on collections (including apply method) and how to write nice concise code and classes.
A slightly different approach is to use a regex pattern.
val str = "my long string to test"
val purealpha = "[ a-z]+"
str matches purealpha // res0: Boolean = true
If we look at the source code we can see that both these implementations are doing different things, although giving the same result.
When you convert it to a Set and use forall, you are ultimately calling the apply method of the Set. Here is how apply is called explicitly in your code, also using named parameters in the anonymous functions:
if (str.forall(s => purealpha.apply(s))) println("PURE") else println("NOTPURE") // first example
str.forall(s => purealpha.apply(s)) // second example
Anyway, let's take a look at the source code for apply for Set (gotten from GenSetLike.scala):
/** Tests if some element is contained in this set.
*
* This method is equivalent to `contains`. It allows sets to be interpreted as predicates.
* @param elem the element to test for membership.
* @return `true` if `elem` is contained in this set, `false` otherwise.
*/
def apply(elem: A): Boolean = this contains elem
When you leave purealpha as a String, you have to call .contains explicitly (this is the source code for that, from SeqLike.scala):
/** Tests whether this $coll contains a given value as an element.
* $mayNotTerminateInf
*
* @param elem the element to test.
* @return `true` if this $coll has an element that is equal (as
* determined by `==`) to `elem`, `false` otherwise.
*/
def contains[A1 >: A](elem: A1): Boolean = exists (_ == elem)
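As you can imagine, calling apply on a String does not give the same result as calling apply on a Set. A String's apply takes an Int index and returns the Char at that position; the Char argument is widened to an Int, which is why the compiler reports found: Char, required: Boolean.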
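The difference between the two apply methods can be checked directly. A small sketch (ApplyDemo and its value names are just for illustration):

```scala
object ApplyDemo {
  val letters = "abc"

  // On a String, apply(i) is positional: it returns the Char at index i.
  val second: Char = letters(1)  // 'b'

  // On a Set[Char], apply(c) is a membership test and returns a Boolean.
  val asSet: Set[Char] = letters.toSet
  val hasB: Boolean = asSet('b')  // true
  val hasZ: Boolean = asSet('z')  // false

  // A Set[Char] is itself a Char => Boolean, so it can be passed to forall as is.
  val allKnown: Boolean = "cab".forall(asSet)  // true
}
```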
A suggestion on having more conciseness is to omit the (_) entirely in the second example (compiler type inference will pick that up):
val str = "my long string to test"
val purealpha = " abcdefghijklmnopqrstuvwxyz" // not converted toSet
str.forall(purealpha.contains)
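As for the "own String class" part of the question, one possible direction is a thin wrapper that extends Char => Boolean, so that it behaves like the Set in the concise form. CharPredicate is a hypothetical helper of mine, not something from the standard library:

```scala
object CharSets {
  // Wraps a String so that applying it to a Char is a membership test.
  final class CharPredicate(chars: String) extends (Char => Boolean) {
    // indexOf returns -1 when the character does not occur in chars
    def apply(c: Char): Boolean = chars.indexOf(c) >= 0
  }

  val purealpha = new CharPredicate(" abcdefghijklmnopqrstuvwxyz")

  val pure: Boolean   = "my long string to test".forall(purealpha)  // true
  val impure: Boolean = "my 2nd string".forall(purealpha(_))        // false, because of '2'
}
```

Each test is a linear scan of the wrapped String, so for large alphabets converting to a Set once is probably still the better trade-off.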

Why does Some(x).map(_ => null) not evaluate to None?

I have recently faced a confusing issue in Scala. I expect the following code to result in None, but it results in Some(null):
Option("a").map(_ => null)
What is the reasoning behind this? Why does it not result in None?
Note: This question is not a duplicate of Why Some(null) isn't considered None?, as that question asks about explicitly using Some(null). My question is about using Option.map.
Every time we add an exception to a rule, we deprive ourselves of a tool for reasoning about code.
Mapping over a Some always evaluates to a Some. That's a simple and useful law. If we were to make the change you propose, we would no longer have that law. For example, here's a thing we can say with certainty. For all f, x, and y:
Some(x).map(f).map(_ => y) == Some(y)
If we were to make the change you propose, that statement would no longer be true; specifically, it would not hold for cases where f(x) == null.
Moreover, Option is a functor. Functor is a useful generalization of things that have map functions, and it has laws that correspond well to intuition about how mapping should work. If we were to make the change you propose, Option would no longer be a functor.
null is an aberration in Scala that exists solely for interoperability with Java libraries. It is not a good reason to discard Option's validity as a functor.
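To make the functor argument concrete, here is a sketch that compares the real map with a hypothetical "null-collapsing" map, i.e. the behavior the question expected. collapsingMap, f and g are names I made up for the illustration:

```scala
object FunctorLawDemo {
  // Hypothetical variant of map that turns a null result into None.
  def collapsingMap[A, B](opt: Option[A])(f: A => B): Option[B] =
    opt.flatMap(a => Option(f(a)))

  val f: String => String = _ => null  // a function that happens to return null
  val g: String => Int    = _ => 42    // a function that ignores its argument

  // Real Option.map: mapping twice equals mapping once with the composition.
  val realTwoSteps: Option[Int] = Some("a").map(f).map(g)     // Some(42)
  val realOneStep: Option[Int]  = Some("a").map(f andThen g)  // Some(42)

  // Collapsing variant: the two routes disagree, so the composition law is lost.
  val collapsedTwoSteps: Option[Int] =
    collapsingMap(collapsingMap(Option("a"))(f))(g)           // None
  val collapsedOneStep: Option[Int] =
    collapsingMap(Option("a"))(f andThen g)                   // Some(42)
}
```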
Here is the code for Option's map method:
/** Returns a $some containing the result of applying $f to this $option's
* value if this $option is nonempty.
* Otherwise return $none.
*
* @note This is similar to `flatMap` except here,
* $f does not need to wrap its result in an $option.
*
* @param f the function to apply
* @see flatMap
* @see foreach
*/
@inline final def map[B](f: A => B): Option[B] =
if (isEmpty) None else Some(f(this.get))
So, as you can see, if the option is not empty, it will map to Some with the value returned by the function. And here is the code for Some class:
/** Class `Some[A]` represents existing values of type
* `A`.
*
* @author Martin Odersky
* @version 1.0, 16/07/2003
*/
@SerialVersionUID(1234815782226070388L) // value computed by serialver for 2.11.2, annotation added in 2.11.4
final case class Some[+A](x: A) extends Option[A] {
def isEmpty = false
def get = x
}
So, as you can see, Some(null) will actually create a Some object containing null. What you probably want to do is use Option.apply, which returns None if the value is null. Here is the code for the Option.apply method:
/** An Option factory which creates Some(x) if the argument is not null,
* and None if it is null.
*
* @param x the value
* @return Some(value) if value != null, None if value == null
*/
def apply[A](x: A): Option[A] = if (x == null) None else Some(x)
So, you need to write your code like this:
Option("a").flatMap(s => Option.apply(null))
Of course, this code makes no sense, but I will consider that you are just doing some kind of experiment.
Option is a kind of replacement for null. In general you only see null in Scala when you are talking to Java code. Option is not supposed to handle nulls wherever possible: it is designed to be used instead of nulls, not with them. There is, however, a convenience method, Option.apply, similar to Java's Optional.ofNullable, that handles the null case, and that is mostly all there is to nulls and Options in Scala. In all other cases Option works with Some and None, and makes no distinction as to whether a null is inside.
If you have some nasty method returning null that comes from java and you want to use it directly, use following approach:
def nastyMethod(s: String): String = null
Some("a").flatMap(s => Option(nastyMethod(s)))
// or
Some("a").map(nastyMethod).flatMap(Option(_))
Both output Option[String] = None
So a method like nastyMethod, which can return either a String or null, is conceptually returning an Option: wrap its result in an Option and use it as one. Don't expect null magic to happen whenever you need it.
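The same approach works with a real Java API. For example, java.util.Map#get returns null for a missing key, so it can be wrapped once at the boundary. A sketch (NullInteropDemo and lookup are illustrative names):

```scala
object NullInteropDemo {
  val javaMap = new java.util.HashMap[String, String]()
  javaMap.put("scala", "Option")

  // Wrap the nullable result at the boundary, then work with Option only.
  def lookup(key: String): Option[String] = Option(javaMap.get(key))

  val found: Option[String]   = lookup("scala")  // Some("Option")
  val missing: Option[String] = lookup("java")   // None

  // flatMap keeps the chain free of nulls; map is never applied to a missing value.
  val chained: Option[Int] = Some("java").flatMap(lookup).map(_.length)  // None
}
```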
To understand what's going on, we can use the functional substitution principle to explore the given expression step by step:
Option("a").map(s => null) // through Option.apply
Some("a").map(s => null) // let's name the anonymous function as: f(x) = null
Some("a").map(x => f(x)) // following Option[A].map(f:A=>B) => Option[B]
Some(f("a")) // apply f(x)
Some(null)
The confusion expressed in the question comes from the assumption that the map would apply to the argument of the Option before the Option.apply is evaluated: Let's see how that couldn't possibly work:
Option("a").map(x=> f(x)) // !!! can't evaluate map before Option.apply. This is the key to understand !
Option(f("a")) // !!! we can't get here
Option(null) // !!! we can't get here
None // !!! we can't get here
Why would it be None? The signature of map takes a function from a value A to a value B and yields an Option[B]. Nowhere in that signature is there any indication that B may be absent; that would require the function to return an Option[B]. flatMap, however, does indicate that the value returned is also optional. Its signature is Option[A] => (A => Option[B]) => Option[B].

Check 2 sets for inclusion in Scala

abstract class FinSet[T] protected () {
// given a set other, it returns true iff every element of this is an element of other
def <=(other:FinSet[T]): Boolean =
// ????
That is what I am given so far. I am somewhat confused on how to implement this method. Would I be calling the method like so:
Set(1,2,3).<=(Set(3,2,1)) which should return true
I was wondering if this would work, but it seems too simple:
def <=(other:FinSet[T]): Boolean = if (this == other) true else false
Just looking for some guidance. Thanks.
& means intersection. If the second set does not contain every element of the first set, the following code returns false.
(thisSet & thatSet) == thisSet
In detail, this code computes the intersection of this set and the other set, and checks whether the result equals this set.
See the & method, or intersect (the more verbose version), in the Scaladoc.
You can also do something like this:
thisSet.forall(x => thatSet contains x)
or less verbose:
thisSet.forall(thatSet contains _)
or like this:
(thisSet ++ thatSet) == thatSet
or maybe like this:
(thatSet -- thisSet).size == (thatSet.size - thisSet.size)
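For the standard library's own Set there is also a built-in subsetOf method. The sketch below checks that the formulations above agree with it on one pair of sets (the object and value names are mine):

```scala
object SubsetDemo {
  val small = Set(1, 2, 3)
  val big   = Set(3, 2, 1, 0)

  // Standard library method for exactly this check.
  val direct: Boolean = small.subsetOf(big)  // true

  // The formulations from this answer, which agree with it.
  val viaIntersect: Boolean = (small & big) == small                          // true
  val viaForall: Boolean    = small.forall(big.contains)                      // true
  val viaUnion: Boolean     = (small ++ big) == big                           // true
  val viaSizes: Boolean     = (big -- small).size == (big.size - small.size)  // true

  // Reversed direction: big is not included in small.
  val reversed: Boolean = big.subsetOf(small)  // false
}
```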
Rephrasing the requirement: you want to check if, for all elements of this set, the other set contains the element.
This sounds like the combination of two more primitive functions that you will probably want anyway. So, if you haven't done so already, I would define the methods:
def forall(predicate: T => Boolean): Boolean // Checks that predicate holds for all elements
def contains(elem: T): Boolean // Check that elem is an element of the set
Then the method <= devolves to:
def <=(other: FinSet[T]): Boolean = forall(other.contains)
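Put together, it could look roughly like this. ListFinSet is a hypothetical List-backed subclass that I added only to have something to run; your assignment presumably has its own concrete representation:

```scala
object FinSetSketch {
  abstract class FinSet[T] protected () {
    def forall(predicate: T => Boolean): Boolean
    def contains(elem: T): Boolean
    // true iff every element of this is an element of other
    def <=(other: FinSet[T]): Boolean = forall(other.contains)
  }

  // Hypothetical concrete subclass, for illustration only.
  final class ListFinSet[T](elems: List[T]) extends FinSet[T] {
    def forall(predicate: T => Boolean): Boolean = elems.forall(predicate)
    def contains(elem: T): Boolean = elems.contains(elem)
  }

  val a = new ListFinSet(List(1, 2, 3))
  val b = new ListFinSet(List(3, 2, 1))
  val c = new ListFinSet(List(1, 2))

  val sameElements: Boolean = a <= b  // true
  val smaller: Boolean      = c <= a  // true
  val larger: Boolean       = a <= c  // false
}
```

Note that this == other would only answer "same set", not "included in", which is why the equality-based attempt in the question is too strict.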

How can I define a custom equality operation that will be used by immutable Set comparison methods

I have an immutable Set of a class, Set[MyClass], and I want to use the Set methods intersect and diff, but I want them to test for equality using my custom equals method, rather than the default object equality test.
I have tried overriding the == operator, but it isn't being used.
Thanks in advance.
Edit:
The intersect method is a concrete value member of GenSetLike
spec: http://www.scala-lang.org/api/current/scala/collection/GenSetLike.html
src: https://lampsvn.epfl.ch/trac/scala/browser/scala/tags/R_2_9_1_final/src//library/scala/collection/GenSetLike.scala#L1
def intersect(that: GenSet[A]): Repr = this filter that
so the intersection is done using the filter method.
Yet another Edit:
filter is defined in TraversableLike
spec: http://www.scala-lang.org/api/current/scala/collection/TraversableLike.html
src: https://lampsvn.epfl.ch/trac/scala/browser/scala/tags/R_2_9_1_final/src//library/scala/collection/TraversableLike.scala#L1
def filter(p: A => Boolean): Repr = {
val b = newBuilder
for (x <- this)
if (p(x)) b += x
b.result
}
What's unclear to me is what it uses when invoked without a predicate, p. That's not an implicit parameter.
equals and hashCode are provided automatically in a case class only if you do not define them.
case class MyClass(val name: String) {
override def equals(o: Any) = o match {
case that: MyClass => that.name.equalsIgnoreCase(this.name)
case _ => false
}
override def hashCode = name.toUpperCase.hashCode
}
Set(MyClass("xx"), MyClass("XY"), MyClass("xX"))
res1: scala.collection.immutable.Set[MyClass] = Set(MyClass(xx), MyClass(XY))
If what you want is reference equality, still write equals and hashCode, to prevent automatic generation, and call the version from AnyRef
override def equals(o: Any) = super.equals(o)
override def hashCode = super.hashCode
With that:
Set(MyClass("x"), MyClass("x"))
res2: scala.collection.immutable.Set[MyClass] = Set(MyClass(x), MyClass(x))
You cannot override ==(o: Any): it is final and always calls equals. If you tried defining a new (overloaded) ==(m: MyClass), it is not the one that Set calls, so it is useless here and quite dangerous in general.
As for the call to filter, the reason it works is that Set[A] is a Function[A, Boolean]. And yes, equals is used: you will see that the function implementation (apply) is a synonym for contains, and most implementations of Set use == in contains (SortedSet uses the Ordering instead). And == calls equals.
Note: the implementation of my first equals is quick and dirty, and probably bad if MyClass is to be subclassed. If so, you should at the very least check type equality (this.getClass == that.getClass) or, better, define a canEqual method (you may read this blog by Daniel Sobral).
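For reference, a canEqual-based version could look like the sketch below. Tag is a hypothetical non-case class of mine, and the lower-cased key is just one way to keep equals and hashCode consistent with each other:

```scala
import java.util.Locale

object CustomEqualityDemo {
  class Tag(val name: String) {
    // Normalized form used by both equals and hashCode.
    private val key: String = name.toLowerCase(Locale.ROOT)

    // Subclasses can override canEqual to opt out of equality with plain Tags.
    def canEqual(other: Any): Boolean = other.isInstanceOf[Tag]

    override def equals(other: Any): Boolean = other match {
      case that: Tag => that.canEqual(this) && key == that.key
      case _         => false
    }

    // Derived from the same key as equals, so equal objects share a hash code.
    override def hashCode: Int = key.hashCode
  }

  val left  = Set(new Tag("xx"), new Tag("XY"))
  val right = Set(new Tag("XX"))

  val common: Set[Tag] = left.intersect(right)  // one element, equal to Tag("xx")
  val rest: Set[Tag]   = left.diff(right)       // one element, equal to Tag("XY")
}
```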
You'll need to override .hashCode as well. This is almost always the case when you override .equals, as .hashCode is often used as a cheaper pre-check for .equals; any two objects which are equal must have identical hash codes. I'm guessing you're using objects whose default hashCode does not respect this property with respect to your custom equality, and the Set implementation is making assumptions based on the hash codes (and so never even calling your equality operation).
See the Scala docs for Any.equals and Any.hashCode: http://www.scala-lang.org/api/rc/scala/Any.html
This answer shows a custom mutable Set with user-defined Equality. It could be made immutable by replacing the internal store with a Vector and returning a modified copy of itself upon each operation
"It is not possible to override == directly, as it is defined as a final method in class Any. That is, Scala treats == as if were defined as follows in class Any:
final def == (that: Any): Boolean =
if (null eq this) {null eq that} else {this equals that}
" from Programming In Scala, Second Edition