How to write a proper null-safe coalescing operator in scala? - scala

Having seen the answers coming out of questions like this one involving horror shows like trying to catch the NPE and dredge the mangled name out of the stack trace, I am asking this question so I can answer it.
Comments or further improvements welcome.

Like so:
case class ?:[T](x: T) {
def apply(): T = x
def apply[U >: Null](f: T => U): ?:[U] =
if (x == null) ?:[U](null)
else ?:[U](f(x))
}
And in action:
scala> val x = ?:("hel")(_ + "lo ")(_ * 2)(_ + "world")()
x: java.lang.String = hello hello world
scala> val x = ?:("hel")(_ + "lo ")(_ => (null: String))(_ + "world")()
x: java.lang.String = null

Added orElse
case class ?:[T](x: T) {
def apply(): T = x
def apply[U >: Null](f: T => U): ?:[U] =
if (x == null) ?:[U](null)
else ?:[U](f(x))
def orElse(y: T): T =
if (x == null) y
else x
}
scala> val x = ?:(obj)(_.subField)(_.subSubField).orElse("not found")
x: java.lang.String = not found
Or if you prefer named syntax as opposed to operator syntax
case class CoalesceNull[T](x: T) {
def apply(): T = x
def apply[U >: Null](f: T => U): CoalesceNull[U] =
if (x == null) CoalesceNull[U](null)
else CoalesceNull[U](f(x))
def orElse(y: T): T =
if (x == null) y
else x
}
scala> val x = CoalesceNull(obj)(_.subField)(_.subSubField).orElse("not found")
x: java.lang.String = not found
More examples
case class Obj[T](field: T)
test("last null") {
val obj: Obj[Obj[Obj[Obj[String]]]] = Obj(Obj(Obj(Obj(null))))
val res0 = CoalesceNull(obj)(_.field)(_.field)(_.field)(_.field)()
res0 should === (null)
val res1 = CoalesceNull(obj)(_.field)(_.field)(_.field)(_.field).orElse("not found")
res1 should === ("not found")
}
test("first null") {
val obj: Obj[Obj[Obj[Obj[String]]]] = null
val res0 = CoalesceNull(obj)(_.field)(_.field)(_.field)(_.field)()
res0 should === (null)
val res1 = CoalesceNull(obj)(_.field)(_.field)(_.field)(_.field).orElse("not found")
res1 should === ("not found")
}
test("middle null") {
val obj: Obj[Obj[Obj[Obj[String]]]] = Obj(Obj(null))
val res0 = CoalesceNull(obj)(_.field)(_.field)(_.field)(_.field)()
res0 should === (null)
val res1 = CoalesceNull(obj)(_.field)(_.field)(_.field)(_.field).orElse("not found")
res1 should === ("not found")
}
test("not null") {
val obj: Obj[Obj[Obj[Obj[String]]]] = Obj(Obj(Obj(Obj("something"))))
val res0 = CoalesceNull(obj)(_.field)(_.field)(_.field)(_.field)()
res0 should === ("something")
val res1 = CoalesceNull(obj)(_.field)(_.field)(_.field)(_.field).orElse("not found")
res1 should === ("something")
}
}

Related

Case class hashCode and equals override optional attributes

How to override hashcode and equals method if it's all attributes are optional, I tried below, what's the better way to write hashCode and equals
case class A(id: Option[String], name: Option[String]){
override def hashCode(): Int = ???
override def canEqual(a: Any) = a.isInstanceOf[A]
override def equals(obj: Any): Boolean = obj match {
case obj: A => {
obj.canEqual(this) && this.id == obj.id && this.name == obj.name
}
case _ => false
}
}
You can do without the luxury equals if you're not boxing primitives or nulls.
That can save some null checks and instanceof/checkcast.
scala> case class C(id: Option[String], name: Option[String]) {
| override def equals(other: Any) = other match { case c: C => cmp(id, c.id) && cmp(name, c.name) case _ => false }
| private def cmp(x: Option[String], y: Option[String]): Boolean =
| if (x eq None) y eq None else !(y eq None) && x.get.equals(y.get)
| }
defined class C
scala> val x = C(Option("king"), Option("kong"))
x: C = C(Some(king),Some(kong))
scala> val y = C(Option("king"), Option("kong"))
y: C = C(Some(king),Some(kong))
scala> x == y
res0: Boolean = true
scala> val y = C(Option("king"), Option("king"))
y: C = C(Some(king),Some(king))
scala> x == y
res1: Boolean = false
But:
scala> val y = C(Option("king"), Some(null))
y: C = C(Some(king),Some(null))
scala> x == y
res2: Boolean = false
scala> y == x
java.lang.NullPointerException
at C.cmp(<console>:4)
at C.equals(<console>:2)
... 28 elided
scala> val y = C(Option("king"), null)
y: C = C(Some(king),null)
scala> x == y
java.lang.NullPointerException
at C.cmp(<console>:4)
at C.equals(<console>:2)
... 28 elided
There's not much to gain from avoiding hashCode of None, which is just "None".##.
Illustrating loss of some comparison:
scala> :pa
// Entering paste mode (ctrl-D to finish)
case class C(id: Option[String], name: Option[Any]) {
override def equals(other: Any) = other match { case c: C => cmp(id, c.id) && cmp(name, c.name) case _ => false }
private def cmp[A](x: Option[A], y: Option[A]) = if (x eq None) y eq None else !(y eq None) && x.get.equals(y.get)
}
// Exiting paste mode, now interpreting.
defined class C
scala> val x = C(Option("king"), Option('k'))
x: C = C(Some(king),Some(k))
scala> val y = C(Option("king"), Option('k'.toInt))
y: C = C(Some(king),Some(107))
scala> x == y
res19: Boolean = false
scala> case class K(id: Option[String], name: Option[Any])
defined class K
scala> K(Option("king"), Option('k')) == K(Option("king"), Option('k'.toInt))
res20: Boolean = true

Scala implicit ordering error when not specifying type parameters

def merge(bigrams1: Map[String, mutable.SortedMap[String, Int]],
bigrams2: Map[String, mutable.SortedMap[String, Int]]): Map[String, mutable.SortedMap[String, Int]] = {
bigrams2 ++ bigrams1
.map(entry1 => entry1._1 -> (entry1._2 ++ bigrams2.getOrElse(entry1._1, mutable.SortedMap())
.map(entry2 => entry2._1 -> (entry2._2 + entry1._2.getOrElse(entry2._1, 0)))))
}
At compile time, I get these errors:
Error:(64, 114) diverging implicit expansion for type scala.math.Ordering[T1]
starting with method Tuple9 in object Ordering
bigrams2 ++ bigrams1.map(entry1 => entry1._1 -> (entry1._2 ++ bigrams2.getOrElse(entry1._1, mutable.SortedMap()).map(entry2 => entry2._1 -> (entry2._2 + entry1._2.getOrElse(entry2._1, 0)))))
Error:(64, 114) not enough arguments for method apply: (implicit ord: scala.math.Ordering[A])scala.collection.mutable.SortedMap[A,B] in class SortedMapFactory.
Unspecified value parameter ord.
bigrams2 ++ bigrams1.map(entry1 => entry1._1 -> (entry1._2 ++ bigrams2.getOrElse(entry1._1, mutable.SortedMap()).map(entry2 => entry2._1 -> (entry2._2 + entry1._2.getOrElse(entry2._1, 0)))))
Specifying the types of the sorted map solves the problem:
def merge(bigrams1: Map[String, mutable.SortedMap[String, Int]],
bigrams2: Map[String, mutable.SortedMap[String, Int]]): Map[String, mutable.SortedMap[String, Int]] = {
bigrams2 ++ bigrams1
.map(entry1 => entry1._1 -> (entry1._2 ++ bigrams2.getOrElse(entry1._1, mutable.SortedMap[String, Int]())
.map(entry2 => entry2._1 -> (entry2._2 + entry1._2.getOrElse(entry2._1, 0)))))
}
Why do these type parameters need to be specified? Why can they not be inferred without problems with the implicit ordering?
Full code:
import java.io.File
import scala.annotation.tailrec
import scala.collection.mutable
import scala.io.Source
import scala.util.matching.Regex
case class Bigrams(bigrams: Map[String, mutable.SortedMap[String, Int]]) {
def mergeIn(bigramsIn: Map[String, mutable.SortedMap[String, Int]]): Bigrams = {
Bigrams(Bigrams.merge(bigrams, bigramsIn))
}
def extractStatistics(path: String): Bigrams = {
val entry: File = new File(path)
if (entry.exists && entry.isDirectory) {
val bigramsFromDir: Map[String, mutable.SortedMap[String, Int]] = entry
.listFiles
.filter(file => file.isFile && file.getName.endsWith(".sgm"))
.map(Bigrams.getBigramsFrom)
.foldLeft(Map[String, mutable.SortedMap[String, Int]]())(Bigrams.merge)
val bigramsFromSubDirs: Bigrams = entry
.listFiles
.filter(entry => entry.isDirectory)
.map(entry => extractStatistics(entry.getAbsolutePath))
.foldLeft(Bigrams())(Bigrams.merge)
bigramsFromSubDirs.mergeIn(bigramsFromDir)
} else if (entry.exists && entry.isFile) {
Bigrams(Bigrams.getBigramsFrom(entry))
} else
throw new RuntimeException("Incorrect path")
}
def getFreqs(word: String): Option[mutable.SortedMap[String, Int]] = {
bigrams.get(word)
}
}
object Bigrams {
def fromPath(path: String): Bigrams = {
new Bigrams(Map[String, mutable.SortedMap[String, Int]]()).extractStatistics(path)
}
def apply(): Bigrams = {
new Bigrams(Map())
}
val BODY: Regex = "(?s).*<BODY>(.*)</BODY>(?s).*".r
// Return a list with the markup for each article
#tailrec
def readArticles(remainingLines: List[String], acc: List[String]): List[String] = {
if (remainingLines.size == 1) acc
else {
val nextLine = remainingLines.head
if (nextLine.startsWith("<REUTERS ")) readArticles(remainingLines.tail, nextLine +: acc)
else readArticles(remainingLines.tail, (acc.head + "\n" + nextLine) +: acc.tail)
}
}
def merge(bigrams1: Map[String, mutable.SortedMap[String, Int]],
bigrams2: Map[String, mutable.SortedMap[String, Int]]): Map[String, mutable.SortedMap[String, Int]] = {
bigrams2 ++ bigrams1
.map(entry1 => entry1._1 -> (entry1._2 ++ bigrams2.getOrElse(entry1._1, mutable.SortedMap[String, Int]())
.map(entry2 => entry2._1 -> (entry2._2 + entry1._2.getOrElse(entry2._1, 0)))))
}
def merge(bigrams1: Bigrams, bigrams2: Bigrams): Bigrams = {
new Bigrams(merge(bigrams1.bigrams, bigrams2.bigrams))
}
def getBigramsFrom(path: File): Map[String, mutable.SortedMap[String, Int]] = {
val file = Source.fromFile(path)
val fileLines: List[String] = file.getLines().toList
val articles: List[String] = Bigrams.readArticles(fileLines.tail, List())
val bodies: List[String] = articles.map(extractBody).filter(body => !body.isEmpty)
val sentenceTokens: List[List[String]] = bodies.flatMap(getSentenceTokens)
sentenceTokens.foldLeft(Map[String, mutable.SortedMap[String, Int]]())((acc, tokens) => addBigramsFrom(tokens, acc))
}
def getBigrams(tokens: List[String]): List[(String, String)] = {
tokens.indices.
map(i => {
if (i < tokens.size - 1) (tokens(i), tokens(i + 1))
else null
})
.filter(_ != null).toList
}
// Return the body of the markup of one article
def extractBody(article: String): String = {
try {
val body: String = article match {
case Bigrams.BODY(bodyGroup) => bodyGroup
}
body
}
catch {
case _: MatchError => ""
}
}
def getSentenceTokens(text: String): List[List[String]] = {
val separatedBySpace: List[String] = text
.replace('\n', ' ')
.replaceAll(" +", " ") // regex
.split(" ")
.map(token => if (token.endsWith(",")) token.init.toString else token)
.toList
val splitAt: List[Int] = separatedBySpace.indices
.filter(i => i > 0 && separatedBySpace(i - 1).endsWith(".") || i == 0)
.toList
groupBySentenceTokens(separatedBySpace, splitAt, List()).map(sentenceTokens => sentenceTokens.init :+ sentenceTokens.last.substring(0, sentenceTokens.last.length - 1))
}
#tailrec
def groupBySentenceTokens(tokens: List[String], splitAt: List[Int], sentences: List[List[String]]): List[List[String]] = {
if (splitAt.size <= 1) {
if (splitAt.size == 1) {
sentences :+ tokens.slice(splitAt.head, tokens.size)
} else {
sentences
}
}
else groupBySentenceTokens(tokens, splitAt.tail, sentences :+ tokens.slice(splitAt.head, splitAt.tail.head))
}
def addBigramsFrom(tokens: List[String], bigrams: Map[String, mutable.SortedMap[String, Int]]): Map[String, mutable.SortedMap[String, Int]] = {
var newBigrams = bigrams
val bigramsFromTokens: List[(String, String)] = Bigrams.getBigrams(tokens)
bigramsFromTokens.foreach(bigram => { // TODO: This code uses side effects to get the job done. Try to remove them.
val currentFreqs: mutable.SortedMap[String, Int] = newBigrams.get(bigram._1)
.map((map: mutable.SortedMap[String, Int]) => map)
.getOrElse(mutable.SortedMap())
val incrementedWordFreq = currentFreqs.get(bigram._2)
.map(freq => freq + 1)
.getOrElse(1)
val newFreqs = currentFreqs + (bigram._2 -> incrementedWordFreq)
newBigrams = newBigrams - bigram._1 + (bigram._1 -> newFreqs)
})
newBigrams
}
}
The thing is that method Map#getOrElse in 2.13 (or MapLike#getOrElse in 2.12) has signature
def getOrElse[V1 >: V](key: K, default: => V1): V1
https://github.com/scala/scala/blob/2.13.x/src/library/scala/collection/Map.scala#L132-L135
https://github.com/scala/scala/blob/2.12.x/src/library/scala/collection/MapLike.scala#L129-L132
i.e. it expects not necessarily default of the same type as V in Map[K, +V] (or MapLike[K, +V, +This <: MapLike[K, V, This] with Map[K, V]]) but possibly of a supertype of V. In your case V is mutable.SortedMap[String, Int] and there are so many its supertypes (you can look at inheritance hierarchy yourself). mutable.SortedMap() can be of any type mutable.SortedMap[A, B] or their super types.
If you replace method getOrElse with having fixed V
implicit class MapOps[K, V](m: Map[K, V]) {
def getOrElse1(key: K, default: => V): V = m.get(key) match {
case Some(v) => v
case None => default
}
}
or in 2.12
implicit class MapOps[K, V, +This <: MapLike[K, V, This] with Map[K, V]](m: MapLike[K, V, This]) {
def getOrElse1(key: K, default: => V): V = m.get(key) match {
case Some(v) => v
case None => default
}
}
then in your code
... bigrams2.getOrElse1(entry1._1, mutable.SortedMap())
all types will be inferred and it will compile.
So sometimes types should be just specified explicitly when compiler asks.

Implement find and remove in Scala

How is it easier to implement function that find and immutable remove the first occurrence in Scala collection:
case class A(a: Int, b: Int)
val s = Seq(A(1,5), A(4,6), A(2,3), A(5,1), A(2,7))
val (s1, r) = s.findAndRemove(_.a == 2)
Result: s1 = Seq(A(1,5), A(4,6), A(5,1), A(2,7)) , r = Some(A(2,3))
It finds the first element that match, and keeps order. It can be improved with List instead of Seq.
case class A(a: Int, b: Int)
val s = Seq(A(1,5), A(4,6), A(2,3), A(5,1), A(2,7))
val (s1, r) = s.findAndRemove(_.a == 2)
println(s1)
println(r)
implicit class SeqOps[T](s:Seq[T]) {
def findAndRemove(f:T => Boolean):(Seq[T], Option[T]) = {
s.foldLeft((Seq.empty[T], Option.empty[T])) {
case ((l, None), elem) => if(f(elem)) (l, Option(elem)) else (l :+ elem, None)
case ((l, x), elem) => (l :+ elem, x)
}
}
}
Yeah, a little late to the party, but I thought I'd throw this in.
Minimum invocations of the predicate.
Works with most popular collection types: Seq, List, Array, Vector. Even Set and Map (but for those the collection has no order to preserve and there's no telling which element the predicate will find first). Doesn't work for Iterator or String.
-
import scala.collection.generic.CanBuildFrom
import scala.language.higherKinds
implicit class CollectionOps[U, C[_]](xs :C[U]) {
def findAndRemove(p :U=>Boolean
)(implicit bf :CanBuildFrom[C[U], U, C[U]]
,ev :C[U] => collection.TraversableLike[U, C[U]]
) :(C[U], Option[U]) = {
val (before, after) = xs.span(!p(_))
before ++ after.drop(1) -> after.headOption
}
}
usage:
case class A(a: Int, b: Int)
val (as, a) = Seq(A(1,5), A(4,6), A(2,3), A(5,1), A(2,7)).findAndRemove(_.a==2)
//as: Seq[A] = List(A(1,5), A(4,6), A(5,1), A(2,7))
//a: Option[A] = Some(A(2,3))
val (cs, c) = Array('g','t','e','y','b','e').findAndRemove(_<'f')
//cs: Array[Char] = Array(g, t, y, b, e)
//c: Option[Char] = Some(e)
val (ns, n) = Stream.from(9).findAndRemove(_ > 10)
//ns: Stream[Int] = Stream(9, ?)
//n: Option[Int] = Some(11)
ns.take(5).toList //List[Int] = List(9, 10, 12, 13, 14)
Try something like this
def findAndRemove(as: Seq[A])(fn: A => Boolean): (Seq[A], Option[A]) = {
val index = as.indexWhere(fn)
if(index == -1) as -> None
else as.patch(index, Nil, 1) -> as.lift(index)
}
val (s1, r) = findAndRemove(s)(_.a == 2)
My version:
def findAndRemove(s:Seq[A])(p:A => Boolean):(Seq[A], Option[A])={
val i = s.indexWhere(p)
if(i > 0){
val (l1, l2) = s.splitAt(i)
(l1++l2.tail, Some(l2.head))
}else{
(s, None)
}
}

scala polymorphic type for return value

How would I make this pattern work? func() fails to compile. I understand the problem with this setup, but what's a pattern that could accomplish basically this?
class A() {
val a: Int = 123
val b: String = "xxx"
}
def func[T](key: String, a: A): T = {
if (key == "a") a.a // would make T an Int
else if (key == "b") a.b // would make T a String
}
val a = new A()
func[Int]("a", a)
func[String]("b", a)
I'm not quite sure what you are going for, but a few possibilities.
class A() {
val a: Int = 123
val b: String = "xxx"
}
def func[T : Manifest](a: A) = implictly[Manifest[T]] match {
case implicitly[Manifest[Int]]) => a.a
case implicitly[Manifest[String]) => a.b
}
val a = new A()
func[Int](a)
func[String](a)
or
class A() {
val a: Int = 123
val b: String = "xxx"
}
val aKey = (_: A).a
val bKey = (_: A).b
def func[T](key: A => T, a: A) = key(a)
val a = new A()
func(aKey, a)
func(bKey, a)
or even with Shapeless,
import shapeless._
import syntax.singleton._
import record._
val a = ("a" ->> 123) :: ("b" ->> "xxx") :: HNil
a("a") // typed as an Int
b("b") // typed as a String
Maybe this is close to what you're after?
class A() {
val a: Int = 123
val b: String = "xxx"
}
def func(key: String, a: A): Either[Int,String] = {
if (key == "a") Left(a.a)
else Right(a.b)
}
val a = new A()
func("a", a) // Left(123)
func("b", a) // Right("xxx")

Scala: Are these two partial functions equivalent?

Are these two partial functions equivalent?
val f0: PartialFunction[Int, String] = {
case 10 => "ten"
case n: Int => s"$n"
}
val f1 = new PartialFunction[Int, String] {
override def isDefinedAt(x: Int): Boolean = true
override def apply(v: Int): String = if (v == 10) "ten" else s"$v"
}
UPD
val pf = new PartialFunction[Int, String] {
def isDefinedAt(x: Int) = x == 10
def apply(v: Int) = if (isDefinedAt(v)) "ten" else "undefined"
}
def fun(n: Int)(pf: PartialFunction[Int, String]) = pf.apply(n)
println(fun(100)(pf))
Is it truly PF now?
I think you need 2 partial (value) functions to use the PartialFunction the way it is designed to be: one for the value 10, and the other for the other Ints:
val f0:PartialFunction[Int, String] = { case 10 => "ten" }
val fDef:PartialFunction[Int, String] = { case n => s"$n" }
And how to apply them:
val t1 = (9 to 11) collect f0
t1 shouldBe(Array("ten"))
val t2 = (9 to 11) map (f0 orElse fDef)
t2 shouldBe(Array("9", "ten", "11"))