Functional patterns for better chaining of collect - scala

I often find myself needing to chain collects where I want to do multiple collects in a single traversal. I also would like to return a "remainder" for things that don't match any of the collects.
For example:
sealed trait Animal
case class Cat(name: String) extends Animal
case class Dog(name: String, age: Int) extends Animal
val animals: List[Animal] =
List(Cat("Bob"), Dog("Spot", 3), Cat("Sally"), Dog("Jim", 11))
// Normal way
val cats: List[Cat] = animals.collect { case c: Cat => c }
val dogAges: List[Int] = animals.collect { case Dog(_, age) => age }
val rem: List[Animal] = Nil // No easy way to create this without repeated code
This really isn't great: it requires multiple iterations, and there is no reasonable way to calculate the remainder. I could write a very complicated fold to pull this off, but it would be really nasty.
Instead, I usually opt for mutation, which is fairly similar to the logic you would have in a fold:
import scala.collection.mutable.ListBuffer
// Ugly, hide the mutation away
val (cats2, dogsAges2, rem2) = {
// Lose some benefits of type inference
val cs = ListBuffer[Cat]()
val da = ListBuffer[Int]()
val rem = ListBuffer[Animal]()
// Bad separation of concerns, I have to merge all of my functions
animals.foreach {
case c: Cat => cs += c
case Dog(_, age) => da += age
case other => rem += other
}
(cs.toList, da.toList, rem.toList)
}
I don't like this one bit: it has worse type inference and worse separation of concerns, since I have to merge all of the various partial functions. It also requires lots of lines of code.
What I want are some useful patterns, like a collect that returns the remainder (I grant that partitionMap, new in 2.13, does this, but uglier). I could also use some form of pipe or map for operating on parts of tuples. Here are some made-up utilities:
implicit class ListSyntax[A](xs: List[A]) {
import scala.collection.mutable.ListBuffer
// Collect and return remainder
// A specialized form of new 2.13 partitionMap
def collectR[B](pf: PartialFunction[A, B]): (List[B], List[A]) = {
val rem = new ListBuffer[A]()
val res = new ListBuffer[B]()
val f = pf.lift
for (elt <- xs) {
f(elt) match {
case Some(r) => res += r
case None => rem += elt
}
}
(res.toList, rem.toList)
}
}
implicit class Tuple2Syntax[A, B](x: Tuple2[A, B]){
def chainR[C](f: B => C): Tuple2[A, C] = x.copy(_2 = f(x._2))
}
Now I can write this in a way that could be done in a single traversal (with a lazy data structure) and yet follows functional, immutable practice:
// Relatively pretty, can imagine lazy forms using a single iteration
val (cats3, (dogAges3, rem3)) =
animals.collectR { case c: Cat => c }
.chainR(_.collectR { case Dog(_, age) => age })
My question is, are there patterns like this? It smells like the type of thing that would be in a library like Cats, FS2, or ZIO, but I am not sure what it might be called.
Scastie link of code examples: https://scastie.scala-lang.org/Egz78fnGR6KyqlUTNTv9DQ

I wanted to see just how "nasty" a fold() would be.
val (cats
,dogAges
,rem) = animals.foldRight((List.empty[Cat]
,List.empty[Int]
,List.empty[Animal])) {
case (c:Cat, (cs,ds,rs)) => (c::cs, ds, rs)
case (Dog(_,d),(cs,ds,rs)) => (cs, d::ds, rs)
case (r, (cs,ds,rs)) => (cs, ds, r::rs)
}
Eye of the beholder I suppose.

How about defining a couple utility classes to help you with this?
case class ListCollect[A](list: List[A]) {
def partialCollect[B](f: PartialFunction[A, B]): ChainCollect[List[B], A] = {
val (cs, rem) = list.partition(f.isDefinedAt)
new ChainCollect((cs.map(f), rem))
}
}
case class ChainCollect[A, B](tuple: (A, List[B])) {
def partialCollect[C](f: PartialFunction[B, C]): ChainCollect[(A, List[C]), B] = {
val (cs, rem) = tuple._2.partition(f.isDefinedAt)
ChainCollect(((tuple._1, cs.map(f)), rem))
}
}
ListCollect is just meant to start the chain, and ChainCollect takes the previous remainder (the second element of the tuple) and tries to apply a PartialFunction to it, creating a new ChainCollect object. I'm not particularly fond of the nested tuples this produces, but you may be able to make it look a bit better if you use Shapeless's HLists.
val ((cats, dogs), rem) = ListCollect(animals)
.partialCollect { case c: Cat => c }
.partialCollect { case Dog(_, age) => age }
.tuple
Dotty's *: type makes this a bit easier:
opaque type ChainResult[Prev <: Tuple, Rem] = (Prev, List[Rem])
extension [P <: Tuple, R, N](chainRes: ChainResult[P, R]) {
def partialCollect(f: PartialFunction[R, N]): ChainResult[List[N] *: P, R] = {
val (cs, rem) = chainRes._2.partition(f.isDefinedAt)
(cs.map(f) *: chainRes._1, rem)
}
}
This does result in the output being reversed, but it doesn't have the ugly nesting from my previous approach:
val ((owls, dogs, cats), rem) = (EmptyTuple, animals)
.partialCollect { case c: Cat => c }
.partialCollect { case Dog(_, age) => age }
.partialCollect { case Owl(wisdom) => wisdom }
/* more animals */
case class Owl(wisdom: Double) extends Animal
case class Fly(isAnimal: Boolean) extends Animal
val animals: List[Animal] =
List(Cat("Bob"), Dog("Spot", 3), Cat("Sally"), Dog("Jim", 11), Owl(200), Fly(false))
And if you still don't like that, you can always define a few more helper methods to reverse the tuple, add the extension on a List without requiring an EmptyTuple to begin with, etc.
//Add this to the ChainResult extension
def end: Reverse[List[R] *: P] = {
def revHelp[A <: Tuple, R <: Tuple](acc: A, rest: R): RevHelp[A, R] =
rest match {
case EmptyTuple => acc.asInstanceOf[RevHelp[A, R]]
case h *: t => revHelp(h *: acc, t).asInstanceOf[RevHelp[A, R]]
}
revHelp(EmptyTuple, chainRes._2 *: chainRes._1)
}
//Helpful types for safety
type Reverse[T <: Tuple] = RevHelp[EmptyTuple, T]
type RevHelp[A <: Tuple, R <: Tuple] <: Tuple = R match {
case EmptyTuple => A
case h *: t => RevHelp[h *: A, t]
}
And now you can do this:
val (cats, dogs, owls, rem) = (EmptyTuple, animals)
.partialCollect { case c: Cat => c }
.partialCollect { case Dog(_, age) => age }
.partialCollect { case Owl(wisdom) => wisdom }
.end

Since you mentioned cats, I would also add a solution using foldMap:
import cats.implicits._ // brings foldMap and the Monoid instances into scope
sealed trait Animal
case class Cat(name: String) extends Animal
case class Dog(name: String) extends Animal
case class Snake(name: String) extends Animal
val animals: List[Animal] = List(Cat("Bob"), Dog("Spot"), Cat("Sally"), Dog("Jim"), Snake("Billy"))
val map = animals.foldMap{ //Map(other -> List(Snake(Billy)), cats -> List(Cat(Bob), Cat(Sally)), dogs -> List(Dog(Spot), Dog(Jim)))
case d: Dog => Map("dogs" -> List(d))
case c: Cat => Map("cats" -> List(c))
case o => Map("other" -> List(o))
}
val tuples = animals.foldMap{ //(List(Dog(Spot), Dog(Jim)),List(Cat(Bob), Cat(Sally)),List(Snake(Billy)))
case d: Dog => (List(d), Nil, Nil)
case c: Cat => (Nil, List(c), Nil)
case o => (Nil, Nil, List(o))
}
Arguably it's more succinct than the fold version, but it has to combine partial results using monoids, so it won't be as performant.
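To make the cost of "combining partial results" concrete, here is a minimal sketch that does not depend on cats. The Monoid trait, the instances, and the foldMap function below are hypothetical stand-ins written for illustration; they are not the cats definitions.

```scala
object FoldMapSketch {
  // Hypothetical, minimal stand-in for a Monoid type class.
  trait Monoid[A] {
    def empty: A
    def combine(x: A, y: A): A
  }

  implicit def listMonoid[A]: Monoid[List[A]] = new Monoid[List[A]] {
    def empty: List[A] = Nil
    def combine(x: List[A], y: List[A]): List[A] = x ++ y
  }

  implicit def tuple2Monoid[A, B](implicit ma: Monoid[A], mb: Monoid[B]): Monoid[(A, B)] =
    new Monoid[(A, B)] {
      def empty: (A, B) = (ma.empty, mb.empty)
      def combine(x: (A, B), y: (A, B)): (A, B) =
        (ma.combine(x._1, y._1), mb.combine(x._2, y._2))
    }

  // Map every element to a B, then merge all the Bs with the Monoid.
  def foldMap[A, B](xs: List[A])(f: A => B)(implicit m: Monoid[B]): B =
    xs.foldLeft(m.empty)((acc, a) => m.combine(acc, f(a)))

  sealed trait Animal
  final case class Cat(name: String) extends Animal
  final case class Dog(name: String) extends Animal

  val animals: List[Animal] = List(Cat("Bob"), Dog("Spot"), Cat("Sally"))

  // Each element becomes a tuple of tiny lists that is merged into the accumulator.
  val (cats, dogs) = foldMap[Animal, (List[Cat], List[Dog])](animals) {
    case c: Cat => (List(c), Nil)
    case d: Dog => (Nil, List(d))
  }
}
```

In this sketch every step allocates a tuple of one-element lists and appends it to the accumulator with ++, which is the extra work the plain fold avoids by prepending directly.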

This code is dividing a list into three sets, so the natural way to do this is to partition twice, here with partitionMap (Scala 2.13):
val (cats, notCat) = animals.partitionMap{
case c: Cat => Left(c)
case x => Right(x)
}
val (dogAges, rem) = notCat.partitionMap {
case Dog(_, age) => Left(age)
case x => Right(x)
}
A helper method can simplify this:
def partitionCollect[T, U](list: List[T])(pf: PartialFunction[T, U]): (List[U], List[T]) =
list.partitionMap {
case t if pf.isDefinedAt(t) => Left(pf(t))
case x => Right(x)
}
val (cats, notCat) = partitionCollect(animals) { case c: Cat => c }
val (dogAges, rem) = partitionCollect(notCat) { case Dog(_, age) => age }
This is clearly extensible to more categories, with the slight irritation of having to invent temporary variable names (which could be overcome by explicit n-way partition methods).
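As a rough sketch of what such an n-way method might look like for n = 2, the helper below takes two partial functions and returns both results plus the remainder in one pass. The name partitionCollect2 and the extra Owl class are made up for this example.

```scala
object PartitionDemo {
  sealed trait Animal
  final case class Cat(name: String) extends Animal
  final case class Dog(name: String, age: Int) extends Animal
  final case class Owl(wisdom: Double) extends Animal

  // Hypothetical helper: the first matching partial function wins,
  // anything matched by neither goes to the remainder.
  def partitionCollect2[T, A, B](list: List[T])(
      pfA: PartialFunction[T, A],
      pfB: PartialFunction[T, B]
  ): (List[A], List[B], List[T]) = {
    val as = List.newBuilder[A]
    val bs = List.newBuilder[B]
    val rest = List.newBuilder[T]
    list.foreach { t =>
      if (pfA.isDefinedAt(t)) as += pfA(t)
      else if (pfB.isDefinedAt(t)) bs += pfB(t)
      else rest += t
    }
    (as.result(), bs.result(), rest.result())
  }

  val animals: List[Animal] =
    List(Cat("Bob"), Dog("Spot", 3), Cat("Sally"), Dog("Jim", 11), Owl(200))

  val (cats, dogAges, rem) =
    partitionCollect2(animals)({ case c: Cat => c }, { case Dog(_, age) => age })
}
```

The builders are local to the helper, so callers still see an immutable API and no temporary names such as notCat are needed.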

Related

Merging elements of a list of case classes

I have the following case class:
case class GHUser(login:String, contributions:Option[Int])
And a list of such elements:
val list = List(
List(GHUser("a", Some(10)), GHUser("b", Some(10))), List(GHUser("b", Some(300)))
).flatten
And now I would like to merge all elements such that all contributions are added together for the same user. At first I thought I could apply a Monoid to my case class, like this:
trait Semigroup[A] {
def combine(x: A, y: A): A
}
trait Monoid[A] extends Semigroup[A] {
def empty: A
}
case class GHUser(login: String, contributions: Option[Int])
object Main extends App {
val ghMonoid: Monoid[GHUser] = new Monoid[GHUser] {
def empty: GHUser = GHUser("", None)
def combine(x: GHUser, y: GHUser): GHUser = {
x match {
case GHUser(_, None) => GHUser(y.login, y.contributions)
case GHUser(_, Some(xv)) =>
y match {
case GHUser(_, None) => GHUser(x.login, x.contributions)
case GHUser(_, Some(yv)) => GHUser(x.login, Some(xv + yv))
}
}
}
}
val list = List(
List(GHUser("a", Some(10)), GHUser("b", Some(10))), List(GHUser("b", Some(300)))
).flatten
val b = list.groupBy(_.login)
val c = b.mapValues(_.foldLeft(ghMonoid.empty)(ghMonoid.combine))
println(c.valuesIterator mkString("\n"))
// GHUser(a,Some(10))
// GHUser(b,Some(310))
}
And it works, but I feel like I am not following the Monoid laws, as it requires that all users have the same login (for that reason I did the groupBy call).
Is there a cleaner solution?
Update
Rereading my question, it seems like I do not want a Monoid but a Semigroup, am I right?
groupMapReduce() (Scala 2.13) handles most of what you need.
list.groupMapReduce(_.login)(_.contributions){case (a,b) => a.fold(b)(n => Some(n+b.getOrElse(0)))}
.map(GHUser.tupled)
//res0 = List(GHUser(a,Some(10)), GHUser(b,Some(310)))
The Reduce part is a bit convoluted but it gets the job done.
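One way to make the reduce step easier to read is to pull it out into a named function, which is effectively the Semigroup-style combine for Option[Int] that the question's update hints at. This is only a sketch: combineContributions and mergeByLogin are names invented here, and it assumes Scala 2.13 or later for groupMapReduce.

```scala
object MergeUsers {
  final case class GHUser(login: String, contributions: Option[Int])

  // Some values add up; None means "no data" and yields to the other side.
  def combineContributions(x: Option[Int], y: Option[Int]): Option[Int] =
    (x, y) match {
      case (Some(a), Some(b)) => Some(a + b)
      case _                  => x.orElse(y)
    }

  // Group by login, keep only the contributions, and reduce them per group.
  def mergeByLogin(users: List[GHUser]): Map[String, GHUser] =
    users
      .groupMapReduce(_.login)(_.contributions)(combineContributions)
      .map { case (login, total) => login -> GHUser(login, total) }
}
```

Because only the contributions are combined, and only within one login group, no empty GHUser with a made-up login is needed.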
Here is a simple solution:
list.groupBy(_.login).map{
case (k, v) =>
GHUser(k, Some(v.flatMap(_.contributions).sum))
}
This will give Some(0) for users with no contributions. If you want None in this case, it gets uglier:
list.groupBy(_.login).map{
case (k, v) =>
val c = v.flatMap(_.contributions)
GHUser(k, c.headOption.map(_ => c.sum))
}

Creating a union on the left side of an Either type

Is there a way to bind a value of Either[L1, R1] to a function of type R1 => Either[L2, R2] and get a value of Either[L1 | L2, R2] so individual functions can declare and potentially return their errors and consumers of a monadic pipeline of these functions can cleanly handle all possible errors in an exhaustive, type-safe way?
Edit
Here's an example...
sealed trait IncrementError
case object MaximumValueReached extends IncrementError
def increment(n: Int): Either[IncrementError, Int] = n match {
case Integer.MAX_VALUE => Left(MaximumValueReached)
case n => Right(n + 1)
}
sealed trait DecrementError
case object MinimumValueReached extends DecrementError
def decrement(n: Int): Either[DecrementError, Int] = n match {
case Integer.MIN_VALUE => Left(MinimumValueReached)
case n => Right(n - 1)
}
for {
n <- increment(0).right
n <- decrement(n).right
} yield n // scala.util.Either[Object, Int] = Right(0)
With that return type I'm not able to do exhaustive error handling. I'm curious if there exists a way to do this using standard Scala Either or if there exists something in a library like scalaz which supports this behavior. I'd like to be able to handle errors like this...
val n = for {
n <- increment(0).right
n <- decrement(n).right
} yield n // scala.util.Either[IncrementError | DecrementError, Int]
n match {
case Left(MaximumValueReached) => println("Maximum value reached!")
case Left(MinimumValueReached) => println("Minimum value reached!")
case Right(_) => println("Success!")
}
I would do the following:
/**
* Created by alex on 10/3/16.
*/
object Temp{
sealed trait IncrementError
case object MaximumValueReached extends IncrementError
sealed trait DecrementError
case object MinimumValueReached extends DecrementError
type MyResult = Either[Either[IncrementError, DecrementError], Int]
def increment(n: Int): MyResult = n match {
case Integer.MAX_VALUE => Left(Left(MaximumValueReached))
case n => Right(n + 1)
}
def decrement(n: Int): MyResult = n match {
case Integer.MIN_VALUE => Left(Right(MinimumValueReached))
case n => Right(n - 1)
}
def main(args:Array[String]) = {
val result = for {
k <- increment(0).right
n <- decrement(k).right
} yield n
result match {
case Left(Left(MaximumValueReached)) => println("Maximum value reached!")
case Left(Right(MinimumValueReached)) => println("Minimum value reached!")
case Right(_) => println("Success!")
}
}
}
Honestly, I don't like it, but it works if, for some reason, you don't want IncrementError and DecrementError to inherit from a common ancestor trait.
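For comparison, here is a sketch of the common-ancestor alternative mentioned above. The parent trait CounterError and the helper names incThenDec and describe are assumptions introduced for this example. It relies on Either being covariant and right-biased (Scala 2.12 or later), so each function keeps its own precise error type and the for-comprehension widens to the shared parent.

```scala
object CommonAncestor {
  // Hypothetical shared parent for both error families.
  sealed trait CounterError
  sealed trait IncrementError extends CounterError
  case object MaximumValueReached extends IncrementError
  sealed trait DecrementError extends CounterError
  case object MinimumValueReached extends DecrementError

  def increment(n: Int): Either[IncrementError, Int] =
    if (n == Int.MaxValue) Left(MaximumValueReached) else Right(n + 1)

  def decrement(n: Int): Either[DecrementError, Int] =
    if (n == Int.MinValue) Left(MinimumValueReached) else Right(n - 1)

  // The declared result type drives the widening of the left side.
  def incThenDec(start: Int): Either[CounterError, Int] =
    for {
      k <- increment(start)
      n <- decrement(k)
    } yield n

  // CounterError is sealed, so the compiler can check this match.
  def describe(result: Either[CounterError, Int]): String = result match {
    case Left(MaximumValueReached) => "Maximum value reached!"
    case Left(MinimumValueReached) => "Minimum value reached!"
    case Right(_)                  => "Success!"
  }
}
```

The trade-off is that every pipeline needs such a parent type, which is exactly the coupling the nested-Either version avoids.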

scalaz, Disjunction.sequence returning a list of lefts

In scalaz 7.2.6, I want to implement sequence on Disjunction, such that if there is one or more lefts, it returns a list of those, instead of taking only the first one (as in Disjunction.sequenceU):
import scalaz._, Scalaz._
List(1.right, 2.right, 3.right).sequence
res1: \/-(List(1, 2, 3))
List(1.right, "error2".left, "error3".left).sequence
res2: -\/(List(error2, error3))
I've implemented it as follows and it works, but it looks ugly. Is there a getRight method (as on Scala's Either class, e.g. Right[String, Int](3).right.get)? And how can I improve this code?
implicit class RichSequence[L, R](val l: List[\/[L, R]]) {
def getLeft(v: \/[L, R]):L = v match { case -\/(x) => x }
def getRight(v: \/[L, R]):R = v match { case \/-(x) => x }
def sequence: \/[List[L], List[R]] =
if (l.forall(_.isRight)) {
l.map(e => getRight(e)).right
} else {
l.filter(_.isLeft).map(e => getLeft(e)).left
}
}
Playing around I've implemented a recursive function for that, but the best option would be to use separate:
implicit class RichSequence[L, R](val l: List[\/[L, R]]) {
def sequence: \/[List[L], List[R]] = {
def seqLoop(left: List[L], right: List[R], list: List[\/[L, R]]): \/[List[L], List[R]] =
list match {
case (h :: t) =>
h match {
case -\/(e) => seqLoop(left :+ e, right, t)
case \/-(s) => seqLoop(left, right :+ s, t)
}
case Nil =>
if(left.isEmpty) \/-(right)
else -\/(left)
}
seqLoop(List(), List(), l)
}
def sequenceSeparate: \/[List[L], List[R]] = {
val (left, right) = l.separate[\/[L, R], L, R]
if(left.isEmpty) \/-(right)
else -\/(left)
}
}
The first one just collects the results and decides at the end what to do with them. The second is basically the same, except that separate replaces the recursive function, which makes it much simpler. I didn't think about performance here: I used :+, so if you care, prepend instead or use some other collection.
You may also want to take a look at Validation and ValidationNel, which, unlike Disjunction, accumulate failures.
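The same "collect every left" idea can be sketched without scalaz on the standard library's Either, assuming Scala 2.13 or later for partitionMap. The name sequenceAll is made up here, and this illustrates the shape of the result rather than replacing Validation.

```scala
object SequenceAll {
  // Hypothetical helper on scala.util.Either: all rights if there is no left,
  // otherwise every left in its original order.
  def sequenceAll[L, R](xs: List[Either[L, R]]): Either[List[L], List[R]] = {
    val (lefts, rights) = xs.partitionMap(identity)
    if (lefts.isEmpty) Right(rights) else Left(lefts)
  }
}
```

partitionMap builds both lists in a single pass, so the quadratic cost of repeated :+ does not arise.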

Scala recursive macro?

I was wondering whether Scala supports recursive macro expansion. For example, I am trying to write a lens library with a lensing macro that does this:
case class C(d: Int)
case class B(c: C)
case class A(b: B)
val a = A(B(C(10)))
val aa = lens(a)(_.b.c.d)(_ + 12)
assert(aa.b.c.d == 22)
Given lens(a)(_.b.c.d)(f), I want to transform it to a.copy(b = lens(a.b)(_.c.d)(f))
EDIT:
I made some decent progress here
However, I cannot figure out a generic way to create an accessor out of a List[TermName]. For the above example, given that I have List(TermName("b"), TermName("c"), TermName("d")), I want to generate the anonymous function _.b.c.d, i.e. (x: A) => x.b.c.d. How do I do that?
Basically, how can I write these lines in a generic fashion?
Actually I managed to make it work: https://github.com/pathikrit/sauron/blob/master/src/main/scala/com/github/pathikrit/sauron/package.scala
Here is the complete source:
package com.github.pathikrit
import scala.reflect.macros.blackbox
package object sauron {
def lens[A, B](obj: A)(path: A => B)(modifier: B => B): A = macro lensImpl[A, B]
def lensImpl[A, B](c: blackbox.Context)(obj: c.Expr[A])(path: c.Expr[A => B])(modifier: c.Expr[B => B]): c.Tree = {
import c.universe._
def split(accessor: c.Tree): List[c.TermName] = accessor match { // (_.p.q.r) -> List(p, q, r)
case q"$pq.$r" => split(pq) :+ r
case _: Ident => Nil
case _ => c.abort(c.enclosingPosition, s"Unsupported path element: $accessor")
}
def join(pathTerms: List[TermName]): c.Tree = (q"(x => x)" /: pathTerms) { // List(p, q, r) -> (_.p.q.r)
case (q"($arg) => $pq", r) => q"($arg) => $pq.$r"
}
path.tree match {
case q"($_) => $accessor" => split(accessor) match {
case p :: ps => q"$obj.copy($p = lens($obj.$p)(${join(ps)})($modifier))" // lens(a)(_.b.c)(f) = a.copy(b = lens(a.b)(_.c)(f))
case Nil => q"$modifier($obj)" // lens(x)(_)(f) = f(x)
}
case _ => c.abort(c.enclosingPosition, s"Path must have shape: _.a.b.c.(...), got: ${path.tree}")
}
}
}
And, yes, Scala does apply the same macro recursively.
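To see what the recursive expansion amounts to, here is the nested copy chain written out by hand for the example from the question. It uses no macros; the object name and the value f are only for illustration.

```scala
object LensExpansion {
  final case class C(d: Int)
  final case class B(c: C)
  final case class A(b: B)

  val a: A = A(B(C(10)))
  val f: Int => Int = _ + 12

  // lens(a)(_.b.c.d)(f)
  //   = a.copy(b = lens(a.b)(_.c.d)(f))
  //   = a.copy(b = a.b.copy(c = lens(a.b.c)(_.d)(f)))
  //   = a.copy(b = a.b.copy(c = a.b.c.copy(d = lens(a.b.c.d)(x => x)(f))))
  //   = a.copy(b = a.b.copy(c = a.b.c.copy(d = f(a.b.c.d))))
  val aa: A = a.copy(b = a.b.copy(c = a.b.c.copy(d = f(a.b.c.d))))
}
```

Each line of the comment corresponds to one round of macro expansion, ending in the Nil case where the modifier is applied directly.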

Folding on case classes

I have a situation where I have a couple of case classes where all of their variables are optional.
Let's say I have:
case class Size(width: Option[Int], height: Option[Int])
case class Foo(a: Option[String], b: Option[Boolean], c: Option[Char])
Given a collection of the same type of case class I would like to fold over them comparing the option values and keep the values which are defined. I.e. for Size:
values.foldLeft(x) { (a, b) =>
Size(a.width.orElse(b.width), a.height.orElse(b.height))
}
I would like to do this in a more general way for any of the case classes like the ones above. I'm thinking about doing something with unapply(_).get etc. Does anyone know a smart way to solve this?
Ok, consider this:
def foldCase[C,T1](unapply: C => Option[Option[T1]], apply: Option[T1] => C)
(coll: Seq[C]): C = {
coll.tail.foldLeft(coll.head) { case (current, next) =>
apply(unapply(current).get orElse unapply(next).get)
}
}
case class Person(name: Option[String])
foldCase(Person.unapply, Person.apply)(List(Person(None), Person(Some("Joe")), Person(Some("Mary"))))
One could overload foldCase to accept two, three, or more parameters, with one version of foldCase for each arity. It could then be used with any case class. Since there's the tuple-thing to worry about, below is one way to make it work with case classes of two parameters. Expanding it to more parameters is then trivial, though a bit tiresome.
def foldCase[C,T1,T2](unapply: C => Option[(Option[T1], Option[T2])], apply: (Option[T1], Option[T2]) => C)
(coll: Seq[C]): C = {
def thisOrElse(current: (Option[T1], Option[T2]), next: (Option[T1], Option[T2])) =
apply(current._1 orElse next._1, current._2 orElse next._2)
coll.tail.foldLeft(coll.head) { case (current, next) =>
thisOrElse(unapply(current).get, unapply(next).get)
}
}
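case class Person(name: Option[String], age: Option[Int]) // assumed two-field Person (field names guessed), replacing the one-field version above
val list = Person(None, None) :: Person(Some("Joe"), None) :: Person(None, Some(20)) :: Person(Some("Mary"), Some(25)) :: Nil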
def foldPerson = foldCase(Person.unapply, Person.apply) _
foldPerson(list)
To use it overloaded, just put all definitions inside one object:
object Folder {
def foldCase[C,T1](unapply: C => Option[Option[T1]], apply: Option[T1] => C)
(coll: Seq[C]): C = {
coll.tail.foldLeft(coll.head) { case (current, next) =>
apply(unapply(current).get orElse unapply(next).get)
}
}
def foldCase[C,T1,T2](unapply: C => Option[(Option[T1], Option[T2])], apply: (Option[T1], Option[T2]) => C)
(coll: Seq[C]): C = {
def thisOrElse(current: (Option[T1], Option[T2]), next: (Option[T1], Option[T2])) =
apply(current._1 orElse next._1, current._2 orElse next._2)
coll.tail.foldLeft(coll.head) { case (current, next) =>
thisOrElse(unapply(current).get, unapply(next).get)
}
}
}
When you do this, however, you'll have to explicitly turn apply and unapply into functions:
case class Question(answer: Option[Boolean])
val list2 = List(Question(None), Question(Some(true)), Question(Some(false)))
Folder.foldCase(Question.unapply _, Question.apply _)(list2)
It might be possible to turn it into a structural type, so that you only need to pass the companion object, but I couldn't do it. On #scala, I was told the answer is a definitive no, at least to how I approached the problem.
[Code updated]
Here is a solution which requires only one abstract class per "arity":
abstract class Foldable2[A,B](val a:Option[A], val b:Option[B]) {
def orElse[F <: Foldable2[A,B]](that: F)(implicit ev: this.type <:< F) =
getClass.getConstructor(classOf[Option[A]], classOf[Option[B]]).newInstance(
this.a.orElse(that.a), this.b.orElse(that.b)
)
}
case class Size(w: Option[Int], h: Option[Int]) extends Foldable2(w, h)
println(Size(Some(1),None).orElse(Size(Some(2),Some(42))))
//--> Size(Some(1),Some(42))
Note that the implicit <:< argument will give a compile time error when other case classes with the same constructor arguments are passed to the method.
However, a "well formed" constructor is required, else the reflection code will blow up.
You can use productElement or productIterator (on scala.Product) to generically retrieve/iterate the elements of case classes (and tuples), but they're typed as Any, so there will be some pain.
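A small sketch of that approach, to show both what it gives you and where the pain comes from. The helper name mergeFields is made up, and the sketch assumes both arguments are instances of the same case class.

```scala
object ProductMerge {
  final case class Size(width: Option[Int], height: Option[Int])

  // Hypothetical helper: walks the fields of two products in parallel and
  // keeps the first defined Option for each position. The static types are
  // lost, so the result is only a List[Option[Any]].
  def mergeFields(a: Product, b: Product): List[Option[Any]] =
    a.productIterator
      .zip(b.productIterator)
      .map[Option[Any]] {
        case (x: Option[_], y: Option[_]) => if (x.isDefined) x else y
        case (x, _)                       => Some(x) // non-Option field: keep the left value
      }
      .toList

  val merged: List[Option[Any]] =
    mergeFields(Size(Some(1), None), Size(Some(2), Some(42)))
}
```

Turning the merged list back into a Size still needs a cast or reflection, which is why the typed foldCase and Foldable2 approaches above may be preferable.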