Scalaz: request for use case for Cokleisli composition - scala

This question isn't meant as flame-bait! As it might be apparent, I've been looking at Scalaz recently. I'm trying to understand why I need some of the functionality that the library provides. Here's something:
import scalaz._
import Scalaz._
type NEL[A] = NonEmptyList[A]
val NEL = NonEmptyList
I put some println statements in my functions to see what was going on (aside: what would I have done if I was trying to avoid side effects like that?). My functions are:
val f: NEL[Int] => String = (l: NEL[Int]) => {println("f: " + l); l.toString |+| "X" }
val g: NEL[String] => BigInt = (l: NEL[String]) => {println("g: " + l); BigInt(l.map(_.length).sum) }
Then I combine them via a cokleisli and pass in a NEL[Int]
val k = cokleisli(f) =>= cokleisli(g)
println("RES: " + k( NEL(1, 2, 3) ))
What does this print?
f: NonEmptyList(1, 2, 3)
f: NonEmptyList(2, 3)
f: NonEmptyList(3)
g: NonEmptyList(NonEmptyList(1, 2, 3)X, NonEmptyList(2, 3)X, NonEmptyList(3)X)
RES: 57
The RES value is the character count of the (String) elements in the final NEL. Two things occur to me:
How could I have known that my NEL was going to be reduced in this manner from the method signatures involved? (I wasn't expecting the result at all)
What is the point of this? Can a reasonably simple and easy-to-follow use case be distilled for me?
This question is a thinly-veiled plea for some lovely person like retronym to explain how this powerful library actually works.

To understand the result, you need to understand the Comonad[NonEmptyList] instance. Comonad[W] essentially provides three functions (the actual interface in Scalaz is a little different, but this helps with explanation):
map: (A => B) => W[A] => W[B]
copure: W[A] => A
cojoin: W[A] => W[W[A]]
So, Comonad provides an interface for some container W that has a distinguished "head" element (copure) and a way of exposing the inner structure of the container so that we get one container per element (cojoin), each with a given element at the head.
The way that this is implemented for NonEmptyList is that copure returns the head of the list, and cojoin returns a list of lists, with this list at the head and all tails of this list at the tail.
Example (I'm shortening NonEmptyList to Nel):
Nel(1,2,3).copure = 1
Nel(1,2,3).cojoin = Nel(Nel(1,2,3),Nel(2,3),Nel(3))
The =>= function is coKleisli composition. How would you compose two functions f: W[A] => B and g: W[B] => C, knowing nothing about them other than that W is a Comonad? The input type of f and the output type of g aren't compatible. However, you can map(f) to get W[W[A]] => W[B] and then compose that with g. Now, given a W[A], you can cojoin it to get the W[W[A]] to feed into that function. So, the only reasonable composition is a function k that does the following:
k(x) = g(x.cojoin.map(f))
So for your nonempty list:
g(Nel(1,2,3).cojoin.map(f))
= g(Nel(Nel(1,2,3),Nel(2,3),Nel(3)).map(f))
= g(Nel("Nel(1,2,3)X","Nel(2,3)X","Nel(3)X"))
= BigInt(Nel("Nel(1,2,3)X","Nel(2,3)X","Nel(3)X").map(_.length).sum)
= BigInt(Nel(11,9,7).sum)
= 27

Cojoin is also defined for scalaz.Tree and scalaz.TreeLoc. This can be exploited to find a stream of all paths from the root of the tree to each leaf node.
def leafPaths[T](tree: Tree[T]): Stream[Stream[T]]
= tree.loc.cojoin.toTree.flatten.filter(_.isLeaf).map(_.path)
Using coKleisli arrow composition, we can do this, for example:
def leafDist[A] = (cokleisli(leafPaths[A]) &&& cokleisli(_.rootLabel))
=>= (_.map(s => (s._2, s._1.map(_.length).max)))
leafDist takes a Tree and returns a copy of it with each node annotated with its maximum distance from a leaf.

Related

Foldable "foldMap" that take a partial function: foldCollect?

Say, I have the following object:
case class MyFancyObject(a: String, b: Int, c : Vector[String])
And what I needed is to get a single Vector[String] containing all 'c's that match a given partial function.
E.g.:
val xs = Vector(
MyFancyObject("test1",1,Vector("test1-1","test1-2","test1-3")),
MyFancyObject("test2",2,Vector("test2-1","test2-2","test2-3")),
MyFancyObject("test3",3,Vector("test3-1","test3-2","test3-3")),
MyFancyObject("test4",4,Vector("test4-1","test4-2","test4-3"))
)
val partialFunction1 : PartialFunction[MyFancyObject,Vector[String]] = {
case MyFancyObject(_,b,c) if b > 2 => c
}
What I need to get is: Vector("test3-1","test3-2","test3-3","test4-1","test4-2","test4-3").
I solved this doing the following:
val res1 = xs.foldMap{
case MyFancyObject(_,b,c) if b > 2 => c
case _ => Vector.empty[String]
}
However, this made me curious. What I am doing here seemed to be a pretty common and natural thing: for each element of a foldable collection, try to apply a partial function and, should that fail, default to the Monoid's empty (Vector.empty in my case). I searched in the library and I did not find anything doing this already, so I ended up adding this extension method in my code:
implicit class FoldableExt[F[_], A](foldable : F[A]) {
def foldCollect[B](pF: PartialFunction[A, B])(implicit F : Foldable[F], B : Monoid[B]) : B = {
F.foldMap(foldable)(pF.applyOrElse(_, (_ : A) => B.empty))
}
}
My question here is:
Is there any reason why such a method would not be in available already? Is it not a generic and common enough scenario, or am I missing something?
I think that if you really need the partial function, you don't want it to leak outside, because it's not very nice to use. The best thing to do, if you want to reuse your partialFunction1, is to lift it to make it a total function that returns Option. Then you can provide your default case in the same place you use your partial function. Here's the approach:
val res2 = xs.foldMap(partialFunction1.lift).getOrElse(Vector.empty)
The foldMap(partialFunction1.lift) returns Some(Vector(test3-1, test3-2, test3-3, test4-1, test4-2, test4-3)). This is exactly what you have in res1, but wrapped in Option.

Variable length "select"s with quasiquotes

With Scala's quasiquotes you can build trees of selects easily, like so:
> tq"a.b.MyObj"
res: Select(Select(Ident(TermName("a")), TermName("b")), TermName("MyObj"))
My question is, how do I do this if the list of things to select from (a,b,...,etc) is variable length (and therefore in a variable that needs to be spliced in)?
I was hoping lifting would work (e.g. tq"""..${List("a","b","MyObj")}""" but it doesn't. Or maybe even this tq"""${List("a","b","MyObj").mkString(".")}""", but no luck.
Is there a way to support this with quasiquotes? Or do I just need to construct the tree of selects manually in this case?
I don't think there is a way to do this outright with quasiquotes. I'm definitely sure that anything like tq"""${List("a","b","MyObj").mkString(".")}""" will not work. My understanding of quasiquotes is that they are just sugar for extractors and apply.
However, building on that idea, we can define a custom extractor to do what you want. (By the way, I'm sure there is a nicer way to express this, but you get the idea...)
object SelectTermList {
def apply(arg0: String, args: List[String]): universe.Tree =
args.foldLeft(Ident(TermName(arg0)).asInstanceOf[universe.Tree])
((s,arg) => Select(s, TermName(arg)))
def unapply(t: universe.Tree): Option[(String,List[String])] = t match {
case Ident(TermName(arg0)) => Some((arg0, List()))
case Select(SelectTermList(arg0,args),TermName(arg)) =>
Some((arg0, args ++ List(arg)))
case _ => None
}
}
Then, you can use this to both construct and extract expressions of the form a.b.MyObj.
Extractor tests:
scala> val SelectTermList(obj0,selectors0) = q"a.b.c.d.e.f.g.h"
obj0: String = a
selectors0: List[String] = List(b, c, d, e, f, g, h)
scala> val q"someObject.method(${SelectTermList(obj1,selectors1)})" = q"someObject.method(a.b.MyObj)"
obj1: String = a
selectors1: List[String] = List(b, MyObj)
Corresponding apply tests:
scala> SelectTermList(obj0,selectors0)
res: universe.Tree = a.b.c.d.e.f.g.h
scala> q"someObject.method(${SelectTermList(obj1,selectors1)})"
res: universe.Tree = someObject.method(a.b.MyObj)
As you can see, there is no problem with nesting the extractors deep inside quasi quotes, both when constructing and extracting.

How to compose functions that return Validation?

This is a follow-up to my previous question
Suppose I have two validating functions that return either the input if it is valid or the error messages if it is not.
type Status[A] = ValidationNel[String, A]
val isPositive: Int => Status[Int] =
x => if (x > 0) x.success else s"$x not positive".failureNel
val isEven: Int => Status[Int] =
x => if (x % 2 == 0) x.success else s"$x not even".failureNel
Suppose also that I need to validate an instance of case class X:
case class X(x1: Int, // should be positive
x2: Int) // should be even
More specifically I need a function checkX: X => Status[X]. Moreover, I'd like to write checkX as a composition of isPositive and isEven.
val checkX: X => Status[X] =
({x => isPositive(x.x1)} |#| {x => isEven(x.x2)}) ((X.apply _).lift[Status])
Does it make sense ?
How would you write checkX as a composition of isPositive and isEven?
There are lots of ways to write this, but I like the following:
val checkX: X => Status[X] = x => isPositive(x.x1).tuple(isEven(x.x2)).as(x)
Or:
val checkX: X => Status[X] =
x => isPositive(x.x1) *> isEven(x.x2) *> x.point[Status]
The key point is that you want to run the two validations only for their "effects" and then return the original value in the new context. This is a perfectly legitimate applicative operation, as your own implementation shows. There are just some slightly nicer ways to write it.
It doesn't make sense, because the two functions can't really be composed. You'd (presumably) like to check whether x is positive and whether x is even - but in the case where it is both negative and odd, you'd like to receive both errors. But that can't ever happen as a composition of those two functions - once you apply either of the failing cases, you don't have an x any more to pass to the second function.
In my experience Validation is almost never the correct type to use, for this very reason.
If you want "fail-fast" behaviour where the result is either success or the first error, you should use \/ (i.e. type Status[A] = String \/ A in this case). If you want "accumulate all the error messages alongside the value" behaviour, you want Writer, i.e. type Status[A] = Writer[Vector[String], A]. Both of these types allow easy composition (because they have monad instances available) using e.g. Kleisli: Kleisli(isPositive) >==> isEven would work for either of these definitions of Status (but not for yours).

Scala, a cross between a foldLeft and a map supporting lazy evaluation

I have a collection which I want to map to a new collection, however each resulting value is dependent on the value before it in some way.I could solve this with a leftFold
val result:List[B] = (myList:List[A]).foldLeft(C -> List.empty[B]){
case ((c, list), a) =>
..some function returning something like..
C -> (B :: list)
}
The problem here is I need to iterate through the entire list to retrieve the resultant list. Say I wanted a function that maps TraversableOnce[A] to TraversableOnce[B] and only evaluate members as I call them?
It seems to me to be a fairly conventional problem so Im wondering if there is a common approach to this. What I currently have is:
implicit class TraversableOnceEx[T](val self : TraversableOnce[T]) extends AnyVal {
def foldyMappyFunction[A, U](a:A)(func:(A,T) => (A,U)):TraversableOnce[U] = {
var currentA = a
self.map { t =>
val result = func(currentA, t)
currentA = result._1
result._2
}
}
}
As far as functional purity goes, you couldn't run it in parallel, but otherwise it seems sound.
An example would be;
Return me each element and if it is the first time that element has appeared before.
val elements:TraversableOnce[E]
val result = elements.mappyFoldyFunction(Set.empty[E]) {
(s, e) => (s + e) -> (e -> s.contains(e))
}
result:TraversableOnce[(E,Boolean)]
You might be able to make use of the State Monad. Here is your example re-written using scalaz:
import scalaz._, Scalaz._
def foldyMappy(i: Int) = State[Set[Int], (Int, Boolean)](s => (s + i, (i, s contains(i))))
val r = List(1, 2, 3, 3, 6).traverseS(foldyMappy)(Set.empty[Int])._2
//List((1,false), (2,false), (3,false), (3,true), (6,false))
println(r)
It is look like you need SeqView. Use view or view(from: Int, until: Int) methods for create a non-strict view of list.
I really don't understand your example as your contains check will always result to false.
foldLeft is different. It will result in a single value by aggregating all elements of the list.
You clearly need map (List => List).
Anyway, answering your question about laziness:
you should use Stream instead of List. Stream doesn't evaluate the tail before actually calling it.
Stream API

Nearest keys in a SortedMap

Given a key k in a SortedMap, how can I efficiently find the largest key m that is less than or equal to k, and also the smallest key n that is greater than or equal to k. Thank you.
Looking at the source code for 2.9.0, the following code seems about to be the best you can do
def getLessOrEqual[A,B](sm: SortedMap[A,B], bound: A): B = {
val key = sm.to(x).lastKey
sm(key)
}
I don't know exactly how the splitting of the RedBlack tree works, but I guess it's something like a O(log n) traversal of the tree/construction of new elements and then a balancing, presumable also O(log n). Then you need to go down the new tree again to get the last key. Unfortunately you can't retrieve the value in the same go. So you have to go down again to fetch the value.
In addition the lastKey might throw an exception and there is no similar method that returns an Option.
I'm waiting for corrections.
Edit and personal comment
The SortedMap area of the std lib seems to be a bit neglected. I'm also missing a mutable SortedMap. And looking through the sources, I noticed that there are some important methods missing (like the one the OP asks for or the ones pointed out in my answer) and also some have bad implementation, like 'last' which is defined by TraversableLike and goes through the complete tree from first to last to obtain the last element.
Edit 2
Now the question is reformulated my answer is not valid anymore (well it wasn't before anyway). I think you have to do the thing I'm describing twice for lessOrEqual and greaterOrEqual. Well you can take a shortcut if you find the equal element.
Scala's SortedSet trait has no method that will give you the closest element to some other element.
It is presently implemented with TreeSet, which is based on RedBlack. The RedBlack tree is not visible through methods on TreeSet, but the protected method tree is protected. Unfortunately, it is basically useless. You'd have to override methods returning TreeSet to return your subclass, but most of them are based on newSet, which is private.
So, in the end, you'd have to duplicate most of TreeSet. On the other hand, it isn't all that much code.
Once you have access to RedBlack, you'd have to implement something similar to RedBlack.Tree's lookup, so you'd have O(logn) performance. That's actually the same complexity of range, though it would certainly do less work.
Alternatively, you'd make a zipper for the tree, so that you could actually navigate through the set in constant time. It would be a lot more work, of course.
Using Scala 2.11.7, the following will give what you want:
scala> val set = SortedSet('a', 'f', 'j', 'z')
set: scala.collection.SortedSet[Char] = TreeSet(a, f, j, z)
scala> val beforeH = set.to('h').last
beforeH: Char = f
scala> val afterH = set.from('h').head
afterH: Char = j
Generally you should use lastOption and headOption as the specified elements may not exist. If you are looking to squeeze a little more efficiency out, you can try replacing from(...).head with keysIteratorFrom(...).head
Sadly, the Scala library only allows to make this type of query efficiently:
and also the smallest key n that is greater than or equal to k.
val n = TreeMap(...).keysIteratorFrom(k).next
You can hack this by keeping two structures, one with normal keys, and one with negated keys. Then you can use the other structure to make the second type of query.
val n = - TreeMap(...).keysIteratorFrom(-k).next
Looks like I should file a ticket to add 'fromIterator' and 'toIterator' methods to 'Sorted' trait.
Well, one option is certainly using java.util.TreeMap.
It has lowerKey and higherKey methods, which do excatly what you want.
I had a similar problem: I wanted to find the closest element to a given key in a SortedMap. I remember the answer to this question being, "You have to hack TreeSet," so when I had to implement it for a project, I found a way to wrap TreeSet without getting into its internals.
I didn't see jazmit's answer, which more closely answers the original poster's question with minimum fuss (two method calls). However, those method calls do more work than needed for this application (multiple tree traversals), and my solution provides lots of hooks where other users can modify it to their own needs.
Here it is:
import scala.collection.immutable.TreeSet
import scala.collection.SortedMap
// generalize the idea of an Ordering to metric sets
trait MetricOrdering[T] extends Ordering[T] {
def distance(x: T, y: T): Double
def compare(x: T, y: T) = {
val d = distance(x, y)
if (d > 0.0) 1
else if (d < 0.0) -1
else 0
}
}
class MetricSortedMap[A, B]
(elems: (A, B)*)
(implicit val ordering: MetricOrdering[A])
extends SortedMap[A, B] {
// while TreeSet searches for an element, keep track of the best it finds
// with *thread-safe* mutable state, of course
private val best = new java.lang.ThreadLocal[(Double, A, B)]
best.set((-1.0, null.asInstanceOf[A], null.asInstanceOf[B]))
private val ord = new MetricOrdering[(A, B)] {
def distance(x: (A, B), y: (A, B)) = {
val diff = ordering.distance(x._1, y._1)
val absdiff = Math.abs(diff)
// the "to" position is a key-null pair; the object of interest
// is the other one
if (absdiff < best.get._1)
(x, y) match {
// in practice, TreeSet always picks this first case, but that's
// insider knowledge
case ((to, null), (pos, obj)) =>
best.set((absdiff, pos, obj))
case ((pos, obj), (to, null)) =>
best.set((absdiff, pos, obj))
case _ =>
}
diff
}
}
// use a TreeSet as a backing (not TreeMap because we need to get
// the whole pair back when we query it)
private val treeSet = TreeSet[(A, B)](elems: _*)(ord)
// find the closest key and return:
// (distance to key, the key, its associated value)
def closest(to: A): (Double, A, B) = {
treeSet.headOption match {
case Some((pos, obj)) =>
best.set((ordering.distance(to, pos), pos, obj))
case None =>
throw new java.util.NoSuchElementException(
"SortedMap has no elements, and hence no closest element")
}
treeSet((to, null.asInstanceOf[B])) // called for side effects
best.get
}
// satisfy the contract (or throw UnsupportedOperationException)
def +[B1 >: B](kv: (A, B1)): SortedMap[A, B1] =
new MetricSortedMap[A, B](
elems :+ (kv._1, kv._2.asInstanceOf[B]): _*)
def -(key: A): SortedMap[A, B] =
new MetricSortedMap[A, B](elems.filter(_._1 != key): _*)
def get(key: A): Option[B] = treeSet.find(_._1 == key).map(_._2)
def iterator: Iterator[(A, B)] = treeSet.iterator
def rangeImpl(from: Option[A], until: Option[A]): SortedMap[A, B] =
new MetricSortedMap[A, B](treeSet.rangeImpl(
from.map((_, null.asInstanceOf[B])),
until.map((_, null.asInstanceOf[B]))).toSeq: _*)
}
// test it with A = Double
implicit val doubleOrdering =
new MetricOrdering[Double] {
def distance(x: Double, y: Double) = x - y
}
// and B = String
val stuff = new MetricSortedMap[Double, String](
3.3 -> "three",
1.1 -> "one",
5.5 -> "five",
4.4 -> "four",
2.2 -> "two")
println(stuff.iterator.toList)
println(stuff.closest(1.5))
println(stuff.closest(1000))
println(stuff.closest(-1000))
println(stuff.closest(3.3))
println(stuff.closest(3.4))
println(stuff.closest(3.2))
I've been doing:
val m = SortedMap(myMap.toSeq:_*)
val offsetMap = (m.toSeq zip m.keys.toSeq.drop(1)).map {
case ( (k,v),newKey) => (newKey,v)
}.toMap
When I want the results of my map off-set by one key. I'm also looking for a better way, preferably without storing an extra map.