I have a function like this:
def foo(item: Item) : Option[Int] = Try{
// Some code that can blow up
}.toOption
I have a list of items and I want to map through them, and apply the above function. But if the function above blows up and returns a None then the result of the map should be an error:
items.map{
item => foo(item)
}
Is map not the right thing to do here? It doesn't seem like it
This is called traverse. If you can use cats, it is as simple as:
import cats.implicits._
val result = items.traverse(foo) // Option[List[Int]]
If not, you can implement it pretty easily:
def traverse[A, B](data: List[A])(f: A => Option[B]): Option[List[B]] = {
#annotation.tailrec
def loop(remaining: List[A], acc: List[B]): Option[List[B]] =
remaining match {
case a :: as => f(a) match {
case Some(b) => loop(remaining = as, b :: acc)
case None => None
}
case Nil => Some(acc.reverse)
}
loop(remaining = data, acc = List.empty)
}
Which you can use like:
val result = traverse(items)(foo) // Option[List[Int]]
(however, I would suggest you to use cats instead, since it is more general).
For out-of-the-box short-circuiting, consider wrapping the list-mapping with Try like so
def fooUnsafe(item: Item): Int = // might throw
Try(items.map(fooUnsafe))
If you wish to keep def foo(item: Item) : Option[Int] signature then the following will also short-circuit
Try(list.map(v => foo(v).get))
Related
USE CASE
I have a list of files that can might have a valid mime type or not.
In my code, I represent this using an Option.
I need to convert a Seq[Option[T]] to Option[Seq[T]] so that I do not process the list if some of the files are invalid.
ERROR
This is the error in the implementation below:
found : (Option[Seq[A]], Option[A]) => Option[Seq[A]]
[error] required: (Option[Any], Option[Any]) => Option[Any]
[error] s.fold(init)(liftOptionItem[A])
IMPLEMENTATION
def liftOptionItem[A](acc: Option[Seq[A]], itemOption: Option[A]): Option[Seq[A]] = {
{
acc match {
case None => None
case Some(items) =>
itemOption match {
case None => None
case Some(item) => Some(items ++ Seq(item))
}
}
}
}
def liftOption[A](s: Seq[Option[A]]): Option[Seq[A]] = {
s.fold(Some(Seq()))(liftOptionItem[A])
}
This implementation returns Option[Any] instead of Option[Seq[A] as the type of the liftOptionItem[A] does not fit in.
If you use TypeLevel Cats:
import cats.implicits._
List(Option(1), Option(2), Option(3)).traverse(identity)
Returns:
Option[List[Int]] = Some(List(1, 2, 3))
You have to use List so use a toList first:
Seq(Option(1), Option(2), Option(3)).toList.traverse(identity).map(_.toSeq)
using scalaz:
import scalaz._
import Sclaza._
val x:List[Option[Int]] = List(Option(1))
x.sequence[Option, Int] //returns Some(List(1))
val y:List[Option[Int]] = List(None, Option(1))
y.sequence[Option, Int] // returns None
If you dont want to use functional libraries like cats or Scalaz you could use a foldLeft
def seqToOpt[A](seq: Seq[Option[A]]): Option[Seq[A]] =
seq.foldLeft(Option(Seq.empty[A])){
(res, opt) =>
for {
seq <- res
v <- opt
} yield seq :+ v
}
Tail-recursive Solution: It returns None if any one of the seq element is None.
def seqToOption[T](s: Seq[Option[T]]): Option[Seq[T]] = {
#tailrec
def seqToOptionHelper(s: Seq[Option[T]], accum: Seq[T] = Seq[T]()): Option[Seq[T]] = {
s match {
case Some(head) :: Nil => Option(head +: accum)
case Some(head) :: tail => seqToOptionHelper(tail, head +: accum)
case _ => None
}
}
seqToOptionHelper(s)
}
Dealing with None in case statements is the reason for returning the Option[Seq[Any]] type in stead of Option[Seq[A]] type. We need to make the function
liftOptionItem[A] to return Option[Seq[Any]] type. And the compilation error can be fixed with the following changes in both the functions.(Because fold does not go in any particular order, there are constraints on the start value and thus return value , the foldLeft is used in stead of fold.)
def liftOptionItem[A](acc: Option[Seq[Any]], itemOption: Option[A]): Option[Seq[Any]] = {
{
acc match {
case None => Some(Nil)
case Some(items)=>
itemOption match {
case None => Some(items ++ Seq("None"))
case Some(item) => Some(items ++ Seq(item))
}
}
}
}
def liftOption[A](s: Seq[Option[A]]): Option[Seq[Any]] = {
s.foldLeft(Option(Seq[Any]()))(liftOptionItem[A])
}
Now, code compiles.
In Scala REPL:
scala> val list1 = Seq(None,Some(21),None,Some(0),Some(43),None)
list1: Seq[Option[Int]] = List(None, Some(21), None, Some(0), Some(43), None)
scala> liftOption(list1)
res2: Option[Seq[Any]] = Some(List(None, 21, None, 0, 43, None))
scala> val list2 = Seq(None,Some("String1"),None,Some("String2"),Some("String3"),None)
list2: Seq[Option[String]] = List(None, Some(String1), None, Some(String2), Some(String3), None)
scala> liftOption(list2)
res3: Option[Seq[Any]] = Some(List(None, String1, None, String2, String3, None))
There is no really "beautiful" way to make this with out scalaz or cats.
But you can try something like this.
def seqToOpt[A](seq: Seq[Option[A]]): Option[Seq[A]] = {
val flatten = seq.flatten
if (flatten.isEmpty) None
else Some(flatten)
}
I want to update a sequence in Scala, I have this code :
def update(userId: Long): Either[String, Int] = {
Logins.findByUserId(userId) map {
logins: Login => update(login.id,
Seq(NamedParameter("random_date", "prefix-" + logins.randomDate)))
} match {
case sequence : Seq(Nil, Int) => sequence.foldLeft(Right(_) + Right(_))
case _ => Left("error.logins.update")
}
}
Where findByUserId returns a Seq[Logins] and update returns Either[String, Int] where Int is the number of updated rows,
and String would be the description of the error.
What I want to achieve is to return an String if while updating the list an error happenes or an Int with the total number of updated rows.
The code is not working, I think I should do something different in the match, I don't know how I can check if every element in the Seq of Eithers is a Right value.
If you are open to using Scalaz or Cats you can use traverse. An example using Scalaz :
import scalaz.std.either._
import scalaz.std.list._
import scalaz.syntax.traverse._
val logins = Seq(1, 2, 3)
val updateRight: Int => Either[String, Int] = Right(_)
val updateLeft: Int => Either[String, Int] = _ => Left("kaboom")
logins.toList.traverseU(updateLeft).map(_.sum) // Left(kaboom)
logins.toList.traverseU(updateRight).map(_.sum) // Right(6)
Traversing over the logins gives us a Either[String, List[Int]], if we get the sum of the List we get the wanted Either[String, Int].
We use toList because there is no Traverse instance for Seq.
traverse is a combination of map and sequence.
We use traverseU instead of traverse because it infers some of the types for us (otherwise we should have introduced a type alias or a type lambda).
Because we imported scalaz.std.either._ we can use map directly without using a right projection (.right.map).
You shouldn't really use a fold if you want to exit early. A better solution would be to recursively iterate over the list, updating and counting successes, then return the error when you encounter one.
Here's a little example function that shows the technique. You would probably want to modify this to do the update on each login instead of just counting.
val noErrors = List[Either[String,Int]](Right(10), Right(12))
val hasError = List[Either[String,Int]](Right(10), Left("oops"), Right(12))
def checkList(l: List[Either[String,Int]], goodCount: Int): Either[String, Int] = {
l match {
case Left(err) :: xs =>
Left(err)
case Right(_) :: xs =>
checkList(xs, (goodCount + 1))
case Nil =>
Right(goodCount)
}
}
val r1 = checkList(noErrors, 0)
val r2 = checkList(hasError, 0)
// r1: Either[String,Int] = Right(2)
// r2: Either[String,Int] = Left(oops)
You want to stop as soon as an update fails, don't you?
That means that you want to be doing your matching inside the map, not outside. Try is actually a more suitable construct for this purpose, than Either. Something like this, perhaps:
def update(userId: Long): Either[String, Int] = Try {
Logins.findByUserId(userId) map { login =>
update(login.id, whatever) match {
case Right(x) => x
case Left(s) => throw new Exception(s)
}
}.sum
}
.map { n => Right(n) }
.recover { case ex => Left(ex.getMessage) }
BTW, a not-too-widely-known fact about scala is that putting a return statement inside a lambda, actually returns from the enclosing method. So, another, somewhat shorter way to write this would be like this:
def update(userId: Long): Either[String, Int] =
Logins.findByUserId(userId).foldLeft(Right(0)) { (sum,login) =>
update(login.id, whatever) match {
case Right(x) => Right(sum.right + x)
case error#Left(s) => return error
}
}
Also, why in the world does findUserById return a sequence???
I have an Iterator[Record] which is ordered on record.id this way:
record.id=1
record.id=1
...
record.id=1
record.id=2
record.id=2
..
record.id=2
Records of a specific ID could occur a large number of times, so I want to write a function that takes this iterator as input, and returns an Iterator[Iterator[Record]] output in a lazy manner.
I was able to come up with the following, but it fails on StackOverflowError after 500K records or so:
def groupByIter[T, B](iterO: Iterator[T])(func: T => B): Iterator[Iterator[T]] = new Iterator[Iterator[T]] {
var iter = iterO
def hasNext = iter.hasNext
def next() = {
val first = iter.next()
val firstValue = func(first)
val (i1, i2) = iter.span(el => func(el) == firstValue)
iter = i2
Iterator(first) ++ i1
}
}
What am I doing wrong?
Trouble here is that each Iterator.span call makes another stacked closure for trailing iterator, and without any trampolining it's very easy to overflow.
Actually I dont think there is an implementation, which is not memoizing elements of prefix iterator, since followed iterator could be accessed earlier than prefix is drain out.
Even in .span implementation there is a Queue to memoize elements in the Leading definition.
So easiest implementation that I could imagine is the following via Stream.
implicit class StreamChopOps[T](xs: Stream[T]) {
def chopBy[U](f: T => U): Stream[Stream[T]] = xs match {
case x #:: _ =>
def eq(e: T) = f(e) == f(x)
xs.takeWhile(eq) #:: xs.dropWhile(eq).chopBy(f)
case _ => Stream.empty
}
}
Although it could be not the most performant as it memoize a lot. But with proper iterating of that, GC should handle problem of excess intermediate streams.
You could use it as myIterator.toStream.chopBy(f)
Simple check validates that following code can run without SO
Iterator.fill(10000000)(Iterator(1,1,2)).flatten //1,1,2,1,1,2,...
.toStream.chopBy(identity) //(1,1),(2),(1,1),(2),...
.map(xs => xs.sum * xs.size).sum //60000000
Inspired by chopBy implemented by #Odomontois here is a chopBy I implemented for Iterator. Of course each bulk should fit allocated memory. It doesn't looks very elegant but it seems to work :)
implicit class IteratorChopOps[A](toChopIter: Iterator[A]) {
def chopBy[U](f: A => U) = new Iterator[Traversable[A]] {
var next_el: Option[A] = None
#tailrec
private def accum(acc: List[A]): List[A] = {
next_el = None
val new_acc = hasNext match {
case true =>
val next = toChopIter.next()
acc match {
case Nil =>
acc :+ next
case _ MatchTail t if (f(t) == f(next)) =>
acc :+ next
case _ =>
next_el = Some(next)
acc
}
case false =>
next_el = None
return acc
}
next_el match{
case Some(_) =>
new_acc
case None => accum(new_acc)
}
}
def hasNext = {
toChopIter.hasNext || next_el.isDefined
}
def next: Traversable[A] = accum(next_el.toList)
}
}
And here is an extractor for matching tail:
object MatchTail {
def unapply[A] (l: Traversable[A]) = Some( (l.init, l.last) )
}
I am trying to find a cleaner way to express code that looks similar to this:
def method1: Try[Option[String]] = ???
def method2: Try[Option[String]] = ???
def method3: Try[Option[String]] = ???
method1 match
{
case f: Failure[Option[String]] => f
case Success(None) =>
method2 match
{
case f:Failure[Option[String]] => f
case Success(None) =>
{
method3
}
case s: Success[Option[String]] => s
}
case s: Success[Option[String]] => s
}
As you can see, this tries each method in sequence and if one fails then execution stops and the base match resolves to that failure. If method1 or method2 succeeds but contains None then the next method in the sequence is tried. If execution gets to method3 its results are always returned regardless of Success or Failure. This works fine in code but I find it difficult to follow whats happening.
I would love to use a for comprehension
for
{
attempt1 <- method1
attempt2 <- method2
attempt3 <- method3
}
yield
{
List(attempt1, attempt2, attempt3).find(_.isDefined)
}
because its beautiful and what its doing is quite clear. However, if all methods succeed then they are all executed every time, regardless of whether an earlier method returns a usable answer. Unfortunately I can't have that.
Any suggestions would be appreciated.
scalaz can be of help here. You'll need scalaz-contrib which adds a monad instance for Try, then you can use OptionT which has nice combinators. Here is an example:
import scalaz.OptionT
import scalaz.contrib.std.utilTry._
import scala.util.Try
def method1: OptionT[Try, String] = OptionT(Try(Some("method1")))
def method2: OptionT[Try, String] = OptionT(Try(Some("method2")))
def method3: OptionT[Try, String] = { println("method 3 is never called") ; OptionT(Try(Some("method3"))) }
def method4: OptionT[Try, String] = OptionT(Try(None))
def method5: OptionT[Try, String] = OptionT(Try(throw new Exception("fail")))
println((method1 orElse method2 orElse method3).run) // Success(Some(method1))
println((method4 orElse method2 orElse method3).run) // Success(Some(method2))
println((method5 orElse method2 orElse method3).run) // Failure(java.lang.Exception: fail)
If you don't mind creating a function for each method, you can do the following:
(Try(None: Option[String]) /: Seq(method1 _, method2 _, method3 _)){ (l,r) =>
l match { case Success(None) => r(); case _ => l }
}
This is not at all idiomatic, but I would like to point out that there's a reasonably short imperative version also with a couple tiny methods:
def okay(tos: Try[Option[String]]) = tos.isFailure || tos.success.isDefined
val ans = {
var m = method1
if (okay(m)) m
else if ({m = method2; okay(m)}) m
method3
}
The foo method should do the same stuff as your code, I don't think it is possible to do it using the for comprehension
type tryOpt = Try[Option[String]]
def foo(m1: tryOpt, m2: tryOpt, m3: tryOpt) = m1 flatMap {
case x: Some[String] => Try(x)
case None => m2 flatMap {
case y: Some[String] => Try(y)
case None => m3
}
}
method1.flatMap(_.map(Success _).getOrElse(method2)).flatMap(_.map(Success _).getOrElse(method3))
How this works:
The first flatMap takes a Try[Option[String]], if it is a Failure it returns the Failure, if it is a Success it returns _.map(Success _).getOrElse(method2) on the option. If the option is Some then it returns the a Success of the Some, if it is None it returns the result of method2, which could be Success[None], Success[Some[String]] or Failure.
The second map works similarly with the result it gets, which could be from method1 or method2.
Since getOrElse takes a by-name paramater method2 and method3 are only called if they need to be.
You could also use fold instead of map and getOrElse, although in my opinion that is less clear.
From this blog:
def riskyCodeInvoked(input: String): Int = ???
def anotherRiskyMethod(firstOutput: Int): String = ???
def yetAnotherRiskyMethod(secondOutput: String): Try[String] = ???
val result: Try[String] = Try(riskyCodeInvoked("Exception Expected in certain cases"))
.map(anotherRiskyMethod(_))
.flatMap(yetAnotherRiskyMethod(_))
result match {
case Success(res) => info("Operation Was successful")
case Failure(ex: ArithmeticException) => error("ArithmeticException occurred", ex)
case Failure(ex) => error("Some Exception occurred", ex)
}
BTW, IMO, Option is no need here?
I see that Scala standard library misses the method to get ranges of objects in the collection, that satisfy the predicate:
def <???>(p: A => Boolean): List[List[A]] = {
val buf = collection.mutable.ListBuffer[List[A]]()
var elems = this.dropWhile(e => !p(e))
while (elems.nonEmpty) {
buf += elems.takeWhile(p)
elems = elems.dropWhile(e => !p(e))
}
buf.toList
}
What would be the good name for such method? And is my implementation good enough?
I'd go for chunkWith or chunkBy
As for your implementation, I think this cries out for recursion! See if you can fill out this
#tailrec def chunkBy[A](l: List[A], acc: List[List[A]] = Nil)(p: A => Boolean): List[List[A]] = l match {
case Nil => acc
case l =>
val next = l dropWhile !p
val (chunk, rest) = next span p
chunkBy(rest, chunk :: acc)(p)
}
Why recursion? It's much easier to understand the algorithm and more likely to be bug-free (given the absence of vars).
The syntax !p for the negation of a predicate is achieved via an implicit conversion
implicit def PredicateW[A](p: A => Boolean) = new {
def unary_! : A => Boolean = a => !p(a)
}
I generally keep this around as it's astoundingly useful
How about:
def chunkBy[K](f: A => K): Map[K, List[List[A]]] = ...
Similar to groupBy but keeps contiguous chunks as chunks.
Using this, you can do xs.chunkBy(p)(true) to get what you want.
You probably want to call it splitWith because split is the string operation that more-or-less does that, and it's similar to splitAt.
Incidentally, here's a very compact implementation (though it does a lot of unnecessary work, so it's not a good implementation for speed; yours is fine for that):
def splitWith[A](xs: List[A])(p: A => Boolean) = {
(xs zip xs.scanLeft(1){ (i,x) => if (p(x) == ((i&1)==1)) i+1 else i }.tail).
filter(_._2 % 2 == 0).groupBy(_._2).toList.sortBy(_._1).map(_._2.map(_._1))
}
Just a little refinement of oxbow's code, this way the signature is lighter
def chunkBy[A](xs: List[A])(p: A => Boolean): List[List[A]] = {
#tailrec
def recurse(todo: List[A], acc: List[List[A]]): List[List[A]] = todo match {
case Nil => acc
case _ =>
val next = todo dropWhile (!p(_))
val (chunk, rest) = next span p
recurse(rest, acc ::: List(chunk))
}
recurse(xs, Nil)
}