Combining Scala Futures and collections in for comprehensions - scala

I'm trying to use a for expression to iterate over a list, then do a transformation on each element using a utility that returns a Future. Long story short, it doesn't compile, and I'd like to understand why. I read a similar question, which was a great help, but what I'm trying to do is even simpler, which makes it all the more confusing that it doesn't work. I'm trying to do something like:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
val numberList = List(1, 2, 3)
def squareInTheFuture(number: Int): Future[Int] = Future { number * number}
val allTheSquares = for {
number <- numberList
square <- squareInTheFuture(number)
} yield { square }
And what I get is:
error: type mismatch;
found : scala.concurrent.Future[Int]
required: scala.collection.GenTraversableOnce[?]
square <- squareInTheFuture(number)
^
Can someone help me understand why this doesn't work and what the best alternative is?

The Future companion object has a traverse method that does exactly what you want:
val allTheSquares: Future[List[Int]] =
Future.traverse(numberList)(squareInTheFuture)
This will asynchronously start all the computations and return a future that will be completed once all of those futures are completed.
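A minimal sketch of that behaviour, assuming a hypothetical slowSquare helper (not part of the question) that sleeps longer for smaller inputs. The futures are all started up front, and the resulting list keeps the order of the input list regardless of which future finishes first. The blocking Await is only there so the sketch can print something.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object TraverseSketch {
  // Hypothetical helper: smaller numbers sleep longer, so the futures will
  // usually complete in a different order than the input.
  def slowSquare(number: Int): Future[Int] = Future {
    Thread.sleep(math.max(0, 4 - number) * 20L)
    number * number
  }

  // One Future for the whole list; it completes once every element is done.
  def allTheSquares(numbers: List[Int]): Future[List[Int]] =
    Future.traverse(numbers)(slowSquare)

  def main(args: Array[String]): Unit = {
    // Blocking is for demonstration only.
    println(Await.result(allTheSquares(List(1, 2, 3)), 5.seconds)) // List(1, 4, 9)
  }
}
```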

flatMap requires that the type constructors of numberList and squareInTheFuture(number) are the same (modulo whatever implicit conversions the collection library does). That isn't the case here. Instead, this is a traversal:
val allSquaresInTheFuture: Future[List[Int]] =
Future.traverse(numberList)(squareInTheFuture)

@Lee is correct. As an addition, if you are trying to do parallel computation:
val numberList = List(1, 2, 3)
import scala.collection.breakOut // Scala 2.12 and earlier; removed in 2.13
val allTheSquares: List[Int] = numberList.par.map(x => x * x)(breakOut)
If you really want Future:
val allTheSquares: Future[List[Int]] = Future.traverse(numberList)(squareInTheFuture)

Your for comprehension is the same as
val allTheSquares = numberList.flatMap(number => squareInTheFuture(number))
flatMap requires that its argument function returns a GenTraversableOnce[Int]; yours returns a Future[Int], hence the mismatch.
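To see the desugaring at work, here is a small sketch (the object and method names are made up for illustration). Two generators over the same kind of container compile, because flatMap and map both stay inside List. A single generator desugars to map alone, so its body is free to return a Future, and the resulting List[Future[Int]] can then be turned inside out.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object DesugarSketch {
  def squareInTheFuture(number: Int): Future[Int] = Future { number * number }

  // Both generators are Lists: xs.flatMap(x => List(x, x * 10).map(y => (x, y)))
  def pairs(xs: List[Int]): List[(Int, Int)] =
    for {
      x <- xs
      y <- List(x, x * 10)
    } yield (x, y)

  // One generator only: xs.map(x => squareInTheFuture(x)), no flatMap involved.
  def startAll(xs: List[Int]): List[Future[Int]] =
    for (x <- xs) yield squareInTheFuture(x)

  // List[Future[Int]] => Future[List[Int]]
  def allSquares(xs: List[Int]): Future[List[Int]] =
    Future.sequence(startAll(xs))

  def main(args: Array[String]): Unit = {
    println(pairs(List(1, 2)))                                   // List((1,1), (1,10), (2,2), (2,20))
    println(Await.result(allSquares(List(1, 2, 3)), 5.seconds))  // List(1, 4, 9)
  }
}
```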

Related

Scala: Combine Either per the whole List with Either per elements

I have a list of Either, which represents error:
type ErrorType = List[String]
type FailFast[A] = Either[ErrorType, A]
import cats.syntax.either._
val l = List(1.asRight[ErrorType], 5.asRight[ErrorType])
If all of them are right, I want to get a list of [A], in this case - List[Int]
If any Either is left, I want to combine all errors of all either and return it.
I've found a similar topic, "How to reduce a Seq[Either[A,B]] to a Either[A,Seq[B]]", but it is quite old. For instance, one of the answers suggests partitionMap, which I cannot find at the moment. There is probably a better, more elegant solution. An example with cats would be great.
How I would like to use it:
for {
listWithEithers <- someFunction
//if this list contains one or more errors, return Left[List[String]]
//if everything is fine, convert it to:
correctItems <- //returns list of List[Int] as right
} yield correctItems
Return type of this for-comprehension must be:
Either[List[String], List[Int]]
As already mentioned in the comments, Either is good for fail-fast behavior. For accumulating multiple errors, you probably want something like Validated. Moreover:
List is traversable (has instance of Traverse)
Validated is applicative
Validated.fromEither maps Either[List[String], X] to Validated[List[String], X], that's exactly what you need as function in traverse.
Therefore, you might try:
l.traverse(Validated.fromEither) if you are OK with a Validated
l.traverse(Validated.fromEither).toEither if you really want an Either in the end.
Full example with all imports:
import cats.data.Validated
import cats.syntax.validated._
import cats.syntax.either._
import cats.syntax.traverse._
import cats.instances.list._
import cats.Traverse
import scala.util.Either
type ErrorType = List[String]
type FailFast[A] = Either[ErrorType, A]
val l: List[Either[ErrorType, Int]] = List(1.asRight[ErrorType], 5.asRight[ErrorType])
// solution if you want to keep a `Validated`
val validatedList: Validated[ErrorType, List[Int]] =
l.traverse(Validated.fromEither)
// solution if you want to transform it back to `Either`
val eitherList: Either[ErrorType, List[Int]] =
l.traverse(Validated.fromEither).toEither
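For comparison, here is a sketch of the same accumulate-everything behaviour using only the standard library, with no cats dependency. combineAll is a hypothetical helper name, not something from cats or from the question.

```scala
object AccumulateSketch {
  type ErrorType = List[String]

  // If any element is a Left, concatenate all error lists (in input order);
  // otherwise collect all the Right values.
  def combineAll[A](xs: List[Either[ErrorType, A]]): Either[ErrorType, List[A]] = {
    val errors = xs.collect { case Left(es) => es }.flatten
    if (errors.nonEmpty) Left(errors)
    else Right(xs.collect { case Right(a) => a })
  }

  def main(args: Array[String]): Unit = {
    val allGood: List[Either[ErrorType, Int]] = List(Right(1), Right(5))
    val mixed: List[Either[ErrorType, Int]] =
      List(Left(List("bad id")), Right(2), Left(List("bad name", "bad age")))

    println(combineAll(allGood)) // Right(List(1, 5))
    println(combineAll(mixed))   // Left(List(bad id, bad name, bad age))
  }
}
```

The Validated version is still preferable when cats is already on the classpath, since it composes with the rest of the library.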
As @Luis mentioned in the comments, ValidatedNel is what you are looking for:
import cats.data.{ Validated, ValidatedNel }
import cats.implicits._
type ErrorType = String
def combine(listWithEither: List[Either[ErrorType, Int]]):ValidatedNel[ErrorType, List[Int]] =
listWithEither.foldMap(e => Validated.fromEither(e).map(List(_)).toValidatedNel)
val l1 = List[Either[ErrorType, Int]](Right(1), Right(2), Right(3))
val l2 = List[Either[ErrorType, Int]](Left("Incorrect String"), Right(2), Left("Validation error"))
println(combine(l1))
// Displays Valid(List(1, 2, 3))
println(combine(l2))
// Displays Invalid(NonEmptyList(Incorrect String, Validation error))
You could transform the final result back to an Either using .toEither, but ValidatedNel is a better structure for accumulating errors, while Either is better suited to fail-fast error handling.

How to make it a monad?

I am trying to validate a list of strings sequentially and define the validation result type like that:
import cats._, cats.data._, cats.implicits._
case class ValidationError(msg: String)
type ValidationResult[A] = Either[NonEmptyList[ValidationError], A]
type ListValidationResult[A] = ValidationResult[List[A]] // not a monad :(
I would like to make ListValidationResult a monad. Should I implement flatMap and pure manually, or is there an easier way?
I suggest taking a totally different approach, leveraging cats Validated:
import cats.data.Validated.{ invalidNel, valid }
val stringList: List[String] = ???
def evaluateString(s: String): ValidatedNel[ValidationError, String] =
if (???) valid(s) else invalidNel(ValidationError(s"invalid $s"))
val validationResult: ListValidationResult[String] =
stringList.map(evaluateString).sequenceU.toEither
It can be adapted for a generic type T, as per your example.
Notes:
val stringList: List[String] = ??? is the list of strings you want to validate;
ValidatedNel[A,B] is just a type alias for Validated[NonEmptyList[A],B];
evaluateString should be your evaluation function; its if (???) condition is currently just an unimplemented stub;
sequenceU: you may want to read the cats documentation about it (from Cats 1.0 on it was removed, and plain sequence is used instead);
toEither does exactly what you think it does, it converts a Validated[A,B] to an Either[A,B].
As @Michael pointed out, you could also use traverseU instead of map and sequenceU:
val validationResult: ListValidationResult[String] =
stringList.traverseU(evaluateString).toEither

Cats Seq[Xor[A,B]] => Xor[A, Seq[B]]

I have a sequence of Errors or Views (Seq[Xor[Error,View]])
I want to map this to an Xor of the first error (if any) or a Sequence of Views
(Xor[Error, Seq[View]]) or possibly simply (Xor[Seq[Error],Seq[View])
How can I do this?
You can use sequenceU, provided by the bitraverse syntax, much as you would with scalaz. The proper type class instances don't seem to exist for Seq, though, but you can use List.
import cats._, data._, implicits._, syntax.bitraverse._
case class Error(msg: String)
case class View(content: String)
val errors: List[Xor[Error, View]] = List(
Xor.Right(View("abc")), Xor.Left(Error("error!")),
Xor.Right(View("xyz"))
)
val successes: List[Xor[Error, View]] = List(
Xor.Right(View("abc")),
Xor.Right(View("xyz"))
)
scala> errors.sequenceU
res1: cats.data.Xor[Error,List[View]] = Left(Error(error!))
scala> successes.sequenceU
res2: cats.data.Xor[Error,List[View]] = Right(List(View(abc), View(xyz)))
In the most recent version of Cats, Xor has been removed and the standard Scala Either data type is used instead.
Michael Zajac showed correctly that you can use sequence or sequenceU (which is actually defined on Traverse not Bitraverse) to get an Either[Error, List[View]].
import cats.implicits._
val xs: List[Either[Error, View]] = ???
val errorOrViews: Either[Error, List[View]] = xs.sequenceU
You might want to look at traverse (which is like a map and a sequence), which you can use most of the time instead of sequence.
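To make the fail-fast behaviour concrete, here is roughly what sequencing a List of Either does, written by hand with the standard library (Scala 2.12 or later, where Either is right-biased). sequenceFailFast is a made-up name and the real Cats implementation is more general, so treat this only as a sketch of the semantics.

```scala
object FailFastSketch {
  // Returns the first Left in input order, or all Right values if there is none.
  def sequenceFailFast[E, A](xs: List[Either[E, A]]): Either[E, List[A]] =
    xs.foldRight(Right(Nil): Either[E, List[A]]) { (x, acc) =>
      for {
        a  <- x    // a Left here wins over any error already held in acc
        as <- acc
      } yield a :: as
    }

  def main(args: Array[String]): Unit = {
    val errors: List[Either[String, Int]] = List(Right(1), Left("first"), Left("second"))
    val successes: List[Either[String, Int]] = List(Right(1), Right(2))

    println(sequenceFailFast(errors))    // Left(first)
    println(sequenceFailFast(successes)) // Right(List(1, 2))
  }
}
```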
If you want to collect all the errors, you cannot use Either, but you can use Validated (or ValidatedNel, which is just a type alias for Validated[NonEmptyList[A], B]).
import cats.data.{NonEmptyList, ValidatedNel}
val errorsOrViews: ValidatedNel[Error, List[View]] = xs.traverseU(_.toValidatedNel)
val errorsOrViews2: Either[NonEmptyList[Error], List[View]] = errorsOrViews.toEither
You could also get both the errors and the views by using MonadCombine.separate:
val errorsAndViews: (List[Error], List[View]) = xs.separate
You can find more examples and information on Either and Validated on the Cats website.

Best way to handle Error on basic Array

val myArray = Array("1", "2")
val error = myArray(5) // throws an ArrayIndexOutOfBoundsException
myArray has no fixed size, which is why a call like the one on the second line above might happen.
First, I never really understood the reasons to use error handling for expected errors. Am I wrong to consider this practice as bad, resulting from poor coding skills or an inclination towards laziness?
What would be the best way to handle the above case?
What I am leaning towards:
a basic check (condition) to prevent accessing the data as depicted;
use Option;
use Try or Either;
use a try-catch block.
1 Avoid addressing elements through index
Scala offers a rich set of collection operations that are applied to Arrays through ArrayOps implicit conversions. This lets us use combinators like map, flatMap, take, drop, .... on arrays instead of addressing elements by index.
2 Prevent access out of range
An example I've seen often when parsing CSV-like data (in Spark):
case class Record(id:String, name: String, address:String)
val RecordSize = 3
val csvData = // some comma separated data
val records = csvData.map(line => line.split(","))
.collect{case arr if (arr.size == RecordSize) =>
Record(arr(0), arr(1), arr(2))}
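A self-contained version of the same idea, with made-up sample lines standing in for the real CSV data. Lines that do not split into exactly three fields are silently dropped by collect, so the indexed accesses can never go out of range.

```scala
object CsvSketch {
  final case class Record(id: String, name: String, address: String)

  val RecordSize = 3

  // Keep only the lines that split into exactly RecordSize fields.
  def parse(lines: List[String]): List[Record] =
    lines
      .map(_.split(","))
      .collect { case arr if arr.length == RecordSize =>
        Record(arr(0), arr(1), arr(2))
      }

  def main(args: Array[String]): Unit = {
    // Hypothetical input: one well-formed line, one too short, one too long.
    val csvData = List("1,Ann,Main St", "2,Bob", "3,Cy,Oak Ave,extra")
    println(parse(csvData)) // List(Record(1,Ann,Main St))
  }
}
```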
3 Use checks that fit in the current context
If we are using monadic constructs to compose access to some resource, use a fitting way of lifting errors into the application flow:
e.g. Imagine we are retrieving user preferences from some repository and we want the first one:
Option
def getUserById(id:ID):Option[User]
def getPreferences(user:User) : Option[Array[Preferences]]
val topPreference = for {
user <- getUserById(id)
preferences <- getPreferences(user)
topPreference <- preferences.lift(0)
} yield topPreference
(or even better, applying advice #1):
val topPreference = for {
user <- getUserById(id)
preferences <- getPreferences(user)
topPreference <- preferences.headOption
} yield topPreference
Try
def getUserById(id:ID): Try[User]
def getPreferences(user:User) : Try[Array[Preferences]]
val topPreference = for {
user <- getUserById(id)
preferences <- getPreferences(user)
topPreference <- Try(preferences(0))
} yield topPreference
As general guidance: Use the principle of least power.
If possible, use error-free combinators: array.drop(4).take(1)
If all that matters is having an element or not, use Option
If we need to preserve the reason why we could not find an element, use Try.
Let the types and context of the program guide you.
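The three levels of that guidance, side by side on the array from the question. The method names are only labels for this sketch.

```scala
import scala.util.Try

object SafeAccessSketch {
  val myArray: Array[String] = Array("1", "2")

  // Least power: combinators never throw, an out-of-range index gives an empty result.
  def viaCombinators(index: Int): List[String] =
    myArray.drop(index).take(1).toList

  // Presence or absence only.
  def viaLift(index: Int): Option[String] =
    myArray.lift(index)

  // Keeps the reason for the failure (the exception) around.
  def viaTry(index: Int): Try[String] =
    Try(myArray(index))

  def main(args: Array[String]): Unit = {
    println(viaCombinators(5))   // List()
    println(viaLift(1))          // Some(2)
    println(viaLift(5))          // None
    println(viaTry(5).isFailure) // true
  }
}
```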
If indexing myArray can be expected to error on occasion, then it sounds like Option would be the way to go.
myArray.lift(1) // Option[String] = Some(2)
myArray.lift(5) // Option[String] = None
You could use Try() but why bother if you already know what the error is and you're not interested in catching or reporting it?
Use arr.lift (available in the standard library), which returns an Option instead of throwing an exception.
If not, use a safely helper
Access the element safely to avoid accidentally throwing exceptions in the middle of the code.
implicit class ArrUtils[T](arr: Array[T]) {
import scala.util.Try
def safely(index: Int): Option[T] = Try(arr(index)).toOption
}
Usage:
arr.safely(4)
REPL
scala> val arr = Array(1, 2, 3)
arr: Array[Int] = Array(1, 2, 3)
scala> implicit class ArrUtils[T](arr: Array[T]) {
import scala.util.Try
def safely(index: Int): Option[T] = Try(arr(index)).toOption
}
defined class ArrUtils
scala> arr.safely(4)
res5: Option[Int] = None
scala> arr.safely(1)
res6: Option[Int] = Some(2)

Type Mismatch on Scala For Comprehension: scala.concurrent.Future

I'm pretty new to Scala, please bear with me. I have a bunch of futures wrapped in a large array. The futures have done their hard work looking through a few TBs of data, and at the end of my app I want to wrap up all the results of said futures so I can present them nicely.
The collection of futures I have is of the following type:
Array[Future[List[(String, String, String)]]]
Everything I've read so far about the for-comprehension shows that
val test: Seq[Seq[List[String]]] = Seq(Seq(List("Hello", "World"), List("What's", "Up")))
val results = for {
test1 <- test
test2 <- test1
test3 <- test2
} yield test3
Results in
results: Seq[String] = List(Hello, World, What's, Up)
By that same logic, my intention was to do it like so, since I recently discovered that Option, Try, Failure and Success can be treated as collections:
val futures = { ... } // Logic that collects my Futures
// futures is now Array[Future[List[(String, String, String)]]]
val results = for {
// futureSeq as Seq[List[(String, String, String)]]
futureSeq <- Future.sequence(futures.toSeq)
// resultSet as List[(String, String, String)]
resultSet <- futureSeq
} yield resultSet
But this doesn't seem to work. I receive the following compilation error:
Error:(78, 15) type mismatch;
found : Seq[List(String, String, String)]
required: scala.concurrent.Future[?]
resultSet <- futureSeq
^
The part with the required: scala.concurrent.Future[?] is throwing me off completely. I do not understand at all why a Future would be required there.
I have checked the types of all my objects through the REPL, by debugging, and by using IntelliJ's type inspection. They seem to confirm that I'm not just confused about my types.
And before anyone mentions, yes, I am aware that the for-comprehension is syntactic sugar for a bunch of maps, flatMaps and withFilters.
The details of how the for-comprehension desugars to calls to flatMap and map are important here. This code:
for {
futureSeq <- Future.sequence(futures.toSeq)
resultSet <- futureSeq
} yield resultSet
Becomes something more or less like this:
Future.sequence(futures.toSeq).flatMap(futureSeq => futureSeq)
The flatMap on Future expects a function that returns a Future, but you've given it one that returns a Seq[List[(String, String, String)]].
In general you can't mix types in for-comprehensions (Option in a sequence comprehension is a kind of exception that's supported by an implicit conversion). If you have a <- arrow coming out of a future, all of the rest of your <- arrows need to come out of futures.
You probably want something like this:
val results: Future[Seq[(String, String, String)]] =
Future.sequence(futures.toSeq).map(_.flatten)
You could then use something like this:
import scala.concurrent.Await
import scala.concurrent.duration._
Await.result(results.map(_.map(doSomethingWithResult)), 2.seconds)
This presents the results synchronously (blocking until it's done).
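Here is a self-contained sketch of the sequence-then-flatten approach, with made-up sample data in place of the real futures. It uses toList rather than toSeq only so that the result type is the same on Scala 2.12 and 2.13. The second method shows the equivalent for comprehension: the only generator comes out of a Future, and the flattening happens in the yield.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FlattenFuturesSketch {
  type Row = (String, String, String)

  // Hypothetical stand-in for the futures that scan the real data.
  def sampleFutures(): Array[Future[List[Row]]] = Array(
    Future(List(("a", "b", "c"))),
    Future(List(("d", "e", "f"), ("g", "h", "i")))
  )

  // Array[Future[List[Row]]] => Future[List[List[Row]]] => Future[List[Row]]
  def collectRows(futures: Array[Future[List[Row]]]): Future[List[Row]] =
    Future.sequence(futures.toList).map(_.flatten)

  // Same thing as a for comprehension with a single Future generator.
  def collectRowsFor(futures: Array[Future[List[Row]]]): Future[List[Row]] =
    for (lists <- Future.sequence(futures.toList)) yield lists.flatten

  def main(args: Array[String]): Unit = {
    // Blocking is for demonstration only.
    println(Await.result(collectRows(sampleFutures()), 5.seconds))
    // List((a,b,c), (d,e,f), (g,h,i))
  }
}
```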