val myArray = Array("1", "2")
val error = myArray(5) // throws an ArrayIndexOutOfBoundsException
The size of myArray is not known in advance, which is why a call like the one on the second line above might happen.
First, I never really understood the reasons to use error handling for expected errors. Am I wrong to consider this practice as bad, resulting from poor coding skills or an inclination towards laziness?
What would be the best way to handle the above case?
What I am leaning towards: basic implementation (condition) to prevent accessing the data like depicted;
use Option;
use Try or Either;
use a try-catch block.
1 Avoid addressing elements through index
Scala offers a rich set of collection operations that are applied to Arrays through the ArrayOps implicit conversions. This lets us use combinators like map, flatMap, take, drop, ... on arrays instead of addressing elements by index.
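As a small sketch of that idea (the sample values are made up for illustration), the combinators below return an empty result or an Option in situations where an index lookup would throw:

```scala
val myArray = Array("1", "2")

// drop/take never throw: asking past the end simply yields an empty array
val sixth: Array[String] = myArray.drop(5).take(1)

// headOption wraps the first element in an Option, or gives None when empty
val first: Option[String] = myArray.headOption
val noHead: Option[String] = Array.empty[String].headOption

// slice clamps its bounds to the size of the array
val tail: Array[String] = myArray.slice(1, 10)
```

None of these calls needs a size check beforehand, which is the point of preferring them to myArray(i).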
2 Prevent access out of range
An example I've seen often when parsing CSV-like data (in Spark):
case class Record(id:String, name: String, address:String)
val RecordSize = 3
val csvData = // some comma separated data
val records = csvData.map(line => line.split(","))
.collect{case arr if (arr.size == RecordSize) =>
Record(arr(0), arr(1), arr(2))}
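A self-contained variant of the same pattern might look like the sketch below. It uses a plain Seq with made-up lines instead of a Spark dataset, so the input data is an assumption for illustration only:

```scala
case class Record(id: String, name: String, address: String)
val RecordSize = 3

// Hypothetical input; the second line is missing a field
val csvData = Seq("1,Alice,Main St", "2,Bob", "3,Carol,High St")

// collect keeps only the lines that split into exactly RecordSize fields,
// so the index accesses inside the case body cannot go out of range
val records = csvData
  .map(line => line.split(","))
  .collect { case arr if arr.length == RecordSize => Record(arr(0), arr(1), arr(2)) }
```

Malformed lines are silently dropped here; if they need to be reported, one of the approaches in the next section is a better fit.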
3 Use checks that fit in the current context
If we are using monadic constructs to compose access to some resource, use a fitting way to lift errors into the application flow:
e.g. Imagine we are retrieving user preferences from some repository and we want the first one:
Option
def getUserById(id:ID):Option[User]
def getPreferences(user:User) : Option[Array[Preferences]]
val topPreference = for {
user <- getUserById(id)
preferences <- getPreferences(user)
topPreference <- preferences.lift(0)
} yield topPreference
(or even better, applying advice #1):
val topPreference = for {
user <- getUserById(id)
preferences <- getPreferences(user)
topPreference <- preferences.headOption
} yield topPreference
Try
def getUserById(id:ID): Try[User]
def getPreferences(user:User) : Try[Array[Preferences]]
val topPreference = for {
user <- getUserById(id)
preferences <- getPreferences(user)
topPreference <- Try(preferences(0))
} yield topPreference
As general guidance: Use the principle of least power.
If possible, use error-free combinators, e.g. array.drop(4).take(1)
If all that matters is having an element or not, use Option
If we need to preserve the reason why we could not find an element, use Try.
Let the types and context of the program guide you.
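To illustrate the difference between the last two options, here is a minimal sketch. The report value and its message format are made up for the example:

```scala
import scala.util.{Failure, Success, Try}

val myArray = Array("1", "2")

// Option only says whether an element is there
val viaOption: Option[String] = myArray.lift(5)

// Try additionally keeps the exception that explains the failure
val viaTry: Try[String] = Try(myArray(5))

// The preserved exception can be inspected or reported later
val report = viaTry match {
  case Success(value) => s"found $value"
  case Failure(e)     => s"lookup failed: ${e.getClass.getSimpleName}"
}
```

If nobody is going to read that exception, the Option version carries less baggage.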
If indexing myArray can be expected to error on occasion, then it sounds like Option would be the way to go.
myArray.lift(1) // Option[String] = Some(2)
myArray.lift(5) // Option[String] = None
You could use Try() but why bother if you already know what the error is and you're not interested in catching or reporting it?
Use arr.lift (available in the standard library), which returns an Option instead of throwing an exception.
If not, use safely:
Access the element safely to avoid accidentally throwing exceptions in the middle of the code.
implicit class ArrUtils[T](arr: Array[T]) {
import scala.util.Try
def safely(index: Int): Option[T] = Try(arr(index)).toOption
}
Usage:
arr.safely(4)
REPL
scala> val arr = Array(1, 2, 3)
arr: Array[Int] = Array(1, 2, 3)
scala> implicit class ArrUtils[T](arr: Array[T]) {
import scala.util.Try
def safely(index: Int): Option[T] = Try(arr(index)).toOption
}
defined class ArrUtils
scala> arr.safely(4)
res5: Option[Int] = None
scala> arr.safely(1)
res6: Option[Int] = Some(2)
I have the following scenario:
case class MyString(str: String)
val val1: ValidatedNel[String, MyString] = MyString("valid1").validNel
val val2: ValidatedNel[String, MyString] = MyString("valid2").validNel
val val3: ValidatedNel[String, MyString] = "invalid".invalidNel
val vals = Seq(val1, val2, val3)
//vals: Seq[cats.data.Validated[cats.data.NonEmptyList[String],MyString]] = List(Valid(MyString(valid1)), Valid(MyString(valid2)), Invalid(NonEmptyList(invalid)))
At the end I'd like to be able to do a match on the result and get any and all errors or all the valid values as a sequence.
My question is: how do I convert Seq[Validated[NonEmptyList[String],MyString]] into Validated[NonEmptyList[String],Seq[MyString]]?
So, my first pass was to implement Semigroup for Seq[MyString]:
implicit val myStringsAdditionSemigroup: Semigroup[Seq[MyString]] = new Semigroup[Seq[MyString]] {
def combine(x: Seq[MyString], y: Seq[MyString]): Seq[MyString] = x ++ y
}
... which works:
Seq(val1, val2).map(_.map(Seq(_))).reduce(_ |+| _)
//res0: cats.data.Validated[cats.data.NonEmptyList[String],Seq[MyString]] = Valid(List(MyString(valid1), MyString(valid2)))
but I need to prepare my data by wrapping all valid values in Seq... which feels strange. So, maybe there's a better way of doing that?
If you use anything other than Seq, like Vector or List, you can sequence it.
sequence basically turns a type constructor inside out, meaning it turns an F[G[A]] into a G[F[A]]. For that to work, the F needs to be a Traverse and the G needs to be an Applicative. Luckily, Validated is an Applicative and List and Vector are instances of Traverse.
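To make "turning the type constructor inside out" concrete without depending on cats, here is a minimal standard-library sketch. The helper sequenceAll is hypothetical, and it uses Either[List[String], A] as a stand-in for ValidatedNel; it is not how cats implements sequence, only an illustration of the shape of the operation:

```scala
// Hypothetical helper: List[Either[errors, A]] => Either[errors, List[A]],
// collecting every error instead of stopping at the first one.
def sequenceAll[A](xs: List[Either[List[String], A]]): Either[List[String], List[A]] =
  xs.foldRight[Either[List[String], List[A]]](Right(Nil)) {
    case (Right(a), Right(acc)) => Right(a :: acc)   // still all valid: keep collecting values
    case (Left(e), Right(_))    => Left(e)           // first error seen: switch to the error side
    case (Right(_), Left(errs)) => Left(errs)        // already failed: valid values are discarded
    case (Left(e), Left(errs))  => Left(e ++ errs)   // accumulate errors in their original order
  }

val allValid = sequenceAll[String](List(Right("valid1"), Right("valid2")))
val someInvalid =
  sequenceAll[String](List(Right("valid1"), Left(List("invalid1")), Left(List("invalid2"))))
```

With cats, the Traverse and Applicative instances provide this behaviour generically, so no hand-written fold is needed.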
So in the end your code should look something like this:
import cats.data.{NonEmptyList, Validated}
import cats.implicits._
val validatedList: Validated[NonEmptyList[String], List[MyString]] =
vals.toList.sequence
Note: if this doesn't compile for you, you might need to enable partial-unification.
The easiest way to enable partial-unification is to add the sbt-partial-unification plugin.
If you're on Scala 2.11.9 or newer, you can also simply add the compiler flag:
scalacOptions += "-Ypartial-unification"
We from the cats team strongly encourage you to have this flag on at all times when using cats, as it makes everything just a lot easier.
I have a variable underlying of type Option[mutable.Traversable[Field]]
All I wanted to do in my class was provide a method to return this as a sequence in the following way:
def toSeq: scala.collection.mutable.Seq[Field] = {
for {
f <- underlying.get
} yield f
}
This fails, as the compiler complains that mutable.Traversable does not conform to mutable.Seq. All it is doing is yielding something of type Field, so in my mind this should work?
A possible solution to this is:
def toSeq: Seq[Field] = {
underlying match {
case Some(x) => x.toSeq
case None =>
}
}
However, I have no idea what is actually happening when x.toSeq is called, and I imagine there is more memory being used here than is actually required to accomplish this.
An explanation or suggestion would be much appreciated.
I am confused why you say "I imagine there is more memory being used here than actually required to accomplish this". Scala will not copy your Field values when doing x.toSeq; it is simply going to create a new Seq that holds pointers to the same Field values that underlying is pointing to. Since this new structure is exactly what you want, there is no avoiding the additional memory associated with the extra pointers (but the amount of additional memory should be small). For a more in-depth discussion see the wiki on persistent data structures.
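A quick way to check the claim about shared elements is to compare references. The Field class and the buffer contents below are hypothetical stand-ins for the ones in the question, and mutable.Iterable is used in place of mutable.Traversable so the sketch works on newer Scala versions as well:

```scala
import scala.collection.mutable

// Hypothetical stand-in for the Field type in the question
class Field(val name: String)

val buffer = mutable.ArrayBuffer(new Field("a"), new Field("b"))
val underlying: Option[mutable.Iterable[Field]] = Some(buffer)

val asSeq = underlying.map(_.toSeq).getOrElse(Seq.empty[Field])

// `eq` compares references: both collections point at the same Field objects
val sameObjects = asSeq.zip(buffer).forall { case (x, y) => x eq y }
```

Whether toSeq allocates a new container at all depends on the concrete collection and the Scala version, but in no case are the Field instances themselves duplicated.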
Regarding your possible solution, it could be slightly modified to get the result you're expecting:
def toSeq : Seq[Field] =
underlying
.map(_.toSeq)
.getOrElse(Seq.empty[Field])
This solution will return an empty Seq if underlying is a None which is safer than your original attempt which uses get. I say it's "safer" because get throws a NoSuchElementException if the Option is a None whereas my toSeq can never fail to return a valid value.
Functional Approach
As a side note: when I first started programming in Scala I would write many functions of the form:
def formatSeq(seq : Seq[String]) : Seq[String] =
seq map (_.toUpperCase)
This is less functional because you are expecting a particular collection type, e.g. formatSeq won't work on a Future.
I have found that a better approach is to write:
def formatStr(str : String) = str.toUpperCase
Or my preferred coding style:
val formatStr = (_ : String).toUpperCase
Then the user of your function can apply formatStr in any fashion they want and you don't have to worry about all of the collection casting:
val fut : Future[String] = ???
val formatFut = fut map formatStr
val opt : Option[String] = ???
val formatOpt = opt map formatStr
I have a sequence of Errors or Views (Seq[Xor[Error,View]])
I want to map this to an Xor of the first error (if any) or a Sequence of Views
(Xor[Error, Seq[View]]) or possibly simply (Xor[Seq[Error], Seq[View]])
How can I do this?
You can use sequenceU provided by the bitraverse syntax, much as you would with scalaz. It doesn't seem like the proper type classes exist for Seq, though, but you can use List.
import cats._, data._, implicits._, syntax.bitraverse._
case class Error(msg: String)
case class View(content: String)
val errors: List[Xor[Error, View]] = List(
Xor.Right(View("abc")), Xor.Left(Error("error!")),
Xor.Right(View("xyz"))
)
val successes: List[Xor[Error, View]] = List(
Xor.Right(View("abc")),
Xor.Right(View("xyz"))
)
scala> errors.sequenceU
res1: cats.data.Xor[Error,List[View]] = Left(Error(error!))
scala> successes.sequenceU
res2: cats.data.Xor[Error,List[View]] = Right(List(View(abc), View(xyz)))
In recent versions of Cats, Xor has been removed and the standard Scala Either data type is used instead.
Michael Zajac showed correctly that you can use sequence or sequenceU (which is actually defined on Traverse not Bitraverse) to get an Either[Error, List[View]].
import cats.implicits._
val xs: List[Either[Error, View]] = ???
val errorOrViews: Either[Error, List[View]] = xs.sequenceU
You might want to look at traverse (which is like a map and a sequence), which you can use most of the time instead of sequence.
If you want all the errors, you cannot use Either, but you can use Validated (or ValidatedNel, which is just a type alias for Validated[NonEmptyList[A], B]).
import cats.data.{NonEmptyList, ValidatedNel}
val errorsOrViews: ValidatedNel[Error, List[View]] = xs.traverseU(_.toValidatedNel)
val errorsOrViews2: Either[NonEmptyList[Error], List[View]] = errorsOrViews.toEither
You could also get the errors and the views by using MonadCombine.separate :
val errorsAndViews: (List[Error], List[View]) = xs.separate
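If pulling in cats just for this split is not an option, and assuming Scala 2.13 or newer, the standard library's partitionMap gives the same pair of lists. The Error and View classes and the sample values below mirror the earlier answer and are only there to make the sketch self-contained:

```scala
case class Error(msg: String)
case class View(content: String)

val xs: List[Either[Error, View]] = List(
  Right(View("abc")),
  Left(Error("error!")),
  Right(View("xyz"))
)

// partitionMap (Scala 2.13+) sends Lefts to the first list and Rights to the second
val (errors, views) = xs.partitionMap(identity)
```

Unlike sequence, this never short-circuits: both lists are always returned, and it is up to the caller to decide what a non-empty error list means.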
You can find more examples and information on Either and Validated on the Cats website.
I'm pretty new to Scala, please bear with me. I have a bunch of futures wrapped in a large array. The futures have done their hard work looking through a few TBs of data, and at the end of my app I want to wrap up all the results of said futures so I can present them nicely.
The collection of futures I have is of the following type:
Array[Future[List[(String, String, String)]]]
Everything I've read so far about the for-comprehension shows that
val test: Seq[Seq[List[String]]] = Seq(Seq(List("Hello", "World"), List("What's", "Up")))
val results = for {
test1 <- test
test2 <- test1
test3 <- test2
} yield test3
Results in
results: Seq[String] = List(Hello, World, What's, Up)
By that same logic, my intention was to do it like so, since I recently discovered that Option, Try, Failure and Success can be treated as collections:
val futures = { ... } // Logic that collects my Futures
// futures is now Array[Future[List[(String, String, String)]]]
val results = for {
// futureSeq as Seq[List[(String, String, String)]]
futureSeq <- Future.sequence(futures.toSeq)
// resultSet as List[(String, String, String)]
resultSet <- futureSeq
} yield resultSet
But this doesn't seem to work. I am receiving the following compilation error:
Error:(78, 15) type mismatch;
found : Seq[List[(String, String, String)]]
required: scala.concurrent.Future[?]
resultSet <- futureSeq
^
The part with the required: scala.concurrent.Future[?] is throwing me off completely. I do not understand at all why a Future would be required there.
I have checked the types of all my objects through the REPL, by debugging, and by using IntelliJ's type inspection. They seem to confirm that I'm not just confused about my types.
And before anyone mentions it: yes, I am aware that the for-comprehension is syntactic sugar for a bunch of maps, flatMaps and withFilters.
The details of how the for-comprehension desugars to calls to flatMap and map are important here. This code:
for {
futureSeq <- Future.sequence(futures.toSeq)
resultSet <- futureSeq
} yield resultSet
Becomes something more or less like this:
Future.sequence(futures.toSeq).flatMap(futureSeq => futureSeq.map(resultSet => resultSet))
The flatMap on Future expects a function that returns a Future, but you've given it one that returns a Seq[List[(String, String, String)]].
In general you can't mix types in for-comprehensions (Option in a sequence comprehension is a kind of exception that's supported by an implicit conversion). If you have a <- arrow coming out of a future, all of the rest of your <- arrows need to come out of futures.
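The Option exception mentioned above can be seen in a small sketch. The numbers and the odd-number filter are made up for illustration:

```scala
val numbers = List(1, 2, 3, 4)

// Option inside a List comprehension compiles: the Option is treated as a
// collection of zero or one elements
val odds = for {
  n   <- numbers
  odd <- Option(n).filter(_ % 2 == 1)
} yield odd

// Roughly what the compiler generates for the comprehension above
val desugared = numbers.flatMap(n => Option(n).filter(_ % 2 == 1).map(odd => odd))
```

The first generator decides which flatMap is called, which is why a comprehension that starts from a Future has to keep producing Futures in its later generators.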
You probably want something like this:
val results: Future[Seq[(String, String, String)]] =
Future.sequence(futures.toSeq).map(_.flatten)
You could then use something like this:
import scala.concurrent.Await
import scala.concurrent.duration._
Await.result(results.map(_.map(doSomethingWithResult)), 2.seconds)
This presents the results synchronously (blocking until it's done).
I'm trying to use a for expression to iterate over a list, then do a transformation on each element using a utility that returns a Future. Long story short, it doesn't compile, and I'd like to understand why. I read this question, which is similar, and was a great help, but what I'm trying to do is even simpler, which is all the more confusing as to why it doesn't work. I'm trying to do something like:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
val numberList = List(1, 2, 3)
def squareInTheFuture(number: Int): Future[Int] = Future { number * number}
val allTheSquares = for {
number <- numberList
square <- squareInTheFuture(number)
} yield { square }
And what I get is:
error: type mismatch;
found : scala.concurrent.Future[Int]
required: scala.collection.GenTraversableOnce[?]
square <- squareInTheFuture(number)
^
Can someone help me understand why this doesn't work and what the best alternative is?
The Future companion object has a traverse method that does exactly what you want:
val allTheSquares: Future[List[Int]] =
Future.traverse(numberList)(squareInTheFuture)
This will asynchronously start all the computations and return a future that will be completed once all of those futures are completed.
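As a runnable sketch of that answer, the snippet below also spells out the equivalent map-then-sequence form. The two-second timeout and the blocking Await calls are assumptions made only so the values can be inspected in a script or REPL session:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

def squareInTheFuture(number: Int): Future[Int] = Future { number * number }

val numberList = List(1, 2, 3)

// traverse applies the function and collects the results in one step
val viaTraverse: Future[List[Int]] = Future.traverse(numberList)(squareInTheFuture)

// the same result, spelled as map followed by sequence
val viaSequence: Future[List[Int]] = Future.sequence(numberList.map(squareInTheFuture))

// Blocking here only to look at the values; avoid this in production code paths
val squares = Await.result(viaTraverse, 2.seconds)
val sameSquares = Await.result(viaSequence, 2.seconds)
```

Both forms preserve the order of the input list, regardless of the order in which the individual futures complete.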
flatMap requires that the type constructors of numberList and squareInTheFuture(number) are the same (modulo whatever implicit conversions the collection library does). That isn't the case here. Instead, this is a traversal:
val allSquaresInTheFuture: Future[List[Int]] =
Future.traverse(numberList)(squareInTheFuture)
@Lee is correct. As an addition, if you are trying to do parallel computation:
import scala.collection.breakOut // Scala 2.12 and earlier
val numberList = List(1, 2, 3)
val allTheSquares: List[Int] = numberList.par.map(x => x * x)(breakOut)
If you really want Future:
val allTheSquares: Future[List[Int]] = Future.traverse(numberList)(squareInTheFuture)
Your for comprehension is the same as
val allTheSquares = numberList.flatMap(number => squareInTheFuture(number))
flatMap requires that its argument function returns a GenTraversableOnce[Int]; yours returns a Future[Int], hence the mismatch.