How can I define a type to present a stream which can provide integers, but also may fail? - scala

Suppose I want to define a "NumberLoader", which will load integers from another place by demand, so I can give it a type:
type NumberLoader = Stream[Integer]
But it may throw errors when loading (say, it loads integers via network from another computer), so there should be some NetworkError in the type.
Finally, I defined it as:
type NumberLoader = Stream[Either[NetworkError, Integer]]
If seems to work, but I feel it a little strange. Is it a good one?

You want to represent
a Stream of something which can be either a NetworkError or an Integer
Stream[ Either[ NetworkError, Integer]]
so the type looks appropriate and well-suited.
You can alternatively use Future or Try in place of Either, but you would lose the flexibility of specifying the kind of exception you expect, as both Future and Try failures hold a generic Throwable.

Related

Losing path dependent type when extracting value from Try in scala

I'm working with scalax to generate a graph of my Spark operationS. So, I have a custom library that generates my graph. So, let me show a sample:
val DAGWithoutGet = createGraphFromOps(ops)
val DAGWithGet = createGraphFromOps(ops).get
The return type of DAGWithoutGet is
scala.util.Try[scalax.collection.Graph[typeA, scalax.collection.GraphEdge.DiEdge]],
and, for DAGWithGet is
scalax.collection.Graph[typeA, scalax.collection.GraphEdge.DiEdge].
Here, typeA is a project related class representing a single Spark operation, not relevant for the context of this question. (for context only: What my custom library does is, essentially, generate a map of dependencies between those operations, creating a big Map object, and calling Graph(myBigMap: _*) to generate the graph).
As far as I know, calling the .get command on this point of my code or later should not make any difference, but that is not what I'm seeing.
Calling DAGWithoutGet.get.nodes has a return type of scalax.collection.Graph[typeA,DiEdge]#NodeSetT,
while calling DAGWithGet.nodes returns DAGWithGet.NodeSetT.
When I extract one of those nodes (using the .find method), I receive scalax.collection.Graph[typeA,DiEdge]#NodeT and DAGWithGet.NodeT types, respectively. Much to my dismay, even the methods available in each case are different - I cannot use pathTo (which happens to be what I want) or withSubgraph on the former, only on the latter.
My doubt is, then, after this relatively complex example: what is going on here? Why extracting the value from the Try construct on different moments leads to different types, one path dependent, and the other not - or, if that isn't correct, what may I be missing here?

Is returning Either/Option/Try/Or considered a viable / idiomatic approach when function has preconditions for arguments?

First of all, I'm very new to Scala and don't have any experience writing production code with it, so I lack understanding of what is considered a good/best practice among community. I stumbled upon these resources:
https://github.com/alexandru/scala-best-practices
https://nrinaudo.github.io/scala-best-practices/
It is mentioned there that throwing exceptions is not very good practice, which made me think what would be a good way to define preconditions for function then, because
A function that throws is a bit of a lie: its type implies it’s total function when it’s not.
After a bit of research, it seems that using Option/Either/Try/Or(scalactic) is a better approach, since you can use something like T Or IllegalArgumentException as return type to clearly indicate that function is actually partial, using exception as a way to store message that can be wrapped in other exceptions.
However lacking Scala experience I don't quite understand if this is actually viable approach for a real project or using Predef.require is a way to go. I would appreciate if someone explained how things are usually done in Scala community and why.
I've also seen Functional assertion in Scala, but while the idea itself looks interesting, I think PartialFunction is not very suitable for the purpose as it is, because often more than one argument is passed and tuples look like a hack in this case.
Option or Either is definitely the way to go for functional programming.
With Option it is important to document why None might be returned.
With Either, the left side is the unsuccessful value (the "error"), while the right side is the successful value. The left side does not necessarily have to be an Exception (or a subtype of it), it can be a simple error message String (type aliases are your friend here) or a custom data type that is suitable for you application.
As an example, I usually use the following pattern when error handling with Either:
// Somewhere in a package.scala
type Error = String // Or choose something more advanced
type EitherE[T] = Either[Error, T]
// Somewhere in the program
def fooMaybe(...): EitherE[Foo] = ...
Try should only be used for wrapping unsafe (most of the time, plain Java) code, giving you the ability to pattern-match on the result:
Try(fooDangerous()) match {
case Success(value) => ...
case Failure(value) => ...
}
But I would suggest only using Try locally and then go with the above mentioned data types from there.
Some advanced datatypes like cats.effect.IO or monix.reactive.Observable contain error handling natively.
I would also suggest looking into cats.data.EitherT for typeclass-based error handling. Read the documentation, it's definitely worth it.
As a sidenote, for everyone coming from Java, Scala treats all Exceptions as Java treats RuntimeExceptions. That means, even when an unsafe piece of code from one of your dependencies throws a (checked) IOException, Scala will never require you to catch or otherwise handle the exception. So as a rule of thumb, when using Java - dependencies, almost always wrap them in a Try (or an IO if they execute side effects or block the thread).
I think your reasoning is correct. If you have a simple total (opposite of partial) function with arguments that can have invalid types then the most common and simple solution is to return some optional result like Option, etc.
It's usually not advisable to throw exceptions as they break FP laws. You can use any library that can return a more advanced type than Option like Scalaz Validation if you need to compose results in ways that are awkward with Option.
Another two alternatives I could offer is to use:
Type constrained arguments that enforce preconditions. Example: val i: Int Refined Positive = 5 based on https://github.com/fthomas/refined. You can also write your own types which wrap primitive types and assert some properties. The problem here is if you have arguments that have multiple interdependent valid values which are mutually exclusive per argument. For instance x > 1 and y < 1 or x < 1 and y > 1. In such case you can return an optional value instead of using this approach.
Partial functions, which in the essence resemble optional return types: case i: Int if i > 0 => .... Docs: https://www.scala-lang.org/api/2.12.1/scala/PartialFunction.html.
For example:
PF's def lift: (A) ⇒ Option[B] converts PF to your regular function.
Turns this partial function into a plain function returning an Option
result.
Which is similar to returning an option. The problem with partial functions that they are a bit awkward to use and not fully FP friendly.
I think Predef.require belongs to very rare cases where you don't want to allow any invalid data to be constructed and is more of a stop-everything-if-this-happens kind of measure. Example would be that you get arguments you never supposed to get.
You use the return type of the function to indicate the type of the result.
If you want to describe a function that can fail for whatever reason, of the types you mentioned you would probably return Try or Either: I am going to "try" to give your a result, or I am going to return "either" a success or an failure.
Now you can specify a custom exception
case class ConditionException(message: String) extends RuntimeException(message)
that you would return if your condition is not satisfied, e.g
import scala.util._
def myfunction(a: String, minLength: Int): Try[String] = {
if(a.size < minLength) {
Failure(ConditionException(s"string $a is too short")
} else {
Success(a)
}
}
and with Either you would get
import scala.util._
def myfunction(a: String, minLength: Int): Either[ConditionException,String] = {
if(a.size < minLength) {
Left(ConditionException(s"string $a is too short")
} else {
Right(a)
}
}
Not that the Either solution clearly indicates the error your function might return

How to use Scala Cats Validated the correct way?

Following is my use case
I am using Cats for validation of my config. My config file is in json.
I deserialize my config file to my case class Config using lift-json and then validate it using Cats. I am using this as a guide.
My motive for using Cats is to collect all errors iff present at time of validation.
My problem is the examples given in the guide, are of the type
case class Person(name: String, age: Int)
def validatePerson(name: String, age: Int): ValidationResult[Person] = {
(validateName(name),validate(age)).mapN(Person)
}
But in my case I already deserialized my config into my case class ( below is a sample ) and then I am passing it for validation
case class Config(source: List[String], dest: List[String], extra: List[String])
def vaildateConfig(config: Config): ValidationResult[Config] = {
(validateSource(config.source), validateDestination(config.dest))
.mapN { case _ => config }
}
The difference here is mapN { case _ => config }. As I already have a config if everything is valid I dont want to create the config anew from its members. This arises as I am passing config to validate function not it's members.
A person at my workplace told me this is not the correct way, as Cats Validated provides a way to construct an object if its members are valid. The object should not exist or should not be constructible if its members are invalid. Which makes complete sense to me.
So should I make any changes ? Is the above I'm doing acceptable ?
PS : The above Config is just an example, my real config can have other case classes as its members which themselves can depend on other case classes.
One of the central goals of the kind of programming promoted by libraries like Cats is to make invalid states unrepresentable. In a perfect world, according to this philosophy, it would be impossible to create an instance of Config with invalid member data (through the use of a library like Refined, where complex constraints can be expressed in and tracked by the type system, or simply by hiding unsafe constructors). In a slightly less perfect world, it might still be possible to construct invalid instances of Config, but discouraged, e.g. through the use of safe constructors (like your validatePerson method for Person).
It sounds like you're in an even less perfect world where you have instances of Config that may or may not contain invalid data, and you want to validate them to get "new" instances of Config that you know are valid. This is totally possible, and in some cases reasonable, and your validateConfig method is a perfectly legitimate way to solve this problem, if you're stuck in that imperfect world.
The downside, though, is that the compiler can't track the difference between the already-validated Config instances and the not-yet-validated ones. You'll have Config instances floating around in your program, and if you want to know whether they've already been validated or not, you'll have to trace through all the places they could have come from. In some contexts this might be just fine, but for large or complex programs it's not ideal.
To sum up: ideally you'd validate Config instances whenever they are created (possibly even making it impossible to create invalid ones), so that you don't have to remember whether any given Config is good or not—the type system can remember for you. If that's not possible, because of e.g. APIs or definitions you don't control, or if it just seems too burdensome for a simple use case, what you're doing with validateConfig is totally reasonable.
As a footnote, since you say above that you're interested in looking in more detail at Refined, what it provides for you in a situation like this is a way to avoid even more functions of the shape A => ValidationResult[A]. Right now your validateName method, for example, probably takes a String and returns a ValidationResult[String]. You can make exactly the same argument against this signature as I have against Config => ValidationResult[Config] above—once you're working with the result (by mapping a function over the Validated or whatever), you just have a string, and the type doesn't tell you that it's already been validated.
What Refined allows you to do is write a method like this:
def validateName(in: String): ValidationResult[Refined[String, SomeProperty]] = ...
…where SomeProperty might specify a minimum length, or the fact that the string matches a particular regular expression, etc. The important point is that you're not validating a String and returning a String that only you know something about—you're validating a String and returning a String that the compiler knows something about (via the Refined[A, Prop] wrapper).
Again, this may be (okay, probably is) overkill for your use case—you just might find it nice to know that you can push this principle (tracking validation in types) even further down through your program.

What is the difference between Try and Either?

According to the documentation:
The Try type represents a computation that may either result in an
exception, or return a successfully computed value. It's similar to,
but semantically different from the scala.util.Either type.
The docs do not go into further detail as to what the semantic difference is. Both seem to be able to communicate successes and failures. Why would you use one over the other?
I covered the relationship between Try, Either, and Option in this answer. The highlights from there regarding the relationship between Try and Either are summarized below:
Try[A] is isomorphic to Either[Throwable, A]. In other words you can treat a Try as an Either with a left type of Throwable, and you can treat any Either that has a left type of Throwable as a Try. It is conventional to use Left for failures and Right for successes.
Of course, you can also use Either more broadly, not only in situations with missing or exceptional values. There are other situations where Either can help express the semantics of a simple union type (where value is one of two types).
Semantically, you might use Try to indicate that the operation might fail. You might similarly use Either in such a situation, especially if your "error" type is something other than Throwable (e.g. Either[ErrorType, SuccessType]). And then you might also use Either when you are operating over a union type (e.g. Either[PossibleType1, PossibleType2]).
Since Scala 2.12, the standard library does include the conversions from Either to Try or from Try to Either. For earlier versions, it is pretty simple to enrich Try, and Either as needed:
object TryEitherConversions {
implicit class EitherToTry[L <: Throwable, R](val e: Either[L, R]) extends AnyVal {
def toTry: Try[R] = e.fold(Failure(_), Success(_))
}
implicit class TryToEither[T](val t: Try[T]) extends AnyVal {
def toEither: Either[Throwable, T] =
t.map(Right(_)).recover(Left(_)).get
}
}
This would allow you to do:
import TryEitherConversions._
//Try to Either
Try(1).toEither //Either[Throwable, Int] = Right(1)
Try("foo".toInt).toEither //Either[Throwable, Int] = Left(java.lang.NumberFormatException)
//Either to Try
Right[Throwable, Int](1).toTry //Success(1)
Left[Throwable, Int](new Exception).toTry //Failure(java.lang.Exception)
To narrowly answer your question: "What's the semantic difference":
This probably refers to flatMap and map, which are non-existent in Either and either propagate failure or map the success value in Try. This allows, for instance, chaining like
for {
a <- Try {something}
b <- Try {somethingElse(a)}
c <- Try {theOtherThing(b)}
} yield c
which does just what you'd hope - returns a Try containing either the first exception, or the result.
Try has lots of other useful methods, and of course its companion apply method, that make it very convenient for its intended use - exception handling.
If you really want to be overwhelmed, there are two other classes out there which may be of interest for this kind of application. Scalaz has a class called "\/" (formerly known as Prince), pronounced "Either", which is mostly like Either, but flatMap and map work on the Right value. Similarly, and not, Scalactic has an "Or" which is also similar to Either, but flatMap and map work on the Left value.
I don't recommend Scalaz for beginners.
Either does not imply success and failure, it is just a container for either an A or a B. It is common to use it to represent successes and failures, the convention being to put the failure on the left side, and the success on the right.
A Try can be seen as an Either with the left-side type set to Throwable. Try[A] would be equivalent to Either[Throwable, A].
Use Try to clearly identify a potential failure in the computation, the failure being represented by an exception. If you want to represent the failure with a different type (like a String, or a set of case classes extending a sealed trait for example) use Either.
Either is more general, since it simply represents disjoint unions of types.
In particular, it can represent a union of valid return values of some type X and Exception. However, it does not attempt to catch any exceptions on its own. You have to add try-catch blocks around dangerous code, and then make sure that each branch returns an appropriate subclass of Either (usually: Left for errors, Right for successful computations).
Try[X] can be thought of as Either[Exception, X], but it also catches Exceptions on its own.
Either[X, Y] usage is more general. As its name say it can represent either an object of X type or of Y.
Try[X] has only one type and it might be either a Success[X] or a Failure (which contains a Throwable).
At some point you might see Try[X] as an Either[Throwable,X]
What is nice about Try[X] is that you can chain futher operations to it, if it is really a Success they will execute, if it was a Failure they won't
val connection = Try(factory.open())
val data = connection.flatMap(conn => Try(conn.readData()))
//At some point you can do
data matches {
Success(data) => print data
Failure(throwable) => log error
}
Of course, you can always oneline this like
Try(factory.open()).flatMap(conn => Try(conn.readData()) matches {
Success(data) => print data
Failure(throwable) => log error
}
As already have been mentioned, Either is more general, so it might not only wrap error/successful result, but also can be used as an alternative to Option, for branching the code path.
For abstracting the effect of an error, only for this purpose, I identified the following differences:
Either can be used to specify a description of the error, which can be shown to the client. Try - wraps an exception with a stack trace, less descriptive, less client oriented, more for internal usage.
Either allows us to specify error type, with existing monoid for this type. As a result, it allows us to combine errors (usually via applicative effects). Try abstraction with its exception, has no monoid defined. With Try we must spent more effort to extract error and handle it.
Based on it, here is my best practices:
When I want to abstract effect of error, I always use Either as the first choice, with List/Vector/NonEmptyList as error type.
Try is used only, when you invoke code, written in OOP. Good candidates for Try are methods, that might throw an exception, or methods, that sends request to external systems (rest/soap/database requests in case the methods return a raw result, not wrapped into FP abstractions, like Future, for instance.

Under what conditions is inferring Nothing desirable?

In my own code, and on numerous mailing list postings, I've noticed confusion due to Nothing being inferred as the least upper bound of two other types.
The answer may be obvious to you*, but I'm lazy, so I'm asking you*:
Under what conditions is inferring Nothing in this way the most desirable outcome?
Would it make sense to have the compiler throw an error in these cases, or a warning unless overridden by some kind of annotation?
* Plural
Nothing is the subtype of everything, so it is in a certain sense the counter part of Any, which is the super-type of everything. Nothing can't be instantiated, you'll never hold a Nothing object. There are two situations (I'm aware of) where Nothing is actually useful:
A function that never returns (in contrast to a function that returns no useful value, which would use Unit instead), which happens for infinite loops, infinite blocking, throwing always an exception or exiting the application
As a way to specify the type of empty Containers, e.g. Nil or None. In Java, you can't have a single Nil object for an generic immutable lists without casting or other tricks: If you want to create a List of Dates, even the empty element needs to have the right type, which must be a subtype of Date. As Date and e.g. Integer don't share a common subtype in Java, you can't create such a Nil instance without tricks, despite the fact that your Nil doesn't even hold any value. Now Scala has this common subtype for all objects, so you can define Nil as object Nil extends List[Nothing], and you can use it to start any List you like.
To your second question: Yes, that would be useful. I'd guess there is already a compiler switch for turning on these warnings, but I'm not sure.
It's impossible to infer Nothing as the least upper bound of two types unless those two types are also both Nothing. When you infer the least upper bound of two types, and those two types have nothing in common, you'll get Any (In most such cases, you'll get AnyRef though, because you'll only get Any when a value type like Int or Long is involved.)