so Lets say I have some class:
case class Product(id: Int, name: String)
If I'm interfacing with some Java API or something and the possibility of either of those values being null is there then could I just write:
case class Product(id: Option[Int], name: Option[String])
If I wasn't interfacing with anything that has nulls as a concept then it would be fine to just stick with the first implementation right?
It seems a little annoying because I would have to write unit tests to test these things as well...
Thoughts would be much appreciated.
The question when to use Option should be more related to the fact that a value could be missing as opposed to a value might be mistakenly set to null.
To clarify, if you're worried that somehow the user of your library could set the id or name field to null, you should check for it in both version. Indeed Option itself doesn't give you any guarantee.
Furthermore using an Option where on a value that is not really optional is really misleading.
I usually use Option only to explicitly declare that a value might not be present and that is a valid state.
When it comes to interacting with Java I adopt the following 2 strategies depending if I'm providing the API or consuming it:
If I provide an API that is consumed by Java code I don't use Option, but instead use a little constructor wrapper which check that all parameters are not null and throw and IllegalArgumentException otherwise.
If I consume an API written in Java I do use Option wrapping all returned value so that my Scala code can handle these cases in a cleaner way.
Also you can use require to throw IllegalArgumentException, for example:
case class Product(id: Int, name: String){
require(name != null, "the name cannot be null")
}
Related
First of all, I'm very new to Scala and don't have any experience writing production code with it, so I lack understanding of what is considered a good/best practice among community. I stumbled upon these resources:
https://github.com/alexandru/scala-best-practices
https://nrinaudo.github.io/scala-best-practices/
It is mentioned there that throwing exceptions is not very good practice, which made me think what would be a good way to define preconditions for function then, because
A function that throws is a bit of a lie: its type implies it’s total function when it’s not.
After a bit of research, it seems that using Option/Either/Try/Or(scalactic) is a better approach, since you can use something like T Or IllegalArgumentException as return type to clearly indicate that function is actually partial, using exception as a way to store message that can be wrapped in other exceptions.
However lacking Scala experience I don't quite understand if this is actually viable approach for a real project or using Predef.require is a way to go. I would appreciate if someone explained how things are usually done in Scala community and why.
I've also seen Functional assertion in Scala, but while the idea itself looks interesting, I think PartialFunction is not very suitable for the purpose as it is, because often more than one argument is passed and tuples look like a hack in this case.
Option or Either is definitely the way to go for functional programming.
With Option it is important to document why None might be returned.
With Either, the left side is the unsuccessful value (the "error"), while the right side is the successful value. The left side does not necessarily have to be an Exception (or a subtype of it), it can be a simple error message String (type aliases are your friend here) or a custom data type that is suitable for you application.
As an example, I usually use the following pattern when error handling with Either:
// Somewhere in a package.scala
type Error = String // Or choose something more advanced
type EitherE[T] = Either[Error, T]
// Somewhere in the program
def fooMaybe(...): EitherE[Foo] = ...
Try should only be used for wrapping unsafe (most of the time, plain Java) code, giving you the ability to pattern-match on the result:
Try(fooDangerous()) match {
case Success(value) => ...
case Failure(value) => ...
}
But I would suggest only using Try locally and then go with the above mentioned data types from there.
Some advanced datatypes like cats.effect.IO or monix.reactive.Observable contain error handling natively.
I would also suggest looking into cats.data.EitherT for typeclass-based error handling. Read the documentation, it's definitely worth it.
As a sidenote, for everyone coming from Java, Scala treats all Exceptions as Java treats RuntimeExceptions. That means, even when an unsafe piece of code from one of your dependencies throws a (checked) IOException, Scala will never require you to catch or otherwise handle the exception. So as a rule of thumb, when using Java - dependencies, almost always wrap them in a Try (or an IO if they execute side effects or block the thread).
I think your reasoning is correct. If you have a simple total (opposite of partial) function with arguments that can have invalid types then the most common and simple solution is to return some optional result like Option, etc.
It's usually not advisable to throw exceptions as they break FP laws. You can use any library that can return a more advanced type than Option like Scalaz Validation if you need to compose results in ways that are awkward with Option.
Another two alternatives I could offer is to use:
Type constrained arguments that enforce preconditions. Example: val i: Int Refined Positive = 5 based on https://github.com/fthomas/refined. You can also write your own types which wrap primitive types and assert some properties. The problem here is if you have arguments that have multiple interdependent valid values which are mutually exclusive per argument. For instance x > 1 and y < 1 or x < 1 and y > 1. In such case you can return an optional value instead of using this approach.
Partial functions, which in the essence resemble optional return types: case i: Int if i > 0 => .... Docs: https://www.scala-lang.org/api/2.12.1/scala/PartialFunction.html.
For example:
PF's def lift: (A) ⇒ Option[B] converts PF to your regular function.
Turns this partial function into a plain function returning an Option
result.
Which is similar to returning an option. The problem with partial functions that they are a bit awkward to use and not fully FP friendly.
I think Predef.require belongs to very rare cases where you don't want to allow any invalid data to be constructed and is more of a stop-everything-if-this-happens kind of measure. Example would be that you get arguments you never supposed to get.
You use the return type of the function to indicate the type of the result.
If you want to describe a function that can fail for whatever reason, of the types you mentioned you would probably return Try or Either: I am going to "try" to give your a result, or I am going to return "either" a success or an failure.
Now you can specify a custom exception
case class ConditionException(message: String) extends RuntimeException(message)
that you would return if your condition is not satisfied, e.g
import scala.util._
def myfunction(a: String, minLength: Int): Try[String] = {
if(a.size < minLength) {
Failure(ConditionException(s"string $a is too short")
} else {
Success(a)
}
}
and with Either you would get
import scala.util._
def myfunction(a: String, minLength: Int): Either[ConditionException,String] = {
if(a.size < minLength) {
Left(ConditionException(s"string $a is too short")
} else {
Right(a)
}
}
Not that the Either solution clearly indicates the error your function might return
Following is my use case
I am using Cats for validation of my config. My config file is in json.
I deserialize my config file to my case class Config using lift-json and then validate it using Cats. I am using this as a guide.
My motive for using Cats is to collect all errors iff present at time of validation.
My problem is the examples given in the guide, are of the type
case class Person(name: String, age: Int)
def validatePerson(name: String, age: Int): ValidationResult[Person] = {
(validateName(name),validate(age)).mapN(Person)
}
But in my case I already deserialized my config into my case class ( below is a sample ) and then I am passing it for validation
case class Config(source: List[String], dest: List[String], extra: List[String])
def vaildateConfig(config: Config): ValidationResult[Config] = {
(validateSource(config.source), validateDestination(config.dest))
.mapN { case _ => config }
}
The difference here is mapN { case _ => config }. As I already have a config if everything is valid I dont want to create the config anew from its members. This arises as I am passing config to validate function not it's members.
A person at my workplace told me this is not the correct way, as Cats Validated provides a way to construct an object if its members are valid. The object should not exist or should not be constructible if its members are invalid. Which makes complete sense to me.
So should I make any changes ? Is the above I'm doing acceptable ?
PS : The above Config is just an example, my real config can have other case classes as its members which themselves can depend on other case classes.
One of the central goals of the kind of programming promoted by libraries like Cats is to make invalid states unrepresentable. In a perfect world, according to this philosophy, it would be impossible to create an instance of Config with invalid member data (through the use of a library like Refined, where complex constraints can be expressed in and tracked by the type system, or simply by hiding unsafe constructors). In a slightly less perfect world, it might still be possible to construct invalid instances of Config, but discouraged, e.g. through the use of safe constructors (like your validatePerson method for Person).
It sounds like you're in an even less perfect world where you have instances of Config that may or may not contain invalid data, and you want to validate them to get "new" instances of Config that you know are valid. This is totally possible, and in some cases reasonable, and your validateConfig method is a perfectly legitimate way to solve this problem, if you're stuck in that imperfect world.
The downside, though, is that the compiler can't track the difference between the already-validated Config instances and the not-yet-validated ones. You'll have Config instances floating around in your program, and if you want to know whether they've already been validated or not, you'll have to trace through all the places they could have come from. In some contexts this might be just fine, but for large or complex programs it's not ideal.
To sum up: ideally you'd validate Config instances whenever they are created (possibly even making it impossible to create invalid ones), so that you don't have to remember whether any given Config is good or not—the type system can remember for you. If that's not possible, because of e.g. APIs or definitions you don't control, or if it just seems too burdensome for a simple use case, what you're doing with validateConfig is totally reasonable.
As a footnote, since you say above that you're interested in looking in more detail at Refined, what it provides for you in a situation like this is a way to avoid even more functions of the shape A => ValidationResult[A]. Right now your validateName method, for example, probably takes a String and returns a ValidationResult[String]. You can make exactly the same argument against this signature as I have against Config => ValidationResult[Config] above—once you're working with the result (by mapping a function over the Validated or whatever), you just have a string, and the type doesn't tell you that it's already been validated.
What Refined allows you to do is write a method like this:
def validateName(in: String): ValidationResult[Refined[String, SomeProperty]] = ...
…where SomeProperty might specify a minimum length, or the fact that the string matches a particular regular expression, etc. The important point is that you're not validating a String and returning a String that only you know something about—you're validating a String and returning a String that the compiler knows something about (via the Refined[A, Prop] wrapper).
Again, this may be (okay, probably is) overkill for your use case—you just might find it nice to know that you can push this principle (tracking validation in types) even further down through your program.
I have a controller defined like this:
def registerCompany = Action.async(BodyParsers.parse.json) { request =>
request.body.validate[Company].fold(
errors => Future {
BadRequest(errors.mkString)
},
company => Future {
registrationService.registerCompany
Ok("saved")
}
)
}
Company is a simple case class
case class Company(name: String, address: Address, adminUser: Option[User] = None,
venues: Option[Set[Venue]] = None, _id: Option[Long]) {
}
so that I can take advantage of
implicit val companyFormatter = Json.format[Company]
So far so good, but now I want to have validation in the Company class. I've been googling a bit and the best I found was this:
http://koff.io/posts/292173-validation-in-scala/
So many solutions, yet, I'm not happy with any of them. Most of these solutions have known limitations or are a bit messy. I'd like to have declarative validation (annotation based), as that means I write less code and it looks cleaner.
I could mix java with scala and use JSR-303, but it doesn't work for case classes and I don't want to implement Reads and Writes for simple objects.
This is the closest I could find to what I want, but it doesn't support NotNull: https://github.com/bean-validation-scala/bean-validation-scala
Seems a bit like a luxury problem, with so many different solutions, but the truth is that in Java I can get the best of both worlds.
Is there anything else that I could use? Or any work around to the possibilities I'm listing here that could allow me to use both annotation based validation and case classes?
It all depends on what you mean by validation.
You gave an example of a validate Json in request body and I think you would like validate Json request (with Company constraints) but not case class Company (although the topic is called "Scala case class validation"). So you need use Validation with Reads
For example you may use:
( __ \ "name").read[String] for Required constraint.
( __ \ "_id").read[Long](min(0) keepAnd max(150)) for 0 < x < 150 constraint.
You may implement own Reads[Company] and Writes[Company] but not macros Json.format[Company]
Update for comment "The difference is that you rarely need to write a custom deserializer, as jackson knows how to handle any object"
If you don't want implement own deserialization and want use format macros, you may implement Company validation, but still using Reads:
val f(company: Company): Boolean = {... company constraints to boolean ...}
request.body.validate[Company](
companyFormatter.filter(ValidationError("Company validation error"))(f)
)
but this constraints will be applied after full Company deserialization.
Don't even know which is better: universal deserialization and "post" constraints or own deserialization and "pre" constraints. But both are working.
You could also roll your own, something like this:
https://gist.github.com/eirirlar/b40bd07a71044d3776bc069f210798c6
It will validate both case classes and incomplete case classes (hlists of options of types tagged with keys, where types and keys match the case class).
It won't give you annotations, but you're free to declare different validators for different contexts.
def myMethod(dog: Dog) = {
require (dog != null) // is it possible to already constraint it in the `Dog` type?
}
Is there a way to construct Dog such that it would be an ADT which would never be able to accept a null thus eliminate any null check? (I don't want an Option here, otherwise all my code would turn to have Option based, I want to already constraint the Dog class such that null is never possible, this is why type system is for to allow me to specify constraints in my program).
There was an attempt to provide such functionality (example I'm running in 2.10.4):
class A extends NotNull
defined class A
val x: A = null
// <console>:8: error: type mismatch;
// found : Null(null)
// required: A
// val x: A = null
^
Though it was never complete and eventually got deprecated. As for the time of writing, I don't think it's possible to construct ones hierarchy in a way that prevent you from nulls, without additional nullity checking analysis.
Check out comments in relevant ticket for an insight
I don't think this is possible, generally, because Java ruins everything. If you have a Java method that returns Dog, that could give you a null no matter what language/type features you add to Scala. That null could then be passed around, even in Scala code, and end up being passed to myMethod.
So you can't have non-null types in Scala without losing the interoperability property that Scala objects are Java objects (at least for the type in question).
Unfortunately inheritance makes it very difficult for the computer to know in the general case whether a method could be passed an object that originated from Java - unless everything is final/sealed, you can always subclass a class that handled the object at some point, and override the Dog-returning method. So it requires hairy full-program analysis to figure out the concrete types of everything (and remember, which concrete types are used can depend on runtime input!) just to establish that a given case could not involve Java code.
Case classes in Scala are standard classes enhanced with pattern-matching, equals, ... (or am I wrong?). Moreover they require no "new" keyword for their instanciation. It seems to me that they are simpler to define than regular classes (or am I again wrong?).
There are lots of web pages telling where they should be used (mostly about pattern matchin). But where should they be avoided ? Why don't we use them everywhere ?
There are many places where case classes are not adequate:
When one wishes to hide the data structure.
As part of a type hierarchy of more than two or three levels.
When the constructor requires special considerations.
When the extractor requires special considerations.
When equality and hash code requires special considerations.
Sometimes these requirements show up late in the design, and requires one to convert a case class into a normal class. Since the benefits of a case class really aren't all that great -- aside from the few special cases they were specially made for -- my own recommendation is not to make anything a case class unless there's a clear use for it.
Or, in other words, do not overdesign.
Inheriting from case classes is problematic. Suppose you have code like so:
case class Person(name: String) { }
case class Homeowner(address: String,override val name: String)
extends Person(name) { }
scala> Person("John") == Homeowner("1 Main St","John")
res0: Boolean = true
scala> Homeowner("1 Main St","John") == Person("John")
res1: Boolean = false
Perhaps this is what you want, but usually you want a==b if and only if b==a. Unfortunately, the compiler can't sensibly fix this for you automatically.
This gets even worse because the hashCode of Person("John") is not the same as the hashCode of Homeowner("1 Main St","John"), so now equals acts weird and hashCode acts weird.
As long as you know what to expect, inheriting from case classes can give comprehensible results, but it has come to be viewed as bad form (and thus has been deprecated in 2.8).
One downside that is mentioned in Programming in Scala is that due to the things automatically generated for case classes the objects get larger than for normal classes, so if memory efficiency is important, you might want to use regular classes.
It can be tempting to use case classes because you want free toString/equals/hashCode. This can cause problems, so avoid doing that.
I do wish there were an annotation that let you get those handy things without making a case class, but maybe that's harder than it sounds.