Using Scala, Play Framework, Slick 3, Specs2.
I have a repository layer and a service layer. The repositories are quite dumb, and I use specs2 to make sure the service layer does its job.
My repositories used to return futures, like this:
def findById(id: Long): Future[Option[Foo]] =
db.run(fooQuery(id))
Then it would be used in the service:
def foonicate(id: Long): Future[Foo] = {
  fooRepository.findById(id).flatMap { optFoo =>
    val foo: Foo = optFoo match {
      case Some(foo) => [business logic returning Foo]
      case None => [business logic returning Foo]
    }
    fooRepository.save(foo)
  }
}
Services were easy to spec. In the service spec, the FooRepository would be mocked like this:
fooRepository.findById(3).returns(Future(Foo(3)))
I recently found the need for database transactions. Several queries should be combined into a single transaction. The prevailing opinion seems to be that it's perfectly ok to handle transaction logic in the service layer.
With that in mind, I've changed my repositories to return slick.dbio.DBIO and added a helper method to run the query transactionally:
def findById(id: Long): DBIO[Option[Foo]] =
fooQuery(id)
def run[T](query: DBIO[T]): Future[T] =
db.run(query.transactionally)
The services compose the DBIOs and finally call the repository to run the query:
def foonicate(id: Long): Future[Foo] = {
  val query = fooRepository.findById(id).flatMap { optFoo =>
    val foo: Foo = optFoo match {
      case Some(foo) => [business logic finally returning Foo]
      case None => [business logic finally returning Foo]
    }
    fooRepository.save(foo)
  }
  fooRepository.run(query)
}
That seems to work, but now I can only spec it like this:
val findFooDbio = DBIO.successful(None)
val saveFooDbio = DBIO.successful(Foo(3))
fooRepository.findById(3) returns findFooDbio
fooRepository.save(Foo(3)) returns saveFooDbio
fooRepository.run(any[DBIO[Foo]]) returns Future(Foo(3))
That any in the run mock sucks! Now I'm not testing the actual logic but instead accepting any DBIO[Foo]! I've tried to use the following:
fooRepository.run(findFooDbio.flatMap(_ => saveFooDbio)) returns Future(Foo(3))
But it breaks with java.lang.NullPointerException: null, which is specs2's way of saying "sorry mate, the method with this parameter wasn't found". I've tried various variations, but none of them worked.
I suspect that might be because functions can't be compared:
scala> val a: Int => String = x => "hi"
a: Int => String = <function1>
scala> val b: Int => String = x => "hi"
b: Int => String = <function1>
scala> a == b
res1: Boolean = false
Any ideas how to spec DBIO composition without cheating?
I had a similar idea and also investigated it to see if I could:
create the same DBIO composition
use it with a matcher in mocking
However I found out that it is actually infeasible in practice:
as you noticed, you cannot compare functions
additionally, when you investigate the internals of DBIO, it is basically a Free-monad-like structure - it has implementations for plain values and directly generated queries (so if you extract the statements you could compare some part of the query), but there are also mappings which store functions
even if you somehow managed to reuse functions, so that they had reference equality, the implementations of DBIO do not care about overriding equals, so they would be a different beast anyway
So knowing that, I gave up the initial idea. What I can recommend instead:
mock on any input of database.run - it is more error prone, as it won't notify you if test expectations start differing from returned results, but it's better than nothing
replace DBIO with some intermediate structure that you know you can compare safely. E.g. Cats' Free monad implementation uses case classes, so as long as you ensure that your functions are somehow comparable (e.g. by not creating them ad hoc, but using vals and objects instead), you could compare on the intermediate representation and mock the whole interpret -> run process
replace unit tests with mocked database with integration tests with an actual database
try out the Typed Tagless Final Interpreter pattern for handling databases, and basically inject a different monad in tests than in production (e.g. production -> a service returning DBIO, tests -> a service returning the Futures you want); a sketch follows below
Actually, you could try many other things with Free, TTFI and swapping implementations. The bottom line is - you cannot reliably compare DBIOs, so design your code in a way that you can test without doing so. It's not a pleasant answer, especially if you just wanted to put a test together and move on, but AFAIK there is no other way.
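To make that last recommendation concrete, here is a minimal tagless-final sketch (the Foo type, the repository methods and the placeholder business logic are assumptions based on the question, and cats is used for the Monad constraint):

import cats.Monad
import cats.implicits._

case class Foo(id: Long) // hypothetical domain type from the question

// The repository algebra, abstracted over the effect type F[_].
trait FooRepository[F[_]] {
  def findById(id: Long): F[Option[Foo]]
  def save(foo: Foo): F[Foo]
}

// The service only requires a Monad for F, so it never mentions DBIO directly.
class FooService[F[_]: Monad](repo: FooRepository[F]) {
  def foonicate(id: Long): F[Foo] =
    repo.findById(id).flatMap { optFoo =>
      val foo = optFoo.getOrElse(Foo(id)) // stand-in for the real business logic
      repo.save(foo)
    }
}

In production you bind F to DBIO (plus a runner that adds .transactionally); in tests you bind it to cats.Id or Future, so your mocks return plain values and you never have to compare DBIO instances.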
Related
For the sake of simplicity suppose we have a method called listTail which is defined in the following way:
private def listTail(ls: List[Int]): List[Int] = {
ls.tail
}
Also we have a method which handles the exception when the list is empty.
private def handleEmptyList(ls: List[Int]): List[Int] = {
  if (ls.isEmpty) List.empty[Int] else ls // return the list unchanged if it is not empty
}
Now I want to create a safe version of the listTail method, which uses both methods:
import scala.util.{Try, Success, Failure}
def safeListTail(ls: List[Int]): List[Int] = {
  val tryTail: Try[List[Int]] = Try(listTail(ls))
  tryTail match {
    case Success(list) => list
    case Failure(_) => handleEmptyList(ls)
  }
}
My question is, if the two private methods are already tested, then should I test the safe method as well? And if yes, how?
I was thinking of just checking whether the pattern-matching cases are executed depending on the input. That is, when we hit the Failure case, the handleEmptyList method is executed. But I am not aware of how to check this.
Or do I need to refactor my code and put everything in a single method? Even though my private methods may be much more complex than in this example.
My tests are written using ScalaTest.
Allowing your methods to throw intentionally is a bad idea and definitely isn't in the spirit of FP. It's probably better to capture failure in the type signature of methods which have the ability to fail.
private def listTail(ls: List[Int]): Try[List[Int]] = Try {
ls.tail
}
Now your users know that this will return either a Success or a Failure and there's no magic stack unwinding. This already makes it easier to test that method.
You can also get rid of the pattern matching with a simple def safeTailList(ls: List[Int]) = listTail(ls).getOrElse(Nil) with this formulation -- pretty nice!
If you want to test this, you can make it package private and test it accordingly.
The better idea would be to reconsider your algorithm. There's machinery that makes getting the safe tail built-in:
def safeTailList(ls: List[Int]) = ls.drop(1)
It is actually the other way around: normally, you don't want to test private methods, only the public ones, because they are the ones that define your interactions with the outside world. As long as they work as promised, who cares what your private methods do - that's just an implementation detail.
So, the bottom line is - just test your safeListTail, and that's it, no need to test the inner implementation separately.
BTW, you don't need the match there: Try(listTail(ls)).getOrElse(handleEmptyList(ls)) is equivalent to what you have there ... which is actually not a very good idea, because it swallows other exceptions, not just the one thrown when the list is empty. A better approach would be to reinstate the match but get rid of Try:
ls match {
  case Nil => handleEmptyList(ls)
  case _ => listTail(ls)
}
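In ScalaTest terms, that means exercising both branches through the public method only. A minimal sketch, assuming safeListTail is accessible to the test and using the AnyFlatSpec/Matchers style (an assumption about your test setup):

import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

class SafeListTailSpec extends AnyFlatSpec with Matchers {
  "safeListTail" should "drop the head of a non-empty list" in {
    safeListTail(List(1, 2, 3)) shouldBe List(2, 3)
  }
  it should "return an empty list when given an empty list" in {
    safeListTail(Nil) shouldBe Nil
  }
}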
What would be the best approach to solve this problem in the most functional (algebraic) way, using Scala and Cats (or maybe another library focused on category theory and/or functional programming)?
Resources
Suppose we have the following methods, which perform REST API calls to retrieve single pieces of information:
type FutureApiCallResult[A] = Future[Either[String, Option[A]]]
def getNameApiCall(id: Int): FutureApiCallResult[String]
def getAgeApiCall(id: Int): FutureApiCallResult[Int]
def getEmailApiCall(id: Int): FutureApiCallResult[String]
As you can see they produce asynchronous results. The Either monad is used to return possible errors during API calls and Option is used to return None whenever the resource is not found by the API (this case is not an error but a possible and desired result type).
Method to implement in a functional way
case class Person(name: String, age: Int, email: String)
def getPerson(id: Int): Future[Option[Person]] = ???
This method should use the three API call methods defined above to asynchronously compose and return a Person, or None if any of the API calls fails or returns None (in which case the whole Person entity cannot be composed).
Requirements
For performance reasons all the API calls must be done in a parallel way
My guess
I think the best option would be to use the Cats Semigroupal Validated but I get lost when trying to deal with Future and so many nested Monads :S
Can anyone tell me how you would implement this (even if it means changing the method signature or the main concept), or point me to the right resources? I'm quite new to Cats and algebra in coding, but I would like to learn how to handle this kind of situation so that I can use it at work.
The key requirement here is that it has to be done in parallel. It means that the obvious solution using a monad is out, because monadic bind is blocking (it needs the result in case it has to branch on it). So the best option is to use applicative.
I'm not a Scala programmer, so I can't show you the code, but the idea is that an applicative functor can lift functions of multiple arguments (a regular functor lifts functions of single argument using map). Here, you would use something like map3 to lift the three-argument constructor of Person to work on three FutureResults. A search for "applicative future in Scala" returns a few hits. There are also applicative instances for Either and Option and, unlike monads, applicatives can be composed together easily. Hope this helps.
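In Scala, that idea might look roughly like the sketch below, reusing the signatures from the question (an illustration only: the futures are created up front so the calls run concurrently, and the constructor is lifted with mapN through each layer):

import cats.implicits._
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

def getPerson(id: Int): Future[Option[Person]] = {
  // Start all three calls first so they run in parallel.
  val nameF  = getNameApiCall(id)
  val ageF   = getAgeApiCall(id)
  val emailF = getEmailApiCall(id)
  (nameF, ageF, emailF).mapN { (nameE, ageE, emailE) =>
    (nameE, ageE, emailE)
      .mapN((nameO, ageO, emailO) => (nameO, ageO, emailO).mapN(Person)) // lift the 3-arg constructor
      .toOption // collapse an API error into None, matching the requested signature
      .flatten
  }
}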
You can make use of the cats.Parallel type class. This enables some really neat combinators with EitherT which, when run in parallel, will accumulate errors. So the easiest and most concise solution would be this:
type FutureResult[A] = EitherT[Future, NonEmptyList[String], Option[A]]
def getPerson(id: Int): FutureResult[Person] =
  (getNameApiCall(id), getAgeApiCall(id), getEmailApiCall(id))
    .parMapN((name, age, email) => (name, age, email).mapN(Person))
For more information on Parallel visit the cats documentation.
Edit: Here's another way without the inner Option:
type FutureResult[A] = EitherT[Future, NonEmptyList[String], A]
def getPerson(id: Int): FutureResult[Person] =
  (getNameApiCall(id), getAgeApiCall(id), getEmailApiCall(id))
    .parMapN(Person)
This is the only solution I came up with, but I'm still not satisfied because I have the feeling it could be done in a cleaner way.
import cats.data.NonEmptyList
import cats.implicits._
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global // needed for Future.map and the cats instances for Future
case class Person(name: String, age: Int, email: String)
type FutureResult[A] = Future[Either[NonEmptyList[String], Option[A]]]
def getNameApiCall(id: Int): FutureResult[String] = ???
def getAgeApiCall(id: Int): FutureResult[Int] = ???
def getEmailApiCall(id: Int): FutureResult[String] = ???
def getPerson(id: Int): FutureResult[Person] =
  (
    getNameApiCall(id).map(_.toValidated),
    getAgeApiCall(id).map(_.toValidated),
    getEmailApiCall(id).map(_.toValidated)
  ).tupled // combine three Futures
    .map {
      case (nameV, ageV, emailV) =>
        (nameV, ageV, emailV).tupled // combine three Validated
          .map(_.tupled) // combine three Options
          .map(_.map { case (name, age, email) => Person(name, age, email) }) // wrap final result
    }
    .map(_.toEither)
Personally I prefer to collapse all non-success conditions into the Future's failure. That really simplifies the error handling, like:
val futurePerson = for {
  name <- getNameApiCall(id)
  age <- getAgeApiCall(id)
  email <- getEmailApiCall(id)
} yield Person(name, age, email)
futurePerson.recover {
  case e: SomeKindOfError => ???
  case e: AnotherKindOfError => ???
}
Note that this won't run the requests in parallel; to do so, you'd need to move the futures' creation outside of the for comprehension, like:
val futureName = getNameApiCall(id)
val futureAge = getAgeApiCall(id)
val futureEmail = getEmailApiCall(id)
val futurePerson = for {
  name <- futureName
  age <- futureAge
  email <- futureEmail
} yield Person(name, age, email)
In https://gist.github.com/satyagraha/897e427bfb5ed203e9d3054ac6705704 I have posted a Scala Cats validation scenario which seems reasonable, but I haven't found a very neat solution.
Essentially, there is a two-stage validation, where individual fields are validated, then a class constructor is called which may throw due to internal checks (in general this may not be under my control to change, hence the exception handling code). We wish to not to call the constructor if any field validation fails, but also combine any constructor failure into the final result. "Fail-fast" is definitely right here for the two-phase check.
This is a kind of flatMap problem, which the cats.data.Validated framework appears to handle via the cats.data.Validated#andThen operation. However, I couldn't find a particularly neat solution to the problem, as you can see in the code. There are quite a limited number of operations available on a cats.syntax.CartesianBuilder, and it wasn't clear to me how to link it with the andThen operation.
Any ideas welcome! Note there is a Cats issue https://github.com/typelevel/cats/issues/1343 which possibly is related, not sure.
For fail-fast, chained validation it is easier to use Either than Validated. You can easily switch from Either to Validated or vice versa, depending on whether you want error accumulation.
A possible solution to your problem would be to create a smart constructor for User which returns an Either[Message, User] and use this with Validated[Message, (Name, Date)].
import cats.implicits._
import cats.data.Validated
def user(name: Name, date: Date): Either[Message, User] =
  Either.catchNonFatal(User(name, date)).leftMap(Message.toMessage)
// error accumulation -> Validated
val valids: Validated[Message, (Name, Date)] =
  (validateName(nameRepr) |@| validateDate(dateDepr)).tupled
// error short circuiting -> Either
val userOrMessage: Either[Message, User] =
  valids.toEither.flatMap((user _).tupled)
// Either[Message,User] = Right(User(Name(joe),Date(now)))
I would make a helper second-order function to wrap the exception-throwing ones:
def attempt[A, B](f: A => B): A => Validated[Message, B] = a => tryNonFatal(f(a))
Also, default companions of case classes extend the FunctionN trait, so there's no need to do (User.apply _).tupled; it can be shortened to User.tupled (on custom companions you need to write extends ((...) => ...) yourself, but the apply override will be autogenerated).
So we end up with this, using andThen:
val valids = validateName(nameRepr) |@| validateDate(dateDepr)
val res: Validated[Message, User] = valids.tupled andThen attempt(User.tupled)
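As a side note, the |@| cartesian builder syntax was deprecated and later removed from Cats; with current Cats versions the same pipeline can be written with plain tuple syntax (a sketch, still assuming the names from the gist):

import cats.data.Validated
import cats.implicits._

// Accumulate the field errors first, then run the exception-wrapping constructor check.
val valids: Validated[Message, (Name, Date)] =
  (validateName(nameRepr), validateDate(dateDepr)).tupled
val res: Validated[Message, User] =
  valids andThen attempt(User.tupled)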
I'm attempting to write a method which accepts multiple generic types and takes as an argument a unit of work to execute.
The idea is that the unit of work is a common function that itself is generic. For the sake of example, let's say it's something like the following:
def loadModelRdd[T: TypeTag](sc: SparkContext): RDD[T] = {
...
}
loadModelRdd() will construct an RDD of the given type after some internal processing like loading the Model information, etc.
A prototype method I've been hacking on looks something like the following (non-working):
def forkAll[A : Manifest, B : Manifest](work: => RDD[_]): (RDD[A], RDD[B]) = {
  def aFuture = Future { work } // How can I notify that this work call returns type A?
  def bFuture = Future { work } // How can I notify that this work call returns type B?
  val res = for {
    a <- aFuture
    b <- bFuture
  } yield (a.asInstanceOf[A], b.asInstanceOf[B])
  Await.result(res, 10.seconds)
}
This is a shortened version of the code I'm working on as I'm actually looking at accepting as many as 10 different types.
As you can see, the overall goal of the forkAll method is to wrap the unit of work in a Future, fork-join the execution of the unit of work for each type, then return the results as a Tuple'd result. An example consumer statement would be:
val (a, b) = forkAll[ClassA, ClassB](loadModelRdd)
i.e. I want to fork-join at this point and wait for the results, but I want the executions to happen in parallel and then be collected back to the Driver (the Spark Driver, to be specific).
The problem is I'm not sure how to coerce the type returned by the unit of work within forkAll when constructing the Future {} blocks. Without the forkAll, the implementation looked like the following:
val resA = loadModelRdd[ClassA](sc)
val resB = loadModelRdd[ClassB](sc)
...
I am looking at doing this for two reasons:
To abstract the details of fork-join for any unit of work which matches this model.
A version of this code, which explicitly states what the unit of work is, is working in production and was responsible for cutting the execution time of a long-running block by close to half. I have a couple of execution steps where this pattern could be applied.
Is this something that is possible in Scala's type system? Or should I look at this problem from a different perspective? I've tried a couple of implementations (including one described here) but I haven't quite found one that fits my current view of the problem.
Please let me know if there is any additional information needed.
Thanks!
Short answer: Scala does not allow functions with type parameters, so what you want is not exactly possible.
You are attempting to pass a method with a type parameter. Although methods are allowed to have type parameters, functions are not. When you try to pass a method, it acts like an anonymous function, so you must specify a type.
However, since methods do allow type parameters, you can take advantage of this by creating an abstract class that will do your fork/join
abstract class ForkJoin {
  protected def work[T]: RDD[T]
  def apply[A, B]: (RDD[A], RDD[B]) = {
    // Write implementation of fork/join here
    (work[A], work[B])
  }
}
and then overriding the generic work method so that it does what you want, such as calling some other pre-defined method:
val forkJoin = new ForkJoin {
  override protected def work[T]: RDD[T] =
    loadModelRdd[T](sc)
}
val (intRdd, stringRdd) = forkJoin[Int, String]
Check out this for a prototype implementation that compiles and runs without issues.
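For illustration, the fork/join inside apply might be filled in roughly like this (a sketch, not the linked prototype; it simply wraps each work call in a Future and waits for the pair, with an arbitrary timeout):

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import org.apache.spark.rdd.RDD

abstract class ForkJoin {
  protected def work[T]: RDD[T]
  def apply[A, B]: (RDD[A], RDD[B]) = {
    // Kick off both units of work concurrently, then block until both complete.
    val aFuture = Future(work[A])
    val bFuture = Future(work[B])
    Await.result(aFuture.zip(bFuture), 10.minutes)
  }
}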
I've been reading about the OO 'fluent interface' approach in Java, JavaScript and Scala and I like the look of it, but have been struggling to see how to reconcile it with a more type-based/functional approach in Scala.
To give a very specific example of what I mean: I've written an API client which can be invoked like this:
val response = MyTargetApi.get("orders", 24)
The return value from get() is a Tuple3 type called RestfulResponse, as defined in my package object:
// 1. Return code
// 2. Response headers
// 3. Response body (Option)
type RestfulResponse = (Int, List[String], Option[String])
This works fine - and I don't really want to sacrifice the functional simplicity of a tuple return value - but I would like to extend the library with various 'fluent' method calls, perhaps something like this:
val response = MyTargetApi.get("customers", 55).throwIfError()
// Or perhaps:
MyTargetApi.get("orders", 24).debugPrint(verbose=true)
How can I combine the functional simplicity of get() returning a typed tuple (or similar) with the ability to add more 'fluent' capabilities to my API?
It seems you are dealing with a client-side API for REST-style communication. Your get method seems to be what triggers the actual request/response cycle. It looks like you'd have to deal with this:
properties of the transport (like credentials, debug level, error handling)
providing data for the input (your id and the type of record (order or customer))
doing something with the results
I think for the properties of the transport, you can put some of it into the constructor of the MyTargetApi object, but you can also create a query object that will store those for a single query and can be set in a fluent way using a query() method:
MyTargetApi.query().debugPrint(verbose=true).throwIfError()
This would return some stateful Query object that stores the values for log level and error handling. For providing the input data, you can also use the query object to set those values, but instead of returning your response directly, return a QueryResult:
class Query {
  private var _verbose: Boolean = false // backing field for the debug flag set below
  def debugPrint(verbose: Boolean): this.type = { _verbose = verbose; this }
  def throwIfError(): this.type = { ... }
  def get(tpe: String, id: Int): QueryResult[RestfulResponse] =
    new QueryResult[RestfulResponse] {
      def run(): RestfulResponse = // code to make rest call goes here
    }
}
trait QueryResult[A] { self =>
  def map[B](f: (A) => B): QueryResult[B] = new QueryResult[B] {
    def run(): B = f(self.run())
  }
  def flatMap[B](f: (A) => QueryResult[B]) = new QueryResult[B] {
    def run(): B = f(self.run()).run()
  }
  def run(): A
}
Then to eventually get the results you call run. So at the end of the day you can call it like this:
MyTargetApi.query()
.debugPrint(verbose=true)
.throwIfError()
.get("customers", 22)
.map(resp => resp._3.map(_.length)) // body
.run()
Which should be a verbose request that will error out on issues, retrieve the customer with id 22, keep the body and get its length as an Option[Int].
The idea is that you can use map to define computations on a result you do not yet have. If we add flatMap to it, then you could also combine two computations from two different queries.
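For example, combining two queries might look something like this (extractCustomerId is a hypothetical helper that pulls an id out of the first response):

// Hypothetical: fetch an order, then fetch the customer it refers to.
val combined: QueryResult[RestfulResponse] =
  MyTargetApi.query()
    .get("orders", 24)
    .flatMap(orderResp => MyTargetApi.query().get("customers", extractCustomerId(orderResp)))
combined.run() // nothing is executed until run() is called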
To be honest, I think it sounds like you need to feel your way around a little more, because the example is not obviously functional, nor particularly fluent. It seems you might be mixing up fluency with non-idempotence, in the sense that your debugPrint method is presumably performing I/O and throwIfError is throwing exceptions. Is that what you mean?
If you are referring to whether a stateful builder is functional, the answer is "not in the purest sense". However, note that a builder does not have to be stateful.
case class Person(name: String, age: Int)
Firstly, this can be created using named parameters:
Person(name="Oxbow", age=36)
Or, a stateless builder:
object Person {
def withName(name: String)
= new { def andAge(age: Int) = new Person(name, age) }
}
Hey presto:
scala> Person withName "Oxbow" andAge 36
As to your use of untyped strings to define the query you are making: this is poor form in a statically-typed language. What is more, there is no need:
sealed trait Query
case object orders extends Query
def get(query: Query): Result
Hey presto:
api get orders
Although, I think this is a bad idea - you shouldn't have a single method which can give you back notionally completely different types of results.
To conclude: I personally think there is no reason whatsoever that fluency and functional cannot mix, since functional just indicates the lack of mutable state and a strong preference for idempotent functions to perform your logic.
Here's one for you:
args.map(_.toInt)
args map toInt
I would argue that the second is more fluent. It's possible if you define:
val toInt = (_ : String).toInt
That is, if you define a function. I find functions and fluency mix very well in Scala.
You could try having get() return a wrapper object that might look something like this
type RestfulResponse = (Int, List[String], Option[String])
class ResponseWrapper(private val rr: RestfulResponse /* and maybe some flags as additional arguments, or something? */) {
  def get: RestfulResponse = rr
  def throwIfError: RestfulResponse = {
    // Throw your exception if you detect an error
    rr // And return the response if you didn't detect an error
  }
  def debugPrint(verbose: Boolean /* , whatever other parameters you had in mind */): Unit = {
    // All of your debugging printing logic
  }
  // Any and all other methods that you want this API response to be able to execute
}
Basically, this allows you to put your response into a container that has all of these nice methods that you want, and, if you simply want the wrapped response, you can just call the wrapper's get method.
Of course, the downside is that you will need to change your API a bit, if that's a concern for you at all. Then again, you could probably avoid changing your API by instead creating an implicit conversion from RestfulResponse to ResponseWrapper and vice versa. That's something worth considering.
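A rough sketch of that implicit-conversion idea, using an implicit class over the existing tuple alias so that get() keeps returning the plain RestfulResponse (the status-code check is an assumption):

object RestfulSyntax {
  type RestfulResponse = (Int, List[String], Option[String])

  implicit class ResponseOps(rr: RestfulResponse) {
    def throwIfError(): RestfulResponse = {
      // Assumption: any status code >= 400 counts as an error.
      if (rr._1 >= 400) throw new RuntimeException(s"API call failed with status ${rr._1}")
      rr
    }
    def debugPrint(verbose: Boolean = false): RestfulResponse = {
      if (verbose) println(s"status=${rr._1} headers=${rr._2}")
      println(s"body=${rr._3.getOrElse("<empty>")}")
      rr
    }
  }
}

// Usage: the fluent calls attach to the tuple without changing get()'s signature.
// import RestfulSyntax._
// MyTargetApi.get("orders", 24).debugPrint(verbose = true).throwIfError()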