Reduce testing overhead when DAO contains action - scala

For accessing objects, a Slick DAO which contains functions returning actions and objects of a stored type were created. Example:
def findByKeysAction(a: String, b: String, c: String = {
Users.filter(x => x.a === a && x.b === b && x.c === c).result
}
def findByKeys(a: String, b: String, c: String): Future[Option[foo]] = {
db.run(findByKeysAction(consumerId, contextId, userId)).map(_.headOption)
}
Notice how the non-action-based function wraps the other in db.run().
What is a solid approach to test both functions and minimizing redundancy of code?
I naive method could of course be to test them both with their individual test setups (above is a simple example; there could be a lot of test setup needed to satisfy DB restrictions).

Notice how the non-action-based function wraps the other in db.run().
Not really. Your findByKeys method does not call findByUserIdAction, so I'm adjusting for that minor detail in this answer.
def findByUserIdAction(userId: String) = {
Users.filter(_.userId === userId).result
}
The above code returns a DBIOAction. As the documentation states:
Just like a query, an I/O action is only a description of an operation. Creating or composing actions does not execute anything on a database.
As far as a user of Slick is concerned, there is no meaningful test for a DBIOAction, because by itself it does nothing; it's only a recipe for what one wants to do. To execute the above DBIOAction, you have to materialize it, which is what the following does:
def findByUserId(userId: String): Future[Option[User]] = {
db.run(findByUserIdAction(userId)).map(_.headOption)
}
The materialized result is what you want to test. One way to do so is to use ScalaTest's ScalaFutures trait. For example, in a spec that mixes in that trait, you could have something like:
"Users" should "return a single user by id" in {
findByUserId("id3").futureValue shouldBe Option(User("id3", ...))
}
Take a look at this Slick 3.2.0 test project for more examples: specifically, TestSpec and QueryCoffeesTest.
In summary, don't bother trying to test a DBIOAction in isolation; just test its materialized result.

Related

dependency injection vs partial application in Scala

Is there any benefit to using partially applied functions vs injecting dependencies into a class? Both approaches as I understand them shown here:
class DB(conn: String) {
def get(sql: String): List[Any] = _
}
object DB {
def get(conn: String) (sql: String): List[Any] = _
}
object MyApp {
val conn = "jdbc:..."
val sql = "select * from employees"
val db1 = new DB(conn)
db1.get(sql)
val db2 = DB.get(conn) _
db2(sql)
}
Using partially-applied functions is somewhat simpler, but the conn is passed to the function each time, and could have a different conn each time it is called. The advantage of using a class is that it can perform one-off operations when it is created, such as validation or caching, and retain the results in the class for re-use.
For example the conn in this code is a String but this is presumably used to connect to a database of some sort. With the partially-applied function it must make this connection each time. With a class the connection can be made when the class is created and just re-used for each query. The class version can also prevent the class being created unless the conn is valid.
The class is usually used when the dependency is longer-lived or used by multiple functions. Partial application is more common when the dependency is shorter-lived, like during a single loop or callback. For example:
list.map(f(context))
def f(context: Context)(element: Int): Result = ???
It wouldn't really make sense to create a class just to hold f. On the other hand, if you have 5 functions that all take context, you should probably just put those into a class. In your example, get is unlikely to be the only thing that requires the conn, so a class makes more sense.

Mocking slick.dbio.DBIO composition in specs2

Using Scala, Play Framework, Slick 3, Specs2.
I have a repository layer and a service layer. The repositories are quite dumb, and I use specs2 to make sure the service layer does its job.
My repositories used to return futures, like this:
def findById(id: Long): Future[Option[Foo]] =
db.run(fooQuery(id))
Then it would be used in the service:
def foonicate(id: Long): Future[Foo] = {
fooRepository.findById(id).flatMap { optFoo =>
val foo: Foo = optFoo match {
case Some(foo) => [business logic returning Foo]
case None => [business logic returning Foo]
}
fooRepository.save(foo)
}
}
Services were easy to spec. In the service spec, the FooRepository would be mocked like this:
fooRepository.findById(3).returns(Future(Foo(3)))
I recently found the need for database transactions. Several queries should be combined into a single transaction. The prevailing opinion seems to be that it's perfectly ok to handle transaction logic in the service layer.
With that in mind, I've changed my repositories to return slick.dbio.DBIO and added a helper method to run the query transactionally:
def findById(id: Long): DBIO[Option[Foo]] =
fooQuery(id)
def run[T](query: DBIO[T]): Future[T] =
db.run(query.transactionally)
The services compose the DBIOs and finally call the repository to run the query:
def foonicate(id: Long): Future[Foo] = {
val query = fooRepository.findById(id).flatMap { optFoo =>
val foo: Foo = optFoo match {
case Some(foo) => [business logic finally returning Foo]
case None => [business logic finally returning Foo]
}
fooRepository.save(foo)
}
fooRepository.run(query)
}
That seems to work, but now I can only spec it like this:
val findFooDbio = DBIO.successful(None))
val saveFooDbio = DBIO.successful(Foo(3))
fooRepository.findById(3) returns findFooDbio
fooRepository.save(Foo(3)) returns saveFooDbio
fooRepository.run(any[DBIO[Foo]]) returns Future(Foo(3))
That any in the run mock sucks! Now I'm not testing the actual logic but instead accept any DBIO[Foo]! I've tried to use the following:
fooRepository.run(findFooDbio.flatMap(_ => saveFooDbio)) returns Future(Foo(3))
But it breaks with java.lang.NullPointerException: null, which is specs2's way of saying "sorry mate, the method with this parameter wasn't found". I've tried various variations, but none of them worked.
I suspect that might be because functions can't be compared:
scala> val a: Int => String = x => "hi"
a: Int => String = <function1>
scala> val b: Int => String = x => "hi"
b: Int => String = <function1>
scala> a == b
res1: Boolean = false
Any ideas how to spec DBIO composition without cheating?
I had a similar idea and also investigated it to see if I could:
create the same DBIO composition
use it with matcher in mocking
However I found out that it is actually infeasible in practice:
as you noticed, you cannot compare functions
additionally, when you investigate internals of DBIO, it is basically Free-monad-like structure - it has implementation for plain values and directly generated queries (then, if you extract the statements you could compare some part of the query), but there are also mappings which stores functions
even if you somehow managed to reuse functions, so that they had reference equality, implementations of DBIO do not care about overriding equals, so they would be different beast anyway
So knowing that, I gave up the initial idea. What I can recommend instead:
mock on any input of database.run - it is more error prone, as it won't notify you if test expectations start differing from returned results, but it's better than nothing
replace DBIO by some intermediate structure, that you know you can compare safely. E.g. Cats' Free monad implementation uses case classes so as long as you manage to ensure that functions are somehow comparable (e.g. by not creating them ad hoc, but instead using vals and objects), you could compare on intermediate representation, and mock whole interpret -> run process
replace unit tests with mocked database with integration tests with an actual database
try out Typed Tagless Final Interpreter pattern for handling databases - and basically inject in tests different monad than in production (e.g. prod -> service returning DBIO, production -> service returning Futures you want)
Actually, you could try many other things with Free, TTFI and swapping implementations. The bottom line is - you cannot reliably compare on DBIO, so design your code in a way, that you could test without doing so. It's not a pleasant answer, especially if you just wanted to put together test, and move on, but AFAIK there is no other way.

Pattern for generating negative Scalacheck scenarios: Using property based testing to test validation logic in Scala

We are looking for a viable design pattern for building Scalacheck Gen (generators) that can produce both positive and negative test scenarios. This will allow us to run forAll tests to validate functionality (positive cases), and also verify that our case class validation works correctly by failing on all invalid combinations of data.
Making a simple, parameterized Gen that does this on a one-off basis is pretty easy. For example:
def idGen(valid: Boolean = true): Gen[String] = Gen.oneOf(ID.values.toList).map(s => if (valid) s else Gen.oneOf(simpleRandomCode(4), "").sample.get)
With the above, I can get a valid or invalid ID for testing purposes. The valid one, I use to make sure business logic succeeds. The invalid one, I use to make sure our validation logic rejects the case class.
Ok, so -- problem is, on a large scale, this becomes very unwieldly. Let's say I have a data container with, oh, 100 different elements. Generating a "good" one is easy. But now, I want to generate a "bad" one, and furthermore:
I want to generate a bad one for each data element, where a single data element is bad (so at minimum, at least 100 bad instances, testing that each invalid parameter is caught by validation logic).
I want to be able to override specific elements, for instance feeding in a bad ID or a bad "foobar." Whatever that is.
One pattern we can look to for inspiration is apply and copy, which allows us to easily compose new objects while specifying overridden values. For example:
val f = Foo("a", "b") // f: Foo = Foo(a,b)
val t = Foo.unapply(f) // t: Option[(String, String)] = Some((a,b))
Foo(t.get._1, "c") // res0: Foo = Foo(a,c)
Above we see the basic idea of creating a mutating object from the template of another object. This is more easily expressed in Scala as:
val f = someFoo copy(b = "c")
Using this as inspiration we can think about our objectives. A few things to think about:
First, we could define a map or a container of key/values for the data element and generated value. This could be used in place of a tuple to support named value mutation.
Given a container of key/value pairs, we could easily select one (or more) pairs at random and change a value. This supports the objective of generating a data set where one value is altered to create failure.
Given such a container, we can easily create a new object from the invalid collection of values (using either apply() or some other technique).
Alternatively, perhaps we can develop a pattern that uses a tuple and then just apply() it, kind of like the copy method, as long as we can still randomly alter one or more values.
We can probably explore developing a reusable pattern that does something like this:
def thingGen(invalidValueCount: Int): Gen[Thing] = ???
def someTest = forAll(thingGen) { v => invalidV = v.invalidate(1); validate(invalidV) must beFalse }
In the above code, we have a generator thingGen that returns (valid) Things. Then for all instances returned, we invoke a generic method invalidate(count: Int) which will randomly invalidate count values, returning an invalid object. We can then use that to ascertain whether our validation logic works correctly.
This would require defining an invalidate() function that, given a parameter (either by name, or by position) can then replace the identified parameter with a value that is known to be bad. This implies have an "anti-generator" for specific values, for instance, if an ID must be 3 characters, then it knows to create a string that is anything but 3 characters long.
Of course to invalidate a known, single parameter (to inject bad data into a test condition) we can simply use the copy method:
def thingGen(invalidValueCount: Int): Gen[Thing] = ???
def someTest = forAll(thingGen) { v => v2 = v copy(id = "xxx"); validate(v2) must beFalse }
That is the sum of my thinking to date. Am I barking up the wrong tree? Are there good patterns out there that handle this kind of testing? Any commentary or suggestions on how best to approach this problem of testing our validation logic?
We can combine a valid instance and an set of invalid fields (so that every field, if copied, would cause validation failure) to get an invalid object using shapeless library.
Shapeless allows you to represent your class as a list of key-value pairs that are still strongly typed and support some high-level operations, and converting back from this representation to your original class.
In example below I'll be providing an invalid instance for each single field provided
import shapeless._, record._
import shapeless.labelled.FieldType
import shapeless.ops.record.Updater
A detailed intro
Let's pretend we have a data class, and a valid instance of it (we only need one, so it can be hardcoded)
case class User(id: String, name: String, about: String, age: Int) {
def isValid = id.length == 3 && name.nonEmpty && age >= 0
}
val someValidUser = User("oo7", "Frank", "A good guy", 42)
assert(someValidUser.isValid)
We can then define a class to be used for invalid values:
case class BogusUserFields(name: String, id: String, age: Int)
val bogusData = BogusUserFields("", "1234", -5)
Instances of such classes can be provided using ScalaCheck. It's much easier to write a generator where all fields would cause failure. Order of fields doesn't matter, but their names and types do. Here we excluded about from User set of fields so we can do what you asked for (feeding only a subset of fields you want to test)
We then use LabelledGeneric[T] to convert User and BogusUserFields to their corresponding record value (and later we will convert User back)
val userLG = LabelledGeneric[User]
val bogusLG = LabelledGeneric[BogusUserFields]
val validUserRecord = userLG.to(someValidUser)
val bogusRecord = bogusLG.to(bogusData)
Records are lists of key-value pairs, so we can use head to get a single mapping, and the + operator supports adding / replacing field to another record. Let's pick every invalid field into our user one at a time. Also, here's the conversion back in action:
val invalidUser1 = userLG.from(validUserRecord + bogusRecord.head)// invalid name
val invalidUser2 = userLG.from(validUserRecord + bogusRecord.tail.head)// invalid ID
val invalidUser3 = userLG.from(validUserRecord + bogusRecord.tail.tail.head) // invalid age
assert(List(invalidUser1, invalidUser2, invalidUser3).forall(!_.isValid))
Since we basically are applying the same function (validUserRecord + _) to every key-value pair in our bogusRecord, we can also use map operator, except we use it with an unusual - polymorphic - function. We can also easily convert it to List, because every element will be of a same type now.
object polymerge extends Poly1 {
implicit def caseField[K, V](implicit upd: Updater[userLG.Repr, FieldType[K, V]]) =
at[FieldType[K, V]](upd(validUserRecord, _))
}
val allInvalidUsers = bogusRecord.map(polymerge).toList.map(userLG.from)
assert(allInvalidUsers == List(invalidUser1, invalidUser2, invalidUser3))
Generalizing and removing all the boilerplate
Now the whole point of this was that we can generalize it to work for any two arbitrary classes. The encoding of all relationships and operations is a bit cumbersome and it took me a while to get it right with all the implicit not found errors, so I'll skip the details.
class Picks[A, AR <: HList](defaults: A)(implicit lgA: LabelledGeneric.Aux[A, AR]) {
private val defaultsRec = lgA.to(defaults)
object mergeIntoTemplate extends Poly1 {
implicit def caseField[K, V](implicit upd: Updater[AR, FieldType[K, V]]) =
at[FieldType[K, V]](upd(defaultsRec, _))
}
def from[B, BR <: HList, MR <: HList, F <: Poly](options: B)
(implicit
optionsLG: LabelledGeneric.Aux[B, BR],
mapper: ops.hlist.Mapper.Aux[mergeIntoTemplate.type, BR, MR],
toList: ops.hlist.ToTraversable.Aux[MR, List, AR]
) = {
optionsLG.to(options).map(mergeIntoTemplate).toList.map(lgA.from)
}
}
So, here it is in action:
val cp = new Picks(someValidUser)
assert(cp.from(bogusData) == allInvalidUsers)
Unfortunately, you cannot write new Picks(someValidUser).from(bogusData) because implicit for mapper requires a stable identifier. On the other hand, cp instance can be reused with other types:
case class BogusName(name: String)
assert(cp.from(BogusName("")).head == someValidUser.copy(name = ""))
And now it works for all types! And bogus data is required to be any subset of class fields, so it will work even for class itself
case class Address(country: String, city: String, line_1: String, line_2: String) {
def isValid = Seq(country, city, line_1, line_2).forall(_.nonEmpty)
}
val acp = new Picks(Address("Test country", "Test city", "Test line 1", "Test line 2"))
val invalidAddresses = acp.from(Address("", "", "", ""))
assert(invalidAddresses.forall(!_.isValid))
You can see the code running at ScalaFiddle

Scala Fork-Join-All With Multiple Generic Types and 1 Generic Unit of Work

I'm attempting to write a method which accepts multiple generic types and takes as an argument a unit of work to execute.
The idea is that the unit of work is a common function that itself is generic. For the sake of example, let's say it's something like the following:
def loadModelRdd[T: TypeTag](sc: SparkContext): RDD[T] = {
...
}
loadModelRdd() will construct an RDD of the given type after some internal processing like loading the Model information, etc.
A prototype method I've been hacking on looks something like the following (non-working):
def forkAll[A : Manifest, B : Manifest](work: => RDD[_]): (RDD[A], RDD[B]) = {
def aFuture = Future { work } // How can I notify that this work call returns type A?
def bFuture = Future { work } // How can I notify that this work call returns type B?
val res = for {
a <- aFuture
b <- bFuture
} yield (a.asInstanceOf[A], b.asInstanceOf[B])
Await.result(res, 10.seconds)
}
This is a shortened version of the code I'm working on as I'm actually looking at accepting as many as 10 different types.
As you can see, the overall goal of the forkAll method is to wrap the unit of work in a Future, fork-join the execution of the unit of work for each type, then return the results as a Tuple'd result. An example consumer statement would be:
val (a, b) = forkAll[ClassA, ClassB](loadModelRdd)
i.e I want to fork-join at this point and wait for the results, but I want the executions to be executed in parallel and then collected back to the Driver (Spark Driver to be specific).
The problem is I'm not sure how to coerce the type returned by the unit of work within forkAll when constructing the Future {} blocks. Without the forkAll, the implementation looked like the following:
val resA = loadModelRdd[ClassA](sc)
val resB = loadModelRdd[ClassB](sc)
...
I am looking at doing this for two reasons:
To abstract the details of fork-join for any unit of work which matches this model.
A version of this code, which explicitly states what the unit of work is, is working in Production and was responsible for cutting execution of a long-running block by close to half. I have a couple of execution steps where this pattern could be applied
Is this something that is possible in Scala's type system? Or should I look at this problem from a different perspective? I've tried a couple of implementations (including one described here) but I haven't quite found one that fits my current view of the problem
Please let me know if there is any additional information needed.
Thanks!
Short answer: Scala does not allow functions with type parameters, so what you want is not exactly possible.
You are attempting to pass a method with a type parameter. Although methods are allowed to have type parameters, functions are not. When you try to pass a method, it acts like an anonymous function, so you must specify a type.
However, since methods do allow type parameters, you can take advantage of this by creating an abstract class that will do your fork/join
abstract class ForkJoin {
protected def work[T]: RDD[T]
def apply[A, B]: (RDD[A], RDD[B]) = {
// Write implementation of fork/join here
(work[A], work[B])
}
}
then overriding the type generic work method so that it does what you want, such as calling some other pre-defined method.
val forkJoin = new ForkJoin {
override protected def work[T]: RDD[T] =
loadModelRdd[T](sc)
}
val (intRdd, stringRdd) = forkJoin[Int, String]
Check out this for a prototype implementation that compiles and runs without issues.

How can I combine fluent interfaces with a functional style in Scala?

I've been reading about the OO 'fluent interface' approach in Java, JavaScript and Scala and I like the look of it, but have been struggling to see how to reconcile it with a more type-based/functional approach in Scala.
To give a very specific example of what I mean: I've written an API client which can be invoked like this:
val response = MyTargetApi.get("orders", 24)
The return value from get() is a Tuple3 type called RestfulResponse, as defined in my package object:
// 1. Return code
// 2. Response headers
// 2. Response body (Option)
type RestfulResponse = (Int, List[String], Option[String])
This works fine - and I don't really want to sacrifice the functional simplicity of a tuple return value - but I would like to extend the library with various 'fluent' method calls, perhaps something like this:
val response = MyTargetApi.get("customers", 55).throwIfError()
// Or perhaps:
MyTargetApi.get("orders", 24).debugPrint(verbose=true)
How can I combine the functional simplicity of get() returning a typed tuple (or similar) with the ability to add more 'fluent' capabilities to my API?
It seems you are dealing with a client side API of a rest style communication. Your get method seems to be what triggers the actual request/response cycle. It looks like you'd have to deal with this:
properties of the transport (like credentials, debug level, error handling)
providing data for the input (your id and type of record (order or customer)
doing something with the results
I think for the properties of the transport, you can put some of it into the constructor of the MyTargetApi object, but you can also create a query object that will store those for a single query and can be set in a fluent way using a query() method:
MyTargetApi.query().debugPrint(verbose=true).throwIfError()
This would return some stateful Query object that stores the value for log level, error handling. For providing the data for the input, you can also use the query object to set those values but instead of returning your response return a QueryResult:
class Query {
def debugPrint(verbose: Boolean): this.type = { _verbose = verbose; this }
def throwIfError(): this.type = { ... }
def get(tpe: String, id: Int): QueryResult[RestfulResponse] =
new QueryResult[RestfulResponse] {
def run(): RestfulResponse = // code to make rest call goes here
}
}
trait QueryResult[A] { self =>
def map[B](f: (A) => B): QueryResult[B] = new QueryResult[B] {
def run(): B = f(self.run())
}
def flatMap[B](f: (A) => QueryResult[B]) = new QueryResult[B] {
def run(): B = f(self.run()).run()
}
def run(): A
}
Then to eventually get the results you call run. So at the end of the day you can call it like this:
MyTargetApi.query()
.debugPrint(verbose=true)
.throwIfError()
.get("customers", 22)
.map(resp => resp._3.map(_.length)) // body
.run()
Which should be a verbose request that will error out on issue, retrieve the customers with id 22, keep the body and get its length as an Option[Int].
The idea is that you can use map to define computations on a result you do not yet have. If we add flatMap to it, then you could also combine two computations from two different queries.
To be honest, I think it sounds like you need to feel your way around a little more because the example is not obviously functional, nor particularly fluent. It seems you might be mixing up fluency with not-idempotent in the sense that your debugPrint method is presumably performing I/O and the throwIfError is throwing exceptions. Is that what you mean?
If you are referring to whether a stateful builder is functional, the answer is "not in the purest sense". However, note that a builder does not have to be stateful.
case class Person(name: String, age: Int)
Firstly; this can be created using named parameters:
Person(name="Oxbow", age=36)
Or, a stateless builder:
object Person {
def withName(name: String)
= new { def andAge(age: Int) = new Person(name, age) }
}
Hey presto:
scala> Person withName "Oxbow" andAge 36
As to your use of untyped strings to define the query you are making; this is poor form in a statically-typed language. What is more, there is no need:
sealed trait Query
case object orders extends Query
def get(query: Query): Result
Hey presto:
api get orders
Although, I think this is a bad idea - you shouldn't have a single method which can give you back notionally completely different types of results
To conclude: I personally think there is no reason whatsoever that fluency and functional cannot mix, since functional just indicates the lack of mutable state and the strong preference for idempotent functions to perform your logic in.
Here's one for you:
args.map(_.toInt)
args map toInt
I would argue that the second is more fluent. It's possible if you define:
val toInt = (_ : String).toInt
That is; if you define a function. I find functions and fluency mix very well in Scala.
You could try having get() return a wrapper object that might look something like this
type RestfulResponse = (Int, List[String], Option[String])
class ResponseWrapper(private rr: RestfulResponse /* and maybe some flags as additional arguments, or something? */) {
def get : RestfulResponse = rr
def throwIfError : RestfulResponse = {
// Throw your exception if you detect an error
rr // And return the response if you didn't detect an error
}
def debugPrint(verbose: Boolean, /* whatever other parameters you had in mind */) {
// All of your debugging printing logic
}
// Any and all other methods that you want this API response to be able to execute
}
Basically, this allows you to put your response into a contain that has all of these nice methods that you want, and, if you simply want to get the wrapped response, you can just call the wrapper's get() method.
Of course, the downside of this is that you will need to change your API a bit, if that's worrisome to you at all. Well... you could probably avoid needing to change your API, actually, if you, instead, created an implicit conversion from RestfulResponse to ResponseWrapper and vice versa. That's something worth considering.