Looking for some guidance on how to code a writer for a given "AST" (DynamoDB) - scala

As a personal project, I am writing yet another Scala library for DynamoDB. It contains many interesting aspects, such as reading and writing from an AST (just as with JSON), handling HTTP requests, streaming data…
In order to communicate with DynamoDB, one needs to be able to read from and write to the DynamoDB format (the “AST”). I extracted this reading and writing into a minimalist library: dynamo-ast. It contains two main type classes: DynamoReads[_] and DynamoWrites[_] (deeply inspired by Play Json).
I successfully coded the reading part of the library, ending with very simple code such as:
trait DynamoRead[A] { self =>
def read(dynamoType: DynamoType): DynamoReadResult[A]
}
case class TinyImage(url: String, alt: String)
val dynamoReads: DynamoReads[TinyImage] = {
for {
url <- read[String].at("url")
alt <- read[String].at("alt")
} yield (url, alt) map (TinyImage.apply _).tupled
}
dynamoReads.reads(dynamoAst) //yield DynamoReadResult[TinyImage]
At that point, I thought I wrote the most complicated part of the library and the DynamoWrite[_] part would be a piece of cake. I am however stuck on writing the DynamoWrite part. I was a fool.
My goal is to provide a very similar “user experience” with the DynamoWrite[_] and keep it as simple as possible such as :
val dynamoWrites: DynamoWrites[TinyImage] = {
for {
url <- write[String].at("url")
alt <- write[String].at("alt")
} yield (url, alt) map (TinyImage.unapply _) //I am not sure what to yield here nor how to code it
}
dynamoWrites.write(TinyImage("http://fake.url", "The alt desc")) //yield DynamoWriteResult[DynamoType]
Since this library is deeply inspired by the Play Json library (because I like its simplicity), I had a look at the sources several times. I kind of dislike the way the writer part is coded because, to me, it adds a lot of overhead (basically, each time a field is written, a new JsObject is created with one field, and the resulting JsObject for a complete class is the merge of all the JsObjects containing one field).
From my understanding, the DynamoReads part can be written with only one trait (DynamoRead[_]). The DynamoWrites part however requires at least two such as :
trait DynamoWrites[A] {
def write(a: A): DynamoWriteResult[DynamoType]
}
trait DynamoWritesPath[A] {
def write(path:String, a: A): DynamoWriteResult[(String, DynamoType)]
}
The DynamoWrites[_] is to write plain String, Int… and the DynamoWritesPath[_] is to write a tuple of (String, WhateverTypeHere) (to simulate a “field”).
So writing write[String].at(“url”) would yield a DynamoWritesPath[String]. Now I have several issues :
I have no clue how to write flatMap for my DynamoWritesPath[_].
I don't know what a for-comprehension should yield in order to obtain a DynamoWrites[TinyImage].
What I wrote so far (totally fuzzy and not compiling at all, looking for some help on this). Not committed at the moment (gist): https://gist.github.com/louis-forite/cad97cc0a47847b2e4177192d9dbc3ae
To sum up, I am looking for some guidance on how to write the DynamoWrites[_] part. My goal is to provide the client with the most straightforward way to code a DynamoWrites[_] for a given type. My non-goal is to write the perfect library and keep it a zero dependency library.
Link to the library: https://github.com/louis-forite/dynamo-ast

A Reads is a covariant functor. That means it has map. It can also be seen as a Monad which means it has flatMap (although a monad is overkill unless you need the previous field in order to know how to process the next):
trait Reads[A] {
def map [B] (f: A => B): Reads[B]
def flatMap [B](f: A => Reads[B]): Reads[B] // not necessary, but available
}
The reason is that to transform a Reads[Int] into a Reads[String], you first need to read the Int and then apply the Int => String function.
But a Writes is a contravariant functor. It has contramap where the direction of the types is reversed:
trait Writes[A] {
def contramap [B](f: B => A): Writes[B]
}
The type on the function is reversed because to transform a Writes[Int] to a Writes[String] you must receive the String from the caller, apply the transformation String => Int and then write the Int.
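A minimal, dependency-free sketch of that reversal. The Writes trait here is hypothetical and renders to a plain String rather than a DynamoType; the names intWrites and stringLengthWrites are illustrative and not taken from dynamo-ast or Play Json:

```scala
object ContramapSketch {
  // Hypothetical writer: turns an A into a String (a stand-in for a real AST node).
  trait Writes[A] { self =>
    def write(a: A): String

    // contramap goes "backwards": to write a B, first convert it to an A.
    def contramap[B](f: B => A): Writes[B] = new Writes[B] {
      def write(b: B): String = self.write(f(b))
    }
  }

  val intWrites: Writes[Int] = new Writes[Int] {
    def write(a: Int): String = s"N($a)"
  }

  // A Writes[String] built from a Writes[Int] and a String => Int function.
  val stringLengthWrites: Writes[String] = intWrites.contramap[String](_.length)
}
```

Calling ContramapSketch.stringLengthWrites.write("abc") first applies the String => Int function and then delegates to the Int writer.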
I don't think it makes sense to provide for-comprehension syntax (flatMap) for the Writes API.
// here it is clear that you're extracting a string value
url <- read[String].at("url")
// but what does this mean for the write method?
url <- write[String].at("url")
// what is `url`?
That's probably why play doesn't provide one either, and why they focus on their combinator syntax (using the and function, their version of applicative functor builder?).
For reference: http://blog.tmorris.net/posts/functors-and-things-using-scala/index.html
You can achieve a more consistent API by using something like the and method in play json:
(write[String]("url") and write[String]("alt"))(unlift(TinyImage.unapply))
(read[String]("url") and read[String]("alt"))(TinyImage.apply)
// unfortunately, the type ascription is necessary in this case
(write[String]("url") and write[String]("alt")) {(x: TinyImage) =>
(x.url, x.alt)
}
// transforming
val instantDynamoType: DynamoFormat[Instant] =
format[String].xmap(Instant.parse _)((_: Instant).toString)
You can still use for-comprehension for the reads, although it's a bit over-powered (sort of implies that fields must be processed in-sequence, while that's not technically necessary).
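To make the and-style writer a bit more concrete, here is a rough sketch of one way it could be implemented. Everything in it is an assumption for illustration: the simplified DynamoType AST (S and M), the FieldWrites trait, and the from method are hypothetical and are not the actual dynamo-ast or Play Json API. It collects the fields in a list and builds a single map node at the end, rather than merging one-field objects:

```scala
object FieldWritesSketch {
  // Hypothetical, simplified AST; the real DynamoType in dynamo-ast may differ.
  sealed trait DynamoType
  final case class S(value: String) extends DynamoType
  final case class M(fields: List[(String, DynamoType)]) extends DynamoType

  trait DynamoWrites[A] { def write(a: A): DynamoType }

  // A writer for one or more named fields of an object.
  trait FieldWrites[A] { self =>
    def fields(a: A): List[(String, DynamoType)]

    // Combine two field writers into a writer for a pair.
    def and[B](other: FieldWrites[B]): FieldWrites[(A, B)] = new FieldWrites[(A, B)] {
      def fields(ab: (A, B)): List[(String, DynamoType)] =
        self.fields(ab._1) ++ other.fields(ab._2)
    }

    // contramap, plus wrapping the collected fields in a single map node.
    def from[C](f: C => A): DynamoWrites[C] = new DynamoWrites[C] {
      def write(c: C): DynamoType = M(self.fields(f(c)))
    }
  }

  implicit val stringWrites: DynamoWrites[String] = new DynamoWrites[String] {
    def write(a: String): DynamoType = S(a)
  }

  // Writer for a single named field, using the implicit writer for its value.
  def write[A](name: String)(implicit w: DynamoWrites[A]): FieldWrites[A] = new FieldWrites[A] {
    def fields(a: A): List[(String, DynamoType)] = List(name -> w.write(a))
  }

  final case class TinyImage(url: String, alt: String)

  val tinyImageWrites: DynamoWrites[TinyImage] =
    write[String]("url").and(write[String]("alt")).from((i: TinyImage) => (i.url, i.alt))
}
```

This version only handles pairs; supporting more fields would need either nested tuples or an arity-specific builder, which is roughly what Play's functional builder provides.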

Related

Using scala-cats IO type to encapsulate a mutable Java library

I understand that, generally speaking, there is a lot to say about deciding what one wants to model as an effect. This discussion is introduced in Functional Programming in Scala, in the chapter on IO.
Nonetheless, I have not finished the chapter; I was just browsing it end to end before tackling it together with Cats IO.
In the meantime, I have a bit of a situation with some code I need to deliver soon at work.
It relies on a Java library that is all about mutation. That library was started a long time ago, and for legacy reasons I don't see them changing it.
Anyway, long story short: is modeling every mutating function as IO a viable way to encapsulate a mutating Java library?
Edit 1 (adding a snippet, as requested)
Reading into a model mutates the model rather than creating a new one. I would contrast Jena with Gremlin, for instance, which is a functional library over graph data.
def loadModel(paths: String*): Model =
paths.foldLeft(ModelFactory.createOntologyModel(new OntModelSpec(OntModelSpec.OWL_MEM)).asInstanceOf[Model]) {
case (model, path) ⇒
val input = getClass.getClassLoader.getResourceAsStream(path)
val lang = RDFLanguages.filenameToLang(path).getName
model.read(input, "", lang)
}
That was my Scala code, but the Java API as documented on the website looks like this.
// create the resource
Resource r = model.createResource();
// add the property
r.addProperty(RDFS.label, model.createLiteral("chat", "en"))
.addProperty(RDFS.label, model.createLiteral("chat", "fr"))
.addProperty(RDFS.label, model.createLiteral("<em>chat</em>", true));
// write out the Model
model.write(System.out);
// create a bag
Bag smiths = model.createBag();
// select all the resources with a VCARD.FN property
// whose value ends with "Smith"
StmtIterator iter = model.listStatements(
new SimpleSelector(null, VCARD.FN, (RDFNode) null) {
public boolean selects(Statement s) {
return s.getString().endsWith("Smith");
}
});
// add the Smith's to the bag
while (iter.hasNext()) {
smiths.add(iter.nextStatement().getSubject());
}
So, there are three solutions to this problem.
1. Simple and dirty
If all usage of the impure API is contained in a single, small part of the code base, you may just "cheat" and do something like:
def useBadJavaAPI(args): IO[Foo] = IO {
// Everything inside this block can be imperative and mutable.
}
I said "cheat" because the idea of IO is composition, and a big IO chunk is not really composition. But, sometimes you only want to encapsulate that legacy part and do not care about it.
2. Towards composition.
Basically, the same as above but dropping some flatMaps in the middle:
// Instead of:
def useBadJavaAPI(args): IO[Foo] = IO {
val a = createMutableThing()
a.add(args)
val b = a.bar()
b.computeFoo()
}
// You do something like this:
def useBadJavaAPI(args): IO[Foo] =
for {
a <- IO(createMutableThing())
_ <- IO(a.add(args))
b <- IO(a.bar())
result <- IO(b.computeFoo())
} yield result
There are a couple of reasons for doing this:
Because the imperative / mutable API is not contained in a single method / class but in a couple of them. And the encapsulation of small steps in IO is helping you to reason about it.
Because you want to slowly migrate the code to something better.
Because you want to feel better with yourself :p
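For a runnable illustration of the second approach without pulling in cats-effect, here is a sketch with a toy effect type. ToyIO is a hypothetical stand-in, not the cats-effect IO API, and java.util.ArrayList plays the role of the mutable Java library:

```scala
object ToyIOSketch {
  // Toy stand-in for cats.effect.IO: a description of a computation that runs only on demand.
  final class ToyIO[A](val unsafeRun: () => A) {
    def map[B](f: A => B): ToyIO[B] = new ToyIO(() => f(unsafeRun()))
    def flatMap[B](f: A => ToyIO[B]): ToyIO[B] = new ToyIO(() => f(unsafeRun()).unsafeRun())
  }
  object ToyIO {
    def apply[A](body: => A): ToyIO[A] = new ToyIO(() => body)
  }

  // Each mutating call is wrapped in its own small step.
  def collect(items: List[String]): ToyIO[Int] =
    for {
      buffer <- ToyIO(new java.util.ArrayList[String]())
      _      <- ToyIO(items.foreach(i => buffer.add(i)))
      size   <- ToyIO(buffer.size())
    } yield size
}
```

Nothing is mutated until unsafeRun() is called, and each run builds a fresh buffer, so the mutable state never escapes the description.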
3. Wrap it in a pure interface
This is basically what many third-party libraries (e.g. Doobie, fs2-blobstore, neotypes) do: wrapping a Java library in a pure interface.
Note that the amount of work required is far greater than for the previous two solutions. It is therefore worth it if the mutable API is "infecting" many places in your codebase or, worse, multiple projects; in that case it makes sense to do this and publish it as an independent module.
(It may also be worth publishing that module as an open-source library; you may end up helping other people and receiving help from them as well.)
Since this is a bigger task, it is not easy to provide a complete answer covering everything you would have to do. It may help to see how those libraries are implemented and to ask more questions either here or in the Gitter channels.
But I can give you a quick snippet of what it would look like:
// First define a pure interface of the operations you want to provide
trait PureModel[F[_]] { // You may forget about the abstract F and just use IO instead.
def op1: F[Int]
def op2(data: List[String]): F[Unit]
}
// Then in the companion object you define factories.
object PureModel {
// If the underlying java object has a close or release action,
// use a Resource[F, PureModel[F]] instead.
def apply[F[_]](args)(implicit F: Sync[F]): F[PureModel[F]] = ???
}
Now, how to create the implementation is the tricky part.
Maybe you can use something like Sync to initialize the mutable state.
def apply[F[_]](args)(implicit F: Sync[F]): F[PureModel[F]] =
F.delay(createMutableState()).map { mutableThing =>
new PureModel[F] {
override def op1: F[Int] = F.delay(mutableThing.foo())
override def op2(data: List[String]): F[Unit] = F.delay(mutableThing.bar(data))
}
}

Compose optional queries for for-comprehension in doobie?

I would like to run several queries in one transaction using a for-comprehension in doobie. Something like:
def addImage(path:String) : ConnectionIO[Image] = {
sql"INSERT INTO images(path) VALUES($path)".update.withUniqueGeneratedKeys("id", "path")
}
def addUser(username: String, imageId: Optional[Int]) : ConnectionIO[User] = {
sql"INSERT INTO users(username, image_id) VALUES($username, $imageId)".update.withUniqueGeneratedKeys("id", "username", "image_id")
}
def createUser(username: String, imagePath: Optional[String]) : Future[User] = {
val composedIO : ConnectionIO[User] = for {
optImage <- imagePath.map { p => addImage(p) }
user <- addUser(username, optImage.map(_.id))
} yield user
composedIO.transact(xa).unsafeToFuture
}
I just started with doobie (and cats), so I'm not that familiar with free monads. I've been trying different solutions, but for the for-comprehension to work it looks like both blocks need to return a cats.free.Free[doobie.free.connection.ConnectionOp,?].
If this is true, is there a way to transform my ConnectionIO[Image] (from the addImage call) into a cats.free.Free[doobie.free.connection.ConnectionOp,Option[Image]] ?
For your direct question, ConnectionIO is defined as type ConnectionIO[A] = Free[ConnectionOp, A], i.e. the two types are equivalent (no transformation required).
Your issue is different, and it is easy to see if we walk through the code step by step. For simplicity, I will use Option where you used Optional.
imagePath.map { p => addImage(p) }:
imagePath is an Option, and map uses an A => B to convert Option[A] to Option[B].
Since addImage returns a ConnectionIO[Image], we now have an Option[ConnectionIO[Image]], i.e. this is an Option program, not a ConnectionIO program.
We can instead return a ConnectionIO[Option[Image]] by replacing map with traverse, which uses the Traverse typeclass, see https://typelevel.org/cats/typeclasses/traverse.html for some details on how this works. But a basic intuition is that where map would have given you an F[G[B]], traverse instead gives you a G[F[B]]. In a sense, it works similarly to Future.traverse from the standard library, but in a more general way.
addUser(username, optImage.map(_.id))
The issue here is that given optImage which is an Option[Image], and its id field, which is an Option[Int], the result of optImage.map(_.id) is an Option[Option[Int]], not the Option[Int] which your method expects.
One way of solving this (if it matches your requirements), is to change this part of code to
addUser(username, optImage.flatMap(_.id))
flatMap can "join" an Option with another created by its value (if it exists).
(note: you need to add import cats.implicits._ to get the syntax for traverse).
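The difference between map and traverse can be sketched with the standard library alone. In this sketch Future stands in for ConnectionIO, traverseOption is a hand-written special case of what cats provides generically, and addImage is a hypothetical stub that pretends the database assigned id 1:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object TraverseSketch {
  // Hand-written traverse for Option and Future; cats provides this generically.
  def traverseOption[A, B](oa: Option[A])(f: A => Future[B]): Future[Option[B]] =
    oa match {
      case Some(a) => f(a).map(b => Some(b))
      case None    => Future.successful(None)
    }

  // Hypothetical stand-in for addImage: pretend the database assigned id 1.
  def addImage(path: String): Future[(Int, String)] = Future.successful((1, path))

  // map nests the effect inside the Option...
  val mapped: Option[Future[(Int, String)]] = Option("cat.png").map(addImage)
  // ...while traverse flips the layers, which is what a for-comprehension over the effect needs.
  val traversed: Future[Option[(Int, String)]] = traverseOption(Option("cat.png"))(addImage)

  def await[A](fa: Future[A]): A = Await.result(fa, 5.seconds)
}
```

With an empty Option, traverseOption runs no effect at all and simply yields None inside the Future.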
In general, some of the ideas here about Traverse, flatMap, etc., are useful to study, and two books for doing so are "Scala With Cats" (https://underscore.io/books/scala-with-cats/) and "Functional Programming in Scala" (https://www.manning.com/books/functional-programming-in-scala)
The author of doobie also recently gave a talk about "effects", which may be of use in improving your intuition about types like Option, IO, etc.: https://www.youtube.com/watch?v=po3wmq4S15A
If I got your intention right, you should use traverse instead of map:
val composedIO : ConnectionIO[User] = for {
optImage <- imagePath.traverse { p => addImage(p) }
user <- addUser(username, optImage.map(_.id))
} yield user
You might need to import cats.instances.option._ and/or cats.syntax.traverse._

Combining Future with Kleisli and Either on Finch endpoint

I’m hacking a bit with Finch and Cats. I ended up with an issue where my Service returns a Reader of Repository and Either, as Reader[Repository, Either[List[String], Entity]].
The problem is: I need to transform the Either’s Right value into a Finch Output in an FP way. Using a for-expression won’t work because it will evaluate to a new Reader monad.
I saw a few implementations using fold as a solution, like either.fold[Output[Entity]](NotFound)(Ok), but I am not sure if it's a valid path for me with this Reader between my Either and fold.
Finch’s Endpoint is a Future, so I wonder whether, if I encapsulate my Reader monad within a Future, I could transform the eventual evaluation of the Either’s Right into a Finch Output.
Here's what I got now:
object ItemAction {
def routes: Endpoint[String] = post("todo" :: "items" :: jsonBody[Item]) { create _ }
def create(i: Item): Output[Item] = ???
}
object ItemService {
def create(item: Item): Reader[ItemRepository, Either[String, Item]] = Reader { (repository: ItemRepository) =>
repository.create(item)
}
}
So, my idea is to transform the output of ItemService#create into Output[Item] in ItemAction#create. Output[Item] is a Future, so a signature like Future[Reader[?]] could fit into ItemAction, but I'm not sure if it's possible or recommended.
Any ideas on this matter?

Is there a universal method to create a tail recursive function in Scala?

While checking Intel's BigDL repo, I stumbled upon this method:
private def recursiveListFiles(f: java.io.File, r: Regex): Array[File] = {
val these = f.listFiles()
val good = these.filter(f => r.findFirstIn(f.getName).isDefined)
good ++ these.filter(_.isDirectory).flatMap(recursiveListFiles(_, r))
}
I noticed that it was not tail recursive and decided to write a tail recursive version:
private def recursiveListFiles(f: File, r: Regex): Array[File] = {
@scala.annotation.tailrec
def recursiveListFiles0(f: Array[File], r: Regex, a: Array[File]): Array[File] = {
f match {
case Array() => a
case htail => {
val these = htail.head.listFiles()
val good = these.filter(f => r.findFirstIn(f.getName).isDefined)
recursiveListFiles0(these.filter(_.isDirectory)++htail.tail, r, a ++ good)
}
}
}
recursiveListFiles0(Array[File](f), r, Array.empty[File])
}
What made this difficult compared to what I am used to is the concept that a File can be transformed into an Array[File] which adds another level of depth.
What is the theory behind recursion on datatypes that have the following member?
def listTs[T]: T => Traversable[T]
Short answer
If you generalize the idea and think of it as a monad (a polymorphic thing that works for arbitrary type parameters), then you won't be able to write a tail-recursive implementation.
Trampolines try to solve this very problem by providing a way to evaluate a recursive computation without overflowing the stack. The general idea is to create a stream of pairs of (result, computation). So at each step you'll have to return the computed result up to that point and a function to create the next result (aka thunk).
From Rich Dougherty’s blog:
A trampoline is a loop that repeatedly runs functions. Each function,
called a thunk, returns the next function for the loop to run. The
trampoline never runs more than one thunk at a time, so if you break
up your program into small enough thunks and bounce each one off the
trampoline, then you can be sure the stack won't grow too big.
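The Scala standard library ships a small trampoline in scala.util.control.TailCalls. As a minimal sketch of the idea (isEven and isOdd are illustrative and unrelated to the file-listing code above), two mutually recursive functions that @tailrec could not optimize run in constant stack space:

```scala
import scala.util.control.TailCalls._

object TrampolineSketch {
  // Each step returns either a finished value (done) or a thunk for the next step (tailcall).
  def isEven(n: Int): TailRec[Boolean] =
    if (n == 0) done(true) else tailcall(isOdd(n - 1))

  def isOdd(n: Int): TailRec[Boolean] =
    if (n == 0) done(false) else tailcall(isEven(n - 1))
}
```

Calling TrampolineSketch.isEven(100000).result runs the loop that bounces the thunks, so the depth of the logical recursion does not translate into stack depth.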
More + References
In the categorical sense, the theory behind such data types is closely related to Cofree Monads and fold and unfold functions, and in general to Fixed point types.
See this fantastic talk: Fun and Games with Fix Cofree and Doobie by Rob Norris which discusses a use case very similar to your question.
This article about Free monads and Trampolines is also related to your first question: Stackless Scala With Free Monads.
See also this part of the Matryoshka docs. Matryoshka is a Scala library implementing monads around the concept of FixedPoint types.

Are there any types with side-effecting methods that return the original type?

Often I find myself wanting to chain a side-effecting function to the end of another method call in a more functional-looking way, but I don't want to transform the original type to Unit. Suppose I have a read method that searches a database for a record, returning Option[Record].
def read(id: Long): Option[Record] = ...
If read returns Some(record), then I might want to cache that value and move on. I could do something like this:
read(id).map { record =>
// Cache the record
record
}
But, I would like to avoid the above code and end up with something more like this to make it more clear as to what's happening:
read(id).withSideEffect { record =>
// Cache the record
}
Where withSideEffect returns the same value as read(id). After searching high and low, I can't find any method on any type that does something like this. The closest solution I can come up with is using implicit magic:
implicit class ExtendedOption[A](underlying: Option[A]) {
def withSideEffect(op: A => Unit): Option[A] = {
underlying.foreach(op)
underlying
}
}
Are there any Scala types I may have overlooked with methods like this one? And are there any potential design flaws in using such a method?
Future.andThen (scaladoc) takes a side-effect and returns a future of the current value to facilitate fluent chaining.
The return type is not this.type.
See also duplicate questions about tap.
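In Scala 2.13 and later, tap is available in the standard library through scala.util.chaining. A small sketch of the question's use case follows; Record, the in-memory cache, and the stubbed read method are hypothetical stand-ins:

```scala
import scala.util.chaining._
import scala.collection.mutable

object TapSketch {
  final case class Record(id: Long, name: String)

  // Hypothetical in-memory cache and lookup, standing in for the question's read method.
  val cache: mutable.Map[Long, Record] = mutable.Map.empty

  def read(id: Long): Option[Record] =
    if (id == 1L) Some(Record(1L, "first")) else None

  // tap runs the side effect and returns the original Option unchanged.
  def readAndCache(id: Long): Option[Record] =
    read(id).tap(_.foreach(r => cache.update(r.id, r)))
}
```

Because tap is defined for any value, it receives the whole Option here, hence the inner foreach to reach the record.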
You can use scalaz for "explicit annotation" of side-effecting functions. In scalaz 7.0.6 it's the IO monad: http://eed3si9n.com/learning-scalaz/IO+Monad.html
It's deprecated in scalaz 7.1. I would do something like this with Task:
val readAndCache = Task.delay(read(id)).map { record => cacheRecord(record); record }
readAndCache.run // Run the task for its side effects