Using scala-cats IO type to encapsulate a mutable Java library - scala

I understand that generally speaking there is a lot to say about deciding what one wants to model as effect This discussion is introduce in Functional programming in Scala on the chapter on IO.
Nonethless, I have not finished the chapter, i was just browsing it end to end before takling it together with Cats IO.
In the mean time, I have a bit of a situation for some code I need to deliver soon at work.
It relies on a Java Library that is just all about mutation. That library was started a long time ago and for legacy reason i don't see them changing.
Anyway, long story short. Is actually modeling any mutating function as IO a viable way to encapsulate a mutating java library ?
Edit1 (at request I add a snippet)
Readying into a model, mutate the model rather than creating a new one. I would contrast jena to gremlin for instance, a functional library over graph data.
def loadModel(paths: String*): Model =
paths.foldLeft(ModelFactory.createOntologyModel(new OntModelSpec(OntModelSpec.OWL_MEM)).asInstanceOf[Model]) {
case (model, path) ⇒
val input = getClass.getClassLoader.getResourceAsStream(path)
val lang = RDFLanguages.filenameToLang(path).getName
model.read(input, "", lang)
}
That was my scala code, but the java api as documented in the website look like this.
// create the resource
Resource r = model.createResource();
// add the property
r.addProperty(RDFS.label, model.createLiteral("chat", "en"))
.addProperty(RDFS.label, model.createLiteral("chat", "fr"))
.addProperty(RDFS.label, model.createLiteral("<em>chat</em>", true));
// write out the Model
model.write(system.out);
// create a bag
Bag smiths = model.createBag();
// select all the resources with a VCARD.FN property
// whose value ends with "Smith"
StmtIterator iter = model.listStatements(
new SimpleSelector(null, VCARD.FN, (RDFNode) null) {
public boolean selects(Statement s) {
return s.getString().endsWith("Smith");
}
});
// add the Smith's to the bag
while (iter.hasNext()) {
smiths.add(iter.nextStatement().getSubject());
}

So, there are three solutions to this problem.
1. Simple and dirty
If all the usage of the impure API is contained in single / small part of the code base, you may just "cheat" and do something like:
def useBadJavaAPI(args): IO[Foo] = IO {
// Everything inside this block can be imperative and mutable.
}
I said "cheat" because the idea of IO is composition, and a big IO chunk is not really composition. But, sometimes you only want to encapsulate that legacy part and do not care about it.
2. Towards composition.
Basically, the same as above but dropping some flatMaps in the middle:
// Instead of:
def useBadJavaAPI(args): IO[Foo] = IO {
val a = createMutableThing()
mutableThing.add(args)
val b = a.bar()
b.computeFoo()
}
// You do something like this:
def useBadJavaAPI(args): IO[Foo] =
for {
a <- IO(createMutableThing())
_ <- IO(mutableThing.add(args))
b <- IO(a.bar())
result <- IO(b.computeFoo())
} yield result
There are a couple of reasons for doing this:
Because the imperative / mutable API is not contained in a single method / class but in a couple of them. And the encapsulation of small steps in IO is helping you to reason about it.
Because you want to slowly migrate the code to something better.
Because you want to feel better with yourself :p
3. Wrap it in a pure interface
This is basically the same that many third party libraries (e.g. Doobie, fs2-blobstore, neotypes) do. Wrapping a Java library on a pure interface.
Note that as such, the amount of work that has to be done is way more than the previous two solutions. As such, this is worth it if the mutable API is "infecting" many places of your codebase, or worse in multiple projects; if so then it makes sense to do this and publish is as an independent module.
(it may also be worth to publish that module as an open-source library, you may end up helping other people and receive help from other people as well)
Since this is a bigger task is not easy to just provide a complete answer of all you would have to do, it may help to see how those libraries are implemented and ask more questions either here or in the gitter channels.
But, I can give you a quick snippet of how it would look like:
// First define a pure interface of the operations you want to provide
trait PureModel[F[_]] { // You may forget about the abstract F and just use IO instead.
def op1: F[Int]
def op2(data: List[String]): F[Unit]
}
// Then in the companion object you define factories.
object PureModel {
// If the underlying java object has a close or release action,
// use a Resource[F, PureModel[F]] instead.
def apply[F[_]](args)(implicit F: Sync[F]): F[PureModel[F]] = ???
}
Now, how to create the implementation is the tricky part.
Maybe you can use something like Sync to initialize the mutable state.
def apply[F[_]](args)(implicit F: Sync[F]): F[PureModel[F]] =
F.delay(createMutableState()).map { mutableThing =>
new PureModel[F] {
override def op1: F[Int] = F.delay(mutableThing.foo())
override def op2(data: List[String]): F[Unit] = F.delay(mutableThing.bar(data))
}
}

Related

How to design abstract classes if methods don't have the exact same signature?

This is a "real life" OO design question. I am working with Scala, and interested in specific Scala solutions, but I'm definitely open to hear generic thoughts.
I am implementing a branch-and-bound combinatorial optimization program. The algorithm itself is pretty easy to implement. For each different problem we just need to implement a class that contains information about what are the allowed neighbor states for the search, how to calculate the cost, and then potentially what is the lower bound, etc...
I also want to be able to experiment with different data structures. For instance, one way to store a logic formula is using a simple list of lists of integers. This represents a set of clauses, each integer a literal. We can have a much better performance though if we do something like a "two-literal watch list", and store some extra information about the formula in general.
That all would mean something like this
object BnBSolver[S<:BnBState]{
def solve(states: Seq[S], best_state:Option[S]): Option[S] = if (states.isEmpty) best_state else
val next_state = states.head
/* compare to best state, etc... */
val new_states = new_branches ++ states.tail
solve(new_states, new_best_state)
}
class BnBState[F<:Formula](clauses:F, assigned_variables) {
def cost: Int
def branches: Seq[BnBState] = {
val ll = clauses.pick_variable
List(
BnBState(clauses.assign(ll), ll :: assigned_variables),
BnBState(clauses.assign(-ll), -ll :: assigned_variables)
)
}
}
case class Formula[F<:Formula[F]](clauses:List[List[Int]]) {
def assign(ll: Int) :F =
Formula(clauses.filterNot(_ contains ll)
.map(_.filterNot(_==-ll))))
}
Hopefully this is not too crazy, wrong or confusing. The whole issue here is that this assign method from a formula would usually take just the current literal that is going to be assigned. In the case of two-literal watch lists, though, you are doing some lazy thing that requires you to know later what literals have been previously assigned.
One way to fix this is you just keep this list of previously assigned literals in the data structure, maybe as a private thing. Make it a self-standing lazy data structure. But this list of the previous assignments is actually something that may be naturally available by whoever is using the Formula class. So it makes sense to allow whoever is using it to just provide the list every time you assign, if necessary.
The problem here is that we cannot now have an abstract Formula class that just declares a assign(ll:Int):Formula. In the normal case this is OK, but if this is a two-literal watch list Formula, it is actually an assign(literal: Int, previous_assignments: Seq[Int]).
From the point of view of the classes using it, it is kind of OK. But then how do we write generic code that can take all these different versions of Formula? Because of the drastic signature change, it cannot simply be an abstract method. We could maybe force the user to always provide the full assigned variables, but then this is a kind of a lie too. What to do?
The idea is the watch list class just becomes a kind of regular assign(Int) class if I write down some kind of adapter method that knows where to take the previous assignments from... I am thinking maybe with implicit we can cook something up.
I'll try to make my answer a bit general, since I'm not convinced I'm completely following what you are trying to do. Anyway...
Generally, the first thought should be to accept a common super-class as a parameter. Obviously that won't work with Int and Seq[Int].
You could just have two methods; have one call the other. For instance just wrap an Int into a Seq[Int] with one element and pass that to the other method.
You can also wrap the parameter in some custom class, e.g.
class Assignment {
...
}
def int2Assignment(n: Int): Assignment = ...
def seq2Assignment(s: Seq[Int]): Assignment = ...
case class Formula[F<:Formula[F]](clauses:List[List[Int]]) {
def assign(ll: Assignment) :F = ...
}
And of course you would have the option to make those conversion methods implicit so that callers just have to import them, not call them explicitly.
Lastly, you could do this with a typeclass:
trait Assigner[A] {
...
}
implicit val intAssigner = new Assigner[Int] {
...
}
implicit val seqAssigner = new Assigner[Seq[Int]] {
...
}
case class Formula[F<:Formula[F]](clauses:List[List[Int]]) {
def assign[A : Assigner](ll: A) :F = ...
}
You could also make that type parameter at the class level:
case class Formula[A:Assigner,F<:Formula[A,F]](clauses:List[List[Int]]) {
def assign(ll: A) :F = ...
}
Which one of these paths is best is up to preference and how it might fit in with the rest of the code.

Wrapping a class with side-effects

After reading chapter six of Functional Programming in Scala and trying to understand the State monad, I have a question regarding wrapping an side-effecting class.
Say I have a class that modifies itself in someway.
class SideEffect(x:Int) {
var value = x
def modifyValue(newValue:Int):Unit = { value = newValue }
}
My understanding is that if we wrapped this in a State monad as below, it would still modify the original making the point of wrapping it moot.
case class State[S,+A](run: S => (A, S)) { // See footnote
// map, flatmap, unit, utility functions
}
val sideEffect = new SideEffect(20)
println(sideEffect.value) // Prints "20"
val stateMonad = State[SideEffect,Int](state => {
state.modifyValue(10)
(state.value,state)
})
stateMonad.run(sideEffect) // run the modification
println(sideEffect.value) // Prints "10" i.e. we have modified the original state
The only solution to this that I can see is to make a copy of the class and modify that, but that seems computationally expensive as SideEffect grows. Plus if we wanted to wrap something like a Java class that doesn't implement Cloneable, we'd be out of luck.
val stateMonad = State[SideEffect,Int](state => {
val newState = SideEffect(state.value) // Easier if it was a case class but hypothetically if one was, say, working with a Java library, one would not have this luxury
newState.modifyValue(10)
(newState.value,newState)
})
stateMonad.run(sideEffect) // run the modification
println(sideEffect.value) // Prints "20", original state not modified
Am I using the State monad wrong? How would one go about wrapping a side-effecting class without having to copy it or is this the only way?
The implementation for the State monad I'm using here is from the book, and might differ from Scalaz implementation
You can't do anything with mutable object except hiding mutation in some wrapper. So the scope of program that requires more attention in testing will be much more smaller. Your first sample is good enough. One moment only. It would be better to hide outer reference at all. Instead stateMonad.run(sideEffect) use something like stateMonad.run(new SideEffect(20)) or
def initState: SideEffect = new SideEffect(20)
val (state, value) = stateMonad.run(initState)

Looking for some guidance on how to code a writer for a given "AST" (DynamoDB)

As a personal project, I am writing yet another Scala library for DynamoDb. It contains many interesting aspect such as reading and writing from an AST (just as Json), handling HTTP request, streaming data…
In order to be able able to communicate with DynamoDb, one needs to be able to read from / to the DynamoDb format (the “AST”). I extracted this reading / writing from / to the AST in a minimalist library: dynamo-ast. It contains two main type classes: DynamoReads[_] and DynamoWrites[_] (deeply inspired from Play Json).
I successfully coded the reading part of the library ending with a very simple code such as :
trait DynamoRead[A] { self =>
def read(dynamoType: DynamoType): DynamoReadResult[A]
}
case class TinyImage(url: String, alt: String)
val dynamoReads: DynamoReads[TinyImage] = {
for {
url <- read[String].at(“url”)
alt <- read[String].at(“alt”)
} yield (url, alt) map (TinyImage.apply _).tupled
}
dynamoReads.reads(dynamoAst) //yield DynamoReadResult[TinyImage]
At that point, I thought I wrote the most complicated part of the library and the DynamoWrite[_] part would be a piece of cake. I am however stuck on writing the DynamoWrite part. I was a fool.
My goal is to provide a very similar “user experience” with the DynamoWrite[_] and keep it as simple as possible such as :
val dynamoWrites: DynamoWrites[TinyImage] = {
for {
url <- write[String].at(“url”)
alt <- write[String].at(“alt”)
} yield (url, alt) map (TinyImage.unapply _) //I am not sure what to yield here nor how to code it
}
dynamoWrites.write(TinyImage(“http://fake.url”, “The alt desc”)) //yield DynamoWriteResult[DynamoType]
Since this library is deeply inspired from Play Json library (because I like its simplicity) I had a look at the sources several times. I kind of dislike the way the writer part is coded because to me, it adds a lot of overhead (basically each time a field a written, a new JsObject is created with one field and the resulting JsObject for a complete class is the merge of all the JsObjects containing one field).
From my understanding, the DynamoReads part can be written with only one trait (DynamoRead[_]). The DynamoWrites part however requires at least two such as :
trait DynamoWrites[A] {
def write(a: A): DynamoWriteResult[DynamoType]
}
trait DynamoWritesPath[A] {
def write(path:String, a: A): DynamoWriteResult[(String, DynamoType)]
}
The DynamoWrites[_] is to write plain String, Int… and the DynamoWritesPath[_] is to write a tuple of (String, WhateverTypeHere) (to simulate a “field”).
So writing write[String].at(“url”) would yield a DynamoWritesPath[String]. Now I have several issues :
I have no clue how to write flatMap for my DynamoWritesPath[_]
what should yield a for comprehension to be able to obtain a DynamoWrite[TinyImage]
What I wrote so far (totally fuzzy and not compiling at all, looking for some help on this). Not committed at the moment (gist): https://gist.github.com/louis-forite/cad97cc0a47847b2e4177192d9dbc3ae
To sum up, I am looking for some guidance on how to write the DynamoWrites[_] part. My goal is to provide for the client the most straight forward way to code a DynamoWrites[_] for a given type. My non goal is to write the perfect library and keep it a zero dependency library.
Link to the library: https://github.com/louis-forite/dynamo-ast
A Reads is a covariant functor. That means it has map. It can also be seen as a Monad which means it has flatMap (although a monad is overkill unless you need the previous field in order to know how to process the next):
trait Reads[A] {
def map [B] (f: A => B): Reads[B]
def flatMap [B](f: A => Reads[B]): Reads[B] // not necessary, but available
}
The reason for this, is that to transform a Reads[Int] to a Reads[String], you need to first read the Int, then apply the Int => String function.
But a Writes is a contravariant functor. It has contramap where the direction of the types is reversed:
trait Writes[A] {
def contramap [B](f: B => A): Reads[B]
}
The type on the function is reversed because to transform a Writes[Int] to a Writes[String] you must receive the String from the caller, apply the transformation String => Int and then write the Int.
I don't think it makes sense to provide for-comprehension syntax (flatMap) for the Writes API.
// here it is clear that you're extracting a string value
url <- read[String].at(“url”)
// but what does this mean for the write method?
url <- write[String].at("url")
// what is `url`?
That's probably why play doesn't provide one either, and why they focus on their combinator syntax (using the and function, their version of applicative functor builder?).
For reference: http://blog.tmorris.net/posts/functors-and-things-using-scala/index.html
You can achieve a more consistent API by using something like the and method in play json:
(write[String]("url") and write[String]("alt"))(unlift(TinyImage.unapply))
(read[String]("url") and read[String]("alt"))(TinyImage.apply)
// unfortunately, the type ascription is necessary in this case
(write[String]("url") and write[String]("alt")) {(x: TinyImage) =>
(x.url, x.alt)
}
// transforming
val instantDynamoType: DynamoFormat[Instant] =
format[String].xmap(Instant.parse _)((_: Instant).toString)
You can still use for-comprehension for the reads, although it's a bit over-powered (sort of implies that fields must be processed in-sequence, while that's not technically necessary).

Loaner Pattern in Scala

Scala in Depth demonstrates the Loaner Pattern:
def readFile[T](f: File)(handler: FileInputStream => T): T = {
val resource = new java.io.FileInputStream(f)
try {
handler(resource)
} finally {
resource.close()
}
}
Example usage:
readFile(new java.io.File("test.txt")) { input =>
println(input.readByte)
}
This code appears simple and clear. What is an "anti-pattern" of the Loaner pattern in Scala so that I know how to avoid it?
Make sure that whatever you compute is evaluated eagerly and no longer depends on the resource. Scala makes lazy computation fairly easy. For instance, if you wrap scala.io.Source.fromFile in this way, you might try
readFile("test.txt")(_.getLines)
Unfortunately, this doesn't work because getLines is lazy (returns an iterator). And Scala doesn't have any great way to indicate which methods are lazy and which are not. So you just have to know (docs will tend to tell you), and you have to actually do the work before returning:
readFile("test.txt")(_.getLines.toVector)
Overall, it's a very useful pattern. Just make sure that all accesses to the resource are completed before exiting the block (so no uncompleted futures, no lazy vals that depend on the resource, no iterators, no returning the resource itself, no streams that haven't been fully read, etc.; of course any of these things are okay if they do not depend on the open resource but only on some fully-computed quantity based upon the resource).
With the Loan pattern it is important to know when the "bit" of code that is going to actually call your loaned resource is going to use it.
If you want to return a future from a loan pattern I advise to not create it inside the function that is passed to the loan pattern function.
Don't write
readFile("text.file")(future { doSomething })
but do:
future { readFile("text.file")( doSomething ) }
what I usually do is that I define two types of loan pattern functions: Synchronous and Async
So in your case I would have:
def asyncReadFile[T](f: File)(handler: FileInputStream => T): Future[T] = {
future{
readFile(f)(handler)
}
}
This way you avoid calling closed resources. And you reuse your already tested and hopefully correct code of the Synchronous function.

Can I transform this asynchronous java network API into a monadic representation (or something else idiomatic)?

I've been given a java api for connecting to and communicating over a proprietary bus using a callback based style. I'm currently implementing a proof-of-concept application in scala, and I'm trying to work out how I might produce a slightly more idiomatic scala interface.
A typical (simplified) application might look something like this in Java:
DataType type = new DataType();
BusConnector con = new BusConnector();
con.waitForData(type.getClass()).addListener(new IListener<DataType>() {
public void onEvent(DataType t) {
//some stuff happens in here, and then we need some more data
con.waitForData(anotherType.getClass()).addListener(new IListener<anotherType>() {
public void onEvent(anotherType t) {
//we do more stuff in here, and so on
}
});
}
});
//now we've got the behaviours set up we call
con.start();
In scala I can obviously define an implicit conversion from (T => Unit) into an IListener, which certainly makes things a bit simpler to read:
implicit def func2Ilistener[T](f: (T => Unit)) : IListener[T] = new IListener[T]{
def onEvent(t:T) = f
}
val con = new BusConnector
con.waitForData(DataType.getClass).addListener( (d:DataType) => {
//some stuff, then another wait for stuff
con.waitForData(OtherType.getClass).addListener( (o:OtherType) => {
//etc
})
})
Looking at this reminded me of both scalaz promises and f# async workflows.
My question is this:
Can I convert this into either a for comprehension or something similarly idiomatic (I feel like this should map to actors reasonably well too)
Ideally I'd like to see something like:
for(
d <- con.waitForData(DataType.getClass);
val _ = doSomethingWith(d);
o <- con.waitForData(OtherType.getClass)
//etc
)
If you want to use a for comprehension for this, I'd recommend looking at the Scala Language Specification for how for comprehensions are expanded to map, flatMap, etc. This will give you some clues about how this structure relates to what you've already got (with nested calls to addListener). You can then add an implicit conversion from the return type of the waitForData call to a new type with the appropriate map, flatMap, etc methods that delegate to addListener.
Update
I think you can use scala.Responder[T] from the standard library:
Assuming the class with the addListener is called Dispatcher[T]:
trait Dispatcher[T] {
def addListener(listener: IListener[T]): Unit
}
trait IListener[T] {
def onEvent(t: T): Unit
}
implicit def dispatcher2Responder[T](d: Dispatcher[T]):Responder[T] = new Responder[T} {
def respond(k: T => Unit) = d.addListener(new IListener[T] {
def onEvent(t:T) = k
})
}
You can then use this as requested
for(
d <- con.waitForData(DataType.getClass);
val _ = doSomethingWith(d);
o <- con.waitForData(OtherType.getClass)
//etc
) ()
See the Scala wiki and this presentation on using Responder[T] for a Comet chat application.
I have very little Scala experience, but if I were implementing something like this I'd look to leverage the actor mechanism rather than using callback listener classes. Actors were made for asynchronous communication, they nicely separate those different parts of your app for you. You can also have them send messages to multiple listeners.
We'll have to wait for a "real" Scala programmer to flesh this idea out, though. ;)