I would like to use State monad to implement caching for data provided from third party API. Let's imagine method getThirdPartyData(key: String) which firstly checks cache then if it's not present there should make request to API. First and most naive implementation which came to my mind was enclosing State type within Future -
Future[State[Cache, ThirdPartyData]]
But it's not correct because when request fails you will lose your cache (getThirdPartyData will return Failure).
Second option which came to my mind was to extend, or rather redefine State monad - s => Future[(s,a)], instead of s => (s,a), but I thought that it's quite popular problem so scalaz probably has some already defined way to solve this issue.
Any help greatly appreciated!
Is this what you are looking for StateT[Future, Cache, ThirdPartyData]?
implicit val m: Monoid[ThirdPartyData] = ...
val startState: Cache = ...
val l: List[StateT[Future, Cache, ThirdPartyData]] = ...
val result = l.sequenceU
.map { _.foldMap (identity)) }
.eval (startState)
Related
I understand that generally speaking there is a lot to say about deciding what one wants to model as effect This discussion is introduce in Functional programming in Scala on the chapter on IO.
Nonethless, I have not finished the chapter, i was just browsing it end to end before takling it together with Cats IO.
In the mean time, I have a bit of a situation for some code I need to deliver soon at work.
It relies on a Java Library that is just all about mutation. That library was started a long time ago and for legacy reason i don't see them changing.
Anyway, long story short. Is actually modeling any mutating function as IO a viable way to encapsulate a mutating java library ?
Edit1 (at request I add a snippet)
Readying into a model, mutate the model rather than creating a new one. I would contrast jena to gremlin for instance, a functional library over graph data.
def loadModel(paths: String*): Model =
paths.foldLeft(ModelFactory.createOntologyModel(new OntModelSpec(OntModelSpec.OWL_MEM)).asInstanceOf[Model]) {
case (model, path) ⇒
val input = getClass.getClassLoader.getResourceAsStream(path)
val lang = RDFLanguages.filenameToLang(path).getName
model.read(input, "", lang)
}
That was my scala code, but the java api as documented in the website look like this.
// create the resource
Resource r = model.createResource();
// add the property
r.addProperty(RDFS.label, model.createLiteral("chat", "en"))
.addProperty(RDFS.label, model.createLiteral("chat", "fr"))
.addProperty(RDFS.label, model.createLiteral("<em>chat</em>", true));
// write out the Model
model.write(system.out);
// create a bag
Bag smiths = model.createBag();
// select all the resources with a VCARD.FN property
// whose value ends with "Smith"
StmtIterator iter = model.listStatements(
new SimpleSelector(null, VCARD.FN, (RDFNode) null) {
public boolean selects(Statement s) {
return s.getString().endsWith("Smith");
}
});
// add the Smith's to the bag
while (iter.hasNext()) {
smiths.add(iter.nextStatement().getSubject());
}
So, there are three solutions to this problem.
1. Simple and dirty
If all the usage of the impure API is contained in single / small part of the code base, you may just "cheat" and do something like:
def useBadJavaAPI(args): IO[Foo] = IO {
// Everything inside this block can be imperative and mutable.
}
I said "cheat" because the idea of IO is composition, and a big IO chunk is not really composition. But, sometimes you only want to encapsulate that legacy part and do not care about it.
2. Towards composition.
Basically, the same as above but dropping some flatMaps in the middle:
// Instead of:
def useBadJavaAPI(args): IO[Foo] = IO {
val a = createMutableThing()
mutableThing.add(args)
val b = a.bar()
b.computeFoo()
}
// You do something like this:
def useBadJavaAPI(args): IO[Foo] =
for {
a <- IO(createMutableThing())
_ <- IO(mutableThing.add(args))
b <- IO(a.bar())
result <- IO(b.computeFoo())
} yield result
There are a couple of reasons for doing this:
Because the imperative / mutable API is not contained in a single method / class but in a couple of them. And the encapsulation of small steps in IO is helping you to reason about it.
Because you want to slowly migrate the code to something better.
Because you want to feel better with yourself :p
3. Wrap it in a pure interface
This is basically the same that many third party libraries (e.g. Doobie, fs2-blobstore, neotypes) do. Wrapping a Java library on a pure interface.
Note that as such, the amount of work that has to be done is way more than the previous two solutions. As such, this is worth it if the mutable API is "infecting" many places of your codebase, or worse in multiple projects; if so then it makes sense to do this and publish is as an independent module.
(it may also be worth to publish that module as an open-source library, you may end up helping other people and receive help from other people as well)
Since this is a bigger task is not easy to just provide a complete answer of all you would have to do, it may help to see how those libraries are implemented and ask more questions either here or in the gitter channels.
But, I can give you a quick snippet of how it would look like:
// First define a pure interface of the operations you want to provide
trait PureModel[F[_]] { // You may forget about the abstract F and just use IO instead.
def op1: F[Int]
def op2(data: List[String]): F[Unit]
}
// Then in the companion object you define factories.
object PureModel {
// If the underlying java object has a close or release action,
// use a Resource[F, PureModel[F]] instead.
def apply[F[_]](args)(implicit F: Sync[F]): F[PureModel[F]] = ???
}
Now, how to create the implementation is the tricky part.
Maybe you can use something like Sync to initialize the mutable state.
def apply[F[_]](args)(implicit F: Sync[F]): F[PureModel[F]] =
F.delay(createMutableState()).map { mutableThing =>
new PureModel[F] {
override def op1: F[Int] = F.delay(mutableThing.foo())
override def op2(data: List[String]): F[Unit] = F.delay(mutableThing.bar(data))
}
}
In https://gist.github.com/satyagraha/897e427bfb5ed203e9d3054ac6705704 I have posted a Scala Cats validation scenario which seems reasonable, but I haven't found a very neat solution.
Essentially, there is a two-stage validation, where individual fields are validated, then a class constructor is called which may throw due to internal checks (in general this may not be under my control to change, hence the exception handling code). We wish to not to call the constructor if any field validation fails, but also combine any constructor failure into the final result. "Fail-fast" is definitely right here for the two-phase check.
This is a kind of flatMap problem, which the cats.data.Validated framework appears to handle via the cats.data.Validated#andThenoperation. However I couldn't find a particularly neat solution to the problem as you can see in the code. There are quite a limited number of operations available on a cats.syntax.CartesianBuilder and is wasn't clear to me how to link it with the andThen operation.
Any ideas welcome! Note there is a Cats issue https://github.com/typelevel/cats/issues/1343 which possibly is related, not sure.
For fail fast, chained validation it is easier to use Either than Validated. You can easily switch from Either to Validated or vice versa depending if you want error accumulation.
A possible solution to your problem would be to create a smart constructor for User which returns an Either[Message, User] and use this with Validated[Message, (Name, Date)].
import cats.implicits._
import cats.data.Validated
def user(name: Name, date: Date): Either[Message, User] =
Either.catchNonFatal(User(name, date)).leftMap(Message.toMessage)
// error accumulation -> Validated
val valids: Validated[Message, (Name, Date)] =
(validateName(nameRepr) |#| validateDate(dateDepr)).tupled
// error short circuiting -> either
val userOrMessage: Either[Message, User] =
valids.toEither.flatMap((user _).tupled)
// Either[Message,User] = Right(User(Name(joe),Date(now)))
I would make a helper second-order function to wrap the exception-throwing ones:
def attempt[A, B](f: A => B): A => Validated[Message, B] = a => tryNonFatal(f(a))
Also, default companions of case classes extend the FunctionN trait, so there's no need to do (User.apply _).tupled, it can be shortened to User.tupled (on custom companions, you need to write extends ((...) => ...)) but apply override will be autogenerated)
So we end up with that using andThen:
val valids = validateName(nameRepr) |#| validateDate(dateDepr)
val res: Validated[Message, User] = valids.tupled andThen attempt(User.tupled)
I am integrating an application using ReactiveMongo with a legacy application.
As, I must maintain legacy application interfaces at some point I must block and/or transform my code into the specified interface types. I have that code distilled down to the example below.
Is there a better way than, getChunks, to consume all of an Enumerator with the output type being a List? What is the standard practice?
implicit def legacyAdapter[TInput,TResult]
(block: Future[Enumerator[TInput]])
(implicit translator : (TInput => TResult),
executionContext:ExecutionContext,
timeOut : Duration): List[TResult] = {
val iter = Iteratee.getChunks[TResult]
val exhaustFuture = block.flatMap{
enumy => { enumy.map(i => translator(i) ).run(iter) }
}
val r = Await.result(exhaustFuture , timeOut)
r
}
Iteratee.getChunks is the only utility offered by playframework that build a list by consuming all chuncks of an enumerator, you can of course do the same thing using Iteratee.fold but you will reinvent the wheel as Iteratee.getChunks uses Iteratee.fold.
We found that exposing the Collect method of http://reactivemongo.org/releases/0.10/api/index.html#reactivemongo.api.Cursor to be more performant than collecting the chunks of the Enumerator. So in a sense we solved our problem by changing the problem.
That did mean an API change for our data layer but the performance improvements allowed for it.
Scala in Depth demonstrates the Loaner Pattern:
def readFile[T](f: File)(handler: FileInputStream => T): T = {
val resource = new java.io.FileInputStream(f)
try {
handler(resource)
} finally {
resource.close()
}
}
Example usage:
readFile(new java.io.File("test.txt")) { input =>
println(input.readByte)
}
This code appears simple and clear. What is an "anti-pattern" of the Loaner pattern in Scala so that I know how to avoid it?
Make sure that whatever you compute is evaluated eagerly and no longer depends on the resource. Scala makes lazy computation fairly easy. For instance, if you wrap scala.io.Source.fromFile in this way, you might try
readFile("test.txt")(_.getLines)
Unfortunately, this doesn't work because getLines is lazy (returns an iterator). And Scala doesn't have any great way to indicate which methods are lazy and which are not. So you just have to know (docs will tend to tell you), and you have to actually do the work before returning:
readFile("test.txt")(_.getLines.toVector)
Overall, it's a very useful pattern. Just make sure that all accesses to the resource are completed before exiting the block (so no uncompleted futures, no lazy vals that depend on the resource, no iterators, no returning the resource itself, no streams that haven't been fully read, etc.; of course any of these things are okay if they do not depend on the open resource but only on some fully-computed quantity based upon the resource).
With the Loan pattern it is important to know when the "bit" of code that is going to actually call your loaned resource is going to use it.
If you want to return a future from a loan pattern I advise to not create it inside the function that is passed to the loan pattern function.
Don't write
readFile("text.file")(future { doSomething })
but do:
future { readFile("text.file")( doSomething ) }
what I usually do is that I define two types of loan pattern functions: Synchronous and Async
So in your case I would have:
def asyncReadFile[T](f: File)(handler: FileInputStream => T): Future[T] = {
future{
readFile(f)(handler)
}
}
This way you avoid calling closed resources. And you reuse your already tested and hopefully correct code of the Synchronous function.
How would one transform Future[Option[X]] into Option[Future[X]]?
val futOpt:Future[Option[Int]] = future(Some(1))
val optFut:Option[Future[Int]] = ?
Update:
This is a follow up to this question. I suppose I'm trying to get a grasp on elegantly transforming nested futures. I'm trying to achieve with Options what can be done with Sequences, where you turn a Future[Seq[Future[Seq[X]]]] into Future[Future[Seq[Seq[x]]]] and then flatMap the double layers. As Ionut has clarified, I have phrased the question in flipped order, it was supposed to be Option[Future[X]] -> Future[Option[X]].
Unfortunately, this isn't possible without losing the non-blocking properties of the computation. It's pretty simple if you think about it. You don't know whether the result of the computation is None or Some until that Future has completed, so you have to Await on it. At that point, it makes no sense to have a Future anymore. You can simply return the Option[X], as the Future has completed already.
Take a look here. It always returns Future.successful, which does no computation, just wraps o in a Future for no good reason.
def transform[A](f: Future[Option[A]]): Option[Future[A]] =
Await.result(f, 2.seconds).map(o => Future.successful(o))
So, if in your context makes sense to block, you're better off using this:
def transform[A](f: Future[Option[A]]): Option[A] =
Await.result(f, 2.seconds)
Response for comments:
def transform[A](o: Option[Future[A]]): Future[Option[A]] =
o.map(f => f.map(Option(_))).getOrElse(Future.successful(None))
you could technically you could do this - in snippet below I use A context bound to a monoid so my types hold.
def fut2Opt[A: Monoid](futOpt: Future[Option[A]])(implicit ec: ExecutionContext): Option[Future[A]] =
Try(futOpt.flatMap {
case Some(a) => Future.successful(a)
case None => Future.fromTry(Try(Monoid[A].empty))
}).toOption