Are these two functions different in any way?
case class DFStorage(private var cache: Map[String, DataFrame] = Map()) {
def tryLoad(job: Job): Kleisli[IO, MakeContext, \/[List[String], Unit]] = {
if(!cache.contains(job.id)) {
job.tryLoad.map(_.map(df => add(job, df)))
} else {
IO(().right[List[String]]).liftKleisli
}
}
def tryLoad(job: Job): Kleisli[IO, MakeContext, \/[List[String], Unit]] = {
Kleisli({makeContext: MakeContext =>
if(!cache.contains(job.id)) {
IO {
job.tryLoad.run(makeContext).unsafePerformIO().map(df => add(job, df))
}
} else {
IO(().right[List[String]])
}
})
}
}
I think you are supposed to call unsafePerformIO as the final side effect in your program (e.g. in your main function), but in this case it may result in the same thing given that you are "unwrapping" the value in the IO monad and then wrapping a transformation of that value. As you can see this unwrapping and transformation of the value may be the same process that IO.map does.
unsafePerformIO is used to force the execution of the side effects represented by the IO value. In Haskell your program must be something of type IO and Haskell runs it (main must have type IO). But in Scala the main function can do anything, so you must execute the IO at that point.
Related
I have a function(myFunc) in scala that gives Future[Either[Throwable,T ]] . Now I need to unwrap and get Either[Throwable,T ] out of it and pass to as an input parameter to another function (anotherFunc).
def myFunc(input: String): Future[Either[Throwable, HttpResponse]] = {
....
}
def anotherFunc(response: Either[Throwable, T]) # signature
anotherFunc(myFunc("some string"))
Normally we use map to transform a Future but thats not helping me here
myFunc("some string").map { _ =>
anotherFunc(_)
}
This causes problem with the return of the block from where I am calling .
You can't unwrap the value of a Future because a Future represents the result of an asynchronous computation that may or may not be available yet. By default, futures and non-blocking, encouraging the use of callbacks instead of typical blocking operations.
What you can do is either:
use combinators such as map, flatMap, filter to compose futures in a non-blocking way.
register a callback using the onComplete method, or foreach if you want to call a callback only when the Future completes successfully.
block the main thread using Await.result, if this is absolutely necessary, although is discouraged. If you want to transform the Future result or combine it with others, you should opt for the 2 previos non-blocking ways mentioned.
That being said. These are the preferred approaches:
def anotherFunc[T](response: Future[Either[Throwable, T]]) = {
response.onComplete {
case Failure(exception) => // process exception
case Success(value) => // process value
}
}
def anotherFunc2[T](response: Future[Either[Throwable, T]]) = {
response.map {
case Left(exception) => // process exception
case Right(value) => // process value
}
}
Then you can do:
anotherFunc(myFunc("some string"))
anotherFunc2(myFunc("some string"))
EDIT:
If you can't change the signature of anotherFunc[T](response: Either[Throwable, T]) then just do:
myFunc("some string").map(anotherFunc)
Here's the code from FPIS
object test2 {
//a naive IO monad
sealed trait IO[A] { self =>
def run: A
def map[B](f: A => B): IO[B] = new IO[B] { def run = f(self.run) }
def flatMap[B](f: A => IO[B]): IO[B] = {
println("calling IO.flatMap")
new IO[B] {
def run = {
println("calling run from flatMap result")
f(self.run).run
}
}
}
}
object IO {
def unit[A](a: => A): IO[A] = new IO[A] { def run = a }
def apply[A](a: => A): IO[A] = unit(a) // syntax for IO { .. }
}
//composer in question
def forever[A,B](a: IO[A]): IO[B] = {
lazy val t: IO[B] = a flatMap (_ => t)
t
}
def PrintLine(msg: String) = IO { println(msg) }
def say = forever(PrintLine("Still Going..")).run
}
test2.say will print thousands of "Still Going" before stack overflows. But I don't know exactly how that happens.
The output looks like this:
scala> test2.say
calling IO.flatMap //only once
calling run from flatMap result
Still Going..
calling run from flatMap result
Still Going..
... //repeating until stack overflows
When function forever returns, is the lazy val t fully computed (cached)?
And, the flatMap method seems to be called only once (I add print statements) which counters the recursive definition of forever. Why?
===========
Another thing I find interesting is that the B type in forever[A, B] could be anything. Scala actually can run with it being opaque.
I manually tried forever[Unit, Double], forever[Unit, String] etc and it all worked. This feels smart.
What forever method does is, as the name suggests, makes the monadic instance a run forever. To be more precise, it gives us an infinite chain of monadic operations.
Its value t is defined recursively as:
t = a flatMap (_ => t)
which expands to
t = a flatMap (_ => a flatMap (_ => t))
which expands to
t = a flatMap (_ => a flatMap (_ => a flatMap (_ => t)))
and so on.
Lazy gives us the ability to define something like this. If we removed the lazy part we would either get a "forward reference" error (in case the recursive value is contained within some method) or it would simply be initialized with a default value and not used recursively (if contained within a class, which makes it a class field with a behind-the-scenes getter and setter).
Demo:
val rec: Int = 1 + rec
println(rec) // prints 1, "rec" in the body is initialized to default value 0
def foo() = {
val rec: Int = 1 + rec // ERROR: forward reference extends over definition of value rec
println(rec)
}
However, this alone is not the reason why the whole stack overflow thing happens. There is another recursive part, and this one is actually responsible for the stack overflow. It is hidden here:
def run = {
println("calling run from flatMap result")
f(self.run).run
}
Method run calls itself (see that self.run). When we define it like this, we don't evaluate self.run on the spot because f hasn't been invoked yet; we are just stating that it will be invoked once run() is invoked.
But when we create the value t in forever, we are creating an IO monad that flatMaps into itself (the function it provides to flatMap is "evaluate into yourself"). This will trigger the run and therefore the evaluation and invocation of f. We never really leave the flatMap context (hence only one printed statement for the flatMap part) because as soon as we try to flatMap, run starts evaluating the function f which returns the IO on which we call run which invokes the function f which returns the IO on which we call run which invokes the function f which returns the IO on which we call run...
I'd like to know when function forever returns, is the lazy val t fully computed (cached)?
Yes
If so then why need the lazy keyword?
It's no use in your case. It can be useful in situation like:
def repeat(n: Int): Seq[Int] {
lazy val expensive = "some expensive computation"
Seq.fill(n)(expensive)
// when n == 0, the 'expensive' computation will be skipped
// when n > 1, the 'expensive' computation will only be computed once
}
The other thing I don't understand is that the flatMap method seems to
be called only once (I add print statements) which counters the
recursive definition of forever. Why?
Not possible to comment until you can provide a Minimal, Complete, and Verifiable example, like #Yuval Itzchakov said
Updated 19/04/2017
Alright, I need to correct myself :-) In your case the lazy val is required due to the recursive reference back to itself.
To explain your observation, let's try to expand the forever(a).run call:
forever(a) expands to
{ lazy val t = a flatMap(_ => t) } expands to
{ lazy val t = new IO[B] { def run() = { ... t.run } }
Because t is lazy, flatMap and new IO[B] in 2 and 3 are invoked only once and then 'cached' for reuse.
On invoking run() on 3, you start a recursion on t.run and thus the result you observed.
Not exactly sure about your requirement, but a non-stack-blowing version of forever can be implemented like:
def forever[A, B](a: IO[A]): IO[B] = {
new IO[B] {
#tailrec
override def run: B = {
a.run
run
}
}
}
new IO[B] {
def run = {
println("calling run from flatMap result")
f(self.run).run
}
}
I get it now why overflowing occurs at run method: the outer run invocation in def run actually points to def run itself.
The call stack looks like this:
f(self.run).run
|-----|--- println
|--- f(self.run).run
|-----|------println
|------f(self.run).run
|------ (repeating)
f(self.run) always points to the same evaluated/cached lazy val t object
because f: _ => t simply returns t that IS the UNIQUE newly created
IO[B] that hosts its run method which we are calling and will immediately recursively call again.
That's how we can see print statements before stack overflows.
However still not clear how lazy val in this case can cook it right.
I am trying to find the best way to handle jedis commands from scala. I am trying to implement a finally block, and prevent the java exceptions from bubbling up to my caller.
Does the following code make sense, and is it the best I can do performance wise, if I want to ensure that I handle exceptions when redis may be down temporarily? This trait would be extended by an object, and I'd call objectname.del(key). I feel like I'm combining too many concepts (Either, Option, Try, feels like there should be a cleaner way)
trait MyExperiment {
implicit class TryOps[T](val t: Try[T]) {
def eventually[Ignore](effect: => Ignore): Try[T] = {
val ignoring = (_: Any) => { effect; t }
t transform (ignoring, ignoring)
}
}
val jpool:JedisPool = initialize()
// init the pool at object creation
private def initialize(): JedisPool =
{
val poolConfig = new JedisPoolConfig()
poolConfig.setMaxIdle(10)
poolConfig.setMinIdle(2)
poolConfig.setTestWhileIdle(true)
poolConfig.setTestOnBorrow(true)
poolConfig.setTestOnReturn(true)
poolConfig.setNumTestsPerEvictionRun(10)
new JedisPool( poolConfig , "localhost" )
}
// get a resource from pool. This can throw an error if redis is
// down
def getFromPool: Either[Throwable,Jedis] =
Try(jpool.getResource) match {
case Failure(m) => Left(m)
case Success(m) => Right(m)
}
// return an object to pool
// i believe this may also throw an error if redis is down?
def returnToPool(cache:Jedis): Unit =
Try(jpool.returnResource(cache))
// execute a command -- "del" in this case, (wrapped by
// the two methods above)
def del(key: String) : Option[Long] = {
getFromPool match {
case Left(m) => None
case Right(m) => Try(m.del(key)) eventually returnToPool(m) match {
case Success(r) => Option(r)
case Failure(r) => None
}
}
}
}
Not an exact answer, but I moved on after doing some performance testing. Using the standard java-ish exception blocks ended up being much faster at high iterations (at 10,000 iterations, it was about 2.5x faster than the (bad) code above). That also cleaned up my code, although it's more verbose.
So the answer I arrived at is to use the Java-style exception blocks which provide for the finally construct. I believe it should be significantly faster, as long as exceptions are a very rare occurance.
According to the Scala Language Specification (ยง6.19), "An enumerator sequence always starts with a generator". Why?
I sometimes find this restriction to be a hindrance when using for-comprehensions with monads, because it means you can't do things like this:
def getFooValue(): Future[Int] = {
for {
manager = Manager.getManager() // could throw an exception
foo <- manager.makeFoo() // method call returns a Future
value = foo.getValue()
} yield value
}
Indeed, scalac rejects this with the error message '<-' expected but '=' found.
If this was valid syntax in Scala, one advantage would be that any exception thrown by Manager.getManager() would be caught by the Future monad used within the for-comprehension, and would cause it to yield a failed Future, which is what I want. The workaround of moving the call to Manager.getManager() outside the for-comprehension doesn't have this advantage:
def getFooValue(): Future[Int] = {
val manager = Manager.getManager()
for {
foo <- manager.makeFoo()
value = foo.getValue()
} yield value
}
In this case, an exception thrown by foo.getValue() will yield a failed Future (which is what I want), but an exception thrown by Manager.getManager() will be thrown back to the caller of getFooValue() (which is not what I want). Other possible ways of handling the exception are more verbose.
I find this restriction especially puzzling because in Haskell's otherwise similar do notation, there is no requirement that a do block should begin with a statement containing <-. Can anyone explain this difference between Scala and Haskell?
Here's a complete working example showing how exceptions are caught by the Future monad in for-comprehensions:
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Try, Success, Failure}
class Foo(val value: Int) {
def getValue(crash: Boolean): Int = {
if (crash) {
throw new Exception("failed to get value")
} else {
value
}
}
}
class Manager {
def makeFoo(crash: Boolean): Future[Foo] = {
if (crash) {
throw new Exception("failed to make Foo")
} else {
Future(new Foo(10))
}
}
}
object Manager {
def getManager(crash: Boolean): Manager = {
if (crash) {
throw new Exception("failed to get manager")
} else {
new Manager()
}
}
}
object Main extends App {
def getFooValue(crashGetManager: Boolean,
crashMakeFoo: Boolean,
crashGetValue: Boolean): Future[Int] = {
for {
manager <- Future(Manager.getManager(crashGetManager))
foo <- manager.makeFoo(crashMakeFoo)
value = foo.getValue(crashGetValue)
} yield value
}
def waitForValue(future: Future[Int]): Unit = {
val result = Try(Await.result(future, Duration("10 seconds")))
result match {
case Success(value) => println(s"Got value: $value")
case Failure(e) => println(s"Got error: $e")
}
}
val future1 = getFooValue(false, false, false)
waitForValue(future1)
val future2 = getFooValue(true, false, false)
waitForValue(future2)
val future3 = getFooValue(false, true, false)
waitForValue(future3)
val future4 = getFooValue(false, false, true)
waitForValue(future4)
}
Here's the output:
Got value: 10
Got error: java.lang.Exception: failed to get manager
Got error: java.lang.Exception: failed to make Foo
Got error: java.lang.Exception: failed to get value
This is a trivial example, but I'm working on a project in which we have a lot of non-trivial code that depends on this behaviour. As far as I understand, this is one of the main advantages of using Future (or Try) as a monad. What I find strange is that I have to write
manager <- Future(Manager.getManager(crashGetManager))
instead of
manager = Manager.getManager(crashGetManager)
(Edited to reflect #RexKerr's point that the monad is doing the work of catching the exceptions.)
for comprehensions do not catch exceptions. Try does, and it has the appropriate methods to participate in for-comprehensions, so you can
for {
manager <- Try { Manager.getManager() }
...
}
But then it's expecting Try all the way down unless you manually or implicitly have a way to switch container types (e.g. something that converts Try to a List).
So I'm not sure your premises are right. Any assignment you made in a for-comprehension can just be made early.
(Also, there is no point doing an assignment inside a for comprehension just to yield that exact value. Just do the computation in the yield block.)
(Also, just to illustrate that multiple types can play a role in for comprehensions so there's not a super-obvious correct answer for how to wrap an early assignment in terms of later types:
// List and Option, via implicit conversion
for {i <- List(1,2,3); j <- Option(i).filter(_ <2)} yield j
// Custom compatible types with map/flatMap
// Use :paste in the REPL to define A and B together
class A[X] { def flatMap[Y](f: X => B[Y]): A[Y] = new A[Y] }
class B[X](x: X) { def map[Y](f: X => Y): B[Y] = new B(f(x)) }
for{ i <- (new A[Int]); j <- (new B(i)) } yield j.toString
Even if you take the first type you still have the problem of whether there is a unique "bind" (way to wrap) and whether to doubly-wrap things that are already the correct type. There could be rules for all these things, but for-comprehensions are already hard enough to learn, no?)
Haskell translates the equivalent of for { manager = Manager.getManager(); ... } to the equivalent of lazy val manager = Manager.getManager(); for { ... }. This seems to work:
scala> lazy val x: Int = throw new Exception("")
x: Int = <lazy>
scala> for { y <- Future(x + 1) } yield y
res8: scala.concurrent.Future[Int] = scala.concurrent.impl.Promise$DefaultPromise#fedb05d
scala> Try(Await.result(res1, Duration("10 seconds")))
res9: scala.util.Try[Int] = Failure(java.lang.Exception: )
I think the reason this can't be done is because for-loops are syntactic sugar for flatMap and map methods (except if you are using a condition in the for-loop, in that case it's desugared with the method withFilter). When you are storing in a immutable variable, you can't use these methods. That's the reason you would be ok using Try as pointed out by Rex Kerr. In that case, you should be able to use map and flatMap methods.
Having a function that is used like this:
xpto.withClient {
client => client.abcd
}
I would like to wrap it in an object:
object X {
def foo[T](block: => T): T = {
xpto.withClient {
client => {
block
}
}
}
}
to make it possible to be used like this:
object Y {
def bar : Unit {
X.foo {
client.abcd
}
}
}
This doesn't seems to be making the client value available inside the block though. Is this possible? Making the client variable available inside the block definition? I've looked around with implicits in Scala but so far no good.
That won't work, because block is just something that produces a value of T. It doesn't have the same scope. Supposing that client has type Client, then block should be a function Client => T. foo would then pass the client to block.
def foo[T](block: Client => T): T = {
xpto.withClient { client =>
block(client)
}
}
Or more concisely:
def foo[T](block: Client => T): T = xpto.withClient(block(_))
However, that will change your usage to this:
object Y {
def bar : Unit {
X.foo { client =>
client.abcd
}
}
}
Of course, this does nothing but thinly wrap xpto.withClient. The thing is, you need to have a way to pass client down the chain. Doing this implicitly won't really help either, because you still need a client identifier within that anonymous block of code.