I've been using Slick for quite a while and now I'm migrating from Slick 2.1 to 3.0. Unfortunatelly I got stuck with ordinary stuff like counting lines. My code worked perfectly in Slick 2.1 when I used to do this:
connection.withSession {
implicit session => coffees.length.run
}
On the code above I would get my result as an Int, but I can't get it to work now after I moved to Slick 3.0.2 though the documentation tells me that the code should be the same.
I tried the following (I already removed the withSession deprecated call):
connection.createSession.withTransaction {
coffees.length
}
But this code will return a slick.lifted.Rep[Int] which does not have any method to get the integer value. Am I missing some implicit import?
As you have probably realised, the result of the run call is to produce a Future, which will resolve at some later point.
While this means that eventually somewhere in the code the future will need to be waited on in a manner like you show in your answer, this can, and should, be pushed back as late as possible. If you are working with, for example, the Play framework, use async Actions and let Play handle it for you.
In the meantime work with the Future as you would any other monadic construct (like Option) - calling map, flatMap, onSuccess and so on to chain your computations inside the propagated Future context.
Please, someone tell me there is a better way to answer my question. I got it to work doing this, but this looks terrible:
import scala.concurrent.duration._
import scala.concurrent.Await
val timeout = Duration(10, SECONDS)
val count = Await.result(connection.run(coffees.length.result), timeout)
Related
I have some cats IO operations, and Future among then. Simplified:
IO(getValue())
.flatMap(v => IO.fromFuture(IO(blockingProcessValue(v)))(myBlockingPoolContextShift))
.map(moreProcessing)
So I have some value in IO, then I need to do some blocking operation using a library that returns Future, and then I need to do some more processing on the value returned from Future
Future runs on a dedicated thread pool - so far so good. The problem is after Future is completed. moreProcessing runs on the same thread the Future was running on.
Is there a way to get back to the thread getValue() was running on?
After the discussion in the chat, the conclusion is that the only thing OP needs is to create a ContextShift in the application entry point using the appropriate (compute) EC and then pass it down to the class containing this method.
// Entry point
val computeEC = ???
val cs = IO.contextShift(computeEC)
val myClass = new MyClass(cs, ...)
// Inside the method on MyClass
IO(getValue())
.flatMap(v => IO.fromFuture(IO(blockingProcessValue(v)))(myBlockingPoolContextShift))
.flatTap(_ => cs.shift)
.map(moreProcessing)
This Scastie showed an approach using Blocker and other techniques common in the Typelevel ecosystem but were not really suitable for OP's use case; anyways I find it useful for future readers who may have a similar problem.
Futures are executed in my code and are not being detected.
def f(): Future[String] = {
functionReturningFuture() // How to detect this?
Future("")
}
Ideally, a static analysis tool would help detect this.
The closer you can get is NonUnitStatements wart from WartRemover, but it cannot error only Future statements and skip all the others.
The fact that you have such issue could be used as an argument against using Future and replacing it with some IO: Cats' IO, Monix's Task or Scalaz ZIO. When it comes to them, you build your pipeline first, and the you run it. If you omitted IO value in return and you didn't compose it into the result in any other way (flatMap, map2, for comprehension etc) it would not get executed - it would still be there but it would cause no harm.
If you wanted to have greater control and error only on Future, you would probably have to write your own WartRemover's wart or ScalaFix rule.
Let's say I have a method that permits to update some date in DB:
def updateLastConsultationDate(userId: String): Unit = ???
How can I throttle/debounce that method easily so that it won't be run more than once an hour per user.
I'd like the simplest possible solution, not based on any event-bus, actor lib or persistence layer. I'd like an in-memory solution (and I am aware of the risks).
I've seen solutions for throttling in Scala, based on Akka Throttler, but this really looks to me overkill to start using actors just for throttling method calls. Isn't there a very simple way to do that?
Edit: as it seems not clear enough, here's a visual representation of what I want, implemented in JS. As you can see, throttling may not only be about filtering subsequent calls, but also postponing calls (also called trailing events in js/lodash/underscore). The solution I'm looking for can't be based on pure-synchronous code only.
This sounds like a great job for a ReactiveX-based solution. On Scala, Monix is my favorite one. Here's the Ammonite REPL session illustrating it:
import $ivy.`io.monix::monix:2.1.0` // I'm using Ammonite's magic imports, it's equivalent to adding "io.monix" %% "monix" % "2.1.0" into your libraryImports in SBT
import scala.concurrent.duration.DurationInt
import monix.reactive.subjects.ConcurrentSubject
import monix.reactive.Consumer
import monix.execution.Scheduler.Implicits.global
import monix.eval.Task
class DbUpdater {
val publish = ConcurrentSubject.publish[String]
val throttled = publish.throttleFirst(1 hour)
val cancelHandle = throttled.consumeWith(
Consumer.foreach(userId =>
println(s"update your database with $userId here")))
.runAsync
def updateLastConsultationDate(userId: String): Unit = {
publish.onNext(userId)
}
def stop(): Unit = cancelHandle.cancel()
}
Yes, and with Scala.js this code will work in the browser, too, if it's important for you.
Since you ask for the simplest possible solution, you can store a val lastUpdateByUser: Map[String, Long], which you would consult before allowing an update
if (lastUpdateByUser.getOrElse(userName, 0)+60*60*1000 < System.currentTimeMillis) updateLastConsultationDate(...)
and update when a user actually performs an update
lastUpdateByUser(userName) = System.currentTimeMillis
One way to throttle, would be to maintain a count in a redis instance. Doing so would ensure that the DB wouldn't be updated, no matter how many scala processes you were running, because the state is stored outside of the process.
I'm using Slick 3.0 and (of course) almost all the examples out there cover Slick 2.x. Things have changed and frankly seem more complicated, not less.
Here's an example: I want to get an object (a GPPerson) by id. This is what I have right now, and it seems very verbose... more so than Slick 2.x:
def get(id: GPID): Option[GPPerson] = Await.result(
db.run(
people.filter(_.id === id).result), Duration.Inf
).headOption
In Slick 2.x things were easier because of the implicits, among other things. But the above seems to be the most concise expression I've come up with.
It also doesn't really address exception handling, which I would need to add.
I started to use Slick 3.0 in a new project a few months ago and I had the same questions. This is what I understood:
Slick 3.0 was designed for non-blocking asynchronous (reactive) applications. Obviously it means Akka + Play / Spray nowadays. In this world you mostly interact with Futures, that's why Slick's db.run returns Future. There is no point in using Await.result - if you need blocking calls it's better to return to 2.x.
But if you use reactive stack you'll get benefits immediately. For example, Spray is completely non-blocking library that works with Futures nicely using onComplete directive. You can call a method that returns Future with a result from Slick in a Spray route and then use that result together with onComplete. In this case the whole response-reply pipeline is non-blocking.
You also mentioned exception handling, so this is exactly how you do it - using Futures.
So based on my experience I would write your method in a following way:
def get(id: GPID): Future[Option[GPPerson]] = db.run(
people.filter(_.id === id).result.map(_.headOption)
)
and then work with a Future.
you can do this.
def syncResult[R](action:slick.dbio.DBIOAction[R, slick.dbio.NoStream, scala.Nothing]):R = {
import scala.concurrent.duration.Duration
val db = Database.forConfig("db")
try {
Await.result(db.run(action), Duration.Inf)
} finally db.close
}
def get(id: GPID): Option[GPPerson] = syncResult { people.filter(_.id === id).result.headOption }
Passing messages around with actors is great. But I would like to have even easier code.
Examples (Pseudo-code)
val splicedList:List[List[Int]]=biglist.partition(100)
val sum:Int=ActorPool.numberOfActors(5).getAllResults(splicedList,foldLeft(_+_))
where spliceIntoParts turns one big list into 100 small lists
the numberofactors part, creates a pool which uses 5 actors and receives new jobs after a job is finished
and getallresults uses a method on a list. all this done with messages passing in the background. where maybe getFirstResult, calculates the first result, and stops all other threads (like cracking a password)
With Scala Parallel collections that will be included in 2.8.1 you will be able to do things like this:
val spliced = myList.par // obtain a parallel version of your collection (all operations are parallel)
spliced.map(process _) // maps each entry into a corresponding entry using `process`
spliced.find(check _) // searches the collection until it finds an element for which
// `check` returns true, at which point the search stops, and the element is returned
and the code will automatically be done in parallel. Other methods found in the regular collections library are being parallelized as well.
Currently, 2.8.RC2 is very close (this or next week), and 2.8 final will come in a few weeks after, I guess. You will be able to try parallel collections if you use 2.8.1 nightlies.
You can use Scalaz's concurrency features to achieve what you want.
import scalaz._
import Scalaz._
import concurrent.strategy.Executor
import java.util.concurrent.Executors
implicit val s = Executor.strategy[Unit](Executors.newFixedThreadPool(5))
val splicedList = biglist.grouped(100).toList
val sum = splicedList.parMap(_.sum).map(_.sum).get
It would be pretty easy to make this prettier (i.e. write a function mapReduce that does the splitting and folding all in one). Also, parMap over a List is unnecessarily strict. You will want to start folding before the whole list is ready. More like:
val splicedList = biglist.grouped(100).toList
val sum = splicedList.map(promise(_.sum)).toStream.traverse(_.sum).get
You can do this with less overhead than creating actors by using futures:
import scala.actors.Futures._
val nums = (1 to 1000).grouped(100).toList
val parts = nums.map(n => future { n.reduceLeft(_ + _) })
val whole = (0 /: parts)(_ + _())
You have to handle decomposing the problem and writing the "future" block and recomposing it in to a final answer, but it does make executing a bunch of small code blocks in parallel easy to do.
(Note that the _() in the fold left is the apply function of the future, which means, "Give me the answer you were computing in parallel!", and it blocks until the answer is available.)
A parallel collections library would automatically decompose the problem and recompose the answer for you (as with pmap in Clojure); that's not part of the main API yet.
I'm not waiting for Scala 2.8.1 or 2.9, it would rather be better to write my own library or use another, so I did more googling and found this: akka
http://doc.akkasource.org/actors
which has an object futures with methods
awaitAll(futures: List[Future]): Unit
awaitOne(futures: List[Future]): Future
but http://scalablesolutions.se/akka/api/akka-core-0.8.1/
has no documentation at all. That's bad.
But the good part is that akka's actors are leaner than scala's native ones
With all of these libraries (including scalaz) around, it would be really great if scala itself could eventually merge them officially
At Scala Days 2010, there was a very interesting talk by Aleksandar Prokopec (who is working on Scala at EPFL) about Parallel Collections. This will probably be in 2.8.1, but you may have to wait a little longer. I'll lsee if I can get the presentation itself. to link here.
The idea is to have a collections framework which parallelizes the processing of the collections by doing exactly as you suggest, but transparently to the user. All you theoretically have to do is change the import from scala.collections to scala.parallel.collections. You obviously still have to do the work to see if what you're doing can actually be parallelized.