Do my own stuff in a Slick transaction - Scala

I'm using Slick 3.1.1 and I would like to run my own logic inside a Slick transaction.
def findSomeProducts = table.getSomeProducts() //db operation
def productTableUpdate = doSomeStuff1() //db operation
def priceTableUpdate = doSomeStuff2() //db operation
def updateElasticCache = updateIndexOfProduct() //this is not a database operation
I have these sample functions. First, I get some products from the db, and afterwards I update the tables. At the end I need to run the updateElasticCache method. If updateElasticCache fails, I would like to roll back all the db operations.
I can't use
(for { ... } yield ()).transactionally
because it doesn't fit my case: transactionally expects db actions only, but I want to include a step that is not a db operation.
Is it possible? How can I achieve it?

DBIO.from
Yes, it's possible to add non-db logic in between db logic in Slick, using DBIO action composition and DBIO.from.
Note that "your own stuff" should return a Future; a Future can be converted to a DBIO and composed along with the usual db actions.
DBIO.from can help you with this. Here is how it works: DBIO.from takes a Future and converts it into a DBIOAction. You can then compose these actions with the usual db actions, so non-db operations run alongside db operations in a transaction.
def updateElasticCache(): Future[Unit] = Future(doSomething())
Now let's say we have some db actions:
def createUser(user: User): DBIO[Int] = ???
I want createUser to roll back if updating the cache fails, so I do the following:
val action = createUser(user).flatMap { _ => DBIO.from(updateElasticCache()) }.transactionally
db.run(action)
Now if updateElasticCache fails, the whole transaction fails and everything rolls back to the previous state.
Example
You can use a for-comprehension to make it look good:
def updateStats(): DBIO[Int] = ???

val rollbackActions =
  (for {
    cStatus <- createUser(user)
    uStatus <- updateStats()
    result  <- DBIO.from(updateElasticCache())
  } yield result).transactionally

db.run(rollbackActions)
Everything rolls back if the updateElasticCache future fails.
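For completeness, a hedged sketch of observing the outcome at the call site: db.run returns a Future that fails if any step fails, including the DBIO.from(updateElasticCache()) step, and by then the db changes have already been rolled back.

import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

db.run(rollbackActions).onComplete {
  case Success(result) => println("db writes committed and the cache was updated")
  case Failure(ex)     => println(s"a step failed, db writes were rolled back: $ex")
}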

Related

Future returned by Slick run method completes successfully before rows have been inserted

I am working on a Scala app using Slick and PostgreSQL for storage. I have a method foo which reads data from a CSV file and inserts about 10,000 rows into the database. After the Future of the row-insertion action completes, another method bar is called which retrieves these rows from the database and performs some actions on them. This is where the problem lies: no rows are actually retrieved from the database, because no rows have been inserted by the time the Future completes.
From what I could gather while looking for an answer and reading the official documentation, the Future is not supposed to complete before the insert statement has been successfully executed. If I add Thread.sleep(30000) to bar, allowing the insert statement to be executed first, the methods produce the expected result. For obvious reasons I would rather not do that, so I am looking for alternatives.
The program flow once the initial method doStuff is called is as follows: doStuff calls foo, which loads the data and stores it in the database before returning a Future. In doStuff this Future is then mapped and a call to bar is made. bar retrieves the rows from the database and processes them. However, since no rows have been inserted at the point where bar is called, no data is processed.
The doStuff method:
def doStuff(csvFile: File): Future[Unit] = {
  fooService.foo(csvFile)
    .map(_ => {
      csvFile.delete()
      barService.bar()
    })
}
The foo method:
def foo(file: File): Future[Unit] = {
  val reader = CSVReader.open(file)
  fooStorage.truncateFooData().map(_ => {
    val foos = for (line <- reader.iterator if
        line.head != "bad1" &&
        line.head != "bad2")
      yield parseFooData(line)
    fooStorage.saveFooDataBulk(foos.toSeq)
  })
}
How I am using Slick to insert the rows:
override def saveFooDataBulk(fooSeq: Seq[Foo]): Future[Seq[Foo]] =
  db.run(DBIO.seq(fooQuery ++= fooSeq)).map(_ => fooSeq)
I am expecting bar to be called once all the rows have been inserted into the database and not sooner; however, currently the Future from Slick completes too soon. In case it's relevant: the doStuff method is called when a request is sent to an Akka HTTP endpoint. The application and the database are running in two different Docker containers. What am I doing wrong?
Also I can't get over how appropriate the username I picked a year and a half ago is now.
Replace map with flatMap in def foo. Otherwise it will start a Future[Seq[Foo]] and then immediately return a (), which is then discarded by your doStuff. Something like this:
fooStorage
  .truncateFooData()
  .flatMap(_ => {
    /* stuff... */
    fooStorage.saveFooDataBulk(foos.toSeq)
  })
  .map(_ => ())
I didn't test it, but in any case, starting some Futures in the middle of another Future.map and then immediately returning an () doesn't feel quite right.
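Putting it together, an untested, hedged sketch of foo rewritten with flatMap, using the names from the question; the outer Future now completes only after saveFooDataBulk has finished (an implicit ExecutionContext is assumed in scope):

def foo(file: File): Future[Unit] = {
  val reader = CSVReader.open(file)
  fooStorage.truncateFooData().flatMap { _ =>
    // build the rows while skipping the bad lines, exactly as before
    val foos = for {
      line <- reader.iterator
      if line.head != "bad1" && line.head != "bad2"
    } yield parseFooData(line)
    fooStorage.saveFooDataBulk(foos.toSeq)
  }.map(_ => ())
}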

Doobie and DB access composition within 1 transaction

The Doobie book says that it's good practice to return ConnectionIO from your repository layer. That gives you the ability to chain calls and perform them in one transaction.
Nice and clear.
Now let's imagine we are working on a REST API service and our scenario is:
Find an object in database
Perform some async manipulation (using cats.effect.IO or monix.eval.Task) with this object.
Store the object in database.
And we want to perform all these steps inside one transaction. The problem is that, without the natural transformation provided by transactor.trans(), we are working with two monads, Task and ConnectionIO, so they cannot be composed directly.
The question is: how do I mix Doobie's ConnectionIO with another effect monad in one composition, so that everything runs in one transaction and all DB mutations are committed or rolled back at the end of the world?
Thank you!
UPD:
small example
def getObject: ConnectionIO[Request] = ???
def saveObject(obj: Request): ConnectionIO[Request] = ???
def processObject(obj: Request): monix.eval.Task[Request] = ???
val transaction: ??? = for {
  obj       <- getObject             // ConnectionIO[Request]
  processed <- processObject(obj)    // monix.eval.Task[Request]
  updated   <- saveObject(processed) // ConnectionIO[Request]
} yield updated
UPD2: The correct answer provided by #oleg-pyzhcov is to lift your effect datatypes to ConnectionIO like this:
def getObject: ConnectionIO[Request] = ???
def saveObject(obj: Request): ConnectionIO[Request] = ???
def processObject(obj: Request): monix.eval.Task[Request] = ???
val transaction: ConnectionIO[Request] = for {
  obj       <- getObject                                         // ConnectionIO[Request]
  processed <- Async[ConnectionIO].liftIO(processObject(obj).toIO) // ConnectionIO[Request]
  updated   <- saveObject(processed)                             // ConnectionIO[Request]
} yield updated
val result: Task[Request] = transaction.transact(xa)
ConnectionIO in Doobie has a cats.effect.Async instance, which, among other things, allows you to turn any cats.effect.IO into a ConnectionIO by means of the liftIO method:
import doobie.free.connection._
import cats.effect.{IO, Async}
val catsIO: IO[String] = ???
val cio: ConnectionIO[String] = Async[ConnectionIO].liftIO(catsIO)
For monix.eval.Task, your best bet is using Task#toIO and performing the same trick, but you'd need a monix Scheduler in scope.
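A hedged sketch of that trick for monix.eval.Task (assuming Monix 3.x, where Task#toIO can pick up its cats-effect instance from an implicit Scheduler in scope):

import cats.effect.Async
import doobie.free.connection._   // brings ConnectionIO and its Async instance into scope
import monix.eval.Task
import monix.execution.Scheduler

// assumed: a Scheduler so that Task#toIO can be materialized
implicit val scheduler: Scheduler = Scheduler.global

val task: Task[String] = ???
val liftedTask: ConnectionIO[String] = Async[ConnectionIO].liftIO(task.toIO)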

Order of execution of Future - Making sequential inserts in a db non-blocking

A simple scenario here: I am using Akka Streams to read from Kafka and write into an external store, in my case Cassandra.
The Akka Streams (reactive-kafka) library equips me with backpressure and other nifty things to make this possible.
With Kafka as the Source and Cassandra as the Sink, I get a bunch of events through Kafka which are Cassandra queries and which are supposed to be executed sequentially (for example an INSERT, an UPDATE and a DELETE that must run in that order).
I cannot use mapAsync and execute the statements together: Futures are eager, and there is a chance that the DELETE or UPDATE gets executed before the INSERT.
I am forced to use Cassandra's execute as opposed to executeAsync, which is non-blocking.
There may be no fully async solution to this issue, but is there a more elegant way to do it?
For example: make the Future lazy and sequential and offload it to a different execution context of sorts.
mapAsync gives a parallelism option as well.
Can Monix Task be of help here?
This is a general design question: what are the approaches one can take?
UPDATE:
Flow[In].mapAsync(3)(input => {
  input match {
    case INSERT => // do insert - returns future
    case UPDATE => // do update - returns future
    case DELETE => // delete - returns future
  }
})
The scenario is a little more complex. There could be thousands of inserts, updates and deletes coming in order for specific key(s) (in Kafka).
I would ideally want to execute the three futures for a single key in sequence. I believe Monix's Task can help?
If you process things with parallelism of 1, they will get executed in strict sequence, which will solve your problem.
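For instance, a minimal hedged sketch, where executeStatement is a hypothetical function returning the Cassandra write Future:

// With parallelism = 1, each Future must complete before the next element is pulled,
// so consecutive statements run in strict order without blocking a thread.
val sequentialWrites: Flow[In, Out, NotUsed] =
  Flow[In].mapAsync(1)(input => executeStatement(input))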
But that's not interesting. If you want, you can run operations for different keys in parallel - if processing for different keys is independent, which, I assume from your description, is possible. To do this, you have to buffer the incoming values and then regroup them. Let's see some code:
import monix.reactive.Observable
import scala.concurrent.duration._
import monix.eval.Task

// Your domain logic - I'll use these stubs
trait Event
trait Acknowledgement // whatever your DB functions return, if you need it
def toKey(e: Event): String = ???

def processOne(event: Event): Task[Acknowledgement] = Task.deferFuture {
  event match {
    case _ => ??? // insert/update/delete
  }
}

// Monix Task.traverse is strictly sequential, which is what you need
def processMany(evs: Seq[Event]): Task[Seq[Acknowledgement]] =
  Task.traverse(evs)(processOne)

def processEventStreamInParallel(source: Observable[Event]): Observable[Acknowledgement] =
  source
    // Process a bunch of events, but don't wait too long for whole 100. Fine-tune for your data source
    .bufferTimedAndCounted(2.seconds, 100)
    .concatMap { batch =>
      Observable
        .fromIterable(batch.groupBy(toKey).values) // Standard collection methods FTW
        .mapAsync(3)(processMany) // processing up to 3 different keys in parallel - tho 3 is not necessary, probably depends on your DB throughput
        .flatMap(Observable.fromIterable) // flattening it back
    }
The concatMap operator here will ensure that your chunks are processed sequentially as well. So even if one buffer has key1 -> insert, key1 -> update and the other has key1 -> delete, that causes no problems. In Monix, this is the same as flatMap, but in other Rx libraries flatMap might be an alias for mergeMap which has no ordering guarantee.
This can be done with Futures too, though there's no standard "sequential traverse", so you have to roll your own, something like:
def processMany(evs: Seq[Event]): Future[Seq[Acknowledgement]] =
  evs.foldLeft(Future.successful(Vector.empty[Acknowledgement])) { (acksF, ev) =>
    for {
      acks <- acksF
      next <- processOne(ev)
    } yield acks :+ next
  }
You can use akka-streams subflows, to group by key, then merge substreams if you want to do something with what you get from your database operations:
def databaseOp(input: In): Future[Out] = input match {
  case INSERT => ...
  case UPDATE => ...
  case DELETE => ...
}

val databaseFlow: Flow[In, Out, NotUsed] =
  Flow[In].groupBy(Int.MaxValue, _.key).mapAsync(1)(databaseOp).mergeSubstreams
Note that order from input source won't be kept in output as it is done in mapAsync, but all operations on the same key will still be in order.
You are looking for Future.flatMap:
def doSomething: Future[Unit]
def doSomethingElse: Future[Unit]
val result = doSomething.flatMap { _ => doSomethingElse }
This executes the first function, and then, when its Future is satisfied, starts the second one. The result is a new Future that completes when the result of the second execution is satisfied.
The result of the first future is passed into the function you give to .flatMap, so the second function can depend on the result of the first one. For example:
def getUserID: Future[Int]
def getUser(id: Int): Future[User]
val userName: Future[String] = getUserID.flatMap(getUser).map(_.name)
You can also write this as a for-comprehension:
for {
  id   <- getUserID
  user <- getUser(id)
} yield user.name

Sharing database session between multiple methods in Slick 3

I have recently switched from Slick-2 to Slick-3. Everything is working very well with slick-3. However, I am having some issues when it comes to transaction.
I have seen different questions and sample code in which transactionally and withPinnedSession are used to handle the transaction. But my case is slightly different. Both transactionally and withPinnedSession can be applied to a Query. But what I want to do is to pass the same session to another method which will do some operations, and to wrap multiple methods in the same transaction.
I have the below slick-2 code, I am not sure how this can be implemented with Slick-3.
def insertWithTransaction(row: TTable#TableElementType)(implicit session: Session) = {
  val entity = (query returning query.map(obj => obj) += row).asInstanceOf[TEntity]
  // do some operations after insert
  // e.g. invoke another method for sending the notification
  entity
}

override def insert(row: TTable#TableElementType) = {
  db.withSession { implicit session =>
    insertWithTransaction(row)
  }
}
Now, if someone is not interested in having transactions, they can just invoke the insert() method.
If we need a transaction, it can be done by using insertWithTransaction() inside a db.withTransaction block.
For example:
db.withTransaction { implicit session =>
  insertWithTransaction(row1)
  insertWithTransaction(row2)
  // check some condition, invoke session.rollback if something goes wrong
}
But with slick-3, transactionally can be applied on a query only.
That means it is not possible to do some logic centrally after insertion; every developer needs to handle those scenarios explicitly if they are using transactions. I believe this could potentially cause errors. I am trying to abstract the whole logic into the insert operation so that the implementors only need to worry about the transaction success/failure.
Is there any other way, in slick-3, in which I can pass the same session to multiple methods so that everything can be done in a single db session?
You are missing something: .transactionally doesn't apply to a Query, but to a DBIOAction.
Then, a DBIOAction can be composed of multiple queries by using monadic composition.
Here is an example from the documentation:
val action = (for {
  ns <- coffees.filter(_.name.startsWith("ESPRESSO")).map(_.name).result
  _  <- DBIO.seq(ns.map(n => coffees.filter(_.name === n).delete): _*)
} yield ()).transactionally
action is composed of a select query and as many delete queries as there are rows returned by the first query. All of that creates a DBIOAction that will be executed in a transaction.
Then, to run the action against the database, you have to call db.run, so, like this:
val f: Future[Unit] = db.run(action)
Now, to come back to your example: let's say you want to apply an update query after your insert. You can create an action this way:
val action = (for {
  entity <- (query returning query.map(obj => obj) += row)
  _      <- query.map(_.foo).update(newFoo)
} yield entity).transactionally
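If you want to keep this abstracted in the DAO layer (the original concern), one hedged option is to expose methods that return a DBIO instead of taking an implicit Session; callers can then run them standalone or compose several of them in one transaction. A sketch using the names from the question, where sendNotification is a hypothetical Future-returning follow-up and an implicit ExecutionContext is assumed in scope:

def insertAction(row: TTable#TableElementType): DBIO[TTable#TableElementType] =
  for {
    entity <- (query returning query.map(obj => obj) += row)
    _      <- DBIO.from(sendNotification(entity)) // hypothetical non-db side effect
  } yield entity

// run standalone (auto-commit):
db.run(insertAction(row))

// or compose several calls into a single transaction:
db.run((insertAction(row1) andThen insertAction(row2)).transactionally)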
Hope it helps.

Scala transactional block with for comprehension

Getting stuck with a DAO layer I've created; works fine for the single case, but when needing to persist several bean instances in a transactional block, I find that I have coded myself into a corner. Why? Check out the DAO create method below:
def create(e: Entity): Option[Int] =
  db.handle withSession { implicit ss: Session =>
    catching( mapper.insert(e) ) option match {
      case Some(success) => Some(Query(sequenceID))
      case None => None
    }
  }
Queries that occur within a session block are set to auto-commit, so I can't wrap several persistence operations in a transactional block. For example, here's a simplified for-comprehension that processes new member subscriptions:
val result = for {
  u <- user.dao.create(ubean)
  m <- member.dao.create(mbean)
  o <- order.dao.create(obean)
} yield (u, m, o)

result match {
  case Some((a, b, c)) => // all good
  case _ => // failed, need to rollback here
}
I could manually perform the queries, but that gets ugly fast
db.handle withSession { implicit ss: Session =>
  ss.withTransaction {
    val result = for {
      u <- safe( UserMapper.insert(ubean) )
      ...
    }
    def safe(q: Query[_]) =
      catching( q ) option match {
        case Some(success) => Some(Query(sequenceID))
        case None => None
      }
  }
}
because I then wind up duplicating error handling and have to supply the database, session, etc. all over the application, instead of encapsulating them in the DAO layer.
Does anyone have some sage advice for how to work around this problem? I really like the concision of the for-comprehension (Scala rocks ;-)), ideas appreciated!
OK, unless someone has a better idea, here's what I'm rolling with:
Since DAO to Entity is a 1-to-1 relationship, and ScalaQuery auto-commits queries executed in a session block, performing multiple inserts on separate entities is not possible via my DAO implementation.
The workaround is to create a GenericDAO, one not tied to a particular entity, which provides transactional query functionality, with error handling extracted into a parent trait:
def createMember(...): Boolean = {
  db.handle withSession { implicit ss: Session =>
    ss.withTransaction {
      val result = for {
        u <- safeInsert( UserMapper.insert(ubean) )(ss)
        ...
      }
      ...
At the controller layer the implementation becomes dao.createMember(...), which is quite nice, imo: transactional inserts using a safe for-comprehension, cool stuff.
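The safeInsert helper isn't shown above; a hedged sketch of what such a helper might look like in the parent trait (the exact query and session types depend on the ScalaQuery version, so treat this as illustrative only):

import scala.util.control.Exception.allCatch

// Illustrative sketch: evaluate the persistence call lazily inside the current
// session and turn any thrown exception into None, so the calling
// for-comprehension short-circuits and the surrounding transaction rolls back.
def safeInsert[A](op: => A)(ss: Session): Option[A] =
  allCatch opt op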