Database Exception in Slick 3.0 while batch insert - postgresql

While inserting thousands of records every five seconds through batch inserts in Slick 3, I am getting:
org.postgresql.util.PSQLException: FATAL: sorry, too many clients already
My data access layer looks like this:
val db: CustomPostgresDriver.backend.DatabaseDef = Database.forURL(url, user = user, password = password, driver = jdbcDriver)

override def insertBatch(rowList: List[T#TableElementType]): Future[Long] = {
  val res = db.run(insertBatchQuery(rowList)).map(_.head.toLong).recover {
    case ex: Throwable => RelationalRepositoryUtility.handleBatchOperationErrors(ex)
  }
  //db.close()
  res
}

override def insertBatchQuery(rowList: List[T#TableElementType]): FixedSqlAction[Option[Int], NoStream, Write] = {
  query ++= rowList
}
Closing the connection in insertBatch has no effect; it still gives the same error.
I am calling insertBatch from my code like this:
val temp1 = list1.flatMap { li =>
  Future.sequence(li.map { trip =>
    val data = for {
      tripData <- TripDataRepository.insertQuery(trip.tripData)
      subTripData <- SubTripDataRepository.insertBatchQuery(getUpdatedSubTripDataList(trip.subTripData, tripData.id))
    } yield (tripData, subTripData)
    val res = db.run(data.transactionally)
    res
    //db.close()
  })
}
If I close the connection after my work here, as you can see in the commented code, I get this error:
java.util.concurrent.RejectedExecutionException: Task slick.backend.DatabaseComponent$DatabaseDef$$anon$2#6c3ae2b6 rejected from java.util.concurrent.ThreadPoolExecutor#79d2d4eb[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
After calling the method without Future.sequence, like this:
val temp1 = list.map { trip =>
  val data = for {
    tripData <- TripDataRepository.insertQuery(trip.tripData)
    subTripData <- SubTripDataRepository.insertBatchQuery(getUpdatedSubTripDataList(trip.subTripData, tripData.id))
  } yield (tripData, subTripData)
  val res = db.run(data.transactionally)
  res
}
I still got the "too many clients" error...

The root of this problem is that you are spinning up an unbounded number of Futures simultaneously, each connecting to the database - one per entry in list.
This can be solved by running your inserts in serial, forcing each insert batch to depend on the previous:
// Empty Future for the results. Replace Unit with the correct type - whatever
// the result of each insert is below.
val emptyFuture = Future.successful(Seq.empty[Unit])
// This will only insert one at a time. You could use list.sliding to batch the
// inserts if that was important.
val temp1 = list.foldLeft(emptyFuture) { (previousFuture, trip) =>
  previousFuture.flatMap { previous =>
    // Inner code copied from your example.
    val data = for {
      tripData <- TripDataRepository.insertQuery(trip.tripData)
      subTripData <- SubTripDataRepository.insertBatchQuery(getUpdatedSubTripDataList(trip.subTripData, tripData.id))
    } yield (tripData, subTripData)
    // Map the insert's result onto the accumulator so flatMap still returns a Future.
    db.run(data.transactionally).map(previous :+ _)
  }
}
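To see the serialization concretely, here is a standalone sketch of the same foldLeft pattern using plain Futures, with no Slick involved; fakeInsert is a hypothetical stand-in for db.run(data.transactionally), and it records the order in which work actually begins:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Records the order in which the fake inserts actually start.
val started = scala.collection.mutable.ListBuffer.empty[Int]
def fakeInsert(i: Int): Future[Int] =
  Future { started.synchronized { started += i }; i }

val items = List(1, 2, 3, 4)
val empty: Future[Seq[Int]] = Future.successful(Seq.empty[Int])

// Each step flatMaps on the previous future, so the next "insert" is not even
// constructed until the previous one has completed: at most one is in flight.
val all: Future[Seq[Int]] = items.foldLeft(empty) { (prev, i) =>
  prev.flatMap(acc => fakeInsert(i).map(acc :+ _))
}

println(Await.result(all, 5.seconds))  // List(1, 2, 3, 4)
println(started.toList)                // List(1, 2, 3, 4): strictly in order
```

Because the pool only ever sees one insert at a time, the "too many clients" exhaustion cannot occur, at the cost of losing parallelism.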


inserting a list of objects not working in slick

I am using Slick 3.1.1.
I want to insert a list of objects into the DB (Postgres).
I have written the following code, which works:
for (user <- x.userList) {
  val users = userClass(user.name, user.id)
  val userAction = DBAccess.userTable.insertOrUpdate(users)
  val f = DBAccess.db.run(DBIO.seq(userAction))
  Await.result(f, Duration.Inf)
}
However, this runs multiple DB queries. I was looking for a way to make only a single db.run call, so I wrote something like this:
val userQuery = TableQuery[dbuserTable]
for (user <- x.userList) {
  val users = userClass(user.name, user.id)
  userQuery += users
}
val f = DBAccess.db.run(DBIO.seq(userQuery.result))
Await.result(f, Duration.Inf)
However, this second piece does not write to the DB. Can someone point out where I am going wrong?
I know this is old, but since I just fell on it, I'll give an updated answer.
You can use ++= to insert a list:
val usersTable = TableQuery[dbuserTable]
val listToInsert = x.userList.map(u => userClass(u.name, u.id))
val action = usersTable ++= listToInsert
DBAccess.db.run(action)
This makes only one request that inserts everything at once.
+= doesn't mutate your userQuery. Every iteration of your for loop creates an insert action and then promptly discards it; the userQuery.result you eventually run is just a SELECT, not your inserts.
Try accumulating the insert actions instead of discarding them (note the use of yield):
val usersTable = TableQuery[dbuserTable]
val inserts = for (user <- x.userList) yield {
  val userRow = userClass(user.name, user.id)
  usersTable += userRow
}
DBAccess.db.run(DBIO.seq(inserts: _*))
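The discard-versus-accumulate distinction is plain Scala, independent of Slick. A minimal standalone sketch, with a hypothetical FakeInsert case class standing in for the insert actions:

```scala
// FakeInsert is a hypothetical stand-in for an insert action (usersTable += row).
case class FakeInsert(name: String)

val users = List("alice", "bob")

// A bare for loop evaluates the body and throws each result away - this is
// exactly why the question's second snippet never registered its inserts.
for (u <- users) { FakeInsert(u) }

// for/yield keeps every result, so the actions survive to be sequenced later.
val inserts = for (u <- users) yield FakeInsert(u)
println(inserts)  // List(FakeInsert(alice), FakeInsert(bob))
```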

Scala Tail Recursion java.lang.StackOverflowError

I am iteratively querying a mysql table called txqueue that is growing continuously.
Each successive query only considers rows that were inserted into the txqueue table after the query executed in the previous iteration.
To achieve this, each successive query selects rows from the table where the primary key (seqno field in my example below) exceeds the maximum seqno observed in the previous query.
Any newly inserted rows identified in this way are written into a csv file.
The intention is for this process to run indefinitely.
The tail recursive function below works OK, but after a while it runs into a java.lang.StackOverflowError. The results of each iterative query contain two to three rows, and results are returned every second or so.
Any ideas on how to avoid the java.lang.StackOverflowError?
Is this actually something that can/should be achieved with streaming?
Many thanks for any suggestions.
Here's the code that works for a while:
import java.io.{BufferedWriter, FileWriter}
import java.sql.{Connection, DriverManager}

object TXQImport {
  val driver = "com.mysql.jdbc.Driver"
  val url = "jdbc:mysql://mysqlserveraddress/mysqldb"
  val username = "username"
  val password = "password"
  var connection: Connection = null

  def txImportLoop(startID: BigDecimal): Unit = {
    try {
      Class.forName(driver)
      connection = DriverManager.getConnection(url, username, password)
      val statement = connection.createStatement()
      val newMaxID = statement.executeQuery("SELECT max(seqno) as maxid from txqueue")
      val maxid = new Iterator[BigDecimal] {
        def hasNext = newMaxID.next()
        def next() = newMaxID.getBigDecimal(1)
      }.toStream.max
      val selectStatement = statement.executeQuery("SELECT seqno,someotherfield " +
        " from txqueue where seqno >= " + startID + " and seqno < " + maxid)
      if (startID != maxid) {
        val ts = System.currentTimeMillis
        val file = new java.io.File("F:\\txqueue " + ts + ".txt")
        val bw = new BufferedWriter(new FileWriter(file))
        // Iterate over the ResultSet
        while (selectStatement.next()) {
          bw.write(selectStatement.getString(1) + "," + selectStatement.getString(2))
          bw.newLine()
        }
        bw.close()
      }
      connection.close()
      txImportLoop(maxid)
    }
    catch {
      case e: Exception => e.printStackTrace()
    }
  }

  def main(args: Array[String]) {
    txImportLoop(0)
  }
}
Your function is not tail-recursive, because of the catch at the end: the recursive call sits inside the try, so the runtime must keep every frame alive in case its catch handler has to run.
That's why you end up with a stack overflow.
You should always annotate the functions you intend to be tail-recursive with @scala.annotation.tailrec - it will fail compilation in case tail recursion is impossible, so that you won't be surprised by it at run time.
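The fix is to restructure so the try/catch wraps only the fallible step and the recursive call happens after it, in tail position. A standalone sketch of that shape (no database; fetchMax is a hypothetical stand-in for the JDBC "SELECT max(seqno)" query):

```scala
import scala.annotation.tailrec

// Hypothetical stand-in for the JDBC query: pretend the max grows by one each poll.
def fetchMax(current: BigInt): BigInt = current + 1

// try/catch covers only the fallible work; the recursive call comes after it,
// in tail position, so @tailrec compiles and the stack stays flat.
@tailrec
def txImportLoop(startID: BigInt, iterationsLeft: Int): BigInt = {
  val maxid =
    try fetchMax(startID)
    catch { case e: Exception => e.printStackTrace(); startID }
  if (iterationsLeft == 0) maxid
  else txImportLoop(maxid, iterationsLeft - 1)
}

// A million iterations would overflow the non-tail-recursive original.
println(txImportLoop(0, 1000000))  // 1000001
```

The iterationsLeft bound is only there to make the demo terminate; the real loop can recurse indefinitely once the call is in tail position.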

Slick3 with SQLite - autocommit seems to not be working

I'm trying to write some basic queries with Slick for a SQLite database.
Here is my code:
class MigrationLog(name: String) {
  val migrationEvents = TableQuery[MigrationEventTable]

  lazy val db: Future[SQLiteDriver.backend.DatabaseDef] = {
    val db = Database.forURL(s"jdbc:sqlite:$name.db", driver = "org.sqlite.JDBC")
    val setup = DBIO.seq(migrationEvents.schema.create)
    val createFuture = for {
      tables <- db.run(MTable.getTables)
      createResult <- if (tables.length == 0) db.run(setup) else Future.successful(())
    } yield createResult
    createFuture.map(_ => db)
  }

  val addEvent: (String, String) => Future[String] = (aggregateId, eventType) => {
    val id = java.util.UUID.randomUUID().toString
    val command = DBIO.seq(migrationEvents += (id, aggregateId, None, eventType, "CREATED", System.currentTimeMillis, None))
    db.flatMap(_.run(command).map(_ => id))
  }

  val eventSubmitted: (String, String) => Future[Unit] = (id, batchId) => {
    val q = for { e <- migrationEvents if e.id === id } yield (e.batchId, e.status, e.updatedAt)
    val updateAction = q.update(Some(batchId), "SUBMITTED", Some(System.currentTimeMillis))
    db.map(_.run(updateAction))
  }

  val eventMigrationCompleted: (String, String, String) => Future[Unit] = (batchId, id, status) => {
    val q = for { e <- migrationEvents if e.batchId === batchId && e.id === id } yield (e.status, e.updatedAt)
    val updateAction = q.update(status, Some(System.currentTimeMillis))
    db.map(_.run(updateAction))
  }

  val allEvents = () => {
    db.flatMap(_.run(migrationEvents.result))
  }
}
Here is how I'm using it:
val migrationLog = new MigrationLog("test")
val events = for {
  id <- migrationLog.addEvent("aggregateUserId", "userAccessControl")
  _ <- migrationLog.eventSubmitted(id, "batchID_generated_from_idam")
  _ <- migrationLog.eventMigrationCompleted("batchID_generated_from_idam", id, "Successful")
  events <- migrationLog.allEvents()
} yield events

events.map(_.foreach {
  case (id, aggregateId, batchId, eventType, status, submitted, updatedAt) =>
    println(s"$id $aggregateId $batchId $eventType $status $submitted $updatedAt")
})
The idea is to add event first, then update it with batchId (which also updates status) and then update the status when the job is done. events should contain events with status Successful.
What happens is that after running this code it prints events with status SUBMITTED. If I wait a while and run the same allEvents query, or check the db from the command line using sqlite3, then it is updated correctly.
I'm properly waiting for futures to be resolved before starting the next operation, auto-commit should be enabled by default.
Am I missing something?
Turns out the problem was with db.map(_.run(updateAction)), which returns Future[Future[Int]]: the outer Future completes as soon as the update is submitted, not when it has finished, so the next query could run before the update was committed.
Replacing it with db.flatMap(_.run(updateAction)) solved the issue.
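The same trap can be reproduced with plain Futures, no Slick required. In this sketch, outer plays the role of the Future[DatabaseDef] and the hypothetical work() plays the role of db.run:

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

val done = new AtomicBoolean(false)
val outer: Future[Unit] = Future.successful(())  // plays Future[DatabaseDef]
def work(): Future[Unit] =                       // plays db.run(updateAction)
  Future { Thread.sleep(500); done.set(true) }

// map nests: the outer future completes as soon as work() is *started*.
val nested: Future[Future[Unit]] = outer.map(_ => work())
Await.result(nested, 5.seconds)
val afterMap = done.get      // still false: the inner future is still running

// flatMap flattens: completion now means the work itself has finished.
val flat: Future[Unit] = outer.flatMap(_ => work())
Await.result(flat, 5.seconds)
val afterFlatMap = done.get  // true

println(s"after map: $afterMap, after flatMap: $afterFlatMap")
```

Awaiting the Future[Future[Unit]] only waits for the inner future to be created, which is exactly why the SUBMITTED status was observed before the update landed.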

Run sequential process with scala future

I have two external processes to be run sequentially:
val antProc = Process(Seq(antBatch, "everythingNoJunit"), new File(scriptDir))
val bossProc = Process(Seq(bossBatch, "-DcreateConnectionPools=true"))

val f: Future[Process] = Future {
  println("Run ant...")
  antProc.run
}

f onSuccess {
  case proc =>
    println("Run boss...")
    bossProc.run
}
The result is:
Run ant...
Process finished with exit code 0
How do I run antProc until completion, then bossProc?
The following method achieves the purpose; however, it's not a Future-based approach:
antProc.!<
bossProc.!<
You should be able to do something like this:
val antProc = Process(Seq(antBatch, "everythingNoJunit"), new File(scriptDir))
val bossProc = Process(Seq(bossBatch, "-DcreateConnectionPools=true"))

val antFut: Future[Process] = Future {
  antProc.run
}
val bossFut: Future[Process] = Future {
  bossProc.run
}

val aggFut = for {
  antRes <- antFut
  bossRes <- bossFut
} yield (antRes, bossRes)

aggFut onComplete {
  case tr => println(tr)
}
The result of the aggFut will be a tuple consisting of the ant result and the boss result.
Also, be sure your vm that is running this is not exiting before the async callbacks can occur. If your execution context contains daemon threads then it might exit before completion.
Now if you want bossProc to run only after antProc, the code would look like this:
val antProc = Process(Seq(antBatch, "everythingNoJunit"), new File(scriptDir))
val bossProc = Process(Seq(bossBatch, "-DcreateConnectionPools=true"))

val antFut: Future[Process] = Future {
  antProc.run
}

val aggFut = for {
  antRes <- antFut
  bossRes <- Future { bossProc.run }
} yield (antRes, bossRes)

aggFut onComplete {
  case tr => println(tr)
}
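One subtlety worth noting: Process.run returns as soon as the process is spawned, so the for-comprehension above only guarantees that bossProc starts after antProc *starts*. If "sequentially" must mean "boss starts after ant exits", block on the exit code inside each Future. A standalone sketch using stdlib scala.sys.process, with hypothetical echo commands standing in for the batch scripts:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.sys.process._

// Hypothetical stand-ins for antProc and bossProc; any quick commands work.
val first = Process(Seq("echo", "ant done"))
val second = Process(Seq("echo", "boss done"))

// exitValue() blocks until the process terminates, so the second Future is
// only created - and the second process only started - after the first
// process has fully exited.
val bothExitCodes: Future[(Int, Int)] = for {
  antCode <- Future { first.run().exitValue() }
  bossCode <- Future { second.run().exitValue() }
} yield (antCode, bossCode)

println(Await.result(bothExitCodes, 10.seconds))  // prints (0,0) when both succeed
```

This keeps the Future-based shape the question asked for while matching the blocking semantics of antProc.!< followed by bossProc.!<.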

neo4j embedded scala example inserting nodes

I've been able to run the neo4j scala example using the batch insert with no problems. However, when I try to create Nodes without the unsafe batch inserter, I get no errors but no inserts either.
Here's the sample code:
private def insertNodes(label: String, data: Iterator[Map[String, String]]) = {
  val dynLabel: Label = DynamicLabel.label(label)
  val graphDb = new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH)
  registerShutdownHook(graphDb)
  val tx = graphDb.beginTx()
  for (item <- data) {
    val node: Node = graphDb.createNode(dynLabel)
    node.setProperty("item_id", item("item_id"))
    node.setProperty("title", item("title"))
  }
  tx.success
  graphDb.shutdown()
}
You have to commit the transaction by closing it. I'm not sure of the proper Scala syntax, but you want something like:
val tx = graphDb.beginTx()
try {
  for (item <- data) {
    val node: Node = graphDb.createNode(dynLabel)
    node.setProperty("item_id", item("item_id"))
    node.setProperty("title", item("title"))
  }
  tx.success()
} catch {
  case e: Exception => tx.failure()
} finally {
  tx.close()
}