Where am I iterating over my ResultSet one time too many? or why is it closed? - scala

I am using Anorm to access data in my DB. The DB is written to by another service, which is written in Java and persists data using Ebean.
I have the following scala object
import java.sql.Connection
import scala.concurrent.{ Future, blocking, future }
import scala.concurrent.ExecutionContext.Implicits.global
import anorm.{ SQL, SqlQuery, SqlRow, sqlToSimple, toParameterValue }
import play.api.Logger
import play.api.Play.current
import play.api.db.DB
object Queries {
private val readDataSource: String = play.Configuration.root().getString("data.provider.api.source", "default")
//better IO execution context
import play.api.libs.concurrent.Execution.Implicits.defaultContext
private val dataSetDescription: SqlQuery = SQL("SELECT DISTINCT platform, name FROM data_nugget")
private val identityCreationTime: SqlQuery = SQL("SELECT i.creation_time FROM identity i WHERE platform = {pfm} AND userid = {uid};")
private val identityData: SqlQuery = SQL("SELECT n.name, n.value FROM data_nugget n WHERE platform = {pfm} AND userid = {uid};")
private val playerData: SqlQuery = SQL("SELECT n.platform, n.name, n.value, r.userid, r.registration_time FROM data_nugget n JOIN registration r ON n.platform=r.platform AND n.userid=r.userid WHERE r.playerid = {pid} AND r.application = {app};")
private def withAsyncAnormConnection(function: Connection => Stream[SqlRow]): Future[List[SqlRow]] = {
future {
blocking {
DB.withConnection(readDataSource)(c => function(c)).toList
}
}
}
def fetchDistinctDataNames(): Future[List[SqlRow]] = {
withAsyncAnormConnection(implicit c => dataSetDescription())
}
def fetchIdentityCreationTime(platform: String, userid: String): Future[List[SqlRow]] = {
withAsyncAnormConnection(implicit c => identityCreationTime.on("pfm" -> platform, "uid" -> userid)())
}
def fetchIdentityData(platform: String, userid: String): Future[List[SqlRow]] = {
withAsyncAnormConnection(implicit c => identityData.on("pfm" -> platform, "uid" -> userid)())
}
def fetchRegistrationData(game: String, playerid: String): Future[List[SqlRow]] = {
withAsyncAnormConnection(implicit c => playerData.on("app" -> game, "pid" -> playerid)())
}
}
I use it to wrap the execution of my SQL queries in futures.
Every time I run any of these queries, I get an error with the following stack trace:
(Error,com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073)
com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
com.mysql.jdbc.ResultSetImpl.checkClosed(ResultSetImpl.java:794)
com.mysql.jdbc.ResultSetImpl.next(ResultSetImpl.java:7139)
anorm.Sql$$anonfun$resultSetToStream$1.apply(Anorm.scala:527)
anorm.Sql$$anonfun$resultSetToStream$1.apply(Anorm.scala:527)
anorm.Useful$.unfold(Anorm.scala:315)
anorm.Useful$$anonfun$unfold$1.apply(Anorm.scala:317)
anorm.Useful$$anonfun$unfold$1.apply(Anorm.scala:317)
scala.collection.immutable.Stream$Cons.tail(Stream.scala:1078)
scala.collection.immutable.Stream$Cons.tail(Stream.scala:1070)
scala.collection.immutable.Stream.foreach(Stream.scala:548)
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:178)
scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:45)
scala.collection.TraversableLike$class.to(TraversableLike.scala:629)
scala.collection.AbstractTraversable.to(Traversable.scala:105)
scala.collection.TraversableOnce$class.toList(TraversableOnce.scala:243)
scala.collection.AbstractTraversable.toList(Traversable.scala:105)
controllers.dataprovider.data.Queries$$anonfun$withAsyncAnormConnection$1$$anonfun$apply$1.apply(Queries.scala:31)
controllers.dataprovider.data.Queries$$anonfun$withAsyncAnormConnection$1$$anonfun$apply$1.apply(Queries.scala:31)
scala.concurrent.impl.ExecutionContextImpl$DefaultThreadFactory$$anon$2$$anon$3.block(ExecutionContextImpl.scala:44)
scala.concurrent.forkjoin.ForkJoinPool.managedBlock(ForkJoinPool.java:2803)
scala.concurrent.impl.ExecutionContextImpl$DefaultThreadFactory$$anon$2.blockOn(ExecutionContextImpl.scala:41)
scala.concurrent.package$.blocking(package.scala:50)
controllers.dataprovider.data.Queries$$anonfun$withAsyncAnormConnection$1.apply(Queries.scala:30)
controllers.dataprovider.data.Queries$$anonfun$withAsyncAnormConnection$1.apply(Queries.scala:30)
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
scala.concurrent.forkjoin.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1417)
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:262)
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1478)
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104))
I have already encountered this on previous Java services using JDBC, but here I am not touching the ResultSet, and I am even converting the Stream of rows I receive from the connection to a List as soon as possible.
What is happening? Where am I closing the ResultSet? What did I refactor wrong?
As a note, on the prototype of this service (when everything was in the controller) I used to have the SQL("...") directly in the code with something like that:
future {
blocking {
DB.withConnection(implicit c => {
SQL("SELECT DISTINCT platform, name FROM data_nugget")().map(row => (row[String]("platform"), row[String]("name"))).toArray
})
}
}
and it worked just fine.
PS : Sorry for the long copy/paste of stacktrace and the code ... trying to be detailed.

I solved it myself and it is a very fine line.
I changed this function
private def withAsyncAnormConnection(function: Connection => Stream[SqlRow]): Future[List[SqlRow]] = {
future {
blocking {
DB.withConnection(readDataSource)(c => function(c)).toList
}
}
}
to THIS:
private def withAsyncAnormConnection(function: Connection => Stream[SqlRow]): Future[List[SqlRow]] = {
future {
blocking {
DB.withConnection(readDataSource)(c => function(c).toList)
}
}
}
The trick is that I am using the loan pattern of withConnection, so I need to iterate through the whole Stream to get all the rows before the connection is released.
The connection is alive only within these parentheses: (c => function(c).toList)
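To see why the position of toList matters, here is a minimal sketch that needs no database. FakeResultSet, withFakeConnection and rowStream are hypothetical stand-ins (they are not Anorm or Play APIs); they only imitate a result set that refuses reads after it is closed and a loan-pattern helper that closes it when the block returns. It uses Stream as in the question; on Scala 2.13+ LazyList behaves the same way here.

```scala
import scala.util.Try

// Hypothetical stand-in for a JDBC ResultSet: reading after close() throws.
final class FakeResultSet(rows: List[String]) {
  private var remaining = rows
  private var closed = false
  def close(): Unit = closed = true
  def next(): Option[String] = {
    if (closed) throw new IllegalStateException("Operation not allowed after ResultSet closed")
    remaining match {
      case head :: tail => remaining = tail; Some(head)
      case Nil          => None
    }
  }
}

// Loan pattern: the result set is only usable while `body` is running.
def withFakeConnection[A](rows: List[String])(body: FakeResultSet => A): A = {
  val rs = new FakeResultSet(rows)
  try body(rs) finally rs.close()
}

// Rows are pulled lazily: the tail is only read when somebody asks for it.
def rowStream(rs: FakeResultSet): Stream[String] =
  rs.next() match {
    case Some(row) => row #:: rowStream(rs)
    case None      => Stream.empty
  }

// Forced inside the block: every row is read while the result set is open.
val forcedInside: List[String] =
  withFakeConnection(List("a", "b", "c"))(rs => rowStream(rs).toList)

// Forced outside the block: the tail is read after close(), so this fails.
val forcedOutside: Try[List[String]] =
  Try(withFakeConnection(List("a", "b", "c"))(rs => rowStream(rs)).toList)
```

The second call fails on the second row, not the first, because the head of the Stream was already read inside the block. That matches the stack trace above, where the exception comes from Stream$Cons.tail.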

There's a difference between the code that works for you and the code that does not. In your working example, you call map and then toArray on the lazy Stream of Row instances inside the withConnection block, so the underlying ResultSet is fully processed while the connection is still open. In the non-working example, you call toList on the value returned by withConnection, that is, after the block has finished and the underlying ResultSet has been closed. Note that map on a Stream is itself lazy, so mapping alone would not help; what matters is that the call that forces the Stream (toArray or toList) happens inside the withConnection block.

Related

How to implement a concurrent processing in akka?

I have a method that makes multiple calls to the DB. Since I have not implemented any concurrent processing, the 2nd DB call has to wait until the 1st completes, the 3rd has to wait until the 2nd completes, and so on.
All DB calls are independent of each other. I want all of them to run concurrently.
I am new to the Akka framework.
Could someone please help me with a small sample? References would also help. The application is developed in Scala.
There are three primary ways that you could achieve concurrency for the given example needs.
Futures
For the particular use case that is asked about in the question I would recommend Futures before any akka construct.
Suppose we are given the database calls as functions:
type Data = ???
val dbcall1 : () => Data = ???
val dbcall2 : () => Data = ???
val dbcall3 : () => Data = ???
Concurrency can be easily applied, and then the results can be collected, using Futures:
val f1 = Future { dbcall1() }
val f2 = Future { dbcall2() }
val f3 = Future { dbcall3() }
for {
v1 <- f1
v2 <- f2
v3 <- f3
} {
println(s"All data collected: ${v1}, ${v2}, ${v3}")
}
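One detail of the Futures approach is easy to miss: the futures have to be created before the for comprehension. If they are created inside it, each call only starts after the previous one has finished. Here is a minimal runnable sketch, assuming three hypothetical stand-in functions (dbcall1, dbcall2 and dbcall3 just sleep and return fixed values) and using yield plus Await only to make the combined result observable:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-ins for three independent database calls.
def dbcall1(): Int     = { Thread.sleep(100); 1 }
def dbcall2(): String  = { Thread.sleep(100); "two" }
def dbcall3(): Boolean = { Thread.sleep(100); true }

// All three futures are started here, so the calls overlap.
val f1 = Future(dbcall1())
val f2 = Future(dbcall2())
val f3 = Future(dbcall3())

// The for comprehension only combines results that are already in flight.
val combined: Future[(Int, String, Boolean)] =
  for {
    v1 <- f1
    v2 <- f2
    v3 <- f3
  } yield (v1, v2, v3)

// Blocking is for demonstration only; real code would keep composing the Future.
val result: (Int, String, Boolean) = Await.result(combined, 5.seconds)
```

Writing `v1 <- Future(dbcall1())` directly inside the for comprehension would compile as well, but the second call would then wait for the first.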
Akka Streams
There is a similar Stack Overflow answer which demonstrates how to use the akka-stream library to do concurrent DB querying.
Akka Actors
It is also possible to write an Actor to do the querying:
// A message with no payload; a case object can be matched directly.
case object MakeQuery
class DBActor(dbCall : () => Data) extends Actor {
  override def receive = {
    case MakeQuery => sender ! dbCall()
  }
}
val dbcall1ActorRef = system.actorOf(Props(classOf[DBActor], dbcall1))
However, in this use case Actors are less helpful because you still need to collect all of the data together.
You can either use the same technique as the "Futures" section:
val f1 : Future[Data] = (dbcall1ActorRef ? MakeQuery).mapTo[Data]
for {
v1 <- f1
...
Or you would have to wire the Actors together by hand through the constructor and handle all of the callback logic for waiting on the other Actor:
class WaitingDBActor(dbCall : () => Data, previousActor : ActorRef) extends Actor {
  override def receive = {
    case MakeQuery => previousActor forward MakeQuery
    case previousData : Data => sender ! (dbCall(), previousData)
  }
}
If you want to query a database, you should use something like Slick, a modern database query and access library for Scala.
A quick example of Slick:
case class User(id: Option[Int], first: String, last: String)
class Users(tag: Tag) extends Table[User](tag, "users") {
def id = column[Int]("id", O.PrimaryKey, O.AutoInc)
def first = column[String]("first")
def last = column[String]("last")
def * = (id.?, first, last) <> (User.tupled, User.unapply)
}
val users = TableQuery[Users]
Then you need to create a configuration for your DB:
mydb = {
dataSourceClass = "org.postgresql.ds.PGSimpleDataSource"
properties = {
databaseName = "mydb"
user = "myuser"
password = "secret"
}
numThreads = 10
}
and load the configuration in your code:
val db = Database.forConfig("mydb")
Then run your query with the db.run method, which returns a Future. For example, you can get all rows by calling the result method:
val allRows: Future[Seq[User]] = db.run(users.result)
This query runs without blocking the current thread.
If you have a task that takes a long time to execute, or that calls another service, you should use futures.
An example of that is a simple HTTP call to an external service.
If you have a long-running task that also has to keep mutable state, the best option is Akka actors: the state is encapsulated inside an actor, which solves the concurrency and thread-safety problems as simply as possible. An example of such a task:
import akka.actor.Actor
import scala.concurrent.Future
case class RegisterEndpoint(endpoint: String)
case class NewUpdate(update: String)
class UpdateConsumer extends Actor {
val endpoints = scala.collection.mutable.Set.empty[String]
override def receive: Receive = {
case RegisterEndpoint(endpoint) =>
endpoints += endpoint
case NewUpdate(update) =>
endpoints.foreach { endpoint =>
deliverUpdate(endpoint, update)
}
}
def deliverUpdate(endpoint: String, update: String): Future[Unit] = {
  // Placeholder: () is the Unit value (Unit alone would be the companion object).
  Future.successful(())
}
}
If you want to process a huge amount of live data, a WebSocket connection, a CSV file that grows over time, and so on, the best option is Akka Streams. For example, you can read data from a Kafka topic using the Alpakka Kafka connector.

explicitly close db connection in slick

Here is the code I am trying to optimise:
object UserRepo
{
val users = TableQuery[Users]
val dbName = "db"
lazy val queryAllUsers = for (user <- users) yield user
type UserRow = (Int, String, String, String)
def getAll() : Future[ Seq[UserRow] ] =
{
val db = Database.forConfig( dbName )
val f: Future[Seq[UserRow]] = db.run( queryAllUsers.result )
f.onComplete {
case Success(_) => { db.close() }
case Failure(_) => { db.close() }
}
f
}
}
I am going to have a number of queries to the DB, and I am trying to get rid of the line where I create the DB connection. Is there an execution context I can use to close the connection explicitly, so that the code looks more concise?
Is there a way to get the DB connection that was used within the Future.onComplete scope?
Thanks
As for your comment (explicitly close db connection in slick): normally you create a connection at application startup (or lazily on first use) and close it when the application ends.
This obviously depends on what kind of application you are running:
if you have a DI container, you would probably manage some of this in your DI mechanisms (like Modules in Guice)
if you have a web application, e.g. Play specifically, you would probably use play-slick, which does this initialization / shutting down for you (kind of).
General way (no DI)
The easiest general way (assuming you are not using DI or play-slick) of doing this would be perhaps something like this:
object DbManager {
lazy val db = createDb
private def createDb = {
Database.forConfig("db")
}
def close(): Unit = {
  db.close()
}
}
Then your code would be:
object UserRepo
{
val users = TableQuery[Users]
lazy val queryAllUsers = for (user <- users) yield user
type UserRow = (Int, String, String, String)
def getAll() : Future[ Seq[UserRow] ] =
{
DbManager.db.run( queryAllUsers.result )
}
}
The code above doesn't do any cleanup. That would need to be added to some kind of hook that runs when the application is closing (e.g. in the case of a web application), or you would need to call DbManager.close manually at some specified time (when you are closing the application).
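For a plain JVM application without any framework, one possible hook is the standard library's sys.addShutdownHook. The sketch below assumes a hypothetical FakeDatabase class in place of the real Slick Database, so that it can run without Slick on the classpath; only the close() call is relevant.

```scala
// Hypothetical stand-in for a Slick Database; only close() matters here.
final class FakeDatabase {
  @volatile private var isClosed = false
  def closed: Boolean = isClosed
  def close(): Unit = isClosed = true
}

object DbManager {
  // Created lazily on first use, shared by every repository.
  lazy val db: FakeDatabase = new FakeDatabase
  def close(): Unit = db.close()
}

// Registers the cleanup with the JVM; the body runs when the JVM shuts down.
val hook = sys.addShutdownHook {
  DbManager.close()
}
```

A shutdown hook does not run if the process is killed forcibly, so treat it as a best-effort cleanup. In a Play application the framework's own lifecycle handling is the better place for this.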
Play slick
You would probably need to start from here: https://github.com/playframework/play-slick/tree/master/samples/basic (most basic sample showing play-slick configuration).
Updating your code with this would give:
class UserRepo @Inject() (protected val dbConfigProvider: DatabaseConfigProvider) extends HasDatabaseConfigProvider[JdbcProfile]
{
import driver.api._
val users = TableQuery[Users]
lazy val queryAllUsers = for (user <- users) yield user
type UserRow = (Int, String, String, String)
def getAll() : Future[ Seq[UserRow] ] =
{
db.run( queryAllUsers.result )
}
}
In this scenario you wouldn't call:
UserRepo.getAll
but you would rather need to inject it:
class MyClientCode @Inject() (userRepo: UserRepo) {
...
userRepo.getAll
...
}
You would obviously need to set this up in your configuration, but that should be very straightforward with the sample linked above.
So in short, your Play application will hold the database connection configuration and do all the initialization / cleanup. Your external modules (like the one you described in your comment) would simply pull in DatabaseConfigProvider as a Guice-managed dependency (as shown above).

Database transactions in Play framework scala applications (anorm)

I am developing an application using the Play framework and Scala. I am using Anorm for the data-access layer, and I have a problem I could not solve.
In brief: I want methods in data-access objects (DAOs) to work inside transactions as well as when called alone.
Details:
My data-access layer consists of classes whose methods each execute a particular SQL statement against the database. Traditionally they look like this:
def list() = DB.withConnection { implicit cn =>
...
}
Now I want some methods to be executed within a transaction scope, like traditional select-update service methods, while still being able to run them alone. What I have in mind is something like this:
class Service {
def fooTransacted() = {
inTransaction {
val old = dao.select(id = 2)
val newObj = old.copy(isActive = true)
dao.update(newObj)
}
}
def fooSingle() = {
dao.select(id = 2)
}
}
I tried around several ways, but could not come up with any solution.
What about
// An object rather than a class, so that Dao.foo can be called directly.
object Dao {
  def foo(id: Long)(implicit connection: Connection) = {
    SQL("select * from foo where id={id}").on('id->id).as(...)
  }
}
class Service {
  def withConnection = {
    DB.withConnection { implicit connection =>
      Dao.foo(1)
      Dao.foo(2)
    }
  }
  def withTransaction = {
    DB.withTransaction { implicit connection =>
      Dao.foo(1)
      Dao.foo(2)
    }
  }
}
The solution I've seen used elsewhere (principally in Squeryl), is roughly the following:
import java.sql.Connection
object Helper {
private val conn: ThreadLocal[Connection] = new ThreadLocal
def inTransaction[X](f: Connection => X) = {
conn.get() match {
case null =>
DB.withTransaction { newConn =>
conn.set(newConn)
try f(newConn)
finally conn.set(null)
}
case c => f(c)
}
}
}
This way, the inTransaction method is re-entrant, so there's no harm in calling it redundantly inside dao.select.
If you prefer, you can expose conn via a public method, and change the signature of f to => X - you lose some compile-time safety, but the API is a little cleaner.
One pitfall with this approach is that connections are tied to threads, which may cause problems if you're using futures or actors, and a process can resume on a different thread (this is a tricky area anyway, but one you should be aware of).
You might want to look into Squeryl too - it may already do what you need.
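The re-entrant behaviour of the ThreadLocal helper can be tried out without a database. In this sketch FakeConnection and the counter are hypothetical stand-ins for a real connection and for DB.withTransaction, so it only demonstrates the nesting logic, not actual commit or rollback.

```scala
// Hypothetical stand-in for java.sql.Connection; the id shows which one is in use.
final class FakeConnection(val id: Int)

object Helper {
  private val current = new ThreadLocal[FakeConnection]
  private var opened = 0
  def openedCount: Int = opened

  def inTransaction[X](f: FakeConnection => X): X =
    Option(current.get()) match {
      // Already inside a transaction on this thread: reuse its connection.
      case Some(conn) => f(conn)
      // Outermost call: open a connection and clear it again afterwards.
      case None =>
        opened += 1
        val conn = new FakeConnection(opened)
        current.set(conn)
        try f(conn) finally current.remove()
    }
}

// A DAO-style method that works both alone and inside an outer transaction.
def select(): Int = Helper.inTransaction(conn => conn.id)

// Nested use: both selects share the outer connection.
val nested: (Int, Int, Int) =
  Helper.inTransaction(outer => (outer.id, select(), select()))

// Standalone use: select opens (and releases) its own connection.
val alone: Int = select()
```

Because the connection lives in a ThreadLocal, none of this carries over to code that hops threads, for example a Future body started inside the block.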

How to query OrientDB asynchronously from Play controller?

I am writing a Play (2.2) controller in Scala, which should return the result of a query against OrientDB. Now, I have succeeded in writing a synchronous version of said controller, but I'd like to re-write it to work asynchronously.
My question is: given the code below (just put together for demonstration purposes), how do I rewrite my controller to interact asynchronously with OrientDB (connecting and querying)?
import play.api.mvc.{Action, Controller}
import play.api.libs.json._
import com.orientechnologies.orient.`object`.db.OObjectDatabasePool
import java.util
import com.orientechnologies.orient.core.sql.query.OSQLSynchQuery
import scala.collection.JavaConverters._
object Packages extends Controller {
def packages() = Action { implicit request =>
val db = OObjectDatabasePool.global().acquire("http://localhost:2480", "reader", "reader")
try {
db.getEntityManager().registerEntityClass(classOf[models.Package])
val packages = db.query[util.List[models.Package]](new OSQLSynchQuery[models.Package]("select from Package")).asScala.toSeq
Ok(Json.obj(
"packages" -> Json.toJson(packages)
))
}
finally {
db.close()
}
}
}
EDIT:
Specifically, I wish to use OrientDB's asynchronous API. I know that asynchronous queries are supported by the API, though I'm not sure if you can connect asynchronously as well.
Attempted Solution
Based on Jean's answer, I've tried the following asynchronous implementation, but it fails with the compilation error "value execute is not a member of Nothing (possible cause: maybe a semicolon is missing before 'value execute'?)":
def getPackages(): Future[Seq[models.Package]] = {
val db = openDb
try {
val p = promise[Seq[models.Package]]
val f = p.future
db.command(
new OSQLAsynchQuery[ODocument]("select from Package",
new OCommandResultListener() {
var acc = List[ODocument]()
@Override
def result(iRecord: Any): Boolean = {
val doc = iRecord.asInstanceOf[ODocument]
acc = doc :: acc
true
}
@Override
def end() {
// This is just a dummy
p.success(Seq[models.Package]())
}
// Fails
})).execute()
f
}
finally {
db.close()
}
}
One way could be to create a promise, return the future representing the result of that promise, accumulate the results locally as they arrive, and complete the promise (thus resolving the future) when OrientDB notifies you that the command has completed.
def executeAsync(osql: String, params: Map[String, String] = Map()): Future[List[ODocument]] = {
import scala.concurrent._
val p = promise[List[ODocument]]
val f =p.future
val req: OCommandRequest = database.command(
new OSQLAsynchQuery[ODocument]("select * from animal where name = 'Gipsy'",
new OCommandResultListener() {
var acc = List[ODocument]()
@Override
def result(iRecord:Any):Boolean= {
val doc = iRecord.asInstanceOf[ODocument]
acc=doc::acc
true
}
@Override
def end() {
p.success(acc)
}
}))
req.execute()
f
}
Be careful though, to enable graph navigation and field lazy loading, orientdb objects used to keep an internal reference to the database instance they were loaded from ( or to depend on a threadlocal database connected instance ) for lazily loading elements from the database. Manipulating these objects asynchronously may result in loading errors. I haven't checked changes from 1.6 but that seemed to be deeply embedded in the design.
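The promise-based bridging itself is independent of OrientDB. The following sketch shows the same pattern with hypothetical stand-ins: ResultListener loosely imitates the shape of OCommandResultListener, and runAsync imitates a driver that pushes records from another thread. Neither is a real OrientDB API. Unlike the snippet above, the accumulator is reversed at the end so the records come back in arrival order.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

// Hypothetical callback interface, loosely shaped like a result listener.
trait ResultListener[A] {
  def result(record: A): Boolean // return true to keep receiving records
  def end(): Unit                // called once when the query has finished
}

// Hypothetical async driver: pushes records to the listener on another thread.
def runAsync[A](records: List[A], listener: ResultListener[A]): Unit = {
  val worker = new Thread(new Runnable {
    def run(): Unit = {
      records.foreach(r => listener.result(r))
      listener.end()
    }
  })
  worker.start()
}

// Bridge: accumulate records in the callbacks, complete the promise in end().
def collectAsync[A](records: List[A]): Future[List[A]] = {
  val p = Promise[List[A]]()
  runAsync(records, new ResultListener[A] {
    private var acc = List.empty[A]
    def result(record: A): Boolean = { acc = record :: acc; true }
    def end(): Unit = p.success(acc.reverse)
  })
  p.future
}

// Blocking here only to make the result observable in a script.
val docs: List[String] = Await.result(collectAsync(List("a", "b", "c")), 5.seconds)
```

A real listener should probably also complete the promise with p.failure(...) if the driver reports an error, otherwise the future never completes.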
It's as simple as wrapping the blocking call in a Future.
import play.api.libs.concurrent.Execution.Implicits.defaultContext
import scala.concurrent.Future
object Packages extends Controller {
def packages = Action.async { implicit request =>
val db = OObjectDatabasePool.global().acquire("http://localhost:2480", "reader", "reader")
db.getEntityManager().registerEntityClass(classOf[models.Package])
val futureResult: Future[Result] = Future(
db.query[util.List[models.Package]](new OSQLSynchQuery[models.Package]("select from Package")).asScala.toSeq
).map(
queryResult => Ok(Json.obj("packages" -> Json.toJson(queryResult)))
).recover {
// Handle each of the exception cases legitimately
case e: UnsupportedOperationException => UnsupportedMediaType(e.getMessage)
case e: MappingException => BadRequest(e.getMessage)
case e: MyServiceException => ServiceUnavailable(e.toString)
case e: Throwable => InternalServerError(e.toString + "\n" + e.getStackTraceString)
}
futureResult.onComplete { case _ =>
db.close()
}
futureResult
}
}
Note that I did not compile the code. There is a lot of room to improve the code.
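One possible refinement of the cleanup step is andThen instead of onComplete: the future returned by andThen completes only after the side effect has run, so anyone waiting on it knows the resource is already closed. A minimal sketch, assuming a hypothetical FakeDb class in place of the OrientDB handle:

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-in for a database handle with a blocking query.
final class FakeDb {
  @volatile private var isClosed = false
  def closed: Boolean = isClosed
  def query(): Seq[String] = Seq("pkg-a", "pkg-b")
  def close(): Unit = isClosed = true
}

val db = new FakeDb

// Run the blocking call off the caller's thread, then always close the handle.
val futureResult: Future[Seq[String]] =
  Future(blocking(db.query())).andThen { case _ => db.close() }

// Blocking here only to make the result observable in a script.
val packages: Seq[String] = Await.result(futureResult, 5.seconds)
```

The `case _` pattern matches both Success and Failure, so the handle is closed whether or not the query throws.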

How can I check whether a database table exists in ScalaQuery

I'm trying ScalaQuery, and it is really amazing. I can define a database table using a Scala class and query it easily.
But I would like to know, in the following code, how I can check whether a table exists, so that I don't call 'Table.ddl.create' twice and get an exception when I run this program a second time.
object Users extends Table[(Int, String, String)]("Users") {
def id = column[Int]("id")
def first = column[String]("first")
def last = column[String]("last")
def * = id ~ first ~ last
}
object Main
{
val database = Database.forURL("jdbc:sqlite:sample.db", driver = "org.sqlite.JDBC")
def main(args: Array[String]) {
database withSession {
// How can I know whether table Users is already in the DB?
if ( ??? ) {
Users.ddl.create
}
}
}
}
ScalaQuery version 0.9.4 includes a number of helpful SQL metadata wrapper classes in the org.scalaquery.meta package, such as MTable:
http://scalaquery.org/doc/api/scalaquery-0.9.4/#org.scalaquery.meta.MTable
In the test code for ScalaQuery, we can see examples of these classes being used. In particular, see org.scalaquery.test.MetaTest.
I wrote this little function to give me a map of all the known tables, keyed by table name.
import org.scalaquery.meta.{MTable}
def makeTableMap(dbsess: Session) : Map[String, MTable] = {
val tableList = MTable.getTables.list()(dbsess);
val tableMap = tableList.map{t => (t.name.name, t)}.toMap;
tableMap;
}
So now, before I create an SQL table, I can check "if (!tableMap.contains(tableName))".
This thread is a bit old, but maybe someone will find this useful. All my DAOs include this:
def create = db withSession {
if (!MTable.getTables.list.exists(_.name.name == MyTable.tableName))
MyTable.ddl.create
}
Here's a full solution that checks on application start, using a PostgreSQL DB with the Play Framework:
import globals.DBGlobal
import models.UsersTable
import org.scalaquery.meta.MTable
import org.scalaquery.session.Session
import play.api.GlobalSettings
import play.api.Application
object Global extends GlobalSettings {
override def onStart(app: Application) {
DBGlobal.db.withSession { session : Session =>
import org.scalaquery.session.Database.threadLocalSession
import org.scalaquery.ql.extended.PostgresDriver.Implicit._
if (!makeTableMap(session).contains("tableName")) {
UsersTable.ddl.create(session)
}
}
}
def makeTableMap(dbsess: Session): Map[String, MTable] = {
val tableList = MTable.getTables.list()(dbsess)
val tableMap = tableList.map {
t => (t.name.name, t)
}.toMap
tableMap
}
}
You can use java.sql.DatabaseMetaData (an interface). Depending on your database, more or fewer of its functions might be implemented.
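A minimal sketch of the DatabaseMetaData route, using only standard JDBC calls (getMetaData, getTables and the TABLE_NAME result column). The helper names tableNames and containsTable are made up for this example, and the case-insensitive comparison is an assumption: many databases fold unquoted identifiers to upper or lower case, so an exact match on the name you declared may miss the table.

```scala
import java.sql.Connection
import scala.collection.mutable.ListBuffer

// Lists the names of all tables visible through the given JDBC connection.
def tableNames(conn: Connection): List[String] = {
  // Null catalog and schema mean "do not filter"; "%" matches every table name.
  val rs = conn.getMetaData.getTables(null, null, "%", Array("TABLE"))
  try {
    val names = ListBuffer.empty[String]
    while (rs.next()) names += rs.getString("TABLE_NAME")
    names.toList
  } finally rs.close()
}

// Case-insensitive lookup, since identifier case folding differs between databases.
def containsTable(names: Seq[String], wanted: String): Boolean =
  names.exists(_.equalsIgnoreCase(wanted))
```

With a live connection `conn`, the check before creating a table would be `if (!containsTable(tableNames(conn), "Users")) ...`.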
See also the related discussion here. I personally prefer hezamu's suggestion and extend it as follows to keep it DRY:
def createIfNotExists(tables: TableQuery[_ <: Table[_]]*)(implicit session: Session) {
tables foreach {table => if(MTable.getTables(table.baseTableRow.tableName).list.isEmpty) table.ddl.create}
}
Then you can just create your tables with the implicit session:
db withSession {
implicit session =>
createIfNotExists(table1, table2, ..., tablen)
}
You can define the following method in your DAO implementation (taken from the question "Slick MTable.getTables always fails with Unexpected exception[JdbcSQLException: Invalid value 7 for parameter columnIndex [90008-60]]"). It returns true or false depending on whether any table is defined in your DB:
def checkTable(): Boolean = {
  // Blocks until the metadata query has finished.
  val tables = Await.result(db.run(MTable.getTables), Duration.Inf)
  tables.nonEmpty
}
Or you can check whether some "GIVENTABLENAME" exists by printing the result:
def printTable() ={
val q = db.run(MTable.getTables)
println(Await.result(q, Duration.Inf).toList(0)) //prints first MTable element
println(Await.result(q, Duration.Inf).toList(1))//prints second MTable element
println(Await.result(q, Duration.Inf).toList.toString.contains("MTable(MQName(public.GIVENTABLENAME_pkey),INDEX,null,None,None,None)"))
}
Don't forget to add
import slick.jdbc.meta._
Then call the methods from anywhere with the usual @Inject(). I am using Play 2.4 and play-slick 1.0.0.
Cheers,