Why the following code leaks a Slick database collection? - scala

I wrote the following wrapper around play's Action that will use a function that takes both a session and a request. Here is the first version:
def ActionWithSession[A](bp: BodyParser[A])(f: Session => Request[A] => Result): Action[A] =
Action(bp) {
db.withSession {
session: DbSession =>
request => f(session)(request)
}
}
This version works well (the correct Result is returned to the browser), however each call will leak a database connection. After several calls, I started getting the following exceptions:
java.sql.SQLException: Timed out waiting for a free available connection.
When I change it to the version below (by moving the request => right after the Action, the connection leakage goes away, and it works.
def ActionWithSession[A](bp: BodyParser[A])(f: Session => Request[A] => Result): Action[A] =
Action(bp) { request =>
db.withSession {
session: DbSession =>
f(session)(request)
}
}
Why is the first version causing a connection to leak, and how the second version fixes that?

The first version of the code is not supposed to work. You should not return anything that holds a reference to the Session object from a withSession scope. Here you return a closure which holds such a reference. When the closure is later called by Play, the withSession scope has already been closed and the Session object is invalid. Admittedly, leaking the Session object in a closure happens very easily (and will be caught by Slick in the future).
Here is why it seems to work at first, but leaks the Connection: Session objects acquire a connection lazily. withSession blocks return (or close) the connection at the end of the block if one has been acquired. When you leak an unused Session object from the block however and use it for the first time after the block ended, it still lazily opens the connection, but nothing automatically closes it. We recognized this as undesired behavior a while ago, but didn't fix it yet. The fix we have in mind is disallowing Session object's to acquire connections once their .close method has been called. In your case this would have lead to an exception instead of a leaking connection.
See https://github.com/slick/slick/pull/107
The correct code is indeed the second version you posted, where the returned closure's body contains the whole withSession block, not just its result.

db.withSession receives a function that gets a Session in its first argument and executes it with some session it provides to it. The return value of db.withSession is whatever that function returns.
In the first version, the expression that is passed to withSession evaluates to a function request => f(session)(request), so db.withSession ends up: instantiating a session, instantiates a function object that is bound to that session, close the session (before the function it instantiated is called!), and returns this bound function. Now, Action got exactly what it wanted - a function that takes a Request[A] and gives a Result. However, at the time Play is going to execute this Action, the session is going to be lazily opened, but there is nothing that returns it pack to the pool.
The second version does it right, inside db.withSession we are actually calling f, rather than returning a functin that calls f. This ensures that the call to f is nested inside db.withSession and occurs while the session is acquired.
Hope this helps someone!

Related

How can I perform session based logging in Play Framework

We are currently using the Play Framework and we are using the standard logging mechanism. We have implemented a implicit context to support passing username and session id to all service methods. We want to implement logging so that it is session based. This requires implementing our own logger. This works for our own logs but how do we do the same for basic exception handling and logs as a result. Maybe there is a better way to capture this then with implicits or how can we override the exception handling logging. Essentially, we want to get as many log messages to be associated to the session.
It depends if you are doing reactive style development or standard synchronous development:
If standard synchronous development (i.e. no futures, 1 thread per request) - then I'd recommend you just use MDC, which adds values onto Threadlocal for logging. You can then customise the output in logback / log4j. When you get the username / session (possibly in a Filter or in your controller), you can then set the values there and then and you do not need to pass them around with implicits.
If you are doing reactive development you have a couple options:
You can still use MDC, except you'd have to use a custom Execution Context that effectively copies the MDC values to the thread, since each request could in theory be handled by multiple threads. (as described here: http://code.hootsuite.com/logging-contextual-info-in-an-asynchronous-scala-application/)
The alternative is the solution which I tend to use (and close to what you have now): You could make a class which represents MyAppRequest. Set the username, session info, and anything else, on that. You can continue to pass it around as an implicit. However, instead of using Action.async, you make your own MyAction class which an be used like below
myAction.async { implicit myRequest => //some code }
Inside the myAction, you'd have to catch all Exceptions and deal with future failures, and do the error handling manually instead of relying on the ErrorHandler. I often inject myAction into my Controllers and put common filter functionality in it.
The down side of this is, it is just a manual method. Also I've made MyAppRequest hold a Map of loggable values which can be set anywhere, which means it had to be a mutable map. Also, sometimes you need to make more than one myAction.async. The pro is, it is quite explicit and in your control without too much ExecutionContext/ThreadLocal magic.
Here is some very rough sample code as a starter, for the manual solution:
def logErrorAndRethrow(myrequest:MyRequest, x:Throwable): Nothing = {
//log your error here in the format you like
throw x //you can do this or handle errors how you like
}
class MyRequest {
val attr : mutable.Map[String, String] = new mutable.HashMap[String, String]()
}
//make this a util to inject, or move it into a common parent controller
def myAsync(block: MyRequest => Future[Result] ): Action[AnyContent] = {
val myRequest = new MyRequest()
try {
Action.async(
block(myRequest).recover { case cause => logErrorAndRethrow(myRequest, cause) }
)
} catch {
case x:Throwable =>
logErrorAndRethrow(myRequest, x)
}
}
//the method your Route file refers to
def getStuff = myAsync { request:MyRequest =>
//execute your code here, passing around request as an implicit
Future.successful(Results.Ok)
}

Facing issues with sqlalchemy+postgresql session management

I am using sqlalchemy with PostgreSQL and the Pyramid web framework. Here is my models.py:
engine = create_engine(db_url, pool_recycle=3600,
isolation_level="READ UNCOMMITTED")
Session = scoped_session(sessionmaker(bind=engine))
session = Session()
Base = declarative_base()
Base.metadata.bind = engine
class _BaseMixin(object):
def save(self):
session.add(self)
session.commit()
def delete(self):
session.delete(self)
session.commit()
I am inheriting both the Base and _BaseMixin for my models. For example:
class MyModel(Base, _BaseMixin):
__tablename__ = 'MY_MODELS'
id = Column(Integer, primary_key=True, autoincrement=True)
The reason is that I would like to do something like
m = MyModel()
m.save()
I am facing weird issues with session all the time. Sample error messages include
InvalidRequestError: This session is in 'prepared' state; no further SQL can be emitted within this transaction.
InvalidRequestError: A transaction is already begun. Use subtransactions=True to allow subtransactions.
All I want to do is to commit what I have in the memory into the DB. But intermittently SQLAlchemy throws errors like described above and fails to commit.
Is there something wrong with my approach?
tl;dr The problems is that you're sharing one Session object between threads. It fails, because Session object is not thread-safe itself.
What happens?
You create a Session object, which is bound to current thread (let's call it Thread-1). Then you closure it inside your _BaseMixin. Incoming requests are handled in the different threads (let's call them Thread-2 and Thread-3). When request is handling, you call model.save(), which uses Session object created in Thread-1 from Thread-2 or Thread-3. Multiple requests can run concurrently, which together with thread-unsafe Session object gives you totally indeterministic behaviour.
How to handle?
When using scoped_session(), each time you create new object with Session() it will be bound to current thread. Furthermore if there is a session bound to current thread it will return you existing session instead of creating new one.
So you can move session = Session() from the module level to your save() and delete() methods. It will ensure, that you are always using session from current thread.
class _BaseMixin(object):
def save(self):
session = Session()
session.add(self)
session.commit()
def delete(self):
session = Session()
session.delete(self)
session.commit()
But it looks like duplication and also doesn't make sense to create Session object (it will always return the same object inside current thread). So SA provides you ability to implicitly access session for current thread. It produces clearer code.
class _BaseMixin(object):
def save(self):
Session.add(self)
Session.commit()
def delete(self):
Session.delete(self)
Session.commit()
Please, also note that for normal applications you never want to create Session objects explicitly. But want to use implicit access to thread-local session by using Session.some_method().
Further reading
Contextual/Thread-local Sessions.
When do I construct a Session, when do I commit it, and when do I close it?.

How to call a method in a catch clause on an object defined in a try clause?

I am creating a redis pubsub client in a try-catch block. In the try block, the client is initialised with a callback to forward messages to a client. If there's a problem sending the message to the client, an exception will be thrown, in which case I need to stop the redis client. Here's the code:
try {
val redisClient = RedisPubSub(
channels = Seq(currentUserId.toString),
patterns = Seq(),
onMessage = (pubSubMessage: PubSubMessage) => {
responseObserver.onValue(pubSubMessage.data)
}
)
}
catch {
case e: RuntimeException =>
// redisClient isn't defined here...
redisClient.unsubscribe(currentUserId.toString)
redisClient.stop()
messageStreamResult.complete(Try(true))
responseObserver.onCompleted()
}
The problem is that the redis client val isn't defined in the catch block because there may have been an exception creating it. I also can't move the try-catch block into the callback because there's no way (that I can find) of referring to the redisClient object from within the callback (this doesn't resolve).
To solve this I'm instantiating redisClient as a var outside the try-catch block. Then inside the try block I stop the client and assign a new redisPubSub (created as above) to the redisClient var. That's an ugly hack which is also error prone (e.g. if there genuinely is a problem creating the second client, the catch block will try to call methods on an erroneous object).
Is there a better way of writing this code so that I can correctly call stop() on the redisClient if an exception is raised when trying to send the message to the responseObserver?
Update
I've just solved this using promises. Is there a simpler way though?
That exception handler is not going to be invoked if there is a problem sending the message. It is for problems in setting up the client. This SO answer talks about handling errors when sending messages.
As for the callback referring to the client, I think you want to register the callback after creating the client rather than trying to pass the callback in when you create it. Here is some sample code from Debashish Ghosh that does this.
Presumably that callback is going to run in another thread, so if it uses redisClient you'll have to be careful about concurrency. Ideally the callback could get to the client object through some argument. If not, then perhaps using volatile would be the easiest way to deal with that, although I suspect you'd eventually get into trouble if multiple callbacks can fail at once. Perhaps use an actor to manage the client connection, as Debashish has done?

My http request becomes null inside an Akka future

My server application uses Scalatra, with json4s, and Akka.
Most of the requests it receives are POSTs, and they return immediately to the client with a fixed response. The actual responses are sent asynchronously to a server socket at the client. To do this, I need to getRemoteAddr from the http request. I am trying with the following code:
case class MyJsonParams(foo:String, bar:Int)
class MyServices extends ScalatraServlet {
implicit val formats = DefaultFormats
post("/test") {
withJsonFuture[MyJsonParams]{ params =>
// code that calls request.getRemoteAddr goes here
// sometimes request is null and I get an exception
println(request)
}
}
def withJsonFuture[A](closure: A => Unit)(implicit mf: Manifest[A]) = {
contentType = "text/json"
val params:A = parse(request.body).extract[A]
future{
closure(params)
}
Ok("""{"result":"OK"}""")
}
}
The intention of the withJsonFuture function is to move some boilerplate out of my route processing.
This sometimes works (prints a non-null value for request) and sometimes request is null, which I find quite puzzling. I suspect that I must be "closing over" the request in my future. However, the error also happens with controlled test scenarios when there are no other requests going on. I would imagine request to be immutable (maybe I'm wrong?)
In an attempt to solve the issue, I have changed my code to the following:
case class MyJsonParams(foo:String, bar:Int)
class MyServices extends ScalatraServlet {
implicit val formats = DefaultFormats
post("/test") {
withJsonFuture[MyJsonParams]{ (addr, params) =>
println(addr)
}
}
def withJsonFuture[A](closure: (String, A) => Unit)(implicit mf: Manifest[A]) = {
contentType = "text/json"
val addr = request.getRemoteAddr()
val params:A = parse(request.body).extract[A]
future{
closure(addr, params)
}
Ok("""{"result":"OK"}""")
}
}
This seems to work. However, I really don't know if it is still includes any bad concurrency-related programming practice that could cause an error in the future ("future" meant in its most common sense = what lies ahead :).
Scalatra is not so well suited for asynchronous code. I recently stumbled on the very same problem as you.
The problem is that scalatra tries to make the code as declarative as possible by exposing a dsl that removes as much fuss as possible, and in particular does not require you to explicitly pass data around.
I'll try to explain.
In your example, the code inside post("/test") is an anonymous function. Notice that it does not take any parameter, not even the current request object.
Instead, scalatra will store the current request object inside a thread local value just before it calls your own handler, and you can then get it back through ScalatraServlet.request.
This is the classical Dynamic Scope pattern. It has the advantage that you can write many utility methods that access the current request and call them from your handlers, without explicitly passing the request.
Now, the problem comes when you use asynchronous code, as you do.
In your case, the code inside withJsonFuture executes on another thread than the original thread that the handler was initially called (it will execute on a thread from the ExecutionContext's thread pool).
Thus when accessing the thread local, you are accessing a totally distinct instance of the thread local variable.
Simply put, the classical Dynamic Scope pattern is no fit in an asynchronous context.
The solution here is to capture the request at the very start of your handler, and then exclusively reference that:
post("/test") {
val currentRequest = request
withJsonFuture[MyJsonParams]{ params =>
// code that calls request.getRemoteAddr goes here
// sometimes request is null and I get an exception
println(currentRequest)
}
}
Quite frankly, this is too easy to get wrong IMHO, so I would personally avoid using Scalatra altogether if you are in an synchronous context.
I don't know Scalatra, but it's fishy that you are accessing a value called request that you do not define yourself. My guess is that it is coming as part of extending ScalatraServlet. If that's the case, then it's probably mutable state that it being set (by Scalatra) at the start of the request and then nullified at the end. If that's happening, then your workaround is okay as would be assigning request to another val like val myRequest = request before the future block and then accessing it as myRequest inside of the future and closure.
I do not know scalatra but at first glance, the withJsonFuture function returns an OK but also creates a thread via the future { closure(addr, params) } call.
If that latter thread is run after the OK is processed, the response has been sent and the request is closed/GCed.
Why create a Future to run you closure ?
if withJsonFuture needs to return a Future (again, sorry, I do not know scalatra), you should wrap the whole body of that function in a Future.
Try to put with FutureSupport on your class declaration like this
class MyServices extends ScalatraServlet with FutureSupport {}

Avoiding Squeryl transaction in Play! controllers

I am learning Play! and I have followed the To do List tutorial. Now, I would like to use Squeryl in place of Anorm, so I tried to translate the tutorial, and actually it works.
Still, there is one little thing that irks me. Here is the relevant part of my model
def all: Iterable[Task] = from(tasks) {s => select(s)}
and the corresponding action in the controller to list all tasks
def tasks = Action {
inTransaction {
Ok(views.html.index(Task.all, taskForm))
}
}
The view contains, for instance
<h1>#tasks.size task(s)</h1>
What I do not like is that, unlike in the methods to update or delete tasks, I had to manage the transaction inside the controller action.
If I move inTransaction to the all method, I get an exception,
[RuntimeException: No session is bound to current thread, a session must be created via Session.create and bound to the thread via 'work' or 'bindToCurrentThread' Usually this error occurs when a statement is executed outside of a transaction/inTrasaction block]
because the view tries to obtain the size of tasks, but the transaction is already closed at that point.
Is there a way to use Squeryl transaction only in the model and not expose these details up to the controller level?
Well. It's because of lazy evaluations on Iterable that require session bound (size() method).
This might work if you turn Iterable into List or Vector (IndexedSeq) I suppose.
from(tasks)(s => select(s)).toIndexedSeq //or .toList