Facing issues with sqlalchemy+postgresql session management - postgresql

I am using sqlalchemy with PostgreSQL and the Pyramid web framework. Here is my models.py:
engine = create_engine(db_url, pool_recycle=3600,
                       isolation_level="READ UNCOMMITTED")
Session = scoped_session(sessionmaker(bind=engine))
session = Session()
Base = declarative_base()
Base.metadata.bind = engine
class _BaseMixin(object):
    def save(self):
        session.add(self)
        session.commit()
    def delete(self):
        session.delete(self)
        session.commit()
I am inheriting both the Base and _BaseMixin for my models. For example:
class MyModel(Base, _BaseMixin):
    __tablename__ = 'MY_MODELS'
    id = Column(Integer, primary_key=True, autoincrement=True)
The reason is that I would like to do something like
m = MyModel()
m.save()
I am facing weird issues with session all the time. Sample error messages include
InvalidRequestError: This session is in 'prepared' state; no further SQL can be emitted within this transaction.
InvalidRequestError: A transaction is already begun. Use subtransactions=True to allow subtransactions.
All I want to do is to commit what I have in the memory into the DB. But intermittently SQLAlchemy throws errors like described above and fails to commit.
Is there something wrong with my approach?

tl;dr The problem is that you're sharing one Session object between threads. It fails because the Session object itself is not thread-safe.
What happens?
You create a Session object that is bound to the current thread (let's call it Thread-1) and then close over it in your _BaseMixin. Incoming requests are handled in different threads (let's call them Thread-2 and Thread-3). While a request is being handled, you call model.save(), which uses the Session object created in Thread-1 from Thread-2 or Thread-3. Multiple requests can run concurrently, which, combined with the thread-unsafe Session object, gives you completely nondeterministic behaviour.
How to handle?
When using scoped_session(), each new object you create with Session() is bound to the current thread. Furthermore, if a session is already bound to the current thread, Session() returns that existing session instead of creating a new one.
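To make the thread-local behaviour concrete, here is a minimal standard-library sketch of the idea. It is not SQLAlchemy's actual implementation: ThreadLocalRegistry, FakeSession and session_in_other_thread are hypothetical names used only for illustration, and the real scoped_session does considerably more.

```python
import threading


class ThreadLocalRegistry:
    """Simplified stand-in for the registry idea behind scoped_session (illustrative only)."""

    def __init__(self, factory):
        self._factory = factory          # callable that builds a new session-like object
        self._local = threading.local()  # separate storage slot for every thread

    def __call__(self):
        # Return the current thread's object, creating it on first use.
        if not hasattr(self._local, "value"):
            self._local.value = self._factory()
        return self._local.value


class FakeSession:
    """Hypothetical placeholder for a real Session."""


# Plays the role of Session = scoped_session(sessionmaker(...)).
Session = ThreadLocalRegistry(FakeSession)


def session_in_other_thread():
    """Return the session object that a freshly started thread gets."""
    seen = []
    worker = threading.Thread(target=lambda: seen.append(Session()))
    worker.start()
    worker.join()
    return seen[0]
```

Calling Session() twice in the same thread yields the same object, while session_in_other_thread() yields a different one, which is why a session created once at module level ends up shared across request threads.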
So you can move session = Session() from the module level into your save() and delete() methods. This ensures that you always use the session that belongs to the current thread.
class _BaseMixin(object):
    def save(self):
        session = Session()
        session.add(self)
        session.commit()
    def delete(self):
        session = Session()
        session.delete(self)
        session.commit()
But this duplicates code, and calling Session() explicitly is redundant, since it always returns the same object within the current thread. SQLAlchemy therefore lets you access the current thread's session implicitly through the scoped_session object itself, which produces cleaner code.
class _BaseMixin(object):
    def save(self):
        Session.add(self)
        Session.commit()
    def delete(self):
        Session.delete(self)
        Session.commit()
Also note that in a typical application you rarely need to create Session objects explicitly. Instead, use implicit access to the thread-local session by calling Session.some_method().
Further reading
Contextual/Thread-local Sessions.
When do I construct a Session, when do I commit it, and when do I close it?.

Related

Actor accessing things out of scope

I'm using the Akka libraries.
What happens when multiple actors call a function on an object? Would this block other actors from accessing the object?
The reason I ask this is because I want to use JBCrypt with akka actors. And since we can encrypt multiple strings concurrently I have each actor calling JBcrypt.hash(...). Not sure how it works since I think, in scala, objects exist in one place, and I feel like multiple actors using the same object (library) might block the concurrency from actually happening.
Multiple actors calling a function in an object that calls a library will not block unless the library being called uses concurrency control mechanisms such as synchronized, ThreadLocal or an object lock.
For example, calling print on the below Printer object will block:
class BlockingPrinter() {
  def print(s: String) = synchronized { s }
}
object Printer {
  val printer = new BlockingPrinter()
  def print(str: String) = printer.print(str)
}
But calling it on the below Printer object will not:
class NonBlockingPrinter() {
  def print(s: String) = s
}
object Printer {
  val printer = new NonBlockingPrinter()
  def print(str: String) = printer.print(str)
}
In summary, the library that you're calling is the one that decides how concurrency is handled. Not the fact that you're calling an object.
It depends on how the function is implemented. If the function accesses some internal mutable state and synchronizes in order to achieve thread safety, callers can block one another. If it is a pure function that does not access any external state, it is safe. If the function does have mutable state, it must at least confine that state to itself.

Is it wise to keep a MongoDB session open for the lifetime of an object?

I have a struct that abstracts MongoDB for specific objects in my system, and during the factory function for it I set a few variables like hostname and other connection information. At the moment in each of the UserService's methods* it creates a new session, does what it needs to, then closes the session when it's done.
The system itself is the backend of a web application, so the lifetime of each UserService struct is usually only a few seconds at most, but more than one operation might be performed for each request that comes in. So I'm wondering whether to make session a member of the struct, initialise it when I create my UserService, and then ensure the request handler function that creates it calls a close function once it's done all the operations it needs to. I haven't been able to find anything that encourages or discourages this behaviour so I'm not really sure if the mgo package I'm using is designed that way.
This is my proposed code:
type UserService struct {
	callingUserId id.ID        // The ID of the user calling this service.
	host          string       // The host name.
	session       *mgo.Session // The active session (mgo.Dial returns a pointer).
}
func NewUserService(userId id.ID) *UserService {
	nus := new(UserService)
	nus.callingUserId = userId
	nus.host = "localhost" // TODO change this to read from file or buffer.
	session, sessionErr := mgo.Dial(nus.host)
	if sessionErr != nil {
		// TODO log error.
	}
	nus.session = session
	return nus
}
func (us UserService) GetById(usrId id.ID) *users.User {
	// TODO get a user from the database with the matching ID using the existing session.
	return nil
}
There will also be a function called UserService.CloseSession, which as I mentioned, will be called after all the required operations are done by the handler function that creates the service.
Does anyone know if this is a sensible idea or should I just call mgo.Dial in every function on UserService?
Opening a connection to MongoDB or any other DB is an expensive task that should be done once and kept operative as much as possible inside your service.
You have three options:
A local package level MongoDB instance.
Using Contexts and Composing Handlers.
A method receiver which holds the MongoDB Instance.
This is a very informative blog post for handling DBs connections.
Take a look at this example, especially for MongoDB.

Slick and play controller testing

I have a Play controller method:
def insertDepartment = Action(parse.json) { request =>
  MyDataSourceProvider.db.withSession { implicit session =>
    val departmentRow = DepartmentRow(1, Option("Department1"))
    departmentService.insert(departmentRow)
  }
}
Note that MyDataSourceProvider.db provides a slick.driver.PostgresDriver.simple.Database, and withSession provides an implicit session to departmentService.insert.
When I test departmentService, the session is provided by a test fixture as mentioned in this post. sessionWrapper is a simple function which creates a session, provides that session to a test block, and rolls back the data after the test finishes.
sessionWrapper { implicit session =>
  val departmentRow = DepartmentRow(1, Option("Department1"))
  departmentService.insert(departmentRow)
}
This works nicely and, as expected, does not pollute the database when the service tests run. Tests should not persist anything in the db, but roll back after executing successfully.
Now, when testing the Play controller, I need a way to use sessionWrapper so that controller tests roll back in the same way as the service tests.
Note the MyDataSourceProvider.db.withSession call in the controller's insertDepartment.
Wrapping the controller test with sessionWrapper has no effect, since the controller def doesn't accept an implicit session but uses the one from MyDataSourceProvider.db.withSession.
What's the best way to handle this? I tried creating a controller trait so that a different implementation can be mixed in for tests and for real code, but I haven't found a way to "pass" a session for tests and not for production code. Any ideas?
Since Slick is blocking, you don't need Action.async. I wonder why that even compiles, because I don't see a future there, but I am not that familiar with Play.
There are several alternatives for what you can do:
My favorite: not use transaction rollback for testing, but use a test database, which you re-create for each test.
pull out
val departmentRow = DepartmentRow(1, Option("Department1"))
departmentService.insert(departmentRow)
into a method and only test that method, not the controller
use sessionWrapper in the controller and let it check for a configuration flag, that tells it if it is in test mode and should do a rollback, or if it is in production mode.
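The third alternative can be sketched in a language-agnostic way. The following Python example uses the standard-library sqlite3 module instead of Slick, so it only illustrates the pattern: session_wrapper, the test_mode flag, make_db and the department table are all hypothetical names chosen for this sketch, not part of Slick or Play.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def session_wrapper(conn, test_mode):
    """Yield the connection; roll back in test mode, commit otherwise."""
    try:
        yield conn
    except Exception:
        conn.rollback()  # never keep partial work after a failure
        raise
    else:
        if test_mode:
            conn.rollback()  # tests leave the database untouched
        else:
            conn.commit()    # production persists the changes


def make_db():
    """Create an in-memory database with a hypothetical department table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT)")
    conn.commit()
    return conn


def insert_department(conn, name):
    conn.execute("INSERT INTO department (name) VALUES (?)", (name,))


def count_departments(conn):
    return conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]
```

The controller would call session_wrapper with a flag read from configuration, so the same code path runs in tests and in production and only the final commit-or-rollback decision differs.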

Logging user activity in another logical layer without global HttpContext

I'm looking for a good solution to log DB changes in a web application developed using Play/Scala/ReactiveMongo. I need to know who changed what.
I have a separate layer namely Services in which all data access and business logics happens. All saves/updates/removes are done by certain methods so I can log them safely in just 3 or 4 methods but I need user identity there.
I don't have access to current user in services! There is no global HttpContext or Request or something like that which let me get the user identity (I think this way of getting user identity was incorrect of course).
I have a solution:
Add an implicit parameter to all methods of services which have side effects (change DB) and pass user identity to them.
def save(model: A)(implicit userIdentity: Option[UserIdentity] = None) = { ... }
As I wrapped default request, it can extend UserIdentity trait so that the implicit request matches the implicit parameters.
class MyRequest[A](...) extends WrappedRequest[A](request) with UserIdentity
Finally, actions can use services like this:
def index() = MyAction { implicit request =>
  //...
  defaultService.save(model)
  //...
}
The bad thing is that I have to add implicit parameters to those service methods. Isn't there another solution to get the current user without polluting method signatures?
What's the problem with simply adding UserIdentity as an argument to your functions? Knowing who the user is seems to be important for your business logic - after all, today you want to log who performed the operation, tomorrow you will want to make sure this particular user is allowed to do it.
And I would just use a real UserIdentity object, not some hack with WrappedRequest, your services don't need to mess with a WrappedRequest instance.

akka: sharing mutable state

I need to have one global variable (singleton) that will change very infrequently. Actually, it only changes when the actor restarts and reinitializes the variable. Since I cannot do this with a singleton val in a companion object, I have to declare it as a var (mutable).
object UserDatabase {
  var dbConnection = "" // initializing db connection
}
Many guidelines that I have read advise against sharing mutable state. So I moved the variable into a class and use message passing to retrieve it.
class UserDatabase extends Actor {
  val dbConnection = "" // initializing db connection locally
  def receive = { case GetConnection => self.reply(dbConnection) }
}
The problem is that dbConnection is accessed very frequently by many, many actors, and continuously sending messages will reduce performance (since Akka processes the mailbox one message at a time).
I don't see how I can do this without sacrificing performance. Any idea?
Perhaps use an Agent instead? http://akka.io/docs/akka/1.2-RC6/scala/agents.html
First of all, have you actually measured or noticed a performance reduction? Since messaging is lightweight, perhaps it's fast enough for your application.
Then, a possible solution: if the "global" state is written rarely but accessed very often, you can choose a push strategy. Every time it changes, the UserDatabase actor sends the updated value to interested actors. You can then use a publish/subscribe approach, rely on the actor registry, use a pool of actors, etc.
class UserDatabase extends Actor {
  var dbConnection = "" // initializing db connection locally
  def receive = {
    case SetConnection(newConnection) if dbConnection != newConnection => {
      dbConnection = newConnection
      sendUpdatedConnection() // sends the change to every relevant actor
    }
  }
}
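The push strategy itself does not depend on Akka. As a rough, single-threaded illustration in Python (not actor-based, and with ConnectionPublisher and Worker as hypothetical names), each consumer keeps a local copy that the owner updates only when the value changes:

```python
class ConnectionPublisher:
    """Owns the rarely changing value and pushes updates to subscribers."""

    def __init__(self, connection):
        self._connection = connection
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)
        callback(self._connection)  # push the current value right away

    def set_connection(self, new_connection):
        # Mirror the guard in the actor: ignore updates that change nothing.
        if new_connection == self._connection:
            return
        self._connection = new_connection
        for callback in self._subscribers:
            callback(new_connection)


class Worker:
    """Reads its cached copy instead of asking the owner on every access."""

    def __init__(self, publisher):
        self.connection = None
        self.updates = 0
        publisher.subscribe(self._on_update)

    def _on_update(self, connection):
        self.connection = connection
        self.updates += 1
```

Reads then cost nothing beyond a local attribute access, and the only messages exchanged are the rare updates. In an actor system the callbacks would be messages sent to the subscribed actors.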
If you don't need to use the variable very often in any case, it might be simpler and more efficient to make it a java.util.concurrent.atomic.AtomicReference or wrap every access to it in a synchronized block (on the variable). Actors don't always make things easier and safer, just usually.
Create many actors as routees of a RoundRobinRouter.
Make each actor handle a connection and actually handling the DB logic.