How can I convert my mgo sessions to mongo-go-driver clients using connection pooling? - mongodb

Long, long ago, when we were using mgo.v2, we created some wrapper functions that copied the session, set the read pref and returned that for consumption by other libraries, e.g.
func NewMonotonicConnection() (conn *Connection, success bool) {
	// conn is a named return value, so assign with = rather than :=.
	conn = &Connection{
		session: baseSession.Copy(),
	}
	conn.session.SetMode(mgo.Monotonic, true)
	return conn, true
}
We now just pass the default client (initialized using mongo.Connect and passed into a connection singleton) in an init function and then consumed like this:
func NewMonotonicConnection() (conn *Connection, success bool) {
	conn = defaultConnection
	return conn, true
}
My understanding is that to leverage connection pooling, you need to use the same client (which is contained in defaultConnection), and that sessions are now handled implicitly inside the .All()/cursor teardown. Please correct me if I'm wrong here.
It would be nice if we could still set the readpref on these connections (e.g. set NearestMode on this connection before returning), but what's the community/standard way of doing that?
I know I could call mongo.Connect over and over again, but is that expensive?
I could create different clients - each client with a different readpref - but I was thinking that if a write occurred on that connection, it wouldn't ever go back to reading from a slave.
It looks like I can create sessions explicitly, but I'm not certain whether I should, or whether managing them explicitly in the new driver has any implications.

There are a couple things I learned on this quest through the mongo-go-driver codebase that I thought I should share with the world before closing this question. If I'm wrong here - please correct me.
You should not call Connect() over and over if you want to leverage connection pooling. It looked like each call to Connect() created a new socket, so there is a risk of socket exhaustion over time unless you defer a Disconnect() on every client you create.
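A minimal, standard-library-only sketch of the connect-once idea. `fakeClient`, `dial` and `dials` are hypothetical stand-ins (for *mongo.Client and mongo.Connect); the sketch only shows that sync.Once lets any number of callers share a single dial.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fakeClient is a hypothetical stand-in for *mongo.Client.
type fakeClient struct{ id int64 }

var dialCount int64 // number of times dial has run

// dial is a hypothetical stand-in for mongo.Connect.
func dial() *fakeClient {
	return &fakeClient{id: atomic.AddInt64(&dialCount, 1)}
}

// dials reports how many times dial has been called so far.
func dials() int64 { return atomic.LoadInt64(&dialCount) }

var (
	once   sync.Once
	shared *fakeClient
)

// defaultClient returns the process-wide client, dialing on first use only.
func defaultClient() *fakeClient {
	once.Do(func() { shared = dial() })
	return shared
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = defaultClient() // every goroutine gets the same client
		}()
	}
	wg.Wait()
	fmt.Println("dials:", dials())
}
```

With a real driver, the function passed to once.Do would be the only place that connects, and the matching disconnect would happen once at shutdown.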
In mongo-go-driver, sessions are automatically handled under the covers when you make the call to execute the query (e.g. All()). You can explicitly create and tear down a session, but you can't consume it using the singleton approach I proposed above without changing all the caller functions.
This is because you can no longer call query operations on the session; instead, you have to consume it using a WithSession function at the DB operation itself.
I realized that writeconcern, readpref and readconcern can all be set at the:
client level (these are the defaults that everything will use if not overridden)
session level
database level
query level
So what I did was create Database options and embed *mongo.Database in a wrapper type, e.g.:
// Database is a meta-helper that allows us to wrap and overload
// the standard *mongo.Database type
type Database struct {
*mongo.Database
}
// NewEventualConnection returns a new instantiated Connection
// to the DB using the 'Nearest' read preference.
// Per https://github.com/go-mgo/mgo/blob/v2/session.go#L61
// Eventual is the same as Nearest, but may change servers between reads.
// Nearest: The driver reads from a member whose network latency falls within
// the acceptable latency window. Reads in the nearest mode do not consider
// whether a member is a primary or secondary when routing read operations;
// primaries and secondaries are treated equivalently.
func NewEventualConnection() (conn *Connection, success bool) {
	conn = &Connection{
		client: baseConnection.client,
		dbOptions: options.Database().
			SetReadConcern(readconcern.Local()).
			SetReadPreference(readpref.Nearest()).
			SetWriteConcern(writeconcern.New(
				writeconcern.W(1))),
	}
	return conn, true
}
// GetDB returns an overloaded Database object
func (conn Connection) GetDB(dbname string) *Database {
	// Reuse the shared client (and its pool); only the options differ.
	return &Database{conn.client.Database(dbname, conn.dbOptions)}
}
This allows me to leverage connection pooling and maintain backwards compatibility with our codebase. Hopefully this helps someone else.

Related

Release a Connection borrowed from ConnectionPool

ScalikeJDBC's ConnectionPool docs page says:
Borrowing Connections
Simply just call #borrow method.
import scalikejdbc._
val conn: java.sql.Connection = ConnectionPool.borrow()
val conn: java.sql.Connection = ConnectionPool('named).borrow()
Be careful. The connection object should be released by yourself.
However there's no mention of how to do it.
I can always do Connection.close() but by 'releasing' Connection,
I understand that I'm supposed to return the Connection back to the ConnectionPool and not close it (otherwise the purpose of having a ConnectionPool would be defeated).
My doubts are:
In general, what does 'releasing' a Connection (that has been borrowed from ConnectionPool) mean?
In ScalikeJDBC, how do I 'release' a Connection borrowed from ConnectionPool?
Calling close is fine. As per the Oracle docs: "Closing a connection instance that was obtained from a pooled connection does not close the physical database connection." The DBConnection in scalikejdbc just wraps the java.sql.Connection and delegates calls to close. The usual way of doing this with scalikejdbc is with the using function, which is essentially an implementation of Java's try-with-resources.
See Closing JDBC Connections in Pool for a similar discussion on JDBC.
Upon a second look into the docs, ScalikeJDBC does provide a using method that implements the loan pattern and automatically returns the connection to the ConnectionPool.
So you can borrow a connection, use it, and return it to the pool as follows:
import scalikejdbc.{ConnectionPool, using}
import java.sql.Connection
using(ConnectionPool.get("poolName").borrow()) { (connection: Connection) =>
// use connection (only once) here
}
// connection automatically returned to pool
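For comparison, the same loan pattern can be sketched in Go with the standard library only. The `pool`, `conn` and `withConn` names are hypothetical and not part of ScalikeJDBC or any driver; the point is that the borrower never returns the connection itself, so it cannot forget to.

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a hypothetical pooled resource.
type conn struct{ id int }

// pool is a fixed-size pool backed by a buffered channel.
type pool struct{ idle chan *conn }

// newPool creates a pool pre-filled with size connections.
func newPool(size int) *pool {
	p := &pool{idle: make(chan *conn, size)}
	for i := 0; i < size; i++ {
		p.idle <- &conn{id: i}
	}
	return p
}

// withConn borrows a connection, lends it to fn, and always puts it back
// in the pool, even if fn returns an error or panics.
func (p *pool) withConn(fn func(*conn) error) error {
	c := <-p.idle
	defer func() { p.idle <- c }()
	return fn(c)
}

// available reports how many connections are currently idle.
func (p *pool) available() int { return len(p.idle) }

func main() {
	p := newPool(2)
	err := p.withConn(func(c *conn) error {
		return errors.New("query failed") // failure still returns the conn
	})
	fmt.Println("error:", err, "idle:", p.available())
}
```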

Is it wise to keep a MongoDB session open for the lifetime of an object?

I have a struct that abstracts MongoDB for specific objects in my system, and during the factory function for it I set a few variables like hostname and other connection information. At the moment, each of the UserService's methods creates a new session, does what it needs to, then closes the session when it's done.
The system itself is the backend of a web application, so the lifetime of each UserService struct is usually only a few seconds at most, but more than one operation might be performed for each request that comes in. So I'm wondering whether to make session a member of the struct, initialise it when I create my UserService, and then ensure the request handler function that creates it calls a close function once it's done all the operations it needs to. I haven't been able to find anything that encourages or discourages this behaviour so I'm not really sure if the mgo package I'm using is designed that way.
This is my proposed code:
type UserService struct {
	callingUserId id.ID        //The ID of the user calling this service.
	host          string       //The host name.
	session       *mgo.Session //The active session (mgo.Dial returns a pointer).
}
func NewUserService(userId id.ID) *UserService {
	nus := new(UserService)
	nus.callingUserId = userId
	nus.host = "localhost" //TODO change this to read from file or buffer.
	var sessionErr error
	nus.session, sessionErr = mgo.Dial(nus.host)
	if sessionErr != nil {
		//TODO log error.
	}
	return nus
}
func (us UserService) GetById(usrId id.ID) *users.User {
	//TODO get a user from the database with the matching ID using the existing session.
	return nil // placeholder until the lookup is implemented
}
There will also be a function called UserService.CloseSession, which as I mentioned, will be called after all the required operations are done by the handler function that creates the service.
Does anyone know if this is a sensible idea or should I just call mgo.Dial in every function on UserService?
Opening a connection to MongoDB or any other DB is an expensive task that should be done once and kept operative as much as possible inside your service.
You have three options:
A local package level MongoDB instance.
Using Contexts and Composing Handlers.
A method receiver which holds the MongoDB Instance.
This is a very informative blog post for handling DBs connections.
Take a look at this example, especially for MongoDB.
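As a rough illustration of the third option (a method receiver that holds the database instance), here is a standard-library-only sketch. `store` is a hypothetical stand-in for a handle opened once at startup, for example the session returned by a single mgo.Dial; the field and function names are assumptions, not the asker's real code.

```go
package main

import "fmt"

// store is a hypothetical stand-in for a long-lived database handle
// that is opened once when the process starts.
type store struct{ users map[string]string }

// UserService keeps a reference to the shared store; it never dials itself.
type UserService struct {
	callingUserID string
	db            *store
}

// NewUserService builds a cheap, per-request service around the shared handle.
func NewUserService(db *store, callingUserID string) *UserService {
	return &UserService{callingUserID: callingUserID, db: db}
}

// GetByID looks up a user name; ok is false when the ID is unknown.
func (us *UserService) GetByID(id string) (name string, ok bool) {
	name, ok = us.db.users[id]
	return name, ok
}

func main() {
	// Opened once at startup and reused by every request.
	shared := &store{users: map[string]string{"42": "alice"}}

	a := NewUserService(shared, "1")
	b := NewUserService(shared, "2")

	name, ok := a.GetByID("42")
	fmt.Println(name, ok, a.db == b.db)
}
```

Creating a UserService per request stays cheap because only the small struct is allocated; the expensive handle is shared.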

How can I safely return an object to a pool after asynchronous usage?

Given I have a pool of objects, how can I safely return a pooled object to the pool after usage if the usage is asynchronous? (using Future and Promise in this case)
Here's an example:
pool
  .take
  .flatMap { connection =>
    connection
      .sendQuery("SELECT 0")
      .map { query =>
        pool.giveBack(connection)
        query.rows.get(0, 0)
      }
  }
The problem here is that if the sendQuery call fails, the object will never be returned to the pool. Is there some kind of pipeline sequence for futures that would allow me to safely return this object to the pool even if the code itself fails to do so, or should I just ignore this?
The pool implementation is this one and the pooled object is this one.
My main objective here is to make the pool usage be as little error prone as possible and as it stands currently it's clearly not that, since the programmer could forget to return the object and the pool would quickly exhaust itself.
You're looking for the "andThen" method on Future.
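The underlying idea (run the give-back step whether the asynchronous work succeeds or fails) can be sketched in Go, where a goroutine and defer play the role of the future and its completion callback. This is only an analogue under assumed names; `conn`, `sendQuery`, `result` and `queryAsync` are hypothetical and not the pool library from the question.

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a hypothetical pooled connection; fail forces sendQuery to error.
type conn struct{ fail bool }

// sendQuery is a hypothetical query call that may fail.
func (c *conn) sendQuery(q string) (string, error) {
	if c.fail {
		return "", errors.New("query failed")
	}
	return "0", nil
}

// result carries the outcome of an asynchronous query.
type result struct {
	row string
	err error
}

// queryAsync runs the query on its own goroutine. The connection goes back
// to the pool on success and on failure, before the result is published.
func queryAsync(pool chan *conn, q string) <-chan result {
	out := make(chan result, 1)
	go func() {
		row, err := func() (string, error) {
			c := <-pool                  // take
			defer func() { pool <- c }() // give back on every path
			return c.sendQuery(q)
		}()
		out <- result{row: row, err: err}
	}()
	return out
}

func main() {
	pool := make(chan *conn, 1)
	pool <- &conn{fail: true}

	r := <-queryAsync(pool, "SELECT 0")
	fmt.Println("error:", r.err, "idle connections:", len(pool))
}
```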

akka: sharing mutable state

I need to have one global variable (singleton) that will change very infrequently. Actually, it only changes when the actor restarts and reinitializes the variable. Since I cannot do this with a singleton val in a companion object, I have to declare it as a var (mutable).
object UserDatabase {
var dbConnection = "" // initializing db connection
}
Many guidelines that I have read advise against sharing mutable state, so I moved the variable into a class and use message passing to retrieve it.
class UserDatabase extends Actor{
val dbConnection = "" // initializing db connection locally
def receive = {case GetConnection => self.reply(dbConnection)}
}
The problem is that dbConnection is accessed very frequently by a great many actors, and continuously sending messages will reduce performance (since Akka processes each mailbox one message at a time).
I don't see how I can do this without sacrificing performance. Any idea?
Perhaps use an Agent instead? http://akka.io/docs/akka/1.2-RC6/scala/agents.html
First of all, have you actually measured or noticed a performance reduction? Since messaging is lightweight, perhaps it's fast enough for your application.
Then, a possible solution: If the "global" state is written rarely, but accessed very often, you can choose a push strategy. Every time it changes, the UserDatabase actor will send the updated value to interested actors. You can then use a publish/subscribe approach, rely on the actor register, use a pool of actors, etc.
class UserDatabase extends Actor {
  var dbConnection = "" // initializing db connection locally
  def receive = {
    case SetConnection( newConnection ) if dbConnection != newConnection => {
      dbConnection = newConnection
      sendUpdatedConnection(); // sends the change to every relevant actor
    }
  }
}
If you don't need to use the variable very often in any case, it might be simpler and more efficient to make it a java.util.concurrent.atomic.AtomicReference or wrap every access to it in a synchronized block (on the variable). Actors don't always make things easier and safer, just usually.
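As a rough Go analogue of the atomic-reference suggestion, the sketch below keeps a rarely written, frequently read value in a sync/atomic.Value. The `setConnection` and `getConnection` helpers and the connection strings are hypothetical; readers take no lock and never queue behind a mailbox.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// current holds the connection string. Loads and stores are atomic,
// so many readers can run concurrently with an occasional writer.
var current atomic.Value

// setConnection publishes a new value (rare, e.g. on restart).
func setConnection(s string) { current.Store(s) }

// getConnection returns the latest published value,
// or the empty string if nothing has been stored yet.
func getConnection() string {
	s, _ := current.Load().(string)
	return s
}

func main() {
	setConnection("db://primary")

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = getConnection() // frequent, lock-free reads
		}()
	}

	setConnection("db://replacement") // infrequent write
	wg.Wait()
	fmt.Println(getConnection())
}
```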
Create many actors as routees of a RoundRobinRouter.
Make each actor hold its own connection and perform the DB logic itself.

Class Design: Demeter vs. Connection Lifetimes

Okay, so here's a problem I'm running into.
I have some classes in my application that have methods that require a database connection. I am torn between two different ways to design the classes, both of which are centered around dependency injection:
Provide a property for the connection that is set by the caller prior to method invocation. This has a few drawbacks.
Every method relying on the connection property has to validate that property to ensure that it isn't null, it's open and not involved in a transaction if that's going to muck up the operation.
If the connection property is unexpectedly closed, all the methods have to either (1.) throw an exception or (2.) coerce it open. Depending on the level of robustness you want, either case is appropriate. (Note that this is different from a connection that is passed to a method in that the reference to the connection exists for the lifetime of the object, not simply for the lifetime of the method invocation. Consequently, the volatility of the connection just seems higher to me.)
Providing a Connection property seems (to me, anyway) to scream out for a corresponding Transaction property. This creates additional overhead in the documentation, since you'd have to make it fairly obvious when the transaction was being used, and when it wasn't.
On the other hand, Microsoft seems to favor the whole set-and-invoke paradigm.
Require the connection to be passed as an argument to the method. This has a few advantages and disadvantages:
The parameter list is naturally larger. This is irksome to me, primarily at the point of call.
While a connection (and a transaction) must still be validated prior to use, the reference to it exists only for the duration of the method call.
The point of call is, however, quite clear. It's very obvious that you must provide the connection, and that the method won't be creating one behind your back automagically.
If a method doesn't require a transaction (say a method that only retrieves data from the database), no transaction is required. There's no lack of clarity due to the method signature.
If a method requires a transaction, it's very clear due to the method signature. Again, there's no lack of clarity.
Because the class does not expose a Connection or a Transaction property, there's no chance of callers trying to drill down through them to their properties and methods, thus enforcing the Law of Demeter.
I know, it's a lot. But on the one hand, there's the Microsoft Way: Provide properties, let the caller set the properties, and then invoke methods. That way, you don't have to create complex constructors or factory methods and the like. Also, avoid methods with lots of arguments.
Then, there's the simple fact that if I expose these two properties on my objects, they'll tend to encourage consumers to use them in nefarious ways. (Not that I'm responsible for that, but still.) But I just don't really want to write crappy code.
If you were in my shoes, what would you do?
Here is a third pattern to consider:
Create a class called ConnectionScope, which provides access to a connection
Any class at any time, can create a ConnectionScope
ConnectionScope has a property called Connection, which always returns a valid connection
Any (and every) ConnectionScope gives access to the same underlying connection object (within some scope, maybe within the same thread, or process)
You then are free to implement that Connection property however you want, and your classes don't have a property that needs to be set, nor is the connection a parameter, nor do they need to worry about opening or closing connections.
More details:
In C#, I'd recommend ConnectionScope implement IDisposable, that way your classes can write code like "using ( var scope = new ConnectionScope() )" and then ConnectionScope can free the connection (if appropriate) when it is destroyed
If you can limit yourself to one connection per thread (or process) then you can easily set the connection string in a [thread] static variable in ConnectionScope
You can then use reference counting to ensure that your single connection is re-used when it's already open and that connections are released when no one is using them
Updated: Here is some simplified sample code:
public class ConnectionScope : IDisposable
{
    private static Connection m_Connection;
    private static int m_ReferenceCount;

    public Connection Connection
    {
        get
        {
            return m_Connection;
        }
    }

    public ConnectionScope()
    {
        if ( m_Connection == null )
        {
            m_Connection = OpenConnection();
        }
        m_ReferenceCount++;
    }

    public void Dispose()
    {
        m_ReferenceCount--;
        if ( m_ReferenceCount == 0 )
        {
            m_Connection.Dispose();
            m_Connection = null;
        }
    }
}
Example code of how one (any) of your classes would use it:
using ( var scope = new ConnectionScope() )
{
scope.Connection.ExecuteCommand( ... )
}
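The reference-counting idea behind ConnectionScope can also be sketched in Go. This is a hedged, standard-library-only analogue: `connection`, `acquire`, `release`, `withScope` and `opened` are hypothetical names, and unlike the simplified C# sample it guards the counter with a mutex so scopes may be opened from several goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

// connection is a hypothetical stand-in for a real database connection.
type connection struct{ open bool }

var (
	mu       sync.Mutex
	shared   *connection
	refCount int
	opens    int // how many times a connection was actually opened
)

// acquire returns the shared connection, opening it on first use.
func acquire() *connection {
	mu.Lock()
	defer mu.Unlock()
	if shared == nil {
		shared = &connection{open: true}
		opens++
	}
	refCount++
	return shared
}

// release drops one reference and closes the connection when none remain.
func release() {
	mu.Lock()
	defer mu.Unlock()
	refCount--
	if refCount == 0 {
		shared.open = false
		shared = nil
	}
}

// withScope mirrors the C# using block: acquire, run fn, always release.
func withScope(fn func(*connection)) {
	c := acquire()
	defer release()
	fn(c)
}

// opened reports how many connections have been opened so far.
func opened() int {
	mu.Lock()
	defer mu.Unlock()
	return opens
}

func main() {
	withScope(func(outer *connection) {
		withScope(func(inner *connection) {
			fmt.Println("same connection:", inner == outer)
		})
		fmt.Println("still open:", outer.open)
	})
	fmt.Println("opens:", opened())
}
```

Nested scopes reuse the one open connection, and it is closed only when the outermost scope ends.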
I would prefer the latter method. It sounds like your classes use the database connection as a conduit to the persistence layer. Making the caller pass in the database connection makes it clear that this is the case. If the connection/transaction were represented as a property of the object, then things are not so clear and all of the ownership and lifetime issues come out. Better to avoid them from the start.