Is it wise to keep a MongoDB session open for the lifetime of an object? - mongodb

I have a struct that abstracts MongoDB for specific objects in my system, and in its factory function I set a few variables like hostname and other connection information. At the moment, each of the UserService's methods creates a new session, does what it needs to, then closes the session when it's done.
The system itself is the backend of a web application, so the lifetime of each UserService struct is usually only a few seconds at most, but more than one operation might be performed for each request that comes in. So I'm wondering whether to make session a member of the struct, initialise it when I create my UserService, and then ensure the request handler function that creates it calls a close function once it's done all the operations it needs to. I haven't been able to find anything that encourages or discourages this behaviour so I'm not really sure if the mgo package I'm using is designed that way.
This is my proposed code:
type UserService struct {
callingUserId id.ID //The ID of the user calling this service.
host string //The host name.
session *mgo.Session //The active session (mgo.Dial returns a pointer).
}
func NewUserService(userId id.ID) *UserService {
nus := new(UserService)
nus.callingUserId = userId
nus.host = "localhost" //TODO change this to read from file or buffer.
var sessionErr error
nus.session, sessionErr = mgo.Dial(nus.host)
if sessionErr != nil {
//TODO log error.
}
return nus
}
func (us UserService) GetById(usrId id.ID) *users.User {
//TODO get a user from the database with the matching ID using the existing session.
return nil
}
There will also be a function called UserService.CloseSession, which as I mentioned, will be called after all the required operations are done by the handler function that creates the service.
Does anyone know if this is a sensible idea or should I just call mgo.Dial in every function on UserService?

Opening a connection to MongoDB or any other DB is an expensive task that should be done once and kept operative as much as possible inside your service.
You have three options:
A local package level MongoDB instance.
Using Contexts and Composing Handlers.
A method receiver which holds the MongoDB Instance.
This is a very informative blog post for handling DBs connections.
Take a look at this example, especially for MongoDB.
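As a rough illustration of the first and third options combined, here is a minimal sketch that dials once at package level and hands the shared handle to each short-lived service. It uses only the standard library: Store, dial and sharedStore are hypothetical stand-ins for a real database handle and its setup, not part of mgo. With mgo itself, a common variant is to dial once and give each request its own session.Copy(), closing the copy when the request is done.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical stand-in for a long-lived database handle
// (for example a dialed session); it is not part of any real driver.
type Store struct {
	host string
}

var (
	dialOnce  sync.Once
	sharedDB  *Store
	dialCount int // counts how often the expensive dial actually ran
)

// dial simulates the expensive connection setup.
func dial(host string) *Store {
	dialCount++
	return &Store{host: host}
}

// sharedStore returns the process-wide handle, dialing only on first use.
func sharedStore() *Store {
	dialOnce.Do(func() { sharedDB = dial("localhost") })
	return sharedDB
}

// UserService receives the shared handle instead of dialing on its own.
type UserService struct {
	callingUserID string
	store         *Store
}

// NewUserService builds a short-lived service around an existing handle.
func NewUserService(userID string, store *Store) *UserService {
	return &UserService{callingUserID: userID, store: store}
}

func main() {
	// Two "requests" each build their own service around one handle.
	a := NewUserService("alice", sharedStore())
	b := NewUserService("bob", sharedStore())
	fmt.Println(a.store == b.store, dialCount)
}
```

The services stay cheap to create per request because the costly step runs once, no matter how many of them are built.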

Related

How can I convert my mgo sessions to mongo-go-driver clients using connection pooling?

Long, long ago, when we were using mgo.v2, we created some wrapper functions that copied the session, set the read pref and returned that for consumption by other libraries, e.g.
func NewMonotonicConnection() (conn *Connection, success bool) {
conn = &Connection{
session: baseSession.Copy(),
}
conn.session.SetMode(mgo.Monotonic, true)
return conn, true
}
We now just pass the default client (initialized using mongo.Connect and passed into a connection singleton) in an init function and then consumed like this:
func NewMonotonicConnection() (conn *Connection, success bool) {
conn = defaultConnection
return conn, true
}
My understanding is that to leverage connection pooling, you need to use the same client (which is contained in defaultConnection), and the session is now handled implicitly inside the .All()/cursor teardown. Please correct me if I'm wrong here.
It would be nice if we could still set the readpref on these connections (e.g. set NearestMode on this connection before returning), but what's the community/standard way of doing that?
I know I could call mongo.Connect over and over again, but is that expensive?
I could create different clients, each with a different readpref, but I was thinking that if a write occurred on that connection, it wouldn't ever go back to reading from a secondary.
It looks like I can create sessions explicitly, but I'm not certain I should, or whether there are any implications around managing those explicitly in the new driver.
There are a couple things I learned on this quest through the mongo-go-driver codebase that I thought I should share with the world before closing this question. If I'm wrong here - please correct me.
You should not call Connect() over and over if you want to leverage connection pooling. It looked like each time Connect() was called, a new socket was created. This means that there's a risk of socket exhaustion over time unless you manually call Disconnect() on each client when you are done with it.
In mongo-go-driver, sessions are now handled automatically under the covers when you make the call to execute the query (e.g. All()). You can explicitly create and tear down a session, but you can't consume it using the singleton approach I proposed above without having to change all the caller functions.
This is because you can no longer call query operations on the session; you instead have to consume it using a WithSession function at the DB operation itself.
I realized that writeconcern, readpref and readconcern can all be set at the:
client level (these are the defaults that everything will use if not overridden)
session level
database level
query level
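To make the "defaults unless overridden" idea concrete, here is a small standard-library sketch of the fallback. ReadPref and effective are hypothetical names used only for illustration, not driver types, and the sketch assumes that a more specific level simply wins over a less specific one.

```go
package main

import "fmt"

// ReadPref is a hypothetical stand-in for a driver read preference;
// the empty string means "not set at this level".
type ReadPref string

// effective returns the most specific setting that is present.
// Levels are passed from least specific to most specific.
func effective(levels ...ReadPref) ReadPref {
	result := ReadPref("")
	for _, level := range levels {
		if level != "" {
			result = level
		}
	}
	return result
}

func main() {
	client := ReadPref("primary")   // client-wide default
	database := ReadPref("nearest") // override on one database handle
	query := ReadPref("")           // nothing set on the operation

	fmt.Println(effective(client, database, query))
}
```

Here the database-level value is used because the operation sets nothing, and the client default would only apply if both more specific levels were empty.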
So what I did is create Database options and wrap *mongo.Database by embedding it, e.g.:
// Database is a meta-helper that allows us to wrap and overload
// the standard *mongo.Database type
type Database struct {
*mongo.Database
}
// NewEventualConnection returns a new instantiated Connection
// to the DB using the 'Nearest' read preference.
// Per https://github.com/go-mgo/mgo/blob/v2/session.go#L61
// Eventual is the same as Nearest, but may change servers between reads.
// Nearest: The driver reads from a member whose network latency falls within
// the acceptable latency window. Reads in the nearest mode do not consider
// whether a member is a primary or secondary when routing read operations;
// primaries and secondaries are treated equivalently.
func NewEventualConnection() (conn *Connection, success bool) {
conn = &Connection{
client: baseConnection.client,
dbOptions: options.Database().
SetReadConcern(readconcern.Local()).
SetReadPreference(readpref.Nearest()).
SetWriteConcern(writeconcern.New(
writeconcern.W(1))),
}
return conn, true
}
// GetDB returns a wrapped Database object that carries the connection's options
func (conn Connection) GetDB(dbname string) *Database {
return &Database{conn.client.Database(dbname, conn.dbOptions)}
}
This allows me to leverage connection pooling and maintain backwards compatibility with our codebase. Hopefully this helps someone else.

Service Fabric, determine if specific actor exists

We are using Azure Service Fabric and are using actors to model specific devices, using the ID of the device as the ActorId. Service Fabric will instantiate a new actor instance when we request an actor for a given ID if it is not already instantiated, but I cannot seem to find an API that allows me to query whether a specific device ID already has an instantiated actor.
I understand that there might be some distributed/timing issues in obtaining the point-in-time truth but for our specific purpose, we do not need a hard realtime answer to this but can settle for a best guess. We would just like to, in theory, contact the current primary for the specific partition resolved by the ActorId and get back whether or not the device has an instantiated actor.
Ideally it is a fast/performant call, essentially faster than e.g. instantiating the actor and calling a method to understand if it has been initialized correctly and is not just an "empty" actor.
You can use the ActorServiceProxy to iterate through the information for a specific partition but that does not seem to be a very performant way of obtaining the information.
Anyone with insights into this?
The only official way to check whether the actor has previously been activated in any service partition is to use the ActorServiceProxy query, as described here:
IActorService actorServiceProxy = ActorServiceProxy.Create(
new Uri("fabric:/MyApp/MyService"), partitionKey);
ContinuationToken continuationToken = null;
ActorInformation actor = null;
do
{
PagedResult<ActorInformation> page = await actorServiceProxy.GetActorsAsync(continuationToken, cancellationToken);
// Stop paging as soon as the actor is found in this partition.
actor = page.Items.FirstOrDefault(x => x.ActorId == idToFind);
continuationToken = page.ContinuationToken;
}
while (actor == null && continuationToken != null);
By their nature, SF actors are virtual, which means they always exist even if you have never activated them, so this check is a bit harder to do.
As you said, it is not performant to query all actors, so here are some other workarounds you could try:
1. Store the IDs in a Reliable Dictionary elsewhere: every time an actor is activated, you raise an event and insert the ActorId into the dictionary if it is not there yet.
- You can use the OnActivateAsync() actor event to notify its creation, or
- You can use the custom actor factory in the ActorService to register actor activation.
- You can store the dictionary in another actor or in another StatefulService.
2. Create a property in the actor that is set by the actor itself when it is activated.
- OnActivateAsync() checks whether this property has been set before.
- If it is not set yet, you set a new value and store a flag in a variable (a non-persisted value) to say the actor is new.
- Whenever you interact with the actor, you clear this flag to indicate it is not new anymore.
- On the next activation, the property will already be set, and nothing should happen.
3. Create a custom IActorStateProvider to do the same as in option 2, but handle it one level underneath the actor instead of in the actor itself. Honestly, I think it is a bit of work and would only be handy if you have to do the same for many actor types; options 1 and 2 would be much easier.
4. Do as Peter Bons suggested and store the ActorId outside the ActorService, for example in a DB. I would only suggest this option if you have to check this from outside the cluster.
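The bookkeeping behind the first workaround can be sketched in a language-neutral way. The Go snippet below is only an in-memory illustration under stated assumptions: ActivationRegistry, Register and Exists are hypothetical names, and a real implementation would keep the set in a Reliable Dictionary rather than in a process-local map.

```go
package main

import (
	"fmt"
	"sync"
)

// ActivationRegistry is a hypothetical in-memory stand-in for the reliable
// dictionary: it remembers which actor IDs have been activated.
type ActivationRegistry struct {
	mu   sync.RWMutex
	seen map[string]struct{}
}

// NewActivationRegistry returns an empty registry.
func NewActivationRegistry() *ActivationRegistry {
	return &ActivationRegistry{seen: make(map[string]struct{})}
}

// Register records an activation and reports whether the ID was new.
func (r *ActivationRegistry) Register(actorID string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.seen[actorID]; ok {
		return false
	}
	r.seen[actorID] = struct{}{}
	return true
}

// Exists answers the lookup without activating anything.
func (r *ActivationRegistry) Exists(actorID string) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	_, ok := r.seen[actorID]
	return ok
}

func main() {
	reg := NewActivationRegistry()
	fmt.Println(reg.Register("device-42")) // first activation
	fmt.Println(reg.Register("device-42")) // repeated activation
	fmt.Println(reg.Exists("device-42"), reg.Exists("device-7"))
}
```

The lookup never touches the actor itself, which is the point of the workaround: asking whether a device has an actor does not cause one to be activated.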
The following snippet can help you if you want to manage these events outside the actor.
private static void Main()
{
try
{
ActorRuntime.RegisterActorAsync<NetCoreActorService>(
(context, actorType) => new ActorService(context, actorType,
new Func<ActorService, ActorId, ActorBase>((actorService, actorId) =>
{
RegisterActor(actorId);//The custom method to register the actor if new
return (ActorBase)Activator.CreateInstance(actorType.ImplementationType, actorService, actorId);
})
)).GetAwaiter().GetResult();
Thread.Sleep(Timeout.Infinite);
}
catch (Exception e)
{
ActorEventSource.Current.ActorHostInitializationFailed(e.ToString());
throw;
}
}
private static void RegisterActor(ActorId actorId)
{
//Here you will put the logic to register elsewhere the actor creation
}
Alternatively, you could create a stateful DeviceActorStatusActor which would be notified (called) by DeviceActor as soon as it's created. (Share the ActorId for correlation.)
Depending on your needs you can also register multiple Actors with the same status-tracking actor.
You'll have great performance and near real-time information.

$_POST and static variables

I am trying to ensure that I only instantiate a class once. In my function I have a static variable. When $_POST receives a value, it calls the function with the value received. Why is this static variable overwritten each time? $aClass is always NULL. Is the $_POST call via HTML resetting everything?
function RunFunctions($inputStr)
{
static $aClass;
var_dump($aClass);
if (!isset($aClass))
{
$aClass = new CreateNewClass();
$aClass->SetNumber($inputStr);
}
}
You didn't give us full details on your PHP setup, but most PHP environments are start-from-scratch on each request. What this means is that the entire program is initialized from scratch including your static variables each time.
If you're not using an optimizer, the PHP code itself can be compiled from scratch for each request as well.
A lot of PHP requests are handled by separate system processes, meaning that even if data could be shared between requests, it would only work for requests handled by the same process.
If you want to share data between requests, consider storing it in a cookie that the user's browser automatically sends with each request, or store it in a local database.

Using a static instance of DbProviderFactory in ADO.NET?

I am using the following code in my ASP.NET app. According to this code, there will be only a single instance of DbProviderFactory for all users of the app, so all users would use the same DbProviderFactory object to create connections. Will this create hidden problems in a multi-user environment?
The reason I am using a static instance of DbProviderFactory is so that the GetFactory method is not called every time a connection needs to be instantiated. This, I think, would make it quicker to get a connection object. Any flaw in my reasoning?
public class DatabaseAccess
{
private static readonly DbProviderFactory _dbProviderFactory =
DbProviderFactories.GetFactory(System.Configuration.ConfigurationManager.ConnectionStrings["DB"].ProviderName);
public static DbConnection GetDbConnection()
{
DbConnection con = _dbProviderFactory.CreateConnection();
con.ConnectionString = System.Web.Configuration.WebConfigurationManager.ConnectionStrings["DB"].ConnectionString;
return con;
}
}
It looks fine, but probably will not create interesting efficiencies.
Object creation in .NET is quick. So creating the factory doesn't take a lot of time. Acquiring the connection from a remote database does, but with connection pooling, this normally isn't an issue.
The factory doesn't appear to hold any state of its own and looks like it's probably immutable, so access from different threads is probably okay.
Static objects aren't garbage collected. I doubt the factory will grow in size, so this shouldn't be a problem.
So you avoid a bunch of cheap object creation and a bunch of cheap background garbage collections, and you take on a minor risk that a derived class actually has state and is not thread safe, depending on the exact implementation returned by GetFactory.

Class Design: Demeter vs. Connection Lifetimes

Okay, so here's a problem I'm running into.
I have some classes in my application that have methods that require a database connection. I am torn between two different ways to design the classes, both of which are centered around dependency injection:
Provide a property for the connection that is set by the caller prior to method invocation. This has a few drawbacks.
Every method relying on the connection property has to validate that property to ensure that it isn't null, it's open and not involved in a transaction if that's going to muck up the operation.
If the connection property is unexpectedly closed, all the methods have to either (1.) throw an exception or (2.) coerce it open. Depending on the level of robustness you want, either case is appropriate. (Note that this is different from a connection that is passed to a method in that the reference to the connection exists for the lifetime of the object, not simply for the lifetime of the method invocation. Consequently, the volatility of the connection just seems higher to me.)
Providing a Connection property seems (to me, anyway) to scream out for a corresponding Transaction property. This creates additional overhead in the documentation, since you'd have to make it fairly obvious when the transaction was being used, and when it wasn't.
On the other hand, Microsoft seems to favor the whole set-and-invoke paradigm.
Require the connection to be passed as an argument to the method. This has a few advantages and disadvantages:
The parameter list is naturally larger. This is irksome to me, primarily at the point of call.
While a connection (and a transaction) must still be validated prior to use, the reference to it exists only for the duration of the method call.
The point of call is, however, quite clear. It's very obvious that you must provide the connection, and that the method won't be creating one behind your back automagically.
If a method doesn't require a transaction (say a method that only retrieves data from the database), no transaction is required. There's no lack of clarity due to the method signature.
If a method requires a transaction, it's very clear due to the method signature. Again, there's no lack of clarity.
Because the class does not expose a Connection or a Transaction property, there's no chance of callers trying to drill down through them to their properties and methods, thus enforcing the Law of Demeter.
I know, it's a lot. But on the one hand, there's the Microsoft Way: Provide properties, let the caller set the properties, and then invoke methods. That way, you don't have to create complex constructors or factory methods and the like. Also, avoid methods with lots of arguments.
Then, there's the simple fact that if I expose these two properties on my objects, they'll tend to encourage consumers to use them in nefarious ways. (Not that I'm responsible for that, but still.) But I just don't really want to write crappy code.
If you were in my shoes, what would you do?
Here is a third pattern to consider:
Create a class called ConnectionScope, which provides access to a connection
Any class at any time, can create a ConnectionScope
ConnectionScope has a property called Connection, which always returns a valid connection
Any (and every) ConnectionScope gives access to the same underlying connection object (within some scope, maybe within the same thread, or process)
You then are free to implement that Connection property however you want, and your classes don't have a property that needs to be set, nor is the connection a parameter, nor do they need to worry about opening or closing connections.
More details:
In C#, I'd recommend that ConnectionScope implement IDisposable. That way your classes can write code like "using ( var scope = new ConnectionScope() )", and ConnectionScope can free the connection (if appropriate) when it is disposed.
If you can limit yourself to one connection per thread (or process), then you can easily set the connection string in a [thread] static variable in ConnectionScope.
You can then use reference counting to ensure that your single connection is re-used when it's already open and that connections are released when no one is using them.
Updated: Here is some simplified sample code:
public class ConnectionScope : IDisposable
{
private static Connection m_Connection;
private static int m_ReferenceCount;
public Connection Connection
{
get
{
return m_Connection;
}
}
public ConnectionScope()
{
if ( m_Connection == null )
{
m_Connection = OpenConnection();
}
m_ReferenceCount++;
}
public void Dispose()
{
m_ReferenceCount--;
if ( m_ReferenceCount == 0 )
{
m_Connection.Dispose();
m_Connection = null;
}
}
}
Example code of how one (any) of your classes would use it:
using ( var scope = new ConnectionScope() )
{
scope.Connection.ExecuteCommand( ... )
}
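The simplified sample above does not guard its static fields, so it assumes single-threaded use. As a hedged sketch of the same reference-counting idea with locking added, here is a Go version; conn, Scope, Acquire and Release are hypothetical names chosen for illustration, and the connection is a dummy value rather than a real database handle.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a real database connection.
type conn struct{ open bool }

var (
	mu       sync.Mutex
	shared   *conn
	refCount int
)

// Scope hands out the shared connection for as long as it is held.
type Scope struct {
	released bool
}

// Acquire opens the shared connection on first use and bumps the count.
func Acquire() *Scope {
	mu.Lock()
	defer mu.Unlock()
	if shared == nil {
		shared = &conn{open: true}
	}
	refCount++
	return &Scope{}
}

// Conn returns the connection shared by every live Scope.
func (s *Scope) Conn() *conn {
	mu.Lock()
	defer mu.Unlock()
	return shared
}

// Release drops one reference; the last release closes the connection.
// Calling it twice on the same Scope is a no-op.
func (s *Scope) Release() {
	mu.Lock()
	defer mu.Unlock()
	if s.released {
		return
	}
	s.released = true
	refCount--
	if refCount == 0 {
		shared.open = false
		shared = nil
	}
}

func main() {
	outer := Acquire()
	inner := Acquire()
	fmt.Println(outer.Conn() == inner.Conn()) // same underlying connection
	inner.Release()
	fmt.Println(outer.Conn() != nil) // still open: outer holds a reference
	outer.Release()
	fmt.Println(shared == nil) // the last release closed it
}
```

In Go the usual pairing would be `s := Acquire(); defer s.Release()`, which plays the role of the using block.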
I would prefer the latter method. It sounds like your classes use the database connection as a conduit to the persistence layer. Making the caller pass in the database connection makes it clear that this is the case. If the connection/transaction were represented as a property of the object, then things are not so clear and all of the ownership and lifetime issues come out. Better to avoid them from the start.