SQLConnection Pooling - Handling InvalidOperationExceptions - sqlconnection

I am designing a Highly Concurrent CCR Application in which it is imperative that I DO NOT Block or Send to sleep a Thread.
I am hitting SQLConnection Pool issues - Specifically getting InvalidOperationExceptions when trying to call SqlConnection.Open
I can potentially retry a hand full of times, but this isn't really solving the problem.
The ideal solution for me would be a method of periodically re-checking the connection for availablity that doesn't require a thread being tied up
Any ideas?
[Update]
Here is a related problem/solution posted at another forum
The solution requires a manually managed connection pool. I'd rather have a solution which is more dynamic i.e. kicks in when needed

Harry, I've run into this as well, also whilst using the CCR. My experience was that having completely decoupled my dispatcher threads from blocking on any I/O, I could consume and process work items much faster than the SqlConnection pool could cope with. Once the maximum-pool-limit was hit, I ran into the sort of errors you are seeing.
The simplest solution is to pre-allocate a number of non-pooled asynchronous SqlConnection objects and post them to some central Port<SqlConnection> object. Then whenever you need to execute a command, do so within an iterator with something like this:
public IEnumerator<ITask> Execute(SqlCommand someCmd)
{
// Assume that 'connPort' has been posted with some open
// connection objects.
try
{
// Wait for a connection to become available and assign
// it to the command.
yield return connPort.Receive(item => someCmd.Connection = item);
// Wait for the async command to complete.
var iarPort = new Port<IAsyncResult>();
var iar = someCmd.BeginExecuteNonQuery(iarPort.Post, null);
yield return iarPort.Receive();
// Process the response.
var rc = someCmd.EndExecuteNonQuery(iar);
// ...
}
finally
{
// Put the connection back in the 'connPort' pool
// when we're done.
if (someCmd.Connection != null)
connPort.Post(someCmd.Connection);
}
}
The nice thing about using the Ccr is that it is trivial to add the following the features to this basic piece of code.
Timeout - just make the initial receive (for an available connection), a 'Choice' with a timeout port.
Adjust the pool size dynamically. To increase the size of the pool, just post a new open SqlConnection to 'connPort'. To decrease the size of the pool, yield a receive on the connPort, and then close the received connection and throw it away.

Yes, connections are kept open and out of the connection pool. In the above example, the port is the pool.

Related

How to capture command line input from Vert.x

Env: Mac OS 12.1, JDK 17, Vert.x 4.2.4
Question: how to capture command line input from a verticle? Tried so far following in the public void start(Promise<Void> startPromise) throws Exception method:
getVertx().createSharedWorkerExecutor("sys-in").executeBlocking(promise -> {
try (final BufferedReader br = new BufferedReader(new InputStreamReader(System.in))) {
String line;
int count = 0;
do {
System.out.print("message to MC: ");
line = br.readLine();
count++;
//doSth(line); // e.g. send line over multicast
} while (count < 3);
} catch (Throwable t) {
// log.info("<start> ", t);
} finally {
// bye(); // send a final message and close vertx
promise.complete();
}
});
This will start, get 3 nulls from br, and exit. Also tried a separated ExecutorService, in vain. Couldn't find any help in Vert.x doc either. Any hints are appreciated:
aware of the warnings of Vert.x when doing blocking stuff
Vert.x might not meant to be used this way, but would be cool if it (reading from command line) can be done with the same toolkit
I understand what you are trying to accomplish, but the problem is that that goes against fundamentals of verticles concept. Waiting for user input is potentially infinitely blocking operation i.e. there is no guarantee user will ever input the values. In that case, you are left with the verticle that is hung forever, spending resources and stuck in one spot. Multiply this if you are using worker verticles and you might have serious problems with the app. This issue is also emphasized here: https://vertx.io/docs/vertx-core/java/#blocking_code (under Warning).
In the link provided you can also find a suggested solution with a separate thread solution. Non-vertx thread won't mind being blocked and when the user input is provided can inform the vertx part of the application via the event bus that the user input dependent code can now be executed.
This might not be the solution you had in mind since it's not pure vertx, but have in mind that vert.x is just another tool, and that tool is not a good fit for what you are trying to accomplish here. However, it can be paired well with plain Java and it won't mind.

MS-MPI MPI_Barrier: sometimes hangs indefinitely, sometimes doesn't

I'm using the MPI.NET library, and I've recently moved my application to a bigger cluster (more COMPUTE-NODES). I've started seeing various collective functions hang indefinitely, but only sometimes. About half the time a job will complete, the rest of the time it'll hang. I've seen it happen with Scatter, Broadcast, and Barrier.
I've put a MPI.Communicator.world.Barrier() call (MPI.NET) at the start of the application, and created trace logs (using the MPIEXEC.exe /trace switch).
C# code snippet:
static void Main(string[] args)
{
var hostName = System.Environment.MachineName;
Logger.Trace($"Program.Main entered on {hostName}");
string[] mpiArgs = null;
MPI.Environment myEnvironment = null;
try
{
Logger.Trace($"Trying to instantiated on MPI.Environment on {hostName}. Is currently initialized? {MPI.Environment.Initialized}");
myEnvironment = new MPI.Environment(ref mpiArgs);
Logger.Trace($"Is currently initialized?{MPI.Environment.Initialized}. {hostName} is waiting at Barrier... ");
Communicator.world.Barrier(); // CODE HANGS HERE!
Logger.Trace($"{hostName} is past Barrier");
}
catch (Exception envEx)
{
Logger.Error(envEx, "Could not instantiate MPI.Environment object");
}
// rest of implementation here...
}
I can see the msmpi.dll's MPI_Barrier function being called in the log, and I can see messages being sent and received thereafter for a passing and a failing example. For the passing example, messages are sent/received and then the MPI_Barrier function Leave is logged.
For the failing example it look like one (or more) of the send messages is lost - it is never received by the target. Am I correct in thinking that messages lost within the MPI_Barrier call will mean that the processes never synchronize, therefore all get stuck at the Communicator.world.Barrier() call?
What could be causing this to happen intermittently? Could poor network performance between the COMPUTE-NODES be a cause?
I'm running MS HPC Pack 2008 R2, so the version of MS-MPI is pretty old, v2.0.
EDIT - Additional information
If I keep a task running within the same node, then this issue does not happen. For example, if I run a task using 8 cores on one node then fine, but if i use 9 cores on two nodes I'll see this issue ~50% of the time.
Also, we have two clusters in use and this only happens on one of them. They are both virtualized environments, but appear to be set up identically.

General pattern for failing over from one database to another using Entity Framework?

We have an enterprise DB that is replicated through many sites throughout the world. We would like our app to attempt to connect to one of the local sites, and if that site is down we want it to fall back to the enterprise DB. We'd like this behavior on each of our DB operations.
We are using Entity Framework, C#, and SQL Server.
At first I hoped I could just specify a "Failover Partner" in the connection string, but that only works in a mirrored DB environment, which this is not. I also looked into writing a custom IDbExecutionStrategy. But these strategies only allow you to specify the pattern for retrying a failed DB operation. It does not allow you to change the operation in any way like directing it to a new connection.
So, do you know of any good pattern for dealing with this type of operation, other than duplicating retry logic around each of our many DB operations?
Update on 2014-05-14:
I'll elaborate in response to some of the suggestions already made.
I have many places where the code looks like this:
try
{
using(var db = new MyDBContext(ConnectionString))
{
// Database operations here.
// var myList = db.MyTable.Select(...), etc.
}
}
catch(Exception ex)
{
// Log exception here, perhaps rethrow.
}
It was suggested that I have a routine that first checks each of the connections strings and returns the first one that successfully connects. This is reasonable as far as it goes. But some of the errors I'm seeing are timeouts on the operations, where the connection works but the DB has issues that keep it from completing the operation.
What I'm looking for is a pattern I can use to encapsulate the unit of work and say, "Try this on the first database. If it fails for any reason, rollback and try it on the second DB. If that fails, try it on the third, etc. until the operation succeeds or you have no more DBs." I'm pretty sure I can roll my own (and I'll post the result if I do), but I was hoping there might be a known way to approach this.
How about using some Dependency Injection system like autofac and registering there a factory for new context objects - it will execute logic that will try to connect first to local and in case of failure it will connect to enterprise db. Then it will return ready DbContext object. This factory will be provided to all objects that require it with Dependency Injection system - they will use it to create contexts and dispose of them when they are not needed any more.
" We would like our app to attempt to connect to one of the local sites, and if that site is down we want it to fall back to the enterprise DB. We'd like this behavior on each of our DB operations."
If your app is strictly read-only on the DB and data consistency is not absolutely vital to your app/users, then it's just a matter of trying to CONNECT until an operational site has been found. As M.Ali suggested in his remark.
Otherwise, I suggest you stop thinking along these lines immediately because you're just running 90 mph down a dead end street. As Viktor Zychla suggested in his remark.
Here is what I ended up implementing, in broad brush-strokes:
Define delegates called UnitOfWorkMethod that will execute a single Unit of Work on the Database, in a single transaction. It takes a connection string and one also returns a value:
delegate T UnitOfWorkMethod<out T>(string connectionString);
delegate void UnitOfWorkMethod(string connectionString);
Define a method called ExecuteUOW, that will take a unit of work and method try to execute it using the preferred connection string. If it fails, it tries to execute it with the next connection string:
protected T ExecuteUOW<T>(UnitOfWorkMethod<T> method)
{
// GET THE LIST OF CONNECTION STRINGS
IEnumerable<string> connectionStringList = ConnectionStringProvider.GetConnectionStringList();
// WHILE THERE ARE STILL DATABASES TO TRY, AND WE HAVEN'T DEFINITIVELY SUCCEDED OR FAILED
var uowState = UOWStateEnum.InProcess;
IEnumerator<string> stringIterator = connectionStringList.GetEnumerator();
T returnVal = default(T);
Exception lastException = null;
string connectionString = null;
while ((uowState == UOWStateEnum.InProcess) && stringIterator.MoveNext())
{
try
{
// TRY TO EXECUTE THE UNIT OF WORK AGAINST THE DB.
connectionString = stringIterator.Current;
returnVal = method(connectionString);
uowState = UOWStateEnum.Success;
}
catch (Exception ex)
{
lastException = ex;
// IF IT FAILED BECAUSE OF A TRANSIENT EXCEPTION,
if (TransientChecker.IsTransient(ex))
{
// LOG THE EXCEPTION AND TRY AGAINST ANOTHER DB.
Log.TransientDBException(ex, connectionString);
}
// ELSE
else
{
// CONSIDER THE UOW FAILED.
uowState = UOWStateEnum.Failed;
}
}
}
// LOG THE FAILURE IF WE HAVE NOT SUCCEEDED.
if (uowState != UOWStateEnum.Success)
{
Log.ExceptionDuringDataAccess(lastException);
returnVal = default(T);
}
return returnVal;
}
Finally, for each operation we define our unit of work delegate method. Here an example
UnitOfWorkMethod uowMethod =
(providerConnectionString =>
{
using (var db = new MyContext(providerConnectionString ))
{
// Do my DB commands here. They will roll back if exception thrown.
}
});
ExecuteUOW(uowMethod);
When ExecuteUOW is called, it tries the delegate on each database until it either succeeds or fails on all of them.
I'm going to accept this answer since it fully addresses all of concerns raised in the original question. However, if anyone provides and answer that is more elegant, understandable, or corrects flaws in this one I'll happily accept it instead.
Thanks to all who have responded.

Can my nginx module make a connection in the master process?

I'm writing an nginx module that wants to subscribe to a zeromq pubsub socket and update an in-memory data-structure based on the messages it receives. To save bandwidth, it makes sense that only one process should make the subscription, and the data structure should be in shm so that all processes can make use of it. To me it seems natural that that one process should be the master (since if it was a worker, the code would have to somehow decide which worker).
But when I call ngx_get_connection from my init_master or init_module callbacks, it segfaults, apparently due to ngx_cycle not being initialized yet. Google searches on plugins doing work in the master process seem pretty pessimistic. Is there a better way to accomplish my goal of making a single outgoing connection to the pubsub socket per server, regardless of how many workers it has?
Here's a sample of code that works in a worker context but not from the master:
void *zmq_context = zmq_ctx_new();
void *control_socket = zmq_socket(zmq_context, ZMQ_SUB);
int control_fd;
size_t fdsize = sizeof(int);
ngx_connection_t *control_connection;
zmq_connect(control_socket, "tcp://somewhere:1234");
zmq_setsockopt(control_socket, ZMQ_SUBSCRIBE, "", 0);
zmq_getsockopt(control_socket, ZMQ_FD, &control_fd, &fdsize);
control_connection = ngx_get_connection(control_fd, cycle->log);
control_connection->read->handler = my_read_handler;
control_connection->read->log = cycle->log;
ngx_add_event(control_connection->read, NGX_READ_EVENT, 0);
and elsewhere
void my_read_handler (ngx_event_t *ev) {
int events;
size_t events_size = sizeof(events);
zmq_getsockopt(control_socket, ZMQ_EVENTS, &events, &events_size);
while (events & ZMQ_POLLIN) {
/* ...
read a message, do something with it
... */
events = 0;
zmq_getsockopt(control_socket, ZMQ_EVENTS, &events, &events_size);
}
}
To save bandwidth, it makes sense that only one process should make the subscription, and the data structure should be in shm so that all processes can make use of it. To me it seems natural that that one process should be the master (since if it was a worker, the code would have to somehow decide which worker).
As I already said, all you need is to decline your natural idea and just use one worker process for your purpose.
Which worker? Well, let it be the first one started.

How to specify ADO.NET connection timeouts of less than a second?

Connection time outs are specified in the connectionString in web.config file like this:
"Data Source=dbs;Initial Catalog=db;"+"Integrated Security=SSPI;Connection Timeout=30"
The time is in seconds. I want to specify a connection timeout in milliseconds, say 500ms. How can I do that?
Edit 1: I want to do this to create a ping method which just checks if the database is reachable or not.
Edit 2: I have been searching for some similar solutions and this answer mentioned specifying timeout in milliseconds. So I was intrigued and wanted to find out how it can be done.
Firstly, please make sure that you are using non-pooled connections to ensure that you are always getting a fresh connection, you can do this by adding Pooling=false to your connection string. For both of these solutions I would also recommend adding Connection Timeout=1 just to ensure that ADO.NET does not needlessly continue to open the connection after you application has given up.
For .Net 4.5 you can use the new OpenAsync method and a CancellationToken to achieve a short timeout (e.g. 500ms):
using (var tokenSource = new CancellationTokenSource())
using (var connection = new SqlConnection(connectionString))
{
tokenSource.CancelAfter(500);
await connection.OpenAsync(tokenSource.Token);
}
When this times out you should see the Task returned by OpenAsync go to the canceled state, which will result in a TaskCanceledException
For .Net 4.0 you can wrap the connection open in a Task and wait on that for the desired time:
var openTask = Task.Factory.StartNew(() =>
{
using (var connection = new SqlConnection(connectionString))
{
connection.Open();
}
});
openTask.ContinueWith(task =>
{
// Need to observe any exceptions here - perhaps you might log them?
var ignored = task.Exception;
}, TaskContinuationOptions.OnlyOnFaulted);
if (!openTask.Wait(500))
{
// Didn't complete
Console.WriteLine("Fail");
}
In this example, openTask.Wait() will return a bool that indicates if the Task completed or not. Please be aware that in .Net 4.0 you must observe all exceptions thrown inside tasks, otherwise they will cause your program to crash.
If you need examples for versions of .Net prior to 4.0 please let me know.