I have a Windows service that runs workflows. The workflows are XAML definitions loaded from a database (users can define their own workflows using a rehosted designer). The service is configured with a single SqlWorkflowInstanceStore instance, which persists workflows when they become idle. (It's basically derived from the example code in \ControllingWorkflowApplications from Microsoft's WCF/WF samples.)
But sometimes I get an error like the one below:
System.Runtime.DurableInstancing.InstanceOwnerException: The execution of an InstancePersistenceCommand was interrupted because the instance owner registration for owner ID 'a426269a-be53-44e1-8580-4d0c396842e8' has become invalid. This error indicates that the in-memory copy of all instances locked by this owner have become stale and should be discarded, along with the InstanceHandles. Typically, this error is best handled by restarting the host.
I've been trying to find the cause, but it is hard to reproduce in development. On production servers, however, I get it once in a while. One hint I found: when I look at the LockOwnersTable, the LockExpiration column is set to 01/01/2000 0:0:0 and is no longer being updated, while under normal circumstances it should be updated every x seconds according to the host lock renewal period.
So why would SqlWorkflowInstanceStore stop renewing this LockExpiration, and how can I detect the cause?
This happens because a background procedure tries to extend the instance store's lock every 30 seconds, and it seems that once a connection to SQL Server fails, the instance store is marked as invalid.
You can see the same behaviour if you delete the instance store's record from the [LockOwnersTable] table.
The proposed solution is that when this exception fires, you free the old instance store and initialize a new one:
using System;
using System.Activities.DurableInstancing;
using System.Runtime.DurableInstancing;

public class WorkflowInstanceStore : IWorkflowInstanceStore, IDisposable
{
    public WorkflowInstanceStore(string connectionString)
    {
        _instanceStore = new SqlWorkflowInstanceStore(connectionString);

        // Register this host as a lock owner and make it the store's default owner.
        InstanceHandle handle = _instanceStore.CreateInstanceHandle();
        InstanceView view = _instanceStore.Execute(handle,
            new CreateWorkflowOwnerCommand(), TimeSpan.FromSeconds(30));
        handle.Free();
        _instanceStore.DefaultInstanceOwner = view.InstanceOwner;
    }

    public InstanceStore Store
    {
        get { return _instanceStore; }
    }

    public void Dispose()
    {
        if (null != _instanceStore)
        {
            // Remove the owner registration so its locks are released.
            var deleteOwner = new DeleteWorkflowOwnerCommand();
            InstanceHandle handle = _instanceStore.CreateInstanceHandle();
            _instanceStore.Execute(handle, deleteOwner, TimeSpan.FromSeconds(10));
            handle.Free();
        }
    }

    private InstanceStore _instanceStore;
}
You can find best practices for creating an instance store handle in the article "Workflow Instance Store Best practices".
This is an old thread, but I just stumbled on the same issue.
Damir's Corner suggests checking whether the instance handle is still valid before calling the instance store. I quote the whole post here:
Certain aspects of Workflow Foundation are still poorly documented; the persistence framework being one of them. The following snippet is typically used for setting up the instance store:
var instanceStore = new SqlWorkflowInstanceStore(connectionString);
instanceStore.HostLockRenewalPeriod = TimeSpan.FromSeconds(30);
var instanceHandle = instanceStore.CreateInstanceHandle();
var view = instanceStore.Execute(instanceHandle,
new CreateWorkflowOwnerCommand(), TimeSpan.FromSeconds(10));
instanceStore.DefaultInstanceOwner = view.InstanceOwner;
It's difficult to find a detailed explanation of what all of this
does; and to be honest, usually it's not necessary. At least not,
until you start encountering problems, such as InstanceOwnerException:
The execution of an InstancePersistenceCommand was interrupted because
the instance owner registration for owner ID
'9938cd6d-a9cb-49ad-a492-7c087dcc93af' has become invalid. This error
indicates that the in-memory copy of all instances locked by this
owner have become stale and should be discarded, along with the
InstanceHandles. Typically, this error is best handled by restarting
the host.
The error is closely related to the HostLockRenewalPeriod property
which defines how long obtained instance handle is valid without being
renewed. If you try monitoring the database while an instance store
with a valid instance handle is instantiated, you will notice
[System.Activities.DurableInstancing].[ExtendLock] being called
periodically. This stored procedure is responsible for renewing the
handle. If for some reason it fails to be called within the specified
HostLockRenewalPeriod, the above mentioned exception will be thrown
when attempting to persist a workflow. A typical reason for this would
be temporarily inaccessible database due to maintenance or networking
problems. It's not something that happens often, but it's bound to
happen if you have a long living instance store, e.g. in a constantly
running workflow host, such as a Windows service.
Fortunately it's not all that difficult to fix the problem, once you
know the cause of it. Before using the instance store you should
always check, if the handle is still valid; and renew it, if it's not:
if (!instanceHandle.IsValid)
{
instanceHandle = instanceStore.CreateInstanceHandle();
var view = instanceStore.Execute(instanceHandle,
new CreateWorkflowOwnerCommand(), TimeSpan.FromSeconds(10));
instanceStore.DefaultInstanceOwner = view.InstanceOwner;
}
It's definitely less invasive than the restart of the host, suggested
by the error message.
You have to make sure the instance owner has not expired.
Here is how I usually handle this issue:
public SqlWorkflowInstanceStore SetupSqlpersistenceStore()
{
    SqlWorkflowInstanceStore sqlWFInstanceStore = new SqlWorkflowInstanceStore(ConfigurationManager.ConnectionStrings["DB_WWFConnectionString"].ConnectionString);
    sqlWFInstanceStore.InstanceCompletionAction = InstanceCompletionAction.DeleteAll;
    InstanceHandle handle = sqlWFInstanceStore.CreateInstanceHandle();
    InstanceView view = sqlWFInstanceStore.Execute(handle, new CreateWorkflowOwnerCommand(), TimeSpan.FromSeconds(30));
    handle.Free();
    sqlWFInstanceStore.DefaultInstanceOwner = view.InstanceOwner;
    return sqlWFInstanceStore;
}
And here is how you can use this method:
wfApp.InstanceStore = SetupSqlpersistenceStore();
Hope this helps.
We have an enterprise DB that is replicated through many sites throughout the world. We would like our app to attempt to connect to one of the local sites, and if that site is down we want it to fall back to the enterprise DB. We'd like this behavior on each of our DB operations.
We are using Entity Framework, C#, and SQL Server.
At first I hoped I could just specify a "Failover Partner" in the connection string, but that only works in a mirrored DB environment, which this is not. I also looked into writing a custom IDbExecutionStrategy. But these strategies only allow you to specify the pattern for retrying a failed DB operation. It does not allow you to change the operation in any way like directing it to a new connection.
So, do you know of any good pattern for dealing with this type of operation, other than duplicating retry logic around each of our many DB operations?
Update on 2014-05-14:
I'll elaborate in response to some of the suggestions already made.
I have many places where the code looks like this:
try
{
    using (var db = new MyDBContext(ConnectionString))
    {
        // Database operations here.
        // var myList = db.MyTable.Select(...), etc.
    }
}
catch (Exception ex)
{
    // Log exception here, perhaps rethrow.
}
It was suggested that I have a routine that first checks each of the connections strings and returns the first one that successfully connects. This is reasonable as far as it goes. But some of the errors I'm seeing are timeouts on the operations, where the connection works but the DB has issues that keep it from completing the operation.
What I'm looking for is a pattern I can use to encapsulate the unit of work and say, "Try this on the first database. If it fails for any reason, rollback and try it on the second DB. If that fails, try it on the third, etc. until the operation succeeds or you have no more DBs." I'm pretty sure I can roll my own (and I'll post the result if I do), but I was hoping there might be a known way to approach this.
How about using a dependency injection container such as Autofac and registering a factory for new context objects? The factory would first try to connect to the local database and, if that fails, connect to the enterprise DB, then return a ready-to-use DbContext. The container would provide this factory to every object that needs it; those objects would use it to create contexts and dispose of them when they are no longer needed.
" We would like our app to attempt to connect to one of the local sites, and if that site is down we want it to fall back to the enterprise DB. We'd like this behavior on each of our DB operations."
If your app is strictly read-only on the DB and data consistency is not absolutely vital to your app/users, then it's just a matter of trying to CONNECT until an operational site has been found, as M.Ali suggested in his remark.
Otherwise, I suggest you stop thinking along these lines immediately, because you're just running 90 mph down a dead-end street, as Viktor Zychla suggested in his remark.
Here is what I ended up implementing, in broad brush-strokes:
Define two delegates called UnitOfWorkMethod, each of which executes a single unit of work on the database in a single transaction. Both take a connection string, and one also returns a value:
delegate T UnitOfWorkMethod<out T>(string connectionString);
delegate void UnitOfWorkMethod(string connectionString);
Define a method called ExecuteUOW that takes a unit-of-work method and tries to execute it using the preferred connection string. If that fails with a transient error, it tries the next connection string:
protected T ExecuteUOW<T>(UnitOfWorkMethod<T> method)
{
    // GET THE LIST OF CONNECTION STRINGS
    IEnumerable<string> connectionStringList = ConnectionStringProvider.GetConnectionStringList();
    // WHILE THERE ARE STILL DATABASES TO TRY, AND WE HAVEN'T DEFINITIVELY SUCCEEDED OR FAILED
    var uowState = UOWStateEnum.InProcess;
    IEnumerator<string> stringIterator = connectionStringList.GetEnumerator();
    T returnVal = default(T);
    Exception lastException = null;
    string connectionString = null;
    while ((uowState == UOWStateEnum.InProcess) && stringIterator.MoveNext())
    {
        try
        {
            // TRY TO EXECUTE THE UNIT OF WORK AGAINST THE DB.
            connectionString = stringIterator.Current;
            returnVal = method(connectionString);
            uowState = UOWStateEnum.Success;
        }
        catch (Exception ex)
        {
            lastException = ex;
            // IF IT FAILED BECAUSE OF A TRANSIENT EXCEPTION,
            if (TransientChecker.IsTransient(ex))
            {
                // LOG THE EXCEPTION AND TRY AGAINST ANOTHER DB.
                Log.TransientDBException(ex, connectionString);
            }
            // ELSE
            else
            {
                // CONSIDER THE UOW FAILED.
                uowState = UOWStateEnum.Failed;
            }
        }
    }
    // LOG THE FAILURE IF WE HAVE NOT SUCCEEDED.
    if (uowState != UOWStateEnum.Success)
    {
        Log.ExceptionDuringDataAccess(lastException);
        returnVal = default(T);
    }
    return returnVal;
}
Finally, for each operation we define our unit-of-work delegate method. Here is an example:
UnitOfWorkMethod uowMethod =
    (providerConnectionString =>
    {
        using (var db = new MyContext(providerConnectionString))
        {
            // Do my DB commands here. They will roll back if exception thrown.
        }
    });
ExecuteUOW(uowMethod);
When ExecuteUOW is called, it tries the delegate on each database in turn until it succeeds, hits a non-transient error, or runs out of databases.
I'm going to accept this answer since it fully addresses all of the concerns raised in the original question. However, if anyone provides an answer that is more elegant, more understandable, or corrects flaws in this one, I'll happily accept it instead.
Thanks to all who have responded.
We have an HTTP endpoint that takes a long time to run and can also be called concurrently by users. As part of this request, we update the model inside a synchronized block so that other (possibly concurrent) requests pick up that change.
E.g.
MyModel m = null;
synchronized (lockObject) {
    m = MyModel.findById(id);
    if (m.status == PENDING) {
        m.status = ACTIVE;
    } else {
        //render a response back to user that the operation is not allowed
    }
    m.save(); //Is not expected to be called unless we set m.status = ACTIVE
}
//Long running operation continues here. It can involve further changes to instance "m"
The reason for the synchronized block is to ensure that even concurrent requests pick up the latest status. However, the underlying JPA does not commit my changes (m.save()) until the request is complete. Since this is a long-running request, I do not want to wait until the request is complete, and I still want to ensure that other callers are notified of the change in status. I tried calling "m.em().flush(); JPA.em().getTransaction().commit();" after m.save(), but that makes the transaction unavailable for the subsequent actions in the same request. Can I just call "JPA.em().getTransaction().begin();" and let Play handle the transaction from then on? If not, what is the best way to handle this use case?
UPDATE:
Based on the response, I modified my code as follows:
MyModel m = null;
synchronized (lockObject) {
    m = MyModel.findById(id);
    if (m.status == PENDING) {
        m.status = ACTIVE;
    } else {
        //render a response back to user that the operation is not allowed
    }
    m.save(); //Is not expected to be called unless we set m.status = ACTIVE
}
new MyModelUpdateJob(m.id).now();
And in my job, I have the following line:
public void doJob() {
    MyModel m = MyModel.findById(id);
    System.out.println(m.status); //This still prints the old status, as if m.save() had no effect...
}
What am I missing?
Put your update code in a job and call
new MyModelUpdateJob(id).now().get();
so that the update is done in another transaction, which is committed at the end of the job.
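Why the trailing .get() matters can be shown without Play. The sketch below is a plain-Java analogy, not Play code: the hypothetical StatusStore stands in for the database, and an ExecutorService stands in for Play's job pool. Blocking on the returned future means the update has finished before the caller reads the status again.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical in-memory stand-in for the table that holds each model's status.
class StatusStore {
    private final ConcurrentHashMap<Long, String> statusById = new ConcurrentHashMap<>();

    void put(long id, String status) { statusById.put(id, status); }

    String get(long id) { return statusById.get(id); }
}

// Hypothetical helper: runs the update as its own unit of work and waits for it.
class UpdateRunner {
    static String activateAndWait(StatusStore store, long id) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The update runs on another thread, like a job with its own transaction.
            Future<?> done = pool.submit(() -> store.put(id, "ACTIVE"));
            // Blocking here plays the role of now().get(): continue only after completion.
            done.get();
            return store.get(id);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("update did not complete", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without the blocking call, the caller could read the status before the update has been applied.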
Ouch, as soon as you add more Play servers, you will be in trouble. You may want to play with optimistic locking in your example or (and I advise against it) pessimistic locking... ick.
HOWEVER, looking at your code, maybe read the article Building on Quicksand. I am not sure you need a synchronized block in that case at all. Try to aim for idempotency.
In your case, if user 1 and user 2 both call that method while the status is pending, it goes to active either way (idempotent).
Whether user 1 or user 2 wins, the result is the same as if you had the synchronized block anyway.
I am sure, however, that you have a more complex scenario not shown here, BUT READ that article, Building on Quicksand, as it really changes the traditional way of thinking and is how Google, Amazon, and other very large-scale systems operate.
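As a rough illustration of the idempotent transition described above, here is a minimal in-memory sketch. ModelStatus and the Status enum are hypothetical stand-ins for a MyModel row, not Play or JPA types; the point is only that the PENDING to ACTIVE change is a single compare-and-set, so it does not matter which caller arrives first.

```java
import java.util.concurrent.atomic.AtomicReference;

// Status values taken from the question; the enum itself is an assumption.
enum Status { PENDING, ACTIVE }

// Hypothetical in-memory stand-in for one MyModel row.
class ModelStatus {
    private final AtomicReference<Status> status = new AtomicReference<>(Status.PENDING);

    // Atomically moves PENDING -> ACTIVE.
    // Returns true only for the caller that actually made the change.
    boolean tryActivate() {
        return status.compareAndSet(Status.PENDING, Status.ACTIVE);
    }

    Status current() { return status.get(); }
}
```

Against a real database the same idea is usually written as a conditional update (UPDATE ... SET status = 'ACTIVE' WHERE id = ? AND status = 'PENDING') followed by a check of the affected row count, which also works when several servers share the database.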
Another option for distributed transactions across Play servers is ZooKeeper, which the big NoSQL guys use, BUT only as a last resort ;) ;)
later,
Dean
ScopedDbConnection's constructor gets a connection from the pool (creating a new one if none is available) and saves it as a private member variable. Its get method returns a pointer to DBClientBase. I think client code doesn't need to delete this pointer, because the done method returns the connection to the pool. Here is my code. Am I right?
ScopedDbConnection con(...);
DBClientBase* session = con.get();
//do something using session
...
//
con.done();// ignore session because done will return it back to connection pool
You can find a number of good ScopedDbConnection examples in the MongoDB github. Here's a file that shows some basic usage of that class:
https://github.com/mongodb/mongo/blob/master/src/mongo/client/model.cpp
Check out lines 24-46 (Model::load).
[I am new to ADO.NET and the Entity Framework, so forgive me if this questions seems odd.]
In my WPF application a user can switch between different databases at run time. When they do this, I want to be able to do a quick check that the database is still available. What I have easily available is the ObjectContext. The test I am performing gets the count of the total records in a very small table: if it returns a result it passes, and if I get an exception it fails. I don't like this test, but it seemed the easiest to do with the ObjectContext.
I have tried setting the connection timeout both in the connection string and on the ObjectContext, and neither seems to change anything in the first scenario, while the second one is already fast, so it isn't noticeable if it changes anything.
Scenario One
If the connection was down before the first access, it takes about 30 seconds before I get the exception that the underlying provider failed.
Scenario Two
If the database was up when I started the application and I accessed it, and the connection then drops while I'm using it, the test is quick and returns almost instantly.
I want the first scenario described to be as quick as the second one.
Please let me know how best to resolve this, and if there is a better way to test the connectivity to a DB quickly please advise.
There really is no easy or quick way to resolve this. The ConnectionTimeout value is ignored by the Entity Framework. The solution I used is a method that checks whether a context is valid: you pass in the location you wish to validate, and it gets the count from a known, very small table. If this throws an exception, the context is not valid; otherwise it is. Here is some sample code showing this.
public bool IsContextValid(SomeDbLocation location)
{
    bool isValid = false;
    try
    {
        context = GetContext(location);
        context.SomeSmallTable.Count();
        isValid = true;
    }
    catch
    {
        isValid = false;
    }
    return isValid;
}
You may need to use context.Database.Connection.Open()
Strange one. We have a multi-threaded app which pulls messages off an MSMQ queue and then performs actions based on the messages. All of this is done using DTC.
Sometimes, for some reason I can't describe, we get message read errors when pulling Messages off the queue.
The code that is being used in the app:
Message[] allMessagesOnQueue = this.messageQueue.GetAllMessages();
foreach (Message currentMessage in allMessagesOnQueue)
{
    if ((currentMessage.Body is IAMessageIDealWith))
    {
        // do something;
    }
}
When the currentMessage.Body is accessed, at times it throws an exception:
System.InvalidOperationException: Property Body was not retrieved when receiving the message. Ensure that the PropertyFilter is set correctly.
Now, this only happens some of the time, and it appears as though the MessageReadPropertyFilter on the queue has the Body property set to false.
How it gets like this is a bit of a mystery. The Body property is one of the defaults, and we absolutely never explicitly set it to false.
Has anyone else seen this kind of behaviour, or does anyone have an idea why this value is getting set to false?
As alluded to earlier, you could explicitly set the boolean values on the System.Messaging.MessagePropertyFilter object that is accessible on your messageQueue object via the MessageReadPropertyFilter property.
If you want all data to be extracted from a message when it is received or peeked, use:
this.messageQueue.MessageReadPropertyFilter.SetAll(); // add this line
Message[] allMessagesOnQueue = this.messageQueue.GetAllMessages();
// ...
That may hurt performance of reading many messages, so if you want just a few additional properties, create a new MessagePropertyFilter with custom flags:
// Specify to retrieve selected properties.
MessagePropertyFilter filter = new MessagePropertyFilter();
filter.ClearAll();
filter.Body = true;
filter.Priority = true;
this.messageQueue.MessageReadPropertyFilter = filter;
Message[] allMessagesOnQueue = this.messageQueue.GetAllMessages();
// ...
You can also set it back to default using:
this.messageQueue.MessageReadPropertyFilter.SetDefaults();
More info here: http://msdn.microsoft.com/en-us/library/system.messaging.messagequeue.messagereadpropertyfilter.aspx
I have seen it as well. I tried initializing the filter with the properties I'm accessing explicitly set, and not setting them anywhere else, but I still periodically get the same error you are getting. My app is multi-threaded as well. What I ended up doing is trapping that error and reconnecting to MSMQ when I get it.
Sometimes, for some reason I can't describe, we get message read errors when pulling Messages off the queue.
Are you using the same MessageQueue instance from more than one thread, without locking? In that case, you will encounter spurious changes in MessageReadPropertyFilter - at least I did, when I tried.
Why? Because
Only the GetAllMessages method is thread safe.
What can you do? Either
wrap a lock (_messageQueue) around all access to your messageQueue OR
create multiple MessageQueue instances, one per thread