Play 1.2.3 framework - Right way to commit transaction - jpa

We have an HTTP endpoint that takes a long time to run and can be called concurrently by multiple users. As part of this request, we update the model inside a synchronized block so that other (possibly concurrent) requests pick up that change.
E.g.
MyModel m = null;
synchronized (lockObject) {
    m = MyModel.findById(id);
    if (m.status == PENDING) {
        m.status = ACTIVE;
    } else {
        //render a response back to user that the operation is not allowed
    }
    m.save(); //Is not expected to be called unless we set m.status = ACTIVE
}
//Long running operation continues here. It can involve further changes to instance "m"
The reason for the synchronized block is to ensure that even concurrent requests pick up the latest status. However, the underlying JPA does not commit my changes (m.save()) until the request is complete. Since this is a long-running request, I do not want to wait until the request is complete, yet I still want other callers to see the change in status. I tried calling "m.em().flush(); JPA.em().getTransaction().commit();" after m.save(), but that makes the transaction unavailable for subsequent actions in the same request. Can I just call "JPA.em().getTransaction().begin();" and let Play handle the transaction from then on? If not, what is the best way to handle this use case?
UPDATE:
Based on the response, I modified my code as follows:
MyModel m = null;
synchronized (lockObject) {
    m = MyModel.findById(id);
    if (m.status == PENDING) {
        m.status = ACTIVE;
    } else {
        //render a response back to user that the operation is not allowed
    }
    m.save(); //Is not expected to be called unless we set m.status = ACTIVE
}
new MyModelUpdateJob(m.id).now();
And in my job, I have the following line:
doJob() {
    MyModel m = MyModel.findById(id);
    print m.status; //This still prints the old status as-if m.save() had no effect...
}
What am I missing?

Put your update code in a job and call
new MyModelUpdateJob(id).now().get();
That way the update is done in another transaction, which is committed at the end of the job. Calling get() blocks until the job, and therefore its commit, has finished.
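To illustrate why the job in the question still saw the old status, here is a minimal Python sketch using the standard sqlite3 module rather than Play or JPA. The table name, the two connection names, and the helper functions are made up for the illustration. It only shows the general rule that a second transaction does not see a change until the first one commits.

```python
import os
import sqlite3
import tempfile


def read_status(conn):
    # fetchall() exhausts the cursor, so the read lock is released right away.
    return conn.execute("SELECT status FROM my_model WHERE id = 1").fetchall()[0][0]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        request = sqlite3.connect(path)  # stands in for the long-running request
        job = sqlite3.connect(path)      # stands in for the job's own transaction
        try:
            request.execute("CREATE TABLE my_model (id INTEGER PRIMARY KEY, status TEXT)")
            request.execute("INSERT INTO my_model VALUES (1, 'PENDING')")
            request.commit()

            # The request changes the row but has not committed yet.
            request.execute("UPDATE my_model SET status = 'ACTIVE' WHERE id = 1")
            before_commit = read_status(job)

            # Only after the commit does the other connection see the new status.
            request.commit()
            after_commit = read_status(job)
            return before_commit, after_commit
        finally:
            request.close()
            job.close()


if __name__ == "__main__":
    print(demo())
```

The same reasoning applies to the Play case: the status change has to be committed in its own short transaction (for example inside the job) before any other transaction can read it.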

Ouch: as soon as you add more Play servers, you will be in trouble, because a synchronized block only guards a single JVM. You may want to try optimistic locking in your example, or (though I advise against it) pessimistic locking... ick.
HOWEVER, looking at your code, maybe read the article Building on Quicksand. I am not sure you need a synchronized block in that case at all...try to go after being idempotent.
In your case, if user 1 and user 2 both call that method while the status is pending, it goes to active either way (idempotent).
Whether user 1 or user 2 wins, the result is the same as if you had the synchronized block anyway.
I am sure however you have a more complex scenario not shown here, BUT READ that article Building on Quicksand as it really changes the traditional way of thinking and is how google and amazon and very large scale systems operate.
Another option for distributed transactions across play servers is zookeeper which the big large nosql guys use BUT only as a last resort ;) ;)
later,
Dean
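One way to picture the lock-free approach is a conditional update (compare-and-set) committed in its own short transaction. The sketch below is Python with sqlite3, not Play code; the table, the make_db helper, and the try_activate function are hypothetical names for the example. The database itself decides which caller moves the row from PENDING to ACTIVE, so no JVM-level lock is involved.

```python
import sqlite3


def make_db():
    # Hypothetical single-row table standing in for MyModel.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_model (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO my_model VALUES (1, 'PENDING')")
    conn.commit()
    return conn


def try_activate(conn, model_id):
    """Move a row from PENDING to ACTIVE; return True only for the caller that won."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE my_model SET status = 'ACTIVE' "
            "WHERE id = ? AND status = 'PENDING'",
            (model_id,),
        )
    # rowcount is 1 if this call changed the row, 0 if it was already ACTIVE.
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = make_db()
    print(try_activate(conn, 1))  # first caller wins
    print(try_activate(conn, 1))  # second caller is told the operation is not allowed
```

A caller that gets False would render the "operation not allowed" response; a caller that gets True carries on with the long-running work.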

ServerValue.increment doesn't work properly when Internet goes down

The addition of ServerValue.increment() (Add increment() for atomic field value increments #2437) was great news, as it allows field values to be increased atomically in Firebase RTDB.
I have an application that keeps inventories, and this function has been key because it allows the inventory to be updated even if the user is offline at times. However, I started to notice that sometimes the function is executed twice, which leaves the inventory with wrong values.
To isolate the problem I ran the following test, which shows that ServerValue.increment() misbehaves when the connection goes from online to offline:
Make a for loop function from 1 to 200:
for (var i = 1; i <= 200; i++) {
  testBloc.incrementTest(i);
  print('Pos: $i');
}
The function incrementTest(i) must increment two variables: position (incremented by 1 on each call, up to 200) and sum (1 + 2 + 3 + ... + 200, which should result in 20,100)
Future<bool> incrementTest(int value) async {
  try {
    db.child('test/position')
        .set(ServerValue.increment(1));
    db.child('test/sum')
        .set(ServerValue.increment(value));
  } catch (e) {
    print(e);
  }
  return true;
}
Note that db refers to the Firebase instance (FirebaseDatabase.instance.reference())
With this in place, here are the tests:
Test 1: 100% Online. PASSED
The function works properly, and both variables reach the correct result (in the Firebase console):
position: 200
sum: 20100
Test 2: 100% Offline. PASSED
To do this I used a physical device in airplane mode, then I executed the for loop function, and when the function finished executing I deactivated airplane mode and checked the result in the firebase console, which was satisfactory:
position: 200
sum: 20100
Test 3: Start Online and then go to Offline. FAILED
It is a typical operating scenario when the Internet Connection goes down. Even worse when the connections are intermittent, you are traveling on a subway or you are in a low coverage site for which Offline Persistence is a desired feature. To simulate it, what I did was run the for loop function in online mode, and before it finished, I put the physical device in airplane mode. Later I went Online to finish the test and see the results on the Firebase console. The results obtained are incorrect in all cases. Here are some of the results:
As you can see, the Increment was erroneously repeated 10, 18 and 9 times more.
How can I avoid this behavior?
Is there any other way to atomically increment a number in Firebase that works properly both online and offline?
firebaser here
That's an interesting edge-case in the increment behavior. Between the client and the server neither can be certain whether the increment was executed or not, so it ends up being retried from the client upon the reconnect. This problem can only occur with the increment operation as far as I can tell, as all the other write operations are idempotent except for transactions, but those don't work while offline.
It is possible to ensure each increment happens only once, but it'll take some work:
First, add a nonce to the write operation that uniquely identifies this operation. You can use a push key for this, but any other UUID also works fine. Combine this with your original set() call into a single multi-path update call, writing the nonce to a top-level node with a server-side timestamp as its value.
Now in your security rules for the top-level location, only allow the write if there is no existing data. This ensures the secondary writes you're seeing get rejected, and since security rules are checked across multi-path updates as a whole, the faulty increment will get rejected too.
You'll probably want to periodically clean up the node with nonce keys, based on the timestamp value in there. It won't matter for performance (since you're never searching here outside of during the cleanup), but may help control the storage cost for the nonces.
I haven't used this approach for this specific use-case yet, but have done it for others. If you'd include a client-side retry, the above essentially builds your own multi-path transaction mechanism, which is what I needed it for in the past. But since you don't need that here, it's simpler without that.
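The idea above can be sketched without any Firebase code at all. The Python below is an in-memory stand-in, not the Firebase API: the NonceGuardedCounters class, its update method, and replay_demo are hypothetical names. It plays the role of the server plus the "only allow the write if there is no existing data" rule, and shows that a replayed write with the same nonce is rejected as a whole, so the increments are applied once.

```python
import uuid


class NonceGuardedCounters:
    """In-memory stand-in for a server that rejects replayed multi-path updates."""

    def __init__(self):
        self.values = {}  # path -> current counter value
        self.nonces = {}  # nonce -> timestamp of the accepted write

    def update(self, nonce, increments, timestamp):
        # Mirrors the rule on the nonce node: if the nonce already exists,
        # the whole update (increments included) is rejected.
        if nonce in self.nonces:
            return False
        self.nonces[nonce] = timestamp
        for path, delta in increments.items():
            self.values[path] = self.values.get(path, 0) + delta
        return True


def replay_demo():
    store = NonceGuardedCounters()
    for i in range(1, 201):
        nonce = str(uuid.uuid4())
        update = {"test/position": 1, "test/sum": i}
        store.update(nonce, update, timestamp=i)
        if i % 20 == 0:
            # Simulate the client retrying a write it could not confirm.
            store.update(nonce, update, timestamp=i)
    return store.values


if __name__ == "__main__":
    print(replay_demo())
```

Even with ten simulated retries, the counters end at position 200 and sum 20100, because each retry reuses a nonce the store has already accepted.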
Based on @puf's answer, you can proceed as follows:
Future<bool> incrementTest(int value, int dateOfToday) async {
  var id = db.push().key;
  Map<String, dynamic> _updates = {
    'test/position': ServerValue.increment(1),
    'test/sum': ServerValue.increment(value),
    'test/nonce/$id': dateOfToday,
  };
  db.child('previousPath').update(_updates)
      .catchError((error) => print('Increment Duplication Rejected ${error.message}'));
  return true;
}
Then, in Firebase Security Rules, you need to add a rule at the test/nonce/$id location, something like the following:
{
  "previousPath": {
    "test": {
      ".read": "auth != null", //It depends on your root rules
      ".write": "auth != null", //It depends on your root rules
      "nonce": {
        "$nonce_id": {
          ".validate": "!data.exists()" //THE MAGIC IS HERE
        }
      }
    }
  }
}
In this way, when the device tries to write to the database again (wrongly), Firebase will reject it since it already had a write with that same ID before.
I hope it serves someone else!!!

(Laravel 5) Monitor and optionally cancel an ALREADY RUNNING job on queue

I need to achieve the ability to monitor and be able to cancel an ALREADY RUNNING job on queue.
There's a lot of answers about deleting QUEUED jobs, but not on an already running one.
This is the situation: I have a "job", which consists of HUNDREDS OF THOUSANDS rows on a database, that need to be queried ONE BY ONE against a web service.
Every row needs to be picked up, queried against a web service, stored the response and its status updated.
I had that already working as a Command (launching from / outputting to console), but now I need to implement queues in order to allow piling up more jobs from more users.
So far I've seen Horizon (which doesn't run on Windows due to missing process control libs). However, from some demos I've seen, it lacks (I believe) a couple of things I need:
Dynamically configurable timeout (the whole job may take more than 12 hours, depending on the number of rows to process on the selected job)
Ability to CANCEL an ALREADY RUNNING job.
I also considered the option of generating EACH REQUEST as a new job instead of treating the whole collection of rows as one "job" (this would overcome the timeout issue), but that would give me a Horizon "pending jobs" list of hundreds of thousands of records per job, and that would kill the browser (I know Redis can handle this without any trouble). Further, I guess it is not possible to cancel "all jobs belonging to X tag".
I've been thinking about hitting an API route, fire the job and decouple it from the app, but I'm seeing that this requires forking processes.
For the ability to cancel, I would implement a database table with job_id, and when the user hits an API to cancel a job, I'd mark it as "halted". On every loop iteration the job would check its status and, if it finds "halted", kill itself.
If I've missed any aspect just holler and I'll add it or clarify about it.
So I'm asking for advice here since I'm new to Laravel: how could I achieve this?
So I finally came up with this (a bit clunky) solution:
In Controller:
public function cancelJob()
{
    $jobs = DB::table('jobs')->get();
    # I could use a specific ID and user owner filter, etc.
    foreach ($jobs as $job) {
        DB::table('jobs')->delete($job->id);
    }
    # This is a file that... well, it's self explaining
    touch(base_path(config('files.halt_process_signal')));
    return "Job cancelled - It will stop soon";
}
In job class (inside model::chunk() function)
# CHECK FOR HALT SIGNAL AND [OPTIONALLY] STOP THE PROCESS
if ($this->service->shouldHaltProcess()) {
    # build stats, do some cleanup, log, etc...
    $this->halted = true;
    $this->service->stopProcess();
    # Returning FALSE is what makes the chunk() method stop looping
    return false;
}
In service class:
/**
 * Checks the existence of the 'Halt Process Signal' file
 *
 * @return bool
 */
public function shouldHaltProcess() :bool
{
    return file_exists($this->config['files.halt_process_signal']);
}
/**
 * Stop the batch process
 *
 * @return void
 */
public function stopProcess() :void
{
    logger()->info("=== HALT PROCESS SIGNAL FOUND - STOPPING THE PROCESS ===");
    $this->deleteHaltProcessSignalFile();
    return ;
}
It doesn't look very elegant, but it works.
I've searched all over the web, and most people go for Horizon or other tools that don't fit my case.
If anyone has a better way to achieve this, you're welcome to share it.
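For readers who want the cooperative-cancel idea in isolation, here is a small Python sketch of the same pattern (not Laravel code). The function names, the signal file name, and the handle_row callback are all made up for the example. The worker checks for a signal file before each chunk, consumes it, and stops.

```python
import tempfile
from pathlib import Path


def process_in_chunks(rows, chunk_size, halt_file, handle_row):
    """Process rows chunk by chunk; stop early if the halt signal file appears."""
    processed = 0
    for start in range(0, len(rows), chunk_size):
        if halt_file.exists():
            halt_file.unlink()  # consume the signal so the next job is not halted too
            return processed, True
        for row in rows[start:start + chunk_size]:
            handle_row(row)
            processed += 1
    return processed, False


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        halt_file = Path(tmp) / "halt_process_signal"

        def handle_row(row):
            if row == 4:
                halt_file.touch()  # pretend a user hit the cancel endpoint mid-chunk

        result = process_in_chunks(list(range(10)), 3, halt_file, handle_row)
        return result, halt_file.exists()


if __name__ == "__main__":
    print(demo())
```

Note that the signal is only noticed at chunk boundaries, so the chunk size bounds how long a cancel request can take to be honoured. With a single shared file, the signal also cannot tell two concurrently running jobs apart; a per-job file name or a status column keyed by job id would be needed for that.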
Laravel queues have three important config options:
1. retry_after
2. timeout
3. tries
See more: https://laravel.com/docs/5.8/queues
Dynamically configurable timeout (the whole job may take more than 12 hours, depending on the number of rows to process on the selected job)
I think you can set timeout and retry_after to about 24 hours.
Ability to CANCEL an ALREADY RUNNING job.
Delete the job from the jobs table
Kill the worker process by its process ID on your server
Hope it helps :)

Moving from file-based tracing session to real time session

I need to log trace events during boot so I configure an AutoLogger with all the required providers. But when my service/process starts I want to switch to real-time mode so that the file doesn't explode.
I'm using TraceEvent and I can't figure out how to do this move correctly and atomically.
The first thing I tried:
const int timeToWait = 5000;
using (var tes = new TraceEventSession("TEMPSESSIONNAME", @"c:\temp\TEMPSESSIONNAME.etl") { StopOnDispose = false })
{
    tes.EnableProvider(ProviderExtensions.ProviderName<MicrosoftWindowsKernelProcess>());
    Thread.Sleep(timeToWait);
}
using (var tes = new TraceEventSession("TEMPSESSIONNAME", TraceEventSessionOptions.Attach))
{
    Thread.Sleep(timeToWait);
    tes.SetFileName(null);
    Thread.Sleep(timeToWait);
    Console.WriteLine("Done");
}
Here I wanted to make sure that I can transfer the session to real-time mode. But instead, the file I got contained events from a 15s period instead of just 10s.
The same happens if I use new TraceEventSession("TEMPSESSIONNAME", @"c:\temp\TEMPSESSIONNAME.etl", TraceEventSessionOptions.Create) instead.
It seems that the following will cause the file to stop being written to:
using (var tes = new TraceEventSession("TEMPSESSIONNAME"))
{
    tes.EnableProvider(ProviderExtensions.ProviderName<MicrosoftWindowsKernelProcess>());
    Thread.Sleep(timeToWait);
}
But here I must re-enable all the providers, and according to the documentation "if the session already existed it is closed and reopened (thus orphans are cleaned up on next use)". I don't understand the last part about orphans. Obviously some events might occur in the time between closing, opening and subscribing to the events. Does this mean I will lose these events, or will I get them later?
I also found the following in the documentation of the library:
In real time mode, events are buffered and there is at least a second or so delay (typically 3 sec) between the firing of the event and the reception by the session (to allow events to be delivered in efficient clumps of many events)
Does this make the above code alright (well, unless the improbable happens and for some reason my thread is delayed for more than a second between creating the real-time session and starting processing the events)?
I could close the session and create a new different one but then I think I'd miss some events. Or I could open a new session and then close the file-based one but then I might get duplicate events.
I couldn't find online any examples of moving from a file-based trace to a real-time trace.
I managed to contact the author of TraceEvent and this is the answer I got:
Re the exception of the 'auto-closing and restarting' feature, it is really questions about the OS (TraceEvent simply calls the underlying OS API). Just FYI, the deal about orphans is that it is EASY for your process to exit but leave a session going. This MAY be what you want, but often it is not, and so to make the common case 'just work' if you do Create (which is the default), it will close a session if it already existed (since you asked for a new one).
Experimentation of course is the touchstone of 'truth', but frankly, the expectation that unusual combinations will just work is generally NOT true.
My recommendation is to keep it simple. You need to open a new session and close the original one. Yes, you will end up with duplicates, but you CAN filter them out (after all they are IDENTICAL timestamps).
The other possibility is to use SetFileName in its intended way (from one file to another). This certainly solves your problem of file size growth, and is often a good way to deal with other scenarios (after all, you can start up your processing and start deleting files even as new files are being generated).
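The "open a new session, close the old one, filter out duplicates" recommendation comes down to a merge that drops repeated events. Here is a rough Python sketch of that filtering step. It does not use TraceEvent; the event tuple layout (timestamp, provider, event id, payload) is an assumption made for the illustration, as is the merge_sessions name.

```python
def merge_sessions(file_events, realtime_events):
    """Merge two overlapping event streams, keeping one copy of each identical event.

    Each event is assumed to be a hashable tuple:
    (timestamp, provider, event_id, payload).
    """
    seen = set()
    merged = []
    combined = list(file_events) + list(realtime_events)
    for event in sorted(combined, key=lambda e: e[0]):  # order by timestamp
        if event not in seen:
            seen.add(event)
            merged.append(event)
    return merged


if __name__ == "__main__":
    from_file = [(100, "Kernel-Process", 1, "start A"), (200, "Kernel-Process", 2, "stop A")]
    from_realtime = [(200, "Kernel-Process", 2, "stop A"), (300, "Kernel-Process", 1, "start B")]
    for event in merge_sessions(from_file, from_realtime):
        print(event)
```

One caveat of keying on the full tuple: two genuinely distinct events with the same timestamp and identical fields would be collapsed into one, so a real implementation may want to include more identifying fields in the key.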

WF4 InstancePersistenceCommand interrupted

I have a windows service, running workflows. The workflows are XAMLs loaded from database (users can define their own workflows using a rehosted designer). It is configured with one instance of the SQLWorkflowInstanceStore, to persist workflows when becoming idle. (It's basically derived from the example code in \ControllingWorkflowApplications from Microsoft's WCF/WF samples).
But sometimes I get an error like below:
System.Runtime.DurableInstancing.InstanceOwnerException: The execution of an InstancePersistenceCommand was interrupted because the instance owner registration for owner ID 'a426269a-be53-44e1-8580-4d0c396842e8' has become invalid. This error indicates that the in-memory copy of all instances locked by this owner have become stale and should be discarded, along with the InstanceHandles. Typically, this error is best handled by restarting the host.
I've been trying to find the cause, but it is hard to reproduce in development; on production servers, however, I get it once in a while. One hint I found: when I look at the LockOwnersTable, the lockexpiration is set to 01/01/2000 0:0:0 and is no longer being updated, while under normal circumstances it should be updated every x seconds according to the host lock renewal period...
So why would SQLWorkflowInstanceStore stop renewing this LockExpiration, and how can I detect the cause of it?
This happens because a background procedure tries to extend the instance store's lock every 30 seconds, and it seems that once a connection to SQL Server fails, the instance store is marked as invalid.
You can see the same behaviour if you delete the instance store's record from the [LockOwnersTable] table.
The proposed solution: when this exception fires, free the old instance store and initialize a new one.
public class WorkflowInstanceStore : IWorkflowInstanceStore, IDisposable
{
    public WorkflowInstanceStore(string connectionString)
    {
        _instanceStore = new SqlWorkflowInstanceStore(connectionString);
        InstanceHandle handle = _instanceStore.CreateInstanceHandle();
        InstanceView view = _instanceStore.Execute(handle,
            new CreateWorkflowOwnerCommand(), TimeSpan.FromSeconds(30));
        handle.Free();
        _instanceStore.DefaultInstanceOwner = view.InstanceOwner;
    }
    public InstanceStore Store
    {
        get { return _instanceStore; }
    }
    public void Dispose()
    {
        if (null != _instanceStore)
        {
            var deleteOwner = new DeleteWorkflowOwnerCommand();
            InstanceHandle handle = _instanceStore.CreateInstanceHandle();
            _instanceStore.Execute(handle, deleteOwner, TimeSpan.FromSeconds(10));
            handle.Free();
        }
    }
    private InstanceStore _instanceStore;
}
You can find best practices for creating an instance store handle at this link:
Workflow Instance Store Best practices
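The "free the old store and initialize a new one" step is essentially a rebuild-and-retry-once wrapper. The Python sketch below shows only that control flow, with stand-ins for everything WF-specific: OwnerInvalidError plays the role of InstanceOwnerException, and FakeStore, persist, and StoreHolder are hypothetical names for the example.

```python
class OwnerInvalidError(Exception):
    """Stand-in for InstanceOwnerException (owner registration became invalid)."""


class FakeStore:
    """Hypothetical store whose owner registration can go stale."""

    def __init__(self):
        self.valid = True


def persist(store):
    # Hypothetical persistence call that fails when the owner is stale.
    if not store.valid:
        raise OwnerInvalidError("owner registration has become invalid")
    return "persisted"


class StoreHolder:
    """Keeps one store and rebuilds it once if its owner registration is stale."""

    def __init__(self, create_store):
        self._create_store = create_store
        self.store = create_store()

    def execute(self, operation):
        try:
            return operation(self.store)
        except OwnerInvalidError:
            # Discard the stale store, register a fresh one, and retry once.
            self.store = self._create_store()
            return operation(self.store)


if __name__ == "__main__":
    holder = StoreHolder(FakeStore)
    holder.store.valid = False  # simulate a missed lock renewal
    print(holder.execute(persist))
```

If the second attempt also fails, the exception propagates, which seems a reasonable point to fall back to restarting the host as the error message suggests.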
This is an old thread but I just stumbled on the same issue.
Damir's Corner suggests checking whether the instance handle is still valid before calling the instance store. I hereby quote the whole post:
Certain aspects of Workflow Foundation are still poorly documented; the persistence framework being one of them. The following snippet is typically used for setting up the instance store:
var instanceStore = new SqlWorkflowInstanceStore(connectionString);
instanceStore.HostLockRenewalPeriod = TimeSpan.FromSeconds(30);
var instanceHandle = instanceStore.CreateInstanceHandle();
var view = instanceStore.Execute(instanceHandle,
new CreateWorkflowOwnerCommand(), TimeSpan.FromSeconds(10));
instanceStore.DefaultInstanceOwner = view.InstanceOwner;
It's difficult to find a detailed explanation of what all of this
does; and to be honest, usually it's not necessary. At least not,
until you start encountering problems, such as InstanceOwnerException:
The execution of an InstancePersistenceCommand was interrupted because
the instance owner registration for owner ID
'9938cd6d-a9cb-49ad-a492-7c087dcc93af' has become invalid. This error
indicates that the in-memory copy of all instances locked by this
owner have become stale and should be discarded, along with the
InstanceHandles. Typically, this error is best handled by restarting
the host.
The error is closely related to the HostLockRenewalPeriod property
which defines how long obtained instance handle is valid without being
renewed. If you try monitoring the database while an instance store
with a valid instance handle is instantiated, you will notice
[System.Activities.DurableInstancing].[ExtendLock] being called
periodically. This stored procedure is responsible for renewing the
handle. If for some reason it fails to be called within the specified
HostLockRenewalPeriod, the above mentioned exception will be thrown
when attempting to persist a workflow. A typical reason for this would
be temporarily inaccessible database due to maintenance or networking
problems. It's not something that happens often, but it's bound to
happen if you have a long living instance store, e.g. in a constantly
running workflow host, such as a Windows service.
Fortunately it's not all that difficult to fix the problem, once you
know the cause of it. Before using the instance store you should
always check, if the handle is still valid; and renew it, if it's not:
if (!instanceHandle.IsValid)
{
    instanceHandle = instanceStore.CreateInstanceHandle();
    var view = instanceStore.Execute(instanceHandle,
        new CreateWorkflowOwnerCommand(), TimeSpan.FromSeconds(10));
    instanceStore.DefaultInstanceOwner = view.InstanceOwner;
}
It's definitely less invasive than the restart of the host, suggested
by the error message.
You have to make sure the instance owner has not expired.
Here is how I usually handle this issue:
public SqlWorkflowInstanceStore SetupSqlpersistenceStore()
{
    SqlWorkflowInstanceStore sqlWFInstanceStore = new SqlWorkflowInstanceStore(ConfigurationManager.ConnectionStrings["DB_WWFConnectionString"].ConnectionString);
    sqlWFInstanceStore.InstanceCompletionAction = InstanceCompletionAction.DeleteAll;
    InstanceHandle handle = sqlWFInstanceStore.CreateInstanceHandle();
    InstanceView view = sqlWFInstanceStore.Execute(handle, new CreateWorkflowOwnerCommand(), TimeSpan.FromSeconds(30));
    handle.Free();
    sqlWFInstanceStore.DefaultInstanceOwner = view.InstanceOwner;
    return sqlWFInstanceStore;
}
And here is how you can use this method:
wfApp.InstanceStore = SetupSqlpersistenceStore();
I hope this helps.

Cancelling an Entity Framework Query

I'm in the process of writing a query manager for a WinForms application that, among other things, needs to be able to deliver real-time search results to the user as they're entering a query (think Google's live results, though obviously in a thick client environment rather than the web). Since the results need to start arriving as the user types, the search will get more and more specific, so I'd like to be able to cancel a query if it's still executing while the user has entered more specific information (since the results would simply be discarded, anyway).
If this were ordinary ADO.NET, I could obviously just use the DbCommand.Cancel function and be done with it, but we're using EF4 for our data access and there doesn't appear to be an obvious way to cancel a query. Additionally, opening System.Data.Entity in Reflector and looking at EntityCommand.Cancel shows a discouragingly empty method body, despite the docs claiming that calling this would pass it on to the provider command's corresponding Cancel function.
I have considered simply letting the existing query run and spinning up a new context to execute the new search (and just disposing of the existing query once it finishes), but I don't like the idea of a single client having a multitude of open database connections running parallel queries when I'm only interested in the results of the most recent one.
All of this is leading me to believe that there's simply no way to cancel an EF query once it's been dispatched to the database, but I'm hoping that someone here might be able to point out something I've overlooked.
TL/DR Version: Is it possible to cancel an EF4 query that's currently executing?
It looks like you have found a bug in EF, but when you report it to MS it will probably be treated as a bug in the documentation. Anyway, I don't like the idea of interacting directly with EntityCommand. Here is my example of how to kill the current query:
var thread = new Thread((param) =>
{
    var currentString = param as string;
    if (currentString == null)
    {
        // TODO OMG exception
        throw new Exception();
    }
    AdventureWorks2008R2Entities entities = null;
    try // Don't use using because it can cause race condition
    {
        entities = new AdventureWorks2008R2Entities();
        ObjectQuery<Person> query = entities.People
            .Include("Password")
            .Include("PersonPhone")
            .Include("EmailAddress")
            .Include("BusinessEntity")
            .Include("BusinessEntityContact");
        // Improves performance of readonly query where
        // objects do not have to be tracked by context
        // Edit: But it doesn't work for this query because of includes
        // query.MergeOption = MergeOption.NoTracking;
        foreach (var record in query
            .Where(p => p.LastName.StartsWith(currentString)))
        {
            // TODO fill some buffer and invoke UI update
        }
    }
    finally
    {
        if (entities != null)
        {
            entities.Dispose();
        }
    }
});
thread.Start("P");
// Just for test
Thread.Sleep(500);
thread.Abort();
It is the result of playing with it for about 30 minutes, so it should probably not be considered a final solution. I'm posting it to at least get some feedback on possible problems with this approach. The main points are:
The context is handled inside the thread
The result is not tracked by the context
If you kill the thread, the query is terminated and the context is disposed (connection released)
If you kill the thread before you start a new thread, you should still use only one connection.
I checked in SQL Profiler that the query is started and terminated.
Edit:
Btw., another approach is to simply stop the current query during enumeration:
public IEnumerable<T> ExecuteQuery<T>(IQueryable<T> query)
{
    foreach (T record in query)
    {
        // Handle stop condition somehow
        if (ShouldStop())
        {
            // Once you close enumerator, query is terminated
            yield break;
        }
        yield return record;
    }
}
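The enumeration-based approach is language-agnostic, so here is the same shape as a Python generator. This is a sketch, not EF code: the rows iterable stands in for a streaming query result, and the threading.Event used as the stop condition is an assumption (any flag the UI can set would do). It shows only that the consumer stops pulling rows once the flag is set; whether the database stops working at that point depends on the provider.

```python
import threading


def execute_query(rows, stop_event):
    """Yield rows until stop_event is set; then end the enumeration early."""
    for row in rows:
        if stop_event.is_set():
            return  # equivalent of `yield break`: no further rows are pulled
        yield row


def take_until_cancelled():
    stop_event = threading.Event()
    seen = []
    for row in execute_query(range(100), stop_event):
        seen.append(row)
        if row == 2:
            stop_event.set()  # e.g. the user typed another character
    return seen


if __name__ == "__main__":
    print(take_until_cancelled())
```

In a live-search UI, each keystroke would set the event for the previous search and start a new enumeration with a fresh event, so only the latest query keeps delivering results.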