How can continuous Azure WebJobs be idempotent and send email?

After reading tons of information on the web about Azure WebJobs, the documentation says a job should be idempotent; on the other hand, blogs say they use WebJobs for actions such as "charging a customer" or "sending an e-mail".
This documentation says that running a continuous WebJob on multiple instances with a queue could result in a message being processed more than once. Do people really ignore the fact that they could charge their customer twice, or send an e-mail twice?
How can I make sure that when I run a WebJob with a queue on a scaled-out web app, messages are processed only once?

I do this using a database: an UPDATE query with a row lock inside a TransactionScope.
In your Order table, create a column to track the state of the action your WebJob performs, e.g. EmailSent.
In the QueueTrigger function, begin a transaction, then execute an UPDATE on the customer order with ROWLOCK that sets EmailSent = 1 WHERE EmailSent = 0. If the number of rows affected by the SqlCommand is 0, exit the function: another WebJob instance has already sent the email. Otherwise, send the email and, if it was sent successfully, call Complete() on the TransactionScope.
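A minimal sketch of that pattern (the queue name, the Orders table and column names, the connection string, and the SendEmail helper are assumptions for illustration, not a definitive implementation):

using System.Data.SqlClient;
using System.Transactions;
using Microsoft.Azure.WebJobs;

public static void SendOrderEmail([QueueTrigger("order-emails")] string orderId)
{
    // connectionString is assumed to come from configuration.
    using (var scope = new TransactionScope())
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();

        // Claim the row: only one instance can flip EmailSent from 0 to 1.
        var cmd = new SqlCommand(
            "UPDATE Orders WITH (ROWLOCK) SET EmailSent = 1 " +
            "WHERE OrderId = @id AND EmailSent = 0", conn);
        cmd.Parameters.AddWithValue("@id", orderId);

        if (cmd.ExecuteNonQuery() == 0)
            return; // another instance has already sent this email

        SendEmail(orderId); // assumed helper that actually sends the message

        // Commit only after the email has been sent successfully; otherwise
        // the UPDATE rolls back and the message can be retried.
        scope.Complete();
    }
}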
That should provide the idempotency you want.
Hope that helps.

Related

Batch status - 'validating' after failed creation

Hi, I'm trying to work with shipping batches. After I create a batch like this:
{"default_carrier_account":"9348***********50","default_servicelevel_token":"usps_priority","metadata":"test","label_filetype":"PDF_4x6","batch_shipments":[{"carrier_account":"93********************","servicelevel_token":"usps_priority","shipment":"c8c411c2ad8b497eb583decf7c3c614d","metadata":1},{"carrier_account":"9348ce6eecf**********ab850","servicelevel_token":"usps_priority","shipment":"768ae43826b04040b32490a6f069fa4f","metadata":2}]}
I get a notification like this:
batch 0f0b69ae42bc475ab3c1421edddeb4fc creation failed
After this I try to make an API request to get the batch data (status, messages, etc.). I made a POST request to: http://api.goshippo.com/batches/0f0b69ae42bc475ab3c1421edddeb4fc?page=1
and got this response:
{
  "object_id": "0f0b69ae42bc475ab3c1421edddeb4fc",
  "object_owner": "info#skumatrix.com",
  "status": "VALIDATING",
  "object_created": "2017-04-16T16:35:24.925Z",
  "object_updated": "2017-04-16T16:35:27.143Z",
  "metadata": "test",
  "default_carrier_account": "9***************b850",
  "default_servicelevel_token": "usps_priority",
  "label_filetype": "PDF_4x6",
  "batch_shipments": {
    "count": 0,
    "next": null,
    "previous": null,
    "results": {
    }
  },
  "object_results": {
    "purchase_succeeded": 0,
    "purchase_failed": 0,
    "creation_failed": 0,
    "creation_succeeded": 0
  },
  "label_url": {
  }
}
What I don't understand is: why is the status still VALIDATING, and why are there no error messages?
For starters, the default status of a Batch object in Shippo is VALIDATING, which is why it stays in that state, although it can be a little confusing when there is an unexpected failure (which is what appears to have happened here).
As mentioned in the comments, this failure occurred because a Batch purchase was attempted using a collection of Shipment object_ids. The Batch endpoint is actually meant to let you create a collection of Shipment objects en masse; later, you can batch-purchase the labels for your desired rates on those Shipment objects.
Rate retrieval is generally the more time-consuming process, depending on how many connected shipping accounts you have, so Batch creation is intended to let Shippo retrieve rates for a lot of packages while you simply check on them once they are done (or get notified of their completion via Shippo's webhooks).
So moving forward, make sure that you first create the Batch with a collection of Shipments (see here). Then you can proceed to create the labels for the shipments like so.

Updating entities in ndb while paging with cursors

To keep things short: I have to make a script in Second Life that communicates with an App Engine app updating records in an ndb database. Records extracted from the database are sent as a batch (a page) to the LSL script, which updates the customers, then asks the web app to mark these customers as updated in the database.
To create the batch I use a query on an (integer) property, update_ver == 0, and use fetch_page() to produce a cursor to the next batch. This cursor is also sent as a urlsafe()-encoded parameter to the LSL script.
To mark a customer as updated, update_ver is set to some other value, like 2, and the entity is updated via put_async(). Then the LSL script fetches the next batch using the cursor sent earlier.
My rather simple question is: in the web app, since the query property update_ver no longer satisfies the filter, is my cursor still valid? Or do I have to use another strategy?
Stripping out irrelevant parts (including authentication), my code currently looks like this (Customer is the entity in my database).
class GetCustomers(webapp2.RequestHandler):  # handler that sends batches to the update script in SL
    def get(self):
        cursor = self.request.get("next", default_value=None)
        query = Customer.query(Customer.update_ver == 0,
                               ancestor=customerset_key(),
                               projection=[Customer.customer_name,
                                           Customer.customer_key]).order(Customer._key)
        if cursor:
            results, cursor, more = query.fetch_page(
                batchsize, start_cursor=ndb.Cursor(urlsafe=cursor))
        else:
            results, cursor, more = query.fetch_page(batchsize)
        if more:
            self.response.write("more=1\n")
            self.response.write("next={}\n".format(cursor.urlsafe()))
        else:
            self.response.write("more=0\n")
        self.response.write("n={}\n".format(len(results)))
        for c in results:
            self.response.write("c={},{},{}\n".format(
                c.customer_key, c.customer_name, c.key.urlsafe()))
        self.response.set_status(200)
The handler that updates Customer entities in the database is the following. The c= parameters are urlsafe()-encoded entity keys of the records to update and the nv= parameter is the new version number for their update_ver property.
class UpdateCustomer(webapp2.RequestHandler):
    @ndb.toplevel  # don't exit until all async operations are finished
    def post(self):
        updatever = int(self.request.get("nv"))
        customers = self.request.get_all("c")
        if not customers:
            self.response.set_status(403)
            return
        for ckey in customers:
            cust = ndb.Key(urlsafe=ckey).get()
            cust.update_ver = updatever  # the filter in the query used to produce the cursor was using this property!
            cust.update_date = datetime.datetime.utcnow()
            cust.put_async()
        self.response.set_status(200)
Will this work as expected? Thanks for any help!
Your strategy will work, and that's the whole point of using these cursors: they are efficient, and you can get the next batch as intended regardless of what happened to the previous one.
On a side note, you could also optimise your UpdateCustomer handler: instead of retrieving and saving entities one by one, you can do things in batches using, for example, ndb.put_multi_async.

How do you ensure consistent client reads in an eventually consistent system?

I'm digging into CQRS and I am looking for articles on how to solve client reads in an eventually consistent system. Consider, for example, a web shop where users can add items to their cart. How can you ensure that the client displays the items in the cart if the actual processing of the "AddItemToCart" command is done asynchronously? I understand the principles of dispatching commands asynchronously and updating the read model asynchronously based on domain events, but I fail to see how this is handled from the client's perspective.
There are a few different ways of doing it:
Wait at the user until consistent
Just poll the server until you get the read model updated. This is similar to what Ben showed.
Ensure consistency through 2PC
You have a queue that supports DTC, and your commands are put there first. They are then executed, the events sent, and the read model updated, all inside a single transaction. You have not actually gained anything with this method though, so don't do it this way.
Fool the client
Place the read models in local storage on the client and update them when the corresponding event is sent; but you were expecting this event anyway, so you had already updated the JavaScript view of the shopping cart.
I'd recommend you have a look at the Microsoft Patterns & Practices team's guidance on CQRS. Although this is still a work in progress, they have offered one solution to the issue you've raised.
Their approach for commands requiring feedback is to submit the command asynchronously, redirect to another controller action, and then poll the read model until the expected change appears or a time-out occurs. This uses the Post-Redirect-Get pattern, which works better with the browser's forward and back navigation buttons, and gives the infrastructure more time to process the command before the MVC controller starts polling.
Example code from the RegistrationController, using ASP.NET MVC 4 asynchronous controllers:
[HttpGet]
[OutputCache(Duration = 0, NoStore = true)]
public Task<ActionResult> SpecifyRegistrantAndPaymentDetails(Guid orderId, int orderVersion)
{
    return this.WaitUntilOrderIsPriced(orderId, orderVersion)
        .ContinueWith<ActionResult>(
            ...
        );
}
...
private Task<PricedOrder> WaitUntilOrderIsPriced(Guid orderId, int lastOrderVersion)
{
    return TimerTaskFactory.StartNew<PricedOrder>(
        () => this.orderDao.FindPricedOrder(orderId),
        order => order != null && order.OrderVersion > lastOrderVersion,
        PricedOrderPollPeriodInMilliseconds,
        DateTime.Now.AddSeconds(PricedOrderWaitTimeoutInSeconds));
}
I'd probably use AJAX polling instead of having a blocked web request at the server.
Post-Redirect-Get
You're hoping that the command finishes in time, before the Get is called. What if the command takes 10 seconds to complete in the back end, but the Get is issued after 1 second?
Local Storage
By storing the result of the command on the client while the command goes off to execute, you're assuming that the command will go through without errors. What if the back end runs into an error while processing it? Then what you have locally isn't consistent.
Polling
Polling seems to be the option that is actually in line with eventual consistency; you're not faking or assuming anything. Your polling mechanism can run asynchronously as part of your page, e.g. a shopping cart component polls until it gets an update, without refreshing the page.
Callbacks
You could introduce something like webhooks to call back to the client, if the client is capable of receiving them. The back end provides a correlation id once it accepts the command; when the command has finished processing, the back end notifies the front end of the command's status, along with the correlation id, indicating whether the command went through successfully or not. There is no need for any kind of polling with this approach.
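For the callback approach, a minimal sketch of what the correlation-id handshake might look like in an ASP.NET MVC controller (the commandBus, commandStatusStore, and AddItemToCartCommand names are assumptions for illustration, not part of any referenced framework):

[HttpPost]
public ActionResult AddItemToCart(AddItemToCartCommand command)
{
    // Accept the command, hand it off for asynchronous processing, and give
    // the client a correlation id it can match against the eventual callback.
    var correlationId = Guid.NewGuid();
    this.commandBus.Send(command, correlationId); // assumed command bus

    return Json(new { correlationId });
}

[HttpGet]
public ActionResult CommandStatus(Guid correlationId)
{
    // Fallback for clients that cannot receive callbacks: the back end records
    // success or failure against the correlation id once processing finishes.
    var status = this.commandStatusStore.Find(correlationId); // assumed store
    return Json(status, JsonRequestBehavior.AllowGet);
}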

Working with Google Cloud Print

I have begun digging around in the Google Cloud Print project in hopes of creating a custom application for my network. I have a Windows print server running with a printer queue I wish to submit jobs to. I set up Google Cloud Print with the Chrome browser and was able to submit and print jobs just fine. However, my end goal is a little more complicated.
I need to customize the access control from the server-side client endpoint once the job reaches my network. Meaning, once the job is submitted, I need to be able to check the username of the owner of the job and process it accordingly. From the looks of it, the /fetch interface does not store the original owner of the job, just the owner of the queue it ends up on. For example, User A has the Google Cloud printer linked to their account and has shared it with User B. User B submits a job to the shared queue. When I run /fetch on the shared printerID, the user is User A.
Has anyone else dabbled with this?
Thanks
Take a look at the ownerId of the job.
The /fetch call does in fact return a user field containing the owner of the printer (User A), but each job returned contains an ownerId field whose value is the user that submitted the print job (User B).
Hope that helps.
Following is a partial response from a /fetch call, including the ownerId of the job. Keep in mind that this could be one of many jobs returned in the jobs array:
...
updateTime: "1403628993840",
status: "QUEUED",
ownerId: "rpreeves#gmail.com",
rasterUrl: "https://www.google.com/cloudprint/download?id=5ca7b1e4-c533-c42b-7d2b-efb862c4215a&forcepwg=1",
ticketUrl: "https://www.google.com/cloudprint/ticket?format=ppd&output=json&jobid=5ca7b1e4-c633-c42b-782b-efb862c4215a",
printerid: "f33c6ff8-fc25-7075-249b-ab65c3e2354e",
...
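A rough sketch of how you might act on that field when fetching jobs (the HTTP call details, authentication handling, and the ProcessJobFor helper are assumptions for illustration; only the jobs array and ownerId field come from the response above):

using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

// httpClient is assumed to be an already-authenticated HttpClient;
// printerId is the shared printer's id.
async Task RouteFetchedJobsAsync(HttpClient httpClient, string printerId)
{
    var body = await httpClient.GetStringAsync(
        "https://www.google.com/cloudprint/fetch?printerid=" + printerId);

    foreach (var job in JObject.Parse(body)["jobs"])
    {
        // ownerId is the user who submitted the job (User B), not the queue owner.
        var submitter = (string)job["ownerId"];
        ProcessJobFor(submitter, job); // hypothetical routing / access-control logic
    }
}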

Obtain ServiceDeploymentId in TrackingParticipant

In WF4, I've created a descendant of TrackingParticipant. In the Track method, record.InstanceId gives me the GUID of the workflow instance.
I'm using the SqlWorkflowInstanceStore for persistence. By default records are automatically deleted from the InstancesTable when the workflow completes. I want to keep it that way to keep the transaction database small.
This creates a problem for reporting, though. My TrackingParticipant will log the instance ID to a reporting table (along with other tracking information), but I'll want to join to the ServiceDeploymentsTable. If the workflow is complete, that GUID won't be in the InstancesTable, so I won't be able to look up the ServiceDeploymentId.
How can I obtain the ServiceDeploymentId in the TrackingParticipant? Alternately, how can I obtain it in the workflow to add it to a CustomTrackingRecord?
You can't get the ServiceDeploymentId in the TrackingParticipant. Basically the ServiceDeploymentId is an internal detail of the SqlWorkflowInstanceStore.
I would either configure the SqlWorkflowInstanceStore not to delete the workflow instance upon completion, and then delete it myself at some later point in time after saving the ServiceDeploymentId together with the InstanceId.
An alternative is to use auto cleanup with the SqlWorkflowInstanceStore and retrieve the ServiceDeploymentId when the first tracking record is generated. At that point the workflow is not complete, so the original instance record is still there.
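A rough sketch of that second option (the instance store connection string, the reporting call, and the exact table and column names used in the query are assumptions based on the tables mentioned above; treat it as a starting point, not a definitive implementation):

using System;
using System.Activities.Tracking;
using System.Collections.Generic;
using System.Data.SqlClient;

public class ReportingTrackingParticipant : TrackingParticipant
{
    private readonly Dictionary<Guid, object> deploymentIds = new Dictionary<Guid, object>();
    private readonly string instanceStoreConnectionString; // assumed to be supplied at construction

    protected override void Track(TrackingRecord record, TimeSpan timeout)
    {
        object serviceDeploymentId;
        if (!this.deploymentIds.TryGetValue(record.InstanceId, out serviceDeploymentId))
        {
            // On the first record for this instance the workflow is still running,
            // so its row should still be present in the instance store.
            using (var conn = new SqlConnection(this.instanceStoreConnectionString))
            {
                conn.Open();
                var cmd = new SqlCommand(
                    "SELECT ServiceDeploymentId " +
                    "FROM [System.Activities.DurableInstancing].[InstancesTable] " +
                    "WHERE InstanceId = @id", conn);
                cmd.Parameters.AddWithValue("@id", record.InstanceId);
                serviceDeploymentId = cmd.ExecuteScalar();
                this.deploymentIds[record.InstanceId] = serviceDeploymentId;
            }
        }

        // Write record.InstanceId plus the cached serviceDeploymentId (and any
        // other tracking data) to the reporting table here.
    }
}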