ZIO [Declarative] Transactional Management - scala

I spent a lot of time figuring out transactions on my otherwise well-running ZIO + http4s + Doobie project.
How can I have proper [declarative] transaction management? Something like Spring's @Transactional.
I read the very good post from-transactional-to-type-safe-reasonable-transactions, but ironically I now face exactly the challenge it attributes to Spring: being confused about how and where the transactions are actually handled :)
I also tried tranzactio and even opened a question under its issues, where my PR (on my fork) seems to indicate that individual transact calls are grouped into a single transaction?!
And here in bootzooka, which is on the ZIO references list, I don't understand why the Resource is 'used' only once in main, with transact then called around the logic. Well, I understand why it works, but I don't see any 'Strategy' or anything else that would let me control what happens to the second flatMapped effect when an error occurs in the first ZIO effect...
Or how to ensure there is a single transaction for an entire http4s request.
Or an error handler with @RequiresNew semantics, so it can actually store information about the error somewhere in the database even when all the previous logic is correctly rolled back.
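To make the last two points concrete, here is my current understanding of how this would look in plain Doobie - a minimal sketch, not my actual code, assuming doobie with zio-interop-cats on the classpath; xa is a hypothetical Transactor[Task] coming from a ZLayer, and the users/audit tables are made up:

import doobie._
import doobie.implicits._
import zio.Task
import zio.interop.catz._

def insertUser(name: String): ConnectionIO[Int] =
  sql"insert into users (name) values ($name)".update.run

def insertAudit(msg: String): ConnectionIO[Int] =
  sql"insert into audit (msg) values ($msg)".update.run

// Composing ConnectionIO values and calling .transact exactly once should
// give a single transaction: if insertAudit fails, insertUser rolls back too.
def createUser(xa: Transactor[Task], name: String): Task[Int] =
  (for {
    n <- insertUser(name)
    _ <- insertAudit(s"created $name")
  } yield n).transact(xa)

// A second, separate .transact call runs in a new, independent transaction -
// which looks like the moral equivalent of @RequiresNew for an error handler.
def createUserLogged(xa: Transactor[Task], name: String): Task[Int] =
  createUser(xa, name).tapError(e =>
    insertAudit(s"failed: ${e.getMessage}").transact(xa))

But I don't see how to express this declaratively, or how to scope it per request.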
I wouldn't mind helping others by writing up something about this, but I clearly have more to learn about Doobie and ZLayers and so forth.
If there is something I am clearly missing here, please let me know :-)

Related

Lagom | Return Values from read side processor

We are using Lagom to develop our set of microservices. The trick here is that although we are using event sourcing and persisting events into Cassandra, we also have to store the data in a graph DB, since for our use case it will be the one serving most of the queries.
As per Lagom's documentation, all insertion into the graph database (or any other database) has to be done in a ReadSideProcessor after the command handler persists the events into Cassandra, following the philosophy of CQRS.
Now here is the problem we are facing. We believe that the ReadSideProcessor is a listener which gets triggered after the events are generated and persisted. What we want is to be able to return a response from the ReadSideProcessor back to the ServiceImpl. For example, when a user is added to the system, the unique ID generated by the graph DB has to be returned as one of the response headers. How can that be achieved in Lagom, given that the response is constructed from setCommandHandler and not the ReadSideProcessor?
Also, we need to make sure that if there is any error on the graph side, the API notifies the client that the request has failed; but again, exceptions occurring in the ReadSideProcessor are not propagated to either the PersistentEntity or the ServiceImpl class. How can that be achieved as well?
Any help is much appreciated.
The read side processor is not a listener attached to the command - it is actually completely disconnected from the persistent entity. It may be running on a different node, at a different time, perhaps even years in the future if you add a new read side processor that first has to come up to speed with all the old events in history. If the read side processor were connected synchronously to the command, it would not be CQRS: there would be no segregation between the command and the query side.
Read side processors essentially poll the database for new events, processing them as they are detected. You can add a new read side processor at any time, and it will get all events from all of history, not just the new ones being added. This is one of the great things about event sourcing: you don't need to anticipate all your query needs from the start; you can add them as the query need arises.
To further explain why you don't want a connection between the two: what happens if the event persist succeeds, but the update on the graph DB fails? Perhaps the graph DB has crashed. Does the command have to retry? Does the event have to be deleted? What happens if the node doing the update itself crashes before it has an opportunity to fix the problem? Now your read side is in an inconsistent state relative to your entities. Connecting them leads to inconsistency in many failure scenarios. It's like when you update your address with a utility company, but your bills still go to the old address; you contact them, and they say "yes, your new address is updated in our system", yet the bills still go to the old address. That's the sort of terrible user experience you are signing your users up for if you try to connect your read side and write side together. Disconnecting them allows Lagom to ensure consistency between the events you have emitted on the write side and the consumption of them on the read side.
So to address your specific concerns: ID generation should be done on the write side; or, if a subsequent ID is generated on the read side, it should also provide a way of mapping the write-side IDs to the read-side ID. As for handling errors on the read side: all validation should be done on the write side - the write side should ensure that it never emits an event that is invalid.
Now if the read side processor encounters something invalid, it has two options. One option is that it could fail. In many cases this is a good option, since if something is invalid or inconsistent, it's likely that you either have a bug or some form of corruption. What you don't want to do is continue processing as if everything is fine, since that might make the data corruption or inconsistency even worse. Instead the read side processor stops, your monitoring detects the error, and you can go in and work out either what the bug is or how to fix the corruption. Of course, there are downsides to doing this: your read side will start lagging behind the write side while it's unable to process new events. But that's also an advantage of CQRS - the write side is able to continue working, continue enforcing consistency, and so on; the failure is isolated to the read side, and only to updating the read side. Instead of your whole system going down and refusing to accept new requests because of this bug, the failure is isolated to just where the problem is.
The other option the read side has is to store the error somewhere - e.g., put the event in a dead letter table, or raise some sort of trouble ticket - and then continue processing. This way you can go and fix the event after the fact. This ensures greater availability, but comes at the risk that, if the event it failed to process was important to the processing of subsequent events, you've potentially just gotten yourself into a bigger mess.
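For that second option, a rough sketch of the shape - deliberately not Lagom's actual API, with a hypothetical process function doing the real read side update and a hypothetical storeDeadLetter sink:

import scala.concurrent.{ExecutionContext, Future}

final case class DeadLetter(eventId: String, error: String)

// Attempt the real read side update; if it fails, record the event in a
// dead letter table and let the processor continue with the next event.
def handleWithDeadLetter[E](event: E, eventId: String)(
    process: E => Future[Unit],
    storeDeadLetter: DeadLetter => Future[Unit]
)(implicit ec: ExecutionContext): Future[Unit] =
  process(event).recoverWith { case err =>
    storeDeadLetter(DeadLetter(eventId, err.getMessage))
  }

The important property is that the failure is recorded and the stream keeps advancing, rather than the processor stopping.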
Now, this does introduce specific constraints on what you can and can't do, but I can't really anticipate those without specific knowledge of your use case. A common constraint is set validation - for example, how do you ensure that email addresses are unique to a single user in your system? Greg Young (the CQRS guy) wrote this blog post about those types of problems:
http://codebetter.com/gregyoung/2010/08/12/eventual-consistency-and-set-validation/

Sorting Gmail threads by date

I'm hoping someone can suggest a good technique for sorting Gmail threads by date without needing to get details on potentially thousands of threads.
Right now I use threads.list to get a list of threads, and I'm using whatever order they're returned in. That's mostly correct in that it returns threads in reverse chronological order. Except that chronology is apparently determined by the first message in a thread rather than the thread's most recent message.
That's fine for getting new messages at the top of the list but it's not good if someone has a new reply to a thread that started a few days ago. I'd like to put that at the top of the list, since it's a new message. But threads.list leaves it sorted based on the first message.
I thought the answer might be to sort threads based on historyId. That does sort by the most recent message. But it's also affected by any other thread change. If the user updates the labels on a message (by starring it, for example), historyId changes. Then the message sorts to the top even though it's not new.
I could use threads.get to get details of the thread, and do more intelligent sorting based on that. But users might have thousands of threads and I don't want to have to make this call for every one of them.
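Conceptually, the ordering I want is simple - something like this sketch, if only I could get per-message dates cheaply (internalDate here is the message timestamp from the API's message resource):

final case class Msg(id: String, threadId: String, internalDate: Long)

// Order threads by their most recent message, newest first, using only
// message metadata - no per-thread threads.get call.
def threadOrder(messages: Seq[Msg]): Seq[String] =
  messages
    .groupBy(_.threadId)
    .view
    .mapValues(_.map(_.internalDate).max)
    .toSeq
    .sortBy { case (_, latest) => -latest }
    .map { case (threadId, _) => threadId }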
Does anyone have a better approach? Something I've missed?
I'm not a developer and have never used the API before, but I just read the API documentation and it doesn't seem to have the functionality you want.
Anyway, this is what I understood from your question:
You want to organize threads by the latest message in each one.
I thought you could use a combination of users.messages and threads.list. In users.messages you'll have the threadId:
threadId (string): The ID of the thread the message belongs to.
The method would be to use the date on users.messages to order the latest messages from newest to oldest, then obtain each message's parent thread by its threadId, and present the threads from threads.list in the order of their latest received message.
With this method you avoid fetching the details of each thread, saving resources and time.
I don't know how new messages are affected by labels or starring; you'll have to find that out.
I apologize in advance if my answer is incorrect or misleading.

What coredata errors should I prepare for?

I have an app getting close to its release date, but it occurred to me that wherever I have Core Data save and/or fetch requests I'm not really handling the errors other than to check if they exist and @throw them, which I'm sure sounds almost like nails on a chalkboard to more experienced programmers, and surely there's some kind of disaster waiting to happen.
So to be specific: what kinds of errors can I expect from A) fetches and B) saves, and C) in general terms, how should I deal with these?
You can see the Core Data Constants Reference to get an idea about what kind of errors you can expect to see in general.
For fetches, the most common issue is that the fetch returns an empty array. Make sure that your view controllers, data sources, and delegates can handle an empty fetch. If you dynamically construct complex predicates, make sure you catch the exceptions an invalid predicate can raise.
Most save errors result from validation errors. You should have error recovery for every validation you supply. One common and somewhat hidden validation error is failing to provide a required relationship.
One thing that trips people up in Objective-C is that errors and exceptions are slightly different creatures than they are in other languages. In Objective-C, an error is something the programmer should anticipate and plan for in the normal operation of the application, e.g. a missing file. By contrast, an exception is something exceptional that the programmer wouldn't expect the app to have to routinely handle, e.g. a corrupted file.
Therefore, in Core Data a validation failure would be a common, expected, and unexceptional error, whereas a corrupted persistent store would be a rare, unexpected, and highly exceptional exception.
See the Exceptions Programming Guide and the Error Handling Programming Guide for details.

BEST approach to share messages between data tier and application tier

I was looking for suggestions or a good approach to handling messages between the data and application tiers. What I mean by this is that when using stored procedures, or even direct SQL statements in the application, there should be a way for the data tier to notify the upper layers about statement/operation results in at least an organized way.
What I commonly use is two variables in every stored procedure:
@code INT,
@message VARCHAR(1024)
I put the DML statements inside a TRY...CATCH block and, at the end of the TRY block, set both variables to a status meaning everything went OK; as you might be guessing, I do the opposite in the CATCH block: I set both variables to some failure code and, if needed, perform some error handling.
After the TRY...CATCH block I return the result using a record:
SELECT @code AS code, @message AS message
These two variables are used in the upper tiers for validation purposes, such as showing messages to the user, and also for committing or rolling back transactions.
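For context, the upper tier consumes that record roughly like this - a sketch over plain JDBC, with a made-up procedure name and connection URL:

import java.sql.DriverManager

def checkResult(url: String): Unit = {
  val conn = DriverManager.getConnection(url)              // hypothetical URL
  try {
    val stmt = conn.prepareCall("{call dbo.usp_DoWork()}") // hypothetical proc
    try {
      val rs = stmt.executeQuery()
      if (rs.next()) {
        val code    = rs.getInt("code")
        val message = rs.getString("message")
        if (code == 0) ()                                    // success: commit / continue
        else println(s"operation failed ($code): $message")  // show the user, roll back
      }
    } finally stmt.close()
  } finally conn.close()
}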
Perhaps I'm missing important features like RAISERROR, or not considering better, more optimal, and more secure approaches.
I would appreciate your advice and good practices. I'm not asking for cookbook recipes, there's no need, just the idea; but if you decide to include examples they'd be more than welcome.
Thanks
The generally accepted approach for dealing with this is to have your sproc set a return value and use RAISERROR, as has been previously suggested.
The key thing that I suspect you are missing is how to get this data back on the client side.
If you are in a .NET environment, you can check for the return value as part of the ADO.NET code. The RAISERROR messages can be captured by catching the SqlException object and iterating through its Errors collection.
Alternatively, you could create an event handler and attach it to the connection's InfoMessage event. This would allow you to get the messages as they are generated by SQL Server instead of at the completion of the batch or stored procedure.
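Outside .NET the same plumbing exists; for example, over plain JDBC it looks roughly like this (a sketch - with SQL Server JDBC drivers, RAISERROR at severity 11 or higher typically surfaces as an exception, and lower severities as warnings):

import java.sql.{CallableStatement, SQLException}

// stmt is assumed to be a CallableStatement already prepared for the proc.
def executeAndReport(stmt: CallableStatement): Unit =
  try {
    stmt.execute()
    // Low-severity messages arrive as warnings, much like InfoMessage:
    var w = stmt.getWarnings
    while (w != null) {
      println(s"info: ${w.getMessage}")
      w = w.getNextWarning
    }
  } catch {
    case e: SQLException =>
      // Like iterating the SqlException.Errors collection in ADO.NET:
      var err: SQLException = e
      while (err != null) {
        println(s"error ${err.getErrorCode}: ${err.getMessage}")
        err = err.getNextException
      }
  }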
Another option, which I wouldn't recommend, would be to pack everything you care about into an XML message and return that from the stored proc; this gives you a little more structure than your free-text field approach.
My experience has been to rely on the inherent workings of the procedure mechanisms. What I mean by this is that if a procedure fails, the failure surfaces either through the implicitly raised error (when not using TRY/CATCH - simple CRUD stuff) or through an explicit RAISERROR. So if the code falls into the CATCH block and I want to notify the application of the error, I re-raise the error with RAISERROR. In the past, I've created a custom rethrow procedure so I can trap and log the error information. One could also use the rethrow as an opportunity to throw a custom message/code that guides the application down a different decision path. If the calling application doesn't receive an error on execution, it's safe to assume it succeeded, so explicit transaction committing can commence.
I've never attempted to write my own communication mechanism for interaction between the app and the stored proc - the objects in the languages I work with have events or mechanisms to detect errors raised by the procedure. It sounds to me like your approach doesn't necessarily require much more code than standard error handling, but I do find it confusing, as it seems to be a reinventing-the-wheel type of scenario.
I hope I've not misunderstood what you are doing, nor do I want to come off as critical; I do believe, however, that you have this feedback mechanism available in the standard API and are probably making this more complicated than it needs to be. I would recommend reading up on RAISERROR to see if it's going to meet your needs (see the Using RAISERROR link at the bottom of that page as well).

How can I clear Class::DBI's internal cache?

I'm currently working on a large implementation of Class::DBI for an existing database structure, and am running into a problem with clearing Class::DBI's internal cache. This is a mod_perl implementation, so an instance of a class can be quite old between the times it is accessed.
From the man pages I found two options:
Music::DBI->clear_object_index();
And:
Music::Artist->purge_object_index_every(2000);
Now, when I add clear_object_index() to the DESTROY method, it seems to run but doesn't actually empty the cache. I can manually change the database, re-run the request, and still get the old version.
purge_object_index_every says that it clears the index every n requests. Setting this to 1 or 0 seems to clear the index... sometimes. I'd expect one of those two to work, but for some reason it doesn't do it every time - more like 1 time in 5.
Any suggestions for clearing this out?
The "common problems" page on the Class::DBI wiki has a section on this subject. The simplest solution is to disable the live object index entirely using:
$Class::DBI::Weaken_Is_Available = 0;
$obj->dbi_commit(); may be what you are looking for if you have uncompleted transactions. However, this is not very likely the case, as Class::DBI tends to complete any lingering transactions automatically on destruction.
When you do this:
Music::Artist->purge_object_index_every(2000);
You are telling it to examine the object cache every 2000 object loads and remove any dead references to conserve memory use. I don't think that is what you want at all.
Furthermore,
Music::DBI->clear_object_index();
Removes all objects from the live object index. I don't know how this would help at all; it's not flushing them to disk, really.
It sounds like what you are trying to do should work just fine the way you have it, but there may be a problem with your SQL or elsewhere that is preventing the INSERT or UPDATE from working. Are you doing error checking for each database query, as the perldoc suggests? Perhaps you can begin there, or in your database error logs, watching the queries to see why they aren't being completed or whether they ever arrive.
Hope this helps!
I've used remove_from_object_index successfully in the past, so that when a page that modifies the database is called, it always explicitly resets that object in the cache as part of the confirmation page.
I should note that Class::DBI is deprecated and you should port your code to DBIx::Class instead.