What is the best way to overcome bad / inconsistent data in an event store? - cqrs

Event stores are supposed to be append-only; you never delete or edit data.
In my case, we didn't disallow some mid-stream changes made by a user, and the data is now "bad"/inconsistent: they changed the domain name for the Google Docs accounts we were provisioning, mid-stream.
I can reprovision from the event store, but that data is broken.
Should I create a mutator of some kind that fixes the data up as it pulls it from the event store?
I need some ideas here!

If I understand you correctly, the "bad" data were correct back then?
I suggest you fix this by not touching the events, but updating the provisioning service to handle the broken domain names. That could be a component which knows how to fix a domain name or a URL: you send a resource in and get a corrected version in return. Unknown resources could be returned as-is.
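A minimal sketch of that idea, applied while replaying the stream rather than by editing the store. The event shape, the correctEvent name, and the domain mapping are all assumptions for illustration:

```typescript
// Corrector applied on read: rewrites known-bad domain names as events are
// loaded, leaving the stored events untouched.

interface ProvisioningEvent {
  type: string;
  payload: { domain?: string; [key: string]: unknown };
}

// Hypothetical mapping of the bad domain(s) to the corrected one(s).
const DOMAIN_CORRECTIONS: Record<string, string> = {
  "old-team.example.com": "new-team.example.com",
};

function correctEvent(event: ProvisioningEvent): ProvisioningEvent {
  const domain = event.payload.domain;
  if (domain && DOMAIN_CORRECTIONS[domain]) {
    return {
      ...event,
      payload: { ...event.payload, domain: DOMAIN_CORRECTIONS[domain] },
    };
  }
  return event; // unknown resources are returned as-is
}

// Applied while replaying the stream, e.g.:
// const fixedStream = rawStream.map(correctEvent);
```

This keeps the store append-only while reprovisioning always sees the corrected domain.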

Related

How to delete items from a subset with REST API

I'm wondering what are the best ways to delete items from a subset in a restful way. I got users and series, each user has his own list of series (watching, completed, etc). For example, if we want to get a list from a user we can do it with: GET /users/:id_user/series
If we want to delete a series from that user's list (but we don't want to delete the series itself), how should it be done?
I thought about the possibility of using DELETE /users/:id_user/series/:id_serie, but I'm not sure if it's the correct way for this case (maybe PATCH?).
I got another case with series and reviews. We can get the reviews like this: GET /series/:serie_id/reviews. In the first case we didn't want to delete the series itself when removing it from a user's list, but in this case we do want to delete the review because its existence depends on the series. So I guess in this case DELETE /series/:serie_id/reviews/:review_id is correct.
Is this difference important when choosing the REST operation to delete the object/item from the subset?
How would you do it on the web?
You'd follow a link to a form, with input controls. You might have something like a dropdown if you wanted to delete one series at a time, or lots of check boxes if you wanted to support a bulk delete. The user would provide input, hit the submit button, and the browser would create an application/x-www-form-urlencoded document and send it to the server.
What method would be used? Normally POST, because we are intending an edit to some resource on the server.
What resource would we be editing? Well, in truth, it could be anything -- the server would include that information in the form metadata, so the client can just do what it is told.
So where should the server tell it to submit the form? Again, it could be anywhere... but a useful approach is to think about what resource in the client's cache is being updated. Because if we send the request to that resource, we get intelligent cache invalidation "for free".
So on the web, we would expect to see:
POST /users/:id_user/series
Does it have to be POST? On the HTML web, maybe it does, because the ubiquitous client of the web is a browser, not an editor.
It is okay to use POST.
But a perfectly valid alternative would be to edit the local copy of /users/:id_user/series, and then send back to the server a complete copy of the new version (PUT) or a patch document describing the edits (PATCH). Notice that with both of these choices, the target URI is still /users/:id_user/series, so we still get the cache invalidation magic.
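For illustration, a hedged sketch of the PATCH variant, assuming the API accepts JSON Patch against the collection; the media type, path layout, and index-based removal are assumptions about how this particular list might be represented:

```typescript
// Remove one entry from the user's series list by patching the collection
// resource itself, so caches keyed on that URI are invalidated.

async function removeSeriesFromList(userId: string, seriesIndex: number): Promise<void> {
  const response = await fetch(`/users/${userId}/series`, {
    method: "PATCH",
    headers: { "Content-Type": "application/json-patch+json" },
    body: JSON.stringify([
      { op: "remove", path: `/series/${seriesIndex}` }, // drop one entry from the list
    ]),
  });
  if (!response.ok) {
    throw new Error(`PATCH failed: ${response.status}`);
  }
}
```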
Creating a new resource in your model just to have something to DELETE is probably the wrong idea.
There are cases where an edit, or a delete, will necessarily impact more than one resource.
There are some specific circumstances when you can get the right magic cache invalidation with two resources (for instance, delete one document, and send back an updated copy of another).
But we don't, today, have a general-purpose cache invalidation mechanism. (The closest thing I've been able to find is this, which seems to have stalled out in 2012.)

Want to send deltas to server using ra-data-json-server

we're using react-admin with a jsonServerRestClient from ra-data-json-server. We've encapsulated it to handle GET_MANY a little differently, so we already have a touchpoint there to control what goes to the server.
Going forward we would like to be able to send a delta of changes when an item is modified, as some items (in our case, groups) have 4k+ members in them.
While we could save the raw objects to local storage when they come into our REST client and use that info to create a delta upon save, the state and reducers should have that info already somewhere else, right?
So what files should I look at to see how to modify what gets sent to the REST client during modify events?
Figured it out: react-admin is actually nice enough to give us the previous version of whatever data it wants to save. When you're writing a restClient you are given type, resource and params. Inside params are both params.data and params.previousData. You can compute your deltas there by comparing the two.
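As a rough sketch, here is a wrapper around the restClient that computes the delta from params.data and params.previousData before forwarding the call; this assumes the react-admin 2.x style (type, resource, params) signature, and the shallow comparison is a simplification:

```typescript
// Wrap an existing restClient so UPDATE requests only carry changed fields.

const UPDATE = "UPDATE"; // react-admin request type constant (string value assumed)

type RestClient = (type: string, resource: string, params: any) => Promise<any>;

function withDeltas(restClient: RestClient): RestClient {
  return (type, resource, params) => {
    if (type === UPDATE && params.previousData) {
      const delta: Record<string, unknown> = {};
      for (const key of Object.keys(params.data)) {
        // Shallow comparison; deep-compare arrays (e.g. 4k+ member lists) if needed.
        if (params.data[key] !== params.previousData[key]) {
          delta[key] = params.data[key];
        }
      }
      return restClient(type, resource, { ...params, data: delta });
    }
    return restClient(type, resource, params);
  };
}
```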

Rally REST API transactions

Is there any way to achieve an atomic transaction using the Rally WSAPI? I know a transaction implies state across consecutive requests, but REST is obviously a stateless protocol, so that might be an issue.
I need to be able to pull a portfolioitem/feature and then immediately write it back if I have the most recent version of it. I have a custom field on portfolioitem/feature that WILL be edited by multiple people simultaneously, and I need to make sure that each update happens in the correct order.
Since I don't have access to Rally's server-side code, I must do all of this client side, and I can't figure out how to do it. I will be doing this with the Rally SDK as well.
I don't think WS API supports atomic transactions. A scenario where updates occur as one atomic transaction so that, for example, if one of the updates fail they are all rolled back is not supported. In the example you mentioned each update will be a distinct transaction and in case of a mid-air collision when the same artifact is updated by different users, one of the users will receive a concurrency error.
I am in the same boat as the OP, the only difference being that hours may pass between the read and subsequent write. Interestingly, I only seem to get concurrency errors when I attempt to update a record while there's another transaction of mine in flight. I don't see any exception raised when I am updating a record using a stale version thereof, i.e. one that someone else has changed from under me.
I will be attempting to fix this soon as it's becoming an issue. The chosen approach is to forcibly chain a GET before every POST, and throw an exception if the VersionID of the record I GET doesn't match the one I have stored in-memory. In case of mismatch, it will refresh the local record (and thus, view) and prompt the user to resubmit their changes. Yes this will be inconvenient for a user but in my app most changes are a single click away so it's reasonable.
I too would like to know if there is a better approach to this problem. One would assume that with every record having a VersionID, it would be handled server-side, with proper support from WsapiProxy on the client end. Maybe I'm missing something obvious, like explicitly fetching VersionID?
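For what it's worth, a sketch of the "GET before POST" check described above against the raw WSAPI. The URL shape, the response wrapper, and the VersionId field are written from memory and should be verified against your workspace; auth and security-token handling are omitted:

```typescript
// Optimistic-concurrency check: re-read the artifact, compare VersionId,
// and only write if it still matches what we had in memory.

const BASE = "https://rally1.rallydev.com/slm/webservice/v2.0";

async function updateFeatureIfUnchanged(
  objectId: number,
  expectedVersionId: string,
  changes: Record<string, unknown>
): Promise<void> {
  const getResp = await fetch(
    `${BASE}/portfolioitem/feature/${objectId}?fetch=VersionId`,
    { credentials: "include" }
  );
  const body = await getResp.json();
  const wrapperKey = Object.keys(body)[0]; // WSAPI wraps the artifact under a type-named key
  const current = body[wrapperKey];

  if (String(current.VersionId) !== String(expectedVersionId)) {
    throw new Error("Stale record: the feature changed since it was read. Refresh and retry.");
  }

  const postResp = await fetch(`${BASE}/portfolioitem/feature/${objectId}`, {
    method: "POST",
    credentials: "include",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ [wrapperKey]: changes }),
  });
  if (!postResp.ok) {
    throw new Error(`Update failed: ${postResp.status}`);
  }
}
```

Note that this only narrows the window between read and write; it is still not an atomic transaction.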

Core Data Sync - Tracking Deleted Objects

I'm setting up a basic sync service for an iPad application I'm developing. The goal is to have data consistent throughout several instances of the iPad app, as well as having a read-only version of the data on the web, hence rolling a custom solution.
The current flow is this:
Each entity has 'created', 'modified' and 'UUID' fields which are automatically updated by Core Data
On sync, each entity with a created or modified date after the last sync date is serialised into JSON and sent to the server
The server persists any changes to a MySQL database using the client-generated UUIDs as PKs (if there's a conflict, it just uses the most recently modified entity as the 'true' version, nothing fancy there) and sends back any updated entities to the client
The client then merges these changes back into its Core Data DB
This all seems to be working fine. My problem is how to track deleted objects with this method. I'm guessing I can add a 'deleted' flag to each entity and set it whenever a client deletes something; I can then push that change to the server with the rest of the sync data. Once the sync is complete, the client can actually delete these entities. My questions are:
Can I override Core Data's delete methods to automatically set this flag?
Will this require keeping all deleted entities indefinitely on the server? We'll have no way of knowing when every client has synced and actually deleted each entity (I'm not currently tracking client instances)
Is there a better way of doing this?
How about keeping a delta history table with a UUID and created/updated/deleted fields, maybe with a revision number for each update? That way you keep a small checklist of changes since your last successful sync.
If you delete an object you add an entry to the delta history table with the deleted UUID and mark it deleted. Same with created and updated objects: you only need to check the delta table to see which items the server needs to delete, update, create, etc. You could even store every revision on the server to support rolling back to a previous version in the future if you feel like it.
I think a revision number is better than relying on client's clock that could potentially be changed manually.
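A minimal sketch of that delta history idea, shown in TypeScript for brevity; the names and the in-memory log are illustrative, and in practice this would live in its own Core Data entity or server-side table:

```typescript
// Each local change appends an entry keyed by UUID with an action and a
// monotonically increasing revision, so a sync only ships entries newer
// than the last acknowledged revision.

type ChangeAction = "created" | "updated" | "deleted";

interface DeltaEntry {
  uuid: string;        // UUID of the record that changed
  action: ChangeAction;
  revision: number;    // incremented per change, instead of trusting the clock
}

const deltaLog: DeltaEntry[] = [];
let revisionCounter = 0;

function recordChange(uuid: string, action: ChangeAction): void {
  deltaLog.push({ uuid, action, revision: ++revisionCounter });
}

function changesSince(lastSyncedRevision: number): DeltaEntry[] {
  return deltaLog.filter((entry) => entry.revision > lastSyncedRevision);
}
```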
You could use NSManagedObjectContext's insertedObjects, updatedObjects and deletedObjects properties to create the delta objects before every save procedure :)
My 2 cents
Whether or not you have to keep deleted objects on the server totally depends on your needs. You will need a deleted flag locally to mark records as deleted for the sync, and maybe also on the server depending on your desire to roll back.
I have taken care of this problem a few ways before. Here is one possibility:
When a client deletes something, just mark it as deleted locally and delete it from the server during the sync (at which point you can purge it from Core Data). When other clients request that data, send back an HTTP 404 because you don't have the object any more; at that point the client can delete the entity locally. If a client requests a list of things and this object has been deleted, it will simply be missing from the list it gets back, so you can detect that and delete it. I do that in a client by creating an array of object IDs when I get a response from the server and deleting any local objects that don't have those IDs.
We have a deleted field on the server, but just to have the ability to roll back in case something is deleted by accident.
Of course you could return deleted objects to the client so they know to delete but if you don't want to keep a copy on the server, you would have to make some assumption that the clients would all update within a time frame. Then you could garbage collect after that time frame has expired.
I don't really like that solution though. If your data is too heavy to ask for all the objects for a complete sync, you could use your current merge strategy for creating and updating, and then run a separate call to check for deleted items. That call could simply ask for all IDs that the client should have on the device. It could delete the ones that don't exist. OR it could send all IDs on the client and get back a list of IDs to delete.
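A sketch of that "compare ID lists" reconciliation, with a hypothetical endpoint that returns the UUIDs the client should still have:

```typescript
// Ask the server which UUIDs should exist, then delete any local record
// whose UUID is no longer in that set.

async function reconcileDeletions(
  localUuids: string[],
  deleteLocally: (uuid: string) => Promise<void>
): Promise<void> {
  const resp = await fetch("/api/sync/ids"); // hypothetical endpoint returning all live UUIDs
  const serverUuids: string[] = await resp.json();
  const live = new Set(serverUuids);

  for (const uuid of localUuids) {
    if (!live.has(uuid)) {
      await deleteLocally(uuid); // missing from the server, so it was deleted there
    }
  }
}
```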
I think you have to provide more details about the nature of the data if you want a more opinionated suggestion.
Regarding your second question: you can design this so that the server doesn't have to keep deleted records around, if you want to. Let each app know whether a given piece of data (based on its UUID) is stored on the server (e.g. add an existsOnServer property or similar). This starts out false when a new item is created in the app, but is set to true once it has been synced to the server for the first time. That way, if the app tries to sync later but the UUID is not found, you can differentiate the two cases: if existsOnServer is false, then this item is newly created and should be synced to the server; if it is true, then it can be taken to mean that it was already on the server before but has now been deleted, so you can delete it in the app too.
I'd probably argue against this approach, since it seems more error prone to me (I imagine a database or connection error incorrectly being interpreted as a deletion) and keeping records around on your server would usually not be a big deal, but it is possible. The "delta-approach" suggested by dzeikei could be used at the same time, so an update to a record that does not exist on the server signifies that it was deleted, while an insert does not.
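A small sketch of that existsOnServer decision, just to make the two cases explicit (names are illustrative):

```typescript
// When a UUID is not found on the server, the flag distinguishes
// "never uploaded" from "deleted remotely".

interface SyncableRecord {
  uuid: string;
  existsOnServer: boolean; // set to true after the first successful sync
}

function handleNotFoundOnServer(
  record: SyncableRecord,
  pushToServer: (r: SyncableRecord) => void,
  deleteLocally: (r: SyncableRecord) => void
): void {
  if (!record.existsOnServer) {
    pushToServer(record);  // new record that has never been synced
  } else {
    deleteLocally(record); // was on the server before, so it must have been deleted there
  }
}
```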
You may take a look at Cross-Platform Data Synchronization by Dan Grover if you haven't. It's a very well written paper regarding synchronization and iOS.
About your questions:
You can avoid deleting a record in Core Data by setting a 'deleted' flag: just update the record instead of deleting it. You could write your own 'delete' method that actually just updates the flag on the record.
Always keep a last_sync and a last_updated value for each record on the server and on each client. This way you'll always know when someone changed something anywhere and whether that change was synced against the 'truth database' or not.
Keeping track of deleted records is a hard thing to do; I guess the best way is to keep track of the history of syncs for each table, but that is a difficult task. The easiest way, using this 'truth database' kind of configuration, is to flag the records, so yes, you should keep the data on the server as well as on the client.
During synchronization of data between two tables, some records get deleted when the table rows are the same, but when the rows are different they synchronize correctly.

CQRS: how do you retrieve information about an executed command?

In most command interfaces I've seen, there is typically an "Execute" method which takes a command as input and either returns void or some generic structure indicating whether the command executed successfully or not (we are using the latter). Now, I've never thought of this before, but we suddenly need to know more details about the result of the command than what you can expose generically.
Consider the following example:
You have a team and you are creating a screen where you can add members to it. The members of the team are shown in a grid below the "add new member" controls. Now, when you press "add new member" you want to run some jQuery/roundhouse/whatever and add the new member to the list of team members. No problems so far, but: you also want to include some identification data in a hidden field for each member, and this id-data comes from the server.
So the problem is: how can I get that id-data from the server? The "AddNewTeamMember" command which I am pushing through the "ExecuteCommand"-method does not give me anything useful back, and if I add a new query method to the service saying something like: "GetLastAddedTeamMember" then I might just get the last entry added by someone else (at least if this is data which is very aggressively added by different users). In some situations you have a natural unique identifier generated on the client side which we can use, but for team members we did not.
Given that you have no choice but to update an on-page widget when another command completes, I see two choices for you:
Shoot off the command, display something locally that indicates it is submitted, and then wait until you get a notification from the server that the team member list has changed. Update the widget to reflect that.
Add a correlation ID to your command when you submit it, and add the team member provisionally locally to the list. When you get a confirmation from the server that a team member update happened because of your correlation ID, update your local data.
I would suggest the first approach, where the "provisional indicator" could simply be a marked version of the normal indicator put in place; then, when you finally get an update, you should have the data you need.
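If you do end up needing the correlation-ID variant, here is a rough sketch of how it might flow on the client; the command and notification shapes are assumptions:

```typescript
// Tag the command with a client-generated correlation ID, render a
// provisional row, and reconcile when the read side reports an entry
// carrying the same correlation ID.

interface AddTeamMemberCommand {
  correlationId: string;
  teamId: string;
  memberName: string;
}

const pendingRows = new Map<string, { memberName: string }>();

function addMember(
  teamId: string,
  memberName: string,
  send: (command: AddTeamMemberCommand) => void
): void {
  const correlationId = crypto.randomUUID();
  pendingRows.set(correlationId, { memberName }); // provisional, marked row in the grid
  send({ correlationId, teamId, memberName });
}

// Called when the read side notifies us of a new team member.
function onTeamMemberAdded(notification: { correlationId?: string; memberId: string }): void {
  if (notification.correlationId && pendingRows.has(notification.correlationId)) {
    pendingRows.delete(notification.correlationId);
    // Replace the provisional row with the real one, now that memberId is known.
  }
}
```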
Given you went with CQRS to solve this problem I assume you have frequent updates to the content of those widgets happening in the background already, so have presumably solved the "background update" problem.
If not, I suggest you either ditch CQRS as a bad (over-complicated) solution in your problem space, or solve the background update problem first.
If you want to add an existing team member, you should query the read side of your application for this data. If you need to add a new team member, consider whether it is necessary to show the user in the grid below at once. Can you wait until the team member is in place on the read side? You can also query a service on the server side to get a unique ID (it can return a GUID), then add the team member to the grid and, of course, send the command to the server.
But, if possible, try to design the application in a way that you don't have to show the team member at once. It's also possible to give the user a message saying something like: "Team member added, waiting for response from server." Then use AJAX to query the read side for new team members, and when the member appears on the read side, show it in the grid.
You might have to deal with team members added by other users, but does it matter? CQRS gives you a great way to collaborate with other users, so maybe you should take advantage of that. As I see it, CQRS forces you to think differently, and that may not be a bad thing.