What should the CAS id be for the initial set call? - memcached

I am implementing CAS-based memcached retrieval and have a possibly dumb question. If there is no entry for a particular key, i.e. the first time it's stored, what should I set the CAS id to?

Whenever you set a new key that doesn't exist in memcached, the CAS value should be 0. If you set it to something other than 0, you will get a NOT_FOUND error. The reason is that memcached tries to check the CAS value you gave against the item stored under that key. Since the key doesn't exist yet, there is nothing to compare against and you get NOT_FOUND.
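To make the rule concrete, here is a minimal in-memory sketch of the behaviour described above. It is a toy model, not memcached or any client library: the class `ToyCasStore`, its methods, and the status strings are assumptions chosen for illustration.

```python
from itertools import count


class ToyCasStore:
    """In-memory stand-in that mimics the CAS rules described above."""

    def __init__(self):
        self._items = {}            # key -> (value, cas_id)
        self._next_cas = count(1)   # store-assigned CAS ids start at 1

    def set(self, key, value, cas=0):
        current = self._items.get(key)
        if cas != 0:
            if current is None:
                return "NOT_FOUND"  # nothing to compare the CAS id against
            if current[1] != cas:
                return "EXISTS"     # the item changed since it was read
        self._items[key] = (value, next(self._next_cas))
        return "STORED"

    def gets(self, key):
        """Return (value, cas_id), or None when the key is missing."""
        return self._items.get(key)


store = ToyCasStore()
first = store.set("user:1", "alice", cas=0)       # first store: CAS 0, no check
missing = store.set("user:2", "bob", cas=42)      # non-zero CAS, key absent
value, cas_id = store.gets("user:1")
update = store.set("user:1", "alice-v2", cas=cas_id)
stale = store.set("user:1", "alice-v3", cas=cas_id)  # cas_id is now outdated
```

In this model the first write with CAS 0 is stored, the write with a non-zero CAS against a missing key is refused, and only the first of the two writes that reuse the same CAS id succeeds.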

Related

Is there any way to check for a duplicate message control id (MSH:10) in the MSH segment using Mirth Connect?

Is there any way to check for a duplicate message control id (MSH:10) in the MSH segment using Mirth Connect?
MSH|^~&|sss|xxx|INSTANCE2|KKLIU 0063/2021|20190905162034||ADT^A28^ADT_A05|Zx20190905162034|P|2.4|||NE|NE|||||
Whenever a message enters, it needs to be validated whether a message with the same control id (Zx20190905162034) has already been processed.
Mirth will not do this for you, but you can write your own JavaScript transformer to check a database or your own set of previously encountered control ids.
Your JavaScript can make use of any appropriate Java classes.
The database check (you can implement this using a code template) is the easier way out. You might want to designate the column storing MSH:10 values as a primary key or define a unique index on it, since lookups against unique entries are faster. Other alternatives include periodically redeploying the channel and, on deploy, reading all MSH:10 values already in the database into a global map variable, or keeping the values behind an API that you query with a GET request when processing every message. Which option fits best depends on the number of records involved.
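The database approach can be sketched as follows. In Mirth the logic would live in a JavaScript transformer; this standalone Python version only illustrates the idea, using an in-memory SQLite database. The table name `seen_control_ids` and both function names are assumptions made for the example.

```python
import sqlite3


def control_id(hl7_message: str) -> str:
    """Return MSH-10 from a pipe-delimited HL7 v2 message."""
    msh = hl7_message.splitlines()[0]
    fields = msh.split("|")
    # MSH-1 is the field separator itself, so MSH-10 sits at index 9.
    return fields[9]


def is_first_occurrence(conn: sqlite3.Connection, hl7_message: str) -> bool:
    """Record the control id; return False if it was already recorded."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO seen_control_ids (control_id) VALUES (?)",
                (control_id(hl7_message),),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # primary key violation: duplicate control id


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seen_control_ids (control_id TEXT PRIMARY KEY)")

message = (
    "MSH|^~&|sss|xxx|INSTANCE2|KKLIU 0063/2021|20190905162034||"
    "ADT^A28^ADT_A05|Zx20190905162034|P|2.4|||NE|NE|||||"
)
first_seen = is_first_occurrence(conn, message)
second_seen = is_first_occurrence(conn, message)
```

Letting the primary key reject the second insert avoids a separate lookup followed by an insert, which could race if two copies of the message arrive at the same time.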

Allow negative primary keys in SailsJS

Is there any way to allow primary keys to be negative integers in Sails?
I ran across the following error when testing some older software:
{
"code":"E_INVALID_VALUES_TO_SET",
"details":"Could not use specified `org`. Expecting an id representing the associated record, or `null` to indicate there will be no associated record. But the specified value is not a valid `org`. Cannot use a negative number (-1) as a primary key value.",
"message":"The server could not fulfill this request (`PATCH /user/1402`) due to a problem with the parameters that were sent. See the `details` for more info. **The following additional tip will not be shown in production**: Tip: Check your client-side code to make sure that the request data it sends matches the expectations of the corresponding attribues in your model. Also check that your client-side code sends data for every required attribute."}
I've checked the Sails documentation and can't find any place which mentions that negative primary keys are not allowed.
I've also checked the schema definitions for both tables, and neither specifies the relevant field as unsigned.
Is there any workaround other than changing the relevant row to some different id and updating every other row which references it?
Here is a possible workaround: change the type of your primary key to a string.
https://sailsjs.com/documentation/concepts/models-and-orm/model-settings

Is it legitimate to insert UUIDs into Postgres that have been generated by a client application?

The normal MO for creating items in a database is to let the database control the generation of the primary key (id). That's usually true whether you're using auto-incremented integer ids or UUIDs.
I'm building a client-side app (Angular, but the tech is irrelevant) that I want to be able to build offline behaviour into. In order to allow offline object creation (and association), I need the client application to generate primary keys for new objects. This is both to allow for associations with other objects created offline and to allow for idempotence (making sure I don't accidentally save the same object to the server twice due to a network issue).
The challenge, though, is what happens when that object gets sent to the server. Do you use a temporary client-side ID which you then replace with the ID that the server subsequently generates, or do you use some sort of ID translation layer between the client and the server? The latter is what Trello did when building their offline functionality.
However, it occurred to me that there may be a third way. I'm using UUIDs for all tables on the back end. And so this made me realise that I could in theory insert a UUID into the back end that was generated on the front end. The whole point of UUIDs is that they're universally unique so the front end doesn't need to know the server state to generate one. In the unlikely event that they do collide then the uniqueness criteria on the server would prevent a duplicate.
Is this a legitimate approach? The risks seem to be 1. collisions and 2. any form of security problem that I haven't anticipated. Collisions seem to be taken care of by the way that UUIDs are generated, but I can't tell if there are risks in allowing a client to choose the ID of an inserted object.
Yes, this is fine. Postgres even has a UUID type.
Set the default ID to be a server-generated UUID if the client does not send one.
Collisions.
UUIDs are designed to not collide.
Any form of security that I haven't anticipated.
Avoid UUIDv1 because...
This involves the MAC address of the computer and a time stamp. Note that UUIDs of this kind reveal the identity of the computer that created the identifier and the time at which it did so, which might make it unsuitable for certain security-sensitive applications.
You can instead use uuid_generate_v1mc, which uses a random multicast MAC address instead of the computer's real one.
Avoid UUIDv3 because it uses MD5. Use UUIDv5 instead.
UUIDv4 is simplest: it's a 122-bit random number, and it is built into Postgres (the others are in the commonly available uuid-ossp extension). However, it depends on the strength of the random number generator of each client. But even a bad UUIDv4 generator is better than incrementing an integer.
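A rough sketch of the server side of this approach is shown below. SQLite stands in for Postgres so the example is self-contained; `INSERT OR IGNORE` plays the role that `INSERT ... ON CONFLICT DO NOTHING` would play in Postgres. The `items` table and the function names are assumptions made for illustration.

```python
import sqlite3
import uuid


def parse_client_id(raw):
    """Accept only well-formed version 4 UUIDs; otherwise return None."""
    try:
        parsed = uuid.UUID(raw)
    except (ValueError, AttributeError, TypeError):
        return None
    return parsed if parsed.version == 4 else None


def create_item(conn, raw_id, name):
    """Insert an item; return 'created', 'duplicate', or 'rejected'."""
    if raw_id is None:
        item_id = uuid.uuid4()  # server-side default when the client sends no id
    else:
        item_id = parse_client_id(raw_id)
        if item_id is None:
            return "rejected"
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO items (id, name) VALUES (?, ?)",
            (str(item_id), name),
        )
    # rowcount is 0 when the primary key already existed.
    return "created" if cur.rowcount == 1 else "duplicate"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, name TEXT NOT NULL)")

client_id = str(uuid.uuid4())
first_try = create_item(conn, client_id, "note")
retry = create_item(conn, client_id, "note")      # same request sent twice
bad_format = create_item(conn, "not-a-uuid", "note")
wrong_version = create_item(conn, str(uuid.uuid1()), "note")
no_id = create_item(conn, None, "note")
```

The retry is harmless because the primary key absorbs the duplicate, which gives the idempotence the question asks for. A real server would probably also want to check that an existing row with that id belongs to the same user before treating the request as a retry.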

Why is the memcached set operation not idempotent?

On page 2 of Facebook's paper "Scaling Memcache at Facebook", they say: "For write requests, the webserver issues SQL statements to the database and then sends a delete request to memcache that invalidates any stale data. We choose to delete cached data instead of updating it because deletes are idempotent."
Why is update/set not an idempotent operation?
Paper can be found here: https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final170_update.pdf
If you call delete twice in a row, the second delete has no effect. Update/set behaves differently: while it won't change the value associated with the key, it will update the key's last access time, which changes when the key will be evicted. In this sense, deleting a key is idempotent while updating a key's value is not.
E.g. in the paper they don't want to keep a key in the cache if no one ever tried to read it (even if there are lots of writes for that key in the database).
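The argument in this answer can be illustrated with a toy cache that stamps every write with a logical clock. This is only a model of the reasoning above, not of memcached's actual eviction bookkeeping; `ToyCache` and its methods are hypothetical.

```python
class ToyCache:
    """Tiny cache that records a last-access tick for every stored key."""

    def __init__(self):
        self._clock = 0
        self._items = {}  # key -> (value, last_access_tick)

    def _tick(self):
        self._clock += 1
        return self._clock

    def set(self, key, value):
        # A write refreshes the last-access tick even if the value is unchanged.
        self._items[key] = (value, self._tick())

    def delete(self, key):
        self._tick()
        self._items.pop(key, None)

    def snapshot(self):
        return dict(self._items)


cache = ToyCache()

cache.set("k", "v")
after_one_set = cache.snapshot()
cache.set("k", "v")                 # same key, same value
after_two_sets = cache.snapshot()

cache.delete("k")
after_one_delete = cache.snapshot()
cache.delete("k")                   # repeating the delete
after_two_deletes = cache.snapshot()
```

Repeating the set leaves the value untouched but changes the stored tick, so the cache state differs; repeating the delete leaves the state exactly as it was.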

Should the natural or surrogate key be returned in an API?

This is the first time I've thought about this...
Until now, I have always used the natural key in my API. For example, in a REST API for managing entities, the URL would look like /entities/{id}, where id is a natural key known to the user (the ID is passed in the POST request that creates the entity). After the entity is created, the user can use multiple commands (GET, DELETE, PUT...) to manipulate the entity. The entity also has a surrogate key generated by the database.
Now, think about the following sequence:
1. A user creates entity with id 1. (POST /entities with body containing id 1)
2. Another user deletes the entity (DELETE /entities/1)
3. The same other user creates the entity again (POST /entities with body containing id 1)
4. The first user decides to modify the entity (PUT /entities/1 with body)
Before step 4 is executed, there is still an entity with id 1 in the database, but it is not the same entity created during step 1. The problem is that step 4 identifies the entity to modify based on the natural key which is the same for the deleted and new entity (while the surrogate key is different). Therefore, step 4 will succeed and the user will never know it is working on a new entity.
I generally also use optimistic locking in my applications, but I don't think it helps here. After step 1, the entity's version field is 0. After step 3, the new entity's version field is also 0. Therefore, the version check won't help. Is this a case where a timestamp field should be used for optimistic locking?
Is the "good" solution to return the surrogate key to the user? That way, the user always provides the surrogate key to the server, which can use it to ensure it works on the same entity and not on a new one.
Which approach do you recommend?
It depends on how you want your users to use your API.
REST APIs should try to be discoverable. So if there is benefit in exposing natural keys in your API because it will allow users to modify the URI directly and get to a new state, then do it.
A good example is categories or tags. We could have the following URIs:
GET /some-resource?tag=1 // returns all resources tagged with 'blue'
GET /some-resource?tag=2 // returns all resources tagged with 'red'
or
GET /some-resource?tag=blue // returns all resources tagged with 'blue'
GET /some-resource?tag=red // returns all resources tagged with 'red'
There is clearly more value to a user in the second group, as they can see that the tag is a real word. This then allows them to type ANY word in there to see what's returned, whereas the first group does not allow this: it limits discoverability.
A different example would be orders:
GET /orders/1 // returns order 1
or
GET /orders/some-verbose-name-that-adds-no-meaning // returns order 1
In this case there is little value in adding some verbose name to the order to make it discoverable. A user is more likely to want to view all orders first (or a subset), filter by date or price etc., and then choose an order to view:
GET /orders?orderBy={date}&order=asc
Additional
After our discussion over chat, your issue seems to be with versioning and how to manage resource locking.
If you allow resources to be modified by multiple users, you need to send a version number with every request and response. The version number is incremented when any changes are made. If a request sends an older version number when trying to modify a resource, throw an error.
In the case where you allow the same URIs to be reused, there is a potential for conflict, as the version number always begins from 0. In this case, you will need to send over a GUID (surrogate key) as well as the version number. Or don't use natural URIs (see the original answer above to decide when to do this or not).
There is another option, which is to disallow reuse of URIs. This really depends on the use case and your business requirements. It may be fine to reuse a URI if conceptually it means the same thing. An example would be a folder on your computer. Deleting the folder and recreating it is the same as emptying the folder. Conceptually the folder is the same 'thing' but with different properties.
User accounts are probably an area where reusing URIs is not a good idea. If you delete an account /accounts/u1, that URI should be marked as deleted, and no other user should be able to create an account with username u1. Conceptually, a new user using the same URI is not the same as when the previous user was using it.
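The combination of GUID and version number suggested in this answer could look roughly like the sketch below, which replays the four steps from the question. The in-memory `store`, the `Entity` class, and the function names are assumptions for illustration, not part of any framework.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Entity:
    natural_id: str
    body: str
    # Surrogate key: a fresh GUID for every newly created entity.
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))
    version: int = 0


class StaleWrite(Exception):
    """Raised when the caller's GUID or version no longer matches."""


store = {}  # natural id -> Entity


def create(natural_id, body):
    entity = Entity(natural_id, body)
    store[natural_id] = entity
    return entity.guid, entity.version


def delete(natural_id):
    store.pop(natural_id, None)


def update(natural_id, guid, version, body):
    current = store.get(natural_id)
    if current is None or current.guid != guid or current.version != version:
        raise StaleWrite(natural_id)
    current.body = body
    current.version += 1
    return current.version


guid_a, version_a = create("1", "first")    # step 1: user A creates entity 1
delete("1")                                 # step 2: user B deletes it
guid_b, version_b = create("1", "second")   # step 3: user B recreates it

try:                                        # step 4: user A updates with old data
    update("1", guid_a, version_a, "edit by A")
    rejected = False
except StaleWrite:
    rejected = True
```

Both entities start at version 0, so the version number alone would not catch the conflict; the differing GUID is what causes the update in step 4 to be refused.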
It's interesting to see people trying to rediscover solutions to known problems. This issue is not specific to a REST API: it applies to any indexed storage. The only solution I have ever seen implemented is not to reuse surrogate keys.
If you are generating your surrogate key at the client, use UUIDs or split sequences, but for preference do it serverside.
Also, you should never use surrogate keys to de-reference data if a simple natural key exists in the data. Indeed, even if the natural key is a compound entity, you should consider very carefully whether to expose a surrogate key in the API.
You mentioned the possibility of using a timestamp as your optimistic locking.
Depending on how strictly you're following RESTful principles, the Entity returned by the POST will contain an "edit self" link; this is the URI against which a DELETE or an update (PUT) can be performed.
Taking your steps above as an example:
Step 1
User A does a POST of Entity 1. The returned Entity object will contain a "self" link indicating where updates should occur, like:
/entities/1/timestamp/312547124138
Step 2
User B gets the existing Entity 1, with the above "self" link, and performs a DELETE to that timestamp versioned URI.
Step 3
User B does a POST of a new Entity 1, which returns an object with a different "self" link, e.g.:
/entities/1/timestamp/312547999999
Step 4
User A, with the original Entity that they obtained in Step 1, tries doing a PUT to the "self" link on their object, which was:
/entities/1/timestamp/312547124138
...your service will recognise that, although Entity 1 does exist, User A is trying a PUT against a version which has since become stale.
The service can then perform the appropriate action. Depending on how sophisticated your algorithm is, you could either merge the changes or reject the PUT.
I can't remember the appropriate HTTP status code that you should return following a PUT to a stale version... It's not something that I've implemented in the REST framework that I work on, although I have planned to enable it in future. It might be that you return a 410 ("Gone"); 409 ("Conflict") and 412 ("Precondition Failed") are the codes more commonly used for rejected stale updates.
Step 5
I know you don't have a step 5, but..! User A, upon finding their PUT has failed, might re-retrieve Entity 1. This could be a GET to their (stale) version, i.e. a GET to:
/entities/1/timestamp/312547124138
...and your service would return a redirect to GET from either a generic URI for that object, e.g.:
/entities/1
...or to the specific latest version, i.e.:
/entities/1/timestamp/312547999999
They can then make the changes intended in Step 4, subject to any application-level merge logic.
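The walkthrough above can be condensed into a small routing sketch. It is only an illustration of the idea: the `entities` dictionary and the `handle` function are hypothetical, 410 follows the tentative suggestion in this answer, and 303 is an assumed choice for the redirect, which the answer does not specify.

```python
entities = {}  # natural id -> timestamp of the current incarnation


def self_link(entity_id):
    """Build the versioned "self" link for the current incarnation."""
    return f"/entities/{entity_id}/timestamp/{entities[entity_id]}"


def handle(method, path):
    """Return (status, location) for a request to a versioned URI."""
    parts = path.strip("/").split("/")  # ["entities", id, "timestamp", stamp]
    entity_id, stamp = parts[1], int(parts[3])
    current = entities.get(entity_id)
    if current == stamp:
        return 200, None                      # link is still current
    if method == "GET" and current is not None:
        return 303, self_link(entity_id)      # point the reader at the latest
    return 410, None                          # stale or deleted version


entities["1"] = 312547124138          # step 1: user A creates entity 1
link_a = self_link("1")
del entities["1"]                     # step 2: user B deletes it
entities["1"] = 312547999999          # step 3: user B recreates it

stale_put = handle("PUT", link_a)     # step 4: user A's PUT hits a stale link
stale_get = handle("GET", link_a)     # step 5: user A re-retrieves the entity
fresh_put = handle("PUT", self_link("1"))
```

Because the timestamp is part of the URI, the stale PUT can be detected from the path alone, without the client sending any extra version field.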
Hope that helps.
Your problem can be solved either by using ETags for versioning (a record can only be modified if the current ETag is supplied) or by soft deletes (so the deleted record still exists, but with a trashed bool which is reset by a PUT).
It sounds like you might also benefit from a batch endpoint and from using transactions.
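A minimal sketch of the soft-delete idea follows. The answer only mentions the trashed flag; keeping the version counter running across the delete and the re-creation is an assumption added here to show why a stale writer can then be detected. The `records` dictionary and all function names are hypothetical.

```python
records = {}  # id -> {"body": str, "version": int, "trashed": bool}


def create(record_id, body):
    """Create a record, reviving a trashed one instead of starting over."""
    old = records.get(record_id)
    if old is not None and not old["trashed"]:
        raise ValueError(f"{record_id} already exists")
    # Continue the old version counter so earlier versions stay stale.
    version = old["version"] + 1 if old else 0
    records[record_id] = {"body": body, "version": version, "trashed": False}
    return version


def soft_delete(record_id):
    record = records[record_id]
    record["trashed"] = True
    record["version"] += 1


def update(record_id, body, expected_version):
    """Apply the update only if the caller saw the current version."""
    record = records.get(record_id)
    if record is None or record["trashed"]:
        return False
    if record["version"] != expected_version:
        return False
    record["body"] = body
    record["version"] += 1
    return True


version_a = create("1", "first")       # user A creates the record
soft_delete("1")                       # user B deletes it
version_b = create("1", "second")      # user B recreates it
stale_update = update("1", "edit by A", version_a)
fresh_update = update("1", "edit by B", version_b)
```

Since the row is never physically removed, the recreated record does not restart at version 0, and the first user's outdated version number no longer matches.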