Firebase REST API: Delete sometimes fails - matlab

I'm currently building a web frontend for a Matlab program. I'm using webread/webwrite to interface with the Firebase realtime database (though I'll be shifting to urlread2 soon for compatibility reasons). The Matlab end has to delete nodes from the database on a regular basis. I do this by using webwrite to send a POST request and putting "X-HTTP-Method-Override: DELETE" in the header. This works, but after a few deletes it stops working until data is either added to or removed from the database. It seems completely random; my teammate and I have been trying to find a pattern for a few days and we've found nothing.
Here is the relevant Matlab code:
% Build the REST endpoint for the node to delete: <database url>/<key>.json
modurl = strcat(url, modkey, '.json');
modurlstr = char(modurl);
% Firebase accepts DELETE sent as a POST carrying the X-HTTP-Method-Override header
webop = weboptions('KeyName', 'X-HTTP-Method-Override', 'KeyValue', 'DELETE');
webwrite(modurlstr, webop);
Where url is our database url and modkey is the key of the node we're trying to delete. There's no authentication because the database is set to public (Security is not an issue for us).
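For reference, the same request sketched with Python's requests library (illustrative placeholder values, not part of the actual Matlab program) looks roughly like this:
import requests

# Hypothetical stand-ins for the real database URL and node key
url = 'https://example-project.firebaseio.com/'
modkey = 'some-node-key'
endpoint = url + modkey + '.json'

# Same trick as the Matlab code: a POST with the method-override header...
resp = requests.post(endpoint, headers={'X-HTTP-Method-Override': 'DELETE'})

# ...or, where the client supports it, a plain DELETE on the same endpoint
resp = requests.delete(endpoint)
print(resp.status_code)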
The database is organized pretty simply. The root node just has a bunch of children. We only delete a whole child (i.e. we don't ever try to delete the individual components of a child).
Are we doing something wrong?
Thanks in advance!

We found out some of the keys had hyphens in them, which were getting translated to their ASCII representation. The reason it seemed random was that the delete only failed on the nodes whose keys contained a hyphen. When we switched them back, everything worked fine.

Best practice for RESTful API design updating 1/many-to-many relationship?

Suppose I have a recipe page where the recipe can have a number of ingredients associated with it. Users can edit the ingredients list and update/save the recipe. In the database there are these tables: recipes table, ingredients table, ingredients_recipes_table. Suppose a recipe has ingredients a, b, c, d but then the user changes it to a, d, e, f. With the request to the server, do I just send only the new ingredients list and have the back end determine what values need to be deleted/inserted into the database? Or do I explicitly state in the payload what values need to be deleted and what values need to be inserted? I'm guessing it's probably the former, but then is this handled before or during the db query? Do I read from the table first then write after calculating the differences? Or does the query just handle this?
I searched and I'm seeing solutions involving INSERT IGNORE... + DELETE ... NOT IN ... or using the MERGE statement. The project isn't using an ORM -- would I be right to assume that this could be done easily with an ORM?
Can you share what the user interface looks like? It would be pretty standard practice to let the user either post a single new ingredient as an action or delete one as an action. You can simply have a button next to each ingredient to initiate a DELETE request, and have a form beneath for a POST.
Having the users input a list creates unnecessary complexity.
A common pattern to use would be to treat this like a remote authoring problem.
The basic idea of remote authoring is that we ask the server for its current representation of a resource. We then make local (to the client) edits to the representation, and then request that the server accept our representation as a replacement.
So we might GET a representation that includes a JSON array of ingredients. In our local copy, we remove the ingredients we no longer want and add the new ones in. Then we would PUT our local copy back to the server.
When the documents are very large, with changes that are easily described, we might, instead of sending the entire document to the server, send a PATCH request with a "patch document" that describes the changes we have made locally.
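A rough sketch of the GET / edit / PUT flow in Python (the URL and field names are made up for illustration):
import requests

# Hypothetical endpoint for the recipe resource
recipe_url = 'https://api.example.com/recipes/42'

# GET the server's current representation
recipe = requests.get(recipe_url).json()

# Edit the local copy: drop b and c, add e and f
ingredients = set(recipe['ingredients'])
ingredients -= {'b', 'c'}
ingredients |= {'e', 'f'}
recipe['ingredients'] = sorted(ingredients)

# Ask the server to accept our copy as the new representation
requests.put(recipe_url, json=recipe)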
When the server is just a document store, the implementation on the server is easy -- you can review the changes to decide if they are valid, compute the new representation (if necessary), and then save it into a file, or whatever.
When you are using a relational database, the server implementation needs to figure out how to update itself. An ORM library might save you a bunch of work, but there are no guarantees -- people tend to get tangled up in the "object" end of the "object relational mapper". You may need to fall back to hand-rolling your own SQL.
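For the specific recipes/ingredients case, the server-side work is mostly set arithmetic. Here is a minimal sketch (the table and column names are assumptions, and the SQL is illustrative rather than tuned):
import sqlite3

def replace_ingredients(conn, recipe_id, new_ingredient_ids):
    # Read the current links for this recipe
    rows = conn.execute(
        'SELECT ingredient_id FROM ingredients_recipes WHERE recipe_id = ?',
        (recipe_id,))
    current = {row[0] for row in rows}
    desired = set(new_ingredient_ids)

    # Compute the differences: what to remove, what to add
    to_delete = current - desired
    to_insert = desired - current

    conn.executemany(
        'DELETE FROM ingredients_recipes WHERE recipe_id = ? AND ingredient_id = ?',
        [(recipe_id, i) for i in to_delete])
    conn.executemany(
        'INSERT INTO ingredients_recipes (recipe_id, ingredient_id) VALUES (?, ?)',
        [(recipe_id, i) for i in to_insert])
    conn.commit()
The blunt alternative is to delete every link for the recipe and re-insert the full list, which avoids the diff entirely and is often good enough for small lists.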
An alternative to remote authoring is to treat the problem like a web site. In that case, you would get some representation of a form that allows the client to describe the change that should be made, and then submit the form, producing a POST request that describes the intended changes.
But you run into the same mapping problem on the server end -- how much work do you have to do to translate the POST request into the correct database transaction?
REST, alas, doesn't tell you anything about how to transform the representation provided in the request into your relational database. After all, that's part of the point -- REST is intended to allow you to replace the server with an alternative implementation without breaking existing clients, and vice versa.
That said, yes - your basic ideas are right; you might just replace the entire existing representation in your database, or you might instead optimize to only issue the necessary changes. An ORM may be able to effectively perform the transformations for you -- optimizations like lazy loading have been known to complicate things significantly.

breeze.EntityQuery.from('Agencies'); returns the last entity returned for entire results

Let me start by saying I'm new to breeze, so I apologize if this turns out to be a stupid error on my side.
I'm using Angular, specifically John Papa's Hot Towel base. I have created a repository that returns a list of agencies from a WebApi2 service (backed by EF6). This code actually seems to work fine: if I set a breakpoint on the return of the breeze HttpGet, the records returned show all the correct entity data and count.
The problem is that when they are returned to my Angular callback from "... .execute().then", the number of items is correct but all the entities appear to be the last entity returned from the WebApi.
I am using local storage (again, John Papa's zStorage), but even if I bypass the local call and force a remote one, it's the same result.
So, any ideas? And what code snippets are needed to help resolve this?
Thanks for any help you can provide.
It turns out the issue was an error in camelCasing between the database, server and client. Once I got the "Id" fields all in sync things started working properly.

memcached sometimes holding corrupt data

I have been using Memcached (AWS Elasticache) for a while now.
Just today I ran into a situation that I hadn't experienced before. Regularly there is a call to the database for a list of countries, and I store the result in memcached. This time, however, the data wasn't stored correctly (I'm not sure why, as it has worked fine for months), but after looking over the code and trying code-based fixes (assuming something was wrong with the site code), a bounce of the cache fixed the issue. Note: I had bounced memcached the day before, so maybe it didn't warm up correctly etc.
My question is: currently I check whether the memcached key exists, and if it does I use the data. Only if the memcached key doesn't exist do I query the database and populate the key. Do I also need to validate the data somehow so I can be sure it's not corrupt, or should this be seen as an infrequent issue (which it is) and left at that?
Also I believe the memcached key didn't have any data in it so maybe just checking if the key is empty is good enough...
Code below:
public $countryList = array();

// Countries, Country Code, Zip Enabled --- cached under 'generic::countryList::'.$_SESSION['language']
public function countryList() {
    $elasticache = new elasticache();
    if (!$this->countryList = $elasticache->memcached->get('generic::countryList::'.$_SESSION['language'])) {
        // ... this is where the database query code populates $this->countryList ...
        $elasticache->memcached->set('generic::countryList::'.$_SESSION['language'], $this->countryList, 2592000);
    }
}
I guess confirming the data in the key is correct would require a database call and therefore would defeat the purpose of memcached....
thoughts & ideas?
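If the worry is specifically the "key exists but holds nothing useful" case, one cheap defence is to treat an empty or malformed value as a cache miss and fall through to the database. A sketch of that idea in Python (the original code above is PHP; the key name, the pymemcache client, and load_from_db are all illustrative assumptions):
import json
from pymemcache.client.base import Client

cache = Client(('localhost', 11211))

def get_country_list(load_from_db):
    key = 'generic::countryList::en'
    raw = cache.get(key)
    countries = None
    if raw:
        try:
            countries = json.loads(raw)
        except ValueError:
            countries = None   # corrupt value: fall through to the database
    # Treat missing, empty, or malformed values as a cache miss
    if not countries:
        countries = load_from_db()
        cache.set(key, json.dumps(countries).encode('utf-8'), expire=2592000)
    return countries
This costs nothing when the cache is healthy and only hits the database in exactly the cases where the cached value could not be trusted anyway.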

Avoid duplicate POSTs with REST

I have been using POST in a REST API to create objects. Every once in a while, the server will create the object, but the client will be disconnected before it receives the 201 Created response. The client only sees a failed POST request, and tries again later, and the server happily creates a duplicate object...
Others must have had this problem, right? But when I google around, everyone just seems to ignore it.
I have 2 solutions:
A) Use PUT instead, and create the (GU)ID on the client.
B) Add a GUID to all objects created on the client, and have the server enforce their UNIQUE-ness.
A doesn't match existing frameworks very well, and B feels like a hack. How do other people solve this in the real world?
Edit:
With Backbone.js, you can set a GUID as the id when you create an object on the client. When it is saved, Backbone will do a PUT request. Make your REST backend handle PUT to non-existing id's, and you're set.
Another solution that's been proposed for this is POST Once Exactly (POE), in which the server generates single-use POST URIs that, when used more than once, will cause the server to return a 405 response.
The downsides are that 1) the POE draft was allowed to expire without any further progress on standardization, and thus 2) implementing it requires changes to clients to make use of the new POE headers, and extra work by servers to implement the POE semantics.
By googling you can find a few APIs that are using it though.
Another idea I had for solving this problem is that of a conditional POST, which I described and asked for feedback on here.
There seems to be no consensus on the best way to prevent duplicate resource creation in cases where the client cannot generate the unique URI (and therefore cannot use PUT), so POST is needed.
I always use B -- detection of dups due to whatever problem belongs on the server side.
Detection of duplicates is a kludge, and can get very complicated. Genuine distinct but similar requests can arrive at the same time, perhaps because a network connection is restored. And repeat requests can arrive hours or days apart if a network connection drops out.
All of the discussion of identifiers in the other answers is with the goal of giving an error in response to duplicate requests, but this will normally just incite a client to get or generate a new id and try again.
A simple and robust pattern to solve this problem is as follows: server applications should store all responses to unsafe requests, then, if they see a duplicate request, they can repeat the previous response and do nothing else. Do this for all unsafe requests and you will solve a bunch of thorny problems: repeat DELETE requests will get the original confirmation, not a 404 error; repeat POSTs do not create duplicates; repeated updates do not overwrite subsequent changes; and so on.
"Duplicate" is determined by an application-level id (that serves just to identify the action, not the underlying resource). This can be either a client-generated GUID or a server-generated sequence number. In this second case, a request-response should be dedicated just to exchanging the id. I like this solution because the dedicated step makes clients think they're getting something precious that they need to look after. If they can generate their own identifiers, they're more likely to put this line inside the loop and every bloody request will have a new id.
Using this scheme, all POSTs are empty, and POST is used only for retrieving an action identifier. All PUTs and DELETEs are fully idempotent: successive requests get the same (stored and replayed) response and cause nothing further to happen. The nicest thing about this pattern is its Kung-Fu (Panda) quality. It takes a weakness: the propensity for clients to repeat a request any time they get an unexpected response, and turns it into a force :-)
I have a little google doc here if any-one cares.
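A minimal sketch of the replay-the-stored-response idea in Python (an in-memory dict stands in for whatever durable store a real server would use; the names are illustrative):
# Maps action id -> the response we already sent for that id
completed = {}

def handle_unsafe_request(action_id, perform):
    # If we've seen this action id before, repeat the stored response verbatim
    if action_id in completed:
        return completed[action_id]
    # Otherwise do the work once and remember what we answered
    response = perform()
    completed[action_id] = response
    return response

import uuid

action_id = str(uuid.uuid4())          # client-generated, or handed out by the server
first = handle_unsafe_request(action_id, lambda: ('201 Created', 'order 17'))
retry = handle_unsafe_request(action_id, lambda: ('201 Created', 'order 18'))
assert first == retry                   # the retry gets the original response back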
You could try a two step approach. You request an object to be created, which returns a token. Then in a second request, ask for a status using the token. Until the status is requested using the token, you leave it in a "staged" state.
If the client disconnects after the first request, they won't have the token and the object stays "staged" indefinitely or until you remove it with another process.
If the first request succeeds, you have a valid token and you can grab the created object as many times as you want without it recreating anything.
There's no reason why the token can't be the ID of the object in the data store. You can create the object during the first request. The second request really just updates the "staged" field.
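A sketch of that two-step shape in plain Python (dicts stand in for the data store; the "staged" flag is just a field on the stored record):
import time
import uuid

store = {}   # token -> {'data': ..., 'staged': bool, 'created': float}

def create(data):
    # Step 1: create the object in a staged state and hand back a token
    token = str(uuid.uuid4())
    store[token] = {'data': data, 'staged': True, 'created': time.time()}
    return token

def get_status(token):
    # Step 2: asking for the status moves the object out of the staged state.
    # Calling this again returns the same object without re-creating anything.
    record = store.get(token)
    if record is not None:
        record['staged'] = False
    return record

def sweep_stale(max_age_seconds=3600):
    # A separate process can drop staged objects nobody ever asked about
    cutoff = time.time() - max_age_seconds
    for token, record in list(store.items()):
        if record['staged'] and record['created'] < cutoff:
            del store[token]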
Server-issued Identifiers
If you are dealing with the case where it is the server that issues the identifiers, create the object in a temporary, staged state. (This is an inherently non-idempotent operation, so it should be done with POST.) The client then has to do a further operation on it to transfer it from the staged state into the active/preserved state (which might be a PUT of a property of the resource, or a suitable POST to the resource).
Each client ought to be able to GET a list of their resources in the staged state somehow (maybe mixed with other resources) and ought to be able to DELETE resources they've created if they're still just staged. You can also periodically delete staged resources that have been inactive for some time.
You do not need to reveal one client's staged resources to any other client; they need exist globally only after the confirmatory step.
Client-issued Identifiers
The alternative is for the client to issue the identifiers. This is mainly useful where you are modeling something like a filestore, as the names of files are typically significant to user code. In this case, you can use PUT to do the creation of the resource as you can do it all idempotently.
The down-side of this is that clients are able to create IDs, and so you have no control at all over what IDs they use.
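The client side of client-issued identifiers is small. A sketch with Python's uuid and requests modules (the endpoint is made up):
import uuid
import requests

# The client mints the identifier itself, so retrying the same PUT is harmless:
# the second attempt just overwrites the resource with identical content.
object_id = str(uuid.uuid4())
payload = {'name': 'report.pdf', 'size': 1234}

resp = requests.put(f'https://api.example.com/files/{object_id}', json=payload)
if resp.status_code not in (200, 201, 204):
    # Safe to retry blindly because PUT to the same id is idempotent
    resp = requests.put(f'https://api.example.com/files/{object_id}', json=payload)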
There is another variation of this problem. Having a client generate a unique id means we are asking a customer to solve this problem for us. Consider an environment with publicly exposed APIs and hundreds of clients integrating with them. Practically, we have no control over the client code or the correctness of its implementation of uniqueness. Hence, it would probably be better to have intelligence for understanding whether a request is a duplicate. One simple approach is to calculate and store a checksum of every request based on attributes from the user input, define some time threshold (x minutes), and compare every new request from the same client against the ones received in the past x minutes. If the checksum matches, it could be a duplicate request, so add some challenge mechanism for the client to resolve this.
If a client is making two different requests with the same parameters within x minutes, it might be worth ensuring that this is intentional, even if the requests come with unique request ids.
This approach may not be suitable for every use case; however, I think it will be useful for cases where the business impact of executing the second call is high and can potentially cost a customer. Consider a payment processing engine where an intermediate layer ends up retrying a failed request, or a customer double-clicks, resulting in the client layer submitting two requests.
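A sketch of that checksum idea in Python (the window length and the choice of which attributes to hash are application decisions, assumed here for illustration):
import hashlib
import json
import time

WINDOW_SECONDS = 300                    # the "x mins" threshold
recent = {}                             # (client_id, checksum) -> time last seen

def looks_like_duplicate(client_id, request_attrs):
    # Hash only the attributes that come from user input
    canonical = json.dumps(request_attrs, sort_keys=True)
    checksum = hashlib.sha256(canonical.encode()).hexdigest()
    now = time.time()

    # Forget anything older than the window
    for key, seen in list(recent.items()):
        if now - seen > WINDOW_SECONDS:
            del recent[key]

    key = (client_id, checksum)
    if key in recent:
        return True                     # same client, same payload, inside the window
    recent[key] = now
    return False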
Design
- Automatic (no need to maintain a manual blacklist)
- Memory optimized
- Disk optimized

Algorithm [solution 1]
1. REST request arrives with a UUID
2. Web server checks whether the UUID is in the memory-cache blacklist table (if yes, answer 409)
3. Server writes the request to the DB (if it was not filtered by ETS)
4. The DB checks whether the UUID is repeated before writing
5. If yes, answer 409 to the server, and add the UUID to the blacklist in the memory cache and on disk
6. If not repeated, write to the DB and answer 200

Algorithm [solution 2]
1. REST request arrives with a UUID
2. Save the UUID in the memory-cache table (expires after 30 days)
3. Web server checks whether the UUID is in the memory-cache blacklist table [return HTTP 409]
4. Server writes the request to the DB [return HTTP 200]

In solution 2, the threshold for the memory-cache blacklist is kept ONLY in memory, so the DB is never checked for duplicates. The definition of "duplicate" is "any request that arrives within a period of time". We also replicate the memory-cache table on disk, so we can fill it before starting up the server.
In solution 1, there will never be a duplicate, because we always check the disk exactly once before writing; if it is a duplicate, the next round trips are handled by the memory cache. This solution is better for BigQuery, because requests there are not idempotent, but it is also less optimized.
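A sketch of solution 2's memory-cache check in Python, reading the steps as check-then-remember (a dict with timestamps stands in for the real memory cache; replication to disk is left out):
import time

EXPIRE_SECONDS = 30 * 24 * 3600         # "expires after 30 days"
seen_uuids = {}                         # uuid -> time first seen

def handle_request(request_uuid, write_to_db):
    now = time.time()
    first_seen = seen_uuids.get(request_uuid)

    # A UUID already in the cache (and not yet expired) is treated as a duplicate
    if first_seen is not None and now - first_seen < EXPIRE_SECONDS:
        return 409

    # Otherwise remember it and let the write through
    seen_uuids[request_uuid] = now
    write_to_db()
    return 200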

Can Primary-Keys be re-used once deleted?

0x80040237 Cannot insert duplicate key.
I'm trying to write an import routine for MSCRM4.0 through the CrmService.
This has been successful up until this point. Initially I was just letting CRM generate the primary keys of the records, but my client wanted the ability to set the key of our custom entity to predefined values. Potentially this enables us to know what data was created by our installer and what data was created post-install.
I tested to ensure that the Guids can be set when calling the CrmService.Update() method, and the results indicated that records were created with our desired values. I ran my import and everything seemed successful. While modifying my validation code for the import files, I deleted the data (through the CRM browser interface) and tried to re-import. Unfortunately it now throws a duplicate key error.
Why is this error being thrown? Does the CRM interface really delete the record, or does it still exist but hidden from the user's eyes? Is there a way to ensure that a deleted record is permanently deleted and the Guid becomes free? In a live environment these Guids would never have existed, but during my development I need these imports to be successful.
By the way, considering I'm having this issue, does this imply that statically setting Guids is not a recommended practice?
As far as I can tell, entities are soft-deleted, so it would not be possible to reuse that Guid unless you (or the deletion service) deleted the entity out of the database.
For example in the LeadBase table you will find a field called DeletionStateCode, a value of 0 implies the record has not been deleted.
A value of 2 marks the record for deletion. There's a deletion service that runs every 2(?) hours to physically delete those records from the table.
I think Zahir is right, try running the deletion service and try again. There's some info here: http://blogs.msdn.com/crm/archive/2006/10/24/purging-old-instances-of-workflow-in-microsoft-crm.aspx
Zahir is correct.
After you import and delete the records, you can kick off the deletion service at a time you choose with this tool. That will make it easier to test imports and reimports.