NDB urlsafe keys and REST API requests

I was wondering what others are doing to expose REST API endpoints backed by the datastore (using App Engine standard). I want to use urlsafe keys, but (1) I'd rather not pass this data directly, as it poses a security risk since App Engine to App Engine calls are exposed over a public IP, and (2) the generated keys are very long, which would be awkward when several need to be passed as query parameters in a GET request (and would probably exceed browser URL length limits).
I was thinking of compressing the urlsafe keys in some way, which would solve both 1 and 2, but I want to see if there is a better way to create REST endpoints, or whether some type of compression is already baked into ndb.

Google uses HTTPS internally so I'm not sure you need to worry about it.
Also, you should probably design your app so that keys are not secret info and such that it is safe to expose them.
I use key IDs for my REST calls, which I believe are 12 digit numbers. That works as long as you know the entity type. If you need to specify the entity type, you could add another parameter to your API call.
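A minimal sketch of the key-ID approach, written in plain Python so it runs without the App Engine SDK. The parameter names (kind, ids), the helper parse_key_params and the allowed kinds are hypothetical. With NDB, each resulting pair could then be turned back into a key with ndb.Key(kind, id).

```python
def parse_key_params(kind, ids_param, allowed_kinds=("Car", "Customer")):
    """Turn e.g. kind=Car&ids=123,456 into a list of (kind, id) pairs.

    `allowed_kinds` is a hypothetical whitelist, so callers cannot ask
    for arbitrary entity types.
    """
    if kind not in allowed_kinds:
        raise ValueError("unknown kind: %r" % kind)
    pairs = []
    for raw in ids_param.split(","):
        raw = raw.strip()
        # Accept only plain ASCII digits; anything else is rejected.
        if not (raw.isascii() and raw.isdigit()):
            raise ValueError("not a numeric key ID: %r" % raw)
        pairs.append((kind, int(raw)))
    return pairs


pairs = parse_key_params("Car", "123456789012, 42")
```

Several numeric IDs in one query parameter stay far shorter than the same number of urlsafe keys, at the cost of having to know (or pass) the entity kind.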

Related

How to specify data security constraints in REST APIs?

I'm designing a REST API and I'm a big defender of keeping my URL simple, avoiding more than two nested resources.
However, I've been having second thoughts because of data security restrictions that apply to my APIs, that have been trying to force me to nest more resources. I'll try to provide examples to be more specific, as I don't know the correct naming for this situation.
Consider a simple example where I want to get a given contact restriction for a customer, such as the period during which the customer accepts being bothered with a phone call.
I believe it's simpler to have this:
- GET /customers/12345
- GET /customers/12345/contacts
- GET /contacts/9999
- GET /contacts/9999/restrictions
- GET /restrictions/1
than this:
- GET /customers/12345
- GET /customers/12345/contacts
- GET /customers/12345/contacts/9999
- GET /customers/12345/contacts/9999/restrictions
- GET /customers/12345/contacts/9999/restrictions/1
Note: If there are more related resources, who knows where this will go...
The first case is my favourite: since every resource MUST have a unique identifier, as soon as I have that identifier I should be able to get the resource instance directly: GET /restrictions/1
The data security restriction in place at my company states that not everyone can see every customer's info (e.g. only some managers can access private equity customers). So, to guarantee that, the architects are telling me I should use /customers/12345/contacts/9999/restrictions/1 or /customers/12345/contact-restrictions/1 so that our data access validator in our platform has the customerId to check if the caller has access to it.
I understand the requirement and I see its value. However, I think that this kind of custom security information, because that's what I believe it to be, should be in a custom header.
So, I believe I should stick to GET /restrictions/1 with a custom header "customerId" with the value 12345.
This custom header would only be needed for the APIs that have this requirement.
Besides the simpler URL, another advantage of the header is that if an API didn't start with that security requirement and suddenly needs to comply with it, we could simply require the header to be passed instead of redefining paths.
I hope I made it clear for you and I'll be looking to learn more about great API design techniques.
Thank you all that reached the end of my post :)
TL;DR: you are fighting over URI design, and REST doesn't actually offer guidance there.
REST, and REST clients, don't distinguish between your "simpler" design and the nested version. A URI is just an opaque sequence of bytes with a few domain-agnostic semantics.
/4290c3b2-134e-4647-867a-214d0c866f29
Is a perfectly "RESTFUL" URI. See Stefan Tilkov, REST: I don't Think it Means What You Think it Does.
Fundamentally, REST servers are document stores. You provide a key (the URI) and the server provides the document. Or you provide a key, and the server modifies the document.
How this is implemented is completely at the discretion of the server. It could be that /4290c3b2-134e-4647-867a-214d0c866f29 is used to look up the tuple (12345, 9999, 1), and then the server checks to see if the credentials described in the request header have permission to access that information, and if so the appropriate representation of the resource corresponding to that tuple is returned.
From the client's perspective, it's all the same thing: I provide an opaque identifier in a standard way, and credentials in a standard way, and I get access to the resource or I don't.
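To illustrate the point, here is a small sketch of a server that resolves an opaque identifier to the (customer, contact, restriction) tuple and only then checks the caller's permissions. The in-memory tables, the caller names and the function get_restriction are all hypothetical stand-ins for real storage and a real access validator.

```python
# Hypothetical lookup table: opaque URI segment -> (customerId, contactId, restrictionId).
RESOURCE_INDEX = {
    "4290c3b2-134e-4647-867a-214d0c866f29": (12345, 9999, 1),
}

# Hypothetical access rules: caller -> customer ids the caller may see.
CUSTOMER_ACCESS = {
    "manager-a": {12345},
    "clerk-b": set(),
}


def get_restriction(opaque_id, caller):
    """Handle GET /<opaque_id>; return (status, body)."""
    key = RESOURCE_INDEX.get(opaque_id)
    if key is None:
        return 404, None
    customer_id, contact_id, restriction_id = key
    # The customer id comes from the server's own lookup, not from the URI shape.
    if customer_id not in CUSTOMER_ACCESS.get(caller, set()):
        return 403, None
    return 200, {
        "customer": customer_id,
        "contact": contact_id,
        "restriction": restriction_id,
    }
```

The client only ever sends the identifier and its credentials; whether the customer id is visible in the path is purely a server-side convenience.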
the architects are telling me I should use /customers/12345/contacts/9999/restrictions/1 or /customers/12345/contact-restrictions/1 so that our data access validator in our platform has the customerId to check if the caller has access to it.
I understand the requirement and I see its value. However, I think that this kind of custom security information, because that's what I believe to be, should be in a custom header.
There's nothing in REST to back you up. In fact, the notion of introducing a custom header is something of a down check, because your custom header is not something that a generic component is going to know about.
When you need a new header, the "REST" way to go about it is to introduce a new standard. See RFC 5988 for an example.
Fielding, writing in 2008:
Every protocol, every media type definition, every URI scheme, and every link relationship type constitutes prior knowledge that the client must know (or learn) in order to make use of that knowledge. REST doesn’t eliminate the need for a clue. What REST does is concentrate that need for prior knowledge into readily standardizable forms.
The architects have a good point - encoding into the uri the hints that make it easier/cheaper/more-reliable to use your data access validator is exactly the sort of thing that allowing the servers to control their own URI namespace is supposed to afford.
The reason that this works, in REST, is that clients don't depend on URI for semantics; instead, they rely on the definitions of the relations that are encoded into the links (or otherwise expressed by the definition of the media type itself).

What are the best practices to make a good REST API request cache key?

I am building a simple API service using Ruby on Rails. In production, I would like to integrate Redis/Memcached in order to cache some frequently-used endpoints with key-based caching. For example, I have a Car table with name and color fields.
My question is, what is the best way to define a cache key for a particular endpoint (e.g. /cars) when the resource has a variety of params that could come in a different order? E.g. /cars?name=honda&color=white and /cars?color=white&name=honda.
If I use request url as cache key I will have 2 different cache records but technically speaking, if both name and color have the same values, there should only be one cache record in Redis database.
Arrange the parameters in alphabetical order and use that as the basis for a cache key.
/cars?name=honda&color=white
/cars?color=white&name=honda
In both cases the cache key would be based on the parameters sorted alphabetically by name, together with their values:
color=white&name=honda
So both of the reordered URLs above would result in the same cache key.
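A minimal sketch of the sorting idea, in Python rather than Ruby for brevity. The function name cache_key is hypothetical. The key keeps the parameter values as well as the names, so that /cars?color=white and /cars?color=red do not collide.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url):
    """Build a cache key that ignores query parameter order."""
    parts = urlsplit(url)
    # parse_qsl returns (name, value) pairs; sorting them makes the order canonical.
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return parts.path + "?" + urlencode(params)


key_a = cache_key("/cars?name=honda&color=white")
key_b = cache_key("/cars?color=white&name=honda")
```

Both calls produce the same string, so Redis or Memcached would hold a single record for the two URLs.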

Will the list preserve its order while the data is served from service to app?

I am building a pagination API. Assume a request comes in for resources 0 to 5.
The API returns the 5 resources in a particular order.
But there is no explicit sequence number or order field in the response.
When the data is transferred over the network, will the list preserve its order while the data is served from service to app?
My worry is that the service will be on one technology stack (Java, Node.js, etc.), while the client can be a Java-based Android app, a JS-based React Native app, or another backend service in Java/Node.js.
Each one will have its own serialization and deserialization libraries. Will the order of the elements be consistent?
I am thinking of adding explicit sequence number/index values and explicitly telling clients to use them for any ordering/sorting operations.
Is my understanding correct?
Is an explicit sequence number needed or not?
Do the serialization and deserialization libraries maintain the order?
To be sure you get the order you want, you can sort explicitly in one line. To sort by a text field:
arrayResults.sort((a, b)=> b.titleObject.toUpperCase().localeCompare(a.titleObject.toUpperCase()))
To sort by a numeric field:
arrayResults.sort((a, b)=> b.numberObject - a.numberObject)
Then you could paginate peacefully
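On the serialization question itself: JSON defines an array as an ordered sequence of values, so a conforming library should keep list order across a round trip (object keys are a different matter). A small Python sketch, assuming a JSON payload and a hypothetical title field, that sorts on the server, slices one page and checks that the order survives serialization:

```python
import json


def page(items, offset, limit):
    """Sort by title (case-insensitive, descending), then slice one page."""
    ordered = sorted(items, key=lambda r: r["title"].upper(), reverse=True)
    return ordered[offset:offset + limit]


resources = [{"title": t} for t in ["pear", "Apple", "fig", "Banana", "kiwi", "date"]]

first_page = page(resources, 0, 5)
# Simulate service -> network -> client: serialize, then parse again.
round_tripped = json.loads(json.dumps(first_page))
```

Note that Python compares strings by code point while localeCompare is locale-aware, so the two can order non-ASCII titles differently. An explicit index field is still a reasonable safeguard if clients may re-sort or merge pages.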

How to better specify the kind of ID in a RESTful service

I'm looking for an opinion about defining contract for standard GET/PUT/POST/DELETE methods.
We have resource, let's say Client, so route will be /clients
However, we have two types of ID for the client. One is the ID generated by our system. On top of that, we want to optionally allow customers to use an external ID, generated by the customers themselves.
So, if a customer is never going to add clients to the system, isn't really interested in integration, and only needs the GET method to read a client, the endpoint will be:
/clients/{id}
However, if they want full integration, with the ability to add clients, we want to give them the ability to use their own ID.
We considered four possible solutions:
1. /clients/external/{externalId}
2. /clients/ext-{externalId}
3. /clients/{externalId}?use-external-id=true
4. /clients/{externalId} with additional header -"use-external-id": true
We are leaning towards options 3 and 4 (they can be supported simultaneously) but have concerns about the "restfulness" of such an approach. Any opinions on this? What would you choose and why?
REST says nothing about URLs.
How different are internal and external clients? If the only difference is the existence of an externalId property, just use the /clients endpoint and add the property to your client resource. Always assign and use the internal id property in your API, but allow queries to filter by the customer-provided external id also.
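A sketch of that suggestion, assuming an in-memory list in place of a real data store. The query parameter name externalId and the function names are hypothetical. The path always carries the internal id, and the external id is just a filter on the collection.

```python
# Hypothetical client records; externalId is optional and customer-provided.
CLIENTS = [
    {"id": 1, "externalId": "d23sa", "name": "Acme"},
    {"id": 2, "externalId": None, "name": "Globex"},
]


def get_client(client_id):
    """GET /clients/{id}: look up by the system-generated id."""
    return next((c for c in CLIENTS if c["id"] == client_id), None)


def list_clients(external_id=None):
    """GET /clients?externalId=...: optionally filter by the customer's own id."""
    if external_id is None:
        return list(CLIENTS)
    return [c for c in CLIENTS if c["externalId"] == external_id]
```

For example, list_clients("d23sa") returns a one-element list whose internal id can then be used for subsequent PUT or DELETE calls on /clients/{id}.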
How about this:
/clients/client_id/1 - for automatically generated ids
/clients/external_id/d23sa - for filtering on the external_id field
This could be extended to generically filter on any field of a resource and is the approach my company used in developing SlashDB.

Multi-tenant Algolia index

I would like to offer my users full-text search over their data, and make sure that they can only access the data they own. Are there any patterns that allow this on Algolia? None of the solutions I've considered seems a good fit, so I was wondering if I had overlooked some other options.
We could host each user's data in a separate Algolia app, so that each API key would give access to only the relevant data, but that would quickly become unaffordable, as many would hit the 10000 records limit.
We could host each user's data in a separate index and use team index restrictions, but there does not seem to be an API to manage those, and that would anyway require an Algolia account for each customer, which seems like a misuse of the service (we could e.g. generate email addresses at our domain name).
Finally we could filter queries with some userId to retrieve only the relevant data, but that wouldn't be secure, as someone could use the apikey to query algolia without the filter.
We could proxy algolia calls to inject the filter and the api key - but the perf penalty would probably be high.
Any other suggestions ? Thanks!
I got a great answer from rayrutjes at Algolia, so I'm pasting it here in case it helps others:
The best approach for your use case is to use what we call generated API keys. Here is the documentation for the JavaScript client: https://www.algolia.com/doc/api-client/javascript/api-keys/#generate-key
The usage is fairly simple, you generate an API key on the fly based on your search API key + some additional query params.
The resulting API key can be used like a standard search API key, with the difference that it can be scoped on a given set of parameters.
Note that the generation of such a scoped API key does not require an actual call to the API.
Also be sure to generate those scoped API keys in the backend as in that case you don't want to expose the search API key you use for their generation.
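To show why such a key can be generated offline and still be trusted, here is a sketch of the general idea using an HMAC over the restriction parameters. This is an illustration under assumptions, not Algolia's actual key format or client API. The function names and the filters=userId:42 string are hypothetical, and in practice you would call the generate-key helper of the official client linked above.

```python
import base64
import hashlib
import hmac


def make_scoped_key(parent_key, restrictions):
    """Sign the restrictions with the parent key; no network call is needed."""
    sig = hmac.new(parent_key.encode(), restrictions.encode(), hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode((sig + restrictions).encode()).decode()


def read_scoped_key(parent_key, scoped_key):
    """Server side: verify the signature and return the enforced restrictions."""
    decoded = base64.urlsafe_b64decode(scoped_key.encode()).decode()
    sig, restrictions = decoded[:64], decoded[64:]  # SHA-256 hex digest is 64 chars
    expected = hmac.new(parent_key.encode(), restrictions.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        raise ValueError("signature mismatch")
    return restrictions


scoped = make_scoped_key("parent-search-key", "filters=userId:42")
```

Because the filter is covered by the signature, a user who holds the scoped key cannot drop or change it without the parent key, which is why the parent key has to stay in the backend.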