REST-API design - allow custom IDs

We are designing an API that marketplaces and online shops can use to create payments for their customers.
To reduce the work the marketplaces and shops have to do to implement our API, we want to let them use their own user and contract IDs rather than storing the IDs we create. This makes integration easier for them, as they don't have to change or extend their databases. Internally, our database will still use our own technical IDs. So far we do not run any checks on the custom IDs (e.g. uniqueness).
My question is whether it is a good idea in general to let the stores and marketplaces use their own IDs, or whether it is bad practice. And if our approach makes sense, should we run checks on the IDs we receive from the stores and marketplaces (e.g. uniqueness of a user ID within a store)?
Example payload for creating a new user via POST /users/:
{
  "customUserId": "fancyshopuserid12345",
  "name": "John",
  "surName": "Doe"
}
Now the shop can send a GET request to /users/fancyshopuserid12345 to retrieve the new user via our API.
EDIT:
We now support both approaches.
If a client wants to use its own ID, it sends it as in the example above; if it sets customUserId to false, we use our internal ID as the value.

Personally, I think it's an awesome feature!
And I don't see any problems here.
I also think that you don't have to validate the customers' IDs; just check that they can't be used for injection into your persistence layer, and that will be enough.
Moreover, you don't violate any REST conventions, which is why I think it's a nice idea.

Well, a cool (RESTful) approach would be to receive URIs instead of custom IDs. Unfortunately, that would mean those partner systems would have to publish their own resources in order to be able to link to them. This would also solve the uniqueness problem, since you would only have to check whether the URI exists.
If some shop systems are in fact built RESTfully, they may actually want to store a URI instead of an ID, to be able to navigate seamlessly through their own systems and yours. They would only have to add your media types to their clients, and that's it.
Other than that, sure, you can store IDs of third-party systems. I know of a few trading systems that do exactly that, storing all sorts of third-party IDs: backend system IDs, transport layer IDs, etc. It is at least not unheard of.
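As a rough illustration of the scheme described in the question (custom IDs on the outside, technical IDs on the inside, uniqueness checked per store), here is a minimal in-memory sketch. The class UserRegistry, its method names, and the store_id argument are assumptions made for this example; they are not part of the API in the question.

```python
import itertools


class UserRegistry:
    """Maps (store, public ID) pairs to internal technical IDs (hypothetical helper)."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._by_public_id = {}  # (store_id, public_id) -> internal_id
        self._users = {}         # internal_id -> user record

    def create_user(self, store_id, name, sur_name, custom_user_id=False):
        internal_id = next(self._next_id)
        # As in the question's EDIT: a falsy customUserId means "use the internal ID".
        public_id = custom_user_id if custom_user_id else str(internal_id)
        key = (store_id, public_id)
        # Uniqueness is only enforced within one store, not globally.
        if key in self._by_public_id:
            raise ValueError(f"ID {public_id!r} is already used by store {store_id!r}")
        self._by_public_id[key] = internal_id
        self._users[internal_id] = {"name": name, "surName": sur_name}
        return public_id

    def get_user(self, store_id, public_id):
        # Resolve the store's ID to the technical ID before touching user data.
        internal_id = self._by_public_id[(store_id, public_id)]
        return self._users[internal_id]


registry = UserRegistry()
public_id = registry.create_user("fancyshop", "John", "Doe", "fancyshopuserid12345")
user = registry.get_user("fancyshop", public_id)
```

Scoping the lookup key by store means two shops can reuse the same custom ID without colliding, while a duplicate inside one shop is rejected.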

Related

Keycloak users - is it a good idea to differentiate users by their country?

I'm designing a fairly complex backend and now I have a doubt. Is it a good idea in Keycloak to put users into different Keycloak groups by their country when I create them, for example during sign-in?
I was thinking that it could be useful for managing users better in the future.
What do you think?
There is no direct answer to such a question. It clearly depends on your application. If your application will provide services based on the country of each user in the future, it might be a good idea, as your application could get this information about the user directly from Keycloak.
If you are planning to do some research on your users, it might also be a good idea, as some statistics might be country-related or you might want country-related outputs (to relocate your cloud instances closer to the majority of your users, etc.).
Database lookups might be faster with such additional information, but I don't know whether Keycloak currently provides functionality for this. On the other hand, if I sign up for your service while I am relaxing on holiday on the other side of the world from where I usually live, your record will be useless. Therefore this could bring more issues to the implementation of your application, while you might not need it at all.
If you have no plans for such functionality, there is simply no reason to do this. Present-day web services tend to store more data than they actually need. For example, in the majority of recent database leaks you can see the LAST geographic coordinates stored with each user. While such data might be used for precise advertisement targeting and unnecessary user screening, there is really no reason to store the last geographic coordinates of each user. Such information might change with each user login and should be determined at "runtime". If services do not benefit from such data, users are put under threat for no reason.
You should determine what your application needs and what it does not. You should never store or expose any additional information about your users, regardless of how well your application is secured.

REST: Get query only changeable objects

I have a bunch of APIs which return several types of data.
All users can query all data by using a GET REST API.
A few users can also change data. What is a common approach when designing a REST API to query only the data that can be changed by the current user, while still allowing the API to return all data (for display mode)?
To explain it further:
The software manages projects. All projects are accessible to all users (including anonymous ones) via an API (let's call it GET api/projects).
A user has the ability to see a list of all projects he is involved in and which he can edit.
The API should return exactly the same data, but limited to the projects he is involved in.
Should I create an additional parameter, or maybe pass an HTTP header, or something else?
There's no one-size-fits-all answer to this, so I will give you one recommendation that works for some people.
I don't really like creating resources that have 'complex access control'. Instead, I would prefer to create distinct resources for different access levels.
If you want to return limited results for people that have partial access, it might be better to create new resources that reflect this.
I think this might also help a bit thinking about the abstract role of a person who is not allowed to do everything. The abstraction probably doesn't exist per-property, but it exists somewhere as a business rule.
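One way to read the "distinct resources" recommendation is sketched below: one resource lists every project, and a second, separate resource lists only the projects a given user may edit, using the same representation. The route names in the docstrings, the Project fields, and the sample data are all hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    id: int
    name: str
    editors: set = field(default_factory=set)  # user IDs allowed to edit


# Hypothetical sample data.
PROJECTS = [
    Project(1, "Website relaunch", {"alice"}),
    Project(2, "Mobile app", {"alice", "bob"}),
    Project(3, "Data migration"),
]


def list_projects():
    """Would back something like GET /projects: every project, readable by anyone."""
    return list(PROJECTS)


def list_editable_projects(user_id):
    """Would back something like GET /users/{user_id}/editable-projects.

    Same representation as list_projects(), limited to what the user may edit.
    """
    return [p for p in PROJECTS if user_id in p.editors]


all_names = [p.name for p in list_projects()]
bob_names = [p.name for p in list_editable_projects("bob")]
```

The access rule lives in one place (the editors set) and each resource has a single, simple meaning, which is the point of avoiding per-request access-control switches on one URL.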

What are some patterns for designing a REST API for a user-based platform in AWS?

I am trying to shift towards a serverless architecture when it comes to building REST APIs. I come from a Ruby on Rails background.
I have successfully understood and adopted services such as API Gateway, Cognito, RDS and Lambda functions; however, I am struggling with putting it all together in an optimal way.
My case is the following. I have a simple user-based platform where there are multiple resources related to application members, say a blog application.
I have used Cognito for authentication and Aurora as the database service for keeping things like articles and likes.
Since the database and Cognito user pool are decoupled, it is hard for me to do things like:
Fetching users that liked particular article
Fetching users comments
This seems problematic to me because I need to pass some unique Cognito user identifier (retrieved during the authorization phase in API Gateway) to the Lambda function, which will then save the database record with an external reference to this user. On the other hand, if I want to fetch particular users, I must first fetch their identifiers from my relational database and then request the user details from the Cognito user pool. I lack a standard way of accessing the current user in my Lambda functions, as well as mechanisms for easily associating a database record with that user.
I have not found any convincing recommended patterns for designing such applications, even though it seems like a very common problem, and I am having a hard time judging whether my approach is correct.
I would appreciate some comments on which patterns to consider when designing a simple user-based platform and on the pitfalls of my solution. Any articles and examples would also be very helpful.
Thanks in advance.
These sound like standard problems associated with distributed, independent databases. You can no longer delegate all relationships to the database and get a result aggregating them in some way. You have to do the work yourself by calling one database, then the other.
For a case like this:
Fetching users that liked particular article
You would look up the "likes" database to determine user IDs of those who liked it, then look up the "users" database to determine user details such as name and avatar.
Most patterns follow standard database advice, e.g. in the above example, you could follow the performance-oriented pattern of de-normalising - store user data such as name and avatar against each "like", as long as you feel the extra storage and burden of keeping it consistent is justified by the reduction in queries (probably too many Likes to justify this).
Another important practice is using bulk queries to avoid N+1 queries. This is what Rails does with the includes syntax, but you may have to do it yourself here. In my example, it should only take two queries because the second query should get all required user data in one go, by querying for users matching the list of user IDs.
Finally, I'd suggest you try to abstract things. This kind of code gets messy fast, so be sure to build a well-encapsulated data layer that isolates application code from dealing with the mess of multiple databases.
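The two-query approach described above can be sketched with two independent in-memory SQLite databases. The second database merely stands in for a separate user store such as a Cognito user pool; it does not use any Cognito API, and the table layouts and the function name users_who_liked are assumptions for this example.

```python
import sqlite3

# Store 1: the application database holding likes.
likes_db = sqlite3.connect(":memory:")
likes_db.execute("CREATE TABLE likes (article_id INTEGER, user_id TEXT)")
likes_db.executemany(
    "INSERT INTO likes VALUES (?, ?)",
    [(1, "u1"), (1, "u2"), (2, "u3")],
)

# Store 2: an independent user store (stand-in for the external user pool).
users_db = sqlite3.connect(":memory:")
users_db.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")
users_db.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("u1", "Ann"), ("u2", "Bob"), ("u3", "Cy")],
)


def users_who_liked(article_id):
    """Data-layer function that hides the two-store lookup from application code."""
    # Query 1: collect the user IDs from the likes store.
    rows = likes_db.execute(
        "SELECT user_id FROM likes WHERE article_id = ?", (article_id,)
    ).fetchall()
    user_ids = [row[0] for row in rows]
    if not user_ids:
        return []
    # Query 2: one bulk lookup for all IDs instead of one query per user (no N+1).
    placeholders = ",".join("?" * len(user_ids))
    return users_db.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders}) ORDER BY id",
        user_ids,
    ).fetchall()


fans = users_who_liked(1)
```

Application code only calls users_who_liked(); the fact that two stores are involved stays inside the data layer, which is the encapsulation suggested above.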

Include / embed vs. link in RESTful APIs

So the general pattern for a RESTful API is to return a single object with embedded links you can use to retrieve related objects. But sometimes for convenience you want to pull back a whole chunk of the object graph at once.
For instance, let's say you have a store application with customers, orders, and returns. You want to display the personal information, all orders, and all returns, together, for customer ID 12345. (Presumably there's good reasons for not always returning orders and returns with customer personal information.)
The purely RESTful way to do this is something like:
GET /
returns a list of link templates, including one to query for customers
GET /customers/12345 (based on link template from /)
returns customer personal information
returns links to get this customer's orders and returns
GET /orders?customerId=12345 (from /customers/12345 response)
gets the orders for customer 12345
GET /returns?customerId=12345 (from /customers/12345 response)
gets the returns for customer 12345
But it'd be nice, once you have the customers URI, to be able to pull this all back in one query. Is there a best practice for this sort of convenience query, where you want to transclude some or all of the links instead of making multiple requests? I'm thinking something like:
GET /customers/12345?include=orders,returns
but if there's a way people are doing this out there I'd rather not just make something up.
(FWIW, I'm not building a store, so let's not quibble about whether these are the right objects for the model, or how you're going to drill down to the actual products, or whatever.)
Updated to add: It looks like in HAL speak these are called 'embedded resources', but in the examples shown, there doesn't seem to be any way to choose which resources to embed. I found one blog post suggesting something like what I described above, using embed as the query parameter:
GET /ticket/12?embed=customer.name,assigned_user
Is this a standard or semi-standard practice, or just something one blogger made up?
Given that the semantics of these types of parameters would have to be documented for each link relation that supported them, and that this is more or less something you'd have to code to, I don't know that there's anything to gain by having a standard way of expressing this. The URL structure is more likely to be driven by what's easiest or most prudent for the server to return than by any particular standard or best practice.
That said, if you're looking for inspiration, you could check out what OData is doing with the $expand parameter and model your link relation from that. Keep in mind that you should still clearly define the contract of your relation, otherwise client programmers may see an OData-like convention and assume (wrongly) that your app is fully OData compliant and will behave like one.
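For illustration only, here is a minimal sketch of how a server might honor the include parameter proposed in the question. The response shape (_links and _embedded) is loosely modeled on HAL; the data, the function name get_customer, and the error handling are assumptions and do not follow any standard.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical in-memory data.
CUSTOMERS = {"12345": {"id": "12345", "name": "Jane Doe"}}
RELATED = {
    "orders": {"12345": [{"id": "o1", "total": 40}]},
    "returns": {"12345": [{"id": "r1", "orderId": "o1"}]},
}


def get_customer(url):
    """Handle GET /customers/{id}[?include=rel1,rel2] and return the response body."""
    parsed = urlparse(url)
    customer_id = parsed.path.rstrip("/").split("/")[-1]
    include = parse_qs(parsed.query).get("include", [""])[0]
    requested = [rel for rel in include.split(",") if rel]
    # The contract must say which relations are embeddable; reject anything else.
    unknown = [rel for rel in requested if rel not in RELATED]
    if unknown:
        raise ValueError(f"cannot embed: {unknown}")
    body = dict(CUSTOMERS[customer_id])
    # Links are always present, so clients can still follow them one by one.
    body["_links"] = {
        rel: {"href": f"/{rel}?customerId={customer_id}"} for rel in RELATED
    }
    if requested:
        body["_embedded"] = {
            rel: RELATED[rel].get(customer_id, []) for rel in requested
        }
    return body


plain = get_customer("/customers/12345")
expanded = get_customer("/customers/12345?include=orders,returns")
```

Without the parameter the client gets links only; with it, the listed relations are transcluded into the same response.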

Comparing data with RESTful API

For a website, I am working on defining a RESTful API. I believe I got it (mostly) correct, using proper resource URIs and proper use of GET/POST/PUT/DELETE.
However there is one point where I can't quite figure out what the proper way to do it "in" REST would be - comparison of lists.
Let's say I have a bookstore and a customer can have a wishlist. The wishlist consists of books (their full Book record, i.e. name, synopsis, etc) and a full copy of the list exists on the client. What would be a good way to design the RESTful API to allow a client to query the correctness of its local wishlist (i.e. get to know what books have been added/removed on the wishlist on the server side)?
One option would be to just download the full wishlist from the server and compare it locally. However this is quite a large amount of data (due to the embedded content) and this is a mobile client with a low-bandwidth connection, so this would cause a lot of problems.
Another option would be to download not the whole wishlist (i.e. not including book infos) but only a list of the books' identifiers. This would be not much data (compared to the previous option) and the client could compare the lists locally. However to get the full book record for newly added books, a REST call would have to be made for every single new book. Again, as this is a mobile client with bad network connectivity, this could be problematic.
A third option, and my favorite, would be that the client sends its list of identifiers to the server, and the server compares it to the wishlist and returns which books were removed plus the data for books that were added. This would mean a single roundtrip and only the necessary amount of data. As the wishlist size is estimated to be less than 100 entries, sending just the IDs would be a minimal amount of data (~0.5 KB). However, I don't know what kind of call would be appropriate: it can't be GET, as we are sending data (and putting it all in the URL does not feel right), and it can't be POST/PUT, as we do not change anything on the server. Obviously it's not DELETE either.
How would you implement this third option?
Side-question: how would you solve this problem (i.e. why is option 3 stupid or what better, simple solutions may there be)?
Thank you.
P.S.: A fourth option would be to implement a more sophisticated protocol where the server keeps track of changes to the list (additions/deletions) and the client can e.g. query for changes based on a version identifier or simply a timestamp. However, I like the third option better, as implementation-wise it is much simpler and less error-prone on both client and server.
There is nothing in HTTP that says that POST must update the server. People seem to forget the following line in RFC 2616 regarding one use of POST:
"Providing a block of data, such as the result of submitting a form, to a data-handling process;"
There is nothing wrong with taking your client side wishlist and POSTing to a resource whose sole purpose is to return a set of differences.
POST /Bookstore/WishlistComparisonEngine
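What such a comparison resource might compute can be sketched as follows. The function would sit behind the POST handler; the shape of the request (a list of IDs) and of the response (removed and added keys), as well as the sample data, are assumptions for this example.

```python
# Hypothetical server-side wishlist: book ID -> full book record.
SERVER_WISHLIST = {
    "isbn-2": {"title": "Book Two", "synopsis": "..."},
    "isbn-3": {"title": "Book Three", "synopsis": "..."},
}


def compare_wishlist(client_ids):
    """Return what the client must drop and the full records it is missing."""
    client = set(client_ids)
    server = set(SERVER_WISHLIST)
    return {
        # IDs the client still holds but the server no longer has.
        "removed": sorted(client - server),
        # Full records for books the client has not seen yet.
        "added": [
            {"id": book_id, **SERVER_WISHLIST[book_id]}
            for book_id in sorted(server - client)
        ],
    }


# The client POSTs its local IDs and gets only the differences back.
diff = compare_wishlist(["isbn-1", "isbn-2"])
```

Nothing on the server changes as a result of the call; the POST body is just input to a data-handling process, as the RFC wording allows.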
The whole concept behind REST is that you leverage the power of the underlying HTTP protocol.
In this case there are two HTTP headers that can help you find out if the list on your mobile device is stale. An added benefit is that the client on your mobile device probably supports these headers natively, which means you won't have to add any client side code to implement them!
If-Modified-Since: check to see if the server's copy has been updated since your client first retrieved it
ETag: check to see if a unique identifier for your client's local copy matches that which is on the server (the client sends the stored value back in an If-None-Match request header). An easy way to generate the unique string required for ETags on your server is to just hash the service's text output using MD5.
You might try reading Mark Nottingham's excellent HTTP caching tutorial for information on how these headers work.
If you are using Rails 2.2 or greater, there is built-in support for these headers.
Django 1.1 supports conditional view processing.
And this MIX video shows how to implement this with ASP.NET MVC.
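The ETag approach described above can be sketched in a framework-independent way. The helper names make_etag and conditional_get are made up for this example, and a real server would normally rely on its framework's built-in support rather than hand-rolling this.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag by hashing the response body with MD5, as suggested above."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def conditional_get(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET with an optional If-None-Match."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client's copy is current: send no body, which saves the bandwidth.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


wishlist = b'["isbn-1", "isbn-2"]'
# First request: full response plus an ETag that the client stores.
status, headers, payload = conditional_get(wishlist)
# Later request: the client echoes the ETag back and gets 304 Not Modified.
status_again, _, payload_again = conditional_get(wishlist, headers["ETag"])
```

MD5 is used here only as a cheap fingerprint of the output, not for security.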
I think the key problems here are the definitions of Book and Wishlist, and where the authoritative copies of Wishlists are kept.
I'd attack the problem this way. First you have Books, which are keyed by ISBN number and have all the metadata describing the book (title, authors, description, publication date, pages, etc.) Then you have Wishlists, which are merely lists of ISBN numbers. You'll also have Customer and other resources.
You could name Book resources something like:
/book/{isbn}
and Wishlist resources:
/customer/{customer}/wishlist
assuming you have one wishlist per customer.
The authoritative Wishlists are on the server, and the client has a local cached copy. Likewise the authoritative Books are on the server, and the client has cached copies.
The Book representation could be, say, an XML document with the metadata. The Wishlist representation would be a list of Book resource names (and perhaps snippets of metadata). The Atom and RSS formats seem good fits for Wishlist representations.
So your client-server synchronization would go like this:
GET /customer/{customer}/wishlist
for ( each Book resource name /book/{isbn} in the wishlist )
GET /book/{isbn}
This is fully RESTful, and lets the client later on do PUT (to update a Wishlist) and DELETE (to delete it).
This synchronization would be pretty efficient on a wired connection, but since you're on a mobile you need to be more careful. As @marshally points out, HTTP 1.1 has a lot of optimization features. Do read that HTTP caching tutorial, and be sure to have your web server properly set Expires headers, ETags, etc. Then make sure the client has an HTTP cache. If your app were browser-based, you could leverage the browser cache. If you're rolling your own app and can't find a caching library to use, you can write a really basic HTTP 1.1 cache that stores the returned representations in a database or in the file system. The cache entries would be indexed by resource names and hold the expiration dates, entity tags, etc. This cache might take a couple of days or a week or two to write, but it is a general solution to your synchronization problems.
You can also consider using GZIP compression on the responses, as this cuts down the sizes by maybe 60%. All major browsers and servers support it, and there are client libraries you can use if your programming language doesn't already (Java has GZIPInputStream, for instance).
If I strip out the domain-specific details from your question, here's what I get:
In your RESTful client-server application, the client stores a local copy of a large resource. Periodically, the client needs to check with the server to determine whether its copy of the resource is up-to-date.
marshally's suggestion is to use HTTP caching, which IMO is a good approach provided it can be done within your app's constraints (e.g., authentication system).
The downside is that if the resource is stale in any way, you'll be downloading the entire list, which sounds like it's not feasible in your situation.
Instead, how about re-evaluating the need to keep a local copy of the Wishlist in the first place:
How is your client currently using the local Wishlist?
If you had to, how would you replace the local copy with data fetched from the server?
What have you done to minimize your client's data requirements when building its Wishlist view(s) and executing business logic?
Your third alternative sounds nice, but I agree that it doesn't feel too RESTful...
Here's another suggestion that may or may not work: if you keep a version history of your list, you could ask for updates since a specific version. This feels more like something that can be a GET operation. The version identifiers could either be simple version numbers (like in e.g. svn), or, if you want to support branching or other non-linear history, they could be some kind of checksums (like in e.g. monotone).
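A minimal sketch of the version-history idea, using simple linear version numbers. The class name, the change-log layout, and the URL shape mentioned in the docstring are assumptions for this example.

```python
class VersionedWishlist:
    """Wishlist that records every change under an increasing version number."""

    def __init__(self):
        self.version = 0
        self._log = []  # entries of (version, operation, book_id)

    def _record(self, operation, book_id):
        self.version += 1
        self._log.append((self.version, operation, book_id))

    def add(self, book_id):
        self._record("add", book_id)

    def remove(self, book_id):
        self._record("remove", book_id)

    def changes_since(self, version):
        """Could back something like GET /wishlist/changes?since=<version>."""
        return {
            "version": self.version,
            "changes": [
                {"op": op, "book": book_id}
                for v, op, book_id in self._log
                if v > version
            ],
        }


wishlist = VersionedWishlist()
wishlist.add("isbn-1")     # version 1
wishlist.add("isbn-2")     # version 2
wishlist.remove("isbn-1")  # version 3
# A client that last synced at version 1 asks only for what happened afterwards.
update = wishlist.changes_since(1)
```

The client stores the returned version and sends it with the next request, so each sync transfers only the changes it has not seen.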
Disclaimer: I'm not an expert on REST philosophy or implementation by any means.
Edit: Did you add that PS after I loaded the question? Or did I simply not read your question all the way through before writing an answer? Sorry. I still think the versioning might be a good idea, though.