REST: Get query only changeable objects - rest

I have a bunch of APIs that return several types of data.
All users can query all data through a GET REST API.
A few users can also change data. What is a common approach, when designing a REST API, to query only the data that the current user can change while still allowing the API to return all data (for display mode)?
To explain it further:
The software manages projects. All projects are accessible to all users (including anonymous ones) via an API (let's call it GET api/projects).
A user has the ability to see a list of all projects he is involved in and which he can edit.
The API should return exactly the same data, but limited to the projects he is involved in.
Should I create an additional parameter, or maybe pass an HTTP header, or something else?

There's no one-size-fits-all answer to this, so I will give you one recommendation that works for some people.
I don't really like creating resources that have 'complex access control'. Instead, I would prefer to create distinct resources for different access levels.
If you want to return limited results for people who have partial access, it might be better to create new resources that reflect this.
I think this might also help a bit thinking about the abstract role of a person who is not allowed to do everything. The abstraction probably doesn't exist per-property, but it exists somewhere as a business rule.
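As a rough illustration of the "distinct resources" idea, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the `Project` class, the in-memory `PROJECTS` list, and the second route `GET /api/users/{user}/editable-projects` are hypothetical names, not something from the question. The point is only that the public list and the "projects I can edit" list are two separate resources returning the same representation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Project:
    id: int
    name: str
    editors: set[str] = field(default_factory=set)  # users allowed to edit


# Hypothetical in-memory data standing in for the real data store.
PROJECTS = [
    Project(1, "Website", {"alice"}),
    Project(2, "Mobile app", {"bob"}),
    Project(3, "Docs", {"alice", "bob"}),
]


def to_representation(project: Project) -> dict:
    # Both resources share one representation, so clients can reuse parsing code.
    return {"id": project.id, "name": project.name}


def list_projects() -> list[dict]:
    """Hypothetical handler for GET /api/projects (public, returns everything)."""
    return [to_representation(p) for p in PROJECTS]


def list_editable_projects(user: str) -> list[dict]:
    """Hypothetical handler for GET /api/users/{user}/editable-projects."""
    return [to_representation(p) for p in PROJECTS if user in p.editors]
```

With this split, the access rule lives in one place (the second handler) instead of being a mode switch on the public endpoint.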

Related

Rest API design: Managing access to sub-entities

Note: I realize that this is close to being off-topic for being opinion-based, but I am hoping that there is some accepted best practice to handle this that I just don't know about.
My problem is the following: I need to design a Rest API for a program where users can create their own projects, and each project contains files that can only be seen by users that have access. I am stuck with how to design the "List all files of a project" query.
Standard Rest API practice would suggest two endpoints, like:
`GET /projects` # List all projects
`POST /projects` # Create new project
`GET /projects/id` # Get specific project
etc.
and the same for the files of a project.
However, there should never be a reason to list all files - only the files of a single project. To make it more complicated, access management needs to be a thing, users should never see files that are in projects they don't have access to.
I can see multiple ways to handle that:
So the obvious way is to implement the GET function, optionally with a filter. However, this isn't optimal, since if the user doesn't set a filter, it would have to crawl through all projects, check for each project whether the user has access, and then list all files the user has access to:
GET /files?project=test1
I could also make the files command a subcommand of the projects command - e.g.
GET /projects/#id/files
However, I have the feeling this isn't too restful, since it doesn't expose entities directly?
Is there any consensus on which should usually be implemented? Is it okay to "force" users to set a parameter in the first one? Or is there a third alternative that solves what I am looking for? I'd be happy about any literature recommendations on how to design this as well.
"Standard Rest API practice would suggest two endpoints"
No, it wouldn't. REST practice would suggest figuring out the resources in your resource model.
Think "documents": I should be able to retrieve (GET) a document that describes all of the files in the project. Great! This document should only be accessible when the request authorization matches some access control list. Also good.
Maybe there should also be a document for each user, so they can see a list of all of the projects they have access to, where that document includes links to the "all of the files in the project" documents. And of course that document should also be subject to access control.
Note that "documents" here might be text, or media files, or scripts, or CSS, or pretty much any kind of information that you can transmit over the network. We can gloss the details, because "uniform interface" means that we manage them all the same way.
In other words, we're just designing a "web site" filled with interlinked documents, with access control.
Each document is going to need a unique identifier. That identifier can be anything we want: /5393d5b0-0517-4c13-a821-c6578cb97668 is fine. Because it can be anything we want, we have extra degrees of freedom.
For example, we might design our identifiers such that all documents whose identifiers begin with /users/12345 are only accessible by requests with authorization headers that match user 12345, and that all documents whose identifiers begin with /projects/12345 are only accessible by requests with authorization headers that match any of the users that have access to that specific project, and so on.
In other words, it is completely acceptable to choose resource identifier spellings that make your implementation easier.
(Note: in an ideal world, you would have "cool" identifiers that are implementation agnostic, so that they still work even if you change the underlying implementation details of your server.)
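To make the prefix idea concrete, here is a minimal sketch of an access check that looks only at the leading segments of the identifier. The `PROJECT_MEMBERS` table and the `is_authorized` function are hypothetical, and the user id is assumed to have been extracted from the authorization header already.

```python
# Hypothetical membership data: project id -> ids of users with access.
PROJECT_MEMBERS = {"12345": {"u1", "u2"}, "777": {"u3"}}


def is_authorized(path: str, user_id: str) -> bool:
    """Decide access from the identifier's first two path segments."""
    segments = path.strip("/").split("/")
    if len(segments) < 2:
        return False  # deny anything that is not under a known prefix
    kind, owner = segments[0], segments[1]
    if kind == "users":
        # /users/<id>/... is only visible to that user.
        return owner == user_id
    if kind == "projects":
        # /projects/<id>/... is visible to any member of that project.
        return user_id in PROJECT_MEMBERS.get(owner, set())
    return False
```

Under these assumptions, `is_authorized("/projects/12345/files", "u2")` is allowed while the same path is denied for `"u3"`, and nothing below the prefix needs its own rule.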
"I have the feeling this isn't too restful, since it doesn't expose entities directly?"
It's fine. Resource models and entity models are different things; we shouldn't expect them to always match one to one.
After looking further, I came across this document from Microsoft. Some quotes:
Also consider the relationships between different types of resources and how you might expose these associations. For example, the /customers/5/orders might represent all of the orders for customer 5. You could also go in the other direction, and represent the association from an order back to a customer with a URI such as /orders/99/customer. However, extending this model too far can become cumbersome to implement. A better solution is to provide navigable links to associated resources in the body of the HTTP response message. This mechanism is described in more detail in the section Use HATEOAS to enable navigation to related resources.
In more complex systems, it can be tempting to provide URIs that enable a client to navigate through several levels of relationships, such as /customers/1/orders/99/products. However, this level of complexity can be difficult to maintain and is inflexible if the relationships between resources change in the future. Instead, try to keep URIs relatively simple. Once an application has a reference to a resource, it should be possible to use this reference to find items related to that resource. The preceding query can be replaced with the URI /customers/1/orders to find all the orders for customer 1, and then /orders/99/products to find the products in this order.
This makes me think that using solution 2 is probably the best case for me, since each file will be associated with only a single project, and should be deleted when a project is deleted. Files cannot exist on their own, outside of projects.

What are some patterns for designing a REST API for a user-based platform in AWS?

I am trying to shift towards a serverless architecture when it comes to building REST APIs. I come from a Ruby on Rails background.
I have successfully understood and adopted services such as API Gateway, Cognito, RDS and Lambda functions; however, I am struggling with putting it all together in an optimal way.
My case is the following. I have a simple user-based platform where there are multiple resources related to application members, say a blog application.
I have used Cognito for authentication and Aurora as the database service for keeping things like articles and likes.
Since the database and Cognito user pool are decoupled, it is hard for me to do things like:
Fetching users that liked particular article
Fetching users comments
It seems problematic to me because I need to pass a unique Cognito user identifier (retrieved during the authorization phase in API Gateway) to a Lambda function, which then saves the database record with an external reference to this user. On the other hand, if I want to fetch particular users, I must first fetch their identifiers from my relational database and then request the user details from the Cognito user pool. I lack a standard way of accessing the current user in my Lambda functions, as well as a mechanism for easily associating a database record with that user.
I have not found any convincing recommended patterns for designing such applications, even though it seems like a very common problem, and I am having a hard time judging whether my approach is correct.
I would appreciate comments on which patterns to consider when designing a simple user-based platform and what the pitfalls of my solution are. Any articles and examples would also be very helpful.
Thanks in advance.
These sound like standard problems associated with distributed, independent databases. You can no longer delegate all relationships to the database and get a result aggregating them in some way. You have to do the work yourself by calling one database, then the other.
For a case like this:
Fetching users that liked particular article
You would look up the "likes" database to determine user IDs of those who liked it, then look up the "users" database to determine user details such as name and avatar.
Most patterns follow standard database advice, e.g. in the above example, you could follow the performance-oriented pattern of de-normalising - store user data such as name and avatar against each "like", as long as you feel the extra storage and burden of keeping it consistent is justified by the reduction in queries (probably too many Likes to justify this).
Another important practice is using bulk queries to avoid N+1 queries. This is what Rails does with the includes syntax, but you may have to do it yourself here. In my example, it should only take two queries because the second query should get all required user data in one go, by querying for users matching the list of user IDs.
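Here is a minimal sketch of that two-query shape. It uses an in-memory SQLite table for the "likes" database and a plain dictionary as a stand-in for the external user store; `USER_DIRECTORY`, `fetch_users_bulk` and `users_who_liked` are hypothetical names, and a real user pool would be reached through its own client rather than a dict.

```python
import sqlite3

# Stand-in for the external user store; hypothetical data keyed by user id.
USER_DIRECTORY = {
    "sub-1": {"name": "Ann"},
    "sub-2": {"name": "Ben"},
    "sub-3": {"name": "Cy"},
}


def fetch_users_bulk(user_ids):
    """One lookup for many ids, instead of one lookup per id (hypothetical)."""
    return {uid: USER_DIRECTORY[uid] for uid in user_ids if uid in USER_DIRECTORY}


def users_who_liked(conn, article_id):
    # Query 1: the relational database only stores the external user ids.
    rows = conn.execute(
        "SELECT user_id FROM likes WHERE article_id = ? ORDER BY user_id",
        (article_id,),
    ).fetchall()
    user_ids = [row[0] for row in rows]
    # Query 2: resolve every id in a single bulk call, avoiding N+1 lookups.
    details = fetch_users_bulk(user_ids)
    return [{"id": uid, **details[uid]} for uid in user_ids if uid in details]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE likes (article_id INTEGER, user_id TEXT)")
conn.executemany(
    "INSERT INTO likes VALUES (?, ?)",
    [(1, "sub-1"), (1, "sub-3"), (2, "sub-2")],
)
print(users_who_liked(conn, 1))
```

Keeping both steps inside one function like `users_who_liked` is also a small example of hiding the two stores behind a single data-layer call.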
Finally, I'd suggest you try to abstract things. This kind of code gets messy fast, so be sure to build a well-encapsulated data layer that isolates application code from dealing with the mess of multiple databases.

REST versus more complicated data requests

REST APIs work great for get-one, get-a-list etc.
But our frontend has a dashboard, and one part of the dashboard is more complicated. It requires a query that aggregates/joins several different resources.
Returning the data is not a problem. But what of the taxonomy of the endpoint that returns this data? Since the data is not a resource, what should the URL look like?
As far as REST principles are concerned, it does not matter much whether the returned data "aggregates/joins several different resources". That is an implementation detail of the underlying data store. The dashboard should not care how exactly that store is implemented, or whether it uses joins or multiple queries.
Whatever is displayed on the dashboard (a single item or a list of items) may still be treated as a resource.
Example: Imagine a use case where the dashboard shows an aggregated user profile from multiple portals (Facebook, LinkedIn, etc.). You may still have a REST resource /user/id for that, even if obtaining that single resource requires many complex operations.
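A minimal sketch of that example, with the portals replaced by hypothetical in-memory dictionaries (`FACEBOOK`, `LINKEDIN`) and a hypothetical handler name `get_user`. The client sees one resource; how many sources were consulted stays on the server.

```python
# Hypothetical per-portal data; in practice these would be remote calls.
FACEBOOK = {"42": {"friends": 310}}
LINKEDIN = {"42": {"headline": "Engineer"}}


def get_user(user_id: str) -> dict:
    """Hypothetical handler for GET /user/{id}: one resource, many sources."""
    profile = {"id": user_id}
    # Merge whatever each source knows about this user into one document.
    for source in (FACEBOOK, LINKEDIN):
        profile.update(source.get(user_id, {}))
    return profile
```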

REST-API design - allow custom IDs

We are designing an API that marketplaces and online shops can use to create payments for their customers.
To reduce the work the marketplaces and shops have to do to implement our API, we want to give them the ability to use their own user and contract IDs rather than storing the IDs we create. It makes it easier for them, as they don't have to change or extend their databases. Internally, in our database, we will still use our own technical IDs. So far we do not run any checks on the custom IDs (e.g. uniqueness).
My question is whether it is a good idea in general to let the stores and marketplaces use their own IDs, or whether it is bad practice. And if our approach makes sense, should we run checks on the IDs we receive from the stores and marketplaces (e.g. uniqueness of a user ID within the store)?
Example payload for creating a new user via POST /users/:
{
  "customUserId": "fancyshopuserid12345",
  "name": "John",
  "surName": "Doe"
}
Now the shop can run a GET-request /users/fancyshopuserid12345 to retrieve the new user via our API.
EDIT:
We go with both approaches now.
If a shop wants to use its own ID, it does so as in the example above; if it sets false as the value for customUserId, we use our internal ID as the value.
Personally, I think it's an awesome feature!
And I don't see any problems here.
I also think that you don't have to validate the customers' IDs; just check that they can't be used for injection into your persistence layer, and that will be enough.
Moreover, you don't violate any REST conventions, which is why I think it's a nice idea.
Well, a cool (RESTful) approach would be to receive URIs instead of custom IDs. That would unfortunately mean that those partner systems would have to publish their own resources in order to be able to link to them. This would also solve the uniqueness problem, since you would only have to check whether the URI exists.
If some shop systems are in fact built RESTfully, they may want to actually store a URI instead of an ID, to be able to navigate seamlessly through their own and your systems. They would only have to add your media types to their clients, and that's it.
Other than that, sure you can store IDs of third-party systems. I know of a few trading systems that do exactly that, storing all sorts of third-party IDs, of backend systems, of transport layer ids, etc. It is at least not unheard of.
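If you do decide to check uniqueness, one option is to scope the check to the store, so two different shops can reuse the same custom ID without colliding. The following is only a sketch under that assumption; the `UserRegistry` class and its in-memory dictionary are hypothetical and stand in for whatever persistence layer you use.

```python
import uuid


class UserRegistry:
    """Maps (store_id, custom_user_id) to a record with an internal id."""

    def __init__(self):
        self._by_custom = {}

    def create_user(self, store_id, custom_user_id, name):
        key = (store_id, custom_user_id)
        # Uniqueness is enforced per store, not globally.
        if key in self._by_custom:
            raise ValueError(f"custom id already used by store {store_id}")
        internal_id = str(uuid.uuid4())  # our own technical id
        self._by_custom[key] = {"id": internal_id, "name": name}
        return internal_id

    def get_user(self, store_id, custom_user_id):
        # Raises KeyError if the store never registered this custom id.
        return self._by_custom[(store_id, custom_user_id)]
```

The store id is assumed to come from the caller's credentials, so a shop can only ever look up IDs it registered itself.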

Include / embed vs. link in RESTful APIs

So the general pattern for a RESTful API is to return a single object with embedded links you can use to retrieve related objects. But sometimes for convenience you want to pull back a whole chunk of the object graph at once.
For instance, let's say you have a store application with customers, orders, and returns. You want to display the personal information, all orders, and all returns, together, for customer ID 12345. (Presumably there's good reasons for not always returning orders and returns with customer personal information.)
The purely RESTful way to do this is something like:
GET /
  returns a list of link templates, including one to query for customers
GET /customers/12345 (based on link template from /)
  returns customer personal information
  returns links to get this customer's orders and returns
GET /orders?customerId=12345 (from /customers/12345 response)
  gets the orders for customer 12345
GET /returns?customerId=12345 (from /customers/12345 response)
  gets the returns for customer 12345
But it'd be nice, once you have the customers URI, to be able to pull this all back in one query. Is there a best practice for this sort of convenience query, where you want to transclude some or all of the links instead of making multiple requests? I'm thinking something like:
GET /customers/12345?include=orders,returns
but if there's a way people are doing this out there I'd rather not just make something up.
(FWIW, I'm not building a store, so let's not quibble about whether these are the right objects for the model, or how you're going to drill down to the actual products, or whatever.)
Updated to add: It looks like in HAL speak these are called 'embedded resources', but in the examples shown, there doesn't seem to be any way to choose which resources to embed. I found one blog post suggesting something like what I described above, using embed as the query parameter:
GET /ticket/12?embed=customer.name,assigned_user
Is this a standard or semi-standard practice, or just something one blogger made up?
Given that the semantics of these types of parameters would have to be documented for each link relation that supported them, and that this is more or less something you'd have to code to, I don't know that there's anything to gain by having a standard way of expressing this. The URL structure is more likely to be driven by what's easiest or most prudent for the server to return rather than any particular standard or best practice.
That said, if you're looking for inspiration, you could check out what OData is doing with the $expand parameter and model your link relation from that. Keep in mind that you should still clearly define the contract of your relation, otherwise client programmers may see an OData-like convention and assume (wrongly) that your app is fully OData compliant and will behave like one.
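For what it's worth, here is a minimal sketch of how a server might honor an `include` parameter like the one in the question. It is not OData and not a standard: the data dictionaries and the `get_customer` function are hypothetical, and the `_links` / `_embedded` keys are only loosely modeled on HAL, in a simplified shape.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical in-memory data for the example.
CUSTOMERS = {"12345": {"id": "12345", "name": "Jane"}}
ORDERS = {"12345": [{"id": "o1"}]}
RETURNS = {"12345": [{"id": "r1"}]}

# Relations the server is willing to embed; anything else is ignored.
EMBEDDABLE = {"orders": ORDERS, "returns": RETURNS}


def get_customer(url: str) -> dict:
    parts = urlsplit(url)
    customer_id = parts.path.rstrip("/").split("/")[-1]
    query = parse_qs(parts.query)
    requested = query.get("include", [""])[0].split(",")

    body = dict(CUSTOMERS[customer_id])
    # Links are always present, so clients can still follow them one by one.
    body["_links"] = {rel: f"/{rel}?customerId={customer_id}" for rel in EMBEDDABLE}
    embedded = {
        rel: EMBEDDABLE[rel].get(customer_id, [])
        for rel in requested
        if rel in EMBEDDABLE
    }
    if embedded:
        body["_embedded"] = embedded
    return body
```

Calling `get_customer("/customers/12345?include=orders,returns")` returns the customer with both collections embedded, while the plain `/customers/12345` form returns only the links. Whatever spelling you pick, document it as part of the link relation's contract.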