Does REST make sense for an event collecting API?

I'm building an API for collecting events from an ecommerce site. The data flow, besides authentication and HTTP response codes, will be entirely from client to server.
Does it make sense to design such an API in a RESTful way?
For example, I'm not sure if I get much from URL endpoints representing resources. For instance, when a user on the ecommerce site goes to a new page, I'd like the API to be notified, and sent all the user data that has changed. RESTfully, I'd have to do it at an endpoint like /pageloads, /webpages or maybe /users since I'm interested not only in the page data but the user's too. Not quite RESTfully, the following seems more natural: /pageloadEvents along with other events represented like this. I'm saying "not quite RESTfully" because such endpoints don't seem to represent real resources, such as the web pages, users, etc, underneath.
Additionally, because of the communication going only from client to server, I won't have any need for the REST verb GET. In other words, the server is just a listener here. Perhaps this one sided communication is REST-wise OK, but looking around, I get a sense that that's a bit unusual.
Many thanks!

Related

REST API design for resource modification: catch all POST vs multiple endpoints

I'm trying to figure out best or common practices for API design.
My concern is basically this:
PUT /users/:id
In my view this endpoint could be used for a wide array of functions.
I would use it to change the user name or profile, but what about, e.g., resetting a password?
From a "model" point of view, that could be a flag, a property of the user, so it would "work" to send a modification.
But I would expect more something like
POST /users/:id/reset_password
But that means that for almost every modification I could create a different endpoint according to the meaning of the modification, e.g.
POST /users/:id/enable
POST /users/:id/birthday
...
or even
GET /user/:id/birthday
compared to simply
GET /users/:id
So basically I don't understand when to stop using a single POST/GET and creating instead different endpoints.
It looks to me like a simple matter of choice; I just want to know if there is some standard way of doing this or some guideline. After reading and looking at examples I'm still not really sure.
Disclaimer: In a lot of cases, people ask about REST when what they really want is an HTTP compliant RPC design with pretty URLs. In what follows, I'm answering about REST.
In my view this endpoint could be used for a wide array of functions. I would use it to change the user name or profile, but what about, e.g., resetting a password?
Sure, why not?
I don't understand when to stop using a single POST/GET and creating instead different endpoints.
A really good starting point is Jim Webber's talk Domain Driven Design for RESTful systems.
First key idea - your resources are not your domain model entities. Your REST API is really a facade in front of your domain model, which supports the illusion that you are just a website.
So your resources are analogous to documents that represent information. The URI identifies the document.
Second key idea - that URI is used by clients to cache representations of the resource, so that we don't need to send requests back to the server all the time. Instead, we have built into HTTP a bunch of standard ways for communicating caching meta data from the server to the client.
Critical to that is the rule for cache invalidation: a successful unsafe request invalidates previously cached representations of the same resource (ie, the same URI).
So the general rule is, if the client is going to do something that will modify a resource they have already cached, then we want the modification request to go to that same URI.
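To make the invalidation rule concrete, here is a minimal sketch of a client-side cache keyed by URI. The class and method names are made up for illustration; real HTTP caches do considerably more (freshness, validation, Vary), so treat this as a toy model of the one rule described above.

```python
class ClientCache:
    """Toy client-side cache keyed by URI (hypothetical, for illustration only)."""

    # Methods that HTTP defines as safe; everything else is treated as unsafe.
    SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}

    def __init__(self):
        self._store = {}

    def put(self, uri, representation):
        self._store[uri] = representation

    def get(self, uri):
        return self._store.get(uri)

    def on_response(self, method, uri, status):
        # A successful (2xx or 3xx) unsafe request invalidates whatever is
        # cached under that same URI. Other URIs are left untouched.
        if method.upper() not in self.SAFE_METHODS and 200 <= status < 400:
            self._store.pop(uri, None)


cache = ClientCache()
cache.put("/users/42", {"name": "old name"})

# A POST to a different URI leaves the stale copy in place.
cache.on_response("POST", "/users/42/reset_password", 204)
still_cached = cache.get("/users/42") is not None

# A POST to the same URI evicts it, so the next GET goes back to the server.
cache.on_response("POST", "/users/42", 204)
evicted = cache.get("/users/42") is None
```

This is why sending the modification to /users/:id itself matters: only then does a generic cache know the old copy is no longer trustworthy.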
Your REST API is a facade to make your domain model look like a web site. So if we think about how we might build a web site to do the same thing, it can give us insights to how we arrange our resources.
So to borrow your example, we might have a web page representation of the user. If we were going to allow the client to modify that page, then we might think through a bunch of use cases (enable, change birthday, change name, reset password). For each of these supported cases, we would have a link to a task-specific form. Each of those forms would have fields allowing the client to describe the change, and a url in the form action to decide where the form gets submitted.
Since what the client is trying to achieve is to modify the profile page itself, we would have each of those forms submit back to the profile page URI, so that the client would know to invalidate the previously cached representations if the request were successful.
So your resource identifiers might look like:
/users/:id
/users/:id/forms/enable
/users/:id/forms/changeName
/users/:id/forms/changeBirthday
/users/:id/forms/resetPassword
Where each of the forms submits its information to /users/:id.
That does mean, in your implementation, you are probably going to end up with a lot of different requests routed to the same handler, and so you may need to disambiguate them there.
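One way that disambiguation could look, as a rough sketch: each form includes a field naming the task, and the single handler for POST /users/:id dispatches on it. The "action" field name, the handler functions, and the status codes are all assumptions for illustration, not something the question or any framework prescribes.

```python
def enable(user, form):
    user["enabled"] = True


def change_name(user, form):
    user["name"] = form["name"]


def change_birthday(user, form):
    user["birthday"] = form["birthday"]


# Maps the hypothetical "action" form field to the code that applies it.
HANDLERS = {
    "enable": enable,
    "changeName": change_name,
    "changeBirthday": change_birthday,
}


def handle_post_user(user, form):
    """Handle POST /users/:id and return an HTTP status code."""
    action = HANDLERS.get(form.get("action"))
    if action is None:
        return 400  # unknown or missing action
    try:
        action(user, form)
    except KeyError:
        return 400  # a field the action requires was not submitted
    return 204


user = {"name": "Ann", "enabled": False}
status = handle_post_user(user, {"action": "changeName", "name": "Anne"})
```

All of the forms post to the same URI, so the cache invalidation behaviour stays intact while the server still knows which task was requested.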

How to protect an API endpoint for reporting client-side JS errors against spam (if even necessary)?

I am developing a web application with Spring Boot and a React.js SPA, but my question is not specific to those libraries/frameworks, as I assume reporting client-side JS errors to the server (for logging and analyzing) must be a common operation for many modern web applications.
So, suppose we have a JS client application that catches an error and a REST endpoint /errors that takes a JSON object holding the relevant information about what happened. The client app sends the data to the server, it gets stored in a database (or whatever) and everyone's happy, right?
Now I am not, really. Because now I have an open (as in allowing unauthenticated create/write operations) API endpoint that anyone with just a little knowledge could easily spam.
I might validate the structure of JSON data the endpoint accepts, but that doesn't really solve the problem.
In questions like "Open REST API attached to a database- what stops a bad actor spamming my db?" or "Secure Rest-Service before user authentification", there are suggestions such as:
access quotas (but I don't want to save IPs or anything to identify clients)
Captchas (useless for error reporting, obviously)
e-mail verification (same, just imagine that)
So my questions are:
Is there an elegant, commonly used strategy to secure such an endpoint?
Would a lightweight solution like validating the structure of the data be enough in practice?
Is all this even necessary? After all I won't advertise my error handling API endpoint with a banner in the app...
I've seen it done three different ways:
Assuming you are using OAuth 2 to secure your API, stand up two error endpoints. For a logged-in user, if an error occurs you would hit the /error endpoint and authenticate using the existing user auth token. For a visitor, you can expose a /clientError endpoint (or one named in a way that makes sense to you) that takes the client_credentials token for the client app.
Secure the /error endpoint using an API key that is scoped for access to the error endpoint only. This key would be specific to the client and would be passed in the header.
Use a third-party tool such as Raygun.io, or any APM tool, such as New Relic.
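For the API key option, the server-side check might look roughly like this. The header name X-Api-Key and the key value are placeholders I made up, and the headers are modelled as a plain dict rather than any particular framework's request object.

```python
import hmac

# Hypothetical key issued to the client app, valid for the error endpoint only.
ERROR_REPORT_KEY = "replace-with-a-real-key"


def is_authorized(headers):
    """Return True if the request carries the key scoped to the error endpoint."""
    supplied = headers.get("X-Api-Key", "")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode(), ERROR_REPORT_KEY.encode())


ok = is_authorized({"X-Api-Key": "replace-with-a-real-key"})
rejected = is_authorized({"X-Api-Key": "guess"})
```

Keep in mind that a key shipped to a browser can be read by anyone who inspects the app, so this limits what the key can be used for rather than who holds it.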

Protecting REST API behind SPA against data thieves

I am writing a REST API gateway for an Angular SPA and I am confronted with the problem of securing the data exposed by the API for the SPA against "data thieves". I am aware that I can't do much against HTML scraping, but at least I don't want to offer such data thieves the user experience and full power of our JSON sent to the SPA.
The difference between most "tutorials" and threads about this topic is that I am exposing this data to a public website (which means no user authentication required) which offers valuable statistics about a video game.
My initial idea on how to protect the Rest API for SPA:
Using JWTs everywhere. When a visitor opens the website the very first time the SPA requests a JWT from my REST Api and saves it in the HTTPS cookies. For all requests the SPA has to use the JWT to get a response.
Problems with that approach
The data thief could simply request the oauth token from our endpoint as well. I have no way to verify whether the token has actually been requested by my SPA or by the data thief.
Even if I solved that, the attacker could read the saved JWT from the HTTPS cookies and use it in his own application. Sure, I could add a time expiration to the JWT.
My question:
I am under the impression that this is a common problem and therefore I am wondering if there are any good solutions to protect against others than the SPA having direct access to my REST Api responses?
From the API's point of view, your SPA is in no way different than any other client. You obviously can't include a secret in the SPA as it is sent to anybody and cannot be protected. Also the requests it makes to the API can be easily sniffed and copied by another client.
So in short, as discussed many times here, you can't authenticate the client application. Anybody can create a different client if they want.
One thing you can actually do is check the referer/origin of requests. If a client is running in a browser, the requests it can make are somewhat limited, and one such limitation is the Referer and Origin headers, which are always controlled by the browser and not by JavaScript. So you can actually make sure that if (and only if!) the client is running in an unmodified browser, it was downloaded from your domain. This is the default in browsers btw, so if you are not sending CORS headers, you already did this (browsers do, actually). However, this does not keep an attacker from building and running a non-browser client and faking any referer or origin he likes, or just disregarding the same origin policy.
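A server-side version of that check could be sketched like this. The allowed origin is a placeholder, and the headers are modelled as a plain dict. Falling back to Referer when Origin is absent is my own assumption, on the basis that a request may carry only one of the two headers.

```python
from urllib.parse import urlsplit

# Hypothetical origin the SPA is served from.
ALLOWED_ORIGINS = {"https://www.example.com"}


def origin_allowed(headers):
    """Accept a request only if its Origin (or, failing that, Referer) is ours."""
    origin = headers.get("Origin")
    if origin is None:
        referer = headers.get("Referer")
        if referer is None:
            return False
        # Reduce the full referring URL to scheme://host[:port].
        parts = urlsplit(referer)
        origin = f"{parts.scheme}://{parts.netloc}"
    return origin in ALLOWED_ORIGINS


from_spa = origin_allowed({"Origin": "https://www.example.com"})
from_elsewhere = origin_allowed({"Origin": "https://evil.example"})
```

As said above, this only constrains unmodified browsers; a script can send whatever header values it likes.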
Another thing you could do is changing the API regularly just enough to stop rogue clients from working (and changing your client at the same time ofc). Obviously this is not secure at all, but can be annoying enough for an attacker. If downloading all your data once is a concern, this again doesn't help at all.
Some real things you should consider though are:
Does anybody actually want to download your data? How much is it worth? Most of the time nobody wants to create a different client, and nobody is that much interested in the data.
If it is that interesting, you should implement user authentication at the very least, and cover the remaining risk either via points below and/or in your contracts legally.
You could implement throttling to not allow bulk downloading. For example if the typical user accesses 1 record every 5 seconds, and 10 altogether, you can build rules based on the client IP for example to reasonably limit user access. Note though that rate limiting must be based on a parameter the client can't modify arbitrarily, and without authentication, that's pretty much the client IP only, and you will face issues with users behind a NAT (ie. corporate networks for example).
Similarly, you can implement monitoring to discover if somebody is downloading more data than would be normal or necessary. However, without user authentication, your only option will be to ban the client IP. So again it comes down to knowing who the user is, i.e. authentication.
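As an illustration of the throttling point, here is a small in-memory sliding-window limiter keyed by client IP. The class name and the limits are assumptions for the sketch. A real deployment would more likely do this in a reverse proxy or a shared store, and it still has the NAT problem mentioned above.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)

    def allow(self, client_ip):
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Example: at most 10 requests per minute per IP (numbers are arbitrary).
limiter = SlidingWindowLimiter(max_requests=10, window_seconds=60)
first_request_allowed = limiter.allow("203.0.113.7")
```

The same counters could feed the monitoring idea: an IP that keeps hitting the limit is a candidate for a closer look or a ban.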

When is JSON-RPC over http with POST more suitable than RESTful API?

I'm currently developing a web application with a senior developer. We've agreed to use REST API for client-server communication and he sent me the parameters and the expected responses.
But the design does not seem to be RESTful. Rather it looks like JSON-RPC over http utilizing only the POST method.
For example, to register a user you send a POST request to the server with the following parameters.
{
  "id": 1,
  "method": "RegisterUser",
  "params": {
    "firstName": "John",
    "lastName": "Smith",
    "country": "USA",
    "phone": "~",
    "email": "~",
    "password": "~"
  }
}
And the expected response is
{
  "id": 1,
  "result": "jwt-token",
  "error": null
}
Multiple requests are sent to the same URL and the server sends back the response based on the 'method' in the parameters. For example, to get a user info, you send a { method: "GetUserInfo", params: { id: ~ }} to the same URL. All responses have the status code 200, and the errors are handled by the error in the response body. So even if the status code is 200, if error is not null it means something is wrong.
The way I'm used to doing is sending a POST request to 'users/' with a request body when registering a new user, sending a GET request to 'users/1' to retrieve a user information, etc.
When I asked why he'd decided to do it this way, he said in his previous job, trying to add more and more APIs was a pain when following RESTful API design. Also, he said he didn't understand why RESTful API uses different HTTP verbs when all of them could be done with POST.
I tried to come up with the pros of REST API over JSON-RPC over http with POST.
GET requests are cached by the browser, but some browsers may not support POST request caching.
If we are going to open the API to outside developers, this might cause discomfort for them since this is not a typical REST API.
In what circumstances would the JSON-RPC over http style be better than RESTful APIs? Or does it just not matter, and is it just a matter of preference?
it looks like JSON-RPC over http utilizing only the POST method.
Yes, it does.
The way I'm used to doing is sending a POST request to 'users/' with a request body when registering a new user, sending a GET request to 'users/1' to retrieve a user information, etc.
That's not quite it either.
Riddle. How did you submit this question to Stack Overflow? Well, you probably followed a bookmark you had saved, or followed a link from Google. Maybe you submitted a search or two, and eventually you clicked "Ask Question", which took you to a form. After filling in the details of the form, you hit the submit button. That took you to a view of your question, which included (among other things) a link to edit the question. You weren't interested in that, so you were done -- except for refreshing the page from time to time hoping for an answer.
That's a REST API. You, the agent, follow links from one state to another, negotiating Stack Overflow's "submit a question" protocol.
Among other things to notice: the browser didn't need to know in advance what URLs to send things to, or which http method to use, because the HTML had encoded those instructions into it. The browser just needs to understand the HTML standard, so that it can find the links/forms within the representation.
Now, REST is just a set of architectural constraints, that boil down to "do it the way a web server does". You don't need to use HTML as your media type; you don't need to design for web browsers as your clients. But, to do REST, you do need hypermedia; and clients that understand that hypermedia type -- so it is going to be a lot easier for you to choose one of the standardized media types.
Are there more reasons why I should prefer RESTful API over JSON-RPC over http with POST? Or does it just not matter?
Roy Fielding, in 2008, offered this simple and correct observation
REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them.
For instance, the folks working on GraphQL decided that the properties that the REST constraints induce weren't valuable for their use case; not nearly as valuable as being able to deliver to the client a representation tuned to a client's specific needs.
Horses for courses.
Use RESTful APIs when you are performing standard create, read, update and delete actions on resources. The CRUD actions should behave the same way for each resource, unless you have some before and after hooks. Any new developer coming to the project will easily understand your API if it follows the standards.
Use JSON-RPC when you are performing actions that don't necessarily map cleanly to any CRUD. For instance, maybe you want to retrieve counts or summary data of a specific resource collection. You could do this with REST, but it might require you to think of it as some sort of "summary" resource that you read from. It's easier to do with JSON-RPC, since you can just implement a procedure that runs the appropriate query in your database and returns an appropriate result object.
Or what if you want to make an API call that lets a user delete or update all of instances of a resource(s) that meet some condition, without knowing ahead of time what those instances are?
You can also use JSON-RPC in cases where you need to have a lot of side effects for standard CRUD actions and it's inconvenient to make hooks that run before or after each action.
You don't have to go all in with one or the other; you can use both. Have standard RESTful endpoints where appropriate and another RPC endpoint for handling JSON-RPC calls.
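For the RPC side, a single endpoint that dispatches on the method name can be quite small. This sketch reuses the id/result/error envelope from the question. The GetOrderSummary procedure and the in-memory ORDERS list are invented for illustration and stand in for a real database query.

```python
import json

# Stand-in data; a real procedure would run a query against the database.
ORDERS = [{"total": 10}, {"total": 20}, {"total": 30}]


def get_order_summary(params):
    return {"count": len(ORDERS), "total": sum(o["total"] for o in ORDERS)}


# Registry of procedures callable through the one RPC endpoint.
METHODS = {"GetOrderSummary": get_order_summary}


def handle_rpc(body):
    """Take the raw POST body and return the JSON response body."""
    request = json.loads(body)
    method = METHODS.get(request.get("method"))
    if method is None:
        response = {"id": request.get("id"), "result": None,
                    "error": "Method not found"}
    else:
        response = {"id": request.get("id"),
                    "result": method(request.get("params", {})),
                    "error": None}
    return json.dumps(response)


reply = json.loads(handle_rpc('{"id": 1, "method": "GetOrderSummary", "params": {}}'))
```

Adding a procedure is one function plus one registry entry, which is the flexibility being described. The cost is that generic HTTP tooling can no longer tell reads from writes.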
Use REST when you write public web services. REST is standardized and predictable, it will help consumers to write client apps. Also, GET HTTP method is widely used to retrieve resources from public web services.
Use JSON RPC when you write back-end for an application (i.e. not public web services). JSON RPC style is more flexible and more suitable for register, login, and getProductsByFilters methods. There is no reason to use GET with JSON RPC, only POST should be used.

RESTful vs. SEO Urls

So I have a dilemma here. I'm trying to build out a RESTful API, which means I'd like to have resources defined and also require that consumers reference those resources by id.
For example, the typical {resourceName}/{id}?{querystring}
But of course, the way a web UI works for an ecommerce site is that you naturally have SEO-friendly URLs.
So there is a mismatch in terms of application specific URLs where we have application context based urls, context here being an ecommerce site using SEO-friendly URLs. They wouldn't have stuff like /id in it 100% of the time. And actually our site doesn't have any ids whatsoever in the web UI's urls.
So when users go from page to page, we might have URLs like www.ourdomain.com/cars/local/usa/ca/sunnyvale-cars. And sometimes we're talking about stripping the '-' or other characters out of the more SEO-friendly URLs after the UI devs send a request to, let's say, a state resource like the one above. For example, if I wanted info on that city, I'd have to strip the city name out of 'sunnyvale-cars': the UI team might send me '/cities?name=sunnyvale-cars', and our API would need to extract the city name in order to do the query in the backend, which to me just doesn't sound like something our API should be doing. It's also coupling our REST API to the web and web conventions. Our REST API should be ignorant of any type of delivery mechanism so that it can be reused for other apps or other APIs in our business domain.
There might be times, in fact a lot of times, where the UI engineering team will only have URLs like this. There is no user action or, let's say, dropdown list where they can just grab an id. So they can't always request data for the next page by sending me a RESTful URI, as they may not have ids all the time. In other words, sometimes they'll have to instead request stuff like this: /states?name="ca" to get something related to that state, just as a hypothetical example.
So how in the world do you even build a REST API if your UI team is telling you they won't have ids for everything?
From a REST purist's point of view, no URI is inherently more RESTful than another as long as both identify the resources. At least, I could not find anything about that in the original thesis.
However, if you find some structure fitting the purposes of your API better, you can create a second endpoint exclusively for the needs of your UI team. That endpoint would serve as a proxy and simply map the SEO-friendly structure to your API in a possibly generic way.
Or, applications from the outside are just expected to do the rewriting or any app specific mapping-to-REST API "stuff" since it's really app specific in what that mapping looks like anyway, thus keeping the API ignorant of application specific details, domain logic, or even web conventions which the API should not care about or even know about.
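Whichever side ends up owning it, the mapping itself can be a small, isolated function. Here is a sketch based on the example URL from the question. The target query parameters (country, state, name) and the "-cars" suffix rule are assumptions about what such a mapping might look like, not a description of any real API.

```python
from urllib.parse import urlencode


def seo_path_to_api(path):
    """Map an SEO-style path such as /cars/local/usa/ca/sunnyvale-cars
    to a query against a hypothetical /cities resource."""
    segments = [s for s in path.strip("/").split("/") if s]
    if len(segments) != 5 or segments[:2] != ["cars", "local"]:
        return None  # not a path this mapping understands
    country, state, slug = segments[2], segments[3], segments[4]
    # Strip the marketing suffix to recover the plain city name.
    city = slug[:-len("-cars")] if slug.endswith("-cars") else slug
    return "/cities?" + urlencode({"country": country, "state": state, "name": city})


api_uri = seo_path_to_api("/cars/local/usa/ca/sunnyvale-cars")
# api_uri == "/cities?country=usa&state=ca&name=sunnyvale"
```

Keeping this in a proxy endpoint or in the web app means the core API never sees 'sunnyvale-cars' at all.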