REST Protocol for searching and filtering

GET, the standard REST verb for returning a value, can take different parameters to select what to "get". Often there is one form that takes an id to get a single value, and another that takes some sort of search criteria to get a list.
Is there a standard way to specify the filtering and sorting of the data that is being searched for? For example, if I have an invoice record I'd like to write a GET query that says "give me all invoices for customer 123, with total > $345 and return in descending order of date".
If I were writing this myself I'd have something like:
GET http://example.com/mydata?query="customer=123&&total>345.00"&order="date"
(Note I didn't urlencode the url for clarity, though obviously that is required in practice, but I hope you get what I mean.)
I can certainly write something for this, but I am wondering if there is a standardized way to do this?
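For reference, here is a minimal sketch of how the request above might look once it is URL-encoded. The endpoint and the parameter names query and order come from the example; dropping the literal quote characters around the values is an assumption made to keep the sketch simple.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint taken from the example in the question.
BASE = "http://example.com/mydata"


def build_search_url(query: str, order: str) -> str:
    """Return the search URL with both parameter values percent-encoded."""
    return BASE + "?" + urlencode({"query": query, "order": order})


url = build_search_url("customer=123&&total>345.00", "date")
print(url)

# A server can recover the original strings by decoding the query component.
params = parse_qs(urlsplit(url).query)
print(params["query"][0])
```

Encoding the whole filter expression as a single parameter value keeps its own "&" and "=" characters from being confused with the separators of the query string itself.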

Is there a standard way to specify the filtering and sorting of the data that is being searched for?
Not that I'm aware of.
Note that HTTP doesn't really have queries (yet); HTTP has resource identifiers.
We've got a standard for resource identifiers (RFC 3986) and a standard for URI templates (RFC 6570) that describes how to produce a range of identifiers via variable expansion.
But as far as I can tell there is no published "standard" that automatically transforms a URI into a SQL query.
It's possible that one of the "convention over configuration" frameworks (ex: Rails) might have something useful here, but I haven't found it.

Related

Pass multiple filters with multiple values in a query string (REST)

I originally posted multiple filters containing multiple values in JSON as part of my GET request, but I believe this is bad practice, so I changed it to a POST. I don't like that either, as getting results from a query has nothing to do with a POST, so I guess I'll have to use a query string.
Most filter examples I have found use either one filter or one value, but I am wondering whether there is a best practice for passing multiple filters with multiple values as a single parameter in the query string.
For example, this is a basic one which looks for all cars that are red
GET /cars?color=red
But what if I wanted to look for all cars that are
red, blue or green and
have 2 seats or less and
their brand name starts with b and
can be bought in the US, UK or Germany
Would the following be ok?
http://myserver/api/cars?color=red|blue|green¬seats<=2¬brand[startswith]b¬country=USA|UK|Germany
I'm suggesting the use of the:
| character as a separator between each values for a given filter
¬ character as a separator between each filters
[startswith] to handle the search type, which could also be =, <=, >=, <>, [contains], [endswith], etc.
This would then be parsed on the server, and the relevant filters would be built from the provided values.
Hope this makes sense, but I'm really interested in whether there is a standard/best practice for such scenarios with REST in mind.
Thanks.
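To make the proposal concrete, here is a rough sketch of how the server side might parse that syntax. It assumes the query string has already been percent-decoded. The Filter type and the parse_filters function are hypothetical names for illustration, not part of any standard.

```python
import re
from typing import List, NamedTuple


class Filter(NamedTuple):
    field: str
    op: str
    values: List[str]


# A field name, then an operator (symbolic, or a word in brackets), then values.
# Two-character operators are listed first so "<=" is not read as "<".
_FILTER_RE = re.compile(r"^(\w+)(<=|>=|<>|=|<|>|\[\w+\])(.*)$")


def parse_filters(raw: str, sep: str = "¬") -> List[Filter]:
    """Split the decoded query string into filters, and each filter into values."""
    filters = []
    for part in raw.split(sep):
        match = _FILTER_RE.match(part)
        if match is None:
            raise ValueError(f"cannot parse filter: {part!r}")
        field, op, values = match.groups()
        # "[startswith]" becomes "startswith"; symbolic operators are unchanged.
        filters.append(Filter(field, op.strip("[]").lower(), values.split("|")))
    return filters


for f in parse_filters(
    "color=red|blue|green¬seats<=2¬brand[startswith]b¬country=USA|UK|Germany"
):
    print(f)
```

Note that characters such as |, < and ¬ are not allowed unencoded in a URI, so a client would have to percent-encode them before sending the request.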

As in most design questions, the key is having a consistent design for all your APIs. You can follow certain well-known guidelines/standards to make your API easily discoverable.
For example, take a look at OData. The "Queries" section on this page is relevant to your question. Here's an example:
https://services.odata.org/v4/TripPinServiceRW/People?$top=2 & $select=FirstName, LastName & $filter=Trips/any(d:d/Budget gt 3000)
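The example above is written with spaces for readability. A client would normally percent-encode it before sending. A small sketch of one way to build it, using only the option names shown in the example (the choice of which characters to leave unencoded is an assumption):

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

SERVICE = "https://services.odata.org/v4/TripPinServiceRW/People"

# OData system query options, copied from the example URL.
options = {
    "$top": "2",
    "$select": "FirstName,LastName",
    "$filter": "Trips/any(d:d/Budget gt 3000)",
}

# quote_via=quote encodes spaces as %20 instead of "+";
# the characters in `safe` are left as-is to keep the URL readable.
odata_url = SERVICE + "?" + urlencode(options, quote_via=quote, safe="$,/():")
print(odata_url)

# Decoding the query component gives back the original option values.
decoded = parse_qs(urlsplit(odata_url).query)
print(decoded["$filter"][0])
```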
Another option is the OpenSearch standard. The relevant section is here. Here's an example:
https://opensearch.php?recordSchema=html&query=dc.creator any Mill* Grad*
Another interesting option is GraphQL, which makes it easier to map query parameters to data fetch parameters. It uses a filter payload instead of query parameters. See the spec here: GraphQL Spec.

RESTful query API design

I want to ask what the most RESTful way to express queries is. I have this existing API:
/entities/users?skip=0&limit=100&queries={"$find":{"$minus":{"$find":{"username":"markzu"}}}}
The first parts of the query, skip and limit, are easily identifiable; however, I find the "queries" part quite confusing for others. What the query means is:
Find every User minus Find User entities with username 'markzu'
The reason it is defined this way is due to the internal database query behavior.
That is, in the NoSQL database we use, the resource runs two transactional queries: the first finds everything in the User table, and the second subtracts the result of a find for the User with the specified username (boolean operations, similar to SQL). In other words, the query means "fetch every User except username 'markzu'".
What is the proper way to define this in RESTful way, based on standards?

What is the proper way to define this in RESTful way, based on standards?
REST doesn't care what spelling you use for resource identifiers, so long as your choice is consistent with the production rules defined in RFC 3986.
However, we do have a standard for URI Templates
A URI Template is a compact sequence of characters for describing a range of Uniform Resource Identifiers through variable expansion.
You are already aware of the most familiar form of URI template -- key-value pairs encoded in the query string.
?skip=0&limit=100&username=markzu
That's often a convenient choice, because HTML understands how to process forms into url encoded queries.
It doesn't look like you need any other parameters; you just need to be able to distinguish this query from others. So a perfectly reasonable choice might be
/every-user-except?skip=0&limit=100&username=markzu
It may help to think "prepared statement", rather than "query".
The underlying details of the implementation really shouldn't enter into the calculation at all. Your REST API is a facade that makes your app look like an HTTP aware key value store.
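As a rough illustration of the "prepared statement" idea, the handler behind /every-user-except could look something like the sketch below. The in-memory USERS list and the function name are made up for the example; the point is only that the "find all minus find one" detail never shows up in the URI.

```python
from urllib.parse import parse_qs

# Hypothetical in-memory stand-in for the User table.
USERS = [{"username": name} for name in ("alice", "markzu", "bob", "carol")]


def every_user_except(query_string: str) -> list:
    """Handle GET /every-user-except?skip=..&limit=..&username=.."""
    params = parse_qs(query_string)
    skip = int(params.get("skip", ["0"])[0])
    limit = int(params.get("limit", ["100"])[0])
    excluded = params["username"][0]
    # How the store computes "everything minus one" stays behind the facade.
    remaining = [u for u in USERS if u["username"] != excluded]
    return remaining[skip:skip + limit]


print(every_user_except("skip=0&limit=100&username=markzu"))
```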

REST Best practise for filtering and knowing the result is singular: List or single?

A variety of REST practices (e.g. 1, 2, 3) suggest using plurals in your endpoints, with the result always being a list of objects unless it's filtered by a specific value, such as /users/123. Query parameters are used to filter the list, but still result in a list nevertheless. I want to know if my case should 'abandon' those best practices.
Let's use cars for my example below.
I've got a database full of cars and each one has a BuildNumber ("Id"), but also a model and build year whose combination is unique. If I then query /cars/ and search for a specific model and year, for example /cars?model=golf&year=2018, I know, according to my previous sentence, that my result will always contain a single object, never multiple. My result, however, will still be a list containing just one object.
In such a case, what would be the best practice, given that the above means the object has to be extracted from the list even though a single object could have been returned instead?
Stick to best practices and return a list
Make a second endpoint /car/ and use the query parameters ?model=golf&year=2018, which are primarily used for filtering a list, and have the result be a single object, as the singular endpoint states
The reason I'm asking is simply the cleanness of the action: I'm 100% sure my GET request will result in a single object, but I still have to perform actions to extract it from the list. These steps should be unnecessary. Aside from that, in my case I don't know the unique identifier, so cars/123 for retrieving a specific car isn't an option. I do know, however, filters that will result in exactly one specific object. The additional steps simply feel redundant.
1: https://learn.microsoft.com/en-us/azure/architecture/best-practices/api-design
2: https://blog.mwaysolutions.com/2014/06/05/10-best-practices-for-better-restful-api/
3: https://medium.com/hashmapinc/rest-good-practices-for-api-design-881439796dc9

As you've specifically asked for best practices in regards to REST:
REST doesn't care how you specify your URIs or that semantically meaningful tokens are used inside the URI at all. Further, a client should never expect a certain URI to return a certain type but instead rely on content-type negotiation to tell the server all of the capabilities the client supports.
You should furthermore not think of REST in terms of object orientation but more in terms of affordances and state machines, where a client is served all the information needed to make an educated decision on what to do next.
The best example to give here is probably the Web itself and how this is done for HTML pages. How can you filter for a specific car, and how will it be presented to you? The same concepts that are used on the Web also apply to REST, as both use the same interaction model. For your car example, the API should initially return some control structures that teach a client how a request needs to be formed and what options can be filtered on. In HTML this is done via forms. For non-HTML based REST APIs, dedicated media-types should be defined that translate the same approach to non-HTML structures.
When sending the request to the server, your client would include all of the media-types it supports in an Accept HTTP header, which informs the server about the capabilities of the client. Media-types are just human-readable specifications of how to process payloads of such types. Such specifications may include hints about the type information a link relation might return. For media-types to gain wide usage, they should be defined as generically as possible. Instead of defining a media-type specific to a car, which is possible, it would probably be more convenient to use an existing general data-container format (similar to HTML) or to define a new one.
All of the steps mentioned here should help you to design and implement an API that is free to evolve without having to risk to break clients, that furthermore is also scalable and minimizes interoperability concerns.
Unfortunately your question targets something totally different IMO, something more related to RPC. You basically invoke a generic method via HTTP on an endpoint, similar to how SOAP, RMI or CORBA work. Whether you respect the semantics of HTTP operations or not is only of minor interest here. Even if you reach level 3 of the Richardson Maturity Model (RMM), it does not mean that you are compliant with REST. Your client might still break if the server changes anything within the response. The RMM furthermore doesn't consider media-types at all, hence I consider it rather useless.
However, regardless of whether you use a (true) REST or an RPC/CRUD client, if you prefer retrieving single items over having their data fed into a collection, you should consider including the URIs of the items of interest in the collection instead of their data, as Evert has also suggested. While most people seem to be concerned about server performance and round-trip times, this is actually very elegant in terms of caching. Further, certain link-relation names such as prefetch may inform the client that it may fetch the target's payload early, as it is highly likely that its content will be requested next. Through caching, a request might not even have to be sent to the server for processing, which is probably the best performance gain you can achieve.

1) If you use a query like cars/where..., use CARS
2) If you want a CAR, make a method GetCarById

You might not get a perfect answer to this, because all answers are going to be a bit subjective, often in different ways.
My general thought about this is that every item in my system will have its own unique url, for example /cars/1234. That case is always singular.
But this specific item might appear as a member in collections and search results. When /cars/1234 appears in these, it will always appear in a list with 1 item (or 0 or more depending on the query).
I feel that this is ultimately the most predictable.
In my case though, if a car appears as a member of a search or collection, its 'true url' will still be displayed.
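A small sketch of what I mean, with made-up data and names (CARS, car_url and search_cars are purely illustrative): the search always returns a list, and each member carries its own URL so the client can address it directly afterwards.

```python
# Hypothetical data store keyed by build number.
CARS = {
    1234: {"model": "golf", "year": 2018},
    1235: {"model": "golf", "year": 2019},
}


def car_url(build_number: int) -> str:
    """The unique, always-singular URL of one car."""
    return f"/cars/{build_number}"


def search_cars(**criteria) -> list:
    """Always return a list, even when the criteria can only match one car."""
    return [
        {"href": car_url(number), **car}
        for number, car in CARS.items()
        if all(car.get(key) == value for key, value in criteria.items())
    ]


print(search_cars(model="golf", year=2018))
```

The client still has to take the first element of the list, but it then knows /cars/1234 and can use that singular URL from there on.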

REST: Filter primary resource by properties on related resource

I'm looking for some guidance/advice/input on the concept of filtering resources when making a REST API call. Let's say I have Users and Posts, and a User creates a Post. If I want to get all Posts, I might have a route as follows:
GET /api/posts
Now if I wanted to get all posts that were created after a certain date, I might add a filter parameter like so
GET /api/posts?created_after=2017-09-01
However, let's say I want to get all posts by Users that were created after a certain date. Is this the right format?
GET /api/posts?user.created_after=2017-09-01
When it comes to filtering, grouping, etc, I'm having a hard time figuring out the right stuff to do for REST APIs, particularly when using a paginated API. If I do this client side (which was my initial thought) then you potentially end up with a variable number of resources per page, based on what meets your criteria. It seems complicated to add all of this logic as query parameters over the API, but I can't see any other way to do it. Is there a standard for this kind of thing?

There is no objective 'right' way. If using user.created_after logically makes sense in the context of your API, then there's nothing really wrong with it.

Personally, I would not use user.created_after.
I would prefer one of the following options:
Option I: /api/posts/users/{userid}?created_after=2017-09-01
Option II: /api/posts/?user={userid}&created_after=2017-09-01
The reason is simple: It looks wrong to me to create dynamic query parameters. Instead you can combine the query parameters (Option II) or even define a more specific resource (Option I).
Regarding pagination: the standard approach is something like this. In addition to the filter parameters, you define the parameters page and pageSize. When constructing the request, the client will specify something like page=2&pageSize=25&orderBy=creationDate.
It's important to note that the server must always validate the parameters and can potentially ignore or override incorrect ones (e.g. a page that doesn't exist, or a pageSize that is too big, may not return an error but instead return reasonable output. This really depends on your business case).
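A minimal sketch of the "override instead of error" variant, assuming page numbers start at 1 and using made-up limits (MAX_PAGE_SIZE and DEFAULT_PAGE_SIZE are assumptions, not standard values):

```python
MAX_PAGE_SIZE = 100      # assumed upper bound
DEFAULT_PAGE_SIZE = 25   # assumed default


def _to_int(value, default: int) -> int:
    """Parse a raw parameter, falling back to the default if it is unusable."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def paginate(items: list, page=None, page_size=None) -> dict:
    """Clamp page and pageSize to sane values instead of rejecting the request."""
    size = max(1, min(_to_int(page_size, DEFAULT_PAGE_SIZE), MAX_PAGE_SIZE))
    last_page = max(1, -(-len(items) // size))  # ceiling division
    number = max(1, min(_to_int(page, 1), last_page))
    start = (number - 1) * size
    return {"page": number, "pageSize": size, "items": items[start:start + size]}


posts = list(range(60))
print(paginate(posts, page="2", page_size="25")["items"][:3])
```

Returning the page and pageSize that were actually applied lets the client notice when its request was adjusted.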

Conflicting REST urls

So I'm building a REST api and need to make some urls. The problem is, I'm running into some conflicting paths. For example:
GET <type>/<id> gets the details of an object of a given type and id
GET <type>/summary gets the summary of objects of a given type
This simplified example shows a problem occurs when an object has id "summary". What is the best way to solve this? From a REST puritan perspective, what should be the solution?
Here's some of my ideas:
Put the <id> in query parameters. From what I understand this is against standards
Put a keyword at the start of the url. Also against standards?
Disallow certain id values. Not something I want to enforce for all my users and use cases and different entrances into my system

I may have an alternative to this. What if we have both book as well as the plural books? Then you can have:
/book/{id}
and
/books/summary
or
/books/count

The URL structure is not quite right to begin with so it's difficult to solve it in a clean way.
For the sake of discussion, let's assume <type> is a books resource. So the first URL is fine - you get a book of the given ID:
GET /books/<id>
However this is not:
GET /books/summary
Because it's a bespoke URL, which I guess has a use in your application but is not restful. A GET call should return one or more resources. However a "summary" is not a resource, it's a property of a resource and that's why you end up in this situation of having IDs mixed up with book properties.
So your best option would be to change this URL to something like this:
GET /books?fields=summary
By default GET /books would return all the resources, while GET /books?fields=<list_of_fields> will return the books but with only the chosen properties.
That will be similar to your previous URL but without the ID/property conflict, and will also allow you later on to retrieve resources with specific fields (without having to create new custom URLs).
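As a sketch of how the fields parameter might be handled on the server (the BOOKS data and the list_books function are invented for illustration, and treating fields as a comma-separated list is an assumption):

```python
from typing import Optional

# Hypothetical book records.
BOOKS = [
    {"id": "1", "title": "Dune", "summary": "Desert planet politics."},
    {"id": "2", "title": "Emma", "summary": "A matchmaker misjudges."},
]


def list_books(fields: Optional[str] = None) -> list:
    """Handle GET /books and GET /books?fields=<comma-separated list>."""
    if not fields:
        return [dict(book) for book in BOOKS]
    wanted = [name.strip() for name in fields.split(",") if name.strip()]
    # Unknown field names are silently skipped in this sketch.
    return [{k: book[k] for k in wanted if k in book} for book in BOOKS]


print(list_books("summary"))
```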
Edit:
Regarding the count of books, it's still useful to reason in terms of resources. /books gives you one or more books, but it should not be used for meta-information about the collection, such as count, but also things like "most read book", or "books that start with the letter 'A'", etc. as that will make the resource more and more complex and difficult to maintain.
Depending on what you want to achieve I think there'd be two solutions:
Create a new resource that manages the collection of books. For example:
GET /bookcase
And that will give you information about the collection, for example:
{
"count": 1234,
"most_read": "<isbn>",
// etc. - any information that might be needed about the book collection
}
Or a search engine. You create a resource such as:
GET /book_search_engine/?query=
which would return a search result such as:
{
"count": 123,
"books": [
// Books that match the query
]
}
then a query like this would give you just the count:
// Search all the books, but provide only the "count" field
GET /book_search_engine/?query=*&fields=count
Obviously that's a more involved solution and maybe not necessary for a simple REST API, however it can be useful as it makes it easier to create queries specific to a client.

This simplified example shows a problem occurs when an object has id "summary". What is the best way to solve this? From a REST puritan perspective, what should be the solution?
As far as REST is concerned, URIs are opaque. Spelling is absolutely irrelevant. You could use URIs like
/a575cc90-2878-41fe-9eec-f420a509e1f0
/f871fff6-4c4e-48f7-83a4-26858fdb3096
and as far as REST is concerned, that's spot on. See Stefan Tilkov's talk REST: I Don't Think It Means What You Think It Does.
What you are asking about is URI design, how to adapt conventions/best practices to your particular setting.
One thing that will help is to recognize that summary is a resource, in the REST/HTTP sense -- it is a document that can be represented as a byte sequence. All you need to do is figure out where that resource belongs (according to your local spelling conventions).
Continuing to borrow the "books" example used by others
# Here's the familiar "URI that identifies a member of the books collection"
/books/<id>
# Here's the summary of the /books collection
/summaries/books
Put the <id> in query parameters. From what I understand this is against standards
Not as much as you might think. REST doesn't care. The URI spec expresses some views about hierarchical vs non hierarchical data. HTTP supports the notion of a redirect, where one resource can reference another.
GET /books?id=12345
302 Found
Location: /books/12345
You also have options for skipping a round trip, by returning the representation you want immediately, taking advantage of Content-Location
GET /books?summary
200 OK
Content-Location: /summaries/books
...
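A toy sketch of the two exchanges above, written as a plain function rather than against any particular web framework. The handle_get name, the tuple it returns, and the placeholder body are assumptions for illustration only.

```python
from urllib.parse import parse_qs


def handle_get(path: str, query: str = ""):
    """Return (status, headers, body) for the two /books variants shown above."""
    # keep_blank_values=True so a bare "?summary" still appears as a key.
    params = parse_qs(query, keep_blank_values=True)
    if path == "/books" and "id" in params:
        # Redirect to the hierarchical spelling of the same resource.
        return 302, {"Location": f"/books/{params['id'][0]}"}, b""
    if path == "/books" and "summary" in params:
        # Answer immediately and say where this representation really lives.
        return 200, {"Content-Location": "/summaries/books"}, b"(summary body)"
    return 404, {}, b""


print(handle_get("/books", "id=12345"))
print(handle_get("/books", "summary"))
```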

I have the same issue. And all the solutions seem a little off b/c REST best practices seem to suggest none of them are ideal.
You could have just one off-limit id, like all.
GET <type>/<id>
GET <type>/all/summary
It might even be possible to use a single symbol instead, such as ~ or _.
GET <type>/<id>
GET <type>/~/summary
How satisfying this solution seems is of course very subjective.
The singular/plural approach seems more elegant to me, despite most REST best practice guides saying not to do this. Unfortunately some words don't have distinct singular and plural forms.

This isn't perfectly conventional for how some like to define their REST endpoints.
But I would enforce a pattern where "id" cannot be just any string. Instead I would use a uuid and define my routes as such.
GET /books/{id:uuid}
GET /books/{id:uuid}/summary
And if you really want a verb in the URL without an identifier it is still technically possible because we know the {id:uuid} in the path must conform to the uuid pattern.
With that GET /books/summary is still distinct from GET /books/{id:uuid}
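The {id:uuid} notation depends on the routing framework. As a framework-neutral sketch of the same idea, the dispatch could be done by hand with the standard uuid module (the route function and the handler names it returns are hypothetical):

```python
import uuid


def is_uuid(value: str) -> bool:
    """True only for the canonical hyphenated form of a UUID."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


def route(path: str) -> str:
    """Map a path to a handler name; 'summary' can never be mistaken for an id."""
    parts = [p for p in path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "books":
        if parts[1:] == ["summary"]:
            return "collection_summary"
        if is_uuid(parts[1]):
            if len(parts) == 2:
                return "book_detail"
            if parts[2:] == ["summary"]:
                return "book_summary"
    return "not_found"


print(route("/books/summary"))
print(route("/books/a575cc90-2878-41fe-9eec-f420a509e1f0"))
```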
