How to combine pagination with getting next / previous element of a REST collection

I am trying to design a REST API in which there is a paginated collection. In my current design I use a page based approach:
GET /entities?page=2&pageSize=25
Retrieving a single entity is trivial:
GET /entities/4
A user story related to this API requires that when viewing a single entity in a screen, two buttons "Previous" and "Next" enable switching to said entities. An example:
GET /entities?page=2&pageSize=25
returns:
[{id: 2}, {id: 4}, {id: 17}]
Now, when viewing entity with id 4, the named buttons would navigate to the entities with id 2 or id 17.
My assumption is that a client (a web frontend in my case) could "remember" the pagination information and use it when fetching the previous or next entity. The same could be applied to any filters I might add to the endpoint later.
A basic idea of how to implement this would be to get the current page and for edge cases the previous and the next page (required if currently viewing the first / last resource of the collection). But that seems like an inefficient solution.
So my question is: Is my chosen pagination method even compatible with what I am trying to achieve? If it is, how would clients use the API to achieve the next/previous feature?

In this case I ended up with a frontend based solution.
Given that my constraint was that I could not change the type of pagination used, this is a crude solution:
Frontend makes initial GET request on the paginated data. This request may not only contain pagination parameters (page + size) but also filters (color = red). This is the search context.
The user accesses one of the items and some other view is displayed. Important: The search context is kept in local storage or similar.
If the user wants to browse to the next / previous item, the search is executed again with the same search context.
In the search result, search for the ID of the currently displayed item
If the currently displayed item is found, show the next / previous item
If it is not found, do some fallback. (In my case I chose to simply display the initial search UI)
I am not happy with this implementation as it has some major drawbacks IMO:
Not efficient: For every browsing click, a search request is made. When not using an efficient search database like Elasticsearch this will probably cause a lot of stress on the db.
Not perfect: If the currently displayed item is not in the search result, there is no sensible way to get the next / previous item
If the currently displayed item is the first / last item in the search result and you want to browse back / forward, the search would have to be executed twice, once on the current page, once on the previous / next.
As stated, I think this is not a good solution, I am still looking for a clever, efficient way to do this.
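The frontend approach described in this answer, including the page-boundary case from the drawbacks list, could be sketched roughly as follows. This is only an illustration in Python: `fetch_page` and `find_neighbours` are hypothetical names, and `fetch_page` stands in for re-running the search with the stored search context.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical type: returns the IDs on a given 1-based page for a fixed
# search context (filters + page size); an empty list means "no such page".
FetchPage = Callable[[int], Sequence[int]]


def find_neighbours(
    fetch_page: FetchPage, page: int, current_id: int
) -> Optional[Tuple[Optional[int], Optional[int]]]:
    """Return (previous_id, next_id), or None if the item is no longer on the page."""
    ids = list(fetch_page(page))
    if current_id not in ids:
        return None  # caller falls back, e.g. to the initial search UI

    pos = ids.index(current_id)
    prev_id = ids[pos - 1] if pos > 0 else None
    next_id = ids[pos + 1] if pos < len(ids) - 1 else None

    # Page-boundary cases cost a second request against the adjacent page.
    if prev_id is None and page > 1:
        before = list(fetch_page(page - 1))
        prev_id = before[-1] if before else None
    if next_id is None:
        after = list(fetch_page(page + 1))
        next_id = after[0] if after else None
    return prev_id, next_id
```

With the example from the question (page 2 containing IDs 2, 4 and 17), looking up ID 4 yields 2 and 17 from a single request, while looking up ID 2 or ID 17 triggers the extra request mentioned above.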

Related

Issue in MongoDB document search

I am new to MongoDB, and I have the following issue in the web application I am currently developing.
We have an application where we use mongoDB to store data.
And we have an API where we search for the document via text search.
As an example: if the user types "New York", the request should return all the available data in the collection matching the keyword "New York". (Here we call the API for each letter typed.) We have nearly 200,000 documents in the DB. For some keywords a search returns nearly 4,000 documents. We tried limiting the results to 5, which returns only the top 5 documents and none of the other available data. We also tried without a limit, which returns hundreds or thousands of documents, as I mentioned, and this slows the request down.
On the frontend we bind the search results to a dropdown (Next.js).
My question:
Is there an optimizing way to search a document?
Are there any suggestions of a suitable way that I can implement this requirement using mongoDB and net5.0?
Or any other Implementation methods regarding this requirement?
The following code segment shows the query used to retrieve the data for the incoming keyword.
var hotels = await _hotelsCollection
    .Find(Builders<HotelDocument>.Filter.Text(keyword))
    .Project<HotelDocument>(hotelFields)
    .ToListAsync();
var terminals = await _terminalsCollection
    .Find(Builders<TerminalDocument>.Filter.Text(keyword))
    .Project<TerminalDocument>(terminalFeilds)
    .ToListAsync();
var destinations = await _destinationsCollection
    .Find(Builders<DestinationDocument>.Filter.Text(keyword))
    .Project<DestinationDocument>(destinationFields)
    .ToListAsync();
So this is a classic "autocomplete" feature, and there are some known best practices you should follow:
On the client side you should use a debounce feature; this is a must. There is no reason to execute a request for each letter. This is the most critical point for an autocomplete feature.
On the backend things can get a bit more complicated. Naturally you want to be using a DB that is suited for this task; specifically, MongoDB has a service called Atlas Search, which is a Lucene-based text search engine.
This will get you autocomplete support out of the box. However, if you don't want to make big changes to your infra, here are some suggestions:
1. Make sure the field you're searching on is indexed.
2. I see you're executing 3 separate requests. Consider using something like Task.WhenAll to execute all of them at once instead of one by one. I am not sure how the client side is built, but if all 3 entities are shown in the same list then ideally you would merge the labels into one collection so you could paginate the search properly.
3. As mentioned in #2, you must add server-side pagination; no search engine can exist without it. I can't give specifics on how you should implement it, as you have 3 separate entities and this could make the pagination implementation harder. I'd consider whether or not you need all 3 of these in the same API route.
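The idea of running the three searches at once and capping each result list can be sketched as below. This is a language-neutral illustration written in Python rather than C#: `asyncio.gather` plays the role that `Task.WhenAll` plays in .NET, and `search_all`, `fake_search` and the source names are hypothetical stand-ins, not part of the MongoDB driver.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

# Hypothetical signature: a search takes a keyword and a limit, returns labels.
Search = Callable[[str, int], Awaitable[List[str]]]


async def search_all(
    sources: Dict[str, Search], keyword: str, limit: int = 5
) -> Dict[str, List[str]]:
    """Run every search concurrently and cap each result list at `limit`."""
    names = list(sources)
    # gather() starts all searches together and preserves argument order.
    results = await asyncio.gather(*(sources[n](keyword, limit) for n in names))
    return {name: hits[:limit] for name, hits in zip(names, results)}


async def fake_search(keyword: str, limit: int) -> List[str]:
    # Stand-in for a real database call; the data is made up for illustration.
    await asyncio.sleep(0.01)
    return [f"{keyword} {i}" for i in range(limit)]


if __name__ == "__main__":
    found = asyncio.run(
        search_all({"hotels": fake_search, "terminals": fake_search}, "New York", 2)
    )
    print(found)
```

Passing the limit down to each search, rather than trimming afterwards, is what keeps the database from returning thousands of documents per keystroke.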

Best practice for filtering data mongodb

I am looking into best practices for returning search results. I have a search page that subscribes to a publication that returns a find based on the searched regex query in multiple fields. This gets put into the minimongo collection on the client.
At this time, the way it is being handled is that facets are being set up from the subscription. My question is if the filtering for the pre-loaded results from the backend should be done client side, or if the query should be sent back.
Example :
Given a collection of fruits, I want to find all that have the color red. The server returns this, but I have facets based on the fruits. So, I have a checkbox for strawberries, apples, cherries, etc. If I click on the checkbox for cherries, should I just be filtering the current minimongo collection, or should I re-query?
Logically, I already have all the needed items in my collection that I could be filtering on, so I am not sure why I would need to hit the back-end. The only time I should hit the backend is if in the search, I type in a new query (such as blue), and the facets get re-done appropriately
If your original search returns all matching documents, then adding criteria on the client can just be done in your minimongo query, provided the fields that the additional criteria depend on were returned with the original search.
OTOH if the original search is returning a paginated list or just the top N results or if the required keys weren't included then you want to continue the search on the server.
In a traditional request-response system, you might also want to query the server each time if the underlying data was rapidly changing (ex: a reservations system). With Meteor the reactive nature of pub-sub means that the data on the client is being constantly refreshed with adds/changes/deletions via DDP over WebSocket.
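The decision rule in this answer can be written down as a small sketch. It is plain Python rather than Meteor code, and `apply_facet` and the `complete` flag are hypothetical names used only to make the two conditions explicit.

```python
from typing import Dict, List, Optional

Doc = Dict[str, str]


def apply_facet(
    cached: List[Doc], complete: bool, field: str, value: str
) -> Optional[List[Doc]]:
    """Filter the cached results locally, or return None to signal a re-query."""
    # Assumption: `complete` is True only when the original search returned
    # every matching document (no pagination, no top-N cut-off).
    if not complete:
        return None
    # The facet field must have been included in the returned documents.
    if any(field not in doc for doc in cached):
        return None
    return [doc for doc in cached if doc[field] == value]
```

A `None` result means the client does not hold enough data to answer the facet itself and should continue the search on the server.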

Elastic Search completion suggester configuration

I'm trying to set autocomplete/suggestions on my site's search form, using Elastic Search's completion suggester feature.
I have a list of products, which are grouped by categories (on multiple levels). The search feature should be able to suggest category names, which are of more interest to users than direct products.
Several of these categories have the same name but a different parent (e.g. 'milk' under parent category 'dairy products' and 'milk' under category 'baby'). When the user selects a category suggestion, she's redirected to another page, with more specific results than the plain search would give.
Additional metadata (url to redirect to, parent category id/name) are added in the payload field.
I use the output field to return normalized suggestions for different inputs. As stated in the documentation:
"The result is de-duplicated if several documents have the same output,
i.e. only one is returned as part of the suggest result."
But as explained, my outputs may have the same value, while being different results, as they will link to different pages. It is also possible in the future that different category levels will yield different actions.
I am reluctant to differentiate things by adding the full string (i.e. "milk in dairy products") as the output, because:
1. The parent category is conceptually not the output itself but a related metadata.
2. I intend to have some formatting in the results, this forces me to parse the output string to add HTML tags in it.
So, is it possible to deactivate the de-duplication?
One workaround I'm thinking of, if it's not possible, is to store a stringified JSON object in the output, with all the data I'll need: both what is displayed in the search form and the metadata currently in the payload. But I'd rather look into existing configuration before resorting to that.
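To make the workaround concrete, here is a minimal sketch of what a stringified output could look like. It only shows the encoding step with Python's standard `json` module; the field names (`label`, `parent`, `url`), the example URLs and the helper names are assumptions, and nothing here calls the Elasticsearch API.

```python
import json


def encode_output(label: str, parent: str, url: str) -> str:
    """Pack display text and metadata into one string used as the output."""
    # sort_keys keeps the string stable, so identical entries still de-duplicate.
    return json.dumps({"label": label, "parent": parent, "url": url}, sort_keys=True)


def decode_output(output: str) -> dict:
    """Unpack a suggestion on the client before formatting it."""
    return json.loads(output)
```

Two 'milk' categories with different parents then produce different output strings, so they would no longer be collapsed into one suggestion, and the client can still format `label` and `parent` separately.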

CakePHP how to write a search form to display results

I am writing a search form in CakePHP 2.0. Currently I have it set up with the index action and view (it also posts to the index action), with validation against the model, so that if anything incorrect is entered into a search field (fields include date and price) a validation error message appears next to the element. Basically it is a bit like a scaffolded add form.
If validation is successful I need to actually run a query and return some data. I don't want to display this data in the index view - should I:
Run the query then render a different view (which means the URL doesn't change - not sure I want that).
Store the search parameters in a session, redirect off to another action then retrieve the search details.
Is there any other way?
Both options are OK. You must decide which you prefer: keeping the URL unchanged or changing it.
You may also use named parameters to pass the info, so a user can bookmark their request, though you would then need to do the validation on the same page that shows the results. I usually do this with the CakeDC search plugin.
Returning to your two options: if you mean which performs better, I would choose number one, since the second one needs to load a new model/controller, etc.

How to implement robust pagination with a RESTful API when the resultset can change?

I'm implementing a RESTful API which exposes Orders as a resource and supports pagination through the resultset:
GET /orders?start=1&end=30
where the orders to paginate are sorted by ordered_at timestamp, descending. This is basically approach #1 from the SO question Pagination in a REST web application.
If the user requests the second page of orders (GET /orders?start=31&end=60), the server simply re-queries the orders table, sorts by ordered_at DESC again and returns the records in positions 31 to 60.
The problem I have is: what happens if the resultset changes (e.g. a new order is added) while the user is viewing the records? In the case of a new order being added, the user would see the old order #30 in first position on the second page of results (because the same order is now #31). Worse, in the case of a deletion, the user sees the old order #32 in first position on the second page (#31) and wouldn't see the old order #31 (now #30) at all.
I can't see a solution to this without somehow making the RESTful server stateful (urg) or building some pagination intelligence into each client... What are some established techniques for dealing with this?
For completeness: my back-end is implemented in Scala/Spray/Squeryl/Postgres; I'm building two front-end clients, one in backbone.js and the other in Python Django.
The way I'd do it, is to make the indices from old to new. So they never change. And then when querying without any start parameter, return the newest page. Also the response should contain an index indicating what elements are contained, so you can calculate the indices you need to request for the next older page. While this is not exactly what you want, it seems like the easiest and cleanest solution to me.
Initial request: GET /orders?count=30 returns:
{
  "start": 1039,
  "count": 30,
  ... // data
}
From this the consumer calculates the next request: GET /orders?start=1009&count=30, which then returns:
{
  "start": 1009,
  "count": 30,
  ... // data
}
Instead of raw indices you could also return a link to the next page:
{
  "next": "/orders?start=1009&count=30"
}
This approach breaks if items get inserted or deleted in the middle. In that case you should use some auto-incrementing persistent value instead of an index.
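The index arithmetic the consumer performs in this answer can be sketched as follows. The function names are hypothetical, and the sketch assumes that indices start at 1 and that `start` is the oldest index contained in the page, as in the 1039 to 1009 example.

```python
from typing import Optional, Tuple


def next_older_page(
    start: int, count: int, first_index: int = 1
) -> Optional[Tuple[int, int]]:
    """Return (start, count) for the next older page, or None if there is none."""
    if start <= first_index:
        return None  # already at the oldest element
    new_start = max(first_index, start - count)
    # The last page may be shorter than `count`.
    return new_start, start - new_start


def next_link(start: int, count: int) -> Optional[str]:
    """Build the "next" link the server could embed in its reply."""
    page = next_older_page(start, count)
    if page is None:
        return None
    return f"/orders?start={page[0]}&count={page[1]}"
```

For a reply with start 1039 and count 30 this produces `/orders?start=1009&count=30`, the same request as in the example.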
The sad truth is that all the sites I see have pagination "broken" in that sense, so there must not be an easy way to achieve that.
A quick workaround could be reversing the ordering, so the position of the items is absolute and unchanging with new additions. From your front page you can give the latest indices to ensure consistent navigation from up there.
Pros: same url gives the same results
Cons: there's no evident way to get the latest elements... Maybe you could use negative indices and redirect the result page to the absolute indices.
With a RESTful API, application state should be in the client. Here the application state should be some sort of timestamp or version number telling when you started looking at the data. On the server side, you will need some form of audit trail, which is properly server data, as it does not depend on whether there have been clients and what they have done. At the very least, it should know when the data last changed. No contradiction with REST here.
You could add a version parameter to your GET. When the client first requests a page, it normally does not send a version. The server's reply contains one. For instance, if there are links in the reply to next/other pages, those links contain &version=... The client should send the version when requesting another page.
When the server receives a request with a version, it should at least know whether the data has changed since the client started looking and, depending on what sort of audit trail you have, how it has changed. If it has not, it answers normally, transmitting the same version number. If it has, it may at least tell the client. And depending on how much it knows about how the data has changed, it may tailor the reply accordingly.
Just as an example, suppose you get a request with start, end, version, and that you know that since version was up to date, 3 rows coming before start have been deleted. You might send a redirect with start-3, end-3, new version.
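A rough sketch of that redirect calculation is shown below. The `Change` record and `adjust_window` are hypothetical, and the audit trail is deliberately simplified: each entry is assumed to carry the version at which it happened, the position it affected in the numbering the client saw, and a delta of -1 for a deletion or +1 for an insertion.

```python
from typing import List, NamedTuple, Tuple


class Change(NamedTuple):
    version: int   # version at which the change was made
    position: int  # affected position, in the client's original numbering
    delta: int     # -1 for a deletion, +1 for an insertion


def adjust_window(
    start: int, end: int, client_version: int,
    changes: List[Change], current_version: int,
) -> Tuple[int, int, int]:
    """Shift the requested window by the changes made before `start`."""
    shift = sum(
        c.delta
        for c in changes
        if c.version > client_version and c.position < start
    )
    return start + shift, end + shift, current_version
```

With three deletions before position 31 since the client's version, a request for 31 to 60 is redirected to 28 to 57 with the new version, matching the example above.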
WebSockets can do this. You can use something like pusher.com to catch realtime changes to your database and pass the changes to the client. You can then bind different pusher events to work with models and collections.
Just going to throw this out there. Please feel free to tell me if it's completely wrong, and why.
This approach tries to use a left_off variable to page through the results without using offsets.
Suppose you need your results ordered by the timestamp order_at DESC.
So when I ask for first result set
it's
SELECT * FROM Orders ORDER BY order_at DESC LIMIT 25;
right?
This is the case when you ask for the first page (in terms of URL probably the request that doesn't have any
yoursomething.com/orders?limit=25&left_off=$timestamp
Then, when you receive your data set, just grab the timestamp of the last viewed item, e.g. 2015-12-21 13:00:49.
Now, to request the next 25 items, go to: yoursomething.com/orders?limit=25&left_off=2015-12-21 13:00:49 (the last viewed timestamp)
In SQL you would make the same query and add a condition that the timestamp is less than $left_off:
SELECT * FROM Orders
WHERE order_at < '2015-12-21 13:00:49'
ORDER BY order_at DESC LIMIT 25;
You should get the next 25 items after the last seen item.
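A minimal runnable sketch of this left_off idea, using Python's built-in `sqlite3` module, might look like the following. The table layout and the `fetch_page` helper are assumptions, and the sketch adds one thing the answer does not mention: it also passes the ID of the last seen row as a tie-breaker, so that rows sharing the same timestamp are not skipped.

```python
import sqlite3
from typing import List, Optional, Tuple

Row = Tuple[int, str]  # (id, order_at)


def fetch_page(
    conn: sqlite3.Connection,
    limit: int,
    left_off: Optional[Tuple[str, int]] = None,
) -> List[Row]:
    """Return the next `limit` orders, newest first, after the `left_off` row."""
    if left_off is None:
        # First page: no left_off value yet.
        return conn.execute(
            "SELECT id, order_at FROM orders "
            "ORDER BY order_at DESC, id DESC LIMIT ?",
            (limit,),
        ).fetchall()
    timestamp, last_id = left_off
    # Strictly older rows, plus rows with the same timestamp and a lower id
    # (the id tie-breaker is an assumption added for this sketch).
    return conn.execute(
        "SELECT id, order_at FROM orders "
        "WHERE order_at < ? OR (order_at = ? AND id < ?) "
        "ORDER BY order_at DESC, id DESC LIMIT ?",
        (timestamp, timestamp, last_id, limit),
    ).fetchall()
```

Because the second page is defined relative to the last row seen rather than to an offset, an order inserted at the top between two requests does not shift the rows the client receives next.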
For those who see this answer: please comment on whether this approach is relevant, or even possible in the first place. Thank you.