Does the SharePoint Search API throw any threshold limit in any case?

I have gone through the following question:
SharePoint REST API call with more than 500 rows
We want to use the Search API with REST or JSOM. We have large lists in SharePoint; a single list can contain 100,000 or more items, and there are multiple large lists. In some scenarios the Search API will search across all lists, and in other scenarios it will search within a single list.
So, like the REST API, does it throw a threshold error for more than 5,000 matched items, or does the SharePoint Search REST API throw any threshold limit in any case?

As Mike pointed out, I'm not aware of any limitations in terms of APIs for search in SharePoint. You will have to make a lot of sub-requests to retrieve all the items if you need to do so (but generally speaking, it would probably be better to have a more complex query, retrieving fewer items and giving more work to the search engine). More details on how you can paginate (REST) here.
Keep in mind that your Search service application needs to be properly designed to handle all the items, otherwise it might stop indexing new items (if you're on-premises or running hybrid).
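For reference, here is a rough sketch of that pagination against the Search REST endpoint, paging with the standard startrow and rowlimit query parameters. Authentication is omitted, the helper below is hypothetical, and the 500-row page size is the commonly documented per-request maximum, so verify both against your environment.

using System;
using System.Net.Http;
using System.Threading.Tasks;

static class SearchPaging
{
    // Fetch one page of results; call repeatedly, adding rowLimit to startRow each time,
    // until the returned RowCount is smaller than rowLimit.
    public static async Task<string> GetSearchPageAsync(
        HttpClient client, string siteUrl, string queryText, int startRow, int rowLimit = 500)
    {
        var url = $"{siteUrl}/_api/search/query" +
                  $"?querytext='{Uri.EscapeDataString(queryText)}'" +
                  $"&startrow={startRow}&rowlimit={rowLimit}";
        using var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.Add("Accept", "application/json;odata=nometadata");
        var response = await client.SendAsync(request);    // client must already be authenticated
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync(); // raw JSON; parsing left to the caller
    }
}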

Related

Issue in MongoDB document search

I am new to MongoDB, and I have the following issue in a web application I am currently developing.
We have an application where we use MongoDB to store data.
We have an API where we search for documents via text search.
As an example: if the user types "New York", the request should return all the available data in the collection for the keyword "New York". (Here we call the API for each letter typed.) We have nearly 200,000 documents in the DB. For some keywords a search returns nearly 4,000 documents. We tried limiting the results to 5, so it returns the top 5 documents but not the rest of the available data. We also tried without limiting, and it returns hundreds and thousands of documents, as I mentioned, which causes the request to slow down.
On the frontend we bind the search results to a dropdown (Next.js).
My question:
Is there a way to optimize the document search?
Are there any suggestions for a suitable way to implement this requirement using MongoDB and net5.0?
Or any other implementation methods regarding this requirement?
The following code segment shows the query that retrieves the data for the incoming keyword.
// Each collection is queried with a $text search on the keyword,
// projected to the fields the dropdown needs, and materialized as a list.
var hotels = await _hotelsCollection
    .Find(Builders<HotelDocument>.Filter.Text(keyword))
    .Project<HotelDocument>(hotelFields)
    .ToListAsync();
var terminals = await _terminalsCollection
    .Find(Builders<TerminalDocument>.Filter.Text(keyword))
    .Project<TerminalDocument>(terminalFields)
    .ToListAsync();
var destinations = await _destinationsCollection
    .Find(Builders<DestinationDocument>.Filter.Text(keyword))
    .Project<DestinationDocument>(destinationFields)
    .ToListAsync();
So this is a classic "autocomplete" feature; there are some known best practices you should follow:
On the client side you should use debouncing; this is a must. There is no reason to execute a request for each letter typed, and this is the most critical piece of an autocomplete feature.
On the backend things can get a bit more complicated. Naturally you want to be using a database that is suited to this task; specifically, MongoDB has a service called Atlas Search, which is a Lucene-based text search engine.
This will get you autocomplete support out of the box. However, if you don't want to make big changes to your infrastructure, here are some suggestions:
Make sure the field you're searching on is indexed.
I see you're executing 3 separate requests; consider using something like Task.WhenAll to execute all of them at once instead of one by one (see the sketch after this list). I am not sure how the client side is built, but if all 3 entities are shown in the same list then ideally you would merge the labels into 1 collection so you could paginate the search properly.
As mentioned in #2, you must add server-side pagination; no search engine can exist without one. I can't give specifics on how you should implement it, as you have 3 separate entities and this could make the pagination implementation harder. I'd consider whether or not you need all 3 of these in the same API route.
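A minimal sketch of those suggestions, reusing the collections, projection definitions, and document types from the question (those names are assumptions here, as is the arbitrary page size of 20): the three text searches run concurrently via Task.WhenAll, and each one is capped with Limit so the API never ships thousands of documents to the dropdown.

// Sketch only: _hotelsCollection, hotelFields, etc. are the members shown in the question.
var hotelsTask = _hotelsCollection
    .Find(Builders<HotelDocument>.Filter.Text(keyword))
    .Project<HotelDocument>(hotelFields)
    .Limit(20)                  // server-side cap; combine with Skip() for real pagination
    .ToListAsync();
var terminalsTask = _terminalsCollection
    .Find(Builders<TerminalDocument>.Filter.Text(keyword))
    .Project<TerminalDocument>(terminalFields)
    .Limit(20)
    .ToListAsync();
var destinationsTask = _destinationsCollection
    .Find(Builders<DestinationDocument>.Filter.Text(keyword))
    .Project<DestinationDocument>(destinationFields)
    .Limit(20)
    .ToListAsync();
// All three queries run concurrently instead of one after another.
await Task.WhenAll(hotelsTask, terminalsTask, destinationsTask);
var hotels = await hotelsTask;
var terminals = await terminalsTask;
var destinations = await destinationsTask;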

Complex requests with REST API

I am wondering if it is possible to adhere to REST principles when creating what will essentially amount to a BI tool.
In my scenario I have a high data volume with 100,000s of IDs (frankly more than that, but for the sake of this example let's go with it). These are presented in a traditional table that provides the features necessary when accessing large data sets, such as pagination. The user also has the ability to filter by one, or many, of these IDs to drill down into the data set as they see fit.
It is theoretically possible that the user would want to filter on 100 of the IDs, rendering a GET URI incredibly long, which as I understand it would somewhat break the resource-identification principle of a REST API. Not to mention it could bump into the URL length limit for GET requests in certain browsers, since the IDs may be quite long. Normally I would just use a POST, since I can put all of the applied filters in the body and generate a WHERE clause on the server.
Since a POST in a REST API is supposed to
Create a new entry in the collection.
By definition it would appear, at least to me, that generating a complex query for something like this would mean that a REST API is not possible. Or does this perhaps mean that I am approaching the solution wrong (totally plausible)?
It would seem that in my scenario using a GET simply isn't possible due to the potential length of the parameters, so I am forced to use a POST. Using a POST as I am seems to violate REST style, which isn't the end of the world; I mostly just wanted to double-check that I am not missing something and that there is no solution using a GET.
To follow the resources principle, make a search-like resource. POST your IDs in a body to this endpoint and it will prepare a list of results for you and redirect you to searchresults/{id}.
See for example: https://gooroo.io/GoorooTHINK/Article/16583/HTTP-Patterns---Bouncer/25829#.W3aBsugzaUk
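A rough sketch of that search-resource pattern, assuming an ASP.NET Core minimal API (the route names, in-memory store, and response shape are illustrative, not a prescribed design):

// POST the (potentially huge) ID list as the body of a new "search" resource,
// then GET the prepared result set at the URI returned in the Location header.
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

var searches = new System.Collections.Concurrent.ConcurrentDictionary<Guid, string[]>();

app.MapPost("/searchresults", (string[] ids) =>
{
    var id = Guid.NewGuid();
    searches[id] = ids;                                    // a real system would persist the filter or its results
    return Results.Created($"/searchresults/{id}", new { id });
});

app.MapGet("/searchresults/{id:guid}", (Guid id) =>
    searches.TryGetValue(id, out var ids)
        ? Results.Ok(ids)                                  // would normally return the filtered, paginated rows
        : Results.NotFound());

app.Run();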

Designing REST api for returning count of multiple objects

I have a question on how I should design my REST API(s), given that I need to return counts of different objects in my application. There are multiple approaches that could be thought of:
Define a single API that accepts the object identifiers in the request body (JSON) and returns the count for each of the object identifiers in the response. The drawback is that the API is too generic and possibly not RESTful, since there is no resource. The URL would look like GET /counts.
Define individual APIs for each of the resources for which a count is needed. Assuming I need to return counts for the fields defined in the software, tables, processes, tasks, jobs, etc., I would define individual APIs for each of these resources. They would look like GET /field/count or GET /table/count. A side effect is that there would be many web APIs, one for each resource we need a count for. Is that bad?
Kindly share your thoughts on the above approaches and any new ways in which I could design such an API that adheres to the REST standards.
Thanks
It totally depends on the client that is consuming the APIs.
Case 1. If it's a web app that needs counts for many tables on a single page, then both approaches will lead to a bad design where you have to make hundreds of calls just for count data. You can club the counts into a single API and send that as the response.
Case 2. If the clients are using the counts individually, then I would recommend the 2nd approach, where the resource is clearly defined. With the 2nd approach you are making the client intelligent, which is not recommended.
As mentioned in the comments, REST is a totally subjective topic, so there can be multiple viewpoints on every design.
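To make the two cases concrete, here is a brief sketch (ASP.NET Core minimal API; the resource names and the hard-coded numbers are placeholders for real repository or database queries):

var app = WebApplication.CreateBuilder(args).Build();

// Case 1: one /counts resource returns every count the page needs in a single round trip.
app.MapGet("/counts", () => Results.Ok(new Dictionary<string, long>
{
    ["fields"] = 120,       // placeholder for a real count query
    ["tables"] = 34,
    ["processes"] = 7,
}));

// Case 2: a clearly defined per-resource count, e.g. GET /table/count.
app.MapGet("/table/count", () => Results.Ok(new { count = 34L }));

app.Run();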

REST: How best to handle large lists

It seems to me that REST has clean and clear semantics for basic CRUD and for listing resources, but I've seen no discussion of how to handle large lists of resources. It doesn't scale to dump an entire database table over the network in a resource-oriented architecture (imagine a customer table with a million customers!), especially if you only need a few items. So it seems that some semantics should exist to filter, map and reduce a list of resources on the server-side.
So, do you know any tried and true ways to do the following kinds of requests in REST:
1) Retrieve just the count of the resources?
I could imagine doing something like GET /api/customer?result=count
Is that how it's usually done?
I could also imagine modifying the URL (/api/count/customer or /api/customer/count, for example), but that seems to either break the continuity of the resource paths or inflict an ugly hack on the expected ID field.
2) Filter the results on the server-side?
I could imagine using query parameters for this, in a context-specific way (such as GET /api/customer?country=US&state=TX).
It seems tricky to do it in a flexible way, especially if you need to join other tables (for example, get customers who purchased in the last 6 months).
I could imagine using the HTTP OPTIONS method to return a JSON string specifying possible filters and their possible values.
Has anyone tried this sort of thing?
If so, how complex did you get (for example, retrieving the items purchased year-to-date by female customers between 18 and 45 years old in Massachusetts, etc.)?
3) Mapping to just get a limited set of fields or to add fields from joined tables?
4) More complicated reductions than count (such as average, sum, etc.)?
EDIT: To clarify, I'm interested in how the request is formulated rather than how to implement it on the server-side.
I think the answer to your question is OData! OData is a generic protocol for querying and interacting with information. OData is based on REST but extends the semantics to include programmatic elements similar to SQL.
OData is not always URL-based only, as it uses JSON payloads for some scenarios. But it is a standard (OASIS), so it is well structured and supported by many APIs.
A few general links:
https://en.wikipedia.org/wiki/Open_Data_Protocol
http://www.odata.org/
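For a flavor of the query semantics OData adds, here is a hedged example request (the service URL is made up; the $-prefixed options are standard OData system query options):

using System.Net.Http;

var client = new HttpClient();
// $filter = server-side filtering, $select = projection, $top/$skip = paging, $count = ask for the total size.
var url = "https://example.com/odata/Customers" +
          "?$filter=State eq 'TX'&$select=Id,Name&$top=50&$skip=0&$count=true";
var json = await client.GetStringAsync(url);    // JSON payload with a "value" array and "@odata.count"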
The most common ways of handling large data sets in GET requests are (afaict) :
1) Pagination. The request would be something like GET /api/customer?country=US&state=TX&firstResult=0&maxResults=50. This way the client has the freedom to choose the size of the data chunk it needs (this is often useful for UI-based clients).
2) Exposing a size service, so that the client gets to know how large the data set is before actually requesting it. The service would be something like
GET /api/customer/size?country=US&state=TX
Obviously the two can (and IMHO should) be used together, so that when/if a client (be it mobile, web, or whatever) wants to fill its UI with content, it can choose the best data chunk size based also on the size of the whole data set (e.g. to avoid creating 100 pages for the user to navigate).
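A compact sketch of those two endpoints side by side, assuming an ASP.NET Core minimal API with an in-memory list standing in for the real customer table (all names here are illustrative):

var app = WebApplication.CreateBuilder(args).Build();

// Stand-in data source; in practice this would be a filtered database query.
var customers = Enumerable.Range(1, 1000)
    .Select(i => new { Id = i, Country = "US", State = i % 2 == 0 ? "TX" : "CA" })
    .ToList();

// 1) Filtered, paginated listing: GET /api/customer?country=US&state=TX&firstResult=0&maxResults=50
app.MapGet("/api/customer", (string? country, string? state, int? firstResult, int? maxResults) =>
    customers
        .Where(c => (country == null || c.Country == country) && (state == null || c.State == state))
        .Skip(firstResult ?? 0)
        .Take(maxResults ?? 50));

// 2) Size service: GET /api/customer/size?country=US&state=TX
app.MapGet("/api/customer/size", (string? country, string? state) =>
    new { count = customers.Count(c => (country == null || c.Country == country)
                                    && (state == null || c.State == state)) });

app.Run();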

RESTful API subresource pattern

In designing a RESTful API, the following call gives us basic information on user 123 (first name, last name, etc):
/api/users/123
We have a lot of information on users so we make additional calls to get subresources on a user like their cart:
/api/users/123/cart
For an admin page we would like to see all the cart information for all the users. A big table listing each user and some details about their cart. Obviously we don't want to make a separate API call for each user (tons of requests). How would this be done using RESTful API patterns?
/api/carts/users came to mind but then you'd in theory have 2 ways to get a specific user's cart by going /api/carts/users/123.
This is generally solved by adding a deref capability to your REST server. Assuming the response from your user looks like:
{
    ...
    cartId: "12345",
    ...
}
you could add a simple dereference by passing in the query string "&deref=cart" (or however you set up your syntax).
This still leaves the problem of making a request per user. I'd posit there are generally two ways to do this. The first would be with a multiget type of resource (see [1] for an example). The problem with this approach is that you must know all of the IDs and handle paging yourself. The second (which I believe is better) is to implement an index endpoint on your user resource. Indexing allows you to query a resource (generally via a query string, such as firstName=X or whatever else you want to filter or sort on). Then you should implement basic paging so you're not passing around massive amounts of data. There are tons of examples of paging, but the simplest would be to specify a page size (count=20), a start token (since=X), and a sort order (sort=-createdAt).
These implementations allow you to then ask for all users and their carts by iterating on the index endpoint. You might find this helpful as a starting point for paging [2].
[1] - How to construct a REST API that takes an array of id's for the resources
[2] - Pagination in a REST web application
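A hedged sketch of such an index endpoint with basic paging (ASP.NET Core minimal API; the count/since/sort parameter names follow the answer above, while everything else, including the in-memory user list, is illustrative):

var app = WebApplication.CreateBuilder(args).Build();

// Stand-in user store; a real implementation would query the database and dereference the cart.
var users = Enumerable.Range(1, 200)
    .Select(i => new { Id = i, FirstName = $"User{i}", CartId = $"cart-{i}", CreatedAt = DateTime.UtcNow.AddDays(-i) })
    .ToList();

// Index endpoint: GET /api/users?count=20&since=40&sort=-createdAt
app.MapGet("/api/users", (int? count, int? since, string? sort) =>
{
    var page = users
        .Where(u => since == null || u.Id > since.Value)   // simplistic start token based on Id
        .OrderByDescending(u => u.CreatedAt)               // hard-coded "-createdAt"; parse 'sort' for real use
        .Take(count ?? 20);
    return Results.Ok(page);
});

app.Run();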
For some reason I was under the assumption that having 2 URIs to the same resource was a bad thing. In my situation /api/users/123/cart and /api/carts/users/123 would return the same data. Through more research I've learned from countless sources that it's acceptable to have multiple URIs to the same resource if it makes sense to the end user.
In my case I probably won't expose /api/carts/users/123, but I'm planning on using /api/carts/users with some query parameters to return a subset of carts in the system. Similarly, I'm going to have /api/carts/orgs to search org carts.
A really good site I found with examples and clear explanations was the REST API Tutorial. I hope this helps others with planning their API URIs.