How to show more than 1000 records in Algolia instantsearch.js

I'm using Algolia with instantsearch.js in a project to run searches and to show categories and the content inside them (both the category pages and the search pages are powered by Algolia). We are using instantsearch.js v1, loaded from a CDN.
Our main issue is that search doesn't return more than 1000 records, which we need.
As far as I understand, the browse() method returns more results, but it isn't usable from instantsearch.js.
Also, after reading docs, I found out that there's a new option called paginationLimitedTo, which allows displaying more than 1000 records:
https://www.algolia.com/doc/rest-api/search/#paginationlimitedto
So, setting this would allow displaying more than 1000 records.
Can you help me here? How can I get more than 1000 records or, if it would achieve our goal, how do we set the paginationLimitedTo attribute in instantsearch.js? I'm okay with building or editing instantsearch.js for the time being.
Thanks in advance,

To change the value of paginationLimitedTo, you need to create a custom client object, get your index by calling client.initIndex(indexName), and then change the setting by calling
index.setSettings({
    // 1000 is the default; use a larger value to page past 1000 records
    paginationLimitedTo: 1000
});
You can check the guide for that in the docs here.
Also, please remember the following:
We recommend keeping the default value to guarantee excellent performance. Increasing the pagination limit will have a direct impact on the performance of search queries. A too high value will also make it very easy for anyone to retrieve (“scrape”) your entire dataset.
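The steps described in this answer could be wrapped up roughly as follows. This is a minimal sketch, not the official recipe: the helper name raisePaginationLimit is hypothetical, the credentials and index name in the usage comment are placeholders, and it assumes the index object comes from client.initIndex(indexName) on the algoliasearch client, created with an API key that is allowed to edit settings.

```javascript
// Sketch only: `index` is expected to be the object returned by
// client.initIndex(indexName) on an algoliasearch client.
function raisePaginationLimit(index, limit) {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new RangeError('limit must be a positive integer');
  }
  // paginationLimitedTo is changed through the index settings,
  // exactly as in the setSettings call quoted in the answer.
  return index.setSettings({ paginationLimitedTo: limit });
}

// Hypothetical usage (placeholder credentials and index name):
// const client = algoliasearch('YOUR_APP_ID', 'YOUR_ADMIN_API_KEY');
// raisePaginationLimit(client.initIndex('YOUR_INDEX_NAME'), 5000);
```

Because the change is stored in the index settings, it only has to be made once; the performance and scraping caveats quoted above still apply to whatever value is chosen.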

Do I have to loop through each 'page' of orders to get all orders in one WooCommerce REST API query?

I've built a KNIME workflow that helps me analyse (sales) data from numerous channels. In the past I exported all orders manually and used an XLSX or CSV reader, but I want to do it via WooCommerce's REST API to reduce manual labor.
I would like to receive all orders up until now from a single query. So far, I only get as many as the number I fill in for &per_page=X, and if I fill in something like 1000, it gives an error. This, plus my common sense, gives me the feeling I'm thinking about it the wrong way!
If it is not possible, is looping through all pages the second best thing?
I've managed to connect to the API via basic auth. The following query returns orders, but only 10:
https://XXXX.nl/wp-json/wc/v3/orders?consumer_key=XXXX&consumer_secret=XXXX
I've tried increasing per_page, but I do not think this is the right way to get all orders in one table.
Thanks in advance for your responses. I am more of a data analyst than a data engineer or scientist, and I hope your answers will help me towards my goal of being more of a scientist :)
It's possible by passing the per_page parameter with the request:
per_page (integer): Maximum number of items to be returned in result set. Default is 10.
Try -1 as the value.
https://woocommerce.github.io/woocommerce-rest-api-docs/?php#list-all-orders
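If a single large per_page value is rejected (many installations cap it, commonly at 100), looping through the pages, as the question suggests, is the usual fallback. Below is a minimal sketch of that loop. The name fetchAllOrders and the fetchPage callback are hypothetical: fetchPage(page, perPage) is assumed to request the orders endpoint with the page and per_page query parameters and to resolve to the parsed JSON array for that page.

```javascript
// fetchPage(page, perPage) is a hypothetical caller-supplied function that
// requests .../wp-json/wc/v3/orders?page=<page>&per_page=<perPage> and
// resolves to the parsed JSON array of orders for that page.
async function fetchAllOrders(fetchPage, perPage = 100) {
  const all = [];
  for (let page = 1; ; page += 1) {
    const batch = await fetchPage(page, perPage);
    all.push(...batch);
    // A short (or empty) page means there is nothing left to fetch.
    if (batch.length < perPage) {
      return all;
    }
  }
}
```

The API also reports the totals in the X-WP-Total and X-WP-TotalPages response headers, so a loop could alternatively read the page count from the first response instead of stopping on a short page.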

free-jqGrid External Filtering Used With Grid's beforeRequest() or onPaging() Event

I'm using free jqGrid (version 4.15.6) to show very basic information about invoices (i.e. date created, date due, client, total, status). The invoices grid only displays a few pertinent columns because showing more is simply not needed. In reality there are many other invoice-related fields that are not shown. I would like to offer end users the ability to filter the grid on many of these other parameters that are not part of the grid contents.
I know jqGrid offers built-in searching, and that you can easily add hidden columns with all the data, but I feel this is not good for us: invoices contain a lot of data, and not all of it lives in the invoices database table. We want the grid to provide many other filtering options outside of the base invoice data, but we do NOT want to use the built-in filter options. Instead, I would like to use a separate HTML table with a set of search fields that our server-side code knows how to handle. When the user invokes the external filter, we want the grid to load all invoices matching that combined filter. And if the user navigates with the grid's paging buttons, we want the grid to keep using the original external filtering parameters.
Hope this makes sense. Maybe I am overthinking this, but I am fairly certain the grid is designed to use its built-in filtering/searching tools and dialog, and I have not found any way to override this behavior. Actually, I have done so with an older jqGrid, but that involved using jQuery to completely REPLACE the default pager with custom HTML and event handling. I never could figure out how to do this within the older jqGrid itself, so I chose to write it myself. But that code is less than optimal, and even I know it is open to much criticism. Having upgraded to 4.15.6, I want to do this the best way and keep it logical and practical.
I have tried using the beforeRequest() and onPaging() events to change the 'url' parameter, thinking that if I modified the url, I could change the GET to include all of our custom filtering fields. That does not seem to work: the url NEVER changes from the originally defined value. Console logging shows the events firing but no change to the url. On top of that, the grid ALWAYS passes its own page field, _search field, etc. to the server, so the server NEVER sees the filter request.
How does one define custom filtering coupled with the paging loader and still take advantage of the built-in paging events? What am I missing?
**** DELETED CODE THAT WAS ADDED TO QUESTION THAT DID NOT PERTAIN TO ORIGINAL QUESTION ISSUE *********
It's difficult to answer your question because you didn't post code fragments showing how you use jqGrid, and because the total amount of data that might need to be displayed across all pages isn't known.
In general there are two main alternatives for implementing custom filtering:
server-side filtering
client-side filtering
One can additionally use a mix of both. For example, one can load from the server all invoices matching some fixed filters (all invoices of a specific user, all invoices of one organization, all invoices of the last month) and then use the loadonce: true, forceClientSorting: true options to sort and filter the returned data on the client side. The user can then filter that subset of data further locally using the filter toolbar or the searching dialog.
Client-side performance has improved substantially in recent years, and relatively large JSON data can be loaded from the server very quickly. Because of that, client-side filtering is strongly recommended. To better understand the performance of local sorting, filtering and paging, I'd recommend trying the functionality on the demo. You will see that local filtering of a grid with 5000 rows and 13 columns is faster than what you can typically expect from a round trip to the server plus server-side filtering, even on a very well organized database. That's why I recommend using client-side sorting (or the loadonce: true, forceClientSorting: true options) as far as possible.
If you need to filter data on the server, you just need to send additional parameters to the server with every request. One can do that by including additional parameters in postData. See the old answer for additional details. Alternatively, one can use serializeGridData to extend or modify the data that will be sent to the server.
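To make the server-side option concrete, here is a minimal sketch of merging external filter values into the data the grid sends on every request, including paging requests. The helper names buildExternalFilterParams and mergeGridRequest, and the readFilterForm function in the usage comment, are hypothetical; the field names are borrowed from the filter form used elsewhere in this thread.

```javascript
// Field names (fuserid, fstatus, ...) follow the filter form used elsewhere in
// this thread; the helper itself is a hypothetical illustration.
function buildExternalFilterParams(formValues) {
  const params = {};
  Object.keys(formValues).forEach((name) => {
    const value = formValues[name];
    // Skip fields the user left empty so the server only sees active filters.
    if (value !== undefined && value !== null && String(value).trim() !== '') {
      params[name] = String(value).trim();
    }
  });
  return params;
}

// Merges the grid's own request data (page, rows, sidx, sord, _search, ...)
// with the external filter values; suitable as a serializeGridData body.
function mergeGridRequest(postData, formValues) {
  return Object.assign({}, postData, buildExternalFilterParams(formValues));
}

// Hypothetical wiring inside the grid options (readFilterForm is assumed to
// return an object such as { fuserid: '42', fstatus: '' }):
// serializeGridData: function (postData) {
//   return mergeGridRequest(postData, readFilterForm());
// }
```

Since the merge runs for each request, the same filter values travel with every page change, so the built-in pager keeps working without touching the url option.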
After the data are loaded from the server, they can be sorted and filtered locally before the first page of data is displayed in the grid. To force local filtering, one just needs to add forceClientSorting: true to the well-known loadonce: true parameter. It forces local logic to be applied to the data returned from the server. Thus one can use postData.filters with search: true to force additional local filtering, and the sortname and sortorder parameters to force local sorting.
One more important remark about using hidden columns. Every hidden column forces the creation of DOM elements: unneeded <td> elements. The more DOM elements you place on the page, the slower the page will be. If local data is used (or if loadonce: true is used), jqGrid holds the data associated with every row twice: once as a JavaScript object and once as cells in the grid (<td> elements). Free jqGrid allows the use of "additional properties" instead of hidden columns. In that case no data is placed in the DOM of the grid; the data is held in JavaScript objects, and one can sort or filter by additional properties in the same way as with other columns. In the simplest case, one can remove all hidden columns and add the additionalProperties parameter, which should be an array of strings with the names of the additional properties. Instead of strings, elements of additionalProperties can be objects with the same structure as colModel items. For example, additionalProperties: [{ name: "taskId", sorttype: "integer"}, "isFinal"]. See the demo as an example. The input data of the grid can be seen here. Another demo shows that the searching dialog contains additional properties in addition to the jqGrid columns. The commented-out columns part of searching shows a more advanced way to specify the list and order of columns and additional properties displayed in the searching dialog.
Forgive me for answering like this, but this question started out on one subject: filtering and paging with an external filtering source. Oleg has several demos across many threads that I was able to use to accomplish the custom filtering while keeping the default built-in paging. So his answer will be the accepted answer for the original question.
But while implementing that solution, I ran into another issue with loading the grid initially. I wanted the grid to load with default filtering values when no other filter is already in place. That really should have been a separate question, because it did not affect the first.
I found yet another Oleg reply on a completely different question:
jqGrid - how to set grid to NOT load any data initially?
Oleg's answer to that question solved our second need: to load one way initially, then allow another way afterwards.
So, on initial load, we look for the filter params server-side. None given? We pull records using default filtering. Params present? We use the provided params. The difference on the initial load is that we do not return the data through AJAX. Instead, we json_encode the data and place it in the grid definition as follows:
$('#grd_invoices').jqGrid({
    ...
    url: '{$modulelink}&sm=130',
    data: {$json_encoded_griddata},
    datatype: 'local',
    ...
});
Since the datatype is set to 'local', the grid does NOT go to server initially, so the data parameter is used by the grid. Once we are ready to filter, we use Oleg's solution from yet another answer on yet another question to dynamically apply the filter as follows:
// One "eq" rule per external filter field; each field name is also the id of its input.
var fields = ['fuserid', 'finvoicenum', 'fdatefield', 'fsdate', 'fedate', 'fwithin',
    'fnotes', 'fdescription', 'fpaymentmethod', 'fstatus', 'ftotalfrom', 'ftotal',
    'fmake', 'fmodel', 'fserial', 'fitemid', 'ftaxid', 'fsalesrepid'];
var myfilter = { groupOp: 'AND', rules: [] };
$.each(fields, function (i, name) {
    myfilter.rules.push({ field: name, op: 'eq', data: $('#' + name).val() });
});
var grid = $('#grd_invoices');
grid[0].p.search = myfilter.rules.length > 0;
$.extend(grid[0].p.postData, { filters: JSON.stringify(myfilter) });
// Switch to remote loading and reload from page 1 with the new filters.
grid.jqGrid('setGridParam', { datatype: 'json' }).trigger('reloadGrid', [{ page: 1 }]);
This lets the grid show the initial data loaded locally; any subsequent filtering changes the grid datatype to 'json', which forces the grid to go to the server with the new filter params and load the more specifically filtered data.
Credit goes to Oleg because I used many of his posts from many questions to reach the end result. Thank you @Oleg!

How to inject (dynamic?) Parameters in Tableau CustomSQL

I am currently trying to solve the following issue in Tableau:
In the end, I would like to have a Tableau dashboard where the user can select a Customer and then see that Customer's KPIs. Nothing spectacular so far.
To obtain a Customer's KPIs, there is a CustomSQL query with a parameter "CustomerName" (that returns the KPIs for that Customer).
Now the thing:
I don't want a hardcoded list of CustomerNames, which is what Tableau Parameters would give me. Instead, the CustomerNames should be fetched from another data source. I did not find a way to "link" a Parameter to a data source, or to inject anything other than static Parameters into CustomSQL.
My question: is there really no solution for this, or am I just doing something wrong (I hope so)?
I found this workaround here https://www.interworks.com/de/blog/daustin/2015/12/17/dynamic-parameters-tableau that seems to work, but that looks like... a workaround.
Some background info:
I have to stick to using a CustomSQL because:
It is not viable for me to calculate all KPIs for all CustomerNames and then filter in Tableau, since the amount of data is too big.
It is not viable to replace the CustomSQL with Tableau Calculations and Filters (already tried that, and ended up with Tableau pulling too much data instead of pushing the work to the database).
I cannot believe that Tableau does not offer a solution here, since the use case is pretty common, I believe.
Do you have some input for me?
Thank you for your help in advance!
Kind Regards
Have you tried using the rawsql() functions together with stored functions on the database side? I found this pretty useful when I needed to load a single value from a dataset completely unrelated to the currently used data source.
For example, to run a stored function foo that accepts two dates and calculates a sum of something, the syntax should be something like:
rawsql_int("your_db_schema.foo(%1, %2)", [startDateFieldTableau], [endDateFieldTableau])
You can also run a query directly:
rawsql_int("select sum(bar) from sales")
but this is a bit risky.
Drawbacks:
it relies on the current connection (you create a calculated field, duh!)
it will not work with an extract (but you are using custom SQL anyway, so I believe you are more into a live connection)

Mongo pagination

I have a use case where I need to get a list of objects from Mongo based on a query. To improve performance, I am adding pagination.
So, on the first call I get a list of, say, 10 objects, and on the next I need 10 more. But I cannot use offset and pageSize directly, because the first 10 objects displayed on the page may have been modified (deleted) in the meantime.
The solution is to take the ObjectId of the last object returned and retrieve the next 10 objects after that ObjectId.
How can I do this efficiently using Morphia?
Using Morphia you can do this with the following command:
datastore.find(YourClass.class).field("_id").lessThan(lastId).limit(10).order("-ts");
Since you are retrieving the items after the last retrieved id, you don't have to deal with deleted items.
One thing that occurs to me is that you will have the same problem here as with skip(), unless you intend to change how your interface works.
Using ranged queries like this demands a different kind of interface, since it is much harder to determine exactly what page you are on and how many pages remain, especially if you are doing this to avoid the problems of conventional paging.
The kind of interface that naturally arises from this type of paging is an infinitely scrolling page; think of YouTube video comments, a Facebook wall feed or even Google+. There is no physical pagination or "pages"; instead you have a "get more" button.
This is the type of interface you will need in order to get ranged paging to work well.
As for the query, @cubbuk gives a good example:
datastore.find(YourClass.class).field("_id").lessThan(lastId).limit(10).order("-ts");
Except it should be greaterThan(lastId), since you want to find everything above that last _id. I would also sort by _id, unless you generate your ObjectIds some time before you insert a record; in that case you can use a specific timestamp set on insert instead.
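The ranged-query idea can be illustrated without a database. The sketch below is not Morphia code: it models documents as plain objects with a sortable _id, and the function name nextPage is hypothetical. In MongoDB the same logic corresponds to a range condition on _id combined with a sort on _id and a limit.

```javascript
// Documents are modelled as plain objects with a sortable `_id`; in MongoDB the
// equivalent would be a range query on _id with a sort and a limit.
function nextPage(docs, lastId, pageSize) {
  return docs
    // Pass null as lastId to get the first page.
    .filter((doc) => lastId === null || doc._id > lastId)
    .sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0))
    .slice(0, pageSize);
}
```

If a document from an already displayed page is deleted between two calls, the next call still starts right after the last _id the client saw, whereas an offset-based page would silently skip one document.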

Lucene.NET faceted search

I found a great tutorial on performing a faceted search.
http://www.devatwork.nl/articles/lucenenet/faceted-search-and-drill-down-lucenenet/
This article does not explain how to retrieve the narrowed set of available attributes to filter on (for further drill-down).
Let's say I am looking for planners that are red. When I perform the faceted search, I want to get back all attributes that are still available for filtering among the red planners. Then, when I add a "weekly format" filter, I want the attribute list to get even smaller, containing only the filters available for that narrowed group.
I would love to use Solr/SolrNET, but I am in a shared hosting situation with limited access to the actual server.
I am fairly new to Lucene.NET, so examples are much appreciated.
IIUC, you get a BitArray containing the list of the filtered results. In the tutorial's example, you will have combinedResults as this list. If you want to further narrow this down, you need to reiterate the process: run another searchQuery and intersect the results with the BitArray you have for combinedResults.
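The intersection step can be sketched independently of Lucene.NET. In the illustration below, plain arrays of booleans stand in for the tutorial's BitArray objects, and the function names (intersect, countBits, availableFacets) as well as the "attribute:value" keys are hypothetical. The idea is to intersect the current result bitset with the bitset of each facet value and keep only the values that still match at least one document.

```javascript
// Each bitset is an array of booleans indexed by document number, standing in
// for the BitArray objects used in the tutorial.
function intersect(a, b) {
  return a.map((bit, i) => Boolean(bit && b[i]));
}

function countBits(bits) {
  return bits.filter(Boolean).length;
}

// facetBitsets maps "attribute:value" to the bitset of documents having it.
// Returns only the facet values still present in the current result set.
function availableFacets(resultBits, facetBitsets) {
  const counts = {};
  Object.keys(facetBitsets).forEach((key) => {
    const n = countBits(intersect(resultBits, facetBitsets[key]));
    if (n > 0) {
      counts[key] = n;
    }
  });
  return counts;
}
```

Drilling down then means replacing the result bitset with its intersection with the chosen facet's bitset and calling availableFacets again, which shrinks the list of remaining filters.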
"I would love to use Solr/SolrNET, but I am in a shared hosting situation with limited access to the actual server."
You can always use an off-site, hosted Solr solution. See this question for more information.