Algolia Faceted Search to Show all Facet Options

I'm using facets with Algolia using the js client. As an example, if I have facets for colors and sizes, and I pick Red and Large, I can get the results with something like:
index.search("shirts", {
  "facets": "*",
  "hitsPerPage": 10,
  "facetFilters": [
    "color:Red",
    "size:Large"
  ],
  "maxValuesPerFacet": 100
});
That works fine, showing just the Red and Large facets in the result:
Color:
Red
Size:
Large
But I would like to be able to show all of the possible options for each facet and just highlight the selected one. Something like:
Color:
Red *selected*
Blue
Green
Size:
Large *selected*
Small
Medium
Is there a way to do this in Algolia with one search query and using regular facets (not disjunctive)?

One easy way to handle facets is to use the Algolia JS Helper in addition to the JS API Client.
It offers an elegant way to manage regular facets, disjunctive facets, and hierarchical facets.

You can only achieve this behavior with disjunctive faceting, because it takes several queries:
a first query without the filter, to compute the counts for every value of each facet
a second query with the filter applied, to get the counts after filtering
Disjunctive faceting generates those queries for you, so you won't have to deal with them yourself.
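To make the "several queries" idea concrete, here is a minimal sketch of how such a set of queries could be built by hand. The function name buildDisjunctiveQueries and the shape of the refinements object are hypothetical; only facets and facetFilters are actual Algolia search parameters. How the queries are sent (for instance, batched through the multiple-queries endpoint) is left out.

```javascript
// Sketch only: `buildDisjunctiveQueries` and the `refinements` shape are
// hypothetical, not part of the Algolia client or helper.
function buildDisjunctiveQueries(query, refinements) {
  // { color: ["Red"] } -> [["color:Red"]]; outer array = AND, inner array = OR.
  const toFilters = (selected) =>
    Object.entries(selected).map(([facet, values]) =>
      values.map((value) => `${facet}:${value}`)
    );

  // First query: every filter applied. This one supplies the hits.
  const queries = [
    { query, params: { facets: ["*"], facetFilters: toFilters(refinements) } },
  ];

  // One extra query per refined facet, with that facet's own filter removed,
  // so that all of its values (and their counts) come back.
  for (const facet of Object.keys(refinements)) {
    const others = { ...refinements };
    delete others[facet];
    queries.push({
      query,
      params: { facets: [facet], facetFilters: toFilters(others) },
    });
  }
  return queries;
}

const queries = buildDisjunctiveQueries("shirts", {
  color: ["Red"],
  size: ["Large"],
});
```

With two refined facets this yields three queries: one for the hits, one returning every color among Large shirts, and one returning every size among Red shirts.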

Related

MongoDB Querying Large Datasets

Let's say I have a simple document structure like:
{
  "item": {
    "name": "Skittles",
    "category": "Candies & Snacks"
  }
}
On my search page, whenever a user searches by product name, I want to offer filter options by category.
Since there can be many categories (around 50), I cannot display all of the checkboxes in the sidebar beside the search results. I want to show only those that have products associated with them in the results. So if none of the products in the search results belongs to a category, that category option should not be shown.
Now, the item search by name is itself paginated: I only show 30 items per page, and we have tens of thousands of items in our database.
I could search and retrieve all items from all pages, then parse the categories. But if I retrieve tens of thousands of items in one page, it would be really slow.
Is there a way to optimize this query?
You can try different approaches based on your workflow and see what works best in your situation. Some good candidates are:
Use distinct before running the query on the large dataset
Use the Aggregation Pipeline, as @Lucia suggested
[{$group: { _id: "$item.category" }}]
Use another datastore (either Redis or MongoDB itself) to store intelligence on categories
Finally, depending on the approach you choose and the volume of filter requests, you may want to consider indexing some fields.
P.S. You're right about how aggregation works: unless you have a $match filter as the first stage, it will fetch all the documents and then apply the next stage.
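As an illustration of putting a $match stage first, here is a minimal sketch of a pipeline that narrows to the searched items and then groups them by category. The helper names escapeRegex and categoryFacetPipeline, the per-category count, and the collection name in the usage comment are assumptions for the example, not part of the original answer.

```javascript
// Sketch only: both helpers are hypothetical names used for illustration.
function escapeRegex(text) {
  // Escape regex metacharacters so user input is matched literally.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function categoryFacetPipeline(searchTerm) {
  return [
    // Narrow to the items matching the search first...
    { $match: { "item.name": { $regex: escapeRegex(searchTerm), $options: "i" } } },
    // ...then collapse the matches into one row per category, with a count.
    { $group: { _id: "$item.category", count: { $sum: 1 } } },
    { $sort: { count: -1 } },
  ];
}

const pipeline = categoryFacetPipeline("skittles");
// Assumed usage with the Node.js driver (collection name is hypothetical):
// const categories = await db.collection("items").aggregate(pipeline).toArray();
```

The result holds only the categories present in the matching items, independent of which page of 30 items is currently displayed.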

How to sort data using MongoDB Compass

I'm currently trying to use MongoDB Compass to query my collection. However, I seem to be only able to filter the data.
Is there any way for me to sort the data as well? I would like to sort my data in ascending order using one of my data fields.
If MongoDB Compass isn't the best way to order a collection, what other GUI could I use?
Using MongoDB Compass 1.7 or newer, you can sort (and project, skip, or limit) results by choosing the Documents tab and expanding the Options.
To sort in ascending order by a field myField, use { myField:1 }. Any of the usual cursor sort() options can be provided, including ordering results by multiple fields.
Note: options like sort and skip are not available in the default Schema tab because this view uses sampling to find a random set of documents, as opposed to the Documents view which displays a specific query result.

Can I OR geo search with numeric filters?

I am using insideBoundingBox, and I would like to add a numeric filter.
Something like
color_id=12 OR insideBoundingBox='...'
Is this possible?
Thanks!
Unfortunately, that's not supported by the Algolia API. The workaround would be to perform two queries and merge the result sets.
Maybe you could first display the results matching "in the area", followed by the ones matching the color_id?
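A minimal sketch of the merge step, assuming both queries have already been run and each hit carries Algolia's objectID field. The function name mergeHits is hypothetical, and the sample hits are made up for the example.

```javascript
// Sketch only: `mergeHits` is a hypothetical helper, not an Algolia API.
// Geo hits come first, then color hits that were not already listed.
function mergeHits(geoHits, colorHits) {
  const seen = new Set();
  const merged = [];
  for (const hit of [...geoHits, ...colorHits]) {
    if (seen.has(hit.objectID)) continue; // record matched both queries
    seen.add(hit.objectID);
    merged.push(hit);
  }
  return merged;
}

const merged = mergeHits(
  [{ objectID: "a" }, { objectID: "b" }],
  [{ objectID: "b" }, { objectID: "c" }]
);
```

Deduplicating on objectID matters because a record that lies inside the bounding box and also has the right color_id is returned by both queries.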

How to display score of Hibernate Search query results

Hibernate Search allows sorting search results by relevance. Is it possible to obtain and display this score (e.g. in a JSP view) when using a Lucene query?
A Query in Hibernate Search can return Projections rather than the simple list of matching entities.
A projection result essentially means each result is an array containing the sequence of projections you asked for. Normally this is used to extract text from a specific field, so that the data does not need to be loaded from the database, but there are also Projection constants that return the Score value or the Explanation of the scoring.
query.setProjection( ProjectionConstants.SCORE, ProjectionConstants.EXPLANATION, ProjectionConstants.THIS );
See also the Reference documentation on projections explaining this and more.

MongoDB fulltext search + workaround for partial word match

Since it is not possible to find "blueberry" by the word "blue" using a MongoDB full text search, I want to help my users complete the word "blue" to "blueberry". To do so, is it possible to query all the words in a MongoDB full text index, so that I can use the words as suggestions, e.g. for typeahead.js?
Language stemming in text search uses an algorithm to try to relate words derived from a common base (eg. "running" should match "run"). This is different from the prefix match (eg. "blue" matching "blueberry") that you want to implement for an autocomplete feature.
To most effectively use typeahead.js with MongoDB text search I would suggest focusing on the prefetch support in typeahead:
Create a keywords collection which has the common words (perhaps with usage frequency count) used in your collection. You could create this collection by running a Map/Reduce across the collection you have the text search index on, and keep the word list up to date using a periodic Incremental Map/Reduce as new documents are added.
Have your application generate a JSON document from the keywords collection with the unique keywords (perhaps limited to "popular" keywords based on word frequency to keep the list manageable/relevant).
You can then use the generated keywords JSON for client-side autocomplete with typeahead's prefetch feature:
$('.mysearch .typeahead').typeahead({
  name: 'mysearch',
  prefetch: '/data/keywords.json'
});
typeahead.js will cache the prefetch JSON data in localStorage for client-side searches. When the search form is submitted, your application can use the server-side MongoDB text search to return the full results in relevance order.
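To show what the generated keywords JSON might contain, here is a minimal sketch that counts words in memory and keeps the popular ones. It stands in for the Map/Reduce job purely for illustration; the function name buildKeywordList, the minCount threshold, and the sample texts are assumptions, and it assumes the prefetch endpoint serves a plain JSON array of strings.

```javascript
// Sketch only: `buildKeywordList` is hypothetical and counts in memory
// rather than with Map/Reduce; it just shows the shape of the output.
function buildKeywordList(texts, minCount) {
  const counts = new Map();
  for (const text of texts) {
    // Split into lowercase alphanumeric words.
    for (const word of text.toLowerCase().match(/[a-z0-9]+/g) || []) {
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .filter(([, count]) => count >= minCount)
    // Most frequent first; ties broken alphabetically.
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([word]) => word);
}

const keywords = buildKeywordList(
  ["Blueberry muffin", "blueberry jam", "Blue cheese"],
  1
);
// JSON.stringify(keywords) is what the keywords endpoint would serve.
```

Raising minCount is one way to keep only the "popular" keywords and keep the prefetched list small.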
A simple workaround I am using right now is to break the text into individual characters, stored as a text-indexed array.
Then, when you run the $search query, you simply break the query up into characters again.
Please note that this only works for short strings (say, shorter than 32 characters). Otherwise, building the index takes a really long time, so performance drops significantly when inserting new records.
You cannot query for all the words in the index, but you can of course query the original document's fields. The words in the search index are also not always the full words, because they are stemmed. So you probably wouldn't find "blueberry" in the index, just "blueberri".
Don't know if this might be useful to some new people facing this problem.
Depending on the size of your collection and how much RAM you have available, you can make a search by $regex, by creating the proper index. E.g:
db.collection.find( {query : {$regex: /querywords/}}).sort({'criteria': -1}).limit(limit)
You would need an index as follows (createIndex replaces the deprecated ensureIndex):
db.collection.createIndex( { "query": 1, "criteria" : -1 } )
This could be really fast if you have enough memory.
Hope this helps.
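One caveat worth sketching: MongoDB can use the index efficiently only when the regular expression is a case-sensitive prefix match (anchored with ^); an unanchored pattern such as /querywords/ still has to examine every index entry. Below is a minimal sketch of building an anchored filter for autocomplete. The helper names are hypothetical, and the field name query follows the example above.

```javascript
// Sketch only: `escapeForRegex` and `prefixQuery` are hypothetical helpers.
function escapeForRegex(text) {
  // Escape regex metacharacters so user input is matched literally.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function prefixQuery(prefix) {
  // "^" anchors the match to the start of the field value.
  return { query: { $regex: "^" + escapeForRegex(prefix) } };
}

const filter = prefixQuery("blue");
// Assumed shell usage:
// db.collection.find(filter).sort({ criteria: -1 }).limit(10)
```

This matches "blueberry" for the input "blue", which is the prefix behavior the question asks for, but it will not match "blue" in the middle of a value.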
For those who have not yet started implementing any database architecture and are here for a solution, go for Elasticsearch. It's a JSON document-driven database, structurally similar to MongoDB. It has an "edge-ngram" analyzer, which is very efficient and quick at giving you "did you mean" suggestions for misspelled searches. You can also search on partial words.