Algolia: Filter Index by string array attribute with a string array of possible values - algolia

I have an Algolia Index that contains objects like this:
{
  "id": 2,
  "name": "test",
  "important": ["lorem", "ipsum", "dolor", "sit", "amet"]
}
I want to retrieve all entries that e.g. contain either "dolor" or "sit".
How would I go about this?
Note: This is just an example; the important array of each entry would normally contain around 1 to 4 values (out of around 1,000 possible values in total). The array to filter by / search for could have anywhere between 1 and 400 values.
What AFAIK doesn't work:
searching in facet values by using a facetQuery: facetQuery does not allow boolean operators. Therefore I can only search for one of "dolor" or "sit" at a time, see docs.
The filters docs, however, say:
Non-numeric attributes (e.g. strings) need to be set up as categories, which we call facets.
So I am wondering if this is possible at all...? Or maybe I am approaching this issue the wrong way?

You are looking in the right place; you need to combine attributesForFaceting and filters:
set the important attribute as an attributesForFaceting, either via the API or the Dashboard
then use filters to filter on your desired values
Your filter will look like this: { "filters": "important:dolor OR important:sit" }
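For the question's case of up to 400 values, the same filter string can be built programmatically. A minimal sketch (the helper name is made up; quoting assumes the values themselves contain no double quotes, and very long filter strings may run into query-size limits):

```javascript
// Build an Algolia `filters` string that ORs one attribute over many values.
function buildFilters(attribute, values) {
  return values.map((v) => `${attribute}:"${v}"`).join(' OR ');
}

const filters = buildFilters('important', ['dolor', 'sit']);
// → important:"dolor" OR important:"sit"
```

The resulting string is then passed as `{ filters }` in the search parameters.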

Related

Algolia optionalFilters acts like filters on virtual index

I have my index and virtual index, on my index query like:
{
"facetFilters": [["objectID:12345", "tag:Luxury","tag:Makeup"]], // 12345 or luxury or makeup
"optionalFilters": "objectID:12345" // put it as first
}
will return all documents that have the given object ID or the tag Luxury or the tag Makeup, and puts the object with ID "12345" first. It behaves as expected.
But when I run the same query on my virtual index it only returns the document with the given ID "12345". So it behaves like a filter, whereas the docs say:
https://www.algolia.com/doc/guides/managing-results/rules/merchandising-and-promoting/in-depth/optional-filters/
Unlike filters, optional filters don’t remove records from your search results when your query doesn’t match them. Instead, they divide your records into two sets: the results that match the optional filter, and the ones that don’t.
Weird. I just set this up and am seeing the same results. I don't see anything in the docs that would explain why the behavior would be different, so I'm reaching out to some engineering colleagues to see what's going on.
UPDATE:
Algolia virtual replicas and optionalFilters both do out-of-band sorting of results at query time. It looks like those two features are causing strangeness when they both try to do their sort. I've cut a ticket on this, but for now, to get the expected results, use a standard replica with optionalFilters: the standard replica does index-time sorting, and optionalFilters can then layer their query-time filtering on top of it.

Extensive filtering

Example:
{
shortName: "KITT",
longName: "Knight Industries Two Thousand",
fromZeroToSixty: 2,
year: 1982,
manufacturer: "Pontiac",
/* 25 more fields */
}
Ability to query by at least 20 fields, which means that only 10 fields are left unindexed
There are 3 fields (all numbers) that could be used for sorting (both ways)
This leaves me wondering how sites with lots of searchable fields do it: e.g. real estate or car sale sites where you can filter by every small detail and choose between several sort options.
How could I pull this off with MongoDB? How should I index that kind of collection?
I'm aware that there are DBs specifically made for searching, but there must be general rules of thumb to do this (even if less performantly) in every DB. I'm sure not everybody uses Elasticsearch or similar.
---
Optional reading:
My reasoning is that the index could be huge, but the index order matters. You'd always make sure that fields that return the fewest results come first and the most generic fields come last in the index. However, what if the user chooses only generic fields? Should I include non-generic fields in the query anyway? How do I solve ordering in both directions? Or does index intersection save the day, and I should just add 20 different indexes?
A text index is your friend.
Read up on it here: https://docs.mongodb.com/v3.2/core/index-text/
In short, it's a way to tell MongoDB that you want full-text search over a specific field, multiple fields, or all fields (yay!)
To allow text indexing of all fields, use the special symbol $** and define it as type 'text':
db.collection.createIndex( { "$**": "text" } )
you can also configure it with Case Insensitivity or Diacritic Insensitivity, and more.
To perform text searches using the index, use the $text query helper, see: https://docs.mongodb.com/v3.2/reference/operator/query/text/#op._S_text
Update:
In order to allow user to select specific fields to search on, it's possible to use weights when creating the text-index: https://docs.mongodb.com/v3.2/core/index-text/#specify-weights
If you carefully select your fields' weights, for example using different prime numbers only, and then add the $meta text score to your results you may be able to figure out from the "textScore" which field was matched on this query, and so filter out the results that didn't get a hit from a selected search field.
Read more here: https://docs.mongodb.com/v3.2/tutorial/control-results-of-text-search/
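Putting the pieces above together, a scored query against the text index might look like this. A sketch in mongo-shell / Node-driver syntax, shown as plain objects; the search phrase is just an example:

```javascript
// The parts of a $text query with relevance-score projection and sorting.
const filter = { $text: { $search: "knight industries" } };
const projection = { score: { $meta: "textScore" } };
const sort = { score: { $meta: "textScore" } };
// In the mongo shell: db.collection.find(filter, projection).sort(sort)
```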

Match a specific keyword to a specific result

We have a need to make a specific search keyword offer a specific result. Is there a way to do this from within the Algolia console?
There isn't such a feature out of the box, but you can reproduce this behavior using 2 different solutions:
Quick work-around: adding the search keywords in your records
You could add a new keyword attribute to your objects and list in it all the keywords you want to use for every single object.
with attributesToIndex:
Putting that keyword attribute on top of your attributesToIndex will make it match before the other attributes.
or with attributesForFaceting:
Putting that keyword attribute in the attributesForFaceting will let you filter on it. For every single search, you could do an extra query using the query string as the filter: index.search('' /* empty query string: match all */, { facetFilters: "keyword:THE_USER_QUERY_STRING" }) and use those results if there is a match; otherwise, use the regular search query.
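The try-the-facet-first flow can be sketched like this; `index` stands for an Algolia index client (in the real app it would come from the Algolia SDK), and the function name is made up:

```javascript
// Try an exact keyword-facet match first; fall back to a regular search.
async function searchWithKeyword(index, query) {
  // Empty query string matches all records; the facet filter does the work.
  const exact = await index.search('', { facetFilters: [`keyword:${query}`] });
  if (exact.hits.length > 0) return exact.hits;
  // No curated keyword matched: run the normal full-text search.
  const regular = await index.search(query);
  return regular.hits;
}
```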
Better option: using an additional index
Using an additional index, you would push 1 record per search keyword you want to handle.
A record of such index would look like this:
{
"keyword": "mykeyword",
"object": {
// the object you want to retrieve
}
}
Configure your attributesToIndex with keyword only. You may also want to set the queryType of that extra index to prefixNone so the native prefix search doesn't trigger. (Alternatively, you could use the facetFilters approach here as well.)
For every single search, you would then query 2 indices: your original index and this extra index. In case the extra one has a match, you can inject the object in your search results.
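The injection step can be sketched as a small merge helper (names are illustrative; `keywordHits` are hits from the extra keyword index, each wrapping an `object` as in the record format above, and `regularHits` come from the original index):

```javascript
// Promote a keyword-index match to the top of the regular results.
function mergeResults(keywordHits, regularHits) {
  if (keywordHits.length === 0) return regularHits;
  const promoted = keywordHits[0].object;
  // Put the promoted object first and drop any duplicate of it below.
  return [promoted, ...regularHits.filter((h) => h.objectID !== promoted.objectID)];
}
```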

MongoDB - forcing stored value to uppercase and searching

in SQL world I could do something to the effect of:
SELECT name FROM table WHERE UPPER(name) = UPPER('Smith');
and this would match a search for "Smith", "SMITH", "SmiTH", etc... because it forces the query and the value to be the same case.
However, MongoDB doesn't seem to have this capability without using a RegEx, which won't use indexes and would be slow for a large amount of data.
Is there a way to convert a stored value to a particular case before doing a search against it in MongoDB?
I've come across the $toUpper aggregate, but I can't figure out how that would be used in this particular case.
If there's no way to convert stored values before searching, is it possible to have MongoDB convert a value when it's created? So when I add a document to the collection, it would force the "name" attribute to a particular case? Something like a callback in the Rails world.
It looks like there's the ability to create stored JS for MongoDB as well, similar to a Stored Procedure. Would that be a feasible solution as well?
Mostly looking for a push in the right direction; I can figure out the particular code once I know what I'm looking for, but so far I'm not even sure if my desired functionality is doable.
You have to normalize your data before storing it. There is no support for performing normalization as part of a query at runtime.
The simplest thing to do is probably to save both a case-normalized (i.e. all-uppercase) and display version of the field you want to search by. Suppose you are storing users and want to do a case-insensitive search on last name. You might store:
{
_id: ObjectId(...),
first_name: "Dan",
last_name: "Crosta",
last_name_upper: "CROSTA"
}
You can then create an index on last_name_upper, and query like:
> db.users.find({last_name_upper: "CROSTA"})

Is there a way to fetch max and min values in Sphinx?

I'm using Sphinx for document search. Each document has a list of integer parameters, like "length", "publication date (unix)", popularity, ...
The search process itself works fine. But is there a way to get the maximum and minimum field values for a specified search query?
The main purpose is to generate a search form which will contain filter fields so the user can select the document's length.
Or maybe there is another way to solve this problem?
It is possible if length, date etc are defined as attributes.
http://www.sphinxsearch.com/docs/current.html#attributes
Attributes are additional values associated with each document that can be used to perform additional filtering and sorting during search.
Try the GroupBy function on 'length' and select min(length), max(length).
In SphinxQL it is like:
select min(length), max(length) from index_123 group by length
The same for other attributes.