Here's my issue:
I have 2 indexes:
A - product titles only
B - product titles and product descriptions
By default I search index A to categorize products (e.g. most bikes have "bike" in title).
Sometimes there are instances where, to determine the category (which might be a sub-category of something), we need to look at the description, mostly to exclude irrelevant results. For pagination on the search results page to work, I need to get this clean result as one array after running RunQueries().
But it does not work. It basically adds up the results of both queries, and it looks like there's no way to subtract results. Does anyone have any ideas?
Tell me if I'm completely missing something, but it sounds to me like you're trying to include results whose product title matches a certain query and exclude results whose description matches another query?
If that is the case, it seems to me that having 2 indexes is unnecessary: you can have one index with both product titles and descriptions and then run a full-text search query like this:
@title queryA @description -queryB
You can use the same query to search for matches that have a title of queryA AND a description of queryB by simply removing the - symbol.
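As a rough illustration of the single-index query above, here is a minimal Python sketch that assembles the match string. The function name and the field names title and description are assumptions made for the example, and it assumes each query is a single keyword; Sphinx's extended syntax writes the field operator as @field.

```python
def build_title_description_match(title_query, description_query, exclude=True):
    """Assemble an extended-syntax match string (hypothetical helper).

    Assumes both arguments are single keywords and that the index has
    fields named 'title' and 'description'.
    """
    # A leading '-' negates the keyword; without it both terms must match.
    sign = "-" if exclude else ""
    return f"@title {title_query} @description {sign}{description_query}"


# Example with made-up keywords: bikes whose description does not mention "helmet".
match_string = build_title_description_match("bike", "helmet")
```

Passing exclude=False gives the "title AND description" variant mentioned above.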
If this is off base, the only other way I can think of is using SphinxQL (I'm not well versed in the native API libraries, since, as far as I've read, support for the libraries that don't use SphinxQL is being phased out).
Using SphinxQL you could run 2 queries. The first would look like this:
SELECT id FROM indexB WHERE MATCH('@description queryB')
Then run a second query using the list of ids you got from the first query, like this:
SELECT id FROM indexA WHERE id NOT IN(id1,id2,id3,...)
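The second step can be sketched in Python as below. This is only an illustration: the helper name is hypothetical, the ids are assumed to have already been fetched by the first query, and sending the statement to Sphinx is left out. Note also that SphinxQL applies a default LIMIT when none is given, so the first query may need an explicit one to return every id you want to exclude.

```python
def build_exclusion_query(index, excluded_ids):
    """Build the second SphinxQL statement from ids returned by the first.

    Hypothetical helper; 'index' is assumed to be a trusted index name.
    """
    # Coerce every id to int so the inlined list cannot carry arbitrary text.
    ids = [int(i) for i in excluded_ids]
    if not ids:
        # Nothing to exclude, so no NOT IN clause is needed.
        return f"SELECT id FROM {index}"
    id_list = ",".join(str(i) for i in ids)
    return f"SELECT id FROM {index} WHERE id NOT IN ({id_list})"


# Example with made-up ids standing in for the first query's result.
statement = build_exclusion_query("indexA", [3, 7, 12])
```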
I am working on a database of Polish verbs and I'd like to find out how to display my results such that each verb conjugation appears in the following order: 1ps (1st person singular), 2ps, 3ps, 1ppl (1st person plural, etc.), 2ppl, 3ppl. It displays fine when I insert documents:
verb "żyć/przeżyć" conjugation as array and nested document
But when I go to perform queries it jumbles all the array elements up, in the first case (I want to see them in order of array indices), and sorts the nested document elements into alphabetical order (whereas I want to see them in the order in which they were inserted).
verb "żyć/przeżyć" conjugation array/document query
This should be an easy one to solve, I hope this comes across as a reasonable beginner's question. I have searched for answers but couldn't find much info on this topic. Any and all help is greatly appreciated!
Cheers,
LC.
Your screenshots highlight two different views in MongoDB Compass.
The Schema view is based on a sampling of multiple documents and the order of the fields displayed cannot be specified. The schema analysis (as at Compass 1.7) lists fields in case-insensitive alphabetical order with the _id field at the top. Since this is an aggregate schema view based on multiple documents, the ordering of fields is not expected to reflect individual document order.
If you want to work with individual documents and field ordering, you need to use the Documents view, as per your second screenshot. In addition to displaying the actual documents, this view allows you to include sort and skip options for queries.
Example:
{
shortName: "KITT",
longName: "Knight Industries Two Thousand",
fromZeroToSixty: 2,
year: 1982,
manufacturer: "Pontiac",
/* 25 more fields */
}
I need the ability to query by at least 20 fields, which means that only 10 fields are left unindexed
There are 3 fields (all numeric) that could be used for sorting (both ways)
This leaves me wondering how sites with lots of searchable fields do it, e.g. real estate or car sale sites where you can filter by every small detail and choose between several sort options.
How could I pull this off with MongoDB? How should I index that kind of collection?
I'm aware that there are databases made specifically for searching, but there must be general rules of thumb for doing this (even if less performant) in every database. I'm sure not everybody uses Elasticsearch or similar.
---
Optional reading:
My reasoning is that the index could be huge, but the index order matters. You'd always make sure that the fields that return the fewest results come first and the most generic fields come last in the index. However, what if the user chooses only generic fields? Should I include non-generic fields in the query anyway? How do I solve ordering in both directions? Or does index intersection save the day, and should I just add 20 different indexes?
A text index is your friend.
Read up on it here: https://docs.mongodb.com/v3.2/core/index-text/
In short, it's a way to tell MongoDB that you want full-text search over a specific field, multiple fields, or all fields (yay!)
To allow text indexing of all fields, use the special wildcard specifier $** and define it with type 'text':
db.collection.createIndex( { "$**": "text" } )
You can also configure it for case insensitivity or diacritic insensitivity, and more.
To perform text searches using the index, use the $text query helper, see: https://docs.mongodb.com/v3.2/reference/operator/query/text/#op._S_text
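As a hedged sketch of how the index and the search fit together from Python, the helpers below only build the documents you would hand to a driver. The function names are made up for the example, and using pymongo (create_index and find) is an assumption, since the question does not say which driver is in use.

```python
def wildcard_text_index():
    """Key specification for a text index over all string fields."""
    return [("$**", "text")]


def text_search(terms):
    """Return (filter, projection) for a $text search that exposes the score."""
    query_filter = {"$text": {"$search": terms}}
    # $meta: "textScore" asks the server to include the relevance score.
    projection = {"score": {"$meta": "textScore"}}
    return query_filter, projection


# With pymongo these would be used roughly as follows (not executed here,
# 'collection' is a hypothetical pymongo Collection):
#   collection.create_index(wildcard_text_index())
#   query_filter, projection = text_search("knight pontiac")
#   cursor = collection.find(query_filter, projection)
query_filter, projection = text_search("knight pontiac")
```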
Update:
In order to allow user to select specific fields to search on, it's possible to use weights when creating the text-index: https://docs.mongodb.com/v3.2/core/index-text/#specify-weights
If you carefully select your fields' weights, for example using different prime numbers only, and then add the $meta text score to your results you may be able to figure out from the "textScore" which field was matched on this query, and so filter out the results that didn't get a hit from a selected search field.
Read more here: https://docs.mongodb.com/v3.2/tutorial/control-results-of-text-search/
I have an index which has several different attributes.
MySQL [(none)]> select * FROM products_index WHERE MATCH('red shoes');
This returns a bunch of results. Magic. Love Sphinx.
Now, is it possible to see which attribute Sphinx matched on for each of these results?
For example, I have a "colour" field which the "red" would be matching on (potentially), but it could also match on the product name attribute.
I think PACKEDFACTORS() is the only way to do this
http://sphinxsearch.com/docs/current.html#expr-func-packedfactors
It's a little cumbersome to use and adds a bit of overhead to the query, but it should work.
(Other than post-matching, e.g. using snippets.)
Let's say I have a table structure like this:
name | groupId | locationId
And I use SphinxQL's realtime indexing features to index the records in the table, with name being full_text and groupId and locationId being rt_attr_uint.
I understand I can use SphinxQL to do a full-text search and use the attributes to filter results. So, for example, I could search for users that have the letters Bo in their name (e.g. Bob) and then filter results to only include records with groupId = 1 and locationId = 67.
However, my question is whether using SphinxQL is still appropriate if I only want to use the attributes to filter records and am not doing full-text searching, or if I should revert to just doing a MySQL query for that? For example, if I wanted to retrieve all records with groupId = 1 and locationId = 67, regardless of name.
I am asking instead of running performance tests myself because I've never used realtime indexing with Sphinx and am still trying to figure it out (actually, I'm still undecided on whether to use it..), so I really can't do tests myself at this time.
Or, going in a completely different direction, should I just make groupId and locationId fields instead of attributes so that they are indexed?
We are trying to make a site-wide search using sphinx. This means, that our search must look at all the main indexes and fields in them and return them by relevance.
This is the query:
SELECT *
FROM articles, users, genres
WHERE match ('@(articles.title, genres.title, articles.description, users.nickname) test_sting')
But this does not seem to work. Is there any way to search across multiple indexes and specify the fields that we want to search?
SELECT * FROM articles, users, genres WHERE match ('test_sting')
should just match all fields in all indexes; there is no need to specify the fields.
Otherwise you can use the barely documented @@relaxed operator:
SELECT * FROM articles, users, genres
WHERE match ('@@relaxed @(title, description, nickname) test_sting')
which should work. It will only search the named fields, but @@relaxed means it doesn't matter if a particular field doesn't exist in a particular index.
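If the field list changes per search, the match string can be assembled in code. The sketch below is only an illustration with a hypothetical helper name; it uses Sphinx's @@relaxed and @(field,field) operators and leaves escaping of the user's query text out of scope.

```python
def relaxed_match(fields, query):
    """Build a match string limited to the given fields (hypothetical helper).

    @@relaxed lets the same string run against indexes that lack
    some of the named fields.
    """
    field_list = ",".join(fields)
    return f"@@relaxed @({field_list}) {query}"


# Same fields and keyword as in the query above.
match_string = relaxed_match(["title", "description", "nickname"], "test_sting")
```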