Let's say I have a table structure like this:
name | groupId | locationId
And I use Sphinx's realtime indexing features to index the records in the table, with name as a full-text field (rt_field) and groupId and locationId as rt_attr_uint attributes.
I understand I can use SphinxQL to do a full-text search and use the attributes to filter results. So, for example, I could search for users that have the letters Bo in their name (e.g. Bob) and then filter results to only include records with groupId = 1 and locationId = 67.
However, my question is whether using SphinxQL is still appropriate if I only want to use the attributes to filter records and am not doing full-text searching, or if I should revert to just doing a MySQL query for that? For example, if I wanted to retrieve all records with groupId = 1 and locationId = 67, regardless of name.
I am asking instead of running performance tests myself because I've never used realtime indexing with Sphinx and am still trying to figure it out (actually, I'm still undecided on whether to use it..), so I really can't do tests myself at this time.
Or, going in a completely different direction, should I just make groupId and locationId fields instead of attributes so that they are indexed?
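For concreteness, the two kinds of query being compared can be sketched as SphinxQL strings built in Python. The index name users_rt and the helper build_query are hypothetical and only serve the illustration. SphinxQL does accept a SELECT that has attribute conditions and no MATCH() clause, which is the attribute-only case asked about here; whether it beats a MySQL query on the same data is not something this sketch can show.

```python
def build_query(index, filters, match=None):
    """Build a SphinxQL SELECT string (hypothetical helper).

    `filters` maps an attribute name to an unsigned integer value.
    `match` is an optional full-text expression for MATCH().
    """
    conditions = []
    if match is not None:
        # Escape backslashes and single quotes so the expression
        # cannot break out of the quoted literal.
        escaped = match.replace("\\", "\\\\").replace("'", "\\'")
        conditions.append(f"MATCH('{escaped}')")
    for attr, value in filters.items():
        # int() rejects anything that is not a plain number.
        conditions.append(f"{attr} = {int(value)}")
    where = " AND ".join(conditions)
    return f"SELECT id FROM {index}" + (f" WHERE {where}" if where else "")


# Attribute-only filtering, no full-text part.
attr_only = build_query("users_rt", {"groupId": 1, "locationId": 67})

# Full-text search combined with the same attribute filters.
with_text = build_query("users_rt", {"groupId": 1, "locationId": 67}, match="Bob")
```

Either string could then be sent over the MySQL protocol that searchd exposes for SphinxQL.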
Related
I have a document structure which looks something like this:
{
...
"groupedFieldKey": "groupedFieldVal",
"otherFieldKey": "otherFieldVal",
"filterFieldKey": "filterFieldVal"
...
}
I am trying to fetch one document per distinct value of groupedFieldKey. I also want to fetch otherFieldKey from ANY of the documents in each group. otherFieldKey varies slightly from one document to another, but I am comfortable with getting ANY of these values.
SELECT DISTINCT groupedFieldKey, otherFieldKey
FROM bucket
WHERE filterFieldKey = "filterFieldVal";
This query fetches all the documents because of the minor variations.
SELECT groupedFieldKey, maxOtherFieldKey
FROM bucket
WHERE filterFieldKey = "filterFieldVal"
GROUP BY groupedFieldKey
LETTING maxOtherFieldKey= MAX(otherFieldKey);
This query works as expected, but it takes a long time because of the GROUP BY step. As this query is used to show products in the UI, that is not acceptable. I have tried adding indexes, but they have not made it fast.
Actual details of the records:
Number of records = 100,000
Size per record = Approx 10 KB
Time taken to load the first 10 records: 3s
Is there a better way to do this? A way of applying DISTINCT to particular fields only would be good.
EDIT 1:
You can follow this discussion thread in Couchbase forum: https://forums.couchbase.com/t/getting-distinct-on-the-basis-of-a-field-with-other-fields/26458
GROUP BY must materialize all the qualifying documents. You can try a covering index:
CREATE INDEX ix1 ON bucket(filterFieldKey, groupedFieldKey, otherFieldKey);
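To make the expected result concrete, here is a small in-memory Python sketch of what the GROUP BY / MAX query computes: one row per groupedFieldKey, carrying the largest otherFieldKey seen for that key. It is not Couchbase code. The function name distinct_by_group and the sample documents are made up for illustration, and it can serve as a reference when checking the output of the indexed query.

```python
def distinct_by_group(docs, filter_value):
    """Return one row per groupedFieldKey with the maximum otherFieldKey.

    Mirrors the shape of the GROUP BY ... LETTING MAX(...) query,
    evaluated over a plain list of dicts.
    """
    best = {}
    for doc in docs:
        # Equivalent of: WHERE filterFieldKey = filter_value
        if doc.get("filterFieldKey") != filter_value:
            continue
        key = doc["groupedFieldKey"]
        other = doc["otherFieldKey"]
        # Keep the largest otherFieldKey per group (MAX aggregate).
        if key not in best or other > best[key]:
            best[key] = other
    return [
        {"groupedFieldKey": key, "maxOtherFieldKey": value}
        for key, value in best.items()
    ]


# Made-up sample data: two variants in group "a", one in "b",
# and one document that the filter excludes.
sample_docs = [
    {"groupedFieldKey": "a", "otherFieldKey": "x1", "filterFieldKey": "f"},
    {"groupedFieldKey": "a", "otherFieldKey": "x2", "filterFieldKey": "f"},
    {"groupedFieldKey": "b", "otherFieldKey": "y1", "filterFieldKey": "f"},
    {"groupedFieldKey": "c", "otherFieldKey": "z1", "filterFieldKey": "other"},
]
rows = distinct_by_group(sample_docs, "f")
```

Only three fields are read per document, which is the same property the covering index above tries to exploit.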
I'm using this PHP package to make queries - https://github.com/jenssegers/laravel-mongodb
The situation is, there are two fields, user_id and post_status, among others. I want to retrieve all the documents in that collection, but a document whose post_status is draft should be retrieved only when its user_id is a given string. The idea is that only the logged-in user sees their own draft posts among the other posts.
I'm having a hard time finding a solution for this problem. The app is not in production yet, so storing the data in some different manner is an option as well.
Find documents with a certain field value only when another field value is a given string
The question you are framing simply converts into an AND query. Let's see how.
when another field value is a given string
This means that you have some result set and you need to keep only the records whose user_id matches some string, i.e. some result set AND user_id = <id>
Now consider the first part of the sentence Find documents with a certain field value
This means you are filtering the records by some value, i.e. "status" = "draft", and then filtering whatever comes back again on the basis of user_id = <id>
So finally you will end up with the query below:
db.collectionName.find({"status":"draft", "user_id": "5c618615903aaa496d129d90"})
I hope this explanation helps. If not, rephrase your question and I will try to update my answer.
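If the question is instead read as "return every post, but include drafts only when they belong to the given user", the condition becomes an OR rather than an AND: a document qualifies when it is not a draft, or when it belongs to that user. Below is a minimal Python sketch of that reading. It uses the field names from the question (post_status, user_id); the function names are hypothetical, and matches() is only a pure-Python mirror of the filter for checking the logic.

```python
def visible_posts_filter(user_id):
    """MongoDB filter document: all non-draft posts, plus this user's drafts."""
    return {
        "$or": [
            {"post_status": {"$ne": "draft"}},
            {"user_id": user_id},
        ]
    }


def matches(doc, user_id):
    """Pure-Python mirror of visible_posts_filter, for checking the logic."""
    return doc.get("post_status") != "draft" or doc.get("user_id") == user_id


# Made-up sample documents.
posts = [
    {"_id": 1, "user_id": "u1", "post_status": "published"},
    {"_id": 2, "user_id": "u1", "post_status": "draft"},
    {"_id": 3, "user_id": "u2", "post_status": "draft"},
    {"_id": 4, "user_id": "u2", "post_status": "published"},
]
visible_to_u1 = [p["_id"] for p in posts if matches(p, "u1")]

# With a driver such as pymongo, the same filter would be passed as
# collection.find(visible_posts_filter("u1")); not executed here.
```

The same filter document should translate to the laravel-mongodb query builder as a nested where/orWhere group, though that mapping is an assumption and worth checking against the package's documentation.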
QUERYING MONGODB: RETRIEVE SHOPS BY NAME AND BY LOCATION WITH ONE SINGLE QUERY
Hi folks!
I'm building a "search shops" application using MEAN Stack.
I store shop documents in a MongoDB "location" collection like this:
{
_id: .....
name: ...//shop name
location : //...GEOJson
}
The UI gives users one single input for searching shops. Basically, I would like to perform one single query that returns, in the same results array:
All shops near the user (eventually limit to x)
All shops named "like" the input value
On the logical side, I think this is an "$or"-like query.
Based on this answer
Using full text search with geospatial index on Mongodb
assigning two special indexes (2dsphere and full text) to the collection is probably not the right way to achieve this. Still, I think mine is a different case, because I don't want to apply sequential filters to the results; I "simply" want to retrieve data by two distinct criteria.
If I do set both indexes on my collection, the obvious approach is to perform two distinct queries with two distinct operators ($near for location and $text for name), then merge the results with some server-side logic that removes duplicate documents and sorts them in a way that is useful to the user. I'm still wondering whether there is a way to achieve this with one single query.
So, the question is: is this possible, or is this kind of approach outside what MongoDB is meant for?
Hope this is clear and hope that someone can teach something today!
Thanks
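The two-query fallback described above needs a server-side merge step. Here is a minimal Python sketch of that step only, assuming the two result lists have already been fetched and that each document carries a hashable _id. The function name merge_results, the ordering rule (name matches first, then nearby shops), and the sample data are all assumptions made for illustration.

```python
def merge_results(named_shops, near_shops, limit=None):
    """Merge two result lists, dropping duplicates by _id.

    Name matches come first, followed by nearby shops that were not
    already included. The order inside each list is preserved.
    """
    seen = set()
    merged = []
    for shop in list(named_shops) + list(near_shops):
        shop_id = shop["_id"]
        if shop_id in seen:
            continue  # already included via the other query
        seen.add(shop_id)
        merged.append(shop)
    return merged if limit is None else merged[:limit]


# Made-up sample results from the two hypothetical queries.
by_name = [{"_id": "s2", "name": "Bike Hub"}, {"_id": "s5", "name": "Bike World"}]
nearby = [{"_id": "s1", "name": "Corner Cafe"}, {"_id": "s2", "name": "Bike Hub"}]
merged_ids = [shop["_id"] for shop in merge_results(by_name, nearby)]
```

Putting name matches first is only one possible ranking; a real application might instead interleave by distance or by text score.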
We are trying to build a site-wide search using Sphinx. This means that our search must look at all the main indexes and the fields in them, and return results by relevance.
This is the query:
SELECT *
FROM articles, users, genres
WHERE match ('@(articles.title, genres.title, articles.description, users.nickname) test_sting')
But this does not seem to work. Is there any way to search across multiple indexes and specify the fields that we want to search?
SELECT * FROM articles, users, genres WHERE match ('test_sting')
should just match all fields in all indexes, no need to specify the specific fields.
Otherwise you can use the barely documented @@relaxed operator:
SELECT * FROM articles, users, genres
WHERE match ('@@relaxed @(title, description, nickname) test_sting')
which should work. It will only search the named fields, but @@relaxed means it doesn't matter if a particular field doesn't exist in a particular index.
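As a rough illustration of how such a query could be assembled in application code, here is a small Python sketch. It relies on @ as the field operator of Sphinx's extended query syntax and on @@relaxed as written in that syntax; the helper name build_sitewide_query is hypothetical, and the index and field names are the ones from the question.

```python
def build_sitewide_query(indexes, fields, term):
    """Build a SphinxQL query over several indexes, limited to named fields.

    @@relaxed is placed first so that a field missing from one of the
    indexes does not make the whole query fail.
    """
    # Escape backslashes and single quotes so the term stays inside the literal.
    escaped = term.replace("\\", "\\\\").replace("'", "\\'")
    index_list = ", ".join(indexes)
    field_list = ", ".join(fields)
    return (
        f"SELECT * FROM {index_list} "
        f"WHERE MATCH('@@relaxed @({field_list}) {escaped}')"
    )


sitewide = build_sitewide_query(
    ["articles", "users", "genres"],
    ["title", "description", "nickname"],
    "test_sting",
)
```

Note that the field list uses bare field names; the index-qualified form from the question (articles.title and so on) is not part of it.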
Here's my issue:
I have 2 indexes:
A - product titles only
B - product titles and product descriptions
By default I search index A to categorize products (e.g. most bikes have "bike" in title).
Sometimes there are instances where, to determine the category (which might be a sub-category of something), we need to look at the description, mostly to exclude irrelevant results. For pagination on the search results page to work, I need to get this clean result as one array after running RunQueries().
But it does not work. It basically adds up the results of both queries, and it looks like there's no way to subtract results. Does anyone have any ideas?
Tell me if I'm completely missing something, but it sounds to me like you're trying to include results with product titles that match a certain query and exclude results with a description that matches another query?
If this is the case it seems to me that having 2 indexes is useless, and you can have one index with both product titles and descriptions and then run a full text search query as such:
@title queryA @description -queryB
You can use the same query to search for matches that have a title of queryA AND a description of queryB by simply removing the - symbol.
If this is off base the only other way I could think of doing it is using SphinxQL (I'm not well versed in any of the libraries since support for all the libraries which don't use SphinxQL is being phased out in the future as far as I've read)
Using SphinxQL you could run 2 queries, one which is like
SELECT id FROM indexB WHERE MATCH('@description queryB')
And then run a second query using the list of ids you got from the first query, as such:
SELECT id FROM indexA WHERE id NOT IN(id1,id2,id3,...)
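The second step of that two-query approach can be sketched in Python as follows. The helper name build_exclusion_query is hypothetical, and the ids are assumed to have been collected from the first query's result. The optional title match is an addition for illustration, since in practice the second query would presumably also carry the title search.

```python
def build_exclusion_query(index, excluded_ids, title_match=None):
    """Build the second SphinxQL query, excluding ids found by the first."""
    conditions = []
    if title_match is not None:
        # Escape backslashes and single quotes inside the MATCH() literal.
        escaped = title_match.replace("\\", "\\\\").replace("'", "\\'")
        conditions.append(f"MATCH('@title {escaped}')")
    # int() ensures only numeric document ids end up in the query text.
    ids = ", ".join(str(int(doc_id)) for doc_id in excluded_ids)
    if ids:
        conditions.append(f"id NOT IN ({ids})")
    where = " AND ".join(conditions)
    return f"SELECT id FROM {index}" + (f" WHERE {where}" if where else "")


# ids as they might come back from the first query (made-up values)
second_query = build_exclusion_query("indexA", [11, 42, 97], title_match="bike")
```

With a long exclusion list the query text grows accordingly, so the single-index form with the negated description term is likely simpler when it applies.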