I'm using Sphinx for document search. Each document has a list of integer parameters, such as "length", "publication date (unix)", popularity, and so on.
The search itself works fine. But is there a way to get the maximum and minimum values of a field for a specified search query?
The main purpose is to generate a search form with filter fields, so the user can select a range for the document's length.
Or maybe there is another way to solve this problem?
It is possible if length, date, etc. are defined as attributes.
http://www.sphinxsearch.com/docs/current.html#attributes
Attributes are additional values associated with each document that can be used to perform additional filtering and sorting during search.
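Try grouping (GroupBy) on 'length' and selecting min(length), max(length).
In SphinxQL it looks like:
select min(length), max(length) from index_123 group by length
Note that grouping by length itself returns one row per distinct length, with min equal to max in each row; to get a single overall range you would probably need to group by a constant expression instead.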
The same for other attributes.
I'm using this PHP package to make queries - https://github.com/jenssegers/laravel-mongodb
The situation is, there are two fields, user_id and post_status, among others. I want to retrieve all the documents in that collection, but when the post_status field value is draft, the document should be retrieved only when user_id is a given string. The idea is that only the logged-in user finds their own drafted posts among the other posts.
I'm having a hard time finding any solution for this problem. The app is still not in production. If I should store the data in some different manner, that is an option as well.
Find documents with a certain field value only when another field value is a given string
The question as you have framed it simply converts into an AND query. Let's see how.
when another field value is a given string
This means that you already have some result set and need to filter it down to the documents where user_id matches a given string, i.e. some result set AND user_id = <id>
Now consider the first part of the sentence: Find documents with a certain field value
This means you are filtering the records on some value, i.e. "post_status" = "draft", and then filtering whatever comes back again on user_id = <id>
So finally you end up with the query below:
db.collectionName.find({"post_status":"draft", "user_id": "5c618615903aaa496d129d90"})
Hope this explanation helps. If not, rephrase your question and I will try to modify my answer.
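If the goal is the one described in the question body (every non-draft post, plus drafts only for the logged-in user), an AND query alone returns just that user's drafts. One possible way to express the combined condition is with MongoDB's $or operator. The sketch below is an illustration under assumptions: the field names post_status and user_id come from the question, while visiblePostsFilter, matches, and the sample documents are hypothetical and exist only so the filter can be tried without a database.

```javascript
// Sketch only: field names follow the question (post_status, user_id).
// visiblePostsFilter and matches are hypothetical helpers, not library APIs.
function visiblePostsFilter(userId) {
  return {
    $or: [
      { post_status: { $ne: "draft" } },         // every post that is not a draft
      { post_status: "draft", user_id: userId }, // plus this user's own drafts
    ],
  };
}

// Tiny in-memory matcher that understands only $or, $ne and plain equality,
// which is enough to exercise the filter above without a MongoDB server.
function matches(doc, filter) {
  return Object.entries(filter).every(([key, cond]) => {
    if (key === "$or") return cond.some((sub) => matches(doc, sub));
    if (cond !== null && typeof cond === "object" && "$ne" in cond) {
      return doc[key] !== cond.$ne;
    }
    return doc[key] === cond;
  });
}

// Made-up sample data for illustration.
const posts = [
  { _id: 1, user_id: "u1", post_status: "published" },
  { _id: 2, user_id: "u1", post_status: "draft" },
  { _id: 3, user_id: "u2", post_status: "draft" },
  { _id: 4, user_id: "u2", post_status: "published" },
];

const visibleIds = posts
  .filter((p) => matches(p, visiblePostsFilter("u1")))
  .map((p) => p._id);
console.log(visibleIds); // [ 1, 2, 4 ]
```

In the mongo shell the same filter object could be passed straight to find, e.g. db.collectionName.find(visiblePostsFilter("<id>")).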
I know that the ATTR function is used for aggregation, but can someone explain it in simple terms?
In the simplest terms, ATTR returns a value if it is unique, otherwise it returns "*". I think you'll find this link, which has examples, helpful.
https://www.interworks.com/blog/tcostello/2014/05/15/attr-tableaus-attribute-function-explained
You cannot mix aggregate and non-aggregate comparisons in Tableau; you have to use ATTR, e.g. if ATTR(segment) = 'Corporate' then sum(sales)
ATTR is like using an already aggregated field for comparison with another aggregated field. Measures are treated as aggregated fields, while dimensions aren't. If you have created a field that is already aggregated and you still want to use it as a measure, it will be shown as ATTR: it cannot be aggregated further, but it behaves as if it had been.
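The "unique value, otherwise *" rule can be sketched outside Tableau. The function below is plain JavaScript, not Tableau syntax; the name attr and the choice to return null for an empty group are assumptions made for this illustration.

```javascript
// Sketch of the rule ATTR follows: if every row in the group has the same
// value, return that value; otherwise return "*".
// Returning null for an empty group is an assumption of this sketch.
function attr(values) {
  if (values.length === 0) return null;
  const first = values[0];
  return values.every((v) => v === first) ? first : "*";
}

console.log(attr(["Corporate", "Corporate"])); // Corporate
console.log(attr(["Corporate", "Consumer"]));  // *
```

This is why a comparison such as ATTR(segment) = 'Corporate' only holds for groups in which every row has the segment Corporate.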
I'm trying to create a query that excludes all documents which have an empty/null value in one specific field.
What is the query syntax or the programmatic way to do this?
You can use a required range query, which is open at both sides, like:
+field:[* TO *]
That is probably adequate, assuming that the documents to exclude have no value in the index.
If some form of default value appears, you would have to exclude that value as well, like:
+field:[* TO *] -field:NULL
I am using Sphinx 2.0.
I want to achieve the following results:
The user will input tags along with other search terms. Documents associated with the input tags should come on top, sorted by distance.
After that come the documents that do not contain those tags, also sorted by distance.
What I am doing:
I am searching on different fields at the same time using @name, @tag, @streetname, etc., so I am using the following:
$cl->SetMatchMode(SPH_MATCH_EXTENDED);
and sorting the result by distance using $cl->SetSortMode(SPH_SORT_EXTENDED, '@geodist asc');
The tag field can contain multiple values, so I am using the OR operator to get the desired results.
If I search only on @tag, I get the behaviour described above. But if the user input is @tag food|dinner @city london @name taxi
then a result with name: London Taxi, street: London comes on top or in some other position, breaking the sort order by lat-long, because London appears in two fields. I just want to sort by tag and do not want the weight of the other search terms to affect the sort order.
The ranking mode is: $cl->setRankingMode(SPH_RANK_PROXIMITY_BM25);
Any suggestions to overcome this issue, or any other way to implement it?
Many Thanks.
I think the way to solve this would be to arrange for matches on the tag field to rank way way higher. Would have to test it but something like this...
$cl->setFieldWeights(array('tags' => 100000));
$cl->setSelect("*,IF(@weight>100000,1,0) AS matchtags");
$cl->SetSortMode(SPH_SORT_EXTENDED, 'matchtags DESC, @geodist ASC');
According to the documentation, Zend Lucene is supposed to sort lexicographically. I am finding this is not the case. If I have a query 'avg:[050 TO 300]', yes it will return all values in that range, but it will sort them according to the document id, not the value.
I have found that the find() function can accept additional parameters, allowing me to sort by a specific column (eg $hits = $index->find($query, 'avg', SORT_NUMERIC, SORT_ASC);). However, I am creating $query dynamically and do not want to sort every search by 'avg'.
How do I force Lucene to sort the results automatically, lexicographically, when I do a range search? And if that's not possible, how do I dynamically add a sort field to the find function?
Why don't you sort $hits yourself after getting the result from $index->find(...)? OK, this looks like a workaround and will be time-consuming for very large result sets, but I guess it is the easiest way in most cases.