Advanced elasticsearch query - mongodb

I am using Laravel 4.2, MongoDB and Elasticsearch. Below is working code; I am trying to convert these advanced where queries to Elasticsearch queries:
$products = Product::where(function ($query) {
    $query->where(function ($subquery1) {
        $subquery1->where('status', '=', 'discontinued')->where('inventory', '>', 0);
    });
    $query->orWhere(function ($subquery2) {
        $subquery2->where('status', '<>', 'discontinued');
    });
})->get();
So far I can only return discontinued products. The code below works, but it is not what I need:
$must = [
    ['bool' =>
        ['should' =>
            ['term' =>
                ['status' => 'discontinued']
            ]
        ]
    ]
];
Can you show me how I can achieve the same query I described above, but in Elasticsearch? I want to return discontinued products that still have inventory, plus all products whose status is not discontinued.

The WHERE query you've described can be expressed in SQL like this:
... WHERE (status = 'discontinued' AND inventory > 0)
       OR status <> 'discontinued'
In Elasticsearch Query DSL, this can be expressed like this:
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "should": [
            {
              "bool": {
                "must": [
                  {
                    "term": {
                      "status": "discontinued"
                    }
                  },
                  {
                    "range": {
                      "inventory": {
                        "gt": 0
                      }
                    }
                  }
                ]
              }
            },
            {
              "bool": {
                "must_not": [
                  {
                    "term": {
                      "status": "discontinued"
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}
Translating this query into PHP should now be straightforward. Give it a try.
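As a rough sketch of that translation (in Python rather than PHP, purely for illustration), the same filter can be built as a nested dictionary. The helper name build_product_query is hypothetical. The sketch also swaps "filtered" for a plain "bool" query with a "filter" clause, on the assumption that you are on a version where "filtered" is no longer available (it was deprecated in 2.0 and removed in 5.0).

```python
import json


def build_product_query():
    """Build (status = discontinued AND inventory > 0) OR status <> discontinued."""
    discontinued = {"term": {"status": "discontinued"}}
    in_stock = {"range": {"inventory": {"gt": 0}}}
    return {
        "query": {
            "bool": {
                # filter context: no scoring, only yes/no matching
                "filter": {
                    "bool": {
                        "should": [
                            # branch 1: discontinued AND inventory > 0
                            {"bool": {"must": [discontinued, in_stock]}},
                            # branch 2: anything that is NOT discontinued
                            {"bool": {"must_not": [discontinued]}},
                        ],
                        # at least one of the two branches has to match
                        "minimum_should_match": 1,
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_product_query(), indent=2))
```

Since the second branch already covers every non-discontinued product, the condition is logically the same as status <> discontinued OR inventory > 0, so the first branch could probably be reduced to the range clause alone.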

Atlas Search works too slow when using facet

I have a big collection (over 22M records, approx. 25GB) on an M10 cluster with MongoDB version 4.4.10. I set up an Atlas Search index on one field (address), and it is pretty fast when I query through the search tester. However, when I try to paginate by adding a $facet stage, it becomes extremely slow compared with the query without the facet. Is there a way to optimize the facet, or to replace it with something faster? Below are the plain query and the one with the facet:
db.getCollection("users").aggregate([{
$search: {
index: 'address',
text: {
query: '7148 BIG WOODS DR',
path: {
'wildcard': '*'
}
}
}
}]);
db.getCollection("users").aggregate([{
$search: {
index: 'address',
text: {
query: '7148 BIG WOODS DR',
path: {
'wildcard': '*'
}
}
}
}, {
$facet: {
paginatedResult: [
{
$limit: 50
},
{
$skip: 0
}
],
totalCount: [
{
$count: 'total'
}
]
}
}]);
The fast, recommended way is to use facet with the $searchMeta stage, which retrieves only the metadata results for the query:
{
"$searchMeta": {
"index":"search_index_with_facet_fields",
"facet":{
"operator":{
"compound":{
"must":[
{
"text":{
"query":"red shirt",
"path":{
"wildcard":"*"
}
}
},
{
"compound":{
"filter":[
{
"text":{
"query":["clothes"],
"path":"category"
}
},
{
"text":{
"query":[
"maroon",
"blackandred",
"blackred",
"crimson",
"burgandy",
"burgundy"
],
"path":"color"
}
}
]
}
}
]
}
},
"facets":{
"brand":{
"type":"string",
"path":"brand"
},
"size":{
"type":"string",
"path":"size"
},
"color":{
"type":"string",
"path":"color"
}
}
}
}
}
Here we fetch three facets (brand, size, and color), which need to be defined as facet fields in the search index, for example:
{
"mappings": {
"dynamic": false,
"fields": {
"category": [
{
"type": "string"
}
],
"brand": [
{
"type": "string"
},
{
"type": "stringFacet"
}
],
"size": [
{
"type": "string"
},
{
"type": "stringFacet"
}
],
"color": [
{
"type": "string"
},
{
"type": "stringFacet"
}
]
}
}
}
category is defined only as string because we use it only as a filter field, not in facets.
We can also replace the filter clause with must or should, depending on our requirements.
Finally, we get the facet results (a count for each bucket of brand, size, and color).
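For the pagination case in the question, a minimal sketch in Python (pipelines written as plain dictionaries, the way you would hand them to a driver's aggregate call) might split the work in two: one pipeline that pages through the hits with $skip and $limit directly after $search, and one $searchMeta pipeline that asks only for the count. The function names are hypothetical, the index name and text operator are copied from the question, and the "count" option is assumed to be available on your cluster version.

```python
def search_body(query, index="address"):
    """Shared $search / $searchMeta body, mirroring the question's text operator."""
    return {
        "index": index,
        "text": {"query": query, "path": {"wildcard": "*"}},
    }


def page_pipeline(query, page, page_size=50):
    """Pipeline returning one page of hits (page numbers start at 1)."""
    if page < 1:
        raise ValueError("page must be >= 1")
    return [
        {"$search": search_body(query)},
        # skip first, then limit; the other order breaks every page after the first
        {"$skip": (page - 1) * page_size},
        {"$limit": page_size},
    ]


def count_pipeline(query):
    """Pipeline returning only metadata, here the total number of matches."""
    body = search_body(query)
    body["count"] = {"type": "total"}  # assumption: count option is supported
    return [{"$searchMeta": body}]
```

Note that $skip comes before $limit here. With $limit: 50 first, as in the question, any page after the first would come back empty.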

How to create query in Elastic4s

I'm implementing a query with the Elastic4s library, but I don't know how to express the following JSON query for Elasticsearch.
{
"bool": {
"must": [
{
"match_all": {}
},
{
"keywordQuery": "hogehoge"
}
]
}
}
Specifically, I don't know how to implement this part of the JSON query:
{
"keywordQuery": "hogehoge"
}
This is the code I have so far:
boolQuery().must(Seq(matchAllQuery(), query("{keywordQuery: hogehoge}")))
and this is the output of the code above:
{
"bool": {
"must": [
{
"match_all": {}
},
{ "queryString": {
"query": "{keywordQuery": "hogehoge}"
}
}
]
}
}
I expect:
{
"keywordQuery": "hogehoge"
}
but I actually get:
{ "queryString": {
"query": "{keywordQuery": "hogehoge}"
}
}
Would you help me please?
I can't find a reference to keywordQuery in the ElasticSearch DSL documentation at https://www.elastic.co/guide/en/elasticsearch/reference/6.8/query-dsl.html or https://www.elastic.co/guide/en/elasticsearch/reference/master/query-dsl.html - maybe you need a Term query?
(e.g. on Logstash indices, 'text' fields have a non-analysed subfield called '.keyword', so when I do a "keyword query" I normally use termQuery("field.keyword", "value"))
I don't think you need to include matchAllQuery() as it's kinda implied that you start off with the full set of results, so you could drop the bool and simplify the query to:
{
"query": {
"term": {
"field.keyword": "value"
}
}
}
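In Elastic4s this would be something like the following ("myindex" is a placeholder for your index name):
client.execute {
  search("myindex").query(termQuery("field.keyword", "value"))
}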
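For comparison, the request body such a term query should serialize to can be sketched in Python. The helper keyword_term_query and the field name are placeholders for illustration only; they are not part of Elastic4s.

```python
import json


def keyword_term_query(field, value):
    """Exact-match term query against the non-analysed .keyword subfield."""
    return {"query": {"term": {f"{field}.keyword": value}}}


if __name__ == "__main__":
    # Compare this with the JSON your Elastic4s request produces.
    print(json.dumps(keyword_term_query("field", "value"), indent=2))
```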

ElasticSearch - Get different types from different indices

I have two indices: A and B.
A has the following types: car, motorbike and van.
B has the following types: bus, car and pickup.
I want to be able to have a single query which gets motorbike and van from A and car and pickup from B.
I want to use a filter to do this and currently, I have:
.filter(
not(
should(
termsQuery("key", Seq("car", "bus"))
)
)
)
But obviously, this will filter car for both indices. I know I can do two separate queries for each index and filter different types for each but I want to avoid this if possible.
Is it possible to do what I am trying to do in a single query?
You can search on index and type by using the special fields _index and _type so once you know that, it's just a matter of putting together a boolean query.
search("_all").query(
boolQuery().should(
boolQuery().must(
termQuery("_index", "A"),
termsQuery("_type", "motorbike", "van")
),
boolQuery().must(
termQuery("_index", "B"),
termsQuery("_type", "car", "pickup")
)
)
)
You can do something like this.
GET _search
{
"query": {
"bool": {
"should": [
{
"bool": {
"filter": [
{
"term": {
"_index": {
"value": "A"
}
}
},
{
"terms": {
"_type": ["motorbike","van"]
}
}
]
}
},
{
"bool": {
"filter": [
{
"term": {
"_index": {
"value": "B"
}
}
},
{
"terms": {
"_type": ["car","pickup"]
}
}
]
}
}
]
}
}
}
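If the index-to-types mapping changes often, the same request body can be generated rather than written by hand. Below is a small Python sketch; the function names are hypothetical, and it assumes an Elasticsearch version in which _type can still be queried (mapping types were removed in later major versions).

```python
def index_type_clause(index, types):
    """One branch: documents in `index` whose _type is one of `types`."""
    return {
        "bool": {
            "filter": [
                {"term": {"_index": {"value": index}}},
                {"terms": {"_type": list(types)}},
            ]
        }
    }


def multi_index_query(wanted):
    """OR together one branch per index; `wanted` maps index name -> types."""
    return {
        "query": {
            "bool": {
                "should": [index_type_clause(i, t) for i, t in wanted.items()]
            }
        }
    }


body = multi_index_query({"A": ["motorbike", "van"], "B": ["car", "pickup"]})
```

The resulting body has the same shape as the query above and can be sent to the _search endpoint with any client.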

elastic4s: score stays at 1 with rawQuery

We're using elastic4s for ElasticSearch 2.2.0. A number of queries are stored as JSON on disk and used as rawQuery via the elastic4s driver. The score in the result differs depending on whether the query is submitted via the command line or via the elastic4s driver: the driver always returns a score of 1 for all results, while the command-line execution yields two different scores (for different data types).
The code for elastic4s:
val searchResult = client.execute {
search in indexName types(product, company, orga, "User", "Workplace") rawQuery preparedQuery sourceInclude(preparedSourceField:_*) sort {sortDefintions:_*} start start limit limit
}.await
Note that I removed anything but rawQuery preparedQuery and it didn't change the score 1. The full query via the command line is quite long:
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "${search}",
"fields": [
"name",
"abbreviation",
"articleNumberManufacturer",
"productLine",
"productTitle^10",
"productSubtitle",
"productDescription",
"manufacturerRef.name",
"props"
]
}
}
],
"filter": [
{
"or": [
{
"bool": {
"must": [
{
"type": {
"value": "Product"
}
},
{
"term": {
"publishState": "published"
}
}
],
"must_not": [
{
"term": {
"productType": "MASTER"
}
},
{
"term": {
"deleted": true
}
}
]
}
}
]
}
]
}
}
}
Note that this is almost preparedQuery but for the replacement of $search with the search query. The elastic search REST client returns a score of 3.075806 for the matches.
elastic4s rawQuery wraps your raw JSON in another query object, so it is as if you were sending:
{ "query": { "query": {
"bool": {
"must": [
{
"multi_match": {
"query": "${search}",
...
Just remove the wrapping "query" from your JSON and the response will show varying scores.
Alternatively, you can try extraSource instead of rawQuery, as described in the elastic4s documentation, although it didn't work for me at all:
ErrorMessage:
value extraSource is not a member of com.sksamuel.elastic4s.SearchDefinition
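If the stored JSON files also have to stay usable from the command line, one option is to strip the outer "query" wrapper only at load time, just before the string is handed to rawQuery. Here is a minimal sketch of that preprocessing step in Python; unwrap_query is a hypothetical helper, not something elastic4s provides.

```python
import json


def unwrap_query(raw_json):
    """Return the inner query if the JSON is exactly {"query": {...}}.

    Anything else (already unwrapped, or carrying extra top-level keys
    such as "sort") is returned unchanged.
    """
    body = json.loads(raw_json)
    if isinstance(body, dict) and set(body) == {"query"}:
        return json.dumps(body["query"])
    return raw_json


stored = '{"query": {"bool": {"must": [{"match_all": {}}]}}}'
unwrapped = unwrap_query(stored)
```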

Date Filter on elastic search

I'm trying to create a Range Filter on elastic search using the following syntax:
{
"size": 100,
"filter": {
"and": {
"filters": [
{
"range": {
"listingDate": {
"gt": "15/07/2017 16:08:53"
}
}
}
]
}
}
}
The data format is:
"listingDate": "07/07/2015 09:30:00",
However, regardless of the filter values, Elasticsearch returns the same incorrect results. I have tried adding the following format:
"format": "dd/MM/yyyy HH:mm:ss"
but I get the same incorrect results.
A fuller example is:
{
"size": 100,
"sort": [
{
"listingDate": {
"order": "asc"
}
}
],
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "Event"
}
},
{
"range": {
"listingDate": {
"gte": "15/07/2015 16:08:53"
}
}
},
{
"range": {
"endDate": {
"gte": "15/07/2015 16:08:53"
}
}
}
]
}
},
"filter": {
"and": {
"filters": [
{
"terms": {
"departments": [
"2393"
]
}
}
]
}
}
}
In JSON documents, dates are represented as strings. Elasticsearch uses a set of preconfigured formats to recognize and parse these strings into a long value representing milliseconds since the epoch in UTC. The format of your date field (dd/MM/yyyy HH:mm:ss) may not be among those preconfigured formats.
By default, formatted dates are parsed using the format specified in the date field's mapping, but you can override it by passing the format parameter to the range query:
{
"range" : {
"listingDate" : {
"gte": "07/07/2015 09:30:00",
"format": "dd/MM/yyyy HH:mm:ss"
}
}
}
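A small Python sketch of building that clause is shown below. It checks the input string on the client side before sending it, so a value in the wrong layout fails early instead of producing odd results. The helper name date_range is hypothetical, and PY_FORMAT is my assumed strptime equivalent of the Elasticsearch pattern dd/MM/yyyy HH:mm:ss.

```python
from datetime import datetime

ES_FORMAT = "dd/MM/yyyy HH:mm:ss"  # pattern passed to Elasticsearch
PY_FORMAT = "%d/%m/%Y %H:%M:%S"    # assumed strptime equivalent


def date_range(field, gte):
    """Build a range clause with an explicit format for `field`."""
    # Raises ValueError if `gte` is not in the expected layout.
    datetime.strptime(gte, PY_FORMAT)
    return {"range": {field: {"gte": gte, "format": ES_FORMAT}}}


clause = date_range("listingDate", "07/07/2015 09:30:00")
```

It is also worth checking the mapping of listingDate. If the field was indexed as a string rather than a date, range comparisons are lexicographic, and with a day-first layout that ordering does not follow the calendar, which could explain the unexpected results.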
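Suppose the "arr" argument carries a date range in arr.date_from, e.g. ["2019-07-10","2019-07-11"]:
let start_date_query;
let range = [];
// Only build the query when a second (end) date is present.
if (arr.date_from) {
  if (arr.date_from[1]) {
    // Lower bound: start_date on or after the first date.
    range.push({
      "range": {
        "start_date": { "gte": arr.date_from[0] }
      }
    });
    // Upper bound: end_date on or before the second date.
    range.push({
      "range": {
        "end_date": { "lte": arr.date_from[1] }
      }
    });
    // Both range clauses must match; constant_score skips relevance scoring.
    start_date_query = {
      "query": {
        "constant_score": {
          "filter": {
            "bool": {
              "must": range
            }
          }
        }
      }
    };
  }
}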