Get inserted document counts in a specific date range using a date histogram in Elasticsearch

I have a list of documents in Elasticsearch that contain various fields.
The documents look like this:
[
{
"role": "api_user",
"apikey": "key1",
"data":{},
"#timestamp": "2021-10-06T16:47:13.555Z"
},
{
"role": "api_user",
"apikey": "key1",
"data":{},
"#timestamp": "2021-10-06T18:00:00.555Z"
},
{
"role": "api_user",
"apikey": "key1",
"data":{},
"#timestamp": "2021-10-07T13:47:13.555Z"
}
]
I want to find the number of documents in a specific date range at a 1-day interval, say from
2021-10-05T00:47:13.555Z to 2021-10-08T00:13:13.555Z.
I am trying the aggregation below:
{
"size": 0,
"query": {
"filter": {
"bool": {
"must": [
{
"range": {
"#timestamp": {
"gte": "2021-10-05T00:47:13.555Z",
"lte": "2021-10-08T00:13:13.555Z",
"format": "strict_date_optional_time"
}
}
}
]
}
}
},
"aggs": {
"data": {
"date_histogram": {
"field": "#timestamp",
"calendar_interval": "day"
}
}
}
}
The expected output is:
For 2021-10-06 I should get 2 documents, for 2021-10-07 I should get 1 document, and for days with no documents I should get a count of 0.

The solution below works:
{
"size":0,
"query":{
"bool":{
"must":[
],
"filter":[
{
"match_all":{
}
},
{
"range":{
"#timestamp":{
"gte":"2021-10-05T00:47:13.555Z",
"lte":"2021-10-08T00:13:13.555Z",
"format":"strict_date_optional_time"
}
}
}
],
"should":[
],
"must_not":[
]
}
},
"aggs":{
"data":{
"date_histogram":{
"field":"#timestamp",
"fixed_interval":"12h",
"time_zone":"Asia/Calcutta",
"min_doc_count":1
}
}
}
}
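Note that the request above buckets in 12-hour steps and, with min_doc_count set to 1, leaves out empty buckets. If one-day buckets with explicit zero counts are wanted, as described in the question, something like the following may be closer. This is only a sketch: the helper name daily_count_request is made up for illustration, and it assumes an Elasticsearch version that supports calendar_interval. It builds the request body as a plain dict, which could then be passed to whatever client you use.

```python
def daily_count_request(field, gte, lte):
    """Build a search body that counts documents per calendar day.

    Hypothetical helper for illustration; it only builds the JSON body.
    """
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"range": {field: {"gte": gte, "lte": lte}}}
                ]
            }
        },
        "aggs": {
            "data": {
                "date_histogram": {
                    "field": field,
                    "calendar_interval": "day",
                    # 0 keeps buckets that contain no documents
                    "min_doc_count": 0,
                    # forces buckets for days before the first / after the last document
                    "extended_bounds": {"min": gte, "max": lte},
                }
            }
        },
    }


body = daily_count_request(
    "#timestamp", "2021-10-05T00:47:13.555Z", "2021-10-08T00:13:13.555Z"
)
```

min_doc_count: 0 on its own only fills gaps between the first and last bucket that hold data; extended_bounds is what extends the histogram to the edges of the requested range.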

Related

How to sort OpenSearch results that have the same score?

I want to set secondary sorting criteria, without deactivating the default behavior, which is sorting by relevance score. In the documentation, all the examples seem to deactivate the default behavior and then sort by the chosen field.
Documentation also gives examples of setting several sorting criteria (the sort attribute of the query is an array of sorting criteria), but I don't see a mention of how to set the first one as sorting by relevance score.
The track_scores option allows me to see the relevance score of each hit, but I would like to actually use the score as the first ordering rule, and use the other criterion only for results that have the same relevance score.
You can sort by more than one criterion. The second sort key is applied only when the values of the first one (here, _score) are equal.
Here is an example:
POST test_stackoverflow_us/_bulk?refresh=true&pretty
{ "index": {}}
{"name":"obama a", "countryCode":"us", "rating":5}
{ "index": {}}
{"name":"obama b", "countryCode":"us", "rating":4}
{ "index": {}}
{"name":"obama ac", "countryCode":"ar", "rating":3}
{ "index": {}}
{"name":"obama ess", "countryCode":"es", "rating":3.5}
GET test_stackoverflow_us/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"match_phrase_prefix": {
"name": {
"query": "obama"
}
}
}
],
"boost": 2
}
}
],
"should": [
{
"term": {
"countryCode": {
"value": "us",
"boost": 4
}
}
},
{
"term": {
"countryCode": {
"value": "ar",
"boost": 3
}
}
},
{
"term": {
"countryCode": {
"value": "es",
"boost": 2
}
}
}
]
}
},
"size": 50,
"sort": [
{
"_score": {
"order": "desc"
}
},
{
"rating": {
"order": "desc"
}
}
]
}
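To see what the two-key sort does, here is a small local sketch in Python that mimics the ordering outside of OpenSearch. The score values are invented for illustration (real scores depend on the index), and sort_hits is a hypothetical helper, not part of any client library.

```python
# Hypothetical hits; the "score" values are made up for this example.
hits = [
    {"name": "obama a", "score": 3.0, "rating": 4},
    {"name": "obama b", "score": 3.0, "rating": 5},
    {"name": "obama ac", "score": 1.0, "rating": 3},
    {"name": "obama ess", "score": 2.0, "rating": 3.5},
]


def sort_hits(hits):
    """Order by score descending, then by rating descending on ties."""
    return sorted(hits, key=lambda h: (-h["score"], -h["rating"]))


ordered_names = [h["name"] for h in sort_hits(hits)]
```

The two hits with the same score keep their relative order by score, and rating only decides between them.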

Check for missing field or null value in MongoDB Atlas

I am using MongoDB Atlas for full-text search.
My sample collection looks like this:
{
"_id": "62fdfd7518da050007f035c5",
"expiryDate": "2022-08-18T23:59:59+05:30",
"arrayField" : ['abc', 'def', 'ghi', 'jkl']
},
{
"_id": "62fdfd7518da050007f035c6",
"expiryDate": null,
"arrayField" : ['abc','jkl']
},
{
"_id": "62fdfd7518da050007f035c7",
"arrayField" : []
},
{
"_id": "62fdfd7518da050007f035c8",
"expiryDate": null
}
expiryDate is a Date type field and arrayField is an Array type field.
My goal is to get all documents where either:
expiryDate doesn't exist, OR
expiryDate exists and is null, OR
expiryDate exists and is greater than the current time.
My current Atlas aggregation looks like this:
{
'compound' : {
'should' : [
{
'compound' : {
'mustNot' : [{
"exists": {
"path": "expiryDate",
}
}]
}
},
{
"range": {
"path": "expiryDate",
'gte': new Date()
}
}
],
'minimumShouldMatch' : 1
}
}
This does not return the documents where the expiryDate field has a null value; it only matches the should clause where expiryDate is greater than or equal to the current time. I want it to also return the documents where expiryDate is null.
Please advise.
You can use the $exists operator to check whether a field exists and, if it does, run a check on its value.
So I tried multiple clauses and approaches and found that there are two solutions to this problem:
Use a combination of must, should and mustNot:
{
'compound' : {
'should' : [
{
'compound' : {
'mustNot' : [{
"exists": {
"path": "expiryDate",
}
}]
}
},
{
"compound": {
"must": [
{
"exists": {
"path": "expiryDate"
}
}
],
"mustNot": [
{
"range": {
"path": "expiryDate",
"lt": new Date()
}
}
]
}
},
{
"range": {
"path": "expiryDate",
'gte': new Date()
}
}
],
'minimumShouldMatch' : 1
}
}
The second one is less optimized but works. Since this is ultimately an aggregation, we can add a $match stage right after the $search stage, like so:
db.exampleCollection.aggregate([
{
"$search": {
"index": "default",
"compound": {
"must": [
...some conditions
],
"filter": [
...some clauses
]
}
}
},
{
"$match": { ...some other conditions }
},
{
"$project": {
...some fields
}
},
{
"$skip": 0
},
{
"$limit": 10
},
{
"$sort": {
"score": 1
}
}
])
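For the $match route, one possible shape of the condition is sketched below in Python. It relies on the MongoDB query behaviour that matching a field against null also matches documents where the field is missing. The helper names expiry_match_stage and is_live are made up for this example, and it assumes expiryDate is stored as a real date rather than a string.

```python
from datetime import datetime, timezone


def expiry_match_stage(now):
    """Build a $match stage: expiryDate missing, null, or not yet passed."""
    return {
        "$match": {
            "$or": [
                {"expiryDate": None},           # matches null and missing fields
                {"expiryDate": {"$gte": now}},  # still in the future
            ]
        }
    }


def is_live(doc, now):
    """Local mirror of the same rule, handy for checking sample documents."""
    expiry = doc.get("expiryDate")
    return expiry is None or expiry >= now


now = datetime(2022, 8, 18, tzinfo=timezone.utc)
stage = expiry_match_stage(now)
```

The stage would go right after $search in the pipeline above; is_live is only there to sanity-check the rule against sample data without a database.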
Hope it helps someone 🙂
@PawanSaxena I'm trying to do something similar, i.e., finding documents whose date field exists with a null value. Did you test just the second goal from your original post? Somehow it didn't work for me. Here is my test code:
{
"compound": {
"must": [
{
"exists": {
"path": "expiryDate"
}
}
],
"mustNot": [ /// should it be "must"?
{
"range": {
"path": "expiryDate",
"lt": new Date() /// see below _MONGOSH, new Date() will output current date
// "lt": 1 /// this doesn't work, either
}
}
]
}
}

Need to fetch phone number from table abc in MongoDB

{
"id": "1234",
"applicant": [
{
"phone": [
{
"prirotynumber": "1",
"areacode": "407",
"linenumber": "1234",
"exchangenumber": "7899"
},
{
"prirotynumber": "27",
"areacode": "407",
"linenumber": "1234",
"exchangenumber": "79999"
}
]
}
]
}
For this id=1234 I need to fetch homephonenumber as applicant.phone.areacode + applicant.phone.linenumber + applicant.phone.exchangenumber if prirotynumber=1
and
cellphone as applicant.phone.areacode + applicant.phone.linenumber + applicant.phone.exchangenumber if prirotynumber=27
Expected result here:
{
"key":"value"
}
If this isn't what you need, please clarify your expected result with the right sample data.
db.collection.aggregate([
{
"$match": {
"id": "1234",
"applicant.phone.prirotynumber": "1"
}
},
{
"$unwind": "$applicant"
},
{
"$unwind": "$applicant.phone"
},
{
"$match": {
"applicant.phone.prirotynumber": "1"
}
},
{
"$set": {
"homePhoneNumber": {
$concat: [
"$applicant.phone.areacode",
"-",
"$applicant.phone.linenumber",
"-",
"$applicant.phone.exchangenumber"
]
}
}
}
])
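The pipeline above only covers prirotynumber "1". As a rough illustration of getting both numbers in one pass, here is a plain Python sketch that works on the document once it has been fetched. The output key cellPhoneNumber and the helper phone_numbers are assumptions for this example; the parts are joined with "-" in the same order as the $concat above.

```python
# Maps the priority value to an output key; "cellPhoneNumber" is a made-up name.
LABELS = {"1": "homePhoneNumber", "27": "cellPhoneNumber"}


def phone_numbers(doc):
    """Collect formatted phone numbers keyed by the label for their priority."""
    result = {}
    for applicant in doc.get("applicant", []):
        for phone in applicant.get("phone", []):
            label = LABELS.get(phone.get("prirotynumber"))
            if label is None:
                continue  # priority we do not care about
            result[label] = "-".join(
                [phone["areacode"], phone["linenumber"], phone["exchangenumber"]]
            )
    return result


doc = {
    "id": "1234",
    "applicant": [
        {
            "phone": [
                {"prirotynumber": "1", "areacode": "407",
                 "linenumber": "1234", "exchangenumber": "7899"},
                {"prirotynumber": "27", "areacode": "407",
                 "linenumber": "1234", "exchangenumber": "79999"},
            ]
        }
    ],
}
numbers = phone_numbers(doc)
```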

Using $match to query from different arrays with the same key value

Suppose I have this simple JSON data of two documents, each with two different arrays, carPolicies and paPolicies. Each array holds objects with a policy field that contains a key 'agent' whose value is '47'.
{
"_id": {
"$oid": "some_id"
},
"name": "qwe",
"password": "pw",
"carPolicies": [
{
"policy": {
"agent": "47"
}
},
{
"policy": {
"agent": "47"
}
}
],
"paPolicies": [
{
"policy": {
"agent": "47"
}
},
{
"policy": {
"agent": "47"
}
}
]
}
{
"_id": {
"$oid": "some_id"
},
"name": "rty",
"password": "wp",
"carPolicies": [
{
"policy": {
"agent": "47"
}
},
{
"policy": {
"agent": "47"
}
}
],
"paPolicies": [
{
"policy": {
"agent": "47"
}
},
{
"policy": {
"agent": "47"
}
}
]
}
Using MongoDB's $match operator, how do I write a query that returns the document's name if the agent value is 47 in either array?
This is what I currently have:
db.collection('users').aggregate([
// Get just the docs that contain an agent element where agent is === req.params.name
{$match: {$or: [{'paPolicies.policy.agent': req.params.name}, {'carPolicies.policy.agent': req.params.name}]} },
{
$project: {
policy: {
$filter: {
// how to do an 'or' operator at 'input' so it can be input: '$paPolicies.policy || $carPolicies.policy'
input: '$paPolicies.policy',
as: 'police',
cond: { $eq: ['$$police.agent', req.params.name]}
}
},
_id: 1, name: 1
}
}
])
I know that the above code is wrong but I feel like it's the closest I can currently get to a solution and hopefully gives an idea of what I'm trying to achieve.
If I understand the requirement correctly, how about just using dot (.) notation in a .find() query, with the projection as the second parameter?
db.collection.find({
$or: [
{
"carPolicies.policy.agent": "47"
},
{
"paPolicies.policy.agent": "47"
}
]
},
{
"_id": 1,
"name": 1
})
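If the matching policies themselves are also needed, which is what the comment in the question's $filter asks about, one option might be to feed both arrays into $filter through $concatArrays. The sketch below builds such a pipeline as a Python list. The helper name agent_policies_pipeline is made up, and it assumes both arrays are present on every matched document, since $concatArrays yields null when one of its inputs is missing.

```python
def agent_policies_pipeline(agent):
    """Build a pipeline returning name plus the policies of the given agent."""
    return [
        # keep only documents where either array mentions the agent
        {
            "$match": {
                "$or": [
                    {"paPolicies.policy.agent": agent},
                    {"carPolicies.policy.agent": agent},
                ]
            }
        },
        {
            "$project": {
                "name": 1,
                "policy": {
                    "$filter": {
                        # merge both arrays so a single filter covers them
                        "input": {
                            "$concatArrays": [
                                "$paPolicies.policy",
                                "$carPolicies.policy",
                            ]
                        },
                        "as": "police",
                        "cond": {"$eq": ["$$police.agent", agent]},
                    }
                },
            }
        },
    ]


pipeline = agent_policies_pipeline("47")
```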

filter range date elasticsearch

This is what my data looks like:
{
"name": "thename",
"openingTimes": {
"monday": [
{
"start": "10:00",
"end": "14:00"
},
{
"start": "19:00",
"end": "02:30"
}
]
}
}
I want to query this document for: open on Monday between 13:00 and 14:00.
I tried this filter but it doesn't return my document:
{
"filter": {
"range": {
"openingTimes.monday.start": {
"lte": "13:00"
},
"openingTimes.monday.end": {
"gte": "14:00"
}
}
}
}
If I simply ask for open on Monday at 13:00, it works:
{
"filter": {
"range": {
"openingTimes.monday.start": {
"lte": "13:00"
}
}
}
}
Or even closing on Monday from 14:00 works too:
{
"filter": {
"range": {
"openingTimes.monday.start": {
"gte": "14:00"
}
}
}
}
But combining both of them doesn't return anything. How can I create a filter meaning open on Monday between 13:00 and 14:00?
EDIT
This is how I mapped the openingTimes field:
{
"properties": {
"monday": {
"type": "nested",
"properties": {
"start": {"type": "date","format": "hour_minute"},
"end": {"type": "date","format": "hour_minute"}
}
}
}
}
SOLUTION (@DanTuffery)
Based on @DanTuffery's answer I changed my filter to his (which works perfectly) and added the type definition of my openingTimes attribute.
For the record I am using elasticsearch as my primary db through Ruby-on-Rails using the following gems:
gem 'elasticsearch-rails', git: 'git://github.com/elasticsearch/elasticsearch-rails.git'
gem 'elasticsearch-model', git: 'git://github.com/elasticsearch/elasticsearch-rails.git'
gem 'elasticsearch-persistence', git: 'git://github.com/elasticsearch/elasticsearch-rails.git', require: 'elasticsearch/persistence/model'
Here is what my openingTimes attribute's mapping looks like:
attribute :openingTimes, Hash, mapping: {
type: :object,
properties: {
monday: {
type: :nested,
properties: {
start:{type: :date, format: 'hour_minute'},
end: {type: :date, format: 'hour_minute'}
}
},
tuesday: {
type: :nested,
properties: {
start:{type: :date, format: 'hour_minute'},
end: {type: :date, format: 'hour_minute'}
}
},
...
...
}
}
And here is how I implemented his filter:
def self.openedBetween startTime, endTime, day
self.search filter: {
nested: {
path: "openingTimes.#{day}",
filter: {
bool: {
must: [
{range: {"openingTimes.#{day}.start"=> {lte: startTime}}},
{range: {"openingTimes.#{day}.end" => {gte: endTime}}}
]
}
}
}
}
end
First create your mapping with the openingTimes object at the top level.
/PUT http://localhost:9200/demo/test/_mapping
{
"test": {
"properties": {
"openingTimes": {
"type": "object",
"properties": {
"monday": {
"type": "nested",
"properties": {
"start": {
"type": "date",
"format": "hour_minute"
},
"end": {
"type": "date",
"format": "hour_minute"
}
}
}
}
}
}
}
}
Index your document
/POST http://localhost:9200/demo/test/1
{
"name": "thename",
"openingTimes": {
"monday": [
{
"start": "10:00",
"end": "14:00"
},
{
"start": "19:00",
"end": "02:30"
}
]
}
}
With a nested filter query you can search for the document with the start and end fields within boolean range queries:
/POST http://localhost:9200/demo/test/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "openingTimes.monday",
"filter": {
"bool": {
"must": [
{
"range": {
"openingTimes.monday.start": {
"lte": "13:00"
}
}
},
{
"range": {
"openingTimes.monday.end": {
"gte": "14:00"
}
}
}
]
}
}
}
}
}
}
}
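The filtered query used above was removed in Elasticsearch 5.0, and the nested clause now takes a query key instead of filter. For newer versions, the same search could probably be written with bool and filter, as in the Python sketch below. It only builds the request body as a dict, and open_between_query is a hypothetical helper name.

```python
def open_between_query(day, start_time, end_time):
    """Build a search body: open on `day` from start_time through end_time."""
    path = f"openingTimes.{day}"
    return {
        "query": {
            "bool": {
                "filter": [
                    {
                        "nested": {
                            "path": path,
                            # both ranges must hold for the SAME nested object
                            "query": {
                                "bool": {
                                    "filter": [
                                        {"range": {f"{path}.start": {"lte": start_time}}},
                                        {"range": {f"{path}.end": {"gte": end_time}}},
                                    ]
                                }
                            },
                        }
                    }
                ]
            }
        }
    }


body = open_between_query("monday", "13:00", "14:00")
```

Putting both range clauses inside one nested query is what ties start and end to the same opening slot, rather than letting them match different slots of the same day.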