Match_phrase isn't precise - REST

This is my REST API call:
GET logstash-2017.12.29/_search
{
"_source": {
"includes": [ "IPV4_DST_ADDR","IPV4_SRC_ADDR","IN_BYTES","OUT_BYTES"]
},
"size" : 100,
"query": {
"bool": {
"should": [
{
"match_phrase":{"IPV4_DST_ADDR":"192.168.0.159"}
},
{
"match_phrase":{"IPV4_SRC_ADDR":"192.168.0.159"}
}
],
"must":
{
"range" : {
"LAST_SWITCHED" : {
"gte" : 1514543547
}
}
}
}
},
"aggs": {
"IN_PKTS": {
"sum": {
"field": "IN_PKTS"
}
},
"IN_BYTES": {
"sum": {
"field": "IN_BYTES"
}
},
"OUT_BYTES": {
"sum": {
"field": "OUT_BYTES"
}
},
"OUT_PKTS": {
"sum": {
"field": "OUT_PKTS"
}
},
"genres":{
"terms" : {
"field" : "L7_PROTO_NAME.keyword",
"order" : { "in_bytes" : "desc" }
},
"aggs":{
"in_bytes": {
"sum": { "field":"IN_BYTES"}
}
}
},
"download1" : {
"filter" : { "term": { "IPV4_DST_ADDR":"192.168.0.159"} },
"aggs" : {
"downlod_bytes" : { "sum" : { "field" : "IN_BYTES" } }
}
},
"download2" : {
"filter" : { "term": { "IPV4_SRC_ADDR":"192.168.0.159"} },
"aggs" : {
"downlod_bytes" : { "sum" : { "field" : "OUT_BYTES" } }
}
},"upload1" : {
"filter" : { "term": { "IPV4_DST_ADDR":"192.168.0.159"} },
"aggs" : {
"downlod_bytes" : { "sum" : { "field" : "OUT_BYTES" } }
}
},"upload2" : {
"filter" : { "term": { "IPV4_SRC_ADDR":"192.168.0.159"} },
"aggs" : {
"downlod_bytes" : { "sum" : { "field" : "IN_BYTES" } }
}
}
}
I found that some of the returned documents don't meet my requirements:
{
"_index": "logstash-2017.12.29",
"_type": "ntopng-*",
"_id": "AWCh1jPtnZ2m3739FTU7",
"_score": 1,
"_source": {
"IPV4_SRC_ADDR": "192.168.0.109", // not in my expectation
"IN_BYTES": 132,
"IPV4_DST_ADDR": "224.0.0.252", // not in my expectation
"OUT_BYTES": 0
}
}
In this returned document, neither IPV4_SRC_ADDR nor IPV4_DST_ADDR is "192.168.0.159".
It seems like a fuzzy search, but I want match_phrase to match 100%: either IPV4_SRC_ADDR or IPV4_DST_ADDR should be "192.168.0.159".
How should I modify my REST API?
Thank you in advance!

You should map your IP fields using the ip data type:
{
"mappings": {
"my_type": {
"properties": {
"IPV4_SRC_ADDR": {
"type": "ip"
},
"IPV4_DST_ADDR": {
"type": "ip"
}
}
}
}
}
Then you'll be able to match those addresses exactly using a simple term query:
"should": [
{
"term":{"IPV4_DST_ADDR":"192.168.0.159"}
},
{
"term":{"IPV4_SRC_ADDR":"192.168.0.159"}
}
],
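Note that the mapping of an existing field cannot be changed in place, so the ip mapping has to be applied to a new index (or added to the index template used for future daily logstash-* indices) and the existing data reindexed. A minimal sketch, assuming a hypothetical target index named logstash-2017.12.29-v2 and the my_type type name from the mapping above:
PUT logstash-2017.12.29-v2
{
  "mappings": {
    "my_type": {
      "properties": {
        "IPV4_SRC_ADDR": { "type": "ip" },
        "IPV4_DST_ADDR": { "type": "ip" }
      }
    }
  }
}

POST _reindex
{
  "source": { "index": "logstash-2017.12.29" },
  "dest": { "index": "logstash-2017.12.29-v2" }
}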
UPDATE:
Given your current mapping, you can also use the .keyword sub-field. Note that your bool query also contains a must clause, so the should clauses are optional by default (minimum_should_match defaults to 0); that is why documents matching only the range filter were returned. Adding "minimum_should_match": 1 requires at least one of the two term clauses to match:
{
"_source": {
"includes": [
"IPV4_DST_ADDR",
"IPV4_SRC_ADDR",
"IN_BYTES",
"OUT_BYTES"
]
},
"size": 100,
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"term": {
"IPV4_DST_ADDR.keyword": "192.168.0.159"
}
},
{
"term": {
"IPV4_SRC_ADDR.keyword": "192.168.0.159"
}
}
],
"must": {
"range": {
"LAST_SWITCHED": {
"gte": 1514543547
}
}
}
}
},
"aggs": {
"IN_PKTS": {
"sum": {
"field": "IN_PKTS"
}
},
"IN_BYTES": {
"sum": {
"field": "IN_BYTES"
}
},
"OUT_BYTES": {
"sum": {
"field": "OUT_BYTES"
}
},
"OUT_PKTS": {
"sum": {
"field": "OUT_PKTS"
}
},
"genres": {
"terms": {
"field": "L7_PROTO_NAME.keyword",
"order": {
"in_bytes": "desc"
}
},
"aggs": {
"in_bytes": {
"sum": {
"field": "IN_BYTES"
}
}
}
},
"download1": {
"filter": {
"term": {
"IPV4_DST_ADDR.keyword": "192.168.0.159"
}
},
"aggs": {
"download_bytes": {
"sum": {
"field": "IN_BYTES"
}
}
}
},
"download2": {
"filter": {
"term": {
"IPV4_SRC_ADDR.keyword": "192.168.0.159"
}
},
"aggs": {
"downlod_bytes": {
"sum": {
"field": "OUT_BYTES"
}
}
}
},
"upload1": {
"filter": {
"term": {
"IPV4_DST_ADDR.keyword": "192.168.0.159"
}
},
"aggs": {
"downlod_bytes": {
"sum": {
"field": "OUT_BYTES"
}
}
}
},
"upload2": {
"filter": {
"term": {
"IPV4_SRC_ADDR.keyword": "192.168.0.159"
}
},
"aggs": {
"downlod_bytes": {
"sum": {
"field": "IN_BYTES"
}
}
}
}
}
}
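Side note (my own addition, not from the original answer): if you do remap the address fields as ip as suggested above, you can drop the .keyword suffix and use the term queries directly on the fields; ip fields also accept CIDR notation, for example to match a whole subnet:
"should": [
  { "term": { "IPV4_DST_ADDR": "192.168.0.0/24" } },
  { "term": { "IPV4_SRC_ADDR": "192.168.0.0/24" } }
],
"minimum_should_match": 1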

Related

Elasticsearch search by time range

I'm trying to search based on date and time separately. The Elasticsearch document format is:
{
"id": "101",
"name": "Tom",
"customers": ["Jerry", "Nancy", "soli"],
"start_time": "2021-12-13T06:57:29.420198Z",
"end_time": "2021-12-13T07:00:23.511722Z",
}
I need to search based on date and time separately.
Ex:
{
"query": {
"bool" : {
"must" : [
{
"match" : { "customers" : "Jerry" }
},
{
"range": {
"start_time": {"gte" : "2021-12-13", "lte" : "2021-12-15" }}
}
]}
}
}
Output: I am getting the above doc as the result, which is expected.
But when I use the query below, I get this error:
"failed to parse date field [6:57:29] with format [strict_date_optional_time||epoch_millis]: [failed to parse date field [6:57:29] with format [strict_date_optional_time||epoch_millis]]"
{
"query": {
"bool" : {
"must" : [
{
"match" : { "customers" : "Jerry" }
},
{
"range": {
"start_time": {"gte" : "6:57:29", "lte" : "6:59:35" }}
}
]}
}
}
Why am I not able to get results based on time?
Is there any way to achieve a search based on both date and time with a single field?
Ex:
{
"query": {
"bool" : {
"must" : [
{
"match" : { "customers" : "Jerry" }
},
{
"range": {
"start_time": {"gte" : "2021-12-13", "lte" : "2021-12-15" }}
},
{
"range": {
"start_time": {"gte" : "6:57:29", "lte" : "6:59:35" }}
}
]}
}
}
I also tried to achieve this using regular expressions, but it didn't help me.
This is the mapping:
{
"settings": {
"number_of_shards": 2,
"number_of_replicas": 1
},
"mappings": {
"dynamic": "true",
"_source": {
"enabled": "true"
},
"runtime": {
"start_time": {
"type": "keyword",
"script": {
"source": "doc.start_time.start_time.getHourOfDay() >=
params.min && doc.start_time.start_time.getHourOfDay()
<= params.max"
}
}
},
"properties": {
"name": {
"type": "keyword"
},
"customers": {
"type": "text"
}
}
}
}
The above statement gives this error: "not a statement: result not used from boolean and operation [&&]".
This is the search query, which I'll try once the index is created:
{
"query": {
"bool" : {
"must" : [
{
"match" : { "customers" : "Jerry" }
},
{
"match" : { "name" : "Tom" }
},
{
"range": {
"start_time": {
"gte": "2015-11-01",
"lte": "2015-11-30"
}
}
},
{
"script": {
"source":
"doc.start_time.start_time.getHourOfDay()
>= params.min &&
doc.start_time.start_time.getHourOfDay() <= params.max",
"params": {
"min": 6,
"max": 7
}
}
}
]}
}
}
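For reference (my own sketch, not part of the original question): a runtime field script is expected to emit a value rather than evaluate a bare boolean expression, which is what the "not a statement" error is pointing at. On Elasticsearch versions that support runtime fields, one way to filter on the time of day is to emit the hour as a numeric runtime field and range-filter on it; the index name my-index and field name start_hour below are placeholders:
PUT my-index
{
  "mappings": {
    "properties": {
      "start_time": { "type": "date" }
    },
    "runtime": {
      "start_hour": {
        "type": "long",
        "script": {
          "source": "emit(doc['start_time'].value.getHour())"
        }
      }
    }
  }
}

GET my-index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "customers": "Jerry" } },
        { "range": { "start_time": { "gte": "2021-12-13", "lte": "2021-12-15" } } },
        { "range": { "start_hour": { "gte": 6, "lte": 7 } } }
      ]
    }
  }
}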

Mongo Filter Query Nested with multiple $and

Following is my query, which should exactly match my documents, but I'm still not getting any output and I don't know why. The documents are shown below as well.
db.getCollection("analytics").find(
{
"$and" : [
{
"archive" : false
},
{
"platform" : "WEB"
},
{
"vendorId" : "3c7adbfe-14d7-4b26-9134-7e05d56573cc"
},
{
"createdAt" : {
"$gte" : 1578268800000.0
}
},
{
"createdAt" : {
"$lte" : 1580860800000.0
}
},
{
"$and" : [
{
"data.mobile" : "123"
},
{
"page" : "Loan Application"
},
{
"event" : "click"
}
]
},
{
"$and" : [
{
"data.aadharNumber" : "123"
},
{
"page" : "Personal Information"
},
{
"event" : "click"
}
]
},
{
"$and" : [
{
"data.totalExp" : "5"
},
{
"page" : "Professional Information"
},
{
"event" : "click"
}
]
}
]
}
);
Documents:
[
{
"page": "Loan Application",
"event": "click",
"loggedIn": true,
"vendorId": "3c7adbfe-14d7-4b26-9134-7e05d56573cc",
"data": {
"first": "Praveen",
"mobile": "1234"
},
"platform": "WEB"
},
{
"page": "Personal Information",
"event": "click",
"loggedIn": true,
"vendorId": "3c7adbfe-14d7-4b26-9134-7e05d56573cc",
"data": {
"panNumber": "123",
"aadharNumber": "123"
},
"platform": "WEB"
},
{
"page": "Professional Information",
"event": "click",
"loggedIn": true,
"vendorId": "3c7adbfe-14d7-4b26-9134-7e05d56573cc",
"data": {
"totalExp": "5"
},
"platform": "WEB"
}
]
There are a few issues going on with your query. In particular, the top-level $and requires a single document to satisfy all three data/page/event blocks at once, which no document can do; those blocks need to be combined with $or. You can try the query below:
db.getCollection("analytics").find({
$expr: {
$and: [
{
$eq: [
"$platform",
"WEB"
]
},
{
$eq: [
"$vendorId",
"3c7adbfe-14d7-4b26-9134-7e05d56573cc"
]
},
{
$or: [
{
$and: [
{ $eq: [ "$data.mobile", "123" ] },
{ $eq: [ "$page", "Loan Application" ] },
{ $eq: [ "$event", "click" ] }
]
},
{
$and: [
{ $eq: [ "$data.aadharNumber", "123" ] },
{ $eq: [ "$page", "Personal Information" ] },
{ $eq: [ "$event", "click" ] }
]
},
{
$and: [
{ $eq: [ "$data.totalExp", "5" ] },
{ $eq: [ "$page", "Professional Information" ] },
{ $eq: [ "$event", "click" ] }
]
}
]
}
]
}
})
Test : MongoDB-Playground
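As a side note (my own sketch, not from the original answer), the same matching logic can also be written without $expr, as a plain query with a top-level $or, which additionally lets MongoDB use ordinary indexes on these fields:
// Sketch: top-level $or over the three page-specific blocks
db.getCollection("analytics").find({
  platform: "WEB",
  vendorId: "3c7adbfe-14d7-4b26-9134-7e05d56573cc",
  $or: [
    { "data.mobile": "123", page: "Loan Application", event: "click" },
    { "data.aadharNumber": "123", page: "Personal Information", event: "click" },
    { "data.totalExp": "5", page: "Professional Information", event: "click" }
  ]
})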

Cloud Firestore subcollection query via REST

I have a collection x, and each document of x has a subcollection y. Each document of y has a time attribute. I can't figure out how to query just that subcollection via REST (I know this feature exists in the SDK). This is my query so far, which is obviously wrong:
{
"structuredQuery": {
"from": [
{
"collectionId": "x",
"allDescendants": true
}
],
"where": {
"compositeFilter": {
"op": "AND",
"filters": [
{
"fieldFilter": {
"field": {
"fieldPath": "y.time"
},
"op": "GREATER_THAN_OR_EQUAL",
"value": {
"integerValue": 1577836800000
}
}
},
{
"fieldFilter": {
"field": {
"fieldPath": "y.time"
},
"op": "LESS_THAN_OR_EQUAL",
"value": {
"integerValue": 1578355200000
}
}
}
]
}
}
}
}
I'm sending a POST to https://firestore.googleapis.com/v1/projects/PROJECT/databases/{default}/documents:runQuery, but I've also tried .../documents/x/ID/y:runQuery, which is obviously wrong too.
I believe you described a collection group query for collection group y. In the REST API, this is an allDescendants query on the path projects/PROJECT/databases/(default)/documents (known as the root document):
https://firestore.googleapis.com/v1/projects/PROJECT/databases/(default)/documents:runQuery
{
"structuredQuery": {
"from": [
{
"collectionId": "y",
"allDescendants": true
}
],
"where": {
"compositeFilter": {
"op": "AND",
"filters": [
{
"fieldFilter": {
"field": {
"fieldPath": "time"
},
"op": "GREATER_THAN_OR_EQUAL",
"value": {
"integerValue": 1577836800000
}
}
},
{
"fieldFilter": {
"field": {
"fieldPath": "time"
},
"op": "LESS_THAN_OR_EQUAL",
"value": {
"integerValue": 1578355200000
}
}
}
]
}
}
}
}
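For completeness (my own sketch, not from the original answer), such a query can be sent with curl; PROJECT, query.json, and $TOKEN (an OAuth 2.0 access token with Firestore access) are placeholders:
# POST the structuredQuery above to the runQuery endpoint
curl -s -X POST \
  "https://firestore.googleapis.com/v1/projects/PROJECT/databases/(default)/documents:runQuery" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d @query.json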
Alternatively, to query only the y subcollection of a single, known x document, declare the path to that parent document in the URL:
https://firestore.googleapis.com/v1/projects/PROJECT/databases/(default)/documents/x/documentX:runQuery
Then set the collectionId in the from clause to "y" and allDescendants to false:
{
"structuredQuery": {
"from": [
{
"collectionId": "y",
"allDescendants": false
}
],
"where": {
"compositeFilter": {
"op": "AND",
"filters": [
{
"fieldFilter": {
"field": {
"fieldPath": "y.time"
},
"op": "GREATER_THAN_OR_EQUAL",
"value": {
"integerValue": 1577836800000
}
}
},
{
"fieldFilter": {
"field": {
"fieldPath": "y.time"
},
"op": "LESS_THAN_OR_EQUAL",
"value": {
"integerValue": 1578355200000
}
}
}
]
}
}
}
}
Source: https://firebase.google.com/docs/firestore/reference/rest/v1/projects.databases.documents/runQuery#path-parameters

How to filter unique value in REST API

I want to filter unique IPs regardless of whether they appear in DST_Local_IP or SRC_Local_IP.
This is my REST API call:
{
"size" : 0,
"query": {
"bool": {
"should": [
{
"match":{"IPV4_DST_ADDR":"120.127.0.0/16"}
},
{
"match":{"IPV4_SRC_ADDR":"120.127.0.0/16"}
},
{
"range" : {
"LAST_SWITCHED" : {
"gte" : 0
}
}
}
],
"minimum_should_match": 2
}
},
"aggs": {
"DST_Local_IP": {
"filter": {
"bool": {
"filter": {
"match":{"IPV4_DST_ADDR":"120.127.0.0/16"}
}
}
},
"aggs": {
"dst_local_ip" : {
"terms" : {
"field" : "IPV4_DST_ADDR",
"size": 10000
}
}
}
},
"SRC_Local_IP": {
"filter": {
"bool": {
"filter": {
"match":{"IPV4_SRC_ADDR":"120.127.0.0/16"}
}
}
},
"aggs": {
"src_local_ip" : {
"terms" : {
"field" : "IPV4_SRC_ADDR",
"size": 10000
}
}
}
}
}
}
Response:
"aggregations": {
"SRC_Local_IP": {
"doc_count": 48287688,
"src_local_ip": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "120.127.160.8",
"doc_count": 6890185
},
{
"key": "120.127.160.77",
"doc_count": 3791683
},
{
"key": "120.127.160.65",
"doc_count": 1646648
},
{
"key": "120.127.160.42",
"doc_count": 1058027
}
.
.
.
"DST_Local_IP": {
"doc_count": 36696216,
"dst_local_ip": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "120.127.160.8",
"doc_count": 2762815
},
{
"key": "120.127.160.99",
"doc_count": 1344110
}
I want the returned values to be distinct, because an IP that appears in DST_Local_IP may also appear in SRC_Local_IP, but I just want the unique IPs regardless of whether they show up in DST_Local_IP or SRC_Local_IP.
How can I do this? Could you give me some ideas? :)
Thank you in advance!
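One way to get a single de-duplicated list (my own sketch, not from the original thread) is to copy both address fields into one combined field at index time with copy_to, then run a single terms aggregation on that field. This needs a new index or template plus a reindex; the index name and the combined field LOCAL_IP are placeholders, and depending on your Elasticsearch version the properties may need to sit under a mapping type. The include regular expression keeps only addresses in 120.127.0.0/16:
PUT logstash-with-local-ip
{
  "mappings": {
    "properties": {
      "IPV4_SRC_ADDR": { "type": "keyword", "copy_to": "LOCAL_IP" },
      "IPV4_DST_ADDR": { "type": "keyword", "copy_to": "LOCAL_IP" },
      "LOCAL_IP": { "type": "keyword" }
    }
  }
}

GET logstash-with-local-ip/_search
{
  "size": 0,
  "aggs": {
    "unique_local_ip": {
      "terms": {
        "field": "LOCAL_IP",
        "size": 10000,
        "include": "120\\.127\\..*"
      }
    }
  }
}
Your existing bool query on IPV4_SRC_ADDR / IPV4_DST_ADDR and LAST_SWITCHED can stay in the query section unchanged; the aggregation then returns each local IP exactly once, no matter which side of the flow it appeared on.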

How to sum two variables in REST API

I want to sum two variables in my REST API call and order by the result.
This is my REST API call:
"aggs": {
"genres": {
"terms": {
"field": "L7_PROTO_NAME.keyword",
"order": {
"sum_bytes": "desc"
}
},
"aggs": {
"in_bytes": {
"sum": {
"field": "IN_BYTES"
}
},
"out_bytes": {
"sum": {
"field": "OUT_BYTES"
}
}
}
}
}
Thank you in advance!
You need to create another sub-aggregation that sums the two fields and then order the terms aggregation by that sub-aggregation:
{
"query": {
"bool": {
"should": [
{
"term": {
"_index": "logstash-2018.01.02"
}
},
{
"term": {
"IPV4_DST_ADDR": "192.168.0.159"
}
},
{
"term": {
"IPV4_SRC_ADDR": "192.168.0.159"
}
}
]
}
},
"aggs": {
"genres": {
"terms": {
"field": "L7_PROTO_NAME.keyword",
"order": {
"sum_bytes": "desc"
}
},
"aggs": {
"in_bytes": {
"sum": {
"field": "IN_BYTES"
}
},
"out_bytes": {
"sum": {
"field": "OUT_BYTES"
}
},
"sum_bytes": {
"sum": {
"script": {
"source": "doc.IN_BYTES.value + doc.OUT_BYTES.value"
}
}
}
}
}
}
}
Since scripts are computationally heavy, a better option is to sum those two fields at indexing time and index the result as a new field (SUM_BYTES below) that you can use directly in your aggregation, like this:
{
"query": {
"bool": {
"should": [
{
"term": {
"_index": "logstash-2018.01.02"
}
},
{
"term": {
"IPV4_DST_ADDR": "192.168.0.159"
}
},
{
"term": {
"IPV4_SRC_ADDR": "192.168.0.159"
}
}
]
}
},
"aggs": {
"genres": {
"terms": {
"field": "L7_PROTO_NAME.keyword",
"order": {
"sum_bytes": "desc"
}
},
"aggs": {
"in_bytes": {
"sum": {
"field": "IN_BYTES"
}
},
"out_bytes": {
"sum": {
"field": "OUT_BYTES"
}
},
"sum_bytes": {
"sum": {
"field": "SUM_BYTES"
}
}
}
}
}
}
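The answer leaves open how SUM_BYTES gets populated. One possibility (my own sketch, with a hypothetical pipeline name) is an Elasticsearch ingest pipeline whose script processor computes the field at indexing time:
PUT _ingest/pipeline/add_sum_bytes
{
  "processors": [
    {
      "script": {
        "source": "ctx.SUM_BYTES = (ctx.IN_BYTES ?: 0) + (ctx.OUT_BYTES ?: 0)"
      }
    }
  ]
}
New documents can then be indexed with ?pipeline=add_sum_bytes, or existing ones rewritten via _reindex with "dest": { "index": "...", "pipeline": "add_sum_bytes" }; alternatively, the same sum can be computed in the Logstash/ntopng pipeline before the data reaches Elasticsearch.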