Subtract two values in Kibana for specific timestamps - elastic-stack

This is the first time I'm doing this and I can't seem to find an online resource.
The index is aggregated at a daily level, so there is one record per day.
26 April:
{
"_index": "gamers",
"_type": "dailyAgg",
"_id": "dailyAgg-2015-04-26T00:00:00Z",
"_score": null,
"_source": {
"timestamp": "2017-04-26T00:00:00Z",
"player_count": 800
},
"fields": {
"timestamp": [
1493164800000
]
},
"sort": [
1493164800000
]
}
25 April:
{
"_index": "gamers",
"_type": "dailyAgg",
"_id": "dailyAgg-2017-04-25T00:00:00Z",
"_score": null,
"_source": {
"timestamp": "2017-04-25T00:00:00Z",
"player_count": 500
},
"fields": {
"timestamp": [
1493078400000
]
},
"sort": [
1493078400000
]
}
What I need is:
player_count(Today) - player_count(Yesterday)
=> player_count(26 April) - player_count(25 April) = 800 - 500 = 300
I've tried scripted fields and Painless scripts, but I can't pull the data for a given date.

This is the solution I ended up using: Custom Plugin
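One alternative that avoids a plugin, offered only as a rough sketch: let Elasticsearch compute the day-over-day change with a date_histogram plus a derivative pipeline aggregation, and read the result from the response. The aggregation names (per_day, players, daily_change) and both helper functions are made up for illustration. The "interval" key assumes an older cluster like the one in the question (newer versions use "calendar_interval"). The sample response is hand-written to match the two documents above, not real output.

```python
def daily_change_query(field="player_count", time_field="timestamp"):
    """Build a search body that asks for the per-day change of `field`."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "aggs": {
            "per_day": {
                "date_histogram": {"field": time_field, "interval": "day"},
                "aggs": {
                    # one record per day, so the sum is that day's value
                    "players": {"sum": {"field": field}},
                    # difference between this bucket and the previous one
                    "daily_change": {"derivative": {"buckets_path": "players"}},
                },
            }
        },
    }


def daily_changes(response):
    """Map each day to its change; the first bucket has no derivative."""
    buckets = response["aggregations"]["per_day"]["buckets"]
    return {
        b["key_as_string"]: b["daily_change"]["value"]
        for b in buckets
        if "daily_change" in b
    }


# Hand-written response shaped like the two documents in the question.
sample_response = {
    "aggregations": {
        "per_day": {
            "buckets": [
                {
                    "key_as_string": "2017-04-25T00:00:00.000Z",
                    "players": {"value": 500.0},
                },
                {
                    "key_as_string": "2017-04-26T00:00:00.000Z",
                    "players": {"value": 800.0},
                    "daily_change": {"value": 300.0},
                },
            ]
        }
    }
}

print(daily_changes(sample_response))
```

The body returned by daily_change_query() would be sent to the gamers index with whatever client you already use; only the response parsing is exercised here.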

Related

MongoDB update_one vs update_many - Improve speed

I have a collection of about 10,000 docs, where each doc has the following format:
{
"_id": {
"$oid": "631edc6e207c89b932a70a26"
},
"name": "Ethereum",
"auditInfoList": [
{
"coinId": "1027",
"auditor": "Fairyproof",
"auditStatus": 2,
"reportUrl": "https://www.fairyproof.com/report/Covalent"
}
],
"circulatingSupply": 122335921.0615,
"cmcRank": 2,
"dateAdded": "2015-08-07T00:00:00.000Z",
"id": 1027,
"isActive": 1,
"isAudited": true,
"lastUpdated": 1662969360,
"marketPairCount": 6085,
"quotes": [
{
"name": "USD",
"price": 1737.1982544180462,
"volume24h": 14326453277.535921,
"marketCap": 212521748520.66168,
"percentChange1h": 0.62330307,
"percentChange24h": -1.08847937,
"percentChange7d": 10.96517745,
"lastUpdated": 1662966780,
"percentChange30d": -13.49374496,
"percentChange60d": 58.25153862,
"percentChange90d": 42.27475921,
"fullyDilluttedMarketCap": 212521748520.66,
"marketCapByTotalSupply": 212521748520.66168,
"dominance": 20.0725,
"turnover": 0.0674117,
"ytdPriceChangePercentage": -53.9168
}
],
"selfReportedCirculatingSupply": 0,
"slug": "ethereum",
"symbol": "ETH",
"tags": [
"mineable",
"pow",
"smart-contracts",
"ethereum-ecosystem",
"coinbase-ventures-portfolio",
"three-arrows-capital-portfolio",
"polychain-capital-portfolio",
"binance-labs-portfolio",
"blockchain-capital-portfolio",
"boostvc-portfolio",
"cms-holdings-portfolio",
"dcg-portfolio",
"dragonfly-capital-portfolio",
"electric-capital-portfolio",
"fabric-ventures-portfolio",
"framework-ventures-portfolio",
"hashkey-capital-portfolio",
"kenetic-capital-portfolio",
"huobi-capital-portfolio",
"alameda-research-portfolio",
"a16z-portfolio",
"1confirmation-portfolio",
"winklevoss-capital-portfolio",
"usv-portfolio",
"placeholder-ventures-portfolio",
"pantera-capital-portfolio",
"multicoin-capital-portfolio",
"paradigm-portfolio",
"injective-ecosystem"
],
"totalSupply": 122335921.0615
}
I'm pulling an updated version of it and, to avoid duplicates, I'm doing the following using update_one:
for doc in new_doc_list:
    CRYPTO_TEMPORARY_LIST.update_one(
        {"name": doc['name']},
        {"$set": {"lastUpdated": doc['lastUpdated']}},
        upsert=True,
    )
The problem is that it's too slow.
I'm trying to figure out how to improve the speed by using update_many, but I can't figure out how to set it up.
Basically, I want to update every document by name. Replacing the whole document, rather than just the "lastUpdated" field, would be even better.
Thanks guys <3
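A possible direction, sketched under assumptions rather than tested against your data: update_many applies one and the same update to every matched document, so it does not fit per-document values. What usually cuts the time is sending all operations in a single bulk_write call, here with ReplaceOne so the whole document is swapped. The sketch assumes CRYPTO_TEMPORARY_LIST is a pymongo Collection and that name identifies a coin uniquely. The helper names to_upsert_specs and bulk_upsert are made up for illustration.

```python
def to_upsert_specs(new_doc_list):
    """Turn each incoming doc into a (filter, replacement) pair keyed on name."""
    specs = []
    for doc in new_doc_list:
        # Drop _id so the replacement never tries to change the stored _id.
        replacement = {k: v for k, v in doc.items() if k != "_id"}
        specs.append(({"name": doc["name"]}, replacement))
    return specs


def bulk_upsert(collection, new_doc_list):
    """Send all upserts in one bulk_write instead of one call per doc."""
    # Imported here so the helper above can be used without pymongo installed.
    from pymongo import ReplaceOne

    ops = [
        ReplaceOne(flt, repl, upsert=True)
        for flt, repl in to_upsert_specs(new_doc_list)
    ]
    if not ops:
        return None  # bulk_write rejects an empty list of operations
    # ordered=False lets the server continue past individual failures.
    return collection.bulk_write(ops, ordered=False)


# Usage (needs a running MongoDB; names taken from the question):
# CRYPTO_TEMPORARY_LIST.create_index("name")
# result = bulk_upsert(CRYPTO_TEMPORARY_LIST, new_doc_list)
```

An index on name also matters, since every upsert has to look the document up by that field first.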

Elasticsearch query to get items that were modified more than an hour ago

I have the following items indexed in Elasticsearch.
{
"_index": "test_index",
"type": "_doc",
"_source": {
"someTitle": "Thank you for your help",
"lastUpdated": 1640085989000
}
},
{
"_index": "test_index",
"type": "_doc",
"_source": {
"someTitle": "Thank you for your help",
"lastUpdated": 1640092916012
}
},
{
"_index": "test_index",
"type": "_doc",
"_source": {
"someTitle": "Thank you for your help",
"lastUpdated": 1640092916012
}
}
How can I get the items that were updated more than an hour ago, based on that lastUpdated value? I have tried some solutions found on the internet, but most of them query string fields rather than number fields.
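It feels like a range query would do the job [doc].
The section you are looking for is the one on ranges on date fields.
Since you want items updated more than an hour ago, the bound has to be lt (gte would return the items updated within the last hour). Your query should look more or less like this:
GET /<your index>/_search
{
"query": {
"range": {
"lastUpdated": {
"lt": "now-1h"
}
}
}
}
Make sure your mapping is right and that lastUpdated is mapped as a date with the right format [doc].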
ES gives you keywords like now and h for simple date math queries. Along with a range query you should be able to do it:
{
"query": {
"range": {
"lastUpdated": {
"lt": "now-1h"
}
}
}
}
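If lastUpdated turns out to be mapped as a plain long rather than a date, date math such as now-1h may not apply. One option then is to compute the cutoff on the client and send it as a number. A minimal sketch, assuming the values are epoch milliseconds as in the sample documents. The function name and its parameters are made up for illustration.

```python
import time

MS_PER_HOUR = 60 * 60 * 1000


def older_than_query(hours=1, now_ms=None, field="lastUpdated"):
    """Build a range query for docs whose `field` is older than `hours` hours."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # current time in epoch milliseconds
    cutoff = now_ms - hours * MS_PER_HOUR
    # lt: strictly older than the cutoff
    return {"query": {"range": {field: {"lt": cutoff}}}}


# Pretend "now" is the newest timestamp from the sample documents.
print(older_than_query(now_ms=1640092916012))
```

With that fixed "now", only the first sample document (1640085989000) falls below the cutoff.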

MongoDB query to find an element in a two-level nested object

I have a big issue and I don't know what to do.
I want to find all objects by the name in Object2. Object2 has a name element.
Specifically, I want to find all objects that have the value X in the name element inside Object2. In the example, the name value is "IWANTALLOBJECTSWITHTHISNAME".
The JSON structure:
"objects": [
{
"_id": "5c69a62cf9acf00d00dbc02d",
"date": "2222-02-24T00:00:00.000Z",
"description": "22",
"Object1": {
"_id": "5c69a62cf9acf00d00dbc02b",
"date": "2222-02-24T00:00:00.000Z",
"user": "5c30fd5890bbd24a1c46c7ee",
"positionsObject1": [
{
"id": 1,
"Object2": {
"_id":"5c69a62cf9acf00d00dbc02c",
"name": "IWANTALLOBJECTSWITHTHISNAME"
},
"description": "22",
"value": 22
}
],
"id": 13,
"__v": 0
},
"user": "5c30fd5890bbd24a1c46c7ee",
"id": 7,
"__v": 0
}
]
I'm new to MongoDB and this query is really hard for me. I've tried everything. Thank you very much for the help.
You can specify the path using dot notation:
db.col.find({ "objects.Object1.positionsObject1.Object2.name": "IWANTALLOBJECTSWITHTHISNAME" })
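With pymongo the same filter is just a dict passed to find. To show what the dotted path does when it crosses arrays such as objects and positionsObject1, here is a small pure-Python imitation of the matching rule. It is only an illustration: it ignores numeric array indexes and query operators, and matches_path is a made-up helper, not part of MongoDB or pymongo.

```python
def matches_path(value, path, expected):
    """Return True if any value reached by the dotted `path` equals `expected`."""
    if isinstance(value, list):
        # Arrays are traversed implicitly: any element may match.
        return any(matches_path(item, path, expected) for item in value)
    if not path:
        return value == expected
    if not isinstance(value, dict):
        return False
    key, _, rest = path.partition(".")
    return key in value and matches_path(value[key], rest, expected)


def matches(doc, query):
    """Check every path/value pair of a simple equality filter."""
    return all(matches_path(doc, path, val) for path, val in query.items())


# Trimmed-down version of the document from the question.
doc = {
    "objects": [
        {
            "id": 7,
            "Object1": {
                "id": 13,
                "positionsObject1": [
                    {
                        "id": 1,
                        "Object2": {"name": "IWANTALLOBJECTSWITHTHISNAME"},
                        "value": 22,
                    }
                ],
            },
        }
    ]
}

query = {"objects.Object1.positionsObject1.Object2.name": "IWANTALLOBJECTSWITHTHISNAME"}
# With pymongo this would be: collection.find(query)
print(matches(doc, query))
```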

How to perform date arithmetic between nested and unnested dates in Elasticsearch?

Consider the following Elasticsearch (v5.4) object (an "award" doc type):
{
"name": "Gold 1000",
"date": "2017-06-01T16:43:00.000+00:00",
"recipient": {
"name": "James Conroy",
"date_of_birth": "1991-05-30"
}
}
The mapping type for both award.date and award.recipient.date_of_birth is "date".
I want to perform a range aggregation to get a list of the age ranges of the recipients of this award ("Under 18", "18-24", "24-30", "30+"), at the time of their award. I tried the following aggregation query:
{
"size": 0,
"query": {"match_all": {}},
"aggs": {
"recipients": {
"nested": {
"path": "recipient"
},
"aggs": {
"age_ranges": {
"range": {
"script": {
"inline": "doc['date'].date - doc['recipient.date_of_birth'].date"
},
"keyed": true,
"ranges": [{
"key": "Under 18",
"from": 0,
"to": 18
}, {
"key": "18-24",
"from": 18,
"to": 24
}, {
"key": "24-30",
"from": 24,
"to": 30
}, {
"key": "30+",
"from": 30,
"to": 100
}]
}
}
}
}
}
}
Problem 1
But I get the following error due to the comparison of dates in the script portion:
Cannot apply [-] operation to types [org.joda.time.DateTime] and [org.joda.time.MutableDateTime].
The DateTime object is the award.date field, and the MutableDateTime object is the award.recipient.date_of_birth field. I've tried doing something like doc['recipient.date_of_birth'].date.toDateTime() (which doesn't work despite the Joda docs claiming that MutableDateTime has this method inherited from a parent class). I've also tried doing something further like this:
"script": "ChronoUnit.YEARS.between(doc['date'].date, doc['recipient.date_of_birth'].date)"
Which sadly also doesn't work :(
Problem 2
I notice if I do this:
"aggs": {
"recipients": {
"nested": {
"path": "recipient"
},
"aggs": {
"award_years": {
"terms": {
"script": {
"inline": "doc['date'].date.year"
}
}
}
}
}
}
I get 1970 with a doc_count that happens to equal the total number of docs in ES. This leads me to believe that accessing a property outside of the nested object simply does not work and gives me back some default like the epoch datetime. And if I do the opposite (aggregating dates of birth without nesting), I get the exact same thing for all the dates of birth instead (1970, epoch datetime). So how can I compare those two dates?
I am racking my brain here, and I feel like there's some clever solution that is just beyond my current expertise with Elasticsearch. Help!
If you want to set up a quick environment for this to help me out, here is some curl goodness:
curl -XDELETE http://localhost:9200/joelinux
curl -XPUT http://localhost:9200/joelinux -d "{\"mappings\": {\"award\": {\"properties\": {\"name\": {\"type\": \"string\"}, \"date\": {\"type\": \"date\", \"format\": \"yyyy-MM-dd'T'HH:mm:ss.SSSSSSZ\"}, \"recipient\": {\"type\": \"nested\", \"properties\": {\"name\": {\"type\": \"string\"}, \"date_of_birth\": {\"type\": \"date\", \"format\": \"yyyy-MM-dd\"}}}}}}}"
curl -XPUT http://localhost:9200/joelinux/award/1 -d '{"name": "Gold 1000", "date": "2016-06-01T16:43:00.000000+00:00", "recipient": {"name": "James Conroy", "date_of_birth": "1991-05-30"}}'
curl -XPUT http://localhost:9200/joelinux/award/2 -d '{"name": "Gold 1000", "date": "2017-02-28T13:36:00.000000+00:00", "recipient": {"name": "Martin McNealy", "date_of_birth": "1983-01-20"}}'
That should give you a "joelinux" index with two "award" docs to test this out ("James Conroy" and "Martin McNealy"). Thanks in advance!
Unfortunately, you can't access nested and non-nested fields within the same context. As a workaround, you can change your mapping to automatically copy the date of birth from the nested document to the root context using the copy_to option:
{
"mappings": {
"award": {
"properties": {
"name": {
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
},
"type": "text"
},
"date": {
"type": "date"
},
"date_of_birth": {
"type": "date" // will be automatically filled when indexing documents
},
"recipient": {
"properties": {
"name": {
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
},
"type": "text"
},
"date_of_birth": {
"type": "date",
"copy_to": "date_of_birth" // copy value to root document
}
},
"type": "nested"
}
}
}
}
}
After that you can access the date of birth at the root level using the path date_of_birth, though the calculation to get the number of years between the dates is slightly tricky:
Period.between(LocalDate.ofEpochDay(doc['date_of_birth'].date.getMillis() / 86400000L), LocalDate.ofEpochDay(doc['date'].date.getMillis() / 86400000L)).getYears()
Here I convert the original Joda-Time date objects to java.time.LocalDate objects:
1. Get the number of milliseconds since 1970-01-01
2. Convert to the number of days since 1970-01-01 by dividing by 86400000L (the number of ms in one day)
3. Convert to a LocalDate object
4. Create a date-based Period object from the two dates
5. Get the number of years between the two dates.
So, the final aggregation query looks like this:
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"age_ranges": {
"range": {
"script": {
"inline": "Period.between(LocalDate.ofEpochDay(doc['date_of_birth'].date.getMillis() / 86400000L), LocalDate.ofEpochDay(doc['date'].date.getMillis() / 86400000L)).getYears()"
},
"keyed": true,
"ranges": [
{
"key": "Under 18",
"from": 0,
"to": 18
},
{
"key": "18-24",
"from": 18,
"to": 24
},
{
"key": "24-30",
"from": 24,
"to": 30
},
{
"key": "30+",
"from": 30,
"to": 100
}
]
}
}
}
}
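To sanity-check the arithmetic in that script outside Elasticsearch, the same conversion can be mirrored in plain Python. This is only a sketch of the calculation, not of the Painless API. The function names are made up for illustration, and the sample values come from the "James Conroy" document in the curl commands above.

```python
from datetime import date, datetime, timedelta, timezone

MS_PER_DAY = 86_400_000  # same constant as 86400000L in the script
EPOCH_DATE = date(1970, 1, 1)
EPOCH_DATETIME = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(dt):
    """Epoch milliseconds for a timezone-aware datetime (integer arithmetic)."""
    return (dt - EPOCH_DATETIME) // timedelta(milliseconds=1)


def epoch_millis_to_date(ms):
    """Milliseconds since 1970-01-01 -> whole days -> calendar date."""
    return EPOCH_DATE + timedelta(days=ms // MS_PER_DAY)


def full_years_between(start, end):
    """Completed years from start to end, like Period.between(...).getYears()."""
    years = end.year - start.year
    if (end.month, end.day) < (start.month, start.day):
        years -= 1  # this year's anniversary has not been reached yet
    return years


birth_ms = to_millis(datetime(1991, 5, 30, tzinfo=timezone.utc))
award_ms = to_millis(datetime(2016, 6, 1, 16, 43, tzinfo=timezone.utc))

age = full_years_between(
    epoch_millis_to_date(birth_ms), epoch_millis_to_date(award_ms)
)
print(age)  # 25, which lands in the "24-30" bucket
```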

Getting search results from elastic4s in Scala

I have the following code and I am trying to get all the hits from Elasticsearch. If I write it without the query part, it only gives me 10 results when I call .getHits.
val resultFuture = client.execute {
search in "reports/reportOutput" query{ termQuery("mainReportID", reportId.toString)}
}.await
Another issue is that the query part does not actually work: I get nothing back. Here is the structure of a response from my Elasticsearch index.
"hits": {
"total": 266,
"max_score": 1,
"hits":[
{
"_index": "reports",
"_type": "reportOutput",
"_id": "AUwjbAuKTetnUx12_a97",
"_score": 1,
"_source":
{
"displayName": "Classic BMW / MINI",
"model": "Cooper Clubman",
"dayInStock": "10",
"stockNumber": "Q323A",
"miles": "81093",
"interiorColorGeneric": "Black",
"year": "2009",
"trimLevel": "",
"mainReportID": "4d9e4fd3-7fdf-41c8-8c29-45c5acaf78b1",
"modelNumber": "",
"exteriorColorGeneric": "White",
"exteriorColor": "Pepper White",
"vin": "WMWML33509TX35944",
"make": "MINI",
"transmission": "A",
"exteriorColorCode": "850",
"interiorColor": "Gray/Carbon Black",
"interiorColorCode": "K8E1"
}
},
You can increase how many results are returned by setting a limit on the request, for example:
search in "index" limit 100
Note that the default limit of 10 is set by Elasticsearch itself, not by elastic4s, and you cannot change it so that all results are returned by default.
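In terms of the raw request, the limit corresponds, as far as I know, to the size parameter of the search body, and from selects the offset, so all hits can be fetched page by page. A rough sketch in Python rather than elastic4s. The helper name and the page size of 100 are arbitrary choices, and the total of 266 is taken from the response in the question.

```python
def page_requests(total_hits, page_size=100, query=None):
    """Build one search body per page so that all hits are covered."""
    if query is None:
        query = {"match_all": {}}
    return [
        {"from": offset, "size": page_size, "query": query}
        for offset in range(0, total_hits, page_size)
    ]


pages = page_requests(
    266,
    query={"term": {"mainReportID": "4d9e4fd3-7fdf-41c8-8c29-45c5acaf78b1"}},
)
for body in pages:
    print(body["from"], body["size"])
```

The last page still asks for 100 hits; Elasticsearch simply returns the 66 that remain. For very large result sets the scroll API is usually a better fit than deep from/size paging.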