How to get dates from Elasticsearch in one format?

I have created an index like this:
PUT twitter
PUT twitter/_mapping/myType
{
"myType" : {
"properties" : {
"message" : {"type" : "date",
"date_detection": true,
"store" : true }
}
}
}
Then I put several documents:
POST twitter/myType
{
"message":123456
}
I have this document and others with these message values: "123456", -123456, "2014-01-01", "-123456" (note the difference between string and numeric values here). Only the document with the value "12#3454" failed to index.
So now I execute:
GET twitter/myType/_search?pretty=true&q=*:*
And results are:
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 4,
"max_score": 1,
"hits": [
{
"_index": "twitter",
"_type": "myType",
"_id": "AU5JvFsHvhUOO_5MdfCv",
"_score": 1,
"_source": {
"message": -123456
}
},
{
"_index": "twitter",
"_type": "myType",
"_id": "AU5Ju6aOvhUOO_5MdfCs",
"_score": 1,
"_source": {
"message": "123456"
}
},
{
"_index": "twitter",
"_type": "myType",
"_id": "AU5Ju0KOvhUOO_5MdfCq",
"_score": 1,
"_source": {
"message": "2014-01-01"
}
},
{
"_index": "twitter",
"_type": "myType",
"_id": "AU5JvDiGvhUOO_5MdfCu",
"_score": 1,
"_source": {
"message": "-123456"
}
}
]
}
}
Why do I get these values in the date field instead of string values in the ISODateTimeFormat.dateOptionalTimeParser format? Is there a way to get all dates in one format (e.g. string or millis)?
The Elasticsearch version is 1.4.3.

That's the _source you are seeing, meaning the exact JSON you indexed, with no formatting applied.
If you want to see what ES actually indexed (the date in milliseconds), you can use fielddata_fields:
GET /twitter/myType/_search
{
"query": {
"match_all": {}
},
"fielddata_fields": [
"message"
]
}
As for your question: returning every date in one format is not available out of the box. You need to use script_fields:
GET /twitter/myType/_search
{
"query": {
"match_all": {}
},
"fielddata_fields": [
"message"
],
"_source": "*",
"script_fields": {
"my_script": {
"script": "new Date(doc[\"message\"].value)"
}
}
}
Also, your mapping is wrong: date_detection should be set on the type, not on the field:
PUT twitter
{
"mappings": {
"myType": {
"date_detection": true,
"properties": {
"message": {
"type": "date",
"store": true
}
}
}
}
}
The output below shows how ES treats the values you indexed: numeric values are read as epoch milliseconds, while numeric strings are parsed as years:
{
"_index": "twitter",
"_type": "myType",
"_id": "AU5J93Q-I7tQJ10g6jk5",
"_score": 1,
"_source": {
"message": "123456"
},
"fields": {
"message": [
3833727840000000
],
"my_script": [
"123456-01-01T00:00:00.000Z"
]
}
},
{
"_index": "twitter",
"_type": "myType",
"_id": "AU5J93Q-I7tQJ10g6jk4",
"_score": 1,
"_source": {
"message": 123456
},
"fields": {
"message": [
123456
],
"my_script": [
"1970-01-01T00:02:03.456Z"
]
}
},
{
"_index": "twitter",
"_type": "myType",
"_id": "AU5J93Q-I7tQJ10g6jk8",
"_score": 1,
"_source": {
"message": "-123456"
},
"fields": {
"message": [
-3958062278400000
],
"my_script": [
"-123456-01-01T00:00:00.000Z"
]
}
},
{
"_index": "twitter",
"_type": "myType",
"_id": "AU5J93Q-I7tQJ10g6jk7",
"_score": 1,
"_source": {
"message": "2014-01-01"
},
"fields": {
"message": [
1388534400000
],
"my_script": [
"2014-01-01T00:00:00.000Z"
]
}
},
{
"_index": "twitter",
"_type": "myType",
"_id": "AU5J93Q-I7tQJ10g6jk6",
"_score": 1,
"_source": {
"message": -123456
},
"fields": {
"message": [
-123456
],
"my_script": [
"1969-12-31T23:57:56.544Z"
]
}
}
]
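The numeric cases in the output above can be reproduced outside Elasticsearch. The following is a minimal Python sketch, not part of the original answer, of one way to normalize epoch milliseconds to a single ISO 8601 string format on the client side. The helper name millis_to_iso is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Reference point for epoch milliseconds, in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def millis_to_iso(millis: int) -> str:
    """Format epoch milliseconds as an ISO 8601 UTC string with millisecond precision."""
    # Adding a timedelta (instead of calling fromtimestamp) also handles negative values.
    moment = EPOCH + timedelta(milliseconds=millis)
    return moment.strftime("%Y-%m-%dT%H:%M:%S.") + f"{moment.microsecond // 1000:03d}Z"


# The three values from the output above that fit into Python's datetime range.
for value in (123456, -123456, 1388534400000):
    print(value, "->", millis_to_iso(value))
```

Python's datetime only covers years 1 to 9999, so the values that ES parsed as year 123456 or -123456 would raise an OverflowError here.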

Related

Enrich array of objects with another document based on a condition

Given two collections users and activities, I want to enrich users.friends.user with all relevant activities, such that users.friends.user matches doc2.invitations.userId.
Below describes exactly what I would like:
user f24b189f-9e00-4d2b-b7f2-148a265fae7c is not a part of any activity, so their activities array is [].
user 223f2e04-6f0c-40a3-a88a-69d0127f2e92 is a part of an activity, so their activities array is populated with the activity object.
Raw:
db={
"users": [
{
"_id": "9c69a57f-e90a-43be-b559-449e973c6cba",
"__v": {
"$numberInt": "0"
},
"friends": [
{
"status": "ACCEPTED",
"user": "f24b189f-9e00-4d2b-b7f2-148a265fae7c"
},
{
"status": "ACCEPTED",
"user": "223f2e04-6f0c-40a3-a88a-69d0127f2e92"
}
]
}
],
"activities": [
{
"_id": {
"$oid": "63a92531bc6dea668f93bb06"
},
"invitations": [
{
"_id": {
"$oid": "63a92531bc6dea668f93bb07"
},
"userId": "9c69a57f-e90a-43be-b559-449e973c6cba",
"status": "ACCEPTED"
},
{
"_id": {
"$oid": "63a92544bc6dea668f93bb09"
},
"userId": "223f2e04-6f0c-40a3-a88a-69d0127f2e92",
"status": "ACCEPTED"
},
],
"__v": {
"$numberInt": "0"
}
}
]
}
Expected result:
"expectedResult": [
{
"_id": "9c69a57f-e90a-43be-b559-449e973c6cba",
"__v": {
"$numberInt": "0"
},
"friends": [
{
"status": "ACCEPTED",
"activities": [{
"_id": {
"$oid": "63a92531bc6dea668f93bb06"
},
"invitations": [
{
"_id": {
"$oid": "63a92531bc6dea668f93bb07"
},
"userId": "9c69a57f-e90a-43be-b559-449e973c6cba",
"status": "ACCEPTED"
},
{
"_id": {
"$oid": "63a92544bc6dea668f93bb09"
},
"userId": "223f2e04-6f0c-40a3-a88a-69d0127f2e92",
"status": "ACCEPTED"
},
],
"__v": {
"$numberInt": "0"
}
}]
},
{
"status": "ACCEPTED",
"activities": []
}
]
}
]
And here's a link to the playground.
I played around with aggregate and populate, but this use case is unlike a regular "join", which is throwing me off. Any pointers will be appreciated.
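To pin down the intended semantics before translating them into an aggregation, here is a minimal plain-Python sketch of the enrichment described in the prose above: each friend receives the activities in which one of the invitations has a userId equal to the friend's user. The function name enrich_friends and the trimmed in-memory data are assumptions for illustration only; this is not a MongoDB API.

```python
def enrich_friends(users, activities):
    """Return copies of the users where every friend carries an 'activities' list."""
    enriched_users = []
    for user in users:
        friends = []
        for friend in user.get("friends", []):
            # Keep an activity if any of its invitations targets this friend.
            matching = [
                activity
                for activity in activities
                if any(
                    invitation.get("userId") == friend["user"]
                    for invitation in activity.get("invitations", [])
                )
            ]
            friends.append({**friend, "activities": matching})
        enriched_users.append({**user, "friends": friends})
    return enriched_users


# Trimmed-down version of the sample data (ids kept, other fields omitted).
users = [
    {
        "_id": "9c69a57f-e90a-43be-b559-449e973c6cba",
        "friends": [
            {"status": "ACCEPTED", "user": "f24b189f-9e00-4d2b-b7f2-148a265fae7c"},
            {"status": "ACCEPTED", "user": "223f2e04-6f0c-40a3-a88a-69d0127f2e92"},
        ],
    }
]
activities = [
    {
        "_id": "63a92531bc6dea668f93bb06",
        "invitations": [
            {"userId": "9c69a57f-e90a-43be-b559-449e973c6cba", "status": "ACCEPTED"},
            {"userId": "223f2e04-6f0c-40a3-a88a-69d0127f2e92", "status": "ACCEPTED"},
        ],
    }
]

for friend in enrich_friends(users, activities)[0]["friends"]:
    print(friend["user"], "->", len(friend["activities"]), "activities")
```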

MongoDB: complex nested aggregation

I have this collection:
[
{
"_id": {
"$oid": "60b22e1dbd46fa18a8308318"
},
"title": "basketball",
"price": 12,
"category": "Furniture",
"description": "",
"images": [
"http://res.cloudinary.com/hadarush100/image/upload/v1622289949/nfg948x3zro6gbiuknrz.jpg"
],
"categoryId": 1,
"userId": "60ad16493062eb11141d4927",
"createdAt": 1622289948232,
"chats": [
{
"id": 1,
"createdAt": 1622289948232,
"messages": [
{
"id": "1",
"createdAt": 1622289948232,
"senderId": "60ad16493062eb11141d4927",
"text": "Hello, Im the seller of this product."
}
]
},
{
"id": "2",
"createdAt": 1622289948232,
"messages": [
{
"id": 1,
"createdAt": 1622289948232,
"senderId": "60ad16493062eb11141d4927",
"text": "Hello, Im the seller of this product."
}
]
}
]
}
]
I want to find a specific document (by _id), then dive into a specific chat in that document (by id), then use $lookup to replace the "senderId" property in each message with a "sender" property that contains the full sender details (as a user), which exist in another collection (users). The result needs to look like this:
[
{
"_id": {
"$oid": "60b22e1dbd46fa18a8308318"
},
"title": "basketball",
"price": 12,
"category": "Furniture",
"description": "",
"images": [
"http://res.cloudinary.com/hadarush100/image/upload/v1622289949/nfg948x3zro6gbiuknrz.jpg"
],
"categoryId": 1,
"userId": "60ad16493062eb11141d4927",
"createdAt": 1622289948232,
"chats": [
{
"id": 1,
"createdAt": 1622289948232,
"messages": [
{
"id": "1",
"createdAt": 1622289948232,
"sender": {
"_id": {
"$oid": "60ad16493062eb11141d4927"
},
"username": "hadar",
"email": "hadarushha#gmail.com",
"profileImgUrl": "https://randomuser.me/api/portraits/men/79.jpg",
"createdAt": 1621956168518
},
"text": "Hello, Im the seller of this product."
}
]
},
{
"id": "2",
"createdAt": 1622289948232,
"messages": [
{
"id": 1,
"createdAt": 1622289948232,
"sender": {
"_id": {
"$oid": "60ad16493062eb11141d4927"
},
"username": "hadar",
"email": "hadarushha#gmail.com",
"profileImgUrl": "https://randomuser.me/api/portraits/men/79.jpg",
"createdAt": 1621956168518
},
"text": "Hello, Im the seller of this product."
}
]
}
]
}
]
You can use this aggregation:
$match to select only the chosen document (by _id)
$unwind multiple times to transform arrays into objects
$lookup to query the external collection (users)
$group in reverse order to rebuild the arrays
I assumed that your collections look more or less like this (next time, post both collections and an example on a working playground):
db={
"products": [
{
"_id": {
"$oid": "60b22e1dbd46fa18a8308318"
},
"title": "basketball",
"price": 12,
"category": "Furniture",
"description": "",
"images": [
"http://res.cloudinary.com/hadarush100/image/upload/v1622289949/nfg948x3zro6gbiuknrz.jpg"
],
"categoryId": 1,
"userId": "60ad16493062eb11141d4927",
"createdAt": 1622289948232,
"chats": [
{
"id": 1,
"createdAt": 1622289948232,
"messages": [
{
"id": "1",
"createdAt": 1622289948232,
"senderId": "60ad16493062eb11141d4927",
"text": "Hello, Im the seller of this product."
}
]
},
{
"id": "2",
"createdAt": 1622289948232,
"messages": [
{
"id": 1,
"createdAt": 1622289948232,
"senderId": "60ad16493062eb11141d4927",
"text": "Hello, Im the seller of this product."
}
]
}
]
},
{
"_id": {
"$oid": "60b22e1dbd46fa18a8308319"
},
"title": "volleyball",
"price": 8,
"category": "Furniture",
"description": "",
"images": [
"http://res.cloudinary.com/hadarush100/image/upload/v1622289949/nfg948x3zro6gbiuknrz.jpg"
],
"categoryId": 1,
"userId": "60ad16493062eb11141d4927",
"createdAt": 1622289948232,
"chats": [
{
"id": 1,
"createdAt": 1622289948232,
"messages": [
{
"id": "1",
"createdAt": 1622289948232,
"senderId": "60ad16493062eb11141d4927",
"text": "Hello, Im the seller of this product."
}
]
},
{
"id": "2",
"createdAt": 1622289948232,
"messages": [
{
"id": 1,
"createdAt": 1622289948232,
"senderId": "60ad16493062eb11141d4928",
"text": "Hello, Im the seller of this product."
}
]
}
]
}
],
"users": [
{
"_id": {
"$oid": "60ad16493062eb11141d4927"
},
"username": "hadar",
"email": "hadarushha#gmail.com",
"profileImgUrl": "https://randomuser.me/api/portraits/men/79.jpg",
"createdAt": 1621956168518
},
{
"_id": {
"$oid": "60ad16493062eb11141d4928"
},
"username": "test",
"email": "test#gmail.com",
"profileImgUrl": "https://randomuser.me/api/portraits/men/49.jpg",
"createdAt": 1621956168528
},
]
}
And here is the working aggregation:
db.products.aggregate([
{
"$match": {
"_id": {
"$oid": "60b22e1dbd46fa18a8308319"
}
}
},
{
"$unwind": "$chats"
},
{
"$unwind": "$chats.messages"
},
{
"$addFields": {
"chats.messages.senderIdObjId": {
"$convert": {
"input": "$chats.messages.senderId",
"to": "objectId",
}
}
}
},
{
"$lookup": {
"from": "users",
"localField": "chats.messages.senderIdObjId",
"foreignField": "_id",
"as": "chats.messages.sender"
}
},
{
"$unwind": "$chats.messages.sender"
},
{
"$group": {
"_id": "$chats.id",
"messages": {
"$push": "$chats.messages"
},
"allFields": {
"$first": "$$ROOT"
}
}
},
{
"$addFields": {
"allFields.chats.messages": "$messages"
}
},
{
"$replaceWith": "$allFields"
},
{
"$group": {
"_id": "$_id",
"chats": {
"$push": "$chats"
},
"allFields": {
"$first": "$$ROOT"
}
}
},
{
"$addFields": {
"allFields.chats": "$chats"
}
},
{
"$replaceWith": "$allFields"
},
])
Working Playground here
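For readers who want to check the intended result without a database, the effect the question asks for can be sketched in plain Python. This is only an illustration under stated assumptions: the function name attach_senders is hypothetical, ids are plain strings (in the real collections users._id is an ObjectId, which is why the pipeline needs $convert), and a message without a matching user gets sender None, whereas the final $unwind in the pipeline would drop it.

```python
def attach_senders(product, users):
    """Return a copy of the product where each message's senderId is replaced by a sender."""
    # Index users by id once so each message needs only a dictionary lookup.
    users_by_id = {user["_id"]: user for user in users}
    chats = []
    for chat in product.get("chats", []):
        messages = []
        for message in chat.get("messages", []):
            enriched = {key: value for key, value in message.items() if key != "senderId"}
            enriched["sender"] = users_by_id.get(message["senderId"])
            messages.append(enriched)
        chats.append({**chat, "messages": messages})
    return {**product, "chats": chats}


# Trimmed-down sample data with string ids (an assumption of this sketch).
users = [
    {"_id": "60ad16493062eb11141d4927", "username": "hadar"},
    {"_id": "60ad16493062eb11141d4928", "username": "test"},
]
product = {
    "_id": "60b22e1dbd46fa18a8308319",
    "title": "volleyball",
    "chats": [
        {"id": 1, "messages": [{"id": "1", "senderId": "60ad16493062eb11141d4927", "text": "Hello"}]},
        {"id": "2", "messages": [{"id": 1, "senderId": "60ad16493062eb11141d4928", "text": "Hello"}]},
    ],
}

for chat in attach_senders(product, users)["chats"]:
    for message in chat["messages"]:
        print(chat["id"], message["id"], message["sender"]["username"])
```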

ElasticSearch autocomplete for keywords from a string

My documents look like this:
"hits": {
"total": 4,
"max_score": 1,
"hits": [
{
"_index": "test_db2",
"_type": "test",
"_id": "1",
"_score": 1,
"_source": {
"name": "very cool shoes",
"price": 26
}
},
{
"_index": "test_db2",
"_type": "test",
"_id": "2",
"_score": 1,
"_source": {
"name": "great shampoo",
"price": 15
}
},
{
"_index": "test_db2",
"_type": "test",
"_id": "3",
"_score": 1,
"_source": {
"name": "shirt",
"price": 25
}
}
]
}
How can I create autocomplete in Elasticsearch? For example:
When I type "sh" into the input, I should see these results:
shoes
shampoo
shirt
.....
Example of what I need
Take a look at ngrams. Or actually, edge ngrams are probably all you need.
Qbox has a couple of blog posts about setting up autocomplete with ngrams, so for a more in-depth discussion I would refer you to these:
https://qbox.io/blog/an-introduction-to-ngrams-in-elasticsearch
https://qbox.io/blog/multi-field-partial-word-autocomplete-in-elasticsearch-using-ngrams
But just very quickly, this should get you started.
First I set up the index:
PUT /test_index
{
"settings": {
"analysis": {
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"standard",
"stop",
"kstem",
"edgengram_filter"
]
}
},
"filter": {
"edgengram_filter": {
"type": "edgeNGram",
"min_gram": 2,
"max_gram": 15
}
}
}
},
"mappings": {
"doc": {
"properties": {
"name": {
"type": "string",
"index_analyzer": "autocomplete",
"search_analyzer": "standard"
},
"price":{
"type": "integer"
}
}
}
}
}
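To see why this mapping makes "sh" match, it helps to look at what an edge n-gram filter emits for each token. The sketch below is a plain-Python approximation with the same min_gram and max_gram as above. It deliberately ignores the standard, stop, and kstem filters, so it is not the exact token stream ES would produce, and the function name edge_ngrams is hypothetical.

```python
from typing import List


def edge_ngrams(token: str, min_gram: int = 2, max_gram: int = 15) -> List[str]:
    """Return the prefixes of token whose length lies between min_gram and max_gram."""
    longest = min(len(token), max_gram)
    return [token[:length] for length in range(min_gram, longest + 1)]


# Whitespace splitting stands in for the standard tokenizer in this sketch.
for name in ("very cool shoes", "great shampoo", "shirt"):
    for token in name.split():
        print(token, "->", edge_ngrams(token))
```

Because the prefixes are produced at index time and the search side uses the standard analyzer, a query for "sh" is compared as a whole term against prefixes such as those of "shoes", "shampoo", and "shirt".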
Then I indexed your documents:
POST /test_index/doc/_bulk
{"index":{"_id":1}}
{"name": "very cool shoes","price": 26}
{"index":{"_id":2}}
{"name": "great shampoo","price": 15}
{"index":{"_id":3}}
{"name": "shirt","price": 25}
Now I can get autocomplete results with a simple match query:
POST /test_index/_search
{
"query": {
"match": {
"name": "sh"
}
}
}
which returns:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0.30685282,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "3",
"_score": 0.30685282,
"_source": {
"name": "shirt",
"price": 25
}
},
{
"_index": "test_index",
"_type": "doc",
"_id": "2",
"_score": 0.19178301,
"_source": {
"name": "great shampoo",
"price": 15
}
},
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 0.15342641,
"_source": {
"name": "very cool shoes",
"price": 26
}
}
]
}
}
Here's the code I used to test it:
http://sense.qbox.io/gist/0886488ddfb045c69eed67b15e9734187c8b2491

Elasticsearch nested query

I'm new to Elasticsearch. I managed to set it up and import a recordset from my MongoDB collection using the river plugin. For a start, I want to query against the "desc" field, but I can't get the query to work. I'm not sure whether the problem is caused by the way the index was defined. Can anyone help, please?
A sample recordset in Elasticsearch looks like this:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 107209,
"max_score": 1,
"hits": [
{
"_index": "shiv",
"_type": "shiv",
"_id": "iG1eIzN7RGO7hFfxTlnLuA",
"_score": 1,
"_source": {
"_id": {
"$oid": "50901d7f485bf7bd1c000021"
},
"brand": "",
"category": {
"$ref": "categories",
"$id": {
"$oid": "4fbd2221758cb11d14000174"
}
},
"comments": [],
"count_comment": 0,
"count_fav": 2,
"count_hotness": 1.46,
"count_rekick": 0,
"count_share": 0,
"country": {
"$ref": "countries",
"$id": {
"$oid": "4fec98f7758cb18c6e0002c9"
}
},
"currency": "pound",
"desc": "A men's automatic watch, this Seamaster Bond model features a Co-Axial escapement and date function. Its blue dial is teamed with a stainless steel case and bracelet for a look that's sporty and refined.",
"gender": "male",
"ident": "omega-seamaster-diver-bond-men-s-automatic-watch---ernest-jones-1351622015",
"img_url": "http://s7ondemand4.scene7.com/is/image/Signet/5735793?$detail$",
"lifestyles": [
{
"$ref": "lifestyles",
"$id": {
"$oid": "508ff6ca485bf73112000060"
}
}
],
"location": "United Kingdom",
"owner": {
"$ref": "accounts",
"$id": {
"$oid": "50742fd8485bf74b7a00213f"
}
},
"price": 2400,
"store": "ernestjones.co.uk",
"tags": [
"ernest-jones",
"bond"
],
"timestamp_creation": 1351622015,
"timestamp_exp": 1356825600,
"timestamp_update": 1351622015,
"title": "Omega Seamaster Diver Bond men's automatic watch - Ernest Jones",
"url": "http%3A%2F%2Fwww.ernestjones.co.uk%2Fwebstore%2Fd%2F5735793%2Fomega%20seamaster%20diver%20bond%20men%27s%20automatic%20watch%2F%3Futm_source%3Dgooglebase%26utm_medium%3Dfeedmanager%26cm_mmc%3DFroogle-_-CKB-_-nurses_fobs-_-watches%26cm_mmca1%3Domega%26cm_mmca2%3Dmale%26cm_mmca3%3Dadult"
}
}
]
}
}
The mapping of the index "shiv" looks like this:
{
"shiv": {
"properties": {
"$oid": {
"type": "string"
}
}
}
}
Thanks again
There are lots of ways to query. Have you tried a match query?
Using curl or a REST client of your choice:
http://[host]:9200/[index_name]/[doc_type]/_search
{
"query" : {
"match" : {
"desc" : "some value you want to find in desc"
}
}
}
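The same request can be prepared from Python with only the standard library. This is a minimal sketch: the host http://localhost:9200 and the search text are assumptions, the index and type names "shiv" are taken from the question, and the helper name build_match_request is hypothetical. The request is only built and printed here, not sent.

```python
import json
from urllib.request import Request


def build_match_request(base_url, index, doc_type, field, text):
    """Build a POST request for a match query against one field."""
    body = {"query": {"match": {field: text}}}
    return Request(
        url=f"{base_url}/{index}/{doc_type}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_match_request("http://localhost:9200", "shiv", "shiv", "desc", "automatic watch")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running cluster: urllib.request.urlopen(request)
```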

Sum Query in Elasticsearch

I'm fairly new to Elasticsearch. I'm trying to write a query that will group by a field and calculate a sum. In SQL, my query would look like this:
SELECT lane, SUM(routes) FROM lanes GROUP BY lane
My data looks like this in ES:
{
"_index": "kpi",
"_type": "mroutes_by_lane",
"_id": "TUeWFEhnS9q1Ukb2QdZABg",
"_score": 1.0,
"_source": {
"warehouse_id": 107,
"date": "2013-04-08",
"lane": "M05",
"routes": 4047
}
},
{
"_index": "kpi",
"_type": "mroutes_by_lane",
"_id": "owVmGW9GT562_2Alfru2DA",
"_score": 1.0,
"_source": {
"warehouse_id": 107,
"date": "2013-04-08",
"lane": "M03",
"routes": 4065
}
},
{
"_index": "kpi",
"_type": "mroutes_by_lane",
"_id": "JY9xNDxqSsajw76oMC2gxA",
"_score": 1.0,
"_source": {
"warehouse_id": 107,
"date": "2013-04-08",
"lane": "M05",
"routes": 3056
}
},
{
"_index": "kpi",
"_type": "mroutes_by_lane",
"_id": "owVmGW9GT345_2Alfru2DB",
"_score": 1.0,
"_source": {
"warehouse_id": 107,
"date": "2013-04-08",
"lane": "M03",
"routes": 5675
}
},
...
I essentially want to run the same query in ES as I did in SQL, so that my result would be something like this (in JSON, of course): M05: 7103, M03: 9740
In Elasticsearch, you can achieve this with a terms_stats facet:
{
"query" : {
"match_all" : { }
},
"facets" : {
"lane_routes_stats" : {
"terms_stats" : {
"key_field" : "lane",
"value_field" : "routes",
"order": "term"
}
}
}
}
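As a sanity check on the totals the facet should report for the four sample documents, the same group-and-sum can be done in plain Python. This is only a sketch of the arithmetic, not of how ES computes it, and the function name sum_by is hypothetical.

```python
from collections import defaultdict


def sum_by(documents, key_field, value_field):
    """Group documents by key_field and sum value_field, ordered by key."""
    totals = defaultdict(int)
    for document in documents:
        totals[document[key_field]] += document[value_field]
    # Sorting by key mirrors the "order": "term" setting of the facet.
    return dict(sorted(totals.items()))


# The _source fields of the four sample documents from the question.
docs = [
    {"warehouse_id": 107, "date": "2013-04-08", "lane": "M05", "routes": 4047},
    {"warehouse_id": 107, "date": "2013-04-08", "lane": "M03", "routes": 4065},
    {"warehouse_id": 107, "date": "2013-04-08", "lane": "M05", "routes": 3056},
    {"warehouse_id": 107, "date": "2013-04-08", "lane": "M03", "routes": 5675},
]

print(sum_by(docs, "lane", "routes"))
```

Note that facets were deprecated in Elasticsearch 1.x and removed in 2.0. On those versions, a terms aggregation on lane with a nested sum aggregation on routes expresses the same grouping.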