How to group by geospatial attribute in mongodb? - mongodb

I have a set of documents in MongoDB and I am trying to group them by nearest geopoint coordinates, within a 100 m radius of a given document, and get the average value of type and the $first value of cordinates. A sample document set is shown below. Is there a way to do this with the existing functions of the MongoDB aggregation pipeline, or do I have to use the newly introduced $function to build a custom aggregation function? Any suggestions are highly appreciated.
{"_id":{"$oid":"5e790cfe46fa8260f41d2626"},
"cordinates":[103.96277219999999,1.3437526],
"timestamp":1584991486436,
"user":{"$oid":"5e4bbbc31eac8e2e3ca219a6"},
"type": 1,
"__v":0}
{"_id":{"$oid":"5e790d7346fa8260f41d2627"},
"cordinates":[103.97242539965999,1.33508],
"timestamp":1584991603400,
"user":{"$oid":"5e4bbbc31eac8e2e3ca219a6"},
"type": 1,
"__v":0}
{"_id":{"$oid":"5e790d7346fa8260f41d2627"},
"cordinates":[103.97242539990099,1.33518],
"timestamp":1584991603487,
"user":{"$oid":"5e4bbbc31eac8e2e3ca219a6"},
"type": 2,
"__v":0}
A sample document that would be expected as output after aggregation pipeline.
{"avgCordinates":[103.97242539990099,1.33518],
"avgType": 1.6,
}

I managed to solve this by building a custom function that maps each geospatial coordinate to a single scalar value and then grouping by the returned values. Nearby coordinates map to nearby scalar values, so I was able to group them into a single document. So far it has given me the expected output for the heatmap, but I'm still not sure this is the correct way to do it; there should be a better answer. I have posted my aggregation pipeline below. Any suggestions for improving it are appreciated.
[
    {
        '$match': {
            'timestamp': {
                '$gte': 1599889338000
            }
        }
    }, {
        '$addFields': {
            'singleCoordinate': {
                '$function': {
                    'body': 'function(coordinates){return ((coordinates[1]+90)*180+coordinates[0])*1000000000000;}',
                    'args': [
                        '$coordinates'
                    ],
                    'lang': 'js'
                }
            }
        }
    }, {
        '$group': {
            '_id': {
                '$subtract': [
                    '$singleCoordinate', {
                        '$mod': [
                            '$singleCoordinate', 100
                        ]
                    }
                ]
            },
            'coordinates': {
                '$first': '$coordinates'
            },
            'avgType': {
                '$avg': '$type'
            }
        }
    }, {
        '$addFields': {
            'latitude': {
                '$arrayElemAt': [
                    '$coordinates', 1
                ]
            },
            'longitude': {
                '$arrayElemAt': [
                    '$coordinates', 0
                ]
            },
            'weight': {
                '$multiply': [
                    '$avgType', '$_id'
                ]
            }
        }
    }, {
        '$project': {
            '_id': false,
            'coordinates': false,
            'avgType': false
        }
    }
]
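For comparison, a more direct way to bucket points is to snap each coordinate to a grid cell of roughly 100 m and group by the cell index. The sketch below does this in plain Python over the sample documents from the question. It is a different technique from the scalar mapping in the pipeline, the names (cell_key, group_by_cell, METERS_PER_DEG_LAT) are made up for illustration, and the metres-per-degree constant is an approximation. Grid cells only approximate a 100 m radius: two points a few metres apart can still fall on opposite sides of a cell boundary.

```python
import math
from collections import defaultdict

METERS_PER_DEG_LAT = 111_320.0  # rough average; an assumption, not an exact geodesic value


def cell_key(lon, lat, cell_m=100.0):
    """Return the (column, row) index of the ~cell_m grid cell containing the point."""
    dlat = cell_m / METERS_PER_DEG_LAT
    row = math.floor(lat / dlat)
    # Longitude degrees shrink with latitude, so size the columns per row.
    row_center_lat = (row + 0.5) * dlat
    dlon = dlat / math.cos(math.radians(row_center_lat))
    return (math.floor(lon / dlon), row)


def group_by_cell(docs, cell_m=100.0):
    """Group documents by grid cell; keep the first coordinates and the average type."""
    buckets = defaultdict(list)
    for doc in docs:
        lon, lat = doc["cordinates"]  # field name as spelled in the sample documents
        buckets[cell_key(lon, lat, cell_m)].append(doc)
    return [
        {
            "coordinates": members[0]["cordinates"],
            "avgType": sum(m["type"] for m in members) / len(members),
        }
        for members in buckets.values()
    ]


docs = [
    {"cordinates": [103.96277219999999, 1.3437526], "type": 1},
    {"cordinates": [103.97242539965999, 1.33508], "type": 1},
    {"cordinates": [103.97242539990099, 1.33518], "type": 2},
]
groups = group_by_cell(docs)
```

With the three sample documents, the last two land in the same cell and the first one in a cell of its own.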

Related

How to use one of the current document field as coordinates in $geoIntersects query

I'm trying to create an aggregation pipeline that intersects some shapes with others. A simplified pipeline looks like this:
[
{
'$match': {
'loc.type': {
'$eq': 'Polygon'
}
}
}, {
'$addFields': {
'cor': [
[
-11.337890625, 56.31653672211301
], [
13.1396484375, 42.8115217450979
]
]
}
}, {
'$match': {
'loc': {
'$geoIntersects': {
'$geometry': {
'type': 'LineString',
'coordinates': '$cor'
}
}
}
}
}
]
In the first stage I select every polygon shape from the collection, in the next stage I add some coordinates to each document, and in the last stage I try to match the polygon against the shape added in stage 2.
The problem is in stage 3: I can't fill the coordinates field with the cor field created in stage 2.
The error I get is: GeoJSON coordinates must be an array of coordinates.
Another, similar problem is this:
[
{
'$match': {
'loc.type': {
'$eq': 'Polygon'
}
}
}, {
'$addFields': {
'myshape': {
'type': 'LineString',
'coordinates': [
[
-11.337890625, 56.31653672211301
], [
13.1396484375, 42.8115217450979
]
]
}
}
}, {
'$match': {
'loc': {
'$geoIntersects': {
'$geometry': '$myshape'
}
}
}
}
]
Here, in stage 2, I created a complete GeoJSON shape (field myshape), but I also can't use myshape as the shape for $geometry. The error I get is: unknown geo specifier: $geometry: "$myshape".
How can I use a field from the current document as the value for $geometry or $geometry.coordinates?
Unfortunately, taking geospatial data from a document field is not supported in geospatial queries. $match uses query syntax, where a string such as '$cor' or '$myshape' is a literal value rather than a field reference, so the geometry has to be supplied as a constant.
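When the shape is known before the query is sent, one workaround is to build the pipeline in client code and inline the coordinates as literal values. A minimal Python sketch under that assumption; the helper name build_intersects_pipeline is hypothetical, and the resulting list would be passed to pymongo's collection.aggregate:

```python
def build_intersects_pipeline(line_coordinates):
    """Return a pipeline that matches polygons in 'loc' intersecting the given line."""
    # The geometry is a plain literal here, not a reference to a document field.
    geometry = {"type": "LineString", "coordinates": line_coordinates}
    return [
        {"$match": {"loc.type": "Polygon"}},
        {"$match": {"loc": {"$geoIntersects": {"$geometry": geometry}}}},
    ]


pipeline = build_intersects_pipeline(
    [[-11.337890625, 56.31653672211301], [13.1396484375, 42.8115217450979]]
)
```

This does not help when the shape differs per document; in that case the intersection has to be driven from outside the query, for example with one query per shape.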

How to use $set and dot notation to update embedded array elements using corresponding old element?

I have the following documents in MongoDB:
from pymongo import MongoClient
client = MongoClient(host='my_host', port=27017)
database = client.forecast
collection = database.regions
collection.delete_many({})
regions = [
{
'id': 'DE',
'sites': [
{
'name': 'paper_factory',
'energy_consumption': 1000
},
{
'name': 'chair_factory',
'energy_consumption': 2000
},
]
},
{
'id': 'FR',
'sites': [
{
'name': 'pizza_factory',
'energy_consumption': 3000
},
{
'name': 'foo_factory',
'energy_consumption': 4000
},
]
}
]
collection.insert_many(regions)
Now I would like to copy the property sites.energy_consumption to a new field sites.new_field for each site:
set_stage = {
"$set": {
"sites.new_field": "$sites.energy_consumption"
}
}
pipeline = [set_stage]
collection.aggregate(pipeline)
However, instead of copying the individual value per site, all site values are collected and added as an array. Instead of 'new_field': [1000, 2000] I would like to get 'new_field': 1000 for the first site:
{
"_id": ObjectId("61600c11732a5d6b103ba6be"),
"id": "DE",
"sites": [
{
"name": "paper_factory",
"energy_consumption": 1000,
"new_field": [
1000,
2000
]
},
{
"name": "chair_factory",
"energy_consumption": 2000,
"new_field": [
1000,
2000
]
}
]
},
{
"_id": ObjectId("61600c11732a5d6b103ba6bf"),
"id": "FR",
"sites": [
{
"name": "pizza_factory",
"energy_consumption": 3000,
"new_field": [
3000,
4000
]
},
{
"name": "foo_factory",
"energy_consumption": 4000,
"new_field": [
3000,
4000
]
}
]
}
=> What expression can I use to only use the corresponding entry of the array?
Is there some sort of current-index operator:
$sites[<current_index>].energy_consumption
or an alternative dot operator (it would remind me of the difference between * multiplication and .* element-wise matrix multiplication)?
$sites:energy_consumption
Or is this a bug?
Edit
I also tried to use the "$" positional operator, e.g. with
sites.$.new_field
or
$sites.$.energy_consumption
but then I get the error
FieldPath field names may not start with '$'
Related:
https://docs.mongodb.com/manual/reference/operator/aggregation/set/#std-label-set-add-field-to-embedded
In MongoDB how do you use $set to update a nested value/embedded document?
If a field belongs to the elements of an array, a path through the array selects that field from every element:
{ar :[{"a" : 1}, {"a" : 2}]}
"$ar.a" = [1 ,2]
Also, you can't mix update operators with aggregation, so something like $sites.$.energy_consumption is not allowed. In an aggregation you have to use aggregation operators; the only exception is the $match stage, where you can use query operators.
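To make the difference concrete, here is a plain Python analogy (not MongoDB code) of the two behaviours: a field path through an array collects the field from every element, whereas a per-element mapping sees one element at a time. The variable names are illustrative only.

```python
sites = [
    {"name": "paper_factory", "energy_consumption": 1000},
    {"name": "chair_factory", "energy_consumption": 2000},
]

# Analogy for "$sites.energy_consumption": one array built from all elements.
path_value = [site["energy_consumption"] for site in sites]

# Analogy for the original $set: every element receives that whole array.
broadcast = [{**site, "new_field": path_value} for site in sites]

# Analogy for a $map over "$sites": each element only sees its own value.
per_element = [{**site, "new_field": site["energy_consumption"]} for site in sites]
```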
Query
An alternative, slightly different solution from yours, using $setField.
I guess it will be faster, but probably with little difference.
There is no need to use JavaScript, which will be slower.
This solution requires MongoDB 5.0 or later, because $setField is a new operator.
aggregate(
[{"$set":
{"sites":
{"$map":
{"input":"$sites",
"in":
{"$setField":
{"field":"new_field",
"input":"$$this",
"value":"$$this.energy_consumption"}}}}}}]
)
Use $addFields with $map:
db.collection.update({},
[
  {
    "$addFields": {
      "sites": {
        $map: {
          input: "$sites",
          as: "s",
          in: {
            name: "$$s.name",
            energy_consumption: "$$s.energy_consumption",
            // copy this element's own value, not the whole array
            new_field: "$$s.energy_consumption"
          }
        }
      }
    }
  }
],
// update every matching document, not just the first one
{ multi: true })
I found the following ugly workarounds, which set the complete sites array instead of only specifying a new field with dot notation:
a) based on a JavaScript function
set_stage = {
    "$set": {
        "sites": {
            "$function": {
                "body": "function(sites) {return sites.map(site => {site.new_field = site.energy_consumption; return site})}",
                "args": ["$sites"],
                "lang": "js"
            }
        }
    }
}
b) based on $map and $mergeObjects
set_stage = {
    "$set": {
        "sites": {
            "$map": {
                "input": "$sites",
                "in": {
                    "$mergeObjects": ["$$this", {
                        "new_field": "$$this.energy_consumption"
                    }]
                }
            }
        }
    }
}
If there is some kind of $$this context for the dot operator expression, allowing a more elegant solution, please let me know.
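Workaround (b) can be wrapped in a small helper so that the field names are not repeated in every pipeline. This is only a sketch: copy_array_field_stage is a made-up name, and it simply produces a $set stage of the same $map/$mergeObjects shape for arbitrary field names.

```python
def copy_array_field_stage(array_field, source, target):
    """Build a $set stage that copies `source` to `target` inside each array element."""
    return {
        "$set": {
            array_field: {
                "$map": {
                    "input": f"${array_field}",
                    "in": {
                        # $$this is the current array element inside $map.
                        "$mergeObjects": ["$$this", {target: f"$$this.{source}"}]
                    },
                }
            }
        }
    }


set_stage = copy_array_field_stage("sites", "energy_consumption", "new_field")
pipeline = [set_stage]
```

collection.aggregate(pipeline) only returns the transformed documents. To persist the new field, the same stage list can be passed to collection.update_many({}, pipeline) on MongoDB 4.2 or later.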

Merge arrays by matching similar values in mongodb

This is an extension of the below question.
Filter arrays in mongodb
I have a collection where each document contains 2 arrays as below.
{
users:[
{
id:1,
name:"A"
},
{
id:2,
name:"B"
},
{
id:3,
name:"C"
}
]
priv_users:[
{
name:"X12/A",
priv:"foobar"
},
{
name:"Y34.B",
priv:"foo"
}
]
}
From the linked question, I learnt to use $map to merge two document arrays. But I can't figure out how to match users.name to priv_users.name to get the output below.
{
users:[
{
id:1,
name:"A",
priv:"foobar"
},
{
id:2,
name:"B",
priv:"foo"
},
{
id:3,
name:"C"
}
]
}
users.name and priv_users.name don't have a consistent pattern, but users.name exists within priv_users.name.
MongoDB version is 4.0
This may not be fully generic, but it should push you in the right direction. Consider using the $mergeObjects operator to merge the matching document from the priv_users array into each document in users.
The filter takes the $substr of the priv_users name field, starting at index 4, and compares it with the users name field, so it assumes a fixed four-character prefix as in the sample data. The resulting pipeline is as follows:
db.collection.aggregate([
{ '$addFields': {
'users': {
'$map': {
'input': '$users',
'in': {
'$mergeObjects': [
{
'$arrayElemAt': [
{
'$filter': {
'input': '$priv_users',
'as': 'usr',
'cond': {
'$eq': [
'$$this.name',
{ '$substr': [
'$$usr.name', 4, -1
] }
]
}
}
},
0
]
},
'$$this'
]
}
}
}
} }
])
If you are using MongoDB 4.2 or newer, consider using the $regexMatch operator to match the priv_users name field, with the users name field as the regex pattern. The cond expression of the $filter then becomes:
'cond': {
'$regexMatch': {
'input': '$$usr.name',
'regex': '$$this.name',
'options': "i"
}
}
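The merge logic can be checked against the sample data with a plain Python analogy of the $map / $filter / $mergeObjects combination. It assumes a case-insensitive "contains" match, similar to the $regexMatch variant, except that the name is escaped here and treated as literal text. The function name merge_privileges is illustrative. One caveat of matching by containment: a short name such as "A" also matches any privileged name that merely contains that letter, so a stricter pattern may be needed on real data.

```python
import re


def merge_privileges(users, priv_users):
    """For each user, merge in the first priv_users entry whose name contains the user's name."""
    merged = []
    for user in users:
        pattern = re.compile(re.escape(user["name"]), re.IGNORECASE)
        match = next((p for p in priv_users if pattern.search(p["name"])), None)
        # Like $mergeObjects: later objects win, so the user's own name is kept.
        merged.append({**(match or {}), **user})
    return merged


users = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}, {"id": 3, "name": "C"}]
priv_users = [{"name": "X12/A", "priv": "foobar"}, {"name": "Y34.B", "priv": "foo"}]
result = merge_privileges(users, priv_users)
```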

Is there a way to use a value recently gotten to look for other? in MongoDB

I want to run a query and use a value it returns to look up another value in the same query.
My collection is like this:
{
"houses": {
123: {
"color": "white",
"location": "California"
},
124: {
"color": "blue",
"location": "Las Vegas"
}
},
"owners": {
"Anne": {
"house": 124,
},
"Jake": {
"house": 123
}
}
}
Before doing the query I will know just the name of the owner and I would like to get the house information (color, location).
What I'm asking is that if there's a way of using the house number to get the house info in the same query. Something like this:
db.collection.aggregate([
{'$project': {'houses': 1, 'house_number': '$owners.Anne.house'}},
{'$project': {'house_info': 'houses.$house_number':1}}
])
I tried making the house number a string and concatenating it with $houses, but MongoDB doesn't let me concatenate the $ symbol.
I want to avoid making two queries, one to get the house number and a second one to get the house information.
Could someone please help me with this? Sorry if I can't explain myself very well, English isn't my native language.
This is quite possible with the $objectToArray and $filter operators. $objectToArray converts the houses object / document into an array of key/value pairs, and you can then filter that array using the '$owners.Anne.house' value. Note that the keys (k) produced by $objectToArray are always strings.
Take for instance this aggregate pipeline:
db.collection.aggregate([
    { '$project': {
        'house_info': {
            '$filter': {
                'input': { '$objectToArray': '$houses' },
                'cond': {
                    // k is always a string, so convert the house number before comparing
                    '$eq': [{ '$toString': '$owners.Anne.house' }, '$$this.k']
                }
            }
        }
    } }
])
The result is something like :
{
'house_info': [
{
k: '124',
v: {
"color": "blue",
"location": "Las Vegas"
}
}
]
}
To get just the data document
{
"color": "blue",
"location": "Las Vegas"
}
add a further projection stage that uses the $arrayElemAt operator:
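db.collection.aggregate([
    { '$project': {
        'house_array': {
            '$filter': {
                'input': { '$objectToArray': '$houses' },
                'cond': {
                    // k is always a string, so convert the house number before comparing
                    '$eq': [{ '$toString': '$owners.Anne.house' }, '$$this.k']
                }
            }
        }
    } },
    { '$project': {
        // '$house_array.v' collects the v values; take the first one
        'house_info': {
            '$arrayElemAt': ['$house_array.v', 0]
        }
    } }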
])
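As a sanity check of the logic, here is the same lookup in plain Python: the nested document is turned into key/value pairs, filtered by key, and the value of the first hit is returned. The keys produced this way are strings, so the numeric house number is converted before comparing. The helper name house_for_owner is illustrative.

```python
def house_for_owner(doc, owner):
    """Return the house sub-document for `owner`, or None if there is no match."""
    house_number = doc["owners"][owner]["house"]
    # Analogy for $objectToArray: a list of {"k": ..., "v": ...} pairs with string keys.
    pairs = [{"k": str(k), "v": v} for k, v in doc["houses"].items()]
    # Analogy for $filter followed by taking the first element's value.
    hits = [p for p in pairs if p["k"] == str(house_number)]
    return hits[0]["v"] if hits else None


doc = {
    "houses": {
        "123": {"color": "white", "location": "California"},
        "124": {"color": "blue", "location": "Las Vegas"},
    },
    "owners": {"Anne": {"house": 124}, "Jake": {"house": 123}},
}
info = house_for_owner(doc, "Anne")
```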

If condition in MongoDB for Nested JSON to retrieve a particular value

I have nested JSON like this, and I want to retrieve the value of "_value" at the second level, i.e. "Living Organisms". This is my JSON document:
{
"name": "Biology Book",
"data": {
"toc": {
"_version": "1",
"ge": [
{
"_name": "The Fundamental Unit of Life",
"_id": "5a",
"ge": [
{
"_value": "Living Organisms",
"_id": "5b"
}
]
}
]
}
}
}
This is what I've tried. Using the "_id", I want to retrieve its "_value":
db.products.aggregate([{"$match":{ "data.toc.ge.ge._id": "5b"}}])
This is the closest I could get to the output you mentioned in the comment above. Hope it helps.
db.collection.aggregate([
  {
    $match: {
      "data.toc.ge.ge._id": "5b"
    }
  },
  {
    $unwind: "$data.toc.ge"
  },
  {
    $unwind: "$data.toc.ge.ge"
  },
  {
    // keep only the unwound element with the requested _id
    $match: {
      "data.toc.ge.ge._id": "5b"
    }
  },
  {
    $group: {
      _id: null,
      book: {
        $push: "$data.toc.ge.ge._value"
      }
    }
  },
  {
    $project: {
      _id: 0,
      first: {
        $arrayElemAt: [
          "$book",
          0
        ]
      }
    }
  }
])
Output:
[
{
"first": "Living Organisms"
}
]
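The effect of the two $unwind stages can be pictured as a nested loop over both ge arrays. Below is a plain Python sketch of that traversal, with a hypothetical helper find_value. It compares _id on each inner element, because matching on the dotted path alone only selects the document, not the element inside it.

```python
def find_value(doc, target_id):
    """Return the _value of the second-level ge entry whose _id equals target_id."""
    for chapter in doc["data"]["toc"]["ge"]:      # first $unwind
        for entry in chapter.get("ge", []):       # second $unwind
            if entry.get("_id") == target_id:     # filter on the unwound element
                return entry.get("_value")
    return None


book = {
    "name": "Biology Book",
    "data": {"toc": {"_version": "1", "ge": [
        {"_name": "The Fundamental Unit of Life", "_id": "5a", "ge": [
            {"_value": "Living Organisms", "_id": "5b"},
        ]},
    ]}},
}
value = find_value(book, "5b")
```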
If you are using Mongoid:
(1..6).inject(Model.where('data.toc.ge.ge._id' => '5b').pluck('data.toc.ge.ge._value').first) { |v| v.values.first rescue v.first rescue v }
# => "Living Organisms"
6 is the number of containers to trim from the output (4 hashes and 2 arrays).
If I understand your question correctly, you only care about _value, so it sounds like you might want to use a projection:
db.products.aggregate([{"$match":{ "data.toc.ge.ge._id": "5b"}}, { "$project": {"data.toc.ge.ge._value": 1}}])