Calculate aggregates in a bucket in Upsert MongoDB update statement - mongodb

My application gets measurements from a device that should be stored in a MongoDB database. Each measurement contains values for several probes of the device.
The measurements should be displayed as aggregates over a certain period of time. I'm using the Bucket pattern to precompute the aggregates and to simplify indexing and querying.
The following sample shows a document:
{
"DeviceId": "Device1",
"StartTime": 100, "EndTime": 199,
"Measurements": [
{ "timestamp": 100, "probeValues": [ { "id": "1", "t": 30 }, { "id": "2", "t": 67 } ] },
{ "timestamp": 101, "probeValues": [ { "id": "1", "t": 32 }, { "id": "2", "t": 67 } ] },
{ "timestamp": 102, "probeValues": [ { "id": "1", "t": 34 }, { "id": "2", "t": 55 } ] },
{ "timestamp": 103, "probeValues": [ { "id": "1", "t": 27 }, { "id": "2", "t": 30 } ] }
],
"probeAggregates": [
{ "id": "1", "cnt": 4, "total": 123 },
{ "id": "2", "cnt": 4, "total": 219 }
]
}
Updating the values and calculating the aggregates in a single request works well if the document already exists (1st block: query, 2nd: update, 3rd: options):
{
"DeviceId": "Device1",
"StartTime": 100,
"EndTime": 199
},
{
$push: {
"Measurements": {
"timestamp": 103,
"probeValues": [ { "id": "1", "t": 27 }, { "id": "2", "t": 30 } ]
}
},
$inc: {
"probeAggregates.$[probeAggr1].cnt": 1,
"probeAggregates.$[probeAggr1].total": 27,
"probeAggregates.$[probeAggr2].cnt": 1,
"probeAggregates.$[probeAggr2].total": 30
}
},
{
arrayFilters: [
{ "probeAggr1.id": "1" },
{ "probeAggr2.id": "2" }
]
}
Now I want to extend the statement to do an upsert if the document does not exist yet. However, if I leave the update statement unchanged, I get the following error:
The path 'probeAggregates' must exist in the document in order to apply array updates.
If I try to prepare the probeAggregates array in case of an insert (e.g. by using $setOnInsert or $addToSet), this leads to another error:
Updating the path 'probeAggregates.$[probeAggr1].cnt' would create a conflict at 'probeAggregates'
Both errors can be explained and seem legit. One way to solve this would be to change the document structure and create one document per device, timeframe and probe and by that simplify the required update statement. In order to keep the number of documents low, I'd rather solve this by changing the update statement. Is there a way to create a valid document in an upsert?
(as I'm just learning to use a document db, feel free to share your experience in the comments on whether it is a good goal to keep the number of documents low in real world scenarios)
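One possible direction, sketched here under assumptions rather than offered as a confirmed fix: since MongoDB 4.2 an update can be written as an aggregation pipeline, which can read the current field values and fall back to an empty array when the bucket does not exist yet. The helper name `buildBucketUpsert` and the collection name `buckets` are hypothetical, and the sketch assumes every measurement reports every probe (aggregates for probes missing from the incoming measurement would be dropped). It has not been run against a server.

```javascript
// Sketch only: builds the arguments for an upsert that uses an
// aggregation-pipeline update (assumes MongoDB 4.2 or later).
// `buildBucketUpsert` is a hypothetical helper, not part of any driver.
function buildBucketUpsert(deviceId, startTime, endTime, measurement) {
  const filter = { DeviceId: deviceId, StartTime: startTime, EndTime: endTime };

  const pipeline = [
    {
      $set: {
        // Append the measurement; start from [] when the bucket is new.
        Measurements: {
          $concatArrays: [
            { $ifNull: ["$Measurements", []] },
            [{ $literal: measurement }],
          ],
        },
        // Rebuild one aggregate per probe of the incoming measurement.
        // Assumption: every measurement contains every probe.
        probeAggregates: {
          $map: {
            input: { $literal: measurement.probeValues },
            as: "pv",
            in: {
              $let: {
                vars: {
                  // Existing aggregate for this probe, if there is one.
                  existing: {
                    $arrayElemAt: [
                      {
                        $filter: {
                          input: { $ifNull: ["$probeAggregates", []] },
                          as: "agg",
                          cond: { $eq: ["$$agg.id", "$$pv.id"] },
                        },
                      },
                      0,
                    ],
                  },
                },
                in: {
                  id: "$$pv.id",
                  cnt: { $add: [{ $ifNull: ["$$existing.cnt", 0] }, 1] },
                  total: { $add: [{ $ifNull: ["$$existing.total", 0] }, "$$pv.t"] },
                },
              },
            },
          },
        },
      },
    },
  ];

  return { filter, pipeline, options: { upsert: true } };
}

const bucketUpsert = buildBucketUpsert("Device1", 100, 199, {
  timestamp: 103,
  probeValues: [{ id: "1", t: 27 }, { id: "2", t: 30 }],
});
// In mongosh (collection name `buckets` is an assumption):
// db.buckets.updateOne(bucketUpsert.filter, bucketUpsert.pipeline, bucketUpsert.options)
console.log(JSON.stringify(bucketUpsert.pipeline, null, 2));
```

The trade-off is that the arrayFilters/$inc form is replaced by expressions that recompute both arrays, so the statement is longer, but it no longer depends on probeAggregates already existing.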

Related

How to avoid huge json documents in mongoDB

I am new to MongoDB modelling. I have been working on a small app that used to have just one collection holding all my data, like this:
{
"name": "Thanos",
"age": 999,
"lastName": "whatever",
"subjects": [
{
"name": "algebra",
"mark": 999
},
{
"name": "quemistry",
"mark": 999
},
{
"name": "whatever",
"mark": 999
}
]
}
I know this is standard in MongoDB, since we don't have to map relationships to other collections as in a relational database. My problem is that my app is growing and my JSON, even though it works perfectly fine, is starting to get huge, since it has a few more (quite big) nested fields:
{
"name": "Thanos",
"age": 999,
"lastName": "whatever",
"subjects": [
{
"name": "algebra",
"mark": 999
},
{
"name": "quemistry",
"mark": 999
},
{
"name": "whatever",
"mark": 999
}
],
"tutors": [
{
"name": "John",
"phone": 2000,
"status": "father"
},
{
"name": "Anne",
"phone": 200000,
"status": "mother"
}
],
"exams": [
{
"id": "exam1",
"file": "file"
},
{
"id": "exam2",
"file": "file"
},
{
"id": "exam3",
"file": "file"
}
]
}
Notice that I have simplified the JSON a lot; the real nested objects have many more fields. I have two questions:
Is this a proper way to model one-to-many relationships in MongoDB, and how do I avoid such long JSON documents without splitting them into more documents?
Isn't it a performance issue that I have to go through all the students just to get the subjects, for example?
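On the second question, a projection is one way to fetch only the subjects of a student instead of the whole document. The following is a minimal sketch; the collection name `students` is an assumption, and the field names are taken from the sample document above.

```javascript
// Select one student by name (field taken from the sample document).
const studentFilter = { name: "Thanos" };

// Projection: return only the `subjects` array and suppress `_id`.
const subjectsOnly = { subjects: 1, _id: 0 };

// In mongosh (collection name `students` is hypothetical):
// db.students.find(studentFilter, subjectsOnly)
// An index on the lookup field, e.g. db.students.createIndex({ name: 1 }),
// lets the server locate the student without scanning every document.
console.log(JSON.stringify({ studentFilter, subjectsOnly }));
```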

Embedded vs. Referenced Documents mongoDB

I'm starting to study MongoDB, and I want to better understand when to use embedded versus referenced documents.
The project I'm trying to build is similar to a POS (point of sale) system and works like this:
Every time someone makes a purchase, it is inserted into the database. There are customers with N groups of stores, and each of these groups has N stores and N POS.
After this, I want to update the prices in specific stores (not in whole groups) and produce a summary of how many sales each POS made.
So, in terms of performance, what is the best design and why?
Here are some examples that I made:
Embedded:
{
"group1": [
{
"store_id": 1,
"store1": "store_name",
"POS": [
{
"id_POS": 1,
"POS_name": "name_1",
"purchases": [
{
"id": 1,
"date": "2022_10_05",
"time": "10:00:00"
},
{
"id": 2,
"date": "2022_10_05",
"time": "10:10:00"
}
]
},
{
"id_POS": 2,
"POS_name": "name_2",
"purchases": [
{
"id": 1,
"date": "2022_10_05",
"time": "10:50:00"
},
{
"id": 2,
"date": "2022_10_05",
"time": "11:59:00"
}
]
}
],
"itens": [
{
"id_prod": 4,
"prod_name": "avocado",
"price": 2.5
},
{
"id_prod": 5,
"prod_name": "potato",
"price": 1.5
}
]
}
]
}
Referenced:
Group of stores, POS, and itens collections:
{
"group1":{
"stores":[
{
"store_id":1,
"name":"store1",
"POS":[
{"POS":[
{"id_pos":1},
{"id_pos":2}
]}
],
"itens":[
{"id_prod":4},
{"id_prod":5}
]
}
]
}
}
{
"id_pos": 1,
"id_store": 1,
"purchases": [
{
"id": 1,
"date": "2022_10_05",
"time": "10:50:00"
},
{
"id": 2,
"date": "2022_10_05",
"time": "11:59:00"
}
]
}
{
"id_store": 1,
"itens":[{
"id_prod": 4,
"prod_name": "avocado",
"price": 2.5
},
{
"id_prod": 5,
"prod_name": "potato",
"price": 1.5
}]
}
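To make the two required operations concrete, here is a rough sketch of how they might look against the referenced design above. The collection names `items` and `pos` and the new price are assumptions; the field names (`id_store`, `itens`, `id_prod`, `id_pos`, `purchases`) come from the sample documents.

```javascript
// 1) Update the price of product 4 in store 1 only.
// The positional `$` refers to the first array element matched by the filter.
const priceFilter = { id_store: 1, "itens.id_prod": 4 };
const priceUpdate = { $set: { "itens.$.price": 2.75 } }; // 2.75 is an example value
// In mongosh (collection name `items` is hypothetical):
// db.items.updateOne(priceFilter, priceUpdate)

// 2) Summarize how many sales each POS made by counting its purchases.
const salesPerPos = [
  {
    $project: {
      _id: 0,
      id_pos: 1,
      id_store: 1,
      // $ifNull guards against POS documents without a purchases array.
      sales: { $size: { $ifNull: ["$purchases", []] } },
    },
  },
];
// In mongosh (collection name `pos` is hypothetical):
// db.pos.aggregate(salesPerPos)
console.log(JSON.stringify({ priceFilter, priceUpdate, salesPerPos }, null, 2));
```

In the embedded design the same price update would need to address a product nested inside a store nested inside a group, which is one practical argument for keeping items and purchases in their own collections.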

How can I get only specific object from nested array mongodb

I'm using MongoDB with PHP. I know how to get a document based on product_id, but I need only one specific object from that document. I don't know how to get just that object from the nested array based on product_id.
For example, my expected output is:
"products": [
{
"product_id": 547,
"name": "cola",
"quantity": 24
}
]
Then I want to make some changes to that object (e.g. update the quantity) and write it back to the database.
My collection looks like this:
{
"_id": {
"$oid": "62ed30855836b16fd38a00b9"
},
"name": "drink",
"products": [
{
"product_id": 547,
"name": "cola",
"quantity": 24
},
{
"product_id": 984,
"name": "fanta",
"quantity": 42
},
{
"product_id": 404,
"name": "sprite",
"quantity": 12
},
{
"product_id": 854,
"name": "water",
"quantity": 35
}
]
}
Try this:
db.getCollection('test').aggregate([
{
"$unwind": "$products"
},
{
"$match": {
"products.product_id": 547
}
},
{
"$replaceRoot": {
"newRoot": {
"$mergeObjects": [
"$$ROOT",
"$products"
]
}
}
},
{
"$project": {
"products": 0
}
}
])
The query gives the following output:
{
"name" : "cola",
"product_id" : 547,
"quantity" : 24
}
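As an alternative sketch, `$filter` keeps the result in the `products: [...]` shape shown in the expected output, and the positional `$` operator can change the quantity directly on the server without reading the object first. The new quantity of 30 is just an example value; the collection name `test` is taken from the query above.

```javascript
const productId = 547;

// Keep only the matching element of the `products` array.
const onlyMatchingProduct = [
  { $match: { "products.product_id": productId } },
  {
    $project: {
      _id: 0,
      products: {
        $filter: {
          input: "$products",
          as: "p",
          cond: { $eq: ["$$p.product_id", productId] },
        },
      },
    },
  },
];
// In mongosh: db.getCollection('test').aggregate(onlyMatchingProduct)

// Update the quantity of the matched array element in place.
const quantityFilter = { "products.product_id": productId };
const quantityUpdate = { $set: { "products.$.quantity": 30 } }; // 30 is an example value
// In mongosh: db.getCollection('test').updateOne(quantityFilter, quantityUpdate)
console.log(JSON.stringify({ onlyMatchingProduct, quantityFilter, quantityUpdate }, null, 2));
```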

Get the last document and filter nested array

First time playing with MongoDB (and pymongo) and I have the documents below.
I'm trying to make the following query:
get the top 1 entry, ordered by date_added descending
from the stocks of that entry (let's call it entries.stocks), get the stocks whose price is less than a balance value
order them by volume and price, in that order.
From what I've googled it seems this can be done with aggregation but I only managed to do the first step with:
db.entries.find().sort('date_added', -1).limit(1)
Documents:
[
{
"_id": {
"$oid": "5fcd2582a7a12288df9a6bf3"
},
"stocks": [
{
"name": "stock_name",
"volume": 73,
"price": 135
},
{
"name": "stock_name",
"volume": 44,
"price": 324
}
],
"date_added": {
"$date": "2020-12-06T18:40:02.000Z"
}
},
{
"_id": {
"$oid": "5fcd2582a7a12288df9a6ad0"
},
"stocks": [
{
"name": "stock_name",
"volume": 342,
"price": 43
},
{
"name": "stock_name",
"volume": 544,
"price": 66
}
],
"date_added": {
"$date": "2020-12-05T18:40:02.000Z"
}
}
]
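The three steps could be expressed as a single aggregation pipeline along the following lines. This is a sketch under assumptions: the helper name `latestAffordableStocks` is hypothetical, and ascending order is assumed for both volume and price because the question does not state a direction.

```javascript
// Builds the pipeline for a given balance (hypothetical helper).
function latestAffordableStocks(balance) {
  return [
    // Step 1: newest entry only.
    { $sort: { date_added: -1 } },
    { $limit: 1 },
    // Step 2: one document per stock, then keep those below the balance.
    { $unwind: "$stocks" },
    { $match: { "stocks.price": { $lt: balance } } },
    // Step 3: order by volume, then price (ascending is an assumption).
    { $sort: { "stocks.volume": 1, "stocks.price": 1 } },
    // Return the stock objects themselves rather than the wrapper documents.
    { $replaceRoot: { newRoot: "$stocks" } },
  ];
}

const stocksPipeline = latestAffordableStocks(200);
// In mongosh: db.entries.aggregate(stocksPipeline)
console.log(JSON.stringify(stocksPipeline, null, 2));
```

With pymongo, the same stages can be written as a list of dicts and passed to `db.entries.aggregate(...)`.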

MongoDB query to nested document returns nothing

Here is a sample product document stored in MongoDB:
[
{
"_id": "....",
"user_id": "....",
"username": "....",
// omitted
"product": {
"description": "A stunningly beautiful page with a constant growth of followers, etc. ❤",
"banner_img": "https://tse3-mm.cn.bing.net/th/id/OIP.jNCbt_c_8vnq7sbWluCVnQHaCG?w=300&h=85&c=7&o=5&pid=1.7",
"niches": "Fashion & Style",
"categories": [
{
"type": "Single",
"pricing": [
{
"time": 6,
"price": 15,
"bio_price": 10
},
{
"time": 12,
"price": 20,
"bio_price": 10
}
]
},
{
"type": "Multiple",
"pricing": [
{
"time": 12,
"price": 30.5,
"bio_price": 15
}
]
},
{
"type": "Story",
"pricing": [
{
"time": 24,
"price": 40,
"bio_price": 20
}
]
}
]
},
"created_at": "2020-01-11T18:19:54.312Z",
"updated_at": "2020-01-11T18:19:54.312Z"
}
]
I'd like to find an account that has a product with Multiple or Story pricing type. My query is as follows:
{
product: {
categories: {
pricing: {
$elemMatch: {
type: { $in: ['Multiple', 'Story'] }
}
}
}
}
}
I'm running this query with lucid-mongo in the AdonisJS framework.
It should return at least one document, but it returns nothing.
I ran the query both in the framework and in mongo.exe, but it doesn't work in either.
What's wrong with my query?
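Two things stand out in the query above. A nested object in a filter is compared for exact equality with the whole subdocument, so `{ product: { categories: ... } }` only matches a `product` that consists of exactly that content. Also, `type` is a field of the `categories` elements, not of the `pricing` elements. A sketch of two filters that should match the sample document, using dot notation (the collection name `products` is an assumption; how the filter is passed through lucid-mongo is not shown here):

```javascript
// Variant 1: dot notation reaches into the array of category objects.
const byDotNotation = {
  "product.categories.type": { $in: ["Multiple", "Story"] },
};

// Variant 2: $elemMatch on the categories array; useful when several
// conditions must hold for the same array element.
const byElemMatch = {
  "product.categories": {
    $elemMatch: { type: { $in: ["Multiple", "Story"] } },
  },
};

// In the mongo shell (collection name `products` is hypothetical):
// db.products.find(byDotNotation)
// db.products.find(byElemMatch)
console.log(JSON.stringify({ byDotNotation, byElemMatch }, null, 2));
```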