How to find a result and apply localization in MongoDB? - mongodb

Given the following sample data
db.cars.insertMany([
{
"category": "sedan",
"model": {
"manufacturer": {
"en": "Mercedes",
"ru": "Мерседес"
},
"number": "E320"
}
},
{
"category": "SUV",
"model": {
"manufacturer": {
"en": "Audi",
"ru": "Ауди"
},
"number": "Q7"
}
},
])
I can select documents by category with the following query:
db.cars.find({'category': 'sedan'})
If I want to map a field to one of its localized values, I can do the following:
db.cars.aggregate([{$project: {'model.manufacturer': '$model.manufacturer.ru'}}])
Combining the two, I get:
db.cars.aggregate([{$match: {'category': 'SUV'}}, {$project: {'model.manufacturer': '$model.manufacturer.ru'}}])
My question is: is this the right approach, and if so, how do I keep all the other fields without listing them in the aggregation query (like {'model': 1, ... })?

Use $addFields. From the documentation:
Adds new fields to documents. $addFields outputs documents that contain all existing fields from the input documents and newly added fields.
The $addFields stage is equivalent to a $project stage that explicitly specifies all existing fields in the input documents and adds the new fields.
Starting in version 4.2, MongoDB adds a new aggregation pipeline stage $set that is an alias for $addFields.
https://docs.mongodb.com/manual/reference/operator/aggregation/addFields/
db.cars.aggregate([
{
$match: {
"category": "SUV"
}
},
{
$addFields: {
"model.manufacturer": "$model.manufacturer.ru"
}
}
])
Or, for MongoDB 4.2 and later:
db.cars.aggregate([
{
$match: {
"category": "SUV"
}
},
{
$set: {
"model.manufacturer": "$model.manufacturer.ru"
}
}
])
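If the language code varies per request, the same pipeline can be built from a parameter. The following is a minimal sketch in Python, assuming the field layout from the question. The helper name localized_pipeline is hypothetical, and the commented call assumes a pymongo collection named cars.

```python
def localized_pipeline(category, lang):
    """Build a pipeline that filters by category and replaces the
    manufacturer sub-document with its value for one language."""
    return [
        {"$match": {"category": category}},
        # $addFields keeps every other field of the document untouched.
        {"$addFields": {"model.manufacturer": f"$model.manufacturer.{lang}"}},
    ]


pipeline = localized_pipeline("SUV", "ru")
# With pymongo (assumed): results = list(db.cars.aggregate(pipeline))
```

Because lang is interpolated into a field path, it is probably best restricted to a known set of language codes before calling the helper.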

Related

MongoDB Rust Driver weird behavior

I have run into something weird.
I installed MongoDB Compass and wrote an aggregation query that works in the Aggregation tab, but when I use the same query in my Rust web server it behaves very strangely.
Original message:
{"_id":{"$oid":"61efd41c56ffe6b1b4a15c7a"},"time":{"$date":"2022-01-25T10:42:36.175Z"},"edited_time":{"$date":"2022-01-30T14:29:54.361Z"},"changes":[],"content":"LORA","author":{"$oid":"61df3cab3087579f8767a38d"}}
Message in MongoDB compass after the query:
{
"_id": {
"$oid": "61efd41c56ffe6b1b4a15c7a"
},
"time": {
"$date": "2022-01-25T10:42:36.175Z"
},
"edited_time": {
"$date": "2021-12-17T09:55:45.856Z"
},
"changes": [{
"time": {
"$date": "2021-12-17T09:55:45.856Z"
},
"change": {
"ChangedContent": "LORA"
}
}],
"content": "LMAO",
"author": {
"$oid": "61df3cab3087579f8767a38d"
}
}
Message after the web server's query:
{
"_id": {
"$oid": "61efd41c56ffe6b1b4a15c7a"
},
"time": {
"$date": "2022-01-25T10:42:36.175Z"
},
"edited_time": {
"$date": "2022-01-30T14:40:57.152Z"
},
"changes": {
"$concatArrays": ["$changes", [{
"time": {
"$date": "2022-01-30T14:40:57.152Z"
},
"change": {
"ChangedContent": "$content"
}
}]]
},
"content": "LMAO",
"author": {
"$oid": "61df3cab3087579f8767a38d"
}
}
Pure query in MongoDB Compass:
$set stage
{
"changes": { $concatArrays: [ "$changes", [ { "time": ISODate('2021-12-17T09:55:45.856+00:00'), "change": { "ChangedContent": "$content" } } ] ] },
"edited_time": ISODate('2021-12-17T09:55:45.856+00:00'),
"content": "LMAO",
}
Pure query in Web Server:
let update_doc = doc! {
"$set": {
"changes": {
"$concatArrays": [
"$changes", [
{
"time": now,
"change": {
"ChangedContent": "$content"
}
}
]
]
},
"edited_time": now,
"content": content
}
};
I am using the update_one method, like this:
messages.update_one(message_filter, update_doc, None).await?;
I don't really understand what is going on, and this happens often. Sometimes it fixes itself when I add an extra scope somewhere in the doc, e.g. { }, but this time I couldn't figure it out.
I also had a version of the query with $push, but that didn't work either.
Is there a fault in the Rust driver, or am I doing something wrong? Are there formatting rules for the Rust driver that I am missing?
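The $set aggregation pipeline stage is different from the $set update operator. The pipeline stage evaluates aggregation expressions such as $concatArrays and field references such as "$content". The update operator does not: it stores the value it is given literally, which is why the $concatArrays document ended up in changes as-is.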
$set Aggregation Pipeline Stage
$set appends new fields to existing documents. You can include one or more $set stages in an aggregation operation.
To add field or fields to embedded documents (including documents in arrays) use the dot notation.
To add an element to an existing array field with $set, use with $concatArrays.
$set Update Operator
Starting in MongoDB 5.0, update operators process document fields with
string-based names in lexicographic order. Fields with numeric names
are processed in numeric order.
If the field does not exist, $set will add a new field with the
specified value, provided that the new field does not violate a type
constraint. If you specify a dotted path for a non-existent field,
$set will create the embedded documents as needed to fulfill the
dotted path to the field.
If you specify multiple field-value pairs, $set will update or create
each field.
So if you want to update an existing document by inserting elements into an array field, use the $push update operator (potentially with $each if you're inserting multiple elements):
let update_doc = doc! {
"$set": {
"edited_time": now,
"content": content
},
"$push": {
"changes": {
"time": now,
"change": {
"ChangedContent": "$content"
}
}
}
};
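For the $each variant mentioned above, a minimal sketch in Python follows. The helper name push_changes_update is hypothetical and the change values are placeholder literals. With a classic update document, these values must be supplied by the caller, since they cannot be read from the stored document.

```python
from datetime import datetime, timezone


def push_changes_update(changes, content, now):
    """Build a classic update document: $set for scalar fields and
    $push with $each to append several array elements at once."""
    return {
        "$set": {"edited_time": now, "content": content},
        "$push": {"changes": {"$each": list(changes)}},
    }


now = datetime(2022, 1, 30, 14, 40, 57, tzinfo=timezone.utc)
update_doc = push_changes_update(
    [
        # Placeholder literals for illustration only.
        {"time": now, "change": {"ChangedContent": "LORA"}},
        {"time": now, "change": {"ChangedContent": "LOL"}},
    ],
    "LMAO",
    now,
)
# With pymongo (assumed): messages.update_one(message_filter, update_doc)
```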
Edit: I missed that $content was supposed to be mapped from the existing field as well. An update document does not support that, but MongoDB (4.2 and later) can use an aggregation pipeline to update the document. See: Update MongoDB field using value of another field. So you can use the original $set, just passed as a pipeline:
let update_pipeline = vec![
doc! {
"$set": {
"changes": {
"$concatArrays": [
"$changes", [
{
"time": now,
"change": {
"ChangedContent": "$content"
}
}
]
]
},
"edited_time": now,
"content": content
}
}
];
messages.update_one(message_filter, update_pipeline, None).await?;
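The same distinction applies in other drivers: a single document is treated as a classic update, while a list of stages is treated as a pipeline. Here is a sketch of the equivalent in Python, assuming pymongo and a collection named messages. The helper name edit_message_pipeline is hypothetical.

```python
from datetime import datetime, timezone


def edit_message_pipeline(content, now):
    """Build an update *pipeline* (a list of stages). Inside a pipeline,
    "$changes" and "$content" are evaluated against the stored document."""
    return [
        {
            "$set": {
                "changes": {
                    "$concatArrays": [
                        "$changes",
                        [{"time": now, "change": {"ChangedContent": "$content"}}],
                    ]
                },
                "edited_time": now,
                "content": content,
            }
        }
    ]


now = datetime.now(timezone.utc)
update_pipeline = edit_message_pipeline("LMAO", now)
# With pymongo (assumed): messages.update_one(message_filter, update_pipeline)
```

Within one $set stage, every expression reads the document as it was before the stage ran, so "$content" in the new changes entry refers to the previous content, not to the new value set in the same stage.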

multi-stage aggregation pipeline matching data based on fields retrieved through $lookup

I'm trying to build a complex, nested aggregation pipeline in MongoDB (4.4.9 Community Edition, using the pymongo driver for Python 3.10).
There are relevant data points in different collections which I want to aggregate into one new view (ideally) or, if that doesn't work, a new collection.
The collections, and the relevant fields therein, follow a hierarchy. There is members, which contains membershipNumber, the top-level key on which other data is to be merged.
> members.find_one()
{'_id': ObjectId('61153299af6122XXXXXXXXXXXXX'), 'membershipNumber': 'N03XXXXXX'}
Then there's a different collection, an_users, which contains the membership number as well as a linked field, an_user_id. an_user_id is used in other collections to denote records or array entries that pertain to that particular user.
I 'join' members and an_users like so:
result = members.aggregate([
{
'$lookup': {
'from': 'an_users',
'localField': 'membershipNumber',
'foreignField': 'memref',
'as': 'an_users'
}
},
{ '$unwind' : '$an_users' },
{
'$project' : {
'_id' : 1,
'membershipNumber' : 1,
'an_user_id' : '$an_users.user_id'
}
}
]);
So far so good, this returns the desired, aggregated record:
{'_id': ObjectId('61153253aBBBBBBBBBBBB'),
'membershipNumber': 'N0XXXXXXXX',
'an_user_id': '48XXXXXX'}
Now, I have a third collection, emails, in which each record is an email. It contains the an_user_id as a string in arrays that record which users acted on that email; for example, the an_user_ids in the click array are the users that clicked a link in that email.
{'_id': ObjectId('blah'),
'email_id': '407XXX',
'actions_count': 17,
'administrative_title': 'test',
'bounce': ['3440XXXX'],
'click': ['38294CCC',
'418FFFF',
'48XXXXXX',
'38eGGGG']}
I want to count the number of occurrences of a given an_user_id (which I've obtained from the aggregation) in arrays (e.g. clicks, bounces, opens) in the emails collection, and include it in the .aggregate call, to retrieve something like this:
{'_id': ObjectId('61153253aBBBBBBBBBBBB'),
'membershipNumber': 'N0XXXXXXXX',
'an_user_id': '48XXXXXX',
'n_email_clicks' : 412,
'n_email_bounces' : 12
}
Further, I might want to also attach counts of an_user_id in other collections in my DB.
Consider, e.g., this collection called events:
{
"_id": "617ffa96ee11844e143a63dd",
"id": "12345",
"administrative_title": "my_event",
"created_at": {
"$date": "2020-01-15T16:28:50.000Z"
},
"event_creator_id": "123456",
"event_title": "my_event",
"group_id": "123456",
"permalink": "event_id",
"rsvp_count": 54,
"rsvps": [{
"rsvp_id": "56789",
"display_name": "John Doe",
"rsvp_user_id": "48XXXXXX",
"rsvp_created_at": {
"$date": "2020-01-28T15:38:50.000Z"
},
"rsvp_updated_at": {
"$date": "2020-01-28T15:38:50.000Z"
},
"first_name": "John",
"last_name": "Doe",
}, {
"rsvp_id": "543895",
"display_name": "James Appleslice",
"rsvp_user_id": "N03XXXXXX",
"rsvp_created_at": {
"$date": "2020-02-05T13:15:14.000Z"
},
"rsvp_updated_at": {
"$date": "2020-02-05T13:15:14.000Z"
},
"first_name": "James",
"last_name": "Appleslice"}
]
}
So, the end-product would look something like this:
{'_id': ObjectId('61153253aBBBBBBBBBBBB'),
'membershipNumber': 'N0XXXXXXXX',
'an_user_id': '48XXXXXX',
'n_email_clicks' : 412,
'n_email_bounces' : 12,
'n_rsvps' : 12
}
My idea was to use the $lookup stage. However, I only know how to use it to match on fields that exist in the parent collection I'm aggregating on, not on fields that have been generated during the aggregation.
Any help would be hugely appreciated!
You could use $lookup with a pipeline. First, $lookup the user id, followed by a nested $lookup to find the emails whose click array contains that user id. Finally, a few more stages collect the results and format them as you need. You can also add an $out stage if you would like to write the results into another collection.
db.members.aggregate([{
$lookup: {
from: "an_users",
let: {
membershipNumber: "$membershipNumber"
},
pipeline: [
{
$match: {
$expr: {
$eq: [
"$memref",
"$$membershipNumber"
]
},
}
},
{
"$lookup": {
"from": "emails",
"localField": "user_id",
"foreignField": "click",
"as": "clicks"
}
},
{
"$project": {
"_id": 1,
"membershipNumber": 1,
"an_user_id": "$user_id",
"n_email_clicks": {
$size: "$clicks"
}
}
}
],
as: "details"
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: [
{
$arrayElemAt: [
"$details",
0
]
},
"$$ROOT"
]
}
}
},
{
$project: {
details: 0
}
}])
Working example - https://mongoplayground.net/p/yrFsNp44hpi
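The question also asks for bounce counts. One way to extend the pipeline above is to add one nested $lookup per array field. The sketch below builds such a pipeline in Python for use with pymongo. The helper name member_email_stats_pipeline and the temporary field names are hypothetical, and the array field names click and bounce are taken from the sample email document.

```python
def member_email_stats_pipeline(counters):
    """Build the members pipeline; `counters` maps an array field in
    `emails` (e.g. "click") to the name of the output count field."""
    inner = [
        {"$match": {"$expr": {"$eq": ["$memref", "$$membershipNumber"]}}}
    ]
    projection = {"_id": 0, "an_user_id": "$user_id"}
    for array_field, out_name in counters.items():
        # One $lookup per array; matching emails land in a temporary array.
        inner.append(
            {
                "$lookup": {
                    "from": "emails",
                    "localField": "user_id",
                    "foreignField": array_field,
                    "as": f"_{array_field}_emails",
                }
            }
        )
        projection[out_name] = {"$size": f"$_{array_field}_emails"}
    inner.append({"$project": projection})
    return [
        {
            "$lookup": {
                "from": "an_users",
                "let": {"membershipNumber": "$membershipNumber"},
                "pipeline": inner,
                "as": "details",
            }
        },
        {
            "$replaceRoot": {
                "newRoot": {
                    "$mergeObjects": [{"$arrayElemAt": ["$details", 0]}, "$$ROOT"]
                }
            }
        },
        {"$project": {"details": 0}},
    ]


pipeline = member_email_stats_pipeline(
    {"click": "n_email_clicks", "bounce": "n_email_bounces"}
)
# With pymongo (assumed): result = members.aggregate(pipeline)
```

Each count is the number of emails in which the user id appears in the given array, so an email is counted once per array even if the id were listed there more than once.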

mongodb aggregation - nested group

I'm trying to perform a nested group. I have an array of documents with two keys (invoiceIndex, proceduresIndex), and I need the documents to be arranged like so:
invoices (parent) -> procedures (children)
invoices: [ // Array of invoices
{
.....
"procedures": [{}, ...] // Array of procedures
}
]
Here is a sample document
{
"charges": 226.09000000000003,
"currentBalance": 226.09000000000003,
"insPortion": "",
"currentInsPortion": "",
"claim": "notSent",
"status": "unpaid",
"procedures": {
"providerId": "9vfpjSraHzQFNTtN7",
"procedure": "21111",
"description": "One surface",
"category": "basicRestoration",
"surface": [
"m"
],
"providerName": "B Dentist",
"proceduresIndex": "0"
},
"patientId": "mE5vKveFArqFHhKmE",
"patientName": "Silvia Waterman",
"invoiceIndex": "0",
"proceduresIndex": "0"
}
Here is what I have tried
https://mongoplayground.net/p/AEBGmA32n8P
Can you try the following:
db.collection.aggregate([
{
$group: {
_id: "$invoiceIndex",
procedures: {
$push: "$procedures"
},
invoice: {
$first: "$$ROOT"
}
}
},
{
$addFields: {
"invoice.procedures": "$procedures"
}
},
{
"$replaceRoot": {
"newRoot": "$invoice"
}
}
])
I retain the invoice fields with invoice: { $first: "$$ROOT" } and keep the $push of procedures as a separate field. Then, with $addFields, I move that array of procedures into the new invoice object and replace the root with it.
You shouldn't use proceduresIndex as part of _id in $group, because then you won't be able to get the set of procedures per invoiceIndex. With the $group logic above, it works as expected.
Link to mongoplayground
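To make the expected output shape concrete, here is a plain-Python emulation of the three stages above. It is only a sketch of the semantics, not MongoDB itself. The function name group_invoices is hypothetical, and the sample documents are trimmed to the relevant keys with made-up procedure codes apart from 21111.

```python
def group_invoices(docs):
    """Plain-Python emulation of the $group / $addFields / $replaceRoot
    pipeline: one output document per invoiceIndex."""
    invoices = {}
    for doc in docs:
        key = doc["invoiceIndex"]
        if key not in invoices:
            # $first: "$$ROOT" keeps the first document seen for the invoice.
            invoices[key] = {**doc, "procedures": []}
        # $push: "$procedures" collects every procedure of that invoice.
        invoices[key]["procedures"].append(doc["procedures"])
    return list(invoices.values())


docs = [
    {"invoiceIndex": "0", "proceduresIndex": "0", "procedures": {"procedure": "21111"}},
    {"invoiceIndex": "0", "proceduresIndex": "1", "procedures": {"procedure": "21112"}},
    {"invoiceIndex": "1", "proceduresIndex": "0", "procedures": {"procedure": "21113"}},
]
invoices = group_invoices(docs)
```

Unlike this emulation, MongoDB does not guarantee the order of $group output, and $first depends on the order in which documents reach the stage, so a $sort before $group may be needed if that matters.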

Move data from inside nested array

I have inserted multiple documents in my Mongo database incorrectly. I have accidentally nested the data inside another data object:
{
"_id": "5cdfda8ddc5cf00031fd3949",
"payload": {
"timestamp": "2019-05-18T10:12:29.896Z",
"data": {
"data": {
"name": 10,
"age": 10,
}
}
},
"__v": 0
}
I would like the document to not have the extra data object. So I would like it to look like this:
{
"_id": "5cdfda8ddc5cf00031fd3949",
"payload": {
"timestamp": "2019-05-18T10:12:29.896Z",
"data": {
"name": 10,
"age": 10,
}
},
"__v": 0
}
Is there a way in Mongo for me to update all the documents that have 2 data objects to just have one like shown above?
On MongoDB versions before 4.2, you cannot do this in place with a single database request. You have to loop over all documents programmatically, set the new data, and update them in the database.
Alternatively, you could use the aggregation framework. It won't update the documents in place, but you can use the $out stage to write the results to a new collection, if that's an option.
db.collection.aggregate([
{
$project: {
__v : 1,
"payload.timestamp" : 1,
"payload.data" : "$payload.data.data"
},
},
{
"$out": "newCollection"
}
])
Or if you have a mixture of docs with correct format and docs with incorrect format, you can use the $cond operator to determine the correct output:
db.collection.aggregate([
{
$project: {
__v : 1,
"payload.timestamp" : 1,
"payload.data" : {
$cond: [
{ $ne : [ "$payload.data.data", undefined]},
"$payload.data.data",
"$payload.data"
]}
}
},
{
"$out": "newCollection"
}
])
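On MongoDB 4.2 and later, update commands accept an aggregation pipeline, so the fix can probably be done in place with a single request. The sketch below builds the filter and pipeline in Python. The helper name unnest_data_update is hypothetical, and the commented call assumes pymongo.

```python
def unnest_data_update():
    """Return (filter, pipeline) for update_many: the filter selects only
    documents that still have payload.data.data; the pipeline lifts it."""
    doc_filter = {"payload.data.data": {"$exists": True}}
    # A list of stages (not a plain dict) makes this a pipeline update.
    pipeline = [{"$set": {"payload.data": "$payload.data.data"}}]
    return doc_filter, pipeline


doc_filter, pipeline = unnest_data_update()
# With pymongo (assumed): db.collection.update_many(doc_filter, pipeline)
```

The filter leaves correctly shaped documents alone, but it would also match a correct document whose data legitimately contains a field named data, so it is worth checking for that case first.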

MongoDB moving array of sub-documents to its own collection

I'm looking to move an array of subdocuments into a collection of its own, keyed by the owner id. Currently, my collection is formed like this:
"_id": ObjectId("123"),
"username": "Bob Dole",
"logins": [{
"_id": ObjectId("abc123"),
"date": ISODate("2016")
}, {
"_id": ObjectId("def456"),
"date": ISODate("2016")
}]
I'm looking for the best way to write a script that would loop over each user and move each item in the logins array to its own "logins" collection, as follows:
{
"_id": ObjectId("abc123"),
"_ownerId": ObjectId("123"),
"date": ISODate("2016")
}
{
"_id": ObjectId("def456"),
"_ownerId": ObjectId("123"),
"date": ISODate("2016")
}
When the script ends, I'd like the login array to be removed entirely from all users.
This query creates a new collection using the aggregation framework.
To preview the result first, just remove the $out stage.
db.thinking.aggregate([
{
$unwind:"$logins"
},{
$project:{
_id:"$logins._id",
_ownerId:"$_id",
date:"$logins.date"
}
},
{
$out: "newCollection"
}
])
To remove the logins array from all users afterwards, as suggested in a comment:
db.thinking.updateMany({}, { "$unset": { "logins": "" } })
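To check what the $unwind and $project stages produce before running them against real data, their effect can be emulated in plain Python. This is only a sketch of the semantics. The function name extract_logins is hypothetical, ids are shown as plain strings instead of ObjectId values, and the second user is made up to show that users without logins yield no output.

```python
def extract_logins(users):
    """Plain-Python emulation of $unwind + $project: one output document
    per element of each user's logins array."""
    out = []
    for user in users:
        # Like $unwind by default, users with no logins produce nothing.
        for login in user.get("logins", []):
            out.append(
                {"_id": login["_id"], "_ownerId": user["_id"], "date": login["date"]}
            )
    return out


users = [
    {
        "_id": "123",
        "username": "Bob Dole",
        "logins": [
            {"_id": "abc123", "date": "2016"},
            {"_id": "def456", "date": "2016"},
        ],
    },
    {"_id": "456", "username": "No Logins"},  # hypothetical user without logins
]
logins = extract_logins(users)
```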