Dynamic field path for $set in mongo db aggregation - mongodb

I have documents containing multilingual data. A simplified version looks like:
{
languages: [
"en",
"fr"
],
title: {
en: "Potato Gratin",
fr: "Gratin de pomme de terre"
},
}
Important parts are:
title contains translations in the shape <lang> : <text>
languages contains the list of supported languages, the first one being the default language.
not all documents support the same languages
What I would like to do is querying that document for a specific language and either
replace the title object by the correct translation if the language is supported
replace the title object by the default translation if the language is not
I.e. querying the above document in French should return {"title": "Gratin de pomme de terre"}, and if queried in Chinese, it should return {"title": "Potato Gratin"}
I have a playground setup: https://mongoplayground.net/p/CP0Z20dTpgy. I have it so that it sets a lang property that the output should be in. I would then like to have a stage that looks like "$set": {"title": "$title.$lang"} but it complains that a field path component should not start with $, which I am guessing means that mongo does not support dynamic field paths?
Any idea on how to achieve something like that?
Some notes:
In the actual documents I have a lot of these fields so a solution with "$unwind" would be costly.
the reason it is structured with languages as keys is that it helps with indexing with Mongo Atlas. Changing the structure to have an array of translations would hurt other parts of the app.

You have to transform your object into an array with $objectToArray, filter this array, and take element 0 of it. Then you can pull the value back out of that element.
db.collection.aggregate([
{
"$match": {
"_id": BinData(0, "3BByrilZQ2GTdlXG0nrGXw=="),
},
},
{
"$set": {
"lang": {
"$cond": [
{
"$in": [
"zh",
"$languages"
]
},
"zh",
{
"$first": "$languages"
}
]
}
}
},
{
$addFields: {
title: {
"$arrayElemAt": [
{
"$filter": {
"input": {
"$objectToArray": "$title"
},
"as": "title",
"cond": {
$eq: [
"$$title.k",
"$lang"
]
}
}
},
0
]
}
}
},
{
$addFields: {
title: "$title.v"
}
}
])
Of course, you have to pass 'zh' as a parameter from your code, in both places.
You can test it here
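Because the language code appears twice in the pipeline, one option is to build the stages from a function. The following is only a sketch in plain JavaScript: buildTitlePipeline is a hypothetical helper name (not part of any driver), it leaves out the $match stage, and it assumes the caller passes a plain language code such as "zh".

```javascript
// Hypothetical helper: returns the three stages from the answer above,
// with the requested language code filled in at both places.
function buildTitlePipeline(lang) {
  return [
    {
      // Use the requested language if the document supports it,
      // otherwise fall back to the first (default) language.
      $set: {
        lang: {
          $cond: [{ $in: [lang, "$languages"] }, lang, { $first: "$languages" }],
        },
      },
    },
    {
      // Turn title into [{k, v}, ...] and keep the entry whose key equals lang.
      $addFields: {
        title: {
          $arrayElemAt: [
            {
              $filter: {
                input: { $objectToArray: "$title" },
                as: "title",
                cond: { $eq: ["$$title.k", "$lang"] },
              },
            },
            0,
          ],
        },
      },
    },
    // Replace the {k, v} entry by its value.
    { $addFields: { title: "$title.v" } },
  ];
}

// Usage, assuming a collection handle named `collection`:
// collection.aggregate(buildTitlePipeline("zh"))
```

The stages themselves are unchanged from the answer; only the way the language code is supplied differs.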


Query to find a collection with a number in it

So I have a query in which we are supposed to find all the numbers which contain "9".
One of my entries is as follows:
{
"_id": {
"$oid": "63945fb591a4f6443e8edbeb"
},
"customer_id": {
"$oid": "63945f6191a4f6443e8edbea"
},
"customer_name": [
"Spike",
"Takahashi"
],
"customer_age": {
"$numberLong": "23"
},
"mobile_no.": [
"7898654324",
"9232111456"
]
}
Now I know $regex is used to do the same for strings but I can't seem to get the output which i want. Probably coz i am using array and numbers as strings. Can anyone help me figure this out?
Now I know $regex is used to do the same for strings but I can't seem to get the output which i want. Probably coz i am using array and numbers as strings.
Actually I don't think the problem is the array here. MongoDB has pretty intuitive syntax for specifying that you want to match any item in the array. In general, something like this would work:
> db.foo.find()
[
{ _id: 1, x: [ '123', '456' ] },
{ _id: 2, x: [ '789', '000' ] }
]
> db.foo.find({x:/9/})
[
{ _id: 2, x: [ '789', '000' ] }
]
Why do I say "in general" and why didn't I use your exact sample document? It's because the field name of interest currently ends in a "." character. Continuing on the theme of MongoDB syntax, the "." character is what the database typically uses to denote nested fields:
> db.foo.find({'x.y':'abc'})
[
{ _id: 3, x: { y: 'abc', z: 0 } }
]
You can read more about querying nested fields using dot notation here.
The trailing dot in the field name (mobile_no.) is sort of confusing the database. We can see in this playground example that a regex similar to the one above (just using the slightly more verbose syntax due to a limitation of the playground) fails to return any results:
db.collection.find({
"mobile_no.": {
$regex: "9"
}
})
no document found
But if we remove the "." character from the document and the query, the document is retrieved as expected:
db.collection.find({
"mobile_no": {
$regex: "9"
}
})
[
{
"_id": ObjectId("63945fb591a4f6443e8edbeb"),
"customer_age": NumberLong(23),
"customer_id": ObjectId("63945f6191a4f6443e8edbea"),
"customer_name": [
"Spike",
"Takahashi"
],
"mobile_no": [
"7898654324",
"9232111456"
]
}
]
Playground example here.
MongoDB improved the way that we can interact with field names containing periods and dollar signs last year as outlined on this page. While you technically could query this document using those new techniques, it is quite convoluted and will not perform efficiently. The query would look something like this:
db.collection.find({
$expr: {
$reduce: {
input: {
$getField: "mobile_no."
},
initialValue: false,
in: {
$or: [
"$$value",
{
$regexMatch: {
input: "$$this",
regex: "9"
}
}
]
}
}
}
})
Playground example here
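To make the $reduce expression easier to read, here is a rough plain-JavaScript analogue of what it computes for one document. This is an illustration only and does not run against the database; anyMatches is a hypothetical name.

```javascript
// Plain-JavaScript analogue of the $reduce / $regexMatch expression above.
function anyMatches(values, pattern) {
  // Start from false (initialValue) and OR in the regex test of each element.
  return values.reduce((acc, value) => acc || pattern.test(value), false);
}

// The field name with the trailing dot is read with bracket notation here,
// which plays the role of $getField.
const customer = { "mobile_no.": ["7898654324", "9232111456"] };
console.log(anyMatches(customer["mobile_no."], /9/)); // true
```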
Personally I think it is much more straightforward to remove the trailing "." from the field name as it does not provide any particular value being stored in the backend database anyway.

$elemMatch doesn't work on nested documents in MongoDB

Stack Overflow!
I have a very strange problem with using $elemMatch in MongoDB. I added multiple documents to a collection. Some of these documents were added using import feature in MongoDB Compass (Add Data -> Import File -> JSON) and some of them were added using insertMany().
Here is an example structure of a single document:
{
"id": "1234567890",
"date": "YYYY-MM-DD",
"contents": {
"0": {
"content": {
"id": "1111111111",
"name": "Name 1"
}
},
"1": {
"content": {
"id": "2222222222",
"name": "Name 2"
}
},
"2": {
"content": {
"id": "3333333333",
"name": "Name 3"
}
}
}
}
The thing is, when I use the following find query using this filter:
{date: "<some_date_here>", "contents": {
$elemMatch: {
"content.id": <some_id_here>
}
}}
ONLY documents that were imported from MongoDB Compass are showing up. Documents that were added by Mongosh or by NodeJS driver (doesn't matter), do NOT show up.
Am I missing something obvious here? What should I do in order to make all documents in a collection (that matches filter) to show up?
Simple filters that do not include $elemMatch work well and all documents that match the filtering rules show up. Problem seems to be with $elemMatch.
I tried adding the same batch of documents using different methods but only direct importing a JSON file in MongoDB Compass make them appear using a filter mentioned above.
Thank you for your help!
$elemMatch is for matching array elements, and in this case contents is not an array.
First convert the contents object to an array with $objectToArray, then filter that array on the id you are looking for, and finally use $match to keep only the documents where the filtered array is non-empty (together with any other condition, such as the date).
db.collection.aggregate([
{
"$addFields": {
"newField": {
"$objectToArray": "$contents"
}
}
},
{
"$addFields": {
"newField": {
"$filter": {
"input": "$newField",
"as": "z",
"cond": {
$eq: [
"$$z.v.content.id",
"1111111111"
]
}
}
}
}
},
{
"$addFields": {
"newField": {
$size: "$newField"
}
}
},
{
$match: {$and:[ {newField: {
$gt: 0
}},{date:{$gt:"<some_date_here>"}}]}
},
{$project:{
contents:1,
date:1,
id:1,
}}
])
https://mongoplayground.net/p/pue4QPp1dYR
The playground example does not include the date filter.
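As an illustration of the shape that $objectToArray produces and of what the $filter and $size stages then count, here is a plain-JavaScript analogue for a single document. It is only a sketch; countMatchingContents is a hypothetical name and the sample is a shortened version of the document from the question.

```javascript
// Plain-JavaScript analogue of $objectToArray followed by $filter and $size.
function countMatchingContents(doc, contentId) {
  // $objectToArray turns { "0": {...}, "1": {...} } into [{ k: "0", v: {...} }, ...]
  const asArray = Object.entries(doc.contents).map(([k, v]) => ({ k, v }));
  // $filter keeps entries whose v.content.id equals the id; $size counts them.
  return asArray.filter((entry) => entry.v.content.id === contentId).length;
}

const sample = {
  id: "1234567890",
  contents: {
    "0": { content: { id: "1111111111", name: "Name 1" } },
    "1": { content: { id: "2222222222", name: "Name 2" } },
  },
};
console.log(countMatchingContents(sample, "1111111111")); // 1
```

A document is kept by the final $match when this count is greater than 0.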

MongoDb - How to only return field of nested subdocument when using lookup aggregation?

I'm very new to MongoDb so I'm used to SQL.
Right now I have two collections in my database:
1) Series (which has nested subdocuments)
2) Review (decided to reference to episode subdocument because there will be a lot of reviews)
See this picture for a better understanding.
Now I want to achieve the following. For every review (two in this case), I want to get the episode name.
I tried the following:
db.review.aggregate([
{
$lookup:{
from:"series",
localField:"episode",
foreignField:"seasons.episodes._id",
as:"episode_entry"
}
}
]).pretty()
The problem is that this returns (of course) not only the title of the referenced episode, but the whole season document.
See the picture below for my current output.
I don't know how to achieve it. Please help me.
I'm using Mongo 3.4.9
I would recommend the following series structure which unwinds the season array into multiple documents one for each season.
This will help you with inserting/updating the episodes directly.
Something like
db.series.insertMany([
{
"title": "Sherlock Holmes",
"nr": 1,
"episodes": [
{
"title": "A Study in Pink",
"nr": 1
},
{
"title": "The Blind Banker",
"nr": 2
}
]
},
{
"title": "Sherlock Holmes",
"nr": 2,
"episodes": [
{
"title": "A Scandal in Belgravia",
"nr": 1
},
{
"title": "The Hounds of Baskerville",
"nr": 2
}
]
}
])
The lookup query will do something like this
episode: { $in: [ episodes._id1, episodes._id2, ... ] }
From the docs
If the field holds an array, then the $in operator selects the
documents whose field holds an array that contains at least one
element that matches a value in the specified array (e.g. <value1>, <value2>, etc.)
So lookup will return all episodes when there is a match. You can then filter to keep only the one matching your review's episode.
So the query will look like
db.review.aggregate([
{
"$lookup": {
"from": "series",
"localField": "episode",
"foreignField": "episodes._id",
"as": "episode_entry"
}
},
{
"$addFields": {
"episode_entry": {
"$arrayElemAt": [
{
"$filter": {
"input": {
"$let": {
"vars": {
"season": {
"$arrayElemAt": [
"$episode_entry",
0
]
}
},
"in": "$$season.episodes"
}
},
"as": "result",
"cond": {
"$eq": [
"$$result._id",
"$episode"
]
}
}
},
0
]
}
}
}
])
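To show what the $addFields stage is meant to do with the $lookup result, here is a plain-JavaScript analogue for a single review. It is a sketch under assumptions: pickEpisode is a hypothetical name, and plain strings stand in for ObjectId values (real ObjectIds would need to be compared with their equals method rather than ===).

```javascript
// Plain-JavaScript analogue of the $addFields stage above.
function pickEpisode(review) {
  // $lookup yields an array of matching season documents; take the first one.
  const season = review.episode_entry[0];
  if (!season) return undefined;
  // Keep only the episode whose _id equals the review's episode reference.
  return season.episodes.find((ep) => ep._id === review.episode);
}

// Hypothetical data: string ids stand in for ObjectIds.
const review = {
  episode: "e2",
  episode_entry: [
    {
      title: "Sherlock Holmes",
      nr: 1,
      episodes: [
        { _id: "e1", title: "A Study in Pink", nr: 1 },
        { _id: "e2", title: "The Blind Banker", nr: 2 },
      ],
    },
  ],
};
console.log(pickEpisode(review).title); // The Blind Banker
```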

How to concatenate all values and find specific substring in Mongodb?

I have json document like this:
{
"A": [
{
"C": "abc",
"D": "de"
},
{
"C": "fg",
"D": "hi"
}
]
}
I would like to check whether "A" contains the string ef or not.
First concatenate all values into abcdefghi, then search for ef.
In XML, XPATH it would be something like:
//A[contains(., 'ef')]
Is there any similar query in Mongodb?
All options are pretty horrible for this type of search, but there are a few approaches you can take. Please note though that the end case here is likely the best solution, but I present the options in order to illustrate the problem.
If your keys in the array "A" are consistently defined and always contained an array, you would be searching like this:
db.collection.aggregate([
// Filter the documents containing your parts
{ "$match": {
"$and": [
{ "$or": [
{ "A.C": /e/ },
{ "A.D": /e/ }
]},
{"$or": [
{ "A.C": /f/ },
{ "A.D": /f/ }
]}
]
}},
// Keep the original form and a copy of the array
{ "$project": {
"_id": {
"_id": "$_id",
"A": "$A"
},
"A": 1
}},
// Unwind the array
{ "$unwind": "$A" },
// Join the two fields and push to a single array
{ "$group": {
"_id": "$_id",
"joined": { "$push": {
"$concat": [ "$A.C", "$A.D" ]
}}
}},
// Copy the array
{ "$project": {
"C": "$joined",
"D": "$joined"
}},
// Unwind both arrays
{ "$unwind": "$C" },
{ "$unwind": "$D" },
// Join the copies and test if they are the same
{ "$project": {
"joined": { "$concat": [ "$C", "$D" ] },
"same": { "$eq": [ "$C", "$D" ] },
}},
// Discard the "same" elements and search for the required string
{ "$match": {
"same": false,
"joined": { "$regex": "ef" }
}},
// Project the original form of the matching documents
{ "$project": {
"_id": "$_id._id",
"A": "$_id.A"
}}
])
So apart from the horrible $regex matching there are a few hoops to go through in order to get the fields "joined" in order to again search for the string in sequence. Also note the reverse joining that is possible here that could possibly produce a false positive. Currently there would be no simple way to avoid that reverse join or otherwise filter it, so there is that to consider.
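The reverse join can be seen with a plain-JavaScript analogue of the pairing that the two $unwind stages produce for the sample document. This is only an illustration; joinedPairs is a hypothetical name, and the search string "ia" is chosen purely to show a false positive.

```javascript
// Plain-JavaScript analogue of the pairing done by the two $unwind stages.
function joinedPairs(doc) {
  // Per-element join of the two fields: ["abcde", "fghi"]
  const joined = doc.A.map((item) => item.C + item.D);
  const pairs = [];
  for (const c of joined) {
    for (const d of joined) {
      // Discard the pairs where both copies are the same element.
      if (c !== d) pairs.push(c + d);
    }
  }
  return pairs;
}

const letters = { A: [{ C: "abc", D: "de" }, { C: "fg", D: "hi" }] };
const pairs = joinedPairs(letters);
console.log(pairs); // [ 'abcdefghi', 'fghiabcde' ]

// "ia" does not occur in the in-order string, but it does occur in the reverse join.
console.log(pairs.map((p) => /ia/.test(p))); // [ false, true ]
```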
Another approach is to basically run everything through arbitrary JavaScript. The mapReduce method can be your vehicle for this. Here you can be a bit looser with the types of data that can be contained in "A" and try to tie in some more conditional matching to attempt to reduce the set of documents you are working on:
db.collection.mapReduce(
function () {
var joined = "";
if ( Object.prototype.toString.call( this.A ) === '[object Array]' ) {
this.A.forEach(function(doc) {
for ( var k in doc ) {
joined += doc[k];
}
});
} else {
joined = this.A; // presuming this is just a string
}
var id = this._id;
delete this["_id"];
if ( joined.match(/ef/) )
emit( id, this );
},
function(){}, // will not reduce
{
"query": {
"$or": [
{ "A": /ef/ },
{ "$and": [
{ "$or": [
{ "A.C": /e/ },
{ "A.D": /e/ }
]},
{"$or": [
{ "A.C": /f/ },
{ "A.D": /f/ }
]}
] }
]
},
"out": { "inline": 1 }
}
);
So you can use that with whatever arbitrary logic to search the contained objects. This one just differentiates between "arrays" and presumes otherwise a string, allowing the additional part of the query to just search for the matching "string" element first, and which is a "short circuit" evaluation.
But really at the end of the day, the best approach is to simply have the data present in your document, and you would have to maintain this yourself as you update the document contents:
{
"A": [
{
"C": "abc",
"D": "de"
},
{
"C": "fg",
"D": "hi"
}
],
"search": "abcdefghi"
}
So that is still going to invoke a horrible usage of $regex type queries but at least this avoids ( or rather shifts to writing the document ) the overhead of "joining" the elements in order to effect the search for your desired string.
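Maintaining that field could look roughly like the following sketch, run in application code before each write. withSearchField is a hypothetical helper name, and it assumes every element of "A" is an object whose values are strings, joined in key order.

```javascript
// Hypothetical helper: recompute the "search" field whenever "A" changes.
function withSearchField(doc) {
  // Join the values of each element, then join the elements in array order.
  const search = doc.A.map((item) => Object.values(item).join("")).join("");
  // Return a copy of the document with the computed field added.
  return { ...doc, search };
}

const stored = withSearchField({
  A: [
    { C: "abc", D: "de" },
    { C: "fg", D: "hi" },
  ],
});
console.log(stored.search); // abcdefghi
```

The resulting document can then be queried on "search" alone, for example with { "search": /ef/ }.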
Where this eventually leads is that a "full blown" text search solution, and that means an external one at this time as opposed to the text search facilities in MongoDB, is probably going to be your best performance option.
Either using the "pre-stored" approach in creating your "joined" field or otherwise where supported ( Solr is one solution that can do this ) have a "computed field" in this text index that is created when indexing document content.
At any rate, those are the approaches and the general point of the problem. This is not XPath searching, nor is there some "XPath like" view of an entire collection in this sense, so you are best suited to structuring your data towards the methods that are going to give you the best performance.
With all of that said, your sample here is a fairly contrived example, and if you had an actual use case for something "like" this, then that actual case may make a very interesting question indeed. Actual cases generally have different solutions than the contrived ones. But now you have something to consider.

Are there restrictions on MongoDB collections property names?

I have a document structure which contains a property named shares, which is an array of objects.
Now I tried to match all documents where shares contains the matching _account string, using dot notation (shares._account).
It's not working, and it seems to be because of the _ character at the start of the property name _account.
So if I put the string to search for inside the name property in that object everything works fine with dot notation.
Are there any limitations on property names?
I thought an _ was allowed because the id has one too in MongoDB, and for me it's a kind of convention to declare bindings.
Example:
// Collection Item example
{
"_account": { "$oid" : "526fd2a571e1e13b4100000c" },
"_id": { "$oid" : "5279456932db6adb60000003" },
"name": "shared.jpg",
"path": "/upload/24795-4ui95s.jpg",
"preview": "/img/thumbs/24795-4ui95s.jpg",
"shared": false,
"shares": [
{
"name": "526fcb177675f27140000001",
"_account": "526fcb177675f27140000001"
},
{
"name": "tim",
"_account": "526fd29871e1e13b4100000b"
}
],
"tags": [
"grüngelb",
"farbe"
],
"type": "image/jpeg"
},
I tried to get the item with following query:
// Query example
{
"$or": [
{
"$and": [
{
"type": {
"$in": ["image/jpeg"]
}
}, {
"shares._account": "526fcb177675f27140000001" // Not working
//"shares.name": "526fcb177675f27140000001" // Working
}
]
}
]
}
Apart from the fact that $and can be omitted and $or is pointless "image/jpeg" != "image/jpg":
db.foo.find({
"type": {"$in": ["image/jpeg"]},
"shares._account": "526fcb177675f27140000002"
})
Or if you really want the old one:
db.foo.find({
"$or": [
{
"$and": [
{
"type": {
"$in": ["image/jpeg"]
}
}, {
"shares._account": "526fcb177675f27140000002"
}
]
}
]
})
Both will return the example document.
Your current query has some unnecessarily complicated constructs:
you don't need the $or and $and clauses ("and" is the default behaviour)
you are matching a single value using $in
The query won't match the sample document because your query doesn't match the data:
the type field you are looking for is "image/jpg" but your sample document has "image/jpeg"
the shares._account value you are looking for is "526fcb177675f27140000001" but your sample document doesn't include this.
A simplified query that should work is:
db.shares.find(
{
"type": "image/jpeg",
"shares._account": "526fcb177675f27140000002"
}
)