MongoDB: match empty object in nested document

I'm just wondering if this is possible to do in a single request?
Given
{
_id: 1,
foo: {
fred: {}, // <- I want to remove empty keys like this
barney: { bar: 1 } // <- But keep these keys
}
}
Expected
{
_id: 1,
foo: {
barney: { bar: 1 }
}
}
I know how to do it in several requests, but I'm trying to understand MongoDB better.
Note: fred becomes empty after an update command like { $unset: { "fred.baz": 1 } } when baz is the last key in fred.
Maybe it is possible to remove fred together with its contents? But the sender of the command does not know whether fred has any keys other than baz at that moment.

You can search for empty embedded documents ({ }) and $unset them. Here's an example in the JS shell:
db.mycoll.update(
{'foo.fred':{ }},
{ $unset: {'foo.fred':1} },
false, // upsert: no
true // multi: find all matches
)
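The same filter/update pair can be generated for any nested key. Below is a minimal sketch in plain JavaScript; the helper name `unsetIfEmpty` is hypothetical, and the function only builds the two arguments without talking to the database:

```javascript
// Hypothetical helper: builds the filter and update documents that remove
// the field at `path` only when it currently holds an empty embedded document.
function unsetIfEmpty(path) {
  return {
    filter: { [path]: {} },            // matches documents where the field equals {}
    update: { $unset: { [path]: 1 } }  // removes the field
  };
}

const args = unsetIfEmpty('foo.fred');
console.log(JSON.stringify(args));
```

In the shell, the result could then be passed along as db.mycoll.update(args.filter, args.update, false, true).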

Related

Find a value in multiple nested fields with the same name in MongoDB

In some documents I have a property with a complex structure like:
{
content: {
foo: {
id: 1,
name: 'First',
active: true
},
bar: {
id: 2,
name: 'Second',
active: false
},
baz: {
id: 3,
name: 'Third',
active: true
},
}
}
I'm trying to make a query that can find all documents with a given value in the field name across the different second level objects foo, bar, baz
I guess that a solution could be:
db.getCollection('mycollection').find({ $or: [
{'content.foo.name': 'First'},
{'content.bar.name': 'First'},
{'content.baz.name': 'First'}
]})
But I want to do it dynamically, with no need to specify the key names of the nested fields or to repeat the value to find on every line.
If a regexp or wildcard on field names were available, a solution could be:
db.getCollection('mycollection').find({'content.*.name': 'First'}) // Match
db.getCollection('mycollection').find({'content.*.name': 'Third'}) // Match
db.getCollection('mycollection').find({'content.*.name': 'Fourth'}) // Doesn't match
Is there any way to do it?
I would say this is a bad schema if you don't know your keys in advance. Personally, I'd recommend changing this to an array structure.
Regardless, what you can do is use the aggregation $objectToArray operator and then query the newly created array. Mind you, this approach requires a collection scan each time you execute the query.
db.collection.aggregate([
{
$addFields: {
arr: {
"$objectToArray": "$content"
}
}
},
{
$match: {
"arr.v.name": "First"
}
},
{
$project: {
arr: 0
}
}
])
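To see what the pipeline above does to each document, here is a rough plain-JavaScript equivalent. It is only a sketch of the idea; the function names `objectToArray` and `hasName` are made up for illustration and nothing here runs on the server:

```javascript
// Plain-JavaScript equivalent of $objectToArray: { foo: x } -> [{ k: 'foo', v: x }]
function objectToArray(obj) {
  return Object.entries(obj).map(([k, v]) => ({ k, v }));
}

// Mirrors the $match stage: true when any second-level object has the given name.
function hasName(doc, name) {
  return objectToArray(doc.content || {}).some(e => e.v && e.v.name === name);
}

const doc = {
  content: {
    foo: { id: 1, name: 'First', active: true },
    bar: { id: 2, name: 'Second', active: false },
    baz: { id: 3, name: 'Third', active: true }
  }
};

console.log(hasName(doc, 'First'));  // true
console.log(hasName(doc, 'Fourth')); // false
```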
Another hacky approach is to create a wildcard text index and then run a $text query to search for the name. This comes with the usual text index and query limitations and might not be right for your use case.

Using async loop in mongodb shell for updating many documents

I have a problem with the following query in MongoDB shell ONLY when the size of the array gets bigger, for example, more than 100 elements.
newPointArray --> is an array with 500 elements
newPointArray.forEach(function(newDoc){
//update the mongodb properties for each doc
db.getCollection('me_all_test')
.update({ '_id': newDoc._id },
{ $set: { "properties": newDoc.properties } },
{ upsert: true });
})
Can someone guide me on how to run this query in the MongoDB shell for a larger array by using an async loop, a promise, or something similar?
Thanks in advance
Rather than doing individual .update()s, use a .bulkWrite() operation. This should reduce the overhead of asking mongo to do multiple individual operations. This is assuming that you are doing general operations. I'm not clear on if newPointArray is always new points that don't exist.
Given your example, I believe your script would mimic the following:
// I'm assuming this is your array (but truncated)
let newPointArray = [
{
_id: "1",
properties: {
foo: "bar"
}
},
{
_id: "2",
properties: {
foo: "buzz"
}
}
// Whatever other points you have in your array
];
db
.getCollection("me_all_test")
.bulkWrite(newPointArray
// Map your array to a query bulkWrite understands
.map(point => {
return {
updateOne: {
filter: {
_id: point._id
},
update: {
$set: {
properties: point.properties
}
},
upsert: true
}
};
}));
You may also want to consider setting ordered to false in the operation, which may also bring performance gains. That would look something like this:
db
.getCollection("me_all_test")
.bulkWrite([SOME_ARRAY_SIMILAR_TO_ABOVE_EXAMPLE], {
ordered: false
});
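For very large arrays it may also help to split the operations into batches before sending them. The sketch below is plain JavaScript with hypothetical helper names (`toUpsertOps`, `chunk`); the batch size is an arbitrary choice for illustration, not a MongoDB requirement:

```javascript
// Maps each point to an updateOne upsert operation, as in the example above.
function toUpsertOps(points) {
  return points.map(point => ({
    updateOne: {
      filter: { _id: point._id },
      update: { $set: { properties: point.properties } },
      upsert: true
    }
  }));
}

// Splits an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const points = [
  { _id: '1', properties: { foo: 'bar' } },
  { _id: '2', properties: { foo: 'buzz' } },
  { _id: '3', properties: { foo: 'baz' } }
];
const batches = chunk(toUpsertOps(points), 2);
console.log(batches.length); // 2
```

Each batch could then be passed to bulkWrite(batch, { ordered: false }) in turn.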

pull value from object -> array -> object -> object mongo

I have the following structure
{
_id: ".....",
a: {
a1:".......",
b: [
b1: {
b11: "......",
b12: "......",
},
b2: {
b21: "......",
b22: "......",
d: {},
},
c:{
c1: {
......
},
d: {
}
}
]
}
}
Here I want to check whether property d exists inside b. It may exist in multiple objects inside b; if it exists, pull the d object from the record.
Note: property d might exist multiple times, for example inside both b1 and b2. In that case I want to remove it from all objects.
I tried like
Coll.find({ 'a.b': { $elemMatch: { 'c': { d: { $exists: true } } } } })
but it does not return anything although there are matching records. Any help is appreciated.
I want to pull that data from the record too.
Thanks.
UPDATE
Coll.find({ 'a.b.c.d': { $exists: true } })
that worked for me but still no idea how to use positional operator to pull the value from record
Please take a little more time forming your questions to make them easier to answer: your example data has key-values inside a list (invalid), your question mentions $elemMatch which is a list operator, you talk about removing things but never follow up that thought, and then your UPDATE switches gears and implies it is an object hierarchy.
Taking a hint from your UPDATE, I created some valid data - paste this into mongo shell (perhaps in the test database):
db.test.insert({
a: {
a1: "a1",
b: {
b1: {
b11: "b11",
b12: "b12",
},
b2: {
b21: "b21",
b22: "b22",
d: {},
},
c: {
c1: {
c1: "......"
},
d: {
d1: "woo hoo, struck gold"
}
}
}
}
})
Inside the mongo shell, everything is JavaScript, so the "positional operators" are the dot and array subscript operators. A find() returns a cursor, which the legacy mongo shell lets you index like an array. If you want the d document (object) from the first document returned by the query:
db.test.find({'a.b.c.d': {$exists: true}})[0].a.b.c.d
produces
{
"d1": "woo hoo, struck gold"
}
UPDATE: Adding detail in response to comment.
If you want to remove the a.b.c.d sub-document, use $unset
// remove sub-document a.b.c.d
db.test.update({}, {$unset: {'a.b.c.d': ''}});
// look at the document to verify that a.b.c.d is removed
db.test.find();
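If d can appear under several keys of b, the $unset document has to name each path. One possible approach is to compute the paths client-side from a fetched document. This is a sketch in plain JavaScript with hypothetical helper names (`collectPaths`, `buildUnset`); it assumes the document contains only plain nested objects and does not descend into arrays:

```javascript
// Recursively collects the dotted path of every key named `key` inside nested
// plain objects. Arrays are not descended into in this sketch.
function collectPaths(obj, key, prefix) {
  const paths = [];
  for (const [k, v] of Object.entries(obj)) {
    const path = prefix ? prefix + '.' + k : k;
    if (k === key) {
      paths.push(path);
    } else if (v !== null && typeof v === 'object' && !Array.isArray(v)) {
      paths.push(...collectPaths(v, key, path));
    }
  }
  return paths;
}

// Builds a $unset update document from the collected paths.
function buildUnset(doc, key) {
  const unset = {};
  for (const p of collectPaths(doc, key, '')) unset[p] = '';
  return { $unset: unset };
}

const doc = {
  a: {
    a1: 'a1',
    b: {
      b1: { b11: 'b11', b12: 'b12' },
      b2: { b21: 'b21', b22: 'b22', d: {} },
      c: { c1: { c1: '......' }, d: { d1: 'woo hoo, struck gold' } }
    }
  }
};
console.log(JSON.stringify(buildUnset(doc, 'd')));
// {"$unset":{"a.b.b2.d":"","a.b.c.d":""}}
```

The result could be used as the second argument of an update that targets that one document by its _id. It is worth checking that at least one path was found before sending it.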

Sorting by relevance with MongoDB

I have a collection of documents in the following form:
{ _id: ObjectId(...)
, title: "foo"
, tags: ["bar", "baz", "qux"]
}
The query should find all documents with any of these tags. I currently use this query:
{ "tags": { "$in": ["bar", "hello"] } }
And it works; all documents tagged "bar" or "hello" are returned.
However, I want to sort by relevance, i.e. the more matching tags the earlier the document should occur in the result. For example, a document tagged ["bar", "hello", "baz"] should be higher in the results than a document tagged ["bar", "baz", "boo"] for the query ["bar", "hello"]. How can I achieve this?
MapReduce and doing it client-side is going to be too slow - you should use the aggregation framework (new in MongoDB 2.2).
It might look something like this:
db.collection.aggregate([
{ $match : { "tags": { "$in": ["bar", "hello"] } } },
{ $unwind : "$tags" },
{ $match : { "tags": { "$in": ["bar", "hello"] } } },
{ $group : { _id: "$title", numRelTags: { $sum:1 } } },
{ $sort : { numRelTags : -1 } }
// optionally
, { $limit : 10 }
])
Note that the first and third pipeline members look identical; this is intentional and needed. Here is what the steps do:
1. pass on only documents which have the tag "bar" or "hello" in them.
2. unwind the tags array (meaning split into one document per tags element).
3. pass on only tags that are exactly "bar" or "hello" (i.e. discard the rest of the tags).
4. group by title (it could also be by "$_id" or any other combination of original document fields), adding up how many tags (of "bar" and "hello") it had.
5. sort in descending order by number of relevant tags.
6. (optionally) limit the returned set to the top 10.
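The effect of these steps can be simulated in plain JavaScript on a small in-memory array. This is only an illustration of the logic, not how the server executes it; the function name `rankByTags` and the sample documents are made up, and titles are assumed to be unique:

```javascript
// Simulates the pipeline: keep documents with at least one matching tag,
// count the matching tags per title, and sort by that count, highest first.
function rankByTags(docs, queryTags) {
  return docs
    .map(doc => ({
      _id: doc.title,
      numRelTags: doc.tags.filter(tag => queryTags.includes(tag)).length
    }))
    .filter(row => row.numRelTags > 0)
    .sort((x, y) => y.numRelTags - x.numRelTags);
}

const docs = [
  { title: 'one', tags: ['bar', 'baz', 'boo'] },
  { title: 'two', tags: ['bar', 'hello', 'baz'] },
  { title: 'three', tags: ['qux'] }
];
console.log(rankByTags(docs, ['bar', 'hello']));
```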
You could potentially use MapReduce for something like that. You'd process each document in the Map step, figuring out how many tags match the query, and assign a score. Then you could sort based on that score.
http://www.mongodb.org/display/DOCS/MapReduce
Something that complex should be done after querying, either server-side through db.eval (if your client supports this) or just client-side. Here's an example of what you're looking for.
It will retrieve all posts with the tags you specified, then sort them according to the number of matches.
Remove the db.eval( part and translate it to the language your client uses to get the client-side effect.
// Note: db.eval was deprecated in MongoDB 3.0 and removed in 4.2.
db.eval(function () {
var tags = ["a","b","c"];
// Count how many of a document's tags appear in the query tags.
function countMatches(doc) {
return doc.tags.filter(function (tag) {
return tags.indexOf(tag) !== -1;
}).length;
}
return db.posts.find({tags:{$in:tags}}).toArray().sort(function(a,b){
// Documents with more matching tags come first.
return countMatches(b) - countMatches(a);
});
});

How to change the type of a field?

I am trying to change the type of a field from within the mongo shell.
I am doing this...
db.meta.update(
{'fields.properties.default': { $type : 1 }},
{'fields.properties.default': { $type : 2 }}
)
But it's not working!
The only way to change the $type of the data is to perform an update on the data where the data has the correct type.
In this case, it looks like you're trying to change the $type from 1 (double) to 2 (string).
So simply load the document from the DB, perform the cast (new String(x)) and then save the document again.
If you need to do this programmatically and entirely from the shell, you can use the find(...).forEach(function(x) {}) syntax.
In response to the second comment below. Change the field bad from a number to a string in collection foo.
db.foo.find( { 'bad' : { $type : 1 } } ).forEach( function (x) {
x.bad = new String(x.bad); // convert field to string
db.foo.save(x);
});
Convert String field to Integer:
// collectionName and fieldName are placeholders for your own names
db.collectionName.find({fieldName: {$exists: true}}).forEach(function(obj) {
obj.fieldName = new NumberInt(obj.fieldName);
db.collectionName.save(obj);
});
Convert Integer field to String:
db.collectionName.find({fieldName: {$exists: true}}).forEach(function(obj) {
obj.fieldName = "" + obj.fieldName;
db.collectionName.save(obj);
});
Starting Mongo 4.2, db.collection.update() can accept an aggregation pipeline, finally allowing the update of a field based on its own value:
// { a: "45", b: "x" }
// { a: 53, b: "y" }
db.collection.updateMany(
{ a : { $type: 1 } },
[{ $set: { a: { $toString: "$a" } } }]
)
// { a: "45", b: "x" }
// { a: "53", b: "y" }
The first part { a : { $type: 1 } } is the match query:
It filters which documents to update.
In this case, since we want to convert "a" to string when its value is a double, this matches documents for which "a" is of type 1 (double).
This table provides the code representing the different possible types.
The second part [{ $set: { a: { $toString: "$a" } } }] is the update aggregation pipeline:
Note the square brackets signifying that this update query uses an aggregation pipeline.
$set is a new aggregation operator (Mongo 4.2) which in this case modifies a field.
This can be simply read as "$set" the value of "a" to "$a" converted "$toString".
What's really new here, is being able in Mongo 4.2 to reference the document itself when updating it: the new value for "a" is based on the existing value of "$a".
Also note "$toString" which is a new aggregation operator introduced in Mongo 4.0.
In case your cast isn't from double to string, you have the choice between different conversion operators introduced in Mongo 4.0 such as $toBool, $toInt, ...
And if there isn't a dedicated converter for your targeted type, you can replace { $toString: "$a" } with a $convert operation: { $convert: { input: "$a", to: 2 } } where the value for to can be found in this table:
db.collection.updateMany(
{ a : { $type: 1 } },
[{ $set: { a: { $convert: { input: "$a", to: 2 } } } }]
)
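The filter and the pipeline follow the same pattern for any field and type pair, so they can be generated. Here is a minimal sketch in plain JavaScript; the helper name `buildConvertUpdate` and its return shape are hypothetical, and it only builds the arguments:

```javascript
// Hypothetical helper: builds the two arguments for updateMany() that convert
// `field` from one BSON type code to another using $convert.
function buildConvertUpdate(field, fromType, toType) {
  return {
    filter: { [field]: { $type: fromType } },
    pipeline: [
      { $set: { [field]: { $convert: { input: '$' + field, to: toType } } } }
    ]
  };
}

const u = buildConvertUpdate('a', 1, 2); // double (1) to string (2)
console.log(JSON.stringify(u.pipeline));
// [{"$set":{"a":{"$convert":{"input":"$a","to":2}}}}]
```

The pair would then be used as db.collection.updateMany(u.filter, u.pipeline).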
For string to int conversion.
db.my_collection.find().forEach( function(obj) {
obj.my_value= new NumberInt(obj.my_value);
db.my_collection.save(obj);
});
For string to double conversion (a plain JavaScript number is stored as a double; note that parseInt drops any fractional part):
obj.my_value= parseInt(obj.my_value, 10);
For float:
obj.my_value= parseFloat(obj.my_value);
db.coll.find().forEach(function(data) {
db.coll.update({_id:data._id},{$set:{myfield:parseInt(data.myfield)}});
})
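One caveat with the parseInt and parseFloat loops above: both return NaN for strings that are not numeric, and the loop would then write NaN into the field. A small guard is sketched below; the helper names `toIntOrNull` and `toFloatOrNull` are made up for illustration:

```javascript
// Returns an integer, or null when the value cannot be parsed as a number.
function toIntOrNull(value) {
  const n = parseInt(value, 10);
  return Number.isNaN(n) ? null : n;
}

// Returns a floating-point number, or null when the value cannot be parsed.
function toFloatOrNull(value) {
  const n = parseFloat(value);
  return Number.isNaN(n) ? null : n;
}

console.log(toIntOrNull('42.7'));   // 42
console.log(toFloatOrNull('42.7')); // 42.7
console.log(toIntOrNull('abc'));    // null
```

Inside the forEach, a document could simply be skipped when the helper returns null.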
All answers so far use some version of forEach, iterating over all collection elements client-side.
However, you could use MongoDB's server-side processing by using an aggregation pipeline with the $out stage:
"the $out stage atomically replaces the existing collection with the new results collection."
example:
db.documents.aggregate([
{
$project: {
_id: 1,
numberField: { $substr: ['$numberField', 0, -1] },
otherField: 1,
differentField: 1,
anotherfield: 1,
needToListAllFieldsHere: 1
},
},
{
$out: 'documents',
},
]);
To convert a field of string type to date field, you would need to iterate the cursor returned by the find() method using the forEach() method, within the loop convert the field to a Date object and then update the field using the $set operator.
Use the Bulk API for these updates: it performs better because operations are sent to the server in batches (say, 1000 at a time) rather than one request per document.
The following demonstrates this approach. The first example uses the Bulk API available in MongoDB versions >= 2.6 and < 3.2. It updates all the documents in the collection by changing all the created_at fields to date fields:
var bulk = db.collection.initializeUnorderedBulkOp(),
counter = 0;
db.collection.find({"created_at": {"$exists": true, "$type": 2 }}).forEach(function (doc) {
var newDate = new Date(doc.created_at);
bulk.find({ "_id": doc._id }).updateOne({
"$set": { "created_at": newDate}
});
counter++;
if (counter % 1000 == 0) {
bulk.execute(); // Execute per 1000 operations and re-initialize every 1000 update statements
bulk = db.collection.initializeUnorderedBulkOp();
}
})
// Clean up remaining operations in queue
if (counter % 1000 != 0) { bulk.execute(); }
The next example applies to MongoDB 3.2 and later, which deprecated the Bulk API and provided a newer set of APIs using bulkWrite():
var bulkOps = [];
db.collection.find({"created_at": {"$exists": true, "$type": 2 }}).forEach(function (doc) {
var newDate = new Date(doc.created_at);
bulkOps.push(
{
"updateOne": {
"filter": { "_id": doc._id } ,
"update": { "$set": { "created_at": newDate } }
}
}
);
})
db.collection.bulkWrite(bulkOps, { "ordered": true });
To convert int32 to string in mongo without creating an array just add "" to your number :-)
db.foo.find( { 'mynum' : { $type : 16 } } ).forEach( function (x) {
x.mynum = x.mynum + ""; // convert int32 to string
db.foo.save(x);
});
What really helped me to change the type of the field in MongoDB was just this simple snippet, perhaps mentioned before here:
db.Users.find({age: {$exists: true}}).forEach(function(obj) {
obj.age = new NumberInt(obj.age);
db.Users.save(obj);
});
Users is my collection and age is the field which had a string instead of an integer (int32).
You can easily convert the string data type to numerical data type.
Don't forget to change collectionName & FieldName.
For example: collectionName: Users and FieldName: Contactno.
Try this query:
db.collectionName.find().forEach( function (x) {
x.FieldName = parseInt(x.FieldName);
db.collectionName.save(x);
});
I need to change datatype of multiple fields in the collection, so I used the following to make multiple data type changes in the collection of documents. Answer to an old question but may be helpful for others.
db.mycoll.find().forEach(function(obj) {
if (obj.hasOwnProperty('phone')) {
obj.phone = "" + obj.phone; // int or longint to string
}
if (obj.hasOwnProperty('field-name')) {
obj['field-name'] = new NumberInt(obj['field-name']); // string to integer
}
if (obj.hasOwnProperty('cdate')) {
obj.cdate = new ISODate(obj.cdate); //string to Date
}
db.mycoll.save(obj);
});
Demo: change the type of the field mid from string to a Mongo ObjectId using mongoose
Post.find({}, {mid: 1,_id:1}).exec(function (err, doc) {
doc.map((item, key) => {
Post.findByIdAndUpdate({_id:item._id},{$set:{mid: mongoose.Types.ObjectId(item.mid)}}).exec((err,res)=>{
if(err) throw err;
reply(res);
});
});
});
ObjectId is just one example; the same approach applies to other types such as Number, String, and Boolean. I hope this answer helps someone else.
I use this script in mongodb console for string to float conversions...
db.documents.find({ 'fwtweaeeba' : {$exists : true}}).forEach( function(obj) {
obj.fwtweaeeba = parseFloat( obj.fwtweaeeba );
db.documents.save(obj); } );
db.documents.find({ 'versions.0.content.fwtweaeeba' : {$exists : true}}).forEach( function(obj) {
obj.versions[0].content.fwtweaeeba = parseFloat( obj.versions[0].content.fwtweaeeba );
db.documents.save(obj); } );
db.documents.find({ 'versions.1.content.fwtweaeeba' : {$exists : true}}).forEach( function(obj) {
obj.versions[1].content.fwtweaeeba = parseFloat( obj.versions[1].content.fwtweaeeba );
db.documents.save(obj); } );
db.documents.find({ 'versions.2.content.fwtweaeeba' : {$exists : true}}).forEach( function(obj) {
obj.versions[2].content.fwtweaeeba = parseFloat( obj.versions[2].content.fwtweaeeba );
db.documents.save(obj); } );
And this one in PHP:
foreach($db->documents->find(array("type" => "chair")) as $document){
$db->documents->update(
array('_id' => $document['_id']),
array(
'$set' => array(
'versions.0.content.axdducvoxb' => (float)$document['versions'][0]['content']['axdducvoxb'],
'versions.1.content.axdducvoxb' => (float)$document['versions'][1]['content']['axdducvoxb'],
'versions.2.content.axdducvoxb' => (float)$document['versions'][2]['content']['axdducvoxb'],
'axdducvoxb' => (float)$document['axdducvoxb']
)
),
array('multiple' => true)
);
}
The above answers almost worked but had a few problems:
Problem 1: db.collection.save no longer works in MongoDB 5.x
For this, I used replaceOne().
Problem 2: new String(x.bad) was giving exponential number
I used "" + x.bad as suggested above.
My version:
let count = 0;
db.user
.find({
custID: {$type: 1},
})
.forEach(function (record) {
count++;
const actualValue = record.custID;
record.custID = "" + record.custID;
console.log(`${count}. Updating User(id:${record._id}) from old id [${actualValue}](${typeof actualValue}) to [${record.custID}](${typeof record.custID})`)
db.user.replaceOne({_id: record._id}, record);
});
And for millions of records, here is the output (for future investigation/reference):