MongoDB select all where field value in a query list - mongodb

How to achieve below SQL in MongoShell?
Select TableA.* from TableA where TableA.FieldB in (select TableB.FieldValue from TableB)
Mongo doc gives some example of
db.inventory.find( { qty: { $in: [ 5, 15 ] } } )
I want that array be dynamically from another query. Is it possible?
Extending my question
I have a collection of bot names
bots collection
{
"_id" : ObjectId("53266697c294991f57c36e42"),
"name" : "teoma"
}
I have a collection of user traffic, in that traffic collection, I have a field useragent
userTraffic Collection
{
"_id" : ObjectId("5325ee6efb91c0161cbe7b2c"),
"hosttype" : "http",
"useragent" : "Mediapartners-Google",
"is_crawler" : false,
"City" : "Mountain View",
"State" : "CA",
"Country" : "United States"
}
I want to select all user traffic documents where its useragent contains any name of bot collection
This is what I have come up with
var botArray = db.bots.find({},{name:1, _id:0}).toArray()
db.Sessions.find({
useragent: {$in: botArray}
},{
ipaddress:1
})
Here i believe it is doing equals to comparison, but I want it to do like %% comparison
Once I get the result, I want to do an update to that result set as is_crawler= true
Tried something like this, isn't helpful
db.bots.find().forEach( function(myBot) {
db.Sessions.find({
useragent: /myBot.name/
},{
ipaddress:1
})
});
Another way of looping through the records, but no match found.
var bots = db.bots.find( {
$query: {},
$orderby:{
name:1}
});
while( bots.hasNext()) {
var bot = bots.next();
//print(bot.name);
var botName = bot.name.toLowerCase();
print(botName);
db.Sessions.find({
useragent: /botName/,
is_crawler:false
},{
start_date:1,
ipaddress:1,
useragent:1,
City:1,
State:1,
Country:1,
is_crawler:1,
_id:0
})
}

Not in a single query it isn't.
There is nothing wrong with getting the results from a query and feeding that in as your in condition.
var list = db.collectionA.find({},{ "_id": 0, "field": 1 }).toArray();
results = db.collectionB.find({ "newfield": { "$in": list } });
But your actual purpose is not clear, as using SQL queries alone as the only example of what you want to achieve are generally not a good guide to answer the question. The main cause of this is that you probably should be modelling differently than as you do in relational. Otherwise, why use MongoDB at all?
I would suggest reading the documentation section on Data Modelling which shows several examples of how to approach common modelling cases.
Considering that information, then perhaps you can reconsider what you are modelling, and if you then have specific questions to other problems there, then feel free to ask your questions here.

Finally this is how I could accomplish it.
// Get a array with values for name field
var botArray = db.bots.find({},{name:1}).toArray();
// loop through another collection
db.Sessions.find().forEach(function(sess){
if(sess.is_crawler == false){ // check a condition
// loop in the above array
botArray.forEach(function(b){
//check if exists in the array
if(String(sess.useragent).toUpperCase().indexOf(b.name.toUpperCase()) > -1){
db.Sessions.update({ _id : sess._id} // find by _id
,{
is_crawler : true // set a update value
},
{
upsert:false // do update only
})
}
});
}
});

Related

Converting older Mongo database references to DBRefs

I'm in the process of updating some legacy software that is still running on Mongo 2.4. The first step is to upgrade to the latest 2.6 then go from there.
Running the db.upgradeCheckAllDBs(); gives us the DollarPrefixedFieldName: $id is not valid for storage. errors and indeed we have some older records with legacy $id, $ref fields. We have a number of collections that look something like this:
{
"_id" : "1",
"someRef" : {"$id" : "42", "$ref" : "someRef"}
},
{
"_id" : "2",
"someRef" : DBRef("someRef", "42")
},
{
"_id" : "3",
"someRef" : DBRef("someRef", "42")
},
{
"_id" : "4",
"someRef" : {"$id" : "42", "$ref" : "someRef"}
}
I want to script this to convert the older {"$id" : "42", "$ref" : "someRef"} objects to DBRef("someRef", "42") objects but leave the existing DBRef objects untouched. Unfortunately, I haven't been able to differentiate between the two types of objects.
Using typeof and $type simply say they are objects.
Both have $id and $ref fields.
In our groovy console when you pull one of the old ones back and one of the new ones getClass() returns DBRef for both.
We have about 80k records with this legacy format out of millions of total records. I'd hate to have to brute force it and modify every record whether it needs it or not.
This script will do what I need it to do but the find() will basically return all the records in the collection.
var cursor = db.someCollection.find({"someRef.$id" : {$exists: true}});
while(cursor.hasNext()) {
var rec = cursor.next();
db.someCollection.update({"_id": rec._id}, {$set: {"someRef": DBRef(rec.someRef.$ref, rec.someRef.$id)}});
}
Is there another way that I am missing that can be used to find only the offending records?
Update
As described in the accepted answer the order matters which made all the difference. The script we went with that corrected our data:
var cursor = db.someCollection.find(
{
$where: "function() { return this.someRef != null &&
Object.keys(this.someRef)[0] == '$id'; }"
}
);
while(cursor.hasNext()) {
var rec = cursor.next();
db.someCollection.update(
{"_id": rec._id},
{$set: {"someRef": DBRef(rec.someRef.$ref, rec.someRef.$id)}}
);
}
We did have a collection with a larger number of records that needed to be corrected where the connection timed out. We just ran the script again and it got through the remaining records.
There's probably a better way to do this. I would be interested in hearing about a better approach. For now, this problem is solved.
DBRef is a client side thing. http://docs.mongodb.org/manual/reference/database-references/#dbrefs says it pretty clear:
The order of fields in the DBRef matters, and you must use the above sequence when using a DBRef.
The drivers benefit from the fact that order of fields in BSON is consistent to recognise DBRef, so you can do the same:
db.someCollection.find({ $expr: {
$let: {
vars: {firstKey: { $arrayElemAt: [ { $objectToArray: "$someRef" }, 0] } },
in: { $eq: [{ $substr: [ "$$firstKey.k", 1, 2 ] } , "id"]}
}
} } )
will return objects where order of the fields doesn't match driver's expectation.

query for multiple value using single query [duplicate]

How to achieve below SQL in MongoShell?
Select TableA.* from TableA where TableA.FieldB in (select TableB.FieldValue from TableB)
Mongo doc gives some example of
db.inventory.find( { qty: { $in: [ 5, 15 ] } } )
I want that array be dynamically from another query. Is it possible?
Extending my question
I have a collection of bot names
bots collection
{
"_id" : ObjectId("53266697c294991f57c36e42"),
"name" : "teoma"
}
I have a collection of user traffic, in that traffic collection, I have a field useragent
userTraffic Collection
{
"_id" : ObjectId("5325ee6efb91c0161cbe7b2c"),
"hosttype" : "http",
"useragent" : "Mediapartners-Google",
"is_crawler" : false,
"City" : "Mountain View",
"State" : "CA",
"Country" : "United States"
}
I want to select all user traffic documents where its useragent contains any name of bot collection
This is what I have come up with
var botArray = db.bots.find({},{name:1, _id:0}).toArray()
db.Sessions.find({
useragent: {$in: botArray}
},{
ipaddress:1
})
Here i believe it is doing equals to comparison, but I want it to do like %% comparison
Once I get the result, I want to do an update to that result set as is_crawler= true
Tried something like this, isn't helpful
db.bots.find().forEach( function(myBot) {
db.Sessions.find({
useragent: /myBot.name/
},{
ipaddress:1
})
});
Another way of looping through the records, but no match found.
var bots = db.bots.find( {
$query: {},
$orderby:{
name:1}
});
while( bots.hasNext()) {
var bot = bots.next();
//print(bot.name);
var botName = bot.name.toLowerCase();
print(botName);
db.Sessions.find({
useragent: /botName/,
is_crawler:false
},{
start_date:1,
ipaddress:1,
useragent:1,
City:1,
State:1,
Country:1,
is_crawler:1,
_id:0
})
}
Not in a single query it isn't.
There is nothing wrong with getting the results from a query and feeding that in as your in condition.
var list = db.collectionA.find({},{ "_id": 0, "field": 1 }).toArray();
results = db.collectionB.find({ "newfield": { "$in": list } });
But your actual purpose is not clear, as using SQL queries alone as the only example of what you want to achieve are generally not a good guide to answer the question. The main cause of this is that you probably should be modelling differently than as you do in relational. Otherwise, why use MongoDB at all?
I would suggest reading the documentation section on Data Modelling which shows several examples of how to approach common modelling cases.
Considering that information, then perhaps you can reconsider what you are modelling, and if you then have specific questions to other problems there, then feel free to ask your questions here.
Finally this is how I could accomplish it.
// Get a array with values for name field
var botArray = db.bots.find({},{name:1}).toArray();
// loop through another collection
db.Sessions.find().forEach(function(sess){
if(sess.is_crawler == false){ // check a condition
// loop in the above array
botArray.forEach(function(b){
//check if exists in the array
if(String(sess.useragent).toUpperCase().indexOf(b.name.toUpperCase()) > -1){
db.Sessions.update({ _id : sess._id} // find by _id
,{
is_crawler : true // set a update value
},
{
upsert:false // do update only
})
}
});
}
});

Finding which elements in an array do not exist for a field value in MongoDB

I have a collection users as follows:
{ "_id" : ObjectId("51780f796ec4051a536015cf"), "userId" : "John" }
{ "_id" : ObjectId("51780f796ec4051a536015d0"), "userId" : "Sam" }
{ "_id" : ObjectId("51780f796ec4051a536015d1"), "userId" : "John1" }
{ "_id" : ObjectId("51780f796ec4051a536015d2"), "userId" : "john2" }
Now I am trying to write a code which can provides suggestions of a userId to user in case id provided by user already exists in DB. In same routine I just append values from 1 to 5 to the for example in case user have selected userId to be John, suggested user name array that needs to be checked for Id in database will look like this
[John,John1,John2,John3,John4,John5].
Now I just want to execute it against Db and to find out which of the suggested values do not exist in DB. So instead of selecting any document, I want to select values within suggested array which do not exist for users collection.
Any pointers are highly appreciated.
Your general approach here is you want to find the "distinct" values for the "userId's" that already exist in your collection. You then compare these to your input selection to see the difference.
var test = ["John","John1","John2","John3","John4","John5"];
var res = db.collection.aggregate([
{ "$match": { "userId": { "$in": test } }},
{ "$group": { "_id": "$userId" }}
]).toArray();
res.map(function(x){ return x._id });
test.filter(function(x){ return res.indexOf(x) == -1 })
The end result of this is the userId's that do not match in your initial input:
[ "John2", "John3", "John4", "John5" ]
The main operator there is $in which takes an array as the argument and compares those values against the specified field. The .aggregate() method is the best approach, there is a shorthand mapReduce wrapper in the .distinct() method which directly produces just the values in the array, removing the call to a function like .map() to strip out the values for a given key:
var res = db.collection.distinct("userId",{ "userId": { "$in": test } })
It should run notably slower though, especially on large collections.
Also do not forget to index your "userId" and likely you want this to be "unique" anyway so you really just want the $match or .find() result:
var res = db.collection.find({ "$in": test }).toArray()

Inline/combine other collection into one collection

I want to combine two mongodb collections.
Basically I have a collection containing documents that reference one document from another collection. Now I want to have this as a inline / nested field instead of a separate document.
So just to provide an example:
Collection A:
[{
"_id":"90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
"someValue": "foo"
},
{
"_id":"5F0BB248-E628-4B8F-A2F6-FECD79B78354",
"someValue": "bar"
}]
Collection B:
[{
"_id":"169099A4-5EB9-4D55-8118-53D30B8A2E1A",
"collectionAID":"90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
"some":"foo",
"andOther":"stuff"
},
{
"_id":"83B14A8B-86A8-49FF-8394-0A7F9E709C13",
"collectionAID":"90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
"some":"bar",
"andOther":"random"
}]
This should result in Collection A looking like this:
[{
"_id":"90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
"someValue": "foo",
"collectionB":[{
"some":"foo",
"andOther":"stuff"
},{
"some":"bar",
"andOther":"random"
}]
},
{
"_id":"5F0BB248-E628-4B8F-A2F6-FECD79B78354",
"someValue": "bar"
}]
I'd suggest something simple like this from the console:
db.collB.find().forEach(function(doc) {
var aid = doc.collectionAID;
if (typeof aid === 'undefined') { return; } // nothing
delete doc["_id"]; // remove property
delete doc["collectionAID"]; // remove property
db.collA.update({_id: aid}, /* match the ID from B */
{ $push : { collectionB : doc }});
});
It loops through each document in collectionB and if there is a field collectionAID defined, it removes the unnecessary properties (_id and collectionAID). Finally, it updates a matching document in collectionA by using the $push operator to add the document from B to the field collectionB. If the field doesn't exist, it is automatically created as an array with the newly inserted document. If it does exist as an array, it will be appended. (If it exists, but isn't an array, it will fail). Because the update call isn't using upsert, if the _id in the collectionB document doesn't exist, nothing will happen.
You can extend it to delete other fields as necessary or possibly add more robust error handling if for example a document from B doesn't match anything in A.
Running the code above on your data produces this:
{ "_id" : "5F0BB248-E628-4B8F-A2F6-FECD79B78354", "someValue" : "bar" }
{ "_id" : "90A26C2A-4976-4EDD-850D-2ED8BEA46F9E",
"collectionB" : [
{
"some" : "foo",
"andOther" : "stuff"
},
{
"some" : "bar",
"andOther" : "random"
}
],
"someValue" : "foo"
}
Sadly mapreduce can't produce full documents.
https://jira.mongodb.org/browse/SERVER-2517
No idea why despite all the attention, whining and upvotes they haven't changed it. So you'll have to do this manually in the language of your choice.
Hopefully you've indexed 'collectionAID' which should improve the speed of your queries. Just write something that goes through your A collection one document at a time, loading the _id and then adding the array from Collection B.
There is a much faster way than https://stackoverflow.com/a/22676205/1578508
You can do it the other way round and run through the collection you want to insert your documents in. (Far less executions!)
db.collA.find().forEach(function (x) {
var collBs = db.collB.find({"collectionAID":x._id},{"_id":0,"collectionA":0});
x.collectionB = collBs.toArray();
db.collA.save(x);
})

In mongoDb, how do you remove an array element by its index?

In the following example, assume the document is in the db.people collection.
How to remove the 3rd element of the interests array by it's index?
{
"_id" : ObjectId("4d1cb5de451600000000497a"),
"name" : "dannie",
"interests" : [
"guitar",
"programming",
"gadgets",
"reading"
]
}
This is my current solution:
var interests = db.people.findOne({"name":"dannie"}).interests;
interests.splice(2,1)
db.people.update({"name":"dannie"}, {"$set" : {"interests" : interests}});
Is there a more direct way?
There is no straight way of pulling/removing by array index. In fact, this is an open issue http://jira.mongodb.org/browse/SERVER-1014 , you may vote for it.
The workaround is using $unset and then $pull:
db.lists.update({}, {$unset : {"interests.3" : 1 }})
db.lists.update({}, {$pull : {"interests" : null}})
Update: as mentioned in some of the comments this approach is not atomic and can cause some race conditions if other clients read and/or write between the two operations. If we need the operation to be atomic, we could:
Read the document from the database
Update the document and remove the item in the array
Replace the document in the database. To ensure the document has not changed since we read it, we can use the update if current pattern described in the mongo docs
You can use $pull modifier of update operation for removing a particular element in an array. In case you provided a query will look like this:
db.people.update({"name":"dannie"}, {'$pull': {"interests": "guitar"}})
Also, you may consider using $pullAll for removing all occurrences. More about this on the official documentation page - http://www.mongodb.org/display/DOCS/Updating#Updating-%24pull
This doesn't use index as a criteria for removing an element, but still might help in cases similar to yours. IMO, using indexes for addressing elements inside an array is not very reliable since mongodb isn't consistent on an elements order as fas as I know.
in Mongodb 4.2 you can do this:
db.example.update({}, [
{$set: {field: {
$concatArrays: [
{$slice: ["$field", P]},
{$slice: ["$field", {$add: [1, P]}, {$size: "$field"}]}
]
}}}
]);
P is the index of element you want to remove from array.
If you want to remove from P till end:
db.example.update({}, [
{ $set: { field: { $slice: ["$field", 1] } } },
]);
Starting in Mongo 4.4, the $function aggregation operator allows applying a custom javascript function to implement behaviour not supported by the MongoDB Query Language.
For instance, in order to update an array by removing an element at a given index:
// { "name": "dannie", "interests": ["guitar", "programming", "gadgets", "reading"] }
db.collection.update(
{ "name": "dannie" },
[{ $set:
{ "interests":
{ $function: {
body: function(interests) { interests.splice(2, 1); return interests; },
args: ["$interests"],
lang: "js"
}}
}
}]
)
// { "name": "dannie", "interests": ["guitar", "programming", "reading"] }
$function takes 3 parameters:
body, which is the function to apply, whose parameter is the array to modify. The function here simply consists in using splice to remove 1 element at index 2.
args, which contains the fields from the record that the body function takes as parameter. In our case "$interests".
lang, which is the language in which the body function is written. Only js is currently available.
Rather than using the unset (as in the accepted answer), I solve this by setting the field to a unique value (i.e. not NULL) and then immediately pulling that value. A little safer from an asynch perspective. Here is the code:
var update = {};
var key = "ToBePulled_"+ new Date().toString();
update['feedback.'+index] = key;
Venues.update(venueId, {$set: update});
return Venues.update(venueId, {$pull: {feedback: key}});
Hopefully mongo will address this, perhaps by extending the $position modifier to support $pull as well as $push.
I would recommend using a GUID (I tend to use ObjectID) field, or an auto-incrementing field for each sub-document in the array.
With this GUID it is easy to issue a $pull and be sure that the correct one will be pulled. Same goes for other array operations.
For people who are searching an answer using mongoose with nodejs. This is how I do it.
exports.deletePregunta = function (req, res) {
let codTest = req.params.tCodigo;
let indexPregunta = req.body.pregunta; // the index that come from frontend
let inPregunta = `tPreguntas.0.pregunta.${indexPregunta}`; // my field in my db
let inOpciones = `tPreguntas.0.opciones.${indexPregunta}`; // my other field in my db
let inTipo = `tPreguntas.0.tipo.${indexPregunta}`; // my other field in my db
Test.findOneAndUpdate({ tCodigo: codTest },
{
'$unset': {
[inPregunta]: 1, // put the field with []
[inOpciones]: 1,
[inTipo]: 1
}
}).then(()=>{
Test.findOneAndUpdate({ tCodigo: codTest }, {
'$pull': {
'tPreguntas.0.pregunta': null,
'tPreguntas.0.opciones': null,
'tPreguntas.0.tipo': null
}
}).then(testModificado => {
if (!testModificado) {
res.status(404).send({ accion: 'deletePregunta', message: 'No se ha podido borrar esa pregunta ' });
} else {
res.status(200).send({ accion: 'deletePregunta', message: 'Pregunta borrada correctamente' });
}
})}).catch(err => { res.status(500).send({ accion: 'deletePregunta', message: 'error en la base de datos ' + err }); });
}
I can rewrite this answer if it dont understand very well, but I think is okay.
Hope this help you, I lost a lot of time facing this issue.
It is little bit late but some may find it useful who are using robo3t-
db.getCollection('people').update(
{"name":"dannie"},
{ $pull:
{
interests: "guitar" // you can change value to
}
},
{ multi: true }
);
If you have values something like -
property: [
{
"key" : "key1",
"value" : "value 1"
},
{
"key" : "key2",
"value" : "value 2"
},
{
"key" : "key3",
"value" : "value 3"
}
]
and you want to delete a record where the key is key3 then you can use something -
db.getCollection('people').update(
{"name":"dannie"},
{ $pull:
{
property: { key: "key3"} // you can change value to
}
},
{ multi: true }
);
The same goes for the nested property.
this can be done using $pop operator,
db.getCollection('collection_name').updateOne( {}, {$pop: {"path_to_array_object":1}})