How to find MongoDB field name at arbitrary depth - mongodb

I imported some sort-of sloppy XML data into a Mongo database. Each Document has nested sub-documents to a depth of around 5-10. I would like to find() documents that have a particular value of a particular field, where the field may appear at any depth in the sub-documents (and may appear multiple times).
I am currently pulling each Document into Python and then searching that dictionary, but it would be nice if I could state a filter prototype where the database would only return documents that have a particular value of the field name somewhere in their contents.
Here is an example document:
{
"foo": 1,
"bar": 2,
"find-this": "Yes!",
"stuff": {
"baz": 3,
"gobble": [
"wibble",
"wobble",
{
"all-fall-down": 4,
"find-this": "please find me"
}
],
"plugh": {
"plove": {
"find-this": "Here too!"
}
}
}
}
So, I'd like to find documents that have a "find-this" field, and (if possible) to be able to find documents that have a particular value of a "find-this" field.

You are right in the certain statement of a BSON document is not an XML document. Since XML is loaded into a tree structure that comprises of "nodes", searching on an arbitary key is quite easy.
A MonoDB document is not so simple to process, and this is a "database" in many respects, so it is generally expected to have a certain "uniformity" of data locations in order to make it easy to both "index" and search.
Nonetheless, it can be done. But of course this does mean a recursive process executing on the server and this means JavaScript processing with $where.
As a basic shell example, but the general function is just a string argument to the $where operator everywhere else:
db.collection.find(
function () {
var findKey = "find-this",
findVal = "please find me";
function inspectObj(doc) {
return Object.keys(doc).some(function(key) {
if ( typeof(doc[key]) == "object" ) {
return inspectObj(doc[key]);
} else {
return ( key == findKey && doc[key] == findVal );
}
});
}
return inspectObj(this);
}
)
So basically, test the keys present in the object to see if they match the desired "field name" and content. If one of those keys happens to be an "object" then recurse into the function and inspect again.
JavaScript .some() makes sure that the "first" match found will return from the search function giving a true result and returning the object where that "key/value" was present at some depth.
Note that $where essentially means traversing your whole collection unless there is some other valid query filter than can be applied to an "index" on the collection.
So use with care, or not at all and just work with re-structring the data into a more workable form.
But this will give you your match.

Here is one example, which I use for recursive search for Key-Value anywhere in document structure:
db.getCollection('myCollection').find({
"$where" : function(){
var searchKey = 'find-this';
var searchValue = 'please find me';
return searchInObj(obj);
function searchInObj(obj){
for(var k in obj){
if(typeof obj[k] == 'object' && obj[k] !== null){
if(searchInObj(obj[k])){
return true;
}
} else {
if(k == searchKey && obj[k] == searchValue){
return true;
}
}
}
return false;
}
}
})

Related

find({}) doesn't return correct result [duplicate]

I imported some sort-of sloppy XML data into a Mongo database. Each Document has nested sub-documents to a depth of around 5-10. I would like to find() documents that have a particular value of a particular field, where the field may appear at any depth in the sub-documents (and may appear multiple times).
I am currently pulling each Document into Python and then searching that dictionary, but it would be nice if I could state a filter prototype where the database would only return documents that have a particular value of the field name somewhere in their contents.
Here is an example document:
{
"foo": 1,
"bar": 2,
"find-this": "Yes!",
"stuff": {
"baz": 3,
"gobble": [
"wibble",
"wobble",
{
"all-fall-down": 4,
"find-this": "please find me"
}
],
"plugh": {
"plove": {
"find-this": "Here too!"
}
}
}
}
So, I'd like to find documents that have a "find-this" field, and (if possible) to be able to find documents that have a particular value of a "find-this" field.
You are right in the certain statement of a BSON document is not an XML document. Since XML is loaded into a tree structure that comprises of "nodes", searching on an arbitary key is quite easy.
A MonoDB document is not so simple to process, and this is a "database" in many respects, so it is generally expected to have a certain "uniformity" of data locations in order to make it easy to both "index" and search.
Nonetheless, it can be done. But of course this does mean a recursive process executing on the server and this means JavaScript processing with $where.
As a basic shell example, but the general function is just a string argument to the $where operator everywhere else:
db.collection.find(
function () {
var findKey = "find-this",
findVal = "please find me";
function inspectObj(doc) {
return Object.keys(doc).some(function(key) {
if ( typeof(doc[key]) == "object" ) {
return inspectObj(doc[key]);
} else {
return ( key == findKey && doc[key] == findVal );
}
});
}
return inspectObj(this);
}
)
So basically, test the keys present in the object to see if they match the desired "field name" and content. If one of those keys happens to be an "object" then recurse into the function and inspect again.
JavaScript .some() makes sure that the "first" match found will return from the search function giving a true result and returning the object where that "key/value" was present at some depth.
Note that $where essentially means traversing your whole collection unless there is some other valid query filter than can be applied to an "index" on the collection.
So use with care, or not at all and just work with re-structring the data into a more workable form.
But this will give you your match.
Here is one example, which I use for recursive search for Key-Value anywhere in document structure:
db.getCollection('myCollection').find({
"$where" : function(){
var searchKey = 'find-this';
var searchValue = 'please find me';
return searchInObj(obj);
function searchInObj(obj){
for(var k in obj){
if(typeof obj[k] == 'object' && obj[k] !== null){
if(searchInObj(obj[k])){
return true;
}
} else {
if(k == searchKey && obj[k] == searchValue){
return true;
}
}
}
return false;
}
}
})

MongoDB - query embedded documents [duplicate]

I imported some sort-of sloppy XML data into a Mongo database. Each Document has nested sub-documents to a depth of around 5-10. I would like to find() documents that have a particular value of a particular field, where the field may appear at any depth in the sub-documents (and may appear multiple times).
I am currently pulling each Document into Python and then searching that dictionary, but it would be nice if I could state a filter prototype where the database would only return documents that have a particular value of the field name somewhere in their contents.
Here is an example document:
{
"foo": 1,
"bar": 2,
"find-this": "Yes!",
"stuff": {
"baz": 3,
"gobble": [
"wibble",
"wobble",
{
"all-fall-down": 4,
"find-this": "please find me"
}
],
"plugh": {
"plove": {
"find-this": "Here too!"
}
}
}
}
So, I'd like to find documents that have a "find-this" field, and (if possible) to be able to find documents that have a particular value of a "find-this" field.
You are right in the certain statement of a BSON document is not an XML document. Since XML is loaded into a tree structure that comprises of "nodes", searching on an arbitary key is quite easy.
A MonoDB document is not so simple to process, and this is a "database" in many respects, so it is generally expected to have a certain "uniformity" of data locations in order to make it easy to both "index" and search.
Nonetheless, it can be done. But of course this does mean a recursive process executing on the server and this means JavaScript processing with $where.
As a basic shell example, but the general function is just a string argument to the $where operator everywhere else:
db.collection.find(
function () {
var findKey = "find-this",
findVal = "please find me";
function inspectObj(doc) {
return Object.keys(doc).some(function(key) {
if ( typeof(doc[key]) == "object" ) {
return inspectObj(doc[key]);
} else {
return ( key == findKey && doc[key] == findVal );
}
});
}
return inspectObj(this);
}
)
So basically, test the keys present in the object to see if they match the desired "field name" and content. If one of those keys happens to be an "object" then recurse into the function and inspect again.
JavaScript .some() makes sure that the "first" match found will return from the search function giving a true result and returning the object where that "key/value" was present at some depth.
Note that $where essentially means traversing your whole collection unless there is some other valid query filter than can be applied to an "index" on the collection.
So use with care, or not at all and just work with re-structring the data into a more workable form.
But this will give you your match.
Here is one example, which I use for recursive search for Key-Value anywhere in document structure:
db.getCollection('myCollection').find({
"$where" : function(){
var searchKey = 'find-this';
var searchValue = 'please find me';
return searchInObj(obj);
function searchInObj(obj){
for(var k in obj){
if(typeof obj[k] == 'object' && obj[k] !== null){
if(searchInObj(obj[k])){
return true;
}
} else {
if(k == searchKey && obj[k] == searchValue){
return true;
}
}
}
return false;
}
}
})

Refer to subfields without specifing name in a mongodb [duplicate]

I imported some sort-of sloppy XML data into a Mongo database. Each Document has nested sub-documents to a depth of around 5-10. I would like to find() documents that have a particular value of a particular field, where the field may appear at any depth in the sub-documents (and may appear multiple times).
I am currently pulling each Document into Python and then searching that dictionary, but it would be nice if I could state a filter prototype where the database would only return documents that have a particular value of the field name somewhere in their contents.
Here is an example document:
{
"foo": 1,
"bar": 2,
"find-this": "Yes!",
"stuff": {
"baz": 3,
"gobble": [
"wibble",
"wobble",
{
"all-fall-down": 4,
"find-this": "please find me"
}
],
"plugh": {
"plove": {
"find-this": "Here too!"
}
}
}
}
So, I'd like to find documents that have a "find-this" field, and (if possible) to be able to find documents that have a particular value of a "find-this" field.
You are right in the certain statement of a BSON document is not an XML document. Since XML is loaded into a tree structure that comprises of "nodes", searching on an arbitary key is quite easy.
A MonoDB document is not so simple to process, and this is a "database" in many respects, so it is generally expected to have a certain "uniformity" of data locations in order to make it easy to both "index" and search.
Nonetheless, it can be done. But of course this does mean a recursive process executing on the server and this means JavaScript processing with $where.
As a basic shell example, but the general function is just a string argument to the $where operator everywhere else:
db.collection.find(
function () {
var findKey = "find-this",
findVal = "please find me";
function inspectObj(doc) {
return Object.keys(doc).some(function(key) {
if ( typeof(doc[key]) == "object" ) {
return inspectObj(doc[key]);
} else {
return ( key == findKey && doc[key] == findVal );
}
});
}
return inspectObj(this);
}
)
So basically, test the keys present in the object to see if they match the desired "field name" and content. If one of those keys happens to be an "object" then recurse into the function and inspect again.
JavaScript .some() makes sure that the "first" match found will return from the search function giving a true result and returning the object where that "key/value" was present at some depth.
Note that $where essentially means traversing your whole collection unless there is some other valid query filter than can be applied to an "index" on the collection.
So use with care, or not at all and just work with re-structring the data into a more workable form.
But this will give you your match.
Here is one example, which I use for recursive search for Key-Value anywhere in document structure:
db.getCollection('myCollection').find({
"$where" : function(){
var searchKey = 'find-this';
var searchValue = 'please find me';
return searchInObj(obj);
function searchInObj(obj){
for(var k in obj){
if(typeof obj[k] == 'object' && obj[k] !== null){
if(searchInObj(obj[k])){
return true;
}
} else {
if(k == searchKey && obj[k] == searchValue){
return true;
}
}
}
return false;
}
}
})

How to search for text or expression in multiple fields

db.movies.find({"original_title" : {$regex: input_data, $options:'i'}}, function (err, datares){
if (err || datares == false) {
db.movies.find({"release_date" : {$regex: input_data + ".*", $options:'i'}}, function (err, datares){
if(err || datares == false){
db.movies.find({"cast" : {$regex: input_data, $options:'i'}}, function (err, datares){
if(err || datares == false){
db.movies.find({"writers" : {$regex: input_data, $options:'i'}}, function (err, datares){
if(err || datares == false){
db.movies.find({"genres.name" : {$regex: input_data, $options:'i'}}, function (err, datares){
if(err || datares == false){
db.movies.find({"directors" : {$regex: input_data, $options:'i'}}, function (err, datares){
if(err || datares == false){
res.status(451);
res.json({
"status" : 451,
"error code": "dataNotFound",
"description" : "Invalid Data Entry."
});
return;
} else{
res.json(datares);
return;
}
});
} else {
res.json(datares);
return;
}
});
} else {
res.json(datares);
return;
}
});
} else {
res.json(datares);
return;
}
});
} else {
res.json(datares);
return;
}
});
} else {
res.json(datares);
return;
}
});
I am trying to implement a so called "all-in-one" search so that whenever a user types in any kind of movie related information, my application tries to return all relevant information. However I have noticed that this transaction might be expensive on the backend and sometimes the host is really slow.
How do I smoothly close the db connection and where should I use it?
I read here that it is best not to close a mongodb connection in node.js >>Why is it recommended not to close a MongoDB connection anywhere in Node.js code?
Is the a proper way to implement a all-in-one search kind of a thing by using nested find commands?
Your current approach is full of problems and is not necessary to do this way. All you are trying to do is search for what a can gather is a plain string within a number of fields in the same collection. It may possibly be a regular expression construct but I'm basing two possibilities on a plain text search that is case insensitive.
Now I am not sure if you came to running one query dependant on the results of another because you didn't know another way or though it would be better. Trust me on this, that is not a better approach than anything listed here nor is it really required as will be shown:
Regex query all at once
The first basic option here is to continue your $regex search but just in a singular query with the $or operator:
db.movies.find(
{
"$or": [
{ "original_title" : { "$regex": input_data, "$options":"i"} },
{ "release_date" : { "$regex": input_data, "$options":"i"} },
{ "cast" : { "$regex": input_data, "$options":"i"} },
{ "writers" : { "$regex": input_data, "$options":"i"} },
{ "genres.name" : { "$regex": input_data, "$options":"i"} },
{ "directors" : { "$regex": input_data, "$options":"i"} }
]
},
function(err,result) {
if(err) {
// respond error
} else {
// respond with data or empty
}
}
);
The $or condition here effectively works like "combining queries" as each argument is treated as a query in itself as far as document selection goes. Since it is one query than all the results are naturally together.
Full text Query, multiple fields
If you are not really using a "regular expression" built from regular expression operations i.e ^(\d+)\bword$, then you are probably better off using the "text search" capabilities of MongoDB. This approach is fine as long as you are not looking for things that would be generally excluded, but your data structure and subject actually suggests this is the best option for what you are likely doing here.
In order to be able to perform a text search, you first need to create a "text index", specifically here you want the index to span multiple fields in your document. Dropping into the shell for this is probably easiest:
db.movies.createIndex({
"original_title": "text",
"release_date": "text",
"cast" : "text",
"writers" : "text",
"genres.name" : "text",
"directors" : "text"
})
There is also an option to assign a "weight" to fields within the index as you can read in the documentation. Assigning a weight give "priority" to the terms listed in the search for the field that match in. For example "directors" might be assigned more "weight" than "cast" and matches for "Quentin Tarantino" would therefore "rank higher" in the results where he was a director ( and also a cast member ) of the movie and not just a cast member ( as in most Robert Rodriguez films ).
But with this in place, performing the query itself is very simple:
db.movies.find(
{ "$text": { "$search": input_data } },
function(err,result) {
if(err) {
// respond error
} else {
// respond with data or empty
}
}
);
Almost too simple really, but that is all there is to it. The $text query operator knows to use the required index ( there can only be one text index per collection ) and it will just then look through all of the defined fields.
This is why I think this is the best fit for your use case here.
Parallel Queries
The final alternate I'll give here is you still want to demand that you need to run separate queries. I still deny that you do need to only query if the previous query does not return results, and I also re-assert that the above options should be considered "first", with preference to text search.
Writing dependant or chained asynchronous functions is a pain, and very messy. Therefore I suggest leaning a little help from another library dependency and using the node-async module here.
This provides an aync.map.() method, which is perfectly suited to "combining" results by running things in parallel:
var fields = [
"original_title",
"release_date",
"cast",
"writers",
"genres.name",
"directors"
];
async.map(
fields,
function(field,callback) {
var search = {},
cond = { "$regex": input_data, "$options": "i" };
search[field] = cond; // assigns the field to search
db.movies.find(search,callback);
},
function(err,result) {
if(err) {
// respond error
} else {
// respond with data or empty
}
}
);
And again, that is it. The .map() operator takes each field and transposes that into the query which in turn returns it's results. Those results are then accessible after all queries are run in the final section, "combined" as if they were a single result set, just as the other alternates do here.
There is also a .mapSeries() variant that runs each query in series, or .mapLimit() if you are otherwise worried about using database connections and concurrent tasks, but for this small size this should not be a problem.
I really don't think that this option is necessary, however if the Case 1 regular expression statements still apply, this "may" possibly provide a little performance benefit due to running queries in parallel, but at the cost of increased memory and resource consumption in your application.
Anyhow, the round up here is "Don't do what you are doing", you don't need to and there are better ways to handle the task you want to achieve. And all of them are mean cleaner and easier to code.

MongoDB: Nested Query at unknown depth [duplicate]

I imported some sort-of sloppy XML data into a Mongo database. Each Document has nested sub-documents to a depth of around 5-10. I would like to find() documents that have a particular value of a particular field, where the field may appear at any depth in the sub-documents (and may appear multiple times).
I am currently pulling each Document into Python and then searching that dictionary, but it would be nice if I could state a filter prototype where the database would only return documents that have a particular value of the field name somewhere in their contents.
Here is an example document:
{
"foo": 1,
"bar": 2,
"find-this": "Yes!",
"stuff": {
"baz": 3,
"gobble": [
"wibble",
"wobble",
{
"all-fall-down": 4,
"find-this": "please find me"
}
],
"plugh": {
"plove": {
"find-this": "Here too!"
}
}
}
}
So, I'd like to find documents that have a "find-this" field, and (if possible) to be able to find documents that have a particular value of a "find-this" field.
You are right in the certain statement of a BSON document is not an XML document. Since XML is loaded into a tree structure that comprises of "nodes", searching on an arbitary key is quite easy.
A MonoDB document is not so simple to process, and this is a "database" in many respects, so it is generally expected to have a certain "uniformity" of data locations in order to make it easy to both "index" and search.
Nonetheless, it can be done. But of course this does mean a recursive process executing on the server and this means JavaScript processing with $where.
As a basic shell example, but the general function is just a string argument to the $where operator everywhere else:
db.collection.find(
function () {
var findKey = "find-this",
findVal = "please find me";
function inspectObj(doc) {
return Object.keys(doc).some(function(key) {
if ( typeof(doc[key]) == "object" ) {
return inspectObj(doc[key]);
} else {
return ( key == findKey && doc[key] == findVal );
}
});
}
return inspectObj(this);
}
)
So basically, test the keys present in the object to see if they match the desired "field name" and content. If one of those keys happens to be an "object" then recurse into the function and inspect again.
JavaScript .some() makes sure that the "first" match found will return from the search function giving a true result and returning the object where that "key/value" was present at some depth.
Note that $where essentially means traversing your whole collection unless there is some other valid query filter than can be applied to an "index" on the collection.
So use with care, or not at all and just work with re-structring the data into a more workable form.
But this will give you your match.
Here is one example, which I use for recursive search for Key-Value anywhere in document structure:
db.getCollection('myCollection').find({
"$where" : function(){
var searchKey = 'find-this';
var searchValue = 'please find me';
return searchInObj(obj);
function searchInObj(obj){
for(var k in obj){
if(typeof obj[k] == 'object' && obj[k] !== null){
if(searchInObj(obj[k])){
return true;
}
} else {
if(k == searchKey && obj[k] == searchValue){
return true;
}
}
}
return false;
}
}
})