I am trying to write the right query for my NoSQL database, but I am confused about how to do it in Cloudant. In SQL it would be:
SELECT * FROM mydb
WHERE user_permit_doc_id = 10
AND user_tracking_id = 1
My query looks like this:
https://293e2cb7-3561-4004-a1c3-58d54f517ee6-bluemix.cloudant.com/user_files/_design/user_tracking/_view/new-view?startkey=["user_permit_doc_id:10"]
and it returns all of the docs, not just the ones with this id.
This is my map function for the view:
function(doc) {
    if (doc.user_tracking_id !== null) {
        emit(doc);
    }
}
An example of a doc inside my database:
{
    "_id": "6e57baa78c6415beeee788bc786cc53a",
    "_rev": "5-f15352bce99c307bd246bda4dc0da75a",
    "user_tracking_id": "1",
    "user_permit_id": "2",
    "user_permit_doc_id": "10",
    "user_id": "1",
    "_attachments": {
        "6y41j4i68cic.jpg": {
            "content_type": "image/jpeg",
            "revpos": 2,
            "digest": "md5-KC+G5tbz2UWZSzlPHvBy/Q==",
            "length": 68367,
            "stub": true
        }
    }
}
You can change your view to:
function(doc) {
    if (doc.user_tracking_id && doc.user_permit_doc_id) {
        emit([doc.user_tracking_id, doc.user_permit_doc_id], null);
    }
}
and then query using the complex key ["1", "10"] (the values are strings in your sample document, so the key components must be strings too).
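For example, with the view above the request passes the whole key via the key parameter rather than startkey; add &include_docs=true to return the full documents, since the view emits null values:

https://293e2cb7-3561-4004-a1c3-58d54f517ee6-bluemix.cloudant.com/user_files/_design/user_tracking/_view/new-view?key=["1","10"]&include_docs=true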
Alternatively, you would have to incorporate the WHERE user_permit_doc_id = 10 condition into your map function if you wanted the view to contain only those particular documents, like this:
function(doc) {
    // compare against strings, since the sample doc stores these ids as strings
    if (doc.user_permit_doc_id === "10" && doc.user_tracking_id === "1") {
        emit(doc._id, null);
    }
}
However, since you are coming over from SQL, you might be more comfortable with Mongo-like queries. If that style of querying your DB suits you better, check out the Cloudant Mango API layer. This API introduces SQL-like querying to NoSQL, actually creating a map reduce function behind the scenes.
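Cloudant Query exposes a POST /{db}/_find endpoint that accepts a declarative selector. A minimal sketch of the equivalent of your SQL WHERE clause (again assuming the string-typed values from your sample document):

POST /user_files/_find
{
    "selector": {
        "user_permit_doc_id": "10",
        "user_tracking_id": "1"
    }
}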
Related
I have a subjects collection. In this collection, every document has a tutors field, which is an object whose keys are tutor ids (from the tutors collection):
tutors: {
    "tutor_id_1": {
        "name": "jonas",
        "email": "jonas@gmail.com"
    },
    "tutor_id_2": {
        "name": "stephen",
        "email": "stephen@gmail.com"
    },
    "tutor_id_3": {
        "name": "maria",
        "email": "maria@gmail.com"
    }
}
So how do I query subjects where the tutors field contains a tutor id equal to "tutor_id_1"?
I found one way. If I have two variables on the client side:
const tutorToFindId = "xxx"
const tutorToFindEmail = "YYY"
const q = query(
    collection(db, 'subjects'),
    where(`tutors.${tutorToFindId}.email`, '==', tutorToFindEmail)
);
Is there any other way?
As I understand it, "tutor_id_1" is being used as a unique id. Considering that, you may structure your data model with "tutors" as a subcollection instead of a field, and then you will be able to get the content of that specific document as follows:
const docRef = db.collection('subjects').doc('subjectX').collection('tutors').doc(tutorToFindId);
const tutor = await docRef.get();

if (!tutor.exists) {
    console.log('No such document!');
} else {
    console.log('Document data:', tutor.data());
}
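If you still need the original question's query ("which subjects contain this tutor?") after restructuring, a collection group query over all tutors subcollections can answer it. This is only a sketch, and it assumes each tutor document also stores its own id in a hypothetical tutorId field, since collection group queries filter on document fields rather than document ids:

// assumes a tutorId field duplicated inside each tutor document
const snapshot = await db.collectionGroup('tutors')
    .where('tutorId', '==', tutorToFindId)
    .get();

snapshot.forEach((doc) => {
    // doc.ref.parent.parent points back at the owning subject document
    console.log('Subject:', doc.ref.parent.parent.id);
});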
db.getBy({ where: { [`tutors.${tutorToFindId}.email`]: tutorToFindEmail } });
Take a look at the getBy function in https://www.npmjs.com/package/firebase-firestore-helper
Disclaimer: I am the creator of this library. It helps to manipulate objects in Firebase Firestore (and adds caching).
Enjoy!
Is it possible to change the data type of a field? E.g. I have a field 'user' and its data type is string. I need to change its data type to ObjectId.
I have tried, but I am getting an error:
> db.booking.find().foreach( function (x) { x.user = ObjectId(x.user); db.booking.save(x); });
2017-06-28T09:30:35.317+0000 E QUERY [thread1] TypeError: db.booking.find(...).foreach is not a function :
@(shell):1:1
>
The best way is to use the bulk operations API with .bulkWrite():
var ops = [];

db.booking.find({}, { "user": 1 }).forEach(doc => {
    doc.user = new ObjectId(doc.user.valueOf());
    ops.push({
        "updateOne": {
            "filter": { "_id": doc._id },
            "update": {
                "$set": { "user": doc.user }
            }
        }
    });
    if (ops.length >= 500) {
        db.booking.bulkWrite(ops);
        ops = [];
    }
});

if (ops.length > 0) {
    db.booking.bulkWrite(ops);
    ops = [];
}
As opposed to methods like .save(), this updates only the specified field, and it commits to the server in "batches", so you trade the back-and-forth of one round trip per document for a single write and acknowledgement per batch. 500 is a reasonable batch size, but the underlying driver and server will split batches at 1000 operations in any case.
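If the script might be run more than once, or some "user" values have already been converted, a hedged refinement is to select only documents whose user field is still a string using $type (the "string" alias requires MongoDB 3.2+; use the numeric code 2 on older versions). This makes the conversion idempotent:

var ops = [];

// only select documents whose "user" field is still a string
db.booking.find({ "user": { "$type": "string" } }, { "user": 1 }).forEach(doc => {
    ops.push({
        "updateOne": {
            "filter": { "_id": doc._id },
            "update": { "$set": { "user": new ObjectId(doc.user.valueOf()) } }
        }
    });
    if (ops.length >= 500) {
        db.booking.bulkWrite(ops);
        ops = [];
    }
});

if (ops.length > 0) db.booking.bulkWrite(ops);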
I am having a bit of an issue trying to come up with the logic for this. So, what I want to do is:
Bulk update a bunch of posts to my remote MongoDB instance BUT
When updating, only update if the lastModified field in the remote collection is less than the lastModified field of the same document that I am about to update/insert.
Basically, I want to update my list of documents if they have been modified since the last time I updated them.
I can think of two brute force ways to do it...
First, query my entire collection, manually remove and replace the documents that match the criteria, add the new ones, and then mass-insert everything back into the remote collection after deleting everything remote.
Second, query for each item and then decide, if it exists remotely, whether to update it. This seems like it would be very taxing when dealing with remote collections.
If relevant, I am working in a NodeJS environment, using the mongodb npm package for database operations.
You can use the bulkWrite API to carry out the updates based on the logic you specified as it handles this better.
For example, the following snippet shows how to go about this assuming you already have the data from the web service you need to update the remote collection with:
// assumes: var MongoClient = require('mongodb').MongoClient;
MongoClient.connect(mongo_url, function(err, db) {
    if (err) console.log(err);
    else {
        var mongo_remote_collection = db.collection("remote_collection_name");
        /* data comes from an http call to an external service; ideally
           place this within the service callback
        */
        mongoUpsert(mongo_remote_collection, data, function() {
            db.close();
        });
    }
});
function mongoUpsert(collection, data_array, cb) {
    var ops = data_array.map(function(data) {
        return {
            "updateOne": {
                "filter": {
                    "_id": data._id, // or any other filtering mechanism to identify a doc
                    "lastModified": { "$lt": data.lastModified }
                },
                "update": { "$set": data },
                "upsert": true
            }
        };
    });

    collection.bulkWrite(ops, function(err, r) {
        // do something with the result, then signal completion
        cb(err);
    });
}
If the data from the external service is large, consider sending the writes to the server in batches of 500; that performs better because you are not sending every request to the server individually, just one request per 500 operations.
For bulk operations MongoDB imposes a default internal limit of 1000 operations per batch, so choosing 500 documents gives you some control over the batch size rather than letting MongoDB impose the default; this only matters for larger operations in the magnitude of > 1000 documents. For the case above you could just write the whole array at once since it is small, but batching at 500 is for larger arrays. One caveat to watch for: with "upsert": true, a document that exists but fails the lastModified filter causes an attempted insert with the same _id, which raises a duplicate key error, so be prepared to handle or tolerate those errors.
var ops = [],
    counter = 0;

data_array.forEach(function(data) {
    ops.push({
        "updateOne": {
            "filter": {
                "_id": data._id,
                "lastModified": { "$lt": data.lastModified }
            },
            "update": { "$set": data },
            "upsert": true
        }
    });
    counter++;
    if (counter % 500 === 0) {
        collection.bulkWrite(ops, function(err, r) {
            // do something with result
        });
        ops = [];
    }
});

if (counter % 500 !== 0) {
    collection.bulkWrite(ops, function(err, r) {
        // do something with result
    });
}
We have a large collection of documents whose description field was entered with inconsistent text case,
e.g.
Desc =
"THE CAT"
or
"The Dog"
or
"the cow"
We want to make them all consistent in Title Case (or Proper Case), where the first letter of each word is upper case and the rest lower case:
"The Cat", "The Dog", "The Cow"
Looking for assistance in creating an update query to do that en masse, rather than manually as the data team is doing at present.
Thanks
The algorithm for changing the title case below uses the Array.prototype.map() method and the String.prototype.replace() method, which returns a new string with some or all matches of a pattern replaced by a replacement.
In your case, the pattern given to replace() is a String, so it is treated as a verbatim string to be swapped for the replacement.
First off, you need to lowercase and split the string before applying the map() method. Once you have defined a function that implements the conversion, iterate your collection to apply an update with it: use the cursor.forEach() method on the cursor returned by find() to do the loop, and within the loop run an update on each document using the updateOne() method.
For relatively small datasets, the whole operation can be described by the following:
function titleCase(str) {
    return str.toLowerCase().split(' ').map(function(word) {
        return word.replace(word[0], word[0].toUpperCase());
    }).join(' ');
}

db.collection.find({}).forEach(function(doc) {
    db.collection.updateOne(
        { "_id": doc._id },
        { "$set": { "desc": titleCase(doc.desc) } }
    );
});
For improved performance, especially when dealing with huge datasets, take advantage of the Bulk() API to update the collection efficiently in bulk, sending the operations to the server in batches (for example, a batch size of 500). This gives much better performance since you are not sending every request to the server, just one request per 500 operations, making the updates more efficient and quicker.
The following demonstrates this approach, the first example uses the Bulk() API available in MongoDB versions >= 2.6 and < 3.2. It updates all the documents in the collection by transforming the title on the desc field using the above function.
MongoDB versions >= 2.6 and < 3.2:
function titleCase(str) {
    return str.toLowerCase().split(' ').map(function(word) {
        return word.replace(word[0], word[0].toUpperCase());
    }).join(' ');
}

var bulk = db.collection.initializeUnorderedBulkOp(),
    counter = 0;

db.collection.find().forEach(function(doc) {
    bulk.find({ "_id": doc._id }).updateOne({
        "$set": { "desc": titleCase(doc.desc) }
    });
    counter++;

    if (counter % 500 === 0) {
        // Execute per 500 operations
        bulk.execute();
        // re-initialize every 500 update statements
        bulk = db.collection.initializeUnorderedBulkOp();
    }
});

// Clean up remaining operations in the queue
if (counter % 500 !== 0) { bulk.execute(); }
The next example applies to MongoDB version 3.2, which has since deprecated the Bulk() API and provides a newer set of APIs via bulkWrite().
MongoDB version 3.2 and greater:
var ops = [],
    titleCase = function(str) {
        return str.toLowerCase().split(' ').map(function(word) {
            return word.replace(word[0], word[0].toUpperCase());
        }).join(' ');
    };

db.collection.find({
    "desc": {
        "$exists": true,
        "$type": 2  // BSON type 2 = string
    }
}).forEach(function(doc) {
    ops.push({
        "updateOne": {
            "filter": { "_id": doc._id },
            "update": {
                "$set": { "desc": titleCase(doc.desc) }
            }
        }
    });
    if (ops.length === 500) {
        db.collection.bulkWrite(ops);
        ops = [];
    }
});

if (ops.length > 0)
    db.collection.bulkWrite(ops);
db.movies.find({"original_title" : {$regex: input_data, $options:'i'}}, function (err, datares){
if (err || datares == false) {
db.movies.find({"release_date" : {$regex: input_data + ".*", $options:'i'}}, function (err, datares){
if(err || datares == false){
db.movies.find({"cast" : {$regex: input_data, $options:'i'}}, function (err, datares){
if(err || datares == false){
db.movies.find({"writers" : {$regex: input_data, $options:'i'}}, function (err, datares){
if(err || datares == false){
db.movies.find({"genres.name" : {$regex: input_data, $options:'i'}}, function (err, datares){
if(err || datares == false){
db.movies.find({"directors" : {$regex: input_data, $options:'i'}}, function (err, datares){
if(err || datares == false){
res.status(451);
res.json({
"status" : 451,
"error code": "dataNotFound",
"description" : "Invalid Data Entry."
});
return;
} else{
res.json(datares);
return;
}
});
} else {
res.json(datares);
return;
}
});
} else {
res.json(datares);
return;
}
});
} else {
res.json(datares);
return;
}
});
} else {
res.json(datares);
return;
}
});
} else {
res.json(datares);
return;
}
});
I am trying to implement a so-called "all-in-one" search, so that whenever a user types in any kind of movie-related information, my application tries to return all relevant information. However, I have noticed that this transaction can be expensive on the backend, and sometimes the host is really slow.
How do I smoothly close the db connection, and where should I do it?
I read here that it is best not to close a mongodb connection in node.js >>Why is it recommended not to close a MongoDB connection anywhere in Node.js code?
Is there a proper way to implement an all-in-one search like this using nested find commands?
Your current approach is full of problems and does not need to be done this way. All you are trying to do, from what I can gather, is search for a plain string within a number of fields in the same collection. It might possibly be a regular expression construct, but I'm basing the options below on a plain, case-insensitive text search.
Now, I am not sure if you came to running one query dependent on the results of another because you didn't know another way, or because you thought it would be better. Trust me on this: it is not a better approach than anything listed here, nor is it really required, as will be shown:
Regex query all at once
The first basic option here is to continue your $regex search, but in a single query with the $or operator:
db.movies.find(
    {
        "$or": [
            { "original_title": { "$regex": input_data, "$options": "i" } },
            { "release_date": { "$regex": input_data, "$options": "i" } },
            { "cast": { "$regex": input_data, "$options": "i" } },
            { "writers": { "$regex": input_data, "$options": "i" } },
            { "genres.name": { "$regex": input_data, "$options": "i" } },
            { "directors": { "$regex": input_data, "$options": "i" } }
        ]
    },
    function(err, result) {
        if (err) {
            // respond with error
        } else {
            // respond with data or empty
        }
    }
);
The $or condition here effectively works like "combining queries", as each argument is treated as a query in its own right for document selection. Since it is one query, all the results naturally come back together.
Full text Query, multiple fields
If you are not really using a "regular expression" built from regular expression operators, e.g. ^(\d+)\bword$, then you are probably better off using the "text search" capabilities of MongoDB. This approach is fine as long as you are not searching for terms a text index generally excludes (such as stop words), and your data structure and subject matter actually suggest this is the best option for what you are likely doing here.
In order to be able to perform a text search, you first need to create a "text index"; specifically, here you want the index to span multiple fields in your document. Dropping into the shell for this is probably easiest:
db.movies.createIndex({
    "original_title": "text",
    "release_date": "text",
    "cast": "text",
    "writers": "text",
    "genres.name": "text",
    "directors": "text"
})
There is also an option to assign a "weight" to fields within the index, as you can read in the documentation. Assigning a weight gives "priority" to search terms that match in that field. For example, "directors" might be assigned more "weight" than "cast", and matches for "Quentin Tarantino" would therefore "rank higher" in results where he was a director (and also a cast member) of the movie, and not just a cast member (as in most Robert Rodriguez films).
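Since there can only be one text index per collection, the weights are supplied when the index is created. A sketch with illustrative (not prescriptive) weight values:

db.movies.createIndex(
    {
        "original_title": "text",
        "release_date": "text",
        "cast": "text",
        "writers": "text",
        "genres.name": "text",
        "directors": "text"
    },
    {
        "weights": {
            "directors": 10,     // rank director matches highest
            "original_title": 5  // unlisted fields default to a weight of 1
        }
    }
)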
But with this in place, performing the query itself is very simple:
db.movies.find(
    { "$text": { "$search": input_data } },
    function(err, result) {
        if (err) {
            // respond with error
        } else {
            // respond with data or empty
        }
    }
);
Almost too simple, really, but that is all there is to it. The $text query operator knows to use the required index (there can only be one text index per collection), and it will then look through all of the defined fields.
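If you do weight the fields, you will probably also want results ordered by relevance. In the shell, a minimal sketch projecting and sorting on the textScore metadata:

db.movies.find(
    { "$text": { "$search": input_data } },
    { "score": { "$meta": "textScore" } }  // project the relevance score
).sort({ "score": { "$meta": "textScore" } })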
This is why I think this is the best fit for your use case here.
Parallel Queries
The final alternative I'll give here is for when you still insist on running separate queries. I still maintain that you do not need to run each query only when the previous one returns no results, and I re-assert that the options above should be considered first, with a preference for the text search.
Writing dependent or chained asynchronous functions is a pain, and very messy. Therefore I suggest leaning on a little help from another library dependency and using the node async module here.
This provides an async.map() method, which is perfectly suited to "combining" results by running things in parallel:
var fields = [
    "original_title",
    "release_date",
    "cast",
    "writers",
    "genres.name",
    "directors"
];

async.map(
    fields,
    function(field, callback) {
        var search = {},
            cond = { "$regex": input_data, "$options": "i" };
        search[field] = cond; // assigns the field to search
        db.movies.find(search, callback);
    },
    function(err, result) {
        if (err) {
            // respond with error
        } else {
            // respond with data or empty
        }
    }
);
And again, that is it. The .map() operator takes each field and transposes it into the query, which in turn returns its results. Those results are then accessible after all queries have run in the final section, "combined" as if they were a single result set, just as in the other alternatives here.
There is also a .mapSeries() variant that runs each query in series, and a .mapLimit() variant if you are otherwise worried about using database connections and concurrent tasks, but for this small size it should not be a problem.
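For completeness, a sketch of the .mapLimit() variant with an illustrative concurrency cap of two queries at a time:

async.mapLimit(
    fields,
    2,  // run at most two queries concurrently
    function(field, callback) {
        var search = {};
        search[field] = { "$regex": input_data, "$options": "i" };
        db.movies.find(search, callback);
    },
    function(err, result) {
        // results are in the same order as fields, as with async.map
    }
);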
I really don't think this option is necessary. However, if the regular-expression conditions from the first option still apply, this "may" provide a small performance benefit from running the queries in parallel, at the cost of increased memory and resource consumption in your application.
Anyhow, the round-up here is "don't do what you are doing". You don't need to, and there are better ways to handle the task you want to achieve. And all of them mean cleaner code that is easier to write.