Which nosql option relative to stored procedures and large arrays? - mongodb

I have a use case for a nosql data store but I don't know which one to use:
Each document in my data store has an _id key and another key holding an array of objects. Each object in this array has an _elementid key and a color key.
I want my server proxy to send an update request to the data store with a substring used as a regex that selects all documents whose _id matches. I then want to insert a new element at the front of the array (an unshift) in each matched document. The new element will have the same color in every document, but its _elementid will be unique for each.
Is there a nosql option out there that offers this kind of stored procedure? Does it have limits on the length of the array?
*** EDIT ***
(1)
DOCUMENT A:
{
_id : "this_is-an-example_10982029822",
dataList : [
{
_elementid : "999999283902830",
color : "blue",
}, {
_elementid : "99999273682763",
color : "red"
}
]
}
DOCUMENT B:
{
_id : "this_is-an-example_209382093820",
dataList : [
{
_elementid : "99999182681762",
color : "yellow"
}
]
}
(2) EXAMPLE OF UPDATE REQUEST
(let [regex_ready_array ["this_is-an-example" "fetcher" "finder"]
fetch_query_regex (str "^" (clojure.string/join "|^" regex_ready_array))
element_template {
:_elementid { (rand-int 1000000000000000) }
:color "green"
}
updated_sister_objs (mc/bulk-update connection "arrayStore" {:_id {$regex fetch_query_regex }} "unshift" element_template)])
(3)
DOCUMENT A:
{
_id : "this_is-an-example_10982029822",
dataList : [
{
_elementid : "999999146514612",
color : "green",
}, {
_elementid : "999999283902830",
color : "blue",
}, {
_elementid : "99999273682763",
color : "red"
}
]
}
DOCUMENT B:
{
_id : "this_is-an-example_209382093820",
dataList : [
{
_elementid : "9999997298729873",
color : "green",
}, {
_elementid : "99999182681762",
color : "yellow"
}
]
}
*** EDIT 2 ***
(1) the dataList array could be large (large enough that MongoDB's 16 MB document size limit would present an issue);
(2) the _elementid values assigned to the new dataList elements will be different for each new element, and the store will auto-assign them as random numbers;
(3) a single update request should apply all updates, rather than one update per additional element;
(4) the OP is looking for a compare-and-contrast between several 'nosql solutions', with MongoDB, Cassandra, Redis and CouchDB being suggested as possible candidates.

From your question, I understand you are using JSON and Clojure.
Let's see which NoSQL stores are a good fit for JSON. Here is a quick overview of popular NoSQL options:
Apache Cassandra: Cassandra's data model is essentially a hybrid between a key-value and a column-oriented (or tabular) database management system. It is a partitioned row store with tunable consistency.
Redis: Redis maps keys to typed values. Besides strings, it has abstract data types such as lists, sets, sorted sets, hashes and geospatial data.
Apache CouchDB: CouchDB manages a collection of JSON documents.
MongoDB: MongoDB manages collections of BSON documents. BSON is binary JSON: http://bsonspec.org/spec.html.
If you work with a lot of JSON payloads, you could use MongoDB or Apache CouchDB. But you also want to update JSON documents based on a regex.
Let's check the regex capability of CouchDB and MongoDB.
Both support map-reduce, but for simply selecting documents by regex an ordinary query is enough in either one.
MongoDB: MongoDB has a $regex query operator. I have tried it and it works fine.
Regex select: db.student.find( { f_name: { $regex: '^this_is-an-example' } } ).pretty();
Reference
https://docs.mongodb.com/manual/reference/operator/query/regex/
mongoDB update statement using regex
https://www.w3resource.com/mongodb/mongodb-regex-operators.php
CouchDB: I haven't tried regex with CouchDB, but as far as I know it is possible. According to the CouchDB documentation, a $regex operator is available in the selector of a _find request:
{
"selector": {
"afieldname": {"$regex": "^A"}
}
}
Reference
http://docs.couchdb.org/en/2.0.0/api/database/find.html
Temporary couchdb view of documents with doc_id matching regular expression
You could use either MongoDB or CouchDB. Lots of resources are available for MongoDB.
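For the MongoDB route, the update described in the question could be sketched roughly as follows. This is a minimal sketch rather than a tested solution: the collection name arrayStore and the field names come from the question, while build_filter, build_unshift and the use of random.randrange for _elementid are assumptions made for illustration. $push with $each and $position: 0 inserts at the front of the array, which is the "unshift" the question asks for. Because each document needs its own _elementid, the commented usage issues one update per matched _id in a single bulk request. Note that the 16 MB document limit still applies to each document.

```python
import random
import re


def build_filter(prefixes):
    """Match any _id that starts with one of the given prefixes."""
    # re.escape keeps characters such as '-' or '.' literal inside the pattern.
    pattern = "|".join("^" + re.escape(p) for p in prefixes)
    return {"_id": {"$regex": pattern}}


def build_unshift(color, rng=random):
    """Build a $push that inserts one new element at the front of dataList."""
    # Hypothetical id scheme: a random number rendered as a string.
    element = {"_elementid": str(rng.randrange(10**15)), "color": color}
    return {"$push": {"dataList": {"$each": [element], "$position": 0}}}


query = build_filter(["this_is-an-example", "fetcher", "finder"])
update = build_unshift("green")

# With a live server and pymongo (not run here), one update per matched _id:
# from pymongo import MongoClient, UpdateOne
# coll = MongoClient().test.arrayStore
# ops = [UpdateOne({"_id": d["_id"]}, build_unshift("green"))
#        for d in coll.find(query, {"_id": 1})]
# if ops:
#     coll.bulk_write(ops)
```

Generating the ids on the client keeps the sketch simple. It is not the server-side "stored procedure" the question asks about.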

Related

multi updating a key along the documents of a collection using pymongo

I have lots of documents inside a collection.
The structure of each of the documents inside the collection is as it follows:
{
"_id" : ObjectId(....),
"valor" : {
"AB" : {
"X" : 0.0,
"Y" : 142.6,
},
"FJ" : {
"X" : 0.2,
"Y" : 3.33
....
The collection currently has about 200 documents, and I have noticed that one of the keys inside valor has the wrong name. In this case, let's say "FJ" should be "JOF" in all the docs of the collection.
I'm pretty sure it is possible to change the key in all the docs using the update function of pymongo. The problem I am facing is that the online documentation at https://docs.mongodb.com/v3.0/reference/method/db.collection.update/ only explains how to change the values (which I would like to keep as they are, changing only the keys).
This is what I have tried:
def multi_update(spec_key,key_updte):
rdo=col.update((valor.spec_key),{"$set":(valor.key_updte)},multi=True)
return rdo
print(multi_update('FJ','JOF'))
But it outputs name 'valor' is not defined. I thought I should use valor.specific_key to access the corresponding JSON.
How can I update only a key across the docs of the collection?
You have two problems. First, valor is not an identifier in your Python code, it's a field name of a MongoDB document. You need to quote it in single or double quotes in Python in order to make it a string and use it in a PyMongo update expression.
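Your second problem is that $set in a classic (non-pipeline) update only assigns a value you supply; it cannot copy the value of one field into another, so it cannot rename a key by itself. MongoDB does have a $rename update operator that accepts dotted paths, so an update document of {"$rename": {"valor.FJ": "valor.JOF"}} applied to all documents renames the key in place. Alternatively, you can reshape all the documents in your collection using the aggregate command with a $project stage and store the results in a second collection using an $out stage.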
Here's a complete example to play with:
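from pymongo import MongoClient

db = MongoClient().test
collection = db.collection
# Start from an empty collection and insert one sample document.
collection.delete_many({})
collection.insert_one({
    "valor": {
        "AB": {
            "X": 0.0,
            "Y": 142.6,
        },
        "FJ": {
            "X": 0.2,
            "Y": 3.33}}})
# Copy every document into "collection2", exposing valor.FJ under the new key JOF.
# List every key under valor that should survive; unlisted keys are dropped.
collection.aggregate([{
    "$project": {
        "valor": {
            "AB": "$valor.AB",
            "JOF": "$valor.FJ"
        }
    }
}, {
    "$out": "collection2"
}])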
This is the dangerous part. First, check that "collection2" has all the documents you want, in the desired shape. Then:
collection.drop()
db.collection2.rename("collection")
import pprint
pprint.pprint(collection.find_one())
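As an in-place alternative to copying the collection, the $rename update operator can be sketched like this. The helper name build_rename is made up for illustration; the field names valor, FJ and JOF come from the question. The pymongo calls are shown as comments because they need a running server.

```python
def build_rename(parent, old_key, new_key):
    """Build a $rename update for a key nested one level under `parent`."""
    # Dotted paths address fields inside embedded documents.
    return {"$rename": {f"{parent}.{old_key}": f"{parent}.{new_key}"}}


update = build_rename("valor", "FJ", "JOF")

# With a live server and pymongo (not run here):
# from pymongo import MongoClient
# col = MongoClient().test.collection
# result = col.update_many({}, update)   # empty filter: every document
# print(result.modified_count)
```

$rename works on fields of embedded documents like valor, but not on elements inside arrays.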

Search backward on MongoDb

I have 2 collections.
Collection "users"
{
"_id" : ObjectId("54b00098e0fdb6634b1f54e6"),
"state" : "active",
"backends" : [
DBRef("backends", ObjectId("54b001ebe0fd853df1c93419")),
DBRef("backends", ObjectId("54b00284e0fd853df1c9341b"))
]
}
Collection "backends"
{
"_id" : ObjectId("54b001ebe0fd853df1c93419"),
"state" : "running"
}
I want to get the list of a user's backends where the backend's state is "running".
How can MongoDB do this, like joining two tables?
Is there a way to search backward from the backends, or a function to filter on them?
I can search like this
db.users.find({"backends.$id" : "distring"})
But what if I want to search the state inside backend object? like.
db.users.find({"backends.$state" : "running"})
But of course it is not working.
MongoDB doesn't support joins in a find query (the $lookup aggregation stage only arrived in MongoDB 3.2), so you need to do this in two steps. In the shell:
var ids = db.backends.find({state: 'running'}, {_id: 1}).map(function(backend) {
return backend._id;
});
var users = db.users.find({'backends.$id': {$in: ids}}).toArray();
On a side note, you're probably better off using a plain ObjectId instead of a DBRef for the backends array elements unless the ids in that array can actually refer to docs in multiple collections.

Reference an _id in a Subdocument in another Collection Mongodb

I am developing an application with MongoDB and Node.js.
I should also mention that I am new to both, so please help me through this question.
My database has a categories collection, and in each category I am storing products as subdocuments,
just like below:
{
_id : ObjectId(),
name: String,
type: String,
products : [{
_id : ObjectId(),
name : String,
description : String,
price : String
}]
});
When it comes to store the orders in database the orders collection will be like this:
{
receiver : String,
status : String,
subOrders : [
{
products :[{
productId : String,
name : String,
price : String,
status : String
}],
tax : String,
total : String,
status : String,
orderNote : String
}
]
}
As you can see, we are storing the _id of products, which are subdocuments of categories, in orders.
Storing is obviously no issue. Fetching is also fine if we just need limited fields like name or price, but if we later need extra fields from products, like description, they are not stored in orders.
My question is this:
Is there an easy way to access other fields of products without looping through all the categories? Namely, I need sample code for querying the description of a product in MongoDB when I only have its _id.
Or was our design wrong, so that I have to redesign it from scratch and move the products out of categories into another collection?
Please don't post links to websites or blogs that talk generally about MongoDB and its collections unless they focus on an issue very similar to mine.
Thanks in advance.
I'd assume that you want to return the descriptions of all products in the current list. A find query cannot return just the matching array elements: used as a projection operator, $elemMatch returns only the first matching element of the array, not every match. For a single product that is enough:
// Find the category holding product PID1 and project only that product,
// including its description.
db.categories.find({ "products._id" : "PID1" },
{ products : { $elemMatch : { _id : "PID1" } } })
You'd definitely want to index the "products._id" field to achieve reasonable performance.
You might instead consider creating a products collection where each document contains a category identifier, much like you would in a relational database. This is a common pattern in MongoDB when embedding doesn't make sense, or complicates queries and aggregations.
Assuming that is true:
You'll need to load the data from the second collection manually, since a find query in MongoDB does not join collections. You might consider using $in, which takes a list of values for a field and loads all matching documents.
Depending on the driver you're using to access MongoDB, you should be able to use the projection feature of find, which limits the fields returned for a document to just those you've specified.
As product descriptions aren't likely to change frequently, you might also consider caching the values for a period on the client (a web server, for example).
db.products.find({ _id: { $in : [ 'PID1', 'PID2'] } }, { description : 1 })
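From a Python driver, the same lookup and a simple id-to-description map for caching might look like the sketch below. It assumes the separate products collection suggested above; the helper names build_description_query and index_descriptions are hypothetical, and the pymongo call is commented out because it needs a running server.

```python
def build_description_query(product_ids):
    """Return (filter, projection) for fetching descriptions by product _id."""
    filter_doc = {"_id": {"$in": list(product_ids)}}
    projection = {"description": 1}  # _id is returned by default
    return filter_doc, projection


def index_descriptions(docs):
    """Turn the returned documents into an {_id: description} lookup table."""
    return {doc["_id"]: doc.get("description") for doc in docs}


filter_doc, projection = build_description_query(["PID1", "PID2"])

# With a live server and pymongo (not run here):
# from pymongo import MongoClient
# products = MongoClient().test.products
# cache = index_descriptions(products.find(filter_doc, projection))

# Stand-in for what the cursor might yield, for illustration only:
sample = [{"_id": "PID1", "description": "first"}, {"_id": "PID2"}]
cache = index_descriptions(sample)
```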

Is it possible to make a "do not modify" constraint on MongoDB subdocuments at creation?

I'd like to make a specific subdocument value of a MongoDB document fixed, so that it cannot be modified by a later update or by any other MongoDB operation that can modify documents.
For example, if a document like the one below is inserted, I would like the "eyesColor" value to be unchangeable.
{
"id" : "someId",
"name": "Jane",
"eyesColor" : "blue"
}
A possible update can be:
{
"id" : "someId",
"name": "Amy",
"eyesColor" : "green"
}
And the result I need after this update is :
{
"id" : "someId",
"name": "Amy",
"eyesColor" : "blue"
}
I'd like to do this because using the $set and $unset operators is not possible in the project I'm creating. Reading the existing document before the update, in order to get the value of the subdocument ("eyesColor"), would decrease the performance of the application I work on.
The constraint I need is similar to the fixed size of capped collections. The difference is that it applies to a subdocument instead of a collection, and to the value contained in the subdocument instead of the size.
Is there any solution for this type of constraint?
There are no constraints in MongoDB (only exception: unique indexes). There is no way to make fields "read-only" on the database-layer.
When you use upserts (db.collection.update with upsert: true) that should add certain fields when inserting new documents but leave those fields alone when updating existing documents, you can place these fields under the $setOnInsert operator.
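A minimal sketch of such an upsert, using the field names from the question, might look like this. The helper build_upsert is hypothetical. The overlap check reflects that MongoDB rejects an update that names the same field under both $set and $setOnInsert.

```python
def build_upsert(mutable_fields, insert_only_fields):
    """Build an update where some fields are written only on insert."""
    overlap = set(mutable_fields) & set(insert_only_fields)
    if overlap:
        # The server would reject this as a conflicting update.
        raise ValueError(f"fields in both groups: {sorted(overlap)}")
    update = {}
    if mutable_fields:
        update["$set"] = dict(mutable_fields)
    if insert_only_fields:
        update["$setOnInsert"] = dict(insert_only_fields)
    return update


update = build_upsert({"name": "Amy"}, {"eyesColor": "green"})

# With a live server and pymongo (not run here):
# from pymongo import MongoClient
# col = MongoClient().test.people
# col.update_one({"id": "someId"}, update, upsert=True)
```

If a document with id "someId" already exists, only name changes and eyesColor keeps its stored value. If none exists, the new document gets both fields.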

Mongoose: Saving as associative array of subdocuments vs array of subdocuments

I have a set of documents I need to maintain persistence for. Due to the way MongoDB handles multi-document operations, I need to embed this set of documents inside a container document in order to ensure atomicity of my operations.
The data lends itself heavily to key-value pairing. Is there any way instead of doing this:
var container = new mongoose.Schema({
// meta information here
subdocs: [{key: String, value: String}]
})
I can instead have subdocs be an associative array (i.e. an object) that applies the subdoc validations? So a container instance would look something like:
{
// meta information
subdocs: {
<key1>: <value1>,
<key2>: <value2>,
...
<keyN>: <valueN>,
}
}
Thanks
Using Mongoose, I don't believe that there is a way to do what you are describing. To explain, let's take an example where your keys are dates and the values are high temperatures, to form pairs like { "2012-05-31" : 88 }.
Let's look at the structure you're proposing:
{
// meta information
subdocs: {
"2012-05-30" : 80,
"2012-05-31" : 88,
...
"2012-06-15": 94,
}
}
Because you must pre-define schema in Mongoose, you must know your key names ahead of time. In this use case, we would probably not know ahead of time which dates we would collect data for, so this is not a good option.
If you don't use Mongoose, you can do this without any problem at all. MongoDB by itself excels at inserting values with new key names into an existing document:
> db.coll.insert({ type : "temperatures", subdocuments : {} })
> db.coll.update( { type : "temperatures" }, { $set : { 'subdocuments.2012-05-30' : 80 } } )
> db.coll.update( { type : "temperatures" }, { $set : { 'subdocuments.2012-05-31' : 88 } } )
{
"_id" : ObjectId("5238c3ca8686cd9f0acda0cd"),
"subdocuments" : {
"2012-05-30" : 80,
"2012-05-31" : 88
},
"type" : "temperatures"
}
In this case, adding Mongoose on top of MongoDB takes away some of MongoDB's native flexibility. If your use case is well suited by this feature of MongoDB, then using Mongoose might not be the best choice.
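The same dynamic-key update can be built from a driver as well. The sketch below uses Python and a hypothetical helper build_set_key. The key check is an added precaution, not something from the answer: MongoDB treats "." as a path separator and reserves a leading "$", so such keys would not land where intended.

```python
def build_set_key(parent, key, value):
    """Build a $set that writes `value` under a dynamically named key."""
    if not key or "." in key or key.startswith("$"):
        raise ValueError(f"unsafe key: {key!r}")
    return {"$set": {f"{parent}.{key}": value}}


update = build_set_key("subdocuments", "2012-05-30", 80)

# With a live server and pymongo (not run here):
# from pymongo import MongoClient
# coll = MongoClient().test.coll
# coll.update_one({"type": "temperatures"}, update)
```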
You can achieve this behavior by using {strict: false} in your Mongoose schema, although you should check the implications for Mongoose's validation and casting mechanisms.
var flexibleSchema = new Schema( {},{strict: false})
Another way is to use the schema.add method, but I do not think this is the right solution.
The last solution I see is to fetch the whole array to the client side and process it there with underscore.js or whatever library you have. But it depends on your app, the size of the docs, communication steps, etc.