Tracking history in MongoDB

I'm trying to create an application using MongoDB that can track the history of something; for the purposes of this question, let's say the price of books.
I have a collection whose documents currently look like this:
{
"_id" : ObjectId("595bf365d8dd224be0818690"),
"isbn" : 1234567890123,
"title" : "My awesome example book",
"author": "Author McAutherson",
"priceNow": 13.75
}
Now I'd like to add a history of the prices so I can access them later, giving a result similar to:
{
"isbn" : 1234567890123,
"title" : "My awesome example book",
"author": "Author McAutherson",
"priceNow": 13.75,
"priceHistory": [
{
"price": 20.43,
"date": "2017-06-07"
},
{
"price": 10.28,
"date": "2017-05-03"
}
]
}
My question is, should I store the price history in a separate collection and document referencing the _id of the original document, or in the original document, and how would I move the current priceNow into the priceHistory when it changes?
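For the second part of the question, here is one possible sketch, assuming the history is embedded in the book document: a single update can push the old price onto priceHistory and set the new priceNow together. The helper name buildPriceChange, the collection name books, the new price 11.99 and the date are made up for illustration; only $set and $push are standard update operators.

```javascript
// Hypothetical helper: builds the update document for one price change.
// oldPrice is the value currently stored in priceNow; date is supplied by the caller.
function buildPriceChange(oldPrice, newPrice, date) {
  return {
    $set: { priceNow: newPrice },                            // overwrite the current price
    $push: { priceHistory: { price: oldPrice, date: date } } // append the old price to the history
  };
}

const update = buildPriceChange(13.75, 11.99, "2017-07-04");
// In the mongo shell this could be run as (collection name assumed):
// db.books.updateOne({ isbn: 1234567890123, priceNow: 13.75 }, update)
console.log(JSON.stringify(update));
```

Including priceNow in the filter is a design choice: the update then only applies if the price has not been changed by someone else in the meantime.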

Related

Adding multiple key/values

In my database.collection i.e. db.blog.posts I am trying to add a key and value that itself has multiple keys and values.
Current collection:
db.blog.posts.findOne()
"title":"blog posts"
I tried using $set, $push but nothing seems to work.
This also didn't work when I tried adding a single comment:
db.blog.posts.updateOne({"title":"blog posts"}, {"$set":{"comments":[{"comment":"good post", "author":"john","votes":0}]}})
Neither did insertOne instead of updateOne, and I even tried:
var myEmployee=[
{"comment":"good post", "author":"john", "votes":0},
{"comment":"i thought it was too short", "author":"claire","votes":3},
{"comment":"free watches", "author":"claire","votes":-1},
];
db.blog.posts.insert(myEmployee)
This is what I want:
"title" : "A blog post",
"comments" : [
{
"name" : "joe",
"email" : "joe@example.com",
"content" : "nice post."
},
{
"name" : "bob",
"email" : "bob@example.com",
"content" : "good post."
}
]
The updateOne command you have should have created an array for comments with a single entry. If you wanted multiple entries, you can just add multiple objects to the array in the update. The $set operator will change the value of the key to what you set as the second parameter.
db['blog.posts'].updateOne({"title":"blog posts"}, {
"$set": {
"comments":[
{
"name" : "joe",
"email" : "joe@example.com",
"content" : "nice post."
},
{
"name" : "bob",
"email" : "bob@example.com",
"content" : "good post."
}
]
}
})
If you want to add additional items to the comments, this can be done with $push. The $push operator adds to the array.
db['blog.posts'].updateOne({"title":"blog posts"}, {
"$push": {
"comments": {
"comment": "good post",
"author": "john",
"votes": 0
}
}
})
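The difference between the two operators can be sketched with a toy in-memory model. This is not MongoDB's implementation, only an illustration of the semantics described above; the function name applyUpdate is made up, and it handles just $set and plain $push on top-level fields.

```javascript
// Toy model of $set and $push (no modifiers, top-level fields only).
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // deep copy, the input stays untouched
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value; // $set replaces whatever was stored under the key
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (result[field] === undefined) result[field] = []; // $push creates the array if the field is absent
    result[field].push(value); // $push appends one element (throws here if the field is not an array)
  }
  return result;
}

let post = { title: "blog posts" };
post = applyUpdate(post, { $set: { comments: [{ name: "joe" }, { name: "bob" }] } });
post = applyUpdate(post, { $push: { comments: { name: "john" } } });
console.log(post.comments.length); // prints 3
```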
Docs for $set
Docs for $push
NB: the examples above are for a collection named 'blog.posts' rather than a database named 'blog' and a collection named 'posts'. Ideally, bracket notation should be used as the property accessor where the collection name is not a valid JavaScript identifier, although the dot notation in the question still works.

MongoDB: text search on nested view

I have a document in MongoDB 3.2 with the following structure:
"_id" : ObjectId("5759815b94db5928bea3c3a5"),
"source" : "pons1",
"libraries" : [
{
"archive" : "deko1",
"last_access" : ISODate("2016-06-09T14:45:04.644+0000"),
"books" : [
{
"title": "American Gods",
"author": "Neil Gaiman"
},
{
"title": "A Little Life",
"author": "Hanya Yanagihara"
}
]
},
{
"archive" : "deko90",
"last_access" : ISODate("2016-06-10T12:45:03.624+0000"),
"books" : [
{
"title": "Sociology of News",
"author": "Michael Schudson"
},
{
"title": "City of God",
"author": "Augustine of Hippo"
}
]
}
]
There is an array (“books”) inside of another array (“libraries”).
Since the book titles are indexed as "text," I want to be able to conduct a free text search, and return only the relevant array elements.
For instance, if I search for the term “Gods,” I would like to see the following result:
"_id" : ObjectId("5759815b94db5928bea3c3a5"),
"source" : "pons1",
"libraries" : [
{
"archive" : "deko1",
"last_access" : ISODate("2016-06-09T14:45:04.644+0000"),
"books" : [
{
"title": "American Gods",
"author": "Neil Gaiman"
}
]
},
{
"archive" : "deko90",
"last_access" : ISODate("2016-06-10T12:45:03.624+0000"),
"books" : [
{
"title": "City of God",
"author": "Augustine of Hippo"
}
]
}
]
In MongoDB 3.2, you can filter the elements of an array using “$filter” (https://docs.mongodb.com/manual/reference/operator/aggregation/filter/).
The problem is that you cannot use text search ($text) as a condition for $filter.
$text can only be used in the first stage of the aggregation pipeline ($match) (https://docs.mongodb.com/manual/tutorial/text-search-in-aggregation/).
There is one obvious workaround: give up the power of MongoDB’s text search and maybe work with regex.
That does not seem a good option for me. I’d rather not lose the diacritic insensitivity and other interesting functionalities of MongoDB’s text search.
Is there a way of reconciling $filter and $text in the same MongoDB query?

Relationship between 2 collections in MongoDB and show data between them

I am making a small database for library with MongoDB. I have 2 collections, first one is called 'books' which stores information about books. The second collection is called 'publishers' which stores information about the publishers and the IDs of the books which they published.
This is the document structure for 'books'. It has 3 documents
{
"_id" : ObjectId("565f2481104871a4a235ba00"),
"book_id" : 1,
"book_name" : "C++",
"book_detail" : "This is details"
},
{
"_id" : ObjectId("565f2492104871a4a235ba01"),
"book_id" : 2,
"book_name" : "JAVA",
"book_detail" : "This is details"
},
{
"_id" : ObjectId("565f24b0104871a4a235ba02"),
"book_id" : 3,
"book_name" : "PHP",
"book_detail" : "This is details"
}
This is the document structure for 'publishers'. It has 1 document.
{
"_id" : ObjectId("565f2411104871a4a235b9ff"),
"pub_id" : 2,
"pub_name" : "Publisher 2",
"pub_details" : "This is publishers details",
"book_id" : [2,3]
}
I want to write a query to show all the details of the books published by this publisher. I have written the query below, but it does not work. When I run it, it displays the message "Script executed successfully, but there are no results to show."
db.getCollection('publishers').find({"pub_id" : 2}).forEach(
function (functionName) {
functionName.books = db.books.find( { "book_id": functionName.book_id } ).toArray();
}
)
I think that your data structure is flawed. The publisher is a property of a book, not the other way around. You should add pub_id to each book, and remove book_id from the publisher:
{
"_id" : ObjectId("565f2481104871a4a235ba00"),
"book_id" : 1,
"book_name" : "C++",
"book_detail" : "This is details",
"pub_id" : 1
},
{
"_id" : ObjectId("565f2492104871a4a235ba01"),
"book_id" : 2,
"book_name" : "JAVA",
"book_detail" : "This is details",
"pub_id" : 2
},
{
"_id" : ObjectId("565f24b0104871a4a235ba02"),
"book_id" : 3,
"book_name" : "PHP",
"book_detail" : "This is details",
"pub_id" : 2
}
Then, select your books like such:
db.getCollection('books').find({"pub_id" : 2});
Try it this way:
db.getCollection('publishers').find({"pub_id" : 2}).forEach(function (publisher) {
    // book_id is an array, so match any of its values with $in
    publisher.books = db.books.find({ "book_id": { "$in": publisher.book_id } }).toArray();
    // print the combined document; forEach itself returns nothing
    printjson(publisher);
});
I would suggest reading the official documentation, because the relationship between books and publishers is precisely the example which is used there: https://docs.mongodb.org/manual/tutorial/model-referenced-one-to-many-relationships-between-documents/
In MongoDB and NoSQL at large, it is not true that publisher must be a property of book. This is only the case in an RDBMS, where in one-to-many relationships the reference is in the "one" part. The same works the other way round: books don't have to be a property of publisher. The clue here is in the absence of "must."
It all depends on how large the "many" side of the relationship is. In this case, I'd say it's also about what type of library we're talking about, the size of the catalogue and whether new book purchases are common:
Is the number of books per publisher small AND data about publisher is often accessed with data about the book? Then, embed publisher info in the book document.
Is the number of books per publisher fairly big but reasonably stable (e.g.: historical library where acquisitions are rare)? Then, create a publishers collection with an array of books per publisher.
Is the number of books per publisher fairly big and catalogue grows at a reasonable pace? Then, include reference in the book document and fetch publisher info with it.
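The three options above can be sketched as plain JavaScript objects. The field names follow the question; the embedded publisher sub-document in option 1 and the in-memory filters standing in for the second query are assumptions made for illustration.

```javascript
// Option 1: publisher details embedded in each book document.
const embeddedBook = { _id: 2, book_name: "JAVA", publisher: { pub_id: 2, pub_name: "Publisher 2" } };

// Option 2: the publisher holds an array of book ids (the layout from the question).
const publisherWithBooks = { pub_id: 2, pub_name: "Publisher 2", book_id: [2, 3] };

// Option 3: each book holds a reference to its publisher.
const referencingBooks = [
  { book_id: 1, book_name: "C++", pub_id: 1 },
  { book_id: 2, book_name: "JAVA", pub_id: 2 },
  { book_id: 3, book_name: "PHP", pub_id: 2 }
];

// Resolving options 2 and 3 in application code, standing in for the follow-up query.
const viaArray = referencingBooks.filter(b => publisherWithBooks.book_id.includes(b.book_id));
const viaReference = referencingBooks.filter(b => b.pub_id === publisherWithBooks.pub_id);
console.log(viaArray.map(b => b.book_name), viaReference.map(b => b.book_name));
```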
Side note
Although not related to the question, I think your document structure is flawed. _id and book_id are redundant. If you want to follow the RDBMS pattern of incremental integer IDs, then it's absolutely OK that you specify your own _id at the time of inserting the document with 1, 2, 3, etc. ObjectID() is a great thing, but, again, there's no obligation to use it.

dynamic size of subdocument mongodb

I'm using mongodb and mongoose for my web application. The web app is used for registration for swimming competitions and each competition can have X number of races. My data structure as of now:
{
"_id": "1",
"name": "Utmanaren",
"location": "town",
"startdate": "20150627",
"enddate": "20150627",
"race" : {
"gender" : "m",
"style" : "freestyle",
"length" : "100"
}
}
Doing it this way, I need to determine and define the number of races for every competition. One solution I tried is storing each race as a separate document with an ID indicating which competition it belongs to, like below.
{
"belongsTOId" : "1",
"gender" : "m",
"style" : "freestyle",
"length" : "100"
}
{
"belongsTOId" : "1",
"gender" : "f",
"style" : "butterfly",
"length" : "50"
}
Is there a way of creating and defining dynamic number of races as a subdocument while using Mongodb?
Thanks!
There are basically two approaches to modelling this data: you can design a schema that either references or embeds the race documents.
Let's consider the following example that maps swimming competition and multiple races relationships. This demonstrates the advantage of embedding over referencing if you need to view many data entities in context of another. In this one-to-many relationship between competition and race data, the competition has multiple races entities:
// db.competition schema
{
"_id": 1,
"name": "Utmanaren",
"location": "town",
"startdate": "20150627",
"enddate": "20150627",
"races": [
{
"gender" : "m",
"style" : "freestyle",
"length" : "100"
},
{
"gender" : "f",
"style" : "butterfly",
"length" : "50"
}
]
}
With the embedded data model, your application can retrieve the complete swimming competition information with just one query. This design has other merits as well, one of them being data locality. Since MongoDB stores data contiguously on disk, putting all the data you need in one document ensures that the spinning disks will take less time to seek to a particular location on the disk. The other advantage with embedded documents is the atomicity and isolation in writing data. To illustrate this, say you want to remove a competition which has a race "style" property with value "butterfly", this can be done with one single (atomic) operation:
db.competition.remove({"races.style": "butterfly"});
For more details on data modelling in MongoDB, please read the docs Data Modeling Introduction, specifically Model One-to-Many Relationships with Embedded Documents
The other design option is referencing: documents follow a normalized schema in which each race document contains a reference to its competition document:
// db.race schema
{
"_id": 1,
"competition_id": 1,
"gender": "m",
"style": "freestyle",
"length": "100"
},
{
"_id": 2,
"competition_id": 1,
"gender": "f",
"style": "butterfly",
"length": "50"
}
The above approach gives increased flexibility in performing queries. For instance, retrieving all child race documents whose parent competition has id 1 is straightforward; simply query the race collection:
db.race.find({"competition_id": 1});
The normalized schema using document references also has an advantage when you have one-to-many relationships with very unpredictable arity. If you have hundreds or thousands of race documents per competition, embedding runs into space constraints: the larger the document, the more RAM it uses, and MongoDB documents have a hard size limit of 16MB.
If your application frequently retrieves the race data with the competition information, then your application needs to issue multiple queries to resolve the references.
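On MongoDB 3.2 or later, the $lookup aggregation stage is one way to resolve such references on the server in a single round trip. A minimal sketch, assuming the collections are named competition and race as in the schemas above; the output field name races is an arbitrary choice.

```javascript
// Aggregation pipeline joining one competition with its race documents.
const pipeline = [
  { $match: { _id: 1 } },               // pick the competition
  { $lookup: {
      from: "race",                     // collection holding the referencing documents
      localField: "_id",                // competition._id
      foreignField: "competition_id",   // race.competition_id
      as: "races"                       // matching races are placed in this array
  } }
];
// In the mongo shell: db.competition.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```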
The general rule of thumb is that if your application's query pattern is well known and data tends to be accessed in only one way, an embedded approach works well. If your application queries data in many ways, or you are unable to anticipate the query patterns, a more normalized document referencing model is more appropriate.
Ref:
MongoDB Applied Design Patterns: Practical Use Cases with the Leading NoSQL Database By Rick Copeland
You basically want to update the data, so you should upsert the data which is basically an update on the subdocument key.
Keep an array of keys in the main document.
Insert the sub-document and add the key to the list or update the list.
To push a single item into the field (the first argument selects the document to update, here the competition with _id "1" from the question):
db.yourcollection.update( { "_id": "1" }, { $push: { "races": { "belongsTOId" : "1" , "gender" : "f" , "style" : "butterfly" , "length" : "50"} } } );
To push multiple items into the field (duplicates are allowed):
db.yourcollection.update( { "_id": "1" }, { $push: { "races": { $each: [ { "belongsTOId" : "1" , "gender" : "f" , "style" : "butterfly" , "length" : "50"}, { "belongsTOId" : "2" , "gender" : "m" , "style" : "horse" , "length" : "70"} ] } } } );
To push multiple items while skipping any that are already present:
db.yourcollection.update( { "_id": "1" }, { $addToSet: { "races": { $each: [ { "belongsTOId" : "1" , "gender" : "f" , "style" : "butterfly" , "length" : "50"}, { "belongsTOId" : "2" , "gender" : "m" , "style" : "horse" , "length" : "70"} ] } } } );
$pushAll has been deprecated since version 2.4, so use $each with $push instead of $pushAll.
While using $push you will be able to sort and slice items. You might check the mongodb manual.
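As a rough sketch of the sort and slice modifiers: $sort and $slice are used together with $each inside $push. The helper name buildRacePush and the choice of sorting by style are assumptions for illustration.

```javascript
// Hypothetical helper: append races, keep the array sorted by style,
// and cap it at `limit` entries.
function buildRacePush(races, limit) {
  return {
    $push: {
      races: {
        $each: races,          // required whenever $sort or $slice is used
        $sort: { style: 1 },   // ascending by the "style" field
        $slice: limit          // a positive value keeps the first `limit` elements
      }
    }
  };
}

const racePush = buildRacePush([{ gender: "f", style: "butterfly", length: "50" }], 20);
// In the mongo shell: db.yourcollection.update({ "_id": "1" }, racePush)
console.log(JSON.stringify(racePush));
```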

Can I utilize indexes when querying by MongoDB subdocument without known field names?

I have a document structure like follows:
{
"_id": ...,
"name": "Document name",
"properties": {
"prop1": "something",
"2ndprop": "other_prop",
"other3": ["tag1", "tag2"],
}
}
I can't know the actual field names in properties subdocument (they are given by the application user), so I can't create indexes like properties.prop1. Neither can I know the structure of the field values, they can be single value, embedded document or array.
Is there any practical way to do performant queries to the collection with this kind of schema design?
One option that came to my mind is to add a new field to the document, index it and set used field names per document into this field.
{
"_id": ...,
"name": "Document name",
"properties": {
"prop1": "something",
"2ndprop": "other_prop",
"other3": ["tag1", "tag2"],
},
"property_fields": ["prop1", "2ndprop", "other3"]
}
Now I could first run query against property_fields field and after that let MongoDB scan through the found documents to see whether properties.prop1 contains the required value. This is definitely slower, but could be viable.
One way of dealing with this is to use a schema like the one below.
{
"name" : "Document name",
"properties" : [
{
"k" : "prop1",
"v" : "something"
},
{
"k" : "2ndprop",
"v" : "other_prop"
},
{
"k" : "other3",
"v" : "tag1"
},
{
"k" : "other3",
"v" : "tag2"
}
]
}
Then you can index "properties.k" and "properties.v" for example like this:
db.foo.createIndex({"properties.k": 1, "properties.v": 1})
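A minimal sketch of how an application might convert the original free-form properties object into this layout and build a matching query. The function names toKeyValueArray and propertyQuery are made up; $elemMatch is the standard operator for requiring both conditions on the same array element, which is the kind of query the compound index is meant to serve.

```javascript
// Flatten a free-form properties object into the [{k, v}] layout shown above.
// Array values become one entry per element, as with "other3".
function toKeyValueArray(properties) {
  const out = [];
  for (const [k, value] of Object.entries(properties)) {
    const values = Array.isArray(value) ? value : [value];
    for (const v of values) out.push({ k, v });
  }
  return out;
}

// Query for documents that have key k with value v on the same array element.
function propertyQuery(k, v) {
  return { properties: { $elemMatch: { k, v } } };
}

const kv = toKeyValueArray({ prop1: "something", "2ndprop": "other_prop", other3: ["tag1", "tag2"] });
console.log(kv.length); // prints 4
// In the mongo shell: db.foo.find(propertyQuery("other3", "tag2"))
console.log(JSON.stringify(propertyQuery("other3", "tag2")));
```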