MongoDB: changing how relations are modeled - mongodb

I'm new to MongoDB and am considering various use cases before really using it.
Consider this collection:
{
"id": "item1",
"tags": ["tag1", "tag2", "tag3"]
},
{
"id": "item2",
"tags": ["tag2", "tag4"]
}
It's a simple implementation of relations. It works fine as long as I attach tags to only one type of entry in the database.
Imagine that I want to add a new entry type to a live system and attach tags to that type as well. In a "normal" relational database it is not too hard to change the relations: separate the tags into a new table (tag | id, name), then create connection tables (item_tag, item2_tag).
But how do I achieve that in MongoDB?

Imagine that I want to add new entry type in live system and I want to attach tags to that type
There's no need to change anything. You just create that new entry type (you mean a separate collection, correct?) and attach tags to it the same way you do now, embedded arrays.
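As a rough illustration, the sketch below models the two entry types as plain Python lists of dicts. These are in-memory stand-ins for two collections, and the `articles` name and its documents are made up for the example. It only shows that the same embedded `tags` array works unchanged for a new entry type. With PyMongo the lookup would be along the lines of `collection.find({"tags": "tag2"})`.

```python
# In-memory stand-ins for two MongoDB collections (hypothetical data).
items = [
    {"id": "item1", "tags": ["tag1", "tag2", "tag3"]},
    {"id": "item2", "tags": ["tag2", "tag4"]},
]

# A new entry type added later; it embeds its tags the same way.
articles = [
    {"id": "article1", "tags": ["tag2"]},
    {"id": "article2", "tags": ["tag5"]},
]


def find_by_tag(collection, tag):
    """Return the ids of documents whose embedded tags array contains tag."""
    return [doc["id"] for doc in collection if tag in doc.get("tags", [])]


items_with_tag2 = find_by_tag(items, "tag2")
articles_with_tag2 = find_by_tag(articles, "tag2")
```

Nothing about the existing `items` documents has to change when `articles` is introduced.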

Related

MongoDB Embedding alongside referencing

There is a lot of content about which kind of relationship to use in a database schema. However, I have not seen anything about mixing both techniques.
The idea is to embed only the necessary attributes, together with a reference. This way the application has the data it needs for rendering and the reference it needs for the update methods.
The problem I see is that the logic for handling CRUD operations becomes trickier, because it is mandatory to update multiple collections; on the other hand, I get all the information in a single read.
Basic schema for a page that only wants the students' names of a classroom:
CLASSROOM COLLECTION
{"_id": ObjectID(),
"students": [{"studentId" : ObjectID(),
"name" : "John Doe",
},
...
]
}
STUDENTS COLLECTION
{"_id": ObjectID(),
"name" : "John Doe",
"address" : "...",
"age" : "...",
"gender": "..."
}
I use the students collection on a different page, and there I do not want any information about the classroom. That is the reason not to embed the students.
I started learning MongoDB a few days ago and I don't know whether this kind of schema brings problems.
You can embed some fields and store other fields in a different collection as you are suggesting.
The issues with such an arrangement in my opinion would be:
What is the authority for a field? For example, what if a field like name is both embedded and stored in the separate collection, and the values differ?
Both updating and querying become awkward as you need to do it differently depending on which field is being worked with. If you make a mistake and go in the wrong place, you create/compound the first issue.
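To make the double-write concern concrete, here is a minimal sketch that uses plain Python dicts as stand-ins for the two collections (the ids `s1`, `c1`, `c2` and the helper name `rename_student` are invented for the example). It treats the students collection as the authority for `name` and pushes every change to the embedded copies as well. With PyMongo the second step would presumably be an `update_many` on the classroom collection with a filter on `students.studentId` and a `$set` on `students.$.name`.

```python
# Stand-in for the students collection, keyed by _id (the authority for "name").
students = {
    "s1": {"_id": "s1", "name": "John Doe", "address": "...", "age": "..."},
}

# Stand-in for the classroom collection, holding an embedded copy of the name.
classrooms = [
    {"_id": "c1", "students": [{"studentId": "s1", "name": "John Doe"}]},
    {"_id": "c2", "students": []},
]


def rename_student(student_id, new_name):
    """Update the authoritative document, then every embedded copy."""
    students[student_id]["name"] = new_name
    for classroom in classrooms:
        for entry in classroom["students"]:
            if entry["studentId"] == student_id:
                entry["name"] = new_name
```

Skipping the second loop is exactly the mistake described above: the embedded name and the authoritative name drift apart.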

MongoDB - how to reference children/nested document _id's within parent on insert

I am very new to MongoDb but the project I was just brought in on uses it to store message threads like this:
{
"_id": ObjectId("messageThreadId"),
"messages": [
{
"_id": ObjectId("messageId"),
"body": "Lorem ipsum..."
}, etc...]
"users": [
{
"_id": ObjectId("userId"),
"unreadMessages": ['messageId', 'messageId', etc...]
}
]
}
I need to use pymongo to insert brand new messageThreads which should (initially) contain a single message. However, I am not clear on how to construct the users.unreadMessages lists of messageIds (which should contain just the newly-created initial message). Is there a way of referencing the initial message's _id before/as it's created, from within the same document? Also worth noting that unreadMessages is a list of strings, not ObjectId()s.
Do I need to create the messageThread with the unreadMessages list empty, then go back and retrieve the initial message's _id that was just created, then update every unreadMessages in the list of users? It feels wrong to require multiple transactions for an insert, but this whole schema feels wrong to me.
As DaveStSomeWhere said, I ended up pre-generating the ObjectId and then using it in the document before insertion. This is what PyMongo does anyway when it inserts a document that has no _id; see the relevant code in pymongo.collection.insert_one(). Thanks Dave.
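A minimal sketch of that approach, assuming the document shape from the question. The helper name `build_thread`, the sample user ids, and the collection name in the final comment are made up. `ObjectId` comes from the `bson` package that ships with PyMongo; the fallback exists only so the sketch runs where PyMongo is not installed.

```python
try:
    from bson import ObjectId  # installed together with PyMongo
except ImportError:
    import os

    def ObjectId():
        """Fallback for environments without PyMongo: random 24-char hex string."""
        return os.urandom(12).hex()


def build_thread(body, user_ids):
    """Build a new message thread whose users reference the first message's _id."""
    message_id = ObjectId()  # generated client-side, before any insert
    return {
        "_id": ObjectId(),
        "messages": [{"_id": message_id, "body": body}],
        "users": [
            # unreadMessages holds strings, not ObjectIds, per the existing schema
            {"_id": user_id, "unreadMessages": [str(message_id)]}
            for user_id in user_ids
        ],
    }


thread = build_thread("Lorem ipsum...", ["user1", "user2"])
# With a PyMongo collection (name assumed): db.messageThreads.insert_one(thread)
```

Because the id exists before the insert, the whole thread goes in with a single write and no follow-up update.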

Inserting multiple key value pair data under single _id in cloudant db at various timings?

My requirement is to receive JSON key-value pairs from an MQTT subscriber at different times and store them under a single _id in Cloudant. However, when I try to insert a new pair into the existing _id, it simply replaces the old one. I need at least 10 JSON pairs under one _id, inserted at different times.
First, you should make sure about your architectural decision to update a particular document multiple times. In general, this is discouraged, though it depends on your application. Instead, you could consider a way to insert each new piece of information as a separate document and then use a map-reduce view to reflect the state of your application.
For example (I'm going to assume that you have multiple "devices", each with some kind of unique identifier, that need to add data to a cloudant DB)
PUT
{
"info_a":"data a",
"device_id":123
}
{
"info_b":"data b",
"device_id":123
}
{
"info_a":"message a",
"device_id":1234
}
Then you'll need a map function like
_design/device/_view/state
function (doc) {
emit(doc.device_id, 1);
}
Then you can GET the results of that view to see all of the "info_X" data that is associated with the particular device.
GET account.cloudant.com/databasename/_design/device/_view/state
{"total_rows":3,"offset":0,"rows":[
{"id":"28324b34907981ba972937f53113ac3f","key":123,"value":1},
{"id":"d50553d206d722b960fb176f11841974","key":123,"value":1},
{"id":"eaa710a5fa1ff4ba6156c997ddf6099b","key":1234,"value":1}
]}
Then you can use the query parameters to control the output, for example
GET account.cloudant.com/databasename/_design/device/_view/state?key=123&include_docs=true
{"total_rows":3,"offset":0,"rows":[
{"id":"28324b34907981ba972937f53113ac3f","key":123,"value":1,"doc":
{"_id":"28324b34907981ba972937f53113ac3f",
"_rev":"1-bac5dd92a502cb984ea4db65eb41feec",
"info_b":"data b",
"device_id":123}
},
{"id":"d50553d206d722b960fb176f11841974","key":123,"value":1,"doc":
{"_id":"d50553d206d722b960fb176f11841974",
"_rev":"1-a2a6fea8704dfc0a0d26c3a7500ccc10",
"info_a":"data a",
"device_id":123}}
]}
And now you have the complete state for device_id:123.
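The same idea can be sketched in a few lines of Python. This is an in-memory imitation of the view, not Cloudant's API: the short `_id` values and the helper names `view_rows` and `device_state` are invented, and the documents mirror the three examples above.

```python
# The three example documents, with made-up short ids.
docs = [
    {"_id": "a", "info_a": "data a", "device_id": 123},
    {"_id": "b", "info_b": "data b", "device_id": 123},
    {"_id": "c", "info_a": "message a", "device_id": 1234},
]


def view_rows(docs):
    """Mimic the map function: one row per document, keyed by device_id."""
    return [
        {"id": d["_id"], "key": d["device_id"], "value": 1, "doc": d}
        for d in docs
    ]


def device_state(docs, device_id):
    """Merge every info_* field from the rows whose key matches device_id."""
    state = {}
    for row in view_rows(docs):
        if row["key"] == device_id:
            for field, value in row["doc"].items():
                if field.startswith("info_"):
                    state[field] = value
    return state


state_123 = device_state(docs, 123)
```

Each device write stays a plain insert, and the combined state is assembled at read time.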
Timing
Another issue is the rate at which you're updating your documents.
Bottom line recommendation is that if you are only updating the document once per ~minute or less frequently, then it could be reasonable for your application to update a single document. That is, you'd add new key-value pairs to the same document with the same _id value. In order to do that, however, you'll need to GET the full doc, add the new key-value pair, and then PUT that document back to the database. You must make sure that you are providing the most recent _rev of that document and you should also check for conflicts that could occur if the document is being updated by multiple devices.
If you are acquiring new data for a particular device at a high rate, you'll likely run into conflicts very frequently -- because cloudant is a distributed document store. In this case, you should follow something like the example I gave above.
Example flow for the second approach outlined by @gadamcox for use cases where document updates are not required very frequently:
[...] you'd add new key-value pairs to the same document with the same _id value. In order to do that, however, you'll need to GET the full doc, add the new key-value pair, and then PUT that document back to the database.
Your application first fetches the existing document by id: (https://docs.cloudant.com/document.html#read)
GET /$DATABASE/100
{
"_id": "100",
"_rev": "1-2902191555...",
"No": ["1"]
}
Then your application updates the document in memory
{
"_id": "100",
"_rev": "1-2902191555...",
"No": ["1","2"]
}
and saves it in the database by specifying the _id and _rev (https://docs.cloudant.com/document.html#update)
PUT /$DATABASE/100
{
"_id": "100",
"_rev": "1-2902191555...",
"No":["1","2"]
}
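The read-modify-write cycle, including the _rev check, can be sketched as follows. `ToyDatabase` is a deliberately tiny in-memory stand-in that only imitates revision checking; it is not a Cloudant client. The names `Conflict`, `append_value`, the `retries` parameter, and the `"N-toy"` revision format are all assumptions for the example. A real database rejects a stale _rev with a 409 Conflict response.

```python
import copy


class Conflict(Exception):
    """Raised when the supplied _rev is stale (a real database answers 409)."""


class ToyDatabase:
    """In-memory stand-in that only imitates the _rev check on update."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        return copy.deepcopy(self._docs[doc_id])

    def put(self, doc):
        current = self._docs.get(doc["_id"])
        if current is not None and current["_rev"] != doc.get("_rev"):
            raise Conflict(doc["_id"])
        generation = int(current["_rev"].split("-")[0]) + 1 if current else 1
        stored = copy.deepcopy(doc)
        stored["_rev"] = f"{generation}-toy"
        self._docs[doc["_id"]] = stored
        return stored["_rev"]


def append_value(db, doc_id, value, retries=3):
    """GET the document, add the value in memory, PUT it back; retry on conflict."""
    for _ in range(retries):
        doc = db.get(doc_id)
        doc["No"].append(value)
        try:
            return db.put(doc)
        except Conflict:
            continue  # someone else updated the doc first; re-read and try again
    raise Conflict(doc_id)
```

Re-reading the document before each retry is what guarantees that the PUT carries the most recent _rev.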

List documents in Meteor collection with duplicate first names

My 'Programs' collection would look like this (as an array);
[{ FullName: "Jane Doe", CampYear: "mays15",...}, { FullName: "Jane Doe", CampYear: "mays16",...},...]
Some people in the collection are newbies and have just one document in the collection. Others have multiple documents and are returnees. We'd like the ability to mark or flag somehow the newbies. Somehow iterate through the collection and single out those who just have one document in there. The trouble is if I have a list of, say, 150 names, for each name I'd have to have a separate find operation on the collection, which is too intensive.
I tried using aggregation via the meteorhacks:aggregate but couldn't get it to work. After loading the package, my IDE wouldn't recognize the .aggregate method at all, even on the server.
Underscore might be a worthwhile way of doing it, but I couldn't find a method that might be of assistance.
Any ideas how we could do this?
Based on your comment, I'd probably denormalize your data. I'd have a new collection called CampAttendance or something like that. Then you'd have the structure:
{
"name": "The camper's name",
"years": ["mays2015", ...]
}
You can then use upsert to either insert a new record or $push another camp year onto the years array as you're importing data.
To get the camper names who are 'newbies' then, you do:
CampAttendance.find({ years: { $size: 1 } });
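The import step and the newbie lookup can be sketched in Python with a plain dict standing in for the CampAttendance collection. The "John Roe" record and the helper name `record_attendance` are made up. With PyMongo the per-record write would presumably be `update_one({"name": name}, {"$push": {"years": year}}, upsert=True)`.

```python
# Rows as they appear in the Programs collection (John Roe is a made-up newbie).
programs = [
    {"FullName": "Jane Doe", "CampYear": "mays15"},
    {"FullName": "Jane Doe", "CampYear": "mays16"},
    {"FullName": "John Roe", "CampYear": "mays16"},
]

# Stand-in for the CampAttendance collection: name -> list of years.
attendance = {}


def record_attendance(attendance, name, year):
    """Upsert: create the camper on first sight, otherwise push another year."""
    attendance.setdefault(name, []).append(year)


for row in programs:
    record_attendance(attendance, row["FullName"], row["CampYear"])

# Equivalent of find({ years: { $size: 1 } }): campers with exactly one year.
newbies = sorted(name for name, years in attendance.items() if len(years) == 1)
```

The whole collection is walked once during the import, so no per-name find is needed afterwards.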

Mongo DB chained query

I am new to MongoDB and I wonder whether a chained query like the following is possible (somewhat like a join):
db.places.insert({
"_id": original_id,
"place_name": "Broadway Center",
"url": "bc.example.net"})
db.people.insert({
"name": "Erin",
"places_id": original_id,
"url": "bc.example.net/Erin"})
So given a place name string, I want to select the people associated with that place.
But the people collection only references the place id, not the place name.
MongoDB does not support SQL-style joins in ordinary queries (since version 3.2 the aggregation framework does offer $lookup, which performs a left outer join).
The idiomatic solution is to retrieve all place ids for that place_name from your places collection and then use those ids to query your people collection.
Another option is to embed the place in each document of the people collection (this makes more sense to me than embedding people inside places, but of course it depends on your domain). Keep in mind, though, that if a single place changes, you have to update every people document that shares it. With people and places in separate collections this doesn't happen, so the choice depends on whether the data is static and on whether you want to optimize for searches or for updates.
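The two-step lookup can be sketched with in-memory lists standing in for the two collections. The numeric ids, the second place, the second person, and the helper name `people_at` are invented for the example. With PyMongo the two steps would be roughly `db.places.find({"place_name": name})` followed by `db.people.find({"places_id": {"$in": ids}})`.

```python
# Stand-ins for the places and people collections (ids are made up).
places = [
    {"_id": 1, "place_name": "Broadway Center", "url": "bc.example.net"},
    {"_id": 2, "place_name": "Other Place", "url": "op.example.net"},
]
people = [
    {"name": "Erin", "places_id": 1, "url": "bc.example.net/Erin"},
    {"name": "Sam", "places_id": 2, "url": "op.example.net/Sam"},
]


def people_at(place_name):
    """Step 1: collect matching place ids. Step 2: select people by those ids."""
    place_ids = {p["_id"] for p in places if p["place_name"] == place_name}
    return [person["name"] for person in people if person["places_id"] in place_ids]


erin_and_friends = people_at("Broadway Center")
```

The join happens in application code, at the cost of two round trips to the database.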