List documents in Meteor collection with duplicate first names - mongodb

My 'Programs' collection looks like this (shown as an array):
[{ FullName: "Jane Doe", CampYear: "mays15",...}, { FullName: "Jane Doe", CampYear: "mays16",...},...]
Some people in the collection are newbies and have just one document in the collection. Others have multiple documents and are returnees. We'd like the ability to mark or flag somehow the newbies. Somehow iterate through the collection and single out those who just have one document in there. The trouble is if I have a list of, say, 150 names, for each name I'd have to have a separate find operation on the collection, which is too intensive.
I tried using aggregation via the meteorhacks:aggregate but couldn't get it to work. After loading the package, my IDE wouldn't recognize the .aggregate method at all, even on the server.
Underscore might be a worthwhile way of doing it, but I couldn't find a method that might be of assistance.
Any ideas how we could do this?

Based on your comment, I'd probably denormalize your data. I'd have a new collection called CampAttendance or something like that. Then you'd have the structure:
{
"name": "The camper's name",
"years": ["mays2015", ...]
}
You can then use upsert to either insert a new record or $push another camp year onto the years array as you're importing data.
To get the camper names who are 'newbies' then, you do:
CampAttendance.find({ years: { $size: 1 } });
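As a rough illustration of the same idea without any database, the sketch below groups 'Programs'-style documents by name in a single pass and then keeps the campers with exactly one year. The function names and the "John Roe" record are made up for the example, and the filter only mimics what the $size query above would return.

```javascript
// Sample data shaped like the 'Programs' documents in the question.
// "John Roe" is an invented record added for illustration.
const programs = [
  { FullName: "Jane Doe", CampYear: "mays15" },
  { FullName: "Jane Doe", CampYear: "mays16" },
  { FullName: "John Roe", CampYear: "mays16" },
];

// Build one { name, years } record per camper in a single pass,
// instead of running a separate find for every name.
function buildAttendance(docs) {
  const byName = new Map();
  for (const { FullName, CampYear } of docs) {
    if (!byName.has(FullName)) {
      byName.set(FullName, { name: FullName, years: [] });
    }
    const record = byName.get(FullName);
    if (!record.years.includes(CampYear)) record.years.push(CampYear);
  }
  return [...byName.values()];
}

// In-memory counterpart of find({ years: { $size: 1 } }).
function findNewbies(attendance) {
  return attendance.filter((camper) => camper.years.length === 1);
}

const newbies = findNewbies(buildAttendance(programs));
console.log(newbies.map((camper) => camper.name)); // [ 'John Roe' ]
```

Note that this groups on the full name, so two different people with the same name would be merged into one record.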

Related

MongoDB - how to reference children/nested document _id's within parent on insert

I am very new to MongoDb but the project I was just brought in on uses it to store message threads like this:
{
"_id": ObjectId("messageThreadId"),
"messages": [
{
"_id": ObjectId("messageId"),
"body": "Lorem ipsum..."
}, etc...],
"users": [
{
"_id": ObjectId("userId"),
"unreadMessages": ['messageId', 'messageId', etc...]
}
]
}
I need to use pymongo to insert brand new messageThreads which should (initially) contain a single message. However, I am not clear on how to construct the users.unreadMessages lists of messageIds (which should contain just the newly-created initial message). Is there a way of referencing the initial message's _id before/as it's created, from within the same document? Also worth noting that unreadMessages is a list of strings, not ObjectId()s.
Do I need to create the messageThread with the unreadMessages list empty, then go back and retrieve the initial message's _id that was just created, then update every unreadMessages in the list of users? It feels wrong to require multiple transactions for an insert, but this whole schema feels wrong to me.
As DaveStSomeWhere said, I ended up pre-generating the ObjectId and then using it in the document before insertion. This is what PyMongo does anyway when it inserts a document; see the relevant code in pymongo.collection.insert_one(). Thanks Dave.
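The pre-generation idea is not specific to any driver, so here is a minimal sketch of it in plain JavaScript. The newHexId helper is only a stand-in that produces 24 random hex characters (the same length as an ObjectId's hex form); it is not a real ObjectId. The buildThread name and the sample user ids are hypothetical.

```javascript
// Stand-in id generator: 24 random hex characters. With a real driver you
// would create an ObjectId here instead; this helper is an assumption.
function newHexId() {
  let id = "";
  for (let i = 0; i < 24; i++) {
    id += Math.floor(Math.random() * 16).toString(16);
  }
  return id;
}

// Build the complete thread document before any insert happens.
function buildThread(body, userIds, makeId = newHexId) {
  const messageId = makeId(); // generated up front so it can be reused below
  return {
    _id: makeId(),
    messages: [{ _id: messageId, body }],
    users: userIds.map((userId) => ({
      _id: userId,
      // The schema stores unread ids as strings, so convert explicitly.
      unreadMessages: [String(messageId)],
    })),
  };
}

const thread = buildThread("Lorem ipsum...", ["userA", "userB"]);
console.log(thread.users[0].unreadMessages[0] === thread.messages[0]._id); // true
```

Because the whole document is assembled first, a single insert is enough and no follow-up update of unreadMessages is needed.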

Inserting multiple key value pair data under single _id in cloudant db at various timings?

My requirement is to store JSON key-value pairs received from an MQTT subscriber at different times under a single _id in Cloudant. However, when I try to insert a new JSON pair into an existing _id, it simply replaces the old one. I need at least 10 JSON pairs under one _id, inserted at different times.
First, you should make sure about your architectural decision to update a particular document multiple times. In general, this is discouraged, though it depends on your application. Instead, you could consider a way to insert each new piece of information as a separate document and then use a map-reduce view to reflect the state of your application.
For example (I'm going to assume that you have multiple "devices", each with some kind of unique identifier, that need to add data to a cloudant DB)
PUT
{
"info_a":"data a",
"device_id":123
}
{
"info_b":"data b",
"device_id":123
}
{
"info_a":"message a",
"device_id":1234
}
Then you'll need a map function like
_design/device/_view/state
function (doc) {
emit(doc.device_id, 1);
}
Then you can GET the results of that view to see all of the "info_X" data that is associated with the particular device.
GET account.cloudant.com/databasename/_design/device/_view/state
{"total_rows":3,"offset":0,"rows":[
{"id":"28324b34907981ba972937f53113ac3f","key":123,"value":1},
{"id":"d50553d206d722b960fb176f11841974","key":123,"value":1},
{"id":"eaa710a5fa1ff4ba6156c997ddf6099b","key":1234,"value":1}
]}
Then you can use the query parameters to control the output, for example
GET account.cloudant.com/databasename/_design/device/_view/state?key=123&include_docs=true
{"total_rows":3,"offset":0,"rows":[
{"id":"28324b34907981ba972937f53113ac3f","key":123,"value":1,"doc":
{"_id":"28324b34907981ba972937f53113ac3f",
"_rev":"1-bac5dd92a502cb984ea4db65eb41feec",
"info_b":"data b",
"device_id":123}
},
{"id":"d50553d206d722b960fb176f11841974","key":123,"value":1,"doc":
{"_id":"d50553d206d722b960fb176f11841974",
"_rev":"1-a2a6fea8704dfc0a0d26c3a7500ccc10",
"info_a":"data a",
"device_id":123}}
]}
And now you have the complete state for device_id:123.
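If the application wants a single object per device, the rows returned with include_docs=true can be merged on the client. This is only a sketch: mergeDeviceState is a hypothetical helper, and it assumes that later rows may overwrite earlier ones when two documents carry the same field.

```javascript
// Rows shaped like the ?key=123&include_docs=true response above.
const rows = [
  {
    id: "28324b34907981ba972937f53113ac3f", key: 123, value: 1,
    doc: {
      _id: "28324b34907981ba972937f53113ac3f",
      _rev: "1-bac5dd92a502cb984ea4db65eb41feec",
      info_b: "data b",
      device_id: 123,
    },
  },
  {
    id: "d50553d206d722b960fb176f11841974", key: 123, value: 1,
    doc: {
      _id: "d50553d206d722b960fb176f11841974",
      _rev: "1-a2a6fea8704dfc0a0d26c3a7500ccc10",
      info_a: "data a",
      device_id: 123,
    },
  },
];

// Fold every document for a device into one state object.
function mergeDeviceState(viewRows) {
  const state = {};
  for (const { doc } of viewRows) {
    for (const [field, value] of Object.entries(doc)) {
      if (field === "_id" || field === "_rev") continue; // per-document metadata
      state[field] = value; // a later row wins if the field repeats
    }
  }
  return state;
}

console.log(mergeDeviceState(rows));
// { info_b: 'data b', device_id: 123, info_a: 'data a' }
```

Rows for the same key are not ordered by insertion time, so if the same field can arrive more than once you would likely want a timestamp in each document to decide which value is current.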
Timing
Another issue is the rate at which you're updating your documents.
The bottom-line recommendation is that if you are only updating the document about once per minute or less frequently, then it could be reasonable for your application to update a single document. That is, you'd add new key-value pairs to the same document with the same _id value. In order to do that, however, you'll need to GET the full doc, add the new key-value pair, and then PUT that document back to the database. You must make sure that you are providing the most recent _rev of that document, and you should also check for conflicts that could occur if the document is being updated by multiple devices.
If you are acquiring new data for a particular device at a high rate, you'll likely run into conflicts very frequently -- because cloudant is a distributed document store. In this case, you should follow something like the example I gave above.
Example flow for the second approach outlined by @gadamcox for use cases where document updates are not required very frequently:
[...] you'd add new key-value pairs to the same document with the same _id value. In order to do that, however, you'll need to GET the full doc, add the new key-value pair, and then PUT that document back to the database.
Your application first fetches the existing document by id: (https://docs.cloudant.com/document.html#read)
GET /$DATABASE/100
{
"_id": "100",
"_rev": "1-2902191555...",
"No": ["1"]
}
Then your application updates the document in memory
{
"_id": "100",
"_rev": "1-2902191555...",
"No": ["1","2"]
}
and saves it in the database by specifying the _id and _rev (https://docs.cloudant.com/document.html#update)
PUT /$DATABASE/100
{
"_id": "100",
"_rev": "1-2902191555...",
"No":["1","2"]
}
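The read-modify-write cycle above, including the conflict check, can be sketched with a small in-memory stand-in. FakeDb is hypothetical and only imitates the revision check (rejecting a stale _rev with a 409-style error); it is not the Cloudant client API, and the "N-fake" revision strings are invented for the example.

```javascript
// Hypothetical in-memory store that mimics revision checking.
class FakeDb {
  constructor() {
    this.docs = new Map();
  }
  get(id) {
    const doc = this.docs.get(id);
    if (!doc) throw new Error("not_found");
    return JSON.parse(JSON.stringify(doc)); // return a copy, like an HTTP response
  }
  put(doc) {
    const current = this.docs.get(doc._id);
    if (current && current._rev !== doc._rev) {
      const err = new Error("conflict"); // stale or missing _rev
      err.status = 409;
      throw err;
    }
    const generation = current ? parseInt(current._rev, 10) + 1 : 1;
    const saved = { ...doc, _rev: `${generation}-fake` };
    this.docs.set(doc._id, JSON.parse(JSON.stringify(saved)));
    return saved._rev;
  }
}

// GET the document, change it in memory, PUT it back; retry on conflict.
function appendValue(db, id, value, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const doc = db.get(id); // always start from the latest revision
    doc.No.push(value);
    try {
      return db.put(doc); // carries the _rev that was just read
    } catch (err) {
      if (err.status !== 409 || attempt === maxAttempts) throw err;
    }
  }
}

const db = new FakeDb();
db.put({ _id: "100", No: ["1"] });
const newRev = appendValue(db, "100", "2");
console.log(newRev, db.get("100").No); // 2-fake [ '1', '2' ]
```

Re-reading the document on every attempt is the important part: retrying the PUT with the old _rev would fail again.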

How does updating embedded documents work in MongoDB?

This is probably an obvious question, but I can't find a clear answer to it.
In MongoDB, suppose I embed document food into document super_market. When I make a change to the food document, will the embedded document in super_market automatically get updated?
Your question denotes a structure like this:
example Super_market:
{ "super_market_name": "SuperMart",
"address": "1 Main Street",
"food": { "food_name":"apple" }
}
It sounds like you're asking about a food document that is separate from the super_market and linked to it.
If you embed the 'food' document as above, the embedded document is the one that you would be altering. It doesn't exist in a separate location that you would modify. If you update super_market.food, the embedded document is the one (and only) that will be affected.
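A loose analogy with plain JavaScript objects may help. It is only an analogy, not MongoDB itself: embedding behaves like copying the fields into the parent, so a separate food object (if you kept one) and the embedded one are independent. The "pear" and "banana" values are invented for the example.

```javascript
// A standalone food object, as if it lived in its own collection.
const food = { food_name: "apple" };

// Embedding copies the fields into the parent; no link back is kept.
const superMarket = {
  super_market_name: "SuperMart",
  address: "1 Main Street",
  food: { ...food },
};

// Changing the standalone object does not touch the embedded copy.
food.food_name = "pear";
console.log(superMarket.food.food_name); // apple

// To change the embedded data, update it through the parent.
superMarket.food.food_name = "banana";
console.log(superMarket.food.food_name, food.food_name); // banana pear
```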

MongoDB - Manipulating multi-level arrays in a document

I am currently building an app with Meteor and MongoDB. I have a 3 level document structure with array in array:
{
_id: "shtZFiTeHrPKyJ8vR",
description: "Some title",
categories: [{
id: "shtZFiTeHrPKyJ8vR",
name: "Foo",
options: [{
id: "shtZFiTeHrPKyJ8vR",
name: "bar",
likes: ["abc", "bce"]
}]
}]
}
Now, the document could be manipulated at any level. Means:
1. description could be changed
2. categories can be added / removed / renamed
3. options can be added / removed / renamed
4. users can like options, so they must be added or removed
1 and 2 are quite easy. It is also relatively easy to add or remove a new option:
MyCollection.update({ _id: id, "categories.id": categoryId }, {
$push: {
"categories.$.options": {
id: Random.id(),
name: optionName
}
}
});
But manipulating the options hash requires working on JavaScript objects. That means I first need to find my document, iterate over the options, and then write them back.
At least that's what I am doing right now. But I don't like that approach.
What I was thinking about is splitting the collection, at least to put the likes into their own collection referencing the original document.
Or is there another way? I don't really like both of my possible solutions.
For this kind of query one would normally use the Mongo positional operator. However, from the docs:
Nested Arrays
The positional $ operator cannot be used for queries
which traverse more than one array, such as queries that traverse
arrays nested within other arrays, because the replacement for the $
placeholder is a single value
Thus the only way to natively do what you want is by using specific indexes.
db.test.update({},{$pull:{"categories.0.options.0.likes":"abc"}})
Unfortunately, Mongo does not make it easy to get the index of a matching nested document.
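If you stay with the nested structure, the indexes have to be worked out in application code after fetching the document. A minimal sketch, assuming the document shape from the question; likesPath is a hypothetical helper and the resulting modifier is only built here, not sent to a database.

```javascript
// Document shaped like the one in the question.
const doc = {
  _id: "shtZFiTeHrPKyJ8vR",
  description: "Some title",
  categories: [{
    id: "shtZFiTeHrPKyJ8vR",
    name: "Foo",
    options: [{
      id: "shtZFiTeHrPKyJ8vR",
      name: "bar",
      likes: ["abc", "bce"],
    }],
  }],
};

// Find both array positions and build the dotted path to the likes array.
function likesPath(document, categoryId, optionId) {
  const c = document.categories.findIndex((cat) => cat.id === categoryId);
  if (c === -1) return null;
  const o = document.categories[c].options.findIndex((opt) => opt.id === optionId);
  if (o === -1) return null;
  return `categories.${c}.options.${o}.likes`;
}

const path = likesPath(doc, "shtZFiTeHrPKyJ8vR", "shtZFiTeHrPKyJ8vR");
console.log(path); // categories.0.options.0.likes

// Modifier object that could be passed to an update call.
const modifier = { $pull: { [path]: "abc" } };
console.log(JSON.stringify(modifier)); // {"$pull":{"categories.0.options.0.likes":"abc"}}
```

The positions can change between the read and the update if another client adds or removes categories or options in the meantime, which is one more reason to consider restructuring the data.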
I would normally say that once your queries become that difficult it's probably a good idea to revisit the way you store data. Also with that many arrays to which you will be pushing data, Mongo will probably be relocating a lot of documents. This is definitely something that you want to minimize.
So at this point you will need to separate your data out into different documents and even collections.
Your first documents would look like this:
{
_id: "shtZFiTeHrPKyJ8vR",
description: "Some title",
categories: [{
id: "shtZFiTeHrPKyJ8vR",
name: "Foo",
options: ["shtZFiTeHrPKyJ8vR"]
}]
}
This way you can easily add/remove options as you mentioned in your question. You would then need a second collection with documents that represent each option.
{
_id: "shtZFiTeHrPKyJ8vR",
name: "bar",
likes: ["abc", "bce"]
}
You can learn more about references here. This is similar to what you mentioned in your comment. The benefit of this is that you are already reducing the potential amount of relocation. Depending on how you use your data you may even be reducing network usage.
Now doing updates on the likes is easy.
MyCollection.update({ _id: id}, {
$push: {likes: "value"}
});
This does, however, require you to make two queries to the db. Although on the flip side you do a lot less on the client side and a lot less bandwidth is used.
Another question to ask yourself is whether that depth of nesting is really needed. There might be an easier way to achieve your goal that doesn't require it to become so complicated.

MongoDB - Query embedded documents

I have a collection named Events. Each Event document has a collection of Participants as embedded documents.
Now my question: is there a way to query an Event and get all Participants that have, for example, Age > 18?
When you query a collection in MongoDB, by default it returns the entire document which matches the query. You could slice it and retrieve a single subdocument if you want.
If all you want is the Participants who are older than 18, it would probably be best to do one of two things:
1. Store them in a subdocument inside of the event document called "Over18" or something. Insert them into that document (and possibly the other if you want) and then when you query the collection, you can instruct the database to only return the "Over18" subdocument. The downside to this is that you store your participants in two different subdocuments and you will have to figure out their age before inserting. This may or may not be feasible depending on your application. If you need to be able to check on arbitrary ages (i.e. sometimes it's 18 but sometimes it's 21 or 25, etc.) then this will not work.
2. Query the collection and retrieve the Participants subdocument and then filter it in your application code. Despite what some people may believe, this isn't terrible because you don't want your database to be doing too much work all the time. Offloading the computations to your application could actually benefit your database because it now can spend more time querying and less time filtering. It leads to better scalability in the long run.
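The second option can be sketched in a few lines. This assumes the embedded array is stored under a participants field with a numeric age on each entry; the event and the names in it are invented sample data.

```javascript
// Sample event with embedded participants (invented data).
const event = {
  name: "Sample event",
  participants: [
    { firstName: "Ann", age: 17 },
    { firstName: "Bob", age: 18 },
    { firstName: "Cara", age: 30 },
  ],
};

// Filter the embedded array in application code after fetching the event.
function participantsOlderThan(eventDoc, minAge) {
  return (eventDoc.participants || []).filter((p) => p.age > minAge);
}

console.log(participantsOlderThan(event, 18).map((p) => p.firstName)); // [ 'Cara' ]
```

Because the threshold is just a parameter, the same function covers the arbitrary-age case (21, 25, and so on) that the first option cannot handle.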
Short answer: no. I tried to do the same a couple of months back, but mongoDB does not support it (at least in version <= 1.8). The same question has been asked in their Google Group for sure. You can either store the participants as a separate collection or get the whole documents and then filter them on the client. Far from ideal, I know. I'm still trying to figure out the best way around this limitation.
For future reference: This will be possible in MongoDB 2.2 using the new aggregation framework, by aggregating like this:
db.events.aggregate([
{ $unwind: '$participants' },
{ $match: { 'participants.age': { $gt: 18 } } },
{ $project: { participants: 1 } }
])
This will return a list of n documents where n is the number of participants > 18 where each entry looks like this (note that the "participants" array field now holds a single entry instead):
{
_id: objectIdOfTheEvent,
participants: { firstName: 'only one', lastName: 'participant'}
}
It could probably even be flattened on the server to return a list of participants. See the official documentation for more information.
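Until the flattening is done on the server, it is straightforward to do in application code. A small sketch, assuming results shaped like the entry shown above; the sample values and the eventId field name are made up for illustration.

```javascript
// Results shaped like the aggregation output described above (sample values).
const results = [
  { _id: "event1", participants: { firstName: "only one", lastName: "participant" } },
  { _id: "event1", participants: { firstName: "another", lastName: "participant" } },
];

// Turn each { _id, participants } entry into a flat participant record
// that still remembers which event it came from.
function flattenParticipants(aggregated) {
  return aggregated.map(({ _id, participants }) => ({
    eventId: _id,
    ...participants,
  }));
}

console.log(flattenParticipants(results)[0]);
// { eventId: 'event1', firstName: 'only one', lastName: 'participant' }
```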