NoSQL: insert an array - mongodb

I have a question in the homework assignment.
Suppose that each actor can work in multiple movies. We want to record the names of movies for each actor as an embedded array within a document of each actor.
Modify the following insertion query to add the following movies as an array: Life of Pie, Madagascar, and Hunger Games.
db.actor.insert({
first: 'matthew',
last: 'setter',
dob: '21/04/1978',
gender: 'm',
hair_colour: 'brown',
occupation: 'developer',
nationality: 'australian',
height_cm: 185
});
I thought it would be as easy as just adding this line into the code:
movies: ["Life of Pie", "Madagascar", "Hunger Games"]
but obviously, it's not that simple. I tried to find the syntax and examples for inserting an array, but had no luck.

According to the MongoDB documentation, you had the right idea:
http://docs.mongodb.org/manual/core/document/
The example they give is:
var mydoc = {
_id: ObjectId("5099803df3f4948bd2f98391"),
name: { first: "Alan", last: "Turing" },
birth: new Date('Jun 23, 1912'),
death: new Date('Jun 07, 1954'),
contribs: [ "Turing machine", "Turing test", "Turingery" ],
views : NumberLong(1250000)
}
So, it's possible you were just missing a comma after height_cm: 185:
db.actor.insert({
first: 'matthew',
last: 'setter',
dob: '21/04/1978',
gender: 'm',
hair_colour: 'brown',
occupation: 'developer',
nationality: 'australian',
height_cm: 185,
movies: ["Life of Pie", "Madagascar", "Hunger Games"]
});
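A minimal sketch in plain JavaScript (no database needed) to check that the corrected document is just an object literal whose movies field is an array. The buildActor helper is hypothetical and only for illustration; its result is what you would pass to db.actor.insert(...) in the shell.

```javascript
// Hypothetical helper: returns the actor document with an embedded movies array.
function buildActor(movies) {
  return {
    first: 'matthew',
    last: 'setter',
    dob: '21/04/1978',
    gender: 'm',
    hair_colour: 'brown',
    occupation: 'developer',
    nationality: 'australian',
    height_cm: 185, // this comma is needed before the next field
    movies: movies
  };
}

const actor = buildActor(['Life of Pie', 'Madagascar', 'Hunger Games']);
console.log(Array.isArray(actor.movies)); // true
console.log(actor.movies.length);         // 3
```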

Related

MongoDB schema and structure

I just started learning mongoDB and mongoose here.
Is it possible to have value as key in mongoDB? For example, I'd like to have a structure like this:
Person collection:
USA: {
'John Doe': { phone: '...', somethingElse: '...' },
'Jane Doe': { phone: '...', somethingElse: '...' },
},
Australia: {
'John Doe': { phone: '...', somethingElse: '...' },
'Jane Doe': { phone: '...', somethingElse: '...' },
},
England: {
'John Doe': { phone: '...', somethingElse: '...' },
'Jane Doe': { phone: '...', somethingElse: '...' },
}
I know it's a terrible example, and I understand alternatively we can store the documents like:
{_id: 1, name: 'John Doe', address: 'USA', phone: '...', ...},
{_id: 2, name: 'John Doe', address: 'Australia', phone: '...', ...},
{_id: 3, name: 'John Doe', address: 'England', phone: '...', ...},
I guess I'm just trying to understand if storing value as key is even possible here. And if it is, how do we define the schema with mongoose?
Theoretically you could use a schema like:
const testSchema = new Schema({
countries: {
type: Map,
of: {
type: Map,
of: Object,
},
},
});
This takes advantage of the Map type in mongoose; you can then assign your dynamic object to the countries property.
Personally, I believe the second approach you mentioned is a much better idea (unless you have a really good reason for using the first one).
The first reason is that dynamic key names make querying difficult. Instead of a simple .find({name: 'John Doe'}), you need to run complicated aggregation queries. Basically, any traversal, such as counting people or filtering by phone, will be painful with dynamic keys.
The second reason is that a MongoDB document is limited to 16MB, so gathering too many people into one document means you can approach that limit.
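To illustrate the querying point, here is a rough sketch in plain JavaScript over in-memory data (not a MongoDB query). The phone values and both helper functions are made up for the example.

```javascript
// Shape 1: values used as keys (country -> person name -> details).
const nested = {
  USA: { 'John Doe': { phone: '111' }, 'Jane Doe': { phone: '222' } },
  Australia: { 'John Doe': { phone: '333' } }
};

// Shape 2: one flat record per person.
const flat = [
  { name: 'John Doe', address: 'USA', phone: '111' },
  { name: 'Jane Doe', address: 'USA', phone: '222' },
  { name: 'John Doe', address: 'Australia', phone: '333' }
];

// Finding every 'John Doe' in shape 1 means walking all country keys.
function findByNameNested(data, name) {
  const hits = [];
  for (const [country, people] of Object.entries(data)) {
    if (Object.prototype.hasOwnProperty.call(people, name)) {
      hits.push({ name, address: country, ...people[name] });
    }
  }
  return hits;
}

// In shape 2 it is a single predicate, analogous to .find({ name: 'John Doe' }).
function findByNameFlat(data, name) {
  return data.filter((person) => person.name === name);
}

console.log(findByNameNested(nested, 'John Doe').length); // 2
console.log(findByNameFlat(flat, 'John Doe').length);     // 2
```

Both return the same people, but the nested shape needs custom traversal code for every new kind of question.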

Design of RESTful endpoint for querying based on aggregate data

Let's say we have two resources: Person, Movie.
persons/123
{id: 123, firstName: "John", lastName:"Travolta"}
persons/124
{id: 124, firstName: "Uma", lastName: "Thurman"}
persons/125
{id: 125, firstName: "Bob", lastName: "Saget"}
persons/126
{id: 126, firstName: "Christopher", lastName: "Walken"}
persons/127
{id: 127, firstName: "Steve", lastName: "Buscemi"}
movies/1
{id: 1, name: "Pulp Fiction"}
movies/2
{id:2, name: "Reservoir Dogs"}
Then, to relate the two, we have another resource: Cast Member
GET cast-members?movie.name=Pulp%20Fiction
[
{
id: 502,
movie: {id: 1, name: "Pulp Fiction"},
actor: {id: 123, firstName: "John", lastName:"Travolta"},
character: "Vincent Vega"
},
{
id: 503,
movie: {id: 1, name: "Pulp Fiction"},
actor: {id: 124, firstName: "Uma", lastName: "Thurman"},
character: "Mia Wallace"
},
{
id: 504,
movie: {id: 1, name: "Pulp Fiction"},
actor: {id: 126, firstName: "Christopher", lastName: "Walken"},
character: "Buddy Holly"
},
...
]
If I want to see all of the movies Christopher Walken has been in, I know I can do this:
GET cast-members?actor.id=126
What if I want to see all movies where both Uma Thurman and John Travolta are in the cast? What does this endpoint look like?
GET cast-members?actor.id=123&actor.id=124
doesn't work, because it returns Cast Members where actor.id is either 123 or 124:
[
{
id: 587,
movie: {id: 10, name: "Kill Bill"},
// John's not in this movie
actor: {id: 124, firstName: "Uma", lastName: "Thurman"},
character: "The Bride"
},
{
id: 597,
movie: {id: 11, name: "Saturday Night Fever"},
// Uma's not in this movie
actor: {id: 123, firstName: "John", lastName:"Travolta"},
character: "Tony Manero"
},
...
]
This wouldn't be what we want, because we would have to join the movies on the client side (which isn't desirable since it means we have to page through a lot of data before being able to return a result).
The SQL query would be this:
SELECT Movie
FROM CastMember
WHERE Actor IN (123, 124)
GROUP BY Movie
HAVING COUNT(DISTINCT Actor) = 2
Is there a way to translate this query to something that makes RESTful sense?
Perhaps the problem is the 'cast-members' resource: it's not clear that it's modelling a real resource rather than some kind of 'synthetic' resource. Not all tables in a relational database are actually resources. RESTful relationships are generally modeled as links between resources rather than as resources themselves.
The core of your resource breakdown here is that you've got
movie
person
and they're (perhaps) grouped into collections
movies
people
Let's say that you structure your URI space so that you've got
/movie
a collection of movies
/movie/{id}
an individual movie with a given id
/person
a collection of people, possibly not all just actors in movies
/person/{id}
an individual person with a given id
If you want to find all the movies that a person was in, you search your movies resource based on an 'actor' query. That query could take multiple values, because it makes perfect sense to say that a given movie has more than one actor.
So, if you want to find all the films that the persons with ids 1234 and 5678 were both in, your query should be
GET /movie?actor=1234,5678
Now, you could implement that in a few different ways depending on your use case. Perhaps it would only return a list of movie URIs and you'd have to query each one individually, but that doesn't sound great. Perhaps, on the other hand, it would return a list of all the full movie documents which match (titles, full cast, year, length, synopsis, reviews etc) - that could be a lot of data, so you might add a page parameter...
GET /movie?actor=1234,5678&page=2&pageSize=10
Perhaps you only want some of the details associated with each movie - you could add a parameter for the details which make sense for you....
GET /movie?actor=1234,5678&details=title,id,cast
Note: so far there hasn't been a need for a 'person' resource. However, the response document from your movies query will contain links to both individual movie URIs and person URIs...
GET /movie?actor=1234,5678
[
{
movie: {id: 10, uri: "/movie/10", name: "...",
cast: [{id: 1234, uri: "/person/1234", firstName: ... },
{id: 5678, uri: "/person/5678", firstName: ... },
{...}
]
}
},
{
movie: {id: 11, uri: "/movie/11", name: "...",
cast: [{id: 1234, uri: "/person/1234", firstName: ... },
{id: 5678, uri: "/person/5678", firstName: ... },
{...}
]
}
},
...
]
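As a rough sketch of how a server might evaluate a multi-valued actor query, here is plain JavaScript over in-memory data. The movie names, cast lists, and both function names are hypothetical; a real service would push this filter down to its data store.

```javascript
// Hypothetical in-memory data; a real service would query its data store.
const movies = [
  { id: 10, name: 'Movie A', cast: [1234, 5678, 9999] },
  { id: 11, name: 'Movie B', cast: [1234] },
  { id: 12, name: 'Movie C', cast: [5678, 1234] }
];

// Parse "actor=1234,5678" into [1234, 5678].
function parseActorParam(queryString) {
  const params = new URLSearchParams(queryString);
  const raw = params.get('actor') || '';
  return raw.split(',').filter((s) => s !== '').map(Number);
}

// Keep only movies whose cast contains every requested actor id
// (the same idea as GROUP BY Movie HAVING COUNT(DISTINCT Actor) = n).
function moviesWithAllActors(allMovies, actorIds) {
  return allMovies.filter((movie) =>
    actorIds.every((id) => movie.cast.includes(id))
  );
}

const actorIds = parseActorParam('actor=1234,5678');
const result = moviesWithAllActors(movies, actorIds);
console.log(result.map((m) => m.id)); // [ 10, 12 ]
```

Treating the comma-separated list as "all of these actors" is a design choice of this sketch; an API could equally define it as "any of", so the semantics should be documented.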

Should I use separate collection or embed fields that I know won't be used for all models. MongoDB

Background:
I'm planning an app that will have 3 types of posts, for n number of games; Single-posts, team-posts, coach-posts. Now I'm not sure of the best Schema for a single type of post.
The posts of a certain type share a couple fundamental attributes, like: user_id, comments, status, etc. But the fields relevant to the game will be unique.
These are the two possibilities I'm considering:
1. Separate collection for each game:
As you can see the playerposts type requires different fields for each game but has a similar structure.
// game1_playerposts
{
_id: ObjectId(),
user_id: ObjectId(),
game: ObjectId(),
comments: [{
user_id: ObjectId(),
comment: String,
score: Number
}],
rank: {
name: String,
abbr: String,
img: String
},
roles: [String],
gamemode: [String]
}
// game2_playerposts
{
_id: ObjectId(),
user_id: ObjectId(),
game: ObjectId(),
comments: [{
user_id: ObjectId(),
comment: String,
score: Number
}],
level: {
name: String,
abbr: String,
img: String
},
champions: [String],
factions: [{
name: String,
abbr: String,
img: String
}]
}
2. One collection for all games:
This way I only need one collection, and will always only use the fields I need, and the rest would remain empty.
{
_id : ObjectId(),
user_id : ObjectId(),
game1 : {
game: ObjectId(),
rank: {
name: String,
abbr: String,
img: String
},
roles: [String],
gamemodes: [String]
},
game2 : {
game: ObjectId(),
level: {
name: String,
abbr: String,
img: String
},
champions: [String],
factions: {
name: String,
abbr: String,
img: String
}
},
game_n: {
...
},
comments : [{
user_id: ObjectId(),
comment: String,
score: Number
}],
}
What's better?
Which one of these options would be better suited? Performance is important, but I also want it to be simple to add to the Schema when we decide to add support for another game in the future.
MongoDB is schemaless.
I don't see why you have to have fields you know won't be used. Why not just have a separate document for each individual player post and that document will have the schema that relates to the type of post it is?
You can have in a single collection both of the documents that you have as examples under the "Separate collection for each game" header.
I have not worked with Mongoose, but if using it removes the benefits of MongoDB being schemaless, I don't think it would be as popular a tool as it is, so I think there's a way for it to work.
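A small sketch of the idea in plain JavaScript, with an array standing in for a single collection. All field values and the postsForGame helper are made up; the point is only that documents of different shapes can sit side by side, each carrying just the fields its game needs.

```javascript
// Stand-in for one "playerposts" collection. Field values are made up.
const playerposts = [
  {
    user_id: 'user-1',
    game: 'game1',
    comments: [],
    rank: { name: 'Gold', abbr: 'G', img: 'gold.png' },
    roles: ['support'],
    gamemode: ['ranked']
  },
  {
    user_id: 'user-2',
    game: 'game2',
    comments: [],
    level: { name: 'Expert', abbr: 'E', img: 'expert.png' },
    champions: ['champion-a'],
    factions: [{ name: 'North', abbr: 'N', img: 'north.png' }]
  }
];

// Hypothetical helper: select the posts for one game.
function postsForGame(posts, game) {
  return posts.filter((post) => post.game === game);
}

const game1Posts = postsForGame(playerposts, 'game1');
console.log(game1Posts.length);        // 1
console.log('level' in game1Posts[0]); // false: no empty placeholder fields
```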

Difference between and / or clause in mongo db

I am new to mongo db and trying to understand how to query a db.
I was reading a tutorial from this link http://code.tutsplus.com/tutorials/getting-started-with-mongodb-part-1--net-22879
Following this tutorial I created a simple db like below,
db.nettuts.insert({
first: 'matthew',
last: 'setter',
dob: '21/04/1978',
gender: 'm',
hair_colour: 'brown',
occupation: 'developer',
nationality: 'australian'
});
db.nettuts.insert({
first: 'james',
last: 'caan',
dob: '26/03/1940',
gender: 'm',
hair_colour: 'brown',
occupation: 'actor',
nationality: 'american'
});
db.nettuts.insert({
first: 'arnold',
last: 'schwarzenegger',
dob: '03/06/1925',
gender: 'm',
hair_colour: 'brown',
occupation: 'actor',
nationality: 'american'
});
db.nettuts.insert({
first: 'tony',
last: 'curtis',
dob: '21/04/1978',
gender: 'm',
hair_colour: 'brown',
occupation: 'developer',
nationality: 'american'
});
db.nettuts.insert({
first: 'jamie lee',
last: 'curtis',
dob: '22/11/1958',
gender: 'f',
hair_colour: 'brown',
occupation: 'actor',
nationality: 'american'
});
db.nettuts.insert({
first: 'michael',
last: 'caine',
dob: '14/03/1933',
gender: 'm',
hair_colour: 'brown',
occupation: 'actor',
nationality: 'english'
});
db.nettuts.insert({
first: 'judi',
last: 'dench',
dob: '09/12/1934',
gender: 'f',
hair_colour: 'white',
occupation: 'actress',
nationality: 'english'
});
and was trying to use queries like the one below
db.nettuts.find({gender: 'm'});
which returned all the actors who are male.
When I tried the command
db.nettuts.find({gender: 'm', $or: [{nationality: 'english'}]});
it returned:
> db.nettuts.find({gender: "m", $or: [{nationality: "english"}]});
{ "_id" : ObjectId("53064b7979d90b140e53df3e"), "first" : "michael", "last" : "caine", "dob" : "14/03/1933", "gender" : "m", "hair_colour" : "brown", "occupation" : "actor", "nationality" : "english" }
>
When I tried the same command, this time with "and" clause
db.nettuts.find({gender: 'm', $and: [{nationality: 'english'}]});
I got the same output as when I used $or
> db.nettuts.find({gender: "m", $and: [{nationality: "english"}]});
{ "_id" : ObjectId("53064b7979d90b140e53df3e"), "first" : "michael", "last" : "caine", "dob" : "14/03/1933", "gender" : "m", "hair_colour" : "brown", "occupation" : "actor", "nationality" : "english" }
>
So, given these two outputs, what is the difference between the "or" clause and the "and" clause here? Please help me understand.
Using $or or $and with an array with only one entry is quite pointless, because both operators apply to the array which is assigned to them. They don't apply to any other parts of the match-object.
When you want to find all actors which are male or english, you would do this:
db.nettuts.find({
$or: [
{ "gender": "m"},
{ "nationality": "english"}
]
});
To find all actors which are male and english, you don't even need the $and-operator:
db.nettuts.find({
"gender": "m",
"nationality": "english"
});
There are very few situations where you really must use the $and operator. The only situation where you have no alternative is when you need to apply the same operator to the same field multiple times. In this example, we select all people whose age is divisible by both 3 and 5.
db.people.find({
$and: [
{ age: { $mod: [ 3, 0 ] } },
{ age: { $mod: [ 5, 0 ] } }
]
});
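The difference between the operators can be checked with a simplified matcher in plain JavaScript. This is only an illustrative model that supports equality, $or, and $and; it is not how MongoDB's query engine is implemented, and the matches and names helpers are hypothetical. The data is a subset of the documents from the question.

```javascript
// Simplified matcher: supports plain equality, $or and $and only.
// Top-level keys of a query object are combined with an implicit AND.
function matches(doc, query) {
  return Object.entries(query).every(([key, condition]) => {
    if (key === '$or') return condition.some((sub) => matches(doc, sub));
    if (key === '$and') return condition.every((sub) => matches(doc, sub));
    return doc[key] === condition;
  });
}

const people = [
  { first: 'matthew', gender: 'm', nationality: 'australian' },
  { first: 'michael', gender: 'm', nationality: 'english' },
  { first: 'judi', gender: 'f', nationality: 'english' }
];

const names = (query) =>
  people.filter((p) => matches(p, query)).map((p) => p.first);

// One-element $or and $and behave identically: both reduce to the single clause.
console.log(names({ gender: 'm', $or: [{ nationality: 'english' }] }));  // [ 'michael' ]
console.log(names({ gender: 'm', $and: [{ nationality: 'english' }] })); // [ 'michael' ]

// With two clauses inside the array, the difference shows up.
console.log(names({ $or: [{ gender: 'm' }, { nationality: 'english' }] }));
// [ 'matthew', 'michael', 'judi' ]
console.log(names({ $and: [{ gender: 'm' }, { nationality: 'english' }] }));
// [ 'michael' ]
```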

Reference field within same schema

Is it possible to reference a field within the same schema? See my example below. Or am I going about this the wrong way?
var UserSchema = new mongoose.Schema ({
username: String,
password: String,
email: String,
foods: [{
name: String,
category: String,
ingredients: // how to reference values in the ingredients array?
}],
ingredients: [{
name: String,
category: String
}]
});
Short answer
This is a core MongoDB design decision: MongoDB relationships: embed or reference?
Storing references to objects rather than independent copies of them, as you would in a relational database, is possible in MongoDB and often done; it just results in more queries, and more complex ones, when you need to look them up.
Long answer
If the goal is just to keep the definitions of ingredient schemas consistent, you can define a schema and use it twice. The ingredients will be stored as independent copies, e.g.
[{ username: 'bob',
ingredients: [ { name: 'Carrot', category: 'Vegetable' } , ...],
foods: [ { name: 'Salad', category: 'Lunch', ingredients: [ { name: 'Carrot', category: 'Vegetable'}, ...]}]
}, ...]
var IngredientSchema = new mongoose.Schema({
name: String,
category: String,
});
var UserSchema = new mongoose.Schema ({
username: String,
password: String,
email: String,
foods: [{
name: String,
category: String,
ingredients: [IngredientSchema] // brackets indicate it's an array
}],
ingredients: [IngredientSchema]
});
Alternatively you can reference ingredients by objectId:
var UserSchema = new mongoose.Schema ({
username: String,
password: String,
email: String,
foods: [{
name: String,
category: String,
ingredients: [mongoose.Schema.Types.ObjectId] // IDs reference individual ingredients
}],
ingredients: [IngredientSchema]
});
By defining IngredientSchema explicitly, each ingredient object gets its own ObjectId when it is declared. The upside of storing the IDs of ingredients (rather than copies of ingredient objects) is more concise and consistent storage. The downside is that there will be many more queries, and they will be more complex.
[{ username: 'bob',
ingredients: [ { _id: ObjectId('a298d9ef898afc98ddf'), name: 'Carrot', category: 'Vegetable' } , ...],
foods: [ { name: 'Salad', category: 'Lunch', ingredients: [ {$oid: 'a298d9ef898afc98ddf'}, ]}]
}, ...]
A better approach, if you want to store references to ingredients, may be to store ingredients as their own first-class collection. You'll still have many separate queries when you want to look up foods by ingredient, or ingredients by food, but the queries will be simpler.
var UserSchema = new mongoose.Schema ({
username: String,
password: String,
email: String,
foods: [{
name: String,
category: String,
ingredients: [mongoose.Schema.Types.ObjectId] // IDs reference individual ingredients
}],
ingredients: [mongoose.Schema.Types.ObjectId]
});
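To show what the extra lookups amount to when ingredients are stored by id, here is a minimal sketch in plain JavaScript over in-memory data. The string ids stand in for ObjectIds, 'Lettuce' is made-up sample data, and resolveIngredients and foodsUsing are hypothetical helpers, not mongoose APIs.

```javascript
// Hypothetical stand-in for a separate ingredients collection.
const ingredients = [
  { _id: 'ing1', name: 'Carrot', category: 'Vegetable' },
  { _id: 'ing2', name: 'Lettuce', category: 'Vegetable' }
];

// A user document that stores only ingredient ids (strings here, ObjectIds in MongoDB).
const user = {
  username: 'bob',
  foods: [{ name: 'Salad', category: 'Lunch', ingredients: ['ing1', 'ing2'] }],
  ingredients: ['ing1', 'ing2']
};

// Replace stored ids with the full ingredient documents (a manual lookup step).
function resolveIngredients(ids, collection) {
  const byId = new Map(collection.map((doc) => [doc._id, doc]));
  return ids.map((id) => byId.get(id)).filter((doc) => doc !== undefined);
}

// Find the foods that use a given ingredient id.
function foodsUsing(userDoc, ingredientId) {
  return userDoc.foods.filter((food) => food.ingredients.includes(ingredientId));
}

const saladIngredients = resolveIngredients(user.foods[0].ingredients, ingredients);
console.log(saladIngredients.map((i) => i.name));            // [ 'Carrot', 'Lettuce' ]
console.log(foodsUsing(user, 'ing1').map((f) => f.name));    // [ 'Salad' ]
```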
If the goal is to store normalized references to ingredients and search foods based on them, then, to quote another SO post, "this is one of those cases where relational databases really shine".
See this SO post for querying subdocuments by Id:
Reading categories and number of articles in a single query