MongoDB: Create another collection from sub-array? (many-to-many relationship)

Suppose I have 'film' objects like the one below in my collection. Films have many actors, and actors belong to many films. Many-to-many.
How do I create another collection that consists of the unique 'actor' elements? Remember, some actors will be listed in more than one film.
{
"_id" : ObjectId("4edcffa5f320646bc8bd76b4"),
"directed_by" : [
"John Gilling"
],
"forbid:genre" : null,
"genre" : [ ],
"guid" : "#9202a8c04000641f8000000000b02e5d",
"id" : "/en/pirates_of_blood_river",
"initial_release_date" : "1962",
"name" : "Pirates of Blood River",
"starring" : [
{
"actor" : {
"guid" : "#9202a8c04000641f800000000006823e",
"id" : "/en/christopher_lee",
"name" : "Christopher Lee",
"lc_name" : "christopher lee"
}
},
{
"actor" : {
"guid" : "#9202a8c04000641f80000000001de22e",
"id" : "/en/oliver_reed",
"name" : "Oliver Reed",
"lc_name" : "oliver reed"
}
},
{
"actor" : {
"guid" : "#9202a8c04000641f80000000003b41da",
"id" : "/en/glenn_corbett",
"name" : "Glenn Corbett",
"lc_name" : "glenn corbett"
}
}
]
}

This can be done in the client app, but also on the server in MongoDB: for example, you can run a MapReduce job with an output collection. In the map step, given a film document, emit one key-value pair of ( actor.guid, { [other actor details] } ) for each entry in 'starring'. In the reduce step, return a single set of details for each guid (you could also count the number of films the actor was in at this point, if you wanted).
http://www.mongodb.org/display/DOCS/MapReduce has more info on the syntax.
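As a rough sketch of the map and reduce functions described above: the fields copied into each value, the films counter, and the small in-memory driver are assumptions for illustration only. In the mongo shell you would instead pass the two functions to something like db.films.mapReduce(mapFilm, reduceActor, { out: "actors" }), where both collection names are hypothetical.

```javascript
// Map: runs once per film document (bound to `this`) and emits one pair per actor.
function mapFilm() {
  (this.starring || []).forEach(function (entry) {
    var a = entry.actor;
    emit(a.guid, { id: a.id, name: a.name, lc_name: a.lc_name, films: 1 });
  });
}

// Reduce: collapses all values for one guid into a single value of the same shape,
// so it is safe if the server feeds an earlier reduce result back in.
function reduceActor(guid, values) {
  var out = { id: values[0].id, name: values[0].name, lc_name: values[0].lc_name, films: 0 };
  values.forEach(function (v) { out.films += v.films; });
  return out;
}

// Minimal in-memory stand-in for the server, only so the sketch can be run locally.
function runLocally(docs, map, reduce) {
  var groups = {};
  globalThis.emit = function (key, value) {
    (groups[key] = groups[key] || []).push(value);
  };
  docs.forEach(function (doc) { map.call(doc); });
  delete globalThis.emit;
  return Object.keys(groups).map(function (key) {
    return { _id: key, value: reduce(key, groups[key]) };
  });
}

// Abbreviated sample input; the second film is a made-up placeholder.
var films = [
  { name: "Pirates of Blood River", starring: [
    { actor: { guid: "#9202a8c04000641f800000000006823e", id: "/en/christopher_lee",
               name: "Christopher Lee", lc_name: "christopher lee" } },
    { actor: { guid: "#9202a8c04000641f80000000001de22e", id: "/en/oliver_reed",
               name: "Oliver Reed", lc_name: "oliver reed" } }
  ] },
  { name: "Hypothetical second film", starring: [
    { actor: { guid: "#9202a8c04000641f800000000006823e", id: "/en/christopher_lee",
               name: "Christopher Lee", lc_name: "christopher lee" } }
  ] }
];

var actors = runLocally(films, mapFilm, reduceActor);
console.log(JSON.stringify(actors, null, 2));
```

The output has one entry per distinct guid, which is the shape the output collection would have (_id is the key, value holds the actor details).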

Related

mongodb extracting values from array

The following is an example of a document in MongoDB. I have multiple records like this for different companies, and I need help querying them.
Given a value for company, I want to retrieve the names of all the matching cars.
"vehicles" : [
{
"source" : "jeep",
"tag" : [
{
"company" : "toyota",
"name" : "fortuner"
},
{
"company" : "rangerover",
"name" : "discovery"
}
]
}
]
Thanks...
Try this:
db.vehicles.find({tag: {$elemMatch: {company:'toyota'}}}).pretty();
Read more here: https://docs.mongodb.com/manual/reference/operator/query/elemMatch/
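Note that find() without a projection returns whole documents, not just the car names. A minimal client-side sketch for pulling the names out of one returned document, assuming the document shape shown in the question (carNamesForCompany is a hypothetical helper, not a MongoDB function):

```javascript
// Collect the "name" of every tag entry whose "company" matches.
function carNamesForCompany(doc, company) {
  var names = [];
  (doc.vehicles || []).forEach(function (vehicle) {
    (vehicle.tag || []).forEach(function (t) {
      if (t.company === company) names.push(t.name);
    });
  });
  return names;
}

// Sample document copied from the question.
var sampleDoc = {
  vehicles: [
    { source: "jeep", tag: [
      { company: "toyota", name: "fortuner" },
      { company: "rangerover", name: "discovery" }
    ] }
  ]
};

var toyotaNames = carNamesForCompany(sampleDoc, "toyota");
console.log(toyotaNames);
```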

Mongodb update and delete operations in a single query

I have documents in which I would like to replace the hostUser with one of the document's members. I also have to delete the old host's record from members and add the deleted member's chips to the club chips.
Here is the sample document.
{
"_id" : "1002",
"hostUser" : "1111111111",
"clubChips" : 10000,
"requests" : {},
"profile" : {
"clubname" : "AAAAA",
"image" : "0"
},
"tables" : [
"SJCA3S0Wm"
],
"isDeleted" : false,
"members" : {
"1111111111" : {
"chips" : 0,
"id" : "1111111111"
},
"2222222222" : {
"chips" : 0,
"id" : "2222222222"
}
}
}
This is what I have tried.
db.getCollection('test').updateMany({"hostUser":"1111111111"},
{"$set":{"hostUser":"2222222222"},"$unset":{"members.1111111111":""}})
This is how you would handle unset and set in a single call to updateMany. Can you please clarify what you meant by "check if the values exist in the member field"?
db.getCollection('test').updateMany(
{"hostUser":"1111111111"},
{
'$set': {"hostUser":"2222222222"} ,
'$unset': {"members.1111111111":""}
}
)
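The call above does not cover the third part of the question, adding the removed member's chips to clubChips. One possible sketch, assuming it is acceptable to read the document first and then issue the update (this is not atomic; the hostUser condition in the filter only guards against the host changing in between). buildHostTransfer is a hypothetical helper, and the chip count of 250 is made up so that the $inc is visible:

```javascript
// Build the filter and update documents for handing the club to a new host.
function buildHostTransfer(club, newHostId) {
  var oldHostId = club.hostUser;
  var members = club.members || {};
  if (!members[oldHostId]) throw new Error("old host is not in members");
  if (!members[newHostId]) throw new Error("new host is not in members");

  var unset = {};
  unset["members." + oldHostId] = "";

  return {
    filter: { _id: club._id, hostUser: oldHostId },
    update: {
      $set: { hostUser: newHostId },
      $unset: unset,
      $inc: { clubChips: members[oldHostId].chips }
    }
  };
}

// Trimmed-down version of the sample document, with hypothetical chips for the old host.
var club = {
  _id: "1002",
  hostUser: "1111111111",
  clubChips: 10000,
  members: {
    "1111111111": { chips: 250, id: "1111111111" },
    "2222222222": { chips: 0, id: "2222222222" }
  }
};

var transfer = buildHostTransfer(club, "2222222222");
console.log(JSON.stringify(transfer, null, 2));
// In the shell, roughly: db.getCollection('test').updateOne(transfer.filter, transfer.update)
```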

Is a mongodb query with 1 indexed field faster than multiple indexed fields?

In the following model a product is owned by a customer and cannot be ordered by other customers. So I know that an order by customer 1 can only contain products owned by customer 1.
To give you an idea here is a simple version of the data model:
Orders:
{
'customer' : 1,
'products' : [
{'productId' : 'a'},
{'productId' : 'b'}
]
}
Products:
{
'id' : 'a',
'name' : 'somename',
'customer' : 1
}
I need to find orders that contain certain products. I know the product id and customer id. I'm free to add/change indexes on my database.
Now my question is: is it faster to add a single-field index on the product IDs and query using only that ID, or should I go for a compound index on customer and product ID?
I'm not sure if this matters, but in my real model the list of products is actually a list of objects which have an amount and a dbref to the product. And the customer is also a dbref.
Here is a full order object:
{
"_id" : 0,
"_class" : "nl.pfa.myprintforce.models.Order",
"orderNumber" : "e35f1fa8-b4c4-4d53-89c9-66abe94a3553",
"status" : "ERROR",
"created" : ISODate("2017-03-30T11:50:50.292Z"),
"finished" : false,
"orderTime" : ISODate("2017-01-12T12:50:50.292Z"),
"expectedDelivery" : ISODate("2017-03-30T11:50:50.292Z"),
"totalItems" : 19,
"orderItems" : [
{
"amount" : 4,
"product" : {
"$ref" : "product",
"$id" : NumberLong(16)
}
},
{
"amount" : 7,
"product" : {
"$ref" : "product",
"$id" : NumberLong(26)
}
},
{
"amount" : 8,
"product" : {
"$ref" : "product",
"$id" : NumberLong(7)
}
}
],
"stateList" : [
{
"timestamp" : ISODate("2017-03-28T11:50:50.074Z"),
"status" : "NEW",
"message" : ""
},
{
"timestamp" : ISODate("2017-03-29T11:50:50.075Z"),
"status" : "IN_PRODUCTION",
"message" : ""
},
{
"timestamp" : ISODate("2017-03-30T11:50:50.075Z"),
"status" : "ERROR",
"message" : "Something went wrong"
}
],
"customer" : {
"$ref" : "customer",
"$id" : ObjectId("58dcf11a71571a24c475c044")
}
}
When I have the following indexes:
1: {"customer" : 1, "orderItems.product" : 1}
2: {"orderItems.product" : 1}
both count queries (I use count to forcefully find all documents without the network transfer):
a: db.getCollection('order').find({
'orderItems.product' : DBRef('product',113)
}).count()
b: db.getCollection('order').find({
'customer' : DBRef('customer',ObjectId("58de009671571a07540a51d5")),
'orderItems.product' : DBRef('product',113)
}).count()
Run with the same time of ~0.007 seconds on a set of 200k.
When I add 1000k records for a different customer (and different products), it does not affect the time at all.
An extended explain shows that:
Query a just uses index 2.
Query b uses index 2 but also considered index 1. Perhaps index intersection is used here?
Because if I drop index 1 the results are:
Query a: 0.007 seconds
Query b: 0.035 seconds (5x as long!)
So my conclusion is that with the right indexing both methods work about as fast. However, if you do not need the compound index for anything else it's just a waste of space & write speed.
So: single field index is better in my case.

Error in mongodb query to get movie based on id

> db.movmodels.findOne()
{
"_id" : ObjectId("55320b0e0e9e0d9d0540593c"),
"username" : "punk",
"favMovies" : [
{
"alternate_ids" : {
"imdb" : "0137523"
},
"abridged_cast" : [
{
"characters" : [
"Tyler"
],
"id" : "162652627",
"name" : "Brad Pitt"
},
{
"characters" : [
"Narrator"
],
"id" : "162660884",
"name" : "Edward Norton"
},
{
"characters" : [
"Robert"
],
"id" : "162676383",
"name" : "Meat Loaf"
},
{
"characters" : [
"Angel Face"
],
"id" : "162653925",
"name" : "Jared Leto"
},
{
"characters" : [
"Boss"
],
"id" : "770706064",
"name" : "Zach Grenier"
}
],
"synopsis" : "",
"ratings" : {
"audience_score" : 96,
"audience_rating" : "Upright",
"critics_score" : 80,
"critics_rating" : "Certified Fresh"
},
"release_dates" : {
"dvd" : "2000-06-06",
"theater" : "1999-10-15"
},
"critics_consensus" : "",
"runtime" : 139,
"mpaa_rating" : "R",
"year" : 1999,
"title" : "Fight Club",
"id" : "13153"
}
],
"__v" : 0
}
This is my data in MongoDB.
As I am new to MongoDB, I wanted to know the query to get a movie with a particular id.
I need to get the movie based on its id so that I can remove it from my database. The query that I tried is:
db.movmodels.findOne({username:"punk"},{favMovies:{id:13153}})
but this gives me an error:
2015-04-18T05:41:26.221-0400 E QUERY Error: error: {
"$err" : "Can't canonicalize query: BadValue Unsupported projection option: favMovies: { id: 13153.0 }",
"code" : 17287
}
at Error (<anonymous>)
at DBQuery.next (src/mongo/shell/query.js:259:15)
at DBCollection.findOne (src/mongo/shell/collection.js:188:22)
at (shell):1:14 at src/mongo/shell/query.js:259
There are several problems with your query:
The second parameter to find() is a projection, not part of the query. What you want is to supply one document for the query that has two properties: {"username" : "punk", favMovies : { ... } }
However, you also don't want to compare the entire sub-document favMovies; you only want to match on one of its properties, the id, which requires 'reaching into the object' using dot notation: {username:"punk", "favMovies.id" : 13153}.
However, that will probably not work yet, because 13153 is not the same as "13153", the latter being a string while the former is a number in JSON.
db.movmodels.findOne({username:"punk", "favMovies.id" : "13153"})
Keep in mind, however, that this will find the entire document for the user named "punk". I'm not sure what exactly your data structure should look like, but it appears you'll have to $pull the movie from the user. In general, I'd say you're embedding too much data into the user, but that's hard to tell without knowing the exact use case.
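A minimal sketch of the $pull mentioned above, assuming the goal is to remove one movie from one user's favMovies array. buildRemoveFavMovie is a hypothetical helper; it converts the id to a string because the stored id is the string "13153", not a number:

```javascript
// Build the filter and update documents that remove one movie from a user's favMovies.
function buildRemoveFavMovie(username, movieId) {
  var id = String(movieId); // the stored value is a string, so 13153 would not match
  return {
    filter: { username: username, "favMovies.id": id },
    update: { $pull: { favMovies: { id: id } } }
  };
}

var removal = buildRemoveFavMovie("punk", 13153);
console.log(JSON.stringify(removal));
// In a shell that has updateOne, roughly: db.movmodels.updateOne(removal.filter, removal.update)
```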
Here you go:
If you just want to get the first user who has this favorite movie (note that the stored id is a string):
db.movmodels.findOne({"favMovies.id": "13153"});
And if you want to know whether a specific user has that movie as a favorite:
db.movmodels.findOne({"favMovies.id": "13153", username:"punk"});
The second argument to findOne is a projection; it is used to return only particular fields.
You can use also $elemMatch projection operator (not to be confused with the $elemMatch query operator)
db.movmodels.find({username:"punk"},{favMovies:{$elemMatch:{id:"13153"}}});
If you want to find a user document that has a movie (with id "13153") in its 'favMovies' array, then write the query as below:
db.movmodels.findOne({username:"punk",'favMovies.id':"13153"})
And if you want to find the document with _id 55320b0e0e9e0d9d0540593c, write the following query:
db.movmodels.findOne({username:"punk",'_id':ObjectId("55320b0e0e9e0d9d0540593c")})

MongoDB - How can I use MapReduce to merge a value from one collection into another collection on multiple keys of a second collection?

I have two MongoDB collections: The first is a collection that includes frequency information for different IDs and is shown (truncated form) below:
[
{
"_id" : "A1",
"value" : 19
},
{
"_id" : "A2",
"value" : 6
},
{
"_id" : "A3",
"value" : 12
},
{
"_id" : "A4",
"value" : 8
},
{
"_id" : "A5",
"value" : 4
},
...
]
The second collection is more complex and contains information for each _id listed in the first collection (it's called frequency_collection_id in the second collection), but frequency_collection_id may be inside two lists (info.details_one, and info.details_two) for each record:
[
{
"_id" : ObjectId("53cfc1d086763c43723abb07"),
"info" : {
"status" : "pass",
"details_one" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known"
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown"
}
],
"details_two" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known"
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown"
}
],
}
}
...
]
What I'm looking to do, is merge the frequency information (from the first collection) into the second collection, in effect creating a collection that looks like:
[
{
"_id" : ObjectId("53cfc1d086763c43723abb07"),
"info" : {
"status" : "pass",
"details_one" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known",
"value" : 19
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown",
"value" : 6
}
],
"details_two" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known",
"value" : 19
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown",
"value" : 6
}
],
}
}
...
]
I know that this should be possible with MongoDB's MapReduce functions, but all the examples I've seen are either too minimal for my collection structure, or are answering different questions than I'm looking for.
Does anyone have any pointers? How can I merge my frequency information (from my first collection) into the records (inside my two lists in each record of the second collection)?
I know this is more or less a JOIN, which MongoDB does not support, but from my reading, it looks like this is a prime example of MapReduce.
I'm learning Mongo as best I can, so please forgive me if my question is too naive.
Just like all MongoDB operations, a MapReduce always operates on a single collection and cannot obtain info from another one. So your first step needs to be to dump both collections into one. Your documents have different _ids, so it should not be a problem for them to coexist in the same collection.
Then you do a MapReduce where the map function emits both kinds of documents for their common key, which is their frequency ID.
Your reduce function will then receive an array of two documents for each key: the two documents emitted for that key. You then just have to merge these two documents into one. Keep in mind that the reduce function can receive these two documents in any order. It can also happen that it gets called for a partial result (only one of the two documents) or for an already completed result. You need to handle these cases gracefully! A good implementation could be to create a new object and then iterate over the input documents, copying all existing relevant fields with their values to the new object, so the resulting object is an amalgamation of the input documents.
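A minimal sketch of such a reduce function, assuming the map step emits plain objects keyed by the frequency ID. The emitted shapes used here ({ value: ... } from the frequency documents and { name, class } from the detail entries) are assumptions for illustration, not something the question fixes:

```javascript
// Merge every input value into one new object, copying each field that is present.
// Input order does not matter, a lone input passes through unchanged, and feeding
// an earlier result back in gives the same result again.
function mergeReduce(key, values) {
  var merged = {};
  values.forEach(function (value) {
    Object.keys(value).forEach(function (field) {
      if (value[field] !== undefined && value[field] !== null) {
        merged[field] = value[field];
      }
    });
  });
  return merged;
}

// Hypothetical emitted values for key "A1".
var fromFrequency = { value: 19 };
var fromDetails = { name: "A1_object_name", class: "known" };

var mergedBoth = mergeReduce("A1", [fromFrequency, fromDetails]);
var mergedReversed = mergeReduce("A1", [fromDetails, fromFrequency]);
var mergedAgain = mergeReduce("A1", [mergedBoth]);     // already completed result
var mergedPartial = mergeReduce("A1", [fromDetails]);  // only one of the two documents

console.log(mergedBoth, mergedReversed, mergedAgain, mergedPartial);
```

The merged values would still have to be written back into the details_one and details_two arrays in a later step; this sketch only covers the merge inside reduce.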