Merging two collections in MongoDB using map-reduce - mongodb

I have two collections, shown below; products holds a reference to a user. I search for a product by name, and in return I want the combined output of the product and its user, using the map-reduce method.
user collection:
{
"_id" : ObjectId("52ac5dd1fb670c2007000000"),
"company" : {
"about" : "This is textile machinery dealer",
"contactAddress" : [{
"address" : "abcd",
"city" : "52ac4bc6fb670c1007000000",
"zipcode" : "39as46as80"
},{
"address" : "abcd",
"city" : "52ac4bc6fb670c1007000000",
"zipcode" : "39as46as80"
}],
"fax" : "58784868",
"mainProducts" : "ads,asd,asd",
"mobileNumber" : "9537236588",
"name" : "krishna steels"
},
"user" : ObjectId("52ac4eb7fb670c0c07000000")
}
product collection:
{
"_id" : ObjectId("52ac5722fb670cf806000002"),
"category" : "52a2a9cc48a508b80e00001d",
"deliveryTime" : "10 days after received the ",
"price" : {
"minPrice" : "2000",
"maxPrice" : "3000",
"perUnit" : "5288ac6f7c104203e0976851",
"currency" : "INR"
},
"productName" : "New Mobile Solar Charger with Carabiner",
"rejectReason" : "",
"status" : 1,
"user" : ObjectId("52ac4eb7fb670c0c07000000")
}

This cannot be done directly. MongoDB supports map-reduce on only one collection at a time. You could fetch both result sets and merge them in a Java collection; I solved a similar problem that way a couple of days ago.
See the similar response about joins across multiple collections not being supported in MongoDB.

This can be done using two map-reduces.
You run the first MR with its output written to a collection, then run the second MR with the reduce output action, so that its results are reduced onto the results of the first.
You shouldn't do this, though. JOINs are not designed to be done through MR. In fact, it sounds like you are trying to run this MR with inline output, which in itself is a very bad idea.
MRs are not designed to run inline with the application.
You would be better off doing the JOIN elsewhere.
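As a rough illustration of doing the JOIN elsewhere, here is a minimal sketch of an application-side join in plain JavaScript. The helper name joinProductsWithUsers is made up for this example, and the sample ids are written as strings rather than ObjectId values. In practice the two arrays would come from two ordinary find() queries, one per collection.

```javascript
// Hypothetical helper: joins product documents to user documents in application code.
// References are compared as strings so the sketch works with ObjectId values or plain strings.
function joinProductsWithUsers(products, users) {
  const usersByRef = new Map();
  for (const u of users) {
    usersByRef.set(String(u.user), u);
  }
  // Each output row pairs a product with the user document sharing the same "user" reference.
  return products.map((p) => ({
    product: p,
    user: usersByRef.get(String(p.user)) || null,
  }));
}

// Sample data shaped like the documents in the question (trimmed, ids as strings).
const users = [
  {
    _id: "52ac5dd1fb670c2007000000",
    company: { name: "krishna steels" },
    user: "52ac4eb7fb670c0c07000000",
  },
];
const products = [
  {
    _id: "52ac5722fb670cf806000002",
    productName: "New Mobile Solar Charger with Carabiner",
    user: "52ac4eb7fb670c0c07000000",
  },
];

const joined = joinProductsWithUsers(products, users);
console.log(joined[0].product.productName, "/", joined[0].user.company.name);
```

Building the lookup map first keeps the join at one pass over each array instead of a nested loop.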

Related

(MongoDB) Aggregate (.out) moves values to wrong fields

I'm currently building an autocomplete search bar for a website using Express.js and MongoDB 4.2.7, and using lat and lon to avoid calling a geocoding API.
Here is the format of my documents:
{
"_id" : ObjectId("5eea03e9a7891b6d701df571"),
"id_ban_position" : "ban-position-4c882de48d894fc49ed9be76f7631d6d",
"id_ban_adresse" : "ban-housenumber-78a2651ce382476ca9c8c8fcbcbaa539",
"cle_interop" : "01001_A028_5307",
"id_ban_group" : "ban-group-37c7c3f2b61440b48bca7d8fad56055e",
"id_fantoir" : "01001A028",
"numero" : 5307,
"suffixe" : "",
"nom_voie" : "Lotissement les Lilas",
"code_postal" : 1400,
"nom_commune" : "L'Abergement-Clémenciat",
"code_insee" : 1001,
"nom_complementaire" : "",
"x" : 848436.205131189,
"y" : 6562595.33916435,
"lon" : 4.923279,
"lat" : 46.146903,
"typ_loc" : "parcel",
"source" : "dgfip",
"date_der_maj" : "2019-02-12"
}
As you can see, I have a lot of fields that are not necessary for the purpose of my website. Most of all, I don't need one row for each house number in a street; one unique street name is enough. So I decided to suppress duplicate combinations of street name ('nom_voie') and town name ('nom_commune') with aggregate as follows:
db.Addresses.aggregate(
[ {
$group: {
_id: {voie: "$nom_voie", commune: "$nom_commune"},
doc: {$first: "$$ROOT"}
}
},
{$replaceRoot: {newRoot: "$doc"}},
{$out: 'UniqueIds'}
],
{allowDiskUse: true }
);
The problem is that this operation moved a lot of values from one field to another and made my database absolutely unusable:
{
"_id" : ObjectId("5eea03e9a7891b6d701df571"),
"nom_voie" : "-",
"code_postal" : 1400,
"nom_commune" : "Lotissement les Lilas",
"nom_complementaire" : "",
"lon" : 6562595.33916435,
"lat" : 848436.205131189,
"field19" : "dgfip",
...
}
As you can see, the value of "nom_voie" is now in "nom_commune", the "x" value is in "lat", the "y" value is in "lon", and the "nom_voie" value is now "-". I also got new fields replacing others ("field19" instead of "source").
Am I using aggregate with a wrong option?
I have 47 million entries; could that be causing issues?
Thanks to all of you for your time, even if you are just reading this!
EDIT:
After a few tries and some searching, I found out that it is the aggregate function that causes the problems, but I still don't understand why. If somebody has any hints, I have just edited the post so it's more understandable and readable.
(Before, I thought it was a $unset problem in a function.)
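For reference, the stages in the pipeline above ($group with $first on "$$ROOT", then $replaceRoot) only select whole documents; they do not rename fields or move values between them. The sketch below models that intended behaviour in plain JavaScript so the expected output shape can be compared against what ends up in UniqueIds. The helper name firstPerStreetAndTown and the second and third sample rows are made up for illustration. If the output of the real pipeline differs from this model, it may be worth checking whether the shifted values are already present in some documents of Addresses.

```javascript
// Plain-JavaScript model of: $group by {nom_voie, nom_commune} keeping $first "$$ROOT",
// followed by $replaceRoot. Hypothetical helper, not part of MongoDB.
function firstPerStreetAndTown(docs) {
  const seen = new Map();
  for (const doc of docs) {
    // The group key is the (street, town) pair.
    const key = JSON.stringify([doc.nom_voie, doc.nom_commune]);
    // Keep only the first document seen for each key, untouched.
    if (!seen.has(key)) seen.set(key, doc);
  }
  return Array.from(seen.values());
}

// Trimmed sample rows; only the first one comes from the question.
const addresses = [
  { numero: 5307, nom_voie: "Lotissement les Lilas", nom_commune: "L'Abergement-Clémenciat", lon: 4.923279, lat: 46.146903, source: "dgfip" },
  { numero: 5309, nom_voie: "Lotissement les Lilas", nom_commune: "L'Abergement-Clémenciat", lon: 4.9233, lat: 46.1469, source: "dgfip" },
  { numero: 12, nom_voie: "Route de la Fontaine", nom_commune: "L'Abergement-Clémenciat", lon: 4.92, lat: 46.15, source: "dgfip" },
];

const uniqueStreets = firstPerStreetAndTown(addresses);
console.log(uniqueStreets.map((d) => d.nom_voie));
```

Note that without a preceding $sort stage, which document $first picks per group is not guaranteed; this model simply takes input order.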

Best way to create index for MongoDB

I have customer records and their transactions stored in a MongoDB collection in the format below:
{
"_id" : ObjectId("59b6992a0b54c9c4a5434088"),
"Results" : {
"id" : "2139623696",
"member_joined_date" : ISODate("2010-07-07T00:00:00.000+0000"),
"account_activation_date" : ISODate("2010-07-07T00:00:00.000+0000"),
"family_name" : "XYZ",
"given_name" : "KOKI HOI",
"gender" : "Female",
"dob" : ISODate("1967-07-20T00:00:00.000+0000"),
"preflanguage" : "en-GB",
"title" : "MR",
"contact_no" : "60193551626",
"email" : "abc123#xmail.com",
"street1" : "address line 1",
"street2" : "address line 2",
"street3" : "address line 3",
"zipcd" : "123456",
"city" : "xyz",
"countrycd" : "Malaysia",
"Transaction" : [
{
"txncd" : "411",
"txndate" : ISODate("2017-08-02 00:00:00.000000"),
"prcs_date" : ISODate("2017-08-02 00:00:00.000000"),
"txn_descp" : "Some MALL : SHOP & FLY FREE",
"merchant_id" : "6587867dsfd",
"orig_pts" : "0.00000",
"text" : "Some text"
}
]
}
}
I want to create indexes on the fields "txn_descp", "txndate", "member_joined_date", "gender", and "dob" for faster access. Can someone help me create indexes for this document? I would appreciate any help and suggestions.
While creating an index, there are a few things to keep in mind:
Always create indexes for the queries you actually run.
Go for compound indexes whenever possible.
The first field in the index should be the one with the fewest possible values. For example, if there is an index with gender and dob as keys, it is better to have {gender:1,dob:1}.
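A minimal sketch of how those guidelines might translate into createIndex calls. Because the fields sit under "Results" (and the transaction fields inside the "Transaction" array), the index keys need dotted paths. The helper name createCustomerIndexes, the collection name customers in the comment, and the particular grouping of fields into indexes are assumptions; the right grouping depends on the queries actually run. A stand-in collection object is used so the sketch runs without a server.

```javascript
// Hypothetical helper: creates indexes for the fields named in the question.
// `collection` is anything exposing createIndex(keys), e.g. db.customers in the mongo shell.
function createCustomerIndexes(collection) {
  const specs = [
    // Compound index, gender first as suggested above.
    { "Results.gender": 1, "Results.dob": 1 },
    { "Results.member_joined_date": 1 },
    // Both fields live in the same array of sub-documents, so this becomes a multikey index.
    { "Results.Transaction.txndate": 1, "Results.Transaction.txn_descp": 1 },
  ];
  for (const keys of specs) {
    collection.createIndex(keys);
  }
  return specs;
}

// Stand-in for a real collection: it only records the requested index keys.
const indexCalls = [];
const fakeCollection = {
  createIndex: (keys) => {
    indexCalls.push(keys);
    return Object.keys(keys).join("_");
  },
};

createCustomerIndexes(fakeCollection);
console.log(indexCalls.length, "indexes requested");
```

In the shell, passing the real collection object instead of fakeCollection would issue the same three createIndex calls against the server.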

Convert a MongoDB with two collections in a neo4j graph

I have finished creating my Mongo database. It consists of two collections:
1. team
2. coach
I give you an example of the documents contained in these collections:
Here is a team document:
{
"_id" : "Mil.74",
"official_name" : "Associazione Calcio Milan S.p.A",
"common_name" : "Milan",
"country" : "Italy",
"started_by" : {
"day" : 16,
"month" : 12,
"year" : 1899
},
"stadium" : {
"name" : "Giuseppe Meazza",
"capacity" : 81277
},
"palmarès" : {
"Serie A" : 18,
"Serie B" : 2,
"Coppa Italia" : 5,
"Supercoppa Italiana" : 6,
"UEFA Champions League" : 7,
"UEFA Super Cup" : 5,
"Cup Winners cup" : 2,
"UEFA Intercontinental cup" : 4
},
"uniform" : "black and red"
}
This is a coach document:
{
"_id" : ObjectId("556cec3b9262ab4f14165fcd"),
"name" : "Carlo",
"surname" : "Ancelotti",
"age" : 55,
"date_Of_birth" : {
"day" : 10,
"month" : 6,
"year" : 1959
},
"place_Of_birth" : "Reggiolo",
"nationality" : "Italian",
"preferred_formation" : "4-2-3-1",
"coached_Team" : [
{
"team_id" : "RMa.103",
"in_charge" : {
"from" : "26/june/2013",
"to" : "25/may/2015"
},
"matches" : 119
},
{
"team_id" : "PSG.00",
"in_charge" : {
"from" : "30/dec/2011",
"to" : "24/june/2013"
},
"matches" : 77
},
{
"team_id" : "Che.11",
"in_charge" : {
"from" : "01/july/2009",
"to" : "22/may/2011"
},
"matches" : 109
},
{
"team_id" : "Mil.74",
"in_charge" : {
"from" : "07/nov/2001",
"to" : "31/may/2009"
},
"matches" : 420
}
]
}
As you can see, I used a normalized model: every coach has an array of coached teams.
I want to convert this Mongo database into a graph database, specifically Neo4j. My goal is to show that in such highly connected domains Neo4j performs better than Mongo. For example, the query "Find the palmarès of all teams coached by Carlo Ancelotti" requires two queries in Mongo, whereas in Neo4j it's enough to follow relationships.
I found this guide on the forum that uses Gremlin to convert a Mongo collection of documents into a Neo4j graph automatically. The problem is that the guide covers just one collection.
So, is it possible to generate the Neo4j graph automatically from my Mongo database (with two collections), or must I create the graph "by hand"?
Gremlin is a Domain Specific Language for working with graphs, but it is based on Groovy so you effectively have all the flexibility you want to really do whatever you want. In other words, what you can do with one MongoDB collection you can easily do with two (or however many collections you have). That was the point of the blog post referenced in one of the other answers:
http://thinkaurelius.com/2013/02/04/polyglot-persistence-and-query-with-gremlin/
Gremlin is a great language for transforming data into graph form, whatever its source format is. I would think that you would first load all of your teams as vertices then iterate through your coaches, creating coach vertices and edges to their related teams as you go.
I would also add that nothing is "automatic" about Gremlin. It's not as though you tell Gremlin that you have data in MongoDB and it turns it into a graph. You have to write Gremlin to tell it how you want your MongoDB data turned into a graph.
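The load order described above (teams first, then coaches with edges to their teams) can be sketched independently of Gremlin. The following plain JavaScript is not Gremlin and does not talk to Neo4j; it only builds in-memory vertex and edge lists from documents shaped like the ones in the question. The helper name buildGraph, the "COACHED" edge label, and the vertex/edge object layout are assumptions made for illustration.

```javascript
// Hypothetical helper: turns team and coach documents into vertex and edge lists.
function buildGraph(teams, coaches) {
  const vertices = [];
  const edges = [];
  const teamIds = new Set();

  // Step 1: every team becomes a vertex.
  for (const team of teams) {
    vertices.push({ id: team._id, label: "team", properties: team });
    teamIds.add(team._id);
  }

  // Step 2: every coach becomes a vertex, with one edge per coached team.
  for (const coach of coaches) {
    const coachId = String(coach._id);
    vertices.push({
      id: coachId,
      label: "coach",
      properties: { name: coach.name, surname: coach.surname },
    });
    for (const stint of coach.coached_Team || []) {
      // Skip references to teams that were not loaded as vertices.
      if (!teamIds.has(stint.team_id)) continue;
      edges.push({
        from: coachId,
        to: stint.team_id,
        label: "COACHED",
        properties: { in_charge: stint.in_charge, matches: stint.matches },
      });
    }
  }
  return { vertices, edges };
}

// Trimmed sample documents taken from the question.
const teams = [{ _id: "Mil.74", common_name: "Milan", country: "Italy" }];
const coaches = [
  {
    _id: "556cec3b9262ab4f14165fcd",
    name: "Carlo",
    surname: "Ancelotti",
    coached_Team: [
      { team_id: "RMa.103", in_charge: { from: "26/june/2013", to: "25/may/2015" }, matches: 119 },
      { team_id: "Mil.74", in_charge: { from: "07/nov/2001", to: "31/may/2009" }, matches: 420 },
    ],
  },
];

const graph = buildGraph(teams, coaches);
console.log(graph.vertices.length, "vertices,", graph.edges.length, "edges");
```

Keeping the stint details (dates, matches) as edge properties means the relationship itself carries the data that the embedded coached_Team array held in Mongo.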

mongodb: taking a set of keys from one collection and matching with another

I'm new to MongoDB and JavaScript, and have been reading the manual, but I can't seem to put the pieces together to solve the following problem. I was wondering if you could kindly help.
I have two collections "places" and "reviews".
One document in "places" collection is as follows:
{
"_id" : "004571a7-afe4-4124-996e-b6ec779db494",
"name" : "wakawaka place",
"address" : {
"address" : "12 ad avenue",
"city" : "New York",
},
"review" : [
{
"id" : "i32347",
"review_list" : [
"r123456",
"r123457"
],
}
]
}
The "review" array can be empty for some documents.
And in the "reviews" collection, every document in the collection represents a review:
{
"_id" : ObjectId("53c913689c8e91a5a9c4047f"),
"user_id" : "useridhere",
"review_id" : "r123456",
"attraction_id" : "i32347",
"content" : "review content here"
}
What I would like to achieve is, for each place that has reviews, to get the content of each review from the "reviews" collection and store them together in a new collection.
I'd be grateful for any suggestions on how to go about this.
Thanks
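One possible way to approach this, sketched in plain JavaScript under the assumption that both collections are small enough to load into memory (for example with find().toArray() in the shell). The helper name collectReviewContents and the shape of the output documents are made up for illustration; the resulting array could then be written to a new collection with insertMany.

```javascript
// Hypothetical helper: for each place with reviews, gather the matching review contents.
function collectReviewContents(places, reviews) {
  // Index review contents by review_id for direct lookup.
  const contentById = new Map(reviews.map((r) => [r.review_id, r.content]));
  const result = [];
  for (const place of places) {
    // Flatten every review_list under the place's "review" array into one list of ids.
    const ids = (place.review || []).flatMap((entry) => entry.review_list || []);
    if (ids.length === 0) continue; // places without reviews are skipped
    result.push({
      _id: place._id,
      name: place.name,
      // Ids with no matching document in "reviews" are dropped.
      contents: ids.filter((id) => contentById.has(id)).map((id) => contentById.get(id)),
    });
  }
  return result;
}

// Sample data based on the documents in the question; the second place is made up.
const places = [
  {
    _id: "004571a7-afe4-4124-996e-b6ec779db494",
    name: "wakawaka place",
    review: [{ id: "i32347", review_list: ["r123456", "r123457"] }],
  },
  { _id: "place-without-reviews", name: "empty place", review: [] },
];
const reviews = [
  { user_id: "useridhere", review_id: "r123456", attraction_id: "i32347", content: "review content here" },
];

const placeReviews = collectReviewContents(places, reviews);
console.log(JSON.stringify(placeReviews));
```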

Get nested fields with MongoDB shell

I have a "users" collection with a "watchlists" field, which has many inner fields too; one of them is "arrangeable_values" (the second field within "watchlists").
For each user in the "users" collection, I need to find each "arrangeable_values" within "watchlists".
How can I do that with the MongoDB shell?
Here is an example of the data model:
> db.users.findOne({'nickname': 'superj'})
{
"_id" : ObjectId("4f6c42f6018a590001000001"),
"nickname" : "superj",
"provider" : "github",
"user_hash" : null,
"watchlists" : [
{
"_id" : ObjectId("4f6c42f7018a590001000002"),
"arrangeable_values" : {
"description" : "My introduction presentation to node.js along with sample code at various stages of building a simple RESTful web service with journey, cradle, winston, optimist, and http-console.",
"tag" : "",
"html_url" : "https://github.com/indexzero/nodejs-intro"
},
"avatar_url" : "https://secure.gravatar.com/avatar/d43e8ea63b61e7669ded5b9d3c2e980f?d=https://a248.e.akamai.net/assets.github.com%2Fimages%2Fgravatars%2Fgravatar-140.png",
"created_at" : ISODate("2011-02-01T10:20:29Z"),
"description" : "My introduction presentation to node.js along with sample code at various stages of building a simple RESTful web service with journey, cradle, winston, optimist, and http-console.",
"fork_" : false,
"forks" : 13,
"html_url" : "https://github.com/indexzero/nodejs-intro",
"pushed_at" : ISODate("2011-09-12T17:54:58Z"),
"searchable_values" : [
"description:my",
"description:introduction",
"description:presentation",
"html_url:indexzero",
"html_url:nodejs",
"html_url:intro"
],
"tags_array" : [ ],
"watchers" : 75
},
{
"_id" : ObjectId("4f6c42f7018a590001000003"),
"arrangeable_values" : {
"description" : "A Backbone alternative idea",
"tag" : "",
"html_url" : "https://github.com/maccman/spine.todos"
},
"avatar_url" : "https://secure.gravatar.com/avatar/baf018e2cc4616e4776d323215c7136c?d=https://a248.e.akamai.net/assets.github.com%2Fimages%2Fgravatars%2Fgravatar-140.png",
"created_at" : ISODate("2011-03-18T11:03:42Z"),
"description" : "A Backbone alternative idea",
"fork_" : false,
"forks" : 31,
"html_url" : "https://github.com/maccman/spine.todos",
"pushed_at" : ISODate("2011-11-20T22:59:45Z"),
"searchable_values" : [
"description:a",
"description:backbone",
"description:alternative",
"description:idea",
"html_url:https",
"html_url:github",
"html_url:com",
"html_url:maccman",
"html_url:spine",
"html_url:todos"
],
"tags_array" : [ ],
"watchers" : 139
}
]
}
For the document above, the following find() query would extract both the "nickname" of the document, and its associated "arrangeable_values" (where the document is in the users collection):
db.users.find({}, { "nickname" : 1, "watchlists.arrangeable_values" : 1 })
The result you get for your single document example would be:
{ "_id" : ObjectId("4f6c42f6018a590001000001"), "nickname" : "superj",
"watchlists" : [
{ "arrangeable_values" : { "description" : "My introduction presentation to node.js along with sample code at various stages of building a simple RESTful web service with journey, cradle, winston, optimist, and http-console.", "tag" : "", "html_url" : "https://github.com/indexzero/nodejs-intro" } },
{ "arrangeable_values" : { "description" : "A Backbone alternative idea", "tag" : "", "html_url" : "https://github.com/maccman/spine.todos" } }
] }
MongoDB queries return entire documents. You are looking for a field inside an array inside the document, and a plain find() cannot pick out individual array entries.
The problem here is that any basic find() query will return all matching documents in full. find() does have the option to return only specific fields, but that will not filter your array of sub-objects: you could return watchlists, but not just the watchlist entries that match.
As it stands you have two options:
Write some client-side code that loops through the documents and does the filtering. Remember that the shell is effectively a JavaScript driver, so you can write code in there.
Use the new aggregation framework. This will have a learning curve, but it can effectively extract the sub-items you're looking for.
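A minimal sketch of the first option, the client-side loop, written as plain JavaScript so it can be pasted into the shell. The helper name arrangeableValuesByUser is made up for this example; in the shell it could be fed with db.users.find().toArray(). The sample document is a trimmed copy of the one in the question.

```javascript
// Hypothetical helper: for each user, pull out every watchlists[].arrangeable_values.
function arrangeableValuesByUser(users) {
  return users.map((user) => ({
    nickname: user.nickname,
    // Users without a watchlists array yield an empty list.
    arrangeable_values: (user.watchlists || []).map((w) => w.arrangeable_values),
  }));
}

// Trimmed copy of the sample document from the question.
const sampleUsers = [
  {
    nickname: "superj",
    watchlists: [
      {
        arrangeable_values: { description: "A Backbone alternative idea", tag: "", html_url: "https://github.com/maccman/spine.todos" },
        forks: 31,
        watchers: 139,
      },
    ],
  },
];

const extracted = arrangeableValuesByUser(sampleUsers);
console.log(JSON.stringify(extracted, null, 2));
```

Any further filtering of entries (for example by tag) would go inside the inner map as an extra filter step.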