How to reduce execution time in this MongoDB find query?

A sample document looks like this:
{
"_id" : ObjectId("62317ae9d007af22f984c0b5"),
"productCategoryName" : "Product category 1",
"productCategoryDescription" : "Description about product category 1",
"productCategoryIcon" : "abcd.svg",
"status" : true,
"productCategoryUnits" : [
{
"unitId" : ObjectId("61fa5c1273a4aae8d89e13c9"),
"unitName" : "kilogram",
"unitSymbol" : "kg",
"_id" : ObjectId("622715a33c8239255df084e4")
}
],
"productCategorySizes" : [
{
"unitId" : ObjectId("61fa5c1273a4aae8d89e13c9"),
"unitName" : "kilogram",
"unitSize" : 10,
"unitSymbol" : "kg",
"_id" : ObjectId("622715a33c8239255df084e3")
}
],
"attributes" : [
{
"attributeId" : ObjectId("62136ed38a35a8b4e195ccf4"),
"attributeName" : "Country of Origin",
"attributeOptions" : [],
"isRequired" : true,
"_id" : ObjectId("622715ba3c8239255df084f8")
}
]
}
The collection is indexed on "_id". Without the sub-documents the execution time drops, but all document fields are required.
db.getCollection('product_categories').find({})
The collection contains 30000 records and this query takes more than 30 seconds to execute. How can I solve this issue? Can anybody suggest a better solution? Thanks.

Indexes, including compound indexes, let MongoDB locate matching documents without scanning every document each time you query, although they cannot speed up an unfiltered find({}) that returns the whole collection. 30,000 documents is a small collection for MongoDB, which routinely handles millions. If these fields are populated in the process, that is another heavy operation for the query.
Check whether your schema is efficiently structured and whether you are throttling your connection to the server. Another thing to consider is projecting only the fields that you require, either with a find projection or with an aggregation pipeline.
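The projection advice above, combined with fetching the collection in pages, could be sketched roughly like this. It is only an illustration: `buildPageQuery` is a hypothetical helper, the page size and field list are arbitrary, and the usage comment assumes the Node.js driver's `find(filter, options)` form rather than anything stated in the question.

```javascript
// Hypothetical helper: builds the arguments for one page of a find() on product_categories.
// Paging by _id reuses the existing _id index, so each page is an index range scan.
function buildPageQuery(lastId, pageSize, fields) {
  // The first page has no lower bound; later pages start after the last _id already seen.
  const filter = lastId == null ? {} : { _id: { $gt: lastId } };
  const options = { sort: { _id: 1 }, limit: pageSize };
  if (fields && fields.length > 0) {
    // Inclusion projection: only the listed fields (plus _id) are returned.
    options.projection = Object.fromEntries(fields.map((f) => [f, 1]));
  }
  return { filter, options };
}

// Assumed usage with the Node.js driver (lastId would be an ObjectId there):
//   const { filter, options } = buildPageQuery(null, 500, ["productCategoryName", "status"]);
//   const page = await db.collection("product_categories").find(filter, options).toArray();
const firstPage = buildPageQuery(null, 500, ["productCategoryName", "status"]);
console.log(JSON.stringify(firstPage));
```

If every field really is required, the `fields` argument can simply be left out and only the paging applies.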

Although the question is not very clear, you can follow this article for some best practices.

Related

Query MongoDB collection

I have a MongoDB collection like this:
{
"_id" : {
"processUuid" : "653d0937-2afc-4915-ad42-d2b69f344402",
"partnerId" : "p377"
},
"isComplete" : true,
"tasks" : {
"dbb361a7-4b73-4691-bde5-2b160346464f" : {
"sku" : "4060079000048",
"status" : "FAILED",
"errorList" : [....]
},
"790dbc6f-563d-4eb7-931c-3cc604563dc1" : {
"sku" : "4060079000130",
"status" : "SUCCESSFUL",
"errorList" : [....]
},
... more tasks
}
... more processes
}
I want to query for a certain sku and couldn't find a way. I know I could write an aggregation pipeline to project the inner part of a task, but then I would lose the task identifier, which I need in order to add some data to a task.
I know the document structure is weird and I would have done it differently, but it is given.
I tried what I found about querying nested documents and so on, but unfortunately didn't get it to work. Any hints are appreciated.
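One possible direction, sketched under assumptions: the `$objectToArray` aggregation operator (available in MongoDB 3.4.4 and later) turns the `tasks` object into key/value pairs, so the task identifier survives as the key. The collection name `processes` and both function names are hypothetical. The plain-JavaScript function mirrors the pipeline for a single document so the idea can be tried without a server.

```javascript
// Hypothetical pipeline builder: keeps the task identifier while matching on sku.
function buildSkuPipeline(sku) {
  return [
    // Turn the tasks object into an array of { k: <taskId>, v: <task> } pairs.
    { $project: { isComplete: 1, tasks: { $objectToArray: "$tasks" } } },
    // One output document per task.
    { $unwind: "$tasks" },
    // Keep only the tasks with the requested sku.
    { $match: { "tasks.v.sku": sku } },
    // Expose the identifier and the task body under readable names (_id is kept by default).
    { $project: { taskId: "$tasks.k", task: "$tasks.v" } },
  ];
}

// In-memory equivalent for one document, useful for checking the logic.
function findTasksBySku(doc, sku) {
  return Object.entries(doc.tasks)
    .filter(([, task]) => task.sku === sku)
    .map(([taskId, task]) => ({ _id: doc._id, taskId, task }));
}

// Assumed usage in the shell: db.processes.aggregate(buildSkuPipeline("4060079000130"))
const sampleProcess = {
  _id: { processUuid: "653d0937-2afc-4915-ad42-d2b69f344402", partnerId: "p377" },
  isComplete: true,
  tasks: {
    "dbb361a7-4b73-4691-bde5-2b160346464f": { sku: "4060079000048", status: "FAILED" },
    "790dbc6f-563d-4eb7-931c-3cc604563dc1": { sku: "4060079000130", status: "SUCCESSFUL" },
  },
};
console.log(findTasksBySku(sampleProcess, "4060079000130"));
```

Because the task identifiers are field names, this pipeline cannot use an index on sku and has to read every document in the collection.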

How do I remove duplicates in MongoDB?

I have a database that consists of a few collections, and I tried copying from one collection to another.
During this process the connection was lost and I had to copy them again.
Now I find around 40000 duplicate records.
Format of my data:
{
"_id" : ObjectId("555abaf625149715842e6788"),
"reviewer_name" : "Sudarshan A",
"emp_name" : "Wilson Erica",
"evaluation_id" : NumberInt(550056),
"teamleader_id" : NumberInt(17199),
"reviewer_id" : NumberInt(1659),
"team_manager" : "Las Vegas",
"teammanager_id" : NumberInt(12245),
"team_leader" : "Thomas Donald",
"emp_id" : NumberInt(7781)
}
Here only evaluation_id is unique.
Queries that I have tried:
ensureIndex({id:1}, {unique:true, dropDups:true})
dropDups was removed in MongoDB ~2.7, so it is not available in 3.0 and later.
Here is another way to do it,
but I have not tested it.
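Since dropDups is gone, one common replacement is to group by the unique field, keep one document per group and delete the rest. The following is an untested-against-your-data sketch: the collection name `reviews` and the helper `idsToDelete` are assumptions, and which copy is kept within a group is arbitrary.

```javascript
// Groups documents by evaluation_id and keeps only the groups that have duplicates.
const duplicatePipeline = [
  { $group: { _id: "$evaluation_id", ids: { $push: "$_id" }, count: { $sum: 1 } } },
  { $match: { count: { $gt: 1 } } },
];

// Hypothetical helper: from each duplicate group, keep the first _id and return the others.
function idsToDelete(groups) {
  return groups.flatMap((group) => group.ids.slice(1));
}

// Assumed usage in the shell (take a backup first):
//   const groups = db.reviews.aggregate(duplicatePipeline, { allowDiskUse: true }).toArray();
//   db.reviews.deleteMany({ _id: { $in: idsToDelete(groups) } });
//   db.reviews.createIndex({ evaluation_id: 1 }, { unique: true });
const exampleGroups = [
  { _id: 550056, ids: ["id-a", "id-b", "id-c"], count: 3 },
  { _id: 550057, ids: ["id-d", "id-e"], count: 2 },
];
console.log(idsToDelete(exampleGroups));
```

Creating the unique index afterwards prevents the duplicates from coming back if the copy is interrupted again.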

MongoDB - How can I use MapReduce to merge a value from one collection into another collection on multiple keys of a second collection?

I have two MongoDB collections: The first is a collection that includes frequency information for different IDs and is shown (truncated form) below:
[
{
"_id" : "A1",
"value" : 19
},
{
"_id" : "A2",
"value" : 6
},
{
"_id" : "A3",
"value" : 12
},
{
"_id" : "A4",
"value" : 8
},
{
"_id" : "A5",
"value" : 4
},
...
]
The second collection is more complex and contains information for each _id listed in the first collection (it's called frequency_collection_id in the second collection), but frequency_collection_id may be inside two lists (info.details_one, and info.details_two) for each record:
[
{
"_id" : ObjectId("53cfc1d086763c43723abb07"),
"info" : {
"status" : "pass",
"details_one" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known"
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown"
}
],
"details_two" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known"
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown"
}
],
}
}
...
]
What I'm looking to do is merge the frequency information (from the first collection) into the second collection, in effect creating a collection that looks like:
[
{
"_id" : ObjectId("53cfc1d086763c43723abb07"),
"info" : {
"status" : "pass",
"details_one" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known",
**"value" : 19**
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown",
**"value" : 6**
}
],
"details_two" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known",
**"value" : 19**
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown",
**"value" : 6**
}
],
}
}
...
]
I know that this should be possible with MongoDB's MapReduce functions, but all the examples I've seen are either too minimal for my collection structure, or are answering different questions than I'm looking for.
Does anyone have any pointers? How can I merge my frequency information (from my first collection) into the records (inside my two lists in each record of the second collection)?
I know this is more or less a JOIN, which MongoDB does not support, but from my reading, it looks like this is a prime example of MapReduce.
I'm learning Mongo as best I can, so please forgive me if my question is too naive.
Just like all MongoDB operations, a MapReduce always operates on a single collection only and cannot obtain information from another one. So your first step needs to be to dump both collections into one. Your documents have different _id values, so it should not be a problem for them to coexist in the same collection.
Then you do a MapReduce where the map function emits both kinds of documents for their common key, which is their frequency ID.
Your reduce function will then receive an array of two documents for each key: the two documents you emitted. You then just have to merge these two documents into one. Keep in mind that the reduce function can receive these two documents in any order. It can also happen that it gets called for a partial result (only one of the two documents) or for an already completed result. You need to handle these cases gracefully! A good implementation could be to create a new object and then iterate over the input documents, copying all existing relevant fields with their values to the new object, so the resulting object is an amalgamation of the input documents.
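A reduce function along the lines described above might look like the sketch below. The name `reduceMerge` and the shape of the emitted values (plain objects such as `{ value: 19 }` from the frequency documents and `{ name, class }` from the detail entries) are assumptions made for illustration. It is written as plain JavaScript so it can be tried outside the server.

```javascript
// Merges all values emitted for one frequency ID into a single object.
// It must give the same result for any input order, for a single (partial) input,
// and when fed its own earlier output again (re-reduce).
function reduceMerge(key, values) {
  const merged = {};
  for (const doc of values) {
    for (const field of Object.keys(doc)) {
      // Copy only fields that actually carry a value.
      if (doc[field] !== undefined && doc[field] !== null) {
        merged[field] = doc[field];
      }
    }
  }
  return merged;
}

const fromFrequency = { value: 19 };
const fromDetails = { name: "A1_object_name", class: "known" };

// Either order gives an object with value, name and class.
console.log(reduceMerge("A1", [fromFrequency, fromDetails]));
console.log(reduceMerge("A1", [fromDetails, fromFrequency]));

// A partial result merged again later still ends up complete.
const partial = reduceMerge("A1", [fromFrequency]);
console.log(reduceMerge("A1", [partial, fromDetails]));
```

Returning an object of the same shape as the emitted values is what makes the re-reduce case safe.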

Want to merge two collections in MongoDB using map reduce

I have two collections as shown below; products have a reference to a user. I search for a product by name, and in return I want the combined output of product and user using the map reduce method.
user collection
{
"_id" : ObjectId("52ac5dd1fb670c2007000000"),
"company" : {
"about" : "This is textile machinery dealer",
"contactAddress" : [{
"address" : "abcd",
"city" : "52ac4bc6fb670c1007000000",
"zipcode" : "39as46as80"
},{
"address" : "abcd",
"city" : "52ac4bc6fb670c1007000000",
"zipcode" : "39as46as80"
}],
"fax" : "58784868",
"mainProducts" : "ads,asd,asd",
"mobileNumber" : "9537236588",
"name" : "krishna steels",
},
"user" : ObjectId("52ac4eb7fb670c0c07000000")
}
product collection
{
"_id" : ObjectId("52ac5722fb670cf806000002"),
"category" : "52a2a9cc48a508b80e00001d",
"deliveryTime" : "10 days after received the ",
"price" : {
"minPrice" : "2000",
"maxPrice" : "3000",
"perUnit" : "5288ac6f7c104203e0976851",
"currency" : "INR"
},
"productName" : "New Mobile Solar Charger with Carabiner",
"rejectReason" : "",
"status" : 1,
"user" : ObjectId("52ac4eb7fb670c0c07000000")
}
This cannot be done. MongoDB supports map-reduce on only one collection. You could try to fetch and merge in a Java collection. A couple of days back I solved a similar problem using a Java collection.
See the similar response about joins and multi-collection operations not being supported in MongoDB.
This can be done using two map-reduces.
You run your first MR and then have the second MR reduce its output onto the results of the first (the "reduce" output action).
You shouldn't do this, though. JOINs are not designed to be done through MR; in fact, it sounds like you are trying to run this MR with inline output, which in itself is a very bad idea.
MRs are not designed to run inline to the application.
You would be better off doing the JOIN elsewhere.
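Doing the JOIN elsewhere usually means in the application: fetch the matching products, fetch the users they reference with one `$in` query, and combine them in memory. A minimal sketch, assuming collections named `product` and `user` and a hypothetical helper `joinProductsWithUsers`:

```javascript
// Attaches the matching user document to each product via the shared "user" reference.
function joinProductsWithUsers(products, users) {
  // Index users by their reference so each product lookup is constant time.
  const byUser = new Map(users.map((u) => [String(u.user), u]));
  return products.map((p) => ({ ...p, seller: byUser.get(String(p.user)) || null }));
}

// Assumed usage in the shell:
//   const products = db.product.find({ productName: /solar/i }).toArray();
//   const users = db.user.find({ user: { $in: products.map((p) => p.user) } }).toArray();
//   const combined = joinProductsWithUsers(products, users);
const exampleProducts = [
  { productName: "New Mobile Solar Charger with Carabiner", user: "52ac4eb7fb670c0c07000000" },
];
const exampleUsers = [
  { company: { name: "krishna steels" }, user: "52ac4eb7fb670c0c07000000" },
];
console.log(joinProductsWithUsers(exampleProducts, exampleUsers));
```

This costs two round trips regardless of how many products match, which is typically cheaper than running a map-reduce per search.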

Retrieving only the relevant part of a stored document

I'm a newbie with MongoDB, and am trying to store user activity performed on a site. My data is currently structured as:
{ "_id" : ObjectId("4decfb0fc7c6ff7ff77d615e"),
"activity" : [
{
"action" : "added",
"item_name" : "iPhone",
"item_id" : 6140,
},
{
"action" : "added",
"item_name" : "iPad",
"item_id" : 7220,
}
],
"name" : "Smith",
"user_id" : 2
}
If I want to retrieve, for example, all the activity concerning item_id 7220, I would use a query like:
db.collection.find( { "activity.item_id" : 7220 } );
However, this seems to return the entire document, including the record for item 6140.
Can anyone suggest how this might be done correctly? I'm not sure if it's a problem with my query, or with the structure of the data itself.
Many thanks.
You have to wait for the following feature to be developed: https://jira.mongodb.org/browse/SERVER-828
You can use $slice only if you know the insertion order and position of your element.
Standard queries in MongoDB always return the whole document.
(question also available here: MongoDB query to return only embedded document)
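For readers on later releases: as far as I know, MongoDB 2.2 added the `$elemMatch` and positional `$` projections (which return only the first matching array element), and 3.2 added the `$filter` aggregation operator (which returns all matching elements). A hedged sketch of the `$filter` variant follows; the function names are hypothetical and the collection name in the usage comment is a placeholder.

```javascript
// Hypothetical pipeline builder: returns each matching document with only the
// activity entries for the requested item.
function buildActivityPipeline(itemId) {
  return [
    // Skip documents that have no activity for this item at all.
    { $match: { "activity.item_id": itemId } },
    {
      $project: {
        name: 1,
        user_id: 1,
        activity: {
          $filter: {
            input: "$activity",
            as: "entry",
            cond: { $eq: ["$$entry.item_id", itemId] },
          },
        },
      },
    },
  ];
}

// In-memory equivalent for one document, to check the expected shape of the result.
function filterActivity(doc, itemId) {
  return { ...doc, activity: doc.activity.filter((entry) => entry.item_id === itemId) };
}

// Assumed usage in the shell: db.collection.aggregate(buildActivityPipeline(7220))
const exampleUser = {
  activity: [
    { action: "added", item_name: "iPhone", item_id: 6140 },
    { action: "added", item_name: "iPad", item_id: 7220 },
  ],
  name: "Smith",
  user_id: 2,
};
console.log(filterActivity(exampleUser, 7220));
```

When only the first match is needed, a find with the projection `{ activity: { $elemMatch: { item_id: 7220 } } }` is a shorter alternative.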