Mongodb multiple subdocument - mongodb

I need a collection with structure like this:
{
"_id" : ObjectId("5ffc3e2df14de59d7347564d"),
"name" : "MyName",
"pays" : "de",
"actif" : 1,
"details" : {
"pt" : {
"title" : "MongoTime PT",
"availability_message" : "In stock",
"price" : 23,
"stock" : 1,
"delivery_location" : "Portugal",
"price_shipping" : 0,
"updated_date" : ISODate("2022-03-01T20:07:20.119Z"),
"priority" : false,
"missing" : 1,
},
"fr" : {
"title" : "MongoTime FR",
"availability_message" : "En stock",
"price" : 33,
"stock" : 1,
"delivery_location" : "France",
"price_shipping" : 0,
"updated_date" : ISODate("2022-03-01T20:07:20.119Z"),
"priority" : false,
"missing" : 1,
}
}
}
How can i create an index for each subdocument in 'details' ?
Or maybe it's better to do an array ?
Doing a query like this is currently very long (1 hour). How can I do ?
query = {"details.pt.missing": {"$in": [0, 1, 2, 3]}, "pays": 'de'}
db.find(query, {"_id": false, "name": true}, sort=[("details.pt.updated_date", 1)], limit=300)

An array type would be better, as there are advantages.
(1) You can include a new field which has values like pt, fr, xy, ab, etc. For example:
details: [
{ type: "pt", title : "MongoTime PT", missing: 1, other_fields: ... },
{ type: "fr", title : "MongoTime FR", missing: 1, other_fields: ... },
{ type: "xy", title : "MongoTime XY", missing: 2, other_fields: ... },
// ...
]
Note the introduction of the new field type (this can be any name representing the field data).
(2) You can also index on the array sub-document fields, which can improve query performance. Array field indexes are referred as Multikey Indexes.
The index can be on a field used in a query filter. For example, "details.missing". This key can also be part of a Compound Index. This can help a query filter like below:
{ pays: "de", "details.type": "pt", "details.missing": { $in: [ 0, 1, 2, 3 ] } }
NOTE: You can verify the usage of an index in a query by generating a Query Plan, applying the explain method on the find.
(3) Also, see Embedded Document Pattern as explained in the Model One-to-Many Relationships with Embedded Documents.

Related

What are the efficient query for mongodb if value exist on array then don't update and return the error that id already exist

I have an entry stored on my collection like this:
{
"_id" : ObjectId("5d416c595f19962ff0680dbc"),
"data" : {
"a" : 6,
"b" : [
"5c35f04c4e92b8337885d9a6"
]
},
"image" : "123.jpg",
"hyperlinks" : "google.com",
"expirydate" : ISODate("2019-08-27T06:10:35.074Z"),
"createdate" : ISODate("2019-07-31T10:24:25.311Z"),
"lastmodified" : ISODate("2019-07-31T10:24:25.311Z"),
"__v" : 0
},
{
"_id" : ObjectId("5d416c595f19962ff0680dbd"),
"data" : {
"a" : 90,
"b" : [
"5c35f04c4e92b8337885d9a7"
]
},
"image" : "456.jpg",
"hyperlinks" : "google.com",
"expirydate" : ISODate("2019-08-27T06:10:35.074Z"),
"createdate" : ISODate("2019-07-31T10:24:25.311Z"),
"lastmodified" : ISODate("2019-07-31T10:24:25.311Z"),
"__v" : 0
}
I have to write the query for push userid on b array which is under data object and increment the a counter which is also under data object.
For that, I wrote the Code i.e
db.collection.updateOne({_id: ObjectId("5d416c595f19962ff0680dbd")},
{$inc: {'data.a': 1}, $push: {'data.b': '124sdff54f5s4fg5'}}
)
I also want to check that if that id exist on array then return the response that following id exist, so for that I wrote extra query which will check and if id exist then return the error response that following id exist,
My question is that any single query will do this? Like I don't want to write Two Queries for single task.
Any help is really appreciated for that
You can add one more check in the update query on "data.b". Following would be the query:
db.collection.updateOne(
{
_id: ObjectId("5d416c595f19962ff0680dbd"),
"data.b":{
$ne: "124sdff54f5s4fg5"
}
},
{
$inc: {'data.a': 1},
$push: {'data.b': '124sdff54f5s4fg5'}
}
)
For duplicate entry, you would get the following response:
{ "acknowledged" : true, "matchedCount" : 0, "modifiedCount" : 0 }
If matched count is 0, you can show the error that the id already exists.
You can use the operator $addToSet to check if the element already exits in the array.
db.collection.updateOne({_id: ObjectId("5d416c595f19962ff0680dbd")},
{$inc: {'data.a': 1}, $addToSet: {'data.b': '124sdff54f5s4fg5'}}
)

Complex Sort on multiple very large MongoDB Collections

I have a mongodb database with currently about 30 collections ranging from 1.5gb to 2.5gb and I need to reformat and sort the data into nested groups and dump them to a new collection. This database will eventually have about 2000 collections of the same type and formatting of data.
Data is currently available like this:
{
"_id" : ObjectId("598392d6bab47ec75fd6aea6"),
"orderid" : NumberLong("4379116282"),
"regionid" : 10000068,
"systemid" : 30045305,
"stationid" : 60015036,
"typeid" : 7489,
"bid" : 0,
"price" : 119999.91,
"minvolume" : 1,
"volremain" : 6,
"volenter" : 8,
"issued" : "2015-12-31 09:12:29",
"duration" : "14 days, 0:00:00",
"range" : 65535,
"reportedby" : 0,
"reportedtime" : "2016-01-01 00:22:42.997926"} {...} {...}
I need to group these by regionid > typeid > bid like this:
{"regionid": 10000176,
"orders": [
{
"typeid": 34,
"buy": [document, document, document, ...],
"sell": [document, document, document, ...]
},
{
"typeid": 714,
"buy": [document, document, document, ...],
"sell": [document, document, document, ...]
}]
}
Here's more verbose a sample of my ideal output format: https://gist.github.com/BatBrain/cd3426c29ce8ca8152efd1fa06ca1392
I have been trying to use the db.collection.aggregate() to do this, running this command as an initial test step:
db.day_2016_01_01.aggregate( [{ $group : { _id : "$regionid", entries : { $push: "$$ROOT" } } },{ $out : "test_group" }], { allowDiskUse:true, cursor:{} })
But I have been getting this message, "errmsg" : "BufBuilder attempted to grow() to 134217728 bytes, past the 64MB limit."
I tried looking into how to use the cursor object, but I'm pretty confused about how to apply it in this situation, or even if that is a viable option. Any advice or solutions would be great.

How can I use mongodb projections with Go and mgo?

I am currently trying to extract a single object within a document array inside of mongodb.
This is a sample dataset:
"_id" : ObjectId("564aae61e0c4e5dddb07343b"),
"name" : "The Races",
"description" : "Horse races",
"capacity" : 0,
"open" : true,
"type" : 0,
"races" : [
{
"_id" : ObjectId("564ab9097628ba2c6ec54423"),
"race" : {
"distance" : 3000,
"user" : {
"_id" : ObjectId("5648bdbe7628ba189e011b18"),
"status" : 1,
"lastName" : "Miranda",
"firstName" : "Aramys"
}
}
},
{
"_id" : ObjectId("564ab9847628ba2c81f2f34a"),
"bet" : {
"distance" : 3000,
"user" : {
"_id" : ObjectId("5648bdbe7628ba189e011b18"),
"status" : 1,
"lastName" : "Miranda",
"firstName" : "Aramys"
}
}
},{...}
]
I can successfully query using the following in mongo:
db.tracks.find({"_id": ObjectId("564aae61e0c4e5dddb07343b")}, {"races": { $elemMatch: {"_id": ObjectId("564ab9847628ba2c81f2f34a")}}}).pretty()
I am unable to do the same using mgo and have tried the following:
Using nesting (Throws: missing type in composite literal, missing key in map literal)
// Using nesting (Throws: missing type in composite literal, missing key in map literal)
c.Find(bson.M{{"_id": bson.ObjectIdHex(p.ByName("id"))}, bson.M{"races": bson.M{"$elemMatch": bson.M{"_id": bson.ObjectIdHex(p.ByName("raceId"))}}}}).One(&result)
// Using select (Returns empty)
c.Find(bson.M{"_id": bson.ObjectIdHex(p.ByName("id"))}).Select(bson.M{"races._id": bson.ObjectIdHex(p.ByName("raceId"))}).One(&result)
//As an array (Returns empty)
c.Find([]bson.M{{"_id": bson.ObjectIdHex(p.ByName("id"))}, bson.M{"races": bson.M{"$elemMatch": bson.M{"_id": bson.ObjectIdHex(p.ByName("raceId"))}}}}).One(&result)
I am using httprouter and p.ByName("...") invocations are parameters passed to the handler.
Thanks in advance.
Would go with the Select method as the doc states that this enables selecting which fields should be retrieved for the results found, thus the projection using $elemMatch operator can be applied here in conjuction with Select, with your final query looking something like:
c.Find(bson.M{
"_id": bson.ObjectIdHex(p.ByName("id"))
}).Select(bson.M{
"races": bson.M{
"$elemMatch": bson.M{
"_id": bson.ObjectIdHex(p.ByName("raceId"))
}
}
}).One(&result)

Listing, counting factors of unique Mongo DB values over all keys

I'm preparing a descriptive "schema" (quelle horreur) for a MongoDB I've been working with.
I used the excellent variety.js to create a list of all keys and show coverage of each key. However, in cases where the values corresponding to the keys have a small set of values, I'd like to be able to list the entire set as "available values." In R, I'd be thinking of these as the "factors" for the categorical variable, ie, gender : ["M", "F"].
I know I could just use R + RMongo, query each variable, and basically do the same procedure I would to create a histogram, but I'd like to know the proper Mongo.query()/javascript/Map,Reduce way to approach this. I understand the db.collection.aggregate() functions are designed for exactly this.
Before asking this, I referenced:
http://docs.mongodb.org/manual/reference/aggregation/
http://docs.mongodb.org/manual/reference/method/db.collection.distinct/
How to query for distinct results in mongodb with python?
Get a list of all unique tags in mongodb
http://cookbook.mongodb.org/patterns/count_tags/
But can't quite get the pipeline order right. So, for example, if I have documents like these:
{_id : 1, "key1" : "value1", "key2": "value3"}
{_id : 2, "key1" : "value2", "key2": "value3"}
I'd like to return something like:
{"key1" : ["value1", "value2"]}
{"key2" : ["value3"]}
Or better, with counts:
{"key1" : ["value1" : 1, "value2" : 1]}
{"key2" : ["value3" : 2]}
I recognize one problem with doing this will be any values that have a wide range of different values---so, text fields, or continuous variables. Ideally, if there were more than x different possible values, it would be nice to truncate, say to no more than 20 unique values. If I find it's actually more, I'd query that variable directly.
Is this something like:
db.collection.aggregate(
{$limit: 20,
$group: {
_id: "$??varname",
count: {$sum: 1}
}})
First, how can I reference ??varname? for the name of each key?
I saw this link which had 95% of it:
Binning and tabulate (unique/count) in Mongo
with...
input data:
{ "_id" : 1, "age" : 22.34, "gender" : "f" }
{ "_id" : 2, "age" : 23.9, "gender" : "f" }
{ "_id" : 3, "age" : 27.4, "gender" : "f" }
{ "_id" : 4, "age" : 26.9, "gender" : "m" }
{ "_id" : 5, "age" : 26, "gender" : "m" }
This script:
db.collection.aggregate(
{$project: {gender:1}},
{$group: {
_id: "$gender",
count: {$sum: 1}
}})
Produces:
{"result" :
[
{"_id" : "m", "count" : 2},
{"_id" : "f", "count" : 3}
],
"ok" : 1
}
But what I don't understand is how could I do this generically for an unknown number/name of keys with a potentially large number of return values? This sample knows the key name is gender, and that the response set will be small (2 values).
If you already ran a script that outputs the names of all keys in the collection, you can generate your aggregation framework pipeline dynamically. What that means is either extending the variety.js type script or just writing your own.
Here is what it might look like in JS if passed an array called "keys" which has several non-"_id" named fields (I'm assuming top level fields and that you don't care about arrays, embedded documents, etc).
keys = ["key1", "key2"];
group = { "$group" : { "_id" : null } } ;
keys.forEach( function(f) {
group["$group"][f+"List"] = { "$addToSet" : "$" + f }; } );
db.collection.aggregate(group);
{
"result" : [
{
"_id" : null,
"key1List" : [
"value2",
"value1"
],
"key2List" : [
"value3"
]
}
],
"ok" : 1
}

Map reduce in mongodb

I have mongo documents in this format.
{"_id" : 1,"Summary" : {...},"Examples" : [{"_id" : 353,"CategoryId" : 4},{"_id" : 239,"CategoryId" : 28}, ... ]}
{"_id" : 2,"Summary" : {...},"Examples" : [{"_id" : 312,"CategoryId" : 2},{"_id" : 121,"CategoryId" : 12}, ... ]}
How can I map/reduce them to get a hash like:
{ [ result[categoryId] : count_of_examples , .....] }
I.e. count of examples of each category.
I have 30 categories at all, all specified in Categories collection.
If you can use 2.1 (dev version of upcoming release 2.2) then you can use Aggregation Framework and it would look something like this:
db.collection.aggregate( [
{$project:{"CatId":"$Examples.CategoryId","_id":0}},
{$unwind:"$CatId"},
{$group:{_id:"$CatId","num":{$sum:1} } },
{$project:{CategoryId:"$_id",NumberOfExamples:"$num",_id:0 }}
] );
The first step projects the subfield of Examples (CategoryId) into a top level field of a document (not necessary but helps with readability), then we unwind the array of examples which creates a separate document for each array value of CatId, we do a "group by" and count them (I assume each instance of CategoryId is one example, right?) and last we use projection again to relabel the fields and make the result look like this:
"result" : [
{
"CategoryId" : 12,
"NumberOfExamples" : 1
},
{
"CategoryId" : 2,
"NumberOfExamples" : 1
},
{
"CategoryId" : 28,
"NumberOfExamples" : 1
},
{
"CategoryId" : 4,
"NumberOfExamples" : 1
}
],
"ok" : 1