How do I count the number of referrals using aggregation in MongoDB - mongodb

This is the schema:
{
"_id" : ObjectId("54f8d7ad92ccf803008a0e4f"),
"personal" : {
"name" : "test",
"placa" : "BBB222"
},
"recruiter" : {
"user_id" : "541cba6fe4b0288d56081fe2",
"date" : 1425594285410,
"name" : "Mario Hart",
"rol" : "greeter",
"channel" : "referido",
"referred" : "VERA"
},
I want to create a list of names counting the number of referrals that each one has. I tried the following, but it's not working at all. This is the code I've written.
db.drivers.aggregate( {
    $group: { _id: { "$personal.name",
                     "$recruiter.referred"
                   },
              total_recommendations: { $sum: 1 } }
} ])
This is not working, I can't make this code work.
The server I am running is on version 2.6.8.

Your syntax is wrong. $ prefixes are for variable references to fields or otherwise reserved for "operators" when used in the "key" part of "key/value" notation. It also never fails to surprise me how many people use a "compound key" notation in a grouping _id when they have only one field:
db.drivers.aggregate([
    { "$group": {
        "_id": "$recruiter.referred",
        "total_recommendations": { "$sum": 1 }
    }}
])
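If you genuinely want to group on both fields, a compound _id needs a named key for each field. A sketch, assuming you want one bucket per (name, referred) pair:
db.drivers.aggregate([
    { "$group": {
        "_id": {
            "name": "$personal.name",
            "referred": "$recruiter.referred"
        },
        "total_recommendations": { "$sum": 1 }
    }}
])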

Related

Scala MongoDB aggregate group and match query

I want a query that will take the latest version out of each document, and check if some given string (applicationId) is in the list allowedApplications.
Example documents:
{
"applicationId" : "y...",
"allowedApplications": ["x..."],
"name" : "some-name",
"version" : 3
}
{
"applicationId" : "y...",
"allowedApplications": ["x..."],
"name" : "some-name",
"version" : 2
}
{
"applicationId" : "x...",
"allowedApplications": ["y..."],
"name" : "some-other-name",
"version" : 1
}
So the MongoDB query is:
db.getCollection('..').aggregate([
    { "$match": { "allowedApplications": "x..." }},
    { "$group": { "_id": "$name", "version": { "$max": "$version" }}}
])
And the query will output the name and version (I'll perhaps add the allowedApplications later).
I'm trying now to write this in Scala's mongodb driver.
I tried a bunch of stuff, for example:
collection
    .aggregate(List(
        `match`(equal("allowedApplications", "x..")),
        group("$name", addToSet("version", addToSet("$max", "$version")))
    ))
But couldn't get it to work.
Using Scala 2.13.1 and mongo-scala-driver 4.1.0.
Any help would be appreciated.
Found the answer:
collection
    .aggregate(List(
        `match`(equal("allowedApplications", "x...")),
        group("$name", max("version", "$version"))
    ))
The structure isn't quite the same as the shell query, but you just use the accumulator function (max here) for the grouped field.

MongoDB query returning no results

I'm new to mongo and am trying to do a very simple query in this collection:
{
"_id" : ObjectId("gdrgrdgrdgdr"),
"administrators" : {
"-HGFsfes" : {
"name" : "Jose",
"phone" : NumberLong(124324)
},
"-HGFsfqs" : {
"name" : "Peter",
"phone" : "+43242342"
}
},
"countries" : {
"-dgfgrdg : {
"lang" : "en",
"name" : "Canada"
},
"-grdgrdg" : {
"lang" : "en",
"name" : "USA"
}
}
}
How do I make a query that returns the results of administrators with name like "%Jos%" for example.
What I did until now is this: db.getCollection('coll').find({ "administrators.name": /Jos/});
And variations of this. But everything I tried returns zero results.
What am I doing wrong?
Thanks in advance!
Your mistake is that administrators is not an array, but an object whose fields are themselves objects with a name field. The right query would be
{ "administrators.-HGFsfes.name": /Jos/}
Unfortunately, this way you're only querying the -HGFsfes name field, not the name fields of the other administrators.
To achieve what you want, the only thing to do is to replace the administrators object with an array, so your document will look like this:
{
"administrators" : [
{
"id" : "-HGFsfes",
"name" : "Jose",
"phone" : 124324
},
{
"id" : "-HGFsfqs",
"name" : "Peter",
"phone" : "+43242342"
}
],
countries : ...
}
This way your query will work.
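For example, with that structure a plain dotted-path query matches any array element (a sketch against the restructured document, using your original collection name):
db.getCollection('coll').find({ "administrators.name": /Jos/ })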
BUT it will return documents where at least one entry in the administrators array has a matching name field. To return only the matching administrator element, and not the whole document, check this question and my answer for an unwind/match/group aggregation pipeline.
You need to use a query like this:
db.collection_name.find({})
So if your collection name is coll, then it would be:
db.coll.find({"administrators.-HGFsfes.name": /Jos/});
See the MongoDB documentation on regular-expression queries for LIKE-style matching in Mongo.
Also, try with regex pattern like this:
db.coll.find({"administrators..-HGFsfes.name": {"$regex":"Jos", "$options":"i"}}});
It will give you only one result, because your data is not an array.
If you want multiple results, then you need to restructure your data.
OK, I think I've found a better solution for you, with the aggregation framework.
Run the following query on your current collection; it will return all administrators whose name is "LIKE" jos (case-insensitive thanks to the i option):
db.test1.aggregate(
    [
        { $project: { administrators: { $objectToArray: "$administrators" } } },
        { $unwind: { path: "$administrators" } },
        { $replaceRoot: { newRoot: "$administrators" } },
        { $match: { "v.name": /jos/i } }
    ]
);
Output
{
"k" : "-HGFsfes",
"v" : {
"name" : "Jose",
"phone" : NumberLong(124324)
}
}
"k" and "v" are coming from "$objectToArray" operator, you can add a $project stage to rename them (or discard if k value doesn't matter)
I'm not sure about testing in Robomongo, but in Studio 3T (formerly Robomongo) you can either paste this query into the IntelliShell console, or import it in the aggregation tab (small 'paste from the clipboard' icon).
Hope it helps.

Mongodb Update/Upsert array exact match

I have a collection :
gStats : {
"_id" : "id1",
"criteria" : ["key1":"value1", "key2":"value2"],
"groups" : [
{"id":"XXXX", "visited":100, "liked":200},
{"id":"YYYY", "visited":30, "liked":400}
]
}
I want to be able to update a document in the stats array for a given array of criteria (exact match).
I try to do this in 2 steps:
Pull the stat document from the array for a given "id":
db.gStats.update({
"criteria" : {$size : 2},
"criteria" : {$all : [{"key1" : "2096955"},{"value1" : "2015610"}]}
},
{
$pull : {groups : {"id" : "XXXX"}}
}
)
Push the new document
db.gStats.findAndModify({
query : {
"criteria" : {$size : 2},
"criteria" : {$all : [{"key1" : "2015610"}, {"key2" : "2096955"}]}
},
update : {
$push : {groups : {"id" : "XXXX", "visited" : 29, "liked" : 144}}
},
upsert : true
})
The pull query works perfectly.
The push query gives an error:
2014-12-13T15:12:58.571+0100 findAndModifyFailed failed: {
    "value" : null,
    "errmsg" : "exception: Cannot create base during insert of update. Caused by :ConflictingUpdateOperators Cannot update 'criteria' and 'criteria' at the same time",
    "code" : 12,
    "ok" : 0
} at src/mongo/shell/collection.js:614
Neither query is actually working. You cannot use a key name like "criteria" more than once unless it is under an operator such as $and. You are also specifying different fields (i.e. groups) and querying elements that do not exist in your sample document.
So it's hard to tell what you really want to do here. But the error is essentially caused by the first issue I mentioned, with a little something extra. So really your { "$size": 2 } condition is being ignored and only the second condition is applied.
A valid query form should look like this:
query: {
"$and": [
{ "criteria" : { "$size" : 2 } },
{ "criteria" : { "$all": [{ "key1": "2015610" }, { "key2": "2096955" }] } }
]
}
As each set of conditions is specified within the array provided by $and the document structure of the query is valid and does not have a hash-key name overwriting the other. That's the proper way to write your two conditions, but there is a trick to making this work where the "upsert" is failing due to those conditions not matching a document. We need to overwrite what is happening when it tries to apply the $all arguments on creation:
update: {
"$setOnInsert": {
"criteria" : [{ "key1": "2015610" }, { "key2": "2096955" }]
},
"$push": { "stats": { "id": "XXXX", "visited": 29, "liked": 144 } }
}
That uses $setOnInsert so that when the "upsert" is applied and a new document is created, the values specified here are used, rather than the field values set in the query portion of the statement.
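Putting the two pieces together, the whole statement would look something like this (a sketch reusing the values from your question):
db.gStats.findAndModify({
    query : {
        "$and": [
            { "criteria" : { "$size" : 2 } },
            { "criteria" : { "$all": [{ "key1": "2015610" }, { "key2": "2096955" }] } }
        ]
    },
    update : {
        "$setOnInsert": {
            "criteria" : [{ "key1": "2015610" }, { "key2": "2096955" }]
        },
        "$push": { "stats": { "id": "XXXX", "visited": 29, "liked": 144 } }
    },
    upsert : true
})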
Of course, if what you are really looking for is truly an exact match of the content in the array, then just use that for the query instead:
query: {
"criteria" : [{ "key1": "2015610" }, { "key2": "2096955" }]
}
Then MongoDB will be happy to apply those values when a new document is created and does not get confused on how to interpret the $all expression.

Ensure Unique indexes in embedded doc in mongodb

Is there a way to make a subdocument within a list have a unique field in mongodb?
document structure:
{
"_id" : "2013-08-13",
"hours" : [
{
"hour" : "23",
"file" : [
{
"date_added" : ISODate("2014-04-03T18:54:36.400Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
},
{
"date_added" : ISODate("2014-04-03T18:54:36.410Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
},
{
"date_added" : ISODate("2014-04-03T18:54:36.402Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
},
{
"date_added" : ISODate("2014-04-03T18:54:36.671Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
}
]
}
]
}
I want to make sure that the document's hours.hour value has a unique item when inserted. The issue is that hours is a list. Can you ensureIndex in this way?
Indexes are not the tool for ensuring uniqueness in an embedded array, rather they are used across documents to ensure that certain fields do not repeat there.
As long as you can be certain that the content you are adding does not differ from any other value in any way then you can use the $addToSet operator with update:
db.collection.update(
    { "_id": "2013-08-13", "hours.hour": "23" },
    { "$addToSet": {
        "hours.$.file": {
            "date_added" : ISODate("2014-04-03T18:54:36.671Z"),
            "name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
        }
    }}
)
So that document would not be added, as there is already an element matching those exact values within the target array. If the content was different (and that means any part of the content), then a new item would be added.
For anything else you would need to maintain that manually by loading up the document and inspecting the elements of the array. Say for a different "filename" with exactly the same timestamp.
Problems with your Schema
Now the question is answered I want to point out the problems with your schema design.
Dates as strings are "horrible". You may think you need them but you do not. See the aggregation framework date operators for more on this.
You have nested arrays, which generally should be avoided. The general problems are shown in the documentation for the positional $ operator. That says you only get one match on position, and that is always the "top" level array. So updating beyond adding things as shown above is going to be difficult.
A better schema pattern for you is to simply do this:
{
"date_added" : ISODate("2014-04-03T18:54:36.400Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
},
{
"date_added" : ISODate("2014-04-03T18:54:36.410Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
},
{
"date_added" : ISODate("2014-04-03T18:54:36.402Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
},
{
"date_added" : ISODate("2014-04-03T18:54:36.671Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
}
If that is in its own collection then you can always actually use indexes to ensure uniqueness. The aggregation framework can break down the date parts and hours where needed.
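For example, a unique compound index on that separate collection (named files here purely for illustration) would reject exact duplicates; which fields you include depends on what has to be unique:
db.files.ensureIndex({ "name": 1, "date_added": 1 }, { unique: true })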
Where you must have that as part of another document then try at least to avoid the nested arrays. This would be acceptable but not as flexible as separating the entries:
{
"_id" : "2013-08-13",
"hours" : {
"23": [
{
"date_added" : ISODate("2014-04-03T18:54:36.400Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
},
{
"date_added" : ISODate("2014-04-03T18:54:36.410Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
},
{
"date_added" : ISODate("2014-04-03T18:54:36.402Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
},
{
"date_added" : ISODate("2014-04-03T18:54:36.671Z"),
"name" : "1376434800_file_output_2014-03-10-09-27_44.csv"
}
]
}
}
It depends on your intended usage: the last form would not allow you to do any type of aggregation comparison across hours within a day, at least not in any simple way. The former does this easily, and you can still break down selections by day and hour with ease.
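For example, with the separate-documents pattern a per-day and per-hour breakdown is a straightforward aggregation (a sketch, again assuming a files collection):
db.files.aggregate([
    { "$group": {
        "_id": {
            "year": { "$year": "$date_added" },
            "month": { "$month": "$date_added" },
            "day": { "$dayOfMonth": "$date_added" },
            "hour": { "$hour": "$date_added" }
        },
        "count": { "$sum": 1 }
    }}
])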
Then again, if you are only ever appending information then your existing schema should be fine. But be aware of the possible issues and alternatives.

MongoDb - How to search BSON composite key exactly?

I have a collection that stored information about devices like the following:
/* 1 */
{
"_id" : {
"startDate" : "2012-12-20",
"endDate" : "2012-12-30",
"dimensions" : ["manufacturer", "model"],
"metrics" : ["deviceCount"]
},
"data" : {
"results" : "1"
}
}
/* 2 */
{
"_id" : {
"startDate" : "2012-12-20",
"endDate" : "2012-12-30",
"dimensions" : ["manufacturer", "model"],
"metrics" : ["deviceCount", "noOfUsers"]
},
"data" : {
"results" : "2"
}
}
/* 3 */
{
"_id" : {
"dimensions" : ["manufacturer", "model"],
"metrics" : ["deviceCount", "noOfUsers"]
},
"data" : {
"results" : "3"
}
}
And I am trying to query the documents using the _id field which will be unique. The problem I am having is that when I query for all the different attributes as in:
db.collection.find({$and: [{"_id.dimensions":{ $all: ["manufacturer","model"], $size: 2}}, {"_id.metrics": { $all:["noOfUsers","deviceCount"], $size: 2}}]});
This matches documents 2 and 3 (I don't care about the order of the attribute values), but I would like to only get 3 back. How can I say that there should not be any other attributes in _id than those I specify in the search query?
Please advise. Thanks.
Unfortunately, I think the closest you can get to narrowing your query results to just unordered _id.dimensions and unordered _id.metrics requires you to know the other possible fields in the _id subdocument, e.g. startDate and endDate.
db.collection.find({$and: [
{"_id.dimensions":{ $all: ["manufacturer","model"], $size: 2}},
{"_id.metrics": { $all:["noOfUsers","deviceCount"], $size: 2}},
{"_id.startDate":{$exists:false}},
{"_id.endDate":{$exists:false}}
]});
If you don't know the set of possible fields in _id, then the other possible solution would be to specify the exact _id that you want, eg.
db.collection.find({"_id" : {
"dimensions" : ["manufacturer", "model"],
"metrics" : ["deviceCount", "noOfUsers"]
}})
but this means that the order of _id.dimensions and _id.metrics is significant. This last query does a document match on the exact BSON representation of _id.
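To illustrate how significant that order is, the same query with the metrics array reversed matches nothing, because the embedded document is compared against the stored BSON as a whole:
db.collection.find({"_id" : {
    "dimensions" : ["manufacturer", "model"],
    "metrics" : ["noOfUsers", "deviceCount"]
}})
// returns no documents: document 3 stores "metrics" as ["deviceCount", "noOfUsers"]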