Mongodb group by pair - mongodb

I have data like this: DATA. I am trying to group by domain name, and I want the result to look like this:
[
{ "domain": "gmail_com_",
"A": 3,
"B": 5,
"C": 3 },
............
]
Where A and B are the lengths of the lists for the matching domain names, and C is the number of duplicated IP addresses. But as you can see in the result, if a domain name is present in more than two different timestamps, it is only grouped with the first two. I want to group two by two with all the possibilities: in my example, facebook is present in 3 different timestamps, so we should have three different pairs. Can someone help me?
Thanks

To get every possible pair of two discrete values from a series of documents, you will need to:
gather the values into an array
assign an index of some sort to identify each
duplicate the array
unwind both duplicates
eliminate pairs with the same index
An aggregation pipeline might look like so:
db.collection.aggregate([
{$group:{
_id:"$domain",
list:{$push:"$ip"}
}},
{$project:{
numberedList:{
$reduce: {
input: "$list",
initialValue: {a:[],c:0},
in:{
a:{$concatArrays:["$$value.a",[{ip:"$$this",idx:"$$value.c"}]]},
c:{$add:["$$value.c",1]}
}}}}},
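{$project:{
// $reduce produced {a:[...],c:n}; the numbered array is in field "a"
left:"$numberedList.a",
right:"$numberedList.a"
}},
{$unwind:"$left"},
{$unwind:"$right"},
// $ne keeps both orderings of each pair; use $lt instead to keep each unordered pair once
{$match:{$expr:{$ne:["$left.idx","$right.idx"]}}}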
])
This should leave you with the domain name in _id and a pair of results in left and right, which you can then process as needed.
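As a rough illustration of the steps listed above, here is a plain JavaScript sketch that runs in memory rather than in the database. The helper name `pairs` and the sample timestamps are made up for the example. It compares an "indexes differ" filter, which keeps both orderings of every pair, with an "index is smaller" filter, which keeps each unordered pair once.

```javascript
// Hypothetical helper (not part of MongoDB): mirrors the pipeline's steps in memory.
function pairs(values, keep) {
  // Number each value, as the $reduce stage does.
  const numbered = values.map((value, idx) => ({ value, idx }));
  const out = [];
  // The two nested loops play the role of the two $unwind stages.
  for (const left of numbered) {
    for (const right of numbered) {
      // The callback plays the role of the final $match stage.
      if (keep(left.idx, right.idx)) out.push([left.value, right.value]);
    }
  }
  return out;
}

const timestamps = ["t1", "t2", "t3"]; // made-up sample values
const ordered = pairs(timestamps, (l, r) => l !== r); // like $ne
const unordered = pairs(timestamps, (l, r) => l < r); // like $lt
console.log(ordered.length, unordered.length);
```

For a domain seen at three timestamps, the second variant yields the three pairs the question asks for.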

Related

How to filter an array in a MongoDB document based on query on collection?

Suppose the following collection of documents that include an 'user_id' field and an array of ids that this user follows
{"user_id": 1 , "follows" : [2,30]},
{"user_id": 2 , "follows" : [1,40]},
{"user_id": 3 , "follows" : [2,50]},
... large collection
I would like to filter out of "follows" the numbers that don't exist in the collection as a user_id. Think of it as a data cleaning procedure, where follows pointing to users that no longer exist need to be deleted. Example output for the input above:
{"user_id": 1 , "follows" : [2]},
{"user_id": 2 , "follows" : [1]},
{"user_id": 3 , "follows" : [2]},
... large collection
I thought about a projection with a "$filter", but I can't find an expression for checking that a document with that id exists in the whole collection (as $filter seems to be limited to the current document).
Then I tried to aggregate a set of all ids to use in an $in condition, but that failed miserably due to the size of the collection (too large object error).
I thought about unwinding, but I'm hitting the same rock: I can't find an expression for $match or $project that answers the question "Does this value of 'follows' exist as a 'user_id' in the collection?"
The only other option I see is doing the filtering client side with a few independent queries, but I wanted to check with the community first in case I'm missing something.
You could do a $lookup, like this:
$lookup: {
from: 'users',
localField: 'follows',
foreignField: 'user_id',
as: 'follows'
}
This will produce a result like { user_id: 1, follows: [ {user_id: 2, follows: [1, 40] } ] }. Then you should be able to get the result you want with $addFields (to map follows to follows.user_id).
$addFields: { follows: "$follows.user_id" }
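To make the effect of the two stages concrete, here is a small sketch in plain JavaScript. It first assembles the stages into one pipeline array (assuming the collection is named `users`, as in the `from` field above), then approximates the result in memory on the sample documents from the question. The function name `cleanFollows` is made up for the example, and the in-memory version is only an approximation: it keeps the original order and any duplicate ids, which $lookup does not promise.

```javascript
// Sample documents from the question.
const users = [
  { user_id: 1, follows: [2, 30] },
  { user_id: 2, follows: [1, 40] },
  { user_id: 3, follows: [2, 50] },
];

// The two stages from the answer, assembled into one pipeline.
// In the shell this would be passed to db.users.aggregate(pipeline).
const pipeline = [
  { $lookup: { from: "users", localField: "follows", foreignField: "user_id", as: "follows" } },
  { $addFields: { follows: "$follows.user_id" } },
];

// Hypothetical in-memory approximation of what the pipeline computes:
// keep only the followed ids that exist as a user_id somewhere in the collection.
function cleanFollows(docs) {
  const existing = new Set(docs.map((d) => d.user_id));
  return docs.map((d) => ({
    ...d,
    follows: d.follows.filter((id) => existing.has(id)),
  }));
}

console.log(cleanFollows(users));
```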

MongoDB select distinct where not in select distinct

In a MongoDB database, I need to find all the records in one collection whose values aren't present in a different collection.
Say I have 2 collections
1) user_autos
{
make: string,
user_id: objId
}
2) auto_makes
{
mfg: string,
make: string
}
I need to find all the "makes" that are not part of the "master makes" list.
I want the equivalent of this SQL:
SELECT DISTINCT
a.make
FROM
user_autos a
WHERE
a.make NOT IN (
SELECT DISTINCT
b.make
FROM
auto_makes b
)
Help please
To achieve this, use an aggregation with the $lookup pipeline stage.
$lookup performs a left outer join between two collections, so documents in 'user_autos' that have no match in 'auto_makes' get an empty array in the joined field ('m' below).
Keep only those documents, then $group them by 'make' to get the distinct values.
You can do it as below.
db.user_autos.aggregate([
{$lookup:{
from:"auto_makes",
localField:"make",
foreignField:"make",
as:"m"
}},
// $lookup always adds "m"; it is an empty array when nothing matched
{$match:{
m:{$size:0}
}},
{$group:{
_id:"$make"
}},
//if you want to get the distinct 'make' values as an array of single
//document, add another $group stage.
{$group:{
_id:"",
make_list:{$addToSet:"$_id"}
}}
])
Visit https://docs.mongodb.com/manual/reference/operator/aggregation/group/ ,
https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
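For comparison with the SQL in the question, the same "distinct values not in the other collection" logic can be sketched in plain JavaScript. The sample documents and the function name `unknownMakes` are invented for the illustration; only the field names `make`, `mfg` and `user_id` come from the question.

```javascript
// Made-up sample data using the field names from the question.
const userAutos = [
  { make: "Ford", user_id: 1 },
  { make: "Fnord", user_id: 2 },
  { make: "Fnord", user_id: 3 },
  { make: "Toyota", user_id: 4 },
];
const autoMakes = [
  { mfg: "Ford Motor Company", make: "Ford" },
  { mfg: "Toyota Motor Corporation", make: "Toyota" },
];

// Hypothetical helper: distinct makes in `autos` with no counterpart in `makes`.
function unknownMakes(autos, makes) {
  const known = new Set(makes.map((m) => m.make)); // the inner SELECT DISTINCT
  const missing = new Set();                       // the outer SELECT DISTINCT
  for (const auto of autos) {
    if (!known.has(auto.make)) missing.add(auto.make); // NOT IN
  }
  return [...missing];
}

console.log(unknownMakes(userAutos, autoMakes));
```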

Find current and previous documents from mongo db

I have to return 2 documents from a single query. The first is the one matching the value I give in the query, and the second is the previous one (in sorted order).
I am able to design both separately. The below code gives separate outputs.
db.collection.find({'_id':'value1'})
db.collection.find({'_id': {'$lt': 'value1'}}).sort({'_id':-1}).limit(1)
How do I combine them, so that when I execute the query from my application it returns both documents?
Also, how do I fetch only a specific key instead of the entire document?
You can use $lte instead of $lt and limit to 2. As long as a document with _id 'value1' exists, this returns the same two documents as the separate queries:
db.collection.find({ _id: { $lte: 'value1' } }, { _id: 1, yourKey: 1 }).sort({_id: -1}).limit(2)
EDIT: to get specific keys you need to specify them as second argument of .find()
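The behavior of the combined query can be checked with a small in-memory sketch in plain JavaScript. The function name `currentAndPrevious` and the sample documents are made up; `yourKey` is the placeholder key from the answer. The sketch also shows the caveat that when no document has the requested _id, the two results are simply the two closest smaller ones.

```javascript
// Hypothetical in-memory stand-in for find(...).sort({_id: -1}).limit(2).
function currentAndPrevious(docs, id) {
  return docs
    .filter((d) => d._id <= id)                                    // { _id: { $lte: id } }
    .sort((a, b) => (a._id < b._id ? 1 : a._id > b._id ? -1 : 0))  // sort({ _id: -1 })
    .slice(0, 2)                                                   // limit(2)
    .map((d) => ({ _id: d._id, yourKey: d.yourKey }));             // projection
}

// Made-up sample documents.
const docs = [
  { _id: "value0", yourKey: "a", other: 1 },
  { _id: "value1", yourKey: "b", other: 2 },
  { _id: "value2", yourKey: "c", other: 3 },
];

console.log(currentAndPrevious(docs, "value1"));
// No document has _id "value1x", so both results are earlier documents.
console.log(currentAndPrevious(docs, "value1x"));
```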

Mongodb counting the number records where column present after grouping

I have a collection of records that look like this:
{'customer':'unique_value',
'Date trial':'12/1/2013',
'Date success':'12/3/2013'}
The Date success field is not present in every record, only the records where there was a success have a success field. Each unique customer may have dozens or hundreds of trials. I'd like to get a list of the customers, the number of trials, and the number of successes
db.collection.aggregate([ {$group:{_id:'$account',attempts:{$sum:1} }} ]);
This gives me the number of trials for each customer, but I can't come up with a way to also get the number of records for that customer where the Date success field is present.
Any ideas?
Good question.
You can use some conditional operators to evaluate the fields. Here is the usage combining $cond with $ifNull
db.collection.aggregate([
{$group: { _id: "$account",
trial: {$sum: {$cond: [ {$ifNull: ["$trial",0]}, 1, 0]}},
success: {$sum: {$cond: [ {$ifNull: ["$success",0]}, 1, 0]}}
}}
])
Let's break that down; even though it's brief, it's still a mouthful.
When we test a field with $ifNull ( ["$trial",0] ), we are saying that if the field does not exist (or is null), return 0, which is interpreted as false. When the field is present, its value is returned and is interpreted as logically true.
Then $cond, which works like a ternary operator, evaluates that condition and returns its second argument when the condition is true, or its third argument when it is false.
Finally, in the $group we apply the $sum operator to that result. Since $cond gives us 1 where the condition was true and 0 where it was false, the sum is the number of documents in which the field was present, which is what your query needs.
Same principle goes for $project. You are shaping your output document to have both fields whether they are present or not in the source. The value being assigned to the field depends on whether it exists or not. In this case 0 where it's not there and 1 where it is. Then just sum the results.
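The counting logic can be sketched outside the database in plain JavaScript. The field names `account`, `trial` and `success` follow the pipeline in the answer rather than the sample document in the question, and the records and the function name `countByAccount` are made up for the illustration.

```javascript
// Made-up records; field names follow the pipeline in the answer.
const records = [
  { account: "a", trial: "12/1/2013", success: "12/3/2013" },
  { account: "a", trial: "12/5/2013" },
  { account: "b", trial: "12/2/2013" },
];

// Hypothetical helper: per account, count the documents where each field is present.
function countByAccount(docs) {
  const totals = new Map();
  for (const doc of docs) {
    const t = totals.get(doc.account) || { trial: 0, success: 0 };
    // `!= null` is false for missing or null fields, like the $ifNull fallback;
    // the ternary plays the role of $cond, and += plays the role of $sum.
    t.trial += doc.trial != null ? 1 : 0;
    t.success += doc.success != null ? 1 : 0;
    totals.set(doc.account, t);
  }
  return Object.fromEntries(totals);
}

console.log(countByAccount(records));
```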

mongodb group by first character

I have a problem in mongodb.
I want to create an aggregation whose result looks like this:
A 10
B 2
C 4
D 9
E 3
...
I have a field named words in my collection, and I want to group my records by the first character of that field.
I found a solution for SQL but not for MongoDB.
I would be very grateful for your help.
You don't show what the docs in your collection look like, but you can use the aggregate collection method to do this:
// Group by the first letter of the 'words' field of each doc in the 'test'
// collection while generating a count of the docs in each group.
// $substrCP (MongoDB 3.4+) counts code points; $substr is a deprecated alias of $substrBytes.
db.test.aggregate([{$group: {_id: {$substrCP: ['$words', 0, 1]}, count: {$sum: 1}}}])
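A plain JavaScript sketch of the same grouping, run in memory on made-up documents, shows what the result looks like. The function name `countByFirstChar` and the sample data are assumptions for the example. Note that the grouping is case-sensitive, so "A" and "a" form separate groups; in the pipeline, wrapping the substring in $toUpper would be one way to merge them.

```javascript
// Hypothetical helper: count documents per first character of `words`.
function countByFirstChar(docs) {
  const counts = {};
  for (const doc of docs) {
    const key = String(doc.words).slice(0, 1); // like taking a 1-character substring at offset 0
    counts[key] = (counts[key] || 0) + 1;      // like count: {$sum: 1}
  }
  return counts;
}

// Made-up sample documents.
const sample = [
  { words: "Apple" },
  { words: "Avocado" },
  { words: "Banana" },
  { words: "apple" },
];

console.log(countByFirstChar(sample));
```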