MongoDB Find values passed in that don't match - mongodb

I'm currently stuck on an issue with MongoDB aggregation. I have an array of _id values, and I need to check whether each one exists in a specific collection.
Example:
I have 3 records in 'Collection 1' with _id 1,2,3. I can find the matching values using:
$match: {
_id: {
$in: [1, 2, 3, 4]
}
}
However, what I want to know is which of the values I passed in (1, 2, 3, 4) don't match a record. (In this case, _id 4 has no matching record.)
So instead of returning the records with _id 1, 2, 3, it needs to return the _id that doesn't exist. In this example, '_id: 4'.
The query should also disregard any extra records in the collection. For example, if the collection held records with IDs 1-10 and I passed in a query to determine whether the _ids 1, 7, 15 existed, the value I'm expecting would be along the lines of '_id: 15 doesn't exist'.
My first thought was to use $project within an aggregation to hold each _id that was passed in, and then attach each matching record in the collection to the corresponding _id. E.g.:
Record 1:
{
_id: 1,
Collection1: [
record details: ...,
...
...
]
},
{
_id: 2,
Collection1: [] // This _id was passed in but has no matching record
}
However, I can't seem to get a working example of this. Any help would be appreciated!

If the input documents are:
{ _id: 1 },
{ _id: 2 },
{ _id: 5 },
{ _id: 10 }
And the array to match is:
var INPUT_ARRAY = [ 1, 7, 15 ]
The following aggregation:
db.test.aggregate( [
{
$match: {
_id: {
$in: INPUT_ARRAY
}
}
},
{
$group: {
_id: null,
matches: { $push: "$_id" }
}
},
{
$project: {
ids_not_exist: { $setDifference: [ INPUT_ARRAY, "$matches" ] },
_id: 0
}
}
] )
Returns:
{ "ids_not_exist" : [ 7, 15 ] }
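One edge case worth noting: when none of the input ids match, $group receives no documents and emits nothing, so the pipeline returns an empty result rather than the full input list. Below is a minimal sketch of handling that from application code. buildMissingIdsPipeline and missingIdsFromResult are hypothetical helper names (not part of MongoDB), and the aggregate result is assumed to have already been converted to an array of documents.

```javascript
// Hypothetical helper: builds the same three stages as the answer above
// for an arbitrary array of ids.
function buildMissingIdsPipeline(inputIds) {
  return [
    { $match: { _id: { $in: inputIds } } },
    { $group: { _id: null, matches: { $push: "$_id" } } },
    { $project: { _id: 0, ids_not_exist: { $setDifference: [inputIds, "$matches"] } } }
  ];
}

// Hypothetical helper: interprets the array of result documents.
// If nothing matched, $group emits no document, so every input id is missing.
function missingIdsFromResult(inputIds, resultDocs) {
  if (resultDocs.length === 0) {
    return [...inputIds];
  }
  return resultDocs[0].ids_not_exist;
}
```

In the shell this could be used as missingIdsFromResult(ids, db.test.aggregate(buildMissingIdsPipeline(ids)).toArray()).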

Are you looking for $not ?
MDB Docs

Related

Filter only documents that have ALL FIELDS non null (with aggregation framework)

I have many documents, but I want to figure out how to get only documents that have ALL FIELDS non null.
Suppose I have these documents:
[
{
'a': 1,
'b': 2,
'c': 3
},
{
'a': 9,
'b': 12
},
{
'a': 5
}
]
Only the first document has ALL FIELDS non-null, so filtering these documents should return only the first. How can I do this?
If you want only the documents that have ALL FIELDS, without listing every field in the filter like this: { a: {$exists : true}, b : {$exists : true}, c : {$exists : true}}, note that listing them becomes impractical once a document has tens of fields. Since you don't want to list them all, we can try the following hack, provided it performs well enough. It assumes a fixed schema in which every document can contain only the fields a, b and c (plus the default _id) and nothing else.
If you know the total number of fields, you can check each document's field count; a document with the full count has all the fields. Something like below:
db.collection.aggregate([
/** add a new field which counts no.of fields in the document */
{
$addFields: { count: { $size: { $objectToArray: "$$ROOT" } } }
},
{
$match: { count: { $eq: 4 } } // we've 4 as 3 fields + _id
},
{
$project: { count: 0 }
}
])
Test : mongoplayground
Note : We're only checking for field existence but not checking for false values like null or [] or '' on fields. Also this might not work for nested fields.
If you instead want to check that all fields exist by name, and you can pass all the field names as input, try the query below:
db.collection.aggregate([
/** create a field with all keys/field names in the document */
{
$addFields: {
data: {
$let: {
vars: { data: { $objectToArray: "$$ROOT" } },
in: "$$data.k"
}
}
}
},
{
$match: { data: { $all: [ "b", "c", "a" ] } } /** List down all the field names from schema */
},
{
$project: { data: 0 }
}
])
Test : mongoplayground
Ref : aggregation-pipeline
You can try to use explain to check your queries performance.
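Since the question asks for non-null values and the pipelines above only test for existence, another option is to generate the filter from a list of field names. The sketch below assumes the field names are known up front; buildNonNullFilter is a hypothetical helper, not a MongoDB API. It relies on { field: { $ne: null } } excluding documents where the field is either missing or null.

```javascript
// Hypothetical helper: builds { a: { $ne: null }, b: { $ne: null }, ... }
// from a list of field names, so the fields need not be typed out by hand.
function buildNonNullFilter(fieldNames) {
  const filter = {};
  for (const name of fieldNames) {
    filter[name] = { $ne: null };
  }
  return filter;
}
```

It could then be passed to a query such as db.collection.find(buildNonNullFilter(["a", "b", "c"])).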

How to aggregate all existing field in my document [duplicate]

I got a problem when I use db.collection.aggregate in MongoDB.
I have a data structure like:
_id:...
Segment:{
"S1":1,
"S2":5,
...
"Sn":10
}
It means the following in Segment: I might have several sub attributes with numeric values. I'd like to sum them up as 1 + 5 + .. + 10
The problem is: I'm not sure about the sub attributes names since for each document the segment numbers are different. So I cannot list each segment name. I just want to use something like a for loop to sum all values together.
I tried queries like:
db.collection.aggregate([
{$group:{
_id:"$Account",
total:{$sum:"$Segment.$"}
}}
])
but it doesn't work.
You have made the classic mistake of using arbitrary field names. MongoDB is "schema-free", but that doesn't mean you don't need to think about your schema. Key names should be descriptive, and in your case, "S2" for example does not really mean anything. In order to do most kinds of queries and operations, you will need to redesign your schema to store your data like this:
_id:...
Segment:[
{ field: "S1", value: 1 },
{ field: "S2", value: 5 },
{ field: "Sn", value: 10 },
]
You can then run your query like:
db.collection.aggregate( [
{ $unwind: "$Segment" },
{ $group: {
_id: '$_id',
sum: { $sum: '$Segment.value' }
} }
] );
This then results in something like the following (with the only document from your question):
{
"result" : [
{
"_id" : ObjectId("51e4772e13573be11ac2ca6f"),
"sum" : 16
}
],
"ok" : 1
}
Starting in Mongo 3.4.4, this can be achieved with inline operations on each document, avoiding expensive stages such as $group:
// { _id: "xx", segments: { s1: 1, s2: 3, s3: 18, s4: 20 } }
db.collection.aggregate([
{ $addFields: {
total: { $sum: {
$map: { input: { $objectToArray: "$segments" }, as: "kv", in: "$$kv.v" }
}}
}}
])
// { _id: "xx", total: 42, segments: { s1: 1, s2: 3, s3: 18, s4: 20 } }
The idea is to transform the object (containing the numbers to sum) into an array. This is the role of $objectToArray, which, starting in Mongo 3.4.4, transforms { s1: 1, s2: 3, ... } into [ { k: "s1", v: 1 }, { k: "s2", v: 3 }, ... ]. This way, we don't need to care about the field names, since we can access the values through their "v" fields.
Having an array rather than an object is a first step towards being able to sum its elements. But the elements obtained with $objectToArray are objects and not simple integers. We can get past this by mapping (the $map operation) these array elements to extract the value of their "v" field, which in our case results in this kind of array: [1, 3, 18, 20].
Finally, it's a simple matter of summing elements within this array, using the $sum operation.
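The three steps above can be mirrored in plain JavaScript to see what each operator contributes. This is only an in-memory sketch, not how MongoDB evaluates the pipeline; objectToArray and sumSegmentValues are hypothetical names, and all values are assumed to be numeric.

```javascript
// Plain-JavaScript equivalent of $objectToArray:
// { s1: 1, s2: 3 } -> [ { k: "s1", v: 1 }, { k: "s2", v: 3 } ]
function objectToArray(obj) {
  return Object.entries(obj).map(([k, v]) => ({ k, v }));
}

// Equivalent of { $sum: { $map: { input: ..., as: "kv", in: "$$kv.v" } } }
function sumSegmentValues(segments) {
  return objectToArray(segments)
    .map((kv) => kv.v) // extract the "v" field of each pair
    .reduce((total, value) => total + value, 0); // sum the extracted values
}

const doc = { _id: "xx", segments: { s1: 1, s2: 3, s3: 18, s4: 20 } };
const total = sumSegmentValues(doc.segments);
console.log(total); // 42
```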
Segment: {s1: 10, s2: 4, s3: 12}
{$set: {"new_array":{$objectToArray: "$Segment"}}}, //makes field names all "k" or "v"
{$project: {_id:0, total:{$sum: "$new_array.v"}}}
"total" will be 26.
$set is an alias for $addFields, available starting in MongoDB 4.2. (I'm using 4.2.)
"new_array": [
{
"k": "s1",
"v": 10
},
{
"k": "s2",
"v": 4
},
{
"k": "s3",
"v": 12
}
]
You can also use regular expressions. Eg. /^s/i for words starting with "s".

Counting data per user with mongo aggregation framework

I have a collection, where each document contains user_ids as a property, which is an Array field. Example document(s) would be :
[{
_id: 'i3oi1u31o2yi12o3i1',
unique_prop: 33,
prop1: 'some string value',
prop2: 212,
user_ids: [1, 2, 3 ,4]
},
{
_id: 'i3oi1u88ffdfi12o3i1',
unique_prop: 34,
prop1: 'some string value',
prop2: 216,
user_ids: [2, 3 ,4]
},
{
_id: 'i3oi1u8834432ddsda12o3i1',
unique_prop: 35,
prop1: 'some string value',
prop2: 211,
user_ids: [2]
}]
My goal is to get number of documents per user, so sample output would be :
[
{user_id: 1, count: 1},
{user_id: 2, count: 3},
{user_id: 3, count: 2},
{user_id: 4, count: 2}
]
I've tried a couple of things, none of which worked. Lastly I tried:
aggregate([
{ $group: {
_id: { unique_prop: "$unique_prop"},
users: { "$addToSet": "$user_ids" },
count: { "$sum": 1 }
}}
])
But it just returned the users per document. I'm still learning, so any resource or advice would help.
You need to $unwind the "user_ids" array and, in the $group stage, count the number of times each "id" appears in the collection.
db.collection.aggregate([
{ "$unwind": "$user_ids" },
{ "$group": { "_id": "$user_ids", "count": {"$sum": 1 }}}
])
MongoDB aggregation performs computations on groups of values from the documents in a collection and returns the computed result by executing its stages as a pipeline.
Based on the description above, please try executing the following aggregate query in the MongoDB shell.
db.collection.aggregate(
// Pipeline
[
// Stage 1
{
$unwind: "$user_ids"
},
// Stage 2
{
$group: {
_id:{user_id:'$user_ids'},
total:{$sum:1}
}
},
// Stage 3
{
$project: {
_id:0,
user_id:'$_id.user_id',
count:'$total'
}
},
]
);
In the aggregate query above, the $unwind operator first breaks the user_ids array field of each document into multiple documents, one per array element. The $group stage then groups those documents by the value of user_ids and counts the documents for each value. Finally, $project renames the fields to user_id and count.
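To make the unwind-then-group logic concrete, here is an in-memory sketch in plain JavaScript using the sample documents from the question (other properties omitted). It only imitates the stages; unwindUserIds and countPerUser are hypothetical names. The final sort is added for readability, since $group itself does not guarantee any output order.

```javascript
const docs = [
  { unique_prop: 33, user_ids: [1, 2, 3, 4] },
  { unique_prop: 34, user_ids: [2, 3, 4] },
  { unique_prop: 35, user_ids: [2] }
];

// Imitates $unwind: one output entry per (document, array element) pair.
function unwindUserIds(documents) {
  return documents.flatMap((doc) =>
    doc.user_ids.map((userId) => ({ ...doc, user_ids: userId }))
  );
}

// Imitates $group with { $sum: 1 }, then reshapes like the $project stage.
function countPerUser(documents) {
  const counts = new Map();
  for (const doc of unwindUserIds(documents)) {
    counts.set(doc.user_ids, (counts.get(doc.user_ids) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([user_id, count]) => ({ user_id, count }))
    .sort((a, b) => a.user_id - b.user_id);
}

const result = countPerUser(docs);
console.log(result);
```

For the sample data this gives one entry per user id, matching the counts requested in the question.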

Mongo Query to return common values in array

I need a Mongo query that returns the common values present in an array across documents.
So if 4 documents match, a value is returned only if it is present in all 4 documents.
Suppose I have the below documents in my db
Mongo Documents
{
"id":"0",
"merchants":["1","2"]
}
{
"id":"1",
"merchants":["1","2","4"]
}
{
"id":"2",
"merchants":["4","5"]
}
Input : List of id
(i) Input with id "0" and "1"
Then it should return me merchants:["1","2"] as both are present in documents with id "0" & id "1"
(ii) Input with id "1" and "2"
Then it should return me merchants:["4"] as it is common and present in both documents with id "1" & id "2"
(iii) Input with id "0" and "2"
Should return empty merchants:[] as no common merchants between these 2 documents
You can try the aggregation below. Note that it intersects only the first and last matched documents, so it fits the case of exactly two input ids.
db.collection.aggregate([
{$match:{id: {$in: ["1", "2"]}}},
{$group:{_id:null, first:{$first:"$merchants"}, second:{$last:"$merchants"}}},
{$project: {commonToBoth: {$setIntersection: ["$first", "$second"]}, _id: 0 } }
])
Say you have a function query that does the required DB query for you, and you'll call that function with idsToMatch which is an array containing all the elements you want to match. I have used JS here as the driver language, replace it with whatever you are using.
The following code is dynamic, will work for any number of ids you give as input:
const query = (idsToMatch) => {
return db.collectionName.aggregate([
{ $match: { id: {$in: idsToMatch} } },
{ $unwind: "$merchants" },
{ $group: { _id: { id: "$id", data: "$merchants" } } },
{ $group: { _id: "$_id.data", count: {$sum: 1} } },
{ $match: { count: { $gte: idsToMatch.length } } },
{ $group: { _id: 0, result: {$push: "$_id" } } },
{ $project: { _id: 0, result: "$result" } }
]);
};
The first $group stage makes sure there are no repeated values
within the merchants attribute of any single document. If
you are certain that your individual documents never contain
repeated values for merchants, you need not include it.
The real work happens only up to the 2nd $match stage. The last two
stages ($group and $project) only prettify the result;
you may choose not to use them, and instead use the language of your
choice to transform the result into the form you want.
Assuming you want to reduce the phases as per the points given above, the actual code will reduce to:
aggregate([
{ $match: { id: {$in: idsToMatch} } },
{ $unwind: "$merchants" },
{ $group: { _id: "$merchants", count: {$sum: 1} } },
{ $match: { count: { $gte: idsToMatch.length } } }
])
Your required values will be at the _id attribute of each element of the result array.
The answer provided by @jgr0 is correct to some extent. The only mistake is in the intermediate match operation.
(i) So if input ids are "1" & "0" then the query becomes
aggregate([
{"$match":{"id":{"$in":["1","0"]}}},
{"$unwind":"$merchants"},
{"$group":{"_id":"$merchants","count":{"$sum":1}}},
{"$match":{"count":{"$eq":2}}},
{"$group":{"_id":null,"merchants":{"$push":"$_id"}}},
{"$project":{"_id":0,"merchants":1}}
])
(ii) So if input ids are "1", "0" & "2" then the query becomes
aggregate([
{"$match":{"id":{"$in":["1","0", "2"]}}},
{"$unwind":"$merchants"},
{"$group":{"_id":"$merchants","count":{"$sum":1}}},
{"$match":{"count":{"$eq":3}}},
{"$group":{"_id":null,"merchants":{"$push":"$_id"}}},
{"$project":{"_id":0,"merchants":1}}
])
The count in the intermediate $match should equal the number of ids in the input. So in case (i) it is 2 and in case (ii) it is 3.
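Rather than editing the count by hand for each input, the pipeline can be generated from the id list. The following is a sketch only: buildCommonMerchantsPipeline is a hypothetical helper name, and it assumes each id identifies exactly one document. It de-duplicates the input ids so that a repeated id does not inflate the required count, and it keeps the per-document de-duplication stage from the earlier answer.

```javascript
// Hypothetical helper: builds the "common merchants" pipeline for any number of ids.
function buildCommonMerchantsPipeline(idsToMatch) {
  // Count each id once, even if the caller passes it twice.
  const uniqueIds = [...new Set(idsToMatch)];
  return [
    { $match: { id: { $in: uniqueIds } } },
    { $unwind: "$merchants" },
    // Collapse repeated merchants within a single document.
    { $group: { _id: { id: "$id", merchant: "$merchants" } } },
    // Count in how many of the matched documents each merchant appears.
    { $group: { _id: "$_id.merchant", count: { $sum: 1 } } },
    // Keep only merchants that appear in every requested document.
    { $match: { count: { $eq: uniqueIds.length } } },
    { $group: { _id: null, merchants: { $push: "$_id" } } },
    { $project: { _id: 0, merchants: 1 } }
  ];
}
```

Note that when no merchant is common to all documents, the last $group receives no input, so the aggregation returns no document at all rather than { merchants: [] }; the caller may want to treat an empty result as an empty list.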
