SpringData MongoDb, how to count distinct of a query? - mongodb

I'm doing a paginated search with MongoDB in my Spring Boot API.
For a customer search path, I'm building a query with a bunch of criteria depending on the user input.
I then run a count to display the total number of results (and the number of pages computed from it):
Long total = mongoTemplate.count(query, MyEntity.class);
I then do the paginated query to return only current page results
query.with(PageRequest.of(pagination.getPage(), pagination.getPageSize()));
query.with(Sort.by(Sort.Direction.DESC, "creationDate"));
List<MyEntity> listResults = mongoTemplate.find(query, MyEntity.class);
It all works well.
Now, my results often contain multiple entries for the same user. I still want to display all of them in the paginated list, but I also want to show a new counter with the total number of distinct users in that search.
I saw the findDistinct method:
mongoTemplate.findDistinct(query, "userId", OnboardingItineraryEntity.class, String.class);
But I do not want to retrieve a huge list just to count it. Is there an easy way to do something like:
mongoTemplate.countDistinct(query, "userId", OnboardingItineraryEntity.class, String.class);
I have a large number of criteria, so I would rather not rebuild everything from scratch as an Aggregation object.
Bonus question: sometimes userId will be null. Is there an easy way to count the number of distinct (non-null) values plus the number of nulls in one query?
Or do I need to run one query with an extra criterion on userId being null, count that, then do the count distinct on everything and add them up manually in my code (minus one)?

MongoDB aggregation solves this problem in several ways.
Aggregate with $type operator:
db.myEntity.aggregate([
{$match:...}, //add here MatchOperation
{
"$group": {
"_id": {
"$type": "$userId"
},
"count": {
"$sum": 1
}
}
}
])
MongoPlayground
---Output---
[
{
"_id": "null", //null values
"count": 2
},
{
"_id": "missing", // if userId doesn't exist at all
"count": 1
},
{
"_id": "string", //not null values
"count": 4
}
]
Single document with null and NonNull fields
db.myEntity.aggregate([
{$match:...}, //add here MatchOperation
{
"$group": {
"_id": "",
"null": {
$sum: {
$cond: [
{
$ne: [{ "$type": "$userId"}, "string"]
},
1,
0
]
}
},
"nonNull": {
"$sum": {
$cond: [
{
$eq: [{ "$type": "$userId" }, "string"]
},
1,
0
]
}
}
}
}
])
MongoPlayground
---Output---
[
{
"_id": "",
"nonNull": 4,
"null": 3
}
]
Performing $facet operator
db.myEntity.aggregate([
{$match:...}, //add here MatchOperation
{
$facet: {
"null": [
{
$match: {
$or: [
{
userId: {
$exists: false
}
},
{
userId: null
}
]
}
},
{
$count: "count"
}
],
"nonNull": [
{
$match: {
$and: [
{
userId: {
$exists: true
}
},
{
userId: {
$ne: null
}
}
]
}
},
{
$count: "count"
}
]
}
},
{
$project: {
"null": {
$ifNull: [
{
$arrayElemAt: [
"$null.count",
0
]
},
0
]
},
"nonNull": {
$ifNull: [
{
$arrayElemAt: [
"$nonNull.count",
0
]
},
0
]
}
}
}
])
MongoPlayground
Note: Try any of these solutions and let me know if you have any problem creating the MongoDB aggregation.
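The pipelines above count documents by the type of userId rather than counting distinct userId values. A minimal sketch of one way to get the distinct count, assuming the same $match filter as the count query: group once per userId, then fold those groups into a single summary document. The helper name and the output fields distinctUsers and withoutUser are made up for illustration.

```javascript
// Hypothetical helper: builds a pipeline that counts distinct non-null userId
// values plus the number of documents whose userId is null or missing.
function distinctUserCountPipeline(filter) {
  return [
    { $match: filter },
    // One group per userId; null and missing values both end up in _id: null.
    { $group: { _id: { $ifNull: ["$userId", null] }, docs: { $sum: 1 } } },
    // Fold the per-user groups into a single summary document.
    {
      $group: {
        _id: null,
        distinctUsers: { $sum: { $cond: [{ $eq: ["$_id", null] }, 0, 1] } },
        withoutUser: { $sum: { $cond: [{ $eq: ["$_id", null] }, "$docs", 0] } }
      }
    }
  ];
}

// Usage in the shell, with a placeholder filter:
// db.myEntity.aggregate(distinctUserCountPipeline({ status: "ACTIVE" }))
```

If nothing matches the filter, the pipeline returns no document at all, so the caller should treat an empty result as zero for both counters. On the Spring side, Aggregation.match(...) accepts a Criteria, so the criteria objects built for the Query should be reusable without rewriting them.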

Related

MongoDB aggregation multiple partial matches

I am in the process of moving pagination / filtering away from the client and onto the server.
Data is presented in a table, each column header has a text input where you can type and filter the dataset by what is typed in. This uses a simple indexOf check on the inputted text and the dataset to allow partial matches.
Example table column
{ name: "test 1" }, { name: "test 2" }
The image / data above shows a column in the table. If I were to type in "tes" both results would appear.
let filteredResults = data.filter(row => row.name.toLowerCase().indexOf(filterValue) > -1)
I have now moved this filtering onto the server but I am struggling to work out how to do a similar partial match when querying my data.
This is my query:
aggregate([
{
$facet: {
results: [
{
$match: {
"name": req.body.name
}
},
{
$skip: pageOptions?.pageNo ? (pageOptions.pageNo - 1) * 10 : 0
},
{
$limit: 10
}
],
totalCount: [
{
$match: {
"name": req.body.name
}
},
{ $count: 'totalCount' }
]
}
},
{
$addFields:
{
"total": { $arrayElemAt: ["$totalCount.totalCount", 0] }
}
},
{
$project: {
"totalCount": 0
}
}
]
Each field in the $match stage corresponds to a possible column from the table; in this example it is just the name field, but you could filter by more than one. The above works with exact matches: searching the name column for "test 1" returns that record, but searching for "tes" returns nothing.
Any help with this would be great!
You can use a $regex match to perform your case-insensitive, partial string match.
db.collection.aggregate([
{
$match: {
"name": {
// put your query in $regex option
$regex: "tes",
$options: "i"
}
}
},
{
$facet: {
results: [
{
$skip: 0
},
{
$limit: 10
}
],
totalCount: [
{
$count: "totalCount"
}
]
}
},
{
$addFields: {
"total": {
$arrayElemAt: [
"$totalCount.totalCount",
0
]
}
}
},
{
$project: {
"totalCount": 0
}
}
])
Here is the Mongo playground for your reference.
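Since the search text comes from the user, characters such as "." or "(" would be interpreted as regex syntax. A minimal sketch, assuming the filters arrive as an object of column name to typed text, that escapes the input before building the $match stage. The helper names are hypothetical.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so user input such as "a.b" or "test (1)" is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds one $match stage from { column: typedText } filters.
// Columns with empty input are skipped so they do not restrict the result.
function partialMatchStage(filters) {
  const conditions = {};
  for (const [field, text] of Object.entries(filters)) {
    if (text) {
      conditions[field] = { $regex: escapeRegex(text), $options: "i" };
    }
  }
  return { $match: conditions };
}

console.log(JSON.stringify(partialMatchStage({ name: "tes" })));
// {"$match":{"name":{"$regex":"tes","$options":"i"}}}
```

The returned stage can be placed in front of the $facet stage in place of the hard-coded $match.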
I was able to solve this by using text indexes and the $text operator:
[
{
$match: { $text: { $search: "asdfadsf" } }
}
]

Sort Mongodb documents by seeing if the _id is in another array

I have two collections - "users" and "follows". "Follows" simply contains documents with a "follower" field and a "followee" field that represent when a user follows another user. What I want to do is to be able to query the users but display the users that I (or whatever user is making the request) follow first. For example if I follow users "5" and "14", when I search the list of users, I want users "5" and "14" to be at the top of the list, followed by the rest of the users in the database.
If I were to first query all the users that I follow from the "Follows" collection and get an array of those userIDs, is there a way that I can sort by using something like {$in: [userIDs]}? I don't want to filter out the users that I do not follow, I simply want to sort the list by showing the users that I do follow first.
I am using nodejs and mongoose for this.
Any help would be greatly appreciated. Thank you!
Answer
db.users.aggregate([
{
$addFields: {
sortBy: {
$cond: {
if: {
$in: [ "$_id", [ 5, 14 ] ]
},
then: 0,
else: 1
}
}
}
},
{
$sort: {
sortBy: 1
}
},
{
$unset: "sortBy"
}
])
Test Here
If you don't want the requesting user (here _id 1) in the list, add a $match stage:
db.users.aggregate([
{
$addFields: {
sortBy: {
$cond: {
if: {
$in: [ "$_id", [ 5, 14 ] ]
},
then: 0,
else: 1
}
}
}
},
{
$sort: {
sortBy: 1
}
},
{
$unset: "sortBy"
},
{
$match: {
"_id": { $ne: 1 }
}
}
])
Test Here
If you also want the users ordered by _id within each group:
db.users.aggregate([
{
$sort: {
_id: 1
}
},
{
$addFields: {
sortBy: {
$cond: {
if: {
$in: [
"$_id",
[
5,
14
]
]
},
then: 0,
else: 1
}
}
}
},
{
$sort: {
sortBy: 1,
_id: 1 // tie-breaker: $sort is not guaranteed to preserve the order of the earlier $sort stage
}
},
{
$unset: "sortBy"
},
{
$match: {
"_id": {
$ne: 1
}
}
}
])
Test Here
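The pipelines above hard-code the followed ids [5, 14] and the requesting user's id 1. A minimal sketch of building the same pipeline from values known at request time; the helper name and the mongoose model names in the usage comment are assumptions.

```javascript
// Hypothetical helper: followedIds is the array read from the "follows"
// collection, selfId is the id of the user making the request.
function followedFirstPipeline(followedIds, selfId) {
  return [
    // Drop the requesting user before doing any other work.
    { $match: { _id: { $ne: selfId } } },
    // 0 for followed users, 1 for everyone else.
    { $addFields: { sortBy: { $cond: [{ $in: ["$_id", followedIds] }, 0, 1] } } },
    // Followed users first, then by _id within each group.
    { $sort: { sortBy: 1, _id: 1 } },
    // Remove the helper field from the output.
    { $unset: "sortBy" }
  ];
}

// Usage with mongoose (Follow and User are assumed model names):
// const followedIds = await Follow.distinct("followee", { follower: selfId });
// const users = await User.aggregate(followedFirstPipeline(followedIds, selfId));
```

Placing the $match first means the excluded user is never sorted, and a single $sort on both keys avoids relying on the order left by an earlier stage.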

Combining data from 2 mongoDB collections into 1 document

I want to filter 2 collections and return one document.
I have 2 MongoDB collections modelled as such
Analytics_Region
_id:5ecf3445365eca3e58ff57c0,
type:"city"
name:"Toronto"
CSD:"3520005"
CSDTYPE:"C"
PR:"35"
PRNAME:"Ontario"
geometry:Object
country:"CAN"
updatedAt:2021-04-23T18:25:50.774+00:00
province:"ON"
Analytics_Region_Custom
_id:5ecbe871d8ab4ab6845c5142
geometry:Object
name:"henry12"
user:5cbdd019b9d9170007d15990
__v:0
I want to output a single collection in alphabetical order by name,
{
_id: 5ecbe871d8ab4ab6845c5142,
name: "henry12",
type: "custom",
province: null
},
{
_id:5ecf3445365eca3e58ff57c0,
name:"Toronto"
type:"city"
province:"ON",
}
Things to note: In the output, we have added a type of "custom" for every document in Analytics_Region_custom. We also add a province of "null" for every document.
So far I looked into $lookup (to fetch results from another collection) but it does not seem to work for my needs since it adds an array onto every document
You can use $unionWith
The documents of the second collection are appended to the pipeline (with no check for duplicates), and the fields are then projected from the combined documents:
if type is missing => "custom"
if province is missing => null
*If either field exists with a falsy value (false, 0, or null), the old value is kept; the default is used only when the field is missing.
Test code here
db.coll1.aggregate([
{
"$unionWith": {
"coll": "coll2"
}
},
{
"$project": {
"_id": "$_id",
"name": "$name",
"type": {
"$cond": [
{
"$ne": [
{
"$type": "$type"
},
"missing"
]
},
"$type",
"custom"
]
},
"province": {
"$cond": [
{
"$ne": [
{
"$type": "$province"
},
"missing"
]
},
"$province",
null
]
}
}
},
{
"$sort": {
"name": 1
}
}
])
$unionWith to perform the union of both collections
$project to keep only the fields that you want
$sort to sort by the name field
db.orders.aggregate([
{
$unionWith: "inventory"
},
{
$project: {
_id: 1,
name: 1,
province: { $cond: { if: "$province", then: "$province", else: null } },
type: { $cond: { if: "$type", then: "$type", else: "custom" } }
}
},
{
$sort: { name: 1 }
}
])
Working example
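The two answers treat existing-but-falsy values differently: the $type check only fills in a field that is missing, while a $cond on the field value also replaces null, false, and 0. A third option, sketched here under the assumption that null and missing should both receive the default, is $ifNull. The helper name is hypothetical.

```javascript
// Hypothetical helper: union two collections and fill in defaults with $ifNull,
// which replaces both missing and null values but keeps false and 0.
function unionWithDefaultsPipeline(otherCollection) {
  return [
    { $unionWith: { coll: otherCollection } },
    {
      $project: {
        name: 1, // _id is included by default
        type: { $ifNull: ["$type", "custom"] },
        province: { $ifNull: ["$province", null] }
      }
    },
    { $sort: { name: 1 } }
  ];
}

// Usage in the shell, with the collection names from the question:
// db.Analytics_Region.aggregate(unionWithDefaultsPipeline("Analytics_Region_Custom"))
```

Note that $sort compares strings byte by byte unless a collation is given, so "Toronto" sorts before "henry12". Passing an option such as { collation: { locale: "en", strength: 2 } } to aggregate should give the case-insensitive order shown in the question.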

Mongodb: is it possible to do this in one query?

I am new to Mongodb, Here is my document format:
{
"_id": {
"$oid": "5ee023790a0e502e3a9ce9e7"
},
"data": {
"Quick": [
["1591745491", "4", "uwp"],
["1591745492", "4", "uwp"],
["1591745516", "12", "Word"],
["1591747346", "8", "uwp"]
],
"Key": [
["1591747446", "Num"]
],
"Search": [
["1591745491", "tty"],
["1591745492", "erp"],
["1591745516", "Word"],
["1591747346", "uwp"]
]
},
"devicecode": "MP1G5L9EMP1G5L9E#LENOVO"
}
What I want to do is:
group by devicecode
for each group, count how many times they used "Quick", "key" and "Search" (i.e. count the entries under each name)
Currently I am using a Python program to get this done, but I believe there should be a way to do it within MongoDB.
The output format should look like this:
devicecode: MP1G5L9EMP1G5L9E#LENOVO, Quick: 400, key: 350, Search: 660
...
You could use aggregation framework to compute the length of individual arrays in the $set stage and then in the $group stage group-by device while summing up the computed array length values from the previous stage. Finally, in the $project stage map _id to devicecode and deselect _id.
db.getCollection("testcollection").aggregate([
{
$set: {
QuickLen: {
$size: {
$ifNull: [
"$data.Quick",
[]
]
}
},
KeyLen: {
$size: {
$ifNull: [
"$data.Key",
[]
]
}
},
SearchLen: {
$size: {
$ifNull: [
"$data.Search",
[]
]
}
}
}
},
{
$group: {
_id: "$devicecode",
Quick: {
$sum: "$QuickLen"
},
key: {
$sum: "$KeyLen"
},
Search: {
$sum: "$SearchLen"
}
}
},
{
$project: {
devicecode: "$_id",
Quick: 1,
key: 1,
Search: 1,
_id: 0
}
}
])
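To sanity-check the pipeline's output on a small sample, the same computation can be written in plain JavaScript. This is only an in-memory sketch of what the $set and $group stages compute; the function name and the sample data are made up.

```javascript
// In-memory equivalent of the $set + $group stages above.
function countUsageByDevice(docs) {
  const totals = new Map();
  for (const doc of docs) {
    const data = doc.data ?? {};
    const entry = totals.get(doc.devicecode) ?? { Quick: 0, key: 0, Search: 0 };
    // A missing array counts as empty, mirroring $size over $ifNull.
    entry.Quick += (data.Quick ?? []).length;
    entry.key += (data.Key ?? []).length;
    entry.Search += (data.Search ?? []).length;
    totals.set(doc.devicecode, entry);
  }
  return [...totals].map(([devicecode, counts]) => ({ devicecode, ...counts }));
}

const sample = [
  { devicecode: "A", data: { Quick: [["1", "4", "uwp"], ["2", "4", "uwp"]], Key: [["3", "Num"]] } },
  { devicecode: "A", data: { Search: [["4", "tty"]] } }
];
console.log(JSON.stringify(countUsageByDevice(sample)));
// [{"devicecode":"A","Quick":2,"key":1,"Search":1}]
```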

Mongo db not in query by having two subset of documents from same collection

I am new to mongodb. Assume the following. There are 3 types of documents in one collection x, y and z.
docs = [{
"item_id": 1,
"type": "x"
},
{
"item_id": 2,
"type": "x"
},{
"item_id": 3,
"type": "y",
"relavent_item_ids": [1, 2]
},
{
"item_id": 3,
"type": "y",
"relavent_item_ids": [1, 2, 3]
},{
"item_id": 4,
"type": "z"
}]
I want to get the following.
Ignore the documents with type z
Get all the documents of type x whose item_id is not in the relavent_item_ids of any type y document.
The result should have an item_id field.
I tried a $match with $in, but this returns all the records. I am unable to figure out how to apply the condition against the subset of documents of type y.
You can do this in two steps: first collect the ids referenced by the type y documents, then query the type x documents.
// Flatten every relavent_item_ids array of the type "y" documents into one list
const yDocs = await db.collection.find({ "type": "y" }).toArray()
const item_ids = yDocs.flatMap(({ relavent_item_ids }) => relavent_item_ids ?? [])
// Type "x" documents that have an item_id not referenced by any type "y" document
const result = await db.collection.find({
"type": "x",
"item_id": { "$exists": true, "$nin": item_ids }
}).toArray()
console.log({ result })
Ignore the documents with type z --> Matching on type "x" already excludes them, so no separate $ne is needed.
Get all the documents of type x whose item_id is not in relavent_item_ids of type y documents --> Use the $nin query operator with the ids collected in the first step.
The result should have item_id field --> Use the $exists query operator.
The solution:
db.test.aggregate( [
{
$facet: {
firstQuery: [
{
$match: { type: { $eq: "x", $ne: "z" } }
},
{
$project: {
item_id : 1, _id: 0
}
}
],
secondQuery: [
{
$match: { type: "y" }
},
{
$group: {
_id: null,
relavent: { $push: "$relavent_item_ids" }
}
},
{
$project: {
relavent: {
$reduce: {
input: "$relavent",
initialValue: [ ],
in: { $setUnion: [ "$$value", "$$this" ] }
}
}
}
}
]
}
},
{
$addFields: { secondQuery: { $arrayElemAt: [ "$secondQuery", 0 ] } }
},
{
$project: {
result: {
$filter: {
input: "$firstQuery" ,
as: "e",
cond: { $not: [ { $in: [ "$$e.item_id", "$secondQuery.relavent" ] } ] }
}
}
}
},
] )
Using the input documents from the question, plus the following additional document:
{
"item_id": 11,
"type": "x"
}
only this document's item_id (value 11) appears in the output.
The aggregation uses a $facet to make two individual queries with a single pass. The first query gets all the "x" types (and ignores type "z") as an array. The second query gets an array of relavent_item_ids with unique values (from the documents of type "y"). The final, $project stage filters the first query result array with the condition:
Get all the documents of type x where it's item_id is not in
relavent_item_ids of type y documents
I am not sure if it's an elegant solution.
db.getCollection('test').aggregate([
{
"$unwind": {
"path": "$relavent_item_ids",
"preserveNullAndEmptyArrays": true
}
},
{
"$group": {
"_id":null,
"relavent_item_ids": {"$addToSet":"$relavent_item_ids"},
"other_ids": {
"$addToSet":{
"$cond":[
{"$eq":["$type", "x"]},
"$item_id",
null
]
}
}
}
},
{
"$project":{
"includeIds": {"$setDifference":["$other_ids", "$relavent_item_ids"]}
}
},
{
"$unwind": "$includeIds"
},
{
"$match": {"includeIds":{"$ne":null}}
},
{
"$lookup":{
"from": "test",
"let": { "includeIds": "$includeIds"},
"pipeline": [
{ "$match":
{ "$expr":
{ "$and":
[
{ "$eq": [ "$item_id", "$$includeIds" ] },
{ "$eq": [ "$type", "x" ] }
]
}
}
}
],
"as": "result"
}
},
{
"$unwind": "$result"
},
])