How to sort a dictionary's keys and pick the first in MongoDB? - mongodb

I'm running the following query as described in the docs.
db.getCollection('things')
  .find(
    { _id: UUID("...") },
    { _id: 0, history: 1 }
  )
It produces a single element that, when unfolded in the GUI, shows the dictionary history. When I unfold that, I get to see the contents: a bunch of keys and their corresponding values.
Now, I'd like to sort the keys alphabetically and pick the first n. Please note that what is stored is not an array but a dictionary. Also, it would be great if I could flatten the structure and promote my history to be the head (root?) of the returned document.
I understand it's about projection and slicing. However, I'm not getting anywhere, despite many attempts. I get syntax errors or a full list of elements. Being rather nooby, I fear that I require a few pointers on how to diagnose my issue to begin with.
Based on the comments, I tried with aggregate and $sort. Regrettably, I only seem to be sorting the current output (that produces a single document due to the match condition). I want to access the elements inside history.
db.getCollection('things')
  .aggregate([
    { $match: { _id: UUID("...") } },
    { $sort: { history: 1 } }
  ])
I'm sensing that I should use projection to pull out a list of elements residing under history but I'm getting no success using the below.
db.getCollection('things')
  .aggregate([
    { $match: { _id: UUID("...") } },
    { $project: { history: 1, _id: 0 } }
  ])

Sorting an object's properties alphabetically takes several stages:
$objectToArray converts the history object to an array of key-value pairs
$unwind deconstructs that array into one document per pair
$sort orders the documents by history.k ascending (1 = ascending, -1 = descending)
$group by _id rebuilds the history key-value array
$slice takes the first n properties of the dictionary; I have entered 1
$arrayToObject converts the key-value array back to an object
db.getCollection('things').aggregate([
  { $match: { _id: UUID("...") } },
  { $project: { history: { $objectToArray: "$history" } } },
  { $unwind: "$history" },
  { $sort: { "history.k": 1 } },
  {
    $group: {
      _id: "$_id",
      history: { $push: "$history" }
    }
  },
  {
    $project: {
      history: {
        $arrayToObject: { $slice: ["$history", 1] }
      }
    }
  }
])
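For comparison, the same sort-then-slice logic can be sketched in plain JavaScript on a document that has already been fetched. This is only an illustration under assumptions: `firstSortedEntries` is a hypothetical helper name, the sample object is made up, and the comparison uses JavaScript string ordering, which may differ from the server's ordering for unusual keys.

```javascript
// Client-side equivalent of the pipeline: sort the keys, keep the first n.
// `firstSortedEntries` is a hypothetical helper, not a MongoDB API.
function firstSortedEntries(history, n) {
  const pairs = Object.entries(history); // like $objectToArray
  // like $sort on "history.k", ascending
  pairs.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  // like $slice followed by $arrayToObject
  return Object.fromEntries(pairs.slice(0, n));
}

// Made-up sample standing in for a `history` sub-document.
const sample = { charlie: 3, alpha: 1, bravo: 2 };
console.log(firstSortedEntries(sample, 2)); // { alpha: 1, bravo: 2 }
```

Note that JavaScript objects list integer-like keys (such as "1" or "42") first regardless of insertion order, so this sketch is only reliable for non-numeric keys.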
There is another option, but MongoDB does not guarantee that it reproduces the exact result:
$objectToArray converts the history object to an array of key-value pairs
$setUnion returns the unique elements of an array; in my experience it also returns them sorted by key in ascending order, but MongoDB does not guarantee any order
$slice takes the first n properties of the dictionary; I have entered 1
$arrayToObject converts the key-value array back to an object
db.getCollection('things').aggregate([
  { $match: { _id: UUID("...") } },
  {
    $project: {
      history: {
        $arrayToObject: {
          $slice: [
            { $setUnion: { $objectToArray: "$history" } },
            1
          ]
        }
      }
    }
  }
])
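Neither pipeline moves history to the root of the result, which the question also asks for. One possible way, sketched here as an assumption rather than a tested solution, is to end the first pipeline with $replaceRoot instead of $project. The helper name `sortedHistoryAtRoot` and the `someId` value in the usage comment are hypothetical; the function only builds the pipeline array and does not talk to a server.

```javascript
// Hypothetical helper: builds (does not run) an aggregation pipeline that
// sorts the keys of `history`, keeps the first n, and promotes them to the root.
function sortedHistoryAtRoot(id, n) {
  return [
    { $match: { _id: id } },
    { $project: { history: { $objectToArray: "$history" } } },
    { $unwind: "$history" },
    { $sort: { "history.k": 1 } },
    { $group: { _id: "$_id", history: { $push: "$history" } } },
    // Rebuild an object from the first n pairs and make it the whole document.
    { $replaceRoot: { newRoot: { $arrayToObject: { $slice: ["$history", n] } } } }
  ];
}

// In the shell this would be used along the lines of:
// db.getCollection('things').aggregate(sortedHistoryAtRoot(someId, 3))
console.log(JSON.stringify(sortedHistoryAtRoot("some-id", 3), null, 2));
```

The returned document then contains only the selected history keys at the top level, without _id.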

Related

MongoDB get only the last documents per grouping based on field

I have a collection "TokenBalance" like this holding documents of this structure
{
_id:"SvVV1qdUcxNwSnSgxw6EG125"
balance:Array
address:"0x6262998ced04146fa42253a5c0af90ca02dfd2a3"
timestamp:1648156174658
_created_at:2022-03-24T21:09:34.737+00:00
_updated_at:2022-03-24T21:09:34.737+00:00
}
Each address has multiple documents of the structure above, one per timestamp.
So address X can have 1000 documents with different timestamps.
What I want is to get only the most recently created document per address, and also pass all of its fields into the next stage, which is where I am stuck. I don't even know whether grouping with the $last operator is the right approach. I would appreciate some guidance on how to achieve this.
What I have is this
$group stage (1st stage)
{
_id: '$address',
timestamp: {$last: '$timestamp'}
}
This gives me a result of
_id:"0x6262998ced04146fa42253a5c0af90ca02dfd2a3"
timestamp:1648193827320
But I want the other fields of each document as well so I can further process them.
Questions
1) Is it the correct way to get the last created document per "address" field?
2) How can I get the other fields into the result of that group stage?
Use $setWindowFields with $denseRank (available from MongoDB 5.0):
db.collection.aggregate([
  {
    $setWindowFields: {
      partitionBy: "$address",
      sortBy: { timestamp: -1 },
      output: { rank: { $denseRank: {} } }
    }
  },
  {
    $match: { rank: 1 }
  }
])
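To make the intent of the ranking concrete, here is a minimal plain-JavaScript sketch of "keep the newest document per address" on an in-memory array. The helper name `latestPerAddress` and the sample data are made up for illustration. Unlike $denseRank, which keeps every document tied for rank 1, this sketch keeps only the first document it sees for a given maximum timestamp.

```javascript
// Hypothetical helper: keep the document with the largest timestamp per address.
function latestPerAddress(docs) {
  const latest = new Map(); // address -> newest document seen so far
  for (const doc of docs) {
    const seen = latest.get(doc.address);
    if (seen === undefined || doc.timestamp > seen.timestamp) {
      latest.set(doc.address, doc);
    }
  }
  return [...latest.values()];
}

// Made-up sample data; only `address` and `timestamp` matter here.
const balances = [
  { _id: "a1", address: "0xaaa", timestamp: 100 },
  { _id: "a2", address: "0xaaa", timestamp: 300 },
  { _id: "b1", address: "0xbbb", timestamp: 200 },
  { _id: "a3", address: "0xaaa", timestamp: 200 }
];
console.log(latestPerAddress(balances).map(d => d._id)); // [ 'a2', 'b1' ]
```

Each returned element is the full original document, so all fields remain available for further processing.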
I guess you mean this:
{ $group: {
  _id: '$address',
  timestamp: {$last: '$timestamp'},
  data: { $push: "$$ROOT" }
} }
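If the document with the latest timestamp is also the one that sorts last by _id, you can sort by _id, group by address, and keep the last document of each group, something like this:
[{$sort: {_id: 1}}, {$group: {
  _id: '$address',
  latest: {
    $last: '$$ROOT'
  }
}}, {$replaceRoot: {
  newRoot: '$latest'
}}]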

How to group documents of a collection to a map with unique field values as key and count of documents as mapped value in mongodb?

I need a mongodb query to get the list or map of values with unique value of the field(f) as the key in the collection and count of documents having the same value in the field(f) as the mapped value. How can I achieve this ?
Example:
Document1: {"id":"1","name":"n1","city":"c1"}
Document2: {"id":"2","name":"n2","city":"c2"}
Document3: {"id":"3","name":"n1","city":"c3"}
Document4: {"id":"4","name":"n1","city":"c5"}
Document5: {"id":"5","name":"n2","city":"c2"}
Document6: {"id":"6","name":"n1","city":"c8"}
Document7: {"id":"7","name":"n3","city":"c9"}
Document8: {"id":"8","name":"n2","city":"c6"}
Query result should be something like this if group by field is "name":
{"n1":"4",
"n2":"3",
"n3":"1"}
It would be nice if the list is also sorted in the descending order.
It's worth noting, using data points as field names (keys) is somewhat considered an anti-pattern and makes tooling difficult. Nonetheless if you insist on having data points as field names you can use this complicated aggregation to perform the query output you desire...
Aggregation
db.collection.aggregate([
  {
    $group: { _id: "$name", "count": { "$sum": 1} }
  },
  {
    $sort: { "count": -1 }
  },
  {
    $group: { _id: null, "values": { "$push": { "name": "$_id", "count": "$count" } } }
  },
  {
    $project:
    {
      _id: 0,
      results:
      {
        $arrayToObject:
        {
          $map:
          {
            input: "$values",
            as: "pair",
            in: ["$$pair.name", "$$pair.count"]
          }
        }
      }
    }
  },
  {
    $replaceRoot: { newRoot: "$results" }
  }
])
Aggregation Explanation
This is a 5-stage aggregation consisting of the following:
$group - get the count of documents per name.
$sort - sort the results by count, descending.
$group - place the results into an array for the next stage.
$project - use $arrayToObject and $map to pivot the data so that a data point can be a field name.
$replaceRoot - make results the top-level fields.
Sample Results
{ "n1" : 4, "n2" : 3, "n3" : 1 }
For whatever reason, you show desired results having count as a string, but my results show the count as an integer. I assume that is not an issue, and may actually be preferred.
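The group, sort, and pivot steps can also be illustrated in plain JavaScript on the question's sample documents. This is a sketch of the logic only, not a replacement for the aggregation; `countByField` is a hypothetical helper name.

```javascript
// Hypothetical helper: count documents per value of `field`,
// sorted by count descending, with the values as object keys.
function countByField(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) ?? 0) + 1); // like $group + $sum
  }
  const sorted = [...counts.entries()].sort((a, b) => b[1] - a[1]); // like $sort: -1
  return Object.fromEntries(sorted); // like $arrayToObject
}

// The eight sample documents from the question.
const people = [
  { id: "1", name: "n1", city: "c1" },
  { id: "2", name: "n2", city: "c2" },
  { id: "3", name: "n1", city: "c3" },
  { id: "4", name: "n1", city: "c5" },
  { id: "5", name: "n2", city: "c2" },
  { id: "6", name: "n1", city: "c8" },
  { id: "7", name: "n3", city: "c9" },
  { id: "8", name: "n2", city: "c6" }
];
console.log(countByField(people, "name")); // { n1: 4, n2: 3, n3: 1 }
```

As with the aggregation, the counts come out as numbers rather than strings.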

Mongoose remove duplicate social security customer

I have customers with duplicate SocialSecurity values, and I want to remove the duplicates and keep the newest customer. I am doing this by comparing _id and keeping the one with the largest value. Unfortunately, when I test with dummy data, my code does not always delete the one with the smallest _id. Any idea why? I thought the $sort would work.
let hc = db.getSiblingDB('customer');
hc.customers.aggregate([
  {
    "$group": {
      _id: {socialsecurity: "$socialsecurity"},
      imeis: { $addToSet: "$_id" },
      count: { $sum : 1 }
    }
  },
  {
    "$match": {
      count: { "$gt": 1 }
    }
  },
  {
    $sort: {_id: 1}
  }
]).forEach(function(doc) {
  doc.socialsecurity.shift();
  hc.customers.remove({
    _id: {$in: doc.socialsecurity}
  });
})
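Problem
The array is built with $addToSet, which does not guarantee the order of its elements, and the { $sort: { _id: 1 } } stage runs after $group, so it orders the groups rather than the _id values inside each array.
doc.socialsecurity.shift(); therefore removes whichever _id happens to come first, which is not always the newest (largest) one.
Solution
Move the sort before the $group stage and collect the ids with $push, so that each array follows the sort order. With { $sort: { _id: -1 } } (descending) the largest _id comes first, and shift() takes it out of the list to delete.
OR
keep { $sort: { _id: 1 } } (ascending) before the $group stage and change doc.socialsecurity.shift(); to doc.socialsecurity.pop(); so that the last (largest) element is removed from the array instead.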
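The keep-the-newest step can also be checked on its own, outside the pipeline. A minimal sketch in plain JavaScript, assuming the _id values of one duplicate group are plain strings or numbers that compare with < and > (real ObjectId values may first need converting, for example to their hex string); `idsToDelete` is a hypothetical helper, not part of Mongoose or MongoDB.

```javascript
// Hypothetical helper: given the _id values of one duplicate group,
// return the ones to delete, keeping the largest (newest) _id.
function idsToDelete(ids) {
  // Copy first so the caller's array is not reordered, then sort descending.
  const sorted = [...ids].sort((a, b) => (a < b ? 1 : a > b ? -1 : 0));
  return sorted.slice(1); // everything except the first (largest) one
}

// Made-up ids standing in for the _id values of one group.
console.log(idsToDelete(["b", "c", "a"])); // [ 'b', 'a' ]
```

Sorting explicitly on the client like this does not depend on the order in which the server returned the array.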

Count nested wildcard array mongodb query

I have the following data of users and model cars:
[
{
"user_id":"ebebc012-082c-4e7f-889c-755d2679bdab",
"car_1a58db0b-5449-4d2b-a773-ee055a1ab24d":1,
"car_37c04124-cb12-436c-902b-6120f4c51782":0,
"car_b78ddcd0-1136-4f45-8599-3ce8d937911f":1
},
{
"user_id":"f3eb2a61-5416-46ba-bab4-459fbdcc7e29",
"car_1a58db0b-5449-4d2b-a773-ee055a1ab24d":1,
"car_0d15eae9-9585-4f49-a416-46ff56cd3685":1
}
]
I want to see how many users have a car_ with the value 1 using mongodb, something like:
{"car_1a58db0b-5449-4d2b-a773-ee055a1ab24d": 2}
For this example.
The issue is that I will never know what the car_ fields are going to be called; they have a random structure (wildcard).
Notes:
car_id and user_id are at the same level.
The car_id is not given; I simply want to know, for the entire database, which are the most common car_ fields with value 1.
$group by _id and convert the root document to an array of key-value pairs using $objectToArray
$unwind deconstructs the root array into one document per pair
$match keeps only the pairs where root.v is 1
$group by root.k and count the documents per key
db.collection.aggregate([
  {
    $group: {
      _id: "$_id",
      root: { $first: { $objectToArray: "$$ROOT" } }
    }
  },
  { $unwind: "$root" },
  { $match: { "root.v": 1 } },
  {
    $group: {
      _id: "$root.k",
      count: { $sum: 1 }
    }
  }
])
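The counting logic can be sketched in plain JavaScript on the question's sample data. Note one assumption that the pipeline above does not make: this sketch only counts keys starting with "car_", whereas the $match stage counts any field whose value is 1. `countFlaggedKeys` is a hypothetical helper name.

```javascript
// Hypothetical helper: count, per key, how many documents have that key set to 1.
// Only keys starting with `prefix` are considered (an added assumption).
function countFlaggedKeys(docs, prefix = "car_") {
  const counts = {};
  for (const doc of docs) {
    for (const [key, value] of Object.entries(doc)) { // like $objectToArray + $unwind
      if (key.startsWith(prefix) && value === 1) {    // like $match on root.v
        counts[key] = (counts[key] ?? 0) + 1;         // like $group + $sum
      }
    }
  }
  return counts;
}

// The two sample documents from the question.
const users = [
  {
    "user_id": "ebebc012-082c-4e7f-889c-755d2679bdab",
    "car_1a58db0b-5449-4d2b-a773-ee055a1ab24d": 1,
    "car_37c04124-cb12-436c-902b-6120f4c51782": 0,
    "car_b78ddcd0-1136-4f45-8599-3ce8d937911f": 1
  },
  {
    "user_id": "f3eb2a61-5416-46ba-bab4-459fbdcc7e29",
    "car_1a58db0b-5449-4d2b-a773-ee055a1ab24d": 1,
    "car_0d15eae9-9585-4f49-a416-46ff56cd3685": 1
  }
];
console.log(countFlaggedKeys(users));
```

For the sample data, the key car_1a58db0b-5449-4d2b-a773-ee055a1ab24d gets a count of 2, and the key with value 0 does not appear at all.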

MongoDB query or aggregation to skip sub-documents

I'd like to create a query or aggregation where the returned documents do not include sub-documents. I do not know that a given field will be a sub-document ahead of time (or I would just use the projection to skip them). So for example, if I have a document like this:
{
_id: 1,
field1: "a",
field2: "b",
field3: {
subfield1: "c",
subfield2: "d"
}
}
When my query returns this document, it either skips field3, or replaces field3's value with something else (e.g. a string = "field_is_an_object").
As I said, I don't know ahead of time which fields will be sub-documents (or "object" types). The $redact operator was the closest I could find, but I couldn't figure out a syntax to get it to work.
There are at least two ways you can achieve what you want:
The first one is pretty concise and requires just one aggregation stage which, however, is a little bit more complex and harder to understand:
db.collection.aggregate([{
  $replaceRoot: { // create a new top level document
    "newRoot": { // ...which shall be
      $arrayToObject: { // ...created from an array
        $filter: { // ...that again should contain only those elements
          input: { // ...from our input array
            $objectToArray: "$$ROOT" // ...which is our respective top level document transformed into an array of key-value pairs
          },
          cond: { // ...where
            $ne: [ { $type: "$$this.v" }, "object" ] // ...the "v" (as in "value") field is not an object
          }
        }
      }
    }
  }
}])
The second one I can think of is way more verbose but pretty easy to understand by adding the stages step-by-step (as always with the aggregation framework).
db.collection.aggregate([{
  $project: {
    "tmp": { // we create a temporary field
      $objectToArray: "$$ROOT" // that contains our respective root document represented as an array of key-value pairs
    }
  }
}, {
  $unwind: "$tmp" // flatten the temporary array into multiple documents
}, {
  $match: {
    "tmp.v": { $not: { $type: "object" } } // filter all documents out that we do not want in our result
  }
}, {
  $group: { // group all documents together again
    "_id": "$_id", // into one bucket per original document ("_id")
    "tmp": {
      $push: "$tmp" // and create an array with all the key-value pairs that have survived our $match stage
    }
  }
}, {
  $replaceRoot: { // create a new top level document...
    "newRoot": {
      $arrayToObject: "$tmp" // ...out of the data we have left in our array
    }
  }
}])
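The filtering idea behind both variants can be sketched in plain JavaScript on the sample document from the question. This is an illustration under assumptions: `isSubDocument` and `withoutSubDocuments` are hypothetical helpers, and "sub-document" is taken to mean a plain object, so arrays, dates, and null are kept, roughly mirroring how the "object" type is distinct from "array" and "date" on the server.

```javascript
// Hypothetical helper: true only for plain objects (not arrays, dates, or null).
function isSubDocument(value) {
  return value !== null &&
    typeof value === "object" &&
    Object.getPrototypeOf(value) === Object.prototype;
}

// Hypothetical helper: copy of `doc` without fields whose value is a sub-document.
function withoutSubDocuments(doc) {
  return Object.fromEntries(
    Object.entries(doc).filter(([, value]) => !isSubDocument(value)) // like $filter
  );
}

// The sample document from the question.
const doc = {
  _id: 1,
  field1: "a",
  field2: "b",
  field3: { subfield1: "c", subfield2: "d" }
};
console.log(withoutSubDocuments(doc)); // { _id: 1, field1: 'a', field2: 'b' }
```

To replace the sub-document with a marker string instead of dropping it, the filter could be swapped for a map that returns [key, "field_is_an_object"] for those entries.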