Mongodb querying for all min values that match criteria and indexing

Suppose I had the following:
[
{
"team": "A",
"age": 1,
"name": "Abe"
},
{
"team": "A",
"age": 5,
"name": "Apple"
},
{
"team": "B",
"age": 1,
"name": "Ben"
},
{
"team": "B",
"age": 2,
"name": "Bon"
},
{
"team": "C",
"age": 5,
"name": "Cherry"
}
]
I want a query that meets these requirements:
The people must be on either team A or team B.
After filtering, I want only the youngest.
So in this example, it would return only Abe and Ben.
Preferably, I want it done in a single query. My guess is I have to use an aggregation pipeline, something like
db.People.aggregate([
{ $match: { $or: [{ team: 'A' }, { team: 'B' }] } },
// some more stuff
]);
Question1: I'm not sure what the next step would be. Could someone point me in the right direction?
Question2:
There may be a million records, and I was thinking of adding two indexes:
An index on team. I'm thinking this will allow it to filter the teams of interest.
An index on age, so that it can grab only the minimums.
Would these indexes help, or what kind of indexes should I be looking into?
Edit: I'm getting closer, but I'm only interested in the records themselves.
db.collection.aggregate([
{
$match: {
$or: [
{
team: "A"
},
{
team: "B"
}
]
}
},
{
$group: {
_id: "$age",
items: {
$push: "$$ROOT"
}
}
},
{
$sort: {
_id: 1
}
},
{
$limit: 1,
}
])
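
One possible way to finish the attempt above, offered as a sketch rather than a definitive answer: after $limit keeps the single group with the lowest age, $unwind and $replaceRoot turn its items array back into plain records. The names youngestPipeline and suggestedIndex are illustrative, and the compound index is an assumption to verify with explain() on real data.

```javascript
// Teams of interest, taken from the question.
const teams = ["A", "B"];

const youngestPipeline = [
  // $in over one field is equivalent to the $or of equality checks.
  { $match: { team: { $in: teams } } },
  // One group per distinct age, carrying the full documents along.
  { $group: { _id: "$age", items: { $push: "$$ROOT" } } },
  // Lowest age first, then keep only that group.
  { $sort: { _id: 1 } },
  { $limit: 1 },
  // Unpack the items array into one document per person.
  { $unwind: "$items" },
  { $replaceRoot: { newRoot: "$items" } }
];

// Assumption for Question 2: a single compound index with the
// filtered field first, instead of two separate single-field indexes.
const suggestedIndex = { team: 1, age: 1 };

// In mongosh (collection name taken from the question):
//   db.People.createIndex(suggestedIndex);
//   db.People.aggregate(youngestPipeline);
```

With the sample data this should leave only the Abe and Ben documents. Note that $push collects every matching document of one age into a single array, so with a very large collection a group could run into the 16 MB document limit; in that case, looking up the minimum age first and then querying for it may be safer.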

Related

How can I get random documents with the field starting letters (A to Z) in MongoDB?

I'm trying to build an aggregation with MongoDB. My goal is to get random names starting with the letters A to Z. Each starting letter must appear only once in the response, but I can't figure out how to do it. I used a $match with a regex and a $sample stage to get random documents.
Here is my collection:
[
{
"name": "ahmet"
},
{
"name": "barış"
},
{
"name": "ceyhun"
},
{
"name": "aslan"
},
{
"name": "deniz"
},
....
]
Here is my aggregation:
db.collection.aggregate([
{
$match: {
name: {
$regex: "^a|^b|^c" // must be A to Z
}
}
},
{
"$sample": {
"size": 3 // Must be 26
}
}
])
I expect the response to look like this:
[
{
"name": "ahmet"
},
{
"name": "barış"
},
{
"name": "ceyhun"
},
.... // other words starting with d, e , f but only one word for each letter
]
But I'm getting:
[
{
"_id": ObjectId("5a934e000102030405000001"),
"name": "barış"
},
{
"_id": ObjectId("5a934e000102030405000003"),
"name": "aslan"
},
{
"_id": ObjectId("5a934e000102030405000000"),
"name": "ahmet"
},
// name => aslan, name => ahmet (Two words starting with same letter)
]
I'm new to MongoDB, and I'd appreciate it if anyone could point out where I'm going wrong.
Mongo Playground

You can do something like this:
Edit with guard improvement suggestions* (using $substrCP, $ifNull):
db.collection.aggregate([
{
$group: {
_id: {$substrCP: ["$name", 0, 1]},
name: {$push: "$name"}
}
},
{
$project: {_id: 0,
name: {
$arrayElemAt: [
"$name",
{$toInt: {$multiply: [{$rand: {}}, {$size: {$ifNull: ["$name",[]] }}]}
}
]
}
}
}
])
As you can see on this playground example.
The $group keeps a list of names for each first letter, and $arrayElemAt combined with $rand keeps only one random item from each list.
*Thanks to #Mbay and #Paul for the improvement suggestions
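
The pipeline above groups every name, including any that do not start with a letter. A minimal sketch, assuming the asker's A to Z restriction should be kept, puts a $match in front of the $group. The names letterPattern and onePerLetterPipeline are illustrative, and lowercase ASCII initials are an assumption based on the sample data.

```javascript
// Assumption: names start with a lowercase ASCII letter, as in the sample.
const letterPattern = "^[a-z]";

const onePerLetterPipeline = [
  // Drop names whose first character is not a to z.
  { $match: { name: { $regex: letterPattern } } },
  // Collect all names that share the same first code point.
  {
    $group: {
      _id: { $substrCP: ["$name", 0, 1] },
      names: { $push: "$name" }
    }
  },
  // Pick one random element: index = truncate(rand * size).
  {
    $project: {
      _id: 0,
      name: {
        $arrayElemAt: [
          "$names",
          { $toInt: { $multiply: [{ $rand: {} }, { $size: "$names" }] } }
        ]
      }
    }
  }
];

// In mongosh: db.collection.aggregate(onePerLetterPipeline)
```

Because $rand returns a value below 1, the computed index stays within the bounds of each names array.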

Including additional fields in a Mongodb aggregate query

I have a data structure like this. Each student has multiple entries, one for each time they enter a classroom. The query needs to get the latest record for each student, based on a list of student IDs and a department name. It should also show the teacher ID and the last timestamp.
[
{
"studentid": "stu-1234",
"dept": "geog",
"teacher_id": 1,
"LastSwipeTimestamp": "2021-11-25T10:50:00.5230694Z"
},
{
"studentid": "stu-1234",
"dept": "geog",
"teacher_id": 2,
"LastSwipeTimestamp": "2021-11-25T11:50:00.5230694Z"
},
{
"studentid": "stu-abc",
"dept": "geog",
"teacher_id": 11,
"LastSwipeTimestamp": "2021-11-25T09:15:00.5230694Z"
},
{
"studentid": "stu-abc",
"dept": "geog",
"teacher_id": 21,
"LastSwipeTimestamp": "2021-11-25T11:30:00.5230694Z"
}
]
Here is what I have, but it doesn't show teacher id or the last swipe timestamp. What do I need to change or add?

Maybe you need something like this
db.collection.aggregate([
{
$match: {
"studentid": {
"$in": [
"stu-abc",
"stu-1234"
]
},
"dept": "geog"
}
},
{
$sort: {
"LastSwipeTimestamp": -1
}
},
{
$group: {
"_id": {
"studentid": "$studentid",
"dept": "$dept"
},
"teacher_id": {
$first: "$teacher_id"
},
"LastSwipeTimestamp": {
$first: "$LastSwipeTimestamp"
}
}
},
{
$project: {
_id: 0,
"studentid": "$_id.studentid",
"dept": "$_id.dept",
"teacher_id": "$teacher_id",
"LastSwipeTimestamp": "$LastSwipeTimestamp"
}
}
])
Explanation:
Fields that are not part of the group key must be carried through the $group stage with an accumulator (here $first), so that they are also available to the following $project stage.
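
If more fields need to survive the grouping, listing each one with $first gets repetitive. A possible variant, sketched here under the same sort-then-group approach, keeps the whole newest document with $first: "$$ROOT" and promotes it with $replaceRoot. The function name latestSwipePipeline is hypothetical.

```javascript
// Builds the pipeline for a given list of student ids and a department.
function latestSwipePipeline(studentIds, dept) {
  return [
    { $match: { studentid: { $in: studentIds }, dept: dept } },
    // Newest swipe first, so $first below sees the latest record.
    { $sort: { LastSwipeTimestamp: -1 } },
    {
      $group: {
        _id: { studentid: "$studentid", dept: "$dept" },
        // Keep the entire newest document instead of single fields.
        latest: { $first: "$$ROOT" }
      }
    },
    // Return the stored document itself as the result.
    { $replaceRoot: { newRoot: "$latest" } }
  ];
}

// In mongosh:
//   db.collection.aggregate(latestSwipePipeline(["stu-abc", "stu-1234"], "geog"))
```

The timestamps in the sample are ISO 8601 strings in one format, so sorting them as strings gives chronological order; that would not hold for mixed formats.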

Query maximum N records of each group base on a condition in MongoDB?

I have a question regarding querying data in MongoDB. Here is my sample data:
{
"_id": 1,
"category": "fruit",
"userId": 1,
"name": "Banana"
},
{
"_id": 2,
"category": "fruit",
"userId": 2,
"name": "Apple"
},
{
"_id": 3,
"category": "fresh-food",
"userId": 1,
"name": "Fish"
},
{
"_id": 4,
"category": "fresh-food",
"userId": 2,
"name": "Shrimp"
},
{
"_id": 5,
"category": "vegetable",
"userId": 1,
"name": "Salad"
},
{
"_id": 6,
"category": "vegetable",
"userId": 2,
"name": "carrot"
}
The requirements:
If the category is fruit, return all matching records.
If the category is NOT fruit, return at most 10 records per category, grouped by user.
The categories are known and stable, so we can hard-code them in our query.
I want to get this done in a single query. The expected result is:
{
"fruit": [
... // All records of
],
"fresh-food": [
{
"userId": 1,
"data": [
// Top 10 records of user 1 with category = "fresh-food"
]
},
{
"userId": 2,
"data": [
// Top 10 records of user 2 with category = "fresh-food"
]
},
...
],
"vegetable": [
{
"userId": 1,
"data": [
// Top 10 records of user 1 with category = "vegetable"
]
},
{
"userId": 2,
"data": [
// Top 10 records of user 2 with category = "vegetable"
]
},
]
}
I've found guidance on limiting each group using $group and $slice, but I can't work out how to apply requirement #1.
Any help would be appreciated.

You need to use aggregation for this:
$facet to split the incoming data into two categories: fruit and non_fruit
$match to apply the condition in each branch
$group twice: the first groups the data by category and user, the second groups by category only
$arrayToObject to turn the array of k/v pairs into an object keyed by category
$replaceRoot to merge the non_fruit categories into the root alongside fruit
Here is the code:
db.collection.aggregate([
{
"$facet": {
"fruit": [
{ $match: { "category": "fruit" } }
],
"non_fruit": [
{
$match: {
$expr: {
$ne: [ "$category", "fruit" ]
}
}
},
{
$group: {
_id: { c: "$category", u: "$userId" },
data: { $push: "$$ROOT" }
}
},
{
$group: {
_id: "$_id.c",
v: {
$push: {
userId: "$_id.u",
data: { "$slice": [ "$data", 10 ] } // at most 10 records per user
}
}
}
},
{ $addFields: { "k": "$_id", _id: "$$REMOVE" } }
]
}
},
{ $addFields: { non_fruit: { "$arrayToObject": "$non_fruit" } }},
{
"$replaceRoot": {
"newRoot": {
"$mergeObjects": [ "$$ROOT", "$non_fruit" ]
}
}
},
{ $project: { non_fruit: 0 } }
])
Working Mongo playground

Find documents matching a condition depending on another document value

My collection contains the following documents:
{
"_id": "a",
"index": 1
},
{
"_id": "b",
"index": 2
},
{
"_id": "c",
"index": 3
},
{
"_id": "c",
"index": 4
}
Given an id, I would like to find all the documents having a greater index than the index corresponding to that id.
For example, if id="b", then index=2 and result would be
{
"_id": "c",
"index": 3
},
{
"_id": "c",
"index": 4
}
I thought I could use an aggregation pipeline with $addFields to add the searched index to each document and then use a $match, but I cannot find how to do it. In other words, my problem would be solved if I could produce this result:
{
"_id": "a",
"index": 1,
"ref_index": 2
},
{
"_id": "b",
"index": 2,
"ref_index": 2
},
{
"_id": "c",
"index": 3,
"ref_index": 2
},
{
"_id": "c",
"index": 4,
"ref_index": 2
}

I am not sure there is a straightforward way to do this in one operation. You can either run two queries or try the aggregation pipeline below:
$facet to split the results: getIndex holds the document matching _id: "b", and allDocs holds all documents
$filter to iterate over allDocs and keep the documents whose index is greater than the reference index
$unwind to deconstruct the allDocs array
$replaceRoot to promote each allDocs element to the root
db.collection.aggregate([
{
$facet: {
getIndex: [{ $match: { _id: "b" } }],
allDocs: [{ $match: {} }]
}
},
{
$project: {
allDocs: {
$filter: {
input: "$allDocs",
cond: {
$gt: [
"$$this.index",
{ $first: "$getIndex.index" }
]
}
}
}
}
},
{ $unwind: "$allDocs" },
{ $replaceRoot: { newRoot: "$allDocs" } }
])
Playground
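
The two-query alternative mentioned at the start of this answer could look roughly like the sketch below. The helper findWithGreaterIndex is hypothetical; it assumes coll exposes mongosh-style findOne(filter) and find(filter).toArray(), as db.collection does in the shell.

```javascript
// Hypothetical helper: returns all documents whose index is greater
// than the index of the document with the given _id.
function findWithGreaterIndex(coll, id) {
  // Query 1: fetch the reference document to read its index.
  const ref = coll.findOne({ _id: id });
  if (ref == null) {
    return []; // unknown id: nothing to compare against
  }
  // Query 2: every document with a strictly greater index.
  return coll.find({ index: { $gt: ref.index } }).toArray();
}

// In mongosh:
//   findWithGreaterIndex(db.collection, "b")
```

Unlike the $facet pipeline, this does not gather the whole collection into a single document first, at the cost of a second round trip.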

Filtering a mongodb query result based on the position of a field in an array

Apologies for the confusing title, I am not sure how to summarize this.
Suppose I have the following list of documents in a collection:
{ "name": "Lorem", "source": "A" }
{ "name": "Lorem", "source": "B" }
{ "name": "Ipsum", "source": "A" }
{ "name": "Ipsum", "source": "B" }
{ "name": "Ipsum", "source": "C" }
{ "name": "Foo", "source": "B" }
as well as an ordered list of accepted sources, where lower indexes signify higher priority
sources = ["A", "B"]
My query should:
Take a list of available sources and a list of wanted names
Return a maximum of one document per name.
In case of multiple matches, the document with the most prioritized source should be chosen.
Example:
wanted_names = ['Lorem', 'Ipsum', 'Foo', 'NotThere']
Result:
{ "name": "Lorem", "source": "A" }
{ "name": "Ipsum", "source": "A" }
{ "name": "Foo", "source": "B" }
The results don't necessarily have to be ordered.
Is it possible to do this with a Mongo query alone? If so could someone point me towards a resource detailing how to accomplish it?
My current solution doesn't support a list of names, and instead relies on a Python script to execute multiple queries:
db.collection.aggregate([
{$match: {
"name": "Lorem",
"source": {
$in: sources
}}},
{$addFields: {
"order": {
$indexOfArray: [sources, "$source"]
}}},
{$sort: {
"order": 1
}},
{$limit: 1}
]);
Note: _id fields are omitted in this question for the sake of brevity

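How about this: in $group, the $min operator picks the lowest source value for each name. This works here because the priority order ["A", "B"] happens to match alphabetical order.
Note: if you prioritize as ['B', 'A'], use $max instead.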
db.collection.aggregate([
{
$match: {
"name": {
$in: [
"Lorem",
"Ipsum",
"Foo",
"NotThere"
]
},
"source": {
$in: [
"A",
"B"
]
}
}
},
{
$group: {
_id: "$name",
source: {
$min: "$source"
}
}
},
{
$project: {
_id: 0,
name: "$_id",
source: 1
}
}
])
MongoPlayground
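
$min and $max only cover priority lists that are in alphabetical or reverse alphabetical order. For an arbitrary order, one option is to combine the asker's own $indexOfArray ranking with the grouping from this answer. This is a sketch under that assumption, and the function name bestSourcePipeline is hypothetical.

```javascript
// sources: accepted sources, highest priority first.
// wantedNames: names to look up.
function bestSourcePipeline(sources, wantedNames) {
  return [
    { $match: { name: { $in: wantedNames }, source: { $in: sources } } },
    // Rank each document by the position of its source in the list.
    { $addFields: { order: { $indexOfArray: [sources, "$source"] } } },
    // Best-ranked documents first, so $first picks the preferred source.
    { $sort: { order: 1 } },
    { $group: { _id: "$name", source: { $first: "$source" } } },
    { $project: { _id: 0, name: "$_id", source: 1 } }
  ];
}

// In mongosh:
//   db.collection.aggregate(
//     bestSourcePipeline(["A", "B"], ["Lorem", "Ipsum", "Foo", "NotThere"])
//   )
```

Names with no matching document, such as NotThere, simply produce no group, which matches the expected result in the question.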
