List all existing values of a property? - mongodb

Assume I have a Student collection:
{
name: "ABC",
age: 10,
address: {
city: "CITY1",
state: "STATE",
}
}
{
name: "DEF",
age: 11,
address: {
city: "CITY2",
state: "STATE",
}
}
{
name: "ABC",
age: 12,
address: {
city: "CITY1",
state: "STATE",
}
}
Can I get the list of all unique city values from the collection? For example, with the above 3 documents, I would like to get the list {"CITY1", "CITY2"}.
I am just getting started with MongoDB, coming from relational databases, so this is a little confusing for me. In a relational database I would have a separate Address table and could just use SELECT DISTINCT to get what I want.

MongoDB has a similar db.collection.distinct() command.
To access elements in the address subdocument you need to use dot notation, so the complete query would be:
db.Student.distinct("address.city")
Some helpful documentation links to help you make the translation from SQL queries:
SQL to MongoDB Mapping Chart
SQL to Aggregation Mapping Chart
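To make the dot notation concrete, here is a minimal plain JavaScript sketch of what a distinct over "address.city" computes. It runs on an in-memory array rather than a real collection, and getByPath and distinctValues are hypothetical helpers written for this illustration, not MongoDB functions.

```javascript
// Sample documents mirroring the Student collection from the question.
const students = [
  { name: "ABC", age: 10, address: { city: "CITY1", state: "STATE" } },
  { name: "DEF", age: 11, address: { city: "CITY2", state: "STATE" } },
  { name: "ABC", age: 12, address: { city: "CITY1", state: "STATE" } },
];

// Hypothetical helper: follows a dotted path such as "address.city"
// through nested objects and returns undefined if a step is missing.
function getByPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Hypothetical helper: collects each value found at the path exactly once.
function distinctValues(docs, path) {
  const seen = new Set();
  for (const doc of docs) {
    const value = getByPath(doc, path);
    if (value !== undefined) seen.add(value);
  }
  return Array.from(seen);
}

const cities = distinctValues(students, "address.city");
console.log(cities); // [ 'CITY1', 'CITY2' ]
```

Against a real database the single db.Student.distinct("address.city") call above does this work on the server.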

Just for notes, there is already distinct as mentioned, but for a more conventional response, use aggregate:
db.Student.aggregate([
{"$unwind": "$address" },
{"$group": { "_id": "$address.city" }},
{"$project": { "_id": 0, "city" : "$_id" }}
])
Long winded compared to distinct, but it depends on what your eyes want.

Related

How to update fields in a MongoDB collection if certain conditions met between two collections?

What am I doing?
So I am trying to update two fields in my MongoDB collection. The collection name is mydata and looks like this:
{
id: 123,
name: "John",
class: "A-100",
class_id: "", // Need to update this field
class_type: "", // Need to update this field
}
What do I want to do?
I have another, older collection that contains two fields I need but do not have in my current collection. Both collections have a corresponding id field. This is what the other collection looks like:
{
id: 123,
name: "John",
class: "A-100",
class_id: 235, // Field that I need
class_type: "Math" // Field that I need
}
What have I done so far?
I started an aggregate function that starts with a $lookup then $unwind then $match then $project. Looks like this:
db.mydata.aggregate([
{
$lookup: {
from: "old_collection",
localField: "id",
foreignField: "id",
as: "newData"
}
},
{
$unwind: "$newData"
},
{
$match: {"class": "A-100"}
},
{
$project: {
_id: 0,
"id": "$newData.id",
"class_id": "$newData.class_id",
"class_type": "$newData.class_type"
}
},
// Need help here: how do I update the fields in the mydata
// collection that I pointed out at the top?
])
In summary
What I am trying to do is: If two objects from different collections have the same Id then pick the keys from the second object and update the keys in the first object.
Is there a way to do that in MongoDB?
Thanks.

Is there a way to sort the order of columns in mongodb?

I am learning MongoDB and I've encountered a thing that mildly annoys me.
Let's say I got this collection:
[
{
_id: ObjectId("XXXXXXXXXXXXXX"),
name: "Tom",
followers: 10,
active: true
},
{
_id: ObjectId("XXXXXXXXXXXXXX"),
name: "Rob",
followers: 109,
active: true
},
{
_id: ObjectId("XXXXXXXXXXXXXX"),
name: "Jacob",
followers: 2,
active: false
}
]
and I rename the name column to username with the command:
db.getCollection('users').update({}, { $rename: { "name" : "username" }}, false, true)
now the username property is at the end of the record, example:
[
// ... rest of collection has the same structure
{
_id: ObjectId("XXXXXXXXXXXXXX"),
followers: 109,
active: true,
username: "Rob"
}
// ... rest of collection has the same structure
]
How do I prevent this from happening, or how do I place the fields in a specific order? This is infuriating to work with in Robo/Studio 3T. I've got a collection with about 15 columns which are now out of order in the GUI because of this.
The $rename operator logically performs an $unset of both the old name and the new name, and then performs a $set operation with the new name. As such, the operation may not preserve the order of the fields in the document; i.e. the renamed field may move within the document.
Documentation
This has been the behaviour since version 2.6.
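The quoted description of $rename can be imitated on a plain JavaScript object to see why the field ends up last. This is only a sketch: simulateRename is a hypothetical helper, not a MongoDB API, and it relies on JavaScript appending newly set string keys after the existing ones.

```javascript
// Hypothetical helper that imitates the documented $rename steps on a
// plain object: remove the old and the new name, then set the new name.
function simulateRename(doc, oldName, newName) {
  const copy = { ...doc };
  if (!Object.prototype.hasOwnProperty.call(copy, oldName)) {
    return copy; // nothing to rename
  }
  const value = copy[oldName];
  delete copy[oldName];
  delete copy[newName];
  copy[newName] = value; // a newly set key is appended after the others
  return copy;
}

const before = { _id: 1, name: "Rob", followers: 109, active: true };
const after = simulateRename(before, "name", "username");
console.log(Object.keys(after)); // [ '_id', 'followers', 'active', 'username' ]
```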
Since it is JSON based, you can get any field easily, and you have very few columns.
Keys in JSON objects are in their very nature unordered. See RFC 4627 which defines JSON, section 1 "Introduction":
An object is an unordered collection of zero or more name/value
pairs, where a name is a string and a value is a string, number,
boolean, null, object, or array.
(Emphasis mine)
Therefore, it would even be correct, if you wrote
{
"name": "Joe",
"city": "New York"
}
and got back
{
"city": "New York",
"name": "Joe"
}
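If a particular key order still matters for display, one option is to rebuild each document with its keys in the wanted order before writing it back. The sketch below does this for a plain JavaScript object; reorderFields is a hypothetical helper, and whether a given GUI then shows that order is an assumption, not something the text above guarantees.

```javascript
// Small own-property check so inherited names such as "toString" are ignored.
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical helper: returns a copy of doc whose keys follow `order`.
// Keys not listed in `order` keep their relative position at the end.
function reorderFields(doc, order) {
  const result = {};
  for (const key of order) {
    if (hasOwn(doc, key)) result[key] = doc[key];
  }
  for (const key of Object.keys(doc)) {
    if (!hasOwn(result, key)) result[key] = doc[key];
  }
  return result;
}

const renamed = { _id: 1, followers: 109, active: true, username: "Rob" };
const reordered = reorderFields(renamed, ["_id", "username"]);
console.log(Object.keys(reordered)); // [ '_id', 'username', 'followers', 'active' ]
```

The reordered object could then be written back with a whole-document replacement such as replaceOne, which is left out here because it needs a live database.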

Mongodb Aggregate function with reference to another collection

I have two collections
Persons
{
"name": "Tom",
"car_id": "55b73e3e8ead0e220d8b45f3"
}
...
Cars
{
"model": "BMW",
"car_id": "55b73e3e8ead0e220d8b45f3"
}
...
How can I write a query that gives me the following results (for example)?
BMW : 2
Toyota: 3
Below is my aggregate function. I can get the data out, but it groups by the car_id instead of the model name.
db.persons.aggregate([
{
$group: {
_id: {
car_Id: "$car_id",
},
carsCount: {$sum: 1}
},
},
]);
Appreciate any assistance.
You can aggregate within one collection with a single query. I would suggest keeping the aggregated data like this, or similar:
{
car_id: "273645jhg2f52345hj3564jh6",
sum: 4
}
Then replace the ID with the model name when you need to expose the data to the user.
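The two steps described here (count per car_id first, swap in the model name only for display) can be sketched in plain JavaScript on in-memory arrays. The sample data is made up to reproduce the counts from the question, the second car_id is hypothetical, and countByCarId and labelCounts are illustrative helpers, not MongoDB calls.

```javascript
// Illustrative data: only the first car_id appears in the question.
const cars = [
  { model: "BMW", car_id: "55b73e3e8ead0e220d8b45f3" },
  { model: "Toyota", car_id: "55b73e3e8ead0e220d8b45f4" }, // hypothetical id
];
const persons = [
  { name: "Tom", car_id: "55b73e3e8ead0e220d8b45f3" },
  { name: "Ann", car_id: "55b73e3e8ead0e220d8b45f3" },
  { name: "Bob", car_id: "55b73e3e8ead0e220d8b45f4" },
  { name: "Eve", car_id: "55b73e3e8ead0e220d8b45f4" },
  { name: "Sam", car_id: "55b73e3e8ead0e220d8b45f4" },
];

// Step 1: count persons per car_id (what the $group stage produces).
function countByCarId(people) {
  const counts = new Map();
  for (const person of people) {
    counts.set(person.car_id, (counts.get(person.car_id) || 0) + 1);
  }
  return counts;
}

// Step 2: replace each id with its model name when presenting the result.
// An id without a matching car is kept as the label.
function labelCounts(counts, carList) {
  const modelById = new Map(carList.map((car) => [car.car_id, car.model]));
  const labelled = {};
  for (const [id, count] of counts) {
    const label = modelById.has(id) ? modelById.get(id) : id;
    labelled[label] = count;
  }
  return labelled;
}

const result = labelCounts(countByCarId(persons), cars);
console.log(result); // { BMW: 2, Toyota: 3 }
```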

Can I use populate before aggregate in mongoose?

I have two models, one is user
userSchema = new Schema({
userID: String,
age: Number
});
and the other is the score recorded several times everyday for all users
ScoreSchema = new Schema({
userID: {type: String, ref: 'User'},
score: Number,
created_date: Date,
....
})
I would like to run some queries/calculations on the scores of users who meet a specific requirement. For example, I would like to calculate the day-by-day average score for all users older than 20.
My thought is to first populate the Scores with the users' ages and then run the aggregate after that.
Something like
Score.
populate('userID','age').
aggregate([
{$match: {'userID.age': {$gt: 20}}},
{$group: ...},
{$group: ...}
], function(err, data){});
Is it OK to use populate before aggregate? Or should I first find all the userIDs meeting the requirement, save them in an array, and then use $in to match the score documents?
No you cannot call .populate() before .aggregate(), and there is a very good reason why you cannot. But there are different approaches you can take.
The .populate() method works "client side" where the underlying code actually performs additional queries ( or more accurately an $in query ) to "lookup" the specified element(s) from the referenced collection.
In contrast .aggregate() is a "server side" operation, so you basically cannot manipulate content "client side", and then have that data available to the aggregation pipeline stages later. It all needs to be present in the collection you are operating on.
A better approach here is available with MongoDB 3.2 and later, via the $lookup aggregation pipeline operation. Also probably best to handle from the User collection in this case in order to narrow down the selection:
User.aggregate(
[
// Filter first
{ "$match": {
"age": { "$gt": 20 }
}},
// Then join
{ "$lookup": {
"from": "scores",
"localField": "userID",
"foreignField": "userID",
"as": "score"
}},
// More stages
],
function(err,results) {
}
)
This is basically going to include a new field "score" within the User object as an "array" of items that matched on "lookup" to the other collection:
{
"userID": "abc",
"age": 21,
"score": [{
"userID": "abc",
"score": 42,
// other fields
}]
}
The result is always an array, as the general expected usage is a "left join" of a possible "one to many" relationship. If no result is matched then it is just an empty array.
To use the content, just work with an array in any way. For instance, you can use the $arrayElemAt operator in order to just get the single first element of the array in any future operations. And then you can just use the content like any normal embedded field:
{ "$project": {
"userID": 1,
"age": 1,
"score": { "$arrayElemAt": [ "$score", 0 ] }
}}
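The left-join behaviour described above, and taking only the first joined element, can be imitated on in-memory arrays. This is a plain JavaScript sketch with made-up data; lookup is a hypothetical helper that mimics the shape of the $lookup output, not the server implementation.

```javascript
// Made-up sample data in the shape used above.
const users = [
  { userID: "abc", age: 21 },
  { userID: "def", age: 35 }, // has no scores recorded
  { userID: "ghi", age: 18 }, // removed by the age filter
];
const scores = [
  { userID: "abc", score: 42 },
  { userID: "abc", score: 50 },
  { userID: "ghi", score: 10 },
];

// Hypothetical helper: attaches to every local document an array of the
// foreign documents whose foreignField equals the local localField.
function lookup(localDocs, foreignDocs, localField, foreignField, as) {
  return localDocs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

// Filter first, then join, in the same order as the pipeline above.
const adults = users.filter((user) => user.age > 20);
const joined = lookup(adults, scores, "userID", "userID", "score");

// Keep only the first joined element, similar in spirit to $arrayElemAt
// with index 0. In plain JavaScript an empty array yields undefined here.
const firstOnly = joined.map((user) => ({ ...user, score: user.score[0] }));

console.log(joined.map((user) => user.score.length)); // [ 2, 0 ]
console.log(firstOnly[0].score); // { userID: 'abc', score: 42 }
```

Note that "def" stays in the result with an empty array, which is the "left join" property the text refers to.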
If you don't have MongoDB 3.2 available, then your other option to process a query limited by the relations of another collection is to first get the results from that collection and then use $in to filter on the second:
// Match the user collection
User.find({ "age": { "$gt": 20 } },function(err,users) {
// Get id list
var userList = users.map(function(user) {
return user.userID;
});
Score.aggregate(
[
// use the id list to select items
{ "$match": {
"userID": { "$in": userList }
}},
// more stages
],
function(err,results) {
}
);
});
So getting the list of valid users from the first collection to the client, and then feeding that list into a query on the other collection, is the only way to do this in earlier releases.
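The data flow of this two-step approach can be seen without a database. The following plain JavaScript sketch uses made-up in-memory arrays in place of the two collections; the comments name the query each step stands in for.

```javascript
// Made-up sample data standing in for the two collections.
const users = [
  { userID: "abc", age: 21 },
  { userID: "ghi", age: 18 },
];
const scores = [
  { userID: "abc", score: 42 },
  { userID: "abc", score: 50 },
  { userID: "ghi", score: 10 },
];

// Step 1: stands in for User.find({ age: { $gt: 20 } }) plus the map
// that extracts userID from each returned user.
const userList = users
  .filter((user) => user.age > 20)
  .map((user) => user.userID);

// Step 2: stands in for { $match: { userID: { $in: userList } } }.
const matched = scores.filter((entry) => userList.includes(entry.userID));

console.log(userList); // [ 'abc' ]
console.log(matched.map((entry) => entry.score)); // [ 42, 50 ]
```

The cost of this design is that the id list travels to the client and back, which is why the $lookup approach is preferred where it is available.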

Get last record for several items at once with mongo

In my mongo database, I have basically 2 collections:
pupils
{_id: ObjectId("539ab7ffefbb93120c9697f7"), firstname: 'Arnold', lastname: 'Smith'}
{_id: ObjectId("539ab7ffefbb93120c5473c3"), firstname: 'Steven', lastname: 'Jens'}
marks
{ date: '2014-06-12', value: 12, pupilID: ObjectId("539ab7ffefbb93120c9697f7")}
{ date: '2014-06-05', value: 9, pupilID: ObjectId("539ab7ffefbb93120c9697f7")}
{ date: '2014-05-10', value: 17, pupilID: ObjectId("539ab7ffefbb93120c9697f7")}
{ date: '2014-05-10', value: 7, pupilID: ObjectId("539ab7ffefbb93120c5473c3")}
Is there a way in the mongo shell to get the last mark of each pupil without having to manually loop through the list of pupils and get the last mark for each one?
Currently I loop through all pupils and, for each one, perform a:
db.marks.find({pupilID: pupilID}).sort({_id: -1}).limit(1)
But I'm quite concerned about performance if the marks collection contains a large number of items.
Well, your dates are not the best example here as they are strings. You should convert them to proper "Date" types, but at least they sort correctly in lexical order.
This is not the "join" you seem to be implicitly looking for, but you can get the $last mark for each pupil from your "marks" collection, which will probably go some way toward the result you want:
db.marks.aggregate([
{ "$sort": { "date": 1 } },
{ "$group": {
"_id": "$pupilID",
"date": { "$last": "$date" },
"value": { "$last": "$value" }
}}
])
And that will give you the last mark "value" by date for each "pupilID". The joining of data is up to you, but this is better than looping over whole collections or otherwise firing off one query per "pupil".
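To see what the sort followed by $last per group yields, the same idea can be run on the sample marks in plain JavaScript. This is a sketch on an in-memory array, with pupilID written as plain strings for simplicity; lastMarkPerPupil is a hypothetical helper, not a MongoDB function.

```javascript
// The marks from the question, with pupilID as plain strings.
const marks = [
  { date: "2014-06-12", value: 12, pupilID: "539ab7ffefbb93120c9697f7" },
  { date: "2014-06-05", value: 9, pupilID: "539ab7ffefbb93120c9697f7" },
  { date: "2014-05-10", value: 17, pupilID: "539ab7ffefbb93120c9697f7" },
  { date: "2014-05-10", value: 7, pupilID: "539ab7ffefbb93120c5473c3" },
];

// Hypothetical helper: sort ascending by date, then let each later mark
// overwrite the earlier one for the same pupil, so the last one remains.
function lastMarkPerPupil(allMarks) {
  const sorted = [...allMarks].sort((a, b) =>
    a.date < b.date ? -1 : a.date > b.date ? 1 : 0
  );
  const last = new Map();
  for (const mark of sorted) {
    last.set(mark.pupilID, {
      _id: mark.pupilID,
      date: mark.date,
      value: mark.value,
    });
  }
  return Array.from(last.values());
}

const lastMarks = lastMarkPerPupil(marks);
console.log(lastMarks);
```

For the first pupil this keeps the mark of 12 from 2014-06-12, and for the second pupil the single mark of 7.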