What is the proper way of combining 2 documents in MongoDB

I currently have 2 collections:
users that looks like:
const User = new Schema({
type: String,
required: true
type: String,
required: true
type: String,
required: false,
// ID of the guild a user belongs to
type: Schema.Types.ObjectId,
ref: 'guilds',
default: '61a679e18d84bff40c2f88fd',
required: true
type: Number,
required: true,
default: 100
guilds contains the objectID as _id and a field "name".
Now I would like to get a document by username and also the information of the guild that the user belongs to.
I read about using db.collection.aggregate; this, however, returns all users and their guild information. Is it possible to use $match inside the aggregation to get just that single username? I'm fairly new to MongoDB and am just trying things out. If you have any resources or documentation I'd be happy to read those too!
In SQL it would look something like:
SELECT * FROM users INNER JOIN guilds ON users.guildID = guilds.id WHERE users.username = 'SomeUsername'

Aggregations can solve this (not recommended):
[
  {
    $lookup: {
      from: 'guilds',
      as: 'guild',
      localField: 'guildID',
      foreignField: '_id'
    }
  },
  {
    $unwind: {
      path: '$guild',
      preserveNullAndEmptyArrays: true
    }
  },
  {
    $match: {
      $or: [
        { 'guild._id': guildId },
        { ... other options ... }
      ]
    }
  }
]
While this works and can be reasonably fast depending on your indexes and number of documents, it can be better to add frequently queried fields to the related documents. In your case: add guildId and guildName to your user.
While this duplicates data and might not be considered best practice in relational databases, it is common in document-based databases. This is the fastest solution.
The alternative to an aggregation or to embedding guild data in the user is to send two queries: one for the user, then one for the guild. This is called the relationship pattern, and I believe it is the most common solution.
Many (all?) ODM libraries, such as mongoose, resolve relationships automatically for you (mongoose calls this population), which can simplify querying a lot.
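To address the $match part of the question directly: yes, a $match stage can come first, so that only the one user is joined with its guild. Below is a minimal sketch, assuming the user field is called `username` (as in the SQL above) and the reference field is `guildID`; `buildUserWithGuildPipeline` is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: builds an aggregation pipeline that selects one user
// by username and joins the guild that user belongs to.
// Assumed field names: `username` and `guildID` on users, `_id` on guilds.
function buildUserWithGuildPipeline(username) {
  return [
    // Filter first so that $lookup only runs for the matching user.
    { $match: { username: username } },
    {
      $lookup: {
        from: 'guilds',
        localField: 'guildID',
        foreignField: '_id',
        as: 'guild'
      }
    },
    // $lookup yields an array; unwind it into a single embedded object.
    { $unwind: { path: '$guild', preserveNullAndEmptyArrays: true } }
  ];
}

const userPipeline = buildUserWithGuildPipeline('SomeUsername');
console.log(JSON.stringify(userPipeline, null, 2));
```

With Mongoose, such a pipeline would typically be passed to `Model.aggregate(...)` on the model compiled from the user schema. The population route would look roughly like `findOne({ username }).populate('guildID')`, provided the `ref` in the schema matches a registered model name.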


Mongoose findOne not working as expected on nested records

I've got a collection in MongoDB whose simplified version looks like this:
Dealers = [{
  Id: 123,
  Name: 'Someone',
  Email: 'someone@somewhere.com',
  Vehicles: [
    {
      Id: 1234,
      Make: 'Honda',
      Model: 'Civic'
    },
    {
      Id: 2345,
      Make: 'Ford',
      Model: 'Focus'
    },
    {
      Id: 3456,
      Make: 'Ford',
      Model: 'KA'
    }
  ]
}]
And my Mongoose Model looks a bit like this:
const vehicle_model = mongoose.Schema({
  Id: {
    type: Number
  },
  Email: {
    type: String
  },
  Vehicles: [{
    Id: {
      type: Number
    },
    Make: {
      type: String
    },
    Model: {
      type: String
    }
  }]
});
Note the Ids are not MongoDB Ids, just distinct numbers.
I try doing something like this:
const response = await vehicle_model.findOne({ 'Id': 123, 'Vehicles.Id': 1234 })
But when I do:
It returns all of the nested Vehicles records instead of the one I'm after.
What am I doing wrong?
This question is asked very frequently. Indeed someone asked a related question here just 18 minutes before this one.
When you query the database, you are requesting that it identify and return matching documents to the client. That is an entirely separate action from asking it to transform the shape of those documents before they are sent back to the client.
In MongoDB, the latter operation (transforming the shape of the document) is usually referred to as "Projection". Simple projections, specifically just returning a subset of the fields, can be done directly in find() (and similar) operations. Most drivers and the shell use the second argument to the method as the projection specification, see here in the documentation.
Your particular case is a little more complicated because you are looking to trim off some of the values in the array. There is a dedicated page in the documentation titled Project Fields to Return from Query which goes into more detail about different situations. Indeed, near the bottom is a section titled Project Specific Array Elements in the Returned Array which describes your situation more directly. That is where they describe usage of the positional $ operator. You can use that as a starting place as follows:
db.collection.find({
  "Id": 123,
  "Vehicles.Id": 1234
}, {
  "Vehicles.$": 1
})
Playground demonstration here.
If you need something more complex, then you would have to start exploring usage of the $elemMatch (projection) operator (not the query variant) or, as @nimrod serok mentions in the comments, using the $filter aggregation operator in an aggregation pipeline. The last option here is certainly the most expressive and flexible, but also the most verbose.
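For reference, the $filter route mentioned above could look roughly like the following. This is only a sketch under the assumption that the documents are shaped like the Dealers example in the question; `buildDealerVehiclePipeline` is a hypothetical helper name.

```javascript
// Hypothetical helper: pipeline that returns one dealer and keeps only the
// vehicles whose Id matches. Field names follow the question's documents.
function buildDealerVehiclePipeline(dealerId, vehicleId) {
  return [
    // Select the dealer document (same filter as the findOne in the question).
    { $match: { Id: dealerId, 'Vehicles.Id': vehicleId } },
    {
      // Overwrite Vehicles with the filtered array.
      $addFields: {
        Vehicles: {
          $filter: {
            input: '$Vehicles',
            as: 'vehicle',
            cond: { $eq: ['$$vehicle.Id', vehicleId] }
          }
        }
      }
    }
  ];
}

const dealerPipeline = buildDealerVehiclePipeline(123, 1234);
console.log(JSON.stringify(dealerPipeline, null, 2));
```

Unlike the positional $ projection, which returns only the first matching array element, $filter keeps every element that satisfies the condition, so it is the better fit when several vehicles may match.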

Sorting nested objects in MongoDB

So I have documents that follow this schema:
{
  _id: String,
  globalXP: {
    xp: {
      type: Number,
      default: 0
    },
    level: {
      type: Number,
      default: 0
    }
  },
  guilds: [{ _id: String, xp: Number, level: Number }]
}
So basically users have their own global XP and xp based on each guild they are in.
Now I want to make a leaderboard for all the users that have a certain guildID in their document.
What's the most efficient way to fetch all the user documents that have the guild _id in their guilds array and how do I sort them afterwards?
I know it might be messy as hell, but bear with me here.
If I've understood correctly, you only need this line of code:
var find = await model.find({"guilds._id":"your_guild_id"}).sort({"globalXP.level":-1})
This query returns all documents whose guilds array contains the given _id, sorted by the player's global level in descending order.
This way the highest level is displayed first.
Here is an example of how the query works. Please check whether it works as you expected.
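Note that the query above ranks by the global level. If the leaderboard should instead rank by the XP earned within that particular guild (an assumption about the intent, since the question can be read either way), one possible approach is an aggregation that unwinds the guilds array and sorts on the matching entry. `buildGuildLeaderboardPipeline` is a hypothetical helper name.

```javascript
// Hypothetical helper: pipeline for a per-guild leaderboard.
// Assumes documents shaped like the schema in the question.
function buildGuildLeaderboardPipeline(guildId, limit) {
  return [
    // Keep only users that are in the guild (can use an index on guilds._id).
    { $match: { 'guilds._id': guildId } },
    // One output document per (user, guild) pair.
    { $unwind: '$guilds' },
    // Drop the pairs that belong to other guilds.
    { $match: { 'guilds._id': guildId } },
    // Highest level first, XP as the tie-breaker.
    { $sort: { 'guilds.level': -1, 'guilds.xp': -1 } },
    { $limit: limit }
  ];
}

const leaderboardPipeline = buildGuildLeaderboardPipeline('your_guild_id', 10);
console.log(JSON.stringify(leaderboardPipeline, null, 2));
```

After the $unwind stage, `guilds` is a single object rather than an array, which is why the sort keys can address `guilds.level` and `guilds.xp` directly.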

MongoDB a field in a document is unique, but not required so getting duplicate error [duplicate]

I was wondering if there is way to force a unique collection entry but only if entry is not null.
Sample schema:
var UsersSchema = new Schema({
  name  : {type: String, trim: true, index: true, required: true},
  email : {type: String, trim: true, index: true, unique: true}
});
'email' in this case is not required but if 'email' is saved I want to make sure that this entry is unique (on a database level).
Empty entries seem to get the value 'null', so every entry with no email fails the 'unique' check (if there is already a different user with no email).
Right now I'm solving it on an application level, but would love to save that db query.
As of MongoDB v1.8+ you can get the desired behavior of ensuring unique values but allowing multiple docs without the field by setting the sparse option to true when defining the index. As in:
email : {type: String, trim: true, index: true, unique: true, sparse: true}
Or in the shell:
db.users.ensureIndex({email: 1}, {unique: true, sparse: true});
Note that a unique, sparse index still does not allow multiple docs with an email field with a value of null, only multiple docs without an email field.
See http://docs.mongodb.org/manual/core/index-sparse/
Yes, it is possible to have multiple documents with a field set to null or not defined, while enforcing unique "actual" values.
Requirements:
MongoDB v3.2+.
Knowing your concrete value type(s) in advance (e.g., always a string or object when not null).
If you're not interested in the details, feel free to skip to the implementation section.
longer version
To supplement @Nolan's answer, starting with MongoDB v3.2 you can use a partial unique index with a filter expression.
The partial filter expression has limitations. It can only include the following:
equality expressions (i.e. field: value or using the $eq operator),
$exists: true expression,
$gt, $gte, $lt, $lte expressions,
$type expressions,
$and operator at the top-level only
This means that the trivial expression {"yourField": {$ne: null}} cannot be used.
However, assuming that your field always uses the same type, you can use a $type expression.
{ field: { $type: <BSON type number> | <String alias> } }
MongoDB v3.6 added support for specifying multiple possible types, which can be passed as an array:
{ field: { $type: [ <BSON type1> , <BSON type2>, ... ] } }
which means that the value may be of any of the listed types when it is not null.
Therefore, if we want to allow the email field in the example below to accept either string or, say, binary data values, an appropriate $type expression would be:
{email: {$type: ["string", "binData"]}}
You can specify it in a mongoose schema:
const UsersSchema = new Schema({
  name: {type: String, trim: true, index: true, required: true},
  email: {
    type: String, trim: true, index: {
      unique: true,
      partialFilterExpression: {email: {$type: "string"}}
    }
  }
});
or directly add it to the collection (which uses the native node.js driver):
User.collection.createIndex("email", {
  unique: true,
  partialFilterExpression: {
    "email": {
      $type: "string"
    }
  }
});
native mongodb driver
using collection.createIndex:
collection.createIndex({
  "email": 1
}, {
  unique: true,
  partialFilterExpression: {
    "email": {
      $type: "string"
    }
  }
}, function (err, results) {
  // ...
});
mongodb shell
using db.collection.createIndex:
db.collection.createIndex({
  "email": 1
}, {
  unique: true,
  partialFilterExpression: {
    "email": {$type: "string"}
  }
});
This will allow inserting multiple records with a null email, or without an email field at all, but not with the same email string.
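To make the resulting behaviour concrete, here is a small in-memory model of the rule. It is plain JavaScript and not MongoDB itself; it only mimics which documents a unique index with partialFilterExpression {email: {$type: "string"}} would cover. The function names and sample data are made up for illustration.

```javascript
// In-memory model (not MongoDB) of a unique index on `email` with
// partialFilterExpression { email: { $type: "string" } }.
// Only documents whose email is a string are covered by the index.
function isIndexed(doc) {
  return typeof doc.email === 'string';
}

// Returns true if inserting `doc` would hit a duplicate key in this model.
function wouldViolateUnique(existingDocs, doc) {
  if (!isIndexed(doc)) return false; // null or missing email: not indexed
  return existingDocs.some((d) => isIndexed(d) && d.email === doc.email);
}

const existingUsers = [
  { name: 'a', email: null },
  { name: 'b' },
  { name: 'c', email: 'c@example.com' }
];

console.log(wouldViolateUnique(existingUsers, { name: 'd', email: null }));            // false
console.log(wouldViolateUnique(existingUsers, { name: 'e' }));                         // false
console.log(wouldViolateUnique(existingUsers, { name: 'f', email: 'c@example.com' })); // true
```

The first two inserts are accepted because documents with a null or missing email fall outside the filter and are never compared; only the repeated string value collides.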
Just a quick update to those researching this topic.
The selected answer will work, but you might want to consider using partial indexes instead.
Changed in version 3.2: Starting in MongoDB 3.2, MongoDB provides the
option to create partial indexes. Partial indexes offer a superset of
the functionality of sparse indexes. If you are using MongoDB 3.2 or
later, partial indexes should be preferred over sparse indexes.
More doco on partial indexes: https://docs.mongodb.com/manual/core/index-partial/
Actually, only the first document in which the "email" field does not exist will be saved successfully. Subsequent saves where "email" is not present will fail with an error (see the code snippet below). For the reason, see the official MongoDB documentation on Unique Indexes and Missing Keys at http://www.mongodb.org/display/DOCS/Indexes#Indexes-UniqueIndexes.
// NOTE: Code to be executed in the mongo console.
db.things.ensureIndex({firstname: 1}, {unique: true});
db.things.save({lastname: "Smith"});
// Next operation will fail because of the unique index on firstname.
db.things.save({lastname: "Jones"});
By definition, a unique index allows each value to be stored only once. If you consider null to be one such value, it can only be inserted once! You are correct in your approach of ensuring and validating it at the application level. That is how it can be done.
You may also like to read this http://www.mongodb.org/display/DOCS/Querying+and+nulls

JSON Schema with dynamic key field in MongoDB

I want to have i18n support for objects stored in a MongoDB collection.
Currently our schema is like:
{
  _id: "id",
  name: "name",
  localization: [{
    lan: "en-US",
    name: "name_in_english"
  }, {
    lan: "zh-TW",
    name: "name_in_traditional_chinese"
  }]
}
but my thought is that the field "lan" is unique, so can I just use this field as a key? The structure would then be
{
  _id: "id",
  name: "name",
  localization: {
    "en-US": "name_in_english",
    "zh-TW": "name_in_traditional_chinese"
  }
}
which would be neater and easier to parse (just localization[language] would get the value I want for a specific language).
But then the question is: Is this a good practice in storing data in MongoDB? And how to pass the json-schema check?
It is not a good practice to have values as keys. The language codes are values and as you say you can not validate them against a schema. It makes querying against it impossible. For example, you can't figure out if you have a language translation for "nl-NL" as you can't compare against keys and neither is it possible to easily index this. You should always have descriptive keys.
However, as you say, having the languages as keys makes it a lot easier to pull the data out as you can just access it by ['nl-NL'] (or whatever your language's syntax is).
I would suggest an alternative schema:
{
  your_id: "id_for_name",
  lan: "en-US",
  name: "name_in_english"
}
{
  your_id: "id_for_name",
  lan: "zh-TW",
  name: "name_in_traditional_chinese"
}
Now you can:
set an index on { your_id: 1, lan: 1 } for speedy lookups
query for each translation individually and just get that translation:
db.so.find( { your_id: "id_for_name", lan: 'en-US' } )
query for all the versions for each id using this same index:
db.so.find( { your_id: "id_for_name" } )
and also much more easily update the translation for a specific language:
db.so.update(
  { your_id: "id_for_name", lan: 'en-US' },
  { $set: { name: "ooga" } }
)
None of those points is possible with your suggested schemas.
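The convenience of localization[language] can still be had on the application side after querying. A minimal sketch, assuming the query returns documents shaped like the alternative schema above; `toLocalizationMap` is a hypothetical helper name.

```javascript
// Hypothetical helper: turns an array of translation documents
// ({ your_id, lan, name }) into a lookup object keyed by language code.
function toLocalizationMap(translations) {
  const map = {};
  for (const t of translations) {
    map[t.lan] = t.name;
  }
  return map;
}

// Sample data standing in for the result of
// db.so.find({ your_id: "id_for_name" })
const translationRows = [
  { your_id: 'id_for_name', lan: 'en-US', name: 'name_in_english' },
  { your_id: 'id_for_name', lan: 'zh-TW', name: 'name_in_traditional_chinese' }
];

const localization = toLocalizationMap(translationRows);
console.log(localization['zh-TW']); // name_in_traditional_chinese
```

This keeps the stored documents queryable and indexable by language while giving the calling code the keyed access the question was after.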
Obviously the second schema example is much better for your task (provided, of course, that the lan field is unique as you mentioned, which seems true to me as well).
Getting an element from a dictionary/associative array/mapping/whatever_it_is_called_in_your_language is much cheaper than scanning a whole array of values. In this case it is also more efficient in terms of storage size: remember that all fields are stored in MongoDB as-is, so every record holds the whole key name for each JSON field, not its representation or index or whatever.
My experience shows that MongoDB is mature enough to be used as the main storage for your application, even under high loads (whatever that means ;) ). The main problem is how you fight database-level locks (we'll wait for the promised table-level locks, which I hope will speed MongoDB up a lot more), though data loss is possible if your MongoDB cluster is built badly (dig into docs and articles on the Internet for more information).
As for the schema check, you must do it in your programming language on the application side before inserting records; that's why Mongo is called schemaless.
There is a case where an object is necessarily better than an array: supporting upserts into a set. For example, if you want to update an item having name 'item1' to have val 100, or insert such an item if one doesn't exist, all in one atomic operation. With an array, you'd have to do one of two operations. Given a schema like
{ _id: 'some-id', itemSet: [ { name: 'an-item', val: 123 } ] }
you'd have commands
// Update:
db.coll.update(
  { _id: id, 'itemSet.name': 'item1' },
  { $set: { 'itemSet.$.val': 100 } }
)
// Insert:
db.coll.update(
  { _id: id, 'itemSet.name': { $ne: 'item1' } },
  { $addToSet: { 'itemSet': { name: 'item1', val: 100 } } }
)
You'd have to query first to know which is needed in advance, which can exacerbate race conditions unless you implement some versioning. With an object, you can simply do
db.coll.update(
  { _id: id },
  { $set: { 'itemSet.name': 'item1', 'itemSet.val': 100 } }
)
If this is a use case you have, then you should go with the object approach. One drawback is that querying for a specific name requires scanning. If that is also needed, you can add a separate array specifically for indexing. This is a trade-off with MongoDB. Upserts would become
db.coll.update(
  { _id: id },
  {
    $set: { 'itemSet.name': 'item1', 'itemSet.val': 100 },
    $addToSet: { itemNames: 'item1' }
  }
)
and the query would then simply be
db.coll.find({ itemNames: 'item1' })
(Note: the $ positional operator does not support array upserts.)

