Unique index in MongoDB 3.2 ignoring null values

I want to add a unique index on a field while ignoring null values in that field, and also ignoring the documents that are filtered out by a partialFilterExpression.
The problem is that sparse indexes can't be combined with partial indexes.
Also, a unique index indexes documents that are missing the key field under a null value, so those documents can't be excluded with an $exists criterion in the partialFilterExpression.
Is it possible to get around this in MongoDB 3.2?
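To make the constraint concrete, here is a minimal sketch of the two attempts that MongoDB rejects (the collection and field names are only placeholders taken from the answers below):
// Rejected: $exists: false is not a supported operator in a partialFilterExpression
db.collectionName.createIndex(
  { houseName: 1 },
  { unique: true, partialFilterExpression: { houseName: { $exists: false } } }
);
// Also rejected: the sparse and partialFilterExpression options cannot be combined
db.collectionName.createIndex(
  { houseName: 1 },
  { unique: true, sparse: true, partialFilterExpression: { houseName: { $type: "string" } } }
);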

I am adding this answer because I was looking for a solution and didn't find one. It may not answer this exact question, but it will help a lot of others out there like me.
For example, if the field that may be null is houseName and its non-null values are of type string, the solution can look like this:
db.collectionName.createIndex(
  { name: 1, houseName: 1 },
  { unique: true, partialFilterExpression: { houseName: { $type: "string" } } }
);
This will ignore null (and missing) values in the field houseName and still enforce uniqueness.
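A quick check of the behaviour, assuming the index above (the document contents here are made up purely for illustration):
db.collectionName.insert({ name: "a" });                     // ok: houseName missing, not indexed
db.collectionName.insert({ name: "b", houseName: null });    // ok: null is not a string, not indexed
db.collectionName.insert({ name: "c", houseName: "Villa" }); // ok: first indexed entry
db.collectionName.insert({ name: "c", houseName: "Villa" }); // fails with an E11000 duplicate key error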

Yes, you can create a partial index in MongoDB 3.2.
Please see https://docs.mongodb.org/manual/core/index-partial/#index-type-partial
MongoDB recommends partial indexes over sparse indexes, so I'd suggest dropping your sparse index in favour of a partial index.
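A minimal sketch of that migration, reusing the houseName field from the answer above (the index name is a placeholder; check db.collectionName.getIndexes() for the real one):
// Drop the existing sparse index (name is a placeholder)
db.collectionName.dropIndex("houseName_1");
// Recreate it as a unique partial index that only covers string values
db.collectionName.createIndex(
  { houseName: 1 },
  { unique: true, partialFilterExpression: { houseName: { $type: "string" } } }
);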

You can create a partial index in MongoDB 3.2.
For example, if ipaddress can be "", but "127.0.0.1" should be unique, the solution can look like this:
db.collectionName.createIndex(
  { "ipaddress": 1 },
  { "unique": true, "partialFilterExpression": { "ipaddress": { "$gt": "" } } }
);
This will ignore "" values in the ipaddress field and still be unique.

To create such an index in MongoDB Compass you must write the filter expression as JSON:
{
  "YourField" : {
    "$exists" : true,
    "$gt" : "0",
    "$type" : "string"
  }
}
To find the other supported types, see this link.

Yes, it can be a bit of a problem that the partial filter expression cannot contain any 'not' filters.
For those who might be interested in a C# solution for an index like this, here is an example.
We have a 'User' entity, which has a one-to-one 'relation' to a 'Doctor' entity.
This relation is represented by the optional, nullable field 'DoctorId' on the 'User' entity. In other words, there is a requirement that a given 'Doctor' can be linked to only a single 'User' at a time.
So we need a unique index which fires an exception when something attempts to set DoctorId to a Guid that is already set on any other 'User' entity. At the same time, multiple 'null' entries must be allowed for the 'DoctorId' field, since many users do not have a doctor attached to them.
The code to build this kind of index looks like this:
var uniqueDoctorIdIndexDefinition = new IndexKeysDefinitionBuilder<User>()
    .Ascending(o => o.DoctorId);

var existsFilter = Builders<User>.Filter.Exists(o => o.DoctorId);
var notNullFilter = Builders<User>.Filter.Type(o => o.DoctorId, BsonType.String);
var andFilter = Builders<User>.Filter.And(existsFilter, notNullFilter);

var createIndexOptions = new CreateIndexOptions<User>
{
    Unique = true,
    Name = UniqueDoctorIdIndexName,
    PartialFilterExpression = andFilter,
};

var uniqueDoctorIdIndex = new CreateIndexModel<User>(
    uniqueDoctorIdIndexDefinition,
    createIndexOptions);

users.Indexes.CreateOne(uniqueDoctorIdIndex);
You will probably need to specify the BsonType of the 'DoctorId' field explicitly in your 'User' class mapping, using an attribute; in our case it was:
[BsonRepresentation(BsonType.String)]
public Guid? DoctorId { get; set; }
I am sure there is a more compact and proficient solution to this problem, so I would be happy if somebody suggested it here.

Here is an example that I modified from the MongoDB partial index documentation:
db.contacts.createIndex(
{ email: 1 },
{ unique: true, partialFilterExpression: { email: { $exists: true } } }
)
IMPORTANT
To use the partial index, a query must contain the filter expression (or a modified filter expression that specifies a subset of the filter expression) as part of its query condition.
You can see that queries such as:
db.contacts.find({'email':'name@email.com'}).explain()
will indicate that they are doing an index scan, even if you don't specify {$exists: true}, because you're implicitly specifying a subset of the partialFilterExpression by specifying an email in your filter.
On the other hand, the following query will do a collection scan:
db.contacts.find({email: {$exists: false}})
WARNING
mythicalcoder's answer (currently the highest-voted answer) is very misleading, because while it successfully creates a unique index, the query planner will not generally be able to use the index you've created unless you also add houseName: {$type: "string"} to your filter expression. This can have performance costs you might not be aware of and can cause problems down the road.
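As a sketch of that workaround, reusing the houseName index from the earlier answer, the partial-filter predicate is repeated in the query so the planner can match it against the partialFilterExpression:
// Repeat the $type predicate from the partialFilterExpression in the query,
// then check the plan with explain() to confirm an IXSCAN on the partial index
db.collectionName.find({ name: "c", houseName: { $eq: "Villa", $type: "string" } }).explain();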

Mongodb create index for boolean and integer fields

user collection
[{
deleted: false,
otp: 3435,
number: '+919737624720',
email: 'Test@gmail.com',
name: 'Test child name',
coin: 2
},
{
deleted: false,
otp: 5659,
number: '+917406732496',
email: 'anand.satyan@gmail.com',
name: 'Nivaan',
coin: 0
}
]
I am using the command below to create the index. It looks like it is working for strings,
but I am not sure it is correct for the number and boolean fields.
db.users.createIndex({name:"text", email: "text", coin: 1, deleted: 1})
I am using this command to filter data:
db.users.find({$text:{$search:"anand.satya"}}).pretty()
db.users.find({$text:{$search:"test"}}).pretty()
db.users.find({$text:{$search:2}}).pretty()
db.users.find({$text:{$search:false}}).pretty()
The string-related fields are working, but the numeric and boolean fields are not.
Please tell me how I should create an index for them.
The title and comments in this question are misleading. Part of the question is more focused on how to query with fields that contain boolean and integer fields while another part of the question is focused on overall indexing strategies.
Regarding indexing, the index that was shown in the question is perfectly capable of satisfying some queries that include predicates on coin and deleted. We can see that when looking at the explain output for a query of .find({$text:{$search:"test"}, coin:123, deleted: false}):
> db.users.find({$text:{$search:"test"}, coin:123, deleted: false}).explain().queryPlanner.winningPlan.inputStage
{
stage: 'FETCH',
inputStage: {
stage: 'IXSCAN',
filter: {
'$and': [ { coin: { '$eq': 123 } }, { deleted: { '$eq': false } } ]
},
keyPattern: { _fts: 'text', _ftsx: 1, coin: 1, deleted: 1 },
indexName: 'name_text_email_text_coin_1_deleted_1',
isMultiKey: false,
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'backward',
indexBounds: {}
}
}
Observe here that the index scan stage (IXSCAN) is responsible for applying the filter for the coin and deleted predicates (as opposed to the database having to do that after FETCHing the full document).
Separately, you mentioned in the question that these two particular queries aren't working:
db.users.find({$text:{$search:2}}).pretty()
db.users.find({$text:{$search:false}}).pretty()
And by 'not working' you are referring to the fact that no results are being returned. This is also related to the following discussion in the comments which seemed to have a misleading takeaway:
You'll have to convert your coin and deleted fields to string, if you want it to be picked up by $search – Charchit Kapoor
So. There is no way for searching boolean or integger field. ? – Kiran S youtube channel
Nope, not that I know of. – Charchit Kapoor
You can absolutely use boolean and integer values in your query predicate to filter data. This playground demonstrates that.
What @Charchit Kapoor is mentioning can't be done is using the $text operator to match and return results whose field values are not strings. Said another way, the $text operator is specifically for performing a text search.
If what you are trying to achieve are direct equality matches for the field values, both strings and otherwise, then you can delete the text index as there is no need for using the $text operator in your query. A simplified query might be:
db.users.find({ name: "test"})
Demonstrated in this playground.
A few additional things come to mind:
Regarding indexing overall, databases will generally consider using an index if the first key is used in the query. You can read more about this for MongoDB specifically on this page. The takeaway is that you will want to create the appropriate set of indexes to align with your most commonly executed queries. If you have a query that just filters on coin, for example, then you may wish to create an index that has coin as its first key.
If you want to check whether the exact string value is present in multiple fields, then you may want to do so using the $or operator (and have appropriate indexes for the database to use); see the sketch after this list.
If you do indeed need more advanced text searching capabilities, then it would be appropriate to either continue using the $text operator or consider Atlas Search if the cluster is running in Atlas. Doing so does not prevent you from also having indexes that would support your other queries, such as on { coin: 2 }. It's simply that the syntax for performing such a query needs to be updated.
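As a sketch of the first two points above, assuming the users collection from the question (the index choices are illustrative only, not prescriptive):
// An index led by coin supports queries that filter only on coin
db.users.createIndex({ coin: 1 });
db.users.find({ coin: 2 });
// Exact-match search across several fields with $or (no text index or $text needed)
db.users.createIndex({ name: 1 });
db.users.createIndex({ email: 1 });
db.users.find({ $or: [ { name: "Nivaan" }, { email: "anand.satyan@gmail.com" } ] });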
There is a lot going on here, but the big takeaway is that you can absolutely filter data based on any data type. Doing so simply requires using the appropriate syntax, and doing so efficiently requires an appropriate indexing strategy to be used alongside the queries.

Keeping the default mongo _id and a unique index in MongoDB

Is it good or bad practice to keep the standard _id generated by mongo in a document as well as my own unique identifier such as "name", or should I just replace the generated _id with the actual name, so my documents will look like this:
{
_id: 782yb238b2327b3,
name: "my_name"
}
or just like this:
{
_id: "my_name"
}
This depends on the scenario. There is nothing wrong with having your own unique ID; it may be a string or a number, it completely depends on your situation as long as it's unique. The important thing is that you are in charge of it. You would want to add an index to it, of course.
For example, I have an additional ID field, a number called 'ID', because I required a sequential number as an identifier. Another use case may be that you are migrating an application and have to conform to a particular sequence pattern.
The sequences for these unique identifiers can easily be stored in a separate document/collection, as sketched below.
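A minimal sketch of such a sequence, using a counters collection and findAndModify to increment it atomically (the collection and field names are only illustrative):
// One counter document per sequence
db.counters.insert({ _id: "userId", seq: 0 });
// Atomically grab the next value
function getNextSequence(name) {
  var ret = db.counters.findAndModify({
    query: { _id: name },
    update: { $inc: { seq: 1 } },
    new: true
  });
  return ret.seq;
}
db.col.insert({ ID: getNextSequence("userId"), name: "test" });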
There is no issue with using the built-in _id if you have no requirement for a custom one. An interesting fact is that you can get the creation date out of the _id, which is always useful:
db.col.insert( { name: "test" } );
var doc = db.col.findOne( { name: "test" } );
var timestamp = doc._id.getTimestamp();

How to read a specific key-value pair from mongodb collection

If I have a mongodb collection users like this:
{
"_id": 1,
"name": {
"first" : "John",
"last" :"Backus"
},
}
How do I retrieve name.first from this without providing _id or any other reference? Also, is it possible that pulling just the `name` key can give me the embedded keys (first and last in this case)? How can that be done?
db.users.find({"name.first"}) didn't work for me; I got a:
SyntaxError: missing : after property id (shell):1
The first argument to find() is the query criteria whereas the second argument to the find() method is a projection, and it takes the form of a document with a list of fields for inclusion or exclusion from the result set. You can either specify the fields to include (e.g. { field: 1 }) or specify the fields to exclude (e.g. { field: 0 }). The _id field is implicitly included, unless explicitly excluded.
In your case, db.users.find({name.first}) will give an error as it is expected to be a search criteria.
To get the name subdocument:
db.users.find({},{name:1})
If you want to fetch only name.first
db.users.find({},{"name.first":1})
Mongodb Documentation link here
To fetch the full documents that match a given value, query on the embedded field (using "John" from the question document as the example value):
db.users.find({ "name.first": "John" })
To fetch just the name, or a specific field, pass a projection as the second argument:
db.users.find({}, { "name.X": 1 });
where X can be first or last.
Dot (.) notation can also be used, if required, to traverse deeper into embedded documents for a key/value pair, as in:
db.users.find({ "name.first._id": "xyz" });
In 2022
const cursor = db
.collection('inventory')
.find({
status: 'A'
})
.project({ item: 1, status: 1 });
Source: https://www.mongodb.com/docs/manual/tutorial/project-fields-from-query-results/

How to apply constraints in MongoDB?

I have started using MongoDB and I am fairly new to it.
Is there any way by which I can apply constraints on documents in MongoDB?
Like specifying a primary key or taking an attribute as unique?
Or specifying that a particular attribute is greater than a minimum value?
MongoDB 3.2 Update
Document validation is now supported natively by MongoDB.
Example from the documentation:
db.createCollection( "contacts",
{ validator: { $or:
[
{ phone: { $type: "string" } },
{ email: { $regex: /@mongodb\.com$/ } },
{ status: { $in: [ "Unknown", "Incomplete" ] } }
]
}
} )
Original answer
To go beyond the uniqueness constraint available natively in indexes, you need to use something like Mongoose and its ability to support field-based validation. That will give you support for things like minimum value, but only when updates go through your Mongoose schemas/models.
Being a "schemaless" database, some of the things you mention must be constrained from the application side, rather than the db side. (such as "minimum value")
However, you can create indexes (keys to query on). Remember that a query can only use one index at a time, so it's generally better to design your indexes around your queries rather than indexing every field you might query against:
http://www.mongodb.org/display/DOCS/Indexes#Indexes-Basics
And you can also create unique indexes, which will enforce uniqueness similar to a unique constraint (it does have some caveats, such as with array fields):
http://www.mongodb.org/display/DOCS/Indexes#Indexes-unique%3Atrue
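For completeness, a unique index is created like this (the collection and field names are just an example):
// Enforce uniqueness of the email field across the collection
db.contacts.createIndex({ email: 1 }, { unique: true });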

mongoDB: unique index on a repeated value

So I'm pretty new to MongoDB, and I figure this could be a misunderstanding of general usage, so bear with me.
I have a document schema I'm working with, as such:
{
name: "bob",
email: "bob#gmail.com",
logins: [
{ u: 'a', p: 'b', public_id: '123' },
{ u: 'x', p: 'y', public_id: 'abc' }
]
}
My problem is that I need to ensure that the public_ids are unique within a document and across the collection.
Furthermore, there are some existing records being migrated from a MySQL DB that don't have these IDs, which will therefore all end up as null values in Mongo.
I figure it's either an index:
db.users.ensureIndex({ "logins.public_id": 1 }, { unique: true });
which isn't working because of the missing keys and is throwing an E11000 duplicate key error,
or this is a more fundamental schema problem, in that I shouldn't be nesting objects in an array structure like that. In which case, what? A separate collection for the user logins? That seems to go against the idea of an embedded document.
If you expect u and p to always have the same values on each insert (as in your example snippet), you might want to use the $addToSet operator on inserts to ensure the uniqueness of your public_id field (see the sketch below). Otherwise I think it's quite difficult to make them unique across a whole collection without external maintenance or JS functions.
If not, I would probably store them in their own collection and use the public_id as the _id field to ensure their cross-document uniqueness inside a collection. Maybe that contradicts the idea of embedded docs in a document database, but depending on your requirements I think that's negligible.
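A rough sketch of the $addToSet idea; note it only guards against an identical login subdocument being added twice to the same user, not against the same public_id appearing in other documents:
// The login entry is only added if an identical subdocument is not already in the array
db.users.update(
  { email: "bob@gmail.com" },
  { $addToSet: { logins: { u: "a", p: "b", public_id: "123" } } }
);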
Furthermore there are some existing records being migrated from a MySQL DB that don't have these IDs, and will therefore all be replaced by null values in mongo.
So you want to apply a unique index to a data set that isn't truly unique. I think this is just a modeling problem.
If a null logins.public_id is going to violate your uniqueness constraint, then just don't write it at all (a sparse-index sketch follows the example below):
{
logins: [
{ u: 'a', p: 'b' },
{ u: 'x', p: 'y' }
]
}
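Under that approach a sparse unique index becomes usable, since sparse indexes simply skip documents that don't contain the indexed field (sketch only; note the unique constraint applies across documents, so it will not catch the same public_id repeated inside a single document's array):
// Documents (and embedded login entries) without public_id are not indexed at all
db.users.ensureIndex({ "logins.public_id": 1 }, { unique: true, sparse: true });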
Thanks all.
In the end I opted to separate this into two collections, one for users and one for logins.
The users looked a little like this:
userDocument = {
  ...
  logins: [
    DBRef('loginsCollection', loginDocument._id),
    DBRef('loginsCollection', loginDocument2._id),
  ]
}

loginDocument = {
  ...
  user: new DBRef('userCollection', userDocument._id)
}
Although not what I was originally after (a single collection), it is working nicely, and by utilising the uniqueness of MongoDB's _id the constraint is now built in at the database level rather than implemented at the application level.