_id: replacing ObjectId with a hex value or string, and indexing it

I have a collection with hex addresses like this:
address: "dbaf14e1c476e76ea05a8b71921a46d6b06f0a950f17c5f9f1a03b8fae467f10"
They are all unique, so I thought I could store them in the _id field.
But I have two questions:
First, is there a better way to store such a hex value, or is a String the best option?
Second, if I set my _id field to a String, how will the indexing work? I know we can't change the index type of the _id field, but does the default index type (for example, db.collection.createIndex({address: 1})) work on strings?
I used to do
db.collection.createIndex({address: "text"})
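
To put numbers on the storage side of the question, here is a minimal Node.js sketch that compares the address as hex text with the same address decoded into raw bytes. It only uses Node's built-in Buffer; whether to store the decoded bytes as BinData instead of a string is a design choice the question leaves open, and the variable names are made up for illustration.

```javascript
// The 64-character hex address from the question.
const address = "dbaf14e1c476e76ea05a8b71921a46d6b06f0a950f17c5f9f1a03b8fae467f10";

// As a string, every hex digit takes one byte in UTF-8.
const bytesAsString = Buffer.byteLength(address, "utf8");

// Decoded, two hex digits pack into a single byte.
const raw = Buffer.from(address, "hex");
const bytesAsBinary = raw.length;

// The conversion is lossless: encoding the bytes gives back the same hex text.
const roundTrip = raw.toString("hex");

console.log(bytesAsString, bytesAsBinary, roundTrip === address);
```

The decoded form is half the size of the text form, at the cost of being less readable in the shell.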

Related

Mongo unique compound text index

I'm trying to create a Mongo index with 2 text fields, whereby either field can have a value in another document, but the same pair cannot. I am familiar with this concept in MySQL, but do not understand it in Mongo.
I would like to create a unique index on the symbol and date fields of these documents:
db.earnings_quotes.insert({"symbol":"UNF","date":"2017-01-04","quote":{"price": 5000}});
db.earnings_quotes.createIndex({symbol: 'text', date: 'text'}, {unique: true})
db.earnings_quotes.insert({symbol: 'HAL', date: '2018-01-22', quote: { "price": 10000 }});
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 11000,
"errmsg" : "insertDocument :: caused by :: 11000 E11000 duplicate key error index: sample.earnings_quotes.$symbol_text_date_text dup key: { : \"01\", : 0.6666666666666666 }"
}
})
I don't understand the error message here... In this case, neither symbol, nor date overlap with the first record.

A text index actually behaves a bit like a multikey index: it cuts the text into pieces that can then be queried using specific text search operators. Also, the order of the fields in a text index doesn't really matter (unlike in a normal compound index); MongoDB will just go through all the values in both symbol and date and index those separately.
In this case I believe that mongo splits each date on the hyphens and indexes the pieces separately, so the 01 from 2017-01-04 collides with the 01 from 2018-01-22.
I don't think you really want a text index in your case: it's made for searching through long texts, not fields with single values in them.
And also, the multikey nature of the text index makes it really hard to stay unique.
My advice would be to go like this:
db.earnings_quotes.createIndex({symbol: 1, date: 1}, {unique: true})
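
The collision can be reproduced outside MongoDB with a rough sketch. The tokenize function below is a crude stand-in (an assumption, not MongoDB's actual tokenizer, which also stems words and drops stop words); it is only meant to show why per-token entries clash while per-pair entries do not.

```javascript
// Crude stand-in for text-index tokenization: split on anything that is not
// a letter or digit. MongoDB's real tokenizer also stems and drops stop words.
function tokenize(value) {
  return value.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

const docs = [
  { symbol: "UNF", date: "2017-01-04" },
  { symbol: "HAL", date: "2018-01-22" },
];

// Text-style index: one entry per token, so a shared token is a collision.
const [first, second] = docs.map(
  (d) => new Set([...tokenize(d.symbol), ...tokenize(d.date)])
);
const sharedTokens = [...first].filter((t) => second.has(t));

// Regular compound index: one entry per document, keyed on the whole pair.
const pairKeys = docs.map((d) => JSON.stringify([d.symbol, d.date]));
const pairsAreUnique = new Set(pairKeys).size === pairKeys.length;

console.log(sharedTokens, pairsAreUnique);
```

The only shared token is the month, which matches the "01" reported in the dup key message, while the (symbol, date) pairs themselves are distinct.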

By default, mongo uses _id as a unique key and index, so one solution to your problem is to save your data in the _id field.
e.g.:
{
"_id":{
"symbol" :"xyz" ,
"date" :"12-12-20"
}
//Other fields in collection
}
This will create a composite key.
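
One caveat with an embedded-document _id is that field order is part of the value, so the same pair written in a different order counts as a different key. The sketch below mimics that with a plain Set of serialized keys; idIndex and insert are hypothetical helpers for illustration, not MongoDB APIs.

```javascript
// Hypothetical stand-in for a collection's _id index: a Set of serialized keys.
// JSON.stringify keeps field order, which mimics how embedded documents compare.
const idIndex = new Set();

function insert(doc) {
  const key = JSON.stringify(doc._id);
  if (idIndex.has(key)) return false; // would be a duplicate key error
  idIndex.add(key);
  return true;
}

const a = insert({ _id: { symbol: "xyz", date: "12-12-20" } });
const b = insert({ _id: { symbol: "xyz", date: "12-12-20" } }); // same pair
const c = insert({ _id: { symbol: "xyz", date: "12-12-21" } }); // new date
const d = insert({ _id: { date: "12-12-20", symbol: "xyz" } }); // fields swapped

console.log(a, b, c, d);
```

If you go this route, it helps to always build the _id with the fields in the same order.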

Text Indexes MongoDB, Minimum length of search string

I have created a text index for collection X from mongo shell
db.X.ensureIndex({name: 'text', cusines: 'text', 'address.city': 'text'})
Now, if a document's name property has the value seasons (length 7) and I run the find query with a search string of length <= 5
db.X.find({$text: {$search: 'seaso'}})
it does not return anything. If I change the search string to season (length >= 6), then it returns the document.
My question is: does the search string have some minimum length constraint to fetch the records?
If yes, is there any way to change it?

MongoDB $text searches do not support partial matching. MongoDB supports text search queries on string content, with case insensitivity and stemming.
Looking at your examples:
// this returns nothing because there is no inferred association between
// the value: 'seasons' and your input: 'seaso'
db.X.find({$text: {$search: 'seaso'}})
// this returns a match because 'season' is seen as a stem of 'seasons'
db.X.find({$text: {$search: 'season'}})
So, this is not an issue with the length of your input. Searching on seaso returns no matches because:
Your text index does not contain the whole word: seaso
Your text index does not contain a whole word for which seaso is a recognised stem
This presumes that the language of your text index is English. You can confirm this by running db.X.getIndexes(); you'll see this in the definition of your text index:
"default_language" : "english"
FWIW, if your index is case insensitive then the following will also return matches:
db.X.find({$text: {$search: 'SEaSON'}})
db.X.find({$text: {$search: 'SEASONs'}})
Update 1: in response to this question "is it possible to use RegExp".
Assuming the name attribute contains the value seasons and you are searching with seaso, then the following will match your document:
db.X.find({name: {$regex: /^seaso/}})
More details in the docs but ...
This will not use your text index, so if you proceed with the $regex operator then you won't need the text index.
Index coverage with the $regex operator is probably not what you expect. The brief summary is this: if your pattern is anchored to the start of the value (i.e. ^seaso, rather than easons), then MongoDB can use an index efficiently, but otherwise it generally cannot.
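
To see the difference between the two kinds of pattern, here is a small plain-JavaScript sketch run against a made-up list of names (the list is an assumption for illustration). It shows what each pattern matches, not how MongoDB plans the query.

```javascript
const names = ["seasons", "seaside", "four seasons", "reasons"];

// Anchored prefix pattern, as in the $regex query above.
const prefix = /^seaso/;
// Unanchored pattern: it can match anywhere inside the value.
const anywhere = /easons/;

const prefixMatches = names.filter((n) => prefix.test(n));
const anywhereMatches = names.filter((n) => anywhere.test(n));

console.log(prefixMatches, anywhereMatches);
```

The anchored pattern only has to look at values that start with seaso, which is what lets a sorted index narrow the search; the unanchored one has to consider every value.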

Finding documents in mongodb collection where a field is equal to given integer value

I would like to find all documents where the field property_id is equal to -549. I tried:
db.getCollection('CollectionName').find({'property_id' : -549 })
This returns no records with a message: "Fetched 0 record(s)."
But I am seeing the document right there where the field property_id is -549.
I am not sure what I am doing wrong. The type of field is int32.

Check whether there is a space or an invisible character, such as a byte order mark (U+FEFF), in the field name 'property_id'.
Try:
db.getCollection('CollectionName').find({'\uFEFFproperty_id' : -549 })
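
A quick way to check for this kind of problem is to inspect the field name character by character. The sketch below uses a hand-built document and a hypothetical helper, suspiciousChars, to list anything outside printable ASCII; it assumes you have already fetched one document into a plain object.

```javascript
// A document whose field name starts with an invisible byte order mark (U+FEFF).
const doc = { "\uFEFFproperty_id": -549 };

const storedName = Object.keys(doc)[0];

// The two names usually render identically but are different strings.
const sameName = storedName === "property_id";
const lengths = [storedName.length, "property_id".length];

// Report any character outside printable ASCII as a code point.
function suspiciousChars(name) {
  return [...name]
    .filter((ch) => ch < " " || ch > "~")
    .map((ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"));
}

console.log(sameName, lengths, suspiciousChars(storedName));
```

If the stored name is one character longer than expected, a query on the plain name will never match it.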

Does Mongodb queries a record by DateTime quicker than by String?

For example, this is a record:
{
"_id" : ObjectId("576bc7a48114a14b47920d60"),
"id" : "TEST0001",
"testTime" : ISODate("2016-06-23T11:28:06.529+0000")
}
The testTime is ISODate, does Mongodb query the record by testTime is quicker than this? :
{
"_id" : ObjectId("576bc7a48114a14b47920d60"),
"id" : "TEST0001",
"testTime" : "2016-06-23 11:28:06"
}

Yes, it does.
The difference comes from the fact that a date object is stored as a number.
To understand this, consider the following illustration:
When there is a query on a dateTime field and the date is stored as a numeric object, we are comparing numbers. Mongo compares one 64-bit (8-byte) value with another.
When comparing strings, mongo loads a string like this: 2016-06-27T08:39:44.000, which is 23 characters (23 bytes in UTF-8) to compare in memory, and it may need to check them byte by byte, from the first to the last.
Now you know why it is faster to use a date object instead of a string.
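
The size difference can be checked with a short Node.js sketch using the two testTime values from the question. It only illustrates the in-memory representations (one 64-bit integer versus a run of UTF-8 bytes); it does not measure MongoDB query time, and the cutoff value is made up for the comparison.

```javascript
// A date value boils down to one signed 64-bit integer: milliseconds since
// the Unix epoch.
const asDate = new Date("2016-06-23T11:28:06.529Z");
const millis = asDate.getTime();

// Writing that integer into a buffer shows the fixed 8-byte size.
const dateBytes = Buffer.alloc(8);
dateBytes.writeBigInt64LE(BigInt(millis));

// The same instant as text is compared character by character.
const asString = "2016-06-23 11:28:06";
const stringBytes = Buffer.byteLength(asString, "utf8");

// Range checks: one numeric comparison versus a lexicographic string comparison.
const cutoff = new Date("2016-06-24T00:00:00Z");
const dateIsBefore = millis < cutoff.getTime();
const stringIsBefore = asString < "2016-06-24 00:00:00";

console.log(dateBytes.length, stringBytes, dateIsBefore, stringIsBefore);
```

Both comparisons give the same answer here, but the string version only sorts correctly as long as every value uses exactly the same zero-padded format.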
Any comments welcome!
Comparison/Sort Order
MinKey (internal type)
Null
Numbers (ints, longs, doubles)
Symbol, String
Object
Array
BinData
ObjectId
Boolean
Date
Timestamp
Regular Expression
MaxKey (internal type)

MongoDB can not create unique sparse index (duplicate key)

I want to create a unique index over two columns where the index should allow multiple null values for the second part of the index. But:
db.model.ensureIndex({userId : 1, name : 1},{unique : true, sparse : true});
Throws a duplicate key exception: E11000 duplicate key error index: devmongo.model.$userId_1_name_1 dup key: { : "-1", : null }. I thought because of the sparse=true option the index should allow this constellation? How can I achieve this? I use MongoDB 2.6.5

Sparse compound indexes will create an index entry for a document if any of the fields exist, setting the value to null in the index for any fields that do not exist in the document. Put another way: a sparse compound index will only skip a document if all of the index fields are missing from the document.
As of v3.2, partial indexes can be used to accomplish what you're trying to do. You could use:
db.model.ensureIndex({userId : 1, name : 1}, { partialFilterExpression: { name: { $exists: true } }, unique: true });
which will only index documents that have a name field.
NB: This index cannot be used by mongo to handle a query by userId as it will not contain all of the documents in the collection. Also, a null in the document is considered a value and a field that has a null value exists.
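
The two rules can be compared side by side with a small simulation. The documents and helper names below are made up for illustration; the code only models which documents get an index entry under each rule, as described above, and then checks the resulting keys for duplicates.

```javascript
// Hypothetical documents; the last two both lack a name for the same userId.
const docs = [
  { userId: "-1", name: "a" },
  { userId: "-1", name: "b" },
  { userId: "-1" },
  { userId: "-1" },
];
const fields = ["userId", "name"];

// Sparse compound index: skip a document only if ALL indexed fields are missing.
const inSparse = (d) => fields.some((f) => f in d);
// Partial index with { name: { $exists: true } }: keep only documents with a name.
const inPartial = (d) => "name" in d;

// Build index keys (missing fields become null) and check for duplicates.
function isUnique(indexedDocs) {
  const keys = indexedDocs.map((d) =>
    JSON.stringify(fields.map((f) => (f in d ? d[f] : null)))
  );
  return new Set(keys).size === keys.length;
}

const sparseOk = isUnique(docs.filter(inSparse));
const partialOk = isUnique(docs.filter(inPartial));

console.log(sparseOk, partialOk);
```

Under the sparse rule both nameless documents are indexed as ("-1", null) and collide; under the partial rule they are left out, so the unique constraint holds.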

The compound index should be considered as a whole, so unique requires that each (userId, name) pair be unique in the collection, and sparse means that a document is allowed to be missing both userId and name. The error message shows that there are at least two documents whose (userId, name) pairs are equal (if a field is missing, its value is treated as null).

In my case, it turns out field names are case sensitive.
So creating a compound index on {field1 : 1, field2 : 1} is not the same as {Field1 : 1, Field2 : 1}