Unique multikey in mongodb for field of array of embedded fields - mongodb

I have this Document structure:
{
id: "xxxxx",
name: "John",
pets: [
{
"id": "yyyyyy",
"type": "Chihuahua"
},
{
"id": "zzzzzz",
"type": "Labrador"
}
]
}
The pets field is an array of embedded documents (it does not reference any other collection).
I want the pet id to be unique both across documents and within a single document, but the official MongoDB docs say it's not possible and don't offer another solution:
For unique indexes, the unique constraint applies across separate documents in the collection rather than within a single document.
Because the unique constraint applies to separate documents, for a
unique multikey index, a document may have array elements that result
in repeating index key values as long as the index key values for that
document do not duplicate those of another document.
https://docs.mongodb.com/manual/core/index-multikey/
I have tried this using mongodb golang driver:
_, err = collection.Indexes().CreateOne(context.TODO(), mongo.IndexModel{
Keys: bson.M{"pets.id": 1},
Options: options.Index().SetUnique(true),
})
but, as the docs say, it allows two pets of the same person to have the same ID, while not allowing a pet of a different person to share an ID with a pet of the first person...
Is there any way to enforce this in MongoDB?
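Since a unique multikey index cannot enforce uniqueness within one document, one hedged workaround is to validate the array application-side before writing. A minimal sketch (the helper name and document shape are assumptions for illustration, not a MongoDB feature):

```python
def has_duplicate_pet_ids(doc):
    """Return True if any pet id repeats within this document's pets array."""
    ids = [pet["id"] for pet in doc.get("pets", [])]
    return len(ids) != len(set(ids))

person = {
    "name": "John",
    "pets": [
        {"id": "yyyyyy", "type": "Chihuahua"},
        {"id": "yyyyyy", "type": "Labrador"},  # duplicate within the document
    ],
}

# Reject the write before it reaches MongoDB; the unique multikey
# index then only has to guard uniqueness across documents.
assert has_duplicate_pet_ids(person)
```

Cross-document uniqueness is still left to the unique index on pets.id; this check only closes the within-document gap.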

Related

How to generate unique id for each element of an array field in MongoDB

How to create a unique ID for each element of an array field, where uniqueness is maintained globally for all documents of the collection?
Is it possible to create a unique index for this field?
You can make use of the ObjectId data type. ObjectIds are 12-byte values designed to be effectively unique across all documents, so collisions are extremely unlikely in practice. You can specify an ObjectId as the value for a field in an array when inserting a new document.
For example, if you have following document:
{
_id: ObjectId("5f9b5a6d65c5f09f7b5a6d65"),
nameOfArrayField: []
}
You can use the following command to insert a new document:
db.collection.insertOne({
nameOfArrayField: [
{
id: new ObjectId(),
name: "Big Cat Public Safety Law"
}
]
});
To specify a unique index, you can use createIndex() method in the MongoDB shell.
db.collection.createIndex({ "nameOfArrayField.id": 1 }, { unique: true })
The unique: true option ensures that the id field of the array elements is unique globally across all documents of the collection; it prevents inserting a duplicate element with the same id field in the array. Note that building the index on an existing collection can take some time; you can use the db.collection.getIndexes() method to check whether the index has been created.
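Outside the shell, any generator with ObjectId-like collision resistance serves the same purpose when assigning per-element ids; a sketch in Python using uuid4 as a stand-in assumption for ObjectId:

```python
import uuid

def add_element(doc, name):
    """Append an array element with a globally unique id (uuid4 standing in for ObjectId)."""
    doc.setdefault("nameOfArrayField", []).append(
        {"id": str(uuid.uuid4()), "name": name}
    )
    return doc

doc = {"_id": "5f9b5a6d65c5f09f7b5a6d65", "nameOfArrayField": []}
add_element(doc, "Big Cat Public Safety Law")
add_element(doc, "Another Act")

# Every element gets a distinct id, so the unique index on
# nameOfArrayField.id is never violated by normal inserts.
ids = [e["id"] for e in doc["nameOfArrayField"]]
```
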

Insert multiple documents; on duplicate, update the existing document with the new one?

What is the correct way to insert many documents, say 5,000 of them, in one command, where any document that collides on a unique index replaces the existing one on all fields?
For instance, out of the 5,000 documents, 1,792 are new with no duplicates on the unique indexes, so they are inserted; the other 3,208 collide with existing documents on the unique indexes and should replace them entirely.
I tried insertMany() with the unordered option, but it seems to skip duplicate documents.
And updateMany() with upsert: true isn't meant for inserting multiple distinct documents, only for updating certain fields across a collection?
Is this possible at all?
========Example=========
For a business collection with unique index of field "name":
{"name":"Google", "address":"...", "employees":38571, "phone":12345}
{"name":"Microsoft", "address":"...", "employees":73859, "phone":54321}
{"name":"Apple", "address":"...", "employees":55177, "phone":88888}
{"name":"Meta", "address":"...", "employees":88901, "phone":77777}
Now we want to update the collection with these 4 documents:
{"name":"Apple", "address":"...", "employees":55177, "phone":22222}
{"name":"Dell", "address":"...", "employees":77889, "phone":11223}
{"name":"Google", "address":"...", "employees":33333, "phone":44444}
{"name":"IBM", "address":"...", "employees":77777, "phone":88888}
In MySQL, I could just do this in one query:
INSERT INTO business (name, address, employees, phone)
VALUES
('Apple', '...', 55177, 22222),
('Dell', '...', 77889, 11223),
('Google', '...', 33333, 44444),
('IBM', '...', 77777, 88888)
AS new
ON DUPLICATE KEY UPDATE
address = new.address,
employees = new.employees,
phone = new.phone;
And the collection documents become:
{"name":"Google", "address":"...", "employees":33333, "phone":44444} # updated
{"name":"Microsoft", "address":"...", "employees":73859, "phone":54321} # no change
{"name":"Apple", "address":"...", "employees":55177, "phone":22222} # updated
{"name":"Meta", "address":"...", "employees":88901, "phone":77777} # no change
{"name":"Dell", "address":"...", "employees":77889, "phone":11223} # inserted
{"name":"IBM", "address":"...", "employees":77777, "phone":88888} # inserted
How do I do this in MongoDB?
You probably just need $merge. Put the documents you want to apply into another collection (say, toBeInserted), then $merge toBeInserted into the existing collection:
db.toBeInserted.aggregate([
{
"$project": {
// select the relevant fields
_id: 0,
name: 1,
address: 1,
employees: 1,
phone: 1
}
},
{
"$merge": {
"into": "companies",
"on": "name",
"whenMatched": "merge",
"whenNotMatched": "insert"
}
}
])
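For intuition, the whenMatched: "merge" / whenNotMatched: "insert" behavior keyed on name can be simulated in plain Python (a sketch of the semantics, not driver code):

```python
def merge_on_name(existing, incoming):
    """Upsert incoming docs into existing, keyed by the unique 'name' field."""
    by_name = {doc["name"]: dict(doc) for doc in existing}
    for doc in incoming:
        if doc["name"] in by_name:
            by_name[doc["name"]].update(doc)   # whenMatched: "merge"
        else:
            by_name[doc["name"]] = dict(doc)   # whenNotMatched: "insert"
    return list(by_name.values())

companies = [
    {"name": "Google", "employees": 38571, "phone": 12345},
    {"name": "Microsoft", "employees": 73859, "phone": 54321},
]
updates = [
    {"name": "Google", "employees": 33333, "phone": 44444},
    {"name": "Dell", "employees": 77889, "phone": 11223},
]
merged = merge_on_name(companies, updates)
```

An alternative that avoids the staging collection is a single bulkWrite of replaceOne operations with upsert: true, which replaces all fields of each colliding document in one round trip.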

MongoDB index for nested map values

I have a MongoDB Collection that contains documents with a nested map, similar to the following document:
{
"_id": "1",
"accounts": {
"account-id-1": { "email": "example1@example.com", ... },
"account-id-2": { "email": "example2@example.com", ... },
}
}
The accounts map contains account IDs as keys and the remaining account data as values/objects. Now I want to add an index on the email field of the nested objects, but I can't define it the way one normally would for nested fields, e.g. accounts.account-id-1.email, because the middle part (account-id-1) differs for each entry.
I have read about wildcard indexes, but it seems the index expression always ends with the special wildcard symbol $** and never has it in the middle.
My question is whether it's possible to define such an index as accounts.$**.email (or similarly), so that only the email field gets indexed.
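One common workaround (assuming you are free to reshape the documents) is to store accounts as an array of subdocuments instead of a map; a plain index on accounts.email then works without wildcards. A sketch of the reshape, with the accountId field name being an assumption:

```python
def map_to_array(doc):
    """Turn the accounts map into an array of subdocuments with explicit ids."""
    return {
        "_id": doc["_id"],
        "accounts": [
            {"accountId": account_id, **account}
            for account_id, account in doc["accounts"].items()
        ],
    }

doc = {
    "_id": "1",
    "accounts": {
        "account-id-1": {"email": "example1@example.com"},
        "account-id-2": {"email": "example2@example.com"},
    },
}
reshaped = map_to_array(doc)
# After this reshape, an index on "accounts.email" covers every email,
# and "accounts.accountId" remains queryable as an ordinary field.
```
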

MongoDB: using indexes on multiple fields or an array

I'm new to Mongo.
Entity:
{
"sender": {
"id": <unique key inside type>,
"type": <enum value>,
},
"recipient": {
"id": <unique key inside type>,
"type": <enum value>,
},
...
}
I need an efficient search for the query "find entities where sender or recipient equals a user from the collection", with paging:
foreach member in memberIHaveAccessTo:
condition ||= member == recipient || member == sender
I have read a bit about Mongo indexes. My problem can probably be solved by storing an additional field "members", an array containing both sender and recipient, and then creating an index on that array.
Is it possible to build such an index with Mongo?
Is Mongo a good choice for indexes like this?
Some thoughts on the issues the question raises about querying and how indexes apply to the queried fields.
(i) The $or and two indexes:
I need to create effective search by query "find entities where sender
or recipient equal to user from collection...
Your query is going to be like this:
db.test.find( { $or: [ { "sender.id": "someid" }, { "recipient.id": "someid" } ] } )
With indexes defined on "sender.id" and "recipient.id", two individual indexes, the query with the $or operator will use both the indexes.
From the docs ($or Clauses and Indexes):
When evaluating the clauses in the $or expression, MongoDB either
performs a collection scan or, if all the clauses are supported by
indexes, MongoDB performs index scans.
Running the query with an explain() and examining the query plan shows that indexes are used for both the conditions.
(ii) Index on members array:
My problem can probably be solved by storing an additional field
"members", an array containing both sender and recipient, and then
creating an index on that array...
With the members array field, the query will be like this:
db.test.find( { members_array: "someid" } )
When an index is defined on the members_array field, the query will use it; the generated query plan shows the index usage. Note that an index defined on an array field is referred to as a Multikey Index.
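The denormalization step is straightforward to do application-side before insert. A sketch that derives the members array from sender and recipient (field names follow the question; the type:id encoding is an assumption to keep ids unique across types):

```python
def with_members(entity):
    """Add a 'members' array so a single multikey index covers both sides."""
    doc = dict(entity)
    doc["members"] = [
        f"{entity['sender']['type']}:{entity['sender']['id']}",
        f"{entity['recipient']['type']}:{entity['recipient']['id']}",
    ]
    return doc

entity = {
    "sender": {"id": "42", "type": "USER"},
    "recipient": {"id": "7", "type": "GROUP"},
}
doc = with_members(entity)
# db.test.createIndex({members: 1}) then lets
# db.test.find({members: "USER:42"}) run as a single index scan
# instead of an $or over two indexes.
```
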

MongoDB "filtered" index: is it possible?

Is it possible to index some documents of the collection "only if" one of the fields to be indexed has a particular value?
Let me explain with an example:
The collection "posts" has millions of documents, ALL defined as follows:
{
    "network": "network_1",
    "blogname": "blogname_1",
    "post_id": 1234,
    "post_slug": "abcdefg"
}
Let's assume that the distribution of posts is split equally between network_1 and network_2.
My application OFTEN selects the type of query based on the value of "network" (although sometimes I need the data from both networks).
For example:
www.test.it/network_1/blog_1/postid/1234/
-> db.posts.find({network: "network_1", blogname: "blog_1", post_id: 1234})
www.test.it/network_2/blog_4/slug/aaaa/
-> db.posts.find({network: "network_2", blogname: "blog_4", post_slug: "aaaa"})
I could create two separate indexes (network / blogname / post_id and network / blogname / post_slug), but that would waste a huge amount of RAM, since 50% of the data in each index would never be used.
Is there a way to create an index "filtered"?
Example:
(Note the WHERE parameter)
db.posts.ensureIndex({network: 1, blogname: 1, post_id: 1}, {where: {network: "network_1"}})
db.posts.ensureIndex({network: 1, blogname: 1, post_slug: 1}, {where: {network: "network_2"}})
Indeed, it's possible in MongoDB 3.2+. It's called a partialFilterExpression, with which you can set a condition that determines which documents the index covers.
Example
db.users.createIndex(
{ "userId": 1, "project": 1 },
{ unique: true, partialFilterExpression: { userId: { $exists: true, $gt: { $type: 10 } } } }
)
Please see Partial Index documentation
As of MongoDB v3.2, partial indexes are supported. Documentation: https://docs.mongodb.org/manual/core/index-partial/
It's possible, but it requires a workaround which creates redundancy in your documents, requires you to rewrite your find-queries and limits find-queries to exact matches.
MongoDB supports sparse indexes which only index the documents where the given field exists. You can use this feature to only index a part of the collection by adding this field only to those documents you want to index.
The bad news is that this trick works with a single indexed field. The good news is that this field can itself contain an object with multiple fields, so you can still store all the data you want to search for in that one field.
To do this, add a new field to the included documents which includes an object with the fields you search for:
{
"network": "network_1",
"blogname": "blogname_1",
"post_id": 1234,
"post_slug": "abcdefg",
"network_1_index_key": {
"blogname": "blogname_1",
"post_id": 1234
}
}
Your ensureIndex command would index the field network_1_index_key:
db.posts.ensureIndex( { network_1_index_key: 1 }, { sparse: true } )
A find-query which is supposed to use this index, must now query for the exact object of the field network_1_index_key:
db.posts.find ({
network_1_index_key: {
blogname: "blogname_1",
post_id: 1234
}
})
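The rewrite step that adds the redundant key field can be sketched application-side (treating documents as plain dicts; the field name follows the answer above):

```python
def add_network_1_index_key(post):
    """Copy the searched fields into a single sparse-indexable sub-object."""
    doc = dict(post)
    if doc.get("network") == "network_1":
        doc["network_1_index_key"] = {
            "blogname": doc["blogname"],
            "post_id": doc["post_id"],
        }
    return doc

post = {"network": "network_1", "blogname": "blogname_1",
        "post_id": 1234, "post_slug": "abcdefg"}
doc = add_network_1_index_key(post)
# Documents from network_2 never get the field, so the sparse index
# skips them entirely and stays half the size of a full index.
```
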
Doing this likely only makes sense when the documents you want to index are a small fraction of the collection. When it's about half, I would just create a regular index and live with it, because the larger document size could cancel out the gains from the reduced index size.
You could also try creating a single compound index on all the fields (network / blogname / post_id / post_slug).