I have two MongoDB collections. questions:
{
"_id" : "8735574",
"title" : "...",
"owner" : {
"user_id" : 950690
},
}
{
"_id" : "8736808",
"title" : "...",
"owner" : {
"user_id" : 657258
},
}
and users:
{
"_id" : 950690,
"updated" : SomeDate,
...
}
{
"_id" : 657258,
"updated" : SomeDate,
...
}
The entries in users have to be regularly created or updated based on questions. So I would like to get all the user_ids from questions that either have no entry in users at all or whose entry in users was last updated more than, e.g., one day ago.
To achieve this, I could read all user_ids from questions and then manually drop all users from the result that do not have to be updated. But this seems to be a bad solution, as it reads a lot of unnecessary data. Is there a way to solve this differently? Some kind of collection join would be great, but I know that this does not (really) exist in MongoDB. Any suggestions?
PS: Nesting these collections into a single collection is no solution as users has to be referenced from elsewhere as well.
Unfortunately there is no good way of doing this. Since you don't have access to the indexes, you cannot do this client-side without reading out all the data and manipulating it manually, so that is the only way.
The join from users to questions could be done by querying the users collection and then doing an $in on the questions collection but that's really the only optimisation that can be made.
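The client-side filtering described here can be sketched roughly as follows. This is a minimal sketch in plain JavaScript over in-memory arrays, not a definitive implementation: the helper name `userIdsNeedingUpdate`, its parameters, the sample dates and the id 111111 are all assumptions made for illustration. In the mongo shell the inputs would presumably come from `db.questions.distinct("owner.user_id")` and `db.users.find({_id: {$in: ids}})`.

```javascript
// Hypothetical helper: given the user_ids found in questions and the matching
// user documents, return the ids that are missing from users or are stale.
function userIdsNeedingUpdate(questionUserIds, users, now, maxAgeMs) {
    // Index the known users by _id for constant-time lookup.
    const updatedById = new Map(users.map(u => [u._id, u.updated]));
    return questionUserIds.filter(id => {
        const updated = updatedById.get(id);
        // Keep the id if there is no entry at all, or the entry is too old.
        return updated === undefined || now - updated > maxAgeMs;
    });
}

const ONE_DAY_MS = 24 * 60 * 60 * 1000;
const now = new Date("2012-01-05T00:00:00Z"); // sample "current" time
const users = [
    { _id: 950690, updated: new Date("2012-01-04T12:00:00Z") }, // updated 12 hours ago
    { _id: 657258, updated: new Date("2012-01-01T00:00:00Z") }  // updated 4 days ago
];
// 111111 is a made-up id with no entry in users.
const stale = userIdsNeedingUpdate([950690, 657258, 111111], users, now, ONE_DAY_MS);
console.log(stale);
```

Only the ids and the `updated` field are needed for this check, so a projection on the users query would keep the amount of data read small.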
I am pretty new to both MongoDB and CosmosDB, and I am trying to better understand how to implement a many-to-many relationship across different collections. For example, I have a Contact collection, which stores contact information, and an Activity collection, which stores activity information. I need to set up the relationship between contacts and activities, which is many-to-many. The documentation on linked documents that I found on the MongoDB site seems to refer only to linking documents within the same collection. How can I reference the contact ID from the Activity collection?
For example, I have Activity collection:
{
"_id" : ObjectId("ABCDEFG"),
"ActivityName" : "Meeting",
"ActivityContact" : {
"$ref" : "Contact",
"$id" : ObjectId("1234567"),
"$db" : "Test"
}
}
In another collection called Contact, I have data like:
{
"_id": ObjectId("1234567"),
"ContactName" : "TestUser",
"Address" : "Some address"
}
So when I query the Activity collection, it should also get the data from the Contact collection, somewhat like a left outer join in a relational database. When I ran the above insert with $ref, $id and $db set, it threw the following error:
Failed to insert document.
Error:
Unsupported shard key value: { "$ref" : "Contact", "$id" : ObjectId("1234567"), "$db" : "Test" }
Per the MongoDB documentation, there are two reference types:
https://docs.mongodb.com/manual/reference/database-references/#document-references
One is manual and the other is DBRef. I was trying to work out a solution for DBRef on CosmosDB, but it seems driver support for DBRef in MongoDB may be limited. So I tried to use manual references instead.
So in the Activity collection, I have the document below:
{
"_id" : "ObjectId(\"ABCDEFG\")",
"ActivityName" : "Meeting",
"Department" : "Dev",
"Contact" : [
123456,
234567
]
}
And in the Contact collection:
{
"_id" : 123456,
"Department" : "Dev",
"FirstName" : "John",
"LastName" : "Smith",
"Address" : "Somewhere"
}
{
"_id" : 234567,
"Department" : "Dev",
"FirstName" : "Mary",
"LastName" : "Black",
"Address" : "Somewhere"
}
I have to add Department because it is the shard key I chose when I created this collection on CosmosDB. Without specifying the shard key, CosmosDB won't let me create this manual reference. But even with the manual reference, it only allows me to insert one contact; for any further contact it throws the error:
"E11000 duplicate key error collection: Failed _id or unique index constraint."
How do I resolve this issue?
Thanks,
CosmosDB does not support $lookup, so you will have to use MongoDB natively. Since you're using a managed database, I suggest you explore Atlas.
This is how it would look in MongoDB:
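pipe = [
    {
        '$lookup': {
            'from': 'contact',        # collection to join against
            # localField takes a plain field path; a name starting with '$' is
            # not accepted here, so this uses the manual-reference array.
            'localField': 'Contact',
            'foreignField': '_id',    # matched against each referenced id
            'as': 'contact'           # output array holding the joined documents
        }
    }
]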
then run it on your activity collection via connection.activity.aggregate(pipe).
Also, I suggest you avoid naming keys with a $ operator.
After further testing and investigation, the error I am getting is due to CosmosDB not supporting unique sparse indexes on a collection. I have a unique index on one of the fields, so only the first contact is inserted: the unique index treats its missing value as NULL, and every subsequent entry fails because only one NULL value is allowed as long as sparse is not supported on Cosmos DB.
It is on the roadmap now.
https://feedback.azure.com/forums/263030-azure-cosmos-db?category_id=321994&filter=top&page=2
The current implementation of MongoDB allows you to create documents (such as a user document) without a field that has a unique sparse index (such as username). This is not possible with the Azure Cosmos DB Mongo API, forcing you to populate…
PLANNED · Azure Cosmos DB Team (Product Manager, Microsoft Azure) responded
Thank you Dan for your suggestion. This is currently on our road map and in plan for our upcoming development cycle which runs from January to June.
We will update here when this becomes available.
Thanks.
I am trying to find a way to get a list of MongoDB documents that are referenced in a subdocument in another collection.
I have a collection with user documents. In another collection I keep a list of businesses. Every business has a subdocument containing a list of references to users.
The User collection:
/* user-1 */
{
"_id" : ObjectId("54e5e78680c7e191218b49b0"),
"username" : "jachim@example.com",
"password" : "$2y$13$21p6hx3sd200cko4o0w04u46jNv3tNl3qpVWVbnAyzZpDxsSVDDLS"
}
/* user-2 */
{
"_id" : ObjectId("54e5e78480c7e191218b49ab"),
"username" : "jachim@example.net",
"password" : "$2y$13$727amk1a7fwo4sgw8kkkcuWi4vhj2zKvZZIEDWtDQLo6dUjb0YnYy"
}
The Business collection
/* business-1 */
{
"_id" : ObjectId("54e5e78880c7e191218b4c52"),
"name" : "Stack Overflow",
"users" : [
{
"$ref" : "User",
"$id" : ObjectId("54e5e78680c7e191218b49b0"),
"$db" : "test"
}
]
}
I can get the users of a business by following the references in the business.users list, and I can get the businesses of a user with the db.Business.find({"users.$id": ObjectId("54e5e78480c7e191218b49ab")}) query, but I cannot come up with a query that finds all users that are referenced somewhere in a business.
I can do this client side in two steps:
db.Business.distinct("users.$id");
Which will return a list of user ids. This list I can use in a query to the user collection:
db.User.find({ _id: { $in: [ LIST_OF_IDS ] } });
But this could result in very big queries (potentially leading to queries larger than 16MB).
I think MapReduce would be a solution for this, but I'm not quite sure what fields I should use there.
Any experts here on this?
After some more research and a chat on the MongoDB IRC channel, there are several options to get this to work:
Go with the $in query.
Keep track of the relation on both sides (a little harder to keep the relations up to date on both sides, but it works).
Change the owning side of the relation (keeping track of businesses in the user document), but this depends on the nature of your queries.
The Aggregation Framework would not work, because it cannot query multiple collections, nor will MapReduce (for the same reason, although it is possible).
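If the $in route is taken, the concern about oversized queries could be addressed by splitting the id list into batches. The following is only a sketch under that assumption: the helper name `chunk`, the batch size and the placeholder ids are hypothetical, and the shell calls in the trailing comment are shown for orientation only.

```javascript
// Hypothetical helper: split a list of ids into batches of at most `size`.
function chunk(ids, size) {
    const batches = [];
    for (let i = 0; i < ids.length; i += size) {
        batches.push(ids.slice(i, i + size));
    }
    return batches;
}

// Placeholder strings standing in for the ObjectIds returned by
// db.Business.distinct("users.$id").
const ids = ["a", "b", "c", "d", "e"];
const batches = chunk(ids, 2);
console.log(batches);

// In the mongo shell, each batch would then feed one query, for example:
// batches.forEach(b => db.User.find({ _id: { $in: b } }).forEach(printjson));
```

A real batch size would be chosen so that each query stays well below the document size limit mentioned in the question.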
This is my code:
db.test.find()
{
"_id" : ObjectId("4d3ed089fb60ab534684b7e9"),
"title" : "Sir",
"name" : {
"_id" : ObjectId("4d3ed089fb60ab534684b7ff"),
"first_name" : "Farid"
},
"addresses" : [
{
"city" : "Baku",
"country" : "Azerbaijan"
},{
"city" : "Susha",
"country" : "Azerbaijan"
},{
"city" : "Istanbul",
"country" : "Turkey"
}
]
}
I want to output only the cities, or only the countries. How can I do that?
I'm not 100% sure about your code example, because if you 'find' by ID there's no need to search by anything else... but I wonder whether the following can help:
db.test.insert({name:'farid', addresses:[
{"city":"Baku", "country":"Azerbaijan"},
{"city":"Susha", "country":"Azerbaijan"},
{"city" : "Istanbul","country" : "Turkey"}
]});
db.test.insert({name:'elena', addresses:[
{"city" : "Ankara","country" : "Turkey"},
{"city":"Baku", "country":"Azerbaijan"}
]});
Then the following will show all countries:
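db.test.aggregate([
    // One output document per element of the addresses array.
    {$unwind: "$addresses"},
    // A single group (_id: null) that collects each distinct country once.
    {$group: {_id: null, countries: {$addToSet: "$addresses.country"}}}
]);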
result will be
{ "result" : [
{ "_id" : null,
"countries" : [ "Turkey", "Azerbaijan"]
}
],
"ok" : 1
}
Maybe there are other ways, but that's one I know.
With 'cities' you might want to take more care (because I know cities with the same name in different countries...).
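To keep same-named cities in different countries apart, the distinct values can be collected per (country, city) pair. Below is a plain JavaScript sketch of that idea over the two sample documents inserted above; the function name `distinctCities` is hypothetical. In the shell this would presumably correspond to a `$group` stage with `_id: {country: "$addresses.country", city: "$addresses.city"}` after the `$unwind`, while `db.test.distinct("addresses.city")` should return the bare city names.

```javascript
// Hypothetical helper: collect each distinct (country, city) pair once,
// mirroring an $unwind on addresses followed by a group on both fields.
function distinctCities(docs) {
    const seen = new Set();
    const result = [];
    for (const doc of docs) {
        for (const addr of doc.addresses || []) {
            // The pair is the identity, so two cities that share a name
            // but lie in different countries are kept apart.
            const key = JSON.stringify([addr.country, addr.city]);
            if (!seen.has(key)) {
                seen.add(key);
                result.push({ country: addr.country, city: addr.city });
            }
        }
    }
    return result;
}

const docs = [
    { name: "farid", addresses: [
        { city: "Baku", country: "Azerbaijan" },
        { city: "Susha", country: "Azerbaijan" },
        { city: "Istanbul", country: "Turkey" }
    ] },
    { name: "elena", addresses: [
        { city: "Ankara", country: "Turkey" },
        { city: "Baku", country: "Azerbaijan" }
    ] }
];
const cities = distinctCities(docs);
console.log(cities);
```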
Based on your question, there may be two underlying issues here:
First, it looks like you are trying to query a Collection called "test". Often times, "test" is the name of an actual database you are using. My concern, then, is that you are trying to query the database "test" to find any collections that have the key "city" or "country" on any of the internal documents. If this is the case, what you actually need to do is identify all of the collections in your database, and search them individually to see if any of these collections contain documents that include the keys you are looking for.
(For more information on how the db.collection.find() method works, check the MongoDB documentation here: http://docs.mongodb.org/manual/reference/method/db.collection.find/#db.collection.find)
Second, if this is actually what you are trying to do, all you need to do for each collection is define a query that only returns the key of the document you are looking for. If you get more than 0 results from the query, you know the documents have the "city" key. If a query returns no results, you can ignore that collection. One caveat here is if data about "city" is in embedded documents within a collection. If this is the case, you may actually need to have some idea of which embedded documents may contain the key you are looking for.
I'm still learning about MongoDB and I would like to know if someone could help me with the situation I'm facing.
I'm taking over a DB structure that was designed like a relational DB, and I would like to embed full documents (instead of only references to them) into all my documents.
Let me try to explain the best I can:
I have an activity table that references a user using its userID
activity : {
"user_id" : ObjectId("5324a18d3061650002030000")
}
user_id is the primary id of another document called user.
user:
{
"_id" : ObjectId("5324a18d3061650002030000"),
"active" : true,
"birth_date" : ISODate("1980-03-25T00:00:00.000Z")
}
What I would like to do is to insert my user into my activity document:
activity : {
"user_id" : ObjectId("5324a18d3061650002030000"),
"user" :
{
"_id" : ObjectId("5324a18d3061650002030000"),
"active" : true,
"birth_date" : ISODate("1980-03-25T00:00:00.000Z")
}
}
I would like to do that for all my activity documents (knowing that they all reference different users of course), what would be the best way to do that please?
Thanks a lot guys !!!
It should be simpler like this:
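db.users.find().forEach(function(doc) {
    // Embed the full user document in every activity that references it.
    // multi: true is needed because one user can have several activities.
    db.activity.update({user_id: doc._id}, {$set: {user: doc}}, {multi: true});
});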
Is it possible to use ensureIndex within records and not for the whole collection?
Eg: My database structure is
{ "_id" : "com.android.hello",
"rating" : [
[ { "user" : "BBFE7F461E10BEE10A92784EFDB", "value" : "4" } ],
[ { "user" : "BBFE7F461E10BEE10A92784EFDB", "value" : "4" } ]
]
}
It is a rating system and I don't want a user to rate the same application (com.android.hello) multiple times. If I use ensureIndex on the user field, then a user is able to vote on only one application. When I try to vote on a different application altogether (com.android.hi), it says duplicate key.
No, you cannot do this. Uniqueness is only enforced between documents, not among the array elements within a single document. You will need to redesign your schema for the above to work. For example to:
{
"_id" : "com.android.hello",
"rating": {
"user" : "BBFE7F461E10BEE10A92784EFDB",
"value" : "4"
}
}
And then just store multiple...
(I realize you didn't provide the full document though)
ensureIndex creates indexes, which are applied to the whole collection. If you want an index for only a few records, you may have to keep two collections and apply ensureIndex to one of them.
As @Derick said, no. However, it is possible to make sure they can only vote once atomically:
var res = db.votes.update(
    {_id: 'com.android.hello', 'rating.user': {$nin: ['BBFE7F461E10BEE10A92784EFDB']}},
    {$push: {rating: {user: 'BBFE7F461E10BEE10A92784EFDB', value: 4}}},
    {upsert: true}
);
if (res['upserted'] || res['n'] > 0) {
    print('voted');
} else {
    print('nope');
}
I was a bit concerned that $push would not work in upsert but I tested this as working.
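The condition in that update amounts to "push the rating only if this user has no rating on the application yet". The sketch below restates that rule in plain JavaScript on an in-memory object, purely to illustrate the logic; the function name `tryVote` is hypothetical, and unlike the server-side update this version offers no atomicity.

```javascript
// Hypothetical in-memory illustration of the "vote only once" rule.
// Not atomic: the atomicity in the answer comes from the single server-side update.
function tryVote(app, user, value) {
    // Same check as the 'rating.user' $nin condition in the query part.
    const alreadyVoted = app.rating.some(r => r.user === user);
    if (alreadyVoted) {
        return false;
    }
    // Same effect as the $push in the update part.
    app.rating.push({ user: user, value: value });
    return true;
}

const app = { _id: "com.android.hello", rating: [] };
const first = tryVote(app, "BBFE7F461E10BEE10A92784EFDB", 4);
const second = tryVote(app, "BBFE7F461E10BEE10A92784EFDB", 5);
console.log(first ? "voted" : "nope");
console.log(second ? "voted" : "nope");
```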