I have question related to DBRef of MongoDB. Imagine this scenario:
Group{
...
"members" : [
{
"$ref" : "User",
"$id" : ObjectId("505857a4e4b5541060863061")
},
{
"$ref" : "User",
"$id" : ObjectId("50586411e4b0b31012363208")
},
{
"$ref" : "User",
"$id" : ObjectId("50574b9ce4b0b3106023305c")
},
]
...
}
So given group document has 3 user DBRef. Where in java class of Group, members is tagged with morphia as #Reference:
public class Group {
...
#Reference
List<User> members;
...
}
Question: When calling RequestFactory function getGroup().with("members") will RequestFactory get all members in ONLY 1 DB access ?
Or will Request factory make 3 DB access for each DBRef in Group document in the scenario given above?
Thank you very much in advance.
RequestFactory itself doesn't access the DB. What it'll do here is:
call getMembers(), as it was requested by the client through .with("members")
for each entity proxy seen (either in the request or in the response), call its locator's isLive method, or if has no Locator, call the entity's findXxx with its getId() (and check whether null is returned).
The first step depends entirely on Morphia's implementation:
because you didn't set lazy = true on your #Reference, it won't matter whether RequestFactory calls getMembers() or not, the members will always be loaded.
in any case (either eager or lazy fetching), it will lead to 4 Mongo queries (one to get the Group and another 3 for the members; I don't think Morphia tries to optimize the number of queries to only make 1 query to get all 3 members at a time)
The second step however depends entirely on your code.
Remember that RequestFactory wants you to have a single instance of an entity per HTTP request. As I understand it, Morphia has an EntityCache doing just that, but I suspect it could be bypassed by some methods/queries.
Related
I have 2 collections.
#Entity
public class TypeA {
//other fields
#Reference
List<TypeB> typeBList;
}
#Entity
public class TypeB{
//Fields here.
}
After a save operation, sample TypeA document is as below :
{
"_id" : ObjectId("58fda48c60b200ee765367b1"),
"typeBList" : [
{
"$ref" : "TypeB",
"$id" : ObjectId("58fda48c60b200ee765367ac")
},
{
"$ref" : "TypeB",
"$id" : ObjectId("58fda48c60b200ee765367af")
}
]
}
When I query for TypeA , morphia eagerly loads all the TypeB entites, which I dont want.
I tried using the #Reference(lazy = true). But no help.
So is there a way I can write a query using morphia where I only get all $ids inside the typeBList?
Have a list ob ObjectIds instead of a #Reference and manually fetch those entries when you need them.
Lazy will only load referenced entities on demand, but since you are trying to access something in that list, they will be loaded.
Personal opinion: #Reference looks great when you start, but its use can quickly cause issues. Don't build a schema with lots of references — MongoDB is not a relational database.
I am converting filters from the client to a QueryImpl using the setQueryObject method.
When I try to add another complex criteria to that query my original query is moved to a field named baseQuery and the new criteria is the query field.
When the query is executed only the new criteria is used and the baseQuery is not used.
It only happens when the client query is formatted like this: { "$or" : [{ "field1" : { "$regex" : "value1", "$options" : "i"}},...]}
and the new criteria is formatted in the same way(meaning an $or operation).
It seems that when I try to merge 2 $or queries it happens but when I merge $or with $and it concatenates the queries properly.
Am I using it wrong or is it a genuine bug?
Edit:
Code:
public static List<Entity> getData(client.Query query) {
QueryImpl<Entity> finalQuery = Morphia.realAccess().extractFromQuery(Entity.class,query);
finalQuery.and(finalQuery.or(finalQuery.criteria("field").equal(false), finalQuery.criteria("field").doesNotExist()));
return finalQuery.asList();
}
public <E> QueryImpl<E> extractFromQuery(Class<E> clazz, client.Query query) {
QueryImpl<E> result = new QueryImpl<E>(clazz,this.db.getCollection(clazz),this.db);
result.setQueryObject(query.getFiltersAsDBObject);
return result;
}
QueryImpl and setQueryObject() are internal constructs. You really shouldn't be using them as they may change or go away without warning. You should be using the public query builder API to build up your query document.
I am having the same problem, It seems to be working when I do this:
finalQuery.and();
finalQuery.and(finalQuery.or(finalQuery.criteria("field").equal(false), finalQuery.criteria("field").doesNotExist()));
It is kind of ugly and would love to hear if someone have a different approach, but the only way I was able to find to convert the data from the client side which is a DBObject is to use the setQueryObject() of QueryImpl.
There is an existing person collection in the system which is like:
{
"_id" : ObjectId("536378bcc9ecd7046700001f"),
"engagements":{
"5407357013875b9727000111" : {
"role" : "ADMINISTRATOR",
},
"5407357013875b9727000222" : {
"role" : "DEVELOPER",
}
}
}
So that multiple user objects can have the same engagement with a specific role, I need to fire a query in this hierarchy where I can get all the persons which have a specific engagement in the engagements property of person collection.
I want to get all the persons which have
5407357013875b9727000222 in the engagements.
I know $in operator could be used but the problem is that I need to compare the keys of the sub Json engagements.
I think it's as simple as this:
db.users.find({'engagements.5407357013875b9727000222': {$exists: true}})
If you want to match against multiple engagement ids, then you'll have to use $or. Sorry, no $in for you here.
Note, however, that you need to restructure your data, as this one can't be indexed to help this concrete query. Here I assume you care about performance and this query is used often enough to have impact on the database.
I have two MongoDB collections user and customer which are in one-to-one relationship. I'm new to MongoDB and I'm trying to insert documents manually although I have Mongoose installed. I'm not sure which is the correct way of storing document reference in MongoDB.
I'm using normalized data model and here is my Mongoose schema snapshot for customer:
/** Parent user object */
user: {
type: Schema.Types.ObjectId,
ref: "User",
required: true
}
user
{
"_id" : ObjectId("547d5c1b1e42bd0423a75781"),
"name" : "john",
"email" : "test#localhost.com",
"phone" : "01022223333",
}
I want to make a reference to this user document from the customer document. Which of the following is correct - (A) or (B)?
customer (A)
{
"_id" : ObjectId("547d916a660729dd531f145d"),
"birthday" : "1983-06-28",
"zipcode" : "12345",
"address" : "1, Main Street",
"user" : ObjectId("547d5c1b1e42bd0423a75781")
}
customer (B)
{
"_id" : ObjectId("547d916a660729dd531f145d"),
"birthday" : "1983-06-28",
"zipcode" : "12345",
"address" : "1, Main Street",
"user" : {
"_id" : ObjectId("547d5c1b1e42bd0423a75781")
}
}
Remember these things
Embedding is better for...
Small subdocuments
Data that does not change regularly
When eventual consistency is acceptable
Documents that grow by a small amount
Data that you’ll often need to perform a second query to fetch Fast reads
References are better for...
Large subdocuments
Volatile data
When immediate consistency is necessary
Documents that grow a large amount
Data that you’ll often exclude from the results
Fast writes
Variant A is Better.
you can use also populate with Mongoose
Use variant A. As long as you don't want to denormalize any other data (like the user's name), there's no need to create a child object.
This also avoids unexpected complexities with the index, because indexing an object might not behave like you expect.
Even if you were to embed an object, _id would be a weird name - _id is only a reserved name for a first-class database document.
One to one relations
1 to 1 relations are relations where each item corresponds to exactly one other item. e.g.:
an employee have a resume and vice versa
a building have and floor plan and vice versa
a patient have a medical history and vice versa
//employee
{
_id : '25',
name: 'john doe',
resume: 30
}
//resume
{
_id : '30',
jobs: [....],
education: [...],
employee: 25
}
We can model the employee-resume relation by having a collection of employees and a collection of resumes and having the employee point to the resume through linking, where we have an ID that corresponds to an ID in th resume collection. Or if we prefer, we can link in another direction, where we have an employee key inside the resume collection, and it may point to the employee itself. Or if we want, we can embed. So we could take this entire resume document and we could embed it right inside the employee collection or vice versa.
This embedding depends upon how the data is being accessed by the application and how frequently the data is being accessed. We need to consider:
frequency of access
the size of the items - what is growing all the time and what is not growing. So every time we add something to the document, there is a point beyond which the document need to be moved in the collection. If the document size goes beyond 16MB, which is mostly unlikely.
atomicity of data - there're no transactions in MongoDB, there're atomic operations on individual documents. So if we knew that we couldn't withstand any inconsistency and that we wanted to be able to update the entire employee plus the resume all the time, we may decide to put them into the same document and embed them one way or the other so that we can update it all at once.
In mongodb its very recommended to embedding document as possible as you can, especially in your case that you have 1-to-1 relations.
Why? you cant use atomic-join-operations (even it is not your main concern) in your queries (not the main reason). But the best reason is each join-op (theoretically) need a hard-seek that take about 20-ms. embedding your sub-document just need 1 hard-seek.
I believe the best db-schema for you is using just an id for all of your entities
{
_id : ObjectId("547d5c1b1e42bd0423a75781"),
userInfo :
{
"name" : "john",
"email" : "test#localhost.com",
"phone" : "01022223333",
},
customerInfo :
{
"birthday" : "1983-06-28",
"zipcode" : "12345",
"address" : "1, Main Street",
},
staffInfo :
{
........
}
}
Now if you just want the userinfo you can use
db.users.findOne({_id : ObjectId("547d5c1b1e42bd0423a75781")},{userInfo : 1}).userInfo;
it will give you just the userInfo:
/* 0 */
{
"name" : "john",
"email" : "test#localhost.com",
"phone" : "01022223333"
}
And if you just want the **customerInfo ** you can use
db.users.findOne({_id : ObjectId("547d5c1b1e42bd0423a75781")},{customerInfo : 1}).customerInfo;
it will give you just the customerInfo :
/* 0 */
{
"birthday" : "1983-06-28",
"zipcode" : "12345",
"address" : "1, Main Street"
}
and so on.
This schema has the minimum hard round-trip and actually you are using mongodb document-based feature with best performance you can achive.
I have a couple of collections, for example;
members
id
name
//other fields we don't care about
emails
memberid
//other fields we don't care about
I want to delete the email for a given member. In SQL I could use a nested query, something like;
delete emails
where memberid in (select id from members where name = "evanmcdonnal")
In mongo I'm trying something like this;
db.emails.remove( {"memberid":db.members.find( {"name":"evanmcdonnal"}, {id:1, _id:0} ) )
But it returns no results. So I took the nested query and ran it on it's own. The issue I believe is that it returns;
{
"id":"myMadeUpId"
}
Which - assuming inner queries execute first - gives me a query of;
db.emails.remove( {"memberid":{ "id""myMadeUpId"} )
When really I just want the value of id. I've tried using dictionary and dot notation to access the value of id with no luck. Is there a way to do this that is similar to my attempted query above?
Let's see how you'd roughly translate
delete emails where memberid in (select id from members where name = "evanmcdonnal")
into a set of mongo shell operations. You can use:
db.members.find({ "name" : "evanmcdonnal" }, { "id" : 1 }).forEach(function(doc) {
db.emails.remove({ "memberid" : doc.id });
});
However, this does one remove query for each result document from members. You could push the members result ids into an array and use $in:
var plzDeleteIds = db.members.find({ "name" : "evanmcdonnal" }, { "id" : 1 }).toArray();
db.emails.remove({ "memberid" : { "$in" : plzDeleteIds } });
but that could be a problem if plzDeleteIds gets very, very large. You could batch. In all cases we need to do multiple requests to the database because we are querying multiple collections, which always requires multiple operations in MongoDB (as of 2.6, anyway).
The more idiomatic way to do this type of thing in MongoDB is to store the member information you need in the email collection on the email documents, possibly as a subdocument. I couldn't say exactly if and how you should do this since you've given only a bit of your data model that has, apparently, been idealized.
As forEach() way didn't work for me i solved this using:
var plzDeleteIds = db.members.find({ "name" : "evanmcdonnal" }, { "id" : 1 }).toArray();
var aux = plzDeleteIds["0"];
var aux2 = aux.map(function(u) { return u.name; } );
db.emails.remove({ "memberid" : { "$in" : aux2 } });
i hope it help!
I do not believe what you are asking for is possible. MongoDB queries talk to just one collection -- there is no syntax to go cross-collection.
However, what about the following:
The name in members does not seem to be unique. If you were to delete emails from the "emails' collection using name as the search attribute, you might have a problem. Why not store the actual email address in the email collection? And store email address again in the members collection. When your user logs in, you will have retrieved his member record -- including the email address. When you want to delete his emails, you already have his email and you can do:
db.emails.remove({emailAddress: theActualAddress))
Does that work?