Morphia: disable loading of @Reference'd entities - mongodb

I have 2 collections.
@Entity
public class TypeA {
    // other fields
    @Reference
    List<TypeB> typeBList;
}

@Entity
public class TypeB {
    // fields here
}
After a save operation, a sample TypeA document looks like this:
{
"_id" : ObjectId("58fda48c60b200ee765367b1"),
"typeBList" : [
{
"$ref" : "TypeB",
"$id" : ObjectId("58fda48c60b200ee765367ac")
},
{
"$ref" : "TypeB",
"$id" : ObjectId("58fda48c60b200ee765367af")
}
]
}
When I query for TypeA, Morphia eagerly loads all the TypeB entities, which I don't want.
I tried using @Reference(lazy = true), but that didn't help.
So is there a way I can write a query using Morphia where I only get the $ids inside typeBList?

Have a list of ObjectIds instead of a @Reference and manually fetch those entries when you need them.
Lazy will only load referenced entities on demand, but since you are trying to access something in that list, they will be loaded.
Personal opinion: @Reference looks great when you start, but its use can quickly cause issues. Don't build a schema with lots of references — MongoDB is not a relational database.
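A sketch of that manual approach: TypeA stores plain ids, and the TypeB documents are fetched in a single batched lookup only when actually needed. (A Map stands in for the TypeB collection here; with a real DAO you would instead run one MongoDB query with an "in" filter on _id.)

```java
import java.util.*;
import java.util.stream.Collectors;

class TypeB {
    final String id;
    TypeB(String id) { this.id = id; }
}

class TypeA {
    // Plain ids instead of @Reference-managed entities
    final List<String> typeBIds = new ArrayList<>();
}

class TypeBDao {
    // Stand-in for the TypeB collection; a real DAO would query MongoDB
    final Map<String, TypeB> collection = new HashMap<>();

    // One batched lookup, analogous to find({_id: {$in: [...]}})
    List<TypeB> findByIds(Collection<String> ids) {
        return ids.stream()
                  .map(collection::get)
                  .filter(Objects::nonNull)
                  .collect(Collectors.toList());
    }
}
```

This keeps reads of TypeA cheap (only ids come back) and defers the TypeB round trip until you actually need the entities.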

Spring Data MongoDB @DBRef loads null for child class

I'm trying to save and load an object containing a @DBRef to another object that happens to be of a child class type, and Spring Data MongoDB loads the field as null.
class Person {
    @Id
    private long id;
    @DBRef
    private Food food;
}

class Food {
    @Id
    private long id;
}

class Burger extends Food {}
I'm saving the objects separately:
Burger burger = new Burger();
foodRepository.save(burger);
Person person = new Person();
person.setFood(burger);
personRepository.save(person);
What happens is that the burger object gets saved in the food collection with the _class value of Burger in MongoDB and the $ref in the Person document points to burger and not food.
person collection:
{
"_id" : NumberLong(1),
"food" : {
"$ref" : "burger",
"$id" : NumberLong(2)
},
"_class" : "Person"
}
food collection:
{
"_id" : NumberLong(2),
"_class" : "Burger"
}
If I load the object using findAll() or findById(), the food field is null. But if I use findByFood() with the burger object, the person object is returned. What am I not understanding correctly here?
personRepository.findById(1L).getFood(); // null
personRepository.findByFood(burger); // Person object
This got answered on the Spring Data MongoDB JIRA board by Christoph Strobl. Pasting his answer below in case anyone else finds it useful. I personally picked Option 1 to solve my issue.
Persisting different types of Food via a Repository uses the Repository interface type argument to determine the actual MongoDB collection.
In this case all subtypes of Food will end up in the food collection, unless they are explicitly redirected via the collection attribute of the #Document annotation.
The Person repository on the other hand does not know about the Food repository and the collection routing to the food collection. So the mapping sees the Burger entity in Person#food and creates a reference to the burger collection, which then cannot be resolved, leading to the null value you see.
{
"_id" : 1,
"food" : {
"$ref" : "burger",
"$id" : 1
},
"_class" : "Person"
}
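The collection name in that $ref comes from the value's runtime type: with no explicit @Document, Spring Data derives the collection name from the entity class's simple name with its first letter lowercased, which is why a Burger value produces "$ref" : "burger". A minimal stand-in for that default naming rule (the real logic lives in Spring Data's mapping layer; this helper is only an illustration):

```java
class CollectionNames {
    // Mimics Spring Data MongoDB's default collection naming:
    // the entity class's simple name with a lowercase first letter.
    static String defaultCollectionName(Class<?> entityType) {
        String simple = entityType.getSimpleName();
        return Character.toLowerCase(simple.charAt(0)) + simple.substring(1);
    }
}

class Food {}
class Burger extends Food {}
```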
You could try one of the following:
Use the @Document annotation and explicitly name the collection used to store Food and its subtypes, so that the mapping layer can set the $ref correctly.
@Document("food")
class Food {
    @Id private long id;
}
Use MongoOperations#save to persist Food. That way every subclass will end up in its own collection, which allows "$ref" : "burger" to be resolved.

Spring Data Mongo - apply unique combination fields in embedded document

I'm working on Spring Boot v2.1.3.RELEASE & Spring Data Mongo. In this example, I want to apply uniqueness to email & deptName: the combination of email & deptName must be unique. Also, is there any way to factor email out, since it's repeated in each array object?
I tried the below, but it's not working!
@CompoundIndexes({
    @CompoundIndex(name = "email_deptName_idx", def = "{'email' : 1, 'technologyEmployeeRef.technologyCd' : 1}")
})
Sample Data
{
"_id" : ObjectId("5ec507c72d8c2136245d35ce"),
....
....
"firstName" : "John",
"lastName" : "Doe",
"email" : "john.doe@gmail.com",
.....
.....
.....
"technologyEmployeeRef" : [
{
"technologyCd" : "john.doe@gmail.com",
"technologyName" : "Advisory",
....
.....
"Status" : "A"
},
{
"technologyCd" : "john.doe@gmail.com",
"technologyName" : "Tax",
.....
.....
"Status" : "A"
}
],
"phoneCodes" : [
"+352"
],
....
....
}
Technology.java
@Data
@Builder
@AllArgsConstructor
@NoArgsConstructor
@Document
public class Technology {
    @Indexed(name = "technologyCd", unique = true, sparse = true)
    private String technologyCd;
    @Indexed(name = "technologyName", unique = true, sparse = true)
    private String technologyName;
    private String status;
}
EmployeeTechnologyRef.java
@Data
@Builder
@AllArgsConstructor
@NoArgsConstructor
public class EmployeeTechnologyRef {
    private String technologyCd;
    private String primaryTechnology;
    private String status;
}
Employee.java
@Data
@Builder
@AllArgsConstructor
@NoArgsConstructor
@Document
@CompoundIndexes({
    @CompoundIndex(name = "emp_tech_indx", def = "{'employeeTechnologyRefs.primaryTechnology' : 1, 'employeeTechnologyRefs.technologyCd' : 1}", unique = true, sparse = true)
})
public class Employee {
    private String firstName;
    private String lastName;
    private String email;
    private List<EmployeeTechnologyRef> employeeTechnologyRefs;
}
I used the below code, but it's not giving me any duplicate error. How can we do this?
Technology java8 = Technology.builder().technologyCd("Java").technologyName("Java8").status("A").build();
Technology spring = Technology.builder().technologyCd("Spring").technologyName("Spring Boot2").status("A").build();
List<Technology> technologies = new ArrayList<>();
technologies.add(java8);
technologies.add(spring);
technologyRepository.saveAll(technologies);
EmployeeTechnologyRef t1 = EmployeeTechnologyRef.builder().technologyCd("Java").primaryTechnology("Y")
.status("A")
.build();
EmployeeTechnologyRef t2 = EmployeeTechnologyRef.builder().technologyCd("Spring").primaryTechnology("Y")
.status("A")
.build();
List<EmployeeTechnologyRef> employeeTechnologyRefs = new ArrayList<>();
employeeTechnologyRefs.add(t1);
employeeTechnologyRefs.add(t2);
employeeTechnologyRefs.add(t1);
Employee employee = Employee.builder().firstName("John").lastName("Kerr").email("john.kerr@gmail.com")
.employeeTechnologyRefs(employeeTechnologyRefs).build();
employeeRepository.save(employee);
In MongoDB, a unique index ensures that a particular value in a field is not present in more than one document. It does not guarantee that a value is unique across an array within a single document. This is explained in the MongoDB Manual where it discusses unique multikey indexes.
Thus, a unique index will not satisfy your requirement. It will prevent separate documents from containing duplicate combinations, but it will still allow a single document to contain duplicate values across an array.
The best option you have is to change your data model so as to split the array of technologyEmployeeRef objects into separate documents. Splitting it up into separate documents will allow you to use a unique index to enforce uniqueness.
The particular implementation that should be taken for this data model change would depend upon your access pattern (which is out of the scope of this question).
One such way this could be done is to create a TechnologyEmployee collection that has all of the fields that currently exist in the technologyEmployeeRef array. Additionally, this TechnologyEmployee collection would have a field, such as email, which would allow you to associate it with a document in the Employee collection.
Sample Employee Document
{
....
....
"firstName" : "John",
"lastName" : "Doe",
"email" : "john.doe@gmail.com",
.....
.....
.....
}
Sample EmployeeTechnology Document
{
"email" : "john.doe@gmail.com",
"technologyCd" : "Java",
"technologyName" : "Java8",
....
.....
"status" : "A"
}
Index in EmployeeTechnology collection
{'email' : 1, 'technologyCd' : 1}, {unique: true}
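A sketch of what the separate document class could look like, following the same Lombok/Spring Data style used above (the class and index names are assumptions; the @CompoundIndex mirrors the index just shown):

```java
@Data
@Builder
@AllArgsConstructor
@NoArgsConstructor
@Document
@CompoundIndexes({
    @CompoundIndex(name = "email_technologyCd_idx", def = "{'email' : 1, 'technologyCd' : 1}", unique = true)
})
public class EmployeeTechnology {
    private String email;
    private String technologyCd;
    private String technologyName;
    private String status;
}
```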
The disadvantage of this approach is that you would need to read from two collections to have all of the data. This drawback may not be a big deal if you rarely need to retrieve the data from both collections at the same time. If you do need all the data, reads can be sped up through the use of indexes, and further still through the use of covered queries.
Another option is to denormalize the data. You would do this by duplicating the Employee data that you need to access at the same time as the Technology data.
Sample Documents
[
{
....
"firstName" : "John",
"lastName" : "Doe",
"email" : "john.doe@gmail.com",
.....
"technologyCd" : "Java",
"technologyName" : "Java8",
....
"status" : "A"
},
{
....
"firstName" : "John",
"lastName" : "Doe",
"email" : "john.doe@gmail.com",
.....
"technologyCd" : "Spring",
"technologyName" : "Spring Boot2",
....
"status" : "A"
}
]
In this MongoDB blog post, they say that
You’d do this only for fields that are frequently read, get read much more often than they get updated, and where you don’t require strong consistency, since updating a denormalized value is slower, more expensive, and is not atomic.
Or as you've already mentioned, it may make sense to leave the data model as it is and to perform the check for uniqueness on the application side. This could likely give you the best read performance, but it does come with some disadvantages. First, it will slow down write operations because the application will need to run some checks before it can update the database.
It may be unlikely, but there is also a possibility that you could still end up with duplicates. If there are two back-to-back requests to insert the same EmployeeTechnology object into the array, then the validation of the second request may finish (and pass) before the first request has written to the database. I have seen a similar scenario myself with an application I worked on. Even though the application was checking for uniqueness, if a user double-clicked a submit button there would end up being duplicate entries in the database. In this case, disabling the button on the first click drastically reduced the risk. This small risk may be tolerable, depending on your requirements and the impact of having duplicate entries.
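A sketch of such an application-side check in plain Java, assuming uniqueness means no repeated (email, technologyCd) pair (and keeping in mind the race window described above):

```java
import java.util.*;

class UniquenessCheck {
    // True if no (email, technologyCd) pair occurs more than once
    static boolean hasNoDuplicateRefs(String email, List<String> technologyCds) {
        Set<String> seen = new HashSet<>();
        for (String cd : technologyCds) {
            if (!seen.add(email + "|" + cd)) {
                return false; // duplicate combination found
            }
        }
        return true;
    }
}
```

The application would run this (plus a query against existing documents) before writing, and reject the save when it returns false.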
Which approach makes the most sense largely depends on your access pattern and requirements. Hope this helps.

How to update document in mongo to get performance?

I am new to Spring Data Mongo. I have a scenario where I want to create a Study if it is not already present in MongoDB. If it is already present, I have to update it with the new values.
I tried the following, which works fine in my case, but I'm not sure it is the correct/best/advisable way to do the update as far as performance is concerned.
Could anyone please guide on this?
public void saveStudy(List<Study> studies) {
    for (Study study : studies) {
        String id = study.getId();
        Study presentInDBStudy = studyRepository.findOne(id);
        // find the document, modify it, and update it with save()
        if (presentInDBStudy != null) {
            presentInDBStudy.setTitle(study.getTitle());
            presentInDBStudy.setDescription(study.getDescription());
            presentInDBStudy.setStart(study.getStart());
            presentInDBStudy.setEnd(study.getEnd());
            studyRepository.save(presentInDBStudy);
        } else {
            studyRepository.save(study);
        }
    }
}
You will have to use MongoTemplate.upsert() to achieve this.
You will need to add two more classes: a StudyRepositoryCustom interface and a class that implements it, say StudyRepositoryImpl.
interface StudyRepositoryCustom {
    public WriteResult updateStudy(Study study);
}
Update your current StudyRepository to extend this interface
@Repository
public interface StudyRepository extends MongoRepository<Study, String>, StudyRepositoryCustom {
    // ... your code as before
}
And add a class that implements StudyRepositoryCustom. This is where we will @Autowire our MongoTemplate and provide the implementation for updating a Study, or saving it if it does not exist, using the MongoTemplate.upsert() method.
class StudyRepositoryImpl implements StudyRepositoryCustom {

    @Autowired
    MongoTemplate mongoTemplate;

    public WriteResult updateStudy(Study study) {
        Query searchQuery = new Query(Criteria.where("id").is(study.getId()));
        return mongoTemplate.upsert(searchQuery,
                Update.update("title", study.getTitle()).set("description", study.getDescription()).set(...),
                Study.class);
    }
}
Kindly note that StudyRepositoryImpl will automatically be picked up by the Spring Data infrastructure, as we've followed the naming convention of suffixing the core repository interface's name with Impl.
Check this example on GitHub for @Autowire-ing a MongoTemplate and using a custom repository as above.
I have not tested the code but it will guide you :-)
You can use the upsert functionality for this, as described in the MongoDB documentation:
https://docs.mongodb.com/v3.2/reference/method/db.collection.update/
You can update your code to use <S extends T> List<S> save(Iterable<S> entities); to save all the entities. Spring's MongoRepository will take care of all possible cases based on the presence of the _id field and its value.
More information here https://docs.mongodb.com/manual/reference/method/db.collection.save/
This will work just fine for basic save operations. You don't have to load the document to update it. Just set the id and make sure to include all the fields for the update, as it updates by replacing the existing document.
Simplified Domain Object:
@Document(collection = "study")
public class Study {
    @Id
    private String id;
    private String name;
    private String value;
}
Repository:
public interface StudyRepository extends MongoRepository<Study, String> {}
Imagine you have an existing record with _id = 1.
Collection state before:
{
"_id" : 1,
"_class" : "com.mongo.Study",
"name" : "saveType",
"value" : "insert"
}
Run all the possible cases:
public void saveStudies() {
    List<Study> studies = new ArrayList<Study>();

    // Updates the existing record by replacing it with the below values.
    Study update = new Study();
    update.setId("1");
    update.setName("saveType");
    update.setValue("update");
    studies.add(update);

    // Inserts a new record.
    Study insert = new Study();
    insert.setName("saveType");
    insert.setValue("insert");
    studies.add(insert);

    // Upserts a record.
    Study upsert = new Study();
    upsert.setId("2");
    upsert.setName("saveType");
    upsert.setValue("upsert");
    studies.add(upsert);

    studyRepository.save(studies);
}
Collection state after:
{
"_id" : 1,
"_class" : "com.mongo.Study",
"name" : "saveType",
"value" : "update"
}
{
"_id" : 3,
"_class" : "com.mongo.Study",
"name" : "saveType",
"value" : "insert"
}
{
"_id" : 2,
"_class" : "com.mongo.Study",
"name" : "saveType",
"value" : "upsert"
}
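The save semantics shown above (insert when _id is absent, whole-document replacement when it already exists) can be sketched with an in-memory stand-in for the collection; this is only an illustration, not MongoDB's actual implementation:

```java
import java.util.*;

class FakeCollection {
    final Map<String, Map<String, Object>> docs = new HashMap<>();
    private long seq = 0;

    // Mirrors db.collection.save(): insert when _id is missing,
    // replace the whole document when _id already exists.
    void save(Map<String, Object> doc) {
        Object id = doc.get("_id");
        if (id == null) {
            id = "generated-" + (++seq); // stand-in for a generated ObjectId
            doc.put("_id", id);
        }
        docs.put(id.toString(), doc); // whole-document replacement
    }
}
```

Note the replacement detail: any field you leave out of the new document is gone after the save, which is why the answer stresses including all the fields in the update.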

MongoDB DBRef handling inside GWT's RequestFactory function call

I have question related to DBRef of MongoDB. Imagine this scenario:
Group{
...
"members" : [
{
"$ref" : "User",
"$id" : ObjectId("505857a4e4b5541060863061")
},
{
"$ref" : "User",
"$id" : ObjectId("50586411e4b0b31012363208")
},
{
"$ref" : "User",
"$id" : ObjectId("50574b9ce4b0b3106023305c")
},
]
...
}
So the given group document has 3 user DBRefs. In the Java class for Group, members is tagged with Morphia's @Reference:
public class Group {
    ...
    @Reference
    List<User> members;
    ...
}
Question: when calling the RequestFactory function getGroup().with("members"), will RequestFactory get all members in ONLY 1 DB access?
Or will RequestFactory make 3 DB accesses, one for each DBRef in the Group document in the scenario given above?
Thank you very much in advance.
RequestFactory itself doesn't access the DB. What it'll do here is:
call getMembers(), as it was requested by the client through .with("members")
for each entity proxy seen (either in the request or in the response), call its locator's isLive method, or if it has no Locator, call the entity's findXxx with its getId() (and check whether null is returned).
The first step depends entirely on Morphia's implementation:
because you didn't set lazy = true on your @Reference, it won't matter whether RequestFactory calls getMembers() or not; the members will always be loaded.
in any case (either eager or lazy fetching), it will lead to 4 Mongo queries (one to get the Group and another 3 for the members; I don't think Morphia tries to optimize the number of queries to only make 1 query to get all 3 members at a time)
The second step however depends entirely on your code.
Remember that RequestFactory wants you to have a single instance of an entity per HTTP request. As I understand it, Morphia has an EntityCache doing just that, but I suspect it could be bypassed by some methods/queries.

MongoDB schema design - finding the last X comments across all blog posts filtered by user

I am trying to reproduce the classic blog schema of one Post to many Comments using Morphia and the Play Framework.
My schema in Mongo is:
{ "_id" : ObjectId("4d941c960c68c4e20d6a9abf"),
"className" : "models.Post",
"title" : "An amazing blog post",
"comments" : [
{
"commentDate" : NumberLong("1301552278491"),
"commenter" : {
"$ref" : "SiteUser",
"$id" : ObjectId("4d941c960c68c4e20c6a9abf")
},
"comment" : "What a blog post!"
},
{
"commentDate" : NumberLong("1301552278492"),
"commenter" : {
"$ref" : "SiteUser",
"$id" : ObjectId("4d941c960c68c4e20c6a9abf")
},
"comment" : "This is another comment"
}
]}
I am trying to introduce a social networking aspect to the blog, so I would like to be able to provide on a SiteUser's homepage the last X comments by that SiteUser's friends, across all posts.
My models are as follows:
@Entity
public class Post extends Model {
    public String title;
    @Embedded
    public List<Comment> comments;
}

@Embedded
public class Comment extends Model {
    public long commentDate;
    public String comment;
    @Reference
    public SiteUser commenter;
}
From what I have read elsewhere, I think I need to run the following against the database (where [a, b, c] represents the SiteUsers):
db.posts.find( { "comments.commenter" : {$in: [a, b, c]}} )
I have a List<SiteUser> to pass in to Morphia for the filtering, but I don't know how to
set up an index on Post for Comments.commenter from within Morphia
actually build the above query
Either put @Indexes(@Index("comments.commenter")) on the Post class, or @Indexed on the commenter field of the Comment class (Morphia's Datastore.ensureIndexes() will recurse into the classes and correctly create the comments.commenter index on the Post collection).
I think ds.find(Post.class, "comments.commenter in", users) would work, ds being a Datastore and users your List<SiteUser> (I don't use @Reference though, so I can't confirm; you might have to first extract the list of their Keys).
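If you do need to extract the ids first, that step is plain Java (SiteUser here is a minimal stand-in for your model class; adapt getId() to whatever your id field actually is):

```java
import java.util.*;
import java.util.stream.Collectors;

class SiteUser {
    private final String id;
    SiteUser(String id) { this.id = id; }
    String getId() { return id; }
}

class QueryPrep {
    // Collect the ids to feed into "comments.commenter in" / {$in: [...]}
    static List<String> idsOf(List<SiteUser> users) {
        return users.stream().map(SiteUser::getId).collect(Collectors.toList());
    }
}
```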