Questions regarding mongodb sub-document and spring-data-mongo querying

Questions regarding mongodb sub-document and spring-data-mongo querying - mongodb

I'm still trying to get my hands around mongodb and how best Entities can be mapped. if you take for example: the entity user and the entity addresses. there could be one-to-many when someone is coming from jpa background. Here in mongo i don't want to use dbref. So addresses are in a Set collection in user.
Supposing i was using spring-data-mongo:
Question 1 : should both User and Address have the #Document annotation?Or just User?
Question 2 : what is the best way to query for addresses of a user. It is possible at first place? Because right now I query to get the User by username or Id and then get the addresses of the user.Can I query directly for sub-document? if yes how is it done using spring-data-mongo Criteria Query:
#Document
public class User{
#Id
private Long ID;
private String username;
private Set<Address> addresses = new HashSet<Address>();
...
}
#Document
public class Address {
#Id
private Long ID;
private String city;
private String line1;
...
}

Question 1: No, #Document is not strictly necessary at all. We just leverage this on application startup if you activate classpath-scanning for document classes. If you don't the persistence metadata scanning will be done on the first persistence operation. We traverse properties of domain objects then, so Address will be discovered.
Question 2: You'll have to read the User objects entirely as MongoDB currently does not allow returning sub-documents. So you'll have to query for the entire Userdocument but can restrict the fields being returned to the addresses field using a fieldSpec on the Query object or the repository abstraction's #Query annotation (see ref docs).

Related

Spring Mongo DB #DBRef(lazy=true) - How to lazy Load

I have a model like the one below (assume as pseudo code )
class Student {
#Id
private String id;
private String firstname;
.....;
#DBRef(lazy=true)
private College college
// getters and setters
}
class College {
#Id
private String id;
private String name;
// other attributes.
// getters and setters
}
I am using #DBRef(lazy=true) so that I do not load the college associated with the student. For example: if I have a repository method for Student called findByFirstname(String firstname), I can load the student without the college.
However, at times I would also want to load the student with college. Is it possible to write a repository method with a custom query using the #Query annotation (org.springframework.data.mongodb.core.query.Query) where I can load the student (all fields) and also the associated college instance ?
#Query( what should go here ?)
Student findStudentWithCollege(String firstname)
If no, then what would be a suggested way to load lazy documents on demand ?
As per the documentation
"DBRefs can also be resolved lazily. In this case the actual Object or Collection of references is resolved on first access of the property. Use the lazy attribute of #DBRef to specify this. Required properties that are also defined as lazy loading DBRef and used as constructor arguments are also decorated with the lazy loading proxy making sure to put as little pressure on the database and network as possible." I guess this may not be suitable for cases where one would want to load a student whose last name is "Smith" along with the college instance for each of the students retrieved.

How to properly use Locking or Transactions to prevent duplicates using Spring Data

What is the best way to check if a record exists and if it doesn't, create it (avoiding duplicates)?
Keep in mind that this is a distributed application running across many application servers.
I'm trying to avoid these:
Race Conditions
TOCTOU
A simple example:
Person.java
#Entity
public class Person {
#Id
#GeneratedValue
private long id;
private String firstName;
private String lastName;
//Getters and Setters Omitted
}
PersonRepository.java
public interface PersonRepository extends CrudRepository<Person, Long>{
public Person findByFirstName(String firstName);
}
Some Method
public void someMethod() {
Person john = new Person();
john.setFirstName("John");
john.setLastName("Doe");
if(personRepo.findByFirstName(john.getFirstName()) == null){
personRepo.save(john);
}else{
//Don't Save Person
}
}
Clearly as the code currently stands, there is a chance that the Person could be inserted in the database in between the time I checked if it already exists and when I insert it myself. Thus a duplicate would be created.
How should I avoid this?
Based on my initial research, perhaps a combination of
#Transactional
#Lock
But the exact configuration is what I'm unsure of. Any guidance would be greatly appreciated. To reiterate, this application will be distributed across multiple servers so this must still work in a highly-available, distributed environment.

For Inserts: if you want to prevent same recordsto be persisted, than you may want to take some precoutions on DB side. In your example, if firstname should be unique, then define a unique index on that column, or a agroup of colunsd that should be unique, and let the DB handle the check, you just insert & get exception if you're inserting a record that's already inserted.
For updates: use #Version (javax.persistence.Version) annotation like this:
#Version
private long version;
Define a version column in tables, Hibernate or any other ORM will automatically populate the value & also verison to where clause when entity updated. So if someone try to update the old entity, it prevent this. Be careful, this doesn't throw exception, just return update count as 0, so you may want to check this.

Hazelcast Complex Object Model in Cache

I am looking to put a complex model into Hazelcast to use it as the data tier of an application with MapStore implementations rendering the actual objects to the database. So for example, lets say we have the following noxiously common model where I have stripped out getters and setters for brevity:
class Customer {
public int id;
public String name;
public Address address;
}
class Address {
public int id;
public String street;
public string city;
public String state;
public String zip;
}
class InterestGroup {
public int id;
public String name;
public List<Customer> customers;
}
This is a model that I want to store in the database but I also want to map into Hazelcast. Furthermore lets say that I want customers to share addresses such that if the address changes for one, it will change for all customers with that address.
I can write MapStore classes to read this information out of the database and even give each object a primary key to use as a map key. What I am having trouble with is setting up navigation within the map between entities. Lets say I obtain a customer and want to navigate to the address of that customer and then get all customers that use that address.
If I load customers and addresses into a map, I dont want to embed all customers in an address nor do I want to embed the address in each customer. I want to navigate transparrently from the customer to the address. Is there a means by which I could do this in hazelcast without breaking the dynamics of a nested object but while allowing addresses to live in another map? The situation is similar for interest groups. If I embed all customers in an interest group then I am duplicating data all over especially if the customer is in several interest groups.
To accomplish this without duplication all over do I have to compromise the object structure of my entities?
Thanks in advance.

If you know how to build the address_key for the address Hazelcast map you can implement HazecastInstanceAware to your model classes and build some kind of "lazy fetch" using getters to retrieve the address. Does that make sense to you? :)

What is the best practice for joins in MongoDB?

I have two class called School and Student as you see. I want to search for "students that school names are bla bla bla" and "schools that have students which has higher grade than 90". I read some documents but I am a little confused.
public class School extends BasicDBObject {
private int id;
private String name;
private String number;
private List<Student> studentList = new ArrayList<Student>();,
//getter and setters
}
public class Student extends BasicDBObject{
private int id;
private String name;
private String grade;
private School school;
//getter and setters
}

MongoDB is not a relational database. It doesn't support joins. To simulate a join, you have to query the first collection, get the results, and then query the second collection with a large $in query filled with the applicable key values of the documents returned by the first query. This is as slow and ugly as it sounds, so the database schema should be designed to avoid it.
For your example, I would add the school names to the Student documents. This would allow to satisfy both of your use-cases with a single query.
Anyone coming from a relational database background would now say "But that's a redundancy! You've violated the second normal form!". That's true, but normalization is a design pattern specific to relational databases. It doesn't necessarily apply to document-oriented databases. So what design patterns are applicable for document-oriented databases? Tough call. It's a new technology. We are still figuring this out.

Does Hibernate Search load entity associations and can I mix free text queries with native SQL?

package example;
...
#Entity
#Indexed
public class Book {
#Id
#GeneratedValue
private Integer id;
#Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
private String title;
#Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
private String subtitle;
#Field(index = Index.YES, analyze=Analyze.NO, store = Store.YES)
#DateBridge(resolution = Resolution.DAY)
private Date publicationDate;
#IndexedEmbedded
#ManyToMany
private Set<Author> authors = new HashSet<Author>();
#OneToMany(mappedBy="book")
List<BookPages> bookPages;
}
1) If the search result is of type Book.class does the result contain #ManyToOne objects (bookPages) or do I have to load them separately? Because I need them for showing the result.
2) Is it possible to add a native sql clause to the search? Because I need to limit the result and for that I have to JOIN another table which is not declared in Book.class.

That is a basic Hibernate ORM question, not related to Hibernate Search. Yes you can always navigate from one entity to its relations by just invoking the getter / accessing the fields: depending on your (configurable) fetch strategy it will either have the relation preloaded in "one shot" when loading the main entity (likely with a JOIN) or fetch it transparently on demand. This configuration is however not have any effect on functionality, more a performance tuning option.
No you can't mix SQL with an Hibernate Search (Fulltext) query; what you can do is to expose the needed data from the other table in the mapping - which would be a cleaner mapping anyway - and then use the Hibernate Search annotations to make sure all fields you need are indexed as well, so that you can include the restrictions in the FullTextQuery directly; fill perform much faster as well than any SQL.

It is not possible to mix native SQL with hibernate search query, as there is no way to intersect the results from both queries without iterating on at least one of the results.
See the documentation reference about this exact question.
Hibernate Search - FAQ - Can I mix HQL and Lucene queries?
http://hibernate.org/search/faq/