What is the best practice for joins in MongoDB? - mongodb

I have two class called School and Student as you see. I want to search for "students that school names are bla bla bla" and "schools that have students which has higher grade than 90". I read some documents but I am a little confused.
public class School extends BasicDBObject {
private int id;
private String name;
private String number;
private List<Student> studentList = new ArrayList<Student>();,
//getter and setters
}
public class Student extends BasicDBObject{
private int id;
private String name;
private String grade;
private School school;
//getter and setters
}

MongoDB is not a relational database. It doesn't support joins. To simulate a join, you have to query the first collection, get the results, and then query the second collection with a large $in query filled with the applicable key values of the documents returned by the first query. This is as slow and ugly as it sounds, so the database schema should be designed to avoid it.
For your example, I would add the school names to the Student documents. This would allow to satisfy both of your use-cases with a single query.
Anyone coming from a relational database background would now say "But that's a redundancy! You've violated the second normal form!". That's true, but normalization is a design pattern specific to relational databases. It doesn't necessarily apply to document-oriented databases. So what design patterns are applicable for document-oriented databases? Tough call. It's a new technology. We are still figuring this out.

Related

How to properly use Locking or Transactions to prevent duplicates using Spring Data

What is the best way to check if a record exists and if it doesn't, create it (avoiding duplicates)?
Keep in mind that this is a distributed application running across many application servers.
I'm trying to avoid these:
Race Conditions
TOCTOU
A simple example:
Person.java
#Entity
public class Person {
#Id
#GeneratedValue
private long id;
private String firstName;
private String lastName;
//Getters and Setters Omitted
}
PersonRepository.java
public interface PersonRepository extends CrudRepository<Person, Long>{
public Person findByFirstName(String firstName);
}
Some Method
public void someMethod() {
Person john = new Person();
john.setFirstName("John");
john.setLastName("Doe");
if(personRepo.findByFirstName(john.getFirstName()) == null){
personRepo.save(john);
}else{
//Don't Save Person
}
}
Clearly as the code currently stands, there is a chance that the Person could be inserted in the database in between the time I checked if it already exists and when I insert it myself. Thus a duplicate would be created.
How should I avoid this?
Based on my initial research, perhaps a combination of
#Transactional
#Lock
But the exact configuration is what I'm unsure of. Any guidance would be greatly appreciated. To reiterate, this application will be distributed across multiple servers so this must still work in a highly-available, distributed environment.
For Inserts: if you want to prevent same recordsto be persisted, than you may want to take some precoutions on DB side. In your example, if firstname should be unique, then define a unique index on that column, or a agroup of colunsd that should be unique, and let the DB handle the check, you just insert & get exception if you're inserting a record that's already inserted.
For updates: use #Version (javax.persistence.Version) annotation like this:
#Version
private long version;
Define a version column in tables, Hibernate or any other ORM will automatically populate the value & also verison to where clause when entity updated. So if someone try to update the old entity, it prevent this. Be careful, this doesn't throw exception, just return update count as 0, so you may want to check this.

How to retrieve nested data in JPQL?

working with OpenJPA2 persistence. I have a very simple entity class, that does have a String property and a List property. I do persist its instances flawlessly with the nested List (in a JSF2 web project). I check the database and there appears two tables (I use automatic schema generation), one for the entity itself, and other table for the nested List. All data persisted using EntityManager is stored fine on both tables.
Problem is I cannot retrieve nested data. I mean, I do a Query for getting all instances of the entity, but the List of all instances come empty.
(DB Engine is MySQL. ORM is OpenJPA2. Server is TomEE 1.6. IDE is Netbeans 8. I use automatic schema generation, so I DO NOT WANT to design the database tables, and I DO WANT to let the ORM create the tables, so I can work purely with objects and forget about DB.)
following is Entity Class code:
#Entity
public class Cliente implements Serializable {
private static final long serialVersionUID = 1L;
#Id
#GeneratedValue(strategy = GenerationType.AUTO)
private Long id;
private String nombre;
#ElementCollection
private List<String> emails = new ArrayList<String>();
// getters and setters omitted for brevity.
It does have an associated Facade Class, which is ClienteFacade. It includes the getAll method which uses a JPQL Query.
public List<Cliente> listaClientes(){
Query query = em.createQuery("SELECT c FROM Cliente c");
return query.getResultList(); }
Problem is, I get all instances of the entity in a List, but all lists of List emails come out empty. I think problem may be in the JPQL query. Still trying to learn JPQL with some difficulty... so please, How can I retrieve the nested data with JPQL?
Many thanks!
Try #ElementCollection(fetch = FetchType.EAGER) because the default type is lazy. That means that the lists are not loaded directly.

Difference between defining queries in the repository interface or entity class?

Sorry if this is a very nooby/stupid question, but I was wondering if there was any difference, besides implementation, between defining a query in the repository:
public interface EmployeeRepository<Employee, Integer> {
#Query("select e from Employee e where e.name like :name")
public List<Employee> findByName(#Param("name") String name);
}
and defining a query in the entity:
#Entity
#NamedQuery(name="Employee.findByName", query="select e from Employee e where e.name like :name")
public class Employee {
#Id
#GeneratedValue(strategy=GenerationType.AUTO)
private int id;
//...
}
Like are there advantages/disadvantages to either one?
Generally speaking we recommend defining the queries at the repository interface for a very simple reason: it's conceptually closer to the query execution. Also, #Query has a few advanced options when it comes to the additional queries that e.g. need to be triggered to implement pagination.
However, if you want to re-use the query definition on multiple query methods, using a named query is still a reasonable option.
The most important aspect IMO is consistency either among the team or at least per repo. If you start with named queries, don't mix them up with #Query definitions as that might confuse developers or at least make it harder to understand what's going on.

Questions regarding mongodb sub-document and spring-data-mongo querying

I'm still trying to get my hands around mongodb and how best Entities can be mapped. if you take for example: the entity user and the entity addresses. there could be one-to-many when someone is coming from jpa background. Here in mongo i don't want to use dbref. So addresses are in a Set collection in user.
Supposing i was using spring-data-mongo:
Question 1 : should both User and Address have the #Document annotation?Or just User?
Question 2 : what is the best way to query for addresses of a user. It is possible at first place? Because right now I query to get the User by username or Id and then get the addresses of the user.Can I query directly for sub-document? if yes how is it done using spring-data-mongo Criteria Query:
#Document
public class User{
#Id
private Long ID;
private String username;
private Set<Address> addresses = new HashSet<Address>();
...
}
#Document
public class Address {
#Id
private Long ID;
private String city;
private String line1;
...
}
Question 1: No, #Document is not strictly necessary at all. We just leverage this on application startup if you activate classpath-scanning for document classes. If you don't the persistence metadata scanning will be done on the first persistence operation. We traverse properties of domain objects then, so Address will be discovered.
Question 2: You'll have to read the User objects entirely as MongoDB currently does not allow returning sub-documents. So you'll have to query for the entire Userdocument but can restrict the fields being returned to the addresses field using a fieldSpec on the Query object or the repository abstraction's #Query annotation (see ref docs).

Does Hibernate Search load entity associations and can I mix free text queries with native SQL?

package example;
...
#Entity
#Indexed
public class Book {
#Id
#GeneratedValue
private Integer id;
#Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
private String title;
#Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
private String subtitle;
#Field(index = Index.YES, analyze=Analyze.NO, store = Store.YES)
#DateBridge(resolution = Resolution.DAY)
private Date publicationDate;
#IndexedEmbedded
#ManyToMany
private Set<Author> authors = new HashSet<Author>();
#OneToMany(mappedBy="book")
List<BookPages> bookPages;
}
1) If the search result is of type Book.class does the result contain #ManyToOne objects (bookPages) or do I have to load them separately? Because I need them for showing the result.
2) Is it possible to add a native sql clause to the search? Because I need to limit the result and for that I have to JOIN another table which is not declared in Book.class.
That is a basic Hibernate ORM question, not related to Hibernate Search. Yes you can always navigate from one entity to its relations by just invoking the getter / accessing the fields: depending on your (configurable) fetch strategy it will either have the relation preloaded in "one shot" when loading the main entity (likely with a JOIN) or fetch it transparently on demand. This configuration is however not have any effect on functionality, more a performance tuning option.
No you can't mix SQL with an Hibernate Search (Fulltext) query; what you can do is to expose the needed data from the other table in the mapping - which would be a cleaner mapping anyway - and then use the Hibernate Search annotations to make sure all fields you need are indexed as well, so that you can include the restrictions in the FullTextQuery directly; fill perform much faster as well than any SQL.
It is not possible to mix native SQL with hibernate search query, as there is no way to intersect the results from both queries without iterating on at least one of the results.
See the documentation reference about this exact question.
Hibernate Search - FAQ - Can I mix HQL and Lucene queries?
http://hibernate.org/search/faq/