Migrating an ORM-based Spring project (Hibernate / JPA) to noSQL (MongoDB/Cassandra/CouchDB) - jpa

I am fairly new to the noSQL world, and although I understand the benefits of performance and "cloud" friendliness, the RDBMS world seems much simpler, more standardized, and limited to fewer players.
I have worked with SQL Server, Oracle, DB2, Sybase, Teradata, MySQL and others, and they seem to have much more in common (in terms of query language, indexing, ACID, etc.) than the noSQL family does.
My questions are mainly these:
Is it at all a valid concept to move an existing Spring/Java EE+JPA app to noSQL storage? Or will it require a complete re-architecture of the system beyond the storage medium?
Assuming it is a valid goal, are there any migration paths that have been documented in case studies as best practices?
Is there a concept in "noSQL" that is comparable to ORM for RDBMS, e.g. any layer separating the storage implementation from the domain model? (I know GAE BigTable supports JDO and JPA but only partially. Is there a newer JSR for a more noSQL-friendly JPA?)
Are there any attempts to standardize "noSQL" the same way RDBMS are (query language, API)?
Is "noSQL" too wide a term? Should I narrow the question per implementation (KV/Document)?

Try Kundera: https://github.com/impetus-opensource/Kundera. It is an open source, JPA 2.0 compliant solution. You can also join http://groups.google.com/group/kundera-discuss/subscribe for further discussion.
-Vivek

DataNucleus allows JPA persistence to RDBMS, MongoDB, HBase and various other datastores. That is one way you can tackle the problem, giving you a starting point for using your app with other datastores. From there you could modify class hierarchies to get around some of the problems that these other datastores bring. Use of JPA with other datastores is not part of any JSR and never will be, since JPA is designed solely around RDBMS. JDO, on the other hand, is already a standard for all datastores, as it was designed to be (and is also supported by DataNucleus).
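Whichever provider is chosen, one way to keep such a switch manageable is to have application code depend on a storage-neutral interface rather than on the persistence API directly. The following is only a minimal sketch using the JDK alone; `Customer`, `CustomerRepository` and `InMemoryCustomerRepository` are hypothetical names, and the in-memory map merely stands in for a JPA-backed or MongoDB-backed implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain object; it carries no JPA or datastore annotations.
final class Customer {
    final String id;
    final String name;

    Customer(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Storage-neutral contract that the rest of the application depends on.
interface CustomerRepository {
    void save(Customer customer);

    Optional<Customer> findById(String id);
}

// In-memory stand-in (assumption for illustration). A JPA-backed or
// MongoDB-backed class would implement the same interface, so callers
// would not need to change when the datastore does.
final class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<String, Customer> store = new HashMap<>();

    @Override
    public void save(Customer customer) {
        store.put(customer.id, customer);
    }

    @Override
    public Optional<Customer> findById(String id) {
        return Optional.ofNullable(store.get(id));
    }
}

class RepositoryDemo {
    public static void main(String[] args) {
        CustomerRepository repo = new InMemoryCustomerRepository();
        repo.save(new Customer("c1", "Ada"));
        System.out.println(repo.findById("c1").map(c -> c.name).orElse("not found"));
    }
}
```

The interface only hides the API; entity shapes and queries may still need rework for a non-relational store, as the answer above notes.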

EclipseLink 2.4 supports JPA with MongoDB and other NoSQL data sources.
http://java-persistence-performance.blogspot.com/2012/04/eclipselink-jpa-supports-mongodb.html

PlayOrm is another solution, with its Scalable-SQL. It is JPA-like but breaks from JPA in that it supports noSQL patterns that JPA can't support.

Related

JPA specification for NoSQL databases

I have just heard that the JPA specification is now available and usable for NoSQL databases. My question is: is it the same specification that we are used to using with relational databases? There are many differences between relational and NoSQL databases, especially when it comes to transactions, which do not work the same way at all. Did Oracle release a new specification that covers the changes needed for NoSQL databases?
regards
JPA is defined entirely for RDBMS datastores. See Oracle's specification.
There are a few of the well known JPA implementations that have extended their support for JPA to also allow some non-RDBMS datastores to be used with the same API (the original one that did this was DataNucleus JPA, but Hibernate and EclipseLink have copied this since). While you can use the same API for persistence, you have to be aware that you make some compromises since the query language in particular is not always suited to non-RDBMS datastores. There are no plans (that I know of) to have a JPA spec for non-RDBMS.
There is a JDO (Java Data Objects) persistence spec that applies to RDBMS and non-RDBMS, and the JDO API is more suited to many different types of datastores.

play framework anorm for different database

I am new to Scala as well as to Play Framework 2.0 with Scala. I like the idea of writing the SQL code myself and having full control rather than depending on an ORM tool. But does Anorm SQL work across different database vendors like MySQL and Oracle? My application should be able to work with any relational database, so my requirement is to write SQL that works across databases, whichever one the customer has.
Some customers might have Oracle and some might have MySQL, so my code should be DB agnostic. Is this possible in Scala? I know that queries which run on MySQL will not necessarily run on Oracle.
Thanks in Advance,
Pradeep
Short answer: NO.
Long answer: Anorm is just a library for dispatching your SQL queries to the database through JDBC, retrieving the results and delivering them to you. It does not understand the differences between different databases because it relies on JDBC for connection handling, and on you for writing queries.
You either have to handle different DB engines yourself or have an ORM handle that for you.
PS: Unless you really need to have a DB agnostic application (and fully understand its implications), I'd suggest you simply target 2-3 popular engines and avoid the future complications.
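Handling the engines yourself usually means isolating the vendor-specific SQL behind a small abstraction. The following is only a sketch, written in Java for illustration rather than in Scala; `SqlDialect`, `MySqlDialect`, `OracleDialect` and `Dialects` are hypothetical names and not part of Anorm or JDBC. It uses row limiting as the example because MySQL and Oracle spell it differently.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical abstraction: one implementation per supported database engine.
interface SqlDialect {
    // Returns a query that fetches at most `limit` rows from `table`.
    // `table` is assumed to be a trusted constant, never user input.
    String selectFirstRows(String table, int limit);
}

final class MySqlDialect implements SqlDialect {
    @Override
    public String selectFirstRows(String table, int limit) {
        return "SELECT * FROM " + table + " LIMIT " + limit;
    }
}

final class OracleDialect implements SqlDialect {
    @Override
    public String selectFirstRows(String table, int limit) {
        // ROWNUM also works on older Oracle releases; 12c and later
        // additionally accept FETCH FIRST n ROWS ONLY.
        return "SELECT * FROM " + table + " WHERE ROWNUM <= " + limit;
    }
}

final class Dialects {
    private static final Map<String, SqlDialect> BY_VENDOR = Map.of(
            "mysql", new MySqlDialect(),
            "oracle", new OracleDialect());

    // `vendor` is assumed to come from deployment configuration.
    static SqlDialect forVendor(String vendor) {
        SqlDialect dialect = BY_VENDOR.get(vendor.toLowerCase(Locale.ROOT));
        if (dialect == null) {
            throw new IllegalArgumentException("Unsupported database: " + vendor);
        }
        return dialect;
    }
}

class DialectDemo {
    public static void main(String[] args) {
        System.out.println(Dialects.forVendor("MySQL").selectFirstRows("users", 10));
        System.out.println(Dialects.forVendor("oracle").selectFirstRows("users", 10));
    }
}
```

Every query that differs between engines needs its own entry in such an interface, which is the maintenance cost the PS above warns about.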

Java EE using MongoDB without JPA or EJB?

Could someone explain whether what I'm doing makes sense?
I am currently developing a Java EE application using MVC architecture, and MongoDB as my database. What I have is several entities written as Java objects with custom mapping methods to persist to and from my MongoDB, as well as a separate controller class to perform Database queries and operations. I am able to store these entities in my session with no problem, but I haven't tested this on a larger scale. I've tried annotating my objects as beans, however I received errors.
My typical method of transmitting data is to query my MongoDB, receive the information, map it to a java object, and store it in a session to be accessed by the front end. Is this the proper way to go about this?
Do my entities need to be EJBs? What do I have to gain from making them EJBs? I'm sorry if this question is presented poorly and seems unintelligent. I just want to have a better understanding of the technology I am trying to utilize before further developing. Most of the reading I have done on such topics has been to no avail. If anyone has some clear reading or an explanation that should help me understand what I am asking, it would be most appreciated.
I assume by "EJBs", you are referring to "Entity Beans"? In EJB 3, entity beans are "replaced" by JPA. Think of JPA as a "specification" for ORM frameworks. JPA/ORM are frameworks for mapping Java objects to and from relational databases. You are using MongoDB, which is not a relational database, and hence not that suitable for JPA. So I would say no, there is no need for you to consider JPA. Instead you should consider other frameworks, like Spring Data, which can simplify the task you are doing.
In my opinion you can use EJB 3 with MongoDB without JPA and the entity manager. You would still have Stateless/Singleton/Startup/MDB beans with zero configuration and a perfectly manageable backend.
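The custom mapping approach described in the question can be sketched with plain Java objects and no EJB or JPA involvement. This is only an illustration under stated assumptions: `User` is a hypothetical entity, and a `Map<String, Object>` stands in for whatever document type the database driver expects, so no driver calls are shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical entity: a plain Java object, not an EJB or a JPA entity.
final class User {
    final String username;
    final int age;

    User(String username, int age) {
        this.username = username;
        this.age = age;
    }

    // Converts the entity into a document-shaped map of field names to values.
    Map<String, Object> toDocument() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("username", username);
        doc.put("age", age);
        return doc;
    }

    // Rebuilds the entity from a document-shaped map read back from the store.
    // Numbers are read via Number because a store may return a wider type.
    static User fromDocument(Map<String, Object> doc) {
        String username = (String) doc.get("username");
        int age = ((Number) doc.get("age")).intValue();
        return new User(username, age);
    }
}

class MappingDemo {
    public static void main(String[] args) {
        Map<String, Object> doc = new User("alice", 30).toDocument();
        User restored = User.fromDocument(doc);
        System.out.println(restored.username + " " + restored.age);
    }
}
```

Keeping the mapping in one place like this makes it easier to replace later with a mapping framework if the hand-written code grows.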

Java ORMs on NoSQL DB like HBase

I have recently started getting familiarized with NoSQL (HBase). I am definitely a noob.
I was investigating about ORMs and high level clients which can be used on HBase and came across a few.
Some ORM libraries like Kundera are providing SQL like data query functionality. I am finding this a little counter intuitive.
Can any one help me understand why we would again need SQL like querying if the whole objective was to move away from it?
Also can anyone comment on your experiences with ORMs for HBase? I looked at a few of them from http://wiki.apache.org/hadoop/SupportingProjects and started looking at Kundera.
Another related question - Does data query with Kundera run map reduce jobs internally?
Kundera or Spring Data might provide a user-friendly ORM layer over NoSQL databases, but the underlying entity model still has to be NoSQL friendly. This means that NoSQL users should not blindly follow RDBMS modeling strategies, but should design ORM entities so that the NoSQL capabilities can actually be used.
As a rule of thumb, Kundera ORM entities should be designed query-first: define the queries first, derive the primary keys from them, and use relationships as little as possible. Querying on arbitrary columns and full scans should be avoided, so data might have to be replicated across entities to reduce multi-entity lookups. Transaction management also needs to be planned. FYI, Kundera does not support transactions (beyond the single-row transactions supported by HBase/Cassandra).
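The query-first idea can be illustrated without any particular datastore. In the sketch below the query "latest orders for one customer" is decided first, and the row key is then built so that this query becomes a single contiguous key-range read. Everything here is an assumption for illustration: `OrdersByCustomer` is a hypothetical class, a `TreeMap` stands in for a store that keeps rows sorted by key, and customer IDs are assumed not to contain `#`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Sorted map standing in for a store that keeps rows ordered by row key.
final class OrdersByCustomer {
    private final TreeMap<String, String> rows = new TreeMap<>();

    // Key = customerId + '#' + zero-padded reversed timestamp, so that the
    // newest order of a customer sorts first within that customer's range.
    static String rowKey(String customerId, long epochMillis) {
        long reversed = Long.MAX_VALUE - epochMillis;
        return customerId + "#" + String.format(Locale.ROOT, "%019d", reversed);
    }

    void put(String customerId, long epochMillis, String orderSummary) {
        rows.put(rowKey(customerId, epochMillis), orderSummary);
    }

    // Prefix scan: touches only the contiguous key range of one customer
    // instead of scanning all rows and filtering on a column.
    List<String> latestOrders(String customerId, int limit) {
        String prefix = customerId + "#";
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : rows.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix) || result.size() == limit) {
                break;
            }
            result.add(e.getValue());
        }
        return result;
    }
}

class QueryFirstDemo {
    public static void main(String[] args) {
        OrdersByCustomer orders = new OrdersByCustomer();
        orders.put("c1", 1000L, "old order");
        orders.put("c1", 2000L, "new order");
        orders.put("c2", 1500L, "other customer");
        System.out.println(orders.latestOrders("c1", 10));
    }
}
```

A second query, such as "orders by product", would typically get its own key layout and a replicated copy of the data, which is the duplication the answer refers to.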
Reasons for using Kundera:
1) If you are looking for SQL-like support over HBase. As it is built on top of the HBase native API, it simply transforms these SQL queries into the corresponding GET or PUT method calls.
2) Currently it supports HBase-0.20.6 only. Kundera-2.0.6 will enable support for HBase 0.90.x versions.
3) Kundera does not do anything out of the box to provide map reduce over SQL-like queries. However, support for this will be provided in Kundera-2.0.6, by enabling support for Hive native queries only!
It is fully JPA compliant, so there is no need to learn something new. It simply hides complexity from the developer with very minimal effort.
SQL-like querying is there for ease of development, quicker development, fewer errors and, of course, reusability!
-Vivek

Are there any benefits to using an ORDBMS instead of an RDBMS behind JPA

Today I read the PostgreSQL wiki and found that it is an ORDBMS (object-relational database management system). So I want to know: are there any benefits, in terms of performance or otherwise, to using PostgreSQL (ORDBMS) behind JPA (Hibernate, EclipseLink, ...) instead of an RDBMS (MySQL, ...)?
As you know, JPA uses ORM and JPQL (Java Persistence Query Language).
Regards
Object-relational data is structured data that is defined through user-defined types in the database.
OR data types include:
Structs - structured types
Arrays - array types
These types are defined differently in each database; in Oracle they are OBJECT types, VARRAY types, NESTED TABLE types, and REF types.
JDBC standardizes access to OR data types using the Struct, Array and Ref interfaces.
With OR data-types you can have more complex database schemas, such as a TABLE of Employee_Type that has a Varray of Phone_Types and a Ref to its manager.
JPA does not have any direct support for mapping OR data-types, but some providers do.
EclipseLink has support for mapping OR data-types, including Structs, Refs, and Arrays. Custom mappings and annotations are used to map these, but the runtime JPA API is the same.
I would not normally recommend using OR data-types, as they are less standard than traditional relational tables and do not give much benefit. Some database-defined OR data-types, such as spatial data-types, do offer advantages, as they have integrated database support.
See,
http://en.wikibooks.org/wiki/Java_Persistence/Advanced_Topics#Structured_Object-Relational_Data_Types
I would say no. JPA is targeted at RDBMSs, and doesn't use the additional capabilities offered by ORDBMSs.
Now, PostgreSQL is also a very good RDBMS (you're not forced to use its object-oriented features, and my guess would be that most of its users don't), and you may use it with JPA without problem.
JPA is the translator between "thinking in objects" (Java) and "thinking in relations" (SQL). Therefore a JPA implementation will always speak to the DB in terms of relations. "Object Relational" stuff is ignored here.
Ignoring JPA and talking directly to the DB in "ORDBMS" speak won't buy you performance benefits in the most common cases, because ORDBMSs are still RDBMSs with some glue logic to look a little bit object-stylish. The data is stored in relations, and all access paths are the same as pure relational access paths.
If you really want to see performance benefits by switching not only the database product but the database technology (or philosophy) you should look at real Object Databases or even NoSQL.