Morphia and object graphs - mongodb

I've yet to use Morphia, but I'm considering it for a current project.
Suppose I have a POJO with a number of @Reference annotations and I ask Morphia to fetch the object graph from the database. If I then make another DAO or Datastore call and ask Morphia to fetch some object that was already instantiated in the first graph, would Morphia return a reference to the already-instantiated object, or would it create a new instance?
If Morphia returns a new instance of the object each time, does anyone have a recommendation for how best to approach creating a Morphia-backed repository that won't duplicate already-instantiated objects?

As far as I can see, Morphia will re-read every reference.
This is one of the reasons why I created Morphium. I integrated a caching layer there, so if you read a reference, it won't be read again (at least if you look it up by ID...).

We use Morphia in production, and there are two ways to make sure you don't load the references; this is something we came across too.
One is to use the lazy-loading option when you define the @Reference field in your main class. This of course means that the behavior is 'global' to that object.
The better way is to not define an @Reference at all and instead manage the references yourself. Let me know if you need a code sample.
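For anyone who wants that sample: here is a minimal sketch of the idea, not the answerer's actual code. The entity and field names are invented, and the imports assume an org.mongodb.morphia-style Morphia version (adjust to yours). Instead of @Reference, the entity stores the referenced document's ObjectId and resolves it through the Datastore only when needed.

import org.bson.types.ObjectId;
import org.mongodb.morphia.Datastore;
import org.mongodb.morphia.annotations.Entity;
import org.mongodb.morphia.annotations.Id;

@Entity("customers")
class Customer {
    @Id
    ObjectId id;
    String name;
}

@Entity("orders")
public class Order {
    @Id
    private ObjectId id;

    // Plain id instead of "@Reference Customer customer"
    private ObjectId customerId;

    // Resolve the reference on demand instead of letting Morphia hydrate it eagerly
    public Customer loadCustomer(Datastore ds) {
        return ds.get(Customer.class, customerId);
    }
}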

I've stopped using @Reference too, and instead declare something like:
ObjectId itemId
rather than having a field item. This has two benefits: (1) it lets me define a getter through a helper getObject(...) method, which I have written with object caching, and (2) it stores a plain ObjectId in the Mongo document rather than a full DBRef, which includes the collection name and is therefore roughly twice the size.
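A rough sketch of what such a caching getObject(...) helper could look like (this is a guess at the shape, not the answerer's actual code; the cache key combines class and id so entities from different collections don't collide):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.bson.types.ObjectId;
import org.mongodb.morphia.Datastore;

public class EntityCache {
    private final Datastore datastore;
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    public EntityCache(Datastore datastore) {
        this.datastore = datastore;
    }

    // Returns the already-instantiated object for this id if we have seen it before,
    // otherwise loads it once via Morphia and remembers it.
    @SuppressWarnings("unchecked")
    public <T> T getObject(Class<T> clazz, ObjectId id) {
        String key = clazz.getName() + ":" + id;
        return (T) cache.computeIfAbsent(key, k -> datastore.get(clazz, id));
    }
}

With the itemId field above, the getter then becomes something like getObject(Item.class, itemId) for a hypothetical Item entity.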

Related

Kotlin immutable entities changing unexpectedly when using it with JPA

In our project we are using Kotlin with JPA. All of our entities are immutable, so it is not possible to set their fields directly. You have to create a new instance using the copy method, and if you want the changes to be reflected in the database, you must persist the newly created entity with an explicit function call.
In the beginning this approach looked perfect to us. However, we are now running into problems: some of our instances change unexpectedly in memory.
val instance1 = repository.findById(entityId)
repository.save(instance1.copy(deletedAt = Instant.now()))
..
..
assertNull(instance1.deletedAt)
In the code snippet above, instance1 is retrieved from the database and its deletedAt field is set via the copy method; the new instance created by copy is then passed to the repository's save method. We don't set any field of instance1; we create a new instance for these changes. However, the value on the assert line is unexpectedly non-null.
It seems there is a conflict between the JPA persistence context (first-level cache) and Kotlin's immutability and copy semantics.
Is anyone else facing this problem? Any suggestions or best practices for using JPA with immutable Kotlin entities?
I suspect the problem is that you're ignoring the return value from save().  Its docs say:
Saves a given entity. Use the returned instance for further operations as the save operation might have changed the entity instance completely.
But you're not doing that; you're instead continuing to use the original instance which (as that says) may have changed.
Instead, store the return value from save(), and use that thereafter.  (Either by making instance1 a var, or creating a new val and not referring to instance1 afterward.)
(This isn't a Kotlin-specific problem; it's exactly the same in Java. JPA, Spring, etc. work their magic by futzing with the bytecode, so they can do things your code can't, such as changing immutable values. Most of the time you can ignore it, but this case makes it obvious.)
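A minimal sketch of the fix, shown in Java since the same applies there. MyEntity, withDeletedAt, and the repository are hypothetical stand-ins for the question's Kotlin entity, copy call, and repository:

// Hypothetical repository and entity, for illustration only.
MyEntity instance1 = repository.findById(entityId);

// Keep the instance returned by save(); it may not be the same object you passed in.
MyEntity saved = repository.save(instance1.withDeletedAt(Instant.now()));

// Work with 'saved' from here on instead of 'instance1'.
assertNotNull(saved.getDeletedAt());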
Immutable types are not compatible with how JPA works.
JPA is built around the concept of a unit of work: objects retrieved from the database live in a persistence context (first-level cache) and are discarded once the EntityManager is closed (in a web application, at the end of the HTTP request).
When you call the copy method on an entity you have just retrieved from the database, the copied object is considered detached from the current session, meaning JPA cannot track its changes, and the underlying implementation (Hibernate / EclipseLink) has a hard time figuring out which SQL statement to fire (insert, update, or delete?).
Things get even more complex when you have a large object graph with OneToMany associations and cascading options.
So my recommendation, unfortunately, is to avoid immutable types when using JPA.

Mongo db how to save an object

The code below gives an error saying that the School class must implement the DBObject interface. The problem is that this interface has tons of methods. I have nearly 100 classes and I don't want to write millions of methods. Is there an easy way to save an object?
DBCollection table = db.getCollection("school");
School document = new School();
table.insert(document);
Instead of implementing DBObject or extending one of the existing implementations like BasicDBObject, you could have all objects which can be saved in the database have a method public DBObject toDBObject() which creates and returns a DBObject representation of the object. The BasicDBObject is a Map<String, Object> which handles the object data as key/value pairs, so it is a good candidate for this.
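A minimal sketch of that toDBObject() approach for the School class from the question (the fields are made up, since the question doesn't show them):

import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;

public class School {
    private String name;
    private int studentCount;

    // Build a key/value representation the driver can store directly.
    public DBObject toDBObject() {
        return new BasicDBObject("name", name)
                .append("studentCount", studentCount);
    }
}

The insert from the question then becomes table.insert(document.toDBObject()).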
For a more generic solution, you could use reflection to create a method which can convert any Java object into a DBObject. To have more control over this, you could make up some annotations, add them to your classes and have your conversion method check them.
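A rough reflection-based converter might look like the sketch below. It deliberately ignores nested objects, collections, transient fields, and annotations; a real version would have to deal with those.

import java.lang.reflect.Field;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;

public final class DBObjects {
    // Converts any flat POJO into a DBObject by copying its declared fields.
    public static DBObject toDBObject(Object pojo) throws IllegalAccessException {
        BasicDBObject result = new BasicDBObject();
        for (Field field : pojo.getClass().getDeclaredFields()) {
            field.setAccessible(true);
            result.append(field.getName(), field.get(pojo));
        }
        return result;
    }
}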
Now you have created your own object mapping framework for MongoDB. But why reinvent the wheel when others have already done it? So before you do this, check out if the existing mapping frameworks like morphia fulfill your use-case - they likely do and will save you hours of programming and weeks of debugging.
[opinion]
I usually despise object-relational mappers in the context of relational databases because of the impedance-mismatch problem, but for heterogeneous databases like MongoDB they make a lot more sense, because you can store objects that share a base class but have different class-specific fields in the same collection without any ugly workarounds.
[/opinion]

How to use ensureIndex with AdvancedDatastore by specifying a collection name?

In Morphia you can use @Index annotations to create indexes for @Entity classes automatically. I am trying to create these indexes for a specific collection name but couldn't find a way to do it. Using AdvancedDatastore you can save an entity into any collection you want, but is it possible to ensure indexes on a specified collection rather than the entity's default collection?
advancedDatastore.ensureIndexes(Entity.class); // This will create indexes on the mapped Entities.
I am looking for a way to do the following, but I didn't see any method similar to below one, is there a workaround to achieve this:
advancedDatastore.ensureIndexes("exampleCollection", Entity.class); // create indexes of Entity.class for exampleCollection
Yes, you can extend the AdvancedDatastore interface and the DatastoreImpl concrete class to add ensureIndex* methods with the extra argument. We do this in our organization and it works.
There is also a pending pull request to add this feature directly to Morphia here: https://github.com/mongodb/morphia/pull/541. If you are willing to build your own Morphia jar, you can use the patch listed there.
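Until something like that pull request is merged, another stopgap (not the subclassing approach above, just an illustration of the idea) is to drop down to the driver for the index call itself. This sketch uses a made-up field name and bypasses Morphia's @Index metadata entirely:

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;

// Create an ascending index on "someField" in "exampleCollection",
// regardless of which collection Entity.class is mapped to by default.
DBCollection collection = advancedDatastore.getDB().getCollection("exampleCollection");
collection.createIndex(new BasicDBObject("someField", 1));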

How to update object in Mongo with an immutable Salat case class

I'm working on a project with Scala, Salat, Casbah, Mongo, Play2, BackboneJS... but it's quite a lot of new things to learn at the same time. I'm OK with Scala, but I find my code crappy and I don't really know how to improve it.
Basically my usecase is:
A MongoDB object is sent to the browser's JS code by Play2
The JS code updates the object data (through a Backbone model)
The JS sends the updated JSON back to the server (sent by Backbone's save method and received by Play with a JSON body parser)
The JSON received by Play should update the object in MongoDB
Some fields should not be updatable for security reasons (object id, creationDate...)
My problem is the last part.
I'm using case classes with Salat as a representation of the objects stored in MongoDB.
I don't really know how to handle the JSON I receive from the JS code.
Should I bind the JSON into the Salat case class and then ask Mongo to overwrite the previous object data with the full new case class object?
If so, is there a way with Play2 or Salat to automatically recreate the case class from the received JSON?
Should I handle my JSON fields individually with $set for the fields I want to update?
Should I make the elements of my case class mutable? That's what we do in Java with Hibernate, for example: get the object from the DB, change its state, and save it. But it doesn't seem to be the appropriate way to do things in Scala...
If someone can give me some advices for my usecase it would be nice because I really don't know what to do :(
Edit: I asked a related question here: Should I represent database data with immutable or mutable data structures?
Salat handles JSON using lift-json - see https://github.com/novus/salat/wiki/SalatWithPlay2.
Play itself uses Jerkson, which is another way to decode your model objects - see http://blog.xebia.com/2012/07/22/play-body-parsing-with-jerkson/ for an example.
Feel free to make a small sample Github project that demonstrates your issue and post to the Salat mailing list at https://groups.google.com/group/scala-salat for help.
There are really two problems in your question:
How do I use Play Salat?
How do I prevent updates to certain fields?
The answer to your first question lies in the Play Salat documentation. Your second question could be answered a few ways.
a. When the update is pushed to the server from Backbone, you could grab the object id and find the object in the database. At that point you have both copies of the object and can fire a business rule to make sure the sender didn't attempt to change those fields.
or
b. You could put some of your fields in another document or an embedded document. The client would have access to them for rendering purposes, but your API wouldn't allow them to be pushed back to Mongo.
or
c. You could write a custom update query that ignores the fields you don't want changed.
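To illustrate option (c): the update only $sets a whitelist of fields, so the protected ones can never be overwritten. A sketch of the shape of that query, written in plain MongoDB Java-driver syntax rather than Casbah/Salat and with made-up field names:

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import org.bson.types.ObjectId;

// Only "name" and "quantity" can be changed; _id, creationDate, etc. are never touched.
void updateAllowedFields(DBCollection items, ObjectId id, String name, int quantity) {
    BasicDBObject allowed = new BasicDBObject("name", name).append("quantity", quantity);
    items.update(new BasicDBObject("_id", id), new BasicDBObject("$set", allowed));
}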
Actually the answer is pretty simple: I didn't know there was a built-in copy method on case classes that lets you copy an immutable case class while changing some of its data.
I don't have nested case class structures but the Tony Morris suggestion of using Lenses seems nice too.

How to do filtering for many entities of many DataServices in one common class?

Every single table in my database has a DataSetId, and I absolutely want to be sure that the data is always partitioned correctly.
Currently I'm using the QueryInterceptor attribute, but it's messy, overly repetitive, and error-prone. Some new dev could add a new table and forget to filter by DataSetId, or just rename a table. So I've put this in a base class, but the IQueryable properties of my repository are never called.
I have a "CoreRepository" class that inherits from ObjectContext, and each of my IQueryable collections uses "CoreObjectSet". CoreObjectSet extends ObjectSet by always adding an expression to filter by DataSetId. When used directly this works fine. But when used for a DataService the Get accessor for the collections on the Repository are never called by the DataService. It appears to be cheating and not using them at all and accessing the data directly.
Is there a way to get the DataService to go through the repository class correctly (and still get the efficiency of passing the query through as SQL)?
If this is the behaviour, why even make DataService<T> generic if it's not going to use the class? For the ADO team to just ignore it and use the EDMX directly seems like a hack.
Thanks
Aaron
Looks like the only way around it is to use a T4 template to generate the DataService. I much prefer a base class or some kind of reusable handler but ADO has given me no choice here.