OCM or Nodes in JCR? - content-management-system

We are developing a CMS based on JCR/Sling/JSP/Felix/etc.
What I have found so far is that using Nodes directly is very straightforward and flexible. My concern is that over time it could become too hard to maintain and manage.
So, is it wise to invest in using an OCM? Would it just be an extra layer of complexity? What is the real benefit of OCM, if there is any? Or is it better for us to stick to Nodes instead?
And lastly, is Jackrabbit OCM the best option for us if we are to go down that path?
Thank you.

In my personal experience, whether OCM is a useful tool for your project depends heavily on your situation.
The real problem with OCM (again, in my personal experience) arises when the definition of a class behind data that is already persisted in the repository changes. For example: you find it necessary to change some members and methods of a class to keep up with functionality changes, so the class definition of the persisted data objects in the repository no longer matches the definition of the actual class. When a data object is persisted to the JCR repository, it is usually saved in a serialization-like format that Java understands, which means that once the class definition changes, the saved data in the repository can no longer be correctly interpreted by Java. This tends to lead to complex deployments where you need to convert the old persisted data objects to the new definition and save them again in the repository, just to keep "old" but still required data usable.
What does work (in my opinion) is using a framework that lets you map nodes and node properties to Java objects directly (for example via annotations) and the other way around (persist a Java object to the repository as a JCR node whose properties are the object's member fields). This way you stick to the data representation of JCR (nodes with properties) and can still map them onto the members of a Java class.
I've used a framework like this before in a CMS called AEM (Adobe Experience Manager), although I must mention this was in an OSGi context (the principle still stands). The framework allowed maximum flexibility: it persists a Java object as a JCR node and the other way around. Because it mapped directly onto the JCR structure, code changes in the class and its members meant just changing annotations, and old persisted data was still usable without much effort.
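The framework isn't named above, but Apache Sling Models (commonly used in AEM/Sling stacks) works along these lines. A minimal sketch of what such annotation-based mapping can look like (the class and property names are made up):

import javax.inject.Inject;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.models.annotations.Model;
import org.apache.sling.models.annotations.Optional;

// Maps a JCR node (exposed as a Sling Resource) directly onto a Java class:
// each annotated field is read from the node property with the same name.
@Model(adaptables = Resource.class)
public class ArticleModel {

    @Inject
    private String title;        // read from the "title" property

    @Inject
    @Optional
    private String summary;      // a missing property simply stays null

    public String getTitle() { return title; }

    public String getSummary() { return summary; }
}

// Usage: ArticleModel article = resource.adaptTo(ArticleModel.class);

Because the mapping stays at the level of node properties, adding or renaming a member mostly means adjusting annotations rather than converting previously persisted data.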

Related

DDD - How to store aggregates in NoSql databases

A current project needs us to persist domain objects in a NoSQL database such as mongoDB.
In many examples (incl. Eric Evans, Vaughn Vernon) the domain objects are serialized and persisted to the mongoDB directly.
We would like to avoid mixing the domain layer with persistence-related information by not having any annotations in our domain objects.
Also we are concerned about corrupting the persisted data by changing the domain object in the future.
We came to the conclusion that we need to have some kind of DTOs translating between the domain objects and the persisted data.
Did anyone of you come across a good solution for such a case?
Yes. Your domain models should be ignorant of persistence, so you need a DTO or what I call data models (distinct from the domain models and view models). Your data models are mapped to and from the domain models when persisting to the database. This mapping is common in insert and update operations. For read-only operations (reporting, etc.) you can bypass the mapping between data models and domain models, which prevents loading the whole object graph of your domain models. This is widely applied in CQRS-style architectures, where reads and writes are separated.
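As a rough illustration of that separation (all names are invented), the data model, the domain model, and the mapping used on insert/update could look like this:

// Persistence-side data model: shaped like the stored document,
// free to carry driver/mapping annotations without touching the domain.
class OrderDocument {
    String id;
    String customerId;
    long totalCents;
}

// Domain model: knows nothing about persistence.
class Order {
    private final String id;
    private final String customerId;
    private final long totalCents;

    Order(String id, String customerId, long totalCents) {
        this.id = id;
        this.customerId = customerId;
        this.totalCents = totalCents;
    }

    String id() { return id; }
    String customerId() { return customerId; }
    long totalCents() { return totalCents; }
}

// Mapping applied on insert/update; read-only queries can bypass the domain model entirely.
class OrderMapper {
    static OrderDocument toDocument(Order order) {
        OrderDocument doc = new OrderDocument();
        doc.id = order.id();
        doc.customerId = order.customerId();
        doc.totalCents = order.totalCents();
        return doc;
    }

    static Order toDomain(OrderDocument doc) {
        return new Order(doc.id, doc.customerId, doc.totalCents);
    }
}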
Like you, I want business objects to have no dependency on any kind of specific repository. I solved it like this: have your business object define its state objects and repository functions as interfaces. Your repository implementation can then create an actual state object and inject it into your business object through the constructor.
There are a lot of advantages to this approach (such as having business objects for specific purposes), but you easily achieve complete (two-way) independence of your repository this way. Martin Fowler also hinted at this approach elsewhere.
I actually use the same pattern in my Angular / TypeScript projects. My read-api calls return DTO objects that get state objects injected as well and their properties map directly onto state objects.
These DTOs, which arrive as untyped JavaScript objects when they come from the API into the client (Angular) project, are in turn injected as state objects into TypeScript objects, again through the constructor, and mapped by getters and setters. It works very cleanly and is easy to maintain. I have an example on my GitHub (niwra) account (Software-Management repositories), but can expand here if anyone is interested.
MongoDB allows for very clean and unit-testable repository implementations that return strongly typed aggregates. The only thing I haven't solved cleanly yet is telling MongoDB about the state objects for child collections. Currently that is still pretty 'static', but I'm sure I'll find a nice solution.
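A minimal sketch of the state-object approach described above (names are invented; the concrete state class lives on the repository side and could carry whatever persistence annotations it needs):

// The business object defines the state it needs purely as an interface;
// it has no dependency on any concrete repository or storage format.
interface CustomerState {
    String getName();
    void setName(String name);
}

class Customer {
    private final CustomerState state;

    Customer(CustomerState state) {          // state is injected by the repository
        this.state = state;
    }

    void rename(String newName) {
        if (newName == null || newName.isEmpty()) {
            throw new IllegalArgumentException("name must not be empty");
        }
        state.setName(newName);              // business rules live here, storage elsewhere
    }
}

// The repository implementation owns the concrete state class and injects it.
class MongoCustomerState implements CustomerState {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

class MongoCustomerRepository {
    Customer newCustomer() {
        return new Customer(new MongoCustomerState());
    }
}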
You can store your domain objects as-is in document databases. Vaughn Vernon has posted an article about this, The Ideal Domain-Driven Design Aggregate Store?, featuring PostgreSQL's (then new) JSONB document-like storage.
Of course, you run the risk of having your aggregates polluted by BsonX attributes, which you probably do not want. You can avoid this by using convention-based configuration, but you will still need to think about serialisation, and this can have an effect on the level of encapsulation.
Another pattern here is to use a separate state object, which is then held as a property inside the aggregate root (or regular entity). I would not call it a "DTO", since this is clearly your aggregate state. You are not transferring anything. Methods inside your aggregate can mutate the state or, even better, the state would be an immutable value object and new state is produced when you need to change the state.
In such a case persistence would only care about the state object. You still might be unhappy about having MongoDB attributes on the state object's properties, and this is reasonable. Then you would need an identical structure inside the persistence mechanism, so you can map properties one-to-one.
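A small sketch of that second pattern, assuming an invented Account aggregate whose state is an immutable value object that only the persistence layer ever sees:

// The aggregate holds its state as a separate, immutable value object;
// persistence only ever deals with AccountState, never with the aggregate itself.
final class AccountState {
    final String id;
    final long balanceCents;

    AccountState(String id, long balanceCents) {
        this.id = id;
        this.balanceCents = balanceCents;
    }
}

class Account {
    private AccountState state;

    Account(AccountState state) {
        this.state = state;
    }

    void deposit(long amountCents) {
        if (amountCents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        // behaviour produces a new state value instead of mutating the old one
        state = new AccountState(state.id, state.balanceCents + amountCents);
    }

    AccountState state() {        // handed to the persistence layer on save
        return state;
    }
}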
A current project needs us to persist domain objects in a NoSQL database such as mongoDB. In many examples (incl. Eric Evans, Vaughn Vernon) the domain objects are serialized and persisted to the mongoDB directly.
I can confirm that MongoDB is a good choice for persisting DDD models. I use MongoDB as an event store in my current project. You can use MongoDB even if you are not using Event Sourcing, for example through an ODM (Object Document Mapper): you have one document per Aggregate instance (this applies to any document-based database, not only MongoDB) and you store nested entities and value objects as nested documents.
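As a rough sketch of the one-document-per-aggregate idea with the plain MongoDB Java driver (database, collection, and field names are invented), where a value object such as an address becomes a nested document:

import java.util.List;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.ReplaceOptions;
import org.bson.Document;

public class OrderStore {

    private final MongoCollection<Document> orders;

    public OrderStore(MongoClient client) {
        this.orders = client.getDatabase("shop").getCollection("orders");
    }

    // One document per aggregate instance; the shipping address (a value object)
    // and the order lines (nested entities) are stored as nested documents.
    public void save(String orderId, String street, String city, List<Document> lines) {
        Document doc = new Document("_id", orderId)
                .append("shippingAddress", new Document("street", street).append("city", city))
                .append("lines", lines);
        orders.replaceOne(Filters.eq("_id", orderId), doc, new ReplaceOptions().upsert(true));
    }

    public Document load(String orderId) {
        return orders.find(Filters.eq("_id", orderId)).first();
    }
}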
We would like to avoid mixing the domain layer with persistence related inforamtion by not having any annotations in our domain objects.
You can use xml mapping.
Also we are concerned about corrupting the persisted data by changing the domain object in the future.
For this you can use custom migration scripts. If you use Event sourcing then there are event versioning strategies.
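One common event-versioning strategy is to "upcast" old event versions to the current shape as they are read back, so events already in the store never have to be rewritten. A rough, framework-free sketch (the event and its fields are invented):

// Old event shape, still present in the event store.
class CustomerRenamedV1 {
    String name;                 // single field holding "First Last"
}

// Current event shape.
class CustomerRenamedV2 {
    String firstName;
    String lastName;
}

// Upcaster applied while reading the stream, so old data stays untouched on disk.
class CustomerRenamedUpcaster {
    CustomerRenamedV2 upcast(CustomerRenamedV1 old) {
        CustomerRenamedV2 event = new CustomerRenamedV2();
        String[] parts = old.name.split(" ", 2);
        event.firstName = parts[0];
        event.lastName = parts.length > 1 ? parts[1] : "";
        return event;
    }
}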
We came to the conclusion that we need to have some kind of DTOs translating between the domain objects and the persisted data.
This is a bad conclusion.
If you use CQRS you won't need DTOs, because the read models are enough.

Managing changes in class structure to be consistent with mongodb collection

We are using MongoDB with C#. We are trying to figure out a way to keep our collections consistent seamlessly. Right now, if a developer makes any change to the class structure (adding a field, changing a data type, or changing a property within a nested class), he or she has to change the Mongo collection manually.
It's a pain as our project is growing and the number of developers working on it keeps increasing. I was wondering whether someone has already figured out a way to manage this issue.
Research
I found a similar question; however, I couldn't find a solution there.
I also found a way to list all the properties (Finding the properties); however, data types and nested documents remain an issue.
If you want to migrate gradually as records are accessed, you need to follow a few simple rules (a sketch follows the list):
1) If you add a field it had better be nullable or have a default value specified.
2) Never rename fields, never change field types
- Instead always add new fields, add migration code, remove the old fields only when all documents have been migrated over.
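A rough sketch of migrating gradually on read, following those rules. The question is about C#, but the idea is the same in any driver; this uses the MongoDB Java driver, and the field names are invented: "fullName" is the new field with a default, while the old "name" field is left in place until every document has been migrated and can then be dropped in a separate cleanup.

import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import org.bson.Document;

class CustomerMigration {

    // Migrate a single document the first time it is read after the class change.
    static Document migrateOnRead(MongoCollection<Document> customers, Object id) {
        Document doc = customers.find(Filters.eq("_id", id)).first();
        if (doc != null && !doc.containsKey("fullName")) {
            String legacy = doc.getString("name");
            String fullName = legacy != null ? legacy : "";      // default value for the new field
            customers.updateOne(Filters.eq("_id", id), Updates.set("fullName", fullName));
            doc.put("fullName", fullName);
        }
        return doc;
    }
}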
For prototyping with MongoDB and C# I built a dynamic wrapper ... that lets you specify your objects using only interfaces (no classes needed) and lets you dynamically add new interfaces to an existing object. It is not ready for production use, but for prototyping it saves a lot of effort and makes migration really easy.

Preventing Netbeans JAXB generation trashing classes

I'm developing a SOAP service using JAX-WS and JAXB under Netbeans 6.8, and getting a little frustrated with Netbeans trashing my work every time the XSD schema my JAXB bindings are based upon changes.
To elaborate, the IDE automatically generates classes bound to the schema, which can then be (un)marshalled from/to XML using JAXB. To these classes I've added extra methods to (for example) convert to and from separate classes designed to be persisted to the database with JPA. The problem is that whenever the schema changes and I rebuild, these classes are regenerated and all my custom methods are deleted. I can manually replace them by copy-pasting from a backup file, but that is rather time-consuming and tedious. As I'm using an iterative design approach, the schema changes rather frequently, and I waste an awful lot of time whenever it does simply reinstating my previous code.
While the IDE automatically regenerating the JAXB-bound classes is entirely reasonable and I don't mean to imply otherwise, I was wondering if anyone had any bright ideas as to how to prevent my extra work having to be manually reinstated every time my schema changes?
Making modifications to the XJC-generated source isn't really a good idea, for the reasons you've discovered. You should either use a binding customization or an XJC plugin to generate the additional code you need, or else move your additional code out of the XJC-generated classes and into separate source files.
If your additional code is there to convert between the JAXB and JPA class models, then it can probably stand on its own as a distinct translation layer. It's not very OO that way, but it'll get around your problem.
Alternatively, there's an XJC plugin that is supposed to let you preserve code that's been manually added to the generated source, but it's poorly documented (and I haven't used it myself). You might have to dig around on http://jaxb.dev.java.net/ to find out how to use it.
Instead of modifying the generated classes to add methods, extend the generated classes and put your additional methods in classes derived from the generated ones.
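A minimal sketch of that suggestion (all class names are invented; CustomerType stands in for a class XJC would generate, and CustomerEntity for the JPA side):

// Stand-in for the XJC-generated class -- never edited by hand.
class CustomerType {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// Plain JPA entity on the persistence side.
class CustomerEntity {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// Hand-written class in its own source file; it survives schema regeneration
// because XJC only rewrites CustomerType, never this subclass.
class CustomerTypeExt extends CustomerType {

    CustomerEntity toJpaEntity() {
        CustomerEntity entity = new CustomerEntity();
        entity.setName(getName());
        return entity;
    }

    static CustomerTypeExt fromJpaEntity(CustomerEntity entity) {
        CustomerTypeExt dto = new CustomerTypeExt();
        dto.setName(entity.getName());
        return dto;
    }
}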

Sending persisted JDO instances over GWT-RPC

I've just started learning Google Web Toolkit and finished writing the Stock Watcher tutorial app.
Is my thinking correct that if one wants to persist a business object (like a Stock) using JDO and send it back and forth to/from the client over RPC then one has to create two separate classes for that object: One with the JDO annotations for persisting it on the server and another which is serialisable and used over RPC?
I notice the Stock Watcher has separate classes and I can theorise why:
Otherwise the gwt compiler would try to generate javascript for everything the persisted class referenced, like JDO and com.google.blah.users.User, etc.
Also there may be logic on the server-side class which doesn't apply to the client, and vice-versa.
I just want to make sure I'm understanding this correctly. I don't want to have to create two versions of all my business object classes which I want to use over RPC if I don't have to.
The short answer is: you don't need to create duplicate classes.
I recommend that you take a look at the following Google Groups discussion on the gwt-contributors list:
http://groups.google.com/group/google-web-toolkit-contributors/browse_thread/thread/3c768d8d33bfb1dc/5a38aa812c0ac52b
Here is an interesting excerpt:
If this is all you're interested in, I described a way to make GAE and GWT-RPC work together "out of the box". Just declare your entities as:

@PersistenceCapable(identityType = IdentityType.APPLICATION, detachable = "false")
public class MyPojo implements Serializable { }

and everything will work, but you'll have to manually deal with re-attachment when sending objects from the client back to the server.
You can use this option, and you will not need a mirror (DTO) class.
You can also try Gilead (formerly hibernate4gwt), which takes care of some of the details involved in serializing enhanced objects.
Your assessment is correct. JDO replaces instances of Collections with its own implementations, in order to detect when the object graph changes, I suppose. These implementations are not known to the GWT compiler, so it will not be able to serialize them. This often happens to classes that are composed of otherwise GWT-compliant types but carry JDO annotations, especially when some of the object's properties are Collections.
For a detailed explanation and a workaround, check out this pretty influential essay on the topic: http://timepedia.blogspot.com/2009/04/google-appengine-and-gwt-now-marriage.html
I finally found a solution. Don't change your object at all; for the listing, do it this way:
List<YourCustomObject> secureList = (List<YourCustomObject>) pm.newQuery(query).execute();
return new ArrayList<YourCustomObject>(secureList);
The actual problem is not serializing the object itself; the problem is serializing the Collection class, which is implemented by Google and cannot be serialized over GWT-RPC. Copying the results into a plain ArrayList avoids this.
You do not have to create two versions of the domain model.
Here are two tips:
1) Use a String-encoded key, not the App Engine Key class.
2) pojo = pm.detachCopy(pojo) will remove all the JDO enhancements.
You don't have to create separate instances at all, in fact you're better off not doing it. Your JDO objects should be plain POJOs anyway, and should never contain business logic. That's for your business layer, not your persistent objects themselves.
All you need to do is include the source for the annotations you are using and GWT should compile your class just fine. Also, you want to avoid using libraries that GWT can't compile (like things that use reflection, etc.), but in all the projects I've done this has never been a problem.
I think that a better format for sending objects through GWT is JSON. In this case the server would send a JSON string, which would then have to be parsed in the client. The advantage is that the final JavaScript rendered in the browser is smaller, causing the page to load faster.
Secondly, to send objects over GWT-RPC, the objects need to be serializable, which may not be the case for all objects.
Thirdly, GWT has built-in functions to handle JSON, so there are no issues on the client end.
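To illustrate that built-in JSON support, a minimal client-side sketch using GWT's com.google.gwt.json.client classes (the "symbol" field is just an example in the spirit of the Stock Watcher app):

import com.google.gwt.json.client.JSONObject;
import com.google.gwt.json.client.JSONParser;
import com.google.gwt.json.client.JSONValue;

// Client side: parse the JSON string returned by the server and read fields
// from it, instead of sending a serialized Java object over GWT-RPC.
public final class StockJson {

    public static String symbolOf(String json) {
        JSONValue parsed = JSONParser.parseStrict(json);  // GWT's built-in JSON parser
        JSONObject stock = parsed.isObject();             // null if the payload is not a JSON object
        return stock.get("symbol").isString().stringValue();
    }
}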

Entity Framework and Encapsulation

I would like to experimentally apply an aspect of encapsulation that I read about once, where an entity object includes the domains for its attributes, e.g. for its CostCentre property it contains the list of valid cost centres. This way, when I open an edit form for an Extension, I only need to pass the form one Extension object, whereas I would normally also access a CostCentre object when initialising the form.
This also applies where I have a list of Extensions bound to a grid (Telerik RadGrid) and I handle an edit command on the grid. I want to create an edit form and pass it an Extension object, whereas now I pass the edit form an ExtensionID and create my object in the form.
What I'm actually asking for here are pointers to guidance on doing things this way, or on the 'proper' way of achieving something similar to what I have described.
It would depend on your data source. If you are retrieving the list of Cost Centers from a database, that would be one approach. If it's a short list of predetermined values (like Yes/No/Maybe So) then property attributes might do the trick. If it needs to be more configurable per-environment, then IoC or the Provider pattern would be the best choice.
I think your problem is similar to a custom ad-hoc search page we did on a previous project. We decorated our entity classes and properties with attributes that contained some predetermined 'pointers' to the lookup value methods, and their relationships. Then we created a single custom UI control (like your edit page described in your post) which used these attributes to generate the drop down and auto-completion text box lists by dynamically generating a LINQ expression, then executing it at run-time based on whatever the user was doing.
This was accomplished with basically three moving parts: A) the attributes on the data access objects, B) the 'attribute facade' methods at the middle tier compiling and generating dynamic LINQ expressions, and C) the custom UI control that called our middle-tier service methods.
Sometimes plans like these backfire, but in our case it worked great. Decorating our objects with attributes and then creating a single path of logic gave us just enough power to do what we needed while minimizing the amount of code required, and it completely eliminated boilerplate. However, this approach was not very configurable: by compiling these attributes into the code, we tightly coupled our application to the data source. On this particular project that wasn't a big deal, because it was a client's internal system and it fit the project timeline. On a "real product", however, implementing the logic with the Provider pattern or using something like the Castle Project's IoC container would have given us the same power with a great deal more configurability. The downside is that there is more to manage, and more that can go wrong with deployments, etc.