Managing changes in class structure to be consistent with a MongoDB collection

We are using MongoDB with C#. We are trying to figure out a way to keep our collection consistent seamlessly. Right now, if a developer makes any change to the class structure (adding a field, changing a data type, or changing a property within a nested class), he/she has to change the Mongo collection manually.
It's a pain as our project is growing and the number of developers working on it keeps increasing. I was wondering whether someone has already figured out a way to manage this issue.
Research
I found a similar question; however, I couldn't find a solution in it.
I found a way to enumerate all properties (Finding the properties); however, data types and nested documents remain an issue.

If you want to migrate gradually as records are accessed, you need to follow a few simple rules (see the sketch after this list):
1) If you add a field, it had better be nullable or have a default value specified.
2) Never rename fields and never change field types.
- Instead, always add new fields and add migration code; remove the old fields only when all documents have been migrated over.
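A minimal sketch of that lazy-migration pattern, written against the MongoDB Java driver for illustration (the question is about C#, but the same rules apply with the C# driver; the field names here are hypothetical): an old "name" field is split into "firstName"/"lastName" the first time a document is read, and the old field is removed once the document is migrated.

    import com.mongodb.client.MongoCollection;
    import org.bson.Document;
    import org.bson.conversions.Bson;
    import static com.mongodb.client.model.Filters.eq;
    import static com.mongodb.client.model.Updates.*;

    public class LazyUserMigration {
        // Loads a user and migrates it on the fly if it still has the old shape.
        public static Document loadUser(MongoCollection<Document> users, Object id) {
            Document doc = users.find(eq("_id", id)).first();
            if (doc == null) {
                return null;
            }
            // Old documents have "name"; migrated ones have "firstName"/"lastName".
            if (doc.containsKey("name") && !doc.containsKey("firstName")) {
                String[] parts = doc.getString("name").split(" ", 2);
                Bson migration = combine(
                        set("firstName", parts[0]),
                        set("lastName", parts.length > 1 ? parts[1] : ""),
                        unset("name")); // drop the old field once migrated
                users.updateOne(eq("_id", id), migration);
                doc = users.find(eq("_id", id)).first(); // re-read the migrated form
            }
            return doc;
        }
    }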
For prototyping with MongoDB and C# I built a dynamic wrapper ... that lets you specify your objects using only interfaces (no classes needed), and it lets you dynamically add new interfaces to an existing object. It's not ready for production use, but for prototyping it saves a lot of effort and makes migration really easy.

Related

mongodb schema design - add collection or extend existing one

There's a collection with 100,000 documents. Only 10 of them need an additional property that is not necessary for the other documents (e.g. a list of departments where only the top ones have the property 'Location').
As far as I understand, both approaches should work just fine, but which one is preferable given that we are using a NoSQL DB:
1) add one more collection whose documents have two properties: DepartmentId and Location;
2) add the property 'Location' to only the selected documents, so the others won't have it.
The problem you are facing is well known. You have the same problem with source code, for example.
When you update a piece of code, do you save it as User.js, User2.js, User3.js ... ?
Or do you use a versioning system like git and keep a single User.js?
Translating the git analogy to your issue, you should update the current data.
In MongoDB you actually have two choices for performing the update (a sketch of the first one follows below):
1) Update the model in your code, and update every entry in the database to match the new model.
2) Create a new model that will apply to new entries, and keep the old model to handle old-formatted data.
use-more-than-one-schema-per-collection-on-mongodb
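A sketch of the first choice with the MongoDB Java driver (the collection, field names, and filter are made up for illustration): a one-off pass stamps the new property onto the few documents that need it, while the rest simply never get the field, which is fine in MongoDB.

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import org.bson.Document;
    import static com.mongodb.client.model.Filters.eq;
    import static com.mongodb.client.model.Updates.set;

    public class AddLocationMigration {
        public static void main(String[] args) {
            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> departments =
                        client.getDatabase("company").getCollection("departments");
                // Set 'Location' only on the top departments; the other
                // documents never get the field at all.
                departments.updateMany(
                        eq("isTop", true),
                        set("Location", "Headquarters"));
            }
        }
    }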

OCM or Nodes in JCR?

We are developing a CMS based on JCR/Sling/JSP/Felix/etc.
What I have found so far is that using Nodes is very straightforward and flexible. But my concern is that over time it could become too hard to maintain and manage.
So, is it wise to invest in using an OCM? Would it be just an extra layer of complexity? What's the real benefit of OCM, if there is any? Or is it better for us to stick to Nodes instead?
And lastly, is Jackrabbit OCM the best option for us if we are to go down that path?
Thank you.
In my personal experience, whether OCM is a useful tool for your project or not depends heavily on your situation.
The real problem with OCM (in my personal experience) arises when the definition of a class used for existing persisted data (as objects) in the repository has changed. For example: you found it necessary to change some members and methods of a class to match functionality changes, so the class definition of the persisted data object in the repository no longer matches the definition of the actual class. When persisted data is saved to the JCR repository, it is usually saved in a format that Java understands in terms of serialization. This means that when something changes in the definition of the class used, the saved data in the repository can no longer be correctly interpreted by Java. This issue tends to lead to complex deployments where you need to convert old persisted data objects to the new definition and save them again in the repository, to make sure you can still use "old" but still-required persisted data.
What does work (in my opinion) is using a framework that allows you to map nodes and node properties to Java objects directly (for example by using annotations) and the other way around (persist a Java object to the repository as a JCR node where the Java member fields are actual node properties). This way you stick to the data representation of JCR (nodes with properties) and can still map them to the members of a Java class.
I've used a framework like this in a CMS called AEM (by Adobe) before, although I must mention this was in an OSGi context (but the principle still stands). The framework allowed maximum flexibility and persists the Java object as a JCR node and the other way around. Because it mapped directly to the JCR definition, code changes in the class and its members meant just changing annotations, and old persisted data was still usable without much effort.
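The answer doesn't name the framework, but Apache Sling Models is one annotation-based node-to-object mapper of exactly this kind; a minimal sketch (the class and property names are illustrative):

    import org.apache.sling.api.resource.Resource;
    import org.apache.sling.models.annotations.DefaultInjectionStrategy;
    import org.apache.sling.models.annotations.Model;
    import org.apache.sling.models.annotations.injectorspecific.ValueMapValue;

    // Maps a JCR node's properties onto Java fields via annotations, so the
    // stored data stays plain nodes/properties instead of serialized objects.
    @Model(adaptables = Resource.class,
           defaultInjectionStrategy = DefaultInjectionStrategy.OPTIONAL)
    public class PageModel {

        @ValueMapValue(name = "jcr:title") // node property -> Java field
        private String title;

        @ValueMapValue // property name defaults to the field name
        private String description;

        public String getTitle() { return title; }
        public String getDescription() { return description; }
    }

A consumer then simply adapts a resource: PageModel page = resource.adaptTo(PageModel.class);. Renaming a Java member only means touching an annotation, while the nodes in the repository stay as they are.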

entity framework and database default values workaround

I have to decide on an important item and I need your help.
I'm facing a huge existing database with a lot of default values on nullable columns.
The team has to build a new MVC4 application on top of it (in fact, it is a rewrite of an old VB6 application).
I (as a consultant) have 'forced' the use of EF5 to get rid of all stored procedures and migrate to a more modern technology.
Now, after my research, it is clear to me that EF5 doesn't support database default values out of the box. This is why my inserted records are corrupt (they are inserted, because the columns are nullable, but with NULL of course).
Some options came up, like using the constructor technique, setting the default values in the designer on the edmx, or playing around with the XML of the edmx.
However, these methods are not useful for us. While the constructor technique looks OK to me, it is not feasible to do for all tables in the DB. I also got a 'njet' from the technical person, because he wants to maintain these values in one place. Same story for setting the default values in the designer. The database is also not in our scope (read: as few changes as possible, to keep existing applications running).
At this point, I'm not sure whether EF is the correct choice for our project.
Is somebody aware of (third-party) tools that can fill in the database default values automatically in the generated XML of the edmx file?
Is there some more info about how this XML is built and whether there is a possibility to interfere in the process?
Is there a good reason why these default values should not be taken over? Is this going to change in a later release?
Are there other good practices that can be applied to this problem without having all values duplicated or a massive workload?
Can I arrange something with my POCO generator?
I realize there are already a lot of posts on this topic. Too bad there is no suitable solution for me, since we already have something existing and (with all respect) an old VB6 team that I have to convince.
Thanks for your feedback!

Preventing duplicates in MongoDB using Spring Data (Spring Roo)

I have been trying to get my head wrapped around MongoDB as it's used by Spring, so I decided to start a little project in Spring Roo.
In my project, I am storing my user login data in MongoDB. The trouble is that the registration process, which creates a new User object and stores it in MongoDB, has a tendency to create duplicates despite the fact that I have @Unique on the loginId field.
Now, I know part of the problem is that I am thinking about things from a JPA/RDBMS perspective, and MongoDB is not a relational DB and thus has a different set of parameters to operate with, but I'm having trouble finding guidance in anything more than VERY simple sample code.
First, what Spring/other annotations are available and, more importantly, commonly used when dealing with MongoDB from the Spring world? Second, when dealing with documents that need to be "uniqued", how does one typically do this? Do you first search on the unique field to ensure it's not already there, and then do the insert? Third, in JPA-land I could use the annotations @PrePersist and @PreUpdate to do last-minute data manipulation, like MD5-hashing passwords that have been updated or adding/updating a "Last Modified" date just prior to storing. I know these are JPA-isms, but can I still use them, and if not, is there an alternative for use with Spring Data/MongoDB?
I ended up using the @Id annotation on my entities, which indicates which field is used as the id field. As long as that field is unique, writing subsequent updates will properly replace the existing entity instead of adding a new one.
I ended up creating an additional method that checks whether there already exists a record with a value duplicating the one we are entering.
If it exists, I return a failure mentioning that a duplicate value exists. Otherwise it saves the newly entered value.
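A sketch combining both answers with Spring Data MongoDB (class and method names are illustrative): @Id gives update-instead-of-insert semantics, @Indexed(unique = true) makes MongoDB itself reject duplicates (provided the index is actually created), and a derived existsBy... query implements the check-before-save guard.

    import org.springframework.data.annotation.Id;
    import org.springframework.data.mongodb.core.index.Indexed;
    import org.springframework.data.mongodb.core.mapping.Document;
    import org.springframework.data.mongodb.repository.MongoRepository;

    @Document(collection = "users")
    class User {
        @Id
        private String id;              // same id -> update, not a new document

        @Indexed(unique = true)         // MongoDB rejects duplicate loginIds
        private String loginId;

        public String getLoginId() { return loginId; }
    }

    interface UserRepository extends MongoRepository<User, String> {
        boolean existsByLoginId(String loginId);  // derived query
    }

    class RegistrationService {
        private final UserRepository users;

        RegistrationService(UserRepository users) { this.users = users; }

        boolean register(User user) {
            if (users.existsByLoginId(user.getLoginId())) {
                return false;           // duplicate: report failure
            }
            users.save(user);
            return true;
        }
    }

As for the @PrePersist/@PreUpdate question: Spring Data MongoDB's lifecycle events play a similar role, e.g. an AbstractMongoEventListener overriding onBeforeConvert or onBeforeSave.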

OO Databases that Pass by reference?

I have played with MongoDB a little and wondered: is there ever going to be, or can there even be, a database which passes by reference or pointer?
E.g. I have a single user instance which can be put into multiple other arrays; if you change it once in one place, it changes in all the arrays.
I understand that in a database you don't want your data flung all over the disk, but might we ever see one?
I would venture to guess you are accidentally creating multiple references when using MongoDB. There seems to be little value in creating a graph for the database in which none of the nodes are shared.
I have used db4o, and it will maintain a single instance of an object, such that if you change it and then reference it from another object, the change will be "reflected". I put that in quotes because they should be the same object, due to the way graphs work.
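A small sketch of that reference behaviour with db4o's Java API (the class names are made up for the example): one User object added to two teams remains a single object in the stored graph, so one change shows up everywhere.

    import com.db4o.Db4oEmbedded;
    import com.db4o.ObjectContainer;
    import java.util.ArrayList;
    import java.util.List;

    class User { String name; User(String name) { this.name = name; } }
    class Team { List<User> members = new ArrayList<>(); }

    public class SharedReferenceDemo {
        public static void main(String[] args) {
            ObjectContainer db = Db4oEmbedded.openFile(
                    Db4oEmbedded.newConfiguration(), "demo.db4o");
            try {
                User alice = new User("Alice");
                Team a = new Team();
                Team b = new Team();
                a.members.add(alice);       // both teams reference the SAME object
                b.members.add(alice);
                db.store(a);                // stores the reachable graph
                db.store(b);                // alice is not duplicated here

                alice.name = "Alice Smith"; // one change...
                db.store(alice);            // ...persisted once, visible via both teams
            } finally {
                db.close();
            }
        }
    }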