Play2-Scala-Reactivemongo Losing Mongo schema flexibility using ReactiveMongo

Play2-Scala-Reactivemongo Losing Mongo schema flexibility using ReactiveMongo - mongodb

I'm trying to use Play2 with ReactiveMongo to build my web application. I spent few days reading related documentation and tutorials.
In my opinion one of the most powerful features of MongoDB is schema flexibility, that is the possibility of storing in the same collection documents that doesn't have exactly the same structure, but may differ one from another. For example one document may have a field that another doesn't.
Play with ReactiveMongo use case classes to implement models, but case classes obviously have a fixed structure. So all the instances of the class will have the same structure.
Does it represent a loss of flexibility? Or there is a way to implement schema flexibility with ReactiveMongo?

In addition to Andre's answer: ReactiveMongo also supports optional fields in documents, as Options in their case classes. So you can have both the convenience and type-safety of a Scala class modelling your document and a flexible document structure.
If your document structure is one where the field-names are totally dynamic (which generally is a bad idea in Mongo), then as Andre said, you may not want to use the case-class-based ReactiveMongo document modelling at all. But you can also usually use a hybrid approach, where some aspects of a document are de/serialized dynamically using name-value maps, and some use case-classes.

From what I've read in the documentations from both ReactiveMongo and ReactiveMongo Play plugin. They work with BSON and JSON structures, respectively.
It's only when you use the extensions that you have the models defined as case classes to build the DAOs. So you have all the flexibility you need or all the convenience you want. It's just a matter of choosing what structure you work with.

Related

Combining Neo4J and MongoDB : Consistency

I am experimenting a lot these days, and one of the things I wanted to do is combine two popular NoSQL databases, namely Neo4j and MongoDB. Simply because I feel they complement eachother perfectly. The first class citizens in Neo4j, the relations, are imo exactly what's missing in MongoDB, whereas MongoDb allows me to not put large amounts of data in my node properties.
So I am trying to combine the two in a Java application, using the Neo4j Java REST binding, and the MongoDB Java driver. All my domain entities have a unique identifier which I store in both databases. The other data is stored in MongoDB and the relations between entities are stored in Neo4J. For instance, both databases contain a userid, MongoDB contains the profile information, and Neo4J contains friendship relations. With the custom data access layer I have written, this works exactly like I want it to. And it's fast.
BUT... When I want to create a user, I need to create both a node in Neo4j and a document in MongoDB. Not necessarily a problem, except that Neo4j is transactional and MongoDB is not. If both were transactional, I would just roll back both transactions when one of them fails. But since MongoDB isn't transactional, I cannot do this.
How do I ensure that whenever I create a user, either both a Node and Document are created, or none of both. I don't want to end up with a bunch of documents that have no matching node.
On top of that, not only do I want my combined database interaction to be ACID compliant, I also want it to be threadsafe. Both the GraphDatabaseService and the MongoClient / DB are provided from singletons.
I found something about creating "Transaction Documents" in MongoDB, but I realy don't like that approach. I would like something nice and clean like the neo4j beginTx, tx.success, tx.failure, tx.finish setup. Ideally, something I can implement in the same try/catch/finally block.
Should I perhaps make a switch to CouchDB, which does appear to be transactional?
Edit : After some more research, sparked by a comment, I came to realize that CouchDB is also not suitable for my specific needs. To clarify, the Neo4j part is set in stone. The Document Store database is not as long as it has a Java Library.

Pieter-Jan,
if you are able to use Neo4j 2.0 you can implement a Schema-Index-Provider (which is really easy) that creates your documents transactionally in MongoDB.
As Neo4j makes its index providers transactional (since the beginning), we did that with Lucene and there is one for Redis too (needs to be updated). But it is much easier with Neo4j 2.0, if you want to you can check out my implementation for MapDB. (https://github.com/jexp/neo4j-mapdb-index)

Although I'm a huge fan of both technologies, I think a better option for you could be OrientDB. It's a graph (as Neo4) and document (as MongoDB) database in one and supports ACID transactions. Sounds like a perfect match for your needs.

As posted here https://stackoverflow.com/questions/23465663/what-is-the-best-practice-to-combine-neo4j-and-mongodb?lq=1, you might have a look on Structr.
Its backend can be regarded as a Document database around Neo4j. It's fully transactional and open-source.

Grapical database builder for MongoDB (like DBForge for MySQL)

I really like DBForge with it's ability to create database schemes with a graphical UI.
Is there a tool like that for MongoDB?
Basically I'm looking for a tool that helps creating clean MongoDB Collections and Documents.
I'm starting a very big project in a few weeks and need to design a pretty huge MongoDB Database with lots of collections. So to keep everything organized I would like to have something graphical to look at instead of just coding the entities and their properties.

Try Mongo Explorer # http://mongoexplorer.com/ as it has some of the options you are looking for.
Good luck

I'm not aware of any specific MongoDB 'schema' designer. Since a collection is itself schema-less it's hard to imagine exactly what that would look like. You might for example decide in your database to store People, Cats and Cars all in the same collection because it happens to make it easier to query across them by name.
Typically I just use a standard class-viewer or entity designer for those occasions when I do want to see a graphical view of the objects that will be persisted as documents.
Relationships are also tricky for a graphical tool - sometimes all you have is an ObjectId, sometimes you denormalize and have maybe a Name and an ObjectId allowing you to display a list without fetching each item and sometimes you store an entire copy of the other object embedded in the current object.
You can even come up with ways to store data in MongoDB to mimic multiple-inheritance. Since most graphical entity designers support only single inheritance you would again have issues trying to use them with MongoDB.
In summary I recommend that you model your entities using whatever entity designer you have for the language you will be using and then worry about how to map them to collections to support your query and update patterns, adjusting them as necessary to denormalize parts of the data where you need performance, or to share a common field (by interface or inheritance) where you want to be able to easily search across entity types (e.g. string[] Tags, or string Name).

UI / GUI Admin Tools for MongoDB
This waas answered in this post.
I an using robomongo and its good.
The latest version supports 3.0 as well.
http://mongodb-tools.com/tool/robomongo/
http://robomongo.org/

Scala integration with Mongodb

We're using mongodb, and rewriting parts of our stack with scala. I'm wondering if I should stick with mophia, or use a scala mongodb library such as subset.
Question is what do I get out of subset? e.g. with mophia I don't have to manually define the mongodb field names like i have to do in subset...
Is subset really worth using?

We use casbah + salat and it works well in almost all cases.

With Scala you should consider using Casbah, which is an officially supported interface for MongoDB that builds on the Java driver.
Casbah's approach is intended to add fluid, Scala-friendly syntax on top of MongoDB and handle conversions of common types. If you try to save a Scala List or Seq to MongoDB, we automatically convert it to a type the Java driver can serialize. If you read a Java type, we convert it to a comparable Scala type before it hits your code. All of this is intended to let you focus on writing the best possible Scala code using Scala idioms. A great deal of effort is put into providing you the functional and implicit conversion tools you’ve come to expect from Scala, with the power and flexibility of MongoDB.
Casbah provides improved interfaces to GridFS, Map/Reduce and the core Mongo APIs. It also provides a fluid query syntax which emulates an internal DSL and allows you to write code which looks like what you might write in the JS Shell. There is also support for easily adding new serialization/deserialization mechanisms for common data types.

Additionally to the ORM-Mapper/Client-Libraries, I would suggest you give Rouge a try. It will serve you with a nice Query DSL for Mongo. Rogue 1.X will only support Lift-MongoDB but version 2.x (which will ship in very near future) will work for a lot more MongoDB libraries.
A sample query would be (pure Scala code with compiletime typechecking):
Venue where (_.mayor eqs 1234) and (_.categories contains "Thai") fetch(10)
which queries for 10 entries in the Venue collection where 1234 is the mayor and Thai is one of its categories.

I am the author of Subset. I would say "Subset" is not really a kind of ORM library. It has no methods for working with databases and collections, leaving it to Java/Scala drivers. But it is more focused on transformations of MongoDB documents. This transformation core is rather generic and suitable not only for reading/writing of fields, but for applications that need perform e.g. document migrations as well. Query/Update builders Subset provides are built on top of this "core".
That said, if you need ORM, there are simpler alternatives indeed. I never had an intent for Subset to compete with true ORM libraries, I've filled the gap I met in my projects.

Does a good framework for MongoDB "schema" upgrades in Scala exist?

What are the options for MongoDB schema migrations/upgrades?
We (my colleagues and I) have a somewhat large (~100 million record) MongoDB collection. This collection is mapped (ORM'd) to a Scala lift-mongodb object that has been through a number of different iterations. We've got all sorts of code in there which handles missing fields, renames, removals, migrations, etc.
As much as the whole "schema-less" thing can be nice and flexible, in this case it's causing a lot of code clutter as our object continues to evolve. Continuing down this "flexible object" path is simply not sustainable.
How have you guys implemented schema migrations/upgrades in MongoDB with Scala? Does a framework for this exist? I know that Foursquare uses Scala with MongoDB and Rogue (their own query DSL)... does anyone know how they handle their migrations?
Thank you.

Perhaps this can help somewhat, this is how Guardian.co.uk handle this:
http://qconlondon.com/dl/qcon-london-2011/slides/MatthewWall_WhyIChoseMongoDBForGuardianCoUk.pdf
Schema upgrades
This can be mitigated by:
Adding a “version” key to each document
Updating the version each time the application modiﬁes a document
Using MapReduce capability to forcibly migrate documents from older versions if required

I program migrations of MongoDB data with my own Scala framework "Subset". It lets define document fields pretty easily, fine tune data serialization (e.g. write "date" is a specific format and so on) and build queries and update modifiers in terms of the defined fields. This "gist" gives a good introduction

What NoSQL database to use as replacement for MySQL?

When it comes to NoSQL, there are bewildering number of choices to select a specific NoSQL database as is clear in the NoSQL wiki.
In my application I want to replace mysql with NOSQL alternative. In my application I have user table which has one to many relation with large number of other tables. Some of these tables are in turn related to yet other tables. Also I have a user connected to another user if they are friends.
I do not have documents to store, so this eliminates document oriented NoSQL databases.
I want very high performance.
The NOSQL database should work very well with Play Framework and scala language.
It should be open source and free.
So given above, what NoSQL database I should use?

I think you may be misunderstanding the nature of "document databases". As such, I would recommend MongoDB, which is a document database, but I think you'll like it.
MongoDB stores "documents" which are basically JSON records. The cool part is it understands the internals of the documents it stores. So given a document like this:
{
"name": "Gregg",
"fave-lang": "Scala",
"fave-colors": ["red", "blue"]
}
You can query on "fave-lang" or "fave-colors". You can even index on either of those fields, even the array "fave-colors", which would necessitate a many-to-many in relational land.
Play offers a MongoDB plugin which I have not used. You can also use the Casbah driver for MongoDB, which I have used a great deal and is excellent. The Rogue query DSL for MongoDB, written by FourSquare is also worth looking at if you like MongoDB.
MongoDB is extremely fast. In addition you will save yourself the hassle of writing schemas because any record can have any fields you want, and they are still searchable and indexable. Your data model will probably look much like it does now, with a users "collection" (like a table) and other collections with records referencing a user ID as needed. But if you need to add a field to one of your collections, you can do so at any time without worrying about the older records or data migration. There is technically no schema to MongoDB records, but you do end up organizing similar records into collections.
MongoDB is one of the most fun technologies I have happened to come across in the past few years. In that one happy Saturday I decided to check it out and within 15 minutes was productive and felt like I "got it". I routinely give a demo at work where I show people how to get started with MongoDB and Scala in 15 minutes and that includes installing MongoDB. Shameless plug if you're into web services, here's my blog post on getting started with MongoDB and Scalatra using Casbah: http://janxspirit.blogspot.com/2011/01/quick-webb-app-with-scala-mongodb.html
You should at the very least go to http://try.mongodb.org
That's what got me started.
Good luck!

At this point the answer is none, I'm afraid.
You can't just convert your relational model with joins to a key-value store design and expect it to be a 1:1 mapping. From what you said it seems that you do have joins, some of them recursive, i.e. referencing another row from the same table.
You might start by denormalizing your existing relational schema to move it closer to a design you wish to achieve. Then, you could see more easily if what you are trying to do can be done in a practical way, and which technology to choose. You may even choose to continue using MySQL. Just because you can have joins doesn't mean that you have to, which makes it possible to have a non-relational design in a relational DBMS like MySQL.
Also, keep in mind - non-relational databases were designed for scalability - not performance! If you don't have thousands of users and a server farm a traditional relational database may actually work better for you.

Hmm, You want very high performance of traversal and you use the word "friends". The first thing that comes to mind is Graph Databases. They are specifically made for this exact case.
Try Neo4j http://neo4j.org/
It's is free, open source, but also has commercial support and commercial licensing, has excellent documentation and can be accessed from many languages (REST interface).
It is written in java, so you have native libraries or you can embedd it into your java/scala app.

Regarding MongoDB or Cassendra, you now (Dec. 2016, 5 years late) try longevityframework.org.
Build your domain model using standard Scala idioms such as case classes, companion objects, options, and immutable collections. Tell us about the types in your model, and we provide the persistence.
See "More Longevity Awesomeness with Macro Annotations! " from John Sullivan.
He provides an example on GitHub.
If you've looked at longevity before, you will be amazed at how easy it has become to start persisting your domain objects. And the best part is that everything persistence related is tucked away in the annotations. Your domain classes are completely free of persistence concerns, expressing your domain model perfectly, and ready for use in all portions of your application.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse