MongoDB + Neo4J vs OrientDB vs ArangoDB [closed] - mongodb

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I am currently in the design phase of an MMO browser game. The game will include tilemaps for some real-time locations (so tile data for each cell) and a general world map. The game engine I prefer uses MongoDB for persistent world data.
I will also implement a shipping simulation (explained in more detail below), which is basically a Dijkstra module. I decided to use a graph database, hoping it would make things easier, and found Neo4j, as it is quite popular.
I was happy with the MongoDB + Neo4J setup, but then noticed OrientDB, which apparently acts like both MongoDB and Neo4J (best of both worlds?); they even have "vs." pages for MongoDB and Neo4J.
The point is, I have heard some horror stories of MongoDB losing data (though I am not sure it still does), and I don't have the luxury of losing data. As for Neo4J, I am not a big fan of the 12K€ per year "startup friendly" cost, although I'll probably not have a DB of millions of vertices. OrientDB seems a viable option, as there may also be some advantages to using a single database solution.
In that case, a logical move might be jumping to OrientDB, but it has a small community and, to be honest, I didn't find many reviews of it. MongoDB and Neo4J are popular, widely used tools; I have concerns that OrientDB might be an adventure.
My first question would be if you have any experience/opinion regarding these databases.
My second question is which graph database is better for a shipping simulation. The database is expected to calculate the cheapest route from any vertex to any other vertex and traverse it (classic Dijkstra). It also has to change weights depending on situations such as "country B has an embargo on country A, so any item originating from country A can't pass through B" or "there is a flood in region XYZ, so no land transport is possible". The database is also expected to cache results. I expect no more than 1000 vertices but many edges.
Thanks in advance and apologies in advance if questions are a bit ambiguous
PS: I added ArangoDB to the title but, to be honest, haven't had much chance to take a look.
Late edit as of 18-Apr-2016: After evaluating the responses to my questions and the development strategies, I decided to use ArangoDB, as their roadmap is more promising for me; they are apparently not trying to add tons of half-baked hype features.

Disclaimer: I am the author and owner of OrientDB.
As a developer, in general, I don't like companies that hide costs and let you play with their technology for a while and, as soon as you're tied to it, start asking for money. Once you have invested months developing an application that uses a non-standard language or API, you're stuck: pay, or migrate the application at huge cost.
You know, OrientDB is FREE for any usage, even commercial. Furthermore, OrientDB supports standards like SQL (with extensions), and the main Java API is TinkerPop Blueprints, the "JDBC" standard for graph databases. OrientDB also supports Gremlin.
The OrientDB project is growing every day with new contributors and users. The Community Group (a free channel for asking for support) is the most active community in the GraphDB market.
If you have doubts about which GraphDB to use, my suggestion is to pick what is closest to your needs, but then use standards as much as you can. That way an eventual switch would have a low impact.

It sounds as if your use case is exactly what ArangoDB is designed for: you seem to need different data models (documents and graphs) in the same application and might even want to mix them in a single query. This is where a multi-model database such as ArangoDB shines.
If MongoDB has served you well so far, then you will immediately feel comfortable with ArangoDB, since it is very similar in look and feel. Additionally, you can model graphs by storing your vertices in one (or multiple) collections, and your edges in one or more so-called "edge-collections". This means that individual edges are simply documents in their own right and can hold arbitrary JSON data. The database then offers traversals, customizable with JavaScript to match any needs you might have.
For your variations of the queries, you could for example add attributes about these embargoes to your vertices and program the queries/traversals to take these into account.
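To make that idea concrete independently of any particular database, here is a minimal Python sketch. It assumes edges are stored as plain documents with `_from` and `_to` fields, as in an edge collection; the `cost`, `mode`, `country` and `region` attributes, the port names, and the `make_rule` helper are hypothetical and only serve the illustration. The embargo and flood rules are expressed as a filter applied to edges before a classic Dijkstra search.

```python
import heapq

# Hypothetical edge documents: _from/_to plus arbitrary attributes.
EDGES = [
    {"_from": "P1", "_to": "P2", "cost": 4, "mode": "land", "country": "A", "region": "north"},
    {"_from": "P1", "_to": "P3", "cost": 2, "mode": "sea", "country": "A", "region": "north"},
    {"_from": "P3", "_to": "P2", "cost": 1, "mode": "land", "country": "B", "region": "xyz"},
    {"_from": "P2", "_to": "P4", "cost": 5, "mode": "land", "country": "C", "region": "south"},
    {"_from": "P3", "_to": "P4", "cost": 8, "mode": "sea", "country": "C", "region": "south"},
]


def make_rule(embargoed_countries=(), flooded_regions=()):
    """Build an edge filter from the current (assumed) world state."""
    def allowed(edge):
        # Goods may not pass through a country that embargoes their origin.
        if edge["country"] in embargoed_countries:
            return False
        # A flood only blocks land transport in the affected region.
        if edge["mode"] == "land" and edge["region"] in flooded_regions:
            return False
        return True
    return allowed


def cheapest_route(edges, start, goal, allowed=lambda edge: True):
    """Dijkstra over directed edge documents, skipping disallowed edges."""
    graph = {}
    for edge in edges:
        if allowed(edge):
            graph.setdefault(edge["_from"], []).append((edge["_to"], edge["cost"]))
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None  # no route under the current rules
```

Calling `cheapest_route(EDGES, "P1", "P4", make_rule(embargoed_countries={"B"}))` reroutes around country B. With roughly 1000 vertices, results could plausibly be cached per (start, goal, rule set) and invalidated whenever an embargo or flood changes, but that part is left out here.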
The ArangoDB database is licensed under the Apache 2 license, and community as well as professional support is readily available.
If you have any more specific questions, do not hesitate to ask in the Google group https://groups.google.com/forum/#!forum/arangodb or contact hackers (at) arangodb.org directly.

Neo4j's pricing is actually quite flexible, so don't be put off by the prices on the website.
You can also get started with the community edition or personal edition and use it for a long time.
The Neo4j community is very active and helpful and quickly provides support and help for your questions. I think that's the biggest plus besides performance and convenience.
In general using a graph model
Regarding your use-case:
Neo4j is used for exactly this route calculation scenario by one of the largest logistics companies in the world, where it routes up to 4000 packages per second across the country.
It is also used in other game engines, like here at GameSys for game economy simulation, and in another one for routing (not in earth coordinates but in game-world coordinates, using Neo4j-Spatial).
I'm curious why you have so few nodes. Are those like transport portals? I also wonder where you store the details and the dynamics of the routes (like the criteria you mentioned): are they coming from the outside, from the in-memory state of the game engine?
You should probably share some more details about your model and the concrete use-case.
And it might help to know that both Emil, one of the founders of Neo4j and I are old time players of multi user dungeons (MUDs), so it is definitely a use-case close to our heart :)

Related

Enterprise NoSQL Stack Solution for Mobile/Web [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I'm tasked with investigating for our firm a full-stack solution where we'll be using a NoSQL database backend. It'll most likely be fed from a data warehouse and/or operational data store of some type in near-realtime (hopefully :). It will be used mainly by our mobile and web applications via REST.
A few requirements/assumptions:
It will be read-only (in the near term) and consumed by clients in REST format
It has to be scalable
Fast response time
Enterprise support - or if lacking actual support, something industry proven if open-source (basically management wants to hold someone accountable if something in the stack fails)
Minimal client data transformations - i.e: data should be stored in as close to ready-to-use format as possible
Service API Management of some sort will most likely be needed (eg: 3scale)
Services will be used internally, but solution shouldn't prevent us from exposing them externally as a longterm goal
Micro-services are preferable (provided sufficient API management is in place)
We have in-house expertise in Java and Grails for our mobile/portal solutions
Some of the options I was tossing around were:
CouchDB: inherently returns REST - no need for translation layer - as long as clients speak REST, we're all good
MongoDB: need a REST layer in between client and DB - haven't found a widely used one based on my investigation (the ones on Mongo's site all seem in their infancy - i.e: RestHeart)
Some questions I have:
Do I need an appserver? Or any layer in between the client and DB for performance/caching reasons? I was thinking a reverse-proxy like nginx would be a good idea for this?
Why not use CouchDB in this solution if it supports REST out of the box?
I'm struggling with deciding between which NoSQL DB to use, whether or not I need a REST translation layer, appserver, etc. I've read the pros and cons of each and mostly they say go Mongo - but for what I'm trying to do the lack of a mature REST layer is concerning.
I'm just looking for some ideas, tips, lessons learned that anyone out there would be willing to share.
Thanks!
The problem with exposing the database directly to the client is that most databases do not support permission control as fine-grained as you need it to be. You often cannot allow a client to view and edit its own data while also forbidding it from viewing and editing the data of other users or, even worse, of the server itself. At least not when you still want a sane database schema.
You will also often find yourself in the situation that you have a document with several fields of which only some are supposed to be under the control of the user. I can, for example, edit the content of this answer, but I cannot edit the time it was posted, the name it was posted under or its voting score. So far I have never seen a database system which can handle permissions for individual fields (if anyone has, feel free to post in the comments).
You might think about handling this on the client by simply not offering any user interface for editing said fields. But that will only work in a trusted environment. When you have untrusted users, they could create a clone of your client-side application which does expose this functionality. There is no way for you to tell the difference between the genuine client and a clone, especially not when you don't have a smart application server (and even then it is practically impossible).
For that reason it is almost always required to have an application server between clients and database which handles authentication and permission management of the clients and only forwards those requests to the persistence layer which are permitted.
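As a rough illustration of the kind of check such an application server performs, here is a minimal Python sketch. The field names, the `owner_id` convention and the `apply_user_edit` helper are all hypothetical; a real service would sit behind authentication and write the result to the database.

```python
# Hypothetical whitelist: the only fields a document's owner may change.
USER_EDITABLE_FIELDS = {"content"}


class PermissionDenied(Exception):
    """Raised when a client request must not reach the persistence layer."""


def apply_user_edit(document, patch, user_id):
    """Return an updated copy of document, or raise PermissionDenied."""
    # Document-level rule: users may only edit their own documents.
    if document["owner_id"] != user_id:
        raise PermissionDenied("not the owner of this document")
    # Field-level rule: reject any attempt to touch server-controlled fields.
    forbidden = set(patch) - USER_EDITABLE_FIELDS
    if forbidden:
        raise PermissionDenied("fields not editable: " + ", ".join(sorted(forbidden)))
    updated = dict(document)
    updated.update(patch)
    return updated
```

Because the check runs on the server, a cloned or modified client gains nothing by sending extra fields such as a voting score.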
I totally agree with the answer from @Philipp. In the case of using CouchDB you will at minimum want to use a proxy server in front to enable SSL.
Almost all of your requirements can be fulfilled by CouchDB. Especially the upcoming v2 will give you the "datacenter-needs".
But it is simply very hard to say what the right tool for your purpose would be. If you get some business model requirements on top, let's say throttling, then you will definitely need application server middleware like http://mcavage.me/node-restify/
Maybe it's a good idea to spend some money on professionals like http://www.neighbourhood.ie/couchdb-support/ ? (I'm not involved)

EventStore vs. MongoDb [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I would like to know what advantages there are to using EventStore (http://geteventstore.com) over implementing event sourcing yourself in a MongoDb.
The reason I ask, is that our company has a number of people that work with MongoDb daily. They don't work with Event Sourcing though. While they are not completely in the dark about the subject, they aren't about to start implementing it anywhere either.
I am about to start a project, that is perfectly suited for Event Sourcing. There are about 16 very well defined events, and about 7 well defined projections. I say "about" because I know there will be demand for more projections and events once they see the product in use.
The approach is going to be API first, with a REST Api that other parts of our organisation are going to consume.
While I have read a lot about Event Sourcing the way Greg Young defines it, I have never actually implemented an Event Sourcing solution.
This is a greenfield project. There are no technology restrictions since we are going to expose everything as a REST interface. So if anyone has working experience with EventStore or Event Sourcing with MongoDb, please enlighten me.
Also an almost totally non related question about Event Sourcing:
Do you ever query the event store directly? Or would you always create new projections and replay event to populate those projections?
Disclaimer: I am Greg Young (if you can't read my name :))
I am going to answer this question though I believe it will likely get deleted anyways. This question alone for me is a bit odd, but the answers are fairly bizarre. I won't take the time to answer each reply individually but will instead put all of my comments in this reply.
1) There is a comment that we only run on a custom version of mono, which is a detail but... This is not the case (and has not been for over a year). We were waiting for critical patches we made to mono (for example to threadpool.c) to hit their master. This has happened.
2) EventStore is 3-clause BSD licensed. Not sure how you could claim we are not Open Source. We also have a company behind it and provide commercial support.
3) Someone mentioned us going on to version 3 in Sept. Version 1 was released 2 years ago. Version 2 added Clustering (obviously some breaking changes vs single node). Version 3 is adding a ton of stuff including ability to have competing consumers. Very little has changed in terms of the actual client protocol over this time (especially for those using the HTTP API).
What is really disturbing for me in the recommendations however is that they don't seem to understand what they are comparing. It would be roughly the equivalent of me saying "Which should I use neo4j or leveldb?". You could build yourself a graph database on top of leveldb but that would be quite a bit of work.
Mongo in this case would be the storage engine underneath an event store that the OP would have to write him/herself. Writing a production-quality event store on top of a storage engine is a non-trivial exercise if you want to have even the most basic operations.
I wrote this in response to the mailing list equivalent of this question:
How will you do the following with Mongo?:
Write and read events to/from streams with ordering/optimistic concurrency/etc
Then:
Your projections don't want to read from streams in the same way they were written, projections are normally interested in event types and want all events of type T regardless of stream written to and in proper order.
You probably also want for instance the ability to switch live from pushed event notifications to handling pulled information (eg polling) etc.
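To illustrate the first two requirements only (per-stream optimistic concurrency, and reading all events of a type in global order regardless of stream), here is a deliberately naive in-memory Python sketch. It is not how EventStore works internally, and the class and method names are hypothetical. The hard part the answer points at is doing the same thing durably and concurrently on top of a general-purpose store, plus live subscriptions, none of which this sketch attempts.

```python
class ConcurrencyError(Exception):
    """Raised when a writer's expected stream version is stale."""


class InMemoryEventStore:
    def __init__(self):
        self._log = []       # single global log, in write order
        self._versions = {}  # stream name -> number of events written

    def append(self, stream, event_type, data, expected_version):
        """Append one event if the stream is still at expected_version."""
        current = self._versions.get(stream, 0)
        if current != expected_version:
            raise ConcurrencyError(
                f"{stream}: expected version {expected_version}, found {current}"
            )
        self._log.append(
            {"position": len(self._log), "stream": stream, "type": event_type, "data": data}
        )
        self._versions[stream] = current + 1
        return current + 1

    def read_stream(self, stream):
        """Events of one stream, in the order they were written."""
        return [e for e in self._log if e["stream"] == stream]

    def read_by_type(self, event_type, from_position=0):
        """Events of one type across all streams, in global order."""
        return [
            e for e in self._log
            if e["type"] == event_type and e["position"] >= from_position
        ]
```

A projection would typically remember the last `position` it handled and call `read_by_type` with the following position to catch up, which is the polling half of the push/pull switch mentioned above.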
It would make more sense if Kafka, datomic, and Event Store were being compared.
Seeing as the other replies don't talk about the tooling or benefits in EventStore and only refer to the benefits of MongoDB I'll chime in. But note that my experience is limited.
I'll start with the cons...
There are a lot of check-ins, which can force you to decide which version you are going to actively support yourself. While the team has been solidifying their releases, the fact that they have arrived at version 3 not even 18 months after the first release should be an indicator that you will have to swap the version you are supporting for a more recent one (which can also impact the platform you choose to deploy to).
It's not going to work easily on every platform (especially if you're trying to move to a cloud environment or a docker based lxc container). Some of this is due to the community surrounding other DBs such as Mongo. But the team seems to have been working their butts off on read/write performance while maintaining cross-platform stability. As time presses on I've found that you don't want to deviate too far from a bare-metal OS implementation, which in this day and age is not attractive.
Uses a special version of Mono. Finding support for older versions of Mono only serves to make the process more of a root canal.
To make the most of the performance of EventStore you really need to think about your architecture. EventStore outputs to flat files and event data can grow pretty quickly. What's the failure rate of the disks you are persisting your data to? How are things compressed? Archived? You have a lot of control, and the control is geared towards storing your data as events. However, while I'm sure Greg Young himself could quote me to my grave the features that optimize and save your disks in the long term, I'm more likely to find a mature Mongo community that has experience with similar cases.
And the Pros...
RESTful - It's AtomPub. Is your stream not specific enough? Create another and do HTTP GETs to your heart's content. Concerned about routing? Do an HTTP forward. Concerned about security? Put an HTTP proxy in front. Simple!
You have a nice suite of tools and a UI for testing out and building your projections as your events start to generate new data (e.g. use the Chrome browser as a way to debug your projections... yes, they're written in JavaScript)
Read performance - Since the application outputs to a flat file you can get kernel-level caching and expose the data via HTTP at the drop of a hat. Also, indexes span your streams for querying projections against larger data sets (but I really get the feeling index performance will creep up on you over time).
I personally would not use this for a core, mission-critical, or growing application! However, if you have a side case for keeping your evented environment interesting, then I'd give it a go! I personally have to stick to Mongo for now.

Is Filemaker suitable for an EMR? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
A medical practice has approached us about using Filemaker as a fully-fledged EMR system with a HEAVY emphasis on using iPads to enter patient records, photos, digital signatures etc which can obviously be accessed on desktops as well. Ultimately they would like such a system to replace their current EMR and takeover all billing operations, patient scheduling and so forth. They only use Macs in their practice.
We have very little experience with Filemaker but found this discussion of its pros and cons; however, it seems that Filemaker has come a long way since 2009, when that question was asked...
So overall I'm just trying to work out if Filemaker is suitable for such an application or what would be the pros and cons of using a combination of FMP12 and FM Go.
(Sorry if I've done anything wrong - first question...)
Thanks!
As a FileMaker Developer myself, I would say go for it. I agree with Mikhail - You WILL see results faster than any other platform. You can make changes yourself easily and live or you can get a FileMaker developer - just like you would need to get a developer for any application.
Off-the-shelf applications tend to be quite inflexible, although I am sure there are systems out there that allow some customisation.
FileMaker is a very capable product. We have written many applications for vertical markets, such as law firms and even a Harley Street plastic surgeon who gathers patient data on an iPad and even sketches the suggested surgery on a picture of the patient.
For those who think FileMaker is a baby, have a look at http://www.businessmancrm.com - this is a full ERP system used all over the world. This is not an advert, but a demonstration of what is possible with FileMaker.
Dollar for dollar, FileMaker will win hands down... and when it comes to time frames, there is no contest. We are open minded - We constantly look for other products to develop applications for ourselves and customers and we have not found anything more viable just yet.
Pros:
Extremely quick environment
Cross platform
Integrate other SQL data sources into application
ODBC Support
Remote Access
Can be run from a USB stick if needed!
Thousands of developers around the world
Large community
FileMaker Inc. have made a profit every single quarter since existence, therefore are stable and do have the backing of Apple!
Reasonable Cost
Make changes yourself
Easy to backup, supports incremental backup
Easy to secure and encrypt data on a network
Supports terminal server
FMGo is free
Cons
High level language (not low level with layout object control) - However does support plugins
Requires FileMaker client (unless a web application/interface is built in PHP or using IWP - Instant Web Publishing)
Proprietary Database (however can easily link into MySQL, MSSQL and Oracle)
Honestly, not worth it.
It's a very clunky front-end for a database.
If you do decide to pick it up you're basically stuck with paying a Filemaker developer for the rest of its existence.
One of my clients has had it for the last 6+ years. After only being with them for the last 8 months, I'm trying very hard to push them away from it and onto a newer system.
I can suggest looking at Mastercare EMR, Profile and MMEX.
FileMaker is perfectly capable of it, of course, and I expect you'll be getting first results much faster than with any other approach, especially with the iPad. There are quite a few EMRs out there written in FileMaker. There are downsides, of course; it was always targeted at end users, so it ended up fairly unconventional from a common programmer's point of view. Many programmers dislike this. Being an end-user tool it suffers from many simplifications (well, not exactly suffers, actually; this makes development faster as there are fewer choices), but people always want something special, so there's a huge number of workarounds to overcome these simplifications. These workarounds vary from relatively harmless to very hairy ones.
For example, to sign documents on iPad you need to add a webviewer control pointed to a generated HTML page via the "data:" protocol. The page is going to have JavaScript that captures the user's touches, paints them on a canvas, and serializes this into a string. Later a script will capture the string, store it in a FileMaker field, and change the generated HTML to use this string so the JavaScript can redraw the signature. This one is relatively simple and, since the functionality cannot be obtained in any other way, it's in wide use; there's even a commercial module for around $300. A complex app may consist of dozens of such workarounds; anyone who is not a FileMaker developer won't be able to understand why you need a webviewer to capture a signature or why you use a strange contraption of invisible tabs to display what looks like a simple pop-up list. That is, it's not as if you read a book and work from there; be ready to read quite a few blogs and frequent forums and mailing lists.
That said, it's a good product nonetheless with unique capabilities (that iPhone/iPad client, for one); paired with a good developer it can be very powerful.
Having developed an EMR system at a recent position for 3 years, I can tell you from experience that the requirements for a true EMR system may quickly outgrow the scope of what is easy to do in FileMaker. A few really big, important EMR features come to mind immediately:
Insurance Eligibility verification: is there going to be a way to hit all of the major payers' web services or a third party aggregator to verify insurance eligibility from the iPad?
Insurance Card OCR: sure you can snap a photo of an insurance card, but now you have back office staff typing that information in from an image. We implemented OCR of insurance cards in our EMR and it was a huge cost and time saver.
Security / Privacy concerns: HIPAA compliance is a big deal, and is FileMaker suitably transparent to be compliant? Is there any way to audit who looks at a record? How is the data transferred across the wire?
E-prescribing: All modern EMRs support electronic prescriptions, which carry a complex set of rules and implementation details along with them. I would want to be certain FileMaker could be integrated with an e-prescribing gateway before proceeding.
My main concern with using any off the shelf, cross platform tool to approach a problem as big and complex as an EMR would be getting painted into a corner down the road, having invested a bunch of time and money into a solution that may leave you unable to implement a feature or requirement, whereas paying the up front price of developing a native iOS app (and web apps and whatever else you need to integrate with) would eliminate that possibility, but obviously cost more.

NoSQL Database for ECommerce

I will be constructing an ecommerce site, and would like to use a NoSQL database, which will fit well with the plans for the app. But when it comes to which database would fit the job, I'm not sure. After comparing various DBs, the ones that seem best might be Mongo, Couch, or even OrientDB. I have seen arguments for and against using each of them compared to something like MySQL. But among the NoSQL databases themselves, which one would fit well with an ecommerce solution?
Note: for my use case, I won't be having thousands of transactions a second, or similarly high write rates. They will be moderate, sure, but at a level that any established database could handle.
CouchDB: Has master-to-master replication, which I could really use. Otherwise I would have to implement the same functionality in code anyway. I need to be able to have a user's database sync with the mothership (users will have their own, potentially localhost, database that could sync with the main domain's server). Couch is also fast once your queries have been stored in the db, which matters as I will probably have a higher need for read performance, though not by a lot.
MongoDB: Queries are very easy and user friendly. Also, given that end users may need to query for certain things at a given time that I may not be able to account for ahead of time, this seems like it may be a better fit. I don't have to pre-store my queries in the db. It does support atomic transactions, though only when writing to a single document at a time.
OrientDB: A graph database. Much different than most people are used to, but with my needs, it could fit very well too. Orient has the benefit of being schemaless, as well as having support for ACID transactions. There are a lot of customer and product relationships that a graph database could be great with. Orient also supports master-to-master replication, similar to CouchDB.
Don't get me wrong, I can see how to build this traditionally with something like MySQL, but the ease and simplicity of a NoSQL solution is very attractive. In my case, needing a schemaless solution would be much easier in NoSQL than in MySQL: a given product may have more or fewer attributes than another, and avoiding recreating a table whenever a new field is added is preferable.
So between these 3 (or even others you think may be better), what features in each could potentially work for, or against me in regards to an ecommerce based site, when dealing with customer transactions?
Edit: The reason I am not using an existing solution, is because with the integrated features I need, there are no solutions available out there. We are also aiming to use this as a full product for our company. There will be a handful of other integrations than just sales. It is also going to be working with a store's POS system.
Since e-commerce can encompass everything from shopping carts through to membership and recurring subscriptions, it is hard to guess exactly what requirements and complexity you are envisioning.
When constructing an e-commerce site, one of the early considerations should be investigating whether there is already an established e-commerce product or toolkit that could meet your requirements. There are many subtleties to processes like ordering, invoicing, payments, products, and customer relationships even when your use case appears to be straightforward. It may also be possible to separate your application into the catalog management aspects (possibly more custom) versus the billing (potentially third party, perhaps even via a hosted billing/payment API).
Another consideration should be who you are developing the e-commerce site for: is this to scratch your own itch, or for a client? Time, budget, and features for a custom build can be difficult to estimate and schedule .. and a niche choice of technology may make it difficult to find/hire additional development expertise.
A third consideration is what your language(s) of choice are for developing your application. Some languages will have more complete/mature/documented drivers and/or framework abstractions for the different databases.
That said, writing an e-commerce system appears to be a rite of passage for many developers ;-).
Edit: a lot has changed since this answer was originally posted in 2012 and you should definitely refer to current product information. For example, MongoDB has had support for Decimal128 values since MongoDB 3.4 (2016) and multi-document transactions since MongoDB 4.0 (2018).
Check the comparison of different available NoSql databases here. Suit your requirement as per that.
MongoDB 4 now supports multi-document ACID transactions! That makes it suitable for e-commerce!
Check out: https://www.mongodb.com/transactions

Is NoSQL the right tool for a multi-level, forum-like comment system?

I want to build a web app similar to Reddit.com, where you have multiple levels of comments and lots of reads and writes. I was wondering if NoSQL, and MongoDB in particular, is the right tool for this?
Comments are really a natural fit for a NoSQL database, no doubt. You avoid multiple self-joins, and that means your system can scale out!
With MongoDB you can store the whole hierarchy within one document. Some people may say that there will be problems with atomic updates, but I guess that's not a problem because you can load and save back the entire comment tree. In any case, you can easily redesign your system later to support atomic updates and avoid concurrency issues.
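As an illustration of the single-document layout, here is a minimal Python sketch that models the thread as a nested dict, the shape such a document would have in a document store. The `replies`, `author` and `text` field names and both helpers are hypothetical, and no database driver is involved.

```python
def add_reply(thread, path, author, text):
    """Append a reply under the comment addressed by path.

    path is a list of child indexes from the root, so [] targets the post
    itself and [0, 2] targets the third reply to the first comment.
    """
    node = thread
    for index in path:
        node = node["replies"][index]
    node["replies"].append({"author": author, "text": text, "replies": []})
    return thread


def count_comments(node):
    """Count every comment below node, at any depth."""
    return sum(1 + count_comments(child) for child in node["replies"])


# The whole thread lives in one document-shaped structure.
thread = {"_id": "post-1", "title": "Example post", "replies": []}
add_reply(thread, [], "alice", "Top-level comment")
add_reply(thread, [0], "bob", "Reply to alice")
add_reply(thread, [0, 0], "carol", "Reply to bob")
```

The trade-off mentioned above shows up here: the load, modify, save cycle rewrites the whole tree, so two users replying at the same moment can overwrite each other unless updates are made atomic or versioned.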
Reddit itself uses Cassandra. If you want something "similar to reddit.com," maybe you should look at their source -- https://github.com/reddit/reddit/wiki.
Here's what David King (ketralnis) said earlier this year about the Cassandra 0.7 release: "Running any large website is a constant race between scaling your user base and scaling your infrastructure to support it. Our traffic more than tripled this year, and the transparent scalability afforded to us by Apache Cassandra is in large part what allowed us to do it on our limited resources. Cassandra v0.7 represents the real-life operations lessons learned from installations like ours and provides further features like column expiration that allow us to scale even more of our infrastructure."
However, Rick Branson notes that Reddit doesn't take full advantage of Cassandra's features, so if you were to start from scratch, you'd want to do some things differently.