I'm writing an application that lets a user create a kind of quiz. Each quiz can be different, i.e. each will have its own questions and answers. I've never worked with a NoSQL database, and I'm trying to wrap my head around how a NoSQL database (CouchDB, MongoDB, etc.) actually organizes its data. This kind of storage seems to lend itself well to this type of application (i.e. no tables that define how many questions can be asked), but I'm still unsure how a NoSQL database actually works.
I tend to be more of a visual learner. Can anyone point me to some good visuals, or describe how NoSQL databases actually organize the data they hold?
I will talk about MongoDB. MongoDB stores its data in BSON format (binary JSON), but you can keep thinking of it as storing JSON.
MongoDB has one big benefit you can use in your application: the ability to embed documents.
So you can embed the answers in your questions collection like this:
questions {
    _id: 1,
    text: "What is your name",
    answers: [
        {
            order: 1,
            text: "Andrew"
        },
        {
            order: 2,
            text: "Greg"
        }
    ]
}
Embedding usually keeps your database schema simpler, lets you avoid joins, and generally makes the data look more natural. In the SQL world, by contrast, you have no real alternative to creating separate tables for questions and answers.
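As a rough illustration of the embedded layout, here is a minimal Python sketch that uses plain dictionaries, so no database is needed to run it. The field names follow the example above; the helper `answers_in_order` and the distinct `order` values are assumptions made for the illustration.

```python
import json

# A quiz question with its answers embedded, mirroring the example document.
question = {
    "_id": 1,
    "text": "What is your name",
    "answers": [
        {"order": 2, "text": "Greg"},
        {"order": 1, "text": "Andrew"},
    ],
}


def answers_in_order(doc):
    """Return the answer texts sorted by their 'order' field (hypothetical helper)."""
    return [a["text"] for a in sorted(doc["answers"], key=lambda a: a["order"])]


# The whole question, answers included, serializes as one JSON document.
as_json = json.dumps(question)
print(answers_in_order(question))
```

Because the answers travel with the question, one read returns everything needed to render it; no second lookup or join is involved.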
Another benefit you may use is scalability. MongoDB was designed to scale out: it supports sharding and replica sets.
You can start reading more about schema design here. You can also look into The Little MongoDB Book; it is only 30 pages, but it should help you understand more deeply how MongoDB works.
Just choose a NoSQL database and play with it; you have a fairly simple project to start with. I am sure you will love it once you try.
First of all, I have extensive experience with relational DBs but only beginner-level knowledge of document DBs. I'm exploring MongoDB, but my question is about document DBs in general.
As far as I know (I may be wrong), a document DB consists of containers, and containers hold objects of the same or different structures. These object structures are defined in such a way that filtering and retrieval can be done as efficiently as possible. For example, a Book is written by Authors, so a Book object will also contain its list of authors. This way searching can be made faster and performance can be gained.
What is my problem?
I'm creating an application (I haven't started yet, as I'm confused here). Its relational DB is something like this....
The problem is that I'm not able to design the document DB structure for this requirement.
Could somebody please help me design such a database, or give me an idea of what approach one should take when designing one?
This comes down to answering the following questions:
What are your most common access patterns? It is helpful to think of your API methods, or top 5-10 queries to decide how to organize.
What are your transactional needs? Which of these entity types occur together in transactions and queries?
How often do they change? Should you embed or reference?
If you could include these details, we can help with more targeted suggestions.
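To make the embed-or-reference question concrete, here is a small sketch in Python with plain dictionaries. The book and author values, the field name `author_ids`, and the helper `author_names` are all hypothetical and only illustrate the two shapes.

```python
# Option 1: embed. Author data is copied into each book document,
# so one read returns the book together with its authors.
book_embedded = {
    "_id": "b1",
    "title": "Example Book",
    "authors": [{"name": "Author A"}, {"name": "Author B"}],
}

# Option 2: reference. Books hold only author ids; authors live in their
# own container and can change without touching any book document.
authors = {"a1": {"name": "Author A"}, "a2": {"name": "Author B"}}
book_referenced = {"_id": "b1", "title": "Example Book", "author_ids": ["a1", "a2"]}


def author_names(book, authors_by_id):
    """Resolve referenced author ids to names with a second lookup."""
    return [authors_by_id[i]["name"] for i in book["author_ids"]]


print(author_names(book_referenced, authors))
```

Embedding tends to suit data that is read together and rarely changes; referencing tends to suit data that changes often or is shared by many documents.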
http://azure.microsoft.com/documentation/articles/documentdb-modeling-data/ is also worth a read if you haven't already.
The main difference between DocumentDB and relational databases/MongoDB is that collections are more like shards/partitions and not tables.
I plan to write a blog-style app and am wondering what I should use for storage.
I intend to go with a NoSQL solution because designing a DB schema is boring, and I believe I can do most of the functionality with JSON-structured data.
What are some considerations when designing this? Which NoSQL technology fits this purpose better?
At a rough look, MongoDB or CouchDB would do; I am hoping to get some experience-based advice.
Appreciate your help!
MongoDB/CouchDB
I guess the easier of the two to start with is MongoDB. It feels a bit more like good old relational databases, because you can add indexes to fields or call operations like count. In CouchDB, as far as I know, you instead use Map-Reduce for all such functions. An index in CouchDB is generated by a so-called view.
Also, MongoDB roughly carries the database/table concept over to NoSQL (two-level access to data), whereas CouchDB only knows one level (the database).
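from pymongo import MongoClient
import couchdb

mytable = MongoClient().mydatabase.mytable  # MongoDB: database, then collection
mytable.insert_one(document)  # Connection() and save() are gone from current PyMongo

mydb = couchdb.Server()['mydatabase']  # CouchDB: database only
mydb.save(doc)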
So I guess CouchDB might be a bit harder to understand at the beginning, because you have to select the documents by some sort of type (or use multiple DBs, but I think an additional type attribute is what people use; see this presentation by David Zuelke, page 41).
MongoDB usually works with an API you can include in your programming language (if a library exists, but they exist for most languages). These calls are then sent in binary format to the server. On the other hand, CouchDB uses a REST-API.
Structure of the data
You can look around the net for tutorials. They very often use blogs as the example, because blogs are a good fit for document-oriented databases.
Let's have a small look ourselves: you will have a table (or a type if you use CouchDB) for your posts. Each post can have a text, some tags, a date, and comments. The point of document DBs is that you can store everything alongside the document and do not have to save all the relations that relational DBs need.
This means, we might model our posts like this:
{"type": "post",
 "date": "2012-06-19 22:14:23",
 "author": "user1462192",
 "text": "Welcome to my blog",
 "comments": [
   {"author": "Aufziehvogel",
    "date": "2012-06-19 22:14:45",
    "text": "Hello!"
   },
   {"author": "user1462192",
    "date": "2012-06-19 22:14:45",
    "text": "Hello, too!"
   }
 ],
 "tags": ["welcome", "new", "interesting"]
}
So that’s what a post could look like.
As always when developing software, think about what data you will save and how it is related. With document-oriented databases you also have to think about how you need to access it.
Sometimes you might have data that should not be saved as a child element of the post itself, because it is too large. Probably you do not only have the name of an author, but also more information like age, registration date, …
Then a user might look like this:
{"name": "Aufziehvogel",
 "age": 21,
 "registration": "2012-06-19",
 "interests": ["php", "nosql", "data-mining", "foreign-languages"]
}
You would not want to attach this data to each blog post, because some of it might change and because it is very large. Instead you would (just like with relational DBs) store a reference to the user in your post data. Then you would have to merge authors and blog posts as shown in the presentation linked above (pp. 40-42), so that each blog post is combined with the author it needs.
What you could also do is save both the author's name and the ID in the post, so that you can display the name and generate an HTML link without having to fetch the "real" author from the database.
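The two ideas above (a reference to the user plus a copied name) can be sketched in Python with plain dictionaries. The ids, the `/users/` URL path, and the helper names `author_link` and `with_full_author` are assumptions for illustration only.

```python
from html import escape

# Users live in their own container, keyed by a hypothetical id.
users = {"u1": {"name": "Aufziehvogel", "age": 21, "registration": "2012-06-19"}}

# Each post stores the author's id plus a copy of the name.
posts = [
    {
        "type": "post",
        "text": "Welcome to my blog",
        "author": {"id": "u1", "name": "Aufziehvogel"},
    }
]


def author_link(post):
    """Build an HTML link from the copied id and name, without loading the user."""
    author = post["author"]
    return '<a href="/users/{}">{}</a>'.format(escape(author["id"]), escape(author["name"]))


def with_full_author(post, users_by_id):
    """Application-side merge: return a copy of the post with the full user attached."""
    merged = dict(post)
    author_id = post["author"]["id"]
    merged["author"] = dict(users_by_id[author_id], id=author_id)
    return merged


print(author_link(posts[0]))
print(with_full_author(posts[0], users)["author"]["age"])
```

The trade-off is that the copied name can go stale if the user renames themselves, so the application has to decide when to refresh it.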
Validating
What Zuelke also shows is that with document-oriented DBs it is the application's task to check whether data is well-formed. In MySQL many of these checks can be performed by the database (columns, data types, lengths, UNIQUE keys), but with document-oriented DBs you have to do them yourself in the application (except that I think MongoDB supports things like unique keys).
This makes a good code structure important too, so that you do not have to worry about the format of the data in too many places.
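One way to keep those checks in a single place is a small validation function that every write goes through. The sketch below is only an assumption about what such a check might look like for the post documents in this answer; the required fields, the date format, and the name `validate_post` are not prescribed by any database.

```python
from datetime import datetime

# Hypothetical rules: these fields must exist and be strings.
REQUIRED_POST_FIELDS = {"type": str, "date": str, "author": str, "text": str}


def validate_post(doc):
    """Return a list of problems; an empty list means the post looks well-formed."""
    problems = []
    for field, expected in REQUIRED_POST_FIELDS.items():
        if field not in doc:
            problems.append("missing field: " + field)
        elif not isinstance(doc[field], expected):
            problems.append("wrong type for field: " + field)
    # Only check the date format when the date is at least a string.
    if isinstance(doc.get("date"), str):
        try:
            datetime.strptime(doc["date"], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            problems.append("bad date format")
    # Tags are optional, but if present they must be a list.
    if not isinstance(doc.get("tags", []), list):
        problems.append("tags must be a list")
    return problems


good_post = {
    "type": "post",
    "date": "2012-06-19 22:14:23",
    "author": "user1462192",
    "text": "Welcome to my blog",
    "tags": ["welcome", "new"],
}
print(validate_post(good_post))
```

Calling this before every insert or update keeps the format rules out of the rest of the code.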
I guess even more could be said, but I hope this is a good start.
Use the NoSQL database provided by App42. Here is how to use App42 NoSQL:
http://api.shephertz.com/apis/storage.php
I am currently trying to implement Tumblr-like user interactions such as reblogging, following, followers, commenting, showing blog posts from people I currently follow, etc.
There is also a requirement to display the activity for each blog post.
I am stuck on creating a proper schema for the database. There are several ways to achieve this kind of functionality (defining embedded data structures such as blog posts with their comments, creating an activity document for each action, etc.), but I can't decide which way is best in terms of performance and scalability.
For instance, let's look at the implementation of the people I follow. Here is a sample User document.
User = { id: Integer,
username: String,
following: Array of Users,
followers: Array of Users,
}
This seems trivial. I can maintain the following field on each user action (follow/unfollow), but what if a user I currently follow is deleted? Is it efficient to update all User records that follow the deleted user?
Another problem is creating a view of blog posts from the people I follow.
Post = { id: Integer,
author: User,
body: Text,
}
So is it efficient to query the latest posts like this?
db.posts.find( { author: { $in : me.following } } )
It seems (to me) that you are trying to use a single data store (in this case a document-oriented NoSQL database) to fulfill (at least) two different requirements. The first thing you seem to be trying to do is store data in a document-oriented store. I am going to assume that you have legitimate reasons for doing this.
The second thing you seem to be trying to do is establish relationship(s) between the documents you are storing. Your example shows a FOLLOWS relationship. I would recommend treating this as a different requirement from storing data in a document-oriented NoSQL database and look at storing the relationships in a graph-oriented NoSQL database such as Neo4j. This way, your entities can be stored in the document store and relationships in the graph store using just the document IDs.
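The separation described above can be sketched in plain Python. This is an in-memory stand-in, not the API of Neo4j or of any document store: documents are held in dictionaries, and the FOLLOWS relationship is a set of (follower id, followed id) pairs. All names and sample data are hypothetical.

```python
# "Document store": entities keyed by id.
users = {1: {"username": "alice"}, 2: {"username": "bob"}, 3: {"username": "carol"}}
posts = [
    {"id": 10, "author": 2, "body": "bob's post"},
    {"id": 11, "author": 3, "body": "carol's post"},
    {"id": 12, "author": 1, "body": "alice's post"},
]

# "Graph store": FOLLOWS edges as (follower_id, followed_id), ids only.
follows = {(1, 2), (1, 3), (2, 1)}


def following(user_id):
    """Ids of the users that user_id follows."""
    return {dst for src, dst in follows if src == user_id}


def followers(user_id):
    """Ids of the users that follow user_id."""
    return {src for src, dst in follows if dst == user_id}


def feed(user_id):
    """Posts written by the people user_id follows."""
    followed = following(user_id)
    return [p for p in posts if p["author"] in followed]


def delete_user(user_id):
    """Remove the user document and every edge that touches it."""
    users.pop(user_id, None)
    follows.difference_update({edge for edge in follows if user_id in edge})


print([p["id"] for p in feed(1)])
```

Because the edges live in one place, deleting a user means removing that user's edges rather than rewriting an array inside every follower's document.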
My experience has been that it will be difficult (if not impossible) to get a single NoSQL database to meet all functional and non-functional needs of a medium to large sized application. For example, the latest application I am working on uses MongoDB, Redis and Neo4j besides an RDBMS. I spent a lot of time experimenting with technologies and settled on this combination. I have committed myself to using Spring 3, along with the Spring Data project and so far my experience has been great.
One approach that works is called "Star Schema". If you search the web or wikipedia then you'll find lots of information.
I'm in the early stages of making a blogging site where users can have multiple blogs. I've decided to use document based storage for the blog entries (either MongoDB or CouchDB).
However, I will need to manage my users, mostly for authentication. Can this be done in a document-oriented database? How would I set that up? One document listing all the users seems like a bad idea. Or should I fall back to a relational database for this (most likely MySQL)?
It's perfectly possible and even more practical than an RDBMS in most cases. RDBMSs require a schema definition, whereas document databases tend to be conceptually schemaless. This is especially useful for user databases, since you can add user information whenever you want without any migrations. For example, this is perfectly valid:
{
id: <your UUID>,
name: "Willy",
email: "willy#won.ca"
},
{
id: <your UUID>,
name: "John",
facebookId: 10029823,
avatarUrl: "http://graph.facebook.com/picture/10029823"
}
In other words, it offers quite a bit of flexibility. There are no significant downsides that I can think of.
In terms of CouchDB versus MongoDB, the choice really depends on your personal preferences. CouchDB's community and support are somewhat in decline, whereas MongoDB's continue to grow. Personally I prefer MongoDB, but it's safe to say CouchDB's API and overall design are somewhat cleaner.
Good luck.
It's perfectly possible, and in my opinion, a good idea. Like Remon says, the schema-less design of a document database is a good idea for flexibility.
To answer your question about how to model it, I would suggest (in mongodb) a collection of documents called users, with one document in the collection for each user. A unique index on the collection by user name would be a good idea.
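A minimal sketch of that layout in Python follows. `ensure_unique_usernames` expects a PyMongo-style collection object and calls its `create_index` method with `unique=True`; the function names, the `username` and `email` field names, and the sample values are assumptions made for illustration.

```python
def new_user(username, email, **extra):
    """Build one user document; extra fields (e.g. facebookId) are allowed."""
    doc = {"username": username, "email": email}
    doc.update(extra)
    return doc


def ensure_unique_usernames(users):
    """Ask a PyMongo-style collection for a unique index on 'username'."""
    return users.create_index("username", unique=True)


# One document per user; documents need not share the same fields.
willy = new_user("willy", "willy@example.com")
john = new_user("john", "john@example.com", facebookId=10029823)
print(sorted(john))
```

With PyMongo, `ensure_unique_usernames` would be given the users collection of a connected client; once the index exists, inserting a second document with the same user name is rejected by the server.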
When it comes to NoSQL, there is a bewildering number of databases to choose from, as is clear from the NoSQL wiki.
In my application I want to replace MySQL with a NoSQL alternative. I have a user table which has one-to-many relations with a large number of other tables. Some of these tables are in turn related to yet other tables. A user is also connected to another user if they are friends.
I do not have documents to store, so this eliminates document-oriented NoSQL databases.
I want very high performance.
The NoSQL database should work very well with the Play Framework and the Scala language.
It should be open source and free.
So given the above, which NoSQL database should I use?
I think you may be misunderstanding the nature of "document databases". As such, I would recommend MongoDB, which is a document database, but I think you'll like it.
MongoDB stores "documents" which are basically JSON records. The cool part is it understands the internals of the documents it stores. So given a document like this:
{
"name": "Gregg",
"fave-lang": "Scala",
"fave-colors": ["red", "blue"]
}
You can query on "fave-lang" or "fave-colors". You can even index on either of those fields, even the array "fave-colors", which would necessitate a many-to-many in relational land.
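To show what querying "inside" a document means, here is a small in-memory imitation in Python. It mimics only simple equality matching, including the rule that a query value matches an array field when the array contains it; it is not MongoDB's implementation. The second record and the helper names `matches` and `find` are made up for the example.

```python
people = [
    {"name": "Gregg", "fave-lang": "Scala", "fave-colors": ["red", "blue"]},
    {"name": "Ann", "fave-lang": "Python", "fave-colors": ["green"]},
]


def matches(doc, query):
    """Equality match; a list field matches if it equals or contains the value."""
    for field, wanted in query.items():
        value = doc.get(field)
        if isinstance(value, list):
            if wanted != value and wanted not in value:
                return False
        elif value != wanted:
            return False
    return True


def find(docs, query):
    """Return the documents that satisfy every field in the query."""
    return [d for d in docs if matches(d, query)]


print([d["name"] for d in find(people, {"fave-colors": "red"})])
```

The query dictionaries have the same shape as the filters passed to MongoDB's `find`, which is why no join table is needed to search by favourite colour.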
Play offers a MongoDB plugin which I have not used. You can also use the Casbah driver for MongoDB, which I have used a great deal and is excellent. The Rogue query DSL for MongoDB, written by FourSquare is also worth looking at if you like MongoDB.
MongoDB is extremely fast. In addition you will save yourself the hassle of writing schemas because any record can have any fields you want, and they are still searchable and indexable. Your data model will probably look much like it does now, with a users "collection" (like a table) and other collections with records referencing a user ID as needed. But if you need to add a field to one of your collections, you can do so at any time without worrying about the older records or data migration. There is technically no schema to MongoDB records, but you do end up organizing similar records into collections.
MongoDB is one of the most fun technologies I have come across in the past few years. One happy Saturday I decided to check it out, and within 15 minutes I was productive and felt like I "got it". I routinely give a demo at work where I show people how to get started with MongoDB and Scala in 15 minutes, and that includes installing MongoDB. Shameless plug: if you're into web services, here's my blog post on getting started with MongoDB and Scalatra using Casbah: http://janxspirit.blogspot.com/2011/01/quick-webb-app-with-scala-mongodb.html
You should at the very least go to http://try.mongodb.org
That's what got me started.
Good luck!
At this point the answer is none, I'm afraid.
You can't just convert your relational model with joins to a key-value store design and expect it to be a 1:1 mapping. From what you said it seems that you do have joins, some of them recursive, i.e. referencing another row from the same table.
You might start by denormalizing your existing relational schema to move it closer to a design you wish to achieve. Then, you could see more easily if what you are trying to do can be done in a practical way, and which technology to choose. You may even choose to continue using MySQL. Just because you can have joins doesn't mean that you have to, which makes it possible to have a non-relational design in a relational DBMS like MySQL.
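As a rough sketch of that denormalizing step, the Python below folds rows from three relational-style tables into one nested record per user. The table and column names (`orders`, `user_id`, `item`, the friends pair table) are hypothetical, since the question does not describe the actual schema.

```python
from collections import defaultdict

# Rows as they might come out of three relational tables.
user_rows = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Ben"}]
order_rows = [
    {"id": 100, "user_id": 1, "item": "book"},
    {"id": 101, "user_id": 1, "item": "pen"},
]
friend_rows = [(1, 2)]  # self-referencing relation: user 1 and user 2 are friends


def denormalize(users, orders, friends):
    """Fold child rows into their parent user; keep friends as id references."""
    orders_by_user = defaultdict(list)
    for order in orders:
        orders_by_user[order["user_id"]].append({"id": order["id"], "item": order["item"]})
    friends_by_user = defaultdict(list)
    for a, b in friends:
        friends_by_user[a].append(b)
        friends_by_user[b].append(a)
    return [
        dict(u, orders=orders_by_user[u["id"]], friend_ids=friends_by_user[u["id"]])
        for u in users
    ]


docs = denormalize(user_rows, order_rows, friend_rows)
print(docs[0]["friend_ids"])
```

One-to-many children fold naturally into the parent, while the recursive friends relation stays a list of ids; that difference is a quick signal of which parts of a schema map well to a non-relational design.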
Also, keep in mind - non-relational databases were designed for scalability - not performance! If you don't have thousands of users and a server farm a traditional relational database may actually work better for you.
Hmm, you want very high traversal performance and you use the word "friends". The first thing that comes to mind is graph databases. They are made specifically for this exact case.
Try Neo4j http://neo4j.org/
It is free and open source, but also has commercial support and commercial licensing, has excellent documentation, and can be accessed from many languages (REST interface).
It is written in Java, so you have native libraries, or you can embed it into your Java/Scala app.
Regarding MongoDB or Cassandra, you can now (Dec. 2016, 5 years later) try longevityframework.org.
Build your domain model using standard Scala idioms such as case classes, companion objects, options, and immutable collections. Tell us about the types in your model, and we provide the persistence.
See "More Longevity Awesomeness with Macro Annotations!" from John Sullivan.
He provides an example on GitHub.
If you've looked at longevity before, you will be amazed at how easy it has become to start persisting your domain objects. And the best part is that everything persistence related is tucked away in the annotations. Your domain classes are completely free of persistence concerns, expressing your domain model perfectly, and ready for use in all portions of your application.