When to use NoSql, and which one? [closed] - nosql

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
I've been programming with PHP and MySQL for a while now and recently decided that I wanted to give NoSQL a try. I would really appreciate it if some of you with experience could tell me:
When is it a good time to switch, how do I know nosql is for me?
Which nosql software would you recommend?
Thanks

When is it a good time to switch?
It really depends on the particular project. But in general I find that I can use NoSQL for 95% of web applications. I still use good old SQL for systems that must guarantee ACID (for example, systems that work with 'real' money).
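To make the ACID point concrete, here is a minimal sketch of the kind of all-or-nothing money transfer a relational database gives you. It uses Python's built-in sqlite3 module purely for illustration; the accounts table and the transfer function are made-up names, not anything from the question.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts: both updates commit, or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes the block
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")  # triggers the rollback

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()

transfer(conn, 1, 2, 30)       # succeeds
try:
    transfer(conn, 1, 2, 500)  # fails, and neither update is kept
except ValueError:
    pass

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(balances)
```

The failed transfer leaves both balances exactly as they were after the first one, which is the guarantee you want when real money is involved.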
How do I know nosql is for me?
It's for you, for you, believe me. ;)
You just need to try something from the nosql world, read some existing articles and you will see all of the benefits and problems.
Which nosql software would you recommend?
I would personally recommend that you start with MongoDB, because it is really simple. Becoming an expert in SQL takes years; becoming an expert in MongoDB takes a month or so.
I suggest that you spend an hour reading "The little mongodb book" and try to write your first test application starting tomorrow.
No one here will say that you need to use this, or this database. What database to use depends on project and requirements.

It depends on your application's needs. There are a lot of options.
You can use a document-oriented database like MongoDB, an "extended" key-value store like Redis, or maybe a graph-oriented database like Neo4j.
This article is very useful http://highscalability.com/blog/2010/12/6/what-the-heck-are-you-actually-using-nosql-for.html

This recent blog post in High Scalability pretty much answers your question in regards to when to use NoSQL.
I myself always go MySQL until it fails me and then choose the right tool for the job, some of the non-relational databases I worked with are:
Riak: a Dynamo clone which is useful when you need to access records quickly but you have too many records to keep on one machine. For instance, a recommender system for users in a web application: you want to access the data in a few milliseconds but you have 200M users.
MongoDB: a document-based database. I used it when I needed speed for a write-intensive application (read/write ratio close to 1:1); the data was highly transient, so I didn't care about the durability issues.

The best time to switch is when you:
Start working on a new project and you make your first architectural decisions. Porting an existing application to a different database can cause a lot of headaches.
Hit a brick wall... or better before you see one coming :) The main reason is usually lack of performance or scalability.
Need a missing feature (e.g. complex hierarchical structures, graph-like traversals, etc.).
I would recommend a lot of them, but each of them has their own sweet-spots where they shine and other parts where they lack features. The only way to pick the right tool(s) is to get familiar with a couple of them.
Web developers usually learn key-value stores (memcached, redis) first as they can fix a lot of performance problems (but also add some complexity to your app...).
There are document (schema-less) databases like MongoDB or CouchDB which can significantly enhance your productivity if your data model often changes.
For graph traversals there is Neo4j.
For "big data" there is Hadoop and its related projects.
And the list goes on and on...
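As a rough illustration of the key-value caching mentioned above, and of the extra complexity it brings, here is a minimal cache-aside sketch. Plain dictionaries stand in for both the SQL table and memcached/Redis, and every name in it (get_user, rename_user, the TTL value) is an assumption made for the example.

```python
import time

database = {1: {"id": 1, "name": "alice"}}  # stand-in for the SQL table
cache = {}                                  # stand-in for memcached/Redis: key -> (expires_at, value)
db_reads = 0                                # counts how often the "database" is hit

def load_user_from_db(user_id):
    global db_reads
    db_reads += 1
    return dict(database[user_id])

def get_user(user_id, ttl=60.0):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    entry = cache.get(key)
    if entry is not None and entry[0] > time.monotonic():
        return entry[1]
    user = load_user_from_db(user_id)
    cache[key] = (time.monotonic() + ttl, user)
    return user

def rename_user(user_id, name):
    """Write path: update the database, then invalidate the cached copy."""
    database[user_id]["name"] = name
    cache.pop(f"user:{user_id}", None)

get_user(1)
get_user(1)                    # served from the cache
reads_before_write = db_reads
rename_user(1, "bob")
fresh = get_user(1)            # cache was invalidated, so this reads the database again
print(reads_before_write, db_reads, fresh["name"])
```

The invalidation in rename_user is the "added complexity": forget it in one code path and readers see stale data until the entry expires.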

Related

Why is there no ActiveRecord REST adapter [closed]

Closed 10 years ago.
We've been working on various projects using ActiveResource for a couple years now. It seems that ActiveResource is great to use if you are using Rails on both the client and server sides.
Problems with ActiveResource
We're using Scala on the server side and constantly running up against "Oh, ActiveResource doesn't want the API to <do this standard thing>" or "ActiveResource does <this weird thing>" so we have to change the server to support the demands of the client. This has all been discussed before, I know.
Another problem is that many gems and libraries require ActiveRecord. I can't count the number of gems we've run into that "require" your model to use ActiveRecord even though they don't actually use the actual AR functionality. It seems this is mostly because that's the easy path for gem development. "I'm using ActiveRecord, and can't imagine anyone not using it, so I'll just require that rather than figure out the more general way" (note, I've done this myself, so I'm not simply complaining)
So, if we use ActiveResource, we have to break the server to make it work, and we can't use a large portion of what makes Rails great.
REST Adapter?
All of this brought us to ask the question "Why does ActiveResource exist at all?" I mean, why would you have this secondary data storage path? Why isn't ActiveResource just a REST adapter? With a REST adapter, you can have all the good things in all the gems, and don't have to fight with ActiveResource's finicky nature. You just build your model the same way you build any model.
So I started exploring building one. It actually doesn't seem difficult at all. A few hours work and you could have the basic functionality built up. There are examples elsewhere using REST and SOAP, so it's doable.
So the question comes back. If it's so easy, why the hell hasn't this been done before?
Not simply a datastore?
I've come up with what I wonder is the answer. While building up a scaffold for this, I quickly ran into an issue. REST is very good at doing two things: 1) Act on this object, and 2) Act on all objects. In fact, that's pretty much the limit of the REST standard.
And so I started to wonder if scope is the reason there's no REST adapter. The ActiveRecord subsystem seems to need more than just "get one" and "get all." It's based on querying the datastore as much as anything.
The Actual Question
So, the actual question: Is there no ActiveRecord REST adapter simply because REST defines no standardized way to say "give me all of the cars where the car is in the same parking lot as these drivers and the drivers have a key"?
That's right.
Unlike databases which have SQL as a standard way to express conditions, there is no standard in REST to support all the ActiveRecord functions, such as joins, group, and having.
How many hours' work do you think it would take to do correlated queries or sub-queries?
I'm not being casually dismissive here. This concept touches on some personal projects I've considered, with some of the same issues I've been thinking through.
ActiveRecord supports all of SQL, which is way more powerful than most people use or need. Basically every part of an SQL statement has an ActiveRecord method which takes a string to fill in that section of the SQL.
You'd want to limit the client to the part of ActiveRecord people actually use. You'd still need IS NULL, and IS NOT NULL. You'd need comparisons such as less than and greater than. You'd want to support OR statements, for "field1 IS NULL OR field1 = ''".
To do all the comparison stuff, like where(["updated_at > ?", cutoff]) you would need a RESTful server more robust than existing web services. The server would have to use your gem, or be built with guidelines for implementing all the functionality.
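To show what such a server-side convention might look like, here is a small sketch that encodes conditions in a query string and decodes them into a parameterized WHERE clause. The field__operator naming is purely an assumption for this example (it is not a REST standard, which is rather the point), and the function names are made up.

```python
from urllib.parse import urlencode, parse_qsl

# Hypothetical operator suffixes; both client and server would have to agree on them.
OPERATORS = {"eq": "=", "gt": ">", "lt": "<", "is_null": "IS NULL"}

def to_query_string(conditions):
    """Client side: turn (field, operator, value) triples into a URL query string."""
    params = []
    for field, op, value in conditions:
        if op not in OPERATORS:
            raise ValueError(f"unsupported operator: {op}")
        params.append((f"{field}__{op}", "" if value is None else str(value)))
    return urlencode(params)

def to_sql_where(query_string):
    """Server side: turn the query string back into a WHERE clause plus bind values."""
    clauses, args = [], []
    for key, value in parse_qsl(query_string, keep_blank_values=True):
        field, _, op = key.partition("__")
        if not field.isidentifier() or op not in OPERATORS:
            raise ValueError(f"bad condition: {key}")
        if op == "is_null":
            clauses.append(f"{field} IS NULL")
        else:
            clauses.append(f"{field} {OPERATORS[op]} ?")
            args.append(value)
    return " AND ".join(clauses), args

qs = to_query_string([("updated_at", "gt", "2011-01-01"), ("deleted_at", "is_null", None)])
where, args = to_sql_where(qs)
print(qs)
print(where, args)
```

Even this toy version can only AND conditions together; expressing "field1 IS NULL OR field1 = ''" would need yet another invented convention.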
So, in the end why? You're implementing a limited database API, going over the network with string URLs instead of binary packets, to a database engine that you are implementing.
On the other hand, if there was a standard for this, there might be good benefit to such an implementation.
If there were an implementation which one could install on a RESTful web server and which, while maybe not as powerful as SQL, could do indexed queries, post-index processing of simple non-indexable expressions to qualify the records returned, and sorting (even if it passed this on to an SQL database to do all the work), one could enable it on a server, and products like Crystal Reports could be developed to use the standard as a report client.
Going through the web server API layer could provide a way to enforce restrictions on which database operations can be performed, giving more security than fully opening up database access. Also, logic could be added to the web service to audit and process events resulting from the CRUD operations (essentially triggers). Yes, database products supply security policies, triggers, and stored procedures to do these things, but with the product we are discussing, one could do it more easily in Ruby than with the database functions.
One could also have pseudo data, which is calculated by Ruby code but acts like database records, alongside the general RESTful DB access. Sure, databases can do this with stored procedures, and some support writing stored procedures in Java, but this would be better because it would be easier to implement and could be written in Ruby.

What should I do when something I know is dead ? [closed]

Closed 11 years ago.
I recently came across some blogs saying that Linq to Sql is dead, and a few days ago I saw a discussion in which some people said that Silverlight may take ASP.NET's place! I don't want to ask whether these claims are true, but it is annoying to wake up and find your languages and knowledge in the casket! (I think Microsoft technologies account for a big part of such claims.) So what should I do in this case? Drop everything and start again with the new techniques, keep using the old stuff, or what?
A platform isn't dead because someone says it's dead. It's dead when it's no longer used, and in that case, it's obvious you shouldn't be using it either. Unless, of course, you're employed by a big company that has a big legacy system that needs further development using that "dead" technology.
Nothing ever really dies. Especially Microsoft stuff. I thought I had wasted 10 years as a FoxPro dev (not even VFP!), but this helped me get jobs, even in related technologies like dBASE and Clipper, since those skills are harder to find.
Having said that, by all means keep up with the mainstream. Continuous learning is expected in this field.
In any case, neither of those claims (re Linq to Sql and Silverlight) is true.
First off, don't immediately jump to the assumption that what you read in a blog or in a tech magazine is true. If the 'old stuff' still works, what's the compelling reason to change? Basically it's just keeping up with fashion. Use what you want to use. If you understand the basics of programming, you can adapt as needed when presented with new technologies.
Neither ASP.NET nor Linq-to-SQL appears to be vanishing any time soon; however, this advice always applies:
Put pressure on Microsoft (or any other vendor, for that matter) to support products your business depends on.
If your business regularly purchases upgrades to its Microsoft stack, MS will want to make you happy. Get involved at Microsoft Connect, contact your MS representative, and be involved. The safest path to maintaining support from MS is by participating in their process. (It's why Windows XP remained supported long after MS announced support would end.)
Simple answer: Be sufficiently broad in your technical abilities.
As a hiring manager, I'm more concerned that you know how to write good programs and that you know how to get answers to your questions, than I am that you spent 10 years focusing on a specific technology.
For example: If I'm hiring for an MVC2 position, I would gladly take a seasoned Microsoft web developer with webforms experience and some MVC2 exposure over an unseasoned and unskilled programmer that's been working in MVC2 since it came out.
Many programmers are keen to try out new things. The statement "Technique X is dead" just means that someone is trying to convince himself or a group that investing more time in something else is worthwhile.
To really kill a technology you would need to remove all related downloads and available knowledge or relicense the software in some form that makes usage impossible.
When a group of people shout that X is dead, they are just moving to Y. If you can still hire X programmers, you are not in big trouble.
Don Box once said, "If you're the kind of person who starts working with a technology when it's dead, it's time for you to start using COM." He said that when introducing .NET, confirming in that way that COM would not be improved further. See? "Dead" means a technology won't evolve as much as others anymore, but definitely not that we shouldn't use it anymore.
Don't jump to conclusions based on (non-authoritative) blogs. People have opinions, and that's all they are. Until the vendors of the technologies you're using come out and say it, proceed with caution.
I'm not a .NET expert by any means, but I think there's a reason for not using Linq for SQL queries, since a dedicated ORM approach will most likely be faster and more configurable. As for Silverlight, it's just a subset of WPF, which is a GUI framework. They possibly wish that Silverlight would take over some processing from the server side, but I doubt they'll get that sort of market penetration.
That's life on the cutting edge... You got into this business because you liked the latest technology, right? Well, don't look back, look forward. Before your favorite technology even has the sniffles, you should be looking at what's coming up. If you study up on the latest technology, you won't even notice when the old stuff goes away.
Software development is all about learning things so that you can build things that will be thrown away and replaced with newer things some day! If you don't want to commit to continuous change, you're in the wrong field! :-)

Is MongoDB reliable? [closed]

Closed 9 years ago.
I am developing a dating website and I am thinking of using a NoSQL database to store the profiles etc. I am currently looking into MongoDB and so far I am very pleased. My only worry is that I have read on various websites that MongoDB is unreliable and not good.
I looked into the NoSQL alternatives and found none that fully meets my specific criteria:
Easy to learn and use.
Fully compatible with PHP out of the box.
Fast and well documented.
What do you think, am I doing the right thing to go with MongoDB or is it a waste of time?
Thankful for all input in the matter!
I researched MongoDB for my social service startup and it is definitely worth considering. MongoDB has a powerful feature set that makes it a realistic and strong alternative to RDBMS solutions.
Amongst them:
Document Database: Most of your data is embedded in a document, so in order to get the data about a person, you don't have to join several tables. Thus, better performance for many use cases.
Strong Query Language: Despite not being an RDBMS, MongoDB has a very strong query language that allows you to get something very specific or very general from a document or documents. The DB is queried using JavaScript, so you can do many more things besides querying (e.g. functions, calculations).
Sharding & Replication: Sharding allows your application to scale horizontally rather than vertically. In other words, more small servers instead of one huge server. And replication gives you fail-over safety in several configurations (e.g. master/slave).
Powerful Indexing: I originally got interested in MongoDB because it allows geo-spatial indexing out of the box but it has many other indexing configurations as well.
Cross-Platform: MongoDB has many drivers.
As for the documentation, there is no deluge, but that is because this project only started in 2009; soon there will be plenty more. However, there is enough to get started with your project. In addition, you can check out Kyle Banker's MongoDB in Action, a great resource.
Lastly, I had experience only with RDBMS prior to MongoDB, didn't know JavaScript or JSON, and still found it to be very simple and elegant.
Consider this related question on MongoDB and CouchDB - Fit for Production?
MongoDB has a showcase of Production Deployments as well. Be sure to analyze the uses of MongoDB rather than the size of the company.
Any software can be reliable or unreliable. MongoDB has replica sets, which give you hardware failover capabilities. You can take backups on a regular basis, which gives you a recovery interval, and you get sharding which can give you some modicum of redundancy, especially when combined with replica sets.
The issue isn't whether or not the technology is reliable, the issue is whether or not you have a well-defined backup and recovery plan that suits your platform of choice.
If MongoDB suits your needs, you're making the right choice. Just make sure to investigate what you can do to increase your reliability.
If it's good enough for Foursquare, it's most likely good enough for you.
I come from an RDBMS background (12 years) and have spent the last 6 months looking at NoSQL options. For your scenario, MongoDB sounds like a good choice. What I am hearing from those who have worked with MongoDB in production for some time is that you should follow these best practices:
Keep key size small
Evaluate (and possibly add) indexes to speed up queries
Pay attention to schema (I know this seems strange for a 'schemaless' database, but I have heard it several times)
Use replica sets
Here's a video of a best-practices talk from the MongoDB LA User Group that I find useful
10gen, the company behind MongoDB, provides an official PHP driver.
As Jeremiah says, they implemented replica sets in the latest version (1.6.0) and have already debugged it (1.6.1 is out, and the next version, 1.6.2, is due in a few weeks).
Moreover, the free support from the company and the community is very fast and efficient (by 'free' I mean questions on the Google group: http://groups.google.com/group/mongodb-user?pli=1)
Well, some other points about reliability:
The community reacts extremely fast if you meet any critical issue.
You need to think about what you expect from "reliability": do you need a guarantee that your data is stored safely and never gets corrupted?
In that case, you will have to compare the cost of buying reliable hardware with that of deploying MongoDB replica sets.
Or do you mean having a highly available service?
MongoDB has some youth issues, I can't say the opposite. But this is definitely NOT a waste of time, and perhaps a long-time solution.
That depends on what you need reliability for. Mongo is very reliable for reading: it has strong availability and sharding features.
OTOH, Mongo writes are not reliable. While most go through, an update is never guaranteed to succeed, and you have to query the database manually to check whether it did.
Thus, Mongo is best used when you have more reads than writes you absolutely need to succeed.
MongoDB would be a good choice. We evaluated and started using MongoDB for our business use cases. MongoDB is giving us better performance than Oracle, and it is also easy to scale horizontally.

NoSQL best practices [closed]

Closed 10 years ago.
What are the best practices for NoSQL Databases, OODBs or whatever other acronyms may exist for them?
For example, I've often seen a field "type" being used for deciding how the DB document (in couchDB/mongoDB terms) should be interpreted by the client, the application.
Where applicable, use PHP as a reference language. Read: I'm also interested in how such data can be best handled on the client side, not only strictly the DB structure. This means practically that I'm also looking for patterns like "ORM"s for SQL DBs (active record, data mapper, etc).
Don't hesitate making statements about how such a DB and the new features of PHP 5.3 could best work together.
I think that currently, the whole idea of NoSQL data stores and the concept of document databases is so new and different from the established ideas which drive relational storage that there are currently very few (if any) best practices.
We know at this point that the rules for storing your data within say CouchDB (or any other document database) are rather different to those for a relational one. For example, it is pretty much a fact that normalisation and aiming for 3NF is not something one should strive for. One of the common examples would be that of a simple blog.
In a relational store, you'd have a table each for "Posts", "Comments" and "Authors". Each Author would have many Posts, and each Post would have many Comments. This is a model which works well enough, and maps fine over any relational DB. However, storing the same data within a docDB would most likely be rather different. You'd probably have something like a collection of Post documents, each of which would have its own Author and collection of Comments embedded right in. Of course that's probably not the only way you could do it, and it is somewhat a compromise (now querying for a single post is fast - you only do one operation and get everything back), but you have no way of maintaining the relationship between authors and posts (since it all becomes part of the post document).
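A small sketch may make the blog example easier to picture. Plain Python lists and dicts stand in for the relational tables and for the document; the field names (author_id, post_id, _id and so on) are illustrative assumptions, not a prescribed schema.

```python
import json

# Relational shape: three "tables" linked by ids.
authors = [{"id": 1, "name": "Ann"}]
posts = [{"id": 10, "author_id": 1, "title": "Hello"}]
comments = [
    {"id": 100, "post_id": 10, "text": "Nice"},
    {"id": 101, "post_id": 10, "text": "+1"},
]

def embed(post):
    """Build the document-store shape: one post with its author and comments inside."""
    author = next(a for a in authors if a["id"] == post["author_id"])
    return {
        "_id": post["id"],
        "title": post["title"],
        "author": {"name": author["name"]},  # copied into the post, not referenced
        "comments": [{"text": c["text"]} for c in comments if c["post_id"] == post["id"]],
    }

doc = embed(posts[0])
print(json.dumps(doc, indent=2))
```

Reading the post is now a single lookup, but the author's name is duplicated in every post document, which is the trade-off described above.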
I too have seen examples making use of a "type" attribute (in a CouchDB example). Sure, that sounds like a viable approach. Is it the best one? I haven't got a clue. Certainly in MongoDB you'd use separate collections within a database, making the type attribute total nonsense. In CouchDB though... perhaps that is best. The other alternatives? Separate databases for each type of document? This seems a bit loopy, so I'd lean towards the "type" solution myself. But that's just me. Perhaps there's something better.
I realise I've rambled on quite a bit here and said very little, most likely nothing you didn't already know. My point is this though: I think it's up to us to experiment with the tools we've got and the data we're working with, and over time the good ideas will spread and become the best practices. I just think you're asking a little too early in the game.
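For what it's worth, the client side of the "type" approach could look roughly like this: the application looks at the attribute and picks a class to build. The sketch is in Python rather than PHP, and the Post and Comment classes and the LOADERS table are made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Post:
    title: str

@dataclass
class Comment:
    text: str

# Maps the document's "type" attribute to a function that builds the matching object.
LOADERS = {
    "post": lambda d: Post(title=d["title"]),
    "comment": lambda d: Comment(text=d["text"]),
}

def hydrate(doc):
    """Turn a raw document (a dict) into an application object based on its type."""
    try:
        loader = LOADERS[doc["type"]]
    except KeyError:
        raise ValueError(f"unknown document type: {doc.get('type')!r}") from None
    return loader(doc)

docs = [
    {"type": "post", "title": "Hello"},
    {"type": "comment", "text": "Nice"},
]
objects = [hydrate(d) for d in docs]
print(objects)
```

This is close to a very small data mapper: the database stays schema-free and the mapping lives entirely in application code.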
"NoSQL" should be more about building the datastore to follow your application requirements, not about building the app to follow a certain structure -- that's more like a traditional SQL approach.
Don't abandon a relational database "just because"; only do it if your app really needs to.

Best SQLite practices on the iPhone [closed]

Closed 10 years ago.
What are some best practices to keep in mind when working extensively with SQLite on the iPhone? Tips/tricks/convenience factors all appreciated.
I can recommend using FMDB as a nice Cocoa SQLite wrapper.
Measure your app's memory footprint and look for leaks in Instruments. Then try it after invoking sqlite3_exec with:
pragma cache_size=1
and/or
pragma synchronous=0
YMMV. There are reports of performance boosts, large reductions in RAM usage, and fewer leaks. However, be careful about making adjustments without understanding the impact (for example, synchronous=0 turns off flushing, which speeds things up a lot but can cause DB corruption if the phone is power-cycled at the wrong time).
More here: http://www.sqlite.org/pragma.html
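As a quick way to try these two settings off the device, here is a sketch using Python's built-in sqlite3 module instead of sqlite3_exec; the pragmas themselves are the same. The temporary file name is an assumption for the example.

```python
import os
import sqlite3
import tempfile

# A throwaway database file, just for experimenting with the settings.
path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)

# Apply the pragmas right after opening the connection, before any other work.
conn.execute("PRAGMA cache_size = 1")   # keep the page cache as small as possible
conn.execute("PRAGMA synchronous = 0")  # OFF: no flush to disk after writes (risky)

# Read the values back to confirm what is in effect for this connection.
cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
print(cache_size, synchronous)
conn.close()
```

Both settings apply per connection, so they have to be issued every time the database is opened.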
Off the top of my head:
Use Transactions.
Make sure your SQL leverages tables in the correct order.
Don't add indexes you're not entirely sure you need.
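To illustrate the first tip, here is a minimal sketch of batching many inserts into one transaction, using Python's built-in sqlite3 module. The persons table and the row data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (person_id INTEGER PRIMARY KEY, last_name TEXT)")

rows = [(i, f"name{i}") for i in range(1000)]

# One transaction around the whole batch: a single commit instead of
# one commit (and, on a file database, one sync to storage) per row.
with conn:
    conn.executemany("INSERT INTO persons VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM persons").fetchone()[0]
print(count)
```

On a device the equivalent is a BEGIN before the loop of inserts and a COMMIT after it.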
Perhaps not only specific to iPhone but to embedded devices there are some great tips here.
This link pertains to an older version of SQLite but still proves useful.
Lastly this Stack Question also has some good info.
We currently use SQLite with a .NET Compact Framework application and its performance is fantastic. We've spent a bit of time optimizing, but not nearly as much as we could.
Best of luck.
I've found that it's often faster to just get the IDs I'm looking for in a complex query and then get the rest of the information on demand.
So for example:
SELECT person_id
FROM persons
WHERE (complex where clause)
and then as each person is being displayed I'll run
SELECT first_name, last_name, birth_date, ...
FROM persons
WHERE person_id = @person_id
I typically find this makes the complex query run in 1/2 the time and the lookups for a given person are typically on the order of 2ms (this is on tables with 17k rows).
Your experience may vary and you should time things yourself.
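Put together, the two-step pattern could be sketched like this, using Python's built-in sqlite3 module for illustration. The table contents and the simple last_name filter standing in for the complex WHERE clause are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons ("
    "person_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, birth_date TEXT)"
)
conn.executemany(
    "INSERT INTO persons VALUES (?, ?, ?, ?)",
    [
        (1, "Ada", "Smith", "1980-01-01"),
        (2, "Bob", "Jones", "1975-05-05"),
        (3, "Cy", "Smith", "1990-09-09"),
    ],
)

# Step 1: the expensive query returns only the ids.
ids = [
    row[0]
    for row in conn.execute(
        "SELECT person_id FROM persons WHERE last_name = ? ORDER BY person_id",
        ("Smith",),
    )
]

# Step 2: fetch the details by primary key only when a row is about to be displayed.
def load_person(person_id):
    return conn.execute(
        "SELECT first_name, last_name, birth_date FROM persons WHERE person_id = ?",
        (person_id,),
    ).fetchone()

first = load_person(ids[0])
print(ids, first)
```

The per-row lookups hit the primary key, so they stay cheap even when the first query is not.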
Also, I have to give credit to Wil Shipley for suggesting this technique in his talk here:
http://www.vimeo.com/4421498.
I actually use the hydration/dehydration pattern extensively from the sqlitebooks which is a superset of this technique.
I am lazy and like to stick in the core code as much as possible, hence I like the ORM tool SQLitePersistentObjects:
http://code.google.com/p/sqlitepersistentobjects/
You make your domain model objects inherit from SQLitePersistentObject (ok a little intrusive) and then you can persist/retrieve your objects as needed.
To persist:
[person save];
Loading it back in is almost as easy. Any persistable object gets dynamic class methods added to it to allow you to search. So, we could retrieve all the Person objects that had a last name of "Smith" like so:
NSArray *people = [PersistablePerson findByLastName:@"Smith"];
One other option I have not tried yet is Core Data (you need to be an Apple iPhone dev), although it's a 3.0 feature, so whether that's an option depends on your app.
PLDatabase is an FMDB alternative: http://code.google.com/p/pldatabase/
I've used it in one of my projects without issue.