NoSQL best practices [closed] - mongodb

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
What are the best practices for NoSQL Databases, OODBs or whatever other acronyms may exist for them?
For example, I've often seen a field "type" being used for deciding how the DB document (in couchDB/mongoDB terms) should be interpreted by the client, the application.
Where applicable, use PHP as a reference language. Read: I'm also interested in how such data can be best handled on the client side, not only strictly the DB structure. This means practically that I'm also looking for patterns like "ORM"s for SQL DBs (active record, data mapper, etc).
Don't hesitate making statements about how such a DB and the new features of PHP 5.3 could best work together.

I think that currently, the whole idea of NoSQL data stores and the concept of document databases is so new and different from the established ideas which drive relational storage that there are currently very few (if any) best practices.
We know at this point that the rules for storing your data within say CouchDB (or any other document database) are rather different to those for a relational one. For example, it is pretty much a fact that normalisation and aiming for 3NF is not something one should strive for. One of the common examples would be that of a simple blog.
In a relational store, you'd have a table each for "Posts", "Comments" and "Authors". Each Author would have many Posts, and each Post would have many Comments. This is a model which works well enough, and maps fine over any relational DB. However, storing the same data within a docDB would most likely be rather different. You'd probably have something like a collection of Post documents, each of which would have its own Author and collection of Comments embedded right in. Of course that's probably not the only way you could do it, and it is somewhat of a compromise: querying for a single post is now fast (you only do one operation and get everything back), but you have no way of maintaining the relationship between authors and posts (since it all becomes part of the post document).
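To make the blog example concrete, here is a rough sketch in Python (used only as a neutral illustration language; the data, field names and the `embed_post` helper are all made up). It builds one embedded post document out of three relational-style lists:

```python
import json

# Relational shape: three "tables" linked by ids (illustrative data only).
authors = [{"id": 1, "name": "alice"}]
posts = [{"id": 10, "author_id": 1, "title": "Hello"}]
comments = [
    {"id": 100, "post_id": 10, "text": "First!"},
    {"id": 101, "post_id": 10, "text": "Nice post"},
]

def embed_post(post_id):
    """Build one self-contained post document from the three tables."""
    post = next(p for p in posts if p["id"] == post_id)
    author = next(a for a in authors if a["id"] == post["author_id"])
    return {
        "_id": post["id"],
        "title": post["title"],
        "author": {"name": author["name"]},  # copied in, not referenced
        "comments": [
            {"text": c["text"]} for c in comments if c["post_id"] == post_id
        ],
    }

doc = embed_post(10)
print(json.dumps(doc, indent=2))
```

Note that the author's name is copied into the document, which is exactly the trade-off described above: one read returns everything, but nothing ties this post back to a shared author record.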
I too have seen examples making use of a "type" attribute (in a CouchDB example). Sure, that sounds like a viable approach. Is it the best one? I haven't got a clue. Certainly in MongoDB you'd use separate collections within a database, making the type attribute total nonsense. In CouchDB though... perhaps that is best. The other alternatives? Separate databases for each type of document? This seems a bit loopy, so I'd lean towards the "type" solution myself. But that's just me. Perhaps there's something better.
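For what it's worth, the client side of the "type" approach can be sketched roughly like this. It is a minimal data-mapper-style illustration in Python, not an established pattern from any library; the `REGISTRY` mapping, the `hydrate` function and the two classes are hypothetical names:

```python
from dataclasses import dataclass

@dataclass
class Post:
    title: str

@dataclass
class Comment:
    text: str

# Hypothetical registry mapping the "type" field to a Python class.
REGISTRY = {"post": Post, "comment": Comment}

def hydrate(doc):
    """Turn a raw document (a dict) into an object chosen by its "type"."""
    fields = dict(doc)                   # work on a copy, leave the input alone
    cls = REGISTRY[fields.pop("type")]   # KeyError for unknown or missing types
    fields.pop("_id", None)              # storage metadata, not a model field
    return cls(**fields)

docs = [
    {"_id": "1", "type": "post", "title": "Hello"},
    {"_id": "2", "type": "comment", "text": "First!"},
]
objects = [hydrate(d) for d in docs]
print(objects)
```

With separate collections per kind of document, the registry lookup would be keyed on the collection name instead and the "type" field would not be needed.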
I realise I've rambled on quite a bit here and said very little, most likely nothing you didn't already know. My point is this though: I think it's up to us to experiment with the tools we've got and the data we're working with, and over time the good ideas will spread and become the best practices. I just think you're asking a little too early in the game.

"NoSQL" should be more about building the datastore to follow your application requirements, not about building the app to follow a certain structure -- that's more like a traditional SQL approach.
Don't abandon a relational database "just because"; only do it if your app really needs to.

Related

Updating Database from iOS app [closed]

Closed 9 years ago.
Hi I am newbie to iOS programming,
I am building an iOS app to display a catalogue (product name, item no., description, and image) of the products. After the user installs the app on the iOS device, there may be updates to the product list. The user will not be able to modify any data in the database.
Can someone give me an idea of what kind of database I should use (SQLite, JSON or Core Data) and how I can let the update happen? Should I update just the new/modified records, or update the complete database each time?
In some of the apps I have seen on the App Store, the app downloads the entire (latest version) of the database the first time the app is loaded.
Thanks in advance to all the friends out there in the community. your suggestions, codes, examples and any reference materials and links will be of great help.
Cheers!!
A couple of thoughts:
Apple's recommended framework is Core Data. Especially if you have a lot of data, it's probably worth familiarizing yourself with it.
Direct SQLite programming can have its advantages, but unless you have a compelling reason to pursue it (and I don't see anything suggesting this in your question), stick with Core Data. If you do decide to use SQLite, consider using the FMDB Objective-C wrapper for SQLite.
If you're dealing with a trivial amount of data (e.g. a dozen records), Core Data is probably overkill and you could just use a property list (plist). For example, if you have downloaded your JSON into an NSArray or NSDictionary, you can then just do writeToFile to save it, and dictionaryWithContentsOfFile or arrayWithContentsOfFile to read it back at a future date.
JSON is generally considered more of a mechanism for exchanging data with a server. I wouldn't be inclined to store data locally in JSON format (though you could). I'd use a plist instead.
By the way, it's generally not advised to store the images themselves in your CoreData/SQLite database. If you have larger files, for performance reasons, we often store them in the iOS File System (and save some reference to the file name in the database).
You mention that you have seen apps that download the entire (latest version) of a database. A more sophisticated implementation (critical with larger databases) would entail downloading updates (edits, deletions, insertions) rather than the full database. For a small database, you can get away with the solution you propose (and it certainly makes it easier), but as your app becomes more sophisticated, you will want to consider more elegant server integration.
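The "download only the updates" idea could look roughly like the sketch below. It is written in Python against SQLite purely for illustration (on iOS you would do the equivalent through Core Data or FMDB). The table layout, the shape of the change records and the `apply_changes` function are all assumptions, not a format any particular server uses:

```python
import sqlite3

def apply_changes(conn, changes):
    """Apply a list of change records to the local products table atomically."""
    with conn:  # one transaction: commit on success, roll back on error
        for change in changes:
            if change["op"] == "delete":
                conn.execute("DELETE FROM products WHERE item_no = ?",
                             (change["item_no"],))
            else:  # "upsert" covers both insertions and edits
                conn.execute(
                    "INSERT OR REPLACE INTO products (item_no, name, description)"
                    " VALUES (?, ?, ?)",
                    (change["item_no"], change["name"], change["description"]),
                )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (item_no TEXT PRIMARY KEY, name TEXT, description TEXT)"
)
conn.execute("INSERT INTO products VALUES ('A1', 'Lamp', 'Desk lamp')")
conn.execute("INSERT INTO products VALUES ('B2', 'Chair', 'Old chair')")
conn.commit()

# Hypothetical payload a server might return for "changes since my last sync".
changes = [
    {"op": "upsert", "item_no": "A1", "name": "Lamp", "description": "LED desk lamp"},
    {"op": "delete", "item_no": "B2"},
    {"op": "upsert", "item_no": "C3", "name": "Table", "description": "Oak table"},
]
apply_changes(conn, changes)
rows = conn.execute("SELECT item_no, name FROM products ORDER BY item_no").fetchall()
print(rows)
```

Wrapping the whole batch in one transaction means a failed download or a malformed record leaves the local catalogue as it was before the sync.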

Why is there no ActiveRecord REST adapter [closed]

Closed 10 years ago.
We've been working on various projects using ActiveResource for a couple years now. It seems that ActiveResource is great to use if you are using Rails on both the client and server sides.
Problems with ActiveResource
We're using Scala on the server side and constantly running up against "Oh, ActiveResource doesn't want the API to <do this standard thing>" or "ActiveResource does <this weird thing>" so we have to change the server to support the demands of the client. This has all been discussed before, I know.
Another problem is that many gems and libraries require ActiveRecord. I can't count the number of gems we've run into that "require" your model to use ActiveRecord even though they don't actually use the actual AR functionality. It seems this is mostly because that's the easy path for gem development. "I'm using ActiveRecord, and can't imagine anyone not using it, so I'll just require that rather than figure out the more general way" (note, I've done this myself, so I'm not simply complaining)
So, if we use ActiveResource, we have to break the server to make it work, and we can't use a large portion of what makes Rails great.
REST Adapter?
All of this brought us to ask the question "Why does ActiveResource exist at all?" I mean, why would you have this secondary data storage path? Why isn't ActiveResource just a REST adapter? With a REST adapter, you can have all the good things in all the gems, and don't have to fight with ActiveResource's finicky nature. You just build your model the same way you build any model.
So I started exploring building one. It actually doesn't seem difficult at all. A few hours work and you could have the basic functionality built up. There are examples elsewhere using REST and SOAP, so it's doable.
So the question comes back. If it's so easy, why the hell hasn't this been done before?
Not simply a datastore?
I've come up with what I wonder is the answer. While building up a scaffold for this, I quickly ran into an issue. REST is very good at doing two things: 1) Act on this object, and 2) Act on all objects. In fact, that's pretty much the limit of the REST standard.
And so I started to wonder if scope is the reason there's no REST adapter. The ActiveRecord subsystem seems to need more than just "get one" and "get all." It's based on querying the datastore as much as anything.
The Actual Question
So, the actual question: Is there no ActiveRecord REST adapter simply because REST defines no standardized way to say "give me all of the cars where the car is in the same parking lot as these drivers and the drivers have a key."
That's right.
Unlike databases which have SQL as a standard way to express conditions, there is no standard in REST to support all the ActiveRecord functions, such as joins, group, and having.
How many hours work do you think it would take to do correlated queries or sub-queries?
I'm not being casually dismissive here. This concept touches on some personal projects I've considered, with some of the same issues I've been thinking through.
ActiveRecord supports all of SQL, which is way more powerful than most people use or need. Basically every part of an SQL statement has an ActiveRecord method which takes a string to fill in that section of the SQL.
You'd want to limit the client to the part of ActiveRecord people actually use. You'd still need IS NULL, and IS NOT NULL. You'd need comparisons such as less than and greater than. You'd want to support OR statements, for "field1 IS NULL OR field1 = ''".
To do all the comparison stuff, like where(["updated_at > ?", cutoff]) you would need a RESTful server more robust than existing web services. The server would have to use your gem, or be built with guidelines for implementing all the functionality.
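To illustrate the gap, here is one way a client might squeeze simple comparisons into a query string. This is a sketch in Python with an invented convention (the `__gt`, `__lt` and `__isnull` suffixes and the `to_query` function are hypothetical); no REST standard defines these, which is the whole problem:

```python
from urllib.parse import urlencode

# Hypothetical operator suffixes; client and server would both have to agree on them.
OPS = {"eq": "", "gt": "__gt", "lt": "__lt", "isnull": "__isnull"}

def to_query(conditions):
    """Encode (field, op, value) triples as a URL query string."""
    pairs = [(field + OPS[op], str(value)) for field, op, value in conditions]
    return urlencode(pairs)

query = to_query([
    ("updated_at", "gt", "2010-01-01"),
    ("field1", "isnull", "true"),
])
print("/cars?" + query)
```

Even this toy version only expresses conditions that are ANDed together. OR, joins, grouping and sub-queries would each need yet another private convention that the server must also implement.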
So, in the end why? You're implementing a limited database API, going over the network with string URLs instead of binary packets, to a database engine that you are implementing.
On the other hand, if there was a standard for this, there might be good benefit to such an implementation.
If there were an implementation that one could install on a RESTful web server and that, while maybe not as powerful as SQL, could do indexed queries, post-index processing of simple non-indexable expressions to qualify the records returned, and sorting (even if it passed all the work on to an SQL database), one could enable it on a server, and products like Crystal Reports could be developed to use the standard as a report client.
Going through the web server API layer could provide a way to enforce restrictions on which database operations may be performed, giving more security than fully opening up database access. Also, logic could be added to the web service to audit and process events resulting from the CRUD operations (essentially triggers). Yes, database products supply security policies, triggers, and stored procedures to do these things, but with the product we are discussing, one could do it more easily in Ruby than with the database functions.
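As a rough idea of what "enforcing restrictions" in the API layer might mean, the sketch below only lets whitelisted fields and operators through before anything reaches the database. It is in Python rather than Ruby just for illustration, and `ALLOWED_FIELDS`, `ALLOWED_OPS` and `build_where` are hypothetical names:

```python
# Hypothetical whitelists: only these fields and comparisons are exposed.
ALLOWED_FIELDS = {"make", "year", "updated_at"}
ALLOWED_OPS = {"eq": "=", "gt": ">", "lt": "<"}

def build_where(conditions):
    """Return (sql_fragment, params), rejecting anything not whitelisted."""
    clauses, params = [], []
    for field, op, value in conditions:
        if field not in ALLOWED_FIELDS:
            raise ValueError(f"field not allowed: {field}")
        if op not in ALLOWED_OPS:
            raise ValueError(f"operator not allowed: {op}")
        clauses.append(f"{field} {ALLOWED_OPS[op]} ?")  # values stay as bound parameters
        params.append(value)
    return " AND ".join(clauses), params

sql, params = build_where([("year", "gt", 2005), ("make", "eq", "Volvo")])
print(sql, params)
```

The same function is a natural place to hang auditing, since every query a client makes has to pass through it.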
One could also have pseudo data, which is calculated by Ruby code but acts like database records, alongside the general RESTful DB access. Sure, databases can do this with stored procedures, and some support writing stored procedures in Java, but this would be better because it would be easier to implement and could be written in Ruby.

When to use NoSql, and which one? [closed]

Closed 11 years ago.
I've been programming with php and mySql for a while now and recently decided that I wanted to give nosql a try. I would really appreciate if some of you with experience could tell me:
When is it a good time to switch, how do I know nosql is for me?
Which nosql software would you recommend?
Thanks
When is it a good time to switch?
It really depends on the particular project. But in general I see that I can use nosql for 95% of web applications. I will still use old good sql for the systems which should guarantee ACID (for example, systems that work with 'real' money).
How do I know nosql is for me?
It's for you, for you, believe me. ;)
You just need to try something from the nosql world, read some existing articles and you will see all of the benefits and problems.
Which nosql software would you recommend?
I would personally recommend you start with MongoDB, because it is really simple. Becoming an expert in SQL takes years; becoming an expert in MongoDB takes a month or so.
I suggest that you spend an hour reading "The Little MongoDB Book" and try to write your first test application tomorrow.
No one here will say that you need to use this, or this database. What database to use depends on project and requirements.
It depends on your application's needs. There are a lot of options.
You can use a document-oriented database like MongoDB, an "extended" key-value store like Redis, or maybe a graph-oriented database like Neo4j.
This article is very useful http://highscalability.com/blog/2010/12/6/what-the-heck-are-you-actually-using-nosql-for.html
This recent blog post in High Scalability pretty much answers your question in regards to when to use NoSQL.
I myself always go MySQL until it fails me and then choose the right tool for the job, some of the non-relational databases I worked with are:
Riak: a dynamo clone which is useful when you need to access records quickly but you have too many records to keep on one machine. For instance a recommender system for users in a web application, you want to access the data in a few milliseconds but you have 200M users.
MongoDB: a document-based database, for when I needed speed but had a write-intensive application (read/write ratio close to 1:1). The data was highly transient, so I didn't care about the durability issues.
The best time to switch is when you:
Start working on a new project and you make your first architectural decisions. Porting an existing application to a different database can cause a lot of headaches.
Hit a brick wall... or better before you see one coming :) The main reason is usually lack of performance or scalability.
Need a missing feature (e.g. complex hierarchical structures, graph-like traversals, etc.).
I would recommend a lot of them, but each of them has their own sweet-spots where they shine and other parts where they lack features. The only way to pick the right tool(s) is to get familiar with a couple of them.
Web developers usually learn key-value stores (memcached, redis) first as they can fix a lot of performance problems (but also add some complexity to your app...).
There are document (schema-less) databases like MongoDB or CouchDB which can significantly enhance your productivity if your data model often changes.
For graph traversals there is Neo4j.
For "big data" there is Hadoop and its related projects.
And the list goes on and on...

Is MongoDB reliable? [closed]

Closed 9 years ago.
I am developing a dating website and I am thinking of using a NoSQL database to store the profiles etc. I am currently looking into MongoDB and so far I am very pleased. My only worry is that I have read on various websites that MongoDB is unreliable and not good.
I looked into the NoSQL alternatives and found none that fully meets my specific criteria:
Easy to learn and use.
Fully compatible with PHP out of the box.
Fast and well documented.
What do you think, am I doing the right thing to go with MongoDB or is it a waste of time?
Thankful for all input in the matter!
I researched MongoDB for my social service startup and it is definitely worth considering. MongoDB has a powerful feature set that makes it a realistic and strong alternative to RDBMS solutions.
Amongst them:
Document Database: Most of your data is embedded in a document, so in order to get the data about a person, you don't have to join several tables. Thus, better performance for many use cases.
Strong Query Language: Despite not being a RDBMS, MongoDB has a very strong query language that allows you to get something very specific or very general from a document or documents. The DB is queried using javascript so you can do many more things beside querying (e.g. functions, calculations).
Sharding & Replication: Sharding allows your application to scale horizontally rather than vertically. In other words, more small servers instead of one huge server. And replication gives you fail-over safety in several configurations (e.g. master/slave).
Powerful Indexing: I originally got interested in MongoDB because it allows geo-spatial indexing out of the box but it has many other indexing configurations as well.
Cross-Platform: MongoDB has many drivers.
As for the documentation, there is no deluge of it, but that is because this project only started in 2009; soon there will be plenty more. However, there is enough to get started with your project. In addition, you can check out Kyle Banker's MongoDB in Action, a great resource.
Lastly, I had experience only with RDBMSs prior to MongoDB, didn't know JavaScript or JSON, and still found it to be very simple and elegant.
Consider this related question on MongoDB and CouchDB - Fit for Production?
MongoDB has a showcase of Production Deployments as well. Be sure to analyze the uses of MongoDB rather than the size of the company.
Any software can be reliable or unreliable. MongoDB has replica sets, which give you hardware failover capabilities. You can take backups on a regular basis, which gives you a recovery interval, and you get sharding which can give you some modicum of redundancy, especially when combined with replica sets.
The issue isn't whether or not the technology is reliable, the issue is whether or not you have a well-defined backup and recovery plan that suits your platform of choice.
If MongoDB suits your needs, you're making the right choice. Just make sure to investigate what you can do to increase your reliability.
If it's good enough for Foursquare, it's most likely good enough for you.
I come from an RDBMS background (12 years) and have spent the last 6 months looking at NoSQL options. For your scenario, MongoDB sounds like a good choice. What I am hearing from those who have worked with MongoDB in production for some time is that you should follow these best practices:
Keep key size small
Evaluate (and possibly add) indexes to speed up queries
Pay attention to schema (I know this seems strange for a 'schemaless' database, but I have heard it several times)
Use replica sets
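On the "keep key size small" point: a document database stores the field names inside every document, so long names are paid for once per document rather than once per table. A quick back-of-the-envelope sketch in Python, using compact JSON length as a rough stand-in for the stored size (an assumption; MongoDB actually stores BSON, and the field names here are made up):

```python
import json

def encoded_size(doc, count):
    """Bytes needed to store `count` copies of doc as compact JSON."""
    return count * len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))

long_keys = {"first_name": "Ann", "last_name": "Lee", "date_of_birth": "1980-01-01"}
short_keys = {"fn": "Ann", "ln": "Lee", "dob": "1980-01-01"}

# Same values, different key names: the difference is pure key overhead.
saved = encoded_size(long_keys, 1_000_000) - encoded_size(short_keys, 1_000_000)
print(f"bytes saved across 1M documents: {saved}")
```

Whether the saving is worth the loss of readability is a judgment call; it mostly matters for very large collections of small documents.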
Here's a video of a best-practices talk from the MongoDB LA User Group that I find useful
10gen, the company behind MongoDB, provides an official PHP driver.
As Jeremiah says, they implemented replica sets in the latest version (1.6.0) and have already debugged them (1.6.1, with the next version, 1.6.2, due in a few weeks).
Moreover, the free support from the company and the community is very fast and efficient (by 'free' I mean questions on the Google group: http://groups.google.com/group/mongodb-user?pli=1)
Well, some other points about reliability:
The community reacts extremely fast if you meet any critical issue.
You need to think about what you expect from "reliability": do you need a guarantee that your data is stored safely and never gets corrupted?
In this case, you will have to compare the cost of buying reliable hardware and deploying MongoDB replica sets.
Or do you mean having a highly available service?
MongoDB has some youth issues, I can't say the opposite. But this is definitely NOT a waste of time, and perhaps a long-time solution.
That depends on what you need reliability for. Mongo is very reliable for reading: it has strong availability and sharding features.
OTOH, Mongo writes are not reliable. While most go through, an update is never guaranteed to succeed, and you have to query the database manually to check whether it did.
Thus, Mongo is best used when you have more reads than writes that you absolutely need to succeed.
MongoDB would be a good choice. We evaluated and started using MongoDB for our business use cases. MongoDB is giving us better performance than Oracle, and it is also easy to scale horizontally.

Best SQLite practices on the iPhone [closed]

Closed 10 years ago.
What are some best practices to keep in mind when working extensively with SQLite on the iPhone? Tips/tricks/convenience factors all appreciated.
I can recommend using FMDB as a nice Cocoa SQLite wrapper.
Measure your app's memory footprint and look for leaks in Instruments. Then try it after invoking sqlite3_exec with:
pragma cache_size=1
and/or
pragma synchronous=0
YMMV. There are reports of performance boosts, large reductions in RAM usage, and fewer leaks. However, be careful about making adjustments without understanding the impact (for example, synchronous=0 turns off flushing, which speeds things up by a lot but can cause DB corruption if the phone is power-cycled at the wrong time).
More here: http://www.sqlite.org/pragma.html
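If you want to experiment with the two pragmas away from the device first, a small sketch using Python's built-in sqlite3 module shows them being set and read back (the database path is just a throwaway temp file; on the iPhone you would issue the same statements through sqlite3_exec):

```python
import os
import sqlite3
import tempfile

# Throwaway on-disk database so the settings apply to a real file.
path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)

# Shrink the page cache to a single page and turn off synchronous flushing.
conn.execute("PRAGMA cache_size = 1")
conn.execute("PRAGMA synchronous = 0")

# Read the values back to confirm they took effect on this connection.
cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
print(cache_size, synchronous)
conn.close()
```

Both settings apply per connection, so they have to be issued again every time the database is opened.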
Off the top of my head:
Use Transactions.
Make sure your SQL leverages tables in the correct order.
Don't add indexes you're not entirely sure you need.
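For the "use transactions" tip, the usual win is batching many inserts into one transaction so SQLite commits once instead of once per row. A minimal sketch with Python's sqlite3 module (the `persons` table and the generated rows are only illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (person_id INTEGER PRIMARY KEY, first_name TEXT)")

rows = [(i, f"name{i}") for i in range(1000)]

# One transaction around the whole batch: a single commit instead of 1000.
with conn:
    conn.executemany("INSERT INTO persons VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM persons").fetchone()[0]
print(count)
```

In the C API the equivalent is issuing BEGIN before the loop of inserts and COMMIT after it.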
Perhaps not only specific to iPhone but to embedded devices there are some great tips here.
This link pertains to an older version of SQLite but still proves useful.
Lastly this Stack Question also has some good info.
We currently use SQLite with a .NET Compact Framework application and its performance is fantastic; we've spent a bit of time optimizing, but not nearly as much as we could.
Best of luck.
I've found that it's often faster to just get the IDs I'm looking for in a complex query and then get the rest of the information on demand.
So for example:
SELECT person_id
FROM persons
WHERE (complex where clause)
and then as each person is being displayed I'll run
SELECT first_name, last_name, birth_date, ...
FROM persons
WHERE person_id = @person_id
I typically find this makes the complex query run in 1/2 the time and the lookups for a given person are typically on the order of 2ms (this is on tables with 17k rows).
Your experience may vary and you should time things yourself.
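The two-step pattern above, sketched end to end with Python's sqlite3 module so it can be timed off-device (the sample rows and the birth_date filter standing in for the "complex where clause" are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons (person_id INTEGER PRIMARY KEY,"
    " first_name TEXT, last_name TEXT, birth_date TEXT)"
)
conn.executemany("INSERT INTO persons VALUES (?, ?, ?, ?)", [
    (1, "Ann", "Lee", "1975-03-02"),
    (2, "Bob", "Ray", "1990-07-15"),
    (3, "Cy", "Orr", "1968-11-30"),
])
conn.commit()

# Step 1: the (stand-in) complex query returns ids only.
ids = [row[0] for row in conn.execute(
    "SELECT person_id FROM persons WHERE birth_date < ? ORDER BY person_id",
    ("1980-01-01",),
)]

def load_person(person_id):
    """Step 2: fetch the full row for one id, only when it is displayed."""
    return conn.execute(
        "SELECT first_name, last_name, birth_date FROM persons WHERE person_id = ?",
        (person_id,),
    ).fetchone()

first = load_person(ids[0])
print(ids, first)
```

The second query is a primary-key lookup, which is why it stays cheap no matter how expensive the first one is.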
Also, I have to give credit to Wil Shipley for suggesting this technique in his talk here:
http://www.vimeo.com/4421498.
I actually use the hydration/dehydration pattern extensively from the sqlitebooks which is a superset of this technique.
I am lazy and like to stick in the core code as much as possible, hence I like the ORM tool SQLitePersistentObjects:
http://code.google.com/p/sqlitepersistentobjects/
You make your domain model objects inherit from SQLitePersistentObject (ok a little intrusive) and then you can persist/retrieve your objects as needed.
To persist:
[person save];
Loading it back in is almost as easy. Any persistable object gets dynamic class methods added to it to allow you to search. So, we could retrieve all the Person objects that had a last name of "Smith" like so:
NSArray *people = [PersistablePerson findByLastName:@"Smith"];
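To give a feel for how such dynamic finders can work under the hood, here is a toy version in Python. It is not how SQLitePersistentObjects is implemented, just the same idea in miniature; the `Finder` metaclass, the `Person` class and the `find_by_<column>` naming are all invented for this sketch:

```python
import sqlite3

class Finder(type):
    """Metaclass that resolves find_by_<column> lookups on the class."""
    def __getattr__(cls, name):
        prefix = "find_by_"
        column = name[len(prefix):]
        # Only known columns are accepted, so nothing unchecked reaches the SQL.
        if name.startswith(prefix) and column in cls.columns:
            def finder(value):
                cols = ", ".join(cls.columns)
                sql = f"SELECT {cols} FROM {cls.table} WHERE {column} = ?"
                return [cls(*row) for row in cls.conn.execute(sql, (value,))]
            return finder
        raise AttributeError(name)

class Person(metaclass=Finder):
    table = "persons"
    columns = ("first_name", "last_name")
    conn = sqlite3.connect(":memory:")

    def __init__(self, first_name, last_name):
        self.first_name, self.last_name = first_name, last_name

    def save(self):
        with self.conn:  # commit this insert on success
            self.conn.execute("INSERT INTO persons VALUES (?, ?)",
                              (self.first_name, self.last_name))

Person.conn.execute("CREATE TABLE persons (first_name TEXT, last_name TEXT)")
Person("Ann", "Smith").save()
Person("Bob", "Jones").save()
people = Person.find_by_last_name("Smith")
print([p.first_name for p in people])
```

The finder is generated the first time it is asked for, which mirrors the "dynamic class methods" described above without any code per column.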
One other option I have not tried yet is Core Data (you need to be an Apple iPhone dev), although it's a 3.0 feature, so it depends on your app whether that's an option.
PLDatabase is an FMDB alternative: http://code.google.com/p/pldatabase/
I've used it in one of my projects without issue.