MongoDB as the primary database?

I have read a lot about MongoDB.
I like all the features it provides, but I wonder if it's possible to have it as the only database for my application, including storing sensitive information.
I know that it compromises the durability part of ACID, but as a solution I will have one master and two slaves in different locations.
If I do that, is it possible to use it as the primary database, storing everything?
UPDATE:
Let's put it this way.
I really need document storage rather than a traditional DBMS to be able to build my flexible application. But is MongoDB reliable enough to store sensitive customer information if I have multiple database replicas and master-slave replication? Because as far as I know, one major downside is that it compromises the D in ACID. So I would solve that with multiple databases.
With that setup, are there no major problems, such as data-loss issues?
And someone told me that with MongoDB a customer could be billed twice. Could someone shed some light on this?

Yes, MongoDB can work as an application's only data store. Use replication and make sure you know about safe writes and the w (write concern) parameter.
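For illustration, here is a minimal Scala sketch of what "safe writes" and w mean in practice, using the MongoDB Java driver directly (the hostnames, database and collection names are made up); the write concern controls how many replica-set members must acknowledge a write before the call returns:

import com.mongodb.{BasicDBObject, MongoClient, ServerAddress, WriteConcern}
import scala.collection.JavaConverters._

// Connect to the replica set (one primary, two secondaries in your proposed setup)
val client = new MongoClient(List(
  new ServerAddress("db1.example.com"),
  new ServerAddress("db2.example.com"),
  new ServerAddress("db3.example.com")).asJava)

val invoices = client.getDB("app").getCollection("invoices")

// Require acknowledgement from a majority of replica-set members,
// so an acknowledged write survives the loss of any single node
invoices.setWriteConcern(WriteConcern.MAJORITY)
invoices.insert(new BasicDBObject("customerId", 42).append("amount", 9.99))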
And someone told me that with MongoDB a customer could be billed
twice. Could someone enlighten this?
I'm guessing whoever told you that was talking about eventual consistency. Some NoSQL databases are eventually consistent, meaning that you could have conflicting writes. MongoDB is not, though; it is strongly consistent (just like a relational database).
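If the double-billing worry is about an application-level race rather than the database itself, one common pattern is to claim the invoice with an atomic findAndModify so that only one process ever sees it in the unbilled state. This is only a hedged Scala sketch using the Java driver; the invoice document shape and names are made up:

import com.mongodb.{BasicDBObject, MongoClient}

val invoices = new MongoClient("localhost").getDB("app").getCollection("invoices")

// Atomically flip billed=false -> billed=true; findAndModify returns null
// if another process already claimed (and billed) this invoice
val query   = new BasicDBObject("_id", "invoice-123").append("billed", false)
val update  = new BasicDBObject("$set", new BasicDBObject("billed", true))
val claimed = invoices.findAndModify(query, update)
if (claimed != null) {
  // safe to charge the customer exactly once here
}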

Your application being flexible or not has absolutely nothing to do with whether you use "nosql", a "document db" or a proper RDBMS. Nobody using your application will care either way.
If you need flexibility while coding, you should look into frameworks like ActiveRecord for Ruby, which can make interfacing with the database much simpler, more generic and more powerful. At that level you can gain a lot more than just by changing the DB, and you can even become DB-agnostic, meaning you can change the DB without changing any code. Indeed, I have found ActiveRecord to boost my productivity many fold by freeing me from tedious and error-prone code intermixed with SQL.
Indeed, if you feel you need a schemaless database for critical data, you are doing something wrong. You are putting your own convenience over the critical needs of the project, or ignorantly assuming you won't run into problems later. Sooner or later, lack of consistency will bite you in the ass, hard!
I feel you are hesitant towards an RDBMS because you are not really comfortable with all the jargon, syntax and sound CS principles.
Believe me, if you're going to create an application of any value, you are a hundred times better off learning SQL, ACID and good database principles in the first place. Just read up on the basics and build your knowledge from wherever you are now. It's the same for each and every one of us; it takes time, but you learn to do things right from the start.
Low-level tools like MongoDB and equivalent just provide you with infinitely more ammunition to shoot yourself in the foot with. They make it seem easy. In reality however, they leave the hard work for you, the coder, to deal with, while an RDBMS will handle more of the cruft for you once you grok the basics.
Why use computers at all? If you want more work, you can just go back to paper. Design will be a breeze, and implementation can be super quick. Done. Except it won't be right, of course.
In the real world, we can't afford to ignore consistency, database design and many more sound CS principles. Which is why it's a great idea to study them in the first place, and keep learning more and more.
Don't buy into the hype. You ask a question about MongoDB here, but state that you really need its features. With 25 years of computer experience, I simply don't buy it. If you think creatively, an RDBMS can be made to do whatever you want it to, or a framework can be used to save you from errors and wasted time.
Crafting ACID properties onto MongoDB seems like more work to me and, in my experience, sounds like an exercise in futility, rather than using what is already designed to suit such purposes.

Is it possible? Sure. Why not? It's possible to store everything as XML in text files if you wanted to.
Is it the best idea? It depends on what you're trying to do, the rest of your system architecture, the scale your site is intended to achieve, and so on.

How can I abstract database repository from data service?

I was reading this article https://www.infoq.com/articles/spring-data-intro to understand how a data service layer can be made independent of the database (RDBMS/NoSQL). It looks like there's no way to design entities and repositories to be independent of the database. The article was written in 2012. Have any technologies implemented this feature since then?
Before actually answering your question I have to ask: why do you want to do that? Think twice, because abstraction comes at a cost, and doing it just to have a "clean" design is most certainly not worth it.
Now to your question:
There is no library or framework that does this out of the box.
Probably the thing that comes closest is Spring Data, which you are obviously aware of. If you stick to the persistence-store-independent interfaces, your repositories will abstract over the persistence store in use, at least to some extent. BUT: you will have to provide different kinds of metadata (typically as annotations on the entities) to make it work. So in this sense, it is a really leaky abstraction.
Of course, you can roll your own: create an interface with the operations you need and provide implementations for the different data stores you want to use. Also include a storage-independent way of providing metadata.
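Rolling your own could look roughly like the following Scala sketch (UserRepository, User and the in-memory implementation are made-up names for illustration, not taken from any framework); the store-independent part is an ordinary interface, and each backing store gets its own implementation:

trait UserRepository {
  def findById(id: String): Option[User]
  def save(user: User): Unit
}

case class User(id: String, name: String)

// One implementation per store; a MongoUserRepository or JdbcUserRepository
// would hide its own metadata (collection names, SQL schema, ...) the same way
class InMemoryUserRepository extends UserRepository {
  private val users = scala.collection.mutable.Map.empty[String, User]
  def findById(id: String): Option[User] = users.get(id)
  def save(user: User): Unit = users(user.id) = user
}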
So the question becomes: why hasn't anybody done that yet, and why should you probably not do it either?
It is hard: just writing SQL in a way that all (relevant) SQL databases understand is difficult; see this for an example.
You lose a lot of the power of your stores. For example, RDBMSes are great at joining stuff, but joining is basically a no-go for many NoSQL databases, so your API probably shouldn't offer this feature. This dumbs everything down to the common denominator, which is going to be really small when you have very different data stores.
It is not worth it. This leads back to my opening question: why would you want to do it anyway? I certainly see the use of switching between different RDBMSes. Some companies only want certain vendors in their datacenter, it is nice to have an in-memory variant for testing, and so on. But switching a store from, say, Oracle to Hazelcast, and then to MongoDB, and then to CSV? Why would you want to do that? What is the business value of that?

Using PostgreSQL or PostgreSQL + MongoDB?

I'm currently planning a social-media application - especially the backend.
Basically I have all the social aspects, for which I want to use SQL (PostgreSQL, I guess), but I also have geolocations organized in lists (so many-to-one), which will probably make up the biggest amount of data. I know that PostgreSQL has modules for GIS capabilities, and my initial thought was to just use PostgreSQL for everything, for the sake of simplicity and because the performance of geolocation searches should be around the same for both systems, if not in favor of PostgreSQL. I can also use the JSON type in PostgreSQL, so it basically has the most obvious advantages of MongoDB covered.
On the other hand, I'm afraid of scalability, as the geolocations are going to be the biggest chunk of data and the tables are probably going to have heaps of rows.
So my thought now is to implement geolocations in MongoDB, with its easy scalability and easy-to-use geolocation search, and to embed e.g. comments/likes for a geolocation directly into the document, which would make geolocation reads/searches much easier. But then again I would have to combine this data with the social data from SQL, e.g. fetch all users who commented on a geolocation, get their profile info from PostgreSQL and combine it "manually", even though parts of this could be done on the frontend, saving me a lot of resources.
I'm not sure how well this idea performs and whether I'm really doing myself a favor here.
tldr: Use PostgreSQL.
Long answer:
You are trying to pre-optimize for a problem you don't even know you will have. You don't know how many geolocations you will have or what your users' usage behaviors will be, and you probably don't even have any users yet.
I've used MongoDB before and migrated to PostgreSQL. There are many, many features and benefits to using a 'real' database for storing highly structured data. I suggest googling around for 'PostgreSQL vs X' articles, but the overall consensus that I've found is that PGSQL is extremely mature, reliable, performant and supported.
From my personal experience using Mongo then switching to PGSQL, I will never use Mongo again unless PGSQL (or another full-fledged SQL database) is completely falling over and I've spent months fixing it. Even then I'd take a hard look at other NoSQL databases too. PGSQL has so many amazing features and powerful tools that make it a joy to use.
The seemingly few things you think you need Mongo for, PGSQL can also do, just as well or better. It has native JSON types with indexes, geo support, full-text indexing, etc. PGSQL has been around longer and has more support (useful for debugging, performance tuning, etc.).
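As a small illustration of the JSON support mentioned above, here is a rough Scala sketch over plain JDBC (it assumes the PostgreSQL JDBC driver is on the classpath; the connection details, table and column names are all made up):

import java.sql.DriverManager

val conn = DriverManager.getConnection(
  "jdbc:postgresql://localhost:5432/app", "app", "secret")
val stmt = conn.createStatement()

// A jsonb column holds the flexible part of each geolocation record
stmt.execute("""CREATE TABLE IF NOT EXISTS places (
    id serial PRIMARY KEY,
    lat double precision,
    lon double precision,
    details jsonb)""")

// Query inside the JSON document with PostgreSQL's ->> operator
val rs = stmt.executeQuery(
  "SELECT id, details->>'name' AS name FROM places WHERE details->>'category' = 'cafe'")
while (rs.next()) println(rs.getInt("id") + ": " + rs.getString("name"))
conn.close()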
Regardless of which technologies you are thinking of using, you can't make any sort of informed decision if you don't:
Test with large data sets
and
Know your usage patterns and data volumes
So at this point I'd pick the more matured and powerful tool and setup monitoring for it. Watch the usage and performance of PGSQL, see how it holds up. Research best practices for PGSQL. Get to know it, learn it, dive in deep. When it comes to scaling individual services, each one is somewhat unique and will not fit a simple "Should I use X or Y?" question.
Good luck!

Multithreaded database fetch in Xcode

Hi, thank you for the help in advance.
I have looked at some of the posts and I am a bit confused about multithreading. It seems that it may be pretty easy; however, I am very new to programming, so I am still trying to grasp this.
These are two calls to pull data from a database, and they take forever as it is... So I'm thinking about multithreading these until I can learn how to build a Core Data store for this. Right now I am using SQLite and the database involves 10,000+ recipes... so not as lightning fast as I would like...
Please let me know what you think, and how I can make these happen simultaneously (if that's even possible).
Thank you in advance.
requestCount++;
[[DataPuller sharedDataPuller] getAllDeletedRecipeList];
[DataPuller sharedDataPuller].target = self;
requestCount++;
[[DataPuller sharedDataPuller] getAllRecipesList];
[DataPuller sharedDataPuller].target = self;
If you're doing SQLite, you might want to contemplate using FMDB, which (a) gets you out of the weeds of raw sqlite3 calls, and (b) offers FMDatabaseQueue, which lets you coordinate queries from multiple queues so that the data operations don't stumble over each other.
Having said that, you suggest that you're having significant performance issues which you're hoping to solve with a shift to Core Data or by going multithreaded with SQLite. That's unlikely to help. Poor performance of local database operations is generally more a matter of application design (e.g. it's unlikely to be wise to retrieve the full details of all 10,000 recipes; you probably want to retrieve just the unique identifiers, perhaps only those required for the given screen, and then retrieve the particulars for a given recipe later, as you need them). For local database interaction you rarely have to contemplate a multithreaded implementation; rather, design the system to retrieve the least information needed for the presentation at any given point. I personally find that my database-driven apps generally only need to go multithreaded when I'm doing extensive interaction with some remote web service (in which case it's the retrieval of server data and the parsing of it that goes on a separate thread, not necessarily the database operations themselves).
If you're having performance issues with your recipe app, I'd suggest you submit a far more detailed question, with code samples, that articulates your particular performance problem. And I wouldn't be surprised if multi-threading was not part of the solution. Likely, appropriate use of indexes and a more judicious retrieval of information at any given point might be more critical.
Get records from the database in pages, i.e. 20 or 50 recipes per page. Have a look at the YouTube app on the iPhone, or at my app, HCCHelper.

NoSQL (e.g. MongoDB) or RDBMS (e.g. PostgreSQL) for a new Scala project?

I'm developing a brand new project in Scala. It's just an application for a bunch of CRUD operations; however, because of some eccentric requirements, neither Play2 nor Lift fits the bill, so I'm going to develop the application from the ground up. This means that Anorm and ScalaQuery become less obvious choices for database integration, and it leaves me with the question: is it time to try something new?
My past technology stacks mostly included Java and PostgreSQL, and I have experience with both ORM and plain SQL. Are NoSQL database management systems like MongoDB a good replacement for a typical RDBMS, or are they special-case application data stores? Also, how does the choice of database affect the greater Scala system design (if at all)? For example, the fact that you are using a JSON-like interface to talk to the database, and JSON between the web and a REST service, does not mean that much if everything in the middle becomes Scala objects, or does it?
I'm basically asking for someone's experience of moving from relational to object/document databases, using Scala in particular. I know that good RDBMS integration is promised in the upcoming release of SLICK. So, if a company like TypeSafe decides to make RDBMS integration part of the TypeSafe stack, will I be swimming upstream by integrating with MongoDB using Casbah, for example?
Apologies if this question appears a bit vague. I do hope that someone with the right insights or experience will be able to help though.
Update:
Apologies for not adding links to SLICK (it being fairly new). Here goes:
Quick overview
Project home
Update 2:
My personal first win for a technology is usually developer productivity - this translates to lightweight and simple: quick to learn, easy to maintain, no magic.
I am currently in a similar situation, and since I have some experience with web development and SQL databases, I took it as an opportunity to work with MongoDB, Casbah (and Scalatra). My experience is still very limited, and the project and the amount of data I am working with are pretty small, but here are a few observations I've made.
For the few sets of data I have, performance does not seem to favor either SQL or NoSQL. However, performance in the presence of huge amounts of data is often listed as a reason for using NoSQL, e.g., by Wikipedia.
My documents (entries in the database) come from benchmarking test suites and mainly have a static structure, so I am optimistic that I could store them in a fixed-schema SQL database. However, a few substructures are not static, e.g., new test cases are added, new statistics are tracked, others are removed. This was my main motivation for trying a schema-free NoSQL database. Another reason was that the document approach of MongoDB makes it much more obvious which data belongs together (i.e., to a document), in contrast to entries in a relational database, where the data would be distributed over various tables and rows, and where a full "document" would need to be reconstructed by joins.
Tools such as Lift-Json or Rogue allow you to work with regular Scala objects in a type-safe way, although the data is regularly (de-)serialised to (from) JSON. However, this naturally works best if the structure of your data is mainly static; otherwise you are left with using strings to access your data (e.g., when inspecting the results of a query using Casbah).
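For instance, a minimal Lift-Json round trip might look like the following sketch (the TestRun case class is just an illustrative stand-in for one of the benchmark documents mentioned above):

import net.liftweb.json._
import net.liftweb.json.Serialization.{read, write}

case class TestRun(suite: String, passed: Int, failed: Int)

implicit val formats: Formats = DefaultFormats

val json = write(TestRun("parser", passed = 120, failed = 3))  // Scala object -> JSON string
val back = read[TestRun](json)                                 // JSON string -> Scala object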
If you are mainly concerned about a coherent representation of data on the server and client side, languages such as Opa or Haxe might be of interest, since they compile to code that can be executed on both sides. See this page for "multitarget" or "tierless" languages.
This got too long for a comment; I was just trying to relate my short experience with Scala (about 6 months now, since about when Play2 came out; it has quickly become my go-to language).
I've enjoyed using Salat/Casbah with MongoDB in my last few projects; most have been in Play2, but the latest was without a webapp framework. It definitely hasn't felt like swimming upstream.
I would say that there are particular use cases for which I wouldn't use Mongo, but it works nicely as a general-purpose object data store, especially if you expect to query by id or index, don't need transactions, and will need minimal ad-hoc aggregation-type stuff.
Expect to require a separate set of servers dedicated to mongodb (or to use a service dedicated to mongodb), but I guess that's normal for most serious database apps.
I've also used Play2/Anorm, which was surprisingly enjoyable to use for some ad-hoc query dashboard-style report pages. I started trying to go the Squeryl route, but Anorm seemed easier to use for one-off aggregation queries. Haven't looked at SLICK, but it sounds interesting.
It's really hard to say without knowing what problems you would like the app to solve.
I've personally found my productivity increased by using NoSQL DBs via REST/JSON. Bear in mind that most NoSQL DBs offer REST interfaces, which preclude the need for much middleware, Scala or otherwise, unless you intend to write a webapp with a UI.
If this is a learning exercise, I recommend you try multiple things out, as each NoSQL DB has something different to offer to your toolkit; I have personally found CouchDB, Riak, Neo4j, and MongoDB to each have various pluses and drawbacks, and to be good for different purposes.
Hope this helps, good luck.

Sqlite3 alternatives for iPhone

Are there any other database engines that could be used on the iPhone, besides sqlite3? Something like what textDb is for PHP: single-file and no server.
There are a slew of alternatives to SQLite, but there is little point to using them as others have pointed out.
Before pointing out some alternatives, some points:
First, SQLite is an excellent single-file, non-client-server, small-footprint SQL database. The performance is excellent, it has a relatively tiny runtime, and it is thoroughly fast. There isn't an embeddable SQL-interpreting alternative that is either all-around technically superior or anywhere near as popular.
Secondly, if you are doing persistence in an iPhone application, you should very likely be using Core Data. There are certainly reasons not to, but they are pretty uncommon. Beyond being a higher-level mapping onto a relational store that is quite adeptly integrated with Cocoa Touch, Core Data solves a number of very difficult problems above and beyond persistence: object graph management, efficient memory use (i.e. pushing stuff out of memory when it is no longer needed), and undo support, to name a few.
Finally, if you do decide to use some other database persistence layer, keep in mind that the iPhone 3G and earlier are extremely memory-constrained runtime environments. The mere presence of an additional library can significantly reduce the working memory available to your app. Whatever solution you choose, make sure it is optimized to use as little memory as possible.
So, seriously, if you are looking to not use SQLite or CoreData it is either because you have a very rare case where they aren't appropriate or because you are being curious. If curious, well... good for you!
If you are looking for alternatives, the SQLite documentation includes a set of links to similar products.
Pretty sparse list, and it isn't because the author is hiding anything. There simply isn't a lot of motivation in the industry to try to reinvent this particular wheel, because SQLite does a really good job. There is a reason why Google, Adobe, GE, Firefox, Microsoft, Sun, REALbasic, Skype, Symbian, Apple, and others have pretty much standardized on SQLite to solve their non-client/server relational persistence needs; it just works.
If you're looking for an alternative, I would say Core Data.
I had the same question for a long time and even used SQLite in some projects. After I spoke with an Apple engineer, though, he pointed out that Core Data could do everything I was using SQLite for (mainly just storing information and accessing it in a few different ways).
I would start with the Core Data Programming Guide and see how it works for what you're looking to do.
I think your problem is that you are thinking of a software library more like a software product. People want choice between Internet browsers for all sorts of reasons. But when you have a software library, it's pretty much set up for one purpose. If it doesn't fulfill that purpose well enough, it shouldn't be a library.
Do you not like NSObject? Do you not like the Core Foundation library? Then write your own. However, to drag up an unfortunately overused analogy, don't reinvent the wheel, unless your job is making new and innovative wheels.
SQLite does perform acceptably, and so it is supplied as a library on the iPhone platform. SQLite works for what I need it to do. If it doesn't work for you, then maybe you have some reason you'd like to share?
Freedom of choice is fine if you want to choose between Internet browsers, but I would think that as a programmer, one should have a very specific reason for going out of their way, spending valuable time to fix something that already works. Even with my choice of Internet browser, I have specific reasons I choose one over another.
No. Everyone seems to be happy with SQLite.