How many joins per functionality in real world application? - rdbms

The queries that i create frequently have 7-8 joins to retrieve data. Are these many joins normal in a real database application or is my database design poor? I am curious because if on each request database has to do so much work, then won't it die if few thousands of client connect?

In my opinion it's inevitable in some cases, the key is to have the correct indexes for the queries you're doing. With a deep object graph in ORM, or perhaps one with joined subclasses, it'd be easy to go over the 7-8 joins you talk of. I'm keen to hear what everyone else has to say about it :)

Its not possible to draw a Conclusion in this regard without the Application Logic Details.
If your Application Logic leads you to unavoidable joins to maintain the integrity Its not a Problem, and Your Database Platform must handle it.

That is a lot of joins. It's hard to say without seeing your schema, but I have seen cases where people have gone nuts making a schema overcomplicated. I remember one application I worked on where every address and phone number in the system was treated as an entity, and queries frequently involved joins of over a dozen tables. You should be careful when making a schema to distinguish between those things you care about individually tracking and everything else, otherwise you can end up with needless complexity.

Related

How can I abstract database repository from data service?

I was reading at this article https://www.infoq.com/articles/spring-data-intro to understand how can a data service layer be independent of database(RDBMS /NoSQL). It looks like there's no way to design entity and repository to be independent of database. This article was written on 2012. Do we've any other technologies since then that has implemented this feature?
Before actually answering your question I have to ask: Why do you want to do that? Think twice because abstraction comes at a cost and just doing it in order to have a "clean" design is most certainly not worth it.
Now to your question:
There is no library or framework that does this out of the box.
Probably the thing getting closest to it is spring data which you are obviously aware of. If you stick to the persistent storage independent interfaces your repositories will abstract over the persistence store use at least to some extent. BUT: you will have to provide different kinds of metadata (typically as annotations on the entities) to make it work. So in this sense, it is a really leaky abstraction.
Of course, you can roll your own: create an interface with the operations you need and provide implementations for the different data stores you want to use. Also, include a storage independent way of providing metadata.
So the question becomes: Why hasn't anybody done that yet? And why you probably shouldn't do it either?
It is hard: Just writing SQL in a way that all (relevant) SQL databases understand is difficult, see this for an example.
You loose a lot of power of your stores. For example, RDBMSes are great at joining stuff. But joining is basically a no go for many No-SQL databases. So your API probably shouldn't offer this feature. This basically dumbs everything down to the common denominator, which is going to be really small when you have very different data stores.
It is not worth it. This leads back to my opening question: Why would you want to do it anyway? I certainly see the use of switching different RDBMSes. Some companies only want certain vendors in their datacenter, it is nice to have an in memory variant for testing and so on. But switching a store from for example Oracle to Hazelcast and then to MongoDB to CSV? Why would you want to do that? What is the business value of that?

Using PostgreSQL or PostgreSQL + MongoDB?

I'm currently planning a social-media application - especially the backend.
Basically I have all the social aspects for which I want to use SQL (PostgreSQL I guess) but I also have geolocations organized in lists (so many-to-one) which will propably make out the biggest ammount of data. I know that PostgreSQL has modules for GIS capabilities and my initial thought was to just use PostgreSQL for everything, just for the sake of simplicity and because performance of Geolocation searches should be around the same for both systems, if not even in favor of PostgreSQL. I can also use JSON Type in PostgreSQL so it basically has the most obvious advantages of MongoDB covered.
On the other hand I'm affraid of scalability as the geolocations are going to be the biggest chunk of data and the tables are propably going have heaps of rows.
So my thought now is to implement geolocations in MongoDB with its easy scalability, easy to use geolocation search and embedd e.g Comments/Likes for a geolocation directly into the document, which would make the geolocation reads/searches way easier but then again I had to combine this data with social data from SQL, e.g fetch all users that commented a geolocation and get their profile info from PostgreSQL and 'manually' combine it. Even though parts of this could be done on frontend saving me a lot of resources.
I'm not sure how good this idea performs and if I'm really doing myself a favor there.
tldr: Use PostgreSQL.
Long answer:
You are trying to pre-optimize for a problem you don't even know if you will have. You don't know how many geolocations you will have, what the usage behaviors will be of your users and you probably don't even have any users yet.
I've used MongoDB before and migrated to PostgreSQL. There are many, many features and benefits to using a 'real' database for storing highly structured data. I suggest googling around for 'PostgreSQL vs X' articles, but the overall consensus that I've found is that PGSQL is extremely mature, reliable, performant and supported.
From my personal experience using Mongo then switching to PGSQL, I will never use Mongo again unless PGSQL (or another full-fledged SQL database) is completely falling over and I've spent months fixing it. Even then I'd take a hard look at other NoSQL databases too. PGSQL has so many amazing features and powerful tools that make it a joy to use.
For the seemingly few things you think you need Mongo for, PGSQL can do, and do just as well or better. It has native JSON types with indexes, geo support, full text indexing, etc. PGSQL has been around longer and has more support (useful for debugging, performance tuning, etc).
Regardless of which technologies you are thinking of using, you can't make any sort of informed decision if you don't:
Test with large data sets
and
Know your usage patterns and data volumes
So at this point I'd pick the more matured and powerful tool and setup monitoring for it. Watch the usage and performance of PGSQL, see how it holds up. Research best practices for PGSQL. Get to know it, learn it, dive in deep. When it comes to scaling individual services, each one is somewhat unique and will not fit a simple "Should I use X or Y?" question.
Good luck!

multithread database fetch xcode

Hi thank you for the help in advance,
I have looked at some of the posts and I am a bit confused about the multi threading. It seems that it may be pretty easy, however I am very new to programming so I am still trying to grasp this.
These are two calls to pull data from a database, and they take forever as it is... So I'm thinking about multithreading these until I can learn how to build a core data for this. Right now i am using sqllite and the database involves 10,000 + recipes... So not lightning fast like I would like...
Please let me know what you think, and how I can make these happen maybe simultaneously? (If thats even possible)
Thank you in advance.
requestCount++;
[[DataPuller sharedDataPuller] getAllDeletedRecipeList];
[DataPuller sharedDataPuller].target = self;
requestCount++;
[[DataPuller sharedDataPuller] getAllRecipesList];
[DataPuller sharedDataPuller].target = self;
If you're doing SQLite, you might want to contemplate using FMDB which (a) gets you out of the weeds of sqlite3 calls; and (b) offers a FMDatabaseQueue which allows you to coordinate queries from multiple queues, so that the data operations don't stumble across each other.
Having said that, you suggest that you're having significant performance issues which you're hoping to solve with a shift to Core Data or going multi-threaded with SQLite. That's unlikely. Poor performance of local database operations is generally more of a matter of your application design (e.g. it's unlikely to be wise to try to retrieve the entire details for all 10,000 recipes ... you probably want to retrieve just the unique identifiers, perhaps only those required for the given screen, and then only retrieve the particulars for a given recipe at a later point as you need that). For local database interaction, you rarely have to contemplate a multithreaded implementation, but rather just design the system to retrieve the least possible information at any given point that you need for the presentation. I personally find that my database-driven apps generally only need to go multithreaded when I'm doing extensive interaction with some remote web service (in which case, it's the retrieval of server data and the parsing of that which goes on the separate thread, not necessarily the database operations themselves).
If you're having performance issues with your recipe app, I'd suggest you submit a far more detailed question, with code samples, that articulates your particular performance problem. And I wouldn't be surprised if multi-threading was not part of the solution. Likely, appropriate use of indexes and a more judicious retrieval of information at any given point might be more critical.
Get records from database in the form of pages; i.e. 20 or 50 recipes per page. Have a look on YouTube app. on iPhone or have a look on my app. HCCHelper

Coldfusion MVC framework with crud for many-mamy

I've spent the last couple of months on and off with CFwhees.
There's a lot it has going for simplicity, but I always come up against an issue or two that I have a hard time troubleshooting. Biggest of them is to get a many-many relationship to work properly.
I'm switching between two environments Railo/mySQL and CF/MsSQL so it would be nice if it could work on both.
I'm trying to roll out a web application in a limited amount of time, as I've spent already too much time on CF wheels.
Can anyone recommend a framework that will make creating many-many relationships and the related CRUD easy and has a big community?
Some of the one's I've seen mentioned frequently are MachII, FuseBox, Model-Glue, ColdBox
Most of those frameworks you list do not have built-in ORM like wheels does. This means you'll be either using straight SQL queries or the CF9(Hibernate) ORM. I think it is only fair to mention that both of those options are also available for CFWheels.
I've written a fairly large app in CFWheels. In my app there were several instances of many-to-many relationships, and I was able to make it work without too much pain. That being said, I have felt your frustration with the CFWheels ORM. It can be clunky once you get to complex relationships. In those cases, I've had to make a judgement call as to whether it was worth it to try to build a query using the ORM, or just build a custom SQL query and store it in the CFC for my model. In fact, for 99% of my report queries for this app, I just resorted to writing the SQL in the model. But for CRUD operations, this wasn't really a limiting factor.
I'm curious what specific problems you're experiencing with Wheels - care to post an example?
yes the orm in cfwheels can be buggy at times. if you encounter a bug or even what you THINK might be a bug, we want to know. please take the time to file a bug report so we can investigate. that all being said, i'm very surprised that the CF community hasn't taken noticed about Don Humphrey's ORM called CFRel. It's probably one of the biggest things to happen to CFML since fusebox.
Oh... and there is even a cfwheels plugin for it.

MongoDB as the primary database?

I have read a lot of the MongoDB.
I like all the features it provides, but I wonder if it's possible to have it as the only database for my application, including storing sensitive information.
I know that it compromises the durability part in ACID but I will as a solution have 1 master and 2 slaves in different locations.
If I do that, is it possible to use it as the primary database, storing everything?
UPDATE:
Lets put it this way.
I really need a document storage rather than traditional dbms for be able to create my flexible application. But is MongoDB reliable enough to store customer sensitive information if I have multiple database replications and master-slave? Cause as far as I know one major downside is that it compromises the D in ACID. So I solve it with multiple databases.
Now there is not major problems such as lost of data issues?
And someone told me that with MongoDB a customer could be billed twice. Could someone enlighten this?
Yes, MongoDB can work as an application's only data store. Use replication and make sure you know about safe writes and w.
And someone told me that with MongoDB a customer could be billed
twice. Could someone enlighten this?
I'm guessing whoever told you that was talking about eventual consistency. Some NoSQL databases are eventually consistent, meaning that you could have conflicting writes. MongoDB is not, though, it is strongly consistent (just like a relational database).
Your application being flexible or not has absoutely nothing to do with wether you use "nosql", a "document db" or a proper RDBMS. Nobody using your application will care either way.
If you need flexibility while coding, you should research into frameworks, like ActiveRecord for Ruby, which can make DB-interfacing much more simple, generic and powerful. At that level, you can gain alot more than just changing the DB, and you can even become DB-agnostic, meaning you can change DB without changing any code. Indeed, I have found ActiveRecord to boost my productivity many many fold by alleviating me from tedious and error-prone "code intermixed with SQL".
Indeed, if you feel you need a schemaless database, for critical data, you are doing something wrong. You are putting your own convenience over critical needs of the projects, or in ignorance thinking you won't get into problems later. Sooner or later, lack of consistency will bite your ass, hard!
I feel you are hesistant towards RDBMS because you are not really that comfortable with all the jargons, syntax and sound CS principles.
Believe me, if you're going to create an application of any value, you are hundred times better learning SQL, ACID and good database-principles in the first place. Just read up on the basics, and build your knowledge from wherever you are now. It's the same for each and every one of us, it takes time, but you learn to do things right from the start.
Low-level tools like MongoDB and equivalent just provide you with infinitely more ammunition to shoot yourself in the foot with. They make it seem easy. In reality however, they leave the hard work for you, the coder, to deal with, while an RDBMS will handle more of the cruft for you once you grok the basics.
Why use computers at all, if you want more work, you can just go back to paper. Design will be a breeze, and implementation can be super-quick. Done. Except it won't be right of course.
In the real world, we can't afford to ignore consistency, database design and many more sound CS principles. Which is why it's a great idea to study them in the first place, and keep learning more and more.
Don't buy into the hype. You ask question about MongoDB here, but include that you really need its features. With 25 years of computer experience, I simply don't buy it. If you think creatively, an RDBMS can be made to do whatever you want it to, or a framework can be utilized to save you from errors and wasted time.
Crafting ACID properties onto MongoDB seems like more work to me, and by experience, sounds like an excercise in futility, rather than using what is already designed to suit such purposes.
Is it possible? Sure. Why not? It's possible to store everything as XML in text files if you wanted to.
Is it the best idea? It depends on what you're trying to do, the rest of your system architecture, the scale your site is intended to achieve, and so on.