Propagated delete in code or database? - iphone

I'm working on an iPhone application with a few data relationships (Author -> Books for example). When a user deletes an Author object from the application, I have a few SQLite triggers that run on the delete to remove any books from the database that have a foreign key matching the Author's primary key.
I'm also using a trigger to insert some data when a new item is created.
I can't help but shake the feeling that this might be bad design or lead to some problems down the road I am not thinking of. That said, should I rely on code in my app to handle propagating the deletes like this when the database has the capability built in to handle it?
What say you?

True. Use the inbuilt capabilities of the database as much as possible. Atleast try and start off like that and only compromise when things really demand so.

I would make use of the database's features to ensure relational integrity, especially with respect to updates/deletes. There are cases where I might use a trigger to insert some additional data (auditing comes to mind), though I would tend to avoid this and insert all of the data from my application. If you are doing multiple inserts, though, make sure to wrap it all in a single transaction so that you don't end up with a partial insert which could lead to loss of relational integrity.

I like the idea of using the database's built in functionality (I am not familiar with how it works).. but I would worry if I went back to the code a year from now, would I remember how it worked? (Given the code isn't right in front of me).
I imagine if you add a lot of comments to remind yourself about how it works now, if anything goes wrong in the future, at least you won't need to relearn the database features when you need to go do some debugging.

You're a few steps ahead of me: I recently learned about how to do that stuff with triggers and I am tempted to use them myself.
Based on the other answers here, it seems like a philosophical choice. It would probably be fine to use either triggers or code, but best to be consistent. So don't use triggers for cascading deletes on one table but then C code for another table.
Since you tagged the question iphone, I think the most important difference would be relative performance of C code versus a trigger. You'd probably have to code both and experiment to determine the difference, if any.
Another thing that comes to mind is that, of all the horror stories that I read on thedailywtf.com, about half of them seem to involve database triggers.

Unfortunately SQLite does NOT support on delete cascade etc. From the SQLite documentation:
http://www.sqlite.org/omitted.html
FOREIGN KEY constraints are parsed but are not enforced. However, the equivalent constraint enforcement can be achieved using triggers. The SQLite source tree contains source code and documentation for a C program that will read an SQLite database, analyze the foreign key constraints, and generate appropriate triggers automatically.
There is some support for triggers but it is not complete. Missing subfeatures include FOR EACH STATEMENT triggers (currently all triggers must be FOR EACH ROW), INSTEAD OF triggers on tables (currently INSTEAD OF triggers are only allowed on views), and recursive triggers - triggers that trigger themselves.
Therefore, the only way to code on delete cascade etc using SQLite requires triggers.
Kind regards,

Code goes in your app.
Triggers are code. The functionality goes in your app. Not in the database.
I think that databases should be used for data, not processing. I think apps should be used for processing, not data.
Database processing features merely muddy the water.

Related

Is it practical to use one table for reading purpose only in a relational database?

I know this question would not be ideal in a real database world, however, I am building a web REST api to server a result that potentially need to join almost every table(i use normalization for sure).
So is it OK to do have one single table to hold the meta data used for reading API, but the table get updated as well when data updated in other tables? I am using PostgreSQL by the way.
This is not very clear so I will state my understanding of the question and give you what I see are the tradeoffs.
First.... It sounds to me like you want to effectively materialize a metadata table and have it live-updated when other tables update. This is not really what the MATERIALIED VIEW support in PostgreSQL is for.
You can use a trigger to update the data whenever something changes. Because of the way PostgreSQL handles things, this leads to more disk and CPU activity, but will probably add more on the latter than the former. So if you hare heavily CPU-bound that will pose more problems than if you are I/O bound.
Using triggers in this way adds a fair bit of complexity to your database and may reduce write scaling a bit but if the data is seldom written but read frequently it may be a clear win.
So in answer to your question, yes it is practical in at least some cases. Whether it is practical in your case, that will be for you to decide.

Synchronizing two tables, best practice

I need to synchronize two tables across databases, whenever either one changes, on either update, delete or insert. The tables are NOT identical.
So far the easiest and best solution i have been able to find is adding SQL triggers.
Which i slowly started adding, and seems to be working fine. But before i continue finishing it, I want to be sure that this is a good idea? And in general good practice.
If not, what a better option for this scenario?
Thank you in advance
Regards
Daniel.
Triggers will work, but there are quite a few different options available to consider.
Are all data modifications to these tables done through stored procedures? If so, consider putting the logic in the stored procedures instead of in a trigger.
Do the updates have to be real-time? If not, consider a job that regularly synchronizes the tables instead of a trigger. This probably gets tricky with deletes, though. Not impossible, just tricky.
We had one situation where the tables were very similar, but had slightly different column names or orders. In that case, we created a view to the original table that let the application use the view instead of the second copy of the table. We were also able to use a Synonym one time to point to the original table, but that requires that the table structures be the same.
Generally speaking, a lot of people try to avoid unnecessary triggers as they're just too easy to miss when doing other work in the database. That doesn't make them bad, but can lead to interesting times when trying to troubleshoot problems.
In your scenario, I'd probably briefly explore other options before continuing with the triggers. Just watch out for cascading trigger effects where your one update results in the second table updating, passing the update back to the first table, then the second, etc. You can guard for this a little with nesting levels. Otherwise you run the risk of hitting that maximum recursion level and throwing errors.

Hibernate High Concurrency and User defined #Id ’s

firstly please excuse my relative inexperience with Hibernate I’ve only really been using it in fairly standard cases, and certainly never in a scenario where I had to manage the primary key’s (#Id) myself, which is where I believe my problems lies.
Outline: I’m bulk-loading facebook profile information through FB’s batch API's and need to mirror this information in a local database, all of which is fine, but runs into trouble when I attempt to do it in parallel.
Imagine a message queue processing batches of friend data in parallel and lots of the same shared Likes and References (between the friends), and that’s where my problem lies.
I run into repeated Hibernate ConstraintViolationException’s which are due to duplicate PK entries - as one transaction has tried to flush it’s session after determining an entity as transient when in fact another transaction has already made the same determination and beaten the first to committing, resulting in the below:
Duplicate entry '121528734903' for key 'PRIMARY'
And the ConstraintViolationException being raised.
I’ve managed to just about overcome this by removing all cascading from the parent entity, and performing atomic writes, one record per-transaction and trying to essentially just catching any exceptions, ignoring them if they do occur as I’d know that another transaction had already done the job, but I’m not very happy with this solution and cant imagine it's the most efficient use of hibernate.
I'd welcome any suggestions as to how I could improve the architecture…
Currently using : Hibernate 3.5.6 / Spring 3.1 / MySQL 5.1.30
Addendum: at the moment I'm using a hibernate merge() which checks initially for the existence of a row and will either merge (update) or insert dependant on existence, problem is even with an isolation level of READ_UNCOMMITTED sometimes the wrong determination is made, i.e. two transactions decide the same, and I've got an exception again.
Locking doesn't really help me either, optimistic or pessimistic as the condition is only a problem in the initial insert case and there's no row to lock, making it very difficult to handle concurrency...
I must be missing something but I've done the reading, my worry is that not being able to leave hibernate to manage PK's i'm kinda scuppered - as it checks for existence to early in the session and come time to synchronise the session state is invalid.
Anyone with any suggestion for me..? thanks.
Take this with a large grain of salt as I know very little about Hibernate, but it sounds like what you need to do is specify that the default mysql INSERT statement is instead made an INSERT IGNORE statement. You might want to take a look at #SQLInsert in Hibernate, I believe that's where you would need to specify the exact insert statement that should be used. I'm sorry I can't help with the syntax, but I think you can probably find what you need by looking at the Hibernate documentation for #SQLInsert and if necessary the MySQL documentation for INSERT IGNORE.

In a SQLite database is it better to use tirggers to handle cascading table changes, or is it better to do it programmatically?

Background
I have a couple of projects that use a SQLite DB for data. The data stored in the databases are obviously stored across several tables, linked by key/foreign key values.
The thing is that in these databases, if something changes to one record I have to update several other tables. The best example off the top of my head is deleting a record. I have to make sure all other records related to the one being deleted are deleted as well. Now, this example can be solved using key/foreign key values, I believe, but what about more complicated updates?
Now I'm no pro DB admin, but I know that there needs to be data integrity in the DB or things get ugly.
The Question
So, my question. I know that I have greater control when updating related tables programmatically, but at the cost of human error and time. I may miss something or not implement the tables updates correctly and it takes a lot longer to code in the updates. On the other hand, I can put in triggers and let the DB handle the updates to other tables, but I then lose a lot of control.
So, which one is better? Is each better in different situations?
On the other hand, I can put in
triggers and let the DB handle the
updates to other tables, but I then
lose a lot of control.
What control do you think you're losing? If data integrity requires that "such-and-such an update here requires additional updates there and there", you're not losing control by coding that in a trigger. You're centralizing control, and delegating it to the dbms, which is the only piece of software that can guarantee every application follows those requirements.
I know that I have greater control
when updating related tables
programmatically, but at the cost of
human error and time. I may miss
something or not implement the tables
updates correctly and it takes a lot
longer to code in the updates.
You're thinking like a programmer, not a database designer. (That's an observation, not a criticism.) Don't think, "I might miss something". That way of thinking really misses the mark.
Instead, when you're tempted to delegate data integrity to application code, think "Every programmer and every new or changed application that hits this database from now until the end of time has to get it perfectly right."
Now, honestly, does that really sound like a good idea to you?
(The last Fortune 500 company I worked in had programs written in at least two dozen different languages hitting their OLTP database.)

How would you use EF in a typical Business Layer/Data Access Layer/Stored Procedures set up?

Whenever I watch a demo regarding the Entity Framework the demonstrator simply sets up some tables and performs Inserts, Updates and Deletes using automatically created code stubs but never shows any use of stored procedures. It seems to me that this is executing SQL from the client.
In my experience this is not particular good practice so I am presuming that my understanding of the Entity Framework is wrong.
Similarly WCF RIA Services demos use the EF and the demos are always the same. Can anyone shed any light on how you would use EF in a typical Business Layer/Data Access Layer/Stored Procedures set up.
I think I am confused and shouldn't be!!?
There's nothing wrong with executing SQL from the client. Most (if not all) of the problems that it might cause are in fact not there when using something like EF. For instance:
Client generated SQL might cause runtime syntax errors. This is not unlikely since the description of your query is mostly checked on compile time (assuming that the generator itself doesn't generate invalid SQL, which is also unlikely)
Client generated SQL might be inefficient. This is not true with modern database software which have query caches. EF works in a way that's compatible with query caches, i.e. it generates the same SQL consistently (as long as you use the same code consistently) and uses parameters for varying data.
Client generated SQL might be insecure (SQL injections and whatnot). This is all handled by the generator, which uses parameters for your values and does not interpolate user input into the query itself.
Back in the old Client / Server days, it used to be considered good practice to do all db updates using stored procedures.
But now, it's perfectly acceptable to have an O/RM generate SQL and run directly against DB.
Well, part of the reason why executing sql in stored procedures is a good idea is that it gives you a level of abstraction - when db changes inevitably occur, you make a change in a single place (the proc) rather than a dozen places (all the places where you were calling the client sql). Entity Framework provides this layer of abstraction through the data model, and you have the same advantage.
There are some other reasons why you might want to look at procs, like security granularity (only allowing certain users the right to execute), and some minor performance differences. Ultimately, you have to decide for yourself what the right trade-off is. EF is an attempt to dramatically reduce the developer time spent creating a data layer, with the trade-offs listed above.
never shows any use of stored procedures
Take a look at this video: Using Your Own Stored Procedures to Insert, Update and Delete Entities in Entity Framework.
Note that there are a lot of other videos on that topic there that are certainly worth watching!
The legend is that Scott Hanselman once said "It's not a real demo unless someone drags a datagrid" (pg 478 Silverlight 4 In Action, Pete Brown)
You have to remember that demos, are all about selling software, and not at all about communicating best practice. So your observations about the demos are absolutely correct, they cover the basics, and leave it to the observer to fill in the blanks.
As to your comment about Stored Procedures, and various answers to your question about the generator. The generator is good, and getting better. Howerver there are certain circumstances when it will generate completely unusable queries. (see my SO question here and discussed on the ADO.NET team blog)
Therefore there are occasions when hand crafted queries are your only recourse (either by way of stored proc, table value functions, views etc)