Managing database changes - postgresql

I'm starting to move more logic into the database, using triggers, views, functions, CTEs, etc. When plv8/json comes out for postgres, I can see myself putting lots of logic in there.
I'm having problems with the "standard" way of doing database migrations in Sequel and ActiveRecord. Both let you put arbitrary SQL code into timestamped files. When each file is run, a schema_versions table is updated with the filename (or the timestamp in the filename), which keeps a record of which migrations have been applied to the current database.
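A rough sketch of that bookkeeping, with a hypothetical migration filename (the real table and column layout varies between the two libraries):
-- simplified sketch of the bookkeeping table described above
create table schema_versions (filename text primary key);
-- recorded once 20130101120000_create_foos.sql (hypothetical) is applied:
insert into schema_versions values ('20130101120000_create_foos.sql');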
If a lot of coding is being done at the database level, modifications to existing views, functions, etc. follow the pattern below:
Migration 1 defines a function and a view that uses that function.
-- Migration 1
create function calculate(x int) returns int as $$
select x + 1;
$$ language sql;
create view foos as (
select something, calculate(something) from a_table
);
Requirements change, and I need to change the function's signature. In Migration 2 I have to drop all objects that depend on calculate(int) and recreate them by copying their entire bodies, even if most of that code hasn't changed!
-- Migration 2
-- Have to drop all views and functions that depend on the
-- `calculate(int)` function.
drop view foos;
drop function calculate(int);
create function calculate(x bigint) returns bigint as $$
select x + 1;
$$ language sql;
-- I could do `drop function calculate(int) cascade`,
-- but I might accidentally drop some objects that wouldn't get recreated below.
-- Now I have to recreate foos.
create view foos as (
select something, calculate(something) from a_table
);
If I'm building a system based on views and functions and triggers, my migrations would be filled with duplicated code, and it's difficult to find the latest version of the code. You might say "don't do that!", but for my purposes (e-commerce, shipping, transactions), I'm finding it's a lot easier and faster to have the database ensure the integrity of the data by doing the logic inside the database.
You can (of course) dump the current database schema (which includes all the code definitions), but since view definitions come back in normalized form, I think you lose comments and formatting. And you wouldn't generally want to edit a giant file that contains the whole schema.
Any ideas on how to solve this problem?
My best idea is to have the SQL code contained in its own canonical files (app/sql/orders/shipping.sql, app/sql/orders/creation.sql, etc.). Everyone develops directly against these. Whenever it's time for a release, you'd make a new migration file, look at all the code changed since the previous release, figure out the dependency chain of the database objects that need to be dropped and recreated, and then copy the SQL from the canonical files into a new Sequel/ActiveRecord migration file. But it's a pain. :/
Thoughts are very welcome. I hope I explained this well enough, I'm cutting back on my caffeine intake and I'm a little groggy atm.
Oh, I asked a similar question on Stack Overflow: "Changing the type of a column used in other views". The answer was a function that let me pass in:
sql code to run
database views to drop and recreate
The function would retrieve each view's definition, drop the views, run the SQL code, then recreate the views (in reverse order of dropping). Perhaps a system of functions like this would help solve the problem of having to copy/paste SQL code into the migration files.
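A minimal sketch of such a helper, assuming the function name here and the use of pg_get_viewdef() to capture the definitions (the actual answer's code may differ):
create or replace function exec_and_recreate_views(ddl text, views text[])
returns void as $$
declare
    v    text;
    defs text[] := '{}';
begin
    -- capture each view's definition, then drop it
    foreach v in array views loop
        defs := defs || (v || ' as ' || pg_get_viewdef(v::regclass));
        execute 'drop view ' || v;
    end loop;
    -- run the caller's SQL
    execute ddl;
    -- recreate the views in reverse order of dropping
    for i in reverse coalesce(array_length(defs, 1), 0) .. 1 loop
        execute 'create view ' || defs[i];
    end loop;
end;
$$ language plpgsql;
Migration 2 above would then shrink to a single call, passing the new calculate definition as the ddl argument and array['foos'] as the views to cycle.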

I'd recommend Liquibase.
You create files which track the changes to your database, and these are applied to the database in the correct order.
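For example, Liquibase changelogs can be written as plain SQL; a small sketch with a hypothetical author and changeset ids (runOnChange is handy for views, which can simply be replaced when their definition changes):
--liquibase formatted sql

--changeset alice:1
create table a_table (something int);

--changeset alice:2 runOnChange:true
create or replace view foos as select something from a_table;
--rollback drop view foos;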

You might find Dave Wheeler's blog posts interesting, starting from here:
http://justatheory.com/computers/databases/simple-sql-change-management.html
My rate of database change is fairly small but I tend to be careless and make small changes to the schema directly, so I've had to come up with a fair bit of infrastructure to catch when I've done so. The basic elements are:
1. A makefile that can rebuild a development database from scratch
2. A set of schema files separated into "modules" (lookups_schema.sql, lookup_data.sql)
3. A set of update files that transition from one revision to the next
4. I don't usually have the corresponding downgrade scripts; some people do
5. A script to populate my database with a plausible amount of test data
6. Crucially, a test suite via pgTAP that checks my various functions, views and also the upgrade scripts. The upgrade tests can be run against a live database too.
If you have a separate instance of PostgreSQL set up with fsync turned off, or on a ramdisk, etc., then rebuilding the whole DB and populating it can take seconds (if you don't have too much test data).
Start with #1, #2, then add #6 (pgTAP is very cool), then the rest. The crucial thing is a test suite that checks your in-database code.
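To give a flavour, a pgTAP test file is itself just SQL; a tiny sketch reusing the calculate/foos objects from the question above (requires the pgTAP extension; names assumed):
begin;
select plan(3);
-- schema checks
select has_function('calculate', array['bigint']);
select has_view('foos');
-- behaviour check
select is(calculate(1), 2::bigint, 'calculate adds one');
select * from finish();
rollback;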
There are tools that try to automate schema changes for you, but they are really only good at adding a new column to a table and that sort of thing. Once you have code in your db then they're not much help.

Related

Delayed Table Name Resolution in View

I have a view over a table. It turns out the table gets moved and an updated version of it is created each night. This ensures there is always a table of the expected name present in the database, but I cannot find a way to make my view continue to point to the current version of the table. Whichever table existed when the view was created is the one I end up pointing at, even after it moves and goes stale.
ViewA:
select a, b, c from todays_table;
todays_table stays current all day, then at night it gets renamed to todays_table01. ViewA now points to todays_table01, and a new table shows up called todays_table. Again, todays_table is current, but ViewA no longer is.
Is there a way to delay the table name resolution until the view is used? I haven't been able to get EXECUTE IMMEDIATE working for a SELECT statement. I think I could get a dynamic SQL statement working if I used a cursor, but I have never needed cursors before and I'm not sure they are the right path. I read about AUTO_REVAL, but I believe this would only delay resolution until the first time the view was used, and it would still go stale that night.
I could, of course, stop using the view and just move the complex query into my program but there are many places it is needed so I would like to eliminate all other solutions before falling back to this.
It would be ideal to eliminate the temporary table and just have the master table receive updates throughout the day but this is beyond my comprehension as I know nothing about RPG II and OCL.
Thanks for reading.
Edit
Per @Mr. Llama's suggestion, I experimented with using synonyms and aliases to point to todays_table and then having my view point to the synonym. Unfortunately, the view uses the alias to resolve the actual table name at creation time, so the view keeps following the original table even when it is renamed to todays_table01, though the alias itself continues to reference todays_table.
Edit 2
I'm accepting @mustaccio's answer because it does work and would be a reasonable approach to this problem if I could get the parameters going where they need to. My particular project requires flexibility, so I am actually going to jump on the nightly-process bandwagon and add a program to recreate my views after the process messes with their references, as @danny117 suggested.
Thanks to everyone who replied though, I learned a lot about how all of these pieces work together.
I think you might be able to achieve what you want by wrapping your view definition in a SQL table function, something like this (column types here are assumed):
CREATE FUNCTION insteadofview ()
RETURNS TABLE (a INT, b INT, c INT)
LANGUAGE SQL
READS SQL DATA
RETURN
SELECT a, b, c FROM todays_table;
Depending on how you query your view, you will probably need to pass search criteria into the function as parameters, otherwise performance will be suboptimal because the function will have to return all rows from the query before search arguments can be applied.
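Querying the function in place of the view would then look like this (DB2 requires the TABLE() wrapper and a correlation name; a search parameter, if you add one, goes inside the parentheses):
SELECT t.a, t.b, t.c
FROM TABLE(insteadofview()) AS t;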
According to the manual, as you have noticed, views on a table that is renamed continue to point to the original table object. Routines, however, including table functions, will be invalidated, and their plans will be prepared again the next time they are invoked, using the original source table name.
I have no way of testing this though.
Full syntax to create a table function.

Is it possible to prevent the SQL Producer from overwriting just one of the table's columns?

Scenario: A computed property needs to be available for RAW methods. The IsComputed property set in the model will not work, as its value will not be available to RAW methods.
Attempted Solution: Create a computed column directly on the SQL table, as opposed to setting the IsComputed property in the model, and specify that CodeFluent Entities not overwrite the computed column. I would then expect the BOM to read the computed SQL field no differently than if it were a normal database field.
Problem: I can't figure out how to prevent Codefluent Entities from overwriting the computed column. I attempted to use the production flags as well as setting produce="false" for the property in the .cfp. Neither worked.
Question: Is it possible to prevent Codefluent Entities from overwriting my computed column and if so, how?
The solution you're looking for is here.
You can execute whatever custom T-SQL scripts you like; the only requirement is to give the script a specific name so the Producer knows when to execute it.
i.e. if you want your custom script to execute after the tables are generated, name your script
after_[ProjectName]_tables.
Save your custom T-SQL file alongside the CodeFluent-generated files and build the project.
In my specific case, I had to enable a full-text index on one of my table columns, so I wrote the SQL script for the functionality and saved it as
`after_[ProjectName]_relations_add`
Here's how they look in my file directory:
[image: file directory]
Alternate Solution: An alternate solution is to execute the following T-SQL script after the SQL Producer finishes generating.
-- drop the generated column, then recreate it as a computed column
ALTER TABLE PunchCard DROP COLUMN PunchCard_CompanyCodeCalculated
GO
ALTER TABLE PunchCard
ADD PunchCard_CompanyCodeCalculated AS CASE
        WHEN PunchCard_CompanyCodeAdjusted IS NOT NULL THEN PunchCard_CompanyCodeAdjusted
        ELSE PunchCard_CompanyCode
    END
GO
Additional Configuration Needed to Make the Solution Work: For this solution to work, one must also configure the BOM so that it does not attempt to save the data associated with the computed columns. This can be done in the Model using the advanced properties: in my case I selected the CompanyCodeCalculated property, went to advanced settings, and set the Save setting to False.
Question: Somewhere in the Knowledge Center there is a passing reference on how to automate the execution of SQL scripts after the SQL Producer finishes, but I cannot find it. Anybody know how this is done?
Post Usage Comments: Just wanted to let people know I implemented this approach and am so far happy with the results.

Managing postgresql views without having to write migrations?

The problem with SQL views is that every time I need to make a small change, I need to create another migration. In a small startup, having to write a migration for every small change to a view is quite a hindrance.
Is it advisable to do the following?
Drop and recreate the views every time I deploy my app.
This way, when I change something in a view, it will get updated in the database as soon as I deploy my app.
What you are describing is just another type of migration, one that gets re-applied on every deploy. This may make sense for your business needs, and if you get blocked by this technique, you can always fall back to the regular migration system.
The best way to implement such a system in PostgreSQL is to create a schema that you drop on deploy. This way you don't have to write all the DROP VIEW ... commands; a single DROP SCHEMA ... CASCADE deletes everything in it. Then you can run your procedure to rebuild it.
Example deploy script to execute on deploy:
/* Drop and rebuild the schema */
DROP SCHEMA IF EXISTS view_schema CASCADE;
CREATE SCHEMA view_schema;
CREATE VIEW view_schema.my_users AS (SELECT * FROM users);
CREATE VIEW view_schema.my_products AS (SELECT * FROM products);
....

PostgreSQL transaction variables

This question is sort of a follow-up to this question, but it's a different enough topic that I feel it merits its own discussion. For a bit of background, you can refer to it.
As a part of a new file importing system, I am building an audit system based on this wiki page. But, one of the things that I would like to include in the audit trail is the file name of the file that the data came from (these files are archived for long term storage so if there are questions, I can always go back).
One way I could go is to create an import_batch record, record the name of the file there, and then just stamp records with that batch when they update. That is the path I'm going down. But it feels a bit clunky. I've been pondering the idea of having the audit trigger get the import_batch_id without it having to be in the NEW.* record. It seems to me there are at least a couple of ways I might accomplish this.
I could have a function that creates a temp table and stores any information I want in it (such as the batch # or file name or whatever). This seems pretty clean, and as I understand it the table would only live for the duration of the transaction, and I wouldn't have to worry about naming collisions: each transaction could have its own temp table named "tmp_import_info".
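A minimal sketch of that idea (names assumed; ON COMMIT DROP scopes the temp table to the transaction):
begin;
-- scratch table the audit trigger can read from
create temp table tmp_import_info (
    import_batch_id int,
    filename        text
) on commit drop;
insert into tmp_import_info values (222, 'orders_2013-07-01.csv');
-- ... imports fire the audit trigger, which can run:
--     select import_batch_id from tmp_import_info;
commit;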
If I only care about the import_batch_id (which comes from a sequence), I could probably just get the current value of the sequence. I'm not 100% sure how this would behave in a multi-user setting, though. I would think it would be possible for transaction #1 to create import_batch_id #222 and then transaction #2 to start and get #223, after which my audit trail would record the wrong data.
Are there other options that I'm not seeing here? Is there a way to add a transaction/session variable? Basically, something like pg_settings, but one that allows inserts, updates and deletes of values.
It feels like the best option might be the temp table.
The main good news for variant 2 is, quoting the manual:
currval
Return the value most recently obtained by nextval for this sequence in the current session. (An error is reported if nextval has never been called for this sequence in this session.) Because this is returning a session-local value, it gives a predictable answer whether or not other sessions have executed nextval since the current session did.
Store your import file names in a table with a serial primary key. You can refer to your last value from the sequence with currval or lastval. Concurrent users cannot interfere. As long as you don't foil this path inside your own transaction yourself, this is safe.
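A minimal sketch of that setup (table and file names assumed):
create table import_batch (
    import_batch_id serial primary key,
    filename        text not null
);

-- at the start of the import, in the same session:
insert into import_batch (filename) values ('orders_2013-07-01.csv');

-- later, inside the audit trigger, the batch id is available
-- without being part of NEW.* ('import_batch_import_batch_id_seq'
-- is the sequence name PostgreSQL generates by default):
-- batch_id := currval('import_batch_import_batch_id_seq');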

iOS - Create Database Schema (Run code only once)

I'm using FMDB for my iPhone app database and I want to create the database and table schema only once.
How can I run Objective-C code when the user installs or updates the app?
Kind regards
You can set a boolean value in NSUserDefaults - NSUserDefaults is only reset when the user deletes the app, so you can have some code that executes if a particular boolean value is not found in the user defaults (and then saves that value after execution to prevent it from being run again).
That will cover your plain 'run code once upon install' scenario - you can achieve the same for updates with a similar approach, but utilising the CFBundleVersion variable (which will be different for each version of your app).
First of all, you might not want to think about executing something during the update itself, because that's not possible. As @lxt suggested, you can store a value in the preferences to indicate the database version, but it might not be bulletproof.
A common approach to solving this problem is to use self-built metadata. When you first create the database, create an extra table named "metadata" or "properties" with two varchar columns, "name" and "value", and insert one row: ("database_ver", "1").
In your database layer (or adapter) class, create an "open" method to handle opening. Within this method, first run select value from metadata where name = 'database_ver'; to check the database version. If nothing is fetched, run the table creation scripts and insert the database_ver = 1 row.
Later on, if you upgrade your table format, provide alter table statements for each version and run them based on database_ver. For fresh installs after the upgrade, you can use the updated create table statements and then set "database_ver" to "2" (or above) directly, without going through alter table.
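In SQL terms, the scheme might look like this (the metadata table as described above; the ALTER statement is a made-up example of a version 2 upgrade):
CREATE TABLE metadata (
    name  VARCHAR(32) PRIMARY KEY,
    value VARCHAR(32) NOT NULL
);
INSERT INTO metadata (name, value) VALUES ('database_ver', '1');

-- on open: read the version, then apply any pending upgrades
SELECT value FROM metadata WHERE name = 'database_ver';
-- upgrading from version 1 to 2, for example:
-- ALTER TABLE punches ADD COLUMN note TEXT;
-- UPDATE metadata SET value = '2' WHERE name = 'database_ver';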
Compared to storing the value in the preferences, it's actually more common to store it in the database itself, because even if the user backs the file up somewhere, or skips a version, you can still tell the format of the database from its metadata table.
FMDB has no problem running such a mechanism.