Detect schema change by DDL for PostgreSQL

I am wondering if there is a tool that can detect the difference between two versions of a DDL script and automatically generate and apply migration scripts when the schema changes. I need it to work with PostgreSQL.
For example, we had v1.sql:
CREATE TABLE student (
    id int,
    name text
);
now we have v2.sql:
CREATE TABLE student (
    id int,
    name text,
    nick_name text
);
As you can see, one more column is added to the student table when we move from v1 to v2, so we should generate and apply this script:
ALTER TABLE student ADD nick_name text;
The use case is moving from one version to the next: the tool should automatically update the schema whenever there is a change. It does not have to handle very complex situations; simple adding/dropping of columns and tables would be fine. Other things can still be done manually.
A CCDR strategy would always work, but we would rather not go that route when the change can be handled by simply adding or dropping a column.
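As a rough illustration of what such a tool would do internally (a sketch only, assuming v1.sql and v2.sql have been loaded into schemas named v1 and v2, which are hypothetical names), added columns can be detected by comparing information_schema.columns and emitting the corresponding ALTER statements:

-- Sketch: columns present in schema v2 but missing from v1
SELECT format('ALTER TABLE %I ADD COLUMN %I %s;',
              n.table_name, n.column_name, n.data_type)
FROM information_schema.columns n
LEFT JOIN information_schema.columns o
       ON  o.table_schema = 'v1'
      AND  o.table_name   = n.table_name
      AND  o.column_name  = n.column_name
WHERE n.table_schema = 'v2'
  AND o.column_name IS NULL;

For the example above this yields ALTER TABLE student ADD COLUMN nick_name text; dropped columns and tables would need analogous queries in the other direction.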

Related

Create a new table with existing style from layer_styles table in postgres

Is there any way to create a PostGIS table with an existing style from the layer_styles table? Say, for example, I have many styles stored in the layer_styles table, and I need to assign one of them to the new table I am going to create. Can that be done in PostgreSQL with an SQL command at table creation time?
You need to identify the layer IDs of interest (or the name, or any column you want) and create the new table using this data and structure. However, using the style with this secondary table may not be that easy:
create table public.layer_styles_2 as
(select * from public.layer_styles where id in (2,3));

How to write a trivial on update rule on a view which just forwards the given update to the table represented by the view

I have a table
CREATE TABLE test (
    id SERIAL,
    name character varying NOT NULL,
    PRIMARY KEY (id)
);
a view
CREATE VIEW TEST_VIEW AS
SELECT id,name
FROM test;
and just want to forward a given update query to the actual table behind the view:
CREATE RULE TEST_VIEW_UPDATE
AS ON UPDATE TO TEST_VIEW
DO INSTEAD UPDATE TEST;
But this approach results in an error, presumably because the SET clause is missing. How can I do this correctly in the most generic way (i.e. with no limitation on what is actually updated)?
On PostgreSQL 9.3 this will work automatically and without changes. PostgreSQL creates simple views as updatable by default.
In prior versions, specify all columns in the UPDATE. There's no wildcard.
If you're on 9.1 or above (which you should always mention in every question - select version()) you should use an INSTEAD OF view trigger rather than a rule.
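For illustration, a minimal sketch of that INSTEAD OF trigger approach, assuming the test / test_view definitions from the question (the function and trigger names here are made up):

CREATE OR REPLACE FUNCTION test_view_upd() RETURNS trigger AS $$
BEGIN
    -- forward the update to the base table, matching on the old primary key
    UPDATE test SET name = NEW.name WHERE id = OLD.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER test_view_upd
INSTEAD OF UPDATE ON test_view
FOR EACH ROW EXECUTE PROCEDURE test_view_upd();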
As far as I know, it's not possible to do it like this; you have to write the actual command:
CREATE RULE TEST_VIEW_UPDATE
AS ON UPDATE TO TEST_VIEW
DO INSTEAD UPDATE test SET name = NEW.name WHERE id = OLD.id;
It's also possible to do what you want with triggers - check this and this links.

postgresql trigger (strategy) to update table based on entries of another table

I am fairly new to PostgreSQL (spoilt by django ORM!), and I would like to create a trigger which updates a table based on entries of another table.
So, I have the following table on my schema:
collection_myblogs(id, col1,col2,title,col4,col5)
..where field id is autogenerated. Now, I have a new table created like so:
CREATE TABLE FullText(id SERIAL NOT NULL, content text NOT NULL);
ALTER TABLE ONLY FullText ADD CONSTRAINT fulltext_pkey PRIMARY KEY (id);
and I insert values from collection_myblogs like so:
INSERT INTO FullText(content) SELECT title FROM collection_myblogs;
All fine so far... I would now like a trigger on FullText so that FullText updates itself with new entries every time collection_myblogs gets a new entry. So, I attempted creating a trigger as follows:
CREATE TRIGGER collection_ft_update BEFORE INSERT OR UPDATE ON collection_myblogs FOR EACH ROW EXECUTE PROCEDURE ft_update();
Now, I am not entirely sure what should go on ft_update() function, and at the moment, I have:
CREATE FUNCTION ft_update() RETURNS trigger AS '
BEGIN
    INSERT INTO FullText(content) SELECT new.title;
    RETURN new;
END
' LANGUAGE plpgsql;
...which works fine for INSERTs but not for UPDATEs: if I update the title in the original collection_myblogs table, it appears as a new entry in FullText. I am unsure how to deal with the ids here.
I would like the ids (i.e. the primary keys) to be the same in both tables. So the idea is to have FullText(id, content) == collection_myblogs(id, title), if that makes sense; the id and the content should be replicated from the collection_myblogs table. How would one go about achieving this?
My understanding is that I can use a trigger before any insert or update on collection_myblogs and somehow maintain FullText(id, content) == collection_myblogs(id, title).
I would appreciate any guidance on this.
There are actually a large number of ways to handle this problem. Some examples:
Use table inheritance to create an "interface" to your data (no trigger needed, the abstract table ends up functioning like a view). This is complicated territory though.
Use the trigger approach like you do and then handle UPDATE and DELETE separately. The big issue here is that if you have two areas of text that are identical, your update trigger needs to be able to separate them.
There are many others but those should get you started.
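A rough sketch of the trigger-based option, assuming FullText.id should simply mirror collection_myblogs.id (so its SERIAL default is bypassed); the trigger is recreated as an AFTER trigger here and DELETE handling is left out:

CREATE OR REPLACE FUNCTION ft_update() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO FullText(id, content) VALUES (NEW.id, NEW.title);
    ELSIF TG_OP = 'UPDATE' THEN
        UPDATE FullText SET content = NEW.title WHERE id = OLD.id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER collection_ft_update
AFTER INSERT OR UPDATE ON collection_myblogs
FOR EACH ROW EXECUTE PROCEDURE ft_update();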
Actually, it turned out to be quite simple. Just had to follow this

Creating a "table of tables" in PostgreSQL or achieving similar functionality?

I'm just getting started with PostgreSQL, and I'm new to database design.
I'm writing software in which I have various plugins that update a database. Each plugin periodically updates its own designated table in the database. So a plugin named 'KeyboardPlugin' will update the 'KeyboardTable', and 'MousePlugin' will update the 'MouseTable'. I'd like for my database to store these 'plugin-table' relationships while enforcing referential integrity. So ideally, I'd like a configuration table with the following columns:
Plugin-Name (type 'text')
Table-Name (type ?)
My software will read from this configuration table to help the plugins determine which table to update. Originally, my idea was to have the second column (Table-Name) be of type 'text'. But then, if someone mistypes the table name, or an existing relationship becomes invalid because of someone deleting a table, we have problems. I'd like for the 'Table-Name' column to act as a reference to another table, while enforcing referential integrity.
What is the best way to do this in PostgreSQL? Feel free to suggest an entirely new way to setup my database, different from what I'm currently exploring. Also, if it helps you answer my question, I'm using the pgAdmin tool to setup my database.
I appreciate your help.
I would go with your original plan to store the name as text. Possibly enhanced by additionally storing the schema name:
addin text
,sch text
,tbl text
Tables have an OID in the system catalog (pg_catalog.pg_class). You can get those with a nifty special cast:
SELECT 'myschema.mytable'::regclass
But the OID can change over a dump/restore. So just store the names as text and, at application time, verify the table is there by casting it as demonstrated above.
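For example, such a check could look like this (to_regclass is available from 9.4 on and returns NULL instead of raising an error; addin_registry is a hypothetical name for a table holding the three text columns above):

-- Rows whose stored schema/table no longer resolves come back with resolved IS NULL
SELECT addin, sch, tbl,
       to_regclass(format('%I.%I', sch, tbl)) AS resolved
FROM addin_registry;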
Of course, if you use the same tables for multiple addins, it might pay to make a separate table
CREATE TABLE tbl (
  tbl_id serial PRIMARY KEY
, sch    text
, name   text
);
and reference it in ...
CREATE TABLE addin (
  addin_id serial PRIMARY KEY
, addin    text
, tbl_id   integer REFERENCES tbl(tbl_id) ON UPDATE CASCADE ON DELETE CASCADE
);
Or even make it an n:m relationship if addins have multiple tables. But be aware, as #OMG_Ponies commented, that a setup like this will require you to execute a lot of dynamic SQL because you don't know the identifiers beforehand.
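To give an idea of the dynamic SQL involved, here is a sketch of a helper that counts the rows of whichever table is registered for a given addin (built on the tbl and addin tables above; the function name is made up):

CREATE OR REPLACE FUNCTION addin_row_count(p_addin text) RETURNS bigint AS $$
DECLARE
    v_sch  text;
    v_tbl  text;
    result bigint;
BEGIN
    SELECT t.sch, t.name INTO v_sch, v_tbl
    FROM addin a
    JOIN tbl   t USING (tbl_id)
    WHERE a.addin = p_addin;

    -- the identifiers are only known at run time, hence EXECUTE + format('%I')
    EXECUTE format('SELECT count(*) FROM %I.%I', v_sch, v_tbl) INTO result;
    RETURN result;
END;
$$ LANGUAGE plpgsql;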
I guess all plugins have a set of basic attributes and then each plugin will have a set of plugin-specific attributes. If this is the case you can use a single table together with the hstore datatype (a standard extension that just needs to be installed).
Something like this:
CREATE TABLE plugins
(
    plugin_name text not null primary key,
    common_int_attribute integer not null,
    common_text_attribute text not null,
    plugin_attributes hstore
);
Then you can do something like this:
INSERT INTO plugins
(plugin_name, common_int_attribute, common_text_attribute, plugin_attributes)
VALUES
('plugin_1', 42, 'foobar', 'some_key => "the fish", other_key => 24'),
('plugin_2', 100, 'foobar', 'weird_key => 12345, more_info => "10.2.4"');
This creates two plugins named plugin_1 and plugin_2.
Plugin_1 has the additional attributes "some_key" and "other_key", while plugin_2 stores the keys "weird_key" and "more_info".
You can index those hstore columns and query them very efficiently.
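For example, a GIN index on the hstore column supports the ? and @> operators used below (the index name is arbitrary):

CREATE INDEX plugins_attributes_gin ON plugins USING gin (plugin_attributes);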
The following will select all plugins that have a key "weird_key" defined.
SELECT *
FROM plugins
WHERE plugin_attributes ? 'weird_key'
The following statement will select all plugins that have a key some_key with the value the fish:
SELECT *
FROM plugins
WHERE plugin_attributes @> 'some_key => "the fish"'
Much more convenient than using an EAV model in my opinion (and most probably a lot faster as well).
The only drawback is that you lose type-safety with this approach (but usually you'd lose that with the EAV concept as well).
You don't need an application catalog. Just add the application name to the keys of the table. This of course assumes that all the tables have the same structure. If not, use the application name as a table name or, as others have suggested, as a schema name (which would also allow multiple tables per application).
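For example, if the per-plugin tables really do share one structure, a single shared table keyed by the plugin name could look like this (all column names here are made up):

CREATE TABLE plugin_events (
    plugin_name text    NOT NULL,
    event_id    integer NOT NULL,
    payload     text,
    PRIMARY KEY (plugin_name, event_id)
);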
EDIT:
But the real issue is, of course, that you should first model your data and then build the applications to manipulate it. The data should not serve the code; the code should serve the data.

What is the difference between these two T-SQL statements?

In an SSIS package at work there are some SQL tasks that create staging tables for holding import data. All the statements take this form:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'dbo.tbNewTable') AND type IN (N'U'))
BEGIN
    TRUNCATE TABLE dbo.tbNewTable
END
ELSE
BEGIN
    CREATE TABLE dbo.tbNewTable (
        ColumnA VARCHAR(10) NULL,
        ColumnB VARCHAR(10) NULL,
        ColumnC INT NULL
    ) ON [PRIMARY]
END
In Itzik Ben-Gan's T-SQL Fundamentals I see a different form of statement for creating a table:
IF OBJECT_ID('dbo.tbNewTable', 'U') IS NOT NULL
BEGIN
    DROP TABLE dbo.tbNewTable
END
CREATE TABLE dbo.tbNewTable (
    ColumnA VARCHAR(10) NULL,
    ColumnB VARCHAR(10) NULL,
    ColumnC INT NULL
) ON [PRIMARY]
Each of these appears to do the same thing: after execution there will be an empty table called tbNewTable in the dbo schema.
Are there any practical or theoretical differences between the two? What implications might they have?
The first one assumes that if the table exists, it has the same columns as those it would create. The second one does not make that assumption. So if a table with that name happened to exist and had a different set of columns, the two would have very different results.
The first will not actually DROP the table; it merely TRUNCATEs the data in it, which is why the CREATE is guarded.
Thus the form with the DROP will allow the subsequent CREATE to change the schema (when the new table is created) even if tbNewTable previously existed.
Because the DROP/CREATE alters the database schema, it may also not be allowed in all cases. For instance, a view created WITH SCHEMABINDING will prevent the table from being dropped. (The same holds true for FK relationships, should any exist.)
...when SCHEMABINDING is specified, the base table or tables cannot be modified in a way that would affect the view definition.
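As an illustration (the view name is made up), a schema-bound view referencing the staging table is enough to make the DROP TABLE variant fail:

CREATE VIEW dbo.vNewTable
WITH SCHEMABINDING
AS
SELECT ColumnA, ColumnB, ColumnC
FROM dbo.tbNewTable;
GO

DROP TABLE dbo.tbNewTable;  -- fails while the schema-bound view exists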
The TRUNCATE should be marginally faster, but only in one of those "don't care" ways: performance should not be a factor in choosing one over the other.
There are also permission differences. TRUNCATE only requires the ALTER permission.
The minimum permission required is ALTER on table_name. TRUNCATE TABLE permissions default to the table owner...
Happy coding.
These are very different.
The first does an equality check against the sys.objects catalog view to see whether there is a matching table name. If so, it truncates the table, removing all rows while keeping the table structure itself; i.e. the actual table is never dropped.
In the second, the check that the table exists is done implicitly via the OBJECT_ID() function. If it exists, the table is dropped completely: rows and structure.
If you have primary and foreign key constraints on the table, you'll certainly have issues dropping it completely... and if you have other tables linked to the table you are trying to 'truncate', you'll have issues there too, unless you have cascade deletion turned on.
I tend to dislike either construction in an SSIS package. I create the tables in a deployment script, and I want the package to fail if one of the tables I use is missing later on, because then something has gone drastically wrong and I want to investigate before I try putting data anywhere.