Import CSV Table Definitions into PostgreSQL - postgresql

I have a file of table definitions, in the following format
Table Name Field Name Field Data Type
ATableName1 AFieldName1 VARCHAR2
ATableName1 AFieldName2 NUMBER
...
ATableNameX AFieldNameX1 TIMESTAMP(6)
Is there any easy way to import this into Postgres to automatically create the tables?
What if I split the file up into individual tables, and just had a csv of field names/data types for each table?
Field Name Field Data Type
AFieldName1 VARCHAR2
AFieldName2 NUMBER
My searching has only yielded data import via copy, and table creation (based on data) using pgfutter.

Mind I changed varchar2 to varchar and number to integer. Also, you have TSV; in order to use it, change chr(44) in my code to chr(9). Mind I don't check for SQL injection. Otherwise, here's a working example:
t=# do
$$
declare
_r record;
begin
for _r in (
with t(l) as (values('ATableName1,AFieldName1,VARCHAR
ATableName1,AFieldName2,INTEGER
ATableNameX,AFieldNameX1,TIMESTAMP(6)'::text)
)
, r as (select unnest(string_to_array(l,chr(10))) rw from t)
, p as (select split_part(rw,chr(44),1) tn, split_part(rw,chr(44),2) cn,split_part(rw,chr(44),3) tp from r)
select tn||' ('||string_agg(cn||' '||tp, ', ')||')' s from p
group by tn
) loop
raise info '%','create table '||_r.s;
execute 'create table '||_r.s;
end loop;
end;
$$
;
INFO: create table ATableNameX (AFieldNameX1 TIMESTAMP(6))
INFO: create table ATableName1 (AFieldName1 VARCHAR, AFieldName2 INTEGER)
DO
Time: 16.743 ms
t=# \dt atablename*
List of relations
Schema | Name | Type | Owner
--------+-------------+-------+-------
public | atablename1 | table | vao
public | atablenamex | table | vao

SQL is your friend; it is very expressive, and you can construct your table definitions using the string_agg function. Have a look at the example here:
http://sqlfiddle.com/#!17/0fe14/1
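If the definitions live in an actual file rather than a string literal, the same idea works from a staging table. A minimal sketch, assuming a comma-separated file /tmp/defs.csv with columns table_name,column_name,data_type and no header row (the file path and the _defs staging table are my own placeholders, and the Oracle types are assumed to be already mapped to Postgres ones):
-- load the raw definitions into a staging table
create table _defs (tn text, cn text, tp text);
\copy _defs from '/tmp/defs.csv' with (format csv)
-- build and execute one CREATE TABLE per distinct table name
do
$$
declare
_r record;
begin
for _r in (
select tn||' ('||string_agg(cn||' '||tp, ', ')||')' s
from _defs
group by tn
) loop
execute 'create table '||_r.s;
end loop;
end;
$$
;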

Related

fetch data from two different tables in single function postgresql

I want to write a function to get data from two different tables.
My code:
create function
return table(a integer,b integer,c integer,k integer,l integer,m integer);
if(x=1) then
select a,b,c from mst_1
else
select k,l,m from mst_2
end IF;
end;
The problem is that the two tables possess different columns, and I'm getting an error.
I replicated a case similar to yours, and it's just a matter of using the correct syntax.
If you have two tables like test and test_other like in my case
create table test (id serial, name varchar, surname varchar);
insert into test values(1,'Carlo', 'Rossi');
insert into test values(2,'Giovanni', 'Galli');
create table test_other (id_other serial, name_other varchar, surname_other varchar);
insert into test_other values(1,'Beppe', 'Bianchi');
insert into test_other values(2,'Salmo', 'Verdi');
You now want a function that returns the 3 columns from test if an input parameter is 1, and the 3 columns from test_other otherwise.
Your function will look like the following
create or replace function case_return(x integer)
returns table(id integer,value_1 varchar, value_2 varchar)
language plpgsql
as
$$
begin
if(x=1) then
return query select test.id,test.name,test.surname from test;
else
return query select test_other.id_other, test_other.name_other, test_other.surname_other from test_other;
end IF;
end;
$$
;
The function always returns the columns id, value_1 and value_2 as per the definition, even if your source columns are different:
defaultdb=> select * from case_return(0);
 id | value_1 | value_2
----+---------+---------
  1 | Beppe   | Bianchi
  2 | Salmo   | Verdi
(2 rows)
defaultdb=> select * from case_return(1);
 id | value_1  | value_2
----+----------+---------
  1 | Carlo    | Rossi
  2 | Giovanni | Galli
(2 rows)

Postgres 10: do rows automatically move between partitions?

Assuming I have a parent table with child partitions that are created based on the value of a field.
If the value of that field changes, is there a way to have Postgres automatically move the row into the appropriate partition?
For example:
create table my_table(name text)
partition by list (left(name, 1));
create table my_table_a
partition of my_table
for values in ('a');
create table my_table_b
partition of my_table
for values in ('b');
In this case, if I change the value of name in a row from aaa to bbb, how can I get it to automatically move that row into my_table_b?
When I tried to do that (i.e. update my_table set name = 'bbb' where name = 'aaa';), I got the following error:
ERROR: new row for relation "my_table_a" violates partition constraint
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63
it doesn't handle updates that cross partition boundaries.
Thus you need to build that behaviour yourself... here's your set:
t=# insert into my_table select 'abc';
INSERT 0 1
t=# insert into my_table select 'bcd';
INSERT 0 1
t=# select tableoid::regclass,* from my_table;
tableoid | name
------------+------
my_table_a | abc
my_table_b | bcd
(2 rows)
here's the rule and fn():
t=# create or replace function puf(_j json,_o text) returns void as $$
begin
raise info '%',': '||left(_j->>'name',1);
execute format('insert into %I select * from json_populate_record(null::my_table, %L)','my_table_'||left(_j->>'name',1), _j);
execute format('delete from %I where name = %L','my_table_'||left(_o,1), _o);
end;
$$language plpgsql;
CREATE FUNCTION
t=# create rule psr AS ON update to my_table do instead select puf(row_to_json(n),OLD.name) from (select NEW.*) n;
CREATE RULE
here's update:
t=# update my_table set name = 'bbb' where name = 'abc';
INFO: : b
puf
-----
(1 row)
UPDATE 0
checking result:
t=# select tableoid::regclass,* from my_table;
tableoid | name
------------+------
my_table_b | bcd
my_table_b | bbb
(2 rows)
once again:
t=# update my_table set name = 'a1' where name = 'bcd';
INFO: : a
puf
-----
(1 row)
UPDATE 0
t=# select tableoid::regclass,* from my_table;
tableoid | name
------------+------
my_table_a | a1
my_table_b | bbb
(2 rows)
Of course using json to pass the NEW record looks ugly. And it is ugly indeed. But I did not have time to study the new PARTITION feature of 10, so I don't know an elegant way to do this task. Hopefully this gives the generic idea of how you can possibly solve the problem, and you will produce better, neater code.
update
It's probably a good idea to limit such a rule to ON update to my_table where left(NEW.name,1) <> left(OLD.name,1) do instead, to avoid the heavy manipulation when the row does not actually change partition.
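A hedged sketch of that narrower rule, reusing the puf() function from above:
create or replace rule psr as on update to my_table
where left(NEW.name,1) <> left(OLD.name,1)
do instead select puf(row_to_json(n),OLD.name) from (select NEW.*) n;
With the WHERE clause in place, updates that stay inside the same partition follow the regular update path, and only cross-partition updates go through the insert/delete dance.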

How does postgres db_link_build_sql_insert work?

I can't seem to figure out how this function is supposed to work for pushing data across from one table in my local database to another on a separate database. I have looked at the documentation and still don't understand the example given. I am working with Postgres 9.2, which makes it possible to use dblink.
Here is some example code where I am creating a test database and pushing values from my local table to the table on the test database. Can someone please fill in the missing part of the dblink_build_sql_insert function?
--drop database if exists testdb;
--create database testdb;
drop table if exists t;
create table t ( a integer, b text);
insert into t values (1,'10'), (2,'10'), (3,'30'), (4,'30');
create extension if not exists dblink;
select dblink_connect('dbname=testdb');
select dblink('drop table if exists t;');
select dblink('create table t ( a integer, b text);');
select dblink_build_sql_insert('t', ????);
select * from dblink('select * from t;') as (a integer, b text);
from docs:
Synopsis
dblink_build_sql_insert(text relname,
int2vector primary_key_attnums,
integer num_primary_key_atts,
text[] src_pk_att_vals_array,
text[] tgt_pk_att_vals_array) returns text
You don't have a PK specified, so I assume it is on (a), which automatically means that primary_key_attnums = 1 (PK on the first column) and num_primary_key_atts = 1 (a one-column PK). The two remaining values are made the same to prepare a statement that "replicates" the row with a=1 as is:
b=# select dblink_build_sql_insert('t',
'1'::int2vector,
1::int2, -- num of pkey values
'{1}'::text[], -- old pkey
'{1}'::text[] -- new pkey
)
;
dblink_build_sql_insert
-------------------------------------
INSERT INTO t(a,b) VALUES('1','10')
(1 row)
b=# select dblink($$INSERT INTO t(a,b) VALUES('1','10')$$);
dblink
----------------
("INSERT 0 1")
(1 row)
b=# select * from dblink('select * from t;') as (a integer, b text);
a | b
---+----
1 | 10
(1 row)
b=# select dblink_disconnect();
dblink_disconnect
-------------------
OK
(1 row)
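To push every local row rather than just a=1, the same two calls can be chained per row. A hedged sketch (run it before dblink_disconnect(); it assumes the single-column key on a as above, and since the remote t has no constraints, re-running it simply duplicates rows):
select dblink(dblink_build_sql_insert('t',
'1'::int2vector,
1::int2,
array[a::text],   -- old pkey value of this row
array[a::text]))  -- new pkey value, same, i.e. copy as is
from t;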

Constant column in child partition tables

We are using inheritance based partitioning in our application. The partitioning is on a column such that each partition has a different value for this column. Something like this:
CREATE TABLE base(
tblock INT NOT NULL,
-- other fields --
);
-- Create a partition
CREATE TABLE partition_1(
CHECK(tblock=1)
) INHERITS (base);
There are a lot of such partitions, and each one has a large number of records (in the millions). Overall database size is in the terabytes.
In the above schema, the partitions have to have a column tblock, even though each partition has a constant value for that column in all rows. This is clearly a waste of space on disk.
Is there any way to declare the partitions so that it does not actually store the value of tblock on disk?
We are currently on Postgresql 9.2.6.
Of course you can write a trigger that skips some columns, but if you do that you will lose the benefits of partitioning (constraint exclusion).
You can't drop this column if the table is inherited.
Here's some example:
create table base(i integer, j integer);
CREATE TABLE
create table inh_1() inherits (base);
CREATE TABLE
create table inh_2() inherits (base);
CREATE TABLE
CREATE OR REPLACE FUNCTION part_trigger()
RETURNS TRIGGER AS $$
BEGIN
if NEW.j = 1 THEN INSERT INTO inh_1 (i) VALUES (NEW.i);
ELSIF NEW.j = 2 THEN INSERT INTO inh_2 (i) VALUES (NEW.i);
END IF;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
CREATE FUNCTION
CREATE TRIGGER insert_base
BEFORE INSERT ON base
FOR EACH ROW EXECUTE PROCEDURE part_trigger();
CREATE TRIGGER
insert into base values (100,1);
INSERT 0 0
insert into base values (140,2);
INSERT 0 0
sebpa=# select * from base;
i | j
-----+---
100 |
140 |
(2 rows)
sebpa=# select * from inh_1;
i | j
-----+---
100 |
(1 row)
sebpa=# select * from inh_2;
i | j
-----+---
140 |
(1 row)
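The flip side is that readers of base no longer see the partition value, so it has to be reconstructed on read, for instance from tableoid. A hedged sketch for the inh_1/inh_2 layout above (the view name and the 1/2 mapping are mine, and it assumes the child tables are on the search_path):
create view base_v as
select i,
case tableoid::regclass::text
when 'inh_1' then 1
when 'inh_2' then 2
end as j
from base;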

Duplicate single database record

Hello, what is the easiest way to duplicate a DB record in the same table?
My problem is that the table where I am doing this has many columns, like 100+, and I don't like how the solution looks. Here is what I do (this is inside a plpgsql function):
...
1. Duplicate the record:
INSERT INTO history
SELECT NEXTVAL('history_id_seq'), col_1, col_2, ... , col_100
FROM history
WHERE history_id = 1234
ORDER BY datetime DESC
LIMIT 1
RETURNING history_id INTO new_history_id;
2. Update some columns:
UPDATE history
SET
col_5 = 'test_5',
col_23 = 'test_23',
datetime = CURRENT_TIMESTAMP
WHERE history_id = new_history_id;
Here are the problems I am attempting to solve:
Listing all these 100+ columns looks lame.
When a new column is eventually added, the function has to be updated too.
On separate DB instances the column order might differ, which would cause the function to fail.
I am not sure if I can list the columns once more (solving issue 3), like insert into <table> (<columns_list>) values (<query>), but then the query looks even uglier.
I would like to achieve something like 'insert into ', but this seems impossible: the unique primary key constraint will raise a duplication error.
Any suggestions?
Thanks in advance for your time.
This isn't pretty or particularly optimized, but there are a couple of ways to go about this. Ideally, you might want to do this all in an UPDATE trigger, though you could implement a duplication function something like this:
-- create source table
CREATE TABLE history (history_id serial not null primary key, col_2 int, col_3 int, col_4 int, datetime timestamptz default now());
-- add some data
INSERT INTO history (col_2, col_3, col_4)
SELECT g, g * 10, g * 100 FROM generate_series(1, 100) AS g;
-- function to duplicate record
CREATE OR REPLACE FUNCTION fn_history_duplicate(p_history_id integer) RETURNS SETOF history AS
$BODY$
DECLARE
cols text;
insert_statement text;
BEGIN
-- build list of columns
SELECT array_to_string(array_agg(column_name::name), ',') INTO cols
FROM information_schema.columns
WHERE (table_schema, table_name) = ('public', 'history')
AND column_name <> 'history_id';
-- build insert statement
insert_statement := 'INSERT INTO history (' || cols || ') SELECT ' || cols || ' FROM history WHERE history_id = $1 RETURNING *';
-- execute statement
RETURN QUERY EXECUTE insert_statement USING p_history_id;
RETURN;
END;
$BODY$
LANGUAGE 'plpgsql';
-- test
SELECT * FROM fn_history_duplicate(1);
history_id | col_2 | col_3 | col_4 | datetime
------------+-------+-------+-------+-------------------------------
101 | 1 | 10 | 100 | 2013-04-15 14:56:11.131507+00
(1 row)
As I noted in my original comment, you might also take a look at the colnames extension as an alternative to querying the information schema.
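For the OP's second step (tweaking a few columns on the fresh copy), a hedged usage sketch on top of fn_history_duplicate(); the value 42 for col_2 is just a placeholder:
do $$
declare
new_id integer;
begin
-- duplicate row 1 and capture the new primary key
select history_id into new_id from fn_history_duplicate(1);
-- then adjust whatever should differ on the copy
update history
set col_2 = 42,
datetime = current_timestamp
where history_id = new_id;
end;
$$;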
You don't need the update anyway; you can supply the constant values directly in the SELECT statement:
INSERT INTO history
SELECT NEXTVAL('history_id_seq'),
col_1,
col_2,
col_3,
col_4,
'test_5',
...
'test_23',
...,
col_100
FROM history
WHERE history_id = 1234
ORDER BY datetime DESC
LIMIT 1
RETURNING history_id INTO new_history_id;