dataframe.to_sql index as primary key in postgresql

I have a dataframe with an index that I want to store in a postgresql database. For this I use df.to_sql(table_name,engine,if_exists='replace', index=True,chunksize=10000)
The index column from the pandas dataframe is copied to the database but is not set as primary key.
There are two solutions, each requiring an additional step:
- Specify a schema: df.to_sql(schema=) (see the pandas docs).
- Set the primary key after the table is ingested, with the query:
ALTER TABLE table_name ADD PRIMARY KEY (id_column_name)
Is there a way to set the primary key without specifying the schema or altering the table?

After calling to_sql:
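from sqlalchemy import create_engine, text
engine = create_engine(db_url)  # db_url is a placeholder for your PostgreSQL connection URL
# engine.execute() was removed in SQLAlchemy 2.0, so run the statement on a connection
with engine.begin() as conn:
    conn.execute(text('ALTER TABLE schema.table ADD PRIMARY KEY (keycolumn);'))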
Unfortunately, DataFrame.to_sql does not set a primary key. With if_exists='replace' it even drops the existing table, and with it any primary key that table had, so you have to add the primary key again after every write.
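The two steps above can be wrapped in one helper. This is only a sketch: the names quote_ident, add_primary_key_sql and to_sql_with_primary_key are made up for illustration, and it assumes a single-level index whose values are unique and non-null and a SQLAlchemy engine connected to PostgreSQL.

```python
from typing import Optional


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def add_primary_key_sql(table: str, column: str, schema: Optional[str] = None) -> str:
    """Build the ALTER TABLE statement that makes `column` the primary key."""
    qualified = quote_ident(table)
    if schema is not None:
        qualified = quote_ident(schema) + "." + qualified
    return f"ALTER TABLE {qualified} ADD PRIMARY KEY ({quote_ident(column)})"


def to_sql_with_primary_key(df, table, engine, key, schema=None):
    """Write df with its index stored as column `key`, then add the primary key.

    Hypothetical helper: assumes a single-level, unique, non-null index and
    a SQLAlchemy engine pointing at PostgreSQL.
    """
    # Imported here so the SQL builders above work without SQLAlchemy installed.
    from sqlalchemy import text

    df.to_sql(table, engine, schema=schema, if_exists="replace",
              index=True, index_label=key, chunksize=10000)
    # engine.begin() commits on success and rolls back if ALTER TABLE fails.
    with engine.begin() as conn:
        conn.execute(text(add_primary_key_sql(table, key, schema)))
```

A call such as to_sql_with_primary_key(df, table_name, engine, "id") would then replace the separate to_sql and ALTER TABLE calls. The table is still altered after the write; the helper only hides the extra step.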

Related

PostgreSQL declarative partition - unique constraint on partitioned table must include all partitioning columns [duplicate]

I'm trying to create a partitioned table which refers to itself, creating a doubly-linked list.
CREATE TABLE test2 (
id serial NOT NULL,
category integer NOT NULL,
time timestamp(6) NOT NULL,
prev_event integer,
next_event integer
) PARTITION BY HASH (category);
When I add a primary key, I get the following error.
alter table test2 add primary key (id);
ERROR: unique constraint on partitioned table must include all partitioning columns
DETAIL: PRIMARY KEY constraint on table "test2" lacks column "category" which is part of the partition key.
Why does the unique constraint require all partitioning columns to be included?
EDIT: Now I understand why this is needed: https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-LIMITATIONS
Once I add PK with both columns it works.
alter table test2 add primary key (id, category);
But then adding the FK to itself doesn't work.
alter table test2 add foreign key (prev_event) references test2 (id) on update cascade on delete cascade;
ERROR: there is no unique constraint matching given keys for referenced table "test2"
Since the PK is not just id but (id, category), I can't create an FK pointing to id alone.
Is there any way to deal with this or am I missing something?
I would like to avoid using inheritance partitioning if possible.
EDIT2: It seems this is a known problem. https://www.reddit.com/r/PostgreSQL/comments/di5mbr/postgresql_12_foreign_keys_and_partitioned_tables/f3tsoop/
It seems that there is no straightforward solution; PostgreSQL simply doesn't support this as of v14. One workaround is to use triggers to enforce 'foreign key' behavior. Another is to use multi-column foreign keys that include the partition key. Both are far from optimal.
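To make the multi-column workaround concrete, here is a rough sketch that builds such a foreign key. The function name self_referencing_fk_sql is hypothetical, identifiers are assumed to be plain lowercase names that need no quoting, and the statement assumes PostgreSQL 12 or later, where foreign keys may reference a partitioned table.

```python
from typing import Sequence


def self_referencing_fk_sql(table: str,
                            link_column: str,
                            key_column: str,
                            partition_columns: Sequence[str]) -> str:
    """Build a multi-column FK from `table` back to itself.

    The referenced columns are (key_column, *partition_columns), which is
    expected to match the table's composite primary key.
    """
    local_columns = ", ".join([link_column, *partition_columns])
    target_columns = ", ".join([key_column, *partition_columns])
    return (
        f"ALTER TABLE {table} ADD FOREIGN KEY ({local_columns}) "
        f"REFERENCES {table} ({target_columns}) "
        "ON UPDATE CASCADE ON DELETE CASCADE"
    )
```

For the table above, self_referencing_fk_sql("test2", "prev_event", "id", ["category"]) produces a key on (prev_event, category) referencing (id, category). The price is that a row can only link to events in its own category, which is one reason this workaround is far from optimal.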

CREATE TABLE LIKE with different primary key for partitioning

I've got an existing table of dogs which I would like to partition by list using the colour column:
CREATE TABLE dogs (
id int PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
colour text,
name text
)
;
Because it's not possible to partition an existing table, I'm going to make a new empty partitioned table then copy the data across.
CREATE TABLE
_dogs
(
LIKE
dogs
INCLUDING ALL
)
PARTITION BY LIST (colour)
;
This reports:
ERROR: insufficient columns in PRIMARY KEY constraint definition
DETAIL: PRIMARY KEY constraint on table "_dogs" lacks column "colour" which is part of the partition key.
I know I can ignore the primary key like this, but it seems bad to have a table with no primary key!
CREATE TABLE
_dogs
(
LIKE
dogs
INCLUDING ALL
EXCLUDING INDEXES
)
PARTITION BY LIST (colour)
;
What's the best way to proceed so I have a partitioned table which still has a primary key?
Add the partitioning key to the primary key:
ALTER TABLE _dogs
ADD PRIMARY KEY (id, colour);
There is no other way to have a primary key or unique constraint on a partitioned table: it must include every partitioning column.
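If the copy is scripted, the rule can be encoded once so the partition key is never forgotten. A minimal sketch, assuming the partition key consists of plain column names (not expressions) and that identifiers need no quoting; with_partition_columns and partitioned_primary_key_sql are illustrative names only.

```python
from typing import List, Sequence


def with_partition_columns(key_columns: Sequence[str],
                           partition_columns: Sequence[str]) -> List[str]:
    """Return key_columns followed by any partition columns they lack."""
    columns = list(key_columns)
    for column in partition_columns:
        if column not in columns:
            columns.append(column)
    return columns


def partitioned_primary_key_sql(table: str,
                                key_columns: Sequence[str],
                                partition_columns: Sequence[str]) -> str:
    """Build ADD PRIMARY KEY over the key columns plus the partition key."""
    columns = with_partition_columns(key_columns, partition_columns)
    return f"ALTER TABLE {table} ADD PRIMARY KEY ({', '.join(columns)})"
```

For the table in the question, partitioned_primary_key_sql("_dogs", ["id"], ["colour"]) yields ALTER TABLE _dogs ADD PRIMARY KEY (id, colour). Note that id alone is then no longer guaranteed unique by the constraint, only the pair is.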

Dropping Unique Constraint - PostgreSQL

TL;DR
I am seeking clarity on this: does a FOREIGN KEY require a UNIQUE CONSTRAINT on the other side, specifically, in Postgres and, generally, in relational database systems?
Perhaps, I can test this, but I'll ask, if the UNIQUE CONSTRAINT is required by the FOREIGN KEY what would happen if I don't create it? Will the Database create one or will it throw an error?
How I got there
I had earlier on created a table with a column username on which I imposed a unique constraint. I then created another table with a column bearer_name having a FOREIGN KEY referencing the previous table's column username; the one which had a UNIQUE CONSTRAINT.
Now, I want to drop the UNIQUE CONSTRAINT on the username column from the database because I have later on created a UNIQUE INDEX on the same column and intuitively I feel that they serve the same purpose, or don't they? But the database is complaining that the UNIQUE INDEX has some dependent objects and so it can't be dropped unless I provide CASCADE as an option in order to drop even the dependent object. It's identifying the FOREIGN KEY on bearer_name column in the second table as the dependent object.
And is it possible for the FOREIGN KEY to point to the UNIQUE INDEX instead of the UNIQUE CONSTRAINT?
I am seeking clarity on this: does a FOREIGN KEY require a UNIQUE CONSTRAINT on the other side
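No, it does not have to be a UNIQUE CONSTRAINT specifically. The referenced column can be covered by a PRIMARY KEY, a UNIQUE CONSTRAINT, or a UNIQUE INDEX. Without any of them, the database throws an error rather than creating one for you.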
Perhaps, I can test this, but I'll ask, if the UNIQUE CONSTRAINT is required by the FOREIGN KEY what would happen if I don't create it? Will the Database create one or will it throw an error?
CREATE TABLE tab_a(a_id INT, b_id INT);
CREATE TABLE tab_b(b_id INT);
ALTER TABLE tab_a ADD CONSTRAINT fk_tab_a_tab_b FOREIGN KEY (b_id)
REFERENCES tab_b(b_id);
ERROR: there is no unique constraint matching given keys
for referenced table "tab_b"
And is it possible for the FOREIGN KEY to point to the UNIQUE INDEX instead of the UNIQUE CONSTRAINT?
Yes, it is possible.
CREATE UNIQUE INDEX tab_b_i ON tab_b(b_id);
With this index in place, the ALTER TABLE ... ADD CONSTRAINT statement above succeeds.

How to add primary key to View?

I have a view and want to make one attribute a primary key.
CREATE VIEW filedata_view
AS SELECT num PRIMARY KEY, id, ST_TRANSFORM(the_geom,900913) AS the_geom
FROM filedata
But I get an error:
ERROR: syntax error at or near "PRIMARY"
LINE 2: AS SELECT num PRIMARY KEY, id, ST_TRANSFORM(the_geom,900913)...
How to do this?
Views in Postgresql can't have primary keys.
Creating a constraint on a view is the wrong approach: constraints should be created on tables. Some DBMSes, such as Oracle, do support adding constraints to views, with this syntax:
ALTER VIEW VIEW_NAME ADD CONSTRAINT PK_VIEW_NAME PRIMARY KEY (COLUMN_NAME) DISABLE NOVALIDATE;
From the Oracle documentation on constraints:
You can specify only unique, primary key, and foreign key constraints on views, and they are supported only in DISABLE NOVALIDATE mode.
So Oracle supports this only for compatibility; the constraint is not enforced. If you want a primary key to stop duplicate values being inserted into column num of the filedata table, alter the filedata table to add a primary key on that column, or create the table with a primary key on column num from the start.
Postgresql doesn't support constraints on views. Other DBMSes (eg Oracle) do support this but Postgresql doesn't.

PostgreSQL: is it possible to provide custom name for PRIMARY KEY or UNIQUE?

When I write:
CREATE TABLE accounts (
username varchar(64) PRIMARY KEY,
I get primary key named:
accounts_pkey
Is it possible to assign my own custom name, for instance "accounts_primary_key"?
Same story about UNIQUE.
I couldn't find it in PostgreSQL documentation.
Thanks in advance.
The trick is the CONSTRAINT part in the column_constraint section of CREATE TABLE. Example:
> create table x(xx text constraint xxxx primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "xxxx" for table "x"
CREATE TABLE
This works for all kinds of constraints, including PRIMARY KEY and UNIQUE.
See the docs of CREATE TABLE for details.