PostgreSQL and primary key, foreign key indexing

On https://stackoverflow.com/questions/10356484/how-to-add-on-delete-cascade-constraints#= a user, kgrittn, commented:
But I notice that you have not created indexes on referencing columns... Deletes on the referenced table will take a long time without those, if you get many rows in those tables. Some databases automatically create an index on the referencing column(s); PostgreSQL leaves that up to you, since there are some cases where it isn't worthwhile.
I'm having difficulty understanding this completely. Is he saying that primary keys are not automatically given an index, or is he saying that foreign keys should be indexed (in particular cases, that is)? I've looked at the PostgreSQL documentation, and it appears from there that an index is created automatically for primary keys. Is there a command I can use to list all indexes?
Thanks

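Behind the scenes, a primary key is backed by a special kind of unique index, and PostgreSQL creates that index automatically. The quote is about something else: it says it may be a good idea to also create an index on the referencing columns, that is, the columns in other tables where the primary key is used as a foreign key. PostgreSQL does not create those indexes for you.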
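To make the distinction concrete, here is a minimal sketch. The tables parent and child are hypothetical and not taken from the question. It shows the index PostgreSQL creates on its own for a primary key, the index on the referencing column that you have to create yourself, and one way to list indexes through the pg_indexes system view.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE parent (
    id integer PRIMARY KEY   -- unique index parent_pkey is created automatically
);

CREATE TABLE child (
    id        integer PRIMARY KEY,   -- unique index child_pkey is created automatically
    parent_id integer NOT NULL REFERENCES parent (id) ON DELETE CASCADE
);

-- Not created automatically: an index on the referencing column.
-- Without it, a DELETE on parent has to scan child to find matching rows.
CREATE INDEX child_parent_id_idx ON child (parent_id);

-- List all indexes in the public schema.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'public'
ORDER BY tablename, indexname;
```

In psql, the \di meta-command gives a similar listing, and \d child shows the indexes of a single table.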

Related

SQLAlchemy, directly inserting primary keys seems to disable key auto generation

I am trying to populate some tables using data that I extracted from Google BigQuery. For that purpose I essentially normalized a flattened table into multiple tables that include the primary key of each row in the multiple tables. The important point is that I need to load those primary keys in order to satisfy foreign key references.
Having inserted this data into tables, I then try to add new rows to these tables. I don't specify the primary key, presuming that Postgres will auto-generate those key values.
However, I always get an error of the form 'duplicate key value violates unique constraint "xxx_pkey"', e.g.
duplicate key value violates unique constraint "collection_pkey" DETAIL: Key (id)=(1) already exists.
It seems this is triggered by including the primary key in the data when initializing the table. That is, explicitly setting primary keys somehow seems to disable or reset the expected auto-generation of the primary key. I was expecting that new rows would be assigned primary keys starting from the highest value already in the table.
Interestingly I get the same error whether I try to add a row via SQLAlchemy or from the psql console.
So, is this as expected? And if so, is there some way to get the system to again auto-generate keys? There must be some hidden psql state that controls this...the schema is unchanged by directly inserting keys, but psql behavior is changed by that action.
I am happy to provide additional information.
Thanks
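A likely explanation, offered as an assumption because the table definition is not shown: if id is a serial or identity column, its default value comes from a sequence, and rows inserted with explicit ids do not advance that sequence. It therefore still hands out 1 on the next insert. A minimal sketch of resynchronizing it, assuming the table is named collection (as in the error message) and id is backed by a sequence:

```sql
-- Assumption: collection.id is a serial or identity column, so it owns a sequence.
-- Set the sequence so that the next generated value is max(id) + 1
-- (or 1 if the table is empty).
SELECT setval(
    pg_get_serial_sequence('collection', 'id'),
    COALESCE((SELECT max(id) FROM collection), 0) + 1,
    false   -- false: the next nextval() returns exactly this value
);
```

This would need to be repeated for each table that was loaded with explicit key values.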

Handling the order of dropping constraints in Postgres

I am using a tool called apgdiff 'https://www.apgdiff.com/' to find the DDL diff between two Postgres databases. It parses two Postgres dumps and generates the diff between them as ALTER statements.
The tool doesn't really take the order of creating or dropping foreign key constraints into account while generating the diff, i.e. that foreign key constraints should be created after the primary key, or dropped before the primary key is dropped. What makes me curious, though, is a line in their source code which says that all primary keys should be dropped first and then all other, non-primary-key constraints. Is there any such rule in Postgres that primary keys should be dropped first and then the remaining constraints?

If anything, other constraints should be dropped first, because foreign key constraints depend on primary key (or unique) constraints. It doesn't matter, though, if you use the CASCADE keyword when dropping the constraints.
I can't see a reason why dropping primary key constraints first should make a difference.
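The two orderings can be sketched as follows. The tables parent and child are hypothetical, and the constraint names rely on PostgreSQL's default naming, which is an assumption about your schema. Each option runs in a transaction that is rolled back, so the script can be tried without changing anything.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE parent (id integer PRIMARY KEY);
CREATE TABLE child (
    id        integer PRIMARY KEY,
    parent_id integer REFERENCES parent (id)   -- default name: child_parent_id_fkey
);

-- Option 1: drop the dependent foreign key first, then the primary key.
BEGIN;
ALTER TABLE child  DROP CONSTRAINT child_parent_id_fkey;
ALTER TABLE parent DROP CONSTRAINT parent_pkey;
ROLLBACK;

-- Option 2: drop the primary key with CASCADE, which also drops
-- the foreign key constraints that depend on it.
BEGIN;
ALTER TABLE parent DROP CONSTRAINT parent_pkey CASCADE;
ROLLBACK;
```

Dropping parent_pkey first without CASCADE fails, because child_parent_id_fkey still depends on it.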

FOR KEY SHARE - what are 'key values'?

The documentation states
A key-shared lock blocks other transactions from performing DELETE or any UPDATE that changes the key values.
Does "key values" refer to the primary key, or the unique keys, or the indexed keys, or the columns used for the SELECT query?

The term "key values" refers to the values of columns that can be referenced by a foreign key.
Alvaro Herrera, the author of the patch in Postgres 9.3 wrote (per this source):
Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch.
You can also find this mention in the documentation:
Currently, the set of columns considered for the UPDATE case are those that have a unique index on them that can be used in a foreign key (so partial indexes and expressional indexes are not considered), but this may change in the future.
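A minimal sketch of that behavior, assuming a hypothetical table account in which id is a key column (it has a unique index usable by a foreign key) and note is not. The statements are meant to be run in two separate sessions, as the comments indicate, not as one script.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE account (
    id   integer PRIMARY KEY,
    note text
);
INSERT INTO account VALUES (1, 'initial');

-- Session A: take a key-shared lock on the row and keep the transaction open.
BEGIN;
SELECT * FROM account WHERE id = 1 FOR KEY SHARE;

-- Session B, while session A's transaction is still open
-- (each statement considered on its own):
UPDATE account SET note = 'changed' WHERE id = 1;  -- proceeds: no key value changes
UPDATE account SET id = 2 WHERE id = 1;            -- waits: changes a key value
DELETE FROM account WHERE id = 1;                  -- waits as well

-- Session A: ending the transaction releases the lock,
-- so a waiting statement in session B can continue.
COMMIT;
```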

How to create a primary key using the hash method in postgresql

Is there any way to create a primary key using the hash method? Neither of the following statements works:
oid char(30) primary key using hash
primary key(oid) using hash

I assume you meant to use the hash index method / type.
Primary keys are constraints. Some constraints create indexes in order to work properly (but this fact should not be relied upon). For example, a UNIQUE constraint creates a unique index. Note that only B-tree currently supports unique indexes. The PRIMARY KEY constraint is a combination of the UNIQUE and NOT NULL constraints, so (currently) it only supports B-tree.
You can set up a hash index too, if you want (besides the PRIMARY KEY constraint) -- but you cannot make that unique.
CREATE INDEX name ON table USING hash (column);
But if you do this, be aware that hash indexes had some limitations in versions before PostgreSQL 10:
Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash if there were unwritten changes. Also, changes to hash indexes are not replicated over streaming or file-based replication after the initial base backup, so they give wrong answers to queries that subsequently use them. For these reasons, hash index use is presently discouraged.
Also:
Currently, only the B-tree, GiST and GIN index methods support multicolumn indexes.
Note: Unfortunately, oid is not the best name for a column in PostgreSQL, because it can also be a name for a system column and type.
Note 2: The char(n) type is also discouraged. You can use varchar or text instead, with a CHECK constraint -- or (if the id is UUID-like) the uuid type itself.
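Putting these points together, a minimal sketch might look like this. The table name item and the column name item_id are hypothetical, chosen to avoid the name oid. The primary key keeps its B-tree unique index, and a separate, non-unique hash index is added next to it.

```sql
-- Hypothetical names, for illustration only.
CREATE TABLE item (
    -- text with a CHECK constraint instead of char(30);
    -- the PRIMARY KEY is enforced by a B-tree unique index (item_pkey).
    item_id text PRIMARY KEY CHECK (char_length(item_id) <= 30)
);

-- Additional hash index on the same column; it cannot be declared UNIQUE.
CREATE INDEX item_item_id_hash_idx ON item USING hash (item_id);
```

Whether the extra hash index is worth its maintenance cost depends on the workload; the planner can only use it for equality comparisons.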

EF db first and table without key

I am trying to use Entity Framework DB first to do quick prototyping of a reporting website for a huge db. The problem is that one of the tables doesn't have a key. I get 'Error 159: EntityType has no key defined'. If I add a key in the model designer, I get 'Error 3024: Must specify mapping for all key properties'. My question is whether there is a way to work around this WITHOUT adding a key to the table. The table is not in our control.

A huge table which does not have a key? Neither you nor the table owner can search for anything in this table without a full table scan. Also, it is basically impossible to UPDATE a single row without having a primary key.
You really have to either create a synthetic key or ask the owner to do that. As a workaround, you might be able to find some existing column (or a combination of 2-3 columns) that is unique enough to be used as a unique key. If it is unique but does not have an actual index, that would still be bad for performance, so you should create such an index.
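A minimal sketch of that workaround, assuming a hypothetical table report_row whose columns region and report_date are the candidate key. None of these names come from the question, and creating the index still requires the owner's cooperation.

```sql
-- Hypothetical table and columns, for illustration only.
-- Step 1: check whether the candidate columns are really unique.
-- Any row returned here is a duplicate combination.
SELECT region, report_date, count(*) AS occurrences
FROM report_row
GROUP BY region, report_date
HAVING count(*) > 1;

-- Step 2: if no duplicates were found, back the candidate key with a unique index.
CREATE UNIQUE INDEX report_row_region_date_uidx
    ON report_row (region, report_date);
```

The same columns could then be marked as the entity key in the model, which is an assumption about how the mapping would be set up rather than something the answer spells out.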