Keeping a table column empty when it is indexed as unique - postgresql

Is it possible to keep a table column empty if it's defined as unique?
Table schema:
 Column |         Type          | Modifiers | Description
--------+-----------------------+-----------+-------------
 id     | integer               | not null  |
 name   | character varying(64) |           |
Indexes:
    "clients_pkey" PRIMARY KEY, btree (id)
    "clients_name_idx" UNIQUE, btree (name)
Has OIDs: yes
Due to modifications to the application sometimes the name column needs to be empty, is this possible at all?

If the column can contain NULL values, then that is OK, as NULL is not included in the index.
Note that some databases don't implement the standard properly (some versions of SQL Server only allowed one NULL value per unique constraint, but I'm not sure whether that is still the case).

Using NULL is the better option, but you could also use a conditional unique index:
CREATE UNIQUE INDEX unique_clients_name ON clients (name) WHERE name <> '';
And avoid OIDs; they are obsolete (user-table OIDs were removed entirely in PostgreSQL 12).
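For example, a sketch against the clients table above (assuming the unique index from the question is in place):

```sql
-- Multiple NULLs coexist under a unique index, because NULLs are
-- never considered equal to each other; duplicate non-NULL values fail.
INSERT INTO clients (id, name) VALUES (1, NULL);    -- ok
INSERT INTO clients (id, name) VALUES (2, NULL);    -- also ok
INSERT INTO clients (id, name) VALUES (3, 'acme');  -- ok
INSERT INTO clients (id, name) VALUES (4, 'acme');  -- ERROR: duplicate key value
```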

Related

Oracle Primary Key to Postgres

I'm migrating tables from Oracle to Postgres.
When a Primary key is created on an Oracle table, it implicitly creates a unique index with the same name.
But no such index appears to be created in Postgres, or at least it is not visible in the data dictionary tables.
Postgres doesn't allow creating an index with the primary key's name. I want to know whether a unique index is required in Postgres on the primary key column. Does it in any way alter query performance if I do not create a unique index for the primary key column? Thanks in advance.
That is not correct; PostgreSQL creates the index implicitly here as well:
create table pk_test (id integer primary key);
\d pk_test
Table "public.pk_test"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
id | integer | | not null |
Indexes:
"pk_test_pkey" PRIMARY KEY, btree (id)
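The implicit index is also visible in the catalogs, for example via the pg_indexes view (the exact indexdef text may vary slightly between versions):

```sql
-- List all indexes on the table, including the one backing the primary key.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'pk_test';
-- indexname    | pk_test_pkey
-- indexdef     | CREATE UNIQUE INDEX pk_test_pkey ON public.pk_test USING btree (id)
```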

How do I reference a unique index that uses a function in ON CONFLICT?

I'm using postgres 9.5.3, and I have a table like this:
CREATE TABLE packages (
id SERIAL PRIMARY KEY,
name VARCHAR NOT NULL
);
I have defined a function, canonical_name, like this:
CREATE FUNCTION canonical_name(text) RETURNS text AS $$
SELECT replace(lower($1), '-', '_')
$$ LANGUAGE SQL;
I've added a unique index to this table that uses the function:
CREATE UNIQUE INDEX index_package_name
ON packages (canonical_name(name));
CREATE INDEX
# \d+ packages
Table "public.packages"
Column | Type | Modifiers | Storage | Stats target | Description
--------+-------------------+-------------------------------------------------------+----------+--------------+-------------
id | integer | not null default nextval('packages_id_seq'::regclass) | plain | |
name | character varying | not null | extended | |
Indexes:
"packages_pkey" PRIMARY KEY, btree (id)
"index_package_name" UNIQUE, btree (canonical_name(name::text))
And this unique index is working as I expect; it prevents insertion of duplicates:
INSERT INTO packages (name)
VALUES ('Foo-bar');
INSERT INTO packages (name)
VALUES ('foo_bar');
ERROR: duplicate key value violates unique constraint "index_package_name"
DETAIL: Key (canonical_name(name::text))=(foo_bar) already exists.
My problem is that I want to use this unique index to do an upsert, and I can't figure out how I need to specify the conflict target. The documentation seems to say I can specify an index expression:
where conflict_target can be one of:
( { index_column_name | ( index_expression ) } [ COLLATE collation ] [ opclass ] [, ...] ) [ WHERE index_predicate ]
ON CONSTRAINT constraint_name
But all of these things below that I've tried produce errors as shown, instead of a working upsert.
I've tried matching the index expression as I specified it:
INSERT INTO packages (name)
VALUES ('foo_bar')
ON CONFLICT (canonical_name(name))
DO UPDATE SET name = EXCLUDED.name;
ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
Matching the index expression as \d+ showed it:
INSERT INTO packages (name)
VALUES ('foo_bar')
ON CONFLICT (canonical_name(name::text))
DO UPDATE SET name = EXCLUDED.name;
ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
Just naming the column that the unique index is on:
INSERT INTO packages (name)
VALUES ('foo_bar')
ON CONFLICT (name)
DO UPDATE SET name = EXCLUDED.name;
ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
Using the index name instead:
INSERT INTO packages (name)
VALUES ('foo_bar')
ON CONFLICT (index_package_name)
DO UPDATE SET name = EXCLUDED.name;
ERROR: column "index_package_name" does not exist
LINE 3: ON CONFLICT (index_package_name)
So how do I specify that I want to use this index? Or is this a bug?
Important note: This behavior can only be observed on versions before 9.5.4. This is a bug that was fixed in 9.5.4. The rest of the answer describes the buggy behavior:
As you found out, you can only specify the expression for a unique constraint and not the one for a unique index.
This is somewhat confusing because under the hood a unique constraint is just a unique index (but that is considered an implementation detail).
To make matters worse for you, you cannot define a unique constraint over a unique index that contains expressions. I am not certain what the reason is, but I suspect the SQL standard.
One way you can do this would be to add an artificial column, filled with the “canonical name” by a trigger and define the constraint on that column. I admit that that is not nice.
The correct solution, however, is to upgrade to the latest minor release for PostgreSQL 9.5.
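Once on 9.5.4 or later, the first statement from the question should work as written, since it matches the documented conflict_target syntax for an index expression:

```sql
-- Upsert against the expression index; the conflict target is written
-- exactly like the index expression (the cast is added implicitly).
INSERT INTO packages (name)
VALUES ('foo_bar')
ON CONFLICT (canonical_name(name))
DO UPDATE SET name = EXCLUDED.name;
```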

Cause of PostgreSQL foreign key violation?

My PostgreSQL (9.2) database contains two tables registrations and attributes with a foreign key constraint:
postgres=# \d+ registrations;
Table "public.registrations"
Column | Type | Modifiers | Storage | Stats target | Description
---------+-------+-----------+----------+--------------+-------------
name | text | not null | extended | |
parent | text | | extended | |
storage | bytea | | extended | |
Indexes:
"registrations_pkey" PRIMARY KEY, btree (name)
Referenced by:
TABLE "attributes" CONSTRAINT "attributes_cname_fkey" FOREIGN KEY (cname) REFERENCES registrations(name) ON DELETE CASCADE
Has OIDs: no
postgres=# \d+ attributes;
Table "public.attributes"
Column | Type | Modifiers | Storage | Stats target | Description
--------+-------+-----------+----------+--------------+-------------
cname | text | not null | extended | |
aname | text | not null | extended | |
tags | text | | extended | |
value | bytea | | extended | |
Indexes:
"attributes_pkey" PRIMARY KEY, btree (cname, aname)
Foreign-key constraints:
"attributes_cname_fkey" FOREIGN KEY (cname) REFERENCES registrations(name) ON DELETE CASCADE
Has OIDs: no
At some point I realised that some rows violated the foreign key constraint:
postgres=# SELECT COUNT(*) FROM attributes LEFT JOIN registrations ON attributes.cname=registrations.name WHERE registrations.name IS NULL;
count
-------
71
(1 row)
Could you help me understand how this corruption could happen?
A constraint marked as NOT VALID is the one case you might expect to see violations, but the NOT VALID clause would show up in the psql \d+ output. (I believe it's possible to manually update this flag in the catalog, but I hope for your sake that this isn't the issue...)
As far as I know, the only supported way of bypassing a foreign key check is to SET session_replication_role TO replica before modifying the data. This is there for the benefit of replication processes, operating under the assumption that the constraint has already been validated on the master - though this can certainly go wrong if your replicator is buggy or misconfigured.
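As a sketch of what such a bypass looks like (superuser only; shown for diagnosis, not as a recommendation, and the inserted values are made up):

```sql
-- With session_replication_role set to replica, the FK triggers do not fire,
-- so a row referencing a nonexistent registration slips through.
SET session_replication_role TO replica;
INSERT INTO attributes (cname, aname) VALUES ('no_such_client', 'attr');
SET session_replication_role TO DEFAULT;
```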
It's also possible for a superuser to manually disable the constraint's underlying triggers (and it's often tempting for someone trying to speed up a bulk import). The following will tell you if the triggers are currently active (tgenabled should be 'O'):
SELECT *
FROM pg_trigger
WHERE tgname ~ '^RI_ConstraintTrigger'
  AND tgrelid IN ('registrations'::regclass, 'attributes'::regclass);
I don't think there's any way of knowing whether this was temporarily changed in the past, though if you have statement logging enabled, you might find an ALTER TABLE ... DISABLE TRIGGER statement in there somewhere.
There is also at least one loophole in the foreign key enforcement, and of course, it's always possible that you've found a bug...
This can happen if the FK constraint was created with a NOT VALID clause (don't do this):
CREATE TABLE one
( one_id INTEGER NOT NULL PRIMARY KEY
);
CREATE TABLE two
( two_id INTEGER NOT NULL PRIMARY KEY
, one_id INTEGER NOT NULL
);
INSERT INTO one(one_id)
SELECT gs FROM generate_series(0,12,2) gs;
INSERT INTO two(two_id,one_id)
SELECT gs, gs FROM generate_series(0,12,3) gs;
ALTER TABLE two
ADD CONSTRAINT omg FOREIGN KEY (one_id) references one(one_id)
-- DEFERRABLE INITIALLY DEFERRED
NOT VALID
;
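If you later validate such a constraint, the pre-existing violations surface (the exact error text may vary by version):

```sql
-- Validation rescans the table and fails on the rows that were never checked
-- (here, two(one_id) values 3 and 9 have no match in one).
ALTER TABLE two VALIDATE CONSTRAINT omg;
-- ERROR: insert or update on table "two" violates foreign key constraint "omg"
```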

How does LIMIT interact with DELETE by primary key in Postgres? (Fix corrupt unique index)

I've been handed a database that's stuck in a weird state. At some indeterminate time in the past, I ended up in a situation where I had duplicate rows in the same table with the same primary key:
=> \d my_table
Table "public.my_table"
Column | Type | Modifiers
--------------------+-------------------------+-----------
id | bigint | not null
some_data | bigint |
a_string | character varying(1024) | not null
Indexes:
"my_table_pkey" PRIMARY KEY, btree (id)
=> SELECT id, count(*) FROM my_table GROUP BY id HAVING count(*) > 1 ORDER BY id;
#50-some results, non-consecutive rows.
I have no idea how the database got into this state, but I want to be able to safely get out of it. If, for each duplicated primary key, I execute a query of the form:
DELETE FROM my_table WHERE id = "a_duplicated_row" LIMIT 1;
Is it only going to delete one row from the table, or is it going to delete both rows with the given primary key?
Alas, PostgreSQL does not yet implement LIMIT for DELETE or UPDATE, so the statement above is rejected with a syntax error rather than deleting anything. If the rows are indistinguishable in every other way, you will need to carefully use the hidden ctid column to break ties, as discussed here. Or create a new table by selecting the distinct tuples from the existing one, and rename it into place.
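A sketch of the ctid approach, keeping one arbitrary physical row per duplicated id (since the unique index itself is suspect, rebuilding it afterwards seems prudent):

```sql
-- Self-join on id and delete every row except the one with the lowest ctid.
-- ctid is a row's physical location, so it is unique even among duplicates.
DELETE FROM my_table a
USING my_table b
WHERE a.id = b.id
  AND a.ctid > b.ctid;

-- Rebuild the corrupt primary key index once the duplicates are gone.
REINDEX INDEX my_table_pkey;
```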

PostgreSQL: Should I alter an existing index or create a new one?

I have a billing_infos table that has a column called order_id. It already has an index.
\d billing_infos
...
Indexes:
"billing_infos_pkey" PRIMARY KEY, btree (id)
"billing_infos_address_id_idx" btree (address_id)
"index_billing_infos_on_order_id" btree (order_id) //<--this one
However, I am not sure if this is a unique index. I have the task of making a unique index, but I'm not sure if I should change the one that's already there or create a new one. The order_id values should all be unique.
Should I create a new index or alter the existing one?
And how do I check to see if the existing indexes are unique?
It is probably least invasive to create the unique index concurrently. Note, however, that a CONSTRAINT is the recommended way to enforce uniqueness. A UNIQUE index is more useful when the check requires an expression over the columns, for example using COALESCE() to prevent NULLs from bypassing the UNIQUE check.
Eg.
create unique index foo_col1_col2_uidx on foo (col1, coalesce(col2,-1));
In the example above, col2 is an integer column and is not defined as NOT NULL.
Example of creating unique index concurrently.
create unique index concurrently billing_infos_order_id_uidx on billing_infos (order_id);
The output in psql from \d for a UNIQUE index (I've named mine _uidx) and UNIQUE CONSTRAINT (_uc) looks like the following:
\d foo
Table "public.foo"
Column | Type | Modifiers
--------+-----------------------------+-----------
x | integer |
tstamp | timestamp without time zone |
col | text |
Indexes:
"foo_col_uidx" UNIQUE, btree (col) <<< unique index
"foo_tstamp_uc" UNIQUE CONSTRAINT, btree (tstamp) <<< unique constraint
"foo_idx" btree (x)
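To answer the second part of the question directly: you can also ask the catalogs whether each index on the table enforces uniqueness, via pg_index:

```sql
-- indisunique is true for unique indexes (including those backing
-- primary keys and unique constraints).
SELECT c.relname AS index_name, i.indisunique
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE i.indrelid = 'billing_infos'::regclass;
```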
That is not a unique index.
Try creating a simple table which has a primary key, a column with a unique constraint, and a column with a normal index:
create table example (id integer primary key, alpha integer, beta integer, gamma integer);
alter table example add constraint alpha_unique unique (alpha);
create index beta_normal on example (beta);
create unique index gamma_unique on example (gamma);
If you use \d on it, the output is:
Table "public.example"
Column | Type | Modifiers
--------+---------+-----------
id | integer | not null
alpha | integer |
beta | integer |
gamma | integer |
Indexes:
"example_pkey" PRIMARY KEY, btree (id)
"alpha_unique" UNIQUE CONSTRAINT, btree (alpha)
"gamma_unique" UNIQUE, btree (gamma)
"beta_normal" btree (beta)
As you can see, when an index is unique, it says so. You can even see when the index implements a unique constraint.
So, what should you do? Firstly, don't add a unique index. Don't ever do that. If you want to impose a uniqueness constraint on a column, you do that by adding a unique constraint. A constraint, not an index. The clue is in the name.
Adding a unique index may well work, but it is the wrong thing to do as the PostgreSQL manual says:
The preferred way to add a unique constraint to a table is ALTER TABLE
... ADD CONSTRAINT. The use of indexes to enforce unique constraints
could be considered an implementation detail that should not be
accessed directly.
So, simply use the alter table ... add constraint syntax I used above to add the constraint.
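If a plain unique index already exists (for example, one built concurrently as in the other answer), PostgreSQL can also promote it to a constraint without rebuilding it; the constraint name below is my own choice:

```sql
-- Adopt the existing unique index as the implementation of a new
-- unique constraint; the index is renamed to match the constraint.
ALTER TABLE billing_infos
    ADD CONSTRAINT billing_infos_order_id_uc
    UNIQUE USING INDEX billing_infos_order_id_uidx;
```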
You can also use the pgAdmin tool to inspect the index structure.