In postgresql: Clarification on "CONSTRAINT foo_key PRIMARY KEY (foo)" - postgresql

Sorry if this is a dead simple question but I'm confused from the documentation and I'm not getting any clear answers from searching the web.
If I have the following table schema:
CREATE TABLE footable
(
foo character varying(10) NOT NULL,
bar timestamp without time zone,
CONSTRAINT pk_foo PRIMARY KEY (foo)
);
and then use the query:
SELECT bar FROM footable WHERE foo = '1234567890';
Will the select query find the given row by searching an index or not? In other word: does the table have a primary key (which is foo) or not?
Just to get it clear. I'm used to specifying "PRIMARY KEY" after the column I'm specifying like this:
"...foo character varying(10) PRIMARY KEY, ..."
Does it change anything?

Why not look at the query plan and find out yourself? The query plan will tell you exactly what indexes are being used, so you don't have to guess. Here's how to do it:
http://www.postgresql.org/docs/current/static/sql-explain.html
But in general, it should use the index in this case since you specified the primary key in the where clause and you didn't use something that could prevent it from using it (a LIKE, for example).
It's always best to look at the query plan to verify it for sure, then there's no doubt.

In both cases, the primary key can be used but it depends. The optimizer will make a choice depending on the amount of data, the statistics, etc.
Naming the constraint can make debugging and error handling easier, you know what constraint is violated. Without a name, it can be confusing.

Related

Why am i getting postgresql error "Key (id)=(357) already exists"? [duplicate]

I have a question I know this was posted many times but I didn't find an answer to my problem. The problem is that I have a table and a column "id" I want it to be unique number just as normal. This type of column is serial and the next value after each insert is coming from a sequence so everything seems to be all right but it still sometimes shows this error. I don't know why. In the documentation, it says the sequence is foolproof and always works. If I add a UNIQUE constraint to that column will it help? I worked before many times on Postres but this error is showing for me for the first time. I did everything as normal and I never had this problem before. Can you help me to find the answer that can be used in the future for all tables that will be created? Let's say we have something easy like this:
CREATE TABLE comments
(
id serial NOT NULL,
some_column text NOT NULL,
CONSTRAINT id_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE interesting.comments OWNER TO postgres;
If i add:
ALTER TABLE comments ADD CONSTRAINT id_id_key UNIQUE(id)
Will if be enough or is there some other thing that should be done?
This article explains that your sequence might be out of sync and that you have to manually bring it back in sync.
An excerpt from the article in case the URL changes:
If you get this message when trying to insert data into a PostgreSQL
database:
ERROR: duplicate key violates unique constraint
That likely means that the primary key sequence in the table you're
working with has somehow become out of sync, likely because of a mass
import process (or something along those lines). Call it a "bug by
design", but it seems that you have to manually reset the a primary
key index after restoring from a dump file. At any rate, to see if
your values are out of sync, run these two commands:
SELECT MAX(the_primary_key) FROM the_table;
SELECT nextval('the_primary_key_sequence');
If the first value is higher than the second value, your sequence is
out of sync. Back up your PG database (just in case), then run this command:
SELECT setval('the_primary_key_sequence', (SELECT MAX(the_primary_key) FROM the_table)+1);
That will set the sequence to the next available value that's higher
than any existing primary key in the sequence.
Intro
I also encountered this problem and the solution proposed by #adamo was basically the right solution. However, I had to invest a lot of time in the details, which is why I am now writing a new answer in order to save this time for others.
Case
My case was as follows: There was a table that was filled with data using an app. Now a new entry had to be inserted manually via SQL. After that the sequence was out of sync and no more records could be inserted via the app.
Solution
As mentioned in the answer from #adamo, the sequence must be synchronized manually. For this purpose the name of the sequence is needed. For Postgres, the name of the sequence can be determined with the command PG_GET_SERIAL_SEQUENCE. Most examples use lower case table names. In my case the tables were created by an ORM middleware (like Hibernate or Entity Framework Core etc.) and their names all started with a capital letter.
In an e-mail from 2004 (link) I got the right hint.
(Let's assume for all examples, that Foo is the table's name and Foo_id the related column.)
Command to get the sequence name:
SELECT PG_GET_SERIAL_SEQUENCE('"Foo"', 'Foo_id');
So, the table name must be in double quotes, surrounded by single quotes.
1. Validate, that the sequence is out-of-sync
SELECT CURRVAL(PG_GET_SERIAL_SEQUENCE('"Foo"', 'Foo_id')) AS "Current Value", MAX("Foo_id") AS "Max Value" FROM "Foo";
When the Current Value is less than Max Value, your sequence is out-of-sync.
2. Correction
SELECT SETVAL((SELECT PG_GET_SERIAL_SEQUENCE('"Foo"', 'Foo_id')), (SELECT (MAX("Foo_id") + 1) FROM "Foo"), FALSE);
Replace the table_name to your actual name of the table.
Gives the current last id for the table. Note it that for next step.
SELECT MAX(id) FROM table_name;
Get the next id sequence according to postgresql. Make sure this id is higher than the current max id we get from step 1
SELECT nextVal('"table_name_id_seq"');
if it's not higher than then use this step 3 to update the next sequence.
SELECT setval('"table_name_id_seq"', (SELECT MAX(id) FROM table_name)+1);
The primary key is already protecting you from inserting duplicate values, as you're experiencing when you get that error. Adding another unique constraint isn't necessary to do that.
The "duplicate key" error is telling you that the work was not done because it would produce a duplicate key, not that it discovered a duplicate key already commited to the table.
For future searchs, use ON CONFLICT DO NOTHING.
Referrence - https://www.calazan.com/how-to-reset-the-primary-key-sequence-in-postgresql-with-django/
I had the same problem try this:
python manage.py sqlsequencereset table_name
Eg:
python manage.py sqlsequencereset auth
you need to run this in production settings(if you have)
and you need Postgres installed to run this on the server
From http://www.postgresql.org/docs/current/interactive/datatype.html
Note: Prior to PostgreSQL 7.3, serial implied UNIQUE. This is no longer automatic. If you wish a serial column to be in a unique constraint or a primary key, it must now be specified, same as with any other data type.
In my case carate table script is:
CREATE TABLE public."Survey_symptom_binds"
(
id integer NOT NULL DEFAULT nextval('"Survey_symptom_binds_id_seq"'::regclass),
survey_id integer,
"order" smallint,
symptom_id integer,
CONSTRAINT "Survey_symptom_binds_pkey" PRIMARY KEY (id)
)
SO:
SELECT nextval('"Survey_symptom_binds_id_seq"'::regclass),
MAX(id)
FROM public."Survey_symptom_binds";
SELECT nextval('"Survey_symptom_binds_id_seq"'::regclass) less than MAX(id) !!!
Try to fix the proble:
SELECT setval('"Survey_symptom_binds_id_seq"', (SELECT MAX(id) FROM public."Survey_symptom_binds")+1);
Good Luck every one!
I had the same problem. It was because of the type of my relations. I had a table property which related to both states and cities. So, at first I had a relation from property to states as OneToOne, and the same for cities. And I had the same error "duplicate key violates unique constraint". That means that: I can only have one property related to one state and city. But that doesnt make sense, because a city can have multiple properties. So the problem is the relation. The relation should be ManyToOne. Many properties to One city
Table name started with a capital letter if tables were created by an ORM middleware (like Hibernate or Entity Framework Core etc.)
SELECT setval('"Table_name_Id_seq"', (SELECT MAX("Id") FROM "Table_name") + 1)
WHERE
NOT EXISTS (
SELECT *
FROM (SELECT CURRVAL(PG_GET_SERIAL_SEQUENCE('"Table_name"', 'Id')) AS seq, MAX("Id") AS max_id
FROM "Table_name") AS seq_table
WHERE seq > max_id
)
try that CLI
it's just a suggestion to enhance the adamo code (thanks a lot adamo)
SELECT setval('tableName_columnName_seq', (SELECT MAX(columnName) FROM tableName));
For programatically solution at Django. Based on Paolo Melchiorre's answer, I wrote a chunk as a function to be called before any .save()
from django.db import connection
def setSqlCursor(db_table):
sql = """SELECT pg_catalog.setval(pg_get_serial_sequence('"""+db_table+"""', 'id'), MAX(id)) FROM """+db_table+""";"""
with connection.cursor() as cursor:
cursor.execute(sql)
I have similar problem but I solved it by removing all the foreign key in my Postgresql

Optimizing dataset with a composite primary key that includes a status field

I might be over thinking things, but I am considering opportunities to optimize a large dataset. I have a table schema whose contents is primarily grabbed with its primary key and a status field. Is there any optimization that can be done if I somehow included the status field as part of a composite primary key to speed up search? So for example my table schema might be something like:
CREATE TABLE object_t (
object_id SERIAL PRIMARY KEY NOT NULL,
status VARCHAR CHECK (status ~ '(Astatus|Bstatus|Cstatus|Dstatus)'),
contents TEXT
);
My queries will almost always be something like
SELECT * FROM object_t WHERE search condition AND status = ...
Is there any advantage here for large datasets to adjust the schema to have:
PRIMARY KEY (object_id, status)
Further, tables that I join with this table also always include that status filter. Should I then adjust foreign key constraints to be something like:
FOREIGN KEY (object_id, status) REFERENCES object_t (object_id, status)
Is there any optimization to be gained here, or is simply operating with the object_id and status filter as good as it gets?
The index you suggest will very likely not help.
It only offers a benefit if there is also a condition similar to this in the query:
WHERE object_id = ? AND status = ?
If only status is in the WHERE condition, the index cannot be used at all.
The name status suggests that there are not very many different values. That means that such a condition is often not very selective, which speaks against indexing it.
If you allways query for a certain status, a partial index can be helpful.
But the best thing you can do is experiment.

Using TSQL, CAST() with COLLATE is non-deterministic. How to make it deterministic? What is the work-around?

I have a function that includes:
SELECT #pString = CAST(#pString AS VARCHAR(255)) COLLATE SQL_Latin1_General_Cp1251_CS_AS
This is useful, for example, to remove accents in french; for example:
UPPER(CAST('Éléctricité' AS VARCHAR(255)) COLLATE SQL_Latin1_General_Cp1251_CS_AS)
gives ELECTRICITE.
But using COLLATE makes the function non-deterministic and therefore I cannot use it as a computed persisted value in a column.
Q1. Is there another (quick and easy) way to remove accents like this, with a deterministic function?
Q2. (Bonus Question) The reason I do this computed persisted column is to search. For example the user may enter the customer's last name as either 'Gagne' or 'Gagné' or 'GAGNE' or 'GAGNÉ' and the app will find it using the persisted computed column. Is there a better way to do this?
EDIT: Using SQL Server 2012 and SQL-Azure.
You will find that it is in fact deterministic, it just has different behavior depending on the character you're trying to collate.
Check the page for Windows 1251 encoding for behavior on accepted characters, and unacceptable characters.
Here is a collation chart for Cyrillic_General_CI_AI. This is codepage 1251 Case Insensitive and Accent Insensitive. This will show you the mappings for all acceptable characters within this collation.
As for the search question, as Keith said, I would investigate putting a full text index on the column you are going to be searching on.
The best answer I got was from Sebastian Sajaroff. I used his example to fix the issue. He suggested a VIEW with a UNIQUE INDEX. This gives a good idea of the solution:
create table Test(Id int primary key, Name varchar(20))
create view TestCIAI with schemabinding as
select ID, Name collate SQL_Latin1_General_CP1_CI_AI as NameCIAI from Test
create unique clustered index ix_Unique on TestCIAI (Id)
create unique nonclustered index ix_DistinctNames on TestCIAI (NameCIAI)
insert into Test values (1, 'Sébastien')
--Insertion 2 will fail because of the unique nonclustered indexed on the view
--(which is case-insensitive, accent-insensitive)
insert into Test values (2, 'Sebastien')

postgresql duplicate key violates unique constraint

I have a question I know this was posted many times but I didn't find an answer to my problem. The problem is that I have a table and a column "id" I want it to be unique number just as normal. This type of column is serial and the next value after each insert is coming from a sequence so everything seems to be all right but it still sometimes shows this error. I don't know why. In the documentation, it says the sequence is foolproof and always works. If I add a UNIQUE constraint to that column will it help? I worked before many times on Postres but this error is showing for me for the first time. I did everything as normal and I never had this problem before. Can you help me to find the answer that can be used in the future for all tables that will be created? Let's say we have something easy like this:
CREATE TABLE comments
(
id serial NOT NULL,
some_column text NOT NULL,
CONSTRAINT id_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE interesting.comments OWNER TO postgres;
If i add:
ALTER TABLE comments ADD CONSTRAINT id_id_key UNIQUE(id)
Will if be enough or is there some other thing that should be done?
This article explains that your sequence might be out of sync and that you have to manually bring it back in sync.
An excerpt from the article in case the URL changes:
If you get this message when trying to insert data into a PostgreSQL
database:
ERROR: duplicate key violates unique constraint
That likely means that the primary key sequence in the table you're
working with has somehow become out of sync, likely because of a mass
import process (or something along those lines). Call it a "bug by
design", but it seems that you have to manually reset the a primary
key index after restoring from a dump file. At any rate, to see if
your values are out of sync, run these two commands:
SELECT MAX(the_primary_key) FROM the_table;
SELECT nextval('the_primary_key_sequence');
If the first value is higher than the second value, your sequence is
out of sync. Back up your PG database (just in case), then run this command:
SELECT setval('the_primary_key_sequence', (SELECT MAX(the_primary_key) FROM the_table)+1);
That will set the sequence to the next available value that's higher
than any existing primary key in the sequence.
Intro
I also encountered this problem and the solution proposed by #adamo was basically the right solution. However, I had to invest a lot of time in the details, which is why I am now writing a new answer in order to save this time for others.
Case
My case was as follows: There was a table that was filled with data using an app. Now a new entry had to be inserted manually via SQL. After that the sequence was out of sync and no more records could be inserted via the app.
Solution
As mentioned in the answer from #adamo, the sequence must be synchronized manually. For this purpose the name of the sequence is needed. For Postgres, the name of the sequence can be determined with the command PG_GET_SERIAL_SEQUENCE. Most examples use lower case table names. In my case the tables were created by an ORM middleware (like Hibernate or Entity Framework Core etc.) and their names all started with a capital letter.
In an e-mail from 2004 (link) I got the right hint.
(Let's assume for all examples, that Foo is the table's name and Foo_id the related column.)
Command to get the sequence name:
SELECT PG_GET_SERIAL_SEQUENCE('"Foo"', 'Foo_id');
So, the table name must be in double quotes, surrounded by single quotes.
1. Validate, that the sequence is out-of-sync
SELECT CURRVAL(PG_GET_SERIAL_SEQUENCE('"Foo"', 'Foo_id')) AS "Current Value", MAX("Foo_id") AS "Max Value" FROM "Foo";
When the Current Value is less than Max Value, your sequence is out-of-sync.
2. Correction
SELECT SETVAL((SELECT PG_GET_SERIAL_SEQUENCE('"Foo"', 'Foo_id')), (SELECT (MAX("Foo_id") + 1) FROM "Foo"), FALSE);
Replace the table_name to your actual name of the table.
Gives the current last id for the table. Note it that for next step.
SELECT MAX(id) FROM table_name;
Get the next id sequence according to postgresql. Make sure this id is higher than the current max id we get from step 1
SELECT nextVal('"table_name_id_seq"');
if it's not higher than then use this step 3 to update the next sequence.
SELECT setval('"table_name_id_seq"', (SELECT MAX(id) FROM table_name)+1);
The primary key is already protecting you from inserting duplicate values, as you're experiencing when you get that error. Adding another unique constraint isn't necessary to do that.
The "duplicate key" error is telling you that the work was not done because it would produce a duplicate key, not that it discovered a duplicate key already commited to the table.
For future searchs, use ON CONFLICT DO NOTHING.
Referrence - https://www.calazan.com/how-to-reset-the-primary-key-sequence-in-postgresql-with-django/
I had the same problem try this:
python manage.py sqlsequencereset table_name
Eg:
python manage.py sqlsequencereset auth
you need to run this in production settings(if you have)
and you need Postgres installed to run this on the server
From http://www.postgresql.org/docs/current/interactive/datatype.html
Note: Prior to PostgreSQL 7.3, serial implied UNIQUE. This is no longer automatic. If you wish a serial column to be in a unique constraint or a primary key, it must now be specified, same as with any other data type.
In my case carate table script is:
CREATE TABLE public."Survey_symptom_binds"
(
id integer NOT NULL DEFAULT nextval('"Survey_symptom_binds_id_seq"'::regclass),
survey_id integer,
"order" smallint,
symptom_id integer,
CONSTRAINT "Survey_symptom_binds_pkey" PRIMARY KEY (id)
)
SO:
SELECT nextval('"Survey_symptom_binds_id_seq"'::regclass),
MAX(id)
FROM public."Survey_symptom_binds";
SELECT nextval('"Survey_symptom_binds_id_seq"'::regclass) less than MAX(id) !!!
Try to fix the proble:
SELECT setval('"Survey_symptom_binds_id_seq"', (SELECT MAX(id) FROM public."Survey_symptom_binds")+1);
Good Luck every one!
I had the same problem. It was because of the type of my relations. I had a table property which related to both states and cities. So, at first I had a relation from property to states as OneToOne, and the same for cities. And I had the same error "duplicate key violates unique constraint". That means that: I can only have one property related to one state and city. But that doesnt make sense, because a city can have multiple properties. So the problem is the relation. The relation should be ManyToOne. Many properties to One city
Table name started with a capital letter if tables were created by an ORM middleware (like Hibernate or Entity Framework Core etc.)
SELECT setval('"Table_name_Id_seq"', (SELECT MAX("Id") FROM "Table_name") + 1)
WHERE
NOT EXISTS (
SELECT *
FROM (SELECT CURRVAL(PG_GET_SERIAL_SEQUENCE('"Table_name"', 'Id')) AS seq, MAX("Id") AS max_id
FROM "Table_name") AS seq_table
WHERE seq > max_id
)
try that CLI
it's just a suggestion to enhance the adamo code (thanks a lot adamo)
SELECT setval('tableName_columnName_seq', (SELECT MAX(columnName) FROM tableName));
For programatically solution at Django. Based on Paolo Melchiorre's answer, I wrote a chunk as a function to be called before any .save()
from django.db import connection
def setSqlCursor(db_table):
sql = """SELECT pg_catalog.setval(pg_get_serial_sequence('"""+db_table+"""', 'id'), MAX(id)) FROM """+db_table+""";"""
with connection.cursor() as cursor:
cursor.execute(sql)
I have similar problem but I solved it by removing all the foreign key in my Postgresql

Primary key defined by many attributes?

Can I define a primary key according to three attributes? I am using Visual Paradigm and Postgres.
CREATE TABLE answers (
time SERIAL NOT NULL,
"{Users}{userID}user_id" int4 NOT NULL,
"{Users}{userID}question_id" int4 NOT NULL,
reply varchar(255),
PRIMARY KEY (time, "{Users}{userID}user_id", "{Users}{userID}question_id"));
A picture may clarify the question.
Yes you can, just as you showed.(though I question your naming of the 2. and 3. column.)
From the docs:
"Primary keys can also constrain more than one column; the syntax is similar to unique constraints:
CREATE TABLE example (
a integer,
b integer,
c integer,
PRIMARY KEY (a, c)
);
A primary key indicates that a column or group of columns can be used as a unique identifier for rows in the table. (This is a direct consequence of the definition of a primary key. Note that a unique constraint does not, by itself, provide a unique identifier because it does not exclude null values.) This is useful both for documentation purposes and for client applications. For example, a GUI application that allows modifying row values probably needs to know the primary key of a table to be able to identify rows uniquely.
A table can have at most one primary key (while it can have many unique and not-null constraints). Relational database theory dictates that every table must have a primary key. This rule is not enforced by PostgreSQL, but it is usually best to follow it.
"
Yes, you can. There is just such an example in the documentation.. However, I'm not familiar with the bracketed terms you're using. Are you doing some variable evaluation before creating the database schema?
yes you can
if you'd run it - you would see it in no time.
i would really, really, really suggest to rethink naming convention. time column that contains serial integer? column names like "{Users}{userID}user_id"? oh my.