Postgres: Understanding Primary Key Sequences - postgresql

My database has fallen out of sync, which lead me to this question:
How to reset postgres' primary key sequence when it falls out of sync? (Copied Below)
However, I don't quite understand some things here: What is 'your_table_id_seq'? I have no clue where to find this. Doing some digging, I found a table called pg_sequence
in pg_catalog. Is it related to this? I can't see any way to relate that data back to my table though.
-- Login to psql and run the following
-- What is the result?
SELECT MAX(id) FROM your_table;
-- Then run...
-- This should be higher than the last result.
SELECT nextval('your_table_id_seq');
-- If it's not higher... run this set the sequence last to your highest id.
-- (wise to run a quick pg_dump first...)
BEGIN;
-- protect against concurrent inserts while you update the counter
LOCK TABLE your_table IN EXCLUSIVE MODE;
-- Update the sequence
SELECT setval('your_table_id_seq', COALESCE((SELECT MAX(id)+1 FROM your_table), 1), false);
COMMIT;

The following query gives names of all sequences.
SELECT c.relname
FROM pg_class c
WHERE c.relkind = 'S';
Typically a sequence is named as ${table}_id_seq.
I found the answer in this question: List all sequences in a Postgres db 8.1 with SQL

Related

Feedback on whether index was created on materialized views in postgresql

I created a unique index for a materialized view as :
create unique index if not exists matview_key on
matview (some_group_id, some_description);
I can't tell if it has been created
How do I see the index?
Thank you!
Two ways to verify index creation:
--In psql
\d matview
--Using SQL
select
*
from
pg_indexes
where
indexname = 'matview_key'
and
tablename = 'matview';
More information on pg_indexes.
Like has been commented, if the command finishes successfully and you don't get an error message, the index was created. Possible caveat: while the transaction is not committed, nobody else can see it (except the unique name is reserved now), and it still might get rolled back. Check in a separate transaction to be sure.
To be absolutely sure:
SELECT pg_get_indexdef(oid)
FROM pg_catalog.pg_class
WHERE relname = 'matview_key'
AND relkind = 'i'
-- AND relnamespace = 'public'::regnamespace -- optional, to make sure of the schema, too
This way you see whether an index of the given name exists, and also its exact definition to rule out a different index with the same name. Pure SQL, works from any client. (There is nothing special about an index on materialized views.)
Also filter for the schema to be absolutely sure. Would be the "default" schema (a.k.a. "current" schema) in your case, since you did not specify in the creation. See:
How does the search_path influence identifier resolution and the "current schema"
Related:
Create index if it does not exist
How to check if a table exists in a given schema
In psql:
\di public.matview_key
To only find indexes. Again, the schema is optional to narrow down.
Progress Reporting
If creating an index takes a long time, you can look up progress in pg_stat_progress_create_index since Postgres 12:
SELECT * FROM pg_stat_progress_create_index
-- WHERE relid = 'public.matview'::regclass -- optionally narrow down
Un alternative to looking into pg_indexes is pg_matviews (for a materialized view only)
select *
from pg_matviews
where matviewname = 'my_matview_name';

What is the scope of a PostgreSQL Temp Table?

I have googled quite a bit, and I have fairly decent reading comprehension, but I don't understand if this script will work in multiple threads on my postgres/postgis box. Here is the code:
Do
$do$
DECLARE
x RECORD;
b int;
begin
create temp table geoms (id serial, geom geometry) on commit drop;
for x in select id,geom from asdf loop
truncate table geoms;
insert into geoms (geom) select someGeomfield from sometable where st_intersects(somegeomfield,x.geom);
----do something with the records in geoms here...and insert that data somewhere else
end loop;
end;
$do$
So, if I run this in more than one client, called from Java, will the scope of the geoms temp table cause problems? If so, any ideas for a solution to this in PostGres would be helpful.
Thanks
One subtle trap you will run into though, which is why I am not quite ready to declare it "safe" is that the scope is per session, but people often forget to drop the tables (so they drop on disconnect).
I think you are much better off if you don't need the temp table after your function to drop it explicitly after you are done with it. This will prevent issues that arise from trying to run the function twice in the same transaction. (On commit you are dropping)
Temp tables in PostgreSQL (or Postgres) (PostGres doesn't exists) are local only and related to session where they are created. So no other sessions (clients) can see temp tables from other session. Both (schema and data) are invisible for others. Your code is safe.

How do I drop all tables in psql (PostgreSQL interactive terminal) that starts with a common word?

How do I drop all tables whose name start with, say, doors_? Can I do some sort of regex using the drop table command?
I prefer not writing a custom script but all solutions are welcomed. Thanks!
This script will generate the DDL commands to drop them all:
SELECT 'DROP TABLE ' || t.oid::regclass || ';'
FROM pg_class t
-- JOIN pg_namespace n ON n.oid = t.relnamespace -- to select by schema
WHERE t.relkind = 'r'
AND t.relname ~~ E'doors\_%' -- enter search term for table here
-- AND n.nspname ~~ '%myschema%' -- optionally select by schema(s), too
ORDER BY 1;
The cast t.oid::regclass makes the syntax work for mixed case identifiers, reserved words or special characters in table names, too. It also prevents SQL injection and prepends the schema name where necessary. More about object identifier types in the manual.
About the schema search path.
You could automate the dropping, too, but it's unwise not to check what you actually delete before you do.
You could append CASCADE to every statement to DROP depending objects (views and referencing foreign keys). But, again, that's unwise unless you know very well what you are doing. Foreign key constraints are no big loss, but this will also drop all dependent views entirely. Without CASCADE you get error messages informing you which objects prevent you from dropping the table. And you can then deal with it.
I normally use one query to generate the DDL commands for me based on some of the metadata tables and then run those commands manually. For example:
SELECT 'DROP TABLE ' || tablename || ';' FROM pg_tables
WHERE tablename LIKE 'prefix%' AND schemaname = 'public';
This will return a bunch of DROP TABLE xxx; queries, which I simply copy&paste to the console. While you could add some code to execute them automatically, I prefer to run them on my own.

How to delete unused sequences?

We are using PostgreSQL. My requirement is to delete unused sequences from my database.
For example, if I create any table through my application, one sequence will be created, but for deleting the table we are not deleting the sequence, too. If want to create the same table another sequence is being created.
Example: table: file; automatically created sequence for id coumn: file_id_seq
When I delete the table file and create it with same name again, a new sequence is being created (i.e. file_id_seq1). I have accumulated a huge number of unused sequences in my application database this way.
How to delete these unused sequences?
A sequence that is created automatically for a serial column is deleted automatically, when the column (or its table) is dropped. The problem you describe should not exist to begin with. Only very old versions of PostgreSQL did not do that. 7.4 or older?
Solution for the problem
This query will generate the DDL commands to delete all "unbound" sequences in the database it is executed in:
SELECT string_agg('DROP SEQUENCE ' || c.oid::regclass, '; ') || ';' AS ddl
FROM pg_class c
LEFT JOIN pg_depend d ON d.refobjid = c.oid
AND d.deptype <> 'i'
WHERE c.relkind = 'S'
AND d.refobjid IS NULL;
The cast to regclass in c.oid::regclass automatically schema-qualifies sequence names where necessary according to the current search_path. See:
How to check if a table exists in a given schema
How does the search_path influence identifier resolution and the "current schema"
Result:
DROP SEQUENCE foo_id_seq;
DROP SEQUENCE bar_id_seq;
...
Execute the result to drop all sequences that are not bound to a serial column (or any other column). Study the meaning of columns and tables here.
Careful! These sequences might be in use otherwise. There are use cases where sequences are created as standalone objects. For instance, if you want multiple columns to share one sequence. You should know exactly what you are doing.
However, you cannot delete sequences bound to a serial column this way. So the operation is safe in this respect.
DROP SEQUENCE test_id_seq;
Result:
ERROR: cannot drop sequence test_id_seq because other objects depend on it
DETAIL: default for table test column id depends on sequence test_id_seq
HINT: Use DROP ... CASCADE to drop the dependent objects too.
If you are using pgAdmin, you can select the sequence and check the "depends on" tab. It will list any object that relies on the sequence.
Another way is to TRY to delete the sequence. If a table references it, pgAdmin will throw an error saying that something is depending on this sequence. If you are able to delete the sequence without any errors, there is no dependency.
Be sure to test this somewhere.
What i do is first I got all the sequences and then saved these result into a file then i run the file in psql: below content was saved with file name del_seq_all.sql and then list sequences in test1.sql . i dont know this is the correct solution or not. But result is coming as expected.
\o d:/test1.sql
SELECT 'drop sequence ' || c.relname || ';' FROM pg_class c WHERE
(c.relkind = 'S');
\o
\i d:/test1.sql
Proceed with caution, "drop sequence sequence_name_here" will successfully drop a sequence even if it's attached as a default nextval() value of a table column. There seems to be some disconnect here especially if the sequence was created separately. I'm also looking for the perfect one liner to clean up 100% unused sequences.
Building on the answer by #erwin:
DO $$ DECLARE
r text;
BEGIN
FOR r IN (
SELECT cl.relname
FROM pg_class cl
LEFT JOIN pg_namespace ns ON ns."oid" = cl.relnamespace
LEFT JOIN pg_depend d ON d.refobjid = cl."oid" AND d.deptype <> 'i'
WHERE ns.nspname = 'public'
AND cl.relkind = 'S'
AND d.refobjid IS NULL
) LOOP
-- dangerous, test before you execute!
RAISE NOTICE '%', -- once confident, comment this line ...
-- EXECUTE -- ... and uncomment this one
'DROP SEQUENCE ' || quote_ident(r);
END LOOP;
END $$;
This will actually execute the query and produce the intended result

how to emulate "insert ignore" and "on duplicate key update" (sql merge) with postgresql?

Some SQL servers have a feature where INSERT is skipped if it would violate a primary/unique key constraint. For instance, MySQL has INSERT IGNORE.
What's the best way to emulate INSERT IGNORE and ON DUPLICATE KEY UPDATE with PostgreSQL?
With PostgreSQL 9.5, this is now native functionality (like MySQL has had for several years):
INSERT ... ON CONFLICT DO NOTHING/UPDATE ("UPSERT")
9.5 brings support for "UPSERT" operations.
INSERT is extended to accept an ON CONFLICT DO UPDATE/IGNORE clause. This clause specifies an alternative action to take in the event of a would-be duplicate violation.
...
Further example of new syntax:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;
Edit: in case you missed warren's answer, PG9.5 now has this natively; time to upgrade!
Building on Bill Karwin's answer, to spell out what a rule based approach would look like (transferring from another schema in the same DB, and with a multi-column primary key):
CREATE RULE "my_table_on_duplicate_ignore" AS ON INSERT TO "my_table"
WHERE EXISTS(SELECT 1 FROM my_table
WHERE (pk_col_1, pk_col_2)=(NEW.pk_col_1, NEW.pk_col_2))
DO INSTEAD NOTHING;
INSERT INTO my_table SELECT * FROM another_schema.my_table WHERE some_cond;
DROP RULE "my_table_on_duplicate_ignore" ON "my_table";
Note: The rule applies to all INSERT operations until the rule is dropped, so not quite ad hoc.
For those of you that have Postgres 9.5 or higher, the new ON CONFLICT DO NOTHING syntax should work:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT field_one, field_two, field_three
FROM source_table
ON CONFLICT (field_one) DO NOTHING;
For those of us who have an earlier version, this right join will work instead:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT source_table.field_one, source_table.field_two, source_table.field_three
FROM source_table
LEFT JOIN target_table ON source_table.field_one = target_table.field_one
WHERE target_table.field_one IS NULL;
Try to do an UPDATE. If it doesn't modify any row that means it didn't exist, so do an insert. Obviously, you do this inside a transaction.
You can of course wrap this in a function if you don't want to put the extra code on the client side. You also need a loop for the very rare race condition in that thinking.
There's an example of this in the documentation: http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html, example 40-2 right at the bottom.
That's usually the easiest way. You can do some magic with rules, but it's likely going to be a lot messier. I'd recommend the wrap-in-function approach over that any day.
This works for single row, or few row, values. If you're dealing with large amounts of rows for example from a subquery, you're best of splitting it into two queries, one for INSERT and one for UPDATE (as an appropriate join/subselect of course - no need to write your main filter twice)
To get the insert ignore logic you can do something like below. I found simply inserting from a select statement of literal values worked best, then you can mask out the duplicate keys with a NOT EXISTS clause. To get the update on duplicate logic I suspect a pl/pgsql loop would be necessary.
INSERT INTO manager.vin_manufacturer
(SELECT * FROM( VALUES
('935',' Citroën Brazil','Citroën'),
('ABC', 'Toyota', 'Toyota'),
('ZOM',' OM','OM')
) as tmp (vin_manufacturer_id, manufacturer_desc, make_desc)
WHERE NOT EXISTS (
--ignore anything that has already been inserted
SELECT 1 FROM manager.vin_manufacturer m where m.vin_manufacturer_id = tmp.vin_manufacturer_id)
)
INSERT INTO mytable(col1,col2)
SELECT 'val1','val2'
WHERE NOT EXISTS (SELECT 1 FROM mytable WHERE col1='val1')
As #hanmari mentioned in his comment. when inserting into a postgres tables, the on conflict (..) do nothing is the best code to use for not inserting duplicate data.:
query = "INSERT INTO db_table_name(column_name)
VALUES(%s) ON CONFLICT (column_name) DO NOTHING;"
The ON CONFLICT line of code will allow the insert statement to still insert rows of data. The query and values code is an example of inserted date from a Excel into a postgres db table.
I have constraints added to a postgres table I use to make sure the ID field is unique. Instead of running a delete on rows of data that is the same, I add a line of sql code that renumbers the ID column starting at 1.
Example:
q = 'ALTER id_column serial RESTART WITH 1'
If my data has an ID field, I do not use this as the primary ID/serial ID, I create a ID column and I set it to serial.
I hope this information is helpful to everyone.
*I have no college degree in software development/coding. Everything I know in coding, I study on my own.
Looks like PostgreSQL supports a schema object called a rule.
http://www.postgresql.org/docs/current/static/rules-update.html
You could create a rule ON INSERT for a given table, making it do NOTHING if a row exists with the given primary key value, or else making it do an UPDATE instead of the INSERT if a row exists with the given primary key value.
I haven't tried this myself, so I can't speak from experience or offer an example.
This solution avoids using rules:
BEGIN
INSERT INTO tableA (unique_column,c2,c3) VALUES (1,2,3);
EXCEPTION
WHEN unique_violation THEN
UPDATE tableA SET c2 = 2, c3 = 3 WHERE unique_column = 1;
END;
but it has a performance drawback (see PostgreSQL.org):
A block containing an EXCEPTION clause is significantly more expensive
to enter and exit than a block without one. Therefore, don't use
EXCEPTION without need.
On bulk, you can always delete the row before the insert. A deletion of a row that doesn't exist doesn't cause an error, so its safely skipped.
For data import scripts, to replace "IF NOT EXISTS", in a way, there's a slightly awkward formulation that nevertheless works:
DO
$do$
BEGIN
PERFORM id
FROM whatever_table;
IF NOT FOUND THEN
-- INSERT stuff
END IF;
END
$do$;