PostgreSQL: SELECT INTO - how to create indexes? - postgresql

since SELECT INTO NEW_TABLE FROM QUERY creates NEW_TABLE the new table will not have any indices. Is there some way to utilise SELECT INTO with an existing table where I've created the desired indices? I am aware of INSERT INTO TABLE SELECT ... but I've encountered very bad performance compared to SELECT INTO.
Thanks

Not sure what performance issues you're talking about, but generally, if you're making a copy of a table, it's much better to create the indexes after inserting the data.
I.e. - you do:
create table new_table as select * from old_table;
Then just create indexes.
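Spelled out as a sketch (the index name and column are placeholders):

```sql
create table new_table as select * from old_table;
-- build indexes only after the data is in place
create index new_table_col_idx on new_table (col);
analyze new_table;  -- refresh planner statistics for the new table
```

Loading first and indexing afterwards avoids maintaining the index row by row during the copy, which is where the bulk of the slowdown comes from.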
One option to simplify index creation is to use pg_dump with its -s and -t options, plus some grep:
pg_dump -s -t old_table database_name | \
grep -E '^CREATE.*INDEX' | \
sed 's/old_table/new_table/g'
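To see what that pipeline produces, here is the same grep/sed applied to a small simulated pg_dump -s output (the table layout and index name are made up for illustration):

```shell
# Simulated schema dump for old_table (hypothetical names)
cat <<'EOF' > schema.sql
CREATE TABLE old_table (id integer, col text);
CREATE INDEX old_table_col_idx ON old_table USING btree (col);
EOF
# Keep only the CREATE INDEX lines and retarget them at new_table
grep -E '^CREATE.*INDEX' schema.sql | sed 's/old_table/new_table/g'
# → CREATE INDEX new_table_col_idx ON new_table USING btree (col);
```

The emitted statements can be piped straight into psql once new_table is populated.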

Related

How to hash an entire table in postgres?

I'd like to get a hash for data in an entire table. I need to compare two databases after migration to validate that the data migration was successful. Is it possible to reliably and reproducibly generate a hash for an entire table in a database?
You can do this from the command line (replacing of course my_database and my_table):
psql my_database -c 'copy my_table to stdout' | sha1sum
If you want to use a query to limit columns, add ordering, etc., just modify the query:
psql my_database -c 'copy (select * from my_table order by my_id_column) to stdout' | sha1sum
Note that this does not hash anything except the column data. No schema information, constraints, indexes, metadata, permissions, etc.
Note also that sha1sum is an arbitrary choice of hashing program; you can pipe this to any program that generates a hash. Common alternatives are sha256sum and md5sum.
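The ORDER BY in the second command is what makes the hash reproducible: the same rows emitted in a different order hash differently. A quick local illustration, with tab-separated text standing in for table rows:

```shell
# Same two "rows", different order: the hashes differ
h1=$(printf 'a\t1\nb\t2\n' | sha1sum | cut -d' ' -f1)
h2=$(printf 'b\t2\na\t1\n' | sha1sum | cut -d' ' -f1)
echo "$h1"
echo "$h2"
[ "$h1" != "$h2" ] && echo "order changes the hash"
```

So when comparing two databases, order both exports by the same unique key before hashing.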

pg_dump for all metadata and only table data of selected tables

I want to create a script that will dump the whole schema and the data of only a few tables and write it to one file.
Use the --exclude-table-data option of pg_dump to define the tables whose data should be excluded from the dump.
Multiple -t switches list the tables you want to take a backup of, e.g.
MacBook-Air:~ vao$ pg_dump -d t -t pg_database -t a -t so | grep 'CREATE TABLE'
CREATE TABLE pg_database (
CREATE TABLE a (
CREATE TABLE so (
This takes a backup of the structure and data of the three tables mentioned. I use grep to hide the other rows while still giving an idea of the backup's contents.
https://www.postgresql.org/docs/current/static/app-pgdump.html
-t table
--table=table
Dump only tables with names matching table. For this purpose, “table”
includes views, materialized views, sequences, and foreign tables.
Multiple tables can be selected by writing multiple -t switches.

What is the quickest way to duplicate/clone a table in Postgres?

I know that I could do CREATE TABLE tbl_2 AS (select * from tbl_1)
But is there a better/faster/stronger way to do this? I am talking about performance more than anything else. The tables are all denormalised and I do not have any foreign key constraints to worry about.
EDIT
Maybe there isn't any better way? Ref: https://dba.stackexchange.com/questions/55661/how-to-duplicate-huge-postgres-table
A better way really depends on what exactly you're hoping to accomplish.
If you want to keep all the constraints and indexes from the original table you can use the LIKE clause in your CREATE TABLE statement like so:
CREATE TABLE tbl_2 (LIKE tbl_1 INCLUDING INDEXES INCLUDING CONSTRAINTS);
But that just creates an empty table. You would still have to copy in the data.
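Sketching both steps together (names taken from the question; newer PostgreSQL versions also accept INCLUDING ALL as a shorthand):

```sql
-- clone structure, indexes, and CHECK constraints, then copy the rows
CREATE TABLE tbl_2 (LIKE tbl_1 INCLUDING INDEXES INCLUDING CONSTRAINTS);
INSERT INTO tbl_2 SELECT * FROM tbl_1;
```

As noted in the first answer above, for large tables it is often faster to copy the data first and build the indexes afterwards, so measure both orderings on your data.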
Alternatively you can use something like the following:
$ pg_dump -t tbl_1 | sed -e 's/^SET search_path = .*$/SET search_path = tmpschema, pg_catalog;' > table.sql
$ psql -d test -c 'CREATE SCHEMA tmpschema'
$ psql -1 -d test -f table.sql
$ psql -d test -c 'ALTER TABLE tmpschema.tbl_1 RENAME TO tbl_2; ALTER TABLE tmpschema.tbl_2 SET SCHEMA public; DROP SCHEMA tmpschema'
Perhaps it is not faster than CREATE TABLE ... AS (SELECT ...), but it will copy all indexes and constraints as well.

Dumping only certain part of table with pg_dump

I want to dump a table of a PostgreSQL database (on Heroku) but want to only get the rows of the table matching a certain criterion, e.g.
created_at > '2016-01-01'.
Is that even possible using the pg_dump utility?
pg_dump cannot do that. You can use COPY to extract data from a single table with a condition:
COPY (SELECT * FROM tab WHERE created_at > '2016-01-01') TO '/data/dumpfile';
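Since the question mentions Heroku: server-side COPY ... TO '/path' writes the file on the database server and requires superuser rights, which hosted services don't give you. From a client you can use psql's \copy meta-command instead, which runs the same query but writes the file locally (a sketch; the connection URL and file name are placeholders):

```shell
psql "$DATABASE_URL" -c "\copy (SELECT * FROM tab WHERE created_at > '2016-01-01') TO 'dump.csv' CSV"
```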

Postgres : pg_restore/pg_dump everything EXCEPT the table id's for a table

Currently I'm doing something like:
pg_dump -a -O -t my_table my_db > my_data_to_import.sql
What I really want is to be able to import/export just the data without causing conflicts with my autoid field or overwriting existing data.
Maybe I'm thinking about the whole process wrong?
You can use COPY with column list to dump and restore just data from one table. For example:
COPY my_table (column1, column2, ...) TO 'yourdumpfilepath';
COPY my_table (column1, column2, ...) FROM 'yourdumpfilepath';
OID is one of the system columns. For example, it is not included in SELECT * FROM my_table (you need SELECT oid, * FROM my_table). OID is not the same as an ordinary id column created along with the other columns in CREATE TABLE, and not every table has an OID column. Check the default_with_oids option: if it's set to off, then you probably don't have an OID column in your table, though you can still create a table with one using the WITH OIDS option. It's recommended not to use OIDs as a table's key column, which is why default_with_oids has defaulted to off since PostgreSQL 8.1.
pg_dump --inserts -t TABLENAME DBNAME > fc.sql
sed -e 's/VALUES [(][0-9][0-9],/VALUES (/g' fc.sql | sed -e 's/[(]id,/(/g' > fce.sql
psql -f fce.sql DBNAME
This dumps the table with explicit column lists into fc.sql, then uses sed to strip the id column and the value associated with it before restoring with psql. Note that the VALUES pattern only matches two-digit ids; adjust the regex for your data.
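To sanity-check those sed expressions locally, here they are applied to a couple of simulated pg_dump --inserts lines (the table and column names are made up):

```shell
# Simulated dump with a two-digit id column (hypothetical names)
cat <<'EOF' > fc.sql
INSERT INTO my_table (id, name) VALUES (42, 'foo');
INSERT INTO my_table (id, name) VALUES (43, 'bar');
EOF
# Drop the leading id value, then the id column name
sed -e 's/VALUES [(][0-9][0-9],/VALUES (/g' fc.sql | sed -e 's/[(]id,/(/g' > fce.sql
cat fce.sql
# → INSERT INTO my_table ( name) VALUES ( 'foo');
# → INSERT INTO my_table ( name) VALUES ( 'bar');
```

The result still parses as valid SQL, and on restore the id column is left to its default (e.g. a sequence), which is the point of the exercise.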