How to insert the result of a parallelized SELECT query into a table in PostgreSQL?

According to https://www.postgresql.org/docs/current/static/when-can-parallel-query-be-used.html,
"Even when it is in general possible for parallel query plans to be
generated, the planner will not generate them for a given query if any
of the following are true:
The query writes any data or locks any database rows. If a query contains a data-modifying operation either at the top level or within
a CTE, no parallel plans for that query will be generated. This is a
limitation of the current implementation which could be lifted in a
future release."
Indeed, when I try to insert the result of a parallel SELECT query into a table (either by SELECT ... INTO or by WITH ... SELECT ... INTO), the query is not executed as a parallel query.
My question is: is there any way to trick PostgreSQL so that a SELECT query is executed as a parallel query and its result is then inserted into a table?
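You can see the limitation directly with EXPLAIN, which also accepts CREATE TABLE ... AS (a sketch: vw_FBigTable_extract is the view used in the answer below, and t_copy is a hypothetical target table). As a hedged aside, PostgreSQL 11 reportedly lifted this restriction for CREATE TABLE ... AS and SELECT INTO.
-- The bare SELECT can be planned with a Gather node and parallel workers:
EXPLAIN SELECT * FROM vw_FBigTable_extract;
-- The writing variant (t_copy is hypothetical) is planned without Gather
-- on versions affected by the limitation:
EXPLAIN CREATE TABLE t_copy AS SELECT * FROM vw_FBigTable_extract;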

There is a trick with psql's -o parameter, i.e.:
Step 1:
call psql -h localhost -d dbname -U username -c "select * from vw_FBigTable_extract" -o FBigTable_extract.csv -A -t -F ","
Step 2:
call psql -h localhost -d dbname -U username -c "COPY t_FBigTable_extract FROM 'FBigTable_extract.csv' WITH (FORMAT CSV, DELIMITER ',', HEADER FALSE, ENCODING 'windows-1252')"
This sometimes runs faster than the non-parallel approach.
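One caveat, not from the original answer: COPY ... FROM 'file' reads the file on the server and requires server-side file access, so if the CSV ends up on the client machine, psql's \copy metacommand is the client-side equivalent:
call psql -h localhost -d dbname -U username -c "\copy t_FBigTable_extract from 'FBigTable_extract.csv' with (format csv, delimiter ',', encoding 'windows-1252')"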

Related

Executing psql queries from a file vs passing it through bash

I am trying to pass queries to psql through a Python script.
PGPASSWORD=pass psql -U postgres -d postgres -h localhost -c "insert into table1 values(1,2); select * from table2;"
Suppose the second query (select * from table2) fails; then the first query is also not applied (I am not sure whether it is never applied or whether its effect is rolled back).
But if I have both queries in a file named file.sql
PGPASSWORD=pass psql -U postgres -d postgres -h localhost -f file.sql
then even if the second query fails, the first one is executed. Does the first method execute all the queries as one transaction, rolling back the results if one fails?
Yes, that is exactly what happens.
The argument to -c is sent to the server as a single request, so it runs as a single transaction.
The documentation says:
Each SQL command string passed to -c is sent to the server as a single request. Because of this, the server executes it as a single transaction even if the string contains multiple SQL commands, unless there are explicit BEGIN/COMMIT commands included in the string to divide it into multiple transactions.
You can use the -c option more than once if you don't want that.
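A minimal sketch of the two behaviors (no_such_table is a hypothetical name; any error in the second statement triggers the same effect):
# Single -c: both statements run in one implicit transaction,
# so the failed SELECT rolls the INSERT back.
PGPASSWORD=pass psql -U postgres -d postgres -h localhost \
  -c "insert into table1 values (1,2); select * from no_such_table;"
# Two -c options: each runs in its own transaction,
# so the INSERT persists even though the second command fails.
PGPASSWORD=pass psql -U postgres -d postgres -h localhost \
  -c "insert into table1 values (1,2)" \
  -c "select * from no_such_table"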

Copy select query result to csv in postgres

I have the following script that I am using to copy a select query result to a csv file.
psql -h hostname -U username -d dbname -t -A -F"," -c "some query with multiple joins and conditions" > "E:\output.csv"
While executing the script it creates the file named output.csv, but there is no data in the file.
I have tried a sample query, a simple join over 3 tables with limited data, and it works fine.
But the actual query keeps executing for a couple of hours and ends with an empty file, which is frustrating after waiting such a long period.
Any suggestions or corrections?
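An aside not from the original thread (a sketch; the query placeholder is the asker's): with shell redirection, any error text goes to stderr while the redirected file stays empty, so an alternative is \copy, which writes the file itself and fails with a visible error instead of silently producing nothing:
psql -h hostname -U username -d dbname -c "\copy (some query with multiple joins and conditions) to 'E:\output.csv' with (format csv)"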

Postgres taking backup of master data and schema for a few of the tables

In my database I have master tables starting with m_* and others. I want to take a backup of the tables with the following scenario:
Backup schema + data for master tables, i.e. table names starting with m_*
Backup schema structure only for the rest of the tables.
I did read the following command somewhere:
pg_dump -U "postgres" -h "local" -p "5432"
-d dbName -F c -b -v -f c:\uti\backup.dmp
--exclude-table-data '*.table_name_pattern_*'
--exclude-table-data 'some_schema.another_*_pattern_*'
But I have so many tables and I find it tedious to put each table name in it. Any tidy way to get around it?
Using Linux:
File foo.sh (adjust the filtering conditions):
psql <connection and other parameters> -c "copy (select format('--exclude-table-data=%s.%s', schemaname, tablename) from pg_tables where schemaname in ('public', 'foo') and tablename<>'t') to stdout;"
Command (the backticks substitute the script's output into pg_dump's argument list):
pg_dump <connection and other parameters> `./foo.sh`
This is a very flexible approach.
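Adapting this to the question's m_* convention (a sketch under assumptions: the tables live in the public schema, and backslash is LIKE's default escape character, so 'm\_%' matches a literal m_ prefix), the script would exclude data for every table that is not a master table:
psql <connection and other parameters> -c "copy (select format('--exclude-table-data=%I.%I', schemaname, tablename) from pg_tables where schemaname = 'public' and tablename not like 'm\_%') to stdout;"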

Creating a database dump for specific tables and entries Postgres

I have a database with hundreds of tables; what I need to do is export specified tables, with INSERT statements for their data, to one SQL file.
The only statement I know that can achieve this is
pg_dump -D -a -t zones_seq interway > /tmp/zones_seq.sql
Should I run this statement for each and every table, or is there a way to run a similar statement to export all selected tables into one big SQL file? The pg_dump above does not export the table schema, only inserts; I need both.
Any help will be appreciated.
Right from the manual: "Multiple tables can be selected by writing multiple -t switches"
So you need to list all of your tables
pg_dump --column-inserts -a -t zones_seq -t interway -t table_3 ... > /tmp/zones_seq.sql
Note that if you have several tables with the same prefix (or suffix) you can also use wildcards to select them with the -t parameter:
"Also, the table parameter is interpreted as a pattern according to the same rules used by psql's \d commands"
If those specific tables match a particular pattern, you can use that with the -t option in pg_dump.
pg_dump -D -a -t zones_seq -t interway -t "<pattern>" -f /tmp/zones_seq.sql <DBNAME>
For example, to dump tables whose names start with "test", you can use
pg_dump -D -a -t zones_seq -t interway -t "^test*" -f /tmp/zones_seq.sql <DBNAME>

Generate DDL programmatically on Postgresql

How can I generate the DDL of a table programmatically on Postgresql? Is there a system query or command to do it? Googling the issue returned no pointers.
Use pg_dump with these options:
pg_dump -U user_name -h host database -s -t table_or_view_names -f table_or_view_names.sql
Description:
-s or --schema-only: dump only the DDL / object definitions (schema), without data.
-t or --table: dump only tables (or views or sequences) matching the given name or pattern.
Example:
-- dump the DDL of the tables elon built:
$ pg_dump -U elon -h localhost -s -t spacex -t tesla -t solarcity -t boring > companies.sql
Sorry if this is off topic; just want to help anyone who googled "psql dump ddl" and got this thread.
You can use the pg_dump command to dump the contents of the database (both schema and data). The --schema-only switch will dump only the DDL for your table(s).
Why would shelling out to psql not count as "programmatically"? It'll dump the entire schema very nicely.
Anyhow, you can get data types (and much more) from the information_schema (8.4 docs referenced here, but this is not a new feature):
=# select column_name, data_type from information_schema.columns
-# where table_name = 'config';
column_name | data_type
--------------------+-----------
id | integer
default_printer_id | integer
master_host_enable | boolean
(3 rows)
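Building on that query, a rough CREATE TABLE statement can be assembled directly in SQL (a sketch only: it ignores constraints, defaults, nullability, and type modifiers, so it is nowhere near what pg_dump emits):
select 'CREATE TABLE ' || table_name || E' (\n'
       -- one "name type" entry per column, in declaration order
       || string_agg(format('    %I %s', column_name, data_type), E',\n' order by ordinal_position)
       || E'\n);'
from information_schema.columns
where table_name = 'config'
group by table_name;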
The answer is to check the source code for pg_dump and follow the switches it uses to generate the DDL. Somewhere inside the code there's a number of queries used to retrieve the metadata used to generate the DDL.
Here is a good article on how to get the meta information from the information schema: http://www.alberton.info/postgresql_meta_info.html
I saved 4 functions to partially mock up pg_dump -s behaviour, based on the \d+ metacommand. The usage would be something like:
\pset format unaligned
select get_ddl_t(schemaname,tablename) as "--" from pg_tables where tableowner <> 'postgres';
Of course you have to create the functions first.
Working sample here at rextester
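To capture that output in a file ready to replay (a sketch; get_ddl_t is the answer's own user-defined function, created beforehand), psql's \o metacommand can redirect the query result:
\pset format unaligned
\pset tuples_only on
\o schema_dump.sql
select get_ddl_t(schemaname,tablename) as "--" from pg_tables where tableowner <> 'postgres';
\o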