How to return an exit code when no rows are matched in PostgreSQL

I am writing a script to filter jobs that failed in the past 24 hours and have not since completed or re-entered the running state.
rm -rf jobsfailed.txt jobscompleted.txt jobsnotcompleted.txt ## Remove the files which are created
export PGPASSWORD="$PG_PASS"
failed_jobs=`psql -t -U bacula bacula << EOD
SELECT jobid,job,name,jobbytes,realendtime,jobstatus,level FROM job WHERE jobstatus IN ('f','A','E') AND realendtime > NOW() - INTERVAL '24 HOURS';
\q
EOD` ### Collect all the jobs which are in the defined states
echo "$failed_jobs" >> jobsfailed.txt ### redirect the values to a file
sortedlist=$(awk -F'|' '{print $3}' jobsfailed.txt | sort | uniq) ## sort the values based on the jobname
for i in $sortedlist
do
retVal=$?
jobs_notcompleted=`psql -t -U bacula bacula << EOD1
SELECT jobid,job,name,jobbytes,realendtime,jobstatus,level FROM job WHERE name LIKE '$i' AND jobstatus IN ('T','R') AND starttime > NOW() - INTERVAL '24 HOURS' ORDER BY jobid DESC LIMIT 1;
\q
EOD1` ### If the job is in one of the above states (T or R), the job completed successfully; any other state means it did not complete
if [[ $retVal -eq 0 ]]; then
echo "$jobs_notcompleted" >> jobscompleted.txt
else
echo "$jobs_notcompleted" >> jobsnotcompleted.txt
fi
exit $retVal
done
But I am not getting the desired output, since if no state matches, the query produces a "(0 rows)" output. Please let me know: is there a way, when 0 rows are matched, to redirect that $jobs_notcompleted value to the jobsnotcompleted.txt file? The jobscompleted.txt file is being created and works as expected.

For some reason, I feel that you need a "NOT" between "jobstatus" and "IN" in your second query, because ${retVal} would, in my mind, always return 0 unless there was a DB error.
If I am wrong on that point, then the retVal=$? needs to be moved to after the 2nd query.
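If the goal is to send names with zero matching rows to jobsnotcompleted.txt, one option is to test the captured output itself, because psql exits with status 0 even when a query matches no rows. Here is a minimal sketch of the corrected loop (reusing the question's variables; treating empty -t output as "no rows" is my assumption):
for i in $sortedlist
do
jobs_notcompleted=`psql -t -U bacula bacula << EOD1
SELECT jobid,job,name,jobbytes,realendtime,jobstatus,level FROM job WHERE name LIKE '$i' AND jobstatus IN ('T','R') AND starttime > NOW() - INTERVAL '24 HOURS' ORDER BY jobid DESC LIMIT 1;
EOD1`
retVal=$? ### capture the status of the psql call itself, after it runs
### with -t, psql prints nothing when no rows match, so test for non-empty output
if [[ $retVal -eq 0 && -n "${jobs_notcompleted//[[:space:]]/}" ]]; then
echo "$jobs_notcompleted" >> jobscompleted.txt
else
echo "$i" >> jobsnotcompleted.txt
fi
done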

Related

Batch file for reading sql scripts from file and export results to csv

I want to make a batch file that will take a query from a .sql script in a directory and export the results in .csv format. I need to connect to the Postgres server.
So I'm trying to do this using this answer: https://stackoverflow.com/a/39049102/9631920.
My file:
#!/bin/bash
# sql_to_csv.sh
echo test1
CONN="psql -U my_user -d my_db -h host -port"
QUERY="$(sed 's/;//g;/^--/ d;s/--.*//g;' 'folder/folder/folder/file.sql' | tr '\n' ' ')"
echo test2
echo "$QUERY"
echo test3
echo "\\copy ($QUERY) to 'folder/folder/folder/file.csv' with csv header" | $CONN > /dev/null
echo query in progress
It shows me the script from the query and test3, and then stops. What am I doing wrong?
Edit:
My file:
#!/bin/bash
PSQL = "psql -h 250.250.250.250 -p 5432 -U user -d test"
${PSQL} << OMG2
CREATE TEMP VIEW xyz AS
`cat C:\Users\test\Documents\my_query.sql`
;
\copy (select * from xyz) TO 'C:\Users\test\Documents\res.csv';
OMG2
But it's not asking for a password, and I'm not getting any result file.
a shell HERE-document will solve most of your quoting woes
a temp view will solve the single-query-on-a-single-line problem
Example (using a multi-line two-table JOIN):
#!/bin/bash
PSQL="psql -U www twitters"
${PSQL} << OMG
-- Some comment here
CREATE TEMP VIEW xyz AS
SELECT twp.name, twt.*
FROM tweeps twp
JOIN tweets twt
ON twt.user_id = twp.id
AND twt.in_reply_to_id > 3
WHERE 1=1
AND (False OR twp.screen_name ilike '%omg%' )
;
\copy (select * from xyz) TO 'omg.csv';
OMG
If you want the contents of an existing .sql file, you can cat it inside the here document, using a backtick-expansion:
#!/bin/bash
PSQL="psql -X -n -U www twitters"
${PSQL} << OMG2
-- Some comment here
CREATE TEMP VIEW xyz AS
-- ... more comment
-- cat the original file here
`cat /home/dir1/dir2/dir3/myscript.sql`
;
\copy (select * from xyz) TO 'omg.csv';
OMG2

time.sleep() has 60 sec limit in plpythonu?

Note: I'm running Postgres 11.7 and Python 2.7.17.
It appears that time.sleep() has a 60-second limit in a plpythonu function. It works as expected up to 60 seconds, but if passed a value greater than 60, it stops at 60 seconds.
CREATE OR REPLACE FUNCTION test_sleep(interval_str INTERVAL)
RETURNS INTERVAL
AS $$
try:
import time
import datetime
start_time = time.time()
t = datetime.datetime.strptime(interval_str,"%H:%M:%S")
tdelta = datetime.timedelta(hours=t.hour, minutes=t.minute, seconds=t.second)
interval_secs = tdelta.total_seconds()
print 'interval_secs=%s' % repr(interval_secs)
time.sleep(interval_secs)
elapsed_secs = time.time() - start_time
return datetime.timedelta(seconds=elapsed_secs)
except:
import traceback
print traceback.format_exc()
raise
$$ LANGUAGE PLPYTHONU;
Here's a test on the command line using psql. The first test runs as expected: I told it to sleep for two seconds and it did. The second test only sleeps for 60 seconds even though I requested a sleep of 120 seconds.
$ echo "select now(), test_sleep('2 sec'); select now();" | psql
now | test_sleep
------------------------------+-----------------
2021-05-07 11:10:27.63041-04 | 00:00:02.005745
(1 row)
now
-------------------------------
2021-05-07 11:10:29.652542-04
(1 row)
$ echo "select now(), test_sleep('2 min'); select now();" | psql
now | test_sleep
-------------------------------+-----------------
2021-05-07 11:10:36.056578-04 | 00:00:59.977787
(1 row)
now
-------------------------------
2021-05-07 11:11:36.050637-04
(1 row)
$
Here's output from the Postgres log file.
interval_secs=2.0
interval_secs=120.0
This is not unexpected (although I can't reliably reproduce it).
https://realpython.com/python-sleep/
Note: In Python 3.5, the core developers changed the behavior of
time.sleep() slightly. The new Python sleep() system call will last at
least the number of seconds you’ve specified, even if the sleep is
interrupted by a signal. This does not apply if the signal itself
raises an exception, however.
So in Python 2, time.sleep can return early if interrupted by a signal. If you don't like that, have it loop and sleep again for the remainder of the time.
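A minimal sketch of that retry loop in plain Python 2 (illustrative; sleep_full is my name, not from the original function):
import time

def sleep_full(total_secs):
    # keep sleeping until the full interval has elapsed, even if an
    # individual time.sleep() call is cut short by a signal
    deadline = time.time() + total_secs
    remaining = total_secs
    while remaining > 0:
        time.sleep(remaining)
        remaining = deadline - time.time()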

ERROR: invalid input syntax for type timestamp with time zone

I have a PostgreSQL query, shown below, which runs fine. I am calling it from a shell script as below:
Result=$(psql -U username -d database -t -c
$'SELECT round(sum(i.total), 2) AS "ROUND(sum(i.total), 2)"
FROM invoice i
WHERE i.create_datetime = '2019-03-01 00:00:00-06'
AND i.is_review = '1' AND i.user_id != 60;')
Now I want to replace the value I have hard-coded as i.create_datetime = '2019-03-01 00:00:00-06' with a variable date value.
I have tried two ways
way 1:
Result=$(psql -U username -d database -t -c
$'WITH var(reviewMonth) as (values(\'$reviewMonth\'))
SELECT round(sum(i.total),2) AS "ROUND(sum(i.total),2)"
FROM var,invoice i
WHERE i.create_datetime = var.reviewMonth::timestamp
AND i.is_review = \'1\' AND i.user_id != 60;')
and
way 2:
Result=$(psql -U username -d database -t -c
$'SELECT round(sum(i.total),2) AS "ROUND(sum(i.total),2)"
FROM invoice i
WHERE i.create_datetime = \'$reviewMonth\'
AND i.is_review = \'1\' AND i.user_id != 60;')
But both ways throw an error.
Way 1 throws:
ERROR: operator does not exist: timestamp with time zone = text
Way 2 throws:
ERROR: invalid input syntax for type timestamp with time zone: "$reviewMonth"
Please suggest what my approach should be.
You should try using psql variables. Here's an example:
# Put the query in a file, with the variable TSTAMP:
> echo "SELECT :'TSTAMP'::timestamp with time zone;" > query.sql
> export TSTAMP='2019-03-01 00:00:00-06'
> RESULT=$(psql -U postgres -t --variable=TSTAMP="$TSTAMP" -f query.sql )
> echo $RESULT
2019-03-01 06:00:00+00
Note how we format the string literal substitution in the query: :'TSTAMP'
You could also do the substitution yourself. Here's an example using a heredoc:
> export TSTAMP='2019-03-01 00:00:01-06'
> RESULT=$(psql -U postgres -t << EOF
SELECT '$TSTAMP'::timestamp with time zone;
EOF
)
> echo $RESULT
2019-03-01 06:00:01+00
In this case, we aren't using psql's variable substitution, so we have to quote the variable like '$TSTAMP'. Using a heredoc makes the quoting much simpler than using -c because you aren't trying to quote the whole command.
EDIT: more examples, because it appears this wasn't clear enough. TSTAMP does not have to be hard-coded; it's just a bash variable that can be set like any other bash variable.
> TSTAMP=$(date -d 'now' +'%Y-%m-01 00:00:00')
> RESULT=$(psql -U postgres -t << EOF
SELECT '$TSTAMP'::timestamp with time zone;
EOF
)
> echo $RESULT
2019-06-01 00:00:00+00
However, if you're really just looking for the start of the month, there's no need for shell variables at all:
> RESULT=$(psql -U postgres -t << EOF
SELECT date_trunc('month', now());
EOF
)
> echo $RESULT
2019-06-01 00:00:00+00

Convert Tables in Postgresql to Shapefile

So far I have loaded all the parcel tables (with geometry information) in Alaska into PostgreSQL. The tables were originally stored in dump format. Now, I want to convert each table in Postgres to a shapefile through the cmd interface using ogr2ogr.
My code is something like below:
ogr2ogr -f "ESRI Shapefile" "G:\...\Projects\Dataset\Parcel\test.shp" PG:"dbname=parceldb host=localhost port=5432 user=postgres password=postgres" -sql "SELECT * FROM ak_fairbanks"
However, the system kept returning this message: Unable to open datasource
PG:dbname='parceldb' host='localhost' port='5432' user='postgres' password='postgres'
with the following drivers.
There is also the pgsql2shp option. For this you need the utility installed on your system.
The command for this conversion is:
pgsql2shp -u <username> -h <hostname> -P <password> -p 5434 -f <file path to save shape file> <database> [<schema>.]<table_name>
This command has other options as well, which can be seen at this link.
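For example, to export the ak_fairbanks table from the question (connection values taken from the question; the output path is illustrative):
pgsql2shp -u postgres -h localhost -P postgres -p 5432 -f "G:\...\Projects\Dataset\Parcel\test" parceldb public.ak_fairbanks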
Exploring this case based on the comments in another answer, I decided to share my Bash scripts and my ideas.
Exporting multiple tables
To export many tables from a specific schema, I use the following script.
#!/bin/bash
source ./pgconfig
export PGPASSWORD=$password
# To filter, set the table names in the FILTER variable below and remove the leading # to uncomment it.
# FILTER=("table_name_a" "table_name_b" "table_name_c")
#
# Set the output directory
OUTPUT_DATA="/tmp/pgsql2shp/$database"
#
#
# Remove the Shapefiles after ZIP
RM_SHP="yes"
# Define where pgsql2shp is and format the base command
PG_BIN="/usr/bin"
PG_CON="-d $database -U $user -h $host -p $port"
# creating output directory to put files
mkdir -p "$OUTPUT_DATA"
SQL_TABLES="select table_name from information_schema.tables where table_schema = '$schema'"
SQL_TABLES="$SQL_TABLES and table_type = 'BASE TABLE' and table_name != 'spatial_ref_sys';"
TABLES=($($PG_BIN/psql $PG_CON -t -c "$SQL_TABLES"))
export_shp(){
SQL="$1"
TB="$2"
pgsql2shp -f "$OUTPUT_DATA/$TB" -h $host -p $port -u $user $database "$SQL"
zip -j "$OUTPUT_DATA/$TB.zip" "$OUTPUT_DATA/$TB.shp" "$OUTPUT_DATA/$TB.shx" "$OUTPUT_DATA/$TB.prj" "$OUTPUT_DATA/$TB.dbf" "$OUTPUT_DATA/$TB.cpg"
}
for TABLE in ${TABLES[@]}
do
DATA_QUERY="SELECT * FROM $schema.$TABLE"
SHP_NAME="$TABLE"
if [[ ${#FILTER[@]} -gt 0 ]]; then
echo "Has filter by table name"
if [[ " ${FILTER[@]} " =~ " ${TABLE} " ]]; then
export_shp "$DATA_QUERY" "$SHP_NAME"
fi
else
export_shp "$DATA_QUERY" "$SHP_NAME"
fi
# remove intermediate files
if [[ "$RM_SHP" = "yes" ]]; then
rm -f $OUTPUT_DATA/$SHP_NAME.{shp,shx,prj,dbf,cpg}
fi
done
Splitting data into multiple files
To avoid the problem with large tables, where pgsql2shp does not write the data to the shapefile, we can split the data using a paging strategy. In Postgres we can use LIMIT, OFFSET and ORDER BY for paging, as in the example below.
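For example, fetching the third page of 10,000 rows sorted by an assumed primary-key column id would look like:
SELECT * FROM my_schema.my_table ORDER BY id OFFSET 20000 LIMIT 10000;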
This method assumes that your table has a primary key, which is used to sort the data in my example script.
#!/bin/bash
source ./pgconfig
export PGPASSWORD=$password
# To filter, set the table names in the FILTER variable below and remove the leading # to uncomment it.
# FILTER=("table_name_a" "table_name_b" "table_name_c")
#
# Set the output directory
OUTPUT_DATA="/tmp/pgsql2shp/$database"
#
#
# Remove the Shapefiles after ZIP
RM_SHP="yes"
# Define where pgsql2shp is and format the base command
PG_BIN="/usr/bin"
PG_CON="-d $database -U $user -h $host -p $port"
# creating output directory to put files
mkdir -p "$OUTPUT_DATA"
SQL_TABLES="select table_name from information_schema.tables where table_schema = '$schema'"
SQL_TABLES="$SQL_TABLES and table_type = 'BASE TABLE' and table_name != 'spatial_ref_sys';"
TABLES=($($PG_BIN/psql $PG_CON -t -c "$SQL_TABLES"))
export_shp(){
SQL="$1"
TB="$2"
pgsql2shp -f "$OUTPUT_DATA/$TB" -h $host -p $port -u $user $database "$SQL"
zip -j "$OUTPUT_DATA/$TB.zip" "$OUTPUT_DATA/$TB.shp" "$OUTPUT_DATA/$TB.shx" "$OUTPUT_DATA/$TB.prj" "$OUTPUT_DATA/$TB.dbf" "$OUTPUT_DATA/$TB.cpg"
}
for TABLE in ${TABLES[@]}
do
GET_PK="SELECT a.attname "
GET_PK="${GET_PK}FROM pg_index i "
GET_PK="${GET_PK}JOIN pg_attribute a ON a.attrelid = i.indrelid AND a.attnum = ANY(i.indkey) "
GET_PK="${GET_PK}WHERE i.indrelid = 'test'::regclass AND i.indisprimary"
PK=($($PG_BIN/psql $PG_CON -t -c "$GET_PK"))
MAX_ROWS=($($PG_BIN/psql $PG_CON -t -c "SELECT COUNT(*) FROM $schema.$TABLE"))
LIMIT=10000
OFFSET=0
# base query
DATA_QUERY="SELECT * FROM $schema.$TABLE"
# continue until all data are fetched.
while [ $OFFSET -le $MAX_ROWS ]
do
DATA_QUERY_P="$DATA_QUERY ORDER BY $PK OFFSET $OFFSET LIMIT $LIMIT"
OFFSET=$(( OFFSET+LIMIT ))
SHP_NAME="${TABLE}_${OFFSET}"
if [[ ${#FILTER[@]} -gt 0 ]]; then
echo "Has filter by table name"
if [[ " ${FILTER[@]} " =~ " ${TABLE} " ]]; then
export_shp "$DATA_QUERY_P" "$SHP_NAME"
fi
else
export_shp "$DATA_QUERY_P" "$SHP_NAME"
fi
# remove intermediate files
if [[ "$RM_SHP" = "yes" ]]; then
rm -f $OUTPUT_DATA/$SHP_NAME.{shp,shx,prj,dbf,cpg}
fi
done
done
Common config file
Configuration file for PostgreSQL connection used in both examples (pgconfig):
user="postgres"
host="my_ip_or_hostname"
port="5432"
database="my_database"
schema="my_schema"
password="secret"
Another strategy is to choose GeoPackage as the output format; it supports larger file sizes than the shapefile format while maintaining portability across operating systems and enjoying broad support in GIS software.
ogr2ogr -f GPKG output_file.gpkg PG:"host=my_ip_or_hostname user=postgres dbname=my_database password=secret" -sql "SELECT * FROM my_schema.my_table"
References:
Retrieve primary key columns - Postgres
LIMIT, OFFSET, ORDER BY and Pagination in PostgreSQL
ogr2ogr - GDAL

Conditionally construct schema depending on database version in PostgreSQL

Assume the following table and custom range type:
create table booking (
identifier integer not null primary key,
room uuid not null,
start_time time without time zone not null,
end_time time without time zone not null
);
create type timerange as range (subtype = time);
In PostgreSQL v10, you can do:
alter table booking add constraint overlapping_times
exclude using gist
(
room with =,
timerange(start_time, end_time) with &&
);
In PostgreSQL v9.5/v9.6, you have to manually cast the uuid column, as btree_gist does not support uuid:
alter table booking add constraint overlapping_times
exclude using gist
(
(room::text) with =,
timerange(start_time, end_time) with &&
);
I would like to support v9.5, v9.6 and v10 for my customers. Is there a way to conditionally add the above constraint in the same .sql file, depending on the version of the current database?
You can use dynamic SQL.
For example:
do
$block$
declare
l_version text;
begin
select setting into l_version from pg_settings where name = 'server_version';
execute
format(
$script$
alter table booking add constraint overlapping_times
exclude using gist
(
%s with =,
timerange(start_time, end_time) with &&
)
$script$,
case when (l_version like '9.5.%' or l_version like '9.6.%') then '(room::text)' else 'room' end
)
;
end;
$block$
language plpgsql;
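As a variation (my suggestion, not part of the original answer), comparing the numeric server_version_num setting avoids pattern-matching on the version string; values below 100000 mean a server older than version 10:
do
$block$
begin
-- server_version_num is e.g. 90605 for 9.6.5 and 100004 for 10.4
if current_setting('server_version_num')::int < 100000 then
execute 'alter table booking add constraint overlapping_times
exclude using gist ((room::text) with =, timerange(start_time, end_time) with &&)';
else
execute 'alter table booking add constraint overlapping_times
exclude using gist (room with =, timerange(start_time, end_time) with &&)';
end if;
end;
$block$
language plpgsql;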
Here is a proof of concept:
#!/bin/sh
#THE_HOST="192.168.0.104"
THE_HOST="192.168.0.101"
get_version ()
{
psql -t -h ${THE_HOST} -U postgres postgres <<OMG | awk -e '{ print $2; }'
select version();
OMG
}
# ############################################################################
# main
#
# - Connect to database to retrieve version
# - use the retrieved version to create a symlink "versioned" to
# one of our subdirs
# - call an sql script that includes this symlink/some.sql
# symlinking to a non-existing directory will cause a dead link
# (and the script to fail).
# ergo: there should be a subdir "verX.Y.Z" for every supported version X.Y.Z
# ############################################################################
pg_version=`get_version`
#echo "version=${pg_version}"
rm versioned
ln -fs "ver${pg_version}" versioned
if [ -f versioned/alter.sql ]; then
echo created link versioned to "ver${pg_version}"
else
echo "version ${pg_version} not supported today..."
echo "Failed!"
exit 1
fi
psql -t -h ${THE_HOST} -U postgres postgres <<LETS_GO
\i common/create.sql
\i versioned/alter.sql
\echo done!
LETS_GO
#eof