Tracking amount of bytes written to temp tables - postgresql

With PostgreSQL 9.5, I would like to track the total amount of bytes written (since DB cluster start) to:
1. WAL
2. temp files
3. temp tables
For 1.:
select
pg_size_pretty(archived_count * 16*1024*1024) wal_bytes,
(now() - stats_reset)::text uptime
from pg_stat_archiver;
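The query for 1. multiplies the number of archived WAL segments by the segment size. Here is a minimal Python sketch of the same arithmetic, assuming the default 16 MiB segment size (the function and constant names are hypothetical). Note that it only covers segments the archiver has already processed.

```python
# Assumption: default WAL segment size of 16 MiB; a cluster built with a
# different segment size needs a different value here.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def archived_wal_bytes(archived_count: int, segment_bytes: int = WAL_SEGMENT_BYTES) -> int:
    """Estimate archived WAL volume from pg_stat_archiver.archived_count."""
    if archived_count < 0:
        raise ValueError("archived_count must be non-negative")
    return archived_count * segment_bytes


if __name__ == "__main__":
    # 64 archived segments of 16 MiB each are exactly 1 GiB.
    print(archived_wal_bytes(64))
```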
For 2.:
select
(now() - stats_reset)::text uptime,
pg_size_pretty(temp_bytes) temp_bytes
from pg_stat_database where datname = 'mydb';
How do I get 3.?
In response to a comment below, I did some tests to check where temp tables are actually written.
First, the DB parameter temp_buffers is at 8GB on this cluster:
select pg_size_pretty(setting::bigint*8192) from pg_settings
where name = 'temp_buffers';
-- "8192 MB"
Let's create a temp table:
drop table if exists foo;
create temp table foo as
select random() from generate_series(1, 1000000000);
-- Query returned successfully: 1000000000 rows affected, 10:22 minutes execution time.
Check the PostgreSQL backend pid and OID of the created temp table:
select pg_backend_pid(), 'pg_temp.foo'::regclass::oid;
-- 46573;398695055
Check the RSS size of the backend process:
~$ grep VmRSS /proc/46573/status
VmRSS: 9246276 kB
As can be seen, this is only slightly above the 8GB set with temp_buffers.
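For repeated checks, the grep above could be scripted. This is a rough Python sketch that extracts VmRSS from the content of /proc/<pid>/status; it assumes Linux, and the function names are made up for illustration.

```python
import re
from typing import Optional


def parse_vmrss_kb(status_text: str) -> Optional[int]:
    """Return the VmRSS value in kB from /proc/<pid>/status content, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vmrss_kb(pid: int) -> Optional[int]:
    """Read VmRSS for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vmrss_kb(fh.read())


if __name__ == "__main__":
    # Sample content using the value measured above.
    sample = "Name:\tpostgres\nVmRSS:\t 9246276 kB\n"
    print(parse_vmrss_kb(sample))
```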
The data inserted into the temp table is however immediately written, and it is written to the normal tablespace directories, not temp files:
select * from pg_relation_filepath('pg_temp.foo')
-- "base/16416/t3_398695055"
Here is the number of files and amount written:
with temp_table_files as
(
select * from pg_ls_dir('base/16416/') fn
where fn like 't3_398695055%'
)
select
count(*) as cnt,
pg_size_pretty(sum((pg_stat_file('base/16416/' || fn)).size)) as size
from temp_table_files;
-- 34;"34 GB"
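The same count and size can also be taken at the OS level, assuming shell access to the data directory. The Python sketch below sums the sizes of all files whose names start with a given prefix; the helper name is hypothetical, and the demo uses throwaway files rather than a real cluster directory.

```python
import os
import tempfile


def relation_files_size(directory: str, prefix: str) -> tuple:
    """Return (file_count, total_bytes) for files in directory starting with prefix."""
    count = 0
    total = 0
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name.startswith(prefix):
            count += 1
            total += entry.stat().st_size
    return count, total


if __name__ == "__main__":
    # Self-contained demo with small dummy files.
    with tempfile.TemporaryDirectory() as d:
        for name, size in [("t3_398695055", 100), ("t3_398695055.1", 50), ("16384", 10)]:
            with open(os.path.join(d, name), "wb") as fh:
                fh.write(b"\0" * size)
        print(relation_files_size(d, "t3_398695055"))  # (2, 150)
```

Like the LIKE pattern in the query, a plain prefix match would also pick up files of another relation whose name happens to start with the same digits.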
And finally verify that the set of temp files owned by this backend PID is indeed empty:
with temp_files_per_pid as
(
with temp_files as
(
select
temp_file,
(regexp_replace(temp_file, $r$^pgsql_tmp(\d+)\..*$$r$, $rr$\1$rr$, 'g'))::int as pid,
(pg_stat_file('base/pgsql_tmp/' || temp_file)).size as size
from pg_ls_dir('base/pgsql_tmp') temp_file
)
select pid, pg_size_pretty(sum(size)) from temp_files group by pid order by pid
)
select * from temp_files_per_pid where pid = 46573;
Returns nothing.
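The per-PID grouping in that query relies on temp file names of the form pgsql_tmp<pid>.<n>. Here is a small Python sketch of the same parsing and aggregation, working on (file name, size) pairs collected by any means; the function name and the sample file names are illustrative only.

```python
import re
from collections import defaultdict

# Same pattern as in the SQL above: pgsql_tmp<pid>.<suffix>
TEMP_FILE_RE = re.compile(r"^pgsql_tmp(\d+)\..*$")


def temp_bytes_per_pid(files):
    """Sum sizes per backend PID; files is an iterable of (file_name, size_in_bytes)."""
    totals = defaultdict(int)
    for name, size in files:
        match = TEMP_FILE_RE.match(name)
        if match:  # names that do not follow the pattern are skipped
            totals[int(match.group(1))] += size
    return dict(totals)


if __name__ == "__main__":
    sample = [("pgsql_tmp46573.0", 10), ("pgsql_tmp46573.1", 5), ("pgsql_tmp100.0", 7)]
    print(temp_bytes_per_pid(sample))  # {46573: 15, 100: 7}
```

Unlike the SQL version, where the cast to int would fail on a file name that does not match the pattern, this sketch silently skips such names.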
What is also "interesting": after dropping the temp table with
DROP TABLE foo;
the RSS of the backend process does not reduce:
~$ grep VmRSS /proc/46573/status
VmRSS: 9254544 kB
Doing the following will also not free the RSS again:
RESET ALL;
DEALLOCATE ALL;
DISCARD TEMP;

As far as I know, there is no special metric for temp tables. Temp tables use session (process) memory up to the temp_buffers size (8MB by default). When these temp buffers are full, temporary files are generated.

Related

Get postgres query log statement and duration as one record

I have log_min_duration_statement=0 in my config.
When I check the log file, the SQL statement and its duration are saved in different rows.
(I'm not sure what I have wrong, but statement and duration are not saved together as this answer suggests.)
As I understand it, session_line_num for a duration record always equals session_line_num + 1 of the relevant statement, within the same session of course.
Is this correct? Is the query below reliable for getting the statement and its duration in one row?
(The CSV log is imported into the postgres_log table.)
WITH
sql_cte AS(
SELECT session_id, session_line_num, message AS sql_statement
FROM postgres_log
WHERE
message LIKE 'statement%'
)
,durat_cte AS (
SELECT session_id, session_line_num, message AS duration
FROM postgres_log
WHERE
message LIKE 'duration%'
)
SELECT
t1.session_id,
t1.session_line_num,
t1.sql_statement,
t2.duration
FROM sql_cte t1
LEFT JOIN durat_cte t2
ON t1.session_id = t2.session_id AND t1.session_line_num + 1 = t2.session_line_num;
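To make the pairing rule explicit, here is a short Python sketch that mirrors the LEFT JOIN above on plain tuples. It only restates the session_line_num + 1 assumption from the question and does not show that the assumption holds in every logging configuration; the function name and sample rows are made up.

```python
def pair_statements_with_durations(rows):
    """rows: iterable of (session_id, session_line_num, message) tuples.

    Returns (session_id, session_line_num, statement, duration_or_None) tuples,
    pairing each statement with the duration logged on the next line of the
    same session, like the LEFT JOIN on session_line_num + 1.
    """
    rows = list(rows)
    durations = {
        (sid, line): msg
        for sid, line, msg in rows
        if msg.startswith("duration")
    }
    return [
        (sid, line, msg, durations.get((sid, line + 1)))
        for sid, line, msg in rows
        if msg.startswith("statement")
    ]


if __name__ == "__main__":
    log = [
        ("s1", 1, "statement: select 1"),
        ("s1", 2, "duration: 0.123 ms"),
        ("s2", 1, "statement: select 2"),
    ]
    for row in pair_statements_with_durations(log):
        print(row)
```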

AWS Redshift: FATAL: connection limit "500" exceeded for non-bootstrap users

Hope you're all okay.
We hit this limit quite often. We know there is no way to up the 500 limit of concurrent user connections in Redshift. We also know certain views (pg_user_info) provide info as to the user's actual limit.
We are looking for some answers not found in this forum plus any guidance based on your experience.
Questions:
Would recreating the cluster with bigger EC2 instances yield a higher limit value?
Would adding new nodes to the existing cluster yield a higher limit value?
From the app development perspective: what specific strategies/actions would you recommend in order to spot or predict a situation in which this limit will be hit?
Txs - Jimmy
Okay folks.
thanks to all who answered.
I posted a support ticket in AWS and this is the recommendation, pasting all here, it's long but I hope it works for many people running into this issue. The idea is to catch the situation before it happens:
To monitor the number of connections made to the database, you can create a cloudwatch alarm based on the Database connections metrics that will trigger a lambda function when a certain threshold is reached. This lambda function can then terminate idle connections by calling a procedure that terminates idle connections.
Please find the query that creates a procedure to log and terminate long running inactive sessions:
1. Add view to get all current inactive sessions in the cluster
CREATE OR REPLACE VIEW inactive_sessions as (
select a.process,
trim(a.user_name) as user_name,
trim(c.remotehost) as remotehost,
a.usesysid,
a.starttime,
datediff(s,a.starttime,sysdate) as session_dur,
b.last_end,
datediff(s,case when b.last_end is not null then b.last_end else a.starttime end,sysdate) idle_dur
FROM
(
select starttime,process,u.usesysid,user_name
from stv_sessions s, pg_user u
where
s.user_name = u.usename
and u.usesysid>1
and process NOT IN (select pid from stv_inflight where userid>1
union select pid from stv_recents where status != 'Done' and userid>1)
) a
LEFT OUTER JOIN (
select
userid,pid,max(endtime) as last_end from svl_statementtext
where userid>1 and sequence=0 group by 1,2) b ON a.usesysid = b.userid AND a.process = b.pid
LEFT OUTER JOIN (
select username, pid, remotehost from stl_connection_log
where event = 'initiating session' and username <> 'rsdb') c on a.user_name = c.username AND a.process = c.pid
WHERE (b.last_end > a.starttime OR b.last_end is null)
ORDER BY idle_dur
);
2. Add table for logging information about long running transactions that were terminated
CREATE TABLE IF NOT EXISTS terminated_inactive_sessions (
process int,
user_name varchar(50),
remotehost varchar(50),
starttime timestamp,
session_dur int,
idle_dur int,
terminated_on timestamp DEFAULT GETDATE()
);
3. Add procedure to log and terminate any inactive transactions running for longer than 'n' amount of seconds
CREATE OR REPLACE PROCEDURE terminate_and_log_inactive_sessions (n INTEGER)
AS $$
DECLARE
expired RECORD ;
BEGIN
FOR expired IN SELECT process, user_name, remotehost, starttime, session_dur, idle_dur FROM inactive_sessions where idle_dur >= n
LOOP
EXECUTE 'INSERT INTO terminated_inactive_sessions (process, user_name, remotehost, starttime, session_dur, idle_dur) values (' || expired.process || ' , ''' || expired.user_name || ''' , ''' || expired.remotehost || ''' , ''' || expired.starttime || ''' , ' || expired.session_dur || ' , ' || expired.idle_dur || ');';
EXECUTE 'SELECT PG_TERMINATE_BACKEND(' || expired.process || ')';
END LOOP ;
END ;
$$ LANGUAGE plpgsql;
4. Execute the procedure by running the following command:
call terminate_and_log_inactive_sessions(100);
Here is a sample lambda function that attempts to close idle connections by querying the view 'inactive_sessions' created above, which you can use as a reference.
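import datetime
import logging
import os

import psycopg2

# Connection settings and idle threshold (seconds). Reading them from
# environment variables is an assumption; the variable names are hypothetical
# and the original sample did not show where these values are defined.
db_database = os.environ["DB_DATABASE"]
db_user = os.environ["DB_USER"]
db_password = os.environ["DB_PASSWORD"]
db_port = os.environ["DB_PORT"]
db_host = os.environ["DB_HOST"]
session_idle_limit = int(os.environ["SESSION_IDLE_LIMIT"])

query = "SELECT process, user_name, session_dur, idle_dur FROM inactive_sessions where idle_dur >= %s"

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    # Current time, taken per invocation
    now = datetime.datetime.now()
    try:
        conn = psycopg2.connect(dbname=db_database, user=db_user, password=db_password, port=db_port, host=db_host)
        conn.autocommit = True
    except psycopg2.Error:
        logger.error("ERROR: Unexpected error: Could not connect to Redshift cluster.")
        raise
    logger.info("SUCCESS: Connection to Redshift cluster succeeded")
    try:
        with conn.cursor() as cur:
            # Pass values as query parameters instead of formatting them into the SQL string
            cur.execute(query, (session_idle_limit,))
            for row in cur.fetchall():
                logger.info("terminating session with pid %s that has been idle for %d seconds at %s", row[0], row[3], now)
                cur.execute("SELECT PG_TERMINATE_BACKEND(%s);", (row[0],))
    finally:
        # Always release the connection so the function does not add to the connection count
        conn.close()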
As you said this is a hard limit in Redshift and there is no way to up it. Redshift is not a high concurrency / high connection database.
I expect that if you need the large data analytic horsepower of Redshift you can get around this with connection sharing. Pgpool is a common tool for this.

Importing bytea data into PostgreSQL by using COPY FROM stdin

I generated a (UTF-8) file by an external program for importing into PostgreSQL 9.6.1. Problem is the bytea field (PWHASH).
Snippet from this file (using TAB as delimiter)
COPY USERS (ID,CODE,PWHASH,EMAIL) FROM stdin;
7 test1 E'\\\\x657B954D27B4AC56FA997D24A5FF2563' test#amce.org
\.
When importing with
psql mydb myrole -f test.sql
Everything goes well.
However, if I query the result, the byte array is not 16 bytes, but 37 bytes:
select passwordhash,length(passwordhash) from users;
passwordhash | length
------------------------------------------------------------------------------+--------
\x45275c78363537423935344432374234414335364641393937443234413546463235363327 | 37
What is the correct syntax for this?
The format of the input file is wrong. It should be like this:
7 test1 \\x657B954D27B4AC56FA997D24A5FF2563 test#amce.org
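To see where the 37 bytes come from, here is a rough Python sketch of the two unescaping stages involved: COPY text-format unescaping, followed by bytea input. Both functions are simplified stand-ins written for this one example (they only handle doubled backslashes and the \x hex prefix) and are not the server's actual parsing logic.

```python
def copy_unescape(field: str) -> str:
    """Simplified COPY text-format unescaping: a doubled backslash becomes one."""
    return field.replace("\\\\", "\\")


def bytea_in(text: str) -> bytes:
    """Simplified bytea input.

    Text starting with backslash-x is treated as hex format; anything else is
    treated as escape format, where a doubled backslash is one backslash byte.
    """
    if text.startswith("\\x"):
        return bytes.fromhex(text[2:])
    return text.replace("\\\\", "\\").encode("ascii")


if __name__ == "__main__":
    original = r"E'\\\\x657B954D27B4AC56FA997D24A5FF2563'"  # field as in the question's file
    fixed = r"\\x657B954D27B4AC56FA997D24A5FF2563"          # field as in the corrected file

    stored = bytea_in(copy_unescape(original))
    print(len(stored), stored.hex())  # 37 bytes: the literal characters E'\x...'

    stored = bytea_in(copy_unescape(fixed))
    print(len(stored), stored.hex())  # 16 bytes: the decoded hash
```

In the original file the value does not begin with \x after unescaping, so the E, the quotes and the hex digits are all stored as literal characters.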
I believe you will have to "prepare" the data. Something like this:
t=# insert into u select 'x657B954D27B4AC56FA997D24A5FF2563';
INSERT 0 1
Time: 5990.809 ms
t=# select b from u;
b
----------------------------------------------------------------------
\x783635374239353444323742344143353646413939374432344135464632353633
(1 row)
Time: 0.234 ms
t=# insert into u select decode('657B954D27B4AC56FA997D24A5FF2563','hex');
INSERT 0 1
Time: 62.767 ms
t=# select b from u;
b
----------------------------------------------------------------------
\x783635374239353444323742344143353646413939374432344135464632353633
\x657b954d27b4ac56fa997d24a5ff2563
(2 rows)
Time: 0.208 ms
So in your case you can:
create table t as select ID,CODE,PWHASH::text,EMAIL from users where false;
COPY t (ID,CODE,PWHASH,EMAIL) FROM stdin;
insert into users select ID,CODE,decode(substr(PWHASH,4),'hex'),EMAIL from t;

RedShift copy command return

Can we get the number of rows inserted through the COPY command? Some records might fail, so what is the number of records successfully inserted?
I have a file with JSON objects in Amazon S3 and I am trying to load the data into Redshift using the COPY command. How do I know how many records were successfully inserted and how many failed?
Loading some example data:
db=# copy test from 's3://bucket/data' credentials '' maxerror 5;
INFO: Load into table 'test' completed, 4 record(s) loaded successfully.
COPY
db=# copy test from 's3://bucket/err_data' credentials '' maxerror 5;
INFO: Load into table 'test' completed, 1 record(s) loaded successfully.
INFO: Load into table 'test' completed, 2 record(s) could not be loaded. Check 'stl_load_errors' system table for details.
COPY
Then the following query:
with _successful_loads as (
select
stl_load_commits.query
, listagg(trim(filename), ', ') within group(order by trim(filename)) as filenames
from stl_load_commits
left join stl_query using(query)
left join stl_utilitytext using(xid)
where rtrim("text") = 'COMMIT'
group by query
),
_unsuccessful_loads as (
select
query
, count(1) as errors
from stl_load_errors
group by query
)
select
query
, filenames
, sum(stl_insert.rows) as rows_loaded
, max(_unsuccessful_loads.errors) as rows_not_loaded
from stl_insert
inner join _successful_loads using(query)
left join _unsuccessful_loads using(query)
group by query, filenames
order by query, filenames
;
Giving:
query | filenames | rows_loaded | rows_not_loaded
-------+------------------------------------------------+-------------+-----------------
45597 | s3://bucket/err_data.json | 1 | 2
45611 | s3://bucket/data1.json, s3://bucket/data2.json | 4 |
(2 rows)
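If the client can capture the INFO messages shown earlier, the counts can also be read straight from them. The Python sketch below parses messages in exactly the format printed above; how the messages are captured depends on the client library and is left out, and the function name is hypothetical.

```python
import re

LOADED_RE = re.compile(r"(\d+) record\(s\) loaded successfully")
FAILED_RE = re.compile(r"(\d+) record\(s\) could not be loaded")


def parse_copy_info(messages):
    """Return (rows_loaded, rows_failed) from the INFO lines of one COPY command."""
    loaded = failed = 0
    for line in messages:
        match = LOADED_RE.search(line)
        if match:
            loaded = int(match.group(1))
        match = FAILED_RE.search(line)
        if match:
            failed = int(match.group(1))
    return loaded, failed


if __name__ == "__main__":
    info = [
        "INFO: Load into table 'test' completed, 1 record(s) loaded successfully.",
        "INFO: Load into table 'test' completed, 2 record(s) could not be loaded. "
        "Check 'stl_load_errors' system table for details.",
    ]
    print(parse_copy_info(info))  # (1, 2)
```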

H2 Optimize select statement / shutdown defrag

Test Case:
drop table master;
create table master(id int primary key, fk1 int, fk2 int, fk3 int, dataS varchar(255), data1 int, data2 int, data3 int, data4 int,data5 int,data6 int,data7 int,data8 int,data9 int,b1 boolean,b2 boolean,b3 boolean,b4 boolean,b5 boolean,b6 boolean,b7 boolean,b8 boolean,b9 boolean,b10 boolean,b11 boolean,b12 boolean,b13 boolean,b14 boolean,b15 boolean,b16 boolean,b17 boolean,b18 boolean,b19 boolean,b20 boolean,b21 boolean,b22 boolean,b23 boolean,b24 boolean,b25 boolean,b26 boolean,b27 boolean,b28 boolean,b29 boolean,b30 boolean,b31 boolean,b32 boolean,b33 boolean,b34 boolean,b35 boolean,b36 boolean,b37 boolean,b38 boolean,b39 boolean,b40 boolean,b41 boolean,b42 boolean,b43 boolean,b44 boolean,b45 boolean,b46 boolean,b47 boolean,b48 boolean,b49 boolean,b50 boolean);
create index idx_comp on master(fk1,fk2,fk3);
#loop 5000000 insert into master values(?, mod(?,100), mod(?,5), ?,'Hello World Hello World Hello World',?, ?, ?,?, ?, ?, ?, ?, ?,true,true,true,true,true,true,false,false,false,true,true,true,true,true,true,true,false,false,false,true,true,true,true,true,true,true,false,false,false,true,true,true,true,true,true,true,false,false,false,true,true,true,true,true,true,true,false,false,false,true);
1. The following select statement takes up to 30 seconds. Is there a way to optimize the response time?
SELECT count(*), SUM(CONVERT(b1,INT)) ,SUM(CONVERT(b2,INT)),SUM(CONVERT(b3,INT)),SUM(CONVERT(b4,INT)),SUM(CONVERT(b5,INT)),SUM(CONVERT(b6,INT)),SUM(CONVERT(b7,INT)),SUM(CONVERT(b8,INT)),SUM(CONVERT(b9,INT)),SUM(CONVERT(b10,INT)),SUM(CONVERT(b11,INT)),SUM(CONVERT(b12,INT)),SUM(CONVERT(b13,INT)),SUM(CONVERT(b14,INT)),SUM(CONVERT(b15,INT)),SUM(CONVERT(b16,INT))
FROM master
WHERE fk1=53 AND fk2=3
2. I tried shutdown defrag, but this statement took about 40 minutes for my test case. After shutdown defrag the select takes up to 15 seconds. If I execute the statement again, it takes under 1 second. Even if I stop and start the server, the statement takes about 1 second.
Does H2 have a persistent cache?
Infrastructure: WebBrowser <-> H2 Console Server <-> H2 DB: h2 1.3.158
According to the profiler output, the main problem (93%) is reading from the disk. I ran this in the H2 Console:
#prof_start;
SELECT ... FROM master WHERE fk1=53 AND fk2=3;
#prof_stop;
and got:
Profiler: top 3 stack trace(s) of 48039 ms [build-158]:
4084/4376 (93%):
at java.io.RandomAccessFile.readBytes(Native Method)
at java.io.RandomAccessFile.read(RandomAccessFile.java:338)
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:397)
at org.h2.store.FileStore.readFully(FileStore.java:285)
at org.h2.store.PageStore.readPage(PageStore.java:1253)
at org.h2.store.PageStore.getPage(PageStore.java:707)
at org.h2.index.PageDataIndex.getPage(PageDataIndex.java:225)
at org.h2.index.PageDataNode.getRowWithKey(PageDataNode.java:269)
at org.h2.index.PageDataNode.getRowWithKey(PageDataNode.java:270)
According to EXPLAIN ANALYZE SELECT it's reading over 55'000 pages from the disk (2 KB each page; 110 MB) for this query. I'm not sure how other databases perform for such a query. But I guess if possible the query should be changed so that it reads less data.
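The figures quoted above can be checked with a few lines of Python. This is only the arithmetic behind the 110 MB and 93% numbers, using decimal units (1 MB = 1000 KB); the function names are made up.

```python
def pages_to_mb(pages: int, page_kb: int = 2) -> float:
    """Data volume in MB for a number of pages of page_kb kilobytes each."""
    return pages * page_kb / 1000


def share_percent(part: int, total: int) -> int:
    """Share of part in total, rounded to a whole percent."""
    return round(100 * part / total)


if __name__ == "__main__":
    print(pages_to_mb(55000))         # 110.0
    print(share_percent(4084, 4376))  # 93
```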
Is it possible to have a temporary table/view that already has the datatype conversions done? If it's feasible to have that update itself from the main table occasionally (once a night or so), then a lot of the processing power that goes into the conversion has already been spent.
If that's not feasible, you may want to do multiple sub-selects, one for each "b" column, where you only pull where b# = 1. Then do a COUNT instead of a SUM, which should be faster as well. For instance:
SELECT (count1+count2) AS total, count1, count2
FROM (
SELECT
(SELECT COUNT(*) FROM master WHERE fk1=53 AND fk2=3 AND b1=TRUE) AS count1,
(SELECT COUNT(*) FROM master WHERE fk1=53 AND fk2=3 AND b2=TRUE) AS count2
) AS counts;
I'm not sure if that exact syntax works in your program, but hopefully as a generic SQL idea it gets you on the right track.