Fetch row count for all databases in PostgreSQL

I'm able to fetch the row count for a specific database in PostgreSQL using the query below:
select current_timestamp - query_start as runtime, datname, usename, query
from pg_stat_activity
where state != 'idle'
order by 1 desc;
+─────────────+──────────+─────────────+
| schemaname | relname | n_live_tup |
+─────────────+──────────+─────────────+
| 2022 | AA | 13960236 |
| 2022 | BB | 7176815 |
| 2022 | CC | 4669837 |
| 2022 | DD | 3782882 |
| 2022 | EE | 3315106 |
| 2022 | FF | 3060672 |
+─────────────+──────────+─────────────+
How can I get the row count for all databases? I will be using Python to query PostgreSQL.
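One possible approach, sketched under assumptions: the output columns above (schemaname, relname, n_live_tup) are those of the pg_stat_user_tables view, and a PostgreSQL connection is bound to a single database, so the sketch lists the databases from pg_database and then opens one connection per database. The function names and the connect parameter are hypothetical; connect is any callable that takes a database name and returns a DB-API connection. Note that n_live_tup is an estimate, not an exact count.

```python
from typing import Callable, Dict, List, Tuple

# Databases that accept connections and are not templates.
LIST_DATABASES_SQL = (
    "SELECT datname FROM pg_database "
    "WHERE datallowconn AND NOT datistemplate ORDER BY datname"
)

# Estimated live rows per user table in the connected database.
TABLE_COUNTS_SQL = (
    "SELECT schemaname, relname, n_live_tup "
    "FROM pg_stat_user_tables ORDER BY n_live_tup DESC"
)


def fetch_all(connect: Callable[[str], object], dbname: str, sql: str) -> List[Tuple]:
    """Open a connection to dbname, run sql, and return all rows."""
    conn = connect(dbname)
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    finally:
        conn.close()


def row_counts_for_all_databases(
    connect: Callable[[str], object], admin_db: str = "postgres"
) -> Dict[str, List[Tuple]]:
    """Return {database name: [(schemaname, relname, n_live_tup), ...]}."""
    # Assumption: a database called "postgres" exists to list the others from.
    databases = [row[0] for row in fetch_all(connect, admin_db, LIST_DATABASES_SQL)]
    return {db: fetch_all(connect, db, TABLE_COUNTS_SQL) for db in databases}
```

With psycopg2 installed, the callable could be something like lambda db: psycopg2.connect(dbname=db, user="me", host="localhost"), where the user and host values are placeholders.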

Select inside select and minimizing running time in PostgreSQL

I am trying to calculate the total number of bookings and the percentage for each hotel per year using PostgreSQL. Here is my code:
WITH distribution_per_year AS
(
SELECT hotel, arrival_date_year,
COUNT(*) AS booking_by_hotel,
(SELECT COUNT(*) AS total_booking FROM "Full_Data" )
FROM "Full_Data"
GROUP BY hotel, arrival_date_year
)
SELECT hotel, arrival_date_year, booking_by_hotel, total_booking,
round(booking_by_hotel *100.00 / total_booking , 2) as percent
FROM distribution_per_year
It worked and gives me the results I wanted:
| hotel | arrival_date_year | booking_by_hotel | total_booking | percent |
|:------ |:----------------- |:---------------- |:------------- |:--------|
| Hotel1 | 2015 | 6526 | 100561 | 6.49 |
| Hotel1 | 2016 | 33210 | 100561 | 33.02 |
| Hotel1 | 2017 | 20064 | 100561 | 19.95 |
| Hotel2 | 2015 | 6758 | 100561 | 6.72 |
| Hotel2 | 2016 | 22434 | 100561 | 22.31 |
| Hotel2 | 2017 | 11569 | 100561 | 11.50 |
My question is: the query takes a while to run. I think it's because the subquery is repeated for every group:
(SELECT COUNT(*) AS total_booking FROM "Full_Data" )
Is there a way to improve this query?
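A possible rewrite, offered as a sketch rather than a measured fix: PostgreSQL usually evaluates an uncorrelated scalar subquery like this once rather than once per group, so the extra cost is more likely the second scan of the table. Summing the grouped counts with a window function gives the same total from the rows already aggregated. The example below runs the query against an in-memory SQLite table as a stand-in (it needs SQLite 3.25 or later for window functions); the sample rows and function names are made up, and the SQL itself should also be valid in PostgreSQL.

```python
import sqlite3

# Aggregate once in the CTE, then compute the grand total with a window
# function over the grouped rows instead of a second pass over "Full_Data".
BOOKING_SHARE_SQL = """
WITH distribution_per_year AS (
    SELECT hotel, arrival_date_year, COUNT(*) AS booking_by_hotel
    FROM "Full_Data"
    GROUP BY hotel, arrival_date_year
)
SELECT hotel,
       arrival_date_year,
       booking_by_hotel,
       SUM(booking_by_hotel) OVER () AS total_booking,
       ROUND(booking_by_hotel * 100.0 / SUM(booking_by_hotel) OVER (), 2) AS percent
FROM distribution_per_year
ORDER BY hotel, arrival_date_year
"""


def demo_connection():
    """Build a tiny in-memory table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "Full_Data" (hotel TEXT, arrival_date_year INTEGER)')
    rows = [("Hotel1", 2015)] * 3 + [("Hotel2", 2015)]
    conn.executemany('INSERT INTO "Full_Data" VALUES (?, ?)', rows)
    return conn


def booking_share(conn):
    """Return (hotel, year, bookings, total, percent) rows."""
    return conn.execute(BOOKING_SHARE_SQL).fetchall()
```

Comparing EXPLAIN ANALYZE output for both versions on the real table would show whether the second scan is actually where the time goes.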

PostgreSQL insert performance - why would it be so slow?

I've got a PostgreSQL database running inside a docker container on an AWS Linux instance. I've got some telemetry running, uploading records in batches of ten. A Python server inserts these records into the database. The table looks like this:
postgres=# \d raw_journey_data ;
Table "public.raw_journey_data"
Column | Type | Collation | Nullable | Default
--------+-----------------------------+-----------+----------+---------
email | character varying | | |
t | timestamp without time zone | | |
lat | numeric(20,18) | | |
lng | numeric(21,18) | | |
speed | numeric(21,18) | | |
There aren't that many rows in the table; about 36,000 presently. But committing the transactions that insert the data is taking about a minute each time:
postgres=# SELECT pid, age(clock_timestamp(), query_start), usename, query
FROM pg_stat_activity
WHERE query != '<IDLE>' AND query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc;
pid | age | usename | query
-----+-----------------+----------+--------
30 | | |
32 | | postgres |
28 | | |
27 | | |
29 | | |
37 | 00:00:11.439313 | postgres | COMMIT
36 | 00:00:11.439565 | postgres | COMMIT
39 | 00:00:36.454011 | postgres | COMMIT
56 | 00:00:36.457828 | postgres | COMMIT
61 | 00:00:56.474446 | postgres | COMMIT
35 | 00:00:56.474647 | postgres | COMMIT
(11 rows)
The load average on the system's CPUs is zero and about half of the 4GB system RAM is available (as shown by free). So what causes the super-slow commits here?
The insertion is being done with SQLAlchemy:
db.session.execute(import_table.insert([
    {
        "email": current_user.email,
        "t": row.t.ToDatetime(),
        "lat": row.lat,
        "lng": row.lng,
        "speed": row.speed
    }
    for row in data.data
]))
Edit: updated output including the state column:
postgres=# SELECT pid, age(clock_timestamp(), query_start), usename, state, query
FROM pg_stat_activity
WHERE query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc;
pid | age | usename | state | query
-----+-----------------+----------+-------+--------
32 | | postgres | |
30 | | | |
28 | | | |
27 | | | |
29 | | | |
46 | 00:00:08.390177 | postgres | idle | COMMIT
49 | 00:00:08.390348 | postgres | idle | COMMIT
45 | 00:00:23.35249 | postgres | idle | COMMIT
(8 rows)
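One thing worth checking, given the second output: the state column is idle, and for sessions that are not active pg_stat_activity reports the last statement that ran in the query column. The age values may therefore be the time since the last COMMIT started rather than the duration of a commit in progress. A way to verify is to time the insert and the commit from the client side. The sketch below does that with plain DB-API calls; it uses an in-memory SQLite table as a stand-in for the real database, and the function names and sample batch are made up.

```python
import sqlite3
import time


def timed_insert(conn, rows):
    """Insert rows and return (execute_seconds, commit_seconds)."""
    start = time.perf_counter()
    conn.executemany(
        "INSERT INTO raw_journey_data (email, t, lat, lng, speed) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    executed = time.perf_counter()
    conn.commit()
    committed = time.perf_counter()
    return executed - start, committed - executed


def demo():
    """Run one batch of ten made-up records and report timings and row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_journey_data "
        "(email TEXT, t TEXT, lat REAL, lng REAL, speed REAL)"
    )
    batch = [("user@example.com", "2021-01-01 00:00:00", 51.5, -0.1, 12.0)] * 10
    timings = timed_insert(conn, batch)
    count = conn.execute("SELECT COUNT(*) FROM raw_journey_data").fetchone()[0]
    return timings, count
```

If the commit time measured this way is small, the minute-long ages in pg_stat_activity are probably just idle sessions waiting for the next batch.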

How is backend_start greater than xact_start?

How can backend_start be more than 2 days later than xact_start/query_start? The 3rd session looks fine, but the first 2 look weird. Is this possible? Does it mean anything?
pg=> select * from pg_catalog.pg_stat_activity where usename = 'etl_user' and state = 'active' and backend_xmin = 65201266;
datid | datname | pid |usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event| state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+----------+------------------------+----------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+-----------------+------------+--------+-------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------
16408 | pg| 37908 | 229661 | etl_user | PostgreSQL JDBC Driver | | | | 2021-04-20 21:36:22.540271+00 | 2021-04-17 22:31:32.314106+00 | 2021-04-17 22:31:32.317577+00 | 2021-04-20 21:36:22.541472+00 | || active | | 65201266 | SELECT 1 FROM (SELECT ...) | parallel worker
16408 | pg| 37909 | 229661 | etl_user | PostgreSQL JDBC Driver | | | | 2021-04-20 21:36:22.540909+00 | 2021-04-17 22:31:32.314106+00 | 2021-04-17 22:31:32.317577+00 | 2021-04-20 21:36:22.542134+00 | || active | | 65201266 | SELECT 1 FROM (SELECT ...) | parallel worker
16408 | pg| 3601 | 229661 | etl_user | PostgreSQL JDBC Driver | 10.175.130.142 | | 49832 | 2021-04-17 22:31:32.232008+00 | 2021-04-17 22:31:32.314106+00 | 2021-04-17 22:31:32.317577+00 | 2021-04-17 22:31:32.317578+00 | || active | | 65201266 | SELECT 1 FROM (SELECT ...) | client backend
(3 rows)
It looks to me like those are parallel workers started up to help the leader, and they inherit the leader's xact_start, but not backend_start. It would help to see the rest of the columns in pg_stat_activity, and know the version.
Yes, that looks impossible.
The only explanation that I have is that someone changed the system time since the sessions started.
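To check the parallel-worker explanation against the output above, the rows can be compared programmatically. This is a small sketch using only the pid, backend_start, xact_start and backend_type values copied from the question; the helper names are made up, and the parser assumes the offset is printed as a whole-hour value such as +00.

```python
from datetime import datetime

# (pid, backend_start, xact_start, backend_type) copied from the output above.
ROWS = [
    (37908, "2021-04-20 21:36:22.540271+00", "2021-04-17 22:31:32.314106+00", "parallel worker"),
    (37909, "2021-04-20 21:36:22.540909+00", "2021-04-17 22:31:32.314106+00", "parallel worker"),
    (3601, "2021-04-17 22:31:32.232008+00", "2021-04-17 22:31:32.314106+00", "client backend"),
]


def parse_ts(text):
    """Parse a timestamp as printed by psql, e.g. '2021-04-20 21:36:22.540271+00'."""
    # Assumption: the offset is whole hours ('+00'); fromisoformat wants '+00:00'.
    return datetime.fromisoformat(text + ":00")


def started_after_xact(rows):
    """Return (pid, backend_type) for sessions whose backend started after their transaction."""
    return [
        (pid, backend_type)
        for pid, backend_start, xact_start, backend_type in rows
        if parse_ts(backend_start) > parse_ts(xact_start)
    ]
```

For these three rows, only the two sessions whose backend_type is parallel worker are flagged, which is consistent with workers being launched later and reporting the leader's transaction start.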

SQL Server left join not returning expected records from left table

I have two objects within a SQL Server 2008 R2 database that I am trying to join with a left join, but I am unable to get the left join to return all records from the left table.
1 table - tt_activityoccurrence
1 view - vw_academicweeks
vw_academicweeks is a view that contains, for each academic year, a week number and the first and last day of each week; it has 52 records per academic year.
tt_activityoccurrence is a table that contains occurrences of lessons within a year. Lessons do not occur in all 52 weeks of the year.
With my query I am trying to return every week from the vw_academicweeks view, giving the following information:
+------------+------------+------------+------------+---------+
| ActivityID | WeekStart | StartTime | EndTime | week_no |
+------------+------------+------------+------------+---------+
| 59936 | 04/09/2017 | 05/09/2017 | 05/09/2017 | 6 |
| 59936 | 11/09/2017 | 12/09/2017 | 12/09/2017 | 7 |
| 59936 | 18/09/2017 | 19/09/2017 | 19/09/2017 | 8 |
| 59936 | 25/09/2017 | 26/09/2017 | 26/09/2017 | 9 |
| 59936 | 02/10/2017 | 03/10/2017 | 03/10/2017 | 10 |
| 59936 | 09/10/2017 | 10/10/2017 | 10/10/2017 | 11 |
| 59936 | 16/10/2017 | 17/10/2017 | 17/10/2017 | 12 |
| 59936 | Null | Null | Null | 13 |
| 59936 | 30/10/2017 | 31/10/2017 | 31/10/2017 | 14 |
| 59936 | 06/11/2017 | 07/11/2017 | 07/11/2017 | 15 |
| 59936 | 13/11/2017 | 14/11/2017 | 14/11/2017 | 16 |
| 59936 | 20/11/2017 | 21/11/2017 | 21/11/2017 | 17 |
| 59936 | 27/11/2017 | 28/11/2017 | 28/11/2017 | 18 |
| 59936 | 04/12/2017 | 05/12/2017 | 05/12/2017 | 19 |
| 59936 | 11/12/2017 | 12/12/2017 | 12/12/2017 | 20 |
| 59936 | 18/12/2017 | 19/12/2017 | 19/12/2017 | 21 |
| 59936 | Null | Null | Null | 22 |
| 59936 | Null | Null | Null | 23 |
+------------+------------+------------+------------+---------+
With the left join I get every row except the ones with nulls, so weeks 13, 22 and 23 are missing from the week_no column. I have also tried this with an outer join but get the same result.
I feel I am missing something obvious but it is escaping me at the moment.
select
ttao.ActivityID
,dateadd(dd,datediff(dd,0,DATEADD(dd, -(DATEPART(dw, ttao.StartTime)-1), ttao.StartTime)),0) WeekStart
,ttao.StartTime
,ttao.EndTime
,aw.week_no
from
vw_AcademicWeeks AW
left join TT_ActivityOccurrence TTAO on
(dateadd(dd,datediff(dd,0,DATEADD(dd, -(DATEPART(dw, ttao.StartTime)-1), ttao.StartTime)),0))=aw.ay_start
where
ay_code='1718' and
TTAO.ActivityID='59936'
order by aw.week_no asc
Your where clause turns it into an inner join by eliminating the rows where the joined table has no match. You need to move this logic up into your join condition. Note: I didn't validate your join condition (the dateadd...datediff logic).
select
ttao.ActivityID
,dateadd(dd,datediff(dd,0,DATEADD(dd, -(DATEPART(dw, ttao.StartTime)-1), ttao.StartTime)),0) WeekStart
,ttao.StartTime
,ttao.EndTime
,aw.week_no
from
vw_AcademicWeeks AW
left join TT_ActivityOccurrence TTAO on
(dateadd(dd,datediff(dd,0,DATEADD(dd, -(DATEPART(dw, ttao.StartTime)-1), ttao.StartTime)),0)) = aw.ay_start
and ay_code='1718'
and TTAO.ActivityID='59936'
order by aw.week_no asc
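The effect of filtering in WHERE versus ON can be reproduced on a small scale. The sketch below uses an in-memory SQLite database with made-up stand-in tables (weeks for the view, occurrences for the lesson table) and a simplified join key, so it only illustrates the join behaviour and is not the original schema. It keeps the ay_code filter in WHERE on the assumption that ay_code is a column of the left-hand view; a filter on the left table placed in ON would not remove left rows.

```python
import sqlite3

# Filter on the right-hand table in WHERE: unmatched weeks are dropped,
# because their activity_id is NULL and fails the comparison.
FILTER_IN_WHERE = """
SELECT w.week_no, o.activity_id
FROM weeks w
LEFT JOIN occurrences o ON o.week_start = w.week_start
WHERE w.ay_code = '1718' AND o.activity_id = 59936
ORDER BY w.week_no
"""

# Same filter moved into ON: every week of the year is kept,
# with NULL where the activity has no occurrence.
FILTER_IN_ON = """
SELECT w.week_no, o.activity_id
FROM weeks w
LEFT JOIN occurrences o
       ON o.week_start = w.week_start AND o.activity_id = 59936
WHERE w.ay_code = '1718'
ORDER BY w.week_no
"""


def demo_connection():
    """Create the stand-in tables with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE weeks (week_start TEXT, week_no INTEGER, ay_code TEXT);
        CREATE TABLE occurrences (activity_id INTEGER, week_start TEXT);
        INSERT INTO weeks VALUES
            ('2017-10-16', 12, '1718'),
            ('2017-10-23', 13, '1718'),
            ('2017-10-30', 14, '1718'),
            ('2016-10-17', 12, '1617');
        INSERT INTO occurrences VALUES
            (59936, '2017-10-16'),
            (59936, '2017-10-30'),
            (11111, '2017-10-23');
    """)
    return conn
```

Running FILTER_IN_WHERE skips week 13, while FILTER_IN_ON returns it with a NULL activity_id, which matches the symptom described in the question.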

Crosstab function and Dates PostgreSQL

I have to create a crosstab table from a query in which the dates become column names. The number of order dates can grow or shrink depending on the date range passed to the query. The order date is stored as a Unix timestamp and converted to a regular date.
The query is as follows:
Select cd.cust_id
, od.order_id
, od.order_size
, (TIMESTAMP 'epoch' + od.order_date * INTERVAL '1 second')::Date As order_date
From consumer_details cd,
consumer_order od
Where cd.cust_id = od.cust_id
And od.order_date Between 1469212200 And 1469212600
Order By od.order_id, od.order_date
The query returns the following table:
cust_id | order_id | order_size | order_date
-----------|----------------|---------------|--------------
210721008 | 0437756 | 4323 | 2016-07-22
210721008 | 0437756 | 4586 | 2016-09-24
210721019 | 10749881 | 0 | 2016-07-28
210721019 | 10749881 | 0 | 2016-07-28
210721033 | 13639 | 2286145 | 2016-09-06
210721033 | 13639 | 2300040 | 2016-10-03
The desired result is:
cust_id | order_id | 2016-07-22 | 2016-09-24 | 2016-07-28 | 2016-09-06 | 2016-10-03
-----------|----------------|---------------|---------------|---------------|---------------|---------------
210721008 | 0437756 | 4323 | 4586 | | |
210721019 | 10749881 | | | 0 | |
210721033 | 13639 | | | | 2286145 | 2300040
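Because the crosstab function from the tablefunc extension needs its output columns declared in the query, a variable set of date columns has to be produced either with dynamically generated SQL or by pivoting on the client. As a rough sketch of the client-side option, the function below pivots the rows returned by the query above. The function name is made up, date columns are ordered by first appearance to match the desired result, and it assumes that when the same customer, order and date repeat, the last value wins.

```python
# Rows as returned by the query above: (cust_id, order_id, order_size, order_date).
ROWS = [
    (210721008, "0437756", 4323, "2016-07-22"),
    (210721008, "0437756", 4586, "2016-09-24"),
    (210721019, "10749881", 0, "2016-07-28"),
    (210721019, "10749881", 0, "2016-07-28"),
    (210721033, "13639", 2286145, "2016-09-06"),
    (210721033, "13639", 2300040, "2016-10-03"),
]


def pivot_orders(rows):
    """Turn order rows into (header, table) with one column per order date."""
    dates = []   # date columns, in order of first appearance
    cells = {}   # (cust_id, order_id) -> {order_date: order_size}
    for cust_id, order_id, order_size, order_date in rows:
        if order_date not in dates:
            dates.append(order_date)
        # Assumption: a repeated (customer, order, date) keeps the last value.
        cells.setdefault((cust_id, order_id), {})[order_date] = order_size
    header = ["cust_id", "order_id"] + dates
    table = [
        [cust_id, order_id] + [sizes.get(d) for d in dates]
        for (cust_id, order_id), sizes in cells.items()
    ]
    return header, table
```

Cells with no order on a given date come back as None, which corresponds to the empty cells in the desired result.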