Dividing 2 count statements in PostgreSQL

I have a question about dividing the 2 count statements below, which gives me the error underneath.
(SELECT COUNT(transactions.transactionNumber)
FROM transactions
INNER JOIN account ON account.sfid = transactions.accountsfid
INNER JOIN transactionLineItems ON transactions.transactionNumber
= transactionLineItems.transactionNumber
INNER JOIN products ON transactionLineItems.USIM = products.USIM
WHERE products.gender = 'male' AND products.agegroup = 'adult'
AND transactions.transactionDate >= current_date - interval
'730' day)/
(SELECT COUNT(transactions.transactionNumber)
FROM transactions
WHERE transactions.transactionDate >=
current_date - interval '730' day)
ERROR: syntax error at or near "/"
LINE 6: ...tions.transactionDate >= current_date - interval '730' day)/
I think the problem is that my count statements each produce a table, and dividing one table by another is what fails. How can I make this division work?
Afterwards I want to check the result against a percentage, e.g. < 0.2.
Can anyone help me with this?

Is that your complete query? Something like this works in Postgres 10:
SELECT
(SELECT COUNT(id) FROM test WHERE state = false) / (SELECT COUNT(id) FROM test WHERE state = true) as y
The extra SELECT in front of the two subqueries being divided is what's important. Without it, I also get the error you mentioned.
See also my DB Fiddle version of this query.
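One thing to watch when checking the result against a threshold such as 0.2: COUNT returns an integer type (bigint in Postgres), so dividing one count by another is integer division and the fraction is truncated. Below is a minimal sketch of the effect and of one common workaround, multiplying by 1.0 first. It uses Python's built-in sqlite3 module as a stand-in so that it runs without a server. SQLite truncates integer division in the same way, but the table `sales`, its columns, and its rows are hypothetical and not taken from the question.

```python
import sqlite3

# SQLite stands in for Postgres here; table name, columns and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, segment TEXT)")
conn.executemany(
    "INSERT INTO sales (segment) VALUES (?)",
    [("adult_male",)] + [("other",)] * 7,  # 1 matching row out of 8
)

numerator = "(SELECT COUNT(id) FROM sales WHERE segment = 'adult_male')"
denominator = "(SELECT COUNT(id) FROM sales)"

# Integer / integer: the fraction 1/8 is truncated.
truncated = conn.execute(f"SELECT {numerator} / {denominator}").fetchone()[0]

# Multiplying by 1.0 first forces a non-integer division.
ratio = conn.execute(f"SELECT 1.0 * {numerator} / {denominator}").fetchone()[0]

# The comparison against the threshold can live in the same query.
below_threshold = conn.execute(
    f"SELECT 1.0 * {numerator} / {denominator} < 0.2"
).fetchone()[0]

print(truncated, ratio, below_threshold)
```

In Postgres itself the same idea is often written with a cast, e.g. COUNT(...)::numeric, and wrapping the denominator in NULLIF(..., 0) avoids a division-by-zero error when the second count is 0.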

Related

Postgres Aggregate function in where clause

I am trying to get the count of leads that enrolled in the last 180 days. I have the subquery for leads as
left outer join (select nl.zip, count(distinct lsu.lead_id) as leadCount
from lead_status_updates lsu
join normalized_lead_locations nl on nl.lead_id = lsu.lead_id
where category in ('PropertySell','PropertySearch')
AND lower(status) not like ('invite%')
and DATE_PART('day', current_date - min(lsu.created)) < 181
group by nl.zip) lc on lc.Zip = cbsa.zip
but I get the error message that I can't have aggregate functions in the where clause. Anyone know how to fix this?
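A condition on an aggregate such as min() belongs in a HAVING clause, which is evaluated after GROUP BY, rather than in WHERE, which is evaluated row by row before any grouping happens. The sketch below shows that move, assuming the intent is to keep only the zip codes whose earliest update falls within the last 180 days. It runs on Python's built-in sqlite3 module as a stand-in for Postgres. The table `lead_updates`, its rows, and the fixed "today" are hypothetical, and the cutoff date is computed in Python and bound as a parameter to avoid database-specific date arithmetic.

```python
import sqlite3
from datetime import date, timedelta

# SQLite stands in for Postgres; table and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lead_updates (lead_id INTEGER, zip TEXT, created TEXT)")

today = date(2024, 1, 1)  # fixed so the example is reproducible
conn.executemany(
    "INSERT INTO lead_updates VALUES (?, ?, ?)",
    [
        (1, "10001", (today - timedelta(days=10)).isoformat()),
        (2, "10001", (today - timedelta(days=20)).isoformat()),
        (3, "20002", (today - timedelta(days=400)).isoformat()),  # too old
        (3, "20002", (today - timedelta(days=5)).isoformat()),
    ],
)

cutoff = (today - timedelta(days=180)).isoformat()

# The aggregate condition moves from WHERE to HAVING.
result = conn.execute(
    """
    SELECT zip, COUNT(DISTINCT lead_id) AS lead_count
    FROM lead_updates
    GROUP BY zip
    HAVING MIN(created) >= ?
    ORDER BY zip
    """,
    (cutoff,),
).fetchall()

print(result)
```

In the original subquery this would amount to leaving the non-aggregate conditions in WHERE and placing something like `having min(lsu.created) >= current_date - 180` after the `group by nl.zip`.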

How to fix column "sv.last_updated" must appear in the GROUP BY clause or be used in an aggregate function in PostgreSQL

What I'm trying to achieve is to get the total count of transactions and their total amount for every hour of a given day.
I have written a query using cast in PostgreSQL. When I execute it, it gives:
column "sv.last_updated" must appear in the GROUP BY clause or be used in an aggregate function
select cast('00:00' as time) + g.h * interval '1 hour' as time,
count(sv.id) as counts,
sum(sv.amount) as amount
from generate_series(0, 23, 1) g(h)
left join paymentvirtualization.summery_virtualizer sv
on extract(hour from sv.last_updated) = g.h
and date_trunc('day', sv.last_updated) = '2021-09-28'
and sv.guid = '1aecb2ba5c3941fe9cdab0cbf0c64937'
group by g.h
order by g.h,sv.last_updated desc limit 1;
What causes this error and how can I fix it?
Thanks.
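The error comes from the ORDER BY: sv.last_updated is neither listed in the GROUP BY nor wrapped in an aggregate, so once the rows are collapsed per hour there is no single value to sort by. Dropping it from the ORDER BY (or sorting by max(sv.last_updated) instead) removes the error, and the `limit 1` would have to go as well if one row per hour is wanted. The sketch below shows a query of that shape. It runs on Python's built-in sqlite3 module as a stand-in, so a recursive CTE replaces generate_series and strftime replaces extract(hour from ...). SQLite does not enforce the GROUP BY rule, so this only illustrates a query that satisfies it. The table `payments` and its rows are hypothetical.

```python
import sqlite3

# SQLite stands in for Postgres; table and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (id INTEGER PRIMARY KEY, amount REAL, last_updated TEXT)"
)
conn.executemany(
    "INSERT INTO payments (amount, last_updated) VALUES (?, ?)",
    [
        (10.0, "2021-09-28 09:15:00"),
        (5.0, "2021-09-28 09:45:00"),
        (7.5, "2021-09-28 14:05:00"),
        (99.0, "2021-09-27 09:00:00"),  # different day, must be ignored
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE hours(h) AS (
        SELECT 0 UNION ALL SELECT h + 1 FROM hours WHERE h < 23
    )
    SELECT hours.h,
           COUNT(p.id) AS counts,
           COALESCE(SUM(p.amount), 0) AS amount
    FROM hours
    LEFT JOIN payments p
           ON CAST(strftime('%H', p.last_updated) AS INTEGER) = hours.h
          AND date(p.last_updated) = '2021-09-28'
    GROUP BY hours.h      -- every non-aggregated column is grouped
    ORDER BY hours.h      -- and the sort uses only grouped columns
    """
).fetchall()

for hour, counts, amount in rows:
    if counts:
        print(hour, counts, amount)
```

Hours without any payment still appear in `rows` with a count of 0, which is what the left join against the hour series is for.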

How to pass the result from first query to the second one with PostgresOperator in airflow 2.x?

I have to pass the result of my first Redshift query to the second one. I am using PostgresOperator, but its script doesn't return anything, as you can see in this link.
I thought about modifying the script and adding a return to the execute method. However, I do not call the execute method myself; to execute the SQL script I am using this:
retrieve_latest_query_task = PostgresOperator(
sql='rs_warm-up_query-id.sql',
postgres_conn_id='redshift',
task_id='retrieve_latest_query_ids_from_metadata'
)
Here are my two queries:
SELECT query
FROM (SELECT query,
querytxt,
ROW_NUMBER() OVER (PARTITION BY querytxt ORDER BY query ASC) AS num
FROM stl_query
WHERE userid = 102
AND starttime >= CURRENT_DATE - 2 + INTERVAL '7 hour'
AND starttime < CURRENT_DATE - 2 + INTERVAL '11 hour'
AND UPPER(querytxt) LIKE 'SELECT %'
ORDER BY query)
WHERE num = 1;
The retrieved data (which is a list) then has to be passed to the second script:
SELECT LISTAGG(CASE WHEN LEN (RTRIM(TEXT)) = 0 THEN TEXT ELSE RTRIM(TEXT) END,'') within group(ORDER BY SEQUENCE) AS TEXT
FROM stl_querytext
WHERE query = {};
I thought that using XCom could be a good solution, as I don't return many rows, but I don't know how to use it with Postgres.
I don't want to use a temporary table, as I believe I don't need one for such a small volume.
I'd appreciate your help.

Postgres SQL - How to create a dynamic date variable

I want my query to have a dynamic date. The way it is written now, I would have to manually change the date every time. Please see the following as an example:
(select*
from table2
where table2.begin_timestamp::date = '2015-04-01')as start
left outer join
(Select *
from table 1
where opened_at::date >= ('2015-04-01' - 15)
and opened_at::date <= '2015-04-01')
I don't want '2015-04-01' to be hard-coded. I want to run this query over and over for a series of dates.
Using normal joins, you can do this in an on clause or where clause but not inside the subquery. That leads to logic like this:
from (select *
from table2
) as start left outer join
table1
on opened_at::date >= start.begin_timestamp::date - interval '15 day' and
opened_at::date <= start.begin_timestamp::date
I'm not a Postgres developer, but I think you can adapt a technique from the SQL Server world called "tally tables".
Essentially, your goal is to join each day d to the rows that fall in the window of at most 15 days leading up to it.
You can use something like
SELECT * FROM generate_series('2015-04-01'::timestamp,
'2015-04-30 00:00', '1 days');
to generate a date sequence, and from there you can write something like
-- one row per day in the range; s.o is the day being processed
select *
from generate_series('2015-04-01'::timestamp, '2015-04-30', '1 day') s(o)
join table2 b
on b.begin_timestamp::date = s.o
join table1 a
on a.opened_at::date >= b.begin_timestamp::date - interval '15 days'
and a.opened_at::date <= b.begin_timestamp::date
Essentially, instead of looping you use a series of the dates between the beginning of the interval and the end of the range to produce the results you are after.

'View' (NOT DELETE) Duplicate Rows from a Postgresql table obtained from joins

So I have a temp table that I created by joining three tables:
Trips
Stops
Stop_times
The Stop_times table has a list of trip_ids, the corresponding stops and the scheduled arrival and departure times of buses at those stops.
I searched online and everywhere I seem to find answers for how to delete duplicates (using ctid, nested queries) but not view them.
My query looks something like this :
CREATE TEMP TABLE temp as
SELECT
(CASE st.arrival_time < current_timestamp::time
WHEN true THEN (current_timestamp::date + interval '1 day') + st.arrival_time
ELSE (current_timestamp::date) + st.arrival_time
END) as arrival,
CASE st.departure_time < current_timestamp::time
WHEN true THEN (current_timestamp::date + interval '1 day') + st.departure_time
ELSE (current_timestamp::date) + st.departure_time
END as departure, st.trip_id, st.stop_id, st.stop_headsign,route_id, t.trip_headsign, s.stop_code, s.stop_name, s.stop_lat, s.stop_lon
FROM schema.stop_times st
JOIN schema.trips t ON t.trip_id=st.trip_id
JOIN schema.stops s ON s.stop_id=st.stop_id
order by arrival, departure;
I know that there are duplicates (from comparing select * with select DISTINCT on temp); I just need to identify them. Any help will be appreciated!
PS: I know I can use DISTINCT to get rid of the duplicates, but it slows the query down a lot, so I need to rework the query, and for that I need to identify the duplicates. The result has more than 200,000 records, so exporting them to Excel and filtering the duplicates there is not an option either (I tried, but Excel can't handle it).
I believe this will give you what you want:
SELECT arrival, departure, trip_id, stop_id, stop_headsign, route_id,
trip_headsign, stop_code, stop_name, stop_lat, stop_lon, count(*)
FROM temp
GROUP BY arrival, departure, trip_id, stop_id, stop_headsign, route_id,
trip_headsign, stop_code, stop_name, stop_lat, stop_lon
HAVING count(*) > 1;
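The GROUP BY ... HAVING query returns one row per duplicated combination. If every individual copy should be visible instead, a window count is one possible alternative. The sketch below is a minimal illustration of that variant, not part of the answer above. It runs on Python's built-in sqlite3 module as a stand-in for Postgres (window functions need SQLite 3.25 or newer), and the table `stop_visits` with its three columns is a hypothetical, cut-down version of the temp table.

```python
import sqlite3

# SQLite stands in for Postgres; table and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stop_visits (trip_id TEXT, stop_id TEXT, arrival TEXT)")
conn.executemany(
    "INSERT INTO stop_visits VALUES (?, ?, ?)",
    [
        ("T1", "S1", "08:00:00"),
        ("T1", "S1", "08:00:00"),  # exact duplicate of the row above
        ("T1", "S2", "08:10:00"),
        ("T2", "S1", "09:00:00"),
    ],
)

# Count how many rows share the same values, then keep only rows
# that belong to a group with more than one member.
duplicates = conn.execute(
    """
    SELECT trip_id, stop_id, arrival, n
    FROM (
        SELECT trip_id, stop_id, arrival,
               COUNT(*) OVER (PARTITION BY trip_id, stop_id, arrival) AS n
        FROM stop_visits
    ) AS counted
    WHERE n > 1
    """
).fetchall()

print(duplicates)
```

For the real temp table, the PARTITION BY list would contain the same columns as the GROUP BY above.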