Oracle to Postgres - Convert Keep dense_rank query - postgresql

New to PostgreSQL, and I'm trying to convert an Oracle query that uses KEEP (DENSE_RANK) to Postgres.
Here is the Oracle query that works properly:
SELECT
dbx.bundle_id,
j.JOB_CD||'_'||d.doc_type_cd||wgc.ending_cd as job_doc_type_wgt_grp,
b.create_date,
min (A.ADDRESS1) keep ( dense_rank first order by a.POSTAL_CD) as first_address,
max (A.ADDRESS1) keep ( dense_rank last order by a.POSTAL_CD) as last_address,
min (A.DOCUMENT_ID) keep ( dense_rank first order by a.POSTAL_CD) as first_docid,
max (A.DOCUMENT_ID) keep ( dense_rank last order by a.POSTAL_CD) as last_docid,
min (doc.original_img_name) keep ( dense_rank first order by a.POSTAL_CD) as first_doc_name,
max (doc.original_img_name) keep ( dense_rank last order by a.POSTAL_CD) as last_doc_name,
COUNT(distinct a.DOCUMENT_ID) as doc_count,
SUM(B.PAGES) AS BUNDLE_PAGE_COUNT
FROM ADDRESS A
JOIN DOC_BUNDLE_XREF DBX ON (DBX.DOCUMENT_ID = A.DOCUMENT_ID)
JOIN DOCUMENT DOC ON DOC.DOCUMENT_ID = DBX.DOCUMENT_ID
JOIN BUNDLE B ON B.BUNDLE_ID = DBX.BUNDLE_ID
JOIN JOB J ON J.JOB_ID = B.JOB_ID
JOIN DOC_TYPE D ON D.DOC_TYPE_ID=B.DOC_TYPE_ID
JOIN WEIGHT_GROUP_CD WGC ON WGC.WEIGHT_GROUP_CD_ID = B.WEIGHT_GROUP_CD_ID
WHERE A.ADDRESS_TYPE_ID =
(SELECT MAX( address_type_id )
FROM ADDRESS AI
WHERE AI.document_id =A.DOCUMENT_ID)
AND DBX.BUNDLE_ID in (1404,1405,1407)
group by dbx.BUNDLE_ID, j.JOB_CD||'_'||d.doc_type_cd||wgc.ending_cd, b.create_date;
Here's the PG version:
SELECT
dbx.bundle_id,
j.JOB_CD||'_'||d.doc_type_cd||wgc.ending_cd as job_doc_type_wgt_grp,
b.create_date,
FIRST_VALUE(A.ADDRESS1) OVER (order by a.POSTAL_CD) as first_address,
LAST_VALUE(A.ADDRESS1) OVER (order by a.POSTAL_CD) as last_address,
FIRST_VALUE(A.DOCUMENT_ID) OVER (order by a.POSTAL_CD) as first_docid,
LAST_VALUE(A.DOCUMENT_ID) OVER (order by a.POSTAL_CD) as last_docid,
FIRST_VALUE(doc.original_img_name) OVER (order by a.POSTAL_CD) as first_doc_name,
LAST_VALUE(doc.original_img_name) OVER (order by a.POSTAL_CD) as last_doc_name,
COUNT(distinct a.DOCUMENT_ID) as doc_count,
SUM(B.PAGES) AS BUNDLE_PAGE_COUNT
FROM ADDRESS A
JOIN DOC_BUNDLE_XREF DBX ON (DBX.DOCUMENT_ID = A.DOCUMENT_ID)
JOIN DOCUMENT DOC ON DOC.DOCUMENT_ID = DBX.DOCUMENT_ID
JOIN BUNDLE B ON B.BUNDLE_ID = DBX.BUNDLE_ID
JOIN JOB J ON J.JOB_ID = B.JOB_ID
JOIN DOC_TYPE D ON D.DOC_TYPE_ID=B.DOC_TYPE_ID
JOIN WEIGHT_GROUP_CD WGC ON WGC.WEIGHT_GROUP_CD_ID = B.WEIGHT_GROUP_CD_ID
WHERE A.ADDRESS_TYPE_ID = (
SELECT MAX( address_type_id )
FROM ADDRESS AI
WHERE AI.document_id =A.DOCUMENT_ID)
AND DBX.BUNDLE_ID in (1404,1405,1407)
group by dbx.BUNDLE_ID, j.JOB_CD||'_'||d.doc_type_cd||wgc.ending_cd, b.create_date;
Running this statement yields the following error:
SQL Error [42803]: ERROR: column "a.address1" must appear in the GROUP BY clause or be used in an aggregate function
I've tried MIN and MAX in place of FIRST_VALUE and LAST_VALUE, same results. The same error happens on all of the other FIRST_VALUE and LAST_VALUE functions.
What am I missing here? Any idea why it doesn't recognize a.address1 (or any of the other columns) as being in an aggregate?
I'm using DBeaver version 21 to run these queries if that makes any difference. Any guidance is greatly appreciated.
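For context on the error: FIRST_VALUE and LAST_VALUE are window functions, not aggregates, so Postgres still requires their arguments to satisfy the GROUP BY. A common rewrite computes the window values in an inner query and aggregates them in an outer one. Below is a minimal sketch of that pattern, using SQLite (whose window functions behave like Postgres's for this case) and a made-up two-table-column layout rather than the poster's schema; note that LAST_VALUE also needs an explicit frame, or it only sees rows up to the current one:

```python
import sqlite3

# Invented sample data standing in for the poster's ADDRESS/BUNDLE schema.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE address (bundle_id INT, postal_cd TEXT, address1 TEXT);
INSERT INTO address VALUES
  (1, '30301', 'addr-a'),
  (1, '90210', 'addr-b'),
  (2, '60601', 'addr-c'),
  (2, '10001', 'addr-d');
""")
rows = con.execute("""
SELECT bundle_id,
       MIN(first_address) AS first_address,  -- constant within each group
       MIN(last_address)  AS last_address
FROM (
    SELECT bundle_id,
           FIRST_VALUE(address1) OVER (
               PARTITION BY bundle_id ORDER BY postal_cd) AS first_address,
           -- the default frame stops at the current row, so spell it out:
           LAST_VALUE(address1) OVER (
               PARTITION BY bundle_id ORDER BY postal_cd
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS last_address
    FROM address
) x
GROUP BY bundle_id
ORDER BY bundle_id
""").fetchall()
print(rows)  # one row per bundle, with the first/last address by postal_cd
```

The same shape (window functions in a subquery, MIN/MAX over their now-constant results outside) is the usual Postgres replacement for Oracle's KEEP (DENSE_RANK FIRST/LAST).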

Related

simpler query for counting total rows in the column

I'm not sure whether my query script below is right, as I am trying to get this down to just one query in Oracle SQL:
select distinct (master_id), last_name
from (
    select q2.*, max(count_a) over (partition by master_id) count_b
    from (
        select q1.*,
               count(*) over (partition by master_id order by purchased_date desc) count_a
        from profile q1
    ) q2
)
where count_b > 2
I am trying to minimise the execution time by reducing the number of subqueries. For example, the query above has two:
max (count_a) over (partition by master_id) count_b
count (*) over (partition by master_id order by purchased_date desc ) count_a
so I played around until I arrived at this expression:
max (count (*)) over (partition by master_id) count
SQL query script:
select *
from profile a
join (
    select * from (
        select master_id, max(count(*)) over (partition by master_id) count
        from profile
    )
    where count > 2
) b on a.master_id = b.master_id
Thank you in advance for your help
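For what it's worth, the max of a running count within a partition is just the total count of that partition, so the two nested window subqueries can collapse into a single COUNT(*) OVER with no ORDER BY at all. A sketch of that idea, with SQLite standing in for Oracle and invented data for the question's profile table:

```python
import sqlite3

# Table/column names follow the question (profile, master_id, last_name,
# purchased_date); the rows themselves are made up.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE profile (master_id INT, last_name TEXT, purchased_date TEXT);
INSERT INTO profile VALUES
  (1, 'Smith', '2021-01-01'),
  (1, 'Smith', '2021-02-01'),
  (1, 'Smith', '2021-03-01'),
  (2, 'Jones', '2021-01-15');
""")
rows = con.execute("""
SELECT DISTINCT master_id, last_name
FROM (
    -- one window replaces the nested count_a / count_b pair
    SELECT master_id, last_name,
           COUNT(*) OVER (PARTITION BY master_id) AS cnt
    FROM profile
)
WHERE cnt > 2
""").fetchall()
print(rows)  # only master_id 1 has more than two rows
```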

SQL Group By that works in SQLite does not work in Postgres

This statement works in SQLite, but not in Postgres:
SELECT A.*, B.*
FROM Readings A
LEFT JOIN Offsets B ON A.MeterNum = B.MeterNo AND A.DateTime > B.TimeDate
WHERE A.MeterNum = 1
GROUP BY A.DateTime
ORDER BY A.DateTime DESC
The Readings table contains electric submeter readings, each with a date stamp. The Offsets table holds an adjustment that the user enters after a failed meter is replaced with a new one that starts again at zero. Without the GROUP BY, the query returns a line for each meter reading combined with every prior adjustment made before the reading date, while I only want the last adjustment.
All the docs I've seen on GROUP BY for Postgres indicate I should be including an aggregate function, which I don't need and can't use (the Reading column contains the Modbus string returned from the meter).
Just pick the latest reading in a derived table. In Postgres this can be done quite efficiently using distinct on ()
SELECT A.*, B.*
FROM readings A
left join (
select distinct on (meterno) o.*
from offsets o
order by o.meterno, o.timedate desc
) B ON A.MeterNum = B.MeterNo AND A.DateTime > B.TimeDate
WHERE A.meternum = 1
ORDER BY A.DateTime DESC
distinct on () will only return one row per meterno and this is the "latest" row due to the order by ... , timedate desc
The query might even be faster by pushing the condition on datetime > timedate into the derived table using a lateral join:
SELECT A.*, B.*
FROM readings A
left join lateral (
select distinct on (meterno) o.*
from offsets o
where a.datetime > o.timedate
order by o.meterno, o.timedate desc
) B ON A.MeterNum = B.MeterNo
WHERE A.meternum = 1
ORDER BY A.DateTime DESC
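To make the distinct on () behaviour concrete: it keeps exactly one row per meterno, the one that sorts first under order by meterno, timedate desc. A sketch of the same semantics via ROW_NUMBER() (SQLite here, since it lacks DISTINCT ON; the offsets data is invented):

```python
import sqlite3

# Column names mirror the answer (meterno, timedate); rows are made up.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE offsets (meterno INT, timedate TEXT, adjustment REAL);
INSERT INTO offsets VALUES
  (1, '2021-01-01', 100.0),
  (1, '2021-06-01', 250.0),
  (2, '2021-03-01', 50.0);
""")
rows = con.execute("""
SELECT meterno, timedate, adjustment
FROM (
    -- rn = 1 marks the latest offset per meter, which is exactly what
    -- distinct on (meterno) ... order by meterno, timedate desc returns
    SELECT o.*,
           ROW_NUMBER() OVER (PARTITION BY meterno ORDER BY timedate DESC) AS rn
    FROM offsets o
)
WHERE rn = 1
ORDER BY meterno
""").fetchall()
print(rows)  # one row per meter: the most recent adjustment
```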

Distinct one column return selected columns and order by date desc using sybase

Hi guys, I have a problem: how do I select distinct values of one column while still returning the other selected columns?
Ac_no ord_status order_no
12334 PL 1
12334 ML 2
12334 CL 3
64543 PL 1
65778 JL 6
83887 CL 4
83887 KL 3
Ac_no ord_status order_no
12334 CL 3
64543 PL 1
65778 JL 6
83887 CL 4
That is the result I want to see.
Here is my sample code, but unfortunately it didn't work in Sybase 1.2.0.637:
SELECT Ac_no, ord_status, order_no
FROM (
    select *, ROW_NUMBER() OVER (PARTITION BY Ac_no order by ord_status) rm
    from wo_order
) x
where x = 1
It appears that you want to display, for each Ac_no group of records, the single record having the lowest ord_status. You were on the right track, but you need to restrict the subquery using the alias you defined for the row number:
SELECT Ac_no, ord_status, order_no
FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY Ac_no ORDER BY ord_status) rn
FROM wo_order
) t
WHERE rn = 1;
Here is a version which should run on your version of Sybase, even without using ROW_NUMBER:
SELECT w1.Ac_no, w1.ord_status, w1.order_no
FROM wo_order w1
INNER JOIN
(
SELECT Ac_no, MIN(ord_status) AS min_ord_status
FROM wo_order
GROUP BY Ac_no
) w2
ON w1.Ac_no = w2.Ac_no AND
w1.ord_status = w2.min_ord_status;
Should work without a window function:
select t1.*
from wo_order t1,
     (select max(order_no) order_no, ac_no from wo_order group by ac_no) t2
where t1.ac_no = t2.ac_no
  and t1.order_no = t2.order_no
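As a sanity check, here is the ROW_NUMBER rewrite run against the sample data from the question (SQLite syntax, but the query shape matches the answer's):

```python
import sqlite3

# The wo_order rows exactly as given in the question.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE wo_order (Ac_no INT, ord_status TEXT, order_no INT);
INSERT INTO wo_order VALUES
  (12334,'PL',1),(12334,'ML',2),(12334,'CL',3),
  (64543,'PL',1),(65778,'JL',6),
  (83887,'CL',4),(83887,'KL',3);
""")
rows = con.execute("""
SELECT Ac_no, ord_status, order_no
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Ac_no ORDER BY ord_status) AS rn
    FROM wo_order
) t
WHERE rn = 1
ORDER BY Ac_no
""").fetchall()
print(rows)  # matches the desired result table in the question
```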

How to workaround unsupported percentile_cont in Postgres/Citus?

I have a query similar to this:
select
coalesce(s.import_date, r.import_date) as import_date,
coalesce(s.bedrooms, r.bedrooms) as bedrooms,
coalesce(s.ptype, r.ptype) as property_type,
s.s_price,
s.s_transactions,
....
r.r_rent,
....
from
(
select
sc.import_date,
sc.bedrooms,
sc.ptype,
percentile_cont(array[0.25,0.5,0.75,0.9]) within group (order by sc.asking_price) filter(where sc.price > 0) as s_price,
sum(1) filter(where sc.sold_price > 0) as s_transactions,
......
from prices sc
where sc.ptype = 'F' and sc.bedrooms = 2 and st_Intersects('010300002.....'::geometry,sc.geom)
and sc.import_date between '2012-01-01' and '2019-01-01'
group by sc.import_date, sc.bedrooms, sc.ptype
) s
full join
(
select
rc.import_date,
rc.bedrooms,
rc.ptype,
percentile_cont(array[0.25,0.5,0.75,0.9]) within group (order by rc.rent) filter(where rc.rent > 0) as r_rent,
.....
from rents rc
where rc.ptype = 'F' and rc.bedrooms = 2 and st_Intersects('010300002....'::geometry,rc.geom)
and rc.import_date between '2012-01-01' and '2019-01-01'
group by rc.import_date, rc.bedrooms, rc.ptype
) r
on r.import_date = s.import_date;
When I run it against my distributed tables on Citus/Postgres-11 I get:
ERROR: unsupported aggregate function percentile_cont
Is there any way to workaround this limitation?
AFAIK there is no easy workaround for this.
You can always pull all the data to the coordinator and calculate percentiles there. It is not advisable to do this in the same query though.
SELECT percentile_cont(array[0.25,0.5,0.75,0.9]) within group (order by r.order_col)
FROM
(
SELECT order_col, ...
FROM rents
WHERE ...
) r
GROUP BY ...
This query will pull all the data returned by the inner subquery to the coordinator and calculate the percentiles in the coordinator.
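If the data ends up pulled to the client instead, percentile_cont itself is straightforward to reproduce: for fraction p over n ordered values it takes the continuous rank p*(n-1) and linearly interpolates between the two neighbouring values. A plain-Python sketch with invented sample values:

```python
def percentile_cont(values, fractions):
    """Linear-interpolation percentiles, as percentile_cont computes them."""
    vs = sorted(values)
    n = len(vs)
    out = []
    for p in fractions:
        rank = p * (n - 1)        # continuous rank into the ordered values
        lo = int(rank)
        frac = rank - lo          # how far between the two neighbours
        hi = min(lo + 1, n - 1)
        out.append(vs[lo] + frac * (vs[hi] - vs[lo]))
    return out

# 0.9 falls at continuous rank 3.6, interpolating between 400 and 500 (~460)
print(percentile_cont([100, 200, 300, 400, 500], [0.25, 0.5, 0.75, 0.9]))
```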

JOIN tables inside a subquery in DB2

I'm having trouble with paginating with joined tables in DB2. I want to return rows 10-30 of a query that contains an INNER JOIN.
This works:
SELECT *
FROM (
SELECT row_number() OVER (ORDER BY U4SLSMN.SLNAME) AS ID,
U4SLSMN.SLNO, U4SLSMN.SLNAME, U4SLSMN.SLLC
FROM U4SLSMN) AS P
WHERE P.ID BETWEEN 10 AND 30
This does not work:
SELECT *
FROM (
SELECT row_number() OVER (ORDER BY U4SLSMN.SLNAME) AS ID,
U4SLSMN.SLNO, U4SLSMN.SLNAME, U4SLSMN.SLLC, U4CONST.C4NAME
FROM U4SLSMN INNER JOIN U4CONST ON U4SLSMN.SLNO = U4CONST.C4NAME
) AS P
WHERE P.ID BETWEEN 10 AND 30
The error I get is:
Selection error involving field *N.
Note that the JOIN query works correctly by itself, just not when it's run as a subquery.
How do I perform a join inside a subquery in DB2?
Works fine for me on v7.1 TR9
Here's what I actually ran:
select *
from ( select rownumber() over (order by vvname) as ID, idescr, vvname
from olsdta.ioritemmst
inner join olsdta.vorvendmst on ivndno = vvndno
) as P
where p.id between 10 and 30;
I much prefer the CTE version however:
with p as
( select rownumber() over (order by vvname) as ID, idescr, vvname
from olsdta.ioritemmst
inner join olsdta.vorvendmst on ivndno = vvndno
)
select *
from p
where p.id between 10 and 30;
Finally, note that at 7.1 TR11 (7.2 TR3), IBM added support of the LIMIT and OFFSET clauses. Your query could be re-done as follows:
SELECT
U4SLSMN.SLNO, U4SLSMN.SLNAME, U4SLSMN.SLLC, U4CONST.C4NAME
FROM U4SLSMN INNER JOIN U4CONST ON U4SLSMN.SLNO = U4CONST.C4NAME
ORDER BY U4SLSMN.SLNAME
LIMIT 21 OFFSET 9;
However, note that the LIMIT & OFFSET clauses are only supported in prepared or embedded SQL. You can't use them in STRSQL or STRQMQRY. I believe the "Run SQL Scripts" GUI interface does support them. Here's an article about LIMIT & OFFSET
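The two pagination styles are interchangeable; rows 10 through 30 inclusive is 21 rows, hence LIMIT 21 OFFSET 9. A quick sketch with SQLite standing in for DB2 and a generated hundred-row table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (name TEXT)")
con.executemany("INSERT INTO t VALUES (?)",
                [(f"row{i:03d}",) for i in range(1, 101)])

# ROW_NUMBER style: rows whose 1-based position is between 10 and 30
a = con.execute("""
SELECT name FROM (
    SELECT name, ROW_NUMBER() OVER (ORDER BY name) AS id FROM t
) WHERE id BETWEEN 10 AND 30
""").fetchall()

# LIMIT/OFFSET style: skip 9 rows, take the next 21
b = con.execute("SELECT name FROM t ORDER BY name LIMIT 21 OFFSET 9").fetchall()

print(a == b, len(a))  # both forms return the same 21 rows
```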