SQL subquery fails in Spark 2 when loading PostgreSQL table

I am facing a very annoying PSQL issue when trying to load part of a PostgreSQL table via a subquery.
The query is :
SELECT
N1,
N2,
N3,
N4
FROM CORR
WHERE CORR_N5 >= (now() - interval '18 year')
AND CORR_N5 <= (now() - interval '18 year' + interval '1 month')
This works when run directly in PgAdmin. However, when I run it from a Spark 2 job, I get the following error message:
org.postgresql.util.PSQLException: ERROR: subquery in FROM must have an alias
Hint: For example, FROM (SELECT ...) [AS] foo.
Even when I put an alias after all the clauses, the same issue happens.
Any advice ?
Thanks in advance

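Melvin, have a look at the link below:
https://pganalyze.com/docs/log-insights/app-errors/U115
subquery in FROM must have an alias
Most likely Spark's JDBC source is placing the text you pass as the table into a FROM clause of its own, so the whole query has to be wrapped in parentheses and given an alias, rather than having an alias appended after the last clause:
SELECT * FROM (
SELECT N1, N2, N3, N4
FROM CORR WHERE CORR_N5 >= (now() - interval '18 year')
AND CORR_N5 <= (now() - interval '18 year' + interval '1 month')
) AS input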
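To make the reason for the alias concrete: Spark's JDBC reader substitutes the `dbtable` value into a FROM clause, and for schema discovery it issues a probe of roughly the form `SELECT * FROM <dbtable> WHERE 1=0`. The sketch below only mimics that string substitution in plain Python. `as_dbtable` and `schema_probe` are hypothetical helper names for illustration, not Spark API.

```python
QUERY = """SELECT N1, N2, N3, N4
FROM CORR
WHERE CORR_N5 >= (now() - interval '18 year')
AND CORR_N5 <= (now() - interval '18 year' + interval '1 month')"""


def as_dbtable(query: str, alias: str = "input") -> str:
    """Wrap a query in parentheses and alias it so it is valid inside FROM."""
    return f"({query.strip().rstrip(';')}) AS {alias}"


def schema_probe(dbtable: str) -> str:
    """Approximate the statement Spark builds around the dbtable value."""
    return f"SELECT * FROM {dbtable} WHERE 1=0"


# Without the wrapper, the probe would contain "FROM SELECT ..." or an
# unaliased subquery, which PostgreSQL rejects.
print(schema_probe(as_dbtable(QUERY)))
```

The string returned by `as_dbtable(QUERY)` is what would be passed as the `dbtable` option of the JDBC reader.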

Related

Why does `date_trunc('day', current_date) + interval '1 day' - interval '1 second'` cause query to hang?

When I set up a date range using max(lasttime) for the upper bound, the query works.
range_values as (
select date_trunc('month', current_date) as minval,
max(lasttime) as maxval
from people
)
When I use date_trunc('day', current_date) + interval '1 day' - interval '1 second' for the upper bound, the query hangs seemingly forever.
range_values as (
select date_trunc('month', current_date) as minval,
(
date_trunc('day', current_date) + interval '1 day' - interval '1 second'
) as maxval
from people
)
Here's how those values differ.
select max(lasttime) as max_lasttime, (date_trunc('day', current_date) + interval '1 day' - interval '1 second') as end_of_day from people;
{
"max_lasttime": "2023-02-13 07:30:01",
"end_of_day": "2023-02-13 23:59:59-07"
}
I expected this would not make a difference. Why does it?
PostgreSQL 10.18 (Ubuntu 10.18-0ubuntu0.18.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
In the 1st query, you are using an aggregate (max) so the output is a single row, regardless of the size of the people table.
In the 2nd query, you are fetching 2 constant values for every row in the people table, so chances are you are building a massive cross join with another (or the same!) table.
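The row-count difference can be reproduced on a tiny table. This is a minimal sketch that uses an in-memory SQLite database as a stand-in for PostgreSQL (an assumption made so the example runs anywhere); the date functions differ from PostgreSQL's, but the aggregate-versus-constant behaviour is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (lasttime TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [("2023-02-13 07:30:01",), ("2023-02-12 09:00:00",), ("2023-02-11 18:45:10",)],
)

# Aggregate: max() collapses the whole table into a single row.
aggregate_rows = conn.execute(
    "SELECT date('now', 'start of month') AS minval, "
    "max(lasttime) AS maxval FROM people"
).fetchall()

# Constants only: the expressions are evaluated once per row of people,
# so the result has as many rows as the table.
constant_rows = conn.execute(
    "SELECT date('now', 'start of month') AS minval, "
    "datetime('now', 'start of day', '+1 day', '-1 second') AS maxval "
    "FROM people"
).fetchall()

print(len(aggregate_rows), len(constant_rows))
```

If range_values is meant to hold a single row, one option is to drop `from people` from the second version, since neither expression reads a column of that table.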

oracle Level column to postgres conversion

I have an Oracle query that I need to convert to Postgres:
SELECT cast(to_char(ADD_MONTHS(TRUNC(ADD_MONTHS(SYSDATE, -6),'MM'),LEVEL - 1),'MMYYYY') as number) monthid,
to_char (ADD_MONTHS(TRUNC(ADD_MONTHS(SYSDATE, -6),'MM'), LEVEL - 1),'MON-YYYY') monthdesc
From dual
CONNECT BY LEVEL <= MONTHS_BETWEEN(SYSDATE, ADD_MONTHS(SYSDATE, -6)) + 1;
I tried a CTE with generate_series, but I am stuck getting this result set:
---------------------
MONTHID MONTHDESC
---------------------
172022 JUL-2022
82022 AUG-2022
92022 SEP-2022
102022 OCT-2022
112022 NOV-2022
122022 DEC-2022
12023 JAN-2023
This will generate a list of the last six months:
select to_char(g.dt, 'mmyyyy') as monthid,
to_char(g.dt, 'MON-yyyy') as monthdesc
from generate_series(date_trunc('month', current_date) - interval '6 month',
date_trunc('month', current_date), interval '1 month') as g(dt)
However this will return 072022 for July 2022, not 172022 as in your sample data.
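As a cross-check of what the series should contain, here is a small Python sketch of the same month arithmetic. The fixed reference date and the `month_series` helper are assumptions for illustration; the month abbreviations are hard-coded to avoid locale differences.

```python
from datetime import date

MONTH_ABBR = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
              "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def month_series(today: date, months_back: int = 6):
    """Return (monthid, monthdesc) from months_back months ago through the current month."""
    start_index = today.year * 12 + (today.month - 1) - months_back
    rows = []
    # Both endpoints are included, as with generate_series(start, stop, step).
    for offset in range(months_back + 1):
        year, month0 = divmod(start_index + offset, 12)
        rows.append((f"{month0 + 1:02d}{year}", f"{MONTH_ABBR[month0]}-{year}"))
    return rows


rows = month_series(date(2023, 1, 15))
for monthid, monthdesc in rows:
    # int() drops the leading zero, which is the effect of Oracle's cast to number.
    print(int(monthid), monthdesc)
```

Because both endpoints are included, the series has seven entries (July 2022 through January 2023 for the reference date used here).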

PostgreSQL: Date Calendar Days Interval Scenario

I would like to print this table (displaying only 4 rows for brevity):
Dates        Period
01-MAR-2022  61
02-MAR-2022  61
03-MAR-2022  61
30-APR-2022  61
So far I have:
SELECT CAST(TRUNC(date_trunc('month',CURRENT_DATE) + interval '-2 month') AS DATE) + (n || 'day')::INTERVAL AS Dates
, date_trunc('month',CURRENT_DATE) + interval '-2 month' + INTERVAL '2 month' - date_trunc('month',CURRENT_DATE) + interval '-2 month' AS Period
FROM generate_series(0,61) n
Please help with a better way of generating the period and also replacing the hard-coded 61 in generate_series(0,61).
Thanks!
What are you actually trying to accomplish? It is neither clear nor specified. BTW, your query is invalid. It appears you are looking to list each date from the first day of the month 2 months prior through the last day of the previous month, along with the total number of days in that range. The following gives the first date, and date subtraction gives the number of days.
with full_range( first_dt, num_days) as
( select date_trunc ('month', (current_date - interval '2 months'))::date
-- end of range (exclusive): first day of the current month, which stays correct on the 1st
, date_trunc ('month', current_date)::date -
date_trunc ('month', (current_date - interval '2 months'))::date
)
select *
from full_range;
With that in hand you can use the num_days with generate series with the expression
select generate_series(0, num_days-1) from full_range
Finally, combine the above to arrive at:
with full_range( first_dt, num_days) as
( select date_trunc ('month', (current_date - interval '2 months'))::date
-- end of range (exclusive): first day of the current month, which stays correct on the 1st
, date_trunc ('month', current_date)::date -
date_trunc ('month', (current_date - interval '2 months'))::date
)
select (first_dt + n*interval '1 day')::date, num_days
from full_range
cross join (select generate_series(0, num_days-1) from full_range) gn(n);
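The same calculation can be checked outside the database. This is a sketch under two assumptions: a fixed reference date is used instead of current_date, and the end of the range is taken to be the first day of the current month (exclusive). `first_of_month`, `full_range`, and `calendar` are hypothetical helper names.

```python
from datetime import date, timedelta


def first_of_month(d: date, months_back: int = 0) -> date:
    """First day of the month that lies months_back months before d."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) - months_back, 12)
    return date(year, month0 + 1, 1)


def full_range(today: date):
    """Return (first_dt, num_days) covering the two full months before today's month."""
    first_dt = first_of_month(today, 2)
    num_days = (first_of_month(today) - first_dt).days
    return first_dt, num_days


def calendar(today: date):
    """One (date, num_days) row per day in the range, like the final query."""
    first_dt, num_days = full_range(today)
    return [(first_dt + timedelta(days=n), num_days) for n in range(num_days)]


rows = calendar(date(2022, 5, 10))
print(rows[0], rows[-1], len(rows))
```

With a reference date in May 2022 the range runs from 1 March to 30 April, which is 31 + 30 = 61 days, matching the table in the question.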

expressions including functions for values in interval type for postgres

Is there a way in Postgres to use an expression inside an interval value? For example, I can execute this query:
select (timestamp '1970-01-01 00:00:00' + interval '1 second' * J0.C3) from T637 J0
Now I need to set the value portion of '1 second' dynamically. Writing '(1+2) seconds' fails in Postgres. Likewise, if the value comes from a function that returns an integer, I can't use that function inside the interval, e.g. interval func(args) seconds. How can I get such a dynamic value portion for an interval in Postgres?
interval '(1+2) seconds' is not a valid expression. However, interval '1 second' * 2 is valid.
So, to get the equivalent of '(m+n) seconds', one would typically do either:
(interval '1 second' * m) + (interval '1 second' * n)
or
(m+n) * interval '1 second'
Likewise, if a function returns a numeric value and we want an interval of that magnitude (in seconds), the following could be used:
func(args) * (interval '1 second')
For this kind of "interval building", make_interval() is quite handy:
make_interval(secs => (1 + 2) * some_column);
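The pattern is "scale a unit interval by a computed number" rather than "build the number into the literal". Python's `timedelta` follows the same idea, so it can serve as a rough analogy; this is not PostgreSQL code, and `seconds_interval` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def seconds_interval(value) -> timedelta:
    """Scale a one-second unit by a computed value, like value * interval '1 second'."""
    return value * timedelta(seconds=1)


m, n = 1, 2
epoch = datetime(1970, 1, 1)

# (m + n) * unit and m * unit + n * unit give the same interval.
combined = seconds_interval(m + n)
summed = seconds_interval(m) + seconds_interval(n)

print(combined == summed, epoch + combined)
```

The value passed in can come from any expression or function call, which is the point of multiplying instead of formatting a literal.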

Date subtraction in postgres

I want to subtract minutes from NOW(), where the number of minutes is read from another table:
SELECT * FROM A, B
WHERE
A.entity_type_id = B.entity_type_id
AND A.status = 'PENDING'
AND A.request_time < (NOW() - INTERVAL B.retry_interval MINUTE)
AND A.retry_count >= B.retry_allowed_count
Here the problem is that B.retry_interval is fetched from another table, whereas normally such queries look like A.request_time < (NOW() - INTERVAL '10 MINUTE').
How do I achieve this?
Multiply the column value by interval '1 minute':
SELECT *
FROM A, B
WHERE
A.entity_type_id = B.entity_type_id
AND A.status = 'PENDING'
AND A.request_time < NOW() - B.retry_interval * INTERVAL '1 minute'
AND A.retry_count >= B.retry_allowed_count
I know this question is quite old, but here is what I do (my Postgres version is 9.4.18).
Use the interval data type: convert the number from the column to a string, concatenate the unit, and cast the result to interval.
SELECT *
FROM A, B WHERE
A.entity_type_id = B.entity_type_id
AND A.status = 'PENDING'
AND A.request_time < (NOW() - concat(B.retry_interval::text, ' minute')::interval)
AND A.retry_count >= B.retry_allowed_count
Thanks to my coworker who found the "interval" data type.