Order By Date desc evaluates 01-01-1970 higher than everything else - PostgreSQL

I have the following simplified query:
SELECT
    id,
    to_char(execution_date, 'YYYY-MM-DD') as execution_date
FROM schema.values
ORDER BY execution_date DESC, id DESC
execution_date can be null.
If no value is present in execution_date, it will be set to 1970-01-01 as a default. My problem is that the following table values lead to a result where 1970-01-01 is treated as the newest date.
Table:
id | execution_date
---+---------------
 1 |
 2 | 2020-01-01
 3 | 2022-01-02
 4 |
Result I would expect:
id | execution_date
---+---------------
 3 | 2022-01-02
 2 | 2020-01-01
 4 | 1970-01-01
 1 | 1970-01-01
What I get:
id | execution_date
---+---------------
 4 | 1970-01-01
 1 | 1970-01-01
 3 | 2022-01-02
 2 | 2020-01-01
How can I get the correct order and is it possible to easily return an empty varchar if the date is empty?

If your table has NULL values, not empty values, you can try to use nulls last:
with t as (select 1 as id, NULL::date as dt
           union select 2, '2020-01-01'::date
           union select 3, '2020-01-02'::date
           union select 4, NULL::date)
select *
from t
order by t.dt desc nulls last, id desc;
It should also work for empty text values:
with t as (select 1 as id, ''::text as dt
           union select 2, '2020-01-01'::text
           union select 3, '2020-01-02'::text
           union select 4, NULL::text)
select *
from t
order by t.dt desc nulls last, id desc;
And if you need to change your NULL date to 1970, just use COALESCE():
with t as (select 1 as id, NULL::date as dt
           union select 2, '2020-01-01'::date
           union select 3, '2020-01-02'::date
           union select 4, NULL::date)
select coalesce(t.dt, '1970-01-01'::date) as dt
from t
order by t.dt desc nulls last, id desc;
Here's the dbfiddle

https://dbfiddle.uk/?rdbms=postgres_14&fiddle=5d1fc31a3cf2d3121092f2446cce87e5
SELECT
    id,
    to_char(coalesce(execution_date, '1970-01-01'::date), 'YYYY-MM-DD') as execution_date
FROM values1
ORDER BY execution_date DESC, id DESC;

I did not see the forest for the trees...
Here is the simple solution:
SELECT
    id,
    CASE WHEN execution_date IS NULL THEN ''
         ELSE to_char(execution_date, 'YYYY-MM-DD') END AS execution_date
FROM schema.values
ORDER BY execution_date DESC, id DESC
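For completeness, a minimal sketch of combining both ideas (assuming execution_date is a nullable date column; the alias formatted_date is just an illustrative name): format NULL as an empty string but sort on the underlying date column, so the NULLs reliably go last.
SELECT
    id,
    COALESCE(to_char(execution_date, 'YYYY-MM-DD'), '') AS formatted_date
FROM schema.values
ORDER BY execution_date DESC NULLS LAST, id DESC;
Because the alias differs from the column name, ORDER BY execution_date refers to the date column itself, and NULLS LAST acts on the real NULLs rather than on the formatted text.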

Related

BigQuery SQL: Group rows with shared ID that occur within 7 days of each other, and return values from most recent occurrence

I have a table of datestamped events that I need to bundle into 7-day groups, starting with the earliest occurrence of each event_id.
The final output should return each bundle's start and end date and the 'value' column of the most recent event from each bundle.
There is no predetermined start date, and the '7-day' windows are arbitrary, not 'week of the year'.
I've tried a ton of examples from other posts, but none quite fit my needs, or they use things I'm not sure how to refactor for BigQuery.
Sample data:
Event_Id | Event_Date | Value
---------+------------+-------
1        | 2022-01-01 | 010203
1        | 2022-01-02 | 040506
1        | 2022-01-03 | 070809
1        | 2022-01-20 | 101112
1        | 2022-01-23 | 131415
2        | 2022-01-02 | 161718
2        | 2022-01-08 | 192021
3        | 2022-02-12 | 212223
Expected output:
Event_Id | Start_Date | End_Date   | Value
---------+------------+------------+-------
1        | 2022-01-01 | 2022-01-03 | 070809
1        | 2022-01-20 | 2022-01-23 | 131415
2        | 2022-01-02 | 2022-01-08 | 192021
3        | 2022-02-12 | 2022-02-12 | 212223
You might consider the approach below.
CREATE TEMP FUNCTION cumsumbin(a ARRAY<INT64>) RETURNS INT64
LANGUAGE js AS """
  // a holds the day-gaps between consecutive events of one event_id.
  // Accumulate the gaps and, whenever the running total within the current
  // bin would exceed 6 days, open a new bin and reset the running total.
  bin = 0;
  a.reduce((c, v) => {
    if (c + Number(v) > 6) { bin += 1; return 0; }
    else return c += Number(v);
  }, 0);
  return bin;
""";
WITH sample_data AS (
  select 1 event_id, DATE '2022-01-01' event_date, '010203' value union all
  select 1 event_id, '2022-01-02' event_date, '040506' value union all
  select 1 event_id, '2022-01-03' event_date, '070809' value union all
  select 1 event_id, '2022-01-20' event_date, '101112' value union all
  select 1 event_id, '2022-01-23' event_date, '131415' value union all
  select 2 event_id, '2022-01-02' event_date, '161718' value union all
  select 2 event_id, '2022-01-08' event_date, '192021' value union all
  select 3 event_id, '2022-02-12' event_date, '212223' value
),
binning AS (
  SELECT *, cumsumbin(ARRAY_AGG(diff) OVER w1) bin
  FROM (
    SELECT *, DATE_DIFF(event_date, LAG(event_date) OVER w0, DAY) AS diff
    FROM sample_data
    WINDOW w0 AS (PARTITION BY event_id ORDER BY event_date)
  )
  WINDOW w1 AS (PARTITION BY event_id ORDER BY event_date)
)
SELECT event_id,
       MIN(event_date) start_date,
       ARRAY_AGG(
         STRUCT(event_date AS end_date, value) ORDER BY event_date DESC LIMIT 1
       )[OFFSET(0)].*
FROM binning
GROUP BY event_id, bin;
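To get a feel for how the helper behaves on its own, here is a quick standalone call with made-up gap values (it has to run in the same script as the CREATE TEMP FUNCTION above):
SELECT cumsumbin([3, 2, 9, 1]) AS bin;
-- gaps 3 and 2 stay in bin 0 (running total 5 <= 6); the 9-day gap pushes
-- the total past 6, so a new bin is opened and the function returns 1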

Select dates missing data in a range

I have a postgres table test_table that looks like this:
date | test_hour
------------+-----------
2000-01-01 | 1
2000-01-01 | 2
2000-01-01 | 3
2000-01-02 | 1
2000-01-02 | 2
2000-01-02 | 3
2000-01-02 | 4
2000-01-03 | 1
2000-01-03 | 2
I need to select all the dates that don't have all of test_hour = 1, 2, and 3, so it should return:
date
------------
2000-01-03
Here is what I have tried:
SELECT date FROM test_table WHERE test_hour NOT IN (SELECT generate_series(1,3));
But that only returns dates that have extra hours beyond 1, 2, 3
You can use aggregation and conditional HAVING clauses, like so:
SELECT mydate
FROM mytable
GROUP BY mydate
HAVING
    MAX(CASE WHEN test_hour = 1 THEN 1 END) IS DISTINCT FROM 1
    OR MAX(CASE WHEN test_hour = 2 THEN 1 END) IS DISTINCT FROM 1
    OR MAX(CASE WHEN test_hour = 3 THEN 1 END) IS DISTINCT FROM 1
(IS DISTINCT FROM is used instead of != so that a date with no row at all for a given hour, where the MAX is NULL, still satisfies the condition.)
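An equivalent formulation (a sketch assuming PostgreSQL 9.4 or later, which added the FILTER clause) counts each required hour directly:
SELECT mydate
FROM mytable
GROUP BY mydate
HAVING COUNT(*) FILTER (WHERE test_hour = 1) = 0
    OR COUNT(*) FILTER (WHERE test_hour = 2) = 0
    OR COUNT(*) FILTER (WHERE test_hour = 3) = 0;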
Another possibility would be to join it against the series (or another subquery containing the hours) and do a [distinct] count on the hours aggregated per date:
select date from tst
inner join (select generate_series(1,3) "hour") hours on hours.hour = tst.hour
group by tst.date
having count(distinct tst.hour) < 3;
or
select date from tst
where hour in (select generate_series(1,3))
group by date
having count(distinct tst.hour) < 3;
[You don't need the distinct if date/hour combinations in your table are unique.]
A solution using set difference, giving you exactly the rows that are missing:
(SELECT DISTINCT
date, all_hour
FROM test_table
CROSS JOIN generate_series(1,3) all_hour)
EXCEPT
(TABLE test_table)
And a solution using an array aggregate and the array contains operator:
SELECT date
FROM test_table
GROUP BY date
HAVING NOT array_agg(test_hour) @> ARRAY(SELECT generate_series(1,3))
(online demos)
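For reference, the containment operator used above behaves like this (a minimal standalone check):
SELECT ARRAY[1,2] @> ARRAY[1,2,3];     -- false: hour 3 is missing from the left array
SELECT ARRAY[1,2,3,4] @> ARRAY[1,2,3]; -- true: every required hour is present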

Difference between the max date and the penultimate max for specific employee - postgresql

Bit stuck on a problem. Trying to find the difference between two dates in PostgreSQL.
I have a table emp with many employees in it:
emp_id, date
1, 31-10-2017
1, 08-08-2017
1, 02-06-2017
I want it to look like this:
emp_id, max_date, penultimate_date, difference
1, 31-10-2017, 08-08-2017, 84 days
Obviously you can use max(date) and group by the emp_id; however, how do you retrieve the penultimate date? I have tried a few things like:
order by date desc limit 1 offset 1
I have also tried to put these in subqueries, but that hasn't worked, as there are many employee numbers and I need one row for each employee.
Can anyone help?
Thanks,
pp84
As kindly suggested by @Haleemur Ali, order by date desc limit 1 offset 1 would not work with several emp_ids:
t=# with d(emp_id, date) as (values
        (1, '2017-10-31'::date), (1, '2017-08-08'), (1, '2017-06-02'),
        (2, '2016-01-01'), (2, '2016-02-02'), (2, '2016-03-03'))
    select distinct emp_id
      , max(date) over (partition by emp_id) max_date
      , nth_value(date, 2) over w penultimate_date
      , max(date) over (partition by emp_id) - nth_value(date, 2) over w diff
    from d
    window w as (partition by emp_id order by date desc
                 rows between unbounded preceding and unbounded following)
;
emp_id | max_date | penultimate_date | diff
--------+------------+------------------+------
2 | 2016-03-03 | 2016-02-02 | 30
1 | 2017-10-31 | 2017-08-08 | 84
(2 rows)
Time: 0.756 ms
WITH emps (emp_id, date) AS (
    VALUES (1, '2017-10-31'::DATE)
         , (1, '2017-08-08'::DATE)
         , (1, '2017-06-02'::DATE)
)
SELECT DISTINCT ON (emp_id)
    emp_id
  , "date" max_date
  , LEAD("date") OVER w penultimate_date
  , "date" - LEAD("date") OVER w difference
FROM emps
WINDOW w AS (PARTITION BY emp_id ORDER BY "date" DESC)
ORDER BY emp_id, date DESC
When the window is ordered by date in descending order, LEAD("date") OVER w gives the date value from the next row, i.e. the next-smaller date.
The DISTINCT ON limits the result set to one row (the first row encountered) per emp_id.
With our ordering, this first row must contain the greatest date, and the LEAD(...) OVER w therefore returns the penultimate date. This gives us the following result:
emp_id | max_date | penultimate_date | difference
--------+------------+------------------+------------
1 | 2017-10-31 | 2017-08-08 | 84
(1 row)

Parameter not picking up NULL values in SSRS report appears blank

I need to design a report using date parameters (they need to be in UK date format).
Original date format: 2007-11-30 00:00:00.000
So I am using CONVERT(Date, Start_Date, 103), which gives 2007-11-30, in my query.
Data in Query:
Start_Date | End_Date   | Type
-----------+------------+--------
NULL       | 2016-01-23 | 3 Month
2009-08-11 | 2009-07-27 | 3 Month
NULL       | 2015-10-13 | 3 Month
NULL       | 2016-02-16 | 3 Month
NULL       | 2015-12-28 | 3 Month
Now, while designing the report, I am using this dataset.
Main dataset query:
SELECT Col1, Col2, Start_Date, Target_Date, Col3
FROM Table
WHERE (Col1 IN (@Param1))
  AND (Col2 IN (@Param2))
  AND (Start_Date IN (@Start_Date)) AND (Target_Date IN (@Target_Date))
Now, when I run this report with Start_Date = 2009-08-11, End_Date = 2009-07-27 and Type = 3 Month, I get detailed data in the report.
However, when I select Start_Date = NULL, End_Date = 2016-02-16 and Type = 3 Month, my report appears blank. I checked the dataset and that too returns all values as NULL.
Can you please suggest/help with this issue?
Regards,
AR
Start_Date = NULL
will not work. You need to use
Start_Date IS NULL
or
ISNULL(Start_Date, 0) = 0
so something like this:
SELECT Col1, Col2, Start_Date, Target_Date, Col3
FROM [Table]
WHERE Col1 IN (@Param1)
  AND Col2 IN (@Param2)
  AND ISNULL(Start_Date, 0) IN (ISNULL(@Start_Date, 0))
  AND ISNULL(Target_Date, 0) IN (ISNULL(@Target_Date, 0));
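If the parameters are single-valued rather than multi-value lists, an explicit NULL check is another option; the following is only a sketch under that assumption:
SELECT Col1, Col2, Start_Date, Target_Date, Col3
FROM [Table]
WHERE Col1 IN (@Param1)
  AND Col2 IN (@Param2)
  AND (Start_Date = @Start_Date OR (Start_Date IS NULL AND @Start_Date IS NULL))
  AND (Target_Date = @Target_Date OR (Target_Date IS NULL AND @Target_Date IS NULL));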

Checking missing hours for every id in a table

I have a table that contains a column for ids (id_code) and a time for the transaction (time). What I need is to figure out, for each id, the hours between two dates in which no transaction took place. Let's say I need to check the missing hours for id 1 and id 2 in the table below between 2014-06-13 12:00:00 and 2014-06-13 14:59:59 - the desired result would be that id 1 is missing a transaction at 2014-06-13 13:00:00 and id 2 is missing a transaction at 2014-06-13 14:00:00.
id_code | time
1 | 2014-06-13 12:23:12
2 | 2014-06-13 12:27:23
1 | 2014-06-13 12:56:21
2 | 2014-06-13 13:34:12
1 | 2014-06-13 14:23:56
I am using PostgreSQL 9.3
SQL Fiddle
select c.id, d.time
from
(
select distinct id
from t
) c
cross join
generate_series (
(select date_trunc('hour', min(t.time)) from t),
(select date_trunc('hour', max(t.time)) from t),
interval '1 hour'
) d(time)
left join
(
select id, date_trunc('hour', t.time) as time
from t
group by id, 2
) t on t.time = d.time and c.id = t.id
where t.time is null
order by c.id, d.time
The generate_series builds the set of all possible hours. The cross join turns that into a matrix of all possible ids against all possible hours. Then the t.time is null condition keeps only the id/hour combinations for which no transaction exists.
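As a quick illustration of the building block (a standalone sketch, not tied to the table above), generate_series emits one row per hour in a range:
select generate_series(
    '2014-06-13 12:00:00'::timestamp,
    '2014-06-13 14:00:00'::timestamp,
    interval '1 hour'
) as hour;
-- 2014-06-13 12:00:00
-- 2014-06-13 13:00:00
-- 2014-06-13 14:00:00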
SELECT DISTINCT id, h
FROM t, generate_series('2014-06-13 12:00:00'::timestamp, '2014-06-13 14:59:59'::timestamp, '1 hour') h
EXCEPT
SELECT id, date_trunc('hour', time) FROM t
Thanks to Clodoaldo Neto for providing a useful SQL Fiddle page for testing!