T-SQL: How to output whitespace with SUBSTRING on an empty column

I am trying to convert an Informix select query to T-SQL.
On a Date column that is in YYYYMMDD format, the Informix query builds Statusdate in the select list as shown below. I wrote the equivalent T-SQL, which works perfectly and outputs Statusdate in the format YYYY-MM-DD.
Informix:
Date[1,4]||"-"||Date[5,6]||"-"||Date[7,8] as StatusDate
T-SQL:
SUBSTRING(Date,1,4)+'-'+SUBSTRING(Date,5,2)+'-'+SUBSTRING(Date,7,2) AS StatusDate
But if the Date column is empty, the Informix output is [    -  -  ], i.e. 4 spaces, hyphen, 2 spaces, hyphen, 2 spaces,
while the SQL Server output is [--], i.e. no whitespace.
How can I update the T-SQL query to get output similar to Informix when the column is empty?

You could just use a CASE expression here to explicitly handle the output for NULL dates:
CASE WHEN Date IS NOT NULL
     THEN LEFT(Date, 4) + '-' + SUBSTRING(Date, 5, 2) + '-' + RIGHT(Date, 2)
     ELSE '    -  -  ' END AS StatusDate
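The same idea can also be expressed as "pad first, then slice". Below is a minimal sketch in Python (not T-SQL) that only illustrates the logic: the value is padded to 8 characters so that each slice keeps its full width. The function name format_status_date is hypothetical, and treating NULL like an empty string is an assumption about the desired behaviour.

```python
def format_status_date(raw):
    """Format a YYYYMMDD string as YYYY-MM-DD, keeping the widths for empty input."""
    # Treat None (SQL NULL) like an empty string, then pad to 8 characters
    # so every slice below always has its full width.
    padded = (raw or "").ljust(8)
    return f"{padded[0:4]}-{padded[4:6]}-{padded[6:8]}"


print(repr(format_status_date("20200519")))
print(repr(format_status_date("")))
print(repr(format_status_date(None)))
```

An empty or missing value comes out as 4 spaces, a hyphen, 2 spaces, a hyphen and 2 spaces, matching the Informix output described in the question.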

Related

Create rows from part of column names

Source data
I am working on an ELT project to load data from CSV files into PostgreSQL where I will transform it. The CSV files have many columns that are consistent across files, but also contain activity columns that are inconsistent with names like Date (05/19/2020), Type (05/19/2020), etc.
In the loading script I am merging all of the columns with dates in the column name into one jsonb column so I don't have to constantly add new columns to the raw data table.
The resulting jsonb column in the raw data table looks like this:
id       | activity
---------+----------
12345678 | {"Date (05/19/2020)": null, "Type (05/19/2020)": null, "Date (06/03/2020)": "06/01/2020", "Type (06/03/2020)": "E"}
98765432 | {"Date (05/19/2020)": "05/18/2020", "Type (05/19/2020)": "B", "Date (10/23/2020)": "10/26/2020", "Type (10/23/2020)": "T"}
JSON to columns
Using the amazing create_jsonb_flat_view function from this post I can convert the jsonb to columns like this:
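id       | Date (05/19/2020) | Type (05/19/2020) | Date (06/03/2020) | Type (06/03/2020) | Date (10/23/2020) | Type (10/23/2020)
---------+-------------------+-------------------+-------------------+-------------------+-------------------+------------------
10629465 | null              | null              | 06/01/2020        | E                 |                   |
98765432 | 05/18/2020        | B                 |                   |                   | 10/26/2020        | T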
Need to move part of column name to row
Now, this is where I'm stuck. I need to remove the portion of the column name that is the Activity Date (e.g. (05/19/2020)) and create a row for each id and ActivityDate with additional columns for Date and Type like this:
id       | ActivityDate | Date       | Type
---------+--------------+------------+------
12345678 | 05/19/2020   | null       | null
12345678 | 06/03/2020   | 06/01/2020 | E
98765432 | 05/19/2020   | 05/18/2020 | B
98765432 | 10/23/2020   | 10/26/2020 | T
I followed your link to the create_jsonb_flat_view article yesterday and then forgot this question. While I thank you for pointing me there, I think that mentioning it worked against you.
A more conventional approach using regexp_replace() works here. I left the date values as strings, but you can convert them with to_date() if needed:
with parse as (
  select id, e.k, e.v,
         regexp_replace(e.k, '\s+\([0-9/]{10}\)', '') as k_no_date,
         regexp_replace(e.k, '^.+([0-9/]{10}).+', '\1') as k_date_only
    from rawinput
         cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
       k_date_only as activity_date,
       min(v) filter (where k_no_date = 'Date') as date,
       min(v) filter (where k_no_date = 'Type') as type
  from parse
 group by id, k_date_only;
@Mike-Organek's answer works beautifully!
However, I was curious if the regexp_replace() calls might be slowing the query down a bit and it seemed I could get the same results using a simpler function.
Since Mike gave me a great example to start with I modified it to split on the space between Date and (05/19/2020).
For 20,000 rows, the query went from an average of 7 seconds on my local machine to an average of 0.9 seconds.
Here is the resulting query:
with parse as (
  select id, e.k, e.v,
         split_part(e.k, ' ', 1) as k_no_date,
         trim(split_part(e.k, ' ', 2), '()') as k_date_only
    from rawinput
         cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
       k_date_only as activity_date,
       min(v) filter (where k_no_date = 'Date') as date,
       min(v) filter (where k_no_date = 'Type') as type
  from parse
 group by id, k_date_only;
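For readers who want to see the split-and-group logic outside the database, here is a minimal Python sketch of the same steps: split each key on the first space, strip the parentheses from the date part, then collect the Date and Type values per (id, activity date). The function name unpivot_activity and the in-memory rows dictionary are assumptions made for illustration; the sample values are taken from the question.

```python
from collections import defaultdict


def unpivot_activity(rows):
    """Turn {id: {"Field (MM/DD/YYYY)": value}} into one record per (id, activity date)."""
    grouped = defaultdict(dict)
    for row_id, activity in rows.items():
        for key, value in activity.items():
            # "Date (05/19/2020)" -> field "Date", activity date "05/19/2020"
            field, _, rest = key.partition(" ")
            activity_date = rest.strip("()")
            grouped[(row_id, activity_date)][field] = value
    return [
        {"id": row_id, "activity_date": activity_date,
         "date": fields.get("Date"), "type": fields.get("Type")}
        for (row_id, activity_date), fields in sorted(grouped.items())
    ]


rows = {
    12345678: {"Date (05/19/2020)": None, "Type (05/19/2020)": None,
               "Date (06/03/2020)": "06/01/2020", "Type (06/03/2020)": "E"},
    98765432: {"Date (05/19/2020)": "05/18/2020", "Type (05/19/2020)": "B",
               "Date (10/23/2020)": "10/26/2020", "Type (10/23/2020)": "T"},
}

for record in unpivot_activity(rows):
    print(record)
```

Like the split_part() version, this assumes every key has exactly the form "Name (date)" with a single space before the parenthesis.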

In Amazon Redshift, service_date is declared as a DATE column, but the IN operator only works when the date values are quoted

select *
from nsclc_thought_spot
where patientid = 11000001
and service_date in ('2019-07-08', '2019-07-10')
order by patientid, service_date
is returning the results properly
But this is not working as expected:
select *
from nsclc_thought_spot
where patientid = 11000001
and service_date in (2019-07-08, 2019-07-10)
order by patientid, service_date
This query is not returning results.
If I have defined service_date column as date, then why do I have to pass the values in quotes inside IN operator in redshift?
Because 2019-07-08 means the integer 2019 minus the integer 7 minus the integer 8, which equals the integer 2004. Without quotes, SQL reads these tokens as numeric values. To have a value interpreted as something else, you need to quote it (which makes it a text value) and then cast it to the data type needed. In this case '2019-07-08' is a text value, but Redshift implicitly casts it to a date to compare it with the service_date column.
If you want to do this explicitly, you can add the cast to the values - ... service_date IN ('2019-07-08'::date,'2019-07-10'::date) ... - which might make things clearer for you.
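The arithmetic is easy to check outside the database. The Python sketch below is only an illustration of the two interpretations; the service_dates list is hypothetical sample data, not taken from the table in the question.

```python
from datetime import date

# Unquoted, the "date" is plain integer subtraction.
unquoted = 2019 - 7 - 8
print(unquoted)

# Quoted text has to be converted to a date before it can match date values.
wanted = {date.fromisoformat("2019-07-08"), date.fromisoformat("2019-07-10")}

# Hypothetical column values standing in for service_date.
service_dates = [date(2019, 7, 8), date(2019, 7, 9), date(2019, 7, 10)]
matches = [d for d in service_dates if d in wanted]
print(matches)
```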

Postgresql - select column based on condition

In this query, the 'Daily' in the CASE will be replaced by a variable. I cannot get the query to work. I want the date column to be either a day, a week, a month or a year based on the value of the variable, but it gives me various errors:
CASE types date and double precision cannot be matched
syntax error near "as"
what am I doing wrong?
select
case 'Daily'
when 'Daily' then DATE(to_timestamp(e.startts)) as "Date",
when 'Weekly' then DATE_PART('week',to_timestamp(e.startts)) as "Date",
when 'Monthly' then to_char(to_timestamp(e.startts), 'mm/yyyy') as "Date",
when 'Yearly' then to_char(to_timestamp(e.startts), 'yyyy') as "Date",
end
sum(e.checked)
from entries e
WHERE
e.startts >= date_part('epoch', '2020-10-01T15:01:50.859Z'::timestamp)::int8
and e.stopts < date_part('epoch', '2021-11-08T15:01:50.859Z'::timestamp)::int8
group by "Date"
CASE ... END is an expression. An expression must have a well-defined data type, so PostgreSQL makes sure that the expressions in the THEN clause have the same data type (or at least compatible ones).
You would need a type cast, probably to text, in the first two branches:
... THEN CAST (date(to_timestamp(e.startts)) AS text)
But it would be much better to use to_char in all branches – there are format codes for everything you need.
An expression can have no alias, only an entry in the SELECT or FROM list can. So you need to append AS "Date" at the end of the CASE ... END expression, not somewhere in the middle.
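The "one result type for every branch" rule can be illustrated outside SQL as well. The following Python sketch is an analogy, not PostgreSQL code: every branch returns a string, much as using to_char in every THEN clause would. The function name period_label and the exact output formats are assumptions chosen for illustration.

```python
from datetime import datetime, timezone


def period_label(start_ts, granularity):
    """Return a text label for an epoch timestamp; every branch yields a str."""
    moment = datetime.fromtimestamp(start_ts, tz=timezone.utc)
    if granularity == "Daily":
        return moment.strftime("%Y-%m-%d")
    if granularity == "Weekly":
        # ISO week number, zero-padded to two digits
        return f"{moment.isocalendar()[1]:02d}"
    if granularity == "Monthly":
        return moment.strftime("%m/%Y")
    if granularity == "Yearly":
        return moment.strftime("%Y")
    raise ValueError(f"unknown granularity: {granularity}")


start_ts = datetime(2020, 10, 1, 15, 1, 50, tzinfo=timezone.utc).timestamp()
for granularity in ("Daily", "Weekly", "Monthly", "Yearly"):
    print(granularity, period_label(start_ts, granularity))
```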

Conversion failed when converting date and/or time from character string. when filtering sql select query

I have a relatively basic query that filters on dates which are saved in a table as nvarchar(200).
I am trying to filter on an InteractionDate field whose values look like this:
'02-03-2018 12:00', '03-04-2018 14:46', '03-04-2018 14:44' etc.
But I get the error "Conversion failed when converting date and/or time from character string." when trying to convert the nvarchar date field.
This is what the query looks like
select
    act.InteractionDate,
    act.Status
from JobCanvas_B2B canvas
inner join PersonActivity_JobCanvas inters on inters.CanvasId = canvas.CanvasId
inner join PersonActivity act on act.PersonActivityId = inters.PersonActivityId
inner join Stage s on s.StageId = act.StageId
where convert(date, act.InteractionDate, 101) > convert(date, '01-01-2018 12:00', 101)
How could I do this date conversion correctly so that the query works?
If you can tolerate dealing with only the date component, then use mask 103 as you did in your answer. If you also need the time component, then we can try going through format mask 120, with some string manipulation along the way:
CONVERT(datetime, SUBSTRING(act.InteractionDate, 7, 4) + '-' +
SUBSTRING(act.InteractionDate, 4, 2) + '-' + LEFT(act.InteractionDate, 2) +
' ' + RIGHT(act.InteractionDate, 5), 120)
Ideally, we should be able to use mask 131 directly, but I could not get it working, at least not with the type of data you have. Instead, the above snippet manually builds a timestamp of the format yyyy-mm-dd hh:mi:ss.
The best long-term solution here is to not store date information as text. If you must do that, then use an ISO format, which is easy to convert with SQL Server's built-in functions.
This did the trick,
where convert(date, act.InteractionDate, 103) > convert(date, '01-01-2018 12:00', 103)
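Style 101 expects the month first (mm/dd/yyyy) while style 103 expects the day first (dd/mm/yyyy), which is presumably why the original query failed on rows where the day is greater than 12. The Python sketch below illustrates the difference with strptime; it is an analogy rather than SQL Server behaviour, and the sample value is hypothetical.

```python
from datetime import datetime


def parse_day_first(text):
    """Parse 'dd-mm-yyyy hh:mm', the layout used by InteractionDate."""
    return datetime.strptime(text, "%d-%m-%Y %H:%M")


def parse_month_first(text):
    """Parse 'mm-dd-yyyy hh:mm', the layout a month-first style expects."""
    return datetime.strptime(text, "%m-%d-%Y %H:%M")


sample = "25-03-2018 09:30"  # hypothetical value with a day greater than 12

print(parse_day_first(sample))

try:
    parse_month_first(sample)
    month_first_ok = True
except ValueError as exc:
    # There is no month 25, so the month-first interpretation fails.
    month_first_ok = False
    print("month-first parse failed:", exc)
```

Values such as '02-03-2018 12:00' parse under both layouts but yield different dates, so a wrong style can also go unnoticed.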

Extract year from date within WHERE clause

I need to include the EXTRACT() function within a WHERE clause as follows:
SELECT * FROM my_table WHERE EXTRACT(YEAR FROM date) = '2014';
I get a message like this:
pg_catalog.date_part(unknown, text) doesn't exist
SQL State 42883
Here is my_table content (gid INTEGER, date DATE):
gid | date
-------+-------------
1 | 2014-12-12
2 | 2014-12-08
3 | 2013-17-15
I have to do it this way because the query is sent from a form on a website that includes a 'Year' field where users enter the year as 4 digits.
The problem is that your column is of data type text, while EXTRACT() only works for date / time types.
You should convert your column to the appropriate data type. There is no implicit cast from text to date, so a USING clause is required:
ALTER TABLE my_table ALTER COLUMN date TYPE date USING date::date;
That's smaller (4 bytes instead of 11 for the text), faster and cleaner (disallows illegal dates and most typos).
If you have a non-standard format, put a conversion expression in the USING clause. Example:
Alter character field to date
Also, for your queries to be fast with a plain index on date you should rather use sargable predicates. Like:
SELECT * FROM my_table
WHERE date >= '2014-01-01'
AND date < '2015-01-01';
Or, to go with your 4-digit input for the year:
SELECT * FROM my_table
WHERE date >= to_date('2014', 'YYYY')
AND date < to_date('2015', 'YYYY');
You could also be more explicit:
to_date('2014' || '0101', 'YYYYMMDD')
Both produce the same date '2014-01-01'.
Aside: date is a reserved word in standard SQL and a basic type name in Postgres. Don't use it as identifier.
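The half-open range used in the queries above (on or after January 1 of the year, before January 1 of the next year) can be sketched in Python as follows. The function name year_bounds is hypothetical, and the third sample row uses a made-up valid date in 2013 purely for illustration.

```python
from datetime import date


def year_bounds(year_text):
    """Turn 4-digit user input into a half-open [start, end) date range."""
    year = int(year_text)
    return date(year, 1, 1), date(year + 1, 1, 1)


lower, upper = year_bounds("2014")

# (gid, date) pairs; the last date is a hypothetical stand-in.
rows = [(1, date(2014, 12, 12)), (2, date(2014, 12, 8)), (3, date(2013, 7, 15))]

# Same shape as: WHERE date >= lower AND date < upper
selected = [gid for gid, d in rows if lower <= d < upper]
print(lower, upper, selected)
```

Comparing the column directly against two constants is what keeps the predicate sargable; wrapping the column in a function would not allow a plain index to be used.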
This happens because the column has a text or varchar type, as opposed to date or timestamp. This is easily reproducible:
SELECT 1 WHERE extract(year from '2014-01-01'::text)='2014';
yields this error:
ERROR: function pg_catalog.date_part(unknown, text) does not exist
LINE 1: SELECT 1 WHERE extract(year from '2014-01-01'::text)='2014';
^ HINT: No function matches the given name and argument types. You might need to add explicit type casts.
extract, or rather its underlying function date_part, does not exist for text-like data types, but it is not needed here anyway. Extracting the year from this date format is equivalent to taking the first 4 characters, so your query would be:
SELECT * FROM my_table WHERE left(date,4)='2014';