Redshift - Weird behaviour from TO_TIMESTAMP function - date

I'm trying to create a timestamp from separated date and time fields. The date field is of 'date' type while the time field is 'varchar'. The table has just over 60k records. Date is in YYYY-MM-DD format, ie 2017-07-25. Time is in HH:MIpm format, ie 4:25pm.
I run the following query to try and create the timestamp:
select *,
to_timestamp(TO_CHAR("date", 'YYYY-MM-DD')||' '||(case when "time"='' then '11:59pm' else "time" end), 'YYYY-MM-DD HH:MIpm') as REQUIRED_TIMESTAMP
from TABLE;
However, it only returns 1023 records. Here's what makes things even weirder:
i. The records that are returned aren't always the same on subsequent runs. However, there are always 1023 records returned. On one run the first few IDs would be 43, 55, 63, 69, etc., and on the next run they'd be different. Sometimes I'd get the following error:
Query 1 ERROR: ERROR: invalid value for "HH" in source string
ii. When I used CAST(... as timestamp) instead, it showed the timestamp in a different format, but again it only returned 1023 records.
iii. It's not a specific time value that is omitted. A record with time 7:30pm may be omitted, but another record with 7:30pm shows up just fine.
Is there something wrong with my query? Could it be a problem with the data or is this an issue with Redshift?

Related

Problem loading an "interval" from a CSV file

I am trying to create a table from a very big CSV file. Among the data, I have a column named 'bouwjaar', which means construction year, and I selected 'date' as its data type. That produced an error, so I changed the data type to interval, but that doesn't work either. It gives me the following error. What should I select as the data type?
ERROR: interval field value out of range: "1971-1980"
CONTEXT: COPY fundadata, line 24, column bouwjaar : "1971-1980"
An interval in PostgreSQL is not something with a starting point and an end point, but a duration like "9 years".
A more appropriate data type for that would be daterange, but the values would have to look like [1971-01-01,1981-01-01). You either have to pre-process the file before loading, or you have to load the data into a text column and post-process it.
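The post-processing route could look roughly like the following. This is only a sketch under assumptions: it supposes bouwjaar was loaded into fundadata as a text column, that the values look like 'YYYY-YYYY', and that the server is PostgreSQL 9.4 or later (for make_date). The column name bouwjaar_range is made up for illustration.

```sql
-- Assumption: fundadata.bouwjaar is a text column holding values such as '1971-1980'.
-- bouwjaar_range is a hypothetical new column, not part of the original table.
ALTER TABLE fundadata ADD COLUMN bouwjaar_range daterange;

UPDATE fundadata
SET bouwjaar_range = daterange(
        make_date(split_part(bouwjaar, '-', 1)::int, 1, 1),      -- first day of the start year
        make_date(split_part(bouwjaar, '-', 2)::int + 1, 1, 1),  -- first day after the end year
        '[)'                                                      -- inclusive lower, exclusive upper bound
    )
WHERE bouwjaar ~ '^[0-9]{4}-[0-9]{4}$';  -- only touch rows that look like 'YYYY-YYYY'
```

Rows that do not match the pattern keep a NULL range, so they can be inspected separately instead of aborting the whole update.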

Date CAST in SQL Server throwing conversion failed for a date

I am experiencing an unusual cast error, and I don't understand why it happens with some dates and only in one particular case.
First of all, I cannot change the current code. It's a dynamic query from a legacy application, assembled from the results of queries against several tables, and that assembled query is the one I am having trouble with.
The error is a classic 'conversion failed when converting date and/or time from character string'.
At first I thought it was a classic file naming error. We obtain the date from the file name in the format YYYYMMDD; the file has a prefix and a suffix and is always formatted like that. It used to be pretty common to get wrongly formatted dates, but that doesn't happen anymore. The issue is interesting because it only happens in one case, for some dates that do not look like errors, for example 20201105, which translates to 11/05/2020 (US format with month first).
This is the query:
SELECT TOP 1 CAST(LEFT(REPLACE(FileName, 'XXYYY_', ''), 8) AS DATE) AS MyDate FROM Mytable
The file name in this case is XXYYY_20201105.txt
Why the top 1? Well, it is a very bad design, there are many rows with the same value and it has to take only one to determine the date.
The most interesting part is that when it fails, I can "fix" the error just by adding one more column:
SELECT TOP 1 CAST(LEFT(REPLACE(FileName, 'XXYYY_', ''), 8) AS DATE) AS MyDate, AnotherColumn
FROM Mytable
This query, with just one added column, doesn't fail. That's the weirdest part. I am trying to wrap my head around the difference between selecting ONE column and TWO columns. Adding any other column seems to make the issue disappear.
Thanks a lot.

postgreSQL increment number in output

I am extracting three values (server, region, max(date)) from my PostgreSQL database, but I want to extract an additional fourth field, which should be the third field plus 1. I am unable to use a date-add function because the date field is defined as an integer in the database.
date type in DB
date|integer|not null
tried using cast and date add function
MAX(s.date)::date + cast('1 day' as interval)
Error Received
ERROR: cannot cast type integer to date
Required output
select server, region, max(alarm_date), next date from table .....
testserver, europe, 20190901, 20190902
testserver2, europe, 20191001, 20191002
next date value should be the addition to alarm_date
To convert an integer like 20190901 to a date, use something like
to_date(CAST(s.date AS text), 'YYYYMMDD')
It is a bad idea to store dates as integers like that. Using the date data type will prevent corrupted data from entering the database, and it will make all operations natural.
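Put together for the query in the question, the conversion might be sketched like this. The alarms CTE and its sample rows are stand-ins for the real table, and the column name alarm_date is taken from the "Required output" line in the question. The result is cast back to an integer only so that it matches the format the question asks for.

```sql
-- Hypothetical sample data standing in for the real table.
WITH alarms (server, region, alarm_date) AS (
    VALUES ('testserver',  'europe', 20190901),
           ('testserver2', 'europe', 20191001)
)
SELECT server,
       region,
       max(alarm_date) AS alarm_date,
       -- integer -> text -> date, add one day, then format back to a YYYYMMDD integer
       to_char(to_date(CAST(max(alarm_date) AS text), 'YYYYMMDD') + 1,
               'YYYYMMDD')::integer AS next_date
FROM alarms
GROUP BY server, region;
```

Because the addition happens on a real date, month and year boundaries roll over correctly (20190930 would become 20191001 rather than 20190931).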
First solution that came to my mind:
select (20190901::varchar)::date + 1
This outputs 2019-09-02 as type date.
Other solutions can be found here.

Why does DateDiff function fail with "Invalid operation: Data value "0" has invalid format" in Redshift

I have a datediff() function that throws an exception.
I am trying to calculate the number of days between two dates. The rub is that one date is an integer value in YYYYMMDD format and the second date field is a timestamp. So, in the snippet below I am doing what I think are the correct conversions. Sometimes it actually runs.
The message I get is: Amazon Invalid operation: Data value "0" has invalid format.
select site, datediff(days,to_date(cast(posting_dt_sk as varchar), 'YYYYMMDD'),trunc(ship_dt)) days_to_ship from sales_table
Later I added a WHERE clause to ignore empty values, thinking I had bad data, but that's not it. I still get the message.
where posting_dt_sk is not null and posting_dt_sk > 0
It all looks right to me, but it fails.

postgreSQL sorting with timestamps

I have the following SQL statement:
SELECT * FROM schema."table"
WHERE "TimeStamp"::timestamp >= '2016-03-09 03:00:05'
ORDER BY "TimeStamp"::date asc
LIMIT 15
What do I expect it to do? Return 15 rows of the table whose timestamp is equal to or later than that date, in ascending order. But Postgres returns the rows in the wrong order: the first item is in the last position.
Does anyone have an idea why the result is this strange?
Use simply ORDER BY "TimeStamp" (without casting to date).
By casting "TimeStamp" to date you throw away the time part of the timestamp, so all values within one day are considered equal and are returned in arbitrary order. It is only by accident that the first rows appear in the order you want.
Don't cast to date in the ORDER BY clause if the time part is relevant for sorting.
Perhaps you are confused because Oracle's DATE type has a time part, which PostgreSQL's doesn't.
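A small self-contained sketch of the corrected ordering, assuming (as the cast in the question's WHERE clause suggests) that "TimeStamp" is stored as text. The CTE t and its three rows are invented for illustration. If the column is already of type timestamp, plain ORDER BY "TimeStamp" is enough.

```sql
-- Hypothetical sample rows, deliberately listed out of order, all on the same day.
WITH t ("TimeStamp") AS (
    VALUES ('2016-03-09 03:00:07'),
           ('2016-03-09 03:00:05'),
           ('2016-03-09 03:00:06')
)
SELECT "TimeStamp"
FROM t
WHERE "TimeStamp"::timestamp >= '2016-03-09 03:00:05'
ORDER BY "TimeStamp"::timestamp ASC  -- keeps the time part, so rows within one day sort correctly
LIMIT 15;
```

With ORDER BY "TimeStamp"::date, all three rows would compare as equal, and nothing would pin down their relative order.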