Teradata: BTEQ Import Invalid Date Issue

I am trying to port data from a flat file to TD via BTEQ.
The table definition is:
CREATE MULTISET TABLE _module_execution_log
(
system_id INTEGER,
process_id INTEGER,
module_id INTEGER,
julian_dt INTEGER,
referral_dt DATE FORMAT 'YYYY-MM-DD',
start_dt_tm TIMESTAMP(6),
end_dt_tm TIMESTAMP(6),
ref_s_cnt INTEGER,
ref_d_cnt INTEGER)
PRIMARY INDEX ( module_id );
Following are two sample records that I am trying to load into the table:
1|1|30|2007073|Mar 14 2007 12:00:00:000AM|Mar 15 2007 1:27:00:000PM|Mar 15 2007 1:41:08:686PM|0|0
1|1|26|2007073|Mar 14 2007 12:00:00:000AM|Mar 15 2007 1:27:00:000PM|Mar 15 2007 1:59:40:620PM|0|0
Snippet from my BTEQ script:
USING
( system_id INTEGER
,process_id INTEGER
,module_id INTEGER
,julian_dt INTEGER
,referral_dt DATE FORMAT 'YYYY-MM-DD'
,start_dt_tm TIMESTAMP
,end_dt_tm TIMESTAMP
,ref_s_cnt INTEGER
,ref_d_cnt INTEGER
)
INSERT INTO _module_execution_log
( system_id
,process_id
,module_id
,julian_dt
,referral_dt
,start_dt_tm
,end_dt_tm
,ref_s_cnt
,ref_d_cnt
)
VALUES (
:system_id
,:process_id
,:module_id
,:julian_dt
,:referral_dt
,:start_dt_tm
,:end_dt_tm
,:ref_s_cnt
,:ref_d_cnt);
I get the following error during import:
*** Failure 2665 Invalid date.
Statement# 1, Info =5
*** Failure 2665 Invalid date.
Statement# 1, Info =5
The issue is surely with the exported date in the 5th column. I cannot modify the export query. I tried the following in the BTEQ but it still failed:
cast(cast(substr(:referral_dt,1,11) as date format 'MMMBDDBYYYY') as date format 'YYYY-MM-DD')

Your data is pipe-delimited, variable-length character data, so the fields in your USING clause should match the input data, e.g.
system_id VARCHAR(11)
referral_dt VARCHAR(26)
The VARCHARs will be automatically cast to the target data types using a default format. For your timestamps you need to cast manually, adding a format:
referral_dt (TIMESTAMP(3),FORMAT 'mmmBddByyyyBhh:mi:ss.s(3)T')
But this will fail for a single-digit hour; Teradata always wants two digits.
If you're on TD14 you'd better use the Oracle-style TO_DATE/TO_TIMESTAMP UDFs, which allow single-digit hours:
TO_TIMESTAMP(referral_dt,'MON DD YYYY HH:MI:SS:FF3AM')
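Putting both suggestions together, here is a rough, untested sketch of the import. The .IMPORT line, the file name, and the VARCHAR lengths are assumptions; the TO_TIMESTAMP format is the one shown above:
-- sketch only: file name and VARCHAR lengths are assumptions;
-- a VARTEXT import requires all USING fields to be VARCHAR
.IMPORT VARTEXT '|' FILE = module_execution_log.txt;
.REPEAT *
USING
( system_id   VARCHAR(11)
 ,process_id  VARCHAR(11)
 ,module_id   VARCHAR(11)
 ,julian_dt   VARCHAR(11)
 ,referral_dt VARCHAR(26)
 ,start_dt_tm VARCHAR(26)
 ,end_dt_tm   VARCHAR(26)
 ,ref_s_cnt   VARCHAR(11)
 ,ref_d_cnt   VARCHAR(11)
)
INSERT INTO _module_execution_log
VALUES (
  :system_id
 ,:process_id
 ,:module_id
 ,:julian_dt
 ,CAST(TO_TIMESTAMP(:referral_dt,'MON DD YYYY HH:MI:SS:FF3AM') AS DATE)
 ,TO_TIMESTAMP(:start_dt_tm,'MON DD YYYY HH:MI:SS:FF3AM')
 ,TO_TIMESTAMP(:end_dt_tm,'MON DD YYYY HH:MI:SS:FF3AM')
 ,:ref_s_cnt
 ,:ref_d_cnt);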

Your data does not contain the date the table expects.
The first four expected values are integers, then a date, then a timestamp:
system_id INTEGER,
process_id INTEGER,
module_id INTEGER,
julian_dt INTEGER,
**referral_dt DATE FORMAT 'YYYY-MM-DD'**,
start_dt_tm TIMESTAMP(6), ...
Your data doesn't match:
1|1|30|2007073|Mar 14 2007 12:00:00:000AM|Mar 15 2007 1:27:00:000PM|Mar 15 2007 1:41:08:686PM|0|0
You are missing the date:
1|1|30|2007073|**????-??-??**|Mar 14 2007 12:00:00:000AM|...

How to return null if an error for try_parse

I am trying to simply return NULL if I get an error from TRY_PARSE. I am using TRY_PARSE to parse out a datetime from a string.
create table #example (
ID int identity(1, 1)
, extractedDateTime varchar(50)
)
insert into #example (extractedDateTime)
values ('7/19/21 11:15')
,('/30/21 1100')
,('05/15/2021 17:00')
,('05/03/2021 0930')
,('5/26/21 09:30')
,('05/26/2021 0930')
,('06/09/2021 12:00')
,('07/06/2021 13:00')
,('6/15/21 12:00')
,('07/09/2021 07:30')
,('07/14/2021 13:20')
,('/19/2021 10:30')
,('7/22/21 1030')
,('7/21/201')
,('06/21/21 11:00')
select exm.ID, exm.extractedDateTime, [TRY_PARSED] = TRY_PARSE(exm.extractedDateTime as datetime2 using 'en-US')
from #example as exm
drop table #example
In the above example there is ID 14: '7/21/201', which will be parsed as a date from the year 201 (presumably it was meant to be 21 or 2021). I have gotten this to parse as datetime2; originally I was using datetime. I am inclined to still use datetime, but what I would like is to return NULL for that particular row. Instead I get a lengthy error message about a SqlDateTime overflow, because of using datetime of course.
The reason I want to go back to using datetime is that this is an incorrect value, and using datetime might help filter out erroneous values like this. Also, I'd like the query to return NULL whenever this bit encounters an error, so that it doesn't stop the entire query (this is part of a much larger query).
How can I return NULL for this record? Any advice would be greatly appreciated!
UPDATE:
Here is the output I get from executing the SELECT statement above (screenshot omitted).
As others have noted, TRY_PARSE is functioning as expected and documented.
That doesn't help your case, though, so I'd suggest setting an arbitrary minimum date, so that any date prior to it is assigned NULL.
Sample code:
SELECT exm.ID,
exm.extractedDateTime,
[TRY_PARSED] = CASE
WHEN TRY_PARSE(exm.extractedDateTime AS DATETIME2 USING 'en-US') < '1970-01-01'
THEN NULL
ELSE TRY_PARSE(exm.extractedDateTime AS DATETIME2 USING 'en-US')
END
FROM #example AS exm;
Results:
ID  extractedDateTime  TRY_PARSED
--  -----------------  ---------------------------
1   7/19/21 11:15      2021-07-19 11:15:00.0000000
2   /30/21 1100        NULL
3   05/15/2021 17:00   2021-05-15 17:00:00.0000000
4   05/03/2021 0930    NULL
5   5/26/21 09:30      2021-05-26 09:30:00.0000000
6   05/26/2021 0930    NULL
7   06/09/2021 12:00   2021-06-09 12:00:00.0000000
8   07/06/2021 13:00   2021-07-06 13:00:00.0000000
9   6/15/21 12:00      2021-06-15 12:00:00.0000000
10  07/09/2021 07:30   2021-07-09 07:30:00.0000000
11  07/14/2021 13:20   2021-07-14 13:20:00.0000000
12  /19/2021 10:30     NULL
13  7/22/21 1030       NULL
14  7/21/201           NULL
15  06/21/21 11:00     2021-06-21 11:00:00.0000000
The advice is to not do anything, since what you ask for is the default behavior of TRY_PARSE. Check the documentation:
https://learn.microsoft.com/en-us/sql/t-sql/functions/try-parse-transact-sql?view=sql-server-ver15
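If you also want datetime's narrower range (years 1753-9999) to weed out values like the year 201, one possible approach (a sketch, assuming SQL Server 2012+ for TRY_CAST) is to let TRY_PARSE produce a datetime2 and then TRY_CAST that to datetime; the outer TRY_CAST should return NULL when the parsed value falls outside the datetime range:
SELECT exm.ID,
       exm.extractedDateTime,
       -- NULL when the string cannot be parsed at all, and also NULL when the
       -- parsed value is outside the datetime range (e.g. the year 201)
       [TRY_PARSED] = TRY_CAST(TRY_PARSE(exm.extractedDateTime AS DATETIME2 USING 'en-US') AS DATETIME)
FROM #example AS exm;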

Select if number range contains number in PostgreSQL

I need to be able to find a row based on a number range that is saved as a text field. For example, the field tuesday looks like 540-1020. I want to retrieve this row if I search for 900. So far I have:
SELECT string_to_array(tuesday, '-')
FROM coverage
WHERE 900 IN string_to_array(tuesday, '-')
Where string_to_array(tuesday, '-') prints out like {540,1020}. How can I convert it into a selectable integer range?
Use a range.
SELECT string_to_array(tuesday, '-')
FROM coverage
WHERE 900 <@ int4range(split_part(tuesday, '-', 1)::int4, split_part(tuesday, '-', 2)::int4, '[]');
That last parameter, '[]', signifies an inclusive range, where '100-900' would match a search for 900. You could also use an exclusive upper bound, '[)' (note the right paren), where '100-900' would not match because the upper number is excluded from the set of matching numbers.
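A quick standalone check of the difference (just a sketch you can run as-is):
SELECT 900 <@ int4range(100, 900, '[]') AS inclusive_upper,  -- true
       900 <@ int4range(100, 900, '[)') AS exclusive_upper;  -- false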
For better query speed as your table gets larger, you can add a GIST functional index.
CREATE INDEX tuesday_range_idx ON coverage
USING GIST (int4range(split_part(tuesday, '-', 1)::int4, split_part(tuesday, '-', 2)::int4, '[]'));
This is exposing some weaknesses in your data model. By having each day as a column, you'd have to create a separate functional index for each column. You're also having to parse text into an array every time you run this. Typically you'd want the data in the table to match how you access it, not its serialized form.
Instead of
CREATE TABLE coverage (
id serial PRIMARY KEY,
year smallint, -- tracking by week
week_num smallint, -- for example
sunday varchar,
monday varchar,
tuesday varchar,
wednesday varchar,
thursday varchar,
friday varchar,
saturday varchar
);
why not something like
CREATE TABLE coverage (
id serial PRIMARY KEY,
day date NOT NULL UNIQUE,
daily_data int4range NOT NULL
);
INSERT INTO coverage (day, daily_data)
VALUES ('2020-06-02', '[540,1020]');
Then your search looks like
SELECT daily_data
FROM coverage
WHERE extract(DOW FROM day) = 2 -- Tuesday (Sunday is 0, Saturday is 6)
AND 900 <@ daily_data;
You can make indexes for the daily data ranges, by date (already a unique index in my example), functional indexes for the day of the week, month, year, etc. Much more flexible.
Or if you absolutely want an array back from your SELECT
SELECT ARRAY[lower(daily_data), upper(daily_data)]
FROM coverage
WHERE extract(DOW FROM day) = 2 -- Tuesday (Sunday is 0, Saturday is 6)
AND 900 <@ daily_data;
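For completeness, a sketch of the indexes mentioned above on the revised table (the index names are made up; the expression index should be allowed because extract() on a plain date column is immutable):
CREATE INDEX coverage_daily_data_idx ON coverage USING GIST (daily_data);
-- expression index for day-of-week lookups; note the doubled parentheses
CREATE INDEX coverage_dow_idx ON coverage ((extract(DOW FROM day)));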

Converting Integer values to Date in Presto SQL

Below is a script I am trying to run in Presto: subtracting today's date from an integer field that I am attempting to convert to a date, to get the exact number of days between them. Unfortunately, it seems the highlighted block does not always convert the date correctly, and my final answer is not correct. Does anyone know another way around this, or a standard method in Presto for converting integer values to dates?
The integer values in the column are in the format '20191123' for year-month-day.
select ms, activ_dt, current_date, date_diff('day',act_dt,current_date) from
(
select ms,activ_dt, **CAST(parse_datetime(CAST(activ_dt AS varchar), 'YYYYMMDD') AS date) as act_dt**, nov19
from h.A_Subs_1 where msisdn_key=23480320012
) limit 19
You can convert a "date as a number" (e.g. 20180527 for May 27, 2018) using the following steps:
cast to varchar
parse_datetime with appropriate format
cast to date (since parse_datetime returns a timestamp)
Example:
presto> SELECT CAST(parse_datetime(CAST(20180527 AS varchar), 'yyyyMMdd') AS date);
_col0
------------
2018-05-27
You can use the sample query below for your requirement:
select date_diff('day', date_parse('20191209', '%Y%m%d'), current_timestamp);
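Applied to the query from the question, a sketch (the table and column names are taken from the question and assumed to exist as shown):
SELECT ms,
       activ_dt,
       current_date,
       -- activ_dt is the integer column in 'YYYYMMDD' form
       date_diff('day',
                 CAST(date_parse(CAST(activ_dt AS varchar), '%Y%m%d') AS date),
                 current_date) AS days_between
FROM h.A_Subs_1
WHERE msisdn_key = 23480320012
LIMIT 19;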

Why does DATE in DB2 database have a time component in it?

How can I make the column data type DATE, like YYYY-MM-DD?
When I create a table with the data type DATE, it becomes TIMESTAMP(0).
When I ALTER ... SET DATA TYPE DATE, it is still TIMESTAMP(0).
And SELECT CHAR(CURRENT DATE, ISO) FROM SYSIBM.SYSDUMMY1; fails with SQLCODE=-171. CURRENT DATE returns 2017-02-28 19:19:09.0, which is too long.
Database info: DB2 10.5, Linux x64.
CREATE TABLE "XCRSUSR"."TIMP_TASK_SERIAL" (
"SERIAL_NO" DECIMAL(16 , 0),
"TASK_NAME" VARCHAR(10),
"TASK_TYPE" DOUBLE,
"TASK_XML" CLOB(10) INLINE LENGTH 164,
"SEND_TIME" DATE,
"FINISH_TIME" DATE,
"TASK_STATUS" DOUBLE DEFAULT 0,
"RUN_TYPE" DOUBLE,
"FLAG" DOUBLE,
"TASK_ID" VARCHAR(10)
)
ORGANIZE BY ROW
DATA CAPTURE NONE
IN "CREDIT_U_16" INDEX IN "CREDIT_INDEX_16"
COMPRESS NO;
ALTER TABLE TIMP_TASK_SERIAL ALTER COLUMN SEND_TIME SET DATA TYPE DATE;
select CURRENT DATE from SYSIBM.SYSDUMMY1;
1
---------------------
2017-02-28 19:19:09.0
Check out the settings of the Oracle compatibility vector (the DB2_COMPATIBILITY_VECTOR registry variable) under
https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.apdv.porting.doc/doc/r0052867.html
Bit position 7 in Table 1 is what you are looking for.
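If that bit (DATE interpreted as TIMESTAMP(0)) is what is enabled on your instance, a rough sketch of clearing the vector is below. This is an assumption about your setup: it is an instance-wide change that requires a restart, it also disables any other Oracle-compatibility features the vector enabled, and it does not retroactively change columns that were already created as TIMESTAMP(0):
# run as the instance owner
db2set DB2_COMPATIBILITY_VECTOR=
db2stop
db2start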

TSQL update Datetime with Random Value between 2 Dates

What's the easiest way to update a table that contains a DATETIME column in T-SQL with a random value between two dates?
I see various posts related to this, but their "random" values turn out to be essentially sequential when you ORDER BY the date after the update.
Assumptions
First assume that you have a database containing a table with a start datetime column and an end datetime column, which together define a datetime range:
CREATE DATABASE StackOverflow11387226;
GO
USE StackOverflow11387226;
GO
CREATE TABLE DateTimeRanges (
StartDateTime DATETIME NOT NULL,
EndDateTime DATETIME NOT NULL
);
GO
ALTER TABLE DateTimeRanges
ADD CONSTRAINT CK_PositiveRange CHECK (EndDateTime > StartDateTime);
And assume that the table contains some data:
INSERT INTO DateTimeRanges (
StartDateTime,
EndDateTime
)
VALUES
('2012-07-09 00:30', '2012-07-09 01:30'),
('2012-01-01 00:00', '2013-01-01 00:00'),
('1988-07-25 22:30', '2012-07-09 00:30');
GO
Method
The following SELECT statement returns the start datetime, the end datetime, and a pseudorandom datetime with minute precision that is greater than or equal to the start datetime and less than the end datetime:
SELECT
StartDateTime,
EndDateTime,
DATEADD(
MINUTE,
ABS(CHECKSUM(NEWID())) % DATEDIFF(MINUTE, StartDateTime, EndDateTime) + DATEDIFF(MINUTE, 0, StartDateTime),
0
) AS RandomDateTime
FROM DateTimeRanges;
Result
Because the NEWID() function is nondeterministic, this will return a different result set for every execution. Here is the result set I generated just now:
StartDateTime EndDateTime RandomDateTime
----------------------- ----------------------- -----------------------
2012-07-09 00:30:00.000 2012-07-09 01:30:00.000 2012-07-09 00:44:00.000
2012-01-01 00:00:00.000 2013-01-01 00:00:00.000 2012-09-08 20:41:00.000
1988-07-25 22:30:00.000 2012-07-09 00:30:00.000 1996-01-05 23:48:00.000
All the values in the column RandomDateTime lie between the values in columns StartDateTime and EndDateTime.
Explanation
This technique for generating random values is due to Jeff Moden. He wrote a great article on SQL Server Central about data generation. Read it for a more thorough explanation. Registration is required, but it's well worth it.
The idea is to generate a random offset from the start datetime, and add the offset to the start datetime to get a new datetime in between the start datetime and the end datetime.
The expression DATEDIFF(MINUTE, StartDateTime, EndDateTime) represents the total number of minutes between the start datetime and the end datetime. The offset must be strictly less than this value.
The expression ABS(CHECKSUM(NEWID())) generates an independent random non-negative integer for every row; it can have any value from 0 to 2,147,483,647. This expression mod the first expression gives a valid offset in minutes.
The expression DATEDIFF(MINUTE, 0, StartDateTime) represents the total number of minutes between the start datetime and a reference datetime of 0, which is shorthand for '1900-01-01 00:00:00.000'. The value of the reference datetime does not matter, but it does matter that the same reference datetime is used in the whole expression. Add this to the offset to get the total number of minutes between the reference datetime and the random datetime.
The enclosing DATEADD function converts this into a datetime value by adding that number of minutes to the reference datetime.
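Since the question asks about an UPDATE rather than a SELECT, the same expression can be dropped into an UPDATE. Here is a sketch assuming a hypothetical table SomeTable with a DATETIME column SomeDate and a fixed target range held in two variables:
DECLARE @RangeStart DATETIME = '2012-01-01';
DECLARE @RangeEnd   DATETIME = '2013-01-01';

UPDATE SomeTable
SET SomeDate = DATEADD(
        MINUTE,
        -- NEWID() is evaluated per row, so every row gets its own offset
        ABS(CHECKSUM(NEWID())) % DATEDIFF(MINUTE, @RangeStart, @RangeEnd)
            + DATEDIFF(MINUTE, 0, @RangeStart),
        0);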
You can use RAND for this:
select cast(cast(RAND()*100000 as int) as datetime)
from here
Sql-Fiddle looks quite good: http://sqlfiddle.com/#!3/b9e44/2/0