I have the following 2 strings consisting of Arabic characters; such strings fail my MLoads on a daily basis:
شركة جيبكا السعودية
ح717695
Both columns into which these 2 strings are inserted are defined as Unicode, and for these strings TRANSLATE_CHK does not work either.
The following is the SELECT statement I am using:
select Translate_Chk ( 'شركة جيبكا السعودية' using LATIN_TO_UNICODE WITH ERROR)
Moreover, I wrote the following code for a Finnish client and it worked fine, but it does not work for Arabic characters.
SELECT (CASE WHEN Translate_Chk ( TRIM('شركة جيبكا السعودية') using LATIN_TO_UNICODE ) <> 0
THEN
CASE WHEN (regexp_Instr('شركة جيبكا السعودية',
'\x{00}|\x{01}|\x{02}|\x{03}|\x{04}|\x{05}|\x{06}|\x{07}|\x{08}|\x{09}|\x{0B}|\x{0C}|\x{0E}|\x{0F}|\x{10}|\x{11}|\x{12}|\x{13}|\x{14}|\x{15}|\x{16}
|\x{17}|\x{18}|\x{19}|\x{1A}|\x{1B}|\x{1C}|\x{1D}|\x{1E}|\x{1F}|\x{5E}|\x{60}|\x{7C}|\x{7E}|\x{7F}|\x{A0}|\x{A2}|\x{A6}|\x{A8}|\x{A9}|\x{AA}|\x{AB}|\x{AC}
|\x{AD}|\x{AE}|\x{AF}|\x{B0}|\x{B1}|\x{B2}|\x{B3}|\x{B4}|\x{B5}|\x{B6}|\x{B7}|\x{B8}|\x{B9}|\x{BA}|\x{BB}|\x{BC}|\x{BD}|\x{BE}|\x{F0}',1,1,0,
'c')) >0
THEN
''
when TD_SYSFNLIB.InSTR('شركة جيبكا السعودية', U&'\008A' UESCAPE '\')>0
THEN
''
WHEN TD_SYSFNLIB.InSTR('شركة جيبكا السعودية', U&'\009A' UESCAPE '\')>0
THEN
''
WHEN TD_SYSFNLIB.InSTR('شركة جيبكا السعودية', U&'\008C' UESCAPE '\') >0
THEN
''
WHEN TD_SYSFNLIB.InSTR('شركة جيبكا السعودية', U&'\009C' UESCAPE '\')>0
THEN
''
WHEN TD_SYSFNLIB.InSTR('شركة جيبكا السعودية', U&'\009F' UESCAPE '\')>0
THEN
''
END
ELSE
'String' END);
The Teradata version is 16.20.53.31.
How can we replace such strings with NULL? I do not care about data loss, as currently I delete such records from the file and resume my MLoad.
I have a CASE statement to which I am trying to add a <> 0 condition in the WHERE clause. However, I am not able to do so, because the error says the column name does not exist. This is the current code:
SELECT *,
CASE
when A.TestValue is null then
B.Collected
when B.Collected is null then
A.TestValue
END
as Actual_Value
FROM test_Table
where Actual_Value <> 0
A column alias defined in the SELECT list cannot be referenced in the WHERE clause of the same query level, so repeat the expression there instead:
CASE
when A.TestValue is null then B.Collected
when B.Collected is null then A.TestValue
END <> 0
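Alternatively, the alias can be computed in a derived table and filtered in the outer query. The sketch below assumes the join that produces the A and B aliases exists in the real (non-simplified) query; the second table and the join condition are hypothetical placeholders:

SELECT *
FROM (
    SELECT A.TestValue,
           B.Collected,
           CASE
               WHEN A.TestValue IS NULL THEN B.Collected
               WHEN B.Collected IS NULL THEN A.TestValue
           END AS Actual_Value
    FROM test_Table AS A
    LEFT JOIN collected_Table AS B   -- hypothetical second table
        ON B.TestId = A.TestId       -- hypothetical join condition
    ) AS t
WHERE t.Actual_Value <> 0;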
I have a column of type text that looks like an empty string, but its length is 1.
The following SQL
select
teilnetz_name,
length(teilnetz_name),
trim(teilnetz_name),
length(trim(teilnetz_name))
from test_table
results in
 teilnetz_name | length | btrim | length
---------------+--------+-------+--------
               |      1 |       |      1
and
select case when trim(teilnetz_name) is distinct from '' then true else false end
from test_table
-- returns true
select case when teilnetz_name is distinct from null then true else false end
from test_table
-- returns true
select case when teilnetz_name is distinct from '' then true else false end
from test_table
-- returns true
select case when teilnetz_name is distinct from ' ' then true else false end
from test_table
-- returns true
How can I explain this phenomenon?
I am on PostgreSQL version 12.3.
The column probably contains some other whitespace, e.g. a tab character. trim() will only remove real spaces.
Try
length(regexp_replace(teilnetz_name, '\s+', '', 'g'))
There may be a non-printable Unicode character in the field teilnetz_name.
Try
select encode(convert_to(teilnetz_name, 'UTF8'), 'hex') from test_table;
in order to see what teilnetz_name actually contains (encode() expects bytea, so the text value has to be converted first).
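For example, if the hex dump shows c2a0 (the UTF-8 encoding of the no-break space U+00A0), a possible cleanup, reusing the table and column names from the question, is to strip that character and turn the remaining empty strings into NULL. A sketch only, not tested against the real data:

UPDATE test_table
SET    teilnetz_name = NULLIF(replace(teilnetz_name, chr(160), ''), '')  -- chr(160) is U+00A0 in a UTF-8 database
WHERE  teilnetz_name LIKE '%' || chr(160) || '%';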
I found a neat way on Stack Overflow of handling a multi-variable CASE statement using ROW values. It's great how clean it looks ...
However, I get errors directly comparing a row value made of 2 table columns of type text with a row made of string literals.
I've used a workaround with a function spTup('Deposit', '' ) which works but may be slower. The other way that works is to explicitly cast the string literals to text, but that creates a lot of visual clutter.
Questions:
Why can't Postgres deduce that the string literals should be treated as text type?
Why can Postgres deduce the type of the string literal in single element row, but not the row with 2 elements?
I thought I was getting a handle on Postgres type handling, but I don't quite get this scenario, can anyone explain?
Is there any other approach which minimises visual clutter?
I'm using Postgres 10.1 on localhost and 9.6.6 on the test & production servers.
Test setup:
create table if not exists tblTest ( SeqID serial, EventType text, EventResult text, Amt decimal );
truncate table tblTest;
insert into tblTest( EventType, EventResult, Amt )
values ( 'Withdrawal', '', 1.11 ), ('Deposit', '', 2.22 ), ('Deposit', 'succeeded', 3.33 ), ('Deposit', 'failed', 4.44 );
create or replace function spTup( p_1 text, p_2 text )
returns record as $func$
select ( p_1, p_2 );
$func$ LANGUAGE sql IMMUTABLE;
-- Runs without error (using single element tuple)
select SeqID, EventType, case ( EventType ) when ( 'Deposit' ) then Amt else 9.999 end
from tblTest;
-- ERROR: cannot compare dissimilar column types text and unknown at record column 1
select SeqID, EventType, EventResult, case ( EventType, EventResult )
when ( 'Deposit', '' ) then Amt else 9.999 end
from tblTest;
-- Runs without error -- visually the cleanest apart from using spTup function
select SeqID, EventType, EventResult, case ( EventType, EventResult )::text
when ( 'Deposit', '' )::text then Amt else 9.999 end
from tblTest;
-- Runs without error
select SeqID, EventType, EventResult, case ( EventType, EventResult )
when ( 'Deposit'::text, ''::text ) then Amt else 9.999 end
from tblTest;
select SeqID, EventType, EventResult, case ( EventType, EventResult )
when spTup( 'Deposit', '' ) then Amt else 9.999 end
from tblTest;
-- ERROR: input of anonymous composite types is not implemented
select SeqID, EventType, EventResult, case ( EventType, EventResult )
when '( "Deposit", "" )' then Amt else 9.999 end
from tblTest;
-- Just out of interest
select ( 'Deposit', '' ), ( 'Deposit'::text, ''::text );
/**
row row
(Deposit,"") (Deposit,"")
**/
select SeqID, EventType, EventResult, ( EventType, EventResult )
from tblTest;
/**
seqid eventtype eventresult row
1 Withdrawal (Withdrawal,"")
2 Deposit (Deposit,"")
3 Deposit succeeded (Deposit,succeeded)
4 Deposit failed (Deposit,failed)
**/
This seems to be a restriction of the "simple" or "switched" CASE you are using.
The alternative syntax variant of CASE works without explicit cast:
select SeqID, EventType, EventResult
, CASE WHEN (EventType, EventResult) = ('Deposit', '') THEN amt ELSE 9.999 END
from tblTest;
As long as you have a single case to test, this variant even "minimises visual clutter". Two additional characters, but easier to read (IMHO). For multiple cases, the "switched" variant would seem preferable, though.
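For example, a multi-case version against the tblTest sample data might look like this (the second branch and its 2.5 multiplier are invented purely to show the shape; no casts are needed because each row constructor is compared with =):

select SeqID, EventType, EventResult
     , case
           when (EventType, EventResult) = ('Deposit', '')          then Amt
           when (EventType, EventResult) = ('Deposit', 'succeeded') then Amt * 2.5
           else 9.999
       end as EffectiveAmt
from tblTest;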
The different behavior obviously results from a different workflow in the "simple" CASE. The manual:
The first expression is computed, then compared to each of the value
expressions in the WHEN clauses until one is found that is equal to it.
The code path for the simple expression - value comparison tries less hard to resolve data types - and fails for anonymous row values. Feels like a shortcoming in the implementation. One might expect the same behavior for both variants - and file a bug report.
But the behavior has been like that since at least Postgres 8.4 (and is the same in pg 11):
db<>fiddle here
Probably very few people had similar ideas with untyped row values in a switched CASE so far.
Leaves your question:
Why can Postgres deduce the type of the string literal in single element row, but not the row with 2 elements?
Answer: because row values with a single element ((foo)) are simplified to their single element (foo) when evaluating expressions almost everywhere in Postgres. So this:
CASE (eventtype) WHEN ('Deposit') THEN ...
is effectively simplified to:
CASE eventtype WHEN 'Deposit' THEN ...
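A small illustration against the tblTest table above: a parenthesized single column keeps its element type, while a two-element row constructor yields an anonymous record:

select pg_typeof((EventType))              as single_element  -- text
     , pg_typeof((EventType, EventResult)) as two_elements    -- record
from tblTest
limit 1;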
I'm trying to use tSQLt's AssertResultSetsHaveSameMetaData to check the metadata from a query that returns a large number of columns.
When it fails, the message that details the 'expected/but was' information is truncated, so I can't compare the two pieces of information to see what is wrong.
Is there a way to output the message so that it doesn't truncate (for example, to a file)?
It depends on what you are actually testing. I agree that on wide result sets the output from AssertResultSetsHaveSameMetaData can be a little unwieldy. For stored procedures you would write your tests like this:
create procedure [ProcedureTests].[test SelectProcedure result set contract]
as
begin
create table #expected
(
name varchar(500) not null
, column_ordinal int not null identity(1,1)
, system_type_name varchar(500) not null
, Nullability varchar(16) not null
)
; with expectedCte (name, system_type_name, Nullability)
as
(
select 'ItemId' , 'int' , 'not null'
union all select 'ActorId' , 'int' , 'not null'
union all select 'LanId' , 'nvarchar(200)' , 'not null'
union all select 'ConsumerId' , 'int' , 'not null'
union all select 'ConsumerMoniker' , 'nvarchar(200)' , 'not null'
union all select 'ProfileTypeId' , 'int' , 'not null'
union all select 'ProfileTypeName' , 'varchar(50)' , 'not null'
union all select 'ProfileId' , 'int' , 'not null'
)
insert #expected
(
name
, system_type_name
, Nullability
)
select name, system_type_name, Nullability from expectedCte;
--! Act
select
name
, column_ordinal
, system_type_name
, case is_nullable when 1 then 'null' else 'not null' end as [Nullability]
into
#actual
from
sys.dm_exec_describe_first_result_set_for_object(object_id('mySchema.SelectProcedure'), 0);
--! Assert
exec tSQLt.AssertEqualsTable #expected, #actual;
end;
go
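Once created, the test runs like any other tSQLt test:

exec tSQLt.Run 'ProcedureTests.[test SelectProcedure result set contract]';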
Whilst for views you could use this (slightly different) approach:
alter procedure [ViewTests].[test ViewName resultset contract]
as
begin
create table [ViewTests].[expected]
(
TransactionId int not null
, SourceId int not null
, SourceKey nvarchar(50) not null
, TransactionTypeId int not null
, TransactionStatusId int not null
, LastModified datetime not null
);
--! Your comparison may be as simple as this (but see the alternative approach below)
exec tSQLt.AssertEqualsTableSchema '[ViewTests].[expected]', 'mySchema.ViewName';
--!
--! Seems that is_nullable column on dm_exec_describe_first_result_set (used by tSQLt.AssertEqualsTableSchema)
--! can be a bit flakey where views are concerned so you may need to ignore nullability when testing
--! this view (so comment out that column in both SELECTs)
--!
select
c.name as [ColumnName]
, c.column_id as [ColumnPosition]
, case
when st.name in ('char', 'varchar', 'varbinary')
then st.name + '(' + case when c.max_length = -1 then 'max' else coalesce(cast(c.max_length as varchar(8)), '???') end + ')'
when st.name in ('nchar', 'nvarchar')
then st.name + '(' + case when c.max_length = -1 then 'max' else coalesce(cast(c.max_length / 2 as varchar(8)), '???') end + ')'
when st.name in ('decimal', 'numeric')
then st.name + '(' + coalesce(cast(c.precision as varchar(8)), '???') + ',' + coalesce(cast(c.scale as varchar(8)), '???') + ')'
when st.name in ('time', 'datetime2', 'datetimeoffset')
then st.name + '(' + coalesce(cast(c.precision as varchar(8)), '???') + ')'
else st.name
end as [DataType]
, c.[precision] as [NumericPrecision]
, c.scale as [NumericScale]
, c.collation_name as [CollationName]
, cast(case c.is_nullable when 1 then 'null' else 'not null' end as varchar(16)) as [Nullability]
into
#expected
from
sys.columns as c
inner join sys.types as st
on st.system_type_id = c.system_type_id
and st.user_type_id = c.user_type_id
where
c.[object_id] = object_id('[ViewTests].[expected]')
select
name as [ColumnName]
, column_ordinal as [ColumnPosition]
, system_type_name as [DataType]
, [precision] as [NumericPrecision]
, scale as [NumericScale]
, collation_name as [CollationName]
, cast(case is_nullable when 1 then 'null' else 'not null' end as varchar(16)) as [Nullability]
into
#actual
from
sys.dm_exec_describe_first_result_set('select * from mySchema.ViewName', null, null);
exec tSQLt.AssertEqualsTable '#expected', '#actual' ;
end
go
In the event of any failures, the reason will be much clearer, as the output is formatted more like that of AssertEqualsTable.
You could try EXEC [tSQLt].[XmlResultFormatter]; before running your test. This is intended for use in "build server" scenarios, but could probably be pressed into service to show you more of the output in SSMS.
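If simply executing the formatter does not change the output, tSQLt can (as far as I recall, so please check the documentation for your version) also have it registered as the active result formatter before running the tests, along these lines:

exec tSQLt.SetTestResultFormatter 'tSQLt.XmlResultFormatter';
exec tSQLt.RunAll;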
array_to_string() returns text (916-555-1212); however, PostgreSQL treats it as a set operation even with an explicit ::text cast.
select nullif(
array_to_string(
regexp_matches('9165551212', '(\d{3})?(\d{3})(\d{4})')::text[]
,'-')::text
, '');
ERROR: NULLIF does not support set arguments
Yet I can use char_length(), which expects text, and it works:
select char_length(
array_to_string(
regexp_matches('9165551212', '(\d{3})?(\d{3})(\d{4})')::text[]
,'-')::text
)
char_length
-------------
12
Yet wrap even that in a NULLIF and I get the same error:
select nullif(
char_length(
array_to_string(
regexp_matches('9165551212', '(\d{3})?(\d{3})(\d{4})')::text[]
,'-')::text
)
,12)
ERROR: NULLIF does not support set arguments
I had the same problem and it seems to be related to the regexp_matches function:
select nullif( (regexp_matches( '123', '(2)' ) )[1] , '' )
; -- ERROR: NULLIF does not support set arguments
select nullif( (regexp_matches( '123', '(2)' )::text[])[1]::text, '' )
; -- ERROR: NULLIF does not support set arguments
select nullif( ( select (regexp_matches( '123', '(2)' ) )[1] ), '' )
; -- ok: gives "2"
so just wrapping the result in a sub-select seems to solve it here as well :-/
Weird... It works if you do
select
nullif(
(
select array_to_string(
regexp_matches(
'9165551212',
'(\d{3})?(\d{3})(\d{4})'
)::text[]
, '-' )
)
, '')
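The sub-select works because it forces the set-returning regexp_matches() to be evaluated as a scalar subquery, so NULLIF only ever sees a single value. On PostgreSQL 10 and later there is also regexp_match() (no trailing s), which returns a single text[] rather than a set of rows and can therefore be nested inside NULLIF directly, without the sub-select workaround:

select nullif(
         array_to_string(
           regexp_match('9165551212', '(\d{3})?(\d{3})(\d{4})')
         , '-')
       , '');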