DB2 Case Sensitivity

DB2 Case Sensitivity - db2

I'm having great difficultly making my DB2 (AS/400) queries case insensitive.
For example:
SELECT *
FROM NameTable
WHERE LastName = 'smith'
Will return no results, but the following returns 1000's of results:
SELECT *
FROM NameTable
WHERE LastName = 'Smith'
I've read of putting SortSequence/SortType into your connection string but have had no luck... anyone have exepierence with this?
Edit:
Here's the stored procedure:
BEGIN
DECLARE CR CURSOR FOR
SELECT T . ID ,
T . LASTNAME ,
T . FIRSTNAME ,
T . MIDDLENAME ,
T . STREETNAME || ' ' || T . ADDRESS2 || ' ' || T . CITY || ' ' || T . STATE || ' ' || T . ZIPCODE AS ADDRESS ,
T . GENDER ,
T . DOB ,
T . SSN ,
T . OTHERINFO ,
T . APPLICATION
FROM
( SELECT R . * , ROW_NUMBER ( ) OVER ( ) AS ROW_NUM
FROM CPSAB32.VW_MYVIEW
WHERE R . LASTNAME = IFNULL ( #LASTNAME , LASTNAME )
AND R . FIRSTNAME = IFNULL ( #FIRSTNAME , FIRSTNAME )
AND R . MIDDLENAME = IFNULL ( #MIDDLENAME , MIDDLENAME )
AND R . DOB = IFNULL ( #DOB , DOB )
AND R . STREETNAME = IFNULL ( #STREETNAME , STREETNAME )
AND R . CITY = IFNULL ( #CITY , CITY )
AND R . STATE = IFNULL ( #STATE , STATE )
AND R . ZIPCODE = IFNULL ( #ZIPCODE , ZIPCODE )
AND R . SSN = IFNULL ( #SSN , SSN )
FETCH FIRST 500 ROWS ONLY )
AS T
WHERE ROW_NUM <= #MAXRECORDS
OPTIMIZE FOR 500 ROW ;
OPEN CR ;
RETURN ;

Why not do this:
WHERE lower(LastName) = 'smith'
If you're worried about performance (i.e. the query not using an index), keep in mind that DB2 has function indexes, which you can read about here. So essentially, you can create an index on upper(LastName).
EDIT
To do the debugging technique I discussed in the comments, you could do something like this:
create table log (msg varchar(100, dt date);
Then in your SP, you can insert messages to this table for debugging purposes:
insert into log (msg, dt) select 'inside the SP', current_date from sysibm.sysdummy1;
Then after the SP runs, you can select from this log table to see what happened.

If you want case-insensitive in your procedure, try using this option in it:
SET OPTION SRTSEQ = *LANGIDSHR ;
You should also create an index to support it for performance. Create the index when you have *LANGIDSHR as a connection attribute, and the shared-weight index should then be available to later jobs. (There are various ways to get the appropriate setting into effect.)
*LANGIDSHR relates to the language-ID for your jobs. Characters in the character set that might be considered as "equals", such as 'A' and 'a' or 'ü' and 'u', should be given equal weights (shared) and so select together.

I did something similar when I wanted a case insensitive search. I used UPPER(mtfield) = 'SEARCHSTRING'. I know this works.

See: https://stackoverflow.com/a/47181640/5507619
Database setting
There is a database config setting you can set at database creation. It's based on unicode, though.
CREATE DATABASE yourDB USING COLLATE UCA500R1_S1
The default Unicode Collation Algorithm is implemented by the UCA500R1 keyword without any attributes. Since the default UCA cannot simultaneously encompass the collating sequence of every language supported by Unicode, optional attributes can be specified to customize the UCA ordering. The attributes are separated by the underscore (_) character. The UCA500R1 keyword and any attributes form a UCA collation name.
The Strength attribute determines whether accent or case is taken into account when collating or comparing text strings. In writing systems without case or accent, the Strength attribute controls similarly important features.
The possible values are: primary (1), secondary (2), tertiary (3), quaternary (4), and identity (I). To ignore:
accent and case, use the primary strength level
case only, use the secondary strength level
neither accent nor case, use the tertiary strength level
Almost all characters can be distinguished by the first three strength levels, therefore in most locales the default Strength attribute is set at the tertiary level. However if the Alternate attribute (described below) is set to shifted, then the quaternary strength level can be used to break ties among white space characters, punctuation marks, and symbols that would otherwise be ignored. The identity strength level is used to distinguish among similar characters, such as the MATHEMATICAL BOLD SMALL A character (U+1D41A) and the MATHEMATICAL ITALIC SMALL A character (U+1D44E).
Setting the Strength attribute to higher level will slow down text string comparisons and increase the length of the sort keys.
Examples:
UCA500R1_S1 will collate "role" = "Role" = "rôle"
UCA500R1_S2 will collate "role" = "Role" < "rôle"
UCA500R1_S3 will collate "role" < "Role" < "rôle"
This worked for me. As you can see, ..._S2 ignores case, too.
Using a newer standard version, it should look like this:
CREATE DATABASE yourDB USING COLLATE CLDR181_S1
Collation keywords:
UCA400R1 = Unicode Standard 4.0 = CLDR version 1.2
UCA500R1 = Unicode Standard 5.0 = CLDR version 1.5.1
CLDR181 = Unicode Standard 5.2 = CLDR version 1.8.1
If your database is already created, there is supposed to be a way to change the setting.
CALL SYSPROC.ADMIN_CMD( 'UPDATE DB CFG USING DB_COLLNAME UCA500R1_S1 ' );
I do have problems executing this, but for all I know it is supposed to work.
Generated table row
Other options are e.g. generating a upper case row:
CREATE TABLE t (
id INTEGER NOT NULL PRIMARY KEY,
str VARCHAR(500),
ucase_str VARCHAR(500) GENERATED ALWAYS AS ( UPPER(str) )
)#
INSERT INTO t(id, str)
VALUES ( 1, 'Some String' )#
SELECT * FROM t#
ID STR UCASE_STR
----------- ------------------------------------ ------------------------------------
1 Some String SOME STRING
1 record(s) selected.

Related

TSQL: Special (National) char detection (SQL Server2016)

I found a lot of techniques to detect aka special chars($%##) avail on English keyboard, however I see that some national char like á works differently, though my LIKE condition should get it, as it should select anything but a-z1-9, what is the trick here: In sample below I'm missing my special á. I'm on TSQL 2016 with default settings in US.
;WITH cte AS (SELECT 'Euro a€' St UNION SELECT 'adgkjb$' St UNION SELECT 'Bravo Endá' St)
SELECT * FROM cte WHERE St LIKE '%[^a-zA-Z0-9 ]%'
St
adgkjb$
Euro a€
SELECT CAST(N'€' AS VARBINARY(8)) --0xAC20
SELECT CAST(N'á' AS VARBINARY(8)) --0xE100

SQL Server appears to be helping with ranges of characters due to the default collation. If you explicitly list all of the valid characters it will work as desired. Alternatively, you can force a collation on the pattern match that won't interpret the pattern a containing non-ASCII characters.
-- Explicit pattern for "bad" characters.
declare #InvalidCharactersPattern as VarChar(100) = '%[^abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ]%';
-- Query some sample data with the explicit pattern and an explicitly specified collation.
select Sample,
case when Sample like #InvalidCharactersPattern then 'Bad Character' else 'Okay' end as ExplicitStatus,
case when Sample like '%[^a-zA-Z0-9 ]%' collate Latin1_General_100_BIN
then 'Bad Character' else 'Okay' end as CollationStatus
from ( values ( 'a' ), ( 'A' ), ( 'á' ), ( 'Foo' ), ( 'F&o' ), ( '$%^&' ) ) as Samples( Sample );
-- Server collation.
select ServerProperty( 'collation' ) as ServerCollation;
-- Available collations.
select name, description
from sys.fn_helpcollations()
order by name;

How to import data into teradata tables from delimited file using BTEQ import?

I am trying to execute following bteq command on linux environment but couldn't load data properly into Teradata DB server. Can someone please advise me to resolve the below issue that I am facing while loading.
BTEQ Command used :
.SET width 64000;
.SET session transaction btet;
.logmech ldap
.logon XXXXXXX/XXXXXXXX,********;
DATABASE corecm;
.PACK 1000
.IMPORT VARTEXT '~' FILE=/v/global/user/application_event_bus_evt
.REPEAT *
USING(APPLICATION_EVENT_ID CHAR(24),BUS_EVT_ID CHAR(24),BUS_EVT_VID BIGINT,BUS_EVT_RESTATE_IN SMALLINT)
insert into corecm.application_event_bus_evt (APPLICATION_EVENT_ID
, BUS_EVT_ID
, BUS_EVT_VID
, BUS_EVT_RESTATE_IN
)
values
( COALESCE(:APPLICATION_EVENT_ID,1)
, COALESCE(:BUS_EVT_ID,1)
, COALESCE(:BUS_EVT_VID,1)
, COALESCE(:BUS_EVT_RESTATE_IN,1)
) ;
.LOGOFF;
.EXIT;
SAMPLE INPUT FILE DELIMITTER "~" [ /v/global/user/application_event_bus_evt ] :
Ckn3gMxLEeOgIQBQVgErYA==~g+GDDtlaY3n7BdUrYshDFA==~1~1
CL1kEcxLEeOgIQBQVgErYA==~qoKoiuGDbClpcGt/z6RKGw==~1~1
oYIVcMxKEeOgIQBQVgErYA==~mfmQiwl7yAteevzJfilMvA==~1~1
5N7ME5bM4xGhM7exj3ykUw==~yFM2FZbM4xGhM7exj3ykUw==~1~0
JLBH4JfM4xGDH9s5+Ds/8w==~doZ/7pfM4xGDH9s5+Ds/8w==~1~0
fGvpoMxKEeOgIQBQVgErYA==~mQUQIK2mY6WIPcszfp5BTQ==~1~1
Table Definition :
CREATE MULTISET TABLE CORECM.APPLICATION_EVENT_BUS_EVT ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
APPLICATION_EVENT_ID CHAR(26) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
BUS_EVT_ID CHAR(26) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
BUS_EVT_VID BIGINT NOT NULL,
BUS_EVT_RESTATE_IN SMALLINT)
UNIQUE PRIMARY INDEX ( APPLICATION_EVENT_ID ,BUS_EVT_ID ,BUS_EVT_VID )
INDEX APPLICATION_EVENT_BUS_EVT_IDX1 ( APPLICATION_EVENT_ID )
INDEX APPLICATION_EVENT_BUS_EVT_IDX2 ( BUS_EVT_ID ,BUS_EVT_VID );
Results set in DB server as,
APPLICATION_EVENT_ID BUS_EVT_ID BUS_EVT_VID BUS_EVT_RESTATE_IN
1 Ckn3gMxLEeOgIQBQVgErYA == g+GDDtlaY3n7BdUrYshD 85,849,873,219,141,958 12,544
2 CL1kEcxLEeOgIQBQVgErYA == qoKoiuGDbClpcGt/z6RK 85,849,873,219,155,783 12,544
3 oYIVcMxKEeOgIQBQVgErYA == mfmQiwl7yAteevzJfilM 85,849,873,219,142,006 12,544
4 5N7ME5bM4xGhM7exj3ykUw == JAf0GpbM4xGhM7exj3yk 85,849,873,219,155,797 12,288
5 JLBH4JfM4xGDH9s5+Ds/8w == Du6T7pfM4xGDH9s5+Ds/ 85,849,873,219,155,768 12,288
6 fGvpoMxKEeOgIQBQVgErYA == mQUQIK2mY6WIPcszfp5B 85,849,873,219,146,068 12,544
If we look at the Data, we can see two issues as,
First two column data length is 24 CHARACTERS ( as per input file ), but the issue is that it been shifted two characters in next column.
Column BUS_EVT_VID and BUS_EVT_RESTATE_IN has wrong data 85,849,873,219,141,958 and 12,544 instead of 1 and 1 respectively (this may be because first two column data got shifted)
I tried following options to resolve the above issue but couldn't resolve the issue,
Modified the Table Definition, i.e. changed datatype to
CHAR(28),CHAR(24),CHAR(26)
Modified the Table Definition column
datatypes to VARCHAR(24), VARCHAR(26)
Modified BTEQ command, i.e. altered datatype in below line,
USING(APPLICATION_EVENT_ID CHAR(24),BUS_EVT_ID CHAR(24),BUS_EVT_VID BIGINT,BUS_EVT_RESTATE_IN SMALLINT)
Thanks in advance.

When you define VARTEXT all input columns must be defined as VARCHAR, but you used CHAR and INT.
This should work, VARCHAR length based on the definition of your target table:
USING(
APPLICATION_EVENT_ID VARCHAR(26),
BUS_EVT_ID VARCHAR(26),
BUS_EVT_VID VARCHAR(19),
BUS_EVT_RESTATE_IN VARCHAR(6)
)

Inserting values into multiple columns by splitting a string in PostgreSQL

I have the following heap of text:
"BundleSize,155648,DynamicSize,204800,Identifier,com.URLConnectionSample,Name,
URLConnectionSample,ShortVersion,1.0,Version,1.0,BundleSize,155648,DynamicSize,
16384,Identifier,com.IdentifierForVendor3,Name,IdentifierForVendor3,ShortVersion,
1.0,Version,1.0,".
What I'd like to do is extract data from this in the following manner:
BundleSize:155648
DynamicSize:204800
Identifier:com.URLConnectionSample
Name:URLConnectionSample
ShortVersion:1.0
Version:1.0
BundleSize:155648
DynamicSize:16384
Identifier:com.IdentifierForVendor3
Name:IdentifierForVendor3
ShortVersion:1.0
Version:1.0
All tips and suggestions are welcome.

It isn't quite clear what do you need to do with this data. If you really need to process it entirely in the database (looks like the task for your favorite scripting language instead), one option is to use hstore.
Converting records one by one is easy:
Assuming
%s =
BundleSize,155648,DynamicSize,204800,Identifier,com.URLConnectionSample,Name,URLConnectionSample,ShortVersion,1.0,Version,1.0
SELECT * FROM each(hstore(string_to_array(%s, ',')));
Output:
key | value
--------------+-------------------------
Name | URLConnectionSample
Version | 1.0
BundleSize | 155648
Identifier | com.URLConnectionSample
DynamicSize | 204800
ShortVersion | 1.0
If you have table with columns exactly matching field names (note the quotes, populate_record is case-sensitive to key names):
CREATE TABLE data (
"BundleSize" integer, "DynamicSize" integer, "Identifier" text,
"Name" text, "ShortVersion" text, "Version" text);
You can insert hstore records into it like this:
INSERT INTO data SELECT * FROM
populate_record(NULL::data, hstore(string_to_array(%s, ',')));
Things get more complicated if you have comma-separated values for more than one record.
%s = BundleSize,155648,DynamicSize,204800,Identifier,com.URLConnectionSample,Name,URLConnectionSample,ShortVersion,1.0,Version,1.0,BundleSize,155648,DynamicSize,16384,Identifier,com.IdentifierForVendor3,Name,IdentifierForVendor3,ShortVersion,1.0,Version,1.0,
You need to break up an array into chunks of number_of_fields * 2 = 12 elements first.
SELECT hstore(row) FROM (
SELECT array_agg(str) AS row FROM (
SELECT str, row_number() OVER () AS i FROM
unnest(string_to_array(%s, ',')) AS str
) AS str_sub
GROUP BY (i - 1) / 12) AS row_sub
WHERE array_length(row, 1) = 12;
Output:
"Name"=>"URLConnectionSample", "Version"=>"1.0", "BundleSize"=>"155648", "Identifier"=>"com.URLConnectionSample", "DynamicSize"=>"204800", "ShortVersion"=>"1.0"
"Name"=>"IdentifierForVendor3", "Version"=>"1.0", "BundleSize"=>"155648", "Identifier"=>"com.IdentifierForVendor3", "DynamicSize"=>"16384", "ShortVersion"=>"1.0"
And inserting this into the aforementioned table:
INSERT INTO data SELECT (populate_record(NULL::data, hstore(row))).* FROM ...
the rest of the query is the same.

splitting a string and rebuilding from related data

I've been given a string that can contain multiple offer codes separated by a tilde 123~125~126
I also have a table that may or may not define a parent/child relationship
tbl_allowed
============
offercode varchar(15)
parent_oc varchar(15)
ex:
offercode = 124
parent_oc = 126
I already have a function that will take the delimited string and split it, but I want to take the string, compare the contents to what's in tbl_allowed and regenerate it by replacing any values that exist in tbl_allowed in the parent_oc column with the value in the offercode column. If no defined relationship exists, then just use itself.
this is pretty simple with a single offercode:
set #newOfferCode = (select top 1 coalesce(cac.offercode, #lOfferCode)
from tbl_allowed cac
where OfferCode = #lOfferCode or parent_oc = #lOfferCode)
select coalesce(nullif(#newOfferCode,''), #lOfferCode)
but I'm having difficulty when I have a tilde delimited string. Any ideas?

Here's what I ended up doing. Seems to work.
CREATE TABLE #tempoffers(OfferCode varchar(15), NewOfferCode varchar(15))
INSERT INTO #tempoffers
SELECT OutParam, null
FROM dbo.SplitString(#lOfferCode, '~')
update #tempoffers set NewOfferCode = coalesce(cac.offercode, mb.offerCode)
from #tempoffers mb
left outer join tbl_Allowed cac on mb.OfferCode = cac.parent_oc
-- building the new string
declare #newOfferCode varchar(5000)
SELECT #newOfferCode = COALESCE(#newOfferCode + '~', '') + NewOfferCode FROM #tempoffers
drop table #tempoffers
SELECT #newOfferCode as OfferCode

SQL SErver Trigger not evaluating as Insert or Update properly

I want to have one trigger to handle updates and inserts. Most of the sql actions in the trigger are for both. The only exception is the fields I'm using to record date and username for an insert and an update. This is what I have, but the updates of the fields used to track update and insert are not firing right. If I insert a new record, I get CreatedBy, CreatedOn, LastEditedBy, LastEditedOn populated, with LastEditedOn as 1 second after CreatedOn (which I dont want to happen). When I update the record, only the LastEditedBy & LastEditedOn changes (which is correct). I'm including my full trigger for reference:
SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO
-- =================================================================================
-- Author: Paul J. Scipione
-- Create date: 2/15/2012
-- Update date: 6/5/2012
-- Description: To concatenate several fields into a set formatted UnitDescription,
-- to total Span & Loop footages, to set appropriate AcctCode, & track
-- user inserts
-- =================================================================================
IF OBJECT_ID('ProcessCable', 'TR') IS NOT NULL
DROP TRIGGER ProcessCable
GO
CREATE TRIGGER ProcessCable
ON Cable
AFTER INSERT, UPDATE
AS
BEGIN
SET NOCOUNT ON;
-- IF TRIGGER_NESTLEVEL() > 1 RETURN
IF ((SELECT TRIGGER_NESTLEVEL()) > 1 )
RETURN
ELSE
BEGIN
-- record user and date of insert or update
IF EXISTS (SELECT * FROM DELETED)
UPDATE Cable SET LastEditedOn = getdate(), LastEditedBy = REPLACE(user_name(), 'GRTINET\', '')
ELSE IF NOT EXISTS (SELECT * FROM DELETED)
UPDATE Cable SET CreatedOn = getdate(), CreatedBy = REPLACE(user_name(), 'GRTINET\', '')
-- reset Suffix if applicable
UPDATE Cable SET Suffix = NULL WHERE Suffix = 'n/a'
-- create UnitDescription value
UPDATE Cable SET UnitDescription =
isnull (Type, '') +
isnull (CONVERT (NVARCHAR (10), Size), '') +
'-' +
isnull (CONVERT (NVARCHAR (10), Gauge), '') +
CASE
WHEN ExtraTrench IS NOT NULL AND ExtraTrench > 0 THEN
CASE
WHEN Suffix IS NULL THEN 'TE' + '(' + CONVERT (NVARCHAR (10), ExtraTrench) + ')'
ELSE 'TE' + '(' + CONVERT (NVARCHAR (10), ExtraTrench) + ')' + Suffix
END
ELSE isnull (Suffix, '')
END
-- convert any accidental negative numbers entered
UPDATE Cable SET Length = ABS(Length)
-- sum Length with LoopFootage into TotalFootage
UPDATE Cable SET TotalFootage = isnull(Length, 0) + isnull(LoopFootage, 0)
-- set proper AcctCode based on Type
UPDATE Cable SET AcctCode =
CASE
WHEN Type IN ('SEA', 'CW', 'CJ') THEN '32.2421.2'
WHEN Type IN ('BFC', 'BJ', 'SEB') THEN '32.2423.2'
WHEN Type IN ('TIP','UF') THEN '32.2422.2'
WHEN Type = 'unknown' OR Type IS NULL THEN 'unknown'
END
WHERE AcctCode IS NULL OR AcctCode = ' '
END
END
GO

A few things jump out at me when I look at your trigger:
You are doing several additional updates rather than a single update (performance-wise, a single update would be better).
Your update statements are unconstrained (there is no JOIN to the inserted/deleted tables to limit the number of records that you perform these additional updates on).
Most of this logic feels like it should be in the application layer rather than in the database; Or, perhaps in some cases implemented differently.
Some quick examples:
Suffix of "n/a" should be removed before inserted.
Cable length absolute value should be done before inserted (with a CHECK CONSTRAINT to verify that bad data cannot be inserted).
TotalFootage should be a computed column so it is always correct.
The Type/AcctCode relationship seems like it should be a column value in a foreign key reference.
But ultimately, I think the reason you are seeing the unexpected dates is because of the unconstrained updates. Without addressing any of the other concerns I brought up above, the statement that sets the audit fields should be more like this:
UPDATE Cable SET LastEditedOn = getdate(), LastEditedBy = REPLACE(user_name(), 'GRTINET\', '')
FROM Cable
JOIN deleted on Cable.PrimaryKeyColumn = deleted.PrimaryKeyColumn
UPDATE Cable SET CreatedOn = getdate(), CreatedBy = REPLACE(user_name(), 'GRTINET\', '')
FROM Cable
JOIN inserted on Cable.PrimaryKeyColumn = inserted.PrimaryKeyColumn
LEFT JOIN deleted on Cable.PrimaryKeyColumn = deleted.PrimaryKeyColumn
WHERE deleted.PrimaryKeyColumn IS NULL

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

DB2 Case Sensitivity - db2

I did something similar when I wanted a case insensitive search. I used UPPER(mtfield) = 'SEARCHSTRING'. I know this works.

Related

TSQL: Special (National) char detection (SQL Server2016)

How to import data into teradata tables from delimited file using BTEQ import?

Inserting values into multiple columns by splitting a string in PostgreSQL

splitting a string and rebuilding from related data

SQL SErver Trigger not evaluating as Insert or Update properly

Categories

Resources