Does Db2 support “accent insensitive” collations? - db2

In Microsoft SQL Server, it's possible to specify an "accent insensitive" collation (for a database, table or column). Is this possible in Db2?

See the "Unicode Collation Algorithm based collations" article in the Db2 documentation.
The collating sequence is specified when the database is created and can't be changed afterwards.
See the 'COLLATE USING locale-sensitive-collation' clause of the CREATE DATABASE command.
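As a rough sketch of that clause, the command below creates a Unicode database whose default collation is the CLDR181_EO_S1 collation used in the query in this answer (strength 1 compares base letters only, so it ignores accents and case). The database name MYDB and the territory US are placeholders, not values from the question, and the command is meant for the Db2 command line processor rather than for an SQL client.

```sql
-- Db2 CLP command (not a regular SQL statement).
-- MYDB and TERRITORY US are placeholder values chosen for illustration.
CREATE DATABASE MYDB USING CODESET UTF-8 TERRITORY US COLLATE USING CLDR181_EO_S1
```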
There is no way to specify collation sequence at the table or column level, but you can use the COLLATION_KEY_BIT function to compare string expressions.
select
case when c1=c2 then 1 else 0 end r1
, case when COLLATION_KEY_BIT(c1, 'CLDR181_EO_S1')=COLLATION_KEY_BIT(c2, 'CLDR181_EO_S1') then 1 else 0 end r2
from table(values ('Café', 'Cafe')) t(c1, c2);
R1 R2
-- --
0 1
If your database had the CLDR181_EO_S1 collation, the result in the first column would be 1 as well.

How can I cast a CLOB to INTEGER in both DB2 9.5 and 11.5?

I have a column that stores XML as CLOB data, and each of these XML documents contains an ID. At first, I extracted the ID using a regular expression, e.g.:
SELECT
regexp_substr(XML_CONTENT, '(<identifier>)(.*)(<\/identifier>)', 1, 1, 'c', 2) as ID
FROM XML_TABLE w
This query returns a table with one CLOB column containing the ID. E.g.:
|ID (CLOB)|
|123456|
|456789|
Now that I have this information, I want to join it with another table by referencing an integer PK column, and to do this I need to cast the CLOB ID to an INTEGER ID. How can I do that?
I'm doing this to find differences between a DEV and a PROD database, and in this environment the DB2 versions are different, so I need this query to be compatible with both DB2 9.5 and DB2 11.5.
The REGEXP functions were introduced in Db2 11.1, so on Db2 9.5 you would need to use e.g. LOCATE and SUBSTR instead. For example:
SELECT
SUBSTR(XML_CONTENT
, LOCATE( '<identifier>',XML_CONTENT) + LENGTH('<identifier>')
, LOCATE('</identifier>',XML_CONTENT) - LOCATE('<identifier>',XML_CONTENT) - LENGTH('<identifier>')
)
FROM
TABLE(VALUES
'blah <identifier>Some value</identifier> blah') AS X (XML_CONTENT)
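The query above still returns a character value. As far as I know a CLOB cannot be cast to INTEGER directly in Db2, so a possible way to finish the job is to cast to VARCHAR first and then to INTEGER. This is only a sketch: the VARCHAR(20) length and the sample value 123456 are assumptions, and the cast will fail if the text between the tags is not purely numeric.

```sql
-- Sketch only: assumes the text between the tags consists solely of digits.
SELECT
  CAST(
    CAST(
      SUBSTR(XML_CONTENT
        , LOCATE( '<identifier>',XML_CONTENT) + LENGTH('<identifier>')
        , LOCATE('</identifier>',XML_CONTENT) - LOCATE('<identifier>',XML_CONTENT) - LENGTH('<identifier>')
      ) AS VARCHAR(20))    -- step 1: CLOB -> VARCHAR
    AS INTEGER) AS ID      -- step 2: VARCHAR -> INTEGER
FROM
  TABLE(VALUES
    CAST('blah <identifier>123456</identifier> blah' AS CLOB(1K))) AS X (XML_CONTENT)
```

The same double cast can be wrapped around the REGEXP_SUBSTR call on Db2 11.5, which keeps the two versions of the query structurally similar.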
I guess I found a good solution to this problem:
DB2 11.5 (fastest way?)
SELECT
REGEXP_SUBSTR(XML_CONTENT, '(<identifier>)(.*)(</identifier>)', 1, 1, 'c', 2)
FROM XML_TABLE;
DB2 9.5
SELECT
XMLCAST(XMLQUERY('$d//identifier/text()' PASSING XMLPARSE(DOCUMENT XML_CONTENT) AS "d") AS integer) AS ID
FROM XML_TABLE

DB2 tablespaces: "partition-by-range" or "partition-by-growth"

During the upgrade from DB2 9 to DB2 10 on z/OS, the previous (now retired) DBA converted all tablespaces from "simple" to "universal". How can I determine if they are partition-by-range or partition-by-growth?
Using RC/Query in CA/Tools from Computer Associates, I was able to reverse-engineer the CREATE TABLESPACE statement, but it's not obvious from the code which type of tablespace this is.
CREATE TABLESPACE SNF101
IN DNF1
USING STOGROUP GNF2
PRIQTY 48
SECQTY 48
ERASE NO
BUFFERPOOL BP1
CLOSE NO
LOCKMAX SYSTEM
SEGSIZE 4
FREEPAGE 0
PCTFREE 5
GBPCACHE CHANGED
DEFINE YES
LOGGED
TRACKMOD YES
COMPRESS NO
LOCKSIZE ANY
MAXROWS 255
CCSID EBCDIC
;
Given that CREATE TABLESPACE statement, how can I determine if this is partition-by-range or partition-by-growth?
Thanks!
Check if your version of the CA/Tools is capable of recognizing the tablespace types and also generating the matching DDL.
Check the TYPE column of SYSIBM.SYSTABLESPACE: the value G indicates partition-by-growth and the value R indicates partition-by-range.
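A minimal catalog query along those lines might look like this. The database name DNF1 is taken from the DDL in the question; the extra columns are included only because they help to cross-check the result.

```sql
-- List every tablespace in database DNF1 with its type indicator.
-- TYPE = 'G' : partition-by-growth universal tablespace
-- TYPE = 'R' : partition-by-range universal tablespace
SELECT DBNAME, NAME, TYPE, SEGSIZE, PARTITIONS, MAXPARTITIONS
FROM SYSIBM.SYSTABLESPACE
WHERE DBNAME = 'DNF1'
ORDER BY NAME;
```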

SQL Command to insert Chinese Letters

I have a database with one column of the type nvarchar. If I write
INSERT INTO table VALUES ("玄真")
It shows ¿¿ in the table. What should I do?
I'm using SQL Developer.
Use single quotes, rather than double quotes, to create a text literal. For an NVARCHAR2/NCHAR text literal you also need to prefix it with N.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( value NVARCHAR2(20) );
INSERT INTO table_name VALUES (N'玄真');
Query 1:
SELECT * FROM table_name
Results:
| VALUE |
|-------|
| 玄真 |
First, using NVARCHAR might not even be necessary.
The 'N' character data types are for storing data that doesn't 'fit' in the database's defined character set. There's an auxiliary character set defined as the NCHAR Character set. It's kind of a band aid - once you create a database it can be difficult to change its character set. Moral of this story - take great care in defining the Character Set when creating your database and do not just accept the defaults.
Here's a scenario (LiveSQL) where we're storing a Chinese string in both NVARCHAR and VARCHAR2.
CREATE TABLE SO_CHINESE ( value1 NVARCHAR2(20), value2 varchar2(20 char));
INSERT INTO SO_CHINESE VALUES (N'玄真', '我很高興谷歌翻譯。' );
select * from SO_CHINESE;
Note that both character sets are in the Unicode family. Note also that I declared my VARCHAR2 column as (20 char), i.e. 20 characters. That's because some characters may require up to 4 bytes to be stored. A definition of (20) under the default byte length semantics would reserve 20 bytes, which is only enough room for 5 of those characters.
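To see the difference between characters and bytes for the stored values, a quick check against the SO_CHINESE table created above could look like this (the column aliases are made up for the example):

```sql
-- LENGTH counts characters; LENGTHB counts bytes in the column's character set.
SELECT value2,
       LENGTH(value2)  AS char_count,
       LENGTHB(value2) AS byte_count
FROM SO_CHINESE;
```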
Let's look at the same scenario using SQL Developer and my local database.
And to confirm the character sets:
SQL> clear screen
SQL> set echo on
SQL> set sqlformat ansiconsole
SQL> select *
2 from database_properties
3 where PROPERTY_NAME in
4 ('NLS_CHARACTERSET',
5 'NLS_NCHAR_CHARACTERSET');
PROPERTY_NAME            PROPERTY_VALUE   DESCRIPTION
NLS_NCHAR_CHARACTERSET   AL16UTF16        NCHAR Character set
NLS_CHARACTERSET         AL32UTF8         Character set
First of all, you should set a Chinese character encoding or collation on your database, for example:
UTF-8, Chinese_Hong_Kong_Stroke_90_BIN, Chinese_PRC_90_BIN, Chinese_Simplified_Pinyin_100_BIN ...
Here is an example with SQL Server 2008 (Management Studio), which includes all of these collations; however, you can find the same character encodings in other databases (MySQL, SQLite, MongoDB, MariaDB...).
Create the database with Chinese_PRC_90_BIN (you can choose another collation):
Select a Page (Left Header) Options > Collation > Choose the Collation
Create a Table with the same Collation:
Execute the Insert Statement
INSERT INTO ChineseTable VALUES ('玄真');

Postgresql order by - danish characters is expanded

I'm trying to get an "order by" clause in a SQL query to work, but for some reason Danish special characters are expanded instead of being sorted by their own value.
SELECT roadname FROM mytable ORDER BY roadname
The result:
Abildlunden
Æblerosestien
Agern Alle 1
The result in the middle should be last.
The locale is set to Danish, so it should know the ordering of the Danish special characters.
What is the collation of your database? (You might also want to give the PostgreSQL version you are using) Use "\l" from psql to see.
Compare and contrast:
steve#steve#[local] =# select * from (values('Abildlunden'),('Æblerosestien'),('Agern Alle 1')) x(word)
order by word collate "en_GB";
word
---------------
Abildlunden
Æblerosestien
Agern Alle 1
(3 rows)
steve#steve#[local] =# select * from (values('Abildlunden'),('Æblerosestien'),('Agern Alle 1')) x(word)
order by word collate "da_DK";
word
---------------
Abildlunden
Agern Alle 1
Æblerosestien
(3 rows)
The database collation is set when you create the database cluster, from the locale you have set at the time. If you installed PostgreSQL through a package manager (e.g. apt-get) then it is likely taken from the system-default locale.
You can override the collation used in a particular column, or even in a particular expression (as done in the examples above). However, if you're not specifying anything (likely), the database default will be used. That default is inherited from the template database when the database is created, and the template database collation is fixed when the cluster is created.
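For the table in the question, the two kinds of override could be sketched as follows. This assumes that roadname is a text column and that a collation named "da_DK" exists in the database (the available names depend on the locales installed on the operating system).

```sql
-- Per-expression override: affects only this query.
SELECT roadname FROM mytable ORDER BY roadname COLLATE "da_DK";

-- Per-column override: every later ORDER BY on roadname uses Danish rules.
-- Assumes the column type is text; keep the column's real type here.
ALTER TABLE mytable ALTER COLUMN roadname TYPE text COLLATE "da_DK";
```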
If you want to use da_DK as your default collation throughout, and it's not currently your database default, your simplest option might be to dump the database, then drop and re-create the cluster, specifying the collation to initdb (or pg_createcluster or whatever tool you use to create the server).
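A less drastic alternative, assuming PostgreSQL 8.4 or later and an operating system that provides the da_DK.UTF-8 locale, is to create a new database from template0 with its own locale settings and restore the dump into it. The database name roads_da is a placeholder.

```sql
-- template0 must be used when the locale differs from the cluster default.
-- roads_da is a hypothetical database name.
CREATE DATABASE roads_da
  TEMPLATE template0
  ENCODING 'UTF8'
  LC_COLLATE 'da_DK.UTF-8'
  LC_CTYPE 'da_DK.UTF-8';
```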
BTW, the question isn't well phrased. PostgreSQL is very much not ignoring the "special" characters; it is correctly expanding "Æ" into "AE", which is a correct rule for English. Collating "Æ" at the end is actually more like the unlocalised behaviour.
Collation documentation: http://www.postgresql.org/docs/current/static/collation.html

Numeric literals in sql server 2008

What is the type that sql server assigns to the numeric literal: 2. , i.e. 2 followed by a dot?
I was curious because:
select convert(varchar(50), 2.)
union all
select convert(varchar(50), 2.0)
returns:
2
2.0
which made me ask what's the difference between 2. and 2.0 type wise?
SQL Server seems to assign types to numeric literals depending on the number itself, by finding the minimal storage type that can hold the number. A value of 1222333 is stored as int while 1152921504606846975 is stored as bigint.
thanks
Edit: I also want to add why this is so important. In SQL Server 2008 R2, select 2/5 returns 0 while select 2./5 returns 0.4, because of the way SQL Server treats these types. In Oracle and Access, select 2/5 (Oracle: select 2/5 from dual) returns 0.4. That's the way it should be. I wonder if they fixed this behaviour in SQL Server 2012. I would be surprised if they did.
This script might answer my question. The type of 2. is numeric(1, 0).
create table dbo.test_type (field sql_variant)
go
delete from dbo.test_type
go
INSERT INTO dbo.test_type
VALUES (2.);
INSERT INTO dbo.test_type
VALUES (2.0);
SELECT field
, sql_variant_property (field
, 'BaseType')
AS BaseType
, sql_variant_property (field
, 'Precision')
AS Precision
, sql_variant_property (field
, 'Scale')
AS Scale
FROM dbo.test_type
It returns:
2 numeric 1 0
2.0 numeric 2 1
This is why, when 2.0 is converted to varchar, the result is 2.0. SQL Server seems to keep the precision and scale of the literal.
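The same check can be made without a helper table by passing the expressions straight to sql_variant_property, which also shows why 2/5 and 2./5 behave differently: the first is an int divided by an int, the second a numeric divided by an int. This is only a sketch; the column aliases are made up for the example.

```sql
-- Inspect the type of each division result directly.
SELECT 'int / int' AS expr,
       sql_variant_property(2/5, 'BaseType')  AS base_type,
       sql_variant_property(2/5, 'Precision') AS prec,
       sql_variant_property(2/5, 'Scale')     AS scale_digits
UNION ALL
SELECT 'numeric / int',
       sql_variant_property(2./5, 'BaseType'),
       sql_variant_property(2./5, 'Precision'),
       sql_variant_property(2./5, 'Scale');
```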