PostgreSQL LIKE - find UTF-8 variants of ASCII text

I'm trying to query a PostgreSQL table for rows where a column contains ASCII letters:
SELECT p.* FROM person AS p WHERE p.surname LIKE 'e%';
The database contains UTF-8 strings. For example:
Éxample
Ěxample
I need to find those rows using only an ASCII e (or E) in the query. Why does the above query not find them?

You can do this with the unaccent extension:
CREATE EXTENSION unaccent;
SELECT * FROM (
VALUES
('Éxample'),
('Ěxample')
) v(s)
WHERE lower(unaccent(s)) LIKE 'e%'
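
For reference, the effect of unaccent can be approximated outside the database. Here is a rough Python sketch (strip_accents is a hypothetical helper, not part of any library) that decomposes each character with NFKD and drops the combining accent marks:

```python
import unicodedata

def strip_accents(s):
    # Rough analogue of Postgres unaccent: decompose (NFKD), then
    # drop the combining marks that carry the accents.
    return ''.join(c for c in unicodedata.normalize('NFKD', s)
                   if not unicodedata.combining(c))

print(strip_accents('Éxample').lower())  # example
print(strip_accents('Ěxample').lower())  # example
```

This is only an approximation - unaccent is driven by a rules file and covers cases (ligatures, some symbols) that pure decomposition does not.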

Related

How to convert Oracle to_single_byte() to PostgreSQL

Oracle
select regexp_replace(TO_SINGLE_BYTE('Ａ'),'[^ a-zA-Z]','!!',1,0,'im') from dual;
--> Result: A
Postgres
select regexp_replace('Ａ','[^ a-zA-Z]','!!','ig');
--> Result: !!
Question: how can I get the same result in Postgres as in Oracle?
I searched for a way to convert this function to Postgres, but apparently there is no equivalent function.
That letter is Unicode code point U+FF21 (Fullwidth Latin Capital Letter A) and does not match your pattern.
Use the unaccent function to convert the character to a regular A:
CREATE EXTENSION unaccent;
SELECT ascii(unaccent('Ａ'));
ascii
-------
65
(1 row)
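
The fullwidth/ASCII distinction is easy to verify outside the database. In Python, for instance, NFKC normalization maps U+FF21 back to a plain A:

```python
import unicodedata

c = '\uFF21'  # the character from the question
print(unicodedata.name(c))               # FULLWIDTH LATIN CAPITAL LETTER A
print(ord(c), ord('A'))                  # 65313 65
print(unicodedata.normalize('NFKC', c))  # A
```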

SQL Command to insert Chinese Letters

I have a database with one column of the type nvarchar. If I write
INSERT INTO table VALUES ("玄真")
It shows ¿¿ in the table. What should I do?
I'm using SQL Developer.
Use single quotes, rather than double quotes, to create a text literal, and for an NVARCHAR2/NCHAR text literal you need to prefix it with N:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( value NVARCHAR2(20) );
INSERT INTO table_name VALUES (N'玄真');
Query 1:
SELECT * FROM table_name
Results:
| VALUE |
|-------|
| 玄真 |
First, using NVARCHAR might not even be necessary.
The 'N' character data types are for storing data that doesn't 'fit' in the database's defined character set. There's an auxiliary character set, defined as the NCHAR character set. It's kind of a band-aid: once you create a database, it can be difficult to change its character set. The moral of this story: take great care in defining the character set when creating your database, and do not just accept the defaults.
Here's a scenario (LiveSQL) where we're storing a Chinese string in both NVARCHAR and VARCHAR2.
CREATE TABLE SO_CHINESE ( value1 NVARCHAR2(20), value2 varchar2(20 char));
INSERT INTO SO_CHINESE VALUES (N'玄真', '我很高興谷歌翻譯。' )
select * from SO_CHINESE;
Note that both character sets are in the Unicode family. Note also that I told my VARCHAR2 column to hold 20 characters. That's because some characters may require up to 4 bytes of storage. A definition of (20), which defaults to byte semantics, would leave room for only 5 such characters.
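
The character-versus-byte distinction is easy to check in any language with UTF-8 support. A quick Python illustration:

```python
s = '玄真'
print(len(s))                  # 2 characters
print(len(s.encode('utf-8')))  # 6 bytes - these CJK characters take 3 bytes each

# Characters outside the Basic Multilingual Plane need 4 bytes in UTF-8,
# which is why a 20-byte column can hold as few as 5 of them:
print(len('\U00020000'.encode('utf-8')))  # 4
```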
Let's look at the same scenario using SQL Developer and my local database.
And to confirm the character sets:
SQL> clear screen
SQL> set echo on
SQL> set sqlformat ansiconsole
SQL> select *
2 from database_properties
3 where PROPERTY_NAME in
4 ('NLS_CHARACTERSET',
5 'NLS_NCHAR_CHARACTERSET');
PROPERTY_NAME PROPERTY_VALUE DESCRIPTION
NLS_NCHAR_CHARACTERSET AL16UTF16 NCHAR Character set
NLS_CHARACTERSET AL32UTF8 Character set
First of all, you should establish a Chinese character encoding on your database, for example:
UTF-8, Chinese_Hong_Kong_Stroke_90_BIN, Chinese_PRC_90_BIN, Chinese_Simplified_Pinyin_100_BIN ...
I'll show you an example with SQL Server 2008 (Management Studio), which includes all of these collations; however, you can find the same character encodings in other databases (MySQL, SQLite, MongoDB, MariaDB...).
Create the database with Chinese_PRC_90_BIN, but you can choose another collation:
Select a Page (Left Header) Options > Collation > Choose the Collation
Create a Table with the same Collation:
Execute the Insert Statement
INSERT INTO ChineseTable VALUES ('玄真');

exporting to csv from db2 with no delimiter

I need to export content of a db2 table to CSV file.
I read that nochardel would prevent the separator from appearing between fields, but that is not happening.
Suppose I have a table
MY_TABLE
-----------------------
Field_A varchar(10)
Field_B varchar(10)
Field_C varchar(10)
I am using this command
export to myfile.csv of del modified by nochardel select * from MY_TABLE
I get this written into the myfile.csv
data1 ,data2 ,data3
but I would like no ',' separator like below
data1 data2 data3
Is there a way to do that?
You're asking how to eliminate the comma (,) in a comma separated values file? :-)
NOCHARDEL tells DB2 not to surround character-fields (CHAR and VARCHAR fields) with a character-field-delimiter (default is the double quote " character).
Anyway, when exporting from DB2 using the delimited format, you have to have some kind of column delimiter. There isn't a NOCOLDEL option for delimited files.
The EXPORT utility can't write fixed-length (positional) records; you would have to do this in one of the following ways:
Writing a program yourself,
Using a separate utility (IBM sells the High Performance Unload utility)
Writing an SQL statement that concatenates the individual columns into a single string:
Here's an example for the last option:
export to file.del
of del
modified by nochardel
select
cast(col1 as char(20)) ||
cast(intcol as char(10)) ||
cast(deccol as char(30))
from my_table;
This last option can be a pain, since DB2 doesn't have a sprintf() function to help format strings nicely.
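
If you can post-process outside DB2, fixed-width records are also easy to build in a scripting language. A hypothetical Python sketch (the sample rows and the column widths, matching the CAST lengths above, are illustrative assumptions):

```python
# Hypothetical rows as they might be fetched from the table
rows = [('data1', 42, 3.14), ('data2', 7, 99.5)]

for col1, intcol, deccol in rows:
    # Pad each value to a fixed width, like CAST(... AS CHAR(n)) does
    line = f"{col1:<20}{intcol:<10}{deccol:<30}"
    print(line)
```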
Yes there is another way of doing this. I always do this:
Put select statement into a file (input.sql):
select
cast(col1 as char(20)),
cast(col2 as char(10)),
cast(col3 as char(30))
from my_table;
Call db2 clp like this:
db2 -x -tf input.sql -r result.txt
This will work for you, because you need to cast varchar to char. Like Ian said, casting numbers or other data types to char might bring unexpected results.
PS: I think Ian points right on the difference between CSV and fixed-length format ;-)
Use "of asc" instead of "of del". Then you can specify the fixed column locations instead of delimiting.

HEX where clause in PostgreSQL

I'm new to PostgreSQL.
How do I do this:
select * from table_abc where table_abc.a>=7a and table_abc.b<=7a
All values are hex: in columns a and b, and the input value.
Thanks
EDIT :
table_abc
a bytea
b bytea
c text
Careful, here. In Postgres, bytea is a byte array. You look like you want to store a single byte in those columns.
I don't see a single-byte type in the list of datatypes at http://www.postgresql.org/docs/9.0/static/datatype.html.
You can go with an integer type. For example, when I say this:
select x'7A'::integer
I get 122.
If you intend to store a single byte in these columns and write your queries with hex values, then I suggest you make the columns integers and query like this:
select * from table_abc where table_abc.a>=x'7a'::integer and table_abc.b<=x'7a'::integer
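
The hex-to-integer correspondence is easy to sanity-check outside the database; in Python (the sample column values a and b are made up for illustration):

```python
value = int('7a', 16)  # same value as x'7a'::integer in Postgres
print(value)           # 122

# The comparison from the query, with sample column values:
a, b = 0x80, 0x10      # 128 and 16
print(a >= value and b <= value)  # True
```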

Querying a SQL Server 2008 table to find values in a column containing Unicode characters

I've run into a problem in a project I'm working on: some of the string values in a specific SQL Server 2008 table column contain Unicode characters. For example, instead of a dash some strings will instead contain an EM DASH (http://www.fileformat.info/info/unicode/char/2014/index.htm).
The column values that contain Unicode characters are causing problems when I send HTTP requests to a third-party server. Is there a way to query what rows contain one-or-more Unicode characters, so I can at least begin to identify how many rows need to be fixed?
You want to find all strings that contain one or more characters outside ASCII characters 32-126.
I think this should do the job.
SELECT *
FROM your_table
WHERE your_column LIKE N'%[^ -~]%' collate Latin1_General_BIN
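
The pattern [^ -~] matches any character outside the printable-ASCII range, U+0020 (space) through U+007E (tilde). The same check can be sketched in Python, for example to vet data before it reaches the database (has_non_ascii is a hypothetical helper):

```python
import re

def has_non_ascii(s):
    # Any character outside the printable-ASCII range space..tilde?
    return re.search(r'[^ -~]', s) is not None

print(has_non_ascii('a plain dash -'))     # False
print(has_non_ascii('an em dash \u2014'))  # True
```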
One way you can do it is to see which rows no longer equal themselves when converted to a datatype that doesn't support unicode.
CREATE TABLE myStrings (
string nvarchar(max) not null
)
INSERT INTO myStrings (string)
SELECT 'This is not unicode' union all
SELECT 'This has '+nchar(500)+' unicode' union all
SELECT 'This also does not have unicode' union all
SELECT 'This has lots of unicode '+nchar(600)+nchar(700)+nchar(800)+'!'
SELECT cast(string as varchar)
FROM myStrings
SELECT *
FROM myStrings
WHERE cast(cast(string as varchar(max)) as nvarchar(max)) <> string
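
The same round-trip idea can be mimicked in Python: down-convert with replacement, then compare against the original (loses_unicode is a hypothetical helper, and plain ASCII stands in here for SQL Server's code-page-dependent varchar conversion):

```python
def loses_unicode(s):
    # Mirrors the T-SQL cast round-trip: converting to a non-Unicode
    # type replaces unsupported characters, so the string changes.
    return s.encode('ascii', 'replace').decode('ascii') != s

print(loses_unicode('This is not unicode'))               # False
print(loses_unicode('This has ' + chr(500) + ' unicode')) # True
```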
SELECT *
FROM your_table
WHERE your_column LIKE N'%[^ -~]%' collate Latin1_General_BIN
finds all strings that contain one or more characters within ASCII characters 32-126.
I thought the purpose was to find strings where ASCII characters are not in the range 32-126?
NOT is possible with LIKE. Wouldn't this work?
SELECT *
FROM your_table
WHERE your_column NOT LIKE N'%[^ -~]%'
No collate required.