Storing uni code characters in PostgreSQL 8.4 table - postgresql

I want to store unicode characters in on of the column of PostgreSQL8.4 datat base table. I want to store non-English language data say want to store the Indic language texts. I have achieved the same in Oracle XE by converting the text into unicode and stored in the table using nvarchar2 column data type.
The same way I want to store unicode characters of Indic languages say (Tamil,Hindi) in one of the column of a table. How to I can achieve that,what data type should I use?
Please guide me, thanks in advance

Just make sure the database is initialized with encoding utf8. This applies to the whole database for 8.4, later versions are more sophisticated. You might want to check the locale settings too - see the manual for details, particularly around matching with LIKE and text pattern ops.

Related

PostgreSQL SELECT can alter a table?

So I'm new to SQL like databases and the place that I work at migrated to PostgreSQL. One table drastically reduced its contents. The point is, I only used SELECT statements, and changed the name of the columns with AS. Is there a way I might have changed the table data?
When you migrate from a DBMS to another DBMS you must be sure that the objects created are strictly equivalent... The question seems to be trivial, but is'nt.
As a matter fact one important consideration for litterals (char/varchar...) is to verify the collation used formerly and the collation you have used to create the newly database in PostGreSQL.
Collation in an RDBMS is the way to adjust the behavior of character strings with regard to certain parameters such as the distinction, or not, of upper and lower case letters, the distinction, or not, of diacritical characters (accents, ligatures...), specific sorting to language, etc. And constitutes a superset of the character encoding.
Did you verify this point when using some WHERE clause to search some litterals ? If not, try to restricts litteral in applying the right collation (COLLATE operator) or use UPPER function to avoid the distinguish between upper and lower chars...

DB2 UnLOAD in unicode with two chardelimiter

I have to create an UNLOAD job for a DB2 table and save the UNload in unicode. That's no problem.
But unfortunately there are contents in the table columns that correspond to the separators.
For example, I would like the combination #! as a separator, but I can't do that in unicode.
Can someone tell me how to do this?
Now my statement looks like this:
DELIMITED COLDEL X'3B' CHARDEL X'24' DECPT X'2E'
UNICODE
thanks a lot for your help
The delimiter can be a single character (not two characters, as you want).
In this case the chosen solution was to find a single character that did not appear in the data.
When that is not possible, consider a non-delimited output format, or a different technique to get the data to the external system (for example via federation or other SQL-based interchange, or XML etc.

UTF-8 table with lowercase and uppercase and relation between them

I'm looking for a UTF-8 table / tables etc. with all the lowercase (small) and uppercase (capital) characters in (hexa-)decimal form and with a relation between the elements.
So far I found:
https://www.fileformat.info/info/unicode/category/Ll/list.htm
https://www.fileformat.info/info/unicode/category/Lu/list.htm
these are nice lists, though:
there is no relation between the 2 tables (I could create this by means some scripting creating a new table)
the values are Unicode values and not utf-8 (hexa-)decimal values.
Any suggestions?
My advise is to use an existing library that can do case conversion for you. You mentioned you were targeting C++ so the library ICU is one option.

Scala Slick, Trouble with inserting Unicode into database

When I create a new row of data that contains several columns that may contain Unicode, the columns that do contain Unicode are being corrupted.
However, if I insert that data directly, using the mysql-cli Slick will retrieve that Unicode data fine.
Is there anything I should add to my table class to tell Slick that this Column may be a Unicode string?
I found the problem, I have to set the character encoding, for the connection.
db.default.url="jdbc:mysql://localhost/your_db_name?characterEncoding=UTF-8"
You probably need to configure that on the db schema side by setting the right collation.

Anyone had success using a specific locale for a PostgreSQL database so that text comparison is case-insensitive? [duplicate]

This question already has answers here:
Change postgres to case insensitive
(2 answers)
Closed last year.
I'm developing an app in Rails on OS X using PostgreSQL 8.4. I need to setup the database for the app so that standard text queries are case-insensitive. For example:
SELECT * FROM documents WHERE title = 'incredible document'
should return the same result as:
SELECT * FROM documents WHERE title = 'Incredible Document'
Just to be clear, I don't want to use:
(1) LIKE in the where clause or any other type of special comparison operators
(2) citext for the column datatype or any other special column index
(3) any type of full-text software like Sphinx
What I do want is to set the database locale to support case-insensitive text comparison. I'm on Mac OS X (10.5 Leopard) and have already tried setting the Encoding to "LATIN1", with the Collation and Ctype both set to "en_US.ISO8859-1". No success so far.
Any help or suggestions are greatly appreciated.
Thanks!
Update
I have marked one of the answers given as the correct answer out of respect for the folks who responded. However, I've chosen to solve this issue differently than suggested. After further review of the application, there are only a few instances where I need case-insensitive comparison against a database field, so I'll be creating shadow database fields for the ones I need to compare case-insensitively. For example, name and name_lower. I believe I came across this solution on the web somewhere. Hopefully PostgreSQL will allow similar collation options to what SQL Server provides in the future (i.e. DOCI).
Special thanks to all who responded.
You will likely need to do something like use a column function to convert your text e.g. convert to uppercase - an example :
SELECT * FROM documents WHERE upper(title) = upper('incredible document')
Note that this may mess up performance that used index scanning, but if it becomes a problem you can define an index including column functions on target columns e.g.
CREATE INDEX I1 on documents (upper(title))
With all the limitations you have set, possibly the only way to make it work is to define your own = operator for text. It is very likely that it will create other problems, such as creating broken indexes. Other than that, your best bet seems to be to use the citext datatype; that would still let the ORM stuff you're using generate the SQL.
(I am not mentioning the possibility of creating your own locale definition because I haven't ever heard of anyone doing it.)
Your problem and your exclusives are like saying "I want to swim, but I don't want to have to move my arms.".
You will drown trying.
I don't think that is what local or encoding is used for. Encoding is more for picking a character set and not determining how to deal with characters. If there were a setting it would be in the config, but I haven't seen one.
If you do not want to use ilike for fear of not being able to port to another database then I would suggest you look into what ORM options might be available with ActiveRecord if you are using that.
here is something from one of the top postgres guys: http://archives.postgresql.org/pgsql-php/2003-05/msg00045.php
edit: fixed specific references to locale.
SELECT * FROM documents WHERE title ~* 'incredible document'