Create database defnition equivalent from mysql to postgresql - postgresql

I need to migrate a mysql table to postgresql.
I need an accent and case insensitive database.
In mysql, my database has the next definition:
CREATE DATABASE gestan
DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
How do I create an equivalent definition to postgresql?
I have read some posts, but it seems outdated.

create database getan
encoding = 'UTF-8';
There is no direct equivalent for the case insensitive collation _ci in Postgres.
You can define a new ICU collation that uses case insensitive comparison, but it can't be used as a default collation for a database, only for the collation on column level.
Related questions:
Install utf8 collation in PostgreSQL
https://dba.stackexchange.com/questions/183355
https://dba.stackexchange.com/a/256979

Related

Install utf8 collation in PostgreSQL

Right now I can choose Encoding : UTF8 when creating a new DB in pgAdmin4 GUI.
But, there is no option to choose utf8_general_ci as collation or character type. When I do select * from pg_collation; I dont see any collation relevant to utf8_general_ci.
Coming from a mySQL background I am confused. Do I have to install utf8-like ( eg utf8_general_ci, utf8_unicode_ci) collation in my PostgreSQL 10 or windows10?
I just want to have the equivalent of mySQL collation utf8_general_ci to PostgreSQL.
Thank you
utf8 is an encoding (how to represent unicode characters as a series of bytes), not a collation (which character goes before which).
I think the Postgres 10 collation equivalent for utf8_general_ci (or more modern utf8_unicode_ci) is called und-x-icu - this is an undefined collation (not defined for any real world language) provided by an ICU library. This collation would sort quite reasonably characters from most languages.
ICU support is a new feature added in PostgreSQL 10, so this collation isn't available for older PostgreSQL versions or when it's disabled during compilation. Before that Postgres was using operating system provided collation support, which differs between operating systems.

how can i change the "character type" of a database in postgresql?

I'm using postgreSQL 9.1
I've set the Collation and the Character Type of the database to Greek_Greece.1253 and I want to change it to utf8
To change the collation I should use this, right?
But how can I change the character type?
Thanks
EDIT
I ment to wright C instead of utf8. I would like to change the Collation and the Character Type to C
You cannot change default collation of an existing database. You need to CREATE DATABASE with the collation you need and then dump/restore your schema and data into it.
If you do not want to recreate the database - you can specify collation for every text collumn in your db.
Here is detailed postgres manual on collations: Collation Support.
First line of this manual page states:
LC_COLLATE and LC_CTYPE settings of a database cannot be changed after
its creation.
CREATE DATABASE, pg_dump, pg_restore

How to make my postgresql database use a case insensitive collation?

In several SO posts OP asked for an efficient way to search text columns in a case insensitive manner.
As much as I could understand the most efficient way is to have a database with a case insensitive collation. In my case I am creating the database from scratch, so I have the perfect control on the DB collation. The only problem is that I have no idea how to define it and could not find any example of it.
Please, show me how to create a database with case insensitive collation.
I am using postgresql 9.2.4.
EDIT 1
The CITEXT extension is a good solution. However, it has some limitations, as explained in the documentation. I will certainly use it, if no better way exists.
I would like to emphasize, that I wish ALL the string operations to be case insensitive. Using CITEXT for every TEXT field is one way. However, using a case insensitive collation would be the best, if at all possible.
Now https://stackoverflow.com/users/562459/mike-sherrill-catcall says that PostgreSQL uses whatever collations the underlying system exposes. I do not mind making the OS expose a case insensitive collation. The only problem I have no idea how to do it.
A lot has changed since this question. Native support for case-insensitive collation has been added in PostgreSQL v12. This basically deprecates the citext extension, as mentioned in the other answers.
In PostgreSQL v12, one can do:
CREATE COLLATION case_insensitive (
provider = icu,
locale = 'und-u-ks-level2',
deterministic = false
);
CREATE TABLE names(
first_name text,
last_name text
);
insert into names values
('Anton','Egger'),
('Berta','egger'),
('Conrad','Egger');
select * from names
order by
last_name collate case_insensitive,
first_name collate case_insensitive;
See https://www.postgresql.org/docs/current/collation.html for more information.
There are no case insensitive collations, but there is the citext extension:
http://www.postgresql.org/docs/current/static/citext.html
For my purpose the ILIKE keyword did the job.
From the postgres docs:
The key word ILIKE can be used instead of LIKE to make the match
case-insensitive according to the active locale. This is not in the
SQL standard but is a PostgreSQL extension.
This is not changing collation, but maybe somebody help this type of query, where I was use function lower:
SELECT id, full_name, email FROM nurses WHERE(lower(full_name) LIKE '%bar%' OR lower(email) LIKE '%bar%')
I believe you need to specify your collation as a command line option to initdb when you create the database cluster. Something like
initdb --lc-collate=en_US.UTF-8
It also seems that using PostgreSQL 9.3 on Ubuntu and Mac OS X, initdb automatically creates the database cluster using a case-insensitive collation that is default in the current OS locale, in my case, en_US.UTF-8.
Could you be using an older version of PostgreSQL that does not default to the host locale? Or could it be that you are on an operating system that does not provide any case-insensitive collations for PostgreSQL to choose from?

Force Postgres shift to uppercase rather than lowercase before sorting case insensitive?

I try to migrate to postgres from pervasive. In pervasive there was something like 'upper.alt' - alternative collation. I don't really know how it works, but I have to make my new postgres database to behave like pervasive with this collation.
I use Postgres 9.2.4 and utf-8 encoding and LC_COLLATE='Polish_Poland.1250' .
You can try and order with COLLATE "C". That would get what you want in your example. It has side effects though! Effectively everything is ordered according to the byte values of the encoded character.
WITH x(col) AS (
VALUES
('ABC_AAAAA')
,('ABC_BBBBB')
,('ABC_ZZZZZ')
,('ABCAAAAA')
,('ABCBBBBB')
,('ABCZZZZZ')
)
SELECT *
FROM x
ORDER BY col COLLATE "C"
This option to change the collation for individual expressions (as opposed to using a collation defined at creation time of the db) was introduced with Postgres 9.1.
More about collation in the manual here.

firebird set case insensitive collation

How can I set case insensitive collation for the whole database?
Do I have to recreate the tables and data?
Database is firebird 2.5
Quote from the release notes:
The character set and collation of existing columns are not affected by ALTER CHARACTER SET changes.
So yes, it seems that the best way would be to recreate the database with desired default character set and collation (and / or with explicit definitions in domains).