I want to migrate from MySQL to PostgreSQL. My CREATE TABLE query looks like this:
CREATE TABLE IF NOT EXISTS conftype
(
CType char(1) NOT NULL,
RegEx varchar(300) default NULL,
ErrStr varchar(300) default NULL,
Min integer default NULL,
Max integer default NULL,
PRIMARY KEY (CType)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_bin;
What is the converted form of this query? I am confused by the DEFAULT CHARSET=latin1 COLLATE=latin1_bin part. How can I convert it?
That clause means the table uses only the latin-1 (ISO-8859-1) character set and latin-1 binary sort order. In PostgreSQL the character set is database-wide; there is no option to set it at table level.
You could create a mostly compatible database with:
CREATE DATABASE databasenamegoeshere WITH ENCODING 'LATIN1' LC_COLLATE='C'
LC_CTYPE='C' TEMPLATE=template0;
However, I would personally consider a MySQL->PostgreSQL port a good opportunity to switch to UTF-8/Unicode as well.
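If you do switch, the database creation would look much the same, swapping only the encoding (a minimal sketch; the database name is a placeholder as before):
CREATE DATABASE databasenamegoeshere WITH ENCODING 'UTF8' LC_COLLATE='C'
LC_CTYPE='C' TEMPLATE=template0;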
The character set is defined when you create the database; you can't override it per table in Postgres.
A non-standard collation can be defined only at column level in Postgres, not at table level. I think(!) that the equivalent to MySQL's latin1_bin would be the "C" collation in Postgres.
So if you do need a different collation, you need something like this:
RegEx varchar(300) default NULL collate "C",
ErrStr varchar(300) default NULL collate "C",
min and max are reserved words in SQL, so you shouldn't use them as column names. (Although using them as column names will work, I strongly suggest you find different names to avoid problems in the future.)
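Putting the pieces together, the converted query could look something like this (a sketch only; min_value and max_value are placeholder names I picked to avoid the reserved words, and the COLLATE "C" clauses are only needed if you really want the binary sort order):
CREATE TABLE IF NOT EXISTS conftype
(
CType char(1) NOT NULL,
RegEx varchar(300) default NULL COLLATE "C",
ErrStr varchar(300) default NULL COLLATE "C",
min_value integer default NULL,
max_value integer default NULL,
PRIMARY KEY (CType)
);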
I have looked into other solutions but could not figure out the problem from their explanations. I am trying to run a Python script that loads data from an OLTP MySQL database (AWS RDS) into an OLAP database on AWS Redshift. I have defined my table in Redshift as below:
create_product = ("""CREATE TABLE IF NOT EXISTS product (
productCode varchar(15) NOT NULL PRIMARY KEY,
productName varchar(70) NOT NULL,
productLine varchar(50) NOT NULL,
productScale varchar(10) NOT NULL,
productVendor varchar(50) NOT NULL,
productDescription text NOT NULL,
buyPrice decimal(10,2) NOT NULL,
MSRP decimal(10,2) NOT NULL
)""")
I am using a Python script to load the data from RDS to Redshift. The body of my load function is:
for query in dimension_etl_query:
    oltp_cur.execute(query[0])
    items = oltp_cur.fetchall()
    try:
        olap_cur.executemany(query[1], items)
        olap_cnx.commit()
        logger.info("Inserted data with: %s", query[1])
    except sqlconnector.Error as err:
        logger.error('Error %s Couldnt run query %s', err, query[1])
Running the script throws this error:
olap_cur.executemany(query[1], items)
psycopg2.errors.StringDataRightTruncation: value too long for type character varying(256)
I have checked the length of each column in my SQL database, and only productDescription has values longer than 256 characters. However, I am using the text datatype for that column. Would appreciate any tips on how to find the root cause.
See here:
https://docs.aws.amazon.com/redshift/latest/dg/r_Character_types.html#r_Character_types-text-and-bpchar-types
TEXT and BPCHAR types
You can create an Amazon Redshift table with a TEXT column, but it is converted to a VARCHAR(256) column that accepts variable-length values with a maximum of 256 characters.
You can create an Amazon Redshift column with a BPCHAR (blank-padded character) type, which Amazon Redshift converts to a fixed-length CHAR(256) column.
Looks like you might need VARCHAR with an explicit length instead, I think. From the same link:
VARCHAR or CHARACTER VARYING
...
If used in an expression, the size of the output is determined using the input expression (up to 65535).
You will have to experiment to see if that works.
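As a sketch of the fix, you could first check the longest value on the MySQL side and then declare the Redshift column with an explicit length (the 1000 here is an assumption; pick whatever bound your data needs, up to the 65535-byte maximum):
-- On the MySQL source: find the longest stored description
SELECT MAX(CHAR_LENGTH(productDescription)) FROM product;

-- In Redshift: replace text with an explicit VARCHAR length
CREATE TABLE IF NOT EXISTS product (
productCode varchar(15) NOT NULL PRIMARY KEY,
productName varchar(70) NOT NULL,
productLine varchar(50) NOT NULL,
productScale varchar(10) NOT NULL,
productVendor varchar(50) NOT NULL,
productDescription varchar(1000) NOT NULL, -- was text, i.e. varchar(256)
buyPrice decimal(10,2) NOT NULL,
MSRP decimal(10,2) NOT NULL
);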
Just try to keep everything under 256 chars, even if it is a text column.
I'm trying to store an entity in my PostgreSQL database. This entity has a List in it, so I'd like to use the PostgreSQL type TEXT[]. But every time I try, I get a SQL error, and I have no idea why.
I don't really get the syntax error. I'm sure it's a dumb issue, but can you help me?
Thank you
I tried some alternatives, such as creating the table directly from the H2 console, but I always get the same error.
The script I use with Flyway to create the table:
CREATE TABLE discrimination(
id SERIAL PRIMARY KEY NOT NULL ,
location VARCHAR(255) NOT NULL,
criteria TEXT[] NOT NULL,
domain VARCHAR(255) NOT NULL,
description TEXT NOT NULL,
name_organ VARCHAR(55) NOT NULL,
function_disc VARCHAR(55) NOT NULL
);
My application config for H2 & Flyway:
h2:
  console:
    enabled: true
    path: /h2
datasource:
  url: jdbc:h2:mem:formation-iris;MODE=PostgreSQL
  username: test
  password: test
  driver-class-name: org.h2.Driver
flyway:
  locations: classpath:db/migration
  enabled: true
And the error I get:
Syntax error in SQL statement "CREATE TABLE DISCRIMINATION(
ID SERIAL PRIMARY KEY NOT NULL ,
LOCATION VARCHAR(255) NOT NULL,
CRITERIA TEXT[[*]] NOT NULL,
DOMAIN VARCHAR(255) NOT NULL,
DESCRIPTION TEXT NOT NULL,
NAME_ORGAN VARCHAR(55) NOT NULL,
FUNCTION_DISC VARCHAR(55) NOT NULL
) "; expected "(, FOR, UNSIGNED, INVISIBLE, VISIBLE, NOT, NULL, AS, DEFAULT, GENERATED, ON, NOT, NULL, AUTO_INCREMENT, BIGSERIAL, SERIAL, IDENTITY, NULL_TO_DEFAULT, SEQUENCE, SELECTIVITY, COMMENT, CONSTRAINT, PRIMARY, UNIQUE, NOT, NULL, CHECK, REFERENCES, ,, )"; SQL statement:
From H2 documentation:
Compatibility Modes
For certain features, this database can emulate
the behavior of specific databases. However, only a small subset of
the differences between databases are implemented in this way.
Which means that H2 can emulate certain DB-specific behaviours, but it won't be fully compatible with the selected DB.
That's especially true for SQL syntax.
So, if you want to use arrays in H2, you should use the H2 syntax ARRAY instead of TEXT[].
Which also means that you will need separate SQL scripts for production (PostgreSQL) and for tests (H2). Luckily, Flyway supports that: it can load vendor-specific scripts from different folders. Extend the Flyway configuration this way:
spring.flyway.locations=classpath:db/migration/{vendor}
and add the vendor-specific SQL scripts under the /h2 and /postgresql folders respectively.
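For illustration, the two vendor-specific migrations could look like this (the file name V1__create_discrimination.sql is a placeholder; note that bare ARRAY works in the H2 1.4.x line, while newer H2 versions may want a typed form such as VARCHAR(255) ARRAY):
-- db/migration/postgresql/V1__create_discrimination.sql
CREATE TABLE discrimination(
id SERIAL PRIMARY KEY NOT NULL,
location VARCHAR(255) NOT NULL,
criteria TEXT[] NOT NULL,
domain VARCHAR(255) NOT NULL,
description TEXT NOT NULL,
name_organ VARCHAR(55) NOT NULL,
function_disc VARCHAR(55) NOT NULL
);

-- db/migration/h2/V1__create_discrimination.sql
CREATE TABLE discrimination(
id SERIAL PRIMARY KEY NOT NULL,
location VARCHAR(255) NOT NULL,
criteria ARRAY NOT NULL,
domain VARCHAR(255) NOT NULL,
description TEXT NOT NULL,
name_organ VARCHAR(55) NOT NULL,
function_disc VARCHAR(55) NOT NULL
);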
I'm using TYPO3 8.7.4 with PHP 7.0.22 and MariaDB 10.2.7.
The DB Compare inside the Install Tool shows that TYPO3 wants to alter all tables, because the current value differs from the expected one in the table's collation:
ALTER TABLE `be_groups` CHANGE `title` `title` VARCHAR(50) DEFAULT '' NOT NULL
Current value: title VARCHAR(50) DEFAULT '''' NOT NULL COLLATE utf8_general_ci
MariaDB implemented a change to the Information Schema COLUMNS table which is not backwards compatible with the output expected from the 'original' MySQL:
https://jira.mariadb.org/browse/MDEV-13132
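A hedged illustration of what changed (paraphrasing the MDEV-13132 description, not verified output): MariaDB 10.2 started returning column defaults in the information schema as quoted SQL literals, so a tool expecting MySQL's raw values sees a spurious difference.
SELECT COLUMN_DEFAULT
FROM information_schema.COLUMNS
WHERE TABLE_NAME = 'be_groups' AND COLUMN_NAME = 'title';
-- MySQL / older MariaDB: returns the raw value (an empty string)
-- MariaDB 10.2.7: returns '' as a quoted literal, hence the '''' in the compare output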
One of the fields in the table creation is described as below:
id text COLLATE pg_catalog."default"
It just says that you're using the default lc_collate for this column.
But what is the default collation? Use SHOW to discover it:
SHOW lc_collate;
PostgreSQL allows to create columns with different types of collation:
CREATE TABLE collate_test
(
default_collate text, --Default collation
custom_collate text COLLATE pg_catalog."C" --Custom collation
);
Did you see the difference?
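To check which collation each column actually got, you could query the information schema (a quick sketch; collation_name comes back NULL when the database default applies):
SELECT column_name, collation_name
FROM information_schema.columns
WHERE table_name = 'collate_test';
-- default_collate -> NULL (the database default, i.e. lc_collate)
-- custom_collate -> C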
More info about collation is in the docs:
The collation feature allows specifying the sort order and character
classification (...)
I've got a PgSQL 9.4.3 server set up. Previously I was only using the public schema, and I created a table like this, for example:
CREATE TABLE ma_accessed_by_members_tracking (
reference bigserial NOT NULL,
ma_reference bigint NOT NULL,
membership_reference bigint NOT NULL,
date_accessed timestamp without time zone,
points_awarded bigint NOT NULL
);
Using the Windows program PgAdmin III, I can see it created the proper information and sequence.
I've recently added another schema called "test" to the same database and created the exact same table, just like before.
However, this time I see:
CREATE TABLE test.ma_accessed_by_members_tracking
(
reference bigint NOT NULL DEFAULT nextval('ma_accessed_by_members_tracking_reference_seq'::regclass),
ma_reference bigint NOT NULL,
membership_reference bigint NOT NULL,
date_accessed timestamp without time zone,
points_awarded bigint NOT NULL
);
My question / curiosity is: why does the reference column show bigserial in the public schema, but bigint with a nextval() default in the test schema?
Both work as expected. I just do not understand why a different schema would produce a different table creation. I realize that bigint and bigserial allow the same range of integers.
Merely A Notational Convenience
According to the documentation on Serial Types, smallserial, serial, and bigserial are not true data types. Rather, they are a notation that creates, in one step, both a sequence and a column whose default value points to that sequence.
I created a test table in schema public. The psql command \d shows the bigint column type. Maybe it's PgAdmin behavior?
Update
I checked the PgAdmin source code. In the function pgColumn::GetDefinition() it scans the pg_depend table for an auto dependency, and when one is found, it replaces bigint with bigserial to reconstruct the original CREATE TABLE code.
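If you want to look at that dependency yourself, here is a rough catalog query (my own sketch based on the pg_depend documentation, not PgAdmin's actual query):
-- Sequences linked to table columns by an 'auto' dependency (deptype = 'a');
-- this is the association PgAdmin detects before printing "bigserial".
SELECT d.objid::regclass AS sequence_name,
       d.refobjid::regclass AS table_name,
       a.attname AS column_name
FROM pg_depend d
JOIN pg_attribute a
  ON a.attrelid = d.refobjid
 AND a.attnum = d.refobjsubid
WHERE d.deptype = 'a'
  AND d.classid = 'pg_class'::regclass;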
When you create a serial column in the standard way:
CREATE TABLE new_table (
new_id serial);
Postgres creates a sequence with commands:
CREATE SEQUENCE new_table_new_id_seq ...
ALTER SEQUENCE new_table_new_id_seq OWNED BY new_table.new_id;
From the documentation: The OWNED BY option causes the sequence to be associated with a specific table column, such that if that column (or its whole table) is dropped, the sequence will be automatically dropped as well.
The standard name of a sequence is built from the table name, the column name, and the suffix _seq.
If a serial column was created in such a way, PgAdmin shows its type as serial.
If a sequence has a non-standard name or is not associated with a column, PgAdmin shows nextval() as the default value.
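For example (my_custom_seq is a made-up name to illustrate the last point):
-- A manually created sequence that the column merely references:
CREATE SEQUENCE my_custom_seq;
CREATE TABLE demo (
    id bigint NOT NULL DEFAULT nextval('my_custom_seq'::regclass)
);
-- Without ALTER SEQUENCE my_custom_seq OWNED BY demo.id, the sequence is
-- not associated with the column, so PgAdmin shows the nextval() default
-- instead of bigserial, and dropping the table will not drop the sequence.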