I'm facing problems trying to convert a Firebird 3 database with character set WIN1252 to UTF8.
I've performed the following procedures:
I extracted the DDL and object definitions from the database and created the new database with character set UTF8 and collation UNICODE_CI_AI. The database structure was created correctly.
Then, when I try to use fbcopy to copy data from the WIN1252 database to the new UTF8 database, the process aborts with the error:
Message: isc_dsql_execute2 failed
SQL Message: -104
can not format message 13: 896 - message file C:\WINDOWS\SYSTEM32\firebird.msg not found
Engine Code: 335544849
Engine Message:
Malformed string
Enabling triggers ... done.
Before using the FbCopy tool, I tried a backup and restore of the WIN1252 database with the following switches:
-FIX_FSS_D UTF8 -FIX_FSS_M UTF8
or
-FIX_FSS_D WIN1252 -FIX_FSS_M WIN1252
However, I still get the same error.
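For reference, the full gbak restore invocation with the first pair of switches looks roughly like this (backup file, database path, and credentials here are placeholders, not the actual ones):
gbak -c -fix_fss_data UTF8 -fix_fss_metadata UTF8 C:\backup\olddb.fbk C:\data\newdb.fdb -user SYSDBA -password masterkey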
I am trying to upload a series of tables (.txt files) into a PostgreSQL database that runs on my Windows 10 desktop. I use psql to upload the files. I have successfully uploaded a couple of tables, but the largest one (5 GB with over 20 million rows) is giving me trouble:
databasename=# \copy table1 FROM 'C:\Users\tablename.txt' DELIMITER ',' CSV HEADER;
ERROR: character with byte sequence 0x9d in encoding "WIN1252" has no equivalent in encoding "UTF8"
CONTEXT: COPY table1, line 581330
I found an answer here which suggested I check the client encoding...
databasename=# SHOW client_encoding;
client_encoding
-----------------
WIN1252
(1 row)
and then change it, which I tried:
databasename=# SET CLIENT_ENCODING TO 'utf8';
SET
I then try the same copy command again and get the following error:
ERROR: invalid byte sequence for encoding "UTF8": 0x92
CONTEXT: COPY table1, line 206051
I've read a little about 0x92 here. It sounds like there is a character in the file which cannot be encoded when I try to perform the \copy command.
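As a side note, byte 0x92 can be decoded directly in PostgreSQL to see what it represents in WIN1252 (a quick check using the built-in convert_from function; in a UTF8 database it comes back as the right single quotation mark, U+2019):
SELECT convert_from('\x92'::bytea, 'WIN1252');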
Some background:
I was able to upload about 1 million rows into SQL Server 2019 (free version) using the SQL Server Import and Export Wizard. (I stopped the import because it was taking too long.) I was also able to view the file in R using read.csv. Not sure if any of this is helpful. Thank you all in advance.
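One detail that may help narrow things down: the file's encoding can also be declared directly on the \copy command itself, independently of client_encoding. This is only a sketch reusing the table and path from above; if the file really does contain bytes such as 0x9d that have no WIN1252-to-UTF8 mapping, it will still fail, which at least confirms the problem is in the file rather than in the session settings:
\copy table1 FROM 'C:\Users\tablename.txt' WITH (FORMAT csv, HEADER, DELIMITER ',', ENCODING 'WIN1252')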
I have a CSV file named myWidechar1.csv, and the data in it is as follows:
1~"ϴAnthony"~"Grosse"~"1980-02-23"~"65000.00"
2~"❤Alica"~"Fatnowna"~"1963-11-14"~"45000.00"
3~"☎Stella"~"Rossenhain"~"1992-03-02"~"120000.00"
The COPY command in PostgreSQL is as follows:
Copy dbo.myWidechar From 'D:\temp\myWidechar1.csv' DELIMITER '~' null as 'null' encoding 'windows-1251' CSV; select 1;
But the problem is that I am getting the error below while importing the data into PostgreSQL:
ERROR: invalid byte sequence for encoding "WIN1251": 0x00
CONTEXT: COPY mywidechar, line 3
SQL state: 22021
I need the data to remain the same in PostgreSQL, as I am migrating data from a SQL Server 2000 instance.
Can someone please help me resolve this issue?
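For what it's worth, a 0x00 byte in a text file usually means it was saved as UTF-16 (for example, SQL Server's "Unicode" output), and characters like ❤ and ☎ do not exist in windows-1251 at all. A hedged sketch, assuming the file is first re-saved as UTF-8:
COPY dbo.myWidechar FROM 'D:\temp\myWidechar1.csv' WITH (FORMAT csv, DELIMITER '~', NULL 'null', ENCODING 'UTF8');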
I have a table to export data from SQL Server to PostgreSQL.
Steps I followed:
Step 1: Export data from SQL Server:
Source: SQL Server Table
Destination: Flat file Destination
Table Or Query to copy: Query
Query:
SELECT
COALESCE(convert(varchar(max),id),'NULL') + '|'
+COALESCE(convert(varchar(max),Name),'NULL') + '|'
+COALESCE(convert(varchar(max),EDate,121),'NULL') AS A
FROM tbl_Employee;
File Name: file.txt
Step 2: Copy to PostgreSQL.
Command:
\COPY tbl_employee FROM '$FilePath\file.txt' DELIMITER '|' NULL AS 'NULL' ENCODING 'LATIN1'
Getting Following error message:
ERROR: invalid byte sequence for encoding "UTF8": 0xc1 0x20
You tell Postgres the source is encoded as LATIN1:
\copy ... ENCODING 'LATIN1'
But that's either not the case or the file is damaged. Else we would not see the error message. What is the true encoding of '$FilePath\file.txt'?
The current client_encoding is not relevant for this since, quoting the manual on COPY:
ENCODING
Specifies that the file is encoded in the encoding_name. If this option is omitted, the current client encoding is used.
(\copy is just a wrapper for SQL COPY in psql.)
And your server_encoding is largely irrelevant, too - as long as Postgres can use a built-in conversion and the target encoding contains all characters of the source encoding - which is the case for LATIN1 -> UTF8: iso_8859_1_to_utf8.
So the remaining source of error is your file, which is almost certainly not valid LATIN1.
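To double-check that a built-in conversion exists for a given pair, you can look it up in the pg_conversion system catalog (iso_8859_1_to_utf8 shows up in this result):
SELECT conname,
       pg_encoding_to_char(conforencoding) AS source,
       pg_encoding_to_char(contoencoding)  AS target
FROM   pg_conversion
WHERE  pg_encoding_to_char(conforencoding) = 'LATIN1';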
I am trying to create a collation on a new database (Firebird 3.0.1 on Windows). I followed these steps:
create database 'c:\tmp\ebizmis.fdb' user 'SYSDBA' password '123456' default character set utf8 collation unicode;
connect 'c:\tmp\ebizmis.fdb' user 'SYSDBA' password '123456';
create collation py for utf8 from UNICODE case insensitive 'LOCALE=zh';
in this step, prompting error:
Statement failed, SQLSTATE = HY000
unsuccessful metadata update
-CREATE COLLATION PY failed
-Invalid collation attributes
But on CentOS it is successful on Firebird 3.0.1.
When I change the uppercase LOCALE to lowercase:
create collation py for utf8 from UNICODE case insensitive 'locale=zh';
then it can be executed on Windows, but it does not sort the characters in Chinese pinyin order.
I cannot build Firebird 3.0.1 under Windows, but I can build 2.5.6, so I debugged under 2.5.6. I found there is no bug in Firebird. The cause of the problem is that the zh locale was not included in the ICU data. I downloaded the data from the ICU website and replaced the file in 3.0.1, and it works!
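Once the collation is created, a quick way to verify that it really sorts by pinyin is something like the following (table name and sample data are made up for illustration):
CREATE TABLE t_names (name VARCHAR(20) CHARACTER SET UTF8 COLLATE PY);
INSERT INTO t_names VALUES ('张三');
INSERT INTO t_names VALUES ('李四');
INSERT INTO t_names VALUES ('王五');
-- expected pinyin order: 李四 (li), 王五 (wang), 张三 (zhang)
SELECT name FROM t_names ORDER BY name;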
I am using UTF8 as encoding for my Postgres 8.4.11 database:
CREATE DATABASE test
WITH OWNER = postgres
ENCODING = 'UTF8'
TABLESPACE = mydata
LC_COLLATE = 'de_DE.UTF-8'
LC_CTYPE = 'de_DE.UTF-8'
CONNECTION LIMIT = -1;
ALTER DATABASE test SET default_tablespace='mydata';
ALTER DATABASE test SET temp_tablespaces=mydata;
And the output of \l
test | postgres | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 |
When I try to insert a German character:
create table x(a text);
insert into x values('ä,ß,ö');
ERROR: invalid byte sequence for encoding "UTF8": 0xe42cdf
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
I am using PuTTY to connect. Any ideas?
The key element is the client_encoding - the encoding the server expects from your client. It has to match what is actually sent. What do you get for show client_encoding? Is it UNICODE?
Read more in the chapter Automatic Character Set Conversion Between Server and Client of the manual.
If you are using psql as the client, you can set client_encoding with \encoding. Check the encoding your local system uses (on Linux, type locale in the shell) and set a matching client_encoding in psql. You can avoid such complications if you use the same locale on your system as the encoding of your PostgreSQL server.
If you use PuTTY (on Windows), make sure to set its "Translation" accordingly. Have a look at Settings: Window - Translation; it must match client_encoding. You can right-click in a running session and choose Change Settings. You can also save these settings with your saved sessions.
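For example, if the local terminal (and PuTTY's Translation setting) is UTF-8, a minimal psql session might look like this (a sketch only, assuming the terminal really sends UTF-8):
\encoding UTF8
-- or equivalently: SET client_encoding TO 'UTF8';
insert into x values('ä,ß,ö');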