How to use Chinese characters in PyMySQL to create a table? - cjk

1. sqlite3
import sqlite3

con = sqlite3.connect("g:\\mytest1.db")
cur = con.cursor()
cur.execute('create table test (上市 TEXT)')
con.commit()
cur.close()
con.close()
This successfully creates a test table in mytest1.db, with the Chinese characters "上市" as a field name.
2. In the MySQL command console
C:\Users\root>mysql -uroot -p
Welcome to the MySQL monitor. Commands end with ; or \g.
mysql> create database mytest2;
Query OK, 1 row affected (0.00 sec)
mysql> use mytest2;
Database changed
mysql> set names "gb2312";
Query OK, 0 rows affected (0.00 sec)
mysql> create table stock(上市 TEXT) ;
Query OK, 0 rows affected (0.07 sec)
The conclusion: Chinese characters can be used in the MySQL console.
3. pymysql
code31
import pymysql

con = pymysql.connect(host='127.0.0.1', port=3306, user='root', passwd='******')
cur = con.cursor()
cur.execute("create database if not exists mytest31")
cur.execute("use mytest31")
cur.execute('set names "gb2312"')
cur.execute('create table stock(上市 TEXT)')
con.commit()
code32 is identical except for the database name, mytest32.
The same problem occurs in both:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 21-22: ordinal not in range(256)
4. mysql.connector
code 41
import mysql.connector

config = {'host': '127.0.0.1',
          'user': 'root',
          'password': '123456',
          'port': 3306,
          'charset': 'utf8'}
con = mysql.connector.connect(**config)
cur = con.cursor()
cur.execute("create database if not exists mytest41")
cur.execute("use mytest41")
cur.execute('set names "gb2312"')
sql = 'create table stock(上市 TEXT)'
cur.execute(sql)
code 42
import mysql.connector

config = {'host': '127.0.0.1',
          'user': 'root',
          'password': '******',
          'port': 3306,
          'charset': 'utf8'}
con = mysql.connector.connect(**config)
cur = con.cursor()
cur.execute("create database if not exists mytest42")
cur.execute("use mytest42")
cur.execute('set names "gb2312"')
# the .encode("utf-8") below is part of the string literal, so it is sent
# to MySQL verbatim rather than encoding anything
sql = 'create table stock(上市.encode("utf-8") TEXT)'
cur.execute(sql)
The same errors occur as in pymysql:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 22-23: ordinal not in range(256)
It is surely a bug in the Python MySQL modules that Chinese characters cannot be used as field names.
1. Chinese characters can be used as field names in the Python sqlite3 module.
2. Chinese characters can be used as field names in the MySQL console, but only after 'set names "gb2312"'.

pymysql.connect() accepts a charset argument. I have tested charset="utf8" and charset="gb2312", and both work (Python 3, PyMySQL 0.6.2). You don't need to use a "SET NAMES" query in this case.
import pymysql

con = pymysql.connect(host='127.0.0.1', port=3306,
                      user='root', passwd='******',
                      charset="utf8")
cur = con.cursor()
cur.execute("create database if not exists mytest31")
cur.execute("use mytest31")
cur.execute("create table stock(上市 TEXT)")
con.commit()
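As a quick usage check (a sketch assuming the same server and the mytest31 database created above; the sample value is made up), the new column behaves normally in parametrized inserts and selects:
import pymysql

con = pymysql.connect(host='127.0.0.1', port=3306,
                      user='root', passwd='******',
                      db='mytest31', charset='utf8')
cur = con.cursor()
# insert one row through a parametrized query, then read it back
cur.execute("insert into stock (上市) values (%s)", ("2014-06-19",))
con.commit()
cur.execute("select 上市 from stock")
print(cur.fetchall())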

You're encoding when you should be decoding. To convert a Chinese character to a unicode character, use:
"上市".decode("GB18030")
which is an encoding generally used for Chinese characters. latin-1 will not work, as most Chinese characters are not within its scope. The GB18030 encoding should work, but if not, there are a host of other encodings you can try, like gbk or big5_hkscs (generally for encodings done within HK/China).
Unicode errors are easy to spot: they show up as u'\ufffd' (which, when encoded, will be a diamond with a question mark in the middle).
I hope this was helpful!
Edit: I'm somewhat confused by your comment.
>>> print type("上市")
<type 'str'>
>>> print type("上市".decode("GB18030"))
<type 'unicode'>
str.decode() returns unicode.
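Note that the session above is Python 2, where str holds bytes. For comparison, a minimal sketch of the same check in Python 3, where str is already Unicode and bytes/str are distinct types:
# Python 3: str is Unicode text; .encode() produces bytes, .decode() reverses it
s = "上市"
print(type(s))               # <class 'str'>
b = s.encode("gb18030")      # bytes in the GB18030 encoding
print(type(b))               # <class 'bytes'>
print(b.decode("gb18030"))   # back to the original string '上市'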

Related

Psycopg2 execute_values sending all values as text

I have this table in postgres
CREATE TABLE target (
    a json,
    b integer,
    c text[],
    id integer,
    CONSTRAINT id_fkey FOREIGN KEY (id)
        REFERENCES public.other_table (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
)
Which I would like to insert data to from psycopg2 using
import psycopg2
import psycopg2.extras as extras

# data is of the form dict, integer, list(string), string <- used to get fkey id
data = [[extras.Json([{'a': 1, 'b': 2}, {'d': 3, 'e': 2}]), 1, ['hello', 'world'], 'ident1'],
        [extras.Json([{'a': 4, 'b': 3}, {'d': 1, 'e': 9}]), 5, ['hello2', 'world2'], 'ident2']]
# convert data to a list of tuples containing the objects
x = [tuple(u) for u in data]
# insert the data into the database
query = ('WITH ins (a, b, c, ident) AS '
         '(VALUES %s) '
         'INSERT INTO target (a, b, c, id) '
         'SELECT '
         'ins.a, '
         'ins.b, '
         'ins.c, '
         'other_table.id '
         'FROM '
         'ins '
         'LEFT JOIN other_table ON ins.ident = other_table.ident;')
cursor = conn.cursor()
extras.execute_values(cursor, query, x)
When I run this I get the error: column "a" is of type json but expression is of type text. I tried to solve this by adding a type cast in the SELECT statement, but then I got the same error for c and then for b.
Originally I thought the problem lay in the WITH statement, but based on the answers to my previous question (Postgres `WITH ins AS ...` casting everything as text) this seems not to be the case.
It seems that execute_values is sending all the values as text, wrapped in quotes.
Main Question: How can I get execute_values to send the values based on their python data type rather than just as text?
Sub questions:
How can I confirm that execute_values is in fact sending the values as text with quotation marks?
What is the purpose of the template argument of execute_values (https://www.psycopg.org/docs/extras.html), and could that be of help?
The issue, as Adrian Klaver points out in their comment, and also seen in this answer, is that the typing is lost in the CTE.
We can show this with an example in the psql shell:
CREATE TABLE test (col json);
WITH cte (c) AS (VALUES ('{"a": 1}'))
INSERT INTO test (col) SELECT c FROM cte;
resulting in
ERROR: column "col" is of type json but expression is of type text
whereas this version, with the type specified, succeeds:
WITH cte(c) AS (VALUES ('{"a": 1}'::json))
INSERT INTO test (col) SELECT c FROM cte;
We can mimic this in execute_values by providing the typing information in the template argument:
extras.execute_values(cursor, query, data, template='(%s::json, %s, %s, %s)')
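To answer the sub-question: template is the placeholder snippet that execute_values substitutes for every row of the argument list, so per-column casts can be embedded in it. A minimal self-contained sketch, assuming the target/other_table schema from the question and an illustrative connection string:
import psycopg2
import psycopg2.extras as extras

conn = psycopg2.connect("dbname=test")  # illustrative connection string
cur = conn.cursor()
rows = [(extras.Json([{'a': 1, 'b': 2}]), 1, ['hello', 'world'], 'ident1')]
query = ('WITH ins (a, b, c, ident) AS (VALUES %s) '
         'INSERT INTO target (a, b, c, id) '
         'SELECT ins.a, ins.b, ins.c, other_table.id '
         'FROM ins LEFT JOIN other_table ON ins.ident = other_table.ident')
# each row is rendered as (%s::json, %s, %s, %s), restoring the type
# information that is otherwise lost inside the VALUES list of the CTE
extras.execute_values(cur, query, rows, template='(%s::json, %s, %s, %s)')
conn.commit()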

How to export a file in Db2 without a character delimiter or a column delimiter?

I would like to export a table formatted the same way the output of a normal select looks. I mean:
db2 -x "select varchar(SCHEMANAME, 16) SCHEMANAME,
varchar(OWNER, 10) OWNER,
varchar(OWNERTYPE, 10) OWNERTYPE
from syscat.schemata where SCHEMANAME like 'SYS%'"
And the output is:
SYSCAT           SYSIBM     S
SYSFUN           SYSIBM     S
SYSIBM           SYSIBM     S
SYSIBMADM        SYSIBM     S
SYSIBMINTERNAL   SYSIBM     S
SYSIBMTS         SYSIBM     S
SYSPROC          SYSIBM     S
SYSPUBLIC        SYSIBM     S
SYSSTAT          SYSIBM     S
SYSTOOLS         SYSIBM     S
I would like to generate the same via an export (fixed-length columns). I have tried:
db2 "export to myfile.csv of del
modified by coldelX20
SELECT *
from syscat.schemata"
db2 "export to myfile.csv of del
modified by nochardel coldelX20
SELECT *
from syscat.schemata"
db2 "export to myfile.csv of del
modified by chardelX21 coldelX20
SELECT *
from syscat.schemata"
And I got:
SQL3017N A delimiter is not valid or is used more than once.
(Redirecting the output of a normal select is not an option).
Delimiter considerations for moving data:
Delimiter restrictions
There are a number of restrictions in place that help prevent the
chosen delimiter character from being treated as a part of the data
being moved. First, delimiters are mutually exclusive. Second, a
delimiter cannot be binary zero, a line-feed character, a
carriage-return, or a blank space. As well, the default decimal point
(.) cannot be a string delimiter. Finally, in a DBCS environment, the
pipe (|) character delimiter is not supported.
The Db2 export command doesn't support a space as a delimiter. Moreover, it doesn't right-pad values with spaces to the maximum column length in characters.
You can construct a single column expression yourself, like below:
select
    char(schemaname, 20)
    -- ...
    || ' ' || char(create_time)
    -- ...
    || ' ' || char(coalesce(char(auditpolicyid), '-'), 11)
    -- ...
from syscat.schemata;
Try this:
export to myfile.csv of del
modified by nochardel
select rpad(left(SCHEMANAME, 16), 16) ||
       rpad(left(OWNER, 10), 10) ||
       rpad(left(OWNERTYPE, 10), 10)
from syscat.schemata

Date format with Quickpart Database in Word

I am trying to change the date format in a Quick Part DATABASE field.
The format is American (mm/d/yyyy), but I want to change it to the French format (dd.MM.yyyy).
This is my code:
DATABASE \d "C:\Users\taagede1\Dropbox\Samaritains\Soldes et
indemnités\2018\Total soldes.xlsx" \c
"Provider=Microsoft.ACE.OLEDB.12.0;User ID=Admin;Data
Source=C:\Users\taagede1\Dropbox\Samaritains\Soldes et
indemnités\2018\Total soldes.xlsx;Mode=Read;Extended
Properties=\"HDR=YES;IMEX=1;\";Jet OLEDB:System database=\"\";Jet
OLEDB:Registry Path=\"\";Jet OLEDB:Engine Type=37;Jet OLEDB:Database
Locking Mode=0;Jet OLEDB:Global Partial Bulk Ops=2;Jet OLEDB:Global
Bulk Transactions=1;Jet OLEDB:New Database Password=\"\";Jet
OLEDB:Create System Database=False;Jet OLEDB:Encrypt
Database=False;Jet OLEDB:Don't Copy Locale on Compact=False;Jet
OLEDB:Compact Without Replica Repair=False;Jet OLEDB:SFP=False;Jet
OLEDB:Support Complex Data=False;Jet OLEDB:Bypass UserInfo
Validation=False;Jet OLEDB:Limited DB Caching=False;Jet OLEDB:Bypass
ChoiceField Validation=False" \s "SELECT Quoi, Date , Heure
Début, Heure Fin, Total FROM Engagements$ WHERE ((NomPrenom =
'AubortLoic') AND (Payé IS NULL )) ORDER BY Date" \l "26" \b "191"
\h
This is the result: [screenshot]
I have tried to add this:
{ DATABASE [\# "dd.MM.yyyy"] \* MERGEFORMAT }
But I get a very ugly result (all buggy).
The OLEDB driver for Excel (and Access; it's the same one) supports a limited number of functions that can be used on the data via the SELECT query, among them Format. It's similar, but not identical, to the VBA function of the same name.
In my test, the following SELECT phrase worked (extracted from the DATABASE field code for better visibility):
\s "SELECT Quoi, Format([Date], 'dd.MM.yyyy') AS FrDate, Heure
Début, Heure Fin, Total FROM Engagements$ WHERE ((NomPrenom = 'AubortLoic') AND (Payé IS NULL )) ORDER BY Date
Note that the date format is in single, not double, quotes. You can use anything for the alias (the column header) except another field name, so it can't be Date if that's the field name in the data source. It could be Le Date, but in that case, due to the space, it would have to be in square brackets: [Le Date].

Psycopg2 copy_from throws DataError: invalid input syntax for integer

I have a table with some integer columns. I am using psycopg2's copy_from:
import psycopg2

conn = psycopg2.connect(database=the_database,
                        user="postgres",
                        password=PASSWORD,
                        host="",
                        port="")
print "Putting data in the table: Opened database successfully"
cur = conn.cursor()
with open(the_file, 'r') as f:
    cur.copy_from(file=f, table=the_table, sep=the_delimiter)
conn.commit()
print "Successfully copied all data to the database!"
conn.close()
The error says that it expects the 8th column to be an integer and not a string. But Python's write method can only write strings to a file. So how would you import a file full of string representations of numbers into a Postgres table with integer columns, when your file can only contain character representations of the integers (e.g. str(your_number))?
Either you have to write the numbers to the file in an integer format (which Python's write method disallows), or psycopg2 should be smart enough to do the conversion as part of the copy_from procedure, which it apparently is not. Any idea is appreciated.
I ended up using the copy_expert command. Note that on Windows, you have to set the permissions of the file. This post is very useful for setting permissions.
with open(the_file, 'r') as f:
    sql_copy_statement = (
        "copy {table} FROM '{from_file}' "
        "DELIMITER '{deli}' {file_type} HEADER;"
    ).format(table=the_table,
             from_file=the_file,
             deli=the_delimiter,
             file_type=the_file_type)
    print sql_copy_statement
    cur.copy_expert(sql_copy_statement, f)
conn.commit()
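For reference, a sketch of the client-side variant (table and file names are illustrative): with COPY ... FROM STDIN, psycopg2 streams the file object itself, so the server never needs read permission on the file, which sidesteps the Windows permission issue:
import psycopg2

conn = psycopg2.connect(database="mydb", user="postgres")  # illustrative
cur = conn.cursor()
with open("data.csv", "r") as f:
    # psycopg2 reads the file and streams it to the server; PostgreSQL
    # parses '7' in an integer column as the integer 7
    cur.copy_expert("COPY my_table FROM STDIN WITH (FORMAT csv, HEADER)", f)
conn.commit()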

PostgreSQL, perl and dojo special character issue (æ,ø and å)

I have a webpage made in Perl and Dojo using a PostgreSQL database. I have to search for available people in the database, and since I'm from Denmark the letters æ, ø and å have to work in the search. I thought this was standard when using UTF8; since I normally program in PHP against MySQL, I didn't think it would be that hard.
I have properly done every trick I know to convert this search_word to the right encoding so I can search the PostgreSQL database for the correct names with æ, ø and å... but it still fails.
My Perl code makes the fetch, but the fetch returns 0 rows, whereas when I insert the same command in the psql terminal I get 46 rows returned (copying the STDERR statement from a "tail -f log" terminal and pasting it into another terminal connected to the database through the psql command). The Perl code is:
sub dbSearchPersons {
    my $search_word = escapeSql($_[0]);
    $search_word = Encode::decode_utf8($search_word);
    $statement = "SELECT id,name,initials,email FROM person WHERE name ilike '\%" . $search_word . "\%' OR email ilike '\%" . $search_word . "\%' OR initials ilike '\%" . $search_word . "\%' ORDER BY name ASC";
    $sth = $dbh->prepare($statement);
    $num_rows = $sth->execute();
    print STDERR "Statement: " . $statement;
    if ($num_rows > 0) {
        $persons = $dbh->selectall_hashref($statement, 'id');
    }
    dbFinish($sth);
    webdie($DBI::errstr) if ($DBI::errstr);
}
As you can see, I write the SQL statement to STDERR, which outputs the following:
[Fri Apr 27 11:24:26 2012] [error] [client 10.254.0.1] Statement: SELECT id,name,initials,email FROM person WHERE name ilike '%Jørgen%' OR email ilike '%Jørgen%' OR initials ilike '%Jørgen%' ORDER BY name ASC, referer: https://xx.xxx.xxx.xx/cgi-bin/users.cgi
The SQL is correctly written (as I can see from the terminal output above), and if I copy and paste the statement from the terminal directly into the psql terminal, I get 46 rows returned, as I should... But the Perl code still won't return any rows.
I don't get it. When I format the string to display "ø" and not "ø" (which is what Perl translates the UTF8 encoding to, from the "J%C3%B8rgen" that gets sent through dojo.xhr.post), should I not be able to use it in a SQL statement? Is it because the PostgreSQL database can have a certain encoding that I have to take into account somehow? Or could it be something completely different?
I hope someone can help me. I have been struggling with this problem for two days now, and since everything looks like it should work but doesn't, I am getting a little sad :/
Regards,
Thor Astrup Pedersen
You probably forgot pg_enable_utf8. The database interface will then return Perl character data to you.
$ createdb -e -E UTF-8 -l en_US.UTF-8 -T template0 so10349280
CREATE DATABASE so10349280 ENCODING 'UTF-8' TEMPLATE template0 LC_COLLATE 'en_US.UTF-8' LC_CTYPE 'en_US.UTF-8';
$ echo 'create table person (id int, name varchar, initials varchar, email varchar)'|psql so10349280
CREATE TABLE
$ echo "insert into person (id, name) values (1, 'Jørgensen')"|psql so10349280
INSERT 0 1
$ echo 'select * from person'|psql so10349280
 id |   name    | initials | email
----+-----------+----------+-------
  1 | Jørgensen |          |
$ perl -Mutf8 -Mstrictures -MDBI -MDevel::Peek -E'
my $dbh = DBI->connect(
"DBI:Pg:dbname=so10349280", $ENV{LOGNAME}, "", { RaiseError => 1, AutoCommit => 1, pg_enable_utf8 => 1}
);
my $r = $dbh->selectall_hashref("select * from person where name = ?", "id", undef, "Jørgensen");
Dump $r->{1}{name};
'
SV = PV(0x836e20) at 0xa58dc8
REFCNT = 1
FLAGS = (POK,pPOK,UTF8)
PV = 0xa5a000 "J\303\270rgensen"\0 [UTF8 "J\x{f8}rgensen"]
CUR = 10
LEN = 16
You don't say it quite clearly, but I think you eventually intend to send the character data out as JSON for use with Dojo. You need to encode it into UTF-8 octets; the various JSON libraries take care of that automatically for you, so there is no need to invoke Encode functions manually.