psycopg2 does not insert unicode data - postgresql

I have a script that takes data from one database and, based on the table and field names, copies it into another database.
The problem is Unicode data: I need to insert some words in Russian, but every time psycopg2 writes them as if they were default (non-Unicode) strings.
import psycopg2
import psycopg2.extensions
conn_two = psycopg2.connect(user="postgres", password="password", host = "localhost", port= "5432", dbname = "base2")
cur_2 = conn_two.cursor()
sql = 'INSERT INTO {} ({}) VALUES {};'.format('"tb_names"', '"num", "name", "district"', (23, 'Рынок', 'Волжский'))
cur_2.execute(sql)
conn_two.commit()
Here is what the result looks like in pgAdmin4:
I also tried registering the Unicode extensions and inserting the data as unicode, but in that case I get an error:
import psycopg2
import psycopg2.extensions
psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
psycopg2.extensions.register_type(psycopg2.extensions.UNICODEARRAY)
conn_two = psycopg2.connect(user="postgres", password="password", host = "localhost", port= "5432", dbname = "base2")
conn_two.set_client_encoding("utf-8")
conn_two.set_client_encoding('UNICODE')
cur_2 = conn_two.cursor()
sql = 'INSERT INTO {} ({}) VALUES {};'.format('"tb_names"', '"num", "name", "district"', (23, u'Рынок', u'Волжский'))
cur_2.execute(sql)
conn_two.commit()
Traceback (most recent call last):
  File "D:\_Scripts\pgadmin.py", line <>, in <module>
    cur_2.execute(sql)
psycopg2.ProgrammingError: ОШИБКА: тип "u" не существует # - says that type "u" does not exist
LINE 1: ...ing_ex" ("num", "name", "district") VALUES (23, u'\u0420\u...
^
What should be done here?

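Don't build your SQL string with the values baked in (using string formatting or concatenation). Formatting a Python tuple into the string inserts its repr, which is how the u'...' prefix ended up in the statement in your second attempt.
Instead, use %s in the SQL string as placeholders, then pass your values to the .execute method as a separate argument.
If you're on Python 2.x, non-ASCII strings should be passed as unicode objects.
For example: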
Python 2.x
sql = 'INSERT INTO "tb_names" ("num", "name", "district") VALUES (%s, %s, %s);'
cur_2.execute(sql, (23, u'Рынок', u'Волжский'))
Python 3.x
sql = 'INSERT INTO "tb_names" ("num", "name", "district") VALUES (%s, %s, %s);'
cur_2.execute(sql, (23, 'Рынок', 'Волжский'))
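Since the script in the question builds its statements from table and field names, the same idea can be sketched as a small helper that formats only the identifiers and leaves %s placeholders for the values. This is a minimal sketch, not psycopg2 API: build_insert and quote_ident are hypothetical names, and the quoting follows the standard SQL rule of doubling embedded double quotes. psycopg2 2.7 and later also ship a psycopg2.sql module intended for composing identifiers.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling embedded double quotes (hypothetical helper)."""
    return '"' + name.replace('"', '""') + '"'


def build_insert(table, columns):
    """Build an INSERT with one %s placeholder per column (hypothetical helper).

    Only identifiers are formatted into the string; the values are passed
    separately to cursor.execute() so the driver can adapt them.
    """
    column_list = ", ".join(quote_ident(c) for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return "INSERT INTO {} ({}) VALUES ({});".format(
        quote_ident(table), column_list, placeholders
    )


sql = build_insert("tb_names", ["num", "name", "district"])
```

With the cursor from the question, this would be run as cur_2.execute(sql, (23, 'Рынок', 'Волжский')).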

Related

How to get data out of a postgres bytea column into a python variable using sqlalchemy?

I am working with the script below.
If I change the script so I avoid the bytea datatype, I can easily copy data from my postgres table into a python variable.
But if the data is in a bytea postgres column, I encounter a strange object called memory which confuses me.
Here is the script which I run against anaconda python 3.5.2:
# bytea.py
import sqlalchemy
# I should create a conn
db_s = 'postgres://dan:dan@127.0.0.1/dan'
conn = sqlalchemy.create_engine(db_s).connect()
sql_s = "drop table if exists dropme"
conn.execute(sql_s)
sql_s = "create table dropme(c1 bytea)"
conn.execute(sql_s)
sql_s = "insert into dropme(c1)values( cast('hello' AS bytea) );"
conn.execute(sql_s)
sql_s = "select c1 from dropme limit 1"
result = conn.execute(sql_s)
print(result)
# <sqlalchemy.engine.result.ResultProxy object at 0x7fcbccdade80>
for row in result:
    print(row['c1'])
# <memory at 0x7f4c125a6c48>
How do I get the data that is inside <memory at 0x7f4c125a6c48>?
You can convert it with Python's bytes():
for row in result:
    print(bytes(row['c1']))
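The "memory" object is a Python memoryview, which is what psycopg2 returns for bytea columns on Python 3. A minimal sketch of the conversion without a database, using a memoryview built by hand as a stand-in for row['c1'] (that stand-in is the only assumption here):

```python
# Stand-in for the value the driver returns for a bytea column.
value = memoryview(b"hello")

raw = bytes(value)          # copy the buffer into an immutable bytes object
same = value.tobytes()      # equivalent method on the memoryview itself
text = raw.decode("utf-8")  # only meaningful if the column holds UTF-8 text
```

Decoding to str is optional; bytea can hold arbitrary binary data, in which case the bytes object is the final result.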

could not determine data type of parameter $1, while executing COPY..TO statement

I want to write the content of a table from the database into CSV file using JDBC prepared statement. The PSQL query that I am using is:
COPY (select * from file where uploaded_at > ?) TO '/tmp/file_info.csv' With DELIMITER ',' CSV HEADER;
My code is as follows:
private void copyData(Connection conn, Date date){
    String sql = "COPY (select * from file where uploaded_at< ?) TO '/tmp/file_info.csv' With DELIMITER ',' CSV HEADER ";
    PreparedStatement stmt = conn.prepareStatement(sql);
    stmt.setTimestamp(1, new Timestamp(date.getTime()));
    stmt.execute();
}
I get the following stack when I run this query:
org.postgresql.util.PSQLException: ERROR: could not determine data type of parameter $1
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2103)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1836)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:512)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:381)
at tests.com.paramatrix.pace.archival.file.TestFileArchival.main(TestFileArchival.java:63)
This exception is not generated while running a simple select query like:
select * from file where uploaded_at > ?
What could be the reason for it not working with COPY..TO statement?
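You can't parameterize COPY statements: PostgreSQL accepts bind parameters only in regular query statements (SELECT, INSERT, UPDATE, DELETE, VALUES), not in utility commands such as COPY. You'll have to build the statement with string interpolation or use PgJDBC's CopyManager.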
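The string-interpolation route can be sketched as follows, in Python for brevity (the same string building applies in Java). The function name build_copy_statement is hypothetical. The timestamp comes from a datetime object rather than free text, so its formatted form contains only digits and punctuation; the path has its single quotes doubled, assuming standard_conforming_strings is on.

```python
from datetime import datetime


def build_copy_statement(cutoff, out_path):
    """Return a COPY ... TO statement with the values written in as literals.

    Hypothetical helper: cutoff must be a datetime, never user-supplied text.
    """
    ts_literal = cutoff.isoformat(sep=" ")
    path_literal = out_path.replace("'", "''")
    return (
        "COPY (SELECT * FROM file WHERE uploaded_at < TIMESTAMP '{ts}') "
        "TO '{path}' WITH DELIMITER ',' CSV HEADER"
    ).format(ts=ts_literal, path=path_literal)


statement = build_copy_statement(datetime(2016, 1, 2, 3, 4, 5), "/tmp/file_info.csv")
```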

passing python variable to sql statement psycopg2 pandas

I am trying to replace a piece of SQL code with a Python variable that the user supplies through raw_input.
The code below works fine if I hard-code the value (i.e. type 344 directly into the SQL), but it does not work when I try to pass mypythonvariable into the query.
The result of the query is then loaded into a pandas DataFrame for further processing.
Any help on how to do this would be appreciated.
UPDATE: I just added the %s placeholder to the statement and I'm now getting the error message "not all arguments converted during string formatting".
conn = pg.connect(host = "localhost",
                  port = 1234,
                  dbname = "somename",
                  user = "user",
                  password = "pswd")
mypythonvariable = raw_input("What is your variable number? ")
sql = """
SELECT
somestuff
FROM
sometable
WHERE
something = %s
"""
df = pd.read_sql_query(sql, con=conn,params=mypythonvariable)
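Thanks to all who looked.
I found the solution. It looks like the params need to be passed as a list. A bare string is itself a sequence, so each of its characters is most likely treated as a separate parameter, which gives more arguments than there are %s placeholders.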
conn = pg.connect(host = "localhost",
                  port = 1234,
                  dbname = "somename",
                  user = "user",
                  password = "pswd")
mypythonvariable = raw_input("What is your variable number? ")
sql = """
SELECT
somestuff
FROM
sometable
WHERE
something = %s
"""
df = pd.read_sql_query(sql, con=conn,params=[mypythonvariable])
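The string-versus-list difference can be reproduced without a PostgreSQL server. The sketch below uses the standard library's sqlite3 module as a stand-in (an assumption made so the example runs anywhere); note that sqlite3 uses ? as its placeholder where psycopg2 uses %s, and that its error message is worded differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (something TEXT, somestuff TEXT)")
conn.execute("INSERT INTO sometable VALUES ('344', 'found it')")

sql = "SELECT somestuff FROM sometable WHERE something = ?"
user_value = "344"

# Passing the bare string: the driver sees a sequence of three characters
# for a statement that has a single placeholder.
try:
    conn.execute(sql, user_value)
    string_error = None
except sqlite3.ProgrammingError as exc:
    string_error = str(exc)

# Wrapping the value in a list supplies exactly one parameter.
rows = conn.execute(sql, [user_value]).fetchall()
```

A one-element tuple, (user_value,), works the same way as the list.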

Psycopg2 copy_from throws DataError: invalid input syntax for integer

I have a table with some integer columns. I am using psycopg2's copy_from
conn = psycopg2.connect(database=the_database,
user="postgres",
password=PASSWORD,
host="",
port="")
print "Putting data in the table: Opened database successfully"
cur = conn.cursor()
with open(the_file, 'r') as f:
    cur.copy_from(file=f, table = the_table, sep=the_delimiter)
conn.commit()
print "Successfully copied all data to the database!"
conn.close()
The error says that it expects the 8th column to be an integer and not a string. But Python's write method can only write strings to a file. So how do you import a file full of string representations of numbers into a Postgres table whose columns expect integers, when the file can only contain the character representation of each integer (e.g. str(your_number))?
Either you have to write the numbers to the file in integer format (which Python's write method does not allow), or psycopg2 should be smart enough to do the conversion as part of copy_from, which it apparently is not. Any ideas are appreciated.
I ended up using the copy_expert command instead. Note that on Windows you have to set the permissions on the file; this post is very useful for setting the permissions.
with open(the_file, 'r') as f:
    sql_copy_statement = "copy {table} FROM '{from_file}' DELIMITER '{deli}' {file_type} HEADER;".format(
        table=the_table,
        from_file=the_file,
        deli=the_delimiter,
        file_type=the_file_type)
    print sql_copy_statement
    cur.copy_expert(sql_copy_statement, f)
    conn.commit()
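For what it is worth, COPY always reads text, and PostgreSQL converts each field with the column type's input function, so a file of digit strings is what copy_from expects. An "invalid input syntax for integer" error more often means that a non-numeric field reached an integer column, for instance a header line (which the HEADER option above skips) or a mismatched delimiter. That diagnosis is an assumption, since the data file is not shown. A minimal sketch of preparing rows in memory for copy_from follows; rows_to_copy_buffer is a hypothetical helper and does not escape tabs, newlines, or backslashes inside values.

```python
import io


def rows_to_copy_buffer(rows, sep="\t"):
    """Serialize rows to the text format COPY reads (hypothetical helper).

    Integers are written as plain digit strings; None becomes \\N, the
    default NULL marker of COPY's text format.
    """
    buf = io.StringIO()
    for row in rows:
        fields = [r"\N" if value is None else str(value) for value in row]
        buf.write(sep.join(fields) + "\n")
    buf.seek(0)  # rewind so the consumer reads from the start
    return buf


buf = rows_to_copy_buffer([(1, "alpha", 42), (2, "beta", None)])
```

The buffer could then be passed as cur.copy_from(buf, the_table, sep="\t"), with no header line in the data.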

how to use chinese character in pymysql to create table?

1. sqlite3
import sqlite3
con=sqlite3.connect("g:\\mytest1.db")
cur=con.cursor()
cur.execute('create table test (上市 TEXT)')
con.commit()
cur.close()
con.close()
This successfully creates a table named test in mytest1.db, with a field whose name is the Chinese characters "上市".
2. In the MySQL command console
C:\Users\root>mysql -uroot -p
Welcome to the MySQL monitor. Commands end with ; or \g.
mysql> create database mytest2;
Query OK, 1 row affected (0.00 sec)
mysql> use mytest2;
Database changed
mysql> set names "gb2312";
Query OK, 0 rows affected (0.00 sec)
mysql> create table stock(上市 TEXT) ;
Query OK, 0 rows affected (0.07 sec)
The conclusion: Chinese characters can be used in the MySQL console.
3. pymysql
code31
import pymysql
con = pymysql.connect(host='127.0.0.1', port=3306, user='root', passwd='******')
cur=con.cursor()
cur.execute("create database if not exists mytest31")
cur.execute("use mytest31")
cur.execute('set names "gb2312" ')
cur.execute('create table stock(上市 TEXT) ')
con.commit()
code32
import pymysql
con = pymysql.connect(host='127.0.0.1', port=3306, user='root', passwd='******')
cur=con.cursor()
cur.execute("create database if not exists mytest32")
cur.execute("use mytest32")
cur.execute('set names "gb2312" ')
cur.execute('create table stock(上市 TEXT) ')
con.commit()
Both versions fail with the same error:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 21-22: ordinal not in range(256)
4. mysql.connector
code 41
import mysql.connector
config={'host':'127.0.0.1',
        'user':'root',
        'password':'123456',
        'port':3306 ,
        'charset':'utf8'
        }
con=mysql.connector.connect(**config)
cur=con.cursor()
cur.execute("create database if not exists mytest41")
cur.execute("use mytest41")
cur.execute('set names "gb2312" ')
str='create table stock(上市 TEXT)'
cur.execute(str)
code 42
import mysql.connector
config={'host':'127.0.0.1',
        'user':'root',
        'password':'******',
        'port':3306 ,
        'charset':'utf8'
        }
con=mysql.connector.connect(**config)
cur=con.cursor()
cur.execute("create database if not exists mytest42")
cur.execute("use mytest42")
cur.execute('set names "gb2312" ')
str='create table stock(上市.encode("utf-8") TEXT)'
cur.execute(str)
This gives the same error as in pymysql:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 22-23: ordinal not in range(256)
This must be a bug in the Python MySQL modules: Chinese characters cannot be used in field names.
1. Chinese characters can be used in field names in the Python sqlite3 module.
2. Chinese characters can be used in field names in the MySQL console, but only after running set names "gb2312".
pymysql.connect() accepts a charset argument. I have tested charset="utf8" and charset="gb2312" and both work (Python 3, PyMySQL 0.6.2). You don't need a "SET NAMES" query in this case.
import pymysql
con = pymysql.connect(host='127.0.0.1', port=3306,
                      user='root', passwd='******',
                      charset="utf8")
cur = con.cursor()
cur.execute("create database if not exists mytest31")
cur.execute("use mytest31")
cur.execute("create table stock(上市 TEXT)")
con.commit()
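The error in the question is consistent with the statement being encoded as latin-1 before it is sent, which is presumably what the driver falls back to when no charset is given (an assumption based on the error message). A minimal sketch of the difference, with no database involved:

```python
statement = "create table stock(上市 TEXT)"

# latin-1 covers only code points 0-255, so the Chinese characters fail.
try:
    statement.encode("latin-1")
    latin1_error = None
except UnicodeEncodeError as exc:
    latin1_error = exc

# Both charsets tested above can represent the statement.
utf8_bytes = statement.encode("utf-8")
gb2312_bytes = statement.encode("gb2312")
```

Either encoding works as long as the charset passed to connect() matches what the server is told to expect.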
You're encoding when you should be decoding. In Python 2, to convert a byte string containing Chinese characters into a unicode object, use:
"上市".decode("GB18030")
GB18030 is an encoding commonly used for Chinese characters. latin-1 will not work because Chinese characters fall outside its range. GB18030 should work, but if it does not, there are a host of other encodings you can try, such as gbk or big5_hkscs (generally used for text encoded in HK/China).
Decoding problems are easy to spot: they show up as u'\ufffd', the replacement character, which is displayed as a diamond with a question mark in the middle.
I hope this was helpful!
Edit: I'm somewhat confused by your comment.
>>> print type("上市")
<type 'str'>
>>> print type("上市".decode("GB18030"))
<type 'unicode'>
str.decode() returns unicode.