how to link python pandas dataframe to mysqlconnector '%s' value - mysql-connector

I am trying to pipe a webscraped pandas dataframe into a MySql table with mysql.connector but I can't seem to link df values to the %s variable. The connection is good (I can add individual rows) but it just returns errors when I replace the value witht he %s.
cnx = mysql.connector.connect(host = 'ip', user = 'user', passwd = 'pass', database = 'db')
cursor = cnx.cursor()
insert_df = ("""INSERT INTO table"
"(page_1, date_1, record_1, task_1)"
"VALUES ('%s','%s','%s','%s')""")
cursor.executemany(insert_df, df)
cnx.commit()
cnx.close()
This returns "ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."
If I add any additional oiperations it returns "ProgrammingError: Parameters for query must be an Iterable."
I am very new to this so any help is appreciated

Work around for me was to redo my whole process. I ran sqlalchemy, all the documentation makes this very easy. message if you want the code I used.

Related

Try to update postgresql database using Doobie but no update happening

I'm trying to update table in postgresql database passing dynamic value using doobie functional JDBC while executing sql statement getting below error.Any help will be appreciable.
Code
Working code
sql"""UPDATE layout_lll
|SET runtime_params = 'testing string'
|WHERE run_id = '123-ksdjf-oreiwlds-9dadssls-kolb'
|""".stripMargin.update.quick.unsafeRunSync
Not working code
val abcRunTimeParams="testing string"
val runID="123-ksdjf-oreiwlds-9dadssls-kolb"
sql"""UPDATE layout_lll
|SET runtime_params = '${abcRunTimeParams}'
|WHERE run_id = '$runID'
|""".stripMargin.update.quick.unsafeRunSync
Error
Exception in thread "main" org.postgresql.util.PSQLException: The column index is out of range: 3, number of columns: 2.
Remove the ' quotes - Doobie make sure they aren't needed. Doobie (and virtually any other DB library) uses parametrized queries, like:
UPDATE layout_lll
SET runtime_params = ?
WHERE run_id = ?
where ? will be replaced by parameters passes later on. This:
makes SQL injection impossible
helps spotting errors in SQL syntax
When you want to pass parameter, the ' is part of the value passed, not part of the parametrized query. And Doobie (or JDBC driver) will "add" it for you. The variables you pass there are processed by Doobie, they aren't just pasted there like in normal string interpolation.
TL;DR Try running
val abcRunTimeParams="testing string"
val runID="123-ksdjf-oreiwlds-9dadssls-kolb"
sql"""UPDATE layout_lll
|SET runtime_params = ${abcRunTimeParams}
|WHERE run_id = $runID
|""".stripMargin.update.quick.unsafeRunSync

Python psycopg2 - using .format() with a dbname inside a string

I'm using psycopg2 to query a database that starts with a number + ".district", so my code goes like:
number = 2345
cur = conn.cursor()
myquery = """ SELECT *
FROM {0}.districts
;""".format(number)
cur.execute("""{0};""".format(query))
data = cur.fetchall()
conn.close()
And i keep receiving the following psycopg2 error..
psycopg2.ProgrammingError: syntax error at or near "2345."
LINE 1: SELECT * FROM 2345.districts...
Thought it was the type of data the problem, maybe int(number) or str(number)..but no, same error appears.
¿ What am i doing wrong ?
The way you are trying to use to pass parameters is not supported. Please read the docs.

How to get data out of a postgres bytea column into a python variable using sqlalchemy?

I am working with the script below.
If I change the script so I avoid the bytea datatype, I can easily copy data from my postgres table into a python variable.
But if the data is in a bytea postgres column, I encounter a strange object called memory which confuses me.
Here is the script which I run against anaconda python 3.5.2:
# bytea.py
import sqlalchemy
# I should create a conn
db_s = 'postgres://dan:dan#127.0.0.1/dan'
conn = sqlalchemy.create_engine(db_s).connect()
sql_s = "drop table if exists dropme"
conn.execute(sql_s)
sql_s = "create table dropme(c1 bytea)"
conn.execute(sql_s)
sql_s = "insert into dropme(c1)values( cast('hello' AS bytea) );"
conn.execute(sql_s)
sql_s = "select c1 from dropme limit 1"
result = conn.execute(sql_s)
print(result)
# <sqlalchemy.engine.result.ResultProxy object at 0x7fcbccdade80>
for row in result:
print(row['c1'])
# <memory at 0x7f4c125a6c48>
How to get the data which is inside of memory at 0x7f4c125a6c48 ?
You can cast it use python bytes()
for row in result:
print(bytes(row['c1']))

Psycopg2 insert python dictionary in postgres database

In python 3+, I want to insert values from a dictionary (or pandas dataframe) into a database. I have opted for psycopg2 with a postgres database.
The problems is that I cannot figure out the proper way to do this. I can easily concatenate a SQL string to execute, but the psycopg2 documentation explicitly warns against this. Ideally I wanted to do something like this:
cur.execute("INSERT INTO table VALUES (%s);", dict_data)
and hoped that the execute could figure out that the keys of the dict matches the columns in the table. This did not work. From the examples of the psycopg2 documentation I got to this approach
cur.execute("INSERT INTO table (" + ", ".join(dict_data.keys()) + ") VALUES (" + ", ".join(["%s" for pair in dict_data]) + ");", dict_data)
from which I get a
TypeError: 'dict' object does not support indexing
What is the most phytonic way of inserting a dictionary into a table with matching column names?
Two solutions:
d = {'k1': 'v1', 'k2': 'v2'}
insert = 'insert into table (%s) values %s'
l = [(c, v) for c, v in d.items()]
columns = ','.join([t[0] for t in l])
values = tuple([t[1] for t in l])
cursor = conn.cursor()
print cursor.mogrify(insert, ([AsIs(columns)] + [values]))
keys = d.keys()
columns = ','.join(keys)
values = ','.join(['%({})s'.format(k) for k in keys])
insert = 'insert into table ({0}) values ({1})'.format(columns, values)
print cursor.mogrify(insert, d)
Output:
insert into table (k2,k1) values ('v2', 'v1')
insert into table (k2,k1) values ('v2','v1')
I sometimes run into this issue, especially with respect to JSON data, which I naturally want to deal with as a dict. Very similar. . .But maybe a little more readable?
def do_insert(rec: dict):
cols = rec.keys()
cols_str = ','.join(cols)
vals = [ rec[k] for k in cols ]
vals_str = ','.join( ['%s' for i in range(len(vals))] )
sql_str = """INSERT INTO some_table ({}) VALUES ({})""".format(cols_str, vals_str)
cur.execute(sql_str, vals)
I typically call this type of thing from inside an iterator, and usually wrapped in a try/except. Either the cursor (cur) is already defined in an outer scope or one can amend the function signature and pass a cursor instance in. I rarely insert just a single row. . .And like the other solutions, this allows for missing cols/values provided the underlying schema allows for it too. As long as the dict underlying the keys view is not modified as the insert is taking place, there's no need to specify keys by name as the values will be ordered as they are in the keys view.
[Suggested answer/workaround - better answers are appreciated!]
After some trial/error I got the following to work:
sql = "INSERT INTO table (" + ", ".join(dict_data.keys()) + ") VALUES (" + ", ".join(["%("+k+")s" for k in dict_data]) + ");"
This gives the sql string
"INSERT INTO table (k1, k2, ... , kn) VALUES (%(k1)s, %(k2)s, ... , %(kn)s);"
which may be executed by
with psycopg2.connect(database='deepenergy') as con:
with con.cursor() as cur:
cur.execute(sql, dict_data)
Post/cons?
using %(name)s placeholders may solve the problem:
dict_data = {'key1':val1, 'key2':val2}
cur.execute("""INSERT INTO table (field1, field2)
VALUES (%(key1)s, %(key2)s);""",
dict_data)
you can find the usage in psycopg2 doc Passing parameters to SQL queries
Here is another solution inserting a dictionary directly
Product Model (has the following database columns)
name
description
price
image
digital - (defaults to False)
quantity
created_at - (defaults to current date)
Solution:
data = {
"name": "product_name",
"description": "product_description",
"price": 1,
"image": "https",
"quantity": 2,
}
cur = conn.cursor()
cur.execute(
"INSERT INTO products (name,description,price,image,quantity) "
"VALUES(%(name)s, %(description)s, %(price)s, %(image)s, %(quantity)s)", data
)
conn.commit()
conn.close()
Note: The columns to be inserted is specified on the execute statement .. INTO products (column names to be filled) VALUES ..., data <- the dictionary (should be the same **ORDER** of keys)

Psycopg2 copy_from throws DataError: invalid input syntax for integer

I have a table with some integer columns. I am using psycopg2's copy_from
conn = psycopg2.connect(database=the_database,
user="postgres",
password=PASSWORD,
host="",
port="")
print "Putting data in the table: Opened database successfully"
cur = conn.cursor()
with open(the_file, 'r') as f:
cur.copy_from(file=f, table = the_table, sep=the_delimiter)
conn.commit()
print "Successfully copied all data to the database!"
conn.close()
The error says that it expects the 8th column to be an integer and not a string. But, Python's write method can only read strings to the file. So, how would you import a file full of string representation of number to postgres table with columns that expect integer when your file can only have character representation of the integer (e.g. str(your_number)).
You either have to write numbers in integer format to the file (which Python's write method disallows) or psycopg2 should be smart enough to the conversion as part of copy_from procedure, which it apparently is not. Any idea is appreciated.
I ended up using copy_expert command. Note that on Windows, you have to set the permission of the file. This post is very useful setting permission.
with open(the_file, 'r') as f:
sql_copy_statement = "copy {table} FROM '"'{from_file}'"' DELIMITER '"'{deli}'"' {file_type} HEADER;".format(table = the_table,
from_file = the_file,
deli = the_delimiter,
file_type = the_file_type
)
print sql_copy_statement
cur.copy_expert(sql_copy_statement, f)
conn.commit()