Inserting array containing json objects as rows in postgres 9.5 - postgresql

Just started using PostgreSQL 9.5 and I have run into my first problem with a jsonb column. I have been trying to find an answer to this for a while but am failing badly. Can someone help?
I have a json array in python containing json objects like this:
[{"name":"foo", "age":"18"}, {"name":"bar", "age":"18"}]
I'm trying to insert this into a jsonb column like this:
COPY person(person_jsonb) FROM '/path/to/my/json/file.json';
But only 1 row gets inserted. I hope to have each json object in the array as a new row like this:
1. {"name":"foo", "age":"18"}
2. {"name":"bar", "age":"18"}
Also tried:
cur.execute("""
    INSERT INTO person(person_jsonb)
    VALUES (%s)
    """, (json.dumps(data['person']),))
Still only one row gets inserted. Can someone please help?
EDIT: Added python code as requested
import psycopg2, sys, json
con = None
orders_file_path = '/path/to/my/json/person.json'
try:
    with open(orders_file_path) as data_file:
        data = json.load(data_file)
    con = psycopg2.connect(...)
    cur = con.cursor()
    person = data['person']
    cur.execute("""
        INSERT INTO orders(orders_jsonb)
        VALUES (%s)
        """, (json.dumps(person), ))
    con.commit()
except psycopg2.DatabaseError as e:
    if con:
        con.rollback()
finally:
    if con:
        con.close()
person.json file:
{"person":[{"name":"foo", "age":"18"}, {"name":"bar", "age":"18"}]}

Assuming the simplest schema:
CREATE TABLE test(data jsonb);
Option 1: parse the JSON in Python
Since each row has to be inserted separately, you can parse the JSON on the Python side, split the top-level array, and use cursor.executemany to run the INSERT once per element:
import json
import psycopg2

con = psycopg2.connect('...')
data = json.loads('[{"name":"foo", "age":"18"}, {"name":"bar", "age":"18"}]')
with con.cursor() as cur:
    cur.executemany('INSERT INTO test(data) VALUES(%s)', [(json.dumps(d),) for d in data])
con.commit()
con.close()
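As a side note, psycopg2 2.7 and later (newer than the 9.5-era setup in the question, so treat this as optional) provide psycopg2.extras.execute_values, which folds all the rows into a single multi-row INSERT. A sketch against the same test table:
import json
import psycopg2
from psycopg2.extras import execute_values

con = psycopg2.connect('...')
data = json.loads('[{"name":"foo", "age":"18"}, {"name":"bar", "age":"18"}]')
with con.cursor() as cur:
    # the single %s after VALUES is expanded by execute_values into one row per tuple
    execute_values(cur, 'INSERT INTO test(data) VALUES %s',
                   [(json.dumps(d),) for d in data])
con.commit()
con.close()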
Option 2: parse the JSON in PostgreSQL
Another option is to push the JSON processing to the PostgreSQL side using json_array_elements:
import psycopg2

con = psycopg2.connect('...')
data = '[{"name":"foo", "age":"18"}, {"name":"bar", "age":"18"}]'
with con.cursor() as cur:
    cur.execute('INSERT INTO test(data) SELECT * FROM json_array_elements(%s)', (data,))
con.commit()
con.close()
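Applied to the person.json file from the question, a sketch of Option 2 (using jsonb_array_elements with an explicit cast, since person_jsonb is a jsonb column) might look like:
import json
import psycopg2

con = psycopg2.connect('...')
with open('/path/to/my/json/person.json') as data_file:
    data = json.load(data_file)
with con.cursor() as cur:
    # split the top-level array server-side, one jsonb element per row
    cur.execute('INSERT INTO person(person_jsonb) '
                'SELECT * FROM jsonb_array_elements(%s::jsonb)',
                (json.dumps(data['person']),))
con.commit()
con.close()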

Related

How to make psycopg2 emit no quotes?

I want to create a table from Python:
import psycopg2 as pg
from psycopg2 import sql

conn = pg.connect("dbname=test user=test")
table_name = "testDB"
column_name = "mykey"
column_type = "bigint"
cu = conn.cursor()
cu.execute(sql.SQL("CREATE TABLE {t} ({c} {y})").format(
    t=sql.Identifier(table_name),
    c=sql.Identifier(column_name),
    y=sql.Literal(column_type)))
Alas, this emits CREATE TABLE "testDB" ("mykey" 'bigint') which fails with a
psycopg2.ProgrammingError: syntax error at or near "'bigint'"
Of course, I can do something like
cu.execute(sql.SQL("CREATE TABLE {t} ({c} %s)" % column_type).format(
    t=sql.Identifier(table_name),
    c=sql.Identifier(column_name)))
but I suspect there is a more elegant (and secure!) solution.
PS. See also How to make psycopg2 emit nested quotes?
There is an example in the documentation of how to build a query text with a placeholder. Use psycopg2.extensions.AsIs(object) for column_type:
from psycopg2.extensions import AsIs

query = sql.SQL("CREATE TABLE {t} ({c} %s)").format(
    t=sql.Identifier(table_name),
    c=sql.Identifier(column_name)).as_string(cu)
cu.execute(query, [AsIs(column_type)])
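If the type string comes from a fixed whitelist in your own code rather than from user input, another option (a sketch, not the documented approach above) is to validate it and splice it in with sql.SQL, which inserts the string verbatim:
from psycopg2 import sql

ALLOWED_TYPES = {"bigint", "integer", "text"}   # hypothetical whitelist
if column_type not in ALLOWED_TYPES:
    raise ValueError("unexpected column type: " + column_type)

cu.execute(sql.SQL("CREATE TABLE {t} ({c} {y})").format(
    t=sql.Identifier(table_name),
    c=sql.Identifier(column_name),
    y=sql.SQL(column_type)))        # sql.SQL performs no escaping, hence the whitelist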

The sql works fine but with python it doesn't insert values into table

I'm trying to use Python to run my SQL stored procedures. I have tested the SQL code and it works fine, but when I execute it via Python, the values are not inserted into my table.
Note: I don't have any errors when executing
My code below:
import psycopg2
con = psycopg2.connect(dbname='dbname', host='host',
                       port='5439', user='username', password='password')

def executeScriptsfromFile(filename):
    # Open and read the file as a single buffer
    cur = con.cursor()
    fd = open(filename, 'r')
    sqlFile = fd.read()
    fd.close()
    # All SQL commands (split on ';')
    sqlCommands = filter(None, sqlFile.split(';'))
    # Execute every command from the input file
    for command in sqlCommands:
        # This will skip and report errors
        # For example, if the tables do not yet exist, this will skip over
        # the DROP TABLE commands
        try:
            cur.execute(command)
            con.commit()
        except Exception as inst:
            print("Command skipped:", inst)
    cur.close()

executeScriptsfromFile('filepath.sql')
The INSERT command in my SQL file:
INSERT INTO schema.users
SELECT
UserId
,Country
,InstallDate
,LastConnectDate
FROM #Source;
Note: As I said, the SQL works perfectly fine when I test it directly.

How to get data out of a postgres bytea column into a python variable using sqlalchemy?

I am working with the script below.
If I change the script so I avoid the bytea datatype, I can easily copy data from my postgres table into a python variable.
But if the data is in a bytea postgres column, I encounter a strange object called memory which confuses me.
Here is the script, which I run against Anaconda Python 3.5.2:
# bytea.py
import sqlalchemy
# I should create a conn
db_s = 'postgres://dan:dan@127.0.0.1/dan'
conn = sqlalchemy.create_engine(db_s).connect()
sql_s = "drop table if exists dropme"
conn.execute(sql_s)
sql_s = "create table dropme(c1 bytea)"
conn.execute(sql_s)
sql_s = "insert into dropme(c1)values( cast('hello' AS bytea) );"
conn.execute(sql_s)
sql_s = "select c1 from dropme limit 1"
result = conn.execute(sql_s)
print(result)
# <sqlalchemy.engine.result.ResultProxy object at 0x7fcbccdade80>
for row in result:
    print(row['c1'])
    # <memory at 0x7f4c125a6c48>
How do I get the data that is inside <memory at 0x7f4c125a6c48>?
You can cast it using Python's bytes():
for row in result:
    print(bytes(row['c1']))
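If the column is known to hold UTF-8 text (as with the 'hello' value inserted above), a small follow-up sketch to get a regular str back:
for row in result:
    raw = bytes(row['c1'])          # memoryview -> bytes, e.g. b'hello'
    print(raw.decode('utf-8'))      # bytes -> str; assumes the content really is UTF-8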

Psycopg2 insert python dictionary in postgres database

In python 3+, I want to insert values from a dictionary (or pandas dataframe) into a database. I have opted for psycopg2 with a postgres database.
The problem is that I cannot figure out the proper way to do this. I can easily concatenate a SQL string to execute, but the psycopg2 documentation explicitly warns against this. Ideally I wanted to do something like this:
cur.execute("INSERT INTO table VALUES (%s);", dict_data)
and hoped that execute could figure out that the keys of the dict match the columns in the table. This did not work. From the examples in the psycopg2 documentation I got to this approach:
cur.execute("INSERT INTO table (" + ", ".join(dict_data.keys()) + ") VALUES (" + ", ".join(["%s" for pair in dict_data]) + ");", dict_data)
from which I get a
TypeError: 'dict' object does not support indexing
What is the most pythonic way of inserting a dictionary into a table with matching column names?
Two solutions:
from psycopg2.extensions import AsIs

d = {'k1': 'v1', 'k2': 'v2'}
cursor = conn.cursor()

# Solution 1: pass the column list with AsIs and the values as a tuple
insert = 'insert into table (%s) values %s'
l = [(c, v) for c, v in d.items()]
columns = ','.join([t[0] for t in l])
values = tuple([t[1] for t in l])
print(cursor.mogrify(insert, ([AsIs(columns)] + [values])))

# Solution 2: build named placeholders from the keys and pass the dict itself
keys = d.keys()
columns = ','.join(keys)
values = ','.join(['%({})s'.format(k) for k in keys])
insert = 'insert into table ({0}) values ({1})'.format(columns, values)
print(cursor.mogrify(insert, d))
Output:
insert into table (k2,k1) values ('v2', 'v1')
insert into table (k2,k1) values ('v2','v1')
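mogrify only renders the query for inspection; to actually perform the insert with the second, named-placeholder form, a minimal follow-up sketch (conn is the connection assumed above):
cursor.execute(insert, d)   # psycopg2 escapes the values taken from the dict
conn.commit()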
I sometimes run into this issue, especially with respect to JSON data, which I naturally want to deal with as a dict. Very similar... but maybe a little more readable?
def do_insert(rec: dict):
    cols = rec.keys()
    cols_str = ','.join(cols)
    vals = [rec[k] for k in cols]
    vals_str = ','.join(['%s' for i in range(len(vals))])
    sql_str = """INSERT INTO some_table ({}) VALUES ({})""".format(cols_str, vals_str)
    cur.execute(sql_str, vals)
I typically call this type of thing from inside an iterator, usually wrapped in a try/except. Either the cursor (cur) is already defined in an outer scope, or one can amend the function signature and pass a cursor instance in. I rarely insert just a single row. And like the other solutions, this allows for missing cols/values provided the underlying schema allows for it too. As long as the dict underlying the keys view is not modified while the insert is taking place, there's no need to specify keys by name, as the values will be ordered as they are in the keys view.
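For example, a sketch of that pattern (conn and the records iterable are assumptions here; the rollback keeps the connection usable after a failed record):
import psycopg2

cur = conn.cursor()                  # cursor defined in the outer scope, as described
for rec in records:                  # records: a hypothetical iterable of dicts
    try:
        do_insert(rec)
        conn.commit()
    except psycopg2.Error as e:
        conn.rollback()              # reset the aborted transaction before the next record
        print('skipped record:', e)
cur.close()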
[Suggested answer/workaround - better answers are appreciated!]
After some trial/error I got the following to work:
sql = "INSERT INTO table (" + ", ".join(dict_data.keys()) + ") VALUES (" + ", ".join(["%("+k+")s" for k in dict_data]) + ");"
This gives the sql string
"INSERT INTO table (k1, k2, ... , kn) VALUES (%(k1)s, %(k2)s, ... , %(kn)s);"
which may be executed by
with psycopg2.connect(database='deepenergy') as con:
    with con.cursor() as cur:
        cur.execute(sql, dict_data)
Pros/cons?
Using %(name)s placeholders may solve the problem:
dict_data = {'key1': val1, 'key2': val2}
cur.execute("""INSERT INTO table (field1, field2)
               VALUES (%(key1)s, %(key2)s);""",
            dict_data)
You can find the usage in the psycopg2 docs under "Passing parameters to SQL queries".
Here is another solution, inserting a dictionary directly.
Product Model (has the following database columns)
name
description
price
image
digital - (defaults to False)
quantity
created_at - (defaults to current date)
Solution:
data = {
    "name": "product_name",
    "description": "product_description",
    "price": 1,
    "image": "https",
    "quantity": 2,
}
cur = conn.cursor()
cur.execute(
    "INSERT INTO products (name,description,price,image,quantity) "
    "VALUES(%(name)s, %(description)s, %(price)s, %(image)s, %(quantity)s)", data
)
conn.commit()
conn.close()
Note: The columns to insert are listed explicitly in the execute statement (... INTO products (column names to be filled) VALUES ...), and data is the dictionary. Because the placeholders are named (%(name)s), the dictionary only needs to contain matching keys; the order of the keys does not matter.
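Since the question also mentions a pandas DataFrame, here is a hedged sketch (the df name and its matching column names are assumptions) for inserting many such dictionaries in one go with executemany:
# each DataFrame row becomes a dict shaped like the data example above
rows = df.to_dict('records')
cur = conn.cursor()
cur.executemany(
    "INSERT INTO products (name,description,price,image,quantity) "
    "VALUES(%(name)s, %(description)s, %(price)s, %(image)s, %(quantity)s)", rows
)
conn.commit()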

Use python to execute line in postgresql

I have imported a shapefile named tc_bf25 using QGIS, and the following is my Python script, typed in PyScripter:
import sys
import psycopg2

conn = psycopg2.connect("dbname = 'routing_template' user = 'postgres' host = 'localhost' password = '****'")
cur = conn.cursor()

query = """
    ALTER TABLE tc_bf25 ADD COLUMN source integer;
    ALTER TABLE tc_bf25 ADD COLUMN target integer;
    SELECT assign_vertex_id('tc_bf25', 0.0001, 'the_geom', 'gid');
    """
cur.execute(query)

query = """
    CREATE OR REPLACE VIEW tc_bf25_ext AS
    SELECT *, startpoint(the_geom), endpoint(the_geom)
    FROM tc_bf25;
    """
cur.execute(query)

query = """
    CREATE TABLE node1 AS
    SELECT row_number() OVER (ORDER BY foo.p)::integer AS id,
           foo.p AS the_geom
    FROM (
        SELECT DISTINCT tc_bf25_ext.startpoint AS p FROM tc_bf25_ext
        UNION
        SELECT DISTINCT tc_bf25_ext.endpoint AS p FROM tc_bf25_ext
    ) foo
    GROUP BY foo.p;
    """
cur.execute(query)

query = """
    CREATE TABLE network1 AS
    SELECT a.*, b.id AS start_id, c.id AS end_id
    FROM tc_bf25_ext AS a
    JOIN node AS b ON a.startpoint = b.the_geom
    JOIN node AS c ON a.endpoint = c.the_geom;
    """
cur.execute(query)

query = """
    ALTER TABLE network1 ADD COLUMN shape_leng double precision;
    UPDATE network1 SET shape_leng = length(the_geom);
    """
cur.execute(query)
I get an error at the second cur.execute(query). But when I go to pgAdmin to check the result, even though no error occurs there, the first cur.execute(query) didn't add the new columns to my table.
What mistake did I make, and how do I fix it?
I am working with PostgreSQL 8.4 and Python 2.7.6 under Windows 8.1 x64.
When using psycopg2, autocommit is set to False by default. The first two statements both refer to the table tc_bf25, but the first statement makes an uncommitted change to the table. So try running conn.commit() between statements to see if this resolves the issue.
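Concretely, a sketch of the first part of the script above with a commit added after each statement:
query = """
    ALTER TABLE tc_bf25 ADD COLUMN source integer;
    ALTER TABLE tc_bf25 ADD COLUMN target integer;
    SELECT assign_vertex_id('tc_bf25', 0.0001, 'the_geom', 'gid');
    """
cur.execute(query)
conn.commit()   # commit the ALTER TABLE changes before running the next statement

query = """
    CREATE OR REPLACE VIEW tc_bf25_ext AS
    SELECT *, startpoint(the_geom), endpoint(the_geom)
    FROM tc_bf25;
    """
cur.execute(query)
conn.commit()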
You should run each statement individually. Do not combine multiple statements into a semicolon-separated series and run them all at once. It makes error handling and fetching of results much harder.
If you still have the problem once you've made that change, show the exact statement you're having the problem with.
Just to add to @Talvalin's answer: you can enable auto-commit by adding
conn = psycopg2.connect("dbname='mydb' user='postgres' host='localhost' password='****'")
conn.autocommit = True
after you connect to your database using psycopg2.