We have a foreign table that is connecting to Oracle. In Oracle, the columns are:
ticker: VARCHAR2(5)
article_id: NUMBER
In Postgres, we have tried to create the article_id as INTEGER and NUMERIC, but every time we try to query it we get this error:
column "article_id" of foreign table "latest_article_id" cannot be converted to or from Oracle data type
How can we create this foreign table so we can query it? The article_id is a number, so are there additional commands or options we must use?
We are on Postgres 10.10.
CREATE FOREIGN TABLE latest_article_id
(ticker VARCHAR,
article_id NUMERIC)
SERVER usercomm
OPTIONS ( table '(SELECT article_id, ticker
FROM (SELECT a.article_id, t.ticker,
ROW_NUMBER() OVER (PARTITION BY t.ticker
ORDER BY a.publish_date DESC NULLS LAST) AS rnum
FROM tickers t, article_tickers at, articles a
WHERE t.ticker_id = at.ticker_id
AND at.article_id = a.article_id
AND a.status_id = 6
AND a.pull_flag = ''Y'')
WHERE rnum = 1)');
I am using psycopg2 to run an INSERT command against a Postgres database, and when there is a conflict I just want to update the other column values.
Here is the query:
insert_sql = '''
INSERT INTO tablename (col1, col2, col3,col4)
VALUES (%s, %s, %s, %s) (val1,val2,val3,val4)
ON CONFLICT (col1)
DO UPDATE SET
(col2, col3, col4)
= (val2, val3, val4) ; '''
cur.excecute(insert_sql)
I want to find out what I am doing wrong. I am using variables val1, val2, val3, not actual values.
To quote from psycopg2's documentation:
Warning Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
Now, for an upsert operation you can do this:
insert_sql = '''
INSERT INTO tablename (col1, col2, col3, col4)
VALUES (%s, %s, %s, %s)
ON CONFLICT (col1) DO UPDATE SET
(col2, col3, col4) = (EXCLUDED.col2, EXCLUDED.col3, EXCLUDED.col4);
'''
cur.execute(insert_sql, (val1, val2, val3, val4))
Notice that the parameters for the query are being passed as a tuple to the execute statement (this ensures psycopg2 will take care of adapting them to SQL while shielding you from injection attacks).
The EXCLUDED bit allows you to reuse the values without the need to specify them twice in the data parameter.
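For completeness, here is a minimal end-to-end sketch of how this might be wired up (the connection string, table, and values are hypothetical, and the SET clause is written column by column, which works the same way):
import psycopg2

# hypothetical connection -- adjust to your environment
conn = psycopg2.connect("dbname=mydb user=postgres")
cur = conn.cursor()

insert_sql = '''
INSERT INTO tablename (col1, col2, col3, col4)
VALUES (%s, %s, %s, %s)
ON CONFLICT (col1) DO UPDATE SET
    col2 = EXCLUDED.col2, col3 = EXCLUDED.col3, col4 = EXCLUDED.col4;
'''

val1, val2, val3, val4 = 1, 'a', 'b', 'c'          # example values
cur.execute(insert_sql, (val1, val2, val3, val4))  # parameters passed as a tuple
conn.commit()                                      # make the upsert visible
cur.close()
conn.close()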
Using:
INSERT INTO members (member_id, customer_id, subscribed, customer_member_id, phone, cust_atts) VALUES (%s, %s, %s, %s, %s, %s) ON CONFLICT (customer_member_id) DO UPDATE SET (phone) = (EXCLUDED.phone);
I received the following error:
psycopg2.errors.FeatureNotSupported: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression
LINE 1: ...ICT (customer_member_id) DO UPDATE SET (phone) = (EXCLUDED.p...
Changing to:
INSERT INTO members (member_id, customer_id, subscribed, customer_member_id, phone, cust_atts) VALUES (%s, %s, %s, %s, %s, %s) ON CONFLICT (customer_member_id) DO UPDATE SET (phone) = ROW(EXCLUDED.phone);
Solved the issue.
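A side note, not from the original answer: when only a single column is being updated, you can also drop the parentheses entirely, which avoids the ROW() requirement:
INSERT INTO members (member_id, customer_id, subscribed, customer_member_id, phone, cust_atts)
VALUES (%s, %s, %s, %s, %s, %s)
ON CONFLICT (customer_member_id) DO UPDATE SET phone = EXCLUDED.phone;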
Try:
INSERT INTO tablename (col1, col2, col3,col4)
VALUES (val1,val2,val3,val4)
ON CONFLICT (col1)
DO UPDATE SET
(col2, col3, col4)
= (val2, val3, val4);
I haven't seen anyone comment on this, but you can utilize psycopg2.extras.execute_values to insert/update many rows of data at once, which I think is the intended solution for many inserts/updates.
There are a few tutorials on YouTube that illustrate this, one being How to connect to PSQL Database using psycopg2 + Python.
In the video they load a dataframe using pandas and insert the data from a CSV source into multiple schemas/tables. The code snippet from that video is:
from psycopg2.extras import execute_values
sql_insert = """
INSERT INTO {state}.weather_county(fips_code, county_name, temperature)
VALUES %s
ON CONFLICT (fips_code) DO UPDATE
SET
temperature=excluded.temperature,
updated_at=NOW()
;
"""
grouped = new_weather_data.groupby(by='state') ## new_weather_data is a dataframe
conn = create_rw_conn(secrets=secrets)
for state, df in grouped:
    # select only the necessary columns
    df = df[['fips_code', 'county_name', 'temperature']]
    print("[{}] upsert...".format(state))
    # convert dataframe into a list of tuples for `execute_values`
    data = [tuple(x) for x in df.values.tolist()]
    cur = conn.cursor()
    execute_values(cur, sql_insert.format(state=state), data)
    conn.commit()  # <- We MUST commit to reflect the inserted data
    print("[{}] changes were committed...".format(state))
    cur.close()
The Jupyter Notebook is psycopg2-python-tutorial/new-schemas-tables-insert.ipynb
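If you are not using pandas, a stripped-down sketch of the same idea looks like this (the connection, table, and data are hypothetical; the table needs a unique constraint or primary key on fips_code for ON CONFLICT to work):
import psycopg2
from psycopg2.extras import execute_values

# hypothetical connection -- adjust to your environment
conn = psycopg2.connect("dbname=mydb user=postgres")
cur = conn.cursor()

rows = [(6075, 'San Francisco', 14.2), (6085, 'Santa Clara', 16.8)]  # example data

# execute_values expands the single %s into a multi-row VALUES list
execute_values(cur, """
    INSERT INTO weather_county (fips_code, county_name, temperature)
    VALUES %s
    ON CONFLICT (fips_code) DO UPDATE
    SET temperature = EXCLUDED.temperature;
""", rows)

conn.commit()
cur.close()
conn.close()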
Here's a function that takes a DataFrame (df), the schema name of the table (schemaname), the table name (tablename), the column you want to use as the conflict target (conflictcolumn), and an engine created by SQLAlchemy's create_engine. It upserts the DataFrame into the table, updating all non-conflict columns on conflict. This extends the solution by Ionut Ticus.
Don't combine it with pandas.to_sql(): to_sql() does not preserve the primary key setting. In that case, you need to re-add the primary key with an ALTER TABLE, which the function below also suggests. The primary key isn't necessarily dropped by pandas; it may simply never have been set. The error in that case would be:
there is no unique constraint matching given keys for referenced table
and the function will suggest that you execute:
engine.execute('ALTER TABLE {schemaname}.{tablename} ADD PRIMARY KEY ({conflictcolumn});')
Function:
def update_query(df, schemaname, tablename, conflictcolumn, engine):
    """
    This function takes a dataframe as df, the name of the schema as schemaname, the name of the table to insert into as tablename,
    the column whose conflicts trigger an update of the other row elements as conflictcolumn,
    and the database engine as engine.
    Example for engine: engine_portfolio_pg = create_engine('postgresql://pythonuser:vmqJRZ#dPW24d#145.239.121.143/cetrm_portfolio')
    Example for schemaname, tablename: weatherofcities.sanfrancisco -> schemaname = weatherofcities, tablename = sanfrancisco.
    """
    columns = df.columns.tolist()
    deleteprimary = columns.copy()
    deleteprimary.remove(conflictcolumn)
    replacestring = '%s,' * len(df.columns.tolist())
    replacestring = replacestring[:-1]
    excluded = ""
    for column in deleteprimary:
        excluded += "EXCLUDED.{}".format(column) + ","
    excluded = excluded[:-1]
    columns = ','.join(columns)
    deleteprimary = ','.join(deleteprimary)
    # note: on PostgreSQL 10+ the multi-column SET needs ROW(...) around the right-hand side
    # (see the ROW() answer above)
    insert_sql = """ INSERT INTO {schemaname}.{tablename} ({allcolumns})
    VALUES ({replacestring})
    ON CONFLICT ({conflictcolumn}) DO UPDATE SET
    ({deleteprimary}) = ({excluded})""".format(tablename=tablename, schemaname=schemaname, allcolumns=columns, replacestring=replacestring,
                                               conflictcolumn=conflictcolumn, deleteprimary=deleteprimary, excluded=excluded)
    conn = engine.raw_connection()
    conn.autocommit = True
    #conn = engine.connect()
    cursor = conn.cursor()
    i = 0
    print("------------------------" * 5)
    print("If the error below happens:")
    print("there is no unique constraint matching given keys for referenced table?")
    print("Primary key is not set, you can execute:")
    print("engine.execute('ALTER TABLE {}.{} ADD PRIMARY KEY ({});')".format(schemaname, tablename, conflictcolumn))
    print("------------------------" * 5)
    for index, row in df.iterrows():
        cursor.execute(insert_sql, tuple(row.values))
        conn.commit()
        if i == 0:
            print("Order of Columns in Operated SQL Query for Rows")
            columns = df.columns.tolist()
            print(insert_sql % tuple(columns))
            print("----")
            print("Example of Operated SQL Query for Rows")
            print(insert_sql % tuple(row.values))
            print("---")
        i += 1
    conn.close()
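A minimal usage sketch for the function above (the connection string, schema, table, and data are hypothetical):
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('postgresql://user:password@localhost/mydb')  # hypothetical
df = pd.DataFrame({'fips_code': [6075, 6085],
                   'county_name': ['San Francisco', 'Santa Clara'],
                   'temperature': [14.2, 16.8]})

# upserts df into myschema.weather_county, using fips_code as the conflict column
update_query(df, 'myschema', 'weather_county', 'fips_code', engine)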
If I have a table:
CREATE TABLE
abc
(
xyz INTEGER,
abc INTEGER
);
and I do:
select * from abc order by abc;
it works and sorts by the column named abc.
However, if I don't have a column named 'abc':
CREATE TABLE
abc
(
xyz INTEGER,
abcxyz INTEGER
);
but I run the same query:
select * from abc order by abc;
there is no error message returned from postgres.
I'm just wondering why postgres doesn't return an error message because clearly, 'abc' is not a valid column so I can't possibly order by that column.
I'm using PostgreSQL 9.6.7
PostgreSQL automatically creates a composite type with the same name and structure whenever a table is created. So when abc is used in the query and no column of that name exists, it is resolved as a whole-row reference of that type. To verify this, insert a row into the table and run:
select pg_typeof(abc) from abc;
It gives this result:
 pg_typeof
-----------
 abc
(1 row)
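To make the whole-row behaviour concrete, here is a small sketch (the inserted values are just for illustration): selecting abc yields the composite row value, and ORDER BY abc sorts by that composite, comparing fields left to right.
INSERT INTO abc (xyz, abcxyz) VALUES (2, 1), (1, 2);
SELECT abc FROM abc;                 -- returns the row values (2,1) and (1,2)
SELECT * FROM abc ORDER BY abc;      -- sorts by the composite: (1,2) before (2,1)
SELECT * FROM abc ORDER BY abcxyz;   -- sorts by the actual column instead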
I have a file of table definitions, in the following format
Table Name Field Name Field Data Type
ATableName1 AFieldName1 VARCHAR2
ATableName1 AFieldName2 NUMBER
...
ATableNameX AFieldNameX1 TIMESTAMP(6)
Is there any easy way to import this into Postgres to automatically create the tables?
What if I split the file up into individual tables, and just had a csv of field names/data types for each table?
Field Name Field Data Type
AFieldName1 VARCHAR2
AFieldName2 NUMBER
My searching has only yielded data import via copy, and table creation (based on data) using pgfutter.
Mind that I change varchar2 to varchar and number to integer. Also, you have a TSV; to use it, change chr(44) (comma) in my code to chr(9) (tab). Mind that I don't check for injection. Otherwise, here's a working example:
t=# do
$$
declare
_r record;
begin
for _r in (
with t(l) as (values('ATableName1,AFieldName1i, VARCHAR
ATableName1,AFieldName2,INTEGER
ATableNameX,AFieldNameX1,TIMESTAMP(6)'::text)
)
, r as (select unnest(string_to_array(l,chr(10))) rw from t)
, p as (select split_part(rw,chr(44),1) tn, split_part(rw,chr(44),2) cn,split_part(rw,chr(44),3) tp from r)
select tn||' ('||string_agg(cn||' '||tp, ', ')||')' s from p
group by tn
) loop
raise info '%','create table '||_r.s;
execute 'create table '||_r.s;
end loop;
end;
$$
;
INFO: create table ATableNameX (AFieldNameX1 TIMESTAMP(6))
INFO: create table ATableName1 (AFieldName1i VARCHAR, AFieldName2 INTEGER)
DO
Time: 16.743 ms
t=# \dt atablename*
List of relations
Schema | Name | Type | Owner
--------+-------------+-------+-------
public | atablename1 | table | vao
public | atablenamex | table | vao
SQL is your friend; it is very expressive, and you can construct your table definitions using the string_agg function. Have a look at the example here:
http://sqlfiddle.com/#!17/0fe14/1
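For example, here is a sketch along those lines, assuming the definition file has been loaded into a staging table first (the staging table, file path, and column names are hypothetical, and the Oracle types are assumed to be already mapped to Postgres types):
-- hypothetical staging table holding the imported definition file
CREATE TABLE table_defs (table_name text, field_name text, field_type text);
COPY table_defs FROM '/path/to/definitions.tsv';  -- text format, tab-delimited by default

-- build one CREATE TABLE statement per table with string_agg
SELECT 'CREATE TABLE ' || table_name || ' ('
       || string_agg(field_name || ' ' || field_type, ', ')
       || ');'
FROM table_defs
GROUP BY table_name;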
Hello, what is the easiest way to duplicate a DB record within the same table?
My problem is that the table where I am doing this has many columns (100+), and I don't like how the solution looks. Here is what I do (this is inside a plpgsql function):
...
1. duplicate record
INSERT INTO history
(SELECT NEXTVAL('history_id_seq'), col_1, col_2, ... , col_100
FROM history
WHERE history_id = 1234
ORDER BY datetime DESC
LIMIT 1)
RETURNING
history_id INTO new_history_id;
2. update some columns
UPDATE history
SET
col_5 = 'test_5',
col_23 = 'test_23',
datetime = CURRENT_TIMESTAMP
WHERE history_id = new_history_id;
Here are the problems I am attempting to solve
Listing all these 100+ columns looks lame
When a new column is eventually added, the function has to be updated too
On separate DB instances the column order might differ, which would cause the function to fail
I am not sure if I can list them once more (solving issue 3) like insert into <table> (<columns_list>) values (<query>) but then the query looks even uglier.
I would like to achieve something like 'insert into ', but this seems impossible because the unique primary key constraint will raise a duplication error.
Any suggestions?
Thanks in advance for your time.
This isn't pretty or particularly optimized but there are a couple of ways to go about this. Ideally, you might want to do this all in an UPDATE trigger though you could implement a duplication function something like this:
-- create source table
CREATE TABLE history (history_id serial not null primary key, col_2 int, col_3 int, col_4 int, datetime timestamptz default now());
-- add some data
INSERT INTO history (col_2, col_3, col_4)
SELECT g, g * 10, g * 100 FROM generate_series(1, 100) AS g;
-- function to duplicate record
CREATE OR REPLACE FUNCTION fn_history_duplicate(p_history_id integer) RETURNS SETOF history AS
$BODY$
DECLARE
cols text;
insert_statement text;
BEGIN
-- build list of columns
SELECT array_to_string(array_agg(column_name::name), ',') INTO cols
FROM information_schema.columns
WHERE (table_schema, table_name) = ('public', 'history')
AND column_name <> 'history_id';
-- build insert statement
insert_statement := 'INSERT INTO history (' || cols || ') SELECT ' || cols || ' FROM history WHERE history_id = $1 RETURNING *';
-- execute statement
RETURN QUERY EXECUTE insert_statement USING p_history_id;
RETURN;
END;
$BODY$
LANGUAGE 'plpgsql';
-- test
SELECT * FROM fn_history_duplicate(1);
history_id | col_2 | col_3 | col_4 | datetime
------------+-------+-------+-------+-------------------------------
101 | 1 | 10 | 100 | 2013-04-15 14:56:11.131507+00
(1 row)
As I noted in my original comment, you might also take a look at the colnames extension as an alternative to querying the information schema.
You don't need the update anyway; you can supply the constant values directly in the SELECT statement:
INSERT INTO history
SELECT NEXTVAL('history_id_seq'),
col_1,
col_2,
col_3,
col_4,
'test_5',
...
'test_23',
...,
col_100
FROM history
WHERE history_id = 1234
ORDER BY datetime DESC
LIMIT 1
RETURNING history_id INTO new_history_id;