Need help inserting data into Postgres tables - postgresql

I get an error when I try to insert data into my tables, but I don't know why.
As far as I can tell, the syntax is correct. The error is:
column "population" is of type integer but expression is of type record
create table states(name varchar(25), population int );
create table countries(name varchar(25), population int );
insert into states values (('tn',54945),('ap',2308));
select name from states;
insert into countries values (('india',3022),('america',30902));
select * from countries;

There are extra parentheses around the tuples of values to insert, which turns the whole thing into a single record of records: Postgres sees one row whose two columns are each a record.
Instead:
insert into countries(name, population) values ('india',3022),('america',30902);
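A minimal sketch of the corrected statement, run from Python against an in-memory SQLite database. SQLite is only a stand-in here (an assumption made so the example is self-contained); the multi-row VALUES list has the same shape in Postgres.

```python
import sqlite3

# In-memory SQLite is used purely as a stand-in for Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("create table states(name varchar(25), population int)")
conn.execute("create table countries(name varchar(25), population int)")

# One parenthesised tuple per row, separated by commas,
# with no extra pair of parentheses around the whole list.
conn.execute(
    "insert into countries(name, population) "
    "values ('india', 3022), ('america', 30902)"
)

# Parameterised alternative: the driver receives one tuple per row.
state_values = [("tn", 54945), ("ap", 2308)]
conn.executemany(
    "insert into states(name, population) values (?, ?)", state_values
)

country_rows = conn.execute(
    "select name, population from countries order by name"
).fetchall()
state_rows = conn.execute(
    "select name, population from states order by name"
).fetchall()
print(country_rows)
print(state_rows)
```

Listing the target columns explicitly, as above, also keeps the statement valid if columns are later added to the table.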

Related

Why does this insert give error: There is a column named "column" in table "table", but it cannot be referenced from this part of the query

I am trying to learn to use JDBC to make changes to a PostgreSQL database. I want to try inserting data into a table.
private static void createDatabase(Connection connection) throws SQLException {
Statement stmt = connection.createStatement();
stmt.executeUpdate("DROP TABLE IF EXISTS " + TABLE_NAME);
stmt.executeUpdate("CREATE TABLE my_table (Column1 Text)");
stmt.executeUpdate("INSERT INTO my_table VALUES (column1)");
}
Running this, I get an error:
ERROR: column "column1" does not exist
Hint: There is a column named "column1" in table "my_table", but it cannot be referenced from this part of the query.
If the column exists in the table, why am I getting an error?
The problem is that your statement, INSERT INTO my_table VALUES (Column1), is asking to insert (into column1) the current value of column1. Given that the row does not exist yet, this cannot be done: there is no current value of column1. As an aside, it is good practice to explicitly list the columns you are inserting into.
The values clause expects a list of values, expressions or parameters. For example, you can use:
INSERT INTO my_table (Column1) VALUES ('Value for Column1')

Output Inserted.id equivalent in Postgres

I am new to PostgreSQL and am trying to convert MSSQL scripts to Postgres.
For a MERGE statement we can use INSERT ... ON CONFLICT DO UPDATE or DO NOTHING, but I am using the statement below and am not sure whether it is the correct approach.
MSSQL code:
Declare @tab2 table(New_Id int not null, Old_Id int not null)
MERGE Tab1 as Target
USING (select * from Tab1
WHERE ColumnId = @ID) as Source on 0 = 1
when not matched by Target then
INSERT
(ColumnId
,Col1
,Col2
,Col3
)
VALUES (Source.ColumnId
,Source.Col1
,Source.Col2
,Source.Col3
)
OUTPUT INSERTED.Id, Source.Id into @tab2(New_Id, Old_Id);
Postgres Code:
Create temp table tab2(New_Id int not null, Old_Id int not null);
With source as( select * from Tab1
WHERE ColumnId = ID)
Insert into Tab1(ColumnId
,Col1
,Col2
,Col3
)
select Source.ColumnId
,Source.Col1
,Source.Col2
,Source.Col3
from source
My question is how to convert OUTPUT INSERTED.Id to Postgres. I need this id to insert records into another table (let's say child tables based on the values inserted into Tab1).
In PostgreSQL's INSERT statements you can choose what the query should return. From the docs on INSERT:
The optional RETURNING clause causes INSERT to compute and return value(s) based on each row actually inserted (or updated, if an ON CONFLICT DO UPDATE clause was used). This is primarily useful for obtaining values that were supplied by defaults, such as a serial sequence number. However, any expression using the table's columns is allowed. The syntax of the RETURNING list is identical to that of the output list of SELECT. Only rows that were successfully inserted or updated will be returned.
Example (shortened form of your query):
WITH [...] INSERT INTO Tab1 ([...]) SELECT [...] FROM [...] RETURNING Tab1.id
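As a rough sketch of how the returned id can feed a child table, the following uses Python with an in-memory SQLite database as a stand-in for Postgres. The table names tab1 and tab1_child, the column names, and the helper insert_parent are hypothetical. SQLite only understands RETURNING from version 3.35.0, so the sketch falls back to the driver's lastrowid on older builds; in Postgres the RETURNING branch is the relevant one.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; table and column names are
# illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tab1 (id INTEGER PRIMARY KEY, col1 TEXT);
    CREATE TABLE tab1_child (parent_id INTEGER REFERENCES tab1(id), note TEXT);
    """
)


def insert_parent(col1):
    """Insert one parent row and return its generated id."""
    if sqlite3.sqlite_version_info >= (3, 35, 0):
        # The INSERT itself hands back the generated key.
        cur = conn.execute(
            "INSERT INTO tab1 (col1) VALUES (?) RETURNING id", (col1,)
        )
        rows = cur.fetchall()  # read every returned row before moving on
        return rows[0][0]
    # Older SQLite builds have no RETURNING clause.
    return conn.execute("INSERT INTO tab1 (col1) VALUES (?)", (col1,)).lastrowid


first_id = insert_parent("alpha")
second_id = insert_parent("beta")

# Use the returned id as the foreign key of a child row.
conn.execute(
    "INSERT INTO tab1_child (parent_id, note) VALUES (?, ?)",
    (second_id, "child of beta"),
)
child_rows = conn.execute("SELECT parent_id, note FROM tab1_child").fetchall()
print(first_id, second_id, child_rows)
```

With an INSERT ... SELECT that adds several rows, the same clause returns one row per inserted record, so the ids can be collected with fetchall() in one round trip.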

How can I bulk insert rows only if a compound primary key doesn't already exist? [AWS Redshift]

In Amazon Redshift I am trying to bulk insert values into a table from a temp table.
However, I only want to insert rows whose combination of values (the primary key) does not already exist in the table, to avoid adding duplicates.
Below is the DDL of the table.
• clusters_typologies table (the table I want to insert data into)
create table if not exists clusters.clusters_typologies
(
cluster_id BIGINT,
typology_id BIGINT,
semantic_id BIGINT,
primary key (cluster_id, typology_id, semantic_id)
);
The temp table is created with the query below, and after that all fields are inserted correctly.
CREATE TEMPORARY TABLE temporary (
cluster_id bigint,
typology_name varchar(100),
typology_id bigint,
semantic_name varchar(100),
semantic_id bigint
);
Now when I try to insert with this query:
INSERT INTO clusters.clusters_typologies (cluster_id, typology_id,semantic_id)
(SELECT temp.cluster_id, temp.typology_id, temp.semantic_id
FROM temporary temp
WHERE NOT EXISTS(SELECT 1
FROM clusters_typologies
where cluster_id = temp.cluster_id
and typology_id = temp.typology_id
and semantic_id = temp.semantic_id));
I get this error and cannot figure out how to make it work.
Invalid operation: This type of correlated subquery pattern is not supported due to internal error;
Does anyone know how to fix this, or what the best way is to insert into a table with a compound key while avoiding duplicates?
Thanks.
To upsert, follow this guide:
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-upsert.html
Note that certain types of correlated subquery are not allowed in Redshift, which is the cause of your error.
See:
https://docs.aws.amazon.com/redshift/latest/dg/r_correlated_subqueries.html
After some attempts I figured out how to insert from a temp table while checking the compound primary key to avoid duplicates.
From the AWS documentation that @Jon Scott sent, I understand that referencing the outer table in the inner select in this way is not supported by Redshift.
I solved it using a left join and checking whether the joined columns are null.
Below is the query I use now.
INSERT INTO clusters.clusters_typologies (cluster_id, typology_id, semantic_id)
(SELECT temp.cluster_id, temp.typology_id, temp.semantic_id
FROM aaaa temp
LEFT JOIN clusters.clusters_typologies clu_typ ON temp.cluster_id = clu_typ.cluster_id AND
temp.typology_id = clu_typ.typology_id AND
temp.semantic_id = clu_typ.semantic_id
WHERE clu_typ.cluster_id IS NULL
AND clu_typ.typology_id IS NULL
AND clu_typ.semantic_id IS NULL);
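The left join pattern above can be tried out locally. This is a minimal sketch in Python using an in-memory SQLite database as a stand-in for Redshift; the source table name staging and the sample rows are hypothetical. Checking a single join column for NULL is enough here because all three columns take part in the join condition.

```python
import sqlite3

# In-memory SQLite stands in for Redshift; "staging" plays the role of the
# temp table and its name is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE clusters_typologies (
        cluster_id BIGINT,
        typology_id BIGINT,
        semantic_id BIGINT,
        PRIMARY KEY (cluster_id, typology_id, semantic_id)
    );
    CREATE TABLE staging (
        cluster_id BIGINT,
        typology_id BIGINT,
        semantic_id BIGINT
    );
    INSERT INTO clusters_typologies VALUES (1, 10, 100);
    INSERT INTO staging VALUES (1, 10, 100), (2, 20, 200);
    """
)

ANTI_JOIN_INSERT = """
    INSERT INTO clusters_typologies (cluster_id, typology_id, semantic_id)
    SELECT s.cluster_id, s.typology_id, s.semantic_id
    FROM staging s
    LEFT JOIN clusters_typologies t
           ON s.cluster_id = t.cluster_id
          AND s.typology_id = t.typology_id
          AND s.semantic_id = t.semantic_id
    WHERE t.cluster_id IS NULL  -- no matching row exists in the target yet
"""

conn.execute(ANTI_JOIN_INSERT)
conn.execute(ANTI_JOIN_INSERT)  # a second run finds nothing new to insert

final_rows = conn.execute(
    "SELECT cluster_id, typology_id, semantic_id "
    "FROM clusters_typologies ORDER BY cluster_id"
).fetchall()
print(final_rows)
```

The join only compares the source against the target. If the source itself can contain the same key twice, a SELECT DISTINCT on the source side is worth considering, since Redshift treats primary key constraints as informational and does not reject the duplicates.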

access postgres field given field name as text string

I have a table in postgres:
create table fubar (
name1 text,
name2 text, ...,
key integer);
I want to write a function which returns field values from fubar given the column names:
function getFubarValues(col_name text, key integer) returns text ...
where getFubarValues returns the value of the specified column in the row identified by key. Seems like this should be easy.
I'm at a loss. Can someone help? Thanks.
Klin's answer is a good (i.e. safe) approach to the question as posed, but it can be simplified:
PostgreSQL's -> operator allows expressions. For example:
CREATE TABLE test (
id SERIAL,
js JSON NOT NULL,
k TEXT NOT NULL
);
INSERT INTO test (js,k) VALUES ('{"abc":"def","ghi":"jkl"}','abc');
SELECT js->k AS value FROM test;
Produces
value
-------
"def"
So we can combine that with row_to_json:
CREATE TABLE test (
id SERIAL,
a TEXT,
b TEXT,
k TEXT NOT NULL
);
INSERT INTO test (a,b,k) VALUES
('foo','bar','a'),
('zip','zag','b');
SELECT row_to_json(test)->k AS value FROM test;
Produces:
value
-------
"foo"
"zag"
Here I'm getting the key from the table itself but of course you could get it from any source / expression. It's just a value. Also note that the result returned is a JSON value type (it doesn't know if it's text, numeric, or boolean). If you want it to be text, use the ->> operator instead of ->: row_to_json(test)->>k. A plain cast, (row_to_json(test)->k)::TEXT, also yields text but keeps the JSON quoting around strings.
Now that the question itself is answered, here's why you shouldn't do this, and what you should do instead!
Never trust any data. Even if it already lives inside your database, you shouldn't trust it. The method I've posted here is safe against SQL injection attacks, but an attacker could still set k to 'id' and see a column which was not intended to be visible to them.
A much better approach is to structure your data with this type of query in mind. Postgres has some excellent datatypes for this; HSTORE and JSON/JSONB. Merge your dynamic columns into a single column with one of those types (I'd suggest HSTORE for its simplicity and generally being more complete).
This has several advantages: your schema is well-defined and does not need to change if you add more dynamic columns, you do not need to perform expensive re-casting (i.e. row_to_json), and you are able to take advantage of indexes on your columns (thanks to PostgreSQL's functional indexes).
The equivalent to the code I wrote above would be:
CREATE EXTENSION HSTORE; -- necessary if you're not already using HSTORE
CREATE TABLE test (
id SERIAL,
cols HSTORE NOT NULL,
k TEXT NOT NULL
);
INSERT INTO test (cols,k) VALUES
('a=>"foo",b=>"bar"','a'),
('a=>"zip",b=>"zag"','b');
SELECT cols->k AS value FROM test;
Or, for automatic escaping of your values when inserting, you can use one of:
INSERT INTO test (cols,k) VALUES
(hstore( 'a', 'foo' ) || hstore( 'b', 'bar' ), 'a'),
(hstore( ARRAY['a','b'], ARRAY['zip','zag'] ), 'b');
See http://www.postgresql.org/docs/9.1/static/hstore.html for more details.
You can use dynamic SQL to select a column by name:
create or replace function get_fubar_values (col_name text, row_key integer)
returns setof text language plpgsql as $$begin
return query execute 'select ' || quote_ident(col_name) ||
' from fubar where key = $1' using row_key;
end$$;
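When the column name arrives from application code, the same idea can be combined with an allow-list, which also addresses the concern about exposing columns that were never meant to be visible. The following is a minimal sketch in Python against an in-memory SQLite database standing in for Postgres; ALLOWED_COLUMNS and get_fubar_value are hypothetical names, and the table is reduced to two text columns.

```python
import sqlite3

# Columns that callers are allowed to read by name (an assumption for this
# sketch; "key" is deliberately left out).
ALLOWED_COLUMNS = {"name1", "name2"}


def get_fubar_value(conn, col_name, row_key):
    """Return the value of col_name in the fubar row identified by row_key."""
    if col_name not in ALLOWED_COLUMNS:
        raise ValueError(f"column not allowed: {col_name!r}")
    # The identifier is interpolated only after the allow-list check;
    # the key value is always passed as a bound parameter.
    query = f'SELECT "{col_name}" FROM fubar WHERE "key" = ?'
    row = conn.execute(query, (row_key,)).fetchone()
    return None if row is None else row[0]


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE fubar (name1 TEXT, name2 TEXT, "key" INTEGER)')
conn.execute("INSERT INTO fubar VALUES ('foo', 'bar', 1)")

value = get_fubar_value(conn, "name2", 1)
missing = get_fubar_value(conn, "name1", 99)
print(value, missing)
```

Identifiers cannot be sent as bound parameters, which is why the column name has to be validated (here) or quoted with quote_ident (in the PL/pgSQL version) before it is spliced into the statement.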

postgresql company id based sequence

I have a database with companies and their products, I want for each
company to have a separate product id sequence.
I know that PostgreSQL can't do this out of the box; the only way is to have a separate sequence for each company, but this is cumbersome.
I thought about a solution to have a separate table to hold the sequences
CREATE TABLE "sequence"
(
"table" character varying(25),
company_id integer DEFAULT 0,
"value" integer
)
"table" will hold the table name for the sequence, such as products, categories etc.
and value will hold the actual sequence value that will be used for product_id on inserts.
I will use UPDATE ... RETURNING value; to get a product id.
I was wondering, is this solution efficient?
With row-level locking, only users of the same company adding rows to the same table will have to wait for a lock, and I think that reduces race condition problems.
Is there a better way to solve this problem?
I don't want to use a single sequence on the products table for all companies because the gaps between product ids would be too big; I want to keep it simple for the users.
You could just embed a counter in your companies table:
CREATE TABLE companies (
id SERIAL PRIMARY KEY,
name TEXT,
product_id INT DEFAULT 0
);
CREATE TABLE products (
company INT REFERENCES companies(id),
product_id INT,
PRIMARY KEY (company, product_id),
name TEXT
);
INSERT INTO companies (id, name) VALUES (1, 'Acme Corporation');
INSERT INTO companies (id, name) VALUES (2, 'Umbrella Corporation');
Then, use UPDATE ... RETURNING to get the next product ID for a given company:
> INSERT INTO products VALUES (1, (UPDATE companies SET product_id = product_id+1 WHERE id=$1 RETURNING product_id), 'Anvil');
ERROR: syntax error at or near "companies"
LINE 1: INSERT INTO products VALUES (1, (UPDATE companies SET produc...
^
Oh noes! It seems you can't (as of PostgreSQL 9.1devel) use UPDATE ... RETURNING as a subquery.
The good news is, it's not a problem! Just create a stored procedure that does the increment/return part:
CREATE FUNCTION next_product_id(company INT) RETURNS INT
AS $$
UPDATE companies SET product_id = product_id+1 WHERE id=$1 RETURNING product_id
$$ LANGUAGE 'sql';
Now insertion is a piece of cake:
INSERT INTO products VALUES (1, next_product_id(1), 'Anvil');
INSERT INTO products VALUES (1, next_product_id(1), 'Dynamite');
INSERT INTO products VALUES (2, next_product_id(2), 'Umbrella');
INSERT INTO products VALUES (1, next_product_id(1), 'Explosive tennis balls');
Be sure to use the same company ID in both the product value and the argument to next_product_id(company INT).
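One way to avoid mismatched company IDs is to wrap the increment and the insert in a single helper that takes the company once. Below is a minimal sketch in Python against an in-memory SQLite database standing in for Postgres; the helper name add_product is hypothetical. It uses a plain UPDATE followed by a SELECT inside one transaction rather than UPDATE ... RETURNING, so it does not depend on RETURNING support.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are explicit. SQLite stands in for Postgres.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript(
    """
    CREATE TABLE companies (
        id INTEGER PRIMARY KEY,
        name TEXT,
        product_id INT DEFAULT 0
    );
    CREATE TABLE products (
        company INT REFERENCES companies(id),
        product_id INT,
        name TEXT,
        PRIMARY KEY (company, product_id)
    );
    INSERT INTO companies (id, name) VALUES
        (1, 'Acme Corporation'),
        (2, 'Umbrella Corporation');
    """
)


def add_product(company, name):
    """Bump the company's counter and insert the product in one transaction."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        cur = conn.execute(
            "UPDATE companies SET product_id = product_id + 1 WHERE id = ?",
            (company,),
        )
        if cur.rowcount != 1:
            raise ValueError(f"unknown company: {company}")
        next_id = conn.execute(
            "SELECT product_id FROM companies WHERE id = ?", (company,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO products (company, product_id, name) VALUES (?, ?, ?)",
            (company, next_id, name),
        )
        conn.execute("COMMIT")
        return next_id
    except Exception:
        conn.execute("ROLLBACK")
        raise


assigned = [
    add_product(1, "Anvil"),
    add_product(1, "Dynamite"),
    add_product(2, "Umbrella"),
    add_product(1, "Explosive tennis balls"),
]
print(assigned)
```

Because the counter update and the insert share a transaction, a failed insert rolls the counter back as well, so each company's numbering stays free of gaps in this sketch.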
Depending on how many companies you have, you could create a sequence for each company. Query it by a function which is set as a default on your product_id column.
Alternatively this function could simply do a SELECT FOR UPDATE and update the values of your table. Should be pretty performant I think.