I am writing a function which will select and SUM the resulting output into a new table, so I attempted to use the INTO clause. However, my standalone code works, yet once placed into a function I get an error stating that the new SELECT INTO table is not a defined variable (perhaps I am missing something). Please see the code below:
CREATE OR REPLACE FUNCTION rev_1.calculate_costing_layer()
RETURNS trigger AS
$BODY$
BEGIN
-- This will create an intersection between pipelines and sum the cost to a new table for output
-- May need to create individual cost columns- Will also keep infrastructure costing separated
--DROP table rev_1.costing_layer;
SELECT inyaninga_phases.geom, catchment_e_gravity_lines.name, SUM(catchment_e_gravity_lines.cost) AS gravity_sum
INTO rev_1.costing_layer
FROM rev_1.inyaninga_phases
JOIN rev_1.catchment_e_gravity_lines
ON ST_Intersects(catchment_e_gravity_lines.geom,inyaninga_phases.geom)
GROUP BY catchment_e_gravity_lines.name, inyaninga_phases.geom;
RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql;
Per the documentation:
CREATE TABLE AS is functionally similar to SELECT INTO. CREATE TABLE AS is the recommended syntax, since this form of SELECT INTO is not available in ECPG or PL/pgSQL, because they interpret the INTO clause differently. Furthermore, CREATE TABLE AS offers a superset of the functionality provided by SELECT INTO.
Use CREATE TABLE AS.
Although SELECT ... INTO new_table is valid PostgreSQL, its use has been deprecated (or, at least, "unrecommended"). It doesn't work at all in PL/pgSQL, because there the INTO clause is used to get results into variables.
If you want to create a new table, you should use instead:
CREATE TABLE rev_1.costing_layer AS
SELECT
inyaninga_phases.geom, catchment_e_gravity_lines.name, SUM(catchment_e_gravity_lines.cost) AS gravity_sum
FROM
rev_1.inyaninga_phases
JOIN rev_1.catchment_e_gravity_lines
ON ST_Intersects(catchment_e_gravity_lines.geom,inyaninga_phases.geom)
GROUP BY
catchment_e_gravity_lines.name, inyaninga_phases.geom;
If the table has already been created and you just want to insert new rows into it, you should use:
INSERT INTO
rev_1.costing_layer
(geom, name, gravity_sum)
-- Same SELECT as before
SELECT
inyaninga_phases.geom, catchment_e_gravity_lines.name, SUM(catchment_e_gravity_lines.cost) AS gravity_sum
FROM
rev_1.inyaninga_phases
JOIN rev_1.catchment_e_gravity_lines
ON ST_Intersects(catchment_e_gravity_lines.geom,inyaninga_phases.geom)
GROUP BY
catchment_e_gravity_lines.name, inyaninga_phases.geom;
In a trigger function, you're not very likely to create a new table every time, so my guess is that you want the INSERT and not the CREATE TABLE ... AS.
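Putting it together, a minimal sketch of the trigger function using the INSERT approach (assuming rev_1.costing_layer already exists with geom, name and gravity_sum columns) could look like this:
CREATE OR REPLACE FUNCTION rev_1.calculate_costing_layer()
RETURNS trigger AS
$BODY$
BEGIN
  -- Recompute the intersection cost sums into the existing table
  INSERT INTO rev_1.costing_layer (geom, name, gravity_sum)
  SELECT inyaninga_phases.geom,
         catchment_e_gravity_lines.name,
         SUM(catchment_e_gravity_lines.cost) AS gravity_sum
  FROM rev_1.inyaninga_phases
  JOIN rev_1.catchment_e_gravity_lines
    ON ST_Intersects(catchment_e_gravity_lines.geom, inyaninga_phases.geom)
  GROUP BY catchment_e_gravity_lines.name, inyaninga_phases.geom;
  RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql;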
I have a scenario where I need to delete rows from a table using the outcome of a nested select. Like this:
DECLARE @tbl_big TABLE (bigID int);
INSERT INTO @tbl_big (bigID)
VALUES (1),(2),(3),(4),(5);
DECLARE @tbl_small TABLE (smallID int);
INSERT INTO @tbl_small (smallID)
VALUES (1),(2),(3);
DELETE FROM @tbl_big
WHERE (bigID IN (SELECT smallID FROM @tbl_small));
SELECT *
FROM @tbl_big; -- shows 4,5 as expected
However, during development I accidentally made a typo:
DELETE FROM @tbl_big WHERE (bigID IN (SELECT bigID FROM @tbl_small)); --bigID used instead of smallID
SELECT *
FROM @tbl_big; -- no rows
The result was that all rows within the parent table were deleted.
While this may be completely acceptable T-SQL, I've never seen it applied like this, nor would I expect the statement to even compile given that @tbl_small does not contain a bigID column.
Can anybody please clarify why/how this works, and is it valid T-SQL? Also, can you provide a real-world example where this is more useful than risky(!)?
bigID in the DELETE statement you mentioned refers to @tbl_big, because it is legal to reference columns from the outer table in subqueries you write in the WHERE clause. For example, you can write the below:
DELETE FROM @tbl_big WHERE (bigID IN (SELECT smallID FROM @tbl_small WHERE smallID = bigID));
So, in your case, the subquery returned the outer row's bigID (as a constant) once per row of @tbl_small, which made the IN condition true for every row of @tbl_big.
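One way to turn this kind of typo into a compile-time error is to qualify every column with a table alias, since an unqualified column silently resolves against the outer table. A sketch using the tables above:
DELETE b
FROM @tbl_big AS b
WHERE b.bigID IN (SELECT s.smallID FROM @tbl_small AS s);
-- Writing s.bigID here would now fail to compile,
-- because @tbl_small has no bigID column.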
Not sure whether this is possible and what's the best way to do it, but I have a list of values (say 'ACBC', 'ADFC', 'AGGD', etc.) which grows over time. I want to use these values in pgAdmin as a sort of variable/parameter for SQL statements; e.g.:
Codes = {'ACBC', 'ADFC', 'AGGD'}
SQL: Statement => SELECT * FROM xxx WHERE SUBSTRING IN (Codes)
Is that somehow realizable, either with a variable, parameter, function or anything else?
I can think of the following options:
1) Create separate table
create table qry_params(prm_value varchar);
insert into qry_params values
('xxx'),
('yyy'),
('zzz');
select * from xxx where substring in (select prm_value from qry_params)
Whenever you have a new parameter, you just need to add it to the table.
2) CTE at the top of query
Use query like:
with params (prm_value) as (values ('xxx'), ('yyy'), ('zzz'))
select * from xxx where substring in (select prm_value from params)
Whenever you have a new parameter, you just need to add it to the CTE.
I am currently working on a project to import data from one table to another. I am trying to parse a field that contains a FULLNAME into its parts LAST, FIRST, MI. The names are all in the format of "LAST,FIRST MI". I have written a stored procedure that correctly parses and returns the results as necessary, but I am unsure as to how to incorporate the stored procedure into a single select statement. For instance, currently I have:
SELECT FULLNAME FROM UserInfo
and what I would like to have is something like this:
SELECT Last, First, MI from UserInfo
Currently my stored procedure takes the form of ParseName(FULLNAME, Last as OUTPUT, First as OUTPUT, MI as OUTPUT). How can I call this procedure and have the output variables split into 3 different columns?
Replace your stored procedure with a table-valued function. You can then apply this function to all the rows.
Below is an example - just put your logic for parsing the name
create FUNCTION dbo.f_parseName(@inFullName varchar(255))
RETURNS
@tbl TABLE (lastName varchar(255), firstName varchar(255), middleName varchar(255))
as
BEGIN
-- put your parsing logic here; this placeholder just takes fixed-width chunks
insert into @tbl(lastName,firstName,middleName)
select substring(@inFullName,1,10),substring(@inFullName,11,10),substring(@inFullName,21,10)
return
end
apply the function
-- sample data
declare @fullNames table (fullName varchar(255))
insert into @fullNames (fullName) values
('111111111122222222223333333333')
,('AAAAAAAAAABBBBBBBBBBCCCCCCCCCC')
select
fn.fullName
,pn.lastName
,pn.firstName
,pn.middleName
from
@fullNames fn
cross apply dbo.f_parseName(fn.fullName) pn
You could put the results of your stored procedure in a (temporary) table, like this (I added the FULLNAME column to provide a join condition, you would have to adapt your stored procedure to do that):
CREATE TABLE #temp (
FULLNAME NVARCHAR(..)
,Last NVARCHAR(..)
,First NVARCHAR(..)
,MI NVARCHAR(..)
);
INSERT INTO #temp (FULLNAME, Last, First, MI)
EXECUTE MySproc;
If you want to be able to execute SELECT Last, First, MI from UserInfo structurally, you'd have to first add three columns to UserInfo for your name information, and then insert the parsed data that you got from your stored procedure.
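A hypothetical sketch of that approach (column sizes assumed):
ALTER TABLE UserInfo
ADD Last NVARCHAR(255), First NVARCHAR(255), MI NVARCHAR(10);

-- Copy the parsed parts back by joining on the full name
UPDATE UI
SET UI.Last = T.Last, UI.First = T.First, UI.MI = T.MI
FROM UserInfo AS UI
INNER JOIN #temp AS T ON UI.FULLNAME = T.FULLNAME;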
EDIT
You mention that you use a SELECT ... INTO ... to put the data in a new table. I'm guessing that the new table does not have the FULLNAME column, in which case you would be better off using a table-valued function (as the other answer suggests). If you keep the FULLNAME column, however, you can use it to join the temp table with your new table and update the new table as follows:
UPDATE NUI
SET NUI.Last = T.Last, NUI.First = T.First, NUI.MI = T.MI
FROM NewUserInfo AS NUI
INNER JOIN #temp AS T ON NUI.FULLNAME = T.FULLNAME;
You could use this UPDATE method also with another join condition if you do not have the FULLNAME column in your new table, but make sure you run a good test beforehand to check if the join holds.
Hope this helps, good luck!
You could add computed columns to table like this:
alter table UserInfo
add firstName as SUBSTRING(fullName, CHARINDEX(',',fullName,0)+2, LEN(fullName)-CHARINDEX(',',fullName,0)-CHARINDEX(' ', REVERSE(fullName),0)-1)
,lastName as SUBSTRING(fullName, 0, CHARINDEX(',',fullName,0))
,middleInitial as REVERSE(SUBSTRING(REVERSE(fullName),0,CHARINDEX(' ', REVERSE(fullName),0)))
But the best solution would be to do it the other way around: normalize the data with real columns for firstName, lastName and middleInitial, and make fullName the computed column (a sketch follows the example below).
The expressions in the code above may need a little more work, as I am sure they can be written more effectively. I only made them work to show the idea.
After creating the computed columns you may do this:
select firstName
,lastName
,middleInitial
from UserInfo
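For the normalized layout suggested above, a rough sketch (table and column names hypothetical) stores the parts and computes the full name instead:
CREATE TABLE UserInfoNormalized (
    lastName varchar(255),
    firstName varchar(255),
    middleInitial varchar(10),
    -- computed back in the source format "LAST,FIRST MI"
    fullName AS (lastName + ',' + firstName + ' ' + middleInitial)
);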
Some SQL servers have a feature where INSERT is skipped if it would violate a primary/unique key constraint. For instance, MySQL has INSERT IGNORE.
What's the best way to emulate INSERT IGNORE and ON DUPLICATE KEY UPDATE with PostgreSQL?
With PostgreSQL 9.5, this is now native functionality (like MySQL has had for several years):
INSERT ... ON CONFLICT DO NOTHING/UPDATE ("UPSERT")
9.5 brings support for "UPSERT" operations.
INSERT is extended to accept an ON CONFLICT DO UPDATE/IGNORE clause. This clause specifies an alternative action to take in the event of a would-be duplicate violation.
...
Further example of new syntax:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;
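For the INSERT IGNORE half of the question, the same statement with DO NOTHING simply skips the conflicting rows:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username) DO NOTHING;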
Edit: in case you missed warren's answer, PG9.5 now has this natively; time to upgrade!
Building on Bill Karwin's answer, to spell out what a rule based approach would look like (transferring from another schema in the same DB, and with a multi-column primary key):
CREATE RULE "my_table_on_duplicate_ignore" AS ON INSERT TO "my_table"
WHERE EXISTS(SELECT 1 FROM my_table
WHERE (pk_col_1, pk_col_2)=(NEW.pk_col_1, NEW.pk_col_2))
DO INSTEAD NOTHING;
INSERT INTO my_table SELECT * FROM another_schema.my_table WHERE some_cond;
DROP RULE "my_table_on_duplicate_ignore" ON "my_table";
Note: The rule applies to all INSERT operations until the rule is dropped, so not quite ad hoc.
For those of you that have Postgres 9.5 or higher, the new ON CONFLICT DO NOTHING syntax should work:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT field_one, field_two, field_three
FROM source_table
ON CONFLICT (field_one) DO NOTHING;
For those of us who have an earlier version, this LEFT JOIN will work instead:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT source_table.field_one, source_table.field_two, source_table.field_three
FROM source_table
LEFT JOIN target_table ON source_table.field_one = target_table.field_one
WHERE target_table.field_one IS NULL;
Try to do an UPDATE first. If it doesn't modify any row, the row didn't exist, so do an INSERT. Obviously, you do this inside a transaction.
You can of course wrap this in a function if you don't want to put the extra code on the client side. You also need a loop to handle the very rare race condition where another transaction inserts the row between your UPDATE and your INSERT.
There's an example of this in the documentation: http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html, example 40-2 right at the bottom.
That's usually the easiest way. You can do some magic with rules, but it's likely going to be a lot messier. I'd recommend the wrap-in-function approach over that any day.
This works for single-row, or few-row, values. If you're dealing with large amounts of rows, for example from a subquery, you're best off splitting it into two queries, one for INSERT and one for UPDATE (as an appropriate join/subselect, of course - no need to write your main filter twice).
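For reference, the documentation example cited above has this shape (a merge_db function over a hypothetical table db(a int primary key, b text)):
CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key;
        -- if someone else inserts the same key concurrently,
        -- we could get a unique-key failure
        BEGIN
            INSERT INTO db(a,b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;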
To get the INSERT IGNORE logic you can do something like below. I found that simply inserting from a SELECT statement of literal values worked best; you can then mask out the duplicate keys with a NOT EXISTS clause. To get the ON DUPLICATE KEY UPDATE logic, I suspect a pl/pgsql loop would be necessary.
INSERT INTO manager.vin_manufacturer
(SELECT * FROM (VALUES
('935',' Citroën Brazil','Citroën'),
('ABC', 'Toyota', 'Toyota'),
('ZOM',' OM','OM')
) as tmp (vin_manufacturer_id, manufacturer_desc, make_desc)
WHERE NOT EXISTS (
--ignore anything that has already been inserted
SELECT 1 FROM manager.vin_manufacturer m where m.vin_manufacturer_id = tmp.vin_manufacturer_id)
)
INSERT INTO mytable(col1,col2)
SELECT 'val1','val2'
WHERE NOT EXISTS (SELECT 1 FROM mytable WHERE col1='val1')
As #hanmari mentioned in his comment. when inserting into a postgres tables, the on conflict (..) do nothing is the best code to use for not inserting duplicate data.:
query = "INSERT INTO db_table_name(column_name)
VALUES(%s) ON CONFLICT (column_name) DO NOTHING;"
The ON CONFLICT line of code allows the INSERT statement to still insert rows of data. The query-and-values code is an example of inserting data from an Excel spreadsheet into a Postgres db table.
I have constraints added to a Postgres table I use to make sure the ID field is unique. Instead of running a delete on rows of data that are the same, I add a line of SQL code that renumbers the ID column starting at 1.
Example:
q = 'ALTER SEQUENCE id_column_seq RESTART WITH 1'  # assumes the serial's backing sequence is named id_column_seq
If my data has an ID field, I do not use this as the primary ID/serial ID; I create an ID column and I set it to serial.
I hope this information is helpful to everyone.
*I have no college degree in software development/coding. Everything I know in coding, I study on my own.
Looks like PostgreSQL supports a schema object called a rule.
http://www.postgresql.org/docs/current/static/rules-update.html
You could create a rule ON INSERT for a given table, making it do NOTHING if a row exists with the given primary key value, or else making it do an UPDATE instead of the INSERT if a row exists with the given primary key value.
I haven't tried this myself, so I can't speak from experience or offer an example.
This solution avoids using rules:
BEGIN
INSERT INTO tableA (unique_column,c2,c3) VALUES (1,2,3);
EXCEPTION
WHEN unique_violation THEN
UPDATE tableA SET c2 = 2, c3 = 3 WHERE unique_column = 1;
END;
but it has a performance drawback (see PostgreSQL.org):
A block containing an EXCEPTION clause is significantly more expensive
to enter and exit than a block without one. Therefore, don't use
EXCEPTION without need.
On bulk, you can always delete the rows before the insert. A deletion of a row that doesn't exist doesn't cause an error, so it's safely skipped.
For data import scripts, to replace "IF NOT EXISTS", in a way, there's a slightly awkward formulation that nevertheless works:
DO
$do$
BEGIN
PERFORM id
FROM whatever_table;  -- typically with a WHERE clause matching the row you are about to insert
IF NOT FOUND THEN
-- INSERT stuff
END IF;
END
$do$;
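Filled in with a hypothetical table t (id int primary key, val text), the pattern reads:
DO
$do$
BEGIN
   PERFORM id FROM t WHERE id = 42;
   IF NOT FOUND THEN
      INSERT INTO t (id, val) VALUES (42, 'default');
   END IF;
END
$do$;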