Inserting multiple records and updating identity column in SQL - sql-server-2008-r2

I have to insert multiple records into a table and, at the same time, insert the identity values generated by the first table into another table. Can I do it without a loop?
Edited
I have two tables named StudentMaster and StudentSujects.
First Table structure is (StudentID int Identity(1,1),StudentName varchar(100))
Second table structure is (SubjectID int Identity(1,1),StudentID int,SubjectName varchar(100)).
StudentID in the 'StudentSujects' table is the Identity column of first table 'StudentMaster'.
INSERT INTO StudentMaster
(
StudentName
)
SELECT StudentName
FROM OPENXML(@hDoc,'/XML/Students')
WITH( StudentName varchar(100) 'StudentName')
I am inserting multiple records into the first table using the above query. In the meantime, I have to insert the identity value of each row into the second table.

You can use the OUTPUT clause to output multiple columns/rows on an INSERT operation into a table variable.
Assuming your table that you're inserting into has an IDENTITY column called ID, you could have code something like this:
DECLARE @InsertedData TABLE (NewID INT, SomeOtherColumn.....)
INSERT INTO dbo.YourTable(Col1, Col2, ..., ColN)
OUTPUT Inserted.ID, Inserted.OtherColumn INTO @InsertedData(NewID, SomeOtherColumn)
VALUES(Val11, Val12, ..., Val1N),
(Val21, Val22, ..., Val2N),
....
(ValM1, ValM2, ..., ValMN)
Of course, you need to have something that allows you to identify which row in your second table to insert which value into - that's entirely dependent on your situation (and you didn't offer any explanation of that in your question).
But basically, using the OUTPUT clause, you can capture as much information as you need, including the newly assigned IDENTITY values, so that you can then do your second insert based on that information.
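The "second insert based on that information" step can be tried end to end without SQL Server. Below is a minimal Python sketch using the standard-library sqlite3 module as a stand-in. SQLite has no OUTPUT clause, so this sketch swaps in a different technique: it recovers the generated StudentID values by joining back on StudentName, which assumes the names in a batch are unique. The staging table `incoming` and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StudentMaster (
        StudentID   INTEGER PRIMARY KEY AUTOINCREMENT,
        StudentName TEXT NOT NULL
    );
    CREATE TABLE StudentSujects (
        SubjectID   INTEGER PRIMARY KEY AUTOINCREMENT,
        StudentID   INTEGER NOT NULL REFERENCES StudentMaster(StudentID),
        SubjectName TEXT NOT NULL
    );
    -- Hypothetical staging table holding the parsed input rows.
    CREATE TABLE incoming (StudentName TEXT, SubjectName TEXT);
""")

rows = [("Ann", "Math"), ("Ann", "Physics"), ("Bob", "History")]
conn.executemany("INSERT INTO incoming VALUES (?, ?)", rows)

# Step 1: one set-based insert creates all students (no loop).
conn.execute("""
    INSERT INTO StudentMaster (StudentName)
    SELECT DISTINCT StudentName FROM incoming
""")

# Step 2: a second set-based insert picks up the generated ids via a join.
# Assumption: StudentName identifies a student uniquely within the batch.
conn.execute("""
    INSERT INTO StudentSujects (StudentID, SubjectName)
    SELECT m.StudentID, i.SubjectName
    FROM incoming AS i
    JOIN StudentMaster AS m ON m.StudentName = i.StudentName
""")

subjects_by_student = conn.execute("""
    SELECT m.StudentName, s.SubjectName
    FROM StudentSujects AS s
    JOIN StudentMaster AS m ON m.StudentID = s.StudentID
    ORDER BY m.StudentName, s.SubjectName
""").fetchall()
student_count = conn.execute("SELECT COUNT(*) FROM StudentMaster").fetchone()[0]
```

On SQL Server the table variable filled by OUTPUT would take the place of the join on StudentName, which removes the uniqueness assumption.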

Related

insert function with side-effects: how to take insert parameter?

I'm interested in writing a Postgres function that inserts a new row to e.g. an invoice table, and performs some side effects based on the result of the insertion. My invoice table has some columns with default values (e.g. auto-generated primary key column id), and some that are optional.
I'm wondering if there's a way to take a parameter that represents a row of the invoice table, possibly without default and optional fields, and insert that value directly as a row.
CREATE OR REPLACE FUNCTION public.insert_invoice(new_invoice invoice)
RETURNS uuid
LANGUAGE sql
AS $$
WITH invoice_insert_result AS (
-- This fails: new_invoice has type invoice, but expected type uuid (because it thinks we want to put `new_invoice` in the "id" column)
INSERT INTO invoice VALUES (new_invoice)
RETURNING id
)
-- Use the result to perform side-effects
SELECT invoice_insert_result.id
$$;
I know this is possible to do by replicating the schema of the invoice table in the list of parameters of the function, however I'd prefer not to do that since it would mean additional boilerplate and maintenance burden.
The id column (a uuid) is generated automatically, so you normally do not supply a value for it yourself.
The target column names can be listed in any order. If no list of
column names is given at all, the default is all the columns of the
table in their declared order; or the first N column names, if there
are only N columns supplied by the VALUES clause or query. The values
supplied by the VALUES clause or query are associated with the
explicit or implicit column list left-to-right.
https://www.postgresql.org/docs/current/sql-insert.html
The quoted part means that an INSERT command can either list the target columns explicitly or omit the list. If the list is omitted, the values after VALUES are matched against the table's columns in their declared order, starting with the first column.
To achieve your intended result, the INSERT command must specify the column list. Without it, the first value supplied is assigned to the id column, which is supposed to be auto-generated. The same applies to a bigserial column: you leave it out of the column list so that its default is used.
The other, non-automatic columns can be bundled into a custom composite type.
demo:
create type inv_insert_template as (receiver text, base_amount numeric,tax_rate numeric);
full function:
CREATE OR REPLACE FUNCTION public.insert_invoice(new_invoice inv_insert_template)
RETURNS bigint
LANGUAGE sql
AS $$
WITH invoice_insert_result AS (
INSERT INTO invoices(receiver,base_amount, tax_rate)
VALUES (new_invoice.receiver,
new_invoice.base_amount,
new_invoice.tax_rate) RETURNING inv_no
)
SELECT invoice_insert_result.inv_no from invoice_insert_result;
$$;
call it: select * from public.insert_invoice(row('person_c', 1000, 0.1));
db fiddle demo
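The same idea, a parameter type that carries only the caller-supplied fields combined with an INSERT that names its columns explicitly, can be sketched outside PostgreSQL. This Python example uses sqlite3 as a stand-in. The `invoices` columns mirror the function above, while the `InvoiceTemplate` dataclass, the Python `insert_invoice` helper, and the `created_at` default column are illustrative assumptions.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass
class InvoiceTemplate:
    """Only the fields the caller must supply; generated columns are left out."""
    receiver: str
    base_amount: float
    tax_rate: float


def insert_invoice(conn: sqlite3.Connection, new_invoice: InvoiceTemplate) -> int:
    # The explicit column list lets inv_no and created_at fall back to defaults.
    cur = conn.execute(
        "INSERT INTO invoices (receiver, base_amount, tax_rate) VALUES (?, ?, ?)",
        astuple(new_invoice),
    )
    return cur.lastrowid  # the generated inv_no


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE invoices (
        inv_no      INTEGER PRIMARY KEY AUTOINCREMENT,
        receiver    TEXT NOT NULL,
        base_amount REAL NOT NULL,
        tax_rate    REAL NOT NULL,
        created_at  TEXT DEFAULT CURRENT_TIMESTAMP  -- assumed extra default column
    )
""")

first = insert_invoice(conn, InvoiceTemplate("person_c", 1000, 0.1))
second = insert_invoice(conn, InvoiceTemplate("person_d", 250, 0.2))
```

Adding another defaulted column to the table later does not require touching the template type, which is the maintenance benefit the composite type gives in the PostgreSQL version as well.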

Import a csv with foreignkeys

Let's say I have 2 tables: Students and Groups.
The Group table has 2 columns: id, GroupName
The Student table has 3 columns: id, StudentName and GroupID
The GroupID is a foreign key to a Group field.
I need to import the Students table from a CSV, but in my CSV instead of the Group id appears the name of the group. How can I import it with pgAdmin without modifying the csv?
Based on Laurenz's answer, use the following scripts:
Create a temp table to insert from CSV file:
CREATE TEMP TABLE std_temp (id int, student_name char(25), group_name char(25));
Then, import the CSV file:
COPY std_temp FROM '/home/username/Documents/std.csv' CSV HEADER;
Now, create std and grp tables for students and groups:
CREATE TABLE grp (id int, name char(25));
CREATE TABLE std (id int, name char(20), grp_id int);
Now it is the grp table's turn to be populated from the distinct group names. Note how row_number() is used to provide the value for id:
INSERT INTO grp (id, name) select row_number() OVER (), * from (select distinct group_name from std_temp) as foo;
And the final step, select data based on the join then insert it into the std table:
insert into std (id, name, grp_id) select std_temp.id, std_temp.student_name,grp.id from std_temp inner join grp on std_temp.group_name = grp.name;
At the end, retrieve the data from the final std table:
select * from std;
Your easiest option is to import the file into a temporary table that is defined like the CSV file. Then you can join that table with the "groups" table and use INSERT INTO ... SELECT ... to populate the "students" table.
There is of course also the option to define a view on a join of the two tables and define an INSTEAD OF INSERT trigger on the view that inserts values into the underlying tables as appropriate. Then you could load the data directly to the view.
The suggestion by @LaurenzAlbe is the obvious approach (IMHO never load a spreadsheet directly into your tables; they are untrustworthy beasts). But I believe your implementation after loading the staging table is flawed.
First, using row_number() virtually ensures you end up with duplicated ids or several ids for the same group name. The ids always increment by 1 from 1 up to the number of group names, no matter how many groups were loaded previously, and you cannot ensure the identical sequence on subsequent spreadsheets. What happens when a spreadsheet contains a group that did not previously exist?
Further, there is no validation that the group name does not already exist. Result: duplicate group names and/or multiple ids for the same name.
Second, using the id from the spreadsheet as the id in the student (std) table is full of error possibilities. How do you ensure that number is unique across spreadsheets?
Even if it is unique within a single spreadsheet, how do you ensure another spreadsheet does not use the same numbers as a previous one? Or, assuming multiple users create the spreadsheets, that one user's numbers do not overlap another user's, even if all users are very conscious of the numbers they use? Result: duplicate id numbers.
A much better approach would be to put a unique key on the group table's name column, then insert any group names from the stage table into the group table, skipping any duplicate names (using on conflict). Then load the student table directly from the stage table while selecting the group id from the group table by the (now unique) group name.
create table csv_load_temp( junk_num integer, student_name text, group_name text);
create table groups( grp_id integer generated always as identity
, name text
, grp_key text generated always as ( lower(name) ) stored
, constraint grp_pk
primary key (grp_id)
, constraint grp_bk
unique (grp_key)
);
create table students (std_id integer generated always as identity
, name text
, grp_id integer
, constraint std_pk
primary key (std_id)
, constraint std2grp_fk
foreign key (grp_id)
references groups(grp_id)
);
-- Function to load Groups and Students
create or replace function establish_students()
returns void
language sql
as $$
insert into groups (name)
select distinct group_name
from csv_load_temp
on conflict (grp_key) do nothing;
insert into students (name, grp_id)
select student_name, grp_id
from csv_load_temp t
join groups grp
on (grp.name = t.group_name);
$$;
The groups table requires Postgres v12. For prior versions remove the grp_key column
and put the unique constraint directly on the name column. What to do about capitalization is up to your business logic.
See fiddle for full example. Obviously the 2 inserts in the Establish_Students function can be run standalone and independently. In that case the function itself is not necessary.
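The repeatable load described in this answer (unique key on the group name, skip names that already exist, generate student ids instead of trusting the spreadsheet) can also be exercised in a small Python sketch. sqlite3 stands in for PostgreSQL here, so `INSERT OR IGNORE` plays the role of `ON CONFLICT ... DO NOTHING`, and a `COLLATE NOCASE` unique column replaces the generated `grp_key` column. The table name `grp` and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE csv_load_temp (junk_num INTEGER, student_name TEXT, group_name TEXT);
    CREATE TABLE grp (
        grp_id INTEGER PRIMARY KEY AUTOINCREMENT,
        name   TEXT NOT NULL COLLATE NOCASE UNIQUE  -- case-insensitive business key
    );
    CREATE TABLE students (
        std_id INTEGER PRIMARY KEY AUTOINCREMENT,
        name   TEXT NOT NULL,
        grp_id INTEGER NOT NULL REFERENCES grp(grp_id)
    );
""")


def establish_students(conn, staged_rows):
    """Load one staged batch: add unseen groups, then add the students."""
    conn.execute("DELETE FROM csv_load_temp")
    conn.executemany("INSERT INTO csv_load_temp VALUES (?, ?, ?)", staged_rows)
    # Group names that already exist (ignoring case) are skipped, not duplicated.
    conn.execute("""
        INSERT OR IGNORE INTO grp (name)
        SELECT DISTINCT group_name FROM csv_load_temp
    """)
    # Student ids are generated; the spreadsheet's own number is ignored.
    conn.execute("""
        INSERT INTO students (name, grp_id)
        SELECT t.student_name, g.grp_id
        FROM csv_load_temp AS t
        JOIN grp AS g ON g.name = t.group_name
    """)


# Two "spreadsheets" that reuse the same junk numbers and one group name.
establish_students(conn, [(1, "Ann", "Red"), (2, "Bob", "Blue")])
establish_students(conn, [(1, "Cid", "red"), (2, "Dee", "Green")])

group_names = [r[0] for r in conn.execute("SELECT name FROM grp ORDER BY name")]
student_groups = conn.execute("""
    SELECT s.name, g.name
    FROM students AS s
    JOIN grp AS g ON g.grp_id = s.grp_id
    ORDER BY s.name
""").fetchall()
```

The second batch adds only the new group, and the student whose group differs only in capitalization is attached to the existing row.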

Reading an append-only list from PostgreSQL

I would like to implement an append-only list in PostgreSQL. Basically, this is trivial: Create a table, and only ever INSERT into that table.
However, I would like to be able to read that list again, in the order it was created. How can I do this? Is a simple SELECT * FROM MyTable enough? If not, what do I sort by?
Rows in a relational database have no inherent sort order. The only way to get a guaranteed sort order is to use an order by.
You can either create an identity column that is incremented on every insert or a timestamp column that records the precise time a row was inserted (or do both).
e.g.
create table append_only
(
id bigint generated always as identity,
... other columns ...
created_at timestamp default clock_timestamp()
);
Then use that column for an order by. By having both, you can use the id column as a tie breaker when sorting by the timestamp in case two rows were inserted at exactly same microsecond.
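To see the tie-breaker at work, here is a minimal Python sketch with sqlite3 as a stand-in for PostgreSQL. The `payload` column and the hard-coded timestamps are assumptions chosen so that two rows share the same microsecond.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE append_only (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        payload    TEXT NOT NULL,
        created_at TEXT NOT NULL
    )
""")

# Two rows share a timestamp on purpose, to exercise the tie-breaker.
events = [
    ("first",  "2024-01-01T10:00:00.000000"),
    ("second", "2024-01-01T10:00:00.000000"),
    ("third",  "2024-01-01T10:00:00.000001"),
]
conn.executemany(
    "INSERT INTO append_only (payload, created_at) VALUES (?, ?)", events
)

# Without ORDER BY the row order is unspecified; with it, the order is fixed.
ordered = [row[0] for row in conn.execute(
    "SELECT payload FROM append_only ORDER BY created_at, id"
)]
```

Sorting by `created_at` alone would leave the relative order of "first" and "second" undefined; adding `id` as the second sort key makes the result deterministic.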
You could create a column with data type SERIAL (similar to AUTOINCREMENT/SEQUENCE):
CREATE TABLE myTable(id SERIAL, ...)
SELECT * FROM myTable ORDER BY id;

Inheritance and Relationship in PostgreSQL

I have some problems with SQL-code.
Creating the sequences and tables and inserting some data:
CREATE SEQUENCE tmp_id_places START 1;
CREATE SEQUENCE tmp_id_books START 1;
CREATE TABLE tmp_places (
id int PRIMARY KEY DEFAULT nextval('tmp_id_places'),
name text
);
CREATE TABLE tmp_cities (population int) INHERITS (tmp_places);
CREATE TABLE tmp_rivers (lenght int) INHERITS (tmp_places);
INSERT INTO tmp_cities (name, population) VALUES
('Moscow', 15),
('St. Petersburg', 9);
INSERT INTO tmp_rivers (name, lenght) VALUES
('Volga', 115),
('Angara', 319);
CREATE TABLE tmp_books (
id int PRIMARY KEY DEFAULT nextval('tmp_id_books'),
id_place int REFERENCES tmp_places(id),
title text
);
But this code raises an error:
INSERT INTO tmp_books (title, id_place) VALUES
('Some book about Moscow', 1),
('Another book about Angara', 4);
The table tmp_books contains information about places. But I can't insert data into it, because there isn't any data in the master table tmp_places (all the data is in the child tables).
So can this be resolved somehow?
Take a closer look at this section in the PostgreSQL docs. If you insert data into the child table, then the data will be found only in the child table. On the other hand, inserting into the master table makes new rows visible in all the child tables also. So you always have to work on the master tables first.
I worked with inheritance a while ago and faced the same problem.
I ended up with the following:
INSERT a new entry into the tmp_places;
UPDATE extra fields, say, in tmp_cities with their respective values.
Back in the 7.4 days I had to create a set of functions for such activities.
Now it is possible to use the RETURNING clause of INSERT statement and CTEs with UPDATE (also on SQL Fiddle):
WITH theid AS (
INSERT INTO tmp_places (name) VALUES ('Moscow') RETURNING id
)
UPDATE tmp_cities tc SET population = 15
FROM theid
WHERE tc.id = theid.id;
You should also be careful with constraints, as not all of them are inherited.
Denis,
Inheritance does not propagate in INSERT and COPY statements in Postgres.
In PostgreSQL you shouldn't create a foreign key to a parent table because, as you just found out, this table acts almost like a view rather than a table (since the actual data is in the respective children). For now this can only be solved with triggers.
You can see an example of this "type of triggers".

Is there a way to quickly duplicate record in T-SQL?

I need to duplicate selected rows with all fields exactly the same except the ID (an int identity column), which is added automatically by SQL Server.
What is the best way to duplicate/clone a record or records (up to 50)?
Is there any T-SQL functionality in MS SQL 2008 for this, or do I need to do a select/insert in stored procedures?
The only way to accomplish what you want is by using Insert statements which enumerate every column except the identity column.
You can of course select multiple rows to be duplicated by using a Select statement in your Insert statements. However, I would assume that this will violate your business key (your other unique constraint on the table other than the surrogate key which you have right?) and require some other column to be altered as well.
Insert MyTable( ...
Select ...
From MyTable
Where ....
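Enumerating every non-identity column by hand gets tedious on wide tables, so the column list can be built from the table's metadata instead. The following is a minimal Python sketch with sqlite3 standing in for SQL Server (there the catalog views would be queried rather than `PRAGMA table_info`). The helper `duplicate_rows`, the `MyTable` columns and the sample rows are made up for illustration.

```python
import sqlite3


def duplicate_rows(conn, table, id_column, where_sql, params=()):
    """Copy the matching rows of `table`, letting the id column be regenerated."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    # `table`, `id_column` and `where_sql` are trusted here; never pass user input.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')
               if row[1] != id_column]
    col_list = ", ".join(f'"{c}"' for c in columns)
    cur = conn.execute(
        f'INSERT INTO "{table}" ({col_list}) '
        f'SELECT {col_list} FROM "{table}" WHERE {where_sql}',
        params,
    )
    return cur.rowcount  # number of rows that were copied


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT, Qty INTEGER)"
)
conn.executemany(
    "INSERT INTO MyTable (Name, Qty) VALUES (?, ?)",
    [("bolt", 10), ("nut", 20), ("gear", 5)],
)

copied = duplicate_rows(conn, "MyTable", "ID", "Qty >= ?", (10,))
all_rows = conn.execute("SELECT ID, Name, Qty FROM MyTable ORDER BY ID").fetchall()
```

The copies receive fresh ID values while every other column is carried over unchanged; a unique business key on the table would still have to be handled separately, as noted above.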
If it is a pure copy (minus the ID field) then the following will work (replace 'NameOfExistingTable' with the table you want to duplicate the rows from and optionally use the Where clause to limit the data that you wish to duplicate):
SELECT *
INTO #TempImportRowsTable
FROM (
SELECT *
FROM [NameOfExistingTable]
-- WHERE ID = 1
) AS createTable
-- If needed make other alterations to the temp table here
ALTER TABLE #TempImportRowsTable DROP COLUMN Id
INSERT INTO [NameOfExistingTable]
SELECT * FROM #TempImportRowsTable
DROP TABLE #TempImportRowsTable
If you're able to check the duplication condition as rows are inserted, you could put an INSERT trigger on the table. This would allow you to check the columns as they are inserted instead of having to select over the entire table.