PostgreSQL company-ID-based sequence

I have a database with companies and their products, and I want each company to have its own product ID sequence.
As far as I know, PostgreSQL can't do this directly; the only way is to create a separate sequence for each company, but that is cumbersome.
I thought about using a separate table to hold the sequences:
CREATE TABLE "sequence"
(
"table" character varying(25),
company_id integer DEFAULT 0,
"value" integer
);
"table" will hold the name of the table the sequence belongs to, such as products or categories,
and "value" will hold the current sequence value, which will be used for product_id on inserts.
I will use UPDATE ... RETURNING value; to get a product ID.
Is this solution efficient?
With row-level locking, only users of the same company adding rows to the same table will have to wait for a lock, and I think that reduces race condition problems.
Is there a better way to solve this problem?
I don't want to use a single sequence on the products table for all companies, because the gaps between one company's product IDs would be too big; I want to keep it simple for the users.

You could just embed a counter in your companies table:
CREATE TABLE companies (
id SERIAL PRIMARY KEY,
name TEXT,
product_id INT DEFAULT 0
);
CREATE TABLE products (
company INT REFERENCES companies(id),
product_id INT,
PRIMARY KEY (company, product_id),
name TEXT
);
INSERT INTO companies (id, name) VALUES (1, 'Acme Corporation');
INSERT INTO companies (id, name) VALUES (2, 'Umbrella Corporation');
Then, use UPDATE ... RETURNING to get the next product ID for a given company:
> INSERT INTO products VALUES (1, (UPDATE companies SET product_id = product_id+1 WHERE id=$1 RETURNING product_id), 'Anvil');
ERROR: syntax error at or near "companies"
LINE 1: INSERT INTO products VALUES (1, (UPDATE companies SET produc...
^
Oh noes! It seems you can't (as of PostgreSQL 9.1devel) use UPDATE ... RETURNING as a subquery.
The good news is, it's not a problem! Just create a stored procedure that does the increment/return part:
CREATE FUNCTION next_product_id(company INT) RETURNS INT
AS $$
UPDATE companies SET product_id = product_id+1 WHERE id=$1 RETURNING product_id
$$ LANGUAGE sql;
Now insertion is a piece of cake:
INSERT INTO products VALUES (1, next_product_id(1), 'Anvil');
INSERT INTO products VALUES (1, next_product_id(1), 'Dynamite');
INSERT INTO products VALUES (2, next_product_id(2), 'Umbrella');
INSERT INTO products VALUES (1, next_product_id(1), 'Explosive tennis balls');
Be sure to use the same company ID in both the product value and the argument to next_product_id(company INT).
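If keeping the two company IDs in sync by hand feels error-prone, one option is to let a BEFORE INSERT trigger fill in product_id. This is only a sketch: it assumes the next_product_id function and the products table defined above, and the trigger and function names (products_assign_id, products_assign_id_trg) are made up for illustration.

```sql
-- Hypothetical trigger function: derives product_id from the row's own company,
-- so the two values can never disagree.
CREATE FUNCTION products_assign_id() RETURNS trigger AS $$
BEGIN
    NEW.product_id := next_product_id(NEW.company);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER products_assign_id_trg
BEFORE INSERT ON products
FOR EACH ROW EXECUTE PROCEDURE products_assign_id();

-- product_id is omitted here; the trigger supplies it before the
-- primary key constraint is checked.
INSERT INTO products (company, name) VALUES (1, 'Rocket skates');
```

With this in place, any product_id passed explicitly in an INSERT is overwritten by the trigger.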

Depending on how many companies you have, you could create a sequence for each company. Query it by a function which is set as a default on your product_id column.
Alternatively this function could simply do a SELECT FOR UPDATE and update the values of your table. Should be pretty performant I think.
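The SELECT FOR UPDATE variant could look roughly like the sketch below. It assumes the "sequence" table from the question, with one row already present per (table, company) pair; the function name next_sequence_value and its parameter names are hypothetical.

```sql
CREATE FUNCTION next_sequence_value(p_table varchar, p_company integer)
RETURNS integer AS $$
DECLARE
    v_next integer;
BEGIN
    -- Lock the counter row; concurrent callers for the same key wait here
    -- until this transaction commits or rolls back.
    SELECT "value" + 1 INTO v_next
    FROM "sequence"
    WHERE "table" = p_table AND company_id = p_company
    FOR UPDATE;

    IF NOT FOUND THEN
        RAISE EXCEPTION 'no counter row for table % and company %', p_table, p_company;
    END IF;

    UPDATE "sequence" SET "value" = v_next
    WHERE "table" = p_table AND company_id = p_company;

    RETURN v_next;
END;
$$ LANGUAGE plpgsql;
```

Note that a column default cannot see the other columns of the row being inserted, so a function like this would have to be called explicitly or from a trigger rather than set as the default on product_id.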

Related

Need help inserting data into Postgres tables

I get an error when trying to insert data into my tables, but I don't know why.
As far as I can tell, the syntax is correct. The error is:
column "population" is of type integer but expression is of type record
create table states(name varchar(25), population int );
create table countries(name varchar(25), population int );
insert into states values (('tn',54945),('ap',2308));
select name from states;
insert into countries values (('india',3022),('america',30902));
select * from countries;
There are extra parentheses around the tuples of values to insert, which turn the whole thing into a single record of records.
Instead:
insert into countries(name, population) values ('india',3022),('america',30902);
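The same correction applies to the states table from the question. A minimal version, using the values from the original statement:

```sql
-- One parenthesized tuple per row, with no outer parentheses around the list
insert into states(name, population) values ('tn',54945),('ap',2308);
select name from states;
```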

Insert a value from a table in another table as foreign key

I have two tables, cinema and theater.
Table Cinema
Columns:
id, name, is_active
Table Theater
Columns:
id, cinema_id
I'm inserting into the DB in sequence: first into cinema, then into theater. theater.cinema_id is a foreign key that references cinema.id. After inserting into cinema, I need the new cinema's id before I can insert the row into theater.
I was thinking about RETURNING id INTO cinema_id and then saving it into theater, but I don't know how to do something like this.
Any thoughts? Is there any better way to do something like this?
You have a few options.
The first one is using the lastval() function which returns the value of the last generated sequence value:
insert into cinema(name, is_active) values ('Cinema One', true);
insert into theater(cinema_id) values (lastval());
Alternatively you can pass the sequence name to the currval() function:
insert into theater(cinema_id)
values (currval(pg_get_serial_sequence('cinema', 'id')));
Alternatively you can chain the two statements using a CTE and the returning clause:
with new_cinema as (
insert into cinema (name, is_active)
values ('Cinema One', true)
returning id
)
insert into theater (cinema_id)
select id
from new_cinema;
In both statements I assume theater.id is also a generated value.
This way works:
with new_cinema as (
insert into cinema (name, is_active)
values ('Cinema One', true)
returning id
)
insert into theater (cinema_id)
select id
from new_cinema;
INSERT INTO tableB
(
columnA
)
SELECT
columnA
FROM
tableA
ORDER BY columnA desc
LIMIT 1
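The RETURNING id INTO idea from the question is PL/pgSQL syntax, so it needs a function or an anonymous DO block (available since PostgreSQL 9.0). A minimal sketch, assuming cinema.id and theater.id are generated values as in the answers above; the variable name new_cinema_id is made up:

```sql
DO $$
DECLARE
    new_cinema_id integer;  -- hypothetical variable holding the generated id
BEGIN
    INSERT INTO cinema (name, is_active)
    VALUES ('Cinema One', true)
    RETURNING id INTO new_cinema_id;

    INSERT INTO theater (cinema_id)
    VALUES (new_cinema_id);
END;
$$;
```

Unlike selecting the highest existing id with ORDER BY ... LIMIT 1, this uses the id of the row that was just inserted, so it is not affected by other sessions inserting at the same time.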

If existing record, row is returned, but if new record inserted, row is not returned

There are two tables: author and book.
I am adding a Book to the book table.
If the Author is already listed in the author table, then get the author's id and insert it into the Book row.
If the Author is not in the author table, then insert a new author and use the new id in the Book row.
This functionality works fine.
With the code below (not the actual code, but a more refined version), the database responds appropriately and rows are referenced or created as expected.
I also want the query to return the Book row, and this works too.
The Book row is always returned in all tested conditions, be it a Book with an existing author or a Book with a new author.
The issue comes when I want to join it with the author table to get the author details back as well.
Now:
If I insert a Book with a known Author, the functionality is perfect and the row is returned as expected.
If I insert a Book with a NEW Author, the new author is still created and the new book is still inserted, BUT ZERO rows are returned.
I am not sure why this is happening or how I would go about getting the row.
CREATE TABLE author (id SERIAL PRIMARY KEY, name VARCHAR (255));
CREATE TABLE book (id SERIAL PRIMARY KEY, title VARCHAR (255), author INT REFERENCES author (id));
WITH
s AS (
SELECT id FROM author
WHERE name = 'Eoin Colfer'
),
i AS (
INSERT INTO author(name)
SELECT ('Eoin Colfer')
WHERE NOT EXISTS (select 1 from s)
RETURNING id
),
j AS (
SELECT id FROM s
UNION ALL
SELECT id FROM i
),
ins AS (
INSERT INTO book
(title, author)
SELECT 'Artemis Fowl', j.id
FROM j
RETURNING *
)
SELECT ins.*, author.*
FROM ins
JOIN author
ON ins.author = author.id
;
Explanation
This has to do with the behavior of common table expressions in PostgreSQL.
Per the docs (https://www.postgresql.org/docs/current/queries-with.html):
The sub-statements in WITH are executed concurrently with each other
and with the main query. Therefore, when using data-modifying
statements in WITH, the order in which the specified updates actually
happen is unpredictable. All the statements are executed with the same
snapshot (see Chapter 13), so they cannot “see” one another's effects
on the target tables. This alleviates the effects of the
unpredictability of the actual order of row updates, and means that
RETURNING data is the only way to communicate changes between
different WITH sub-statements and the main query. An example of this
is that in
WITH t AS (
UPDATE products SET price = price * 1.05
RETURNING *
)
SELECT * FROM products;
the outer SELECT would return the original prices before the action of
the UPDATE...
The final sentence (below the code snippet) is critical.
Your query against the author table at the end returns data as it was before the insert statements within the CTEs.
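If you want to stay within a single statement, one possibility is to carry the author's name through RETURNING and join against the CTE instead of re-reading the author table. This is a sketch against the table definitions from the question (book.author as the foreign key column); the output alias author_name is made up.

```sql
WITH
s AS (
    SELECT id, name FROM author
    WHERE name = 'Eoin Colfer'
),
i AS (
    INSERT INTO author (name)
    SELECT 'Eoin Colfer'
    WHERE NOT EXISTS (SELECT 1 FROM s)
    RETURNING id, name            -- expose the new row to the rest of the query
),
j AS (
    SELECT id, name FROM s
    UNION ALL
    SELECT id, name FROM i
),
ins AS (
    INSERT INTO book (title, author)
    SELECT 'Artemis Fowl', j.id
    FROM j
    RETURNING *
)
-- Join against j, which contains the author whether it was found or just
-- inserted, rather than against the author table's pre-insert snapshot.
SELECT ins.*, j.name AS author_name
FROM ins
JOIN j
ON ins.author = j.id;
```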
Alternative Approach
An alternative approach would be to do this work in a function where you can use variables.
First, some suggested changes to your tables:
CREATE TABLE author
(
id SERIAL PRIMARY KEY,
name TEXT NOT NULL UNIQUE -- Unique for ON CONFLICT later
);
CREATE TABLE book
(
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
author_id INT NOT NULL REFERENCES author (id),
UNIQUE (title, author_id) -- Prevent duplicates
);
Example function:
CREATE OR REPLACE FUNCTION add_book (in_book_title TEXT, in_author_name TEXT)
RETURNS TABLE
(
author_id INT,
book_id INT,
author_name TEXT,
book_title TEXT
)
AS $$
#variable_conflict use_column
DECLARE
var_author_id INT;
var_book_id INT;
BEGIN
-- Upsert author, return id
INSERT INTO author (name)
VALUES (in_author_name)
ON CONFLICT (name) DO
UPDATE SET name = EXCLUDED.name -- Do update to allow use of returning
RETURNING id INTO var_author_id;
-- Upsert book, return id
INSERT INTO book (title, author_id)
VALUES (in_book_title, var_author_id)
ON CONFLICT (title, author_id) DO
UPDATE SET title = EXCLUDED.title -- Do update to allow use of returning
RETURNING id INTO var_book_id;
-- Return the record using your join (similar)
RETURN QUERY
SELECT a.id, b.id, a.name, b.title
FROM author a
INNER JOIN book b
ON a.id = b.author_id
WHERE b.id = var_book_id;
END;
$$ LANGUAGE PLPGSQL VOLATILE;
Usage:
SELECT * FROM add_book('Artemis Fowl', 'Eoin Colfer');

How can I generate big data sample for Postgresql using generate_series and random?

I want to generate big data sample (almost 1 million records) for studying tuplesort.c's polyphase merge in postgresql, and I hope the schema as follows:
CREATE TABLE Departments (code VARCHAR(4), UNIQUE (code));
CREATE TABLE Towns (
id SERIAL UNIQUE NOT NULL,
code VARCHAR(10) NOT NULL, -- not unique
article TEXT,
name TEXT NOT NULL, -- not unique
department VARCHAR(4) NOT NULL REFERENCES Departments (code),
UNIQUE (code, department)
);
How can I use generate_series and random to do this? Thanks a lot!
To insert one million rows into Towns:
insert into towns (
code, article, name, department
)
select
left(md5(i::text), 10),
md5(random()::text),
md5(random()::text),
left(md5(random()::text), 4)
from generate_series(1, 1000000) s(i);
Since id is a serial it is not necessary to include it.
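Because Towns.department references Departments (code), the insert above fails with a foreign key violation unless matching rows exist in Departments. One way around that, sketched here as an assumption rather than something from the original answer, is to load every possible four-character hex code into Departments first, since left(md5(...), 4) always yields four lowercase hex digits:

```sql
-- Run this before the insert into towns:
-- all 65,536 four-character lowercase hex codes, '0000' through 'ffff'
insert into departments (code)
select lpad(to_hex(g), 4, '0')
from generate_series(0, 65535) as g;
```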

How to implicitly insert SERIAL ID via view over more than one table

I have two tables, connected in the E/R model by an is-a relationship. One represents the "mother table"
CREATE TABLE PERSONS(
id SERIAL NOT NULL,
name character varying NOT NULL,
address character varying NOT NULL,
day_of_creation timestamp NOT NULL DEFAULT current_timestamp,
PRIMARY KEY (id)
)
the other representing the "child table"
CREATE TABLE EMPLOYEES (
id integer NOT NULL,
store character varying NOT NULL,
paychecksize integer NOT NULL,
FOREIGN KEY (id)
REFERENCES PERSONS(id),
PRIMARY KEY (id)
)
Now those two tables are joined in a view
CREATE VIEW EMPLOYEES_VIEW AS
SELECT
P.id,name,address,store,paychecksize,day_of_creation
FROM
PERSONS AS P
JOIN
EMPLOYEES AS E ON P.id = E.id
I want to write either a rule or a trigger that lets a db user insert into that view, sparing them the nasty details of the columns being split across different tables.
But I also want to make it convenient: since the id is a SERIAL and day_of_creation has a default value, there is no need for the user to provide them. Therefore a statement like
INSERT INTO EMPLOYEES_VIEW (name, address, store, paychecksize)
VALUES ('bob', 'top secret', 'drugstore', 42)
should be enough to result in
PERSONS
id|name|address |day_of_creation
-------------------------------
1 |bob |top secret| 2013-08-13 15:32:42
EMPLOYEES
id| store |paychecksize
---------------------
1 |drugstore|42
A basic rule like
CREATE RULE EMPLOYEE_VIEW_INSERT AS ON INSERT TO EMPLOYEES_VIEW
DO INSTEAD (
INSERT INTO PERSONS
VALUES (NEW.id,NEW.name,NEW.address,NEW.day_of_creation);
INSERT INTO EMPLOYEES
VALUES (NEW.id,NEW.store,NEW.paychecksize)
);
should be sufficient. But it is not convenient, because the user has to provide the id and timestamp even though that is not actually necessary.
How can I rewrite/extend that code base to match my criteria of convenience?
Something like:
CREATE RULE EMPLOYEE_VIEW_INSERT AS ON INSERT TO EMPLOYEES_VIEW
DO INSTEAD
(
INSERT INTO PERSONS (id, name, address, day_of_creation)
VALUES (default,NEW.name,NEW.address,default);
INSERT INTO EMPLOYEES (id, store, paychecksize)
VALUES (currval('persons_id_seq'),NEW.store,NEW.paychecksize)
);
That way persons.id and persons.day_of_creation are filled with their default values. Another option is to simply leave those columns out of the insert:
INSERT INTO PERSONS (name, address)
VALUES (NEW.name,NEW.address);
Once the rule is defined, the following insert should work:
insert into employees_view (name, address, store, paychecksize)
values ('Arthur Dent', 'Some Street', 'Some Store', 42);
Btw: with a current Postgres version, an INSTEAD OF trigger is the preferred way to make a view updatable.
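A minimal sketch of such an INSTEAD OF trigger, assuming the tables and view from the question; the function and trigger names are made up, and this would replace the rule rather than be combined with it:

```sql
CREATE FUNCTION employees_view_insert() RETURNS trigger AS $$
BEGIN
    -- Let persons generate id and day_of_creation, and copy them into NEW
    -- so that INSERT ... RETURNING on the view reports the real values.
    INSERT INTO persons (name, address)
    VALUES (NEW.name, NEW.address)
    RETURNING id, day_of_creation INTO NEW.id, NEW.day_of_creation;

    INSERT INTO employees (id, store, paychecksize)
    VALUES (NEW.id, NEW.store, NEW.paychecksize);

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER employees_view_insert_trg
INSTEAD OF INSERT ON employees_view
FOR EACH ROW EXECUTE PROCEDURE employees_view_insert();
```

Because the generated id comes straight from RETURNING, this version does not depend on the sequence name persons_id_seq the way the currval call in the rule does.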