I have the following DB
CREATE TABLE IF NOT EXISTS users (
user_uid INTEGER PRIMARY KEY,
user_name CHAR(64) NOT NULL,
token_id INTEGER
);
CREATE TABLE IF NOT EXISTS unique_thing (
id SERIAL PRIMARY KEY,
unique_thing_id INTEGER NOT NULL,
option_id INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS example (
id SERIAL PRIMARY KEY,
variable INTEGER NOT NULL,
variable_2 INTEGER NOT NULL,
char_var CHAR(64) NOT NULL,
char_var2 CHAR(512),
char_var3 CHAR(256),
file_path CHAR(256) NOT NULL
);
CREATE TABLE IF NOT EXISTS different_option_of_things (
id SERIAL PRIMARY KEY,
name CHAR(64)
);
CREATE TABLE IF NOT EXISTS commits (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL,
unique_thing_id INTEGER NOT NULL,
value REAL NOT NULL,
var CHAR(512) NOT NULL,
example_id INTEGER NOT NULL,
boolean_var boolean NOT NULL
);
The tables unique_thing, different_option_of_things and example will be static (data will be added rarely and manually).
The table commits will be rather large. It is insert-only (I will delete very rarely).
The table users is for user identification. It will not be as large as unique_thing, but it will still have quite a few users.
The data of the tables will be as follows:
INSERT INTO users VALUES(1, 'pacefist', 2);
INSERT INTO users VALUES(3, 'motherfucker', 4);
INSERT INTO users VALUES(4, 'cheater', 5);
INSERT INTO different_option_of_things VALUES(1, 'blablab');
INSERT INTO different_option_of_things VALUES(2, 'smth different');
INSERT INTO different_option_of_things VALUES(3, 'unique_thing');
INSERT INTO different_option_of_things VALUES(4, 'unique_thing2');
INSERT INTO unique_thing VALUES(DEFAULT, 1, 1);
INSERT INTO unique_thing VALUES(DEFAULT, 1, 3);
INSERT INTO unique_thing VALUES(DEFAULT, 2, 3);
INSERT INTO unique_thing VALUES(DEFAULT, 2, 2);
INSERT INTO example VALUES(1, 20, 20, 'fsdfsdf', 'fgdfgdfg', 'url', '/home/user/file.txt');
INSERT INTO example VALUES(2, 24, 40, 'sfadfadf', 'dfgdfg', 'url', '/home/user/file2.txt');
INSERT INTO commits VALUES(DEFAULT, 1, 1, 55.43, '1234567', 1, TRUE);
INSERT INTO commits VALUES(DEFAULT, 2, 1, 97.85, '1234573', 2, TRUE);
INSERT INTO commits VALUES(DEFAULT, 3, 1, 0.001, '98766543', 1, TRUE);
INSERT INTO commits VALUES(DEFAULT, 4, 2, 100500.00, 'xxxxxxxx', 1, TRUE);
So, the data will be inserted in the following way:
1) I have input data: a set of different_option_of_things names, e.g. [blablab, unique_thing], the REAL value (like 8.9999) and the example name, like `fsdfsdf`
2) It's necessary to find this record in the table `unique_thing`:
a) if we've found 2 or more rows, or haven't found anything,
the result is false -> the search is over
b) if we've found exactly 1 row, then
3) we search for all values (the record from unique_thing) in the `commits` table:
a) if it has been found
a.1 search for the given example name
a.1.1 if found -> get the first 25 values and check whether the current value is bigger
a.1.1.1 if yes, we make a commit
a.1.1.2 if no, do nothing (do not duplicate the value)
a.1.2 if not found -> no results
a.2 if not found -> no results
The second function will be almost the same, but without the insertion: we will just run a selection (only to get data) and will search for all existing values in the table `example` (not only one).
The question: is it better to create 3 functions instead of one big query?
SELECT count(1) AS counter FROM different_option_of_things
WHERE name IN (SELECT unnest(ARRAY['blablab', 'unique_thing']));
SELECT * FROM example WHERE char_var = 'fsdfsdf';
SELECT *
FROM commits
JOIN unique_thing
ON commits.unique_thing_id = unique_thing.id
WHERE value > 8.9999
LIMIT 25;
if 0 results -> do a commit
Or is it better to write one enormous query? I am using PostgreSQL, Tornado and momoko.
I would prefer two stored procedures, one to get data and one to insert it.
Pros:
- all the required data is in the db, so it seems like a job for the db
- each call to execute needs to:
  get/wait for an available connection in the pool (depending on your app),
  run the query,
  fetch the data,
  release the connection,
  and between all of this, operate on the IOLoop.
  Although momoko is non-blocking, it is not for free.
- the db can be an API, not only a sack of data
Cons:
- logic in the db means you depend on it - changing the db engine (for example to Cassandra) will be harder
- often logic in the db means there are no tests. Of course you can and you should test it (e.g. with pgTap)
- for simple tasks it seems like overkill
It is a matter of db and app load, performance and time constraints - in other words, run tests and choose the solution that meets your expectations/requirements.
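For illustration, a minimal sketch of what the insert procedure could look like for the workflow above (function and parameter names are mine; the exact matching rules and the contents of commits.var are assumptions):

CREATE OR REPLACE FUNCTION insert_commit(
    p_options text[],   -- names from different_option_of_things
    p_value   real,     -- the REAL value, e.g. 8.9999
    p_example text,     -- the example name, e.g. 'fsdfsdf'
    p_user_id integer
) RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
    v_thing_ids  integer[];
    v_example_id integer;
BEGIN
    -- step 2: collect the unique_thing groups that contain all the given options
    SELECT array_agg(t.unique_thing_id) INTO v_thing_ids
    FROM (SELECT ut.unique_thing_id
          FROM unique_thing ut
          JOIN different_option_of_things d ON d.id = ut.option_id
          WHERE d.name = ANY (p_options)
          GROUP BY ut.unique_thing_id
          HAVING count(*) = array_length(p_options, 1)) t;
    -- step 2a: none found, or 2 or more -> false, the search is over
    IF v_thing_ids IS NULL OR array_length(v_thing_ids, 1) <> 1 THEN
        RETURN false;
    END IF;
    -- step a.1: look up the example by name
    SELECT id INTO v_example_id FROM example WHERE char_var = p_example;
    IF NOT FOUND THEN
        RETURN false;
    END IF;
    -- step a.1.1: commit only if the new value beats the stored ones
    IF EXISTS (SELECT 1 FROM (SELECT value FROM commits
                              WHERE unique_thing_id = v_thing_ids[1]
                                AND example_id = v_example_id
                              LIMIT 25) c
               WHERE c.value >= p_value) THEN
        RETURN false;  -- do not duplicate the value
    END IF;
    INSERT INTO commits (user_id, unique_thing_id, value, var, example_id, boolean_var)
    VALUES (p_user_id, v_thing_ids[1], p_value, '', v_example_id, true);
    RETURN true;
END;
$$;

The read-only function would be the same without the final INSERT, returning the selected rows instead of a boolean.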
Related
I am building a soccer management tool where the league's admin can update the score of every match in the MATCHES table. At the same time I want to update the TEAMS table columns.
For instance, if the match is DALLAS vs PHOENIX and the score was DALLAS 2 - PHOENIX 3, I want to update that match in the MATCHES table (I know how to do this), but at the same time I want to update the points of those two teams based on the result we just updated.
Is there a way to do that in PostgreSQL?
Thanks for your help.
You can do this with triggers. What is a database trigger? A database trigger is a special stored procedure that is run when specific actions occur within a database. Most triggers are defined to run when changes are made to a table's data. Triggers can be defined to run after (or before) INSERT, UPDATE, and DELETE on table records. Triggers use two special database objects, INSERTED and DELETED, to access the rows affected by the database actions.
When a table record is inserted - use the INSERTED table to determine which rows were added to the table.
When a table record is deleted - use the DELETED table to see which rows were removed from the table.
When a table record is updated - use the INSERTED table to inspect the new or updated values and the DELETED table to see the values prior to the update.
In PostgreSQL the INSERTED trigger object is called NEW and the DELETED object is called OLD.
For example:
We have two tables, user_group and user_detail. I would like to insert 12 records into table user_detail when inserting data into table user_group.
CREATE TABLE examples.user_group (
id serial4 NOT NULL,
group_name varchar(200) NOT NULL,
user_id int4 NOT NULL
);
CREATE TABLE examples.user_detail (
id serial4 NOT NULL,
user_id int4 NOT NULL,
"month" int2 NOT NULL
);
-- create a trigger function that inserts 12 records into the user_detail table
CREATE OR REPLACE FUNCTION examples.f_user_group_after_insert()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
DECLARE
    p_user_id integer;
BEGIN
    -- NEW is the trigger object that holds the freshly inserted user_group row
    p_user_id := NEW.user_id;
    -- insert one user_detail record for each of the 12 months
    FOR p_month IN 1..12 LOOP
        INSERT INTO examples.user_detail (user_id, month)
        VALUES (p_user_id, p_month);
    END LOOP;
    RETURN NEW;
END;
$function$;
-- attach the trigger function to the user_group table, to run after each insert
CREATE TRIGGER user_group_after_insert
AFTER INSERT ON examples.user_group
FOR EACH ROW EXECUTE FUNCTION examples.f_user_group_after_insert();
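Applied to the soccer question above, the same pattern might look like this (a sketch only: the matches/teams column names and the 3/1/0 point rule are assumptions, and a real version would also have to subtract the points awarded for the OLD result when a score is corrected):

CREATE OR REPLACE FUNCTION f_matches_after_update()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
    -- award points based on the freshly saved score (the NEW row)
    IF NEW.home_score > NEW.away_score THEN
        UPDATE teams SET points = points + 3 WHERE id = NEW.home_team_id;
    ELSIF NEW.home_score < NEW.away_score THEN
        UPDATE teams SET points = points + 3 WHERE id = NEW.away_team_id;
    ELSE
        UPDATE teams SET points = points + 1
        WHERE id IN (NEW.home_team_id, NEW.away_team_id);
    END IF;
    RETURN NEW;
END;
$function$;

CREATE TRIGGER matches_after_update
AFTER UPDATE ON matches
FOR EACH ROW EXECUTE FUNCTION f_matches_after_update();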
I'd like to have a column constraint based on a combination of 2 columns. I can't find a way to use a foreign key here, because it would have to be a conditional FK. I hope this basic SQL shows the problem:
CREATE TABLE performer_type (
id serial primary key,
type varchar
);
INSERT INTO performer_type ( id, type ) VALUES (1, 'singer'), ( 2, 'band');
CREATE TABLE singer (
id serial primary key,
name varchar
);
INSERT INTO singer ( id, name ) VALUES (1, 'Robert');
CREATE TABLE band (
id serial primary key,
name varchar
);
INSERT INTO band ( id, name ) VALUES (1, 'Animates'), ( 2, 'Zed Leppelin');
CREATE TABLE gig (
id serial primary key,
performer_type_id int default null, /* FK, no problem */
performer_id int default null /* want FK based on previous FK, no good solution so far */
);
INSERT INTO gig ( performer_type_id, performer_id ) VALUES ( 1,1 ), (2,1), (2,2), (1,2), (2,3);
Now, the last INSERT works, but I'd like it to fail for the last 2 value pairs, because there is no singer with ID 2 nor a band with ID 3. How can I set such a constraint?
I already asked a similar question in a MySQL context and the only solution was to use a trigger. The problem with a trigger was: you can't have a dynamic list of types and tables. I'd like to add types (and related tables) on the fly.
I also found a very promising pattern, but it is upside down for my case; I have not figured out how to turn it around so it works for me.
What I am looking for here seems to me such a useful pattern that I think there must be some common way to do it. Is there?
Edit.
It seems I chose bad items for my examples, so I'll try to make it clear: the different performer tables (singer and band) have NO relation between them. The gig table just has to list tasks for different performers, without setting any relations between them.
Another example would be items in stock: I may have an item_type table, which defines hundreds of item types with related tables (for example, orange and house), and there should be a stock table which lists all appearances of items.
The PostgreSQL version I use is 9.6.
Based on @Laurenz Albe's answer, I formed a solution for the example above. The main difference: there is a parent table performer, whose PK is the FK/PK of the specific performer tables and is also referenced from the gig table.
CREATE TABLE performer_type (
id serial primary key,
type varchar
);
INSERT INTO performer_type ( id, type ) VALUES (1, 'singer' ), ( 2, 'band' );
CREATE TABLE performer (
id serial primary key,
performer_type_id int REFERENCES performer_type(id)
);
CREATE TABLE singer (
id int primary key REFERENCES performer(id),
name varchar
);
INSERT INTO performer ( performer_type_id ) VALUES (1); -- get PK 1 for next statement
INSERT INTO singer ( id, name ) VALUES (1, 'Robert');
CREATE TABLE band (
id int primary key REFERENCES performer(id),
name varchar
);
INSERT INTO performer ( performer_type_id ) VALUES (2); -- get PK 2 for next statement
INSERT INTO band ( id, name ) VALUES (2, 'Animates');
INSERT INTO performer ( performer_type_id ) VALUES (2); -- get PK 3 for next statement
INSERT INTO band ( id, name ) VALUES (3, 'Zed Leppelin');
CREATE TABLE gig (
id serial primary key,
performer_id int REFERENCES performer(id)
);
INSERT INTO gig ( performer_id ) VALUES (1), (2), (3), (4);
And the last INSERT fails, as expected:
ERROR: insert or update on table "gig" violates foreign key constraint "gig_performer_id_fkey"
DETAIL: Key (performer_id)=(4) is not present in table "performer".
But
For me there is an annoying problem: I have no good way to tell which ID belongs to a singer and which to a band etc. (in the original example I had performer_type_id in the gig table for that), because any performer_id may belong to any performer type. So I'd like each performer type to have its own ID range, and for that I create a dummy table for every sequence:
CREATE TABLE band_id (
id int primary key,
dummy boolean default null
);
CREATE SEQUENCE band_id_seq START 1;
ALTER TABLE band_id ALTER COLUMN id SET DEFAULT nextval('band_id_seq');
CREATE TABLE singer_id (
id int primary key,
dummy boolean default null
);
CREATE SEQUENCE singer_id_seq START 2000000;
ALTER TABLE singer_id ALTER COLUMN id SET DEFAULT nextval('singer_id_seq');
Now, to insert a new row into a specific performer table I have to get the next ID for it:
INSERT INTO band_id (dummy) VALUES (NULL);
I am trying to figure out whether it is possible to handle this process at the DB level, or whether something has to be done at the app level. It would be nice if inserting into the band table could (see the sketch below):
1) before trigger: insert into band_id to generate the specific ID
2) before trigger: insert this new ID into the performer table
3) include this new ID in the INSERT into band
The first 2 points are easy, but the last point is not clear to me for now.
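A BEFORE INSERT trigger can in fact cover all three points, because a BEFORE trigger may overwrite NEW.id before the row is written, which solves the last point. A sketch for the band table under the schema above (PostgreSQL 9.6, hence EXECUTE PROCEDURE; the sequence is used directly here, so the dummy band_id table is not strictly needed):

CREATE OR REPLACE FUNCTION f_band_before_insert()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    -- point 1: draw a band-specific ID from the dedicated sequence
    NEW.id := nextval('band_id_seq');
    -- point 2: register the ID in the parent performer table (type 2 = band)
    INSERT INTO performer (id, performer_type_id) VALUES (NEW.id, 2);
    -- point 3: returning the modified NEW row makes the INSERT use this ID
    RETURN NEW;
END;
$$;

CREATE TRIGGER band_before_insert
BEFORE INSERT ON band
FOR EACH ROW EXECUTE PROCEDURE f_band_before_insert();

-- now a plain insert works without supplying an ID:
INSERT INTO band (name) VALUES ('New Band');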
I want to create a temp table using the SELECT INTO syntax, like:
select top 0 * into #AffectedRecord from MyTable
MyTable has a primary key. When I insert records using the MERGE INTO syntax, the primary key becomes a problem. How could I drop the PK constraint from the temp table?
The "SELECT TOP (0) INTO.." trick is clever but my recommendation is to script out the table yourself for reasons just like this. SELECT INTO when you're actually bringing in data, on the other hand, is often faster than creating the table and doing the insert. Especially on 2014+ systems.
The existence of a primary key has nothing to do with your problem. Key Constraints and indexes don't get created when using SELECT INTO from another table, the data type and NULLability does. Consider the following code and note my comments:
USE tempdb -- a good place for testing on non-prod servers.
GO
IF OBJECT_ID('dbo.t1') IS NOT NULL DROP TABLE dbo.t1;
IF OBJECT_ID('dbo.t2') IS NOT NULL DROP TABLE dbo.t2;
GO
CREATE TABLE dbo.t1
(
id int identity primary key clustered,
col1 varchar(10) NOT NULL,
col2 int NULL
);
GO
INSERT dbo.t1(col1) VALUES ('a'),('b');
SELECT TOP (0)
  id, -- this creates the column, including the identity, but NOT the primary key
  CAST(id AS int) AS id2, -- this creates the column, but it will be nullable; no identity
  ISNULL(CAST(id AS int), 0) AS id3, -- this creates the column and makes it NOT NULL; no identity
  col1,
  col2
INTO dbo.t2
FROM t1;
Here's the (cleaned up for brevity) DDL for the new table I created:
-- New table
CREATE TABLE dbo.t2
(
id int IDENTITY(1,1) NOT NULL,
id2 int NULL,
id3 int NOT NULL,
col1 varchar(10) NOT NULL,
col2 int NULL
);
Notice that the primary key is gone. When I brought in id as-is, it kept the identity. Casting the id column as an int (even though it already is an int) is how I got rid of the identity property (which also made the column nullable). Wrapping it in ISNULL is how to make the column NOT NULL again.
By default, identity insert is set to off here, so this query will fail:
INSERT dbo.t2 (id, id3, col1) VALUES (1, 1, 'x');
Msg 544, Level 16, State 1, Line 39
Cannot insert explicit value for identity column in table 't2' when IDENTITY_INSERT is set to OFF.
Setting identity insert on will fix the problem:
SET IDENTITY_INSERT dbo.t2 ON;
INSERT dbo.t2 (id, id3, col1) VALUES (1, 1, 'x');
But now you MUST provide a value for that column. Note the error here:
INSERT dbo.t2 (id3, col1) VALUES (1, 'x');
Msg 545, Level 16, State 1, Line 51
Explicit value must be specified for identity column in table 't2' either when IDENTITY_INSERT is set to ON or when a replication user is inserting into a NOT FOR REPLICATION identity column.
Hopefully this helps.
On a side note: this is a good way to play around with and understand how SELECT INTO works. I used a permanent table because it's easier to find.
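Applied to your original question, a sketch (the column names are assumptions; KeyCol stands in for MyTable's primary key column):

-- strip the identity/PK by casting the key column explicitly
SELECT TOP (0)
  CAST(KeyCol AS int) AS KeyCol, -- plain nullable int: no identity, no PK
  SomeCol,
  SomeOtherCol
INTO #AffectedRecord
FROM dbo.MyTable;

MERGE can then OUTPUT rows into #AffectedRecord without the identity or key constraint getting in the way.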
I have two tables, connected in the E/R model by an is-a relation. One represents the "mother table":
CREATE TABLE PERSONS(
id SERIAL NOT NULL,
name character varying NOT NULL,
address character varying NOT NULL,
day_of_creation timestamp NOT NULL DEFAULT current_timestamp,
PRIMARY KEY (id)
)
the other represents the "child table":
CREATE TABLE EMPLOYEES (
id integer NOT NULL,
store character varying NOT NULL,
paychecksize integer NOT NULL,
FOREIGN KEY (id)
REFERENCES PERSONS(id),
PRIMARY KEY (id)
)
Now those two tables are joined in a view
CREATE VIEW EMPLOYEES_VIEW AS
SELECT
P.id,name,address,store,paychecksize,day_of_creation
FROM
PERSONS AS P
JOIN
EMPLOYEES AS E ON P.id = E.id
I want to write either a rule or a trigger that enables a db user to insert into that view, sparing him the nasty details of the columns being split across different tables.
But I also want to make it convenient: as id is a SERIAL and day_of_creation has a default value, there is no actual need for the user to provide those. Therefore a statement like
INSERT INTO EMPLOYEES_VIEW (name, address, store, paychecksize)
VALUES ('bob', 'top secret', 'drugstore', 42)
should be enough to result in
PERSONS
id|name|address |day_of_creation
-------------------------------
1 |bob |top secret| 2013-08-13 15:32:42
EMPLOYEES
id| store |paychecksize
---------------------
1 |drugstore|42
A basic rule would be easy:
CREATE RULE EMPLOYEE_VIEW_INSERT AS ON INSERT TO EMPLOYEES_VIEW
DO INSTEAD (
    INSERT INTO PERSONS
    VALUES (NEW.id, NEW.name, NEW.address, NEW.day_of_creation);
    INSERT INTO EMPLOYEES
    VALUES (NEW.id, NEW.store, NEW.paychecksize)
);
This should be sufficient. But it will not be convenient, as the user would have to provide the id and the timestamp even though that is actually not necessary.
How can I rewrite/extend that code base to match my criteria of convenience?
Something like:
CREATE RULE EMPLOYEE_VIEW_INSERT AS ON INSERT TO EMPLOYEES_VIEW
DO INSTEAD
(
INSERT INTO PERSONS (id, name, address, day_of_creation)
VALUES (default,NEW.name,NEW.address,default);
INSERT INTO EMPLOYEES (id, store, paychecksize)
VALUES (currval('persons_id_seq'),NEW.store,NEW.paychecksize)
);
That way the default values for persons.id and persons.day_of_creation will be used. Another option would have been to simply remove those columns from the insert:
INSERT INTO PERSONS (name, address)
VALUES (NEW.name,NEW.address);
Once the rule is defined, the following insert should work:
insert into employees_view (name, address, store, paychecksize)
values ('Arthur Dent', 'Some Street', 'Some Store', 42);
Btw: with a current Postgres version, an INSTEAD OF trigger is the preferred way to make a view updateable.
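For completeness, a minimal sketch of that trigger variant (same tables as above; available since PostgreSQL 9.1):

CREATE FUNCTION employees_view_insert()
RETURNS trigger
LANGUAGE plpgsql
AS $$
DECLARE
    new_id integer;
BEGIN
    -- let persons generate id and day_of_creation from their defaults
    INSERT INTO persons (name, address)
    VALUES (NEW.name, NEW.address)
    RETURNING id INTO new_id;
    INSERT INTO employees (id, store, paychecksize)
    VALUES (new_id, NEW.store, NEW.paychecksize);
    RETURN NEW;
END;
$$;

CREATE TRIGGER employees_view_insert
INSTEAD OF INSERT ON employees_view
FOR EACH ROW EXECUTE PROCEDURE employees_view_insert();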
I have a database with companies and their products; I want each company to have a separate product id sequence.
I know that PostgreSQL can't do this directly; the only way is to have a separate sequence for each company, but this is cumbersome.
I thought about a solution: have a separate table to hold the sequences:
CREATE TABLE "sequence"
(
"table" character varying(25),
company_id integer DEFAULT 0,
"value" integer
)
"table" will be holt the table name for the sequence, such as products, categories etc.
and value will hold the actual sequence data that will be used for product_id on inserts
I will use UPDATE ... RETURNING value; to get a product id
I was wondering is this solution efficient?
With row level locking, only users of same company adding rows in the same table will have to wait to get a lock and I think that reduces race condition problems.
Is there a better way to solve this problem?
I don't want to use one sequence for the products table across all companies, because the gaps between product ids would be too big; I want to keep it simple for the users.
You could just embed a counter in your companies table:
CREATE TABLE companies (
id SERIAL PRIMARY KEY,
name TEXT,
product_id INT DEFAULT 0
);
CREATE TABLE products (
company INT REFERENCES companies(id),
product_id INT,
PRIMARY KEY (company, product_id),
name TEXT
);
INSERT INTO companies (id, name) VALUES (1, 'Acme Corporation');
INSERT INTO companies (id, name) VALUES (2, 'Umbrella Corporation');
Then, use UPDATE ... RETURNING to get the next product ID for a given company:
> INSERT INTO products VALUES (1, (UPDATE companies SET product_id = product_id+1 WHERE id=$1 RETURNING product_id), 'Anvil');
ERROR: syntax error at or near "companies"
LINE 1: INSERT INTO products VALUES (1, (UPDATE companies SET produc...
^
Oh noes! It seems you can't (as of PostgreSQL 9.1devel) use UPDATE ... RETURNING as a subquery.
The good news is, it's not a problem! Just create a stored procedure that does the increment/return part:
CREATE FUNCTION next_product_id(company INT) RETURNS INT
AS $$
UPDATE companies SET product_id = product_id+1 WHERE id=$1 RETURNING product_id
$$ LANGUAGE 'sql';
Now insertion is a piece of cake:
INSERT INTO products VALUES (1, next_product_id(1), 'Anvil');
INSERT INTO products VALUES (1, next_product_id(1), 'Dynamite');
INSERT INTO products VALUES (2, next_product_id(2), 'Umbrella');
INSERT INTO products VALUES (1, next_product_id(1), 'Explosive tennis balls');
Be sure to use the same company ID in both the product value and the argument to next_product_id(company INT).
Depending on how many companies you have, you could create a sequence for each company and query it from a function which is set as the default on your product_id column.
Alternatively this function could simply do a SELECT FOR UPDATE and update the values of your table. That should be pretty performant, I think.
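A sketch of that SELECT FOR UPDATE variant against the "sequence" table from the question (function name is mine; the UPDATE ... RETURNING one-liner from the other answer is equivalent and shorter):

CREATE FUNCTION next_id(p_table varchar, p_company int)
RETURNS int
LANGUAGE plpgsql
AS $$
DECLARE
    v_value int;
BEGIN
    -- lock only this company's counter row; inserts for other
    -- companies are not blocked
    SELECT value INTO v_value
    FROM "sequence"
    WHERE "table" = p_table AND company_id = p_company
    FOR UPDATE;

    UPDATE "sequence" SET value = v_value + 1
    WHERE "table" = p_table AND company_id = p_company;

    RETURN v_value + 1;
END;
$$;

-- usage, following the products table from the other answer:
-- INSERT INTO products VALUES (1, next_id('products', 1), 'Anvil');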