Create a table with columns from rows values with dynamic values - tsql

I have a table with 'properties'
ID Text
-----------
1 Name
2 Surname
3 D.O.B.
4 City
Another table with 'people'
ID Code
-----------
1 MN0001
2 ST0001
3 ST0002
A third table, 'propertiesPeople', associates those two tables. A person can have any number of properties.
ID  IDPerson  IDProp  Value
----------------------------------
1   1         1       Peter
2   1         2       Johnson
3   2         1       John
4   2         3       01/01/1977
5   1         4       California
6   3         1       Julian
7   3         2       Ross
8   3         4       Osaka
Before inserting one or more people, I need to validate that no existing person has the same properties (the set of properties to validate is variable and will be stored in another table, 'propertiesToValidate').
I thought of building dynamic SQL by looping over 'propertiesToValidate', so that the SELECT would end up something like:
SELECT p1.Value, p2.Value, p3.Value
FROM propertiesPeople p1
INNER JOIN propertiesPeople p2 ON p1.IDPerson = p2.IDPerson
INNER JOIN propertiesPeople p3 ON p1.IDPerson = p3.IDPerson
WHERE p1.IDProp = 1
AND p2.IDProp = 2
AND p3.IDProp = 4
I would then insert this into a temporary table that would end up something like this:
Value1 Value2 Value3
--------------------------------
Peter Johnson California
Julian Ross Osaka
After that, I would do an INTERSECT with a table holding the new person/people that I want to insert.
I think that building a dynamic string and then calling it with sp_executesql is not an elegant way to do it (and it would be complicated to maintain in the future), but I can't think of another way. Is there one?

I would just add a UNIQUE constraint:
CREATE TABLE #propertiesPeople(
ID VARCHAR(34) NOT NULL PRIMARY KEY
,IDPerson INTEGER
,IDProp INTEGER
,Value VARCHAR(10)
,UNIQUE(IDPerson, IDProp)
);
INSERT INTO #propertiesPeople(ID,IDPerson,IDProp,Value) VALUES ('1',1,1,'Peter');
INSERT INTO #propertiesPeople(ID,IDPerson,IDProp,Value) VALUES ('2',1,2,'Johnson');
INSERT INTO #propertiesPeople(ID,IDPerson,IDProp,Value) VALUES ('3',2,1,'John');
INSERT INTO #propertiesPeople(ID,IDPerson,IDProp,Value) VALUES ('4',2,3,'01/01/1977');
INSERT INTO #propertiesPeople(ID,IDPerson,IDProp,Value) VALUES ('5',1,4,'California');
INSERT INTO #propertiesPeople(ID,IDPerson,IDProp,Value) VALUES ('6',3,1,'Julian');
INSERT INTO #propertiesPeople(ID,IDPerson,IDProp,Value) VALUES ('7',3,2,'Ross');
INSERT INTO #propertiesPeople(ID,IDPerson,IDProp,Value) VALUES ('8',3,4,'Osaka');
When someone tries to insert:
INSERT INTO #propertiesPeople(ID,IDPerson,IDProp,Value) VALUES ('9',3,4,'Osaka2');
they will get:
Violation of UNIQUE KEY constraint 'UQ__#propert__918436F84D776654'.
Cannot insert duplicate key in object 'dbo.#propertiesPeople'. The
duplicate key value is (3, 4).
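Note that the constraint above only stops one person from getting the same property twice. For the original check (does a new person match an existing one on every property listed in 'propertiesToValidate'?), one option that avoids dynamic SQL is to join on property and value and count the matches per person. The following is only a sketch under assumptions: it uses Python's sqlite3 as a stand-in for SQL Server so that it can be run as-is, 'newPeople' is a hypothetical staging table, and 'propertiesToValidate' is assumed to have a single IDProp column. The count comparison relies on (IDPerson, IDProp) being unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE propertiesPeople (
    ID INTEGER PRIMARY KEY, IDPerson INTEGER, IDProp INTEGER, Value TEXT,
    UNIQUE (IDPerson, IDProp)
);
INSERT INTO propertiesPeople VALUES
    (1,1,1,'Peter'), (2,1,2,'Johnson'), (3,2,1,'John'), (4,2,3,'01/01/1977'),
    (5,1,4,'California'), (6,3,1,'Julian'), (7,3,2,'Ross'), (8,3,4,'Osaka');

-- Assumed shape: one row per property that must be compared.
CREATE TABLE propertiesToValidate (IDProp INTEGER PRIMARY KEY);
INSERT INTO propertiesToValidate VALUES (1), (2), (4);

-- Hypothetical staging table holding the people we want to insert.
CREATE TABLE newPeople (Code TEXT, IDProp INTEGER, Value TEXT);
INSERT INTO newPeople VALUES
    ('ST0003',1,'Julian'), ('ST0003',2,'Ross'),    ('ST0003',4,'Osaka'),
    ('ST0004',1,'Peter'),  ('ST0004',2,'Johnson'), ('ST0004',4,'Texas');
""")

# A candidate duplicates an existing person when it matches that person
# on every property listed in propertiesToValidate.
DUPLICATES_SQL = """
SELECT n.Code, pp.IDPerson
FROM newPeople n
JOIN propertiesToValidate v ON v.IDProp = n.IDProp
JOIN propertiesPeople pp ON pp.IDProp = n.IDProp AND pp.Value = n.Value
GROUP BY n.Code, pp.IDPerson
HAVING COUNT(*) = (SELECT COUNT(*) FROM propertiesToValidate)
ORDER BY n.Code, pp.IDPerson
"""
duplicates = conn.execute(DUPLICATES_SQL).fetchall()
print(duplicates)  # [('ST0003', 3)]
```

Here ST0003 matches person 3 on all three properties, while ST0004 matches person 1 on only two of them and is therefore not reported. Changing the rows in 'propertiesToValidate' changes the check without rewriting the query.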

Related

Postgresql: How can I set the id for null values following existed sequence id?

I have a table with 10 million records: about 1 million have ids from 1 to 1 million, and about 9 million have a null id. How can I set the id for the null rows using a sequence of ids that continues from the existing ones?
Try this in a test area to see how long it takes to populate your table. We'll use a short example here.
create table test (id int, fullname text);
insert into test values (1, 'john');
insert into test values (2, 'john');
insert into test values (NULL, 'john');
insert into test values (NULL, 'john');
This simulation shows that records 1 and 2 have an ID, while records 3 and 4 don't have one yet.
Create a sequence that we will use to populate the ID of records 3 and 4.
create sequence populate_test start 3;
Now, let's populate:
update test set id = nextval('populate_test') where id is null;
Result:
select * from test;
id | fullname
----+----------
1 | john
2 | john
3 | john
4 | john
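In your case, start the sequence at one more than the current maximum id (1000001 if the existing ids run from 1 to 1 million), and consider the cache option of create sequence, like so: create sequence populate_test start 1000001 cache 1000000; to preallocate 1 million numbers at a time.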

Postgresql track serial by ID in another column

So I am trying to create a database that can store videos from products, but I do intend to add a few million of them. So obviously I want the performance to be as good as possible.
I wanted to achieve the following:
BIGINT | SMALLSERIAL | VARCHAR(30)
product_id | video_id | video_hash
1 1 Dkfjoie124
1 2 POoieqlgkQ
1 3 Xd2t9dakcx
2 1 Df2459Afdw
However, when I insert a new video for a product:
INSERT INTO TABLE (product_id, video_hash) VALUES (2, 'DSpewirncS')
I want the following to happen:
BIGINT | SMALLSERIAL | VARCHAR(30)
product_id | video_id | video_hash
1 1 Dkfjoie124
1 2 POoieqlgkQ
1 3 Xd2t9dakcx
2 1 Df2459Afdw
2 2 DSpewirncS
Will this happen when I set the column type for video_id to SMALLSERIAL? Because I am afraid that it will insert a different value (the highest in the entire column), which I do not want.
Thanks.
No. A serial column is bound to a single sequence, and that sequence doesn't reset unless you tell it to.
But if you want an ordinal for the videos per product, you can query the table to produce it using the row_number() window function.
SELECT product_id,
row_number() OVER (PARTITION BY product_id
ORDER BY video_id) video_ordinal,
video_hash
FROM table;
You could also create a view for this query for convenience, so that you can query the view instead of the table and the view would look like you want it.
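As a rough, runnable illustration of the view idea, the sketch below uses Python's sqlite3 in place of PostgreSQL (an assumption made only so the example is self-contained; it needs an SQLite build with window functions, 3.25 or later). The names 'product_videos' and 'product_videos_numbered' are hypothetical, and the AUTOINCREMENT key stands in for the SMALLSERIAL column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_videos (
    product_id INTEGER NOT NULL,
    video_id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- global counter, like a serial
    video_hash TEXT NOT NULL
);
INSERT INTO product_videos (product_id, video_hash) VALUES
    (1, 'Dkfjoie124'), (1, 'POoieqlgkQ'), (1, 'Xd2t9dakcx'),
    (2, 'Df2459Afdw'), (2, 'DSpewirncS');

-- The per-product ordinal is computed at query time, not stored.
CREATE VIEW product_videos_numbered AS
SELECT product_id,
       row_number() OVER (PARTITION BY product_id ORDER BY video_id) AS video_ordinal,
       video_hash
FROM product_videos;
""")

rows = conn.execute(
    "SELECT product_id, video_ordinal, video_hash "
    "FROM product_videos_numbered ORDER BY product_id, video_ordinal"
).fetchall()
for row in rows:
    print(row)
```

The stored video_id keeps counting globally (1 to 5 here), while the view restarts video_ordinal at 1 for each product_id.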

Does the returning clause always execute first?

I have a many-to-many relation representing containers holding items.
I have a primary key row_id in the table.
I insert four rows: (container_id, item_id) values (1778712425160346751, 4). These rows will be identical except the aforementioned unique row_id.
I subsequently execute the following query:
delete from contains
where item_id = 4 and
container_id = '1778712425160346751' and
row_id =
(
select max(row_id) from contains
where container_id = '1778712425160346751' and
item_id = 4
)
returning
(
select count(*) from contains
where container_id = '1778712425160346751' and
item_id = 4
);
Now I expected to get 3 returned from this query, but I got a 4. Getting a 4 is the desired behavior, but it is not what was expected.
My question is: can I always expect that the returning clause executes before the delete, or is this an idiosyncrasy of certain versions or specific software?
Using a query in the returning clause is allowed but not documented. From the documentation:
output_expression
An expression to be computed and returned by the DELETE command after each row is deleted. The expression can use any column names of the table named by table_name or table(s) listed in USING. Write * to return all columns.
It seems logical that the query sees the table in a state before deleting, as the statement is not completed yet.
create temp table test as
select id from generate_series(1, 4) id;
delete from test
returning id, (select count(*) from test);
id | count
----+-------
1 | 4
2 | 4
3 | 4
4 | 4
(4 rows)
The same applies to update:
create temp table test as
select id from generate_series(1, 4) id;
update test
set id = id+ 1
returning id, (select sum(id) from test);
id | sum
----+-----
2 | 10
3 | 10
4 | 10
5 | 10
(4 rows)

Postgresql - `serial` column & inheritance (sequence sharing policy)

In PostgreSQL, when a child table inherits a serial column from its parent table, the sequence is shared by the parent and child tables.
Is it possible to inherit the serial column while letting the two tables have separate sequence values, e.g. so that the column in both tables could contain the value 1?
Is this possible and reasonable, and if so, how can it be done?
#Update
The reasons I want to avoid sequence sharing are:
1. Sharing a single int range across multiple tables might exhaust MAX_INT. Using bigint would help, but it takes more space too.
2. There is some kind of resource locking when multiple tables insert concurrently, so I guess it is a performance issue.
3. Ids that jump from 1 to 5 and then perhaps to 1000 don't look as tidy as they could.
#Summary
Solutions:
1. If you want each child table to have its own sequence while still keeping the global sequence across the parent and child tables (as described in @wildplasser's answer), add a sub_id serial column to each child table.
2. If you want each child table to have its own sequence and don't need a global sequence across the parent and child tables, there are two ways:
a. Use int instead of serial (as described in @lsilva's answer). Steps:
- define the type as int or bigint in the parent table,
- create an individual sequence for each parent and child table,
- set the default value of the int column in each table to nextval of its own sequence,
- don't forget to maintain/reset the sequences when re-creating a table.
b. Define id serial directly in the child table, and not in the parent table.
DROP schema tmp CASCADE;
CREATE schema tmp;
set search_path = tmp, pg_catalog;
CREATE TABLE common
( seq SERIAL NOT NULL PRIMARY KEY
);
CREATE TABLE one
( subseq SERIAL NOT NULL
, payload integer NOT NULL
)
INHERITS (tmp.common)
;
CREATE TABLE two
( subseq SERIAL NOT NULL
, payload integer NOT NULL
)
INHERITS (tmp.common)
;
/**
\d common
\d one
\d two
\q
***/
INSERT INTO one(payload)
SELECT gs FROM generate_series(1,5) gs
;
INSERT INTO two(payload)
SELECT gs FROM generate_series(101,105) gs
;
SELECT * FROM common;
SELECT * FROM one;
SELECT * FROM two;
Results:
NOTICE: drop cascades to table tmp.common
DROP SCHEMA
CREATE SCHEMA
SET
CREATE TABLE
CREATE TABLE
CREATE TABLE
INSERT 0 5
INSERT 0 5
seq
-----
1
2
3
4
5
6
7
8
9
10
(10 rows)
seq | subseq | payload
-----+--------+---------
1 | 1 | 1
2 | 2 | 2
3 | 3 | 3
4 | 4 | 4
5 | 5 | 5
(5 rows)
seq | subseq | payload
-----+--------+---------
6 | 1 | 101
7 | 2 | 102
8 | 3 | 103
9 | 4 | 104
10 | 5 | 105
(5 rows)
But: in fact you don't need the subseq columns, since you can always enumerate them by means of row_number():
CREATE VIEW vw_one AS
SELECT seq
, row_number() OVER (ORDER BY seq) as subseq
, payload
FROM one;
CREATE VIEW vw_two AS
SELECT seq
, row_number() OVER (ORDER BY seq) as subseq
, payload
FROM two;
[results are identical]
And, you could add UNIQUE AND PRIMARY KEY constraints to the child tables, like:
CREATE TABLE one
( subseq SERIAL NOT NULL UNIQUE
, payload integer NOT NULL
)
INHERITS (tmp.common)
;
ALTER TABLE one ADD PRIMARY KEY (seq);
[similar for table two]
I use this:
Parent table definition:
CREATE TABLE parent_table (
id bigint NOT NULL,
Child table definition:
CREATE TABLE child_schema.child_table
(
id bigint NOT NULL DEFAULT nextval('child_schema.child_table_id_seq'::regclass),
I am emulating serial by using a sequence's next value as the column default.

T-SQL query, multiple values in a field

I have two tables in a database. The first table tblTracker contains many columns, but the column of particular interest is called siteAdmin and each row in that column can contain multiple loginIDs of 5 digits like 21457, 21456 or just one like 21444. The next table users contains columns like LoginID, fname, and lname.
What I would like to be able to do is take the loginIDs contained in tblTracker.siteAdmin and return fname + lname from users. I can successfully do this when there is only one loginID in the row such as 21444 but I cannot figure out how to do this when there is more than one like 21457, 21456.
Here is the SQL statement I use when there is one loginID in that column:
SELECT b.FName + ' ' + b.LName AS siteAdminName
FROM tblTracker a
LEFT OUTER JOIN users b ON a.siteAdmin = b.Login_Id
However, this doesn't work when it tries to join a siteAdmin value that contains more than one loginID.
Thanks!
I prefer the Numbers table approach to splitting a string in T-SQL.
For this method to work, you need to do this one-time table setup:
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)
Once the Numbers table is set up, create this split function:
CREATE FUNCTION [dbo].[FN_ListToTable]
(
     @SplitOn char(1)      --REQUIRED, the character to split the @List string on
    ,@List    varchar(8000)--REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN
(
    ----------------
    --SINGLE QUERY-- --this will not return empty rows
    ----------------
    SELECT
        ListValue
    FROM (SELECT
              LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(@SplitOn, List2, number+1)-number - 1))) AS ListValue
          FROM (
                   SELECT @SplitOn + @List + @SplitOn AS List2
               ) AS dt
          INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
          WHERE SUBSTRING(List2, number, 1) = @SplitOn
         ) dt2
    WHERE ListValue IS NOT NULL AND ListValue!=''
);
GO
You can now easily split a CSV string into a table and join on it:
select * from dbo.FN_ListToTable(',','1,2,3,,,4,5,6777,,,')
OUTPUT:
ListValue
-----------------------
1
2
3
4
5
6777
(6 row(s) affected)
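To make the logic of the function easier to follow, here is a small Python walk-through of the same idea; it is an illustration only, not a replacement for the T-SQL. The name list_to_table is hypothetical, and the loop variable n plays the role of Numbers.Number (1-based, limited to positions before the end of the padded list).

```python
from typing import List


def list_to_table(split_on: str, lst: str) -> List[str]:
    """Mimic FN_ListToTable: return the non-empty, trimmed list values."""
    # Pad with a delimiter on both sides so every value sits between two delimiters.
    list2 = split_on + lst + split_on
    values = []
    # n is the 1-based position, as in the Numbers table (n < LEN(List2)).
    for n in range(1, len(list2)):
        # Keep only the positions that hold a delimiter.
        if list2[n - 1] != split_on:
            continue
        # Find the next delimiter after position n (CHARINDEX with start n+1).
        next_delim = list2.index(split_on, n)
        # Take the text between the two delimiters and trim spaces (LTRIM/RTRIM).
        value = list2[n:next_delim].strip(" ")
        if value != "":
            values.append(value)
    return values


result = list_to_table(",", "1,2,3,,,4,5,6777,,,")
print(result)  # ['1', '2', '3', '4', '5', '6777']
```

Because the padded list always ends with a delimiter, the search for the next delimiter always succeeds, and consecutive delimiters simply yield empty strings that are filtered out.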
You can now use CROSS APPLY to split every row in your table, like this:
DECLARE @users table (LoginID int, fname varchar(5), lname varchar(5))
INSERT INTO @users VALUES (1, 'Sam', 'Jones')
INSERT INTO @users VALUES (2, 'Don', 'Smith')
INSERT INTO @users VALUES (3, 'Joe', 'Doe')
INSERT INTO @users VALUES (4, 'Tim', 'White')
INSERT INTO @users VALUES (5, 'Matt', 'Davis')
INSERT INTO @users VALUES (15,'Sue', 'Me')
DECLARE @tblTracker table (RowID int, siteAdmin varchar(50))
INSERT INTO @tblTracker VALUES (1,'1,2,3')
INSERT INTO @tblTracker VALUES (2,'2,3,4')
INSERT INTO @tblTracker VALUES (3,'1,5')
INSERT INTO @tblTracker VALUES (4,'1')
INSERT INTO @tblTracker VALUES (5,'5')
INSERT INTO @tblTracker VALUES (6,'')
INSERT INTO @tblTracker VALUES (7,'8,9,10')
INSERT INTO @tblTracker VALUES (8,'1,15,3,4,5')
SELECT
    t.RowID, u.LoginID, u.fname+' '+u.lname AS YourAdmin
FROM @tblTracker t
CROSS APPLY dbo.FN_ListToTable(',',t.siteAdmin) st
LEFT OUTER JOIN @users u ON st.ListValue=u.LoginID --to get all rows even if missing siteAdmin
--INNER JOIN @users u ON st.ListValue=u.LoginID --to remove rows without any siteAdmin
ORDER BY t.RowID,u.fname,u.lname
OUTPUT:
RowID LoginID YourAdmin
----------- ----------- -----------
1 2 Don Smith
1 3 Joe Doe
1 1 Sam Jones
2 2 Don Smith
2 3 Joe Doe
2 4 Tim White
3 5 Matt Davis
3 1 Sam Jones
4 1 Sam Jones
5 5 Matt Davis
7 NULL NULL
7 NULL NULL
7 NULL NULL
8 3 Joe Doe
8 5 Matt Davis
8 1 Sam Jones
8 15 Sue Me
8 4 Tim White
(18 row(s) affected)