Transpose row values to columns (join table with a bag of key-value pairs in T-SQL) - tsql

I have two tables,
table1 (Id, date, info1)
table2 (Id, date, Key nvarchar(50), Value nvarchar(50))
I would like to join these tables and obtain rows where each distinct value in the Key column becomes a new column, and the values in the Value column become the data in those rows.
Example:
table1 row:
1, 2010-01-01, 234
table2 row:
1, 2010-01-01, 'TimeToProcess', '15'
1, 2010-01-01, 'ProcessingStatus', 'Complete'
The result desired is a row like:
1, 2010-01-01, 234, '15', 'Complete'
Where the column headers are (Id, date, info1, TimeToProcess, ProcessingStatus)
This transposition looks like a job for PIVOT, but I could not get it to work. I suspect this is due to the nvarchar(50) type of the Key and Value columns, and the fact that PIVOT forces me to use an aggregate function when in fact I do not need one.
I could achieve this with inner joins, but the only way I know how would take one inner join per Key, which in my case amounts to 6 inner joins, as that is how many metrics I have.
How can I do this?

You're on the right track with PIVOT. You can PIVOT on a string value. Use the MIN or MAX aggregate function.
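For example, here is a minimal sketch against the tables described in the question (assuming each Id/date pair has at most one row per Key, so MAX simply returns that single value):
-- Pivot the key/value pairs into columns, then join back to table1.
WITH p AS (
    SELECT Id, [date], TimeToProcess, ProcessingStatus
    FROM (SELECT Id, [date], [Key], [Value] FROM table2) src
    PIVOT (MAX([Value]) FOR [Key] IN ([TimeToProcess], [ProcessingStatus])) pv
)
SELECT t1.Id, t1.[date], t1.info1, p.TimeToProcess, p.ProcessingStatus
FROM table1 t1
JOIN p ON p.Id = t1.Id AND p.[date] = t1.[date];
Each Key you care about is listed once in the IN clause, so your six metrics mean six names there instead of six inner joins.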

Related

Optional filter on a column of an outer joined table in the where clause

I have got two tables:
create table student
(
studentid bigint primary key not null,
name varchar(200) not null
);
create table courseregistration
(
studentid bigint not null,
coursename varchar(200) not null,
isfinished boolean default false
);
--insert some data
insert into student values(1,'Dave');
insert into courseregistration values(1,'SQL',true);
Student is fetched by id, so it should always be returned in the result. The entry in courseregistration is optional and should be returned only if there are matching rows, and those matching rows should be filtered on isfinished=false. In other words, I want to get the course registrations that are not finished yet. I tried to outer join student with courseregistration and filter courseregistration on isfinished=false. Note that I still want to retrieve the student.
Trying this returns no rows:
select * from student
left outer join courseregistration using(studentid)
where studentid = 1
and courseregistration.isfinished = false
What I'd want in the example above is a result set with one student row, but with the course columns null (because the only courseregistration row has isfinished=true). One more constraint though: if there is no corresponding row in courseregistration, there should still be a result row for the student entry.
This is an adjusted example. I can tweak my code to work around the problem, but I really wonder: what is the correct/smart way of solving this in postgresql?
PS I have used the (+) in Oracle previously to solve similar issues.
Isn't this what you are looking for? Moving the isfinished filter from the where clause into the join's on condition keeps the student row even when no registration matches:
select * from student s
left outer join courseregistration cr
on s.studentid = cr.studentid
and cr.isfinished = false
where s.studentid = 1
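With the sample data above, this returns the student row with the registration columns null, since the only courseregistration row has isfinished=true (a sketch of the expected output, column layout abbreviated):
 studentid | name | coursename | isfinished
-----------+------+------------+-----------
         1 | Dave | (null)     | (null)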

Apply join, sort on date column and select the first row where one of the column values is not null

I have two tables(Table A and Table B) in a Postgres DB.
Both have "id" column in common. Table A has one column called "id" and Table B has three columns: "id, date, value($)".
For each "id" of Table A there exists multiple rows in Table B in the following format - (id, date, value).
For instance, for Table A with "id" as 1 if there exists following rows in Table B:
(1, 2018-06-21, null)
(1, 2018-06-20, null)
(1, 2018-06-19, 202)
(1, 2018-06-18, 200)
I would like to extract the most recent dated non-null value. For example, for id 1 the result should be 202. Please share your thoughts or let me know in case more info is required.
Here is the solution I went ahead with:
with mapping as ( select distinct table1.id, table2.value, table2.date, row_number() over (partition by table1.id order by table2.date desc nulls last) as row_number
from table1
left join table2 on table2.id=table1.id and table2.value is not null
)
select * from mapping where row_number = 1
Let me know if there is scope for improvement.
You may very well want an inner join, not an outer join. If you have an id in table1 that does not exist in table2, or that has only null values, you will get NULL for both date and value. This is due to how an outer join works: if nothing in the right-side table matches the ON condition, it returns NULL for each column of that table. So
with mapping as
(select distinct table1.id
, table2.value
, table2.date
, row_number() over (partition by table1.id order by table2.date desc nulls last) as row_number
from table1
join table2 on table2.id=table1.id and table2.value is not null
)
select *
from mapping
where row_number = 1;
Your query only worked because all your test data satisfied the first part of the ON condition. You really need test data that fails it to see what your query does.
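A quick way to see the difference with hypothetical sample data (assuming the minimal table1/table2 layout from the question):
-- table1 has an id with no non-null match in table2
insert into table1 (id) values (1), (2);
insert into table2 (id, date, value) values (1, '2018-06-19', 202);
-- left join version: id 2 still yields a row (2, null, null, row_number 1)
-- inner join version: id 2 disappears from the result entirely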
Caution: DATE and VALUE are very poor choices for column names. Both are SQL-standard reserved words, although not reserved in Postgres specifically. Further, DATE is a Postgres data type. Having columns with the same names as data types leads to confusion.

Enforcing a unique relationship over multiple columns where one column is nullable

Given the table
ID  PERSON_ID  PLAN  EMPLOYER_ID  TERMINATION_DATE
1   123        ABC   321          2020-01-01
2   123        DEF   321          (null)
3   123        ABC   321          (null)
4   123        ABC   321          (null)
I want to exclude the 4th entry. (The 3rd entry shows the person was re-hired and therefore is a new relationship. I'm only showing relevant fields)
My first attempt was to simply create a unique index over PERSON_ID / PLAN / EMPLOYER_ID / TERMINATION_DATE, thinking that DB2 for IBMi considered nulls equal in a unique index. I was evidently wrong...
Is there a way to enforce uniqueness over these columns, or,
is there a better way to approach the value of termination date? (null is not technically correct; I'm thinking of it as more true/false, but the business logic needs a date)
Edit
According to the docs for 7.3:
UNIQUE
Prevents the table from containing two or more rows with the same value of the index key. When UNIQUE is used, all null values for a column are considered equal. For example, if the key is a single column that can contain null values, that column can contain only one null value. The constraint is enforced when rows of the table are updated or new rows are inserted.
The constraint is also checked during the execution of the CREATE INDEX statement. If the table already contains rows with duplicate key values, the index is not created.
UNIQUE WHERE NOT NULL
Prevents the table from containing two or more rows with the same value of the index key, where all null values for a column are not considered equal. Multiple null values in a column are allowed. Otherwise, this is identical to UNIQUE.
So, the behavior I'm seeing looks more like UNIQUE WHERE NOT NULL. When I generate SQL for this table, I see
ADD CONSTRAINT TERMEMPPLANSSN
UNIQUE( TERMINATION_DATE , EMPLOYERID , PLAN_CODE , SSN ) ;
(note this is showing the real field names, not the ones I used in my example)
Edit 2
Bottom line: Constraint !== Index. When I went back and created an actual index, I got the desired behavior.
CREATE TABLE PERSON
(
ID INT NOT NULL
, PERSON_ID INT NOT NULL
, PLAN CHAR(3) NOT NULL
, EMPLOYER_ID INT
, TERMINATION_DATE DATE
);
INSERT INTO PERSON (ID, PERSON_ID, PLAN, EMPLOYER_ID, TERMINATION_DATE)
VALUES
(1, 123, 'ABC', 321, DATE('2020-01-01'))
, (2, 123, 'DEF', 321, CAST(NULL AS DATE))
, (3, 123, 'ABC', 321, CAST(NULL AS DATE))
WITH NC;
--- To not allow: ---
INSERT INTO PERSON (ID, PERSON_ID, PLAN, EMPLOYER_ID, TERMINATION_DATE) VALUES
(4, 123, 'ABC', 321, CAST(NULL AS DATE))
or
(4, 123, 'ABC', 321, DATE('2020-01-01'))
You may:
CREATE UNIQUE INDEX PERSON_U1 ON PERSON
(PERSON_ID, PLAN, EMPLOYER_ID, TERMINATION_DATE);
--- To not allow: ---
INSERT INTO PERSON (ID, PERSON_ID, PLAN, EMPLOYER_ID, TERMINATION_DATE) VALUES
(4, 123, 'ABC', 321, DATE('2020-01-01'))
but allow multiple:
(X, 123, 'ABC', 321, CAST(NULL AS DATE))
(Y, 123, 'ABC', 321, CAST(NULL AS DATE))
...
You may:
CREATE UNIQUE WHERE NOT NULL INDEX PERSON_U2 ON PERSON
(PERSON_ID, PLAN, EMPLOYER_ID, TERMINATION_DATE);

Convert comma-separated ids to a comma-separated string in postgresql

I have a comma-separated column which represents the ids of emergency types, like:
ID | Name
1 | 1,2,3
2 | 1,2
3 | 1
I want to write a query that returns the names for these id values, given:
1 - Ambulance
2 - Fire
3 - Police
EXPECTED OUTPUT
1 - Ambulance, Fire, Police
2 - Ambulance, Fire
3 - Ambulance
I just need a select statement in postgresql that displays the comma-separated string values instead of the comma-separated integer values.
Storing comma-separated values is bad database design practice, but PostgreSQL is feature-rich enough that you can handle this task easily.
-- just simulate tables
with t1(ID, Name) as(
select 1 ,'1,2,3' union all
select 2 ,'1,2' union all
select 3 ,'1'
),
t2(id, name) as(
select 1, 'Ambulance' union all
select 2, 'Fire' union all
select 3, 'Police'
)
-- here is actual query
select s1.id, string_agg(t2.name, ',') from
( select id, unnest(string_to_array(Name, ','))::INT as name_id from t1 ) s1
join t2
on s1.name_id = t2.id
group by s1.id
Though, if you can, change your approach. A proper database design means easier queries and better performance.
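For the record, here is a sketch of what that normalized approach could look like (hypothetical table and column names):
-- one row per (record, emergency type) pair instead of a CSV column
create table emergency_type (id int primary key, name text);
create table record_emergency_type (
    record_id int,
    emergency_type_id int references emergency_type(id),
    primary key (record_id, emergency_type_id)
);
-- the names then come from a plain join, no string parsing needed
select ret.record_id, string_agg(et.name, ',') as names
from record_emergency_type ret
join emergency_type et on et.id = ret.emergency_type_id
group by ret.record_id;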
Getting the values for each of the ids is a simple query: select * from …;. Once you have the values, you will have to parse the strings with a delimiter of ','. Then you would have to map the parsed string values to the appropriate names and remake the list. Are you writing this in a specific language?
Or you could just map each sorted value list directly, e.g. 1,2,3 equals "some string", 1,2 equals "some other string", etc.
Assuming you have a table with the ids and values for Ambulance, Police and Fire, then you can use something like the following.
CREATE TABLE public.test1
(
id integer NOT NULL,
commastring character varying,
CONSTRAINT pk_test1 PRIMARY KEY (id)
);
INSERT INTO public.test1
VALUES (1, '1,2,3'), (2, '1,2'), (3, '1');
CREATE TABLE public.test2
(
id integer NOT NULL,
description character varying,
CONSTRAINT pk_test2 PRIMARY KEY (id)
);
INSERT INTO public.test2
VALUES (1, 'Ambulance'), (2, 'Fire'), (3, 'Police');
with descs as
(with splits as
(SELECT id, split_part(commastring, ',', 1) as col2,
split_part(commastring, ',', 2) as col3, split_part(commastring, ',', 3) as col4 from test1)
select splits.id, t21.description as d1, t22.description as d2, t23.description as d3
from splits
inner join test2 t21 on t21.id::character varying = splits.col2
left join test2 t22 on t22.id::character varying = splits.col3
left join test2 t23 on t23.id::character varying = splits.col4)
SELECT descs.id, CASE WHEN d2 IS NOT NULL AND d3 IS NOT NULL
THEN CONCAT_WS(',', d1,d2,d3) ELSE CASE WHEN d2 IS NOT NULL
THEN CONCAT_WS(',', d1,d2) ELSE d1 END END FROM descs
ORDER BY id;
By way of explanation, I give the create table and insert commands, so that you (and others) can follow the logic. It would help enormously, if you were to do this in your questions, as it saves everyone time and avoids misunderstandings.
My inner CTE then splits the string using split_part. The syntax here is quite simple: the field, the separator, and the desired piece of the split field (so in this case we need one, two and three). I then join the split columns to test2. Note two things here: the first join is an inner join, as there will always be at least one element in the split (I am assuming!!!), whereas the other two are left joins; secondly, splitting a character varying field in turn produces character varying pieces, so I have to cast the int id to character varying for the join to work. Doing the cast this way round (i.e. id to character varying rather than character varying to id) means I don't have to bother about nulls. Finally, depending on the number of nulls present, I concatenate the results with the given separator. Again, I am assuming d1 will always have a value.
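A one-line illustration of split_part's arguments (string, separator, 1-based piece index):
select split_part('1,2,3', ',', 2);  -- returns '2'
select split_part('1,2,3', ',', 4);  -- returns '' when the piece is missing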
HTH

Map column value to table name and join

I have a composite type that looks like
CREATE TYPE member AS (
id BIGINT,
type CHAR(1)
);
I have a table that relies on this member type with an array.
CREATE TABLE relation (
id BIGINT PRIMARY KEY,
members member[]
);
I have three other tables each with a different schema (but having common id field)
CREATE TABLE table_x (
id BIGINT PRIMARY KEY,
some_text TEXT
);
CREATE TABLE table_y (
id BIGINT PRIMARY KEY,
some_int INT
);
CREATE TABLE table_z (
id BIGINT PRIMARY KEY,
some_date TIMESTAMP
);
The type field in the member type is a single character identifying which table a specific member belongs to. A row in the relation table can have a mix of different types.
I have a scenario which requires returning relation ids with at least one member fulfilling a certain condition based on its type (let's say for x => some_text is not empty, or y => some_int is greater than 10, or z => some_date is within a week from now).
I can implement this scenario on the application side by making multiple requests to the database:
unnest relation table
collect member data per relation
make new requests to find out relations
I am wondering if there is a way to map column values to table names and join them.
Assumption
I'm assuming that the relation.members array does not have more than one member element of the same type. Correct?
Query to try
with unnested_members as (
-- Unnest members array
select id, unnest(members) members
from relation
)
, members_joined as (
-- left join on a per type basis with table_x, table_y and table_z.
select r.id, (r.members).id idext, (r.members).type,
x.some_text, y.some_int, z.some_date -- more types, more columns here
from unnested_members r
left join table_x x on (x.id = (r.members).id and (r.members).type = 'x')
left join table_y y on (y.id = (r.members).id and (r.members).type = 'y')
left join table_z z on (z.id = (r.members).id and (r.members).type = 'z')
-- More types, more tables to left join
)
select id,
max(some_text) some_text, -- use max() to get not null value for this id
max(some_int) some_int, -- use max() to get not null value for this id
max(some_date) some_date -- use max() to get not null value for this id
-- more types, more max() columns here
from members_joined
group by id -- get one row per relation.id with data from joined table_* columns
If you need to include more tables, then you have to add them in the left join part, and include their columns both in the select list and in the max() section.
#JNevill had a good point about this database design. Although this approach may not seem optimal, it keeps the table definitions clearly separate without any relations in between them. Also the size of relation table is fairly small compared to other three tables.
I solved the problem by simply fetching rows per type and merging them:
SELECT relation.*
FROM relation, UNNEST(relation.members) member
INNER JOIN table_x ON member.id = table_x.id
WHERE member.type = 'x' AND table_x.some_text = 'some text value'
UNION
SELECT relation.*
FROM relation, UNNEST(relation.members) member
INNER JOIN table_y ON member.id = table_y.id
WHERE member.type = 'y' AND table_y.some_int = 123
UNION
SELECT relation.*
FROM relation, UNNEST(relation.members) member
INNER JOIN table_z ON member.id = table_z.id
WHERE member.type = 'z' AND table_z.some_date > '2017-01-11 00:00:00';