Query where NOT NULL but only if NOT used as a FK - postgresql

area
-----
id BIGSERIAL PRIMARY KEY
deleted_at TIMESTAMP WITH TIME ZONE DEFAULT NULL
and
registration
-----
area_id BIGINT REFERENCES area(id) NOT NULL
I want to get all records from area where deleted_at IS NULL, plus the ones where deleted_at IS NOT NULL but which are referenced as a FK in registration.
SELECT * FROM area
JOIN registration AS reg
ON reg.area_id=area.id
WHERE area.deleted_at IS NULL;
will omit the area records which are FKs in registration but have been marked as "deleted".
Adding an AND clause regarding the deleted_at column in the JOIN ON clause doesn't make sense, since it will only strip out valid records.
I can't quite wrap my head around it, since the two WHERE conditions seem to contradict each other.

Try something like this:
SELECT *
FROM area
LEFT JOIN registration AS reg ON reg.area_id = area.id
WHERE (area.deleted_at IS NULL) <> (reg.area_id IS NOT NULL)
The LEFT JOIN would list all area rows, even without a matching row from registration. (Resulting NULL values for those rows.)
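The WHERE clause compares two boolean expressions with <>, so it keeps a row only when exactly one of them is true: either the area is not deleted and has no registration, or it is deleted and has one. Note that this also drops non-deleted areas that do have a registration; if those should be kept, use OR instead of <>.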

I think this is what you're asking for. When you use a left join, the registration columns show up as NULL for areas that have no row in the registration table.
select * from area
left join registration as reg
on reg.area_id= area.id
where area.deleted_at is null or reg.area_id is not null;
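The difference between the `<>` and `OR` conditions in the two answers above can be checked on a small data set. This is only a sketch: SQLite (via Python's sqlite3) stands in for PostgreSQL, and the four sample rows are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE area (id INTEGER PRIMARY KEY, deleted_at TEXT DEFAULT NULL);
    CREATE TABLE registration (area_id INTEGER NOT NULL REFERENCES area(id));
    -- 1: live, unused   2: live, used   3: deleted, used   4: deleted, unused
    INSERT INTO area (id, deleted_at) VALUES
        (1, NULL), (2, NULL), (3, '2020-01-01'), (4, '2020-01-01');
    INSERT INTO registration (area_id) VALUES (2), (3);
""")

base = """
    SELECT area.id
    FROM area
    LEFT JOIN registration AS reg ON reg.area_id = area.id
    WHERE {condition}
    ORDER BY area.id
"""

# <> keeps a row only when exactly one side is true.
xor_ids = [r[0] for r in conn.execute(base.format(
    condition="(area.deleted_at IS NULL) <> (reg.area_id IS NOT NULL)"))]

# OR keeps a row when at least one side is true.
or_ids = [r[0] for r in conn.execute(base.format(
    condition="area.deleted_at IS NULL OR reg.area_id IS NOT NULL"))]

print(xor_ids)  # [1, 3]
print(or_ids)   # [1, 2, 3]
```

Area 2 (not deleted, but registered) is the row the `<>` version drops, and area 4 (deleted, unregistered) is excluded by both.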

-- I need (0) all area records EXCEPT the ones where (1) deleted_at IS NOT NULL
-- AND (2) are NOT present as FKs in registration.
SELECT * FROM area a
WHERE NOT(
a.deleted_at IS NOT NULL -- (1)
AND NOT EXISTS (
SELECT * -- (2)
FROM registration r
WHERE r.area_id=a.id
)
);
Note: your textual phrasing is confusing: EXCEPT a AND b could mean two things
And, after the rephrasing of the question:
-- I want to get (0) all records from area (1) which have deleted_at IS NULL (1a)
-- and (2) the ones that can have deleted_at is NOT NULL but are present as a FK in the registration.
SELECT * FROM area a
WHERE a.deleted_at IS NULL -- (1)
OR a.deleted_at IS NOT NULL AND EXISTS ( -- (1a)
SELECT * -- (2)
FROM registration r
WHERE r.area_id=a.id
);
If I understand correctly, you mean "plus" the ones at (1a); if so, the "and" in your sentence translates into an OR in the query.

Are you simply searching for the following query?
SELECT * FROM Area
LEFT OUTER JOIN registration on id = area_id
WHERE deleted_at IS NULL OR area_id IS NOT NULL
This will return the same area.id multiple times if registration.area_id is not unique though (since you have no UNIQUE constraints).
If that is a problem, you may want the following query instead.
SELECT * FROM Area
WHERE deleted_at IS NULL OR id IN (SELECT area_id FROM registration)
Or this, built with a COUNT:
SELECT id, deleted_at, COUNT(*) FROM Area
LEFT OUTER JOIN registration on id = area_id
WHERE (deleted_at IS NULL or area_id IS NOT NULL)
GROUP BY id, deleted_at
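The duplicate-row effect mentioned above can be reproduced with a small sketch. SQLite (via Python's sqlite3) stands in for PostgreSQL here, and the sample rows, with area 2 registered twice, are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE area (id INTEGER PRIMARY KEY, deleted_at TEXT DEFAULT NULL);
    CREATE TABLE registration (area_id INTEGER NOT NULL REFERENCES area(id));
    -- 1: live, unused   2: live, used twice   3: deleted, used   4: deleted, unused
    INSERT INTO area (id, deleted_at) VALUES
        (1, NULL), (2, NULL), (3, '2020-01-01'), (4, '2020-01-01');
    INSERT INTO registration (area_id) VALUES (2), (2), (3);
""")

# LEFT JOIN version: one output row per matching registration.
join_ids = [r[0] for r in conn.execute("""
    SELECT area.id
    FROM area
    LEFT OUTER JOIN registration ON id = area_id
    WHERE deleted_at IS NULL OR area_id IS NOT NULL
    ORDER BY area.id
""")]

# IN version: each area appears at most once.
in_ids = [r[0] for r in conn.execute("""
    SELECT id
    FROM area
    WHERE deleted_at IS NULL OR id IN (SELECT area_id FROM registration)
    ORDER BY id
""")]

print(join_ids)  # [1, 2, 2, 3]
print(in_ids)    # [1, 2, 3]
```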

Related

Optional filter on a column of an outer joined table in the where clause

I have got two tables:
create table student
(
studentid bigint primary key not null,
name varchar(200) not null
);
create table courseregistration
(
studentid bigint not null,
coursenamename varchar(200) not null,
isfinished boolean default false
);
--insert some data
insert into student values(1,'Dave');
insert into courseregistration values(1,'SQL',true);
The student is fetched by id, so it should always be returned in the result. An entry in courseregistration is optional: it should be returned if there are matching rows, and those matching rows should be filtered on isfinished=false. In other words, I want to get the course registrations that are not finished yet. I tried to outer join student with courseregistration and filter courseregistration on isfinished=false. Note that I still want to retrieve the student.
Trying this returns no rows:
select * from student
left outer join courseregistration using(studentid)
where studentid = 1
and courseregistration.isfinished = false
What I'd want in the example above, is a result set with 1 row student, but course rows null (because the only example has the isfinished=true). One more constraint though. If there is no corresponding row in courseregistration, there should still be a result for the student entry.
This is an adjusted example. I can tweak my code to solve the problem, but I really wonder, what is the "correct/smart way" of solving this in postgresql?
PS I have used the (+) in Oracle previously to solve similar issues.
Isn't this what you are looking for?
select * from student s
left outer join courseregistration cr
on s.studentid = cr.studentid
and cr.isfinished = false
where s.studentid = 1
db<>fiddle here
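To see why moving the filter from WHERE into the ON clause matters, here is a minimal sketch using the question's sample data. SQLite (via Python's sqlite3) stands in for PostgreSQL, 0/1 stand in for the boolean column, and the column is named coursename here for readability; all of these are assumptions of the demo.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; 0/1 replace false/true.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (studentid INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE courseregistration (
        studentid  INTEGER NOT NULL,
        coursename TEXT NOT NULL,
        isfinished INTEGER DEFAULT 0
    );
    INSERT INTO student VALUES (1, 'Dave');
    INSERT INTO courseregistration VALUES (1, 'SQL', 1);
""")

# Filter in WHERE: applied after the join, so the NULL-extended row
# (or the finished course row) is rejected and the student disappears.
filter_in_where = conn.execute("""
    SELECT s.studentid, s.name, cr.coursename
    FROM student s
    LEFT OUTER JOIN courseregistration cr ON s.studentid = cr.studentid
    WHERE s.studentid = 1 AND cr.isfinished = 0
""").fetchall()

# Filter in ON: only restricts which registrations match;
# the student row survives with NULLs for the course columns.
filter_in_on = conn.execute("""
    SELECT s.studentid, s.name, cr.coursename
    FROM student s
    LEFT OUTER JOIN courseregistration cr
        ON s.studentid = cr.studentid AND cr.isfinished = 0
    WHERE s.studentid = 1
""").fetchall()

print(filter_in_where)  # []
print(filter_in_on)     # [(1, 'Dave', None)]
```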

Merge two tables in Postgresql giving preference to one particular table

I have two tables, Users and Masters. Users holds user-specific setting key-value pairs. Masters holds the master setting key-value pairs. I want to display key-value pairs from the two tables, where:
1. if the user does not have a particular key, it is taken from Masters
2. if the user does not exist in the Users table, all key-value pairs from Masters are displayed
3. if the user has a key-value pair, the user's key-value pair is displayed
Example:
Inputs being - UserID and appID = 1.
I tried a left join combination, but I don't get the desired result if the user does not exist at all in the Users table.
Could you please give me some advice?
step-by-step demo:db<>fiddle
SELECT
COALESCE(m.app_id, u.app_id) as app_id,
COALESCE(m.setting_key, u.setting_key) as setting_key,
COALESCE(u.setting_value, m.setting_value) as setting_value -- 2
FROM
master_table m
FULL OUTER JOIN -- 1
user_table u
ON m.app_id = u.app_id AND m.setting_key = u.setting_key
WHERE COALESCE(m.app_id, u.app_id) = 1 -- 3
AND (u.user_id = 1 OR u.user_id IS NULL)
1. You need a FULL OUTER JOIN to also include rows that exist in only one of the two tables.
2. COALESCE(a, b) gives you the first non-null value. So if a (here the user value) is available, it is returned; otherwise b (here the master value) is returned.
3. Filter by app_id and user_id. The user_id filter must also accept user_id IS NULL to get all setting_keys. Of course, you could use COALESCE here as well: COALESCE(u.user_id, 1) = 1, where the last 1 is the specific user_id you are asking for.
Edit: If User does not exist, give out the Masters values for app_id:
step-by-step demo:db<>fiddle:
SELECT DISTINCT ON (app_id, setting_key) -- 3
*
FROM (
SELECT
COALESCE(user_app_id, master_app_id) AS app_id, -- 2
COALESCE(user_setting_key, master_setting_key) AS setting_key,
COALESCE(user_setting_value, master_setting_value) AS setting_value,
user_id
FROM (
SELECT
app_id as master_app_id,
setting_key as master_setting_key,
setting_value as master_setting_value,
null as user_id,
null as user_app_id,
null as user_setting_key,
null as user_setting_value
FROM
master_table m
UNION -- 1
SELECT
*
FROM
master_table m
FULL OUTER JOIN
user_table u
ON m.app_id = u.app_id AND m.setting_key = u.setting_key
) s
) s
WHERE app_id = 1
AND (user_id = 2 OR user_id IS NULL)
ORDER BY app_id, setting_key, user_id NULLS LAST -- 3
This is a little more complicated. You need a separate data set with user_id IS NULL that can be fetched as a fallback, so the NULL user represents the unknown user.
1. You can achieve this by adding the Master table with NULL user columns using a UNION.
2. Now you can create the expected columns with the COALESCE() function as described above.
3. The third trick is the DISTINCT ON clause on the app_id and setting_key columns. Because the ORDER BY sorts the rows with a NULL user_id from the default UNION part in (1) last, DISTINCT ON picks the user record when there is one. When the user doesn't exist, DISTINCT ON picks the default Master record.
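The COALESCE fallback itself can be tried in isolation. The sketch below is not the FULL OUTER JOIN / DISTINCT ON query from this answer: it swaps in a plain LEFT JOIN from master_table with the user_id test inside the ON clause, and it assumes every setting key has a master row. SQLite (via Python's sqlite3) stands in for PostgreSQL, the helper `settings_for` is hypothetical, and the sample settings are made up.

```python
import sqlite3

# Assumptions: SQLite stands in for PostgreSQL; every key exists in master_table;
# the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_table (app_id INTEGER, setting_key TEXT, setting_value TEXT);
    CREATE TABLE user_table (user_id INTEGER, app_id INTEGER,
                             setting_key TEXT, setting_value TEXT);
    INSERT INTO master_table VALUES (1, 'theme', 'light'), (1, 'lang', 'en');
    INSERT INTO user_table VALUES (1, 1, 'theme', 'dark');
""")

def settings_for(user_id, app_id):
    """Return (key, value) pairs, preferring the user's value over the master's."""
    return conn.execute("""
        SELECT m.setting_key,
               COALESCE(u.setting_value, m.setting_value) AS setting_value
        FROM master_table m
        LEFT JOIN user_table u
               ON u.app_id = m.app_id
              AND u.setting_key = m.setting_key
              AND u.user_id = ?          -- user filter lives in ON, not WHERE
        WHERE m.app_id = ?
        ORDER BY m.setting_key
    """, (user_id, app_id)).fetchall()

existing_user = settings_for(1, 1)
unknown_user = settings_for(2, 1)

print(existing_user)  # [('lang', 'en'), ('theme', 'dark')]
print(unknown_user)   # [('lang', 'en'), ('theme', 'light')]
```

If users can hold keys that have no master row, this simplification no longer applies and something like the FULL OUTER JOIN approach above is needed.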

Show records that have only one matching row in another table

I need to write some SQL that is probably very simple, but I am very new to it.
I need to find all the records from one table that have a matching id (but no more than one) in the other table. E.g. one table contains records of the employees and the second one contains the employees' telephone numbers. I need to find all employees with only one telephone number.
Sample data would be nice. In its absence:
SELECT
employees.employee_id
FROM
employees
LEFT JOIN
(SELECT distinct on(employee_id) employee_id FROM emp_phone) AS phone
ON
employees.employee_id = phone.employee_id
WHERE
phone.employee_id IS NOT NULL;
You need a join of the 2 tables, group by employee and the condition in the having clause:
SELECT e.employee_id, e.name
FROM employees e INNER JOIN numbers n
ON e.employee_id = n.employee_id
GROUP BY e.employee_id, e.name
HAVING COUNT(*) = 1;
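A quick way to check the GROUP BY / HAVING query is to run it against a few rows. This is a sketch: SQLite (via Python's sqlite3) stands in for PostgreSQL, the table names employees and numbers follow the query above, and the column tel_number and the sample rows are assumptions.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE numbers (employee_id INTEGER NOT NULL, tel_number TEXT NOT NULL);
    INSERT INTO employees VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Ann has one number, Bob has two, Cy has none.
    INSERT INTO numbers VALUES (1, '111'), (2, '222'), (2, '333');
""")

single_number = conn.execute("""
    SELECT e.employee_id, e.name
    FROM employees e
    INNER JOIN numbers n ON e.employee_id = n.employee_id
    GROUP BY e.employee_id, e.name
    HAVING COUNT(*) = 1
    ORDER BY e.employee_id
""").fetchall()

print(single_number)  # [(1, 'Ann')]
```

The inner join drops employees without any number, and HAVING drops those with more than one.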
If there can be more than a few numbers per employee in the table with the employees' telephone numbers (calling it tel), then it's cheaper to avoid GROUP BY and HAVING, which have to process all rows. Find employees with "unique" numbers using a self-anti-join with NOT EXISTS.
As long as you don't need more than the employee_id and their unique phone number, you don't even have to involve the employee table at all:
SELECT *
FROM tel t
WHERE NOT EXISTS (
SELECT FROM tel
WHERE employee_id = t.employee_id
AND tel_number <> t.tel_number -- or use PK column
);
If you need additional columns from the employee table:
SELECT * -- or any columns you need
FROM (
SELECT employee_id AS id, tel_number -- or any columns you need
FROM tel t
WHERE NOT EXISTS (
SELECT FROM tel
WHERE employee_id = t.employee_id
AND tel_number <> t.tel_number -- or use PK column
)
) t
JOIN employee e USING (id);
The column alias in the subquery (employee_id AS id) is just for convenience. Then the outer join condition can be USING (id), and the ID column is only included once in the result, even with SELECT * ...
Simpler with a smart naming convention that uses employee_id for the employee ID everywhere. But it's a widespread anti-pattern to use employee.id instead.
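The NOT EXISTS anti-join can be sketched the same way. SQLite (via Python's sqlite3) stands in for PostgreSQL, so the subquery uses `SELECT 1` because SQLite does not accept PostgreSQL's empty select list; the sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tel (employee_id INTEGER NOT NULL, tel_number TEXT NOT NULL);
    -- Employee 1 has one number, employee 2 has two.
    INSERT INTO tel VALUES (1, '111'), (2, '222'), (2, '333');
""")

# Keep a row only if the same employee has no *other* number.
unique_numbers = conn.execute("""
    SELECT t.employee_id, t.tel_number
    FROM tel t
    WHERE NOT EXISTS (
        SELECT 1
        FROM tel
        WHERE employee_id = t.employee_id
          AND tel_number <> t.tel_number
    )
    ORDER BY t.employee_id
""").fetchall()

print(unique_numbers)  # [(1, '111')]
```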
Related:
JOIN table if condition is satisfied, else perform no join

Map column value to table name and join

I have a composite type that looks like
CREATE TYPE member AS (
id BIGINT,
type CHAR(1)
);
I have a table that relies on this member type with an array.
CREATE TABLE relation (
id BIGINT PRIMARY KEY,
members member[]
);
I have three other tables each with a different schema (but having common id field)
CREATE TABLE table_x (
id BIGINT PRIMARY KEY,
some_text TEXT
);
CREATE TABLE table_y (
id BIGINT PRIMARY KEY,
some_int INT
);
CREATE TABLE table_z (
id BIGINT PRIMARY KEY,
some_date TIMESTAMP
);
type field in member type is just one character to find out table that specific member belongs to. A row in relation table can have a mix of different types.
I have a scenario which requires returning relation ids with at least one member fulfilling a certain condition based on its type (let's say for x => some_text is not empty, or y => some_int is greater than 10, or z => some_date is a week from now).
I can implement this scenario on the application side by making multiple requests to the database:
unnest relation table
collect member data per relation
make new requests to find out relations
I am wondering if there is a way to map column values to table names and join them.
Assumption
I'm assuming that the relation.members array does not have more than one member element of the same type. Correct?
Query to try
with unnested_members as (
-- Unnest members array
select id, unnest(members) members
from relation
)
, members_joined as (
-- left join on a per type basis with table_x, table_y and table_z.
select r.id, (r.members).id idext, (r.members).type,
x.some_text, y.some_int, z.some_date -- more types, more columns here
from unnested_members r
left join table_x x on (x.id = (r.members).id and (r.members).type = 'x')
left join table_y y on (y.id = (r.members).id and (r.members).type = 'y')
left join table_z z on (z.id = (r.members).id and (r.members).type = 'z')
-- More types, more tables to left join
)
select id,
max(some_text) some_text, -- use max() to get not null value for this id
max(some_int) some_int, -- use max() to get not null value for this id
max(some_date) some_date -- use max() to get not null value for this id
-- more types, more max() columns here
from members_joined
group by id -- get one row per relation.id with data from joined table_* columns
If you need to include more tables then you have to include these tables in the left join part, include the column in the select list and in the max() section as well.
@JNevill had a good point about this database design. Although this approach may not seem optimal, it keeps the table definitions clearly separate without any relations in between them. Also, the relation table is fairly small compared to the other three tables.
I solved the problem by simply fetching rows per type and merging them:
SELECT relation.* FROM relation, UNNEST(relation.members) member INNER JOIN table_x ON member.id = table_x.id WHERE member.type = 'x' AND table_x.some_text = 'some text value'
UNION
SELECT relation.* FROM relation, UNNEST(relation.members) member INNER JOIN table_y ON member.id = table_y.id WHERE member.type = 'y' AND table_y.some_int = 123
UNION
SELECT relation.* FROM relation, UNNEST(relation.members) member INNER JOIN table_z ON member.id = table_z.id WHERE member.type = 'z' AND table_z.some_date > '2017-01-11 00:00:00';

Choosing the first child record in a selfjoin in TSQL

I've got a visits table that looks like this:
id identity(1,1) not null,
visit_date datetime not null,
patient_id int not null,
flag bit not null
For each record, I need to find a matching record that is same time or earlier, has the same patient_id, and has flag set to 1. What I am doing now is:
select parent.id as parent_id,
(
select top 1
child.id as child_id
from
visits as child
where
child.visit_date <= parent.visit_date
and child.patient_id = parent.patient_id
and child.flag = 1
order by
visit_date desc
) as child_id
from
visits as parent
So, this query works correctly, except that it runs too slow -- I suspect that this is because of the subquery. Is it possible to rewrite it as a joined query?
View the query execution plan and look at the operators that have thick arrows. It is worth learning what the different operators imply, such as Clustered Index Scan versus Clustered Index Seek.
Usually, though, when a query is slow I find that good indexes are missing.
Create an index that covers all the columns the query uses, including those used to join and filter. This is usually called a covering index. Reserve it for queries that really need it, and keep in mind that too many indexes will slow down insert statements.
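As a rough sketch of the indexing idea, the snippet below creates an index on the columns the correlated subquery filters and sorts on, then runs the question's query. SQLite (via Python's sqlite3) stands in for SQL Server here, so `LIMIT 1` replaces `TOP 1`; the index name and sample rows are hypothetical. In SQL Server the comparable index would be on (patient_id, flag, visit_date), possibly with id as an included column; whether it actually helps should be confirmed in the execution plan.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; index name and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (
        id         INTEGER PRIMARY KEY,
        visit_date TEXT    NOT NULL,
        patient_id INTEGER NOT NULL,
        flag       INTEGER NOT NULL
    );
    -- Equality columns first, then the range/sort column.
    CREATE INDEX ix_visits_patient_flag_date
        ON visits (patient_id, flag, visit_date);
    INSERT INTO visits (id, visit_date, patient_id, flag) VALUES
        (1, '2020-01-01', 7, 1),
        (2, '2020-01-05', 7, 0),
        (3, '2020-01-09', 7, 1),
        (4, '2020-01-03', 8, 0);
""")

# The question's query: latest flagged visit at or before each visit.
rows = conn.execute("""
    SELECT parent.id AS parent_id,
           (SELECT child.id
            FROM visits AS child
            WHERE child.visit_date <= parent.visit_date
              AND child.patient_id = parent.patient_id
              AND child.flag = 1
            ORDER BY child.visit_date DESC
            LIMIT 1) AS child_id
    FROM visits AS parent
    ORDER BY parent.id
""").fetchall()

print(rows)  # [(1, 1), (2, 1), (3, 3), (4, None)]
```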
/*
id identity(1,1) not null,
visit_date datetime not null,
patient_id int not null,
flag bit not null
*/
SELECT
T.parentId,
T.patientId,
V.id AS childId
FROM
(
SELECT
visit.id AS parentId,
visit.patient_id AS patientId,
MAX (previousVisit.visit_date) AS previousVisitDate -- alias must match the join below
FROM
visits AS visit
LEFT JOIN visits AS previousVisit ON
visit.patient_id = previousVisit.patient_id
AND visit.visit_date >= previousVisit.visit_date
AND visit.id <> previousVisit.id
AND previousVisit.flag = 1
GROUP BY
visit.id,
visit.visit_date,
visit.patient_id,
visit.flag
) AS T
LEFT JOIN visits AS V ON
T.patientId = V.patient_id
AND T.previousVisitDate = V.visit_date
AND V.flag = 1 -- the matched visit itself must be flagged