Count after filtering/joining two tables - postgresql

I think I am missing something obvious or otherwise on the wrong path here. I am using postgresql (new to it).
I have two tables:
TABLE A:
id  age
-------
1   20
2   55
3   65
4   75
5   85
TABLE B:
id  service1  service2
----------------------
1   Yes       Yes
2   Yes       No
3   Yes       Yes
4   Yes       Yes
5   No        Yes
I want to get the count of all customers over the age of 55 with service1 and service2.
When I use the code below, I get the correct list of customers, but switching to select count(*) gives me a count per id rather than a single total.
SELECT *
FROM A
INNER JOIN B on A.id = B.id
WHERE A.age > 54
AND B.service1 = 'Yes'
AND B.service2 = 'Yes'
GROUP BY A.id, B.id;
I am looking for a total count but I end up with something like this:
count
1
1
2
1
I am sure this has been answered many times but I am having a hard time searching and finding it. I am new to this, so my google skills are not up to par yet. Thank you!

The GROUP BY is what splits the count into one row per id. Drop it and aggregate over the entire joined result:
SELECT COUNT(*)
FROM A a
INNER JOIN B b
ON b.id = a.id
WHERE b.service1 = 'Yes' AND b.service2 = 'Yes' AND
a.age > 55;
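To see the difference side by side, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's built-in sqlite3 module. SQLite is only a stand-in for PostgreSQL here (an assumption made so the example runs anywhere); the two queries themselves are plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE B (id INTEGER PRIMARY KEY, service1 TEXT, service2 TEXT);
    INSERT INTO A VALUES (1, 20), (2, 55), (3, 65), (4, 75), (5, 85);
    INSERT INTO B VALUES (1, 'Yes', 'Yes'), (2, 'Yes', 'No'), (3, 'Yes', 'Yes'),
                         (4, 'Yes', 'Yes'), (5, 'No', 'Yes');
""")

# Shared join and filter, reused by both queries below.
FILTER = """
    FROM A a
    INNER JOIN B b ON b.id = a.id
    WHERE a.age > 55 AND b.service1 = 'Yes' AND b.service2 = 'Yes'
"""

# With GROUP BY: one row per id, each carrying its own count.
per_id = conn.execute(
    "SELECT COUNT(*)" + FILTER + "GROUP BY a.id ORDER BY a.id"
).fetchall()

# Without GROUP BY: a single row holding the total.
total = conn.execute("SELECT COUNT(*)" + FILTER).fetchone()[0]

print(per_id)  # [(1,), (1,)]
print(total)   # 2
```

Only ids 3 and 4 are over 55 with both services, so the grouped query returns two rows of 1 and the ungrouped query returns 2.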

Related

PostgreSQL Query not returning the proper results

So this is my table structure:
learning_paths: id, name, version, created_at, updated_at
learning_path_levels: id, name, learning_path_id, order, created_at, updated_at
learning_path_level_nodes: id, name, description, documentation_links, evaluation_methodology, learning_path_level_id, created_at, updated_at
learning_path_node_users: id, learning_path_level_node_id, user_id, evaluated_by, evaluated_at, is_successful, created_at, updated_at
I'm writing a query to retrieve the learning_path_name, the number of levels each learning path has, the pending and completed nodes per level for the user, and the total number of nodes per level.
I have the following query
select learning_paths."name",
sum(case when learning_path_node_users.is_successful and learning_path_node_users.user_id is not null then 1 else 0 end) as completed_nodes,
sum(case when learning_path_node_users.is_successful = false or learning_path_node_users.user_id is null then 1 else 0 end) as pending_nodes,
count(learning_path_levels.id) as total_levels,
count(*) as total_nodes
from learning_path_level_nodes
inner join learning_path_levels on learning_path_levels.id = learning_path_level_nodes.learning_path_level_id
inner join learning_paths on learning_paths.id = learning_path_levels.learning_path_id
left join learning_path_node_users on learning_path_node_users.learning_path_level_node_id = learning_path_level_nodes.id
group by learning_paths."name"
which returns:
name             completed_nodes  pending_nodes  total_levels  total_nodes
---------------------------------------------------------------------------
Devops           5                3              8             8
QA               0                1              1             1
Project manager  3                3              6             6
AI               0                5              5             5
Everything is correct except the level count.
For example, for Devops it should be 2, but it returns 8.
For Project manager it should be 2, but it returns 6.
The pattern I see is that it returns the number of nodes as the number of levels.
How can I fix this?
I'd really appreciate any help or suggestions, as I've been struggling with this.
Thanks in advance
EDIT: As per your suggestion, I'm attaching a fiddle with the tables and data.
https://dbfiddle.uk/?rdbms=postgres_14&fiddle=f29676ff7051686a28de96928db1e3a6
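I don't get the exact results you want, but I think you need a distinct in the count for the total levels. count(learning_path_levels.id) counts every joined row whose level id is not null, and the join produces one row per node, which is why it matches total_nodes: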
select
lp.name,
sum(case when u.is_successful and u.user_id is not null then 1 else 0 end) as completed_nodes,
sum(case when u.is_successful = false or u.user_id is null then 1 else 0 end) as pending_nodes,
count(distinct lpl.id) as total_levels, -- added "distinct"
array_agg (lpl.id) as level_detail, -- debugging aid
count(*) as total_nodes
from
learning_path_level_nodes n
join learning_path_levels lpl on lpl.id = n.learning_path_level_id
join learning_paths lp on lp.id = lpl.learning_path_id
left join learning_path_node_users u on u.learning_path_level_node_id = n.id
group by
lp.name
To help expose the rationale, I added the field level_detail, which shows why the results are what they are. You can remove it once the results are what you want.
If it's not what you expect, perhaps you can explain or give by example what I might be missing.
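A small sketch of why the distinct matters, using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The tables are cut down to their key columns and the rows are made up for illustration (one path, two levels, four nodes); none of this data comes from the fiddle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE learning_paths (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE learning_path_levels (id INTEGER PRIMARY KEY, learning_path_id INTEGER);
    CREATE TABLE learning_path_level_nodes (id INTEGER PRIMARY KEY, learning_path_level_id INTEGER);
    INSERT INTO learning_paths VALUES (1, 'Devops');
    INSERT INTO learning_path_levels VALUES (10, 1), (11, 1);
    -- made-up data: three nodes on level 10, one node on level 11
    INSERT INTO learning_path_level_nodes VALUES (100, 10), (101, 10), (102, 10), (103, 11);
""")

row = conn.execute("""
    SELECT lp.name,
           COUNT(lpl.id)          AS joined_rows,   -- one per joined row
           COUNT(DISTINCT lpl.id) AS total_levels,  -- one per distinct level
           COUNT(*)               AS total_nodes
    FROM learning_path_level_nodes n
    JOIN learning_path_levels lpl ON lpl.id = n.learning_path_level_id
    JOIN learning_paths lp ON lp.id = lpl.learning_path_id
    GROUP BY lp.name
""").fetchone()

print(row)  # ('Devops', 4, 2, 4)
```

The plain count follows the number of nodes, while the distinct count collapses repeated level ids down to the two levels.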

need a true false answer for multiple conditions when joining two tables

I have two tables. One holds information about a sampleid (sampleid is the primary key), and the other holds the conditions each sampleid has (sampleid is not the primary key there, as a sample may have multiple conditions). I would like to know whether a sampleid has a specific condition (Y/N), but I am not sure how to join the tables without getting multiple rows per sampleid.
eg
sampleid colour
-----------------------
1 blue
2 red
3 green
sampleid condition
-----------------------
1 23
1 81
1 94
2 81
2 94
3 23
I want to ask if the sampleid has condition 23 and return:
sampleid colour condition23
----------------------------------------------
1 blue Y
2 red N
3 green Y
Hope this is clear. Every time I join them I end up with multiple rows per sampleid. I am a newbie and trying to find my way!
Thanks in advance
F
This can be done using a left join and case something like this:
SELECT
s.sampleId,
s.colour,
case when c.condition is null
then 'N'
else 'Y'
end condition23
FROM
samples s
LEFT JOIN conditions c
ON s.sampleId = c.sampleId
AND c.condition = 23
Try this query:
select s.*, case when c.condition is null then 'N' else 'Y' end condition23
from samples s
left join
(select * from conditions where condition = 23) c on s.sampleid = c.sampleid
With EXISTS:
select
s.*,
case
when exists (
select 1 from conditions where sampleid = s.sampleid and condition = 23
) then 'Y'
else 'N'
end condition23
from samples s
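For a quick check that the LEFT JOIN and EXISTS variants agree, here is a sketch that runs both against the sample data from the question, using Python's built-in sqlite3 module as a stand-in database. The table names samples and conditions are the ones the answers assume; the question does not name its tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE samples (sampleid INTEGER PRIMARY KEY, colour TEXT);
    CREATE TABLE conditions (sampleid INTEGER, condition INTEGER);
    INSERT INTO samples VALUES (1, 'blue'), (2, 'red'), (3, 'green');
    INSERT INTO conditions VALUES (1, 23), (1, 81), (1, 94), (2, 81), (2, 94), (3, 23);
""")

# Variant 1: filter inside the join condition so unmatched samples survive as NULLs.
left_join = conn.execute("""
    SELECT s.sampleid, s.colour,
           CASE WHEN c.condition IS NULL THEN 'N' ELSE 'Y' END AS condition23
    FROM samples s
    LEFT JOIN conditions c ON c.sampleid = s.sampleid AND c.condition = 23
    ORDER BY s.sampleid
""").fetchall()

# Variant 2: no join at all, just a yes/no probe per sample.
with_exists = conn.execute("""
    SELECT s.sampleid, s.colour,
           CASE WHEN EXISTS (SELECT 1 FROM conditions c
                             WHERE c.sampleid = s.sampleid AND c.condition = 23)
                THEN 'Y' ELSE 'N' END AS condition23
    FROM samples s
    ORDER BY s.sampleid
""").fetchall()

print(left_join)  # [(1, 'blue', 'Y'), (2, 'red', 'N'), (3, 'green', 'Y')]
```

One design difference: if conditions could hold the same (sampleid, condition) pair twice, the LEFT JOIN would return that sample twice, whereas the EXISTS form always returns exactly one row per sample.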

Subsetting records that contain multiple values in one column

In my postgres table, I have two columns of interest: id and name. My goal is to keep only records where the id has more than one value in name. In other words, I would like to keep all records for ids that have multiple values, at least one of which is B.
UPDATE: I have tried adding WHERE EXISTS to the queries below but this does not work
The sample data would look like this:
> test
id name
1 1 A
2 2 A
3 3 A
4 4 A
5 5 A
6 6 A
7 7 A
8 2 B
9 1 B
10 2 B
and the output would look like this:
> output
id name
1 1 A
2 2 A
8 2 B
9 1 B
10 2 B
How would one write a query to select only these kinds of records?
Based on your description you would seem to want:
select id, name
from (select t.*, min(name) over (partition by id) as min_name,
max(name) over (partition by id) as max_name
from t
) t
where min_name < max_name;
This can be done using EXISTS:
select id, name
from test t1
where exists (select *
from test t2
where t1.id = t2.id
and t1.name <> t2.name) -- this will select those with multiple names for the id
and exists (select *
from test t3
where t1.id = t3.id
and t3.name = 'B') -- this will select those with at least one b for that id
You want the records whose id appears with more than one name, right?
This could be formulated in SQL as follows:
select * from test t1
where id in (
select id
from test t2
group by id
having count(distinct name) > 1) -- count(name) would also match ids that repeat a single name
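As a sketch, both requirements (more than one distinct name, and at least one B) can be folded into a single HAVING clause. The example below runs against the sample data from the question, with Python's built-in sqlite3 module standing in for PostgreSQL; the SUM(CASE ...) test for B is one possible way to express the second condition, not something taken from the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (id INTEGER, name TEXT);
    INSERT INTO test VALUES (1, 'A'), (2, 'A'), (3, 'A'), (4, 'A'), (5, 'A'),
                            (6, 'A'), (7, 'A'), (2, 'B'), (1, 'B'), (2, 'B');
""")

kept = conn.execute("""
    SELECT id, name
    FROM test
    WHERE id IN (
        SELECT id
        FROM test
        GROUP BY id
        HAVING COUNT(DISTINCT name) > 1                         -- more than one distinct name
           AND SUM(CASE WHEN name = 'B' THEN 1 ELSE 0 END) > 0  -- at least one 'B'
    )
    ORDER BY name, id
""").fetchall()

print(kept)  # [(1, 'A'), (2, 'A'), (1, 'B'), (2, 'B'), (2, 'B')]
```

Ids 3 to 7 only ever appear with A, so they drop out; ids 1 and 2 keep all of their rows, including the duplicate (2, B).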

Limit for inner Join Table

I have a scenario where I am joining three tables and getting the results.
My problem is that I have to apply a limit to the joined table.
Take the example below. I have three tables: 1) Books, 2) Customer and 3) Authors. I need to find the list of books sold today, with author and customer name, for the book ids I pass in, but I only need the last n customers per book, not all of them.
Books Customer Authors
--------------- ---------------------- -------------
Id Name AID Id BID Name Date AID Name
1 1 1 ABC 1 A1
2 2 1 CED 2 A2
3 3 2 DFG
How can we achieve this?
You are looking for LATERAL.
Sample:
SELECT B.Id, C.Name
FROM Books B,
LATERAL (SELECT * FROM Customer cu WHERE cu.BID = B.Id ORDER BY cu.Id DESC LIMIT N) C -- N = number of customers to keep per book
WHERE B.Id = ANY(ids) -- ids = array of book ids passed in
AND C.Date = CURRENT_DATE
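To try the "last n rows per parent" idea without a PostgreSQL server, here is a sketch using Python's built-in sqlite3 module. SQLite has no LATERAL, so this swaps in a correlated IN subquery with ORDER BY and LIMIT, which gives the same per-book limit; it is an alternative technique, not the LATERAL query itself. The rows are made up for illustration, and the date filter and Authors table are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer (id INTEGER PRIMARY KEY, bid INTEGER, name TEXT);
    -- made-up rows: book 1 has three buyers, book 2 has one
    INSERT INTO books VALUES (1, 'Book1'), (2, 'Book2');
    INSERT INTO customer VALUES (1, 1, 'ABC'), (2, 1, 'CED'), (3, 1, 'DFG'), (4, 2, 'XYZ');
""")

n = 2  # how many of the latest customers to keep per book
rows = conn.execute("""
    SELECT b.id, c.name
    FROM books b
    JOIN customer c ON c.bid = b.id
    WHERE c.id IN (SELECT c2.id            -- the n highest customer ids for this book
                   FROM customer c2
                   WHERE c2.bid = b.id
                   ORDER BY c2.id DESC
                   LIMIT ?)
    ORDER BY b.id, c.id DESC
""", (n,)).fetchall()

print(rows)  # [(1, 'DFG'), (1, 'CED'), (2, 'XYZ')]
```

Book 1 has three buyers but only the two most recent come back, while book 2 keeps its single buyer.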

How to determine whether a value exists in a junction table and return zero or one?

I am using SQL Server 2008 R2
I am trying to write a single query that will return only exactly what I need. I will drop in a MovieID and get back a list of ALL genres. If the movie represents a specific genre (has an associated record in the junction table), the Checked value will be 1. If not, then 0.
My result set should look like this:
GenreID Genre Checked
1 ABC 0
2 DEF 1
3 HIJ 0
4 KLM 1
My First table is named Genres. It looks like this:
GenreID Genre
1 ABC
2 DEF
3 HIJ
4 KLM
My second table is named Movies. It looks like this:
MovieID Title
1 Blah
2 Foo
3 Carpe
4 Diem
My third table is a junction table named Movies_Genres. It looks like this:
MovieID GenreID
1 2
1 1
1 4
2 1
2 3
3 4
4 1
I would normally do a couple of queries and a couple of loops to handle this, but I really want to make the database do the work here. How do I tweak my query so that I can get the result set that I need with just a single query?
Here's the starting query:
SELECT GenreID,
Genre
FROM Genres
Thanks in advance for your help!!!
SELECT g.GenreID, g.Genre, Checked = CASE WHEN EXISTS
(SELECT 1 FROM dbo.Movies_Genres AS mg
INNER JOIN dbo.Movies AS m
ON mg.MovieID = m.MovieID
WHERE mg.GenreID = g.GenreID
AND m.MovieID = @MovieID) THEN 1 ELSE 0 END
FROM dbo.Genres AS g
ORDER BY g.GenreID;
If there is a unique constraint or primary key on dbo.Movies_Genres(MovieID, GenreID) then this can be simply:
SELECT g.GenreID, g.Genre, Checked = COUNT(mg.GenreID)
FROM dbo.Genres AS g
LEFT OUTER JOIN dbo.Movies_Genres AS mg
ON g.GenreID = mg.GenreID
AND mg.MovieID = @MovieID
GROUP BY g.GenreID, g.Genre;
...since the count for any genre can only be 0 or 1 given a single @MovieID.
Pretty straightforward using CASE:
SELECT DISTINCT g.GenreID, g.Genre,
CASE WHEN mg.MovieID IS NULL THEN 0 ELSE 1 END Checked
FROM Genres g
LEFT JOIN Movies_Genres mg
ON g.GenreID=mg.GenreID
AND mg.MovieId=@MovieID;
Edit: If entries are guaranteed to be unique in Movies_Genres, you could choose to drop the DISTINCT.
@MovieID is the movie you want to filter by.
SELECT Genres.GenreID,
Genres.Genre,
CASE WHEN (Movies_Genres.GenreID IS NULL)
THEN 0
ELSE 1
END AS Checked
FROM Genres LEFT JOIN
Movies_Genres ON Movies_Genres.GenreID = Genres.GenreID AND
Movies_Genres.MovieID = @MovieID
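Here is a runnable sketch of the LEFT JOIN plus CASE pattern against the sample data from the question. It uses Python's built-in sqlite3 module rather than SQL Server (an assumption made so it runs anywhere), so the T-SQL variable becomes a `?` bind parameter, and genres_for is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Genres (GenreID INTEGER PRIMARY KEY, Genre TEXT);
    CREATE TABLE Movies_Genres (MovieID INTEGER, GenreID INTEGER,
                                PRIMARY KEY (MovieID, GenreID));
    INSERT INTO Genres VALUES (1, 'ABC'), (2, 'DEF'), (3, 'HIJ'), (4, 'KLM');
    INSERT INTO Movies_Genres VALUES (1, 2), (1, 1), (1, 4), (2, 1), (2, 3), (3, 4), (4, 1);
""")

def genres_for(movie_id):
    """Return every genre with Checked = 1 if the movie has it, else 0."""
    return conn.execute("""
        SELECT g.GenreID, g.Genre,
               CASE WHEN mg.MovieID IS NULL THEN 0 ELSE 1 END AS Checked
        FROM Genres g
        LEFT JOIN Movies_Genres mg
               ON mg.GenreID = g.GenreID AND mg.MovieID = ?
        ORDER BY g.GenreID
    """, (movie_id,)).fetchall()

print(genres_for(1))  # [(1, 'ABC', 1), (2, 'DEF', 1), (3, 'HIJ', 0), (4, 'KLM', 1)]
```

The movie filter has to sit in the ON clause, not in WHERE. Moving it to WHERE would discard the unmatched genres and turn the outer join back into an inner join.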