PostgreSQL - column reference is ambiguous - postgresql

I am trying to get the number of units per room. I have two separate tables, room and unit. A unit belongs to one room, but a room can have multiple units. I want a list of rooms with the number of units in each, including rooms that have 0 units. This worked until I also wanted to output room_id. room_id appears in both the room table and the unit table, so I get an error message stating that room_id is ambiguous. I would have expected the database to understand that I want room_id from the room table.
I have the following query:
SELECT count(ucr.*) units_no
, ucr.room_name
, ucr.room_image
, ucr.room_id
FROM (
SELECT u.*
, r.room_image
, r.room_name
, r.room_id
FROM unit u
LEFT JOIN room r ON r.room_id = u.room_id
WHERE r.room_id = 'b6229c33-a37e-4457-8fb0-941d632c2540'
) ucr
GROUP BY ucr.room_name, ucr.room_image, ucr.room_id;
I am getting the following error:
column reference "room_id" is ambiguous
I have tried the following:
, ucr.r.room_id
And also:
, ucr(r.room_id)
And also:
, ucr.(r.room_id)
I have run out of options. How do I do this? Thank you for taking the time to look at this issue.

The unit and room tables both have a column called room_id. Therefore, the inner select returns two columns named room_id, one from u.* and one from r.room_id:
SELECT u.*, r.room_image, r.room_name, r.room_id
so it isn't clear which of them ucr.room_id refers to in the outer query. You could alias the two room_id columns to unique names, but given that your query doesn't even seem to need the columns from the unit table, I would suggest:
SELECT COUNT(ucr.room_name) units_no,
ucr.room_name,
ucr.room_image,
ucr.room_id
FROM (
SELECT r.room_image, r.room_name, r.room_id
FROM unit u
LEFT JOIN room r ON r.room_id = u.room_id
WHERE r.room_id = 'b6229c33-a37e-4457-8fb0-941d632c2540'
) ucr
GROUP BY ucr.room_name, ucr.room_image, ucr.room_id;
Actually, the subquery itself seems unnecessary. Since you want the room listed even when it has no units, start from room and left join unit, counting a unit column so that an empty room yields 0:
SELECT r.room_image, r.room_name, r.room_id, COUNT(u.room_id) AS units_no
FROM room r
LEFT JOIN unit u
ON u.room_id = r.room_id
WHERE r.room_id = 'b6229c33-a37e-4457-8fb0-941d632c2540'
GROUP BY r.room_image, r.room_name, r.room_id;
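To check the zero-unit case concretely, here is a minimal runnable sketch using Python's built-in sqlite3 module rather than PostgreSQL. The sample rows and the unit_id column are assumptions made up for illustration; only room_id, room_name and room_image come from the question.

```python
import sqlite3

# In-memory database with hypothetical sample data (unit_id is an assumed column).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE room (room_id TEXT PRIMARY KEY, room_name TEXT, room_image TEXT);
    CREATE TABLE unit (unit_id INTEGER PRIMARY KEY, room_id TEXT);
    INSERT INTO room VALUES ('r1', 'Kitchen', 'kitchen.png'),
                            ('r2', 'Attic', 'attic.png');
    INSERT INTO unit (room_id) VALUES ('r1'), ('r1');
""")

# Start from room so every room survives the LEFT JOIN.
# COUNT(u.unit_id) ignores the NULLs produced for rooms without units,
# whereas COUNT(*) would report 1 for an empty room.
rows = conn.execute("""
    SELECT r.room_id, r.room_name, r.room_image, COUNT(u.unit_id) AS units_no
    FROM room r
    LEFT JOIN unit u ON u.room_id = r.room_id
    GROUP BY r.room_id, r.room_name, r.room_image
    ORDER BY r.room_id
""").fetchall()

for row in rows:
    print(row)
```

Because every selected column is qualified with r. or u., the room_id reference is never ambiguous, and the room without units still appears with a count of 0.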

Related

Get distinct row by primary key, but use value from another column

I'm trying to get the sum of the total time that was spent sending all emails within a campaign.
Because of the joins in my query, I end up with the 'processing_time' column duplicated over many rows, so sum(s.processing_time) as send_time will always overstate how long the campaign took to run.
select
c.id,
c.sender,
c.subject,
count(*) as total_items,
count(distinct s.id) as sends,
sum(s.processing_time) as send_time
from campaigns c
left join sends s on c.id = s.campaigns_id
left join opens o on s.id = o.sends_id
group by c.id;
Ideally I'd like to do something like sum(s.processing_time when distinct s.id), but I can't quite work out how to achieve that.
I have made other attempts using case, but I always run into the same issue: I need to get the distinct rows based on the ID column, but work with another column.
Since you want statistics related to distinct s.id as well as c.id, group by both columns. Collect the (intermediate) data that you need,
and use this table as the inner table in a nested sub-select query.
In the outer select, group by c.id alone.
Since the inner select groups by s.id, values which are unique per s.id will not get double-counted when you sum/group by c.id.
SELECT id
, sender
, subject
, sum(total_items) as total_items
, sum(sends) as sends
, sum(processing_time) as send_time
FROM (
SELECT
c.id
, s.id as sid
, count(*) as total_items
, 1 as sends
, s.processing_time
, c.sender
, c.subject
FROM campaigns c
LEFT JOIN sends s on c.id = s.campaigns_id
LEFT JOIN opens o on s.id = o.sends_id
GROUP BY c.id, c.sender, c.subject, s.processing_time, s.id) t
GROUP BY id, sender, subject
ORDER BY id
Since the final table includes sender and subject, you'll need to group by these columns as well to avoid an error such as:
ERROR: column "c.sender" must appear in the GROUP BY clause or be used in an aggregate function
LINE 14: , c.sender
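As a rough illustration of why the nested query avoids double counting, the sketch below runs both the naive and the two-level aggregation against a tiny made-up data set, using Python's sqlite3 module. The sample rows are assumptions; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (id INTEGER PRIMARY KEY, sender TEXT, subject TEXT);
    CREATE TABLE sends (id INTEGER PRIMARY KEY, campaigns_id INTEGER, processing_time INTEGER);
    CREATE TABLE opens (id INTEGER PRIMARY KEY, sends_id INTEGER);
    -- Hypothetical data: one campaign, two sends (5 and 7 time units), four opens.
    INSERT INTO campaigns VALUES (1, 'a@example.com', 'Hello');
    INSERT INTO sends VALUES (10, 1, 5), (11, 1, 7);
    INSERT INTO opens (sends_id) VALUES (10), (10), (10), (11);
""")

# Naive version: processing_time is repeated once per open, so the sum is inflated.
naive = conn.execute("""
    SELECT c.id, SUM(s.processing_time)
    FROM campaigns c
    LEFT JOIN sends s ON c.id = s.campaigns_id
    LEFT JOIN opens o ON s.id = o.sends_id
    GROUP BY c.id
""").fetchone()

# Two-level version: collapse to one row per send first, then sum per campaign.
nested = conn.execute("""
    SELECT id, SUM(total_items), SUM(sends), SUM(processing_time)
    FROM (
        SELECT c.id, s.id AS sid, COUNT(*) AS total_items,
               1 AS sends, s.processing_time
        FROM campaigns c
        LEFT JOIN sends s ON c.id = s.campaigns_id
        LEFT JOIN opens o ON s.id = o.sends_id
        GROUP BY c.id, s.id, s.processing_time
    ) t
    GROUP BY id
""").fetchone()

print(naive)   # (1, 22): 5 counted three times plus 7
print(nested)  # (1, 4, 2, 12): 4 joined rows, 2 sends, 5 + 7
```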

TSQL/SQL Server 2008 R2 - Recursive select consolidating self-referenced table Unit and apply SUM on UnitSale and UnitCharge

I've been searching here and everywhere else and I can't find a proper path to follow for my problem.
Here is the structure I am using:
Table [Unit] - represents a unit of an organization, like Management, General Coordination, Production Team 1, etc.
This table is self-referencing through the ParentID column, which points to its own key.
Table [UnitSale] - holds fictitious sales data, referencing a specific Unit.
Table [UnitCharge] - holds fictitious costs and charges of a specific Unit.
My goal is to select the Units, starting from the top-most member of the tree, recursively consolidating the child Units by applying SUM to each UnitSale and UnitCharge of the children, and finally applying these totals to the current Unit, in this case the top-most one.
Image of sample data: http://brit.dyndns-work.com:89/Brit/SampleData.png
Check the SQL Fiddle: http://sqlfiddle.com/#!3/75c3cc/3
Any help?
A CTE is a good way to go. I would, however, do it bottom-up: attribute the sales from each lower level to the levels above it, group by unit, and finally join to Unit for the description and calculate the rate. Check the updated fiddle: http://sqlfiddle.com/#!3/75c3cc/16/0.
with cte1 as
(
select u.id, u.parentid, s.salevalue, c.chargevalue
from Unit u
left join UnitSale s on s.unitid = u.id
left join UnitCharge c on c.unitid = u.id
union all
select u.id, u.parentid, x.salevalue, x.chargevalue
from Unit u
inner join cte1 x on x.parentid = u.id
)
, cte2 as
(
select id, sum(salevalue) as totalsale, sum(chargevalue) as totalcharge
from cte1
group by id
)
select u.id, u.description, u.parentid, x.totalsale, x.totalcharge, x.totalsale / x.totalcharge as rate
from cte2 x
inner join unit u on u.id = x.id
order by u.description
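The bottom-up roll-up can be tried on a small tree with the sketch below. It uses Python's sqlite3 module, so the CTE is written with SQLite's WITH RECURSIVE spelling instead of T-SQL's plain WITH, and the three-unit sample data is an assumption made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Unit (id INTEGER PRIMARY KEY, parentid INTEGER, description TEXT);
    CREATE TABLE UnitSale (unitid INTEGER, salevalue INTEGER);
    CREATE TABLE UnitCharge (unitid INTEGER, chargevalue INTEGER);
    -- Hypothetical chain: Management > Coordination > Team.
    INSERT INTO Unit VALUES (1, NULL, 'Management'),
                            (2, 1, 'Coordination'),
                            (3, 2, 'Team');
    -- One sale and one charge row per unit. With several rows of each per unit,
    -- joining both tables at once would multiply them, so pre-aggregate in that case.
    INSERT INTO UnitSale VALUES (1, 100), (2, 50), (3, 25);
    INSERT INTO UnitCharge VALUES (1, 10), (2, 20), (3, 5);
""")

totals = conn.execute("""
    WITH RECURSIVE cte1 AS (
        -- Anchor: each unit with its own sale and charge values.
        SELECT u.id, u.parentid, s.salevalue, c.chargevalue
        FROM Unit u
        LEFT JOIN UnitSale s ON s.unitid = u.id
        LEFT JOIN UnitCharge c ON c.unitid = u.id
        UNION ALL
        -- Recursive step: copy every value one level up, to the parent.
        SELECT u.id, u.parentid, x.salevalue, x.chargevalue
        FROM Unit u
        INNER JOIN cte1 x ON x.parentid = u.id
    )
    SELECT id, SUM(salevalue) AS totalsale, SUM(chargevalue) AS totalcharge
    FROM cte1
    GROUP BY id
    ORDER BY id
""").fetchall()

print(totals)  # [(1, 175, 35), (2, 75, 25), (3, 25, 5)]
```

Each value climbs the tree one parent at a time, so a unit's total ends up including its own rows plus those of all its descendants.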

SSRS 2005 column chart: show series label missing when data count is zero

I have a pretty simple chart with a likely common issue. I've searched the web for several hours but have only gotten so far in finding a similar situation.
The data I'm pulling contains a created_by, a person_id and a risk score.
The risk score can be:
1 VERY LOW
2 LOW
3 MODERATE STABLE
4 MODERATE AT RISK
5 HIGH
6 VERY HIGH
I want to get a headcount of persons at each risk score and display a risk count even if there is a count of 0 for that risk score but SSRS 2005 likes to suppress zero counts.
I've tried this in the point labels
=IIF(IsNothing(count(Fields!person_id.value)),0,count(Fields!person_id.value))
Example: I'm missing values for "1 LOW" because the creator has not assigned that risk score to anyone.
Here's a screenshot of what I get, but I'd like to have a column even for a risk score that doesn't appear in the returned results.
@Nathan
Example scenario:
select professor.name, grades.score, student.person_id
from student
inner join grades on student.person_id = grades.person_id
inner join professor on student.professor_id = professor.professor_id
where
student.professor_id = @professor
Not all students are necessarily in the grades table.
I have =Count(Fields!person_id.Value) for my data points, and the series is grouped on =Fields!score.Value.
If there were a bunch of A, B and D grades but no Cs or Fs, how would I show labels for the potentially non-existent counts?
In your example, the problem is that no rows are returned for grades that are not linked to any students. Ideally, to solve this, there would be a table in your source system listing all the possible values of "score" (e.g. A - F), and you would join it into your query so that at least one row is returned for each possible value.
If such a table doesn't exist and the possible score values are known and static, then you could manually create a list of them in your query. In the example below I create a subquery that returns a combination of all professors and all possible scores (A - F) and then LEFT join this to the grades and students tables (left join means that the professor/score rows will be returned even if no students have those scores in the "grades" table).
SELECT
professor.name
, professorgrades.score
, student.person_id
FROM
(
SELECT professor_id, score
FROM professor
CROSS JOIN
(
SELECT 'A' AS score
UNION
SELECT 'B'
UNION
SELECT 'C'
UNION
SELECT 'D'
UNION
SELECT 'E'
UNION
SELECT 'F'
) availablegrades
) professorgrades
INNER JOIN professor ON professorgrades.professor_id = professor.professor_id
LEFT JOIN grades ON professorgrades.score = grades.score
LEFT JOIN student ON grades.person_id = student.person_id AND
professorgrades.professor_id = student.professor_id
WHERE professorgrades.professor_id = 1
See a live example of how this works here: SQLFIDDLE
SELECT RS.RiskScoreId, RS.Description, SUM(DT.RiskCount) AS RiskCount
FROM (
SELECT RiskScoreId, 1 AS RiskCount
FROM People
UNION ALL
SELECT RiskScoreId, 0 AS RiskCount
FROM RiskScores
) DT
INNER JOIN RiskScores RS ON RS.RiskScoreId = DT.RiskScoreId
GROUP BY RS.RiskScoreId, RS.Description
ORDER BY RS.RiskScoreId
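The UNION ALL query above pads every risk score with a zero-valued row so that scores nobody holds still produce a group. A minimal sketch of that behaviour, run through Python's sqlite3 module; the PersonId column and all sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RiskScores (RiskScoreId INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE People (PersonId INTEGER PRIMARY KEY, RiskScoreId INTEGER);
    INSERT INTO RiskScores VALUES (1, 'VERY LOW'), (2, 'LOW'), (3, 'MODERATE STABLE');
    -- Hypothetical head count: two people at score 1, nobody at 2, one at 3.
    INSERT INTO People (RiskScoreId) VALUES (1), (1), (3);
""")

counts = conn.execute("""
    SELECT RS.RiskScoreId, RS.Description, SUM(DT.RiskCount) AS RiskCount
    FROM (
        SELECT RiskScoreId, 1 AS RiskCount FROM People      -- one per person
        UNION ALL
        SELECT RiskScoreId, 0 AS RiskCount FROM RiskScores  -- zero row per score
    ) DT
    INNER JOIN RiskScores RS ON RS.RiskScoreId = DT.RiskScoreId
    GROUP BY RS.RiskScoreId, RS.Description
    ORDER BY RS.RiskScoreId
""").fetchall()

for row in counts:
    print(row)
```

Score 2 comes back with a count of 0 instead of being absent, which gives the chart a data point to label.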

T-SQL query one table, get presence or absence of other table value

I'm not sure what this type of query is called so I've been unable to search for it properly. I've got two tables, Table A has about 10,000 rows. Table B has a variable amount of rows.
I want to write a query that gets all of Table A's results but with an added column, the value of that column is a boolean that says whether the result also appears in Table B.
I've written this query which works but is slow, it doesn't use a boolean but rather a count that will be either zero or one. Any suggested improvements are gratefully accepted:
SELECT u.number,u.name,u.deliveryaddress,
(SELECT COUNT(productUserid)
FROM ProductUser
WHERE number = u.number and productid = @ProductId)
AS IsInPromo
FROM Users u
UPDATE
I've run the query with actual execution plan enabled, I'm not sure how to show the results but various costs are:
Nested Loops (left semi join): 29%
Clustered Index scan (User Table): 41%
Clustered Index Scan (ProductUser table): 29%
NUMBERS
There are 7366 users in the users table and currently 18 rows in the productUser table (although this will change and could be in the thousands)
You can use EXISTS to short circuit after the first row is found rather than COUNT-ing all matching rows.
SQL Server does not have a boolean datatype. The closest equivalent is BIT:
SELECT u.number,
u.name,
u.deliveryaddress,
CASE
WHEN EXISTS (SELECT *
FROM ProductUser
WHERE number = u.number
AND productid = @ProductId) THEN CAST(1 AS BIT)
ELSE CAST(0 AS BIT)
END AS IsInPromo
FROM Users u
RE: "I'm not sure what this type of query is called". This will give a plan with a semi join. See Subqueries in CASE Expressions for more about this.
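A quick way to see the EXISTS flag in action is the sketch below, which uses Python's sqlite3 module. SQLite has no BIT type or @variables, so the flag is a plain 0/1 integer and the product id is bound with a ? placeholder; the sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (number INTEGER PRIMARY KEY, name TEXT, deliveryaddress TEXT);
    CREATE TABLE ProductUser (productUserid INTEGER PRIMARY KEY,
                              number INTEGER, productid INTEGER);
    INSERT INTO Users VALUES (1, 'Ann', '1 Main St'), (2, 'Bob', '2 Side St');
    -- Hypothetical data: Ann is linked to product 7 twice, Bob only to product 8.
    INSERT INTO ProductUser (number, productid) VALUES (1, 7), (1, 7), (2, 8);
""")

product_id = 7
flags = conn.execute("""
    SELECT u.number, u.name,
           CASE WHEN EXISTS (SELECT 1
                             FROM ProductUser p
                             WHERE p.number = u.number
                               AND p.productid = ?)
                THEN 1 ELSE 0 END AS IsInPromo
    FROM Users u
    ORDER BY u.number
""", (product_id,)).fetchall()

print(flags)  # [(1, 'Ann', 1), (2, 'Bob', 0)]
```

Every user appears exactly once even though Ann has two matching rows, because EXISTS only tests for the presence of a match and never multiplies the outer rows.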
Which management system are you using?
Try this:
SELECT u.number,u.name,u.deliveryaddress,
case when COUNT(p.productUserid) > 0 then 1 else 0 end
FROM Users u
left join ProductUser p on p.number = u.number and productid = @ProductId
group by u.number,u.name,u.deliveryaddress
Update: this could be faster on SQL Server:
;with fff as
(
select distinct p.number from ProductUser p where p.productid = @ProductId
)
select u.number,u.name,u.deliveryaddress,
case when isnull(f.number, 0) = 0 then 0 else 1 end
from Users u left join fff f on f.number = u.number
Since you seem concerned about performance, this query may perform faster, as it can use an index seek on both tables instead of an index scan. The product filter belongs in the ON clause, because a WHERE condition on p would discard the users that have no match:
SELECT u.number,
u.name,
u.deliveryaddress,
CASE WHEN p.number IS NULL THEN 0 ELSE 1 END AS IsInPromo
FROM Users u
LEFT JOIN ProductUser p ON p.number = u.number
AND p.productid = @ProductId

Aggregate function with Date on Postgres

I'm kind of rusty on my SQL, maybe you can help me out on this query.
I have these two tables for a tickets system (I'm omitting some fields):
table tickets
id - bigint
subject - text
user_id - bigint
closed - boolean
first_message - bigint (foreign key, for next table's id)
last_message - bigint (same as before)
table ticket_messages
creation_date
I need to query the closed tickets, and make an average of the time spent between the first message creation_date and the last message creation_date. This is what I've done so far:
SELECT t.id, t.subject, tm.creation_date
FROM tickets AS t
INNER JOIN ticket_messages AS tm
ON tm.id = t.first_message
OR tm.id = t.last_message
WHERE t.closed = true
I'm looking for some group by or aggregate function to get all the data from the table, and try to calculate the time spent between last and first, also trying to display the dates for the first and last message.
UPDATE I added an inner Join with the second table instead of "OR", now I get both dates, and I can find the sum from my application:
SELECT t.id, t.subject, tm.creation_date, tm2.creation_date
FROM tickets AS t
INNER JOIN ticket_messages AS tm
ON tm.id = t.first_message
INNER JOIN ticket_messages as tm2
ON tm2.id = t.last_message
WHERE t.closed = true
I think that did it...
Something like this should do for getting the number of days elapsed. You might need to put this in a subquery to easily pull out more fields from 'tickets'.
SELECT t.id,AVG(tlast.creation_date - tfirst.creation_date)
FROM tickets AS t
INNER JOIN ticket_messages AS tfirst
ON tfirst.id = t.first_message
INNER JOIN ticket_messages AS tlast
ON tlast.id = t.last_message
WHERE t.closed = true
GROUP BY t.id
Which might lead to (not tested), e.g.:
select t.id,t.subject,sub.nr_days
FROM (
SELECT t.id,AVG(tlast.creation_date - tfirst.creation_date) as nr_days
FROM tickets AS t
INNER JOIN ticket_messages AS tfirst
ON tfirst.id = t.first_message
INNER JOIN ticket_messages AS tlast
ON tlast.id = t.last_message
WHERE t.closed = true
GROUP BY t.id ) AS sub
INNER JOIN tickets AS t
ON sub.id = t.id;
You are trying to combine two queries into one and trying to get the data from three rows of data from two tables. Both need to be fixed.
First of all, you should not attempt to mix aggregate data (such as averages) with the details for single items - you need separate queries for that. You can do it, but the output is repetitious and therefore wasteful (all the single items in a group will have the same aggregate data).
Secondly, you need to find the first message and the last message for a given ticket. Hence, that query is:
SELECT t.id, t.subject, tm1.creation_date as start, tm2.creation_date as end,
tm2.creation_date - tm1.creation_date as close_interval
FROM tickets AS t
INNER JOIN ticket_messages AS tm1 ON t.first_message = tm1.id
INNER JOIN ticket_messages AS tm2 ON t.last_message = tm2.id
WHERE t.closed = true
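This combines three rows of data (one ticket and two messages) into each result row, as required. The computed value should be an interval. PostgreSQL does have an interval type: subtracting one timestamp from another yields an interval, while subtracting two values of type date yields an integer number of days. (In Informix, the type would effectively be INTERVAL DAY(n) for a suitable n, such as 9.)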
You can average those intervals, now. You can't average dates because dates cannot be added together and cannot be divided; averaging involves both summing and dividing. Intervals can be added and divided.
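The same distinction between points in time and intervals can be seen outside the database. The sketch below uses Python's datetime module with two made-up tickets: the timestamps cannot be added, but their differences can be summed and divided.

```python
from datetime import datetime, timedelta

# Hypothetical (first_message, last_message) creation dates for two closed tickets.
tickets = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 11, 0)),  # open for 2 hours
    (datetime(2024, 1, 2, 9, 0), datetime(2024, 1, 2, 13, 0)),  # open for 4 hours
]

# Adding two points in time is undefined, so they cannot be averaged directly.
try:
    tickets[0][0] + tickets[1][0]
    dates_addable = True
except TypeError:
    dates_addable = False

# Subtracting them gives intervals, which can be summed and divided.
intervals = [last - first for first, last in tickets]
average = sum(intervals, timedelta()) / len(intervals)

print(dates_addable)  # False
print(average)        # 3:00:00
```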