How to select vehicle counts group by region using postgres - postgresql

I am new to Postgres.
My Postgres table is named Vehicle and has the following columns:
1. ID
2. name
3. wheel (2,3,4,6,8) // two-wheelers, four-wheelers, etc.
4. region ('hyderabad','mumbai','delhi',...)
5. polluted ('yes','no')
How do I select the count of polluted 4-wheeler vehicles, grouped by region?
Expected Output
hyderabad -> 4
mumbai -> 3
delhi -> 8,...

Ideally you should have a regions table somewhere which contains all regions. Assuming this, you could write the following query:
SELECT
r.region,
COALESCE(v.cnt, 0) AS count
FROM regions r
LEFT JOIN
(
SELECT region, COUNT(*) cnt
FROM Vehicle
WHERE wheel = 4 AND polluted = 'yes'
GROUP BY region
) v
ON r.region = v.region;
If you only have a Vehicle table, which is bad database design, then we can try the following query:
SELECT
region,
SUM(CASE WHEN wheel = 4 AND polluted = 'yes' THEN 1 ELSE 0 END) AS count
FROM Vehicle
GROUP BY region;
This is less efficient, since it aggregates over every row instead of only the matching ones, but at least it lets you report every region that appears in Vehicle, even if it has no matching records.
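To see both variants side by side, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Postgres. The sample rows and the `vehicle_count` alias are made up for illustration, and the `regions` table is the one the answer assumes exists.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE regions (region TEXT PRIMARY KEY);
    CREATE TABLE Vehicle (id INTEGER PRIMARY KEY, name TEXT, wheel INTEGER,
                          region TEXT, polluted TEXT);
    INSERT INTO regions VALUES ('hyderabad'), ('mumbai'), ('delhi');
    -- Hypothetical sample data
    INSERT INTO Vehicle (name, wheel, region, polluted) VALUES
        ('car a',  4, 'hyderabad', 'yes'),
        ('car b',  4, 'hyderabad', 'yes'),
        ('bike a', 2, 'hyderabad', 'yes'),
        ('car c',  4, 'mumbai',    'yes'),
        ('car d',  4, 'mumbai',    'no'),
        ('bike b', 2, 'delhi',     'yes');
""")

# Variant 1: regions table left-joined to the pre-filtered counts.
join_counts = dict(conn.execute("""
    SELECT r.region, COALESCE(v.cnt, 0) AS vehicle_count
    FROM regions r
    LEFT JOIN (
        SELECT region, COUNT(*) AS cnt
        FROM Vehicle
        WHERE wheel = 4 AND polluted = 'yes'
        GROUP BY region
    ) v ON r.region = v.region
""").fetchall())

# Variant 2: conditional aggregation over Vehicle alone.
case_counts = dict(conn.execute("""
    SELECT region,
           SUM(CASE WHEN wheel = 4 AND polluted = 'yes' THEN 1 ELSE 0 END) AS vehicle_count
    FROM Vehicle
    GROUP BY region
""").fetchall())

print(join_counts)
print(case_counts)
```

With this data both queries report delhi with a count of 0, because delhi has a row in Vehicle. A region with no vehicles at all would only show up in the first variant.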

Related

How to use 'Distinct' for just one column?

I have a query checking the visits from some "locations" table I have. If the user signed up with a referral of "emp" or "oth", their first visit shouldn't count but the second visit and forward should count.
I'm trying to get a count of those "first visits" per location. Whenever they do a visit, I get a record on which location it was.
The problem is that my query is counting correctly, but some users have visits at different locations. So instead of counting just one visit for that location (the first one), it is adding one per location where a user has made a visit.
This is my query
SELECT COUNT(DISTINCT CASE WHEN customer.ref IN ('emp', 'oth') THEN customer.id END) as visit_count, locations.name as location FROM locations
LEFT JOIN visits ON locations.location_name = visits.location_visit_name
LEFT JOIN customer ON customer.id = visits.customer_id
WHERE locations.active = true
GROUP BY locations.location_name, locations.id;
The results I'm getting are
visit_count | locations
-------------------------
7 | Loc 1
3 | Loc 2
1 | Loc 3
How it should be:
visit_count | locations
-------------------------
6 | Loc 1
2 | Loc 2
1 | Loc 3
Because 2 of these people have visits at both locations, it's counting one for each location. I think the DISTINCT is also being applied per location, when it should only apply to the count of customer.id.
Is there a way I can add something to my query to just grab the location for the first visit, without caring they have done other visits on other locations?
If I followed you correctly, you want to count only the first visit of each customer, spread by location.
One solution would be to use a correlated subquery in the on clause of the relevant join to filter on first customer visits. Assuming that column visits(visit_date) stores the date of each visit, you could do:
select
    count(c.id) visit_count,
    l.name as location
from locations l
left join visits v
    on l.location_name = v.location_visit_name
    and v.visit_date = (
        -- keep only each customer's earliest visit
        select min(v1.visit_date)
        from visits v1
        where v1.customer_id = v.customer_id
    )
left join customer c
    on c.id = v.customer_id
    and c.ref in ('emp', 'oth')
where l.active = true
group by l.location_name, l.id;
Side notes:
properly filtering on the first visit per customer avoids the need for distinct in the count() aggregate function
table aliases make the query more concise and easier to understand; I recommend using them in all queries
the filter on customer(ref) is better placed in the on clause of the join than as a conditional count criterion
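The first-visit filter can be tried out with a small sketch like the one below. It uses Python's sqlite3 module in place of Postgres, and the sample customers, visits, and the visit_date column are assumptions made up for illustration (the question does not show the actual schema).

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (id INTEGER PRIMARY KEY, location_name TEXT,
                            name TEXT, active INTEGER);
    CREATE TABLE customer (id INTEGER PRIMARY KEY, ref TEXT);
    CREATE TABLE visits (customer_id INTEGER, location_visit_name TEXT,
                         visit_date TEXT);
    -- Hypothetical sample data
    INSERT INTO locations VALUES (1, 'loc1', 'Loc 1', 1),
                                 (2, 'loc2', 'Loc 2', 1),
                                 (3, 'loc3', 'Loc 3', 1);
    INSERT INTO customer VALUES (1, 'emp'), (2, 'oth'), (3, 'web');
    INSERT INTO visits VALUES (1, 'loc1', '2020-01-01'),  -- customer 1, first visit
                              (1, 'loc2', '2020-02-01'),  -- customer 1, later visit
                              (2, 'loc2', '2020-01-05'),  -- customer 2, first visit
                              (3, 'loc1', '2020-01-03');  -- customer 3, ref not counted
""")

rows = conn.execute("""
    SELECT COUNT(c.id) AS visit_count, l.name AS location
    FROM locations l
    LEFT JOIN visits v
        ON l.location_name = v.location_visit_name
        AND v.visit_date = (
            -- earliest visit of the same customer
            SELECT MIN(v1.visit_date)
            FROM visits v1
            WHERE v1.customer_id = v.customer_id
        )
    LEFT JOIN customer c
        ON c.id = v.customer_id
        AND c.ref IN ('emp', 'oth')
    WHERE l.active = 1
    GROUP BY l.location_name, l.id
    ORDER BY l.id
""").fetchall()

print(rows)
```

Customer 1 visited both locations but is counted only at Loc 1, where the first visit happened. Loc 3 still appears with a count of 0 because both joins are left joins.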
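Try moving the when condition into the where clause (note that filtering on customer in the where clause drops locations that have no matching customers):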
SELECT COUNT( distinct customer.id) as visit_count
, locations.name as location
FROM locations
LEFT JOIN visits ON locations.location_name = visits.location_visit_name
LEFT JOIN customer ON customer.id = visits.customer_id
WHERE locations.active = true
AND customer.ref IN ('emp', 'oth')
GROUP BY locations.location_name;

SSRS 2005 column chart: show series label missing when data count is zero

I have a pretty simple chart with a likely common issue. I've searched for several hours on the interweb but only get so far in finding a similar situation.
The data I'm pulling contains a created_by, a person_id and a risk score.
the risk score can be:
1 VERY LOW
2 LOW
3 MODERATE STABLE
4 MODERATE AT RISK
5 HIGH
6 VERY HIGH
I want to get a headcount of persons at each risk score and display a risk count even if there is a count of 0 for that risk score but SSRS 2005 likes to suppress zero counts.
I've tried this in the point labels
=IIF(IsNothing(count(Fields!person_id.value)),0,count(Fields!person_id.value))
Ex: I'm missing values for "1 LOW" as the creator does not have any "1 LOW" they've assigned risk scores for.
Here's a screenshot of what I get, but I'd like to have a column even for a risk score that doesn't exist in the returned results.
@Nathan
Example scenario:
select professor.name, grades.score, student.person_id
from student
inner join grades on student.person_id = grades.person_id
inner join professor on student.professor_id = professor.professor_id
where
student.professor_id = @professor
Not all students are necessarily in the grades table.
I have a =Count(Fields!person_id.Value) for my data points & series is grouped on =Fields!score.Value
If there were a bunch of A, B and D grades but no C's or F's, how would I show labels for potentially non-existent counts?
In your example, the problem is that no results are returned for grades that are not linked to any students. To solve this ideally there would be a table in your source system which listed all the possible values of "score" (e.g. A - F) and you would join this into your query such that at least one row was returned for each possible value.
If such a table doesn't exist and the possible score values are known and static, then you could manually create a list of them in your query. In the example below I create a subquery that returns a combination of all professors and all possible scores (A - F) and then LEFT join this to the grades and students tables (left join means that the professor/score rows will be returned even if no students have those scores in the "grades" table).
SELECT
professor.name
, professorgrades.score
, student.person_id
FROM
(
SELECT professor_id, score
FROM professor
CROSS JOIN
(
SELECT 'A' AS score
UNION
SELECT 'B'
UNION
SELECT 'C'
UNION
SELECT 'D'
UNION
SELECT 'E'
UNION
SELECT 'F'
) availablegrades
) professorgrades
INNER JOIN professor ON professorgrades.professor_id = professor.professor_id
LEFT JOIN grades ON professorgrades.score = grades.score
LEFT JOIN student ON grades.person_id = student.person_id AND
professorgrades.professor_id = student.professor_id
WHERE professorgrades.professor_id = 1
See a live example of how this works here: SQLFIDDLE
SELECT RS.RiskScoreId, RS.Description, SUM(DT.RiskCount) AS RiskCount
FROM (
SELECT RiskScoreId, 1 AS RiskCount
FROM People
UNION ALL
SELECT RiskScoreId, 0 AS RiskCount
FROM RiskScores
) DT
INNER JOIN RiskScores RS ON RS.RiskScoreId = DT.RiskScoreId
GROUP BY RS.RiskScoreId, RS.Description
ORDER BY RS.RiskScoreId
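The last query pads each risk score with a zero-valued row so that SUM returns 0 instead of the group going missing. A minimal sketch of that behaviour, using Python's sqlite3 module instead of SQL Server and a made-up People table (only the RiskScoreId column matters here):

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RiskScores (RiskScoreId INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE People (person_id INTEGER PRIMARY KEY, RiskScoreId INTEGER);
    INSERT INTO RiskScores VALUES (1, 'VERY LOW'), (2, 'LOW'), (3, 'MODERATE STABLE');
    -- Hypothetical sample data: nobody has risk score 2
    INSERT INTO People VALUES (10, 1), (11, 1), (12, 3);
""")

risk_counts = conn.execute("""
    SELECT RS.RiskScoreId, RS.Description, SUM(DT.RiskCount) AS RiskCount
    FROM (
        SELECT RiskScoreId, 1 AS RiskCount FROM People      -- one per person
        UNION ALL
        SELECT RiskScoreId, 0 AS RiskCount FROM RiskScores  -- zero-valued padding row
    ) DT
    INNER JOIN RiskScores RS ON RS.RiskScoreId = DT.RiskScoreId
    GROUP BY RS.RiskScoreId, RS.Description
    ORDER BY RS.RiskScoreId
""").fetchall()

print(risk_counts)
```

Because every score contributes at least its padding row, the chart dataset gets a row for 'LOW' with a count of 0.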

Restricting duplicate results in grouped result set without using distinct

I am attempting to create a query that returns a list of specific entity records without returning any duplicated entries from the entityID field. The query cannot use DISTINCT because the list is being passed to a reporting engine that doesn't understand result sets containing more than the entityID, and DISTINCT requires all the ORDER BY fields to be returned.
The result set cannot contain duplicate entityIDs because the reporting engine also cannot process a report for the same entity twice in the same run. I have found out the hard way that temporary tables aren't supported as well.
The entries need to be sorted in the query because the report engine only allows sorting on the entity_header level, and I need to sort based on the report.status. Thankfully the report engine honors the order in which you return the results.
The tables are as follows:
entity_header
=================================================
entityID(pk) Location active name
1 LOCATION1 0 name1
2 LOCATION1 0 name2
3 LOCATION2 0 name3
4 LOCATION3 0 name4
5 LOCATION2 1 name5
6 LOCATION2 0 name6
report
========================================================
startdate entityID(fk) status reportID(pk)
03-10-2013 1 running 1
03-12-2013 2 running 2
03-10-2013 1 stopped 3
03-10-2013 3 stopped 4
03-12-2013 4 running 5
03-10-2013 5 stopped 6
03-12-2013 6 running 7
Here is the query I've got so far, and it is almost what I need:
SELECT eh.entityID
FROM entity_header eh
INNER JOIN report r on r.entityID = eh.entityID
WHERE r.startdate between getdate()-7.5 and getdate()
AND eh.active = 0
AND eh.location in ('LOCATION1','LOCATION2')
AND r.status is not null
AND eh.name is not null
GROUP BY eh.entityID, r.status, eh.name
ORDER BY r.status, eh.name;
I would appreciate any advice this community can offer. I will do my best to provide any additional information required.
Here is a working sample that runs on MS SQL Server only.
I am using rank() to number each entityID's rows in the order given by r.status and eh.name; the number is returned as list.
Filtering with where a.list = 1 keeps only the first row for each entityID, so no entityID appears twice.
ORDER BY a.ut, a.en sorts the results; ut and en are the aliases for r.status and eh.name.
SELECT a.entityID FROM (
SELECT distinct TOP (100) PERCENT eh.entityID,
rank() over(PARTITION BY eh.entityID ORDER BY r.status, eh.name) as list,
r.status ut, eh.name en
FROM report AS r INNER JOIN entity_header as eh ON r.entityID = eh.entityID
WHERE (r.startdate BETWEEN GETDATE() - 7.5 AND GETDATE()) AND (eh.active = 0)
AND (eh.location IN ('LOCATION1', 'LOCATION2'))
ORDER BY r.status, eh.name
) AS a
where a.list = 1
ORDER BY a.ut, a.en
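A rough way to check the ranking approach against the sample tables from the question is sketched below with Python's sqlite3 module. This assumes an SQLite build with window-function support (3.25 or later), leaves out the TOP clause, and drops the GETDATE() date filter so the result does not depend on the current date.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity_header (entityID INTEGER PRIMARY KEY, location TEXT,
                                active INTEGER, name TEXT);
    CREATE TABLE report (startdate TEXT, entityID INTEGER, status TEXT,
                         reportID INTEGER PRIMARY KEY);
    INSERT INTO entity_header VALUES
        (1, 'LOCATION1', 0, 'name1'), (2, 'LOCATION1', 0, 'name2'),
        (3, 'LOCATION2', 0, 'name3'), (4, 'LOCATION3', 0, 'name4'),
        (5, 'LOCATION2', 1, 'name5'), (6, 'LOCATION2', 0, 'name6');
    INSERT INTO report VALUES
        ('2013-03-10', 1, 'running', 1), ('2013-03-12', 2, 'running', 2),
        ('2013-03-10', 1, 'stopped', 3), ('2013-03-10', 3, 'stopped', 4),
        ('2013-03-12', 4, 'running', 5), ('2013-03-10', 5, 'stopped', 6),
        ('2013-03-12', 6, 'running', 7);
""")

entity_ids = [row[0] for row in conn.execute("""
    SELECT a.entityID
    FROM (
        SELECT DISTINCT eh.entityID,
               -- rank each entity's rows by status, then name
               RANK() OVER (PARTITION BY eh.entityID
                            ORDER BY r.status, eh.name) AS list,
               r.status AS ut, eh.name AS en
        FROM report AS r
        INNER JOIN entity_header AS eh ON r.entityID = eh.entityID
        WHERE eh.active = 0
          AND eh.location IN ('LOCATION1', 'LOCATION2')
    ) AS a
    WHERE a.list = 1          -- keep one row per entityID
    ORDER BY a.ut, a.en
""")]

print(entity_ids)
```

Entity 1 has both a running and a stopped report, but only its first-ranked row (running) survives the list = 1 filter, so it appears once and sorts with the other running entities.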

T-SQL: Conditional join or convoluted WHERE clause?

I have a table called MapObjects which is used to store information about objects placed on a map. I have another table called OrgLocations which is used to store all the locations where an organisation is located. Locations are defined with a latitude and longitude. Finally, I have another table called ObjectLocations which maps a map object to an organisation in the OrgLocations table. It is used to indicate a subset of the locations for an object that is shown on a map.
As an example, suppose an organisation (OrgID = 10) has 4 locations (stored in the OrgLocations table): Dallas, Atlanta, Miami, New York.
The organisation has 1 map object associated with Atlanta and Miami (MapObjects.ID = 5).
My dataset must return the records from OrgLocations that correspond with Atlanta and Miami (but not include Dallas or New York) . However, I can also have a map object that is not assigned to any location (no record in ObjectLocations). These map objects still belong to an organisation but are not associated with any specific location. In this case I want to return all the locations assigned to the organisation.
I am not sure if this is done through a conditional join or something in the WHERE clause. Here is what the tables would look like with some data:
OrgLocations
ID OrgID Latitude Longitude Name
0 10 32.780 -96.798 Dallas
1 10 33.7497 -84.394 Atlanta
2 10 25.7863 -80.2270 Miami
3 10 40.712 -74.005 New York
4 11 42.348 -83.071 Detroit
ObjectLocations
OrgLocationID MapObjectID
1 5
2 5
MapObjects
ID OrgID
5 10
6 11
In this example, when MapObjects.ID is 5, 2 locations for this object exist in ObjectLocations: Atlanta and Miami. When MapObjects.ID is 6, there is no record in ObjectLocations so all the locations in OrgLocations that belong to the organisation (OrgID = 11) are returned.
Thanks for any help!
I guess you will have the cleanest queries if you check for the existence of MapObjectID in ObjectLocations to decide what query to use.
Something like this:
declare @MapObjectID int
set @MapObjectID = 5
if exists(select *
          from ObjectLocations
          where MapObjectID = @MapObjectID)
begin
    select *
    from OrgLocations
    where ID in (select OrgLocationID
                 from ObjectLocations
                 where MapObjectID = @MapObjectID)
end
else
begin
    select *
    from OrgLocations
    where OrgID in (select OrgID
                    from MapObjects
                    where ID = @MapObjectID)
end
As a single query:
select OL.*
from OrgLocations as OL
inner join ObjectLocations as OLoc
    on OL.ID = OLoc.OrgLocationID
where OLoc.MapObjectID = @MapObjectID
union all
select OL.*
from OrgLocations as OL
inner join MapObjects as MO
    on OL.OrgID = MO.OrgID
where MO.ID = @MapObjectID and
    not exists (select *
                from ObjectLocations
                where MapObjectID = @MapObjectID)
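The single-query version can be checked against the sample data from the question with a sketch like this one. It uses Python's sqlite3 module in place of SQL Server; `locations_for` is a hypothetical helper name, and the `:id` named parameter plays the role of the T-SQL variable.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OrgLocations (ID INTEGER PRIMARY KEY, OrgID INTEGER,
                               Latitude REAL, Longitude REAL, Name TEXT);
    CREATE TABLE ObjectLocations (OrgLocationID INTEGER, MapObjectID INTEGER);
    CREATE TABLE MapObjects (ID INTEGER PRIMARY KEY, OrgID INTEGER);
    INSERT INTO OrgLocations VALUES
        (0, 10, 32.780,  -96.798,  'Dallas'),
        (1, 10, 33.7497, -84.394,  'Atlanta'),
        (2, 10, 25.7863, -80.2270, 'Miami'),
        (3, 10, 40.712,  -74.005,  'New York'),
        (4, 11, 42.348,  -83.071,  'Detroit');
    INSERT INTO ObjectLocations VALUES (1, 5), (2, 5);
    INSERT INTO MapObjects VALUES (5, 10), (6, 11);
""")

QUERY = """
    -- Branch 1: locations explicitly assigned to the map object
    SELECT OL.Name
    FROM OrgLocations AS OL
    INNER JOIN ObjectLocations AS OLoc ON OL.ID = OLoc.OrgLocationID
    WHERE OLoc.MapObjectID = :id
    UNION ALL
    -- Branch 2: all of the organisation's locations, only when nothing is assigned
    SELECT OL.Name
    FROM OrgLocations AS OL
    INNER JOIN MapObjects AS MO ON OL.OrgID = MO.OrgID
    WHERE MO.ID = :id
      AND NOT EXISTS (SELECT 1 FROM ObjectLocations WHERE MapObjectID = :id)
"""

def locations_for(map_object_id):
    """Return the sorted location names for one map object (hypothetical helper)."""
    rows = conn.execute(QUERY, {"id": map_object_id}).fetchall()
    return sorted(name for (name,) in rows)

print(locations_for(5))
print(locations_for(6))
```

At most one branch of the union returns rows for a given map object, which is why UNION ALL is safe here and no duplicates can appear.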

Removing rows with duplicate secondary values

This one is probably a softball question for any DBA, but here's my challenge. I have a table that looks like this:
id parent_id active
--- --------- -------
1 5 y
2 6 y
3 6 y
4 6 y
5 7 y
6 8 y
The way the system I am working on operates, it should only have one active row per parent. Thus, it'd be ok if ID #2 and #3 were active = 'n'.
I need to run a query that finds all active rows that share a parent_id and flips all but the highest ID to active = 'n'.
Can this be done in a single query, or do I have to write a script for it? (Using Postgresql, btw)
ANSI style:
update table set
active = 'n'
where
id <> (select max(id) from table t1 where t1.parent_id = table.parent_id)
Postgres specific:
update table as t1 set
    active = 'n'
from
    -- highest id per parent; "table" is a placeholder for your table name
    (select max(id) as topId, parent_id from table group by parent_id) t2
where
    t1.id < t2.topId
    and t1.parent_id = t2.parent_id
The second one is probably a bit faster, since it's not doing a correlated subquery for each row. Enjoy!
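As a quick check of the correlated-subquery form, here is a minimal sketch that runs it on the sample rows from the question. It uses Python's sqlite3 module rather than Postgres, and `items` is a hypothetical table name, since the answer uses `table` as a placeholder.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, parent_id INTEGER, active TEXT);
    INSERT INTO items VALUES (1, 5, 'y'), (2, 6, 'y'), (3, 6, 'y'),
                             (4, 6, 'y'), (5, 7, 'y'), (6, 8, 'y');
""")

# Deactivate every row that is not the highest id for its parent_id.
conn.execute("""
    UPDATE items
    SET active = 'n'
    WHERE id <> (SELECT MAX(t1.id)
                 FROM items AS t1
                 WHERE t1.parent_id = items.parent_id)
""")

active_by_id = dict(conn.execute("SELECT id, active FROM items ORDER BY id"))
print(active_by_id)
```

Only ids 2 and 3 are flipped, because id 4 is the highest row for parent 6 and every other parent has a single row.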