Select Query from postgres with count - postgresql

I have two tables, one contains customers, the other one the bookings.
Now i want to see how many bookings come from one person but display it with their name instead of the id.
SELECT booking.id, COUNT(booking.id) AS idcount
FROM booking
GROUP BY booking.id ORDER BY idcount DESC;
The output is (correct count):
id | idcount
----------+--------
2 | 8
1 | 4
My attempt at getting the name displayed instead of the id was:
SELECT customer.lastn, customer.firstn, COUNT(booking.id) AS idcount
FROM booking, customer
GROUP BY customer.lastn, customer.firstn ORDER BY idcount DESC;
The output (wrong count):
lastn | firstn | idcount
----------+---------+--------
Adam | Michael | 13
Jackson | Leo | 13
13 is the total number of bookings (i just cut the output off) so there's that coming from, however i cant make the transition to get the right count with the name.

You need to use a JOIN in your FROM clause:
SELECT customer.lastn, customer.firstn, COUNT(booking.id) AS idcount
FROM booking
JOIN customer ON booking.id = customer.id
GROUP BY customer.lastn, customer.firstn
ORDER BY idcount DESC;
The JOIN here tells how the booking table relates to your customer table.

Related

PostgreSQL: Selecting one address from almost but not exactly duplicate rows

I have a big table that I'm trying to join another table to, however the table has entries such as:
--- Name | Address | Priority
----------------------------------------
1 | Jane Doe | 123 Baker St | 1
2 | Jane Doe | 345 Clay Dr | 2
3 | Jeff Boe | 231 Street St| 1
4 | Karen Al | 4232 Elm St | 1
5 | Karen Al | 5632 Pine Ct | 2
What I really want to select is one single address per person. The correct address I want is priority 2. However some of the addresses don't have a priority 2, so I can't join only on priority 2.
I've tried the following test query:
SELECT DISTINCT n.ID, LastName, FirstName, MAX(Address), MAX(Address2), City, State, PostalCode, n.Phone
FROM NormalTable n
JOIN Contracts cn ON n.ID = cn.ID
Which returns the table that I sketched out above, with the same person/sameID but different addresses.
Is there a way to do this in one query? I can think of maybe doing one INSERT statement into my final table where I do all the priority 2 addresses and then ANOTHER INSERT statement for IDs that aren't in the table yet, and use the priority 1 address for those. But I'd much prefer if there's a way to do this all in one go where I end up with only the address I want.
You could choice the address you need joining a subquery for max priority
select m.LastName, m.FirstName, m.Address, m.Address2, m.City, m.State, m.PostalCode, m.Phone
from my_table m
inner join (
select LastName, FirstName, max(priority) max_priority
from my_table
group by LastName, FirstName
) t on t.LastName = m.LastName
AND t.FirstName = m.FirstName
AND t.max_priority = m.priority
I think you want something like this
SELECT DISTINCT (Name), Address, Priority
ORDER BY Priority DESC
How this works is that the DISTINCT (Name) only returns one row per name. The row returned for each Name is the first row. Which will be the one with the highest priority because of the ORDER BY.

How to get id of the row which was selected by aggregate function? [duplicate]

This question already has answers here:
Select first row in each GROUP BY group?
(20 answers)
Closed 4 years ago.
I have next data:
id | name | amount | datefrom
---------------------------
3 | a | 8 | 2018-01-01
4 | a | 3 | 2018-01-15 10:00
5 | b | 1 | 2018-02-20
I can group result with the next query:
select name, max(amount) from table group by name
But I need the id of selected row too. Thus I have tried:
select max(id), name, max(amount) from table group by name
And as it was expected it returns:
id | name | amount
-----------
4 | a | 8
5 | b | 1
But I need the id to have 3 for the amount of 8:
id | name | amount
-----------
3 | a | 8
5 | b | 1
Is this possible?
PS. This is required for billing task. At some day 2018-01-15 configuration of a was changed and user consumes some resource 10h with the amount of 8 and rests the day 14h -- 3. I need to count such a day by the maximum value. Thus row with id = 4 is just ignored for 2018-01-15 day. (for next day 2018-01-16 I will bill the amount of 3)
So I take for billing the row:
3 | a | 8 | 2018-01-01
And if something is wrong with it. I must report that row with id == 3 is wrong.
But when I used aggregation function the information about id is lost.
Would be awesome if this is possible:
select current(id), name, max(amount) from table group by name
select aggregated_row(id), name, max(amount) from table group by name
Here agg_row refer to the row which was selected by aggregation function max
UPD
I resolve the task as:
SELECT
(
SELECT id FROM t2
WHERE id = ANY ( ARRAY_AGG( tf.id ) ) AND amount = MAX( tf.amount )
) id,
name,
MAX(amount) ma,
SUM( ratio )
FROM t2 tf
GROUP BY name
UPD
It would be much better to use window functions
There are at least 3 ways, see below:
CREATE TEMP TABLE test (
id integer, name text, amount numeric, datefrom timestamptz
);
COPY test FROM STDIN (FORMAT csv);
3,a,8,2018-01-01
4,a,3,2018-01-15 10:00
5,b,1,2018-02-20
6,b,1,2019-01-01
\.
Method 1. using DISTINCT ON (PostgreSQL-specific)
SELECT DISTINCT ON (name)
id, name, amount
FROM test
ORDER BY name, amount DESC, datefrom ASC;
Method 2. using window functions
SELECT id, name, amount FROM (
SELECT *, row_number() OVER (
PARTITION BY name
ORDER BY amount DESC, datefrom ASC) AS __rn
FROM test) AS x
WHERE x.__rn = 1;
Method 3. using corelated subquery
SELECT id, name, amount FROM test
WHERE id = (
SELECT id FROM test AS t2
WHERE t2.name = test.name
ORDER BY amount DESC, datefrom ASC
LIMIT 1
);
demo: db<>fiddle
You need DISTINCT ON which filters the first row per group.
SELECT DISTINCT ON (name)
*
FROM table
ORDER BY name, amount DESC
You need a nested inner join. Try this -
SELECT id, T2.name, T2.amount
FROM TABLE T
INNER JOIN (SELECT name, MAX(amount) amount
FROM TABLE
GROUP BY name) T2
ON T.amount = T2.amount

Sum with different condition for every line

In my Postgresql 9.3 database I have a table stock_rotation:
+----+-----------------+---------------------+------------+---------------------+
| id | quantity_change | stock_rotation_type | article_id | date |
+----+-----------------+---------------------+------------+---------------------+
| 1 | 10 | PURCHASE | 1 | 2010-01-01 15:35:01 |
| 2 | -4 | SALE | 1 | 2010-05-06 08:46:02 |
| 3 | 5 | INVENTORY | 1 | 2010-12-20 08:20:35 |
| 4 | 2 | PURCHASE | 1 | 2011-02-05 16:45:50 |
| 5 | -1 | SALE | 1 | 2011-03-01 16:42:53 |
+----+-----------------+---------------------+------------+---------------------+
Types:
SALE has negative quantity_change
PURCHASE has positive quantity_change
INVENTORY resets the actual number in stock to the given value
In this implementation, to get the current value that an article has in stock, you need to sum up all quantity changes since the latest INVENTORY for the specific article (including the inventory value). I do not know why it is implemented this way and unfortunately it would be quite hard to change this now.
My question now is how to do this for more than a single article at once.
My latest attempt was this:
WITH latest_inventory_of_article as (
SELECT MAX(date)
FROM stock_rotation
WHERE stock_rotation_type = 'INVENTORY'
)
SELECT a.id, sum(quantity_change)
FROM stock_rotation sr
INNER JOIN article a ON a.id = sr.article_id
WHERE sr.date >= (COALESCE(
(SELECT date FROM latest_inventory_of_article),
'1970-01-01'
))
GROUP BY a.id
But the date for the latest stock_rotation of type INVENTORY can be different for every article.
I was trying to avoid looping over multiple article ids to find this date.
In this case I would use a different internal query to get the max inventory per article. You are effectively using stock_rotation twice but it should work. If it's too big of a table you can try something else:
SELECT sr.article_id, sum(quantity_change)
FROM stock_rotation sr
LEFT JOIN (
SELECT article_id, MAX(date) AS date
FROM stock_rotation
WHERE stock_rotation_type = 'INVENTORY'
GROUP BY article_id) AS latest_inventory
ON latest_inventory.article_id = sr.article_id
WHERE sr.date >= COALESCE(latest_inventory.date, '1970-01-01')
GROUP BY sr.article_id
You can use DISTINCT ON together with ORDER BY to get the latest INVENTORY row for each article_id in the WITH clause.
Then you can join that with the original table to get all later rows and add the values:
WITH latest_inventory as (
SELECT DISTINCT ON (article_id) id, article_id, date
FROM stock_rotation
WHERE stock_rotation_type = 'INVENTORY'
ORDER BY article_id, date DESC
)
SELECT article_id, sum(sr.quantity_change)
FROM stock_rotation sr
JOIN latest_inventory li USING (article_id)
WHERE sr.date >= li.date
GROUP BY article_id;
Here is my take on it: First, build the list of products at their last inventory state, using a window function. Then, join it back to the entire list, filtering on operations later than the inventory date for the item.
with initial_inventory as
(
select article_id, date, quantity_change from
(select article_id, date, quantity_change, rank() over (partition by article_id order by date desc)
from stockRotation
where type = 'INVENTORY'
) a
where rank = 1
)
select ii.article_id, ii.quantity_change + sum(sr.quantity_change)
from initial_inventory ii
join stockRotation sr on ii.article_id = sr.article_id and sr.date > ii.date
group by ii.article_id, ii.quantity_change

How to use COUNT() in more that one column?

Let's say I have this 3 tables
Countries ProvOrStates MajorCities
-----+------------- -----+----------- -----+-------------
Id | CountryName Id | CId | Name Id | POSId | Name
-----+------------- -----+----------- -----+-------------
1 | USA 1 | 1 | NY 1 | 1 | NYC
How do you get something like
---------------------------------------------
CountryName | ProvinceOrState | MajorCities
| (Count) | (Count)
---------------------------------------------
USA | 50 | 200
---------------------------------------------
Canada | 10 | 57
So far, the way I see it:
Run the first SELECT COUNT (GROUP BY Countries.Id) on Countries JOIN ProvOrStates,
store the result in a table variable,
Run the second SELECT COUNT (GROUP BY Countries.Id) on ProvOrStates JOIN MajorCities,
Update the table variable based on the Countries.Id
Join the table variable with Countries table ON Countries.Id = Id of the table variable.
Is there a possibility to run just one query instead of multiple intermediary queries? I don't know if it's even feasible as I've tried with no luck.
Thanks for helping
Use sub query or derived tables and views
Basically If You You Have 3 Tables
select * from [TableOne] as T1
join
(
select T2.Column, T3.Column
from [TableTwo] as T2
join [TableThree] as T3
on T2.CondtionColumn = T3.CondtionColumn
) AS DerivedTable
on T1.DepName = DerivedTable.DepName
And when you are 100% percent sure it's working you can create a view that contains your three tables join and call it when ever you want
PS: in case of any identical column names or when you get this message
"The column 'ColumnName' was specified multiple times for 'Table'. "
You can use alias to solve this problem
This answer comes from #lotzInSpace.
SELECT ct.[CountryName], COUNT(DISTINCT p.[Id]), COUNT(DISTINCT c.[Id])
FROM dbo.[Countries] ct
LEFT JOIN dbo.[Provinces] p
ON ct.[Id] = p.[CountryId]
LEFT JOIN dbo.[Cities] c
ON p.[Id] = c.[ProvinceId]
GROUP BY ct.[CountryName]
It's working. I'm using LEFT JOIN instead of INNER JOIN because, if a country doesn't have provinces, or a province doesn't have cities, then that country or province doesn't display.
Thanks again #lotzInSpace.

how to get employee details along with deptname based on maximum salary group by deptname

I have two tables Employee and Dept as below
EIN | ENAME | Salary | DeptID
1 | Ravi | 500 | 10
2 | Krishna | 1000 | 20
3 | Kiran | 1500 | 20
DeptID | DeptName
10 | IT
20 | Finance
I want the output as employees along with Deptname who is getting maximum salary group by dept
Here is an option which does not use analytic functions:
SELECT e.ENAME,
COALESCE(d.DeptName, 'NA') AS DeptName,
t.max_salary
FROM Employee e
LEFT JOIN Dept d
ON e.DeptID = d.DeptID
INNER JOIN
(
SELECT DeptID, MAX(SALARY) AS max_salary
FROM Employee
GROUP BY DeptId
) t
ON e.DeptID = t.DeptID AND
e.Salary = t.max_salary
Note that in the event that more than one employee should have the maximum salary in a given department, this query would return all ties. In the absence of further information, this seems reasonable without being able to distinguish one employee from another.
Update:
You could also use a WHERE clause containing a correlated subquery instead of joining to a (non-correlated) subquery as I gave above. Here is what that would look like:
SELECT e.ENAME,
COALESCE(d.DeptName, 'NA') AS DeptName,
e.Salary
FROM Employee e
LEFT JOIN Dept d
ON e.DeptID = d.DeptID
WHERE e.Salary = (SELECT MAX(t.Salary) FROM Employee t WHERE t.DeptID = e.DeptID)
However, I would recommend using the first option as it would probably perform better than the correlated subquery option.