I'm trying to calculate a metric with data coming from two independent tables:
SELECT COUNT(DISTINCT user_id)
FROM users
--gives the count of users
SELECT SUM(costs)
FROM costs
--gives the total costs
How do I divide total costs/users?
I did it like this:
WITH costs
AS (SELECT Sum(costs) AS budget,
1 AS id
FROM costs),
budget
AS (SELECT COUNT(DISTINCT user_id) AS users_count,
1 AS id
FROM users)
SELECT budget / users_count AS Result
FROM costs
JOIN budget
ON costs.id = budget.id
That is ugly, but it does the job.
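A shorter alternative, offered as a sketch rather than a drop-in replacement: each aggregate returns exactly one row, so the two can be divided directly as scalar subqueries, with no dummy id and no join. The sample_users and sample_costs CTEs below are made-up stand-in data so the query runs on its own. The cast to numeric and the NULLIF guard are additions that assume you want a fractional result and that the users table might be empty.

```sql
-- Hypothetical sample data standing in for the real users and costs tables.
WITH sample_users (user_id) AS (
    VALUES (1), (2), (2), (3)
),
sample_costs (costs) AS (
    VALUES (100), (50), (25)
)
SELECT
    -- cast to numeric so the division is not truncated to an integer
    (SELECT SUM(costs) FROM sample_costs)::numeric
    -- NULLIF turns a zero user count into NULL instead of a division-by-zero error
    / NULLIF((SELECT COUNT(DISTINCT user_id) FROM sample_users), 0) AS cost_per_user;
```

With this sample data the total is 175 and there are 3 distinct users, so the result is about 58.33; without the cast, integer columns would give 58.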
I want to calculate the ratio of some features in PostgreSQL, but using data from 2 different tables.
From table1 I want to query the count of all rows:
select count(*) as total from table1
and then perform a calculation with the total calculated above, like this:
select count(class) / total as ratio
from table2
So the total comes from table1 and count(class) comes from table2.
How can I do this when there are no common fields to join them on?
Use subqueries:
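select
(select count(class) from table2)::numeric /
(select count(*) from table1) as ratio
The cast to numeric avoids integer division, which would otherwise truncate the ratio to a whole number. Depending on what you need, count(class) may have to be count(distinct class); if class is never NULL, count(class) can simply be count(*).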
I have a table and want to calculate the percentage of total by store_id which each (category_id, store_id) subtotal represents. My code is below:
WITH
example_table (name, store_id)
AS
(
select name, store_id
from category
join film_category using (category_id)
join film using (film_id)
join inventory using (film_id)
join rental using (inventory_id)
)
SELECT name, store_id, cast(count(*) as numeric)/(SELECT count(*) FROM example_table)
FROM example_table
GROUP BY name, store_id
ORDER BY name, store_id
This code runs without error, but it doesn't give the results I'm looking for. Here each subtotal is divided by the grand total across both stores and all 16 names. Instead, I want each subtotal divided by its respective store total, or by its respective name total.
I'm wondering how to perform calculations on those subtotals in general.
Thanks in advance,
I believe you need to explore the possibilities of using aggregate functions combined with an OVER(PARTITION BY ...) e.g.
SELECT DISTINCT
name, store_id, store_id_count, name_count
FROM (
select name, store_id
, count(*) over(partition by store_id) as store_id_count
, count(*) over(partition by name) as name_count
from category
join film_category using (category_id)
join film using (film_id)
join inventory using (film_id)
join rental using (inventory_id)
) AS example_table
When you use an aggregate function with an OVER clause, the count appears on every row of the result, which seems to be what you need here. Note that SELECT DISTINCT is used simply to reduce the number of rows returned; you might still need a GROUP BY, but I am not sure that you do.
Once you have the needed values in the derived table (aliased as example_table), it should be a simple matter of some arithmetic in the outer select clause.
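To make that arithmetic concrete, here is one possible way to get each (name, store_id) subtotal as a share of its store total and of its name total. This is a sketch, not the exact query from the answer above: it combines GROUP BY with a window function over the grouped counts, and the rows in example_table are invented sample data in place of the real joins.

```sql
-- Hypothetical sample rows standing in for the joined rental data.
WITH example_table (name, store_id) AS (
    VALUES ('Action', 1), ('Action', 1), ('Comedy', 1),
           ('Action', 2), ('Comedy', 2), ('Comedy', 2), ('Comedy', 2)
)
SELECT name,
       store_id,
       COUNT(*) AS subtotal,
       -- subtotal divided by the total for the same store
       COUNT(*)::numeric / SUM(COUNT(*)) OVER (PARTITION BY store_id) AS share_of_store,
       -- subtotal divided by the total for the same name
       COUNT(*)::numeric / SUM(COUNT(*)) OVER (PARTITION BY name) AS share_of_name
FROM example_table
GROUP BY name, store_id
ORDER BY name, store_id;
```

The window functions run after the GROUP BY, so SUM(COUNT(*)) adds up the per-group subtotals within each partition. That also makes the SELECT DISTINCT unnecessary in this variant.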
Working with table with columns:
(PK)sales_log_id
user_id
created_at
With sales_log_id to represent transaction activities for users.
I have been able to query how many users have x amount of transactions.
Now I would like to find out how many users have eg. > 10 AND < 20 transactions in a certain period of time.
Being new with databases and Postgres, I'm learning that you can do a query and another query with the previous result (subquery). So I tried to query first how many users are having < 30 transactions in June and later query the result for users having > 10 transactions.
SELECT COUNT(DISTINCT t.user_id) usercounter
FROM (
SELECT user_id, created_at, sales_log_id
FROM sales_log
WHERE created_at BETWEEN
'2019-06-01' AND '2019-06-30'
GROUP BY user_id, created_at, sales_log_id
HAVING COUNT (sales_log_id) <30
)t
GROUP BY t.user_id
HAVING COUNT (t.sales_log.id) >10;
But it produced an error
ERROR: missing FROM-clause entry for table "sales_log"
LINE 11: HAVING COUNT (t.sales_log.id) >10;
^
SQL state: 42P01
Character: 359
Can anyone please provide the correct way to do this?
I think it is as simple as
SELECT count(*)
FROM (
SELECT user_id, COUNT(*)
FROM sales_log
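-- half-open range, so rows from any time on June 30 are included if created_at is a timestamp
WHERE created_at >= '2019-06-01' AND created_at < '2019-07-01'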
GROUP BY user_id
HAVING COUNT (sales_log_id) BETWEEN 11 AND 29
) AS q;
Only add DISTINCT to a query if you really need it.
It is just one word, but it can have a big performance penalty.
I see a discrepancy in the count, apparently because of the GROUP BY clause. Please advise how to go about this.
Total records in the table are 4638 when I do count(you_id), but when I run this script I'm getting only 4544 records.
SELECT
thumbnails,
CR1,
monetize,
c_id,
you_id,
vendor,
ratio * 100 AS percent
FROM
(
SELECT thumbnails, c_id,you_id, vendor,CR1,monetize,timestamp,
count(*) AS total, RATIO_TO_REPORT(total) OVER() AS ratio
FROM testing
GROUP by 1,2,3,4,5,6,7
);
We have a table of items, each of which has an invoice id. We process this data in chunks based on invoice id (100 "invoices" at a time). Can you assist in creating a query that will assign a group id to each set of 100 invoices (chunk)? Here's a logical example of what we wish to attain:
In this scenario, we know in advance that we have 9 rows and 5 invoices. We want to create groups that each contain 2 invoices, except possibly the last group.
SELECT n1.*,
n2.r
FROM dbtable n1,
-- Group distinct inv_ids per group of 2
-- or any other number by changing the /2 to e.g., /4
(SELECT inv_id,
((row_number() OVER())-1)/2 AS r
FROM
-- Get distinct inv_ids
(SELECT DISTINCT inv_id AS inv_id
FROM dbtable
ORDER BY inv_id) n2a) n2
WHERE n1.inv_id=n2.inv_id ;
This query has the advantage that it selects the correct groups of inv_ids even when the inv_ids are not consecutive.
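A possible alternative, as a sketch only: DENSE_RANK() ordered by inv_id numbers the distinct invoices directly, so the self-join goes away and the numbering no longer depends on row_number() OVER() happening to follow the subquery's ORDER BY. The dbtable rows and the item_id column below are made up for illustration (9 rows, 5 invoices, non-consecutive ids).

```sql
-- Hypothetical sample data: 9 items spread over 5 invoices.
WITH dbtable (item_id, inv_id) AS (
    VALUES (1, 10), (2, 10), (3, 14), (4, 20), (5, 20),
           (6, 20), (7, 31), (8, 31), (9, 45)
)
SELECT item_id,
       inv_id,
       -- dense_rank gives 1..5 for the distinct invoices; integer division by 2
       -- puts two invoices in each group (change /2 to /100 for chunks of 100)
       (DENSE_RANK() OVER (ORDER BY inv_id) - 1) / 2 AS r
FROM dbtable
ORDER BY inv_id, item_id;
```

With this sample data, invoices 10 and 14 get r = 0, invoices 20 and 31 get r = 1, and invoice 45 gets r = 2.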