Subsetting based on combinations from an inner query - postgresql

I'm using postgres on Redshift. I have a query which goes like this:
SELECT EXTRACT(year from created_at) AS CustomYear,
client_ip,
member_id,
COUNT(*) AS Views
FROM ads.fbs_page_view_staging
WHERE member_id = 2
GROUP BY CustomYear,
client_ip,
member_id
HAVING COUNT(*) = 1
ORDER BY CustomYear
Here, I'm selecting a combination of client_ip and member_id where Views is 1. I would now like to take these combinations of client_ip and member_id and subset the entire table ads.fbs_page_view_staging having only such combinations.
If there was only one column I wanted to subset on, say client_ip, I could've written the following query and got the results:
SELECT EXTRACT(year FROM created_at) AS CustomYear,
COUNT(*)
FROM ads.fbs_page_view_staging
WHERE member_id = 2
AND client_ip IN (SELECT client_ip
FROM ((SELECT EXTRACT(year from created_at) AS CustomYear,
client_ip,
member_id,
COUNT(*)
FROM ads.fbs_page_view_staging
WHERE member_id = 2
GROUP BY CustomYear,
client_ip,
member_id
HAVING COUNT(*) = 1
ORDER BY CustomYear)))
GROUP BY customyear
ORDER BY customyear
Notice that in the outer query, I am subsetting based on client_ip. But how do I subset the table on a combination of columns?
Any help would be much appreciated.

Instead of subquerying, try joining directly to the results of your query. That way you can specify multiple criteria.
Here is (draft) SQL to select IP/member pairs that match the rows found by your sub-query (i.e. for some year in the past, there was only one view for that IP & member.)
SELECT DISTINCT Staging.client_ip, Staging.member_id
FROM ads.fbs_page_view_staging Staging
INNER JOIN (SELECT EXTRACT(year from created_at) AS CustomYear,
client_ip,
member_id,
COUNT(*) AS Views
FROM ads.fbs_page_view_staging
WHERE member_id = 2
GROUP BY CustomYear,
client_ip,
member_id
HAVING COUNT(*) = 1) SingularViews
ON SingularViews.client_ip=Staging.client_ip
AND SingularViews.member_id=Staging.member_id
ORDER BY Staging.client_ip, Staging.member_id
I'm not certain I've captured the intent of your query correctly, but if not hopefully you can adapt the technique.
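To make the join-on-a-pair idea concrete, here is a minimal sketch that can be run locally. It assumes SQLite (via Python's sqlite3) as a stand-in for Redshift, so strftime('%Y', ...) replaces EXTRACT(year FROM ...), and the page_views table and its rows are made-up sample data rather than the real ads.fbs_page_view_staging.

```python
import sqlite3

# Hypothetical miniature of ads.fbs_page_view_staging (assumed sample data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_views (client_ip TEXT, member_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [
        ("10.0.0.1", 2, "2014-03-01"),  # the only 2014 view for this pair
        ("10.0.0.1", 2, "2015-06-01"),
        ("10.0.0.1", 2, "2015-07-01"),
        ("10.0.0.2", 2, "2015-01-01"),  # two views in 2015, never a single-view year
        ("10.0.0.2", 2, "2015-02-01"),
        ("10.0.0.3", 7, "2015-01-01"),  # different member, filtered out
    ],
)

# Inner query: (client_ip, member_id) pairs with exactly one view in some year.
# DISTINCT keeps a pair from being joined twice if it qualifies in several years.
# Outer query: every row of the table that belongs to one of those pairs.
rows = conn.execute(
    """
    SELECT strftime('%Y', s.created_at) AS custom_year, COUNT(*) AS views
    FROM page_views AS s
    INNER JOIN (
        SELECT DISTINCT client_ip, member_id
        FROM page_views
        WHERE member_id = 2
        GROUP BY strftime('%Y', created_at), client_ip, member_id
        HAVING COUNT(*) = 1
    ) AS singular
      ON singular.client_ip = s.client_ip
     AND singular.member_id = s.member_id
    GROUP BY custom_year
    ORDER BY custom_year
    """
).fetchall()

print(rows)
```

The pair (10.0.0.1, 2) qualifies because of its single 2014 view, so all three of its rows are kept and counted per year; 10.0.0.2 never has a single-view year and drops out.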

Related

Calculate difference between the row counts of tables in two schemas in PostgreSQL

I have two tables with the same name in two different schemas (an old and a new dump). I would like to know the difference in row counts between the two.
I have two queries that give the old and new counts:
select count(*) as count_old from(
SELECT
distinct id
FROM
schema1.compound)q1
select count(*) as count_new from(
SELECT
distinct id
FROM
schema2.compound)q2
I would like to have the following output:
table_name count_old count_new diff
compound 4740 4735 5
Any help is appreciated. Thanks in advance
with counts as (
select
(select count(distinct id) from schema1.compound) as count_old,
(select count(distinct id) from schema2.compound) as count_new
)
select
'compound' as table_name,
count_old,
count_new,
count_old - count_new as diff
from counts;
I think you could do something like this:
SELECT 'compound' AS table_name, count_old, count_new, (count_old - count_new) AS diff FROM (
SELECT
(SELECT count(*) FROM (SELECT DISTINCT id FROM schema1.compound) AS q1) AS count_old,
(SELECT count(*) FROM (SELECT DISTINCT id FROM schema2.compound) AS q2) AS count_new
) AS counts
It was probably answered already, but it is a subquery/nested query.
You can directly compute the COUNT on distinct values if you use the DISTINCT keyword inside your aggregation function. Then you can join the queries extracting your two needed values, and use them inside your query to get the output table.
WITH cte AS (
SELECT new.cnt AS count_new,
old.cnt AS count_old
FROM (SELECT COUNT(DISTINCT id) AS cnt FROM schema1.compound) AS old
INNER JOIN (SELECT COUNT(DISTINCT id) AS cnt FROM schema2.compound) AS new
ON 1 = 1
)
SELECT 'compound' AS table_name,
count_new,
count_old,
count_old - count_new AS diff
FROM cte
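For anyone who wants to try the two-scalar-subquery approach without a PostgreSQL instance, here is a rough sketch using Python's sqlite3. The two attached in-memory databases named schema1 and schema2 are an assumption that only mimics the two schemas, and the ids are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: attached in-memory databases play the role of the two schemas.
conn.execute("ATTACH DATABASE ':memory:' AS schema1")
conn.execute("ATTACH DATABASE ':memory:' AS schema2")
conn.execute("CREATE TABLE schema1.compound (id INTEGER)")
conn.execute("CREATE TABLE schema2.compound (id INTEGER)")

# Sample data with duplicates so that COUNT(DISTINCT id) matters.
conn.executemany(
    "INSERT INTO schema1.compound VALUES (?)", [(1,), (2,), (2,), (3,), (4,)]
)
conn.executemany(
    "INSERT INTO schema2.compound VALUES (?)", [(1,), (2,), (3,), (3,)]
)

# Each scalar subquery yields one number; the outer SELECT subtracts them.
row = conn.execute(
    """
    WITH counts AS (
        SELECT (SELECT COUNT(DISTINCT id) FROM schema1.compound) AS count_old,
               (SELECT COUNT(DISTINCT id) FROM schema2.compound) AS count_new
    )
    SELECT 'compound' AS table_name,
           count_old,
           count_new,
           count_old - count_new AS diff
    FROM counts
    """
).fetchone()

print(row)
```

With this sample data the old side has 4 distinct ids and the new side has 3, so diff comes out as 1.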

postgres how to insert values with 2 selects

I'm trying to run a query on Postgres but it's not working. I'd like to create an insert query with 2 selects:
Example :
INSERT INTO table1 (id_1, id_2)
SELECT id from table_2 where code='01',
SELECT id from table_2 where code='02';
I can't find the right syntax for this.
I believe the query below will work for your use case:
INSERT INTO stats(totalProduct, totalCustomer, totalOrder)
VALUES(
(SELECT COUNT(*) FROM products),
(SELECT COUNT(*) FROM customers),
(SELECT COUNT(*) FROM orders)
);
You can change the query accordingly.
You can add one more SELECT to achieve this
INSERT INTO table1 (id_1, id_2)
SELECT
(SELECT id FROM table_2 WHERE code = '01') AS Id_1,
(SELECT id FROM table_2 WHERE code = '02') AS Id_2;
Or you may try with CASE expression:
INSERT INTO table1 (id_1, id_2)
SELECT MAX(CASE WHEN code = '01' THEN id ELSE 0 END) AS Id_1,
MAX(CASE WHEN code = '02' THEN id ELSE 0 END) AS Id_2
FROM table_2
Please refer to the working fiddle on db<>fiddle
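In case the fiddle is unavailable, here is a small sketch of the scalar-subquery variant that can be run locally. It assumes SQLite through Python's sqlite3 instead of Postgres, and the ids 10, 20 and 30 are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_2 (id INTEGER PRIMARY KEY, code TEXT)")
conn.execute("CREATE TABLE table1 (id_1 INTEGER, id_2 INTEGER)")
conn.executemany(
    "INSERT INTO table_2 (id, code) VALUES (?, ?)",
    [(10, "01"), (20, "02"), (30, "03")],  # assumed sample rows
)

# One INSERT ... SELECT whose two columns are each a scalar subquery.
conn.execute(
    """
    INSERT INTO table1 (id_1, id_2)
    SELECT (SELECT id FROM table_2 WHERE code = '01'),
           (SELECT id FROM table_2 WHERE code = '02')
    """
)

inserted = conn.execute("SELECT id_1, id_2 FROM table1").fetchall()
print(inserted)
```

One thing to watch: in PostgreSQL a scalar subquery that returns more than one row raises an error, so this form relies on code being unique in table_2 (or on adding something like LIMIT 1).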

How to get count(*) total from DB2 with having clause?

How do I get the sum of all returned rows with a group by clause in DB2?
For example:
Desc Ctr
---- ---
Bowl 30
Plate 21
Spoon 6
Sum 57
SELECT COUNT (name) as Desc, Count(*) OVER ALL
GROUP BY name
The above query returns an error from DB2. What is the proper SQL statement to return the SUM of all rows?
Thanks,
Brandon.
Try this query,
select name, count(*) from mytable group by name
What is your platform of Db2?
If you want just the total count of rows, then
select count(*)
from mytable
If you want the subtotals by name plus the total, SQL didn't originally support that. You had to union the two results.
select name, count(*) as cnt
from mytable
group by name
UNION ALL
select '', count(*)
from mytable
However more modern versions have added ROLLUP (and CUBE) functionality...
select name, count(*) as cnt
from mytable
group by name with rollup
Edit
To put a value for name, you could simply use COALESCE() assuming name won't ever be null except in the total row.
select coalesce(name,'-Total-') as name, count(*) as cnt
from mytable
group by name with rollup
The more correct method is to use the GROUPING() function
either return just the flag
select name, count(*) as cnt, grouping(name) as IS_TOTAL
from mytable
group by name with rollup
or use it to set the text
select case grouping(name)
when 1 then '-Total-'
else name
end as name
, count(*) as cnt
from mytable
group by name with rollup
Include total
To include the total on each line, you could do something like so...
with tot as (select count(*) as cnt from mytable)
select name
, count(*) as name_cnt
, tot.cnt as total_cnt
from mytable
cross join tot
group by name, tot.cnt
Note that this will read mytable twice, once for the total and again for the detail rows. But it's real obvious what you're doing.
Another option would be something like so
with allrows as (
select name, count(*) as cnt, grouping(name) as IS_TOTAL
from mytable
group by name with rollup
)
select dtl.name, dtl.cnt, tot.cnt
from allrows dtl
join allrows tot
on tot.is_total = 1
where
dtl.is_total = 0
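Here is a runnable sketch of the two portable variants above (UNION ALL for a total row, and a cross join for the total on each line). It assumes SQLite via Python's sqlite3 rather than Db2, so ROLLUP and GROUPING() are not shown, and the rows are generated to match the counts in the question. The is_total column is a hypothetical helper used only for ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT)")
# Sample data reproducing the question's counts: Bowl 30, Plate 21, Spoon 6.
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("Bowl",)] * 30 + [("Plate",)] * 21 + [("Spoon",)] * 6,
)

# Variant 1: per-name counts plus one total row appended with UNION ALL.
# is_total is only there to force the total row to sort last.
with_total_row = conn.execute(
    """
    SELECT name, cnt
    FROM (
        SELECT name, COUNT(*) AS cnt, 0 AS is_total FROM mytable GROUP BY name
        UNION ALL
        SELECT 'Sum', COUNT(*), 1 FROM mytable
    ) AS u
    ORDER BY is_total, name
    """
).fetchall()

# Variant 2: the grand total repeated on every detail line via a cross join.
total_on_each_line = conn.execute(
    """
    WITH tot AS (SELECT COUNT(*) AS cnt FROM mytable)
    SELECT name, COUNT(*) AS name_cnt, tot.cnt AS total_cnt
    FROM mytable
    CROSS JOIN tot
    GROUP BY name, tot.cnt
    ORDER BY name
    """
).fetchall()

print(with_total_row)
print(total_on_each_line)
```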

Postgresql rows to columns (UNION ALL to JOIN)

Hello, with this query I'm getting one result with four rows. How can I change it to get four named columns, each with its own result?
SELECT COUNT(*) FROM vehicles WHERE cus=1
UNION ALL
SELECT COUNT(*) FROM user WHERE cus=1
UNION ALL
SELECT COUNT(*) FROM vehicle_events WHERE cus=1
UNION ALL
SELECT COUNT(*) FROM vehicle_alerts WHERE cus=1
Thanks in advance.
SELECT a.ct veh_count, b.ct user_count, c.ct event_count, d.ct alert_count
FROM
( SELECT COUNT(*) ct FROM vehicles WHERE cus=1 ) a,
( SELECT COUNT(*) ct FROM user WHERE cus=1 ) b,
( SELECT COUNT(*) ct FROM vehicle_events WHERE cus=1 ) c,
( SELECT COUNT(*) ct FROM vehicle_alerts WHERE cus=1 ) d;
UNION only adds rows; it has no effect on the columns.
Columns, which define the "shape" of the row tuples, must appear as selected columns [1].
For example:
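SELECT
(SELECT COUNT(*) FROM vehicles WHERE cus=1) as veh_count
,(SELECT COUNT(*) FROM users WHERE cus=1) as user_count
,(SELECT COUNT(*) FROM vehicle_events WHERE cus=1) as event_count
,(SELECT COUNT(*) FROM vehicle_alerts WHERE cus=1) as alert_count;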
[1] There are other constructs that can allow this, see crosstab for example - but the columns are fixed by the query command. It takes dynamic SQL to get a variable number of columns.
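As a quick check of the "one scalar subquery per column" shape, here is a sketch that runs against SQLite through Python's sqlite3 (an assumption; the question is about PostgreSQL). The table contents are made up, and the users table name follows the example above rather than the user table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed sample data: the cus value of each row, per table.
sample = {
    "vehicles": [1, 1, 2],
    "users": [1],
    "vehicle_events": [1, 1, 1, 1],
    "vehicle_alerts": [2],
}
for table, cus_values in sample.items():
    conn.execute(f"CREATE TABLE {table} (cus INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (?)", [(c,) for c in cus_values])

# Each count becomes its own named column of a single result row.
cursor = conn.execute(
    """
    SELECT (SELECT COUNT(*) FROM vehicles WHERE cus = 1)       AS veh_count,
           (SELECT COUNT(*) FROM users WHERE cus = 1)          AS user_count,
           (SELECT COUNT(*) FROM vehicle_events WHERE cus = 1) AS event_count,
           (SELECT COUNT(*) FROM vehicle_alerts WHERE cus = 1) AS alert_count
    """
)
columns = [d[0] for d in cursor.description]
row = cursor.fetchone()

print(dict(zip(columns, row)))
```

A table with no matching rows still contributes a column (with a count of 0), which is one practical difference from the UNION ALL version where each count is a separate row.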

SQL Server SUM() for DISTINCT records

I have a field called "Users", and I want to run SUM() on that field that returns the sum of all DISTINCT records. I thought that this would work:
SELECT SUM(DISTINCT table_name.users)
FROM table_name
But it's not selecting DISTINCT records, it's just running as if I had run SUM(table_name.users).
What would I have to do to add only the distinct records from this field?
Use count()
SELECT count(DISTINCT table_name.users)
FROM table_name
This code seems to indicate sum(distinct ) and sum() return different values.
with t as (
select 1 as a
union all
select '1'
union all
select '2'
union all
select '4'
)
select sum(distinct a) as DistinctSum, sum(a) as allSum, count(distinct a) as distinctCount, count(a) as allCount from t
Do you actually have non-distinct values?
select count(1), users
from table_name
group by users
having count(1) > 1
If not, the sums will be identical.
You can see for yourself that distinct works with the following example. Here I create a subquery with duplicate values, then I do a sum distinct on those values.
select DistinctSum=sum(distinct x), RegularSum=Sum(x)
from
(
select x=1
union All
select 1
union All
select 2
union All
select 2
) x
You can see that the distinct sum column returns 3 and the regular sum returns 6 in this example.
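The same comparison can be reproduced outside SQL Server. The sketch below assumes SQLite via Python's sqlite3, which also supports SUM(DISTINCT ...), and uses the same four sample values; table_name and users mirror the question's placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (users INTEGER)")
conn.executemany("INSERT INTO table_name VALUES (?)", [(1,), (1,), (2,), (2,)])

# SUM(DISTINCT ...) adds each distinct value once; SUM(...) adds every row.
distinct_sum, regular_sum, distinct_count = conn.execute(
    "SELECT SUM(DISTINCT users), SUM(users), COUNT(DISTINCT users) FROM table_name"
).fetchone()

# Equivalent formulation: de-duplicate in a derived table, then sum.
via_subquery = conn.execute(
    "SELECT SUM(users) FROM (SELECT DISTINCT users FROM table_name) AS t"
).fetchone()[0]

print(distinct_sum, regular_sum, distinct_count, via_subquery)
```

If the two sums come out equal on your own data, the column most likely has no repeated values, which would explain the behaviour described in the question.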
You can use a sub-query:
select sum(users)
from (select distinct users from table_name) as t;
SUM(DISTINCTROW table_name.something)
It worked for me (innodb).
Description - "DISTINCTROW omits data based on entire duplicate records, not just duplicate fields." http://office.microsoft.com/en-001/access-help/all-distinct-distinctrow-top-predicates-HA001231351.aspx
;WITH cte
as
(
SELECT table_name.users , rn = ROW_NUMBER() OVER (PARTITION BY users ORDER BY users)
FROM table_name
)
SELECT SUM(users)
FROM cte
WHERE rn = 1
TEST
DECLARE @table_name Table (Users INT );
INSERT INTO @table_name Values (1),(1),(1),(3),(3),(5),(5);
;WITH cte
as
(
SELECT users , rn = ROW_NUMBER() OVER (PARTITION BY users ORDER BY users)
FROM @table_name
)
)
SELECT SUM(users) DisSum
FROM cte
WHERE rn = 1
Result
DisSum
9
If circumstances make it difficult to weave a "distinct" into the sum clause, it will usually be possible to add an extra "where" clause to the entire query - something like:
select sum(t.ColToSum)
from SomeTable t
where (select count(*) from SomeTable t1 where t1.ColToSum = t.ColToSum and t1.ID < t.ID) = 0
This may be a duplicate of "Trying to sum distinct values SQL".
As per Declan_K's answer:
Get the distinct list first...
SELECT SUM(SQ.COST)
FROM
(SELECT DISTINCT [Tracking #] as TRACK,[Ship Cost] as COST FROM YourTable) SQ