PostgreSQL - How to count when Distinct On

How can I get the count of rows for each user_id in a query like this:
select distinct on (user_id) *
from some_table
That is, the same count this query returns:
select user_id, count(*)
from some_table
group by user_id

Try this:
SELECT DISTINCT ON (a.user_id)
a.*
FROM
(
SELECT user_id
, count(*) OVER (PARTITION BY user_id) AS user_row_count
FROM some_table
) a

If you want to be able to use SELECT * in order to get a "sample row", depending on how large your table is you may be able to use a correlated subquery to get the count of rows for that particular user id:
select distinct on (user_id) *
, (select count (1)
from some_table st2
where st2.user_id = some_table.user_id) as user_row_count
from some_table
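Both answers can be sanity-checked without a PostgreSQL server. Below is a minimal sketch using Python's built-in sqlite3 module. Two assumptions: SQLite has no DISTINCT ON, so row_number() stands in to pick the "sample row" per user_id, and the payload column is a made-up placeholder for the rest of the row. It needs SQLite 3.25 or newer for window functions.

```python
import sqlite3


def sample_row_with_count(rows):
    """Return one row per user_id together with that user's total row count."""
    conn = sqlite3.connect(":memory:")
    # "payload" is a hypothetical column standing in for the rest of the row.
    conn.execute("CREATE TABLE some_table (user_id INTEGER, payload TEXT)")
    conn.executemany("INSERT INTO some_table VALUES (?, ?)", rows)
    # count(*) OVER (...) is the window count from the answer above;
    # row_number() replaces DISTINCT ON, which SQLite does not support.
    query = """
        SELECT user_id, payload, user_row_count
        FROM (
            SELECT user_id,
                   payload,
                   count(*) OVER (PARTITION BY user_id) AS user_row_count,
                   row_number() OVER (PARTITION BY user_id ORDER BY payload) AS rn
            FROM some_table
        ) AS a
        WHERE rn = 1
        ORDER BY user_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(sample_row_with_count([(1, "a"), (1, "b"), (2, "c"), (1, "d")]))
# prints [(1, 'a', 3), (2, 'c', 1)]
```

In PostgreSQL itself you would keep DISTINCT ON and add an ORDER BY to control which row is returned as the sample.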

Related

Calculate difference between the row counts of tables in two schemas in PostgreSQL

I have two tables with the same name in two different schemas (an old and a new dump). I would like to know the difference in row counts between the two.
I have two queries that give the old and new counts:
select count(*) as count_old from(
SELECT
distinct id
FROM
schema1.compound)q1
select count(*) as count_new from(
SELECT
distinct id
FROM
schema2.compound)q2
I would like to have the following output:
table_name  count_old  count_new  diff
compound    4740       4735       5
Any help is appreciated. Thanks in advance
with counts as (
select
(select count(distinct id) from schema1.compound) as count_old,
(select count(distinct id) from schema2.compound) as count_new
)
select
'compound' as table_name,
count_old,
count_new,
count_old - count_new as diff
from counts;
I think you could do something like this:
SELECT 'compound' AS table_name, count_old, count_new, (count_old - count_new) AS diff FROM (
SELECT
(SELECT count(*) FROM (SELECT DISTINCT id FROM schema1.compound) AS q1) AS count_old,
(SELECT count(*) FROM (SELECT DISTINCT id FROM schema2.compound) AS q2) AS count_new
) AS counts;
This has probably been answered already; it is just a subquery (nested query).
You can directly compute the COUNT on distinct values if you use the DISTINCT keyword inside your aggregation function. Then you can join the queries extracting your two needed values, and use them inside your query to get the output table.
WITH cte AS (
SELECT new.cnt AS count_new,
old.cnt AS count_old
FROM (SELECT COUNT(DISTINCT id) AS cnt FROM schema1.compound) AS old
INNER JOIN (SELECT COUNT(DISTINCT id) AS cnt FROM schema2.compound) AS new
ON 1 = 1
)
SELECT 'compound' AS table_name,
count_new,
count_old,
count_old - count_new AS diff
FROM cte
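To try the scalar-subquery approach on toy data, here is a small sketch using Python's built-in sqlite3 module. It assumes two plain tables, compound_old and compound_new (hypothetical names), in place of schema1.compound and schema2.compound, since an in-memory SQLite database has no schemas in the PostgreSQL sense.

```python
import sqlite3


def count_diff(old_ids, new_ids):
    """Return (table_name, count_old, count_new, diff) using distinct id counts."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-ins for schema1.compound and schema2.compound.
    conn.execute("CREATE TABLE compound_old (id INTEGER)")
    conn.execute("CREATE TABLE compound_new (id INTEGER)")
    conn.executemany("INSERT INTO compound_old VALUES (?)", [(i,) for i in old_ids])
    conn.executemany("INSERT INTO compound_new VALUES (?)", [(i,) for i in new_ids])
    query = """
        WITH counts AS (
            SELECT (SELECT count(DISTINCT id) FROM compound_old) AS count_old,
                   (SELECT count(DISTINCT id) FROM compound_new) AS count_new
        )
        SELECT 'compound' AS table_name,
               count_old,
               count_new,
               count_old - count_new AS diff
        FROM counts
    """
    row = conn.execute(query).fetchone()
    conn.close()
    return row


print(count_diff([1, 1, 2, 3, 4], [1, 2, 2]))
# prints ('compound', 4, 2, 2)
```

Each scalar subquery returns exactly one row, so no join condition is needed to combine the two counts.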

postgres how to insert values with 2 selects

I'm trying to write a query in Postgres, but it's not working. I'd like to create an INSERT query with two SELECTs:
Example:
INSERT INTO table1 (id_1, id_2)
SELECT id from table_2 where code='01',
SELECT id from table_2 where code='02';
I can't find the right syntax for this.
I believe the query below will work for your use case:
INSERT INTO stats(totalProduct, totalCustomer, totalOrder)
VALUES(
(SELECT COUNT(*) FROM products),
(SELECT COUNT(*) FROM customers),
(SELECT COUNT(*) FROM orders)
);
You can change the query accordingly.
You can add one more SELECT to achieve this
INSERT INTO table1 (id_1, id_2)
SELECT
(SELECT id FROM table_2 WHERE code = '01') AS Id_1,
(SELECT id FROM table_2 WHERE code = '02') AS Id_2;
Or you may try with CASE expression:
INSERT INTO table1 (id_1, id_2)
SELECT MAX(CASE WHEN code = '01' THEN id ELSE 0 END) AS Id_1,
MAX(CASE WHEN code = '02' THEN id ELSE 0 END) AS Id_2
FROM table_2
Please refer to the working fiddle on db<>fiddle
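For a quick local check of the "one outer SELECT with two scalar subqueries" form, here is a hedged sketch using Python's built-in sqlite3 module. The sample ids (10, 20, 30) are made up, and it assumes each code matches at most one row in table_2. PostgreSQL raises an error if a scalar subquery returns more than one row, whereas SQLite silently takes the first.

```python
import sqlite3


def insert_pair(lookup_rows):
    """Insert one row into table1 built from two lookups in table_2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_2 (id INTEGER, code TEXT)")
    conn.execute("CREATE TABLE table1 (id_1 INTEGER, id_2 INTEGER)")
    conn.executemany("INSERT INTO table_2 VALUES (?, ?)", lookup_rows)
    # Each parenthesised SELECT is a scalar subquery yielding one value,
    # so the outer SELECT produces exactly one row with two columns.
    conn.execute("""
        INSERT INTO table1 (id_1, id_2)
        SELECT (SELECT id FROM table_2 WHERE code = '01'),
               (SELECT id FROM table_2 WHERE code = '02')
    """)
    rows = conn.execute("SELECT id_1, id_2 FROM table1").fetchall()
    conn.close()
    return rows


print(insert_pair([(10, "01"), (20, "02"), (30, "03")]))
# prints [(10, 20)]
```

If a code has no match, the corresponding column is inserted as NULL rather than the row being skipped.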

How to index subgroups in table in Postgres?

Supposing I have a table like this:
select country_id, city_id, person_id from mytable
country_id,city_id,person_id
123,45,100334
123,45,3460456
123,45,943875
123,121,4362
123,121,124747
146,87,3457320
146,89,3495879
146,89,34703924
I want to index the subgroups of country_id and city_id to get such result:
select country_id, city_id, person_id, ???, ??? from mytable
country_id,city_id,person_id,country_num,city_num
123,45,100334,1,1
123,45,3460456,1,1
123,45,943875,1,1
123,121,4362,1,2
123,121,124747,1,2
146,87,3457320,2,1
146,89,3495879,2,2
146,89,34703924,2,2
In other words, I want to number all countries sequentially with integers starting from 1, and I also want to number the cities the same way within each country separately. Is there an elegant way to do it in Postgres?
Use dense_rank() window function:
SELECT
*,
dense_rank() OVER (ORDER BY country_id) AS country_num,
dense_rank() OVER (PARTITION BY country_id ORDER BY city_id) AS city_num
FROM
mytable
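The dense_rank() query can be checked against the sample data from the question. Here is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25 or newer, which supports the same window functions). The final ORDER BY is an addition to make the output deterministic. Note that city_num follows the numeric order of city_id within each country.

```python
import sqlite3


def number_subgroups(rows):
    """Number countries globally and cities within each country."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (country_id INTEGER, city_id INTEGER, person_id INTEGER)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    query = """
        SELECT country_id, city_id, person_id,
               dense_rank() OVER (ORDER BY country_id) AS country_num,
               dense_rank() OVER (PARTITION BY country_id ORDER BY city_id) AS city_num
        FROM mytable
        ORDER BY country_id, city_id, person_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


sample = [
    (123, 45, 100334), (123, 45, 3460456), (123, 45, 943875),
    (123, 121, 4362), (123, 121, 124747),
    (146, 87, 3457320), (146, 89, 3495879), (146, 89, 34703924),
]
for row in number_subgroups(sample):
    print(row)
```

dense_rank() gives tied rows the same number and leaves no gaps, which is why every row of a country or city shares one index.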

How do I delete records based on query results

How do I delete records from a table that is referenced in my query? For example, below is my query, which returns the correct number of results, but I then want to delete those records from the same table that is referenced in the query.
;with cte as (select *,
row_number() over (partition by c.[Trust Discharge], c.[AE Admission], c.[NHS Number]
order by c.[Hospital Number]) as Rn,
count(*) over (partition by c.[Trust Discharge], c.[AE Admission], c.[NHS Number]) as cntDups
from CommDB.dbo.tblNHFDArchive as c)
Select * from cte
Where cte.Rn>1 and cntDups >1
Since you can already select the rows with Select * from cte Where cte.Rn>1 and cntDups >1, you can delete them by keeping the CTE definition and replacing the final select with delete from your_table where unique_column in (Select unique_column from cte Where cte.Rn>1 and cntDups >1).
Note that unique_column is a column in your table that cannot have duplicate values, and your_table is the table where the rows reside.
And don't forget to back up your table first if it's in production.
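The "delete where the unique column is in the duplicate set" pattern can be tried on a throwaway table. Here is a rough sketch using Python's built-in sqlite3 module rather than SQL Server. The archive table, its id key, and the single partition column nhs_number are simplified, hypothetical stand-ins for the real table and its three partition columns.

```python
import sqlite3


def delete_duplicates(rows):
    """Keep the row with the lowest hospital_number per nhs_number, delete the rest."""
    conn = sqlite3.connect(":memory:")
    # "id" plays the role of unique_column from the answer above.
    conn.execute(
        "CREATE TABLE archive ("
        "id INTEGER PRIMARY KEY, nhs_number TEXT, hospital_number INTEGER)"
    )
    conn.executemany(
        "INSERT INTO archive (nhs_number, hospital_number) VALUES (?, ?)", rows
    )
    # Number the rows inside each duplicate group, then delete every row
    # whose number is greater than 1, matching on the unique key.
    conn.execute("""
        DELETE FROM archive
        WHERE id IN (
            SELECT id
            FROM (
                SELECT id,
                       row_number() OVER (
                           PARTITION BY nhs_number ORDER BY hospital_number
                       ) AS rn
                FROM archive
            ) AS numbered
            WHERE rn > 1
        )
    """)
    remaining = conn.execute(
        "SELECT nhs_number, hospital_number FROM archive ORDER BY nhs_number"
    ).fetchall()
    conn.close()
    return remaining


print(delete_duplicates([("A", 2), ("A", 1), ("B", 5), ("A", 3)]))
# prints [('A', 1), ('B', 5)]
```

The cntDups > 1 condition from the question is not needed here, because any row with rn > 1 already belongs to a group with more than one row.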

SQL Server SUM() for DISTINCT records

I have a field called "Users", and I want to run SUM() on that field that returns the sum of all DISTINCT records. I thought that this would work:
SELECT SUM(DISTINCT table_name.users)
FROM table_name
But it's not selecting DISTINCT records, it's just running as if I had run SUM(table_name.users).
What would I have to do to add only the distinct records from this field?
Use count()
SELECT count(DISTINCT table_name.users)
FROM table_name
This code seems to indicate sum(distinct ) and sum() return different values.
with t as (
select 1 as a
union all
select '1'
union all
select '2'
union all
select '4'
)
select sum(distinct a) as DistinctSum, sum(a) as allSum, count(distinct a) as distinctCount, count(a) as allCount from t
Do you actually have non-distinct values?
select count(1), users
from table_name
group by users
having count(1) > 1
If not, the sums will be identical.
You can see for yourself that distinct works with the following example. Here I create a subquery with duplicate values, then I do a sum distinct on those values.
select DistinctSum=sum(distinct x), RegularSum=Sum(x)
from
(
select x=1
union All
select 1
union All
select 2
union All
select 2
) x
You can see that the distinct sum column returns 3 and the regular sum returns 6 in this example.
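The same comparison can be reproduced outside SQL Server. Here is a short sketch using Python's built-in sqlite3 module, which also accepts DISTINCT inside sum(). The table and column names are taken from the question, and the values are the ones from the example above.

```python
import sqlite3


def sums(values):
    """Return (distinct_sum, regular_sum) for the given column values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_name (users INTEGER)")
    conn.executemany("INSERT INTO table_name VALUES (?)", [(v,) for v in values])
    row = conn.execute(
        "SELECT sum(DISTINCT users) AS distinct_sum, sum(users) AS regular_sum "
        "FROM table_name"
    ).fetchone()
    conn.close()
    return row


print(sums([1, 1, 2, 2]))
# prints (3, 6)
```

If both numbers come out equal on your real data, the column most likely contains no repeated values.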
You can use a sub-query:
select sum(users)
from (select distinct users from table_name) as t;
SUM(DISTINCTROW table_name.something)
It worked for me (innodb).
Description - "DISTINCTROW omits data based on entire duplicate records, not just duplicate fields." http://office.microsoft.com/en-001/access-help/all-distinct-distinctrow-top-predicates-HA001231351.aspx
;WITH cte
as
(
SELECT table_name.users , rn = ROW_NUMBER() OVER (PARTITION BY users ORDER BY users)
FROM table_name
)
SELECT SUM(users)
FROM cte
WHERE rn = 1
TEST
DECLARE @table_name TABLE (Users INT);
INSERT INTO @table_name VALUES (1),(1),(1),(3),(3),(5),(5);
;WITH cte
as
(
SELECT users , rn = ROW_NUMBER() OVER (PARTITION BY users ORDER BY users)
FROM @table_name
)
SELECT SUM(users) DisSum
FROM cte
WHERE rn = 1
Result
DisSum
9
If circumstances make it difficult to weave a "distinct" into the sum clause, it will usually be possible to add an extra "where" clause to the entire query - something like:
select sum(t.ColToSum)
from SomeTable t
where (select count(*) from SomeTable t1 where t1.ColToSum = t.ColToSum and t1.ID < t.ID) = 0
This may be a duplicate of:
Trying to sum distinct values SQL
As per Declan_K's answer:
Get the distinct list first...
SELECT SUM(SQ.COST)
FROM
(SELECT DISTINCT [Tracking #] as TRACK,[Ship Cost] as COST FROM YourTable) SQ