how to count the distinct rows from two tables using joins - postgresql

I have two tables like this:
table1        table2
-----------   -----------
col1  col2    col1  col2
I need to count the distinct values of col1 in table1 that match col1 in table2.
Note: col1 in table2 is also distinct.

select count(distinct table1.col1)
from table1,table2
where table1.col1=table2.col1
Because the query counts the distinct values of table1.col1 and the join condition keeps only rows that have a match in table2, each matching value is counted exactly once, no matter how many rows it appears in.
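To see the query behave as described, here is a minimal sketch that runs it against hypothetical sample rows. SQLite (through Python's built-in sqlite3 module) stands in for PostgreSQL, the data is invented for illustration, and the query is written with an explicit JOIN, which should be equivalent to the comma-join form above.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE table2 (col1 INTEGER, col2 TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (1, 'b'), (2, 'c'), (3, 'd');
    INSERT INTO table2 VALUES (1, 'x'), (2, 'y'), (4, 'z');
""")

# Count each table1.col1 value once if it has a match in table2.
matching = conn.execute("""
    SELECT COUNT(DISTINCT t1.col1)
    FROM table1 AS t1
    JOIN table2 AS t2 ON t1.col1 = t2.col1
""").fetchone()[0]

print(matching)
```

Value 1 appears twice in table1 but is counted once, and value 3 has no match in table2, so it is not counted.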

Related

python pyodbc bulk select with where tuples

Hello, I am looking for a "bulk" select with pyodbc, something like
select col1, col2 from table1 where col1 = df[col1] and col2 = df[col2]
or
select col1, col2 from table1 where (col1, col2) in df_tupes
where df_tupes would be [(col1val1, col2val1), (col1val2, col2val2)].
It should work not only for two columns but for any number of columns.
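One possible approach, sketched under assumptions: build one parenthesised group of `column = ?` comparisons per tuple and OR the groups together, passing the flattened values as parameters. pyodbc uses `?` placeholders, and so does sqlite3, which stands in here so the sketch runs without a database server. The table contents and the `key_columns` list are hypothetical.

```python
import sqlite3

# sqlite3 stands in for pyodbc; both use "?" (qmark) placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT, col3 TEXT);
    INSERT INTO table1 VALUES
        (1, 'a', 'r1'), (1, 'b', 'r2'), (2, 'a', 'r3'), (2, 'b', 'r4');
""")

key_columns = ["col1", "col2"]      # hypothetical; any number of columns
df_tupes = [(1, "a"), (2, "b")]     # one tuple of values per wanted row

# One "(col1 = ? AND col2 = ?)" group per tuple, OR-ed together.
group = "(" + " AND ".join(f"{c} = ?" for c in key_columns) + ")"
where = " OR ".join([group] * len(df_tupes))

# Flatten the tuples so the values line up with the placeholders.
params = [value for tup in df_tupes for value in tup]

sql = f"SELECT col1, col2, col3 FROM table1 WHERE {where} ORDER BY col3"
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Only the values are passed as parameters; the column names are interpolated into the SQL text, so they must come from trusted code rather than user input. An empty `df_tupes` would produce an invalid WHERE clause and needs separate handling, and drivers usually cap the number of parameters per statement, so a long list may have to be sent in batches.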

select distinct values in multiple column and save in common column with column tags

I have a table in postgres with two columns:
col1 col2
a a
b c
d e
f f
I would like to take the distinct values across both columns, put them in a single column, and tag each value with the name of the column(s) it comes from. The desired output is:
col source
a col1, col2
b col1
c col2
d col1
e col2
f col1, col2
I can find the distinct values in each column individually, but I cannot combine them into a single column and add the source label.
Below is the query I am using:
select distinct on (col1, col2) col1, col2 from table
Any suggestions would be really helpful.
You can un-pivot the columns and then aggregate them back:
select u.value, string_agg(distinct u.source, ',' order by u.source)
from data
cross join lateral (
values('col1', col1), ('col2', col2)
)as u(source,value)
group by u.value
order by u.value;
Alternatively, if you don't want to list each column, you can convert the row to a JSON value and then un-pivot that:
select x.value, string_agg(distinct x.source, ',' order by x.source)
from data d
cross join lateral jsonb_each_text(to_jsonb(d)) as x(source, value)
group by x.value
order by x.value;
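The same un-pivot-then-aggregate idea can be tried without PostgreSQL-specific features. The sketch below is an illustration, not the answer's query: it un-pivots with a plain UNION in SQLite (via Python's sqlite3 module) and does the aggregation step in Python. The table name `data` and the sample rows are taken from the question.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (col1 TEXT, col2 TEXT);
    INSERT INTO data VALUES ('a', 'a'), ('b', 'c'), ('d', 'e'), ('f', 'f');
""")

# UNION (not UNION ALL) un-pivots both columns into (value, source) pairs
# and removes duplicate pairs.
pairs = conn.execute("""
    SELECT col1 AS value, 'col1' AS source FROM data
    UNION
    SELECT col2, 'col2' FROM data
    ORDER BY value, source
""").fetchall()

# Aggregate the sources of each value back into one comma-separated tag.
tagged = [
    (value, ", ".join(source for _, source in group))
    for value, group in groupby(pairs, key=lambda pair: pair[0])
]

for value, source in tagged:
    print(value, source)
```

Because the pairs arrive sorted by value, `groupby` sees each value as one contiguous run, which is what makes the Python-side aggregation correct.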

T-SQL find differences

I found Jeff Smith's solution, which displays the differences between two tables:
SELECT MIN(TableName) as TableName, ID, COL1, COL2, COL3 ...
FROM
(
SELECT 'Table A' as TableName, A.ID, A.COL1, A.COL2, A.COL3, ...
FROM A
UNION ALL
SELECT 'Table B' as TableName, B.ID, B.COL1, B.COL2, B.COL3, ...
FROM B
) tmp
GROUP BY ID, COL1, COL2, COL3 ...
HAVING COUNT(*) = 1
ORDER BY ID
In my project I need to compare only some columns, e.g. col1 and col2; the rest are used for other operations.
I tried to use
HAVING (COUNT(col1) = 1 and COUNT(col2) = 1)
but with no effect.
Could you please provide a solution that does this?
Get the values of COL1 and COL2 in A that do not exist in B using EXCEPT:
SELECT COL1, COL2 FROM A
EXCEPT
SELECT COL1, COL2 FROM B
Use the results as a derived table to join them back to A and get all the columns:
SELECT 'A' AS SRC, A.COL1, A.COL2, A.COL3...
FROM (
SELECT COL1, COL2 FROM A
EXCEPT
SELECT COL1, COL2 FROM B
) AS diff
INNER JOIN A ON diff.COL1 = A.COL1 AND diff.COL2 = A.COL2
Similarly, use EXCEPT to get the values of COL1 and COL2 that exist only in B, and join the resulting set to B to obtain the complete rows.
Combine the two sets with UNION ALL:
SELECT 'A' AS SRC, A.COL1, A.COL2, A.COL3...
FROM (
SELECT COL1, COL2 FROM A
EXCEPT
SELECT COL1, COL2 FROM B
) AS diff
INNER JOIN A ON diff.COL1 = A.COL1 AND diff.COL2 = A.COL2
UNION ALL
SELECT 'B' AS SRC, B.COL1, B.COL2, B.COL3...
FROM (
SELECT COL1, COL2 FROM B
EXCEPT
SELECT COL1, COL2 FROM A
) AS diff
INNER JOIN B ON diff.COL1 = B.COL1 AND diff.COL2 = B.COL2
;
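As a quick check of the combined EXCEPT query, here is a small sketch that runs it on hypothetical rows. SQLite (via Python's sqlite3 module) stands in for SQL Server, and the sample data and the trailing ORDER BY are assumptions added only to make the result easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, COL1 TEXT, COL2 TEXT, COL3 TEXT);
    CREATE TABLE B (ID INTEGER, COL1 TEXT, COL2 TEXT, COL3 TEXT);
    -- (x, y) exists in both tables; COL3 differs but is not compared.
    INSERT INTO A VALUES (1, 'x', 'y', 'a-extra'), (2, 'p', 'q', 'only-a');
    INSERT INTO B VALUES (1, 'x', 'y', 'b-extra'), (3, 'm', 'n', 'only-b');
""")

diff_rows = conn.execute("""
    SELECT 'A' AS SRC, A.COL1, A.COL2, A.COL3
    FROM (SELECT COL1, COL2 FROM A EXCEPT SELECT COL1, COL2 FROM B) AS diff
    INNER JOIN A ON diff.COL1 = A.COL1 AND diff.COL2 = A.COL2
    UNION ALL
    SELECT 'B' AS SRC, B.COL1, B.COL2, B.COL3
    FROM (SELECT COL1, COL2 FROM B EXCEPT SELECT COL1, COL2 FROM A) AS diff
    INNER JOIN B ON diff.COL1 = B.COL1 AND diff.COL2 = B.COL2
    ORDER BY SRC
""").fetchall()

for row in diff_rows:
    print(row)
```

The pair (x, y) is present in both tables and is therefore not reported, even though COL3 differs. One caveat worth checking against real data: EXCEPT treats NULLs as equal, while the `=` in the join does not, so rows with NULL in COL1 or COL2 may be found by EXCEPT and then lost in the join.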
You are dropping the columns from the wrong place. Leave COUNT(*) as it is and drop them from the column lists (the SELECT lists and the GROUP BY) instead:
SELECT MIN(TableName) as TableName, ID, COL1, COL2
FROM
(
SELECT 'Table A' as TableName, A.ID, A.COL1, A.COL2
FROM A
UNION ALL
SELECT 'Table B' as TableName, B.ID, B.COL1, B.COL2
FROM B
) tmp
GROUP BY ID, COL1, COL2
HAVING COUNT(*) = 1
ORDER BY ID
To keep the other columns in the result, wrap them in an aggregate such as MIN or MAX:
SELECT MIN(TableName) as TableName, ID, COL1, COL2, MIN(COL3), MIN(COL4), ...
FROM
(
SELECT 'Table A' as TableName, A.ID, A.COL1, A.COL2, A.COL3, A.COL4, ...
FROM A
UNION ALL
SELECT 'Table B' as TableName, B.ID, B.COL1, B.COL2, B.COL3, B.COL4, ...
FROM B
) tmp
GROUP BY ID, COL1, COL2
HAVING COUNT(*) = 1
ORDER BY ID
Note that this doesn't work very well for certain situations. If two rows are identical in the two tables (including IDs), then it will find it as a difference even though it's not. Also, in this version, if you have multiple rows where COL1 and COL2 are the same, then this doesn't work well either. I would join the two tables together for a more robust comparison.
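For illustration, the GROUP BY / HAVING COUNT(*) = 1 variant can be run on a few hypothetical rows. SQLite (via Python's sqlite3 module) stands in for SQL Server; the sample data, the `COL3` alias, and the extra `TableName` sort key are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, COL1 TEXT, COL2 TEXT, COL3 TEXT);
    CREATE TABLE B (ID INTEGER, COL1 TEXT, COL2 TEXT, COL3 TEXT);
    INSERT INTO A VALUES (1, 'x', 'y', 'a3'), (2, 'p', 'q', 'a3');
    INSERT INTO B VALUES (1, 'x', 'y', 'b3'), (2, 'p', 'CHANGED', 'b3');
""")

# Only ID, COL1 and COL2 are compared; COL3 is carried along with MIN.
differences = conn.execute("""
    SELECT MIN(TableName) AS TableName, ID, COL1, COL2, MIN(COL3) AS COL3
    FROM (
        SELECT 'Table A' AS TableName, ID, COL1, COL2, COL3 FROM A
        UNION ALL
        SELECT 'Table B' AS TableName, ID, COL1, COL2, COL3 FROM B
    ) AS tmp
    GROUP BY ID, COL1, COL2
    HAVING COUNT(*) = 1
    ORDER BY ID, TableName
""").fetchall()

for row in differences:
    print(row)
```

Row 1 has the same ID, COL1 and COL2 in both tables, so its group has two members and is filtered out even though COL3 differs. Row 2 differs in COL2, so each table contributes a group of one and both versions are reported.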

postgres output query within with clause

I'm trying to get the output of the queries within the WITH clause of my final query as CSV or some other sort of text file. I only have query access; I'm not allowed to create tables in this database. I have one set of queries that does some calculations on a data set, another set that computes on the results of the first, and yet another that calculates on the results of the second. I don't want to run all of it as three separate queries because the results of the first two are already part of the last one.
WITH
Q1 AS(
SELECT col1, col2, col3, col4, col5, col6, col7
FROM table1
),
Q2 AS(
SELECT AVG(col1) as col1Avg, MAX(col1) as col1Max, col2, col3,col4
FROM Q1
GROUP BY col2, col3, col4
)
SELECT
AVG(col1AVG), col3
FROM
Q2
GROUP BY col3
I would like the results from Q1, Q2 and the final select statement as preferably 3 csv files but I could live with all of it in one csv file. Is this possible?
Thanks!
Edit: Just to clarify, the columns from the queries are very different. I'm definitely pulling more columns from my first query than my second. I've edited the above code a bit to make this more clear.
To combine all the results together you'd use UNION ALL, but the number and data types of the columns must match.
select col1, col2, col2
from blah
union all
select col1, col2, col2
from blah2
union all
... etc
You can of course reference CTEs in there:
with
cte_1 as (
select ... from ...),
cte_2 as (
select ... from ... cte_1),
cte_3 as (
select ... from ... cte_2)
select col1, col2, col2
from cte_1
union all
select col1, col2, col2
from cte_2
union all
select col1, col2, col2
from cte_3
If your final output is a csv then it looks like you have multiple row formats in there -- checksums? If so, in the queries that you union all together you might like to combine all the columns from each query into one string ...
with
cte_1 as (
select ... from ...),
cte_2 as (
select ... from ... cte_1),
cte_3 as (
select ... from ... cte_2)
select col1||','||col2||','||col2
from cte_1
union all
select col1||','||col2
from cte_2
union all
select col1
from cte_3
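A minimal sketch of the single-string approach, under stated assumptions: each CTE's row is concatenated into one comma-separated string, and a tag column (`src`, a hypothetical addition) records which CTE produced it, so a client can split the combined result back into three outputs. SQLite (via Python's sqlite3 module) stands in for PostgreSQL, and the table contents and the `avgOfAvg` alias are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT, col3 TEXT);
    INSERT INTO table1 VALUES
        (10, 'x', 'p'), (20, 'x', 'p'), (30, 'y', 'p'), (50, 'z', 'q');
""")

# One statement: every CTE is emitted as (tag, csv_line) rows.
tagged_lines = conn.execute("""
    WITH
    Q1 AS (SELECT col1, col2, col3 FROM table1),
    Q2 AS (SELECT AVG(col1) AS col1Avg, col2, col3
           FROM Q1 GROUP BY col2, col3),
    Q3 AS (SELECT AVG(col1Avg) AS avgOfAvg, col3 FROM Q2 GROUP BY col3)
    SELECT 'Q1' AS src, col1 || ',' || col2 || ',' || col3 AS line FROM Q1
    UNION ALL
    SELECT 'Q2', col1Avg || ',' || col2 || ',' || col3 FROM Q2
    UNION ALL
    SELECT 'Q3', avgOfAvg || ',' || col3 FROM Q3
""").fetchall()

# Split the combined result into one list of CSV lines per CTE.
files = {}
for src, line in tagged_lines:
    files.setdefault(src, []).append(line)

for src in sorted(files):
    print(src, files[src])
```

Each list in `files` could then be written to its own file. Plain concatenation does no CSV quoting, so values that contain commas, quotes, or NULLs would need extra handling before this is used on real data.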

How to filter sql duplicates?

My question: I want records without duplicates, both within a single table and across multiple tables. How can I do this in SQL?
Here is what I have tried:
Select distinct Col1, col2
from Table
where order id = 143
Output
VolumeAnswer1  AreaAnswer1  heightAnswer1
VolumeAnswer2  AreaAnswer1  heightAnswer2
VolumeAnswer3  AreaAnswer1  heightAnswer2
Expected Output
The output repeats duplicate values within a column, but I need it to look like this:
VolumeAnswer1  AreaAnswer1  heightAnswer1
VolumeAnswer2               heightAnswer2
VolumeAnswer3
I need the same for multiple tables; I found the same duplicates when using joins as well. If this cannot be handled in SQL Server, how can we handle it in .NET? I used multiple selects, but it had to be changed to a single select. Each column should be bound to a dropdownlist...
Something like this might be a good place to start:
;with cte1 as (
Select col1, cnt1
From (
Select
col1
,row_number() over(Partition by col1 Order by col1) as cnt1
From tbltest) as tbl_sub1
Where cnt1 = 1
), cte2 as (
Select col2, cnt2
From (
Select
col2
,row_number() over(Partition by col2 Order by col2) as cnt2
From tbltest) as tbl_sub2
Where cnt2 = 1
), cte3 as (
Select col3, cnt3
From (
Select
col3
,row_number() over(Partition by col3 Order by col3) as cnt3
From tbltest) as tbl_sub3
Where cnt3 = 1
)
Select
col1, col2, col3
From cte1
full join cte2 on col1 = col2
full join cte3 on col1 = col3
Sql Fiddle showing example: http://sqlfiddle.com/#!3/c9127/1
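Since the question also asks about handling this on the client side, here is a hedged sketch of that alternative. It is not the SQL approach above: it takes the rows as returned by the original SELECT, removes duplicates within each column while keeping first-seen order, and lines the shortened columns up by position. The `rows` list copies the sample output from the question, and `distinct_in_order` is a hypothetical helper name.

```python
from itertools import zip_longest

# Rows as returned by the original SELECT (sample values from the question).
rows = [
    ("VolumeAnswer1", "AreaAnswer1", "heightAnswer1"),
    ("VolumeAnswer2", "AreaAnswer1", "heightAnswer2"),
    ("VolumeAnswer3", "AreaAnswer1", "heightAnswer2"),
]


def distinct_in_order(values):
    """Return values with duplicates removed, keeping first-seen order."""
    return list(dict.fromkeys(values))


# Transpose rows into columns, de-duplicate each column independently,
# then zip the columns back into rows, padding short columns with "".
columns = [distinct_in_order(column) for column in zip(*rows)]
aligned = list(zip_longest(*columns, fillvalue=""))

for row in aligned:
    print(row)
```

Each de-duplicated column could also be bound to its own dropdown directly, in which case the final `zip_longest` step is not needed.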