Find all rows with the same value in Col1 but different values in Col2 - tsql

Given a table similar to this:
Col1 Col2
---- ----
A A
A A
B B
C C
C D
I'm trying to write a query which will identify all values in Col1 which appear more than once AND have differing values in Col2. So a query that would return only rows with C in Col1 (because there are two rows with C in Col1, and they have differing values in Col2).

Group by col1 and keep only the groups that have more than one distinct col2 value. Such groups automatically contain more than one row for that col1 value too.
select col1
from your_table
group by col1
having count(distinct col2) > 1
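
To check the query against the sample data from the question, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for SQL Server (the GROUP BY / HAVING part is standard SQL); `your_table` is the placeholder name from the answer.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("A", "A"), ("A", "A"), ("B", "B"), ("C", "C"), ("C", "D")],
)

# Keep only the col1 groups that contain more than one distinct col2 value.
rows = conn.execute(
    """
    SELECT col1
    FROM your_table
    GROUP BY col1
    HAVING COUNT(DISTINCT col2) > 1
    """
).fetchall()

dup_values = [r[0] for r in rows]
print(dup_values)  # ['C']
```

A is excluded because both of its rows carry the same col2 value, and B because it has only one row.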

Related

Set value of a column based on another column

I have the following table in Postgres 11.
col1  col2  source    col3
a     abc   curation  rejected
a     abc   DB
b     etg   DB        accepted
c     jfh   curation
How can I assign a value to col3 based on the values in col1?
The expected output is:
col1  col2  source    col3
a     abc   curation  rejected
a     abc   DB        rejected
b     etg   DB        accepted
c     jfh   curation  null
Is there a way to check whether the values in col1 and col2 are identical across rows and, if so, assign the same col3 to all of those rows (when the col3 value in the other row is null)?
Any help is highly appreciated.
You're not entirely clear on what the criteria are. At a basic level it depends on how you want to query this data; there are multiple ways you could do this.
Generated Columns
drop table if exists atable;
create table atable (
  cola text,
  -- generated columns require PostgreSQL 12 or later
  colb text generated always as (case when cola = 'a' then 'rejected' else null end) stored
);
insert into atable (cola) values ('a');
A View.
create or replace view aview as
select cola, case when cola='a' then 'rejected' else null end as colb
from atable;
Both would yield the same results.
cola|colb |
----+--------+
a |rejected|
Other options could be a materialized view or simple query logic.
You have options.
update a2 set
col3 =
case when col1 = 'a' then 'rejected'
when col1 = 'b' then 'accepted'
when col1 = 'c' then null end -- SQL NULL, not the string 'null'
where col3 is null
returning *;
You can also use triggers. Generated columns are only available from PostgreSQL 12 onwards, so you would need to upgrade from 11 to use them.
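
If the goal is to copy col3 from another row that shares the same col1 and col2, rather than hard-coding values per col1, a correlated subquery in the UPDATE is one possible approach. The sketch below runs it in Python against an in-memory SQLite database as a stand-in for Postgres. It assumes each (col1, col2) group has at most one distinct non-null col3; MAX is only used to pick that value, and the alias `src` is made up for the example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres; a2 is the table name used above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a2 (col1 TEXT, col2 TEXT, source TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO a2 VALUES (?, ?, ?, ?)",
    [
        ("a", "abc", "curation", "rejected"),
        ("a", "abc", "DB", None),
        ("b", "etg", "DB", "accepted"),
        ("c", "jfh", "curation", None),
    ],
)

# Fill each NULL col3 from a row with the same col1 and col2.
# MAX ignores NULLs, so a group without any value stays NULL.
conn.execute(
    """
    UPDATE a2
    SET col3 = (
        SELECT MAX(src.col3)
        FROM a2 AS src
        WHERE src.col1 = a2.col1
          AND src.col2 = a2.col2
    )
    WHERE col3 IS NULL
    """
)

result = conn.execute(
    "SELECT col1, source, col3 FROM a2 ORDER BY rowid"
).fetchall()
for row in result:
    print(row)
```

With the sample data, the second `a` row picks up `rejected`, while `c` stays null because no other row in its group has a value.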

Filter rows in PostgreSQL table based on column match conditions

I have the following table in PostgreSQL 11.0
col1 col2 col3 col4
1 a a a
1 a a a_1
1 a a a_2
1 b b c
2 d d c
3 e d e
I would like to filter the above table so that, when a row has col2 equal to col4, only that matching row is selected and the other rows with the same col2 (the two rows below it) are excluded. When col2 and col4 are not equal, rows with col2 = col3 should be kept.
The desired output is:
col1 col2 col3 col4
1 a a a
1 b b c
2 d d c
3 e d e
I am trying the following query, with no success so far.
select * from table1
where col2=col4
union
select * from table1
where col2 != col4 and col2=col3
But this also includes rows for which a match already exists, which I want to exclude from the final output:
1 a a a_1
1 a a a_2
I would use
SELECT DISTINCT ON (col2) *
FROM table1
WHERE col2 = col4 OR col2 = col3
ORDER BY col2, col2 IS DISTINCT FROM col4;
This relies on FALSE < TRUE.
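
To see why the ordering trick works, here is a small pure-Python emulation of the query on the sample rows. This is only a sketch of the logic, not how Postgres executes it. It mirrors the WHERE filter, the ORDER BY (Python also sorts False before True), and DISTINCT ON by keeping the first row per col2. The sample data contains no NULLs, so plain `!=` is used in place of IS DISTINCT FROM.

```python
from itertools import groupby

# Rows are (col1, col2, col3, col4), taken from the question.
rows = [
    (1, "a", "a", "a"),
    (1, "a", "a", "a_1"),
    (1, "a", "a", "a_2"),
    (1, "b", "b", "c"),
    (2, "d", "d", "c"),
    (3, "e", "d", "e"),
]

# WHERE col2 = col4 OR col2 = col3
candidates = [r for r in rows if r[1] == r[3] or r[1] == r[2]]

# ORDER BY col2, col2 IS DISTINCT FROM col4
# False sorts before True, so rows with col2 = col4 come first in each group.
candidates.sort(key=lambda r: (r[1], r[1] != r[3]))

# DISTINCT ON (col2): keep only the first row of each col2 group.
result = [next(group) for _, group in groupby(candidates, key=lambda r: r[1])]

for row in result:
    print(row)
# (1, 'a', 'a', 'a')
# (1, 'b', 'b', 'c')
# (2, 'd', 'd', 'c')
# (3, 'e', 'd', 'e')
```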
As per my understanding, you want a unique col2 in your result, subject to the given conditions.
Try this:
with cte as
(select *,
case
when col2=col4 then 2
when col2=col3 then 1
else 0
end "flag" from table1 )
select distinct on(col2) col1,col2,col3,col4 from cte where flag>0
order by col2, "flag" desc

select distinct values in multiple column and save in common column with column tags

I have a table in postgres with two columns:
col1 col2
a a
b c
d e
f f
I would like to collect the distinct values from both columns into a single column and then tag each value with the name of the column(s) it came from. The desired output is:
col  source
a    col1, col2
b    col1
c    col2
d    col1
e    col2
f    col1, col2
I am able to find the distinct values in each individual column, but I am not able to combine them into a single column and add the source label.
Below is the query I am using:
select distinct on (col1, col2) col1, col2 from table
Any suggestions would be really helpful.
You can un-pivot the columns and then aggregate them back:
select u.value, string_agg(distinct u.source, ',' order by u.source)
from data
cross join lateral (
values('col1', col1), ('col2', col2)
)as u(source,value)
group by u.value
order by u.value;
Alternatively, if you don't want to list each column, you can convert the row to a JSON value and then un-pivot that:
select x.value, string_agg(distinct x.source, ',' order by x.source)
from data d
cross join lateral jsonb_each_text(to_jsonb(d)) as x(source, value)
group by x.value
order by x.value;
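
The un-pivot-then-aggregate idea can be traced by hand with a short pure-Python sketch on the sample data. It only illustrates the logic of the queries above; the variable names are made up for the example.

```python
from collections import defaultdict

# Sample rows from the question, with the column names in order.
columns = ("col1", "col2")
rows = [("a", "a"), ("b", "c"), ("d", "e"), ("f", "f")]

# Un-pivot: emit one (value, source column) pair per cell,
# collecting the distinct source columns for every value.
sources = defaultdict(set)
for row in rows:
    for name, value in zip(columns, row):
        sources[value].add(name)

# Aggregate back: one line per value, with its sorted source columns joined by ','.
result = [(value, ",".join(sorted(names))) for value, names in sorted(sources.items())]

for value, tags in result:
    print(value, tags)
# a col1,col2
# b col1
# c col2
# d col1
# e col2
# f col1,col2
```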

how to count the distinct rows from two tables using joins

I have two tables like
table1 table2
------------ ----------------
col1 col2 col1 col2
I need to count the distinct col1 values from table1 that match col1 in table2.
Note: col1 in table2 is also distinct.
select count(distinct table1.col1)
from table1,table2
where table1.col1=table2.col1
Because you count the distinct col1 values of table1 and the join condition equates them with table2.col1, the matching table2.col1 values are counted distinctly as well.
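
A quick way to try this is the sketch below, which runs in Python against an in-memory SQLite database as a stand-in for the real server. The sample values are invented for illustration, and the query is written with an explicit JOIN ... ON, which is equivalent to the comma join with a WHERE condition shown above.

```python
import sqlite3

# Assumption: in-memory SQLite and made-up sample values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT)")
conn.execute("CREATE TABLE table2 (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "x"), (1, "y"), (2, "z"), (3, "w")],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(1, "p"), (3, "q"), (4, "r")],
)

# Count each table1.col1 value once, and only if it also exists in table2.col1.
matched = conn.execute(
    """
    SELECT COUNT(DISTINCT table1.col1)
    FROM table1
    JOIN table2 ON table1.col1 = table2.col1
    """
).fetchone()[0]

print(matched)  # 2
```

The values 1 and 3 appear in both tables, so the count is 2 even though 1 occurs twice in table1.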

SELECT col1, COUNT(col2=0), AVG(col3) FROM table GROUP BY col1; in postgresql

I have a table with three columns, I wanted the below query to work.
SELECT col1, COUNT(col2=0), AVG(col3) FROM table GROUP BY col1;
What I'm trying to achieve is to group by column 1 and, in each row of the result, get the following:
col1
the count of rows with col2 = 0, grouped by col1
the average of col3, grouped by col1
But this is not working as I expect.
COUNT(DISTINCT col2) would count the number of distinct col2 values in each group; what I want is the number of rows in each group where col2 is zero.
When I use the above query I just get the normal COUNT, as if the equality were not there.
You can use CASE for the conditional, and then SUM to get the number of 0 entries.
SELECT col1, SUM (CASE WHEN (col2=0) THEN 1 ELSE 0 END ), AVG(col3) FROM table GROUP BY col1;
SELECT col1, SUM (CASE WHEN (col2=0) THEN 1 ELSE 0 END ), AVG(col3) FROM table GROUP BY col1;
The above query gives you the average of all col3 values in each col1 group.
select col1, count(col2), avg(col3) from table_name where col2 = 0 group by col1;
This query gives you the average of col3 over only those rows where col2 is 0.
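
The difference between the variants is easiest to see side by side. The sketch below uses Python with an in-memory SQLite database as a stand-in for PostgreSQL; the table name `t` and the sample values are made up. It also shows why COUNT(col2=0) behaves like a plain count: the comparison yields a non-null value for every row with a non-null col2, and COUNT counts every non-null value.

```python
import sqlite3

# Assumption: in-memory SQLite, hypothetical table t with invented sample values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER, col3 REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("x", 0, 10.0), ("x", 0, 20.0), ("x", 5, 30.0), ("y", 1, 4.0), ("y", 0, 6.0)],
)

# COUNT(col2 = 0) counts every row whose comparison result is non-null.
plain_count = conn.execute(
    "SELECT col1, COUNT(col2 = 0) FROM t GROUP BY col1 ORDER BY col1"
).fetchall()

# Conditional count via SUM(CASE ...); AVG still covers all rows of the group.
conditional = conn.execute(
    """
    SELECT col1, SUM(CASE WHEN col2 = 0 THEN 1 ELSE 0 END), AVG(col3)
    FROM t GROUP BY col1 ORDER BY col1
    """
).fetchall()

# Filtering in WHERE instead: both the count and the average
# only see the rows where col2 = 0.
filtered = conn.execute(
    """
    SELECT col1, COUNT(col2), AVG(col3)
    FROM t WHERE col2 = 0 GROUP BY col1 ORDER BY col1
    """
).fetchall()

print(plain_count)  # [('x', 3), ('y', 2)]
print(conditional)  # [('x', 2, 20.0), ('y', 1, 5.0)]
print(filtered)     # [('x', 2, 15.0), ('y', 1, 6.0)]
```

Note that with the WHERE variant a col1 group with no col2 = 0 rows disappears from the result entirely, whereas SUM(CASE ...) reports it with a count of 0.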