Weird behaviors of T-SQL in SQL Server - sql-server-2008-r2

For the data set below, I am trying to get the groups where the total of Emp_AVG <> 100. Somehow I am getting weird results in SQL Server that depend on the order of the data in the Emp_AVG column.
See the example below:
if object_id('tempdb..#temp1') is not null drop table #temp1
select 'F_TEST1' as First_Name, 'L_TEST1' as Last_Name, 'P' as Emp_Catagory, '99.99' as Emp_AVG, 'JAN' as Emp_Month into #temp1 union all
select 'F_TEST1' as First_Name, 'L_TEST1' as Last_Name, 'C' as Emp_Catagory, '33.3' as Emp_AVG, 'FEB' as Emp_Month union all
select 'F_TEST1' as First_Name, 'L_TEST1' as Last_Name, 'C' as Emp_Catagory, '33.3' as Emp_AVG, 'MAR' as Emp_Month union all
select 'F_TEST1' as First_Name, 'L_TEST1' as Last_Name, 'C' as Emp_Catagory, '33.4' as Emp_AVG, 'APR' as Emp_Month union all
select 'F_TEST2' as First_Name, 'L_TEST2' as Last_Name, 'P' as Emp_Catagory, '99.98' as Emp_AVG, 'JAN' as Emp_Month union all
select 'F_TEST2' as First_Name, 'L_TEST2' as Last_Name, 'C' as Emp_Catagory, '33.3' as Emp_AVG, 'FEB' as Emp_Month union all
select 'F_TEST2' as First_Name, 'L_TEST2' as Last_Name, 'C' as Emp_Catagory, '33.4' as Emp_AVG, 'MAR' as Emp_Month union all
select 'F_TEST2' as First_Name, 'L_TEST2' as Last_Name, 'C' as Emp_Catagory, '33.3' as Emp_AVG, 'APR' as Emp_Month union all
select 'F_TEST3' as First_Name, 'L_TEST3' as Last_Name, 'P' as Emp_Catagory, '99.97' as Emp_AVG, 'JAN' as Emp_Month union all
select 'F_TEST3' as First_Name, 'L_TEST3' as Last_Name, 'C' as Emp_Catagory, '33.4' as Emp_AVG, 'FEB' as Emp_Month union all
select 'F_TEST3' as First_Name, 'L_TEST3' as Last_Name, 'C' as Emp_Catagory, '33.3' as Emp_AVG, 'MAR' as Emp_Month union all
select 'F_TEST3' as First_Name, 'L_TEST3' as Last_Name, 'C' as Emp_Catagory, '33.3' as Emp_AVG, 'APR' as Emp_Month
--select * from #temp1
select First_Name,Last_Name, Emp_Catagory, sum(cast(Emp_AVG as float)) as Total_AVG
from #temp1
Group by First_Name,Last_Name, Emp_Catagory
having Sum(cast(Emp_AVG as float)) <> 100
order by sum(cast(Emp_AVG as float)) desc
/***************************************************************/
I would really appreciate it if anyone could provide a solution to this.
Regards,
Jigar B.

My guess is the "strange" result you're seeing is that there are results where Total_AVG seems to be equal to 100? Like this:
/---------------------------------------------------\
| First_Name | Last_Name | Emp_Catagory | Total_AVG |
|------------+-----------+--------------+-----------|
| F_TEST2 | L_TEST2 | C | 100 |
| F_TEST3 | L_TEST3 | C | 100 |
| F_TEST1 | L_TEST1 | P | 99.99 |
| F_TEST2 | L_TEST2 | P | 99.98 |
| F_TEST3 | L_TEST3 | P | 99.97 |
\---------------------------------------------------/
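float is an approximate binary type, so values such as 33.3 and 33.4 are not stored exactly. The sum can then differ from 100 by a tiny amount, depending on the order in which the rows are added, even though it is displayed as 100. Try casting your values as decimal instead: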
select
First_Name,
Last_Name,
Emp_Catagory,
sum(cast(Emp_AVG as decimal(4,2))) as Total_AVG
from #temp1
group by
First_Name,
Last_Name,
Emp_Catagory
having sum(cast(Emp_AVG as decimal(4,2))) <> 100
order by sum(cast(Emp_AVG as decimal(4,2))) desc
You then only see the results you are expecting:
/---------------------------------------------------\
| First_Name | Last_Name | Emp_Catagory | Total_AVG |
|------------+-----------+--------------+-----------|
| F_TEST1 | L_TEST1 | P | 99.99 |
| F_TEST2 | L_TEST2 | P | 99.98 |
| F_TEST3 | L_TEST3 | P | 99.97 |
\---------------------------------------------------/
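The same effect can be reproduced outside SQL Server. The following is a minimal Python sketch, not T-SQL: it assumes that Python's float behaves like the SQL float type (both are normally IEEE 754 doubles) and that Decimal stands in for decimal(4,2). The names VALUES, float_total and decimal_total are made up for this illustration.

```python
from decimal import Decimal
from itertools import permutations

# The three 'C' values from the question, kept as strings like the Emp_AVG column.
VALUES = ["33.3", "33.3", "33.4"]


def float_total(order):
    """Sum the values as binary floating point, similar to cast(... as float)."""
    return sum(float(v) for v in order)


def decimal_total(order):
    """Sum the values as exact decimals, similar to cast(... as decimal(4,2))."""
    return sum(Decimal(v) for v in order)


if __name__ == "__main__":
    for order in sorted(set(permutations(VALUES))):
        # The float total may or may not be exactly 100.0 depending on the order;
        # the decimal total is exactly 100 for every order.
        print(order, repr(float_total(order)), float_total(order) != 100,
              decimal_total(order) != 100)
```

If float has to stay, comparing against a small tolerance (for example abs(total - 100) > 0.000001) instead of using <> 100 is another option.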

Related

Oracle SQL Listagg remove duplicates with case statement conditions

I am trying to show repeated column values as a comma-separated list using listagg, but I get the error "ORA-00937: not a single-group group function". I hope I can get some help.
Below is the DDL script with insert statements and data:
DROP TABLE dept CASCADE CONSTRAINTS;
DROP TABLE myrole CASCADE CONSTRAINTS;
DROP TABLE person CASCADE CONSTRAINTS;
DROP TABLE person_role CASCADE CONSTRAINTS;
CREATE TABLE dept (
id INTEGER NOT NULL,
dept VARCHAR2(50 CHAR)
);
INSERT INTO dept (
id,
dept
) VALUES (
1,
'Operations'
);
INSERT INTO dept (
id,
dept
) VALUES (
2,
'Research'
);
INSERT INTO dept (
id,
dept
) VALUES (
3,
'Accounts'
);
INSERT INTO dept (
id,
dept
) VALUES (
4,
'Sales'
);
ALTER TABLE dept ADD CONSTRAINT dept_pk PRIMARY KEY ( id );
CREATE TABLE myrole (
id INTEGER NOT NULL,
role VARCHAR2(50 CHAR)
);
INSERT INTO myrole (
id,
role
) VALUES (
1,
'JJJ'
);
INSERT INTO myrole (
id,
role
) VALUES (
2,
'Auth'
);
INSERT INTO myrole (
id,
role
) VALUES (
3,
'AAA'
);
INSERT INTO myrole (
id,
role
) VALUES (
4,
'MMM'
);
INSERT INTO myrole (
id,
role
) VALUES (
5,
'KKK'
);
INSERT INTO myrole (
id,
role
) VALUES (
6,
'BBB'
);
ALTER TABLE myrole ADD CONSTRAINT myrole_pk PRIMARY KEY ( id );
CREATE TABLE person (
id INTEGER NOT NULL,
person VARCHAR2(50 CHAR)
);
INSERT INTO person (
id,
person
) VALUES (
1,
'John'
);
INSERT INTO person (
id,
person
) VALUES (
2,
'Scott'
);
INSERT INTO person (
id,
person
) VALUES (
3,
'Ruth'
);
INSERT INTO person (
id,
person
) VALUES (
4,
'Smith'
);
INSERT INTO person (
id,
person
) VALUES (
5,
'Frank'
);
INSERT INTO person (
id,
person
) VALUES (
6,
'Martin'
);
INSERT INTO person (
id,
person
) VALUES (
7,
'Blake'
);
ALTER TABLE person ADD CONSTRAINT person_pk PRIMARY KEY ( id );
CREATE TABLE person_role (
id INTEGER NOT NULL,
person_id INTEGER NOT NULL,
role_id INTEGER NOT NULL,
dept_id INTEGER
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
1,
1,
1,
NULL
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
2,
2,
2,
NULL
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
3,
2,
4,
1
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
4,
2,
4,
2
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
5,
3,
1,
NULL
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
6,
3,
5,
NULL
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
7,
4,
3,
NULL
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
8,
5,
6,
NULL
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
9,
6,
6,
3
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
10,
6,
6,
2
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
11,
6,
2,
NULL
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
12,
7,
6,
4
);
INSERT INTO person_role (
id,
person_id,
role_id,
dept_id
) VALUES (
13,
7,
6,
4
);
ALTER TABLE person_role ADD CONSTRAINT person_role_pk PRIMARY KEY ( id );
ALTER TABLE person_role
ADD CONSTRAINT person_role_myrole_fk FOREIGN KEY ( role_id )
REFERENCES myrole ( id );
ALTER TABLE person_role
ADD CONSTRAINT person_role_person_fk FOREIGN KEY ( person_id )
REFERENCES person ( id );
CREATE SEQUENCE dept_seq START WITH 1 NOCACHE;
CREATE OR REPLACE TRIGGER dept_tr BEFORE
INSERT ON dept
FOR EACH ROW
WHEN ( new.id IS NULL )
BEGIN
:new.id := dept_seq.nextval;
END;
/
CREATE SEQUENCE myrole_seq START WITH 1 NOCACHE;
CREATE OR REPLACE TRIGGER myrole_tr BEFORE
INSERT ON myrole
FOR EACH ROW
WHEN ( new.id IS NULL )
BEGIN
:new.id := myrole_seq.nextval;
END;
/
CREATE SEQUENCE person_seq START WITH 1 NOCACHE;
CREATE OR REPLACE TRIGGER person_tr BEFORE
INSERT ON person
FOR EACH ROW
WHEN ( new.id IS NULL )
BEGIN
:new.id := person_seq.nextval;
END;
/
CREATE SEQUENCE person_role_seq START WITH 1 NOCACHE;
CREATE OR REPLACE TRIGGER person_role_tr BEFORE
INSERT ON person_role
FOR EACH ROW
WHEN ( new.id IS NULL )
BEGIN
:new.id := person_role_seq.nextval;
END;
/
Using the query below, which @Koen Lostrie provided, and adding the columns I need, I get the output shown:
SELECT p.person, r.role as myrole, d.dept,
CASE
WHEN rl.role_type = 1 AND r.role IN ('AAA','BBB') THEN 'Add'
WHEN rl.role_type = 0 AND r.role = 'Auth' THEN 'Remove'
END as myaccess
FROM person_role pr
JOIN person p ON p.id = pr.person_id
JOIN myrole r ON r.id = pr.role_id
JOIN (
SELECT p.id, MIN(CASE WHEN r.ROLE = 'Auth' THEN 0 WHEN r.ROLE in ('AAA','BBB') THEN 1 ELSE 2 END) as role_type
FROM person_role pr
JOIN person p ON p.id = pr.person_id
JOIN myrole r ON r.id = pr.role_id
GROUP BY p.id
) rl ON rl.id = pr.person_id
left join dept d on d.id = pr.dept_id
Output from query:
+--------+--------+------------+----------+
| PERSON | MYROLE | DEPT | MYACCESS |
+--------+--------+------------+----------+
| John | JJJ | | |
| Scott | Auth | | Remove |
| Scott | MMM | Operations | |
| Scott | MMM | Research | |
| Ruth | JJJ | | |
| Ruth | KKK | | |
| Smith | AAA | | Add |
| Frank | BBB | | Add |
| Martin | AAA | Accounts | |
| Martin | AAA | Research | |
| Martin | Auth | | Remove |
| Blake | BBB | Sales | |
| Blake | BBB | Sales | Add |
+--------+--------+------------+----------+
Now I want to show the DEPT values as a comma-separated list per PERSON and MYROLE. The expected output is shown below:
+--------+--------+---------------------+----------+
| PERSON | MYROLE | DEPT | MYACCESS |
+--------+--------+---------------------+----------+
| John | JJJ | | |
| Scott | Auth | | Remove |
| Scott | MMM | Operations,Research | |
| Ruth | JJJ | | |
| Ruth | KKK | | |
| Smith | AAA | | Add |
| Frank | BBB | | Add |
| Martin | AAA | Accounts,Research | |
| Martin | Auth | | Remove |
| Blake | BBB | Sales | Add |
+--------+--------+---------------------+----------+
I added listagg to the existing query but I am getting an error:
SELECT p.person, r.role as myrole,
listagg(d.dept, ', ') within group (order by d.dept) as dept,
CASE
WHEN rl.role_type = 1 AND r.role IN ('AAA','BBB') THEN 'Add'
WHEN rl.role_type = 0 AND r.role = 'Auth' THEN 'Remove'
END as myaccess
FROM person_role pr
JOIN person p ON p.id = pr.person_id
JOIN myrole r ON r.id = pr.role_id
JOIN (
SELECT p.id, MIN(CASE WHEN r.ROLE = 'Auth' THEN 0 WHEN r.ROLE in ('AAA','BBB') THEN 1 ELSE 2 END) as role_type
FROM person_role pr
JOIN person p ON p.id = pr.person_id
JOIN myrole r ON r.id = pr.role_id
GROUP BY p.id
) rl ON rl.id = pr.person_id
left join dept d on d.id = pr.dept_id
I get the "not a single-group group function" error and am not sure how to fix it. I'd appreciate any help.
Thanks,
Richa
LISTAGG is an aggregate function. If you apply it to a column, then you need to specify in the query what columns you're grouping by. Typically that is all the columns that don't have an aggregate function.
I didn't test this, since there is no sample data for the dept table or the person_role table, but this is probably the issue:
SELECT p.person, r.role as myrole, listagg(d.dept, ', ') within group (order by d.dept) as dept_list,
CASE
WHEN rl.role_type = 1 AND r.role IN ('AAA','BBB') THEN 'Add'
WHEN rl.role_type = 0 AND r.role = 'Auth' THEN 'Remove'
END as myaccess
FROM person_role pr
JOIN person p ON p.id = pr.person_id
JOIN myrole r ON r.id = pr.role_id
JOIN (
SELECT p.id, MIN(CASE WHEN r.ROLE = 'Auth' THEN 0 WHEN r.ROLE in ('AAA','BBB') THEN 1 ELSE 2 END) as role_type
FROM person_role pr
JOIN person p ON p.id = pr.person_id
JOIN myrole r ON r.id = pr.role_id
GROUP BY p.id
) rl ON rl.id = pr.person_id
left join dept d on d.id = pr.dept_id
GROUP BY
p.person,
r.role,
CASE
WHEN rl.role_type = 1 AND r.role IN ('AAA','BBB') THEN 'Add'
WHEN rl.role_type = 0 AND r.role = 'Auth' THEN 'Remove'
END
ORDER BY p.person
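To make the grouping rule concrete, here is a small Python sketch of what GROUP BY plus LISTAGG does: every non-aggregated column becomes part of the group key, and the remaining column is joined per group. It is an illustration only, not Oracle code. The function name listagg_by_group and the sample rows are hypothetical, and it also removes duplicate departments, which plain LISTAGG does not do (as far as I know, Oracle 19c and later accept LISTAGG(DISTINCT ...) for that).

```python
from collections import defaultdict


def listagg_by_group(rows, sep=","):
    """Group (person, role, dept, access) rows and join the distinct depts.

    The group key is every column except dept, mirroring the GROUP BY list.
    Groups with no non-null dept yield None, like LISTAGG over only NULLs.
    """
    groups = defaultdict(set)
    for person, role, dept, access in rows:
        depts = groups[(person, role, access)]  # creates the group if missing
        if dept is not None:
            depts.add(dept)
    return [
        (person, role, sep.join(sorted(depts)) or None, access)
        for (person, role, access), depts in groups.items()
    ]


# Hypothetical rows, loosely based on the output shown in the question.
SAMPLE_ROWS = [
    ("Scott", "Auth", None, "Remove"),
    ("Scott", "MMM", "Operations", None),
    ("Scott", "MMM", "Research", None),
    ("Blake", "BBB", "Sales", "Add"),
    ("Blake", "BBB", "Sales", "Add"),
]

if __name__ == "__main__":
    for row in listagg_by_group(SAMPLE_ROWS):
        print(row)
```

Note that rows only collapse into one group when all key columns match, so two rows that differ in MYACCESS stay separate.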

Postgresql crosstab query with multiple "row name" columns

I have a table that is a "tall skinny" fact table:
CREATE TABLE facts(
eff_date timestamp NOT NULL,
update_date timestamp NOT NULL,
symbol_id int4 NOT NULL,
data_type_id int4 NOT NULL,
source_id char(3) NOT NULL,
fact decimal,
/* Keys */
CONSTRAINT fact_pk
PRIMARY KEY (source_id, symbol_id, data_type_id, eff_date)
);
I'd like to "pivot" this for a report, so the header looks like this:
eff_date, symbol_id, source_id, datatypeValue1, ... DatatypeValueN
I.e., I'd like a row for each unique combination of eff_date, symbol_id, and source_id.
However, the PostgreSQL crosstab() function only allows one key column.
Any ideas?
crosstab() expects the following columns from its input query (1st parameter), in this order:
a row_name
(optional) extra columns
a category (matching values in 2nd crosstab parameter)
a value
You don't have a row_name. Add a surrogate row_name with the window function dense_rank().
Your question leaves room for interpretation. Let's add sample rows for demonstration:
INSERT INTO facts (eff_date, update_date, symbol_id, data_type_id, source_id)
VALUES
(now(), now(), 1, 5, 'foo')
, (now(), now(), 1, 6, 'foo')
, (now(), now(), 1, 7, 'foo')
, (now(), now(), 1, 6, 'bar')
, (now(), now(), 1, 7, 'bar')
, (now(), now(), 1, 23, 'bar')
, (now(), now(), 1, 5, 'baz')
, (now(), now(), 1, 23, 'baz'); -- only two rows for 'baz'
Interpretation #1: first N values
You want to list the first N values of data_type_id (the smallest, if there are more) for each distinct (source_id, symbol_id, eff_date).
For this, you also need a synthetic category, which can be synthesized with row_number(). The basic query to produce input to crosstab():
SELECT dense_rank() OVER (ORDER BY eff_date, symbol_id, source_id)::int AS row_name
, eff_date, symbol_id, source_id -- extra columns
, row_number() OVER (PARTITION BY eff_date, symbol_id, source_id
ORDER BY data_type_id)::int AS category
, data_type_id AS value
FROM facts
ORDER BY row_name, category;
Crosstab query:
SELECT *
FROM crosstab(
'SELECT dense_rank() OVER (ORDER BY eff_date, symbol_id, source_id)::int AS row_name
, eff_date, symbol_id, source_id -- extra columns
, row_number() OVER (PARTITION BY eff_date, symbol_id, source_id
ORDER BY data_type_id)::int AS category
, data_type_id AS value
FROM facts
ORDER BY row_name, category'
, 'VALUES (1), (2), (3)'
) AS (row_name int, eff_date timestamp, symbol_id int, source_id char(3)
, datatype_1 int, datatype_2 int, datatype_3 int);
Results:
row_name | eff_date | symbol_id | source_id | datatype_1 | datatype_2 | datatype_3
-------: | :--------------| --------: | :-------- | ---------: | ---------: | ---------:
1 | 2017-04-10 ... | 1 | bar | 6 | 7 | 23
2 | 2017-04-10 ... | 1 | baz | 5 | 23 | null
3 | 2017-04-10 ... | 1 | foo | 5 | 6 | 7
Interpretation #2: actual values in column names
You want to append actual values of data_type_id to the column names datatypeValue1, ... DatatypeValueN. One or more of these:
SELECT DISTINCT data_type_id FROM facts ORDER BY 1;
5, 6, 7, 23 in the example. Then actual display values can be just boolean (or the redundant value?). Basic query:
SELECT dense_rank() OVER (ORDER BY eff_date, symbol_id, source_id)::int AS row_name
, eff_date, symbol_id, source_id -- extra columns
, data_type_id AS category
, TRUE AS value
FROM facts
ORDER BY row_name, category;
Crosstab query:
SELECT *
FROM crosstab(
'SELECT dense_rank() OVER (ORDER BY eff_date, symbol_id, source_id)::int AS row_name
, eff_date, symbol_id, source_id -- extra columns
, data_type_id AS category
, TRUE AS value
FROM facts
ORDER BY row_name, category'
, 'VALUES (5), (6), (7), (23)' -- actual values
) AS (row_name int, eff_date timestamp, symbol_id int, source_id char(3)
, datatype_5 bool, datatype_6 bool, datatype_7 bool, datatype_23 bool);
Result:
eff_date | symbol_id | source_id | datatype_5 | datatype_6 | datatype_7 | datatype_23
:--------------| --------: | :-------- | :--------- | :--------- | :--------- | :----------
2017-04-10 ... | 1 | bar | null | t | t | t
2017-04-10 ... | 1 | baz | t | null | null | t
2017-04-10 ... | 1 | foo | t | t | t | null
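For readers who want to see the pivot logic of interpretation #2 without crosstab(), the following Python sketch does the same thing in memory: the composite (eff_date, symbol_id, source_id) tuple plays the role of the surrogate row_name, and each category becomes a flag column. The function name pivot_flags and the literal date string are assumptions made for the example.

```python
def pivot_flags(rows, categories):
    """Pivot (eff_date, symbol_id, source_id, data_type_id) rows.

    Returns one tuple per distinct key, followed by one flag per category:
    True if that data_type_id occurs for the key, None otherwise.
    """
    table = {}
    for eff_date, symbol_id, source_id, data_type_id in rows:
        key = (eff_date, symbol_id, source_id)
        flags = table.setdefault(key, {c: None for c in categories})
        if data_type_id in flags:
            flags[data_type_id] = True
    return [
        key + tuple(flags[c] for c in categories)
        for key, flags in sorted(table.items())
    ]


# Same shape as the sample rows above; the date is a placeholder.
FACTS = [
    ("2017-04-10", 1, "foo", 5), ("2017-04-10", 1, "foo", 6),
    ("2017-04-10", 1, "foo", 7), ("2017-04-10", 1, "bar", 6),
    ("2017-04-10", 1, "bar", 7), ("2017-04-10", 1, "bar", 23),
    ("2017-04-10", 1, "baz", 5), ("2017-04-10", 1, "baz", 23),
]

if __name__ == "__main__":
    for row in pivot_flags(FACTS, [5, 6, 7, 23]):
        print(row)
```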
Related:
Crosstab function in Postgres returning a one row output when I expect multiple rows
Dynamic alternative to pivot with CASE and GROUP BY
Postgres - Transpose Rows to Columns

Flattening Postgres nested JSONB column

I'm looking to see how to flatten data nested in a JSONB column.
As an example, say we have the table users with user_id(int) and siblings(JSONB)
With rows like:
id | JSONB
---------------------
1 | {"brother": {"first_name":"Sam", "last_name":"Smith"}, "sister": {"first_name":"Sally", "last_name":"Smith"}}
2 | {"sister": {"first_name":"Jill"}}
I'm looking for a query that will return a response like:
id | sibling | first_name | last_name
-------------------------------------
1 | "brother" | "Sam" | "Smith"
1 | "sister" | "Sally" | "Smith"
2 | "sister" | "Jill" | null
I developed this for use in psql.
To test the code, I created a small view t1:
CREATE VIEW t1 AS (
SELECT 1 AS id, '{"brother": {"first_name":"Sam", "last_name":"Smith"}, "sister": {"first_name":"Sally", "last_name":"Smith"}}'::jsonb AS jsonb
UNION SELECT 2, '{"sister": {"first_name":"Jill", "last_name":"Johnson"}}'
UNION SELECT 3, '{"sister": {"first_name":"Jill", "x_name":"Johnson"}}'
);
The first task is to find the list of possible keys:
WITH fields AS (
SELECT DISTINCT jff.key
FROM t1,
jsonb_each(jsonb) AS jf,
jsonb_each(jf.value) AS jff
)
SELECT * FROM fields;
The result is:
key
------------
first_name
last_name
x_name
The next step is to generate the query:
SELECT 'SELECT id, jf.key as sibling, ' || (
WITH fields AS (
SELECT DISTINCT jff.key
FROM t1,
jsonb_each(jsonb) AS jf,
jsonb_each(jf.value) AS jff
)
SELECT string_agg('jf.value->>''' || key || ''' as "' || key || '"', ',' ORDER BY key)
FROM fields
)
|| ' FROM t1, jsonb_each(jsonb) AS jf ORDER BY 1, 2, 3;' AS cmd;
It returns:
cmd
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SELECT id, jf.key as sibling,jf.value->>'first_name' as "first_name",jf.value->>'last_name' as "last_name",jf.value->>'x_name' as "x_name" FROM t1, jsonb_each(jsonb) AS jf ORDER BY 1, 2, 3;
(1 row)
To store the result in a psql variable, I use \gset:
\gset
After that you can run the query:
:cmd
id | sibling | first_name | last_name | x_name
----+---------+------------+-----------+---------
1 | brother | Sam | Smith |
1 | sister | Sally | Smith |
2 | sister | Jill | Johnson |
3 | sister | Jill | | Johnson
(4 rows)
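The two steps above (discover the inner keys, then emit one row per sibling with one column per key) can also be sketched on the client side. This is a Python illustration under the assumption that the JSONB documents arrive as JSON strings; the function name flatten_siblings is hypothetical.

```python
import json


def flatten_siblings(rows):
    """Flatten (id, json_text) rows into one tuple per sibling.

    Returns (keys, flat_rows): keys is the sorted list of all inner keys,
    and each flat row is (id, sibling, value_for_key_1, value_for_key_2, ...),
    with None where a sibling lacks a key.
    """
    parsed = [(row_id, json.loads(doc)) for row_id, doc in rows]
    # Step 1: collect every key used by any sibling object.
    keys = sorted({k for _, doc in parsed for inner in doc.values() for k in inner})
    # Step 2: one output row per (id, sibling) pair.
    flat = [
        (row_id, sibling) + tuple(inner.get(k) for k in keys)
        for row_id, doc in parsed
        for sibling, inner in doc.items()
    ]
    return keys, sorted(flat, key=lambda r: r[:2])


# The same documents as in view t1.
T1 = [
    (1, '{"brother": {"first_name":"Sam", "last_name":"Smith"}, '
        '"sister": {"first_name":"Sally", "last_name":"Smith"}}'),
    (2, '{"sister": {"first_name":"Jill", "last_name":"Johnson"}}'),
    (3, '{"sister": {"first_name":"Jill", "x_name":"Johnson"}}'),
]

if __name__ == "__main__":
    keys, flat = flatten_siblings(T1)
    print(["id", "sibling"] + keys)
    for row in flat:
        print(row)
```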
To run it from other languages, you can create a Postgres function that returns the SQL command:
CREATE OR REPLACE FUNCTION build_query(IN tname text, OUT cmd text) AS $sql$
BEGIN
EXECUTE $cmd$
SELECT 'SELECT id, jf.key as sibling, ' || (
WITH fields AS (
SELECT DISTINCT jff.key
FROM $cmd$ || quote_ident(tname) || $cmd$,
jsonb_each(jsonb) AS jf,
jsonb_each(jf.value) AS jff
)
SELECT string_agg('jf.value->>''' || key || ''' as "' || key || '"', ',' ORDER BY key)
FROM fields
)
|| ' FROM $cmd$ || quote_ident(tname) || $cmd$ , jsonb_each(jsonb) AS jf ORDER BY 1, 2, 3;'$cmd$ INTO cmd;
RETURN;
END;
$sql$ LANGUAGE plpgsql;
SELECT * FROM build_query('t1');
cmd
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SELECT id, jf.key as sibling, jf.value->>'first_name' as "first_name",jf.value->>'last_name' as "last_name",jf.value->>'x_name' as "x_name" FROM t1 , jsonb_each(jsonb) AS jf ORDER BY 1, 2, 3;
(1 row)

How to compare the data of each column of two identical tables in Postgres?

I want to compare all column values of two tables. The two tables are identical in structure: they have the same columns and the same primary key. Can anyone suggest a query that compares two such tables in Postgres?
The query should return the column name and the two differing values from the two tables, like this:
pkey | column_name | table1_value | table2_value
123 | bonus | 1 | 0
To get all different rows you can use:
select *
from table_1 t1
join table_2 t2 on t1.pkey = t2.pkey
where t1 is distinct from t2;
This will only compare rows that exist in both tables. If you also want to find those that are missing in one of them, use a full outer join:
select coalesce(t1.pkey, t2.pkey) as pkey,
case
when t1.pkey is null then 'Missing in table_1'
when t2.pkey is null then 'Missing in table_2'
else 'At least one column is different'
end as status,
*
from table_1 t1
full join table_2 t2 on t1.pkey = t2.pkey
where (t1 is distinct from t2)
or (t1.pkey is null)
or (t2.pkey is null);
If you install the hstore extension, you can view the differences as a key/value map:
select coalesce(t1.pkey, t2.pkey) as pkey,
case
when t1.pkey is null then 'Missing in table_1'
when t2.pkey is null then 'Missing in table_2'
else 'At least one column is different'
end as status,
hstore(t1) - hstore(t2) as values_in_table_1,
hstore(t2) - hstore(t1) as values_in_table_2
from table_1 t1
full join table_2 t2 on t1.pkey = t2.pkey
where (t1 is distinct from t2)
or (t1.pkey is null)
or (t2.pkey is null);
Using this sample data:
create table table_1 (pkey integer primary key, col_1 text, col_2 int);
insert into table_1 (pkey, col_1, col_2)
values (1, 'a', 1), (2, 'b', 2), (3, 'c', 3), (5, 'e', 42);
create table table_2 (pkey integer primary key, col_1 text, col_2 int);
insert into table_2 (pkey, col_1, col_2)
values (1,'a', 1), (2, 'x', 2), (3, 'c', 33), (4, 'd', 52);
A possible result would be:
pkey | status | values_in_table_1 | values_in_table_2
-----+----------------------------------+-------------------+------------------
2 | At least one column is different | "col_1"=>"b" | "col_1"=>"x"
3 | At least one column is different | "col_2"=>"3" | "col_2"=>"33"
4 | Missing in table_1 | |
5 | Missing in table_2 | |
Example data:
create table test1(pkey serial primary key, str text, val int);
insert into test1 (str, val) values ('a', 1), ('b', 2), ('c', 3);
create table test2(pkey serial primary key, str text, val int);
insert into test2 (str, val) values ('a', 1), ('x', 2), ('c', 33);
This simple query gives complete information on the differences between the two tables (including rows missing in one of them):
(select 1 t, * from test1
except
select 1 t, * from test2)
union all
(select 2 t, * from test2
except
select 2 t, * from test1)
order by pkey, t;
t | pkey | str | val
---+------+-----+-----
1 | 2 | b | 2
2 | 2 | x | 2
1 | 3 | c | 3
2 | 3 | c | 33
(4 rows)
In Postgres 9.5+ you can transpose the result to the expected format using jsonb functions:
select pkey, key as column, val[1] as value_1, val[2] as value_2
from (
select pkey, key, array_agg(value order by t) val
from (
select t, pkey, key, value
from (
(select 1 t, * from test1
except
select 1 t, * from test2)
union all
(select 2 t, * from test2
except
select 2 t, * from test1)
) s,
lateral jsonb_each_text(to_jsonb(s))
group by 1, 2, 3, 4
) s
group by 1, 2
) s
where key <> 't' and val[1] <> val[2]
order by pkey;
pkey | column | value_1 | value_2
------+--------+---------+---------
2 | str | b | x
3 | val | 3 | 33
(2 rows)
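The target format (pkey, column, value from table 1, value from table 2) is easy to check against a small in-memory model. The Python sketch below assumes each table has already been loaded into a dict keyed by pkey; the function name diff_tables is hypothetical. It only reports rows present in both tables, and unlike the jsonb-based query it keeps the original value types instead of converting everything to text.

```python
def diff_tables(table1, table2):
    """Return (pkey, column, value_1, value_2) for every differing column.

    table1 and table2 map pkey -> {column: value}. Only keys present in
    both tables are compared; missing rows are not reported here.
    """
    diffs = []
    for pkey in sorted(table1.keys() & table2.keys()):
        row1, row2 = table1[pkey], table2[pkey]
        for column in row1:
            if row1[column] != row2[column]:
                diffs.append((pkey, column, row1[column], row2[column]))
    return diffs


# Same contents as the test1 and test2 example tables.
TEST1 = {1: {"str": "a", "val": 1}, 2: {"str": "b", "val": 2}, 3: {"str": "c", "val": 3}}
TEST2 = {1: {"str": "a", "val": 1}, 2: {"str": "x", "val": 2}, 3: {"str": "c", "val": 33}}

if __name__ == "__main__":
    for diff in diff_tables(TEST1, TEST2):
        print(diff)
```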
I tried all of the above answers. Thanks for your help, guys. But after googling I found a simpler query:
SELECT <common_column_list> from table1
EXCEPT
SELECT <common_column_list> from table2;
It returns every row of table1 that has no identical row in table2, that is, rows where at least one column value differs or that are missing from table2.
Not very nice but fun and it works :o)
Just replace public.mytable1 and public.mytable2 with the correct tables and
update the condition "where table_schema='public' and table_name='mytable1'" accordingly:
select * from (
select pkey,column_name,t1.col_value table1_value,t2.col_value table2_value from (
select pkey,generate_subscripts(t,1) ordinal_position,unnest(t) col_value from (
select pkey,
(
replace(regexp_replace( -- null fields
'{'||substring(a::character varying,'^.(.*).$') ||'}' -- {} instead of ()
,'([\{,])([,\}])','\1null\2','g'),',,',',null,')
)::TEXT[] t
from public.mytable1 a
) a) t1
left join (
select pkey,generate_subscripts(t,1) ordinal_position,unnest(t) col_value from (
select pkey,
(
replace(regexp_replace( -- null fields
'{'||substring(a::character varying,'^.(.*).$') ||'}' -- {} instead of ()
,'([\{,])([,\}])','\1null\2','g'),',,',',null,')
)::TEXT[] t
from public.mytable2 a
) a) t2 using (pkey,ordinal_position)
join (select * from information_schema.columns where table_schema='public' and table_name='mytable1') c using (ordinal_position)
) final where COALESCE(table1_value,'')!=COALESCE(table2_value,'')

List of columns per table

This query is working as expected:
select nspname, relname, max(attnum) as num_cols
from pg_attribute a, pg_namespace n, pg_class c
where n.oid = c.relnamespace and a.attrelid = c.oid
and c.relname not like '%pkey'
and n.nspname not like 'pg%'
and n.nspname not like 'information%'
group by 1, 2
order by 1, 2;
nspname | relname | num_cols
--------+----------+----------
public | category | 4
public | date | 8
public | event | 6
public | listing | 8
public | sales | 10
public | users | 18
public | venue | 5
But how do I get the list of columns per table?
Expected output:
nspname | relname | num_cols
--------+----------+----------
public | category | col1, col2, col3, col4
public | date | col1, col2, col3, col4 ..., col8
MySQL has a group_concat function that would apply here.
http://docs.aws.amazon.com/redshift/latest/dg/c_join_PG_examples.html
The following query mentioned on that page does not return any rows for me.
select distinct attrelid, rtrim(name), attname, typname
from pg_attribute a, pg_type t, stv_tbl_perm p
where t.oid=a.atttypid and a.attrelid=p.id
and a.attrelid between 100100 and 110000
and typname not in('oid','xid','tid','cid')
order by a.attrelid asc, typname, attname;
Make sure to include all the schemas that you want to look at in the search path.
http://docs.aws.amazon.com/redshift/latest/dg/r_search_path.html
set search_path to '$user', public, enterprise;
Redshift can't show you the columns of tables whose schemas are not in your user's search path. To access the table structure, you may also need usage rights on the schema.