Compare two tables and find the missing column using left join - select

I wanted to compare the two tables employees and employees_a and find the columns that are missing from the table employees_a.
select a.Column_name
From User_tab_columns a
LEFT JOIN User_tab_columns b
ON upper(a.table_name) = upper(b.table_name)||'_A'
AND a.column_name = b.column_name
Where upper(a.Table_name) = 'EMPLOYEES'
AND upper(b.table_name) = 'EMPLOYEES_A'
AND b.column_name is NULL
;
But this doesn't seem to be working. No rows are returned.
My employees table has the below columns:
emp_name
emp_id
base_location
department
current_location
salary
manager
The employees_a table has the below columns:
emp_name
emp_id
base_location
department
current_location
I want to find the remaining two columns and add them to the employees_a table.
I have more than 50 tables like this to compare; for each I need to find the missing columns and add them to the respective "_a" table.

Missing columns? Why not use the MINUS set operator? It seems way simpler, e.g.
select column_name from user_tab_columns where table_name = 'EMP_1'
minus
select column_name from user_tab_columns where table_name = 'EMP_2'
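If the same "_A" naming convention holds for all 50-plus tables, the MINUS idea can be generalized. This is only a sketch of that (untested; it assumes user_tab_columns and that every history table is named <TABLE>_A), listing each base table together with the columns its "_A" counterpart is missing:
-- every (base table, column) pair that has an _A counterpart ...
select a.table_name, a.column_name
from user_tab_columns a
where exists (select 1
              from user_tab_columns x
              where x.table_name = a.table_name || '_A')
minus
-- ... minus the pairs the _A tables actually contain
select substr(b.table_name, 1, length(b.table_name) - 2), b.column_name
from user_tab_columns b
where b.table_name like '%\_A' escape '\';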

Firstly, check whether the user_tab_columns view actually contains the columns of your tables (in my case user_tab_columns is empty and I have to use all_tab_columns):
select a.Column_name
From User_tab_columns a
Where upper(a.Table_name) = 'EMPLOYEES'
Secondly, remove the line AND upper(b.table_name) = 'EMPLOYEES_A', because upper(b.table_name) is NULL when a column is not found. You already restrict b.table_name in the JOIN part of the SELECT.
select a.Column_name
From User_tab_columns a
LEFT JOIN User_tab_columns b
ON upper(a.table_name) = upper(b.table_name)||'_A'
AND a.column_name = b.column_name
Where upper(a.Table_name) = 'EMPLOYEES'
AND b.column_name is NULL

You do not need any joins and can use:
select 'ALTER TABLE EMPLOYEES_A ADD "'
|| Column_name || '" '
|| CASE MAX(data_type)
WHEN 'NUMBER'
THEN 'NUMBER(' || MAX(data_precision) || ',' || MAX(data_scale) || ')'
WHEN 'VARCHAR2'
THEN 'VARCHAR2(' || MAX(data_length) || ')'
END
AS sql
From User_tab_columns
Where Table_name IN ('EMPLOYEES', 'EMPLOYEES_A')
GROUP BY COLUMN_NAME
HAVING COUNT(CASE table_name WHEN 'EMPLOYEES' THEN 1 END) = 1
AND COUNT(CASE table_name WHEN 'EMPLOYEES_A' THEN 1 END) = 0;
Or, for multiple tables:
select 'ALTER TABLE ' || MAX(table_name) || '_A ADD "'
|| Column_name || '" '
|| CASE MAX(data_type)
WHEN 'NUMBER'
THEN 'NUMBER(' || MAX(data_precision) || ',' || MAX(data_scale) || ')'
WHEN 'VARCHAR2'
THEN 'VARCHAR2(' || MAX(data_length) || ')'
END
AS sql
From User_tab_columns
Where Table_name IN ('EMPLOYEES', 'EMPLOYEES_A', 'SOMETHING', 'SOMETHING_A')
GROUP BY
CASE
WHEN table_name LIKE '%\_A' ESCAPE '\'
THEN SUBSTR(table_name, 1, LENGTH(table_name) - 2)
ELSE table_name
END,
COLUMN_NAME
HAVING COUNT(CASE WHEN table_name NOT LIKE '%\_A' ESCAPE '\' THEN 1 END) = 1
AND COUNT(CASE WHEN table_name LIKE '%\_A' ESCAPE '\' THEN 1 END) = 0;
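To avoid copy-pasting the generated statements for 50+ tables, the output can be fed straight into dynamic SQL. This is only a sketch (untested; it simply wraps the query above in a PL/SQL loop, with the same assumptions about data types):
begin
  for r in (
    select 'ALTER TABLE ' || max(table_name) || '_A ADD "'
           || column_name || '" '
           || case max(data_type)
                when 'NUMBER'   then 'NUMBER(' || max(data_precision) || ',' || max(data_scale) || ')'
                when 'VARCHAR2' then 'VARCHAR2(' || max(data_length) || ')'
              end as ddl
    from user_tab_columns
    where table_name in ('EMPLOYEES', 'EMPLOYEES_A', 'SOMETHING', 'SOMETHING_A')
    group by case
               when table_name like '%\_A' escape '\'
               then substr(table_name, 1, length(table_name) - 2)
               else table_name
             end,
             column_name
    having count(case when table_name not like '%\_A' escape '\' then 1 end) = 1
       and count(case when table_name like '%\_A' escape '\' then 1 end) = 0
  ) loop
    -- each generated statement is executed immediately; run the plain SELECT first to review them
    execute immediate r.ddl;
  end loop;
end;
/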


Postgres: Parsing query for database objects

I've got a procedure which runs for too long a period of time.
It parses queries into arrays and then searches for intersections with objects in the database.
In the first temp table I split every statement into an array.
The second combines all possible database objects into arrays.
In the third I look for intersections between the arrays.
Currently the procedure analyzes a 3-month time period.
I don't want to reduce that time period; maybe I will if nothing better is suggested.
I've read that a GIN index on the array may help. What do you think?
Maybe you did it another way?
Database: PostgreSQL 11
CREATE TEMP TABLE temp_array_data
AS
(
SELECT id,
pid,
regexp_split_to_array(query, '\s+') as query
FROM t_stat_session
WHERE query_start::DATE BETWEEN pdtQueryDateFrom AND pdtQueryDateTo
AND duration IS NOT NULL
);
CREATE TEMP TABLE temp_sys_objects_data
AS
(
SELECT string_to_array(schemaname || '.' || tablename, '.') object_arr1,
string_to_array(schemaname || '.' || tablename, ',') object_arr2,
schemaname,
tablename object_name,
'T' AS object_type
FROM pg_catalog.pg_tables
UNION ALL
SELECT string_to_array(schemaname || '.' || viewname, '.') object_arr1,
string_to_array(schemaname || '.' || viewname, ',') object_arr2,
schemaname,
viewname object_name,
'VW' AS object_type
FROM pg_catalog.pg_views
UNION ALL
SELECT string_to_array(schemaname || '.' || matviewname, '.') object_arr1,
string_to_array(schemaname || '.' || matviewname, ',') object_arr2,
schemaname,
matviewname object_name,
'MVW' AS object_type
FROM pg_catalog.pg_matviews
);
CREATE TEMP TABLE temp_data_for_final
AS
(
SELECT id,
pid,
schemaname,
object_name,
object_type,
1 cnt
FROM temp_array_data adta,
temp_sys_objects_data
WHERE (ARRAY [object_arr1] && ARRAY [query] OR ARRAY [object_arr2] <@ ARRAY [query])
);
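On the GIN question: a GIN index on a text[] column does support the && and <@ operators, but only when the indexed column appears directly in the predicate, not wrapped in another ARRAY[...] constructor. A minimal sketch (untested, using the temp table names above):
-- GIN index on the array column that the intersection test probes
CREATE INDEX idx_temp_array_data_query
    ON temp_array_data USING gin (query);
ANALYZE temp_array_data;
-- the join predicate should then reference the indexed column directly, e.g.:
-- WHERE object_arr1 && adta.query OR object_arr2 <@ adta.query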

Count null values in each column of a table : PSQL

I have a very big table, but as an example I will only provide a very small part of it:
col1 col2 col3 col4
10 2 12
13 4 11
0 1
3 5 111
I know how to find null values in one column. What I want to find is how many null values there are in each column, just by writing one query.
Thanks in advance.
You can use an aggregate with a filter:
select count(*) filter (where col1 is null) as col1_nulls,
count(*) filter (where col2 is null) as col2_nulls,
count(*) filter (where col3 is null) as col3_nulls,
count(*) filter (where col4 is null) as col4_nulls
from the_table;
I think you can generate this query on the fly. Here is an example of one way you can approach it:
CREATE OR REPLACE FUNCTION null_counts(tablename text)
RETURNS SETOF jsonb LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE 'SELECT to_jsonb(t) FROM (SELECT ' || (
SELECT string_agg('count(*) filter (where ' || a.attname::text || ' is null) as ' || a.attname || '_nulls', ',')
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = tablename::regclass
AND a.attnum > 0
AND a.attisdropped = false
) || ' FROM ' || tablename::regclass || ') as t';
END
$func$;
SELECT null_counts('your_table') AS val;
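One possible hardening (not part of the function above, just a variant of the same idea): use format() with %I so column and table names that need quoting do not break the generated SQL.
CREATE OR REPLACE FUNCTION null_counts_safe(tablename regclass)
RETURNS SETOF jsonb LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE 'SELECT to_jsonb(t) FROM (SELECT ' || (
   -- one "count(*) filter (...)" expression per live, non-system column
   SELECT string_agg(format('count(*) filter (where %I is null) as %I',
                            a.attname, a.attname || '_nulls'), ', ')
   FROM pg_catalog.pg_attribute a
   WHERE a.attrelid = tablename
   AND a.attnum > 0
   AND NOT a.attisdropped
) || ' FROM ' || tablename || ') as t';
END
$func$;
SELECT null_counts_safe('your_table') AS val;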

t-sql select column names from all tables where there is at least 1 null value

Context: I am exploring a new database (in MS SQL Server), and I want to know, for each table, all columns that have null values.
I.e. the result would look something like this:
table column nulls
Tbl1 Col1 8
I have found this code here on Stack Overflow, which builds a table of table and column names - the WHERE clause is my addition.
I tried to filter for nulls in the WHERE clause, but then the result ends up empty, and I see why - I am checking whether the column name itself is null, not its contents. But I can't figure out how to proceed.
select schema_name(tab.schema_id) as schema_name,
tab.name as table_name,
col.name as column_name
from sys.tables as tab
inner join sys.columns as col
on tab.object_id = col.object_id
left join sys.types as t
on col.user_type_id = t.user_type_id
-- in this WHERE clause I am trying to filter for nulls, but I get an empty result, and I know there are nulls
where col.name is null
order by schema_name, table_name, column_id
I also tried this (see 4th line):
select schema_name(tab.schema_id) as schema_name,
tab.name as table_name,
col.name as column_name
,(select count(*) from tab.name where col.name is null) as countnulls
from sys.tables as tab
inner join sys.columns as col
on tab.object_id = col.object_id
left join sys.types as t
on col.user_type_id = t.user_type_id
order by schema_name, table_name, column_id
The last one returns the error "Invalid object name 'tab.name'".
A column name can't be null, but if you mean a nullable column (a column that accepts null values) that contains at least one null value, then you can use the following statement:
declare @schema varchar(255), @table varchar(255), @col varchar(255), @cmd varchar(max)
DECLARE getinfo cursor for
SELECT schema_name(tab.schema_id) as schema_name, tab.name, col.name from sys.tables as tab
inner join sys.columns as col on tab.object_id = col.object_id
where col.is_nullable = 1
order by schema_name(tab.schema_id), tab.name, col.name
OPEN getinfo
FETCH NEXT FROM getinfo into @schema, @table, @col
WHILE @@FETCH_STATUS = 0
BEGIN
    set @schema = QUOTENAME(@schema)
    set @table = QUOTENAME(@table)
    set @col = QUOTENAME(@col)
    SELECT @cmd = 'IF EXISTS (SELECT 1 FROM ' + @schema + '.' + @table + ' WHERE ' + @col + ' IS NULL) BEGIN SELECT ''' + @schema + ''' as schemaName, ''' + @table + ''' as tablename, ''' + @col + ''' as columnName, * FROM ' + @schema + '.' + @table + ' WHERE ' + @col + ' IS NULL END'
    EXEC(@cmd)
    FETCH NEXT FROM getinfo into @schema, @table, @col
END
CLOSE getinfo
DEALLOCATE getinfo
This uses a cursor over all nullable columns in every table in the database, then checks whether each column has at least one null value; if it does, it selects the schema name, table name, column name, and all records that have a null value in that column.
But if you only want the count of nulls, you can use the following statement:
declare @schema varchar(255), @table varchar(255), @col varchar(255), @cmd varchar(max)
DECLARE getinfo cursor for
SELECT schema_name(tab.schema_id) as schema_name, tab.name, col.name from sys.tables as tab
inner join sys.columns as col on tab.object_id = col.object_id
where col.is_nullable = 1
order by schema_name(tab.schema_id), tab.name, col.name
OPEN getinfo
FETCH NEXT FROM getinfo into @schema, @table, @col
WHILE @@FETCH_STATUS = 0
BEGIN
    set @schema = QUOTENAME(@schema)
    set @table = QUOTENAME(@table)
    set @col = QUOTENAME(@col)
    SELECT @cmd = 'IF EXISTS (SELECT 1 FROM ' + @schema + '.' + @table + ' WHERE ' + @col + ' IS NULL) BEGIN SELECT ''' + @schema + ''' as schemaName, ''' + @table + ''' as tablename, ''' + @col + ''' as columnName, count(*) as nulls FROM ' + @schema + '.' + @table + ' WHERE ' + @col + ' IS NULL END'
    EXEC(@cmd)
    FETCH NEXT FROM getinfo into @schema, @table, @col
END
CLOSE getinfo
DEALLOCATE getinfo
This uses a cursor over all nullable columns in every table in the database, then checks whether each column has at least one null value; if it does, it selects the schema name, table name, column name, and the count of records that have a null value in that column.
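A possible refinement (a sketch only, not part of the answer above): collect each count into a temp table inside the same cursor loop, so everything comes back as a single result set.
declare @schema varchar(255), @table varchar(255), @col varchar(255), @cmd varchar(max)
create table #null_counts (schemaName sysname, tableName sysname, columnName sysname, nulls int)
DECLARE getinfo cursor for
SELECT schema_name(tab.schema_id) as schema_name, tab.name, col.name from sys.tables as tab
inner join sys.columns as col on tab.object_id = col.object_id
where col.is_nullable = 1
order by schema_name(tab.schema_id), tab.name, col.name
OPEN getinfo
FETCH NEXT FROM getinfo into @schema, @table, @col
WHILE @@FETCH_STATUS = 0
BEGIN
    -- count the nulls for this column; columns with zero nulls are filtered out at the end
    SELECT @cmd = 'INSERT INTO #null_counts SELECT ''' + @schema + ''', ''' + @table + ''', ''' + @col
                + ''', COUNT(*) FROM ' + QUOTENAME(@schema) + '.' + QUOTENAME(@table)
                + ' WHERE ' + QUOTENAME(@col) + ' IS NULL'
    EXEC(@cmd)
    FETCH NEXT FROM getinfo into @schema, @table, @col
END
CLOSE getinfo
DEALLOCATE getinfo
SELECT * FROM #null_counts WHERE nulls > 0 ORDER BY schemaName, tableName, columnName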

How do I find all the NUMERIC columns in a table and do a SUM() on them?

I have a few tables in Netezza, DB2 and PostgreSQL databases that I need to reconcile, and the best way we have come up with is to do a SUM() across all the NUMERIC table columns in all 3 databases.
Does anyone have a quick and simple way to find all the columns which are either NUMERIC, INTEGER or BIGINT and then run a SUM() on all of them?
For comparing the results, I can do it manually, or does someone have a way to capture these results in a common table and automatically check the differences in the SUMs?
For DB2 you can use this metadata query, which will help you find out the data type of each column:
SELECT
COLUMN_NAME || ' ' || REPLACE(REPLACE(DATA_TYPE,'DECIMAL','NUMERIC'),'CHARACTER','VARCHAR') ||
CASE
WHEN DATA_TYPE = 'TIMESTAMP' THEN ''
ELSE
' (' ||
CASE
WHEN CHARACTER_MAXIMUM_LENGTH IS NOT NULL THEN CAST(CHARACTER_MAXIMUM_LENGTH AS VARCHAR(30))
WHEN NUMERIC_PRECISION IS NOT NULL THEN CAST(NUMERIC_PRECISION AS VARCHAR(30)) ||
CASE
WHEN NUMERIC_SCALE = 0 THEN ''
ELSE ',' || CAST(NUMERIC_SCALE AS VARCHAR(3))
END
ELSE ''
END || ')'
END || ',' "SQLCOL",
COLUMN_NAME,
DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION, NUMERIC_SCALE, ORDINAL_POSITION
FROM SYSIBM.COLUMNS
WHERE TABLE_NAME = 'insert your table name'
AND TABLE_SCHEMA = 'insert your table schema'
ORDER BY ORDINAL_POSITION
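Building on that, here is only a sketch (untested) that uses the same SYSIBM.COLUMNS view to emit a SUM() select-list entry for every numeric column, in the spirit of the Netezza query below:
SELECT 'SUM(' || COLUMN_NAME || ') AS S_' || COLUMN_NAME || ',' AS SUMCOL
FROM SYSIBM.COLUMNS
WHERE TABLE_NAME = 'insert your table name'
AND TABLE_SCHEMA = 'insert your table schema'
AND DATA_TYPE IN ('DECIMAL', 'NUMERIC', 'INTEGER', 'BIGINT', 'SMALLINT')
ORDER BY ORDINAL_POSITION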
For Netezza, I got the following query:
SELECT 0 AS ATTNUM, 'SELECT' AS SQL
UNION
SELECT ATTNUM, 'SUM(' || ATTNAME || ') AS S_' || ATTNAME || ',' AS COLMN
FROM _V_RELATION_COLUMN RC
WHERE NAME = '<table-name>'
AND FORMAT_TYPE= 'NUMERIC'
UNION
SELECT 10000 AS ATTNUM, ' 0 AS FLAG FROM ' || '<table-name>'
ORDER BY ATTNUM
I'm still looking for how to do this across DB2 and PostgreSQL.
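For the PostgreSQL side, a comparable sketch (schema and table names below are placeholders to adjust) that builds the whole SUM statement from information_schema.columns:
SELECT 'SELECT '
       || string_agg('SUM(' || quote_ident(column_name) || ') AS s_' || column_name,
                     ', ' ORDER BY ordinal_position)
       || ' FROM ' || quote_ident(table_schema) || '.' || quote_ident(table_name) AS sql
FROM information_schema.columns
WHERE table_schema = 'public'      -- adjust to your schema
  AND table_name = 'your_table'    -- adjust to your table
  AND data_type IN ('numeric', 'integer', 'bigint', 'smallint', 'double precision', 'real')
GROUP BY table_schema, table_name;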

How to add single quotes around a list of columns names?

I have a table which has a few columns. I got the column names by using array_agg and then array_to_string, like below:
"hospitalaccountrecord,locationname,patientkey,inpatientadmitdatetime,readmission,no_null_days_btw_admissions,cohort_assignment,admit_mon_feb,admit_mon_mar,admit_mon_apr,admit_mon_may,admit_mon_june,admit_mon_july,admit_mon_aug,admit_mon_sep,admit_mon_oct,a (...)"
The code I used was this:
select array_to_string(array_agg(column_name::text), ',')
from
(
select column_name
from
information_schema.columns
where table_schema='a' and
table_name = 'b'
order by columns.ordinal_position
) as v;
What I am looking for is the same thing, but with each column name enclosed in single quotes, like:
'hospitalaccountrecord','locationname','patientkey'
and so on.
You don't need two SELECTs. Something like
select string_agg(''''|| your_column_name ||'''', ',')
should do the job.
Or, keeping your subquery structure:
select string_agg(quoted_column_name, ',')
from
(
select '''' || column_name || '''' as quoted_column_name
from
information_schema.columns
where table_schema='a' and
table_name = 'b'
order by columns.ordinal_position
) as v;
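An equivalent single-statement variant (just a sketch) uses quote_literal(), which also copes with names that contain a quote character:
select string_agg(quote_literal(column_name), ',' order by ordinal_position)
from information_schema.columns
where table_schema = 'a'
and table_name = 'b';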