Postgres NOT IN does not work as expected - postgresql

I am trying to add the below condition in my query to filter data.
SELECT *
FROM dump
WHERE letpos NOT IN ('0', '(!)','NA','N/A') ;
I need only the records with ids 1, 2, 3 and 6, but the query does not return ids 3 and 6; I get only 1 and 2.
TABLE:
id  name  letpos  num
1   AAA   A       60
2   BBB   B
3   CCC           50
4   DDD   0
5   EEE   (!)     70
6   FFF           70
I am not sure what is missing. Could anyone advise on how to resolve this?
-Thanks

In the rows with id = 3 and id = 6 the value of letpos is (I suspect) NULL, so the boolean expression in the WHERE clause becomes:
WHERE NULL NOT IN ('0', '(!)','NA','N/A');
A comparison involving NULL with operators like IN, NOT IN, =, > etc. always yields NULL and is never TRUE.
So you don't get these rows in the results.
Check for NULL also in the WHERE clause:
SELECT *
FROM dump
WHERE letpos IS NULL
OR letpos NOT IN ('0', '(!)', 'NA', 'N/A');
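A COALESCE-based alternative (just a sketch with the same intent): replace NULL with a placeholder that is not in the exclusion list, so the NOT IN comparison never operates on NULL.
-- Rows with a NULL letpos become '' here, which is not in the list, so they are kept.
SELECT *
FROM dump
WHERE COALESCE(letpos, '') NOT IN ('0', '(!)', 'NA', 'N/A');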

Related

Type bigint but expression is of type character varying

I'm hoping someone can lend a hand with this:
I'm trying to insert one row per order_id into a database running in Redshift, and sometimes subscription_id contains more than one value. This creates duplicate rows, so I figured I would use LISTAGG. This is the line:
LISTAGG(DISTINCT CAST(script.subscription_id AS VARCHAR), ',') AS subscription_id
The subscription_id column is an int8, and after it gave me the character varying error I tried to CAST, but for some reason I cannot do it. Does LISTAGG not support this kind of nested CAST? If not, is there a way to actually achieve this?
ORIGINAL:
order_id subscription_id
1 123
2 124
3 125
1 126
2 127
IDEAL:
order_id subscription_id
1 123,126
2 124,127
3 125
Both columns are of int8.
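A possible workaround, sketched under two assumptions: the source table is the script referenced in the LISTAGG line above, and the destination column for the aggregated list is (or can be changed to) varchar, since a comma-separated list can no longer be stored in an int8. Deduplicating in a subquery first lets LISTAGG run on an already-cast expression without DISTINCT:
-- Sketch only: "script" is taken from the LISTAGG line in the question; the
-- aggregated result is a varchar, so the target column must be varchar too.
SELECT order_id,
       LISTAGG(subscription_id_text, ',')
         WITHIN GROUP (ORDER BY subscription_id_text) AS subscription_id
FROM (
    SELECT DISTINCT order_id,
           CAST(subscription_id AS VARCHAR) AS subscription_id_text
    FROM script
) AS dedup
GROUP BY order_id;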

Group by and sum depending on cases in Google Big Query

The data looks like-
A_value B_value C_value Type
1 null null A
2 null null A
null 3 null B
null 4 null B
null null 5 C
null null 6 C
When Type is 'A' I want to sum A_value and store it in a new column called Type_value; when Type is 'B' I want to sum B_value into Type_value, and similarly for 'C'.
Expected results-
Type_value Type
3 A
7 B
11 C
How to achieve this result?
Below is for BigQuery Standard SQL
#standardSQL
SELECT
  SUM(CASE Type
        WHEN 'A' THEN A_value
        WHEN 'B' THEN B_value
        WHEN 'C' THEN C_value
        ELSE 0
      END) AS Type_value,
  Type
FROM `project.dataset.table`
GROUP BY Type
Applied to the sample data in your question, the result is
Row Type_value Type
1 3 A
2 7 B
3 11 C
Another option is to exploit the fact that your data has a value only in the column that matches its Type. If that pattern holds, you can use the version below
#standardSQL
SELECT
  SUM(IFNULL(A_value, 0) + IFNULL(B_value, 0) + IFNULL(C_value, 0)) AS Type_value,
  Type
FROM `project.dataset.table`
GROUP BY Type
with the same result, obviously

PostgreSQL - dynamic INSERT on column names

I'm looking to dynamically insert a set of columns from one table to another in PostgreSQL. What I think I'd like to do is read in a 'checklist' of column headings (those columns which exist in table 1, the storage table), and if they also exist in the export table (table 2), insert them all at once into table 1. Table 2 will be variable in its columns, though; once imported I'll drop it and import new data with a potentially different column structure, so I need to do the insert based on the column names.
e.g.
Table 1. - The storage table
ID NAME YEAR LITH_AGE PROV_AGE SIO2 TIO2 CAO MGO COMMENTS
1 John 1998 2000 3000 65 10 5 5 comment1
2 Mark 2005 2444 3444 63 8 2 3 comment2
3 Luke 2001 1000 1500 77 10 2 2 comment3
Table 2. - The export table
ID NAME MG# METHOD SIO2 TIO2 CAO MGO
1 Amy 4 Method1 65 10 5 5
2 Poe 3 Method2 63 8 2 3
3 Ben 2 Method3 77 10 2 2
As you can see the export table may include columns which do not exist in the storage table, so these would be ignored.
I want to insert all of these columns at once, because I've found that if I do it column by column the insert extends the number of rows each time (maybe someone can solve this issue instead? Currently I've written a function that checks whether a column name exists in table 2 and, if it does, inserts it, but as said this extends the rows of the table every time and leaves the rest of the columns NULL).
The INSERT line from my function:
EXECUTE format('INSERT INTO %s (%s) (SELECT %s::%s FROM %s);',_tbl_import, _col,_col,_type,_tbl_export);
As a type of 'code example' for my question:
EXECUTE FORMAT('INSERT INTO table1 (%s) (SELECT (%s) FROM table2)',columns)
where 'columns' would be some variable denoting the columns that exist in the export table that need to go into the storage table. This will be variable as table 2 will be different every time.
This would ideally update Table 1 as:
ID NAME YEAR LITH_AGE PROV_AGE SIO2 TIO2 CAO MGO COMMENTS
1 John 1998 2000 3000 65 10 5 5 comment1
2 Mark 2005 2444 3444 63 8 2 3 comment2
3 Luke 2001 1000 1500 77 10 2 2 comment3
4 Amy NULL NULL NULL 65 10 5 5 NULL
5 Poe NULL NULL NULL 63 8 2 3 NULL
6 Ben NULL NULL NULL 77 10 2 2 NULL
UPDATED answer
As my original answer did not meet a requirement that came out later, I was asked to post an alternative example to the information_schema solution, so here it is.
I made two versions of the solution:
V1 - equivalent to the already given example using information_schema, but that solution relies on table1's column DEFAULTs. Meaning, if a table1 column that does not exist in table2 has a default other than NULL, it will be filled with whatever that default is.
V2 - modified to force NULL when the two tables' columns do not match, and does not inherit table1's own DEFAULTs.
Version1:
CREATE OR REPLACE FUNCTION insert_into_table1_v1()
RETURNS void AS $main$
DECLARE
    columns text;
BEGIN
    SELECT string_agg(c1.attname, ',')
    INTO columns
    FROM pg_attribute c1
    JOIN pg_attribute c2
        ON  c1.attrelid = 'public.table1'::regclass
        AND c2.attrelid = 'public.table2'::regclass
        AND c1.attnum > 0
        AND c2.attnum > 0
        AND NOT c1.attisdropped
        AND NOT c2.attisdropped
        AND c1.attname = c2.attname
        AND c1.attname <> 'id';

    -- The following is the actual result of the query above, based on the given data examples:
    -- -[ RECORD 1 ]----------------------
    -- string_agg | name,sio2,tio2,cao,mgo

    EXECUTE format(
        'INSERT INTO table1 ( %1$s )
         SELECT %1$s
         FROM table2',
        columns
    );
END;
$main$ LANGUAGE plpgsql;
Version2:
CREATE OR REPLACE FUNCTION insert_into_table1_v2()
RETURNS void AS $main$
DECLARE
    t1_cols text;
    t2_cols text;
BEGIN
    SELECT string_agg( c1.attname, ',' ),
           string_agg( COALESCE( c2.attname, 'NULL' ), ',' )
    INTO t1_cols,
         t2_cols
    FROM pg_attribute c1
    LEFT JOIN pg_attribute c2
        ON  c2.attrelid = 'public.table2'::regclass
        AND c2.attnum > 0
        AND NOT c2.attisdropped
        AND c1.attname = c2.attname
    WHERE c1.attrelid = 'public.table1'::regclass
      AND c1.attnum > 0
      AND NOT c1.attisdropped
      AND c1.attname <> 'id';

    -- The following is the actual result of the query above, based on the given data examples:
    --                         t1_cols                         |                  t2_cols
    -- --------------------------------------------------------+--------------------------------------------
    --  name,year,lith_age,prov_age,sio2,tio2,cao,mgo,comments | name,NULL,NULL,NULL,sio2,tio2,cao,mgo,NULL
    -- (1 row)

    EXECUTE format(
        'INSERT INTO table1 ( %s )
         SELECT %s
         FROM table2',
        t1_cols,
        t2_cols
    );
END;
$main$ LANGUAGE plpgsql;
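Either function can then be invoked like any other void-returning function (a minimal usage sketch using the names defined above):
SELECT insert_into_table1_v1();
-- or, for the NULL-forcing variant:
SELECT insert_into_table1_v2();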
Also, here is a link to the documentation about the pg_attribute table columns, if something is unclear: https://www.postgresql.org/docs/current/static/catalog-pg-attribute.html
Hopefully this helps :)

Default value in select query for null values in postgres

I have a table with sales id, product code and amount. In some places the product code is null, and I want to show Missing instead of null. Below is my table.
salesId  prodTypeCode  amount
1        123           150
2        123           200
3        234           3000
4        234           400
5        234           500
6        123           200
7        111           40
8        111           500
9                      1000
10       123           100
I want to display the total amount for every prodTypeCode, with the condition that if prodTypeCode is null then Missing should be displayed instead.
select (CASE WHEN prodTypeCode IS NULL THEN
'Missing'
ELSE
prodTypeCode
END) as ProductCode, SUM(amount) From sales group by prodTypeCode
The above query gives an error. Please suggest how to overcome this issue. I have created a SQLFiddle.
The problem is a mismatch of datatypes; 'Missing' is text, but the product type code is numeric.
Cast the product type code to text so the two values are compatible:
select (CASE WHEN prodTypeCode IS NULL THEN
'Missing'
ELSE
prodTypeCode::varchar(40)
END) as ProductCode, SUM(amount) From sales group by prodTypeCode
See SQLFiddle.
Or, simpler:
select coalesce(prodTypeCode::varchar(40), 'Missing') ProductCode, SUM(amount)
from sales
group by prodTypeCode
See SQLFiddle.
Perhaps you have a type mismatch:
select coalesce(cast(prodTypeCode as varchar(255)), 'Missing') as ProductCode,
SUM(amount)
From sales s
group by prodTypeCode;
I prefer coalesce() to the case, simply because it is shorter.
I tried both answers in my case and neither worked. I hope this snippet can help if they do not work for someone else either:
SELECT
COALESCE(NULLIF(prodTypeCode,''), 'Missing') AS ProductCode,
SUM(amount)
From sales s
group by prodTypeCode;

Redshift Postgres SQL comparing NULL vs NOT NULL values in a table

I am trying to create a query in a Redshift DB (which is PostgreSQL-based) to do the following:
I have columns that I am checking for quality control and need the percentages of NULL vs. NOT NULL values for each column. I would like my output to look like the example below; it shows the totals, but I need the percentage as well if possible. How can I write this query?
Column     NOT NULL   NULL   Total Records   Percentage NULL
---------  ---------  -----  --------------  ----------------
Column A   78         10     88              11.3%
Column B   68         15     83              18.0%
Column C   3          5      8               62.5%
With SQL, you can calculate the values for a specific column, like this:
select
    count(a) as "NOT_NULL",
    count(*) - count(a) as "NULL",
    count(*) as "Total Records",
    to_char(100.0 * (count(*) - count(a)) / count(*), '999.9%') as "Percentage NULL"
from stack
However, it is not possible to display "one row per column" with a single query like this. You would have to combine several such queries, one per column (for example with UNION ALL), to produce that result.
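A sketch of that "one query per column" approach, combined with UNION ALL: the table name stack and column a are taken from the query above, while columns b and c are hypothetical stand-ins for whichever other columns are being audited.
-- One sub-query per column, stacked with UNION ALL; b and c are placeholder
-- column names for the other columns being checked.
SELECT 'Column A' AS "Column",
       count(a) AS "NOT NULL",
       count(*) - count(a) AS "NULL",
       count(*) AS "Total Records",
       to_char(100.0 * (count(*) - count(a)) / count(*), '999.9%') AS "Percentage NULL"
FROM stack
UNION ALL
SELECT 'Column B',
       count(b),
       count(*) - count(b),
       count(*),
       to_char(100.0 * (count(*) - count(b)) / count(*), '999.9%')
FROM stack
UNION ALL
SELECT 'Column C',
       count(c),
       count(*) - count(c),
       count(*),
       to_char(100.0 * (count(*) - count(c)) / count(*), '999.9%')
FROM stack;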