Why are my view's columns nullable? - postgresql

I'm running PostgreSQL 9.2 on Windows.
I have an existing table with some non nullable columns :
CREATE TABLE testtable
(
bkid serial NOT NULL,
bklabel character varying(128),
lacid integer NOT NULL
}
The I create a view on this table :
CREATE OR REPLACE VIEW test AS
SELECT testtable.bkid, testtable.lacid
from public.testtable;
I'm surprised that information_schema.columns for the view reports is_nullable to be YES for the selected columns ?
select * from information_schema.columns where table_name = 'test'
Reports :
"MyDatabase";"public";"test";"bkid";1;"";"YES";"integer";;;32;2;0;;"";;"";"";"";"";"";"";"";"";"";"MyDatabase";"pg_catalog";"int4";"";"";"";;"1";"NO";"NO";"";"";"";"";"";"";"NEVER";"";"NO"
"MyDatabase";"public";"test";"lacid";2;"";"YES";"integer";;;32;2;0;;"";;"";"";"";"";"";"";"";"";"";"MyDatabase";"pg_catalog";"int4";"";"";"";;"2";"NO";"NO";"";"";"";"";"";"";"NEVER";"";"NO"
Is it an expected behavior ?
My problem is that I'm trying to import such views in an Entity Framework Data Model and it fails because all columns are marked as nullable.
EDIT 1 :
The following query :
select attrelid, attname, attnotnull, pg_class.relname
from pg_attribute
inner join pg_class on attrelid = oid
where relname = 'test'
returns :
attrelid;attname;attnotnull;relname
271543;"bkid";f;"test"
271543;"lacid";f;"test"
As expected, attnotnull is 'false'.
As #Mike-Sherrill-Catcall suggested, I could manually set them to true :
update pg_attribute
set attnotnull = 't'
where attrelid = 271543
And the change is reflected in the information_schema.columns :
select * from information_schema.columns where table_name = 'test'
Output is :
"MyDatabase";"public";"test";"bkid";1;"";"NO";"integer";;;32;2;0;;"";;"";"";"";"";"";"";"";"";"";"MyDatabase";"pg_catalog";"int4";"";"";"";;"1";"NO";"NO";"";"";"";"";"";"";"NEVER";"";"NO"
"MyDatabase";"public";"test";"lacid";2;"";"NO";"integer";;;32;2;0;;"";;"";"";"";"";"";"";"";"";"";"MyDatabase";"pg_catalog";"int4";"";"";"";;"2";"NO";"NO";"";"";"";"";"";"";"NEVER";"";"NO"
I'll try to import the views in the Entity Framework data model.
EDIT 2 :
As guessed, it works, the view is now correctly imported in the Entity Framework Data Model.
Of course, I won't set all columns to be non nullable, as demonstrated above, only those non nullable in the underlying table.

I believe this is expected behavior, but I don't pretend to fully understand it. The columns in the base table seem to have the right attributes.
The column in the system tables underlying the information_schema here seems to be "attrnotnull". I see only one thread referring to "attnotnull" on the pgsql-hackers listserv: cataloguing NOT NULL constraints. (But that column might have had a different name in an earlier version. It's probably worth researching.)
You can see the behavior with this query. You'll need to work with the WHERE clause to get exactly what you need to see.
select attrelid, attname, attnotnull, pg_class.relname
from pg_attribute
inner join pg_class on attrelid = oid
where attname like 'something%'
On my system, columns that have a primary key constraint and columns that have a NOT NULL constraint have "attnotnull" set to 't'. The same columns in a view have "attnotnull" set to 'f'.
If you tilt your head and squint just right, that kind of makes sense. The column in the view isn't declared NOT NULL. Just the column in the base table.
The column pg_attribute.attnotnull is updatable. You can set it to TRUE, and that change seems to be reflected in the information_schema views. Although you can set it to TRUE directly, I think I'd be more comfortable setting it to match the value in the base table. (And by more comfortable, I don't mean to imply I'm comfortable at all with mucking about in the system tables.)

Why:
A view could be computed but references the column from a table. That computation could result in a NULL value on what is otherwise a non-null column. So basically, they put it into the too hard basket.
There is a way to see the underlying nullability for yourself with the following query:
select vcu.column_name, c.is_nullable, c.data_type
from information_schema.view_column_usage vcu
join information_schema."columns" c
on c.column_name = vcu.column_name
and c.table_name = vcu.table_name
and c.table_schema = vcu.table_schema
and c.table_catalog = vcu.table_catalog
where view_name = 'your_view_here'
If you know that you are only projecting the columns raw without functions, then it will work. Ideally, the Postgres provider for EF would use this view and also read the view definition to confirm nullability.

The nullability tracking in PostgreSQL is not developed very much at all. In most places, it will default to claiming everything is potentially nullable, which is in many cases allowed by the relevant standards. This is the case here as well: Nullability is not tracked through views. I wouldn't rely on it for an application.

Related

Is a subquery able to select columns from outer query? [duplicate]

This question already has answers here:
sql server 2008 management studio not checking the syntax of my query
(2 answers)
Closed 1 year ago.
I have the following select:
SELECT DISTINCT pl
FROM [dbo].[VendorPriceList] h
WHERE PartNumber IN (SELECT DISTINCT PartNumber
FROM [dbo].InvoiceData
WHERE amount > 10
AND invoiceDate > DATEADD(yyyy, -1, CURRENT_TIMESTAMP)
UNION
SELECT DISTINCT PartNumber
FROM [dbo].VendorDeals)
The issue here is that the table [dbo].VendorDeals has NO column PartNumber, however no error is detected and the query works with the first part of the union.
Even more, IntelliSense also allows and recognize PartNumber. This fails only when inside a complex statement.
It is pretty obvious that if you qualify column names, the mistake will be evident.
This isn't a bug in SQL Server/the T-SQL dialect parsing, no, this is working exactly as intended. The problem, or bug, is in your T-SQL; specifically because you haven't qualified your columns. As I don't have the definition of your table, I'm going to provide sample DDL first:
CREATE TABLE dbo.Table1 (MyColumn varchar(10), OtherColumn int);
CREATE TABLE dbo.Table2 (YourColumn varchar(10) OtherColumn int);
And then an example that is similar to your query:
SELECT MyColumn
FROM dbo.Table1
WHERE MyColumn IN (SELECT MyColumn FROM dbo.Table2);
This, firstly, will parse; it is a valid query. Secondly, provided that dbo.Table2 contains at least one row, then every row from table dbo.Table1 will be returned where MyColumn has a non-NULL value. Why? Well, let's qualify the column with table's name as SQL Server would parse them:
SELECT Table1.MyColumn
FROM dbo.Table1
WHERE Table1.MyColumn IN (SELECT Table1.MyColumn FROM dbo.Table2);
Notice that the column inside the IN is also referencing Table1, not Table2. By default if a column has it's alias omitted in a subquery it will be assumed to be referencing the table(s) defined in that subquery. If, however, none of the tables in the sub query have a column by that name, then it will be assumed to reference a table where that column does exist; in this case Table1.
Let's, instead, take a different example, using the other column in the tables:
SELECT OtherColumn
FROM dbo.Table1
WHERE OtherColumn IN (SELECT OtherColumn FROM dbo.Table2);
This would be parsed as the following:
SELECT Table1.OtherColumn
FROM dbo.Table1
WHERE Table1.OtherColumn IN (SELECT Table2.OtherColumn FROM dbo.Table2);
This is because OtherColumn exists in both tables. As, in the subquery, OtherColumn isn't qualified it is assumed the column wanted is the one in the table defined in the same scope, Table2.
So what is the solution? Alias and qualify your columns:
SELECT T1.MyColumn
FROM dbo.Table1 T1
WHERE T1.MyColumn IN (SELECT T2.MyColumn FROM dbo.Table2 T2);
This will, unsurprisingly, error as Table2 has no column MyColumn.
Personally, I suggest that unless you have only one table being referenced in a query, you alias and qualify all your columns. This not only ensures that the wrong column can't be referenced (such as in a subquery) but also means that other readers know exactly what columns are being referenced. It also stops failures in the future. I have honestly lost count how many times over years I have had a process fall over due to the "ambiguous column" error, due to a table's definition being changed and a query referencing the table wasn't properly qualified by the developer...

Query Constraint Clauses With Schema and Table (Postgres)

I am trying to query the constraint clauses along with schema and table in postgres. I've gotten as far as identifying information_schema.check_constraints as a useful table. The problem is that doing
select *
from information_schema.check_constraints
Results in constraint_catalog, constraint_schema, constraint_name, check_clause. The check_clause is what I want and this table also gives me the constraint_schema. However, it does not give the table that this constraint is defined on. In my current database, I have constraints with the same name defined on different tables within the same schema (which is in it of itself perhaps poor design but what I need to deal with). Is it possible to get the table name here as well?
select
conname,
connamespace::regnamespace as schemaname,
conrelid::regclass as tablename,
consrc as checkclause,
pg_get_constraintdef(oid) as definition
from
pg_constraint
where
contype = 'c'
and conrelid <> 0; -- to get only table constraints
About pg_constraint
About Object Identifier Types

How to introspect materialized views

I have a utility that introspects columns of tables using:
select column_name, data_type from information_schema.columns
where table_name=%s
How can I extend this to introspect columns of materialized views?
Your query carries a few shortcomings / room for improvement:
A table name is not unique inside a database, you would have to narrow down to a specific schema, or could get surprising / misleading / totally incorrect results.
It's much more effective / convenient to cast the (optionally) schema-qualified table name to regclass ... see below.
A cast to regtype gives you generic type names instead of internal ones. But that's still only the base type.
Use the system catalog information functions format_type() instead to get an exact type name including modifiers.
With the above improvements you don't need to join to additional tables. Just pg_attribute.
Dropped columns reside in the catalog until the table is vacuumed (fully). You need to exclude those.
SELECT attname, atttypid::regtype AS base_type
, format_type(atttypid, atttypmod) AS full_type
FROM pg_attribute
WHERE attrelid = 'myschema.mytable'::regclass
AND attnum > 0
AND NOT attisdropped; -- no dead columns
As an aside: the views in the information schema are only good for standard compliance and portability (rarely works anyway). If you don't plan to switch your RDBMS, stick with the catalog tables, which are much faster - and more complete, apparently.
It would seem that postgres 9.3 has left materialized views out of the information_schema. (See http://postgresql.1045698.n5.nabble.com/Re-Materialized-views-WIP-patch-td5740513i40.html for a discussion.)
The following will work for introspection:
select attname, typname
from pg_attribute a
join pg_class c on a.attrelid = c.oid
join pg_type t on a.atttypid = t.oid
where relname = %s and attnum >= 1;
The clause attnum >= 1 suppresses system columns. The type names are pg_specific this way, I guess, but good enough for my purposes.

Name keyword in postgresql

Why is this working or better yet what does it represent ( replace table with an existing one )
select table.name from table
and where is it documented ( postgresql ) ?
select table.name from table
is equivalent to
select name(table) from table
which, since name is a type, is equivalent to
select cast(table as name) from table
The first table is a row variable containing all the columns from the respective table, so you will get a text representation of the row.
This is not directly documented, since it's a combination of several obscure features (some dating back to PostQUEL). In fact, this usage has been disallowed in PostgreSQL 9.1 (see the release notes under "Casting").
The name type is documented in the string types. Hidden columns are documented (off the top of my head) in the section on system catalogs, or in the internals.
You can see hidden columns in pg_attribute. Their numbers are negative:
select attum, attname
from pg_attribute
where attrelid = 'yourtable'::regclass
That should reveal xmin, xmax, ctid, oid (where applicable), etc. Might this also yield name? (I can't test on the iPad.)

Refactoring PostgreSQL SProcs

with a moderate PostgreSQL installation we accumulated quite a few stored procedures/functions and types.
Now in the lowest level composite type (i.e. 3 types are built with it and a myriad functions reference any of those types) one element of the type is of wrong type (i.e. smallint instead of bigint), thus handling it is identical, only the range is different.
How do I know all types depending on
a type (pg_catalog.pg_type seems
insufficient)?
How can I know all
functions depending on a type (as
arguments and locally scoped vars)?
Can I refactore a composite type
(maybe change smallint to bigint)
without dropping/rebuilding every
single function depending on it?
Is there any kind of
automation/tool/best practice for
such a refactoring?
I know its 4 questions in one, but atm this is kind of frustrating and any help would be appreciated!
Many Thanks!
The system catalogue "pg_depend" contains some useful dependency information. You can find objects depending on particular types a bit like this:
select * from pg_depend where refclassid = 'pg_type'::regclass
and refobjid = 'information_schema.sql_identifier'::regtype;
This finds objects dependent on the "information_schema.sql_identifier" type. In the result, classid is the OID of a catalogue- for instance, for a column depending on a user type, classid is 'pg_class'::regclass, objid is the OID of the pg_class row, and objsubid is the attnum value from pg_attribute, so for this case you can format the results like this:
select objid::regclass, attname from pg_depend
join pg_attribute on pg_attribute.attrelid = pg_depend.objid and pg_attribute.attnum = pg_depend.objsubid
where refclassid = 'pg_type'::regclass and refobjid = 'information_schema.sql_identifier'::regtype
and classid = 'pg_class'::regclass
limit 10
So in pg_depend, (classid,objid,objsubid) describe some object that depends on the object described by (refclassid,refobjid).