PostgreSQL double-quoted type - postgresql

Can anybody tell me why double-quoted types behave differently in PostgreSQL?
CREATE TABLE foo1 (pk int4); -- ok
CREATE TABLE foo2 (pk "int4"); -- ok
CREATE TABLE foo3 (pk int); -- ok
CREATE TABLE foo4 (pk "int"); -- fail: type "int" does not exist
CREATE TABLE foo5 (pk integer); -- ok
CREATE TABLE foo6 (pk "integer"); -- fail: type "integer" does not exist
I can't find anything about it in documentation. Is this a bug?
Any information would be greatly appreciated

Double quotes mean that an identifier is to be interpreted exactly as written. They cause case to be preserved instead of flattened, and they allow what would otherwise be a keyword to be interpreted as an identifier.
PostgreSQL's int is a parse-time transformation to the integer type, which is itself stored in the catalogs under the name int4. There is not actually any data type named int in the system catalogs:
regress=> select typname from pg_type where typname = 'int';
typname
---------
(0 rows)
It is instead handled as a parse-time transformation much like a keyword. So when you protect it from that transformation by quoting it, you're telling the DB to look for a real type by that name.
This can't really be undone in a backward-compatible way, since it would break someone's system if they had created a type or table named "int". (Types and tables share the same namespace.)
This is similar to how user is transformed to current_user. Rails developers often use User as a model name, which causes Rails to try to SELECT * FROM user; in the DB, but this is transformed at parse time to SELECT * FROM current_user;, causing confused users to wonder why their table has a single row with their username in it. Query generators should always quote identifiers, i.e. they should be generating SELECT * FROM "user";... but few do.
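The rewriting can be observed directly. The following is a minimal sketch, assuming a psql session on a reasonably recent PostgreSQL; the "user" table is hypothetical and exists only for the illustration. A cast to regtype goes through the same type-name parsing, so the unquoted spellings resolve to one catalog entry, while quoted names are looked up literally.

```sql
-- Only int4 exists in the catalog; int and integer are parser-level aliases.
SELECT typname FROM pg_type WHERE typname IN ('int', 'integer', 'int4');

-- All three spellings resolve to the same type.
SELECT 'int'::regtype     AS from_int,
       'integer'::regtype AS from_integer,
       'int4'::regtype    AS from_int4;

-- Quoting the real catalog name works; quoting an alias would not.
SELECT 1::"int4";

-- Quoting "user" makes it an ordinary identifier instead of current_user.
-- Hypothetical table, created only for this illustration.
CREATE TABLE "user" (id int4, login text);
SELECT * FROM "user";   -- reads the table
SELECT * FROM user;     -- per the answer above, parsed as current_user
```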

Related

Equivalent of Oracle's DBMS_ASSERT.sql_object_name() in PostgreSQL?

I'm trying to come up with a function that verifies an object identifier name. In Oracle, if a given identifier is associated with any SQL object (tables, functions, views, ...), the function returns the name as it is; otherwise it raises an error. Here are a few examples:
SELECT SYS.DBMS_ASSERT.SQL_OBJECT_NAME('DBMS_ASSERT.sql_object_name') FROM DUAL;
SYS.DBMS_ASSERT.SQL_OBJECT_NAME('DBMS_ASSERT.SQL_OBJECT_NAME')
DBMS_ASSERT.sql_object_name
SELECT SYS.DBMS_ASSERT.SQL_OBJECT_NAME('unknown') FROM DUAL;
ORA-44002: invalid object name
For tables, views, sequences, you'd typically cast to regclass:
select 'some_table_I_will_create_later'::regclass;
ERROR: relation "some_table_I_will_create_later" does not exist
LINE 1: select 'some_table_I_will_create_later'::regclass;
^
For procedures and functions, it'd be a cast to regproc instead, so to get a function equivalent to DBMS_ASSERT.sql_object_name() you'd have to go through the full list of what the argument could be cast to:
create or replace function assert_sql_object_name(arg text)
returns text language sql as $function_body$
select coalesce(
to_regclass(arg)::text,
to_regcollation(arg)::text,
to_regoper(arg)::text,
to_regproc(arg)::text,
to_regtype(arg)::text,
to_regrole(quote_ident(arg))::text,
to_regnamespace(quote_ident(arg))::text )
$function_body$;
These functions work the same as a plain cast, except they return null instead of throwing an exception. coalesce() works the same in PostgreSQL as it does in Oracle, returning the first non-null argument it gets.
Note that unknown is a pseudo-type in PostgreSQL, so it doesn't make a good test.
select assert_sql_object_name('unknown');
-- assert_sql_object_name
-- ------------------------
-- unknown
select assert_sql_object_name('some_table_I_will_create_later');
-- assert_sql_object_name
-- ------------------------
-- null
create table some_table_I_will_create_later(id int);
select assert_sql_object_name('some_table_I_will_create_later');
-- assert_sql_object_name
-- --------------------------------
-- some_table_i_will_create_later
select assert_sql_object_name('different_schema.some_table_I_will_create_later');
-- assert_sql_object_name
-- ------------------------
-- null
create schema different_schema;
alter table some_table_i_will_create_later set schema different_schema;
select assert_sql_object_name('different_schema.some_table_I_will_create_later');
-- assert_sql_object_name
-- -------------------------------------------------
-- different_schema.some_table_i_will_create_later
There is no direct equivalent, but if you know the expected type of the object, you can cast the name to one of the Object Identifier Types.
For tables, views and other objects that have an entry in pg_class, you can cast it to regclass:
select 'pg_catalog.pg_class'::regclass;
select 'public.some_table'::regclass;
The cast will result in an error if the object does not exist.
For functions or procedures you need to cast the name to regproc:
select 'my_schema.some_function'::regproc;
However, if that is an overloaded function (i.e. multiple entries exist in pg_catalog.pg_proc), then it would result in the error more than one function named "some_function". In that case you need to provide the full signature you want to test, using the type regprocedure instead, e.g.:
select 'my_schema.some_function(int4)'::regprocedure;
You can create a wrapper function in PL/pgSQL that tries the different casts to mimic the behaviour of the Oracle function.
The orafce extension provides an implementation of dbms_assert.object_name.
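Such a wrapper could look roughly like the sketch below. The function name assert_object_name_strict is hypothetical, and only three of the possible object kinds are checked. It assumes a PostgreSQL version where the to_reg* functions accept text. Unlike the Oracle function, it returns the name as PostgreSQL resolves it (case-folded, possibly schema-stripped), not the input string.

```sql
-- Hypothetical wrapper: raises an error instead of returning NULL
-- when the name does not resolve to a relation, function, or type.
CREATE OR REPLACE FUNCTION assert_object_name_strict(arg text)
RETURNS text
LANGUAGE plpgsql STABLE AS
$func$
DECLARE
    resolved text;
BEGIN
    -- Each to_reg* call yields NULL when nothing matches.
    resolved := coalesce(
        to_regclass(arg)::text,
        to_regproc(arg)::text,
        to_regtype(arg)::text);

    IF resolved IS NULL THEN
        RAISE EXCEPTION 'invalid object name: %', arg;
    END IF;

    RETURN resolved;
END
$func$;

SELECT assert_object_name_strict('pg_catalog.pg_class');  -- resolves
SELECT assert_object_name_strict('no_such_object_here');  -- raises the exception
```

Note that to_regproc also yields NULL for an overloaded function name, so overloaded functions would need to_regprocedure with a full signature.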

Alter table column to int(9) UNSIGNED ZEROFILL NOT NULL

After reading the following link..
https://dba.stackexchange.com/questions/94443/how-should-a-numeric-value-that-is-an-external-key-be-stored
I decided to alter a column from text to:
int(9) UNSIGNED ZEROFILL NOT NULL
However, I am not sure of the SQL statement to use. I know the statement below is not correct, because it does not include the 9 digits, UNSIGNED ZEROFILL and NOT NULL parameters.
ALTER TABLE "Organizations" ALTER COLUMN "EIN" TYPE INTEGER using "EIN"::INTEGER
UPDATE:
Since Postgres does not support ZEROFILL or INT(9), what would be the recommended data type for an EIN that is 9 digits long?
I would recommend doing it in two statements:
ALTER TABLE "Organizations" ALTER COLUMN "EIN" TYPE INTEGER using "EIN"::INTEGER;
ALTER TABLE "Organizations" ALTER COLUMN "EIN" SET NOT NULL;
Padding with zeros is decoration and can be done at SELECT time or in the client. (It could also be done with a rule, but that would effectively be just a view selected instead, which I think overcomplicates things here; and changing the column to int only to select text with zeros does not sound reasonable.) For example:
t=# select lpad(123::int::text,9,'0');
lpad
-----------
000000123
(1 row)
So if it is needed, it can be mocked up.
For the 9-digit restriction, a domain over int can work:
CREATE DOMAIN ein AS int CHECK (VALUE>0 AND VALUE<1000000000);
Then ein can be used in declarations as a type, for instance:
=> CREATE TABLE test(id ein, t text);
CREATE TABLE
=> insert into test values(2*1e9::int);
ERROR: value for domain ein violates check constraint "ein_check"
=> insert into test values(100);
INSERT 0 1
The zerofill bit is different, it's about presentation, not storage,
and that part cannot be specialized for a domain.
You may instead apply to_char to the values, for example:
=> select to_char(id,'000000000') from test;
to_char
------------
000000100
and possibly access this through a stored view or a presentation
function that takes only the ein as argument
if you prefer to abstract this from the client.
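A view along those lines might look like the sketch below. The view name test_padded is hypothetical; test and ein are the table and domain from the example above. The FM prefix is used here to drop the blank that to_char otherwise reserves for the sign.

```sql
-- Hypothetical view name; test and ein come from the example above.
CREATE VIEW test_padded AS
SELECT to_char(id, 'FM000000000') AS ein_text,  -- FM drops the leading sign blank
       t
FROM   test;

-- Clients read the padded text form; storage stays an integer.
SELECT * FROM test_padded;
```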
To go further, you could create a full type with CREATE TYPE
backed with C code for the INPUT and OUTPUT function, and these functions could implement the 9-digit left-padded format as the input/output format, so that the user may never see anything else at the SQL level.

Why is SELECT without columns valid

I accidentally wrote a query like select from my_table; and surprisingly it is a valid statement. Even more interesting to me is that even SELECT; is a valid query in PostgreSQL. You can write a lot of funny queries with this:
select union all select;
with t as (select) select;
select from (select) a, (select) b;
select where exists (select);
create table a (b int); with t as (select) insert into a (select from t);
Is this a consequence of some definition in the SQL standard, is there some use case for it, or is it just funny behavior that no one cared to restrict programmatically?
Right from the manual:
The list of output expressions after SELECT can be empty, producing a zero-column result table. This is not valid syntax according to the SQL standard. PostgreSQL allows it to be consistent with allowing zero-column tables. However, an empty list is not allowed when DISTINCT is used.
The possibility of "zero-column" tables is a side effect of the table inheritance if I'm not mistaken. There were discussions over this on the Postgres mailing lists (but I can't find them right now)
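A small sketch of the consistency argument, assuming a PostgreSQL version that accepts the empty select list; the table name no_columns is hypothetical. It also shows one practical use of the empty list, in EXISTS, where only the presence of rows matters.

```sql
-- Hypothetical table name; a table with no columns is legal in PostgreSQL.
CREATE TABLE no_columns ();
INSERT INTO no_columns DEFAULT VALUES;
INSERT INTO no_columns DEFAULT VALUES;

SELECT FROM no_columns;           -- two rows, each with zero columns
SELECT count(*) FROM no_columns;  -- the rows still count

-- An empty list is handy where only row existence matters.
SELECT 'has rows' AS status
WHERE  EXISTS (SELECT FROM no_columns);
```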

Executing queries dynamically in PL/pgSQL

I have found solutions (I think) to the problem I'm about to ask for on Oracle and SQL Server, but can't seem to translate this into a Postgres solution. I am using Postgres 9.3.6.
The idea is to be able to generate "metadata" about the table content for profiling purposes. This can only be done (AFAIK) by having queries run for each column so as to find out, say... min/max/count values and such. In order to automate the procedure, it is preferable to have the queries generated by the DB, then executed.
With an example salesdata table, I'm able to generate a select query for each column, returning the min() value, using the following snippet:
SELECT 'SELECT min('||column_name||') as minval_'||column_name||' from salesdata '
FROM information_schema.columns
WHERE table_name = 'salesdata'
The advantage being that the db will generate the code regardless of the number of columns.
Now there's a myriad places I had in mind for storing these queries, either a variable of some sort, or a table column, the idea being to then have these queries execute.
I thought of storing the generated queries in a variable then executing them using the EXECUTE (or EXECUTE IMMEDIATE) statement which is the approach employed here (see right pane), but Postgres won't let me declare a variable outside a function and I've been scratching my head with how this would fit together, whether that's even the direction to follow, perhaps there's something simpler.
Would you have any pointers? I'm currently trying something like this, inspired by this other question, but have no idea whether I'm headed in the right direction:
CREATE OR REPLACE FUNCTION foo()
RETURNS void AS
$$
DECLARE
dyn_sql text;
BEGIN
dyn_sql := SELECT 'SELECT min('||column_name||') from salesdata'
FROM information_schema.columns
WHERE table_name = 'salesdata';
execute dyn_sql
END
$$ LANGUAGE PLPGSQL;
System statistics
Before you roll your own, have a look at the system table pg_statistic or the view pg_stats:
This view allows access only to rows of pg_statistic that correspond
to tables the user has permission to read, and therefore it is safe to
allow public read access to this view.
It might already have some of the statistics you are about to compute. It's populated by ANALYZE, so you might run that for new (or any) tables before checking.
-- ANALYZE tbl; -- optionally, to init / refresh
SELECT * FROM pg_stats
WHERE tablename = 'tbl'
AND schemaname = 'public';
Generic dynamic plpgsql function
You want to return the minimum value for every column in a given table. This is not a trivial task, because a function (like SQL in general) demands to know the return type at creation time - or at least at call time with the help of polymorphic data types.
This function does everything automatically and safely. Works for any table, as long as the aggregate function min() is allowed for every column. But you need to know your way around PL/pgSQL.
CREATE OR REPLACE FUNCTION f_min_of(_tbl anyelement)
RETURNS SETOF anyelement
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE (
SELECT format('SELECT (t::%2$s).* FROM (SELECT min(%1$s) FROM %2$s) t'
, string_agg(quote_ident(attname), '), min(' ORDER BY attnum)
, pg_typeof(_tbl)::text)
FROM pg_attribute
WHERE attrelid = pg_typeof(_tbl)::text::regclass
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0 -- no system columns
);
END
$func$;
Call (important!):
SELECT * FROM f_min_of(NULL::tbl); -- tbl being the table name
You need to understand these concepts:
Dynamic SQL in plpgsql with EXECUTE
Polymorphic types
Row types and table types in Postgres
How to defend against SQL injection
Aggregate functions
System catalogs
Related answer with detailed explanation:
Table name as a PostgreSQL function parameter
Refactor a PL/pgSQL function to return the output of various SELECT queries
Postgres data type cast
How to set value of composite variable field using dynamic SQL
How to check if a table exists in a given schema
Select columns with particular column names in PostgreSQL
Generate series of dates - using date type as input
Special difficulty with type mismatch
I am taking advantage of Postgres defining a row type for every existing table. Using the concept of polymorphic types I am able to create one function that works for any table.
However, some aggregate functions return related but different data types as compared to the underlying column. For instance, min(varchar_column) returns text, which is bit-compatible, but not exactly the same data type. PL/pgSQL functions have a weak spot here and insist on data types exactly as declared in the RETURNS clause. No attempt to cast, not even implicit casts, not to speak of assignment casts.
That should be improved. Tested with Postgres 9.3. Did not retest with 9.4, but I am pretty sure, nothing has changed in this area.
That's where this construct comes in as workaround:
SELECT (t::tbl).* FROM (SELECT ... FROM tbl) t;
By casting the whole row to the row type of the underlying table explicitly we force assignment casts to get original data types for every column.
This might fail for some aggregate function. sum() returns numeric for a sum(bigint_column) to accommodate for a sum overflowing the base data type. Casting back to bigint might fail ...
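For ad-hoc profiling without a polymorphic function, the generate-then-execute idea from the question can also be sketched as an anonymous DO block. This is only an illustration under stated assumptions: the table is public.salesdata as in the question, every column has a type that min() supports (boolean or json columns, for instance, would raise an error), and results are reported with RAISE NOTICE instead of being returned.

```sql
DO
$do$
DECLARE
    _col text;  -- current column name
    _min text;  -- minimum value, cast to text for display
BEGIN
    FOR _col IN
        SELECT column_name
        FROM   information_schema.columns
        WHERE  table_schema = 'public'
        AND    table_name   = 'salesdata'
        ORDER  BY ordinal_position
    LOOP
        -- %I quotes identifiers safely, guarding against SQL injection.
        EXECUTE format('SELECT min(%I)::text FROM %I.%I',
                       _col, 'public', 'salesdata')
        INTO _min;

        RAISE NOTICE 'min(%) = %', _col, _min;
    END LOOP;
END
$do$;
```

Casting each result to text sidesteps the return-type problem described above, at the cost of losing the original data types.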
@Erwin Brandstetter, many thanks for the extensive answer. pg_stats does indeed provide a few things, but to draw a complete profile I need a variety of figures: min and max values, counts, counts of nulls, mean, etc. So a bunch of queries have to be run for each column, some with GROUP BY and such.
Also, thanks for highlighting the importance of data types. I was sort of expecting this to throw a spanner in the works at some point. My main concern was how to automate the query generation and, above all, its execution.
I have tried the function you provide (I will probably need to start learning some PL/pgSQL) but get an error at the SELECT (t::tbl):
ERROR: type "tbl" does not exist
By the way, what is the (t::abc) notation called? In Python this would be a list slice, but that's probably not the case in PL/pgSQL.

Is name a special keyword in PostgreSQL?

I am using Ubuntu and PostgreSQL 8.4.9.
Now, for any table in my database, if I do select table_name.name from table_name, it shows a result of concatenated columns for each row, although I don't have any name column in the table. For the tables which have name column, no issue. Any idea why?
My results are like this:
select taggings.name from taggings limit 3;
---------------------------------------------------------------
(1,4,84,,,PlantCategory,soil_pref_tags,"2010-03-18 00:37:55")
(2,5,84,,,PlantCategory,soil_pref_tags,"2010-03-18 00:37:55")
(3,6,84,,,PlantCategory,soil_pref_tags,"2010-03-18 00:37:55")
(3 rows)
select name from taggings limit 3;
ERROR: column "name" does not exist
LINE 1: select name from taggings limit 3;
This is a known confusing "feature" with a bit of history. Specifically, you could refer to tuples from the table as a whole with the table name, and then appending .name would invoke the name function on them (i.e. it would be interpreted as select name(t) from t).
At some point during PostgreSQL 9 development, as far as I recall, this was cleaned up a bit. You can still do select t from t explicitly to get the rows-as-tuples effect, but you can't apply a function in the same way. So on PostgreSQL 8.4.9, this:
create table t(id serial primary key, value text not null);
insert into t(value) values('foo');
select t.name from t;
produces the bizarre:
name
---------
(1,foo)
(1 row)
but on 9.1.1 produces:
ERROR: column t.name does not exist
LINE 1: select t.name from t;
^
as you would expect.
So, to specifically answer your question: name is a standard type in PostgreSQL (used in the catalogue for table names etc.), and there are also some standard functions that convert things to the name type. It's not actually reserved; the existing objects with that name, plus some strange historical syntax, made things confusing, and this has been fixed by the developers in recent versions.
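The underlying mechanism, that attribute notation and functional notation are interchangeable for a single row-typed argument, can be reproduced with a user-defined function. The sketch below uses hypothetical names (tag_demo, shout) and is meant only to illustrate why t.name could end up invoking a function called name.

```sql
-- Hypothetical table and function names, used only for this illustration.
CREATE TABLE tag_demo (id int, label text);
INSERT INTO tag_demo VALUES (1, 'soil');

-- A function that takes the table's row type as its single argument.
CREATE FUNCTION shout(tag_demo) RETURNS text
LANGUAGE sql AS
$$ SELECT upper($1.label) $$;

SELECT d        FROM tag_demo d;  -- the whole row as one composite value
SELECT shout(d) FROM tag_demo d;  -- ordinary function call on the row
SELECT d.shout  FROM tag_demo d;  -- the same call written in attribute notation
SELECT label(d) FROM tag_demo d;  -- a real column written in functional notation
```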
According to the PostgreSQL documentation, name is a "non-reserved" keyword in PostgreSQL, SQL:2003, SQL:1999, or SQL-92.
SQL distinguishes between reserved and non-reserved key words. According to the standard, reserved key words are the only real key words; they are never allowed as identifiers. Non-reserved key words only have a special meaning in particular contexts and can be used as identifiers in other contexts. Most non-reserved key words are actually the names of built-in tables and functions specified by SQL. The concept of non-reserved key words essentially only exists to declare that some predefined meaning is attached to a word in some contexts.
The suggested fix when using keywords is:
As a general rule, if you get spurious parser errors for commands that contain any of the listed key words as an identifier you should try to quote the identifier to see if the problem goes away.