How can I get a list of all functions stored in the database of a particular schema in PostgreSQL? - postgresql

I want to be able to connect to a PostgreSQL database and find all of the functions for a particular schema.
My thought was that I could make some query to pg_catalog or information_schema and get a list of all functions, but I can't figure out where the names and parameters are stored. I'm looking for a query that will give me the function name and the parameter types it takes (and what order it takes them in).
Is there a way to do this?

\df <schema>.*
in psql gives the necessary information.
To see the query that's used internally connect to a database with psql and supply an extra "-E" (or "--echo-hidden") option and then execute the above command.

After some searching, I was able to find the information_schema.routines table and the information_schema.parameters tables. Using those, one can construct a query for this purpose. LEFT JOIN, instead of JOIN, is necessary to retrieve functions without parameters.
SELECT routines.routine_name, parameters.data_type, parameters.ordinal_position
FROM information_schema.routines
LEFT JOIN information_schema.parameters ON routines.specific_name=parameters.specific_name
WHERE routines.specific_schema='my_specified_schema_name'
ORDER BY routines.routine_name, parameters.ordinal_position;

If any one is interested here is what query is executed by psql on postgres 9.1:
SELECT n.nspname as "Schema",
p.proname as "Name",
pg_catalog.pg_get_function_result(p.oid) as "Result data type",
pg_catalog.pg_get_function_arguments(p.oid) as "Argument data types",
CASE
WHEN p.proisagg THEN 'agg'
WHEN p.proiswindow THEN 'window'
WHEN p.prorettype = 'pg_catalog.trigger'::pg_catalog.regtype THEN 'trigger'
ELSE 'normal'
END as "Type"
FROM pg_catalog.pg_proc p
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace
WHERE pg_catalog.pg_function_is_visible(p.oid)
AND n.nspname <> 'pg_catalog'
AND n.nspname <> 'information_schema'
ORDER BY 1, 2, 4;
You can get what psql runs for a backslash command by running psql with the -E flag.

There's a handy function, oidvectortypes, that makes this a lot easier.
SELECT format('%I.%I(%s)', ns.nspname, p.proname, oidvectortypes(p.proargtypes))
FROM pg_proc p INNER JOIN pg_namespace ns ON (p.pronamespace = ns.oid)
WHERE ns.nspname = 'my_namespace';
Credit to Leo Hsu and Regina Obe at Postgres Online for pointing out oidvectortypes. I wrote similar functions before, but used complex nested expressions that this function gets rid of the need for.
See related answer.
(edit in 2016)
Summarizing typical report options:
-- Compact:
SELECT format('%I.%I(%s)', ns.nspname, p.proname, oidvectortypes(p.proargtypes))
-- With result data type:
SELECT format(
'%I.%I(%s)=%s',
ns.nspname, p.proname, oidvectortypes(p.proargtypes),
pg_get_function_result(p.oid)
)
-- With complete argument description:
SELECT format('%I.%I(%s)', ns.nspname, p.proname, pg_get_function_arguments(p.oid))
-- ... and mixing it.
-- All with the same FROM clause:
FROM pg_proc p INNER JOIN pg_namespace ns ON (p.pronamespace = ns.oid)
WHERE ns.nspname = 'my_namespace';
NOTICE: use p.proname||'_'||p.oid AS specific_name to obtain unique names, or to JOIN with information_schema tables — see routines and parameters at #RuddZwolinski's answer.
The function's OID (see pg_catalog.pg_proc) and the function's specific_name (see information_schema.routines) are the main reference options to functions. Below, some useful functions in reporting and other contexts.
--- --- --- --- ---
--- Useful overloads:
CREATE FUNCTION oidvectortypes(p_oid int) RETURNS text AS $$
SELECT oidvectortypes(proargtypes) FROM pg_proc WHERE oid=$1;
$$ LANGUAGE SQL IMMUTABLE;
CREATE FUNCTION oidvectortypes(p_specific_name text) RETURNS text AS $$
-- Extract OID from specific_name and use it in oidvectortypes(oid).
SELECT oidvectortypes(proargtypes)
FROM pg_proc WHERE oid=regexp_replace($1, '^.+?([^_]+)$', '\1')::int;
$$ LANGUAGE SQL IMMUTABLE;
CREATE FUNCTION pg_get_function_arguments(p_specific_name text) RETURNS text AS $$
-- Extract OID from specific_name and use it in pg_get_function_arguments.
SELECT pg_get_function_arguments(regexp_replace($1, '^.+?([^_]+)$', '\1')::int)
$$ LANGUAGE SQL IMMUTABLE;
--- --- --- --- ---
--- User customization:
CREATE FUNCTION pg_get_function_arguments2(p_specific_name text) RETURNS text AS $$
-- Example of "special layout" version.
SELECT trim(array_agg( op||'-'||dt )::text,'{}')
FROM (
SELECT data_type::text as dt, ordinal_position as op
FROM information_schema.parameters
WHERE specific_name = p_specific_name
ORDER BY ordinal_position
) t
$$ LANGUAGE SQL IMMUTABLE;

Run below SQL query to create a view which will show all functions:
CREATE OR REPLACE VIEW show_functions AS
SELECT routine_name FROM information_schema.routines
WHERE routine_type='FUNCTION' AND specific_schema='public';

Get List of function_schema and function_name...
SELECT
n.nspname AS function_schema,
p.proname AS function_name
FROM
pg_proc p
LEFT JOIN pg_namespace n ON p.pronamespace = n.oid
WHERE
n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY
function_schema,
function_name;

Is a good idea named the functions with commun alias on the first words for filtre the name with LIKE
Example with public schema in Postgresql 9.4, be sure to replace with his scheme
SELECT routine_name
FROM information_schema.routines
WHERE routine_type='FUNCTION'
AND specific_schema='public'
AND routine_name LIKE 'aliasmyfunctions%';

Example:
perfdb-# \df information_schema.*;
List of functions
Schema | Name | Result data type | Argument data types | Type
information_schema | _pg_char_max_length | integer | typid oid, typmod integer | normal
information_schema | _pg_char_octet_length | integer | typid oid, typmod integer | normal
information_schema | _pg_datetime_precision| integer | typid oid, typmod integer | normal
.....
information_schema | _pg_numeric_scale | integer | typid oid, typmod integer | normal
information_schema | _pg_truetypid | oid | pg_attribute, pg_type | normal
information_schema | _pg_truetypmod | integer | pg_attribute, pg_type | normal
(11 rows)

This function returns all user defined routines in current database.
SELECT pg_get_functiondef(p.oid) FROM pg_proc p
INNER JOIN pg_namespace ns ON p.pronamespace = ns.oid
WHERE ns.nspname = 'public';

The joins in abovementioned answers returns not only input parameters, but also outputs. So it is necessary to specify also parameter_mode. This select will return list of functions with its input parametrs (if having some). Postgres 14.
select r.routine_name, array_agg(p.data_type::text order by p.ordinal_position) from information_schema.routines r left join information_schema.parameters p on r.specific_name = p.specific_name
where r.routine_type = 'FUNCTION' and r.specific_schema = 'schema_name' and (p.parameter_mode = 'IN' or p.parameter_mode is null)
group by r.routine_name order by r.routine_name;

Related

PostgreSQL: How to query all tables with exact number of lines [duplicate]

I'm looking for a way to find the row count for all my tables in Postgres. I know I can do this one table at a time with:
SELECT count(*) FROM table_name;
but I'd like to see the row count for all the tables and then order by that to get an idea of how big all my tables are.
There's three ways to get this sort of count, each with their own tradeoffs.
If you want a true count, you have to execute the SELECT statement like the one you used against each table. This is because PostgreSQL keeps row visibility information in the row itself, not anywhere else, so any accurate count can only be relative to some transaction. You're getting a count of what that transaction sees at the point in time when it executes. You could automate this to run against every table in the database, but you probably don't need that level of accuracy or want to wait that long.
WITH tbl AS
(SELECT table_schema,
TABLE_NAME
FROM information_schema.tables
WHERE TABLE_NAME not like 'pg_%'
AND table_schema in ('public'))
SELECT table_schema,
TABLE_NAME,
(xpath('/row/c/text()', query_to_xml(format('select count(*) as c from %I.%I', table_schema, TABLE_NAME), FALSE, TRUE, '')))[1]::text::int AS rows_n
FROM tbl
ORDER BY rows_n DESC;
The second approach notes that the statistics collector tracks roughly how many rows are "live" (not deleted or obsoleted by later updates) at any time. This value can be off by a bit under heavy activity, but is generally a good estimate:
SELECT schemaname,relname,n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
That can also show you how many rows are dead, which is itself an interesting number to monitor.
The third way is to note that the system ANALYZE command, which is executed by the autovacuum process regularly as of PostgreSQL 8.3 to update table statistics, also computes a row estimate. You can grab that one like this:
SELECT
nspname AS schemaname,relname,reltuples
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE
nspname NOT IN ('pg_catalog', 'information_schema') AND
relkind='r'
ORDER BY reltuples DESC;
Which of these queries is better to use is hard to say. Normally I make that decision based on whether there's more useful information I also want to use inside of pg_class or inside of pg_stat_user_tables. For basic counting purposes just to see how big things are in general, either should be accurate enough.
Here is a solution that does not require functions to get an accurate count for each table:
select table_schema,
table_name,
(xpath('/row/cnt/text()', xml_count))[1]::text::int as row_count
from (
select table_name, table_schema,
query_to_xml(format('select count(*) as cnt from %I.%I', table_schema, table_name), false, true, '') as xml_count
from information_schema.tables
where table_schema = 'public' --<< change here for the schema you want
) t
query_to_xml will run the passed SQL query and return an XML with the result (the row count for that table). The outer xpath() will then extract the count information from that xml and convert it to a number
The derived table is not really necessary, but makes the xpath() a bit easier to understand - otherwise the whole query_to_xml() would need to be passed to the xpath() function.
To get estimates, see Greg Smith's answer.
To get exact counts, the other answers so far are plagued with some issues, some of them serious (see below). Here's a version that's hopefully better:
CREATE FUNCTION rowcount_all(schema_name text default 'public')
RETURNS table(table_name text, cnt bigint) as
$$
declare
table_name text;
begin
for table_name in SELECT c.relname FROM pg_class c
JOIN pg_namespace s ON (c.relnamespace=s.oid)
WHERE c.relkind = 'r' AND s.nspname=schema_name
LOOP
RETURN QUERY EXECUTE format('select cast(%L as text),count(*) from %I.%I',
table_name, schema_name, table_name);
END LOOP;
end
$$ language plpgsql;
It takes a schema name as parameter, or public if no parameter is given.
To work with a specific list of schemas or a list coming from a query without modifying the function, it can be called from within a query like this:
WITH rc(schema_name,tbl) AS (
select s.n,rowcount_all(s.n) from (values ('schema1'),('schema2')) as s(n)
)
SELECT schema_name,(tbl).* FROM rc;
This produces a 3-columns output with the schema, the table and the rows count.
Now here are some issues in the other answers that this function avoids:
Table and schema names shouldn't be injected into executable SQL without being quoted, either with quote_ident or with the more modern format() function with its %I format string. Otherwise some malicious person may name their table tablename;DROP TABLE other_table which is perfectly valid as a table name.
Even without the SQL injection and funny characters problems, table name may exist in variants differing by case. If a table is named ABCD and another one abcd, the SELECT count(*) FROM... must use a quoted name otherwise it will skip ABCD and count abcd twice. The %I of format does this automatically.
information_schema.tables lists custom composite types in addition to tables, even when table_type is 'BASE TABLE' (!). As a consequence, we can't iterate oninformation_schema.tables, otherwise we risk having select count(*) from name_of_composite_type and that would fail. OTOH pg_class where relkind='r' should always work fine.
The type of COUNT() is bigint, not int. Tables with more than 2.15 billion rows may exist (running a count(*) on them is a bad idea, though).
A permanent type need not to be created for a function to return a resultset with several columns. RETURNS TABLE(definition...) is a better alternative.
The hacky, practical answer for people trying to evaluate which Heroku plan they need and can't wait for heroku's slow row counter to refresh:
Basically you want to run \dt in psql, copy the results to your favorite text editor (it will look like this:
public | auth_group | table | axrsosvelhutvw
public | auth_group_permissions | table | axrsosvelhutvw
public | auth_permission | table | axrsosvelhutvw
public | auth_user | table | axrsosvelhutvw
public | auth_user_groups | table | axrsosvelhutvw
public | auth_user_user_permissions | table | axrsosvelhutvw
public | background_task | table | axrsosvelhutvw
public | django_admin_log | table | axrsosvelhutvw
public | django_content_type | table | axrsosvelhutvw
public | django_migrations | table | axrsosvelhutvw
public | django_session | table | axrsosvelhutvw
public | exercises_assignment | table | axrsosvelhutvw
), then run a regex search and replace like this:
^[^|]*\|\s+([^|]*?)\s+\| table \|.*$
to:
select '\1', count(*) from \1 union/g
which will yield you something very similar to this:
select 'auth_group', count(*) from auth_group union
select 'auth_group_permissions', count(*) from auth_group_permissions union
select 'auth_permission', count(*) from auth_permission union
select 'auth_user', count(*) from auth_user union
select 'auth_user_groups', count(*) from auth_user_groups union
select 'auth_user_user_permissions', count(*) from auth_user_user_permissions union
select 'background_task', count(*) from background_task union
select 'django_admin_log', count(*) from django_admin_log union
select 'django_content_type', count(*) from django_content_type union
select 'django_migrations', count(*) from django_migrations union
select 'django_session', count(*) from django_session
;
(You'll need to remove the last union and add the semicolon at the end manually)
Run it in psql and you're done.
?column? | count
--------------------------------+-------
auth_group_permissions | 0
auth_user_user_permissions | 0
django_session | 1306
django_content_type | 17
auth_user_groups | 162
django_admin_log | 9106
django_migrations | 19
[..]
If you don't mind potentially stale data, you can access the same statistics used by the query optimizer.
Something like:
SELECT relname, n_tup_ins - n_tup_del as rowcount FROM pg_stat_all_tables;
Simple Two Steps: (Note : No need to change anything - just copy paste)
1. create function
create function
cnt_rows(schema text, tablename text) returns integer
as
$body$
declare
result integer;
query varchar;
begin
query := 'SELECT count(1) FROM ' || schema || '.' || tablename;
execute query into result;
return result;
end;
$body$
language plpgsql;
2. Run this query to get rows count for all the tables
select sum(cnt_rows) as total_no_of_rows from (select
cnt_rows(table_schema, table_name)
from information_schema.tables
where
table_schema not in ('pg_catalog', 'information_schema')
and table_type='BASE TABLE') as subq;
or
To get rows counts tablewise
select
table_schema,
table_name,
cnt_rows(table_schema, table_name)
from information_schema.tables
where
table_schema not in ('pg_catalog', 'information_schema')
and table_type='BASE TABLE'
order by 3 desc;
Not sure if an answer in bash is acceptable to you, but FWIW...
PGCOMMAND=" psql -h localhost -U fred -d mydb -At -c \"
SELECT table_name
FROM information_schema.tables
WHERE table_type='BASE TABLE'
AND table_schema='public'
\""
TABLENAMES=$(export PGPASSWORD=test; eval "$PGCOMMAND")
for TABLENAME in $TABLENAMES; do
PGCOMMAND=" psql -h localhost -U fred -d mydb -At -c \"
SELECT '$TABLENAME',
count(*)
FROM $TABLENAME
\""
eval "$PGCOMMAND"
done
Extracted from my Comment in the answer from GregSmith to make it more readable:
with tbl as (
SELECT table_schema,table_name
FROM information_schema.tables
WHERE table_name not like 'pg_%' AND table_schema IN ('public')
)
SELECT
table_schema,
table_name,
(xpath('/row/c/text()',
query_to_xml(format('select count(*) AS c from %I.%I', table_schema, table_name),
false,
true,
'')))[1]::text::int AS rows_n
FROM tbl ORDER BY 3 DESC;
Thanks to #a_horse_with_no_name
I usually don't rely on statistics, especially in PostgreSQL.
SELECT table_name, dsql2('select count(*) from '||table_name) as rownum
FROM information_schema.tables
WHERE table_type='BASE TABLE'
AND table_schema='livescreen'
ORDER BY 2 DESC;
CREATE OR REPLACE FUNCTION dsql2(i_text text)
RETURNS int AS
$BODY$
Declare
v_val int;
BEGIN
execute i_text into v_val;
return v_val;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
This worked for me
SELECT schemaname,relname,n_live_tup FROM pg_stat_user_tables ORDER BY
n_live_tup DESC;
I don't remember the URL from where I collected this. But hope this should help you:
CREATE TYPE table_count AS (table_name TEXT, num_rows INTEGER);
CREATE OR REPLACE FUNCTION count_em_all () RETURNS SETOF table_count AS '
DECLARE
the_count RECORD;
t_name RECORD;
r table_count%ROWTYPE;
BEGIN
FOR t_name IN
SELECT
c.relname
FROM
pg_catalog.pg_class c LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE
c.relkind = ''r''
AND n.nspname = ''public''
ORDER BY 1
LOOP
FOR the_count IN EXECUTE ''SELECT COUNT(*) AS "count" FROM '' || t_name.relname
LOOP
END LOOP;
r.table_name := t_name.relname;
r.num_rows := the_count.count;
RETURN NEXT r;
END LOOP;
RETURN;
END;
' LANGUAGE plpgsql;
Executing select count_em_all(); should get you row count of all your tables.
I made a small variation to include all tables, also for non-public tables.
CREATE TYPE table_count AS (table_schema TEXT,table_name TEXT, num_rows INTEGER);
CREATE OR REPLACE FUNCTION count_em_all () RETURNS SETOF table_count AS '
DECLARE
the_count RECORD;
t_name RECORD;
r table_count%ROWTYPE;
BEGIN
FOR t_name IN
SELECT table_schema,table_name
FROM information_schema.tables
where table_schema !=''pg_catalog''
and table_schema !=''information_schema''
ORDER BY 1,2
LOOP
FOR the_count IN EXECUTE ''SELECT COUNT(*) AS "count" FROM '' || t_name.table_schema||''.''||t_name.table_name
LOOP
END LOOP;
r.table_schema := t_name.table_schema;
r.table_name := t_name.table_name;
r.num_rows := the_count.count;
RETURN NEXT r;
END LOOP;
RETURN;
END;
' LANGUAGE plpgsql;
use select count_em_all(); to call it.
Hope you find this usefull.
Paul
You Can use this query to generate all tablenames with their counts
select ' select '''|| tablename ||''', count(*) from ' || tablename ||'
union' from pg_tables where schemaname='public';
the result from the above query will be
select 'dim_date', count(*) from dim_date union
select 'dim_store', count(*) from dim_store union
select 'dim_product', count(*) from dim_product union
select 'dim_employee', count(*) from dim_employee union
You'll need to remove the last union and add the semicolon at the end !!
select 'dim_date', count(*) from dim_date union
select 'dim_store', count(*) from dim_store union
select 'dim_product', count(*) from dim_product union
select 'dim_employee', count(*) from dim_employee **;**
RUN !!!
Here is a much simpler way.
tables="$(echo '\dt' | psql -U "${PGUSER}" | tail -n +4 | head -n-2 | tr -d ' ' | cut -d '|' -f2)"
for table in $tables; do
printf "%s: %s\n" "$table" "$(echo "SELECT COUNT(*) FROM $table;" | psql -U "${PGUSER}" | tail -n +3 | head -n-2 | tr -d ' ')"
done
output should look like this
auth_group: 0
auth_group_permissions: 0
auth_permission: 36
auth_user: 2
auth_user_groups: 0
auth_user_user_permissions: 0
authtoken_token: 2
django_admin_log: 0
django_content_type: 9
django_migrations: 22
django_session: 0
mydata_table1: 9011
mydata_table2: 3499
you can update the psql -U "${PGUSER}" portion as needed to access your database
note that the head -n-2 syntax may not work in macOS, you could probably just use a different implementation there
Tested on psql (PostgreSQL) 11.2 under CentOS 7
if you want it sorted by table, then just wrap it with sort
for table in $tables; do
printf "%s: %s\n" "$table" "$(echo "SELECT COUNT(*) FROM $table;" | psql -U "${PGUSER}" | tail -n +3 | head -n-2 | tr -d ' ')"
done | sort -k 2,2nr
output;
mydata_table1: 9011
mydata_table2: 3499
auth_permission: 36
django_migrations: 22
django_content_type: 9
authtoken_token: 2
auth_user: 2
auth_group: 0
auth_group_permissions: 0
auth_user_groups: 0
auth_user_user_permissions: 0
django_admin_log: 0
django_session: 0
I like Daniel Vérité's answer.
But when you can't use a CREATE statement you can either use a bash solution or, if you're a windows user, a powershell one:
# You don't need this if you have pgpass.conf
$env:PGPASSWORD = "userpass"
# Get table list
$tables = & 'C:\Program Files\PostgreSQL\9.4\bin\psql.exe' -U user -w -d dbname -At -c "select table_name from information_schema.tables where table_type='BASE TABLE' AND table_schema='schema1'"
foreach ($table in $tables) {
& 'C:\path_to_postresql\bin\psql.exe' -U root -w -d dbname -At -c "select '$table', count(*) from $table"
}
I wanted the total from all tables + a list of tables with their counts. A little like a performance chart of where most time was spent
WITH results AS (
SELECT nspname AS schemaname,relname,reltuples
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE
nspname NOT IN ('pg_catalog', 'information_schema') AND
relkind='r'
GROUP BY schemaname, relname, reltuples
)
SELECT * FROM results
UNION
SELECT 'all' AS schemaname, 'all' AS relname, SUM(reltuples) AS "reltuples" FROM results
ORDER BY reltuples DESC
You could of course put a LIMIT clause on the results in this version too so that you get the largest n offenders as well as a total.
One thing that should be noted about this is that you need to let it sit for a while after bulk imports. I tested this by just adding 5000 rows to a database across several tables using real import data. It showed 1800 records for about a minute (probably a configurable window)
This is based from https://stackoverflow.com/a/2611745/1548557 work, so thank you and recognition to that for the query to use within the CTE
If you're in the psql shell, using \gexec allows you to execute the syntax described in syed's answer and Aur's answer without manual edits in an external text editor.
with x (y) as (
select
'select count(*), '''||
tablename||
''' as "tablename" from '||
tablename||' '
from pg_tables
where schemaname='public'
)
select
string_agg(y,' union all '||chr(10)) || ' order by tablename'
from x \gexec
Note, string_agg() is used both to delimit union all between statements and to smush the separated datarows into a single unit to be passed into the buffer.
\gexec
Sends the current query buffer to the server, then treats each column of each row of the query's output (if any) as a SQL statement to be executed.
below query will give us row count and size for each table
select table_schema, table_name,
pg_relation_size('"'||table_schema||'"."'||table_name||'"')/1024/1024 size_MB,
(xpath('/row/c/text()', query_to_xml(format('select count(*) AS c from %I.%I', table_schema, table_name),
false, true,'')))[1]::text::int AS rows_n
from information_schema.tables
order by size_MB desc;

Postgres get the exact row count of all tables without using vacuum [duplicate]

I'm looking for a way to find the row count for all my tables in Postgres. I know I can do this one table at a time with:
SELECT count(*) FROM table_name;
but I'd like to see the row count for all the tables and then order by that to get an idea of how big all my tables are.
There's three ways to get this sort of count, each with their own tradeoffs.
If you want a true count, you have to execute the SELECT statement like the one you used against each table. This is because PostgreSQL keeps row visibility information in the row itself, not anywhere else, so any accurate count can only be relative to some transaction. You're getting a count of what that transaction sees at the point in time when it executes. You could automate this to run against every table in the database, but you probably don't need that level of accuracy or want to wait that long.
WITH tbl AS
(SELECT table_schema,
TABLE_NAME
FROM information_schema.tables
WHERE TABLE_NAME not like 'pg_%'
AND table_schema in ('public'))
SELECT table_schema,
TABLE_NAME,
(xpath('/row/c/text()', query_to_xml(format('select count(*) as c from %I.%I', table_schema, TABLE_NAME), FALSE, TRUE, '')))[1]::text::int AS rows_n
FROM tbl
ORDER BY rows_n DESC;
The second approach notes that the statistics collector tracks roughly how many rows are "live" (not deleted or obsoleted by later updates) at any time. This value can be off by a bit under heavy activity, but is generally a good estimate:
SELECT schemaname,relname,n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
That can also show you how many rows are dead, which is itself an interesting number to monitor.
The third way is to note that the system ANALYZE command, which is executed by the autovacuum process regularly as of PostgreSQL 8.3 to update table statistics, also computes a row estimate. You can grab that one like this:
SELECT
nspname AS schemaname,relname,reltuples
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE
nspname NOT IN ('pg_catalog', 'information_schema') AND
relkind='r'
ORDER BY reltuples DESC;
Which of these queries is better to use is hard to say. Normally I make that decision based on whether there's more useful information I also want to use inside of pg_class or inside of pg_stat_user_tables. For basic counting purposes just to see how big things are in general, either should be accurate enough.
Here is a solution that does not require functions to get an accurate count for each table:
select table_schema,
table_name,
(xpath('/row/cnt/text()', xml_count))[1]::text::int as row_count
from (
select table_name, table_schema,
query_to_xml(format('select count(*) as cnt from %I.%I', table_schema, table_name), false, true, '') as xml_count
from information_schema.tables
where table_schema = 'public' --<< change here for the schema you want
) t
query_to_xml will run the passed SQL query and return an XML with the result (the row count for that table). The outer xpath() will then extract the count information from that xml and convert it to a number
The derived table is not really necessary, but makes the xpath() a bit easier to understand - otherwise the whole query_to_xml() would need to be passed to the xpath() function.
To get estimates, see Greg Smith's answer.
To get exact counts, the other answers so far are plagued with some issues, some of them serious (see below). Here's a version that's hopefully better:
CREATE FUNCTION rowcount_all(schema_name text default 'public')
RETURNS table(table_name text, cnt bigint) as
$$
declare
table_name text;
begin
for table_name in SELECT c.relname FROM pg_class c
JOIN pg_namespace s ON (c.relnamespace=s.oid)
WHERE c.relkind = 'r' AND s.nspname=schema_name
LOOP
RETURN QUERY EXECUTE format('select cast(%L as text),count(*) from %I.%I',
table_name, schema_name, table_name);
END LOOP;
end
$$ language plpgsql;
It takes a schema name as parameter, or public if no parameter is given.
To work with a specific list of schemas or a list coming from a query without modifying the function, it can be called from within a query like this:
WITH rc(schema_name,tbl) AS (
select s.n,rowcount_all(s.n) from (values ('schema1'),('schema2')) as s(n)
)
SELECT schema_name,(tbl).* FROM rc;
This produces a 3-columns output with the schema, the table and the rows count.
Now here are some issues in the other answers that this function avoids:
Table and schema names shouldn't be injected into executable SQL without being quoted, either with quote_ident or with the more modern format() function with its %I format string. Otherwise some malicious person may name their table tablename;DROP TABLE other_table which is perfectly valid as a table name.
Even without the SQL injection and funny characters problems, table name may exist in variants differing by case. If a table is named ABCD and another one abcd, the SELECT count(*) FROM... must use a quoted name otherwise it will skip ABCD and count abcd twice. The %I of format does this automatically.
information_schema.tables lists custom composite types in addition to tables, even when table_type is 'BASE TABLE' (!). As a consequence, we can't iterate oninformation_schema.tables, otherwise we risk having select count(*) from name_of_composite_type and that would fail. OTOH pg_class where relkind='r' should always work fine.
The type of COUNT() is bigint, not int. Tables with more than 2.15 billion rows may exist (running a count(*) on them is a bad idea, though).
A permanent type need not to be created for a function to return a resultset with several columns. RETURNS TABLE(definition...) is a better alternative.
The hacky, practical answer for people trying to evaluate which Heroku plan they need and can't wait for heroku's slow row counter to refresh:
Basically you want to run \dt in psql, copy the results to your favorite text editor (it will look like this:
public | auth_group | table | axrsosvelhutvw
public | auth_group_permissions | table | axrsosvelhutvw
public | auth_permission | table | axrsosvelhutvw
public | auth_user | table | axrsosvelhutvw
public | auth_user_groups | table | axrsosvelhutvw
public | auth_user_user_permissions | table | axrsosvelhutvw
public | background_task | table | axrsosvelhutvw
public | django_admin_log | table | axrsosvelhutvw
public | django_content_type | table | axrsosvelhutvw
public | django_migrations | table | axrsosvelhutvw
public | django_session | table | axrsosvelhutvw
public | exercises_assignment | table | axrsosvelhutvw
), then run a regex search and replace like this:
^[^|]*\|\s+([^|]*?)\s+\| table \|.*$
to:
select '\1', count(*) from \1 union/g
which will yield you something very similar to this:
select 'auth_group', count(*) from auth_group union
select 'auth_group_permissions', count(*) from auth_group_permissions union
select 'auth_permission', count(*) from auth_permission union
select 'auth_user', count(*) from auth_user union
select 'auth_user_groups', count(*) from auth_user_groups union
select 'auth_user_user_permissions', count(*) from auth_user_user_permissions union
select 'background_task', count(*) from background_task union
select 'django_admin_log', count(*) from django_admin_log union
select 'django_content_type', count(*) from django_content_type union
select 'django_migrations', count(*) from django_migrations union
select 'django_session', count(*) from django_session
;
(You'll need to remove the last union and add the semicolon at the end manually)
Run it in psql and you're done.
?column? | count
--------------------------------+-------
auth_group_permissions | 0
auth_user_user_permissions | 0
django_session | 1306
django_content_type | 17
auth_user_groups | 162
django_admin_log | 9106
django_migrations | 19
[..]
If you don't mind potentially stale data, you can access the same statistics used by the query optimizer.
Something like:
SELECT relname, n_tup_ins - n_tup_del as rowcount FROM pg_stat_all_tables;
Simple Two Steps: (Note : No need to change anything - just copy paste)
1. create function
create function
cnt_rows(schema text, tablename text) returns integer
as
$body$
declare
result integer;
query varchar;
begin
query := 'SELECT count(1) FROM ' || schema || '.' || tablename;
execute query into result;
return result;
end;
$body$
language plpgsql;
2. Run this query to get rows count for all the tables
select sum(cnt_rows) as total_no_of_rows from (select
cnt_rows(table_schema, table_name)
from information_schema.tables
where
table_schema not in ('pg_catalog', 'information_schema')
and table_type='BASE TABLE') as subq;
or
To get rows counts tablewise
select
table_schema,
table_name,
cnt_rows(table_schema, table_name)
from information_schema.tables
where
table_schema not in ('pg_catalog', 'information_schema')
and table_type='BASE TABLE'
order by 3 desc;
Not sure if an answer in bash is acceptable to you, but FWIW...
PGCOMMAND=" psql -h localhost -U fred -d mydb -At -c \"
SELECT table_name
FROM information_schema.tables
WHERE table_type='BASE TABLE'
AND table_schema='public'
\""
TABLENAMES=$(export PGPASSWORD=test; eval "$PGCOMMAND")
for TABLENAME in $TABLENAMES; do
PGCOMMAND=" psql -h localhost -U fred -d mydb -At -c \"
SELECT '$TABLENAME',
count(*)
FROM $TABLENAME
\""
eval "$PGCOMMAND"
done
Extracted from my Comment in the answer from GregSmith to make it more readable:
with tbl as (
SELECT table_schema,table_name
FROM information_schema.tables
WHERE table_name not like 'pg_%' AND table_schema IN ('public')
)
SELECT
table_schema,
table_name,
(xpath('/row/c/text()',
query_to_xml(format('select count(*) AS c from %I.%I', table_schema, table_name),
false,
true,
'')))[1]::text::int AS rows_n
FROM tbl ORDER BY 3 DESC;
Thanks to #a_horse_with_no_name
I usually don't rely on statistics, especially in PostgreSQL.
SELECT table_name, dsql2('select count(*) from '||table_name) as rownum
FROM information_schema.tables
WHERE table_type='BASE TABLE'
AND table_schema='livescreen'
ORDER BY 2 DESC;
CREATE OR REPLACE FUNCTION dsql2(i_text text)
RETURNS int AS
$BODY$
Declare
v_val int;
BEGIN
execute i_text into v_val;
return v_val;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
This worked for me
SELECT schemaname,relname,n_live_tup FROM pg_stat_user_tables ORDER BY
n_live_tup DESC;
I don't remember the URL from where I collected this. But hope this should help you:
CREATE TYPE table_count AS (table_name TEXT, num_rows INTEGER);
CREATE OR REPLACE FUNCTION count_em_all () RETURNS SETOF table_count AS '
DECLARE
the_count RECORD;
t_name RECORD;
r table_count%ROWTYPE;
BEGIN
FOR t_name IN
SELECT
c.relname
FROM
pg_catalog.pg_class c LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE
c.relkind = ''r''
AND n.nspname = ''public''
ORDER BY 1
LOOP
FOR the_count IN EXECUTE ''SELECT COUNT(*) AS "count" FROM '' || t_name.relname
LOOP
END LOOP;
r.table_name := t_name.relname;
r.num_rows := the_count.count;
RETURN NEXT r;
END LOOP;
RETURN;
END;
' LANGUAGE plpgsql;
Executing select count_em_all(); should get you row count of all your tables.
I made a small variation to include all tables, also for non-public tables.
CREATE TYPE table_count AS (table_schema TEXT,table_name TEXT, num_rows INTEGER);
CREATE OR REPLACE FUNCTION count_em_all () RETURNS SETOF table_count AS '
DECLARE
the_count RECORD;
t_name RECORD;
r table_count%ROWTYPE;
BEGIN
FOR t_name IN
SELECT table_schema,table_name
FROM information_schema.tables
where table_schema !=''pg_catalog''
and table_schema !=''information_schema''
ORDER BY 1,2
LOOP
FOR the_count IN EXECUTE ''SELECT COUNT(*) AS "count" FROM '' || t_name.table_schema||''.''||t_name.table_name
LOOP
END LOOP;
r.table_schema := t_name.table_schema;
r.table_name := t_name.table_name;
r.num_rows := the_count.count;
RETURN NEXT r;
END LOOP;
RETURN;
END;
' LANGUAGE plpgsql;
use select count_em_all(); to call it.
Hope you find this usefull.
Paul
You Can use this query to generate all tablenames with their counts
select ' select '''|| tablename ||''', count(*) from ' || tablename ||'
union' from pg_tables where schemaname='public';
the result from the above query will be
select 'dim_date', count(*) from dim_date union
select 'dim_store', count(*) from dim_store union
select 'dim_product', count(*) from dim_product union
select 'dim_employee', count(*) from dim_employee union
You'll need to remove the last union and add the semicolon at the end !!
select 'dim_date', count(*) from dim_date union
select 'dim_store', count(*) from dim_store union
select 'dim_product', count(*) from dim_product union
select 'dim_employee', count(*) from dim_employee **;**
RUN !!!
Here is a much simpler way.
tables="$(echo '\dt' | psql -U "${PGUSER}" | tail -n +4 | head -n-2 | tr -d ' ' | cut -d '|' -f2)"
for table in $tables; do
printf "%s: %s\n" "$table" "$(echo "SELECT COUNT(*) FROM $table;" | psql -U "${PGUSER}" | tail -n +3 | head -n-2 | tr -d ' ')"
done
output should look like this
auth_group: 0
auth_group_permissions: 0
auth_permission: 36
auth_user: 2
auth_user_groups: 0
auth_user_user_permissions: 0
authtoken_token: 2
django_admin_log: 0
django_content_type: 9
django_migrations: 22
django_session: 0
mydata_table1: 9011
mydata_table2: 3499
you can update the psql -U "${PGUSER}" portion as needed to access your database
note that the head -n-2 syntax may not work in macOS, you could probably just use a different implementation there
Tested on psql (PostgreSQL) 11.2 under CentOS 7
if you want it sorted by table, then just wrap it with sort
for table in $tables; do
printf "%s: %s\n" "$table" "$(echo "SELECT COUNT(*) FROM $table;" | psql -U "${PGUSER}" | tail -n +3 | head -n-2 | tr -d ' ')"
done | sort -k 2,2nr
output;
mydata_table1: 9011
mydata_table2: 3499
auth_permission: 36
django_migrations: 22
django_content_type: 9
authtoken_token: 2
auth_user: 2
auth_group: 0
auth_group_permissions: 0
auth_user_groups: 0
auth_user_user_permissions: 0
django_admin_log: 0
django_session: 0
I like Daniel Vérité's answer.
But when you can't use a CREATE statement you can either use a bash solution or, if you're a windows user, a powershell one:
# You don't need this if you have pgpass.conf
$env:PGPASSWORD = "userpass"
# Get table list
$tables = & 'C:\Program Files\PostgreSQL\9.4\bin\psql.exe' -U user -w -d dbname -At -c "select table_name from information_schema.tables where table_type='BASE TABLE' AND table_schema='schema1'"
foreach ($table in $tables) {
& 'C:\path_to_postresql\bin\psql.exe' -U root -w -d dbname -At -c "select '$table', count(*) from $table"
}
I wanted the total from all tables + a list of tables with their counts. A little like a performance chart of where most time was spent
WITH results AS (
SELECT nspname AS schemaname,relname,reltuples
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE
nspname NOT IN ('pg_catalog', 'information_schema') AND
relkind='r'
GROUP BY schemaname, relname, reltuples
)
SELECT * FROM results
UNION
SELECT 'all' AS schemaname, 'all' AS relname, SUM(reltuples) AS "reltuples" FROM results
ORDER BY reltuples DESC
You could of course put a LIMIT clause on the results in this version too so that you get the largest n offenders as well as a total.
One thing that should be noted about this is that you need to let it sit for a while after bulk imports. I tested this by just adding 5000 rows to a database across several tables using real import data. It showed 1800 records for about a minute (probably a configurable window)
This is based from https://stackoverflow.com/a/2611745/1548557 work, so thank you and recognition to that for the query to use within the CTE
If you're in the psql shell, using \gexec allows you to execute the syntax described in syed's answer and Aur's answer without manual edits in an external text editor.
with x (y) as (
select
'select count(*), '''||
tablename||
''' as "tablename" from '||
tablename||' '
from pg_tables
where schemaname='public'
)
select
string_agg(y,' union all '||chr(10)) || ' order by tablename'
from x \gexec
Note, string_agg() is used both to delimit union all between statements and to smush the separated datarows into a single unit to be passed into the buffer.
\gexec
Sends the current query buffer to the server, then treats each column of each row of the query's output (if any) as a SQL statement to be executed.
below query will give us row count and size for each table
select table_schema, table_name,
pg_relation_size('"'||table_schema||'"."'||table_name||'"')/1024/1024 size_MB,
(xpath('/row/c/text()', query_to_xml(format('select count(*) AS c from %I.%I', table_schema, table_name),
false, true,'')))[1]::text::int AS rows_n
from information_schema.tables
order by size_MB desc;

Why is pg_table_def table returning tablenames containing the text "_$mig"

I am using pg_table_def to work out the primary keys associated to a given table, example below:
SELECT pg_table_def.tablename,
pg_table_def.column
FROM pg_table_def
WHERE schemaname = 'myschema'
AND tablename LIKE '%mytable%'
AND tablename LIKE '%_pkey%';
Example output from the above query:
tablename column
mytable_$mig_pkey id
This is correctly identifying the primary keys, but in some cases the format of the tablename returned has the text _$mig in it, for example: mytable_$mig_pkey.
I use the tablename in a later part of the code so currently have to remove this text. My questions are:
Why for some tables is the tablename formatted like this with _$mig?
Are there other strings that could appear between tablename and _pkey that should be taken into consideration?
Or is there a better way to identify the primary keys associated to a
table?
I've so far tried:
Checking the AWS documentation it recommends to check the search_path, so I've done this and made sure the schema is referenced in the where clause.
Checked that mytable_$mig doesn't exist / isn't available.
Other googling hasn't got me any further...
Update
Based on the comments below, running the below query:
select tablename, "column", type, encoding, distkey, sortkey, "notnull"
from pg_table_def
where schemaname = 'myschema' AND tablename LIKE '%mytable%';
Returns:
tablename column type encoding distkey sortkey notnull
mytable id integer none true 1 true
mytable desc character varying(50) lzo false 0 false
mytable_$mig_pkey id integer none true 1 false
Update:
So it looks like you have your actual table and one that was probably created accidentally. If you do select count(*) from mytable and select count(*) from mytable_$mig_pkey;, hopefully mytable has all your data and the other one is
empty. If so you can clean it up with drop table mytable_$mig_pkey.
I see you have the id column as the distkey which is Redshift's version of a primary key. You may also be trying to define constraints, which Redshift will allow you to do, but they are not enforced.
You can view the defined table constraints for a table like so:
select * from information_schema.table_constraints
where table_name='mytable';
On one of my tables gives a result like:
constraint_catalog | constraint_schema | constraint_name | table_catalog | table_schema | table_name | constraint_type | is_deferrable | initially_deferred
--------------------+-------------------+-----------------+---------------+--------------+------------+-----------------+---------------+--------------------
mydb | myschema | mytable_pkey | mydb | myschema | mytable | PRIMARY KEY | NO | NO
(1 row)
My current guess is that you attempted to define constraints on your table and accidentally created a second table at some point, which is why the other table has _pkey in its name.
I use a query like this myself:
select tablename, "column", type, encoding, distkey, sortkey, "notnull"
from pg_table_def
where schemaname = 'myschema' AND tablename LIKE '%mytable%';
Running something more general like that might help you figure out what's going on.
The \d I referenced in the comments is a way to list all the tables in the database that is a convenience function built into the psql command line app. The query it runs to do that is this:
SELECT n.nspname as "Schema",
c.relname as "Name",
CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm' THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' END as "Type",
pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r','v','m','S','f','')
AND n.nspname <> 'pg_catalog'
AND n.nspname <> 'information_schema'
AND n.nspname !~ '^pg_toast'
AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;
Running that might help.

Get complex output type of function that returns record in postgresql

I want to write nice and detailed report on functions in my postgresql database.
I built the following query:
SELECT routine_name, data_type, proargnames
FROM information_schema.routines
join pg_catalog.pg_proc on pg_catalog.pg_proc.proname = information_schema.routines.routine_name
WHERE specific_schema = 'public'
ORDER BY routine_name;
It works as it should (basically returns me what I want it to: function name, output data type and input data type) except one thing:
I have relatively complicated functions and many of them return record.
The thing is, data_type returns me record as well for such functions, while I want detailed list of function output types.
For instance, I have something like this in one of my functions:
RETURNS TABLE("Res" integer, "Output" character varying) AS
How can I make query above (or, perhaps, a new query, if it will solve the problem) return something like
integer, character varying instead of record for such functions?
I am using postgresql 9.2
Thanks in advance!
The RECORD returned value is evaluated at runtime, there is no way that the information can be retrieved this way.
BUT, if RETURNS TABLE("Res" integer, "Output" character varying) AS is used, there is a solution.
The test functions I used:
-- first function, uses RETURNS TABLE
CREATE FUNCTION test_ret(a TEXT, b TEXT)
RETURNS TABLE("Res" integer, "Output" character varying) AS $$
DECLARE
ret RECORD;
BEGIN
-- test
END;$$ LANGUAGE plpgsql;
-- second function, test some edge cases
-- same name as above, returns simple integer
CREATE FUNCTION test_ret(a TEXT)
RETURNS INTEGER AS $$
DECLARE
ret RECORD;
BEGIN
-- test
END;$$ LANGUAGE plpgsql;
How to retrieve this function return datatype is easy as it's stored into pg_catalog.pg_proc.proallargtypes, the problem is that this is an array of OID. We must unnest this thing and join it to pg_catalog.pg_types.oid.
-- edit: add support for function not returning tables, thx Tommaso Di Bucchianico
WITH pg_proc_with_unnested_proallargtypes AS (
SELECT
pg_catalog.pg_proc.oid,
pg_catalog.pg_proc.proname,
CASE WHEN proallargtypes IS NOT NULL THEN unnest(proallargtypes) ELSE null END AS proallargtype
FROM pg_catalog.pg_proc
JOIN pg_catalog.pg_namespace ON pg_catalog.pg_proc.pronamespace = pg_catalog.pg_namespace.oid
WHERE pg_catalog.pg_namespace.nspname = 'public'
),
pg_proc_with_proallargtypes_names AS (
SELECT
pg_proc_with_unnested_proallargtypes.oid,
pg_proc_with_unnested_proallargtypes.proname,
array_agg(pg_catalog.pg_type.typname) AS proallargtypes
FROM pg_proc_with_unnested_proallargtypes
LEFT JOIN pg_catalog.pg_type ON pg_catalog.pg_type.oid = proallargtype
GROUP BY
pg_proc_with_unnested_proallargtypes.oid,
pg_proc_with_unnested_proallargtypes.proname
)
SELECT
information_schema.routines.specific_name,
information_schema.routines.routine_name,
information_schema.routines.routine_schema,
information_schema.routines.data_type,
pg_proc_with_proallargtypes_names.proallargtypes
FROM information_schema.routines
-- we can declare many function with the same name and schema as long as arg types are different
-- This is the only right way to join pg_catalog.pg_proc and information_schema.routines, sadly
JOIN pg_proc_with_proallargtypes_names
ON pg_proc_with_proallargtypes_names.proname || '_' || pg_proc_with_proallargtypes_names.oid = information_schema.routines.specific_name
;
Any refactoring is welcome :)
Here is the result:
specific_name | routine_name | routine_schema | data_type | proallargtypes
----------------+--------------+----------------+-----------+--------------------------
test_ret_16633 | test_ret | public | record | {text,text,int4,varchar}
test_ret_16635 | test_ret | public | integer | {NULL}
(2 rows)
EDIT
Identification of input and output arguments is not trivial, here is my solution for pg 9.2
-- https://gist.github.com/subssn21/e9e121f6fd5ff50f688d
-- Allow us to use array_remove in pg < 9.3
CREATE OR REPLACE FUNCTION array_remove(a ANYARRAY, e ANYELEMENT)
RETURNS ANYARRAY AS $$
BEGIN
RETURN array(SELECT x FROM unnest(a) x WHERE x <> e);
END;
$$ LANGUAGE plpgsql;
-- edit: add support for function not returning tables, thx Tommaso Di Bucchianico
WITH pg_proc_with_unnested_proallargtypes AS (
SELECT
pg_catalog.pg_proc.oid,
pg_catalog.pg_proc.proname,
pg_catalog.pg_proc.proargmodes,
CASE WHEN proallargtypes IS NOT NULL THEN unnest(proallargtypes) ELSE null END AS proallargtype
FROM pg_catalog.pg_proc
JOIN pg_catalog.pg_namespace ON pg_catalog.pg_proc.pronamespace = pg_catalog.pg_namespace.oid
WHERE pg_catalog.pg_namespace.nspname = 'public'
),
pg_proc_with_unnested_proallargtypes_names_and_mode AS (
SELECT
pg_proc_with_unnested_proallargtypes.oid,
pg_proc_with_unnested_proallargtypes.proname,
pg_catalog.pg_type.typname,
-- we can't unnest multiple array of same length the way we expect in pg 9.2
-- just retrieve each mode manually using type row_number
pg_proc_with_unnested_proallargtypes.proargmodes[row_number() OVER w] AS proargmode
FROM pg_proc_with_unnested_proallargtypes
LEFT JOIN pg_catalog.pg_type ON pg_catalog.pg_type.oid = proallargtype
WINDOW w AS (PARTITION BY pg_proc_with_unnested_proallargtypes.proname)
),
pg_proc_with_input_and_output_type_names AS (
SELECT
pg_proc_with_unnested_proallargtypes_names_and_mode.oid,
pg_proc_with_unnested_proallargtypes_names_and_mode.proname,
array_agg(pg_proc_with_unnested_proallargtypes_names_and_mode.typname) AS proallargtypes,
-- we should use FILTER, but that's not available in pg 9.2 :(
array_remove(array_agg(
-- see documentation for proargmodes here: http://www.postgresql.org/docs/9.2/static/catalog-pg-proc.html
CASE WHEN pg_proc_with_unnested_proallargtypes_names_and_mode.proargmode = ANY(ARRAY['i', 'b', 'v'])
THEN pg_proc_with_unnested_proallargtypes_names_and_mode.typname
ELSE NULL END
), NULL) AS proinputargtypes,
array_remove(array_agg(
-- see documentation for proargmodes here: http://www.postgresql.org/docs/9.2/static/catalog-pg-proc.html
CASE WHEN pg_proc_with_unnested_proallargtypes_names_and_mode.proargmode = ANY(ARRAY['o', 'b', 't'])
THEN pg_proc_with_unnested_proallargtypes_names_and_mode.typname
ELSE NULL END
), NULL) AS prooutputargtypes
FROM pg_proc_with_unnested_proallargtypes_names_and_mode
GROUP BY
pg_proc_with_unnested_proallargtypes_names_and_mode.oid,
pg_proc_with_unnested_proallargtypes_names_and_mode.proname
)
SELECT
*
FROM pg_proc_with_input_and_output_type_names
;
And here is my sample output:
oid | proname | proallargtypes | proinputargtypes | prooutputargtypes
-------+--------------+--------------------------+------------------+-------------------
16633 | test_ret | {text,text,int4,varchar} | {text,text} | {int4,varchar}
16634 | array_remove | {NULL} | {} | {}
16635 | test_ret | {NULL} | {} | {}
(3 rows)
Hope that helps :)
Answer, provided by Clément Prévost, is detailed and educative and therefore I marked it as best, while, however, after I executed suggested script I ended up with empty (filled only by {}) proinputargtypes and prooutputargtypes columnes on my machine. So I conducted a little research on my own, using hints that I learned from the answer above, and wrote following query:
WITH pg_proc_with_unhandled_proallargtypes AS (
SELECT
pg_catalog.pg_proc.oid,
pg_catalog.pg_proc.proname,
pg_catalog.pg_proc.proargmodes,
CASE WHEN proallargtypes IS NOT NULL THEN cast(proallargtypes AS text) ELSE NULL END AS proallargtype,
CASE WHEN array_agg(proargtypes) IS NOT NULL THEN replace(string_agg(proargtypes::text, ','), ' ', ',') ELSE NULL END AS proargtype
FROM pg_catalog.pg_proc
JOIN pg_catalog.pg_namespace ON pg_catalog.pg_proc.pronamespace = pg_catalog.pg_namespace.oid
WHERE pg_catalog.pg_namespace.nspname = 'public'
GROUP BY pg_catalog.pg_proc.oid, pg_catalog.pg_proc.proname, pg_catalog.pg_proc.proargmodes, proallargtype
),
pg_proc_with_unnested_proallargtypes AS(
SELECT proname, CASE WHEN char_length(proargtype) =0 THEN NULL
ELSE ('{' || proargtype || '}')::oid[] end AS inp,
CASE WHEN proallargtype is NULL THEN NULL ELSE replace(proallargtype, proargtype || ',', '')::oid[] end AS output
FROM pg_proc_with_unhandled_proallargtypes
),
smth_input AS(
SELECT proname, unnest(inp) AS inp FROM pg_proc_with_unnested_proallargtypes
),
smth_output AS(
SELECT proname, unnest(output) AS output FROM pg_proc_with_unnested_proallargtypes
),
input_unnested AS(
SELECT proname, string_agg(pg_catalog.pg_type.typname::text, ',') AS fin_input FROM smth_input
JOIN pg_catalog.pg_type ON pg_catalog.pg_type.oid = inp
GROUP BY proname
),
output_unnested AS (
SELECT proname, string_agg(pg_catalog.pg_type.typname::text, ',') AS fin_output FROM smth_output
JOIN pg_catalog.pg_type ON pg_catalog.pg_type.oid = output
GROUP BY proname
)
SELECT input_unnested.proname, fin_input, CASE WHEN fin_output IS NOT NULl THEN fin_output ELSE information_schema.routines.data_type end AS fin_output
FROM input_unnested
LEFT JOIN output_unnested ON input_unnested.proname = output_unnested.proname
JOIN information_schema.routines ON information_schema.routines.routine_name = input_unnested.proname
It might be inefficient a bit and I probably used too much explicict type casting, but it worked.

How to find inherited tables programatically in PostgreSQL?

I have a PostgreSQL 8.3 database where table inheritance is being used. I would like to get a list of all tables along with its schema name which is inherited from a base table using query. Is there any way we can get this using PGSQL?
Since you're on such an old version of PostgreSQL you'll probably have to use a PL/PgSQL function to handle inheritance depths of > 1. On modern PostgreSQL (or even 8.4) you'd use a recursive common table expression (WITH RECURSIVE).
The pg_catalog.pg_inherits table is the key. Given:
create table pp( ); -- The parent we'll search for
CREATE TABLE notpp(); -- Another root for multiple inheritance
create table cc( ) inherits (pp); -- a 1st level child of pp
create table dd( ) inherits (cc,notpp); -- a 2nd level child of pp that also inherits aa
create table notshown( ) inherits (notpp); -- Table that inherits only notpp
create table ccdd () inherits (cc,dd) -- Inheritance is a graph not a tree; join node
A correct result will find cc, dd, and ccdd, but not find notpp or notshown.
A single-depth query is:
SELECT pg_namespace.nspname, pg_class.relname
FROM pg_catalog.pg_inherits
INNER JOIN pg_catalog.pg_class ON (pg_inherits.inhrelid = pg_class.oid)
INNER JOIN pg_catalog.pg_namespace ON (pg_class.relnamespace = pg_namespace.oid)
WHERE inhparent = 'pp'::regclass;
... but this will only find cc.
For multi-depth inheritance (ie tableC inherits tableB inherits tableA) you have to extend that via a recursive CTE or a loop in PL/PgSQL, using the children of the last loop as parents in the next.
Update: Here's an 8.3 compatible version that should recursively find all tables that inherit directly or indirectly from a given parent. If multiple inheritance is used, it should find any table that has the target table as one of its parents at any point along the tree.
CREATE OR REPLACE FUNCTION find_children(oid) RETURNS SETOF oid as $$
SELECT i.inhrelid FROM pg_catalog.pg_inherits i WHERE i.inhparent = $1
UNION
SELECT find_children(i.inhrelid) FROM pg_catalog.pg_inherits i WHERE i.inhparent = $1;
$$ LANGUAGE 'sql' STABLE;
CREATE OR REPLACE FUNCTION find_children_of(parentoid IN regclass, schemaname OUT name, tablename OUT name) RETURNS SETOF record AS $$
SELECT pg_namespace.nspname, pg_class.relname
FROM find_children($1) inh(inhrelid)
INNER JOIN pg_catalog.pg_class ON (inh.inhrelid = pg_class.oid)
INNER JOIN pg_catalog.pg_namespace ON (pg_class.relnamespace = pg_namespace.oid);
$$ LANGUAGE 'sql' STABLE;
Usage:
regress=# SELECT * FROM find_children_of('pp'::regclass);
schemaname | tablename
------------+-----------
public | cc
public | dd
public | ccdd
(3 rows)
Here's the recursive CTE version, which will work if you update Pg, but won't work on your current version. It's much cleaner IMO.
WITH RECURSIVE inh AS (
SELECT i.inhrelid FROM pg_catalog.pg_inherits i WHERE inhparent = 'pp'::regclass
UNION
SELECT i.inhrelid FROM inh INNER JOIN pg_catalog.pg_inherits i ON (inh.inhrelid = i.inhparent)
)
SELECT pg_namespace.nspname, pg_class.relname
FROM inh
INNER JOIN pg_catalog.pg_class ON (inh.inhrelid = pg_class.oid)
INNER JOIN pg_catalog.pg_namespace ON (pg_class.relnamespace = pg_namespace.oid);
The following statement retrieves all child tables of the table public.base_table_name:
select bt.relname as table_name, bns.nspname as table_schema
from pg_class ct
join pg_namespace cns on ct.relnamespace = cns.oid and cns.nspname = 'public'
join pg_inherits i on i.inhparent = ct.oid and ct.relname = 'base_table_name'
join pg_class bt on i.inhrelid = bt.oid
join pg_namespace bns on bt.relnamespace = bns.oid
It should work with 8.3 although I'm not 100% sure.
For those who are running a version of PostgreSQL with RECURSIVE support here's a function that finds derived tables for the specified base table.
CREATE OR REPLACE FUNCTION tables_derived_from(base_namespace name, base_table name)
RETURNS TABLE (table_schema name, table_name name, oid oid)
AS $BODY$
WITH RECURSIVE inherited_id AS
(
SELECT i.inhrelid AS oid
FROM pg_inherits i
JOIN pg_class base_t ON i.inhparent = base_t.oid
JOIN pg_namespace base_ns ON base_t.relnamespace = base_ns.oid
WHERE base_ns.nspname = base_namespace AND base_t.relname = base_table
UNION
SELECT i.inhrelid AS oid
FROM pg_inherits i
JOIN inherited_id b ON i.inhparent = b.oid
)
SELECT child_ns.nspname as table_schema, child_t.relname as table_name, child_t.oid
FROM inherited_id i
JOIN pg_class child_t ON i.oid = child_t.oid
JOIN pg_namespace child_ns ON child_t.relnamespace = child_ns.oid
ORDER BY 1, 2, 3;
$BODY$ LANGUAGE sql STABLE;
It's important to note that one table can inherit multiple tables, and none of the solutions listed really expose that; they just walk down the tree of a single parent. Consider:
CREATE TABLE a();
CREATE TABLE b();
CREATE TABLE ab_() INHERITS (a,b);
CREATE TABLE ba_() INHERITS (b,a);
CREATE TABLE ab__() INHERITS (ab_);
CREATE TABLE ba__() INHERITS (ba_);
CREATE TABLE ab_ba_() INHERITS (ab_, ba_);
CREATE TABLE ba_ab_() INHERITS (ba_, ab_);
WITH RECURSIVE inh AS (
SELECT i.inhparent::regclass, i.inhrelid::regclass, i.inhseqno FROM pg_catalog.pg_inherits i WHERE inhparent = 'a'::regclass
UNION
SELECT i.inhparent::regclass, i.inhrelid::regclass, i.inhseqno FROM inh INNER JOIN pg_catalog.pg_inherits i ON (inh.inhrelid = i.inhparent)
) SELECT * FROM inh;
inhparent | inhrelid | inhseqno
-----------+----------+----------
a | ab_ | 1
a | ba_ | 2
ab_ | ab__ | 1
ba_ | ba__ | 1
ab_ | ab_ba_ | 1
ba_ | ab_ba_ | 2
ba_ | ba_ab_ | 1
ab_ | ba_ab_ | 2
(8 rows)
Notice that b doesn't show up at all which is incorrect, as both ab_ and ba_ inherit b.
I suspect the "best" way to handle this would be a column that's text[] and contains (array[inhparent::regclass])::text for each table. That would give you something like
inhrelid path
ab_ {"{a,b}"}
ba_ {"{b,a}"}
ab_ba_ {"{a,b}","{b,a}"}
While obviously not ideal, that would at least expose the complete inheritance path and allow you to access it with enough gymnastics. Unfortunately, constructing that is not at all easy.
A somewhat simpler alternative is not to include the full inheritance path at each level, only each tables direct parents. That would give you this:
inhrelid parents
ab_ {a,b}
ba_ {b,a}
ab_ba_ {ab_,ba_}