Determine column data types in PostgreSQL

I'm working with a PostgreSQL database from a command-line environment, and I'd like to be able to determine the data types of the table columns in this database.
For a simple example, when I request
SELECT * FROM products
I'd like to know if the product id column it returns is giving me text or integers.

You can get this kind of information by querying the catalog.
To find the specific queries, run psql -E (to echo the hidden queries) and then, e.g., \d products. You'll see psql output the various queries that yield information about your table, its column types, indexes, etc.
In this specific case, you'd run something like:
SELECT a.attname,
pg_catalog.format_type(a.atttypid, a.atttypmod)
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = 'products'::regclass AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum;

Related

How to list all tables in postgres without partitions

This is closely related to this question, which describes how to list all tables in a schema in a Postgres database. The query select * from information_schema.tables does the job. In my case, some of the tables in the schema are partitioned, and the query above then lists both the complete table and all of its partitions as separate entries.
How can I get a list that only contains the full tables without the individual partitions?
For example, if the schema contains a table named 'example' which is partitioned on the column 'bla' with the two values 'a' and 'b', then information_schema.tables will have one entry for 'example' and then two additional entries, 'example_part_bla_a' and 'example_part_bla_b'. I thought about doing an exclusion based on substring matches to 'part' or something like that, but that makes an assumption about how the tables are named and hence would fail with some table names. There must be a better way to do this.
You won't find that information in the information_schema; you will have to query the catalogs directly:
SELECT c.relname
FROM pg_class AS c
WHERE NOT EXISTS (SELECT 1 FROM pg_inherits AS i
WHERE i.inhrelid = c.oid)
AND c.relkind IN ('r', 'p');

SQL Natural Join

Okay. So the question that I got asked by the teacher was this:
(5 marks) Construct a SQL query on the dvdrental database that uses a natural join of two or more tables and an additional where condition. (E.g. find the titles of films rented by a particular customer.) Note the hints on the course news page if your query returns nothing.
Here is the layout of the database I'm working with:
http://www.postgresqltutorial.com/wp-content/uploads/2013/05/PostgreSQL-Sample-Database.png
The hint to us was this:
PostgreSQL hint:
If a natural join doesn't produce any results in the dvdrental DB, it is because many tables have the last_update timestamp field, and thus the natural join tries to join on that field as well as the intended field.
e.g.
select *
from film natural join inventory;
does not work because of this - it produces an empty table (no results).
Instead, use
select *
from film, inventory
where film.film_id = inventory.film_id;
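The effect described in the hint can be reproduced on a tiny stand-in schema. The sketch below runs through Python's built-in sqlite3 module rather than PostgreSQL (an assumption made only to keep the example self-contained; NATURAL JOIN matches on every same-named column in both engines). The two tables are invented miniatures that keep only film_id and last_update from dvdrental, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented miniature versions of two dvdrental tables.
    CREATE TABLE film (film_id INTEGER, title TEXT, last_update TEXT);
    CREATE TABLE inventory (inventory_id INTEGER, film_id INTEGER, last_update TEXT);
    INSERT INTO film VALUES (1, 'Academy Dinosaur', '2013-05-26 14:50:58');
    INSERT INTO inventory VALUES (10, 1, '2006-02-15 10:09:17');
""")

# NATURAL JOIN matches on every shared column name: film_id AND last_update.
# The last_update values differ, so no row pair satisfies both conditions.
natural_rows = conn.execute(
    "SELECT * FROM film NATURAL JOIN inventory"
).fetchall()

# Joining on film_id alone ignores the differing last_update values.
using_rows = conn.execute(
    "SELECT title, inventory_id FROM film JOIN inventory USING (film_id)"
).fetchall()

print(natural_rows)
print(using_rows)
```

Here the natural join comes back empty while the join on film_id alone returns the matching row, which is the behaviour the hint warns about.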
This is what I did:
select *
from film, customer
where film.film_id = customer.customer_id;
The problem is I cannot get a particular customer.
I tried doing customer_id = 2; but it returns an error.
Really need help!
Well, it seems that you are trying to join two tables that have no direct relation with each other; that's your issue:
where film.film_id = customer.customer_id
To find which films are rented by which customer you would have to join customer table with rental, then with inventory and finally with film.
The task description states
Construct a SQL query on the dvdrental database that uses a natural join of two or more tables and an additional where condition.
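One way to follow the join path described above (customer to rental to inventory to film) while still using a natural join is to project away last_update in subqueries first, so only the key columns are left to match on. This is a minimal sketch, not a definitive solution: it runs through Python's sqlite3 module instead of PostgreSQL, the tables are invented miniatures with made-up rows, and the aliases c, r, i and f are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented miniature tables; only the columns needed for the join path.
    CREATE TABLE customer  (customer_id INTEGER, first_name TEXT, last_update TEXT);
    CREATE TABLE rental    (rental_id INTEGER, customer_id INTEGER,
                            inventory_id INTEGER, last_update TEXT);
    CREATE TABLE inventory (inventory_id INTEGER, film_id INTEGER, last_update TEXT);
    CREATE TABLE film      (film_id INTEGER, title TEXT, last_update TEXT);

    INSERT INTO customer  VALUES (1, 'Mary', '2013-01-01'), (2, 'Patricia', '2013-01-02');
    INSERT INTO film      VALUES (100, 'Alien Center', '2013-01-03'),
                                 (200, 'Bride Intrigue', '2013-01-04');
    INSERT INTO inventory VALUES (10, 100, '2013-01-05'), (20, 200, '2013-01-06');
    INSERT INTO rental    VALUES (1000, 1, 10, '2013-01-07'), (1001, 2, 20, '2013-01-08');
""")

# Each subquery drops last_update, so NATURAL JOIN only sees the key columns.
query = """
    SELECT title
    FROM (SELECT customer_id FROM customer) AS c
    NATURAL JOIN (SELECT customer_id, inventory_id FROM rental) AS r
    NATURAL JOIN (SELECT inventory_id, film_id FROM inventory) AS i
    NATURAL JOIN (SELECT film_id, title FROM film) AS f
    WHERE c.customer_id = ?
"""
titles = [row[0] for row in conn.execute(query, (2,))]
print(titles)
```

The additional where condition picks out one customer; projecting explicit column lists also keeps other shared columns in the real schema from being matched by accident.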

How to list tables accessible via database links?

I have access to a database, and I can get all tables and columns accessible to me with:
select * from ALL_TAB_COLUMNS
I can also access some tables using "@", which as I understand it is the database link mechanism, like this:
select * from aaa.bbb_ddd@ffgh where jj = 55688
where aaa.bbb_ddd@ffgh corresponds to some table with a column jj.
BUT I don't see this aaa.bbb_ddd@ffgh table in ALL_TAB_COLUMNS.
How can I list all tables (and the columns inside them) accessible to me via these database links?
You can't easily get all columns accessible via all database links; you can get all columns accessible via one database link by querying ALL_TAB_COLUMNS on the remote database:
select * from all_tab_columns@<remote_server>
where <remote_server> in your example would be ffgh.
If you want to get this same information for all database links in your current schema, you'd either have to manually enumerate them and UNION the results together:
select * from all_tab_columns@dblink1
union all
select * from all_tab_columns@dblink2
Or, do something dynamically.
As Justin says, it's clearer if you add which database the data is coming from; you can do this either by just writing it in the query:
select 'dblink1' as dblink, a.* from all_tab_columns@dblink1 a
union all
select 'dblink2', a.* from all_tab_columns@dblink2 a
Or by using an Oracle built-in, for example the GLOBAL_NAME view (there are lots more ways):
select db1g.global_name, db1a.*
from all_tab_columns@dblink1 db1a
cross join global_name@dblink1 db1g
union all
select db2g.global_name, db2a.*
from all_tab_columns@dblink2 db2a
cross join global_name@dblink2 db2g
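The "do something dynamically" option mentioned above could, for instance, mean generating the UNION ALL text from a list of link names. The following is a rough sketch under assumptions: the helper build_union_query is hypothetical, the link names are placeholders (in practice they could be read from the ALL_DB_LINKS view), and it relies on Oracle's table@link syntax for referencing a remote object. It only builds the SQL string; it does not connect to a database.

```python
import re

# Conservative pattern for link names, since they are spliced into SQL text.
_LINK_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_$#.]*")


def build_union_query(db_links):
    """Return one UNION ALL query over ALL_TAB_COLUMNS on every given link."""
    if not db_links:
        raise ValueError("at least one database link name is required")
    parts = []
    for link in db_links:
        if not _LINK_NAME.fullmatch(link):
            raise ValueError(f"unexpected database link name: {link!r}")
        # Tag each row with the link it came from, as in the answer above.
        parts.append(
            f"select '{link}' as dblink, a.* from all_tab_columns@{link} a"
        )
    return "\nunion all\n".join(parts)


# Hypothetical link names used purely for illustration.
query = build_union_query(["dblink1", "dblink2"])
print(query)
```

The name check is there because the link names end up inside the statement text rather than in bind variables.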

Postgres subquery has access to column in a higher level table. Is this a bug? or a feature I don't understand?

I don't understand why the following doesn't fail. How does the subquery have access to a column from a different table at the higher level?
drop table if exists temp_a;
create temp table temp_a as
(
select 1 as col_a
);
drop table if exists temp_b;
create temp table temp_b as
(
select 2 as col_b
);
select col_a from temp_a where col_a in (select col_a from temp_b);
/*why doesn't this fail?*/
The following fail, as I would expect them to.
select col_a from temp_b;
/*ERROR: column "col_a" does not exist*/
select * from temp_a cross join (select col_a from temp_b) as sq;
/*ERROR: column "col_a" does not exist
*HINT: There is a column named "col_a" in table "temp_a", but it cannot be referenced from this part of the query.*/
I know about the LATERAL keyword, but I'm not using LATERAL here. Also, this query succeeds even in versions of Postgres before 9.3 (when the LATERAL keyword was introduced).
Here's a sqlfiddle: http://sqlfiddle.com/#!10/09f62/5/0
Thank you for any insights.
Although this feature might be confusing, without it, several types of queries would be more difficult, slower, or impossible to write in sql. This feature is called a "correlated subquery" and the correlation can serve a similar function as a join.
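Applied to the query in the question, this means the inner col_a is resolved against the outer temp_a, because temp_b has no column of that name. A small sketch of that behaviour, run through Python's sqlite3 module as a stand-in for PostgreSQL (an assumption for self-containment; ordinary tables replace the temp tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp_a (col_a INTEGER);
    CREATE TABLE temp_b (col_b INTEGER);
    INSERT INTO temp_a VALUES (1);
    INSERT INTO temp_b VALUES (2);
""")

query = "SELECT col_a FROM temp_a WHERE col_a IN (SELECT col_a FROM temp_b)"

# temp_b has no col_a, so the inner col_a is the outer row's temp_a.col_a.
# The subquery yields that value once per temp_b row, so the IN test passes.
with_row = conn.execute(query).fetchall()

# With temp_b empty the subquery yields nothing, so no outer row qualifies.
conn.execute("DELETE FROM temp_b")
without_row = conn.execute(query).fetchall()

print(with_row, without_row)
```

So the condition is effectively "temp_b is not empty" (for non-null col_a), which is why the query runs instead of failing.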
For example: Consider this statement
select first_name, last_name from users u
where exists (select * from orders o where o.user_id=u.user_id)
Now this query will get the names of all the users who have ever placed an order. Now, I know, you can get that info using a join to the orders table, but you'd also have to use a "distinct", which would internally require a sort and would likely perform a tad worse than this query. You could also produce a similar query with a group by.
Here's a better example that's pretty practical, and not just for performance reasons. Suppose you want to delete all users who have no orders and no tickets.
delete from users u where
not exists (select * from orders o where o.user_id = u.user_id)
and not exists (select * from tickets t where t.user_id = u.user_id)
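As an illustration of that delete, here is a minimal sketch on invented data, assuming both orders and tickets reference users through a user_id column. It runs through Python's sqlite3 module rather than PostgreSQL, and it qualifies the outer column with the table name users instead of an alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented sample data: Ann has an order, Bob has a ticket, Cy has neither.
    CREATE TABLE users   (user_id INTEGER, first_name TEXT);
    CREATE TABLE orders  (order_id INTEGER, user_id INTEGER);
    CREATE TABLE tickets (ticket_id INTEGER, user_id INTEGER);
    INSERT INTO users   VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders  VALUES (10, 1);
    INSERT INTO tickets VALUES (20, 2);
""")

# Both subqueries are correlated: they refer to the row of users being tested.
conn.execute("""
    DELETE FROM users
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.user_id = users.user_id)
      AND NOT EXISTS (SELECT 1 FROM tickets t WHERE t.user_id = users.user_id)
""")

remaining = [
    row[0]
    for row in conn.execute("SELECT first_name FROM users ORDER BY user_id")
]
print(remaining)
```

Only the user with neither an order nor a ticket is removed.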
One very important thing to note is that you should fully qualify or alias your table names when doing this or you might wind up with a typo that completely messes up the query and silently "just works" while returning bad data.
The following is an example of what NOT to do.
select * from users
where exists (select * from product where last_updated_by=user_id)
This looks just fine until you look at the tables and realize that the table "product" has no "last_updated_by" field and the "users" table does, so the condition silently compares two columns of "users" and returns the wrong data. Add aliases and the query will fail because no "last_updated_by" column exists in product.
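The difference between the unqualified and the qualified form can be seen on a toy schema. This is a sketch under assumptions: the column layout of users and product is invented to match the description, the aliases u and p are arbitrary, and Python's sqlite3 module stands in for PostgreSQL (the exact error text differs between engines).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented layout: last_updated_by lives in users, not in product.
    CREATE TABLE users   (user_id INTEGER, last_updated_by INTEGER);
    CREATE TABLE product (product_id INTEGER);
    INSERT INTO users   VALUES (1, 1), (2, 1);
    INSERT INTO product VALUES (100);
""")

# Unqualified: both column names resolve to users, so the query runs
# and quietly compares users.last_updated_by with users.user_id.
silent_rows = conn.execute("""
    SELECT user_id FROM users
    WHERE EXISTS (SELECT * FROM product WHERE last_updated_by = user_id)
""").fetchall()

# Qualified: the mistake is caught because product has no such column.
try:
    conn.execute("""
        SELECT u.user_id FROM users u
        WHERE EXISTS (SELECT * FROM product p
                      WHERE p.last_updated_by = u.user_id)
    """)
    qualified_error = None
except sqlite3.OperationalError as exc:
    qualified_error = str(exc)

print(silent_rows)
print(qualified_error)
```

The first query returns a row without complaint even though it never looks at a product column; the second one is rejected outright.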
I hope this has given you some examples that show you how to use this feature. I use them all the time in update and delete statements (as well as in selects-- but I find an absolute need for them in updates and deletes often)

POSTGRESQL: DUMP TABLE WITHOUT TOAST DATA

I am trying to isolate the TOAST data from a table so that I can dump the table without it. I know there must be a way to do that, but I can't find it. Suggestions would be highly appreciated.
Try a COPY (or psql's \copy) with the query option - you can select the columns to export. You can also choose a CSV format rather than tab-separated, the representation of nulls etc.
TOAST is how PostgreSQL stores large values internally. For you, as a user, there are only the values that you handed to the database to keep for you.
TOAST comes into play mostly for textual data, when a tuple's attributes make the tuple larger than about 2 kB (if PostgreSQL was compiled with the default 8 kB page size). This happens inside the DB engine, transparently to the user. For example, if you insert a row with a text of around 10k characters, the corresponding attribute will typically be TOASTed.
Given how TOAST works, your question seems to amount to: How do I dump a table without the attributes that contain big chunks of data? It is unclear to me what the purpose of this would be, as your dump will be incomplete.
EDIT: I don't know how to find out whether any attribute of any tuple has a TOASTed value. Instead, I will eliminate all attributes that can have TOASTed values.
The following query will give you all the columns for a table, that are always in PLAIN storage mode:
SELECT a.attname
FROM pg_class t
JOIN pg_attribute a ON t.oid = a.attrelid
JOIN pg_type typ ON typ.oid = a.atttypid
WHERE t.relkind='r' AND t.relname = 'element'
AND a.attnum > 0 AND NOT a.attisdropped
AND typ.typstorage='p'
ORDER BY a.attnum;
And this query will generate the desired SQL; you can wrap it in a script or in PL/pgSQL's EXECUTE statement:
SELECT 'COPY '||quote_ident(t.relname)||
'('||string_agg(quote_ident(a.attname), ',' ORDER BY a.attnum)||') TO stdout;'
FROM pg_class t
JOIN pg_attribute a ON t.oid = a.attrelid
JOIN pg_type typ ON typ.oid = a.atttypid
WHERE t.relkind='r' AND t.relname = '<YOUR_TABLE>'
AND a.attnum > 0 AND NOT a.attisdropped
AND typ.typstorage='p'
GROUP BY t.relname;