How to create enum from a SQL query in PostgreSQL? - postgresql

I want to create an enum from distinct values of a column in PostgreSQL. Instead of creating an enum from all labels, I want to create it using a query to get all possible values of the type. I am expecting something like:
CREATE TYPE genre_type AS ENUM
(select distinct genre from movies);
But I am not allowed to do this. Is there any way to achieve this?

If genre types are in any way dynamic, i.e. you create new ones and rename old ones from time to time, or if you want to save additional information with every genre type, @mu's advice and @Marcello's implementation would be worth considering, except that you should just use type text instead of varchar(20) and consider ON UPDATE CASCADE for the fk constraint.
Other than that, here is the recipe you asked for:
DO
$$
BEGIN
   EXECUTE (
      SELECT format('CREATE TYPE genre_type AS ENUM (%s)'
                  , string_agg(DISTINCT quote_literal(genre), ', '))
      FROM   movies);
END
$$;
You need dynamic SQL for that. The simple way is a DO command (PostgreSQL 9.0+).
Make sure your strings are properly escaped with quote_literal().
I aggregate the string with string_agg() (PostgreSQL 9.0+).
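Once the type exists, it may help to check its labels and, if desired, switch the column over to it. A minimal sketch, assuming movies.genre from the question is still a character type and every stored value became a label:

```sql
-- List the labels of the new enum in their sort order.
SELECT enum_range(NULL::genre_type);

-- Convert the existing column; the USING clause casts each stored value
-- through text to the enum label with the same spelling.
ALTER TABLE movies
    ALTER COLUMN genre TYPE genre_type
    USING genre::text::genre_type;
```

Enum values sort in the order the labels were listed at creation, so if a particular order matters it is probably worth adding an ORDER BY inside string_agg() rather than relying on whatever order the aggregate happens to produce.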

I think you can't do that by design.
http://www.postgresql.org/docs/9.1/static/datatype-enum.html
Enumerated (enum) types are data types that comprise a static, ordered
set of values.
The "DISTINCT" keyword in your SELECT clause makes me think your schema is not fully normalized.
For example:
CREATE TABLE movies(
...
genre VARCHAR(20)
...
);
SELECT DISTINCT genre FROM movies;
Should become:
CREATE TABLE genres(
id SERIAL PRIMARY KEY,
name VARCHAR(20)
);
CREATE TABLE movies (
id SERIAL PRIMARY KEY,
title VARCHAR(200),
genre INTEGER REFERENCES genres(id)
);
SELECT name FROM genres;

A trivial approach would be to execute the SELECT in a client and then copy the names from it.
I think that is about all you can do. If you look at the documentation http://www.postgresql.org/docs/9.3/static/sql-createtype.html you will notice that the syntax is CREATE TYPE name AS ENUM ( [ 'label' [, ... ] ] ). What is missing there is the word expression: after ENUM there may only be a list of literal labels, not an expression or a query.
You may want to follow @muistooshort's advice and create a genres table (that you can fill with an INSERT ... SELECT ...) and then create a foreign key to that table.
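The lookup-table route could look roughly like this. It is a sketch under assumptions: the constraint name movies_genre_fkey is made up for illustration, and the key is the genre name itself (as text, with ON UPDATE CASCADE, following the first answer's suggestion) rather than a surrogate id.

```sql
-- Lookup table keyed by the genre name.
CREATE TABLE genres (
    name text PRIMARY KEY
);

-- Seed it from the values already present in movies.
INSERT INTO genres (name)
SELECT DISTINCT genre
FROM   movies
WHERE  genre IS NOT NULL;

-- Tie movies.genre to the lookup table; renaming a genre in genres
-- then propagates to movies automatically.
ALTER TABLE movies
    ADD CONSTRAINT movies_genre_fkey
    FOREIGN KEY (genre) REFERENCES genres (name)
    ON UPDATE CASCADE;
```

Adding a new genre is then a plain INSERT into genres, with no ALTER TYPE needed.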

It can be done in PL/pgSQL. Here's inline code to do it:
DO $$
DECLARE temp text;
BEGIN
SELECT INTO temp string_agg(DISTINCT quote_literal(genre),',') FROM movies;
EXECUTE 'CREATE TYPE foo AS ENUM ('||temp||')';
END$$;

Related

Cannot assign/cast PostgreSQL record with JSONB to an HSTORE column

I'm trying to create a trash table, where I can store deleted entities, and have the ability to restore them manually as needed. To make this more complicated, my tables have JSONB, HSTORE, GEOMETRY, and BYTEA columns.
Inserting is pretty simple:
insert into trash (original_table, original_id, content) select 'something', id, to_jsonb(something) from something where ...;
But HSTORE seems to cause problems when I'm trying to read data back from trash:
select t.* from trash, jsonb_to_record(content) as t(id bytea, created_at timestamp, attributes hstore, ...);
(content is a column of type JSONB.)
GEOMETRY, BYTEA, and JSONB columns get assigned/cast just fine. I know that JSONB cannot be automatically cast to HSTORE, so I created this CAST:
CREATE FUNCTION jsonb_to_hstore(j JSONB) RETURNS HSTORE IMMUTABLE STRICT LANGUAGE sql AS $$
SELECT hstore(array_agg(key), array_agg(value)) FROM jsonb_each_text(j)
$$;
CREATE CAST (JSONB AS HSTORE) WITH FUNCTION jsonb_to_hstore AS IMPLICIT;
But it doesn't seem to be used, and I still get this error when including the HSTORE column in t(...):
ERROR: Syntax error near '"' at position 11
Why is my cast not used by AS? Where can I find out more about the usage of the record type? jsonb_to_record documentation mentions that:
As with all functions returning record, the caller must explicitly define the structure of the record with an AS clause.
But I can't find anything more about this.
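One possible way around this, offered as a sketch rather than a confirmed fix: as far as I understand the documentation, jsonb_to_record() fills each output column by feeding the JSON value's text to the input function of the declared column type, and casts created with CREATE CAST are not consulted at that step. Under that assumption, the hstore column can be declared as jsonb in the AS clause and converted explicitly with the jsonb_to_hstore() function defined above (only the three columns named in the question are listed here):

```sql
SELECT t.id,
       t.created_at,
       jsonb_to_hstore(t.attributes) AS attributes  -- explicit conversion
FROM   trash,
       jsonb_to_record(content) AS t(id bytea, created_at timestamp, attributes jsonb);
```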

PostgreSQL use uuid_generate_v4() used in INSERT...SELECT in a later UPDATE statement

I'm writing a database migration that adds a new table whose id column is populated using uuid_generate_v4(). However, that generated id needs to be used in an UPDATE on another table to associate the entities. Here's an example:
BEGIN;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE IF NOT EXISTS models(
id,
type
);
INSERT INTO models(id, type)
SELECT
uuid_generate_v4() AS id
,t.type
FROM body_types AS t WHERE t.type != 'foo';
ALTER TABLE body_types
ADD COLUMN IF NOT EXISTS model_id uuid NOT NULL DEFAULT uuid_generate_v4();
UPDATE TABLE body_types SET model_id =
(SELECT ....??? I'M STUCK RIGHT HERE)
This is obviously a contrived query with flaws, but I'm trying to illustrate that what it looks like I need is a way to store the uuid_generate_v4() value from each inserted row into a variable or hash that I can reference in the later UPDATE statement.
Maybe I've modeled the solution wrong & there's a better way? Maybe there's a postgresql feature I just don't know about? Any pointers greatly appreciated.
I was modeling the solution incorrectly. The short answer is "don't make the id in the INSERT random". In this case the key is to add the 'model_id' column to 'body_types' first. Then I can use it in the INSERT...SELECT without having to save it for later use because I'll be selecting it from the body_types table.
BEGIN;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
ALTER TABLE body_types
ADD COLUMN IF NOT EXISTS model_id uuid NOT NULL DEFAULT uuid_generate_v4();
CREATE TABLE IF NOT EXISTS models(
id,
type
);
INSERT INTO models(id, type)
SELECT
t.model_id AS id
,t.type
FROM body_types AS t WHERE t.type != 'foo';
Wish I had a better contrived example, but the point is, avoid using random values that you have to use later, and in this case it was totally unnecessary to do so anyway.

PostgreSQL - how to pass a function's argument to a trigger?

I have three tables in a database: users (user_id (auto increment), fname, lname), roles (role_id, role_desc) and users_roles (user_id, role_id). What I'd like to do is to have a function create_user_with_role. The function takes 3 arguments: first name, last name and role_id. The function inserts a new row into the users table and a new user_id is created automatically. Now I want to insert a new record to the users_roles table: user_id is the newly created value and the role_id is taken from the function's arguments list.
Is it possible to pass the role_id argument to an after insert trigger (defined on users table) so another automatic insert can be performed? Or can you suggest any other solution?
First, @Pavel Stehule is right:
Don't try to pass parameters to triggers, ever!
Second, you just have to get the inserted id into a variable.
CREATE FUNCTION create_user_with_role(first_name text, last_name text, new_role_id integer)
RETURNS VOID AS $$
DECLARE
new_user_id integer;
BEGIN
INSERT INTO users (fname, lname) VALUES (first_name, last_name)
RETURNING user_id INTO new_user_id;
INSERT INTO users_roles (user_id, role_id)
VALUES (new_user_id, new_role_id);
END;$$ LANGUAGE plpgsql;
Obviously, this is completely inefficient if you want to insert multiple rows but that's another question ;)
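For the multi-row case, one option is a data-modifying CTE (PostgreSQL 9.1+), which does both inserts in a single statement. A sketch, assuming the table layout from the question; the names and the role id 1 are made-up sample values:

```sql
WITH new_users AS (
    -- Insert several users at once and hand back their generated ids.
    INSERT INTO users (fname, lname)
    VALUES ('Ada', 'Lovelace'),
           ('Alan', 'Turing')
    RETURNING user_id
)
-- Give every newly created user the same role.
INSERT INTO users_roles (user_id, role_id)
SELECT user_id, 1
FROM   new_users;
```

For a single user, the function above is called like any other: SELECT create_user_with_role('Ada', 'Lovelace', 1);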
When you need to pass a parameter to a trigger, it is a clear sign that your design is wrong. Triggers should usually provide check or audit functionality, nothing more. You can use a function and call it directly from your application. Don't try to pass parameters to triggers, ever! Another bad sign is artificial columns in a table that exist only to parametrize a trigger. This is pretty bad design!

Getting error for auto increment fields when inserting records without specifying columns

We're in the process of converting over from SQL Server to Postgres. I have a scenario that I am trying to accommodate. It involves inserting records from one table into another, WITHOUT listing out all of the columns. I realize this is not recommended practice, but let's set that aside for now.
drop table if exists pk_test_table;
create table public.pk_test_table
(
recordid SERIAL PRIMARY KEY NOT NULL,
name text
);
--example 1: works and will insert a record with an id of 1
insert into pk_test_table values(default,'puppies');
--example 2: fails
insert into pk_test_table
select first_name from person_test;
Error I receive in the second example:
column "recordid" is of type integer but expression is of type
character varying Hint: You will need to rewrite or cast the
expression.
The default keyword will tell the database to grab the next value.
Is there any way to utilize this keyword in the second example? Or some way to tell the database to ignore auto-incremented columns and just let them be populated like normal?
I would prefer to not use a subquery to grab the next "id".
This functionality works in SQL Server and hence the question.
Thanks in advance for your help!
If you can't list column names, you can use the DEFAULT keyword, as you've done in the simple insert example. That won't work with an insert into ... select ..., though.
For that, you need to invoke nextval. A subquery is not required, just:
insert into pk_test_table
select nextval('pk_test_table_recordid_seq'), first_name from person_test;
You do need to know the sequence name. You could get that from information_schema based on the table name and inferring its primary key, using a function that takes just the table name as an argument. It'd be ugly, but it'd work. I don't think there's any way around needing to know the table name.
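A less ugly way to find the sequence is the built-in pg_get_serial_sequence(), which returns the name of the sequence owned by a serial column. A short sketch using the table and column from the question:

```sql
-- Look the sequence up by table and column instead of hard-coding its name.
insert into pk_test_table
select nextval(pg_get_serial_sequence('pk_test_table', 'recordid')),
       first_name
from   person_test;
```

This still requires knowing the table and the serial column's name, but not the naming convention of the sequence.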
You're inserting value into the first column, but you need to add a value in the second position.
Therefore you can use INSERT INTO table(field) VALUES(value) syntax.
Since you need to fetch values from another table, you have to remove VALUES and put the subquery there.
insert into pk_test_table(name)
select first_name from person_test;
I hope it helps
I do it this way via a separate function, though I think I'm getting around the issue by giving each field its DEFAULT setting at the table level.
create table public.pk_test_table
(
recordid integer NOT NULL DEFAULT nextval('pk_test_table_id_seq'),
name text,
field3 integer NOT NULL DEFAULT 64,
null_field_if_not_set integer,
CONSTRAINT pk_test_table_pkey PRIMARY KEY ("recordid")
);
With function:
CREATE OR REPLACE FUNCTION func_pk_test_table() RETURNS void AS
$BODY$
INSERT INTO pk_test_table (name)
SELECT first_name FROM person_test;
$BODY$
LANGUAGE sql VOLATILE;
Then just execute the function with SELECT func_pk_test_table();
Notice that it doesn't have to specify all the fields, as long as the constraints allow it.

postgresql - designing a tree hierarchy with mixed node types (inheritance does not help!)

I have a question about implementing inheritance in postgresql(9.1).
The purpose is to build a geo-hierarchy model, where countries, states and continents can be mixed up to create "regions". These regions can then in turn be mixed with countries, etc. to create a truly awesome region hierarchy.
So in my logical model, everything is a type of "place". A region-tree can be constructed by specifying edgewise using the two "places". The design is as below, and easy to manage in the Java layer.
create table place_t (
place_id serial primary key,
place_type varchar(10)
);
create table country_t (
short_name varchar(30) unique,
name varchar(255) null
) inherits(place_t);
create table region_t(
short_name varchar(30),
hierarchy_id integer, -- references hierarchy_t(hierarchy_id)
unique(short_name) -- (short_name,hierarchy_id)
) inherits(place_t);
create table region_hier_t(
parent integer references place_t(place_id), -- would prefer FK region_t(place_id)
child integer references place_t(place_id),
primary key(parent,child)
);
insert into region_t values(DEFAULT, 'region', 'NA', 'north american ops');
insert into region_t values(DEFAULT, 'region', 'EMEA', 'europe and middle east');
insert into country_t values(DEFAULT, 'country', 'US', 'USD', 'united states');
insert into country_t values(DEFAULT, 'country', 'CN', 'CND', 'canada');
So far so good. But the following fails:
insert into region_hier_t
select p.place_id, c.place_id
from region_t as p, country_t as c
where p.short_name = 'NA' and c.short_name = 'US';
The reason is that the first 4 inserts did not create any row in "place_t". RTFM! Postgres docs actually mention this.
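The difference can be seen directly with the ONLY keyword. A small sketch against the tables defined above:

```sql
-- Scans place_t and every table that inherits from it.
SELECT place_id, place_type FROM place_t;

-- Scans only the rows physically stored in place_t itself. A foreign key
-- that references place_t(place_id) is checked against this set only.
SELECT place_id, place_type FROM ONLY place_t;
```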
The question is - is there a workaround? Via insert triggers on region_t and country_t to implement my own "inheritance" is the only thing I could think of.
A second question is - is there a better design for such a mixed-node tree structure?
For certain reasons I do not want to rely too much on postgres-contrib features. Perhaps that's very silly and please feel free to chime in, but gently (and only after answering the other question)!
Thanks
The references on the parent and child columns in region_hier_t are wrong: you cannot insert a key from country_t when the reference points at another table (child integer references place_t(place_id)). You can either drop them or add new ones.
So let's take the second option and add a unique constraint (here a primary key) on the referenced key of region_t and country_t:
ALTER TABLE region_t
ADD CONSTRAINT pk_region_t PRIMARY KEY(place_id );
ALTER TABLE country_t
ADD CONSTRAINT pk_country_t PRIMARY KEY(place_id );
The correct CREATE statement for region_hier_t is:
create table region_hier_t(
parent integer references region_t(place_id),
child integer references country_t(place_id),
primary key(parent,child)
);
And finally you can run your INSERT.
So, as you can see, there are many improvements to make. Maybe you should reconsider your design. Take a look at this answer: How to store postal addresses and political divisions in a normalized way? It's much simpler than your solution and easier to maintain.
But if you want to stay with your solution, don't forget to set primary keys on the child tables (as shown above), which you haven't done yet. Only check constraints and not-null constraints are inherited by child tables.
I also see that one of your other inserts doesn't work correctly:
insert into region_t values(DEFAULT, 'region', 'NA', 'north american ops');
ERROR: invalid input syntax for integer: "north american ops"
LINE 1: ...ert into region_t values(DEFAULT, 'region', 'NA', 'north ame...
So there is problem with column assignment as well.
So it turns out that inheritance in PostgreSQL is somewhat different from that used in typical OOP languages. In particular, the "superclass" table is not populated automatically. If I had to use my own triggers to do that, I didn't have a use case left for the inheritance structure.
So I abandoned Postgresql inheritance and created my own "place_t" table. And "country_t", "state_t", "county_t" and "region_t" children tables, linked to parent "place_t" through "place_id".
On these children tables, I created a before insert/update row-level trigger to ensure that "place_id" refers to a valid row in "place_t" and that the reference is not changed later. IOW, "place_id" in children tables should behave like write-once-read-many.
Now, I can insert the world geo. Also, define a new "region". I created a "region_composition_t" to record the edges of a regional hierarchy, the parent being a reference to "region_t" and child being a reference to "place_t".
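The layout described here might be sketched as follows. Note the assumptions: the column lists are borrowed from the question, only two of the child tables are shown, and plain foreign keys stand in for the validation triggers mentioned above (a foreign key checks that place_id exists but, unlike those triggers, does not stop it from being changed later).

```sql
CREATE TABLE place_t (
    place_id   serial PRIMARY KEY,
    place_type varchar(10) NOT NULL
);

-- Each child table shares its key with place_t instead of inheriting from it.
CREATE TABLE country_t (
    place_id   integer PRIMARY KEY REFERENCES place_t (place_id),
    short_name varchar(30) UNIQUE,
    name       varchar(255)
);

CREATE TABLE region_t (
    place_id   integer PRIMARY KEY REFERENCES place_t (place_id),
    short_name varchar(30) UNIQUE
);

-- Edges of a regional hierarchy: parent is a region, child is any place.
CREATE TABLE region_composition_t (
    parent integer REFERENCES region_t (place_id),
    child  integer REFERENCES place_t (place_id),
    PRIMARY KEY (parent, child)
);

-- Creating a country means inserting the place row first and reusing its id.
WITH p AS (
    INSERT INTO place_t (place_type) VALUES ('country')
    RETURNING place_id
)
INSERT INTO country_t (place_id, short_name, name)
SELECT place_id, 'US', 'united states' FROM p;
```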
So far so good. The challenge now is how to suppress any update/delete cascading effects.
The workaround is to get rid of your foreign keys to place_t and do instead:
CREATE FUNCTION place_t_exists(id int)
RETURNS bool LANGUAGE SQL AS
$$
-- Selecting from place_t also sees rows stored in its child tables.
SELECT count(*) = 1 FROM place_t WHERE place_id = $1;
$$;
CREATE FUNCTION fkey_place_t() RETURNS TRIGGER
LANGUAGE PLPGSQL AS $$
DECLARE
ref_id int;
BEGIN
-- Trigger arguments are plain strings, so TG_ARGV[0] carries the name of
-- the column to check ('parent' or 'child'), not its value.
EXECUTE format('SELECT ($1).%I', TG_ARGV[0]) INTO ref_id USING NEW;
IF place_t_exists(ref_id) THEN RETURN NEW;
ELSE RAISE EXCEPTION 'place_t does not exist';
END IF;
END;
$$;
You also need something on the child tables to block changes while a hierarchy node still references the row:
CREATE FUNCTION hierarchy_exists(id int) RETURNS BOOL LANGUAGE SQL AS
$$
SELECT COUNT(*) > 0 FROM region_hier_t WHERE parent = $1 OR child = $1;
$$;
CREATE OR REPLACE FUNCTION fkey_hierarchy_trigger() RETURNS trigger LANGUAGE PLPGSQL AS
$$
BEGIN
IF hierarchy_exists(old.place_id) THEN RAISE EXCEPTION 'Hierarchy node still exists';
ELSE RETURN OLD;
END IF;
END;
$$;
Then you can create your triggers:
CREATE CONSTRAINT TRIGGER fkey_place_parent AFTER INSERT OR UPDATE ON region_hier_t
FOR EACH ROW EXECUTE PROCEDURE fkey_place_t('parent');
CREATE CONSTRAINT TRIGGER fkey_place_child AFTER INSERT OR UPDATE ON region_hier_t
FOR EACH ROW EXECUTE PROCEDURE fkey_place_t('child');
And then for each of the place_t child tables (firing on DELETE here, the case the error message describes):
CREATE CONSTRAINT TRIGGER fkey_hier_t AFTER DELETE ON [child_table]
FOR EACH ROW EXECUTE PROCEDURE fkey_hierarchy_trigger();
This solution may not be worth it, but it is worth knowing how to do it if you need to.