What's the ideal way to store dropdown list text/values in Postgres 9.4?

What's the optimal way to store values for a select list in a web-app with Postgres?
If I use an enum, this has the benefit of acting as a constraint for whatever column is set to that type (only allowing possible values). I can also write rather normal queries to pull those values to populate the select... but this has the drawback of requiring the option text and value to be identical.
If I create a table to store these, I can have columns for both value and text, and perhaps even a third (comment/description, whatever). However, it means a full table for every set of values, of which I expect several dozen throughout the webapp. Not sure why this feels like a "heavier" solution than enums, but it does. (A "create enum" statement plus possible "alter enum" in the future vs. a "create table" plus many initial insert statements and maybe more in the future.)
Nor can I create a single table for all dropdown lists, because then I would need convoluted constraint logic in the various tables that reference those values.
Is there a code pattern that is ideal for this problem that I'm unaware of?
The solution doesn't need to be portable to other database engines... I'm more than happy to use a postgres-only solution.
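
For concreteness, a minimal sketch of the two options being weighed (type, table, and column names are purely illustrative):

-- Option 1: an enum doubles as the column constraint, but the stored
-- value and the text shown in the dropdown are the same string.
CREATE TYPE order_status AS ENUM ('pending', 'shipped', 'cancelled');

-- Option 2: a lookup table per list allows a separate value, label and
-- description, enforced through an ordinary foreign key.
CREATE TABLE order_statuses (
    value       text PRIMARY KEY,
    label       text NOT NULL,
    description text
);

CREATE TABLE orders (
    id     serial PRIMARY KEY,
    status text NOT NULL REFERENCES order_statuses (value)
);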

Related

Constrain a column across tables

I'm using Postgres to store formulae and elements of formulae across two tables. Basically, you have something like:
Elements Table

symbol | content
-------+---------
pi     | 3.1415
lune   | 42

Formula Table

symbol  | content
--------+-----------
area    | pi*r^2
rugsize | area*lune
So, formulae can use elements but also other formulae in their content field. For this reason (and for general reduction of confusion) I would like to make symbol unique across both tables.
I can, of course, in the code that's doing the insertion, look up the entries before adding them and refuse to add a duplicate symbol. (I probably will do this, but I don't want the database reliant on that.) I could also require a tag within the formula table to specify when it's using another formula:
symbol  | content
--------+--------------
rugsize | f(area)*lune
I'm not crazy about that since it puts a burden on the user to remember that, or on the coder to secretly add and remove the "f()".
Everything I found on Stack and elsewhere went the other way: Forcing a column value to be present in another table, except for one suggestion that the unique items be kept in a separate table.
symbol
--------
area
lune
pi
rugsize
And then...actually, I'm still not sure how that would work at the DB level.
So, is there a way to do this with constraints or foreign keys, or must I write a trigger for each table to look into the other table?
Addition: I've simplified here greatly but the elements table is much more complex than I'm showing and has little in common with the formulae table.
Edited to add the above addition and to try to fix the one-column "symbol" table which looks fine in the editing preview but does not format correctly on the actual page.
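
For what it's worth, one way the separate-table suggestion could work at the DB level is a shared symbols table with a kind discriminator. This is only a sketch with illustrative names, and it assumes each child table carries a redundant kind column so a composite foreign key can be used:

CREATE TABLE symbols (
    symbol text PRIMARY KEY,
    kind   text NOT NULL CHECK (kind IN ('element', 'formula')),
    UNIQUE (symbol, kind)   -- target for the composite foreign keys below
);

CREATE TABLE elements (
    symbol  text PRIMARY KEY,
    kind    text NOT NULL DEFAULT 'element' CHECK (kind = 'element'),
    content numeric NOT NULL,
    FOREIGN KEY (symbol, kind) REFERENCES symbols (symbol, kind)
);

CREATE TABLE formulas (
    symbol  text PRIMARY KEY,
    kind    text NOT NULL DEFAULT 'formula' CHECK (kind = 'formula'),
    content text NOT NULL,
    FOREIGN KEY (symbol, kind) REFERENCES symbols (symbol, kind)
);

Because symbols.symbol is the primary key, each symbol gets exactly one kind, so it can appear in elements or in formulas but never in both.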

PostgreSQL array of elements that each are a foreign key

I am attempting to create a DB for my app and one thing I'd like to find the best way of doing is creating a one-to-many relationship between my Users and Items tables.
I know I can make a third table, ReviewedItems, and have the columns be a User id and an Item id, but I'd like to know if it's possible to make a column in Users, let's say reviewedItems, which is an integer array containing foreign keys to Items that the User has reviewed.
If PostgreSQL can do this, please let me know! If not, I'll just go down my third table route.
It may soon be possible to do this: https://commitfest.postgresql.org/17/1252/ - Mark Rofail has been doing some excellent work on this patch!
The patch will (once complete) allow
CREATE TABLE PKTABLEFORARRAY (
    ptest1 int PRIMARY KEY,
    ptest2 text
);

CREATE TABLE FKTABLEFORARRAY (
    ftest1 int[],
    FOREIGN KEY (EACH ELEMENT OF ftest1) REFERENCES PKTABLEFORARRAY,
    ftest2 int
);
However, the author currently needs help rebasing the patch (beyond my own ability), so anyone reading this who knows Postgres internals, please help if you can.
No, this is not possible.
PostgreSQL is a relational DBMS, operating most efficiently on properly normalized data models. Arrays are not relational data structures: a relation is an unordered set of tuples, while an array is an ordered list that may contain duplicates. Although the SQL standard supports defining foreign keys on array elements, PostgreSQL currently does not. There is an effort to implement this (apparently dormant; no activity on the commitfest since February 2021) - see this answer to this same question - so the functionality might one day be supported.
For the time being you can, however, build a perfectly fine database with array elements pointing to primary keys in other tables. Those array elements simply cannot be declared as foreign keys, so the DBMS will not maintain referential integrity for them. With an appropriate set of triggers (on both the referenced and referencing tables, since a change on either side has to trigger a check and possibly an update on the other) you could in principle enforce referential integrity over the array elements yourself, but performance is unlikely to be stellar (indexes would not be used for the checks, for instance).
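
As a rough sketch of the referencing-side trigger (table and column names are adapted from the question, not prescribed; deletes from the referenced table would need a companion trigger of their own):

CREATE OR REPLACE FUNCTION check_reviewed_items() RETURNS trigger AS $$
BEGIN
    -- Reject the row if any element of the array has no matching item.
    IF EXISTS (
        SELECT 1
        FROM unnest(NEW.reviewed_items) AS r(item_id)
        LEFT JOIN items i ON i.id = r.item_id
        WHERE i.id IS NULL
    ) THEN
        RAISE EXCEPTION 'reviewed_items contains an id not present in items';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_reviewed_items_check
    BEFORE INSERT OR UPDATE OF reviewed_items ON users
    FOR EACH ROW EXECUTE PROCEDURE check_reviewed_items();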

Is it relevant to put "version" on a separate sql server table?

I have a table with several fields. This table almost never changes, except for one field, "version", which changes very often.
Would it be relevant to put that single field into a separate table in order to reduce how often locks are put on the main table?
For instance I have a table tType and a table tEntry.
Whenever I add/delete/update any row of tEntry, I need to update the "version" field of tType. There might be thousands of rows inside tEntry for a single referenced tType row, meaning the version number could change very often, even though the other data in tType (such as name, id, etc.) doesn't change.
Your reference to tType and tEntry sounds like you are implementing a key-value store in an RDBMS. There are several discussions you can google on this topic; on the web there seems to be a consensus that the cons outweigh the pros. An option would be to look at key-value stores, NoSQL, multi-column DBs, etc. (Wikipedia)...
The next "anti-pattern" I recognized is that you try to mix transactional data with 'master data' in the table tType. Try to avoid this, even if your selects get more uncomfortable and need to be tuned better. Keep off the version info from the tType, if this changes extremely often. Look here to get the concept: MySQL JOIN the most recent row only?

Dynamic auditing of data with PostgreSQL trigger

I'm interested in using the following audit mechanism in an existing PostgreSQL database.
http://wiki.postgresql.org/wiki/Audit_trigger
but, would like (if possible) to make one modification. I would also like to log the primary key's value so it can be queried later. So, I would like to add a field named something like "record_id" to the "logged_actions" table. The problem is that every table in the existing database has a different primary key field name. The good news is that the database has a very consistent naming convention: it's always <table name>_id. So, if a table is named "employee", the primary key is "employee_id".
Is there any way to do this? Basically, I need something like OLD.FieldByName(x) or OLD[x] to get the value out of the id field to put into the record_id field in the new audit record.
I do understand that I could just create a separate, custom trigger for each table that I want to keep track of, but it would be nice to have it be generic.
edit: I also understand that the key value does get logged in either the old/new data fields. But, what I would like would be to make querying for the history easier and more efficient. In other words,
select * from audit.logged_actions where table_name = 'xxxx' and record_id = 12345;
another edit: I'm using PostgreSQL 9.1
Thanks!
You didn't mention your version of PostgreSQL, which is very important when writing answers to questions like this.
If you're running PostgreSQL 9.0 or newer (or able to upgrade) you can use this approach as documented by Pavel:
http://okbob.blogspot.com/2009/10/dynamic-access-to-record-fields-in.html
In general, what you want is to reference a dynamically named field in a record-typed PL/PgSQL variable like 'NEW' or 'OLD'. This has historically been annoyingly hard, and is still awkward but is at least possible in 9.0.
Your other alternative - which may be simpler - is to write your audit triggers in plperlu, where dynamic field references are trivial.
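
If neither of those appeals, here is one sketch of the dynamic-field idea using the hstore extension (available on 9.1) together with the <table>_id naming convention from the question; the function and trigger names are made up for illustration:

CREATE EXTENSION IF NOT EXISTS hstore;

CREATE OR REPLACE FUNCTION audit_with_record_id() RETURNS trigger AS $$
DECLARE
    pk_column text := TG_TABLE_NAME || '_id';
    pk_value  text;
BEGIN
    -- hstore(row) gives column-name => value pairs, so the primary key
    -- column can be picked out by a name computed at run time.
    IF TG_OP = 'DELETE' THEN
        pk_value := hstore(OLD) -> pk_column;
    ELSE
        pk_value := hstore(NEW) -> pk_column;
    END IF;
    -- ... insert pk_value into audit.logged_actions.record_id here,
    -- alongside whatever else the generic audit trigger already logs ...
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER employee_audit
    AFTER INSERT OR UPDATE OR DELETE ON employee
    FOR EACH ROW EXECUTE PROCEDURE audit_with_record_id();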

Postgres full text search across multiple related tables

This may be a very simplistic question, so apologies in advance, but I am very new to database usage.
I'd like to have Postgres run its full text search across multiple joined tables. Imagine something like a model User, with related models UserProfile and UserInfo. The search would only be for Users, but would include information from UserProfile and UserInfo.
I'm planning on using a GIN index for the search. I'm unclear, however, on whether I'm going to need a separate tsvector column in the User table to hold the aggregated tsvectors from across the tables, and to set up triggers to keep it up to date. Or whether it's possible to create an index without a tsvector column that will keep itself up to date whenever any of the relevant fields in any of the relevant tables change. Also, any tips on the syntax of the commands to create all this would be much appreciated.
Your best answer is probably to have a separate tsvector column in each table (with an index on it, of course). If you aggregate the data up into a shared tsvector, that will create a lot of updates on that shared column whenever any of the individual ones change.
You will need one index per table. Then when you query it, obviously you need multiple WHERE clauses, one for each field. PostgreSQL will then automatically figure out which combination of indexes to use to give you the quickest results - likely using bitmap scanning. It will make your queries a little more complex to write (since you need multiple column matching clauses), but that keeps the flexibility to only query some of the fields in the cases where you want.
You cannot create one index that tracks multiple tables. To do that you need the separate tsvector column and triggers on each table to update it.
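
A rough sketch of that per-table setup, with assumed table and column names, including the built-in tsvector_update_trigger that keeps each column current:

ALTER TABLE users         ADD COLUMN search tsvector;
ALTER TABLE user_profiles ADD COLUMN search tsvector;

CREATE INDEX users_search_idx         ON users         USING gin (search);
CREATE INDEX user_profiles_search_idx ON user_profiles USING gin (search);

-- One trigger per table; "name" and "bio" stand in for whatever text
-- columns actually feed the search. Repeat for user_profiles.
CREATE TRIGGER users_search_update
    BEFORE INSERT OR UPDATE ON users
    FOR EACH ROW EXECUTE PROCEDURE
    tsvector_update_trigger(search, 'pg_catalog.english', name, bio);

-- The query then matches each table's column separately and lets the
-- planner choose how to combine the GIN indexes:
SELECT u.*
FROM users u
JOIN user_profiles p ON p.user_id = u.id
WHERE u.search @@ plainto_tsquery('english', 'some words')
   OR p.search @@ plainto_tsquery('english', 'some words');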