PostgreSQL: how to define and use "global" constants

I am writing a few stored procs that process some batch upload data. Each input line can be flagged for a variety of application errors. I have nearly 100 different types of errors in all, and over a dozen different file load procedures.
In C/C++ the idiom for error codes is a bunch of #define or const in a project-wide include (class) file and then using the symbolic names in application code. The compilers check for wayward spellings. Java/C# too offer a similar construct. How does one obtain a similar effect in plpgsql? I have toyed with setting up these in postgresql.conf but is that a sound approach? It obviously will not work at compile time. And I don't want to grant write privileges to conf files to application developers. Further, it will require a reload of conf for every application change, possibly a system stability issue. I am sure there are many other drawbacks.
In a like vein, I also need plain "user-defined" types to fix the representation of certain application data types, such as "part_number" as varchar(20), "currency_code" as char(3) and so on. Again, in C/C++ one would use typedef or struct as the case might be. So I tried creating a TYPE in PostgreSQL for consistent usage across tables, views and function headers. But with the UDTs I ran into a new set of issues: specifying primary keys, and CSV input specs where the value must now be given in parentheses. Is there a different way of dealing with such objectives in PostgreSQL?
I am new to PostgreSQL. We are using 9.2 on Linux. I am tempted to use a pre-processor but then it will not be compatible with any design tool I have seen.

For your first question you could potentially use an ENUM type.
CREATE TYPE flag AS ENUM ('ok', 'bad', 'superbad');
That would at least let PostgreSQL sanity-check the spelling of each flag state.
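A minimal sketch of how the ENUM suggestion could apply to the batch-load error flags. The names load_error and upload_line and the labels are hypothetical, chosen only for illustration. Note that the spelling is checked when the statement runs, not when a plpgsql function containing it is created.

```sql
-- Hypothetical names: load_error, upload_line and the labels are illustrative.
CREATE TYPE load_error AS ENUM ('ok', 'bad_part_number', 'bad_currency');

CREATE TABLE upload_line (
    line_no integer PRIMARY KEY,
    raw     text NOT NULL,
    status  load_error NOT NULL DEFAULT 'ok'
);

INSERT INTO upload_line (line_no, raw) VALUES (1, 'P-100,USD,12.50');

-- A valid label is accepted.
UPDATE upload_line SET status = 'bad_currency' WHERE line_no = 1;

-- A misspelled label is rejected with an
-- "invalid input value for enum load_error" error.
UPDATE upload_line SET status = 'bad_curency' WHERE line_no = 1;
```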
For your second question (and please ask separate questions in the future, since that keeps things on topic) you might want to look at DOMAINs.
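A minimal sketch of the DOMAIN approach for the part_number and currency_code types from the question. The table name part_price and the upper-case CHECK are assumptions added for illustration. A domain over a scalar type uses its base type's input format, so values are written as plain text without the parentheses a composite type needs, and a domain column can serve as a primary key.

```sql
-- Domains fix the representation once; tables and functions reuse the name.
CREATE DOMAIN part_number AS varchar(20);

-- The CHECK is optional and purely illustrative.
CREATE DOMAIN currency_code AS char(3)
    CHECK (VALUE = upper(VALUE));

-- Hypothetical table using both domains.
CREATE TABLE part_price (
    part     part_number PRIMARY KEY,
    currency currency_code NOT NULL,
    price    numeric(12,2) NOT NULL
);

-- Values are supplied exactly as for varchar / char.
INSERT INTO part_price VALUES ('P-100', 'USD', 12.50);
```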

Related

Is it safe to use fundamental type OID defined in catalog header on client code?

This QA entry shows how to get OID code from catalog header.
It might be the simplest way to get OID numbers. However, the header file itself is explicitly separated from the client-side headers, which seems to imply that it is not meant to be used on the client side.
Is it safe to use these server-side constants on the client side? Predictably, this will cause some legacy issues: an older server version may lack a specific OID. So I ask excluding this case. I mean, can I assume that *once defined, the OID of a fundamental type stays the same forever in future versions*?
Update
I meant only fundamental types, such as TEXTOID, INT8OID or TIMESTAMPOID. No custom, composite, user-defined or any other non-fundamental stuff.
Currently, this is the best I could find. I would go with hardcoding OIDs.
Quoting Merlin Moncure from the mailing list entry:
built in type oids are defined in pg_type.h:
cat src/include/catalog/pg_type.h | grep OID | grep define
built in type oids don't change. you can pretty much copy/pasto the output of above into an app...just watch out for some types that may not be in older versions.
user defined type oids (tables, views, composite types, enums, and domains) have an oid generated when it is created. since that oid can change via ddl so you should look it up by name at appropriate times.
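A small sketch of the "look it up by name" advice from the quote, using a regtype cast. The type order_state is hypothetical and created here only so the example is self-contained. The three built-in values shown are the ones defined as TEXTOID, INT8OID and TIMESTAMPOID in pg_type.h.

```sql
-- Built-in types: resolved by name, matching the catalog header constants.
SELECT 'text'::regtype::oid      AS text_oid,       -- 25   (TEXTOID)
       'int8'::regtype::oid      AS int8_oid,       -- 20   (INT8OID)
       'timestamp'::regtype::oid AS timestamp_oid;  -- 1114 (TIMESTAMPOID)

-- User-defined type (hypothetical): the OID is assigned at creation time,
-- so it differs between databases and changes if the type is recreated.
CREATE TYPE order_state AS ENUM ('new', 'shipped');
SELECT 'order_state'::regtype::oid AS order_state_oid;
```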
Is it safe to use these server-side constants on client side? (...) I mean, can I assume once define OID code to be same eternally on future versions?
I honestly doubt it's safe. The oids will likely be different depending on the Postgres version that was initially installed when new types etc. are introduced. Older installs that get upgraded may or may not have the same oids as fresh installs.
For illustration purposes, picture yourself creating an admin user with ID 1 in an app when it gets installed, and hard-coding everything in C by defining ADMIN_USER as that ID. Your customers then add new users, etc. In a subsequent release, you add a quasi-admin user with ID 2 and proceed to hard-code everything around that too. Customers upgrade... and, well, it blows up in your face because on their end, the quasi-admin user can have any ID...
When you use hard-coded oids in Postgres, the same kind of thing may happen. In one version, the built-in types are created in a certain order; in the next, they may be created in another because e.g. Postgres adds a shiny new enum or int4range type. And this doesn't begin to touch the topic of what may potentially occur during upgrades. (Admittedly, dumping and reloading data should yield sane things here, but I wouldn't take the chance myself.)

Is it possible to create permanent object aliases

I recently found myself using some rather lengthy names for the tables and views involved in a development piece, which got me wondering whether it's possible to create client/database/server level aliases for objects.
Say for example I have a view named dbo.vAlphaBetaGammaDelta . Is there a way (with or without Intellisense) to create a reference to it named dbo.vABGD ?
If not, would there be any downsides to creating a view of a view or of a single table, aside from the maintenance necessary if/when the table schema changes?
I should note that these aliases/views would not be intended for use in other objects, but for alleviation and prevention of carpal tunnel during day-to-day troubleshooting and delving xD
SQL Server allows for the creation of synonyms. That seems to be what you are looking for: http://msdn.microsoft.com/en-us/library/ms177544.aspx
However, as @MitchWheat mentioned, this seems to be going in the wrong direction. There are a few quite good SSMS plugins available that provide auto completion of long object names (e.g. SQL Prompt). Incidentally, those products have trouble with synonyms...
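For reference, a sketch of the synonym from the answer above, using the object names from the question. This is T-SQL for SQL Server, not PostgreSQL, and it assumes the view dbo.vAlphaBetaGammaDelta already exists.

```sql
-- T-SQL (SQL Server). Assumes dbo.vAlphaBetaGammaDelta already exists.
CREATE SYNONYM dbo.vABGD FOR dbo.vAlphaBetaGammaDelta;

-- The short name can now be used in ad hoc queries.
SELECT TOP (10) * FROM dbo.vABGD;

-- Removing the synonym leaves the underlying view untouched.
DROP SYNONYM dbo.vABGD;
```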
There are many cases where you would like to have synonyms.
Let me state just one to start:
You have a well defined hypothetical name of a table: GlobalStatisticalRecord. Hundreds of lines of code and objects (keys, indexes, etc.) in SQL and elsewhere are referring to this table name.
After 5 years of usage, the abbreviation GSR became customary not only among the technical people, but also among the business users. So, to stress again, GSR is now even more recognizable than GlobalStatisticalRecord. However, for the new people that come into the technical team, it is good to keep the name GlobalStatisticalRecord as the table name, since it nicely describes what the table is all about. Now, when writing a quick adhoc query (and that may not be from your tool of choice with all the Intellisense features you are accustomed to), these aliases really save you time (and "life" at 2 am when you are frantically trying to diagnose a production problem).
Please, if you never faced a case when you would need this, just don't assume that there is none.
I stressed the adhoc adjective, since I agree that in permanent queries (stored procedures, etc.), for the reasons you pointed out, it is advisable to use the full table names.

How to organize parameters for a postgres application

I am working on a postgres application. For the moment I am not sure how to manage application constant parameters best. For example I want to define a threshold variable which I am going to use in several functions.
One idea is to make a table "config" and query the variable every time I need it, and as a shortcut wrap the SQL query in another function, i.e.: t := get_Config('Threshold');
But in fact I am not really happy with this. What is the best way to handle custom application configuration parameters? They should be easy to maintain, and I want to avoid querying every time for constants. In Oracle, for example, you could compile constants into package specs. Are there any better ways to deal with such configuration parameters?
I have organized global parameters just the way you describe it for some years now. It seems a bit awkward but it works just fine.
I have got quite a number of those, so I added an integer id column plus an index to my config table and use get_config($my_id) (plus a comment), which is slightly faster but less readable.
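A minimal sketch of the config-table approach discussed above, in the name-keyed form from the question. The column names, the sample value 0.75 and the STABLE marking are assumptions for illustration, not the answerer's actual schema.

```sql
-- Hypothetical layout: one row per named parameter, values stored as text.
CREATE TABLE config (
    name  text PRIMARY KEY,
    value text NOT NULL
);

INSERT INTO config (name, value) VALUES ('Threshold', '0.75');

-- STABLE tells the planner the result does not change within a statement.
CREATE FUNCTION get_config(text) RETURNS text
LANGUAGE sql STABLE AS $$
    SELECT value FROM config WHERE name = $1;
$$;

-- Usage from plpgsql, casting to the type the caller needs.
DO $$
DECLARE
    t numeric := get_config('Threshold')::numeric;
BEGIN
    RAISE NOTICE 'threshold = %', t;
END
$$;
```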
OR you can use custom_variable_classes. See:
How to declare variable in PostgreSQL
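A short sketch of the custom-variable route. The name myapp.threshold is hypothetical. Before PostgreSQL 9.2 the prefix had to be listed in custom_variable_classes in postgresql.conf; that parameter was removed in 9.2, where any two-part name is accepted.

```sql
-- Set a custom parameter for the current session
-- (third argument false = not limited to the current transaction).
SELECT set_config('myapp.threshold', '0.75', false);

-- Read it back anywhere in the session, including inside functions.
SELECT current_setting('myapp.threshold')::numeric AS threshold;
```

Settings are stored as text, so callers cast to the type they need, just as with the config table.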

How can I build a generic dataset-handling Perl library?

I want to build a generic Perl module for handling and analysing biomedical character-separated datasets, which can most certainly be used on any kind of dataset that contains a mixture of categorical (A,B,C,..), continuous (1.2,3,881..) and identifier (XXX1,XXX2...) variables. The plan is to have people initialize the module and then use some arguments to point to the data file(s), the place where the analysis reports should be written and the structure of the data.
By structure of data I mean which variable is in which place and its name/type. And this is where I need some enlightenment. I am baffled how to do this in a clean way. Obviously, having people create a simple schema file, be it XML or some other format would be the cleanest but maybe not all people enjoy doing something like this.
The solutions I can think of are:
Create a configuration file in XML or similar and with a prespecified format.
Pass the information during initialization of the module.
Use the first row of the data as headers and try to guess types (ouch)
Surely there must be a "canonical" way of doing this that is also usable and efficient.
This doesn't answer your question directly, but have you checked CPAN? It might have the module you need already. If not, it might have similar modules -- related either to biomedical data or simply to delimited data handling -- that you can mine for good ideas, both concerning formats for metadata and your module's API.
Any of the approaches you've listed could make sense. It all depends on how complex the data structures and their definitions are. What will make something like this useful to people is whether it saves them time and effort. So, your decision should be based on which approach best satisfies the need to make:
use of the module easy
reuse of data definitions easy
the data definition language sufficiently expressive to describe all known use cases
the data definition language sufficiently simple that an infrequent user can spend minimal time with the docs before getting real work done.
For example, if I just need to enter the names of the columns and their types (and there are only 4 well defined types), doing this each time in a script isn't too bad. Unless I have 350 columns to deal with in every file.
However, if large, complicated structure definitions are common, then a more modular reuse oriented approach is better.
If your data description language is difficult to work with, you can mitigate the issue a bit by providing a configuration tool that allows one to create and edit data schemes.
Rx might be worth looking at, as well as the Data::Rx module on CPAN. It provides schema checking for JSON, but there is nothing inherent in the model that makes it JSON-only.

Using constants for message keys and database table names and column names

Recently there was a big debate during a code review session on the use of constants.
The developers had used constants for the following purposes:
Each and every message key used in the i18N application was declared as a constant. The application contained around 3000 message keys and hence the same number of constants.
Each and every database column name was declared as a constant. There were around 5000 column names and still counting...
Does it make sense to have such a huge number of constants in any application?
IMHO, common sense should prevail. Message keys just don't need to be declared as constants. We already have one level of indirection - why add one more?
Reg. database column names, I have mixed opinions. If a column is being used in multiple classes, does it make sense to declare it as a global constant?
Please pour in with your thoughts...
If I18N message keys aren't defined as constants, how do you enforce consistency? How do you automatically differentiate between a typo and a missing value? How do you audit to make sure that all I18N keys are fulfilled in each new language file?
As to database columns, you could definitely use some indirection - if your application knows about column names, you've got a binding problem. So there, you might consider a config file with the actual column names - but of course, you would want to refer to the column names by symbolic keys, which should be defined as auditable constants, just like the I18N keys.
I think is a good practice to put message keys used for i18N as constants.
I don't see much benefits in doing the same for the DB columns, if you have a well designed persistence layer.
This depends on the programming language, I think.
In PHP it's not uncommon to use defines (aka constants) for such things, while I'd not use this in Java or C#.
In most projects we tried to extract the SQL to templates, so not only the table and column names were configurable but the whole sql statement. We used velocity for basic templating mechanics like variables, small loops,...
Regarding the language constants:
Another layer doesn't make much sense to me, but you have to choose your identifiers for the language translation carefully. Using the whole English sentence as the key may end up causing a lot of work for the translators if, for example, you fix the wording of the English sentence without changing its meaning. Then all translators would have to update their files.
If the constant is used in multiple places and the compiler really catches the problem, yes.