PostgreSQL Locale Language - postgresql

I'm running Debian 3.2 with PostgreSQL 9.2 installed. A co-worker of mine initialized the database cluster with a Japanese locale. So now every database created with createdb, no matter who the user is, is instantiated with Japanese as its language.
I cannot find a way to reset the language back to English without running initdb again on a new cluster. I really don't want to do that, as there is a lot of data here and I can't afford the downtime.
From what I've read, the cluster gets set to Japanese when it is initialized with initdb -D /place/for/data --locale=ja_JP. However, when I create a cluster that way myself and then use createdb to create a new database, the error messages are in English. In my co-worker's cluster, the same scenario produces Japanese error messages.
Can someone help me understand how locales work in PostgreSQL 9.2? And what would be the way to change the cluster back to English?

You can't fully change the cluster back into English without a full dump and reload.
In general the postmaster will emit messages in the language and encoding the database cluster was initdb'd in. Sessions on individual databases will emit messages in the language and encoding that database was created in, which may not be the same as the cluster defaults.
This can lead to logs in mixed languages and mixed text encodings, which is really pretty ugly. Despite repeated discussions on the mailing lists we've never come to an agreement on how to solve that - it's more complicated than it looks.
Because each session logs in its own locale settings, what you can do is CREATE DATABASE ... with the appropriate LC_CTYPE, ENCODING, LC_COLLATE, etc settings per the CREATE DATABASE manual page. You may have to specify TEMPLATE template0 for this to succeed. This will result in newly created databases being in the desired language and encoding; you can then dump each old DB into a corresponding new one, rename the old one, and rename the new one into the old one's place. The old one can be dropped when you're comfortable all's well.
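A sketch of that flow, assuming a database named mydb and the en_US.UTF-8 locale (both are examples, not taken from the question):

```shell
# Create a replacement database with English locale settings.
# TEMPLATE template0 is required because the locales differ from template1's.
psql -U postgres -c "CREATE DATABASE mydb_new TEMPLATE template0
      ENCODING 'UTF8' LC_COLLATE 'en_US.UTF-8' LC_CTYPE 'en_US.UTF-8'"

# Copy the old database's contents into the new one.
pg_dump -U postgres mydb | psql -U postgres -d mydb_new

# Once you're satisfied all's well, swap the names
# (requires no active connections to either database).
psql -U postgres -c "ALTER DATABASE mydb RENAME TO mydb_old"
psql -U postgres -c "ALTER DATABASE mydb_new RENAME TO mydb"
```

Note that the message language itself is governed by the lc_messages setting, which can be pinned per database with ALTER DATABASE mydb SET lc_messages = 'en_US.UTF-8'.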
Postmaster-level messages will still be in Japanese. I don't think there's a way around that without a re-initdb. Worse, if it's not ja_JP.UTF-8 but a different encoding, you might end up with mixed encodings in your log files, something that can quite upset logfile processors, etc.

Related

After using pg_dump behind pg_bouncer, the search_path appears to be altered and other clients are affected

My network looks like this:
app nodes --(many connections)--> pg_bouncer --(few sessions)--> PostgreSQL
So pg_bouncer multiplexes connections giving app nodes the illusion that they are all connected directly.
The issue comes when I launch pg_dump: a few milliseconds after the dump finishes, all app nodes fail with errors saying "relation xxxx does not exist", though the table or sequence is actually there. I'm pretty sure the cause is pg_dump manipulating the search_path variable in a pooled pg_bouncer session, so that app nodes no longer find tables in my schema. This happens at dump time, even if the dump file is never imported or executed.
Note: I've searched SO and Google, and I've seen there are many threads asking about the search_path in the generated file, but that's not what I'm asking about. I have no problem with the generated file; my issue is the pg_bouncer session that other clients are sharing, and I haven't found anything about this.
The most obvious workaround would be to set search_path manually in the app, but beware the fallacy here: it's useless for the app to set it once at connection time, since it may be handed a different pg_bouncer session at the next transaction. And I can't keep setting it on every transaction.
The next most obvious workaround would be to set it back to the intended value immediately after launching pg_dump, but there's a race condition there, and the other nodes are quick enough that I fear they would still fail.
Is there a way to stop pg_dump from manipulating this variable, or to make sure it resets it before exiting?
(Also, I'm taking it for granted that pg_dump and search_path are the cause; can you suggest a way to confirm that? All the evidence I have is the errors a few milliseconds later, and the SET search_path instruction in the generated file, which produces the same errors if executed.)
Thanks
Don't connect pg_dump through pgbouncer with transaction pooling. Just change the port number so it connects directly to the database. pg_dump is incompatible with transaction pooling.
You might be able to get it to work anyway by setting server_reset_query_always = 1
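For example (the host name and port numbers here are the common defaults, not necessarily yours):

```shell
# Point pg_dump at PostgreSQL directly (usually port 5432)
# rather than at pgbouncer (usually port 6432).
pg_dump -h dbhost -p 5432 -U postgres mydb > mydb.sql
```

The fallback setting goes in the [pgbouncer] section of pgbouncer.ini: server_reset_query_always = 1 makes pgbouncer issue the reset query after every transaction, at some performance cost.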

Is there a way to show everything that was changed in a PostgreSQL database during a transaction?

I often have to execute complex sql scripts in a single transaction on a large PostgreSQL database and I would like to verify everything that was changed during the transaction.
Verifying each single entry on each table "by hand" would take ages.
Dumping the database before and after the script to plain sql and using diff on the dumps isn't really an option since each dump would be about 50G of data.
Is there a way to show all the data that was added, deleted or modified during a single transaction?
What you are looking for is essentially change data capture, one of the most searched-for database topics; you can think of it as a kind of version control for data.
As far as I know, though, there is sadly no built-in approach for this in PostgreSQL or MySQL. You can work around it by adding triggers for the operations you use most.
You can create some backup schemas and tables to capture the rows that are updated, inserted, or deleted.
This way you can achieve what you want. The process is entirely manual, but really effective.
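A minimal sketch of such a trigger (the schema, table, and function names are illustrative; row_to_json requires PostgreSQL 9.2 or later):

```sql
-- An audit table plus a trigger that records every change to "mytable".
CREATE SCHEMA audit;

CREATE TABLE audit.change_log (
    changed_at  timestamptz NOT NULL DEFAULT now(),
    table_name  text        NOT NULL,
    operation   text        NOT NULL,  -- INSERT / UPDATE / DELETE
    old_row     json,                  -- NULL for INSERT
    new_row     json                   -- NULL for DELETE
);

CREATE OR REPLACE FUNCTION audit.log_change() RETURNS trigger AS $$
BEGIN
    INSERT INTO audit.change_log (table_name, operation, old_row, new_row)
    VALUES (TG_TABLE_NAME, TG_OP,
            CASE WHEN TG_OP <> 'INSERT' THEN row_to_json(OLD) END,
            CASE WHEN TG_OP <> 'DELETE' THEN row_to_json(NEW) END);
    RETURN NULL;  -- AFTER trigger: return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_audit
AFTER INSERT OR UPDATE OR DELETE ON mytable
FOR EACH ROW EXECUTE PROCEDURE audit.log_change();
```

After the transaction, SELECT from audit.change_log to see exactly which rows changed and how.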
If you need to analyze the script's behaviour only sporadically, then the easiest approach would be to change the server configuration parameter log_min_duration_statement to 0, and back to whatever value it had, once the analysis is done. Then all of the script's activity will be written to the instance log.
This approach is not suitable if your storage is not prepared to accommodate this amount of data, or for systems in which you don't want sensitive client data to be written to a plain-text log file.
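On PostgreSQL 9.4 or later this can be toggled from psql without editing postgresql.conf by hand (a sketch; on older versions, edit the config file and reload instead):

```sql
-- Capture every statement in the server log, then restore the old setting.
ALTER SYSTEM SET log_min_duration_statement = 0;
SELECT pg_reload_conf();

-- ... run the script in its transaction, then inspect the server log ...

ALTER SYSTEM RESET log_min_duration_statement;
SELECT pg_reload_conf();
```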

Upgrading from Postgres 7.4 to 9.4.1

I'm upgrading Postgres from ancient 7.4 to 9.4.1 and seeing some errors.
On the old machine, I did:
pg_dumpall | gzip > db_pg_bu.gz
On the new machine, I did:
gunzip -c db_pg_bu.gz | psql
While restoring I got a number of errors which I don't understand, and don't know the importance of. I'm not a DBA, just a lowly developer, so if someone could help me understand what I need to do to get this migration done I would appreciate it.
Here are the errors:
ERROR: cannot delete from view "pg_shadow"
DETAIL: Views that do not select from a single table or view are not automatically updatable.
HINT: To enable deleting from the view, provide an INSTEAD OF DELETE trigger or an unconditional ON DELETE DO INSTEAD rule.
I also got about 15 of these:
NOTICE: SYSID can no longer be specified
And this, although it looks harmless, since I saw that plpgsql is installed by default starting in version 9.2:
ERROR: could not access file "/usr/lib/postgresql/lib/plpgsql.so": No such file or directory
SET
NOTICE: using pg_pltemplate information instead of CREATE LANGUAGE parameters
ERROR: language "plpgsql" already exists
A big concern is that, as it restores the databases, for each one I see something like this:
COMMENT
You are now connected to database "landrush" as user "postgres".
SET
ERROR: could not access file "/usr/lib/postgresql/lib/plpgsql.so": No such file or directory
There are basically two ways. Both are difficult for the inexperienced. (and maybe even for the experienced)
Do a stepwise migration, using a few intermediate versions (which will probably have to be compiled from source). Between versions you'd have to do a pg_dump --> pg_restore (or just psql < dumpfile, as in the question). A possible first hop could be 7.4 -> 8.3, but an additional hop might be needed.
Edit the (uncompressed) dumpfile: remove (or comment out) anything the new version does not like. This will be an iterative process, and it assumes your dump fits into your editor (and that you know what you are doing). You might need to re-dump with schema and data separated (options --schema-only and --data-only; I don't even know whether these were available in PG 7.4).
BTW: it is advisable to use the pg_dump from the newer version (the one you will import into); you'll need to specify the source host via the -h flag. The new (target) version knows what the new version needs and will try to adapt (up to a certain point; you may still need more than one step). It will also refuse to work if it cannot produce a usable dump, in which case you'll have to make smaller steps.
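A sketch of that invocation (the 9.4 binary path and the host name are examples; adjust them to your installation):

```shell
# Run the NEW server's pg_dumpall on the new machine, pulling the data
# over the network from the old 7.4 server.
/usr/lib/postgresql/9.4/bin/pg_dumpall -h oldserver -U postgres \
    | gzip > db_pg_bu.gz
```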
Extra:
if the result of your failed conversion is complete enough, and if you are only interested in the basic data, you could just stop here, and maybe polish a bit.
NOTICE: using pg_pltemplate information instead of CREATE LANGUAGE parameters: I don't know what this is. Maybe it's the way that additional languages, such as plpgsql, were added to the core dbms.
ERROR: language "plpgsql" already exists: you can probably ignore this error; comment out the offending lines.
DETAIL: Views that do not select from a single table or view are not automatically updatable. This implies that the postgres RULE rewrite system is used in the old DB. It will need serious work to get it working again.

Database encoding in PostgreSQL

I have recently started using PostgreSQL for creating/updating existing SQL databases. Being rather new to this, I came across the issue of selecting the correct encoding type while creating a new database. UTF-8 (the default) did not work for me, as the data to be included is in various languages (English, Chinese, Japanese, Russian, etc.) and also includes symbolic characters.
Question: what is the right database encoding type to satisfy my needs?
Any help is highly appreciated.
There are four different encoding settings at play here:
The server side encoding for the database
The client_encoding that the PostgreSQL client announces to the PostgreSQL server. The PostgreSQL server assumes that text coming from the client is in client_encoding and converts it to the server encoding.
The operating system default encoding. This is the default client_encoding set by psql if you don't provide a different one. Other client drivers might have different defaults; e.g. PgJDBC always uses UTF-8.
The encoding of any files or text being sent via the client driver. This is usually the OS default encoding, but it might be a different one - for example, your OS might be set to use utf-8 by default, but you might be trying to COPY some CSV content that was saved as latin-1.
You almost always want the server encoding set to utf-8. It's the rest that you need to change depending on what's appropriate for your situation. You would have to give more detail (exact error messages, file contents, etc) to be able to get help with the details.
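The client-side settings can be inspected and changed per session; a sketch in psql (the latin1 value is just an example):

```sql
SHOW server_encoding;            -- how the database stores text (fixed at CREATE DATABASE)
SHOW client_encoding;            -- what the server assumes the client sends
SET client_encoding = 'latin1';  -- e.g. before COPYing a latin-1 CSV file
```

In psql the shortcut \encoding shows or sets the same session value.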

refused database connection causes encoding error

I have a PostgreSQL server with some databases. Every user can only connect to certain databases.
So far so good. I wanted to test whether everything worked, so I used pgAdmin III to log in with a restricted user. When I try to connect to a database the user has no connection rights to, something seems to happen to the logfile!
It can't be read by the server-status window anymore. All I get are a lot of messages about invalid byte sequences for encoding UTF8.
The only way to stop those message windows is to kill the program and force PostgreSQL to create a new logfile.
Can anyone explain why that happens and how I can stop it?
OK, I think the problem is the "ü" in "für". The error message seems to be complaining about a character code 0xfc which in latin1 (and similar) is lower case u with umlaut.
Messages sent back via a database connection should be translated to the client encoding. However, the log-file contains output from a variety of sources and according to this there were issues fairly recently (2012):
It's a known issue, I'm afraid. The PostgreSQL postmaster logs in the
system locale, and the PostgreSQL backends log in whatever encoding
their database is in. They all write to the same log file, producing a
log file full of mixed encoding data that'll choke many text editors.
So I'm guessing your system locale is 8859-1 (or -15 perhaps) whereas pgAdmin is expecting UTF-8. Short-term, you could set the system encoding to UTF-8; longer term, drop a bug report over to the pgAdmin team: one mis-encoded error message is helpful, but after that it should probably just put hex codes in the text or some such.
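The mismatch is easy to reproduce outside PostgreSQL; a minimal sketch in Python (the byte string stands in for a line of the log file):

```python
# 0xfc is "ü" in latin-1 (ISO 8859-1), but it is an invalid byte
# sequence in UTF-8, which is exactly what pgAdmin complains about.
raw = b"f\xfcr"  # "für" as written under a latin-1 system locale

print(raw.decode("latin-1"))  # decodes fine: für

try:
    raw.decode("utf-8")
except UnicodeDecodeError:
    print("invalid byte for UTF-8:", hex(raw[1]))  # 0xfc
```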