Getting log information from PostgreSQL - postgresql

My application uses PostgreSQL for storing data. I need to gather information about all user actions (any INSERTs, UPDATEs or DELETEs) executed on the database. This made me wonder whether PostgreSQL provides any default implementation/tables for this. In my searches so far I haven't found anything usable, but proper confirmation of my suspicions would be nice. If PostgreSQL truly doesn't provide any default implementation for this, then I will design my own history table.

PostgreSQL supports several methods for logging server messages, including stderr, csvlog and syslog. On Windows, eventlog is also supported. Set this parameter to a list of desired log destinations separated by commas. The default is to log to stderr only. This parameter can only be set in the postgresql.conf file or on the server command line.
See the PostgreSQL documentation on error reporting and logging.
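For illustration, here is roughly how such settings could be applied with ALTER SYSTEM instead of editing postgresql.conf by hand. The values are only examples, and log_statement = 'mod' is an extra setting, not mentioned above, that logs data-modifying statements such as INSERT, UPDATE and DELETE:

    -- Example values only; adjust to your environment. Requires superuser.
    ALTER SYSTEM SET log_destination = 'stderr,csvlog';
    ALTER SYSTEM SET logging_collector = on;  -- required for csvlog; takes effect after a server restart
    ALTER SYSTEM SET log_statement = 'mod';   -- log INSERT/UPDATE/DELETE (and DDL), not plain SELECTs
    SELECT pg_reload_conf();                  -- reload for the parameters that don't need a restart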

Related

Distinguish queries run by user directly, from internal queries, in log files

When checking the log files, there are a lot of internal/system queries (run internally by the pgAdmin tool, for example) that were not directly executed by the user.
Is there any way to distinguish which queries were run by the user directly and which are internal queries?
(btw, the config params I use for logging are log_statement=all and log_min_duration_statement=0)
I am not sure what you mean by "internal/system queries". Unless you use auto_explain with auto_explain.log_nested_statements = on, PostgreSQL will only log the queries that get sent by the client.
It may well be that some of these queries are internal queries of your database driver or application server, but PostgreSQL doesn't know that. For that, you'd have to enable logging at a component that has that information.
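One practical aid, not from the answer above but a common technique: have PostgreSQL stamp each log line with the client-supplied application_name (%a), so entries from pgAdmin, psql, an application server and so on can be told apart when filtering the log:

    -- %a = application_name, %u = user, %d = database; the prefix escapes are
    -- documented under log_line_prefix. Requires superuser.
    ALTER SYSTEM SET log_line_prefix = '%m [%p] %a %u@%d ';
    SELECT pg_reload_conf();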

Is there a way to show everything that was changed in a PostgreSQL database during a transaction?

I often have to execute complex sql scripts in a single transaction on a large PostgreSQL database and I would like to verify everything that was changed during the transaction.
Verifying each single entry on each table "by hand" would take ages.
Dumping the database before and after the script to plain sql and using diff on the dumps isn't really an option since each dump would be about 50G of data.
Is there a way to show all the data that was added, deleted or modified during a single transaction?
Dude, what you are looking for is one of the most searched-for things on the internet when it comes to capturing database changes; you could call it a kind of version control.
But as far as I know, there is sadly no built-in approach available in PostgreSQL or MySQL. You can work around this by adding triggers for the operations you use most.
You can create backup schemas and tables to capture the rows that are updated, created, or deleted.
In this way you can achieve what you want. I know this process is fully manual, but it is really effective.
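A minimal sketch of that trigger-based approach, assuming PostgreSQL 11 or later; the schema, table and function names here (audit, logged_actions, if_modified) are made up for the example:

    CREATE SCHEMA IF NOT EXISTS audit;

    CREATE TABLE IF NOT EXISTS audit.logged_actions (
        event_time timestamptz NOT NULL DEFAULT now(),
        table_name text NOT NULL,
        action     text NOT NULL,  -- 'INSERT', 'UPDATE' or 'DELETE'
        old_row    jsonb,          -- NULL for INSERT
        new_row    jsonb           -- NULL for DELETE
    );

    CREATE OR REPLACE FUNCTION audit.if_modified() RETURNS trigger AS $$
    BEGIN
        INSERT INTO audit.logged_actions (table_name, action, old_row, new_row)
        VALUES (TG_TABLE_NAME, TG_OP,
                CASE WHEN TG_OP IN ('UPDATE', 'DELETE') THEN to_jsonb(OLD) END,
                CASE WHEN TG_OP IN ('INSERT', 'UPDATE') THEN to_jsonb(NEW) END);
        RETURN NULL;  -- the return value of an AFTER ROW trigger is ignored
    END;
    $$ LANGUAGE plpgsql;

    -- Attach the trigger to each table you want to track, e.g. a hypothetical "accounts":
    CREATE TRIGGER accounts_audit
    AFTER INSERT OR UPDATE OR DELETE ON accounts
    FOR EACH ROW EXECUTE FUNCTION audit.if_modified();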
If you need to analyze the script's behaviour only sporadically, then the easiest approach would be to change the server configuration parameter log_min_duration_statement to 0 and then back to the value it had before the analysis. Then all of the script's activity will be written to the instance log.
This approach is not suitable if your storage is not prepared to accommodate this amount of data, or for systems in which you don't want sensitive client data to be written to a plain-text log file.
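A sketch of that toggle, assuming the parameter was previously -1 (disabled); run the script between the two changes and read the log afterwards:

    ALTER SYSTEM SET log_min_duration_statement = 0;  -- log every statement with its duration
    SELECT pg_reload_conf();
    -- ... execute the complex script in its transaction ...
    ALTER SYSTEM SET log_min_duration_statement = -1; -- restore the (assumed) previous value
    SELECT pg_reload_conf();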

Logging activity for one user on a specific database in Postgres

I need to log all activity for a specific user on the database. I have set up logging with ALTER ROLE username SET log_statement TO 'all'; and the logging works fine; all queries from the user are logged. The problem is that for this user, queries to Postgres internal schemas (pg_catalog) from clients like psql and pgAdmin are also logged. I have a bunch of lines with SELECT pg_catalog.quote_ident(n.nspname) || '.' || pg_catalog.quote_ident(c.relname).... in the log that are of no use to me. Even worse, these queries span more than one line in the log, so it's not easy to filter them out.
Is it possible to somehow restrict the logging only to one specific database or schema and not to include queries to other schemas like pg_catalog?
I don't know if the standard logging utility in Postgres has that option (my guess is no). But it may be worth a look at the external pgaudit extension for Postgres.
The pgaudit module is designed to generate audit logs, but it uses the standard Postgres logging tool. You can tweak several parameters to customize the logs, and it has a specific parameter that I think is perfect for your use case. From the documentation:
pgaudit.log_catalog
Specifies that session logging should be enabled in the case where all relations in a statement are in pg_catalog. Disabling this setting will reduce noise in the log from tools like psql and PgAdmin that query the catalog heavily.
The default is on.
I hope it helps!
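If pgaudit is installed (it has to be added to shared_preload_libraries first), turning that parameter off might look like this; a sketch based on the documentation quoted above:

    ALTER SYSTEM SET pgaudit.log_catalog = off;  -- suppress statements whose relations are all in pg_catalog
    SELECT pg_reload_conf();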
Change your logging format from text to csv (log_destination=csvlog) — you can then import the data to the database and then filter out the queries you are not interested in:
Using CSV-Format Log Output
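As a sketch of that workflow: the linked documentation gives the full CREATE TABLE postgres_log definition matching your server version; once that table exists, the CSV file can be loaded and filtered with plain SQL (the file path below is illustrative):

    -- Load the CSV log; the postgres_log table must be created first with the
    -- column list from the "Using CSV-Format Log Output" documentation.
    COPY postgres_log FROM '/var/log/postgresql/postgresql.csv' WITH csv;

    -- Then filter out what you don't care about, e.g. catalog queries:
    SELECT log_time, user_name, message
    FROM postgres_log
    WHERE message NOT LIKE '%pg_catalog.%';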

How to see the actual sql statements executed by POSTGRES?

I want to log the actual sql statements executed against a POSTGRES instance. I am aware that I can enable logging of the sql statements. Unfortunately, this doesn't log the actual sql, but rather a parsed version, with certain parameters stripped out and listed separately.
Is there a tool for reliably reconstituting this output into executable sql statements?
Or is there a way of intercepting the sql that is sent to the postgres instance, so that it can be logged?
We want to be able to replay these sql statements against another database.
Thanks for your help!
Actually, PostgreSQL does log exactly the SQL that got executed. It doesn't strip parameters out. Rather, it doesn't interpolate them in, it logs what the application sent, with bind parameters separate. If your app sends insert into x(a,b) values ($1, $2) with bind params 42 and 18, that's what gets logged.
There's no logging option to interpolate bind parameters into the query string.
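For the example above, the log output looks roughly like this (the exact shape depends on log_line_prefix and on the driver using the extended query protocol):

    LOG:  execute <unnamed>: insert into x(a,b) values ($1, $2)
    DETAIL:  parameters: $1 = '42', $2 = '18'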
Your last line is the key part. You don't want logging at all. You're trying to do statement based replication via the logs. This won't work well, if at all, due to volatile functions, the search_path, per-user settings, sequence allocation order/gap issues, and more. If you want replication don't try to do it by log parsing.
If you want to attempt statement-based replication look into PgPool-II. It has a limited ability to do so, with caveats aplenty.
You can do this by setting log_statement to all in postgresql.conf. See the documentation chapter on runtime-config-logging.

How can I audit with the Microsoft SQL Server LDF file?

We need an audit log in the product we are creating. We use SQL Server 2008 R2. I learned that the LDF file keeps a complete log of all transactions that were made*.
I've found ApexSQL Log; this tool analyses the LDF file and provides a GUI. It's a great demonstration of what's possible, but it's expensive. More info: http://www.apexsql.com/sql_tools_log.aspx
Do you know of other programs that can analyse LDF files? Or perhaps other methods to provide audit-trail functionality? I know that it's possible to create triggers, but if it isn't necessary to add things to my database schema, I would rather not do it.
*Only if you select the full recovery model.
How about the Change Data Capture (CDC) functionality in SQL Server 2008 R2? Doesn't that serve your purpose?
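For reference, enabling CDC is a couple of stored-procedure calls; this sketch assumes an edition that supports CDC and uses a hypothetical dbo.Orders table:

    USE MyDatabase;
    EXEC sys.sp_cdc_enable_db;          -- enable CDC at the database level
    EXEC sys.sp_cdc_enable_table
         @source_schema = N'dbo',
         @source_name   = N'Orders',
         @role_name     = NULL;         -- NULL: no gating role for reading change data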
When it comes to the information stored in an LDF file, make sure to form a full log chain. A log chain is a continuous sequence of transaction log backups: it starts with a full database backup, followed by all subsequent log backups up through the auditing point. If the chain becomes broken, only the transactions in the logs up to the last backup before the missing one can be shown with full information (e.g. a schema and object name, or a row history).
Unlike INSERT and DELETE operations, which are fully logged in the LDF file, UPDATE operations are logged minimally: SQL Server doesn't record complete before and after row states, only the incremental change that occurred to the row. For example, if the word "log" was updated to "blog", SQL Server will, in the general case, only log the addition of the letter "b" at index 0. This is enough for its purpose of ensuring ACID, but not enough to easily show the before and after states of the row. So, in order to understand what change really occurred, you have to reconstruct the context of the change from the rest of the transaction log and/or backup and online database data.