Distinguish queries run directly by the user from internal queries in log files - PostgreSQL

When checking the log files, there are a lot of internal/system queries (run internally by the pgAdmin tool, for example) that were not directly executed by the user.
Is there any way to distinguish which queries were run directly by the user and which are internal queries?
(By the way, the config parameters I use for logging are log_statement=all and log_min_duration_statement=0.)

I am not sure what you mean by "internal/system queries". Unless you use auto_explain with auto_explain.log_nested_statements = on, PostgreSQL will only log the queries that get sent by the client.
It may well be that some of these queries are internal queries of your database driver or application server, but PostgreSQL doesn't know that. For that, you'd have to enable logging at a component that has that information.
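One practical way to see which client issued each query is to include the application name in the log line prefix: pgAdmin and psql both set application_name, so their catalog queries become easy to spot. A minimal postgresql.conf sketch (the exact prefix format is just one possibility):

```
# postgresql.conf -- tag every log line with the client's application_name (%a)
log_line_prefix = '%m [%p] %u@%d app=%a '
log_statement = 'all'
log_min_duration_statement = 0
```

Lines issued by pgAdmin will then carry its application name in the prefix, while your own sessions show whatever name the application sets (for example via `SET application_name = 'myapp';`), which makes the log straightforward to filter.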

Related

Logging activity for one user on a specific database in Postgres

I need to log all activity for a specific user on the database. I have set up logging with ALTER ROLE username SET log_statement TO 'all'; and the logging works fine: all queries from that user are logged. The problem is that for this user, queries to Postgres internal schemas (pg_catalog) from clients like psql and pgAdmin are also logged. I have a bunch of lines with SELECT pg_catalog.quote_ident(n.nspname) || '.' || pg_catalog.quote_ident(c.relname).... in the log that are of no use to me. Even worse, these queries span more than one line in the log, so it's not easy to filter them out.
Is it possible to somehow restrict the logging to only one specific database or schema, and not include queries to other schemas like pg_catalog?
I don't know if the standard logging utility in Postgres has that option (my guess is no), but it may be worth a look at the pgaudit external library for Postgres.
The pgaudit module is designed to generate audit logs, but it uses the standard Postgres logging tool. You can tweak several parameters to customize the logs, and it has a specific parameter that I think is perfect for your use case. From the documentation:
pgaudit.log_catalog
Specifies that session logging should be enabled in the case where all
relations in a statement are in pg_catalog. Disabling this setting
will reduce noise in the log from tools like psql and PgAdmin that
query the catalog heavily.
The default is on.
I hope it helps!
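For reference, enabling pgaudit and turning off catalog logging might look like this in postgresql.conf (a sketch; pgaudit must be installed, and the server restarted after changing shared_preload_libraries):

```
# postgresql.conf -- load pgaudit and silence statements that only touch pg_catalog
shared_preload_libraries = 'pgaudit'
pgaudit.log = 'all'          # which statement classes to audit
pgaudit.log_catalog = off    # skip statements where all relations are in pg_catalog
```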
Change your logging format from text to CSV (log_destination=csvlog); you can then import the data into the database and filter out the queries you are not interested in:
Using CSV-Format Log Output
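As a sketch of that workflow: after creating the postgres_log table with the column layout given in the linked documentation, you can load a CSV log file and filter it with ordinary SQL (the file path and filter condition are illustrative):

```sql
-- load one CSV log file into the postgres_log table defined in the docs
COPY postgres_log FROM '/var/lib/postgresql/log/postgresql.csv' WITH csv;

-- then keep only the statements that do not touch the system catalogs
SELECT log_time, user_name, database_name, message
FROM postgres_log
WHERE message NOT LIKE '%pg_catalog%';
```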

JOOQ Cannot get autoCommit to a PostgreSQL database

I have the following setup where a service layer, using jooq, contacts a PostgreSQL database.
In this scenario, whenever multiple requests happen quickly one after another (or even not that quickly), I get the following error message:
Internal error processing createItem: Cannot get autoCommit
My queries all run within transactions (using jooq's transactionResult methods).
Searching has not yielded many results, and I do not see why autoCommit should even be enabled in those cases. Is this most likely a configuration issue, or is there something else I can try to troubleshoot this issue better?
I noticed the same problem and message when running massive batch uploads at the limit of physical memory, with a limited number of DB connections (specific to my environment). It would be hard to provide a reproduction case, but to me this is a sign of DB performance/memory starvation. Reducing the number of Java execution threads helped in my case.

Getting log information from PostgreSQL

My application uses PostgreSQL for storing data. I need to gather information about all user actions (any INSERTs, UPDATEs or DELETEs) executed on the database. This made me wonder whether PostgreSQL provides any default implementations/tables for this. My prior searches haven't turned up anything usable yet, but proper confirmation of my suspicions would be nice. If PostgreSQL truly doesn't provide any default implementation for this, then I will design my own history table.
PostgreSQL supports several methods for logging server messages, including stderr, csvlog and syslog. On Windows, eventlog is also supported. Set this parameter to a list of desired log destinations separated by commas. The default is to log to stderr only. This parameter can only be set in the postgresql.conf file or on the server command line.
See the PostgreSQL documentation on error reporting and logging.
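If you do end up rolling your own history table, a trigger-based sketch might look like this (table and function names are illustrative; EXECUTE FUNCTION requires PostgreSQL 11 or later, older versions use EXECUTE PROCEDURE):

```sql
-- a generic audit table: one row per change, with the row state as JSON
CREATE TABLE audit_log (
    changed_at timestamptz NOT NULL DEFAULT now(),
    table_name text NOT NULL,
    operation  text NOT NULL,   -- 'INSERT', 'UPDATE' or 'DELETE'
    row_data   jsonb
);

CREATE FUNCTION audit_changes() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO audit_log (table_name, operation, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(OLD));
    ELSE
        INSERT INTO audit_log (table_name, operation, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
    END IF;
    RETURN NULL;  -- the return value is ignored for AFTER ROW triggers
END;
$$ LANGUAGE plpgsql;

-- attach the trigger to each table you want audited
CREATE TRIGGER my_table_audit
    AFTER INSERT OR UPDATE OR DELETE ON my_table
    FOR EACH ROW EXECUTE FUNCTION audit_changes();
```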

Does PostgreSQL allow running stored procedures in parallel?

I'm working with an ETL tool, Business Objects Data Services, which has the capability of specifying parallel execution of functions. The documentation says that before you can do this, you have to make sure that your database, which in our case is Postgres, allows "a stored procedure to run in parallel". Can anyone tell me if Postgres does that?
Sure. Just run your queries on different connections, and they will run in parallel transactions. Beware of locking, though.
You can also call different stored procedures from the same connection (and effectively still run them in parallel) by using dblink.
See this SO answer to see an example.
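With dblink, kicking off a long-running function on a second connection while the current session keeps working could be sketched like this (the connection string and function name are illustrative):

```sql
-- open a named side connection and send the call without waiting for it
SELECT dblink_connect('side', 'dbname=mydb');
SELECT dblink_send_query('side', 'SELECT my_long_running_proc()');

-- ... this session is free to do other work in the meantime ...

-- later, collect the result and close the side connection
SELECT * FROM dblink_get_result('side') AS t(result text);
SELECT dblink_disconnect('side');
```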

Is there a way to persist HSQLDB data?

We have all of our unit tests written so that they create and populate tables in HSQLDB. I want the developers who use this to be able to write queries against this HSQLDB database, because 1) by writing queries they can better understand the data model, and those not as familiar with SQL can play with the data before writing the runtime statements, and 2) they don't have access to the test DB, for security reasons. Is there a way to persist the test data so that it can be examined and analyzed with an SQL client?
Right now I am jury-rigging it by switching the data source to a different DB (like DB2/MySQL, then connecting to that DB on my machine so I can play with persistent data), but it would be easier for me if HSQLDB supported persisting this than to explain how to do this to every new developer.
Just to be clear: I need an SQL client to interact with persistent data, so debugging and inspecting memory won't do. This has more to do with initial development than with debugging/maintenance/testing.
If you use an HSQLDB Server instance for your tests, the data will survive the test run.
If the server uses a jdbc:hsqldb:mem:aname (all-in-memory) URL for its database, the data will be available only while the server is running. Alternatively, the server can use a jdbc:hsqldb:file:filepath URL, and the data is persisted to files.
The latest HSQLDB docs explain the different options. Most of the observations also apply to older (1.8.x) versions. However, the latest version 2.0.1 supports starting a server and creating databases dynamically upon the first connection, which can simplify testing a lot.
http://hsqldb.org/doc/2.0/guide/deployment-chapt.html#N13C3D
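A file-backed server setup of the kind described above could be sketched in the server's server.properties file like this (the paths and database name are illustrative):

```
# server.properties -- serve a file-backed database so test data survives restarts
server.database.0=file:data/testdb
server.dbname.0=testdb
```

Developers can then point any JDBC-capable SQL client at jdbc:hsqldb:hsql://localhost/testdb to browse the persisted test data.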