Pgpool prints "unable to parse query" for certain more complex queries - postgresql

I'm using one pgpool server and 2 RDS servers in load-balance master-slave mode.
It's all going well, but I have one problem I can't find a solution to.
I have one query that only does SELECTs, with some joins and so on. However, when I execute this query and then look in the pgpool log, it prints the message: "Unable to parse the query". The query is then executed on backend 0 (the master) with no problem.
The thing is, this is a heavy query and I want it to be load-balanced as well.
The query uses: INNER JOIN, INNER JOIN LATERAL, COUNT(), GROUP BY, COALESCE(), MAX(), EXTRACT(EPOCH FROM ...), NOW().
Searching, I only found similar questions, but no solutions at all. I hope someone here can help. Thanks.

Take a look at LATERAL (table expressions): it is only available from PostgreSQL 9.3 onward. Pgpool uses its own embedded SQL parser, and queries it cannot parse are sent to the master instead of being load-balanced. So if you're running an older pgpool version whose parser predates LATERAL support, that would explain why pgpool can't parse your query; upgrading pgpool to a release that matches your PostgreSQL version should fix it.
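To check whether a given SELECT is actually being load-balanced, pgpool's SHOW pool_nodes command reports a per-backend SELECT counter. A minimal sketch using psycopg2, assuming you connect through pgpool itself (host, port, and credentials are placeholders):

import psycopg2

# Connect through pgpool, not directly to a backend
# (9999 is pgpool's default port; adjust to your setup).
conn = psycopg2.connect(host="pgpool-host", port=9999,
                        dbname="mydb", user="myuser")
cur = conn.cursor()
cur.execute("SHOW pool_nodes")  # pgpool-specific command, intercepted by pgpool
for row in cur.fetchall():
    print(row)  # includes node_id, role and select_cnt per backend
conn.close()

Run the heavy query a few times, re-run SHOW pool_nodes, and compare select_cnt on each backend: if only backend 0 grows, the query is not being balanced.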

Postgres server-side cursor with LEFT JOIN does not return on Heroku PG

I have a Heroku app that uses a psycopg server-side cursor together with a LEFT JOIN query running on Heroku PG 13.5.
The query basically says “fetch items from one table, that don’t appear in another table”.
My data volume is pretty stable, and this has been working well for some time.
This week these queries stopped returning. In pg_stat_activity they appeared as active indefinitely (17+ hours), similarly in heroku pg:ps. There appeared to be no deadlocks. All the Heroku database metrics and logs appeared healthy.
If I run the same queries directly in the console (without a cursor) they return in a few seconds.
I was able to get it working again in the cursor by making the query a bit more efficient (switching from LEFT JOIN to NOT EXISTS; dropping one of the joins).
My questions are:
Why might the original query perform fine in the console, but not return with a psycopg server-side cursor?
How might I debug this?
What might have changed this week to trigger the issue?
I can say that:
However I write the query (LEFT JOIN, Subquery, NOT EXISTS), the query plan involves a Nested Loop Anti Join
I don’t believe this is related to the Heroku outage the following day (which didn’t affect Heroku PG)
Having Googled extensively, the closest thing I can find to a hypothesis to explain this is a post on the PG message boards from 2003 entitled left join in cursor where the response is “Some plan node types don't cope very well with being run backwards.”
Any advice appreciated!
If you are using a cursor, PostgreSQL assumes that only 10% of the query result will be fetched and prefers plans that return the first few rows quickly, at the expense of the total query cost.
You can disable this optimization by setting the PostgreSQL parameter cursor_tuple_fraction to 1.0.
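A minimal sketch of how that looks from psycopg2 with a server-side (named) cursor; the DSN and the query are placeholders:

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # placeholder DSN

with conn.cursor() as cur:
    # Tell the planner the whole result set will be fetched,
    # so it optimizes for total cost rather than fast startup.
    cur.execute("SET cursor_tuple_fraction = 1.0")

# Passing a name makes psycopg2 create a server-side cursor.
with conn.cursor(name="report_cursor") as cur:
    cur.itersize = 2000  # rows fetched per network round trip
    cur.execute("SELECT ...")  # your LEFT JOIN / NOT EXISTS query
    for row in cur:
        pass  # process each row

The SET is per-session, so it affects every cursor subsequently opened on that connection.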

Tableau queries with JOINs that check for NULL are failing in ClickHouse

I am running Tableau connected to ClickHouse via the ODBC driver. At first almost any report request was failing. I configured this TDC file https://github.com/yandex/clickhouse-odbc/blob/clickhouse-tbc/clickhouse.tdc and it actually started to work. However, now some query requests with JOINs that check for NULL in the ON clause are failing, because they use IS NULL instead of isNull(id):
JOIN users ON ((users.user_id = t0.user_id) OR ((users.user_id IS NULL) AND (t0.user_id IS NULL)))
This is the correct way that works:
JOIN users ON ((users.user_id = t0.user_id) OR ((isNull(users.user_id) = 1) AND (isNull(t0.user_id) = 1)))
How can I make the Tableau driver send the right request?
Here are a few suggestions:
This post on the Tableau Community looks like it has similar symptoms to the ones you describe. The suggested resolution is to wrap all fields like IfNull([Dimension], ""), thereby reducing the need, apparently, for ClickHouse to do the NULL check at all.
The TDC file from Github looks pretty complete, but they might not have taken joins into consideration. The GitHub commit states that the tdc is "untested." I would message the creator of that TDC and see if they've done any work around joins and if they have any suggestions.
Here is a list of possible ODBC customizations that can be added to or removed from your TDC file. Finding the right combination may take some experimentation, but they're well worth researching as a possible solution.
Create an extract before performing complex analysis. If you're able to connect initially, then it should be possible to bring all the data from Clickhouse into an extract.
Custom SQL would probably alleviate any join syntax issue, because the query and any joins are written purely by you. After making the initial connection to ClickHouse, instead of choosing a table, select "Custom ODBC" and write a query that returns the joined tables of your choosing (see the sketch after this list).
Finally, the Tableau Ideas Forum is a place to ask for and/or vote on upcoming connectors. I can see there is already an idea in place for ClickHouse. Feel free to vote it up.
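To illustrate the Custom SQL route, here is a hedged sketch that verifies the isNull()-style join directly against ClickHouse using the clickhouse-driver package before wiring the same text into Tableau (the host, tables, and columns are assumptions):

from clickhouse_driver import Client

client = Client(host="clickhouse-host")  # assumed host

# ClickHouse wants isNull(x) where ANSI SQL would say x IS NULL.
sql = """
SELECT count()
FROM t0
JOIN users
    ON (users.user_id = t0.user_id)
    OR ((isNull(users.user_id) = 1) AND (isNull(t0.user_id) = 1))
"""
print(client.execute(sql))

If this runs cleanly against ClickHouse, pasting the same statement into Tableau's Custom SQL should bypass the driver's IS NULL generation.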
If you can make sure not to have any NULL values in the data, you can also use this proxy that I wrote for this exact problem.
https://github.com/kfzteile24/clickhouse-proxy
It kinda worked, for most cases, but it's not bullet-proof.

In DBeaver, how can I run an SQL union query from two different connections..?

We recently migrated a large DB2 database to a new server. It got trimmed a lot in the migration; for instance, 10 years of data were chopped down to 3. But now I find that I need certain data from the old server until after tax season.
How can I run a UNION query in DBeaver that pulls data from two different connections..? What's the proper syntax for the table identifiers in the FROM and JOIN clauses..?
I use DBeaver for my regular SQL work, and I cannot determine how to span a UNION query across two different connections. However, I also use Microsoft Access, and I easily did it there with two Pass-Through queries that are fed to a native Microsoft Access union query.
But how to do it in DBeaver..? I can't understand how to use two connections at the same time.
For instance, my connections are named ASP7 (the new server) and OLD (the old one), and I need something like this...
SELECT *
FROM ASP7.F_CERTOB.LDHIST
UNION
SELECT *
FROM OLD.VIPDTAB.LDHIST
...but I get the following error, to which I say "No kidding! That's what I want!", lol... =-)
SQL Error [56023]: [SQL0512] Statement references objects in multiple databases.
How can this be done..?
This is not a feature of DBeaver. DBeaver can only access the data that the DB gives it, and this is restricted to a single connection at a time (save for import/export operations). This feature is being considered for development, so keep an eye out for this answer to be outdated sometime in 2019.
You can export data from your OLD database and import it into ASP7 using DBeaver (although vendor tools are typically more efficient for this). Then you can do your union as suggested.
Many RDBMS offer a way to logically access foreign databases as if they were local, in which case DBeaver would then be able to access the data from the OLD database (as far as DBeaver is concerned in this situation, all the data is coming from a single connection). In Postgres, for example, one can use a foreign data wrapper to access foreign data.
I'm not familiar with DB2, but a quick Google search suggests that you can set up foreign connections within DB2 using nicknames or three-part-names.
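If federation isn't available to you, a hedged client-side workaround in the spirit of the Access pass-through approach from the question is to run each half on its own connection and combine the rows in a small script (the pyodbc DSN names are assumptions):

import pyodbc

# One connection per server; the DSNs are assumptions, use your own.
new_conn = pyodbc.connect("DSN=ASP7")
old_conn = pyodbc.connect("DSN=OLD")

rows = []
rows += new_conn.cursor().execute("SELECT * FROM F_CERTOB.LDHIST").fetchall()
rows += old_conn.cursor().execute("SELECT * FROM VIPDTAB.LDHIST").fetchall()

# UNION (as opposed to UNION ALL) removes duplicates, so mirror that here.
unique_rows = {tuple(r) for r in rows}
print(len(unique_rows))

This obviously pulls everything over the wire, so it only makes sense for result sets that fit in memory.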
If you check this GitHub issue:
https://github.com/dbeaver/dbeaver/issues/3605
the way to solve this is to create a task and execute it against different connections:
https://github.com/dbeaver/dbeaver/issues/3605#issuecomment-590405154

Deal with PostgreSQL error "canceling statement due to conflict with recovery" in psycopg2

I'm creating a reporting engine that runs a couple of long queries against a standby server and processes the results with pandas. Everything works fine, but sometimes I have issues executing those queries with a psycopg2 cursor: the query is cancelled with the following message:
ERROR: cancelling statement due to conflict with recovery
Detail: User query might have needed to see row versions that must be removed
I was investigating this issue
PostgreSQL ERROR: canceling statement due to conflict with recovery
https://www.postgresql.org/docs/9.0/static/hot-standby.html#HOT-STANDBY-CONFLICT
but all the solutions suggest fixing the issue by modifying the server's configuration. I can't make those modifications (we won the last football game against the IT guys :) ), so I want to know how I can deal with this situation as a developer. Can I resolve this issue in Python code? My temporary solution is simple: catch the exception and retry all the failed queries. Maybe it could be done better (I hope so).
Thanks in advance
There is nothing you can do to avoid that error without changing the PostgreSQL configuration (from PostgreSQL 9.1 on, you could e.g. set hot_standby_feedback to on).
You are dealing with the error in the correct fashion – simply retry the failed transaction.
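A minimal retry sketch with psycopg2, assuming psycopg2 2.8+ (which exposes the error class; on older versions you can check the exception's pgcode against '40001' instead). The query and backoff values are placeholders:

import time
import psycopg2
from psycopg2 import errors

def run_with_retry(conn, sql, max_attempts=5):
    # Retry a query cancelled by a recovery conflict on the standby.
    for attempt in range(max_attempts):
        try:
            with conn.cursor() as cur:
                cur.execute(sql)
                return cur.fetchall()
        except errors.SerializationFailure:
            # "canceling statement due to conflict with recovery" is
            # reported under SQLSTATE 40001 (serialization_failure).
            conn.rollback()  # reset the aborted transaction
            time.sleep(2 ** attempt)  # simple exponential backoff
    raise RuntimeError(f"query failed after {max_attempts} attempts")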
The table data on the hot standby slave server is modified while your long-running query is running. A solution (PostgreSQL 9.1+) to make sure the table data is not modified is to suspend replication on the slave and resume it after the query:
select pg_xlog_replay_pause(); -- suspend
select * from foo; -- your query
select pg_xlog_replay_resume(); -- resume
(On PostgreSQL 10 and later these functions are named pg_wal_replay_pause() and pg_wal_replay_resume().)
I recently encountered a similar error and was also in the position of not being a dba/devops person with access to the underlying database settings.
My solution was to reduce the execution time of the query wherever possible. Obviously this requires deep knowledge of your tables and data, but I was able to solve my problem with a combination of a more efficient WHERE filter, a GROUP BY aggregation, and more extensive use of indexes.
By reducing the amount of server side execute time and data, you reduce the chance of a rollback error occurring.
However, a rollback can still occur during your shortened window, so a comprehensive solution would also make use of some retry logic for when a rollback error occurs.
Update: A colleague implemented said retry logic as well as batching the query to make the data volumes smaller. These three solutions have made the problem go away entirely.
I got the same error. What you CAN do (if the query is simple enough) is divide the data into smaller chunks as a workaround.
I did this within a Python loop that calls the query multiple times with the LIMIT and OFFSET parameters, like:
query_chunk = f"""
SELECT *
FROM {database}.{datatable}
ORDER BY id  -- stable sort key (column name is an assumption) so chunks don't overlap
LIMIT {chunk_size} OFFSET {i_chunk * chunk_size}
"""
where database and datatable are the names of your source schema and table.
The chunk_size is up to your case; keeping it reasonably small is crucial for each chunk of the query to finish.
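For completeness, a hedged sketch of the surrounding loop; the DSN, table, and sort column are assumptions:

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # placeholder DSN
chunk_size = 10000
i_chunk = 0
all_rows = []
while True:
    query_chunk = f"""
        SELECT * FROM my_schema.my_table
        ORDER BY id
        LIMIT {chunk_size} OFFSET {i_chunk * chunk_size}
    """
    with conn.cursor() as cur:
        cur.execute(query_chunk)
        rows = cur.fetchall()
    if not rows:
        break  # past the last chunk
    all_rows.extend(rows)
    i_chunk += 1

Each chunk is a short statement, so every individual query gets a fresh chance to finish before a recovery conflict cancels it.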

Phoenix JDBC query - Joins do not work

Hello Phoenix team and friends working on Phoenix/HBase,
I am connecting to the Phoenix layer on HBase using the JDBC driver. My PreparedStatement with a simple SELECT query executes fine in my Java program. However, when I use any SQL join (LEFT or INNER), executing the PreparedStatement gives the exception below, even though I limit my results to 1 or 5 records.
java.sql.SQLException: Encountered exception in sub plan [0] execution.
However, the same queries (simple or with joins) work well in the Phoenix client.
Did anyone face this issue?
Please share if you know of any fix or workaround.
Best regards,
Nandu
Please use the hint /*+ NO_STAR_JOIN */ to execute your query; it goes immediately after the SELECT keyword. There are some more hints that can help you fine-tune your query based on the nature of the operation you want to perform. Please refer to the hints at this link: https://phoenix.apache.org/language/index.html
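For illustration, a hedged sketch of where the hint sits in the statement, here using the phoenixdb Python package against the Phoenix Query Server (the URL, tables, and columns are assumptions; in Java JDBC the SQL string is identical):

import phoenixdb

# phoenixdb talks to the Phoenix Query Server, not the thick JDBC driver.
conn = phoenixdb.connect("http://phoenix-query-server:8765/", autocommit=True)
cur = conn.cursor()
# The hint comment must appear immediately after the SELECT keyword.
cur.execute("""
    SELECT /*+ NO_STAR_JOIN */ o.id, c.name
    FROM orders o
    INNER JOIN customers c ON c.id = o.customer_id
    LIMIT 5
""")
print(cur.fetchall())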