Cannot write to PPAS log file - postgresql-performance

Now I am testing queries from an application to PPAS via PgPool.
When I run a query like
select a.name, b.name from tb_01 a, tb_02 b where a.id = b.id_ref
or an update/insert, I have to wait a long time (30 to 60 seconds).
I checked my PPAS log file, but I cannot find any information about this query.
Example:
Run the query (*)
It completes after waiting for 30 seconds.
View the log file (no info about it)
I tested with another query, and that one is fine: I can find its info in the log file.
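One thing worth checking is whether the server's logging thresholds are filtering the statement out. A minimal sketch of inspecting and adjusting them (these are standard PostgreSQL/PPAS parameter names; the 1-second threshold is only illustrative):
-- Show the current logging settings:
SHOW log_statement;                   -- 'none' suppresses statement text
SHOW log_min_duration_statement;      -- -1 disables duration-based logging
-- Log every statement that runs longer than 1 second, then reload the config:
ALTER SYSTEM SET log_min_duration_statement = '1s';
SELECT pg_reload_conf();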

How can I select only data within a specific window in KSQL?

I have a table with a tumbling window, e.g.
CREATE TABLE total_transactions_per_1_days AS
  SELECT
    sender,
    count(*) AS count,
    sum(amount) AS total_amount,
    histogram(recipient) AS recipients
  FROM completed_transactions
  WINDOW TUMBLING (SIZE 1 DAYS)
  GROUP BY sender;
Now I need to select only the data from the current window, i.e. windowstart <= current time and windowend >= current time. Is that possible? I could not find any example.
That depends on what you mean by 'select data' ;)
ksqlDB supports two main query types (see https://docs.ksqldb.io/en/latest/concepts/queries/).
If what you want is a pull query, i.e. a traditional SQL query that pulls back the current window as a one-time result, then it may be possible, though pull queries are a recent feature and their functionality is still limited. As of version 0.10 you can only look up a known key. For example, if sender is the key of the table, you could run a query like:
SELECT * FROM total_transactions_per_1_days
WHERE sender = some_value
AND WindowStart <= UNIX_TIMESTAMP()
AND WindowEnd >= UNIX_TIMESTAMP();
This requires the table to have processed data with a timestamp close to the current wall-clock time for it to pull anything back; if the system were lagging, or if you were processing historic or delayed data, this would not work.
Note: the above query will work on ksqlDB v0.10. Your success on older versions may vary.
There are plans to extend the functionality of pull queries, so keep an eye out for updates to ksqlDB.
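For completeness, the other main query type is a push query, which streams every change to the table rather than returning a one-time result; it is not restricted to the current window. A minimal sketch:
-- Push query: continuously emits an update for every window/sender
-- as new transactions are processed.
SELECT * FROM total_transactions_per_1_days EMIT CHANGES;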

Redshift Compile Time For First Time Run Queries

I am struggling with the performance of my dashboard, which runs queries on Redshift using the JDBC driver.
The query looks like this:
select <ALIAS_TO_SCHEMA.TABLENAME>.<ANOTHER_COLUMN_NAME> as col_0_0_,
sum(<ALIAS_TO_SCHEMA.TABLENAME>.devicecount) as col_1_0_ from <table_schema>.<table_name> <ALIAS_TO_SCHEMA.TABLENAME> where <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$1
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$2
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$3
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$4
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$5
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$6
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$7
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$8
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$9
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$10
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$11
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$12
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$13
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$14
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$15
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$16
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$17
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$18
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$19
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$20
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$21
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$22
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$23
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$24
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$25
or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$26
or ........
For the dashboard we use Spring and Hibernate (I am not 100% sure about that, though).
The query can sometimes stretch to $1000+ parameters, depending on the filters/options selected in the UI.
The problem we are seeing is that the first time this query is run by the reports, it takes 40 to 60 seconds to respond. After the first time, the query runs quite fast and takes only a few seconds.
We initially suspected something was wrong with Redshift caching, but it turns out that even simple (but huge) queries like this take considerable time to COMPILE, which is clear when we look into the svl_compile table: it shows this query took over 35 seconds to compile.
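For reference, the compile cost of a single query can be checked with something like this against svl_compile (the query id is a placeholder):
-- Sum compile time across all segments of one query and count how many
-- segments actually had to be compiled (compile = 1 means not reused).
SELECT query,
       SUM(DATEDIFF(ms, starttime, endtime)) AS compile_ms,
       SUM(compile) AS segments_compiled
FROM svl_compile
WHERE query = 12345   -- placeholder query id
GROUP BY query;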
What should I do to handle such issues?
I recommend restructuring the query generated by your dashboard to use an IN list. Redshift should be able to reuse the already compiled query segments for IN lists of different lengths.
Note that IN lists with fewer than 10 values will still be evaluated as OR conditions: https://docs.aws.amazon.com/redshift/latest/dg/r_in_condition.html#r_in_condition-optimization-for-large-in-lists
SELECT <ALIAS_TO_SCHEMA.TABLENAME>.<ANOTHER_COLUMN_NAME> as col_0_0_
, SUM(<ALIAS_TO_SCHEMA.TABLENAME>.devicecount) AS col_1_0_
FROM <table_schema>.<table_name> <ALIAS_TO_SCHEMA.TABLENAME>
WHERE <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME> IN ( $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11 … $1000 )
;

How to skip showing results in HIVE Command Line?

I have executed a query in the Hive CLI that should generate around 11,000,000 rows; I know the result because I have executed the same query in MS SQL Server Management Studio too.
The problem is that in the Hive CLI the rows keep printing on and on (right now it has been more than 12 hours since I started the execution), and all I want to know is the processing time, which is shown only after the results.
So I have two questions:
How do I skip showing the row results in the Hive command line?
If I execute the query in Beeswax, how do I see statistics like execution time, similar to SET STATISTICS TIME ON in T-SQL?
You can check it using the link given in the log, but it won't tell you how much processing is left.
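Separately, one way to avoid printing the rows at all is to send the result somewhere other than the console, so the CLI output is reduced to job progress and timing. A minimal sketch, assuming an HDFS directory you can write to:
-- Write the result set to HDFS instead of echoing ~11 million rows to the console;
-- the CLI still prints the MapReduce progress and the total time taken.
INSERT OVERWRITE DIRECTORY '/tmp/query_output'
SELECT ...;   -- the original query goes here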

Get unretrieved rows only in a DB2 select

I have a BPM application that polls some rows from a DB2 database every 5 minutes with a scheduler R1, using the query below:
select * from Table where STATUS = 'New'
Based on the rows returned, I do some processing and then change the status of these rows to 'Read'.
But this processing takes more than 5 minutes, and in the meantime scheduler R1 runs again and picks up some of the cases already picked up in the last run.
How can I ensure that every scheduler run picks up only the rows that were not selected in the last run? What changes do I need to make to my select statement? Please help.
How can I ensure that every scheduler run picks up only the rows that were not selected in the last run?
You will need to make every scheduler aware of what was selected by other schedulers. You can do this, for example, by locking the selected rows (SELECT ... FOR UPDATE). Of course, you will then need to handle lock timeouts.
Another option, allowing for better concurrency, would be to update the record status before processing the records. You might introduce an intermediary status, something like 'In progress', and include the status in the query condition.
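A minimal sketch of that second approach, with my_table standing in for the table from the question and 'In progress' as the illustrative intermediary status:
-- Claim the unprocessed rows first, so a concurrent scheduler run
-- no longer sees them as 'New':
UPDATE my_table SET STATUS = 'In progress' WHERE STATUS = 'New';
COMMIT;

-- Process only the rows claimed above:
SELECT * FROM my_table WHERE STATUS = 'In progress';

-- Mark them as done once processing finishes:
UPDATE my_table SET STATUS = 'Read' WHERE STATUS = 'In progress';
COMMIT;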

Can I interrupt a "Count Rows" operation in SQL Developer?

When I execute a query and right click in the results area, I get a pop-up menu with the following options:
Save Grid as Report ...
Single Record View ...
Count Rows ...
Find/Highlight ...
Export ...
If I select "Count Rows", is there a way to interrupt the operation if it starts taking too long?
No, you don't seem to be able to.
When you select Count Rows from the context menu, it runs the count on the main UI thread, hanging the whole UI, potentially for minutes or hours.
It's best not to use that feature; instead, run select count(*) from ( <your query here> ) yourself, which executes as a normal statement on a separate thread and can be cancelled.
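For example, in a worksheet (the inner query is only an illustration):
-- Runs as an ordinary statement that can be cancelled,
-- unlike the Count Rows context-menu action:
SELECT COUNT(*) FROM (
    SELECT * FROM some_table WHERE some_column = 'x'   -- your original query here
);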
You can open a new instance of SQL Developer and kill the session that is counting the rows.
I do suggest using the select count(*) approach, though, as it is less painful in the long run.