How to access partition metadata in Aurora PostgreSQL

I am trying to execute the following query to get the number of rows in the table:
SELECT SUM (row_count)
FROM sys.partitions
WHERE object_id=OBJECT_ID('Transactions')
AND (index_id=0 or index_id=1);
But I get an error saying that relation "sys.partitions" does not exist. Any suggestions on how to get the partitions available in the system?
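For reference, sys.partitions and OBJECT_ID() are SQL Server catalog objects, which is why PostgreSQL reports that the relation does not exist. Below is a minimal sketch of one PostgreSQL-side alternative. It assumes a psycopg2-style DB-API connection (the `%s` placeholder is part of that assumption) and a table that uses native partitioning. It lists the direct partitions from pg_inherits and reads the planner's row estimate from pg_class.reltuples. The function names are hypothetical, and the estimate is only as fresh as the last ANALYZE; an exact figure still needs SELECT count(*) on the table.

```python
# Direct partitions of a parent table, with the planner's row estimate for each.
# Pass '"Transactions"' (with double quotes) if the table name is case-sensitive.
PARTITION_SQL = """
SELECT c.oid::regclass::text AS partition_name,
       c.reltuples::bigint   AS estimated_rows
FROM pg_inherits AS i
JOIN pg_class AS c ON c.oid = i.inhrelid
WHERE i.inhparent = %s::regclass
ORDER BY partition_name
"""


def list_partitions(conn, table_name):
    """Return [(partition_name, estimated_rows), ...] for the direct partitions."""
    cur = conn.cursor()
    try:
        cur.execute(PARTITION_SQL, (table_name,))
        return cur.fetchall()
    finally:
        cur.close()


def estimated_total_rows(partitions):
    """Sum the estimates; never-analyzed partitions can report -1, counted as 0."""
    return sum(max(rows, 0) for _, rows in partitions)
```

Usage would look like `estimated_total_rows(list_partitions(conn, "transactions"))`, where `conn` is an open connection to the Aurora PostgreSQL instance. If the table is not partitioned at all, the list is empty and pg_class.reltuples of the table itself is the estimate to read.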

Related

Using redshift system views with select pivot

I'd like to pivot the values in the system views svv_*_privileges to show the privileges assigned to security principals by object. (The complete solution will need to union the results for all objects and pivot on all privileges.)
As an example, for the default privileges:
select * from
(select object_type, grantee_name, grantee_type, privilege_type, 1 as is_priv from pg_catalog.svv_default_privileges where grantee_name = 'abc' and grantee_type = 'role')
pivot (max(is_priv) for privilege_type in ('EXECUTE', 'INSERT', 'SELECT', 'UPDATE', 'DELETE', 'RULE', 'REFERENCES', 'TRIGGER', 'DROP') );
This gives an error (only valid on the leader node?):
[Amazon](500310) Invalid operation: Query unsupported due to an internal error.
Then I thought of trying a temp table, so that the pivot would run on a regular Redshift table:
select * into temp schema_default_priv from pg_catalog.svv_default_privileges where grantee_name = 'abc' and grantee_type = 'role'
... same error as above :-(
Is there a way I can work with SQL on the system tables to accomplish this in Redshift SQL?
While I can do the pivot in Python, why should I? It's supposedly a SQL database!
On rereading your question, the issue became clear. You are using a leader-node-only system table and trying to apply compute-node data and/or functions to it. This path of data flow is not supported on Redshift. I do have some questions as to which part of the query requires compute-node work, but that isn't crucial and digging into it would take time.
If you need to get leader-node data to the compute nodes, there are a few ways and none of them are trivial. I find that the best method to move the needed data is to use a cursor. This previous answer outlines how to do this:
How to join System tables or Information Schema tables with User defined tables in Redshift
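If the pivot ends up on the client side after all, as the question mentions, it is only a few lines. The sketch below is not the cursor approach from the linked answer. It assumes the rows of svv_default_privileges have already been fetched as (object_type, grantee_name, grantee_type, privilege_type) tuples, and it mirrors the max(is_priv) pivot from the question: 1 where the privilege is present, None where it is not. The function name is hypothetical.

```python
# Privilege columns taken from the pivot list in the question.
PRIVILEGES = (
    "EXECUTE", "INSERT", "SELECT", "UPDATE", "DELETE",
    "RULE", "REFERENCES", "TRIGGER", "DROP",
)


def pivot_privileges(rows, privileges=PRIVILEGES):
    """Pivot (object_type, grantee_name, grantee_type, privilege_type) rows.

    Returns a dict keyed by (object_type, grantee_name, grantee_type) whose
    values map each privilege name to 1 (granted) or None (not granted).
    """
    pivoted = {}
    for object_type, grantee_name, grantee_type, privilege_type in rows:
        key = (object_type, grantee_name, grantee_type)
        flags = pivoted.setdefault(key, {p: None for p in privileges})
        if privilege_type in flags:  # privileges outside the list are ignored
            flags[privilege_type] = 1
    return pivoted
```

For example, two rows granting SELECT and INSERT on tables to role abc collapse into a single entry with those two flags set to 1 and the other seven left as None.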

What should I do to get pull query results? (Failed to scan materialized table)

To summarize first: when I send a pull query, I get the error message below.
Unable to execute pull query
Caused by: io.confluent.ksql.util.KsqlException: Error executing query locally
at node http://our.host.com:8089/: Failed to scan
materialized table
Caused by: Error executing query locally at node
http://our.host.com:8089/: Failed to scan materialized
table
Caused by: Failed to scan materialized table
Caused by: Cannot get state store Aggregate-Aggregate-Materialize because the
stream thread is PARTITIONS_ASSIGNED, not RUNNING
The ksqlDB server with the same service.id has been deployed on three servers. (Confluent Platform Community Edition 7.0.0)
I created a stream with 9 partitions as a data source, and then a table that uses that stream as its data source.
I will attach the relevant query below.
SET 'auto.offset.reset' = 'earliest';
CREATE STREAM IF NOT EXISTS NEW_STREAM (A BIGINT, B BIGINT, C VARCHAR, D VARCHAR)
WITH (kafka_topic='exist.topic',
key_format='KAFKA',
value_format='JSON',
partitions=9);
CREATE TABLE IF NOT EXISTS NEW_STREAM_TABLE WITH (KAFKA_TOPIC='NEW_STREAM_TABLE', KEY_FORMAT='json', PARTITIONS=3, REPLICAS=1, VALUE_FORMAT='json') AS
SELECT
A A_KEY,
B B_KEY,
C C_KEY,
AS_VALUE(A) A,
AS_VALUE(B) B,
AS_VALUE(C) C,
COUNT(*) COUNT
FROM NEW_STREAM WINDOW TUMBLING (SIZE 30 MINUTES)
GROUP BY A, B, C
EMIT CHANGES;
Pull query:
ksql> select * from NEW_STREAM_TABLE;
+--------------------------+--------------------------+--------------------------+--------------------------+--------------------------+--------------------------+--------------------------+--------------------------+--------------------------+
|A_KEY |B_KEY |C_KEY |WINDOWSTART |WINDOWEND |A |B |C |COUNT |
+--------------------------+--------------------------+--------------------------+--------------------------+--------------------------+--------------------------+--------------------------+--------------------------+--------------------------+
Unable to execute pull query
Caused by: io.confluent.ksql.util.KsqlException: Error executing query locally
at node http://our.host.com:8089/: Failed to scan
materialized table
Caused by: Error executing query locally at node
http://our.host.com:8089/: Failed to scan materialized
table
Caused by: Failed to scan materialized table
Caused by: Cannot get state store Aggregate-Aggregate-Materialize because the
stream thread is PARTITIONS_ASSIGNED, not RUNNING
What should I do to get pull query results?
ksqlDB is built on Kafka Streams.
It seems that, because a consumer or producer subscribed to or unsubscribed from a particular topic, the stream is "rebalancing", that is, redistributing partitions. This may be the cause of the final error.
Sometimes a stream or consumer can get stuck in the rebalancing stage and never become available to pull from. I have experienced this issue myself and am currently trying to increase the rebalancing delay of the broker as described here: https://medium.com/bakdata/solving-my-weird-kafka-rebalancing-problems-c05e99535435
Hope that this is helpful, even though it is only my best guess as to what your issue may be.
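If the rebalance turns out to be transient, one client-side workaround is to retry the pull query after a short delay. This is only a sketch under that assumption: `run_pull_query` is a hypothetical callable that sends the pull query and raises an exception carrying the server's error text, and matching on the "not RUNNING" text from the error above is a guess at a usable signal, not a ksqlDB API. It does nothing for a stream thread that is stuck permanently.

```python
import time


def retry_pull_query(run_pull_query, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call run_pull_query() until it succeeds or the attempts are used up.

    Only errors whose message contains "not RUNNING" (the state-store error
    seen during a rebalance) are retried; anything else is raised immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return run_pull_query()
        except Exception as exc:
            if "not RUNNING" not in str(exc):
                raise
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay_seconds)  # wait before the next attempt
    raise last_error
```

The `sleep` parameter exists only so the delay can be replaced in tests; in normal use the default `time.sleep` applies.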

Can't we join two tables and fetch data in Kafka?

I have joined two tables and fetched data using Postgres source connector. But every time it gave the same issue i.e.
I have run the same query in Postgres and it runs without any issue. Is fetching data by joining tables not possible in Kafka?
I solved this issue by wrapping the join in a subquery. The problem was that when I used an alias, the aliased column reference was interpreted as a whole as a single column name, which caused the error. Here is my query:
select * from (select p.\"Id\" as \"p_Id\", p.\"CreatedDate\" as p_createddate, p.\"ClassId\" as p_classid, c.\"Id\" as c_id, c.\"CreatedDate\" as c_createddate, c.\"ClassId\" as c_classid from \"PolicyIssuance\" p left join \"ClassDetails\" c on p.\"DocumentNumber\" = c.\"DocumentNo\") as result
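The backslash-escaped quotes in the query suggest it was pasted from a JSON connector configuration. As a hedged sketch of how such a configuration could be assembled, the snippet below assumes the Confluent JDBC source connector (the answer only says "Postgres source connector") and bulk mode; the connector name, connection URL and topic are hypothetical placeholders. Building the payload with `json.dumps` produces the `\"` escaping automatically.

```python
import json

# The join wrapped in a subquery, so every alias is a plain column of "result".
QUERY = (
    'select * from ('
    'select p."Id" as "p_Id", p."CreatedDate" as p_createddate, '
    'p."ClassId" as p_classid, c."Id" as c_id, '
    'c."CreatedDate" as c_createddate, c."ClassId" as c_classid '
    'from "PolicyIssuance" p '
    'left join "ClassDetails" c on p."DocumentNumber" = c."DocumentNo"'
    ') as result'
)


def build_connector_config(name, connection_url, topic):
    """Return a Kafka Connect payload for a query-based JDBC source (assumed setup)."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url": connection_url,
            "mode": "bulk",          # assumption: the source does not state the mode
            "query": QUERY,
            "topic.prefix": topic,   # with "query" set, this is used as the topic name
        },
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    config = build_connector_config(
        "policy-join-source", "jdbc:postgresql://localhost:5432/mydb", "policy_join"
    )
    print(json.dumps(config, indent=2))
```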

How to disable one query in Postgres for a specific (partitioned) table

SELECT COUNT(*) as count FROM "public"."views";
The TablePlus client executes this query every time I open a partitioned table with many partitions, and it takes too long to run.
How can I disable the execution of this query on this table?

Retrieve Redshift error messages

I'm running queries on a Redshift cluster using DataGrip that take upwards of 10 hours to run and unfortunately these often fail. Alas, DataGrip doesn't maintain a connection to the database long enough for me to get to see the error message with which the queries fail.
Is there a way of retrieving these error messages later, e.g. using internal Redshift tables? Alternatively, is there a way to make DataGrip maintain the connection for long enough?
Yes, you can!
Query the stl_connection_log table to find the pid: look at the recordtime column for the time your connection was initiated; the dbname, username and duration columns also help to narrow it down.
select * from stl_connection_log order by recordtime desc limit 100
Once you have found the pid, you can query the stl_query table to check whether you are looking at the right query.
select * from stl_query where pid='XXXX' limit 100
Then, check the stl_error table for your pid. This will tell you the error you are looking for.
select * from stl_error where pid='XXXX' limit 100
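The last two lookups can be bundled into one helper once the pid is known. This is a minimal sketch, assuming a DB-API cursor on the Redshift cluster with `%s` placeholders (as in psycopg2); the function name is hypothetical, and the selected columns are a subset of those in stl_query and stl_error.

```python
# Queries issued by the session, newest first; "aborted" is 1 for aborted queries.
QUERY_SQL = (
    "select query, starttime, endtime, aborted, querytxt "
    "from stl_query where pid = %s order by starttime desc limit 100"
)

# Errors logged for the same session.
ERROR_SQL = (
    "select recordtime, errcode, context, error "
    "from stl_error where pid = %s order by recordtime desc limit 100"
)


def fetch_failure_details(cursor, pid):
    """Return the queries and logged errors for one session pid."""
    cursor.execute(QUERY_SQL, (pid,))
    queries = cursor.fetchall()
    cursor.execute(ERROR_SQL, (pid,))
    errors = cursor.fetchall()
    return {"queries": queries, "errors": errors}
```

The STL tables keep only a limited history, so the lookup has to happen reasonably soon after the failure.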
If I’ve made a bad assumption please comment and I’ll refocus my answer.