Redshift pg_user table throws Invalid Operation error on JOIN

When I run the following query:
select * from stl_query q
join pg_user u on q.userid = u.usesysid
where u.usename = 'admin';
I get the following error:
SQL Error [500310] [0A000]: [Amazon](500310) Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables.;
The query is run on the leader node. What am I doing wrong?

pg_user is a leader-node-only catalog table, and this query mixes it with stl_query, which is not leader-node-only.
From the documentation:
Some Amazon Redshift SQL functions are supported only on the leader
node and are not supported on the compute nodes. A query that uses a
leader-node function must execute exclusively on the leader node, not
on the compute nodes, or it will return an error.
The documentation for each leader-node only function includes a note
stating that the function will return an error if it references
user-defined tables or Amazon Redshift system tables.
Source: http://docs.aws.amazon.com/redshift/latest/dg/c_SQL_functions_leader_node_only.html
As a workaround in your scenario, you can create a temp table holding the subset of pg_user data that you need, then join to that temp table.
select usesysid
into temp table tmp_user
from pg_user
where usename = 'admin';
select * from stl_query q
inner join tmp_user u on q.userid = u.usesysid;

Related

Using redshift system views with select pivot

I'd like to pivot the values in the system views svv_*_privileges to show the privileges assigned to security principals, by object. (The complete solution will need to union the results for all object types and pivot on all privileges.)
As an example, for the default privileges:
select * from
(select object_type, grantee_name, grantee_type, privilege_type, 1 as is_priv from pg_catalog.svv_default_privileges where grantee_name = 'abc' and grantee_type = 'role')
pivot (max(is_priv) for privilege_type in ('EXECUTE', 'INSERT', 'SELECT', 'UPDATE', 'DELETE', 'RULE', 'REFERENCES', 'TRIGGER', 'DROP') );
This gives an error (is this only valid on the leader node?):
[Amazon](500310) Invalid operation: Query unsupported due to an internal error.
Then I thought of trying a temp table, so that the pivot would run on a regular Redshift table:
select * into temp schema_default_priv from pg_catalog.svv_default_privileges where grantee_name = 'abc' and grantee_type = 'role'
... same error as above :-(
Is there a way I can work with SQL on the system tables to accomplish this in Redshift SQL?
While I can do the pivot in Python, why should I? It's supposedly a SQL database!

On rereading your question, the issue became clear. You are using a leader-node-only system table and trying to apply compute-node data and/or functions to it. This path of data flow is not supported on Redshift. I do have some questions about which operation requires compute-node work, but that isn't crucial and digging into it would take time.
If you need to get leader-node data to the compute nodes, there are a few ways, and none of them is trivial. I find that the best way to move the needed data is to use a cursor. This previous answer outlines how to do it:
How to join System tables or Information Schema tables with User defined tables in Redshift
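For comparison, the client-side pivot that the question mentions can be sketched in a few lines of Python. This is only an illustration, assuming the rows have already been fetched from svv_default_privileges as (object_type, grantee_name, grantee_type, privilege_type) tuples. The function name pivot_privileges and the sample rows are hypothetical.

```python
from collections import defaultdict

# Privilege names taken from the IN list of the question's PIVOT clause.
PRIVILEGES = ["EXECUTE", "INSERT", "SELECT", "UPDATE", "DELETE",
              "RULE", "REFERENCES", "TRIGGER", "DROP"]


def pivot_privileges(rows, privileges=PRIVILEGES):
    """Pivot (object_type, grantee_name, grantee_type, privilege_type) rows.

    Returns one dict per (object_type, grantee_name, grantee_type) key with a
    1/None flag per privilege, mirroring max(is_priv) in the SQL PIVOT.
    """
    granted = defaultdict(set)
    for object_type, grantee_name, grantee_type, privilege_type in rows:
        key = (object_type, grantee_name, grantee_type)
        granted[key].add(privilege_type.upper())

    result = []
    for (object_type, grantee_name, grantee_type), privs in sorted(granted.items()):
        record = {
            "object_type": object_type,
            "grantee_name": grantee_name,
            "grantee_type": grantee_type,
        }
        for privilege in privileges:
            # None plays the role of the NULL that PIVOT yields for no match.
            record[privilege] = 1 if privilege in privs else None
        result.append(record)
    return result


# Hypothetical sample rows, standing in for a cursor.fetchall() result.
sample_rows = [
    ("RELATION", "abc", "role", "SELECT"),
    ("RELATION", "abc", "role", "INSERT"),
    ("FUNCTION", "abc", "role", "EXECUTE"),
]

for record in pivot_privileges(sample_rows):
    print(record)
```

Because the pivot happens after the fetch, the query sent to Redshift only has to read the system view, which keeps it on the leader node.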

COBOL - DCLGEN Host Variable Are Ambiguous

Hi, I am trying to run a SQL SELECT query with an inner join on tbl1 and tbl2.
The DCLGENs of the two tables, i.e. DCLTBL1 and DCLTBL2, have a few identical column names. Because of this, compilation fails with an error saying that the host variables are unresolved because they are ambiguous.
SQL query:
EXEC SQL
SELECT A.COLUMN1, A.COLUMN2
FROM TBL1 A INNER JOIN TBL2 B ON A.COLUMN1 = B.COLUMN2
WHERE A.COLUMN1 = :HOST-VARIABLE1
AND A.COLUMN2 = :HOST-VARIABLE2
END-EXEC.
What could be done to resolve this issue?

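Db2 on IBM Z allows you to qualify your host variables with the name of the structure that contains them.
Try :TBL1-DCLGEN-STRUCTURE.HOST-VARIABLE1
The structure name comes first, with a single leading colon, followed by a period and the field name.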

"Spectrum nested query error" Redshift error

When I run this query in Redshift:
select sd.device_id
from devices.s_devices sd
left join devices.c_devices cd
on sd.device_id = cd.device_id
I get an error like this:
ERROR: Spectrum nested query error
DETAIL:
-----------------------------------------------
error: Spectrum nested query error
code: 8001
context: A subquery that refers to a nested table cannot refer to any other table.
query: 0
location: nested_query_rewriter.cpp:726
process: padbmaster [pid=6361]
-----------------------------------------------
I'm not too sure what this error means. I'm only joining to one table, so I'm not sure which "other table" it's referring to, and I can't find much info about this error on the web.
I've noticed if I change it from left join to join, the error goes away, but I do need to do a left join.
Any ideas what I'm doing wrong?

The Redshift reference mentions:
If a FROM clause in a subquery refers to a nested table, it can't refer to any other table.
In your example, you're trying to join two nested tables in one statement.
I would try to first unnest them separately and only then join:
with
s_dev as (select sd.device_id from devices.s_devices sd),
c_dev as (select cd.device_id from devices.c_devices cd)
select
s_dev.device_id
from s_dev
left join c_dev
on s_dev.device_id = c_dev.device_id

The solution that worked for me was to create a temporary table with the nested table's data and then join that temp table with the other tables I needed.
For example, if the nested table is spectrum.customers, the solution will be:
DROP TABLE IF EXISTS temp_spectrum_customers;
CREATE TEMPORARY TABLE
temp_spectrum_customers AS
SELECT c.id, o.shipdate, c.customer_id
FROM spectrum.customers c,
c.orders o;
SELECT tc.id, tc.shipdate, tc.customer_id, d.delivery_carrier
FROM temp_spectrum_customers tc
LEFT OUTER JOIN orders_delivery d on tc.id = d.order_id;

Can a database table partition name be used as a part of WHERE clause for IBM DB2 9.7 SELECT statement?

I am trying to select all data out of the same specific table partition for 100+ tables using the DB2 EXPORT utility. The partition name is constant across all of my partitioned tables, which makes this method more advantageous than using some other possible methods.
I cannot detach the partitions as they are in a production environment.
In order to script this for semi-automation, I need to be able to run the query:
SELECT * FROM MYTABLE
WHERE PARTITION_NAME = MYPARTITION;
I am not able to find the correct syntax for utilizing this type of logic in my SELECT statement passed to the EXPORT utility.

You can do something like this by looking up the partition number first:
SELECT SEQNO
FROM SYSCAT.DATAPARTITIONS
WHERE TABNAME = 'YOURTABLE' AND DATAPARTITIONNAME = 'WHATEVER'
then using the SEQNO value in the query:
SELECT * FROM MYTABLE
WHERE DATAPARTITIONNUM(anycolumn) = <SEQNO value>
Edit:
Since it does not matter what column you reference in DATAPARTITIONNUM(), and since each table is guaranteed to have at least one column, you can automatically generate queries by joining SYSCAT.DATAPARTITIONS and SYSCAT.COLUMNS:
select
'select * from', p.tabname,
'where datapartitionnum(', colname, ') = ', seqno
from syscat.datapartitions p
inner join syscat.columns c
on p.tabschema = c.tabschema and p.tabname = c.tabname
where colno = 0
and datapartitionname = '<your partition name>'
and p.tabname in (<your table list>)
However, building dependency on database metadata into your application is, in my view, not very reliable. You can simply specify the appropriate partitioning key range to extract the data, which will be as efficient.
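If the generated statements are assembled in a script rather than in SQL, the same idea can be sketched in Python. This is a minimal sketch, assuming the metadata rows have already been fetched as (tabschema, tabname, colname, seqno) tuples. Note that tabschema is an extra column that the catalog query above does not select. The function name build_partition_queries and the sample values are hypothetical.

```python
def build_partition_queries(metadata_rows):
    """Build one SELECT per table from (tabschema, tabname, colname, seqno) rows.

    The rows are assumed to come from a SYSCAT.DATAPARTITIONS /
    SYSCAT.COLUMNS join, one row per table.
    """
    queries = []
    for tabschema, tabname, colname, seqno in metadata_rows:
        # Strip stray whitespace defensively and quote the identifiers so
        # that the names are used exactly as stored in the catalog.
        table = f'"{tabschema.strip()}"."{tabname.strip()}"'
        column = f'"{colname.strip()}"'
        queries.append(
            f"SELECT * FROM {table} "
            f"WHERE DATAPARTITIONNUM({column}) = {int(seqno)}"
        )
    return queries


# Hypothetical metadata rows for two partitioned tables.
sample_metadata = [
    ("MYSCHEMA", "ORDERS", "ORDER_ID", 3),
    ("MYSCHEMA", "INVOICES", "INVOICE_ID", 3),
]

for query in build_partition_queries(sample_metadata):
    print(query)
```

Each resulting string is a plain SELECT, so it can be handed to the EXPORT utility in the same way as a hand-written query.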

Why does this query deadlock?

I have an application that reads the structure of an existing PostgreSQL 9.1 database, compares it against a "should be" state and updates the database accordingly. That works fine, most of the time. However, I had several instances now when reading the current database structure deadlocked. The query responsible reads the existing foreign keys:
SELECT tc.table_schema, tc.table_name, tc.constraint_name, kcu.column_name,
ccu.table_schema, ccu.table_name, ccu.column_name
FROM information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS kcu
ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage AS ccu
ON ccu.constraint_name = tc.constraint_name
WHERE constraint_type = 'FOREIGN KEY'
Viewing the server status in pgAdmin shows this to be the only active query/transaction that's running on the server. Still, the query doesn't return.
The error is reproducible in a way: When I find a database that produces the error, it will produce the error every time. But not all databases produce the error. This is one mysterious bug, and I'm running out of options and ideas on what else to try or how to work around this. So any input or ideas are highly appreciated!
PS: A colleague of mine just reported he produced the same error using PostgreSQL 8.4.

I tested and found your query very slow, too. The root of this problem is that "tables" in information_schema are in fact complicated views that provide catalogs according to the SQL standard. In this particular case, matters are further complicated because foreign keys can be built on multiple columns. Your query yields duplicate rows for those cases, which, I suspect, may be undesired.
Correlated subqueries with unnest, fed to ARRAY constructors, avoid the problem in my query.
This query yields the same information, just without duplicate rows and 100x faster. Also, I would venture to guarantee, without deadlocks.
It only works in PostgreSQL and is not portable to other RDBMSes.
SELECT c.conrelid::regclass AS table_name
, c.conname AS fk_name
, ARRAY(SELECT a.attname
FROM unnest(c.conkey) x
JOIN pg_attribute a
ON a.attrelid = c.conrelid AND a.attnum = x) AS fk_columns
, c.confrelid::regclass AS ref_table
, ARRAY(SELECT a.attname
FROM unnest(c.confkey) x
JOIN pg_attribute a
ON a.attrelid = c.confrelid AND a.attnum = x) AS ref_columns
FROM pg_catalog.pg_constraint c
WHERE c.contype = 'f';
-- ORDER BY c.conrelid::regclass::text,2
The cast to ::regclass yields table names as seen with your current search_path. May or may not be what you want. For this query to include the absolute path (schema) for every table name you can set the search_path like this:
SET search_path = pg_catalog;
SELECT ...
To continue your session with your default search_path:
RESET search_path;
Related:
Get column names and data types of a query, table or view
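The duplicate rows mentioned in the answer follow from joining key_column_usage and constraint_column_usage on constraint_name alone: a foreign key on n columns has n rows in each view, so the join returns n × n combinations. The following Python sketch emulates that join on plain lists. The constraint and column names are hypothetical, and the views are reduced to the two columns that matter here.

```python
# Hypothetical two-column foreign key "fk_order_item", reduced to
# (constraint_name, column_name) pairs.
key_column_usage = [
    ("fk_order_item", "order_id"),
    ("fk_order_item", "item_no"),
]
constraint_column_usage = [
    ("fk_order_item", "id"),
    ("fk_order_item", "item_no"),
]


def join_on_constraint_name(kcu_rows, ccu_rows):
    """Emulate JOIN ... ON kcu.constraint_name = ccu.constraint_name."""
    return [
        (k_name, k_col, c_col)
        for k_name, k_col in kcu_rows
        for c_name, c_col in ccu_rows
        if k_name == c_name
    ]


joined = join_on_constraint_name(key_column_usage, constraint_column_usage)
for row in joined:
    print(row)
print(len(joined), "rows for a 2-column foreign key")
```

Only two of the four combinations pair a referencing column with the column it actually references. The rest are the spurious rows that the pg_constraint query avoids by aggregating the columns into arrays.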
