How to query the schema info in Redshift?

I am getting the following error:
ERROR: Specified types or functions (one per INFO message) not supported on Redshift tables.
I understand this is because I am using a leader-node-only function, but is there another way to do the same thing?
I am running this in Amazon Data Pipeline against a Redshift database.
CREATE TABLE "schema_n"."temp_variable"
AS
SELECT CASE WHEN (NOT EXISTS(SELECT 1 FROM PG_TABLE_DEF pgtd WHERE schemaname = 'schema_xyz' AND tablename = 'table_xyz')) OR (DATE_PART('dow', CURRENT_DATE) = 0)
THEN '2017-01-01'::DATE
ELSE CURRENT_DATE - 11
END AS "date_import";
I've also tried:
CREATE TABLE "schema_n"."temp_variable"
AS
SELECT CASE WHEN (NOT EXISTS(SELECT * FROM information_schema.tables WHERE table_schema = 'schema_xyz' AND table_name = 'table_xyz')) OR (DATE_PART('dow', CURRENT_DATE) = 0)
THEN '2017-01-01'::DATE
ELSE CURRENT_DATE - 11
END AS "date_import";
Basically I am trying to do the following:
If table_xyz does not exist, or if it's Sunday, return '2017-01-01'; otherwise return today minus 11 days.
Whatever I try, I keep getting the same error:
ERROR: Specified types or functions (one per INFO message) not supported on Redshift tables.

In the end, the only thing that worked in the pipeline was to create the table first and then populate it using the svv_table_info view.
CREATE TABLE "schema_n"."temp_variable" (
"date_import" DATE
);
INSERT INTO "schema_n"."temp_variable"
SELECT
CASE WHEN (NOT EXISTS(SELECT 1 FROM svv_table_info WHERE "schema" = 'schema_xyz' AND "table" = 'table_xyz')) OR (DATE_PART('dow', CURRENT_DATE) = 0)
THEN '2017-01-01'::DATE
ELSE CURRENT_DATE - 11
END AS "date_import";

Related

ERROR: missing FROM-clause entry for table "max_table"

I want to get the maximum id from the home_history table and filter home_history by this value. I used a WITH clause for this. Where did I go wrong?
with max_table as (select max(id) as max_id from home_history),
current_data as (
select Cast(created_at As date), count(id)
from home_history
where id > (max_table.max_id - 30 * 500000)
and created_at >= CAST((now() + (INTERVAL '-30 day')) AS date)
and home_history.created_at < CAST(now() AS date)
group by CAST(created_at AS date)
order by CAST(created_at As Date)
)
SELECT *
from current_data;
[42P01] ERROR: missing FROM-clause entry for table "max_table"
Position: 201

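Even if it seems obvious to you, you have to state explicitly, in the query that uses it, that you are selecting from the CTE. Here the current_data query references max_table.max_id without listing max_table in its FROM clause.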
Think of a CTE as a view defined just for a single query. You'd need a FROM clause to indicate the view you select from.
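A minimal sketch of that fix, run through Python's built-in sqlite3 module so it can be tried without a server. SQLite stands in for PostgreSQL here, and the sample rows and the offset of 2 are invented for illustration; the CROSS JOIN on the CTE is the part that carries over.

```python
import sqlite3

# SQLite stands in for PostgreSQL; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE home_history (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO home_history (id, created_at) VALUES (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-01"), (3, "2024-01-02"), (4, "2024-01-03")],
)

query = """
WITH max_table AS (SELECT MAX(id) AS max_id FROM home_history),
current_data AS (
    SELECT h.created_at AS day, COUNT(h.id) AS n
    FROM home_history AS h
    CROSS JOIN max_table              -- the CTE is now listed in FROM
    WHERE h.id > max_table.max_id - 2 -- offset of 2 is illustrative only
    GROUP BY h.created_at
)
SELECT day, n FROM current_data ORDER BY day
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Because max_table returns exactly one row, the cross join does not multiply rows. A scalar subquery such as (SELECT max_id FROM max_table) in the WHERE clause would be another way to reference the CTE.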

How to pass the result from first query to the second one with PostgresOperator in airflow 2.x?

I have to pass the result of my first Redshift query to the second one. I am using the PostgresOperator, whose script doesn't return anything, as you can see in this link.
I thought about modifying the script to add a return to the execute method. But I don't call the execute method myself; to execute the SQL script I am using this:
retrieve_latest_query_task = PostgresOperator(
sql='rs_warm-up_query-id.sql',
postgres_conn_id='redshift',
task_id='retrieve_latest_query_ids_from_metadata'
)
Here are my two queries:
SELECT query
FROM (SELECT query,
querytxt,
ROW_NUMBER() OVER (PARTITION BY querytxt ORDER BY query ASC) AS num
FROM stl_query
WHERE userid = 102
AND starttime >= CURRENT_DATE - 2 + INTERVAL '7 hour'
AND starttime < CURRENT_DATE - 2 + INTERVAL '11 hour'
AND UPPER(querytxt) LIKE 'SELECT %'
ORDER BY query)
WHERE num = 1;
The retrieved data (which is a list) then has to be passed to the second script:
SELECT LISTAGG(CASE WHEN LEN (RTRIM(TEXT)) = 0 THEN TEXT ELSE RTRIM(TEXT) END,'') within group(ORDER BY SEQUENCE) AS TEXT
FROM stl_querytext
WHERE query = {};
I thought that using XCom could be a good solution, as I don't return many rows. But I don't know how to use it with Postgres.
I don't want to use a temporary table, as I believe I don't need one for such a small volume.
I'd appreciate your help.

Dividing 2 count statements in Postgresql

I have a question about dividing the two count statements below, which gives me the error underneath.
(SELECT COUNT(transactions.transactionNumber)
FROM transactions
INNER JOIN account ON account.sfid = transactions.accountsfid
INNER JOIN transactionLineItems ON transactions.transactionNumber
= transactionLineItems.transactionNumber
INNER JOIN products ON transactionLineItems.USIM = products.USIM
WHERE products.gender = 'male' AND products.agegroup = 'adult'
AND transactions.transactionDate >= current_date - interval
'730' day)/
(SELECT COUNT(transactions.transactionNumber)
FROM transactions
WHERE transactions.transactionDate >=
current_date - interval '730' day)
ERROR: syntax error at or near "/"
LINE 6: ...tions.transactionDate >= current_date - interval '730' day)/
I think the problem is that my count statements are creating tables, and dividing one table by another is not allowed. How can I make this division work?
Afterwards I want to check the result against a percentage, e.g. < 0.2.
Can anyone help me with this?

Is that your complete query? Something like this works in Postgres 10:
SELECT
(SELECT COUNT(id) FROM test WHERE state = false) / (SELECT COUNT(id) FROM test WHERE state = true) as y
The extra SELECT in front of the two subqueries and the division is what's important. Without it I also get the error you mentioned.
See also my DB Fiddle version of this query.
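One thing to watch when the result is compared against a fraction such as 0.2: COUNT returns an integer, and dividing two integers truncates in PostgreSQL, so a ratio below 1 comes out as 0. The sketch below shows the effect and a cast that avoids it. It uses Python's sqlite3 module as a stand-in (SQLite truncates integer division the same way), the test table contents are invented, and state is stored as 0/1. In PostgreSQL the cast would typically be written ::numeric rather than CAST(... AS REAL).

```python
import sqlite3

# SQLite stands in for PostgreSQL; one "false" row and four "true" rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, state INTEGER)")
conn.executemany("INSERT INTO test (state) VALUES (?)", [(0,), (1,), (1,), (1,), (1,)])

# Integer / integer: the fractional part is dropped.
int_ratio = conn.execute(
    "SELECT (SELECT COUNT(id) FROM test WHERE state = 0)"
    " / (SELECT COUNT(id) FROM test WHERE state = 1)"
).fetchone()[0]

# Casting one operand to a non-integer type keeps the fraction.
real_ratio = conn.execute(
    "SELECT CAST((SELECT COUNT(id) FROM test WHERE state = 0) AS REAL)"
    " / (SELECT COUNT(id) FROM test WHERE state = 1)"
).fetchone()[0]

print(int_ratio, real_ratio, real_ratio < 0.2)
```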

Looping SQL query - PostgreSQL

I'm trying to get a query to loop through a set of pre-defined integers:
I've made the query very simple for this question. This is pseudocode as well, obviously!
my_id = 0
WHILE my_id < 10
SELECT * from table where id = :my_id
my_id += 1
END
I know that for this query I could just do something like where id < 10. But the actual query I'm performing is about 60 lines long, with quite a few window functions all referring to the variable in question.
It works, and gets me the results I want when I have the variable set to a single value. I just need to be able to re-run the query 10 times with different values, hopefully ending up with one single set of results.
So far I have this:
CREATE OR REPLACE FUNCTION stay_prices ( a_product_id int ) RETURNS TABLE (
pid int,
pp_price int
) AS $$
DECLARE
nights int;
nights_arr INT[] := ARRAY[1,2,3,4];
j int;
BEGIN
j := 1;
FOREACH nights IN ARRAY nights_arr LOOP
-- query here..
END LOOP;
RETURN;
END;
$$ LANGUAGE plpgsql;
But I'm getting this back:
ERROR: query has no destination for result data
HINT: If you want to discard the results of a SELECT, use PERFORM instead.
So do I need to get my query to SELECT ... INTO the returning table somehow? Or is there something else I can do?
EDIT: this is an example of the actual query I'm running:
\x auto
\set nights 7
WITH x AS (
SELECT
product_id, night,
LAG(night, (:nights - 1)) OVER (
PARTITION BY product_id
ORDER BY night
) AS night_start,
SUM(price_pp_gbp) OVER (
PARTITION BY product_id
ORDER BY night
ROWS BETWEEN (:nights - 1) PRECEDING
AND CURRENT ROW
) AS pp_price,
MIN(spaces_available) OVER (
PARTITION BY product_id
ORDER BY night
ROWS BETWEEN (:nights - 1) PRECEDING
AND CURRENT ROW
) AS min_spaces_available,
MIN(period_date_from) OVER (
PARTITION BY product_id
ORDER BY night
ROWS BETWEEN (:nights - 1) PRECEDING
AND CURRENT ROW
) AS min_period_date_from,
MAX(period_date_to) OVER (
PARTITION BY product_id
ORDER BY night
ROWS BETWEEN (:nights - 1) PRECEDING
AND CURRENT ROW
) AS max_period_date_to
FROM products_nightlypriceperiod pnpp
WHERE
spaces_available >= 1
AND min_group_size <= 1
AND night >= '2016-01-01'::date
AND night <= '2017-01-01'::date
)
SELECT
product_id as pid,
CASE WHEN x.pp_price > 0 THEN x.pp_price::int ELSE null END as pp_price,
night_start as from_date,
night as to_date,
(night-night_start)+1 as duration,
min_spaces_available as spaces
FROM x
WHERE
night_start = night - (:nights - 1)
AND min_period_date_from = night_start
AND max_period_date_to = night;
That will get me all the periods of :nights nights available for all my products in 2016, along with the price for the period and the maximum number of spaces I could fill in that period.
I'd like to be able to run this query to get all the periods available between 2 and 30 days for all my products.
This is likely to produce a table with millions of rows. The plan is to re-create this table periodically to enable a very quick look up of what's available for a particular date. The products_nightlypriceperiod represents a night of availability of a product - e.g. Product X has 3 spaces left for Jan 1st 2016, and costs £100 for the night.

Why use a loop? You can do something like this (using your first query):
with params as (
select generate_series(1, 10) as id
)
select t.*
from params cross join
table t
where t.id = params.id;
You can modify params to have the values you really want. Then just use cross join and let the database "do the looping."
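As a rough illustration of the same idea applied to a parameter like the number of nights, the sketch below computes one result row per parameter value in a single statement. It runs on Python's sqlite3 module as a stand-in for PostgreSQL, so the parameter set is written as a VALUES list instead of generate_series. The nightly table, its rows, and the simplified "sum of the first n nights" calculation are assumptions made for the example, not the original 60-line query.

```python
import sqlite3

# SQLite stands in for PostgreSQL; table name and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nightly (product_id INTEGER, night INTEGER, price_pp_gbp INTEGER)")
conn.executemany(
    "INSERT INTO nightly VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (1, 3, 30), (1, 4, 40)],
)

query = """
WITH params(n) AS (VALUES (1), (2), (3))   -- generate_series(1, 3) in PostgreSQL
SELECT params.n AS nights, SUM(t.price_pp_gbp) AS pp_price
FROM params
CROSS JOIN nightly AS t
WHERE t.night <= params.n                  -- each row of params acts as one loop iteration
GROUP BY params.n
ORDER BY params.n
"""
results = conn.execute(query).fetchall()
print(results)
```

Every place the original query uses the variable would refer to params.n instead, and the window PARTITION BY clauses would likely need params.n added so that each parameter value is computed separately.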

SQL Server 2000: how do i get a list of tables and the row counts? [duplicate]

I know that I can get a list of tables with
SELECT TABLE_NAME FROM information_schema.tables
WHERE NOT TABLE_NAME='sysdiagrams'
AND TABLE_SCHEMA = 'dbo'
AND TABLE_TYPE= 'BASE TABLE'
But I'm not sure how to modify that to get a second column with the current count of rows for each table. I thought of something like this:
DECLARE @tbl VARCHAR(200)
(SELECT @tbl = TABLE_NAME, TABLE_NAME,
(SELECT COUNT(ID) AS Cnt FROM @tbl)
FROM information_schema.tables
WHERE NOT TABLE_NAME='sysdiagrams'
AND TABLE_SCHEMA = 'dbo'
AND TABLE_TYPE= 'BASE TABLE')
I know the above is not valid T-SQL, but I think it gets across what I would like to have done. This is for SQL Server 2000. I would prefer not to use stored procedures if at all possible.

A quick and dirty way (includes uncommitted changes and possibly forwarding pointers on heaps)
select o.name, rows
from sysindexes i join sysobjects o on o.id=i.id
where indid < 2 and type='U'

exec sp_MSforeachtable 'select count(*) as nr_of_rows, ''?'' as table_name from ?'

You can go whole hog on this one. The problem with using sysIndexes to get rowcounts is that they're not always up to date. There is a way to make them all up to date, though. The following code will give you row counts for each table and a whole bunch more.
/**********************************************************************************************************************
Purpose:
Returns a single result set similar to sp_SpaceUsed for all user tables at once.
Notes:
1. May be used as a view, stored procedure, or table-valued function.
2. Must comment out 1 "Schema" in the SELECT list below prior to use. See the adjacent comments for more info.
Revision History:
Rev 00 - 22 Jan 2007 - Jeff Moden
- Initial creation for SQL Server 2000
Rev 01 - 11 Mar 2007 - Jeff Moden
- Add automatic page size determination for future compliance
Rev 02 - 05 Jan 2008 - Jeff Moden
- Change "Owner" to "Schema" in output. Add optional code per Note 2 to find correct schema name
**********************************************************************************************************************/
--===== Ensure that all row counts, etc is up to snuff
-- Obviously, this will not work in a view or UDF and should be removed if in a view or UDF. External code should
-- execute the command below prior to retrieving from the view or UDF.
DBCC UPDATEUSAGE(0) WITH COUNT_ROWS, NO_INFOMSGS
--===== Return the single result set similar to what sp_SpaceUsed returns for a table, but more
SELECT DBName = DB_NAME(),
--SchemaName = SCHEMA_NAME(so.UID), --Comment out if for SQL Server 2000
SchemaName = USER_NAME(so.UID), --Comment out if for SQL Server 2005
TableName = so.Name,
TableID = so.ID,
MinRowSize = MIN(si.MinLen),
MaxRowSize = MAX(si.XMaxLen),
ReservedKB = SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Reserved ELSE 0 END) * pkb.PageKB,
DataKB = SUM(CASE WHEN si.IndID IN (0,1 ) THEN si.DPages ELSE 0 END) * pkb.PageKB
+ SUM(CASE WHEN si.IndID IN ( 255) THEN ISNULL(si.Used,0) ELSE 0 END) * pkb.PageKB,
IndexKB = SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Used ELSE 0 END) * pkb.PageKB
- SUM(CASE WHEN si.IndID IN (0,1 ) THEN si.DPages ELSE 0 END) * pkb.PageKB
- SUM(CASE WHEN si.IndID IN ( 255) THEN ISNULL(si.Used,0) ELSE 0 END) * pkb.PageKB,
UnusedKB = SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Reserved ELSE 0 END) * pkb.PageKB
- SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Used ELSE 0 END) * pkb.PageKB,
Rows = SUM(CASE WHEN si.IndID IN (0,1 ) THEN si.Rows ELSE 0 END),
RowModCtr = MIN(si.RowModCtr),
HasTextImage = MAX(CASE WHEN si.IndID IN ( 255) THEN 1 ELSE 0 END),
HasClustered = MAX(CASE WHEN si.IndID IN ( 1 ) THEN 1 ELSE 0 END)
FROM dbo.SysObjects so,
dbo.SysIndexes si,
(--Derived table finds page size in KB according to system type
SELECT Low/1024 AS PageKB --1024 is a binary Kilo-byte
FROM Master.dbo.spt_Values
WHERE Number = 1 --Identifies the primary row for the given type
AND Type = 'E' --Identifies row for system type
) pkb
WHERE si.ID = so.ID
AND si.IndID IN (0, --Table w/o Text or Image Data
1, --Table with clustered index
255) --Table w/ Text or Image Data
AND so.XType = 'U' --User Tables
AND PERMISSIONS(so.ID) <> 0
GROUP BY so.Name,
so.UID,
so.ID,
pkb.PageKB
ORDER BY ReservedKB DESC

What about "dtproperties" and "sysdiagrams"?
These tables will incorrectly show up as user tables.
