Adding the current date to a materialized view in Amazon Redshift

We are using Amazon Redshift. When we create a materialized view with a column that captures the current date, along with some other columns from a specific table, it fails with an error stating that materialized views cannot contain mutable functions.
Following is the sample code:
CREATE MATERIALIZED VIEW SAMPLE_TEST AS
SELECT CURRENT_DATE FROM TABLE_A;
I have also tried SYSDATE, GETDATE(), and NOW() in place of CURRENT_DATE, but it seems they are all mutable functions as well.
It throws the following error:
Amazon Invalid operation: Materialized views cannot contain mutable functions. The given materialized view contains one or more of the following mutable functions: CURRENT_DATE
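One possible workaround, sketched here as an assumption and not something confirmed in this thread: keep the materialized view free of mutable functions and add the date in an ordinary view on top of it. The column name COL_A, the alias AS_OF_DATE, and the view name SAMPLE_TEST_WITH_DATE are hypothetical. Note that CURRENT_DATE is then evaluated when the view is queried, not when the materialized view was last refreshed.

```sql
-- The materialized view holds only immutable expressions.
-- COL_A is a hypothetical column of TABLE_A.
CREATE MATERIALIZED VIEW SAMPLE_TEST AS
SELECT COL_A FROM TABLE_A;

-- A plain (non-materialized) view adds the date at query time.
CREATE VIEW SAMPLE_TEST_WITH_DATE AS
SELECT COL_A, CURRENT_DATE AS AS_OF_DATE
FROM SAMPLE_TEST;
```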

Related

Is there a way to describe an external/spectrum table via redshift?

In AWS Athena you can write
SHOW CREATE TABLE my_table_name;
and see a SQL-like query that describes how to build the table's schema. It works for tables whose schemas are defined in AWS Glue. This is very useful for creating tables in a regular RDBMS, for loading and exploring data views.
Interacting with Athena in this way is manual, and I would like to automate the process of creating regular RDBMS tables that have the same schema as those in Redshift Spectrum.
How can I do this through a query that can be run via psql? Or is there another way to get this via the aws-cli?
Redshift Spectrum does not support the SHOW CREATE TABLE syntax, but there are system tables that can deliver the same information. I have to say, it's not as useful as the ready-to-use SQL returned by Athena, though.
The tables are
svv_external_schemas - gives you information about glue database mapping and IAM roles bound to it
svv_external_tables - gives you the location information, and also data format and serdes used
svv_external_columns - gives you the column names, types and order information.
Using that data, you could reconstruct the table's DDL.
For example, to get the list of columns and their types in the CREATE TABLE format, one can do:
select distinct
listagg(columnname || ' ' || external_type, ',\n')
within group ( order by columnnum ) over ()
from svv_external_columns
where tablename = '<YOUR_TABLE_NAME>'
and schemaname = '<YOUR_SCHEMA_NAME>'
The query gives output similar to:
col1 int,
col2 string,
...
Note: I am using the listagg window function and not the aggregate function, as apparently the listagg aggregate function can only be used with user-defined tables. Bummer.
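To complement the column list, the storage details can be read from svv_external_tables. This is a minimal sketch using placeholder names in the style of the query above; the selected columns are the ones I believe that view exposes, so verify them against your cluster.

```sql
-- Location, file formats and serde for one external table.
select location,
       input_format,
       output_format,
       serialization_lib,
       serde_parameters
from svv_external_tables
where tablename = '<YOUR_TABLE_NAME>'
and schemaname = '<YOUR_SCHEMA_NAME>';
```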
I had been doing something similar to @botchniaque's answer in the past, but recently stumbled across a solution in the AWS-Labs' amazon-redshift-utils code package that seems to be more reliable than my hand-spun queries:
amazon-redshift-utils: v_generate_external_tbl_ddl
If you don't have the ability to create a view from the DDL listed in that package, you can run the query manually by removing the CREATE statement from its start. Assuming you can create it as a view, usage would be:
SELECT ddl
FROM admin.v_generate_external_tbl_ddl
WHERE schemaname = '<external_schema_name>'
-- Optionally include specific table references:
-- AND tablename IN ('<table_name_1>', '<table_name_2>', ..., '<table_name_n>')
ORDER BY tablename, seq
;
Redshift has since added SHOW EXTERNAL TABLE:
SHOW EXTERNAL TABLE external_schema.table_name [ PARTITION ]
SHOW EXTERNAL TABLE my_schema.my_table;
https://docs.aws.amazon.com/redshift/latest/dg/r_SHOW_EXTERNAL_TABLE.html

How to avoid static query in Postgres?

In a function, I need an array of values which is a result of a simple query like:
SELECT array_agg( some_col ) FROM some_table;
I could declare it in function like:
my_array text[] := (SELECT array_agg( some_col ) FROM some_table);
But:
this dataset changes maybe once in some years
this dataset is really small
this function would be called a lot
this dataset needs to be up to date
Is there a way to avoid executing the same query over and over? It is not particularly expensive to call, but due to its static nature, I'd like to avoid it.
I could set a trigger on some_table to regenerate the cached version of my_array on any mutation of the table, but is there a way to hold such a variable all the time for every connection?
I'd like to write this function in SQL or PLPGSQL.
In Postgres you can create materialized views (see the docs). It allows you to store the result of a query, and refresh it whenever you want.
The result is stored like a regular table, so it is very cheap to query.
CREATE MATERIALIZED VIEW mymatview AS SELECT array_agg( some_col ) FROM some_table;
And when you want to refresh it:
REFRESH MATERIALIZED VIEW mymatview;
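Since the dataset has to stay up to date, the refresh can be tied to changes on the source table. This is a minimal sketch, assuming the names from the example above; the function name refresh_mymatview and the trigger name some_table_refresh_mymatview are hypothetical.

```sql
-- Trigger function that rebuilds the materialized view.
CREATE OR REPLACE FUNCTION refresh_mymatview() RETURNS trigger AS $$
BEGIN
    REFRESH MATERIALIZED VIEW mymatview;
    RETURN NULL;  -- the return value is ignored for AFTER statement triggers
END;
$$ LANGUAGE plpgsql;

-- Fire once per modifying statement on the source table.
CREATE TRIGGER some_table_refresh_mymatview
AFTER INSERT OR UPDATE OR DELETE OR TRUNCATE ON some_table
FOR EACH STATEMENT
EXECUTE PROCEDURE refresh_mymatview();
```

A statement-level trigger refreshes once per statement instead of once per row, which suits a small table that changes only rarely.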

Join a view to another table

Is it possible to join a view with another table in SQL? If so, how?
I have a query on an Oracle db which returns specific fields. I need to recreate the same query on PostgreSQL, but some of the data in the PostgreSQL query comes from a view, and that view has missing information. It's a pretty complex view, so I would rather keep using it for now.
For example, in Oracle I do this:
SELECT
d.dos_id,
trunc(d.dos_creation, 'MM') as Cohorte,
sum(v.ver_etude + v.ver_direct) as encaissé
from t_dossier d
left outer join v_versement v
on v.dos_id = d.dos_id
group by d.dos_id, trunc(d.dos_creation, 'MM')
In the Postgres one, I'm using a view. But the view does not return "dos_id" so I cannot explicitly join v_versement with the view.
Is there a way to force a view to return specific fields at runtime which weren't there when creating the view?
You can't force it "to return specific fields at runtime which weren't there when creating the view".
You can create or replace it, with a limitation:
https://www.postgresql.org/docs/current/static/sql-createview.html
CREATE OR REPLACE VIEW is similar, but if a view of the same name
already exists, it is replaced. The new query must generate the same
columns that were generated by the existing view query (that is, the
same column names in the same order and with the same data types), but
it may add additional columns to the end of the list. The calculations
giving rise to the output columns may be completely different.
example:
t=# create view v2 as select now();
CREATE VIEW
Time: 36.488 ms
t=# create or replace view v2 as select now(),current_user;
CREATE VIEW
Time: 8.551 ms
t=# create or replace view v2 as select now()::text,current_user;
ERROR: cannot change data type of view column "now" from timestamp with time zone to text
Time: 0.430 ms
I guess I had not realised that I can actually use the view without creating it...
So I edited the SQL statement that makes up the view, added the fields that I needed, and used the code of the view directly without having to create a new view (creating a new view would mean outsourcing it to another company, which would cost us money).
Thanks :)
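The approach described above can be sketched as follows: fetch the view's SQL with pg_get_viewdef, add the missing dos_id column to a copy of it, and use that copy as a derived table. The thread does not show the view's definition, so the view name my_view, the table some_source_table, the column some_date, and the alias cohorte are all hypothetical.

```sql
-- 1. Print the SQL behind the existing view (my_view is a hypothetical name).
SELECT pg_get_viewdef('my_view'::regclass, true);

-- 2. Paste that SQL into a derived table, add the missing dos_id column,
--    and join it like any other table.
SELECT d.dos_id,
       x.cohorte
FROM t_dossier d
LEFT OUTER JOIN (
    -- edited copy of the view's query; this body is only a placeholder
    SELECT s.dos_id,
           date_trunc('month', s.some_date) AS cohorte
    FROM some_source_table s
) x ON x.dos_id = d.dos_id;
```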

postgres index with date in where clause

I have a large table with several million rows in PostgreSQL 9.1. One of the columns is timestamp with time zone.
A frequently used query filters with the where clause 'column > (now()::date - 11)' to fetch the last ten days of data.
I want to build a partial index that covers only the last month's data, to limit the scan.
So far I have not figured out how to use the actual last month, so I started by hardcoding '2015-12-01' as the start date for the index.
create index q on test (i) where i > '2015-01-01';
This worked fine and the index was created. Unfortunately, it was not used: it treats '2015-01-01' as a ::timestamp, while the query compares against a ::date. So I was back to square one.
Next I tried to modify the index to compare the column with a date, so that it would match. But here I hit the immutable wall.
Because to_date and a cast to date are not immutable (they depend on the local time zone), index creation fails.
if I have test table like this:
create table test (i timestamptz);
and then try to create index with
create index q on test (i) where i > to_date('2015-01-01','YYYY-DD-MM');
then it fails with
ERROR: functions in index predicate must be marked IMMUTABLE
This is understandable. But now, when I try it with a specific time zone
create index q on test (i) where i > to_date('2015-01-01','YYYY-DD-MM')
at time zone 'UTC';
it still fails
ERROR: functions in index predicate must be marked IMMUTABLE
This I don't understand anymore. It has a time zone defined. What else is still not immutable?
I also tried creating immutable function myself:
datacube=# create or replace function immutable_date(timestamptz) returns date as $$
select ($1::date at time zone 'UTC')::date;
$$ language sql immutable;
CREATE FUNCTION
but using this function in index:
create index q on test (i) where i > immutable_date('2015-01-01');
fails with the same error:
ERROR: functions in index predicate must be marked IMMUTABLE
I am at a loss here. Maybe it has something to do with locales, not only time zones? Or does something else make it mutable?
Also, is there another, simpler way to limit the index to the last month or two of data? Table partitioning in Postgres would require rebuilding the entire database, and so far I have not found anything else.
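One commonly suggested pattern for this kind of problem, offered as a sketch and not as something taken from this thread: write the index predicate with a timestamptz literal that carries an explicit UTC offset, so that no time-zone-dependent function or cast is involved, and repeat the same constant condition in the query so the planner can prove the index applies. The index name test_i_recent_idx and the cutoff value are assumptions.

```sql
-- Partial index with a constant timestamptz bound (explicit +00 offset).
create index test_i_recent_idx on test (i)
where i > timestamptz '2015-12-01 00:00:00+00';

-- The query repeats the index predicate verbatim, in addition to the
-- moving "last ten days" condition.
select *
from test
where i > now() - interval '10 days'
  and i > timestamptz '2015-12-01 00:00:00+00';
```

Because the cutoff is fixed, the index would have to be recreated from time to time with a newer bound.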

Greenplum : Object Creation Date

I have used pg_stat_operations to find out when a table or view was created or altered.
I want to know when a function or sequence was created or modified.
(The function list is available in pg_proc and information_schema.routines, but neither offers a creation or modification date.)
Is there any way to find this out?