How much space do VIEWS occupy in PostgreSQL? - postgresql

If I store my query results as views, does it take up more memory than a table holding the query results?
Another question about views: can I run a new query against the results of a query that is stored as a view?

Views don't store query results, they store queries.
Some RDBMSs can store query results (for some queries): these are called materialized views in Oracle and indexed views in SQL Server.
PostgreSQL did not support those before version 9.3 (though, as @CalvinCheng mentioned, you can emulate them using triggers or rules).
Yes, you can use views in your queries. However, a view is just a convenient way to refer to a complex query by name, not a way to store its results.
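A minimal sketch of both points, assuming a hypothetical orders table (all table, view, and column names here are made up for illustration). The view holds only its definition, so its storage can be compared with the base table's using pg_relation_size, and the view can be queried, or used to define another view, like any table:

```sql
-- Hypothetical base table; names are illustrative only.
create table orders (
    id          int primary key,
    customer_id int not null,
    amount      numeric not null
);

insert into orders values (1, 10, 150), (2, 10, 20), (3, 11, 30);

-- The view stores this query text, not its result rows.
create view customer_totals as
select customer_id, sum(amount) as total
from orders
group by customer_id;

-- Compare storage: the table has data on disk, the view has none of its own.
select pg_relation_size('orders')          as table_bytes,
       pg_relation_size('customer_totals') as view_bytes;

-- A view can be queried like a table ...
select * from customer_totals where total > 100;

-- ... and another view can be built on top of it.
create view big_customers as
select customer_id, total
from customer_totals
where total > 100;
```

Every select against customer_totals or big_customers re-runs the aggregation over orders, so the cost is query time rather than storage.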

For Question 1
To answer your first question: you cannot store your query results as views, but you can achieve similar functionality using PostgreSQL's trigger feature.
PostgreSQL supports views natively but (before version 9.3) not materialized views, i.e. views that store their results - though this can be handled using triggers. See http://wiki.postgresql.org/wiki/Materialized_Views
Views do not take up RAM ("memory").
For Question 2
And to answer the second question: to update a view in PostgreSQL, you will need to use CREATE RULE (simple views are automatically updatable as of 9.3) - http://www.postgresql.org/docs/devel/static/sql-createrule.html
CREATE RULE defines a new rule applying to a specified table or view. CREATE OR REPLACE RULE will either create a new rule, or replace an existing rule of the same name for the same table.

I would like to point out that as of Postgres 9.3, materialized views are supported.

A PostgreSQL view is a saved query. Once created, selecting from a view is exactly the same as selecting from the original query: the query is re-run each time. So views do not take up memory.
You cannot store your query results as views, since views are just queries, but you can achieve similar functionality using materialized views. Materialized views have two limitations. First, they are only updated on demand. Second, the whole materialized view must be updated; there is no way to update only a single stale row.
So if you need the stored results to stay current, you have to eagerly update them yourself whenever a change occurs that would invalidate a row. This can be done with triggers.
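As a rough illustration of the on-demand behaviour described above, here is a sketch that assumes PostgreSQL 9.3 or later and a hypothetical page_hits table (the names are not from the original posts):

```sql
-- Requires PostgreSQL 9.3 or later. Table and view names are hypothetical.
create table page_hits (
    page   text not null,
    hit_at timestamptz not null default now()
);

-- Runs the query once and stores the result rows on disk.
create materialized view hits_per_page as
select page, count(*) as hits
from page_hits
group by page;

-- Reads the stored rows; the defining query is not re-run here.
select * from hits_per_page order by hits desc;

-- Changes to page_hits are not visible in the view until it is refreshed.
insert into page_hits (page) values ('/home');

-- REFRESH recomputes the whole view, not individual stale rows.
refresh materialized view hits_per_page;
```

Unlike a plain view, hits_per_page does occupy disk space, roughly what a table holding the same rows would.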

Related

REFRESH MATERIALIZED VIEW suddenly taking more time to complete

We have a materialized view in our Postgres DB (11.12, managed by AWS RDS). We have a scheduled task that updates it every 5 minutes using REFRESH MATERIALIZED VIEW <view_name>. At some specific point last week, the time needed to refresh the view suddenly went from ~1s to ~20s. The view contains ~70k rows, with around 15 columns, all of them being integers, booleans or UUIDs.
Prior to this:
There were no changes in the server configuration.
There were no changes to the view itself. In fact, running EXPLAIN ANALYZE <expression used to create the view> shows that the query still executes in less than a second. If the query is run with a client like Postico, it takes ~20s to fetch all the results (roughly consistent with the time needed to materialize it, although we assume this is due to the time needed for network transmission).
There were no changes in the schema or any significant record increase in the contents of the tables needed to compute the view.
RDS Performance Insights indicate that the query is mostly using CPU resources
I know this is probably not enough to get a solution, but:
Are there any server performance metrics or logs that could help us better understand this situation?
Is this just the normal time the server needs to persist the view to disk? If so, any idea of possible reasons why it started to take so long recently?
Here is a link to the execution plan.
EDIT: creating another materialized view with the same JOINs but selecting fewer columns performs as expected (~1s).
EDIT 2: setting enable_nestloop = false greatly speeds up the REFRESH operation (same performance as before). Would this indicate that refactoring the underlying query could solve the issue?
Try REFRESH MATERIALIZED VIEW CONCURRENTLY.
When you refresh a materialized view, PostgreSQL locks the entire view, so you cannot query it during the refresh. To avoid this, you can use the CONCURRENTLY option:
REFRESH MATERIALIZED VIEW CONCURRENTLY view_name;
With the CONCURRENTLY option, PostgreSQL creates a temporary updated version of the materialized view, compares the two versions, and applies only the differences to the existing one.
You can query a materialized view while it is being refreshed this way. One requirement for using the CONCURRENTLY option is that the materialized view must have a UNIQUE index.
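A small sketch of the unique-index requirement, assuming PostgreSQL 9.4 or later (the version that introduced CONCURRENTLY) and hypothetical accounts / active_accounts names:

```sql
-- Hypothetical base table and materialized view.
create table accounts (
    id     int primary key,
    email  text not null,
    active boolean not null default true
);

create materialized view active_accounts as
select id, email
from accounts
where active;

-- CONCURRENTLY needs a unique index on plain columns
-- (no expressions, no WHERE clause) that covers every row.
create unique index active_accounts_id_idx on active_accounts (id);

-- Readers can keep selecting from active_accounts while this runs.
refresh materialized view concurrently active_accounts;
```

Note that this changes the locking behaviour, not the cost of running the defining query, so on its own it is unlikely to explain or remove a jump from ~1s to ~20s.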
Original poster here. This is more than a year old, but here's what happened and how we eventually fixed it.
TLDR:
- REFRESH MATERIALIZED VIEW <view_name> started to take much longer than executing the query used to construct the view (~20s vs ~1s).
- A couple of weeks after this question was asked, the query itself started to behave similarly (taking ~20s to complete). At this point, EXPLAIN ANALYZE started to show indications of performance issues with the query. So we ended up optimising the underlying query (the biggest performance gain came from replacing some JOINs with a CTE).
- After this, both REFRESH MATERIALIZED VIEW <view_name> and the standalone query performed as expected (execution time < 1s).
A still open question is why REFRESH MATERIALIZED VIEW <view_name> and the standalone query had different performance at some point in time. Was the query planner choosing different plans depending on whether it was going to materialize the view or not? If anyone knows whether such a thing is possible, please comment.
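For anyone wanting to repeat the enable_nestloop experiment from the question without changing the setting for the whole session, one possible sketch is below. It is a diagnostic aid under stated assumptions, not the fix described above; my_view is a placeholder for the materialized view's name.

```sql
-- my_view is a placeholder; substitute the real materialized view name.
begin;

-- SET LOCAL lasts only until the end of this transaction.
set local enable_nestloop = off;

refresh materialized view my_view;

commit;

-- To compare plans, run the view's defining query under both settings:
-- explain (analyze, buffers) <defining query>;
```

On PostgreSQL 11, the version in the question, a CTE is always planned separately from the outer query, which may be part of why moving joins into a CTE changed the plan. From version 12 onward the same behaviour has to be requested with WITH ... AS MATERIALIZED.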
Refreshing a materialized view every time (or every 5 minutes) is not a good approach; it defeats the purpose of using a materialized view. Let me explain one approach I worked out from my own experience, so you can find a more optimal way later. Assume the materialized view is built from two tables, and we want to refresh it only when data in one of those tables has changed. To do this, whenever one of the tables is updated or deleted from, we insert one record into a separate table (for example, a refresh_materialized table); you can also use a trigger for this. That record is what drives the refresh of the materialized view.
For example:
insert into refresh_materialized
(
    refresh_status,
    insert_date,
    executed_date
)
values (
    false,
    now(),
    null
);
Then, in our scheduled task, we can use this query:
select count(*) from refresh_materialized
where refresh_status = false;
If count(*) is greater than 0, we refresh the materialized view; otherwise we do nothing. After refreshing the materialized view, we must update this table:
update refresh_materialized
set
refresh_status = true,
executed_date = now()
where
refresh_status = false;
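The trigger variant mentioned above could look roughly like this. The column types, the source_table name, and the function and trigger names are assumptions for illustration; only the refresh_materialized columns come from the statements above.

```sql
-- Queue table used by the statements above (column types are assumptions).
create table refresh_materialized (
    refresh_status boolean not null default false,
    insert_date    timestamptz not null default now(),
    executed_date  timestamptz
);

-- Hypothetical stand-in for one of the tables the view is built from.
create table source_table (
    id      int primary key,
    payload text
);

-- Trigger function: record that a refresh is pending.
create function mark_refresh_needed() returns trigger
language plpgsql as $$
begin
    insert into refresh_materialized (refresh_status, insert_date, executed_date)
    values (false, now(), null);
    return null;  -- the return value is ignored for AFTER statement-level triggers
end;
$$;

-- One row is queued per modifying statement, not per modified row.
-- Create the same trigger on each table the view depends on.
create trigger source_table_mark_refresh
after insert or update or delete on source_table
for each statement
execute procedure mark_refresh_needed();
```

One thing to watch in the scheduled task: a change that commits between the refresh and the final update would be marked as done without being included. Running the update before the refresh, in the same transaction, is one way that may avoid this.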

Problem with linked materialized view when overwriting existing PostGIS table

Main question: I have several views depending on a PostgreSQL/PostGIS table and a final materialized view created by querying the other views. I need a fast and updatable final result (i.e. MV) to use in a QGIS project.
My aim is to update the starting table by overwriting it with (lots of) new values and, hopefully, have the views and the materialized view updated as well. I use QGIS DB Manager to overwrite the existing table, but I get an error because the MV depends on it. If I delete the MV, overwrite the table, and then recreate the MV, everything is OK, but I'd like to avoid manual operations as much as possible.
Is there a better way to reach my goal?
Another question: if I set a trigger to refresh the MV when I update/insert/delete values in a table, would it still work when the entire table is overwritten with a new one?
Refreshing a materialized view runs the complete defining query, so that is a long running and heavy operation for a complicated query.
It is possible to launch REFRESH MATERIALIZED VIEW from a trigger (it had better be a FOR EACH STATEMENT trigger then), but that would make every data modification so slow that I don't think that is practically feasible.
One thing that might work is to implement something like a materialized view that refreshes immediately “by hand”:
create a regular table for the “materialized view” and fill it with data by running the query
on each of the underlying tables, define a row level trigger that modifies the materialized view in accordance with the changes that triggered it
This should work for views where the definition is simple enough; for complicated queries it will not be possible.
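The two steps above can be sketched for a deliberately simple aggregate. Everything here (the sales and sales_by_region tables, the function and trigger names) is hypothetical, and the upsert uses ON CONFLICT, which assumes PostgreSQL 9.5 or later:

```sql
-- Hypothetical base table.
create table sales (
    id     serial primary key,
    region text not null,
    amount numeric not null
);

-- Regular table playing the role of the materialized view,
-- filled once by running the query.
create table sales_by_region (
    region text primary key,
    total  numeric not null
);

insert into sales_by_region
select region, sum(amount) from sales group by region;

-- Row-level trigger function that applies each change to the summary.
create function sales_by_region_maintain() returns trigger
language plpgsql as $$
begin
    -- Remove the old row's contribution.
    if tg_op in ('UPDATE', 'DELETE') then
        update sales_by_region
        set total = total - old.amount
        where region = old.region;
    end if;

    -- Add the new row's contribution, creating the region if needed.
    if tg_op in ('INSERT', 'UPDATE') then
        insert into sales_by_region (region, total)
        values (new.region, new.amount)
        on conflict (region)
        do update set total = sales_by_region.total + excluded.total;
    end if;

    return null;  -- the return value is ignored for AFTER row-level triggers
end;
$$;

create trigger sales_by_region_sync
after insert or update or delete on sales
for each row
execute procedure sales_by_region_maintain();
```

This sketch leaves a region in place with a total of 0 once its last sale is deleted. Row-level triggers also do not fire on TRUNCATE, and dropping and recreating the base table removes the trigger altogether, so overwriting the whole table would bypass this mechanism.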

Using views in postgresql to enable transparent replacement of backing tables

We have a view that aggregates from a backing table. The idea is to reduce cpu load by using a pre-aggregated table, and to periodically refresh it with the following:
create table new_backing_table as <aggregating query>; -- create it and fill it
begin;
drop table backing_table;
alter table new_backing_table rename to backing_table;
commit;
while in production. The latency caused by the refresh interval is acceptable. Incremental updates are possible but not desirable.
Does anyone have a comment on this scheme?
Check out materialized views. This may suit your use case. It can be used to store query results at creation then refreshed at a later time.
A materialized view is defined as a table which is actually physically stored on disk, but is really just a view of other database tables. In PostgreSQL, like many database systems, when data is retrieved from a traditional view it is really executing the underlying query or queries that build that view.
https://www.postgresql.org/docs/9.3/static/sql-creatematerializedview.html

Is it possible to have indexes on non-materialized views?

In PostgreSQL, can I have an index on a non-materialized view?
I'm using a view in my application and it basically works well, but I'd like to speed up access to its data. I could switch to a materialized view, but I don't want to have to refresh it.
No
From http://postgresql.nabble.com/Indexes-not-allowed-on-read-only-views-Why-td4812152.html
in postgres, views are essentially macros, thus there is no data to index
and
A normal (non-materialized) view doesn't have any data of its own, it
pulls it from one or more other tables on the fly during query
execution. The execution of a view is kind of similar to a
set-returning function or a subquery, almost as if you'd substituted
the view definition into the original query.
That means that the view will use any indexes on the original
table(s), but there isn't really even an opportunity to check for
indexes on the view its self because the view's definition is
effectively substituted into the query. If the view definition is
complex enough that it does a lot of work where indexes on the
original table(s) don't help, that work has to be done every time.
and
What you CAN do is use triggers to maintain your own materialized
views as regular tables, and have indexes on the tables you maintain
using triggers. This is widely discussed on the mailing list and isn't
hard to do, though it's tricky to make updates perform well with some
kinds of materialized view query.
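To make the quoted point concrete, here is a sketch with hypothetical names: the index is created on the base table, and a query through the view can use it because the view definition is expanded into the query before planning.

```sql
-- Hypothetical base table and plain (non-materialized) view.
create table events (
    id         bigserial primary key,
    user_id    int not null,
    created_at timestamptz not null default now()
);

create view recent_events as
select id, user_id, created_at
from events
where created_at > now() - interval '7 days';

-- The index goes on the underlying table, not on the view.
create index events_user_id_idx on events (user_id);

-- The plan is built against events, so events_user_id_idx is a candidate.
-- Whether the planner actually picks it depends on table size and statistics.
explain
select * from recent_events where user_id = 42;
```

If the view does heavy work that no base-table index can help with, such as a large aggregation, this will not speed it up, and the trigger-maintained table described in the last quote is the alternative.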

Inserts to indexed views

Greetings Overflowers,
Is there an SQL DBMS that allows me to create an indexed view into which I can insert new rows without modifying the original tables of the view? I will need to query this view after performing the in-view-only inserts. If the answer is no, what other methods can do the job? I simply want to merge a set of rows that comes from another server with the set of rows in the created view, in a specific order, to be able to perform fast queries against the merged set (i.e. the indexed view) without having to persist the received set to disk. I am not sure whether an in-memory database would perform well as the merged sets grow very large.
What do you think, guys?
Kind regards
Well, there's no supported way to do that, since the view has to be based on some table(s).
Besides that, indexed views are not meant to be used like that. You don't have to push any data into the indexed view thinking that you will make data retrieval faster.
I suggest you keep your view just the way it is. And then have a staging table, with the proper indexes created on it, in which you insert the data coming from the external system.
The staging table should be truncated anytime you want to get rid of the data (so right before you're inserting new data). That should be done in a SNAPSHOT ISOLATION transaction, so your existing queries don't read dirty data, or deadlock.
Then you have two options:
Use a UNION ALL clause to merge the results from the view and the staging table when you want to retrieve your data.
If the staging table shouldn't be merged, but inner joined, then you perhaps can integrate it in the indexed view.
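The first option might look like the sketch below. All names are hypothetical, and the statements are written in the same PostgreSQL-style SQL as the rest of this page, so SQL Server specifics such as the indexed view definition and batch separators would need adjusting.

```sql
-- Hypothetical base table and the existing view over it.
create table products (
    product_id int primary key,
    name       varchar(100) not null
);

create view product_view as
select product_id, name
from products;

-- Staging table for rows received from the other server,
-- with its own index to keep lookups fast.
create table staging_products (
    product_id int not null,
    name       varchar(100) not null
);

create index staging_products_product_id_idx
    on staging_products (product_id);

-- Merge both sets at query time, in a specific order.
select product_id, name from product_view
union all
select product_id, name from staging_products
order by product_id;
```

UNION ALL keeps duplicates and skips the de-duplication step that UNION performs, which is usually what you want when the two sets are known not to overlap.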