If I want to load test against a PostgreSQL table on an index, will the shared buffer space or any other memory component that PostgreSQL uses cache the data or the query plan?
I found this resource but it didn't really answer my question:
https://www.postgresql.fastware.com/blog/back-to-basics-with-postgresql-memory-components
There is no shared memory area in PostgreSQL where plans are cached.
Normally, execution plans are not cached at all; they have to be generated anew each time a query is run.
There are two exceptions where execution plans are cached in a database session (but not across sessions):
The plans of prepared statements are cached.
The plans of SQL statements run from PL/pgSQL functions are cached (except for dynamic SQL executed with EXECUTE).
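For illustration, a minimal sketch of the first exception (the table and column names are hypothetical); note that no view exposes the cached plan itself:

    -- The plan of a prepared statement is cached within the session.
    PREPARE find_user (int) AS
        SELECT * FROM users WHERE id = $1;

    EXECUTE find_user(42);  -- planned on first use
    EXECUTE find_user(43);  -- may reuse a cached generic plan

    -- List prepared statements for this session:
    SELECT name, statement FROM pg_prepared_statements;

As for the data itself: table and index blocks read during a load test are cached in shared buffers (and the OS page cache), so repeated runs will typically be served from memory rather than disk.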
We have a project where we let users execute workflows based on a selection of steps.
Basically, each step is linked to an execution, and an execution can be linked to one or more executionData rows (the data created or updated during that execution for that step, stored as a blob in Postgres).
Today, we execute this through a queuing mechanism where executions are created in queues and workers do the executions and create the next job in the queue.
But this architecture and our implementation make our Postgres database slow when multiple jobs are scheduled at the same time:
We are basically always creating and reading from the execution table (we create the execution to be scheduled, we read the execution when starting the job, we update the status when the job is finished)
We are basically always creating and reading from the executionData table (we add and update executionData during executions)
We have the following issues:
Our executionData table is growing very fast, and it's almost impossible to remove rows because there are constantly locks on the table. What could we do to avoid that? Is Postgres a good fit for that kind of data?
Our execution table is also growing very fast, and it impacts overall throughput because every execution requires us to create, read, and update a row. Deleting rows is likewise almost impossible. What could we do to improve this? Use a historical table? Suggestions?
We also need to compute statistics on the total executions run and the data saved; these queries hit the same tables and slow the process down further.
We use RDS on AWS for our Postgres database.
Thanks for your insights!
Try going for a faster database architecture. Your use case seems well suited to a DynamoDB architecture for your executions. You can get O(1) lookup performance, and the blob storage can fit right into the record as long as you can keep it under DynamoDB's 400 KB item size limit.
I have taken over an existing MVC website which uses entity framework and hangfire and is hosted on Azure and uses Azure database. Every so often the website times out.
I'm new to Azure portal, entity framework and hangfire.
If I increase the DTUs, the timeout issues clear up.
I'm looking for ways of how to diagnose why the website times out. I have added error logging using elmah and checked hangfire but this doesn't give me any further information.
Is there anything in azure portal that can help?
If it "times out" and if "increasing DTU resolves timeouts" and these observations are true (I think it's on you to really convince yourself this is absolutely true, don't make this assumption lightly) then the usual and obvious candidate is "a slow sql query". Entity Framework is often used with linq to create sql queries without writing sql. These queries are often fine for very simple tasks, such as someData.Where(x=>x.Id == 1).First(), however, if linq is used to join tables, or create complex associations, the generated sql can become monstrously bad, from a performance perspective. You can add logging to write out the sql generated by linq, or you can try to trace the database to see what sql is running on it. If tracing is out of the question, there are still meta queries you can use to view things like cached query plans and SQL Server can give you estimated costs and cached execution counts.
You can still hang yourself without using LINQ, and you can still use stored procedures with EF. Way too many developers are still naive about SQL performance; you need to comb over your back end, learn the schema and the stored procedures, and inspect the SQL contents of everything. Check for any database triggers (easy to miss). Red flags are subqueries, too many joins, too many results from a query, lots of string manipulation in a query, joining tables on strings, or XML/JSON-based SQL work.
Be aware that slow SQL queries become slower when load is high, and when slow queries pile up, they only take more time to resolve. This can also cause debilitating table locking, depending on the nature of the query.
But queries can be performant and still cause locking, i.e. one table is written to often and it blocks other writes or reads from that table. This is a little harder to diagnose, but you can figure it out by carefully inspecting logs of database calls and how long they take to execute. There are also SQL queries you can run on the database to diagnose long-running queries or see what is blocked at a given point in time; one such sketch follows.
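As a starting point, a hedged sketch of such a diagnostic query (SQL Server and Azure SQL Database): it lists currently running requests, how long they have been running, and which session (if any) is blocking them:

    SELECT
        r.session_id,
        r.blocking_session_id,           -- non-zero means this request is blocked
        r.wait_type,
        r.wait_time AS wait_time_ms,
        r.total_elapsed_time AS elapsed_ms,
        st.text AS running_sql
    FROM sys.dm_exec_requests AS r
    CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
    WHERE r.session_id <> @@SPID         -- exclude this diagnostic session itself
    ORDER BY r.total_elapsed_time DESC;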
Finally, check for any back-end webjobs for your application. If timeouts occur on recurring days or at recurring times, then somebody's batch SQL could be blocking your production database from being read.
But this is all speculation. I think you need to do more research to determine what is actually causing the site to become unresponsive. If you can log response times for common queries, you can rule SQL-based latency in or out as the culprit and work from there. There's nothing inherently "amiss" about any of the technologies you specified.
If queries are performant but still causing issues, a long-term solution is to add something like a message queue and batch your SQL work intelligently, or just make the database work asynchronous so it doesn't block the UI.
You should correlate any logged timeouts with Azure's monitoring. Azure can give you CPU/RAM/page visits and such on the dashboard.
SQL Azure is a bit of a different beast. It doesn't have the on-demand performance of a dedicated DB unless you're prepared to throw serious $$ at it. And even then ...
EF, when written well, can perform quite well. When written poorly it can be a dog, and those problems are compounded on a platform like SQL Azure.
The first thing is to check that your EF contexts are set up to use an execution strategy suited to Azure: https://learn.microsoft.com/en-us/ef/ef6/fundamentals/connection-resiliency/retry-logic
The next thing would be to see what kinds of SQL tracing you can run on Azure. Tracing is essential to see what EF is doing behind the scenes. I'm not familiar with the tools available for Azure; in my case, my Azure experience was running SQL Server on VMs, because SQL Azure was too immature, not HIPAA compliant at the time, and expensive for the DTU estimates we were able to get. Worst case, can you restore a database backup into a SQL Server instance and temporarily point a copy of your application environment at it to run through common usage scenarios? Using a SQL trace, you can pick up on exactly when and how often EF is executing queries, and what queries it is executing.
Things to look at:
How many queries are running? If you are loading a set of records and expect one query, are there a whole heap of queries getting sent? This would indicate lazy-load calls being triggered.
What queries are being run? Is it selecting a lot more fields than are being displayed? This is potentially a case where entire entities are being loaded where a .Select() could be used to reduce the amount of data. It may even be a case where entire sets of entities are loaded that aren't relevant to what is displayed or done, such as someone using .ToList() prior to just doing a .Count() or .Any(), or doing a .FirstOrDefault() just to perform a != null check.
Is the database properly indexed? Copy some of the heavier queries into SQL Server Management Studio and execute them with an actual execution plan enabled. Are there indexing suggestions? (You can also pull the same suggestions from the missing-index DMVs; see the sketch below.)
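A hedged sketch of querying those missing-index DMVs (SQL Server and Azure SQL Database); treat the output as hints to evaluate, not indexes to create blindly:

    SELECT TOP (10)
        mid.statement AS table_name,
        mid.equality_columns,
        mid.inequality_columns,
        mid.included_columns,
        migs.user_seeks,
        migs.avg_user_impact
    FROM sys.dm_db_missing_index_details AS mid
    JOIN sys.dm_db_missing_index_groups AS mig
        ON mig.index_handle = mid.index_handle
    JOIN sys.dm_db_missing_index_group_stats AS migs
        ON migs.group_handle = mig.index_group_handle
    ORDER BY migs.avg_user_impact * migs.user_seeks DESC;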
The common sins of developing with EF and other ORMs boil down to "pulling too much, too often." It's surprising how many clients I've worked with have development teams that have not used a profiler to inspect their ORM use efficiency. (and I'm talking 0% so far.)
Conclusion
You cannot. Microsoft explicitly states "you cannot manually remove an execution plan from the cache" in the article 'Understanding the Procedure Cache on SQL Azure'.
Original Question
On SQL Server, a single execution plan can be deleted from the cache using DBCC FREEPROCCACHE(plan_handle varbinary(64)). There is different documentation about DBCC FREEPROCCACHE on SQL Azure. It seems that it removes all cached execution plans from all compute nodes and/or control nodes (whatever those nodes might be, I don't know). I do not understand why SQL Azure would differ from the Server version in this respect.
However, I am looking for a way to delete a single execution plan from the cache on Azure. Is there any way to delete a single execution plan on Azure? Maybe using a query instead of a DBCC?
There is no way to remove a single execution plan from the cache.
If your execution plan is related to only one table or a few tables, and you are OK with the cache for those tables being removed as well, then you can alter the table: add a column and then drop it. This will force-flush the cache; see the sketch below.
Changing the schema of a table causes a cache flush (not a single plan, but all plans) for the tables involved.
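A hedged sketch of that workaround (the table and dummy column names are hypothetical; any schema change on the table invalidates its cached plans):

    ALTER TABLE dbo.Orders ADD plan_flush_dummy int NULL;
    ALTER TABLE dbo.Orders DROP COLUMN plan_flush_dummy;
    -- All cached plans referencing dbo.Orders are now invalidated
    -- and will be recompiled on next use.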
I do not understand why SQL Azure would differ from the Server version in this respect.
This has to do with database-as-a-service as the offering: you are offered a database (which may sit on some server with multiple other databases), and some DBCC commands affect the whole instance, so they more or less banned all DBCC commands. There is a newer offering called Managed Instance (which is the same as an on-premises server, but with the high availability of Azure SQL Database); you may want to check that as well.
I am trying to figure out whether PostgreSQL query execution plans are stored somewhere (possibly complementary to pg_stat_statements and pg_prepared_statements) in a way that makes them available for longer than the duration of the session. I understand that PREPARE does cache a SQL statement, visible in pg_prepared_statements, though the plan itself does not seem to be available in any view as far as I can tell.
I am not sure there is a doc explaining the life cycle of a query plan in PostgreSQL, but from what the EXPLAIN documentation suggests, PostgreSQL does not cache query plans at all. Is this accurate?
Thanks!
PostgreSQL has no shared storage for execution plans, so they cannot be reused across database sessions.
There are two ways to cache an execution plan within a session:
Use prepared statements with the SQL statements PREPARE and EXECUTE. The plan will be cached for the lifetime of the prepared statement, usually until your session ends.
Use PL/pgSQL functions. The plans for all static SQL statements (that is, statements that are not run with EXECUTE) in such a function are cached for the lifetime of the session; see the sketch below.
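A minimal sketch of the second case (the function and table names are hypothetical); a statement run via EXECUTE inside the same function would be re-planned on every call:

    CREATE FUNCTION get_user_name(p_id int) RETURNS text
    LANGUAGE plpgsql AS
    $$
    DECLARE
        v_name text;
    BEGIN
        -- Static SQL: its plan is cached for the rest of the session.
        SELECT name INTO v_name FROM users WHERE id = p_id;
        RETURN v_name;
    END;
    $$;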
Other than that, execution plans are not cached in PostgreSQL.
I want to get to know the real execution time of my query with different hints and without them. But Oracle caches the data after the first execution, so the second time the query executes quickly. How can I clear this cache after each query execution?
ALTER SYSTEM FLUSH BUFFER_CACHE
More details in the manual:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_2013.htm#i2053602
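If the aim is also to force the statement to be re-parsed and re-optimized (not just to re-read data blocks from disk), the shared pool can be flushed as well; note that both flush commands affect the whole instance, so avoid them on shared systems:

    -- In addition to the buffer cache flush above:
    ALTER SYSTEM FLUSH SHARED_POOL;  -- discards cached parsed statements and plans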