Is it possible to delete a single execution plan from cache on Azure SQL DB? - tsql

Conclusion
You cannot. Microsoft states explicitly that "you cannot manually remove an execution plan from the cache" in the article 'Understanding the Procedure Cache on SQL Azure'.
Original Question
On SQL Server, a single execution plan can be deleted from the cache using [DBCC FREEPROCCACHE(plan_handle varbinary(64))][1]. The [documentation about DBCC FREEPROCCACHE on SQL Azure][2] is different: it seems to remove all cached execution plans from all compute nodes and/or control nodes (whatever those nodes might be, I don't know). I do not understand why the Azure version of SQL Server would differ in this respect.
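For reference, this is roughly what that looks like on a regular SQL Server instance (a minimal sketch; the query filter and the plan handle are placeholders you would substitute with your own values):

```sql
-- Find the plan_handle of the plan you want to evict.
SELECT cp.plan_handle, cp.usecounts, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%some fragment of the offending query%';

-- Remove just that plan from the cache (paste the handle returned above).
DBCC FREEPROCCACHE (0x060006001ECA270EC0215D05000000000000000000000000);
```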
However, I am looking for a way to delete a single execution plan from the cache on Azure. Is there any way to do that, perhaps with a query instead of a DBCC command?

There is no way to remove a single execution plan from the cache.
If your execution plan is related to only one table (or a few tables), and you are OK with the cached plans for those tables being removed as well, you can alter the table: add a non-null column and then drop it again. This force-flushes the cached plans, as sketched below.
Changing the schema of a table causes a cache flush (not a single plan, but all plans) for the tables involved.
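A rough sketch of that workaround in T-SQL (the table and column names are placeholders):

```sql
-- Adding and dropping a throwaway column bumps the table's schema version,
-- which invalidates every cached plan that references the table.
ALTER TABLE dbo.MyTable
    ADD PlanFlushDummy int NOT NULL
        CONSTRAINT DF_PlanFlushDummy DEFAULT (0);

ALTER TABLE dbo.MyTable
    DROP CONSTRAINT DF_PlanFlushDummy;

ALTER TABLE dbo.MyTable
    DROP COLUMN PlanFlushDummy;
```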
"I do not understand why the Azure version of SQL Server would differ in this respect."
This has to do with the database-as-a-service offering: you are sold a database (which may sit on a server with many other databases), and some DBCC commands affect the whole instance, so most DBCC commands are restricted or unavailable. There is a newer offering called Managed Instance (essentially the same as an on-premises server, but with the high availability of Azure SQL Database); you may want to check that out as well.

Related

Architecture to be able to have a lot of SQL calls on the same tables for a workflow execution

We have a project where we let users execute workflows based on a selection of steps.
Basically each step is linked to an execution, and an execution can be linked to one or multiple executionData entries (the data created or updated during that execution for that step, stored as a blob in Postgres).
Today, we execute this through a queuing mechanism where executions are created in queues and workers do the executions and create the next job in the queue.
But this architecture and our implementation make our Postgres database slow when multiple jobs are scheduled at the same time:
We are basically always creating and reading from the execution table (we create the execution to be scheduled, we read the execution when starting the job, we update the status when the job is finished)
We are basically always creating and reading from the executionData table (we add and update executionData during executions)
We have the following issues:
Our executionData table is growing very fast, and it's almost impossible to remove rows because there are constantly locks on the table => what could we do to avoid that? Is Postgres a good fit for that kind of data?
Our execution table is growing very fast as well, and it impacts the overall execution, because to execute we need to create, read and update executions. Deleting rows is likewise almost impossible... => what could we do to improve this? Use a historical table? Suggestions?
We also need to compute statistics on the total executions run and the data saved; these queries hit the same tables, which slows down the process further.
We use RDS on AWS for our Postgres database.
Thanks for your insights!
Try going for a faster database architecture. Your use case seems well suited to a DynamoDB design for your executions: you can get O(1) key lookups, and the blob can fit right into the record as long as you keep it under DynamoDB's 400 KB item size limit.

Does the shared buffer space in PostgreSQL cache index queries?

If I want to load test queries against a PostgreSQL table that use an index, will shared buffers or any other memory component that PostgreSQL uses cache the data or the query plan?
I found this resource but it didn't really answer my question:
https://www.postgresql.fastware.com/blog/back-to-basics-with-postgresql-memory-components
There is no shared memory area in PostgreSQL where plans are cached.
Normally, execution plans are not cached at all, they have to be generated again whenever a query is run.
There are two exceptions where execution plans are cached in a database session (but not across sessions):
The plans of prepared statements are cached.
The plans of SQL statements run from PL/pgSQL functions are cached (except for dynamic SQL executed with EXECUTE).
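A small illustration of the first exception, using a throwaway table (everything here is hypothetical and only meant to show the per-session behaviour):

```sql
CREATE TEMP TABLE t (id int PRIMARY KEY, payload text);

-- The plan for a prepared statement is cached within the current session.
PREPARE get_row (int) AS
    SELECT * FROM t WHERE id = $1;

EXECUTE get_row(1);   -- early executions are planned with the concrete parameter value
EXECUTE get_row(2);   -- after several executions PostgreSQL may switch to a cached generic plan

-- Shows whether a generic (cached) plan is being used for this execution.
EXPLAIN (ANALYZE) EXECUTE get_row(1);

-- The cached plan lives only as long as the prepared statement / session.
DEALLOCATE get_row;
```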

Azure Database, EF, Time out issues

I have taken over an existing MVC website which uses Entity Framework and Hangfire and is hosted on Azure with an Azure SQL database. Every so often the website times out.
I'm new to the Azure portal, Entity Framework and Hangfire.
Increasing the DTUs clears the timeout issues.
I'm looking for ways to diagnose why the website times out. I have added error logging using ELMAH and checked Hangfire, but neither gives me any further information.
Is there anything in the Azure portal that can help?
If it "times out" and if "increasing DTU resolves timeouts" and these observations are true (I think it's on you to really convince yourself this is absolutely true, don't make this assumption lightly) then the usual and obvious candidate is "a slow sql query". Entity Framework is often used with linq to create sql queries without writing sql. These queries are often fine for very simple tasks, such as someData.Where(x=>x.Id == 1).First(), however, if linq is used to join tables, or create complex associations, the generated sql can become monstrously bad, from a performance perspective. You can add logging to write out the sql generated by linq, or you can try to trace the database to see what sql is running on it. If tracing is out of the question, there are still meta queries you can use to view things like cached query plans and SQL Server can give you estimated costs and cached execution counts.
You can still hang yourself without using LINQ, and you can still use stored procedures with EF. Far too many developers are still naive about SQL performance; you need to comb over your back end, learn the schema and the stored procedures, and inspect the SQL contents of everything. Check for any database triggers (easy to miss). Red flags are subqueries, too many joins, too many results from a query, lots of string manipulation in a query, joining tables on strings, and XML/JSON-based SQL work.
Be aware that slow SQL queries become slower when load is high, and when slow queries pile up they only take more time to resolve. Depending on the nature of the query, this can also cause debilitating table locking.
But queries can be performant and still cause locking, e.g. one table is written to often and that blocks other writes or reads on it. This is a little harder to diagnose, but you can figure it out by carefully inspecting logs of database calls and how long they take to execute. There are also SQL queries you can run on the database to find long-running queries or to see which tables are locked at a given point in time.
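As a starting point, a sketch of such a diagnostic query (these DMVs are available on both SQL Server and Azure SQL Database):

```sql
-- Currently executing requests, how long they have been running, and what is blocking them.
SELECT r.session_id,
       r.status,
       r.blocking_session_id,   -- non-zero means this request is waiting on another session
       r.wait_type,
       r.wait_time,
       r.total_elapsed_time,
       st.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE r.session_id <> @@SPID
ORDER BY r.total_elapsed_time DESC;
```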
Finally, check for any back-end webjobs for your application. If timeouts occur on recurring days or at recurring times, somebody's batch SQL could be blocking your production database from being read.
But this is all speculation. I think you need to do more research to determine what is actually causing the site to become unresponsive. If you can log response times for common queries, you can rule SQL-based latency in or out as the culprit and work from there. There's nothing inherently amiss about any of the technologies you listed.
If the queries are performant but still causing issues, a long-term solution is to add something like a message queue and batch your SQL work intelligently, or simply make the database work asynchronous so it doesn't block the UI.
You should also correlate any logged timeouts with Azure's monitoring; the dashboard can give you CPU, RAM, page visits and so on.
SQL Azure is a bit of a different beast. It doesn't have the on-demand performance of a dedicated DB unless you're prepared to throw serious $$ at it. And even then...
EF, when used well, can perform quite well. When used poorly, it can be a dog, and those problems are compounded on a platform like SQL Azure.
The first thing is to check that your EF contexts are set up to use an execution strategy suited to Azure: https://learn.microsoft.com/en-us/ef/ef6/fundamentals/connection-resiliency/retry-logic
The next thing would be to see what kinds of SQL tracing you can run on Azure. Tracing is essential to see what EF is doing behind the scenes. I'm not familiar with the tools available for Azure; in my case my Azure experience was running SQL Server on VMs, because SQL Azure was too immature, not HIPAA-compliant at the time, and expensive for the DTU estimates we were able to get. Worst case, can you restore a database backup into a SQL Server instance and temporarily point a copy of your application environment at it to run through common usage scenarios? Using a SQL trace you can pick up exactly when and how often EF is executing queries, and what queries it is executing.
Things to look at:
How many queries are running? If you are loading a set of records and expect one query, is a whole heap of queries being sent instead? That would indicate lazy-load calls being triggered.
What queries are being run? Are they selecting far more fields than are being displayed? That is potentially a case where entire entities are being loaded when a .Select() could reduce the amount of data, or even where entire sets of entities are loaded that aren't relevant to what is displayed or done, such as calling .ToList() just before a .Count() or .Any(), or using .FirstOrDefault() just to do a != null check.
Is the database properly indexed? Copy some of the heavier queries into SQL Server Management Studio and run them with an execution plan. Are there indexing suggestions? (A sketch of a missing-index query follows this list.)
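A hedged sketch of a query over the missing-index DMVs (treat the output as hints to review, not indexes to create blindly):

```sql
-- Missing-index suggestions recorded by the engine since the last restart.
SELECT mid.statement          AS table_name,
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migs.user_seeks,
       migs.avg_user_impact   -- estimated % improvement for the affected queries
FROM sys.dm_db_missing_index_details AS mid
JOIN sys.dm_db_missing_index_groups AS mig
  ON mig.index_handle = mid.index_handle
JOIN sys.dm_db_missing_index_group_stats AS migs
  ON migs.group_handle = mig.index_group_handle
ORDER BY migs.user_seeks * migs.avg_user_impact DESC;
```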
The common sins of developing with EF and other ORMs boil down to pulling too much data, too often. It's surprising how many of the clients I've worked with have development teams that have never used a profiler to inspect the efficiency of their ORM use (and I'm talking 0% so far).

Why does Azure Database perform better with transactions

We decided to use a micro-ORM against an Azure SQL database. As our business only needs inserts and selects, we decided to suppress all code-managed SqlTransactions (there are no concurrency issues on the data).
Then we noticed that our Azure database instance responded very slowly: the "rpc completed" event occurred with delays hundreds of times longer than the time needed to run a simple SQL statement.
Next, we benchmarked our code with EF6 and saw that the server responded very quickly. As EF6 implements a built-in transaction, we decided to restore the SqlTransaction (ReadCommitted) in the micro-ORM, and everything was fine again.
Does Azure SQL Database require an explicit SqlTransaction (managed by code)? How does the SqlTransaction influence Azure SQL Database performance? Why was it implemented that way?
EDIT: I am going to post more precise information about the way we collected the traces. It seems our Azure event logs sometimes report times in nanoseconds and sometimes in milliseconds, which seems odd.
If I understand what you are asking correctly: batching multiple SQL statements into one transaction will give you better results on any DBMS. Committing after every insert/update/delete has a huge overhead on a DBMS that is not designed for it (like MyISAM on MySQL). A sketch of the difference follows below.
It can even cause bad flushes to disk and thrashing if you do too much of it. I once had a programmer committing thousands of entries to one of my DBs every minute, each as its own transaction, and it brought the server to a halt.
InnoDB, one of the two most popular storage engines for MySQL, can only commit a limited number of transactions per second (20-30, or maybe it was 2-3; it's been a long time), because each one is flushed to disk at the end for ACID compliance.
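A T-SQL sketch of that difference (the table is made up; the two variants are alternatives, not meant to run back to back):

```sql
-- Variant 1: one implicit transaction per statement, so every INSERT pays for its own log flush.
INSERT INTO dbo.Orders (Id, Amount) VALUES (1, 10.0);
INSERT INTO dbo.Orders (Id, Amount) VALUES (2, 20.0);
-- ...repeated thousands of times

-- Variant 2: all inserts share a single commit, so the log is flushed once.
BEGIN TRANSACTION;
INSERT INTO dbo.Orders (Id, Amount) VALUES (1, 10.0);
INSERT INTO dbo.Orders (Id, Amount) VALUES (2, 20.0);
-- ...
COMMIT TRANSACTION;
```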

Synchronize between an MS Access (Jet / MADB) database and PostgreSQL DB, is this possible?

Is it possible to have an MS Access back-end database (Microsoft Jet or the Access Database Engine) set up so that whenever entries are inserted or updated, those changes are replicated* to a PostgreSQL database?
Two-way synchronization would be nice, but one way would be acceptable.
I know it's popular to link the two and use one as a front end, but it's essential that both be back ends.
Any suggestions?
* ie reflected, synchronized, mirrored
Can you use Microsoft SQL Server Express Edition, or do you have to use the Microsoft Access Database Engine? You will likely have more options with SQL Server Express, such as more complete triggers and logging.
Either way, you're going to need a way to accumulate a log of changed rows from the source database engine, and a program to sync them to PostgreSQL by reading the log and converting it into suitable PostgreSQL INSERT, UPDATE and DELETE statements.
You could do this by having audit triggers in MADB/Express insert a row into an audit shadow table for every "real" table whenever it changes, including inserting special "row deleted" audit entries. Then your sync program could connect to both MADB/Express and PostgreSQL, read the audit tables, apply the changes to PostgreSQL, and empty the audit tables.
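A rough sketch of what the shadow table and trigger could look like on SQL Server Express (all names are placeholders; plain Access/Jet has no such triggers, so this applies only to the Express option):

```sql
-- Shadow table recording every change made to dbo.Customers.
CREATE TABLE dbo.Customers_Audit (
    AuditId    int IDENTITY(1,1) PRIMARY KEY,
    CustomerId int       NOT NULL,
    Operation  char(1)   NOT NULL,  -- 'I' = insert, 'U' = update, 'D' = delete
    ChangedAt  datetime2 NOT NULL DEFAULT SYSUTCDATETIME()
);
GO

CREATE TRIGGER trg_Customers_Audit
ON dbo.Customers
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Inserted and updated rows appear in the "inserted" pseudo-table.
    INSERT INTO dbo.Customers_Audit (CustomerId, Operation)
    SELECT i.CustomerId,
           CASE WHEN EXISTS (SELECT 1 FROM deleted d WHERE d.CustomerId = i.CustomerId)
                THEN 'U' ELSE 'I' END
    FROM inserted AS i;

    -- Deleted rows appear only in the "deleted" pseudo-table.
    INSERT INTO dbo.Customers_Audit (CustomerId, Operation)
    SELECT d.CustomerId, 'D'
    FROM deleted AS d
    WHERE NOT EXISTS (SELECT 1 FROM inserted i WHERE i.CustomerId = d.CustomerId);
END;
```

The sync program would then poll Customers_Audit, replay each row against PostgreSQL as an INSERT, UPDATE or DELETE, and clear the processed audit rows.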
I'll be surprised if you find anything that does this out of the box. It's one area where Microsoft SQL Server has a big advantage, because of all the deep Access and MADB engine integration supporting its synchronisation and integration features.
There are some ETL ("Extract, Transform, Load") tools that might be helpful, like Pentaho and Talend. I don't know whether you can achieve the desired degree of automation with them, though.