Why does Azure Database perform better with transactions - ado.net

We decided to use a micro-orm against an Azure Database. As our business only needs "inserts" and "selects", we decided to suppress all code-managed SqlTransaction (no concurrency issues on data).
Then, we noticed that our instance of Azure Database responded very slowly. The "rpc completed" events took hundreds of times longer than a simple SQL statement needs to run.
Next, we benchmarked our code with EF6 and we saw that the server responded very quickly. As EF6 implements a built-in transaction, we decided to restore the SqlTransaction (ReadCommitted) on the micro-orm and we noticed everything was fine.
Does Azure Database require an explicit SqlTransaction (managed by code)? How does the SqlTransaction influence Azure Database performance? Why was it implemented that way?
EDIT: I am going to post some more precise information about the way we collected traces. It seems our Azure event logs are sometimes expressed in nanoseconds, sometimes in milliseconds. Seems so weird.

If I understand what you are asking correctly, batching multiple SQL queries into one transaction will give you better results on any DBS. Committing after every insert/update/delete has a huge overhead on a DBS that is not designed for it (like MyISAM on MySQL).
It can even cause bad flushes to disk and thrashing if you do too much. I once had a programmer committing thousands of entries to one of my DBs every minute, each as its own transaction, and it brought the server to a halt.
InnoDB, one of the two most popular storage engines for MySQL, can only commit 20-30 transactions a second (or maybe it was 2-3... it's been a long time), as each is flushed to the disk at the end for ACID compliance.
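A minimal sketch of the batching point above. It uses SQLite from Python's standard library as a local stand-in, not Azure SQL or MySQL, and the function name insert_rows and the items table are made up for illustration. It inserts the same rows either with one commit per row or inside a single transaction and reports the elapsed time for each; the absolute numbers depend entirely on your disk and engine.

```python
import os
import sqlite3
import tempfile
import time


def insert_rows(path, rows, batch):
    """Insert rows in one transaction (batch=True) or with one commit per row."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.commit()
    start = time.perf_counter()
    if batch:
        # "with conn" commits once when the block exits without an error.
        with conn:
            conn.executemany("INSERT INTO items (name) VALUES (?)", rows)
    else:
        for row in rows:
            conn.execute("INSERT INTO items (name) VALUES (?)", row)
            conn.commit()  # one commit (and one flush) per row
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    conn.close()
    return count, elapsed


if __name__ == "__main__":
    rows = [("item%d" % i,) for i in range(200)]
    with tempfile.TemporaryDirectory() as tmp:
        for batch in (False, True):
            path = os.path.join(tmp, "batch.db" if batch else "per_row.db")
            count, elapsed = insert_rows(path, rows, batch)
            print("batch=%s rows=%d elapsed=%.4fs" % (batch, count, elapsed))
```

Both modes end up with the same rows; only the number of commits differs, which is the overhead the answer is talking about.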

Related

Azure Database, EF, Time out issues

I have taken over an existing MVC website which uses entity framework and hangfire and is hosted on Azure and uses Azure database. Every so often the website times out.
I'm new to Azure portal, entity framework and hangfire.
If I increase the DTUs, the timeout issues clear up.
I'm looking for ways to diagnose why the website times out. I have added error logging using elmah and checked hangfire, but this doesn't give me any further information.
Is there anything in azure portal that can help?
If it "times out" and if "increasing DTU resolves timeouts" and these observations are true (I think it's on you to really convince yourself this is absolutely true, don't make this assumption lightly) then the usual and obvious candidate is "a slow sql query". Entity Framework is often used with linq to create sql queries without writing sql. These queries are often fine for very simple tasks, such as someData.Where(x=>x.Id == 1).First(), however, if linq is used to join tables, or create complex associations, the generated sql can become monstrously bad, from a performance perspective. You can add logging to write out the sql generated by linq, or you can try to trace the database to see what sql is running on it. If tracing is out of the question, there are still meta queries you can use to view things like cached query plans and SQL Server can give you estimated costs and cached execution counts.
You can still hang yourself without using linq. You can still use stored procedures with EF. Way too many developers are still naive about SQL performance; you need to comb over your back end and learn the schema and the stored procedures, and inspect the sql contents of everything. Check for any database triggers (easy to miss). Red flags are subqueries, too many joins, too many results from a query, lots of string manipulation in a query, joining tables on strings, or XML/JSON-based SQL work.
Be aware that "slow sql queries" will become slower when load is high. And when slow sql queries build up, they only take more time to resolve. This can also cause debilitating table locking, depending on the nature of the query.
But queries can be performant and still cause locking, i.e. one table is being written to often and it's blocking other writes or reads from that table. This is a little harder to diagnose, but you can figure it out by carefully inspecting logs of database calls and how long they take to execute. There are also sql queries you can run on the database to diagnose long-running queries, or what tables are locked at a given point in time.
Finally, check for any back end webjobs for your application. If timeouts occur on recurring days or at recurring times, then somebody's batch SQL could be blocking your production database from being read.
But this is all speculation. I think you need to do more research to determine what is actually causing the site to become unresponsive. If you can log response times for common queries, you can rule out SQL-based latency as being the culprit or not and work from there. There's nothing inherently "amiss" about any of the technologies you specified.
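As a rough illustration of logging response times for queries, here is a small Python sketch. The helper name timed_query, the 200 ms threshold, and the optional sink list are assumptions made up for this example; the original site is .NET, so treat this as the shape of the idea rather than code to drop in.

```python
import logging
import sqlite3
import time
from contextlib import contextmanager

logger = logging.getLogger("query_timing")


@contextmanager
def timed_query(label, slow_ms=200.0, sink=None):
    """Time the wrapped block; log slow calls at WARNING, others at INFO."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if sink is not None:
            sink.append((label, elapsed_ms))  # keep timings for later analysis
        level = logging.WARNING if elapsed_ms >= slow_ms else logging.INFO
        logger.log(level, "%s took %.1f ms", label, elapsed_ms)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    conn = sqlite3.connect(":memory:")  # stand-in for the real database
    with timed_query("select-1"):
        conn.execute("SELECT 1").fetchone()
```

With timings like these written next to each request, you can line them up against the timeouts and see whether the database calls are actually the slow part.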
If queries are performant but still causing issues, a long term solution is to add something like a message queue and batch your sql work intelligently, or just make the database work asynchronous and not block the UI.
You should correlate any logged timeouts with azure's monitoring. Azure can give you CPU/RAM/page visits and such on the dashboard.
SQL Azure is a bit of a different beast. It doesn't have the on-demand performance of a dedicated DB unless you're prepared to throw serious $$ at it. And even then ...
EF, when written for well, can perform quite well. When written for poorly it can be a dog, and those problems are compounded on a platform like SQL Azure.
The first thing is to check that your EF contexts are set up to use an execution strategy suited to Azure: https://learn.microsoft.com/en-us/ef/ef6/fundamentals/connection-resiliency/retry-logic
The next thing would be to see what kinds of SQL tracing you can run on Azure. Tracing is essential to see what EF is doing behind the scenes. I'm not familiar with the tools available for Azure; in my case my Azure experience was running SQL Server on VMs because SQL Azure was too immature, not HIPAA compliant at the time, and expensive for the DTU estimates we were able to get. Worst case, can you restore a database backup into an SQL Server instance and point a copy of your application environment temporarily at that to run through common usage scenarios? Using an SQL Trace you can pick up on exactly when and how often EF is executing queries, and what queries it is executing.
Things to look at:
How many queries are running? If you are loading a set of records and expect one query, are there a whole heap of queries getting sent? This would indicate lazy-load calls being triggered.
What queries are being run? Is it selecting a lot more fields than are being displayed? This would be potentially a case where entire entities are being loaded where a .Select() could be used to reduce the amount of data. Perhaps even the case where entire sets of entities are being loaded that aren't relevant to what is displayed/done, such as cases where someone is using .ToList() prior to just doing a .Count() or .Any() or doing a .FirstOrDefault() just to do a != null check.
Is the database properly indexed? Copy some of the heavier queries into SQL Server Management Studio and execute them with the execution plan enabled. Are there indexing suggestions?
The common sins of developing with EF and other ORMs boil down to "pulling too much, too often." It's surprising how many clients I've worked with have development teams that have not used a profiler to inspect their ORM use efficiency. (and I'm talking 0% so far.)
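To make "pulling too much" concrete, here is a small sketch using SQLite from Python rather than EF. The orders table and the function names are invented for the example. count_wasteful is the analogue of calling .ToList() before .Count(): it drags every column of every matching row back to the client. The other two let the database do the counting or the existence check.

```python
import sqlite3


def make_db(n):
    """Build an in-memory table with n rows spread over 10 customers."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, notes TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, notes) VALUES (?, ?)",
        [("c%d" % (i % 10), "x" * 100) for i in range(n)],
    )
    conn.commit()
    return conn


def count_wasteful(conn, customer):
    # Loads whole rows into memory only to count them.
    rows = conn.execute(
        "SELECT * FROM orders WHERE customer = ?", (customer,)
    ).fetchall()
    return len(rows)


def count_in_db(conn, customer):
    # Only a single integer comes back from the database.
    return conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer = ?", (customer,)
    ).fetchone()[0]


def exists_in_db(conn, customer):
    # Stops at the first match instead of fetching an entity to null-check it.
    row = conn.execute(
        "SELECT 1 FROM orders WHERE customer = ? LIMIT 1", (customer,)
    ).fetchone()
    return row is not None
```

The results are identical; the difference is how much data is read and shipped, which is exactly what a profiler or trace will show you.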

Is it possible to delete a single execution plan from cache on Azure SQL DB?

Conclusion
You cannot. Microsoft explicitly states "you cannot manually remove an execution plan from the cache" in the article 'Understanding the Procedure Cache on SQL Azure'.
Original Question
On SQL Server a single execution plan can be deleted from the cache using DBCC FREEPROCCACHE(plan_handle varbinary(64)). There is different documentation about DBCC FREEPROCCACHE on SQL Azure. It seems that it removes all cached execution plans from all compute nodes and/or control nodes (whatever those nodes might be, I don't know). I do not understand why SQL on Azure or the Server version would differ in this aspect.
However, I am looking for a way to delete a single execution plan from the cache on Azure. Is there any way to delete a single execution plan on Azure? Maybe using a query instead of a DBCC?
There is no way to remove a single execution plan from the cache.
If your execution plan is related to only one or a few tables, and you are OK with removing the cached plans for those tables as well, then you can alter the table: add a non-null column and then remove the column. This will force a flush of the cache.
Changing the schema of a table causes a cache flush (not of a single plan, but of all plans) for the tables involved.
I do not understand why SQL on Azure or the Server version would differ in this aspect.
This has to do with the database being offered as a service: you are given a database (which may live on a server alongside other databases), and some DBCC commands affect the whole instance, so they more or less banned all DBCC commands. There is a newer offering called Managed Instance (which is the same as an on-premises server, but with the high availability of Azure Database); you may want to check that as well.

JOOQ Cannot get autoCommit to a PostgreSQL database

I have the following setup where a service layer, using jooq, contacts a PostgreSQL database.
In this scenario, whenever multiple requests happen quickly one after another (or even not that quickly), I get the following error message:
Internal error processing createItem: Cannot get autoCommit
My queries all run within transactions (using jooq's transactionResult methods).
Searching has not yielded many results, and I do not see why autoCommit should even be enabled in those cases. Is this most likely a configuration issue, or is there something else I can try to troubleshoot this issue better?
I noticed the same problem and message when running massive batch uploads at the limit of physical memory and with a limited number of DB connections (specific to my environment). It would be hard to provide a reproduction case for that, but to me this is a sign of DB performance/memory starvation. Reducing the number of Java execution threads helped in my case.
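The idea of capping worker threads so they cannot outnumber what the database can serve looks roughly like this. The sketch is in Python rather than Java, and ConcurrencyProbe, fake_db_call and run_bounded are hypothetical names; the sleep stands in for a real database call. It is not jOOQ-specific, just the general pattern of a bounded pool.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyProbe:
    """Counts how many calls are in flight at once (stand-in for open DB connections)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def fake_db_call(self, item):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        time.sleep(0.01)  # pretend to wait on the database
        with self._lock:
            self.active -= 1
        return item * 2


def run_bounded(func, items, max_workers):
    """Run func over items with at most max_workers calls in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


if __name__ == "__main__":
    probe = ConcurrencyProbe()
    results = run_bounded(probe.fake_db_call, range(20), max_workers=3)
    print(len(results), "calls done, peak concurrency", probe.peak)
```

Keeping max_workers at or below the connection pool size means no request has to wait on (or fail to obtain) a connection because too many threads are competing for it.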

What's the difference between Heroku's Postgres Continuous Protection vs including a Follower database for integrity and recovery

I'm considering deploying an app to Heroku along with a Postgres Standard database plan. I'm keen on ensuring data integrity and ensuring that in no case can my customers' data be lost if the database becomes corrupted or some other similar issue occurs. I also want to ensure a smooth recovery process in this event. So I have the following questions:
First, I'm assuming that with Continuous Protection there's still a possibility that a database can become corrupted. Is this true?
What provides more integrity, protection, and ease of recovery if a database becomes corrupted: a Standard DB with Continuous Protection or a Standard DB with a Follower DB?
If by chance the DB becomes corrupted, or a database integrity issue arises, how will Heroku remediate (given the database is a "managed" service)? Is it automated, or do I have to work with Support manually to remediate?
I would love to hear your thoughts on this. My experience in the past has been with MySQL but not Postgres, which I hear great things about.
Thanks
Caveat: I have some experience with Postgresql, but I don't have any experience with Heroku as such.
What Heroku calls 'Continuous Protection' and 'follower' databases are implemented using Postgresql's Continuous Archiving and streaming replication functionality. They have provided a range of administrative tools and infrastructure around these functions to make them easier to use.
Both of these functions make use of the fact that Postgresql writes all updates that it is making to databases in a Write-Ahead Log (WAL).
With Continuous Archiving, one takes a complete copy of all of the underlying files in the database - this is referred to as the base backup. One also collects all WAL files produced by the database, both during and after production of the base backup. Note that you do not need to stop the database in order to make the base backup - it is a fairly unobtrusive process.
If the worst happens, and it is necessary to recover the database from the backup, you just restore the base backup, configure the database so it knows where to find the archived WAL files, and start it up. It will then replay the WAL files in sequence until it is fully up to date.
Note that you can also stop the replay early. This can be extremely useful, as you will see in my answer to your first question:
First, I'm assuming with Continuous Protection there's still a possibility that a database can become corrupted. Is this true?
Yes, of course. Database corruption can happen for a number of reasons: hardware failure, a software fault in the database, a fault in your application, or even operator error.
One of the benefits of continuous archiving, though, is that you can replay the WAL files up to a particular point in time, so you can effectively rewind back to the point immediately before the database became corrupted.
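A toy model may help picture what "replay up to a point in time" means. This is not PostgreSQL's actual WAL format or recovery code; WalRecord and replay are invented for the illustration, the "database" is just a dict, and timestamps are plain integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    timestamp: int   # when the change was logged
    key: str         # which "row" it touches
    value: object    # new value, or None to mean "delete"


def replay(base_backup, wal, stop_before=None):
    """Apply WAL records in order on top of a copy of the base backup.

    If stop_before is given, replay halts just before the first record
    at or after that timestamp (the point-in-time "rewind").
    """
    state = dict(base_backup)  # never mutate the backup itself
    for rec in sorted(wal, key=lambda r: r.timestamp):
        if stop_before is not None and rec.timestamp >= stop_before:
            break
        if rec.value is None:
            state.pop(rec.key, None)
        else:
            state[rec.key] = rec.value
    return state
```

If the record at timestamp 2 is the one that wrecked the data, calling replay(base, wal, stop_before=2) rebuilds the state as it was just before that record, while a full replay reproduces the damage.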
As mentioned above, a Follower DB uses Postgresql's 'Streaming Replication' function. With this function, you restore your base backup onto another server, configure it to connect to the master database and fetch WAL files in real time as they are produced. The follower then keeps up to date with any changes made on the master.
What provides more integrity, protection, and ease of recovery if a database becomes corrupted: Standard DB with Continuous Protection or Standard DB with a Follower DB?
Ease of recovery is the difference.
If you have a Follower DB, it is a hot standby - if the master fails for some reason, you can switch your application over to the follower with minimal downtime. On the other hand, if you have a large database and you have to restore it from the last base backup and then replay all the WAL files produced since - well that could take a long time, days even if it was a really large database.
Note, however, that a follower DB will be of no use if your database becomes corrupted due to, for example, an administrator accidentally dropping the wrong table. The table will be dropped in the follower only a few seconds later. They are like lemmings going over a cliff. The same applies if your application corrupts the database due to a bug, or a hacker, or whatever. Even with the follower, you must have a proper backup in place, either a Continuous Archive or a normal pg_dump.
If by chance the DB becomes corrupted, or a database integrity issue arises, how will Heroku remediate (given the database is a "managed" service)? Is it automated or do I have to work with Support manually to remediate?
Their documentation indicates that premium plans do feature automated failover. This would be useful in the event of a hardware or platform failure and most kinds of database failure, where the system can detect that the master database has gone down and initiate a failover.
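In the case where the database becomes corrupted by the application itself (or a hasty admin), I suspect you would have to initiate recovery manually. As noted above, failing over to a follower would not help there, because the follower replicates the damage within seconds.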

How to rollback an update in PostgreSQL

While editing some records in my PostgreSQL database using sql in the terminal (in ubuntu lucid), I made a wrong update.
Instead of -
update mytable set start_time='13:06:00' where id=123;
I typed -
update mytable set start_time='13:06:00';
So, all records are now having the same start_time value.
Is there a way to undo this change? There are some 500+ records in the table, and I do not know what the start_time value for each record was
Is it lost forever?
I'm assuming it was a transaction that's already committed? If so, that's what "commit" means, you can't go back.
Some data may be recoverable if you're lucky. Stop the database NOW.
Here's an answer I wrote on the same topic earlier. I hope it's helpful.
This might be too: Recoved deleted rows in postgresql.
Unless the data is absolutely critical, just restore from backups, it'll be lots easier and less painful. If you didn't have backups, consider yourself soundly thwacked.
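For next time: if the update is run inside an explicit transaction, a mistake like this can still be rolled back before the commit. The sketch below shows the idea with SQLite from Python's standard library as a stand-in for PostgreSQL; make_table and guarded_update are hypothetical helpers, and the sample rows are invented. It checks how many rows the statement touched and only commits if that matches what was expected.

```python
import sqlite3


def make_table():
    """Create a small in-memory copy of the table from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, start_time TEXT)")
    conn.executemany(
        "INSERT INTO mytable (id, start_time) VALUES (?, ?)",
        [(122, "09:00:00"), (123, "10:30:00"), (124, "11:45:00")],
    )
    conn.commit()
    return conn


def guarded_update(conn, sql, params, expected_rows):
    """Run an UPDATE, but commit only if it touched the expected number of rows."""
    cur = conn.execute(sql, params)  # sqlite3 opens a transaction implicitly here
    if cur.rowcount != expected_rows:
        conn.rollback()  # nothing was committed, so the old values come back
        return False
    conn.commit()
    return True
```

In psql the same habit is typing BEGIN; before the UPDATE, looking at the reported row count, and then choosing COMMIT; or ROLLBACK;. None of this helps once the change has been committed, which is the situation in the question.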
If you catch the mistake and immediately bring down any applications using the database and take it offline, you can potentially use Point-in-Time Recovery (PITR) to replay your Write Ahead Log (WAL) files up to, but not including, the moment when the errant transaction was made. This would return the database to the state it was in prior, thus effectively 'undoing' that transaction.
As an approach for a production application database it has a number of obvious limitations, but there are circumstances in which PITR may be the best option available, especially when critical data loss has occurred. However, it is of no value if archiving was not already configured before the corruption event.
https://www.postgresql.org/docs/current/static/continuous-archiving.html
Similar capabilities exist with other relational database engines.