Our app runs on many web servers. The clocks on these web servers can drift apart over time, as is to be expected. The database is a single separate machine with its own clock. We're using EF 5.0 and have a table that needs very precise and consistent times in multiple columns. I would like to be sure the date columns in this table always use the database server's time.
In SQL I would just set the column to GETUTCDATE(). Simple: the date is computed and set on the database server, done. But how can I do this with EF on an insert or update? To be clear, I need the SQL generated by EF to set the column to the function GETUTCDATE() so that the value comes from the database server. I do not want the date being calculated on the web server. Some ideas I've seen and considered, and why they don't work for me:
1) I could use default values on the columns in the schema. But I have many update scenarios where I also need consistent dates, not just inserts.
2) I could use triggers in the database. But we currently have zero logic in our database (we are using an ORM, after all) and I don't want to set that precedent if I can avoid it. It is also tricky to determine on the database side when these columns should be updated.
3) I can get the database server time manually (a separate query, as in the example below), set the column to that value, then do the update. But this is very inefficient, as it requires an extra call to the database. In a tight loop this is way too much overhead. Plus the time is now less accurate, since I fetched it milliseconds earlier, though it is at least consistent.
// Entity SQL canonical function, evaluated on the database server.
objectContext.CreateQuery<DateTime>("CurrentUtcDateTime()").First();
So what is the right way to do this? Or is it even possible to make EF do the right thing here?
This question really boils down to: can I tell EF to get the date/time from the DB / underlying provider? As far as I know, this isn't possible with EF statements alone, no.
You should run a simple SQL statement first to get the database time.
See the T-SQL GETDATE / GETUTCDATE documentation and choose the preferred date function.
// Ask the database server for its own clock; SqlQuery maps the scalar result.
DateTime serverDate = context.Database
    .SqlQuery<DateTime>("SELECT GETUTCDATE();")
    .First();
Now use serverDate in your EF LINQ statement.
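If the extra round trip is what hurts, a hedged alternative is to push the whole update to the server with ExecuteSqlCommand, so GETUTCDATE() is evaluated on SQL Server in a single call. This bypasses EF's change tracking for that one statement, and the table and column names below are hypothetical:

// Single round trip: the timestamp is computed by the database server,
// not the web server. "Orders", "Status" and "ModifiedUtc" are placeholders.
context.Database.ExecuteSqlCommand(
    "UPDATE Orders SET Status = @p0, ModifiedUtc = GETUTCDATE() WHERE OrderId = @p1",
    newStatus, orderId);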
Related
I want to periodically export data from db2 and load it in another database for analysis.
In order to do this, I would need to know which rows have been inserted/updated since the last time I've exported things from a given table.
A simple solution would probably be to add a timestamp column to every table and use that as a reference, but I don't have such a column at the moment, and I would like to avoid adding one if possible.
Is there any other solution for finding the rows which have been added/updated after a given time (or something else that would solve my issue)?
There is an easy option for a timestamp in Db2 (for LUW) called
ROW CHANGE TIMESTAMP
This is managed by Db2 and can be defined as HIDDEN, so existing SELECT * FROM queries will not retrieve the new column, which would otherwise cause extra costs.
Check out the Db2 CREATE TABLE documentation
This functionality was originally added for optimistic locking but can be used for such situations as well.
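A minimal sketch of what this looks like (table and column names are hypothetical; check the CREATE/ALTER TABLE docs for the exact clause order on your Db2 version):

-- Db2-maintained change timestamp; IMPLICITLY HIDDEN keeps it out of SELECT *
ALTER TABLE orders
  ADD COLUMN changed_at TIMESTAMP NOT NULL
  IMPLICITLY HIDDEN
  GENERATED ALWAYS FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP;

-- Export only the rows inserted or updated since the last run
SELECT * FROM orders
WHERE ROW CHANGE TIMESTAMP FOR orders > '2024-01-01-00.00.00';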
There is a similar concept for Db2 for z/OS; you would have to check that out yourself, as I have not tried it.
Of course, there are other ways to solve it, like replication, etc.
That is not possible if you do not have a timestamp column. With a timestamp, you can tell which rows are new or modified.
You can also use the Time Travel (temporal tables) feature in order to get the new values, but that also implies a timestamp column.
Another option is to put the tables in append mode and then get the rows after a given one. However, this option is not reliable after a reorg, and it affects performance and space utilisation.
One possible option is to use SQL replication, but that needs extra tables for staging.
Finally, another option is to read the logs with the db2ReadLog API, but that requires custom development. Also, just applying the archived logs to the new database is possible; however, the database will then remain in roll-forward pending state.
I am new to SSIS and am after some assistance in creating an SSIS package to do a specific task. My data is stored remotely within a MySQL database and is downloaded to a SQL Server 2014 database. What I want to do is this: create a package where I can enter two dates to be compared against the created/modified date on each record in a number of tables, giving me a snapshot that compares the MySQL data to the SQL Server data, so I can see whether any rows are missing from my local SQL Server database or need to be updated. Some tables have no dates, so for those I just want a record count of what, if anything, differs between the two. If this is better achieved through T-SQL, I am happy to hear other suggestions, or about sites where similar things have been done.
In relation to your query, Tab:
"Hi Tab, What happens at the moment is our master data is stored in a MySQL Database, the data was then downloaded to a SQL Server Database as a one off. What happens at the moment is I have a SSIS package that uses the MAX ID which can be found on most of the tables to work out which records are new and just downloads them or updates them. What I want to do is run separate checks on the tables to make sure that during the download nothing has been missed and everything is within sync. In an ideal world I would like to pass in to a SSIS package or tsql stored procedure a date range, shall we say calender week, this would then check for any differences between the remote MySQL database tables and the local SQL tables. It does not currently have to do anything but identify issues, correcting them may come later or changes would need to be made to the existing sync package. Hope his makes more sense."
Thanks P
To do this, you need to implement a Type 1 Slowly Changing Dimension style data flow in SSIS. There are a number of ways to do this, including a built-in transformation aptly called the Slowly Changing Dimension transformation. Whilst this is easy to set up, it is a pain to maintain and it runs horrendously slowly.
There are numerous ways to set this up using other transformations or even SQL merge statements which are detailed here: https://bennyaustin.wordpress.com/2010/05/29/alternatives-to-ssis-scd-wizard-component/
I would recommend that you use Lookup transformations: they perform better than the Slowly Changing Dimension transformation, while offering better diagnostics and error handling than the even faster SQL MERGE statement.
Before you do this you will need to add a CHECKSUM or HASHBYTES column to your SQL Server data for ease of comparison with the incoming MySQL data.
In short, calculate some sort of repeatable checksum as the data is downloaded into your SQL Server, then use this in an SSIS Lookup, matching on the row key, to check for changes. Where the checksum value is different for the same row it needs updating and where there is no matching row key in your SQL Data you need to insert the new row.
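As a hedged illustration of the same hash-compare pattern in pure T-SQL (hypothetical table and column names; an SSIS Lookup spreads the same logic across transformations with per-row error handling):

-- src holds rows staged from MySQL; dest is the local SQL Server table.
-- A repeatable hash turns "has this row changed?" into a single comparison.
MERGE dbo.Customer AS dest
USING stage.Customer AS src
    ON dest.CustomerId = src.CustomerId
WHEN MATCHED AND HASHBYTES('SHA2_256', CONCAT(dest.Name, '|', dest.Email))
              <> HASHBYTES('SHA2_256', CONCAT(src.Name, '|', src.Email))
    THEN UPDATE SET dest.Name = src.Name, dest.Email = src.Email
WHEN NOT MATCHED BY TARGET
    THEN INSERT (CustomerId, Name, Email)
    VALUES (src.CustomerId, src.Name, src.Email);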
I'm investigating the scalability of Sequelize in a production app, specifically the increment function, to see how well it handles a row that could theoretically be updated several times simultaneously (say, the totals row of something). My question is: can the Sequelize increment method be trusted for these little addition operations that could run concurrently?
We're using Postgres on the backend, but I'm not familiar with the internals of Postgres and how it would handle this type of scenario (Heroku Postgres will be the production host, if it matters).
The Docs / The Code
The SQL run by Sequelize, according to the code comments:
SET column = column + X
It's hard to say without complete SQL examples, but I'd say this will likely serialize all transactions that call it on the same row.
If you update a row, the database takes a row-level update lock that is only released at commit/rollback time. Other updates/deletes on that row block on this lock until the first transaction commits or rolls back.
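To make that concrete, here is a sketch against a hypothetical counters table. The read-modify-write form can lose updates under concurrency; the in-place form, which is what increment emits, cannot:

-- Unsafe pattern: two clients can both read 10 and both write 11,
-- silently losing one increment.
-- SELECT total FROM counters WHERE id = 1;
-- UPDATE counters SET total = 11 WHERE id = 1;

-- Safe: the addition happens inside the UPDATE, under the row lock,
-- so concurrent increments serialize and none are lost.
UPDATE counters SET total = total + 1 WHERE id = 1;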
I am working on my first project using an ORM (currently Entity Framework, although that's not set in stone) and am unsure what best practice is when I need to add or subtract a given amount from a database field, when I am not interested in the new value and I know the field in question is frequently updated, so concurrency conflicts are a concern.
For example, in a retail system where I am recording a sale, as well as creating records for the sale and each of the line items, I need to update the quantity on hand of the items sold. It seems unnecessary to query the database for the existing quantity on hand, just so that I can populate the entity model before saving the updated quantity - and in the time taken for that round-trip, there is a chance that the same item will have been sold through another checkout or the website, so I either have a conflict or (if using a transaction) the other sale is blocked until I complete my update.
In SQL I would simply write
UPDATE Item SET Quantity=Quantity-1 WHERE ...
It seems the best option in this case is to fall back to ADO.NET + stored procedure for this one update, but is there a better way within Entity Framework?
You're right. ORMs specialize in tracking changes to each individual entity and applying those changes to the DB individually. Some ORMs support sending the changes in batches, but even so, modifying all the records in a table implies reading them all, modifying each one, and sending the changes back to the DB as individual UPDATEs.
And that's a big no-no, as you have correctly thought. It implies loading all the rows into memory, modifying all of them, tracking their changes, and sending them back to the DB as individual updates, which is way more expensive than running a single UPDATE on the DB.
As to the final question, to run a SQL command you don't need to use traditional ADO.NET. You can run SQL queries directly from an EF DbContext using ExecuteSqlCommand like this:
myDbContext.Database.ExecuteSqlCommand("YOUR SQL HERE");
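For the exact scenario in the question, a sketch using EF's positional parameter convention (@p0 and @p1 map to the arguments in order; the Item table is from the question, ItemId is an assumed key name):

// Atomic server-side decrement: no read round trip, no stale quantity.
// ExecuteSqlCommand returns the number of rows affected.
int rowsAffected = myDbContext.Database.ExecuteSqlCommand(
    "UPDATE Item SET Quantity = Quantity - @p0 WHERE ItemId = @p1",
    quantitySold, itemId);

A zero row count tells you the item was missing, without a separate query.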
I recommend you look at the MSDN docs for the Database class, to learn all the things that can be done, for example managing transactions, executing commands that return no data (as in the previous example), or executing queries that return data, and even mapping the results to entities (classes) in your model: SqlQuery().
So you can run SQL commands and queries without using a different technology.
I'm planning to use MS entity framework for new web apps (come on EF v2!).
So does it make sense to plan ahead by adding timestamp columns to all entity tables in existing and future databases, to support concurrency checks? Is there any reason why it would be a bad idea to have a timestamp column in every table?
Note that the point is to add support for optimistic concurrency, not auditing.
I've used timestamp columns as a matter of routine for years. Another option is a manually-maintained row-version number, but then you need to update it yourself, etc. I've never had any problems with timestamp. One word of caution: if you ever select into a temp table / table variable for processing, you need to use varbinary(8), not timestamp, in the temp table - otherwise your temp table will get its own unique timestamps upon update ;-p
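That caveat in a quick sketch (table and column names are hypothetical):

-- Cast the timestamp/rowversion column to varbinary(8) when copying rows out,
-- so the copied values are preserved instead of behaving like a live timestamp.
SELECT Id, Name, CAST(RowVer AS varbinary(8)) AS RowVer
INTO #work
FROM dbo.Item;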
As you acknowledge, timestamp only helps with concurrency. Despite the name, it has nothing directly to do with time, so won't help with auditing.
It is nicely supported in the MS db offerings (LINQ-to-SQL / EF / etc)
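In EF that support is the [Timestamp] data annotation (or IsRowVersion() in the fluent API); a minimal, illustrative entity:

using System.ComponentModel.DataAnnotations;

// [Timestamp] maps the property to a SQL Server rowversion column. EF then adds
// it to the WHERE clause of UPDATEs/DELETEs, and a zero-row result surfaces as
// DbUpdateConcurrencyException. Class and property names are illustrative.
public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }

    [Timestamp]
    public byte[] RowVersion { get; set; }
}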
In previous projects I used timestamps a lot, and I never had a bad experience with them.
Additionally, I would totally exclude the Entity Framework from that decision, because that's something that might change over time.