Parallel.ForEach and BulkCopy

I have a C# library that connects to 59 servers with the same database structure and imports data from each of them into a single table in my local database. At the moment I retrieve the data server by server in a foreach loop:
foreach (var systemDto in systems)
{
    var sourceConnectionString = _systemService.GetConnectionStringAsync(systemDto.Ip).Result;
    var dbConnectionFactory = new DbConnectionFactory(sourceConnectionString,
        "System.Data.SqlClient");
    var dbContext = new DbContext(dbConnectionFactory);
    var storageRepository = new StorageRepository(dbContext);
    var usedStorage = storageRepository.GetUsedStorageForCurrentMonth();
    var dtUsedStorage = new DataTable();
    dtUsedStorage.Load(usedStorage);
    var dcIp = new DataColumn("IP", typeof(string)) {DefaultValue = systemDto.Ip};
    var dcBatchDateTime = new DataColumn("BatchDateTime", typeof(string))
    {
        DefaultValue = batchDateTime
    };
    dtUsedStorage.Columns.Add(dcIp);
    dtUsedStorage.Columns.Add(dcBatchDateTime);
    using (var blkCopy = new SqlBulkCopy(destinationConnectionString))
    {
        blkCopy.DestinationTableName = "dbo.tbl";
        blkCopy.WriteToServer(dtUsedStorage);
    }
}
Because there are many systems to retrieve data from, I wonder whether it is possible to use a Parallel.ForEach loop. Will SqlBulkCopy lock the table during WriteToServer, so that the next WriteToServer has to wait until the previous one completes?
-- EDIT 1
I've changed foreach to Parallel.ForEach, but I've run into one problem. Inside the loop I call an async method: _systemService.GetConnectionStringAsync(systemDto.Ip)
and this line throws the following error:
System.NotSupportedException: A second operation started on this
context before a previous asynchronous operation completed. Use
'await' to ensure that any asynchronous operations have completed
before calling another method on this context. Any instance members
are not guaranteed to be thread safe.
Any ideas how can I handle this?

In general, the second write will be blocked and will wait until the previous operation completes.
Several factors affect whether SqlBulkCopy can run in parallel.
I remember that when I added the Parallel feature to my .NET Bulk Operations, I had a hard time making it work correctly in parallel. It worked well only when the table had no index (which is almost never the case).
Even when it worked, the performance gain was small.
Perhaps you will find more information here: MSDN - Importing Data in Parallel with Table Level Locking
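Since the writes to one table tend to serialize anyway, one option is to parallelize only the reads from the source servers and keep the bulk write behind a lock. The sketch below is only an illustration under that assumption, not a definitive implementation: ParallelImportSketch, loadFromSource and writeToDestination are hypothetical names standing in for the repository call and the SqlBulkCopy call from the question. It also assumes the connection strings (or IPs) were resolved sequentially before the loop, because the context behind GetConnectionStringAsync is not thread safe, which is what the NotSupportedException in EDIT 1 points at.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Threading.Tasks;

public static class ParallelImportSketch
{
    // loadFromSource and writeToDestination are hypothetical stand-ins:
    // the first would wrap StorageRepository.GetUsedStorageForCurrentMonth(),
    // the second would wrap SqlBulkCopy.WriteToServer().
    public static int Import(
        IReadOnlyList<string> ips,
        Func<string, DataTable> loadFromSource,
        Action<DataTable> writeToDestination,
        int maxParallelReads = 4)
    {
        var writeLock = new object();
        var rowsWritten = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelReads };

        Parallel.ForEach(ips, options, ip =>
        {
            // Reads run in parallel; each iteration uses its own objects only.
            DataTable table = loadFromSource(ip);
            table.Columns.Add(new DataColumn("IP", typeof(string)) { DefaultValue = ip });

            // Writes are serialized, so only one bulk copy hits the table at a time.
            lock (writeLock)
            {
                writeToDestination(table);
                rowsWritten += table.Rows.Count;
            }
        });

        return rowsWritten;
    }
}
```

In the question's setting, writeToDestination would create a new SqlBulkCopy per call, set DestinationTableName to "dbo.tbl" and call WriteToServer(table). Whether this is faster than the plain foreach depends on how much of the time is spent reading from the 59 sources, so it is worth measuring.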

Cannot drop Firebird table when using multiple connections

I would like to safely drop a Firebird table. I have 3 transactions: one to recreate the table, one to do something with the table (just inserting a single row to keep it simple) and the last one to drop the table.
If all these transactions are executed using a single connection, this works. If I use different connections, the drop command fails with
lock conflict on no wait transaction
unsuccessful metadata update
object TABLE "DEMO" is in use
private static void Test() {
    using var conn1 = new FbConnection(ConnectionString);
    using var conn2 = new FbConnection(ConnectionString);
    using var conn3 = new FbConnection(ConnectionString);
    conn1.Open();
    conn2.Open();
    conn3.Open();
    ExecuteTxn(conn1, cmd => {
        cmd.CommandText = "recreate table demo (id int primary key)";
        cmd.ExecuteNonQuery();
    });
    ExecuteTxn(conn2, cmd => {
        cmd.CommandText = "insert into demo (id) values (1)";
        cmd.ExecuteNonQuery();
    });
    ExecuteTxn(conn3, cmd => {
        cmd.CommandText = "drop table demo";
        cmd.ExecuteNonQuery();
    });
}
private static void ExecuteTxn(FbConnection conn, Action<FbCommand> todo) {
    using (var txn = conn.BeginTransaction())
    using (var cmd = conn.CreateCommand()) {
        cmd.Transaction = txn;
        todo(cmd);
        txn.Commit();
    }
}
I realized that changing the transaction options to
using (var txn = conn.BeginTransaction(new FbTransactionOptions { TransactionBehavior = FbTransactionBehavior.Wait }))
seems to help. But I'm not sure if this is the right thing to do or just a coincidence...
Using Firebird 3.0.6, FirebirdSql.Data.FirebirdClient.dll 7.5.0.0
As far as I understand it, the problem has to do with how Firebird caches certain metadata, which might result in existence locks being retained, which will prevent deletion of the object. In addition, it is possible - this is a guess! - that the Firebird ADO.net provider retains the statement handle with the insert statement prepared, which will also result in an existence lock being retained.
Executing in a WAIT transaction (optionally with a timeout) is considered an appropriate workaround by the Firebird core developers.
For reference, see the following tickets:
CORE-3766 - Transaction can`t change metadata if it is run in no_wait and there is another connect that once had queried these metadata
CORE-6382 - Triggers accessing a table prevent concurrent DDL command from dropping that table
In certain cases, switching from Firebird ClassicServer or Firebird SuperClassic to Firebird SuperServer can also prevent this problem.
However, if you want a more in-depth explanation, it might be worthwhile to ask this question on the firebird-devel mailing list.

What impact does changing a IReliableQueue to a IReliableConcurrentQueue have in an existing deployment?

I am working on a Service Fabric application that uses IReliableQueue. For the use cases of this system, IReliableConcurrentQueue makes sense, and some local testing (basically just changing the code to use IReliableConcurrentQueue instead of IReliableQueue, with the queue name unchanged) shows great performance improvements. However, I am worried about the impact of changing this in a production system (i.e. upgrading). I can't find any docs or online questions (unless I just missed them) about these considerations. For example, in this system the existing IReliableQueue will almost always have items. So what happens to that data when I upgrade the SF application? Will it be available to dequeue in the IReliableConcurrentQueue? Or would data be lost? I know I can "just try it" but wanted to see if someone out there had done the same or could offer pointers to existing resources. Thanks!
Sorry for a late answer (that you probably don't need anymore but still).
When we call the GetOrAddAsync method on IReliableStateManager, we aren't just retrieving an interface for storing values; we are actually creating an instance of a reliable collection. This means that the interface type we specify is very important.
Taking this into account, if we do this:
Service v. 1.0
// Somewhere in RunAsync for example
await this.StateManager.GetOrAddAsync<IReliableQueue<long>>("MyCollection")
Then doing this in the next version:
Service v. 1.1
// Somewhere in RunAsync for example
await this.StateManager.GetOrAddAsync<IReliableConcurrentQueue<long>>("MyCollection")
will throw an exception:
Returned reliable object of type Microsoft.ServiceFabric.Data.Collections.DistributedQueue`1[System.Int64] cannot be casted to requested type Microsoft.ServiceFabric.Data.Collections.IReliableConcurrentQueue`1[System.Int64]
and then:
System.ExecutionEngineException: 'Exception of type 'System.ExecutionEngineException' was thrown.'
The above exception looks like a bug, so I have filed one.
UPDATE 2019.06.28
It turned out that the appearance of System.ExecutionEngineException isn't a bug but rather undocumented behavior of the Environment.FailFast method in combination with the Visual Studio debugger.
Please see my comment to the above issue.
This is what would happen.
There are plenty of ways to overcome this.
Here is the most obvious one:
Example
var migrate = false; // Set to true when the migration still has to be performed.
var migrateValues = new List<long>();
var applicationFlags = await this.StateManager
    .GetOrAddAsync<IReliableDictionary<string, bool>>("application-flags");
// Step 1: if the migration flag is not set yet, drain the old queue into memory.
using (var transaction = this.StateManager.CreateTransaction())
{
    var flag = await applicationFlags
        .TryGetValueAsync(transaction, "queue-to-concurrent-queue-migration");
    if (!flag.HasValue || !flag.Value)
    {
        var queue = await this.StateManager
            .GetOrAddAsync<IReliableQueue<long>>("value-collection");
        for (;;)
        {
            var c = await queue.TryDequeueAsync(transaction);
            if (!c.HasValue)
            {
                break;
            }
            migrateValues.Add(c.Value);
        }
        migrate = true;
    }
    // This transaction is never committed; the old collection is removed below.
}
if (migrate)
{
    // Step 2: remove the old queue and recreate it under the same name
    // as a concurrent queue, then mark the migration as done.
    await this.StateManager.RemoveAsync("value-collection");
    using (var transaction = this.StateManager.CreateTransaction())
    {
        var concurrentQueue = await this.StateManager
            .GetOrAddAsync<IReliableConcurrentQueue<long>>("value-collection");
        foreach (var i in migrateValues)
        {
            await concurrentQueue.EnqueueAsync(transaction, i);
        }
        await applicationFlags.AddOrUpdateAsync(
            transaction,
            "queue-to-concurrent-queue-migration",
            true,
            (s, b) => true);
        // Commit inside the using block, while the transaction is still in scope.
        await transaction.CommitAsync();
    }
}
Please note that this code is just an illustrative example and should be properly tested before applying it to a real-life application.

Get connection used by DatabaseFactory.GetDatabase().ExecuteReader()

We have two different query strategies that we'd ideally like to operate in conjunction on our site without opening redundant connections. One strategy uses the enterprise library to pull Database objects and Execute_____(DbCommand)s on the Database, without directly selecting any sort of connection. Effectively like this:
Database db = DatabaseFactory.CreateDatabase();
DbCommand q = db.GetStoredProcCommand("SomeProc");
using (IDataReader r = db.ExecuteReader(q))
{
    List<RecordType> rv = new List<RecordType>();
    while (r.Read())
    {
        rv.Add(RecordType.CreateFromReader(r));
    }
    return rv;
}
The other, newer strategy, uses a library that asks for an IDbConnection, which it Close()es immediately after execution. So, we do something like this:
DbConnection c = DatabaseFactory.CreateDatabase().CreateConnection();
using (QueryBuilder qb = new QueryBuilder(c))
{
    return qb.Find<RecordType>(ConditionCollection);
}
But, the connection returned by CreateConnection() isn't the same one used by the Database.ExecuteReader(), which is apparently left open between queries. So, when we call a data access method using the new strategy after one using the old strategy inside a TransactionScope, it causes unnecessary promotion -- promotion that I'm not sure we have the ability to configure for (we don't have administrative access to the SQL Server).
Before we go down the path of modifying the query-builder-library to work with the Enterprise Library's Database objects ... Is there a way to retrieve, if existent, the open connection last used by one of the Database.Execute_______() methods?
Yes, you can get the connection associated with a transaction. Enterprise Library internally manages a collection of transactions and the associated database connections so if you are in a transaction you can retrieve the connection associated with a database using the static TransactionScopeConnections.GetConnection method:
using (var scope = new TransactionScope())
{
    IEnumerable<RecordType> records = GetRecordTypes();
    Database db = DatabaseFactory.CreateDatabase();
    DbConnection connection = TransactionScopeConnections.GetConnection(db).Connection;
}
public static IEnumerable<RecordType> GetRecordTypes()
{
    Database db = DatabaseFactory.CreateDatabase();
    DbCommand q = db.GetStoredProcCommand("GetLogEntries");
    using (IDataReader r = db.ExecuteReader(q))
    {
        List<RecordType> rv = new List<RecordType>();
        while (r.Read())
        {
            rv.Add(RecordType.CreateFromReader(r));
        }
        return rv;
    }
}

Mongo c# driver freezes and never returns a value on Update()

I have a long running operation that inserts thousands of sets of entries, each time a set is inserted using the code below.
After a while of this code running, the collection.Update() method freezes (does not return) and the entire process grinds to a halt.
Can't find any reasonable explanation for this anywhere.
I've looked at the mongod logs, nothing unusual, it just stops receiving requests from this process.
Mongo version: 2.4.1, C# driver version: 1.8.0
using (_mongoServer.RequestStart(_database))
{
    var collection = GetCollection<BsonDocument>(collectionName);
    // Iterate over all records
    foreach (var recordToInsert in recordsDescriptorsToInsert)
    {
        var query = new QueryDocument();
        var update = new UpdateBuilder();
        foreach (var property in recordToInsert)
        {
            var field = property.Item1;
            var value = BsonValue.Create(property.Item2);
            if (keys.Contains(field))
                query.Add(field, value);
            update.Set(field, value);
        }
        collection.Update(query, update, UpdateFlags.Upsert); // ** NEVER RETURNS **
    }
}
This may be related to CSHARP-717.
It was fixed in driver 1.8.1.

Error loading a persisted workflow

I have a workflow started and persisted using messaging activities.
The correlation between the initial Start command and the final Stop command works well if they're sent within a few seconds of each other.
Problems begin when the workflow is unloaded, because the following Stop message throws the following FaultException:
If LoadWorkflowByInstanceKeyCommand.AssociateLookupKeyToInstanceId is not specified, the LookupInstanceKey must already be associated to an instance, or the LoadWorkflowByInstanceKeyCommand will fail. For this reason, it is invalid to also specify the LookupInstanceKey in the InstanceKeysToAssociate collection if AssociateLookupKeyToInstanceId isn't set
Can anybody help me?
The variables inside the workflow are of types int and XDocument.
This is the code to initialize the WorkflowServiceHost:
WorkflowServiceHost serviceHost = new WorkflowServiceHost(myWorkflow, new Uri(serviceUri));
ServiceDebugBehavior debug = serviceHost.Description.Behaviors.Find<ServiceDebugBehavior>();
if (debug == null)
{
    debug = new ServiceDebugBehavior();
    serviceHost.Description.Behaviors.Add(debug);
}
debug.IncludeExceptionDetailInFaults = true;
WorkflowIdleBehavior idle = serviceHost.Description.Behaviors.Find<WorkflowIdleBehavior>();
if (idle == null)
{
    idle = new WorkflowIdleBehavior();
    serviceHost.Description.Behaviors.Add(idle);
}
idle.TimeToPersist = TimeSpan.FromSeconds(2);
idle.TimeToUnload = TimeSpan.FromSeconds(10);
var behavior = new SqlWorkflowInstanceStoreBehavior
{
    ConnectionString = ConfigurationManager.ConnectionStrings["WorkflowPersistence"].ConnectionString,
    InstanceEncodingOption = InstanceEncodingOption.None,
    InstanceCompletionAction = InstanceCompletionAction.DeleteAll,
    InstanceLockedExceptionAction = InstanceLockedExceptionAction.BasicRetry,
    HostLockRenewalPeriod = new TimeSpan(00, 00, 30),
    RunnableInstancesDetectionPeriod = new TimeSpan(00, 00, 05)
};
serviceHost.Description.Behaviors.Add(behavior);
serviceHost.Open();
Looking at the database, it seems that the workflow is never suspended.
Any help appreciated,
thank you
Not really sure what is going on here but it sounds like there are types used in the workflow that cannot be serialized and prevent the workflow from being stored to disk. When you say "Looking at the database, it seems that the workflow is never suspended." do you really mean suspended? And why do you expect the workflow to be suspended?
What happens if you send just the start message to the workflow and wait 2 seconds? Do you get a new record in the persistence database?
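To narrow down whether a value held in a workflow variable can be serialized at all, a quick round-trip probe can help. The sketch below is only a rough check under stated assumptions: SerializationProbe and CanRoundTrip are hypothetical names, and it uses DataContractSerializer, which is not necessarily the serializer the persistence store uses, so a passing result here does not prove that persistence will succeed.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

public static class SerializationProbe
{
    // Hypothetical helper: returns true when the value can be written and
    // read back with DataContractSerializer, false when serialization fails.
    public static bool CanRoundTrip<T>(T value)
    {
        try
        {
            var serializer = new DataContractSerializer(typeof(T));
            using (var stream = new MemoryStream())
            {
                serializer.WriteObject(stream, value);
                stream.Position = 0;
                serializer.ReadObject(stream);
                return true;
            }
        }
        catch (Exception ex) when (ex is SerializationException
                                   || ex is InvalidDataContractException)
        {
            // Report which type failed and why, to help locate the culprit.
            Console.WriteLine(typeof(T).FullName + ": " + ex.Message);
            return false;
        }
    }
}
```

Calling SerializationProbe.CanRoundTrip on a sample of each variable type used in the workflow (here the int and the XDocument values) gives a first hint about which one to look at more closely.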