I am trying to use nested transactions with OrientGraph, but it does not seem to work properly. My scenario is:
void function1() {
    OrientGraph db = factory.openDatabase(); // returns the active graph on the current thread
    db.begin();
    function2();
    // save some data
    db.commit();
}
void function2() {
    OrientGraph db = factory.openDatabase(); // returns the active graph on the current thread
    db.begin();
    // save some data
    db.commit(); // this commit is saving the data into the db
}
The commit in function2 saves the data, but since it is part of a nested transaction it should only be committed when the commit happens on the outer transaction. Am I doing something wrong?
Note: I am calling db.setAutoStartTx(false); so that a transaction will not be started automatically.
You should use the same database instance object in both functions.
To automate this process (and get a performance speed-up) I suggest you use the com.orientechnologies.orient.core.db.OPartitionedDatabasePool class. I always recommend this pool because it minimizes the time needed to acquire a new connection and scales very well on multicore hardware.
EDIT
Try calling db.getRawGraph().activateOnCurrentThread() after the function2() call:
void function1() {
    OrientGraph db = factory.openDatabase(); // returns the active graph on the current thread
    db.begin();
    function2();
    db.getRawGraph().activateOnCurrentThread(); // reactivate this instance on the current thread
    // save some data
    db.commit();
}
Related
I have a single DbContext. First I do:
var all = context.MySet.Where(c => c.X == 1).ToList();
Later (with the same context instance):
var special = context.MySet.Where(c => (c.X == 1) && (c.Y == 1)).ToList();
The database is hit AGAIN! Since the first query is guaranteed
to return all of the elements that will exist in the second, why is the DB being hit again?
If you wish to avoid hitting the database again then you could try this:
var special = all.Where(c => (c.X == 1) && (c.Y == 1)).ToList();
Since the list of all objects already contains everything you want, you can just query that list and the database won't be hit again.
Your LINQ expression is just a query; it only retrieves data when you enumerate it (for example by calling .ToList()). You can keep changing the query and hold off actually fetching the data until you need it. Entity Framework converts your query into a SQL query in the background and then fetches the data.
Avoid writing ToList() at the end of every query, as this forces EF to hit the database.
If you only ever want to hit the database once, get the data you need by calling ToList(), ToArray(), etc., and then work with that collection (in your case the "all" collection), since this is the object holding all the data.
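For illustration, a minimal sketch of the difference between a deferred query and an in-memory filter (Item stands in for whatever entity type MySet holds; the names are hypothetical):
// Builds up an expression tree; no SQL is sent yet.
IQueryable<Item> query = context.MySet.Where(c => c.X == 1);
// Enumerating the query (ToList) generates the SQL and hits the database once.
List<Item> all = query.ToList();
// LINQ-to-Objects: this filters the in-memory list, so there is no second database hit.
List<Item> special = all.Where(c => c.X == 1 && c.Y == 1).ToList();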
I'm working on a REST API. I'm having all sorts of problems with transactions in Orientdb. In the current setup, we have a singleton that wraps around the ODatabaseDocumentPool. We retrieve all instances through this setup. Each api call starts by acquiring an instance from the pool and creating a new instance of OrientGraph using the ODatabaseDocumentTx instance. The code that follows uses methods from both ODatabaseDocumentTx and OrientGraph. At the end of the code, we call graph.commit() on write operations and graph.shutdown() on all operations.
I have a list of questions.
To verify: can I still use the ODatabaseDocumentTx instance I used to create the OrientGraph? Or should I use OrientGraph.getRawGraph()?
What is the best way to do read operations when using OrientGraph? Even during read operations I get OConcurrentModificationExceptions, lock exceptions, or errors on retrieving records. Is this because the OrientGraph is transactional and versions are modified even when retrieving records? I should mention that I also use the index manager and iterate through the edges of a vertex in these read operations.
When I get a record through the Index Manager, does this update the version on the database?
Does graph.shutdown() release the ODatabaseDocumentTx instance back to the pool?
Does v1.78 still require us to lock records in transactions?
If I set autoStartTx to false on OrientGraph, do I have to start transactions manually, or do they start automatically when accessing the database?
Sample Code:
ODatabaseDocumentTx db = pool.acquire();
// READ
OrientGraph graph = new OrientGraph(db);
ODocument doc = (ODocument) oidentifiable.getRecord(); // I use the Java API to get a record from the index
if ( ((String) doc.field("field")).equals("name") )
//code
OrientVertex v = graph.getVertex(doc);
for (Vertex vv : v.getVertices(Direction.BOTH)) {
//code
}
// OR WRITE
doc.field("d", val);
doc = doc.save();
OrientVertex v = graph.getVertex(doc);
graph.addEdge(null, v, otherVertex, "label"); // Blueprints addEdge needs an edge label; "label" is a placeholder
graph.addEdge(null, v, anotherVertex, "label"); // do I have to reload the record in v?
// End Transaction
// if write
graph.commit();
// then
graph.shutdown();
Background: We have an Azure .NET application where we need to register a single "front end" user with multiple back-end providers. Since this registration takes a while, we offload it to a worker role, and there are multiple worker roles. All data is stored in Azure SQL and we're using Entity Framework 5.0 as our ORM. The way we're currently set up, we read from the SQL DB => process in worker role code => write/update to the SQL DB to flag completion. Essentially I need to solve the traditional "multithreaded + shared data writes" problem, but instead of OS scale it's at cloud scale.
Concern: We have a race condition with multiple workers if the first worker takes longer than the visibility timeout. For example, assuming two worker roles, I've marked below how both would read from SQL, think that the processing is still pending, and both would proceed. It results in a last-writer-wins race condition and also creates orphaned and extra accounts on the external service providers.
Question: How can I modify this to handle this situation elegantly? I can alter the data flow or use a per-user "cloud" lock as a mutex. Without trying to constrain anyone's thinking: in the past I speculated about a SQL-based cloud lock, but couldn't really get it working in EF 5.0. Here I'm open to any answers, SQL-based locks or not.
// Read message off Service Bus Queue
// message invisible for 1 min now, worker must finish in 1 min
BrokeredMessage qMsg = queueClient.Receive();
// Extract UserID Guid from message
Guid userProfileId = DeserializeUserIdFromQMsg(qMsg);
// READ PROFILE FROM SQL
UserProfile up = (from x in myAppDbContext.UserProfiles select x).SingleOrDefault(p => p.UserProfileId == userProfileId);
if (up != null)
{
List<Task> allUserRegTasks = new List<Task>();
string firstName = up.FirstName; // <== WORKER ROLE #2 HERE
string lastName = up.LastName;
string emailAddress = up.Email;
// Step 1: Async register User with provider #1, update db
if (String.IsNullOrEmpty(up.Svc1CustId))
{
// <== WORKER ROLE #1 HERE
Svc1RegHelper svc1RegHelper = new Svc1RegHelper();
Task svc1UserRegTask = svc1RegHelper.GetRegisterTask(userProfileId, firstName, lastName, emailAddress);
svc1UserRegTask.Start(); // <== SQL WRITE INSIDE THIS (marks "up.Svc1CustId")
allUserRegTasks.Add(svc1UserRegTask);
}
// Step 2: Async register User with provider #2, update db
if (String.IsNullOrEmpty(up.Svc2CustId))
{
Svc2RegHelper svc2RegHelper = new Svc2RegHelper();
Task svc2UserRegTask = svc2RegHelper.GetRegisterTask(userProfileId, firstName, lastName, emailAddress);
svc2UserRegTask.Start(); // <== SQL WRITE INSIDE THIS (marks "up.Svc2CustId")
allUserRegTasks.Add(svc2UserRegTask);
}
Task.WaitAll(allUserRegTasks.ToArray());
// Step 3: Send confirmation email to user we're ready for them!
// ...
}
You can put a mutex in blob storage via a blob lease. Put a try/catch around the whole thing, as AcquireLease() will fail if the mutex is held by someone else:
var lockBlobContainer = cloudClient.GetContainerReference("mutex-container");
var lockBlob = lockBlobContainer.GetBlobReference("SOME_KNOWN_KEY.lck");
lockBlob.UploadText(DateTime.UtcNow.ToString(CultureInfo.InvariantCulture)); //creates the mutex file
var leaseId = lockBlob.AcquireLease();
try
{
// Do stuff
}
finally
{
lockBlob.ReleaseLease(leaseId);
}
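For completeness, a minimal sketch of that try/catch, reusing qMsg from the question's sample and assuming a storage client version that exposes AcquireLease()/ReleaseLease() as in the snippet above (the exact exception type varies by SDK version):
string leaseId = null;
try
{
    leaseId = lockBlob.AcquireLease(); // throws if another worker currently holds the lease
    // Do the registration work here while we own the mutex
}
catch (StorageClientException)
{
    // Another worker owns the mutex: abandon the message so it becomes
    // visible on the queue again and can be retried later.
    qMsg.Abandon();
}
finally
{
    if (leaseId != null)
        lockBlob.ReleaseLease(leaseId); // always release so other workers can proceed
}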
I have a project that pulls data from a service (returning XML), which is deserialized into objects/entities.
I'm using EF Code First and testing worked fine until it came to a big chunk of data; not too big, only 150K records. I used SQL Profiler to check the SQL statements and they are really fast, but there is a huge slowdown in generating the insert statements.
Simply put, the data model is simple: class Client has many child object sets (5) and one many-to-many relationship.
IDs for the model are provided by the service, so I cleaned up duplicate instances of the same entity (same ID).
var clientList = service.GetAllClients(); // returns IEnumerable<Client>, about 10K clients
var filteredList = Client.RemoveDuplicateInstancesSameEntity(clientList); // returns IEnumerable<Client>
int cur = 0;
int batch = 100;
while (true)
{
    logger.Trace("POINT A: get next batch");
    var importSegment = filteredList.Skip(cur).Take(batch).OrderBy(x => x.Id);
    if (!importSegment.Any())
        break;
    logger.Trace("POINT B: Saving to DB");
    foreach (var c in importSegment)
        repository.addClient(c);
    logger.Trace("POINT C: calling persist");
    repository.persist();
    cur = cur + batch;
}
The logic is simple: break the work into batches to speed up the process. Each 100 clients create about 1000 insert statements (for the child records and one many-to-many table).
I'm using the profiler and logging to analyze this. The log shows POINT B as the last step every time, but I don't see any insert statements in the profiler yet. Then, two minutes later, I see all the insert statements, then POINT B for the next batch, and two minutes again.
Did I do anything wrong, or is there a setting or anything I can do to improve this?
Inserting 1K records seems to be fast. The database is wiped when the process starts, so there are no records in there. It doesn't seem to be an issue with SQL slowness, but with EF generating the insert statements.
The project works, but it is slow. I want to speed it up and understand more about EF when it comes to big chunks of data. Or is this normal?
The first 100 are fast, and then it gets slower and slower and slower. The issue seems to be at POINT B. Is the repo/DbContext unable to handle that much data in a timely manner?
The repo inherits from DbContext, and addClient is simply:
dbcontext.Client.Add(client);
Thank you very much.
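A likely cause of this pattern (not stated in the original post) is EF change tracking: the cost of DetectChanges grows with the number of tracked entities, so each batch gets slower as the context fills up. A minimal sketch of the usual mitigation, with a hypothetical context type in place of the repo and importSegment taken from the loop above:
// Disable automatic change detection while bulk-adding, and use a fresh
// context per batch so the change tracker never grows unbounded.
using (var ctx = new MyAppDbContext()) // hypothetical context type
{
    ctx.Configuration.AutoDetectChangesEnabled = false;
    foreach (var c in importSegment)
        ctx.Clients.Add(c);
    ctx.SaveChanges();
}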
I have an MSSQL procedure with the following code in it:
SELECT Id, Role, JurisdictionType, JurisdictionKey
FROM dbo.SecurityAssignment WITH (UPDLOCK, ROWLOCK)
WHERE Id = @UserIdentity
I'm trying to move that same behavior into a component that uses OleDb connections, commands, and transactions to achieve the same result. (It's a security component that uses the SecurityAssignment table shown above; I want it to work whether that table is in MSSQL, Oracle, or DB2.)
Given the above SQL, if I run a test using the following code
Thread backgroundThread = new Thread(
delegate()
{
using (var transactionScope = new TransactionScope())
{
Subject.GetAssignmentsHavingUser(userIdentity);
Thread.Sleep(5000);
backgroundWork();
transactionScope.Complete();
}
});
backgroundThread.Start();
Thread.Sleep(3000);
var foregroundResults = Subject.GetAssignmentsHavingUser(userIdentity);
where Subject.GetAssignmentsHavingUser runs the SQL above and returns a collection of results, and backgroundWork is an Action that updates rows in the table, like this:
delegate
{
Subject.UpdateAssignment(newAssignment(user1, role1));
}
Then the foregroundResults returned by the test should reflect the changes made in the backgroundWork action.
That is, I retrieve a list of SecurityAssignment table rows that have UPDLOCK, ROWLOCK applied by the SQL, and subsequent queries against those rows don't return until that update lock is released; thus the foregroundResults in the test include the updates made in the backgroundThread.
This all works fine.
Now, I want to do the same with database-agnostic SQL, using OleDb transactions and isolation levels to achieve the same result. And I can't, for the life of me, figure out how to do it. Is it even possible, or does this row-level locking only apply at the DB level?
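For reference, a minimal sketch of expressing a lock-holding read through an isolation level on a plain OleDb transaction (connectionString and userIdentity are hypothetical, and whether readers actually block depends on the underlying provider):
using System.Data;       // IsolationLevel
using System.Data.OleDb; // OleDb types

using (var conn = new OleDbConnection(connectionString))
{
    conn.Open();
    // Under Serializable, rows read inside the transaction stay locked
    // (or versioned, depending on the provider) until Commit/Rollback.
    using (OleDbTransaction tx = conn.BeginTransaction(IsolationLevel.Serializable))
    {
        var cmd = new OleDbCommand(
            "SELECT Id, Role, JurisdictionType, JurisdictionKey " +
            "FROM SecurityAssignment WHERE Id = ?", conn, tx);
        cmd.Parameters.AddWithValue("?", userIdentity); // OleDb uses positional ? markers
        using (OleDbDataReader reader = cmd.ExecuteReader())
        {
            // read rows...
        }
        tx.Commit(); // locks acquired by the read are held until here
    }
}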