OrientDB Transactions Best Practices

I'm working on a REST API, and I'm having all sorts of problems with transactions in OrientDB. In the current setup, we have a singleton that wraps the ODatabaseDocumentPool, and we retrieve all instances through it. Each API call starts by acquiring an ODatabaseDocumentTx instance from the pool and creating a new OrientGraph from that instance. The code that follows uses methods from both ODatabaseDocumentTx and OrientGraph. At the end of the code, we call graph.commit() on write operations and graph.shutdown() on all operations.
I have a list of questions.
To verify: can I still use the ODatabaseDocumentTx instance I used to create the OrientGraph, or should I use OrientGraph.getRawGraph()?
What is the best way to do read operations when using OrientGraph? Even during read operations, I get OConcurrentModificationExceptions, lock exceptions, or errors on retrieving records. Is this because the OrientGraph is transactional and versions are modified even when retrieving records? I should mention that I also use the index manager and iterate through the edges of a vertex in these read operations.
When I get a record through the Index Manager, does this update the version on the database?
Does graph.shutdown() release the ODatabaseDocumentTx instance back to the pool?
Does v1.7.8 still require us to lock records in transactions?
If I set autoStartTx to false on OrientGraph, do I have to start transactions manually, or do they start automatically when accessing the database?
Sample Code:
ODatabaseDocumentTx db = pool.acquire();

// READ
OrientGraph graph = new OrientGraph(db);
ODocument doc = (ODocument) oIdentifiable.getRecord(); // I use the Java API to get a record from an index
if (((String) doc.field("field")).equals("name")) {
    // code
}
OrientVertex v = graph.getVertex(doc);
for (Vertex vv : v.getVertices(Direction.BOTH)) {
    // code
}

// OR WRITE
doc.field("d", val);
doc = doc.save();
OrientVertex v = graph.getVertex(doc);
graph.addEdge(null, v, otherVertex, "label");
graph.addEdge(null, v, anotherVertex, "label"); // do I have to reload the record in v?

// End Transaction
// if write
graph.commit();
// then
graph.shutdown();

Related

Bi-directional database syncing for Postgres and MongoDB

Let's say I have a local server running, and I also have an identical server already running on Amazon.
Both servers can CRUD data in their own databases.
Note that the servers use both `postgres` and `mongodb`.
Now, when no one is using the wifi (usually at night), I would like to sync both the Postgres and MongoDB databases so that all writes from each database on the remote server get properly applied to the corresponding database on the local server.
I don't want to use Multi-Master because:
MongoDB does not support this architecture itself, so perhaps I will need a complex alternative.
I want to control when and how much I sync both databases.
I do not want to use network bandwidth when others are using the internet.
So can anyone point me in the right direction?
Also, if you can list some tools that solve my problem, that would be very helpful.
Thanks.
We have several drivers that would be able to help you with this process. I'm presuming some knowledge of software development, and will showcase our ADO.NET Provider for MongoDB, which uses the familiar-looking MongoDBConnection, MongoDBCommand, and MongoDBDataReader objects.
First, you'll want to create your connection string for connecting with your cloud MongoDB instance:
string connString = "Auth Database=test;Database=test;Password=test;Port=27117;Server=http://clouddbaddress;User=test;Flatten Objects=false";
You'll note that we have the Flatten Objects property set to false; this ensures that any JSON/BSON objects contained in the documents will be returned as raw JSON/BSON.
After you create the connection string, you can establish the connection and read data from the database. You'll want to store the returned data in some way that would let you access it easily for future use.
List<string> columns = new List<string>();
List<object> values;
List<List<object>> rows = new List<List<object>>();
using (MongoDBConnection conn = new MongoDBConnection(connString))
{
//create a WHERE clause that will limit the results to newly added documents
MongoDBCommand cmd = new MongoDBCommand("SELECT * FROM SomeTable WHERE ...", conn);
MongoDBDataReader rdr = cmd.ExecuteReader();
int results = 0;
while (rdr.Read())
{
values = new List<object>();
for (int i = 0; i < rdr.FieldCount; i++)
{
if (results == 0)
columns.Add(rdr.GetName(i));
values.Add(rdr.GetValue(i));
}
rows.Add(values);
results++;
}
}
After you've collected all of the data for each of the objects that you want to replicate, you can configure a new connection to your local MongoDB instance and build queries to insert the new documents.
connString = "Auth Database=testSync;Database=testSync;Password=testSync;Port=27117;Server=localhost;User=testSync;Flatten Objects=false";
using (MongoDBConnection conn = new MongoDBConnection(connString)) {
foreach (var row in rows) {
//code here to create comma-separated strings for the columns
// and values to be inserted in a SQL statement
String sqlInsert = "INSERT INTO backup_table (" + column_names + ") VALUES (" + column_values + ")";
MongoDBCommand cmd = new MongoDBCommand(sqlInsert, conn);
cmd.ExecuteNonQuery();
}
}
At this point, you'll have inserted all of the new documents. You could then change your filter (the WHERE clause at the beginning) to filter based on updated date/time and update their corresponding entries in the local MongoDB instance using the UPDATE command.
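As a rough sketch of what that second pass could look like (hedged: the some_field and _id column handling below, and the idea of refiltering on a last_modified field, are hypothetical placeholders for your own schema), the update loop can mirror the insert loop:

using (MongoDBConnection conn = new MongoDBConnection(connString))
{
    foreach (var row in rows)
    {
        // Hypothetical: row[0] holds the document's _id and row[1] a changed
        // field, with the earlier SELECT refiltered on a last_modified date/time.
        string sqlUpdate = "UPDATE backup_table SET some_field = '" + row[1]
            + "' WHERE _id = '" + row[0] + "'";
        MongoDBCommand cmd = new MongoDBCommand(sqlUpdate, conn);
        cmd.ExecuteNonQuery();
    }
}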
Things to look out for:
Be sure that you're properly filtering out new/updated entries.
Be sure that you're properly interpreting the type of each variable so that you surround the values with quotes (or not) as appropriate when building the SQL query; one way to handle this is sketched below.
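A minimal sketch of centralizing that quoting decision (ToSqlLiteral is a hypothetical helper, not part of the driver):

// Hypothetical helper: quote and escape string-like values,
// pass numeric values through bare, and map null to SQL NULL.
static string ToSqlLiteral(object value)
{
    if (value == null) return "NULL";
    if (value is string || value is DateTime || value is Guid)
        return "'" + value.ToString().Replace("'", "''") + "'";
    return value.ToString();
}

Building column_values with a helper like this keeps the quote-or-not decision in one place instead of scattering it through the sync loop.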
We have a few drivers that might be useful to you. I demonstrated the ADO.NET Provider above, but we also have a driver for writing apps in Xamarin and a JDBC driver (for Java).

EF 6 Caching within a context

I have a single DbContext. First I do:
var all = context.MySet.Where(c=>c.X == 1).ToList();
later (with the same context instance)
var special = context.MySet.Where(c => (c.X == 1) && (c.Y == 1)).ToList();
The database is hit AGAIN! Since the first query is guaranteed
to return all of the elements that will exist in the second, why is the DB being hit again?
If you wish to avoid hitting the database again, then you could try this:
var special = all.Where(c => (c.X == 1) && (c.Y == 1)).ToList();
Since the list of all objects already contains everything you want, you can just query that list and the database won't get hit again.
Your LINQ expression is just a query; it only retrieves data when you enumerate it (for example, by calling .ToList()). You can keep changing the query and hold off actually fetching the data until you need it. Entity Framework will convert your query into a SQL query in the background and then fetch the data.
Avoid writing .ToList() at the end of every query, as this forces EF to hit the database.
If you only ever want to hit the database once, then get the data you need by calling .ToList(), .ToArray(), etc., and then work with that collection (in your case the "all" collection), since that is the object holding all the data.
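A minimal sketch of the difference (MyEntity is a stand-in for whatever entity type backs context.MySet, and the usual System.Linq usings are assumed):

// One database hit: materialize the broader result set up front.
List<MyEntity> all = context.MySet.Where(c => c.X == 1).ToList();

// No database hit: this filters the in-memory list from above.
List<MyEntity> special = all.Where(c => c.Y == 1).ToList();

// Deferred execution: composing a query issues no SQL yet...
IQueryable<MyEntity> query = context.MySet.Where(c => c.X == 1 && c.Y == 1);

// ...the database is only hit here, when the query is enumerated.
List<MyEntity> fetched = query.ToList();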

Nested transactions with OrientGraph

I am trying to use nested transactions with OrientGraph, but it does not seem to work properly.
My scenario is:
void function1() {
    OrientGraph db = factory.openDatabase(); // gives the active graph on the current thread
    db.begin();
    function2();
    // save some data
    db.commit();
}

void function2() {
    OrientGraph db = factory.openDatabase(); // gives the active graph on the current thread
    db.begin();
    // save some data
    db.commit(); // this commit is saving the data into the db
}
The commit in function2 saves the data, but since it is part of a nested transaction, it should only be committed at the moment the commit happens on the outer transaction.
Am I doing something wrong?
Note: I am calling db.setAutoStartTx(false); so that it will not start transactions automatically.
You should use the same database instance object.
To automate this process (and get a performance speedup) I suggest you use the com.orientechnologies.orient.core.db.OPartitionedDatabasePool class. I always recommend this pool because it minimizes the time needed to acquire a new connection and scales very well on multi-core hardware.
EDIT
Try using db.getRawGraph().activateOnCurrentThread() after the call to function2:
void function1() {
    OrientGraph db = factory.openDatabase(); // gives the active graph on the current thread
    db.begin();
    function2();
    db.getRawGraph().activateOnCurrentThread();
    // save some data
    db.commit();
}

Cloud mutual exclusion: Cloud based concurrency involving read/write to shared data (SQL data)

Background: We have an Azure .NET application where we need to register a single "front end" user with multiple back end providers. Since this registration takes a while, we offload it to a worker role, and there are multiple worker roles. All data is stored in Azure SQL and we're using Entity Framework 5.0 as our ORM. The way we're currently set up, we read from the SQL DB => process in worker role code => write/update to the SQL DB to flag completion. Essentially I need to solve the traditional "multithreaded + shared data writes" problem, but instead of OS scale it's at cloud scale.
Concern: We have a race condition with multiple workers if the first worker takes longer than the visibility timeout. For example, assuming two worker roles, I've marked below how both would read from SQL, think that the processing is still pending, and proceed. The result is a last-writer-wins race condition, and it also creates orphaned and extra accounts on the external service providers.
Question: How can I modify this to take care of this situation elegantly? I can alter the data flow or use a per-user "cloud" lock for mutual exclusion. Without trying to constrain anyone's thinking: in the past I speculated about a SQL-based cloud lock, but couldn't really get it working in EF 5.0. Here I'm open to exploring any answers, SQL-based locks or not.
// Read message off Service Bus Queue
// message invisible for 1 min now, worker must finish in 1 min
BrokeredMessage qMsg = queueClient.Receive();
// Extract UserID Guid from message
Guid userProfileId = DeserializeUserIdFromQMsg(qMsg);
// READ PROFILE FROM SQL
UserProfile up = (from x in myAppDbContext.UserProfiles select x).SingleOrDefault(p => p.UserProfileId == userProfileId);
if (up != null)
{
List<Task> allUserRegTasks = new List<Task>();
string firstName = up.FirstName; // <== WORKER ROLE #2 HERE
string lastName = up.LastName;
string emailAddress = up.Email;
// Step 1: Async register User with provider #1, update db
if (String.IsNullOrEmpty(up.Svc1CustId))
{
// <== WORKER ROLE #1 HERE
Svc1RegHelper svc1RegHelper = new Svc1RegHelper();
Task svc1UserRegTask = svc1RegHelper.GetRegisterTask(userProfileId, firstName, lastName, emailAddress);
svc1UserRegTask.Start(); // <== SQL WRITE INSIDE THIS (marks "up.Svc1CustId")
allUserRegTasks.Add(svc1UserRegTask);
}
// Step 2: Async register User with provider #2, update db
if (String.IsNullOrEmpty(up.Svc2CustId))
{
Svc2RegHelper svc2RegHelper = new Svc2RegHelper();
Task svc2UserRegTask = svc2RegHelper.GetRegisterTask(userProfileId, firstName, lastName, emailAddress);
svc2UserRegTask.Start(); // <== SQL WRITE INSIDE THIS (marks "up.Svc2CustId")
allUserRegTasks.Add(svc2UserRegTask);
}
Task.WaitAll(allUserRegTasks.ToArray());
// Step 3: Send confirmation email to user we're ready for them!
// ...
}
You can put a mutex in blob storage via a blob lease. Put a try/catch around the whole thing, as AcquireLease() will fail if the mutex is held by someone else:
var lockBlobContainer = cloudClient.GetContainerReference("mutex-container");
var lockBlob = lockBlobContainer.GetBlobReference("SOME_KNOWN_KEY.lck");
lockBlob.UploadText(DateTime.UtcNow.ToString(CultureInfo.InvariantCulture)); // creates the mutex file
try
{
    var leaseId = lockBlob.AcquireLease(); // throws if the lease is already held
    try
    {
        // Do stuff
    }
    finally
    {
        lockBlob.ReleaseLease(leaseId);
    }
}
catch (Exception)
{
    // the mutex is held by another worker; back off and retry later
}

Row-Level Update Lock using System.Transactions

I have an MSSQL procedure with the following code in it:
SELECT Id, Role, JurisdictionType, JurisdictionKey
FROM dbo.SecurityAssignment WITH (UPDLOCK, ROWLOCK)
WHERE Id = @UserIdentity
I'm trying to move that same behavior into a component that uses OleDb connections, commands, and transactions to achieve the same result. (It's a security component that uses the SecurityAssignment table shown above. I want it to work whether that table is in MSSQL, Oracle, or DB2.)
Given the above SQL, if I run a test using the following code
Thread backgroundThread = new Thread(
delegate()
{
using (var transactionScope = new TransactionScope())
{
Subject.GetAssignmentsHavingUser(userIdentity);
Thread.Sleep(5000);
backgroundWork();
transactionScope.Complete();
}
});
backgroundThread.Start();
Thread.Sleep(3000);
var foregroundResults = Subject.GetAssignmentsHavingUser(userIdentity);
Where Subject.GetAssignmentsHavingUser runs the SQL above and returns a collection of results, and backgroundWork is an Action that updates rows in the table, like this:
delegate
{
Subject.UpdateAssignment(newAssignment(user1, role1));
}
Then the foregroundResults returned by the test should reflect the changes made in the backgroundWork action.
That is, I retrieve a list of SecurityAssignment table rows that have UPDLOCK, ROWLOCK applied by the SQL, and subsequent queries against those rows don't return until that update lock is released - thus the foregroundResult in the test includes the updates made in the backgroundThread.
This all works fine.
Now, I want to do the same with database-agnostic SQL, using OleDb transactions and isolation levels to achieve the same result. And I can't, for the life of me, figure out how to do it. Is it even possible, or does this row-level locking only apply at the DB level?
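For concreteness, here is a minimal, hedged sketch of the kind of OleDb transaction the question is describing (usings for System.Data and System.Data.OleDb assumed; connectionString and userIdentity are taken from context; whether an isolation level such as RepeatableRead reproduces the UPDLOCK blocking on Oracle and DB2 is precisely the open question):

using (var conn = new OleDbConnection(connectionString))
{
    conn.Open();
    // RepeatableRead asks the backend to hold read locks until commit;
    // the actual locking granularity is up to the provider/database.
    using (OleDbTransaction tx = conn.BeginTransaction(IsolationLevel.RepeatableRead))
    {
        var cmd = new OleDbCommand(
            "SELECT Id, Role, JurisdictionType, JurisdictionKey " +
            "FROM SecurityAssignment WHERE Id = ?", conn, tx);
        cmd.Parameters.AddWithValue("@UserIdentity", userIdentity); // OleDb parameters are positional
        using (OleDbDataReader rdr = cmd.ExecuteReader())
        {
            while (rdr.Read()) { /* materialize the assignments */ }
        }
        // ... work with the rows while the transaction (and any locks) stays open ...
        tx.Commit();
    }
}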