OrientDB - Create Edge using kind of self join - orientdb

I have imported hierarchical data into OrientDB from an RDBMS using OETL. In the RDBMS we stored the parent ID in the same row.
e.g. the table structure is something like this:
ID - Name - Parent_ID
Corp - Corporate Office - Corp
D1 - District Office 1 - Corp
D2 - District Office 2 - Corp
SO1 - Small Office 1 - D1
SO2 - Small Office 2 - D2
SO3 - Small Office 3 - D1
Now each row is a node in OrientDB.
I want to create an edge (ParentOf) from, say, Corp to D1, D1 to SO1, and so on.
How can I write a query to achieve this? Something along the lines of the following?
create edge parentOf from (select from node)a to (select from node
where a.id = parent_id)
Sorry I am still thinking in relational db way.
Orient DB version is orientdb-community-2.0.9

Thanks to pembeci, following is the function which finally did the trick:
var db = orient.getGraph();
var nodes = db.getVerticesOfClass("nodex");
var itr = nodes.iterator();
while(itr.hasNext()) {
var vertex = itr.next();
var id = vertex.getProperty("id");
var children = db.getVertices("parent_id", id);
var childItr = children.iterator();
while(childItr.hasNext()) {
var child = childItr.next();
vertex.addEdge("parentOf", child);
}
}

You can't pass information from the first select query (the from part) to the second query (the to part), so you should probably create a server-side function. Something like this may work (I'll test it later, when I have access to Orient Studio):
var db = orient.getGraph();
// getVerticesOfClass returns an Iterable, so walk it with an iterator
var nodes = db.getVerticesOfClass("node").iterator();
while (nodes.hasNext()) {
  var vertex = nodes.next();
  var id = vertex.getProperty("id");
  // Every vertex whose parent_id equals this id is a child
  var children = db.getVertices("parent_id", id).iterator();
  while (children.hasNext()) {
    vertex.addEdge("parentOf", children.next());
  }
}
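The pairing logic behind such a function can be checked outside OrientDB. The following is a minimal sketch in plain JavaScript, assuming the sample rows from the question are held in an in-memory array; the `rows` array and the `buildParentOfPairs` helper are hypothetical names, not part of any OrientDB API. It groups the rows by parent_id in one pass and skips the root row, whose parent_id points at itself and would otherwise yield a Corp-to-Corp pair.

```javascript
// Sample rows from the question (id, name, parent_id); an in-memory stand-in for the vertices.
const rows = [
  { id: "Corp", name: "Corporate Office", parent_id: "Corp" },
  { id: "D1", name: "District Office 1", parent_id: "Corp" },
  { id: "D2", name: "District Office 2", parent_id: "Corp" },
  { id: "SO1", name: "Small Office 1", parent_id: "D1" },
  { id: "SO2", name: "Small Office 2", parent_id: "D2" },
  { id: "SO3", name: "Small Office 3", parent_id: "D1" },
];

// Hypothetical helper: returns [parentId, childId] pairs, one per parentOf edge.
function buildParentOfPairs(rows) {
  // Group child ids by parent_id in a single pass over the rows.
  const childrenByParent = new Map();
  for (const row of rows) {
    if (row.parent_id === row.id) continue; // the root points at itself; skip it
    if (!childrenByParent.has(row.parent_id)) {
      childrenByParent.set(row.parent_id, []);
    }
    childrenByParent.get(row.parent_id).push(row.id);
  }

  // Emit one pair per (parent, child) relationship, in row order.
  const pairs = [];
  for (const row of rows) {
    for (const childId of childrenByParent.get(row.id) || []) {
      pairs.push([row.id, childId]);
    }
  }
  return pairs;
}

const pairs = buildParentOfPairs(rows);
console.log(pairs);
```

Inside a server-side function, each pair would correspond to one vertex.addEdge("parentOf", child) call.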

The JS function ran forever, so I finally wrote Java code to do this:
OrientGraph graph = new OrientGraph("remote:localhost/testNode", "root", "root");
for (Vertex v : graph.getVerticesOfClass("Node"))
{
    String id = v.getProperty("ID");
    System.out.println("Checking children of: " + id);
    String sql = "select from node where PARENT_ID = '" + id + "'";
    for (Vertex child : (Iterable<Vertex>) graph.command(
            new OCommandSQL(sql))
            .execute())
    {
        v.addEdge("parent_of", child);
        System.out.println("\tAdded 'Parent Of' Edge from: " + id + " to " + child.getProperty("ID"));
    }
    graph.commit();
}

Related

How to avoid the select n + 1 problem in nested entities

Given the following code:
var persons = context.PERSONs.Select(
x =>
new
{
personId = x.PERSON_ID,
personName = x.PERSON_NAME,
items = x.ITEMs.Select(
y =>
new
{
itemID = y.ITEM_ID,
itemName = y.ITEM_NAME,
properties = y.PROPERTies.Select(
z =>
new
{
z.PROPERTY_ID,
z.PROPERTY_NAME
}
)
}
)
}
).ToList();
How can I avoid select n + 1 problems with it? Tried .Include("ITEMs").Include("ITEMs.PROPERTies") but it didn't help. Would expect a single query with 2 left outer joins.
Note - would like a generic answer because I'm working on an OData background where it's hard to craft queries for each entity by hand
-edit-
Database: MS SQL Server
Entity Framework version: 6
Can confirm that all properties are simple mapped properties (ints and strings actually, no functions or computed values)

Make group by in outer join LINQ to entity

I use Entity Framework-6.
I have this code (outer join, LINQ to Entities):
var inspectionSitesConjection = (from st in sites
join ir in inspectionReview on st.Id equals ir.SiteId into g
from subsite in g.DefaultIfEmpty()
select new GeneralReportViewModel
{
siteName = subsite.Site.Name,
address = subsite.Site.Description,
inspectionDate = subsite.DateReview,
siteType = subsite.Site.SiteType.Description,
frequency = subsite.InspectionFrequency.Name,
status = subsite.IsNormal,
}).AsNoTracking();
I need to group by siteName and frequency.
Is it possible to do the group by inside the LINQ query above?
Here is a starting point:
var grouped = inspectionSitesConjection
.GroupBy(item => new { item.siteName, item.frequency });
But note that the result (as you can see in the Queryable.GroupBy documentation) is no longer IQueryable<GeneralReportViewModel> but IQueryable<IGrouping<Key, GeneralReportViewModel>>, where Key is an anonymous type with siteName and frequency properties.
I'm providing this just because you specifically asked. It's not quite clear what you are trying to achieve with that query. Also, once you have decided to use an explicit join, use the joined table instead of the navigation property, and take into account that subsite can be null because of the outer join.
from st in sites
join ir in inspectionReview on st.Id equals ir.SiteId into g
from subsite in g.DefaultIfEmpty()
select new GeneralReportViewModel
{
siteName = st.Name,
address = st.Description,
siteType = st.SiteType.Description,
inspectionDate = subsite.DateReview, // problem if subsite == null
frequency = subsite.InspectionFrequency.Name, // problem if subsite == null
status = subsite.IsNormal, // problem if subsite == null
})

Entity Framework - Select * from Entities where Id = (select max(Id) from Entities)

I have an entity set called Entities which has a field Name and a field Version. I wish to return the object having the highest version for the selected Name.
SQL wise I'd go
Select *
from table
where name = 'name' and version = (select max(version)
from table
where name = 'name')
Or something similar. Not sure how to achieve that with EF. I'm trying to use CreateQuery<> with a textual representation of the query if that helps.
Thanks
EDIT:
Here's a working version using two queries. Not what I want, seems very inefficient.
var container = new TheModelContainer();
var query = container.CreateQuery<SimpleEntity>(
"SELECT VALUE i FROM SimpleEntities AS i WHERE i.Name = 'Test' ORDER BY i.Version desc");
var entity = query.Execute(MergeOption.OverwriteChanges).FirstOrDefault();
query =
container.CreateQuery<SimpleEntity>(
"SELECT VALUE i FROM SimpleEntities AS i WHERE i.Name = 'Test' AND i.Version =" + entity.Version);
var entity2 = query.Execute(MergeOption.OverwriteChanges);
Console.WriteLine(entity2.GetType().ToString());
Can you try something like this?
using(var container = new TheModelContainer())
{
string maxEntityName = container.Entities.Max(e => e.Name);
Entity maxEntity = container.Entities
.Where(e => e.Name == maxEntityName)
.FirstOrDefault();
}
That would select the maximum value for Name from the Entities set first, and then grab the entity from the entity set that matches that name.
I think that, from a simplicity point of view, this should give the same result but faster, as it does not require two round trips through EF to SQL Server. You always want to execute as few queries as possible to keep latency down, and since the Id field is the primary key and indexed, it should perform well.
using(var db = new DataContext())
{
    var maxEntity = db.Entities.OrderByDescending(x=>x.Id).FirstOrDefault();
}
This should be equivalent to the SQL query
SELECT TOP 1 * FROM Entities Order By id desc
So, to include the search term:
string predicate = "name";
using(var db = new DataContext())
{
    var maxEntity = db.Entities
        .Where(x=>x.Name == predicate)
        .OrderByDescending(x=>x.Id)
        .FirstOrDefault();
}
I think something like this..?
var maxVersion = (from t in table
where t.name == "name"
orderby t.version descending
select t.version).FirstOrDefault();
var star = from t in table
where t.name == "name" &&
t.version == maxVersion
select t;
Or, as one statement:
var star = from t in table
let maxVersion = (
from v in table
where v.name == "name"
orderby v.version descending
select v.version).FirstOrDefault()
where t.name == "name" && t.version == maxVersion
select t;
This is the easiest way to get the max:
using (MyDBEntities db = new MyDBEntities())
{
    var maxReservationID = db.LD_Customer.Select(r => r.CustomerID).Max();
}

Performance question about Mongo database

Today I tested the Mongo database, but I ran into a performance issue.
After I inserted 1.800.00, I tried to sum all the values, but it took 57s.
Then I tried the same thing in MSSQL and it took 0s!!
Can you give me any tips on what I'm doing wrong?
Is this a Mongo limitation?
static void Main(string[] args)
{
//Create a default mongo object. This handles our connections to the database.
//By default, this will connect to localhost, port 27017 which we already have running from earlier.
var connStr = new MongoConnectionStringBuilder();
connStr.ConnectTimeout = new TimeSpan(1, 0, 0);
connStr.SocketTimeout = new TimeSpan(1, 0, 0);
connStr.Server = new MongoServerAddress("localhost");
var mongo = MongoServer.Create(connStr);
//Get the blog database. If it doesn't exist, that's ok because MongoDB will create it
//for us when we first use it. Awesome!!!
var db = mongo.GetDatabase("blog");
var sw = new Stopwatch();
sw.Start();
//Get the Post collection. By default, we'll use the name of the class as the collection name. Again,
//if it doesn't exist, MongoDB will create it when we first use it.
var collection = db.GetCollection<Post>("Post");
Console.WriteLine(collection.Count());
sw.Stop();
Console.WriteLine("Time: " + sw.Elapsed.TotalSeconds);
sw.Reset();
sw.Start();
var starting = collection.Count();
var batch = new List<Post>();
for (int i = starting; i < starting + 200000; i++)
{
var post = new Post
{
Body = i.ToString(),
Title = "title " + i.ToString(),
CharCount = i.ToString().Length,
CreatedBy = "user",
ModifiedBy = "user",
ModifiedOn = DateTime.Now,
CreatedOn = DateTime.Now
};
//collection.Insert<Post>(post);
batch.Add(post);
}
collection.InsertBatch(batch);
Console.WriteLine(collection.Count());
sw.Stop();
Console.WriteLine("Time to insert 100.000 records: " + sw.Elapsed.TotalSeconds);
//var q = collection.Find(Query.LT("Body", "30000")).ToList();
//Console.WriteLine(q.Count());
sw.Reset();
sw.Start();
var q2 = collection.AsQueryable<Post>();
var sum = q2.Sum(p => p.CharCount);
Console.WriteLine(sum);
sw.Stop();
Console.WriteLine("Time to sum '" + q2.Count() + "' Post records: " + sw.Elapsed.TotalSeconds); //PROBLEM: take 57 to SUM 1.000.000 records
}
}
The performance issue is in the following line:
var q2 = collection.AsQueryable<Post>();
In the line above you are loading all posts from the posts collection into memory, because the driver does not support LINQ. In MSSQL it takes less than a second because, with LINQ, the calculation is done by the database. Here I guess almost all of the 57 seconds are spent loading the data into memory.
In MongoDB, to achieve the best performance you need to create extra fields (denormalize the data) and calculate any sums, counters, etc. ahead of time whenever possible. If that is not possible, you need to use map/reduce or the available aggregate functions, such as group (a good fit for your example of calculating a sum).
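The advice to precompute sums at write time can be illustrated without a database. Below is a minimal sketch in plain JavaScript, assuming an in-memory array standing in for the Post collection; `PostStore`, `insert`, `scanSum`, and `totalCharCount` are hypothetical names, not MongoDB driver calls. The running total is updated on every insert, so reading the sum never has to walk the stored documents.

```javascript
// In-memory stand-in for a collection that keeps a denormalized running total.
class PostStore {
  constructor() {
    this.posts = [];
    this.totalCharCount = 0; // precomputed sum of CharCount, maintained on every write
  }

  // Store the post and update the running total in the same step.
  insert(post) {
    this.posts.push(post);
    this.totalCharCount += post.CharCount;
  }

  // Full scan, analogous to pulling every document to the client and summing there.
  scanSum() {
    return this.posts.reduce((acc, p) => acc + p.CharCount, 0);
  }
}

const store = new PostStore();
for (let i = 0; i < 1000; i++) {
  const body = i.toString();
  store.insert({ Body: body, Title: "title " + body, CharCount: body.length });
}

// Both values agree, but only scanSum() has to visit every post.
console.log(store.totalCharCount, store.scanSum());
```

In MongoDB itself the same idea would probably take the form of a counter field in a separate summary document, incremented alongside each insert (for instance with an `$inc` update); that mapping is an assumption, not something tested against the code in the question.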

How to get List of all tables in the Entity data framework?

I need to get the list of all tables in the Entity Data Framework.
I know that in Linq2SQL we can use something like this.
var dataContext = new DataContext();
var dataContextTableNames = (from tables in dataContext.Mapping.GetTables()
select tables.TableName).ToList();
But I need to get the list of all tables in the Entity Data Framework. Is there any workaround to get a similar list there?
Thanks in advance.
[Edit]
Perhaps this can be of use to find the number of objects in Storage space
var count = GetEntitySetCount(myObjectContext.MetadataWorkspace);
public static int GetEntitySetCount(MetadataWorkspace workspace)
{
var count = 0;
// Get a collection of the entity containers from storage space.
var containers = workspace.GetItems<EntityContainer>(DataSpace.SSpace);
foreach(var container in containers)
{
//Console.WriteLine("EntityContainer Name: {0} ",
// container.Name);
foreach(var baseSet in container.BaseEntitySets)
{
if(baseSet is EntitySet)
{
count++;
//Console.WriteLine(
// " EntitySet Name: {0} , EntityType Name: {1} ",
// baseSet.Name, baseSet.ElementType.FullName);
}
}
}
return count;
}
To retrieve the number of tables in the database, you can do the following in .Net 4.0
myObjectContext.ExecuteStoreQuery<int>(
"SELECT COUNT(*) from information_schema.tables WHERE table_type = 'base table'");
Using .Net 3.5
var connection = ((EntityConnection)myObjectContext.Connection).StoreConnection as SqlConnection;
var cmd = new SqlCommand("SELECT COUNT(*) from information_schema.tables WHERE table_type = 'base table'", connection);
connection.Open();
var count = (int)cmd.ExecuteScalar();
connection.Close();