Graph traversal based on value of property of Edge | OrientDB

I have a graph which saves the topology of a firm's cloud infrastructure. It consists of vertices of class Object (machines) and edges of class link (denoting how machines are linked to each other; these links keep changing over time, as one machine can connect to different machines depending on need).
create class Object extends V
create class link extends E
Object vertices denote machines and have properties that store the machine's config.
create property Object.ram long
create property Object.mem long
create property Object.cpu_core long
"link" edges links machine together to form a topology.It is a outbound edge from one machine to other. It has two time based property startTime which denotes the when this edge was created between two machines and endTime which initially is infinity and is update if we need to close this edge denoting that two machine and now no more linked.
create property link.startTime long
create property link.endTime long
create property link.l_id string
I want to traverse from an Object (vertex) to all its connected Objects, imposing a condition on a link (edge) property while traversing: for a given time t, it should hold that startTime <= t <= endTime.
For example, see the topology in the linked image.
I want to traverse down from the node with rid #21:0 to all connected vertices over link edges whose startTime > 1488965393 and endTime < 1498965393. I am quite new to OrientDB and not able to figure out a possible SQL command for this.
Pseudo query (looking for something similar if possible):
SELECT * FROM (TRAVERSE out('link') FROM #21:0 while startTime > 1488965393 and endTime < 1498965393)

I was able to form a query that produced the desired output:
SELECT id FROM (TRAVERSE out() FROM #21:0 WHILE (startTime <= 1488965393 AND endTime >= 1488965393 AND @class = 'link')) WHERE @class = 'Object'
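One caveat: TRAVERSE evaluates the WHILE clause against every record it visits, and out() steps from vertex to vertex without visiting the link records themselves. If the query above does not return the connected vertices on your OrientDB version, an equivalent that walks the edges explicitly would be the following (a sketch along the same lines, untested):
SELECT id FROM (TRAVERSE outE('link'), inV() FROM #21:0 WHILE (@class = 'Object' OR (startTime <= 1488965393 AND endTime >= 1488965393))) WHERE @class = 'Object'
Here the WHILE lets every Object vertex pass and only prunes link edges whose [startTime, endTime] interval does not contain the query time.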

Related

SQL query to extract default (initial) value of a value property

I am trying to create a custom template fragment that builds a table of value properties. I started by creating a SQL query fragment that pulls all properties classified by a Value Type. Now I would like to pull in the default (initial) value assigned. I figured out that it's in the Description column of t_xref, with the property guid in the Client field, but I don't know how to write a query that will reliably parse the default value out, since the string length may differ depending on other values set. I tried using the template content selector first, but I couldn't figure out how to filter to only value properties. I'm still using the default .qeax file but will be migrating to a Windows-based DBMS soon. Appreciate any help!
Tried using the content selector. Successfully built a query to get value properties but got stuck trying to join and query t_xref for default value.
Edited to add current query and image
Value Properties are block properties that are typed to Value Types. I'm using SysML.
This is my current query (I am no SQL expert!). I don't pull anything from t_xref yet, but I am pulling out only the value properties:
SELECT property.ea_guid AS CLASSGUID, property.Object_Type AS CLASSTYPE, property.Name, property.Note as [Notes], classifier.Name AS TYPE
FROM t_object property
LEFT JOIN t_object classifier ON property.PDATA1 = classifier.ea_guid
LEFT JOIN t_object block on property.ParentID = block.Object_ID
WHERE block.Object_ID = #OBJECTID# AND property.Object_Type = 'Part' AND classifier.Object_Type = 'DataType'
ORDER BY property.Name
I guess that Geert will come up with a more elaborate answer, but (assuming you are after the Run State) here are some details. The values for these Run States are stored in t_object.runstate in one of the crude Sparxian formats. You find something like
#VAR;Variable=v1;Value=4711;Op==;#ENDVAR;
where v1 is the name and 4711 is the default value in this example. How you can marry that with your template? Not the faintest idea :-/
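If you want to extract the value in SQL rather than in the template, the same charindex/substring approach shown below for t_xref should also work against t_object.runstate. A sketch for SQL Server, untested, assuming a single #VAR block per runstate:
select
    substring(
        runstate,
        charindex('Value=', runstate) + 6,
        charindex(';', runstate, charindex('Value=', runstate)) - charindex('Value=', runstate) - 6
    ) as default_value
from t_object
where runstate like '%Value=%'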
I can't give a full answer to the original question as I can't reproduce your data, but I can provide an answer for the generic problem of "how to extract data through SQL from the name-value pair in t_xref".
Note, this is heavily dependent on the database used. The example below extracts fully qualified stereotype names from t_xref in SQL Server for custom profiles.
select
    substring(
        t_xref.Description,
        charindex('FQName=', t_xref.Description) + 7,
        charindex(';ENDSTEREO', t_xref.Description, charindex('FQName=', t_xref.Description))
            - charindex('FQName=', t_xref.Description) - 7
    ),
    Description
from t_xref
where t_xref.Description like '%FQName%'
This works using:
substring(string, start, length)
The string is the xref description column, and the start and length are set using:
charindex(substring, string, [start position])
This finds the start and end tags within the xref description field, for the data you're trying to parse.
For your data, I imagine something like the below is the equivalent (I haven't tested this). It's then a case of combining it with the query you've already got.
select
    substring(
        t_xref.Description,  -- the string to search in
        charindex('#VALU=', t_xref.Description,
            charindex('#NAME=default', t_xref.Description)) + 6,  -- the start position: the first #VALU= tag after #NAME=default
        charindex('#ENDVALU;', t_xref.Description,
            charindex('#VALU=', t_xref.Description,
                charindex('#NAME=default', t_xref.Description)))
            - charindex('#VALU=', t_xref.Description,
                charindex('#NAME=default', t_xref.Description)) - 6  -- the length: the first #ENDVALU; tag after the start, minus the start position
    ),
    Description
from t_xref
where t_xref.Description like '%#NAME=default%'  -- filter out rows without the tag to avoid "out of range" index errors
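To combine this with the query you already have, the join should go through t_xref.Client, which (as you noted) holds the property's guid. A sketch, untested; it assumes the default value lives in an xref row whose Client field equals the property's ea_guid:
select
    property.ea_guid AS CLASSGUID,
    property.Name,
    substring(
        x.Description,
        charindex('#VALU=', x.Description, charindex('#NAME=default', x.Description)) + 6,
        charindex('#ENDVALU;', x.Description, charindex('#VALU=', x.Description, charindex('#NAME=default', x.Description)))
            - charindex('#VALU=', x.Description, charindex('#NAME=default', x.Description)) - 6
    ) AS DefaultValue
from t_object property
left join t_xref x
    on x.Client = property.ea_guid
    and x.Description like '%#NAME=default%'
where property.Object_Type = 'Part'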

How to do pattern matching using match_recognize in Esper for unknown properties of events in event stream?

I am new to Esper and I am trying to filter event properties from event streams that have multiple events arriving at high velocity.
I am using Kafka to send CSV rows from producer to consumer, and at the consumer I convert each row to a HashMap and create events at run-time in Esper.
For example, I have the events listed below, arriving every 1 second:
WeatherEvent Stream:
E1 = {hum=51.0, precipi=1, precipm=1, tempi=57.9, icon=clear, pickup_datetime=2016-09-26 02:51:00, tempm=14.4, thunder=0, windchilli=, wgusti=, pressurei=30.18, windchillm=}
E2 = {hum=51.5, precipi=1, precipm=1, tempi=58.9, icon=clear, pickup_datetime=2016-09-26 02:55:00, tempm=14.5, thunder=0, windchilli=, wgusti=, pressurei=31.18, windchillm=}
E3 = {hum=52, precipi=1, precipm=1, tempi=59.9, icon=clear, pickup_datetime=2016-09-26 02:59:00, tempm=14.6, thunder=0, windchilli=, wgusti=, pressurei=32.18, windchillm=}
Where E1, E2...EN are multiple events in WeatherEvent
In the above events I just want to filter out properties like hum, tempi, tempm and pressurei, because they change as time proceeds (over the 4 seconds), and I don't want to care about the properties which are not changing at all or are changing really slowly.
Using the EPL query below I am able to filter out properties like tempm, hum, etc.:
@Name('Out') select * from WeatherEvent.win:time(10 sec)
match_recognize (
    partition by pickup_datetime?
    measures A.tempm? as a_temp, B.tempm? as b_temp
    pattern (A B)
    define
        B as Math.abs(B.tempm? - A.tempm?) > 0
)
The problem is I can only do this when I specify tempm or hum in the query for pattern matching.
But as the data comes from a CSV with many dimensions/features, I don't know the properties of the events beforehand.
I want Esper to automatically detect (at run-time) which features/properties are changing, and filter on those, without me specifying the event properties.
Any ideas how to do it? Is that even possible with Esper? If not, can I do it with another CEP engine like Siddhi or Oracle CEP?
You may add a "?" to the event property name to get the value of properties that are not known at the time the event type is defined. These are called dynamic properties; see the documentation. The type returned is Object, so you need to downcast.
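For example, a dynamic property can be read and cast inline in EPL. A minimal sketch, where tempm is just one of your CSV columns:
select cast(tempm?, double) as tempm_val from WeatherEvent where cast(tempm?, double) > 14.0
For a truly unknown set of columns, however, EPL still needs the property names at statement-creation time. A common workaround is to inspect the key set of one HashMap event in Java and generate one statement like the above per property you discover.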

Titan - Cassandra: Process entire set of vertices of a given type without running out of memory

I'm new to Titan and looking for the best way to iterate over the entire set of vertices with a given label without running out of memory. I come from a strong SQL background so I am still working on switching my way of thinking away from SQL-type thinking. Let's say I have 1 million profile vertices. I would like to iterate over each one and perform some type of statistical analysis of the information linked to each profile. I don't really care how long the entire analysis process takes, but I need to iterate over all of the profiles. In SQL I would do SELECT * FROM MY_TABLE, using a scroll-sensitive result, fetch the next result, grab and process the info linked to that row, then fetch the next result. I also don't care if the result is real-time accurate as it is just for gathering general stats, so if a new profile is added during iteration and I miss it, that's ok.
Even if there is a way to grab all the values for a given property, that would probably work too because then I could go through that list and grab each vertex by its ID for example.
I believe Titan does lazy loading, so you should be able to just iterate over the whole graph:
GraphTraversal<Vertex, Vertex> it = graph.traversal().V();
while (it.hasNext()) {
    Vertex v = it.next();
    // Do what you want here
}
Another option would be to use the range step so that you explicitly choose the range of vertices you need. For example:
List<Vertex> vertices = graph.traversal().V().range(0, 3).toList();
//Do what you want with your batch of vertices.
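If even a lazy full scan holds too much state, you can also page through the vertices yourself by sliding the range window. A sketch; the batch size of 1000 is arbitrary:
// Process all vertices in fixed-size batches using the range step
int batchSize = 1000;
long offset = 0;
List<Vertex> batch;
do {
    batch = graph.traversal().V().range(offset, offset + batchSize).toList();
    for (Vertex v : batch) {
        // Do what you want with each vertex here
    }
    offset += batchSize;
} while (!batch.isEmpty());
Be aware that range-based paging typically re-walks the skipped results on every call, so it gets slower as the offset grows; the lazy iterator above avoids that cost.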
With regards to getting vertices of a specific type, you can query vertices based on their properties. For example, if you have an internal property "TYPE" which defines the type you are interested in, you can query for those vertices with:
graph.traversal().V().has("TYPE", "A"); //Gets vertices of type A
graph.traversal().V().has("TYPE", "B"); //Gets vertices of type B

Converting complex query with inner join to Tableau

I have a query like this, which we use to generate data for our custom dashboard (a Rails app):
SELECT AVG(wait_time) FROM (
SELECT TIMESTAMPDIFF(MINUTE,a.finished_time,b.start_time) wait_time
FROM (
SELECT max(start_time + INTERVAL avg_time_spent SECOND) finished_time, branch
FROM mytable
WHERE name IN ('test_name')
AND status = 'SUCCESS'
GROUP by branch) a
INNER JOIN
(
SELECT MIN(start_time) start_time, branch
FROM mytable
WHERE name IN ('test_name_specific')
GROUP by branch) b
ON a.branch = b.branch
HAVING avg_time_spent between 0 and 1000)t
GROUP BY week
Now I am trying to port this to Tableau, and I am not able to find a way to represent this data in Tableau. I am stuck on how to represent the inner GROUP BY in a calculated field. I could also try to just use a custom SQL data source, but I am already using another data source.
Columns in mytable:
start_time
avg_time_spent
name
branch
status
I think this could be achieved with the new Level of Detail expressions, but unfortunately I am stuck on version 8.3.
Save custom SQL for rare cases. This doesn't look like a rare case. Let Tableau generate the SQL for you.
If you simply connect to your table, then you can usually write calculated fields to get the information you want. I'm not exactly sure why you have test_name in one part of your query but test_name_specific in another, so ignoring that, here is a simplified approach to a similar query.
If you define a calculated field called worst_case_test_time as
DATEDIFF('minute', MIN([start_time]), DATEADD('second', MAX([avg_time_spent]), MAX([start_time])))
that seems close to what your original query computes.
It would help if you explained what exactly you are trying to compute. It appears to be some sort of worst case bound for avg test time. There may be an even simpler formula, but its hard to know without a little context.
You could filter on status = 'SUCCESS' and avg_time_spent < 1000, and place branch and WEEK(start_time) on, say, the Rows and Columns shelves.
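For the filter itself, one option is a boolean calculated field that mirrors the WHERE/HAVING conditions of the original query; a sketch (drop it on the Filters shelf and keep True):
// Matches status = 'SUCCESS' plus the 0..1000 bound from the original HAVING clause
[status] = 'SUCCESS' AND [avg_time_spent] >= 0 AND [avg_time_spent] <= 1000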
P.S. Your query seems a little off. Don't you need an aggregation function like MAX or AVG after the HAVING keyword?

How do I implement object-persistence not involving loading to memory?

I have a Graph object (this is in Perl) for which I compute its transitive closure (i.e. for solving the all-pairs shortest paths problem).
From this object, I am interested in computing:
Shortest path from any vertices u -> v.
Distance matrix for all vertices.
General reachability questions.
General graph features (density, etc).
The graph has about 2000 vertices, so computing the transitive closure (using Floyd-Warshall's algorithm) takes a couple hours. Currently I am simply caching the serialized object (using Storable, so it's pretty efficient already).
My problem is, deserializing this object still takes a fair amount of time (a minute or so), and consumes about 4GB of RAM. This is unacceptable for my application.
Therefore I've been thinking about how to design a database schema to hold this object in 'unfolded' form. In other words, precompute the all-pairs shortest paths, and store those in an appropriate manner. Then, perhaps use stored procedures to retrieve the necessary information.
My other problem is, I have no experience with database design, and have no clue about implementing the above, hence my post. I'd also like to hear about other solutions that I may be disregarding. Thanks!
To start with, it sounds like you need two entities, vertex and edge, and perhaps a couple of tables for results. I would suggest a table that stores node-to-node information. If A is reachable from Y, the relationship gets the reachable attribute. So here goes:
Vertex:
any coordinates (x,y,...)
name: string
any attributes of a vertex*
Association:
association_id: ID
association_type: string
VertexInAssociation:
vertex: (constrained to Vertex)
association: (constrained to association)
AssociationAttributes:
association_id: ID (constrained to association)
attribute_name: string
attribute_value: variable -- possibly string
* You might also want to store vertex attributes in a table as well, depending on how complex they are.
The reason that I'm adding the complexity of Association is that an edge is not felt to be directional and it simplifies queries to consider both vertexes to just be members of a set of vertexes "connected-by-edge-x"
Thus an edge is simply an association of edge type, which would have an attribute of distance. A path is an association of path type, and it might have an attribute of hops.
There might be other more optimized schemas, but this one is conceptually pure--even if it doesn't make the first-class concept of "edge" a first class entity.
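Rendered as DDL, that schema might look like the following. A sketch in generic SQL; the types and key choices are my assumptions:
-- Vertices and their scalar attributes
CREATE TABLE Vertex (
    vertex_id INTEGER PRIMARY KEY,
    name      VARCHAR(255),
    x         DOUBLE PRECISION,
    y         DOUBLE PRECISION
);
-- An association groups vertices; its type distinguishes edges from paths
CREATE TABLE Association (
    association_id   INTEGER PRIMARY KEY,
    association_type VARCHAR(32) NOT NULL  -- e.g. 'edge', 'path'
);
-- Membership of a vertex in an association
CREATE TABLE VertexInAssociation (
    association_id INTEGER NOT NULL REFERENCES Association(association_id),
    vertex_id      INTEGER NOT NULL REFERENCES Vertex(vertex_id),
    PRIMARY KEY (association_id, vertex_id)
);
-- Name/value attributes of an association, e.g. distance or hops
CREATE TABLE AssociationAttributes (
    association_id  INTEGER NOT NULL REFERENCES Association(association_id),
    attribute_name  VARCHAR(64) NOT NULL,
    attribute_value VARCHAR(255),
    PRIMARY KEY (association_id, attribute_name)
);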
To create a minimal edge you would need to do this:
begin transaction
select associd = max(association_id) + 1 from Association
insert into Association ( association_id, association_type )
values ( associd, 'edge' )
insert into VertexInAssociation ( association_id, vertex_id )
select associd, ? -- $vertex->[0]->{id}
UNION select associd, ? -- $vertex->[1]->{id}
insert into AssociationAttributes ( association_id, attribute_name, attribute_value )
select associd, 'length', 1
UNION select associd, 'distance', ? -- $edge->{distance}
commit
You might also want to make association types classes of sorts, so that the "edge" association automatically gets counted as a "reachable" association. Otherwise, you might want to insert UNION select associd, 'reachable', 'true' in there as well.
And then you could query the union of reachable associations of both vertexes, insert them as reachable associations on the other node where they do not already exist, and store the existing length attribute value + 1 into the length attribute.
However, you'd probably want an ORM for all of that, and just manipulate it from inside the Perl.
my $v1 = Vertex->new( 'V', x => 23, y => 89, red => 'hike!' );
my $e = Edge->new( $v1, $v2 ); # perhaps Edge knows how to calculate distance.