Fetch a vertex and all its vertices and edges - orientdb

Starting from a vertex, I want to fetch the vertex itself together with all of its outgoing edges and the vertices they lead to. I have this:
select expand(unionall(@this, outE(), out())) from #1:0
This works for edges and vertices but not for the @this element. I get something like this in a field named "value":
{"out_Configured":["#2:0","#2:1"]}
How can I do this?

I tried running your query and it works fine in version 2.0.5.
If the output does not have to be in JSON format, you can try the following query:
select expand( $f ) LET $a = ( select expand(unionall(outE(), out())) from #1:0 ), $b = ( select from #1:0 ), $f = unionall( $a, $b )

ABAP String Templates - Embedded Expressions with GROUP BY

I am referring to this question: LOOP AT... GROUP BY with dynamic group key
I also want to do some dynamic GROUP BY aggregations.
I want to make a GROUP BY statement like the following snippet dynamic:
WHEN 'vkorg'.
  LOOP AT ls_output-merged_data INTO ls_tmp
       GROUP BY ( sales_org      = ls_tmp-sales_org
                  group_quantity = GROUP SIZE )
       ASCENDING ASSIGNING FIELD-SYMBOL(<fs_tmp_2>).
    lt_tmp_dim1 = VALUE #( BASE lt_tmp_dim1 (
      sales_org      = <fs_tmp_2>-sales_org
      group_quantity = <fs_tmp_2>-group_quantity
    ) ).
  ENDLOOP.
Currently I have something similar to this:
LOOP AT ls_output-merged_data INTO ls_tmp
     GROUP BY COND #( WHEN iv_dim1 EQ 'vkorg' THEN |{ ls_tmp-sales_org }| )
     ASCENDING ASSIGNING FIELD-SYMBOL(<total_group>).
  LOOP AT GROUP <total_group> ASSIGNING FIELD-SYMBOL(<line_data>).
    ASSIGN COMPONENT 'vkorg' OF STRUCTURE <line_data> TO FIELD-SYMBOL(<vkorg>).
    EXIT.
  ENDLOOP.
ENDLOOP.
The issue here is that I also need the GROUP SIZE field, but I am not able to put it into the embedded expression, for example: |{ ls_tmp-sales_org GROUP SIZE }|
I tried several positions and syntax variants but could not make it work.
Has anyone done something similar before and can help me out?
Thanks to Sandra it works now:
LOOP AT ls_output-merged_data INTO ls_tmp
     GROUP BY ( sales_org      = COND #( WHEN iv_dim1 EQ 'vkorg' THEN |{ ls_tmp-sales_org }| )
                group_quantity = GROUP SIZE )
     ASCENDING ASSIGNING FIELD-SYMBOL(<total_group>).
ENDLOOP.
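For readers less familiar with LOOP AT ... GROUP BY, the effect of the working version can be sketched in Python: rows are grouped by a field chosen at runtime, and each group carries its size, as GROUP SIZE does. This is only an illustration of the idea, not ABAP semantics in detail; the function `group_sizes`, its `dim` parameter, and the sample rows are hypothetical.

```python
from collections import Counter

def group_sizes(rows, dim):
    """Group rows by the field named in `dim` and report each group's size."""
    counts = Counter(row[dim] for row in rows)
    # Sorted keys mirror the ASCENDING addition of the ABAP loop.
    return [{"key": key, "group_quantity": n} for key, n in sorted(counts.items())]

# Hypothetical sample data standing in for ls_output-merged_data.
merged_data = [
    {"sales_org": "1000", "material": "A"},
    {"sales_org": "2000", "material": "B"},
    {"sales_org": "1000", "material": "C"},
]
result = group_sizes(merged_data, "sales_org")
```

Passing a different field name as `dim` plays the role of iv_dim1 selecting the group key.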

How to write a query that filters traversed elements

I'm struggling with the following query. For a family tree database, I have a vertex 'Person' and a lightweight edge 'Child', so the edge would go out of a parent and into a child (ie 'child-of'). From a person, I need to get their siblings who share the exact same parents.
I can get all of a person's siblings fairly easily, as follows:
SELECT
FROM (
    TRAVERSE out_Child
    FROM (
        SELECT expand(in_Child)
        FROM #11:3
    )
    WHILE $depth <= 1
)
WHERE $depth = 1
So this gets the parents of the person in question, then gets all the children of the parents. The results might look like the following:
@rid     in_Child
#11:2    #11:0
#11:3    #11:0, #11:1
#11:4    #11:0, #11:1
#11:5    #11:1
I need to filter these results though, as I only want records that have the exact same parents as #11:3. So in this instance, the query should only return #11:3 and #11:4. If the query were for #11:5, it should return #11:5 only. So basically, the in_Child fields must be the same.
I've tried all sorts of queries such as the following, but the query either doesn't run or doesn't filter.
SELECT
FROM (
    SELECT
    FROM (
        TRAVERSE out_Child
        FROM (
            SELECT expand(in_Child)
            FROM #11:3
        )
        WHILE $depth <= 1
    )
    WHERE $depth = 1
)
LET $testinChild = (SELECT expand(in_Child) FROM #11:3)
WHERE in_Child CONTAINSALL $testinChild
Ultimately I would prefer not to use any sub-queries, but if they are required then so be it. I also tried to use the traversedElement(0) function, but it only returns the first record traversed (i.e. #11:0, but not #11:1), so it can't be used.
Update:
If you copy-paste the following into orientdb console (change the password etc to suit your setup), you will have the same dataset described above.
create database remote:localhost/persondb root pass memory graph
alter database custom useLightweightEdges=true
create class Person extends V
create property Person.name string
create class Child extends E
create vertex Person set name = "Father"
create vertex Person set name = "Mother"
create vertex Person set name = "Child of father only"
create edge Child from #11:0 to #11:2
create vertex Person set name = "Child of father+mother #1"
create edge Child from #11:0 to #11:3
create edge Child from #11:1 to #11:3
create vertex Person set name = "Child of father+mother #2"
create edge Child from #11:0 to #11:4
create edge Child from #11:1 to #11:4
create vertex Person set name = "Child of mother only"
create edge Child from #11:1 to #11:5
Okay, I've found some solutions.
First of all, the way I used CONTAINSALL in the question is not correct, as was pointed out to me here. CONTAINSALL does not check that all the items on the 'right' are in the 'left'; it loops over each item on the 'left' and evaluates the expression on the 'right' against that item. So WHERE in_Child CONTAINSALL (sex = 'Male') will filter for records where all of the in_Child records are Male (i.e. no females). It's basically checking that in_Child[0:n].sex = 'Male'.
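The semantics described above can be modeled outside the database. The following Python sketch is not OrientDB code; the helper `contains_all` and the sample records (including the `sex` field values) are hypothetical and only illustrate "every item on the left must satisfy the condition on the right".

```python
def contains_all(left, predicate):
    """Model of `left CONTAINSALL (condition)`: every item must satisfy it."""
    return all(predicate(item) for item in left)

# Hypothetical records standing in for the vertices referenced by in_Child.
parents = [
    {"name": "Father", "sex": "Male"},
    {"name": "Mother", "sex": "Female"},
]

# False: one of the two parents is not Male.
all_male = contains_all(parents, lambda p: p["sex"] == "Male")
# True: the only item checked is the father.
only_father = contains_all(parents[:1], lambda p: p["sex"] == "Male")
```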
So I tried this query;
SELECT
FROM (
SELECT
FROM (
TRAVERSE
out('Child')
FROM (
SELECT
expand(in('Child'))
FROM
#11:3
)
WHILE
$depth <= 1
)
WHERE
$depth = 1
)
WHERE
(SELECT expand(in('Child')) from #11:3) CONTAINSALL (@rid in $current.in_Child)
I think OrientDB might have a bug here. The above query returns #11:2, #11:3 and #11:4, which doesn't make sense to me. I changed this query slightly...
SELECT
FROM (
SELECT
FROM (
TRAVERSE
out('Child')
FROM (
SELECT
expand(in('Child'))
FROM
#11:3
)
WHILE
$depth <= 1
)
WHERE
$depth = 1
)
LET
$parents = (SELECT expand(in('Child')) from #11:3)
WHERE
$parents CONTAINSALL (@rid in $current.in_Child)
This works better. The above query correctly returns #11:3 and #11:4, but a query on #11:2 or #11:5 also incorrectly includes both #11:3 and #11:4. This makes sense, because it only checks that the parent rids of e.g. #11:2 (of which there is only one) are among the parents of the others, which they are. So I added a check to ensure they have the same number of parents.
SELECT
FROM (
SELECT
FROM (
TRAVERSE
out('Child')
FROM (
SELECT
expand(in('Child'))
FROM
#11:3
)
WHILE
$depth <= 1
)
WHERE
$depth = 1
)
LET
$parents = (SELECT expand(in('Child')) from #11:3)
WHERE
$parents CONTAINSALL (@rid in $current.in_Child)
AND
$parents.size() = in('Child').size()
Now the query works correctly in all cases. However, I still wasn't happy with it. I abandoned the use of CONTAINSALL and eventually came up with the following...
SELECT
FROM (
SELECT
FROM (
TRAVERSE
out('Child')
FROM (
SELECT
expand(in('Child'))
FROM
#11:3
)
WHILE
$depth <= 1
)
WHERE
$depth = 1
)
LET
$parents = (SELECT expand(in('Child')) from #11:3)
WHERE
in_Child.asSet() = $parents.asSet()
This appears the best/safest, and is the one I will use.
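The idea behind the asSet() comparison, keeping only candidates whose parent set equals the parent set of the person in question, can be checked against the sample dataset with a small Python sketch. It models the edges as plain tuples rather than querying OrientDB; the helper names `parents_of` and `full_siblings` are hypothetical.

```python
# Parent -> child edges from the sample dataset in the question.
edges = [
    ("#11:0", "#11:2"),
    ("#11:0", "#11:3"), ("#11:1", "#11:3"),
    ("#11:0", "#11:4"), ("#11:1", "#11:4"),
    ("#11:1", "#11:5"),
]

def parents_of(rid):
    """Set of parents of a person (the equivalent of in_Child.asSet())."""
    return frozenset(parent for parent, child in edges if child == rid)

def full_siblings(rid):
    """Children of any parent of `rid` whose parent set equals that of `rid`."""
    target = parents_of(rid)
    candidates = {child for parent, child in edges if parent in target}
    return sorted(c for c in candidates if parents_of(c) == target)
```

As in the queries above, the person being queried is included in the result.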
UPDATE for dynamic number of parents :
SELECT
distinct(@rid)
FROM
(SELECT
expand(intersect)
FROM
(SELECT
in('Child').out('Child') as intersect
FROM
#17:2))
LET $parentCount = (SELECT
in('Child').size() as size
FROM
#17:2)
WHERE
in('Child').size() = $parentCount.size[0]

OrientDB Removing one result set from another using the difference() function

We are using version v.1.7-rc2 of OrientDB, embedded in our application, and I'm struggling to figure out a query for removing one set of results from another set of results.
For a simplified example, we have a class of type "A" which is organized in a directional hierarchy. The class has a "name" attribute defined as a string (referring to areas, regions, counties, cities, etc), and a "parent" edge defining a relationship from the child instances to the parent instances.
I was able to find the intersection of the result sets from the two sub-queries of my hierarchy using the intersect() function:
select expand( $1 ) LET $2 = ( select from (traverse in('parent') from (select from A where name = 'Eastern')) where $depth > 0 and name like '%a%' ), $3 = ( select from (traverse in('parent') from (select from A where name = 'Eastern')) where $depth > 0 and name like '%o%' ), $1 = intersect( $2, $3 )
I thought I could accomplish the opposite effect if I used the difference() function:
select expand( $1 ) LET $2 = ( select from (traverse in('parent') from (select from A where name = 'Eastern')) where $depth > 0 and name like '%a%' ), $3 = ( select from (traverse in('parent') from (select from A where name = 'Eastern')) where $depth > 0 and name like '%o%' ), $1 = difference( $2, $3 )
but it returns zero records, when the sub queries for $2 and $3 run separately return record sets that overlap. What am I failing to understand? I've searched the forums and documentation, but haven't figured it out.
In the end, I want to take vertices found in one result set, and remove from it any vertices found in a second result set. I essentially want the analogous behavior of the SQL EXCEPT operator (https://en.wikipedia.org/wiki/Set_operations_%28SQL%29#EXCEPT_operator).
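To make the desired result concrete, here is a Python sketch of EXCEPT-style behavior on two lists of record IDs. It only illustrates the expected outcome and says nothing about how OrientDB's difference() is implemented; the function `except_rids` and the record IDs are hypothetical.

```python
def except_rids(first, second):
    """Keep the records of `first` that do not appear in `second`, in order."""
    excluded = set(second)
    return [rid for rid in first if rid not in excluded]

# Hypothetical record IDs returned by the two sub-queries ($2 and $3).
names_with_a = ["#9:1", "#9:2", "#9:3"]
names_with_o = ["#9:2", "#9:4"]
remaining = except_rids(names_with_a, names_with_o)
```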
Any ideas or directions would be extremely helpful!
Regards,
Andrew

search recursively for dead-ends in topological network table

I've been trying for weeks to figure this out:
I need to recursively search a topological network, OpenStreetMap streets in this case, for dead ends, and neighborhoods that hang from the rest of the network by only one edge. These are places where you might expect to see a no-exit sign if your city is considerate like that.
My table has a record for each edge in the network. Each edge has a 'target' and a 'source' field, identifying the node to which that end of the edge is connected. I've added a boolean column called 'dangling' to indicate whether the edge has been identified as a dead-ending segment. I initialize this column as FALSE, assuming the best.
So far, I've been able to identify simple branching dead ends with the following SQL:
WITH node_counts AS ( -- one row per edge endpoint, so shared nodes repeat
    SELECT target AS node FROM edge_table WHERE NOT dangling
    UNION ALL
    SELECT source AS node FROM edge_table WHERE NOT dangling),
single_nodes AS ( -- select only those that occur once
    SELECT node
    FROM node_counts
    GROUP BY node
    HAVING count(*) = 1
)
UPDATE edge_table SET dangling = true
FROM single_nodes
WHERE node = target OR node = source;
I simply keep running this query until no rows are updated.
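The repeat-until-stable procedure can be sketched in Python on an in-memory edge list. This is a model of the logic only, not a replacement for the SQL; the function `flag_dangling` and the sample edges are hypothetical.

```python
from collections import Counter

def flag_dangling(edges):
    """Return the indexes of edges flagged as dangling by repeated pruning."""
    dangling = set()
    while True:
        # Count endpoint occurrences over the edges not yet flagged.
        counts = Counter()
        for i, (source, target) in enumerate(edges):
            if i not in dangling:
                counts[source] += 1
                counts[target] += 1
        single = {node for node, n in counts.items() if n == 1}
        newly = {
            i for i, (source, target) in enumerate(edges)
            if i not in dangling and (source in single or target in single)
        }
        if not newly:  # nothing changed, same as "no rows updated"
            return dangling
        dangling |= newly

# A triangle 1-2-3 with a two-segment tail 3-4-5 hanging off node 3.
sample_edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)]
flagged = flag_dangling(sample_edges)
```

Only the tail is flagged; the triangle survives because each of its nodes keeps at least two unflagged edges. A cul-de-sac loop attached by a single edge would survive in the same way, which is the limitation described below.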
The result looks like this (red is dangling = true):
http://i.stack.imgur.com/OE1rZ.png
Excellent! This is working great...but there are still cul-de-sac neighborhoods if you will, which are only connected to the larger network by one edge. How can I identify those?
My best guess is that I'm going to need a WITH RECURSIVE at some point, but that's about as far as my unmathematical mind will go. Can anyone point me in the right direction?
OK. Here's how I figured it out:
I decided that there was no way, or at least no easy way, to implement this in SQL alone. I ended up implementing Tarjan's algorithm in PHP and SQL, creating a temporary nodes table that links each node to a strongly connected subcomponent of the graph. Once that was done, I flagged as 'dangling' any segment touching a node that does not belong to the largest subcomponent. All edges that start and end at nodes belonging to the largest subcomponent therefore belong to the main street network (not dangling).
Here's the code. Note that it can take a very long time to run on a large graph. It's also pretty hard on the working memory, but it worked for my purposes.
<?php
$username = '';
$password = '';
$database = '';
$edge_table = 'cincy_segments';
$v1 = 'target';
$v2 = 'source';
$dangling_boolean_field = 'dangling';
$edge_id_field = 'edge_id';
//global variables declared
$index = 0;
$component_index = 0;
$nodes = array();
$stack = array();
pg_connect("host=localhost dbname=$database user=$username password=$password");
// get vertices
echo "getting data from database\n";
$neighbors_query = pg_query("
WITH nodes AS (
SELECT DISTINCT $v1 AS node FROM $edge_table
UNION
SELECT DISTINCT $v2 AS node FROM $edge_table
),
edges AS (
SELECT
node,
$edge_id_field AS edge
FROM nodes JOIN $edge_table
ON node = $v1 OR node = $v2
)
SELECT
node,
array_agg(CASE WHEN node = $v2 THEN $v1
WHEN node = $v1 THEN $v2
ELSE NULL
END) AS neighbor
FROM edges JOIN $edge_table ON
(node = $v2 AND edge = $edge_id_field) OR
(node = $v1 AND edge = $edge_id_field)
GROUP BY node");
// now make the results into php results
echo "putting the results in an array\n";
while($r = pg_fetch_object($neighbors_query)){ // for each node record
$nodes[$r->node]['id'] = $r->node;
$nodes[$r->node]['neighbors'] = explode(',',trim($r->neighbor,'{}'));
}
// create a temporary table to store results
pg_query("
DROP TABLE IF EXISTS temp_nodes;
CREATE TABLE temp_nodes (node integer, component integer);
");
// the big traversal
echo "traversing graph (this part takes a while)\n";
foreach($nodes as $id => $values){
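if(!isset($nodes[$id]['index'])){ // check the global array: $values is a per-iteration copy and never sees the index set by tarjan()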
tarjan($id, 'no parent');
}
}
// identify dangling edges
echo "identifying dangling edges\n";
pg_query("
UPDATE $edge_table SET $dangling_boolean_field = FALSE;
WITH dcn AS ( -- DisConnected Nodes
-- get nodes that are NOT in the primary component
SELECT node FROM temp_nodes WHERE component != (
-- select the number of the largest component
SELECT component
FROM temp_nodes
GROUP BY component
ORDER BY count(*) DESC
LIMIT 1)
),
edges AS (
SELECT DISTINCT e.$edge_id_field AS disconnected_edge_id
FROM
dcn JOIN $edge_table AS e ON dcn.node = e.$v1 OR dcn.node = e.$v2
)
UPDATE $edge_table SET $dangling_boolean_field = TRUE
FROM edges WHERE $edge_id_field = disconnected_edge_id;
");
// clean up after ourselves
echo "cleaning up\n";
pg_query("DROP TABLE IF EXISTS temp_nodes;");
pg_query("VACUUM ANALYZE;");
// the recursive function definition
//
function tarjan($id, $parent)
{
global $nodes;
global $index;
global $component_index;
global $stack;
// mark and push
$nodes[$id]['index'] = $index;
$nodes[$id]['lowlink'] = $index;
$index++;
array_push($stack, $id);
// go through neighbors
foreach ($nodes[$id]['neighbors'] as $child_id) {
if ( !isset($nodes[$child_id]['index']) ) { // if neighbor not yet visited
// recurse
tarjan($child_id, $id);
// find lowpoint
$nodes[$id]['lowlink'] = min(
$nodes[$id]['lowlink'],
$nodes[$child_id]['lowlink']
);
} else if ($child_id != $parent) { // if already visited and not parent
// assess lowpoint
$nodes[$id]['lowlink'] = min(
$nodes[$id]['lowlink'],
$nodes[$child_id]['index']
);
}
}
// was this a root node?
if ($nodes[$id]['lowlink'] == $nodes[$id]['index']) {
$scc = array(); // members of the component rooted at this node
do {
$w = array_pop($stack);
$scc[] = $w;
} while($id != $w);
// record results in table
pg_query("
INSERT INTO temp_nodes (node, component)
VALUES (".implode(','.$component_index.'),(',$scc).",$component_index)
");
$component_index++;
}
return NULL;
}
?>
IMO it is not possible without loop detection (the dangling bit is a kind of breadcrumb loop detection). The query below uses a forking Y-shape leading into two dead-end streets (1..4 and 11..14).
If you add the link between #19 back to #15, the recursion will not stop. (Maybe my logic is incorrect or incomplete?)
DROP SCHEMA IF EXISTS tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;
CREATE TABLE edge_table
( source INTEGER NOT NULL
, target INTEGER NOT NULL
, dangling boolean NOT NULL DEFAULT False
);
INSERT INTO edge_table ( source, target) VALUES
(1,2) ,(2,3) ,(3,4)
,(11,12) ,(12,13) ,(13,14)
,( 15,16) ,(16,17) ,(17,18) ,( 18,19)
-- , (19,15) -- this will close the loop
, (19,1) -- Y-fork
, (19,11) -- Y-fork
;
-- EXPLAIN
WITH RECURSIVE cul AS (
SELECT e0.source AS source
, e0.target AS target
FROM edge_table e0
WHERE NOT EXISTS ( -- no way out ...
SELECT * FROM edge_table nx
WHERE nx.source = e0.target
)
UNION ALL
SELECT e1.source AS source
, e1.target AS target
FROM edge_table e1
JOIN cul ON cul.source = e1.target
WHERE 1=1
AND NOT EXISTS ( -- Only one incoming link; no *other* way to cul
SELECT * FROM edge_table nx
WHERE nx.target = cul.source
AND nx.source <> e1.source
)
)
SELECT * FROM cul
;
[ the CTE is of course intended to be used in an update statement to set the dangling fields ]

OrientDB SQL - traverse while keeping edges weight

Let's assume the following model in an OrientDB graph:
I have a Profile vertex.
Profiles are connected with 2 edges: Liked and Commented. Both edges have a "value" field indicating the count of the action (or the "weight" of the edge).
So, if user A commented 3 times on posts by user B there will be a Commented edge from user A to user B with value = 3.
Now, say I want to get all the users that interacted with user B (either liked or commented), sorted by the weight of the interaction.
I can do that with the following SQL:
select * from (traverse out from
(select out, sum(value) as value from
(traverse * from (select from Profile where username="B") while $depth < 3)
where @class="Liked" or @class="Commented" group by out order by value desc)
while $depth < 2 ) where @class="Profile"
But what if I want to know also the weight of the interaction? How do I propagate up the "value" while doing the last traverse?
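To make the desired output concrete, here is a Python sketch of the aggregation: one row per interacting profile together with its summed weight, sorted by weight. It models the edges as tuples and is not OrientDB code; the function `interaction_weights` and the sample edges are hypothetical.

```python
from collections import defaultdict

# Hypothetical edges pointing at profile "B": (edge class, out profile, value).
edges_into_b = [
    ("Liked", "A", 2),
    ("Commented", "A", 3),
    ("Liked", "C", 1),
]

def interaction_weights(edges):
    """Sum `value` per source profile and sort by total weight, descending."""
    totals = defaultdict(int)
    for _edge_class, out_profile, value in edges:
        totals[out_profile] += value
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

weights = interaction_weights(edges_into_b)
```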
Edit
According to the suggestion, a simplified version of this query will be:
select expand(out) from (
select out, sum(value) as value from (
select expand(inE("Liked", "Commented")) from Profile
where username="B"
) group by out order by value desc
)
But I still can't find a way to use LET to insert the value into the outer expanded object. $parent does not seem to point to the object that is expanded in the outermost select.
Edit 2
I'm playing with $parent in every way I can think of, and I don't see how it can be used in this case. Again, the problem I'm trying to solve is how to pass sum(value) to the outer result set. I don't see a way of using LET for it when doing a GROUP BY, and I also don't see a way of using LET when the outermost select does an expand (since you can't combine other projections with expand).
Also, the results of using $current do not seem to be what is expected. For example, the following query:
select expand($v) from
(select from
(select expand(inE("Liked", "Commented")) from Profile where @rid=#11:0)
let $v = $current
)
Returns this:
{
  "result" : [{
      "@type" : "d",
      "@rid" : "#14:4",
      "@version" : 2,
      "@class" : "Commented",
      "value" : 1,
      "out" : "#11:165",
      "in" : "#11:0"
    }, {
      "@type" : "d",
      "@rid" : "#14:4",
      "@version" : 2,
      "@class" : "Commented",
      "value" : 1,
      "out" : "#11:165",
      "in" : "#11:0"
    }, {
      "@type" : "d",
      "@rid" : "#14:4",
      "@version" : 2,
      "@class" : "Commented",
      "value" : 1,
      "out" : "#11:165",
      "in" : "#11:0"
    }
  ]
}
The same node over and over again, instead of all the edges, which is what I would expect.
I see you're using an old version of OrientDB. With more recent versions you can simplify it. For reference, here is the original query:
select * from (
traverse out from (
select out, sum(value) as value from (
traverse * from (
select from Profile where username="B"
) while $depth < 3
) where @class="Liked" or @class="Commented" group by out order by value desc
) while $depth < 2
) where @class="Profile"
You could skip some steps by using out()/in()/both() and passing the edge labels/classes, like:
select expand( out(["Liked","Commented"]) ) from Profile where username="B"
However to pass the value you can use variables with LET clause. Example:
select from XXX let $parent.a = value
In this way you set the variable "a" into the upper level context, but you could do also:
select from XXX let $parent.$parent.a = value
To set it 2 levels up.
I haven't tried this with a group by yet, but you should be able to group the result using a sub-query. This works for me, where prop1, prop2, prop3, etc. are properties of the vertices the edges come out of (the columns resulting from select out() from Profile where username="B"):
select outV().prop1, outV().prop2, outV().prop3, value from (
select expand(inE("Liked", "Commented")) from Profile
where username="B"
) order by value desc