How to group by linked vertices - orientdb

I am attempting to build a family tree database with OrientDB (v2.0.2 if it makes a difference). I have a vertex called 'Person' and an edge called 'Child' (so the edge goes out of the parent and in to the child).
For some example data, let's say 'Jane' has had 2 children with each of 2 different partners (so 4 in total). With Jane's rid, I would like to get a list of all her children grouped by their other parent. So the result set I want should be something like
with Bob    with Joe
--------    --------
Alpha       Gamma
Beta        Delta
I can get the children with a simple query, select expand( out('Child') ) from #12:6, but I am stumped on what to do from this point. Also note that in some cases the other parent may not be known; those children should be listed in an additional group with an 'unknown' parent.
I am using the database via pyorient (python), so I could always loop over the results and manually add them to groups in a dict, but this doesn't sound like the proper solution.
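For comparison, the client-side fallback described above might look like the following sketch. It is plain Python with no database calls: the `parents_of` mapping, the names in it, and the helper `group_by_other_parent` are all made up for illustration, and in practice the mapping would be built from whatever the query returns.

```python
from collections import defaultdict

# Hypothetical sample data: each child maps to the names of its known parents.
# In a real application this would be built from query results.
parents_of = {
    "Alpha": ["Jane", "Bob"],
    "Beta": ["Jane", "Bob"],
    "Gamma": ["Jane", "Joe"],
    "Delta": ["Jane", "Joe"],
    "Epsilon": ["Jane"],  # other parent not recorded
}


def group_by_other_parent(parents_of, person):
    """Group the children of `person` by their other parent ('unknown' if none)."""
    groups = defaultdict(list)
    for child, parents in parents_of.items():
        if person not in parents:
            continue  # not a child of `person`
        others = [p for p in parents if p != person]
        key = others[0] if others else "unknown"
        groups[key].append(child)
    return dict(groups)


groups = group_by_other_parent(parents_of, "Jane")
for other_parent, kids in groups.items():
    print(f"with {other_parent}: {', '.join(kids)}")
```

This keeps the 'unknown' group explicit, which is the part that is awkward to express in a single group by.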
Edit: after much trial and error, I have come up with
select unionall($current) as children, $parents as parents from ( select expand(out('Child')) from #12:6 ) let $parents = (select expand(in('Child')) from $current) group by $parents
This gives me a raw result along the lines of
"children": ["#12:0","#12:5"],
"parents": ["#12:1","#12:6"],
Is this as good as it gets? What is the best way to get all these records via the rids, so I can have each person's full info (name, etc.)?

From what you mentioned, I am assuming that you have two vertices called Person and Child.
In that case, what you can do is maintain another edge called Partners or something like that.
Now create an edge from Person to itself. This will help in maintaining record of Partners.
Also, have a Children edge that keeps links from the Person to Child vertices. It is always recommended not to use the default E.
To retrieve the children of a person called 'John' together with all his partners, you can do:
select expand( $f ) let
$a = (select expand(out()) from
(select expand(out('Partners')) from Person where personName='John')),
$b = (select expand(out('Children')) from Person where personName='John'),
$f = UNIONALL( $a, $b )

Related

OrientDB traverse (children) while connected to a vertex and get an other vertex

I'm not sure the title is the best way to phrase it, here's the structure:
Structure
Here's the db json backup if you want to import it to test it: http://pastebin.com/iw2d3uuy
I'd like to get the Dishes eaten by the Humans living in Continent 1 until a _Parent Human moved to Continent 2.
Which means the target is Dish 1 & 2.
If a parent moved to another Continent, I don't want their dish nor the dishes of their children, even if they move back to Continent 1.
I don't know if it matters, but a Human can have multiple children.
If there wasn't the condition about the children of a Human who has moved from the Continent, this query would have worked:
SELECT expand(in('_Is_in').in('_Lives').in('_Eaten_by'))
FROM Continent WHERE continent_id = 1
But I guess here we're forced to use (among other things)
TRAVERSE out('_Parent') FROM Human WHILE
I've tried to use the while of traverse with a subquery to get all the Humans I'm interested in, before trying to get the Dishes, but I'm not even sure we can use while with a subquery.
I hope the structure will help other users to quickly find out if this query is useful to them. If anyone is wondering, I used the Graph tab of OrientDB Studio to make it, along with GIMP.
As a bonus, if anyone knows the Gremlin syntax, it would also be useful to learn it.
Please feel free to edit this post as you see fit and contribute your thoughts :)
SELECT expand(in('_Eaten_by'))
FROM (
  TRAVERSE out('_Parent')
  FROM (SELECT FROM Human WHERE in('_Parent').size() = 0)
  WHILE out('_Lives').out('_Is_in').continent_id = 1
)
Explanation:
TRAVERSE out('_Parent')
FROM (SELECT FROM Human WHERE in('_Parent').size() = 0)
WHILE out('_Lives').out('_Is_in').continent_id = 1
returns Human 1 and 2.
That query traverses Human, starting from Human 1 while the Human is connected to Continent 1.
It starts from in('_Parent').size() = 0 which are the Humans without any _Parent (there's only Human 1 in this case) (size() is the size of the collection of vertices coming in from _Parent).
And SELECT expand(in('_Eaten_by')) FROM
gets the Dishes, starting from the Humans we got from the traversal and going through the edge _Eaten_by.
Note: be sure to always use ' around vertex and edge names, otherwise the names don't seem to be taken into account.
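To see why the WHILE condition also drops the descendants of a Human who moved away, here is a small in-memory sketch of the pruning idea. It does not talk to OrientDB and is not a model of the engine's exact behaviour; the four Humans, their continents, and the helper `traverse_while` are assumptions made up for the illustration.

```python
# Hypothetical data loosely modelled on the question's structure.
children = {1: [2], 2: [3], 3: [4], 4: []}   # _Parent edges: parent -> children
continent = {1: 1, 2: 1, 3: 2, 4: 1}         # human -> continent_id (4 moved back)
dishes = {1: ["Dish 1"], 2: ["Dish 2"], 3: ["Dish 3"], 4: ["Dish 4"]}


def traverse_while(roots, children, keep):
    """Depth-first walk that only visits, and descends from, nodes passing `keep`."""
    visited = []
    seen = set()
    stack = list(reversed(roots))
    while stack:
        node = stack.pop()
        if node in seen or not keep(node):
            continue  # pruned: nothing below this node is reached through it
        seen.add(node)
        visited.append(node)
        stack.extend(reversed(children[node]))
    return visited


# Roots are the humans that are nobody's child (in('_Parent').size() = 0).
has_parent = {c for kids in children.values() for c in kids}
roots = [h for h in children if h not in has_parent]

humans = traverse_while(roots, children, lambda h: continent[h] == 1)
eaten = [d for h in humans for d in dishes[h]]
print(humans, eaten)
```

Human 4 lives in Continent 1 again but is never visited, because the walk stops at Human 3.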

Find most common shared vertices in OrientDB

I'm currently evaluating OrientDB (2.1.16) as a possible solution to building a similarity recommender. To that end, I'd love some help writing an initial query that accomplishes the following:
Vertex:Maker -(Edge:Produced)-> Vertex:Item -(Edge:TaggedBy)-> Vertex:Tag
I'd like to select a particular Item (V1) and get a list back of other Items (Vn) ordered by the number of Tags shared in common with V1;
By extension, I'd like to take a selected Maker (V2) and traverse through Items to get an ordered list of Makers (and the traversed Items, if possible) who share Tags.
There isn't an awful lot of detailed documentation on the application of intersect in this way. No unusual constraints in particular. There would be thousands of Items and Makers and probably 10x that many Tags.
I tried with this little graph example
I used this query
select item.name, count(tag) from (
  select from (
    MATCH {
      CLASS:Item, AS:item, WHERE: (name<>'v1')
    }
    .out("TaggedBy"){AS:tag}
    return item, tag
  ) where tag in (
    select expand(tag) from (
      MATCH {
        CLASS:Item, AS:item, WHERE: (name='v1')
      }.out("TaggedBy"){AS:tag}
      return tag
    )
  )
) group by item order by count desc
and I got this result
Hope it helps.
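The logic of that query (keep the tags of other items that also belong to v1, count them per item, sort descending) can be sketched in plain Python. The item names and tags below are invented sample data, and `rank_by_shared_tags` is a hypothetical helper, not part of any OrientDB client.

```python
# Hypothetical sample graph: item name -> set of tag names.
tags_of = {
    "v1": {"red", "wood", "small"},
    "v2": {"red", "wood"},
    "v3": {"small"},
    "v4": {"metal"},
}


def rank_by_shared_tags(tags_of, target):
    """Return (item, shared_tag_count) pairs, most shared tags first."""
    target_tags = tags_of[target]
    counts = {
        item: len(tags & target_tags)
        for item, tags in tags_of.items()
        if item != target
    }
    # Items with no tag in common are dropped, as the query's inner filter does.
    ranked = [(item, n) for item, n in counts.items() if n > 0]
    # Sort by count descending, then by name so ties come out in a stable order.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


ranking = rank_by_shared_tags(tags_of, "v1")
print(ranking)
```

The same function could be applied to Makers by first merging the tag sets of each Maker's Items.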

OrientDB query for hierarchical data

OrientDB Server v2.0.10.
I am trying to come up with a query for the following scenario.
I have 2 hierarchies: A->B->C and D->E->F
The number of nodes in the hierarchy can change.
A node in the 1st hierarchy can be connected to a node in the other hierarchy using some relation, say 'Assigned'.
What I want is the top-level parent node of the 2nd hierarchy if there is any incoming edge from the 1st hierarchy to any of the nodes in that 2nd hierarchy.
For example, say we have Car-Child->Engine-Child->Piston and Country-Child->State-Child->City
And a relationship Made_In which relates Car or Engine or Piston to either Country or State or City
So if there is a relation with any of Country, State or City, the Country should be returned. For example, Engine1-Made_In->Berlin would return Germany.
Sorry for such a toyish example. I hope it is clear.
Thanks.
You should consider reading the chapter about "traversing" - that should be the missing link to answer your question. You can find it here: http://orientdb.com/docs/last/SQL-Traverse.html
Basically, if you think of your graph as a family tree, you want to achieve 3 things:
Find all children, grandchildren, great-grandchildren (and so on) from tree 1 for a given family member (=Hierarchy1)
Find those who have relations to members of another family tree (=ASSIGNED)
Show me who's on top of this tree (=Hierarchy2)
One of the possible solutions should look a little something like this:
Since you want to end up on top of hierarchy2, you have to start on the other side, i.e. hierarchy1.
[1] Get hierarchy1 (top-to-bottom):
TRAVERSE out("CHILD") FROM Car
[2] Choose all relations:
SELECT expand(out("MADE_IN")) FROM ([1])
[3] From those, go bottom-to-top:
TRAVERSE in("CHILD") FROM ([2])
[4] Who's on top?
SELECT FROM ([3]) WHERE @class="Country"
Combined into one sql, it looks as ugly as this:
SELECT FROM (
  TRAVERSE in("CHILD") FROM (
    SELECT expand(out("MADE_IN")) FROM (
      TRAVERSE out("CHILD") FROM Car
    )
  )
) WHERE @class="Country"
You could replace Car with any #rid in hierarchy1 to get a list of countries it or any part of it was made in.
There might be better solutions for sure. But at least this one should work, so I hope it will help.
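The three hops (down hierarchy1, across the Made_In edges, up hierarchy2) can be checked against a toy graph in plain Python. Everything below is an assumption for illustration: the node names other than Engine1, Berlin and Germany are invented, and the helpers `descendants`, `root_of` and `countries_for` are hypothetical.

```python
# Hypothetical toy graph. CHILD edges are stored parent -> children.
child_edges = {
    "Car1": ["Engine1"], "Engine1": ["Piston1"], "Piston1": [],
    "Germany": ["State1"], "State1": ["Berlin"], "Berlin": [],
}
made_in = {"Engine1": ["Berlin"]}  # Made_In edges from hierarchy1 to hierarchy2

# Reverse lookup child -> parent, used to climb hierarchy2.
parent_of = {c: p for p, kids in child_edges.items() for c in kids}


def descendants(node):
    """The node itself plus everything below it via CHILD edges."""
    found = [node]
    for kid in child_edges.get(node, []):
        found.extend(descendants(kid))
    return found


def root_of(node):
    """Follow CHILD edges upwards until a node without a parent is reached."""
    while node in parent_of:
        node = parent_of[node]
    return node


def countries_for(start):
    """Top-level nodes of hierarchy2 reachable from `start` or any of its parts."""
    targets = [t for part in descendants(start) for t in made_in.get(part, [])]
    return sorted({root_of(t) for t in targets})


print(countries_for("Car1"))
```

Starting from Piston1 instead of Car1 returns nothing here, because only Engine1 carries a Made_In edge and the walk never goes upwards in hierarchy1.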

How does a self join table decide what to display when conditions are "identical"

I have a simple chat table. The chat table has a user_id column and a recipient_id column and a boolean agrees_to_chat column.
What I'd like to do is display the users whom user 1 wants to chat with and who also want to chat with user 1.
(Note that there will be cases where 1 agrees to chat with 2, but 2 has not gone online to signal a preference yet. Obviously in those cases I don't want a chat to show up.)
Here's what I've come up with so far.
SELECT c1.user_id, c1.recipient_id, c2.user_id, c2.recipient_id FROM chats c1, chats c2
WHERE c1.recipient_id = c2.user_id
AND c1.user_id = c2.recipient_id
AND c2.user_id=1
AND c2.agrees_to_chat=true
AND c1.agrees_to_chat=true
For some reason setting c2.user_id = 1 results in what I want: records where user_id = 1, along with people who have agreed to chat listed in the recipient_id column.
However if I set it to c1.user_id=1 I get the same results flipped over. Namely, now my results are still people who have agreed to chat, but now the recipient_id = 1 for all results, and the user_id is the different users.
This matters to me because I want to serve data that shows everyone who's agreed to chat with user 1. But if I decide to reference recipient_id in my code, I need to know that won't change... For example, on my computer I noticed that c2.user_id=1 results in what I want, but in this sql fiddle it seems to be that c1.user_id=1 gets what I need... http://sqlfiddle.com/#!15/799a9/2
So what's going on here? Is there something I'm not understanding about my query? Alternatively is there a better query for what I'm trying to achieve?
You don't need all 4 columns, since you already know 1st and 4th (and 2nd and 3rd) will be equal. Use SELECT c2.user_id, c2.recipient_id FROM ... or SELECT c1.user_id, c1.recipient_id FROM .... In case you actually need several copies of the same column from the self-joined tables, you can give names to them: SELECT c1.user_id AS user_id1, c1.recipient_id AS recipient_id1, c2.user_id AS user_id2, c2.recipient_id AS recipient_id2 FROM ...
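The flip is easy to reproduce with an in-memory SQLite database. The sketch below is an illustration under assumptions: the rows are made-up sample data, agrees_to_chat is stored as 0/1 because SQLite has no boolean type, and only the c1 columns are selected so that the effect of filtering on one alias or the other is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chats (user_id INTEGER, recipient_id INTEGER, agrees_to_chat INTEGER)"
)
# Hypothetical sample rows: 1 and 2 agree mutually, 3 declined, 4 never answered.
conn.executemany(
    "INSERT INTO chats VALUES (?, ?, ?)",
    [(1, 2, 1), (2, 1, 1), (1, 3, 1), (3, 1, 0), (1, 4, 1)],
)

# The join condition is symmetric, so every mutual pair matches twice:
# once with user 1's row as c1 and once with user 1's row as c2.
QUERY = """
SELECT c1.user_id, c1.recipient_id
FROM chats c1
JOIN chats c2
  ON c1.recipient_id = c2.user_id
 AND c1.user_id = c2.recipient_id
WHERE {alias}.user_id = 1
  AND c1.agrees_to_chat = 1
  AND c2.agrees_to_chat = 1
"""

rows_c1 = conn.execute(QUERY.format(alias="c1")).fetchall()
rows_c2 = conn.execute(QUERY.format(alias="c2")).fetchall()
print(rows_c1, rows_c2)
```

Both filters find the same mutual pair. Which alias carries user_id = 1 only decides whether the selected c1 columns show user 1 as the sender or as the recipient, so filtering and selecting on the same alias keeps the column meaning fixed.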

SOQL Query not bringing back parent with no comments

Due to the way this is structured, I can't bring back groups with no comments/feeds in them. Unfortunately, trying to invert the query brings up multiple errors, as CollaborationGroup does not understand the relationship it has with CollaborationGroupFeed.
Here is the query:
SELECT
c.Parent.Id,
c.Parent.OwnerId,
c.Parent.CreatedById,
c.Id,
c.ParentId,
(
SELECT
Id,
FeedItemId,
ParentId
FROM FeedComments
)
FROM CollaborationGroupFeed c
I can't do it like this though for whatever reason:
SELECT
Id,
OwnerId,
CreatedById,
(
SELECT
Id,
ParentId
FROM CollaborationGroupFeeds
),
(
SELECT
Id,
FeedItemId,
ParentId
FROM FeedComments
)
FROM CollaborationGroup
Didn't understand relationship 'CollaborationGroupFeed' in FROM part of query call.
EDIT
So let's say I have a Group that I just created called Foo
[FOO]
Foo has one Post in it BlahPost
[FOO]
|
|_BlahPost
Let's say BlahPost has a comment (or several)
[FOO]
|
|_BlahPost
   |_Comment 1
   |_Comment 2
The query above will return all of this.
Now let's say I have a new Group Bar
[Bar]
Since there are NO posts/comments, the query above returns nothing, because I'm working from child to parent and the parent has no posts. I am looking for a query that starts at the parent CollaborationGroup and moves down to CollaborationGroupFeed, which will display FeedComment.
Does that make more sense? The order is messed up: I'm working from the middle and should be working from the top.
Try using Chatter in Apex, which exposes Chatter REST API resource actions as static methods in the Apex ConnectApi namespace. It's a much easier way to access Chatter data.
http://www.salesforce.com/us/developer/docs/apexcode/Content/connectAPI_overview.htm
http://www.salesforce.com/us/developer/docs/apexcode/Content/apex_classes_connect_api.htm