gremlin query if-else-then in the graph traversal - titan

I have been trying to figure out how to express this if-else in a Gremlin query.
Assume g.V({0}) is the group vertex below.
var q = "g.V({0}).as('groupName', 'groupId', 'ownerId').inE(eIsAdminOf, eIsMemberOf).as('rel', 'joinDate').outV().hasLabel(userLabel).as('memberId')";
//TODO:var q = "g.V({0}).as('groupName', 'groupId', 'ownerId').inE(eIsAdminOf";
//if .has('mCanList',true).inE(eIsAdminOf, eIsMemberOf)
//if .has('mCanList',false).inE(eIsAdminOf)
//, eIsMemberOf).as('rel', 'joinDate').outV().hasLabel(userLabel).as('memberId')";
I want the .inE(eIsAdminOf, eIsMemberOf) to be based on property value mCanList of true or false as in the comments above.
I have been trying choose() to no avail:
var q = "g.V({0}).as('groupName', 'groupId', 'ownerId','mCanList')";
q += ".by(values('mCanList').choose(is(true),.inE(eIsAdminOf, eIsMemberOf), .inE(eIsAdminOf))";
q += ".as('rel', 'joinDate').outV().hasLabel(userLabel).as('memberId')";
I am using node.js to build the gremlin query with the gremlin library. The worst option for me is to build 2 separate async queries which build the results separately based on
if .has('mCanList',true).inE(eIsAdminOf, eIsMemberOf) or
if .has('mCanList',false).inE(eIsAdminOf)
TIA

I'm not sure I follow the reasoning behind all the step labels, so I've mostly omitted them to demonstrate choose(), which seems to be the focus of your question. I approximated your graph structure from your description of the problem (for future questions, please consider providing sample graph creation code that can be cut and pasted into a Gremlin Console session). In any case, here's what I think you need:
gremlin> g = TinkerGraph.open().traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.addV(id,1).property("mCanList",true).as('a').
......1> addV(id,2).as('b').
......2> addV(id,3).as('c').
......3> addE("isAdminOf").from("b").to("a").
......4> addE("isMemberOf").from("c").to("a").iterate()
gremlin> g.V(1).choose(has('mCanList',true),inE("isAdminOf","isMemberOf"),inE("isAdminOf"))
==>e[1][2-isAdminOf->1]
==>e[2][3-isMemberOf->1]
gremlin>
gremlin> g.V(1).property('mCanList',false)
==>v[1]
gremlin> g.V(1).choose(has('mCanList',true),inE("isAdminOf","isMemberOf"),inE("isAdminOf"))
==>e[1][2-isAdminOf->1]
If I try to directly edit your Gremlin I think your traversal basically just needs to look like this:
var q = "g.V({0})";
q += ".choose(has('mCanList',true),inE(eIsAdminOf, eIsMemberOf), inE(eIsAdminOf))";
q += ".outV().hasLabel(userLabel)";
I presume that "eIsAdminOf", "eIsMemberOf" and "userLabel" are JS variables - if not they would need quotes around them if they happen to be actual label names. Again, I'm not clear on what you were doing with all the uses of as() - none of that seemed relevant to your traversal based on your question.
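Since the question assembles the query as a string in node.js, the choose() version could be built along these lines. This is only a minimal sketch: `quote` and `buildMembersQuery` are hypothetical helper names, the label values are made up, and it assumes the edge and vertex labels are plain strings that need quoting inside the Gremlin script. The `{0}` placeholder from the question is left untouched.

```javascript
// Hypothetical helper: wraps a label in single quotes for use inside a Gremlin script.
function quote(label) {
  return "'" + String(label).replace(/\\/g, "\\\\").replace(/'/g, "\\'") + "'";
}

// Hypothetical helper: builds the choose()-based traversal string from the answer.
// Assumption: the labels are plain strings, not bindings or Gremlin variables.
function buildMembersQuery(adminLabel, memberLabel, userLabel) {
  const admin = quote(adminLabel);
  const member = quote(memberLabel);
  return "g.V({0})" +
    ".choose(has('mCanList',true)," +
    "inE(" + admin + "," + member + ")," +
    "inE(" + admin + "))" +
    ".outV().hasLabel(" + quote(userLabel) + ")";
}

console.log(buildMembersQuery("isAdminOf", "isMemberOf", "user"));
```

Building one string this way keeps the branching inside the traversal, so a single query is sent instead of two separate async queries.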

JanusGraph indexing in Scala

I am using Spark to make a JanusGraph from a data stream, but am having issues indexing and creating properties. I want to create an index by a vertex property called "register_id". I am not sure I'm doing it the right way.
So, here's my code:
var gr1 = JanusGraphFactory.open("/Downloads/janusgraph-cassandra.properties")
gr1.close()
// This is done to clear the graph made in every run.
JanusGraphFactory.drop(gr1)
gr1 = JanusGraphFactory.open("/Downloads/janusgraph-cassandra.properties")
var reg_id_prop = gr1.makePropertyKey("register_id").dataType(classOf[String]).make()
var mgmt = gr1.openManagement()
gr1.tx().rollback()
mgmt.buildIndex("byRegId", classOf[Vertex]).addKey(reg_id_prop).buildCompositeIndex()
When I run the above, I get an error saying:
"Vertex with id 5164 was removed".
Also, how do I check in Scala whether the graph contains vertices with a certain property? I know that in Gremlin g.V().has('name', 'property_value') works, but I can't figure out how to do this in Scala. I tried Gremlin-Scala but can't find the equivalent.
Any help will be appreciated.

You should be using the mgmt object to build the schema, not the graph object. You also need to call mgmt.commit() to persist the schema updates.
gr1 = JanusGraphFactory.open("/Downloads/janusgraph-cassandra.properties")
var mgmt = gr1.openManagement()
var reg_id_prop = mgmt.makePropertyKey("register_id").dataType(classOf[String]).make()
mgmt.buildIndex("byRegId", classOf[Vertex]).addKey(reg_id_prop).buildCompositeIndex()
mgmt.commit()
Refer to the indexing docs from JanusGraph.
For your second question on checking for the existence of a vertex using the composite index, you need to finish your traversal with a terminal step. For example, in Java, this would return a boolean value:
g.V().has('name', 'property_value').hasNext()
Refer to iterating the traversal docs from JanusGraph.
Reading over the gremlin-scala README, it looks like it has a few options for terminal steps that you could use like head, headOption, toList, or toSet.
g.V().has('name', 'property_value').headOption
You should also check out the gremlin-scala-examples and the gremlin-scala traversal specification.

Graph Traversal & Filtering with Gremlin using OrientDB

Group[code=a]->Choice[selected=true]
Group[code=a]->Choice[selected=false]
Group[code=a]->Choice[selected=false]
Group[code=b]->Choice[selected=false]
Group[code=b]->Choice[selected=false]
Group[code=c]->Choice[selected=false]
Group[code=c]->Choice[selected=true]
Given the above Vertices, I'm looking to query for Group Vertices, where a group does not have any Choice vertices, with a selected attribute as true.
Hence the result should return only Group b
Group[code=b]
Any help is appreciated.

Here's your graph - when asking questions about Gremlin it's always helpful to provide your sample data in this way:
graph = TinkerGraph.open()
g = graph.traversal()
g.addV('group').property('code','a').as('a').
addV('group').property('code','b').as('b').
addV('group').property('code','c').as('c').
addV('choice').property('selected',true).
addE('link').from('a').
addV('choice').property('selected',false).
addE('link').from('a').
addV('choice').property('selected',false).
addE('link').from('a').
addV('choice').property('selected',false).
addE('link').from('b').
addV('choice').property('selected',false).
addE('link').from('b').
addV('choice').property('selected',false).
addE('link').from('c').
addV('choice').property('selected',true).
addE('link').from('c').iterate()
One way to get the answer you want is to do a traversal like this:
gremlin> g.V().hasLabel('group').
......1> where(__.not(out('link').has('selected',true))).
......2> values('code')
==>b
The above answer is for TinkerPop 3.x. In TinkerPop 2.x the pattern is the same. You would basically do:
g.V().has('label','group').filter{ !it._().out('link').has('selected',true).hasNext() }
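The intent of the filter is to keep a group only when none of its linked choices is selected. As a rough illustration of that logic outside Gremlin, here is a plain JavaScript model of the sample data from the question. The object layout and the name `groupsWithoutSelected` are assumptions made for this sketch, not part of any graph API.

```javascript
// Sample data from the question: group code -> 'selected' flags of its linked choices.
const groups = {
  a: [true, false, false],
  b: [false, false],
  c: [false, true],
};

// Hypothetical helper: keep the groups that have no choice with selected === true.
function groupsWithoutSelected(data) {
  return Object.keys(data).filter(
    (code) => !data[code].some((selected) => selected === true)
  );
}

console.log(groupsWithoutSelected(groups));
```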

Gremlin - how do you merge vertices to combine their properties without listing the properties explicitly?

Background: I'm trying to implement a time-series versioned DB using this approach, using gremlin (tinkerpop v3).
I want to get the latest state node (in red) for a given identity node (in blue) (linked by a 'state' edge which contains a timestamp range), but I want to return a single aggregated object which contains the id (cid) from the identity node and all the properties from the state node, but I don't want to have to list them explicitly.
(8640000000000000 is my way of indicating no 'to' date - i.e. the edge is current - slightly different from the image shown).
I've got this far:
:> g.V().hasLabel('product').
as('cid').
outE('state').
has('to', 8640000000000000).
inV().
as('name').
as('price').
select('cid', 'name','price').
by('cid').
by('name').
by('price')
=>{cid=1, name="Cheese", price=2.50}
=>{cid=2, name="Ham", price=5.00}
but as you can see I have to list out the properties of the 'state' node - in the example above the name and price properties of a product. But this will apply to any domain object so I don't want to have to list the properties all the time. I could run a query before this to get the properties but I don't think I should need to run 2 queries, and have the overhead of 2 round trips. I've looked at 'aggregate', 'union', 'fold' etc but nothing seems to do this.
Any ideas?
===================
Edit:
Based on Daniel's answer (which doesn't quite do what I want ATM) I'm going to use his example graph. In the 'modernGraph' people-create->software. If I run:
> g.V().hasLabel('person').valueMap()
==>[name:[marko], age:[29]]
==>[name:[vadas], age:[27]]
==>[name:[josh], age:[32]]
==>[name:[peter], age:[35]]
then the results are a list of entities with their properties. What I want is, on the assumption that a person can only ever create one piece of software (although hopefully we will see how this could be opened up later for lists of software created), to include the created software's 'lang' property in the returned entity to get:
> <run some query here>
==>[name:[marko], age:[29], lang:[java]]
==>[name:[vadas], age:[27], lang:[java]]
==>[name:[josh], age:[32], lang:[java]]
==>[name:[peter], age:[35], lang:[java]]
At the moment the best suggestion so far comes up with the following:
> g.V().hasLabel('person').union(identity(), out("created")).valueMap().unfold().group().by {it.getKey()}.by {it.getValue()}
==>[name:[marko, lop, lop, lop, vadas, josh, ripple, peter], lang:[java, java, java, java], age:[29, 27, 32, 35]]
I hope that's clearer. If not please let me know.

Since you didn't provide a sample graph, I'll use TinkerPop's toy graph to show how it's done.
Assume you want to merge marko and lop:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).valueMap()
==>[name:[marko],age:[29]]
gremlin> g.V(1).out("created").valueMap()
==>[name:[lop],lang:[java]]
Note that there are two name properties, and in theory you won't be able to predict which name makes it into your merged result; however, that doesn't seem to be an issue in your graph.
Get the properties for both vertices:
gremlin> g.V(1).union(identity(), out("created")).valueMap()
==>[name:[marko],age:[29]]
==>[name:[lop],lang:[java]]
Merge them:
gremlin> g.V(1).union(identity(), out("created")).valueMap().
unfold().group().by(select(keys)).by(select(values))
==>[name:[lop],lang:[java],age:[29]]
UPDATE
Thank you for the added sample output. That makes it a lot easier to come up with a solution (although I think your output contains errors; vadas didn't create anything).
gremlin> g.V().hasLabel("person").
filter(outE("created")).map(
union(valueMap(),
outE("created").limit(1).inV().valueMap("lang")).
unfold().group().by {it.getKey()}.by {it.getValue()})
==>[name:[marko], lang:[java], age:[29]]
==>[name:[josh], lang:[java], age:[32]]
==>[name:[peter], lang:[java], age:[35]]

Merging edge and vertex properties using gremlin java DSL:
g.V().has('User', 'id', userDbId).outE(Edges.TWEETS)
.union(__.identity().valueMap(), __.inV().valueMap())
.unfold().group().by(__.select(Column.keys)).by(__.select(Column.values))
.map(v -> converter.toTweet((Map) v.get())).toList();

Thanks to Daniel Kuppitz and youhans for their answers, which gave me the basic idea for a solution. However, I later found that the solution does not work for multiple rows: a local() step is required to handle them. The modified Gremlin query looks like this:
g.V()
.local(
__.union(__.valueMap(), __.outE().inV().valueMap())
.unfold().group().by(__.select(Column.keys)).by(__.select(Column.values))
)
This will limit the scope of union and group by to a single row.
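To see why the scope matters, here is a rough plain JavaScript model of the difference. It is not Gremlin: the `rows` data is invented for illustration, `mergeMaps` is a hypothetical helper, and letting the later map win on a key collision is a simplification of what group() does.

```javascript
// Each inner array stands for the value maps that union() emits for one start vertex.
// The data is made up for this sketch.
const rows = [
  [{ name: ["marko"], age: [29] }, { lang: ["java"] }],
  [{ name: ["josh"], age: [32] }, { lang: ["java"] }],
];

// Hypothetical helper: merge a list of maps into one.
// Simplification: on a key collision the later map wins.
function mergeMaps(maps) {
  return Object.assign({}, ...maps);
}

// Without local(): the maps of all vertices are folded into a single result.
const globalMerge = mergeMaps([].concat(...rows));

// With local(): the merge is scoped to each start vertex, giving one map per vertex.
const perVertex = rows.map((maps) => mergeMaps(maps));

console.log(globalMerge);
console.log(perVertex);
```

The global version collapses everything into one map, while the per-vertex version keeps one merged map for each start vertex.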
If you can work with a custom DSL, you can add a step to it in Java like this one:
public default GraphTraversal<S, LinkedHashMap> unpackMaps() {
    GraphTraversal<S, LinkedHashMap> it = map(x -> {
        LinkedHashMap mapSource = (LinkedHashMap) x.get();
        LinkedHashMap mapDest = new LinkedHashMap();
        mapSource.keySet().stream().forEach(key -> {
            Object obj = mapSource.get(key);
            if (obj instanceof LinkedHashMap) {
                // Nested map: copy its entries up to the top level.
                LinkedHashMap childMap = (LinkedHashMap) obj;
                childMap.keySet().iterator().forEachRemaining(key_child ->
                    mapDest.put(key_child, childMap.get(key_child)));
            } else {
                // Plain value: keep it under its original key.
                mapDest.put(key, obj);
            }
        });
        return mapDest;
    });
    return it;
}
and then use it like this:
g.V().as("s")
.valueMap().as("value_map_0")
.select("s").outE("INFO1").inV().valueMap().as("value_map_1")
.select("s").outE("INFO2").inV().valueMap().as("value_map_2")
.select("s").outE("INFO3").inV().valueMap().as("value_map_3")
.select("s").local(__.outE("INFO1").count()).as("value_1")
.select("s").outE("INFO1").inV().values("name").as("value_2")
.project("val_map1","val_map2","val_map3","val1","val2")
.by(__.select("value_map_1"))
.by(__.select("value_map_2"))
.by(__.select("value_map_3"))
.by(__.select("value_1"))
.by(__.select("value_2"))
.unpackMaps()
This yields rows of the form:
map1_val1, map1_val2, ..., map2_val1, map2_val2, ..., value1, value2
This handles a mix of values and valueMaps in a natural Gremlin way.

Return certain vertices and whether they share an edge with a particular vertex

I'm trying to model follower relationships between certain users in my app:
user----follows----user
(think Twitter)
Given a set of userIds I need to return all those user vertices and a boolean if a particular user (currentUser) has a follows edge to those users. So I need to know whether or not currentUser is following each of these users:
user1: true
user2: true
user3: false
user4: true
I'm stuck on how to fetch that follow status. If I return each user vertex like so:
currentUser = g.V(1);
g.V().hasLabel("appUser").or(__.has("userId","123869681319429"),
__.has("userId","103659593341656")).valueMap();
what would be an efficient command to determine if each of those had an incoming follows edge from currentUser?
TitanDB 1.0.0 running on DynamoDB.
Edit- Adding My full working traversal:
g.V().hasLabel('appUser').or(__.has('cId', '1232'),__.has('cId', '1116')).group().by().by(__.in('follows').hasId(hasLabel('appUser').has('pId', 'd13dfa6').id()).count())
Edit 2 -
I wound up rewriting this traversal to better capture the data I needed by using as() and select(). Leaving here for reference:
g.V().hasLabel('appUser').or(__.has('cId', '1232'),__.has('cId', '1116')).as('user','followCount').select('user','followCount').by(__.valueMap()).by(__.in('follows').hasId(hasLabel('appUser').has('pId', 'd13dfa6').id()).count())

Here's one way to do it. Assume this sample graph:
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> vUser1 = graph.addVertex(id,1)
==>v[1]
gremlin> vUser2 = graph.addVertex(id,2)
==>v[2]
gremlin> vUser3 = graph.addVertex(id,3)
==>v[3]
gremlin> vUser1.addEdge('follows',vUser2)
==>e[0][1-follows->2]
gremlin> vUser3.addEdge('follows',vUser3)
==>e[1][3-follows->3]
Your code snippet above demonstrates that you will have the "current user" vertex and the vertices of the users you want to compare to that current user to see if there are any follows relationships. Given that assumption, you could approach it this way:
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:3 edges:2], standard]
gremlin> g.V(vUser2,vUser3).group().by().by(__.in("follows").hasId(vUser1.id()).count())
==>[v[2]:1, v[3]:0]
In this case, you iterate the list of user vertices you want to compare against, then group on them. The traversal outputs a Map where a value greater than 0 represents a follows relationship and a value of zero means there is none. So, in the example above, user 1 follows user 2 but doesn't follow user 3.
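Since the question asks for a boolean per user, the counts can be converted on the client once the result arrives. This is a minimal sketch, assuming the result map has already been deserialized into a plain JavaScript object keyed by some user identifier. `toFollowStatus` is a hypothetical helper and the keys below are illustrative.

```javascript
// Hypothetical helper: turn { user: followCount } into { user: isFollowed }.
// Assumption: the Gremlin result was deserialized into a plain object.
function toFollowStatus(counts) {
  const status = {};
  for (const [user, count] of Object.entries(counts)) {
    status[user] = count > 0;
  }
  return status;
}

console.log(toFollowStatus({ "v[2]": 1, "v[3]": 0 }));
```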

traversing orientdb graph, sql-traverse vs gremlin

I want to model linked nodes data set:
Node(A)----next---->Node(B)----next---->Node(C)
applying SQL-Traverse:
traverse out('next') from Node(A)
includes Node(A) in the result (A, B, C), which is the desired output,
whereas using Gremlin:
g.('Node(A)').as('start').out('next').loop('start')
returns only B and C.
My question is how to emit Node A in Gremlin, followed by the other nodes in the order in which they are linked. I would prefer the end result to be a pipeline. I tried aggregate(), but it forces me to use the aggregated collection as the starting point for a new pipeline with a new traversal, and I don't want that behavior. Any ideas? Thanks.

I think path will do what you want:
gremlin> g = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]
gremlin> g.v(1).as('s').out().loop('s'){true}{true}.path()
==>[v[1], v[3]]
==>[v[1], v[2]]
==>[v[1], v[4]]
==>[v[1], v[4], v[3]]
==>[v[1], v[4], v[5]]