Can't delete/remove multiple property keys on Vertex Titan 1.0 Tinkerpop 3 - titan

Very basic question,
I just upgraded my Titan from 0.54 to Titan 1.0 Hadoop 1 / TP3 version 3.01.
I encounter a problem with deleting values of
Property key: Cardinality.LIST/SET
Maybe it is due to upgrade process or just my TP3 misunderstanding.
// ----- CODE ------:
tg = TitanFactory.open(c);
TitanManagement mg = tg.openManagement();
//create KEY (Cardinality.LIST) and commit changes
tm.makePropertyKey("myList").dataType(String.class).cardinality( Cardinality.LIST).make();
mg.commit();
//add vertex with multi properties
Vertex v = tg.addVertex();
v.property("myList", "role1");
v.property("myList", "role2");
v.property("myList", "role3");
v.property("myList", "role4");
v.property("myList", "role4");
Now, I want to delete all the values "role1,role2...."
// iterate over all values and try to remove the values
List<String> values = IteratorUtils.toList(v.values("myList"));
for (String val : values) {
v.property("myList", val).remove();
}
tg.tx().commit();
//---------------- THE EXPECTED RESULT ----------:
Empty vertex properties
But unfortunately the result isn't empty:
System.out.println("Values After Delete" + IteratorUtils.toList(v.values("myList")));
//------------------- OUTPUT --------------:
After a delete, values are still apparent!
15:19:59,780 INFO ThriftKeyspaceImpl:745 - Detected partitioner org.apache.cassandra.dht.Murmur3Partitioner for keyspace titan
15:19:59,784 INFO Values After Delete [role1, role2, role3, role4, role4]
Any ideas?

You're not executing graph traversals with the higher level Gremlin API, but you're currently mutating the graph with the lower level graph API. Doing for loops in Gremlin is often an antipattern.
According to the TinkerPop 3.0.1 Drop Step documentation, you should be able to do the following from the Gremlin console:
v = g.addV().next()
g.V(v).property("myList", "role1")
g.V(v).property("myList", "role2")
// ...
g.V(v).properties('myList').drop()

property(key, value) will set the value of the property on the vertex (javadoc). What you should do is get the VertexProperties (javadoc).
for (VertexProperty vp : v.properties("name")) {
vp.remove();
}
#jbmusso offered a solid solution using the GraphTraversal instead.

Related

Adding edge attribute causes TypeError: 'AtlasView' object does not support item assignment

Using networkx 2.0 I try to dynamically add an additional edge attribute by looping through all the edges. The graph is a MultiDiGraph.
According to the tutorial it seems to be possible to add edge attributes the way I do in the code below:
g = nx.read_gpickle("../pickles/" + gname)
yearmonth = gname[:7]
g.name = yearmonth # works
for source, target in g.edges():
g[source][target]['yearmonth'] = yearmonth
This code throws the following error:
TypeError: 'AtlasView' object does not support item assignment
What am I doing wrong?
That should happen if your graph is a nx.MultiGraph. From which case you need an extra index going from 0 to n where n is the number of edges between the two nodes.
Try:
for source, target in g.edges():
g[source][target][0]['yearmonth'] = yearmonth
The tutorial example is intended for a nx.Graph.

save page rank output in neo4j

I am running Pregel Page rank algorith
m on twitter data in Spark using scala. The algorithm runs fine and gives me the output correctly finding out the highest page rank score. But I am unable to save graph on neo4j.
The inputs and outputs are mentioned below.
Input file: (The numbers are twitter userIDs)
86566510 15647839
86566510 197134784
86566510 183967095
15647839 11272122
15647839 10876852
197134784 34236703
183967095 20065583
11272122 197134784
34236703 18859819
20065583 91396874
20065583 86566510
20065583 63433165
20065583 29758446
Output of the graph vertices:
(11272122,0.75)
(34236703,1.0)
(10876852,0.75)
(18859819,1.0)
(15647839,0.6666666666666666)
(86566510,0.625)
(63433165,0.625)
(29758446,0.625)
(91396874,0.625)
(183967095,0.6666666666666666)
(197134784,1.1666666666666665)
(20065583,1.0)
Using the below scala code I try saving the graph but it does'nt. Please help me solve this.
Neo4jGraph.saveGraph(sc, pagerankGraph, nodeProp = "twitterId", relProp = "follows")
Thanks.
Did you load the graph originally from Neo4j? Currently saveGraph saves the graph data back to Neo4j nodes via their internal id's.
It actually runs this statement:
UNWIND {data} as row
MATCH (n) WHERE id(n) = row.id
SET n.$nodeProp = row.value return count(*)
But as a short term mitigation I added optional labelIdProp parameters that are used instead of the internal id's, and a match/merge flag. You'll have to build the library yourself though to use that. I gonna push the update the next few days.
Something you can try is Neo4jDataFrame.mergeEdgeList
Here is the test code for it.
You basically have a dataframe with the data and it saves it to a Neo4j graph (including relationships though).
val rows = sc.makeRDD(Seq(Row("Keanu", "Matrix")))
val schema = StructType(Seq(StructField("name", DataTypes.StringType), StructField("title", DataTypes.StringType)))
val df = new SQLContext(sc).createDataFrame(rows, schema)
Neo4jDataFrame.mergeEdgeList(sc, df, ("Person",Seq("name")),("ACTED_IN",Seq.empty),("Movie",Seq("title")))
val edges : RDD[Edge[Long]] = sc.makeRDD(Seq(Edge(0,1,42L)))
val graph = Graph.fromEdges(edges,-1)
assertEquals(2, graph.vertices.count)
assertEquals(1, graph.edges.count)
Neo4jGraph.saveGraph(sc,graph,null,"test")
val it: ResourceIterator[Long] = server.graph().execute("MATCH (:Person {name:'Keanu'})-[:ACTED_IN]->(:Movie {title:'Matrix'}) RETURN count(*) as c").columnAs("c")
assertEquals(1L, it.next())
it.close()

Tinkerpop3 - degree centrality

I'm looking to find the most liked nodes so basically the degree centrality recipe. This query kind of works but I'd like to return the full vertex (including properties) rather than just the id's.
( I am using Tinkerpop 3.0.1-incubating )
g.V()
.where( inE("likes") )
.group()
.by()
.by( inE("likes").count() )
Result
{
"8240": [
2
],
"8280": [
1
],
"12376": [
1
],
"24704": [
1
],
"40976": [
1
]
}
You're probably looking for the order step, using an anonymous traversal passed to the by() modulator:
g.V().order().by(inE('likes').count(), decr)
Note: this will require iterating over all vertices in Titan v1.0.0 and this query cannot be optimized, it will only work over smaller graphs in OLTP.
To get the 10 most liked:
g.V().order().by(inE('likes').count(), decr).limit(10)
If you want to get the full properties, simply chain .valueMap() or .valueMap(true) (for id and label) on the query.
See also:
http://tinkerpop.apache.org/docs/3.0.1-incubating/#order-step
https://groups.google.com/d/topic/gremlin-users/rt3qRKyAqts/discussion
GraphSON, as it is JSON based, does not support the conversion of complex objects to keys. A key in JSON must be string based and, as in this case, cannot be a map. To deal with this JSON limitation, GraphSON converts complex objects that are to be keys via the Java's toString() or by other methods for certain objects like graph elements (which returns a string representation of the element identifier, explaining why you received the output that you did).
If you want to return properties of elements while using GraphSON, you will have to figure out a workaround. In this specific case, you might do:
gremlin> graph = TinkerFactory.createModern()
==>tinkergraph[vertices:6 edges:6]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().group().
......1> by(id).
......2> by(union(__(), outE('knows').count()).fold())
==>[1:[v[1],2],2:[0,v[2]],3:[v[3],0],4:[0,v[4]],5:[v[5],0],6:[0,v[6]]]
In this way you get the vertex identifier as the map id and in the value you get the full vertex plus the count. TinkerPop is working on improving this issue, but I don't expect a fast fix.

Is it possible to return a map of key values using gremlin scala

Currently i have two gremlin queries which will fetch two different values and i am populating in a map.
Scenario : A->B , A->C , A->D
My queries below,
graph.V().has(ID,A).out().label().toList()
Fetch the list of outE labels of A .
Result : List(B,C,D)
graph.traversal().V().has("ID",A).outE("interference").as("x").otherV().has("ID",B).select("x").values("value").headOption()
Given A and B , get the egde property value (A->B)
Return : 10
Is it possible that i can combine both there queries to get a return as Map[(B,10)(C,11)(D,12)]
I am facing some performance issue when i have two queries. Its taking more time
There is probably a better way to do this but I managed to get something with the following traversal:
gremlin> graph.traversal().V().has("ID","A").outE("interference").as("x").otherV().has("ID").label().as("y").select("x").by("value").as("z").select("y", "z").select(values);
==>[B,1]
==>[C,2]
I would wait for more answers though as I suspect there is a better traversal out there.
Below is working in scala
val b = StepLabel[Edge]()
val y = StepLabel[Label]()
val z = StepLabel[Integer]()
graph.traversal().V().has("ID",A).outE("interference").as(b)
.otherV().label().as(y)
.select(b).values("name").as(z)
.select((y,z)).toMap[String,Integer]
This will return Map[String,Int]

How make logic OR between vertex Indexes in Titan 1.0 / TP3 3.01 using predicate Text

During my migration from TP2 0.54 -> TP3 titan 1.0 / Tinkerpop 3.01
I'm trying to build gremlin query which make "logical OR" with Predicate Text , between properties on different Vertex indexes
Something like:
------------------- PRE-DEFINED ES INDEXES: ------------------
tg = TitanFactory.open('../conf/titan-cassandra-es.properties')
tm = tg.openManagement();
g=tg.traversal();
PropertyKey pNodeType = createPropertyKey(tm, "nodeType", String.class, Cardinality.SINGLE);
PropertyKey userContent = createPropertyKey(tm, "storyContent", String.class, Cardinality.SINGLE);
PropertyKey storyContent = createPropertyKey(tm, "userContent", String.class, Cardinality.SINGLE);
//"storyContent" : is elasticsearch backend index - mixed
tm.buildIndex(indexName, Vertex.class).addKey(storyContent, Mapping.TEXTSTRING.asParameter()).ib.addKey(pNodeType, Mapping.TEXTSTRING.asParameter()).buildMixedIndex("search");
//"userContent" : is elasticsearch backend index - mixed
tm.buildIndex(indexName, Vertex.class).addKey(userContent, Mapping.TEXTSTRING.asParameter()).ib.addKey(pNodeType, Mapping.TEXTSTRING.asParameter()).buildMixedIndex("search");
v1= g.addVertex()
v1.property("nodeType","USER")
v1.property("userContent" , "dccsdsadas")
v2= g.addVertex()
v2.property("nodeType","STORY")
v2.property("storyContent" , "abdsds")
v3= g.addVertex()
v3.property("nodeType","STORY")
v3.property("storyContent" , "xxxx")
v4= g.addVertex()
v4.property("nodeType","STORY")
v4.property("storyContent" , "abdsds") , etc'...
------------------- EXPECTED RESULT: -----------
I want to return all vertexes with property "storyContent" match text contains prefix , OR all vertexes with property "userContent" matching its case.
in this case return v1 and v2 , because v3 doesn't match and v4 duplicated so it must be ignored by dedup step
g.V().has("storyContent", textContainsPrefix("ab")) "OR" has("userContent", textContainsPrefix("dc"))
or maybe :
g.V().or(_().has('storyContent', textContainsPrefix("abc")), _().has('userContent', textContainsPrefix("dcc")))
PS,
I thought use TP3 OR step with dedup , but gremlin throws error ...
Thanks for any help
Vitaly
How about something along those lines:
g.V().or(
has('storyContent', textContainsPrefix("abc")),
has('userContent', textContainsPrefix("dcc"))
)
Edit - as mentioned in the comments, this query won't use any index. It must be split into two separate queries.
See TinkerPop v3.0.1 Drop Step documentation and Titan v1.0.0 Ch. 20 - Index Parameters and Full-Text Search documentation.
With Titan, you might have to import text predicates before:
import static com.thinkaurelius.titan.core.attribute.Text.*
_.() is TinkerPop2 material and no longer used in TinkerPop3. You now use anonymous traversals as predicates, which sometimes have to start with __. for steps named with reserved keywords in Groovy (for ex. __.in()).