Titan -> Neo4j Gremlin subgraph - titan

I wish to extract all the edges and vertices that are attached to a specific list and who they follow and copy them to either neo4j directly or by creating a graphson or a kryo file of the data.
Something similar to this:
g.V().has("sublist_id", 14).in('ON').out('FOLLOWS')
I basically want every vertices and edge in a separate database or file to query in isolation.
What is my best approach approach?
I did the following but can't seem to export as json or kryo only graphml.
gremlin> subGraph = g.V().has('sublist_id', 14).in('ON').outE('FOLLOWS').subgraph('subGraph').cap('subGraph').next()
==>tinkergraph[vertices:3438716 edges:14090945]
gremlin> subGraph.io(IoCore.gryo()).writeGraph("/data/test.kryo")
Class is not registered: com.thinkaurelius.titan.graphdb.relations.RelationIdentifier
Note: To register this class use: kryo.register(com.thinkaurelius.titan.graphdb.relations.RelationIdentifier.class);
Display stack trace?
gremlin> subGraph.io(IoCore.graphson()).writeGraph("/data/test.json");
(was java.lang.IllegalStateException) (through reference chain: com.thinkaurelius.titan.graphdb.relations.RelationIdentifier["inVertexId"])
Display stack trace? [yN]

When you generate a subgraph from one graph and try to push to another you might end up with serialization problems because the new graph might not know how to deal with the data within the first graph. That is the case you are having here, specifically with the RelationIdentifier which is the unique id in Titan.
So to state that in the context of your work, you have a TinkerGraph that has Titan RelationIdentifier in it and when you go to write that to gryo, the serialization process fails because TinkerGraph doesn't have knowledge of the Titan serializers.
You can work around this problem by giving gryo the Titan serializers. Do something like:
gryo = GryoIo.build().registry(TitanIoRegistry.INSTANCE).graph(subGraph).create()
gryo.writeGraph("/data/test.kryo")
That's one way to do it. You could also build the GryoWriter up from scratch for even more control over different options and settings.

Related

How to include car ferries in OSMnx graphs

Is it possible to have OSMnx (great tool BTW) include car ferries when building a graph? Failing that, what would be the most direct way to build such a graph? The problem isn't just that the ferry routes themselves aren't present but, without the ferries, islands that are in reality reachable by car aren't included in the road network.
I have tried using osmnx.settings.useful_tags_way to no avail. Using 'route'='ferry' in overpass-api returns what I would like to include in the graph so I have been editing the OSMnx downloads.py file trying to alter the overpass-api call directly.
Thanks!
First pass (hacky)
I've came up with fairly hacky partial solution by editing osmnx's downloads.py file and replacing the line (originally around line 350):
query_str = f"{overpass_settings};(way{osm_filter}(poly:'{polygon_coord_str}');>;);out;"
with:
osm_filter2 = f'["route"="ferry"]'
query_str = f"{overpass_settings};(way{osm_filter}(poly:'{polygon_coord_str}');>;way{osm_filter2}(poly:'{polygon_coord_str}');>;);out;"
The query string format took some trial and error. If this, or something like it, turns out to be the best approach, I'll try to make it more broadly applicable (add "ferry" to the predefined filter list rather than hardcoded, pass a list of filters, etc).
Much Better
I found that building separate graphs and unioning them using networkx.union or networkx.disjoint_union did not give usable results. So I have added a multiple filter capability to my version of the osmnx downloads.py file. I also added some additional network type options. So it's now possible to pass, for instance, "ferry" as a network_type for the osmnx.graph_from_* functions:
network_type = "ferry"
I can also now pass multiple network types using a pipe delimiter:
network_type = "drive|ferry"
to get a graph that is the union of the two. When doing this, I found that it's useful to create a second ferry-only graph and use that as a reference to update the tags in the combined network graph. It is also better to load non-simplified graphs at first and and simplify it/them yourself after this step.
Still experimenting
Still having some issues with non-connectedness - now in the ferry-only network. Car ferries at isolated crossings than don't share a harbor with a through-ferry are not included in the ferries-only graphs and therefore the tags of their counterparts in the combined network aren't getting updated as needed. The ferries in the combined network (network_type = "drive|ferry") are all present and connected correctly and therefore their respective islands are now on the road network - which is great. But, because their tags aren't being updated, these isolated ferries are getting default highway speeds (and I'm doing travel time analysis). I can use work around this using:
if not 'highway' in edge[3]:
but, for a robust solution, I think we want to be able to tag them explicitly.
I'd still love to hear what other think.

VSCode ANTLR4 Plugin: Export Call Graph to JSON?

The vscode-antlr4 plugin for VisualStudio Code has a nice call-graph feature which visualizes (as a dendrogram) how grammar (and lexer) rules interact. You can save the graphic as SVG.
Is there a way to export the information as JSON? I wouldn't mind going into the plugin's code to find a way to do it.
My aim is to create reachability graphs for individual rules, i.e. graphs that show from which other rules a particular rule can be reached (transitively). The "calls" and "is-called" information from the call-graph feature would be a nice starting point.
The data for the call graph comes from a source context instance (for each grammar file there's a single source context to manage all details for it). See the function getReferenceGraph, which collects the relations into a map object. You can use that object to generate a JSON object from it. Or you create another function, taking this one as template, to generate the JSON directly, without the overhead required for the UI.

scala for mapbox vector tiles - getting an 'id' field into the Features written to vector tiles

I'm writing MapBox vector tiles using geotrellis vectorpipe.
see here for the basic flow: https://geotrellis.github.io/vectorpipe/usage.html
Typically GeoJson Features can have an id field, so that Features can be rolled up into FeatureCollections. I need to make use of this field, but vectorpipe doesn't (natively) have this capability.
This is the Feature type used, and you can see it only has space for 1) a Geometry and 2) a data object D (which ends up populating properties in the output). There is no spot for an id.
https://geotrellis.github.io/scaladocs/latest/index.html#geotrellis.vector.Feature
Upstream there's a method called writeFeatureJsonWithID() that does let you inject an id field into a Feature when writing GeoJson.
https://github.com/locationtech/geotrellis/blob/master/vector/src/main/scala/geotrellis/vector/io/json/FeatureFormats.scala#L41-L49
My question is this:
I have worked through the vectorpipe code (https://github.com/geotrellis/vectorpipe), and I can't figure out if/where the data ever exists as GeoJson in a way where I can override and inject the id, maybe using the writeFeatureJsonWithID() or something I write explicitly. A lot of conversions are implicit, but it also may never explicitly sit as json.
Any ideas for how to get an id field in the final GeoJson written to vector tiles?
EDIT
Right now I think the trick is going to be finding a way to override .unfeature() method here:
https://github.com/locationtech/geotrellis/blob/master/vectortile/src/main/scala/geotrellis/vectortile/Layer.scala
The problem is that the internal.vector_tile.Tile is private, so I can construct it without forking the project.
Ended up having to fork geotrellis, hard-code a metadata => id function in Layer.unfeature() and compile locally to include in my project. Not ideal, but it works fine.
Also opened an issue here: https://github.com/locationtech/geotrellis/issues/2884

How do I make use of ILocation source and target in custom routing?

This is my sample network and idea of trying how to make a new routing instead of just the shortest path (the path i want to follow is via the pink arrows)
What am I missing here to make my predefined function work?
Fixing the basics
As explained here, you can't instantiate a Java List, because it is an interface. You can however instantiate any implementing Class of a List, for example an ArrayList.
With this in mind your code will look like this:
List<Path> myPath = new ArrayList<Path>();
myPath.add(path14);
myPath.add(path8);
myPath.add(path);
myPath.add(path1);
myPath.add(path4);
myPath.add(path13);
return myPath;
So far for the basics.
Where to go from here
To get it to consider your actual source and destination for the route planning, define both as input parameters of type ILocation in the properties of the function.
Now comes the really tricky part: writing your own or importing a routing algorithm that can give you that list of paths automatically based on criteria that you define. This is however a topic too broad for this question. The basic steps will be:
Create a graph that represents your AnyLogic path network
Solve the graph routing problem with a solving algorithm (eg. Dijkstra Algorithm), using the graph, the startpoint and the endpoint
Convert the solution you get from the solver back again to an ArrayList that you can work with in AnyLogic
You can do these steps on your own, eg. by implementing the Dijkstra Algorithm yourself, or you import into AnyLogic one of the available graph solving Java packages like JUNG or Graphhopper. In this article I explain step by step how to do so with JUNG.

how to pass data in Zeppelin to DS3.js for Spark visualization

The graph options with Zeppelin are pretty basic. So I am looking for an example of how to do something simple, like a barchart, with ds3.js. From what I can tell that would be the best graphing library to use to create stunning graphs.
Anyway my question is how to pass data to the JavaScript code. With regular Zeppelin charts you write scala or other code and then save that in a dataframe. Then on the next line you use the %sql option and you can write a SQL command and then buttons appear to let you graph the data.
But what I have found looking on the internet is no indication that data created in the scala code section would be passed to the Angular section where you put the ds3.js code.
Some examples I found are like this one where all the html and Javascript is put in one giant print statement in the scala code https://rawkintrevo.org/2016/09/20/gelly-on-apache-flink/
And then there is an example like this one Using d3.js with Apache Zeppelin where the Zeppelin line is all JavaScript, but the data is just a locally created array.
So I need (1) an example and (2) some understanding of how RDDs ad Dataframes can be passed into the JavaScript code, which of course is on a different line that the scala code. How do you bring objects in the scala section of the notebook into scope for the Javascript section.
You can refer to zeppelin docs for a good getting-started guide to creating custom visualization. Also, you might want to check out the code of some of the built-ins viz.
Regarding how data from DataFrames are passed to js, I'm pretty sure z.show or %sql triggers dataFrame.take(${zeppelin.spark.maxResult}) which collects the RDD[T] as a Seq[T] object to the driver whose elements are then used to render graphs.
Alternatively if you have a javascript graph defined in another paragraph, you can also usez.angularBind("values", rdd.take(maxResult)) to send the data to the angular view. There's a really nice answer here on the subject which might help.
Hope you find this helpful.