How do I copy, but not deepcopy, a networkx Graph? - networkx

I want to compare the state of a networkx.Graph object n before a function call d(n) (with side effects) with the state afterwards.
There are mutable object node attributes such as n.node[0]['attribute'], which I want to compare.
Obviously,
before = n
d(n)
after = n
assert id(before.node[0]['attribute']) == id(after.node[0]['attribute'])
succeeds trivially, because
before == after
but if I set before=n.copy(), a deep copy is made, and therefore id(before.node[0]['attribute']) != id(after.node[0]['attribute']). How do I get a copy of a Graph object without copying all node attribute objects?

Calling the copy method gives a deep copy. All attributes of the new graph are copies of those in the original graph. Calling the constructor (e.g. Graph(G)) gives a shallow copy where the graph structure is copied but the data attributes are references to those in the original graph.
From the copy method docs
All copies reproduce the graph structure, but data attributes may be
handled in different ways. There are four types of copies of a graph
that people might want.
Deepcopy -- The default behavior is a "deepcopy" where the graph
structure as well as all data attributes and any objects they might
contain are copied. The entire graph object is new so that changes in
the copy do not affect the original object.
Data Reference (Shallow) -- For a shallow copy (with_data=False) the
graph structure is copied but the edge, node and graph attribute dicts
are references to those in the original graph. This saves time and
memory but could cause confusion if you change an attribute in one
graph and it changes the attribute in the other.
In [1]: import networkx as nx
In [2]: G = nx.Graph()
In [3]: G.add_node(1, color=['red'])
In [4]: G_deep = G.copy()
In [5]: G_deep.node[1]['color'].append('blue')
In [6]: list(G.nodes(data=True))
Out[6]: [(1, {'color': ['red']})]
In [7]: list(G_deep.nodes(data=True))
Out[7]: [(1, {'color': ['red', 'blue']})]
In [8]: G_shallow = nx.Graph(G)
In [9]: G_shallow.node[1]['color'].append('blue')
In [10]: list(G.nodes(data=True))
Out[10]: [(1, {'color': ['red', 'blue']})]
In [11]: list(G_shallow.nodes(data=True))
Out[11]: [(1, {'color': ['red', 'blue']})]
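The identity check from the question can be reproduced without networkx. The sketch below is only an analogy: plain dicts stand in for the graph's node attribute dicts, and d is a hypothetical side-effecting function, not part of networkx.

```python
import copy

# Stand-in for a graph's node data: node -> attribute dict (an assumption,
# not networkx's actual internal layout).
nodes = {0: {"attribute": ["red"]}}

# Structure-only copy: new containers, but the same attribute objects.
shallow = {node: dict(attrs) for node, attrs in nodes.items()}
# Deep copy: the attribute objects themselves are duplicated.
deep = copy.deepcopy(nodes)


def d(node_map):
    """Hypothetical function with a side effect on an attribute object."""
    node_map[0]["attribute"].append("blue")


d(nodes)

# The shallow copy still refers to the very same list object ...
same_object = shallow[0]["attribute"] is nodes[0]["attribute"]
# ... while the deep copy kept the state from before the call.
deep_unchanged = deep[0]["attribute"] == ["red"]
print(same_object, deep_unchanged)
```

With a shallow copy, identity checks on attribute objects succeed but the "before" state is lost if the function mutates in place; with a deep copy, the state is preserved but identities differ.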

Try this:
# G is your existing graph
G2 = nx.Graph()  # or whatever type of graph `G` is
G2.add_edges_from(G.edges())  # copies edges only: no attributes, no isolated nodes

Please also note that if your networkx graph contains deeply nested objects (objects of objects...), even a deepcopy may not work. It can fail with an error saying that there are too many levels (the maximum recursion depth is exceeded).
Normally, I would think about what exactly is of interest in the graph and just create a new one with that.

Related

Is it possible to load a model which is stored with model.module.state_dict() but load with model.state_dict()

I have trained a model on two GPUs and saved it with model.module.state_dict().
Now I want to load this model on one GPU. Can I directly load the trained model with model.state_dict()?
Thanks in advance!
You can refer to this question.
You can either wrap the model in nn.DataParallel for loading purposes, or change the key names like this:
import torch
from collections import OrderedDict

# original saved file with DataParallel
state_dict = torch.load('myfile.pth.tar')
# create new OrderedDict that does not contain `module.`
new_state_dict = OrderedDict()
for k, v in state_dict.items():
    name = k[7:]  # remove the 7-character prefix `module.`
    new_state_dict[name] = v
# load params
model.load_state_dict(new_state_dict)
But since you saved the model with model.module.state_dict() instead of model.state_dict(), the key names may differ from the ones assumed above (they may already lack the `module.` prefix). If the two methods above don't work, try printing the saved dict and the model to see what you need to change, like this:
state_dict = torch.load('myfile.pth.tar')
print(state_dict)
print(model)
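Because the keys may or may not carry the `module.` prefix, a more defensive variant of the renaming loop strips the prefix only when it is present. This is a minimal sketch using plain dicts with placeholder values instead of tensors; strip_module_prefix is a hypothetical helper name, not a PyTorch function.

```python
from collections import OrderedDict

PREFIX = "module."


def strip_module_prefix(state_dict):
    """Return a copy of state_dict with a leading 'module.' removed from keys."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Only touch keys that actually start with the prefix.
        new_key = key[len(PREFIX):] if key.startswith(PREFIX) else key
        cleaned[new_key] = value
    return cleaned


# Placeholder integers stand in for the saved tensors.
saved = OrderedDict([("module.fc.weight", 1), ("module.fc.bias", 2)])
print(list(strip_module_prefix(saved)))
```

The result of such a helper could then be passed to model.load_state_dict in place of new_state_dict above.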

Save and re-load a weighted graph from OSMnx for NetworkX

I am using OSMnx to get a graph and add a new edge attribute (w3) representing a custom weight for each edge. Then I can successfully find 2 different shortest paths between 2 points using NetworkX with 'length' and 'w3'. Everything works fine, this is my code:
G = ox.graph_from_place(PLACE, network_type='all_private', retain_all = True, simplify=True,truncate_by_edge=False)
w3_dict = dict((zip(zip(lu, lv, lk),lw3)))
nx.set_edge_attributes(G, w3_dict, "w3")
route_1 = nx.shortest_path(G, node_start, node_stop, weight = 'length')
route_2 = nx.shortest_path(G, node_start, node_stop, weight = 'w3')
Now I would like to save G to disk and reopen it, to perform more navigation tasks later on. But after saving it with:
ox.save_graph_xml(G, filepath='DATA/network.osm')
and reopen it with:
G = ox.graph_from_xml('DATA/network.osm')
my custom attribute w3 has disappeared. I have followed the instructions in the docs but with no luck. It feels like I'm missing something really obvious but I don't understand what it is..
Use the ox.save_graphml and ox.load_graphml functions to save/load full-featured OSMnx/NetworkX graphs to/from disk for later use. The save xml function exists only to allow serialization to the .osm file format for applications that require it, and has many constraints to conform to that.
import networkx as nx
import osmnx as ox
ox.config(use_cache=True, log_console=True)
# get a graph, set 'w3' edge attribute
G = ox.graph_from_place('Piedmont, CA, USA', network_type='drive')
nx.set_edge_attributes(G, 100, 'w3')
# save graph to disk
ox.save_graphml(G, './data/graph.graphml')
# load graph from disk and confirm 'w3' edge attribute is there
G2 = ox.load_graphml('./data/graph.graphml')
nx.get_edge_attributes(G2, 'w3')

Plot a graph with ipycytoscape (and networkx)

Following the instructions for ipycytoscape, I am not able to plot a graph with it.
According to https://github.com/QuantStack/ipycytoscape/blob/master/examples/Test%20NetworkX%20methods.ipynb
this should work:
import networkx as nx
import ipycytoscape
G2 = nx.Graph()
G2.add_nodes_from([*'ABCDEF'])
G2.add_edges_from([('A','B'),('B','C'),('C','D'),('E','F')])
print(G2.nodes)
print(G2.edges)
cytoscapeobj = ipycytoscape.CytoscapeWidget()
cytoscapeobj.graph.add_graph_from_networkx(nx_graph)
G2 is a networkx graph example and it looks ok since print(G2) gives the networkx object back and G2.nodes and G2.edges can be printed.
The error:
ValueError: invalid literal for int() with base 10: 'A'
Why should a node be an integer?
More generally, what should I do if the starting point is a pandas DataFrame with a million rows of edges, given as strings like ProcessA-ProcessB, processC-processD, etc.?
Also, looking at the examples, it should be noted that the list of nodes is composed of a data dictionary for every node, and that data includes an "id" per node and also "Atribute". The surprise here is that the networkx Graph would need to have all those properties.
thanks
This problem was fixed. See attachment.
Please let me know if it's still happening. Feel free to open an issue: https://github.com/QuantStack/ipycytoscape/
I'm just playing around with ipycytoscape myself, so I could be way off-base, but, shouldn't the line be:
cytoscapeobj.graph.add_graph_from_networkx(G2) # your graph name goes here
Trying to generate a cytoscape object built on a graph that doesn't exist might trigger a ValueError because it can't find any nodes.
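On the more general question about edges stored as strings like ProcessA-ProcessB: one possible first step is to split each string into a (source, target) pair. The sketch below uses only plain Python lists; the "-" separator and the helper name edges_from_labels are assumptions for illustration, and it does not touch pandas or ipycytoscape.

```python
def edges_from_labels(labels, sep="-"):
    """Split 'Source-Target' strings into (source, target) tuples."""
    edges = []
    for label in labels:
        # Split on the first separator only, so targets may contain it.
        source, target = label.split(sep, 1)
        edges.append((source.strip(), target.strip()))
    return edges


labels = ["ProcessA-ProcessB", "processC-processD"]
edges = edges_from_labels(labels)
# Unique node names, derived from the edge endpoints.
nodes = sorted({name for edge in edges for name in edge})
print(edges)
print(nodes)
```

The resulting tuples have the same shape as the ones passed to G2.add_edges_from in the question.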

Gremlin - how do you merge vertices to combine their properties without listing the properties explicitly?

Background: I'm trying to implement a time-series versioned DB using this approach, using gremlin (tinkerpop v3).
I want to get the latest state node (in red) for a given identity node (in blue) (linked by a 'state' edge which contains a timestamp range), but I want to return a single aggregated object which contains the id (cid) from the identity node and all the properties from the state node, but I don't want to have to list them explicitly.
(8640000000000000 is my way of indicating no 'to' date - i.e. the edge is current - slightly different from the image shown).
I've got this far:
:> g.V().hasLabel('product').
as('cid').
outE('state').
has('to', 8640000000000000).
inV().
as('name').
as('price').
select('cid', 'name','price').
by('cid').
by('name').
by('price')
=>{cid=1, name="Cheese", price=2.50}
=>{cid=2, name="Ham", price=5.00}
but as you can see I have to list out the properties of the 'state' node - in the example above the name and price properties of a product. But this will apply to any domain object so I don't want to have to list the properties all the time. I could run a query before this to get the properties but I don't think I should need to run 2 queries, and have the overhead of 2 round trips. I've looked at 'aggregate', 'union', 'fold' etc but nothing seems to do this.
Any ideas?
===================
Edit:
Based on Daniel's answer (which doesn't quite do what I want ATM) I'm going to use his example graph. In the 'modernGraph' people-create->software. If I run:
> g.V().hasLabel('person').valueMap()
==>[name:[marko], age:[29]]
==>[name:[vadas], age:[27]]
==>[name:[josh], age:[32]]
==>[name:[peter], age:[35]]
then the results are a list of entities with their properties. What I want is, on the assumption that a person can only create one piece of software ever (although hopefully we will see how this could be opened up later for lists of software created), to include the created software 'language' property into the returned entity to get:
> <run some query here>
==>[name:[marko], age:[29], lang:[java]]
==>[name:[vadas], age:[27], lang:[java]]
==>[name:[josh], age:[32], lang:[java]]
==>[name:[peter], age:[35], lang:[java]]
At the moment the best suggestion so far comes up with the following:
> g.V().hasLabel('person').union(identity(), out("created")).valueMap().unfold().group().by {it.getKey()}.by {it.getValue()}
==>[name:[marko, lop, lop, lop, vadas, josh, ripple, peter], lang:[java, java, java, java], age:[29, 27, 32, 35]]
I hope that's clearer. If not please let me know.
Since you didn't provide a sample graph, I'll use TinkerPop's toy graph to show how it's done.
Assume you want to merge marko and lop:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).valueMap()
==>[name:[marko],age:[29]]
gremlin> g.V(1).out("created").valueMap()
==>[name:[lop],lang:[java]]
Note that there are two name properties, and in theory you won't be able to predict which name makes it into your merged result; however, that doesn't seem to be an issue in your graph.
Get the properties for both vertices:
gremlin> g.V(1).union(identity(), out("created")).valueMap()
==>[name:[marko],age:[29]]
==>[name:[lop],lang:[java]]
Merge them:
gremlin> g.V(1).union(identity(), out("created")).valueMap().
unfold().group().by(select(keys)).by(select(values))
==>[name:[lop],lang:[java],age:[29]]
UPDATE
Thank you for the added sample output. That makes it a lot easier to come up with a solution (although I think your output contains errors; vadas didn't create anything).
gremlin> g.V().hasLabel("person").
filter(outE("created")).map(
union(valueMap(),
outE("created").limit(1).inV().valueMap("lang")).
unfold().group().by {it.getKey()}.by {it.getValue()})
==>[name:[marko], lang:[java], age:[29]]
==>[name:[josh], lang:[java], age:[32]]
==>[name:[peter], lang:[java], age:[35]]
Merging edge and vertex properties using gremlin java DSL:
g.V().has('User', 'id', userDbId).outE(Edges.TWEETS)
.union(__.identity().valueMap(), __.inV().valueMap())
.unfold().group().by(__.select(Column.keys)).by(__.select(Column.values))
.map(v -> converter.toTweet((Map) v.get())).toList();
Thanks to the answers by Daniel Kuppitz and youhans, which gave me the basic idea for solving the issue. But later I found out that the solution does not work for multiple rows. A local step is required to handle multiple rows. The modified Gremlin query looks like this:
g.V()
.local(
__.union(__.valueMap(), __.outE().inV().valueMap())
.unfold().group().by(__.select(Column.keys)).by(__.select(Column.values))
)
This will limit the scope of union and group by to a single row.
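The effect of scoping the merge to a single row can be illustrated outside Gremlin. The following is a rough Python analogy, not Gremlin semantics: each row is a list of property maps for one vertex and its neighbour, and group_entries is a hypothetical helper that collects values per key.

```python
from collections import defaultdict


def group_entries(maps):
    """Collect the values of all given maps under their keys."""
    grouped = defaultdict(list)
    for m in maps:
        for key, values in m.items():
            grouped[key].extend(values)
    return dict(grouped)


# One row per person: the person's own properties plus the software's.
rows = [
    [{"name": ["marko"], "age": [29]}, {"lang": ["java"]}],
    [{"name": ["josh"], "age": [32]}, {"lang": ["java"]}],
]

# Without local scoping: everything lands in one big map.
global_merge = group_entries(m for row in rows for m in row)
# With local scoping: one merged map per row.
per_row = [group_entries(row) for row in rows]
print(global_merge)
print(per_row)
```

The global merge mixes all people into one map, much like the single-row output shown in the question's edit, while the per-row merge yields one entity per person.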
If you can work with a custom DSL, create one in Java like this:
public default GraphTraversal<S, LinkedHashMap> unpackMaps(){
    GraphTraversal<S, LinkedHashMap> it = map(x -> {
        LinkedHashMap mapSource = (LinkedHashMap) x.get();
        LinkedHashMap mapDest = new LinkedHashMap();
        mapSource.keySet().stream().forEach(key->{
            Object obj = mapSource.get(key);
            if (obj instanceof LinkedHashMap) {
                LinkedHashMap childMap = (LinkedHashMap) obj;
                childMap.keySet().iterator().forEachRemaining( key_child ->
                    mapDest.put(key_child, childMap.get(key_child)));
            } else
                mapDest.put(key,obj);
        });
        return mapDest;
    });
    return it;
}
and use it freely like
g.V().as("s")
    .valueMap().as("value_map_0")
    .select("s").outE("INFO1").inV().valueMap().as("value_map_1")
    .select("s").outE("INFO2").inV().valueMap().as("value_map_2")
    .select("s").outE("INFO3").inV().valueMap().as("value_map_3")
    .select("s").local(__.outE("INFO1").count()).as("value_1")
    .select("s").outE("INFO1").inV().values("name").as("value_2")
    .project("val_map1","val_map2","val_map3","val1","val2")
    .by(__.select("value_map_1"))
    .by(__.select("value_map_2"))
    .by(__.select("value_map_3"))
    .by(__.select("value_1"))
    .by(__.select("value_2"))
    .unpackMaps()
This results in rows with
map1_val1, map1_val2,.... ,map2_val1, map2_val2....,value1, value2
This can handle a mix of values and valueMaps in a natural Gremlin way.
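What the unpackMaps step does to a single projected row can be sketched in a few lines. This is only a Python analogy of the Java code above, using ordinary dicts; unpack_maps and the sample row are illustrative, not part of any Gremlin API.

```python
def unpack_maps(row):
    """Flatten one level: inline nested maps, keep plain values as they are."""
    flat = {}
    for key, value in row.items():
        if isinstance(value, dict):
            # A nested map contributes its own entries; its outer key is dropped.
            flat.update(value)
        else:
            flat[key] = value
    return flat


# Sample row shaped like the output of the project() step above.
row = {
    "val_map1": {"name": ["a"], "size": [1]},
    "val_map2": {"colour": ["red"]},
    "val1": 3,
}
print(unpack_maps(row))
```

As in the Java version, entries with the same key in different nested maps overwrite each other, so later maps win.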

Save fields of branched structure with dynamic names in matlab

I need to know how to save just one branch of a structure in MATLAB.
The structure contains more levels with more fields per level. For example:
data.level1.level21
data.level1.level22
I now want to save the branches data.level1.level21 and data.level1.level22 individually. I have tried the following but it doesn't work:
firstLevelName = fieldnames(data);
secondLevelNames = fieldnames(data.(firstLevelName{1}));
for pL = 1:length(secondLevelNames)
    save([filename '.mat'], '-struct', 'data', firstLevelName{1}, secondLevelNames{pL});
end
The structure-saving method that you're trying to use doesn't work quite the way you are expecting. All the arguments after your struct variable name are fields of that struct to save.
The way MATLAB is interpreting your code is that you're trying to save the level1 and level21 fields of data, which obviously doesn't work since level21 is a subfield of level1, not of data.
To save the nested fields, the easiest thing is probably to create a new variable pointing to the structure data.level1 and then call save on that and specify the specific fields to save.
level1 = data.level1;
for pL = 1:numel(secondLevelNames)
    save(filename, '-struct', 'level1', secondLevelNames{pL});
end
If you actually want the double nesting in the saved data, you would need to create a new structure containing only the data that you wanted and then save that.
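for pL = 1:numel(secondLevelNames)
    % start from an empty first-level struct
    newstruct = struct(firstLevelName{1}, struct());
    % copy only the selected branch, keeping the double nesting
    newstruct.(firstLevelName{1}).(secondLevelNames{pL}) = data.(firstLevelName{1}).(secondLevelNames{pL});
    save(filename, '-struct', 'newstruct')
end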