how to create graph from edge list using GraphFrame - pyspark

I have a dataframe with two columns that form an edge list, and I want to create a graph from it using pyspark or python.
Can anyone suggest how to do it?
In R it can be done using the command below from igraph:
graph.edgelist(as.matrix(df))
My input dataframe is df:
valx valy
1: 600060 09283744
2: 600131 96733110
3: 600194 01700001
My output should look like the one below (it's basically all valx and valy values under V1 and their membership info under V2):
V1 V2
600060 1
96733110 1
01700001 2

Judging by your desired output, you don't seem to want a graph but rather an array that shows which row each V1 value was originally stored in, which you can get from your original dataframe.
I'm going to assume that what you want is to turn the dataframe into a graph format and not the above.
import networkx as nx
import pandas as pd
filelocation = r'C:\Users\Documents\Tilo Edgelist'
Panda_edgelist = pd.read_csv(filelocation)
g = nx.from_pandas_edgelist(Panda_edgelist,'valx','valy')
nx.draw(g,with_labels = True,node_size = 0)
The code above creates a graph in Python and draws it using the draw function from networkx.
I've gone ahead and assumed that you're creating the dataframe by reading in some sort of file.
If you can convert this file into a csv file, then you can read it into a dataframe with pandas.
The format of the csv file I used is as follows:
valx,valy
600060,09283744
600131,96733110
600194,01700001
Substitute the path to your csv file for the filepath between the quotation marks.
Below you can see what the dataframe from pd.read_csv looks like:
valx valy
0 600060 9283744
1 600131 96733110
2 600194 1700001
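Note that the leading zeros of the valy values are gone in the output above (09283744 became 9283744), because pandas parsed the column as integers. If the values are identifiers rather than numbers, one possible way to keep them intact is to read every column as a string. This is a minimal sketch; the inline csv_text is only a stand-in for the real file, and edges_df is a name chosen for illustration.

```python
import io

import pandas as pd

# Stand-in for the csv file described above (same header and rows).
csv_text = "valx,valy\n600060,09283744\n600131,96733110\n600194,01700001\n"

# dtype=str stops pandas from parsing the IDs as integers,
# so leading zeros are preserved.
edges_df = pd.read_csv(io.StringIO(csv_text), dtype=str)
print(edges_df)
```

The resulting dataframe can be passed to nx.from_pandas_edgelist in the same way; the node labels are then strings instead of integers.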
So then we pass this dataframe to networkx to create the graph
g = nx.from_pandas_edgelist(Panda_edgelist,'valx','valy')
In the function above, you can see I've given it the argument Panda_edgelist and then 'valx' and 'valy' as the source and target node column names, respectively. It uses these arguments to create a graph called g.
Finally, I've drawn the graph generated to console using nx.draw.
nx.draw(g,with_labels = True,node_size = 0)
This function needs you to pass it the graph, g in our case.
with_labels = True is used to draw the node names/ID.
node_size = 0 is used to make the size of the drawn node 0. If you don't give the function this argument, it draws a filled circle for each node in the graph (red in older networkx releases).
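If the "membership" in the question means connected-component membership, as returned by igraph's clustering functions (this is an assumption; the sample output does not make it clear), networkx can compute it from the same edge list. The sketch below builds the graph directly from tuples instead of a csv file, and the membership dictionary is a name introduced for illustration.

```python
import networkx as nx

# Edge list from the question, kept as strings so leading zeros survive.
edges = [
    ("600060", "09283744"),
    ("600131", "96733110"),
    ("600194", "01700001"),
]

g = nx.Graph()
g.add_edges_from(edges)

# Give every node the 1-based id of the connected component it belongs to.
membership = {}
for component_id, nodes in enumerate(nx.connected_components(g), start=1):
    for node in nodes:
        membership[node] = component_id

for node, component_id in membership.items():
    print(node, component_id)
```

With this sample data every edge is its own component, so the two endpoints of each row share an id and different rows get different ids.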

Related

Paraview - get current time_index in ProgrammableSource

I have an array that has exactly as many rows as there are time steps in the animation. Now I want to have the row associated to the current time step as vtkTable as output of a ProgrammableSource. My code (for the ProgrammableSource) so far looks like this:
import numpy as np
file = "table.csv"
tbl = np.genfromtxt(file, names=True, delimiter=",", autostrip=True)
for n in tbl.dtype.names:
    v = tbl[n][2]
    output.RowData.append(v, n)
Which currently always writes out the third line (v = tbl[n][2]). Is there a way to pass the current time step (index) in place of [2]?

The best approach is to make your Programmable Source time-aware.
To do so, fill in the Information Script (an advanced property) to declare the list of available timesteps, and then in the main script get the current timestep and output the data you want.
See this example from the doc.
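Once the main script has the requested time value, choosing the matching table row is a nearest-timestep lookup. The sketch below shows only that lookup in plain Python; how the time value is obtained inside ParaView is not covered here, and row_index_for_time and the timesteps list are hypothetical names, not part of the ParaView API.

```python
from bisect import bisect_left


def row_index_for_time(timesteps, requested_time):
    """Return the index of the timestep closest to requested_time.

    timesteps must be a non-empty list sorted in ascending order.
    """
    if not timesteps:
        raise ValueError("timesteps must not be empty")
    pos = bisect_left(timesteps, requested_time)
    if pos == 0:
        return 0
    if pos == len(timesteps):
        return len(timesteps) - 1
    before, after = timesteps[pos - 1], timesteps[pos]
    # Pick whichever neighbour is nearer; ties go to the earlier step.
    if requested_time - before <= after - requested_time:
        return pos - 1
    return pos


# Hypothetical timesteps, one per row of the table.
timesteps = [0.0, 0.5, 1.0, 1.5]
index = row_index_for_time(timesteps, 1.0)
print(index)
```

The returned index would then take the place of the hard-coded [2] in the question's loop.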

Save and re-load a weighted graph from OSMnx for NetworkX

I am using OSMnx to get a graph and add a new edge attribute (w3) representing a custom weight for each edge. Then I can successfully find 2 different shortest paths between 2 points using NetworkX with 'length' and 'w3' as weights. Everything works fine; this is my code:
G = ox.graph_from_place(PLACE, network_type='all_private', retain_all=True, simplify=True, truncate_by_edge=False)
w3_dict = dict((zip(zip(lu, lv, lk),lw3)))
nx.set_edge_attributes(G, w3_dict, "w3")
route_1 = nx.shortest_path(G, node_start, node_stop, weight = 'length')
route_2 = nx.shortest_path(G, node_start, node_stop, weight = 'w3')
Now I would like to save G to disk and reopen it, to perform more navigation tasks later on. But after saving it with:
ox.save_graph_xml(G, filepath='DATA/network.osm')
and reopen it with:
G = ox.graph_from_xml('DATA/network.osm')
my custom attribute w3 has disappeared. I have followed the instructions in the docs but with no luck. It feels like I'm missing something really obvious, but I don't understand what it is.

Use the ox.save_graphml and ox.load_graphml functions to save/load full-featured OSMnx/NetworkX graphs to/from disk for later use. The save xml function exists only to allow serialization to the .osm file format for applications that require it, and has many constraints to conform to that.
import networkx as nx
import osmnx as ox
ox.config(use_cache=True, log_console=True)
# get a graph, set 'w3' edge attribute
G = ox.graph_from_place('Piedmont, CA, USA', network_type='drive')
nx.set_edge_attributes(G, 100, 'w3')
# save graph to disk
ox.save_graphml(G, './data/graph.graphml')
# load graph from disk and confirm 'w3' edge attribute is there
G2 = ox.load_graphml('./data/graph.graphml')
nx.get_edge_attributes(G2, 'w3')

Plot a graph with ipycytoscape (and networkx)

Following the ipycytoscape instructions, I am not able to plot a graph using ipycytoscape.
According to https://github.com/QuantStack/ipycytoscape/blob/master/examples/Test%20NetworkX%20methods.ipynb
this should work:
import networkx as nx
import ipycytoscape
G2 = nx.Graph()
G2.add_nodes_from([*'ABCDEF'])
G2.add_edges_from([('A','B'),('B','C'),('C','D'),('E','F')])
print(G2.nodes)
print(G2.edges)
cytoscapeobj = ipycytoscape.CytoscapeWidget()
cytoscapeobj.graph.add_graph_from_networkx(nx_graph)
G2 is a networkx graph example and it looks ok since print(G2) gives the networkx object back and G2.nodes and G2.edges can be printed.
The error:
ValueError: invalid literal for int() with base 10: 'A'
Why should a node be an integer?
More generally, what should I do if the starting point is a pandas dataframe with a million rows of edges, each a string pair like ProcessA-ProcessB, processC-processD, etc.?
Also, looking at the examples, note that the list of nodes consists of a data dictionary for every node, and that data includes an "id" per node and also "Atribute". The surprise here is that the networkx Graph would need to have all those properties.
thanks

This problem was fixed. See attachment.
Please let me know if it's still happening. Feel free to open an issue: https://github.com/QuantStack/ipycytoscape/

I'm just playing around with ipycytoscape myself, so I could be way off-base, but, shouldn't the line be:
cytoscapeobj.graph.add_graph_from_networkx(G2) # your graph name goes here
Trying to generate a cytoscape object built on a graph that doesn't exist might trigger a ValueError because it can't find any nodes.

Tensorflow 0.8 Import and Export output tensors problems

I am using Tensorflow 0.8 with Python 3. I am trying to train a neural network, and the goal is to automatically export/import the network state every 50 iterations. The problem is that when I export the output tensors at the first iteration, the output tensor names are ['Neg:0', 'Slice:0'], but when I export them at the second iteration, the names have changed to ['import/Neg:0', 'import/Slice:0'], and importing these output tensors then fails:
ValueError: Specified colocation to an op that does not exist during import: import/Variable in import/Variable/read
I wonder if anyone has ideas on this problem. Thanks!!!

That's how tf.import_graph_def works.
If you don't want the prefix, just set the name parameter to the empty string, as shown in the following example.
# import the model into the current graph
with tf.Graph().as_default() as graph:
    const_graph_def = tf.GraphDef()
    with open(TRAINED_MODEL_FILENAME, 'rb') as saved_graph:
        const_graph_def.ParseFromString(saved_graph.read())
        # replace current graph with the saved graph def (and content)
        # name="" is important because otherwise (with name=None)
        # the graph definitions will be prefixed with import.
        # eg: the defined operation FC2/unscaled_logits:0
        # will be import/FC2/unscaled_logits:0
        tf.import_graph_def(const_graph_def, name="")
    [...]

Modelica vector parameters from a file

Is it possible to read a vector of parameters from a file?
I'm trying to create a vector of objects, such as the one shown in the linked document starting on page 49. However, I would like to pull the specific resistance and capacitance values from a text file. (I'm actually just using this as an example of how to read it in.)
So, the example fills in data like this:
A.Basic.Resistor R[N + 1](R = vector([Re/2; fill(Re,N-1); Re/2]) );
A.Basic.Capacitor C[N](each C = c*L/N);
But instead I have a text file that contains something like the following, where the first column is the index, the second is the R values and the third is the C values:
#1
double test1(4,3) #First set of data (row then col)
1.0 1.0 10.0
2.0 2.0 30.0
3.0 5.0 50.0
4.0 7.0 100.0
I know that I can read this data in using a CombiTable1D or CombiTable2D. But, is there a way to essentially convert each column of data to a vector so that I can do something analogous to:
ReadInTableFromDisk
A.Basic.Resistor R[N + 1](R = FirstDataColumnOfDataOnDisk );
A.Basic.Capacitor C[N](each C = SecondDataColumnOfDataOnDisk);

I would recommend the ExternData library if you want to load external data files into your Modelica tool.
It is a Modelica library for data I/O of INI, JSON, XML, MATLAB MAT and Excel XLS/XLSX files.

There is the vector() function that converts arrays to vectors.
