NetworkX gives attributes as neighbors

I need to mark nodes as visited in a traversal I'm doing, so I do G[node]['visited'] = True. However, this messes up G.neighbors(node), giving me 'visited' as a neighbor of node! What is the appropriate way to handle this?
Example:
>>> import networkx as nx
>>> G = nx.Graph()
>>> G.add_edge(0,1)
>>> G[0]['visited'] = True
>>> G.neighbors(0)
['visited', 1]

Instead of G[0]['visited'] = True use G.node[0]['visited'] = True.
Here is an example of what you want. You can check the attribute value the same way you set it.
>>> import networkx as nx
>>> G = nx.Graph()
>>> G.add_edge(0,1)
>>> G.node[0]['visited'] = True
>>> G.neighbors(0)
[1]
>>> G.node[0]['visited']
True
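In newer NetworkX releases (2.4 and later, as far as I am aware) the G.node attribute is no longer available and G.nodes is used instead. The following is a minimal sketch of the same idea under that assumption; the variable names neighbors, visited and unvisited are only illustrative.

```python
import networkx as nx

G = nx.Graph()
G.add_edge(0, 1)

# Store the flag on the node's attribute dict, not on the adjacency dict
G.nodes[0]['visited'] = True

# neighbors() returns an iterator in NetworkX 2.x, so wrap it in list()
neighbors = list(G.neighbors(0))

# .get() avoids a KeyError for nodes that have not been marked yet
visited = G.nodes[0].get('visited', False)
unvisited = G.nodes[1].get('visited', False)
```

Using .get() with a default means the nodes do not all need to be initialised before the traversal starts.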

Related

Find neighboring nodes in graph

I've the following graph:
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
g.add_edge(131,673,weight=673)
g.add_edge(131,201,weight=201)
g.add_edge(131,303,weight=20)
g.add_edge(673,96,weight=96)
g.add_edge(673,205,weight=44)
g.add_edge(673,110,weight=7)
g.add_edge(201,96,weight=96)
g.add_edge(201,232,weight=10)
nx.draw(g,with_labels=True)
plt.show()
g.nodes(data=True)
g.edges(data=True)
I need to create a function myfunction(g, node_list) that returns a subgraph containing only the edges incident to the given nodes whose weight is < 50.
For example, if I run myfunction(g, [131, 201]), the output should be:
EdgeDataView([(131, 303, {'weight': 20}), (201, 232, {'weight': 10})])
A way to do that is by looping through all the nodes in your list and finding their neighbors with the nx.neighbors function from networkx. You can then set up an if condition to check the weight of the edge between the node of interest and its neighbors. If the condition satisfies your constraint, you can add the neighbor, the edge, and the weight to your subgraph.
See code below:
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
g.add_edge(131,673,weight=673)
g.add_edge(131,201,weight=201)
g.add_edge(131,303,weight=20)
g.add_edge(673,96,weight=96)
g.add_edge(673,205,weight=44)
g.add_edge(673,110,weight=7)
g.add_edge(201,96,weight=96)
g.add_edge(201,232,weight=10)
fig=plt.figure(figsize=(10,10))
#Plot full graph
plt.subplot(211)
plt.title('Full graph')
labels_g = nx.get_edge_attributes(g,'weight')
pos_g=nx.circular_layout(g)
nx.draw_networkx_edge_labels(g,pos_g,edge_labels=labels_g)
nx.draw(g,pos=pos_g,with_labels=True)
def check_neighbor_weights(g,nodes):
    subg=nx.Graph() #Create subgraph
    for n in nodes:
        subg.add_node(n)
        neighbors=g.neighbors(n) #Find all neighbors of node n
        for neighs in neighbors:
            if g[n][neighs]['weight']<50: #Check if the weight is below 50
                subg.add_edge(n,neighs,weight=g[n][neighs]['weight'])
    return subg
subg=check_neighbor_weights(g,[131,201]) #Returns subgraph of interest
plt.subplot(212)
plt.title('subgraph')
labels_subg = nx.get_edge_attributes(subg,'weight')
pos_subg=nx.circular_layout(subg)
nx.draw_networkx_edge_labels(subg,pos=pos_subg,edge_labels=labels_subg)
nx.draw(subg,pos=pos_subg,with_labels=True)
plt.show()
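And the output gives: (plot not preserved; it shows the full graph in the top panel and the filtered subgraph in the bottom panel)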
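As a more compact variant of the same filtering step, the edges touching the given nodes can be read directly with g.edges(nodes, data='weight') instead of looping over neighbors. This is only a sketch: the function name edges_below and the limit parameter are hypothetical and not part of the question or the answer above.

```python
import networkx as nx

g = nx.Graph()
g.add_edge(131, 673, weight=673)
g.add_edge(131, 201, weight=201)
g.add_edge(131, 303, weight=20)
g.add_edge(673, 96, weight=96)
g.add_edge(673, 205, weight=44)
g.add_edge(673, 110, weight=7)
g.add_edge(201, 96, weight=96)
g.add_edge(201, 232, weight=10)

def edges_below(g, nodes, limit=50):
    """Return a new graph with the edges incident to `nodes` whose weight < limit."""
    subg = nx.Graph()
    subg.add_nodes_from(nodes)  # keep the query nodes even if no edge qualifies
    # g.edges(nodes, data='weight') yields (u, v, weight) for edges touching `nodes`
    light = ((u, v, w) for u, v, w in g.edges(nodes, data='weight') if w < limit)
    subg.add_weighted_edges_from(light)
    return subg

subg = edges_below(g, [131, 201])
edge_list = sorted(subg.edges(data=True))
```

Calling subg.edges(data=True) on the result gives a view with the same two edges requested in the question.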

gaussian process regression in multiple dimensions with GPflow

I would like to perform multivariate regression using Gaussian process regression as implemented in GPflow, version 2.
Installed with pip install gpflow==2.0.0rc1
Below is some example code that generates 2D data, attempts to fit it using GPR, and finally computes the difference between the true input data and the GPR prediction.
Eventually I would like to extend this to higher dimensions, test against a validation set to check for over-fitting, and experiment with other kernels and "Automatic Relevance Determination", but understanding how to get this to work is the first step.
Thanks!
The following code snippet will work in a jupyter notebook.
import gpflow
import numpy as np
import matplotlib
from gpflow.utilities import print_summary
%matplotlib inline
matplotlib.rcParams['figure.figsize'] = (12, 6)
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
def gen_data(X, Y):
    """
    make some fake data.
    X, Y are np.ndarrays with shape (N,) where
    N is the number of samples.
    """
    ys = []
    for x0, x1 in zip(X,Y):
        y = x0 * np.sin(x0*10)
        y = x1 * np.sin(x0*10)
        y += 1
        ys.append(y)
    return np.array(ys)
# generate some fake data
x = np.linspace(0, 1, 20)
X, Y = np.meshgrid(x, x)
X = X.ravel()
Y = Y.ravel()
z = gen_data(X, Y)
#note X.shape, Y.shape and z.shape
#are all (400,) for this case.
# if you would like to plot the data you can do the following
fig = plt.figure()
ax = Axes3D(fig)
ax.scatter(X, Y, z, s=100, c='k')
# had to set this
# to avoid the following error
# tensorflow.python.framework.errors_impl.InvalidArgumentError: Cholesky decomposition was not successful. The input might not be valid. [Op:Cholesky]
gpflow.config.set_default_positive_minimum(1e-7)
# setup the kernel
k = gpflow.kernels.Matern52()
# set up GPR model
# I think the shape of the independent data
# should be (400, 2) for this case
XY = np.column_stack([[X, Y]]).T
print(XY.shape) # this will be (400, 2)
m = gpflow.models.GPR(data=(XY, z), kernel=k, mean_function=None)
# optimise hyper-parameters
opt = gpflow.optimizers.Scipy()
def objective_closure():
    return - m.log_marginal_likelihood()
opt_logs = opt.minimize(objective_closure,
                        m.trainable_variables,
                        options=dict(maxiter=100)
                        )
# predict training set
mean, var = m.predict_f(XY)
print(mean.numpy().shape)
# (400, 400)
# I would expect this to be (400,)
# If it was then I could compute the difference
# between the true data and the GPR prediction
# `diff = mean - z`
# but because the shape is not as expected this of course
# won't work.
The shape of z must be (N, 1), whereas in your case it is (N,). However, this is a missing check in GPflow and not your fault.
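A minimal sketch of that reshape, using NumPy only. The placeholder array stands in for the z from the question, the name z_col is hypothetical, and the commented lines show how the names from the question (XY, k, m) would presumably be used afterwards.

```python
import numpy as np

N = 400
z = np.zeros(N)            # placeholder for the targets, shape (N,)

# Turn the flat vector into a single-column matrix of shape (N, 1)
z_col = z.reshape(-1, 1)

# Assumed continuation with the objects defined in the question:
# m = gpflow.models.GPR(data=(XY, z_col), kernel=k, mean_function=None)
# mean, var = m.predict_f(XY)     # mean should then have shape (N, 1)
# diff = mean.numpy() - z_col     # element-wise difference, shape (N, 1)
```

Subtracting z_col rather than the original z keeps the difference at shape (N, 1); subtracting the flat z from an (N, 1) array would broadcast to (N, N).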

copy construct from a tensor: USER WARNING

I am creating a random tensor from a normal distribution. Since this tensor serves as a weight in the NN, I use torch.tensor() to add the requires_grad attribute, as below:
import torch
input_dim, hidden_dim = 3, 5
norm = torch.distributions.normal.Normal(loc=0, scale=0.01)
W = norm.sample((input_dim, hidden_dim))
W = torch.tensor(W, requires_grad=True)
I am getting a UserWarning, as below:
UserWarning: To copy construct from a tensor,
it is recommended to use sourceTensor.clone().detach() or
sourceTensor.clone().detach().requires_grad_(True),
rather than torch.tensor(sourceTensor).
Is there an alternative way to achieve the above? Thanks
You can just set W.requires_grad to True:
import torch
input_dim, hidden_dim = 3, 5
norm = torch.distributions.normal.Normal(loc=0, scale=0.01)
W = norm.sample((input_dim, hidden_dim))
W.requires_grad = True
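The warning text itself names another pattern, sourceTensor.clone().detach().requires_grad_(True). A short sketch of that variant follows; the intermediate name sample is only illustrative. It makes a copy, so the original sample stays untouched.

```python
import torch

input_dim, hidden_dim = 3, 5
norm = torch.distributions.normal.Normal(loc=0, scale=0.01)
sample = norm.sample((input_dim, hidden_dim))

# Copy the sampled values, cut any link to a graph, then enable gradients
W = sample.clone().detach().requires_grad_(True)
```

If no separate copy is needed, setting the flag in place as shown above is enough.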

Skewnorm not fitting properly

This is a follow-up to my previous question here. I'm trying to fit my data from this csv file with scipy.stats.skewnorm, but I can't get it working right:
import matplotlib.pyplot as plt
import pandas as pd
from scipy.stats import skewnorm
df = pd.read_csv('astro_data.csv')
x = df['delta z']
number_bins = 50
fig, ax = plt.subplots()
h, edges, _ = ax.hist(x, alpha = 0.5,
                      density = False,
                      bins = number_bins)
a_est, loc_est, scale_est = skewnorm.fit(x)
ax.plot(x, skewnorm.pdf(x, a_est, loc_est, scale_est), 'r-', lw=5, alpha=0.6, label='skewnorm pdf')
Can anyone see how I can fix this?
EDIT: when I change to density=True, the result is this: (image not preserved)

passing a tuple to fill_value in scipy.interpolate.interp1d results in ValueError

The docs for scipy.interpolate.interp1d (v0.17.0) say the following for the optional fill_value argument:
fill_value : ... If a two-element tuple, then the first element is used as a fill value for x_new < x[0] and the second element is used for x_new > x[-1].
Thus I pass a two-element tuple in this code:
import numpy as np
from scipy.interpolate import interp1d
N=100
x=np.arange(N)
y=x*x
interpolator=interp1d(x,y,kind='linear',bounds_error=False,fill_value=(x[0],x[-1]))
r=np.arange(1,70)
interpolator(r)
But it throws ValueError:
ValueError: shape mismatch: value array of shape (2,) could not be broadcast to indexing result of shape (0,1)
Can anyone please point out what I am doing wrong here?
Thanks in advance for any help.
It's a bug which has been fixed in the current dev version:
>>> N = 100
>>> x = np.arange(N)
>>> y = x**2
>>> from scipy.interpolate import interp1d
>>> iii = interp1d(x, y, fill_value=(-10, 10), bounds_error=False)
>>> iii(-1)
array(-10.0)
>>> iii(101)
array(10.0)
>>> scipy.__version__
'0.18.0.dev0+8b07439'
That being said, if all you want is a linear interpolation with fill values for left-hand and right-hand sides, you can use np.interp directly.
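A minimal sketch of the np.interp route, reusing the data and the fill values from the session above. The left and right keyword arguments of np.interp set the values returned below x[0] and above x[-1]; the names x_new and result are only illustrative.

```python
import numpy as np

N = 100
x = np.arange(N)
y = x**2

# One point left of the data, one inside it, one right of it
x_new = np.array([-1.0, 2.5, 101.0])

# Linear interpolation inside [x[0], x[-1]], fixed fill values outside
result = np.interp(x_new, x, y, left=-10.0, right=10.0)
```

Note that np.interp only does linear interpolation and expects x to be increasing, so it does not replace interp1d for other values of kind.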