How to delete a random edge in networkx?

Suppose you have a graph graph = nx.read_gml("x.gml") and you'd like to drop n edges. Is there any quick way to do so?

Here is one approach using the sample function from the random library. I set k, the number of edges to be sampled, to 2.
import networkx as nx
import random
G=nx.Graph()
G.add_edges_from([[1,2],[1,3],[2,3],[2,4],[3,5],[4,5]])
to_remove=random.sample(list(G.edges()),k=2)  # random.sample needs a sequence, so convert the EdgeView to a list
G.remove_edges_from(to_remove)
print(G.edges())
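To drop n edges rather than a fixed 2, the same idea can be wrapped in a small helper. This is only a sketch: the function name remove_random_edges and its seed parameter are made up for illustration, and it assumes a simple Graph or DiGraph rather than a multigraph.

```python
import random
import networkx as nx

def remove_random_edges(G, n, seed=None):
    """Remove n edges chosen uniformly at random from G (in place) and return them."""
    edges = list(G.edges())
    if n > len(edges):
        raise ValueError(f"cannot remove {n} edges from a graph with {len(edges)} edges")
    rng = random.Random(seed)          # local generator, so the global random state is untouched
    to_remove = rng.sample(edges, k=n)
    G.remove_edges_from(to_remove)
    return to_remove

G = nx.cycle_graph(6)                  # stand-in data: 6 nodes, 6 edges
removed = remove_random_edges(G, n=2, seed=42)
print(removed, G.number_of_edges())
```

Passing a seed makes the choice reproducible, which helps when comparing runs.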

Related

Binary classification(label 0 &1), which one is considered to be 'positive' when calculating recall, precision etc.?

When using pycaret to do binary classification (label 0 and 1), which one is considered to be 'positive' when calculating recall, precision etc.?
For example, I'm trying to build a model to predict whether a patient has a certain disease (0 = negative, 1 = positive). I'm aiming for high recall, to avoid cases in which the disease goes undetected. When I plot the confusion matrix, 0 appears where 'positive' is supposed to be in a normal confusion matrix. I'm so confused. Do I need to switch 0 and 1?
Any help is appreciated!
Maybe a solution is to create a 'manual' plot rather than using the integrated package. You can change the layout of the heatmap if you like.
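import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix
# y_test and y_pred are the true and predicted labels from your own model
matrix = confusion_matrix(y_test, y_pred)  # rows = actual, columns = predicted
sns.heatmap(matrix, annot=True)
plt.title("Confusion Matrix")
plt.ylabel("Actuals")
plt.xlabel("Predictions")
# Reverse both axes so that label 1 (positive) appears in the top-left corner
plt.ylim(0,2)
plt.xlim(2,0)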
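As for which label counts as 'positive': in scikit-learn's own metric functions, pos_label defaults to 1, and confusion_matrix sorts labels in ascending order, so class 0 ends up in the first row and column unless you ask otherwise. The sketch below shows both points with made-up labels. It uses scikit-learn directly; that PyCaret's report and plot follow the same convention is an assumption worth checking against your PyCaret version.

```python
from sklearn.metrics import confusion_matrix, recall_score

# Toy labels (made up for illustration): 1 = disease present, 0 = absent
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]

# pos_label defaults to 1, so this is the recall for class 1
recall_pos = recall_score(y_true, y_pred)
# Recall computed as if class 0 were the positive class
recall_neg = recall_score(y_true, y_pred, pos_label=0)

# labels=[1, 0] puts the positive class in the first row and column
cm = confusion_matrix(y_true, y_pred, labels=[1, 0])
(tp, fn), (fp, tn) = cm

print(recall_pos, recall_neg)
print(cm)
```

So the labels do not need to be switched for the metrics; only the display order of the matrix differs from the textbook layout.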

Networkx - Get probability p(k) from network

I have plotted the histogram of network (dataframe), with count of 'k' node connections, like so:
import seaborn as sns
parameter ='k'
sns.histplot(network[parameter])
But now I need to create a modular random graph using above group distribution with:
from networkx.generators.community import random_partition_graph
random_partition_graph(sizes, p_in, p_out, seed=None, directed=False)
And, instead of counts, I need this value p(k), which must be passed as p_in.
p_in (float)
probability of edges with in groups
How do I get p(k) from my network?
This is how I would handle what you described. First, you can normalize your histogram such that the integral of the histogram is equal to 1. This can be done by setting the weights argument of your histogram appropriately. This histogram can then be considered the probability distribution of your degrees. Now that you have this probability distribution, i.e. a list of probability (deg_prob in the code) you can randomly sample from it using np.random.choice(np.arange(np.amin(degrees),np.amax(degrees)+1), p=deg_prob, size=N_sampling). From this random sampling, you can then create a random expected_degree_graph by just passing your samples in the w argument.
You can then compare the degree distribution of your original graph with the one from your random graph.
See below for the code and more details:
import networkx as nx
from networkx.generators.random_graphs import binomial_graph
from networkx.generators.degree_seq import expected_degree_graph
import matplotlib.pyplot as plt
import numpy as np
fig=plt.figure()
N_nodes=1000
G=binomial_graph(n=N_nodes, p=0.01, seed=0) #Creating a random graph as data
degrees = np.array([G.degree(n) for n in G.nodes()])#Computing degrees of nodes
bins_val=np.arange(np.amin(degrees),np.amax(degrees)+2) #Bins
deg_prob,_,_=plt.hist(degrees,bins=bins_val,align='left',weights=np.ones_like(degrees)/N_nodes,
color='tab:orange',alpha=0.3,label='Original distribution')#Histogram
#Sampling from distribution
N_sampling=500
random_sampling=np.random.choice(np.arange(np.amin(degrees),np.amax(degrees)+1), p=deg_prob, size=N_sampling)
#Creating random graph from samples
G_random_sampling=expected_degree_graph(random_sampling,seed=0,selfloops=False)
degrees_random_sampling = np.array([G_random_sampling.degree(n) for n in G_random_sampling.nodes()])
deg_prob_random_sampling,_,_=plt.hist(degrees_random_sampling,bins=bins_val,align='left',
weights=np.ones_like(degrees_random_sampling)/N_sampling,color='tab:blue',label='Sample distribution',alpha=0.3)
#Plotting both histograms
plt.xticks(bins_val)
plt.xlabel('degree')
plt.ylabel('Prob')
plt.legend()
plt.show()
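The output is a figure overlaying the two normalized degree histograms: the original distribution (orange) and the sample distribution (blue).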
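If only the probabilities p(k) are needed, without drawing a histogram, they can also be computed directly from the degree counts. A minimal sketch using nx.degree_histogram, with a small path graph as stand-in data. Note that this gives a whole distribution over degrees, whereas p_in is a single float; how to reduce one to the other is not settled by the answer above.

```python
import networkx as nx
import numpy as np

G = nx.path_graph(5)  # stand-in data: node degrees are 1, 2, 2, 2, 1

counts = np.array(nx.degree_histogram(G))   # counts[k] = number of nodes with degree k
p_k = counts / counts.sum()                 # p_k[k] = fraction of nodes with degree k

for k, p in enumerate(p_k):
    print(k, p)
```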

Need to find clusters and their centroids in a h5 crowd density map file

I'm trying to use clustering techniques to find centroids (or medoids) for each group of people in a density map (of a real photo). How could I achieve that? I've already tried k-means, and the centroids it computes may well be correct, but how could I better view them over the image?
h5 file: density map of a crowd - points are representing people
Download the ".h5" from here: https://drive.google.com/file/d/1C5xvEQELswr4SJ5zhtYtUEVw2FbP2QWo/view?usp=sharing
I obtain the matrix of this h5 file through this code:
import sys
import numpy
import h5py
import matplotlib.pyplot as plt
from PIL import Image as im
with h5py.File('/content/img001001.h5', 'r') as hf:
    h5_matrix = hf.get('density')[:]  # read the 'density' dataset into a NumPy array
plt.imshow(h5_matrix)
#print(h5_matrix[:, 1])
print(h5_matrix.shape)
The plotted matrix looks like this:
https://drive.google.com/file/d/1f376lUPaWT58iBIg5E693uQfC22g5m3U/view?usp=sharing
What I would like to obtain: the density map with the centroids marked on it.
How could I achieve that?
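One possible way to view k-means centroids over the map is to cluster the coordinates of the non-zero pixels, weighted by their density, and then scatter the centroids on top of imshow. The sketch below is not taken from the thread: it uses a synthetic density map in place of the .h5 file, and it assumes the number of groups (2 here) is known in advance.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render to a file; drop this line for interactive use
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Synthetic stand-in for the density map: two groups of 50 "people" each
rng = np.random.default_rng(0)
density = np.zeros((100, 120))
for centre in [(30, 40), (70, 90)]:
    pts = np.rint(rng.normal(loc=centre, scale=3.0, size=(50, 2))).astype(int)
    np.add.at(density, (pts[:, 0], pts[:, 1]), 1.0)

# Cluster the occupied pixels, weighting each pixel by its density value
rows, cols = np.nonzero(density)
coords = np.column_stack([rows, cols])
km = KMeans(n_clusters=2, n_init=10, random_state=0)
km.fit(coords, sample_weight=density[rows, cols])
centroids = km.cluster_centers_   # each row is (row, col)

# Overlay: imshow puts columns on x and rows on y, so swap when scattering
fig, ax = plt.subplots()
ax.imshow(density, cmap="viridis")
ax.scatter(centroids[:, 1], centroids[:, 0], c="red", marker="x", s=120)
fig.savefig("density_with_centroids.png")
```

With the real file, density would be replaced by h5_matrix. Swapping x and y when scattering is the step that is easy to miss, because image arrays are indexed (row, column) while scatter expects (x, y).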

How to find neighbors within a distance for an unconnected node within a networkx python graph

I would like to simulate a wireless network with time-varying, mobile nodes. Every time a node wakes up or moves, it needs to search for its neighbours within a given distance. How can I find the nearby nodes? Are there any functions for this? Thank you
It's a single function: ego_graph. It lets you specify a distance parameter, called the radius.
import networkx as nx
# Sample data
G = nx.florentine_families_graph()
nx.draw_networkx(G, with_labels=True)
# Desired graph
H = nx.ego_graph(G, 'Acciaiuoli', radius=2)  # nodes in this graph are family names, not integers
nx.draw_networkx(H, with_labels=True)
The entire Florentine families graph:
And just those within distance 2 of the node 'Acciaiuoli':
If you're using a distance measure other than simple topological distance (i.e. counting edges), you can provide the distance parameter to the ego_graph function to specify an edge attribute to use for distance.
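A short sketch of that weighted variant, on a toy graph whose edge lengths are made up for illustration. The attribute name 'weight' is just the one chosen here; any edge attribute name can be passed as distance.

```python
import networkx as nx

G = nx.Graph()
# 'weight' holds a made-up link length for each edge
G.add_weighted_edges_from([
    ("a", "b", 1.0),
    ("b", "c", 0.5),
    ("c", "d", 3.0),
    ("a", "e", 2.5),
])

# Hop-count neighbourhood: everything within 2 edges of "a"
by_hops = nx.ego_graph(G, "a", radius=2)
# Weighted neighbourhood: everything whose shortest weighted path from "a" is <= 2.0
by_length = nx.ego_graph(G, "a", radius=2.0, distance="weight")

print(sorted(by_hops.nodes()))    # ['a', 'b', 'c', 'e']
print(sorted(by_length.nodes()))  # ['a', 'b', 'c']
```

Node "e" is one hop away but 2.5 units long, so it is inside the hop-count neighbourhood and outside the weighted one.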

Spectral clustering using scikit learn on graph generated through networkx

I have a 3000x50 feature vector matrix. I obtained a similarity matrix for this using sklearn.metrics.pairwise_distances as 'Similarity_Matrix'. Now I used networkx to create a graph using the similarity matrix generated in the previous step as G=nx.from_numpy_matrix(Similarity_Matrix). I want to perform spectral clustering on this graph G now but several google searches have failed to provide a decent example of scikit learn spectral clustering on this graph :( The official documentation shows how spectral clustering can be done on some image data which is highly unclear at least to a newbie like myself.
Can anyone give me a code sample for this or for graph cuts or graph partitioning using networkx, scikit learn etc.
Thanks a million!
adj_matrix = nx.to_numpy_array(G) gives you the adjacency matrix of the graph, which serves as your affinity matrix (in NetworkX 3.0 and later, to_numpy_array replaces the removed to_numpy_matrix). Feed it to scikit-learn like this: SpectralClustering(affinity='precomputed', assign_labels="discretize", random_state=0, n_clusters=2).fit_predict(adj_matrix)
If you don't have any similarity matrix, you can change the value of 'affinity' param to 'rbf' or 'nearest_neighbors'. An example below explains the entire Spectral Clustering pipeline:
import networkx as nx
import matplotlib.pyplot as plt
from sklearn.cluster import SpectralClustering
'''Graph creation and initialization'''
G=nx.Graph()
G.add_edge(1,2) # default edge weight=1
G.add_edge(3,4,weight=0.2) #weight represents edge weight or affinity
G.add_edge(2,3,weight=0.9)
G.add_edge("Hello", "World", weight= 0.6)
'''Matrix creation'''
adj_matrix = nx.to_numpy_array(G) #adj_matrix[i][j] is the weight between nodes i and j (to_numpy_matrix was removed in NetworkX 3.0)
node_list = list(G.nodes()) #nodes in the same order as the rows/columns of adj_matrix
'''Spectral Clustering'''
clusters = SpectralClustering(affinity = 'precomputed', assign_labels="discretize",random_state=0,n_clusters=2).fit_predict(adj_matrix)
plt.scatter([str(n) for n in node_list],clusters,c=clusters, s=50, cmap='viridis') #str() because the node labels mix ints and strings
plt.show()
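One caveat for the setup in the question: pairwise_distances returns distances (0 means identical), while affinity='precomputed' expects similarities (larger means more alike). A common conversion is a Gaussian (RBF) kernel, sketched below on stand-in data. The blob data and the kernel width sigma are assumptions picked for illustration, not values from the question.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(0)
# Stand-in data: two well-separated 2-D blobs of 20 points each
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(20, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(20, 2)),
])

distances = pairwise_distances(X)                    # large value = far apart
sigma = 1.5                                          # kernel width, chosen by hand here
affinity = np.exp(-distances**2 / (2 * sigma**2))    # large value = similar

labels = SpectralClustering(
    n_clusters=2, affinity="precomputed", random_state=0
).fit_predict(affinity)
print(labels)
```

The affinity matrix can be passed to SpectralClustering directly, so the round trip through a networkx graph is only needed if the graph is used for something else.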