Calculation of node betweenness from a weighted adjacency matrix - networkx

I have an adjacency matrix whose nonzero elements give the weights of the links. The weights are positive decimals below 1. For example, consider the weighted adjacency matrix a below:
array([[0. , 0.93, 0.84, 0.76],
[0.93, 0. , 0.93, 0.85],
[0.84, 0.93, 0. , 0.92],
[0.76, 0.85, 0.92, 0. ]])
I would like to obtain the betweenness centrality of every node in it. Note that my actual adjacency matrix is 2000 x 2000. I am new to networkx, so any help would be greatly appreciated.

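You can try this:
import networkx as nx
import numpy as np
# Weighted adjacency matrix; a zero entry means there is no link
A = np.array([[0.  , 0.93, 0.84, 0.76, 0.64],
              [0.93, 0.  , 0.93, 0.85, 0.  ],
              [0.84, 0.93, 0.  , 0.92, 0.32],
              [0.76, 0.85, 0.92, 0.  , 0.55],
              [0.64, 0.  , 0.32, 0.55, 0.  ]])
# from_numpy_array replaces from_numpy_matrix, which was removed in NetworkX 3.0
G = nx.from_numpy_array(A)
betweenness_dict = nx.betweenness_centrality(G, weight='weight')
betweenness_dict will contain the betweenness centrality of all the nodes. Note that NetworkX treats the weight as a distance when finding shortest paths, so a larger weight means a longer link.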
{0: 0.0, 1: 0.0, 2: 0.13888888888888887, 3: 0.0, 4: 0.13888888888888887}
You can read more in the NetworkX documentation for betweenness_centrality.
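One caveat worth checking: because the weight is used as a distance, a matrix whose entries are similarities or link strengths may need converting first. The sketch below assumes that the inverse of the weight is an acceptable distance. That choice and the edge attribute name "distance" are assumptions for illustration, not part of the original answer.

```python
import networkx as nx
import numpy as np

# Matrix from the answer above; a zero entry means "no link".
A = np.array([[0.0, 0.93, 0.84, 0.76, 0.64],
              [0.93, 0.0, 0.93, 0.85, 0.0],
              [0.84, 0.93, 0.0, 0.92, 0.32],
              [0.76, 0.85, 0.92, 0.0, 0.55],
              [0.64, 0.0, 0.32, 0.55, 0.0]])

# Each nonzero entry becomes an edge with a "weight" attribute.
G = nx.from_numpy_array(A)

# Assumption: a strong link should count as a short one, so use 1 / weight.
for u, v, data in G.edges(data=True):
    data["distance"] = 1.0 / data["weight"]

# Shortest paths are now measured with the derived distances.
betweenness = nx.betweenness_centrality(G, weight="distance", normalized=True)
print(betweenness)
```

For a 2000 x 2000 matrix the exact computation may be slow. betweenness_centrality also accepts a k argument (with an optional seed) that estimates the values from k sampled source nodes, which may be an acceptable trade-off.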

Related

How to plot a 3d graph in Matlab with my data?

Right now I am doing a parameter sweep and I am trying to convert my data to a 3D graph to show the results in a very nice fashion. The problem is that I don't quite know how to plot it as I am having an issue with the result variable.
mute_rate = [0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625]
mute_step = linspace(-2.5, 2.5, 6)
results = [949.58, 293.53, 57.69, 53.65, 293.41, 1257.49;
279.19, 97.94, 32.60, 29.52, 90.52, 286.94;
32.96, 28.06, 19.56, 6.44, 13.47, 55.80;
2.01, 1.52, 5.38, 1.00, 0.89, 1.41;
0.61, 0.01, 18.59, 0.03, 0.56, 1.22;
1.85, 1.51, 18.64, 18.57, 18.54, 6.90]
So the first row of the results variable holds the results for the first mute rate (0.5) at each mute step, as applied to the population in my genetic algorithm. For example:
0.5, -2.5 = 949.58,
0.5, -1.5 = 293.53,
0.5, -0.5 = 57.69
etc
It sounds like you want something akin to:
mesh(mute_step, mute_rate, results);
shading interp;
Other styles of plot would be surf or pcolor (for a 2d view).

Kolmogorov-Smirnov Test Statistic

Can someone explain why, if I calculate manually the KS test statistic, the result is different from when I use scipy.stats.kstest?
>>> sample = np.array([1000,2000,2500,3000,5000])
>>> ecdf = np.array([0.2, 0.4, 0.6, 0.8, 1. ])
>>> cdf = stats.weibull_min(0.3, 100, 4000).cdf(sample)
>>> abs(ecdf - cdf).max()
0.3454961536273503
>>> stats.kstest(rvs=sample, cdf=stats.weibull_min(0.3, 100, 4000).cdf)
KstestResult(statistic=0.4722995454382698, pvalue=0.1534647709785294)
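OK, I realized the mistake I made, so I will answer my own question. The KS statistic can't be calculated as abs(ecdf - cdf).max(), because the ECDF is a right-continuous step function that jumps at each sample point. The CDF has to be compared with the ECDF value both at each sample point and just before it. The correct approach is: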
>>> sample = np.array([1000, 2000, 2500, 3000, 5000])
>>> ecdf = np.array([0, 0.2, 0.4, 0.6, 0.8, 1. ])
>>> cdf = stats.weibull_min(0.3, 100, 4000).cdf(sample)
>>> max([(ecdf[1:] - cdf).max(), (cdf - ecdf[:-1]).max()])
0.4722995454382698
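The same idea can be wrapped in a small function. This is a minimal sketch in plain Python, assuming an unweighted sample and a continuous reference CDF. The names ks_statistic and weibull_cdf are made up here, and the Weibull CDF is written out by hand with the shape, location and scale from the question instead of calling scipy.

```python
import math

def ks_statistic(sample, cdf):
    """Two-sided KS statistic of `sample` against a continuous `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    # ECDF value at each point (after the jump) minus the CDF.
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    # CDF minus the ECDF value just before each point (before the jump).
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)

def weibull_cdf(x, c=0.3, loc=100.0, scale=4000.0):
    """Three-parameter Weibull CDF; default parameters taken from the question."""
    if x <= loc:
        return 0.0
    return 1.0 - math.exp(-(((x - loc) / scale) ** c))

sample = [1000, 2000, 2500, 3000, 5000]
print(ks_statistic(sample, weibull_cdf))
```

Here the largest gap occurs just before the first sample point, where the ECDF is still 0. That is the case the naive abs(ecdf - cdf).max() misses.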

Keras back propagation

Suppose I have defined a network using Keras as follows:
model = Sequential()
model.add(Dense(10, input_shape=(10,), activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(9, activation='sigmoid'))
It has some weights:
[array([[ 0.33494413, -0.34308964, 0.12796348, 0.17187083, -0.40254939,
-0.06909397, -0.30374748, 0.14217842, 0.41163749, -0.15252003],
[-0.07053435, 0.53712451, -0.43015254, -0.28653857, 0.53299475, ...
When I give it some input:
[[ 0. 0.5 0. 0.5 1. 1. 0. 0.5 0.5 0.5]]
It produces some output:
[0.5476531982421875, 0.5172237753868103, 0.5247090458869934, 0.49434927105903625, 0.4599153697490692, 0.44612908363342285, 0.4727349579334259, 0.5116984844207764, 0.49565717577934265]
Whereas the desired output is:
[0.6776225034927386, 0.0, 0.5247090458869934, 0.0, 0.0, 0.0, 0.4727349579334259, 0.5116984844207764, 0.49565717577934265]
giving the error value (desired output minus actual output):
[0.12996930525055106, -0.5172237753868103, 0.0, -0.49434927105903625, -0.4599153697490692, -0.44612908363342285, 0.0, 0.0, 0.0]
I can then calculate the evaluated gradients as follows:
outputTensor = model.output
listOfVariableTensors = model.trainable_weights
gradients = k.gradients(outputTensor, listOfVariableTensors)
trainingInputs = inputs
sess = tf.InteractiveSession()
sess.run(tf.initialize_all_variables())
evaluated_gradients = sess.run(gradients, feed_dict={model.input: trainingInputs})
Which yields the evaluated gradients:
[array([[ 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ],
[ 0.01015381, 0. , 0. , 0.03375177, -0.05576257,
0.03318337, -0.02608909, -0.06644543, -0.03461133, 0. ],
[ 0.02030762, 0. , 0. , 0.06750354, -0.11152515,
0.06636675, -0.05217818, -0.13289087, -0.06922265, 0. ],...
I would like to use these gradients to adjust my model, but I am losing track of the math & theory of backpropagation. Am I on the right track?
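One point that may help: as far as I know, gradients(outputTensor, listOfVariableTensors) differentiates the (summed) network outputs, not an error measure, so the desired output never enters the result. Backpropagation instead needs the gradient of a loss with respect to the weights. The following is a minimal NumPy sketch of that idea for a single sigmoid layer only. The squared-error loss, the randomly initialised weights and the learning rate are all assumptions for illustration, not taken from the model in the question.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Input and desired output copied from the question.
x = np.array([[0.0, 0.5, 0.0, 0.5, 1.0, 1.0, 0.0, 0.5, 0.5, 0.5]])
target = np.array([[0.6776225034927386, 0.0, 0.5247090458869934, 0.0, 0.0,
                    0.0, 0.4727349579334259, 0.5116984844207764,
                    0.49565717577934265]])

# Hypothetical single layer: 10 inputs -> 9 sigmoid outputs.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(10, 9))
b = np.zeros(9)

def forward(W, b):
    return sigmoid(x @ W + b)

def loss(y):
    # Assumed loss: half the sum of squared errors.
    return 0.5 * np.sum((y - target) ** 2)

y = forward(W, b)
loss_before = loss(y)

# Chain rule: dL/dz = (y - target) * sigmoid'(z), with sigmoid'(z) = y * (1 - y).
delta = (y - target) * y * (1.0 - y)
grad_W = x.T @ delta          # dL/dW, shape (10, 9)
grad_b = delta.sum(axis=0)    # dL/db, shape (9,)

# One gradient-descent step: move against the gradient of the loss.
learning_rate = 0.5
W_new = W - learning_rate * grad_W
b_new = b - learning_rate * grad_b
loss_after = loss(forward(W_new, b_new))

print(loss_before, loss_after)
```

Rows of grad_W that belong to a zero input are zero, which mirrors the zero rows in the evaluated gradients above. In Keras itself the usual route is model.compile(loss=..., optimizer=...) followed by model.fit(...), which carries out these steps for every layer.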

How to calculate mean of function in a gaussian fit?

I'm using the curve fitting app in MATLAB. If I understand correctly, the "b1" coefficient in the left box is the mean of the function, i.e. the x value where y = 50%. My x data is [-0.8 -0.7 -0.5 0 0.3 0.5 0.7], so why is this number so large (631) in this example?
General model Gauss1:
f(x) = a1*exp(-((x-b1)/c1)^2)
Coefficients (with 95% confidence bounds):
a1 = 3.862e+258 (-Inf, Inf)
b1 = 631.2 (-1.117e+06, 1.119e+06)
c1 = 25.83 (-2.287e+04, 2.292e+04)
Your data looks like a CDF rather than a PDF. You can fit it with this code:
xi=[-0.8,-0.7,-0.5, 0.0, 0.3, 0.5, 0.7];
yi= [0.2, 0.0, 0.2, 0.2, 0.5, 1.0, 1.0];
fun=@(v) normcdf(xi,v(1),v(2))-yi;
[v]=lsqnonlin(fun,[1,1]); %[1,2]
mu=v(1); sigma=v(2);
x=linspace(-1.5,1.5,100);
y=normcdf(x,mu,sigma);
figure(1);clf;plot(xi,yi,'x',x,y);
annotation('textbox',[0.2,0.7,0.1,0.1], 'String',sprintf('mu=%f\nsigma=%f',mu,sigma),'FitBoxToText','on','FontSize',16);
You will get: mu=0.24537, sigma=0.213
If you still want to fit a PDF, just change 'normcdf' in 'fun' (and in 'y') to 'normpdf'.

How to sort multidimensional matrices along multiple columns

I have a tricky matrix manipulation issue that I could really use some help with.
I need to reorganize a series of 2d matrices so that they align most effectively across subjects. Each matrix has ~50 rows (the observations) and 13 columns (the 'weight' of each observation on a series of 13 outcome measures). Because of the way the data are created, the order of the rows has no inherent meaning. However, I need to reorganize each matrix so that the rows are comparable between subjects.
Specifically, I want to be able to reorder the matrices such that the specific pattern of weightings in a given row aligns with a similar pattern in the same row across a group of 20 subjects. To make matters worse, some subjects have missing rows, although all have between 45 and 50 rows.
As an example:
subject 1:
[ 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.5, 0.6, 0.6, 0.7;
0.9, 0.8, 0.8, 0.7, 0.7, 0.6, 0.6, 0.5, 0.5, 0.4, 0.4, 0.3, 0.3]
subject 2:
[ 0.8, 0.7, 0.7, 0.6, 0.6, 0.5, 0.5, 0.4, 0.4, 0.3, 0.3, 0.2, 0.2;
0.0, 0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.6, 0.7]
Problem: row 1 in subject 1 aligns best with row 2 in subject 2 (and vice versa), and I would like to reorganize them accordingly [note: the real-life problem is much more convoluted than this].
I apologize ahead of time for how idiosyncratic this issue is, but I really appreciate any help that anyone can give.
Mac
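
One way to think about the problem is as a matching task: score every row of one subject against every row of a reference subject, then pair up the rows with the highest scores. The sketch below is only an illustration under assumptions. It uses Pearson correlation as the similarity and a simple greedy pairing, and the function names are hypothetical. Greedy pairing is not guaranteed to give the best overall match; an optimal assignment solver such as scipy.optimize.linear_sum_assignment could be swapped in.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant row carries no pattern to match
    return cov / math.sqrt(var_a * var_b)

def greedy_row_match(reference, subject):
    """Map reference row index -> subject row index, best-correlated pairs first.

    If the matrices have different numbers of rows, the extra rows stay unmatched.
    """
    scores = sorted(
        ((pearson(r, s), i, j)
         for i, r in enumerate(reference)
         for j, s in enumerate(subject)),
        reverse=True,
    )
    match, used = {}, set()
    for _, i, j in scores:
        if i not in match and j not in used:
            match[i] = j
            used.add(j)
    return match

# The two-row example from the question (row indices are 0-based here).
subject1 = [[0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.5, 0.6, 0.6, 0.7],
            [0.9, 0.8, 0.8, 0.7, 0.7, 0.6, 0.6, 0.5, 0.5, 0.4, 0.4, 0.3, 0.3]]
subject2 = [[0.8, 0.7, 0.7, 0.6, 0.6, 0.5, 0.5, 0.4, 0.4, 0.3, 0.3, 0.2, 0.2],
            [0.0, 0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.6, 0.7]]

match = greedy_row_match(subject1, subject2)
# Reorder subject 2 so its rows line up with the matched rows of subject 1.
reordered2 = [subject2[match[i]] for i in sorted(match)]
print(match)
```

With 20 subjects, one option is to pick a single subject as the reference and match every other subject against it. Subjects with missing rows simply leave some reference rows unmatched.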