Convert a numpy array of network values to a labelled node list with values - networkx

I have some graphs built with NetworkX with labelled nodes (names). I have computed trophic levels with the trophic tools script and obtained a numpy array of trophic values.
I want to create a node list of these values with the corresponding labels, as for other topological indices (e.g. the output of nx.degree_centrality is easy to interpret because every node name is followed by its value).
Can someone suggest how to merge or convert the numpy array to a labelled node list?
Thanks in advance!

I realised that the algorithm doesn't produce a real Laplacian and that every entry in the array is simply the trophic value of one node. I assumed that the array values are in the same order as the original node list, and the final data seem to agree with that (plants have lower values and predators have higher values).
This is the code to join node names with their trophic values, in case someone needs to compute a similar trophic index:
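import pandas as pd  # ta is the trophic tools module, G the NetworkX graph
# One trophic value per node, assumed to follow the order of G.nodes()
trophic_levels = ta.trophic_levels(G)
trophic_levels_list = list(trophic_levels)
trophic_levels_series = pd.Series(trophic_levels_list)
# One row per node, indexed by node name, with node attributes as columns
G_nodes = pd.DataFrame.from_dict(dict(G.nodes(data=True)), orient='index')
# Move the node names into a column so that the index becomes 0..n-1
G_nodes.reset_index(inplace=True)
# The Series also has index 0..n-1, so values are matched by position
G_nodes['trophic_level'] = trophic_levels_series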
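If a plain dictionary is enough (the same shape that nx.degree_centrality returns), the pairing can be sketched without pandas. This is only an illustration: label_values is a hypothetical helper, the node names and values are made-up stand-ins for list(G.nodes()) and the trophic array, and it relies on the same assumption as above, namely that the array follows the order of G.nodes().

```python
def label_values(nodes, values):
    """Pair each node name with the value at the same position."""
    nodes = list(nodes)
    values = list(values)
    if len(nodes) != len(values):
        raise ValueError("nodes and values must have the same length")
    return dict(zip(nodes, values))


# Stand-ins for list(G.nodes()) and the array of trophic values.
node_names = ["grass", "rabbit", "fox"]
trophic_values = [1.0, 2.0, 3.0]

labelled = label_values(node_names, trophic_values)
print(labelled)
```

With a real graph the call would be label_values(G.nodes(), trophic_levels); the length check catches the case where the array and the node list do not match in size.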

Related

Relabeling Nodes of a graph in networkx

I am trying to process the graph given in wiki-Vote.txt (https://snap.stanford.edu/data/wiki-Vote.html). There are 7115 nodes with id ranging from 3 to 8297. I want to relabel the nodes from 0 to 7114. I checked the mappings in relabel_nodes() but still could not solve the problem. Please suggest. thanks
edit
I'm not sure if it's new, but my original answer didn't mention
nx.convert_node_labels_to_integers(G, first_label=0, ordering='default', label_attribute=None). So for a given graph G, you can do
H=nx.convert_node_labels_to_integers(G). With ordering='default', the new labels follow the order in which G.nodes() returns the nodes. You can store the original label in H by calling H=nx.convert_node_labels_to_integers(G, label_attribute='original_name'). The ordering argument takes a string, not a list: to number the nodes in sorted order of their original labels, set ordering='sorted'.
original answer
Given a graph G with some set of nodes, the simplest thing would be
mapping = {old_label:new_label for new_label, old_label in enumerate(G.nodes())}
H = nx.relabel_nodes(G, mapping)
This will create a dictionary mapping whose keys are the old labels and whose values are their new labels (read up on dictionary comprehensions). The ordering of the new labels is given by the order that G.nodes() returns the values (which you can't control). The new graph H has the node labels changed.
If you want a specific order, you need to sort G.nodes() appropriately. So you can do
nodelist = sorted(G.nodes())  # G.nodes() returns a view in NetworkX 2+, which has no .sort()
mapping = {old_label:new_label for new_label, old_label in enumerate(nodelist)}
H = nx.relabel_nodes(G, mapping)
which would have them sorted in numerical order (or alphabetical order if the node names are strings). If you want some other custom order, you'll have to figure out how to sort nodelist.
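The mapping step can be checked on its own, without NetworkX. The sketch below uses a made-up edge list with sparse integer ids as a stand-in for wiki-Vote.txt and builds the same kind of old-to-new dictionary that relabel_nodes expects.

```python
# Hypothetical edge list with sparse ids, standing in for wiki-Vote.txt.
edges = [(3, 28), (3, 30), (30, 8297), (28, 8297)]

# Collect the distinct ids and number them 0..n-1 in sorted order.
old_ids = sorted({node for edge in edges for node in edge})
mapping = {old: new for new, old in enumerate(old_ids)}

# Apply the mapping to every edge endpoint.
relabelled_edges = [(mapping[u], mapping[v]) for u, v in edges]

print(mapping)
print(relabelled_edges)
```

The resulting mapping dictionary could then be passed to nx.relabel_nodes(G, mapping) as in the answer above.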

changing variable names in a loop in fortran

I am trying to write a loop that lets the user create multiple matrices, declaring the number of rows and columns of each one. I have written the first part of the loop, but my problem is creating a variable whose name changes on each iteration, so that the matrix created in the previous iteration is not overwritten. I then have to multiply all the different matrices together.
DO n = 1, number ! number is the number of matrices that need to be created
WRITE(*,*) 'Enter number of rows the matrix has'
READ(*,*) r
WRITE(*,*) 'Enter number of columns'
READ(*,*) c
REAL, DIMENSION(r,c) :: "here I need a changing variable name so it isn't overwritten every time."
I wouldn't dynamically generate new variables. It seems more like you just want to make each new variable an element of an array. Allocate an array with size equal to the number of loop iterations. It might get tricky if the variables are all 2d arrays of different dimensions, but you could certainly wrap it in some kind of structure.
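The idea is not specific to Fortran. As a rough illustration in Python (not a Fortran solution), the matrices below live in one list instead of in separately named variables, and the product is taken over the list. The matmul helper and the sample matrices are made up for the example; in the real program each matrix would be filled from user input.

```python
from functools import reduce


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# One container holds every matrix; sizes may differ between entries.
matrices = [
    [[1, 2], [3, 4]],          # 2 x 2
    [[1, 0, 2], [0, 1, 3]],    # 2 x 3
    [[1], [1], [1]],           # 3 x 1
]

# Multiply left to right: ((M1 * M2) * M3).
product = reduce(matmul, matrices)
print(product)
```

In Fortran the analogous container would be an array of a derived type with an allocatable 2D component, which is the "structure" the answer refers to.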

Matlab Loop of all combinations

I'm new to MATLAB and this seems to be beyond me. I appreciate the help, and thanks in advance.
Basically, I have a multi-column dataset with column headers. The number of columns can vary from dataset to dataset.
I need to iterate through all the pairwise combinations of columns (e.g. A+B, A+C, ..., B+C, B+D, etc.) and run a formula on each pair (in this instance a correlation formula, but it could be another formula later).
If a particular combination returns "true", the column headers of that pair should be returned.
Would appreciate if you could point me in the right direction.
Thanks in advance.
Use nchoosek to get all pairs of columns:
pairs_columns = nchoosek(1:m, 2);
pairs = {};
for pair = 1:size(pairs_columns,1)
flag = your_correlation_test(data(:,pairs_columns(pair,1)), data(:,pairs_columns(pair,2)));
if flag
pairs{end+1,1} = data_header(pairs_columns(pair,1));
pairs{end,2} = data_header(pairs_columns(pair,2)); %// Note that you don't need end+1 anymore as the previous line will have already increased the number of rows in the vector
end
end
m is your number of columns
your_correlation_test is your test function that returns a Boolean result
data is your dataset (which I'm assuming you can index by column number?)
data_header is a placeholder for whatever the correct way is to get the header from your dataset based on the column number. Sorry, I'm not very familiar with datasets in MATLAB.
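For comparison, the same pair enumeration can be sketched in Python, where itertools.combinations plays the role of nchoosek. Everything here is illustrative: the column data, the matching_pairs helper and the stand-in test moves_together are made up, and a real correlation test would replace the latter.

```python
from itertools import combinations


def matching_pairs(columns, test):
    """Return the header pairs whose columns satisfy `test`."""
    return [(a, b) for a, b in combinations(columns, 2)
            if test(columns[a], columns[b])]


# Hypothetical dataset: header -> column values.
columns = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [4, 3, 2, 1],
}


def moves_together(x, y):
    """Stand-in test: True if both columns change in the same direction at every step."""
    dx = [b - a for a, b in zip(x, x[1:])]
    dy = [b - a for a, b in zip(y, y[1:])]
    return all(p * q > 0 for p, q in zip(dx, dy))


pairs = matching_pairs(columns, moves_together)
print(pairs)
```

Each unordered pair of headers is visited exactly once, so the number of tests is m*(m-1)/2 for m columns.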

Perl - determining the intersection of several numeric ranges

I would like to be able to load a long list of positive integer ranges and create a new "summary" range list that is the union of the intersections of each pair of ranges. And I want to do this in Perl. For example:
Sample ranges: (1..30) (45..90) (15..34) (92..100)
Intersection of ranges: (15..30)
The only way I could think of was using a bunch of nested if statements to determine the starting point of sample A, sample B, sample C, etc. and figure out the overlap that way, but that is not feasible with hundreds of samples, each containing numerous ranges.
Any suggestions are appreciated!
The first thing you should do when you need to do something is take a look at CPAN to see what tools are available, or whether someone has already solved your problem for you.
Set::IntSpan and Set::IntRange are on the first page of results for "set" on CPAN.
What you want is the union of the intersection of each pair of ranges, so the algorithm is as follows:
1. Create an empty result set.
2. Create a set for each range.
3. For each set in the list:
   For each later set in the list:
      a. Find the intersection of those two sets.
      b. Find the union of the result set and this intersection. This is the new result set.
4. Enumerate the elements of the result set.
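The steps above can be sketched in Python with built-in sets (not in Perl, and not using Set::IntSpan). The function names are made up for the illustration, and ranges are assumed to be inclusive (lo, hi) pairs of integers. Expanding every range into a set is simple but memory-hungry for very wide ranges.

```python
from itertools import combinations


def union_of_pairwise_intersections(ranges):
    """Union of the intersections of every pair of inclusive (lo, hi) ranges."""
    sets = [set(range(lo, hi + 1)) for lo, hi in ranges]
    result = set()
    for a, b in combinations(sets, 2):   # each set with each later set
        result |= a & b                  # fold the intersection into the result
    return result


def to_ranges(values):
    """Collapse a set of integers into sorted inclusive (lo, hi) runs."""
    runs = []
    for v in sorted(values):
        if runs and v == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], v)  # extend the current run
        else:
            runs.append((v, v))          # start a new run
    return runs


sample = [(1, 30), (45, 90), (15, 34), (92, 100)]
summary = to_ranges(union_of_pairwise_intersections(sample))
print(summary)
```

For the sample ranges from the question, only (1..30) and (15..34) overlap, so the summary is the single range 15..30.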
I don't have code to share, but I would expand each range into a hash, or use a Set module, and then use intersection operations on the sets.

Extracting data points from a matrix and saving them in different matrixes in MATLAB

I have a 2D Matrix consisting of some coordinates as below(example): Data(X,Y):
45.987543423,5.35000964
52.987544223,5.98765234
Also I have an array consisting of some integers >=0 , for example: Cluster(M)
2,0,3,1
Each number in this array corresponds to a row of the 2D matrix above. For example, it says that row one (a coordinate) of the Data matrix belongs to cluster 2, the second row belongs to cluster 0, and so on.
Now I want the data points of each cluster in a separate matrix: for example, the data points belonging to cluster 1 in one matrix, those belonging to cluster 2 in another, and so on.
I can do this manually, but the extraction has to be automatic: the number of clusters (the range of the numbers in the cluster array) varies from run to run, so I need a general algorithm that does the extraction for me. Can someone help me, please? Thanks.
Instead of dynamically creating a bunch of matrices, I would create a cell array with each matrix in a separate cell. Here's one way to do this, using the functions SORT and MAT2CELL:
[cluster,sortIndex] = sort(cluster); %# Sort cluster and get sorting index
data = data(sortIndex,:); %# Apply the same sorting to data
clusterCounts = diff([0 find(diff(cluster)) numel(cluster)]); %# Find size of
%# each cluster
cellArray = mat2cell(data,clusterCounts,2); %# Break up data into matrices,
%# each in a separate cell
You can use ARRAYFUN to distribute the coordinates among different cell arrays.
%# create sample data
clusterIdx = [2,0,3,1,1,1,3,2];
coordinates = rand(8,2);
%# first you get a list of unique cluster indices
clusterIdxUnique = unique(clusterIdx);
%# then you use arrayfun to distribute the coordinates
clusterCell = arrayfun(@(x)coordinates(clusterIdx==x,:),clusterIdxUnique,'UniformOutput',false);
The first element of clusterCell contains the coordinates corresponding to the first entry in clusterIdxUnique, etc.
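The same grouping can be sketched in Python. Note that this uses a different technique from arrayfun: a single pass that appends each point to a dictionary entry keyed by its cluster label, rather than one mask per unique label. The helper name split_by_cluster and the sample points are made up for the example.

```python
from collections import defaultdict


def split_by_cluster(points, labels):
    """Group points by cluster label, keeping the original row order."""
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    return dict(groups)


# Sample data: one label per point, as in the MATLAB example above.
labels = [2, 0, 3, 1, 1, 1, 3, 2]
points = [(float(i), float(i) * 10) for i in range(len(labels))]

clusters = split_by_cluster(points, labels)
for label in sorted(clusters):
    print(label, clusters[label])
```

Because the dictionary is keyed by the label itself, the number of clusters does not need to be known in advance and may change from run to run.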
I guess this is the solution:
data(cluster == i, :)
where i is the index of the cluster. Your index matrix is converted to a boolean matrix and then used to index the rows and each selected row is completely added to the resulting matrix.
If this is not what you're looking for, please specify your needs more clearly.
Thanks everyone, I managed to make it work with this code:
noOfClusters = max(cluster); %without noise
for i=1:noOfClusters
C(i,1) = {numData(cluster==i,:)}
end
I assume your solutions are much faster, because they don't use for loops.
I would either create a 3 dimensional array or table. That way the cluster index would be associated with the cluster. Something like the following construct:
xData = Data(:,1);
yData = Data(:,2);
clusterTable = table(Cluster, xData, yData);
This creates a table with column names and each row having a cluster index and a set of coordinates.