Why is the rich-club coefficient algorithm not defined for directed networks in NetworkX? - networkx

I'm working with NetworkX to compute the rich-club coefficient of a directed graph. However, the documentation for this algorithm states that it is not implemented for directed networks.
I want to know if there are any references that better explain the reason for this, so I can develop a solution for my scenario (computing the rich-club coefficient for directed graphs).
I found this reference, and it seems that the authors have proposed a corrected equation to compute it. But I haven't found any additional references confirming whether the rich-club coefficient was originally defined only for undirected graphs (not even in the references cited by the NetworkX doc page).
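For what it's worth, here is a minimal sketch of the most common directed generalization: keep the same counting as NetworkX's undirected rich_club_coefficient, but normalize by the number of ordered node pairs N(N-1) instead of N(N-1)/2, since each pair of "rich" nodes can be joined by up to two directed edges. The function name and the choice of degree measure (total, in, or out) are mine, and the corrected equation in the paper you found may differ; a properly normalized coefficient would also need a degree-preserving rewiring of the directed graph, which this sketch does not attempt.

    import networkx as nx

    def rich_club_directed(G, degree="total"):
        """Unnormalized rich-club coefficient of a DiGraph (illustrative sketch).

        phi(k) = E_gt_k / (N_gt_k * (N_gt_k - 1)), where N_gt_k is the number
        of nodes with degree > k and E_gt_k is the number of directed edges
        among them.  The denominator counts ordered pairs, i.e. the maximum
        possible number of directed edges, instead of the undirected N*(N-1)/2.
        """
        if degree == "in":
            deg = dict(G.in_degree())
        elif degree == "out":
            deg = dict(G.out_degree())
        else:                          # "total": in-degree + out-degree
            deg = dict(G.degree())

        phi = {}
        for k in sorted(set(deg.values())):
            rich = [n for n, d in deg.items() if d > k]
            if len(rich) < 2:
                break
            e_rich = G.subgraph(rich).number_of_edges()
            phi[k] = e_rich / (len(rich) * (len(rich) - 1))
        return phi

    G = nx.gnp_random_graph(100, 0.1, directed=True, seed=1)
    print(rich_club_directed(G))

The only structural difference from the undirected version is the denominator; whether that is the "right" correction for your application is exactly what the paper you found should settle.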

Related

Global clustering coefficient for directed network

The global clustering coefficient gives an overview of the clustering in the entire network. In theory, this measure can be applied to both undirected and directed networks.
The NetworkX library provides a function average_clustering(g) that calculates the global clustering coefficient for undirected networks but not for directed ones. Is there a way to implement the global clustering coefficient for directed networks in Python, or are there other libraries that do this?
Thanks
I searched for information to no avail; there is another very similar question on Stack Overflow, but it got no answer.
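In case it is still useful, below is a rough sketch of one widely used directed generalization, Fagiolo's (2007) clustering coefficient, computed directly from the adjacency matrix with NumPy. The function name is mine; also, depending on your NetworkX version, nx.clustering may already accept a DiGraph with an equivalent definition, so check that first.

    import numpy as np
    import networkx as nx

    def average_directed_clustering(G):
        """Average clustering of a DiGraph following Fagiolo (2007):

        C_i = [(A + A^T)^3]_ii / (2 * (d_tot_i * (d_tot_i - 1) - 2 * d_bi_i))

        where d_tot_i is in-degree + out-degree and d_bi_i is the number of
        reciprocated edges at node i.
        """
        A = nx.to_numpy_array(G, weight=None)     # binary adjacency matrix
        S = A + A.T
        num = np.diagonal(S @ S @ S)              # directed triangles closed around each node
        d_tot = A.sum(axis=0) + A.sum(axis=1)     # in-degree + out-degree
        d_bi = np.diagonal(A @ A)                 # reciprocated links per node
        denom = 2.0 * (d_tot * (d_tot - 1) - 2.0 * d_bi)
        C = np.zeros_like(denom)
        ok = denom > 0
        C[ok] = num[ok] / denom[ok]               # nodes with too few neighbours get C_i = 0
        return C.mean()

    G = nx.gnp_random_graph(200, 0.05, directed=True, seed=42)
    print(average_directed_clustering(G))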

partitioning large signed networks

I have a large signed network. A signed network is a weighted graph whose edges can be +1 or -1. I need to partition this graph so that most positive edges are placed inside the clusters and most negative edges are placed between the clusters. The graph is very sparse.
Do you have ideas?
There is a special version of the Louvain algorithm for signed networks in Pajek.
Does anyone know the details of this algorithm?
This paper by Vincent Traag outlines one approach.
He also has a python package (built on top of igraph) called louvain that can do this for you.
This blog post demonstrates the package and method on an interesting use case.
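To make that concrete, here is a sketch along the lines of the "negative links" recipe in the documentation of leidenalg (the successor to the louvain package mentioned above): optimize modularity on the positive edges and anti-optimize it on the negative ones via a multiplex partition with layer weights +1 and -1. The toy edge list is made up, and argument names may vary slightly between package versions.

    import igraph as ig
    import leidenalg as la          # pip install leidenalg python-igraph

    # Toy signed graph: edge weights are +1 or -1.
    edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, -1),
             (3, 4, 1), (4, 5, 1), (3, 5, 1), (0, 5, -1)]
    G = ig.Graph([(u, v) for u, v, w in edges])
    G.es["weight"] = [w for u, v, w in edges]

    # Split into a positive and a negative layer over the same vertex set.
    G_pos = G.subgraph_edges(G.es.select(weight_gt=0), delete_vertices=False)
    G_neg = G.subgraph_edges(G.es.select(weight_lt=0), delete_vertices=False)
    G_neg.es["weight"] = [-w for w in G_neg.es["weight"]]   # make weights positive

    part_pos = la.ModularityVertexPartition(G_pos, weights="weight")
    part_neg = la.ModularityVertexPartition(G_neg, weights="weight")

    # The positive layer rewards intra-cluster edges, the negative layer penalizes them.
    optimiser = la.Optimiser()
    optimiser.optimise_partition_multiplex([part_pos, part_neg],
                                           layer_weights=[1, -1])

    print(part_pos.membership)      # cluster assignment per node

This follows the objective Traag describes in the paper linked above: positive edges count toward modularity inside clusters while negative edges count against it.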

Applying vector based clustering algorithms to social network context

I have a social network described as a list of edges in a file. I used graph-based clustering algorithms to find dense parts of the graph. However, there is also vector-based clustering, which I need to apply to my data, but I cannot find any context for it. I also have information about the features of each node. I think using vectors containing the features of each user makes little sense here. For example, k-means would calculate the distance between user u1 with feature vector v1 = [f1, f2, f3, ...] and user u2 with feature vector v2 = [f1, f2, f3, ...]. However, both vectors would have binary values, depending on which features the user has. Additionally, I have a matrix with the users on one axis and the features on the other, where each user is able to set permissions.
My question now is: how can I make use of k-means, DBSCAN, etc. in this context?
Best wishes.
Many algorithms can be modified to work with dissimilarity measures suited to binary features. For example, k-means can be adapted for binary/categorical data: k-modes.
But I don't think it will do anything useful on your data.
Your approach to this problem is backwards: don't pick the algorithm first and then try to make it run, because then you are bound to solve the wrong problem. Instead, first formalize, in mathematical terms, what a good clustering would be. Then identify the appropriate algorithm by its mathematical ability to find a good solution to that objective.
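As a concrete starting point, here is a small sketch using the kmodes package for k-modes and, since DBSCAN was mentioned, scikit-learn's DBSCAN with a Jaccard distance, which is a more natural dissimilarity for binary feature vectors than Euclidean distance. The toy matrix and all parameter values are placeholders.

    import numpy as np
    from kmodes.kmodes import KModes      # pip install kmodes
    from sklearn.cluster import DBSCAN

    # Toy user-feature matrix: rows are users, columns are binary feature flags.
    X = np.array([[1, 0, 1, 0],
                  [1, 1, 1, 0],
                  [0, 0, 0, 1],
                  [0, 1, 0, 1],
                  [1, 0, 1, 1]])

    # k-modes: like k-means, but uses matching dissimilarity and modes as centroids.
    km = KModes(n_clusters=2, init="Huang", n_init=5, verbose=0)
    print(km.fit_predict(X))              # cluster label per user
    print(km.cluster_centroids_)          # modal feature vector of each cluster

    # DBSCAN with a set-overlap (Jaccard) distance on the boolean vectors;
    # eps and min_samples are arbitrary here and need tuning for real data.
    print(DBSCAN(eps=0.5, min_samples=2, metric="jaccard").fit_predict(X.astype(bool)))

Whether either result is meaningful for your data is exactly the point of the advice above: decide first what a good clustering of users-by-permissions would mean, then pick the dissimilarity and algorithm that match it.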

Can someone tell me about the kNN search algo that Matlab uses?

I wrote a basic O(n^2) algorithm for a nearest-neighbor search. As usual, MATLAB 2013a's knnsearch(..) function works a lot faster.
Can someone tell me what kind of optimization they used in their implementation?
I am okay with reading any documentation or paper that you may point me to.
PS: I understand that the documentation on the site mentions the paper on k-d trees as a reference. But as far as I understand, k-d trees are the default option only when the number of columns is less than 10, and mine is 21. Correct me if I'm wrong about that.
The biggest optimization MathWorks have made in implementing nearest-neighbors search is that all the hard stuff is implemented in a MEX file, as compiled C, rather than MATLAB.
With an algorithm such as kNN that (in my limited understanding) is quite recursive and difficult to vectorize, that's likely to give such an improvement that the O() analysis will only be relevant at pretty high n.
In more detail, under the hood the knnsearch command uses createns to create a NeighborSearcher object. By default, when X has fewer than 10 columns, this will be a KDTreeSearcher object, and when X has more than 10 columns it will be an ExhaustiveSearcher object (both KDTreeSearcher and ExhaustiveSearcher are subclasses of NeighborSearcher).
All objects of class NeighborSearcher have a method knnsearch (which you would rarely call directly, using instead the convenience command knnsearch rather than this method). The knnsearch method of KDTreeSearcher calls straight out to a MEX file for all the hard work. This lives in matlabroot\toolbox\stats\stats\@KDTreeSearcher\private\knnsearchmex.mexw64.
As far as I know, this MEX file performs pretty much the algorithm described in the paper by Friedman, Bentley, and Finkel referenced in the documentation page, with no structural changes. As the title of that paper suggests, the algorithm is O(log(n)) rather than O(n^2). Unfortunately, the contents of the MEX file are not available for inspection to confirm that.
The code builds a k-d tree space-partitioning structure to speed up nearest-neighbor search; think of it as building the kind of index commonly used in an RDBMS to speed up lookup operations.
In addition to nearest-neighbor searches, this structure also speeds up range searches, which find all points within a distance r of a query point.
As pointed out by @SamRoberts, the core of the code is implemented in C/C++ as a MEX function.
Note that knnsearch chooses to build a KD-tree only under certain conditions, and falls back to an exhaustive search otherwise (by naively searching all points for the nearest one).
Keep in mind that in cases of very high-dimensional data (and few instances), the algorithm degenerates and is no better than an exhaustive search. In general, once you go beyond d > 30 dimensions, the cost of searching k-d trees grows to the point of searching almost all the points, and can even become worse than a brute-force search because of the overhead of building the tree.
There are other variants of the algorithm that deal with high dimensions, such as ball trees, which partition the data into a series of nested hyperspheres (as opposed to partitioning the data along Cartesian axes like k-d trees). Unfortunately, those are not implemented in the official Statistics Toolbox. If you are interested, here is a paper that presents a survey of available kNN algorithms.
[Illustration from the docs: searching a k-d-tree-partitioned 2-D space]
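To see why the tree pays off in low dimensions, here is a rough Python illustration (not MATLAB's actual implementation) comparing a brute-force search against scipy's cKDTree; both return the same nearest neighbors, but the tree answers each query in roughly logarithmic time once built.

    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(0)
    X = rng.random((10_000, 3))        # reference points (low dimension, where k-d trees shine)
    Q = rng.random((200, 3))           # query points

    # Brute force: one distance to every reference point per query, O(n) each.
    d2 = ((Q[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    brute_dist = np.sqrt(d2.min(axis=1))

    # k-d tree: build once, then each query is ~O(log n) on average.
    tree = cKDTree(X)
    tree_dist, tree_idx = tree.query(Q, k=1)

    assert np.allclose(brute_dist, tree_dist)

With 21 columns, as in the question, the exhaustive searcher is likely chosen precisely because this advantage evaporates in high dimensions.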

Plotting graph with respect to frequency

I am doing a project in OpenModelica and I have to simulate filters in it using active elements (op-amps). Modelica plots graphs with respect to time, but I want my graphs with respect to frequency in order to analyze the frequency response of the system. I searched the internet but couldn't find anything useful. Please reply as soon as possible.
If you want to plot one variable with respect to another, you can use plotParametric from OMShell (OpenModelica Shell). In OMEdit (OpenModelica Connection Editor), you can click the parametric plot button x(y) and then select two variables.
I assume that what you want is a Bode plot. If so, it is important to understand that such a plot does not arise from a transient simulation. It is necessary to transform your system into a linear, time-invariant representation in order to express the response of your system in the frequency domain.
I do not know what specific features OpenModelica has in this regard. But those are at least the kinds of things you should search the documentation for. If you have access to MATLAB, then all you really need to do is extract the linearized version of the model (normally expressed as the so-called "ABCD" matrices) and MATLAB can get you the rest of the way.
There is also the Modelica_LinearSystems2 library which might be compatible with OpenModelica (I have no idea). It includes many types of operations you would typically perform on linear systems.
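If you do manage to extract the linearized (A, B, C, D) matrices and would rather avoid MATLAB, a short Python sketch with scipy.signal can produce the Bode plot; the matrices below are a made-up first-order low-pass used purely for illustration, not output from OpenModelica.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import signal

    # Placeholder state-space matrices of a first-order low-pass (cutoff ~1 rad/s);
    # in practice these would come from linearizing the OpenModelica model.
    A = np.array([[-1.0]])
    B = np.array([[1.0]])
    C = np.array([[1.0]])
    D = np.array([[0.0]])

    w, mag, phase = signal.bode((A, B, C, D), w=np.logspace(-2, 2, 200))

    fig, (ax_mag, ax_ph) = plt.subplots(2, 1, sharex=True)
    ax_mag.semilogx(w, mag)                 # magnitude in dB
    ax_mag.set_ylabel("Magnitude [dB]")
    ax_ph.semilogx(w, phase)                # phase in degrees
    ax_ph.set_ylabel("Phase [deg]")
    ax_ph.set_xlabel("Frequency [rad/s]")
    plt.show()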