stress centrality in social network - networkx

I got the following error from this code:
path[index][4] += 1
IndexError: list index out of range
Why does this happen, and how can I fix it?
Code:
def stress_centrality(g):
    stress = defaultdict(int)
    for a in nx.nodes_iter(g):
        for b in nx.nodes_iter(g):
            if a == b:
                continue
            pred = nx.predecessor(G, b)  # for unweighted graphs
            # pred, distance = nx.dijkstra_predecessor_and_distance(g, b)  # for weighted graphs
            if a not in pred:
                return []
            path = [[a, 0]]
            path_length = 1
            index = 0
            while index >= 0:
                n, i = path[index]
                if n == b:
                    for vertex in list(map(lambda x: x[0], path[:index + 1]))[1:-1]:
                        stress[vertex] += 1
                if len(pred[n]) > i:
                    index += 1
                    if index == path_length:
                        path.append([pred[n][i], 0])
                        path_length += 1
                    else:
                        path[index] = [pred[n][i], 0]
                else:
                    index -= 1
                    if index >= 0:
                        path[index][4] += 1
    return stress

Without the data it's hard to give you anything more than an indicative answer.
This line
path[index][4] += 1
assumes path[index] has at least 5 elements, but it has fewer than that. It seems to me that your code only ever assigns or appends lists of length 2 to path. As in
path = [[a,0]]
path.append([pred[n][i],0])
path[index] = [pred[n][i],0]
So it's hard to see how accessing the 5th element of one of those lists could ever be correct.
This is a complete guess, but I think you might have meant
path[index][1] += 1
that is, advance the predecessor counter (the second element of the pair) when backtracking.
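For what it's worth, here is the function with that single change applied, as a sketch only (untested; it also assumes the lowercase g was intended where your code writes G, and it uses g.nodes() because nx.nodes_iter was removed in networkx 2.x):

from collections import defaultdict
import networkx as nx

def stress_centrality(g):
    stress = defaultdict(int)
    for a in g.nodes():
        for b in g.nodes():
            if a == b:
                continue
            pred = nx.predecessor(g, b)  # shortest-path predecessors (unweighted)
            if a not in pred:
                return []
            path = [[a, 0]]  # stack of [node, next-predecessor-index] pairs
            path_length = 1
            index = 0
            while index >= 0:
                n, i = path[index]
                if n == b:
                    # reached the target: credit every interior vertex on this path
                    for vertex in [p[0] for p in path[1:index]]:
                        stress[vertex] += 1
                if len(pred[n]) > i:
                    # descend to the i-th predecessor of n
                    index += 1
                    if index == path_length:
                        path.append([pred[n][i], 0])
                        path_length += 1
                    else:
                        path[index] = [pred[n][i], 0]
                else:
                    # predecessors of n exhausted: backtrack and advance
                    # the parent's predecessor counter (the suspected fix)
                    index -= 1
                    if index >= 0:
                        path[index][1] += 1
    return stress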

Related

Calculate bigrams co-occurrence matrix

I have a question regarding nathandrake's post: How do I calculate a word-word co-occurrence matrix with sklearn?
import pandas as pd

def co_occurance_matrix(input_text, top_words, window_size):
    co_occur = pd.DataFrame(index=top_words, columns=top_words)
    for row, nrow in zip(top_words, range(len(top_words))):
        for colm, ncolm in zip(top_words, range(len(top_words))):
            count = 0
            if row == colm:
                co_occur.iloc[nrow, ncolm] = count
            else:
                for single_essay in input_text:
                    essay_split = single_essay.split(" ")
                    max_len = len(essay_split)
                    top_word_index = [index for index, split in enumerate(essay_split) if row in split]
                    for index in top_word_index:
                        if index == 0:
                            count = count + essay_split[:window_size + 1].count(colm)
                        elif index == (max_len - 1):
                            count = count + essay_split[-(window_size + 1):].count(colm)
                        else:
                            count = count + essay_split[index + 1 : (index + window_size + 1)].count(colm)
                            if index < window_size:
                                count = count + essay_split[:index].count(colm)
                            else:
                                count = count + essay_split[(index - window_size): index].count(colm)
                co_occur.iloc[nrow, ncolm] = count
    return co_occur
My question is: what if my words are not single words but bigrams? For example:
corpus = ['ABC DEF IJK PQR','PQR KLM OPQ','LMN PQR XYZ DEF ABC']
words = ['ABC PQR','PQR DEF']
window_size =100
result = co_occurance_matrix(corpus,words,window_size)
result
I changed the word list into a bigram list, but then the co_occurance_matrix function no longer works: all the counts come out 0.
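One likely cause: single_essay.split(" ") produces single-word tokens, so a bigram such as 'ABC PQR' can never match any one token, and every count stays 0. A minimal sketch of one way around this (my suggestion, not a tested fix) is to retokenize each essay into overlapping bigram strings first and run the same window logic over those tokens:

def to_bigrams(text):
    # turn 'ABC DEF IJK' into ['ABC DEF', 'DEF IJK'] so bigram "words"
    # can be compared against individual tokens
    words = text.split()
    return [' '.join(pair) for pair in zip(words, words[1:])]

corpus = ['ABC DEF IJK PQR', 'PQR KLM OPQ', 'LMN PQR XYZ DEF ABC']
print(to_bigrams(corpus[0]))  # ['ABC DEF', 'DEF IJK', 'IJK PQR']

Exactly how a window should be counted over overlapping bigrams is a design decision you would still need to make.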

Random number generators for a skiplist implementation

Most skiplist guides I see use an LCG random number generator and the following formula to generate the node heights:
random.generate().trailingZeroBitCount + 1
But LCGs have bad randomness in the lower-order bits. Is this a problem, and if so, what's a good random number generator for skiplists?
If you are using trailingZeroBitCount then you are going for a specific distribution. In a random sample, we'd expect half of the numbers (the odd ones) to have trailingZeroBitCount == 0, a quarter of them to have 1, and so on.
I put this in a playground and got that distribution:
let randos = (1..<100000).map { _ in return arc4random() }
// odd numbers: trailingZeroBitCount == 0 (expect about half)
let zeroZeros = randos.filter { i in i % 2 == 1 }
zeroZeros.count
// even numbers, shifted right once so the next zero bit moves into the ones place
let oneOrMoreZeros = randos.filter { i in i % 2 == 0 }.map { $0 / 2 }
// exactly one trailing zero (expect about a quarter)
let oneZero = oneOrMoreZeros.filter { i in i % 2 == 1 }
oneZero.count
let twoOrMoreZeros = oneOrMoreZeros.filter { i in i % 2 == 0 }.map { $0 / 2 }
// exactly two trailing zeros (expect about an eighth)
let twoZero = twoOrMoreZeros.filter { i in i % 2 == 1 }
twoZero.count
I used arc4random because I don't know what you are using. Use this code to check your random number generator gets this kind of distribution.
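Here is the same kind of check sketched in Python, in case that is closer to what you are using (node_height is a hypothetical helper implementing the trailingZeroBitCount + 1 formula from your question):

import random
from collections import Counter

def node_height(x):
    # (x & -x) isolates the lowest set bit, so its bit_length()
    # equals the trailing-zero count + 1 (for x > 0)
    return (x & -x).bit_length()

# set a guard bit above the 32 random bits so x is never zero
samples = [node_height(random.getrandbits(32) | (1 << 32)) for _ in range(100000)]
counts = Counter(samples)
for h in range(1, 6):
    print(h, counts[h] / len(samples))  # roughly 0.5, 0.25, 0.125, ...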

Functional version of a typical nested while loop

I hope this question may please functional programming lovers. Could I ask for a way to translate the following fragment of code into a pure functional implementation in Scala, with a good balance between readability and execution speed?
Description: for each element in a sequence, produce a sub-sequence containing the elements that come after the current element (including itself) with a distance smaller than a given threshold. Once the threshold is crossed, the remaining elements need not be processed.
def getGroupsOfElements(input : Seq[Element]) : Seq[Seq[Element]] = {
  val maxDistance = 10 // put any number you like
  var outerSequence = Seq.empty[Seq[Element]]
  for (index <- 0 until input.length) {
    var anotherIndex = index + 1
    var distance = input(index) - input(anotherIndex) // let's assume a separate function for computing the distance
    var innerSequence = Seq(input(index))
    while (distance < maxDistance && anotherIndex < (input.length - 1)) {
      innerSequence = innerSequence ++ Seq(input(anotherIndex))
      anotherIndex = anotherIndex + 1
      distance = input(index) - input(anotherIndex)
    }
    outerSequence = outerSequence ++ Seq(innerSequence)
  }
  outerSequence
}
You know, this would be a ton easier if you added a description of what you're trying to accomplish along with the code.
Anyway, here's something that might get close to what you want.
def getGroupsOfElements(input: Seq[Element]): Seq[Seq[Element]] =
  input.tails.filter(_.nonEmpty).map(x => x.takeWhile(y => distance(x.head, y) < maxDistance)).toSeq
Note that tails always ends with an empty sequence, so the filter(_.nonEmpty) guard keeps x.head from throwing; distance and maxDistance are assumed to be defined elsewhere, as in your version.
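If it helps to see the idea outside Scala, here is a rough Python rendering of the same tails/takeWhile approach (a sketch; using abs() as the distance function and plain integers as elements are my assumptions, since Element and the distance computation are not defined in the question):

from itertools import takewhile

def groups_of_elements(xs, max_distance=10):
    groups = []
    for i in range(len(xs)):
        head = xs[i]
        # take elements from position i onward while they stay within max_distance of the head
        groups.append(list(takewhile(lambda y: abs(head - y) < max_distance, xs[i:])))
    return groups

print(groups_of_elements([1, 2, 3, 20, 21, 40]))
# [[1, 2, 3], [2, 3], [3], [20, 21], [21], [40]]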

remove duplicates in a table (rexx language)

I have a question about removing duplicates in a table in Rexx. I am working on NetPhantom applications, which use the Rexx language.
I need a sample of how to remove the duplicates in a table.
I do have a thought on how to do it, such as using two loops over the two tables A and B, but I am not familiar with this.
My situation is:
rc = PanlistInsertData('A',0,SAMPLE)
TABLE A (this table has duplicate data)
123
1
1234
12
123
1234
I need to filter the duplicate data out into TABLE B like this:
123
1234
1
12
You can use lookup stem variables to test if you have already found a value.
This should work (note I have not tested it, so there could be syntax errors):
no = 0
yes = 1
lookup. = no  /* initialize the whole stem to no; not strictly needed */
j = 0
do i = 1 to in.0
    v = in.i
    if lookup.v <> yes then do
        j = j + 1
        out.j = v
        lookup.v = yes
    end
end
out.0 = j
You can eliminate the duplicates as follows:
If it is the first InStem element, move it to OutStem.
Otherwise, check all the OutStem elements for the current InStem element: if the element is found, iterate to the next InStem element; if not, add the InStem element to OutStem.
Code Snippet :
/* Input Stem     - InStem.
   Output Stem    - OutStem.
   Array Counters - I, J, K */
J = 1
DO I = 1 TO InStem.0
    IF I = 1 THEN
        OutStem.I = InStem.I
    ELSE
        DO K = 1 TO J
            IF (InStem.I \= OutStem.K) & (K = J) THEN
                DO
                    J = J + 1
                    OutStem.J = InStem.I
                END
            ELSE
                DO
                    IF (InStem.I == OutStem.K) THEN
                        ITERATE I
                END
        END
END
OutStem.0 = J
Hope this helps.

speed up prime number generating

I have written a program that generates prime numbers. It works well, but I want to speed it up, as it takes quite a while to generate all the prime numbers up to 10000.
var list = [2,3]
var limitation = 10000
var flag = true
var tmp = 0
for (var count = 4 ; count <= limitation ; count += 1) {
    while (flag && tmp <= list.count - 1) {
        if (count % list[tmp] == 0) {
            flag = false
        } else if (count % list[tmp] != 0 && tmp != list.count - 1) {
            tmp += 1
        } else if (count % list[tmp] != 0 && tmp == list.count - 1) {
            list.append(count)
        }
    }
    flag = true
    tmp = 0
}
print(list)
Two simple improvements that will make it fast up through 100,000 and maybe 1,000,000.
All primes except 2 are odd
Start the loop at 5 and increment by 2 each time. This isn't going to speed things up a lot, because for an even number you find the counterexample (divisibility by 2) on the first try anyway, but it's still a very typical improvement.
Only search through the square root of the value you are testing
The square root is the point at which you halve the factor space: any factor less than the square root is paired with a factor above the square root, so you only have to check one side of it. There are far fewer numbers below the square root, so you should check only the values less than or equal to the square root.
Take 10,000 for example. The square root is 100, so you only have to test the primes up to 100, which is just 25 candidate divisors instead of over 1,000 checks against all the primes less than 10,000.
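To make both improvements concrete, here is a minimal sketch of the resulting trial division (in Python rather than Swift, purely for illustration):

import math
from itertools import takewhile

def primes_up_to(limit):
    if limit < 2:
        return []
    primes = [2]
    for n in range(3, limit + 1, 2):  # odd candidates only
        root = math.isqrt(n)          # largest integer <= sqrt(n)
        # test divisibility only by primes up to sqrt(n)
        if all(n % p != 0 for p in takewhile(lambda p: p <= root, primes)):
            primes.append(n)
    return primes

print(len(primes_up_to(10000)))  # 1229 primes up to 10,000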
Doing it even faster
Try another method altogether, like a sieve. These methods are much faster but have a higher memory overhead.
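For comparison, here is a minimal Sieve of Eratosthenes sketch (again in Python, purely for illustration):

import math

def sieve(limit):
    # Sieve of Eratosthenes: mark multiples as composite instead of trial-dividing
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

print(len(sieve(10000)))  # 1229 primes up to 10,000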
In addition to what Nick already explained, you can also easily take advantage of the following property: all primes greater than 3 are congruent to 1 or -1 mod 6.
Because you've already included 2 and 3 in your initial list, you can therefore start with count = 6, test count - 1 and count + 1 and increment by 6 each time.
Below is my first attempt ever at Swift, so pardon the syntax which is probably far from optimal.
var list = [2,3]
var limitation = 10000
var flag = true
var tmp = 0
var max = 0
for (var count = 6; count <= limitation; count += 6) {
    for (var d = -1; d <= 1; d += 2) {
        max = Int(floor(sqrt(Double(count + d))))
        for (flag = true, tmp = 0; flag && list[tmp] <= max; tmp++) {
            if ((count + d) % list[tmp] == 0) {
                flag = false
            }
        }
        if (flag) {
            list.append(count + d)
        }
    }
}
print(list)
I've tested the above code on iswift.org/playground with limitation = 10,000, 100,000 and 1,000,000.