Complexity of scipy.special.binom? - scipy

According to this answer, there are two functions to calculate the binomial coefficient, also known as "N choose K". One of them is scipy.special.binom().
Where is this function implemented? All I know is that it is a ufunc.
Furthermore, what is the time complexity of scipy.special.binom()?

The source can be found on GitHub in orthogonal_eval.pxd.
In the integer case, the complexity is O(k): the multiplication loop below runs min(k, n-k) times.
kx = floor(k)
if k == kx and (fabs(n) > 1e-8 or n == 0):
    # Integer case: use multiplication formula for less rounding error
    # for cases where the result is an integer.
    #
    # This cannot be used for small nonzero n due to loss of
    # precision.
    nx = floor(n)
    if nx == n and kx > nx/2 and nx > 0:
        # Reduce kx by symmetry
        kx = nx - kx
    if kx >= 0 and kx < 20:
        num = 1.0
        den = 1.0
        for i in range(1, 1 + <int>kx):
            num *= i + n - kx
            den *= i
            if fabs(num) > 1e50:
                num /= den
                den = 1.0
        return num/den
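For intuition, here is the same multiplication formula as a plain-Python sketch (illustrative only: it ignores scipy's floating-point bookkeeping and its small-kx cutoff, and binom_int is my name, not scipy's). The loop body executes k times, which is where the O(k) comes from:

def binom_int(n, k):
    """C(n, k) via the multiplication formula; the loop runs k times, hence O(k)."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)  # symmetry: C(n, k) == C(n, n - k)
    num, den = 1, 1
    for i in range(1, k + 1):
        num *= n - k + i
        den *= i
    return num // den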

Related

Generate large(512 bit+) prime number python 3.6

I've been attempting to generate large prime numbers with Python for RSA encryption for the past week and a half, with no luck. The Fermat primality test is infeasible at scales of 512 bits, and I can't quite wrap my head around Miller-Rabin. (I'm 13) All the scripts online seem to work with versions of Python below the one I'm using. What should I do to generate massive primes? (Yes, probabilistic primes are fine.)
Here is my Miller-Rabin prime checker:
from random import randint

def isPrime(n, k=5): # Miller-Rabin
    if n < 2: return False
    for p in [2,3,5,7,11,13,17,19,23,29]:
        if n % p == 0: return n == p
    s, d = 0, n-1
    while d % 2 == 0:
        s, d = s+1, d//2  # // keeps d an integer; Python 3's / would make it a float
    for i in range(k):
        x = pow(randint(2, n-1), d, n)
        if x == 1 or x == n-1: continue
        for r in range(1, s):
            x = (x * x) % n
            if x == 1: return False
            if x == n-1: break
        else: return False
    return True
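With the d//2 fix above (Python 3's / returns a float, which is why scripts written for older Python break), a 512-bit probable prime can be generated by testing random odd candidates against this checker. A minimal sketch (getPrime is my name, not a library function):

from random import getrandbits

def getPrime(bits=512):
    while True:
        n = getrandbits(bits) | (1 << (bits - 1)) | 1  # force the top bit and oddness
        if isPrime(n):
            return n

By the prime number theorem, on the order of bits * ln(2) / 2, i.e. roughly 180, odd candidates are tried on average before one passes.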
If you want a guaranteed prime (not a probable prime), that's not very much harder to arrange. See my blog for a method due to Pocklington.
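For reference, one common Pocklington-style construction (a sketch of the general idea, under my own assumptions, not necessarily the blog's method): grow a proven prime p by searching for n = 2*r*p + 1 with r <= (p - 1)/2, which guarantees p > sqrt(n); then a single witness a with pow(a, n-1, n) == 1 and gcd(pow(a, (n-1)//p, n) - 1, n) == 1 proves n prime by Pocklington's theorem.

from math import gcd
from random import randrange

def pocklington_prime(bits=512):
    p = 3  # small proven prime to start from
    while p.bit_length() < bits:
        r = randrange(1, (p - 1) // 2 + 1)  # r <= (p-1)/2 keeps p > sqrt(n)
        n = 2 * r * p + 1
        a = randrange(2, n)
        # Fermat test plus the gcd condition: together they prove n prime
        if pow(a, n - 1, n) == 1 and gcd(pow(a, (n - 1) // p, n) - 1, n) == 1:
            p = n  # n is now a proven prime; its bit length roughly doubles
    return p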

Matlab : Confusion regarding unit of entropy to use in an example

Figure 1. Hypothesis plot (y-axis: mean entropy; x-axis: bits).
This question is a continuation of a previous one: Matlab : Plot of entropy vs digitized code length
I want to calculate the entropy of a random variable that is a discretized (0/1) version of a continuous random variable x. The random variable denotes the state of a nonlinear dynamical system called the Tent Map. Iterating the Tent Map yields a time series of length N.
The code should exit as soon as the entropy of the discretized time series becomes equal to the entropy of the dynamical system. It is known theoretically that the entropy of the system, H, is log_e(2) = ln(2) ≈ 0.69. The objective of the code is to find the number of iterations, j, needed to produce the same entropy as the entropy of the system, H.
Problem 1: When I calculate the entropy of the binary time series (the information message), should I do it in the same base as H, or should I convert H to bits because the information message is in 0/1? The two choices give different results, i.e., different values of j.
Problem 2: It can happen that the probability of 0's or 1's becomes zero, so the corresponding entropy becomes infinite. To prevent this, I thought of putting in a check using if-else. But the check
if entropy(:,j)==NaN
    entropy(:,j)=0;
end
does not seem to be working. I shall be grateful for ideas and help to solve this problem. Thank you.
UPDATE: I implemented the suggestions and answers to correct the code, but my earlier logic for solving the problem was not right. In the revised code, I want to calculate the entropy for time series of lengths 2, 8, 16, and 32 bits. For each code length, the entropy is calculated, and the calculation is repeated N times, each run starting from a different initial condition of the dynamical system. This approach is adopted to check at which code length the entropy becomes 1. The plot of entropy vs bits should rise from zero, gradually approach 1, and then saturate - remaining constant for all remaining bits. I am unable to get this curve (Figure 1). I would appreciate help in finding where I am going wrong.
clear all
H = 1 %in bits
Bits = [2,8,16,32,64];
threshold = 0.5;
N = 100; %Number of runs of the experiment
for r = 1:length(Bits)
    t = Bits(r)
    for Runs = 1:N
        x(1) = rand;
        for j = 2:t
            % Iterating the Tent Map
            if x(j - 1) < 0.5
                x(j) = 2 * x(j - 1);
            else
                x(j) = 2 * (1 - x(j - 1));
            end % if
        end
        %Binarizing the output of the Tent Map
        s = (x >= threshold);
        p1 = sum(s == 1) / length(s); %calculating probability of 1's
        p0 = 1 - p1; % calculating probability of 0's
        entropy(t) = -p1 * log2(p1) - (1 - p1) * log2(1 - p1); %calculating entropy in bits
        if isnan(entropy(t))
            entropy(t) = 0;
        end
        %disp(abs(lambda-H))
    end
    Entropy_Run(Runs) = entropy(t)
end
Entropy_Bits(r) = mean(Entropy_Run)
plot(Bits,Entropy_Bits)
For problem 1, H and entropy can be in either nats or bits, so long as they are both computed using the same units. In other words, you should use either log for both or log2 for both. With the code sample you provided, H and entropy are correctly calculated using consistent nats units. If you prefer to work in units of bits, the conversion of H gives H = log(2)/log(2) = 1 (or, using the conversion factor 1/log(2) ≈ 1.443, H ≈ 0.69 * 1.443 ≈ 1).
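As a quick numeric check of that conversion (a Python sketch, since the point is language-independent):

import math

p = 0.5  # a fair 0/1 source
h_nats = -p * math.log(p) - (1 - p) * math.log(1 - p)    # ln(2) ~ 0.693 nats
h_bits = -p * math.log2(p) - (1 - p) * math.log2(1 - p)  # exactly 1 bit
print(h_nats, h_bits, h_nats / math.log(2))              # 0.693..., 1.0, 1.0 (bits = nats / ln 2)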
For problem 2, as @noumenal already pointed out, you can check for NaN using isnan. Alternatively you could check whether p1 is within (0,1) (excluding 0 and 1) with:
if (p1 > 0 && p1 < 1)
    entropy(:,j) = -p1 * log(p1) - (1 - p1) * log(1 - p1); %calculating entropy in natural base e
else
    entropy(:, j) = 0;
end
First you just
function [mean_entropy, bits] = compute_entropy(bits, blocks, threshold, replicate)
    if replicate
        disp('Replication is ON');
    else
        disp('Replication is OFF');
    end
    %%
    % Populate random vector
    if replicate
        seed = 849;
        rng(seed);
    else
        rng('default');
    end
    rs = rand(blocks);
    %%
    % Get random
    trial_entropy = zeros(length(bits));
    for r = 1:length(rs)
        bit_entropy = zeros(length(bits), 1); % H
        % Traverse bit trials
        for b = 1:(length(bits)) % N
            tent_map = zeros(b, 1); %Preallocate for memory management
            %Initialize
            tent_map(1) = rs(r);
            for j = 2:b % j is the iterator, b is the current bit
                if tent_map(j - 1) < threshold
                    tent_map(j) = 2 * tent_map(j - 1);
                else
                    tent_map(j) = 2 * (1 - tent_map(j - 1));
                end % if
            end
            %Binarize the output of the Tent Map
            s = find(tent_map >= threshold);
            p1 = sum(s == 1) / length(s); %calculate probability of 1's
            %p0 = 1 - p1; % calculate probability of 0's
            bit_entropy(b) = -p1 * log2(p1) - (1 - p1) * log2(1 - p1); %calculate entropy in bits
            if isnan(bit_entropy(b))
                bit_entropy(b) = 0;
            end
            %disp(abs(lambda-h))
        end
        trial_entropy(:, r) = bit_entropy;
        disp('Trial Statistics')
        data = get_summary(bit_entropy);
        disp('Mean')
        disp(data.mean);
        disp('SD')
        disp(data.sd);
    end
    % TO DO Compute the mean for each BIT index in trial_entropy
    mean_entropy = 0;
    disp('Overall Statistics')
    data = get_summary(trial_entropy);
    disp('Mean')
    disp(data.mean);
    disp('SD')
    disp(data.sd);
    %This is the wrong mean...
    mean_entropy = data.mean;

    function summary = get_summary(entropy)
        summary = struct('mean', mean(entropy), 'sd', std(entropy));
    end
end
and then you just have to
% Entropy Script
clear all
%% Settings
replicate = false; % = false % Use true for debugging only.
%H = 1; %in bits
Bits = 2.^(1:6);
Threshold = 0.5;
%Tolerance = 0.001;
Blocks = 100; %Number of runs of the experiment
%% Run
[mean_entropy, bits] = compute_entropy(Bits, Blocks, Threshold, replicate);
%What we want
%plot(bits, mean_entropy);
%What we have
plot(1:length(mean_entropy), mean_entropy);

Gram Schmidt Orthonormalisation

I am writing the following code for Gram-Schmidt orthogonalization. MATLAB says that there's an error in calling the function. What is the error and how do I rectify it?
A = [1,1,1,1;-1,4,4,-1;4,-2,2,0];
A = A';
B = myGramschmidt(A);

function [B] = myGramschmidt(A)
    x1 = A(:,1);
    x2 = A(:,2);
    x3 = A(:,3);
    v1 = x1;
    c = dot(v1);
    v2 = x2 - ((dot(x2,v1)/c) * v1);
    d = dot(v2);
    v3 = x3 - ((dot(x3,v1)/c) * v1) - ((dot(x3,v2)/d) * v2);
    C = [v1,v2,v3];
    V1 = normc(v1);
    V2 = normc(v2);
    V3 = normc(v3);
    B = [V1,V2,V3];
end
Using the Wikipedia Gram-Schmidt page, but Luis Mendo is correct as to why you got the error: MATLAB's dot requires two arguments, so dot(v1) should be dot(v1, v1).
function [B] = myGramschmidt(A)
    B = A;
    for k = 1:size(A, 1)
        for j = 1:k-1
            B(k, :) = B(k, :) - proj(B(j, :), A(k, :));
        end
    end
end

function p = proj(u, v)
    % https://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process#The_Gram.E2.80.93Schmidt_process
    p = dot(v, u) / dot(u, u) * u;
end
Try this vectorized implementation in Python.
I would also suggest going through David C. Lay's book for the theory.
import numpy as np

def replace_zero(array):
    # replace zeros with ones so a later division by a zero norm is avoided
    for i in range(len(array)):
        if array[i] == 0:
            array[i] = 1
    return array

def gram_schmidt(A, norm=True, row_vect=False):
    """Orthonormalizes vectors by the Gram-Schmidt process

    Parameters
    ----------
    A : ndarray
        Matrix having vectors in its columns
    norm : bool
        Do you need normalized vectors?
    row_vect : bool
        Does matrix A have vectors in its rows?

    Returns
    -------
    G : ndarray
        Matrix of orthogonal vectors

    Gram-Schmidt Process
    --------------------
    The Gram-Schmidt process is a simple algorithm for
    producing an orthogonal or orthonormal basis for any
    nonzero subspace of Rn.
    Given a basis {x1, ..., xp} for a nonzero subspace W of Rn, define

        v1 = x1
        v2 = x2 - (x2.v1/v1.v1) * v1
        v3 = x3 - (x3.v1/v1.v1) * v1 - (x3.v2/v2.v2) * v2
        ...
        vp = xp - (xp.v1/v1.v1) * v1 - (xp.v2/v2.v2) * v2 - ...
             ... - (xp.v(p-1)/v(p-1).v(p-1)) * v(p-1)

    Then {v1, ..., vp} is an orthogonal basis for W.
    In addition,
    Span {v1, ..., vk} = Span {x1, ..., xk} for 1 <= k <= p

    References
    ----------
    Linear Algebra and Its Applications - David C. Lay
    """
    if row_vect:
        # if true, transpose to get a column-vector matrix
        A = A.T
    no_of_vectors = A.shape[1]
    G = A[:, 0:1].copy()  # copy the first vector in the matrix
    # 0:1 keeps the dimensions consistent - [[1,2,3]]
    # iterate from the 2nd vector to the number of vectors
    for i in range(1, no_of_vectors):
        # calculate the weights (coefficients) for every vector in G
        numerator = A[:, i].dot(G)
        denominator = np.diag(np.dot(G.T, G))  # the diagonal elements
        weights = np.squeeze(numerator / denominator)
        # vector projected onto subspace G
        projected_vector = np.sum(weights * G, axis=1, keepdims=True)
        # vector orthogonal to subspace G
        orthogonalized_vector = A[:, i:i+1] - projected_vector
        # now add the orthogonal vector to our set
        G = np.hstack((G, orthogonalized_vector))
    if norm:
        # to get orthonormal vectors (unit orthogonal vectors),
        # replace zero with 1 to deal with division by 0 if the matrix
        # has a zero vector or a normalization value comes out to be zero
        G = G / replace_zero(np.linalg.norm(G, axis=0))
    if row_vect:
        return G.T
    return G

G = np.array([[1, 0, 0], [1, 1, 0], [1, 1, 1], [1, 1, 1]])
print(gram_schmidt(G))

which outputs:
array([[ 0.5 , -0.8660254 , 0. ],
[ 0.5 , 0.28867513, -0.81649658],
[ 0.5 , 0.28867513, 0.40824829],
[ 0.5 , 0.28867513, 0.40824829]])
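A quick sanity check of that output (assuming the gram_schmidt defined above): orthonormal columns must satisfy Q^T Q = I.

import numpy as np

G = np.array([[1, 0, 0], [1, 1, 0], [1, 1, 1], [1, 1, 1]], dtype=float)
Q = gram_schmidt(G)
print(np.allclose(Q.T @ Q, np.eye(Q.shape[1])))  # True: the columns are orthonormal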

Lagrange interpolation method

I use convolution and for loops (too many for loops) for calculating the interpolation using
Lagrange's method. Here's the main code:
function [p] = lagrange_interpolation(X,Y)
    L = zeros(n);
    p = zeros(1,n);
    % computing the L matrix, so that each row i holds the polynomial L_i
    % Now we compute li(x) for i=0...n, and we build the polynomial
    for k = 1:n
        multiplier = 1;
        outputConv = ones(1,1);
        for index = 1:n
            if (index ~= k && X(index) ~= X(k))
                outputConv = conv(outputConv, [1, -X(index)]);
                multiplier = multiplier * ((X(k) - X(index))^-1);
            end
        end
        polynomialSize = length(outputConv);
        for index = 1:polynomialSize
            L(k, n - index + 1) = outputConv(polynomialSize - index + 1);
        end
        L(k,:) = multiplier .* L(k,:);
    end
    % continues
end
That is a lot of for loops just to compute the l_i(x) (done before the final calculation of P_n(x) = sum_i y_i * l_i(x)).
Any suggestions on making it more MATLAB-idiomatic?
Thanks
Yeah, several suggestions (implemented in version 1 below): the if inside the loop can be combined with the for above it (just make index skip k, via something like jr(jr~=j) below); polynomialSize is always equal to length(outputConv), which is always equal to n (because you have n data points, hence an (n-1)th-order polynomial with n coefficients), so the last for loop and the next line can be replaced with the simple L(k,:) = multiplier * outputConv;
So I replicated the example on http://en.wikipedia.org/wiki/Lagrange_polynomial (and adopted their j-m notation, but for me j goes 1:n and m goes 1:n with m~=j), hence my initialization looks like
clear; clc;
X=[-9 -4 -1 7]; %example taken from http://en.wikipedia.org/wiki/Lagrange_polynomial
Y=[ 5 2 -2 9];
n=length(X); %Lagrange basis polynomials are (n-1)th order, have n coefficients
lj = zeros(1,n); %storage for numerator of Lagrange basis polyns - each w/ n coeff
Lj = zeros(n); %matrix of Lagrange basis polyns coeffs (lj(x))
L = zeros(1,n); %the Lagrange polynomial coefficients (L(x))
then v 1.0 looks like
jr = 1:n; %j-range: 1<=j<=n
for j = jr %my j is your k
    multiplier = 1;
    outputConv = 1; %numerator of lj(x)
    mr = jr(jr~=j); %m-range: 1<=m<=n, m~=j
    for m = mr %my m is your index
        outputConv = conv(outputConv, [1 -X(m)]);
        multiplier = multiplier * ((X(j) - X(m))^-1);
    end
    Lj(j,:) = multiplier * outputConv; %jth Lagrange basis polynomial lj(x)
end
L = Y*Lj; %coefficients of the Lagrange polynomial L(x)
which can be further simplified if you realize that the numerator of l_j(x) is just a polynomial with specific roots - MATLAB has a nice command for that, poly. Similarly, the denominator is just that polynomial evaluated at X(j) - for that there is polyval. Hence, v 1.9:
jr = 1:n; %j-range: 1<=j<=n
for j = jr
    mr = jr(jr~=j); %m-range: 1<=m<=n, m~=j
    lj = poly(X(mr)); %numerator of lj(x)
    mult = 1/polyval(lj,X(j)); %denominator of lj(x)
    Lj(j,:) = mult * lj; %jth Lagrange basis polynomial lj(x)
end
L = Y*Lj; %coefficients of the Lagrange polynomial L(x)
Why version 1.9 and not 2.0? well, there is probably a way to get rid of this last for loop, and write it all in 1 line, but I can't think of it right now - it's a todo for v 2.0 :)
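For comparison, the same v 1.9 idea translates almost line for line to NumPy (a sketch using np.poly and np.polyval, mirroring the MATLAB above; not a library one-liner):

import numpy as np

X = np.array([-9.0, -4.0, -1.0, 7.0])  # example from the Wikipedia page
Y = np.array([5.0, 2.0, -2.0, 9.0])
n = len(X)

Lj = np.zeros((n, n))
for j in range(n):
    lj = np.poly(np.delete(X, j))      # numerator: monic polynomial with roots at the other nodes
    Lj[j] = lj / np.polyval(lj, X[j])  # divide by its value at X[j]
L = Y @ Lj                             # coefficients of the Lagrange polynomial L(x)

print(np.allclose(np.polyval(L, X), Y))  # True: L interpolates the data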
And, for dessert, if you want to get the same picture as Wikipedia:
figure(1);clf
x=-10:.1:10;
hold on
plot(x,polyval(Y(1)*Lj(1,:),x),'r','linewidth',2)
plot(x,polyval(Y(2)*Lj(2,:),x),'b','linewidth',2)
plot(x,polyval(Y(3)*Lj(3,:),x),'g','linewidth',2)
plot(x,polyval(Y(4)*Lj(4,:),x),'y','linewidth',2)
plot(x,polyval(L,x),'k','linewidth',2)
plot(X,Y,'ro','linewidth',2,'markersize',10)
hold off
xlim([-10 10])
ylim([-10 10])
set(gca,'XTick',-10:10)
set(gca,'YTick',-10:10)
grid on
produces a plot like the one on the Wikipedia page.
Enjoy and feel free to reuse/improve.
Try:
X = 0:1/20:1; Y = cos(X);
then create L and apply polyval(L,1):
polyval(L,1) = 0.917483227909543
cos(1) = 0.540302305868140
Why is there such a huge difference?

sum the first n prime reciprocals such that the sum exceeds k (Matlab)

I am trying to write a program in MATLAB such that the sum of the reciprocals of the first n prime numbers exceeds a given value k. To clarify, I am trying to make a function
SumPrime(k)
And it is supposed to return an integer n such that
\sum_{i=1}^{n} 1/p_i > k
sum of primes and reciprocals of them and plot in matlab?
I tried looking here, but this does not quite answer my question. Neither did the command
sumInversePrimes = sum(1./primes(n));
Here is my attempt. First I define a function for finding the nth prime number.
function Y = NthPrime(n)
    if n == 1
        Y = 2;
        return
    end
    if n < 1 || round(n) ~= n
        return
    end
    j = 2;
    u = 0;
    while u < n
        T = primes(j);
        u = numel(T);
        j = 1 + j;
    end
    Y = T(numel(T));
After writing this (lengthy?) code for finding the nth prime number, the rest is a cakewalk.
function Y = E(u)
    sum = 0
    n = 0
    while sum < u
        n = n + 1
        sum = sum + 1/( NthPrime(n) )
    end
    Y = n;
This returns the proper values and somewhat works. Alas, it is very slow, and I guess this is very bad code. I have only just started learning to code in MATLAB. Could someone please help me either write better code or optimize mine?
Here's how to precompute the sums then find the first that exceeds a threshold:
>> p = primes(1000);
>> cs = cumsum(1./p);
>> find(cs > 1.8, 1)
ans = 25
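The same cumulative idea works without guessing an upper limit if you grow the primes one at a time. A Python sketch for comparison (assuming SymPy is installed; sum_prime is my name, echoing the question's SumPrime):

from sympy import nextprime

def sum_prime(k):
    # add reciprocals of successive primes until the running sum exceeds k
    total, n, p = 0.0, 0, 1
    while total <= k:
        p = nextprime(p)  # 2, 3, 5, 7, ...
        total += 1.0 / p
        n += 1
    return n

print(sum_prime(1.8))  # 25, matching the cumsum result above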