Achieve same random numbers in numpy as matlab - matlab

I want to know how to generate the same random (Normal Distribution) numbers in numpy as I do in MATLAB.
As an example when I do this in MATLAB
RandStream.setGlobalStream(RandStream('mt19937ar','seed',1));
rand
ans =
0.417022004702574
Now I can reproduce this with numpy:
import numpy as np
np.random.seed(1)
np.random.rand()
0.417022004702574
Which is nice, but when I do this with normal distribution I get different numbers.
RandStream.setGlobalStream(RandStream('mt19937ar','seed',1));
randn
ans =
-0.649013765191241
And with numpy
import numpy as np
np.random.seed(1)
np.random.randn()
1.6243453636632417
Both functions say in their documentation that they draw from the standard normal distribution, yet they give me different results. Any idea how I can adjust my Python/NumPy code to get the same numbers as MATLAB?
Because someone marked this as a duplicate:
This is about normal distribution, as I wrote in the beginning and end.
As I wrote uniform distribution works fine, this is about normal distribution.
None of the answers in the linked thread help with normal distribution.

My guess is that MATLAB and NumPy use different methods to turn uniform random numbers into normally distributed ones.
You can avoid this problem by writing a Box-Muller transform yourself, so that both languages apply the same transformation to the same uniform numbers. For Python,
import numpy as np
# Box-Muller normal distribution; note that it needs pairs of random numbers as input
def randn_from_rand(rand):
    assert rand.size == 2
    # Use Box-Muller to get normally distributed random numbers
    randn = np.zeros(2)
    randn[0] = np.sqrt(-2.*np.log(rand[0]))*np.cos(2*np.pi*rand[1])
    randn[1] = np.sqrt(-2.*np.log(rand[0]))*np.sin(2*np.pi*rand[1])
    return randn
np.random.seed(1)
r = np.random.rand(2)
print(r, randn_from_rand(r))
which gives,
(array([ 0.417022 , 0.72032449]), array([-0.24517852, -1.29966152]))
and for matlab,
% Box-muller normal distribution, note needs pairs of random numbers as input
function randn = randn_from_rand(rand)
%Use box-muller to get normally distributed random numbers
randn(1) = sqrt(-2*log(rand(1)))*cos(2*pi*rand(2));
randn(2) = sqrt(-2*log(rand(1)))*sin(2*pi*rand(2));
which we call with
RandStream.setGlobalStream(RandStream('mt19937ar','seed',1));
r = [rand, rand]
rn = randn_from_rand(r)
with answer,
r =
0.4170 0.7203
rn =
-0.2452 -1.2997
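The same transform can be applied to a whole array of uniform draws at once. Below is a minimal Python sketch, assuming the uniforms are consumed in consecutive pairs as in the functions above; the name box_muller_pairs is made up for this illustration and is not part of NumPy or MATLAB.

```python
import numpy as np

def box_muller_pairs(u):
    """Map an even-length array of uniforms in (0, 1) to standard normals.

    Hypothetical helper: consecutive entries (u[0], u[1]), (u[2], u[3]), ...
    are treated as Box-Muller input pairs.
    """
    u = np.asarray(u, dtype=float).reshape(-1, 2)
    radius = np.sqrt(-2.0 * np.log(u[:, 0]))   # first uniform of each pair sets the radius
    angle = 2.0 * np.pi * u[:, 1]              # second uniform of each pair sets the angle
    # Interleave the cos and sin results so the output order follows the pair order.
    return np.column_stack((radius * np.cos(angle), radius * np.sin(angle))).ravel()

rng = np.random.RandomState(1)   # legacy MT19937 generator, same stream as np.random.seed(1)
u = rng.rand(4)
z = box_muller_pairs(u)
print(z[:2])   # first pair; compare with the values printed above
```

A uniform draw of exactly 0 would make the logarithm return -inf. Since rand samples from [0, 1), that is possible in principle, so a more careful version might guard against it.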
Note, you can check the output is normally distributed, for python,
import matplotlib.pyplot as plt
ra = []
np.random.seed(1)
for i in range(1000000):
    rand = np.random.rand(2)
    ra.append(randn_from_rand(rand))
plt.hist(np.array(ra).ravel(),100)
plt.show()
which gives a histogram of the generated values (figure not shown).
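A numeric check complements the histogram. The sketch below, which assumes the same seed and draws the uniforms in one vectorized call rather than in a loop, compares the sample mean and standard deviation with the expected values 0 and 1.

```python
import numpy as np

rng = np.random.RandomState(1)
n_pairs = 100000
u = rng.rand(n_pairs, 2)                      # each row is one Box-Muller input pair

radius = np.sqrt(-2.0 * np.log(u[:, 0]))
angle = 2.0 * np.pi * u[:, 1]
samples = np.concatenate((radius * np.cos(angle), radius * np.sin(angle)))

sample_mean = samples.mean()
sample_std = samples.std()
print(sample_mean, sample_std)                # should be close to 0 and 1
```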

Related

Networkx - Get probability p(k) from network

I have plotted the histogram of network (dataframe), with count of 'k' node connections, like so:
import seaborn as sns
parameter ='k'
sns.histplot(network[parameter])
But now I need to create a modular random graph using above group distribution with:
from networkx.generators.community import random_partition_graph
random_partition_graph(sizes, p_in, p_out, seed=None, directed=False)
And, instead of counts, I need this value p(k), which must be passed as p_in.
p_in (float)
probability of edges with in groups
How do I get p(k) from my network?
This is how I would handle what you described. First, normalize your histogram so that its bars sum to 1; this can be done by setting the weights argument of the histogram appropriately. The normalized histogram can then be treated as the probability distribution of your degrees. Once you have this distribution, i.e. a list of probabilities (deg_prob in the code), you can sample from it with np.random.choice(np.arange(np.amin(degrees),np.amax(degrees)+1), p=deg_prob, size=N_sampling). From these samples you can then create a random expected_degree_graph by passing them as the w argument.
You can then compare the degree distribution of your original graph with the one from your random graph.
See below for the code and more details:
import networkx as nx
from networkx.generators.random_graphs import binomial_graph
from networkx.generators.degree_seq import expected_degree_graph
import matplotlib.pyplot as plt
import numpy as np
fig=plt.figure()
N_nodes=1000
G=binomial_graph(n=N_nodes, p=0.01, seed=0) #Creating a random graph as data
degrees = np.array([G.degree(n) for n in G.nodes()])#Computing degrees of nodes
bins_val=np.arange(np.amin(degrees),np.amax(degrees)+2) #Bins
deg_prob,_,_=plt.hist(degrees,bins=bins_val,align='left',weights=np.ones_like(degrees)/N_nodes,
color='tab:orange',alpha=0.3,label='Original distribution')#Histogram
#Sampling from distribution
N_sampling=500
random_sampling=np.random.choice(np.arange(np.amin(degrees),np.amax(degrees)+1), p=deg_prob, size=N_sampling)
#Creating random graph from samples
G_random_sampling=expected_degree_graph(random_sampling,seed=0,selfloops=False)
degrees_random_sampling = np.array([G_random_sampling.degree(n) for n in G_random_sampling.nodes()])
deg_prob_random_sampling,_,_=plt.hist(degrees_random_sampling,bins=bins_val,align='left',
weights=np.ones_like(degrees_random_sampling)/N_sampling,color='tab:blue',label='Sample distribution',alpha=0.3)
#Plotting both histograms
plt.xticks(bins_val)
plt.xlabel('degree')
plt.ylabel('Prob')
plt.legend()
plt.show()
The output then shows the two normalized degree histograms overlaid (figure not shown).
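If what is needed is p(k) itself rather than a plot, it can be computed directly from the degree sequence. The following is a minimal sketch with a made-up degree array standing in for network['k']. The edge-density estimate at the end is only one rough, assumed way to turn the degrees into the single float that random_partition_graph expects for p_in.

```python
import numpy as np

# Hypothetical degree sequence; in practice this would come from the network data.
degrees = np.array([2, 3, 3, 1, 4, 2, 3, 2, 0, 2])
n_nodes = degrees.size

counts = np.bincount(degrees)        # counts[k] = number of nodes with degree k
p_k = counts / counts.sum()          # empirical probability p(k) of each degree k

# Assumption: if every edge exists with the same probability p, the expected
# degree is p * (n - 1), so the mean degree gives a rough estimate of p.
edge_density = degrees.mean() / (n_nodes - 1)

for k, prob in enumerate(p_k):
    print(k, prob)
print("edge density:", edge_density)
```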

How to obtain new data from the given starting point?

I am new to neural networks and MATLAB. My problem: I have some XY graphs (X: data, Y: time). All graphs share the same time axis but have different X values. I also have a starting point Z. I want to get the actual graph that starts from Z, based on the XY graphs above. I tried this using nntool, which is available in MATLAB, with a few training algorithms such as TRAINBR, TRAINLM and TRAINB, but the output of the trained network never starts from Z. I tried rearranging my data, changing the input ranges, and using more layers, epochs and neurons. Nothing worked. Please tell me how to solve this problem; the solution does not have to use nntool, so feel free to suggest better options. An example picture is here...
From what I can infer, you are trying to interpolate. Naively one can do it by shifting the mean of the data to Z. I don't have MATLAB, but it shouldn't be difficult to read the Python code.
import matplotlib.pyplot as plt
import numpy as np
Z = 250
# Creating some fake data
y = np.zeros((1000,3))
y[:,0] = np.arange(1000)-500
y[:,1] = np.arange(1000)
y[:,2] = np.arange(1000)+500
x = np.arange(1000)
# Plotting fake data
plt.plot(x,y)
#Take mean along Y axis
ymean = np.mean(y,axis=1)
# Shift the mean to the desired Z after shifting it to origin
Zdash = ymean + (Z - ymean[0])
plt.plot(x,y)
plt.plot(x,Zdash)

How to fit a lognormal distribution

I want to fit a lognormal distribution in Python. My question is: why should I use scipy.stats.lognorm.fit instead of just doing the following:
from numpy import log
mu = log(data).mean()
sigma = log(data).std()
which gives the MLE of mu and sigma so that the distribution is lognormal(mu, sigma**2)?
Also, once I get mu and sigma how can I get a scipy object of the distribution lognormal(mu, sigma**2)? The arguments passed to scipy.stats.lognorm are not clear to me.
Thanks
Wrt fitting: you could use scipy.stats.lognorm.fit, you could use scipy.stats.norm.fit applied to log(x), or you could do what you just wrote. I believe you should get pretty much the same result, provided you keep the location fixed at zero (floc=0) in lognorm.fit.
The only thing I would add is that you have to fit two parameters (mu, sigma), so you have to match two values. Instead of going for mean/stddev, some people might prefer to match peaks, thus getting (mu, sigma) from mode/stddev.
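To see how closely the two approaches agree, here is a small sketch on synthetic data. The sample, its parameters (0.5 and 0.8) and the seed are assumptions chosen only for illustration; floc=0 tells lognorm.fit to keep the location at zero so that it estimates the same two-parameter model.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.5, sigma=0.8, size=5000)   # synthetic lognormal sample

# Direct estimates from the log of the data
mu = np.log(data).mean()
sigma = np.log(data).std()

# scipy fit with the location fixed at 0; returns (shape, loc, scale)
shape, loc, scale = stats.lognorm.fit(data, floc=0)

print(mu, np.log(scale))   # log(scale) plays the role of mu
print(sigma, shape)        # shape plays the role of sigma
```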
Wrt using lognorm with known mu and sigma (the mean and standard deviation of log(x)):
import numpy as np
from scipy.stats import lognorm
sigma = 0.859455801705594
mu = 0.418749176686875
# scipy's shape parameter s is sigma and scale is exp(mu); loc stays at its default of 0.
dist = lognorm(s=sigma, scale=np.exp(mu))  # frozen lognormal(mu, sigma**2) distribution object
# You can then get the pdf or cdf like this:
import pylab as pl
x=np.linspace(0,6,200)
pl.plot(x,dist.pdf(x))
pl.plot(x,dist.cdf(x))
pl.show()
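One way to convince yourself about the parametrization is to compare a frozen distribution with the closed-form lognormal moments. This sketch takes the two numbers from the example above and interprets them, as an assumption, as the mean and standard deviation of log(x); it relies on scipy's convention that the shape s is sigma and scale is exp(mu).

```python
import numpy as np
from scipy.stats import lognorm

# Assumed to be the mean and standard deviation of log(x)
mu, sigma = 0.418749176686875, 0.859455801705594

dist = lognorm(s=sigma, scale=np.exp(mu))   # frozen lognormal(mu, sigma**2)

# Closed-form results for a lognormal distribution
expected_median = np.exp(mu)
expected_mean = np.exp(mu + sigma**2 / 2)

print(dist.median(), expected_median)
print(dist.mean(), expected_mean)
```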

Whitening Transformation of a Data Matrix in numpy does not replicate MATLAB results

I'm trying to do a whitening transformation (or sphering transformation) of a matrix in numpy. There are a number of ways to do this (e.g., by calculating eigenvalues), but I'm looking for a solution to the particular approach I describe below: I am trying to raise a covariance matrix to a negative fractional power.
This is how one does it in MATLAB with an example matrix A:
A = [1 2 3; 4 5 6; 7 8 9; 10 11 12];
C = cov(A);
C^-0.5
As the answer, I get a matrix whose elements have real and imaginary parts.
When I try to do the same in Python as below, I get a matrix of nans:
from scipy.linalg import fractional_matrix_power
import numpy as np
A=np.asarray([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
C=np.cov(np.transpose(A))
fractional_matrix_power(C, -0.5)
Am I missing something here?
I tried the solution suggested in this answer, but I'm not sure whether it calculates what I want. The resulting matrices in MATLAB and Python are not the same (perhaps because the solution of this problem, C^-0.5, is not unique?).
from scipy.linalg import logm, expm
E = -0.5*logm(C)
F = expm(E)
The equivalent MATLAB code is:
expm((-0.5)*logm(C))
Is there a way to check this? For example, is the square of C^-0.5 equivalent to the inverse of C, in which case multiplying it by C will result in the identity matrix?
While trying to debug on Python, I ran into some other strange things too (not directly related to the problem above, but may hint at some problem I'm facing).
Multiplication of a matrix and its inverse is not I
D = fractional_matrix_power(C, 0.5)
E = np.linalg.inv(D)
np.dot(E,D)
The above operation does not return an Identity matrix. Should it not? It does in MATLAB. I also used scipy.linalg.inv instead of the numpy version, but didn't help.
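Checks of this kind are easy to write down. The sketch below first looks at the rank of C for the example matrix A, whose columns differ only by a constant offset, and then applies fractional_matrix_power to a separate, made-up data set to show the identity check described above. The random data X and its seed are assumptions for illustration only.

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]], dtype=float)
C = np.cov(A, rowvar=False)            # same as np.cov(np.transpose(A))
rank_C = np.linalg.matrix_rank(C)
print(C)        # every entry is 15: all columns have identical deviations from their mean
print(rank_C)   # rank 1, so C is singular

# Hypothetical full-rank data to illustrate the identity check
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
S = np.cov(X, rowvar=False)
W = fractional_matrix_power(S, -0.5)
check = W @ W @ S                      # should be close to the identity matrix
print(np.allclose(check, np.eye(3)))
```

Because C is singular for this particular A, it has no inverse, which would be consistent with the nans and with the product above not being the identity.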

How to fit a poisson distribution with seaborn?

I try to fit my data to a poisson distribution:
import seaborn as sns
import scipy.stats as stats
sns.distplot(x, kde = False, fit = stats.poisson)
But I get this error:
AttributeError: 'poisson_gen' object has no attribute 'fit'
Other distributions (gamma, etc.) do work well.
The Poisson distribution (implemented in scipy as scipy.stats.poisson) is a discrete distribution. The discrete distributions in scipy do not have a fit method.
I'm not very familiar with the seaborn.distplot function, but it appears to assume that the data comes from a continuous distribution. If that is the case, then even if scipy.stats.poisson had a fit method, it would not be an appropriate distribution to pass to distplot.
The question title is "How to fit a poisson distribution with seaborn?", so for the sake of completeness, here's one way to get a plot of the data and its fit. seaborn is only used for the bar plot, following @mwaskom's suggestion to use seaborn.countplot. The fitting is actually trivial, because the maximum likelihood estimate of the Poisson parameter is simply the mean of the data.
First, the imports:
In [136]: import numpy as np
In [137]: from scipy.stats import poisson
In [138]: import matplotlib.pyplot as plt
In [139]: import seaborn
Generate some data to work with:
In [140]: x = poisson.rvs(0.4, size=100)
These are the values in the x:
In [141]: k = np.arange(x.max()+1)
In [142]: k
Out[142]: array([0, 1, 2, 3])
Use seaborn.countplot to plot the data:
In [143]: seaborn.countplot(x, order=k, color='g', alpha=0.5)
Out[143]: <matplotlib.axes._subplots.AxesSubplot at 0x114700490>
The maximum likelihood estimation of the Poisson parameter is simply the mean of the data:
In [144]: mlest = x.mean()
Use poisson.pmf() to get the expected probability, and multiply by the size of the data set to get the expected counts, and then plot using matplotlib. The bars are the counts of the actual data, and the dots are the expected counts of the fitted distribution:
In [145]: plt.plot(k, poisson.pmf(k, mlest)*len(x), 'go', markersize=9)
Out[145]: [<matplotlib.lines.Line2D at 0x114da74d0>]
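Put together as a plain script without the plotting, the same steps might look like the sketch below. The seed (random_state=0) is an assumption added only to make the run repeatable.

```python
import numpy as np
from scipy.stats import poisson

x = poisson.rvs(0.4, size=100, random_state=0)   # synthetic Poisson data
k = np.arange(x.max() + 1)                       # values that occur in the data

lam_hat = x.mean()                               # maximum likelihood estimate of the Poisson parameter
observed = np.bincount(x, minlength=k.size)      # actual count of each value k
expected = poisson.pmf(k, lam_hat) * x.size      # expected count of each value under the fit

for value, obs, exp in zip(k, observed, expected):
    print(value, obs, round(exp, 2))
```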