How can I generate random integers in [a,b] with the distribution below in MATLAB:
p(x)= x^(-a)
I want the distribution to be normalized.
For continuous distributions, see: Generate random values given a PDF
For discrete distributions, as was later specified in the OP:
The same rationale applies as for continuous distributions: inverse transform sampling.
So from a mathematical point of view there is no difference; the MATLAB implementation, however, is different. Here is a simple solution with your distribution function:
% for reproducibility
rng(333)
% OPTIONS
% interval endpoints
a = 4;
b = 20;
% number of required random draws
n = 1e4;
% CALCULATION
x = a:b;
% normalization constant
nc = sum(x.^(-a));
% if a and b are finite it is more convenient to have the pmf and cdf as vectors
pmf = 1/nc*x.^(-a);
% create cdf
cdf = cumsum(pmf);
% generate uniformly distributed random numbers from [0,1]
r = rand(n,1);
% use the cdf to map each random draw r to an x value
R = nan(n,1);
for ii = 1:n
rr = r(ii);
if rr == 1
R(ii) = b;
else
idx = sum(cdf < rr) + 1;
R(ii) = x(idx);
end
end
%PLOT
% verification plot
f = hist(R,x);
bar(x,f/sum(f))
hold on
plot(x, pmf, 'xr', 'LineWidth', 1.2)
xlabel('x')
ylabel('Probability mass')
legend('histogram of random values', 'analytical pmf')
Notes:
the code is general, just replace the pmf with your function;
it is odd that the same parameter a appears both in the exponent of the distribution function and as an interval endpoint.
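For long intervals the explicit loop can be slow; the same inverse-transform lookup can be vectorized with discretize (a sketch, assuming x, cdf and r are as above; discretize requires R2015a or newer, and it may differ from the loop only for draws that hit a cdf value exactly):
edges = [0, cdf]; % the cdf values act as bin edges on [0,1]
edges(end) = 1; % guard against floating-point round-off in the last cdf value
idx = discretize(r, edges); % bin index of every uniform draw at once
R2 = x(idx).'; % transpose to a column, like R in the loop above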
I'm generating 3d fractal noise in MATLAB using a variety of methods. It's working relatively well, but I'm having an issue where I see vertical striping artifacts in my noise. This happens regardless of what data type or resolution I use.
Edit: I figured it out. The solution is posted as an answer below. Thanks everyone for your thoughts and guidance!
expo = 2^6;
dims = [expo,expo,expo];
beta = -4.5;
render = randnd(beta, dims); % Create volumetric fractal
render = render - min(render); % Set floor to zero
render = render ./ max(render); % Set ceiling to one
%render = imbinarize(render); % BW Threshold option
render = render .* 255; % For greyscale
slicer = 1; % Turn on image slicer/saver
format = '.png';
imagename = '___testDump/slice';
imshow(render(:,:,1),[0 255]); %Single test image
if slicer == 1
for i = 1:size(render,3) % step through each slice along the 3rd dimension
pagenumber = num2str(i);
filename = [imagename, pagenumber, format];
imwrite(uint8(render(:,:,i)),filename)
end
end
function X = randnd(beta,varargin)
seed = 999;
rng(seed); % Set seed
%% X = randnd(beta,varargin)
% Based on similar functions by Jon Yearsley and Hristo Zhivomirov
% Written by Marcin Konowalczyk
% Timmel Group @ Oxford University
%% Parse the input
narginchk(0,Inf); nargoutchk(0,1);
if nargin < 2 || isempty(beta); beta = 0; end % Default to white noise
assert(isnumeric(beta) && isequal(size(beta),[1 1]),'''beta'' must be a number');
assert(-6 <= beta && beta <= 6,'''beta'' out of range'); % Put on reasonable bounds
%% Generate N-dimensional white noise with 'randn'
X = randn(varargin{:});
if isempty(X); return; end; % Usually happens when size vector contains zeros
% Squeeze prevents an error if X has more than one leading singleton dimension
% This is a slight deviation from the pure functionality of 'randn'
X = squeeze(X);
% Return if white noise is requested
if beta == 0; return; end;
%% Generate corresponding N-dimensional matrix of multipliers
N = size(X);
% Create matrix of multipliers (M) of X in the frequency domain
M = [];
for j = 1:length(N)
n = N(j);
if (rem(n,2)~=0) % if n is odd
% Nyquist frequency bin does not show up in odd-numbered fft
k = ifftshift(-(n-1)/2:(n-1)/2);
else
k = ifftshift(-n/2:n/2-1);
end
% Spectral multipliers
m = (k.^2)';
if isempty(M);
M = m;
else
% Create the permutation vector
M_perm = circshift(1:length(size(M))+1,[0 1]);
% Permute a singleton dimension to the beginning of M
M = permute(M,M_perm);
% Add m along the first dimension of M
M = bsxfun(@plus,M,m);
end
end
% Reverse M to match X (since new dimensions were being added from the left)
M = permute(M,length(size(M)):-1:1);
assert(isequal(size(M),size(X)),'Bad programming error'); % This should never occur
% Shape the amplitude multipliers by beta/4 which corresponds to shaping the power by beta
M = M.^(beta/4);
% Set the DC component to zero
M(1,1) = 0;
%% Multiply X by M in frequency domain
Xstd = std(X(:));
Xmean = mean(X(:));
X = real(ifftn(fftn(X).*M));
% Force zero mean unity standard deviation
X = X - mean(X(:));
X = X./std(X(:));
% Restore the standard deviation and mean from before the spectral shaping.
% This ensures the random sample from randn is truly random. After all, if
% the mean was always exactly zero it would not be all that random.
X = X + Xmean;
X = X.*Xstd;
end
Here is my solution:
My "min/max" code (lines 6 and 7) was bad. I wanted to divide all values in the matrix by the single largest value in the matrix so that all values would be between 0 and 1. Because I used max() improperly, I was stepping through the max value of each column and using that as my divisor; thus the vertical stripes.
In the end this is what my code looks like. X is the 3 dimensional matrix:
minVal = min(X,[],'all'); % Get the lowest value in the entire matrix
X = X - minVal; % Set min value to zero
maxVal = max(X,[],'all'); % Get the highest value in the entire matrix
X = X ./ maxVal; % Set max value to one
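The 'all' option of min and max was introduced in R2018b; on older releases the same global rescaling can be done by flattening the matrix first (a minimal sketch, assuming X is the same 3-dimensional matrix):
X = X - min(X(:)); % X(:) reshapes X into a single column, so this is a global minimum
X = X ./ max(X(:)); % likewise a global maximum, which avoids the per-column striping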
As the title suggests, I am having difficulty understanding how to generate two correlated uniform [0,1] random variables. I am new to the idea of copulas.
I am struggling to write MATLAB code that generates two correlated uniform [0,1] random variables.
Generating correlated uniform random variables with Gaussian Copula
rho = .75; % Desired target correlation
N = 1000; % Number of samples
Z = mvnrnd([0 0],[1 rho; rho 1], N);
U = normcdf(Z); % Correlated U(0,1) random variables
scatterhist(U(:,1),U(:,2),'Direction','out') % Visualize (change `rho` to see impact)
Note: the method is not guaranteed to hit the target correlation exactly, but it should be close enough for many applications.
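If the target matters, for a Gaussian copula the correlation of the uniforms is rho_U = (6/pi)*asin(rho/2), so the Gaussian parameter can be adjusted up front to hit the target exactly (a sketch, reusing N from above and treating rho as the desired correlation of U):
rho_adj = 2*sin(pi*rho/6); % Gaussian rho that yields corr(U) equal to rho
Z = mvnrnd([0 0],[1 rho_adj; rho_adj 1], N);
U = normcdf(Z);
corr(U(:,1),U(:,2)) % should match the target up to sampling noise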
This can be very useful for quickly generating correlated distributions using the inverse transform method (either analytically or numerically). Both use cases are illustrated below.
Analytical approach
lambda = 2; alpha = 2; beta = 3;
rho = -.35; N = 1000;
Z = mvnrnd([0 0],[1 rho; rho 1], N);
U = normcdf(Z);
X = (-1/lambda)*log(U(:,1)); % Inverse Transform for Exponential
Y = beta*(-log(U(:,2))).^(1/alpha); % Inverse Transform for Weibull
corr(X,Y)
scatterhist(X,Y,'Direction','out')
Numerical approach
% Parameters
alpha = 6.7; lambda = 3;
mu = 0.1; sigma = 0.5;
rho = 0.75; N = 1000;
% Make distributions
pd_X = makedist('Gamma','a',alpha,'b',lambda); % shape a, scale b
pd_Y = makedist('Lognormal','mu',mu,'sigma',sigma);
Z = mvnrnd([0 0],[1 rho; rho 1], N);
U = normcdf(Z);
% Use Inverse Transform for marginal distributions (numerically)
X = icdf(pd_X,U(:,1)); % Inverse CDF for X
Y = icdf(pd_Y,U(:,2)); % Inverse CDF for Y
corr(X,Y)
scatterhist(X,Y,'Direction','out')
References:
Inverse Transform
Copulas
Gaussian copula:
Ross, Sheldon (2013). Simulation, 5th edition. Academic Press, San Diego, CA, pp. 103–105.
Adapted from a related answer.
I am coding a Gaussian Process regression algorithm. Here is the code:
% Data generating function
fh = @(x)(2*cos(2*pi*x/10).*x);
% range
x = -5:0.01:5;
N = length(x);
% Sampled data points from the generating function
M = 50;
selection = false(N,1);
j = randsample(N, M);
% mark them
selection(j) = 1;
Xa = x(j);
% compute the function and extract mean
f = fh(Xa) - mean(fh(Xa));
sigma2 = 1;
% computing the interpolation using all x's
% It is expected that for points used to build the GP cov. matrix, the
% uncertainty is reduced...
K = squareform(pdist(x'));
K = exp(-(0.5*K.^2)/sigma2);
% upper left corner of K
Kaa = K(selection,selection);
% lower right corner of K
Kbb = K(~selection,~selection);
% upper right corner of K
Kab = K(selection,~selection);
% mean of posterior
m = Kab'*inv(Kaa+0.001*eye(M))*f';
% cov. matrix of posterior
D = Kbb - Kab'*inv(Kaa + 0.001*eye(M))*Kab;
% sampling M functions from the GP
[A,B,C] = svd(Kaa);
F0 = A*sqrt(B)*randn(M,M);
% mean from GP using sampled points
F0m = mean(F0,2);
F0d = std(F0,0,2);
%%
% put together data and estimation
F = zeros(N,1);
S = zeros(N,1);
F(selection) = f' + F0m;
S(selection) = F0d;
% sampling M function from posterior
[A,B,C] = svd(D);
a = A*sqrt(B)*randn(N-M,M);
% mean from posterior GPs
Fm = m + mean(a,2);
Fmd = std(a,0,2);
F(~selection) = Fm;
S(~selection) = Fmd;
%%
figure;
% show what we got...
plot(x, F, ':r', x, F-2*S, ':b', x, F+2*S, ':b'), grid on;
hold on;
% show points we got
plot(Xa, f, 'Ok');
% show the whole curve
plot(x, fh(x)-mean(fh(x)), 'k');
grid on;
I expected a figure where the uncertainty is large at unknown data points and small around the sampled data points. Instead I got an odd figure, and odder still, the uncertainty around the sampled data points is larger than on the rest. Can someone explain to me what I am doing wrong? Thanks!!
There are a few things wrong with your code. Here are the most important points:
The major mistake that makes everything go wrong is the indexing of f. You are defining Xa = x(j), but you should actually do Xa = x(selection), so that the indexing is consistent with the indexing you use on the kernel matrix K.
Subtracting the sample mean f = fh(Xa) - mean(fh(Xa)) does not serve any purpose, and it makes the circles in your plot deviate from the actual function. (If you choose to subtract something, it should be a fixed number or function, not one that depends on the randomly sampled observations.)
You should compute the posterior mean and variance directly from m and D; no need to sample from the posterior and then obtain sample estimates for those.
Here is a modified version of the script with the above points fixed.
%% Init
% Data generating function
fh = @(x)(2*cos(2*pi*x/10).*x);
% range
x = -5:0.01:5;
N = length(x);
% Sampled data points from the generating function
M = 5;
selection = false(N,1);
j = randsample(N, M);
% mark them
selection(j) = 1;
Xa = x(selection);
%% GP computations
% compute the function and extract mean
f = fh(Xa);
sigma2 = 2;
sigma_noise = 0.01;
var_kernel = 10;
% computing the interpolation using all x's
% It is expected that for points used to build the GP cov. matrix, the
% uncertainty is reduced...
K = squareform(pdist(x'));
K = var_kernel*exp(-(0.5*K.^2)/sigma2);
% upper left corner of K
Kaa = K(selection,selection);
% lower right corner of K
Kbb = K(~selection,~selection);
% upper right corner of K
Kab = K(selection,~selection);
% mean of posterior
m = Kab'/(Kaa + sigma_noise*eye(M))*f';
% cov. matrix of posterior
D = Kbb - Kab'/(Kaa + sigma_noise*eye(M))*Kab;
%% Plot
figure;
grid on;
hold on;
% GP estimates
plot(x(~selection), m);
plot(x(~selection), m + 2*sqrt(diag(D)), 'g-');
plot(x(~selection), m - 2*sqrt(diag(D)), 'g-');
% Observations
plot(Xa, f, 'Ok');
% True function
plot(x, fh(x), 'k');
A resulting plot from this, with 5 randomly chosen observations, shows the true function in black, the posterior mean in blue, and the confidence intervals in green.
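If you also want a single curve over the full grid, the observations can be stitched back in at the selected locations (a sketch reusing the variables above; the posterior uncertainty at the observed points is essentially zero, up to the sigma_noise term):
Ffull = zeros(N,1); % posterior mean over the whole grid
Sfull = zeros(N,1); % posterior standard deviation
Ffull(~selection) = m;
Sfull(~selection) = sqrt(diag(D));
Ffull(selection) = f'; % observed values at the sampled locations
plot(x, Ffull, 'b:', x, Ffull+2*Sfull, 'g:', x, Ffull-2*Sfull, 'g:');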
In the Matlab SVM tutorial, it says
You can set your own kernel function, for example, kernel, by setting 'KernelFunction','kernel'. kernel must have the following form:
function G = kernel(U,V)
where:
U is an m-by-p matrix.
V is an n-by-p matrix.
G is an m-by-n Gram matrix of the rows of U and V.
When I followed the custom SVM kernel example, I set a breakpoint in the mysigmoid.m function. However, I found that U and V were in fact 1-by-p vectors and G was a scalar.
Why does MATLAB not process the kernel with full matrices?
My custom kernel function is
function G = mysigmoid(U,V)
% Sigmoid kernel function with slope gamma and intercept c
gamma = 0.5;
c = -1;
G = tanh(gamma*U*V' + c);
end
My Matlab script is
%% Train SVM Classifiers Using a Custom Kernel
rng(1); % For reproducibility
n = 100; % Number of points per quadrant
r1 = sqrt(rand(2*n,1)); % Random radius
t1 = [pi/2*rand(n,1); (pi/2*rand(n,1)+pi)]; % Random angles for Q1 and Q3
X1 = [r1.*cos(t1), r1.*sin(t1)]; % Polar-to-Cartesian conversion
r2 = sqrt(rand(2*n,1));
t2 = [pi/2*rand(n,1)+pi/2; (pi/2*rand(n,1)-pi/2)]; % Random angles for Q2 and Q4
X2 = [r2.*cos(t2), r2.*sin(t2)];
X = [X1; X2]; % Predictors
Y = ones(4*n,1);
Y(2*n + 1:end) = -1; % Labels
% Plot the data
figure(1);
gscatter(X(:,1),X(:,2),Y);
title('Scatter Diagram of Simulated Data');
SVMModel1 = fitcsvm(X,Y,'KernelFunction','mysigmoid','Standardize',true);
% Compute the scores over a grid
d = 0.02; % Step size of the grid
[x1Grid,x2Grid] = meshgrid(min(X(:,1)):d:max(X(:,1)),...
min(X(:,2)):d:max(X(:,2)));
xGrid = [x1Grid(:),x2Grid(:)]; % The grid
[~,scores1] = predict(SVMModel1,xGrid); % The scores
figure(2);
h(1:2) = gscatter(X(:,1),X(:,2),Y);
hold on;
h(3) = plot(X(SVMModel1.IsSupportVector,1),X(SVMModel1.IsSupportVector,2),...
'ko','MarkerSize',10);
% Support vectors
contour(x1Grid,x2Grid,reshape(scores1(:,2),size(x1Grid)),[0,0],'k');
% Decision boundary
title('Scatter Diagram with the Decision Boundary');
legend({'-1','1','Support Vectors'},'Location','Best');
hold off;
CVSVMModel1 = crossval(SVMModel1);
misclass1 = kfoldLoss(CVSVMModel1);
disp(misclass1);
Kernels implicitly add dimensions to a feature. If a sample has, say, one feature, x = {a}, the kernel expands it into something like x = {a_1, ..., a_q}. Applied to all of your data at once, this means U is an M-by-P matrix (M is the number of examples in your training set and P is the number of features) and V is an N-by-P matrix, where N is the number of examples in the training/test set. Since the kernel computes U*V', the inner dimensions conform and the output G should be the M-by-N Gram matrix.
If G is a scalar instead, it simply means U and V each held a single observation (M = N = 1) when your breakpoint fired, i.e. fitcsvm called the kernel for one pair of rows at that moment. That is still consistent with the documented form, and a kernel written as G = tanh(gamma*U*V' + c) works unchanged for both the single-row and the full-matrix case.
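A quick way to confirm the documented shapes is to call the kernel directly with small random matrices (a sketch using the mysigmoid defined above):
U = randn(5,2); % m = 5 observations with p = 2 features
V = randn(3,2); % n = 3 observations with the same p
G = mysigmoid(U,V); % U*V' is 5-by-3, so G is the full 5-by-3 Gram matrix
disp(size(G)) % prints 5 3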
I have written this code:
N=10000; % number of experiments
o= 1000+randn(1,N)*sqrt(10^4); % random normal distribution with mean 1000 and variance 10^4
b=700:50:1300; % specify the number of bins (possible values of the realizations)
prob = hist(o,b)/N % create a histogram of relative frequencies
X = 700:50:1300;
Now, how can I create a matrix which contains the values of b and prob?
In other words, I want a matrix of this kind:
matrix = [ value of X(i) ; probability associated with the value of X(i) ]
e.g. matrix = [ ... X(i)=850 ... ; ... prob(X(i)=850) ... ]
Thank you a lot! ;)
I think you want the probabilities of the intervals for which the histogram is computed:
N = 100000; %// number of experiments
b = 700:50:1300; %// bin centers
mu = 1000; %// mean of distribution
sigma = 100; %// standard deviation of distribution
delta = (b(2)-b(1))/2; %// compute bin half-width
pb = normcdf(b+delta,mu,sigma)-normcdf(b-delta,mu,sigma); %// compute probability
Check:
o = mu + sigma*randn(1,N);
hist(o, b)
hold on
plot(b, N*pb, 'r', 'linewidth', 2)
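With b and pb in hand, the two-row matrix asked for in the question is a simple concatenation (a sketch; the first row holds the values and the second their probabilities):
matrix = [b; pb]; % 2-by-numel(b): values on top, probabilities below
matrix(:, b==850) % e.g. the column for X(i) = 850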