Matlab: generate random numbers from custom made probability density function - matlab

I have a dataset with 3-hourly precipitation amounts for the month of January in the period 1977-1983 (see attachment). However, I want to generate precipitation data for the period 1984-1990 based upon these data. Therefore, I was wondering if it would be possible to create a custom made probability density function of the precipitation amounts (1977-1983) and from this, generate random numbers (precipitation data) for the desired period (1984-1990).
Is this possible in Matlab, and could someone help me do so?
Thanks in advance!

A histogram will give you an estimate of the PDF: divide the bin counts by the total number of samples to get each bin's probability, and also by the bin width if you need a true density. From there you can estimate the CDF by integrating. Finally, you can choose a uniformly distributed random number between 0 and 1 and estimate the argument of the CDF that would yield that number. That is, if y is the random number you choose, then you want to find x such that CDF(x) = y. The value of x will be a random number with the desired PDF.
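The steps described above can be sketched in base MATLAB roughly as follows. This is only an illustration under stated assumptions: `precip` is a hypothetical stand-in for the observed 1977-1983 amounts, 30 bins is an arbitrary choice, and values are drawn uniformly inside the selected bin. It assumes R2016b or later (for `histcounts` and implicit expansion).

```matlab
rng(1);                                   % reproducible illustration
precip = -2*log(rand(500,1));             % hypothetical stand-in data (assumption)

nBins = 30;                               % arbitrary bin count (assumption)
[counts, edges] = histcounts(precip, nBins);
edges = edges(:);                         % column vector, so edges(idx) is a column
p  = counts(:) / sum(counts);             % probability of each bin
cp = cumsum(p);                           % CDF evaluated at the right edge of each bin
cp(end) = 1;                              % guard against round-off

N = 1000;                                 % number of values to generate
u = rand(N,1);                            % uniform numbers in (0,1)
idx = sum(u > cp.', 2) + 1;               % first bin whose CDF value reaches u
lo = edges(idx);
hi = edges(idx + 1);
samples = lo + rand(N,1) .* (hi - lo);    % spread uniformly within the chosen bin
```

Empty bins are never selected, because the CDF does not increase across them. A finer or coarser binning trades resolution against noise in the estimated probabilities.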

If you have the Statistics and Machine Learning Toolbox, you can estimate the PDF of the data with a kernel distribution:
Percip_pd = fitdist(Percip,'Kernel');
Then use it to generate N random numbers from the same distribution:
y = random(Percip_pd,N,1);
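A kernel estimate with unbounded support can place some probability below zero, so values drawn for a non-negative quantity such as precipitation may occasionally come out negative. Here is a minimal sketch of one way to deal with that, assuming the Statistics and Machine Learning Toolbox is installed. The stand-in `Percip` vector is hypothetical, and clipping at zero is an illustrative choice rather than part of the answer above.

```matlab
rng(0);                                      % reproducible illustration
Percip = max(0, -2*log(rand(500,1)) - 1);    % hypothetical stand-in with many zero (dry) values

Percip_pd = fitdist(Percip, 'Kernel');       % kernel density estimate of the data
N = 1000;                                    % number of values to generate
y = random(Percip_pd, N, 1);                 % draw from the fitted distribution
y = max(y, 0);                               % clip any negative draws to zero (assumption)
```

Clipping moves a small amount of probability mass onto exactly zero, which may or may not be acceptable depending on how the generated series will be used.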

Quoting @AnonSubmitter85:
"estimate the CDF by integrating. Finally, you can choose a uniformly
distributed random number between 0 and 1 and estimate the argument of
the CDF that would yield that number. That is, if y is the random
number you choose, then you want to find x such that CDF(x) = y. The
value of x will be a random number with the desired PDF."
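% Random sampling by inverting a discrete CDF
N = 10;                        % number of resamples
x = linspace(-4, 4, 100);      % grid on which the PDF is tabulated
pdf = exp(-x.^2/2);            % your PDF (here an unnormalised standard normal)
pdf = pdf / sum(pdf);          % normalise so that the values sum to 1
s = cumsum(pdf);               % its cumulative distribution
s(end) = 1;                    % guard against round-off
r = rand(N,1);                 % random numbers between 0 and 1
indices = zeros(N,1);
for ii = 1:N
    indices(ii) = find(s >= r(ii), 1);   % first CDF value that reaches the random number
end
resamples = x(indices)         % the resamples (values of x, not of the PDF)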

Related

Auto-correlation of each column of a matrix in matlab

My data is stored in a matrix (900 x 10) called input_data_matrix. Each column holds a time series of 900 random signal values (integer light readings).
I want to compute the correlation among the 900 readings within each column independently (not the correlation with the other columns' readings), so that I get 10 correlation values, one per column, indicating how strongly the 900 readings in that column are correlated.
So my question is how to compute this in MATLAB, and which type of correlation is best suited for it.
If I have understood correctly, what you want is the autocorrelation of each column of your input data. In that case, I would use the xcorr function (https://es.mathworks.com/help/signal/ref/xcorr.html), which computes the autocorrelation of a given vector. Note that for a vector of length m it returns 2*m-1 values (lags -(m-1) to m-1), so the output matrix needs that many rows. The code would be the following:
[m, n] = size(input_data_matrix);
output_matrix = zeros(2*m-1, n);   % one column of 2*m-1 lags per input column
for i = 1:n
    output_matrix(:,i) = xcorr(input_data_matrix(:,i));
end
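If a single number per column is wanted rather than the full lag sequence, one option is the lag-1 autocorrelation coefficient. Below is a minimal sketch in base MATLAB (no toolbox), assuming R2016b or later for implicit expansion. The stand-in data and the choice of lag 1 are assumptions for illustration and are not specified in the question.

```matlab
rng(0);                                            % reproducible illustration
input_data_matrix = randi([0 255], 900, 10);       % hypothetical stand-in light readings

lag = 1;                                           % lag to evaluate (assumption)
X = input_data_matrix - mean(input_data_matrix, 1);    % remove each column's mean

% Sample autocorrelation at the chosen lag, one value per column (1 x 10)
r = sum(X(1:end-lag,:) .* X(1+lag:end,:), 1) ./ sum(X.^2, 1);
```

Values near 0 suggest that consecutive readings are close to uncorrelated, while values near 1 or -1 indicate strong positive or negative dependence between neighbouring readings.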

Shannon entropy for a vector

I'd like to calculate the Shannon entropy of a vector (psi) over the time period. According to this reference,
I can calculate the entropy for every single element of psi using a loop that computes the entropy at every point. What I want to understand is how to set up the probability of psi(tk) lying in a certain bin, and how to set the total number of bins.
I tried using MATLAB's histcounts command, which generates suitable bins ("[N,edges] = histcounts(psi)"), but I don't know how to proceed from there. How do I get the probability of each element being in the xth bin?
Here is my current code:
% get the number of bins
[N,edges] = histcounts(psi)
%// Compute probability
h = hist(psi);
pdf = h / length(psi);
%// Set any entries that are 0 to 1 so that log calculation equals 0.
pdf(pdf == 0) = 1;
e=[];
%// Calculate entropy
for i=1:length(N)
e(i) = -sum(pdf(i).*log2(pdf(i)));
end
any ideas?
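One possible way to proceed, sketched under assumptions rather than taken from the reference mentioned in the question: use a single call to `histcounts` for both the bin counts and each element's bin index, so that the probabilities and the bins always match. `psi` below is a hypothetical stand-in signal, and the automatic binning of `histcounts` is used (a bin count can be passed as the second argument instead).

```matlab
rng(0);                                  % reproducible illustration
psi = randn(1, 1000);                    % hypothetical stand-in for the signal psi

[N, edges, bin] = histcounts(psi);       % counts, bin edges, and bin index of each element
p = N / sum(N);                          % probability of a value falling in each bin

pElem = p(bin);                          % probability of the bin that each psi(tk) lies in

pPos = p(p > 0);                         % skip empty bins (0*log2(0) is taken as 0)
H = -sum(pPos .* log2(pPos));            % Shannon entropy of the binned signal, in bits
```

The entropy obtained this way depends on the binning: more bins generally give a larger value, so the bin count should be kept fixed when comparing signals.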

how to generate n samples from a line segment using Matlab

Given n = 100 samples, how do we generate these random samples on the line segment below using MATLAB?
line_segment:
x between -1 and 1, y=2
If you want to generate n random samples between two given limits (in your question, -1 and 1), you can use the function rand.
Here is an example:
% Define minimum x value
x_min=-1
% Define maximum x value
x_max=1
% Define the number of sample to be generated
n_sample=100
% Generate the samples
x_samples = sort(x_min + (x_max-x_min).*rand(n_sample,1))
In the example, the sort function is called to put the values in ascending order.
x_min and (x_max-x_min) are used to shift and scale the random values so that they fall within the desired interval (in this case, -1 to 1), since rand returns random numbers on the open interval (0,1).
If you want an XY matrix composed of the random samples and the constant y value (2):
y_val=2;
xy=[x_samples ones(length(x_samples),1)*y_val]
plot([x_min x_max],[y_val y_val],'linewidth',2)
hold on
plot(xy(:,1),xy(:,2),'d','markerfacecolor','r')
grid on
legend({'xy segment','random samples'})
(In the picture, only 20 samples have been plotted to make it clearer.)
Hope this helps.

sum of MATLAB gaussian distribution is greater than 1

I want to compare the actual distribution of a time series with the normal distribution having the same mean and standard deviation. The interval on which I compute this Gaussian distribution starts at the minimum and ends at the maximum value of the time series.
The problem is that I obtain a Gaussian which has the classic bell shape but is shifted upwards, since the values of the normal pdf sum to about 9.
Here is my code:
N = 30; %number of segments in the interval
DISTR = struct('interval',NaN(size(shares.ret,1),N-1),'perf',NaN(size(shares.ret,1),N-1),'normal',NaN(size(shares.ret,1),N-1));
first_ret = table2array(rowfun(@(x) find(~isnan(x),1),shares.ret,'SeparateInputs',false)); %THIS LINE allows to calculate the distribution for EACH FUND on his own time horizon
for i = 1:size(shares.ret,1)
xbins = linspace(min(shares.ret{i,first_ret(i):end},[],2),max(shares.ret{i,first_ret(i):end},[],2),N);
y = (xbins(2)-xbins(1))/2;
DISTR.interval(i,:) = xbins(1:end-1)+y;
DISTR.perf(i,:) = hist(shares.ret{i,first_ret(i):end},DISTR.interval(i,1:end))/sum(~isnan(shares.ret{i,first_ret(i):end}),2);
DISTR.normal(i,:) = normpdf(DISTR.interval(i,:),mean(shares.ret{i,first_ret(i):end}),std(shares.ret{i,first_ret(i):end}));
end
I found a similar question, but I didn't understand how to adapt it to my case.
Any suggestion/help will be really appreciated.
Thanks
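One likely explanation, offered as a sketch rather than a diagnosis of the exact code above: normpdf returns a density (probability per unit of x), whereas the normalised histogram holds a probability per bin. The two become comparable once the density is multiplied by the bin width. The example below uses hypothetical stand-in returns and writes out the normal density by hand so that no toolbox is needed.

```matlab
rng(0);                                        % reproducible illustration
ret = 0.01 + 0.05*randn(1, 2000);              % hypothetical stand-in for one fund's returns

N = 30;                                        % number of edges, giving N-1 bins
edges   = linspace(min(ret), max(ret), N);
width   = edges(2) - edges(1);                 % bin width
centers = edges(1:end-1) + width/2;            % bin centres

perf = histcounts(ret, edges) / numel(ret);    % empirical probability per bin

mu = mean(ret);
sigma = std(ret);
dens = exp(-(centers - mu).^2 / (2*sigma^2)) / (sigma*sqrt(2*pi));   % normal density
normalProb = dens * width;                     % approximate probability per bin
```

Here `sum(dens)` is roughly `1/width`, which is far above 1 when the bins are narrow, while `sum(normalProb)` is close to 1 and sits on the same scale as `perf`.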

how to create a histogram in matlab with required number of cells?

I am new to MATLAB and I am running a geometric simulation with k = 2 and p = 1/5.
I have to generate 1000 random numbers and show them in a histogram with 15 cells. This is what I have so far:
K = 2;
P = 1/5;
R = geornd(P,K,1000);
Now I am trying to show these results in a histogram with 15 cells, but I don't know how to do it. Please help.
EDIT:
To get the histogram I used:
hist(Sc,15), and this is the result:
According to the doc for geornd, you need to provide the function with a probability parameter P (here 1/5) and a vector dictating the size of the output you want, so it looks like your K is not used correctly in this context.
If you want 1000 random values distributed according to geornd, you might want to use this instead:
R = geornd(0.2,[1 1000]); % P of 0.2 and array of 1 x 1000 numbers
hist(R,15)
Which gives the following:
If you do want to generate 2 distributions, then you can calculate them all at once and plot them separately:
R = geornd(0.2,[2 1000]);
% Plot 1st distribution:
hist(R(1,:),15)
% Plot 2nd distribution in a new figure so the first plot is not overwritten:
figure
hist(R(2,:),15)