I want to sample only from the tails ([-5sigma,-3sigma] and [3sigma,5sigma]) of a normal distribution when I run a Monte Carlo simulation, so rejection sampling comes to mind. I am, however, struggling to implement this in Matlab. Up to now I have been using something similar to the code below (which I know isn't rejection sampling), but would rejection sampling be a better way to solve this issue?
function [new_E11] = elasticmodulusrng()
new_E11 = normrnd(136e9,9.067e9,[1 1]);
while new_E11>=136e9-3*9.067e9 && new_E11<=136e9+3*9.067e9
new_E11 = normrnd(136e9,9.067e9,[1 1]);
end
Thanks
Edit: Using code in the Answer
mu=0
sigma=1
%get scaling factor
scale=(normcdf(5*sigma+mu,mu,sigma)-normcdf(3*sigma+mu,mu,sigma))*2
%define pdf
cpdf=@(x)((1/scale)*normpdf(x,mu,sigma).*((abs(x-mu)<5.*sigma)&(abs(x-mu)>3.*sigma)))
%get cdf via integral
ccdf=@(x)integral(cpdf,mu-5.*sigma,x)
%allow vector inputs
ccdf=@(x)arrayfun(ccdf,x)
%inverse cdf
icdf=@(y)(fzero(@(x)(ccdf(x)-y),.5));
%allow vector inputs
icdf=@(x)arrayfun(icdf,x);
%icdf is very slow, thus evaluate some numbers and use the cached and interpolated version:
x=0:0.01:1;
y=icdf(x);
icdf=@(uni)interp1(x,y,uni);
%plot some example data
hist(icdf(rand(10000000,1)),1000);
The accuracy is not what I expected, but I'll leave it here. Maybe someone is able to improve the code.
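For the tail-sampling question above, one alternative that avoids both the rejection loop and the numerical inversion is inverse-transform sampling with norminv. The sketch below is not the method from the answer: it draws a uniform number between the CDF values at 3 and 5 standard deviations, maps it through norminv, and then picks the left or right tail with equal probability. The variable names (nSamples, pLo, pHi, sgn) are illustrative, and the Statistics Toolbox functions normcdf and norminv are assumed to be available.

```matlab
% Inverse-transform sampling of the two tails [mu-5s, mu-3s] and [mu+3s, mu+5s].
% mu and sigma are taken from the question; nSamples is an illustrative choice.
mu = 136e9;
sigma = 9.067e9;
nSamples = 1e5;

% Standard-normal CDF values that bound the upper tail
pLo = normcdf(3);
pHi = normcdf(5);

% Uniform draws restricted to [pLo, pHi], mapped back to z in [3, 5]
u = pLo + (pHi - pLo) .* rand(nSamples, 1);
z = norminv(u);

% Choose the left or right tail with probability 1/2 each
sgn = 2 * (rand(nSamples, 1) < 0.5) - 1;

samples = mu + sigma .* sgn .* z;
```

Because every draw is accepted, the cost does not depend on how small the tail probability is, whereas the while loop in the question rejects roughly 99.7% of its draws.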
I've been running a variation of the doseResponse function downloaded from here to generate dose-response sigmoid curves. However, one of my datasets produces a linear curve instead. Running the following code gives the warning and the graph shown below. I also uploaded the data, called dose2.csv and resp2.csv, to Google Drive here. Does anyone know how I can fix this? Thanks.
Code to generate graph
% Plotting Dose-Response Curve
response = resp2;
dose = dose2;
% Deal with 0 dosage by using it to normalise the results.
normalised=0;
if (sum(dose(:)==0)>0)
%compute mean control response
controlResponse=mean(response(dose==0));
%remove controls from dose/response curve
response=response(dose~=0)/controlResponse;
dose=dose(dose~=0);
normalised=1;
end
%hill equation sigmoid
sigmoid=@(beta,x)beta(1)+(beta(2)-beta(1))./(1+(x/beta(3)).^beta(4));
%calculate some rough guesses for initial parameters
minResponse=min(response);
maxResponse=max(response);
midResponse=mean([minResponse maxResponse]);
minDose=min(dose);
maxDose=max(dose);
%fit the curve and compute the values
%[coeffs,r,J]=nlinfit(dose,response,sigmoid,[minResponse maxResponse midResponse 1]); % nlinfit doesn't work as well
beta_new = lsqcurvefit(sigmoid,[minResponse maxResponse midResponse 1],dose,response);
[coeffs,r,J]=nlinfit(dose,response,sigmoid, beta_new);
ec50=coeffs(3);
hillCoeff=coeffs(4);
%plot the fitted sigmoid
xpoints=logspace(log10(minDose),log10(maxDose),1000);
semilogx(xpoints,sigmoid(coeffs,xpoints),'Color',[1 0 0],'LineWidth',2)
hold on
%notate the EC50
text(ec50,mean([coeffs(1) coeffs(2)]),[' \leftarrow ' sprintf('EC_{50}=%0.2g',ec50)],'FontSize',20,'Color',[1 0 0]);
%plot mean response for each dose with standard error
doses=unique(dose);
meanResponse=zeros(1,length(doses));
stdErrResponse=zeros(1,length(doses));
for i=1:length(doses)
responses=response(dose==doses(i));
meanResponse(i)=mean(responses);
stdErrResponse(i)=std(responses)/sqrt(length(responses));
%stdErrResponse(i)=std(responses);
end
errorbar(doses,meanResponse,stdErrResponse,'o','Color',[1 0 0],'LineWidth',2,'MarkerSize',12)
Warning Message
Solver stopped prematurely.
lsqcurvefit stopped because it exceeded the function evaluation limit,
options.MaxFunctionEvaluations = 4.000000e+02.
Warning: Iteration limit exceeded. Returning results from final iteration.
Graph (looking to generate a sigmoid curve not linear)
You also need to optimize your initial value [minResponse maxResponse midResponse 1] for lsqcurvefit. Don't simply start from the minimum or maximum of the given values. Instead, use the model equation to estimate the coefficients.
Consider the sigmoid model sigmoid=@(beta,x)beta(1)+(beta(2)-beta(1))./(1+(x/beta(3)).^beta(4)). For a positive beta(4), the model approaches beta(1) as x goes to inf and beta(2) as x goes to 0 (the two limits swap when beta(4) is negative). Either way, beta(1) and beta(2) are the two plateaus, so the initial estimates minResponse, maxResponse, and midResponse seem reasonable enough. The actual problem lies in the initial estimate of 1 for beta(4), which can be roughly estimated from the slope of your log graph. From my rough sketch it was around 1/4, so an initial estimate of 1 was too large for convergence.
beta_new = lsqcurvefit(sigmoid,[minResponse maxResponse midResponse 1/4],dose,response);
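The warning in the question also says that lsqcurvefit hit its function evaluation limit of 400. A minimal sketch of how that limit might be raised with optimoptions follows. The synthetic data, the "true" parameters, and the bounds are hypothetical and only there to make the example self-contained; the Optimization Toolbox is assumed. Raising the limits does not replace a sensible starting point, it only gives the solver more room.

```matlab
% Hill-equation sigmoid, as in the question
sigmoid = @(beta, x) beta(1) + (beta(2) - beta(1)) ./ (1 + (x ./ beta(3)).^beta(4));

% Synthetic, noise-free data (hypothetical values, for illustration only)
trueBeta = [0.1 1.0 2.0 0.5];
dose = logspace(-2, 2, 40);
response = sigmoid(trueBeta, dose);

% Initial guess in the same spirit as the answer: plateaus from the data,
% and a Hill coefficient of 1/4 instead of 1
beta0 = [min(response) max(response) mean([min(response) max(response)]) 1/4];

% Assumed bounds: keep beta(3) and beta(4) non-negative so that
% (x/beta(3)).^beta(4) stays real-valued during the iterations
lb = [-Inf -Inf eps 0];
ub = [ Inf  Inf Inf Inf];

% Raise the limits that triggered the "stopped prematurely" warning
opts = optimoptions('lsqcurvefit', ...
    'MaxFunctionEvaluations', 5000, ...
    'MaxIterations', 1000, ...
    'Display', 'off');

[betaFit, resnorm, ~, exitflag] = lsqcurvefit(sigmoid, beta0, dose, response, lb, ub, opts);
```

Checking exitflag after the call shows whether the solver converged or stopped at one of the limits again.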
I have a signal 's' of voice of which you can see an extract here:
I would like to plot the zero crossing points in the same graph. I have tried with the following code:
zci = @(v) find(v(:).*circshift(v(:), [-1 0]) <= 0); % Returns Zero-Crossing Indices Of Argument Vector
zx = zci(s);
figure
set(gcf,'color','w')
plot(t,s)
hold on
plot(t(zx),s(zx),'o')
But it does not interpolate the points at which the sign changes, so the result is:
However, I'd like the highlighted points to be as near as possible to zero.
I hope someone can help me. Thank you in advance for your responses.
Try this?
w = 1;
crossPts=[];
for k=1:(length(s)-1)
if (s(k)*s(k+1)<0)
crossPts(w) = (t(k)+t(k+1))/2;
w = w + 1;
end
end
figure
set(gcf,'color','w')
plot(t,s)
hold on
plot(crossPts, zeros(size(crossPts)), 'o')
Important questions: what is the highest frequency component of the signal you are measuring? Can you remeasure this signal? What is your sampling rate? What is this analysis for (schoolwork or scholarly research)? You may have quite a bit of trouble measuring the zeros of this function with any significance or accuracy, because it looks like your waveform has a frequency greater than half of your sampling rate (greater than your Nyquist frequency). Upsampling/interpolating your entire waveform will let you locate the zeros more precisely (but with no greater accuracy), and this is a huge no-no in the scientific community. While my method may not look super pretty, it's the most accurate method that doesn't make unsafe assumptions. If you just want it to look pretty, I would recommend interp1 with the 'spline' method. You can interpolate the whole waveform and then use the above answer to find more accurate zeros.
Also, you could calculate the zeros on the interpolated waveform and then display it on the raw data.
A remotely possible solution to improve your data:
If you're measuring a human voice, why not try filtering at the range of human speech? This should be fine mathematically and could possibly improve your waveform.
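For the zero-crossing question above, a middle ground between taking the midpoint and interpolating the whole waveform is to interpolate linearly between only the two samples that bracket each sign change. The sketch below uses a synthetic sine wave in place of the voice signal; t, s, k, and tCross are illustrative names, and the straight-line assumption between samples is exactly that, an assumption.

```matlab
% Synthetic stand-in for the sampled signal (illustrative only)
t = linspace(0, 1, 50);
s = sin(2*pi*3*t + 0.4);

% Indices k where the sign changes between sample k and sample k+1
k = find(s(1:end-1) .* s(2:end) < 0);

% Time at which the straight line through (t(k), s(k)) and
% (t(k+1), s(k+1)) crosses zero
tCross = t(k) - s(k) .* (t(k+1) - t(k)) ./ (s(k+1) - s(k));

% Plot the signal and mark the estimated crossings on the zero line
figure
set(gcf, 'color', 'w')
plot(t, s)
hold on
plot(tCross, zeros(size(tCross)), 'o')
hold off
```

The markers now sit on the zero line at the interpolated times, rather than on the nearest sample, which is what the question asked for visually.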
I'm just learning Matlab and the fast fourier transform algorithm.
As a first step I tried to duplicate this example: https://en.wikipedia.org/wiki/Fourier_transform#Example
I use the following code:
t = -6:0.01:6;
s = cos(2 * pi * 3 * t) .* exp(-pi * t.^2);
figure(1);
plot(t, s);
xlim([-2 2]);
r = fft(s);
figure(2);
plot(t, abs(r));
And I obtained the following picture:
Figure 2:
Figure 1 is OK, but Figure 2 is not. I see that one of the problems is that in Figure 2 I should plot vector r against frequency, not against time. Another problem in Figure 2 is the scale of the Y-axis.
Thus, I have 2 questions in order to duplicate the example:
How can I obtain the frequency domain (X-axis in Figure 2)?
How should I scale vector r (Y-axis in Figure 2)?
Your issue is that you aren't actually creating a frequency vector to plot the fft against. The reason that the fft is plotted against time is because that is what you specified in your plot command.
Here is a working fft outline:
data=s; %assumes the signal s and time vector t from the question
SamplingRate=1/(t(2)-t(1)); %samples per unit time, from the step of t
N=length(t);
index=0:N-1;
FrequencyResolution=SamplingRate/N;
Frequency=index.*FrequencyResolution;
data_fft=fft(detrend(data));
%the detrend isn't necessary but it does look nicer because it focuses the plot on changes around the mean of the data
data_FFTmagnitude=abs(data_fft);
plot(Frequency, data_FFTmagnitude)
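On the second question (the Y-axis scale), a sketch of one common convention: multiplying the FFT by the sampling interval makes the discrete sum approximate the continuous Fourier integral, and fftshift puts zero frequency in the middle so the axis runs over negative and positive frequencies. The names dt, fs, f, and S are illustrative. For the signal in the question, the analytic transform has peaks of height about 0.5 at plus and minus 3.

```matlab
% Signal from the question
t = -6:0.01:6;
s = cos(2*pi*3*t) .* exp(-pi * t.^2);

dt = t(2) - t(1);      % sampling interval
fs = 1 / dt;           % sampling rate
N  = numel(s);

% Frequency axis matching the ordering produced by fftshift
f = (-floor(N/2):ceil(N/2)-1) * (fs / N);

% Scale by dt so that the sum approximates the Fourier integral;
% abs() discards the phase introduced by t not starting at zero
S = dt * abs(fftshift(fft(s)));

figure
plot(f, S)
xlim([-6 6])
xlabel('Frequency')
ylabel('|S(f)|')

% Location and height of the largest peak
[peakValue, iPeak] = max(S);
fprintf('Peak of %.3f at f = %.3f\n', peakValue, f(iPeak));
```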
When I first wanted to use the DFT and FFT for one of my study projects, I used this webpage, which explains in detail, with examples, how to do so. I suggest you go through it and try to replicate it for your case; since you said you are new to Matlab, doing so will give you insight into how the FFT is used. Do not hesitate to ask again if you need more detailed help.
Also keep in mind that the FFT runs fastest when the signal length is a power of 2. The length does not make the result more exact, only quicker to compute. If you cannot control your signal length, a common approach is to zero-pad to the next power of 2 (see nextpow2) or to truncate to the largest power of 2 below that length.
I am trying to simulate a distribution for the parameter theta, f = theta^(z_f+n+alpha-1)*(1-theta)^(n+1-z_f-k+beta-1), where all the parameters except theta are known. I am using the Metropolis-Hastings algorithm to do the MCMC simulation. My proposal density is a beta distribution with parameters alpha and beta. My code for the simulation is as follows. I am using a built-in Matlab function called mhsample() for this purpose. How do I know if my code is working properly?
clear
clc
alpha=2;
beta=2;
z_f=1;
n=6;
k=5;
nsamples = 3000;
pdf= @(x) x.^(z_f+n+alpha-1).*(1-x).^(n+1-z_f-k+beta-1); % here x acts as theta; elementwise so it also accepts vectors
proppdf= @(x,y) betapdf(x, alpha, beta);
proprnd =@(x) betarnd(alpha,beta,1);
smpl = mhsample(0.1,nsamples,'pdf',pdf,'proprnd',proprnd,'proppdf',proppdf);
I'm unsure of what you're asking when you say "how do I know if my code is working properly" -- I'm assuming it executes? But for a visual comparison of your function vs. the simulation, you can plot both the PDF and the data you got from mhsample as follows:
% I'm assuming you ran the code above so that smpl and pdf are both defined...
fplot(pdf,[0 1]); % fplot takes your function and plots it between x-limit [0,1]
figure % new figure
hist(smpl,30); % 30 here is the number of bins, change it to your preference
Figure below:
the histogram of smpl's output on left, i.e., your simulation
the function pdf bounded in [0,1] on right for comparison to your simulation
This was just a wild guess because those two figures resemble each other and are also beta-distribution-esque.
If you want a more complex analysis than that, I'm afraid I'm not yet proficient in MCMC :)
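A slightly more quantitative check is possible here because the target in the question is the unnormalised kernel of a Beta(a, b) density with a = z_f+n+alpha = 9 and b = n+1-z_f-k+beta = 3. The sketch below overlays the normalised histogram on betapdf and compares the sample mean with the exact mean a/(a+b) = 0.75. The names a, b, and xs, the burn-in of 500, and the fixed random seed are choices made for this illustration; the Statistics Toolbox is assumed.

```matlab
rng(0);  % fixed seed so the run is repeatable (illustrative choice)

% Parameters from the question
alpha = 2; beta = 2; z_f = 1; n = 6; k = 5;

% The target is proportional to a Beta(a, b) density
a = z_f + n + alpha;          % 9
b = n + 1 - z_f - k + beta;   % 3

pdf     = @(x) x.^(a-1) .* (1-x).^(b-1);      % unnormalised target
proppdf = @(x, y) betapdf(x, alpha, beta);    % independence proposal density
proprnd = @(x) betarnd(alpha, beta, 1);       % draw from the proposal

smpl = mhsample(0.1, 3000, 'pdf', pdf, 'proprnd', proprnd, ...
                'proppdf', proppdf, 'burnin', 500);

% Histogram normalised to unit area, with the exact density on top
figure
histogram(smpl, 30, 'Normalization', 'pdf');
hold on
xs = linspace(0, 1, 200);
plot(xs, betapdf(xs, a, b), 'LineWidth', 2);
hold off

fprintf('sample mean %.3f, exact mean %.3f\n', mean(smpl), a / (a + b));
```

If the sampler is working, the bars should follow the curve and the two means should agree to within sampling noise.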
I am trying to fit a line to some data without using polyfit and polyval. I got some good help already on how to implement this and I have gotten it to work with a simple sin function. However, when applied to the function I am trying to fit, it does not work. Here is my code:
clear all
clc
lb=0.001; %lowerbound of data
ub=10; %upperbound of data
step=.1; %step-size through data
a=.03;
la=1482/120000; %1482 is speed of sound in water and 120kHz
ep1=.02;
ep2=.1;
x=lb:step:ub;
r_sq_des=0.90; %desired value of r^2 for the fit of data without noise present
i=1;
for x=lb:step:ub
G(i,1)= abs(sin((a/la)*pi*x*(sqrt(1+(1/x)^2)-1)));
N(i,1)=2*rand()-1;
Ghat(i,1)=(1+ep1*N(i,1))*G(i,1)+ep2*N(i,1);
r(i,1)=x;
i=i+1;
end
x=r;
y=G;
V=[x.^0];
Vfit=[x.^0];
for i=1:1:1000
V = [x.^i V];
c = V \ y;
Vfit = [x.^i Vfit];
yFit=Vfit*c;
plot(x,y,'o',x,yFit,'--')
drawnow
pause
end
The first two sections are just defining variables and the function. The second for loop is where I am making the fit. As you can see, I have it pause after every nth order in order to see the fit.
I changed your fit formula a bit, I got the same answers but quickly got
a warning that the matrix was singular. No sense in continuing past
the point that the inversion is singular.
Depending on what you are doing you can usually change out variables or change domains.
This doesn't do a lot better, but it seemed to help a little bit.
I increased the number of samples by a factor of 10 since the initial part of the curve
didn't look sampled highly enough.
I added a weighting variable but it is set to equal weight in the code below. Attempts
to deweight the tail didn't help as much as I hoped.
Probably not really a solution, but perhaps will help with a few more knobs/variables.
...
step=.01; %step-size through data
...
x=r;
y=G;
t=x.*sqrt(1+x.^(-2));
t=log(t);
V=[ t.^0];
w=ones(size(t));
for i=1:1:1000
% Trying to solve for value of c
% c that
% yhat = V*c approximates y
% or y = V*c
% V'*y = V'*V * c
% c = (V'*V) \ V'*y
V = [t.^i V];
c = (V'*diag(w.^2)*V ) \ (V'*diag(w.^2)*y) ;
yFit=V*c;
subplot(211)
plot(t,y,'o',t,yFit,'--')
subplot(212)
plot(x,y,'o',x,yFit,'--')
drawnow
pause
end
It looks more like a frequency estimation problem, and trying to fit an unknown frequency
with a polynomial tends to be touch and go. Replacing the polynomial basis with a quick
sin/cos basis didn't seem to do too badly.
V = [sin(t*i) cos(t*i) V];
Unless you specifically need a polynomial basis, you can apply your knowledge of the problem domain to find other potential basis functions for your fit, or to attempt to make the domain in which you are performing the fit more linear.
As dennis mentioned, a different set of basis functions might do better. However, you can improve the polynomial fit by using QR factorisation, rather than just \, to solve the matrix equation. It is a badly conditioned problem no matter what you do, however, and using smooth basis functions won't allow you to accurately reproduce the sharp corners in the actual function.
clear all
close all
clc
lb=0.001; %lowerbound of data
ub=10; %upperbound of data
step=.1; %step-size through data
a=.03;
la=1482/120000; %1482 is speed of sound in water and 120kHz
ep1=.02;
ep2=.1;
x=logspace(log10(lb),log10(ub),100)';
r_sq_des=0.90; %desired value of r^2 for the fit of data without noise present
y=abs(sin(a/la*pi*x.*(sqrt(1+(1./x).^2)-1)));
N=2*rand(size(x))-1;
Ghat=(1+ep1*N).*y+ep2*N;
V=[x.^0];
xs=(lb:.01:ub)';
Vfit=[xs.^0];
for i=1:1:20%length(x)-1
V = [x.^i V];
Vfit = [xs.^i Vfit];
[Q,R]=qr(V,0);
c = R\(Q'*y);
yFit=Vfit*c;
plot(x,y,'o',xs,yFit)
axis([0 10 0 1])
drawnow
pause
end
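To see how badly conditioned the polynomial basis is, one can look at the condition number of the Vandermonde matrix directly. The sketch below also shows a standard mitigation that neither answer uses: centring and scaling x before raising it to powers (polyfit does the same when asked for its third output). The degree of 10 and the names z, Vraw, and Vscaled are illustrative.

```matlab
% Same data range as the question, illustrative polynomial degree
x = linspace(0.001, 10, 100)';
degree = 10;
powers = degree:-1:0;

% Centre and scale x so its values lie roughly in [-2, 2]
z = (x - mean(x)) / std(x);

% Vandermonde matrices: column j holds x.^powers(j) (or z.^powers(j))
Vraw    = bsxfun(@power, x, powers);
Vscaled = bsxfun(@power, z, powers);

condRaw    = cond(Vraw);
condScaled = cond(Vscaled);

fprintf('cond(V) with raw x:    %.3g\n', condRaw);
fprintf('cond(V) with scaled x: %.3g\n', condScaled);
```

The scaled matrix is still ill conditioned at high degree, so this delays the singular-matrix warning rather than removing it, which is consistent with the advice above to consider other basis functions.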