Calculating confidence intervals from the empirical distribution obtained with the bootstrap method

I have calculated the empirical distribution of the sample mean using the bootstrap method, and now I need to calculate a confidence interval for the population mean from that empirical distribution.
Is there a way to do this automatically in MATLAB, given what I already have? If not, how would you find the 95% confidence interval for the population mean?

The bootstrapped confidence intervals for the mean as you have calculated it are the quantiles of the distribution. So, it can be as simple as
quantile(myBootstrappedMeans, [0.05, 0.95])
That gives a 90% percentile confidence interval from the vector myBootstrappedMeans. For reference, see http://math.usask.ca/~longhai/doc/talks/slide-bootstrap.pdf
The values 0.05 and 0.95 bound the middle 90% of the bootstrapped means. For a different confidence level, choose the quantiles that bound the corresponding middle portion: for the 95% interval you asked about, use 0.025 and 0.975. In general, use (1-level)/2 and (0.5 + level/2), where level is the confidence level you want (for example 0.95).
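As a minimal sketch of the percentile approach described above, the following builds the bootstrapped means from scratch and then takes the quantiles for a chosen level. The sample data, the number of resamples (nBoot), and the variable names are assumptions made for illustration; quantile requires the Statistics Toolbox.

```matlab
rng(0);                                  % fixed seed so the sketch is repeatable
sample = randn(200, 1).^2;               % hypothetical skewed sample (assumption)
n      = numel(sample);
nBoot  = 2000;                           % number of bootstrap resamples (assumption)

% Each column of idx holds the indices of one resample drawn with replacement.
idx = randi(n, n, nBoot);
myBootstrappedMeans = mean(sample(idx), 1);   % 1 x nBoot vector of resampled means

% Percentile interval for the requested confidence level.
level = 0.95;
ci = quantile(myBootstrappedMeans, [(1 - level)/2, 0.5 + level/2]);
```

Here ci(1) and ci(2) are the lower and upper limits; changing level to 0.90 reproduces the [0.05, 0.95] call shown earlier.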

Related

Calculate 95% Confidence Interval from Diagnostic Test Data

I'm trying to calculate 95% confidence intervals from a series of matrices in MATLAB:
I know how to calculate the sensitivity, specificity, negative predictive value and positive predictive value, but I'm not sure how to calculate the 95% confidence interval from these data.
Any help would be greatly appreciated.

You need the whole distribution to calculate your confidence intervals.
Once you have it, call it "data" and run the following code.
The code handles NaNs in the data and works with vectors (Nx1) or matrices (NxM), as long as each distribution is stored in a column.
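%%% calculate CI (one interval per column of data)
alpha_lvl = [2.5 97.5]; % lower and upper percentiles bounding the 95% CI
n = sum(~isnan(data), 1); % number of non-NaN observations in each column
SEM = nanstd(data)./sqrt(n); % standard error of the mean
ts = [tinv(alpha_lvl(1)/100, n-1); tinv(alpha_lvl(2)/100, n-1)]; % t-scores, one column per distribution
CI = bsxfun(@plus, nanmean(data), bsxfun(@times, SEM, ts)); % row 1: lower limits, row 2: upper limits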

How to use the Fourier transform to predict a future trend by fitting the dominant frequency

I have several depth profiles measured at the same location. These profiles were measured as a time series, but not at even time intervals. I decomposed the profiles with an FFT and found the dominant frequency. I then found that the amplitudes of the dominant frequency across the profiles follow an exponential regression. How can I use this to predict the future development of the profiles? The idea is that you give a time and get a profile back.
My idea is to fit the amplitude against time. But I can only fit the real part of the complex coefficient, not the imaginary part. As soon as I change the value of the imaginary part and then take the inverse FFT, the profile disappears, and I don't know why.
Here is part of the program:
%fitting%forecast
pp1=polyfit(xx,YY1,1);
%pp2=polyfit(xx,YY2,1);
xx=log([5000 10000 20000 30000 50000 100000 500000]);
p1=pp1(1)*xx+pp1(2);
%p2=(pp2(1)*xx+pp2(2))*1i;
p=p1;%+p2;
%kk2=imag(Y1);
kk1=real(Y1);
%kk2(:,6)=kk2(:,5);
%kk2(:,7)=kk2(:,5);
kk1(:,6)=kk1(:,5);
kk1(:,7)=kk1(:,5);
kk1(5,:)=p;
kk1(91,:)=p;
kk=kk1;%+kk2*1i;
%as long as I fit the imaginary part the profile becomes strange
k=ifft(kk);
plot(a1)
hold on
plot(k,'*')
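One possible cause, offered here as an assumption because the full program is not shown: the FFT of a real signal is conjugate symmetric, so when a bin is modified, its mirror bin must receive the complex conjugate of the new value, otherwise ifft no longer returns a real profile. With real-only values the conjugate equals the value itself, which would explain why writing the same p into rows 5 and 91 works for the real part only. The sketch below uses a made-up signal of length 94 (chosen so that bins 5 and 91 mirror each other); all names are illustrative.

```matlab
N = 94;                                  % assumed length, so bin 5 mirrors bin 91
t = (0:N-1)';
x = cos(2*pi*4*t/N + 0.7);               % hypothetical real profile with one dominant frequency
X = fft(x);

k       = 5;                             % dominant bin (MATLAB indices are 1-based)
kMirror = N - k + 2;                     % its mirror bin, here 91
newCoef = 2 * X(k);                      % new complex amplitude, e.g. from a fit

% Correct: mirror bin gets the complex conjugate.
Xgood = X;  Xgood(k) = newCoef;  Xgood(kMirror) = conj(newCoef);
% Incorrect: same complex value written into both bins.
Xbad  = X;  Xbad(k)  = newCoef;  Xbad(kMirror)  = newCoef;

xGood = ifft(Xgood);                     % real, equal to 2*x up to rounding
xBad  = ifft(Xbad);                      % has a large imaginary part
```

Comparing max(abs(imag(xGood))) with max(abs(imag(xBad))) shows the difference; plotting a complex vector with plot draws real against imaginary parts, which may be why the profile appears to vanish.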

Matlab: finding phase difference using cpsd

From my understanding, when using the cpsd function as such:
[Pxy,f] = cpsd(x,y,window,Ns,NFFT,Fs);
MATLAB chops the time series into smaller windows of the size you specify, and the windows are shifted by Ns data points. The final [Pxy, f] is an average of the results obtained from the individual windows. Please correct me if I am wrong about this process.
My question is: if I use angle(Pxy) at a specific frequency, say 34 Hz, does that give me the phase difference between signals x and y at 34 Hz?
I have doubts about this because Pxy is an average over the individual windows, and each window is offset by the window shift. Doesn't that mean the phase of the averaged Pxy is affected by the window shift?
I've tried to correct for this by ensuring that the window shift corresponds to an integer number of full periods at 34 Hz. Is this correct?
And just a little background about what I am doing:
I have numerous time-series pressure measurements, each 60 seconds long at a 1000 Hz sampling rate.
Power spectrum analysis indicates that there is a peak frequency at 34 Hz for each signal (averaged over all windows).
I want to compare the phase differences between the signals at the 34 Hz peak.
FFT analysis of individual windows reveals that this peak frequency moves around, so I am not sure whether cpsd is the correct way to go about this.
I am currently considering using xcorr to calculate the overall time lag between the signals and then calculating the phase difference from that. I have also heard of the Hilbert transform, but I have no idea how that works yet.

Yes, cpsd works.
You can test your result with two input signals that have a known phase difference, such as:
t=[0:0.001:5];
omega=25;
x1=sin(2*pi*omega*t);
x2=sin(2*pi*omega*t+pi/3);
Then check whether the phase shift calculated by cpsd is pi/3.
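The check suggested above can be sketched as follows. Note that the fourth argument of cpsd is the number of overlapping samples (noverlap), so the shift between windows is the window length minus that value. Both signals are cut by the same windows, so the common time offset of each window cancels in the cross-spectrum and does not bias the phase. The window length, overlap, and variable names are assumptions; cpsd and hamming require the Signal Processing Toolbox.

```matlab
Fs = 1000;                               % sampling rate in Hz
t  = (0:1/Fs:5)';                        % 5 seconds of data
f0 = 25;                                 % test frequency in Hz
x1 = sin(2*pi*f0*t);
x2 = sin(2*pi*f0*t + pi/3);              % same frequency, shifted by pi/3

win      = hamming(1000);                % 1-second window (assumption)
noverlap = 500;                          % samples shared by adjacent windows (assumption)
nfft     = 1000;                         % 1 Hz frequency resolution

[Pxy, f] = cpsd(x1, x2, win, noverlap, nfft, Fs);

[~, idx]  = min(abs(f - f0));            % bin closest to the test frequency
phaseEst  = angle(Pxy(idx));             % estimated phase difference in radians
```

The magnitude of phaseEst should be close to pi/3; its sign depends on which signal cpsd conjugates, so it is worth confirming the convention once with a known pair like this before applying it to measured data.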

95% confidence bands around fit from fminunc

I have performed a MLE fit to some data that I have using the fminunc function in matlab and estimated the parameter confidence intervals from the Hessian output. Does anyone have a method for generating 95% confidence bands around the fitted curve?

A crude way to do this is to generate a grid of parameter values over the intervals you have obtained and plot the curve for every point of the grid. This only works if the number of parameters is low. Keep in mind that the resulting band for the curve has a lower confidence level than the individual parameter intervals: with 2 parameters at 95% confidence each, and assuming the estimates are independent, you get roughly 95% * 95% = 90% confidence for the band.
If the curve depends on the parameters monotonically at every point, you can generate the band just by evaluating the curve at the boundary values of the parameter intervals.
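A minimal sketch of the grid approach, assuming a hypothetical two-parameter model y = a*exp(-b*x) and made-up parameter intervals; in practice the model, the estimates, and the intervals would come from the fminunc fit and its Hessian.

```matlab
model = @(p, x) p(1) * exp(-p(2) * x);   % hypothetical fitted model (assumption)
pHat  = [2.0, 0.5];                      % hypothetical parameter estimates
ciA   = [1.8, 2.2];                      % hypothetical 95% interval for a
ciB   = [0.4, 0.6];                      % hypothetical 95% interval for b
x     = linspace(0, 5, 50);

% Grid of parameter combinations covering both intervals.
[A, B] = meshgrid(linspace(ciA(1), ciA(2), 11), linspace(ciB(1), ciB(2), 11));

% Evaluate the curve at every grid point, one row per combination.
curves = zeros(numel(A), numel(x));
for i = 1:numel(A)
    curves(i, :) = model([A(i), B(i)], x);
end

% The band is the pointwise envelope of all curves.
lowerBand = min(curves, [], 1);
upperBand = max(curves, [], 1);
fitCurve  = model(pHat, x);
```

Because this model is monotone in each parameter for x >= 0, the envelope coincides with the curves at the corner values: lowerBand equals model([ciA(1), ciB(2)], x) and upperBand equals model([ciA(2), ciB(1)], x).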

Bootstrap and asymmetric CI

I'm trying to create a confidence interval for a set of data that is not randomly distributed and is heavily right-skewed. Searching around, I found a pretty crude method that uses the 97.5th percentile of the data as the upper confidence limit and the 2.5th percentile as the lower confidence limit.
Unfortunately, I need a more sophisticated way!
Then I discovered the bootstrap, specifically the MATLAB bootci function, but I'm having a hard time understanding how to use it properly.
Let's say that M is my matrix containing my data (19x100), and let's say that:
Mean = mean(M,2);
StdDev = sqrt(var(M'))';
How can I compute the asymmetrical CI for every row of the Mean vector using bootci?
Note: earlier, I was computing the CI in this very wrong way: Mean +/- 2 * StdDev, shame on me!

Let's say you have a 100x19 data set. Each column has a different distribution. We'll choose the log normal distribution, so that they skew to the right.
means = repmat(log(1:19), 100, 1);
stdevs = ones(100, 19);
X = lognrnd(means, stdevs);
Notice that the values within each column come from the same distribution, and the rows are separate observations. Most functions in MATLAB operate down the columns by default, so it's preferable to keep your data this way around.
You can compute bootstrap confidence intervals for the mean using the bootci function.
ci = bootci(1000, @mean, X);
This does 1000 resamplings of your data, calculates the mean for each resampling, and builds a 95% interval from the resulting bootstrap distribution. By default bootci uses the bias-corrected and accelerated (BCa) percentile method; pass 'type','per' to take the plain 2.5% and 97.5% quantiles instead. To show that it's an asymmetric confidence interval about the mean, we can plot the mean and the confidence intervals for each column
plot(mean(X), 'r')
hold on
plot(ci')
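For the 19x100 layout in the question, where each row is one variable, a sketch along the same lines is to transpose the matrix before calling bootci so that observations run down the rows. The stand-in data and the names lowerErr and upperErr are assumptions for illustration; bootci requires the Statistics Toolbox.

```matlab
rng(1);                                  % fixed seed so the sketch is repeatable
M = exp(randn(19, 100));                 % hypothetical right-skewed data, 19 variables x 100 observations

% bootci resamples rows, so pass the transpose (100 x 19).
ci = bootci(1000, @mean, M');            % 2 x 19: row 1 lower limits, row 2 upper limits

Mean     = mean(M, 2);                   % 19 x 1, as in the question
lowerErr = Mean - ci(1, :)';             % distance from mean down to the lower limit
upperErr = ci(2, :)' - Mean;             % distance from mean up to the upper limit

% Asymmetric error bars: different lengths below and above each mean.
errorbar(1:19, Mean, lowerErr, upperErr, 'o');
```

With right-skewed data, upperErr will usually be larger than lowerErr, which is the asymmetry that Mean +/- 2 * StdDev cannot represent.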
