Estimated lambda exponential distribution - simulation

I'm trying to calculate lambda, the rate of an exponential distribution. For example, if I have an interval of 5 seconds and I observe 4 objects in it (on average), how is lambda calculated? I need the formulas to calculate it. Can anyone help me?

The rate is the number of occurrences per time unit (total number of occurrences / total time). For your case, 4 per 5 time units or a rate of 0.8 per time unit. The mean time between occurrences will be the inverse of this, or 1.25 time units.
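The arithmetic above can be sketched in MATLAB as follows. This is only an illustration: the variable names (n_events, interval, gaps, lambda_hat) are made up for the example, and the simulation step assumes the gaps between objects really are exponentially distributed.

```matlab
% Rate from the counts in the question: 4 objects per 5-second interval.
n_events = 4;
interval = 5;                         % seconds
lambda   = n_events / interval;       % rate, occurrences per second
mean_gap = 1 / lambda;                % mean time between occurrences

% Assumed check: simulate exponential gaps by inverse-transform sampling
% and re-estimate the rate as (number of gaps) / (total elapsed time).
gaps       = -log(rand(1, 100000)) / lambda;
lambda_hat = numel(gaps) / sum(gaps);

fprintf('lambda = %.3f, mean gap = %.3f, re-estimated lambda = %.3f\n', ...
        lambda, mean_gap, lambda_hat);
```

The re-estimated rate uses the same "total occurrences / total time" formula, applied to simulated data instead of observed data.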

You're asking about the exponential distribution (Exponential_distribution):
the exponential distribution is the probability distribution that
describes the time between events in [...] a process in which events
occur continuously and independently at a constant average rate

Related

Calculating confidence intervals from the empirical distribution obtained with the bootstrap method

I have calculated the empirical distribution of the sample mean using the bootstrap method, but now I would also need to calculate the confidence interval for the population mean using the empirical distribution I found.
Is there a way to do it automatically in Matlab given my state? If not, how would you find the 95% confidence interval for population mean?
The bootstrapped confidence intervals for the mean as you have calculated it are the quantiles of the distribution. So, it can be as simple as
quantile(myBootstrappedMeans, [0.05, 0.95])
That will give a 90% confidence interval for the vector myBootstrappedMeans. For reference, http://math.usask.ca/~longhai/doc/talks/slide-bootstrap.pdf
0.05 and 0.95 give the 90% confidence interval (the middle 90% of the bootstrapped means). For a different confidence level, choose the corresponding middle quantiles. So, for the 95% interval you asked about, you would use 0.025 and 0.975. To generalize, you would use (1-level)/2 and (0.5 + level/2), where level is the confidence level that you want.
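A minimal sketch of the general rule, assuming the quantile function used above is available (it ships with the Statistics and Machine Learning Toolbox). The data here are synthetic, and sample, nBoot and level are illustrative names; myBootstrappedMeans stands in for the vector from the question.

```matlab
% Synthetic sample, only so that the example runs on its own.
sample = 10 + 2 * randn(1, 200);

% Bootstrap the sample mean: each row of idx is one resample (with replacement).
nBoot = 2000;
idx   = randi(numel(sample), nBoot, numel(sample));
myBootstrappedMeans = mean(sample(idx), 2);

% Percentile interval for an arbitrary confidence level.
level = 0.95;
probs = [(1 - level) / 2, 0.5 + level / 2];   % [0.025, 0.975] for 95%
ci    = quantile(myBootstrappedMeans, probs);

fprintf('%.0f%% CI for the mean: [%.3f, %.3f]\n', 100 * level, ci(1), ci(2));
```

Changing level to 0.90 reproduces the [0.05, 0.95] call shown earlier.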

Finding Probability of Gaussian Distribution Using Matlab

The original question was to model lightbulbs that are used 24/7, where one bulb usually lasts 25 days. A box of bulbs contains 12. What is the probability that the box will last longer than a year?
I had to use MATLAB to model a Gaussian curve based on an exponential variable.
The code below generates a Gaussian model with mean = 300 and std = sqrt(12)*25.
The reason I had to use so many different variables and add them up was that I was supposed to be demonstrating the central limit theorem. The Gaussian curve represents the probability of a box of bulbs lasting for a given number of days, where 300 is the average number of days a box will last.
I am having trouble using the Gaussian I generated to find the probability for days > 365. The statement 1-normcdf(365,300, sqrt(12)*25) was an attempt to compute the expected probability, which I got as 0.2265. Any tips on how to find the probability for days > 365 based on the Gaussian I generated would be greatly appreciated.
Thank you!!!
clear all
samp_num = 10000000;              % number of simulated boxes
param = 1/25;                     % rate: one bulb lasts 25 days on average
n_bulbs = 12;                     % bulbs per box
% Sum 12 independent exponential lifetimes per box (inverse-transform sampling).
% Accumulating in a loop avoids 12 separate arrays and names like i and j,
% which shadow the imaginary unit.
x = zeros(1, samp_num);
for n = 1:n_bulbs
    x = x - log(rand(1, samp_num))/param;
end
mean_x = mean(x);
std_x = std(x);
bin_sizex = .01*10/param;         % bin width of 2.5 days
binsx = 0:bin_sizex:800;          % bin centers
u = hist(x, binsx);
u1 = u/samp_num;                  % relative frequency per bin
1 - normcdf(365, 300, sqrt(12)*25)   % normal (CLT) approximation of P(x > 365)
bar(binsx, u1)
legend(['mean=', num2str(mean_x), ', std=', num2str(std_x)]);
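[f, y]=ecdf(x) will create an empirical cdf for the data in x. You can then find the value of f where y first crosses 365. That value is the probability of lasting at most 365 days, so the probability of lasting longer than 365 days is 1 minus it.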
Generate N replicates of x, where N should be several thousand or tens of thousands. Then p-hat = count(x > 365) / N, and has a standard error of sqrt[p-hat * (1 - p-hat) / N]. The larger the number of replications is, the smaller the margin of error will be for the estimate.
When I did this in JMP with N=10,000 I ended up with [0.2039, 0.2199] as a 95% CI for the true proportion of the time that a box of bulbs lasts more than a year. The discrepancy with your value of 0.2265, along with a histogram of the 10,000 outcomes, indicates that the actual distribution is still somewhat skewed. In other words, using a CLT approximation for the sum of 12 exponentials is going to give answers that are slightly off.
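The replicate-and-count estimate described above could look like this in MATLAB. It is a sketch, not the JMP run: N, n_bulbs, mean_life and the other names are illustrative. The last two lines are an added cross-check that relies on the sum of 12 independent exponential lifetimes following a gamma distribution with shape 12 and scale 25, and they use only base MATLAB functions (gammainc, erfc).

```matlab
N         = 100000;   % number of simulated boxes (replicates)
n_bulbs   = 12;       % bulbs per box
mean_life = 25;       % mean bulb lifetime in days

% Each column is one box; each entry is one exponential bulb lifetime.
life = -mean_life * log(rand(n_bulbs, N));
x    = sum(life, 1);                        % lifetime of each box

p_hat = mean(x > 365);                      % count(x > 365) / N
se    = sqrt(p_hat * (1 - p_hat) / N);      % standard error of p_hat
ci95  = p_hat + [-1.96, 1.96] * se;         % approximate 95% interval

% Cross-check under the gamma assumption, and the normal (CLT) approximation.
p_exact = gammainc(365 / mean_life, n_bulbs, 'upper');
p_clt   = 0.5 * erfc((365 - n_bulbs * mean_life) / (sqrt(n_bulbs) * mean_life) / sqrt(2));

fprintf('Monte Carlo: %.4f [%.4f, %.4f]\n', p_hat, ci95(1), ci95(2));
fprintf('Gamma: %.4f, CLT approximation: %.4f\n', p_exact, p_clt);
```

The gap between the gamma value and the CLT value is the skewness effect mentioned above.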

How do you calculate the sampling rate of a time varying signal?

I'm trying to apply a filter method (Butterworth) in MATLAB to remove a static acceleration (gravity).
The problem here is the sampling rate seems to be varying. As far as I know, the sampling rate is defined as the number of samples obtained in one second (samples per second), thus fs = 1/T. T is not fixed, and it was varying in my file:
The sampling times were as follows. The fraction components represent ms.
16:25:50.032
16:25:50.192
16:25:50.352
16:25:50.512
16:25:50.671
16:25:50.832
16:25:50.991
16:25:51.151
16:25:51.312
16:25:51.472
16:25:51.632
The value of T is 100 ms, but here we can see that T varies between 159 and 161 ms. I am not sure how to calculate the sampling rate in this case.
Also, if I have a varying sampling rate, can I still use Butterworth?
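One way to approach this, sketched under assumptions: estimate an effective sampling rate from the mean interval between the listed timestamps, and, if strictly uniform samples are needed before filtering, interpolate onto an evenly spaced grid. The timestamps below are the seconds part of the times listed above; the signal a is hypothetical, and t_uniform and a_uniform are illustrative names.

```matlab
% Seconds past 16:25:00 for the timestamps listed above.
t = [50.032 50.192 50.352 50.512 50.671 50.832 50.991 ...
     51.151 51.312 51.472 51.632];

dt = diff(t);                 % individual sampling intervals in seconds
T  = mean(dt);                % mean sampling interval
fs = 1 / T;                   % effective sampling rate in Hz

fprintf('interval: min %.3f s, max %.3f s, mean %.3f s, fs = %.2f Hz\n', ...
        min(dt), max(dt), T, fs);

% Hypothetical acceleration samples taken at the times in t.
a = 9.81 + 0.1 * sin(2 * pi * 0.5 * (t - t(1)));

% Resample onto an evenly spaced grid with the same number of points.
t_uniform = linspace(t(1), t(end), numel(t));
a_uniform = interp1(t, a, t_uniform, 'linear');
```

With jitter of about 1 ms on an interval of about 160 ms, treating the data as uniformly sampled at fs may already be adequate; the interpolation step is there for cases where the jitter is larger.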

compute time series weighted average

I have an 8760x1 vector with the 1-hour average ambient temperature time series.
I want to calculate the weighted average temperature weighted by the percentage of operating
hours at each temperature level.
What I thought of is to divide the temperature range into
ceil(Tmax-Tmin)
bins and then use hist.
Are there any other suggestions?
Thank you in advance.
mean(temperatures) should do it.
Since you have hourly measurements, the frequency of a given value already reflects the operating hours at that temperature level. A value that occurs frequently will therefore automatically have more weight in the average.
Let's say you have two vectors that are the same length, one is the temperature (temp), and the other is the amount of time at that temperature (time_at_temp). The weighted average formula is this:
wt_avg_temp = sum(temp .* time_at_temp) / sum(time_at_temp);
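A short sketch showing that the two answers agree, assuming hourly data rounded to whole degrees so that distinct temperature levels exist. The series here is synthetic, and hours and level_idx are illustrative names; temp and time_at_temp are built the way the formula above expects them.

```matlab
% Synthetic stand-in for the 8760x1 hourly temperature series.
hours        = (1:8760)';
temperatures = round(12 + 8 * sin(2 * pi * hours / 8760) + 3 * randn(8760, 1));

% Distinct temperature levels and the number of hours spent at each one.
[temp, ~, level_idx] = unique(temperatures);
time_at_temp         = accumarray(level_idx, 1);

% Weighted average, weighted by hours at each level.
wt_avg_temp = sum(temp .* time_at_temp) / sum(time_at_temp);

fprintf('weighted average: %.4f, plain mean: %.4f\n', ...
        wt_avg_temp, mean(temperatures));
```

Because the weights are just the hour counts, the weighted average matches mean(temperatures) up to rounding error.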

analyse time series at a specific frequency

I have a long data set of water temperature:
t = 1/24:1/24:365;
y = 1 + (30-1).*rand(1,length(t));
plot(t,y)
The series extends for one year and the number of measurements per day is 24 (i.e. hourly). I expect the water temperature to follow a diurnal pattern (i.e. have a period of 24 hours), therefore I would like to evaluate how the 24 hour cycle varies throughout the year. Is there a method for only looking at specific frequencies when analyzing a signal? If so, I would like to draw a plot showing how the 24 hour periodicity in the data varies through the year (showing for example if it is greater in the summer and less in the winter). How could I do this?
You could use reshape to transform your data to a 24x365 matrix. In the new matrix every column is a day and every row a time of day.
temperature=reshape(y,24,365);
time=(1:size(temperature,1))-1;
day=(1:size(temperature,2))-1;
[day,time]=meshgrid(day,time);
surf(time,day,temperature)
My first thought would be a Fourier transform. This will give you a frequency spectrum.
At high frequencies (1 cycle per day and above) you would have the pattern within a day; at low frequencies, the pattern over longer times (see lowpass and highpass filters).
You could also go for a frequency/time visualization, such as a spectrogram, that will show how the frequencies change over a year.
A bit more work - but you could write a simple model and create a Kalman filter for it.
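As one possible way to combine the reshape and Fourier suggestions, the sketch below takes the FFT of each day separately and tracks the amplitude of the one-cycle-per-day component through the year. The data are synthetic (the random series in the question has no diurnal cycle to find), and A_true and amp24 are illustrative names.

```matlab
% Hourly samples for one year, time in days.
t = (1:24*365) / 24;

% Synthetic water temperature: a diurnal cycle whose amplitude varies seasonally.
A_true = 2 + 1.5 * sin(2 * pi * t / 365);
y      = 15 + A_true .* sin(2 * pi * t) + 0.5 * randn(size(t));

% One column per day, then an FFT along each column.
temperature = reshape(y, 24, []);
Y           = fft(temperature);

% Row 2 of the FFT is the bin with exactly one cycle per 24 samples.
amp24 = 2 * abs(Y(2, :)) / 24;

plot(1:size(temperature, 2), amp24)
xlabel('day of year')
ylabel('amplitude of 24-hour cycle')
```

A sliding window of several days instead of single days would smooth the curve at the cost of time resolution.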