I apologise if this is obvious to some, but I have been trying to get my head around bootstraps for a few hours, and for something so simple I am really struggling.
I have a large data set that is not normally distributed, and I am trying to find confidence intervals for it, which is why I have turned to bootstrapping. I want to apply the bootstrap to the fourth column of the data set, which I can do.
However, I am having trouble with the bootci function itself:
ci = bootci(10000, ....., array);
I am having trouble implementing the function, as I don't fully understand what the 2nd part of the bootci function, denoted ....., does.
I have seen @mean used in other examples; I'm assuming this calculates the mean of each column and passes it to the function.
If anyone could confirm my thinking or explain the function to me it would be much appreciated!
I am also unsure about how to change the sample size, could someone point me in the right direction?
From what I understand of the question:
ci = bootci(10000, @mean, X);
This will compute a 95% confidence interval for the mean of the dataset X, using 10000 bootstrap resamples drawn at random, with replacement, from X.
The second argument, @mean, is a handle to the function that is applied to each resample; here it is mean, so you get a confidence interval for the mean. You could equally pass in @std to get a confidence interval for the standard deviation, or pass in a handle to any other suitable function.
From what I have read in the documentation, it does not seem to be possible to directly control the size of the subsamples used by the bootci function.
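If you do need control over the resample size, one option is to write the resampling loop yourself. The following is a minimal sketch of a plain percentile bootstrap using only base MATLAB functions. Note that it is not what bootci does by default (bootci uses the bias-corrected and accelerated method unless told otherwise). The data x, the number of resamples nboot and the resample size m are hypothetical values chosen for illustration.

```matlab
rng(1);                          % fix the random seed so the run is repeatable
x = -2*log(rand(500, 1));        % hypothetical skewed data (exponential, mean 2)

nboot = 2000;                    % number of bootstrap resamples (assumed value)
m = numel(x);                    % resample size; change this to experiment
stats = zeros(nboot, 1);
for k = 1:nboot
    idx = randi(numel(x), m, 1); % draw m indices with replacement
    stats(k) = mean(x(idx));     % statistic of interest on this resample
end

% Percentile interval: take the 2.5% and 97.5% points of the sorted statistics
stats = sort(stats);
ci = [stats(round(0.025*nboot)); stats(round(0.975*nboot))];
disp(ci)
```

Bear in mind that choosing m different from numel(x) changes the width of the interval, so the result is no longer the standard bootstrap interval for your sample size.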
I feel like this should be something simple to solve - but I'm struggling to find the answer anywhere.
I have a set of 'R' values and a set of time values. I want to use curve fitting (I haven't used this part of the software before) to calculate the 'R' values at a different set of time values. In other words, I just want to access what is displayed in a figure created by curve fitting, but at a different set of time values. (I can point the cursor at the values I want on the figure and write them down, but that is not efficient at all for the number of time values I have.) The context is orbital radius versus time.
Thanks in advance :)
You can use Matlab's fit function to do this very easily. Assuming you have your data in arrays r and t, you can do something like this:
f = fit(t, r, 'smoothingspline')
disp(f(5))
If you consult the documentation, you can see the various fit types available. (See https://www.mathworks.com/help/curvefit/fit.html)
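The fit object can also be evaluated at a whole vector of new times in one call, which covers the case of many query points. A minimal sketch, assuming the Curve Fitting Toolbox is installed; the data and the names tNew and rNew are made up for illustration:

```matlab
% Hypothetical radius-vs-time samples (fit expects column vectors)
t = (0:0.5:10)';
r = 7000 + 50*cos(0.6*t);

f = fit(t, r, 'smoothingspline');   % same call as above

% Evaluate the fitted curve at a different set of time values
tNew = (0.25:0.5:9.75)';
rNew = f(tNew);                      % one fitted value per entry of tNew
disp([tNew rNew])
```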
I was asked to implement the cross correlation in Matlab and compare it with the xcorr that Matlab provides.
From what I have found, it seems that cross-correlation is similar to convolution, but I still don't fully understand how either of them works, so I can't get it down in code.
If somebody has done this before and is willing to share the code with an explanation of how it works, it would be appreciated.
PS: I was told that I can't use built-in functions other than the simple ones (for, if, etc.).
I am sure you are familiar with this GIF from a convolution:
What do you see there? At each step you compute the area under the product of the two functions, which is an integral (in a discrete system, a sum over the values inside your integration limits). You do that across the whole integration range of one function (that's one inner loop), and you repeat it for every shift across the integration range of the other function (a second, outer loop).
So there you have it: a convolution can be programmed as a sum of products of the values of two functions, inside two nested loops over the integration limits. For the cross-correlation you do the same thing without reversing (flipping) one of the functions.
Try programming that and come back if it doesn't work. Good luck with your assignment!
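For reference, here is one way the nested loops could look. This is a sketch for real-valued signals, with hypothetical inputs x and y. It checks itself against the built-in conv rather than xcorr. As far as I know, xcorr zero-pads the shorter input when the lengths differ, so its output may contain extra zeros at one end compared with r below.

```matlab
x = [1 2 3 4];                 % hypothetical signals
y = [0 1 0.5];
N = numel(x);
M = numel(y);

% Convolution: c(n) = sum over k of x(k) * y(n-k+1)
c = zeros(1, N+M-1);
for n = 1:N+M-1                % outer loop: output sample (shift)
    for k = 1:N                % inner loop: sum of products
        j = n - k + 1;         % index into y runs backwards (the "flip")
        if j >= 1 && j <= M
            c(n) = c(n) + x(k)*y(j);
        end
    end
end

% Cross-correlation: r(lag) = sum over k of x(k+lag) * y(k), no flip
lags = -(M-1):(N-1);
r = zeros(1, numel(lags));
for i = 1:numel(lags)          % outer loop: lag
    for k = 1:M                % inner loop: sum of products
        j = k + lags(i);       % index into x runs forwards
        if j >= 1 && j <= N
            r(i) = r(i) + x(j)*y(k);
        end
    end
end

% Sanity checks against the built-in conv
disp(max(abs(c - conv(x, y))))
disp(max(abs(r - conv(x, fliplr(y)))))
```

The second check relies on the fact that, for real signals, cross-correlating with y is the same as convolving with y reversed.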
I am trying to use ode45 in MATLAB and want to fix the number of points (the number of time steps) that MATLAB uses. Using the 'refine' option in ode45 does not seem to help. For instance, if I set 'refine' to 10, MATLAB returns an array of length 101.
Changing 'RelTol' and 'AbsTol' does not help either. I know that it is possible to write tspan as [0,t1,t2,t3,...,tn] and that this solves the issue, but I'd like to fix the number of points via the 'refine' option.
Perhaps you misunderstand what the 'Refine' option actually does. From the documentation for odeset:
Refine — If Refine is 1, the solver returns solutions only at the end of each time step. If Refine is n >1, the solver subdivides each time step into n smaller intervals and returns solutions at each time point. Refine does not apply when length(tspan)>2 or the ODE solver returns the solution as a structure.
In other words, setting 'Refine' to 10 does not guarantee that you'll get 10 output points but rather that you'll get 10 output points per integration time step. In the case of an adaptive step size method like ode45, the solver chooses how big the steps are based on many criteria. If you want a given number of output points you must specify fixed time steps as you've already done via tspan. The linspace function might be helpful to you.
Another possibility is that you're not actually applying your options. Simply calling odeset is not sufficient. You must also remember to pass the output into ode45.
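Both points can be illustrated together. This is a minimal sketch using a hypothetical test equation dy/dt = -2y with y(0) = 1; the tolerances and the number of output points are arbitrary choices.

```matlab
odefun = @(t, y) -2*y;                 % hypothetical ODE: dy/dt = -2*y
y0 = 1;

nPoints = 50;                          % number of output points wanted
tspan = linspace(0, 1, nPoints);       % more than two entries fixes the output times

opts = odeset('RelTol', 1e-8, 'AbsTol', 1e-10);
[t, y] = ode45(odefun, tspan, y0, opts);   % opts must be passed in here

disp(numel(t))                         % number of returned points
```

The solver still chooses its internal step sizes adaptively; tspan only controls where the solution is reported.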
I'm trying my best to work it out with fmincon in MATLAB. When I call the function, I get one of the two following errors:
Number of function evaluations exceeded, or
Number of iterations exceeded.
And when I look at the solution so far, it is way off the one intended (I know so because I created a minimum vector).
Even if I increase the tolerances or the maximum number of iterations, I still get the same problem.
Any help is appreciated.
First, if your problem can actually be cast as a linear or quadratic program, do that instead of using a general nonlinear solver.
Otherwise, have you tried seeding it with different starting values x0? If it's starting in a bad place, it may be much harder to get to the optimum.
If it's possible for you to provide the gradient of the function, that can help the optimizer tremendously (though obviously only if you can find it some way other than numerical differentiation). Similarly, if you can provide the (full or sparse) Hessian relatively cheaply, you're golden.
You can also try using a different algorithm in the solver.
Basically, fmincon by default has almost no info about the function it's trying to optimize, and providing more can be extremely helpful. If you can tell us more about the objective function, we might be able to give more tips.
The L1 norm is not differentiable wherever a residual is zero. That can make it difficult for the algorithm to converge to a point where one of the residuals is zero. I suspect this is why the iteration limit is exceeded. If your original problem is
min norm(residual(x),1)
s.t. Aeq*x=beq
you can reformulate the problem differentiably, as follows
min sum(b)
s.t. -b(i)<=residual(x,i)<=b(i)
Aeq*x=beq
where residual(x,i) is the i-th residual, x is the original vector of unknowns, and b is a further unknown vector of bounds that you add to the problem.
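As a rough sketch of how this could be set up for fmincon, assuming the Optimization Toolbox is available: the unknowns and the bounds are stacked into one vector z = [x; b], and the two-sided inequality is written as two nonlinear inequality constraints. The line-fitting residual and the data here are hypothetical, and the equality constraints are left out for brevity (they would be passed as [Aeq, zeros(size(Aeq,1), n)] and beq).

```matlab
% Hypothetical data: fit a line p(1)*t + p(2) to y in the L1 sense
t = (0:5)';
y = [0.1; 1.9; 4.2; 5.8; 8.1; 30];      % last point is an outlier
n = numel(t);
residual = @(p) p(1)*t + p(2) - y;      % hypothetical residual function

% Decision vector z = [p; b], where b holds the n auxiliary bounds
objective = @(z) sum(z(3:end));         % min sum(b)

% Inequalities c(z) <= 0:  residual - b <= 0  and  -residual - b <= 0
nonlcon = @(z) deal([ residual(z(1:2)) - z(3:end); ...
                     -residual(z(1:2)) - z(3:end)], []);

z0 = [0; 0; abs(residual([0; 0]))];     % feasible start: b = |residual|
opts = optimoptions('fmincon', 'Display', 'off');
z = fmincon(objective, z0, [], [], [], [], [], [], nonlcon, opts);

p = z(1:2);
disp(p')
disp(sum(abs(residual(p))))             % L1 norm at the solution
```

If the residual is linear in x, as in this toy example, the reformulated problem is a linear program, so a linear programming solver would be the more natural choice.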
I am looking for a way to do numerical integration in MATLAB. I know that there is a trapz function, but its precision is not good enough. Searching online, I found that there is a quad function, but it seems to accept only a symbolic expression as input. My data is all discrete and one-dimensional. Is there any way to use quad on my data? Thanks.
An answer to your question would be no. The only way to perform numerical integration on data with no expression in MATLAB is to use the trapz function. If it's not accurate enough for you, try writing your own quadrature routine, as Li-aung said; it's very simple.
Another method you may try is to use the Curve Fitting Tool, cftool, to fit your data, and then use the integrate function, which operates on cfit objects (it has a weird convention: the upper limit is the first argument!). I don't think you will get much more accurate answers than with trapz; it depends on the fit.
Use the spline function in MATLAB to interpolate your data, then integrate the interpolant. This is the standard method for integrating data in discrete form.
You can use quadl() to integrate your data if you first create a function in which you interpolate them.
function f = int_fun(x,xdata,ydata)
f = interp1(xdata,ydata,x);
And then feed it to the quadl() function:
Q = quadl(@int_fun,A,B,[],[],x,y) % syntax to pass extra arguments
                                  % to the function
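In more recent MATLAB releases the same idea can be written with an anonymous function and the integral function, which avoids the separate function file. A minimal sketch, assuming a release that provides integral, with hypothetical samples of sin(x) on [0, pi] (the exact integral is 2):

```matlab
x = linspace(0, pi, 21);                 % hypothetical sample points
y = sin(x);                              % hypothetical sampled data

f = @(xq) interp1(x, y, xq, 'spline');   % spline interpolant of the data

Q_spline = integral(f, x(1), x(end));    % integrate the interpolant
Q_trapz  = trapz(x, y);                  % trapezoidal rule for comparison

disp([Q_spline Q_trapz])
```

For smooth data like this the spline-based result is typically closer to the true value than trapz, but for noisy data the difference may not be meaningful.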
Integration of a function of one variable is the computation of the area under the curve of the graph of the function. For this answer I'll leave aside the nasty functions and the corner cases and all the twists and turns that trip up writers of numerical integration routines, most of which are probably not relevant here.
Simpson's rule is an approach to the numerical integration of a function for which you have a code to evaluate the function at points within its domain. That's irrelevant here.
Let's suppose that your data represents a time series of values collected at regular intervals. Then you can plot your data as a histogram with bars of equal width. The integral you seek is the sum of the areas of the bars in the histogram between the limits you are interested in.
You should be able to apply this approach to data sets where the x-axis (ie the width of the bars in the histogram) does not show time, to the situation where the bars are not of equal width, to the situation where the data crosses the x-axis, and most reasonable data sets, quite easily.
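As a small illustration of the bar-area approach, here is a sketch with made-up data in which the bars are not of equal width and the values cross the x-axis. The variable names are hypothetical, and each bar is assumed to take the height of the sample at its left edge.

```matlab
t = [0 1 2 4 7];                  % hypothetical sample positions (unequal spacing)
v = [1 3 -2 4 0];                 % hypothetical sample values

widths = diff(t);                 % width of each bar
areaBars = sum(v(1:end-1) .* widths);   % signed sum of the bar areas

areaTrapz = trapz(t, v);          % trapezoidal rule, for comparison

disp([areaBars areaTrapz])
```

Bars below the x-axis contribute negative area, which is what you want for an integral.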
The discretisation of your data establishes a limit to the accuracy of the result you can get. If, for example, your time series is sampled at 1-second intervals, you can't integrate over an interval which is not a whole number of seconds by this approach. But then, you don't really have the data on which to compute a figure with any more accuracy by any approach. Sure, you can use MATLAB (or anything else) to generate extra digits of precision, but they don't carry any meaning.