How to detect and remove outliers in MATLAB? - matlab

I have a problem detecting outliers in a set of data. Let's say I have two arrays x and y, and y is a quadratic function of x. Some of the values of y do not follow this function. How can I detect them?
I tried the rmoutliers function, but it doesn't seem to solve this problem since it only deals with normally distributed data.
Basically, I am trying to study some material behavior. The behavior is represented by y. I use an optimization method to get the different values of y as a function of x. Because the optimization sometimes doesn't yield accurate results, I get outliers. The relationship I am expecting should follow some nearly quadratic function, but the coefficients of this function vary with the provided set of data, so I can't pick one fixed function of x and use it to detect the outliers in the array of y values.
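One possible way to handle coefficients that change per data set is to fit the quadratic to the data itself, flag the points whose residuals are far from the median residual, refit on the remaining points, and repeat until the flagged set stops changing. The following is a minimal sketch of that idea in plain Python rather than MATLAB. The function names, the 3-scaled-MAD threshold and the relative tolerance floor are all assumptions made for illustration, and the approach presumes that outliers are a minority of the points.

```python
from statistics import median


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    sol = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = m[r][3] - sum(m[r][c] * sol[c] for c in range(r + 1, 3))
        sol[r] = s / m[r][r]
    return sol


def fit_quadratic(x, y):
    """Least-squares fit of y ~ c0 + c1*x + c2*x**2 via the normal equations."""
    powers = [sum(xi ** p for xi in x) for p in range(5)]
    a = [[powers[i + j] for j in range(3)] for i in range(3)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(3)]
    return solve3(a, b)


def quadratic_outliers(x, y, n_sigmas=3.0, max_iter=10):
    """Return the sorted indices of points flagged as outliers (hypothetical helper)."""
    inliers = list(range(len(x)))
    # Assumed tolerance floor so a perfect fit (MAD of zero) does not flag rounding noise.
    floor = 1e-6 * max(1.0, max(abs(v) for v in y))
    for _ in range(max_iter):
        c = fit_quadratic([x[i] for i in inliers], [y[i] for i in inliers])
        res = [yi - (c[0] + c[1] * xi + c[2] * xi * xi) for xi, yi in zip(x, y)]
        med = median(res)
        mad = median([abs(r - med) for r in res])
        scale = max(1.4826 * mad, floor)
        # Every point is re-tested, so a point flagged early can become an inlier again.
        new_inliers = [i for i, r in enumerate(res) if abs(r - med) <= n_sigmas * scale]
        if len(new_inliers) < 3 or new_inliers == inliers:
            break
        inliers = new_inliers
    keep = set(inliers)
    return [i for i in range(len(x)) if i not in keep]


# Synthetic example: an exact quadratic with two corrupted values.
x = [float(i) for i in range(11)]
y = [2.0 * xi * xi - 3.0 * xi + 1.0 for xi in x]
y[3] += 50.0
y[8] -= 40.0
print(quadratic_outliers(x, y))
```

The refit step matters because a plain least-squares fit is pulled toward the outliers, so the first pass can flag a clean point near the edge of the data; re-testing all points against the refitted curve lets such a point return to the inlier set.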

Related

How do I solve a linear optimization problem with a constrained output?

I am trying to solve the following optimization problem in Matlab:
[Image: MPC problem]
k is the time index and N is the number of time steps.
linprog(c_k,F_uk,f_k) solves elements 7.1a and 7.1e of the above problem description. However, the output of the model/problem, x, needs to be constrained within bounds. In the image, x is first converted to y and then y is constrained, but directly constraining x would also work.
For context, u is the (decision variable) input to various radiators in a building and x the resulting temperatures of rooms and walls, which need to be constrained between e.g. 20 and 25 degrees. v are external factors such as outside temperature.
Is there a way to incorporate the constraint on x in the linprog function? Or should I use another optimization method altogether?
What I tried:
[Image: what linprog solves]
I've thought a lot about how to rewrite x/u and/or how to use one of the three constraint methods as seen in the image to constrain x. Note that vector u in my problem description is the x to be solved in MATLAB, while x in my problem is a different variable.
I've thought about adding the states x to the decision variable u, but the problem is also that x depends on x in the previous timestep. u is currently a long vector with input variables for each timestep.
Perhaps I should use a heuristic algorithm, but a low computation time is important for my research.
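The dependence of x on the previous time step does not by itself rule out linprog, provided the model is linear. The derivation below is a sketch under that assumption: the state update is taken to be linear with matrices A, B and E, which are hypothetical names since the actual model is only shown in the image. Unrolling the recursion writes each x_k in terms of the known initial state x_0, the known disturbances v and the decision vector u, so the bounds on x turn into linear inequalities in u.

```latex
% Assumed model (A, B, E are hypothetical names, not taken from the question):
x_{k+1} = A x_k + B u_k + E v_k

% Unrolling the recursion from the known initial state x_0:
x_k = A^k x_0 + \sum_{j=0}^{k-1} A^{k-1-j} \left( B u_j + E v_j \right)

% The bound x_{\min} \le x_k \le x_{\max} becomes two sets of rows, linear in u:
 \sum_{j=0}^{k-1} A^{k-1-j} B u_j \le  x_{\max} - A^k x_0 - \sum_{j=0}^{k-1} A^{k-1-j} E v_j
-\sum_{j=0}^{k-1} A^{k-1-j} B u_j \le -x_{\min} + A^k x_0 + \sum_{j=0}^{k-1} A^{k-1-j} E v_j
```

If the model really has this form, the rows for k = 1, ..., N could be stacked under F_uk and f_k, so that the same linprog(c_k, F_uk, f_k) call also enforces the state bounds without adding x to the decision vector.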

Nonlinear curve fitting of a matrix function in python

I have the following problem. I have an N x N real matrix called Z(x; t), where x and t might be vectors in general. I have N_s observations (x_k, Z_k), k=1,..., N_s and I'd like to find the vector of parameters t that best approximates the data in the least-squares sense, which means I want the t that minimizes
S(t) = \sum_{k=1}^{N_s} \sum_{i=1}^{N} \sum_{j=1}^{N} (Z_{k,ij} - Z_{ij}(x_k; t))^2
This is in general a non-linear fitting of a matrix function. I'm only finding examples in which one has to fit scalar functions which are not immediately generalizable to a matrix function (nor a vector function). I tried using the scipy.optimize.leastsq function, the package symfit and lmfit, but still I don't manage to find a solution. Eventually, I'm ending up writing my own code...any help is appreciated!
You can do curve-fitting with multi-dimensional data. As far as I am aware, none of the low-level algorithms explicitly support multidimensional data, but they do minimize a one-dimensional array in the least-squares sense. And the fitting methods do not really care about the "independent variable(s)" x except in that they help you calculate the array to be minimized - perhaps to calculate a model function to match to y data.
That is to say: if you can write a function that would take the parameter values and calculate the matrix to be minimized, just flatten that 2-d (or n-d) array to one dimension. The fit will not mind.
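A minimal sketch of that flattening, in plain Python, might look like the following. The 2x2 model and its two parameters are made up for illustration and are not the asker's Z(x; t); only the shape of the residual function is the point.

```python
def model(x, t):
    """Hypothetical 2x2 matrix-valued model Z(x; t) with parameters t = (a, b)."""
    a, b = t
    return [[a * x, b], [b, a + b * x]]


def residuals(t, xs, zs):
    """Flatten every entry Z_k[i][j] - Z(x_k; t)[i][j] into one 1-D list."""
    flat = []
    for x, z_obs in zip(xs, zs):
        z_fit = model(x, t)
        for row_obs, row_fit in zip(z_obs, z_fit):
            for obs, fit in zip(row_obs, row_fit):
                flat.append(obs - fit)
    return flat


def sum_of_squares(t, xs, zs):
    """S(t): the quantity a least-squares routine minimizes over t."""
    return sum(r * r for r in residuals(t, xs, zs))


# Synthetic observations generated from known parameters (1.5, -0.5).
xs = [0.0, 1.0, 2.0]
zs = [model(x, (1.5, -0.5)) for x in xs]

print(len(residuals((1.0, 0.0), xs, zs)))   # 3 observations x 4 entries each
print(sum_of_squares((1.5, -0.5), xs, zs))  # zero at the generating parameters
```

A residual function of this shape (parameters in, flat 1-D residuals out) is what routines such as scipy.optimize.least_squares expect as their first argument, so the matrix structure never has to be visible to the solver.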

extrapolating a 2D matrix to predict a future output

I have a 2D 2401*266 matrix K which corresponds to x values (t: stored in a 1*266 array) and y values(z: stored in a 1*2401 array).
I want to extrapolate the matrix K to predict some future values (corresponding to t(1,267:279)). So far I have extended t so that it is now a 1*279 matrix using a for loop:
for tq = 267:279
t(1,tq) = t(1,tq-1)+0.0333333333;
end
However I am stumped on how to extrapolate K without fitting a polynomial to each individual row?
I feel like there must be a more efficient way than this??
There are countless extrapolation methods in the literature. "Fitting a polynomial to each row" would be just one of them, and not necessarily an invalid one, so I am not sure why you mention that you do not want to do it. For 2D data, perhaps fitting a surface would lead to better results though.
However, if you want an easy, simple way (that might or might not work with your problem), you can always use the interpolation function interp2. If you choose spline or makima as the interpolation method, it will also extrapolate for any query point outside the domain of K.

2-Dimensional Minimization without Derivatives and Ignoring certain Input Parameters on the go

I have a function V which depends on two variables v1 and v2 and a parameter array p containing 15 parameters.
I want to minimize my function V with respect to v1 and v2, but there is no closed-form expression for the function, so I can't build and use the derivatives.
The problem is the following: to calculate the value of my function I need the eigenvalues of two 4x4 matrices (which should be symmetric and real by construction, but sometimes the EigenSolver does not return real eigenvalues). I calculate these eigenvalues with the Eigen package. The entries of the matrices are given by v1, v2 and p.
There are certain input sets for which some of these eigenvalues become negative. I want to ignore these input sets in my calculation, as they lead to a complex function value and my function is only allowed to have real values.
Is there a way to include this? My first attempt was a Nelder-Mead simplex algorithm using the GSL library, returning a far too high output value for the function if one of the eigenvalues becomes negative, but this doesn't work.
Thanks for any suggestions.
For the Nelder-Mead simplex, you could reject new points as vertices for the simplex, unless they have the desired properties.
Your method to artificially increase the function value for forbidden points is also called penalty or barrier function. You might want to re-design your penalty function.
Another optimization method without derivatives is the Simulated Annealing method. Again, you could modify the method to avoid forbidden points.
What do you mean by "doesn't work"? Does it take too long? Are the resulting function values too high?
Depending on the function evaluation cost, it might be an approach to simply scan a 2D interval, evaluate all width x height function values and drill down in the tile with the lowest function values.
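The scan-and-drill-down idea can be sketched as follows, in plain Python rather than C++. The objective here is a hypothetical stand-in for V(v1, v2): it returns infinity for forbidden inputs, which plays the role of "an eigenvalue became negative", and the grid size and number of refinement levels are arbitrary assumptions.

```python
import math


def objective(v1, v2):
    """Hypothetical stand-in for V(v1, v2); returns inf for forbidden inputs."""
    if v1 < 0.5:  # stand-in for "an eigenvalue became negative"
        return math.inf
    return (v1 - 1.0) ** 2 + (v2 + 2.0) ** 2


def grid_drill_down(f, lo1, hi1, lo2, hi2, n=11, levels=8):
    """Scan an n x n grid, then repeatedly zoom in around the best finite point."""
    best = (math.inf, None, None)
    for _ in range(levels):
        step1 = (hi1 - lo1) / (n - 1)
        step2 = (hi2 - lo2) / (n - 1)
        level_best = (math.inf, None, None)
        for i in range(n):
            for j in range(n):
                v1 = lo1 + i * step1
                v2 = lo2 + j * step2
                val = f(v1, v2)
                # inf < inf is False, so forbidden points are never selected.
                if val < level_best[0]:
                    level_best = (val, v1, v2)
        if level_best[1] is None:
            break  # every grid point in this window was forbidden
        if level_best[0] < best[0]:
            best = level_best
        _, c1, c2 = level_best
        # Next window: one grid step on each side of the current best point.
        lo1, hi1 = c1 - step1, c1 + step1
        lo2, hi2 = c2 - step2, c2 + step2
    return best


value, v1_best, v2_best = grid_drill_down(objective, -5.0, 5.0, -5.0, 5.0)
print(value, v1_best, v2_best)
```

Because forbidden points are simply skipped instead of being given a large finite value, the search never has to compare against an arbitrary penalty. The cost is n * n evaluations per level, so this only pays off when a single evaluation of V is cheap.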

Can MATLAB plot a constrained symbolic function?

I'm trying to plot the following:
syms f_ih n
bi = (1-f_ih)/2
where f_ih is constrained by: 0 ≤ f_ih ≤ (n-1)/n. I'd like to show bi as a surface plot with independent variables n and f_ih, but ezsurf doesn't allow for variable bounds. How can I do this?
The command I'd love to run would simply be ezsurf(bi,[0,(n-1)/n]), but it's not that simple.
Thanks!
There's a good reason for requiring the domain to be numeric as opposed to symbolic: the function has to be evaluated in order to obtain actual numbers that can be plotted. It makes no sense to plot something that is purely symbolic like your equations (in MATLAB or on paper) unless you specify n. Some functions may be scale invariant (not quite the right term mathematically), meaning they look the same when evaluated/plotted on different domains, but MATLAB has no way of knowing. Choose a value of n that results in a plot that looks how you want. Then, if you like, you can remove the numbers from the axes afterwards and label them as if it were for arbitrary n even though it's not.