I have a truncation function defined as:
function f = phi_b(x, b)
if b == 0
f = sign(x);
else
f = -1 * (x<-b) + 1*(x>b) + (1/b) * x .* ((x>=-b & x<=b));
end;
It is used to truncate the observations which in my particular case corresponds to white noise:
model = arima('Constant',0,'AR',{0},'Variance',1);
y = simulate(model, 100);
The function I need in the end is:
r = #(b) (1/100) * sum((phi_b(y,b)).^2);
The problem is in finding the solution of the equation r(b)==0.1. Usual procedures like the one below will not work:
solve(r(b)==0.1, b)
Is there any way to solve such types of equations?
If the result of r(b) is a vector, you could invoke the min function and see where in this vector the closest value would be to 0.1. You can do something like:
result = r(b);
[val,index] = min(abs(result - 0.1));
val will contain how "close" 0.1 is with the best element in the vector that matches this criteria and index will tell you where in the result vector this element is. For example, if val = 0.00001 and index = 7, this means that the best value in result is 0.00001 away from 0.1. Also, index 7 in result is where this best element is located. To see what the actual value is, do r(7) or r(index).
Interestingly enough, you can use val as a way of measuring the resolution of your data. In other words, if val is very large, this could mean that you need to introduce more values in your vector at a smaller step size. If val is quite small, this could mean that what you originally specified as your b vector is adequate enough. I'm not familiar with the function so I have not considered whether or not there could be no solutions to the data you have provided to your r function.
Related
I want to find the maximum value using the second derivative of the the expression when x is between 0 and 1. In other words I am taking the derivative of cox(x^2) twice to get the second derivative resulting in - 2*sin(x^2) - 4*x^2*cos(x^2), then I want to evaluate this second derivative at x = 0 to x = 1, and display the maximum value of the populated values.
I have:
syms x
f = cos(x^2);
secondD = diff(diff(f));
for i = 0:1
y = max(secondD(i))
end
Can someone help?
You can do it easily by subs and double:
syms x
f = cos(x^2);
secondD = diff(diff(f));
% instead of the for loop
epsilon = 0.01;
specified_range = 0:epsilon:1;
[max_val, max_ind] = max(double(subs(secondD, specified_range)));
Please note that it is a numerical approach to find the maximum and the returned answer is not completely correct all the time. However, by increasing the epsilon, you can expect a better result in general (again in some cases it is not completely correct).
I want to check the calculation of the laplace filter from scipy.ndimage and compare it to my own method if differentiation. Below I have a piece of code that I ran
import scipy.ndimage.filters
n = 100
x_range = y_range = np.linspace(-1, 1, n)
X, Y = np.meshgrid(x_range, y_range)
f_fun = X ** 2 + Y ** 2
f_fun_laplace = 4 * np.ones(f_fun.shape)
res_laplace = scipy.ndimage.filters.laplace(f_fun, mode='constant')
I expect that the variable res_laplace will have the constant value of 4 over the whole domain (excluding the boundaries for simplicity), since this is what I would get by applying the laplace operator to my function f(x,y) = x^2 + y^2.
However, the value that res_laplace produces is 0.00163 in this case. So my question was.. why is this not equal to 4?
The answer, in this specific case, is that you need to multiply the output of scipy.ndimage.filters.laplace with a factor of (1/delta_x) ** 2. Where delta_x = np.diff(x_range)[0]
I simply assumed that the filter would take care of that, but in hindsight it is of course not able know the value delta_x.
And since we are differentiating twice, we need to square the inverse of this delta_x.
Ok, so I have a function that takes the maximum of a 16 different functions. I want to find the minimum of this function subject to the condition that this function is equal to another function. This is what the code looks like, (H1,...,H16 are all column vectors):
function f = opt(a,b,c)
F1 = a*mean(H1) + b*var(H1)+ c*skewness(H1);
...*more functions here*...
F15 = a*mean(H15) + b*var(H15)+ c*skewness(H15);
F16 = a*mean(H16) + b*var(H16)+ c*skewness(H16);
FVEC = [F1,F2,F3,F4,F5,F6,F7,F8,F9,F10,F11,F12,F13,F14,F15,F16];
[ max, max_index ] = max(FVEC);
f = max;
end
The constraint I want it basically that the above function should be equal to the first one in the list:
opt(a,b,c) = a*mean(H1) + b*var(H1)+ c*skewness(H1)
I think I'm supposed to use fmincon, but despite my repeated attempts, I seem to be running into issues, plus it does not look like it supports constraints depending on another function (although I might be misreading the docs). Is this the right function to use? What is the best way to approach this problem? I am very new to MATLAB and, so, I'm not familiar with what a typical approach would look like.
The max function is non-differentiable. Most solvers expect smooth functions (including fmincon). Luckily there is a simple linear formulation:
min y
y >= v(i) for all i
y will automatically assume the largest value of the v(i).
Your constraint would be
y = v(1)
In this case we can even drop the min y.
This would enforce the first set of observations to be the maximum v. I am not sure, but this could lead to an infeasible model (if it cannot arrange a,b,c in such a way).
I have an array of Fa which contains values I found from a function. Is there a way to use interp1 function in Matlab to find the index at which a specific value occurs? I have found tutorials for interp1 which I can find a specific value in the array using interp1 by knowing the corresponding index value.
Example from http://www.mathworks.com/help/matlab/ref/interp1.html:
Here are two vectors representing the census years from 1900 to 1990 and the corresponding United States population in millions of people.
t = 1900:10:1990;
p = [75.995 91.972 105.711 123.203 131.669...
150.697 179.323 203.212 226.505 249.633];
The expression interp1(t,p,1975) interpolates within the census data to estimate the population in 1975. The result is
ans =
214.8585
- but I want to find the t value for 214.8585.
In some sense, you want to find roots of a function -
f(x)-val
First of all, there might be several answers. Second, since the function is piecewise linear, you can check each segment by solving the relevant linear equation.
For example, suppose that you have this data:
t = 1900:10:1990;
p = [75.995 91.972 105.711 123.203 131.669...
150.697 179.323 70.212 226.505 249.633];
And you want to find the value 140
val = 140;
figure;plot(t,p);hold on;
plot( [min(t),max(t)], [val val],'r');
You should first subtract the value of val from p,
p1 = p - val;
Now you want only the segments in which p1 sign changes, either from + -> -, or vice versa.
segments = abs(diff(sign(p1)==1));
In each of these segments, you can solve the relevant linear equation a*x+b==0, and find the root. That is the index of your value.
for i=1:numel(segments)
x(1) = t(segments(i));
x(2) = t(segments(i)+1);
y(1) = p1(segments(i));
y(2) = p1(segments(i)+1);
m = (y(2)-y(1))/(x(2)-x(1));
n = y(2) - m * x(2);
index = -n/m;
scatter(index, val ,'g');
end
And here is the result:
You can search for the value in Fa directly:
idx = Fa==value_to_find;
To find the index use find function:
find(Fa==value_to_find);
Of course, this works only if the value_to_find is present in Fa. But as I understand it, this is what you want. You do not need interp for that.
If on the other hand the value might not be present in Fa, but Fa is sorted, you can search for values larger than value_to_find and take the first such index:
find(Fa>=value_to_find,1);
If your problem is more complicated than that, look at Andreys answer.
Andrey's solution works in principle, but the code presented here does not. The problem is with the definition of the segments, which yields a vector of 0's and 1's, whereafter the call to "t(segments(i))" results in an error (I tried to copy & paste the code - I hope I did not fail in that simple task).
I made a small change to the definition of the segments. It might be done more elegantly. Here it is:
t = 1900:10:1990;
p = [75.995 91.972 105.711 123.203 131.669...
150.697 179.323 70.212 226.505 249.633];
val = 140;
figure;plot(t,p,'.-');hold on;
plot( [min(t),max(t)], [val val],'r');
p1 = p - val;
tn = 1:length(t);
segments = tn([abs(diff(sign(p1)==1)) 0].*tn>0);
for i=1:numel(segments)
x(1) = t(segments(i));
x(2) = t(segments(i)+1);
y(1) = p1(segments(i));
y(2) = p1(segments(i)+1);
m = (y(2)-y(1))/(x(2)-x(1));
n = y(2) - m * x(2);
index = -n/m;
scatter(index, val ,'g');
end
interpolate the entire function to a higher precision. Then search.
t = 1900:10:1990;
p = [75.995 91.972 105.711 123.203 131.669...
150.697 179.323 203.212 226.505 249.633];
precision = 0.5;
ti = 1900:precision:1990;
pi = interp1(t,p,ti);
now pi holds all pi values for every half a year. Assuming the values always increase you could find the year by max(ti(pi < x)) where x = 214.8585. Here pi < x creates a logical vector used to filter ti to only provide the years when p is less than x. max() is then used to take the most recent year, which will also be closest to x if the assumption that p is always increasing holds.
The answer to the most general case was given above by Andrey, and I agree with it.
For the example that you stated, a simple particular solution would be:
interp1(p,t,214.8585)
In this case you are solving for the year when a given population is known.
This approach will NOT work when there is more than one solution. If you try this with Andrey's values you will only get the first solution to the problem.
I have a MATLAB routine with one rather obvious bottleneck. I've profiled the function, with the result that 2/3 of the computing time is used in the function levels:
The function levels takes a matrix of floats and splits each column into nLevels buckets, returning a matrix of the same size as the input, with each entry replaced by the number of the bucket it falls into.
To do this I use the quantile function to get the bucket limits, and a loop to assign the entries to buckets. Here's my implementation:
function [Y q] = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
if isvector(q)
q=transpose(q);
end
Y = zeros(size(X));
for i = 1:nLevels
% "The variables g and l indicate the entries that are respectively greater than
% or less than the relevant bucket limits. The line Y(g & l) = i is assigning the
% value i to any element that falls in this bucket."
if i ~= nLevels % "The default; doesnt include upper bound"
g = bsxfun(#ge,X,q(i,:));
l = bsxfun(#lt,X,q(i+1,:));
else % "For the final level we include the upper bound"
g = bsxfun(#ge,X,q(i,:));
l = bsxfun(#le,X,q(i+1,:));
end
Y(g & l) = i;
end
Is there anything I can do to speed this up? Can the code be vectorized?
If I understand correctly, you want to know how many items fell in each bucket.
Use:
n = hist(Y,nbins)
Though I am not sure that it will help in the speedup. It is just cleaner this way.
Edit : Following the comment:
You can use the second output parameter of histc
[n,bin] = histc(...) also returns an index matrix bin. If x is a vector, n(k) = >sum(bin==k). bin is zero for out of range values. If x is an M-by-N matrix, then
How About this
function [Y q] = levels(X,nLevels)
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
Y = zeros(size(X));
for i = 1:numel(q)-1
Y = Y+ X>=q(i);
end
This results in the following:
>>X = [3 1 4 6 7 2];
>>[Y, q] = levels(X,2)
Y =
1 1 2 2 2 1
q =
1 3.5 7
You could also modify the logic line to ensure values are less than the start of the next bin. However, I don't think it is necessary.
I think you shoud use histc
[~,Y] = histc(X,q)
As you can see in matlab's doc:
Description
n = histc(x,edges) counts the number of values in vector x that fall
between the elements in the edges vector (which must contain
monotonically nondecreasing values). n is a length(edges) vector
containing these counts. No elements of x can be complex.
I made a couple of refinements (including one inspired by Aero Engy in another answer) that have resulted in some improvements. To test them out, I created a random matrix of a million rows and 100 columns to run the improved functions on:
>> x = randn(1000000,100);
First, I ran my unmodified code, with the following results:
Note that of the 40 seconds, around 14 of them are spent computing the quantiles - I can't expect to improve this part of the routine (I assume that Mathworks have already optimized it, though I guess that to assume makes an...)
Next, I modified the routine to the following, which should be faster and has the advantage of being fewer lines as well!
function [Y q] = levels(X,nLevels)
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
if isvector(q), q = transpose(q); end
Y = ones(size(X));
for i = 2:nLevels
Y = Y + bsxfun(#ge,X,q(i,:));
end
The profiling results with this code are:
So it is 15 seconds faster, which represents a 150% speedup of the portion of code that is mine, rather than MathWorks.
Finally, following a suggestion of Andrey (again in another answer) I modified the code to use the second output of the histc function, which assigns entries to bins. It doesn't treat the columns independently, so I had to loop over the columns manually, but it seems to be performing really well. Here's the code:
function [Y q] = levels(X,nLevels)
p = linspace(0,1,nLevels+1);
q = quantile(X,p);
if isvector(q), q = transpose(q); end
q(end,:) = 2 * q(end,:);
Y = zeros(size(X));
for k = 1:size(X,2)
[junk Y(:,k)] = histc(X(:,k),q(:,k));
end
And the profiling results:
We now spend only 4.3 seconds in codes outside the quantile function, which is around a 500% speedup over what I wrote originally. I've spent a bit of time writing this answer because I think it's turned into a nice example of how you can use the MATLAB profiler and StackExchange in combination to get much better performance from your code.
I'm happy with this result, although of course I'll continue to be pleased to hear other answers. At this stage the main performance increase will come from increasing the performance of the part of the code that currently calls quantile. I can't see how to do this immediately, but maybe someone else here can. Thanks again!
You can sort the columns and divide+round the inverse indexes:
function Y = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
[S,IX]=sort(X);
[grid1,grid2]=ndgrid(1:size(IX,1),1:size(IX,2));
invIX=zeros(size(X));
invIX(sub2ind(size(X),IX(:),grid2(:)))=grid1;
Y=ceil(invIX/size(X,1)*nLevels);
Or you can use tiedrank:
function Y = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
R=tiedrank(X);
Y=ceil(R/size(X,1)*nLevels);
Surprisingly, both these solutions are slightly slower than the quantile+histc solution.