Matlab using interp1 to find the index? - matlab

I have an array Fa containing values I computed from a function. Is there a way to use the interp1 function in MATLAB to find the index at which a specific value occurs? The tutorials I have found show how to use interp1 to find the value at a known index, not the other way round.
Example from http://www.mathworks.com/help/matlab/ref/interp1.html:
Here are two vectors representing the census years from 1900 to 1990 and the corresponding United States population in millions of people.
t = 1900:10:1990;
p = [75.995 91.972 105.711 123.203 131.669...
150.697 179.323 203.212 226.505 249.633];
The expression interp1(t,p,1975) interpolates within the census data to estimate the population in 1975. The result is
ans =
214.8585
But I want the reverse: to find the t value for 214.8585.

In some sense, you want to find roots of a function -
f(x)-val
First of all, there might be several answers. Second, since the function is piecewise linear, you can check each segment by solving the relevant linear equation.
For example, suppose that you have this data:
t = 1900:10:1990;
p = [75.995 91.972 105.711 123.203 131.669...
150.697 179.323 70.212 226.505 249.633];
And you want to find the value 140
val = 140;
figure;plot(t,p);hold on;
plot( [min(t),max(t)], [val val],'r');
You should first subtract the value of val from p,
p1 = p - val;
Now you want only the segments in which p1 sign changes, either from + -> -, or vice versa.
segments = abs(diff(sign(p1)==1));
In each of these segments, you can solve the relevant linear equation a*x+b==0, and find the root. That is the index of your value.
for i=1:numel(segments)
x(1) = t(segments(i));
x(2) = t(segments(i)+1);
y(1) = p1(segments(i));
y(2) = p1(segments(i)+1);
m = (y(2)-y(1))/(x(2)-x(1));
n = y(2) - m * x(2);
index = -n/m;
scatter(index, val ,'g');
end
And here is the result:

You can search for the value in Fa directly:
idx = Fa==value_to_find;
To find the index use find function:
find(Fa==value_to_find);
Of course, this works only if the value_to_find is present in Fa. But as I understand it, this is what you want. You do not need interp for that.
If on the other hand the value might not be present in Fa, but Fa is sorted, you can search for values larger than value_to_find and take the first such index:
find(Fa>=value_to_find,1);
If your problem is more complicated than that, look at Andrey's answer.
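One caveat with the exact comparison above is floating-point rounding: a value computed from a function rarely matches a literal bit for bit. The following is a minimal sketch of a tolerant lookup; the data in Fa and the tolerance tol are made up for illustration and should be adapted to the scale of your values.

```matlab
% Hypothetical data: 0.1+0.2 is not exactly equal to 0.3 in floating point.
Fa = [0.1 0.2 0.1+0.2 0.4];
value_to_find = 0.3;

% Exact comparison finds nothing here.
exact_idx = find(Fa == value_to_find);

% Comparison within a tolerance (tol is an assumed value).
tol = 1e-12;
tol_idx = find(abs(Fa - value_to_find) < tol);

% Index of the element closest to the value, whatever the distance.
[~, nearest_idx] = min(abs(Fa - value_to_find));
```

The nearest-element variant always returns an index, so it is worth checking the distance it reports before trusting the match.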

Andrey's solution works in principle, but the code presented here does not. The problem is with the definition of the segments, which yields a vector of 0's and 1's, whereafter the call to "t(segments(i))" results in an error (I tried to copy & paste the code - I hope I did not fail in that simple task).
I made a small change to the definition of the segments. It might be done more elegantly. Here it is:
t = 1900:10:1990;
p = [75.995 91.972 105.711 123.203 131.669...
150.697 179.323 70.212 226.505 249.633];
val = 140;
figure;plot(t,p,'.-');hold on;
plot( [min(t),max(t)], [val val],'r');
p1 = p - val;
tn = 1:length(t);
segments = tn([abs(diff(sign(p1)==1)) 0].*tn>0);
for i=1:numel(segments)
x(1) = t(segments(i));
x(2) = t(segments(i)+1);
y(1) = p1(segments(i));
y(2) = p1(segments(i)+1);
m = (y(2)-y(1))/(x(2)-x(1));
n = y(2) - m * x(2);
index = -n/m;
scatter(index, val ,'g');
end
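The same idea can probably be written without the loop. The sketch below is one possible vectorized form, not the original author's code: it uses find to get the segment indices and the straight-line formula to locate each crossing. The names seg and t_cross are introduced here for illustration.

```matlab
t = 1900:10:1990;
p = [75.995 91.972 105.711 123.203 131.669 ...
     150.697 179.323 70.212 226.505 249.633];
val = 140;
p1 = p - val;

% Indices i such that p1 changes sign between samples i and i+1.
seg = find(p1(1:end-1) .* p1(2:end) < 0);

% Zero crossing of the straight line through each pair of samples.
t_cross = t(seg) - p1(seg) .* (t(seg+1) - t(seg)) ./ (p1(seg+1) - p1(seg));

% Samples that equal val exactly count as solutions as well.
t_cross = sort([t_cross, t(p1 == 0)]);
```

For this data there are three crossings, one in each of the intervals 1940-1950, 1960-1970 and 1970-1980.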

Interpolate the entire function to a higher precision, then search.
t = 1900:10:1990;
p = [75.995 91.972 105.711 123.203 131.669...
150.697 179.323 203.212 226.505 249.633];
precision = 0.5;
ti = 1900:precision:1990;
pi = interp1(t,p,ti);
Now pi holds the interpolated population for every half year (note that the name pi shadows MATLAB's built-in constant, so a different variable name would be safer). Assuming the values always increase, you can find the year with max(ti(pi < x)) where x = 214.8585. Here pi < x creates a logical vector that filters ti down to the years in which the population is less than x. max() then takes the most recent of those years, which is also the closest one to x from below if the assumption that p is always increasing holds.
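Put together, the search might look like the sketch below. It renames the interpolated vector to p_fine (an illustrative name) to avoid shadowing the built-in pi, and it also shows a nearest-value variant, which does not depend on which side of x the grid point falls.

```matlab
t = 1900:10:1990;
p = [75.995 91.972 105.711 123.203 131.669 ...
     150.697 179.323 203.212 226.505 249.633];

step = 0.5;                       % grid resolution in years
ti = t(1):step:t(end);
p_fine = interp1(t, p, ti);       % population on the fine grid

x = 214.8585;

% Latest grid year whose population is still below x (assumes p increases).
year_below = max(ti(p_fine < x));

% Grid year whose population is closest to x.
[~, j] = min(abs(p_fine - x));
year_nearest = ti(j);
```

The answer is only accurate to within one grid step, so step sets the precision of the result.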

The answer to the most general case was given above by Andrey, and I agree with it.
For the example that you stated, a simple particular solution would be:
interp1(p,t,214.8585)
In this case you are solving for the year when a given population is known.
This approach will NOT work when there is more than one solution. If you try this with Andrey's values you will only get the first solution to the problem.
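A short sketch of the swapped-argument call, with a guard added here (not part of the original answer) that refuses data which is not strictly increasing, since that is the case in which a single inverse lookup is not meaningful:

```matlab
t = 1900:10:1990;
p = [75.995 91.972 105.711 123.203 131.669 ...
     150.697 179.323 203.212 226.505 249.633];

% The inverse lookup is only well defined when p is strictly monotonic.
assert(all(diff(p) > 0), 'p must be strictly increasing for a unique inverse');

% Swap the roles of t and p: population in, year out.
yr = interp1(p, t, 214.8585);
```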

Related

Finding the maximum value from an expression using a loop in Matlab

I want to find the maximum value of the second derivative of an expression when x is between 0 and 1. In other words, I am taking the derivative of cos(x^2) twice to get the second derivative, - 2*sin(x^2) - 4*x^2*cos(x^2); then I want to evaluate this second derivative from x = 0 to x = 1 and display the maximum of the computed values.
I have:
syms x
f = cos(x^2);
secondD = diff(diff(f));
for i = 0:1
y = max(secondD(i))
end
Can someone help?
You can do it easily by subs and double:
syms x
f = cos(x^2);
secondD = diff(diff(f));
% instead of the for loop
epsilon = 0.01;
specified_range = 0:epsilon:1;
[max_val, max_ind] = max(double(subs(secondD, specified_range)));
Please note that this is a numerical approach: the maximum is only located to within the grid resolution, so the returned answer is approximate. Decreasing epsilon (a finer grid) generally gives a better result, although in some cases the result is still not exactly correct.
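If the Symbolic Math Toolbox is not available, the same grid search can be sketched with a plain anonymous function. The expression for g below is the second derivative quoted in the question; the grid sizes are arbitrary choices for illustration.

```matlab
% Second derivative of cos(x^2), as given in the question.
g = @(x) -2*sin(x.^2) - 4*x.^2.*cos(x.^2);

% Repeat the search on successively finer grids to see whether it settles.
for n = [11 101 1001]
    xs = linspace(0, 1, n);
    [max_val, max_ind] = max(g(xs));
    fprintf('%5d points: max %.6f at x = %.4f\n', n, max_val, xs(max_ind));
end
```

For this particular function the second derivative is negative everywhere on (0, 1], so the maximum sits at the endpoint x = 0 and every grid finds it; for a maximum in the interior, the comparison across grid sizes gives a rough idea of the remaining error.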

MATLAB: How to plot a cubic expression for certain range of input pressure

I have a cubic expression here
I am trying to determine and plot δ in the expression for P values of 0.0 to 5000. I'm really struggling to get the expression for δ in terms of the pressure P.
clear all;
close all;
t = 0.335*1e-9;
r = 62*1e-6;
delta = 1.2*1e+9;
E = 1e+12;
v = 0.17;
P = 0:100:5000
P = (4*delta*t)*w/r^2 + (2*E*t)*w^3/((1-v)*r^4);
I would appreciate if anyone could provide pointers.
I suggest two simple methods.
You evaluate P as a function of delta, then you plot(P,delta). This is quick and dirty, but if all you need is a plot it will do. The inconvenience is that you may have to do some trial and error to find the interval of delta that covers the P values you want, but you can also take a large enough value of delta_max and then restrict the x-axis limits of the plot.
Your function is a simple cubic, which you can solve analytically (see here if you are lost) to invert P(delta) into delta(P).
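The first method could be sketched as follows, using the variable names from the question's code (where the unknown is w). The upper limit 5e-6 for w is an assumption chosen so that P exceeds 5000 at the end of the range; adjust it for other parameters. Because P increases monotonically with w here, interp1 with the arguments swapped also gives w at chosen P values.

```matlab
t = 0.335e-9;
r = 62e-6;
delta = 1.2e9;
E = 1e12;
v = 0.17;

% Evaluate P on a grid of w values (upper limit is an assumed value).
w = linspace(0, 5e-6, 500);
P_of_w = (4*delta*t)*w/r^2 + (2*E*t)*w.^3/((1-v)*r^4);

% Plot with the axes swapped and restrict to the pressures of interest.
figure;
plot(P_of_w, w);
xlim([0 5000]);
xlabel('P');
ylabel('w');

% Optional: tabulate w at specific pressures by inverse interpolation.
P = 0:100:5000;
w_of_P = interp1(P_of_w, w, P);
```

The interpolated values are only as accurate as the grid in w is fine, which is the trade-off against solving the cubic exactly.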
What you want is the functional inverse of your expression, i.e., δ as a function of P. Since it's a cubic polynomial, you can expect up to three solutions (roots) for a given value of P. However, I'm guessing that you're only interested in real-valued solutions and nonnegative values of P. In that case there's just one real root for each value of P.
Given the values of your parameters, it makes most sense to solve this numerically using fzero. Using the parameter names in your code (different from equations):
t = 0.335*1e-9;
r = 62*1e-6;
delta = 1.2*1e9;
E = 1e12;
v = 0.17;
f = @(w,p)2*E*t*w.^3/((1-v)*r^4)+4*delta*t*w/r^2-p;
P = 0:100:5000;
w0 = [0 1]; % Bounded initial guess, valid up to very large values of P
w_sol = zeros(length(P),1);
for i = 1:length(P)
w_sol(i) = fzero(@(w)f(w,P(i)),w0); % Find solution for each P
end
figure;
plot(P,w_sol);
You could also solve this using symbolic math:
syms w p
t = 0.335*sym(1e-9);
r = 62*sym(1e-6);
delta = 1.2*sym(1e9);
E = sym(1e12);
v = sym(0.17);
w_sol = solve(p==2*E*t*w^3/((1-v)*r^4)+4*delta*t*w/r^2,w);
P = 0:100:5000;
w_sol = double(subs(w_sol(1),p,P)); % Plug in P values and convert to floating point
figure;
plot(P,w_sol);
Because of your numeric parameter values, solve returns an answer in terms of three RootOf objects, the first of which is the real one you want.

Iteration of matrix-vector multiplication which stores specific index-positions

I need to solve a min distance problem. Some of the work which has been tried so far is linked here.
I have four elements: two column vectors: alpha of dim (px1) and beta of dim (qx1). In this case p = q = 50 giving two column vectors of dim (50x1) each. They are defined as follows:
alpha = 0:0.05:2;
beta = 0:0.05:2;
and I have two matrices: L1 and L2.
L1 is composed of three column-vectors of dimension (kx1) each.
L2 is composed of three column-vectors of dimension (mx1) each.
In this case, they have equal size, meaning that k = m = 1000 giving: L1 and L2 of dim (1000x3) each. The values of these matrices are predefined.
They have, nevertheless, the following structure:
L1(kx3) = [t1(kx1) t2(kx1) t3(kx1)];
L2(mx3) = [t1(mx1) t2(mx1) t3(mx1)];
The min. distance problem I need to solve is given (mathematically) as follows:
d = min( (x-(alpha_p*t1_k - beta_q*t1_m)).^2 + (y-(alpha_p*t2_k - beta_q*t2_m)).^2 +
(z-(alpha_p*t3_k - beta_q*t3_m)).^2 )
the values x,y,z are three fixed constants.
My problem
I need to develop an iteration which can give me back the index positions from the combination of: alpha, beta, L1 and L2 which fulfills the min-distance problem from above.
I hope the formulation for the problem is clear, I have been very careful with the index notations. But if it is still not so clear... the step size for:
alpha is p = 1,...50
beta is q = 1,...50
for L1; t1, t2, t3 is k = 1,...,1000
for L2; t1, t2, t3 is m = 1,...,1000
And I need to find the index of p, index of q, index of k and index of m which gives me the min. distance to the point x,y,z.
Thanks in advance for your help!
I don't know your values, so I wasn't able to check my code. I am using loops because it is the most obvious solution. Pretty sure that someone from the bsxfun-brigade ( ;-D ) will find a shorter/more effective solution.
alpha = 0:0.05:2;
beta = 0:0.05:2;
% Note: 0:0.05:2 has 41 elements, so the loops run over numel(alpha) and numel(beta).
% L1 (k-by-3) and L2 (m-by-3) are assumed to be defined already, with columns [t1 t2 t3];
% x, y and z are the three fixed constants.
idx_smallest_d = [1,1,1,1];
% Start with the distance of the first combination.
smallest_d = (x-(alpha(1)*L1(1,1) - beta(1)*L2(1,1))).^2 + (y-(alpha(1)*L1(1,2) - beta(1)*L2(1,2))).^2+...
(z-(alpha(1)*L1(1,3) - beta(1)*L2(1,3))).^2;
% Check every combination of p, q, k and m.
for p=1:numel(alpha)
for q=1:numel(beta)
for k=1:size(L1,1)
for m=1:size(L2,1)
d = (x-(alpha(p)*L1(k,1) - beta(q)*L2(m,1))).^2 + (y-(alpha(p)*L1(k,2) - beta(q)*L2(m,2))).^2+...
(z-(alpha(p)*L1(k,3) - beta(q)*L2(m,3))).^2;
if d < smallest_d
smallest_d=d;
idx_smallest_d= [p,q,k,m];
end
end
end
end
end
What I am doing is predefining the smallest distance as the distance of the first combination and then checking, for each combination, whether the distance is smaller than the previous shortest distance.
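With 1000 rows in each matrix the four nested loops run through well over a billion combinations, so a partly vectorized variant may be worth trying. The sketch below keeps the loops over alpha and beta but evaluates all (k, m) pairs at once by expanding the squared norm |A_k + B_m|^2 = |A_k|^2 + |B_m|^2 + 2*A_k*B_m'. The data is random stand-in data, the names target, best_d and best_idx are introduced here, and the code assumes implicit expansion (MATLAB R2016b or later).

```matlab
% Hypothetical stand-in data; replace with the real L1, L2 and x, y, z.
rng(0);
L1 = rand(200, 3);
L2 = rand(150, 3);
target = [0.3 0.2 0.1];            % [x y z]
alpha = 0:0.05:2;
beta = 0:0.05:2;

best_d = Inf;
best_idx = [0 0 0 0];              % [p q k m]
for p = 1:numel(alpha)
    A = target - alpha(p)*L1;      % k-by-3, one row per candidate k
    A2 = sum(A.^2, 2);             % k-by-1 squared norms
    for q = 1:numel(beta)
        B = beta(q)*L2;            % m-by-3, one row per candidate m
        % Squared distance for every (k, m) pair as a k-by-m matrix.
        D = A2 + sum(B.^2, 2).' + 2*(A*B.');
        [d, lin] = min(D(:));
        if d < best_d
            [k, m] = ind2sub(size(D), lin);
            best_d = d;
            best_idx = [p q k m];
        end
    end
end
```

Each k-by-m matrix D is built and discarded inside the loop, so memory use stays at one 1000-by-1000 matrix for the sizes in the question.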

Finding an estimated solution of the equation

I have a truncation function defined as:
function f = phi_b(x, b)
if b == 0
f = sign(x);
else
f = -1 * (x<-b) + 1*(x>b) + (1/b) * x .* ((x>=-b & x<=b));
end;
It is used to truncate the observations which in my particular case corresponds to white noise:
model = arima('Constant',0,'AR',{0},'Variance',1);
y = simulate(model, 100);
The function I need in the end is:
r = @(b) (1/100) * sum((phi_b(y,b)).^2);
The problem is in finding the solution of the equation r(b)==0.1. Usual procedures like the one below will not work:
solve(r(b)==0.1, b)
Is there any way to solve such types of equations?
If the result of r(b) is a vector, you could invoke the min function and see where in this vector the closest value would be to 0.1. You can do something like:
result = r(b);
[val,index] = min(abs(result - 0.1));
val will contain how close the best-matching element of the vector is to 0.1, and index will tell you where in the result vector this element is. For example, if val = 0.00001 and index = 7, the best value in result is 0.00001 away from 0.1 and is located at index 7. To see the actual value, use result(7), i.e. result(index); the corresponding b value is b(index).
Interestingly enough, you can use val as a way of measuring the resolution of your data. In other words, if val is very large, this could mean that you need to introduce more values in your vector at a smaller step size. If val is quite small, this could mean that what you originally specified as your b vector is adequate enough. I'm not familiar with the function so I have not considered whether or not there could be no solutions to the data you have provided to your r function.
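Since r takes a scalar b, another option is to combine a coarse grid search with fzero. The sketch below makes several assumptions that are not in the question: randn stands in for the simulated ARIMA series so that no toolbox is needed, phi is an anonymous clamp that matches phi_b only for b > 0, and the bracket [0.1 100] is a guess at an interval where r(b) - 0.1 changes sign.

```matlab
rng(0);
y = randn(100, 1);                         % stand-in for simulate(model, 100)

% Clamp x/b to [-1, 1]; equivalent to phi_b for b > 0 only.
phi = @(x, b) max(min(x ./ b, 1), -1);
r = @(b) mean(phi(y, b).^2);

% Coarse search on a grid of candidate b values.
b_grid = 0.1:0.1:10;
r_grid = arrayfun(r, b_grid);
[val, index] = min(abs(r_grid - 0.1));
b_coarse = b_grid(index);

% Refine with a bracketing root finder (the bracket is an assumed interval).
b_sol = fzero(@(b) r(b) - 0.1, [0.1 100]);
```

fzero needs the function to have opposite signs at the two ends of the bracket; here r is close to 1 for small b and tends to 0 for large b, which is what makes the bracket plausible.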

How can I speed up this call to quantile in Matlab?

I have a MATLAB routine with one rather obvious bottleneck. I've profiled the function, with the result that 2/3 of the computing time is used in the function levels:
The function levels takes a matrix of floats and splits each column into nLevels buckets, returning a matrix of the same size as the input, with each entry replaced by the number of the bucket it falls into.
To do this I use the quantile function to get the bucket limits, and a loop to assign the entries to buckets. Here's my implementation:
function [Y q] = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
if isvector(q)
q=transpose(q);
end
Y = zeros(size(X));
for i = 1:nLevels
% "The variables g and l indicate the entries that are respectively greater than
% or less than the relevant bucket limits. The line Y(g & l) = i is assigning the
% value i to any element that falls in this bucket."
if i ~= nLevels % "The default; doesnt include upper bound"
g = bsxfun(@ge,X,q(i,:));
l = bsxfun(@lt,X,q(i+1,:));
else % "For the final level we include the upper bound"
g = bsxfun(@ge,X,q(i,:));
l = bsxfun(@le,X,q(i+1,:));
end
Y(g & l) = i;
end
Is there anything I can do to speed this up? Can the code be vectorized?
If I understand correctly, you want to know how many items fell in each bucket.
Use:
n = hist(Y,nbins)
Though I am not sure that it will help in the speedup. It is just cleaner this way.
Edit : Following the comment:
You can use the second output parameter of histc
[n,bin] = histc(...) also returns an index matrix bin. If x is a vector, n(k) = sum(bin==k). bin is zero for out of range values. If x is an M-by-N matrix, then
How about this:
function [Y q] = levels(X,nLevels)
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
Y = zeros(size(X));
for i = 1:numel(q)-1
Y = Y + (X>=q(i)); % parentheses needed: + binds more tightly than >=
end
This results in the following:
>>X = [3 1 4 6 7 2];
>>[Y, q] = levels(X,2)
Y =
1 1 2 2 2 1
q =
1 3.5 7
You could also modify the logic line to ensure values are less than the start of the next bin. However, I don't think it is necessary.
I think you should use histc
[~,Y] = histc(X,q)
As you can see in matlab's doc:
Description
n = histc(x,edges) counts the number of values in vector x that fall
between the elements in the edges vector (which must contain
monotonically nondecreasing values). n is a length(edges) vector
containing these counts. No elements of x can be complex.
I made a couple of refinements (including one inspired by Aero Engy in another answer) that have resulted in some improvements. To test them out, I created a random matrix of a million rows and 100 columns to run the improved functions on:
>> x = randn(1000000,100);
First, I ran my unmodified code, with the following results:
Note that of the 40 seconds, around 14 of them are spent computing the quantiles - I can't expect to improve this part of the routine (I assume that Mathworks have already optimized it, though I guess that to assume makes an...)
Next, I modified the routine to the following, which should be faster and has the advantage of being fewer lines as well!
function [Y q] = levels(X,nLevels)
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
if isvector(q), q = transpose(q); end
Y = ones(size(X));
for i = 2:nLevels
Y = Y + bsxfun(@ge,X,q(i,:));
end
The profiling results with this code are:
So it is 15 seconds faster, which represents a 150% speedup of the portion of code that is mine, rather than MathWorks.
Finally, following a suggestion of Andrey (again in another answer) I modified the code to use the second output of the histc function, which assigns entries to bins. It doesn't treat the columns independently, so I had to loop over the columns manually, but it seems to be performing really well. Here's the code:
function [Y q] = levels(X,nLevels)
p = linspace(0,1,nLevels+1);
q = quantile(X,p);
if isvector(q), q = transpose(q); end
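q(end,:) = 2 * q(end,:); % push the last edge past the column maximum so the maximum falls in bin nLevels (assumes the maximum is positive)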
Y = zeros(size(X));
for k = 1:size(X,2)
[junk Y(:,k)] = histc(X(:,k),q(:,k));
end
And the profiling results:
We now spend only 4.3 seconds in code outside the quantile function, which is around a 500% speedup over what I wrote originally. I've spent a bit of time writing this answer because I think it's turned into a nice example of how you can use the MATLAB profiler and StackExchange in combination to get much better performance from your code.
I'm happy with this result, although of course I'll continue to be pleased to hear other answers. At this stage the main performance increase will come from increasing the performance of the part of the code that currently calls quantile. I can't see how to do this immediately, but maybe someone else here can. Thanks again!
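In newer MATLAB releases (R2015a and later, as far as I know) the discretize function does the same bin assignment as the second output of histc, and its last bin includes the right edge, so the doubling of the last quantile is not needed. The following is a sketch on random stand-in data, not a benchmarked replacement; it still relies on quantile from the Statistics and Machine Learning Toolbox.

```matlab
rng(0);
X = randn(1000, 4);                 % stand-in data
nLevels = 5;

p = linspace(0, 1, nLevels+1);
q = quantile(X, p);                 % (nLevels+1)-by-4 bin edges, one column per column of X

Y = zeros(size(X));
for k = 1:size(X, 2)
    % Bin index for each entry; the last bin is closed on the right.
    Y(:, k) = discretize(X(:, k), q(:, k));
end
```

The loop over columns remains because each column has its own set of edges.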
You can sort the columns and divide+round the inverse indexes:
function Y = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
[S,IX]=sort(X);
[grid1,grid2]=ndgrid(1:size(IX,1),1:size(IX,2));
invIX=zeros(size(X));
invIX(sub2ind(size(X),IX(:),grid2(:)))=grid1;
Y=ceil(invIX/size(X,1)*nLevels);
Or you can use tiedrank:
function Y = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
R=tiedrank(X);
Y=ceil(R/size(X,1)*nLevels);
Surprisingly, both these solutions are slightly slower than the quantile+histc solution.