Finding the maximum value from an expression using a loop in Matlab - matlab

I want to find the maximum value using the second derivative of the the expression when x is between 0 and 1. In other words I am taking the derivative of cox(x^2) twice to get the second derivative resulting in - 2*sin(x^2) - 4*x^2*cos(x^2), then I want to evaluate this second derivative at x = 0 to x = 1, and display the maximum value of the populated values.
I have:
syms x
f = cos(x^2);
secondD = diff(diff(f));
for i = 0:1
y = max(secondD(i))
end
Can someone help?

You can do it easily by subs and double:
syms x
f = cos(x^2);
secondD = diff(diff(f));
% instead of the for loop
epsilon = 0.01;
specified_range = 0:epsilon:1;
[max_val, max_ind] = max(double(subs(secondD, specified_range)));
Please note that it is a numerical approach to find the maximum and the returned answer is not completely correct all the time. However, by increasing the epsilon, you can expect a better result in general (again in some cases it is not completely correct).

Related

Minimize difference between indicator variables in Matlab

I'm new to Matlab and want to write a program that chooses the value of a parameter (P) to minimize the difference between two vectors, where each vector is a variable in a dataframe. The first vector (call it A) is a predetermined vector of 1s and 0s, and the second vector (call it B) has each of its entries determined as an indicator function that depends on the value of the parameter P and other variables in the dataframe. For instance, let C be a third variable in the dataset, so
A = [1, 0, 0, 1, 0]
B = [x, y, z, u, v]
where x = 1 if (C[1]+10)^0.5 - P > (C[1])^0.5 and otherwise x = 0, and similarly, y = 1 if (C[2]+10)^0.5 - P > (C[2])^0.5 and otherwise y = 0, and so on.
I'm not really sure where to start with the code, except that it might be useful to use the fminsearch command. Any suggestions?
Edit: I changed the above by raising to a power, which is closer to the actual example that I have. I'm also providing a complete example in response to a comment:
Let A be as above, and let C = [10, 1, 100, 1000, 1]. Then my goal with the Matlab code would be to choose a value of P to minimize the differences between the coordinates of the vectors A and B, where B[1] = 1 if (10+10)^0.5 - P > (10)^0.5 and otherwise B[1] = 0, and similarly B[2] = 1 if (1+10)^0.5 - P > (1)^0.5 and otherwise B[2] = 0, etc. So I want to choose P to maximize the likelihood that A[1] = B[1], A[2] = B[2], etc.
I have the following setup in Matlab, where ds is the name of my dataset:
ds.B = zeros(size(ds,1),1); % empty vector to fill
for i = 1:size(ds,1)
if ((ds.C(i) + 10)^(0.5) - P > (ds.C(i))^(0.5))
ds.B(i) = 1;
else
ds.B(i) = 0;
end
end
Now I want to choose the value of P to minimize the difference between A and B. How can I do this?
EDIT: I'm also wondering how to do this when the inequality is something like (C[i]+10)^0.5 - P*D[i] > (C[i])^0.5, where D is another variable in my dataset. Now P is a scalar being multiplied rather than just added. This seems more complicated since I can't solve for P exactly. How can I solve the problem in this case?
EDIT 1: It seems fminbnd() isn't optimal, likely due to the stairstep nature of the indicator function. I've updated to test the midpoints of all the regions between indicator function flips, plus endpoints.
EDIT 2: Updated to include dataset D as a coefficient of P.
If you can package your distance calculation up in a single function based on P, you can then search for its minimum.
arraySize = 1000;
ds.A = double(rand([arraySize,1]) > 0.5);
ds.C = rand(size(ds.A));
ds.D = rand(size(ds.A));
B = #(P)double((ds.C+10).^0.5 - P.*ds.D > ds.C.^0.5);
costFcn = #(P)sqrt(sum((ds.A-B(P)).^2));
% Solving the equation (C+10)^0.5 - P*D = C^0.5 for P, and sorting the results
BCrossingPoints = sort(((ds.C+10).^0.5-ds.C.^0.5)./ds.D);
% Taking the average of each crossing point with its neighbors
BMidpoints = (BCrossingPoints(1:end-1)+BCrossingPoints(2:end))/2;
% Appending endpoints onto the midpoints
PsToTest = [BCrossingPoints(1)-0.1; BMidpoints; BCrossingPoints(end)+0.1];
% Calculate the distance from A to B at each P to test
costResult = arrayfun(costFcn,PsToTest);
% Find the minimum cost
[~,lowestCostIndex] = min(costResult);
% Find the optimum P
optimumP = PsToTest(lowestCostIndex);
ds.B = B(optimumP);
semilogx(PsToTest,costResult)
xlabel('P')
ylabel('Distance from A to B')
1.- x is assumed positive real only, because with x<0 then complex values show up.
Since no comment is made in the question it seems reasonable to assume x real and x>0 only.
As requested, P 'the parameter' a scalar, P only has 2 significant states >0 or <0, let's see how is this:
2.- The following lines generate kind-of random A and C.
Then a sweep of p is carried out and distances d1 and d2 are calculated.
d1 is euclidean distance and d2 is the absolute of the difference between A and and B converting both from binary to decimal:
N=10
% A=[1 0 0 1 0]
A=randi([0 1],1,N);
% C=[10 1 1e2 1e3 1]
C=randi([0 1e3],1,N)
p=[-1e4:1:1e4]; % parameter to optimize
B=zeros(1,numel(A));
d1=zeros(1,numel(p)); % euclidean distance
d2=zeros(1,numel(p)); % difference distance
for k1=1:1:numel(p)
B=(C+10).^.5-p(k1)>C.^.5;
d1(k1)=(sum((B-A).^2))^.5;
d2(k1)=abs(sum(A.*2.^[numel(A)-1:-1:0])-sum(B.*2.^[numel(A)-1:-1:0]));
end
figure;
plot(p,d1)
grid on
xlabel('p');title('d1')
figure
plot(p,d2)
grid on
xlabel('p');title('d2')
The only degree of freedom to optimise seems to be the sign of P regardless of |P| value.
3.- f(p,x) has either no root, or just one root, depending upon p
The threshold funtion is
if f(x)>0 then B(k)==1 else B(k)==0
this is
f(p,x)=(x+10)^.5-p-x^.5
Now
(x+10).^.5-p>x.^.5 is same as (x+10).^.5-x.^.5>p
There's a range of p that keeps f(p,x)=0 without any (real) root.
For the particular case p=0 then (x+10).^.5 and x.^.5 do not intersect (until Inf reached = there's no intersection)
figure;plot(x,(x+10).^.5,x,x.^.5);grid on
[![enter image description here][3]][3]
y2=diff((x+10).^.5-x.^.5)
figure;plot(x(2:end),y2);
grid on;xlabel('x')
title('y2=diff((x+10).^.5-x.^.5)')
[![enter image description here][3]][3]
% 005
This means the condition f(x)>0 is always true holding all bits of B=1. With B=1 then d(A,B) turns into d(A,1), a constant.
However, for a certain value of p then there's one root and f(x)>0 is always false keeping all bits of B=0.
In this case d(A,B) the cost function turns into d(A,0) and this is A itself.
4.- P as a vector
The optimization gains in degrees of freedom if instead of P scalar, P is considered as vector.
For a given x there's a value of p that switches B(k) from 0 to 1.
Any value of p below such threshold keeps B(k)=0.
Equivalently, inverting f(x) :
g(p)=(10-p^2)^2/(4*p^2)>x
Values of x below this threshold bring B closer to A because for each element of B it's flipped to the element value of A.
Therefore, it's convenient to consider P as a vector, not a ascalar, and :
For all, or as many (as possible) elements of C to meet c(k)<(10-p^2)^2/(4*p^2) in order to get C=A or
minimize d(A,C)
5.- roots of f(p,x)
syms t positive
p=[-1000:.1:1000];
zp=NaN*ones(1,numel(p));
sol=zeros(1,numel(p));
for k1=1:1:numel(p)
p(k1)
eq1=(t+10)^.5-p(k1)-t^.5-p(k1)==0;
s1=solve(eq1,t);
if ~isempty(s1)
zp(k1)=s1;
end
end
nzp=~isnan(zp);
zp(nzp)
returns
=
620.0100 151.2900 64.5344 34.2225 20.2500 12.7211
8.2451 5.4056 3.5260 2.2500 1.3753 0.7803
0.3882 0.1488 0.0278

MATLAB: How to plot a cubic expression for certain range of input pressure

I have a cubic expression here
I am trying to determine and plot δ𝛿 in the expression for P values of 0.0 to 5000. I'm really struggling to get the expression for δ in terms of the pressure P.
clear all;
close all;
t = 0.335*1e-9;
r = 62*1e-6;
delta = 1.2*1e+9;
E = 1e+12;
v = 0.17;
P = 0:100:5000
P = (4*delta*t)*w/r^2 + (2*E*t)*w^3/((1-v)*r^4);
I would appreciate if anyone could provide pointers.
I suggest two simple methods.
You evaluate P as a function of delta then you plot(P,delta). This is quick and dirty but if all you need is a plot it will do. The inconvenience is that you may to do some guess-and-trial to find the correct interval of P values, but you can also take a large enough value of delta_max and then restrict the x-axis limit of the plot.
Your function is a simple cubic, which you can solve analytically (see here if you are lost) to invert P(delta) into delta(P).
What you want is the functional inverse of your expression, i.e., δ𝛿 as a function of P. Since it's a cubic polynomial, you can expect up to three solutions (roots) for a give value of P. However, I'm guessing that you're only interested in real-valued solutions and nonnegative values of P. In that case there's just one real root for each value of P.
Given the values of your parameters, it makes most sense to solve this numerically using fzero. Using the parameter names in your code (different from equations):
t = 0.335*1e-9;
r = 62*1e-6;
delta = 1.2*1e9;
E = 1e12;
v = 0.17;
f = #(w,p)2*E*t*w.^3/((1-v)*r^4)+4*delta*t*w/r^2-p;
P = 0:100:5000;
w0 = [0 1]; % Bounded initial guess, valid up to very large values of P
w_sol = zeros(length(P),1);
for i = 1:length(P)
w_sol(i) = fzero(#(w)f(w,P(i)),w0); % Find solution for each P
end
figure;
plot(P,w_sol);
You could also solve this using symbolic math:
syms w p
t = 0.335*sym(1e-9);
r = 62*sym(1e-6);
delta = 1.2*sym(1e9);
E = sym(1e12);
v = sym(0.17);
w_sol = solve(p==2*E*t*w^3/((1-v)*r^4)+4*delta*t*w/r^2,w);
P = 0:100:5000;
w_sol = double(subs(w_sol(1),p,P)); % Plug in P values and convert to floating point
figure;
plot(P,w_sol);
Because of your numeric parameter values, solve returns an answer in terms of three RootOf objects, the first of which is the real one you want.

Why is this for loop giving me an error?

So I am trying to go through a for loop that will increment .1 every time and will do this until the another variable h is less than or equal to zero. Then I am suppose to graph this h variable along another variable x. The code that I wrote looks like this:
O = 20;
v = 200;
g = 32.2;
for t = 0:.1:12
% Calculate the height
h(t) = (v)*(t)*(sin(O))-(1/2)*(g)*(t^2);
% Calculate the horizontal location
x(t) = (v)*(t)*cos(O);
if t > 0 && h <= 0
break
end
end
The Error that I keep getting when running this code says "Attempted to access h(0); index must be a positive integer or logical." I don't understand what exactly is going on in order for this to happen. So my question is why is this happening and is there a way I can solve it, Thank you in advance.
You're using t as your loop variable as well as your indexing variable. This doesn't work, because you'll try to access h(0), h(0.1), h(0.2), etc, which doesn't make sense. As the error says, you can only access variables using integers. You could replace your code with the following:
t = 0:0.1:12;
for i = 1:length(t)
% use t(i) instead of t now
end
I will also point out that you don't need to use a for loop to do this. MATLAB is optimised for acting on matrices (and vectors), and will in general run faster on vectorised functions rather than for loops. For instance, your equation for h could be replaced with the following:
O = 20;
v = 200;
g = 32.2;
t = 0:0.1:12;
h = v * t * sin(O) - 0.5 * g * t.^2;
The only difference is that you have to use the element-wise square (.^2) rather than the normal square (^2). This means that MATLAB will square each element of the vector t, rather than multiplying the vector t by itself.
In short:
As the error says, t needs to be an integer or logical.
But your t is t=0:0.1:12, therefore a decimal value.
O = 20;
v = 200;
g = 32.2;
for t = 0:.1:12
% Calculate the height
idx_t = 1:numel(t);
h(idx_t) = (v)*(t)*(sin(O))-(1/2)*(g)*(t^2);
% Calculate the horizontal location
x(idx_t) = (v)*(t)*cos(O);
if t > 0 && h <= 0
break
end
end
Look this question's answer for more options: Subscript indices must either be real positive integers or logical error

Integration via trapezoidal sums in MATLAB

I need help finding an integral of a function using trapezoidal sums.
The program should take successive trapezoidal sums with n = 1, 2, 3, ...
subintervals until there are two neighouring values of n that differ by less than a given tolerance. I want at least one FOR loop within a WHILE loop and I don't want to use the trapz function. The program takes four inputs:
f: A function handle for a function of x.
a: A real number.
b: A real number larger than a.
tolerance: A real number that is positive and very small
The problem I have is trying to implement the formula for trapezoidal sums which is
Δx/2[y0 + 2y1 + 2y2 + … + 2yn-1 + yn]
Here is my code, and the area I'm stuck in is the "sum" part within the FOR loop. I'm trying to sum up 2y2 + 2y3....2yn-1 since I already accounted for 2y1. I get an answer, but it isn't as accurate as it should be. For example, I get 6.071717974723753 instead of 6.101605982576467.
Thanks for any help!
function t=trapintegral(f,a,b,tol)
format compact; format long;
syms x;
oldtrap = ((b-a)/2)*(f(a)+f(b));
n = 2;
h = (b-a)/n;
newtrap = (h/2)*(f(a)+(2*f(a+h))+f(b));
while (abs(newtrap-oldtrap)>=tol)
oldtrap = newtrap;
for i=[3:n]
dx = (b-a)/n;
trapezoidsum = (dx/2)*(f(x) + (2*sum(f(a+(3:n-1))))+f(b));
newtrap = trapezoidsum;
end
end
t = newtrap;
end
The reason why this code isn't working is because there are two slight errors in your summation for the trapezoidal rule. What I am precisely referring to is this statement:
trapezoidsum = (dx/2)*(f(x) + (2*sum(f(a+(3:n-1))))+f(b));
Recall the equation for the trapezoidal integration rule:
Source: Wikipedia
For the first error, f(x) should be f(a) as you are including the starting point, and shouldn't be left as symbolic. In fact, you should simply get rid of the syms x statement as it is not useful in your script. a corresponds to x1 by consulting the above equation.
The next error is the second term. You actually need to multiply your index values (3:n-1) by dx. Also, this should actually go from (1:n-1) and I'll explain later. The equation above goes from 2 to N, but for our purposes, we are going to go from 1 to N-1 as you have your code set up like that.
Remember, in the trapezoidal rule, you are subdividing the finite interval into n pieces. The ith piece is defined as:
x_i = a + dx*i; ,
where i goes from 1 up to N-1. Note that this starts at 1 and not 3. The reason why is because the first piece is already taken into account by f(a), and we only count up to N-1 as piece N is accounted by f(b). For the equation, this goes from 2 to N and by modifying the code this way, this is precisely what we are doing in the end.
Therefore, your statement actually needs to be:
trapezoidsum = (dx/2)*(f(a) + (2*sum(f(a+dx*(1:n-1))))+f(b));
Try this and let me know if you get the right answer. FWIW, MATLAB already implements trapezoidal integration by doing trapz as #ADonda already pointed out. However, you need to properly structure what your x and y values are before you set this up. In other words, you would need to set up your dx before hand, then calculate your x points using the x_i equation that I specified above, then use these to generate your y values. You then use trapz to calculate the area. In other words:
dx = (b-a) / n;
x = a + dx*(0:n);
y = f(x);
trapezoidsum = trapz(x,y);
You can use the above code as a reference to see if you are implementing the trapezoidal rule correctly. Your implementation and using the above code should generate the same results. All you have to do is change the value of n, then run this code to generate the approximation of the area for different subdivisions underneath your curve.
Edit - August 17th, 2014
I figured out why your code isn't working. Here are the reasons why:
The for loop is unnecessary. Take a look at the for loop iteration. You have a loop going from i = [3:n] yet you don't reference the i variable at all in your loop. As such, you don't need this at all.
You are not computing successive intervals properly. What you need to do is when you compute the trapezoidal sum for the nth subinterval, you then increment this value of n, then compute the trapezoidal rule again. This value is not being incremented properly in your while loop, which is why your area is never improving.
You need to save the previous area inside the while loop, then when you compute the next area, that's when you determine whether or not the difference between the areas is less than the tolerance. We can also get rid of that code at the beginning that tries and compute the area for n = 2. That's not needed, as we can place this inside your while loop. As such, this is what your code should look like:
function t=trapintegral(f,a,b,tol)
format long; %// Got rid of format compact. Useless
%// n starts at 2 - Also removed syms x - Useless statement
n = 2;
newtrap = ((b-a)/2)*(f(a) + f(b)); %// Initialize
oldtrap = 0; %// Initialize to 0
while (abs(newtrap-oldtrap)>=tol)
oldtrap = newtrap; %//Save the old area from the previous iteration
dx = (b-a)/n; %//Compute width
%//Determine sum
trapezoidsum = (dx/2)*(f(a) + (2*sum(f(a+dx*(1:n-1))))+f(b));
newtrap = trapezoidsum; % //This is the new sum
n = n + 1; % //Go to the next value of n
end
t = newtrap;
end
By running your code, this is what I get:
trapezoidsum = trapintegral(#(x) (x+x.^2).^(1/3),1,4,0.00001)
trapezoidsum =
6.111776299189033
Caveat
Look at the way I defined your function. You must use element-by-element operations as the sum command inside the loop will be vectorized. Take a look at the ^ operations specifically. You need to prepend a dot to the operations. Once you do this, I get the right answer.
Edit #2 - August 18th, 2014
You said you want at least one for loop. This is highly inefficient, and whoever specified having one for loop in the code really doesn't know how MATLAB works. Nevertheless, you can use the for loop to accumulate the sum term. As such:
function t=trapintegral(f,a,b,tol)
format long; %// Got rid of format compact. Useless
%// n starts at 3 - Also removed syms x - Useless statement
n = 3;
%// Compute for n = 2 first, then proceed if we don't get a better
%// difference tolerance
newtrap = ((b-a)/2)*(f(a) + f(b)); %// Initialize
oldtrap = 0; %// Initialize to 0
while (abs(newtrap-oldtrap)>=tol)
oldtrap = newtrap; %//Save the old area from the previous iteration
dx = (b-a)/n; %//Compute width
%//Determine sum
%// Initialize
trapezoidsum = (dx/2)*(f(a) + f(b));
%// Accumulate sum terms
%// Note that we multiply each term by (dx/2), but because of the
%// factor of 2 for each of these terms, these cancel and we thus have dx
for n2 = 1 : n-1
trapezoidsum = trapezoidsum + dx*f(a + dx*n2);
end
newtrap = trapezoidsum; % //This is the new sum
n = n + 1; % //Go to the next value of n
end
t = newtrap;
end
Good luck!

How can I speed up this call to quantile in Matlab?

I have a MATLAB routine with one rather obvious bottleneck. I've profiled the function, with the result that 2/3 of the computing time is used in the function levels:
The function levels takes a matrix of floats and splits each column into nLevels buckets, returning a matrix of the same size as the input, with each entry replaced by the number of the bucket it falls into.
To do this I use the quantile function to get the bucket limits, and a loop to assign the entries to buckets. Here's my implementation:
function [Y q] = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
if isvector(q)
q=transpose(q);
end
Y = zeros(size(X));
for i = 1:nLevels
% "The variables g and l indicate the entries that are respectively greater than
% or less than the relevant bucket limits. The line Y(g & l) = i is assigning the
% value i to any element that falls in this bucket."
if i ~= nLevels % "The default; doesnt include upper bound"
g = bsxfun(#ge,X,q(i,:));
l = bsxfun(#lt,X,q(i+1,:));
else % "For the final level we include the upper bound"
g = bsxfun(#ge,X,q(i,:));
l = bsxfun(#le,X,q(i+1,:));
end
Y(g & l) = i;
end
Is there anything I can do to speed this up? Can the code be vectorized?
If I understand correctly, you want to know how many items fell in each bucket.
Use:
n = hist(Y,nbins)
Though I am not sure that it will help in the speedup. It is just cleaner this way.
Edit : Following the comment:
You can use the second output parameter of histc
[n,bin] = histc(...) also returns an index matrix bin. If x is a vector, n(k) = >sum(bin==k). bin is zero for out of range values. If x is an M-by-N matrix, then
How About this
function [Y q] = levels(X,nLevels)
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
Y = zeros(size(X));
for i = 1:numel(q)-1
Y = Y+ X>=q(i);
end
This results in the following:
>>X = [3 1 4 6 7 2];
>>[Y, q] = levels(X,2)
Y =
1 1 2 2 2 1
q =
1 3.5 7
You could also modify the logic line to ensure values are less than the start of the next bin. However, I don't think it is necessary.
I think you shoud use histc
[~,Y] = histc(X,q)
As you can see in matlab's doc:
Description
n = histc(x,edges) counts the number of values in vector x that fall
between the elements in the edges vector (which must contain
monotonically nondecreasing values). n is a length(edges) vector
containing these counts. No elements of x can be complex.
I made a couple of refinements (including one inspired by Aero Engy in another answer) that have resulted in some improvements. To test them out, I created a random matrix of a million rows and 100 columns to run the improved functions on:
>> x = randn(1000000,100);
First, I ran my unmodified code, with the following results:
Note that of the 40 seconds, around 14 of them are spent computing the quantiles - I can't expect to improve this part of the routine (I assume that Mathworks have already optimized it, though I guess that to assume makes an...)
Next, I modified the routine to the following, which should be faster and has the advantage of being fewer lines as well!
function [Y q] = levels(X,nLevels)
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
if isvector(q), q = transpose(q); end
Y = ones(size(X));
for i = 2:nLevels
Y = Y + bsxfun(#ge,X,q(i,:));
end
The profiling results with this code are:
So it is 15 seconds faster, which represents a 150% speedup of the portion of code that is mine, rather than MathWorks.
Finally, following a suggestion of Andrey (again in another answer) I modified the code to use the second output of the histc function, which assigns entries to bins. It doesn't treat the columns independently, so I had to loop over the columns manually, but it seems to be performing really well. Here's the code:
function [Y q] = levels(X,nLevels)
p = linspace(0,1,nLevels+1);
q = quantile(X,p);
if isvector(q), q = transpose(q); end
q(end,:) = 2 * q(end,:);
Y = zeros(size(X));
for k = 1:size(X,2)
[junk Y(:,k)] = histc(X(:,k),q(:,k));
end
And the profiling results:
We now spend only 4.3 seconds in codes outside the quantile function, which is around a 500% speedup over what I wrote originally. I've spent a bit of time writing this answer because I think it's turned into a nice example of how you can use the MATLAB profiler and StackExchange in combination to get much better performance from your code.
I'm happy with this result, although of course I'll continue to be pleased to hear other answers. At this stage the main performance increase will come from increasing the performance of the part of the code that currently calls quantile. I can't see how to do this immediately, but maybe someone else here can. Thanks again!
You can sort the columns and divide+round the inverse indexes:
function Y = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
[S,IX]=sort(X);
[grid1,grid2]=ndgrid(1:size(IX,1),1:size(IX,2));
invIX=zeros(size(X));
invIX(sub2ind(size(X),IX(:),grid2(:)))=grid1;
Y=ceil(invIX/size(X,1)*nLevels);
Or you can use tiedrank:
function Y = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
R=tiedrank(X);
Y=ceil(R/size(X,1)*nLevels);
Surprisingly, both these solutions are slightly slower than the quantile+histc solution.