In MATLAB, how can I automatically find a specific portion of a graph using an equation? - matlab

I'm new to MATLAB, and I'm plotting a graph which starts off constant, rises, and then oscillates roughly around a constant value. I want to automatically find the x-coordinate for the point where this begins (about 1100 in the figure shown), and I imagine to automate it I need to do something like finding the first point which falls within certain threshold boundaries.
I'm not really sure how to go about that, though; would you be able to help me out?
I can provide the data if it helps, but I think I'm probably asking a pretty basic conceptual question with a straightforward answer I just don't know how to find.

An obvious answer could be to make up some threshold value, and when the y-value of the graph exceeds this threshold, you know there must have been a jump.
The function that retrieves the x value could look like this:
function x = getXofJump(data, threshold)
base_value = data(1)
x = 1
while abs(data(x + 1) - base_value) < threshold && x + 1 < len(data)
x = x + 1
end
end

Related

Calculating vector lengths within my loop

The code is meant to draw a line with each click of the mouse made within the figure. Here is my code;
b=1
while b>0;
axis([-10, 10, -10, 10])
b=b+1
[x(b),y(b)]=ginput(1)
plot(x,y,x,y)
end
However I cant get my head around how I can add the vectors seeing as some are + and some are -, I need to turn the negatives into positives. I need to add code that will give me an overall combined length after all the mouse clicks. Maybe I am just thinking about it completely wrong.
I have tried;
length=(sqrt(x.^2)+(y.^2))
I was hoping this would give me the correct vector length accept unless I click an exact straight line.
So assuming that x and y are vectors of sequential points, if you want to get the total distance then you need to take the sum of the distance between each point i.e.
Σi(sqrt((xi-xi-1)2+(yi-yi-1)2))
in Matlab we can calculate all the x (and y) differences in one step using the diff function so in the end we get
length = sum(sqrt(diff(x).^2 + diff(y).^2))
Your problem is twofold: there's a typo, I think instead of
length=(sqrt(x.^2)+(y.^2))
you mean
length=sqrt((x.^2)+(y.^2))
The second problem is, that this does not actually calculate the length of your path, instead, you probably want something like
clear all
b=1;
radius = 3;
while b>0;
axis([-10, 10, -10, 10])
b=b+1;
[x(b),y(b)]=ginput(1);
plot(x,y,x,y);
length(b)=sqrt((x(b)-x(b-1))^2+(y(b)-y(b-1))^2);
if sqrt(x(b)^2 + y(b)^2) < radius; break; end
end
sum(length)
which calculates the length for each new piece you add and sums them all up.
As soon as you click within "radius" distance of 0 the while loop breaks.
Also, generally it's good practice to preallocate variables, no big deal here, cause your arrays are small, just saying.
Note: Dan's solution gives you a vectorized way of calculating the total length in one step, so in case you don't need the individual path lengths this is the more concise way to go.

How to remove data points from a data set in Matlab

In Matlab, I have a vector that is a 1x204 double. It represents a biological signal over a certain period of time and over that time the signal varies - sometimes it peaks and goes up and sometimes it remains relatively small, close to the baseline value of 0. I need to plot this the reciprocal of this data (on the xaxis) against another set of data (on the y-axis) in order to do some statistical analysis.
The problem is that due to those points close to 0, for e.g. the smallest point I have is = -0.00497, 1/0.00497 produces a value of -201 and turns into an "outlier", while the rest of the data is very different and the values not as large. So I am trying to remove the very small values close to 0, from the data set so that it does not affect 1/value.
I know that I can use the cftool to remove those points from the plot, but how do I get the vector with those points removed? Is there a way of actually removing the points? From the cftool and removing those points on the original, I was able to generate the code and find out which exact points they are, but I don't know how to create a vector with those points removed.
Can anyone help?
I did try using the following for loop to get it to remove values, with 'total_BOLD_time_course' being my signal and '1/total_BOLD_time_course' is what I want to plot, but the problem with this is that in my if statement total_BOLD_time_course(i) = 1, which is not exactly true - so by doing this the points still exist in the vector but are now taking the value 1. But I just want them to be gone from the vector.
for i = 1:204
if total_BOLD_time_course(i) < 0 && total_BOLD_time_course(i) < -0.01
total_BOLD_time_course(i) = 1;
else if total_BOLD_time_course(i) > 0 && total_BOLD_time_course(i) < 0.01
total_BOLD_time_course(i) = 1 ;
end
end
end
To remove points from an array, use the syntax
total_BOLD_time_course( abs(total_BOLD_time_course<0.01) ) = nan
that makes them 'blank' on the graph, and ignored by further calculations, but without destroying the temporal sequence of the datapoints.
If actually destroying timepoints is not a concern then do
total_BOLD_time_course( abs(total_BOLD_time_course<0.01) ) = []
Then there'll be fewer data points, and they won't map on to any other time_course you have. But the advantage is that it will "close up" the gaps in the graph.
--
PS
note that in your code, the phrase
x<0 && x<-0.01
is redundant because if any number is less than -0.01, it is automatically less than 0. I believe the first should be x>0, and then your code is fine.
As VHarisop suggests, you can set a threshold for outliers and exclude them. But, depending on your plot, it might be important to ensure that the remaining data are not shunted horizontally to fill the gaps. To plot 1./y as a function of x, you could either just plot(x, 1./y) and then set the y limits with ylim to exclude the outliers from view, or use NaNs:
e = 0.01
y( abs(y) < e ) = nan;
plot( x, 1./y )
For quantitative (non-visual) statistical analysis, either remove the values entirely from y as suggested—bearing in mind that this leaves you with a shorter vector—or use statistics functions that know how to treat NaNs as missing data (nanmean, nanstd, etc).
Yeah, you can. You might want to define a threshold, like e = 0.01, and cut off all vector elements whose absolute value is below e.
Example:
# assuming v is your initial vector
e = 0.01
new_vector = v(abs(v) > e);
Alternatively, you could use the excludedata tool from the Curve Fitting Toolbox, since you know the indices of the vector elements you want to exlude.

Trying to find the point when two functions differ from each other a 5% with Matlab

As I've written in the title, I'm trying to find the exact distance (dimensionless distance in this case) when two functions start to differ from each other a 5% of the Y-axis. The two functions intersect at the value of 1 in the X-axis and I need to find the described distance before the intersection, not after (i.e., it must be less than 1). I've written a Matlab code for you to see the shape of the functions and the following calculations which I'm trying to make them work but they don't, I don't know why. "Explicit solution could not be found".
I don't know if I explained it clearly. Please let me know if you need a more detailed explanation.
I hope you can throw some light in this issue.
Thank you so much in advanced.
r=0:0.001:1.2;
ro=0.335;
rt=r./ro;
De=0.3534;
k=2.8552;
B=(2*k/De)^0.5;
Fm=2.*De.*B.*ro.*[1-exp(B.*ro.*(1-rt))].*exp(B.*ro.*(1-rt));
A=5;
b=2.2347;
C=167.4692;
Ftt=(C.*(exp(-b.*rt).*((b.^6.*rt.^5)./120 + (b.^5.*rt.^4)./24 + (b.^4.*rt.^3)./6 + (b.^3.*rt.^2)./2 + b.^2.*rt + b) - b.*exp(-b.*rt).*((b.^6.*rt.^6)./720 + (b.^5.*rt.^5)./120 + (b.^4.*rt.^4)./24 + (b.^3.*rt.^3)./6 + (b.^2.*rt.^2)./2 + b.*rt + 1)))./rt.^6 - (6.*C.*(exp(-b.*rt).*((b.^6.*rt.^6)./720 + (b.^5.*rt.^5)./120 + (b.^4.*rt.^4)./24 + (b.^3.*rt.^3)./6 + (b.^2.*rt.^2)./2 + b.*rt + 1) - 1))./rt.^7 - A.*b.*exp(-b.*rt);
plot(rt,-Fm,'red')
axis([0 2 -1 3])
xlabel('Dimensionless distance')
ylabel('Force, -dU/dr')
hold on
plot(rt,-Ftt,'green')
clear rt
syms rt
%assume(0<rt<1)
r1=solve((Fm-Ftt)/Ftt==0.05,rt)
r2=solve((Ftt-Fm)/Fm==0.05,rt)
Welcome to the crux of floating point data. The reason why is because for the values of r that you are providing, the exact solution of 0.05 may be in between two of the values in your r array and so you won't be able to get an exact solution. Also, FWIW, your equation may never generate a solution of 0.05, which is why you're getting that error too. Either way, doing that explicit solve on floating point data is never recommended, unless you know very well how your data are shaped and what values you expect for the output of the function you're applying the data to.
As such, it's always recommended that you find the nearest value that satisfies your condition. As such, you should do something like this:
[~,ind] = min(abs((Fm-Ftt)./Ftt - 0.05));
r1 = r(ind);
The first line will find the nearest location in your r array that satisfies the 5% criterion. The next line of code will then give you the value that is in your r array that satisfies this. You can do the same with r2 by:
[~,ind2] = min(abs((Ftt-Fm)./Fm - 0.05));
r2 = r(ind2);
What the above code is basically doing is that it is trying to find at what point in your array would the difference between your data and 5% be 0. In other words, which point in your r array would be close enough to make the above relation equal to 0, or essentially when it is as close to 5% as possible.
If you want to improve this, you can always change the step size of r... perhaps make it 0.00001 or something. However, the smaller the step size, the larger your array and you'll eventually run out of memory!

How to generate random matlab vector with these constraints

I'm having trouble creating a random vector V in Matlab subject to the following set of constraints: (given parameters N,D, L, and theta)
The vector V must be N units long
The elements must have an average of theta
No 2 successive elements may differ by more than +/-10
D == sum(L*cosd(V-theta))
I'm having the most problems with the last one. Any ideas?
Edit
Solutions in other languages or equation form are equally acceptable. Matlab is just a convenient prototyping tool for me, but the final algorithm will be in java.
Edit
From the comments and initial answers I want to add some clarifications and initial thoughts.
I am not seeking a 'truly random' solution from any standard distribution. I want a pseudo randomly generated sequence of values that satisfy the constraints given a parameter set.
The system I'm trying to approximate is a chain of N links of link length L where the end of the chain is D away from the other end in the direction of theta.
My initial insight here is that theta can be removed from consideration until the end, since (2) in essence adds theta to every element of a 0 mean vector V (shifting the mean to theta) and (4) simply removes that mean again. So, if you can find a solution for theta=0, the problem is solved for all theta.
As requested, here is a reasonable range of parameters (not hard constraints, but typical values):
5<N<200
3<D<150
L==1
0 < theta < 360
I would start by creating a "valid" vector. That should be possible - say calculate it for every entry to have the same value.
Once you got that vector I would apply some transformations to "shuffle" it. "Rejection sampling" is the keyword - if the shuffle would violate one of your rules you just don't do it.
As transformations I come up with:
switch two entries
modify the value of one entry and modify a second one to keep the 4th condition (Theoretically you could just shuffle two till the condition is fulfilled - but the chance that happens is quite low)
But maybe you can find some more.
Do this reasonable often and you get a "valid" random vector. Theoretically you should be able to get all valid vectors - practically you could try to construct several "start" vectors so it won't take that long.
Here's a way of doing it. It is clear that not all combinations of theta, N, L and D are valid. It is also clear that you're trying to simulate random objects that are quite complex. You will probably have a hard time showing anything useful with respect to these vectors.
The series you're trying to simulate seems similar to the Wiener process. So I started with that, you can start with anything that is random yet reasonable. I then use that as a starting point for an optimization that tries to satisfy 2,3 and 4. The closer your initial value to a valid vector (satisfying all your conditions) the better the convergence.
function series = generate_series(D, L, N,theta)
s(1) = theta;
for i=2:N,
s(i) = s(i-1) + randn(1,1);
end
f = #(x)objective(x,D,L,N,theta)
q = optimset('Display','iter','TolFun',1e-10,'MaxFunEvals',Inf,'MaxIter',Inf)
[sf,val] = fminunc(f,s,q);
val
series = sf;
function value= objective(s,D,L,N,theta)
a = abs(mean(s)-theta);
b = abs(D-sum(L*cos(s-theta)));
c = 0;
for i=2:N,
u =abs(s(i)-s(i-1)) ;
if u>10,
c = c + u;
end
end
value = a^2 + b^2+ c^2;
It seems like you're trying to simulate something very complex/strange (a path of a given curvature?), see questions by other commenters. Still you will have to use your domain knowledge to connect D and L with a reasonable mu and sigma for the Wiener to act as initialization.
So based on your new requirements, it seems like what you're actually looking for is an ordered list of random angles, with a maximum change in angle of 10 degrees (which I first convert to radians), such that the distance and direction from start to end and link length and number of links are specified?
Simulate an initial guess. It will not hold with the D and theta constraints (i.e. specified D and specified theta)
angles = zeros(N, 1)
for link = 2:N
angles (link) = theta(link - 1) + (rand() - 0.5)*(10*pi/180)
end
Use genetic algorithm (or another optimization) to adjust the angles based on the following cost function:
dx = sum(L*cos(angle));
dy = sum(L*sin(angle));
D = sqrt(dx^2 + dy^2);
theta = atan2(dy/dx);
the cost is now just the difference between the vector given by my D and theta above and the vector given by the specified D and theta (i.e. the inputs).
You will still have to enforce the max change of 10 degrees rule, perhaps that should just make the cost function enormous if it is violated? Perhaps there is a cleaner way to specify sequence constraints in optimization algorithms (I don't know how).
I feel like if you can find the right optimization with the right parameters this should be able to simulate your problem.
You don't give us a lot of detail to work with, so I'll assume the following:
random numbers are to be drawn from [-127+theta +127-theta]
all random numbers will be drawn from a uniform distribution
all random numbers will be of type int8
Then, for the first 3 requirements, you can use this:
N = 1e4;
theta = 40;
diffVal = 10;
g = #() randi([intmin('int8')+theta intmax('int8')-theta], 'int8') + theta;
V = [g(); zeros(N-1,1, 'int8')];
for ii = 2:N
V(ii) = g();
while abs(V(ii)-V(ii-1)) >= diffVal
V(ii) = g();
end
end
inline the anonymous function for more speed.
Now, the last requirement,
D == sum(L*cos(V-theta))
is a bit of a strange one...cos(V-theta) is a specific way to re-scale the data to the [-1 +1] interval, which the multiplication with L will then scale to [-L +L]. On first sight, you'd expect the sum to average out to 0.
However, the expected value of cos(x) when x is a random variable from a uniform distribution in [0 2*pi] is 2/pi (see here for example). Ignoring for the moment the fact that our limits are different from [0 2*pi], the expected value of sum(L*cos(V-theta)) would simply reduce to the constant value of 2*N*L/pi.
How you can force this to equal some other constant D is beyond me...can you perhaps elaborate on that a bit more?

Calculating the maximum distance between elements of vector in MATLAB

Let's assume that we have a vector like
x = -1:0.05:1;
ids = randperm(length(x));
x = x(ids(1:20));
I would like to calculate the maximum distance between the elements of x in some idiomatic way. It would be easy to just iterate over all possible combinations of x's elements but I feel like there could be a way to do it with MATLAB's built-in functions in some crazy but idiomatic way.
What about
max_dist = max(x) - min(x)
?
Do you mean the difference between the largest and smallest elements in your vector ? If you do, then something like this will work:
max(x) - min(x)
If you don't, then I've misunderstood the question.
This is an interpoint distance computation, although a simple one, since you are working in one dimension. Really that point which falls at a maximum distance in one dimension is always one of two possible points. So all you need do is grab the minimum value and the maximum value from the list, and see which is farther away from the point in question. So assuming that the numbers in x are real numbers, this will work:
xmin = min(x);
xmax = max(x);
maxdistance = max(x - xmin,xmax - x);
As an alternative, some time ago I put a general interpoint distance computation tool up on the file exchange (IPDM). It is smart enough to special case simple problems like the 1-d farthest point problem. This call would do it for you:
D = ipdm(x,'subset','farthest','result','struct');
Of course, it will not be as efficient as the simple code I wrote above, since it is a fully general tool.
Uhh... would love to have a MATLAB at my hands and its still early in the morning, but what about something like:
max_dist = max(x(2:end) - x(1:end-1));
I don't know if this is what You are looking for.