How can I elegantly determine a central interval of a given distance M in an array of length N?

I have an array x=1:N. I want to visualize the central part of a curve determined by x, say only the part xx=N/2-M/2:N/2+M/2. I know I can do this if I round everything (N and M can be anything), but this makes a simple indexing operation quite lengthy and unreadable. Is there a more elegant way of doing this?

Adjust your thinking: express the size of the interval in terms of its 'radius' (call it m) rather than its 'diameter' (M), and voilà:
xx = median(x)-m:median(x)+m
That's much more elegant, isn't it? Since you'll probably want integer indices everywhere, try
xx = floor(median(x)-m):ceil(median(x)+m)
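For example (a quick sketch, with N = 11 and a 'radius' of m = 3 chosen just for illustration):
N = 11; m = 3;               % example values
x = 1:N;
xx = floor(median(x)-m):ceil(median(x)+m);   % median(x) is 6, so xx is 3:9
plot(x(xx))                  % visualize only the central part of the curve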

Rounding is implicitly done by MATLAB on integer types, so you can simply convert M and N to integers:
N = uint32(N);
M = uint32(M);
xx = N/2-M/2:N/2+M/2;
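A small sketch of what that buys you (keep in mind that integer division in MATLAB rounds to the nearest integer rather than truncating):
N = uint32(7);
M = uint32(3);
N/2              % 7/2 = 3.5, rounds to 4
M/2              % 3/2 = 1.5, rounds to 2
xx = N/2-M/2 : N/2+M/2;   % 2:6, no explicit floor/ceil needed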

Related

Matlab: Comparing two vectors with different lengths and different values?

Let's say I have two vectors A and B with different lengths (length(A) is not equal to length(B)), and the values in vector A are not the same as the values in vector B. I want to compare each value of B with the values of A (by "compare" I mean: check whether B(i) is almost the same as some value in A(1:end), for example B(i)-Tolerance < A(j) < B(i)+Tolerance).
How can I do this without using a for loop, since the data is huge?
I know ismember, intersect, repmat and find, but none of those functions can really help me here.
You may try a solution along these lines:
tol = 0.1;
N = 1000000;
a = randn(1, N)*1000;   % create some random data
b = a + tol*rand(1, N); % b is within "tol" of a
a_bin = floor(a/tol);
b_bin = floor(b/tol);
result = ismember(b_bin, a_bin) | ...
         ismember(b_bin, a_bin-1) | ...
         ismember(b_bin, a_bin+1);
find(result==0) % should be empty matrix.
The idea is to discretize the a and b variables to bins of size tol. Then, you ask whether b is found in the same bin as any element from a, or in the bin to the left of it, or in the bin to the right of it.
Advantages: I believe ismember is clever inside, first sorting the elements of a and then performing a sublinear (log(N)) search per element of b. This is unlike approaches that explicitly construct the differences of each element of b with all the elements of a, where the cost per element of b is linear in the length of a.
Comparison: for N=100000 this runs in 0.04 s on my machine, compared to 20 s using linear search (timed using Alan's nice and concise tf = arrayfun(@(bi) any(abs(a - bi) < tol), b); solution).
Disadvantages: the effective tolerance ends up being anywhere between tol and 2*tol - a true match within tol is always found, but elements up to almost 2*tol apart can still land in adjacent bins. Whether you can live with that depends on your task (if the only concern is floating-point comparison, you can).
Note: whether this is a viable approach depends on the ranges of a and b, and on the value of tol. If a and b can be very big and tol is very small, a_bin and b_bin will not be able to resolve individual bins (then you would have to work with integer types, again checking carefully that their ranges suffice). The solution with loops is a safer one, but if you really need speed, you can invest in optimizing the presented idea. Another option, of course, would be to write a mex extension.
It sounds like what you are trying to do is have an ismember function for use on real-valued data.
That is, check for each value B(i) in your vector B whether B(i) is within the tolerance threshold T of at least one value in your vector A.
This works out something like the following:
tf = false(1, length(b)); % the result vector: true if that element of b is in a
t = 0.01;                 % the tolerance threshold
for i = 1:length(b)
    % is the absolute difference between the
    % element of a and b less than the threshold?
    matches = abs(a - b(i)) < t;
    % true if b(i) matches any of the elements of a
    tf(i) = any(matches);
end
Or, in short:
t = 0.01;
tf = arrayfun(@(bi) any(abs(a - bi) < t), b);
Regarding avoiding the for loop: while this might benefit from vectorization, you may also want to consider looking at parallelisation if your data is that huge. In that case having a for loop as in my first example can be handy since you can easily do a basic version of parallel processing by changing the for to parfor.
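A rough parfor version of the loop above might look like this (a sketch only; it assumes the Parallel Computing Toolbox is available and that a worker pool has been started, e.g. with parpool):
t = 0.01;                     % the tolerance threshold
tf = false(1, length(b));
parfor i = 1:length(b)
    % each iteration is independent of the others, so the workers can
    % process different chunks of b in parallel
    tf(i) = any(abs(a - b(i)) < t);
end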
Here is a fully vectorized solution. Note that I would actually recommend the solution given by @Alan, as mine is not likely to work for big datasets.
[X, Y] = meshgrid(A, B);
M = abs(X - Y) < tolerance;
Now the logical index of the elements of A that are within tolerance of some element of B can be obtained with any(M), and the corresponding index for B is found with any(M,2).
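Putting that together (a small sketch, assuming A and B are row vectors and tolerance is already defined):
[X, Y] = meshgrid(A, B);        % X(i,j) = A(j), Y(i,j) = B(i)
M = abs(X - Y) < tolerance;     % M(i,j) is true when B(i) matches A(j)
idxA = find(any(M, 1));         % indices of elements of A matched by some B
idxB = find(any(M, 2));         % indices of elements of B matched by some A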
bsxfun to the rescue
>> M = abs( bsxfun(@minus, A, B') ); % difference matrix
>> M < tolerance
Another way to do what you want is with a logical expression.
Since A and B are vectors of different sizes you can't simply subtract and look for values that are smaller than the tolerance, but you can do the following:
Lmat = sparse((abs(repmat(A,[numel(B) 1])-repmat(B',[1 numel(A)])))<tolerance);
and you will get a sparse logical matrix with as many ones in it as equal elements (within tolerance). You could then count how many of those elements you have by writing:
Nequal = sum(sum(Lmat));
You could also get the indexes of the corresponding elements by writing:
[r,c] = find(Lmat);
then the following will hold (for all j in 1:numel(r)), up to the tolerance:
B(r(j))==A(c(j))
Finally, you should note that this way you get multiple counts in case there are duplicate entries in A or in B. It may be advisable to use the unique function first. For example:
A_new = unique(A);

Double sqrt solution in Matlab?

I would like to know how I can get both the positive and the negative solution from a sqrt in Matlab.
For example if I have:
sin(a) = sqrt(1-cos(a)^2);
The docs don't say anything specific about always returning only the positive square root, but it does seem like a fair assumption, in which case you can get the negative root pretty easily like this:
p = sqrt(1-cos(a)^2);
n = -sqrt(1-cos(a)^2);
By the way, assigning to sin(a) like that is going to create a variable called sin, which will hide the built-in sin function and lead to many possible errors, so I would highly recommend choosing a different variable name.
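A minimal sketch of that pitfall (using an integer value for a, since otherwise the assignment itself fails as an invalid index):
a = 2;
sin(a) = sqrt(1 - cos(a)^2);   % creates a VARIABLE named sin with its 2nd element set
sin(pi/2)                      % error: this now indexes the variable, not the function
clear sin                      % removes the variable so the built-in sin is usable again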
MATLAB (and every other programming language that I know of) only returns the principal square root of x when calling sqrt(x) or equivalent.
How you'd write the square root of x mathematically, is
s = ±√x
which is just a shorthand for writing the whole solution set
s = {+√x, -√x}
In MATLAB, you'd write it the same as this last case, but with slightly different syntax,
s = [+sqrt(x) -sqrt(x)]
which can be computed more efficiently if you "factor out" the sqrt:
s = sqrt(x) * [1 -1]
So, for your case,
s = sqrt(1-cos(a)^2) * [1 -1]
or, if you so desire,
s = sin(acos(cos(a))) * [1 -1]
which is a tad slower, but perhaps more readable (and actually a bit more accurate as well).
Of course, if you can somehow find the components whose quotient results in the value of your cosine, then you wouldn't have to deal with all this messy business at all...
sqrt does not solve equations; it only gives numerical output. You will need to formulate your equation as you need it, and then you can use sqrt(...) and -1*sqrt(...) to get your positive and negative outputs.

How to generate random matlab vector with these constraints

I'm having trouble creating a random vector V in Matlab subject to the following set of constraints: (given parameters N,D, L, and theta)
The vector V must be N units long
The elements must have an average of theta
No 2 successive elements may differ by more than +/-10
D == sum(L*cosd(V-theta))
I'm having the most problems with the last one. Any ideas?
Edit
Solutions in other languages or equation form are equally acceptable. Matlab is just a convenient prototyping tool for me, but the final algorithm will be in java.
Edit
From the comments and initial answers I want to add some clarifications and initial thoughts.
I am not seeking a 'truly random' solution from any standard distribution. I want a pseudo-randomly generated sequence of values that satisfies the constraints, given a parameter set.
The system I'm trying to approximate is a chain of N links of link length L where the end of the chain is D away from the other end in the direction of theta.
My initial insight here is that theta can be removed from consideration until the end, since (2) in essence adds theta to every element of a 0 mean vector V (shifting the mean to theta) and (4) simply removes that mean again. So, if you can find a solution for theta=0, the problem is solved for all theta.
As requested, here is a reasonable range of parameters (not hard constraints, but typical values):
5<N<200
3<D<150
L==1
0 < theta < 360
I would start by creating a "valid" vector. That should be possible - say, calculate it so that every entry has the same value.
Once you have that vector, apply some transformations to "shuffle" it. "Rejection sampling" is the keyword - if a shuffle would violate one of your rules, you just don't do it.
As transformations I come up with:
switch two entries
modify the value of one entry and modify a second one to keep the 4th condition (theoretically you could just shuffle two until the condition is fulfilled - but the chance that this happens is quite low)
But maybe you can find some more.
Do this reasonably often and you get a "valid" random vector. Theoretically you should be able to reach all valid vectors; practically, you could construct several different "start" vectors so it won't take that long.
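A minimal sketch of the swap-and-reject move (the first transformation above), assuming degrees, a scalar L, a small tolerance tol, and that a feasible starting vector V has somehow been constructed already:
tol = 1e-3;
satisfies = @(W) abs(mean(W) - theta) < tol && ...            % condition 2
                 all(abs(diff(W)) <= 10) && ...               % condition 3
                 abs(D - sum(L*cosd(W - theta))) < tol;       % condition 4
for k = 1:10000                        % number of proposed shuffles (arbitrary)
    ij = randi(numel(V), 1, 2);        % pick two positions at random
    W = V;
    W(ij) = W(fliplr(ij));             % proposed move: swap the two entries
    if satisfies(W)
        V = W;                         % keep the shuffle only if it stays valid
    end
end
Because a plain swap leaves both the mean and the sum of cosines untouched, only condition 3 can actually be broken by this particular move; the other checks are there mostly for illustration.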
Here's a way of doing it. It is clear that not all combinations of theta, N, L and D are valid. It is also clear that you're trying to simulate random objects that are quite complex. You will probably have a hard time showing anything useful with respect to these vectors.
The series you're trying to simulate seems similar to a Wiener process, so I started with that; you can start with anything that is random yet reasonable. I then use that as a starting point for an optimization that tries to satisfy 2, 3 and 4. The closer your initial value is to a valid vector (satisfying all your conditions), the better the convergence.
function series = generate_series(D, L, N, theta)
s(1) = theta;
for i = 2:N
    s(i) = s(i-1) + randn(1,1);
end
f = @(x) objective(x, D, L, N, theta);
q = optimset('Display', 'iter', 'TolFun', 1e-10, 'MaxFunEvals', Inf, 'MaxIter', Inf);
[sf, val] = fminunc(f, s, q);
val
series = sf;

function value = objective(s, D, L, N, theta)
a = abs(mean(s) - theta);
b = abs(D - sum(L*cosd(s - theta)));   % cosd, since the angles and theta are in degrees
c = 0;
for i = 2:N
    u = abs(s(i) - s(i-1));
    if u > 10
        c = c + u;
    end
end
value = a^2 + b^2 + c^2;
It seems like you're trying to simulate something very complex/strange (a path of a given curvature?); see the questions by the other commenters. Still, you will have to use your domain knowledge to connect D and L with a reasonable mu and sigma for the Wiener process used as the initialization.
So based on your new requirements, it seems like what you're actually looking for is an ordered list of random angles, with a maximum change in angle of 10 degrees (which I first convert to radians), such that the distance and direction from start to end and link length and number of links are specified?
Simulate an initial guess. It will not satisfy the D and theta constraints (i.e. the specified D and the specified theta):
angles = zeros(N, 1);
for link = 2:N
    angles(link) = angles(link - 1) + (rand() - 0.5)*(10*pi/180);
end
Use a genetic algorithm (or another optimization) to adjust the angles based on the following cost function:
dx = sum(L*cos(angles));
dy = sum(L*sin(angles));
D = sqrt(dx^2 + dy^2);
theta = atan2(dy, dx);
The cost is now just the difference between the vector given by my D and theta above and the vector given by the specified D and theta (i.e. the inputs).
You will still have to enforce the max-change-of-10-degrees rule; perhaps violating it should simply make the cost function enormous, as in the rough sketch below? There may be a cleaner way to specify sequence constraints in optimization algorithms (I don't know how).
I feel like if you can find the right optimization with the right parameters this should be able to simulate your problem.
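As a rough illustration, a penalty-based cost function along those lines could look like this (a sketch only; chain_cost, D_target and theta_target are placeholder names, angles are assumed to be in radians and theta_target in degrees, and the weight 1e6 is arbitrary):
function cost = chain_cost(angles, L, D_target, theta_target)
% end-to-end vector implied by the candidate chain of link angles
dx = sum(L*cos(angles));
dy = sum(L*sin(angles));
% squared difference from the specified end-to-end vector
ex = D_target*cosd(theta_target) - dx;
ey = D_target*sind(theta_target) - dy;
% crude penalty whenever successive angles differ by more than 10 degrees
penalty = 1e6 * sum(max(abs(diff(angles)) - 10*pi/180, 0).^2);
cost = ex^2 + ey^2 + penalty;
Something like cost_fun = @(a) chain_cost(a, L, D, theta); could then be handed to ga or fminsearch.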
You don't give us a lot of detail to work with, so I'll assume the following:
random numbers are to be drawn from [-127+theta +127-theta]
all random numbers will be drawn from a uniform distribution
all random numbers will be of type int8
Then, for the first 3 requirements, you can use this:
N = 1e4;
theta = 40;
diffVal = 10;
g = #() randi([intmin('int8')+theta intmax('int8')-theta], 'int8') + theta;
V = [g(); zeros(N-1,1, 'int8')];
for ii = 2:N
V(ii) = g();
while abs(V(ii)-V(ii-1)) >= diffVal
V(ii) = g();
end
end
Inline the anonymous function for more speed.
Now, the last requirement,
D == sum(L*cos(V-theta))
is a bit of a strange one... cos(V-theta) is a specific way to re-scale the data to the [-1 +1] interval, which the multiplication with L will then scale to [-L +L]. At first sight, you'd expect the sum to average out to 0.
However, the expected value of cos(x) when x is a random variable drawn uniformly from [-pi/2 +pi/2] is 2/pi (see here for example). Ignoring for the moment the fact that our limits are different from [-pi/2 +pi/2], the expected value of sum(L*cos(V-theta)) would then simply reduce to the constant value 2*N*L/pi.
How you can force this to equal some other constant D is beyond me...can you perhaps elaborate on that a bit more?

What's the most idiomatic way to create a vector with a 1 at index i?

In Matlab, suppose I would like to create a 0-vector of length L, except with a 1 at index i?
For example, something like:
>> mostlyzeros(6, 3)
ans =
0 0 1 0 0 0
The purpose is so I can use it as a 'selection' vector which I'll multiply element-wise with another vector.
The simplest way I can think of is this:
a = (1:N)==m;
where N>=m. Having said that, if you want to use the resulting vector as a "selection vector", I don't know why you'd multiply two vectors elementwise, as I would expect that to be relatively slow and inefficient. If you want to get a vector containing only the m-th value of vector v in the m-th position, this would be a more straightforward method:
b = ((1:N)==m)*v(m);
Although the most natural method would have to be this:
b(N)=0;
b(m)=v(m);
assuming that b isn't defined before this (if b already exists, you need to use zeros rather than just assigning the N-th value; in my experience, creating a zero vector or matrix that didn't exist before is most easily done by assigning zero to its last element, and the same trick is useful for extending a matrix or vector).
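For instance, a tiny worked version of that growth trick (the names are just for illustration):
clear b        % make sure b does not already exist
b(6) = 0       % creates b = [0 0 0 0 0 0] in one assignment
b(3) = 1       % then place the 1 (or v(m), for a selection vector)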
I'm having a hard time thinking of anything more sensible than:
Vec = zeros(1, L);
Vec(i) = 1;
But I'd be happy to be proven wrong!
UPDATE: The one-liner solution provided by @GlenO is very neat! However, be aware that if efficiency is the chief criterion, then a few speed tests on my machine indicate that the simple method proposed in this answer and the other two answers is 3 or 4 times faster...
NEXT UPDATE: Ah! So that's what you mean by "selection vectors". @GlenO has given a good explanation of why, for this operation, a vector of ones and zeros is not idiomatic Matlab - however you choose to build it.
P.S. Try to avoid using i as a subscript, since it is actually a MATLAB function (the imaginary unit).
Just for the fun of it, another one-liner:
function out = mostlyzeros(L, idx)
out([L, idx]) = [0 1];
I can think of:
function mat = mostlyones(m, n)
mat = zeros(1, m);
mat(n) = 1;
Also, one thing to note: in MATLAB, indexing starts from one and not from zero, so a call like mostlyzeros(6,3) puts the 1 at the third element.
I would simply create a zero-vector and change whatever value you like to one:
function a = zeroWithOne(numOfZeros, pos)
a = zeros(numOfZeros, 1);
a(pos) = 1;
Another one-line option, which should be fast, is:
vec = sparse(1, ii, 1, 1, L);
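And a short usage sketch of the sparse version for the element-wise selection mentioned in the question (assuming L and vec from above):
v = rand(1, L);              % some other vector of the same length
selected = v .* vec;         % element-wise product; the result stays sparse
selected = full(selected);   % convert back to a full vector if that is needed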

Calculating the maximum distance between elements of vector in MATLAB

Let's assume that we have a vector like
x = -1:0.05:1;
ids = randperm(length(x));
x = x(ids(1:20));
I would like to calculate the maximum distance between the elements of x in some idiomatic way. It would be easy to just iterate over all possible combinations of x's elements but I feel like there could be a way to do it with MATLAB's built-in functions in some crazy but idiomatic way.
What about
max_dist = max(x) - min(x)
?
Do you mean the difference between the largest and smallest elements in your vector? If you do, then something like this will work:
max(x) - min(x)
If you don't, then I've misunderstood the question.
This is an interpoint distance computation, although a simple one, since you are working in one dimension. The point that lies at maximum distance from any given point in one dimension is always one of two candidates: the smallest or the largest value in the list. So all you need to do is grab the minimum and the maximum from the list and see which is farther away from the point in question. Assuming the numbers in x are real, this will work:
xmin = min(x);
xmax = max(x);
maxdistance = max(x - xmin,xmax - x);
As an alternative, some time ago I put a general interpoint distance computation tool up on the file exchange (IPDM). It is smart enough to special case simple problems like the 1-d farthest point problem. This call would do it for you:
D = ipdm(x,'subset','farthest','result','struct');
Of course, it will not be as efficient as the simple code I wrote above, since it is a fully general tool.
Uhh... I would love to have MATLAB at hand, and it's still early in the morning, but what about something like:
max_dist = max(x(2:end) - x(1:end-1));
I don't know if this is what you are looking for.