I am trying to plot some data. The script I wrote below has worked fine before, but now I have no idea why it's not working.
Here is the code:
x = [335,41,14,18,15,9,7,9,20607,5,5,143,3,5,72,134,2,28,172,3,72,173,280,186,20924,1,1,22,3,3,1,2,13,1,3,2,11,66,12983,176,123,192,64,258,182,123,299,58,198,7,113,342,72,8376,122,20,19,2,3,28,8,36,8,56,43,2,48,127,395,4664,186,46,236,219,258,69,203,189,169,72,100,78,109,46,112,3929,272,40,4,31,2,97,36,5,35,56,2,237,1672,256,224,28,163,341,151,263,157,397,94,380,173,75,87,272,1194,133,6,112,1,6,2,26,25,64,8,40,57,106,525,150,248,125,269,264,256,357,153,64,152,283,1,2,2,454,154,39,1,1,64,151,242,1,18,99,1,36,607,55,54,110,225,108,37,1,144,162,137,107,21,360,362,18,51,25,43,1,3,6,1,27,7,45,326,32,103,50,124,155,39,180,143,33,116,46,7,151,120,19,4,2,4,110,2,7,4,9,4,27,216,323,148,1,1,2,1,47,113,150,1,2,144,16,4827,1,1,1,14];
size = length(x);
disp(size);
z = 0;
for i = 1:size
z = z + 1;
y(i) = z;
end
scatter(x,y);
This code should ensure that y has the same length as x, since the for loop runs from 1 to size (the number of elements in x) and fills one element of y per iteration, but I keep getting an error. I checked with disp and it turns out that my x and y vectors have different lengths: x has 227 elements and y has 256. Can anyone help me out with this trivial issue?
This is most likely because y already existed with a different size before you ran the code you showed us. Somewhere earlier in your script, y was created as a 256-element vector, and this part of the code reuses that variable by assigning into its elements. The variable x has 227 elements, so your loop overwrites only the first 227 elements of y. The remaining 29 elements are still there from before. Reusing y is probably why your script fails, as the two variables no longer have the same size. As such, explicitly recreate y before calling scatter.
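To make the stale-variable effect concrete, here is a minimal sketch. The 256-element y and the random x are stand-ins I made up for illustration; only the lengths match the question.

```matlab
% Hypothetical leftover from an earlier part of the script
y = zeros(1, 256);
x = rand(1, 227);            % stand-in for the 227 data points

for i = 1:numel(x)
    y(i) = i;                % overwrites only the first 227 elements
end
staleLength = numel(y);      % still 256, so scatter(x, y) would complain

% Recreate y explicitly before filling it
y = zeros(1, numel(x));
for i = 1:numel(x)
    y(i) = i;
end
freshLength = numel(y);      % now matches numel(x)
```

Assigning into an existing vector never shrinks it, which is why the extra 29 elements survive the loop.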
Actually, that for loop is not needed at all. The purpose of the loop is to create an increasing array from 1 up to as many elements as you have in x.
Just do this instead:
y = 1:size;
I also do not like that you are creating a variable called size. It shadows the built-in function size, which returns the number of elements in each dimension of the input array.
With the recommendations I've stated above, replace your entire code with this:
x = [335,41,14,18,15,9,7,9,20607,5,5,143,3,5,72,134,2,28,172,3,72,173,280,186,20924,1,1,22,3,3,1,2,13,1,3,2,11,66,12983,176,123,192,64,258,182,123,299,58,198,7,113,342,72,8376,122,20,19,2,3,28,8,36,8,56,43,2,48,127,395,4664,186,46,236,219,258,69,203,189,169,72,100,78,109,46,112,3929,272,40,4,31,2,97,36,5,35,56,2,237,1672,256,224,28,163,341,151,263,157,397,94,380,173,75,87,272,1194,133,6,112,1,6,2,26,25,64,8,40,57,106,525,150,248,125,269,264,256,357,153,64,152,283,1,2,2,454,154,39,1,1,64,151,242,1,18,99,1,36,607,55,54,110,225,108,37,1,144,162,137,107,21,360,362,18,51,25,43,1,3,6,1,27,7,45,326,32,103,50,124,155,39,180,143,33,116,46,7,151,120,19,4,2,4,110,2,7,4,9,4,27,216,323,148,1,1,2,1,47,113,150,1,2,144,16,4827,1,1,1,14];
numX = numel(x);
y = 1 : numX;
scatter(x,y);
The vector y is now explicitly created instead of reusing a variable that was created earlier with a different size. It also uses the colon operator to create this sequence instead of a for loop, which is simply not needed. numel determines the total number of elements in an input matrix. As a personal preference I don't use length, because it returns the number of elements in the largest dimension. This may work fine for vectors, but it has caused some hard-to-spot bugs in code that I've written in the past.
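As a small illustration of the difference between length and numel (the matrix A below is just a made-up example):

```matlab
A = zeros(3, 5);        % 3-by-5 matrix
lenA = length(A);       % size of the largest dimension: 5
numA = numel(A);        % total number of elements: 15

v = 1:7;                % for a vector the two agree
lenV = length(v);       % 7
numV = numel(v);        % 7
```

For vectors the two functions return the same value, so the choice only matters once a matrix sneaks in.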
I'm having trouble creating a random vector V in Matlab subject to the following set of constraints: (given parameters N,D, L, and theta)
1. The vector V must be N units long
2. The elements must have an average of theta
3. No 2 successive elements may differ by more than +/-10
4. D == sum(L*cosd(V-theta))
I'm having the most problems with the last one. Any ideas?
Edit
Solutions in other languages or equation form are equally acceptable. Matlab is just a convenient prototyping tool for me, but the final algorithm will be in java.
Edit
From the comments and initial answers I want to add some clarifications and initial thoughts.
I am not seeking a 'truly random' solution from any standard distribution. I want a pseudo randomly generated sequence of values that satisfy the constraints given a parameter set.
The system I'm trying to approximate is a chain of N links of link length L where the end of the chain is D away from the other end in the direction of theta.
My initial insight here is that theta can be removed from consideration until the end, since (2) in essence adds theta to every element of a 0 mean vector V (shifting the mean to theta) and (4) simply removes that mean again. So, if you can find a solution for theta=0, the problem is solved for all theta.
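A quick numerical check of that insight, as a sketch only. The vector V0 and the value of theta are made-up numbers; V0 is chosen to have zero mean.

```matlab
theta = 40;                       % hypothetical direction in degrees
L = 1;
V0 = [-6 -2 3 5 0];               % hypothetical zero-mean vector (degrees)

V = V0 + theta;                   % shifting moves the mean to theta
lhs = sum(L * cosd(V - theta));   % the D constraint evaluated on V
rhs = sum(L * cosd(V0));          % the same constraint with theta = 0
meanV = mean(V);
```

Here lhs and rhs agree, so any V0 that solves the theta = 0 problem gives a solution for an arbitrary theta after the shift.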
As requested, here is a reasonable range of parameters (not hard constraints, but typical values):
5<N<200
3<D<150
L==1
0 < theta < 360
I would start by creating a "valid" vector. That should be possible, for example by calculating a single value to use for every entry.
Once you have that vector, I would apply some transformations to "shuffle" it. "Rejection sampling" is the keyword: if a shuffle would violate one of your rules, you just don't do it.
The transformations I came up with are:
switch two entries
modify the value of one entry and modify a second one to keep the 4th condition (Theoretically you could just shuffle two till the condition is fulfilled - but the chance that happens is quite low)
But maybe you can find some more.
Do this reasonably often and you get a "valid" random vector. Theoretically you should be able to reach all valid vectors; in practice you could construct several "start" vectors so it won't take that long.
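Here is a rough sketch of the swap transformation with rejection. It rests on some assumptions of mine: the start vector is a simple ramp around theta rather than a constant vector, D is computed from that start vector instead of being given, and the parameter values are made up. Swapping two entries leaves both the mean and the cosd sum unchanged, so only the step rule needs checking.

```matlab
% Hypothetical parameters; D is derived from the start vector here
N = 50; L = 1; theta = 30; maxStep = 10;

% Start vector: symmetric ramp around theta (mean theta, small steps)
V = theta + linspace(-20, 20, N);
D = sum(L * cosd(V - theta));

nAccepted = 0;
for k = 1:5000
    idx = randi(N, 1, 2);            % propose swapping two random entries
    W = V;
    W(idx) = W(fliplr(idx));         % swap keeps mean and cosd sum unchanged
    if all(abs(diff(W)) <= maxStep)
        V = W;                       % accept: step rule still holds
        nAccepted = nAccepted + 1;
    end                              % otherwise reject and keep V
end
```

The second transformation (changing one entry and compensating with another) would additionally need to re-check conditions 2 and 4, which this sketch does not attempt.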
Here's a way of doing it. It is clear that not all combinations of theta, N, L and D are valid. It is also clear that you're trying to simulate random objects that are quite complex. You will probably have a hard time showing anything useful with respect to these vectors.
The series you're trying to simulate seems similar to the Wiener process, so I started with that; you can start with anything that is random yet reasonable. I then use that as a starting point for an optimization that tries to satisfy 2, 3 and 4. The closer your initial value is to a valid vector (one satisfying all your conditions), the better the convergence.
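function series = generate_series(D, L, N, theta)
% Random-walk initial guess starting at theta
s = zeros(1, N);
s(1) = theta;
for i = 2:N
    s(i) = s(i-1) + randn(1,1);
end
% Minimize the constraint violations (fminunc needs the Optimization Toolbox)
f = @(x) objective(x, D, L, N, theta);
q = optimset('Display','iter','TolFun',1e-10,'MaxFunEvals',Inf,'MaxIter',Inf);
[sf, val] = fminunc(f, s, q);
val   % final objective value; close to 0 means the constraints are met
series = sf;

function value = objective(s, D, L, N, theta)
a = abs(mean(s) - theta);                 % condition 2: mean equals theta
b = abs(D - sum(L*cosd(s - theta)));      % condition 4, in degrees as in the question
c = 0;                                    % condition 3: penalize steps larger than 10
for i = 2:N
    u = abs(s(i) - s(i-1));
    if u > 10
        c = c + u;
    end
end
value = a^2 + b^2 + c^2;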
It seems like you're trying to simulate something very complex/strange (a path of a given curvature?), see questions by other commenters. Still you will have to use your domain knowledge to connect D and L with a reasonable mu and sigma for the Wiener to act as initialization.
So based on your new requirements, it seems like what you're actually looking for is an ordered list of random angles, with a maximum change in angle of 10 degrees (which I first convert to radians), such that the distance and direction from start to end and link length and number of links are specified?
Simulate an initial guess. It will not hold with the D and theta constraints (i.e. specified D and specified theta)
angles = zeros(N, 1);
for link = 2:N
    % each step is at most 5 degrees (in radians) away from the previous angle
    angles(link) = angles(link - 1) + (rand() - 0.5)*(10*pi/180);
end
Use genetic algorithm (or another optimization) to adjust the angles based on the following cost function:
dx = sum(L*cos(angles));
dy = sum(L*sin(angles));
D = sqrt(dx^2 + dy^2);
theta = atan2(dy, dx);
The cost is now just the difference between the vector given by the D and theta computed above and the vector given by the specified D and theta (i.e. the inputs).
You will still have to enforce the max change of 10 degrees rule, perhaps that should just make the cost function enormous if it is violated? Perhaps there is a cleaner way to specify sequence constraints in optimization algorithms (I don't know how).
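One way this cost could look, as a sketch under my own assumptions: angles are in radians, the names Dtarget, thetaTarget and penaltyWeight are hypothetical, and the "enormous" cost is implemented as a large quadratic penalty on the amount by which a step exceeds the limit. The optimizer itself is left out.

```matlab
% Hypothetical inputs (angles in radians)
L = 1; Dtarget = 8; thetaTarget = pi/6;
maxStep = 10*pi/180;          % 10 degree limit between successive links
penaltyWeight = 1e6;          % made-up weight for rule violations

% End point of the chain for a column vector of link angles a
endPoint = @(a) [sum(L*cos(a)), sum(L*sin(a))];
% End point implied by the specified D and theta
target = Dtarget * [cos(thetaTarget), sin(thetaTarget)];
% Total amount by which successive steps exceed maxStep (0 if none do)
penalty = @(a) sum(max(abs(diff(a)) - maxStep, 0));
cost = @(a) norm(endPoint(a) - target)^2 + penaltyWeight * penalty(a)^2;

a0 = thetaTarget * ones(10, 1);   % straight chain of 10 links
c0 = cost(a0);                    % misses the target by 2 units, no penalty
a1 = a0; a1(2) = a1(2) + 1;       % one link turned by 1 rad, breaks the rule
c1 = cost(a1);
```

Comparing end points directly sidesteps the wrap-around of theta at +/-pi that a separate angle difference would have to handle.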
I feel like if you can find the right optimization with the right parameters this should be able to simulate your problem.
You don't give us a lot of detail to work with, so I'll assume the following:
random numbers are to be drawn from [-127+theta +127-theta]
all random numbers will be drawn from a uniform distribution
all random numbers will be of type int8
Then, for the first 3 requirements, you can use this:
N = 1e4;
theta = 40;
diffVal = 10;
g = @() randi([intmin('int8')+theta intmax('int8')-theta], 'int8') + theta;
V = [g(); zeros(N-1,1, 'int8')];
for ii = 2:N
V(ii) = g();
while abs(V(ii)-V(ii-1)) >= diffVal
V(ii) = g();
end
end
Inline the anonymous function for more speed.
Now, the last requirement,
D == sum(L*cos(V-theta))
is a bit of a strange one...cos(V-theta) is a specific way to re-scale the data to the [-1 +1] interval, which the multiplication with L will then scale to [-L +L]. On first sight, you'd expect the sum to average out to 0.
Indeed, the expected value of cos(x) when x is a random variable from a uniform distribution in [0 2*pi] is 0; the value 2/pi is the expected value of abs(cos(x)) (see here for example). Ignoring for the moment the fact that our limits are different from [0 2*pi], the expected value of sum(L*cos(V-theta)) would simply reduce to a constant (0, or 2*N*L/pi if absolute values are taken).
How you can force this to equal some other constant D is beyond me...can you perhaps elaborate on that a bit more?