How to extend the range of a variable in idl - range

I'd like to use k to calculate the times of the for loop executed. It would be billions of times and I tried long64, then after some time k became negative. Is there any other way to do this?
Sorry I think I made a wrong description. My code is a 3-layer for nest block and each of them is calculating 2*256^3 numbers, once the value is equal to 0, I'd like to make k+=k. In the end I set print, 'k=', k and when idl was running, I found k ran from positive to negative. I used a cluster to do the computation so it didn't take quite a long time.

My guess is that you are not really using a long64 for k. The type for the loop variable in a for loop comes from the start value. For example, in this case:
k = 0LL
for k = 0, n - 1 do ...
k is a int (16-bit) because 0 is a int. You probably want something like:
for k = 0LL, n - 1LL do begin ...

Related

Efficient Enumeration of subsets of variable size

There are 2 variables: i and j.
i runs from 1 to fixed constant N. j runs from 1 to M(i), meaning we have an array
M=zeros(N,1);
M=[3,2,5,4];
Also S is a set of say 10 integers. Corresponding to every i, I have to find all possible ways of choosing numbers from S up to a given max limit M(i).
My code:
desired_enumerations={};
for i=1:N
for j=1:M(i)
a = combnk(S,j);
for k=1:size(a,1)
desired_enumerations{end+1,1}=a(k,:);
end
end
end
My question:
I want to pre-compute all possible subsets of S up to max(M) elements outside the double for loop and then use that data inside the for loops so i don't have to enumerate the subsets inside double for loop and avoid repeated calculations.
What is an efficient way to do this?
EDIT:
Just as a bonus if someone could also give the time-complexity of the code in big-O notation. Or i can open a new question.

What is this code doing? Machine Learning

I'm just learning matlab and I have a snippet of code which I don't understand the syntax of. The x is an n x 1 vector.
Code is below
p = (min(x):(max(x)/300):max(x))';
The p vector is used a few lines later to plot the function
plot(p,pp*model,'r');
It generates an arithmetic progression.
An arithmetic progression is a sequence of numbers where the next number is equal to the previous number plus a constant. In an arithmetic progression, this constant must stay the same value.
In your code,
min(x) is the initial value of the sequence
max(x) / 300 is the increment amount
max(x) is the stopping criteria. When the result of incrementation exceeds this stopping criteria, no more items are generated for the sequence.
I cannot comment on this particular choice of initial value and increment amount, without seeing the surrounding code where it was used.
However, from a naive perspective, MATLAB has a linspace command which does something similar, but not exactly the same.
Certainly looks to me like an odd thing to be doing. Basically, it's creating a vector of values p that range from the smallest to the largest values of x, which is fine, but it's using steps between successive values of max(x)/300.
If min(x)=300 and max(x)=300.5 then this would only give 1 point for p.
On the other hand, if min(x)=-1000 and max(x)=0.3 then p would have thousands of elements.
In fact, it's even worse. If max(x) is negative, then you would get an error as p would start from min(x), some negative number below max(x), and then each element would be smaller than the last.
I think p must be used to create pp or model somehow as well so that the plot works, and without knowing how I can't suggest how to fix this, but I can't think of a good reason why it would be done like this. using linspace(min(x),max(x),300) or setting the step to (max(x)-min(x))/299 would make more sense to me.
This code examines an array named x, and finds its minimum value min(x) and its maximum value max(x). It takes the maximum value and divides it by the constant 300.
It doesn't explicitly name any variable, setting it equal to max(x)/300, but for the sake of explanation, I'm naming it "incr", short for increment.
And, it creates a vector named p. p looks something like this:
p = [min(x), min(x) + incr, min(x) + 2*incr, ..., min(x) + 299*incr, max(x)];

How to generate random matlab vector with these constraints

I'm having trouble creating a random vector V in Matlab subject to the following set of constraints: (given parameters N,D, L, and theta)
The vector V must be N units long
The elements must have an average of theta
No 2 successive elements may differ by more than +/-10
D == sum(L*cosd(V-theta))
I'm having the most problems with the last one. Any ideas?
Edit
Solutions in other languages or equation form are equally acceptable. Matlab is just a convenient prototyping tool for me, but the final algorithm will be in java.
Edit
From the comments and initial answers I want to add some clarifications and initial thoughts.
I am not seeking a 'truly random' solution from any standard distribution. I want a pseudo randomly generated sequence of values that satisfy the constraints given a parameter set.
The system I'm trying to approximate is a chain of N links of link length L where the end of the chain is D away from the other end in the direction of theta.
My initial insight here is that theta can be removed from consideration until the end, since (2) in essence adds theta to every element of a 0 mean vector V (shifting the mean to theta) and (4) simply removes that mean again. So, if you can find a solution for theta=0, the problem is solved for all theta.
As requested, here is a reasonable range of parameters (not hard constraints, but typical values):
5<N<200
3<D<150
L==1
0 < theta < 360
I would start by creating a "valid" vector. That should be possible - say calculate it for every entry to have the same value.
Once you got that vector I would apply some transformations to "shuffle" it. "Rejection sampling" is the keyword - if the shuffle would violate one of your rules you just don't do it.
As transformations I come up with:
switch two entries
modify the value of one entry and modify a second one to keep the 4th condition (Theoretically you could just shuffle two till the condition is fulfilled - but the chance that happens is quite low)
But maybe you can find some more.
Do this reasonable often and you get a "valid" random vector. Theoretically you should be able to get all valid vectors - practically you could try to construct several "start" vectors so it won't take that long.
Here's a way of doing it. It is clear that not all combinations of theta, N, L and D are valid. It is also clear that you're trying to simulate random objects that are quite complex. You will probably have a hard time showing anything useful with respect to these vectors.
The series you're trying to simulate seems similar to the Wiener process. So I started with that, you can start with anything that is random yet reasonable. I then use that as a starting point for an optimization that tries to satisfy 2,3 and 4. The closer your initial value to a valid vector (satisfying all your conditions) the better the convergence.
function series = generate_series(D, L, N,theta)
s(1) = theta;
for i=2:N,
s(i) = s(i-1) + randn(1,1);
end
f = #(x)objective(x,D,L,N,theta)
q = optimset('Display','iter','TolFun',1e-10,'MaxFunEvals',Inf,'MaxIter',Inf)
[sf,val] = fminunc(f,s,q);
val
series = sf;
function value= objective(s,D,L,N,theta)
a = abs(mean(s)-theta);
b = abs(D-sum(L*cos(s-theta)));
c = 0;
for i=2:N,
u =abs(s(i)-s(i-1)) ;
if u>10,
c = c + u;
end
end
value = a^2 + b^2+ c^2;
It seems like you're trying to simulate something very complex/strange (a path of a given curvature?), see questions by other commenters. Still you will have to use your domain knowledge to connect D and L with a reasonable mu and sigma for the Wiener to act as initialization.
So based on your new requirements, it seems like what you're actually looking for is an ordered list of random angles, with a maximum change in angle of 10 degrees (which I first convert to radians), such that the distance and direction from start to end and link length and number of links are specified?
Simulate an initial guess. It will not hold with the D and theta constraints (i.e. specified D and specified theta)
angles = zeros(N, 1)
for link = 2:N
angles (link) = theta(link - 1) + (rand() - 0.5)*(10*pi/180)
end
Use genetic algorithm (or another optimization) to adjust the angles based on the following cost function:
dx = sum(L*cos(angle));
dy = sum(L*sin(angle));
D = sqrt(dx^2 + dy^2);
theta = atan2(dy/dx);
the cost is now just the difference between the vector given by my D and theta above and the vector given by the specified D and theta (i.e. the inputs).
You will still have to enforce the max change of 10 degrees rule, perhaps that should just make the cost function enormous if it is violated? Perhaps there is a cleaner way to specify sequence constraints in optimization algorithms (I don't know how).
I feel like if you can find the right optimization with the right parameters this should be able to simulate your problem.
You don't give us a lot of detail to work with, so I'll assume the following:
random numbers are to be drawn from [-127+theta +127-theta]
all random numbers will be drawn from a uniform distribution
all random numbers will be of type int8
Then, for the first 3 requirements, you can use this:
N = 1e4;
theta = 40;
diffVal = 10;
g = #() randi([intmin('int8')+theta intmax('int8')-theta], 'int8') + theta;
V = [g(); zeros(N-1,1, 'int8')];
for ii = 2:N
V(ii) = g();
while abs(V(ii)-V(ii-1)) >= diffVal
V(ii) = g();
end
end
inline the anonymous function for more speed.
Now, the last requirement,
D == sum(L*cos(V-theta))
is a bit of a strange one...cos(V-theta) is a specific way to re-scale the data to the [-1 +1] interval, which the multiplication with L will then scale to [-L +L]. On first sight, you'd expect the sum to average out to 0.
However, the expected value of cos(x) when x is a random variable from a uniform distribution in [0 2*pi] is 2/pi (see here for example). Ignoring for the moment the fact that our limits are different from [0 2*pi], the expected value of sum(L*cos(V-theta)) would simply reduce to the constant value of 2*N*L/pi.
How you can force this to equal some other constant D is beyond me...can you perhaps elaborate on that a bit more?

What's the most idiomatic way to create a vector with a 1 at index i?

In Matlab, suppose I would like to create a 0-vector of length L, except with a 1 at index i?
For example, something like:
>> mostlyzeros(6, 3)
ans =
0 0 1 0 0 0
The purpose is so I can use it as a 'selection' vector which I'll multiply element-wise with another vector.
The simplest way I can think of is this:
a = (1:N)==m;
where N>=m. Having said that, if you want to use the resulting vector as a "selection vector", I don't know why you'd multiply two vectors elementwise, as I would expect that to be relatively slow and inefficient. If you want to get a vector containing only the m-th value of vector v in the m-th position, this would be a more straightforward method:
b = ((1:N)==m)*v(m);
Although the most natural method would have to be this:
b(N)=0;
b(m)=v(m);
assuming that b isn't defined before this (if b is defined, you need to use zeros rather than just assigning the Nth value as zero - it has been my experience that creating a zero vector or matrix that didn't exist before that is most easily done by assigning the last element of it to be zero - it's also useful for extending a matrix or vector).
I'm having a hard time thinking of anything more sensible than:
Vec = zeros(1, L);
Vec(i) = 1;
But I'd be happy to be proven wrong!
UPDATE: The one-liner solution provided by #GlenO is very neat! However, be aware that if efficiency is the chief criteria, then a few speed tests on my machine indicate that the simple method proposed in this answer and the other two answers is 3 or 4 times faster...
NEXT UPDATE: Ah! So that's what you mean by "selection vectors". #GlenO has given a good explanation of why for this operation a vector of ones and zeros is not idiomatic Matlab - however you choose to build it.
ps Try to avoid using i as a subscript, since it is actually a matlab function.
Just for the fun of it, another one-liner:
function [out] = mostlyzeros(idx, L)
out([L, idx]) = [0 1];
I can think of:
function mostlyones(m,n)
mat=zeros(1,m);
mat(n)=1;
Also, one thing to note. In MATLAB, index starts from one and not from zero. So your function call should have been mostlyzeros(6,3)
I would simply create a zero-vector and change whatever value you like to one:
function zeroWithOne(int numOfZeros, int pos)
a = zeros(numOfZeros,1);
a(pos) = 1;
Another one line option, which should be fast is:
vec = sparse(1, ii, 1, 1, L);

find discretization steps

I have data files F_j, each containing a list of numbers with an unknown number of decimal places. Each file contains discretized measurements of some continuous variable and
I want to find the discretization step d_j for file F_j
A solution I could come up with: for each F_j,
find the number (n_j) of decimal places;
multiply each number in F_j with 10^{n_j} to obtain integers;
find the greatest common divisor of the entire list.
I'm looking for an elegant way to find n_j with Matlab.
Also, finding the gcd of a long list of integers seems hard — do you have any better idea?
Finding the gcd of a long list of numbers isn't too hard. You can do it in time linear in the size of the list. If you get lucky, you can do it in time a lot less than linear. Essentially this is because:
gcd(a,b,c) = gcd(gcd(a,b),c)
and if either a=1 or b=1 then gcd(a,b)=1 regardless of the size of the other number.
So if you have a list of numbers xs you can do
g = xs(1);
for i = 2:length(xs)
g = gcd(x(i),g);
if g == 1
break
end
end
The variable g will now store the gcd of the list.
Here is some sample code that I believe will help you get the GCD once you have the numbers you want to look at.
A = [15 30 20];
A_min = min(A);
GCD = 1;
for n = A_min:-1:1
temp = A / n;
if (max(mod(temp,1))==0)
% yay GCD found
GCD = n;
break;
end
end
The basic concept here is that the default GCD will always be 1 since every number is divisible by itself and 1 of course =). The GCD also can't be greater than the smallest number in the list, thus I start with the smallest number and then decriment by 1. This is assuming that you have already converted the numbers to a whole number form at this point. Decimals will throw this off!
By using the modulus of 1 you are testing to see if the number is a whole number, if it isn't you will have a decmial remainder left which is greater than 0. If you anticipate having to deal with negative you will have to tweak this test!
Other than that, the first time you find a number where the modulus of the list (mod 1) is all zeros you've found the GCD.
Enjoy!