I would like to know if Optuna offers an option to repeat each trial five times or more to get the average performance of the network over different initial weights.
No. If you want to measure an average of multiple runs, update your objective to repeat your computation inside a trial. A simple example could be the following:
import numpy as np
from optuna.trial import Trial

NUM_RUNS = 5

def objective(trial: Trial) -> float:
    num_filters = trial.suggest_int(...)
    hidden_size = trial.suggest_int(...)

    objective_values = []
    for _ in range(NUM_RUNS):
        # retrain from scratch so each run starts from different initial weights
        model = Model(num_filters=num_filters, hidden_size=hidden_size)
        model.fit(X_train, y_train)
        accuracy = model.evaluate(X_valid, y_valid)
        objective_values.append(accuracy)

    # report the average accuracy over the repeated runs
    return np.mean(objective_values)
After reading the docs about "stateEstimatorPF" I am a little confused about how to create the StateTransitionFcn for my case. I have 10 sensor measurements that decay exponentially, and I want to find the best parameter for my model function.
The model function is x = exp(B*deltaT)*x_1, where x are the hypotheses, deltaT is the constant time delta between my measurements, and x_1 is the true previous state. I would like to use the particle filter to estimate the parameter B. If I understand it right, B should be the particles, and the weighted mean of these particles should be what I'm looking for.
How can I write the StateTransitionFcn and use the "stateEstimatorPF" to solve this problem?
The code below is what I have so far (and it does not work):
pf = robotics.ParticleFilter;
pf.StateTransitionFcn = @stateTransitionFcn;
pf.StateEstimationMethod = 'mean';
pf.ResamplingMethod = 'systematic';

initialize(pf, 5000, [0.9], 1);

measu = [1.0, 0.9351, 0.8512, 0.9028, 0.7754, 0.7114, 0.6830, 0.6147, 0.5628, 0.7090];

states = [];
for i = 1:10
    [statePredicted, stateCov] = predict(pf);
    [stateCorrected, stateCov] = correct(pf, measu(i));
    states(i) = getStateEstimate(pf);
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function predictParticles = stateTransitionFcn(pf, prevParticles, x_1)
    predictParticles = exp(prevParticles)*x_1;   % how to properly use x_1?
end
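For reference, here is a rough, untested sketch of one possible arrangement in which the particles are guesses of B (with a random-walk state transition) and a custom measurement likelihood compares exp(B*deltaT)*x_1 against each new measurement. The callback signatures follow my reading of the robotics.ParticleFilter / stateEstimatorPF documentation, and the noise levels, initial spread and deltaT = 1 are assumptions, not values from the question.

pf = robotics.ParticleFilter;
pf.StateTransitionFcn = @paramTransitionFcn;        % particles evolve as guesses of B
pf.MeasurementLikelihoodFcn = @paramLikelihoodFcn;  % weight particles against new data
pf.StateEstimationMethod = 'mean';
pf.ResamplingMethod = 'systematic';
initialize(pf, 5000, 0, 1);                         % particles start as guesses of B

measu = [1.0, 0.9351, 0.8512, 0.9028, 0.7754, 0.7114, 0.6830, 0.6147, 0.5628, 0.7090];
deltaT = 1;                                         % assumed constant time step
B_est = zeros(1, numel(measu)-1);
for i = 2:numel(measu)
    predict(pf);                                    % diffuse the guesses of B
    correct(pf, measu(i), measu(i-1), deltaT);      % extra args reach the likelihood fcn
    B_est(i-1) = getStateEstimate(pf);              % weighted mean of the particles
end

function predictParticles = paramTransitionFcn(~, prevParticles)
    % random walk on B so the filter keeps exploring parameter space
    predictParticles = prevParticles + 0.01*randn(size(prevParticles));
end

function likelihood = paramLikelihoodFcn(~, predictParticles, measurement, x_1, deltaT)
    % each particle's B predicts the next measurement from the previous one
    predicted = exp(predictParticles*deltaT) * x_1;
    sigma = 0.05;                                   % assumed measurement noise std
    likelihood = exp(-0.5*((measurement - predicted)/sigma).^2);
end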
I am using MATLAB GPU computing to run a simulation. I suspect I may run into a "random number seed" overlap issue. My code is the following:
N = 10000;
v = rand(N,1);
p = [0:0.1:1];
pA = [0:0.1:2];
[v,p,pA] = ndgrid(v,p,pA);
v = gpuArray(v);
p = gpuArray(p);
pA = gpuArray(pA);
t = 1;
bH = 0.9;
bL = 0.6;
a = 0.5;
Y = MyFunction(v,p,pA,t,bH,bL,a);
function [RA] = MyFunction(v,p,pA,t,bH,bL,a)
    % SSP1 is nested so it can see t, bH, bL and a from the parent workspace
    function [RA] = SSP1(v,p,pA)
        RA = 0;
        S1 = rand;
        S2 = rand;
        S3 = rand;
        vA1 = (S1<a)*bH + (S1>=a)*bL;
        vA2 = (S2<a)*bH + (S2>=a)*bL;
        vA3 = (S3<a)*bH + (S3>=a)*bL;
        if p<=t && pA>3*bL && pA<=3*bH
            if pA>vA1+vA2+vA3
                if v>=p
                    RA = p;
                end
            else
                if v+vA1+vA2+vA3>=p+pA
                    RA = p+pA;
                end
            end
        end
    end
    % apply SSP1 element-wise on the GPU, then bring the result back to the host
    [RA] = gather(arrayfun(@SSP1,v,p,pA));
end
The idea of the code is the following:
I generate N random agents, each characterized by a value v. Then, for each agent, I have to compute a quantity given (p,pA). As I have N agents and many combinations of (p,pA), I want to use the GPU to speed up the process. But here comes the tricky part:
For each agent, in order to finish the computation, I have to generate 3 extra random variables, vA1, vA2 and vA3. Based on my understanding of the GPU (I could be wrong), it does these computations simultaneously, i.e., for each agent v it generates the 3 random variables vA1, vA2, vA3, and the GPU runs these N procedures at the same time. However, I am not sure whether the vA1, vA2, vA3 generated for agent 1 and agent 2 may overlap, because N could be 1 million. I want to make sure that the random number seeds used to generate the corresponding vA1, vA2, vA3 won't overlap across all of these agents; otherwise I am in big trouble.
There is a way to prevent this from happening: first generate all 3N of these random variables vA1, vA2, vA3 and then move them onto the GPU. However, that may require a lot of GPU memory, which I don't have. The current method, I guess, does not need much GPU memory, since I am generating vA1, vA2, vA3 on the fly?
What you say does not happen. The proof is that the following code snippet generates random values in hB.
A = ones(100,1);
dA = gpuArray(A);
[hB] = gather(arrayfun(@applyrand,dA));

function dB = applyrand(dA)
    r = rand;      % each element gets its own random draw
    dB = dA*r;
end
That said, your random variables can only take a couple of values each, because with your use of S1, S2 and S3 you are basically flipping a coin:
vA1 = (S1<0.5)*bH+(S1>=0.5)*bL;
so vA1 is either bH or bL (the two conditions exclude each other), and the same holds for vA2 and vA3, which gives at most 8 combinations in total.
Maybe this lack of variability is what is making you think that you don't have much randomness; it is not very clear from the question.
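If you want to convince yourself of this on your own data, a small self-contained check along the same lines can help. The gpurng call is an assumption on my part (it is the Parallel Computing Toolbox function for seeding the GPU generator, and my reading is that it also governs rand inside arrayfun); the seed value is arbitrary.

% Sketch: count distinct per-element draws and seed the GPU generator for reproducibility.
gpurng(12345);                              % assumed: seeds the GPU random number generator
A = ones(1000,1);
dA = gpuArray(A);
hB = gather(arrayfun(@applyrand, dA));
fprintf('%d distinct values out of %d elements\n', numel(unique(hB)), numel(hB));

function dB = applyrand(dA)
    r = rand;                               % one independent draw per element of dA
    dB = dA*r;
end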
My problem is an optimization issue with SVGP on the US flight dataset.
I implemented the SVGP model for the US flight data mentioned in Hensman 2014, using number of inducing points = 100, batch_size = 1000, learning rate = 1e-5 and maxiter = 500.
The result is pretty strange: the ELBO does not increase and has a large variance, no matter how I tune the learning rate.
Initialization
M = 100
D = 8

def init():
    kern = gpflow.kernels.RBF(D, 1, ARD=True)
    Z = X_train[:M, :].copy()
    m = gpflow.models.SVGP(X_train, Y_train.reshape([-1,1]), kern, gpflow.likelihoods.Gaussian(), Z, minibatch_size=1000)
    return m

m = init()
Inference
m.feature.trainable = True
opt = gpflow.train.AdamOptimizer(learning_rate = 0.00001)
m.compile()
opt.minimize(m, step_callback=logger, maxiter = 500)
plt.plot(logf)
plt.xlabel('iteration')
plt.ylabel('ELBO')
Result:
Added Results
Once I add more iterations and use a larger learning rate, it is good to see that the ELBO increases as the iterations proceed. But it is very confusing that the RMSE (root mean square error) on both the training and test data increases too. Do you have any suggestions?
Figures and code are shown as follows:
ELBOs vs iterations
Train RMSEs vs iterations
Test RMSEs vs iterations
Using logger
def logger(x):
    print(m.compute_log_likelihood())
    logx.append(x)
    logf.append(m.compute_log_likelihood())
    logt.append(time.time() - st)
    py_train = m.predict_y(X_train)[0]
    py_test = m.predict_y(X_test)[0]
    rmse_hist.append(np.sqrt(np.mean((Y_train - py_train)**2)))
    rmse_test_hist.append(np.sqrt(np.mean((Y_test - py_test)**2)))
    logger.i += 1

logger.i = 1
And the full code is shown via the link.
I am using the following code to calculate altitude.
Data = [Distance1',Gradient];
Result = Data(dsearchn(Data(:,1), Distance2), 2);
Altitude = -cumtrapz(Distance2, Result)/1000;
Distance1 and Distance2 have different sizes but contain the same values, so I am matching them to get the corresponding values of Gradient to use with Distance2.
Just executing these 3 lines takes MATLAB 12 to 15 seconds, which slows down my whole algorithm.
Is there a better way to perform the above action without slowing down my algorithm?
If I understand correctly, you are looking for the first occurrence of the number Distance2 in the column Data(:,1). You can do this about 3 times faster using find. Try:
k = find(Data(:,1) == Distance2,1);
Result = Data(k,2);
Here is a timing test, where 10^pow is the number of rows of data, and fac is the speed-up factor from using find:
pow = 5;
data = round(rand(10^pow,1)*10);
funcFind = #() find(data == 5,1);
timeFind = timeit(funcFind);
funcD = #() dsearchn(data,5);
timeD = timeit(funcD);
fac = timeD/timeFind
I managed to find an alternative method using the interp1 function.
Here is example code:
Distance2 = [1:10:1000]';
Distance1 = [1:1:1000]';
Gradient = rand(1000,1);
Data = [Distance1,Gradient];
Result = interp1(Distance1,Data(:,2),Distance2,'nearest');
This function adds just 1 second to my original simulation time, which is way better than the previous 12 to 15 seconds.
I am trying to analyze time series of wheel turns that were sampled at 1-minute intervals for 10 days. t is a 1 x 14000 array that goes from .1666 hours to 240 hours. analysis.timeseries.(grp).(chs) is a 1 x 14000 array for each of my groups of interest and their specific channels, giving the activity at each sampled minute. I am interested in collecting the maximum power and the frequency at which it occurs. My problem is that I am not sure what units f comes out in. I would like it returned in cycles per hour, spanning up to a maximum period of 30 hours. I tried to use the Galileo example in the documentation as a guide, but it didn't seem to work.
Below is my code:
groups = {'GFF' 'GMF' 'SFF' 'SMF'};
chgroups = {chnamesGF chnamesGM chnamesSF chnamesSM};
t1 = (t * 3600); % matlab treats this as seconds so convert it to an hour form
onehour = seconds(hours(1));
for i = 1:4
    grp = groups{1,i};
    chn = chgroups{1,i};
    for channel = 1:length(chn)
        chs = chn{channel,1};
        [pxx,f] = plomb(analysis.timeseries.(grp).(chs), t, 30/onehour, 'normalized');
        analysis.pxx.(grp).(chs) = pxx;
        analysis.f.(grp).(chs) = f;
        analysis.lsp.power.(grp).(chs) = max(pxx);
        [row,col,v] = find(analysis.pxx.(grp).(chs) == analysis.lsp.power.(grp).(chs));
        analysis.lsp.tau.(grp).(chs) = analysis.f.(grp).(chs)(row);
    end
end
Not really an answer, but it is hard to put an image in a comment.
Judging by this (the plomb page in the MATLAB documentation),
I think that pxx is dimensionless; as for f, it is a frequency, so its dimension is 1/(dimension of t). If your t is in hours, I would say h^-1.
So I'd rather say try:
[pxx,f]= plomb(analysis.timeseries.(grp).(chs),t*30.0/onehour,'normalized');
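If the goal is to get f directly in cycles per hour and limit the search to periods of 30 hours or shorter, a minimal sketch along these lines may help. It assumes the time vector is already expressed in hours and that x holds one activity series; the 'normalized' option mirrors the call above.

% Sketch: with t in hours, plomb returns f in cycles per hour.
t_hours = t;                                % t already spans 0.1666 ... 240 hours
x = analysis.timeseries.(grp).(chs);        % one group/channel of activity
[pxx, f] = plomb(x, t_hours, 'normalized');
keep = f >= 1/30;                           % keep periods of 30 hours or shorter
fkeep = f(keep);
[peak_power, idx] = max(pxx(keep));
peak_freq = fkeep(idx);                     % cycles per hour
peak_period = 1/peak_freq;                  % hours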