How can I change the solver parameter in Caffe through pycaffe?
E.g. right after calling solver = caffe.get_solver(solver_prototxt_filename) I would like to change the solver's parameters (learning rate, stepsize, gamma, momentum, base_lr, power, etc.), without having to change solver_prototxt_filename.
Maybe you can create a temporary file.
First of all, load your solver parameters with
from caffe.proto import caffe_pb2
from google.protobuf import text_format
solver_config = caffe_pb2.SolverParameter()
with open('/your/solver/path') as f:
    text_format.Merge(str(f.read()), solver_config)
You can modify any solver parameter by simply setting the desired value on solver_config (e.g. solver_config.test_interval = 15). Then it's just a matter of creating a temp file and loading your solver from it:
new_solver_config = text_format.MessageToString(solver_config)
with open('temp.prototxt', 'w') as f:
    f.write(new_solver_config)
solver = caffe.get_solver('temp.prototxt')
solver.step(1)
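For example, the parameters mentioned in the question map directly to fields of SolverParameter; a sketch (field names as defined in caffe.proto):
solver_config.base_lr = 0.001      # base learning rate
solver_config.momentum = 0.9
solver_config.lr_policy = 'step'   # learning-rate decay policy
solver_config.stepsize = 10000     # iterations between rate drops ('step' policy)
solver_config.gamma = 0.1          # decay factor
solver_config.power = 0.75         # used by the 'poly' and 'inv' policies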
I'm trying to build a neural network that will take in the solutions to a system of ODEs and predict the parameters of the system. I'm using Julia and, in particular, the DiffEqFlux package. The structure of the network is a few simple Dense layers chained together that predict some intermediate parameters (in this case, some chemical reaction free energies), which then feed into some deterministic (non-trained) layers that convert those parameters into the ones that go into the system of equations (in this case, reaction rate constants). I've tried two different approaches from here:
Chain the ODE solve directly on as the last layer of the network. In this case, the loss function is just comparing the inputs to the outputs.
Have the ODE solve in the loss function, so the network output is just the parameters.
However, in neither case can I get Flux.train! to actually run.
A silly little example for the first option that gives the same error I'm getting is shown below. (I've tried to keep as many things parallel to my actual case as possible, e.g. the solver, although I did omit the intermediate deterministic layers, since they don't seem to make a difference.)
using Flux, DiffEqFlux, DifferentialEquations
# let's use Chris' favorite example, Lotka-Volterra
function lotka_volterra(du,u,p,t)
    x, y = u
    α, β, δ, γ = p
    du[1] = dx = α*x - β*x*y
    du[2] = dy = -δ*y + γ*x*y
end
u0 = [1.0,1.0]
tspan = (0.0,10.0)
# generate a couple sets of solutions to train on
training_params = [[1.5,1.0,3.0,1.0], [1.4,1.1,3.1,0.9]]
training_sols = [solve(ODEProblem(lotka_volterra, u0, tspan, tp)).u[end] for tp in training_params]
model = Chain(Dense(2,3), Dense(3,4), p -> diffeq_adjoint(p, ODEProblem(lotka_volterra, u0, tspan, p), Rodas4())[:,end])
# in this case we just want outputs to match inputs
# (actual parameters we're after are outputs of next-to-last layer)
training_data = zip(training_sols, training_sols)
# mean squared error loss
loss(x,y) = Flux.mse(model(x), y)
p = Flux.params(model[1:2])
Flux.train!(loss, p, training_data, ADAM(0.001))
# gives TypeError: in typeassert, expected Float64, got ForwardDiff.Dual{Nothing, Float64, 8}
I've tried all three solver layers, diffeq_adjoint, diffeq_rd, and diffeq_fd; none of them works, and each gives a different error that I'm having trouble parsing.
For the other option (which I'd actually prefer, but either way would work), just replace the model and loss function definitions with:
model = Chain(Dense(2,3), Dense(3,4))
function loss(x,y)
    p = model(x)
    sol = diffeq_adjoint(p, ODEProblem(lotka_volterra, u0, tspan, p), Rodas4())[:,end]
    Flux.mse(sol, y)
end
The same error is thrown as above.
I've been hacking at this for over a week now and am completely stumped; any ideas?
You're running into https://github.com/JuliaDiffEq/DiffEqFlux.jl/issues/31, i.e. forward-mode AD for the Jacobian doesn't play nice with Flux.jl right now. To get around this, use Rodas4(autodiff=false) instead.
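Applied to the first example above, that just means swapping the solver argument; a sketch:
model = Chain(Dense(2,3), Dense(3,4),
              p -> diffeq_adjoint(p, ODEProblem(lotka_volterra, u0, tspan, p),
                                  Rodas4(autodiff=false))[:,end])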
I want to create a custom loss function for a double-input double-output model in Keras that:
minimizes the reconstruction error of two autoencoders;
maximizes the correlation of the bottleneck features of the autoencoders.
For this I need to pass to the loss function:
both inputs;
both outputs / reconstructions;
output of intermediate layers for both (hidden activations).
I know I can pass both inputs and outputs to Model, but am struggling to find a way to pass the hidden activations.
I could create two new Models that have the output of the intermediate layers and pass that to loss, like:
intermediate_layer_model1 = Model(input=input1, output=autoencoder.get_layer('encoded1').output)
intermediate_layer_model2 = Model(input=input2, output=autoencoder.get_layer('encoded2').output)
autoencoder.compile(optimizer='adadelta', loss=loss(intermediate_layer_model1, intermediate_layer_model2))
But still, I would need to find a way to match the y_true in loss to the correct intermediate model.
What is the right way to approach this?
Edit
Here's an approach that I think should work. Simplified:
# autoencoder 1
input1 = Input(shape=(input_dim,))
encoded1 = Dense(encoding_dim, activation='relu', name='encoded1')(input1)
decoded1 = Dense(input_dim, activation='sigmoid', name='decoded1')(encoded1)
# autoencoder 2
input2 = Input(shape=(input_dim,))
encoded2 = Dense(encoding_dim, activation='relu', name='encoded2')(input2)
decoded2 = Dense(input_dim, activation='sigmoid', name='decoded2')(encoded2)
# merge encodings
merge_layer = merge([encoded1, encoded2], mode='concat', name='merge', concat_axis=1)
model = Model(input=[input1, input2], output=[decoded1, decoded2, merge_layer])
model.compile(optimizer='rmsprop', loss={
    'decoded1': 'binary_crossentropy',
    'decoded2': 'binary_crossentropy',
    'merge': correlation,
})
Then in correlation I can split y_pred and do the calculations.
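For reference, here is a minimal sketch of what such a correlation function could look like, assuming the merge layer concatenates two encodings of encoding_dim units each along axis 1 (the exact correlation formula is up to you; y_true is ignored, so a dummy target must still be passed for the merge output when fitting):
from keras import backend as K

def correlation(y_true, y_pred):
    # y_pred is the concatenated [encoded1, encoded2]; split it back
    h1 = y_pred[:, :encoding_dim]
    h2 = y_pred[:, encoding_dim:]
    # center each encoding over the batch
    h1 = h1 - K.mean(h1, axis=0)
    h2 = h2 - K.mean(h2, axis=0)
    # negative correlation, so minimizing the loss maximizes correlation
    return -K.sum(h1 * h2) / (K.sqrt(K.sum(K.square(h1)) * K.sum(K.square(h2))) + K.epsilon())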
How about:
Defining a single model with multiple outputs (be sure that you name the coding and reconstruction layers properly):
duo_model = Model(input=input, output=[coding_layer, reconstruction_layer])
Compiling your model with two different losses (or even reweighting the losses):
duo_model.compile(optimizer='rmsprop',
                  loss={'coding_layer': correlation_loss,
                        'reconstruction_layer': 'mse'})
Taking your final models as:
encoder = Model(input=input, output=[coding_layer])
autoencoder = Model(input=input, output=[reconstruction_layer])
After proper compilation this should do the job.
When it comes to defining a proper correlation loss function, there are two cases:
when the coding layer and your output layer have the same dimension, you can simply use the predefined cosine_proximity function from the Keras library;
when the coding layer has a different dimensionality, you should first find an embedding of the coding vector and the reconstruction vector into the same space, and then compute the correlation there. Remember that this embedding should either be a Keras layer / function or a Theano / TensorFlow operation (depending on which backend you are using). Of course, you can compute both the embedding and the correlation as part of one loss function.
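For the equal-dimension case, a hand-rolled cosine proximity is also easy to write against the backend API; a sketch:
from keras import backend as K

def cosine_correlation_loss(y_true, y_pred):
    # negative cosine similarity between L2-normalized vectors
    y_true = K.l2_normalize(y_true, axis=-1)
    y_pred = K.l2_normalize(y_pred, axis=-1)
    return -K.sum(y_true * y_pred, axis=-1)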
I am currently doing a project on multimodal biometrics (fusion at score level). So I need to get the score before fusion.
Can anyone tell me how to get the score of the particular test sample using a trained SVM classifier?
I have used the built-in svmtrain and svmclassify functions in MATLAB.
Unfortunately, the svmclassify function only outputs the label of the class and no distance (score). You will have to write your own classification function. Luckily, this is very easy: since you have the Statistics Toolbox with svmclassify, you can look at the source code of the function with
edit svmclassify
You will see that most of the function is checking inputs etc. The important parts are scaling the data:
sample(:,c) = svmStruct.ScaleData.scaleFactor(c) * ...
              (sample(:,c) + svmStruct.ScaleData.shift(c));
and doing the classification using a built-in function svmdecision:
outclass = svmdecision(sample,svmStruct);
From the definition of svmdecision you will see that it outputs the distance f, but svmclassify ignores it. You could therefore easily create a new function, which looks almost exactly like svmclassify, but also returns f:
1 function [outclass,f] = svmclassify(svmStruct,sample, varargin)
...
112 [outclass,f] = svmdecision(sample,svmStruct);
...
158 outclass = []; f = [];
You will find that svmdecision is a private function. To be able to call it from your own function, you have to make a copy of it in your local folder (or any subfolder).
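Putting it together, a stripped-down wrapper could look like this (a sketch that omits svmclassify's input validation; svmclassify_score is just a hypothetical name):
function [outclass, f] = svmclassify_score(svmStruct, sample)
% Scale the data exactly as svmclassify does
if ~isempty(svmStruct.ScaleData)
    for c = 1:size(sample, 2)
        sample(:,c) = svmStruct.ScaleData.scaleFactor(c) * ...
            (sample(:,c) + svmStruct.ScaleData.shift(c));
    end
end
% svmdecision must be copied into a local folder to be callable here
[outclass, f] = svmdecision(sample, svmStruct);
end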
We are trying to integrate a simulation model into Simulink as a block. We have a custom continuous block which loads an M-file that contains the functions Derivatives, Outputs, etc.
My question is: is there a way to find out which solver is currently being used, and with which parameters? Our model won't be able to support variable-step solvers, and I would like to issue a warning. Similarly, the model requires the fixed step size for initialization.
Thanks in advance.
You can get the current solver name using
get_param('modelName', 'SolverName');
Some of the other common solver parameters are
AbsTol
FixedStep
InitialStep
ZcThreshold
ExtrapolationOrder
MaxStep
MinStep
RelTol
SolverMode
You can find other parameters you may wish to query by opening the .mdl file in your favorite text editor and digging through it.
If I'm understanding your use case correctly, you are trying to determine the type of solver (and other solver parameters) for the top-level Simulink system containing your block.
I think the following should give you what you want:
get_param(bdroot, 'SolverType'); % Returns 'Variable-step' or 'Fixed-step'
get_param(bdroot, 'FixedStep');  % Returns the fixed step size
Notice that for purposes of generality/reusability, this uses bdroot to identify the top-level system (rather than explicitly specifying the name of this system).
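Putting the two together, an initialization check could look something like this (a sketch; note that get_param returns these values as strings, and FixedStep may be 'auto'):
if strcmp(get_param(bdroot, 'SolverType'), 'Variable-step')
    warning('This block does not support variable-step solvers.');
else
    dt = str2double(get_param(bdroot, 'FixedStep')); % NaN if 'auto'
end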
If you want to find out more about other model parameters that you can get/set, I would check out this doc.
Additionally, I'm interested to know why it is that your model doesn't support a variable-step solver?
In this case I have a neural network (NN) instance in my base workspace that I wish to use in a simulation with Simulink. I wrapped the use of the NN in an Embedded MATLAB function with input arguments that should be used by the net.
In principle, I wish to do something like this:
function XBDDprime = NN(F, XB, XBD, XBDD)
%#eml
global net;
XBDDprime = net([F XB XBD XBDD]');
The goal is to fetch the net object from the base workspace (it is an instance of the class network).
Here is an attempt at the problem, where I used evalin to read the variable from the workspace:
function XBDDprime = NN(F, XB, XBD, XBDD)
%#eml
eml.extrinsic('evalin');
net = evalin('base', 'net'); %Fetch net from workspace
XBDDprime = net([F XB XBD XBDD]'); %Error!
This doesn't compile; it seems Simulink thinks net is an array and net(...) is array subscripting (actual error message: Subscripting into an mxArray is not supported).
It seems to me that Simulink needs a full definition of any object used in order to compile the Embedded MATLAB function; is that correct? Is there even a solution? Can I use Simulink.Signal somehow to wrap the NN and add that as an argument to the function block?
Edit
I tried using load as well, to load the serialized net object from a file. That didn't work either; it seems to be the same problem, where the compiler thinks s is an mxArray.
function XBDDprime = NN(F, XB, XBD, XBDD)
%#eml
eml.extrinsic('load')
s = load('net');
XBDDprime = s.net([F XB XBD XBDD]');
Solution
I finally caved and went for the MATLAB Function block, which can look like any of the examples above.
You could define the net parameter as an input of the NN function and use a From Workspace block to get it into your model. I'm not sure if this will work with an Embedded MATLAB function block; you might need to switch to an M Code block.
gensim: generate a Simulink block for neural network simulation.
Syntax: gensim(net, st)
To get help, type help network/gensim.
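For example (a sketch, assuming net is a trained network in the workspace; st is the sample time, or -1 for a continuously sampled block):
gensim(net, 0.05) % generates a Simulink block sampled every 0.05 s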