I'm trying to build a neural network that will take in the solutions to a system of ODEs and predict the parameters of the system. I'm using Julia and, in particular, the DiffEqFlux package. The structure of the network is a few simple Dense layers chained together that predict some intermediate parameters (in this case, some chemical reaction free energies), which then feed into some deterministic (non-trained) layers that convert those parameters into the ones that go into the system of equations (in this case, reaction rate constants). From there, I've tried two different approaches:
Chain the ODE solve directly on as the last layer of the network. In this case, the loss function is just comparing the inputs to the outputs.
Have the ODE solve in the loss function, so the network output is just the parameters.
However, in neither case can I get Flux.train! to actually run.
A silly little example of the first option that gives the same error I'm getting is shown below. (I've tried to keep as many things parallel to my actual case as possible, e.g. the choice of solver, although I did omit the intermediate deterministic layers since they don't seem to make a difference.)
using Flux, DiffEqFlux, DifferentialEquations
# let's use Chris' favorite example, Lotka-Volterra
function lotka_volterra(du,u,p,t)
    x, y = u
    α, β, δ, γ = p
    du[1] = dx = α*x - β*x*y
    du[2] = dy = -δ*y + γ*x*y
end
u0 = [1.0,1.0]
tspan = (0.0,10.0)
# generate a couple sets of solutions to train on
training_params = [[1.5,1.0,3.0,1.0], [1.4,1.1,3.1,0.9]]
training_sols = [solve(ODEProblem(lotka_volterra, u0, tspan, tp)).u[end] for tp in training_params]
model = Chain(Dense(2,3), Dense(3,4), p -> diffeq_adjoint(p, ODEProblem(lotka_volterra, u0, tspan, p), Rodas4())[:,end])
# in this case we just want outputs to match inputs
# (actual parameters we're after are outputs of next-to-last layer)
training_data = zip(training_sols, training_sols)
# mean squared error loss
loss(x,y) = Flux.mse(model(x), y)
p = Flux.params(model[1:2])
Flux.train!(loss, p, training_data, ADAM(0.001))
# gives TypeError: in typeassert, expected Float64, got ForwardDiff.Dual{Nothing, Float64, 8}
I've tried all three solver layers, diffeq_adjoint, diffeq_rd, and diffeq_fd, none of which work, but all of which give different errors that I'm having trouble parsing.
For the other option (which I'd actually prefer, but either way would work), just replace the model and loss function definitions with:
model = Chain(Dense(2,3), Dense(3,4))
function loss(x,y)
    p = model(x)
    sol = diffeq_adjoint(p, ODEProblem(lotka_volterra, u0, tspan, p), Rodas4())[:,end]
    Flux.mse(sol, y)
end
The same error is thrown as above.
I've been hacking at this for over a week now and am completely stumped; any ideas?
You're running into https://github.com/JuliaDiffEq/DiffEqFlux.jl/issues/31, i.e. forward-mode AD for the Jacobian doesn't play nice with Flux.jl right now. To get around this, use Rodas4(autodiff=false) instead.
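For reference, here is the model line from the question with only that solver argument changed:
model = Chain(Dense(2,3), Dense(3,4),
              p -> diffeq_adjoint(p, ODEProblem(lotka_volterra, u0, tspan, p),
                                  Rodas4(autodiff=false))[:,end])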
I have been reading the documentation (here and here), but it's really unclear to me and I don't see how to use crossval in practice to do a leave-one-out cross-validation.
vals = crossval(fun,X)
vals = crossval(fun,X,Y,...)
mse = crossval('mse',X,y,'Predfun',predfun)
mcr = crossval('mcr',X,y,'Predfun',predfun)
val = crossval(criterion,X1,X2,...,y,'Predfun',predfun)
vals = crossval(...,'name',value)
I really don't understand the fun part.
I have estimated the chlorophyll rate with different indices. Then I did a linear regression between those indices and the field-measured chlorophyll rate. Now I want to validate them: one of my estimates is a column with 22 entries, so I want to use 21 of them for training and 1 as a test, and do 22 loops so that every data point has been used as a test.
But I don't know where I should put the regression model. If my regression is Y = aX + b, do I re-use the a and b calculated before for the training part, or do I fit a new linear regression on the training part and then see what the test prediction is with that?
I am not sure I have totally understood how to build a leave-one-out model.
Then I want to know the result of the test by calculating the RMSE (and maybe the R²).
How do I code that using crossval?
I saw the answer to the question here, but I don't have access to the crossvalind function with my license.
Well, I finally figured it out; this is my script.
First I loaded my data and the linear regression function:
X=indicesCha_without_Cloud(:,3);
y=Cha_g_m2t_without_Cloud(:,3);
testval = @(XTRAIN,ytrain,XTEST) Linear_regression_indices(XTRAIN,ytrain,XTEST);
where in my case fun (in the MathWorks help) is testval, and Linear_regression_indices is a very simple function:
function [ Linear_regression_indices ] = Linear_regression_indices(XTRAIN, ytrain, XTEST)
    % fit a first-order polynomial to the training set, then evaluate it at the test point(s)
    Linear_regression_indices = polyval(polyfit(XTRAIN, ytrain, 1), XTEST);
end
There are two ways to do it, and they both give the same result.
The first simply uses the crossval function:
cvMse = crossval('mse',X,y,'predfun',testval,'leaveout',1);
This will do as many folds as there are data points, each time using one of them as XTEST.
The second uses cvpartition:
c = cvpartition(n,'LeaveOut') creates a random partition for leave-one-out cross-validation on n observations. Leave-one-out is a special case of 'KFold', in which the number of folds equals the number of observations.
c = cvpartition(length(y),'LeaveOut');
cvMse2=crossval('mse',X,y,'predfun',testval,'partition',c);
Then the RMSE can easily be calculated:
RMSE=sqrt(cvMse);
RMSE2=sqrt(cvMse2);
Then I simply get my answer; in my case, RMSE = 0.3548.
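The R² can also be derived from the same cross-validated MSE, using the identity R² = 1 - SSE/SST (a minimal sketch; this is not part of crossval itself):
SST = sum((y - mean(y)).^2);  % total sum of squares
SSE = cvMse * numel(y);       % 'mse' is the squared error averaged over all n left-out points
R2  = 1 - SSE/SST;            % cross-validated coefficient of determination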
I want to train a neural network in tensorflow with a max-margin loss function using one negative sample per positive sample:
max(0, 1 - pos_score + neg_score)
What I'm currently doing is this:
The network takes three inputs: input1, and then one positive example input2_pos and one negative example input2_neg. (These are indices to a word embeddings layer.) The network is supposed to calculate a score that expresses how related two examples are.
Here's a simplified version of my code:
input1 = tf.placeholder(dtype=tf.int32, shape=[batch_size])
input2_pos = tf.placeholder(dtype=tf.int32, shape=[batch_size])
input2_neg = tf.placeholder(dtype=tf.int32, shape=[batch_size])
# f is a neural network outputting a score
pos_score = f(input1,input2_pos)
neg_score = f(input1,input2_neg)
cost = tf.maximum(0., 1. - pos_score + neg_score)
optimizer= tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
What I see when I run this is that the network just learns which input holds the positive example; it always predicts scores along the lines of:
pos_score = 0.9965983
neg_score = 0.00341663
How can I structure the variables/training so that the network learns the task instead?
I want just one network that takes two inputs and calculates a score expressing the correlation between them, and train it with max-margin loss.
Calculating scores for the positive and negative examples separately does not seem like an option to me, since then it won't backpropagate properly. Another option seems to be randomizing the inputs, but then the loss function needs to know which example is the positive one, and passing that in as another parameter would give away the solution again.
Any ideas?
Given your results (1 for every positive, 0 for every negative) it seems you have two different networks learning:
to predict 1 for the first one
to predict 0 for the second one
When using max-margin loss, you need to use the same network for computing both pos_score and neg_score. The way to do that is to share the variables. I will give you a small example using tf.get_variable():
with tf.variable_scope("network"):
    w = tf.get_variable("weights", shape=..., initializer=...)

def f(x, y):
    with tf.variable_scope("network", reuse=True):
        w = tf.get_variable("weights")
        res = w * (x - y)  # some computation
    return res
With this function f as model, the training will optimize the shared variable with name "network/weights".
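Applied to the code from the question, a sketch of the training setup (reusing the placeholders input1, input2_pos, input2_neg and learning_rate defined there; the tf.reduce_mean is added here to reduce the per-example losses to a scalar):
pos_score = f(input1, input2_pos)  # both calls reuse "network/weights"
neg_score = f(input1, input2_neg)
cost = tf.reduce_mean(tf.maximum(0., 1. - pos_score + neg_score))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)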
I'm filtering an audio file in MATLAB inside a script that calls a function I wrote named "filtro_pf", a band-pass filter whose outputs are the corresponding IIR coefficients:
[b,a] = filtro_pf(Ap,As,fp1,fs1,fp2,fs2,1); % BAND-PASS FILTER
b2 = [0.0039,0,-0.0156,0,0.0234,0,-0.0156,0,0.0039];
a2 =[1.0000,-6.2682,17.2660,-27.3421,27.2582,-17.5284,7.0998,-1.6555,0.1701];
wfpf = filter(b,a,audio_stream);
wavplay(wfpf,fs);
Note that the second and third lines (b2 and a2) are the values that filtro_pf() gives me for those inputs; I ran it once and then copied them into these variables. Now, after I run this script, if I ask for a and a2 in the console I get:
a =
1.0000 -6.2682 17.2660 -27.3421 27.2582 -17.5284 7.0998 -1.6555 0.1701
a2 =
1.0000 -6.2682 17.2660 -27.3421 27.2582 -17.5284 7.0998 -1.6555 0.1701
They are pretty much the same. But if I use a2 instead of a in the filter() function, it does not work: I hear a kind of tick sound and that's it. With a I can hear the sound correctly filtered. I used the same approach before and it does work:
%[b,a] = filtro_pa(Ap,As,fp,fs,1); % HIGH-PASS FILTER
b = [0.5411 -1.6232 1.6232 -0.5411];
a = [1.0000 -1.8062 1.2298 -0.2925];
wfpa = filter(b, a, audio_stream);
wavplay(wfpa,fs);
Again, I ran this script previously and saw that these values were the output for these inputs. Now, instead of calling the function again (which is commented out, by the way), I use the a and b vectors directly. It does work for the low-pass and for the high-pass filters.
All of this is for testing purposes, so I do not expect suggestions like "why use the vectors instead of calling the function, then?".
I just want to know, how can the function not work with the second variable if they are pretty much the same?
There is more precision present in the variables than is displayed, which means that your a2 and b2 vectors are not the same as a and b. It might appear surprising that errors of that order would make the filter unstable, but it appears that is what is happening. You can explore this by looking at the filter response with freqz, and by plotting the resulting audio vector rather than just listening to it.
You can use format long to print more precision, but the printed values will still have some rounding error. To avoid this, save the vectors to a .mat file and reload them; the .mat file uses a binary format and stores the full precision of your vectors.
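For example (the file name here is arbitrary):
save('filter_coeffs.mat', 'b', 'a');  % stores full double precision
% ...later, instead of hard-coding the printed values:
load('filter_coeffs.mat');            % restores b and a exactly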
The reason it works for the other filters is probably because those filters are less sensitive to rounding errors in their coefficients: they have fewer coefficients, and those coefficients are less extreme in value.
Here's a sample comparison of frequency response:
[H W] = freqz(b2, a2); % your filter (with error)
a_error = zeros(size(a2));
a_error(9) = a_error(9)+.001; % a little bit of error in a single coefficient
[HE WE] = freqz(b2, a2 + a_error); % frequency response of THAT filter
plot(log10(abs([H HE])))
As you can see, a small change makes a large difference.
An actual analysis of the instability comes from looking at the poles and zeros of the filter:
[z p k] = tf2zp(b2, a2);
abs(p)
If any poles have magnitude greater than 1 (this one does), the filter will be unstable. Try the real values, then your "approximate" values, and see what happens.
I have a transfer function that I am trying to use to filter acceleration data.
So far I have been able to use lsim with a sine wave over about 10 seconds, and I'm getting the result I expect. However, I cannot work out how to get data to the function in real time.
To clarify, every 0.1 seconds I am receiving an acceleration value from an external program. I need to filter the values to remove high frequency variations in the data. I need this to occur for each data point I receive as I then use the current filtered acceleration value into additional processing steps.
How do I use the transfer function continuously and update the output value each time new data is received?
This is an example of how to do this with filter:
filter_state = []; % start with empty filter history
while not_stopped()
    x = get_one_input_sample();
    [y, filter_state] = filter(B, A, x, filter_state);
    process_one_output_sample(y);
end
Note that you need to use the extended form of filter, in which you pass the history of the filter from one iteration to the next via the variable filter_state. B and A are the coefficients of your discrete filter. If your filter is continuous, you need to transform it into a discrete filter first, using c2d or similar. Also note that if your filter is very complex, you might need to split the filtering into several stages (e.g. one second-order section per stage) to avoid numerical problems.
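As a sketch of that discretization step (this assumes the Control System Toolbox; num and den stand in for the numerator and denominator of your continuous transfer function, and 0.1 s is the sample interval from the question):
Ts = 0.1;                             % sampling period in seconds
Hd = c2d(tf(num, den), Ts, 'tustin'); % bilinear (Tustin) discretization
[B, A] = tfdata(Hd, 'v');             % coefficient vectors for use with filter()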