I am wondering, if there is an example code doing signal transform using neural network?
I am formulating my question in following steps:
I have a input signal x and its output y. So the x goes through the whole network and eventually becomes y.
So conceptually, if I have a bunch of x and a bunch y, I should be able to converge those weights in the network, such that next time when I throw a new x into the network, a corresponding new y will generated based on the training the network had.
So simply, through training, I will have my own transformeor/operator that turns x into y.
Is there such an code package? Or am I looking for something not existing? The reason is because this problem sound very fitting the neural network nature.
Thanks a lot.
Related
I'm currently working through the SciML tutorials workshop exercises for the Julia language (https://tutorials.sciml.ai/html/exercises/01-workshop_exercises.html). Specifically, I'm stuck on exercise 6 part 3, which involves training a neural network to approximate the system of equations
function lotka_volterra(du,u,p,t)
x, y = u
α, β, δ, γ = p
du[1] = dx = α*x - β*x*y
du[2] = dy = -δ*y + γ*x*y
end
The goal is to replace the equation for du[2] with a neural network: du[2] = NN(u, p)
where NN is a neural net with parameters p and inputs u.
I have a set of sample data that the network should try to match. The loss function is the squared difference between the network model's output and that sample data.
I defined my network with
NN = Chain(Dense(2,30), Dense(30, 1)). I can get Flux.train! to run, but the problem is that sometimes the initial parameters for the neural network result in a loss on the order of 10^20 and so training never converges. My best try got the loss down from about 2000 initially to about 20 using the ADAM optimizer over about 1000 iterations, but I can't seem to do any better.
How can I make sure my network is consistently trainable, and is there a way to get better convergence?
How can I make sure my network is consistently trainable, and is there a way to get better convergence?
See the FAQ page on techniques for improving convergence. In a nutshell, the single shooting approach of most ML papers is very unstable and does not work on most practical problems, but there are a litany of techniques to help out. One of the best ones is multiple shooting, which optimizes only short bursts (in parallel) along the time series.
But training on a small interval and growing the interval works, also using more stable optimizers (BFGS) can work. You can also weigh the loss function so that earlier times mean more. Lastly, you can minibatch in a way similar to multiple shooting, i.e. start from a data point and only solve to the next (in fact, if you actually look at the original neural ODE paper NumPy code, they do not do the algorithm as explained but instead do this form of sampling to stabilize the spiral ODE training).
If I use the trained RNN(or LSTM) to generate series data(name the network with generated-RNN), then I use the series data to train the RNN(i.e. the same structure with the generated-RNN) from scratch, is it possible to get the same trained network(the same trained weights) with the generated-RNN?
You take series with X as input and Y as output to train a model, then with that model you generate series O.
Now you want to recreate the Ws from O=sigmoid(XWL1+b)...*WLN+bn with O as input and X as output.
Is it possible?
Unless X=O, then most likely not. I can't give a formal mathematical proof but multiplying forward through a network isn't equal to multiplying backwards, mainly due to the activation function. If you removed the activation function or took the inverse of the activation function you would more likely approach the Ws you desired, though another set of weights might also get you the the same output for a given input.
Also, this question would be better received in stats.stackexchange than stackoverflow.
We have weights and optimizer in the neural network.
Why cant we just W * input then apply activation, estimate loss and minimize it?
Why do we need to do W * input + b?
Thanks for your answer!
There are two ways to think about why biases are useful in neural nets. The first is conceptual, and the second is mathematical.
Neural nets are loosely inspired by biological neurons. The basic idea is that human neurons take a bunch of inputs and "add" them together. If the sum of the inputs is greater than some threshold, then the neuron will "fire" (produce an output that goes to other neurons). This threshold is essentially the same thing as a bias. So, in this way, the bias in artificial neural nets helps to replicate the behavior of real, human neurons.
Another way to think about biases is simply by considering any linear function, y = mx + b. Let's say you are using y to approximate some linear function z. If z has a non-zero z-intercept, and you have no bias in the equation for y (i.e. y = mx), then y can never perfectly fit z. Similarly, if the neurons in your network have no bias terms, then it can be harder for your network to approximate some functions.
All that said, you don't "need" biases in neural nets--and, indeed, recent developments (like batch normalization) have made biases less frequent in convolutional neural nets.
I have a data set with three input variables and one output. I want to fit the target column to the input data using an artificial neural network with one layer, for example. My concern is, can I obtain a closed form or analytical formula that links the modeled target to the three input variable? Please let me know whether the software can process that.
Technically, you can always translate a one-layer neural net into a closed form solution. Let's call your input variables x, y, and z, your bias node b, and the connections between them and your output cx, cy, cz, and cb respectively. Once you've trained the neural network so that the connection weights are adjusted, the output value should equal xcx + ycy + zcz + cb. (technically the bias term is bcb, but b = 1).
If you were using a more complicated neural network, this equation would get a lot more complicated, particularly as you add hidden layers, but theoretically you can always write it out analytically. It just won't always be tractable to evaluate the analytic form.
For a more detailed answer, see: Deriving equation by weights and biases from a neural network
I am new to this neural network in matlab. I wanted to create a Neural Network using matlab simulation.
This matlab simulation is using pattern recognition.
I am running on a windows XP platform.
For example, I have a sets of waveforms of circular shape.
I have extracted out the poles.
These poles will teach my Neural Network that it is circular in shape, hence whenever I input another set of slightly different circular shape waveform, the Neural Network is able to distinguish between the shape.
Currently, I have extracted the poles of these 3 shapes, cylinder, circle and rectangle.
But I am clueless of how I should go about creating my Neural Network.
I'd recommend utilizing SOM (Self-organizing map) for pattern recognition since it's really robust. Also there's a Som Toolbox for Matlab you might be interested in. However, to make it learn waves while neglecting their offsets, you'd need to make some changes to the "similarity function". These changes will affect quite a lot on the SOM's training time but if that's not a problem, keep reading.
For the SOM you'll have to sample your waves to constant sized vectors, let say:
sin x -> sin_vector = (a1, a2, a3, ..., aN)
cos x -> cos_vector = (b1, b2, b3, ..., bN)
Usually similarity of "SOM-vectors" is calculated with euclidian distance. Euclidian distance of those two vectors is huge since they have a different offset. In your case they should be considered to be similar ie. distance to be small. So.. if you don't sample all the similar waves from the same starting point, they will be classified in different classes. That is probably a problem. But! Similarity of vectors in SOM is calculated in order to find the BMU (best-matching unit) from the map and pulling the BMU's and its neigborhood's vectors torwards the values of the given sample. So all you need to change is the way to compare those vectors and the way to pull the vectors' values torwards the sample so that both will be "offset-tolerent".
Slow but working solution is first finding the best offset index for each vector. Best offset index is the one that will produce the smallest value with euclidian distance for the sample. Smallest distance calculated with some node of the net will then be the BMU. Then the BMU's and its neigborhood's vectors are pulled torwards the given sample using the offset index calculated for each node just before. Everything else should work out-of-the-box.
This solution is relatively slow but should work great. I'd recommend studying the consept of SOM thoroughly and then reading this post (and angry comments) again :)
PLEASE comment if you know some mathematical solution that would be better than that previous one!
You can try to use Matlab's Neural network pattern recognition tool nprtool as it is specialize to train and test neural network for pattern recognition.