Calculation of the diffusion/activation energy - lammps

I am currently attempting to do a KMC simulation to calculate the diffusion coefficient of an Inconel-Ni alloy using LAMMPS. As a part of that, I would like to know how to calculate, in LAMMPS preferably (or in any other way), the diffusion/activation energy that is taken or expended by all the atoms at any given timestep in the model when they diffuse from their current atomic site to a new atomic site. I cannot use the Arrhenius equation since my objective is to calculate the diffusion coefficient 'D' and both D and activation energy 'E_a' are unknown. Could anyone provide any insight into this? Thank you.
Regards,
Rajesh

Related

Why do we take the derivative of the transfer function in calculating back propagation algorithm?

What is the concept behind taking the derivative? It's interesting that for somehow teaching a system, we have to adjust its weights. But why are we doing this using a derivation of the transfer function. What is in derivation that helps us. I know derivation is the slope of a continuous function at a given point, but what does it have to do with the problem.
You must already know that the cost function is a function with the weights as the variables.
For now consider it as f(W).
Our main motive here is to find a W for which we get the minimum value for f(W).
One of the ways for doing this is to plot function f in one axis and W in another....... but remember that here W is not just a single variable but a collection of variables.
So what can be the other way?
It can be as simple as changing values of W and see if we get a lower value or not than the previous value of W.
But taking random values for all the variables in W can be a tedious task.
So what we do is, we first take random values for W and see the output of f(W) and the slope at all the values of each variable(we get this by partially differentiating the function with the i'th variable and putting the value of the i'th variable).
now once we know the slope at that point in space we move a little further towards the lower side in the slope (this little factor is termed alpha in gradient descent) and this goes on until the slope gives a opposite value stating we already reached the lowest point in the graph(graph with n dimensions, function vs W, W being a collection of n variables).
The reason is that we are trying to minimize the loss. Specifically, we do this by a gradient descent method. It basically means that from our current point in the parameter space (determined by the complete set of current weights), we want to go in a direction which will decrease the loss function. Visualize standing on a hillside and walking down the direction where the slope is steepest.
Mathematically, the direction that gives you the steepest descent from your current point in parameter space is the negative gradient. And the gradient is nothing but the vector made up of all the derivatives of the loss function with respect to each single parameter.
Backpropagation is an application of the Chain Rule to neural networks. If the forward pass involves applying a transfer function, the gradient of the loss function with respect to the weights will include the derivative of the transfer function, since the derivative of f(g(x)) is f’(g(x))g’(x).
Your question is a really good one! Why should I move the weight more in one direction when the slope of the error wrt. the weight is high? Does that really make sense? In fact it does makes sense if the error function wrt. the weight is a parabola. However it is a wild guess to assume it is a parabola. As rcpinto says, assuming the error function is a parabola, make the derivation of the a updates simple with the Chain Rule.
However, there are some other parameter update rules that actually addresses this, non-intuitive assumption. You can make update rule that takes the weight a fixed size step in the down-slope direction, and then maybe later in the training decrease the step size logarithmic as you train. (I'm not sure if this method has a formal name.)
There are also som alternative error function that can be used. Look up Cross Entropy in you neural network text book. This is an adjustment to the error function such that the derivative (of the transfer function) factor in the update rule cancels out. Just remember to pick the right cross entropy function based on you output transfer function.
When I first started getting into Neural Nets, I had this question too.
The other answers here have explained the math which makes it pretty clear that a derivative term will appear in your calculations while you are trying to update the weights. But all of those calculations are being done in order to implement Back-propagation, which is just one of the ways of updating weights! Now read on...
You are correct in assuming that at the end of the day, all a neural network tries to do is update its weights to fit the data you feed into it. Within this statement lies your answer too. What you are getting confused with here is the idea of the Back-propagation algorithm. Many textbooks use backprop to update neural nets by default but do not mention that there are other ways to update weights too. This leads to the confusion that neural nets and backprop are the same thing and are inherently connected. This also leads to the false belief that neural nets need backprop to train.
Please remember that Back-propagation is just ONE of the ways out there to train your neural network (although it is the most famous one). Now, you must have seen the math involved in backprop, and hence you can see where the derivative term comes in from (some other answers have also explained that). It is possible that other training methods won't need the derivatives, although most of them do. Read on to find out why....
Think about this intuitively, we are talking about CHANGING weights, the direct mathematical operation related to change is a derivative, makes sense that you should need to evaluate derivatives to change weights.
Do let me know if you are still confused and I'll try to modify my answer to make it better. Just as a parting piece of information, another common misconception is that gradient descent is a part of backprop, just like it is assumed that backprop is a part of neural nets. Gradient descent is just one way to minimize your cost function, there are plenty of others you can use. One of the answers above makes this wrong assumption too when it says "Specifically Gradient Descent". This is factually incorrect. :)
Training a neural network means minimizing an associated "error" function wrt the networks weights. Now there are optimization methods that use only function values (Simplex method of Nelder and Mead, Hooke and Jeeves, etc), methods that in addition use first derivatives (steepest descend, quasi Newton, conjugate gradient) and Newton methods using second derivatives as well. So if you want to use a derivative method, you have to calculate the derivatives of the error function, which in return involves the derivatives of the transfer or activation function.
Back propagation is just a nice algorithm to calculate the derivatives, and nothing more.
Yes, the question was really good, this question was also came in my head while i am understanding the Backpropagation. After doing ForwordPropagation on neural network we do back propagation in network to minimize the total error. And there also many other way to minimize the error.your question is why we are doing derivative in backpropagation, the reason is that, As we all know the meaning of derivative is to find the slope of a function or in other words we can find change of particular thing with respect to particular thing. So here we are doing derivative to minimize the total error with respect to the corresponding weights of the network.
and here by doing the derivation of total error with respect to weights we can find it's slope or in other words we can find what is the change in total error with respect to the small change of the weight, so that we can update the weight to minimize the error with the help of this Gradient Descent formula, that is, Weight= weight-Alpha*(del(Total error)/del(weight)).Or in other words New Weights = Old Weights - learning-rate x Partial derivatives of loss function w.r.t. parameters.
Here Alpha is the learning rate which is control the weight update, means if the derivative the - ve than Alpha make it +ve(Becouse of -Alpha in formula) and if +ve it's remain +ve so that weight update goes in +ve direction and it's reflected to minimize the Total error.And also the as derivative part is multiples with Alpha, it's decrees the step size of Alpha when the weight converge to the optimal value of weight(minimum error). Thats why we are doing derivative to minimize the error.

Band Periodicity

this is one of sounds features and it has been used in some AI voice app.
"The periodicity property of each sub-band is represented by the maximum
local peak of the normalized correlation function."
I can't understand the way to calculate it, any help will be appreciated
any law to find "normalized correlation" and "maximum local peak" .
in my law i cant understand what "s(n)" value.
I need a "s(n)" explanation so i can build a function using matlab , or a clearer law..
please if u don't have an idea about band periodicity don't try to searcher about it,there is nothing on the internet!
s(n) is just the sample at index n of the signal after bandpass filtering.

Beginners issue in polynomial curve fitting [Part 1]

I have just started understanding modeling techniques based on regression models and was going through MATLAB curve fitting toolbox and the SO. I have fundamental doubts and unable to proceed further. I have a single vector set with k=100 data points which I want to fit into an AR model,MA model,ARMA model successively to see which is better suited.Starting with an AR(p) model of the form y(k+1)=a*y(k)+ b*y(k-1)The command
coeff = polyfit(x,y,d)
will fit a polynomial of degree say d=1 with p number of coefficients indicating the order of the model (AR(p)). But I just have 1 set of data which is the recording of the angular moment.So,what will go as the first parameter (x) of the function signature i.e what will be x,y?Then, what if the linear models are not good enough so I may have to select the nonlinear models.Can somebody please guide with code snippets what are the steps in fitting,checking for overfitting,residual calculation etc.
x is likely to be k (index of y). And the whole code:
c =polyfit(1:length(y), y, d).
Matlab has a curve fitting toolbox. You could use it to check different nonlinear fitting in GUI to get some intuition.
If you want steps there's a great Coursera Machine Learning course. The beginning of this course is related to linear regression and I recommend you to spend some hours at least on that beginning.

ANN-based navigation system

I am currently working on an indoor navigation system using a Zigbee WSN in star topology.
I currently have signal strength data for 60 positions in an area of 15m by 10 approximately. I want to use ANN to help predict the coordinates for other positions. After going through a number of threads, I realized that normalizing the data would give me better results.
I tried that and re-trained my network a few times. I managed to get the goal parameter in the nntool of MATLAB to the value .000745, but still after I give a training sample as a test input, and then scaling it back, it is giving a value way-off.
A value of .000745 means that my data has been very closely fit, right? If yes, why this anomaly? I am dividing and multiplying by the maximum value to normalize and scale the value back respectively.
Can someone please explain me where I might be going wrong? Am I using the wrong training parameters? (I am using TRAINRP, 4 layers with 15 neurons in each layer and giving a goal of 1e-8, gradient of 1e-6 and 100000 epochs)
Should I consider methods other than ANN for this purpose?
Please help.
For spatial data you can always use Gaussian Process Regression. With a proper kernel you can predict pretty well and GP regression is a pretty simple thing to do (just matrix inversion and matrix vector multiplication) You don't have much data so exact GP regression can be easily done. For a nice source on GP Regression check this.
What did you scale? Inputs or outputs? Did scale input+output for your trainingset and only the output while testing?
What kind of error measure do you use? I assume your "goal parameter" is an error measure. Is it SSE (sum of squared errors) or MSE (mean squared errors)? 0.000745 seems to be very small and usually you should have almost no error on your training data.
Your ANN architecture might be too deep with too few hidden units for an initial test. Try different architectures like 40-20 hidden units, 60 HU, 30-20-10 HU, ...
You should generate a test set to verify your ANN's generalization. Otherwise overfitting might be a problem.

spectral coherence of time seires

I have two time series of data, one which is water temperature and the other is air temperature (hourly measurements for one year). Both measurements are taken simultaneously and the vectors are therefore the same size. The command corrcoef illustrates that they have a correlation equal to ~0.9.
Now I'm trying a different approach to find the correlation where I was thinking of spectral coherence. As far as I understand, in order to do this I should find the autospectral density of each time series? (i.e. of water temperature and air temperature) and then find the correlation between them?
As I am new to signal processing I was hoping for some advice on the best ways of doing this!
I would recommend consulting this site. It contains an excellent reference to your question. If you need help with the cohere function, let me know.