Back propagation, how to deal with one parameter updated twice - neural-network

In my neural network, one parameter (the word embedding of some word) will be updated twice. Do I need to apply the two updates one after the other, or is there some other way to update this parameter?
This is the current method I think:
w = w_old + learning_rate * (grad_1 + grad_2)
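A minimal sketch of why summing the two gradients is the usual answer, assuming the parameter is shared between two terms of a single loss that is being minimized. The toy loss, the input values, and the names `loss` and `grads` are made up for illustration. By the chain rule, the gradient with respect to a shared parameter is the sum of the gradients from each place it is used, which the finite-difference check below confirms. Note that when minimizing a loss the step is taken against the gradient, so the sketch uses a minus sign; the plus sign in the formula above is only right if grad_1 and grad_2 already denote descent directions.

```python
def loss(w, x1=2.0, y1=1.0, x2=-3.0, y2=0.5):
    # The same parameter w appears in two terms (e.g. one word used twice).
    return (w * x1 - y1) ** 2 + (w * x2 - y2) ** 2


def grads(w, x1=2.0, y1=1.0, x2=-3.0, y2=0.5):
    # Gradient contribution from each use of w, computed separately.
    grad_1 = 2 * (w * x1 - y1) * x1
    grad_2 = 2 * (w * x2 - y2) * x2
    return grad_1, grad_2


w_old = 0.7
grad_1, grad_2 = grads(w_old)

# Central finite difference of the full loss, for comparison with grad_1 + grad_2.
eps = 1e-6
numeric = (loss(w_old + eps) - loss(w_old - eps)) / (2 * eps)

# One gradient-descent step using the accumulated gradient.
learning_rate = 0.01
w_new = w_old - learning_rate * (grad_1 + grad_2)

print(grad_1 + grad_2, numeric)
print(loss(w_old), loss(w_new))
```

Applying the two updates one after the other instead would evaluate the second gradient at an already-moved parameter, which is a different (and usually unintended) procedure.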

Requested output arguments of a function [duplicate]

This question already has an answer here:
How to determine if an output of a function-call is unused?
(1 answer)
Closed 4 years ago.
Let's say I have
function [a, b] = foo
a = 1;
b = 2;
and the user is calling
[~, B] = foo;
I would like only b = 2 to happen, to avoid the time-consuming operation a = 1. Is there a way to find out that a was actually not requested by the user?
Long ago it was not possible. I wonder if Mathworks improved this or anyone found a workaround in the meantime.
Note: the opposite is possible: if user calls A = foo, the nargout is 1.
It is not possible for good reason. Just because you do not require the middle output at the point of return of the function, doesn't mean that its calculation was unnecessary within the function, unless you explicitly write your function to do so.
E.g., it is possible that the calculation of output 3 depends on output 2 inside the function (even if output 2 is not explicitly requested at output time). Matlab has no way of knowing this, so it cannot assume that any and all calculations involving that output can be discarded.
If you require a memory efficient manner of ensuring only the correct calculations take place, then change your output strategy.
I would suggest returning a struct with the right fields, where you request the fields you desire at function calling. In this way, your final struct would only contain the fields you desire, and you can ensure inside your function that unnecessary calculations do not take place.
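The struct idea can be sketched as follows. This is an illustration in Python rather than Matlab, with a dict standing in for the struct; the names `foo`, `compute_a`, `compute_b`, and `call_counts` are hypothetical. The caller names the fields it wants, and only those are computed.

```python
call_counts = {"a": 0, "b": 0}


def compute_a():
    # Stand-in for the time-consuming operation.
    call_counts["a"] += 1
    return 1


def compute_b():
    call_counts["b"] += 1
    return 2


def foo(requested):
    """Return a dict holding only the requested fields."""
    producers = {"a": compute_a, "b": compute_b}
    return {name: producers[name]() for name in requested}


result = foo(["b"])
print(result)
print(call_counts)
```

If one output depends on another internally, the function itself can decide to compute the dependency without returning it, which is exactly the knowledge the caller-side `~` cannot provide.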

views in collections in scala

I understand that a view is a light-weight collection and that it is lazy. I would like to understand what makes a view light weight.
Say I have a list of 1000 random numbers. I'd like to find the even numbers in this list and pick only the first 10 of them. I believe using a view here is better because we can avoid creating an intermediate list, especially since I'll pick only the first 10 even numbers. Initially, I thought the optimization was achieved because the function I use in the filter method would not be executed until the method force is called, but I believe this isn't correct. I am struggling to understand what makes using the view better in this scenario. Or have I picked the wrong example?
val r = scala.util.Random
val l:List[Int] = List.tabulate(1000)(x=>r.nextInt())
//without view, I'll get an intermediate list. The function x%2==0 will be evaluated on each element of l
val l1 = l.filter(x=>(x%2 == 0))
//this will give the size of l1. I got size as 508 but yours could be different depending on the random numbers generated in your case
l1.size
//pick 1st 10 even numbers
val l2 = l1.take(10)
//using view. I thought that x%2==0 will not be executed right now
val lv1 = l.view.filter(x=>(x%2 == 0))
lv1: scala.collection.SeqView[Int,List[Int]] = SeqViewF(...)
lv1.size //this is same as l1 size so my assumption that x%2==0 will not be executed is wrong else lv1.size will not be same as l1.size
val lv2 = lv1.take(10).force
Question 1 - if I use view, how is the processing optimised?
Question 2 - lv1 is of type SeqViewF, F is related to filter but what does it mean?
Question 3 - what do the elements of lv1 look like (l1 for example are integers)
You wrote:
lv1.size //this is same as l1 size so my assumption that x%2==0 will
not be executed is wrong else lv1.size will not be same as l1.size
Your assumption is actually correct; it's just that your means of measuring the difference is faulty.
val l:List[Int] = List.fill(10)(util.Random.nextInt) // ten random Ints
// print every Int that gets tested in the filter
val lv1 = l.view.filter{x => println(x); x%2 == 0} // no lines printed
lv1.size // ten Ints sent to STDOUT
So, as you see, taking the size of your view also forces its completion.
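The same measurement trick (counting how often the predicate runs) can be reproduced outside Scala. The following is a rough Python analogue, not Scala's view implementation: Python's lazy `filter` iterator stands in for the view, `itertools.islice` stands in for `take`, and the names `is_even`, `counter`, and `lazy_evens` are made up for illustration. A fixed list is used instead of random numbers so that the counts are predictable.

```python
from itertools import islice

counter = {"calls": 0}


def is_even(x):
    # Count every element that the predicate actually tests.
    counter["calls"] += 1
    return x % 2 == 0


numbers = list(range(1000))

# Building the lazy pipeline tests nothing yet.
lazy_evens = filter(is_even, numbers)
calls_after_filter = counter["calls"]

# Taking the first 10 results tests only as many elements as needed.
first_ten = list(islice(lazy_evens, 10))
calls_after_take = counter["calls"]

print(calls_after_filter, calls_after_take, first_ten)
```

Asking for the total number of even elements, on the other hand, would require testing all 1000 elements, which is why calling size says nothing about laziness.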
Yeah, that's not a very fitting example. What you are doing is better done with an iterator: list.iterator.filter(_ % 2 == 0).take(10). This doesn't create intermediate collections, and does not scan the list past the first 10 even elements (a view wouldn't either, it's just a bit of an overcomplication for this case).
A view is a sequence of delayed operations. It has a reference to the collection, and a bunch of operations to be applied when it is forced. The way operations to be applied are recorded is rather complicated, and not really important. You guessed right - SeqViewF means a view of a sequence with a filter applied. If you map over it, you'll get a SeqViewFM etc.
When would this be needed?
One example is when you need to "massage" a sequence that you are passing somewhere else. Suppose, you have a function, that combines elements of a sequence you pass in somehow:
def combine(s: Seq[Int]) = s.iterator.zipWithIndex.map {
  case (x, i) if i % 2 == 0 => x
  case (x, _) => -x
}.sum
Now, suppose, you have a huge stream of numbers, and you want to combine only even ones, while dropping the others. You can use your existing function for that:
val result = combine(stream.view.filter(_ % 2 == 0))
Of course, if the combine parameter were declared as an iterator to begin with, you would not need the view at all, but that is not always possible; sometimes you just have to use some standard interface that wants a sequence.
Here is a fancier example, that also takes advantage of the fact that the elements are computed on access:
def notifyUsers(users: Seq[User]) = users
  .iterator
  .filter(_.needsNotification)
  .foreach(_.notify)
timer.schedule(60 seconds) { notifyUsers(userIDs.view.map(getUser)) }
So, I have some ids of the users that may need to be notified of some external events. I have them stored in userIDs.
Every minute a task runs, that finds all users that need to be notified, and sends a notification to each of them.
Here is the trick: notifyUsers takes a collection of User as a parameter. But what we are really passing in is a view, composed of the initial set of user ids and a .map operation that gets the User object for each of them. As a result, every time the task runs, a new User object will be obtained for each id (perhaps from the database), so if the needsNotification flag changes, the new value is picked up.
Surely, I could change notifyUsers to receive the list of ids and do getUser on its own instead, but that wouldn't be as neat. First, this way it is easier to unit-test: I can just pass a list of test objects directly in, without bothering to mock out getUser. And second, a generic utility like this is more useful: a User could be a trait, for example, representing many different domain objects.

Pause JModelica and Pass Incremental Inputs During Simulation

Hi Modelica Community,
I would like to run two models in parallel in JModelica but I'm not sure how to pass variables between the models. One model is a python model and the other is an EnergyPlusToFMU model.
The examples in the JModelica documentation have the inputs for the full simulation period defined prior to the simulation of the model. I don't understand how one would configure a model that pauses for inputs, which is a key feature of FMUs and co-simulation.
Can someone provide me with an example or piece of code that shows how this could be implemented in JModelica?
Do I put the simulate command in a loop? If so, how do I handle warm up periods and initialization without losing data at prior timesteps?
Thank you for your time,
Justin
Late answer, but in case it is picked up by others...
You can indeed put the simulation into a loop, you just need to keep track of the state of your system, such that you can re-init it at every iteration. Consider the following example:
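Ts = 100
# x_0: dict mapping state start-variable names to their initial values
x_k = x_0
for k in range(100):
    # Do whatever you need to get your input here
    u_k = ...
    FMU.reset()
    FMU.set(list(x_k.keys()), list(x_k.values()))
    sim_res = FMU.simulate(
        start_time=k*Ts,
        final_time=(k+1)*Ts,
        input=u_k
    )
    # Take the state at the last time point of this interval
    x_k = get_state(FMU, sim_res, -1)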
Now, I have written a small function to grab the state, x_k, of the system:
# Get state names and their values at given index
def get_state(fmu, results, index):
    # Identify states as variables with a _start_ value
    identifier = "_start_"
    keys = fmu.get_model_variables(filter=identifier + "*").keys()
    # Now, loop through all states, get their value and put it in x
    x = {}
    for name in keys:
        x[name] = results[name[len(identifier):]][index]
    # Return state
    return x
This relies on setting the compile option "state_initial_equations": True.
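To see the pattern in isolation, without JModelica or an FMU, here is a self-contained toy sketch. Everything in it is a stand-in: `simulate_chunk` is a hypothetical replacement for the FMU simulate call, the model dx/dt = -x + u is invented, and the input rule is arbitrary. It only shows the idea of simulating one interval at a time, choosing the input at run time, and carrying the final state into the next interval so that no earlier data is lost.

```python
def simulate_chunk(x_start, u, start_time, final_time, dt=0.1):
    """Toy stand-in for an FMU simulate call: explicit Euler on dx/dt = -x + u."""
    steps = round((final_time - start_time) / dt)
    x = x_start
    trajectory = [x]
    for _ in range(steps):
        x = x + dt * (-x + u)
        trajectory.append(x)
    return trajectory


Ts = 1.0          # length of one simulation interval
x_k = 0.0         # initial state
history = [x_k]   # full trajectory, stitched together across intervals

for k in range(5):
    # The input for this interval is decided only now, e.g. by another model.
    u_k = 1.0 if k < 3 else 0.0
    chunk = simulate_chunk(x_k, u_k, k * Ts, (k + 1) * Ts)
    # Carry the final state over as the start state of the next interval.
    x_k = chunk[-1]
    # Skip the first point of each chunk; it repeats the previous end point.
    history.extend(chunk[1:])

print(len(history), history[30], history[-1])
```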

clearer explanation of function level scope for recursion

This is an example from the book 'Matlab for Neuroscientists'. I don't understand the order in which, or why, g gets assigned a new value after each recursion. Nor do I understand why "factorial2" is included in the final line of code.
here is a link to the text
Basically, I am asking for someone to re-word the author's explanation (circled in red) of how the function works, as if they were explaining the concept and processes to a 5-year-old. I'm brand new to programming. I thought I understood how this worked from reading another book, but now this author's explanation is causing nothing but confusion. Many thanks to anyone who can help!!
A recursive method works by breaking a larger problem into smaller problems each time the method is called. This allows you to break what would be a difficult problem into a series of smaller problems. The example here is the summation of the first n integers (the function below is named factorial, but it adds rather than multiplies).
Each recursive function has 2 parts:
1) The base case: The lowest value that we care about evaluating. Usually this goes to zero or one.
if (num == 1)
out = 1;
end
2) The general case: The general case is what we are going to call until we reach the base case. We call the function again, but this time with 1 less than the previous function started with. This allows us to work our way towards the base case.
out = num + factorial(num-1);
This statement means that we first call the function with 1 less than what this function was called with: we started with three, the next call starts with two, and the call after that starts with 1 (which triggers our base case!).
Once our base case is reached, the methods "recurse out". This means they bounce backwards, each returning into the function that called it and bringing along the result from the calls below it! It is at this point that our summation actually occurs.
Once the original function is reached, we have our final summation.
For example, let's say you want the summation of the first 3 integers.
The first recursive call is passed the number 3.
function [out] = factorial(num)
if (num == 1)
    %// Base case: stop recursing here
    out = 1;
else
    %// General case: only runs when the base case has not been reached
    out = num + factorial(num-1);
end
Walking through the function calls:
factorial(3)              %// Initial function call
%// Becomes...
3 + factorial(2)
3 + (2 + factorial(1))
3 + (2 + 1) = returned value
This gives us a result of 6.
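To watch the calls go down to the base case and then "recurse out", here is a small sketch in Python rather than Matlab. The function name `sum_to` and the `depth` parameter (used only for indenting the printed trace) are made up for illustration; the logic mirrors the base case and general case described above.

```python
def sum_to(num, depth=0):
    """Recursively add the integers from 1 to num, printing each call and return."""
    indent = "  " * depth
    print(f"{indent}sum_to({num}) called")
    if num == 1:
        # Base case: nothing left to break down.
        result = 1
    else:
        # General case: this call waits until the smaller call has returned.
        result = num + sum_to(num - 1, depth + 1)
    print(f"{indent}sum_to({num}) returns {result}")
    return result


total = sum_to(3)
```

In the printed trace, all the "called" lines appear first, from 3 down to 1, and only then do the "returns" lines appear in the opposite order, which is the bouncing backwards described above.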

How do I get MuPAD to stop using placeholders for matrix elements?

I'm trying to evaluate some matrix multiplications in MuPAD. The output is using sigmas as placeholders for the matrix elements since they are long expressions (I assume that's the reason). Is there a way to get MuPAD to display the individual matrix elements as (in my case) the exponential functions that they really are, regardless of the length of the expression?
Below is an example of a case where MuPAD is using sigmas instead of the actual exponential functions. I would like to be able to see what the individual matrix elements of TotT^4 really are.
The commands I executed in the MuPAD interface that lead up to TotT^4 are:
T1 := matrix([[exp((J+B/2)/T),exp(-(J+B/6)/T)],[exp((-J+B/6)/T),exp((J-B/2)/T)]])
T2 := matrix([[exp((J1+B/2)/T),exp(-(J1+B/6)/T)],[exp((-J1+B/6)/T),exp((J1-B/2)/T)]])
T1d := linalg::transpose(T1)
TotT := T1d*T2
The class of your variable can be obtained via type(TotT): Dom::Matrix. You may want to look at the many methods of this class in the documentation. As far as I can tell, this issue has something to do with the pretty printing of the class's print method. Other classes exhibit this same substitution, so it may be a function of the overloaded print. I was not able to change the behavior by adjusting setPrintMaxSize, PRETTYPRINT, TEXTWIDTH, or any of the optional arguments to print. You might still try yourself as there are many permutations.
I also tried using the expand function. expand(TotT,IgnoreAnalyticConstraints) nearly works, though it could have undesirable effects in some cases if things were expanded too much. Calling simplify does get rid of the substitutions, but it also changes the nature of some of the entries by simplifying them. It is probably also not a general solution to this issue.
One way that does work, but is ugly, is to use the expr2text method, which returns a result as a string:
expr2text(TotT)
which returns
"matrix([[exp((B/6 - J)/T)*exp((B/6 - J1)/T) + exp((B/2 + J)/T)*exp((B/2 + J1)/T), ...
exp(-(B/2 - J1)/T)*exp((B/6 - J)/T) + exp((B/2 + J)/T)*exp(-(B/6 + J1)/T)], ...
[exp(-(B/2 - J)/T)*exp((B/6 - J1)/T) + exp((B/2 + J1)/T)*exp(-(B/6 + J)/T), ...
exp(-(B/2 - J)/T)*exp(-(B/2 - J1)/T) + exp(-(B/6 + J)/T)*exp(-(B/6 + J1)/T)]])"
I think that this question would be a good one to ask over at Matlab Central or by filing a service request if you have a license with support.