I came across the term GFF in a research paper. I found a line about GFF at https://www.semanticscholar.org/paper/The-Systematic-Trajectory-Search-Algorithm-for-the-Chen-Tseng/5c01686a41c31a6b7a9077edb323ed88cf158a98 that says "...links are not restricted to just going from one layer to the next layer". Does that mean some of the links from one layer can skip the next layer and feed into a non-adjacent layer? If so, what do the links of the adjacent layer do? Can anybody shed some light on this type of network?
I'm not sure where you're getting confused; the illustration in figure 1 of that paper makes it quite clear to me. Yes, a link can go from any layer to any higher layer; links are not restricted to the next layer up. Note how node 1 in the input layer drives the hidden node in each of the three hidden layers, as well as the output layer. [I'll stick with node 1; the four input nodes are topologically identical.]
I'm also not sure what you mean by "the links of the adjacent layer". From your usage, I gather that you consider a link to be owned by the layer of its source node. For example, the link from node 5 to node 8 "belongs" to the first (lowest) hidden layer, not to the output layer.
With that usage, let's look at a particular case in point: the link from node 1 to node 6 (in the middle hidden layer), which skips the lowest hidden layer (consisting of node 5). For the sake of illustration, let's ignore the other links from node 1. Now node 1 is driving only node 6, driving it directly from the input layer. This skip does not affect the other links at all: they continue to do what they always do, driving the value of their source node into the linear equation of their destination node. Node 5 continues to be a function of the other input nodes; node 5 continues to drive nodes 6, 7, and 8.
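In symbols (hypothetical weight names, keeping only the two links into node 6 that we're discussing): if x1 is the value of input node 1, y5 is the output of node 5, w16 and w56 are the link weights, and b6 is node 6's bias, then node 6 computes
y6 = f(w16*x1 + w56*y5 + b6)
so the skipping link merely contributes one more term to node 6's weighted sum; the term contributed by node 5 is unaffected.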
Perhaps your worries can be eased with a "dummy" node in each layer that gets skipped. Again, let's focus on the links from node 1 (to nodes 5, 6, 7, 8). Instead of letting node 1 skip layers, let's add nodes 1.2, 1.3, and 1.4 in the low, middle, and high hidden layers. Replace the "skipper" links from node 1 with these links, listed from top (output) to bottom (input):
1.4 -> 8
1.3 -> 1.4
1.3 -> 7
1.2 -> 1.3
1.2 -> 6
1 -> 1.2
1 -> 5
In the chain 1 -> 1.2 -> 1.3 -> 1.4, all link (edge) weights are 1 with a bias of 0, and the dummy nodes simply pass their input through unchanged. You now have a topology with identical algebraic properties in which no link skips a layer.
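A quick sanity check of that equivalence, as a minimal plain-Python sketch (hypothetical values; it assumes the dummy nodes are pure pass-throughs, i.e. identity activation, weight 1, bias 0):

def identity(x):
    return x

x1 = 0.7            # output of input node 1 (hypothetical value)
w_1_to_6 = 0.3      # weight of the original skipping link 1 -> 6 (hypothetical)

# Original topology: node 1 drives node 6 directly.
direct = w_1_to_6 * x1

# Rewritten topology: 1 -> 1.2 -> 6, with the 1 -> 1.2 link at weight 1, bias 0.
x1_2 = identity(1.0 * x1 + 0.0)   # the dummy node just passes the value along
chained = w_1_to_6 * x1_2

assert direct == chained          # node 6 receives exactly the same contribution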
Note that any finite, acyclic network is a GFF. "Layer" is a convenience for our design; the topology restricts the "layer" of a node only by its longest path from an input node, and its longest path to an output node. It helps us to organize the nodes into layers for our own purposes, timing, debugging, etc., but a generalized flow simulator doesn't care. All it cares about is which nodes drive which other nodes, and whether a given node has all the inputs it needs to drive its output links on the next computational cycle.
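To make that concrete, here is a minimal sketch (plain Python, hypothetical node names and weights) of a generalized feed-forward evaluator: it fires any node once all of its sources are known, and never asks which layer a node sits in, skip links or not:

import math

# Each entry: node -> list of (source_node, weight). 'in1' and 'in2' are inputs.
incoming = {
    "h1":  [("in1", 0.5), ("in2", -0.4)],
    "h2":  [("in1", 0.8), ("h1", 1.2)],           # the in1 -> h2 link skips a layer
    "out": [("h1", 0.3), ("h2", -0.7), ("in2", 0.9)],
}
bias = {"h1": 0.1, "h2": -0.2, "out": 0.0}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

values = {"in1": 1.0, "in2": 0.5}    # hypothetical input values
pending = dict(incoming)
while pending:                        # the network is acyclic, so each pass fires at least one node
    ready = [n for n, links in pending.items()
             if all(src in values for src, _ in links)]
    for n in ready:
        z = bias[n] + sum(w * values[src] for src, w in pending.pop(n))
        values[n] = sigmoid(z)

print(values["out"])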
Does that help?
Cross-posted from the OR-tools Google group.
I am working with a multi-vehicle VRP with due dates over 5 periods (e.g. some nodes are due at time t=0, so I give them a penalty cost of 10000, and 1000 for all other nodes, and so on up to period 4). Initially I followed the exact steps laid out here with "AddDisjunction" to set priorities for certain nodes, and so expected that the solution would always pick up more, rather than fewer, nodes. However, in my example code you'll see that the solver is dropping multiple nodes with smaller demand and picking up nodes with larger demand instead. I came across the same issue when working on a single-vehicle problem, but there I was able to use AddSoftSameVehicleConstraint as a workaround.
My code here: I would direct you to the cell titled "Basic SVRP" onwards; the cells before that are for data generation. The most important thing in the output is that all the nodes starting with "a" or "b" have demands of only 1-2 units, while the "c" and "d" nodes have demands between 4-8 units. Ideally, therefore, I should see "a" and "b" nodes dropped only as a last resort.
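For reference, here is a stripped-down sketch of the AddDisjunction pattern I'm using (a hypothetical toy instance; the capacity dimension and my actual data are omitted):

from ortools.constraint_solver import pywrapcp, routing_enums_pb2

# Hypothetical toy instance: node 0 is the depot, nodes 1-4 are customers.
distance_matrix = [
    [0, 2, 4, 6, 8],
    [2, 0, 3, 5, 7],
    [4, 3, 0, 2, 4],
    [6, 5, 2, 0, 3],
    [8, 7, 4, 3, 0],
]
due_now = {1, 2}     # hypothetical nodes due at t=0, so they get the largest drop penalty
num_vehicles = 2

manager = pywrapcp.RoutingIndexManager(len(distance_matrix), num_vehicles, 0)
routing = pywrapcp.RoutingModel(manager)

def distance_callback(from_index, to_index):
    return distance_matrix[manager.IndexToNode(from_index)][manager.IndexToNode(to_index)]

transit = routing.RegisterTransitCallback(distance_callback)
routing.SetArcCostEvaluatorOfAllVehicles(transit)

# One disjunction per customer: dropping the node costs its penalty, so the
# solver should only drop high-penalty (due-now) nodes as a last resort.
for node in range(1, len(distance_matrix)):
    penalty = 10000 if node in due_now else 1000
    routing.AddDisjunction([manager.NodeToIndex(node)], penalty)

params = pywrapcp.DefaultRoutingSearchParameters()
params.first_solution_strategy = routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC
solution = routing.SolveWithParameters(params)

if solution:
    dropped = [manager.IndexToNode(i) for i in range(routing.Size())
               if not routing.IsStart(i) and not routing.IsEnd(i)
               and solution.Value(routing.NextVar(i)) == i]
    print("dropped nodes:", dropped)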
Any help here to rectify this would be greatly appreciated, happy to simplify/clarify where needed.
In NEAT you can add a special bias input node that is always active. Regarding the implementation of such a node, there is not much information in the original paper. Now I want to know how the bias node should behave, if there is a consensus at all.
So the question is:
Do connections from the bias node come about during evolution, and can they be split to insert new nodes just like regular connections, or does the bias node always have connections to all non-input nodes?
To answer my own question: on the NEAT users page, Kenneth O. Stanley explains why the bias in NEAT is used as an extra input neuron:
Why does NEAT use a bias node instead of having a bias parameter in each node?
Mainly because not all nodes need a bias. Thus, it would unnecessarily enlarge the search space to be searching for a proper bias for every node in the system. Instead, we let evolution decide which nodes need biases by connecting the bias node to those nodes. This issue is not a major concern; it could work either way. You can easily code a bias into every node and try that as well.
My best guess is therefore that the BIAS input is treated like any other input in NEAT, with the difference that it is always active.
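A minimal sketch of that interpretation (hypothetical genome encoding, not taken from the NEAT paper): the bias is node 0, its activation is pinned to 1.0, and connections leaving it are ordinary connection genes that mutations may add, disable, or split like any other.

import math

# Connection genes as (source, target, weight, enabled).
# Node 0 is the bias node, node 1 a regular input, node 2 the single output.
connections = [
    (0, 2, -0.6, True),   # bias -> output, added by a structural mutation at some point
    (1, 2,  1.5, True),   # input -> output
]

def activate(input_value):
    node_values = {0: 1.0, 1: input_value}        # the bias node is always active at 1.0
    total = sum(w * node_values[src]
                for src, dst, w, enabled in connections
                if enabled and dst == 2)
    return 1.0 / (1.0 + math.exp(-total))         # sigmoid output node

print(activate(0.3))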
I'm trying to implement the model proposed in a CVPR paper (Deep Interactive Object Selection), in which each input sample in the data set has 5 channels:
1. Red
2. Blue
3. Green
4. Euclidean distance map associated with positive clicks
5. Euclidean distance map associated with negative clicks
To do so, I should fine-tune the FCN-32s network using "object binary masks" as labels.
As you can see, in the first conv layer I have 2 extra channels, so I did net surgery to use the pretrained parameters for the first 3 channels and Xavier initialization for the 2 extra ones.
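For what it's worth, my net surgery looks roughly like this sketch (pycaffe; the file names are placeholders, it assumes the first conv layer is named "conv1_1", that the modified prototxt already declares 5 input channels, and it approximates Xavier with a uniform fill):

import numpy as np
import caffe

# Placeholder file names: the original 3-channel FCN-32s and the new 5-channel definition.
old_net = caffe.Net('fcn32s_rgb.prototxt', 'fcn32s.caffemodel', caffe.TEST)
new_net = caffe.Net('fcn32s_5ch.prototxt', caffe.TEST)

w_old = old_net.params['conv1_1'][0].data    # shape (num_output, 3, kh, kw)
w_new = new_net.params['conv1_1'][0].data    # shape (num_output, 5, kh, kw)

w_new[:, :3, :, :] = w_old                    # reuse the pretrained RGB filters
fan_in = 2 * w_new.shape[2] * w_new.shape[3]  # rough Xavier-style scale for the 2 new channels
limit = np.sqrt(3.0 / fan_in)
w_new[:, 3:, :, :] = np.random.uniform(-limit, limit, size=w_new[:, 3:, :, :].shape)

new_net.params['conv1_1'][1].data[...] = old_net.params['conv1_1'][1].data   # copy biases
new_net.save('fcn32s_5ch_init.caffemodel')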
For the rest of the FCN architecture, I have these questions:
Should I freeze all the layers before "fc6" (except the first conv layer)? If so, how will the extra channels of the first conv layer be learned? Are the gradients strong enough to reach the first conv layer during training?
What should the kernel size of "fc6" be? Should I keep it at 7? I saw in the Caffe "net_surgery" notebook that it depends on the output size of the last layer ("pool5").
The main problem is the number of outputs of the "score_fr" and "upscore" layers. Since I'm not doing class segmentation (which uses 21 outputs for 20 classes plus background), how should I change it? Would 2 make sense (one for the object and one for the non-object/background area)?
Should I change the "crop" layer "offset" to 32 to get center crops?
If I change any of these layers, what is the best initialization strategy for them? "bilinear" for "upscore" and "Xavier" for the rest?
Should I convert my binary label matrix values to zero-centered values ({-0.5, 0.5}), or is it OK to leave them as {0, 1}?
Any useful ideas would be appreciated.
PS:
I'm currently using Euclidean loss with "1" as the number of outputs for the "score_fr" and "upscore" layers. If I use 2 outputs instead, I guess the loss should be softmax.
I can answer some of your questions.
The gradients will reach the first layer, so it should be possible to learn its weights even if you freeze the other layers.
Change the num_output to 2 and finetune. You should get a good output.
I think you'll need to experiment with each of the options and see how the accuracy is.
You can use the values {0, 1}.
I am trying to understand hierarchical quorums in Zookeeper. The documentation here gives an example, but I am still not quite sure I understand it.
My question is: if I have a two-node Zookeeper cluster (I know it is not recommended, but let's consider it for the sake of this example), with
server.1 and
server.2,
can I have hierarchical quorums as follows:
group.1=1:2
weight.1=2
weight.2=2
With the above configuration:
Even if one node goes down, do I still have enough votes to maintain a quorum? Is this a correct statement?
What is the Zookeeper quorum value here: 2 (for two nodes) or 3 (for 4 total votes)?
In a second example, say I have:
group.1=1:2
weight.1=2
weight.2=1
In this case, if server.2 goes down, do I still have sufficient votes (2) to maintain a quorum?
As far as I understand from the documentation, when we give weights to the nodes, the majority is counted in terms of weight rather than the number of nodes. For example, if there are 10 nodes and 3 of them hold 70 percent of the total weight, it is enough for those three nodes to be active in the network. Hence:
You don't have a quorum, since both nodes have an equal weight of 2: if one node goes down, only 2 of the 4 total votes are active, which is exactly half and not a strict majority.
Since the total weight is 4, you need more than half of it, i.e. at least 3 votes; with only two nodes, that means both need to be active to meet the quorum.
In the second example, server.1 alone holds 2 of the 3 total votes, which is a strict majority, so the quorum is still reached with just that one node if server.2 goes down.
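A small sketch of that check (plain Python, reflecting my understanding that a group contributes to the quorum only when its live members hold a strict majority of the group's total weight):

def group_has_quorum(weights, live):
    # weights: server id -> weight; live: set of server ids currently up.
    total = sum(weights.values())
    live_weight = sum(w for sid, w in weights.items() if sid in live)
    return 2 * live_weight > total   # strict majority of the group's weight

# First example: weight.1=2, weight.2=2 -> one live server holds 2 of 4 votes, not a majority.
print(group_has_quorum({1: 2, 2: 2}, live={1}))   # False
# Second example: weight.1=2, weight.2=1 -> server.1 alone holds 2 of 3 votes, a strict majority.
print(group_has_quorum({1: 2, 2: 1}, live={1}))   # True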
I have a network that I simulate in NetLogo. In my network I have n nodes, each with random data from [0, 1, 2, ..., 19].
At the beginning, one random node becomes a sink and 3 random nodes start to send their data to the sink. I declare a variable named gamma. After the nodes send their data to the sink, the sink decides whether or not to store that data in its memory space, based on gamma. After 0.5 s this process repeats; at each time step some nodes are sinks and want some data. This is the way I distribute data in my network.
After all that, I have to vary gamma from 0 to 1 to determine the best value for it, and run my code each time to plot the count of something. I mean: first run my code with gamma = 1, then run it again with gamma = 0.98, and so on:
if Entropy <= gamma [
  ;; do something
]
If I press the setup button each time I change gamma, my network setup changes and I cannot compare the same network with another gamma.
How can I compare my network across multiple values of gamma?
I mean, is it possible to save my whole process and run it exactly the same way again?
You can use random-seed to always create the same network, and then use a new seed (created and set with random-seed new-seed) to generate the random numbers, ask order, etc. for your processing. The BehaviorSpace tool will let you do many runs with different values of gamma.
Using this approach will guarantee you the same network. However, just because a particular value of gamma is best for one network, does not make it the best for other networks. So you could create multiple networks with different seeds and have NetLogo select each network (as #David suggests) or you could simply allow NetLogo to create the different networks and run many simulations so that you have a more robust answer that works over an 'average' network.
It is possible if you design some tests first. Since you assign random data each time you press setup, the previous graph is not the same as the new one, so you'll need to load the same data every time you want to test.
An idea:
Make text files with the node data and the value of gamma. For 4 nodes you'd have something like:
dat1.txt
1 3 2 9
1
dat2.txt
1 3 2 9
0.98
dat3.txt
1 3 2 9
0.96
And so on...
You can generate these files with a procedure and a specific seed (see random-numbers); this means that if you want to generate 30 tests (30 sets of 4 nodes in the above example), you'll need 30 different seeds.