How should nodes be connected in a neural network?

I was recently introduced to the amazing world of neural networks, and I've noticed their flexibility and capability. However, I'm not going to lie, my knowledge of their technicalities is sparse. The network of interest is the multilayer perceptron. It consists of some input nodes, some hidden nodes and some output nodes. What I would like to know is: do all input nodes need to be connected to all hidden nodes, and all hidden nodes to all output nodes? Or is there some determining factor that decides which input nodes should be connected to which hidden nodes, which are in turn connected to which output nodes?
Your help is much appreciated :3

do all input nodes need to be connected to all hidden nodes and all
hidden nodes need to be connected to all output nodes?
Since a Multi-Layer Perceptron (MLP) is a fully connected network, each node in one layer connects with a certain weight W_{i,j} to every node in the following layer. See the image below.
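To make the "fully connected" point concrete, here is a minimal sketch of an MLP forward pass in NumPy. The sizes and values are made up for illustration; the key observation is that each weight matrix has one entry for every (node in layer k, node in layer k+1) pair, so all-to-all connectivity is built into the matrix multiplication itself.

```python
import numpy as np

rng = np.random.default_rng(0)

n_input, n_hidden, n_output = 3, 4, 2
# W1[i, j] is the weight from input node i to hidden node j:
W1 = rng.normal(size=(n_input, n_hidden))   # 3*4 = 12 connections
W2 = rng.normal(size=(n_hidden, n_output))  # 4*2 = 8 connections

x = np.array([0.5, -1.0, 2.0])
hidden = np.tanh(x @ W1)   # every input feeds every hidden node
output = hidden @ W2       # every hidden node feeds every output node

print(output.shape)  # (2,)
```

Because the connectivity lives in the shape of the weight matrices, "disconnecting" two nodes just means forcing the corresponding entry to zero.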
Or is there some determining factor to decide which input nodes should be connected to which hidden nodes which are in turn connected to which output nodes?
You can apply pruning methods to remove some connections and observe whether this improves the accuracy and performance of the neural network. Generally, pruning is done after you have trained your model, so that you can see the effect on performance. See these links:
A new pruning algorithm for neural network
An iterative pruning algorithm for feedforward neural networks
It could also be done by exhaustive search, in other words brute force (removing and reconnecting nodes between layers).
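A common, simple instance of the pruning idea mentioned above is magnitude pruning: after training, zero out the connections whose weights are smallest in absolute value, then re-check performance. The threshold below is arbitrary and the weights are random stand-ins for a trained layer:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 3))        # pretend these are trained weights

threshold = 0.5                    # illustrative cut-off, not a recommendation
mask = np.abs(W) >= threshold      # keep only the "strong" connections
W_pruned = W * mask

print("connections kept:", int(mask.sum()), "of", W.size)
```

In practice you would re-evaluate (and often fine-tune) the network after pruning, exactly as the linked papers describe in more principled ways.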

Related

How does dropout work (with multiple GPUs)?

Let's say I'm using multiple GPUs to train a neural network that uses dropout. I know that dropout randomly turns off certain nodes in the network for each training sample and then only updates the weights in the resulting "thinned network", which seems like a very serial process. How are the weight updates combined during parallelism?
For instance, input #1 removes some $x$ nodes and input #2 removes some other $y$ nodes. Let's say $z$ nodes are common to both instances of the sub-network. Does dropout require backprop to finish for input #1 before beginning the feed-forward pass for input #2? Or, if it happens in parallel, how are the $z$ nodes updated?
I've already seen this post, but the answer in the post doesn't seem to answer the question.
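The per-sample masking described in the question can be sketched as follows. This is a simplified illustration, not a faithful GPU implementation: each sample gets its own random mask, and in mini-batch (or data-parallel) training the gradients from all the masked sub-networks are simply summed or averaged into a single update of the shared weights, so no serialization between samples is needed.

```python
import numpy as np

rng = np.random.default_rng(42)

p_keep = 0.5
hidden = rng.normal(size=(2, 6))   # 2 samples, 6 hidden units

# A different random mask per sample -> a different "thinned network" each:
masks = rng.random(size=hidden.shape) < p_keep
thinned = hidden * masks / p_keep  # "inverted dropout" scaling

# The two samples generally drop different units:
print(masks[0])
print(masks[1])
```

Units that are dropped for a given sample contribute zero gradient for that sample, so when the per-sample gradients are averaged, each weight is effectively updated only by the samples in which it was active.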

Siamese networks: Why does the network need to be duplicated?

The DeepFace paper from Facebook uses a Siamese network to learn a metric. They say that the DNN that extracts the 4096 dimensional face embedding has to be duplicated in a Siamese network, but both duplicates share weights. But if they share weights, every update to one of them will also change the other. So why do we need to duplicate them?
Why can't we just apply one DNN to two faces and then do backpropagation using the metric loss? Do they maybe mean this and just talk about duplicated networks for "better" understanding?
Quote from the paper:
We have also tested an end-to-end metric learning approach, known as Siamese network [8]: once learned, the face recognition network (without the top layer) is replicated twice (one for each input image) and the features are used to directly predict whether the two input images belong to the same person. This is accomplished by: a) taking the absolute difference between the features, followed by b) a top fully connected layer that maps into a single logistic unit (same/not same). The network has roughly the same number of parameters as the original one, since much of it is shared between the two replicas, but requires twice the computation. Notice that in order to prevent overfitting on the face verification task, we enable training for only the two topmost layers.
Paper: https://research.fb.com/wp-content/uploads/2016/11/deepface-closing-the-gap-to-human-level-performance-in-face-verification.pdf
The short answer is that yes, I think that looking at the architecture of the network will help you understand what is going on. You have two networks that are "joined at the hip" i.e. sharing weights. That's what makes it a "Siamese network". The trick is that you want the two images you feed into the network to pass through the same embedding function. So to ensure that this happens both branches of the network need to share weights.
Then we combine the two embeddings into a metric loss (called "contrastive loss" in the image below), and we can back-propagate as normal; we just have two input branches available so that we can feed in two images at a time.
I think a picture is worth a thousand words. So check out how a Siamese network is constructed (at least conceptually) below.
The gradients depend on the activation values, so the gradients will be different for each branch, and the final update to the shared weights combines both branches' contributions (for example by summing or averaging them).
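Here is a tiny numeric sketch of that weight sharing, with made-up values: one weight matrix W embeds both inputs, and the gradient of a toy metric loss with respect to W is the sum of the contributions from the two branches, so the single shared copy receives both updates at once.

```python
import numpy as np

W = np.array([[1.0, 0.0],
              [0.0, 1.0]])         # ONE shared embedding matrix
x1 = np.array([1.0, 2.0])
x2 = np.array([3.0, 1.0])

e1, e2 = W @ x1, W @ x2            # same embedding function on both inputs
diff = e1 - e2
loss = 0.5 * np.sum(diff ** 2)     # toy metric loss on the two embeddings

# d(loss)/dW = (branch-1 contribution) + (branch-2 contribution)
grad_W = np.outer(diff, x1) + np.outer(-diff, x2)

print(loss)    # 2.5
print(grad_W)  # [[ 4. -2.] [-2.  1.]]
```

"Duplicating" the network in a framework is just a bookkeeping device for this: both branches read and write the same underlying parameters.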

Why do we need layers in artificial neural network?

I am quite new to artificial neural networks, and what I cannot understand is why we need the concept of a layer.
Isn't it enough to connect each neuron to some other neurons, creating a kind of web rather than a layer-based structure?
For example, to solve XOR we usually need at least 3 layers: an input layer with 2 neurons, 1+ hidden layer(s) with some neurons, and an output layer with 1 neuron.
Couldn't we create a network with 2 input neurons (we need them) and 1 output connected by a web of other neurons?
Example of what I mean
The term 'layer' is different than you might think. There is always a 'web' of neurons. A layer just denotes a group of neurons.
If I want to connect layer X with layer Y, this means I am connecting all neurons from layer X to all neurons from layer Y. But not always! You could also connect each neuron from layer X to just one neuron in layer Y. There are lots of different connection techniques.
But layers aren't required! It just makes the coding (and explanation) process a whole lot easier. Instead of connecting all neurons one by one, you can connect them in layers. It's far easier to say "layer A and B are connected" than "neuron 1,2,3,4,5 are all connected with neurons 6,7,8,9".
If you are interested in 'layerless' networks, please take a look at Liquid State Machines:
(the neurons might look to be layered, but they aren't!)
PS: I develop a JavaScript neural network library, and I have created an online demo in which a neural network evolves into an XOR gate - without layers, just starting with input and output. View it here. Your example picture is exactly the kind of network you could develop with this library.

Neural network: extrapolating between two trainings

Is it possible to train a network with 2 inputs, where one is the data and the other is a constant that we define?
We train the network with one set of data and set the second input to '10', for example.
Then, once it has converged, we train with another set of data and set the second input to '20' this time.
Now, what if I input test data with the second parameter set to '15'? Will the network automatically extrapolate between the two learned states?
If not, how do I achieve what I described above, i.e. extrapolate between two training states?
thanks a lot
Jeff
It is possible to add another input as a parameter into the neural network, but I am unsure what benefit you are trying to achieve by adding this input.
You would need to train the neural network with this input so that it would estimate the value between the two trained networks. This would involve training each individual network first and then training a second network that would extrapolate between states.
If you are trying to modularise each trained Neural Network for specific roles or classifications, and these classifications represent some kind of continuous relationship (for example, weather predictions that specialise for no rain, light rain, moderate rain and heavy rain), then perhaps this input could be used in some way to encourage the output of a particular network.
If you would like to adjust the weights of each network so that some Neural Networks have more preference than others, perhaps an Ensemble approach can assist with different weights for each network (static and dynamic options are available). If you just want to map the differences between the two networks with different weights, perhaps a linear or non-linear function can be applied between the two networks and mapped to detail the changes between the two trained networks.
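One concrete way to read the question's setup: rather than training two separate networks, train one model on both datasets together, feeding the constant (10 or 20) as a second input, and let the model interpolate for unseen conditions like 15. Below, a tiny linear least-squares fit stands in for the network, and all values are invented for illustration:

```python
import numpy as np

# Dataset A (condition = 10): y = 2*x.  Dataset B (condition = 20): y = 4*x.
x = np.array([1.0, 2.0, 3.0, 1.0, 2.0, 3.0])
c = np.array([10.0, 10.0, 10.0, 20.0, 20.0, 20.0])
y = np.array([2.0, 4.0, 6.0, 4.0, 8.0, 12.0])

# Features: x and x*c (the interaction term lets the condition scale the slope)
X = np.column_stack([x, x * c])
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Query at the UNSEEN condition 15: the fitted model gives slope 3 for x
pred = np.array([2.0, 2.0 * 15.0]) @ w   # x = 2, c = 15
print(pred)  # ~6.0
```

Whether a real neural network interpolates this gracefully depends on its architecture and on how densely the condition values are sampled; nothing guarantees it from two training runs done sequentially, which is why training on both conditions jointly matters here.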

Artificial Neural Network that creates its own connections

I've been reading about feedforward Artificial Neural Networks (ANNs), and normally they need training to modify their weights in order to achieve the desired output. Once tuned, they will also always produce the same output when receiving the same input (biological networks don't necessarily).
Then I started reading about evolving neural networks. However, the evolution usually involves recombining two parents genomes into a new genome, there is no "learning" but really recombining and verifying through a fitness test.
I was thinking, the human brain manages its own connections. It creates connections, strengthens some, and weakens others.
Is there a neural network topology that allows for this? One where the neural network, after a bad reaction, adjusts its weights accordingly and possibly creates random new connections (I'm not sure how the brain creates new connections, but even if the mechanism is different, a random mutation chance of creating a new connection could serve the same purpose), while a good reaction would strengthen those connections.
I believe this type of topology is known as a Turing Type B Neural Network, but I haven't seen any coded examples or papers on it.
This paper, An Adaptive Spiking Neural Network with Hebbian Learning, specifically addresses the creation of new neurons and synapses. From the introduction:
Traditional rate-based neural networks and the newer spiking neural networks have been shown to be very effective for some tasks, but they have problems with long term learning and "catastrophic forgetting." Once a network is trained to perform some task, it is difficult to adapt it to new applications. To do this properly, one can mimic processes that occur in the human brain: neurogenesis and synaptogenesis, or the birth and death of both neurons and synapses. To be effective, however, this must be accomplished while maintaining the current memories.
If you do some searching on Google with the keywords 'neurogenesis artificial neural networks', or similar, you will find more articles. There is also this similar question on cogsci.stackexchange.com.
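The Hebbian learning mentioned in the paper's title can be sketched in a few lines. This is a minimal illustration of the rule itself ("neurons that fire together wire together"), not of the paper's full spiking model: connections strengthen in proportion to the correlation of pre- and post-synaptic activity, with no error signal at all.

```python
import numpy as np

eta = 0.1                      # learning rate, arbitrary for illustration
w = np.zeros(3)                # weights onto one post-synaptic neuron

x = np.array([1.0, 0.0, 1.0])  # pre-synaptic activity (unit 1 is silent)
for _ in range(5):
    y = w @ x + 1.0            # post-synaptic activity (bias keeps it firing)
    w += eta * y * x           # Hebb's rule: dw = eta * pre * post

print(w)                       # inputs 0 and 2 strengthened; input 1 did not
```

Note the runaway-growth tendency visible even here (w keeps increasing); practical Hebbian schemes add normalization or decay, which is part of what adaptive approaches like the paper's must handle.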
NEAT networks, as well as cascade-correlation networks, add their own connections/neurons to solve problems, building structures that create specific responses to stimuli.