Counting FLOPs in VGG-16 vs ResNet - neural-network

I would like to know how a 16-layer VGG net has 15.3 billion FLOPs while a 152-layer ResNet has only 11.3 billion FLOPs. Is it because of the initial layers of VGGNet, which run 64 filters over the 224x224 image?
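A rough way to see where the cost goes is to count multiply-accumulates (MACs) per convolutional layer; the sketch below is my own illustration, not either paper's exact accounting, but the relative comparison holds: VGG runs wide 3x3 convolutions at full 224x224 resolution, while ResNet downsamples quickly and uses cheap 1x1 bottlenecks.

```python
# MACs for a stride-1, "same"-padded KxK convolution:
# output_height * output_width * out_channels * (K * K * in_channels)
def conv_macs(h_out, w_out, c_in, c_out, k):
    return h_out * w_out * c_out * (k * k * c_in)

# VGG-16's early layers work at full 224x224 resolution:
vgg_conv1_1 = conv_macs(224, 224, 3, 64, 3)    # ~0.09 G MACs
vgg_conv1_2 = conv_macs(224, 224, 64, 64, 3)   # ~1.85 G MACs

# A ResNet 3x3 conv at 56x56 with 64 channels is an order of
# magnitude cheaper, so even 152 layers stay affordable:
res_3x3 = conv_macs(56, 56, 64, 64, 3)         # ~0.12 G MACs
```

So a single VGG conv at 224x224 costs more than a dozen ResNet bottleneck convolutions at 56x56, which is essentially the answer to the question.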

Related

Why the depth of kernel of first convolutional layer is 48 in AlexNet?

In AlexNet, the filter size is 5*5*48 in the first layer and 3*3*128 in the second layer.
Why are 48 and 128 used as the depth? Can we change both to different numbers?
Thanks
The depiction of the network there can be confusing. The layer with depth 48, specifically with 5 * 5 * 48 kernels, is actually the second convolutional layer. From the article:
..The second convolutional layer takes as input the (response-normalized
and pooled) output of the first convolutional layer and filters it with 256 kernels of size 5 × 5 × 48
I assume, though, that your confusion stems from the first layer being described as 11 * 11 * 96 while the depiction in the image does not show this. If you are asking why the authors chose such sizes: opinions in the scientific community still vary, since deciding the parameters of a neural network is somewhat done by intuition (at least at this time).
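The 48 itself comes from AlexNet's two-GPU split: conv1 has 96 kernels of size 11x11x3, but they are divided across two GPUs with 48 feature maps each, so every conv2 kernel sees only a 48-channel input. A sketch of the weight-tensor shapes (shapes only, zero-filled, not trained values):

```python
import numpy as np

# conv1: 96 kernels over the 3-channel RGB input
conv1_w = np.zeros((96, 3, 11, 11))
# conv1's 96 maps are split across two GPUs (48 per GPU), so each
# conv2 kernel has depth 48 even though conv1 produced 96 maps:
conv2_w = np.zeros((256, 48, 5, 5))

assert conv2_w.shape[1] == conv1_w.shape[0] // 2
```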

What's a common solution for an oscillating Cohen's kappa score when it comes to deep neural networks?

I am fairly new to deep convolutional networks. I compute a kappa score from the confusion matrix. Right now I check my kappa score every 500 iterations, and it looks like a rising sawtooth signal.
I am using an SGD solver and a base learning rate of 0.00001.
I have roughly 15000 images (512x512) per class, evenly distributed, and there are 5 classes. I am using a batch size of 4.
For validation the data set got a different distribution:
73%
6%
15%
2%
2%
I have a net with 24 convolutional layers, which uses BatchNorm layers and alternating 3x3 and 1x1 layers between larger convolutional layers. So the structure is fairly deep.
But when it comes to the Cohen's kappa score, I am not able to get above 0.1; rather, the score rises, then falls, then rises again.
Is there a common solution for this kind of behaviour? Since I am fairly new, I do not really know what to change: Is my learning rate too high or too low? Is my architecture too small or too large? Am I using an inappropriate solver?
For those interested: the data is from the Kaggle competition on diabetic retinopathy detection. I am not taking part in the competition; it's just an interesting project.
https://www.kaggle.com/c/diabetic-retinopathy-detection
This graph shows how my kappa score operates
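For reference, the unweighted Cohen's kappa being monitored here compares observed agreement against chance agreement from the label marginals; a minimal pure-Python sketch follows (note the Kaggle competition itself is scored with the quadratic-weighted variant, which penalizes distant grade disagreements more heavily):

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    # observed agreement: fraction of exact matches
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # chance agreement from the two marginal label distributions
    t_counts, p_counts = Counter(y_true), Counter(y_pred)
    labels = set(y_true) | set(y_pred)
    p_e = sum(t_counts[c] * p_counts[c] for c in labels) / n**2
    return (p_o - p_e) / (1 - p_e)

print(cohen_kappa([0, 0, 1, 1], [0, 0, 1, 0]))  # 0.5
```

With the heavily skewed validation distribution above (73% in one class), a model that mostly predicts the majority class will score near zero here, which is one common reason kappa stalls around 0.1.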

CAFFE: Run forward pass with fewer nodes in FC layer

I am trying to perform an experiment in Caffe with a very simple single hidden layer NN. I am using the MNIST dataset trained with a single hidden layer (of 128 nodes). I have all the weights from the fully trained network already.
However, during the feed-forward stage I would like to use only a smaller subset of these nodes, i.e. 32 or 64. So, for example, I would like to calculate the activations of 64 nodes during the feed-forward pass and save them; then, during the next run, calculate the activations of the other 64 nodes and combine them with the activations of the first 64, so I get the activations of all 128 nodes. Thus I would calculate the activations of all 128 nodes, but in two 'passes'.
Is there a way to achieve this in Caffe? Please excuse me, as I am very new to Caffe (I just started using it this week!).
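I am not aware of a built-in Caffe switch for a partial forward pass, but the math works out, because each hidden unit's activation depends only on its own row of the FC weight matrix. A NumPy sketch of the idea with hypothetical weights (in Caffe you would slice the layer's weight blob the same way, e.g. by copying rows into a smaller FC layer):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((128, 784))   # hypothetical trained FC weights
b = rng.standard_normal(128)          # hypothetical biases
x = rng.standard_normal(784)          # one flattened 28x28 MNIST image

def relu(z):
    return np.maximum(z, 0.0)

full = relu(W @ x + b)                        # all 128 nodes at once

# First pass: nodes 0..63; second pass: nodes 64..127.
first = relu(W[:64] @ x + b[:64])
second = relu(W[64:] @ x + b[64:])
combined = np.concatenate([first, second])    # identical to `full`
```

This only holds up to (but not including) the next layer, since the output layer needs all 128 activations before its own weighted sum.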

Bad regression output of neural network - an unwanted upper bound?

I am having a problem in a project which uses pybrain (a Python library for neural networks)
to build an ANN and do regression as prediction.
I am using a 3-layer ANN with 14 inputs, 10 hidden neurons in the hidden layer, and 2 outputs. A typical training or test example looks like this:
Inputs (separated by spaces):
1534334.489 1554790.856 1566060.675 20 20 20 50 45000 -11.399025 13 1.05E-03 1.775475116 20 0
Outputs (separated by spaces):
1571172.296 20
I am using pybrain's BackpropTrainer, so it trains with backpropagation, and I trained until convergence.
The weird thing about the result is that the prediction of the first output (i.e. the first output of the trained ANN on the test inputs) tracks the real value well in the lower parts of the curve, but seems to hit an unwanted upper bound when the real value rises.
I changed the number of hidden neurons, but it still behaves like this. Even when I test the trained ANN on the original training samples, it still shows an upper bound like this.
Does anyone have an intuition or advice on what's wrong here? Thanks!
Try normalizing the values (both inputs and outputs) into (-1, +1).
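The reason this helps: a saturating output activation (e.g. sigmoid or tanh) simply cannot emit values like 1571172.296, so predictions flatten out at the activation's ceiling. Scaling targets into the activation's range, then inverting the scaling on predictions, removes that artificial bound. A minimal sketch (`minmax_scale` is a hypothetical helper, not part of pybrain):

```python
def minmax_scale(values, lo=-1.0, hi=1.0):
    """Linearly map a list of numbers into [lo, hi]."""
    vmin, vmax = min(values), max(values)
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]

targets = [1534334.489, 1566060.675, 1571172.296]
scaled = minmax_scale(targets)   # now inside (-1, +1)
# Keep vmin/vmax so predictions can be mapped back:
#   v = vmin + (s - lo) / (hi - lo) * (vmax - vmin)
```

Apply the same scaling (with the training set's vmin/vmax) to test data, otherwise test inputs fall outside the range the network was trained on.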

Number of feature maps in a convolutional neural network

I've read this article http://www.codeproject.com/Articles/143059/Neural-Network-for-Recognition-of-Handwritten-Di and when I get to this part:
Layer #0: is the gray scale image of the handwritten character in the MNIST database which is padded to 29x29 pixel. There are 29x29= 841 neurons in the input layer.
Layer #1: is a convolutional layer with six (6) feature maps. There are 13x13x6 = 1014 neurons, (5x5+1)x6 = 156 weights, and 1014x26 = 26364 connections from layer #1 to the previous layer.
How can we get the six (6) feature maps just from convolution on the image?
I think we get only one feature map. Or am I wrong?
I'm doing my research on convolutional neural networks.
Six different kernels (or filters) are convolved with the same image to generate six feature maps.
Layer #0: The input image has 29x29 pixels, thus 29*29 = 841 neurons (input neurons).
Layer #1: The convolutional layer uses 6 different kernels (or filters) of size 5x5 pixels with stride 2 (the amount the kernel shifts while sliding over the input). Convolving them with the 29x29 input image generates 6 different 13x13 feature maps, since (29 - 5)/2 + 1 = 13, thus 13x13x6 = 1014 neurons.
Each kernel has 5x5 weights plus a bias (for weight correction), thus (5x5)+1 = 26 parameters, and as we have 6 kernels, this gives 6*[(5x5)+1] = 156 weights.
Each of the 1014 neurons in layer #1 connects to a 5x5 input patch plus the bias, i.e. (5x5)+1 = 26 connections per neuron, giving 1014*26 = 26364 connections from layer #0 to layer #1.
You should go through the research paper by Y. LeCun, L. Bottou, Y. Bengio: "Gradient-Based Learning Applied to Document Recognition", Section II, to understand convolutional neural networks (I recommend reading the whole paper).
Another place where you can find a detailed explanation and a Python implementation of CNNs is here. If you have time, I recommend going through this site for more details about deep learning.
Thank you.
You get six feature maps by convolving six different kernels with the same image.
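The answers above can be made concrete with a small NumPy sketch: six random 5x5 kernels slid over one 29x29 image with stride 2 yield six 13x13 maps (random values, purely to demonstrate the shapes):

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((29, 29))     # e.g. a padded MNIST digit
kernels = rng.standard_normal((6, 5, 5))  # six different 5x5 filters

def convolve(img, k, stride=2):
    """Valid cross-correlation of one kernel with the image."""
    kh, kw = k.shape
    out_h = (img.shape[0] - kh) // stride + 1
    out_w = (img.shape[1] - kw) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = img[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * k)
    return out

# one feature map per kernel -> six maps total
feature_maps = np.stack([convolve(image, k) for k in kernels])
assert feature_maps.shape == (6, 13, 13)
```

Each kernel produces its own 13x13 map, which is exactly why the count of feature maps equals the count of kernels, not the count of input images.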