I am trying to train a binary classification model in Caffe that tells whether an input image is a dog or background. I have 8223 positive samples and 33472 negative samples. My validation set contains 1200 samples, 600 of each class. My positives are snippets taken from the MS-COCO dataset. All images are resized so the bigger dimension does not exceed 92 and the smaller dimension is not smaller than 44. After creating the LMDB files using create_imagenet.sh (resize=false), I started training with the solver and train .prototxt files below. The problem is that I am getting a constant accuracy (0.513333 or 0.486667), which indicates that the network is not learning anything.
I hope that someone is able to help.
Thank you in advance.
solver file:
iter_size: 32
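# note: with the data layer's batch_size: 1 below, this accumulates gradients over 32 images per weight update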
test_iter: 600
test_interval: 20
base_lr: 0.001
display: 2
max_iter: 20000
lr_policy: "step"
gamma: 0.99
stepsize: 700
momentum: 0.9
weight_decay: 0.0001
snapshot: 40
snapshot_prefix: "/media/DATA/classifiers_data/dog_object/models/"
solver_mode: GPU
net: "/media/DATA/classifiers_data/dog_object/net.prototxt"
solver_type: ADAM
train.prototxt:
layer {
name: "train-data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
data_param {
source: "/media/DATA/classifiers_data/dog_object/ilsvrc12_train_lmdb"
batch_size: 1
backend: LMDB
}
}
layer {
name: "val-data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
data_param {
source: "/media/DATA/classifiers_data/dog_object/ilsvrc12_val_lmdb"
batch_size: 1
backend: LMDB
}
}
layer {
name: "scale"
type: "Power"
bottom: "data"
top: "scale"
power_param {
scale: 0.00390625
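# 0.00390625 = 1/256: scales 8-bit pixel values into [0, 1)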
}
}
layer {
bottom: "scale"
top: "conv1_1"
name: "conv1_1"
type: "Convolution"
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
}
param {
lr_mult: 1
}
param {
lr_mult: 1
}
}
layer {
bottom: "conv1_1"
top: "conv1_1"
name: "relu1_1"
type: "ReLU"
}
layer {
bottom: "conv1_1"
top: "conv1_2"
name: "conv1_2"
type: "Convolution"
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
}
param {
lr_mult: 1
}
param {
lr_mult: 1
}
}
layer {
bottom: "conv1_2"
top: "conv1_2"
name: "relu1_2"
type: "ReLU"
}
layer {
name: "spatial_pyramid_pooling"
type: "SPP"
bottom: "conv1_2"
top: "spatial_pyramid_pooling"
spp_param {
pool: MAX
pyramid_height : 4
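# pyramid levels 0..3 give 1 + 4 + 16 + 64 = 85 bins per channel, so the output length is fixed regardless of input size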
}
}
layer {
bottom: "spatial_pyramid_pooling"
top: "fc6"
name: "fc6"
type: "InnerProduct"
inner_product_param {
num_output: 64
}
param {
lr_mult: 1
}
param {
lr_mult: 1
}
}
layer {
bottom: "fc6"
top: "fc6"
name: "relu6"
type: "ReLU"
}
layer {
bottom: "fc6"
top: "fc6"
name: "drop6"
type: "Dropout"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
bottom: "fc6"
top: "fc7"
name: "fc7"
type: "InnerProduct"
inner_product_param {
num_output: 2
}
param {
lr_mult: 1
}
param {
lr_mult: 1
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc7"
bottom: "label"
top: "loss"
}
layer {
name: "accuracy/top1"
type: "Accuracy"
bottom: "fc7"
bottom: "label"
top: "accuracy"
include: { phase: TEST }
}
part of training log:
I1125 15:52:36.604038 2326 solver.cpp:362] Iteration 40, Testing net (#0)
I1125 15:52:36.604071 2326 net.cpp:723] Ignoring source layer train-data
I1125 15:52:47.127979 2326 solver.cpp:429] Test net output #0: accuracy = 0.486667
I1125 15:52:47.128067 2326 solver.cpp:429] Test net output #1: loss = 0.694894 (* 1 = 0.694894 loss)
I1125 15:52:48.937928 2326 solver.cpp:242] Iteration 40 (0.141947 iter/s, 14.0897s/2 iter), loss = 0.67717
I1125 15:52:48.938014 2326 solver.cpp:261] Train net output #0: loss = 0.655692 (* 1 = 0.655692 loss)
I1125 15:52:48.938040 2326 sgd_solver.cpp:106] Iteration 40, lr = 0.001
I1125 15:52:52.858757 2326 solver.cpp:242] Iteration 42 (0.510097 iter/s, 3.92083s/2 iter), loss = 0.673962
I1125 15:52:52.858841 2326 solver.cpp:261] Train net output #0: loss = 0.653978 (* 1 = 0.653978 loss)
I1125 15:52:52.858875 2326 sgd_solver.cpp:106] Iteration 42, lr = 0.001
I1125 15:52:56.581573 2326 solver.cpp:242] Iteration 44 (0.53723 iter/s, 3.7228s/2 iter), loss = 0.673144
I1125 15:52:56.581656 2326 solver.cpp:261] Train net output #0: loss = 0.652269 (* 1 = 0.652269 loss)
I1125 15:52:56.581689 2326 sgd_solver.cpp:106] Iteration 44, lr = 0.001
I1125 15:53:00.192082 2326 solver.cpp:242] Iteration 46 (0.553941 iter/s, 3.61049s/2 iter), loss = 0.669606
I1125 15:53:00.192167 2326 solver.cpp:261] Train net output #0: loss = 0.650559 (* 1 = 0.650559 loss)
I1125 15:53:00.192200 2326 sgd_solver.cpp:106] Iteration 46, lr = 0.001
I1125 15:53:04.195417 2326 solver.cpp:242] Iteration 48 (0.499585 iter/s, 4.00332s/2 iter), loss = 0.674327
I1125 15:53:04.195691 2326 solver.cpp:261] Train net output #0: loss = 0.648808 (* 1 = 0.648808 loss)
I1125 15:53:04.195736 2326 sgd_solver.cpp:106] Iteration 48, lr = 0.001
I1125 15:53:07.856842 2326 solver.cpp:242] Iteration 50 (0.546265 iter/s, 3.66123s/2 iter), loss = 0.661835
I1125 15:53:07.856925 2326 solver.cpp:261] Train net output #0: loss = 0.647097 (* 1 = 0.647097 loss)
I1125 15:53:07.856957 2326 sgd_solver.cpp:106] Iteration 50, lr = 0.001
I1125 15:53:11.681635 2326 solver.cpp:242] Iteration 52 (0.522906 iter/s, 3.82478s/2 iter), loss = 0.66071
I1125 15:53:11.681720 2326 solver.cpp:261] Train net output #0: loss = 0.743264 (* 1 = 0.743264 loss)
I1125 15:53:11.681754 2326 sgd_solver.cpp:106] Iteration 52, lr = 0.001
I1125 15:53:15.544859 2326 solver.cpp:242] Iteration 54 (0.517707 iter/s, 3.86319s/2 iter), loss = 0.656414
I1125 15:53:15.544950 2326 solver.cpp:261] Train net output #0: loss = 0.643741 (* 1 = 0.643741 loss)
I1125 15:53:15.544986 2326 sgd_solver.cpp:106] Iteration 54, lr = 0.001
I1125 15:53:19.354320 2326 solver.cpp:242] Iteration 56 (0.525012 iter/s, 3.80943s/2 iter), loss = 0.645277
I1125 15:53:19.354404 2326 solver.cpp:261] Train net output #0: loss = 0.747059 (* 1 = 0.747059 loss)
I1125 15:53:19.354431 2326 sgd_solver.cpp:106] Iteration 56, lr = 0.001
I1125 15:53:23.195466 2326 solver.cpp:242] Iteration 58 (0.520681 iter/s, 3.84112s/2 iter), loss = 0.677604
I1125 15:53:23.195549 2326 solver.cpp:261] Train net output #0: loss = 0.640145 (* 1 = 0.640145 loss)
I1125 15:53:23.195575 2326 sgd_solver.cpp:106] Iteration 58, lr = 0.001
I1125 15:53:25.140920 2326 solver.cpp:362] Iteration 60, Testing net (#0)
I1125 15:53:25.140965 2326 net.cpp:723] Ignoring source layer train-data
I1125 15:53:35.672775 2326 solver.cpp:429] Test net output #0: accuracy = 0.513333
I1125 15:53:35.672937 2326 solver.cpp:429] Test net output #1: loss = 0.69323 (* 1 = 0.69323 loss)
I1125 15:53:37.635395 2326 solver.cpp:242] Iteration 60 (0.138503 iter/s, 14.4401s/2 iter), loss = 0.655983
I1125 15:53:37.635478 2326 solver.cpp:261] Train net output #0: loss = 0.638368 (* 1 = 0.638368 loss)
I1125 15:53:37.635512 2326 sgd_solver.cpp:106] Iteration 60, lr = 0.001
I1125 15:53:41.458472 2326 solver.cpp:242] Iteration 62 (0.523143 iter/s, 3.82305s/2 iter), loss = 0.672996
I1125 15:53:41.458555 2326 solver.cpp:261] Train net output #0: loss = 0.753101 (* 1 = 0.753101 loss)
I1125 15:53:41.458588 2326 sgd_solver.cpp:106] Iteration 62, lr = 0.001
I1125 15:53:45.299643 2326 solver.cpp:242] Iteration 64 (0.520679 iter/s, 3.84114s/2 iter), loss = 0.668675
I1125 15:53:45.299737 2326 solver.cpp:261] Train net output #0: loss = 0.634894 (* 1 = 0.634894 loss)
A few comments:
1. Your validation set contains 1200 samples, but you are validating on only 600 each time: test_iter * batch_size = 600. See this answer for more details.
2. Did you shuffle your training data when you created your LMDB? See this answer for more details.
3. How do you initialize your weights? There seems to be no filler in your prototxt file. If you do not explicitly define fillers, your weights are initialized to zero, which is a very difficult starting point for SGD. See this answer for more details, and the sketch after this list.
4. Have you tried setting debug_info: true in your solver and looking into the debug log to trace the root cause of your problem? See this thread for more details.
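For points 1, 3 and 4, a minimal sketch of what the fixes could look like; this is illustrative rather than your exact net, and "xavier" is just a common filler choice:
layer {
name: "conv1_1"
type: "Convolution"
bottom: "scale"
top: "conv1_1"
param { lr_mult: 1 } # weights
param { lr_mult: 2 } # bias is often given twice the base learning rate
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
weight_filler { type: "xavier" } # or e.g. gaussian with a small std
bias_filler { type: "constant" value: 0 }
}
}
and in the solver, assuming you keep batch_size: 1:
test_iter: 1200 # 1200 * 1 = the full 1200-sample validation set
debug_info: true # logs per-blob data/diff statistics every iteration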
Related
I have a long text file like this:
I0927 11:33:18.534551 16932 solver.cpp:244] Train net output #0: loss = 2.61145 (* 1 = 2.61145 loss)
I0927 11:33:18.534620 16932 sgd_solver.cpp:106] Iteration 20, lr = 0.001
I0927 11:33:33.221546 16932 solver.cpp:228] Iteration 40, loss = 0.573027
I0927 11:33:33.221771 16932 solver.cpp:244] Train net output #0: loss = 0.573027 (* 1 = 0.573027 loss)
I0927 11:33:33.221851 16932 sgd_solver.cpp:106] Iteration 40, lr = 0.001
I0927 11:33:47.883162 16932 solver.cpp:228] Iteration 60, loss = 0.852016
I0927 11:33:47.884717 16932 solver.cpp:244] Train net output #0: loss = 0.852016 (* 1 = 0.852016 loss)
I0927 11:33:47.884812 16932 sgd_solver.cpp:106] Iteration 60, lr = 0.001
I0927 11:34:02.543320 16932 solver.cpp:228] Iteration 80, loss = 0.385975
I0927 11:34:02.543442 16932 solver.cpp:244] Train net output #0: loss = 0.385975 (* 1 = 0.385975 loss)
I0927 11:34:02.543514 16932 sgd_solver.cpp:106] Iteration 80, lr = 0.001
I0927 11:34:17.297544 16932 solver.cpp:228] Iteration 100, loss = 0.526758
I0927 11:34:17.297659 16932 solver.cpp:244] Train net output #0: loss = 0.526758 (* 1 = 0.526758 loss)
I0927 11:34:17.297722 16932 sgd_solver.cpp:106] Iteration 100, lr = 0.001
I0927 11:34:31.962934 16932 solver.cpp:228] Iteration 120, loss = 0.792767
I want to extract the following information
[ Iteration, Train net output, lr ]
and put them in a cell array in MATLAB.
Can you please direct me on how to do that?
I am deleting the first two lines and the last line of your log to make it consistent, so that a Train net output line and an sgd_solver ... lr = line follow every Iteration line, like this:
I0927 11:33:33.221546 16932 solver.cpp:228] Iteration 40, loss = 0.573027
I0927 11:33:33.221771 16932 solver.cpp:244] Train net output #0: loss = 0.573027 (* 1 = 0.573027 loss)
I0927 11:33:33.221851 16932 sgd_solver.cpp:106] Iteration 40, lr = 0.001
I0927 11:33:47.883162 16932 solver.cpp:228] Iteration 60, loss = 0.852016
I0927 11:33:47.884717 16932 solver.cpp:244] Train net output #0: loss = 0.852016 (* 1 = 0.852016 loss)
I0927 11:33:47.884812 16932 sgd_solver.cpp:106] Iteration 60, lr = 0.001
I0927 11:34:02.543320 16932 solver.cpp:228] Iteration 80, loss = 0.385975
I0927 11:34:02.543442 16932 solver.cpp:244] Train net output #0: loss = 0.385975 (* 1 = 0.385975 loss)
I0927 11:34:02.543514 16932 sgd_solver.cpp:106] Iteration 80, lr = 0.001
I0927 11:34:17.297544 16932 solver.cpp:228] Iteration 100, loss = 0.526758
I0927 11:34:17.297659 16932 solver.cpp:244] Train net output #0: loss = 0.526758 (* 1 = 0.526758 loss)
I0927 11:34:17.297722 16932 sgd_solver.cpp:106] Iteration 100, lr = 0.001
You can read this file as text using fileread and then execute regexp using the following code:
txt = fileread('log.txt');
it = regexp(txt,'I0927.*solver.cpp:228]\sIteration\s(.*),.*','tokens','dotexceptnewline')
it =
1×4 cell array
{1×1 cell} {1×1 cell} {1×1 cell} {1×1 cell}
net_out = regexp(txt,'I0927.*solver.cpp:244]\s*Train\snet\soutput.*loss\s=\s(\S*).*','tokens','dotexceptnewline');
lr = regexp(txt,'I0927.*sgd_solver.cpp:106]\sIteration.*lr\s=\s(\S*)','tokens','dotexceptnewline');
The outputs will need a little bit of conditioning before you can convert them to numbers:
% Get outputs out of their cells
it = [it{:}]';
net_out = [net_out{:}]';
lr = [lr{:}]';
sim_out = str2double([it net_out lr]);
As suggested by Some Guy, you can use regexp:
fid = fopen('log.txt','r');
output = {};
line = fgetl(fid);
while ischar(line)
l1 = regexp(line, 'Iteration\s+(\d+),\s+loss\s+=\s+', 'tokens', 'once');
if ~isempty(l1)
%// we got the first line of an iteration
line = fgetl(fid);
l2 = regexp(line, 'Train net output #0: loss = (\S+)', 'tokens', 'once');
line = fgetl(fid);
l3 = regexp(line, 'Iteration \d+, lr = (\S+)', 'tokens', 'once');
output{end+1} = [str2double(l1{1}), str2double(l2{1}), str2double(l3{1})];
end
line = fgetl(fid);
end;
fclose(fid);
output = vertcat(output{:});
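output is now an N-by-3 numeric matrix with one [iteration, training loss, lr] row per parsed iteration.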
BTW, are you aware of the $CAFFE_ROOT/tools/extra/parse_log.py utility that ships with Caffe?
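If I remember its interface correctly (check --help for your Caffe version), it takes the log file and an output directory and writes <log>.train and <log>.test CSV files that MATLAB can load back with readtable:
python $CAFFE_ROOT/tools/extra/parse_log.py /path/to/log.txt /path/to/output_dir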
I have the following code:
tic;
H = rand(100, 1000);
F = rand(1, 1000);
net = newff(H, F, [30, 10], { 'tansig' 'tansig'}, 'traingdx', 'learngdm', 'mse');
net.trainParam.epochs = 400;
net.performParam.regularization = 0.05;
net.divideParam.trainRatio = 1;
net.divideParam.valRatio = 0;
net.divideParam.testRatio = 0;
net.trainParam.showWindow = 0;
net.trainParam.showCommandLine = 0;
% net = train(net, H, F, 'useGPU', 'yes', 'showResources', 'yes'); % line 1
net = train(net, H, F, 'showResources', 'yes'); % line 2
toc;
with line 1 uncommented (and line 2 commented out) I get
Computing Resources:
GPU device #1, GeForce 800M
Elapsed time is 5.084222 seconds.
and with line 2 uncommented I get
Computing Resources:
MEX2
Elapsed time is 1.870803 seconds.
Why is GPU slower than CPU?
My GPU properties:
CUDADevice with properties:
Name: 'GeForce 800M'
Index: 1
ComputeCapability: '2.1'
SupportsDouble: 1
DriverVersion: 6
ToolkitVersion: 5
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [65535 65535 65535]
SIMDWidth: 32
TotalMemory: 2.1475e+09
FreeMemory: 1.9886e+09
MultiprocessorCount: 1
ClockRateKHz: 1475000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
I have a GPU; d = gpuDevice returns the same CUDADevice properties listed above.
I try to use the GPU in neural network training, but the GPU is slower than the CPU.
If I use gpuArray directly, the GPU is faster than the CPU, but I get no speedup in neural network training.
For example
>> a1 = rand(1000); b1 = rand(1000); tic; c1 = a1 * b1; toc;
Elapsed time is 0.044095 seconds.
>> a2 = gpuArray(rand(1000)); b2 = gpuArray(rand(1000)); tic; c2 = a2 * b2; toc;
Elapsed time is 0.000416 seconds.
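(Aside: tic/toc can under-report gpuArray timings, because GPU operations launch asynchronously and toc may fire before the multiply has actually finished. A fairer micro-benchmark, assuming the Parallel Computing Toolbox, is:)
a2 = gpuArray(rand(1000)); b2 = gpuArray(rand(1000));
d = gpuDevice;
tic; c2 = a2 * b2; wait(d); toc; % block until the kernel completes
t = gputimeit(@() a2 * b2) % or let gputimeit average several runs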
But in the following code (the body of net_example):
net = newff(H, F, Layers, { 'tansig' 'tansig'}, 'traingdx', 'learngdm', 'mse');
net.trainParam.epochs = Epochs;
net.trainParam.show = 500;
net.trainParam.time = 495;
net.trainParam.goal = 2.0000e-11;
net.trainParam.max_fail = 200000;
net.trainParam.min_grad = 1.0000e-050;
net.performParam.regularization = 0.05;
net.divideParam.trainRatio = 1;
net.divideParam.valRatio = 0;
net.divideParam.testRatio = 0;
net.trainParam.showWindow = 0;
net.trainParam.showCommandLine = 0;
if Gpu1 == 1
net = train(net, H, F, 'useGPU', 'yes', 'showResources','yes');
else
net = train(net, H, F, 'showResources','yes');
end;
tic; net = net_example(300, [23, 9], rand(100, 1000), rand(1, 1000), 1); toc;
Computing Resources:
GPU device #1, GeForce 800M
runs slower than
tic; net = net_example(300, [23, 9], rand(100, 1000), rand(1, 1000), 0); toc;
Computing Resources:
MEX2
First of all, thank you for reading. This output has confused me for several days. Why does the CNN reach 90% accuracy after 100 iterations and then fail to improve any further? Reducing or increasing the number of CNN layers does not affect the output, and reducing the learning rate to 1e-4 does not affect it either. This is driving me crazy.
I0520 17:18:54.184824 8010 solver.cpp:280] Learning Rate Policy: step
I0520 17:18:54.185271 8010 solver.cpp:337] Iteration 0, Testing net (#0)
I0520 17:18:56.232154 8010 solver.cpp:404] Test net output #0: accuracy = 0.104375
I0520 17:18:56.232197 8010 solver.cpp:404] Test net output #1: loss = 0.693147 (* 1 = 0.693147 loss)
I0520 17:18:56.466557 8010 solver.cpp:228] Iteration 0, loss = 0.693147
I0520 17:18:56.466586 8010 solver.cpp:244] Train net output #0: loss = 0.693147 (* 1 = 0.693147 loss)
I0520 17:18:56.466620 8010 sgd_solver.cpp:106] Iteration 0, lr = 0.01
I0520 17:19:11.837409 8010 solver.cpp:337] Iteration 100, Testing net (#0)
I0520 17:19:13.835307 8010 solver.cpp:404] Test net output #0: accuracy = 0.9025
I0520 17:19:13.835340 8010 solver.cpp:404] Test net output #1: loss = 0.322544 (* 1 = 0.322544 loss)
I0520 17:19:13.893486 8010 solver.cpp:228] Iteration 100, loss = 0.130141
I0520 17:19:13.893525 8010 solver.cpp:244] Train net output #0: loss = 0.130141 (* 1 = 0.130141 loss)
I0520 17:19:13.983126 8010 sgd_solver.cpp:106] Iteration 100, lr = 0.01
I0520 17:19:30.863654 8010 solver.cpp:337] Iteration 200, Testing net (#0)
I0520 17:19:32.862107 8010 solver.cpp:404] Test net output #0: accuracy = 0.89375
I0520 17:19:32.862146 8010 solver.cpp:404] Test net output #1: loss = 0.338875 (* 1 = 0.338875 loss)
I0520 17:19:32.920205 8010 solver.cpp:228] Iteration 200, loss = 0.3774
I0520 17:19:32.920241 8010 solver.cpp:244] Train net output #0: loss = 0.3774 (* 1 = 0.3774 loss)
I0520 17:19:33.014003 8010 sgd_solver.cpp:106] Iteration 200, lr = 0.01
I0520 17:19:48.727584 8010 solver.cpp:337] Iteration 300, Testing net (#0)
I0520 17:19:50.727105 8010 solver.cpp:404] Test net output #0: accuracy = 0.885625
I0520 17:19:50.727149 8010 solver.cpp:404] Test net output #1: loss = 0.35577 (* 1 = 0.35577 loss)
I0520 17:19:50.784287 8010 solver.cpp:228] Iteration 300, loss = 0.378196
I0520 17:19:50.784317 8010 solver.cpp:244] Train net output #0: loss = 0.378196 (* 1 = 0.378196 loss)
I0520 17:19:50.864651 8010 sgd_solver.cpp:106] Iteration 300, lr = 0.01
I0520 17:20:05.957052 8010 solver.cpp:337] Iteration 400, Testing net (#0)
I0520 17:20:07.961210 8010 solver.cpp:404] Test net output #0: accuracy = 0.89625
I0520 17:20:07.961241 8010 solver.cpp:404] Test net output #1: loss = 0.333487 (* 1 = 0.333487 loss)
I0520 17:20:08.018559 8010 solver.cpp:228] Iteration 400, loss = 0.247491
I0520 17:20:08.018585 8010 solver.cpp:244] Train net output #0: loss = 0.247491 (* 1 = 0.247491 loss)
I0520 17:20:08.097172 8010 sgd_solver.cpp:106] Iteration 400, lr = 0.01
I0520 17:20:23.099375 8010 solver.cpp:337] Iteration 500, Testing net (#0)
I0520 17:20:25.104033 8010 solver.cpp:404] Test net output #0: accuracy = 0.87875
I0520 17:20:25.104063 8010 solver.cpp:404] Test net output #1: loss = 0.369838 (* 1 = 0.369838 loss)
I0520 17:20:25.161242 8010 solver.cpp:228] Iteration 500, loss = 0.377606
I0520 17:20:25.161267 8010 solver.cpp:244] Train net output #0: loss = 0.377606 (* 1 = 0.377606 loss)
I0520 17:20:25.240455 8010 sgd_solver.cpp:106] Iteration 500, lr = 0.01
I0520 17:20:40.367671 8010 solver.cpp:337] Iteration 600, Testing net (#0)
I0520 17:20:42.372931 8010 solver.cpp:404] Test net output #0: accuracy = 0.899375
I0520 17:20:42.372972 8010 solver.cpp:404] Test net output #1: loss = 0.3275 (* 1 = 0.3275 loss)
I0520 17:20:42.431170 8010 solver.cpp:228] Iteration 600, loss = 0.504837
I0520 17:20:42.431207 8010 solver.cpp:244] Train net output #0: loss = 0.504837 (* 1 = 0.504837 loss)
I0520 17:20:42.509552 8010 sgd_solver.cpp:106] Iteration 600, lr = 0.01
I0520 17:20:57.602491 8010 solver.cpp:337] Iteration 700, Testing net (#0)
I0520 17:20:59.610484 8010 solver.cpp:404] Test net output #0: accuracy = 0.8775
I0520 17:20:59.610515 8010 solver.cpp:404] Test net output #1: loss = 0.37246 (* 1 = 0.37246 loss)
I0520 17:20:59.667711 8010 solver.cpp:228] Iteration 700, loss = 0.377645
I0520 17:20:59.667740 8010 solver.cpp:244] Train net output #0: loss = 0.377646 (* 1 = 0.377646 loss)
I0520 17:20:59.746636 8010 sgd_solver.cpp:106] Iteration 700, lr = 0.01
I0520 17:21:14.917335 8010 solver.cpp:337] Iteration 800, Testing net (#0)
I0520 17:21:16.927249 8010 solver.cpp:404] Test net output #0: accuracy = 0.89125
I0520 17:21:16.927291 8010 solver.cpp:404] Test net output #1: loss = 0.344211 (* 1 = 0.344211 loss)
I0520 17:21:16.984709 8010 solver.cpp:228] Iteration 800, loss = 0.630042
I0520 17:21:16.984740 8010 solver.cpp:244] Train net output #0: loss = 0.630042 (* 1 = 0.630042 loss)
I0520 17:21:17.068943 8010 sgd_solver.cpp:106] Iteration 800, lr = 0.01
I0520 17:21:32.367197 8010 solver.cpp:337] Iteration 900, Testing net (#0)
I0520 17:21:34.375227 8010 solver.cpp:404] Test net output #0: accuracy = 0.88625
I0520 17:21:34.375259 8010 solver.cpp:404] Test net output #1: loss = 0.354399 (* 1 = 0.354399 loss)
I0520 17:21:34.432396 8010 solver.cpp:228] Iteration 900, loss = 0.251611
I0520 17:21:34.432428 8010 solver.cpp:244] Train net output #0: loss = 0.251612 (* 1 = 0.251612 loss)
I0520 17:21:34.512991 8010 sgd_solver.cpp:106] Iteration 900, lr = 0.01
I0520 17:21:49.736721 8010 solver.cpp:337] Iteration 1000, Testing net (#0)
I0520 17:21:51.748107 8010 solver.cpp:404] Test net output #0: accuracy = 0.89625
I0520 17:21:51.748139 8010 solver.cpp:404] Test net output #1: loss = 0.33392 (* 1 = 0.33392 loss)
I0520 17:21:51.805516 8010 solver.cpp:228] Iteration 1000, loss = 0.377219
I0520 17:21:51.805547 8010 solver.cpp:244] Train net output #0: loss = 0.37722 (* 1 = 0.37722 loss)
I0520 17:21:51.891490 8010 sgd_solver.cpp:106] Iteration 1000, lr = 0.01
I0520 17:22:07.312000 8010 solver.cpp:337] Iteration 1100, Testing net (#0)
I0520 17:22:09.324790 8010 solver.cpp:404] Test net output #0: accuracy = 0.8825
I0520 17:22:09.324831 8010 solver.cpp:404] Test net output #1: loss = 0.362015 (* 1 = 0.362015 loss)
I0520 17:22:09.382184 8010 solver.cpp:228] Iteration 1100, loss = 0.377469
I0520 17:22:09.382210 8010 solver.cpp:244] Train net output #0: loss = 0.37747 (* 1 = 0.37747 loss)
I0520 17:22:09.470769 8010 sgd_solver.cpp:106] Iteration 1100, lr = 0.01
I0520 17:22:24.663307 8010 solver.cpp:337] Iteration 1200, Testing net (#0)
I0520 17:22:26.671860 8010 solver.cpp:404] Test net output #0: accuracy = 0.89375
I0520 17:22:26.671903 8010 solver.cpp:404] Test net output #1: loss = 0.33863 (* 1 = 0.33863 loss)
I0520 17:22:26.729918 8010 solver.cpp:228] Iteration 1200, loss = 0.246504
I0520 17:22:26.729953 8010 solver.cpp:244] Train net output #0: loss = 0.246505 (* 1 = 0.246505 loss)
I0520 17:22:26.816223 8010 sgd_solver.cpp:106] Iteration 1200, lr = 0.01
I0520 17:22:42.258044 8010 solver.cpp:337] Iteration 1300, Testing net (#0)
I0520 17:22:44.267346 8010 solver.cpp:404] Test net output #0: accuracy = 0.88875
I0520 17:22:44.267387 8010 solver.cpp:404] Test net output #1: loss = 0.349121 (* 1 = 0.349121 loss)
I0520 17:22:44.324551 8010 solver.cpp:228] Iteration 1300, loss = 0.507589
I0520 17:22:44.324579 8010 solver.cpp:244] Train net output #0: loss = 0.50759 (* 1 = 0.50759 loss)
I0520 17:22:44.406440 8010 sgd_solver.cpp:106] Iteration 1300, lr = 0.01
I0520 17:22:59.415982 8010 solver.cpp:337] Iteration 1400, Testing net (#0)
I0520 17:23:01.426676 8010 solver.cpp:404] Test net output #0: accuracy = 0.89625
I0520 17:23:01.426718 8010 solver.cpp:404] Test net output #1: loss = 0.334152 (* 1 = 0.334152 loss)
I0520 17:23:01.483747 8010 solver.cpp:228] Iteration 1400, loss = 0.250863
I0520 17:23:01.483773 8010 solver.cpp:244] Train net output #0: loss = 0.250864 (* 1 = 0.250864 loss)
I0520 17:23:01.566565 8010 sgd_solver.cpp:106] Iteration 1400, lr = 0.01
I0520 17:23:16.813146 8010 solver.cpp:337] Iteration 1500, Testing net (#0)
I0520 17:23:18.822875 8010 solver.cpp:404] Test net output #0: accuracy = 0.89875
I0520 17:23:18.822907 8010 solver.cpp:404] Test net output #1: loss = 0.328215 (* 1 = 0.328215 loss)
I0520 17:23:18.879963 8010 solver.cpp:228] Iteration 1500, loss = 0.116459
I0520 17:23:18.879992 8010 solver.cpp:244] Train net output #0: loss = 0.116459 (* 1 = 0.116459 loss)
I0520 17:23:18.971333 8010 sgd_solver.cpp:106] Iteration 1500, lr = 0.01
I0520 17:23:34.049043 8010 solver.cpp:337] Iteration 1600, Testing net (#0)
I0520 17:23:36.062679 8010 solver.cpp:404] Test net output #0: accuracy = 0.896875
I0520 17:23:36.062708 8010 solver.cpp:404] Test net output #1: loss = 0.332045 (* 1 = 0.332045 loss)
I0520 17:23:36.119786 8010 solver.cpp:228] Iteration 1600, loss = 0.509788
I0520 17:23:36.119813 8010 solver.cpp:244] Train net output #0: loss = 0.509789 (* 1 = 0.509789 loss)
I0520 17:23:36.203153 8010 sgd_solver.cpp:106] Iteration 1600, lr = 0.01
I0520 17:23:51.507007 8010 solver.cpp:337] Iteration 1700, Testing net (#0)
I0520 17:23:53.517299 8010 solver.cpp:404] Test net output #0: accuracy = 0.8925
I0520 17:23:53.517330 8010 solver.cpp:404] Test net output #1: loss = 0.341289 (* 1 = 0.341289 loss)
I0520 17:23:53.574323 8010 solver.cpp:228] Iteration 1700, loss = 0.247204
I0520 17:23:53.574349 8010 solver.cpp:244] Train net output #0: loss = 0.247205 (* 1 = 0.247205 loss)
I0520 17:23:53.660018 8010 sgd_solver.cpp:106] Iteration 1700, lr = 0.01
I0520 17:24:08.737020 8010 solver.cpp:337] Iteration 1800, Testing net (#0)
I0520 17:24:10.748651 8010 solver.cpp:404] Test net output #0: accuracy = 0.90375
I0520 17:24:10.748685 8010 solver.cpp:404] Test net output #1: loss = 0.318335 (* 1 = 0.318335 loss)
I0520 17:24:10.805716 8010 solver.cpp:228] Iteration 1800, loss = 0.12061
I0520 17:24:10.805747 8010 solver.cpp:244] Train net output #0: loss = 0.120611 (* 1 = 0.120611 loss)
I0520 17:24:10.890601 8010 sgd_solver.cpp:106] Iteration 1800, lr = 0.01
I0520 17:24:25.873175 8010 solver.cpp:337] Iteration 1900, Testing net (#0)
I0520 17:24:27.883744 8010 solver.cpp:404] Test net output #0: accuracy = 0.89125
I0520 17:24:27.883777 8010 solver.cpp:404] Test net output #1: loss = 0.34394 (* 1 = 0.34394 loss)
I0520 17:24:27.940755 8010 solver.cpp:228] Iteration 1900, loss = 0.248079
I0520 17:24:27.940788 8010 solver.cpp:244] Train net output #0: loss = 0.24808 (* 1 = 0.24808 loss)
I0520 17:24:28.023818 8010 sgd_solver.cpp:106] Iteration 1900, lr = 0.01
I0520 17:24:43.362032 8010 solver.cpp:337] Iteration 2000, Testing net (#0)
I0520 17:24:45.372015 8010 solver.cpp:404] Test net output #0: accuracy = 0.890625
I0520 17:24:45.372057 8010 solver.cpp:404] Test net output #1: loss = 0.345227 (* 1 = 0.345227 loss)
I0520 17:24:45.429081 8010 solver.cpp:228] Iteration 2000, loss = 0.24597
I0520 17:24:45.429111 8010 solver.cpp:244] Train net output #0: loss = 0.245971 (* 1 = 0.245971 loss)
I0520 17:24:45.512468 8010 sgd_solver.cpp:106] Iteration 2000, lr = 0.001
I0520 17:25:00.576135 8010 solver.cpp:337] Iteration 2100, Testing net (#0)
I0520 17:25:02.587014 8010 solver.cpp:404] Test net output #0: accuracy = 0.89375
I0520 17:25:02.587056 8010 solver.cpp:404] Test net output #1: loss = 0.338609 (* 1 = 0.338609 loss)
I0520 17:25:02.644042 8010 solver.cpp:228] Iteration 2100, loss = 0.378319
I0520 17:25:02.644069 8010 solver.cpp:244] Train net output #0: loss = 0.37832 (* 1 = 0.37832 loss)
I0520 17:25:02.727481 8010 sgd_solver.cpp:106] Iteration 2100, lr = 0.001
I0520 17:25:17.801662 8010 solver.cpp:337] Iteration 2200, Testing net (#0)
I0520 17:25:19.811810 8010 solver.cpp:404] Test net output #0: accuracy = 0.88875
I0520 17:25:19.811852 8010 solver.cpp:404] Test net output #1: loss = 0.349157 (* 1 = 0.349157 loss)
I0520 17:25:19.868859 8010 solver.cpp:228] Iteration 2200, loss = 0.509671
I0520 17:25:19.868885 8010 solver.cpp:244] Train net output #0: loss = 0.509672 (* 1 = 0.509672 loss)
I0520 17:25:19.953173 8010 sgd_solver.cpp:106] Iteration 2200, lr = 0.001
I0520 17:25:35.135272 8010 solver.cpp:337] Iteration 2300, Testing net (#0)
I0520 17:25:37.147276 8010 solver.cpp:404] Test net output #0: accuracy = 0.890625
I0520 17:25:37.147317 8010 solver.cpp:404] Test net output #1: loss = 0.345208 (* 1 = 0.345208 loss)
I0520 17:25:37.204273 8010 solver.cpp:228] Iteration 2300, loss = 0.37789
I0520 17:25:37.204303 8010 solver.cpp:244] Train net output #0: loss = 0.377891 (* 1 = 0.377891 loss)
I0520 17:25:37.287936 8010 sgd_solver.cpp:106] Iteration 2300, lr = 0.001
I0520 17:25:52.305204 8010 solver.cpp:337] Iteration 2400, Testing net (#0)
I0520 17:25:54.314997 8010 solver.cpp:404] Test net output #0: accuracy = 0.8775
I0520 17:25:54.315038 8010 solver.cpp:404] Test net output #1: loss = 0.372589 (* 1 = 0.372589 loss)
I0520 17:25:54.372031 8010 solver.cpp:228] Iteration 2400, loss = 0.377802
I0520 17:25:54.372058 8010 solver.cpp:244] Train net output #0: loss = 0.377803 (* 1 = 0.377803 loss)
I0520 17:25:54.456874 8010 sgd_solver.cpp:106] Iteration 2400, lr = 0.001
I0520 17:26:09.711889 8010 solver.cpp:337] Iteration 2500, Testing net (#0)
I0520 17:26:11.721444 8010 solver.cpp:404] Test net output #0: accuracy = 0.8875
I0520 17:26:11.721487 8010 solver.cpp:404] Test net output #1: loss = 0.35174 (* 1 = 0.35174 loss)
I0520 17:26:11.779080 8010 solver.cpp:228] Iteration 2500, loss = 0.247256
I0520 17:26:11.779130 8010 solver.cpp:244] Train net output #0: loss = 0.247256 (* 1 = 0.247256 loss)
I0520 17:26:11.871363 8010 sgd_solver.cpp:106] Iteration 2500, lr = 0.001
I0520 17:26:26.890584 8010 solver.cpp:337] Iteration 2600, Testing net (#0)
I0520 17:26:28.901217 8010 solver.cpp:404] Test net output #0: accuracy = 0.888125
I0520 17:26:28.901259 8010 solver.cpp:404] Test net output #1: loss = 0.350424 (* 1 = 0.350424 loss)
I0520 17:26:28.959149 8010 solver.cpp:228] Iteration 2600, loss = 0.247565
I0520 17:26:28.959192 8010 solver.cpp:244] Train net output #0: loss = 0.247566 (* 1 = 0.247566 loss)
I0520 17:26:29.043840 8010 sgd_solver.cpp:106] Iteration 2600, lr = 0.001
I0520 17:26:44.058362 8010 solver.cpp:337] Iteration 2700, Testing net (#0)
I0520 17:26:46.068781 8010 solver.cpp:404] Test net output #0: accuracy = 0.896875
I0520 17:26:46.068824 8010 solver.cpp:404] Test net output #1: loss = 0.332274 (* 1 = 0.332274 loss)
I0520 17:26:46.126523 8010 solver.cpp:228] Iteration 2700, loss = 0.377633
I0520 17:26:46.126566 8010 solver.cpp:244] Train net output #0: loss = 0.377634 (* 1 = 0.377634 loss)
I0520 17:26:46.211163 8010 sgd_solver.cpp:106] Iteration 2700, lr = 0.001
I0520 17:27:01.389209 8010 solver.cpp:337] Iteration 2800, Testing net (#0)
I0520 17:27:03.399862 8010 solver.cpp:404] Test net output #0: accuracy = 0.888125
I0520 17:27:03.399893 8010 solver.cpp:404] Test net output #1: loss = 0.350417 (* 1 = 0.350417 loss)
I0520 17:27:03.456836 8010 solver.cpp:228] Iteration 2800, loss = 0.377626
I0520 17:27:03.456866 8010 solver.cpp:244] Train net output #0: loss = 0.377627 (* 1 = 0.377627 loss)
I0520 17:27:03.542301 8010 sgd_solver.cpp:106] Iteration 2800, lr = 0.001
I0520 17:27:18.608510 8010 solver.cpp:337] Iteration 2900, Testing net (#0)
I0520 17:27:20.618741 8010 solver.cpp:404] Test net output #0: accuracy = 0.888125
I0520 17:27:20.618775 8010 solver.cpp:404] Test net output #1: loss = 0.350423 (* 1 = 0.350423 loss)
I0520 17:27:20.675734 8010 solver.cpp:228] Iteration 2900, loss = 0.377743
I0520 17:27:20.675763 8010 solver.cpp:244] Train net output #0: loss = 0.377744 (* 1 = 0.377744 loss)
I0520 17:27:20.761608 8010 sgd_solver.cpp:106] Iteration 2900, lr = 0.001
I0520 17:27:36.487820 8010 solver.cpp:337] Iteration 3000, Testing net (#0)
I0520 17:27:38.498106 8010 solver.cpp:404] Test net output #0: accuracy = 0.888125
I0520 17:27:38.498141 8010 solver.cpp:404] Test net output #1: loss = 0.350422 (* 1 = 0.350422 loss)
I0520 17:27:38.555095 8010 solver.cpp:228] Iteration 3000, loss = 0.117625
I0520 17:27:38.555124 8010 solver.cpp:244] Train net output #0: loss = 0.117626 (* 1 = 0.117626 loss)
I0520 17:27:38.651935 8010 sgd_solver.cpp:106] Iteration 3000, lr = 0.001
I0520 17:27:54.956867 8010 solver.cpp:337] Iteration 3100, Testing net (#0)
I0520 17:27:56.967296 8010 solver.cpp:404] Test net output #0: accuracy = 0.8975
I0520 17:27:56.967337 8010 solver.cpp:404] Test net output #1: loss = 0.330956 (* 1 = 0.330956 loss)
I0520 17:27:57.024297 8010 solver.cpp:228] Iteration 3100, loss = 0.118173
I0520 17:27:57.024324 8010 solver.cpp:244] Train net output #0: loss = 0.118174 (* 1 = 0.118174 loss)
I0520 17:27:57.118137 8010 sgd_solver.cpp:106] Iteration 3100, lr = 0.001
I0520 17:28:13.332152 8010 solver.cpp:337] Iteration 3200, Testing net (#0)
I0520 17:28:15.342835 8010 solver.cpp:404] Test net output #0: accuracy = 0.896875
I0520 17:28:15.342877 8010 solver.cpp:404] Test net output #1: loss = 0.332293 (* 1 = 0.332293 loss)
I0520 17:28:15.399948 8010 solver.cpp:228] Iteration 3200, loss = 0.507063
I0520 17:28:15.399974 8010 solver.cpp:244] Train net output #0: loss = 0.507064 (* 1 = 0.507064 loss)
I0520 17:28:15.496682 8010 sgd_solver.cpp:106] Iteration 3200, lr = 0.001
I0520 17:28:31.541100 8010 solver.cpp:337] Iteration 3300, Testing net (#0)
I0520 17:28:33.551388 8010 solver.cpp:404] Test net output #0: accuracy = 0.898125
I0520 17:28:33.551429 8010 solver.cpp:404] Test net output #1: loss = 0.32964 (* 1 = 0.32964 loss)
I0520 17:28:33.608397 8010 solver.cpp:228] Iteration 3300, loss = 0.377692
I0520 17:28:33.608423 8010 solver.cpp:244] Train net output #0: loss = 0.377693 (* 1 = 0.377693 loss)
I0520 17:28:33.702908 8010 sgd_solver.cpp:106] Iteration 3300, lr = 0.001
I0520 17:28:49.041621 8010 solver.cpp:337] Iteration 3400, Testing net (#0)
I0520 17:28:51.052881 8010 solver.cpp:404] Test net output #0: accuracy = 0.89375
I0520 17:28:51.052911 8010 solver.cpp:404] Test net output #1: loss = 0.338749 (* 1 = 0.338749 loss)
I0520 17:28:51.109899 8010 solver.cpp:228] Iteration 3400, loss = 0.377645
I0520 17:28:51.109926 8010 solver.cpp:244] Train net output #0: loss = 0.377646 (* 1 = 0.377646 loss)
I0520 17:28:51.194988 8010 sgd_solver.cpp:106] Iteration 3400, lr = 0.001
Following are the details of the CNN.
The picture SNR is pretty low, about 0.01.
Train set: 73167 samples of class 0, 8888 of class 1.
Validation set: 24389 samples of class 0, 2962 of class 1.
The CNN is designed for 2-class classification. I was inspired by the deep residual network from a recent Microsoft Research paper.
Caffe code:
layer {
name: "Protein"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mean_file: "/home/tg/Documents/caffe/my/protein/data/mean.binaryproto"
}
data_param {
source: "/home/tg/Documents/caffe/my/protein/data/train_db"
batch_size: 16
backend: LMDB
}
}
layer {
name: "Protein"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mean_file: "/home/tg/Documents/caffe/my/protein/data/mean.binaryproto"
# mean_file: "/home/tg/Documents/caffe/my/protein/data/test_mean.binaryproto"
}
data_param {
source: "/home/tg/Documents/caffe/my/protein/data/val_db"
#source: "/home/tg/Documents/caffe/my/protein/test_db"
batch_size: 16
backend: LMDB
}
}
layer {
bottom: "data"
top: "conv1"
name: "conv1"
type: "Convolution"
convolution_param {
num_output: 256
kernel_size: 5
pad: 3
stride: 2
}
}
layer {
bottom: "conv1"
top: "conv1"
name: "bn_conv1"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
}
layer {
bottom: "conv1"
top: "conv1"
name: "scale_conv1"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "conv1"
top: "conv1"
name: "relu_conv1"
type: "ReLU"
}
layer {
bottom: "conv1"
top: "res1_branch2a"
name: "res1_branch2a"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 1
pad: 0
stride: 1
bias_term: false
}
}
layer {
bottom: "res1_branch2a"
top: "res1_branch2a"
name: "bn1_branch2a"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
}
layer {
bottom: "res1_branch2a"
top: "res1_branch2a"
name: "scale1_branch2a"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res1_branch2a"
top: "res1_branch2a"
name: "res1_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res1_branch2a"
top: "res1_branch2b"
name: "res1_branch2b"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
stride: 1
bias_term: false
}
}
layer {
bottom: "res1_branch2b"
top: "res1_branch2b"
name: "bn1_branch2b"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
}
layer {
bottom: "res1_branch2b"
top: "res1_branch2b"
name: "scale1_branch2b"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res1_branch2b"
top: "res1_branch2b"
name: "res1_branch2b_relu"
type: "ReLU"
}
layer {
bottom: "res1_branch2b"
top: "res1_branch2c"
name: "res1_branch2c"
type: "Convolution"
convolution_param {
num_output: 256
kernel_size: 1
pad: 0
stride: 1
bias_term: false
}
}
layer {
bottom: "res1_branch2c"
top: "res1_branch2c"
name: "bn1_branch2c"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
}
layer {
bottom: "res1_branch2c"
top: "res1_branch2c"
name: "scale1_branch2c"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "conv1"
bottom: "res1_branch2c"
top: "res1"
name: "res1"
type: "Eltwise"
}
layer {
bottom: "res1"
top: "res1"
name: "res1_relu"
type: "ReLU"
}
layer {
bottom: "res1"
top: "res2_branch1"
name: "res2_branch1"
type: "Convolution"
convolution_param {
num_output: 512
kernel_size: 1
pad: 0
stride: 2
bias_term: false
}
}
layer {
bottom: "res2_branch1"
top: "res2_branch1"
name: "bn2_branch1"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
}
layer {
bottom: "res2_branch1"
top: "res2_branch1"
name: "scale2_branch1"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res1"
top: "res2_branch2a"
name: "res2_branch2a"
type: "Convolution"
convolution_param {
num_output: 128
kernel_size: 1
pad: 0
stride: 2
bias_term: false
}
}
layer {
bottom: "res2_branch2a"
top: "res2_branch2a"
name: "bn2_branch2a"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
}
layer {
bottom: "res2_branch2a"
top: "res2_branch2a"
name: "scale2_branch2a"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res2_branch2a"
top: "res2_branch2a"
name: "res2_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res2_branch2a"
top: "res2_branch2b"
name: "res2_branch2b"
type: "Convolution"
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
stride: 1
bias_term: false
}
}
layer {
bottom: "res2_branch2b"
top: "res2_branch2b"
name: "bn2_branch2b"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
}
layer {
bottom: "res2_branch2b"
top: "res2_branch2b"
name: "scale2_branch2b"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res2_branch2b"
top: "res2_branch2b"
name: "res2_branch2b_relu"
type: "ReLU"
}
layer {
bottom: "res2_branch2b"
top: "res2_branch2c"
name: "res2_branch2c"
type: "Convolution"
convolution_param {
num_output: 512
kernel_size: 1
pad: 0
stride: 1
bias_term: false
}
}
layer {
bottom: "res2_branch2c"
top: "res2_branch2c"
name: "bn_branch2c"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
}
layer {
bottom: "res2_branch2c"
top: "res2_branch2c"
name: "scale_branch2c"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res2_branch1"
bottom: "res2_branch2c"
top: "res2"
name: "res2"
type: "Eltwise"
}
layer {
bottom: "res2"
top: "res2"
name: "res2_relu"
type: "ReLU"
}
layer {
bottom: "res2"
top: "fc1"
name: "fc1"
type: "InnerProduct"
inner_product_param {
num_output: 128
}
}
layer {
bottom: "fc1"
top: "fc1_relu"
name: "fc1_relu"
type: "ReLU"
}
layer {
bottom: "fc1_relu"
top: "fc2"
name: "fc2"
type: "InnerProduct"
inner_product_param {
num_output: 2
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc2"
bottom: "label"
top: "loss"
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
solver:
test_iter: 100
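# note: with batch_size 16 above, each test pass sees 100 * 16 = 1600 of the 27351 validation images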
test_interval: 100
base_lr: 0.01
lr_policy: "step"
momentum: 0.9
stepsize: 2000
gamma: 0.3
weight_decay: 0.004
We need help understanding the parameters to use for a smaller training set (6000 JPEGs) and val set (170 JPEGs). Our run was killed and exited after printing test scores #0 and #1 at iteration 0.
We are trying to run the imagenet sample on the caffe website tutorial at
http://caffe.berkeleyvision.org/gathered/examples/imagenet.html.
Instead of using the full set of ILSVRC12 images in the package, we use our own training set of 6000 JPEGs and a val set of 170 JPEG images. Each is a 256 x 256 JPEG file in the train and val directories, as instructed. We ran the script to get the auxiliary data:
./data/ilsvrc12/get_ilsvrc_aux.sh
The train.txt and val.txt files are set up to assign one of two possible categories to each JPEG file.
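For reference, each line in those files is a relative image path followed by an integer class label; with two categories it looks like this (file names made up):
images/sample_0001.jpg 0
images/sample_0002.jpg 1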
Then we ran the script to compute the mean image data which appeared to run correctly:
./examples/imagenet/make_imagenet_mean.sh
We used the model definitions supplied in the tutorial for imagenet_train.prototxt and imagenet_val.prototxt.
Since we are training on far fewer images, we modified imagenet_solver.prototxt as follows:
train_net: "./imagenet_train.prototxt"
test_net: "./imagenet_val.prototxt"
test_iter: 3
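# note: the val net's batch size is 50 (see the Top shape: 50 ... lines in the log below), so 3 test iterations cover 150 of the 170 val images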
test_interval: 10
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 10
display: 20
max_iter: 45
momentum: 0.9
weight_decay: 0.0005
snapshot: 10
snapshot_prefix: "caffe_imagenet_train"
solver_mode: CPU
When we run it using:
./train_imagenet.sh
we get the following output, where it hangs and is eventually killed:
.......
.......
I0520 23:07:53.175761 4678 net.cpp:85] drop7 <- fc7
I0520 23:07:53.175791 4678 net.cpp:99] drop7 -> fc7 (in-place)
I0520 23:07:53.176246 4678 net.cpp:126] Top shape: 50 4096 1 1 (204800)
I0520 23:07:53.176275 4678 net.cpp:152] drop7 needs backward computation.
I0520 23:07:53.176296 4678 net.cpp:75] Creating Layer fc8
I0520 23:07:53.176306 4678 net.cpp:85] fc8 <- fc7
I0520 23:07:53.176314 4678 net.cpp:111] fc8 -> fc8
I0520 23:07:53.184213 4678 net.cpp:126] Top shape: 50 1000 1 1 (50000)
I0520 23:07:53.184908 4678 net.cpp:152] fc8 needs backward computation.
I0520 23:07:53.185607 4678 net.cpp:75] Creating Layer prob
I0520 23:07:53.186135 4678 net.cpp:85] prob <- fc8
I0520 23:07:53.186538 4678 net.cpp:111] prob -> prob
I0520 23:07:53.187166 4678 net.cpp:126] Top shape: 50 1000 1 1 (50000)
I0520 23:07:53.187696 4678 net.cpp:152] prob needs backward computation.
I0520 23:07:53.188244 4678 net.cpp:75] Creating Layer accuracy
I0520 23:07:53.188431 4678 net.cpp:85] accuracy <- prob
I0520 23:07:53.188540 4678 net.cpp:85] accuracy <- label
I0520 23:07:53.188870 4678 net.cpp:111] accuracy -> accuracy
I0520 23:07:53.188907 4678 net.cpp:126] Top shape: 1 2 1 1 (2)
I0520 23:07:53.188915 4678 net.cpp:152] accuracy needs backward computation.
I0520 23:07:53.188922 4678 net.cpp:163] This network produces output accuracy
I0520 23:07:53.188942 4678 net.cpp:181] Collecting Learning Rate and Weight Decay.
I0520 23:07:53.188954 4678 net.cpp:174] Network initialization done.
I0520 23:07:53.188961 4678 net.cpp:175] Memory required for Data 210114408
I0520 23:07:53.189008 4678 solver.cpp:49] Solver scaffolding done.
I0520 23:07:53.189018 4678 solver.cpp:61] Solving CaffeNet
I0520 23:07:53.189033 4678 solver.cpp:106] Iteration 0, Testing net
I0520 23:09:06.699695 4678 solver.cpp:142] Test score #0: 0
I0520 23:09:06.700203 4678 solver.cpp:142] Test score #1: 7.07406
Killed
Done.