Backward pass in Caffe Python Layer is not called/working?
I am unsuccessfully trying to implement a simple loss layer in Python using Caffe. As a reference, I found several layers implemented in Python, including here, here and here.
Starting with the EuclideanLossLayer as provided by the Caffe documentation/examples, I was not able to get it working and started debugging. Even using this simple TestLayer:
import caffe

class TestLayer(caffe.Layer):
    def setup(self, bottom, top):
        """
        Checks the correct number of bottom inputs.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """
        print 'setup'

    def reshape(self, bottom, top):
        """
        Make sure all involved blobs have the right dimension.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """
        print 'reshape'
        top[0].reshape(bottom[0].data.shape[0], bottom[0].data.shape[1],
                       bottom[0].data.shape[2], bottom[0].data.shape[3])

    def forward(self, bottom, top):
        """
        Forward propagation: copy the input through unchanged.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """
        print 'forward'
        top[0].data[...] = bottom[0].data

    def backward(self, top, propagate_down, bottom):
        """
        Backward pass: pass the gradient through unchanged.

        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        :param propagate_down: whether to propagate gradients down to each bottom
        :type propagate_down: list of bool
        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        """
        print 'backward'
        bottom[0].diff[...] = top[0].diff[...]
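For reference, the EuclideanLossLayer I started from looks roughly like this (a sketch following Caffe's pyloss.py example; my exact version may differ):

import caffe
import numpy as np

class EuclideanLossLayer(caffe.Layer):
    def setup(self, bottom, top):
        # need two bottoms: predictions and labels
        if len(bottom) != 2:
            raise Exception("Need two inputs to compute distance.")

    def reshape(self, bottom, top):
        if bottom[0].count != bottom[1].count:
            raise Exception("Inputs must have the same dimension.")
        self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
        top[0].reshape(1)  # loss output is a scalar

    def forward(self, bottom, top):
        self.diff[...] = bottom[0].data - bottom[1].data
        top[0].data[...] = np.sum(self.diff ** 2) / bottom[0].num / 2.

    def backward(self, top, propagate_down, bottom):
        for i in range(2):
            if not propagate_down[i]:
                continue
            sign = 1 if i == 0 else -1
            bottom[i].diff[...] = sign * self.diff / bottom[i].num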
I am not able to get the Python layer working. The learning task is rather simple, as I am merely trying to predict whether a real-valued number is positive or negative. The corresponding data is generated as follows and written to LMDBs:
import numpy

N = 10000
N_train = int(0.8*N)
images = []
labels = []
for n in range(N):
    image = (numpy.random.rand(1, 1, 1)*2 - 1).astype(numpy.float)
    label = int(numpy.sign(image))
    images.append(image)
    labels.append(label)
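For reference, a minimal sketch of how the generated samples might be written to LMDB via pycaffe (the database path and map size are placeholders):

import lmdb
import caffe

env = lmdb.open('tests/train_lmdb', map_size=1024*1024*1024)
with env.begin(write=True) as txn:
    for n in range(N_train):
        # array_to_datum expects a 3D array; the label is stored alongside it
        datum = caffe.io.array_to_datum(images[n], labels[n])
        txn.put('{:08d}'.format(n), datum.SerializeToString())
env.close()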
Writing the data to LMDB should be correct, as tests with the MNIST dataset provided by Caffe show no problems. The network is defined as follows:
net.data, net.labels = caffe.layers.Data(batch_size = batch_size, backend = caffe.params.Data.LMDB,
                                         source = lmdb_path, ntop = 2)
net.fc1 = caffe.layers.Python(net.data, python_param = dict(module = 'tools.layers', layer = 'TestLayer'))
net.score = caffe.layers.TanH(net.fc1)
net.loss = caffe.layers.EuclideanLoss(net.score, net.labels)
Solving is done manually using:
for iteration in range(iterations):
    solver.step(step)
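For context, the solver is created roughly as follows (a sketch; the exact path, step size and iteration count are placeholders):

import caffe

caffe.set_mode_cpu()
solver = caffe.SGDSolver('tests/solver.prototxt')
step = 1          # solver iterations per step() call
iterations = 1000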
The corresponding prototxt files are below:
solver.prototxt:
weight_decay: 0.0005
test_net: "tests/test.prototxt"
snapshot_prefix: "tests/snapshot_"
max_iter: 1000
stepsize: 1000
base_lr: 0.01
snapshot: 0
gamma: 0.01
solver_mode: CPU
train_net: "tests/train.prototxt"
test_iter: 0
test_initialization: false
lr_policy: "step"
momentum: 0.9
display: 100
test_interval: 100000
train.prototxt:
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "labels"
  data_param {
    source: "tests/train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "fc1"
  type: "Python"
  bottom: "data"
  top: "fc1"
  python_param {
    module: "tools.layers"
    layer: "TestLayer"
  }
}
layer {
  name: "score"
  type: "TanH"
  bottom: "fc1"
  top: "score"
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "score"
  bottom: "labels"
  top: "loss"
}
test.prototxt:
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "labels"
  data_param {
    source: "tests/test_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "fc1"
  type: "Python"
  bottom: "data"
  top: "fc1"
  python_param {
    module: "tools.layers"
    layer: "TestLayer"
  }
}
layer {
  name: "score"
  type: "TanH"
  bottom: "fc1"
  top: "score"
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "score"
  bottom: "labels"
  top: "loss"
}
I tried tracking it down by adding debug messages in the backward and forward methods of TestLayer; only the forward method gets called during solving (note that NO testing is performed, so the calls can only be related to solving). Similarly, I added debug messages in python_layer.hpp:
virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  LOG(INFO) << "cpp forward";
  self_.attr("forward")(bottom, top);
}

virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  LOG(INFO) << "cpp backward";
  self_.attr("backward")(top, propagate_down, bottom);
}
Again, only the forward pass is executed. When I remove the backward method in TestLayer, solving still works. When removing the forward method, an error is thrown as forward is not implemented. I would expect the same for backward, so it seems that the backward pass does not get executed at all. Switching back to regular layers and adding debug messages, everything works as expected.
I have the feeling that I am missing something simple or fundamental, but I have not been able to resolve the problem for several days. Any help or hints are appreciated.
Thanks!
This is the intended behaviour, since you do not have any layers "below" your Python layer that actually need the gradients to compute weight updates. Caffe notices this and skips the backward computation for such layers, because it would be a waste of time.
At network initialization time, Caffe logs for each layer whether the backward computation is needed.
In your case, you should see something like:
fc1 does not need backward computation.
If you put an "InnerProduct" or "Convolution" layer below your "Python" layer (e.g. Data->InnerProduct->Python->Loss), the backward computation becomes necessary and your backward method gets called.
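Applied to the NetSpec definition from the question, that could look like the sketch below (the InnerProduct parameters are assumptions for illustration):

net.data, net.labels = caffe.layers.Data(batch_size = batch_size, backend = caffe.params.Data.LMDB,
                                         source = lmdb_path, ntop = 2)
net.ip = caffe.layers.InnerProduct(net.data, num_output = 1,
                                   weight_filler = dict(type = 'xavier'))
net.fc1 = caffe.layers.Python(net.ip, python_param = dict(module = 'tools.layers', layer = 'TestLayer'))
net.score = caffe.layers.TanH(net.fc1)
net.loss = caffe.layers.EuclideanLoss(net.score, net.labels)

Since the InnerProduct layer's weights need gradients that flow through the Python layer, Caffe can no longer skip its backward computation.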
In addition to Erik B.'s answer, you can force Caffe to backpropagate through all layers by specifying

force_backward: true

in your net prototxt.
See comments in caffe.proto for more information.
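For the train.prototxt above, the flag goes at the top level of the net definition, outside any layer block (a minimal sketch):

force_backward: true
layer {
  name: "data"
  type: "Data"
  # ... rest of the net unchanged
}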
Mine wasn't working even though I did set force_backward: true as suggested by David Stutz. I found out here and here that I was forgetting to set the diff of the last layer to 1 at the index of the target class.
As Mohit Jain describes in his caffe-users answer, if you are doing ImageNet classification with the tabby cat, after doing the forward pass, you'll have to do something like:
net.blobs['prob'].diff[0][281] = 1 # 281 is tabby cat. diff shape: (1, 1000)
Notice that you'll have to change 'prob' according to the name of your last layer, which for a Softmax layer is usually 'prob'.
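If you want to seed the gradient at whatever class the network actually predicted, instead of a hard-coded index, a small sketch (assuming the top blob is named 'prob'):

top_class = net.blobs['prob'].data[0].argmax()  # index of the highest-scoring class
net.blobs['prob'].diff[...] = 0                 # clear any stale gradient
net.blobs['prob'].diff[0, top_class] = 1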
Here's an example based on mine:
deploy.prototxt (it's loosely based on VGG16 just to show the structure of the file, but I didn't test it):
name: "smaller_vgg"
input: "data"
force_backward: true
input_dim: 1
input_dim: 3
input_dim: 224
input_dim: 224
layer {
name: "conv1_1"
type: "Convolution"
bottom: "data"
top: "conv1_1"
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
}
}
layer {
name: "relu1_1"
type: "ReLU"
bottom: "conv1_1"
top: "conv1_1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1_1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "fc1"
type: "InnerProduct"
bottom: "pool1"
top: "fc1"
inner_product_param {
num_output: 4096
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "fc1"
top: "fc1"
}
layer {
name: "drop1"
type: "Dropout"
bottom: "fc1"
top: "fc1"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc2"
type: "InnerProduct"
bottom: "fc1"
top: "fc2"
inner_product_param {
num_output: 1000
}
}
layer {
name: "prob"
type: "Softmax"
bottom: "fc2"
top: "prob"
}
main.py:
import cv2
import numpy as np
import caffe

prototxt = 'deploy.prototxt'
model_file = 'smaller_vgg.caffemodel'
# caffe.Net takes the model definition first, then the weights
net = caffe.Net(prototxt, model_file, caffe.TRAIN)  # not sure if TEST works as well
image = cv2.imread('tabbycat.jpg', cv2.IMREAD_UNCHANGED)  # assumed to be 224x224 BGR already
# Caffe expects NCHW, so move the channel axis first and add a batch axis
net.blobs['data'].data[...] = image.transpose(2, 0, 1)[np.newaxis, :, :, :]
net.forward()
net.blobs['prob'].diff[0, 281] = 1  # 281 is tabby cat, as in the snippet above
backout = net.backward()
# access the gradient from backout['data'] or net.blobs['data'].diff