I am working on my own implementation of WaveNet in TensorFlow. I keep running into an issue where the generated audio decays to zero, and I think it might help if my classes were weighted correctly. To do this, I decided to divide my cost function by the frequency with which each class value occurs. Right now I keep a running total of the frequencies as I train: to compute them, I loop over the 128 distinct values and count the occurrences of each one. I feel like there should be a way to do this with vector operations, but I am unsure how. Does anyone know how I can do away with the for loop?
with tf.variable_scope('training'):
    self.global_step = tf.get_variable('global_step', [], tf.int64, initializer=tf.constant_initializer(), trainable=False)
    class_count = tf.get_variable('class_count', (quantization_channels,), tf.int64, initializer=tf.constant_initializer(), trainable=False)
    total_count = tf.get_variable('total_count', [], tf.int64, initializer=tf.constant_initializer(), trainable=False)
    y_ = tf.reshape(y_, (-1,))
    y = tf.reshape(y, (-1, quantization_channels))
    counts = [0] * quantization_channels
    for i in range(quantization_channels):
        counts[i] = tf.reduce_sum(tf.cast(tf.equal(y_, i), tf.int64))
    counts = class_count + tf.pack(counts)
    total = total_count + tf.reduce_prod(tf.shape(y_, out_type=tf.int64))
    with tf.control_dependencies([tf.assign(class_count, counts), tf.assign(total_count, total)]):
        class_freq = tf.cast(counts, tf.float32) / tf.cast(total, tf.float32)
        weights = tf.gather(class_freq, y_)
        self.cost = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(y, y_) / (quantization_channels * weights + 1e-2))
    self.accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(y, 1), y_), tf.float32))
    opt = tf.train.AdamOptimizer(self.learning_rate)
    grads = opt.compute_gradients(self.cost)
    grads = [(tf.clip_by_value(g, -1.0, 1.0), v) for g, v in grads]
    self.train_step = opt.apply_gradients(grads, global_step=self.global_step)
Try using the tf.histogram_fixed_width function to get a distribution of class labels per batch.
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/functions_and_classes/shard4/tf.histogram_fixed_width.md
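For example, a minimal sketch of what that could look like in place of the loop (assuming y_ holds integer labels in [0, quantization_channels)):

# counts all classes in one op; replaces the for loop and tf.pack
hist = tf.histogram_fixed_width(
    tf.cast(y_, tf.float32),
    value_range=[0.0, float(quantization_channels)],
    nbins=quantization_channels,
    dtype=tf.int64)
counts = class_count + hist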
Related
I'm doing a final project for my school and need help, because I'm not sure how to execute it.
I'm investigating how image distortion affects an ANN's ability to learn. I have no background in coding whatsoever, so any help will be hugely appreciated! I can't use TensorFlow, as per the requirements.
This is what my prof used for network classification, which is standard from the textbook we're using.
import numpy
import scipy.special

# neural network class definition
class neuralNetwork:

    # initialise the neural network
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes
        # link weight matrices: wih (input -> hidden) and who (hidden -> output)
        self.wih = numpy.random.normal(0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
        self.who = numpy.random.normal(0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))
        # learning rate
        self.lr = learningrate
        # activation function is the sigmoid
        self.activation_function = lambda x: scipy.special.expit(x)

    # train the neural network
    def train(self, inputs_list, targets_list):
        inputs = numpy.array(inputs_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T
        hidden_inputs = numpy.dot(self.wih, inputs)
        hidden_outputs = self.activation_function(hidden_inputs)
        final_inputs = numpy.dot(self.who, hidden_outputs)
        final_outputs = self.activation_function(final_inputs)
        output_errors = targets - final_outputs
        hidden_errors = numpy.dot(self.who.T, output_errors)
        self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)), numpy.transpose(hidden_outputs))
        self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), numpy.transpose(inputs))

    # query the neural network
    def query(self, inputs_list):
        inputs = numpy.array(inputs_list, ndmin=2).T
        hidden_inputs = numpy.dot(self.wih, inputs)
        hidden_outputs = self.activation_function(hidden_inputs)
        final_inputs = numpy.dot(self.who, hidden_outputs)
        final_outputs = self.activation_function(final_inputs)
        return final_outputs

# number of input, hidden and output nodes
input_nodes = 784
hidden_nodes = 200
output_nodes = 10
learning_rate = 0.1

# create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

# load the mnist training data CSV file into a list
training_data_file = open("mnist_train.csv", 'r')
training_data_list = training_data_file.readlines()
training_data_file.close()

# train the neural network
epochs = 5
for e in range(epochs):
    for record in training_data_list:
        all_values = record.split(',')
        inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
        targets = numpy.zeros(output_nodes) + 0.01
        targets[int(all_values[0])] = 0.99
        n.train(inputs, targets)

# load the mnist test data CSV file into a list
test_data_file = open("mnist_test.csv", 'r')
test_data_list = test_data_file.readlines()
test_data_file.close()

# test the neural network
# scorecard for how well the network performs, initially empty
scorecard = []
for record in test_data_list:
    all_values = record.split(',')
    correct_label = int(all_values[0])
    inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
    outputs = n.query(inputs)
    label = numpy.argmax(outputs)
    if label == correct_label:
        scorecard.append(1)
    else:
        scorecard.append(0)

# calculate the performance score, the fraction of correct answers
scorecard_array = numpy.asarray(scorecard)
print("performance = ", scorecard_array.sum() / scorecard_array.size)
So I came up with a sharpening kernel, and what I want to accomplish is to apply this kernel to the training images so that the dataset the network sees is 'sharpened'. (I'm also using the convolution method given by my prof.) Where am I supposed to add this in the class above?
import matplotlib

sharpen_kernel = numpy.array([[0, -1, 0],
                              [-1, 5, -1],
                              [0, -1, 0]])

# convolve your image with the kernel
matplotlib.rcParams['figure.figsize'] = 20, 20
conv_image = numpy.ones((28, 28))
print("shape of image_array", numpy.shape(image_array))
step = 3
i = 0
while i < 25:
    j = 0
    while j < 25:
        # take a step x step patch: array[start_y:end_y, start_x:end_x]
        sub_image = image_array[i:(i + step), j:(j + step)]
        sub_image = numpy.reshape(sub_image, (1, step ** 2))
        kernel = numpy.reshape(sharpen_kernel, (step ** 2, 1))
        conv_scalar = numpy.dot(sub_image, kernel)
        conv_image[i, j] = conv_scalar
        j += 1
    i += 1  # increment at the end of the loop so row 0 is not skipped
Basically, I want to measure the relationship between the ANN's performance and the quality of the image (e.g. blur, sharpening). Help!
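One way to hook it in, as a rough sketch rather than a definitive answer: wrap the convolution loop in a helper (here a hypothetical sharpen function) and apply it to each record inside the existing training loop, before the usual rescaling and the n.train call:

# hypothetical helper wrapping the convolution loop above
def sharpen(image_array):
    conv_image = numpy.ones((28, 28))
    step = 3
    for i in range(25):
        for j in range(25):
            sub_image = numpy.reshape(image_array[i:i + step, j:j + step], (1, step ** 2))
            kernel = numpy.reshape(sharpen_kernel, (step ** 2, 1))
            conv_image[i, j] = numpy.dot(sub_image, kernel)
    return conv_image

for record in training_data_list:
    all_values = record.split(',')
    image_array = numpy.asfarray(all_values[1:]).reshape((28, 28))
    # clip so sharpened pixels stay in [0, 255] before the usual rescaling
    sharpened = numpy.clip(sharpen(image_array), 0.0, 255.0)
    inputs = (sharpened.flatten() / 255.0 * 0.99) + 0.01
    targets = numpy.zeros(output_nodes) + 0.01
    targets[int(all_values[0])] = 0.99
    n.train(inputs, targets)

The test side can stay unchanged, or be sharpened the same way, depending on which relationship you want to measure.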
My network just refuses to train. To make the code less of a hassle to read, I've abbreviated some complicated logic; I can post more if needed.
model = DistMultNN()
optimizer = optim.SGD(model.parameters(), lr=0.0001)
for t in range(500):
    e1_neg = sampling_logic()
    e2_neg = sampling_logic()
    e1_pos = sampling_logic()
    r = sampling_logic()
    e2_pos = sampling_logic()
    optimizer.zero_grad()
    y_pred = model(tuple(zip(e1_pos, r, e2_pos)), e1_neg, e2_neg)
    loss = model.loss(y_pred)
    loss.backward()
    optimizer.step()
I define my network as follows:
class DistMultNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.seed = 42
        self.entities_embedding = nn.init.xavier_uniform_(
            torch.zeros((self.NO_ENTITIES, self.ENCODING_DIM), requires_grad=True))
        self.relation_embedding = nn.init.xavier_uniform_(
            torch.zeros((self.NO_RELATIONSHIPS, self.ENCODING_DIM), requires_grad=True))
        self.W = torch.rand(self.ENCODING_DIM, self.ENCODING_DIM, requires_grad=True)  # W is symmetric, todo: requireGrad?
        self.W = (self.W + self.W.t()) / 2
        self.b = torch.rand(self.ENCODING_DIM, 1, requires_grad=True)
        self.lambda_ = 1.
        self.rnn = torch.nn.RNN(input_size=encoding_dim, hidden_size=1, num_layers=1, nonlinearity='relu')
        self.loss_func = torch.nn.LogSoftmax(dim=1)

    def loss(self, y_pred):
        softmax = -1 * self.loss_func(y_pred)
        result = torch.mean(softmax[:, 0])
        result.requires_grad = True
        return result

    def forward(self, samples, e1neg, e2neg):
        batch_size = len(samples)
        batch_result = np.zeros((batch_size, len(e1neg[0]) + 1))
        for datapoint_id in range(batch_size):
            entity_1 = entities_embed_lookup(datapoint_id[0])
            entity_2 = entities_embed_lookup(datapoint_id[2])
            r = relation_embed_lookup(datapoint_id[1])
            x = self.some_fourier_transform(entity_1, r, entity_2)
            batch_result[datapoint_id][0] = self.some_matmul(x)
            for negative_example_id in range(len(e1neg[0])):
                same_thing_with_negative_examples()
                batch_result[datapoint_id][negative_example_id + 1] = self.some_matmul(x)
        batch_result_tensor = torch.tensor(data=batch_result)
        return batch_result_tensor
I tried checking the weights in the training loop using e.g. print(model.rnn.all_weights), but they do not change. What did I do wrong?
First of all, result.requires_grad = True should not be needed, and in fact should throw an error, because result would normally not be a leaf variable.
Second, at the end of your forward you create a new tensor out of a numpy array:
batch_result_tensor = torch.tensor(data=batch_result)
and out of this result you calculate the loss and want to backpropagate. This doesn't work, because batch_result_tensor is not part of any computation graph needed to calculate a gradient. You can't just mix numpy and torch this way.
The forward function has to consist of operations on torch tensors that require grad if you want to update and optimize them. The default case is that you have layers whose weight tensors require grad; you pass an input through the layers, the computational graph is built, and all the operations are recorded in it.
So I would start by making batch_result a torch tensor, and remove batch_result_tensor = torch.tensor(data=batch_result) and result.requires_grad = True. You might have to change more.
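For example, a minimal sketch of a forward that stays inside the autograd graph; score_fn here is a hypothetical stand-in for your fourier-transform-plus-matmul scoring, assuming it is written entirely with torch ops:

def forward(self, samples, e1neg, e2neg):
    rows = []
    for i, (e1, r, e2) in enumerate(samples):
        # positive score first, then the negatives, all as torch scalars
        scores = [score_fn(e1, r, e2)]
        for j in range(len(e1neg[0])):
            scores.append(score_fn(e1neg[i][j], r, e2neg[i][j]))
        rows.append(torch.stack(scores))
    # no numpy round-trip, so gradients can flow back to the embeddings
    return torch.stack(rows)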
I want to apply the discrete wavelet transform for noise reduction to more than 50,000 data points. I am using wmulden, a MATLAB tool for wavelet transforms. Under this function, another function, wfastmcd, is called which takes only 50,000 data points at a time. It would be very helpful if anyone could suggest how to partition the data points to get the transform of the entire data set, or point to any other MATLAB tool available for this kind of calculation.
I've used a for loop to solve that one.
First of all, I calculated how many "steps" I needed to take over my signal, using a fixed-size window of 50,000:
MAX_SAMPLES = 50000;
% mySignalSize is the size of my samples vector.
steps = ceil(mySignalSize/MAX_SAMPLES);
After that, I applied the wmulden function "steps" times, checking each time that the current step does not run past the end of the original signal vector, like the following:
% Wavelet fields
level = 5;
wname = 'sym4';
tptr = 'sqtwolog';
sorh = 's';
npc_app = 'heur';
npc_fin = 'heur';

den_signal = zeros(mySignalSize, 1);

for i = 1:steps
    if (i * MAX_SAMPLES) <= mySignalSize
        x_den = wmulden(originalSignal((((i-1) * MAX_SAMPLES) + 1):(i * MAX_SAMPLES)), level, wname, npc_app, npc_fin, tptr, sorh);
        den_signal((((i-1) * MAX_SAMPLES) + 1):(i * MAX_SAMPLES)) = x_den;
    else
        old_step = (((i-1) * MAX_SAMPLES) + 1);
        new_step = mySignalSize - old_step;
        last_step = old_step + new_step;
        x_den = wmulden(originalSignal(old_step:last_step), level, wname, npc_app, npc_fin, tptr, sorh);
        den_signal(old_step:last_step) = x_den;
    end
end
That should do the trick.
How do I simulate a binomial distribution with values for an investment in two stocks, Acme and Widget?
The number of trials is 1000.
I invest in each stock for 5 years.
This is my code. What am I doing wrong?
nyears = 5;
ntrials = 1000;
startamount = 100;
yrdeposit = 50;
acme = zeros(nyears, 1);
widget = zeros(nyears, 1);
v5 = zeros(ntrials*5, 1);
v5 = zeros(ntrials*5, 1);
% market change between -5 and 1%
marketchangeacme = (-5 + (1+5) * rand(nyears, 1));
marketchangewidget = (-3 + (3+3) * rand(nyears, 1));
acme(1) = startamount;
widget(1) = startamount;
for m = 1:numTrials
    for n = 1:nyears
        acme(n) = acme(n-1) + (yrdeposit * (marketchangeacme(n)));
        widget(n) = acme(n-1) + (yrdeposit * (marketchangewidget(n)));
        vacme5(i) = acme(j);
        vwidget5(i) = widget(j);
    end
    theMean(m) = mean(1:n*nyears);
    p = 0.5 % prob neg return
    acmedrop = (marketchangeacme < p)
    widgetdrop = (marketchangewidget < p)
end
plot(mean)
Exactly what you are trying to calculate is not clear. However, some things that are obviously wrong with the code are:
widget(n) presumably isn't a function of acme(n-1) but rather widget(n-1).
Every entry of theMean will be mean(1:nyears*nyears), which for nyears = 5 will be 13. (This is because n = nyears always at that point in the code.)
The probability of a negative return for acme is 5/6, not 0.5.
To find the locations of the negative returns you want acmedrop = (marketchangeacme < 0);, not < 0.5 (nor any other probability). Similarly for widgetdrop.
You are not preallocating vacme5 or vwidget5 (but you do preallocate v5 twice, and then never use it).
You don't create a variable called mean (and you never should), so plot(mean) will not work.
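For illustration, a rough numpy sketch of the simulation with those fixes applied (Python rather than MATLAB, and the balance update is only my reading of the intended logic):

import numpy as np

nyears, ntrials = 5, 1000
startamount, yrdeposit = 100, 50

rng = np.random.default_rng()
final_acme = np.zeros(ntrials)
final_widget = np.zeros(ntrials)
for m in range(ntrials):
    # fresh market draws per trial: acme in [-5, 1), widget in [-3, 3)
    change_acme = rng.uniform(-5, 1, nyears)
    change_widget = rng.uniform(-3, 3, nyears)
    acme, widget = startamount, startamount
    for n in range(nyears):
        acme += yrdeposit * change_acme[n]
        widget += yrdeposit * change_widget[n]  # widget uses its own returns
    final_acme[m] = acme
    final_widget[m] = widget

# negative-return years are found with < 0, not a probability threshold
print("mean final balances:", final_acme.mean(), final_widget.mean())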
Using Run & Time on my algorithm, I found that it is a bit slow at adding standard deviation to integers. First of all, I create the large integer matrix:
NumeroCestelli = 5;
lower_bound = 0;
upper_bound = 250;
steps = 10;
Alpha = 0.123;
livello = [lower_bound:steps:upper_bound];
L = length(livello);
[PianoSperimentale] = combinator(L, NumeroCestelli, 'c', 'r');
for i = 1:L
    PianoSperimentale(PianoSperimentale == i) = livello(i);
end
Then I add the standard deviation (sigma = alpha * mu) and the error (of a weigher) like this:
%Standard Deviation
NumeroEsperimenti = size(PianoSperimentale,1);
PesoCestelli = randn(NumeroEsperimenti,NumeroCestelli)*Alfa;
PesoCestelli = PesoCestelli.*PianoSperimentale + PianoSperimentale;
random = randn(NumeroEsperimenti,NumeroCestelli);
PesoCestelli(PesoCestelli<0) = random(PesoCestelli<0).*(Alfa.*PianoSperimentale(PesoCestelli<0) + PianoSperimentale(PesoCestelli<0));
%Error
IncertezzaCella = 0.5*10^(-6);
Incertezza = randn(NumeroEsperimenti,NumeroCestelli)*IncertezzaCella;
PesoIncertezza = PesoCestelli.*Incertezza+PesoCestelli;
PesoIncertezza = (PesoIncertezza<0).*(-PesoIncertezza)+PesoIncertezza;
Is there a faster way?
There is not enough information for me to test it, but I bet that eliminating all the duplicate calculations that you do will lead to a speedup. I have tried to remove some of them:
PesoCestelli = randn(NumeroEsperimenti,NumeroCestelli)*Alfa;
PesoCestelli = (1+PesoCestelli).*PianoSperimentale;
random = randn(NumeroEsperimenti,NumeroCestelli);
idx = PesoCestelli<0;
PesoCestelli(idx) = random(idx).*(1+Alfa).*PianoSperimentale(idx);
%Error
IncertezzaCella = 0.5*10^(-6);
Incertezza = randn(NumeroEsperimenti,NumeroCestelli)*IncertezzaCella;
PesoIncertezza = abs(PesoCestelli.*(1+Incertezza));
Note that I reduced the last two lines to a single line.
You calculate PesoCestelli<0 a number of times. You could just calculate it once and save the value. You also create a full set of random numbers, but only use a subset of them where PesoCestelli<0. You might be able to speed things up by only creating the number of random numbers you need.
It is not clear what Alfa is, but if it is a scalar, instead of
Alfa.*PianoSperimentale(PesoCestelli<0) + PianoSperimentale(PesoCestelli<0)
it might be faster to do
(1+Alfa).*PianoSperimentale(PesoCestelli<0)
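For illustration only, the same two ideas (compute the mask once, draw only the random numbers you need) in numpy; piano and alfa are hypothetical stand-ins for PianoSperimentale and a scalar Alfa:

import numpy as np

rng = np.random.default_rng()
piano = rng.integers(0, 251, size=(1000, 5)).astype(float)  # stand-in data
alfa = 0.123

peso = (1 + rng.standard_normal(piano.shape) * alfa) * piano
idx = peso < 0                     # mask computed once and reused
# draw only idx.sum() values instead of a full matrix
peso[idx] = rng.standard_normal(idx.sum()) * (1 + alfa) * piano[idx]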