TFLearn input shape error - neural-network

Error
"Cannot feed value of shape (128, 1) for Tensor 'TargetsData/Y:0',
which has shape '(?,)'".
Code
I have 4 classes and the vocabulary consists of 17355 words.
tf.reset_default_graph()
net = tflearn.input_data(shape=(None, trainX.shape[1]), name='input')
net = tflearn.fully_connected(net, 200, activation='relu')
net = tflearn.fully_connected(net, 25, activation='relu')
net = tflearn.fully_connected(net, 4, activation='softmax')
net = tflearn.regression(net, optimizer='sgd',
                         learning_rate=0.1,
                         to_one_hot=True, n_classes=4,
                         loss='categorical_crossentropy')
model = tflearn.DNN(net)
model.fit(trainX, trainY, validation_set=0.1, show_metric=True, batch_size=128, n_epoch=100)
trainX.shape = (12384, 17355), trainY.shape = (12384, 1), testX.shape = (1376, 17355), testY.shape = (1376, 1)

What causes this error?
The error "Cannot feed value of shape … for Tensor 'TargetsData/Y:0', which has shape …" is mainly caused by the shape of trainY being different from the placeholder shape of the estimator (regression) layer.
Why?
In your case, the main problem is that the shape of trainY is (?, 1), which is a 2D tensor, but the shape of the placeholder is (?,), which is a 1D tensor. That is why we get this error.
How to solve it?
Reshape trainY to a 1D tensor. Because you set to_one_hot=True in the regression layer, the placeholder is a 1D tensor containing the class indices. For details, you can check the source code of the regression layer in tflearn/tflearn/layers/estimator.py:
with tf.name_scope(pscope):
    p_shape = [None] if to_one_hot else input_shape
    placeholder = tf.placeholder(shape=p_shape, dtype=dtype, name="Y")
So we need to reshape trainY from (12384, 1) to (12384,) before feeding it to the model.
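For example, a minimal sketch of that reshape (assuming trainY and testY are NumPy arrays):
import numpy as np

# Flatten the labels from (N, 1) to (N,) so they match the 1D placeholder of class indices
trainY = np.reshape(trainY, (-1,))
testY = np.reshape(testY, (-1,))
After this, model.fit(trainX, trainY, ...) should no longer raise the shape error.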

Related

How to build a recurrent neural net in Keras where each input goes through a layer first?

I'm trying to build a neural net in Keras that would look like this:
Here x_1, x_2, ... are input vectors that undergo the same transformation f. f is itself a layer whose parameters must be learned. The sequence length n is variable across instances.
I'm having trouble understanding two things here:
What should the input look like?
I'm thinking of a 2D tensor with shape (number_of_x_inputs, x_dimension), where x_dimension is the length of a single vector x. Can such a 2D tensor have a variable shape? I know tensors can have variable shapes for batch processing, but I don't know if that helps me here.
How do I pass each input vector through the same transformation before feeding it to the RNN layer?
Is there a way to extend, for example, a GRU so that an f layer is applied before going through the actual GRU cell?
I'm not an expert, but I hope this helps.
Question 1:
Vectors x1, x2, ..., xn can have different shapes, but I'm not sure whether different instances of x1 can. When shapes differ, I usually pad the shorter sequences with 0s.
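For instance, a minimal padding sketch using Keras's pad_sequences (the sequences list here is purely illustrative):
from keras.preprocessing.sequence import pad_sequences

# Hypothetical variable-length sequences
sequences = [[1, 2, 3], [4, 5], [6]]
# Pad with zeros at the end so every sequence has the same length
padded = pad_sequences(sequences, padding='post', value=0)
print(padded.shape)  # (3, 3)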
Question 2:
I'm not sure about extending a GRU, but I would do something like this:
from keras.layers import Input, Conv1D, LSTM, Dense, concatenate
from keras.models import Model

x_dims = [50, 40, 30, 20, 10]
n = 5

def network():
    # f: a shared transformation applied to every input
    shared_f = Conv1D(5, 3, activation='relu')
    shared_LSTM = LSTM(10)
    inputs = []
    to_concat = []
    for i in range(n):
        x_i = Input(shape=(x_dims[i], 1), name='x_' + str(i))
        inputs.append(x_i)
        step1 = shared_f(x_i)
        to_concat.append(shared_LSTM(step1))
    merged = concatenate(to_concat)
    final = Dense(2, activation='softmax')(merged)
    model = Model(inputs=inputs, outputs=[final])
    # model = Model(inputs=[sequence], outputs=[part1])
    model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
    return model

m = network()
In this example, I used a Conv1D as the shared f transformation, but you could use something else (Embedding, etc.).
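A quick usage sketch with hypothetical random data shaped to match each input branch:
import numpy as np

batch = 8  # illustrative batch size
X = [np.random.rand(batch, d, 1) for d in x_dims]
y = np.random.rand(batch, 2)  # dummy targets for the 2-unit output
m.fit(X, y, epochs=1)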

Tensorflow: modify shape of placeholder when retrieving metagraph

I trained a recurrent neural network (LSTM) and saved the weights and the metagraph. When I retrieve the metagraph for prediction, everything works perfectly as long as the sequence length is the same as during the training.
One of the benefits of LSTM is that the sequence length of the inputs can vary (for example, if inputs are letters forming a sentence, the length of the sentences can vary).
How can I change the sequence length of the inputs when retrieving the graph from a metagraph?
More details with code:
During training, I use placeholders x and y to feed the data. For prediction, I retrieve these placeholders but cannot manage to change their shape (from [None, previous_sequence_length=100, n_input], to [None, new_sequence_length=50, n_input]).
In the file model.py, defining the architecture and placeholders:
self.x = tf.placeholder("float32", [None, self.n_steps, self.n_input], name='x_input')
self.y = tf.placeholder("float32", [None, self.n_classes], name='y_labels')
tf.add_to_collection('x', self.x)
tf.add_to_collection('y', self.y)
...
def build_model(self):
    # using the placeholder self.x to build the model
    ...
    tf.split(0, self.n_input, self.x)  # split input for RNN cell
    ...
In the file prediction.py where I retrieve the metagraph for prediction:
with tf.Session() as sess:
    latest_checkpoint = tf.train.latest_checkpoint(checkpoint_dir=checkpoint_dir)
    new_saver = tf.train.import_meta_graph(latest_checkpoint + '.meta')
    new_saver.restore(sess, latest_checkpoint)
    x = tf.get_collection('x')[0]
    y = tf.get_collection('y')[0]
    ...
    sess.run(..., feed_dict={x: batch_x})
Here is the error I get:
ValueError: Cannot feed value of shape (128, 50, 2) for Tensor u'placeholders/x_input:0', which has shape '(?, 100, 2)'
NOTE: I manage to solve this problem when not using metagraph but rather reconstructing the model from scratch and loading only the saved weights (and not the metagraph).
EDIT: when replacing self.n_steps with None and modifying tf.split(0, self.n_input, self.x) with tf.split(0, self.x.get_shape()[1], self.x) I get the following error: TypeError: Expected int for argument 'num_split' not Dimension(None).
When you define your variable, I suggest writing it as follows:
[None, None, n_input]
instead of:
[None, new_sequence_length=50, n_input]
That works in my case. I hope it helps.
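A minimal sketch of how this typically fits together in a TF1-style graph (the cell and dimension values here are illustrative; tf.nn.dynamic_rnn handles whatever sequence length is fed at run time):
import tensorflow as tf

n_input = 2
n_hidden = 64  # illustrative hidden size

# Leave the time dimension as None so any sequence length can be fed
x = tf.placeholder(tf.float32, [None, None, n_input], name='x_input')
cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
outputs, state = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)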

Selectively zero weights in TensorFlow?

Let's say I have an NxM weight variable weights and a constant NxM matrix of 1s and 0s mask.
If a layer of my network is defined like this (with other layers similarly defined):
masked_weights = mask * weights
layer1 = tf.nn.relu(tf.matmul(layer0, masked_weights) + biases1)
Will this network behave as if the corresponding 0s in mask are zeros in weights during training? (i.e. as if the connections represented by those weights had been removed from the network entirely)?
If not, how can I achieve this goal in TensorFlow?
The answer is yes. The experiment below demonstrates it.
The implementation is:
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(None, 3))
weights = tf.get_variable("weights", [3, 2])
bias = tf.get_variable("bias", [2])
mask = tf.constant(np.asarray([[0, 1], [1, 0], [0, 1]], dtype=np.float32))  # constant mask
masked_weights = tf.multiply(weights, mask)
y = tf.nn.relu(tf.nn.bias_add(tf.matmul(x, masked_weights), bias))
loss = tf.losses.mean_squared_error(tf.constant(np.asarray([[1, 1]], dtype=np.float32)), y)
weights_grad = tf.gradients(loss, weights)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
print("Masked weights=\n", sess.run(masked_weights))
data = np.random.rand(1, 3)
print("Gradient of weights\n=", sess.run(weights_grad, feed_dict={x: data}))
sess.close()
After running the code above, you will see that the gradients are masked as well. In my example, they are:
Gradient of weights
= [array([[ 0.        , -0.40866762],
          [ 0.34265977, -0.        ],
          [ 0.        , -0.35294518]], dtype=float32)]
The answer is yes, and the reason lies in backpropagation, as explained below.
mask_w = mask * w
del(mask_w) = mask * del(w)
The mask makes the gradient 0 wherever its value is zero; wherever its value is 1, the gradient flows as before. This is a common trick used in seq2seq prediction to mask the variable-size outputs in the decoding layer. You can read more about this here.
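To make this concrete, here is a small hedged extension of the earlier experiment (inserted before sess.close(); mask_np is simply the NumPy array used to build the mask constant), showing that a gradient-descent step leaves the weights at masked positions unchanged:
mask_np = np.asarray([[0, 1], [1, 0], [0, 1]], dtype=np.float32)
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
before = sess.run(weights)
sess.run(train_op, feed_dict={x: data})
after = sess.run(weights)
# Entries where mask is 0 get zero gradient, so they are not updated
print(np.allclose(before[mask_np == 0], after[mask_np == 0]))  # expected: True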

Tensorflow, how to chain GRU layers

Right now I am trying to chain multiple GRU recurrent layers to each other in TensorFlow. I am getting the following error.
ValueError: Variable GRUCell/Gates/Linear/Matrix already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:
File "/home/chase/workspace/SentenceEncoder/sent_enc.py", line 42, in <module>
output, states[i] = grus[i](output, states[i])
Here is my code.
x = tf.placeholder(tf.float32, (batch_size, time_steps, vlen), 'x')
y_exp = tf.placeholder(tf.float32, (batch_size, time_steps, vlen), 'y_exp')
with tf.name_scope('encoder'):
    gru_sizes = (128, 256, 512)
    grus = [tf.nn.rnn_cell.GRUCell(sz) for sz in gru_sizes]
    states = [tf.zeros((batch_size, g.state_size)) for g in grus]
    for t in range(time_steps):
        output = tf.reshape(x[:, t, :], (batch_size, vlen))
        for i in range(len(grus)):
            output, states[i] = grus[i](output, states[i])
I am aware that tensorflow provides a MultiRNNCell for doing this but I kind of wanted to figure it out for myself.
I managed to fix it. I needed to add a different variable scope for each of the layers. I also needed to reuse the variables after the first time step.
x = tf.placeholder(tf.float32, (batch_size, time_steps, vlen), 'x')
y_exp = tf.placeholder(tf.float32, (batch_size, time_steps, vlen), 'y_exp')
with tf.name_scope('encoder'):
    gru_sizes = (128, 256, 512)
    grus = [tf.nn.rnn_cell.GRUCell(sz) for sz in gru_sizes]
    states = [tf.zeros((batch_size, g.state_size)) for g in grus]
    for t in range(time_steps):
        output = tf.reshape(x[:, t, :], (batch_size, vlen))
        for i in range(len(grus)):
            with tf.variable_scope('gru_' + str(i), reuse=t > 0):
                output, states[i] = grus[i](output, states[i])
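For comparison, a minimal sketch of the MultiRNNCell route mentioned in the question (TF1-style API; batch_size, time_steps, and vlen are illustrative values matching the placeholders above):
import tensorflow as tf

batch_size, time_steps, vlen = 32, 20, 100
gru_sizes = (128, 256, 512)

x = tf.placeholder(tf.float32, (batch_size, time_steps, vlen), 'x')
cells = [tf.nn.rnn_cell.GRUCell(sz) for sz in gru_sizes]
stacked = tf.nn.rnn_cell.MultiRNNCell(cells)
# dynamic_rnn handles the per-time-step loop and variable scoping internally
outputs, final_states = tf.nn.dynamic_rnn(stacked, x, dtype=tf.float32)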

8-tap daubechies wavelet decomposition in matlab

I have code to implement an 8-tap Daubechies wavelet decomposition. First I decompose into 4 levels and then reconstruct the original image from the coefficients. The code is given below:
Im=imread('me.jpg');
% Perform wavelet transform using the Daubechies 'db8' filter bank
[LL1,HL1,LH1,HH1] = dwt2(Im,'db8');
[LL2,HL2,LH2,HH2] = dwt2(LL1,'db8');
[LL3,HL3,LH3,HH3] = dwt2(LL2,'db8');
[LL4,HL4,LH4,HH4] = dwt2(LL3,'db8');
% inverse wavelet transform
[LL3] = idwt2(LL4, HL4, LH4, HH4,'db8');
[LL2] = idwt2(LL3, HL3, LH3, HH3,'db8');
[LL1] = idwt2(LL2, HL2, LH2, HH2,'db8');
[reconstructed] = idwt2(LL1, HL1, LH1, HH1,'db8');
Using the above code I get the following error message:
??? Array dimensions must match for binary array op.
Error in ==> idwt2 at 93
x = upsconv2(a,{Lo_R,Lo_R},sx,dwtEXTM,shift)+ ... % Approximation.
Error in ==> NoiseExtract_2 at 55
[LL2] = idwt2(LL3, HL3, LH3, HH3,'db8');
At the inverse transform step the sizes of LL3, LL2, and LL1 are changed. How can I solve this problem?