This code runs without error on my computer, but when I move all the files to a Raspberry Pi and run it there, I get the error below. Where is the problem and how do I fix it?
OSError: SavedModel file does not exist at: models/mask_detector.model/{saved_model.pbtxt|saved_model.pb}
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.models import load_model
import numpy as np
import cv2
facenet = cv2.dnn.readNet('models/deploy.prototxt', 'models/res10_300x300_ssd_iter_140000.caffemodel')
model = load_model('models/mask_detector.model')
cap = cv2.VideoCapture(0)
i = 0
while cap.isOpened():
    ret, img = cap.read()
    if not ret:
        break

    h, w = img.shape[:2]

    blob = cv2.dnn.blobFromImage(img, scalefactor=1., size=(300, 300), mean=(104., 177., 123.))
    facenet.setInput(blob)
    dets = facenet.forward()

    result_img = img.copy()

    for i in range(dets.shape[2]):
        confidence = dets[0, 0, i, 2]
        if confidence < 0.5:
            continue

        x1 = int(dets[0, 0, i, 3] * w)
        y1 = int(dets[0, 0, i, 4] * h)
        x2 = int(dets[0, 0, i, 5] * w)
        y2 = int(dets[0, 0, i, 6] * h)

        face = img[y1:y2, x1:x2]

        face_input = cv2.resize(face, dsize=(224, 224))
        face_input = cv2.cvtColor(face_input, cv2.COLOR_BGR2RGB)
        face_input = preprocess_input(face_input)
        face_input = np.expand_dims(face_input, axis=0)

        mask, nomask = model.predict(face_input).squeeze()

        if mask > nomask:
            color = (0, 255, 0)
            label = 'Mask %d%%' % (mask * 100)
        else:
            color = (0, 0, 255)
            label = 'No Mask %d%%' % (nomask * 100)

        cv2.rectangle(result_img, pt1=(x1, y1), pt2=(x2, y2), thickness=2, color=color, lineType=cv2.LINE_AA)
        cv2.putText(result_img, text=label, org=(x1, y1 - 10), fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=0.8,
                    color=color, thickness=2, lineType=cv2.LINE_AA)

    cv2.imshow('img', result_img)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
I understand that:
atan2(vector.y, vector.x) = the angle between the vector and the X axis.
But I wanted to know how to get the angle between two vectors using atan2. So I came across this solution:
atan2(vector1.y - vector2.y, vector1.x - vector2.x)
My question is very simple:
Will the two following formulas produce the same number?
atan2(vector1.y - vector2.y, vector1.x - vector2.x)
atan2(vector2.y - vector1.y, vector2.x - vector1.x)
If not: How do I know what vector comes first in the subtractions?
atan2(vector1.y - vector2.y, vector1.x - vector2.x)
is the angle between the difference vector (connecting vector2 and vector1) and the x-axis,
which is probably not what you meant.
The (directed) angle from vector1 to vector2 can be computed as
angle = atan2(vector2.y, vector2.x) - atan2(vector1.y, vector1.x);
and you may want to normalize it to the range [0, 2 π):
if (angle < 0) { angle += 2 * M_PI; }
or to the range (-π, π]:
if (angle > M_PI) { angle -= 2 * M_PI; }
else if (angle <= -M_PI) { angle += 2 * M_PI; }
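For example, here is a minimal Python sketch of the same approach (math.atan2 already returns values in (-π, π], so only the normalization shown above is needed):

import math

def directed_angle(v1, v2):
    # directed angle from v1 to v2, normalized to [0, 2*pi)
    angle = math.atan2(v2[1], v2[0]) - math.atan2(v1[1], v1[0])
    if angle < 0:
        angle += 2 * math.pi
    return angle

print(directed_angle((1, 0), (0, 1)))  # pi/2
print(directed_angle((0, 1), (1, 0)))  # 3*pi/2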
A robust way to do it is by finding the sine of the angle using the cross product, and the cosine of the angle using the dot product and combining the two with the Atan2() function.
In C# this is:
public struct Vector2
{
    public double X, Y;

    /// <summary>
    /// Returns the angle between two vectors
    /// </summary>
    public static double GetAngle(Vector2 A, Vector2 B)
    {
        // |A·B| = |A| |B| COS(θ)
        // |A×B| = |A| |B| SIN(θ)
        return Math.Atan2(Cross(A, B), Dot(A, B));
    }

    public double Magnitude { get { return Math.Sqrt(Dot(this, this)); } }

    public static double Dot(Vector2 A, Vector2 B)
    {
        return A.X * B.X + A.Y * B.Y;
    }

    public static double Cross(Vector2 A, Vector2 B)
    {
        return A.X * B.Y - A.Y * B.X;
    }
}

class Program
{
    static void Main(string[] args)
    {
        Vector2 A = new Vector2() { X = 5.45, Y = 1.12 };
        Vector2 B = new Vector2() { X = -3.86, Y = 4.32 };
        double angle = Vector2.GetAngle(A, B) * 180 / Math.PI;
        // angle = 120.16850967865749
    }
}
See the test case above in GeoGebra.
I think a better formula was posted here:
http://www.mathworks.com/matlabcentral/answers/16243-angle-between-two-vectors-in-3d
angle = atan2(norm(cross(a,b)), dot(a,b))
This formula works in two or three dimensions; for two dimensions it simplifies to the one stated above.
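As an illustration, here is that formula transcribed into NumPy (the function and its name are mine, not from the linked page; 2-D vectors can be padded with z = 0):

import numpy as np

def angle_between(a, b):
    # unsigned angle between two 3-D vectors, in radians
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.arctan2(np.linalg.norm(np.cross(a, b)), np.dot(a, b))

print(np.degrees(angle_between([5.45, 1.12, 0], [-3.86, 4.32, 0])))  # ~120.17, matching the C# example above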
Nobody has pointed out that if you have a single vector and want to find its angle from the X axis, you can take advantage of the fact that the two arguments to atan2() are really the rise and run of the line, i.e. its slope (delta Y / delta X). So if you know the slope, you can do the following:
given:
A = angle of the vector/line you wish to determine (from the X axis).
m = signed slope of the vector/line.
then:
A = atan2(m, 1)
Very useful!
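A quick Python check of this identity (for any finite slope m, atan2(m, 1) agrees with atan(m)):

import math

for m in (-10.0, -1.0, 0.0, 0.5, 3.0):
    assert math.isclose(math.atan2(m, 1), math.atan(m))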
If you care about accuracy for small angles, you want to use this:
angle = 2*atan2(|| ||b||a - ||a||b ||, || ||b||a + ||a||b ||)
Where "||" means absolute value, AKA "length of the vector". See https://math.stackexchange.com/questions/1143354/numerically-stable-method-for-angle-between-3d-vectors/1782769
However, that has the downside that in two dimensions, it loses the sign of the angle.
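Here is a sketch of that formula in NumPy (a direct transcription of the expression above, returning the unsigned angle):

import numpy as np

def stable_angle(a, b):
    # numerically stable unsigned angle between vectors a and b
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return 2.0 * np.arctan2(np.linalg.norm(nb * a - na * b),
                            np.linalg.norm(nb * a + na * b))

print(stable_angle([1.0, 0.0], [1.0, 1e-12]))  # ~1e-12, where an acos-based formula would round to 0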
As a complement to the answer of @martin-r, one should note that it is possible to use the sum/difference formula for the arctangent:
angle = atan2(vec2.y, vec2.x) - atan2(vec1.y, vec1.x);
angle = atan2(vec1.x * vec2.y - vec1.y * vec2.x, dot(vec1, vec2))
where dot = vec1.x * vec2.x + vec1.y * vec2.y
Caveat 1: make sure the angle remains within -pi ... +pi
Caveat 2: beware when the vectors are getting very similar, you might get extinction in the first argument, leading to numerical inaccuracies
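A short Python check of the identity (the atan2(cross, dot) form lands in (-π, π] automatically, while the difference form needs the wrap-around from Caveat 1):

import math

def angle_diff(v1, v2):
    a = math.atan2(v2[1], v2[0]) - math.atan2(v1[1], v1[0])
    if a > math.pi:          # wrap into (-pi, pi] (Caveat 1)
        a -= 2 * math.pi
    elif a <= -math.pi:
        a += 2 * math.pi
    return a

def angle_cross_dot(v1, v2):
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.atan2(cross, dot)

v1, v2 = (1.0, 0.0), (-0.5, -0.5)
assert math.isclose(angle_diff(v1, v2), angle_cross_dot(v1, v2))  # both give -3*pi/4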
You don't have to use atan2 to calculate the angle between two vectors. If you just want the quickest way, you can use dot(v1, v2)=|v1|*|v2|*cos A
to get
A = Math.acos( dot(v1, v2)/(v1.length()*v2.length()) );
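One caveat: floating-point rounding can push the acos argument slightly outside [-1, 1] for (nearly) parallel vectors, producing a NaN, so a defensive sketch would clamp it first (the helper below is illustrative, not from the answer above):

import math

def angle_acos(v1, v2):
    d = v1[0] * v2[0] + v1[1] * v2[1]
    n = math.hypot(v1[0], v1[1]) * math.hypot(v2[0], v2[1])
    # clamp to guard against rounding such as d/n = 1.0000000000000002
    return math.acos(max(-1.0, min(1.0, d / n)))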
angle(vector.b,vector.a)=pi/2*((1+sgn(xa))*(1-sgn(ya^2))-(1+sgn(xb))*(1-sgn(yb^2)))
+pi/4*((2+sgn(xa))*sgn(ya)-(2+sgn(xb))*sgn(yb))
+sgn(xa*ya)*atan((abs(xa)-abs(ya))/(abs(xa)+abs(ya)))
-sgn(xb*yb)*atan((abs(xb)-abs(yb))/(abs(xb)+abs(yb)))
xb,yb and xa,ya are the coordinates of the two vectors
The formula angle(vector.b, vector.a) above gives results in all four quadrants and for any coordinates xa, ya and xb, yb. For xa = ya = 0 and/or xb = yb = 0 it is undefined. The angle can be bigger or smaller than pi, and can be positive or negative.
Here is a little program in Python that uses the angle between vectors to determine if a point is inside or outside a certain polygon:
import sys
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from shapely.geometry import Point, Polygon
from pprint import pprint
# Plot variables
x_min, x_max = -6, 12
y_min, y_max = -3, 8
tick_interval = 1
FIG_SIZE = (10, 10)
DELTA_ERROR = 0.00001
IN_BOX_COLOR = 'yellow'
OUT_BOX_COLOR = 'black'
def angle_between(v1, v2):
    """ Returns the angle in radians between vectors 'v1' and 'v2'
        The sign of the angle is dependent on the order of v1 and v2
        so acos(norm(dot(v1, v2))) does not work and atan2 has to be used, see:
        https://stackoverflow.com/questions/21483999/using-atan2-to-find-angle-between-two-vectors
    """
    arg1 = np.cross(v1, v2)
    arg2 = np.dot(v1, v2)
    angle = np.arctan2(arg1, arg2)
    return angle

def point_inside(point, border):
    """ Returns True if point is inside border polygon and False if not
        Arguments:
        :point: x, y in shapely.geometry.Point type
        :border: [x1 y1, x2 y2, ... , xn yn] in shapely.geometry.Polygon type
    """
    assert len(border.exterior.coords) > 2, \
        'number of points in the polygon must be > 2'
    point = np.array(point)
    side1 = np.array(border.exterior.coords[0]) - point
    sum_angles = 0
    for border_point in border.exterior.coords[1:]:
        side2 = np.array(border_point) - point
        angle = angle_between(side1, side2)
        sum_angles += angle
        side1 = side2
    # if wn is 1 then the point is inside
    wn = sum_angles / 2 / np.pi
    if abs(wn - 1) < DELTA_ERROR:
        return True
    else:
        return False
class MainMap():

    @classmethod
    def settings(cls, fig_size):
        # set the plot outline, including axes going through the origin
        cls.fig, cls.ax = plt.subplots(figsize=fig_size)
        cls.ax.set_xlim(x_min, x_max)
        cls.ax.set_ylim(y_min, y_max)
        cls.ax.set_aspect(1)
        tick_range_x = np.arange(round(x_min + (10*(x_max - x_min) % tick_interval)/10, 1),
                                 x_max + 0.1, step=tick_interval)
        tick_range_y = np.arange(round(y_min + (10*(y_max - y_min) % tick_interval)/10, 1),
                                 y_max + 0.1, step=tick_interval)
        cls.ax.set_xticks(tick_range_x)
        cls.ax.set_yticks(tick_range_y)
        cls.ax.tick_params(axis='both', which='major', labelsize=6)
        cls.ax.spines['left'].set_position('zero')
        cls.ax.spines['right'].set_color('none')
        cls.ax.spines['bottom'].set_position('zero')
        cls.ax.spines['top'].set_color('none')

    @classmethod
    def get_ax(cls):
        return cls.ax

    @staticmethod
    def plot():
        plt.tight_layout()
        plt.show()
class PlotPointandRectangle(MainMap):

    def __init__(self, start_point, rectangle_polygon, tolerance=0):
        self.current_object = None
        self.currently_dragging = False
        self.fig.canvas.mpl_connect('key_press_event', self.on_key)
        self.plot_types = ['o', 'o-']
        self.plot_type = 1
        self.rectangle = rectangle_polygon

        # define a point that can be moved around
        self.point = patches.Circle((start_point.x, start_point.y), 0.10,
                                    alpha=1)
        if point_inside(start_point, self.rectangle):
            _color = IN_BOX_COLOR
        else:
            _color = OUT_BOX_COLOR
        self.point.set_color(_color)
        self.ax.add_patch(self.point)
        self.point.set_picker(tolerance)
        cv_point = self.point.figure.canvas
        cv_point.mpl_connect('button_release_event', self.on_release)
        cv_point.mpl_connect('pick_event', self.on_pick)
        cv_point.mpl_connect('motion_notify_event', self.on_motion)

        self.plot_rectangle()

    def plot_rectangle(self):
        x = [point[0] for point in self.rectangle.exterior.coords]
        y = [point[1] for point in self.rectangle.exterior.coords]
        # y = self.rectangle.y
        self.rectangle_plot, = self.ax.plot(x, y,
            self.plot_types[self.plot_type], color='r', lw=0.4, markersize=2)

    def on_release(self, event):
        self.current_object = None
        self.currently_dragging = False

    def on_pick(self, event):
        self.currently_dragging = True
        self.current_object = event.artist

    def on_motion(self, event):
        if not self.currently_dragging:
            return
        if self.current_object is None:
            return
        point = Point(event.xdata, event.ydata)
        self.current_object.center = point.x, point.y
        if point_inside(point, self.rectangle):
            _color = IN_BOX_COLOR
        else:
            _color = OUT_BOX_COLOR
        self.current_object.set_color(_color)
        self.point.figure.canvas.draw()

    def remove_rectangle_from_plot(self):
        try:
            self.rectangle_plot.remove()
        except ValueError:
            pass

    def on_key(self, event):
        # with 'space' toggle between just points or points connected with
        # lines
        if event.key == ' ':
            self.plot_type = (self.plot_type + 1) % 2
            self.remove_rectangle_from_plot()
            self.plot_rectangle()
            self.point.figure.canvas.draw()
def main(start_point, rectangle):
    MainMap.settings(FIG_SIZE)
    plt_me = PlotPointandRectangle(start_point, rectangle)  #pylint: disable=unused-variable
    MainMap.plot()

if __name__ == "__main__":
    try:
        start_point = Point([float(val) for val in sys.argv[1].split()])
    except IndexError:
        start_point = Point(0, 0)
    border_points = [(-2, -2),
                     (1, 1),
                     (3, -1),
                     (3, 3.5),
                     (4, 1),
                     (5, 1),
                     (4, 3.5),
                     (5, 6),
                     (3, 4),
                     (3, 5),
                     (-0.5, 1),
                     (-3, 1),
                     (-1, -0.5),
                     ]
    border_points_polygon = Polygon(border_points)
    main(start_point, border_points_polygon)
I would like to know the difference between batch normalization and self-normalizing neural networks. In other words, would SELU (Scaled Exponential Linear Unit) replace batch normalization, and how?
Moreover, after looking into the values of the SELU activations, I found they were in the range [-1, 1]. This is not the case with batch normalization: the values after the BN layer (before the ReLU activation) took values in approximately [-a, a], not [-1, 1].
Here is how I printed the values after the SELU activation and after the batch norm layer:
batch_norm_layer = tf.Print(batch_norm_layer,
                            data=[tf.reduce_max(batch_norm_layer), tf.reduce_min(batch_norm_layer)],
                            message=name_scope + ' min and max')
And similar code for the SELU activations...
Batch norm layer is defined as follows:
def batch_norm(x, n_out, phase_train, in_conv_layer=True):
    with tf.variable_scope('bn'):
        beta = tf.Variable(tf.constant(0.0, shape=n_out),
                           name='beta', trainable=True)
        gamma = tf.Variable(tf.constant(1.0, shape=n_out),
                            name='gamma', trainable=True)
        if in_conv_layer:
            batch_mean, batch_var = tf.nn.moments(x, [0, 1, 2], name='moments')
        else:
            batch_mean, batch_var = tf.nn.moments(x, [0, 1], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=0.9999)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        mean, var = tf.cond(phase_train,
                            mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, 1e-3)
    return normed
Therefore, since batch norm outputs higher values, the loss increases dramatically, and thus I get NaNs.
In addition, I tried reducing the learning rate with batch norm, but that didn't help either. So how can I fix this problem?
Here is the full code:
import tensorflow as tf
import numpy as np
import os
import cv2
batch_size = 32
num_epoch = 102
latent_dim = 100
def weight_variable(kernal_shape):
    weights = tf.get_variable(name='weights', shape=kernal_shape, dtype=tf.float32, trainable=True,
                              initializer=tf.truncated_normal_initializer(stddev=0.02))
    return weights

def bias_variable(shape):
    initial = tf.constant(0.0, shape=shape)
    return tf.Variable(initial)
def batch_norm(x, n_out, phase_train, convolutional=True):
    with tf.variable_scope('bn'):
        exp_moving_avg = tf.train.ExponentialMovingAverage(decay=0.9999)
        beta = tf.Variable(tf.constant(0.0, shape=n_out),
                           name='beta', trainable=True)
        gamma = tf.Variable(tf.constant(1.0, shape=n_out),
                            name='gamma', trainable=True)
        if convolutional:
            batch_mean, batch_var = tf.nn.moments(x, [0, 1, 2], name='moments')
        else:
            batch_mean, batch_var = tf.nn.moments(x, [0], name='moments')
        update_moving_averages = exp_moving_avg.apply([batch_mean, batch_var])
        m = tf.cond(phase_train, lambda: exp_moving_avg.average(batch_mean), lambda: batch_mean)
        v = tf.cond(phase_train, lambda: exp_moving_avg.average(batch_var), lambda: batch_var)
        normed = tf.nn.batch_normalization(x, m, v, beta, gamma, 1e-3)
        normed = tf.Print(normed, data=[tf.shape(normed)], message='size of normed?')
    return normed, update_moving_averages  # Note that we should run the update_moving_averages with sess.run...
def conv_layer(x, w_shape, b_shape, padding='SAME'):
    W = weight_variable(w_shape)
    tf.summary.histogram("weights", W)
    b = bias_variable(b_shape)
    tf.summary.histogram("biases", b)

    # Note that I used a stride of 2 on purpose in order not to use max pool layer.
    conv = tf.nn.conv2d(x, W, strides=[1, 2, 2, 1], padding=padding) + b
    conv_batch_norm, update_moving_averages = batch_norm(conv, b_shape, phase_train=tf.cast(True, tf.bool))

    name_scope = tf.get_variable_scope().name
    conv_batch_norm = tf.Print(conv_batch_norm,
                               data=[tf.reduce_max(conv_batch_norm), tf.reduce_min(conv_batch_norm)],
                               message=name_scope + ' min and max')

    activations = tf.nn.relu(conv_batch_norm)
    tf.summary.histogram("activations", activations)
    return activations, update_moving_averages
def deconv_layer(x, w_shape, b_shape, padding="SAME", activation='selu'):
    W = weight_variable(w_shape)
    tf.summary.histogram("weights", W)
    b = bias_variable(b_shape)
    tf.summary.histogram('biases', b)

    x_shape = tf.shape(x)
    out_shape = tf.stack([x_shape[0], x_shape[1] * 2, x_shape[2] * 2, w_shape[2]])

    if activation == 'selu':
        conv_trans = tf.nn.conv2d_transpose(x, W, out_shape, [1, 2, 2, 1], padding=padding) + b
        conv_trans_batch_norm, update_moving_averages = \
            batch_norm(conv_trans, b_shape, phase_train=tf.cast(True, tf.bool))
        transposed_activations = tf.nn.relu(conv_trans_batch_norm)
    else:
        conv_trans = tf.nn.conv2d_transpose(x, W, out_shape, [1, 2, 2, 1], padding=padding) + b
        conv_trans_batch_norm, update_moving_averages = \
            batch_norm(conv_trans, b_shape, phase_train=tf.cast(True, tf.bool))
        transposed_activations = tf.nn.sigmoid(conv_trans_batch_norm)

    tf.summary.histogram("transpose_activation", transposed_activations)
    return transposed_activations, update_moving_averages
tfrecords_filename_seq = ["C:/Users/user/PycharmProjects/AffectiveComputing/P16_db.tfrecords"]
filename_queue = tf.train.string_input_producer(tfrecords_filename_seq, num_epochs=num_epoch, shuffle=False, name='queue')

reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
    serialized_example,
    # Defaults are not specified since both keys are required.
    features={
        'height': tf.FixedLenFeature([], tf.int64),
        'width': tf.FixedLenFeature([], tf.int64),
        'image_raw': tf.FixedLenFeature([], tf.string),
        'annotation_raw': tf.FixedLenFeature([], tf.string)
    })

# This is how we create one example, that is, extract one example from the database.
image = tf.decode_raw(features['image_raw'], tf.uint8)
# The height and the width are used to reshape the image below.
height = tf.cast(features['height'], tf.int32)
width = tf.cast(features['width'], tf.int32)

# The image is reshaped since, when stored in binary format, it is flattened. Therefore, we need the
# height and the width to restore the original image.
image = tf.reshape(image, [height, width, 3])

annotation = tf.cast(features['annotation_raw'], tf.string)

min_after_dequeue = 100
num_threads = 1
capacity = min_after_dequeue + num_threads * batch_size
label_batch, images_batch = tf.train.batch([annotation, image],
                                           shapes=[[], [112, 112, 3]],
                                           batch_size=batch_size,
                                           capacity=capacity,
                                           num_threads=num_threads)

label_batch_splitted = tf.string_split(label_batch, delimiter=',')
label_batch_values = tf.reshape(label_batch_splitted.values, [batch_size, -1])
label_batch_numbers = tf.string_to_number(label_batch_values, out_type=tf.float32)
confidences = tf.slice(label_batch_numbers, begin=[0, 2], size=[-1, 1])

images_batch = tf.cast([images_batch], tf.float32)[0]  # Note that casting the image will increase its rank.
with tf.name_scope('image_normal'):
    images_batch = tf.map_fn(lambda img: tf.image.per_image_standardization(img), images_batch)
    #images_batch = tf.Print(images_batch, data=[tf.reduce_max(images_batch), tf.reduce_min(images_batch)],
    #                        message='min and max in images_batch')
with tf.variable_scope('conv1'):
    conv1, uma_conv1 = conv_layer(images_batch, [4, 4, 3, 32], [32])    # image size: [56, 56]
with tf.variable_scope('conv2'):
    conv2, uma_conv2 = conv_layer(conv1, [4, 4, 32, 64], [64])          # image size: [28, 28]
with tf.variable_scope('conv3'):
    conv3, uma_conv3 = conv_layer(conv2, [4, 4, 64, 128], [128])        # image size: [14, 14]
with tf.variable_scope('conv4'):
    conv4, uma_conv4 = conv_layer(conv3, [4, 4, 128, 256], [256])       # image size: [7, 7]

conv4_reshaped = tf.reshape(conv4, [-1, 7 * 7 * 256], name='conv4_reshaped')

w_c_mu = tf.Variable(tf.truncated_normal([7 * 7 * 256, latent_dim], stddev=0.1), name='weight_fc_mu')
b_c_mu = tf.Variable(tf.constant(0.1, shape=[latent_dim]), name='biases_fc_mu')
w_c_sig = tf.Variable(tf.truncated_normal([7 * 7 * 256, latent_dim], stddev=0.1), name='weight_fc_sig')
b_c_sig = tf.Variable(tf.constant(0.1, shape=[latent_dim]), name='biases_fc_sig')
epsilon = tf.random_normal([1, latent_dim])

tf.summary.histogram('weights_c_mu', w_c_mu)
tf.summary.histogram('biases_c_mu', b_c_mu)
tf.summary.histogram('weights_c_sig', w_c_sig)
tf.summary.histogram('biases_c_sig', b_c_sig)

with tf.variable_scope('mu'):
    mu = tf.nn.bias_add(tf.matmul(conv4_reshaped, w_c_mu), b_c_mu)
    tf.summary.histogram('mu', mu)
with tf.variable_scope('stddev'):
    stddev = tf.nn.bias_add(tf.matmul(conv4_reshaped, w_c_sig), b_c_sig)
    tf.summary.histogram('stddev', stddev)
with tf.variable_scope('z'):
    latent_var = mu + tf.multiply(tf.sqrt(tf.exp(stddev)), epsilon)
    tf.summary.histogram('features_sig', stddev)

w_dc = tf.Variable(tf.truncated_normal([latent_dim, 7 * 7 * 256], stddev=0.1), name='weights_dc')
b_dc = tf.Variable(tf.constant(0.0, shape=[7 * 7 * 256]), name='biases_dc')
tf.summary.histogram('weights_dc', w_dc)
tf.summary.histogram('biases_dc', b_dc)

with tf.variable_scope('deconv4'):
    deconv4 = tf.nn.bias_add(tf.matmul(latent_var, w_dc), b_dc)
    deconv4_batch_norm, uma_deconv4 = \
        batch_norm(deconv4, [7 * 7 * 256], phase_train=tf.cast(True, tf.bool), convolutional=False)
    deconv4 = tf.nn.relu(deconv4_batch_norm)
    deconv4_reshaped = tf.reshape(deconv4, [-1, 7, 7, 256], name='deconv4_reshaped')
with tf.variable_scope('deconv3'):
    deconv3, uma_deconv3 = deconv_layer(deconv4_reshaped, [3, 3, 128, 256], [128], activation='selu')
with tf.variable_scope('deconv2'):
    deconv2, uma_deconv2 = deconv_layer(deconv3, [3, 3, 64, 128], [64], activation='selu')
with tf.variable_scope('deconv1'):
    deconv1, uma_deconv1 = deconv_layer(deconv2, [3, 3, 32, 64], [32], activation='selu')
with tf.variable_scope('deconv_image'):
    deconv_image_batch, uma_deconv = deconv_layer(deconv1, [3, 3, 3, 32], [3], activation='sigmoid')

# loss function
with tf.name_scope('loss_likelihood'):
    # temp1 shape: [32, 112, 112, 3]
    temp1 = images_batch * tf.log(deconv_image_batch + 1e-9) + (1 - images_batch) * tf.log(1 - deconv_image_batch + 1e-9)
    # temp1 = temp1 * confidences would give an error; therefore, we expand the dimensions of the confidences tensor.
    confidences_ = tf.expand_dims(tf.expand_dims(confidences, axis=1), axis=1)  # shape: [32, 1, 1, 1]
    temp1 = temp1 * confidences_
    log_likelihood = -tf.reduce_sum(temp1, reduction_indices=[1, 2, 3])
    log_likelihood_total = tf.reduce_sum(log_likelihood)
    #l2_loss = tf.reduce_mean(tf.abs(tf.subtract(images_batch, deconv_image_batch)))

with tf.name_scope('loss_KL'):
    # temp2 shape: [32, 200]
    temp2 = 1 + tf.log(tf.square(stddev + 1e-9)) - tf.square(mu) - tf.square(stddev)
    temp3 = temp2 * confidences  # confidences shape is [32, 1]
    KL_term = -0.5 * tf.reduce_sum(temp3, reduction_indices=1)
    KL_term_total = tf.reduce_sum(KL_term)

with tf.name_scope('total_loss'):
    variational_lower_bound = tf.reduce_mean(log_likelihood + KL_term)
    tf.summary.scalar('loss', variational_lower_bound)

with tf.name_scope('optimizer'):
    optimizer = tf.train.AdamOptimizer(0.00001).minimize(variational_lower_bound)

init_op = tf.group(tf.local_variables_initializer(),
                   tf.global_variables_initializer())
saver = tf.train.Saver()
model_path = 'C:/Users/user/PycharmProjects/VariationalAutoEncoder/' \
             'VariationalAutoEncoderFaces/tensorboard_logs/Graph_model/ckpt'
# Here is the session...
with tf.Session() as sess:
    train_writer = tf.summary.FileWriter('C:/Users/user/PycharmProjects/VariationalAutoEncoder/'
                                         'VariationalAutoEncoderFaces/tensorboard_logs/Event_files', sess.graph)
    merged = tf.summary.merge_all()

    # Note that init_op should run before the Coordinator and the threads; otherwise, this will throw an error.
    sess.run(init_op)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    step = 0
    to_run_list = [uma_conv1, uma_conv2, uma_conv3, uma_conv4, uma_deconv1, uma_deconv2, uma_deconv3,
                   uma_deconv4, uma_deconv, optimizer, variational_lower_bound, merged,
                   deconv_image_batch, image]

    # Note that the last name "Graph_model" is the name of the saved checkpoints file => the ckpt is saved
    # under tensorboard_logs.
    ckpt = tf.train.get_checkpoint_state(
        os.path.dirname(model_path))
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)
        print('checkpoints are saved!!!')
    else:
        print('No stored checkpoints')

    epoch = 0
    while not coord.should_stop():
        _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, loss, summary, reconstructed_image, original_image = \
            sess.run(to_run_list)
        print('total loss:', loss)

        original_image = cv2.cvtColor(np.array(original_image), cv2.COLOR_RGB2BGR)
        reconstructed_image = cv2.cvtColor(np.array(reconstructed_image[0]), cv2.COLOR_RGB2BGR)
        cv2.imshow('original_image', original_image)
        cv2.imshow('reconstructed_image', reconstructed_image)
        cv2.waitKey(1)

        if step % 234 == 0:
            epoch += 1
            print('epoch:', epoch)
        if epoch == num_epoch - 2:
            coord.request_stop()
        if step % 100 == 0:
            train_writer.add_summary(summary, step)
            #print('total loss:', loss)
            #print('log_likelihood_', log_likelihood_)
            #print('KL_term', KL_term_)
        step += 1

    save_path = saver.save(sess, model_path)
    coord.request_stop()
    coord.join(threads)
    train_writer.close()
Any help is much appreciated!!
Here is some sample code to show the trend of the means and variances over 3 SELU layers. The numbers of nodes in the layers (including the input layer) are [15, 30, 30, 8].
import numbers
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops, tensor_shape, tensor_util
from tensorflow.python.ops import array_ops, math_ops, random_ops
from tensorflow.python.layers import utils
#-----------------------------------------------#
# https://github.com/bioinf-jku/SNNs/blob/master/selu.py
# The SELU activation function
def selu(x):
    with ops.name_scope('elu') as scope:
        alpha = 1.6732632423543772848170429916717
        scale = 1.0507009873554804934193349852946
        return scale * tf.where(x >= 0.0, x, alpha * tf.nn.elu(x))
#-----------------------------------------------#
# https://github.com/bioinf-jku/SNNs/blob/master/selu.py
# alpha-dropout
def dropout_selu(x, rate, alpha=-1.7580993408473766, fixedPointMean=0.0, fixedPointVar=1.0,
                 noise_shape=None, seed=None, name=None, training=False):
    """Dropout to a value with rescaling."""

    def dropout_selu_impl(x, rate, alpha, noise_shape, seed, name):
        keep_prob = 1.0 - rate
        x = ops.convert_to_tensor(x, name="x")
        if isinstance(keep_prob, numbers.Real) and not 0 < keep_prob <= 1:
            raise ValueError("keep_prob must be a scalar tensor or a float in the "
                             "range (0, 1], got %g" % keep_prob)
        keep_prob = ops.convert_to_tensor(keep_prob, dtype=x.dtype, name="keep_prob")
        keep_prob.get_shape().assert_is_compatible_with(tensor_shape.scalar())

        alpha = ops.convert_to_tensor(alpha, dtype=x.dtype, name="alpha")
        alpha.get_shape().assert_is_compatible_with(tensor_shape.scalar())

        if tensor_util.constant_value(keep_prob) == 1:
            return x

        noise_shape = noise_shape if noise_shape is not None else array_ops.shape(x)
        random_tensor = keep_prob
        random_tensor += random_ops.random_uniform(noise_shape, seed=seed, dtype=x.dtype)
        binary_tensor = math_ops.floor(random_tensor)
        ret = x * binary_tensor + alpha * (1 - binary_tensor)

        a = math_ops.sqrt(fixedPointVar / (keep_prob * ((1 - keep_prob) * math_ops.pow(alpha - fixedPointMean, 2) + fixedPointVar)))
        b = fixedPointMean - a * (keep_prob * fixedPointMean + (1 - keep_prob) * alpha)
        ret = a * ret + b
        ret.set_shape(x.get_shape())
        return ret

    with ops.name_scope(name, "dropout", [x]) as name:
        return utils.smart_cond(training,
                                lambda: dropout_selu_impl(x, rate, alpha, noise_shape, seed, name),
                                lambda: array_ops.identity(x))
#-----------------------------------------------#
# build a 3-layer dense network with SELU activation and alpha-dropout
sess = tf.InteractiveSession()
w1 = tf.constant(np.random.normal(loc=0.0, scale=np.sqrt(1.0/15.0), size = [15, 30]))
b1 = tf.constant(np.random.normal(loc=0.0, scale=0.5, size = [30]))
x1 = tf.constant(np.random.normal(loc=0.0, scale=1.0, size = [200, 15]))
y1 = tf.add(tf.matmul(x1, w1), b1)
y1_selu = selu(y1)
y1_selu_dropout = dropout_selu(y1_selu, 0.05, training=True)
w2 = tf.constant(np.random.normal(loc=0.0, scale=np.sqrt(1.0/30.0), size = [30, 30]))
b2 = tf.constant(np.random.normal(loc=0.0, scale=0.5, size = [30]))
x2 = y1_selu_dropout
y2 = tf.add(tf.matmul(x2, w2), b2)
y2_selu = selu(y2)
y2_selu_dropout = dropout_selu(y2_selu, 0.05, training=True)
w3 = tf.constant(np.random.normal(loc=0.0, scale=np.sqrt(1.0/30.0), size = [30, 8]))
b3 = tf.constant(np.random.normal(loc=0.0, scale=0.5, size = [8]))
x3 = y2_selu_dropout
y3 = tf.add(tf.matmul(x3, w3), b3)
y3_selu = selu(y3)
y3_selu_dropout = dropout_selu(y3_selu, 0.05, training=True)
#-------------------------#
# evaluate the network
x1_v, y1_selu_dropout_v, \
x2_v, y2_selu_dropout_v, \
x3_v, y3_selu_dropout_v, \
= sess.run([x1, y1_selu_dropout, x2, y2_selu_dropout, x3, y3_selu_dropout])
#-------------------------#
# print each layer's mean and standard deviation (1st line: input; 2nd line: output)
print("Layer 1")
print(np.mean(x1_v), np.std(x1_v))
print(np.mean(y1_selu_dropout_v), np.std(y1_selu_dropout_v))
print("Layer 2")
print(np.mean(x2_v), np.std(x2_v))
print(np.mean(y2_selu_dropout_v), np.std(y2_selu_dropout_v))
print("Layer 3")
print(np.mean(x3_v), np.std(x3_v))
print(np.mean(y3_selu_dropout_v), np.std(y3_selu_dropout_v))
Here is one possible output. Over 3 layers, the mean and standard deviation are still close to 0 and 1, respectively.
Layer 1
-0.0101213033749 1.01375071842
0.0106228883975 1.09375593322
Layer 2
0.0106228883975 1.09375593322
-0.027910206754 1.12216643393
Layer 3
-0.027910206754 1.12216643393
-0.131790078631 1.09698413493
I'm using Python 2.7 and OpenCV 3.x in my project for OMR sheet evaluation using a web camera.
While counting the number of white pixels around the centers of the circles, I found that the intensity values OpenCV reads are wrong, yet MATLAB shows the correct values for the same image when I use imtool('a1.png').
I'm using a .png image (datatype uint8).
Just run the code, go to coordinates [360:370, 162:172] in the image, and check the intensity values; they should not be 0.
find the images here -> a1.png a2.png
Why is this happening?
import numpy as np
import cv2
from matplotlib import pyplot as plt
# select radius of circle
radius = 10

# function for counting white pixels around a circle center
def thresh_circle(img, ptx, pty):
    centerX = ptx
    centerY = pty
    cntOfWhite = 0
    for i in range((centerX - radius), (centerX + radius)):
        for j in range((centerY - radius), (centerY + radius)):
            if j < img.shape[0] and i < img.shape[1]:
                val = img[i][j]
                if val == 255:
                    cntOfWhite = cntOfWhite + 1
    return cntOfWhite
MIN_MATCH_COUNT = 10
img1 = cv2.imread('a1.png',0) # queryImage
img2 = cv2.imread('a2.png',0) # trainImage
sift = cv2.SIFT()# Initiate SIFT detector
kp1, des1 = sift.detectAndCompute(img1,None)# find the keypoints and descriptors with SIFT
kp2, des2 = sift.detectAndCompute(img2,None)
FLANN_INDEX_KDTREE = 0
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks = 50)
flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(des1,des2,k=2)
good = []  # store all the good matches as per Lowe's ratio test
for m, n in matches:
    if m.distance < 0.7 * n.distance:
        good.append(m)
if len(good) > MIN_MATCH_COUNT:
    src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    M, mask = cv2.findHomography(src_pts, dst_pts, cv2.LMEDS, 5.0)
    #print M
    matchesMask = mask.ravel().tolist()
    h, w = img1.shape
else:
    print "Not enough matches are found - %d/%d" % (len(good), MIN_MATCH_COUNT)
    matchesMask = None

img3 = cv2.warpPerspective(img1, M, (img2.shape[1], img2.shape[0]))
blur = cv2.GaussianBlur(img3, (5, 5), 0)
ret3, th3 = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
ret, th2 = cv2.threshold(blur, ret3, 255, cv2.THRESH_BINARY_INV)
print th2[360:370, 162:172]  # print a block of the image
plt.imshow(th2, 'gray'), plt.show()
cv2.waitKey(0)
cv2.imwrite('th2.png', th2)
ptyc = np.array([170, 200, 230, 260])  # y coordinates of circle centers
ptxc = np.array([110, 145, 180, 215, 335, 370, 405, 440])  # x coordinates of circle centers
pts_src = np.zeros(shape=(32, 2), dtype=np.int)  # x, y coordinates of circle centers
ct = 0
for i in range(0, 4):
    for j in range(0, 8):
        pts_src[ct][1] = ptyc[i]
        pts_src[ct][0] = ptxc[j]
        ct = ct + 1

boolval = np.zeros(shape=(8, 4), dtype=np.bool)
ct = 0
for j in range(0, 8):
    for i in range(0, 4):
        a1 = thresh_circle(th2, pts_src[ct][0], pts_src[ct][1])
        ct = ct + 1
        if a1 > 50:
            boolval[j][i] = 1
        else:
            boolval[j][i] = 0
I finally got a tile-able version of Simplex noise working after much effort, but I can't seem to get it to record and display correctly in a BufferedImage. Whenever I create an image, it ends up with bands or rings of black and white instead of the smooth change of shades I'm expecting. I'm guessing there's something simple I'm not doing, but for the life of me I can't find it.
This is my code (quite a bit of which is from Stefan Gustavson's Simplex noise implementation):
import java.awt.image.BufferedImage
import javax.imageio.ImageIO
import java.io.File
import scala.util.Random
object ImageTest {
  def main(args: Array[String]): Unit = {
    val image = generate(1024, 1024, 1)
    ImageIO.write(image, "png", new File("heightmap.png"))
  }

  def generate(width: Int, height: Int, octaves: Int) = {
    val map = new BufferedImage(width, height, BufferedImage.TYPE_USHORT_GRAY)
    val pi2 = Math.PI * 2

    for ( x <- 0 until width;
          y <- 0 until height) {
      var total = 0.0

      for (oct <- 1 to octaves) {
        val scale = (1 - 1/Math.pow(2, oct))
        val s = x / width.toDouble
        val t = y / height.toDouble
        val dx = 1 - scale
        val dy = 1 - scale

        val nx = scale + Math.cos(s*pi2) * dx
        val ny = scale + Math.cos(t*pi2) * dy
        val nz = scale + Math.sin(s*pi2) * dx
        val nw = scale + Math.sin(t*pi2) * dy

        total += (((noise(nx,ny,nz,nw)+1)/2)) * Math.pow(0.5, oct)
      }
      map.setRGB(x, y, (total * 0xffffff).toInt)
    }
    map
  }
  // Simplex 4D noise generator
  // returns -1.0 <-> 1.0
  def noise(x: Double, y: Double, z: Double, w: Double) = {
    // Skew the (x,y,z,w) space to determine which cell of 24 simplices we're in
    val s = (x + y + z + w) * F4 // Factor for 4D skewing
    val i = Math.floor(x+s).toInt
    val j = Math.floor(y+s).toInt
    val k = Math.floor(z+s).toInt
    val l = Math.floor(w+s).toInt

    val t = (i+j+k+l) * G4 // Factor for 4D unskewing
    val xBase = x - (i-t) // Unskew the cell space and set the x, y, z, w
    val yBase = y - (j-t) // distances from the cell origin
    val zBase = z - (k-t)
    val wBase = w - (l-t)

    // For the 4D case, the simplex is a 4D shape I won't even try to describe.
    // To find out which of the 24 possible simplices we're in, we need to
    // determine the magnitude ordering of x0, y0, z0 and w0.
    // Six pair-wise comparisons are performed between each possible pair
    // of the four coordinates, and the results are used to rank the numbers.
    var rankx = 0
    var ranky = 0
    var rankz = 0
    var rankw = 0
    if (xBase > yBase) rankx += 1 else ranky += 1
    if (xBase > zBase) rankx += 1 else rankz += 1
    if (xBase > wBase) rankx += 1 else rankw += 1
    if (yBase > zBase) ranky += 1 else rankz += 1
    if (yBase > wBase) ranky += 1 else rankw += 1
    if (zBase > wBase) rankz += 1 else rankw += 1

    // simplex[c] is a 4-vector with the numbers 0, 1, 2 and 3 in some order.
    // Many values of c will never occur, since e.g. x>y>z>w makes x<z, y<w and x<w
    // impossible. Only the 24 indices which have non-zero entries make any sense.
    // We use a thresholding to set the coordinates in turn from the largest magnitude.
    // Rank 3 denotes the largest coordinate.
    val i1 = if (rankx >= 3) 1 else 0
    val j1 = if (ranky >= 3) 1 else 0
    val k1 = if (rankz >= 3) 1 else 0
    val l1 = if (rankw >= 3) 1 else 0
    // Rank 2 denotes the second largest coordinate.
    val i2 = if (rankx >= 2) 1 else 0
    val j2 = if (ranky >= 2) 1 else 0
    val k2 = if (rankz >= 2) 1 else 0
    val l2 = if (rankw >= 2) 1 else 0
    // Rank 1 denotes the second smallest coordinate.
    val i3 = if (rankx >= 1) 1 else 0
    val j3 = if (ranky >= 1) 1 else 0
    val k3 = if (rankz >= 1) 1 else 0
    val l3 = if (rankw >= 1) 1 else 0
    // The fifth corner has all coordinate offsets = 1, so no need to compute that.

    val xList = Array(xBase, xBase-i1+G4, xBase-i2+2*G4, xBase-i3+3*G4, xBase-1+4*G4)
    val yList = Array(yBase, yBase-j1+G4, yBase-j2+2*G4, yBase-j3+3*G4, yBase-1+4*G4)
    val zList = Array(zBase, zBase-k1+G4, zBase-k2+2*G4, zBase-k3+3*G4, zBase-1+4*G4)
    val wList = Array(wBase, wBase-l1+G4, wBase-l2+2*G4, wBase-l3+3*G4, wBase-1+4*G4)

    // Work out the hashed gradient indices of the five simplex corners
    val ii = if (i < 0) 256 + (i % 255) else i % 255
    val jj = if (j < 0) 256 + (j % 255) else j % 255
    val kk = if (k < 0) 256 + (k % 255) else k % 255
    val ll = if (l < 0) 256 + (l % 255) else l % 255
    val gradIndices = Array(
      perm(ii+perm(jj+perm(kk+perm(ll)))) % 32,
      perm(ii+i1+perm(jj+j1+perm(kk+k1+perm(ll+l1)))) % 32,
      perm(ii+i2+perm(jj+j2+perm(kk+k2+perm(ll+l2)))) % 32,
      perm(ii+i3+perm(jj+j3+perm(kk+k3+perm(ll+l3)))) % 32,
      perm(ii+1+perm(jj+1+perm(kk+1+perm(ll+1)))) % 32)

    // Calculate the contribution from the five corners
    var total = 0.0
    for (dim <- 0 until 5) {
      val (x, y, z, w) = (xList(dim), yList(dim), zList(dim), wList(dim))
      var t = 0.5 - x*x - y*y - z*z - w*w
      total += {
        if (t < 0) 0.0
        else {
          t *= t
          val g = grad4(gradIndices(dim))
          t * t * ((g.x*x) + (g.y*y) + (g.z*z) + (g.w*w))
        }
      }
    }
    // Sum up and scale the result to cover the range [-1,1]
    27.0 * total
  }
  case class Grad(x: Double, y: Double, z: Double, w: Double = 0.0)

  private lazy val grad4 = Array(
    Grad(0,1,1,1),  Grad(0,1,1,-1),  Grad(0,1,-1,1),  Grad(0,1,-1,-1),
    Grad(0,-1,1,1), Grad(0,-1,1,-1), Grad(0,-1,-1,1), Grad(0,-1,-1,-1),
    Grad(1,0,1,1),  Grad(1,0,1,-1),  Grad(1,0,-1,1),  Grad(1,0,-1,-1),
    Grad(-1,0,1,1), Grad(-1,0,1,-1), Grad(-1,0,-1,1), Grad(-1,0,-1,-1),
    Grad(1,1,0,1),  Grad(1,1,0,-1),  Grad(1,-1,0,1),  Grad(1,-1,0,-1),
    Grad(-1,1,0,1), Grad(-1,1,0,-1), Grad(-1,-1,0,1), Grad(-1,-1,0,-1),
    Grad(1,1,1,0),  Grad(1,1,-1,0),  Grad(1,-1,1,0),  Grad(1,-1,-1,0),
    Grad(-1,1,1,0), Grad(-1,1,-1,0), Grad(-1,-1,1,0), Grad(-1,-1,-1,0))

  private lazy val perm = new Array[Short](512)
  for (i <- 0 until perm.length)
    perm(i) = Random.nextInt(256).toShort

  private lazy val F4 = (Math.sqrt(5.0) - 1.0) / 4.0
  private lazy val G4 = (5.0 - Math.sqrt(5.0)) / 20.0
}
I've checked the output values of the noise function I'm using, which so far hasn't returned anything outside of (-1, 1) exclusive. And for a single octave, the value I'm supplying to the image (total) hasn't gone outside of (0, 1) exclusive either.
The only thing I can see that would be a problem is that either the BufferedImage type is set incorrectly, or I'm multiplying total by the wrong hex value when setting the values in the image.
I've looked through the Javadocs on BufferedImage for information about the image types and the values they accept, and nothing I've found seems out of place in my code (though I am fairly new to using BufferedImage in general). I've tried changing the hex value, but that doesn't seem to change anything. The only thing I've found that has any effect is dividing the (total * 0xffffff).toInt value by 256, which darkens the bands a bit and makes a slight gradient appear where it should, but if I increase the divisor too much, the image just becomes black. So as of right now I'm stuck on what the issue could be.
(total * 0xffffff).toInt doesn't seem to make sense. You are creating an ARGB value from a grayscale float with a single multiplication?
I think you want something like this:
val i = (total * 0xFF).toInt
val rgb = 0xFF000000 | (i << 16) | (i << 8) | i
That gives me a smooth random texture, although with very low contrast: with 1 octave, your total seems to vary approximately from 0.2 to 0.3, so you may need to adjust the scale a bit.
I'm not sure, though, how you can get 16-bit grayscale resolution this way. Perhaps you need to set the raster data directly instead of using setRGB (which forces you down to 8 bits per channel). The comments below this question suggest using the raster directly.