Standalone image patch extraction op in Tensorflow - neural-network

In the TensorFlow docs, the tf.nn.conv2d operation is described as doing the following:
1. Flatten the filter to a 2-D matrix with shape [filter_height * filter_width * in_channels, output_channels].
2. Extract image patches from the input tensor to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].
3. For each patch, right-multiply the filter matrix and the image patch vector.
Is there an operation to apply just step 2? I cannot find anything like that in the API docs. I might be searching with the wrong keywords.

This is now added to the tensorflow api: https://www.tensorflow.org/versions/r0.9/api_docs/python/array_ops.html#extract_image_patches
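For reference, a minimal sketch of that op on a 3x3 single-channel image (TF 1.x-style API; keyword arguments are used because the exact positional order has changed between versions, and in TF 2 the op lives at tf.image.extract_patches):
import numpy as np
import tensorflow as tf
images = tf.constant(np.arange(1, 10).reshape((1, 3, 3, 1)), dtype=tf.float32)
# One 3x3 patch per output position; the result has shape
# [batch, out_height, out_width, filter_height * filter_width * in_channels]
patches = tf.extract_image_patches(images,
                                   ksizes=[1, 3, 3, 1],
                                   strides=[1, 1, 1, 1],
                                   rates=[1, 1, 1, 1],
                                   padding='SAME')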

I guess a trick to do that would be to:
1. Take a filter of shape [filter_height, filter_width, in_channels, output_channels] with output_channels = filter_height * filter_width * in_channels.
2. Fix the value of this filter so that, when the filter is flattened to a 2-D matrix (cf. your step 1), it is the identity matrix. Check my example code below for a simple way to do that with np.eye().reshape().
3. Perform a normal tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME').
You now have an output of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].
Here is a simple example for an input image of size 3x3 with 1 channel (and batch size 1).
import tensorflow as tf
import numpy as np

# 3x3 single-channel input image, batch size 1
input_value = np.arange(1, 10).reshape((1, 3, 3, 1))
input = tf.constant(input_value)
input = tf.cast(input, tf.float32)

# 3x3x1 filter with 9 output channels; flattened to 2-D it is the 9x9 identity,
# so the convolution copies each 3x3 patch into the channel dimension
filter_value = np.eye(9).reshape((3, 3, 1, 9))
filter = tf.constant(filter_value)
filter = tf.cast(filter, tf.float32)

output = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
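To check the result, you can run the graph and inspect the shape (a quick sketch, assuming a TF 1.x session):
with tf.Session() as sess:
    patches = sess.run(output)
print(patches.shape)  # (1, 3, 3, 9): one flattened 3x3 patch per pixel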

Related

How is RMSD calculated in the Scipy implementation of the Kabsch algorithm?

Scipy calculates the rmsd like this, and I'll paraphrase it here for convenience (for readability I drop the weights and the max(*, 0))
rmsd = np.sqrt(np.sum(b ** 2 + a ** 2) - 2 * np.sum(s))
To me this does not look like RMSD.
Now, from the docs, one would infer that the rmsd return value is defined as the square root of double the loss expression given there.
The latter is indeed what I would consider to be the RMSD. In fact, I went ahead and coded it up (note that this function expects me to apply the estimated transformation to one of the sets of points first, whereas the snippet above does not):
def _calc_rmsd(a: np.ndarray, b_transformed: np.ndarray) -> float:
    distances = np.linalg.norm(a - b_transformed, axis=-1)
    rmsd = np.sqrt((distances ** 2).sum() / len(distances))
    return rmsd
I also plotted out what these would look like for randomly generated point pairs with normally distributed noise (blue is scipy, orange is mine)
Or extending the plot out to 200 point pairs:
So to sum it up:
The definition of rmsd in the docs is in agreement with what I believe to be the widely accepted notion of rmsd.
The scipy implementation of rmsd disagrees with the latter; I don't even understand what it's supposed to represent mathematically.
From Monte Carlo simulations, the two implementations clearly have different outcomes.
So what's going on?
Apparently the SciPy code is not returning the root-mean-squared distance. It sums the squared differences, but it does not divide by the number of vectors before taking the square root. The difference between the SciPy calculation and yours is a factor of sqrt(len(a)). You can verify this with an example such as the following.
In [157]: from scipy.spatial.transform import Rotation
In [158]: def _calc_rmsd(a: np.ndarray, b_transformed: np.ndarray) -> float:
     ...:     distances = np.linalg.norm(a - b_transformed, axis=-1)
     ...:     rmsd = np.sqrt((distances ** 2).sum() / len(distances))
     ...:     return rmsd
     ...:
Some test data:
In [159]: a = np.array([[0, 1, 1], [1, 1, 1.5], [2.0, -1.0, 4.0], [-1, 0, 5]])
In [160]: b = np.array([[0, 1, 1.5], [2, 2, 2], [1, -1, 5], [-3, 0.1, 1]])
Compute the rotation:
In [161]: R, rmsd = Rotation.align_vectors(a, b)
In [162]: rmsd
Out[162]: 3.8753534834716685
Here's your calculation of the RMSD:
In [163]: _calc_rmsd(a, R.apply(b))
Out[163]: 1.9376767417358356
And here is your calculation, multiplied by sqrt(len(a)), so it matches the result returned by Rotation.align_vectors:
In [164]: _calc_rmsd(a, R.apply(b)) * np.sqrt(len(a))
Out[164]: 3.875353483471671
This looks like a documentation issue. If you have a moment, you could create a new issue for this over in https://github.com/scipy/scipy/issues
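In the meantime, a minimal sketch of the practical workaround implied above: divide the value returned by align_vectors by sqrt(N) to recover the conventional RMSD.
import numpy as np
from scipy.spatial.transform import Rotation
a = np.array([[0, 1, 1], [1, 1, 1.5], [2.0, -1.0, 4.0], [-1, 0, 5]])
b = np.array([[0, 1, 1.5], [2, 2, 2], [1, -1, 5], [-3, 0.1, 1]])
R, ret = Rotation.align_vectors(a, b)
# ret is the root-sum-squared distance; divide by sqrt(N) for the RMSD
rmsd = ret / np.sqrt(len(a))
print(rmsd)  # ~1.9377, matching _calc_rmsd(a, R.apply(b)) above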

Spatial Points Outlier Clustering Method

I would like to implement unsupervised clustering to detect grids (vertical/horizontal lines) in spatial points.
I have tried DBSCAN and it gives subpar results. It is able to pick out the grids, as seen in red below:
However, it is not able to completely pick out all the points that form the vertical/horizontal lines, and if I relax the epsilon parameter, it incorrectly classifies more points as noisy (e.g. the bottom left of the picture).
I was wondering if there is a modified version of DBSCAN that uses ellipses instead of circles? Or any other clustering method recommended for this that does not need the number of clusters to be prespecified?
Or is there a better method to identify these points that make up the grid? Any help is appreciated.
You can use an anisotropic DBSCAN by rescaling your data this way: values of anisotropy < 1 will find vertical clusters, and values > 1 will find horizontal clusters.
from sklearn.cluster import DBSCAN

def anisotropical_DBSCAN(X, anisotropy, eps, min_samples):
    """Anisotropic DBSCAN: rescale the y-axis by `anisotropy` (in place),
    then run ordinary DBSCAN and return the fitted estimator."""
    X[:, 1] = X[:, 1] * anisotropy
    db = DBSCAN(eps=eps, min_samples=min_samples).fit(X)
    return db
Here is a full example with data:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

centers = [[1, 1], [-1, -1], [1, -1]]
X, labels_true = make_blobs(
    n_samples=750, centers=centers, cluster_std=0.4, random_state=0
)
print(X.shape)

def anisotropical_DBSCAN(X, anisotropy, eps, min_samples):
    """Anisotropic DBSCAN: rescale the y-axis by `anisotropy` (in place),
    then run ordinary DBSCAN and return the fitted estimator."""
    X[:, 1] = X[:, 1] * anisotropy
    db = DBSCAN(eps=eps, min_samples=min_samples).fit(X)
    return db

db = anisotropical_DBSCAN(X, anisotropy=0.1, eps=0.1, min_samples=10)
core_samples_mask = np.zeros_like(db.labels_, dtype=bool)
core_samples_mask[db.core_sample_indices_] = True
labels = db.labels_

# Number of clusters in labels, ignoring noise if present.
n_clusters_ = len(set(labels)) - (1 if -1 in labels else 0)

# Plot result; black is reserved for noise.
unique_labels = set(labels)
colors = [plt.cm.Spectral(each) for each in np.linspace(0, 1, len(unique_labels))]
for k, col in zip(unique_labels, colors):
    if k == -1:
        # Black used for noise.
        col = [0, 0, 0, 1]
    class_member_mask = labels == k
    # Core samples, plotted larger
    xy = X[class_member_mask & core_samples_mask]
    plt.plot(
        xy[:, 0],
        xy[:, 1],
        "o",
        markerfacecolor=tuple(col),
        markeredgecolor="k",
        markersize=14,
    )
    # Non-core samples, plotted smaller
    xy = X[class_member_mask & ~core_samples_mask]
    plt.plot(
        xy[:, 0],
        xy[:, 1],
        "o",
        markerfacecolor=tuple(col),
        markeredgecolor="k",
        markersize=6,
    )
plt.title("Estimated number of clusters: %d" % n_clusters_)
plt.show()
You get vertical clusters:
Now change the parameters to db = anisotropical_DBSCAN(X, anisotropy=10, eps=1, min_samples=10). I had to change the eps value because the horizontal and vertical scales of this data aren't the same, but in your case you should be able to keep the same (eps, min_samples) for detecting both kinds of lines.
And you get horizontal clusters:
There are also existing implementations of anisotropic DBSCAN that are probably a lot cleaner, e.g. https://github.com/gissong/ADCN

Scipy.curve_fit() vs. Matlab fit() weighted nonlinear least squares

I have a Matlab reference routine that I am trying to convert to numpy/scipy. I have encountered a curve fitting problem that I cannot solve in Python. So here is a simple example which demonstrates the problem. The data is completely synthetic and not part of the problem.
Let's say I'm trying to fit a straight-line model of noisy data -
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0.1075, 1.3668, 1.5482, 3.1724, 4.0638, 4.7385, 5.9133, 7.0685, 8.7157, 9.5539]
For the unweighted solution in Matlab, I would code
g = @(m, b, x)(m*x + b)
f = fittype(g)
bestfit = fit(x, y, g)
which produces a solution of bestfit.m = 1.048, bestfit.b = -0.09219
Running this data through scipy.optimize.curve_fit() produces identical results.
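(For reference, a minimal sketch of that unweighted scipy fit; the lambda is just an inline version of the same straight-line model:)
import numpy as np
from scipy.optimize import curve_fit
x = np.arange(10, dtype=float)
y = np.array([0.1075, 1.3668, 1.5482, 3.1724, 4.0638,
              4.7385, 5.9133, 7.0685, 8.7157, 9.5539])
popt, _ = curve_fit(lambda x, m, b: m * x + b, x, y)
print(popt)  # approximately [1.048, -0.092], matching the Matlab result above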
If instead the fit uses a decay function to reduce the impact of data points
dw = [0.7290, 0.5120, 0.3430, 0.2160, 0.1250, 0.0640, 0.0270, 0.0080, 0.0010, 0]
weightedfit = fit(x, y, g, 'Weights', dw)
This produces a slope of 0.944 and an offset of 0.1484.
I have not figured out how to conjure this result from scipy.optimize.curve_fit using the sigma parameter. If I pass the weights as provided to Matlab, the '0' causes a divide by zero exception. Clearly Matlab and scipy are thinking very differently about the meaning of the weights in the underlying optimization routine. Is there a simple way of converting between the two that allows me to provide a weighting function which produces identical results?
Ok, so after further investigation I can offer the answer, at least for this simple example.
import numpy as np
import scipy as sp
import scipy.optimize

def modelFun(x, m, b):
    return m * x + b

def testFit():
    # Diagonal matrix built from the reciprocals of the Matlab weights
    w = np.diag([1.0, 1/0.7290, 1/0.5120, 1/0.3430, 1/0.2160, 1/0.1250,
                 1/0.0640, 1/0.0270, 1/0.0080, 1/0.0010])
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([0.1075, 1.3668, 1.5482, 3.1724, 4.0638,
                  4.7385, 5.9133, 7.0685, 8.7157, 9.5539])
    popt, pcov = sp.optimize.curve_fit(modelFun, x, y, sigma=w)
    print(popt)
    print(pcov)
Which produces the desired result.
In order to force sp.optimize.curve_fit to minimize the same chisq metric as Matlab's Curve Fitting Toolbox, you must do two things:
1. Use the reciprocals of the weight factors.
2. Create a diagonal matrix from these new weight factors.
According to the scipy reference:
sigma : None or M-length sequence or MxM array, optional
Determines the uncertainty in ydata. If we define residuals as r = ydata - f(xdata, *popt), then the interpretation of sigma depends on its number of dimensions:
A 1-d sigma should contain values of standard deviations of errors in ydata. In this case, the optimized function is chisq = sum((r / sigma) ** 2).
A 2-d sigma should contain the covariance matrix of errors in ydata. In this case, the optimized function is chisq = r.T @ inv(sigma) @ r. New in version 0.19.
None (default) is equivalent of 1-d sigma filled with ones.
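Putting the two points together, here is a minimal sketch of the conversion (the trailing zero weight from the question is replaced by a hypothetical small positive value so that its reciprocal exists). An equivalent 1-d sigma of 1/sqrt(weights) minimizes the same sum(w * r**2):
import numpy as np
from scipy.optimize import curve_fit

def model(x, m, b):
    return m * x + b

x = np.arange(10, dtype=float)
y = np.array([0.1075, 1.3668, 1.5482, 3.1724, 4.0638,
              4.7385, 5.9133, 7.0685, 8.7157, 9.5539])
# Hypothetical weights: the question's decay weights with the final 0 replaced
w = np.array([0.7290, 0.5120, 0.3430, 0.2160, 0.1250,
              0.0640, 0.0270, 0.0080, 0.0010, 1e-6])

# 2-d sigma (covariance matrix): chisq = r.T @ inv(sigma) @ r = sum(w * r**2)
popt_2d, _ = curve_fit(model, x, y, sigma=np.diag(1.0 / w))
# 1-d sigma (standard deviations): chisq = sum((r / sigma)**2) = sum(w * r**2)
popt_1d, _ = curve_fit(model, x, y, sigma=1.0 / np.sqrt(w))
print(popt_2d, popt_1d)  # the two parameter sets should agree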

Passing Individual Channels of Tensors to Layers in Keras

I am trying to emulate something equivalent to a SeparableConvolution2D layer for the Theano backend (it already exists for the TensorFlow backend). As a first step, what I need to do is pass ONE channel from a tensor into the next layer. So say I have a 2D convolution layer called conv1 with 16 filters, which produces an output with shape (batch_size, 16, height, width). I need to select the subtensor with shape (:, 0, :, :) and pass it to the next layer. Simple enough, right?
This is my code:
from keras import backend as K
from keras.layers import Input, Convolution2D

image_input = Input(batch_shape=(batch_size, 1, height, width), name='image_input')
conv1 = Convolution2D(16, 3, 3, name='conv1', activation='relu')(image_input)
conv2_input = K.reshape(conv1[:, 0, :, :], (batch_size, 1, height, width))
conv2 = Convolution2D(16, 3, 3, name='conv1', activation='relu')(conv2_input)
This throws:
Exception: You tried to call layer "conv1". This layer has no information about its expected input shape, and thus cannot be built. You can build it manually via: layer.build(batch_input_shape)
Why does the layer not have the required shape information? I'm using reshape from the Theano backend. Is this the right way of passing individual channels to the next layer?
I asked this question on the keras-user group and I got an answer there:
https://groups.google.com/forum/#!topic/keras-users/bbQ5CbVXT1E
Quoting it:
You need to use a Lambda layer, like: Lambda(lambda x: x[:, 0:1, :, :], output_shape=lambda x: (x[0], 1, x[2], x[3]))
Note that such a manual implementation of a separable convolution would be horribly inefficient. The correct solution is to use the TensorFlow backend.
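Applied to the code in the question, a minimal sketch might look like this (keeping the Keras 1 API and the Theano channel-first ordering the question assumes; batch_size, height and width are as defined there):
from keras.layers import Input, Convolution2D, Lambda

image_input = Input(batch_shape=(batch_size, 1, height, width), name='image_input')
conv1 = Convolution2D(16, 3, 3, name='conv1', activation='relu')(image_input)
# Slice out channel 0 while keeping the channel axis, so the output stays 4-D
channel0 = Lambda(lambda x: x[:, 0:1, :, :],
                  output_shape=lambda s: (s[0], 1, s[2], s[3]))(conv1)
conv2 = Convolution2D(16, 3, 3, name='conv2', activation='relu')(channel0)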

Selectively zero weights in TensorFlow?

Let's say I have an NxM weight variable weights and a constant NxM matrix of 1s and 0s called mask.
If a layer of my network is defined like this (with other layers similarly defined):
masked_weights = mask*weights
layer1 = tf.nn.relu(tf.matmul(layer0, masked_weights) + biases1)
Will this network behave as if the corresponding 0s in mask were zeros in weights during training (i.e., as if the connections represented by those weights had been removed from the network entirely)?
If not, how can I achieve this goal in TensorFlow?
The answer is yes. The experiment below demonstrates it on a small graph.
The implementation is:
import numpy as np, scipy as sp, tensorflow as tf
x = tf.placeholder(tf.float32, shape=(None, 3))
weights = tf.get_variable("weights", [3, 2])
bias = tf.get_variable("bias", [2])
mask = tf.constant(np.asarray([[0, 1], [1, 0], [0, 1]], dtype=np.float32)) # constant mask
masked_weights = tf.multiply(weights, mask)
y = tf.nn.relu(tf.nn.bias_add(tf.matmul(x, masked_weights), bias))
loss = tf.losses.mean_squared_error(tf.constant(np.asarray([[1, 1]], dtype=np.float32)),y)
weights_grad = tf.gradients(loss, weights)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
print("Masked weights=\n", sess.run(masked_weights))
data = np.random.rand(1, 3)
print("Graident of weights\n=", sess.run(weights_grad, feed_dict={x: data}))
sess.close()
After running the code above, you will see the gradients are masked as well. In my example, they are:
Gradient of weights
= [array([[ 0. , -0.40866762],
[ 0.34265977, -0. ],
[ 0. , -0.35294518]], dtype=float32)]
The answer is yes, and the reason lies in backpropagation, as explained below.
mask_w = mask * w
del(mask_w) = mask * del(w)
Since mask is constant, it makes the gradient 0 wherever its value is zero; wherever its value is 1, the gradient flows through as before. This is a common trick used in seq2seq prediction to mask variable-length outputs in the decoding layer. You can read more about this here.
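As a complementary check, here is a minimal sketch (TF 1.x style, same kind of setup as the experiment above) showing that a plain gradient-descent step leaves the masked positions of weights untouched:
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(None, 3))
y_true = tf.placeholder(tf.float32, shape=(None, 2))
weights = tf.get_variable("w", [3, 2])
bias = tf.get_variable("b", [2], initializer=tf.zeros_initializer())
mask_value = np.asarray([[0, 1], [1, 0], [0, 1]], dtype=np.float32)
mask = tf.constant(mask_value)

masked_weights = tf.multiply(weights, mask)
y_pred = tf.nn.relu(tf.nn.bias_add(tf.matmul(x, masked_weights), bias))
loss = tf.losses.mean_squared_error(y_true, y_pred)
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    before = sess.run(weights)
    sess.run(train_op, feed_dict={x: np.random.rand(4, 3),
                                  y_true: np.ones((4, 2), dtype=np.float32)})
    after = sess.run(weights)
    # Masked entries receive zero gradient, so they are unchanged by the update
    print(np.array_equal(before[mask_value == 0], after[mask_value == 0]))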