Is gpflow's scaled_euclid_dist as stable as tensorflow_probabilities?

Is gpflow's scaled_euclid_dist as stable as tensorflow_probabilities? - gpflow

Basically, for kernels that depend on r the L2 norm is calculated here and one sees we first clip the value. In tensorflow probability they use a sqrt with a modified gradient that replaces grad(|x - x'|) with a large but finite number when x=x'.
My question is whether they are equivalent, or if one is better?

I've checked gradients for GPflow clipped version for x=x'. The result surprises, as it is zero. I did expect it to be high value.
Simple check confirms that gradient of tf.sqrt(1e-40) should return 5.e+19, and I'm not sure that clipped version has right behaviour.

Related

Which scaling technique does it use?

I have a matrix X, the size of which is 100*2000 double. I want to know which kind of scaling technique is applied to matrix X in the following command, and why it does not use z-score to do scaling?
X = X./repmat(sqrt(sum(X.^2)),size(X,1),1);

That scaling comes from linear algebra. That's what we call normalizing by producing a unit vector. Assuming that each row is an observation and each column is a feature, what's happening here is that we are going through every observation that you collected and normalizing each feature value over all observations such that the overall length / magnitude of a particular feature for all observations is set to 1.
The bottom division takes a look at each feature and determines the norm or magnitude of the feature over all observations. Once you find these magnitudes, you then take each feature for each observation and divide by their respective magnitudes.
The reason why unit vectors are often employed is to describe a point in feature space with respect to a set of basis vectors. Normalizing by producing unit vectors gives you the smallest possible way to represent one component in feature space and so what's probably happening here is that the observations are now being transformed such that each component / feature is being represented in terms of a set of basis vectors. Each basis vector is one feature in the data.
Check out the Wikipedia article on Unit Vectors for more details: http://en.wikipedia.org/wiki/Unit_vector

Implementation of Radon transform in Matlab, output size

Due to the nature of my problem, I want to evaluate the numerical implementations of the Radon transform in Matlab (i.e. different interpolation methods give different numerical values).
while trying to code my own Radon, and compare it to Matlab's output, I found out that my radon projection sizes are different than Matlab's.
So a bit of intuition of how I compute the amount if radon samples needed. Let's do the 2D case.
The idea is that the maximum size would be when the diagonal (in a rectangular shape at least) part is proyected in the radon transform, so diago=sqrt(size(I,1),size(I,2)). As we dont wan nothing out, n_r=ceil(diago). n_r should be the amount of discrete samples of the radon transform should be to ensure no data is left out.
I noticed that Matlab's radon output is always even, which makes sense as you would want a "ray" through the rotation center always. And I noticed that there are 2 zeros in the endpoints of the array in all cases.
So in that case, n_r=ceil(diago)+mod(ceil(diago)+1,2)+2;
However, it seems that I get small discrepancies with Matlab.
A MWE:
% Try: 255,256
pixels=256;
I=phantom('Modified Shepp-Logan',pixels);
rd=radon(I,pi/4);
size(rd,1)
s=size(I);
diagsize=sqrt(sum(s.^2));
n_r=ceil(diagsize)+mod(ceil(diagsize)+1,2)+2
rd=
367
n_r =
365
As Matlab's Radon transform is a function I can not look into, I wonder why could it be this discrepancy.

I took another look at the problem and I believe this is actually the right answer. From the "hidden documentation" of radon.m (type in edit radon.m and scroll to the bottom)
Grandfathered syntax
R = RADON(I,THETA,N) returns a Radon transform with the
projection computed at N points. R has N rows. If you do not
specify N, the number of points the projection is computed at
is:
2*ceil(norm(size(I)-floor((size(I)-1)/2)-1))+3
This number is sufficient to compute the projection at unit
intervals, even along the diagonal.
I did not try to rederive this formula, but I think this is what you're looking for.

This is a fairly specialized question, so I'll offer up an idea without being completely sure it is the answer to your specific question (normally I would pass and let someone else answer, but I'm not sure how many readers of stackoverflow have studied radon). I think what you might be overlooking is the floor function in the documentation for the radon function call. From the doc:
The radial coordinates returned in xp are the values along the x'-axis, which is
oriented at theta degrees counterclockwise from the x-axis. The origin of both
axes is the center pixel of the image, which is defined as
floor((size(I)+1)/2)
For example, in a 20-by-30 image, the center pixel is (10,15).
This gives different behavior for odd- or even-sized problems that you pass in. Hence, in your example ("Try: 255, 256"), you would need a different case for odd versus even, and this might involve (in effect) padding with a row and column of zeros.

MATLAB edge function

I have a question regarding the parameters in the edge function.
edge(img,'sobel',threshold);
edge(img,'prewitt',threshold) ;
edge(img,'roberts',threshold);
edge(img,'canny',thresh_canny,sigma);
How should the threshold for the first 3 types be chosen? Is there an aspect that can help choosing this threshold (like histogram for instance)? I am aware of the function graythresh but I want to set it manually. So far I know it's a value between 0-1, but I don't know how to interpret it.
Same thing for Canny. I`m trying to input an array for thresh_canny = [low_limit, high_limit]. but don't know how to look at these values. How does the sigma value influence the image?

It really depends on the type of edge you want to see in the output. If you want to see really powerful edges, use a smaller interval in the higher end of threshold (say 0.9-1) and this is relative to highest gradient magnitude of the image.
As far as the sigma is concerned, it is used in filtering the input image before passing to edge. This is to reduce noise in the input image.

Detect incorrect points in a homogeneous surface

In my project i have hige surfaces of 20.000 points computed by a algorithm. This algorithm, sometimes, has an error, computing 1 or more points in an small area incorrectly.
This error can not be solved in the algorithm, but needs to be detected afterwards.
The error can be seen in the next figure:
As you can see, there is a point wrongly computed that not only breaks the full homogeneous surface, but also destroys the aestetics of the plot (wich is also important in the project.)
Sometimes it can be more than a point, in general no more than 5 or 6. The error is allways the Z axis, so no need to check X and Y
I have been squeezing my mind to find a bit "generic" algorithm to detect this poitns.
I thougth that maybe taking patches of surface and meaning the Z, then detecting the points out of the variance... but I dont think it will work allways.
Any ideas?
NOTE: I dont want someone to write code for me, just an idea.
PD: relevant code for the avobe image:
[x,y] = meshgrid([-2:.07:2]);
Z = x.*exp(-x.^2-y.^2);
subplot(1,2,1)
surf(x,y,Z,gradient(Z))
subplot(1,2,2)
Z(35,35)=Z(35,35)+0.3;
surf(x,y,Z,gradient(Z))

The standard trick is to use a Laplacian, looking for the largest outliers. (This is not unlike what Mohsen posed for an answer, but is actually a bit easier.) You could even probably do it with conv2, so it would be pretty efficient.
I could offer a few ways to implement the idea. A simple one is to use my gridfit tool, found on the File Exchange. (Gridfit essentially uses a Laplacian for its smoothing operation.) Fit the surface with all points included, then look for the single point that was perturbed the most by the fit. Exclude it, then rerun the fit, again looking for the largest outlier. (With gridfit, you can use weights to give points a zero weight, a simple way to exclude a point or list of points.) When the largest perturbation that was needed is small enough, you can decide to stop the process. A nice thing is gridfit will also impute new values for the outliers, filling in all of the holes.
A second approach is to use the Laplacian directly, in more of a filtering approach. Here, you simply compute a value at each point that is the average of each neighbor to the left, right, above, and below. The single value that is most largely in disagreement with its computed average is replaced with a new value. Or, you can use a weighted average of the new value with the old one there. Again, iterate until the process does not generate anything larger than some tolerance. (This is the basis of an old outlier detection and correction scheme that I recall from the Fortran IMSL libraries, but probably dates back to roughly 30 years ago.)

Since your functions seems to vary smoothly these abrupt changes can be detected by looking into the derivatives. You can
Take the derivative in one direction
Calculate mean and standard deviation of derivative
Find the points by looking for points that are further from mean by certain multiple of standard deviation.
Here is the code
U=diff(Z);
V=(U-mean(U(:)))/std(U(:));
surf(x(2:end,:),y(2:end,:),V)
V=[zeros(1,size(V,2)); V];
V(abs(V)<10)=0;
V=sign(V);
W=cumsum(V);
[I,J]=find(W);
outliers = [I, J];
For your example you get this plot for V with a peak at around 21.7 while second peak is at around 1.9528, so maybe a threshold of 10 is ok.
and running the code returns
outliers =
35 35
The need for cumsum is for the cases that you have a patch of points next to each other that are incorrect.

How to overcome singularities in numerical integration (in Matlab or Mathematica)

I want to numerically integrate the following:
where
and a, b and β are constants which for simplicity, can all be set to 1.
Neither Matlab using dblquad, nor Mathematica using NIntegrate can deal with the singularity created by the denominator. Since it's a double integral, I can't specify where the singularity is in Mathematica.
I'm sure that it is not infinite since this integral is based in perturbation theory and without the
has been found before (just not by me so I don't know how it's done).
Any ideas?

(1) It would be helpful if you provide the explicit code you use. That way others (read: me) need not code it up separately.
(2) If the integral exists, it has to be zero. This is because you negate the n(y)-n(x) factor when you swap x and y but keep the rest the same. Yet the integration range symmetry means that amounts to just renaming your variables, hence it must stay the same.
(3) Here is some code that shows it will be zero, at least if we zero out the singular part and a small band around it.
a = 1;
b = 1;
beta = 1;
eps[x_] := 2*(a-b*Cos[x])
n[x_] := 1/(1+Exp[beta*eps[x]])
delta = .001;
pw[x_,y_] := Piecewise[{{1,Abs[Abs[x]-Abs[y]]>delta}}, 0]
We add 1 to the integrand just to avoid accuracy issues with results that are near zero.
NIntegrate[1+Cos[(x+y)/2]^2*(n[x]-n[y])/(eps[x]-eps[y])^2*pw[Cos[x],Cos[y]],
{x,-Pi,Pi}, {y,-Pi,Pi}] / (4*Pi^2)
I get the result below.
NIntegrate::slwcon:
Numerical integration converging too slowly; suspect one of the following:
singularity, value of the integration is 0, highly oscillatory integrand,
or WorkingPrecision too small.
NIntegrate::eincr:
The global error of the strategy GlobalAdaptive has increased more than
2000 times. The global error is expected to decrease monotonically after a
number of integrand evaluations. Suspect one of the following: the
working precision is insufficient for the specified precision goal; the
integrand is highly oscillatory or it is not a (piecewise) smooth
function; or the true value of the integral is 0. Increasing the value of
the GlobalAdaptive option MaxErrorIncreases might lead to a convergent
numerical integration. NIntegrate obtained 39.4791 and 0.459541
for the integral and error estimates.
Out[24]= 1.00002
This is a good indication that the unadulterated result will be zero.
(4) Substituting cx for cos(x) and cy for cos(y), and removing extraneous factors for purposes of convergence assessment, gives the expression below.
((1 + E^(2*(1 - cx)))^(-1) - (1 + E^(2*(1 - cy)))^(-1))/
(2*(1 - cx) - 2*(1 - cy))^2
A series expansion in cy, centered at cx, indicates a pole of order 1. So it does appear to be a singular integral.
Daniel Lichtblau

The integral looks like a Cauchy Principal Value type integral (i.e. it has a strong singularity). That's why you can't apply standard quadrature techniques.
Have you tried PrincipalValue->True in Mathematica's Integrate?

In addition to Daniel's observation about integrating an odd integrand over a symmetric range (so that symmetry indicates the result should be zero), you can also do this to understand its convergence better (I'll use latex, writing this out with pen and paper should make it easier to read; it took a lot longer to write than to do, it's not that complicated):
First, epsilon(x)-\epsilon(y)\propto\cos(y)-\cos(x)=2\sin(\xi_+)\sin(\xi_-) where I have defined \xi_\pm=(x\pm y)/2 (so I've rotated the axes by pi/4). The region of integration then is \xi_+ between \pi/\sqrt{2} and -\pi/\sqrt{2} and \xi_- between \pm(\pi/\sqrt{2}-\xi_-). Then the integrand takes the form \frac{1}{\sin^2(\xi_-)\sin^2(\xi_+)} times terms with no divergences. So, evidently, there are second-order poles, and this isn't convergent as presented.
Perhaps you could email the persons who obtained an answer with the cos term and ask what precisely it is they did. Perhaps there's a physical regularisation procedure being employed. Or you could have given more information on the physical origin of this (some sort of second order perturbation theory for some sort of bosonic system?), had that not been off-topic here...

May be I am missing something here, but the integrand
f[x,y]=Cos^2[(x+y)/2]*(n[x]-n[y])/(eps[x]-eps[y]) with n[x]=1/(1+Exp[Beta*eps[x]]) and eps[x]=2(a-b*Cos[x]) is indeed a symmetric function in x and y: f[x,-y]= f[-x,y]=f[x,y].
Therefore its integral over any domain [-u,u]x[-v,v] is zero. No numerical integration seems to be needed here. The result is just zero.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse