I've written some Matlab procedures that evaluate orthogonal polynomials, and as a sanity check I was trying to ensure that their dot product would be zero.
But, while I'm fairly sure there's not much that can go wrong, I'm seeing some slightly curious behaviour. My test is quite simple:
x = -1:.01:1;
for i0=0:9
    v1 = poly(x, i0);
    for i1=0:i0
        v2 = poly(x,i1);
        fprintf('%d, %d: %g\n', i0, i1, v1*v2');
    end
end
(Note the dot product v1*v2' needs to be this way round because x is a horizontal vector.)
Now, to cut to the end of the story, I end up with values close to 0 (order of magnitude about 1e-15) for pairs of degrees that add up to an odd number (i.e., i0+i1=2k+1). When i0==i1 I expect the dot product not to be 0, but it is also non-zero whenever i0+i1=2k with i0~=i1, which I didn't expect.
To give you some more details, I initially did this test with Chebyshev polynomials of first kind. Now, they are orthogonal with respect to the weight
1 ./ sqrt(1-x.^2)
which goes to infinity when x goes to 1. So I thought that leaving this term out could be the cause of non-zero dot products.
But then, I did the same test with Legendre polynomials, and I get exactly the same result: when the sum of the degrees is even, the dot product is definitely far from 0 (order of magnitude 1e2).
One last detail, I used the trigonometric formula cos(n*acos(x)) to evaluate the Chebyshev polynomials, and I tried the recursive formula as well as one of the formulas involving the binomial coefficient to evaluate the Legendre polynomials.
Can anyone explain this odd (pun intended) behaviour?
You're being misled by symmetry. Both Chebyshev and Legendre polynomials are eigenfunctions of the parity operator, which means that they can all be classified as either odd or even functions. I guess the same goes for your custom orthogonal polynomials.
Due to this symmetry, if you multiply a polynomial P_n(x) by P_m(x), then the result will be an odd function if n+m is odd, and it will be even otherwise. You're computing sum_k P_n(x_k)*P_m(x_k) for a symmetric set of x_k values around the origin. This implies that for odd n+m you will always get zero. Try computing sum_k P_n(x_k)*Q_m(x_k) with P a Legendre, and Q a Chebyshev polynomial. My point is that for n+m=odd, the result doesn't tell you anything about orthogonality or the accuracy of your integration.
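The parity argument can be checked numerically. The following is a minimal Python sketch (not the asker's MATLAB code; the helper names `legendre`, `chebyshev` and `grid_sum` are made up for illustration) that sums a Legendre polynomial against a Chebyshev polynomial on the same symmetric grid. The two families are not orthogonal to each other, yet the sum still vanishes to rounding error when n+m is odd.

```python
import math

def legendre(n, x):
    """Evaluate the Legendre polynomial P_n(x) with Bonnet's recursion."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        # (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x)
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def chebyshev(n, x):
    """Evaluate the Chebyshev polynomial T_n(x) = cos(n*acos(x)) on [-1, 1]."""
    return math.cos(n * math.acos(x))

# Same symmetric grid as x = -1:.01:1 in the question.
xs = [k / 100.0 for k in range(-100, 101)]

def grid_sum(f, g):
    """Unweighted sum of f(x)*g(x) over the grid (the 'dot product')."""
    return sum(f(x) * g(x) for x in xs)

# n + m odd: the product is an odd function, so the sum cancels by symmetry.
mixed_odd = grid_sum(lambda x: legendre(3, x), lambda x: chebyshev(2, x))
# n + m even: no cancellation by symmetry, so the sum is clearly non-zero.
mixed_even = grid_sum(lambda x: legendre(2, x), lambda x: chebyshev(2, x))

print(mixed_odd, mixed_even)
```

The near-zero value for the odd case therefore says nothing about orthogonality; it only reflects the symmetry of the grid.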
The real problem is probably that you're not integrating accurately enough. These orthogonal polynomials defined on [-1,1] vary quite rapidly on their domain, especially close to the boundaries (x==+-1). Try increasing the number of integration points, using a non-equidistant mesh, or performing a proper integration with integral.
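To illustrate the difference between the raw dot product and an actual quadrature, here is a rough Python sketch for Legendre polynomials (weight 1). It assumes a composite Simpson rule as the "proper integration"; MATLAB's integral uses a different, adaptive scheme, and the helper names are hypothetical. The raw dot product is an unweighted sum: it lacks the grid spacing factor and over-counts the endpoints, so it does not approximate the integral.

```python
def legendre(n, x):
    """Evaluate the Legendre polynomial P_n(x) with Bonnet's recursion."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def simpson(f, a, b, n_intervals):
    """Composite Simpson rule on [a, b]; n_intervals must be even."""
    h = (b - a) / n_intervals
    total = f(a) + f(b)
    for k in range(1, n_intervals):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def raw_dot(n, m):
    """Unweighted dot product on x = -1:.01:1, as in the question."""
    xs = [k / 100.0 for k in range(-100, 101)]
    return sum(legendre(n, x) * legendre(m, x) for x in xs)

def inner(n, m):
    """Approximate the integral of P_n * P_m over [-1, 1]."""
    return simpson(lambda x: legendre(n, x) * legendre(m, x), -1.0, 1.0, 200)

print(raw_dot(2, 4))  # clearly non-zero although n + m is even and n != m
print(inner(2, 4))    # close to zero, as orthogonality predicts
print(inner(2, 2))    # close to 2/(2n+1) = 0.4
```

For Chebyshev polynomials the weight 1/sqrt(1-x^2) would also have to be included, which this sketch does not attempt.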
Final note: I'd advise you against calling your functions poly, since that's a MATLAB built-in. (And so is legendre.)
In choosing parameters such as plaintext_modulus, is there any good strategy? (aside from guess-and-check until the output looks correct)
In particular, I'm experimenting with IntegerEncoder with BFV. My (potentially-wrong) understanding is that the plaintext_modulus is not the modulus for the integer being encoded, but the modulus for each coefficient in the polynomial representation.
With B=2, it looks like these coefficients will just be 0 or 1. However, after operations like add and multiply are applied, this clearly is no longer the case. Is there a good way to determine a good bound for the coefficients, in order to pick plaintext_modulus?
My (potentially-wrong) understanding is that the plaintext_modulus is not the modulus for the integer being encoded, but the modulus for each coefficient in the polynomial representation.
This is the correct way of thinking when using IntegerEncoder. Note, however, that when using BatchEncoder (PolyCRTBuilder in SEAL 2.*) the situation is exactly the opposite: each slot in the plaintext vector is an integer modulo plain_modulus.
With B=2, it looks like these coefficients will just be 0 or 1. However, after operations like add and multiply are applied, this clearly is no longer the case. Is there a good way to determine a good bound for the coefficients, in order to pick plaintext_modulus?
The whole point of IntegerEncoder is that fresh encodings have as small coefficients as possible, delaying plain_modulus overflow and allowing you to use smaller plain_modulus (implies smaller noise growth). SEAL 2.* had an automatic parameter selection tool that performed heuristic upper bound estimates on noise growth and plaintext coefficient growth, and basically did exactly what you want. Unfortunately these estimates were performed on a per-operation basis, causing overestimates in the earlier operations to blow up in later stages of the computation. As a result, the estimates were not very tight for more than the simplest computations and in many cases the parameters this tool provided were oversized.
To estimate the plaintext coefficient growth in multiplications, let's consider two polynomials p(x) and q(x). Obviously the product will have degree exactly equal to deg(p)+deg(q)---that part is easy. If |P| denotes the infinity norm of a polynomial P (absolute value of largest coefficient), then:
|p*q| <= min{deg(p)+1, deg(q)+1} * |p||q|.
Actually, SEAL 2.* is a little bit more precise here. Instead of using the degrees, it uses the number of non-zero coefficients in these polynomials. This makes a big difference when the polynomials are sparse, in which case the contribution from cross-terms is much smaller and a better bound is:
|p*q| <= min{#(non_zero_coeffs(p)), #(non_zero_coeffs(q))} * |p||q|.
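Both bounds can be tried out on small examples. The sketch below is plain Python, not SEAL code: it multiplies two coefficient lists over the integers and ignores reduction modulo the polynomial modulus and plain_modulus. The helper names are invented for illustration, and the lists are assumed to have a non-zero leading coefficient so that len(p) equals deg(p)+1.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inf_norm(p):
    """Infinity norm: absolute value of the largest coefficient."""
    return max(abs(c) for c in p)

def nonzero_count(p):
    """Number of non-zero coefficients."""
    return sum(1 for c in p if c != 0)

def degree_bound(p, q):
    """min{deg(p)+1, deg(q)+1} * |p||q|."""
    return min(len(p), len(q)) * inf_norm(p) * inf_norm(q)

def sparse_bound(p, q):
    """min{#nonzero(p), #nonzero(q)} * |p||q|."""
    return min(nonzero_count(p), nonzero_count(q)) * inf_norm(p) * inf_norm(q)

# Binary (B=2) encodings: 13 = 1 + x^2 + x^3 and 37 = 1 + x^2 + x^5 at x = 2.
p = [1, 0, 1, 1]
q = [1, 0, 1, 0, 0, 1]
prod = poly_mul(p, q)

print(prod, inf_norm(prod), sparse_bound(p, q), degree_bound(p, q))
```

In this example the largest product coefficient is 2, the sparse bound gives 3 and the degree-based bound gives 4, so plain_modulus would need to exceed the true coefficient size for the result to decode correctly.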
A deeper analysis of coefficient growth in IntegerEncoder-like encoders is done in https://eprint.iacr.org/2016/250 by Costache et al., which you may want to look at.
I have an integration function which does not have indefinite integral expression.
Specifically, the function is f(y)=h(y)+integral(@(x) exp(-x-1/x),0,y) where h(y) is a simple function.
Matlab numerically computes f(y) well, but I want to compute the following function.
g(w)=w*integral(1-f(y).^(1/w),0,inf) where w is a real number in [0,1].
The problem for computing g(w) is handling f(y).^(1/w) numerically.
How can I calculate g(w) with MATLAB? Is it impossible?
Expressions containing e^(-1/x) are generally difficult to compute near x = 0. Actually, I am surprised that Matlab computes f(y) well in the first place. I'd suggest trying to compute g(w)=w*integral(1-f(y).^(1/w),epsilon,inf) for epsilon greater than zero, then gradually decreasing epsilon toward 0 to check if you can get numerical convergence at all. Convergence is certainly not guaranteed!
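The "shrink epsilon and watch for convergence" idea can be sketched as follows. Since h(y) is not given, this Python sketch applies the check only to the inner integral of exp(-x-1/x), using a fixed composite Simpson rule and an upper limit of 1; these choices, and all names, are assumptions for illustration rather than a recipe for g(w).

```python
import math

def integrand(x):
    """exp(-x - 1/x); for tiny x, math.exp underflows quietly to 0.0."""
    return math.exp(-x - 1.0 / x)

def simpson(f, a, b, n_intervals=2000):
    """Composite Simpson rule on [a, b]; n_intervals must be even."""
    h = (b - a) / n_intervals
    total = f(a) + f(b)
    for k in range(1, n_intervals):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def truncated_integral(eps, upper=1.0):
    """Integrate from eps (instead of 0) up to `upper`."""
    return simpson(integrand, eps, upper)

cutoffs = [1e-1, 1e-2, 1e-3, 1e-4]
values = [truncated_integral(eps) for eps in cutoffs]
# Change in the result each time the cutoff is reduced.
changes = [abs(b - a) for a, b in zip(values, values[1:])]

for eps, value in zip(cutoffs, values):
    print(eps, value)
print(changes)
```

If the changes shrink towards zero as the cutoff decreases, the truncated integrals are converging numerically; if they do not, the result for that integral should not be trusted.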
You can calculate g(w) using the functions you have, but you need to add the 'ArrayValued',true name-value pair.
The option allows you to specify a vector-valued w and allows the nested integral call to receive a vector of y values, which is how integral naturally works.
f = @(y) h(y)+integral(@(x) exp(-x-1/x),0,y,'ArrayValued',true);
g = @(w) w .* integral(@(y) 1-f(y).^(1./w),0,Inf,'ArrayValued',true);
At least, that works on my R2014b installation.
Note: While h(y) may be simple, if its integral over the positive real line does not converge, g(w) will more than likely not converge (I don't think I need to qualify that, but I'll hedge my bets).
I have a matrix-valued function whose limit I'm trying to find as x goes to 1.
So, in this example, I have three matrices v1-3, representing respectively the sampled values at [0.85, 0.9, 0.99]. What I do now, which is quite inefficient, is the following:
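v_splined = zeros(101, 160);  % preallocate the result
for i = 1:101
    for j = 1:160
        % store each extrapolated entry instead of overwriting a scalar
        v_splined(i,j) = spline([0.85,0.9,0.99], [v1(i,j), v2(i,j), v3(i,j)], 1);
    end
end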
There must be a better, more efficient way to do this, especially since soon enough I'll face the situation where the v's will be 4-5 dimensional vectors.
Thanks!
Disclaimer: Naively extrapolating is risky business, do so at your own risk
Here's what I would say
Using a spline to extrapolate is risky business and not generally recommended. Do you know anything about the behavior of your function near x=1?
In the case where you only have 3 points you're probably better off using a 2nd order polynomial (a parabola) rather than fitting a spline through the three points. (unless you have a good reason not to do this.)
If you want to use a parabola (or higher order interpolating polynomial when you have more points), you can vectorize your code and use Lagrange or Newton polynomials to perform the extrapolation which will probably give you a nice speed up.
Using interpolating polynomials will also generalize easily to higher order polynomials with more points given. However, this will make extrapolation even more risky since high-order interpolating polynomials tend to oscillate severely near the ends of the domain.
If you want to use Lagrange polynomials to form a parabola, your result is given by:
v_splined = v1*(1-.9)*(1-.99)/( (.85-.9)*(.85-.99) ) ...
+v2*(1-.85)*(1-.99)/( (.9-.85)*(.9-.99) ) ...
+v3*(1-.85)*(1-.9)/( (.99-.85)*(.99-.9) );
I left this un-simplified so you can see how it comes from the Lagrange polynomials, but obviously simplifying is easy. Also note that this eliminates the need for loops.
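The same idea generalizes to any number of sample points: compute the Lagrange weights at the target once, then take a weighted sum of the sampled matrices. Below is a minimal Python sketch using nested lists (the function names are hypothetical; in MATLAB the weighted sum would simply be w(1)*v1 + w(2)*v2 + w(3)*v3).

```python
def lagrange_weights(nodes, target):
    """Weights w_i such that p(target) = sum_i w_i * p(nodes[i])
    for the polynomial p interpolating the data at `nodes`."""
    weights = []
    for i, xi in enumerate(nodes):
        w = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                w *= (target - xj) / (xi - xj)
        weights.append(w)
    return weights

def extrapolate(samples, nodes, target):
    """Entry-wise polynomial extrapolation of equally shaped matrices
    (lists of rows), one matrix per node."""
    weights = lagrange_weights(nodes, target)
    rows, cols = len(samples[0]), len(samples[0][0])
    return [[sum(w * s[i][j] for w, s in zip(weights, samples))
             for j in range(cols)]
            for i in range(rows)]

def sample(x):
    """Toy matrix-valued function whose entries are at most quadratic in x."""
    return [[x * x, 1.0], [2.0 * x, x * x - x]]

nodes = [0.85, 0.9, 0.99]
weights = lagrange_weights(nodes, 1.0)
limit = extrapolate([sample(x) for x in nodes], nodes, 1.0)

print(weights)
print(limit)
```

Because the toy entries are quadratics, the three-point extrapolation reproduces them exactly up to rounding; for a real function the result is only as good as the assumption that a low-order polynomial describes it near x=1.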
This is my first post to stackoverflow, so if this isn't the correct area I apologize. I am working on minimizing a L1-Regularized System.
This weekend is my first dive into optimization. I have a basic linear system Y = X*B, where X is an n-by-p matrix, B is a p-by-1 vector of model coefficients and Y is an n-by-1 output vector.
I am trying to find the model coefficients, and I have implemented both gradient descent and coordinate descent algorithms to minimize the L1-regularized system. To find my step size I am using the backtracking algorithm. I terminate the algorithm by looking at the norm-2 of the gradient and stopping if it is 'close enough' to zero (for now I'm using 0.001).
The function I am trying to minimize is the following (0.5)*(norm((Y - X*B),2)^2) + lambda*norm(B,1). (Note: By norm(Y,2) I mean the norm-2 value of the vector Y) My X matrix is 150-by-5 and is not sparse.
If I set the regularization parameter lambda to zero I should converge on the least squares solution, I can verify that both my algorithms do this pretty well and fairly quickly.
If I start to increase lambda, my model coefficients all tend towards zero, which is what I expect. My algorithms never terminate, though, because the norm-2 of the gradient is always a positive number. For example, a lambda of 1000 will give me coefficients in the 10^(-19) range but the norm2 of my gradient is ~1.5, and this is after several thousand iterations. While my gradient values all converge to something in the 0 to 1 range, my step size becomes extremely small (10^(-37) range). If I let the algorithm run for longer the situation does not improve; it appears to have gotten stuck somehow.
Both my gradient and coordinate descent algorithms converge on the same point and give the same norm2(gradient) number for the termination condition. They also work quite well with lambda of 0. If I use a very small lambda(say 0.001) I get convergence, a lambda of 0.1 looks like it would converge if I ran it for an hour or two, a lambda any greater and the convergence rate is so small it's useless.
I had a few questions that I think might relate to the problem?
In calculating the gradient I am using a finite difference method, (f(x+h) - f(x-h))/(2h), with an h of 10^(-5). Any thoughts on this value of h?
Another thought was that at these very tiny steps it is traveling back and forth in a direction nearly orthogonal to the minimum, making the convergence rate so slow it is useless.
My last thought was that perhaps I should be using a different termination method, perhaps looking at the rate of convergence, if the convergence rate is extremely slow then terminate. Is this a common termination method?
The 1-norm isn't differentiable. This will cause fundamental problems with a lot of things, notably the termination test you chose; the gradient will change drastically around your minimum and fail to exist on a set of measure zero.
The termination test you really want will be along the lines of "there is a very short vector in the subgradient."
It is fairly easy to find the shortest vector in the subgradient of ||Ax-b||_2^2 + lambda ||x||_1. Choose, wisely, a tolerance eps and do the following steps:
Compute v = grad(||Ax-b||_2^2).
If x[i] < -eps, then subtract lambda from v[i]. If x[i] > eps, then add lambda to v[i]. If -eps <= x[i] <= eps, then add the number in [-lambda, lambda] to v[i] that minimises |v[i]|.
You can do your termination test here, treating v as the gradient. I'd also recommend using v for the gradient when choosing where your next iterate should be.
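The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not a full solver: the function name is made up, the gradient of the smooth least-squares term is assumed to be supplied by the caller, and the toy problem uses 0.5*||x-b||^2 + lambda*||x||_1, whose minimiser is known in closed form (soft-thresholding of b).

```python
def min_norm_subgradient(smooth_grad, x, lam, eps=1e-8):
    """Shortest vector in the subgradient of smooth(x) + lam*||x||_1.

    smooth_grad: gradient of the smooth term at x (list of floats)
    x:           current iterate
    lam:         regularization parameter lambda
    eps:         entries with |x[i]| <= eps are treated as zero
    """
    v = []
    for g, xi in zip(smooth_grad, x):
        if xi > eps:
            v.append(g + lam)          # d/dx of lam*|x| is +lam for x > 0
        elif xi < -eps:
            v.append(g - lam)          # and -lam for x < 0
        elif g > lam:
            v.append(g - lam)          # best s in [-lam, lam] is -lam
        elif g < -lam:
            v.append(g + lam)          # best s in [-lam, lam] is +lam
        else:
            v.append(0.0)              # s = -g cancels the entry exactly
    return v

# Toy problem: minimise 0.5*||x - b||^2 + lam*||x||_1.
b = [3.0, 0.5, -2.0]
lam = 1.0
x_star = [2.0, 0.0, -1.0]                         # soft-threshold of b
smooth_grad = [xi - bi for xi, bi in zip(x_star, b)]

v = min_norm_subgradient(smooth_grad, x_star, lam)
# A naive "gradient" that uses sign(0) = 0 for the zero coordinate:
naive = [g + lam * ((xi > 0) - (xi < 0)) for g, xi in zip(smooth_grad, x_star)]

print(v, naive)
```

At the minimiser the shortest subgradient vector is zero, so a termination test on its norm succeeds, while the naive gradient keeps a non-zero entry at the coordinate that has been driven to zero. That matches the symptom described in the question.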
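I want to numerically integrate the following:
[equation not preserved]
where
[definitions not preserved; the answers below use eps(x) = 2(a - b cos(x)) and n(x) = 1/(1 + exp(β eps(x)))]
and a, b and β are constants which, for simplicity, can all be set to 1.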
Neither Matlab using dblquad, nor Mathematica using NIntegrate can deal with the singularity created by the denominator. Since it's a double integral, I can't specify where the singularity is in Mathematica.
I'm sure that it is not infinite since this integral is based on perturbation theory and without the [term not preserved] has been found before (just not by me, so I don't know how it's done).
Any ideas?
(1) It would be helpful if you provide the explicit code you use. That way others (read: me) need not code it up separately.
(2) If the integral exists, it has to be zero. This is because you negate the n(y)-n(x) factor when you swap x and y but keep the rest the same. Yet the integration range symmetry means that amounts to just renaming your variables, hence it must stay the same.
(3) Here is some code that shows it will be zero, at least if we zero out the singular part and a small band around it.
a = 1;
b = 1;
beta = 1;
eps[x_] := 2*(a-b*Cos[x])
n[x_] := 1/(1+Exp[beta*eps[x]])
delta = .001;
pw[x_,y_] := Piecewise[{{1,Abs[Abs[x]-Abs[y]]>delta}}, 0]
We add 1 to the integrand just to avoid accuracy issues with results that are near zero.
NIntegrate[1+Cos[(x+y)/2]^2*(n[x]-n[y])/(eps[x]-eps[y])^2*pw[Cos[x],Cos[y]],
{x,-Pi,Pi}, {y,-Pi,Pi}] / (4*Pi^2)
I get the result below.
NIntegrate::slwcon:
Numerical integration converging too slowly; suspect one of the following:
singularity, value of the integration is 0, highly oscillatory integrand,
or WorkingPrecision too small.
NIntegrate::eincr:
The global error of the strategy GlobalAdaptive has increased more than
2000 times. The global error is expected to decrease monotonically after a
number of integrand evaluations. Suspect one of the following: the
working precision is insufficient for the specified precision goal; the
integrand is highly oscillatory or it is not a (piecewise) smooth
function; or the true value of the integral is 0. Increasing the value of
the GlobalAdaptive option MaxErrorIncreases might lead to a convergent
numerical integration. NIntegrate obtained 39.4791 and 0.459541
for the integral and error estimates.
Out[24]= 1.00002
This is a good indication that the unadulterated result will be zero.
(4) Substituting cx for cos(x) and cy for cos(y), and removing extraneous factors for purposes of convergence assessment, gives the expression below.
((1 + E^(2*(1 - cx)))^(-1) - (1 + E^(2*(1 - cy)))^(-1))/
(2*(1 - cx) - 2*(1 - cy))^2
A series expansion in cy, centered at cx, indicates a pole of order 1. So it does appear to be a singular integral.
Daniel Lichtblau
The integral looks like a Cauchy Principal Value type integral (i.e. it has a strong singularity). That's why you can't apply standard quadrature techniques.
Have you tried PrincipalValue->True in Mathematica's Integrate?
In addition to Daniel's observation about integrating an odd integrand over a symmetric range (so that symmetry indicates the result should be zero), you can also do this to understand its convergence better (I'll use latex, writing this out with pen and paper should make it easier to read; it took a lot longer to write than to do, it's not that complicated):
First, \epsilon(x)-\epsilon(y)\propto\cos(y)-\cos(x)=2\sin(\xi_+)\sin(\xi_-) where I have defined \xi_\pm=(x\pm y)/2 (so I've rotated the axes by \pi/4). The region of integration then is \xi_+ between -\pi/\sqrt{2} and \pi/\sqrt{2} and \xi_- between \pm(\pi/\sqrt{2}-\xi_+). Then the integrand takes the form \frac{1}{\sin^2(\xi_-)\sin^2(\xi_+)} times terms with no divergences. So, evidently, there are second-order poles, and this isn't convergent as presented.
Perhaps you could email the persons who obtained an answer with the cos term and ask what precisely it is they did. Perhaps there's a physical regularisation procedure being employed. Or you could have given more information on the physical origin of this (some sort of second order perturbation theory for some sort of bosonic system?), had that not been off-topic here...
Maybe I am missing something here, but the integrand
f[x,y]=Cos^2[(x+y)/2]*(n[x]-n[y])/(eps[x]-eps[y]) with n[x]=1/(1+Exp[Beta*eps[x]]) and eps[x]=2(a-b*Cos[x]) is indeed a symmetric function in x and y: f[x,-y]= f[-x,y]=f[x,y].
Therefore its integral over any domain [-u,u]x[-v,v] is zero. No numerical integration seems to be needed here. The result is just zero.