How many Boolean functions have exactly one non-zero Fourier coefficient?

This question is from the book Analysis of Boolean Functions by Ryan O'Donnell.
I tried to understand the concept of Boolean functions by watching the author's video lectures.
What I inferred was that the number of non-zero Fourier coefficients can only be a power of two. I think that, due to this, exactly one coefficient is not possible.

If a Boolean function has exactly one non-zero Fourier coefficient, then it is equal to $$\pm \chi_S$$ for some $$S$$. Hence there are exactly $$2^{n+1}$$ Boolean functions with exactly one non-zero Fourier coefficient: $$2^n$$ choices of $$S$$ times two signs.
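To see why, here is a sketch using Parseval's theorem (proved early in the book): for $$f : \{-1,1\}^n \to \{-1,1\}$$,
$$\sum_{S \subseteq [n]} \hat{f}(S)^2 = \mathbb{E}_x[f(x)^2] = 1,$$
so if only one coefficient $$\hat{f}(S)$$ is non-zero, it must satisfy $$\hat{f}(S)^2 = 1$$, i.e. $$\hat{f}(S) = \pm 1$$ and $$f = \pm \chi_S$$. (Note that $$1 = 2^0$$ is itself a power of two, so the "power of two" observation in the question does not rule this case out.)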


Choosing a suitable plaintext_modulus

In choosing parameters such as plaintext_modulus, is there any good strategy? (aside from guess-and-check until the output looks correct)
In particular, I'm experimenting with IntegerEncoder with BFV. My (potentially-wrong) understanding is that the plaintext_modulus is not the modulus for the integer being encoded, but the modulus for each coefficient in the polynomial representation.
With B=2, it looks like these coefficients will just be 0 or 1. However, after operations like add and multiply are applied, this clearly is no longer the case. Is there a good way to determine a good bound for the coefficients, in order to pick plaintext_modulus?
My (potentially-wrong) understanding is that the plaintext_modulus is not the modulus for the integer being encoded, but the modulus for each coefficient in the polynomial representation.
This is the correct way of thinking when using IntegerEncoder. Note, however, that when using BatchEncoder (PolyCRTBuilder in SEAL 2.*) the situation is exactly the opposite: each slot in the plaintext vector is an integer modulo plain_modulus.
With B=2, it looks like these coefficients will just be 0 or 1. However, after operations like add and multiply are applied, this clearly is no longer the case. Is there a good way to determine a good bound for the coefficients, in order to pick plaintext_modulus?
The whole point of IntegerEncoder is that fresh encodings have coefficients as small as possible, delaying plain_modulus overflow and allowing you to use a smaller plain_modulus (which implies smaller noise growth). SEAL 2.* had an automatic parameter selection tool that performed heuristic upper-bound estimates of noise growth and plaintext coefficient growth, and it did basically exactly what you want. Unfortunately, these estimates were performed on a per-operation basis, causing overestimates in the earlier operations to blow up in later stages of the computation. As a result, the estimates were not very tight for anything beyond the simplest computations, and in many cases the parameters this tool provided were oversized.
To estimate the plaintext coefficient growth in multiplications, consider two polynomials p(x) and q(x). Obviously the product will have degree exactly deg(p)+deg(q); that part is easy. If |P| denotes the infinity norm of a polynomial P (the largest absolute value among its coefficients), then:
|p*q| <= min{deg(p)+1, deg(q)+1} * |p||q|.
Actually, SEAL 2.* is a little bit more precise here. Instead of using the degrees, it uses the number of non-zero coefficients in these polynomials. This makes a big difference when the polynomials are sparse, in which case the contribution from cross-terms is much smaller and a better bound is:
|p*q| <= min{#(non_zero_coeffs(p)), #(non_zero_coeffs(q))} * |p||q|.
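As a quick sanity check of the sparse bound, here is a hypothetical MATLAB snippet (conv multiplies polynomial coefficient vectors):
p = [1 0 0 2 0 1];      % 3 non-zero coefficients, |p| = 2
q = [3 0 1];            % 2 non-zero coefficients, |q| = 3
pq = conv(p, q);        % coefficients of the product p*q
bound = min(nnz(p), nnz(q)) * max(abs(p)) * max(abs(q));
max(abs(pq)) <= bound   % returns true (1): here 6 <= 12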
A deeper analysis of coefficient growth in IntegerEncoder-like encoders is done in https://eprint.iacr.org/2016/250 by Costache et al., which you may want to look at.

Can any function be decomposed as sum of Gaussians?

In Fourier series, any (reasonably nice) function can be decomposed as a sum of sines and cosines.
In neural networks, any continuous function can be approximated by a weighted sum of logistic functions (a one-hidden-layer network).
In wavelet transforms, any function can be decomposed as a weighted sum of Haar functions.
Is there a similar property for decomposition into a mixture of Gaussians? If so, is there a proof?
If the sum is allowed to be infinite, then the answer is yes. See Yves Meyer's book Wavelets and Operators, Section 6.6, Lemma 10.
There's a theorem, the Stone-Weierstrass theorem, which gives conditions for when a family of functions can approximate any continuous function. You need:
- an algebra of functions (closed under addition, subtraction, and multiplication),
- the constant functions,
- functions that separate points (for any two distinct points, you can find a function that assigns them different values).
You can approximate a constant function with increasingly wide Gaussians, and you can shift Gaussians to separate points. So if you form an algebra out of Gaussians, you can approximate any continuous function with them.
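As a concrete illustration, here is a hypothetical MATLAB sketch (the target function is arbitrary): fit fixed-width Gaussian bumps by least squares, and the residual shrinks as you add more centers.
xs = linspace(-1, 1, 200)';                  % sample grid
f = sin(3*xs) + xs.^2;                       % arbitrary target function
centers = linspace(-1, 1, 25);               % Gaussian centers
sigma = 0.15;                                % common width
G = exp(-(xs - centers).^2 / (2*sigma^2));   % one Gaussian per column (implicit expansion, R2016b+)
w = G \ f;                                   % least-squares weights
norm(G*w - f)                                % small residual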
Yes. Decomposing any function into a sum of any kind of Gaussians is possible in the limit, since it can be decomposed into a sum of Dirac deltas :) (and the Dirac delta is the limit of a Gaussian as its variance approaches zero).
Some more interesting questions would be:
Can any function be decomposed into a sum of Gaussians with non-zero, fixed variance, centered at varying points?
Can any function be decomposed into a sum of Gaussians with non-zero variance, all centered at 0 but with varying variances?
The Mathematics Stack Exchange might be a better place to answer these questions though.

How to compute inverse of a matrix accurately?

I'm trying to compute the inverse of a matrix P, but if I multiply inv(P)*P, MATLAB does not return the identity matrix. It's almost the identity (off-diagonal values on the order of 10^(-12)). However, in my application I need more precision.
What can I do in this situation?
Use inv() only if you explicitly need the inverse of a matrix; otherwise, use the backslash operator \.
The documentation on inv() explicitly states:
x = A\b is computed differently than x = inv(A)*b and is recommended for solving systems of linear equations.
This is because the backslash operator, mldivide(), uses whatever method is best suited to your specific matrix:
x = A\B solves the system of linear equations A*x = B. The matrices A and B must have the same number of rows. MATLAB® displays a warning message if A is badly scaled or nearly singular, but performs the calculation regardless.
Just so you know which algorithm MATLAB chooses depending on your input matrices, their documentation provides a full algorithm flowchart, summarized as follows:
The versatility of mldivide in solving linear systems stems from its ability to take advantage of symmetries in the problem by dispatching to an appropriate solver. This approach aims to minimize computation time. The first distinction the function makes is between full (also called "dense") and sparse input arrays.
As a side note about errors on the order of 10^(-12): besides the above-mentioned inaccuracy of the inv() function, there's floating-point accuracy. This post on MATLAB issues is rather insightful, with a more general computer science post here. Basically, if you are computing numerics, don't worry (too much, at least) about errors 12 orders of magnitude smaller than your values.
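Here is a small, hypothetical experiment illustrating the difference, using a classically ill-conditioned Hilbert matrix:
n = 12;
A = hilb(n);             % Hilbert matrix: notoriously ill-conditioned
xtrue = ones(n, 1);
b = A * xtrue;           % right-hand side with known exact solution
x1 = A \ b;              % backslash picks an appropriate factorization
x2 = inv(A) * b;         % explicit inverse
norm(x1 - xtrue)         % typically noticeably smaller...
norm(x2 - xtrue)         % ...than this, though both suffer from cond(A) ~ 1e16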
You have what's called an ill-conditioned matrix. It's risky to try to take the inverse of such a matrix. In general, taking the inverse of anything but the smallest matrices (such as those you see in an introductory linear algebra textbook) is risky. If you must, you could try the Moore-Penrose pseudoinverse (see Wikipedia), but even that is not foolproof.
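In MATLAB that's pinv(), which works via the SVD; a hypothetical usage sketch (the optional tolerance argument truncates tiny singular values, which acts as a regularizer):
x = pinv(A) * b;         % Moore-Penrose pseudoinverse via the SVD
x = pinv(A, 1e-8) * b;   % treat singular values below 1e-8 as zero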

Numerical Instability of the Kalman Filter in MATLAB

I am trying to run a standard Kalman filter algorithm to calculate likelihoods, but I keep getting a problem with a non-positive-definite covariance matrix when calculating normal densities.
I've researched a little and seen that there may in fact be some numerical instability; I tried some numerical ways to avoid a non-positive-definite matrix, using both the Cholesky decomposition and its LDL' variant.
I am using MATLAB.
Does anyone have any suggestions?
Thanks.
I have encountered what might be the same problem before, when I needed to run a Kalman filter for long periods and over time my covariance matrix would degenerate. It might just be a problem of losing symmetry due to numerical error. One simple way to force your covariance matrix (let's call it P) to remain symmetric is to do:
P = (P + P')/2;   % where P' is transpose(P)
right after estimating P.
Post your code.
As a rule of thumb, if the model is not accurate and the regularization (i.e., the process noise matrix Q) is not sufficiently "large", underfitting will occur and the covariance matrix of the estimator will be ill-conditioned. Try fine-tuning your Q matrix.
The Kalman filter implemented in the conventional covariance form is known to be numerically fragile, as any old timer who once worked with a single-precision implementation of the filter can tell you; even the symmetry-preserving Joseph-form update can struggle in single precision. This problem was discovered long ago and prompted a lot of research into implementing the filter in a stable manner. Probably the best-known implementation is the UD filter, where the covariance matrix is factorized as UDU' and the two factors are updated and propagated using special formulas (see Thornton and Bierman). U is an upper triangular matrix with ones on its diagonal, and D is a diagonal matrix.
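For reference, here is a hypothetical MATLAB sketch of a symmetry-preserving measurement update (H, R, z, xhat, and P are assumed to come from your filter); the Joseph form preserves symmetry and positive semidefiniteness better than the textbook P = (I - K*H)*P update:
S = H*P*H' + R;                        % innovation covariance
K = (P*H') / S;                        % Kalman gain via right-divide, avoiding inv()
xhat = xhat + K*(z - H*xhat);          % state update
I = eye(size(P, 1));
P = (I - K*H)*P*(I - K*H)' + K*R*K';   % Joseph-form covariance update
P = (P + P')/2;                        % re-symmetrize against rounding error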

Find the best linear combination of two vectors resembling a third vector; implementing constraints

I have a vector z that I want to approximate by a linear combination of two other vectors (x, y), such that the residual between a*x + b*y and z is minimized. I also want to keep one coefficient (a) positive for the fitting.
Any suggestions as to which command may help?
Thanks!
If you didn't have a bound on one of the coefficients, your problem could have been viewed as multiple regression (solved in MATLAB by regress). Since one of the coefficients is bounded, you should use lsqlin, which solves least-squares problems with bounds or linear inequalities on the coefficients. Don't forget to include an all-ones intercept predictor if your signals are not centered.
I think that fminsearch would be overkill in this case, since lsqlin does exactly what you want.
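A hypothetical sketch of the lsqlin call (lsqlin(C, d, A, b, Aeq, beq, lb, ub) minimizes ||C*p - d|| subject to the given constraints):
C = [x(:), y(:)];                          % predictors as columns
d = z(:);
lb = [0; -Inf];                            % a >= 0, b unconstrained
p = lsqlin(C, d, [], [], [], [], lb, []);  % no inequalities, equalities, or upper bounds
a = p(1); b = p(2);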
You have to define a function which describes the cost: the lower the cost, the better the solution. The output must be a single scalar, e.g. the norm of the difference between a*x + b*y and z.
To avoid negative values for the coefficient a, add something like (a<0)*Inf to the cost. This rejects every solution with a negative a.
Having done so, use fminsearch for a numerical solution.
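A hypothetical sketch of that approach; a large finite penalty is used instead of Inf, since Inf values can stall fminsearch's simplex search:
cost = @(p) norm(p(1)*x + p(2)*y - z) + (p(1) < 0)*1e9;  % penalize a < 0
p = fminsearch(cost, [1; 1]);                            % arbitrary starting guess
a = p(1); b = p(2);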