Use of Galois Automorphisms in Homomorphic Encryption (SEAL)

SEAL (the Simple Encrypted Arithmetic Library) uses Galois Automorphisms to enable batched computations (i.e., adding and multiplying many values in parallel in a single operation).
The batching procedure is described in sections 5.6 Galois Automorphisms and 7.4 CRT Batching of the SEAL 2.3.1 manual.
In particular, the two sections above state that the following rings are isomorphic.
\prod_{i=0}^{n-1} \mathbb{Z}_t \cong \prod_{i=0}^{n-1} \mathbb{Z}_t[\zeta^{2i+1}] \cong \mathbb{Z}_t[x]/(x^n+1)
where \zeta is a primitive 2n-th root of unity modulo t.
The same sections also state that mapping plaintext tuples in \prod_{i=0}^{n-1} \mathbb{Z}_t to \mathbb{Z}_t[x]/(x^n+1) can be done using Galois Automorphisms.
More precisely, an n-dimensional \mathbb{Z}_t-vector plaintext can be thought of as a 2-by-(n/2) matrix, and the Galois Automorphisms would correspond to rotations of the columns and rows of that matrix.
Following the application of the Galois Automorphisms on the plaintext vector (rotations of the rows and columns), one can obtain a corresponding element in \mathbb{Z}_t[x]/(x^n+1), which will be used for batch computations.
My questions are the following.
1- Why is \mathbb{Z}_t[\zeta^{2i+1}] isomorphic to \mathbb{Z}_t ?
2- How are the Galois Automorphisms used precisely to map n-dimensional \mathbb{Z}_t-vector plaintexts to elements in \mathbb{Z}_t[x]/(x^n+1)?
Or stated differently, how does the Compose operation work? And how do you use Galois Automorphisms (row and column rotations) to compute it?
========================================================================

The isomorphism simply evaluates a polynomial at a root of unity to obtain an element of \mathbb{Z}_t. Note that this works because the relevant root of unity is itself in \mathbb{Z}_t. The entire batching system is just a big old Chinese Remainder Theorem: the batching slots are the reductions of the plaintext polynomial modulo x - \zeta^{2i+1} for the different i. Going back requires a standard CRT reconstruction.
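To make this concrete, here is a minimal Python sketch of that CRT view (illustrative only, not SEAL's API). It assumes t is a prime with t ≡ 1 (mod 2n), so that a primitive 2n-th root of unity ζ exists in \mathbb{Z}_t; decoding evaluates the plaintext polynomial at the odd powers of ζ, and encoding is the matching CRT reconstruction (an inverse DFT over \mathbb{Z}_t). The helper primitive_2n_root is a brute-force stand-in, not a SEAL function.

```python
# Minimal illustrative sketch of CRT batching -- not the SEAL API.
# Assumptions: n is a power of two and t is a prime with t = 1 (mod 2n),
# so that Z_t contains a primitive 2n-th root of unity zeta.

def primitive_2n_root(t, n):
    # Hypothetical brute-force helper: find an element of order exactly 2n mod t.
    for g in range(2, t):
        if pow(g, 2 * n, t) == 1 and pow(g, n, t) != 1:
            return g
    raise ValueError("no primitive 2n-th root of unity mod t")

def decode(poly, zeta, t, n):
    # Slot i = plaintext polynomial reduced mod (x - zeta^(2i+1)),
    # i.e. evaluated at the i-th odd power of zeta.
    return [sum(c * pow(zeta, (2 * i + 1) * k, t) for k, c in enumerate(poly)) % t
            for i in range(n)]

def encode(slots, zeta, t, n):
    # CRT reconstruction: the unique polynomial of degree < n that takes the
    # prescribed values at all the odd powers of zeta (an inverse DFT over Z_t).
    n_inv, zeta_inv = pow(n, -1, t), pow(zeta, -1, t)   # modular inverses (Python 3.8+)
    return [n_inv * sum(v * pow(zeta_inv, (2 * i + 1) * k, t)
                        for i, v in enumerate(slots)) % t
            for k in range(n)]

# Round trip on a toy parameter set: n = 4, t = 17 (17 = 1 mod 8).
n, t = 4, 17
zeta = primitive_2n_root(t, n)
assert decode(encode([1, 2, 3, 4], zeta, t, n), zeta, t, n) == [1, 2, 3, 4]
```

Componentwise addition and multiplication of the slot values then corresponds to addition and multiplication of the encoded polynomials modulo x^n + 1 and t, which is exactly what makes batching work.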
In practice the CRT is implemented through the Number Theoretic Transform (FFT over a finite field) and its inverse. The Galois automorphism acts on the roots of unity by permuting them, forming two orbits. If we order the plaintext matrix slots in a way that the batching slot value corresponding to the next Galois conjugate of a primitive root is always to the left (or right) of the slot value corresponding to that primitive root, then the Galois action will permute the rows of the matrix cyclically. The two orbits can also be interchanged, which corresponds to the column rotation (swap).
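To see the Galois action on the slots concretely, here is a hedged sketch built on the toy encode/decode above (again, not SEAL's internals). The automorphism x ↦ x^g for odd g sends the slot at ζ^{2i+1} to the slot at ζ^{g(2i+1)}, so applying it to an encoded polynomial permutes the slot values; g = 3 moves slots within the two orbits, while g = 2n − 1 exchanges the orbits. With a suitable ordering of the slots into the 2-by-(n/2) matrix, these two maps become the row and column rotations described above.

```python
def apply_galois(poly, g, t, n):
    # Apply the ring automorphism x -> x^g (g odd) to a polynomial mod (x^n + 1, t):
    # the coefficient of x^k moves to exponent k*g mod 2n, picking up a sign flip
    # when it wraps past n, because x^n = -1 in this ring.
    out = [0] * n
    for k, c in enumerate(poly):
        e = (k * g) % (2 * n)
        if e < n:
            out[e] = (out[e] + c) % t
        else:
            out[e - n] = (out[e - n] - c) % t
    return out

# Using n, t, zeta and encode/decode from the sketch above:
poly = encode([1, 2, 3, 4], zeta, t, n)
print(decode(apply_galois(poly, 3, t, n), zeta, t, n))          # slots permuted within the two orbits
print(decode(apply_galois(poly, 2 * n - 1, t, n), zeta, t, n))  # the two orbits exchanged
```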
Matters are further complicated by the fact that the NTT algorithm that SEAL uses results in a so-called "bit reversed" output order. This needs to be taken into account when the correct ordering of the batching values is determined before any NTT or inverse NTT can be performed.
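For reference, the bit-reversed order is the standard output permutation of an in-place radix-2 FFT/NTT; a small illustrative helper (not SEAL's implementation) could look like this:

```python
def bit_reverse(index, bits):
    # Reverse the lowest `bits` bits of `index`; e.g. bit_reverse(0b0011, 4) == 0b1100.
    out = 0
    for _ in range(bits):
        out = (out << 1) | (index & 1)
        index >>= 1
    return out

def to_bit_reversed_order(values):
    # Permute a list of length 2^k into the order an in-place radix-2 NTT produces
    # (the permutation is an involution, so applying it twice restores the input).
    bits = (len(values) - 1).bit_length()
    return [values[bit_reverse(i, bits)] for i in range(len(values))]
```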

Related

How to compute whether representational similarity matrix values are significant

I am new to RSA analysis of fMRI images. I used SPM 12 for preprocessing and first-level analysis of my fMRI images, and used the RSA-toolbox to compute RDMs (representational dissimilarity matrices) for my conditions in a specific region of the brain. Now I have the RDM matrix for every single subject and also the overall RDM across all subjects. However, the RSA-toolbox doesn't report any p-value or significance test for the values in the RDM. How can I compute or determine which values in the RDM matrix are significant and which are not? I used Pearson's r to compute the RDMs. In particular, I would like an explanation of the mathematics that can be used to test the significance of these values.

How to do binary linear algebra on a sparse matrix in Matlab (or any other language)?

I have sparse binary matrices whose properties I want to analyze over the binary field. The application is to analyze some sparse, binary error-correcting codes. The matrices themselves are too big to handle as full dense matrices, with sizes on the order of 10,000 x 30,000 and bigger, even though only a small percentage of entries are filled. I want to be able to do binary linear algebra while exploiting the matrices' sparsity.
The two main things I will need to do are:
- finding a basis of the intersection of its row space with the row space of another sparse matrix
- finding its rank
I've seen that there are some packages to find subspace intersections (e.g. this MuPAD function) and to find the rank of a matrix over different fields (like gfrank), but they take a prohibitively long time for the matrices I'm working with.
Is there anything like this available? Or any tricks that can be used to do this? If this is possible in another programming language that would also be helpful.
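One common trick for the rank part, sketched below under the assumption that each row fits in an arbitrary-precision Python integer (this is not the MuPAD or gfrank routine mentioned above), is bit-packed Gaussian elimination over GF(2): XOR acts on a whole row at once, and the number of pivots is the rank.

```python
def gf2_rank(rows):
    # rows: iterable of Python ints, where bit j of row i encodes entry (i, j) of the
    # binary matrix.  Builds an XOR "basis" keyed by pivot (leading-bit) position;
    # the number of pivots is the rank over GF(2).
    basis = {}                            # pivot bit position -> reduced row
    for row in rows:
        cur = row
        while cur:
            b = cur.bit_length() - 1      # position of the current leading 1
            if b in basis:
                cur ^= basis[b]           # eliminate that 1 and keep reducing
            else:
                basis[b] = cur            # new pivot: this row is independent
                break
    return len(basis)

def pack_rows(matrix):
    # Pack each 0/1 row (a list) into a single int so XOR handles a whole row at once.
    return [sum(bit << j for j, bit in enumerate(row)) for row in matrix]

# Tiny example: the third row is the sum of the first two, so the rank is 2.
assert gf2_rank(pack_rows([[1, 0, 1], [0, 1, 1], [1, 1, 0]])) == 2
```

The row-space intersection asked about above could be built on the same kind of elimination (for instance via the Zassenhaus sum-and-intersection construction), but that part is not sketched here.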

Sparse boolean matrix multiplication

Does anybody know of an efficient implementation of sparse boolean matrix multiplication? I'm interested in both CPU and GPGPU implementations because I need to multiply matrices of different sizes (from 8x8 up to 10^8 x 10^8). Currently I use the cuSPARSE library, but it supports only numerical matrices (float, double, etc.), and this leads to huge overhead in memory and time, which is critical in my task.
Since a boolean matrix can be viewed as the adjacency matrix of some (bipartite) graph, its product with another matrix can be interpreted as the distance 2 connections between the nodes of two subgraphs linked by a common set of nodes.
To avoid wasting space and exploit some amount of bit parallelism, you could try using some form of succinct data structure for graph storage and manipulation.
One such family of data structures which could be useful in your case is the K2-tree (or Kn-tree in general), which stores the adjacencies using an approach similar to spatial decompositions such as quad- and oct-trees.
Ultimately, the best algorithm and data structure will heavily depend on the dimension and sparsity patterns of your matrices.
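As a rough CPU-side illustration of that graph view (not tied to cuSPARSE or any particular library), the boolean product can be formed row by row with set unions, which computes exactly the distance-2 connections described above; the row sets could equally be packed into integers for bit-parallel ORs.

```python
def bool_spmm(a_rows, b_rows):
    # a_rows[i]: set of columns k with A[i][k] == 1; b_rows[k]: set of columns j with
    # B[k][j] == 1.  Returns the rows of C = A x B over the boolean semiring:
    # C[i][j] == 1 iff some k has A[i][k] == B[k][j] == 1, i.e. there is a length-2
    # path i -> k -> j in the bipartite-graph view.
    c_rows = []
    for cols in a_rows:
        row = set()
        for k in cols:
            row |= b_rows[k]          # union of the neighbourhoods of i's neighbours
        c_rows.append(row)
    return c_rows

# Tiny example: a 2x3 matrix times a 3x2 matrix, rows stored as column-index sets.
A = [{0, 2}, {1}]
B = [{1}, {0}, {0, 1}]
print(bool_spmm(A, B))                # [{0, 1}, {0}]
```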

Is it possible to calculate Euclidean Distance between two varying length matrices?

I have started working on an online signature dataset for verification purposes. I have two matrices containing digitized data of two signatures of varying length (the number of rows differs), e.g. one is 177×7 and the second is 170×7.
I want to treat each column as one time series and I'd like to compare one time series of a signature with the corresponding time series of second signature.
How should I align the two time series?
I think this question really belongs on Math.StackExchange, but I will do my best to answer it here. The short answer is that the Euclidean distance cannot be applied in this case and you will need to define your own notion of distance. This may or may not actually be feasible.
The notion of distance relies on the existence of a "metric" defined on the space of interest. If your vectors are of different lengths then traditional metrics (including the Euclidean distance) are ill-defined and you need to define a new metric that works for you.
There are two things you'll need to do here:
Define the space you're working with. This seems to be the set of vectors of length 177 or length 170. This is a very unusual set.
Define your metric (and ensure that it actually meets all the properties of a metric).
The most obvious solution is to project vectors of length 177 into the space of vectors of length 170 and then compute the Euclidean distance as usual. For example, you could just ignore the last 7 elements of the vector. Note that this is not a metric on your original set as it violates the condition ( d(x,y)=0 iff x=y ), but it is a metric on the projected vectors. There may be a clever solution on the original set, but there is not an obvious one. Again, the people on Math.StackExchange may be able to help you more.
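A hedged sketch of that projection idea in plain Python (the function name and data layout are assumptions, not something from the question): truncate both signatures to the common number of rows, then compare them column by column with the ordinary Euclidean distance. This only makes the lengths match; it does not solve the alignment problem.

```python
import math

def truncated_column_distances(sig_a, sig_b):
    # sig_a, sig_b: two signatures as lists of rows, each row holding the 7 channel
    # values sampled at one time step (so the two lists may have different lengths).
    # Truncate to the common number of rows, then compare column by column.
    m = min(len(sig_a), len(sig_b))
    n_cols = len(sig_a[0])
    return [math.sqrt(sum((sig_a[r][c] - sig_b[r][c]) ** 2 for r in range(m)))
            for c in range(n_cols)]
```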

Clustering: a training dataset of variable data dimensions

I have a dataset of n data points, where each data point is represented by a set of extracted features. Generally, clustering algorithms require all input data to have the same dimension (the same number of features); that is, the input X is an n*d matrix of n data points, each of which has d features.
In my case, I've previously extracted some features from my data, but the number of extracted features per data point is most likely to differ (I mean, I have a dataset X where the data points do not all have the same number of features).
Is there any way to adapt them, in order to cluster them using some common clustering algorithms that require the data to have the same dimension?
Thanks
Sounds like the problem you have is that it's a 'sparse' data set. There are generally two options.
Reduce the dimensionality of the input data set using multi-dimensional scaling techniques, for example sparse SVD (e.g. the Lanczos algorithm) or sparse PCA. Then apply traditional clustering to the dense, lower-dimensional outputs (a minimal sketch of this option follows after these two points).
Directly apply a sparse clustering algorithm, such as sparse k-means. Note you can probably find a PDF of this paper if you look hard enough online (try scholar.google.com).
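A minimal sketch of option 1, assuming SciPy and scikit-learn are acceptable tools (the library choice and all parameters here are assumptions, not part of the original answer):

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds
from sklearn.cluster import KMeans

# Toy sparse data: 1000 points, 5000 features, about 1% of entries filled.
X = sparse_random(1000, 5000, density=0.01, format="csr", random_state=0)

# Reduce to a dense, low-dimensional representation with a sparse (Lanczos-based) SVD...
U, s, Vt = svds(X, k=20)
X_low = U * s                      # project each point onto the top singular directions

# ...then cluster the dense low-dimensional points with an ordinary algorithm.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X_low)
```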
[Updated after problem clarification]
In the problem, a handwritten word is analyzed visually for connected components (lines). For each component, a fixed number of multi-dimensional features is extracted. We need to cluster the words, each of which may have one or more connected components.
Suggested solution:
Classify the connected components first, into 1000(*) unique component classifications. Then classify the words against the classified components they contain (a sparse problem described above).
*Note: the exact number of component classifications you choose doesn't really matter, as long as it's high enough, since the MDS analysis will reduce them to the essential 'orthogonal' classifications.
There are also clustering algorithms, such as DBSCAN, that in fact do not care about the dimensionality of your data. All this algorithm needs is a distance function. So if you can specify a distance function for your features, then you can use DBSCAN (or OPTICS, which is an extension of DBSCAN that doesn't need the epsilon parameter).
So the key question here is how you want to compare your features. This doesn't have much to do with clustering, and is highly domain dependent. If your features are e.g. word occurrences, cosine distance is a good choice (using 0s for non-present features). But if you e.g. have a set of SIFT keypoints extracted from a picture, there is no obvious way to relate the different features with each other efficiently, as there is no order to the features (so one cannot simply compare the first keypoint of one picture with the first keypoint of another, etc.). A possible approach here is to derive another - uniform - set of features. Typically, bag-of-words features are used for such a situation. For images, this is also known as visual words. Essentially, you first cluster the sub-features to obtain a limited vocabulary. Then you can assign each of the original objects a "text" composed of these "words" and use a distance function such as cosine distance on them.
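Here is a hedged sketch of that bag-of-words construction in Python (the use of scikit-learn's KMeans and the helper names are assumptions, not something the original answer prescribes): cluster all sub-features into a vocabulary, histogram each object over that vocabulary, and then compare the fixed-length histograms with a cosine distance.

```python
import numpy as np
from sklearn.cluster import KMeans

def bag_of_words_features(objects, vocab_size=100, random_state=0):
    # objects: list of arrays, one per object, of shape (n_i, d) -- a variable
    # number of d-dimensional sub-features (e.g. SIFT keypoints) per object.
    # Returns one fixed-length histogram ("bag of visual words") per object.
    # Requires vocab_size <= total number of sub-features across all objects.
    all_feats = np.vstack(objects)
    vocab = KMeans(n_clusters=vocab_size, n_init=10,
                   random_state=random_state).fit(all_feats)
    hists = np.zeros((len(objects), vocab_size))
    for i, feats in enumerate(objects):
        words = vocab.predict(feats)                 # assign each sub-feature to a "word"
        hists[i] = np.bincount(words, minlength=vocab_size)
    return hists

def cosine_distance(u, v):
    # 1 - cosine similarity; well-defined once every object has the same length.
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
```

The resulting histograms, or a precomputed matrix of such distances, can then be handed to a distance-based algorithm such as DBSCAN, as suggested above.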
I see two options here:
Restrict yourself to those features for which all your data-points have a value.
See if you can generate sensible default values for missing features.
However, if possible, you should probably resample all your data-points, so that they all have values for all features.
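As a small sketch of these two options (assuming the features are identified by name, which is an assumption about the data rather than something stated in the question): either keep only the feature names present in every record, or use every feature seen anywhere and fill the gaps with a default value.

```python
import numpy as np

def common_feature_matrix(records):
    # Option 1: restrict to the features that every record provides.
    names = sorted(set.intersection(*(set(r) for r in records)))
    return names, np.array([[r[n] for n in names] for r in records])

def filled_feature_matrix(records, default=0.0):
    # Option 2: use every feature seen anywhere, filling missing values with a default.
    names = sorted(set.union(*(set(r) for r in records)))
    return names, np.array([[r.get(n, default) for n in names] for r in records])

# records: list of dicts mapping feature name -> value, e.g.
# records = [{"a": 1.0, "b": 2.0}, {"a": 3.0, "c": 4.0}]
```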