Can we multiply only specific slots in ciphertexts using Microsoft SEAL? - seal

For example : In a BFV scheme, using BatchEncoder, we have a ciphertext of some numbers [1,2,3,4,5,0,0,0.......] and we have another ciphertext - [-1, -2, -3, -4, -5, -6, 0, 0......] and we add the two together.
The output for it would be - [0,0,0,0,0,-6,0.....].
Now, if i just want the values of the first five slots, how do i do that? Is it possible?
Also, is it possible for us to multiply only, let's say, the first 5 slots of a ciphertext with the first 5 slots of another ciphertext?

Related

dual function not working when using several measures in a plot

I am making a barplot with two measures both using the dual() function like so:
IF(AMOUNT>4,AMOUNT,dual(num('5','<#'),4))
I want the number formatting to be by measure expression so that all numbers below 5 are displayed as "<5" instead of their actual value.
For example, if:
AMOUNT = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
AMOUNT displayed in plot = [10, 9, 8, 7, 6, 5, <5, <5, <5, <5]
When I use only one of the measures the plot works fine and everything is displayed correctly.
However, if I use both measures, the chart displays "4" (actual value) instead of "<5" (the measure expression).
Does anyone know how to make the plot display the measure-expression when using several measures?
I am not sure how exactly you want chart suppose to look like (I understand that "AMOUNT" is a dimension but what is a measure - bar height?). Maybe you could share example chart (can be even in paint) and screenshot of data-model. Generally you should modify only dimension and set it just as:
=If(AMOUNT<5,'<5',AMOUNT)
Expression leave as it is so for example:
Count(Field)
Sum(Field)

How do I write MATLAB code for a DAC converter ?

In the first step I generated a sequence of bits (0,1)..
I used a randi command x = randi([0 1],1,3) to generate random bits
I stuck with these 2 steps :
Divide sequence by 3 bits into 8 levels
[000, 001, 010, 011, 100, 101, 110, 111]
For each quantum level assigns amplitude value from the range [-2, 2]
I won't provide the full source code to leave a bit of the homework for you, but I will give you some hints:
randi() is creating a sequence of 0 and 1 floating point numbers
Look at the documentation of function bitpack. This allows you to pack your bits from array elements into a single byte. Be aware that you need to provide an 8 element array of "bits" to fill a byte. User 'uint8' as the class argument.
before passing the array of floating point numbers to bitpack you have to convert it to a logical array by using the logical() function.
look at the documentation of linspace() to create an array with 8 elements containing your equally spaces amplitude values
lookup the amplitude value in this array for each "digital" value.

Using a neural network to provide recommendations

So I started playing with FANN (http://leenissen.dk/) in order to create a simple recommendation engine.
For example,
User X has relations to records with ids [1, 2, 3]
Other users have relations to following ids:
User A: [1, 2, 3, 4]
User B: [1, 2, 3, 4]
It would be natural, then, that there's some chance user X would be interested in record with id 4 as well and that it should be the desired output of the recommendation engine.
It feels like this would be something a neural network could accomplish. However, from trying out FANN and googling around, it seems there seems to need to be some mathematical relation with the data and results. Here with ids there's none; the ids could just as well be any symbols.
Question: Is it possible to solve this kind of problem with a neural network and where should I begin to search for a solution?
What you are looking for is some kind of recurrent neural network; a network that stores 'context' in some way or another. Examples of such networks would be LSTM and GRU. So basically, you have to input your data sequentially. Based on the context and the current input, the network will predict which label is most likely.
it seems there seems to need to be some mathematical relation with the data and results. Here with ids there's none; the ids could just as well be any symbols.
There is a definitely relation between the data and the results, and this can be expressed through weights and biases.
So how would it work? First you one-hot encoding your inputs and outputs. So basically, you want to predict which label is most likely after a set of labels that a user has already interacted with.
If you have 5 labels: A, B, C, D, E that means you will have 5 inputsand outputs: [0, 0, 0, 0, 0].
If your label is A, the array will be [1, 0, 0, 0, 0], if it's D, it will be [0, 0, 0, 1, 0].
So the key to LSTM's and GRU's that the data should be sequential. So basically, you input all the labels watched one by one. So if a user has watched A, B and C:
activate: [1,0,0,0,0]
activate: [0,1,0,0,0]
// the output of this activation will be the next predicted label
activate: [0,0,1,0,0]
// output: [0.1, 0.3, 0.2, 0.7, 0.5], so the next label is D
And you should always train the network so that the output of INt is INt+1

K-means Coding Implementation

I am looking for an implementation of k-means that will out where each row of data belongs too.
I have found other links like Matlab:K-means clustering
But they do not help.
So I am looking for something like this. If my data is as follows
1, 2, 4, 5, 6, 7, 8, 9
1, 4, 7, 8, 9, 4, 5, 6
I would like to know that Row 1 Belongs to Cluster A and Row 2 Belongs to Cluster B and so on.
Does anyone know if Matlab can show me that, if so how? If not does anyone have a link to some code that would be able to do that?
Yes, the kmeans command from Statistics Toolbox will do this. Here's an example using the Fisher Iris dataset that is supplied with the toolbox. meas is a 100x4 dataset of four anatomical variables (petal length, petal width, sepal length, sepal width) measured on 150 irises. The output variable, which I've here called clusterIndex, tells you which cluster each row of the dataset falls into, and can be used, for example, as a variable to color points in a plot.
>> load fisheriris
>> k = 3;
>> clusterIndex = kmeans(meas,3);
>> scatter(meas(:,1),meas(:,2),[],clusterIndex,'filled')

Is my subset sum algorithm of polynomial time?

I came up with a new algorithm to solve the subset sum problem, and I think it's in polynomial time. Tell me I'm either wrong or a total genius.
Quick starter facts:
Subset sum problem is an NP-complete problem. Solving it in polynomial time means that P = NP. The number of subsets in a set of length N, is 2^N.
On the more useful hand, the number of unique subsets in the length N is at maximum the sum of the whole set minus the smallest element, or, the range of sums that any subset can possibly produce is between the sum of all the negative elements and the sum of all the positive elements, since no sum can possibly be bigger or smaller than all the positive or negative sums, which grow at a linear rate when we add extra elements.
What this means is that as N increases, the number of duplicate subsets increases exponentially, and the number of unique, useful subsets increases only linearly. If an algorithm could be devised that could remove the duplicate subsets at the earliest possible opportunity, we would run in polynomial time. A quick example is easily taken from binary. From only the numbers that are powers of two, we can create unique subsets for any integral value. As such, any subset involving any other number (if we had all powers of two) is a duplicate and a waste. By not computing them and their derivatives, we can save virtually all the running time of the algorithm, since the numbers which are powers of two are logarithmically occurring compared to any integer.
As such, I propose a simple algorithm that will remove these duplicates and save having to compute them and all their derivatives.
To begin with, we'll sort the set which is only O(N log N), and split it into two halves, positive and negative. The procedure for the negative numbers is identical, so I'll only outline the positive numbers (the set now means just the positive half, just for clarification).
Imagine an array indexed by sum, which has entries for all the possible result sums of the positive side (which is only linear, remember). When we add an entry, the value is the entries in the subset. So like, array[3] = { 1, 2 }.
In general, we now move to enumerate all subsets in the set. We do this by starting with the subsets of one length, then adding to them. When we have all the unique subsets, they form an array, and we simply iterate them in the fashion used in Horowitz/Sahni.
Now we start with the "first generation" values. That is, if there were no duplicate numbers in the original data set, there are guaranteed to be no duplicates in these values. That is, all single-value subsets, and all length of the set minus one length subsets. These can easily be generated by summing the set, and subtracting each element in turn. In addition the set itself is a valid first generation sum and subset, as well as each individual element of the subset.
Now we do the second generation values. We loop through each value in the array and for each unique subset, if it doesn't have it, we add it and compute the new unique subset. If we have a duplicate, fun occurs. We add it to a collision list. When we come to add new subsets, we check if they're on the collision list. We key by the less desirable (normally larger, but can be arbitrary) equal subset. When we come to add to subsets, if we would generate a collision, we simply do nothing. When we come to add the more desirable subset, it misses the check and adds, generating the common subset. Then we just repeat for the other generations.
By removing duplicate subsets in this manner, we don't have to keep combining the duplicates with the rest of the set, nor keep checking them for collisions, nor sum them. Most importantly, by not creating new subsets that are non-unique, we're not generating new subsets from them, which can, I believe, turn the algorithm from NP to P, since the growth of subsets is no longer exponential- we discard the vast majority of them before they can "reproduce" in the next generation and create more subsets by being combined with the other non-duplicate subsets.
I don't think I've explained this too well. I have pictures... they're in my head. The important thing is that by discarding duplicate subsets, you could remove virtually all of the complexity.
For example, imagine (because I'm doing this example by hand) a simple dataset that goes -7 to 7 (not zero) for which we aim at zero. Sort and split, so we're left with (1, 2, 3, 4, 5, 6, 7). The sum is 28. But 2^7 is 128. So 128 subsets fit in the range 1 .. 28, meaning that we know in advance that 100 sets are duplicates. If we had 8, then we'd only have 36 slots, but now 256 subsets. So you can easily see that the number of dupes would now be 220, greater than double what it was before.
In this case, the first generation values are 1, 2, 3, 4, 5, 6, 7, 28, 27, 26, 25, 24, 23, 22, 21, and we map them to their constituent components, so
1 = { 1 }
2 = { 2 }
...
28 = { 1, 2, 3, 4, 5, 6, 7 }
27 = { 2, 3, 4, 5, 6, 7 }
26 = { 1, 3, 4, 5, 6, 7 }
...
21 = { 1, 2, 3, 4, 5, 6 }
Now to generate the new subsets, we take each subset in turn and add it to each other subset, unless they have a mutual subsubset, e.g. 28 and 27 have a hueg mutual subsubset. So when we take 1 and we add it to 2, we get 3 = { 1, 2 } but owait! It's already in the array. What this means is that we now don't add 1 to any subset that already has 2 in it, and vice versa, because that's a duplicate on 3's subsets.
Now we have
1 = { 1 }
2 = { 2 }
// Didn't add 1 to 2 to get 3 because that's a dupe
3 = { 3 } // Add 1 to 3, amagad, get a duplicate. Repeat the process.
4 = { 4 } // And again.
...
8 = { 1, 7 }
21? Already has 1 in.
...
27? We already have 28
Now we add 2 to all.
1? Existing duplicate
3? Get a new duplicate
...
9 = { 2, 7 }
10 = { 1, 2, 7 }
21? Already has 2 in
...
26? Already have 28
27? Got 2 in already.
3?
1? Existing dupe
2? Existing dupe
4? New duplicate
5? New duplicate
6? New duplicate
7? New duplicate
11 = { 1, 3, 7 }
12 = { 2, 3, 7 }
13 = { 1, 2, 3, 7 }
As you can see, even though I am still adding new subsets each time, the quantity is only going up linearly.
4?
1? Existing dupe
2? Existing dupe
3? Existing dupe
5? New duplicate
6? New duplicate
7? New duplicate
8? New duplicate
9? New duplicate
14 = {1, 2, 4, 7}
15 = {1, 3, 4, 7}
16 = {2, 3, 4, 7}
17 = {1, 2, 3, 4, 7}
5?
1,2,3,4 existing duplicate
6,7,8,9,10,11,12 new duplicate
18 = {1, 2, 3, 5, 7}
19 = {1, 2, 4, 5, 7}
20 = {1, 3, 4, 5, 7}
21 = new duplicate
Now we have every value in the range, so we stop, and add to our list 1-28. Repeat for negative numbers, iterate through lists.
Edit:
This algorithm is totally wrong in any case. Subsets which have duplicate sums are not duplicates for the purposes of which subsets can be spawned from them, because they are arrived at differently- i.e., they cannot be folded.
This does not prove P = NP.
You have failed to consider the possibility where the positive numbers are: 1, 2, 4, 8, 16, etc... and so there will be no duplicates when you sum subsets, so it will run in O(2^N) time in this case.
You can treat this as a special case but still the algorithm is still not polynomial for other similar cases. This assumption that you made is where you go away from the NP-complete version of subset sum to solving only easy (polynomial time) problems:
[assume the sum of the positive numbers grows] at a linear rate when we add extra elements.
Here you are effectively assuming that P (i.e. number of bits required to state the problem) is smaller than N. Quote from Wikipedia:
Thus, the problem is most difficult if N and P are of the same order.
If you assume that N and P are of the same order then you can't assume that the sum grows linearly indefinitely as you add more elements. As you add more elements to your set those elements also need to get larger to ensure that problem remains hard to solve.
If P (the number of place values) is a small fixed number, then there are dynamic programming algorithms that can solve it exactly.
You have rediscovered one of these algorithms. It's a nice piece of work but it isn't something new and it doesn't prove P = NP. Sorry!
Dead MG,
It has been almost half a year since you posted but I will answer anyway.
Mark Byers wrote most of what should be written.
The algorithm is known.
Such algorithms are known as generating functions algorithms or simply as dynamic programming algorithms.
Your algorihtm has very important feature, the so called pseudopolynomial complexity.
Traditional complexity is a function of the size of the problem. In terms of traditional complexity your algorithm has O(2^n) pessimistic complexity (that is for the numbers 1,2, 4,... as was mentioned earlier )
The complexity of your algorithm algorithm can be alternatively expressed as the function of the size of the problem and the size of some numbers in the problem. For your algorithm it would be something like O(nw) where w is the number of distinct sums.
This is psuedopolynomial complexity. It is a VERY important feature. Such algorithms can solve lots of real-world problem instances, despite problem complexity class.
Horowitz and Sahni algorithm has pessimistic complexity O(2^N/2). This is not two times better than your algorithm but lot's more - 2^N/2 times better than your algorithm. What Greg probably meant was that Horowitz and Sahni algorithm can solve twice as big instances of the problem (having twice as many numbers in the subset sum)
That's true in theory but in practice Horowitz and Sahni can solve (on home computers) instances with about 60 numbers, while the algorithm similiar to yours can handle even instances with 1000 numbers (provided that the numbers aren't too big themselves)
In fact the two algorithms can even be mixed, that is of your kind and of Horowitz and Sahni algorithm. Such solution has both pseudopolynomial complexity and pessimistic complexity of O(2^n/2).
A trained computer sciencist can construct such algorithm as yours by means of generating functions theory.
Both trained and untrained can think it up the way you did.
Do not necessarily think in terms "is it known?". It should be important to you that you can invent such algorithm on your own. It means that you probably can invent other important algorithms on your own and someday one that isn't known maybe. Knowing current progress in the field and what's in the literature helps. Otherwise you will keep on reinventing the wheel.
When I was way back in high school I reinvented Dijkstra algorithm. My version had terrible complexity because I didn't know anything about data structures. Anyway, I am still proud of myself.
If you are still studying pay attention to generating functions theory.
You may also want to check out on wiki:
psuedopolynomial time
weakly NP-complete
strongly NP-complete
generating functions
Megli
What this means is that as N increases, the number of duplicate subsets increases exponentially, and the number of unique, useful subsets increases only linearly.
Not necessarily - the number of duplicate subset sums is also determined by the value of the number closest to zero in the set (that the greater the minimum distance to zero - the fewer the duplicate subset sums for the set).
In general, we now move to enumerate all subsets in the set.
Unfortunately, enumerating all the sums of the subsets of the set requires performing an exponential number of addition operations (2^7 or 128 in your example). Otherwise, how would the algorithm determine what the unique sums happen to be? So, although the steps that follow the first step could very well have a polynomial running time, the algorithm as a whole has exponential complexity (because an algorithm is only as fast as its slowest part).
Incidentally, best known algorithm for solving the subset sum problem (Horowitz and Sahni, 1974) has O(2^N/2) complexity - which makes it about twice as fast as this algorithm.