Challenging Algorithmic Modular Multiples - numbers

While practicing coding problems, I ran into this challenging problem:
Say I have a 7-variable equation, (A+B+C+C+D+B)(E+F+B+C)(G+F+F), and a huge list with up to 3500 numbers:
A 1, 2, 3; B 1, 2; C 9, 1; D 1; E 2; F 1; G 1
This list basically gives all the possible values that each variable can have. The question is, how many ways can I choose values of variables such that the resulting number is a multiple of seven?
For example, I could choose from the above list
A = 1, B = 1, C = 1, D = 1, E = 2, F = 1, G = 1. This would make the number (1+2+1+1+1+1)(2+1+1+1)(1+1+1) = 35, which is indeed a multiple of seven.
My solution was to test every possible combination of the seven variables, and check if that sum was a multiple of seven. However, that solution is obviously very slow. Does anyone have a more efficient solution for this problem?

while if the first number is a multiple of 7 than no matter what you times it by it well remain a multiple of 7
so (E+F+B+C)(G+F+F) is completely irrelevant.
if (E+F+B+C) is a multiple of 7 then the total is also a multiple of 7 same with (G+F+F).
If none of them are multiples of 7 than I don't think the answer can be a multiple of 7. At least, I cannot think of a case where this could happen.
ex no amount of 10s will ever be devisible by 7 except say 7 or 14 10s

Related

Hashing functions and Universal Hashing Family

I need to determine whether the following Hash Functions Set is universal or not:
Let U be the set of the keys - {000, 001, 002, 003, ... ,999} - all the numbers between 0 and 999 padded with 0 in the beginning where needed. Let n = 10 and 1 < a < 9 ,an integer between 1 and 9. We denote by ha(x) the rightmost digit of the number a*x.
For example, h2(123) = 6, because, 2 * 123 = 246.
We also denote H = {h1, h2, h3, ... ,h9} as our set of hash functions.
Is H is universal? prove.
I know I need to calculate the probability for collision of 2 different keys and check if it's smaller or equal to 1/n (which is 1/10), so I tried to separate into cases - if a is odd or even, because when a is even the last digit of a*x will be 0/2/4/6/8, else it could be anything. But it didn't help me so much as I'm stuck on it.
Would be very glad for some help here.

Find the largest x for which x^b+a = a

Stability (Numerical analysis)
Trying to apply the answer I saw in this question, a+x=a worked just fine with a+eps(a)/2. Suppose we have x^b+a=a, where b is a small integer, say 3 and a=2000. Then a+(eps(a))^3 or a+(eps(a)/2)^3 will always return number a. Can someone help with the measurement of x? Any way, even different from eps will do just fine.
p.s. 1938+(eps(1938)/0.00000000469)^3 is the last number that returns ans = 1.9380e+003.
1938+(eps(1938)/0.0000000047)^3 returns a=1938. Does that have to do with anything?
x = (eps(a)/2).^(1/(b-eps(a)/2))
if b = 3,
(eps(1938)/2).^(1/(3-eps(1938)/2)) > eps(1938)/0.0000000047

MATLAB, how to evaluate multiple indices in one line?

I don't know how to explain this better than by giving you an example.
Suppose I have the following array:
a = magic(6)
And then I take a 'slice' of that like this:
a(:,1)
It will print:
35
3
31
8
30
4
Now I want the first number, so I want to write:
a(:,1)(1)
Instead of:
b = a(:,1)
b(1)
Also, is there a way to do something like this (assignment and comparison, i.e. set b, then evaluate against it):
(b = a(:,1))(1)
Ok, here's an update with a function where it isn't trivial to use a(1, 1)
come_on = sprintf('%i, ', magic(3));
come_on(1:end-2)
8, 3, 4, 1, 5, 9, 6, 7, 2
Also, what if I only want the first 4 numbers on magic(3)?
It would be better to write
sprintf('%i, ', magic(3)(1:4))(1:end-2)
instead of tens of lines, MHO.
You cannot concatenate indexing as foo(1)(2)(3). However, you can index multiple dimensions at once. So in this case, a(1,1) will give you what you want.

How to select the last column of numbers from a table created by FoldList in Mathematica

I am new to Mathematica and I am having difficulties with one thing. I have this Table that generates 10 000 times 13 numbers (12 numbers + 1 that is a starting number). I need to create a Histogram from all 10 000 13th numbers. I hope It's quite clear, quite tricky to explain.
This is the table:
F = Table[(Xi = RandomVariate[NormalDistribution[], 12];
Mu = -0.00644131;
Sigma = 0.0562005;
t = 1/12; s = 0.6416;
FoldList[(#1*Exp[(Mu - Sigma^2/2)*t + Sigma*Sqrt[t]*#2]) &, s,
Xi]), {SeedRandom[2]; 10000}]
The result for the following histogram could be a table that will take all the 13th numbers to one table - than It would be quite easy to create an histogram. Maybe with "select"? Or maybe you know other ways to solve this.
You can access different parts of a list using Part or (depending on what parts you need) some of the more specialised commands, such as First, Rest, Most and (the one you need) Last. As noted in comments, Histogram[Last/#F] or Histogram[F[[All,-1]]] will work fine.
Although it wasn't part of your question, I would like to note some things you could do for your specific problem that will speed it up enormously. You are defining Mu, Sigma etc 10,000 times, because they are inside the Table command. You are also recalculating Mu - Sigma^2/2)*t + Sigma*Sqrt[t] 120,000 times, even though it is a constant, because you have it inside the FoldList inside the Table.
On my machine:
F = Table[(Xi = RandomVariate[NormalDistribution[], 12];
Mu = -0.00644131;
Sigma = 0.0562005;
t = 1/12; s = 0.6416;
FoldList[(#1*Exp[(Mu - Sigma^2/2)*t + Sigma*Sqrt[t]*#2]) &, s,
Xi]), {SeedRandom[2]; 10000}]; // Timing
{4.19049, Null}
This alternative is ten times faster:
F = Module[{Xi, beta}, With[{Mu = -0.00644131, Sigma = 0.0562005,
t = 1/12, s = 0.6416},
beta = (Mu - Sigma^2/2)*t + Sigma*Sqrt[t];
Table[(Xi = RandomVariate[NormalDistribution[], 12];
FoldList[(#1*Exp[beta*#2]) &, s, Xi]), {SeedRandom[2];
10000}] ]]; // Timing
{0.403365, Null}
I use With for the local constants and Module for the things that are other redefined within the Table (Xi) or are calculations based on the local constants (beta). This question on the Mathematica StackExchange will help explain when to use Module versus Block versus With. (I encourage you to explore the Mathematica StackExchange further, as this is where most of the Mathematica experts are hanging out now.)
For your specific code, the use of Part isn't really required. Instead of using FoldList, just use Fold. It only retains the final number in the folding, which is identical to the last number in the output of FoldList. So you could try:
FF = Module[{Xi, beta}, With[{Mu = -0.00644131, Sigma = 0.0562005,
t = 1/12, s = 0.6416},
beta = (Mu - Sigma^2/2)*t + Sigma*Sqrt[t];
Table[(Xi = RandomVariate[NormalDistribution[], 12];
Fold[(#1*Exp[beta*#2]) &, s, Xi]), {SeedRandom[2];
10000}] ]];
Histogram[FF]
Calculating FF in this way is even a little faster than the previous version. On my system Timing reports 0.377 seconds - but such a difference from 0.4 seconds is hardly worth worrying about.
Because you are setting the seed with SeedRandom, it is easy to verify that all three code examples produce exactly the same results.
Making my comment an answer:
Histogram[Last /# F]

Generate a hash sum for several integers

I am facing the problem of having several integers, and I have to generate one using them. For example.
Int 1: 14
Int 2: 4
Int 3: 8
Int 4: 4
Hash Sum: 43
I have some restriction in the values, the maximum value that and attribute can have is 30, the addition of all of them is always 30. And the attributes are always positive.
The key is that I want to generate the same hash sum for similar integers, for example if I have the integers, 14, 4, 10, 2 then I want to generate the same hash sum, in the case above 43. But of course if the integers are very different (4, 4, 2, 20) then I should have a different hash sum. Also it needs to be fast.
Ideally I would like that the output of the hash sum is between 0 and 512, and it should evenly distributed. With my restrictions I can have around 5K different possibilities, so what I would like to have is around 10 per bucket.
I am sure there are many algorithms that do this, but I could not find a way of googling this thing. Can anyone please post an algorithm to do this?.
Some more information
The whole thing with this is that those integers are attributes for a function. I want to store the values of the function in a table, but I do not have enough memory to store all the different options. That is why I want to generalize between similar attributes.
The reason why 10, 5, 15 are totally different from 5, 10, 15, it is because if you imagine this in 3d then both points are a totally different point
Some more information 2
Some answers try to solve the problem using hashing. But I do not think this is so complex. Thanks to one of the comments I have realized that this is a clustering algorithm problem. If we have only 3 attributes and we imagine the problem in 3d, what I just need is divide the space in blocks.
In fact this can be solved with rules of this type
if (att[0] < 5 && att[1] < 5 && att[2] < 5 && att[3] < 5)
Block = 21
if ( (5 < att[0] < 10) && (5 < att[1] < 10) && (5 < att[2] < 10) && (5 < att[3] < 10))
Block = 45
The problem is that I need a fast and a general way to generate those ifs I cannot write all the possibilities.
The simple solution:
Convert the integers to strings separated by commas, and hash the resulting string using a common hashing algorithm (md5, sha, etc).
If you really want to roll-your-own, I would do something like:
Generate large prime P
Generate random numbers 0 < a[i] < P (for each dimension you have)
To generate hash, calculate: sum(a[i] * x[i]) mod P
Given the inputs a, b, c, and d, each ranging in value from 0 to 30 (5 bits), the following will produce an number in the range of 0 to 255 (8 bits).
bucket = ((a & 0x18) << 3) | ((b & 0x18) << 1) | ((c & 0x18) >> 1) | ((d & 0x18) >> 3)
Whether the general approach is appropriate depends on how the question is interpreted. The 3 least significant bits are dropped, grouping 0-7 in the same set, 8-15 in the next, and so forth.
0-7,0-7,0-7,0-7 -> bucket 0
0-7,0-7,0-7,8-15 -> bucket 1
0-7,0-7,0-7,16-23 -> bucket 2
...
24-30,24-30,24-30,24-30 -> bucket 255
Trivially tested with:
for (int a = 0; a <= 30; a++)
for (int b = 0; b <= 30; b++)
for (int c = 0; c <= 30; c++)
for (int d = 0; d <= 30; d++) {
int bucket = ((a & 0x18) << 3) |
((b & 0x18) << 1) |
((c & 0x18) >> 1) |
((d & 0x18) >> 3);
printf("%d, %d, %d, %d -> %d\n",
a, b, c, d, bucket);
}
You want a hash function that depends on the order of inputs and where similar sets of numbers will generate the same hash? That is, you want 50 5 5 10 and 5 5 10 50 to generate different values, but you want 52 7 4 12 to generate the same hash as 50 5 5 10? A simple way to do something like this is:
long hash = 13;
for (int i = 0; i < array.length; i++) {
hash = hash * 37 + array[i] / 5;
}
This is imperfect, but should give you an idea of one way to implement what you want. It will treat the values 50 - 54 as the same value, but it will treat 49 and 50 as different values.
If you want the hash to be independent of the order of the inputs (so the hash of 5 10 20 and 20 10 5 are the same) then one way to do this is to sort the array of integers into ascending order before applying the hash. Another way would be to replace
hash = hash * 37 + array[i] / 5;
with
hash += array[i] / 5;
EDIT: Taking into account your comments in response to this answer, it sounds like my attempt above may serve your needs well enough. It won't be ideal, nor perfect. If you need high performance you have some research and experimentation to do.
To summarize, order is important, so 5 10 20 differs from 20 10 5. Also, you would ideally store each "vector" separately in your hash table, but to handle space limitations you want to store some groups of values in one table entry.
An ideal hash function would return a number evenly spread across the possible values based on your table size. Doing this right depends on the expected size of your table and on the number of and expected maximum value of the input vector values. If you can have negative values as "coordinate" values then this may affect how you compute your hash. If, given your range of input values and the hash function chosen, your maximum hash value is less than your hash table size, then you need to change the hash function to generate a larger hash value.
You might want to try using vectors to describe each number set as the hash value.
EDIT:
Since you're not describing why you want to not run the function itself, I'm guessing it's long running. Since you haven't described the breadth of the argument set.
If every value is expected then a full lookup table in a database might be faster.
If you're expecting repeated calls with the same arguments and little overall variation, then you could look at memoizing so only the first run for a argument set is expensive, and each additional request is fast, with less memory usage.
You would need to define what you mean by "similar". Hashes are generally designed to create unique results from unique input.
One approach would be to normalize your input and then generate a hash from the results.
Generating the same hash sum is called a collision, and is a bad thing for a hash to have. It makes it less useful.
If you want similar values to give the same output, you can divide the input by however close you want them to count. If the order makes a difference, use a different divisor for each number. The following function does what you describe:
int SqueezedSum( int a, int b, int c, int d )
{
return (a/11) + (b/7) + (c/5) + (d/3);
}
This is not a hash, but does what you describe.
You want to look into geometric hashing. In "standard" hashing you want
a short key
inverse resistance
collision resistance
With geometric hashing you susbtitute number 3 with something whihch is almost opposite; namely close initial values give close hash values.
Another way to view my problem is using the multidimesional scaling (MS). In MS we start with a matrix of items and what we want is assign a location of each item to an N dimensional space. Reducing in this way the number of dimensions.
http://en.wikipedia.org/wiki/Multidimensional_scaling