In the UVM test I declare and start the sequences, but the outputs from separate sequences with the same parameters are "related" somehow (see the example at the bottom), so when I do cross coverage I only cover 12.5% of the cases. What is causing this? How can I make the output of the two sequences independent and random?
//declare
ve_master_sequence#( 8,`num_inputs) x_agent_sequence_inst;
ve_master_sequence#( 8,`num_inputs) y_agent_sequence_inst;
//build_phase
x_agent_sequence_inst = ve_master_sequence#( 8,`num_inputs)::type_id::create("x_seq");
y_agent_sequence_inst = ve_master_sequence#( 8,`num_inputs)::type_id::create("y_seq");
//run_phase
x_agent_sequence_inst.start(multadd_env_inst.ve_x_agent_inst.sequencer);
y_agent_sequence_inst.start(multadd_env_inst.ve_y_agent_inst.sequencer);
The environment contains four master agents: two 32-bit and two 8-bit. The same parameterized sequence is run on all the agents.
// within the sequence
virtual task body();
  `uvm_info("ve_master_sequence", $sformatf("begin body()"), UVM_MEDIUM);
  for (int i = 0; i < length; i++) begin
    req = ve_seq_item#(data_width)::type_id::create("req");
    start_item(req);
    while (!req.randomize() with {
      data <= (2**data_width)-1;
      delay dist { [0:1] := 2, [2:6] := 1};
    });
    finish_item(req);
    get_response(req);
  end
  #1000;
endtask
I replaced req.randomize() with $urandom_range, which worked, but it means losing all of SystemVerilog's constrained-random capabilities.
When I run the code and do cross coverage, there is a relationship between the outputs of the sequencers that are the same size:
when y = 0, x is always 79 or 80
when y = 1, x is always 80 or 81
when y = 2, x is always 81 or 82
...
when y = 51, x is always 130 or 131
when y = 52, x is always 131 or 132
etc.
Apparently UVM uses its parent's random number generator (RNG) and the sequence name to create a new RNG for the sequence. This is done to provide good random stability.
Try changing the names of the sequences to make them more unique. I am assuming that longer, more distinctive names give a higher degree of seed separation between the streams.
Inside the sequence class was this loop creating sequence items. The explanation is (as already said above) that UVM uses the class hierarchy and instance names to create the random seed, which gives good random stability:
for (int i = 0; i < 1000; i++) begin
  // this caused the error: every item is created with the same name "req",
  // so every item gets the same random seed
  req = ve_seq_item#(data_width)::type_id::create("req");
  // this fixed it: naming each item with the loop variable gives each item
  // a unique name, and therefore a unique seed
  req = ve_seq_item#(data_width)::type_id::create($sformatf("req_%1d", i));
  // ... randomize the sequence item as before
end
I am trying to use the SystemVerilog constraint solver to solve the following problem statement:
We have N balls, each with a unique weight, and these balls need to be distributed into groups such that the weight of each group does not exceed a threshold (MAX_WEIGHT).
Now I want to find all such possible solutions. The code I wrote in SV is as follows:
`define NUM_BALLS 5
`define MAX_WEIGHT_BUCKET 100

class weight_distributor;
  int ball_weight [`NUM_BALLS];
  rand int unsigned solution_array[][];

  constraint c_solve_bucket_problem {
    foreach (solution_array[i,j]) {
      solution_array[i][j] inside {ball_weight};
      //unique{solution_array[i][j]};
      foreach (solution_array[ii,jj])
        if (!((ii == i) && (j == jj))) solution_array[ii][jj] != solution_array[i][j];
    }
    foreach (solution_array[i])
      solution_array[i].sum() < `MAX_WEIGHT_BUCKET;
  }

  function new();
    ball_weight = '{10,20,30,40,50};
  endfunction

  function void post_randomize();
    foreach (solution_array[i,j])
      $display("solution_array[%0d][%0d] = %0d", i, j, solution_array[i][j]);
    $display("solution_array size = %0d", solution_array.size());
  endfunction
endclass

module top;
  weight_distributor weight_distributor_o;
  initial begin
    weight_distributor_o = new();
    void'(weight_distributor_o.randomize());
  end
endmodule
The issue I am facing here is that I want the sizes of both dimensions of the array to be decided randomly, based on the constraint solution_array[i].sum() < `MAX_WEIGHT_BUCKET;. From my understanding of SV constraints, I believe that the size of the array will be solved before values are assigned to the array.
Moreover, I also wanted to know whether unique can be used on a 2-dimensional dynamic array.
You can't use the random number generator (RNG) to enumerate all possible solutions of your problem. It's not built for this. An RNG can give you one of these solutions with each call to randomize(). It's not guaranteed, though, that it gives you a different solution with each call. Say you have 3 solutions, S0, S1, S2. The solver could give you S1, then S2, then S1 again, then S1, then S0, etc. If you know how many solutions there are, you can stop once you've seen them all. Generally, though, you don't know this beforehand.
What an RNG can do, however, is check whether a solution you provide is correct. If you loop over all possible solutions, you can filter out only the ones that are correct. In your case, you have N balls and up to N groups. You can start out by putting each ball into its own group and checking whether this is a correct solution. You can then put 2 balls into one group and all the other N - 2 into groups of one. You can put two other balls into one group and all the others into groups of one. You can then put 2 balls into one group, 2 other balls into another group and all the other N - 4 into groups of one. You can continue this until you put all N balls into the same group. I'm not really sure how you can easily enumerate all solutions; combinatorics can help you here. At each step of the way you can check whether a certain ball arrangement satisfies the constraints:
// Array describing an arrangement of balls
// - the first dimension is the group
// - the second dimension is the index within the group
typedef int unsigned arrangement_t[][];

// Function that gives you the next arrangement to try out
function arrangement_t get_next_arrangement();
  // ...
endfunction

arrangement = get_next_arrangement();
if (weight_distributor_o.randomize() with {
  solution_array.size() == arrangement.size();
  foreach (solution_array[i]) {
    solution_array[i].size() == arrangement[i].size();
    foreach (solution_array[i][j])
      solution_array[i][j] == arrangement[i][j];
  }
})
  all_solutions.push_back(arrangement);
Now, let's look at weight_distributor. I'd recommend you write each requirement in its own constraint, as this makes the code much more readable.
You can shorten the uniqueness constraint you wrote as a double loop by using the unique operator:
class weight_distributor;
  // ...
  constraint unique_balls {
    unique { solution_array };
  }
endclass
You already had a constraint that each group can have at most MAX_WEIGHT in it:
class weight_distributor;
  // ...
  constraint max_weight_per_group {
    foreach (solution_array[i])
      solution_array[i].sum() < `MAX_WEIGHT_BUCKET;
  }
endclass
Because of the way array sizes are solved, it's not possible to write constraints that ensure a valid solution can be computed with a simple call to randomize(); the sizes of solution_array would have to be solved against arrangement before the values are. You don't need this, though, if you only want to check whether a given solution is valid.
Try this:
class ABC;
  rand bit [3:0] md_array[][]; // multidimensional array with unknown size

  constraint c_md_array {
    // First constrain the size of the first dimension of md_array
    md_array.size() == 2;

    // Then, for each sub-array in the first dimension:
    foreach (md_array[i]) {
      // Constrain the size of the sub-array to a value within the range
      md_array[i].size() inside {[1:5]};

      // Iterate over the second dimension
      foreach (md_array[i][j]) {
        // Constrain the values of the second dimension
        md_array[i][j] inside {[1:10]};
      }
    }
  }
endclass
module tb;
  initial begin
    ABC abc = new;
    abc.randomize();
    $display("md_array = %p", abc.md_array);
  end
endmodule
https://www.chipverify.com/systemverilog/systemverilog-foreach-constraint
A question/problem for anyone experienced with Xilinx Vivado HLS and FPGA design:
I need help reducing the utilization numbers of a design within the confines of HLS (i.e. can't just redo the design in an HDL). I am targeting the Zedboard (Zynq 7020).
I'm trying to implement 2048-bit RSA in HLS, using the Tenca-Koç multiple-word radix-2 Montgomery multiplication (MWR2MM) algorithm.
I wrote this algorithm in HLS and it works in simulation and in C/RTL cosim. My algorithm is here:
#define MWR2MM_m 2048 // bit-length of operands
#define MWR2MM_w 8    // word size
#define MWR2MM_e 257  // number of words per operand

// Type definitions
typedef ap_uint<1> bit_t;            // 1-bit scan
typedef ap_uint<MWR2MM_w> word_t;    // 8-bit words
typedef ap_uint<MWR2MM_m> rsaSize_t; // m-bit operand size

/*
 * Multiple-word radix-2 Montgomery multiplication using a carry-propagate adder
 */
void mwr2mm_cpa(rsaSize_t X, rsaSize_t Yin, rsaSize_t Min, rsaSize_t* out)
{
    // extend operands by 2 extra words of 0
    ap_uint<MWR2MM_m + 2*MWR2MM_w> Y = Yin;
    ap_uint<MWR2MM_m + 2*MWR2MM_w> M = Min;
    ap_uint<MWR2MM_m + 2*MWR2MM_w> S = 0;

    ap_uint<2> C = 0; // two carry bits
    bit_t qi = 0;     // an intermediate result bit

    // Store concatenations in a temporary variable to eliminate HLS compiler
    // warnings about shift count. Note: this must be 2 bits wider than a word
    // so the carry bits read by range(9,8) actually exist.
    ap_uint<MWR2MM_w + 2> temp_concat = 0;

    // scan X bit-by-bit
    for (int i = 0; i < MWR2MM_m; i++)
    {
        qi = (X[i]*Y[0]) xor S[0];

        // C gets the top two bits of temp_concat, the 0th word of S gets the bottom 8 bits
        temp_concat = X[i]*Y.range(MWR2MM_w-1,0) + qi*M.range(MWR2MM_w-1,0) + S.range(MWR2MM_w-1,0);
        C = temp_concat.range(9,8);
        S.range(MWR2MM_w-1,0) = temp_concat.range(7,0);

        // scan Y and M word-by-word, for each bit of X
        for (int j = 1; j <= MWR2MM_e; j++)
        {
            temp_concat = C + X[i]*Y.range(MWR2MM_w*j+(MWR2MM_w-1), MWR2MM_w*j) + qi*M.range(MWR2MM_w*j+(MWR2MM_w-1), MWR2MM_w*j) + S.range(MWR2MM_w*j+(MWR2MM_w-1), MWR2MM_w*j);
            C = temp_concat.range(9,8);
            S.range(MWR2MM_w*j+(MWR2MM_w-1), MWR2MM_w*j) = temp_concat.range(7,0);
            S.range(MWR2MM_w*(j-1)+(MWR2MM_w-1), MWR2MM_w*(j-1)) = (S.bit(MWR2MM_w*j), S.range(MWR2MM_w*(j-1)+(MWR2MM_w-1), MWR2MM_w*(j-1)+1));
        }
        S.range(S.length()-1, S.length()-MWR2MM_w) = 0;
        C = 0;
    }

    // if the final partial sum is greater than the modulus, bring it back into range
    if (S >= M)
        S -= M;
    *out = S;
}
Unfortunately, the LUT utilization is huge.
This is problematic because I need to be able to fit multiple instances of this block in hardware as AXI4-Lite slaves.
Could someone please provide a few suggestions as to how I can reduce the LUT utilization, WITHIN THE CONFINES OF HLS?
I've already tried the following:
Experimenting with different word lengths
switching the top-level inputs to arrays so they are BRAM (i.e. not using ap_uint<2048>, but instead ap_uint<MWR2MM_w> foo[MWR2MM_e])
Experimenting with all sorts of directives: compartmentalizing into multiple inline functions, dataflow architecture, resource limits on lshr, etc.
However, nothing really drives the LUT utilization down in a meaningful way. Is there an obvious way to reduce the utilization that I'm missing?
In particular, I've seen papers on implementations of the MWR2MM algorithm that use only one DSP block and one BRAM. Is this even worth attempting to implement in HLS? Or is there no way to actually control the resources the algorithm is mapped to without describing it in HDL?
Thanks for the help.
I have been using arc4random() and arc4random_uniform(), and I always had the feeling that they weren't exactly random. For example, I was randomly choosing values from an array, but I often got the same values several times in a row. So today I decided to use an Xcode playground to see how these functions behave. First I tested arc4random_uniform, generating a number between 0 and 4 with this algorithm:
import Cocoa

var number = 0
for i in 1...20 {
    number = Int(arc4random_uniform(5))
}
And I ran it several times; here is how the values evolved most of the time (screenshots omitted):
So as you can see, the values increase and decrease repeatedly, and once the values are at the maximum/minimum they often stay there for some time (see the first screenshot: at the 5th step the value stays at 3 for 6 steps). The problem is that this isn't unusual at all; the function actually behaves this way most of the time in my tests.
Now, if we look at arc4random(), it's basically the same:
So here are my questions:
Why is this function behaving in this way?
How to make it more random?
Thank you.
EDIT:
Finally, I made two experiments whose results surprised me. The first one was with a real die:
What surprised me is that I wouldn't have said the result was random: I was seeing the same sort of pattern that I had described as non-random for arc4random() and arc4random_uniform(). So, as Jean-Baptiste Yunès pointed out, humans aren't good at telling whether a sequence of numbers is really random.
I also wanted to do a more "scientific" experiment, so I made this algorithm :
import Foundation

var appeared = [0,0,0,0,0,0,0,0,0,0,0]
var numberOfGenerations = 1000
for _ in 1...numberOfGenerations {
    let randomNumber = Int(arc4random_uniform(11))
    appeared[randomNumber]++
}
for (number, numberOfTimes) in enumerate(appeared) {
    println("\(number) appeared \(numberOfTimes) times (\(Double(numberOfTimes)/Double(numberOfGenerations)*100)%)")
}
This counts how many times each number appeared, and indeed the numbers are evenly distributed. For example, here is one output from the console:
0 appeared 99 times.
1 appeared 97 times.
2 appeared 78 times.
3 appeared 80 times.
4 appeared 87 times.
5 appeared 107 times.
6 appeared 86 times.
7 appeared 97 times.
8 appeared 100 times.
9 appeared 91 times.
10 appeared 78 times.
So it's definitely OK 😊
EDIT #2: I did the dice experiment again with more rolls, and it's still just as surprising to me:
A truly random sequence of numbers cannot be generated by an algorithm; an algorithm can only produce a pseudo-random sequence of numbers (something that looks like a random sequence). So depending on the algorithm chosen, the quality of the "randomness" may vary. The quality of arc4random() sequences is generally considered good.
You cannot analyze the randomness of a sequence visually... Humans are very bad at detecting randomness! They tend to find structure where there is none. Nothing is really wrong in your diagrams (except the rare subsequence of six 3s in a row, but that is randomness; sometimes unusual things happen). You would be surprised if you had used a die to generate a sequence and drawn its graph. Beware that a sample of only 20 numbers cannot be seriously analyzed for randomness; you need much bigger samples.
If you need some other kind of randomness, you can try the /dev/random pseudo-file, which generates a random number each time you read from it. The sequence is generated by a mix of algorithms and external physical events that happen in your computer.
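A minimal sketch in C of reading from /dev/random (my illustration, not from the answer; it assumes a POSIX system and keeps error handling to a minimum):

#include <stdio.h>
#include <stdint.h>

/* Read one random byte from the /dev/random pseudo-file. */
uint8_t random_byte(void)
{
    uint8_t byte = 0;
    FILE *f = fopen("/dev/random", "rb");
    if (f != NULL) {
        fread(&byte, 1, 1, f);
        fclose(f);
    }
    return byte;
}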
It depends on what you mean when you say random.
As stated in the comments, true randomness is clumpy. Long strings of repeats or close values are expected.
If this doesn't fit your requirement, then you need to better define your requirement.
Other options could include using a shuffle algorithm to dis-order things in an array (see the sketch below), or using a low-discrepancy sequence algorithm to give an equal distribution of values.
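For the shuffle option, here is a minimal Fisher-Yates sketch in C (my illustration; it assumes arc4random_uniform() is available, as on macOS/BSD):

#include <stdint.h>
#include <stdlib.h>

/* Fisher-Yates shuffle: permutes the n ints in a[] uniformly at random. */
void shuffle(int *a, int n)
{
    for (int i = n - 1; i > 0; i--) {
        int j = (int)arc4random_uniform((uint32_t)(i + 1)); /* 0 <= j <= i */
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}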
I don’t really agree with the idea that humans are very bad at detecting randomness.
Would you be satisfied if you obtained 1-1-2-2-3-3-4-4-5-5-6-6 after throwing six pairs of dice? Yet the dice frequencies would be perfect…
This is exactly the problem I’m encountering with the arc4random and arc4random_uniform functions.
I’ve been developing a backgammon application for many years, based on a neural network trained against world-champion players. I DO know that it plays much better than anyone, but many users think it is cheating. I also have doubts sometimes, so I’ve decided to throw all the dice myself…
I’m not satisfied at all with arc4random, even if the frequencies are OK.
I always throw a pair of dice, and the results lead to unacceptable situations, for example: getting five consecutive doubles for the same player, or waiting 12 turns (24 dice) until the first 6 occurs.
It is easy to test (Objective-C code):
void randomDices ( int * dice1, int * dice2, int player )
{
    // arc4random_uniform(6) returns 0-5, so add 1 for a 1-6 die
    ( * dice1 ) = arc4random_uniform ( 6 ) + 1 ;
    ( * dice2 ) = arc4random_uniform ( 6 ) + 1 ;

    // Add to your statistics
    [self didRandomDice1:( * dice1 ) dice2:( * dice2 ) forPlayer:player] ;
}
Maybe arc4random doesn’t like to be called twice during a short time…
So I’ve tried several solutions and finally chose this code, which runs a second level of randomization after arc4random_uniform:
int CFRandomDice ()
{
    int __result = -1 ;
    BOOL __found = NO ;
    while ( ! __found )
    {
        // random int big enough but not too big
        int __bigint = arc4random_uniform ( 10000 ) ;

        // Searching for the first character between '1' and '6'
        // in the string version of bigint :
        NSString * __bigString = @( __bigint ).stringValue ;
        NSInteger __nbcar = __bigString.length ;
        NSInteger __i = 0 ;
        while ( ( __i < __nbcar ) && ( ! __found ) )
        {
            unichar __ch = [__bigString characterAtIndex:__i] ;
            if ( ( __ch >= '1' ) && ( __ch <= '6' ) )
            {
                __found = YES ;
                __result = __ch - '1' + 1 ;
            }
            else
            {
                __i++ ;
            }
        }
    }
    return ( __result ) ;
}
This code creates a random number with arc4random_uniform(10000), converts it to a string, and then searches for the first digit between '1' and '6' in the string.
This appeared to me to be a very good way to randomize the dice because:
1/ the frequencies are OK (see the statistics below);
2/ exceptional dice sequences occur at exceptional times.
10000 dices test:
----------
Game Stats
----------
HIM :
Total 1 = 3297
Total 2 = 3378
Total 3 = 3303
Total 4 = 3365
Total 5 = 3386
Total 6 = 3271
----------
ME :
Total 1 = 3316
Total 2 = 3289
Total 3 = 3282
Total 4 = 3467
Total 5 = 3236
Total 6 = 3410
----------
HIM doubles = 1623
ME doubles = 1648
Now I’m sure that players won’t complain…
The thing is that the 1st number is an ORACLE LONG, the second one a Date (SQL DATE, no extra timestamp info), and the last one a Short value in the range 1000-100'000.
How can I create a sort of hash value that will be unique for each combination, optimally?
String concatenation followed by conversion to a long is something I don't want, because, for example:
Day Month
12 1 --> 121
1 12 --> 121
When you have a few numeric values and need to have a single "unique" (that is, statistically improbable duplicate) value out of them you can usually use a formula like:
h = (a*P1 + b)*P2 + c
where P1 and P2 are either well-chosen numbers (e.g. if you know 'a' is always in the 1-31 range, you can use P1=32) or, when you know nothing in particular about the allowable ranges of a, b, c, the best approach is to pick P1 and P2 as big prime numbers (they have the least chance of generating values that collide).
For an optimal solution the math is a bit more complex than that, but using prime numbers you can usually have a decent solution.
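As a concrete sketch, here is the formula above in C with two assumed large primes (the specific prime values are an illustrative choice, not prescribed by the formula; unsigned arithmetic is used so overflow simply wraps, which is fine for hashing):

#include <stdint.h>

/* h = (a*P1 + b)*P2 + c with two large primes. */
uint64_t combine3(uint64_t a, uint64_t b, uint64_t c)
{
    const uint64_t P1 = 1000000007ULL;
    const uint64_t P2 = 998244353ULL;
    return (a * P1 + b) * P2 + c;
}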
For example, the Java implementation of .hashCode() for an array (or a String) is something like:
h = 0;
for (int i = 0; i < a.length; ++i)
h = h * 31 + a[i];
Even though, personally, I would have chosen a prime bigger than 31, since values inside a String can easily collide, as a delta of 31 places can be quite common, e.g.:
"BB".hashCode() == "Aa".hashCode() == 2112
Your
12 1 --> 121
1 12 --> 121
problem is easily fixed by zero-padding your input numbers to the maximum width expected for each input field.
For example, if the first field can range from 0 to 10000 and the second field can range from 0 to 100, your example becomes:
00012 001 --> 00012001
00001 012 --> 00001012
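A minimal sketch of the same idea done numerically in C (the field widths are assumptions matching the example above):

/* Pack two fixed-range fields into one number: equivalent to
   zero-padding and concatenating their decimal strings.
   Assumes 0 <= field1 <= 99999 and 0 <= field2 <= 999. */
long pack_fields(long field1, long field2)
{
    return field1 * 1000 + field2; /* 12,1 -> 12001; 1,12 -> 1012 */
}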
In Python, you can use this:
#pip install pairing
import pairing as pf
n = [12,6,20,19]
print(n)
key = pf.pair(pf.pair(n[0],n[1]),
pf.pair(n[2], n[3]))
print(key)
m = [pf.depair(pf.depair(key)[0]),
pf.depair(pf.depair(key)[1])]
print(m)
Output is:
[12, 6, 20, 19]
477575
[(12, 6), (20, 19)]
I am facing the problem of having several integers, and I have to generate one value from them. For example:
Int 1: 14
Int 2: 4
Int 3: 8
Int 4: 4
Hash Sum: 43
I have some restrictions on the values: the maximum value an attribute can have is 30, the sum of all of them is always 30, and the attributes are always positive.
The key is that I want to generate the same hash sum for similar integers. For example, if I have the integers 14, 4, 10, 2, then I want to generate the same hash sum as in the case above, 43. But of course if the integers are very different (4, 4, 2, 20), then I should get a different hash sum. It also needs to be fast.
Ideally, the output of the hash sum should be between 0 and 512 and evenly distributed. With my restrictions I can have around 5K different possibilities, so I would like to have around 10 per bucket.
I am sure there are many algorithms that do this, but I could not find a way of googling this thing. Can anyone please post an algorithm to do this?
Some more information
The whole thing with this is that those integers are attributes for a function. I want to store the values of the function in a table, but I do not have enough memory to store all the different options. That is why I want to generalize between similar attributes.
The reason why 10, 5, 15 is totally different from 5, 10, 15 is that if you imagine this in 3D, the two are totally different points.
Some more information 2
Some answers try to solve the problem using hashing, but I do not think this is that complex. Thanks to one of the comments, I have realized that this is a clustering problem. If we have only 3 attributes and we imagine the problem in 3D, what I need is just to divide the space into blocks.
In fact this can be solved with rules of this type
if (att[0] < 5 && att[1] < 5 && att[2] < 5 && att[3] < 5)
Block = 21
if ( (5 < att[0] < 10) && (5 < att[1] < 10) && (5 < att[2] < 10) && (5 < att[3] < 10))
Block = 45
The problem is that I need a fast and general way to generate those ifs; I cannot write out all the possibilities. A sketch of one such way follows.
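A minimal C sketch of that space-division idea (my illustration; the cell width of 8 and the 4 attributes are assumptions, and with attributes in 0-30 this yields 4^4 = 256 blocks — tune the cell width to hit your desired bucket count):

/* Map a 4-attribute point to a block id by quantizing each
   attribute into cells of width 8 (0..30 -> cell 0..3). */
int block_id(const int att[4])
{
    int id = 0;
    for (int k = 0; k < 4; k++)
        id = id * 4 + att[k] / 8;
    return id;
}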
The simple solution:
Convert the integers to strings separated by commas, and hash the resulting string using a common hashing algorithm (md5, sha, etc).
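As a sketch in C, using djb2 as a stand-in for a common string hash (swap in md5/sha from a crypto library if you need one; the buffer size is an assumption):

#include <stdio.h>

/* Join the integers with commas, then hash the resulting string. */
unsigned long hash_ints(const int *v, int n)
{
    char buf[256] = "";
    int len = 0;
    for (int i = 0; i < n && len < (int)sizeof buf - 12; i++)
        len += snprintf(buf + len, sizeof buf - len, "%d,", v[i]);

    unsigned long h = 5381; /* djb2 string hash */
    for (const char *p = buf; *p; p++)
        h = h * 33 + (unsigned char)*p;
    return h;
}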
If you really want to roll your own, I would do something like this (see the sketch below):
Generate a large prime P
Generate random numbers 0 < a[i] < P (one for each dimension you have)
To generate the hash, calculate: sum(a[i] * x[i]) mod P
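A minimal C sketch of that scheme (the prime and the function name are illustrative choices; in practice you would generate the a[i] randomly once and keep them fixed):

#include <stdint.h>

#define P 2147483647ULL /* 2^31 - 1, a Mersenne prime */

/* sum(a[i] * x[i]) mod P over the n dimensions. */
uint64_t hash_vector(const uint32_t *x, const uint64_t *a, int n)
{
    uint64_t h = 0;
    for (int i = 0; i < n; i++)
        h = (h + (a[i] % P) * (x[i] % P)) % P;
    return h;
}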
Given the inputs a, b, c, and d, each ranging in value from 0 to 30 (5 bits), the following will produce a number in the range of 0 to 255 (8 bits):
bucket = ((a & 0x18) << 3) | ((b & 0x18) << 1) | ((c & 0x18) >> 1) | ((d & 0x18) >> 3)
Whether the general approach is appropriate depends on how the question is interpreted. The 3 least significant bits are dropped, grouping 0-7 in the same set, 8-15 in the next, and so forth.
0-7,0-7,0-7,0-7 -> bucket 0
0-7,0-7,0-7,8-15 -> bucket 1
0-7,0-7,0-7,16-23 -> bucket 2
...
24-30,24-30,24-30,24-30 -> bucket 255
Trivially tested with:
for (int a = 0; a <= 30; a++)
for (int b = 0; b <= 30; b++)
for (int c = 0; c <= 30; c++)
for (int d = 0; d <= 30; d++) {
int bucket = ((a & 0x18) << 3) |
((b & 0x18) << 1) |
((c & 0x18) >> 1) |
((d & 0x18) >> 3);
printf("%d, %d, %d, %d -> %d\n",
a, b, c, d, bucket);
}
You want a hash function that depends on the order of inputs and where similar sets of numbers will generate the same hash? That is, you want 50 5 5 10 and 5 5 10 50 to generate different values, but you want 52 7 4 12 to generate the same hash as 50 5 5 10? A simple way to do something like this is:
long hash = 13;
for (int i = 0; i < array.length; i++) {
hash = hash * 37 + array[i] / 5;
}
This is imperfect, but should give you an idea of one way to implement what you want. It will treat the values 50 - 54 as the same value, but it will treat 49 and 50 as different values.
If you want the hash to be independent of the order of the inputs (so the hash of 5 10 20 and 20 10 5 are the same) then one way to do this is to sort the array of integers into ascending order before applying the hash. Another way would be to replace
hash = hash * 37 + array[i] / 5;
with
hash += array[i] / 5;
EDIT: Taking into account your comments in response to this answer, it sounds like my attempt above may serve your needs well enough. It won't be ideal, nor perfect. If you need high performance you have some research and experimentation to do.
To summarize, order is important, so 5 10 20 differs from 20 10 5. Also, you would ideally store each "vector" separately in your hash table, but to handle space limitations you want to store some groups of values in one table entry.
An ideal hash function would return a number evenly spread across the possible values based on your table size. Doing this right depends on the expected size of your table and on the number of and expected maximum value of the input vector values. If you can have negative values as "coordinate" values then this may affect how you compute your hash. If, given your range of input values and the hash function chosen, your maximum hash value is less than your hash table size, then you need to change the hash function to generate a larger hash value.
You might want to try using vectors to describe each number set as the hash value.
EDIT:
Since you don't say why you want to avoid running the function itself, I'm guessing it's long-running. And since you haven't described the breadth of the argument set:
If every value is expected then a full lookup table in a database might be faster.
If you're expecting repeated calls with the same arguments and little overall variation, then you could look at memoizing so only the first run for a argument set is expensive, and each additional request is fast, with less memory usage.
You would need to define what you mean by "similar". Hashes are generally designed to create unique results from unique input.
One approach would be to normalize your input and then generate a hash from the results.
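A minimal C sketch of that normalize-then-hash idea (my illustration; the band width of 5 is an assumed choice that controls how coarsely "similar" inputs collapse together):

/* Normalize each attribute to a coarse band, then hash the bands,
   so nearby inputs produce the same value. */
unsigned normalized_hash(const int *att, int n)
{
    unsigned h = 17;
    for (int i = 0; i < n; i++)
        h = h * 31u + (unsigned)(att[i] / 5); /* 0-4 -> band 0, 5-9 -> band 1, ... */
    return h;
}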
Generating the same hash sum is called a collision, and is a bad thing for a hash to have. It makes it less useful.
If you want similar values to give the same output, you can divide the input by however close you want them to count. If the order makes a difference, use a different divisor for each number. The following function does what you describe:
int SqueezedSum( int a, int b, int c, int d )
{
return (a/11) + (b/7) + (c/5) + (d/3);
}
This is not a hash, but does what you describe.
You want to look into geometric hashing. In "standard" hashing you want:
a short key
preimage (inversion) resistance
collision resistance
With geometric hashing you substitute number 3 with something which is almost the opposite: namely, close initial values give close hash values.
Another way to view my problem is via multidimensional scaling (MDS). In MDS we start with a matrix of items, and what we want is to assign each item a location in an N-dimensional space, reducing in this way the number of dimensions.
http://en.wikipedia.org/wiki/Multidimensional_scaling