Fastest possible string key lookup for known set of keys


Consider a lookup function with the following signature, which needs to return an integer for a given string key:
int GetValue(string key) { ... }
Consider furthermore that the key-value mappings, numbering N, are known in advance when the source code for the function is being written, e.g.:
// N=3
{ "foo", 1 },
{ "bar", 42 },
{ "bazz", 314159 }
So a valid (but not perfect!) implementation for the function for the input above would be:
int GetValue(string key)
{
switch (key)
{
case "foo": return 1;
case "bar": return 42;
case "bazz": return 314159;
}
// Doesn't matter what we do here, control will never come to this point
throw new Exception();
}
It is also known in advance exactly how many times (C>=1) the function will be called at run-time for every given key. For example:
C["foo"] = 1;
C["bar"] = 1;
C["bazz"] = 2;
The order of such calls is not known, however. E.g. the above could describe the following sequence of calls at run-time:
GetValue("foo");
GetValue("bazz");
GetValue("bar");
GetValue("bazz");
or any other sequence, provided the call counts match.
There is also a restriction M, specified in whatever units are most convenient, defining the upper memory bound of any lookup tables and other helper structures that can be used by GetValue (the structures are initialized in advance; that initialization is not counted against the complexity of the function). For example, M=100 chars, or M=256 sizeof(object reference).
The question is, how to write the body of GetValue such that it is as fast as possible - in other words, the aggregate time of all GetValue calls (note that we know the total count, per everything above) is minimal, for given N, C and M?
The algorithm may require a reasonable minimal value for M, e.g. M >= char.MaxValue. It may also require that M be aligned to some reasonable boundary - for example, that it may only be a power of two. It may also require that M must be a function of N of a certain kind (for example, it may allow valid M=N, or M=2N, ...; or valid M=N, or M=N^2, ...; etc).
The algorithm can be expressed in any suitable language or other form. For runtime performance constraints on the generated code, assume that the generated code for GetValue will be in C#, VB or Java (really, any language will do, so long as strings are treated as immutable arrays of characters - i.e. O(1) length and O(1) indexing, and no other data computed for them in advance). Also, to simplify this a bit, answers which assume that C=1 for all keys are considered valid, though those answers which cover the more general case are preferred.
Some musings on possible approaches
The obvious first answer to the above is using a perfect hash, but generic approaches to finding one seem to be imperfect. For example, one can easily generate a table for a minimal perfect hash using Pearson hashing for the sample data above, but then the input key would have to be hashed for every call to GetValue, and Pearson hash necessarily scans the entire input string. But all sample keys actually differ in their third character, so only that can be used as the input for the hash instead of the entire string. Furthermore, if M is required to be at least char.MaxValue, then the third character itself becomes a perfect hash.
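As a minimal sketch of the "third character as a perfect hash" idea for the sample keys (Python is used only for brevity, and the 128-slot table is an assumption that the keys are ASCII; with M >= char.MaxValue the table would have 65536 slots instead):

```python
# Sample mapping from the question.
MAPPING = {"foo": 1, "bar": 42, "bazz": 314159}

# Third character: 'o', 'r' and 'z' are all different for the sample keys.
DISCRIMINATING_INDEX = 2

# Table indexed directly by character code (assumption: ASCII-only keys).
TABLE = [0] * 128
for key, value in MAPPING.items():
    TABLE[ord(key[DISCRIMINATING_INDEX])] = value


def get_value(key: str) -> int:
    # One index into the string and one index into the table; no loop over the key.
    return TABLE[ord(key[DISCRIMINATING_INDEX])]
```

This only works because every key is known to be in the set; an unknown key would silently return whatever sits in its slot.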
For a different set of keys this may no longer be true, but it may still be possible to reduce the amount of characters considered before the precise answer can be given. Furthermore, in some cases where a minimal perfect hash would require inspecting the entire string, it may be possible to reduce the lookup to a subset, or otherwise make it faster (e.g. a less complex hashing function?) by making the hash non-minimal (i.e. M > N) - effectively sacrificing space for the sake of speed.
It may also be that traditional hashing is not such a good idea to begin with, and it's easier to structure the body of GetValue as a series of conditionals, arranged such that the first checks for the "most variable" character (the one that varies across most keys), with further nested checks as needed to determine the correct answer. Note that "variance" here can be influenced by the number of times each key is going to be looked up (C). Furthermore, it is not always readily obvious what the best structure of branches should be - it may be, for example, that the "most variable" character only lets you distinguish 10 keys out of 100, but for the remaining 90 that one extra check is unnecessary to distinguish between them, and on average (considering C) there are more checks per key than in a different solution which does not start with the "most variable" character. The goal then is to determine the perfect sequence of checks.

You could use the Boyer search, but I think that the Trie would be a much more efficient method. You can modify the Trie to collapse the words as you make the hit count for a key zero, thus reducing the number of searches you would have to do the farther down the line you get. The biggest benefit you would get is that you are doing array lookups for the indexes, which is much faster than a comparison.

You've talked about a memory limitation when it comes to precomputation - is there also a time limitation?
I would consider a trie, but one where you didn't necessarily start with the first character. Instead, find the index which will cut down the search space most, and consider that first. So in your sample case ("foo", "bar", "bazz") you'd take the third character, which would immediately tell you which string it was. (If we know we'll always be given one of the input words, we can return as soon as we've found a unique potential match.)
Now assuming that there isn't a single index which will get you down to a unique string, you need to determine the character to look at after that. In theory you precompute the trie to work out for each branch what the optimal character to look at next is (e.g. "if the third character was 'a', we need to look at the second character next; if it was 'o' we need to look at the first character next") but that potentially takes a lot more time and space. On the other hand, it could save a lot of time - because having gone down one character, each of the branches may have an index to pick which will uniquely identify the final string, but be a different index each time. The amount of space required by this approach would depend on how similar the strings were, and might be hard to predict in advance. It would be nice to be able to dynamically do this for all the trie nodes you can, but then when you find you're running out of construction space, determine a single order for "everything under this node". (So you don't end up storing a "next character index" on each node underneath that node, just the single sequence.) Let me know if this isn't clear, and I can try to elaborate...
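One way to picture the "pick the best index per node" trie is the following sketch. It is only an illustration under assumptions: the names `build`, `lookup` and `char_at` are made up here, "best" is taken to mean "the index whose largest group of remaining keys is smallest", the call counts C are ignored, and no memory limit is enforced.

```python
from collections import defaultdict


def char_at(key, i):
    # Keys shorter than i + 1 characters get an empty-string sentinel.
    return key[i] if i < len(key) else ""


def build(keys, values):
    """Return a leaf (the int value) or a node (index, {char: child})."""
    if len(keys) == 1:
        return values[keys[0]]
    width = max(len(k) for k in keys)

    def largest_group(i):
        groups = defaultdict(int)
        for k in keys:
            groups[char_at(k, i)] += 1
        return max(groups.values())

    # Greedy choice: the index that leaves the smallest worst-case group.
    best = min(range(width), key=largest_group)
    branches = defaultdict(list)
    for k in keys:
        branches[char_at(k, best)].append(k)
    return (best, {c: build(group, values) for c, group in branches.items()})


def lookup(node, key):
    # Follow one character per level until a leaf value is reached.
    while isinstance(node, tuple):
        index, children = node
        node = children[char_at(key, index)]
    return node


MAPPING = {"foo": 1, "bar": 42, "bazz": 314159}
ROOT = build(list(MAPPING), MAPPING)
```

For the sample keys the root node tests index 2 and every child is already a leaf, which matches the "take the third character" observation.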
How you represent the trie will depend on the range of input characters. If they're all in the range 'a'-'z' then a simple array would be incredibly fast to navigate, and reasonably efficient for trie nodes where there are possibilities for most of the available options. Later on, when there are only two or three possible branches, that becomes wasteful in memory. I would suggest a polymorphic Trie node class, such that you can build the most appropriate type of node depending on how many sub-branches there are.
None of this performs any culling - it's not clear how much can be achieved by culling quickly. One situation where I can see it helping is when the number of branches from one trie node drops to 1 (because of the removal of a branch which is exhausted), that branch can be eliminated completely. Over time this could make a big difference, and shouldn't be too hard to compute. Basically as you build the trie you can predict how many times each branch will be taken, and as you navigate the trie you can subtract one from that count per branch when you navigate it.
That's all I've come up with so far, and it's not exactly a full implementation - but I hope it helps...

Is a binary search of the table really so awful? I would take the list of potential strings and "minimize" them, then sort them, and finally do a binary search upon the block of them.
By minimize I mean reducing them to the minimum they need to be, kind of a custom stemming.
For example if you had the strings: "alfred", "bob", "bill", "joe", I'd knock them down to "a", "bi", "bo", "j".
Then put those in to a contiguous block of memory, for example:
char *table = "a\0bi\0bo\0j\0"; // last 0 is really redundant..but
char *keys[4];
keys[0] = table;
keys[1] = table + 2;
keys[2] = table + 5;
keys[3] = table + 8;
Ideally the compiler would do all this for you if you simply go:
keys[0] = "a";
keys[1] = "bi";
keys[2] = "bo";
keys[3] = "j";
But I can't say if that's true or not.
Now you can bsearch that table, and the keys are as short as possible. If you hit the end of the key, you match. If not, then follow the standard bsearch algorithm.
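A minimal sketch of that search, assuming the minimized keys are prefix-free and sorted, and with made-up values (1 to 4) standing in for whatever the real table maps to:

```python
# Sorted minimal prefixes for "alfred", "bill", "bob", "joe".
PREFIXES = ["a", "bi", "bo", "j"]
# Hypothetical values, in the same order as PREFIXES.
VALUES = [1, 2, 3, 4]


def get_value(key: str) -> int:
    lo, hi = 0, len(PREFIXES) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        prefix = PREFIXES[mid]
        # Compare only as many characters as the stored prefix has.
        head = key[:len(prefix)]
        if head == prefix:
            return VALUES[mid]  # hit the end of the stored key: match
        if head < prefix:
            hi = mid - 1
        else:
            lo = mid + 1
    raise KeyError(key)
```

Because no stored prefix is a prefix of another, comparing the truncated key against each stored prefix orders the same way as comparing the full original strings.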
The goal is to get all of the data close together and keep the code itty bitty so that it all fits in to the CPU cache. You can process the key from the program directly, no pre-processing or adding anything up.
For a reasonably large number of keys that are reasonably distributed, I think this would be quite fast. It really depends on the number of strings involved. For smaller numbers, the overhead of computing hash values etc. is more than searching something like this. For larger numbers, it's worth it. Just what those numbers are all depends on the algorithms etc.
This, however, is likely the smallest solution in terms of memory, if that's important.
This also has the benefit of simplicity.
Addenda:
You don't have any specifications on the inputs beyond 'strings'. There's also no discussion about how many strings you expect to use, their length, their commonality or their frequency of use. These can perhaps all be derived from the "source", but not planned upon by the algorithm designer. You're asking for an algorithm that creates something like this:
inline int GetValue(char *key) {
return 1234;
}
For a small program that happens to use only one key all the time, all the way up to something that creates a perfect hash algorithm for millions of strings. That's a pretty tall order.
Any design going after "squeezing every single bit of performance possible" needs to know more about the inputs than "any and all strings". That problem space is simply too large if you want it the fastest possible for any condition.
An algorithm that handles strings with extremely long identical prefixes might be quite different than one that works on completely random strings. The algorithm could say "if the key starts with "a", skip the next 100 chars, since they're all a's".
But if these strings are sourced by human beings, and they're using long strings of the same letters, and not going insane trying to maintain that data, then when they complain that the algorithm is performing badly, you reply that "you're doing silly things, don't do that". But we don't know the source of these strings either.
So, you need to pick a problem space to target the algorithm. We have all sorts of algorithms that ostensibly do the same thing because they address different constraints and work better in different situations.
Hashing is expensive, laying out hashmaps is expensive. If there's not enough data involved, there are better techniques than hashing. If you have large memory budget, you could make an enormous state machine, based upon N states per node (N being your character set size -- which you don't specify -- BAUDOT? 7-bit ASCII? UTF-32?). That will run very quickly, unless the amount of memory consumed by the states smashes the CPU cache or squeezes out other things.
You could possibly generate code for all of this, but you may run in to code size limits (you don't say what language either -- Java has a 64K method byte code limit for example).
But you don't specify any of these constraints. So, it's kind of hard to get the most performant solution for your needs.

What you want is a look-up table of look-up tables.
If memory cost is not an issue you can go all out.
const int POSSIBLE_CHARCODES = 256; // 256 for ASCII, 65536 for 16-bit Unicode
class LutMap {
    public int value;
    public LutMap[] next = new LutMap[POSSIBLE_CHARCODES];
}
int GetValue(string key) {
    LutMap node = Global.AlreadyCreatedLutMap;
    for (int x = 0; x < key.Length; x++) {
        int c = key[x];
        if (node.next[c] == null) {
            // No further table to follow: the key is already identified
            return node.value;
        }
        node = node.next[c];
    }
    // The whole key was consumed; the current node holds its value
    return node.value;
}

I reckon that it's all about finding the right hash function. As long as you know what the key-value relationship is in advance, you can do an analysis to try and find a hash function to meet your requirements. Taking the example you've provided, treat the input strings as binary integers:
foo = 0x666F6F (hex value)
bar = 0x626172
bazz = 0x62617A7A
The last column present in all of them is different in each. Analyse further:
foo = 0xF = 1111
bar = 0x2 = 0010
bazz = 0xA = 1010
Bit-shift to the right twice, discarding overflow, you get a distinct value for each of them:
foo = 0011
bar = 0000
bazz = 0010
Bit-shift to the right twice again, adding the overflow to a new buffer:
foo = 0010
bar = 0000
bazz = 0001
You can use those to query a static 3-entry lookup table. I reckon this highly personal hash function would take 9 very basic operations to get the nibble (2), bit-shift (2), bit-shift and add (4) and query (1), and a lot of these operations can be compressed further through clever assembly usage. This might well be faster than taking run-time information into account.
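The steps above can be sketched as follows. This reading is an assumption: the nibble is taken from the third character (the last character position all three sample keys share), and "adding the overflow to a new buffer" is interpreted as summing the two bits that remain after the first shift.

```python
# Slot 0 -> "bar", slot 1 -> "bazz", slot 2 -> "foo".
TABLE = [42, 314159, 1]


def slot(key: str) -> int:
    nibble = ord(key[2]) & 0xF   # low nibble of the third character
    top = nibble >> 2            # shift right twice, discarding the low bits
    return (top >> 1) + (top & 1)  # add the two remaining bits together


def get_value(key: str) -> int:
    return TABLE[slot(key)]
```

For 'o' (0x6F), 'r' (0x72) and 'z' (0x7A) this yields slots 2, 0 and 1, the same values as in the worked example.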

Have you looked at TCB? Perhaps the algorithm used there can be used to retrieve your values. It sounds a lot like the problem you are trying to solve. And from experience I can say TCB is one of the fastest key store lookups I have used. It is a constant lookup time, regardless of the number of keys stored.

Consider using the Knuth–Morris–Pratt algorithm.
Pre-process the given map into one large string like the one below:
String string = "{foo:1}{bar:42}{bazz:314159}";
int length = string.length();
According to KMP, preprocessing the string takes O(length).
Searching for any word/key then takes O(w), where w is the length of the word/key.
You will need to make two modifications to the KMP algorithm:
keys should appear in order in the joined string
instead of returning true/false it should parse the number and return it
I hope this gives you some good hints.

Here's a feasible approach to determine the smallest subset of chars to target for your hash routine:
let:
k be the number of distinct chars across all your keywords
c be the max keyword length
n be the number of keywords
in your example (padded shorter keywords w/spaces):
"foo "
"bar "
"bazz"
k = 7 (f,o,b,a,r,z, ), c = 4, n = 3
We can use this to compute a lower bound for our search. We need at least log_k(n) chars to uniquely identify a keyword; if log_k(n) >= c then you'll need to use the whole keyword and there's no reason to proceed.
Next, eliminate one column at a time and check if there are still n distinct values remaining. Use the distinct chars in each column as a heuristic to optimize our search:
2 2 3 2
f o o .
b a r .
b a z z
Eliminate columns with the lowest distinct chars first. If you have <= log_k(n) columns remaining you can stop. Optionally you could randomize a bit and eliminate the 2nd lowest distinct col or try to recover if the eliminated col results in less than n distinct words. This algorithm is roughly O(n!) depending on how much you try to recover. It's not guaranteed to find an optimal solution but it's a good tradeoff.
Once you have your subset of chars, proceed with the usual routines for generating a perfect hash. The result should be a perfect hash that only needs to read the selected chars.
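A rough sketch of the column-elimination step, under simplifying assumptions: it is purely greedy (no randomization or recovery), it skips the log_k(n) early exit, and it assumes no keyword ends in a space, since spaces are used for padding. The function name `choose_columns` is made up for this example.

```python
def choose_columns(keywords):
    """Greedily drop columns while all keywords stay distinguishable."""
    width = max(len(w) for w in keywords)
    padded = [w.ljust(width) for w in keywords]
    n = len(set(padded))

    def distinct(col):
        return len({w[col] for w in padded})

    kept = list(range(width))
    # Try to eliminate the columns with the fewest distinct chars first.
    for col in sorted(range(width), key=distinct):
        trial = [c for c in kept if c != col]
        projected = {tuple(w[c] for c in trial) for w in padded}
        if len(projected) == n:  # still n distinct words without this column
            kept = trial
    return kept
```

For "foo", "bar", "bazz" this leaves only column 2 (the third character), which can then be fed into whatever perfect-hash generator is used.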


Should I count up in Perl 6 with a sequence or a range?

Perl 6 has lazy lists, but it also has unbounded Range objects. Which one should you choose for counting up by whole numbers?
There's the unbounded Range with two dots:
0 .. *
There's the Seq (sequence) with three dots:
0 ... *
A Range generates lists of consecutive things using their natural order. It inherits from Iterable, but also Positional so you can index a range. You can check if something is within a Range, but that's not part of the task.
A Seq can generate just about anything you like as long as it knows how to get to the next element. It inherits from Iterable, but also PositionalBindFailover which fakes the Positional stuff through a cache and list conversion. I don't think that's a big deal if you're only moving from one element to the next.
I'm going back and forth on this. At the moment I'm thinking it's Range.

Both 0 .. * and 0 ... * are fine.
Iterating over them, for example with a for loop, has exactly the same effect in both cases. (Neither will leak memory by keeping around already iterated elements.)
Assigning them to a @ variable produces the same lazy Array.
So as long as you only want to count up numbers to infinity by a step of 1, I don't see a downside to either.
The ... sequence construction operator is more generic though, in that it can also be used to
count with a different step (1, 3 ... *)
count downwards (10 ... -Inf)
follow a geometric sequence (2, 4, 8 ... *)
follow a custom iteration formula (1, 1, *+* ... *)
so when I need to do something like that, then I'd consider using ... for any nearby and related "count up by one" as well, for consistency.
On the other hand:
A Range can be indexed efficiently without having to generate and cache all preceding elements, so if you want to index your counter in addition to iterating over it, it is preferable. The same goes for other list operations that deal with element positions, like reverse: Range has efficient overloads for them, whereas using them on a Seq has to iterate and cache its elements first.
If you want to count upwards to a variable end-point (as in 1 .. $n), it's safer to use a Range because you can be sure it'll never count downwards, no matter what $n is. (If the endpoint is less than the startpoint, as in 1 .. 0, it will behave as an empty sequence when iterated, which tends to get edge-cases right in practice.)
Conversely, if you want to safely count downwards ensuring it will never unexpectedly count upwards, you can use reverse 1 .. $n.
Lastly, a Range is a more specific/high-level representation of the concept of "numbers from x to y", whereas a Seq represents the more generic concept of "a sequence of values". A Seq is, in general, driven by arbitrary generator code (see gather/take) - the ... operator is just semantic sugar for creating some common types of sequences. So it may feel more declarative to use a Range when "numbers from x to y" is the concept you want to express. But I suppose that's a purely psychological concern... :P

Semantically speaking, a Range is a static thing (a bounded set of values), a Seq is a dynamic thing (a value generator) and a lazy List a static view of a dynamic thing (an immutable cache for generated values).
Rule of thumb: Prefer static over dynamic, but simple over complex.
In addition, a Seq is an iterable thing, a List is an iterable positional thing, and a Range is an ordered iterable positional thing.
Rule of thumb: Go with the most generic or most specific depending on context.
As we're dealing with iteration only and are not interested in positional access or bounds, using a Seq (which is essentially a boxed Iterator) seems like a natural choice. However, ordered sets of consecutive integers are exactly what an integer Range represents, and personally that's what I would see as most appropriate for your particular use case.
When there is no clear choice, I tend to prefer ranges for their simplicity anyway (and try to avoid lazy lists, the heavy-weight).
Note that the language syntax also nudges you in the direction of Range, which are rather heavily Huffman-coded (two-char infix .., one-char prefix ^).

There is a difference between ".." (Range) and "..." (Seq):
$ perl6
> 1..10
1..10
> 1...10
(1 2 3 4 5 6 7 8 9 10)
> 2,4...10
(2 4 6 8 10)
> (3,6...*)[^5]
(3 6 9 12 15)
The "..." operator can intuit patterns!
https://docs.perl6.org/language/operators#index-entry-..._operators
As I understand, you can traverse a Seq only once. It's meant for streaming where you don't need to go back (e.g., a file). I would think a Range should be a fine choice.

How can we prove that a bitcoin block is always solvable?

I'm trying to implement a simple cryptocurrency similar to bitcoin, just to understand it deeply down to the code level.
I understand that a bitcoin block contains a hash of the previous block, many transactions and a reward transaction for the solver.
The miner basically runs SHA256 on this candidate block combined with a random number. As long as the first certain digits of the hash result are zeros, we say this block is solved, and we broadcast the result to the entire network to claim the reward.
But I have never seen anyone prove that a block is solvable at all. I guess this is guaranteed by SHA256? Because the solution size is fixed, after trying enough inputs, you are guaranteed to hit every hash result? But how can you prove that the solution distribution of a block is even (uniform), so that you can indeed cover all hash results?
Now, suppose a block is indeed always solvable. Can I assume that using a 64-bit random integer is enough to solve it? How about 32-bit? Or do I have to use an integer of unbounded size?
For example, in the basiccoin project, the code for proof of work is the following:
def POW(block, hashes):
    halfHash = tools.det_hash(block)
    block[u'nonce'] = random.randint(0, 10000000000000000000000000000000000000000)
    count = 0
    while tools.det_hash({u'nonce': block['nonce'],
                          u'halfHash': halfHash}) > block['target']:
        count += 1
        block[u'nonce'] += 1
        if count > hashes:
            return {'error': False}
        if restart_signal.is_set():
            restart_signal.clear()
            return {'solution_found': True}
        ''' for testing sudden loss in hashpower from miners.
        if block[u'length']>150:
        else: time.sleep(0.01)
        '''
    return block
This code picks a random number in [0, 10000000000000000000000000000000000000000] as a starting point, and then it just increases the value one by one:
block[u'nonce'] += 1
I'm not a Python programmer, so I don't know how Python handles the type of the integer. There is no handling of integer overflow.
I'm trying to implement a similar thing in C++, and I don't know what kind of integer can guarantee a solution.
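For reference, the mining loop described above can be reduced to a toy sketch like the one below. It is not Bitcoin's or basiccoin's actual block format: the header bytes, the 8-byte nonce encoding and the 12-bit difficulty are all assumptions chosen to keep the example small. Python integers have arbitrary precision, so the nonce itself cannot overflow; the fixed 8-byte encoding is where a width limit would show up in a C++ port.

```python
import hashlib


def mine(header: bytes, target: int, max_tries: int = 1_000_000):
    """Try nonces 0, 1, 2, ... until SHA-256(header + nonce) is below target."""
    for nonce in range(max_tries):
        digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    # Budget exhausted: a caller would change the header (for example the
    # timestamp or the transaction set) and start over.
    return None


# Easy target: the top 12 bits must be zero, so roughly 1 in 4096 hashes qualifies.
TARGET = 1 << (256 - 12)
RESULT = mine(b"toy block header", TARGET)
```

Nothing here proves that a given header has a solution within a fixed nonce width; the sketch simply gives up after `max_tries` and leaves it to the caller to vary the header.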

but how can you prove that the solution distribution of a block is even (uniform), so that you can indeed cover all hash results?
SHA256 is deterministic, so if you rehash the txns it will always produce the same 256-bit hash.
The client nodes keep all the txn and the hashes in the merkle tree for the network clients to propagate and verify the longest possible block chain.
The merkle tree is the essential data structure for recording the hashes of previous blocks.
From there the chain of hash confirmations can be tracked from the origin (genesis) block.

Crypto algorithm varying based on password itself?

I've tried searching and found nothing (don't know what these would be called if they were things already, so searching is kind of hard), so forgive me if this is dumb or already answered somewhere. For the sake of argument let's say I'm using bcrypt or something of that reputation/quality when I say I'm hashing something.
First, is there a reason that your hashing algorithm cannot vary with the password or its intermediate hashes?
public static byte[] myHash(byte[] input, byte[] saltA, byte[] saltB) {
return input[0] % 2 == 0
? bcrypt(bcrypt(input, saltA), saltB)
: bcrypt(bcrypt(input, saltB), saltA);
}
I feel like this doesn't use much CPU - it's just two iterations of bcrypt, and I've seen suggestions for 10+ iterations in other security discussions. But let's say bcrypt was discovered to be fully reversible if you knew the salt and the hash. One unhashing would now necessitate unhashing it twice - once with saltA then saltB, and once vice versa - giving you two candidate passwords, one of which is a decoy with a 50% false positive rate (that is, it rehashes to the correct hash because its first byte is correctly even or odd), requiring heuristics or human eyes to correctly identify the real one. So you've at least doubled the computing resources needed and possibly required human intervention. But we can do better:
public static byte[] myBetterHash(byte[] input, byte[] saltA, byte[] saltB) {
    byte[] curr = input;
    for (int i = 0; i < 5; i++) {
        // Mask first: Java bytes are signed, and a negative remainder would match no case
        switch ((input[i] & 0xFF) % 3) {
            case 0: curr = bcrypt(bcrypt(bcrypt(curr, input), saltA), saltB); break;
            case 1: curr = bcrypt(bcrypt(bcrypt(curr, saltB), input), saltA); break;
            case 2: curr = bcrypt(bcrypt(bcrypt(curr, saltA), saltB), input); break;
        }
    }
    return curr; // return the derived hash, not the original input
}
Now there are 3 unhashes per iteration over the 5 iterations, yielding 243 candidate passwords, and probably dozens of false positives to eliminate, but even if not, they then had to do 243 times the unhashing work they would have if you had just done it. Also, the inclusion of the input again as a salt in subsequent hashes makes it impossible to actually do the unhashing, plus it requires the attacker to hold onto a little extra memory. That said, my last idea is as follows:
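public static byte[] myBestHash(byte[] input, byte[] saltA, byte[] saltB) {
    byte[] curr = input;
    byte[][] arr = new byte[16][];
    for (int i = 0; i < 16; i++) {
        arr[i] = curr;
        // Mask first: Java bytes are signed, and a negative remainder would match no case
        switch ((curr[0] & 0xFF) % 4) {
            case 0: curr = bcrypt(curr, saltA); break;
            case 1: curr = bcrypt(curr, saltB); break;
            // % (i + 1) avoids division by zero at i == 0 and stays within the filled slots
            case 2: curr = bcrypt(curr, arr[(input[i] & 0xFF) % (i + 1)]); break;
            case 3: curr = bcrypt(bcrypt(curr, saltA), saltB); break;
        }
    }
    return curr; // return the derived hash, not the original input
}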
Now the attacker has to deal with an immense number of potential unhashings (3^16 = over 43 million), each of which has to be verified with the above memory-intensive procedure (it holds onto 16 intermediate hashes and there's no way to optimize that out).
Second, I feel like the memory intensiveness of that final example paired with the branching salts and maybe even the fact that one of the branches calls bcrypt twice instead of once might, in some combination, make things harder to brute force with graphics cards by making the task at hand ill-suited to them or making the process waste more I/O than normal. If nothing else, extending this approach beyond 16 iterations will continue to bloat the RAM usage, making it harder to parallelize. Imagine if 256 iterations were used and space for 256 intermediate hashes had to be held onto for every hash that got leaked in an attack - if the intermediate hashes themselves are, say, 1024 bits (= 128 bytes), that's 32kB of wasted memory for every iteration of the brute force attack, which isn't much, but it will definitely add up to at least slightly slower iterations for a brute force attacker and fewer iterations in parallel (due to the extra memory - though 32kB isn't much against a modern password cracking rig, that's 32GB written to memory for a million guesses, and if nothing else that should slow things down a little extra).
So, am I onto something, or is this completely stupid?

This more properly belongs on crypto, but here's my $0.02:
Your calculations are slightly off and depend on some assumptions you make but which don't necessarily hold.
Now the idea of extra iterations isn't new - PBKDF2 uses iterations in a very similar way and bcrypt uses iterations internally already.
The "multi salt" idea gains you little in reality, since the salt is usually stored alongside the hashed password in plaintext anyway.
This is one of those things where you think "well, if 1 function call and 1 salt are good, then 50 iterations and 2 salts are better. This couldn't possibly hurt!" But this isn't how cryptography works and this sort of thing could hurt although it doesn't in this case; it just wastes resources.
Please don't do this sort of thing blindly. If you want to increase resistance to brute force dictionary attacks against hashes, adjust the bcrypt difficulty factor. Similarly, choose a long and unique salt and let bcrypt do its thing.
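To illustrate the "turn the work factor up instead of nesting calls" advice with something runnable, here is a sketch that swaps in PBKDF2 from Python's standard library, because bcrypt is not part of the standard library. The iteration count plays the role of the difficulty factor; the default of 200,000 is only a placeholder, not a recommendation.

```python
import hashlib
import hmac
import os


def hash_password(password: str, iterations: int = 200_000):
    # A fresh random salt per password, stored next to the hash.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)
```

Raising `iterations` makes every guess proportionally more expensive for an attacker without any custom branching logic.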

A) The adversary is obviously prepared to compute bcrypt() as most often used, for very low cost. B) They may also be prepared to run N simple iterations, with some cheat, cheaper than we would expect. C) They are very likely not able to reverse the hash function.
In any case, the "outer" procedure using bcrypt() would have to be written extremely stupidly to lower the strength. Branching makes it harder for the adversary to use its standard crack-rigs: certainly for A) and B), and maybe also for C) - but in case of C) there are big problems anyway.
Combine different types of cryptographic hashing functions and apply branching, and unless some trivial mistake is made it won't be weaker; most likely it will be harder for the adversary to apply the standard crack-rigs. If the base hash you're building on is fundamentally flawed (many assume that if that were the case they would know it, but that's just naive), the branching might not solve it, but it is more likely to make cracking harder than easier. Wasting resources is good in this case: branching, and a different procedure than "stupidly just apply it multiple times", will waste more adversary resources than honest resources, so that's good.

Make hash value reserved

I need to use an existing (C++) hash function which creates 32 bit hash values for given keys.
The function is extremely complicated.
Now I need to have one value reserved, i.e. so the hash function will never output this value.
Is there a safe way of doing so without understanding/changing complex logic of the existing hash function?
Many thanks...

The simplest approach if you want a hash function which will never return zero:
int hash;
hash = compute_hash_one_way(); // Hopefully it's not zero
if (hash) return hash; // In which case we return it
hash = compute_hash_another_way(); // Try something else
if (hash) return hash; // If that was good, return that
return 8675309; // We know THAT's not zero
The second hash computation need not be anything fancy; basically, if one has available any non-zero value that kinda-sorta depends on the input, one may as well use it in preference to returning a constant, but it would likely be better to use a really crummy fast hash function (or even simply always return a constant if the original returned zero) than spend so much time computing the second hash that outside code might infer that the original hash was zero. Note that if the original hash is good, even returning a constant when the original hash returns zero will only cause that constant to be returned for one in two billion inputs rather than one in four billion.
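The simplest variant, returning a constant whenever the original hash lands on the reserved value, might look like the sketch below. `zlib.crc32` is only a stand-in for the existing complicated 32-bit hash, and `safe_hash` is a made-up name.

```python
import zlib

RESERVED = 0          # the value the hash must never produce
FALLBACK = 8675309    # any non-reserved constant will do


def safe_hash(key: bytes, hash_fn=zlib.crc32) -> int:
    # Wrap the existing function without touching its internals.
    h = hash_fn(key) & 0xFFFFFFFF
    return FALLBACK if h == RESERVED else h
```

The wrapper only changes the output for inputs whose original hash was the reserved value, so the distribution of every other input is untouched.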
[Incidentally, if I had written the specs for GetHashCode or hashcode in .NET/Java, I would have strongly recommended that a good hash function should only return zero if it could do so essentially instantaneously. The extra time required to e.g. have Integer.GetHashCode() never return zero would in most cases exceed any time that might be spent calling GetHashCode redundantly on the value zero, but something like a string hash which returns zero can on some occasions have major performance implications.]

Looks like you need 'optional' keys. You'd then do
hash = hash_combine(has_value()? 1 : 0, has_value()? hash(value()) : 0);
Alternatively, if you insist, you could reduce the number of bits to 31
compromised_hash = SHIFT_RIGHT(raw_hash) ^ (raw_hash & 0x7FFFFFFF); // just an example; both operands have a clear MSB
Now, the MSB will always be empty. If it is not: you have your special marker. It would not be easy to make this so that it reduces the hash domain by only 1 element (unless you can change the raw hash function).
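A small sketch of the 31-bit idea, assuming an unsigned 32-bit raw hash. The fold used here (shifted value XOR the low 31 bits) is just one possible choice that keeps the top bit clear; `hash31` and `MARKER` are names invented for the example.

```python
MARKER = 0x80000000  # has the top bit set, so hash31() can never return it


def hash31(raw_hash: int) -> int:
    raw_hash &= 0xFFFFFFFF
    # Both operands have bit 31 clear, so the result fits in 31 bits.
    return (raw_hash >> 1) ^ (raw_hash & 0x7FFFFFFF)
```

This gives up half of the 32-bit output range to reserve a single value, which is the trade-off described above.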

Complexity of List.reverse?

In Scala, there is a reverse method for lists. What is the complexity of this method? Is it better to simply use the original list and always remember that the list is the reverse of what we expect, or to explicitly use reverse before operating on it?
EDIT: What I am really interested in is to get the last two elements of the original list (or the first two of the reversed list).
So I would do something like:
val myList = origList.reverse
val a = myList(0)
val b = myList(1)
This is not in a loop, just a one-time thing in my library... but if someone else uses the library and puts it in a loop, it is not under my control.

Looking at the source, it's O(n) as you might reasonably expect:
override def reverse: List[A] = {
  var result: List[A] = Nil
  var these = this
  while (!these.isEmpty) {
    result = these.head :: result
    these = these.tail
  }
  result
}
If in your code you're able to iterate through the list in reverse order at the same cost of iterating in forward order, then it would be more efficient to do this rather than reversing the List.
In fact, if your alternative operation which involves using the original list works in less than O(n) time, then there's a real argument for going with that. Making an algorithm asymptotically faster will make a huge difference if you ever rely on it more (especially if used inside other loops, as oxbow_lakes points out below).
On the whole though I'd expect that anything where you're reversing a list means that you care about the relative ordering of a non-trivial number of elements, and so whatever you're doing is inherently O(n) anyway. (This might not be true for other data structures such as a binary tree; but lists are linear, and in the extreme case even reverse . head can't be done in O(1) time with a singly-linked list.)
So if you're choosing between two O(n) options - for the vast majority of applications, shaving a few nanoseconds off the iteration time isn't going to really gain you anything. Hence it would be "best" to make your code as readable as possible - which means calling reverse and then iterating, if that's closest to your intention.
(And if your app is too slow, and profiling shows that this list manipulation is a hotspot, then you can think about how to make it more efficient. Which by that point may well involve a different option to both of your current candidates, given the extra context you'll have at that point.)
