How do I search a hash table? - hash

I've just started learning about hash tables and I understand how to insert but not how to search. These are the algorithms I'll be basing this question off:
Hashing the key
int Hash (int key) {
return key % 10; //table has a max size of 10
}
Linear probing for collision resolution.
Suppose I call insert three times with the keys 1, 11, and 21. The hash function would return slot 1 for all 3 keys. After collision resolution the table would have the values 1, 11, and 21 at slots 1, 2, and 3. This is what I assume would happen with my understanding of inserting.
After doing this, how would I get the slots 2 and 3 if I search for the keys 11 and 21? From what I've read searching a hash table should do literally the same thing as inserting except when you arrive at the desired slot, you return the value at that slot instead of inserting something into it.
If I take this literally and apply the same algorithm, if I search for the key 11 I would arrive at slot 4 because it would start at slot 1 and keep probing forward until it finds an empty slot. It wouldn't stop at slot 2 even though it's what I want because it's not empty.
I'm struggling with this even if I use separate chaining. All 3 keys would be stored at slot 1 but using the same algorithm to search would return slot 1, not which node in the linked list.

Each slot stores a key/value pair. As you're searching through each slot, check whether the key is equal to the key you're searching for. Stop searching and return the value when you find an equal key.
With separate chaining, you can do a linear search through the list, checking the key against each key in the list.
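A minimal sketch of that idea for the linear-probing case, written in JavaScript for brevity. The names createTable, insert and search are made up for this example, keys are assumed to be non-negative integers, and deletion (which would need tombstones) is left out.

```javascript
// Open-addressing table with linear probing (illustrative names only).
const TABLE_SIZE = 10;
const EMPTY = null;

function hash(key) {
  return key % TABLE_SIZE; // assumes non-negative integer keys
}

function createTable() {
  return new Array(TABLE_SIZE).fill(EMPTY);
}

// Insert: probe forward from the home slot until an empty slot
// (or a slot already holding this key) is found.
function insert(table, key, value) {
  const start = hash(key);
  for (let step = 0; step < TABLE_SIZE; step++) {
    const slot = (start + step) % TABLE_SIZE;
    if (table[slot] === EMPTY || table[slot].key === key) {
      table[slot] = { key, value };
      return slot;
    }
  }
  throw new Error("table is full");
}

// Search: follow the same probe sequence, but stop as soon as the
// stored key equals the key being searched for. Reaching an empty
// slot first means the key was never inserted.
function search(table, key) {
  const start = hash(key);
  for (let step = 0; step < TABLE_SIZE; step++) {
    const slot = (start + step) % TABLE_SIZE;
    if (table[slot] === EMPTY) return -1;
    if (table[slot].key === key) return slot;
  }
  return -1; // probed every slot without a match
}

// After inserting keys 1, 11 and 21 into a fresh table,
// search(table, 11) returns 2 and search(table, 21) returns 3.
```

The key comparison is what makes search stop at slot 2 for key 11 instead of running on to the first empty slot.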

I usually prefer to make each entry in the table a struct so I can create a linked list to handle collisions. This doesn't reduce the number of collisions, but it keeps colliding keys from spilling into other slots. Something like this.
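struct hashtable
{
int key; // 0 marks an empty slot, so key 0 itself cannot be stored
struct hashtable *pList; // next node in this slot's chain, or 0
};
struct hashtable ht[10]; // global, so every slot starts zeroed (empty)
void Insert(int key)
{
int index = Hash(key);
if (!ht[index].key)
{
// Empty slot: store the key directly in the table.
ht[index].key = key;
ht[index].pList = 0;
}
else
{
// Collision: start at the slot itself and walk to the end of its chain.
struct hashtable *pht = &ht[index];
while (pht->pList)
pht = pht->pList;
pht->pList = new struct hashtable;
pht->pList->key = key;
pht->pList->pList = 0;
}
}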
The lookup function would, of course, have to traverse the list if the first entry's key doesn't match. If performance is critical, you could use other strategies for the chains, such as keeping each one sorted so a lookup can stop early, or storing each chain as a sorted array so you can use a binary search.
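As a rough illustration of that traversal, here is a small separate-chaining sketch in JavaScript. The names createChainedTable, chainedInsert and chainedLookup are invented for the example, keys are assumed to be non-negative integers, and new nodes are prepended to the chain rather than appended as in the code above.

```javascript
// Separate chaining: each bucket holds the head of a singly linked list.
const BUCKET_COUNT = 10;

function createChainedTable() {
  return new Array(BUCKET_COUNT).fill(null);
}

function chainedInsert(buckets, key, value) {
  const index = key % BUCKET_COUNT; // assumes non-negative integer keys
  // If the key is already in the chain, just update its value.
  for (let node = buckets[index]; node !== null; node = node.next) {
    if (node.key === key) {
      node.value = value;
      return;
    }
  }
  // Otherwise prepend a new node to the chain.
  buckets[index] = { key, value, next: buckets[index] };
}

// Lookup: hash to the bucket, then compare keys node by node.
function chainedLookup(buckets, key) {
  const index = key % BUCKET_COUNT;
  for (let node = buckets[index]; node !== null; node = node.next) {
    if (node.key === key) return node.value;
  }
  return undefined; // key is not in the table
}
```

Hashing only picks the bucket. The key stored in each node is what tells the three colliding keys 1, 11 and 21 apart.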

Why am I getting ReferenceOutOfRangeException while using PlayerPrefs with a list in Unity [duplicate]

I have some code and when it executes, it throws an IndexOutOfRangeException, saying,
Index was outside the bounds of the array.
What does this mean, and what can I do about it?
Depending on the classes used, it can also be ArgumentOutOfRangeException:
An exception of type 'System.ArgumentOutOfRangeException' occurred in mscorlib.dll but was not handled in user code Additional information: Index was out of range. Must be non-negative and less than the size of the collection.
What Is It?
This exception means that you're trying to access a collection item by index, using an invalid index. An index is invalid when it's lower than the collection's lower bound or greater than or equal to the number of elements it contains.
When It Is Thrown
Given an array declared as:
byte[] array = new byte[4];
You can access this array from 0 to 3. Values outside this range will cause IndexOutOfRangeException to be thrown. Remember this when you create and access an array.
Array Length
In C#, arrays are usually 0-based. It means that the first element has index 0 and the last element has index Length - 1 (where Length is the total number of items in the array), so this code doesn't work:
array[array.Length] = 0;
Moreover, please note that if you have a multidimensional array then you can't use Array.Length for both dimensions, you have to use Array.GetLength():
int[,] data = new int[10, 5];
for (int i=0; i < data.GetLength(0); ++i) {
for (int j=0; j < data.GetLength(1); ++j) {
data[i, j] = 1;
}
}
Upper Bound Is Not Inclusive
In the following example we create a raw bidimensional array of Color. Each item represents a pixel, and indices run from (0, 0) to (imageWidth - 1, imageHeight - 1).
Color[,] pixels = new Color[imageWidth, imageHeight];
for (int x = 0; x <= imageWidth; ++x) { // Bug: <= lets x reach imageWidth
for (int y = 0; y <= imageHeight; ++y) { // Bug: <= lets y reach imageHeight
pixels[x, y] = backgroundColor;
}
}
This code will fail because the array is 0-based and the last (bottom-right) pixel in the image is pixels[imageWidth - 1, imageHeight - 1], so an access like this one is out of range:
pixels[imageWidth, imageHeight] = Color.Black;
In another scenario you may get ArgumentOutOfRangeException for this code (for example if you're using GetPixel method on a Bitmap class).
Arrays Do Not Grow
An array is fast. Very fast in linear search compared to every other collection. That is because items are contiguous in memory, so the address of any item can be calculated (and moving to the next item is just an addition). No need to follow a node list, simple math! You pay for this with a limitation: arrays can't grow. If you need more elements you need to reallocate the array (this may take a relatively long time if old items must be copied to a new block). You resize them with Array.Resize<T>(); this example adds a new entry to an existing array:
Array.Resize(ref array, array.Length + 1);
Don't forget that valid indices are from 0 to Length - 1. If you simply try to assign an item at Length you'll get IndexOutOfRangeException (this behavior may confuse you if you think they may increase with a syntax similar to Insert method of other collections).
Special Arrays With Custom Lower Bound
The first item in an array usually has index 0. This is not always true, because you can create an array with a custom lower bound:
var array = Array.CreateInstance(typeof(byte), new int[] { 4 }, new int[] { 1 });
In that example, array indices are valid from 1 to 4. Of course, upper bound cannot be changed.
Wrong Arguments
If you access an array using unvalidated arguments (from user input or from function user) you may get this error:
private static string[] RomanNumbers =
new string[] { "I", "II", "III", "IV", "V" };
public static string Romanize(int number)
{
return RomanNumbers[number];
}
Unexpected Results
This exception may be thrown for another reason too: by convention, many search functions return -1 if they don't find anything (nullables were only introduced with .NET 2.0, and in any case this convention has been in use for many years). Let's imagine you have an array of objects comparable with a string. You may think to write this code:
// Items comparable with a string
Console.WriteLine("First item equals to 'Debug' is '{0}'.",
myArray[Array.IndexOf(myArray, "Debug")]);
// Arbitrary objects
Console.WriteLine("First item equals to 'Debug' is '{0}'.",
myArray[Array.FindIndex(myArray, x => x.Type == "Debug")]);
This will fail if no item in myArray satisfies the search condition, because Array.IndexOf() and Array.FindIndex() will return -1 and then the array access will throw.
The next example is a naive way to count occurrences of a given set of numbers (knowing the maximum number and returning an array where the item at index 0 represents number 0, the item at index 1 represents number 1 and so on):
static int[] CountOccurences(int maximum, IEnumerable<int> numbers) {
int[] result = new int[maximum + 1]; // Includes 0
foreach (int number in numbers)
++result[number];
return result;
}
Of course, it's a pretty terrible implementation but what I want to show is that it'll fail for negative numbers and numbers above maximum.
How it applies to List<T>?
The same cases as for arrays apply. The range of valid indexes runs from 0 (List's indexes always start with 0) to list.Count - 1, and accessing elements outside of this range will cause the exception.
Note that List<T> throws ArgumentOutOfRangeException for the same cases where arrays use IndexOutOfRangeException.
Unlike arrays, List<T> starts empty, so trying to access items of a just-created list leads to this exception.
var list = new List<int>();
A common case is trying to populate the list with indexing (similar to Dictionary<int, T>), which will cause the exception:
list[0] = 42; // exception
list.Add(42); // correct
IDataReader and Columns
Imagine you're trying to read data from a database with this code:
using (var connection = CreateConnection()) {
using (var command = connection.CreateCommand()) {
command.CommandText = "SELECT MyColumn1, MyColumn2 FROM MyTable";
using (var reader = command.ExecuteReader()) {
while (reader.Read()) {
ProcessData(reader.GetString(2)); // Throws!
}
}
}
}
GetString() will throw IndexOutOfRangeException because your dataset has only two columns but you're trying to get a value from the 3rd one (indices are always 0-based).
Please note that this behavior is shared with most IDataReader implementations (SqlDataReader, OleDbDataReader and so on).
You can get the same exception also if you use the IDataReader overload of the indexer operator that takes a column name and pass an invalid column name.
Suppose for example that you have retrieved a column named Column1 but then you try to retrieve the value of that field with
var data = dr["Colum1"]; // Missing the n in Column1.
This happens because the indexer operator is implemented trying to retrieve the index of a Colum1 field that doesn't exist. The GetOrdinal method will throw this exception when its internal helper code returns a -1 as the index of "Colum1".
Others
There is another (documented) case when this exception is thrown: if, in DataView, the data column name being supplied to the DataView.Sort property is not valid.
How to Avoid
In these examples, let me assume, for simplicity, that arrays are always monodimensional and 0-based. If you want to be strict (or you're developing a library), you may need to replace 0 with GetLowerBound(0) and .Length with GetUpperBound(0) (of course only if you have parameters of type System.Array; it doesn't apply to T[]). Please note that in this case the upper bound is inclusive, so this code:
for (int i=0; i < array.Length; ++i) { }
Should be rewritten like this:
for (int i=array.GetLowerBound(0); i <= array.GetUpperBound(0); ++i) { }
Please note that this is not allowed (it'll throw InvalidCastException). That's why, if your parameters are T[], you're safe from custom lower bound arrays:
void foo<T>(T[] array) { }
void test() {
// This will throw InvalidCastException, cannot convert Int32[*] to Int32[]
foo((int[])Array.CreateInstance(typeof(int), new int[] { 1 }, new int[] { 1 }));
}
Validate Parameters
If an index comes from a parameter you should always validate it (throwing an appropriate ArgumentException or ArgumentOutOfRangeException). In the next example, wrong parameters may cause IndexOutOfRangeException. Users of this function may expect this because they're passing an array, but it's not always so obvious. I'd suggest always validating parameters for public functions:
static void SetRange<T>(T[] array, int from, int length, Func<int, T> function)
{
if (from < 0 || from >= array.Length)
throw new ArgumentOutOfRangeException("from");
if (length < 0)
throw new ArgumentOutOfRangeException("length");
if (from + length > array.Length)
throw new ArgumentException("...");
for (int i=from; i < from + length; ++i)
array[i] = function(i);
}
If function is private you may simply replace if logic with Debug.Assert():
Debug.Assert(from >= 0 && from < array.Length);
Check Object State
An array index may not come directly from a parameter. It may be part of object state. In general, it is always good practice to validate object state (by itself and with function parameters, if needed). You can use Debug.Assert(), throw a proper exception (more descriptive about the problem) or handle it like in this example:
class Table {
public int SelectedIndex { get; set; }
public Row[] Rows { get; set; }
public Row SelectedRow {
get {
if (Rows == null)
throw new InvalidOperationException("...");
// No or wrong selection, here we just return null for
// this case (it may be the reason we use this property
// instead of direct access)
if (SelectedIndex < 0 || SelectedIndex >= Rows.Length)
return null;
return Rows[SelectedIndex];
}
}
}
Validate Return Values
In one of the previous examples we directly used the Array.IndexOf() return value. If we know it may fail then it's better to handle that case:
int index = Array.IndexOf(myArray, "Debug");
if (index != -1) { } else { }
How to Debug
In my opinion, most of the questions here on SO about this error could simply be avoided. The time you spend writing a proper question (with a small working example and a small explanation) could easily be much more than the time you'll need to debug your code. First of all, read Eric Lippert's blog post about debugging of small programs. I won't repeat his words here but it's absolutely a must read.
You have the source code and you have the exception message with a stack trace. Go there, pick the right line number and you'll see:
array[index] = newValue;
You found your error. Check how index increases. Is it right? Check how the array is allocated: is it coherent with how index increases? Is it right according to your specifications? If you answer yes to all these questions, then you'll find good help here on StackOverflow, but please first check for that by yourself. You'll save your own time!
A good starting point is to always use assertions and to validate inputs. You may even want to use code contracts. When something goes wrong and you can't figure out what happens with a quick look at your code, then you have to resort to an old friend: the debugger. Just run your application in debug inside Visual Studio (or your favorite IDE) and you'll see exactly which line throws this exception, which array is involved and which index you're trying to use. Really, 99% of the time you'll solve it by yourself in a few minutes.
If this happens in production then you'd better add assertions in the incriminated code. We probably won't see in your code what you can't see by yourself (but you can always bet).
The VB.NET side of the story
Everything that we have said in the C# answer is valid for VB.NET with the obvious syntax differences but there is an important point to consider when you deal with VB.NET arrays.
In VB.NET, arrays are declared setting the maximum valid index value for the array. It is not the count of the elements that we want to store in the array.
' declares an array with space for 5 integers
' 4 is the maximum valid index starting from 0 to 4
Dim myArray(4) as Integer
So this loop will fill the array with 5 integers without causing any IndexOutOfRangeException
For i As Integer = 0 To 4
myArray(i) = i
Next
The VB.NET rule
This exception means that you're trying to access a collection item by index, using an invalid index. An index is invalid when it's lower than the collection's lower bound or greater than the maximum allowed index defined in the array declaration.
A simple explanation of what an index out of bounds exception is:
Just think of a train whose compartments are D1, D2 and D3.
A passenger comes to board the train with a ticket for D4.
What will happen now? The passenger wants to enter a compartment that does not exist, so obviously a problem will arise.
It's the same scenario: whenever we try to access an array, list, etc., we can only access the indexes that exist in it. Say array[0] and array[1] exist. If we try to access array[3], it's not actually there, so an index out of bounds exception will arise.
To easily understand the problem, imagine we wrote this code:
static void Main(string[] args)
{
string[] test = new string[3];
test[0]= "hello1";
test[1]= "hello2";
test[2]= "hello3";
for (int i = 0; i <= 3; i++)
{
Console.WriteLine(test[i].ToString());
}
}
Result will be:
hello1
hello2
hello3
Unhandled Exception: System.IndexOutOfRangeException: Index was outside the bounds of the array.
Size of array is 3 (indices 0, 1 and 2), but the for-loop loops 4 times (0, 1, 2 and 3). So when it tries to access outside the bounds with (3) it throws the exception.
Aside from the very long, complete accepted answer, there is an important point to make about IndexOutOfRangeException compared with many other exception types, and that is:
Often there is complex program state that may be difficult to have control over at a particular point in code, e.g. a DB connection goes down so data for an input cannot be retrieved, etc. This kind of issue often results in an Exception of some kind that has to bubble up to a higher level because where it occurs has no way of dealing with it at that point.
IndexOutOfRangeException is generally different in that in most cases it is pretty trivial to check for at the point where the exception is being raised. Generally this kind of exception gets thrown by some code that could very easily deal with the issue at the place it is occurring, just by checking the actual length of the array. You don't want to 'fix' this by handling this exception higher up, but instead by ensuring it's not thrown in the first instance, which in most cases is easy to do by checking the array length.
Another way of putting this is that other exceptions can arise due to genuine lack of control over input or program state BUT IndexOutOfRangeException more often than not is simply just pilot (programmer) error.
These two exceptions are common in various programming languages and, as others said, they occur when you access an element with an index greater than or equal to the size of the array. For example:
var array = [1,2,3];
/* var lastElement = array[3] this will throw an exception, because indices
start from zero, length of the array is 3, but its last index is 2. */
The main reason behind this is that compilers usually don't check array bounds, since an index is often only known at runtime, hence these errors only show up at runtime.
Similar to this:
Why don't modern compilers catch attempts to make out-of-bounds access to arrays?

Does this sorting algorithm exist? (implemented in Swift)

This might be a bad question but I am curious.
I was following some data structures and algorithms courses online, and I came across algorithms such as selection sort, insertion sort, bubble sort, merge sort, quick sort and heap sort. They almost never get close to O(n) when the array is reverse-sorted.
I was wondering one thing: why are we not trading space for time?
When I organise something I pick up one item and put it where it belongs. So I thought if we have an array of items, we could just put each value at the index with that value.
Here is my implementation in Swift 4:
let simpleArray = [5,8,3,2,1,9,4,7,0]
let maxSpace = 20
func spaceSort(array: [Int]) -> [Int] {
guard array.count > 1 else {
return array
}
var realResult = [Int]()
var result = Array<Int>(repeating: -1, count: maxSpace)
for i in 0..<array.count{
if(result[array[i]] != array[i]){
result[array[i]] = array[i]
}
}
for i in 0..<result.count{
if(result[i] != -1){
realResult.append(i)
}
}
return realResult
}
var spaceSorted = [Int]()
var execTime = BenchTimer.measureBlock {
spaceSorted = spaceSort(array: simpleArray)
}
print("Average execution time for simple array: \(execTime)")
print(spaceSorted)
Results I get:
Does this sorting algorithm exist already?
Is this a bad idea because it only takes unique values and loses the duplicates? Or could there be uses for it?
And why can't I use Int.max for the maxSpace?
Edit:
I get the error below when I use let maxSpace = Int.max
error: Execution was interrupted.
MyPlayground(6961,0x7000024af000) malloc: Heap corruption detected, free list is damaged at 0x600003b7ebc0
* Incorrect guard value: 0
MyPlayground(6961,0x7000024af000) malloc: * set a breakpoint in malloc_error_break to debug
Thanks for the answers
This is an extreme version of radix sort. Quoted from Wikipedia:
radix sort is a non-comparative sorting algorithm. It avoids comparison by creating and distributing elements into buckets according to their radix. For elements with more than one significant digit, this bucketing process is repeated for each digit, while preserving the ordering of the prior step, until all digits have been considered. For this reason, radix sort has also been called bucket sort and digital sort.
In this case you choose your radix as maxSpace, and so you don't have any "elements with more than one significant digit" (from quote above).
Now, if you would use a Hash Set data structure instead of an array, you would actually not need to really allocate the space for the whole range. You would still keep all the loop iterations though (from 0 to maxSpace), and it would check whether the hash set contains the value of i (the loop variable), and if so, output it.
This can only be an efficient algorithm if maxSpace has the same order of magnitude as the number of elements in your input array. Other sorting algorithms can sort with O(n log n) time complexity, so for cases where maxSpace is much greater than n log n, the algorithm is not that compelling.
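On the question of losing duplicates: a closely related technique usually called counting sort keeps them by storing a count per value instead of the value itself. Below is a minimal JavaScript sketch of it, not the asker's Swift code. The function name countingSort and the maxValue parameter are choices made for this example, and the input is assumed to contain only integers between 0 and maxValue.

```javascript
// Counting sort for integers in the range 0..maxValue (inclusive).
// Runs in O(n + maxValue) time and uses O(maxValue) extra space.
function countingSort(values, maxValue) {
  const counts = new Array(maxValue + 1).fill(0);
  for (const v of values) {
    if (!Number.isInteger(v) || v < 0 || v > maxValue) {
      throw new RangeError("value out of range: " + v);
    }
    counts[v]++; // remember how many times v occurs
  }
  const sorted = [];
  for (let v = 0; v <= maxValue; v++) {
    // Emit v once per occurrence, so duplicates survive.
    for (let c = 0; c < counts[v]; c++) {
      sorted.push(v);
    }
  }
  return sorted;
}

// countingSort([3, 1, 3, 0], 3) returns [0, 1, 3, 3]
```

The second loop still visits every value from 0 to maxValue, so the cost concern about a large maxSpace applies here as well.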

Anylogic referencing columns in a collection

I am using a collection to represent available trucks in a system. I am using a 1 or 0 for a given index number, using a 1 to say that indexed truck is available. I am then trying to assign that index number to a customer ID. I am trying to randomly select an available truck from those listed as available. I am getting an error saying the left-hand side of an assignment must be a variable and highlighting the portion of the code reading Available_Trucks() = 1. This is the code:
agent.ID = randomWhere(Available_Trucks, Available_Trucks() = 1);
The way you are doing it won't work... randomWhere when applied to a collection of integers, will return the element of the collection (in this case 1 or 0).
So doing
randomWhere(Available_Trucks,at->at==1); //this is the right syntax
will always return 1, since that's the value of the element chosen from the collection. So what you need is the index of an element of the collection that is equal to 1. But you will have to create a function to do that yourself, something like this (probably not the best way but it works):
agent.ID=getRandomAvailbleTruck(Available_Trucks);
And the function getRandomAvailbleTruck will take a collection as its argument (an ArrayList probably). It will return -1 if there is no available truck:
int availableTrucks=count(collection,c->c==1);
if(availableTrucks==0) return -1;
int rand=uniform_discr(1,availableTrucks);
int i=0;
int j=0;
while(i<rand){
if(collection.get(j)==1){
i++;
if(i==rand){
return j;
}
}
j++;
}
return -1;
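The same selection can also be done by first collecting the indices of all available trucks and then drawing one of them. The sketch below shows that variant in plain JavaScript rather than AnyLogic Java. The name pickAvailableIndex and the injectable randomInt helper are made up for the example, and the collection is assumed to be a list of 0/1 flags where the position is the truck number.

```javascript
// availability: array of 0/1 flags; the index is the truck number.
// randomInt(n): returns an integer in [0, n). It is a parameter only
// so that the choice can be made deterministic when testing.
function pickAvailableIndex(
  availability,
  randomInt = (n) => Math.floor(Math.random() * n)
) {
  // Gather the positions of all trucks flagged as available.
  const candidates = [];
  availability.forEach((flag, index) => {
    if (flag === 1) candidates.push(index);
  });
  if (candidates.length === 0) return -1; // no truck available
  // Pick one of the candidate positions uniformly at random.
  return candidates[randomInt(candidates.length)];
}
```

As with the function above, a result of -1 means there is nothing to assign, so the caller should check for it before using the value as an ID.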
Another idea is, instead of using 0 and 1 for the availability, to use consecutive numbers 1,2,3,4,5 and so on, with a 0 where a truck is not available. For instance, if truck 3 is not available the array will be 1,2,0,4,5, and if it's available it will be 1,2,3,4,5.
In that case you can use
agent.ID=randomWhere(Available_Trucks,at->at>0);
But you will get an error if there is no available truck, so check that.
Nevertheless, what you are doing is horrible practice. There is a much easier way to do it if your truck is an agent and you put the availability in the truck itself.
Then you can just do
Truck truck=randomWhere(trucks,t->t.available==1);
if(truck!=null)
agent.ID=truck.ID;

Assigning times to events

include "globals.mzn";
%Data
time_ID = [11,12,13,14,15];
eventId = [0011, 0012, 0013, 0021, 0022, 0031, 0041, 0051, 0061, 0071];
int:ntime = 5;
int:nevent = 10;
set of int: events =1..nevent;
set of int: time = 1..ntime;
array[1..nevent] of int:eventId;
array[1..nevent] of var time:event_time;
array[1..ntime] of int:time_ID;
solve satisfy;
constraint
forall(event in eventId)(
exists(t in time_ID)(
event_time[event] = t ));
output[ show(event_time) ];
I'm trying to assign times to an event using the code above.
But rather than randomly assign times to the events, it returns an error " array access out of bounds"
How can I make it select randomly from the time array?
Thank you
The error occurred because you tried to access index 11 (the first element in the eventId array) of the "event_time" array, which only has indices 1..nevent.
The assignment of just 1's is correct since you haven't put any other constraints on the "event_time" array. If you set the number of solutions to, say, 3 you will see other solutions. And, in fact, the constraint as it stands now is not really meaningful since it just ensures that there is some assignment to the elements in "event_time", but this is already handled by the domain of "event_time" (i.e. that all indices are in the range 1..ntime).

how to get a parentNode's index i using d3.js

Using d3.js, were I after (say) some value x of a parent node, I'd use:
d3.select(this.parentNode).datum().x
What I'd like, though, is the data (ie datum's) index. Suggestions?
Thanks!
The index of an element is only well-defined within a collection. When you're selecting just a single element, there's no collection and the notion of an index is not really defined. You could, for example, create a number of g elements and then apply different operations to different (overlapping) subsets. Any individual g element would have several indices, depending on the subset you consider.
In order to do what you're trying to achieve, you would have to keep a reference to the specific selection that you want to use. Having this and something that identifies the element, you can then do something like this.
var value = d3.select(this.parentNode).datum().x;
var index = -1;
selection.each(function(d, i) { if(d.x == value) index = i; });
This relies on having an attribute that uniquely identifies the element.
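The lookup itself can be tried without d3. In the sketch below a plain array stands in for the data bound to the selection. The names boundData and indexOfDatum are invented for the example, and x is assumed to be unique per datum.

```javascript
// Plain data standing in for what a d3 selection has bound to its elements.
const boundData = [{ x: 10 }, { x: 25 }, { x: 40 }];

// Return the position of the first datum whose x matches the parent's x,
// or -1 if none does. Only meaningful when x uniquely identifies a datum.
function indexOfDatum(data, parentDatum) {
  return data.findIndex((d) => d.x === parentDatum.x);
}

// indexOfDatum(boundData, { x: 25 }) returns 1
```

If two data items share the same x, only the first position is reported, which is why a unique attribute matters.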
If you have only one selection, you could simply save the index as another data attribute and access it later.
var gs = d3.selectAll("g").data(data).enter().append("g")
.each(function(d, i) { d.index = i; });
var something = gs.append(...);
something.each(function() {
d3.select(this.parentNode).datum().index;
});