I would like to write/read a hash-table to/from disk, but it is not a (print)able object. I won't know the key names so I can't think of a way to do it manually. I read that there might be distribution-specific ways to do this; is there anything for this in SBCL?
I didn't find anything in the SBCL manual or on Google.
If not, is there another, storable way to keep lists of integers bound to strings, be able to modify those lists efficiently, and have constant or at least faster-than-alist access time?
Are binary search trees easy enough to implement with alists and is that a good idea for making a basic database?
MAPHASH maps a function with two arguments over a hash table for side effects. The two arguments are the key and value of each item in the hash table. You can use this to for example write each item in the hash table as a list of the key and value:
(maphash (lambda (key value)(write (list key value))) *hash-table*)
Related
Recently, in version R2022b they announced the introduction of dictionaries.
I was under the impression that dictionaries were already available, provided by containers.Map. Are dictionaries just a different name mapped to containers.Map? Or are there other differences? I was unable to find anything comparing them online.
From what I can gather, after reading this blog post and the comments under it, and the documentation (I haven’t yet had a chance to experiment with them, so feel free to correct me if I’m wrong):
dictionary is an actual primitive type, like double, cell or struct. containers.Map is a “custom class”, even if nowadays the code is built-in, the functionality can never be as integrated as for a primitive type. Consequently, dictionary is significantly faster.
dictionary uses normal value semantics. If you make a copy you have two independent dictionaries (note MATLAB’s lazy copy mechanism). containers.Map is a handle class, meaning that all copies point to the same data, modifying one copy modifies them all.
containers.Map can use char arrays (the old string format) or numbers as keys (string is implicitly converted to char when used as key). dictionary can use any type, as long as it overloads keyhash. This means you can use your own custom class objects as keys.
dictionary is vectorized, you can look up multiple values at once. With a containers.Map you can look up multiple values using the values function, not the normal lookup syntax.
dictionary has actual O(1) lookup. If I remember correctly, containers.Map doesn’t.*
containers.Map can store any array as value, dictionary stores only scalars. The scalar can be a cell, which can contain any array, but this leads to awkward semantics, since retrieving the value retrieves the cell, not its contents.
* No, it is also O(1), at least in R2022b.
I have to sorting algorithm to sort out recipes from one of my machines. Is there a way to push/append values into my array?
There is no standard List or similar implementation in the Beckhoff/TwinCAT libraries.
Best choice would be to make your own.
Make a function block containing an array and public method for appending (and other functionality you want).
Internally in the function block you then keep track of which indices are used and which are free.
Previously I had been using "being the elements of" feature of loop to iterate over a sequence of an unknown type. I just found out that "being the elements of" is not provided in every implementation of Common Lisp and am wondering if there is any clean way to iterate over a sequence using loop. The best solution I have been able to come with is to coerce the sequence to a list and then iterate over that.
No, LOOP does not provide such a feature directly. If your LOOP implementation is extensible (which the standard says nothing about, too), you might be able to implement such a feature.
LOOP has clauses to iterate over lists - for item in list - and a clause to iterate over a vector - for e across vector - note that strings are also vectors, one-dimensional arrays. But not both together.
Otherwise use MAP or MAP-INTO to iterate over sequences.
The ITERATE macro provides such a feature: for i in-sequence seq.
Open Hashing (Separate Chaining):
In open hashing, keys are stored in linked lists attached to cells of a hash table.
Closed Hashing (Open Addressing):
In closed hashing, all keys are stored in the hash table itself without the use of linked lists.
I am unable to understand why they are called open, closed and Separate. Can some one explain it?
The use of "closed" vs. "open" reflects whether or not we are locked in to using a certain position or data structure (this is an extremely vague description, but hopefully the rest helps).
For instance, the "open" in "open addressing" tells us the index (aka. address) at which an object will be stored in the hash table is not completely determined by its hash code. Instead, the index may vary depending on what's already in the hash table.
The "closed" in "closed hashing" refers to the fact that we never leave the hash table; every object is stored directly at an index in the hash table's internal array. Note that this is only possible by using some sort of open addressing strategy. This explains why "closed hashing" and "open addressing" are synonyms.
Contrast this with open hashing - in this strategy, none of the objects are actually stored in the hash table's array; instead once an object is hashed, it is stored in a list which is separate from the hash table's internal array. "open" refers to the freedom we get by leaving the hash table, and using a separate list. By the way, "separate list" hints at why open hashing is also known as "separate chaining".
In short, "closed" always refers to some sort of strict guarantee, like when we guarantee that objects are always stored directly within the hash table (closed hashing). Then, the opposite of "closed" is "open", so if you don't have such guarantees, the strategy is considered "open".
You have an array that is the "hash table".
In Open Hashing each cell in the array points to a list containg the collisions. The hashing has produced the same index for all items in the linked list.
In Closed Hashing you use only one array for everything. You store the collisions in the same array. The trick is to use some smart way to jump from collision to collision until you find what you want. And do this in a reproducible / deterministic way.
The name open addressing refers to the fact that the location ("address") of the element is not determined by its hash value. (This method is also called closed hashing).
In separate chaining, each bucket is independent, and has some sort of ADT (list, binary search trees, etc) of entries with the same index.
In a good hash table, each bucket has zero or one entries, because we need operations of order O(1) for insert, search, etc.
This is a example of separate chaining using C++ with a simple hash function using mod operator (clearly, a bad hash function)
At first I assumed that every collection class would receive an additional par method which would convert the collection to a fitting parallel data structure (like map returns the best collection for the element type in Scala 2.8).
Now it seems that some collection classes support a par method (e. g. Array) but others have toParSeq, toParIterable methods (e. g. List). This is a bit weird, since Array isn't used or recommended that often.
What is the reason for that? Wouldn't it be better to just have a par available on all collection classes doing the "right thing"?
If I have data which might be processed in parallel, what types should I use? The traits in scala.collection or the type of the implementation directly?
Or should I prefer Arrays now, because they seem to be cheaper to parallelize?
Lists aren't that well suited for parallel processing. The reason is that to get to the end of the list, you have to walk through every single element. Thus, you may as well just treat the list as an iterator, and thus may as well just use something more generic like toParIterable.
Any collection that has a fast index is a good candidate for parallel processing. This includes anything implementing LinearSeqOptimized, plus trees and hash tables. Array has as fast of an index as you can get, so it's a fairly natural choice. You can also use things like ArrayBuffer (which has a par method returning a ParArray).