Using string object as a hash key in Common Lisp - hash

I'm trying to create a "dictionary" type - ie hash table with a string as a key. Is this possible or wise in Lisp?
I noticed that this works as expected:
> (setq table (make-hash-table))
#<HASH-TABLE :TEST EQL size 0/60 #x91AFA46>
> (setf (gethash 1 table) "one")
"one"
> (gethash 1 table)
"one"
However, the following does not:
> (setq table (make-hash-table))
#<HASH-TABLE :TEST EQL size 0/60 #x91AFA0E>
> table
#<HASH-TABLE :TEST EQL size 0/60 #x91AFA0E>
> (setf (gethash "one" table) 1)
1
> (gethash "one" table)
NIL
NIL

You need to make hash-table that uses 'equal instead if 'eql. 'eql doesn't evaluate two strings with same content to 't, while 'equal does.
Here is how you do it:
(make-hash-table :test 'equal)
As skypher noted you can also use 'equalp instead if you want case-insensitive string hashing.

Related

Non destructive modify hash table

Is it possible to non-destructively add new key-value pairs to a Common Lisp (SBCL) hash table? The standard way to add new elements to a hash table is to call:
(setf (gethash key *hash-table*) value)
but the call to setf modifies *hash-table* corrupting the original. I have an application where I'd like to take advantage of the efficiency of hash table lookups, but I also would like to non destructively modify them. The work-around I see is to copy the original hash table before operating on it, but that is not practical in my case since the hash tables I'm dealing with contain many thousands of elements and copying large hash tables, say, in a loop would negate the computational efficiency advantage of using them in the first place.
Depending on your needs you may be able to just use an association list, using assoc and other functions to establish new bindings on top of existing ones. The fact that assoc returns the first matching element means you can shadow bindings:
(let ((list '((:a . 1) (:b . 2))))
(acons :b 3 list))
=> ((:b . 3) (:a . 1) (:b . 2))
If you call (assoc :b list) in the resulting list, the entry will be (:b . 3), but the original list is unmodified.
FSet
If association lists are not enough, the FSet library provides purely functional data-structures for Common Lisp, like maps, which are immutable hash-tables. They are implemented as balanced trees, which is better than a naive approach. There are also other data-structures that are more efficient, but you probably need to implement them yourselves (Hash array mapped trie (edit: see https://github.com/danshapero/cl-hamt, thanks #Flux)). That being said, FSet is good enough in general.
FSet is available through Quicklisp
USER> (ql:quickload :fset)
Create a map; notice the printed representation is made to be read again, if you install the appropriate reader macros. But you can perfectly use the library without the modified syntax table.
USER> (fset:map (:a 0) (:b 1))
#{| (:A 0) (:B 1) |}
Update the previous map with a new binding for :c:
USER> (fset:with * :c 3)
#{| (:A 0) (:B 1) (:C 3) |}
Update the previous map with a new binding for :b, which shadows the previous one:
USER> (fset:with * :b 4)
#{| (:A 0) (:B 4) (:C 3) |}
All the intermediate maps are unmodified:
USER> (list * ** *** )
(#{| (:A 0) (:B 4) (:C 3) |}
#{| (:A 0) (:B 1) (:C 3) |}
#{| (:A 0) (:B 1) |})
I don't think that you can pass-by-reference a hash-table to another hash-table in common lisp.
But what I had an idea was how to avoid to copy the entire hashtable,
yet get to a result by one call is to use the default value argument position
of gethash.
(gethash key ht default-value) returns what is given for default-value, when key is not present in ht.
;; prepare three example hash-tables, where *h3* and *h2* gets the additional keys
;; and if a key is not present in *h3*, one should look up in *h2*, and if not there too, in *h1*.
(defparameter *h1* (make-hash-table))
(setf (gethash 'a *h1*) 1)
(setf (gethash 'b *h1*) 2)
(setf (gethash 'c *h1*) 3)
(defparameter *h2* (make-hash-table))
(setf (gethash 'd *h2*) 4)
(setf (gethash 'e *h2*) 5)
(defparameter *h3* (make-hash-table))
(setf (gethash 'f *h3*) 6)
;; the call
(gethash 'a *h3* (gethash 'a *h2* (gethash 'a *h1*)))
;; would give the desired result `1`.
;; let us assume, there is a chain of hash-tables *hk* *h(k-1)* ... *h2* *h1*
;; in which one should look up into that order.
;; Then it is to us to build the code
;; (gethash 'a *hk* (gethash 'a *h(k-1)* ...(gethash 'a *h2* (gethash 'a *h1*))...))
;; automatically for every lookup.
;; this macro does it:
(defmacro mget (key hash-tables-list)
(flet ((inject-last (e1 e2) `(,#e1 ,e2)))
(reduce #'inject-last
(mapcar (lambda (ht) `(gethash ,key ,ht))
(nreverse hash-tables-list)))))
;; let's see its macroexpansion:
(macroexpand-1 '(mget 'a (*h3* *h2* *h1*)))
;; (GETHASH 'A *H3* (GETHASH 'A *H2* (GETHASH 'A *H1*))) ;
;; T
;; and run the code:
(mget 'a (*h2* *h1*))
;; 1 ;
;; NIL
One could attach information which are the next hash table to look in
in the hash-table object. And even automate the generation of the list (*h3* *h2* *h1*) so that one writes only
(gethash* key ht) which then calls mget ...
Well, of course through all this the hash-access gets slowed.
It is a trade-off between copying entire hash-tables or pay the cost in performance at each call ...
automatic finding of hash-tables which are extended by *h3*
(setf (get '*h3* 'extendeds) '(*h2* *h1*))
(setf (get '*h2* 'extendeds) '(*h1*))
(defun collect-extendeds (hts)
(let ((res (loop for ht in hts
nconcing (get ht 'extendeds))))
(remove-duplicates res)))
;; this function can recursively retrieve all hashtables
(defun get-extendeds* (hts &optional (acc '()))
(let ((hts (if (listp hts) hts (list hts))))
(let ((nexts (collect-extendeds hts)))
(cond ((every #'null nexts) (nreverse (remove-duplicates (append hts acc))))
(t (get-extendeds* nexts (remove-duplicates (append hts acc))))))))
;; write a macro to retrieve key's value from all downstream hashtables
(defmacro geth (key ht)
`(mget ,key ,(get-extendeds* ht)))
(geth 'a *h3*)
;; 1 ;
;; NIL ;; NIL because it was not in *h3* directly but in one of the hashtables
;; which it extends.
;; problem is if 'NIL is a value of an existing key,
;; one would still get 'NIL NIL.

SIMPLE-READER-ERROR , illegal sharp macro character, Subcharacter #\< not defined for dispatch char #\#

In chapter 4 of the book, Successful Lisp, by David B. Lamkins,
there is a simple application to keep track of bank checks.
https://dept-info.labri.fr/~strandh/Teaching/MTP/Common/David-Lamkins/chapter04.html
At the end, we write a macro that will save and restore functions.
The problem occurs when I execute the reader function and the value to read is the hash table.
The macro to save and restore functions is:
(defmacro def-i/o (writer-name reader-name (&rest vars))
(let ((file-name (gensym))
(var (gensym))
(stream (gensym)))
`(progn
(defun ,writer-name (,file-name)
(with-open-file (,stream ,file-name
:direction :output :if-exists :supersede)
(dolist (,var (list ,#vars))
(declare (special ,#vars))
(print ,var ,stream))))
(defun ,reader-name (,file-name)
(with-open-file (,stream ,file-name
:direction :input :if-does-not-exist :error)
(dolist (,var ',vars)
(set ,var (read ,stream)))))
t)))
Here is my hash table and what is going on:
(defvar *payees* (make-hash-table :test #'equal))
(check-check 100.00 "Acme" "Rocket booster T-1000")
CL-USER> *payees*
#<HASH-TABLE :TEST EQUAL :COUNT 0 {25520F91}>
CL-USER> (check-check 100.00 "Acme" "T-1000 rocket booster")
#S(CHECK
:NUMBER 100
:DATE "2020-4-1"
:AMOUNT 100.0
:PAID "Acme"
:MEMO "T-1000 rocket booster")
CL-USER> (def-i/o save-checks charge-checks (*payees*))
T
CL-USER> (save-checks "/home/checks.dat")
NIL
CL-USER> (makunbound '*payees*)
*PAYEES*
CL-USER> (load-checks "/home/checks.dat")
; Evaluation aborted on #<SB-INT:SIMPLE-READER-ERROR "illegal sharp macro character: ~S" {258A8541}>.
In Lispworks, the error message is:
Error: subcharacter #\< not defined for dispatch char #\#.
To simplify, I get the same error if I execute:
CL-USER> (defvar *payees* (make-hash-table :test #'equal))
*PAYEES*
CL-USER> (with-open-file (in "/home/checks.dat"
:direction :input)
(set *payees* (read in)))
; Evaluation aborted on #<SB-INT:SIMPLE-READER-ERROR "illegal sharp macro character: ~S" {23E83B91}>.
Can someone explain to me where the problem is coming from, and what I need to fix in my code for this to work.
Thank you in advance for the explanations you can give me and the help you can give me.
Values which cannot be read back are printed with #<, see Sharpsign Less-Than-Sign. Common Lisp does not define how hash-tables should be printed readably (there is no reader syntax for hash tables):
* (make-hash-table)
#<HASH-TABLE :TEST EQL :COUNT 0 {1006556E53}>
The example in the book is only suitable to be used with the bank example that was previously shown, and is not intended to be a general-purpose serialization mechanism.
There are many ways to print a hash-table and read them back, depending on your needs, but there is no default representation. See cl-store for a library that store arbitrary objects.
Custom dump function
Let's write a form that evaluate to an equivalent hash-table; let's define a generic function named dump, and a default method that simply returns the object as-is:
(defgeneric dump (object)
(:method (object) object))
Given a hash-table, we can serialize it as a plist (sequence of key/value elements in a list), while also calling dump on keys and values, in case our hash-tables contain hash-tables:
(defun hash-table-plist-dump (hash &aux plist)
(with-hash-table-iterator (next hash)
(loop
(multiple-value-bind (some key value) (next)
(unless some
(return plist))
(push (dump value) plist)
(push (dump key) plist)))))
Instead of reinventing the wheel, we could also have used hash-table-plist from the alexandria system.
(ql:quickload :alexandria)
The above is equivalent to:
(defun hash-table-plist-dump (hash)
(mapcar #'dump (alexandria:hash-table-plist hash)))
Then, we can specialize dump for hash-tables:
(defmethod dump ((hash hash-table))
(loop
for (initarg accessor)
in '((:test hash-table-test)
(:size hash-table-size)
(:rehash-size hash-table-rehash-size)
(:rehash-threshold hash-table-rehash-threshold))
collect initarg into args
collect `(quote ,(funcall accessor hash)) into args
finally (return
(alexandria:with-gensyms (h k v)
`(loop
:with ,h := (make-hash-table ,#args)
:for (,k ,v)
:on (list ,#(hash-table-plist-dump hash))
:by #'cddr
:do (setf (gethash ,k ,h) ,v)
:finally (return ,h))))))
The first part of the loop computes all the hash-table properties, like the kind of hash function to use or the rehash-size, and builds args, the argument list for a call to make-hash-table with the same values.
In the finally clause, we build a loop expression (see backquotes) that first allocates the hash-table, then populates it according to the current values of the hash, and finally return the new hash. Note that the generated code does not depend on alexandria, it could be read back from another Lisp system that does not have this dependency.
Example
CL-USER> (alexandria:plist-hash-table '("abc" 0 "def" 1 "ghi" 2 "jkl" 3)
:test #'equal)
#<HASH-TABLE :TEST EQUAL :COUNT 4 {100C91F8C3}>
Dump it:
CL-USER> (dump *)
(LOOP :WITH #:H603 := (MAKE-HASH-TABLE :TEST 'EQUAL :SIZE '14 :REHASH-SIZE '1.5
:REHASH-THRESHOLD '1.0)
:FOR (#:K604 #:V605) :ON (LIST "jkl" 3 "ghi" 2 "def" 1 "abc"
0) :BY #'CDDR
:DO (SETF (GETHASH #:K604 #:H603) #:V605)
:FINALLY (RETURN #:H603))
The generated data is also valid Lisp code, evaluate it:
CL-USER> (eval *)
#<HASH-TABLE :TEST EQUAL :COUNT 4 {100CD5CE93}>
The resulting hash is equalp to the original one:
CL-USER> (equalp * ***)
T

Lisp: Accessing recursive hashes with syntactic sugar

I'm trying to build a function (or macro) to ease the getting and setting of data deep in a hash table (meaning, a hash within a hash, within a hash, etc). I don't think I can do it with a macro, and I'm not sure how to do it with eval. I'd like to be able to do the following:
(gethashdeep *HEROES* "Avengers" "Retired" "Tony Stark")
and have that return "Iron Man"
The hashes are all created with:
(setf hashtablename (make-hash-table :test 'equal))
and populated from there.
I can do the following, but would like to abstract it so I can programmatically pull a value from an arbitrary depth:
;;pulling from a hash that's 2 deep
(gethash "Tony Stark" (gethash "Avengers" *HEROES*))
update - my go at it:
(defun getdeephash (hashpath h k)
(let* ((rhashpath (reverse hashpath))
(hashdepth (list-length hashpath))
(hashcommand (concatenate 'string "(gethash \"" k "\"")))
(loop for i from 1 to hashdepth
do (setf hashcommand (concatenate 'string hashcommand "(gethash \"" (nth (- i 1) rhashpath) "\"")))
(setf hashcommand (concatenate 'string hashcommand " " h (make-string (- hashdepth 0) :initial-element #\Right_Parenthesis) ")"))
(values hashcommand)))
There's nothing fancy needed for nested table lookup.
One possible definition of gethash-deep:
(defun gethash-deep (value &rest keys)
(if (or (endp keys)
(not (hash-table-p value)))
value
(apply #'gethash-deep
(gethash (first keys) value)
(rest keys))))
Example use:
(defun table (&rest keys-and-values &key &allow-other-keys)
(let ((table (make-hash-table :test 'equal)))
(loop for (key value) on keys-and-values by #'cddr
do (setf (gethash key table) value))
table))
(defparameter *heroes*
(table "Avengers"
(table "Retired" (table "Tony Stark" "Iron Man")
"Active" (table "Bruce Banner" "Hulk"))))
(gethash-deep *heroes* "Avengers" "Retired" "Tony Stark") => "Iron Man"
(gethash-deep *heroes* "Avengers" "Active" "Bruce Banner") => "Hulk"
It's a one liner with the Access library:
(ql:quickload "access")
We define the *heroes* hash table (as in Xach's example):
(defun table (&rest keys-and-values &key &allow-other-keys)
(let ((table (make-hash-table :test 'equal)))
(loop for (key value) on keys-and-values by #'cddr
do (setf (gethash key table) value))
table))
TABLE
(defparameter *heroes*
(table "Avengers"
(table "Retired" (table "Tony Stark" "Iron Man")
"Active" (table "Bruce Banner" "Hulk"))))
Usually we use access:access for a consistent access to diverse data structures (alist, plist, hash table, objects,…). For a nested access we use access:accesses (plural):
(access:accesses *heroes* "Avengers" "Retired" "Tony Stark")
"Iron Man"
Besides, we can setf it:
(setf (access:accesses *heroes* "Avengers" "Retired" "Tony Stark") "me")
"me"
It is a battle tested library, since it is the core of the Djula template library, one of the most downloaded Quicklisp libraries.
my blog post: https://lisp-journey.gitlab.io/blog/generice-consistent-access-of-data-structures-dotted-path/
For an ad-hoc construct, you could use ->> from arrows:
(->> table
(gethash "Avengers")
(gethash "Retired")
(gethash "Tony Stark"))
If you want to mix with other accessors (e. g. aref), you could use as-> or -<> instead to deal with the different argument orders.
If I wanted to implement it, I'd rely on the implicit checks and maybe use reduce:
(defun gethash-deep (hash-table &rest keys)
(reduce (lambda (table key)
(gethash key table))
keys
:initial-value hash-table))
Or loop:
(defun gethash-deep (table &rest keys)
(loop :for k :in keys
:for v := (gethash k table) :then (gethash k v)
:finally (return v)))
If all the keys exist this should work, the only reason gethash can't be used directly in the reduce is that it receives the inputs in the incorrect order to work well.
(defun gethashdeep (table &rest keys)
(reduce #'(lambda (h k)
(gethash k h)) (cdr keys)
:initial-value (gethash (car keys) table)))
Of course it could be written as a macro and be setfable
(defmacro gethashdeep1 (table &rest keys)
(reduce #'(lambda (h k) (list 'gethash k h)) (cdr keys)
:initial-value (list 'gethash (car keys) table)))
Of course this has limitations and does not create hashes that do not exist.

building a hash table with gensym and macrolet

I'm trying to build a hash table (among other actions) while reading. I don't want the hash table to have global scope (yet), so I'm doing this with a macro and gensym. Inside the macro x, I'm defining a macro s which is similar to setf, but defines an entry in a hash table instead of defining a symbol somewhere. It blows up. I think I understand the error message, but how do I make it work?
The code:
#!/usr/bin/clisp -repl
(defmacro x (&rest statements)
(let ((config-variables (gensym)))
`(macrolet ((s (place value)
(setf (gethash 'place ,config-variables) value)))
(let ((,config-variables (make-hash-table :test #'eq)))
(progn ,#statements)
,config-variables))))
(defun load-config ()
(let ((config-file-tree (read *standard-input*)))
(eval config-file-tree)))
(defun load-test-config ()
(with-input-from-string (*standard-input* "(x (s fred 3) (s barney 5))")
(load-config)))
(load-test-config)
The output:
*** - LET*: variable #:G12655 has no value
The following restarts are available:
USE-VALUE :R1 Input a value to be used instead of #:G12655.
STORE-VALUE :R2 Input a new value for #:G12655.
SKIP :R3 skip (LOAD-TEST-CONFIG)
STOP :R4 stop loading file /u/asterisk/semicolon/build.l/stackoverflow-semi
Just guessing what Bill might really want.
Let's say he wants a mapping from some keys to some values as a configuration in a file.
Here is the procedural way.
open a stream to the data
read it as an s-expression
walk the data and fill a hash-table
Example code:
(defun read-mapping (&optional (stream *standard-input*))
(destructuring-bind (type &rest mappings) (read stream)
(assert (eq type 'mapping))
(let ((table (make-hash-table)))
(loop for (key value) in mappings
do (setf (gethash key table) value))
table)))
(defun load-config ()
(read-mapping))
(defun load-test-config ()
(with-input-from-string (*standard-input* "(mapping (fred 3) (barney 5))")
(load-config)))
(load-test-config)
Use:
CL-USER 57 > (load-test-config)
#<EQL Hash Table{2} 402000151B>
CL-USER 58 > (describe *)
#<EQL Hash Table{2} 402000151B> is a HASH-TABLE
BARNEY 5
FRED 3
Advantages:
no macros
data does not get encoded in source code and generated source code
no evaluation (security!) via EVAL needed
no object code bloat via macros which are expanding to larger code
functional abstraction
much easier to understand and debug
Alternatively I would write a read-macro for { such that {(fred 3) (barney 5)} would be directly read as an hash-table.
If you want to have computed values:
(defun make-table (mappings &aux (table (make-hash-table)))
(loop for (key value) in mappings
do (setf (gethash key table) (eval value)))
table)
CL-USER 66> (describe (make-table '((fred (- 10 7)) (barney (- 10 5)))))
#<EQL Hash Table{2} 4020000A4B> is a HASH-TABLE
BARNEY 5
FRED 3
Turning that into a macro:
(defmacro defmapping (&body mappings)
`(make-table ',mappings))
(defmapping
(fred 3)
(barney 5))
In a macrolet you are as well defining a macro, so the usual rules apply, i.e. you have to backquote expressions, that are to be evaluated at run-time. Like this:
(defmacro x (&rest statements)
(let ((config-variables (gensym)))
`(macrolet ((s (place value)
`(setf (gethash ',place ,',config-variables) ,value)))
(let ((,config-variables (make-hash-table :test #'eq)))
(progn ,#statements)
,config-variables))))

What's the best way to sort a hashtable by value?

Now i have to copy the hastable to a list before sorting it:
(defun good-red ()
(let ((tab (make-hash-table)) (res '()))
(dotimes (i 33) (setf (gethash (+ i 1) tab) 0))
(with-open-file (stream "test.txt")
(loop for line = (read-line stream nil)
until (null line)
do
(setq nums (butlast (str2lst (substring line 6))))
(dolist (n nums) (incf (gethash n tab)))
))
**(maphash #'(lambda (k v) (push (cons k v) res)) tab)**
(setq sort-res (sort res #'< :key #'cdr))
(reverse (nthcdr (- 33 18) (mapcar #'car sort-res))) ))
BTW, what's the better way to fetch the first N elements of a list ?
Vatine's answer is technically correct, but probably not super helpful for the immediate problem of someone asking this question. The common case of using a hash table to hold a collection of counters, then selecting the top N items by score can be done like this:
;; convert the hash table into an association list
(defun hash-table-alist (table)
"Returns an association list containing the keys and values of hash table TABLE."
(let ((alist nil))
(maphash (lambda (k v)
(push (cons k v) alist))
table)
alist))
(defun hash-table-top-n-values (table n)
"Returns the top N entries from hash table TABLE. Values are expected to be numeric."
(subseq (sort (hash-table-alist table) #'> :key #'cdr) 0 n))
The first function returns the contents of a hash table as a series of cons'd pairs in a list, which is called an association list (the typical list representation for key/value pairs). Most Lisp enthusiasts already have a variation of this function on hand because it's such a common operation. This version is from the Alexandria library, which is very widely used in the CL community.
The second function uses SUBSEQ to grab the first N items from the list returned by sorting the alist returned by the first function using the CDR of each pair as the key. Changing :key to #'car would sort by hash keys, changing #'> to #'< would invert the sort order.
A hash-table is inherently unordered. If you want it sorted, you need to initialize some sort of ordered data structure with the contents.
If you want to fetch the first N elements of a sequence, there's always SUBSEQ.