hash-tables, counting duplicate keys, scheme - hash

I'm hoping someone can help me with this while I continue searching for a solution.
I'm confused about how to iterate through a hash table and find duplicate keys. I want to remove the duplicates, but consolidate their values.
So, say I have a list of strings:
(define strings '("abcde" "bcdea" "cdeab" "deabc" "eabcd" "abcde"))
And I store them into a hash table where the values are their index positions in the list.
So, I'm wanting to build a hash table like this:
(abcde (0, 5))
(bcdea 1)
(cdeab 2)
(deabc 3)
(eabcd 4)
Each string is a key, and the value is a list of the indexes where that string is found. Basically, I'm counting the number of occurrences of a substring in a large string, and noting their positions.
I know how to make the hash table:
(define my-hash-table (make-hash))
(for-each (lambda (s v) (hash-set! my-hash-table s v)) strings values) ;;values is a list of 0,1,2,3,4,5
(map (lambda (s) (list s (hash-ref my-hash-table s))) strings)
This just builds a hash table of the keys and their values; it doesn't check whether a key is already present in the table, so a repeated key simply overwrites the earlier value.
I'd appreciate any advice. If someone doesn't mind going through it step-by-step with me I'd be very grateful, I'm trying to learn scheme.
I'm using R5RS.

The trick is to check whether each key already has a value; if so, we cons the new index onto its list. By definition, each key can only have one associated value, so that value has to be a list. I think you're looking for something like this:
(define strings '("abcde" "bcdea" "cdeab" "deabc" "eabcd" "abcde"))
(define values '(0 1 2 3 4 5))
(define my-hash-table (make-hash))
(for-each (lambda (s v)
            (hash-update! my-hash-table
                          s
                          (lambda (a) (cons v a)) ; add element to list
                          (lambda () '())))       ; we start with '()
          strings
          values)
Alternatively, we can create and update the hash table using a functional style of programming:
(define my-hash-table
  (foldl (lambda (s v a)
           (hash-update a
                        s
                        (lambda (a) (cons v a)) ; add element to list
                        (lambda () '())))       ; we start with '()
         (hash)
         strings
         values))
Either way, it works as expected:
(hash->list my-hash-table) ; we get a keys/values list for free
=> '(("eabcd" 4) ("deabc" 3) ("bcdea" 1) ("cdeab" 2) ("abcde" 5 0))
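For readers comparing against another language, here is a minimal Python sketch of the same idea: look the key up, start from an empty list if it is missing, and add the index. The names index_positions and positions are made up for this illustration and are not part of the answer above.

```python
def index_positions(strings):
    """Map each distinct string to the list of indexes where it occurs."""
    positions = {}
    for i, s in enumerate(strings):
        # setdefault returns the existing list for s, or stores and
        # returns a fresh empty list when s has not been seen yet.
        positions.setdefault(s, []).append(i)
    return positions


strings = ["abcde", "bcdea", "cdeab", "deabc", "eabcd", "abcde"]
table = index_positions(strings)
print(table["abcde"])  # [0, 5]
```

Unlike the cons-based version, this appends to the end of each list, so the indexes come out in ascending order.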

Are you using SRFI 69? Look into hash-table-update!.

Insertion Sort in Common lisp

I want to implement a sorting function in Common Lisp using this INSERT function.
k is a cons cell holding a number and a value, and li is the list I want to insert k into.
With this function, I can build a list of cells:
(defun INSERT (li k) (IF (eq li nil) (cons (cons(car k)(cdr k)) nil)
(IF (eq (cdr li) nil)
(IF (< (car k)(caar li)) (cons (cons(car k)(cdr k)) li)
(cons (car li) (cons (cons(car k)(cdr k)) (cdr li)) )
)
(cond
( (eq (< (caar li) (car k)) (< (car k) (caadr li)) )
(cons (car k) (cons (cons (car k) (cdr k)) (cdr li)) ) )
(t (cons (car li) (INSERT (cdr li) k)) )))))
What I want is the code for the function below. It takes only one parameter, li (an unsorted list):
(defun Sort_List (li)(...this part...))
It should be written without using assignment, and using the INSERT function.
Your insert function is very strange. In fact I find it so hard to read that I can't work out what it's doing, except that there's no need to check for both the list being null and its cdr being null. It also conses a lot of things it doesn't need to, unless you are required by some part of the specification of the problem to make copies of the conses you are inserting.
Here is a version of it which is much easier to read and which does not copy when it does not need to. Note that this takes its arguments in the other order to yours:
(defun insert (thing into)
  (cond ((null into)
         (list thing))
        ((< (car thing) (car (first into)))
         (cons thing into))
        (t (cons (first into)
                 (insert thing (rest into))))))
Now, what is the algorithm for insertion sort? Well, essentially it is:
loop over the list to be sorted:
for each element, insert it into the sorted list;
finally return the sorted list.
And we're not allowed to use assignment to do this.
Well, there is a standard trick to do this sort of thing, which is to use a tail-recursive function with an accumulator argument, which accumulates the results. We can either write this function as an explicit auxiliary function, or we can make it a local function. I'm going to do the latter both because there's no reason for a function which is only ever used locally to be globally visible, and because (as I'm assuming this is homework) it makes it harder to submit directly.
So here is this version of the function:
(defun insertion-sort (l)
  (labels ((is-loop (tail sorted)
             (if (null tail)
                 sorted
                 (is-loop (rest tail) (insert (first tail) sorted)))))
    (is-loop l '())))
This approach is fairly natural in Scheme, but not very natural in CL. An alternative approach which does not use assignment, at least explicitly, is to use do. Here is a version which uses do:
(defun insertion-sort (l)
(do ((tail l (rest tail))
(sorted '() (insert (first tail) sorted)))
((null tail) sorted)))
There are two notes about this version.
First of all, although it's not explicitly using assignment it pretty clearly implicitly is doing so. I think that's probably cheating.
Secondly it's a bit subtle why it works: what, exactly, is the value of tail in (insert (first tail) sorted), and why?
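One way to see the subtlety is that do evaluates all of its step forms using the old values and only then rebinds the variables, so (first tail) refers to the tail of the current iteration, not the already-advanced one. The Python sketch below imitates that with tuple assignment, which also evaluates the whole right-hand side before rebinding anything. The helper names are assumptions made for this illustration only.

```python
def insert(thing, into):
    """Return a new sorted list with the pair `thing` placed by its first item."""
    for i, pair in enumerate(into):
        if thing[0] < pair[0]:
            return into[:i] + [thing] + into[i:]
    return into + [thing]


def insertion_sort(pairs):
    tail, result = list(pairs), []
    while tail:
        # Both right-hand sides are computed from the OLD tail before either
        # name is rebound, mirroring how `do` steps its variables in parallel.
        tail, result = tail[1:], insert(tail[0], result)
    return result


print(insertion_sort([(1, "a"), (-100, 2), (64.2, "x"), (-2, "y")]))
# [(-100, 2), (-2, 'y'), (1, 'a'), (64.2, 'x')]
```

Had the two variables been updated one after the other (as do* would do), the first element would be skipped and the loop would try to read past the end of the list.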
A version which is clearer, but uses loop which you are probably not meant to know about, is
(defun insertion-sort (l)
(loop for e in l
for sorted = (insert e '()) then (insert e sorted)
finally (return sorted)))
This, however, is also pretty explicitly using assignment.
As Kaz has pointed out below, there is an obvious way (which I should have seen!) of doing this using the CL reduce function. What reduce does, conceptually, is to successively collapse a sequence of elements by calling a function which takes two arguments. So, for instance
(reduce #'+ '(1 2 3 4))
is the same as
(+ (+ (+ 1 2) 3) 4)
This is easier to see if you use cons as the function:
> (reduce #'cons '(1 2 3 4))
(((1 . 2) . 3) . 4)
> (cons (cons (cons 1 2) 3) 4)
(((1 . 2) . 3) . 4)
Well, of course, insert, as defined above, is really suitable for this: it takes an ordered list and inserts a new pair into it, returning a new ordered list. There are two problems:
my insert takes its arguments in the wrong order (this is possibly why the original one took the arguments in the other order!);
there needs to be a way of 'seeding' the initial sorted list, which will be ().
Well we can fix the wrong-argument-order either by rewriting insert, or just by wrapping it in a function which swaps the arguments: I'll do the latter because I don't want to revisit what I wrote above and I don't want two versions of the function.
You can 'seed' the initial null value by either just prepending it to the list of things to sort, or in fact reduce has a special option to provide the initial value, so we'll use that.
So using reduce we get this version of insertion-sort:
(defun insertion-sort (l)
(reduce (lambda (a e)
(insert e a))
l :initial-value '()))
And we can test this:
> (insertion-sort '((1 . a) (-100 . 2) (64.2 . "x") (-2 . y)))
((-100 . 2) (-2 . y) (1 . a) (64.2 . "x"))
and it works fine.
So the final question then is: are we yet again cheating by using some function whose definition obviously must involve assignment? Well, no, we're not, because you can quite easily write a simplified reduce and see that it does not need to use assignment. This version is much simpler than CL's reduce, and in particular it explicitly requires the initial-value argument:
(defun reduce/simple (f list accum)
(if (null list)
accum
(reduce/simple f (rest list) (funcall f accum (first list)))))
(Again, this is not very natural CL code since it relies on tail-call elimination to handle large lists, but it makes the point that you can do this without assignment.)
And so now we can write one final version of insertion-sort:
(defun insertion-sort (l)
(reduce/simple (lambda (a e)
(insert e a))
l '()))
And it's easy to check that this works as well.
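As a cross-check outside Lisp, the same structure can be sketched in Python: a recursive insert, a recursive left fold that never rebinds a variable, and a sort built from the two. This is an illustrative translation under the assumption that elements are pairs ordered by their first item; the names reduce_simple and insertion_sort simply mirror the Lisp ones. Python does not eliminate tail calls either, so this only suits short lists.

```python
def insert(thing, into):
    """Recursively insert pair `thing` into the sorted list `into` by first item."""
    if not into:
        return [thing]
    if thing[0] < into[0][0]:
        return [thing] + into
    return [into[0]] + insert(thing, into[1:])


def reduce_simple(f, items, accum):
    """Left fold written without assignment: the accumulator is a parameter."""
    if not items:
        return accum
    return reduce_simple(f, items[1:], f(accum, items[0]))


def insertion_sort(pairs):
    # The lambda swaps the arguments, just like the wrapper around insert above.
    return reduce_simple(lambda a, e: insert(e, a), pairs, [])


result = insertion_sort([(1, "a"), (-100, 2), (64.2, "x"), (-2, "y")])
print(result)  # [(-100, 2), (-2, 'y'), (1, 'a'), (64.2, 'x')]
```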

Assignment in Lisp

I have the following setup in Common Lisp. my-object is a list of 5 binary trees.
(defun make-my-object ()
(loop for i from 0 to 5
for nde = (init-tree)
collect nde))
Each binary tree is a list of size 3 with a node, a left child and a right child
(defstruct node
(min 0)
(max 0)
(ctr 0))
(defun vals (tree)
(car tree))
(defun left-branch (tree)
(cadr tree))
(defun right-branch (tree)
(caddr tree))
(defun make-tree (vals left right)
(list vals left right))
(defun init-tree (&key (min 0) (max 1))
(let ((n (make-node :min min :max max)))
(make-tree n '() '())))
Now, I was trying to add an element to one of the binary trees manually, like this:
(defparameter my-object (make-my-object))
(print (left-branch (car my-object))) ;; returns NIL
(let ((x (left-branch (car my-object))))
(setf x (cons (init-tree) x)))
(print (left-branch (car my-object))) ;; still returns NIL
The second call to print still returns NIL. Why is this? How can I add an element to the binary tree?
The first function can be written more simply (note that your loop from 0 to 5 actually collects six trees, not five):
(defun make-my-object ()
(loop repeat 5 collect (init-tree)))
Now you define a structure for node, but you use a list for the tree and my-object? Why aren't they structures?
Instead of car, cadr and caddr one would use first, second, third.
(let ((x (left-branch (car my-object))))
(setf x (cons (init-tree) x)))
You set the local variable x to a new value. Why? After the let, the local variable is gone as well. Why aren't you setting the left branch instead? You would need to define a way to do so. Remember: Lisp functions return values, not memory locations you can later set. How can you change the contents of a list? Even better: use structures and change the slot value. Structures (or even CLOS classes) have the following advantages over plain lists: objects carry a type, slots are named, accessors are created, a make function is created, a type predicate is created, ...
Anyway, I would define structures or CLOS classes for node, tree and object...
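To illustrate the difference between rebinding a local variable and changing a named slot, here is a rough sketch in Python using dataclasses as a stand-in for structures. The class and field names (Node, Tree, min_val, max_val) are assumptions chosen for the example, not a translation of any existing code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    min_val: int = 0
    max_val: int = 1
    ctr: int = 0


@dataclass
class Tree:
    vals: Node
    left: Optional["Tree"] = None
    right: Optional["Tree"] = None


def init_tree(min_val=0, max_val=1):
    return Tree(Node(min_val=min_val, max_val=max_val))


trees = [init_tree() for _ in range(5)]

x = trees[0].left      # x is just a name bound to the current value (None)
x = init_tree()        # rebinding x does not touch the tree
print(trees[0].left)   # None

trees[0].left = init_tree()        # assigning to the slot changes the tree
print(trees[0].left is not None)   # True
```

The Common Lisp counterpart of the last assignment would be a setf on the structure's slot accessor rather than on a local variable.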
Most of the code in this question isn't essential to the real problem here. The real problem comes in with the misunderstanding of this code:
(let ((x (left-branch (car my-object))))
(setf x (cons (init-tree) x)))
We can see the same kind of behavior without user-defined structures of any kind:
(let ((cell (cons 1 2)))
(print cell) ; prints (1 . 2)
(let ((x (car cell)))
(setf x 3)
(print cell))) ; prints (1 . 2)
If you understand why both print statements produce (1 . 2), then you've got enough to understand why your own code isn't doing what you (previously) expected it to do.
There are two variables in play here: cell and x. There are three values that we're concerned with: 1, 2, and the cons cell produced by the call (cons 1 2). Variables in Lisp are often called bindings; the variable, or name, is bound to a value. The variable cell is bound to the cons cell (1 . 2). When we go into the inner let, we evaluate (car cell) to produce the value 1, which is then bound to the variable x. Then, we assign a new value, 3, to the variable x. That doesn't modify the cons cell that contains the value that x was originally bound to. Indeed, the value that was originally bound to x was produced by (car cell), and once the call to (car cell) returned, the only value that mattered was 1.
If you have some experience in other programming languages, this is directly analogous to something like
int[] array = ...;
int x = array[2]; // read from the array; assign result to x
x = 42; // doesn't modify the array
If you want to modify a structure, you need to setf the appropriate part of the structure. E.g.:
(let ((cell (cons 1 2)))
(print cell) ; prints (1 . 2)
(setf (car cell) 3)
(print cell)) ; prints (3 . 2)

Convert lisp function to use map

Hello, I would like to convert my existing function:
(defun checkMember (L A)
(cond
((NULL L) nil)
( (and (atom (car L)) (equal (car L) A)) T )
(T (checkMember (cdr L) A))))
to use map functions, but I honestly can't understand exactly how map functions work. Could you advise me on how these functions work?
This is my attempt:
(defun checkMem (L A)
(cond
((NULL L) nil)
( (and (atom (car L)) (equal (car L) (car A))) T )
(T (mapcar #'checkMem (cdr L) A))))
A mapping function is not appropriate here because the task involves searching the list to determine whether it contains a matching item. This is not mapping.
Mapping means passing each element through some function (and usually collecting the return values in some way). Sure, we can abuse mapping into solving the problem somehow.
But may I instead suggest that this is a reduce problem rather than a mapping problem? Reducing means processing all the elements of a list in order to produce a single value which summarizes that list.
Warm up: use reduce to add elements together:
(reduce #'+ '(1 2 3)) -> 6
In our case, we want to reduce the list differently: to a single value which is T or NIL based on whether the list contains some item.
Solution:
(defun is-member (list item)
(reduce (lambda (found next-one) (or found (eql next-one item)))
list :initial-value nil))
;; tests:
(is-member nil nil) -> NIL
(is-member nil 42) -> NIL
(is-member '(1) 1) -> T
(is-member '(1) 2) -> NIL
(is-member '(t t) 1) -> NIL ;; check for accumulator/item mixup
(is-member '(1 2) 2) -> T
(is-member '(1 2) 3) -> NIL
...
A common pattern in using a (left-associative) reduce function is to treat the left argument in each reduction as an accumulated value that is being "threaded" through the reduce. When we do a simple reduce with + to add numbers, we don't think about this, but the left argument of the function used for the reduction is always the partial sum. When no :initial-value is given, the first element of the list serves as the initial partial sum. Only when the list is empty does reduce call the function with no arguments, which works for + because (+) is zero in Lisp.
Concretely, what happens in (reduce #'+ '(1 2 3)) is this:
first, reduce calls (+ 1 2), using the first element as the initial partial sum. This returns 3, of course.
finally, reduce calls (+ 3 3), using the previous partial sum as the left argument, and the next element as the right argument, resulting in 6.
For an empty list, (reduce #'+ '()) calls (+) and returns 0.
In our case, the accumulated value we are "threading" through the reduction is not a partial sum, but a boolean value. This boolean becomes the left argument which is called found inside the reducing function. We explicitly specify the initial value using :initial-value nil: without it, reduce would use the first list element as the initial found value, and for an empty list it would call our lambda with no arguments, which the lambda does not support. On each call to our lambda, we short-circuit: if found is true, it means that a previous reduction has already decided that the list contains the item, and we just return true. Otherwise, we check the right argument: the next item from the list. If it is equal to item, then we return T, otherwise NIL. And this T or NIL then becomes the found value in the next call. Once we return T, this value will "domino" through the rest of the reduction, resulting in a T return out of reduce.
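The same boolean-threading fold can be sketched in Python with functools.reduce, whose third argument plays the role of :initial-value. This is only an illustration of the technique; the name is_member is reused from the Lisp version for readability.

```python
from functools import reduce


def is_member(items, item):
    """Fold the list down to one boolean: has `item` been seen so far?"""
    return reduce(lambda found, next_one: found or next_one == item,
                  items,
                  False)  # initial value of the `found` accumulator


print(is_member([], 42))      # False
print(is_member([1, 2], 2))   # True
print(is_member([1, 2], 3))   # False
```

As in the Lisp version, once found becomes true the or short-circuits on every later call, but the fold still visits every remaining element.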
If you insist on using mapping, you can do something like: map each element to a list which is empty if the element doesn't match the item, otherwise nonempty. Do the mapping in such a way that the lists are catenated together. If the resulting list is nonempty, then the original list must have contained one or more matches for the item:
(defun is-member (list item)
  (if (mapcan (lambda (elem)
                ;; eql rather than eq, so numbers and characters compare reliably
                (if (eql elem item) (list elem)))
              list)
      t))
This approach performs lots of wasteful allocations if the list contains many occurrences of the item.
(The reduce approach is also wasteful because it keeps processing the list after it is obvious that the return value will be T.)
What about this:
(defun checkMember (L a)
(car (mapcan #'(lambda (e)
(and (equal a e) (list T)))
L)))
Note: it does not recurse into list elements, but the original function did not either.
(defun memb (item list)
(map nil
(lambda (element)
(when (eql item element)
(return-from memb t)))
list))
Try this,
Recursive version:
(defun checkmember (l a)
(let ((temp nil))
(cond ((null l) nil) ((find a l) (setf temp (or temp t)))
(t
(mapcar #'(lambda (x) (cond ((listp x)(setf temp (or temp (checkmember x a))))))
l)))
temp))
Usage: (checkmember '(1 (2 5) 3) 20) => NIL
(checkmember '(1 (2 5) 3) 2) => T
(checkmember '(1 2 3) 2) => T
(checkmember '((((((((1)))))))) 1) => T

create racket accumulator "variable"

I'm really having problems understanding how I can create a variable that would act as an accumulator in Racket. This is definitely a really stupid question, but Racket's documentation is pretty difficult for me to read.
I know I will use some kind of define statement or let statement.
I want to be able to pass a number to a variable or function so that it adds the new value to the current value and keeps the sum. How would I do this? Thank you.
(define (accumulator newvalue) "current=current+newvalue"
something like this..
An accumulator is generally just a function parameter. There are a few chapters in How to Design Programs (online, starting here) that cover accumulators. Have you read them?
For example, the reverse function is implemented using an accumulator that remembers the prefix of the list, reversed:
;; reverse : list -> list
(define (reverse elems0)
;; reverse/accum : list list -> list
(define (reverse/accum elems reversed-prefix)
(cond [(null? elems)
reversed-prefix]
[else
(reverse/accum (cdr elems)
(cons (car elems) reversed-prefix))]))
(reverse/accum elems null))
Note that the scope of the accumulator reversed-prefix is limited to the function. It is updated by calling the function with a new value for that parameter. Different calls to reverse have different accumulators, and reverse remembers nothing from one call to the next.
Perhaps you mean state variable instead. In that case, you define it (or bind it with let or lambda) at the appropriate scope and update it using set!. Here's a global state variable:
;; total : number
(define total 0)
;; add-to-total! : number -> number
(define (add-to-total! n)
(set! total (+ total n))
total)
(add-to-total! 5) ;; => 5
(add-to-total! 31) ;; => 36
Here's a variation that creates local state variables, so you can have multiple counters:
;; make-counter : -> number -> number
(define (make-counter)
(let ([total 0])
(lambda (n)
(set! total (+ total n))
total)))
(define counterA (make-counter))
(define counterB (make-counter))
(counterA 5) ;; => 5
(counterB 10) ;; => 10
(counterA 15) ;; => 20
(counterB 20) ;; => 30
But don't call state variables accumulators; it will confuse people.
Do you mean something like this?
(define (accumulator current newvalue)
(let ((current (+ current newvalue)))
...)
You can close over the accumulator variable:
(define accumulate
(let ((acc 0))
(λ (new-val)
(set! acc (+ acc new-val))
acc)))
(accumulate 10) ;=> 10
(accumulate 4) ;=> 14

What's the best way to sort a hashtable by value?

Now I have to copy the hash table to a list before sorting it:
(defun good-red ()
(let ((tab (make-hash-table)) (res '()))
(dotimes (i 33) (setf (gethash (+ i 1) tab) 0))
(with-open-file (stream "test.txt")
(loop for line = (read-line stream nil)
until (null line)
do
(setq nums (butlast (str2lst (substring line 6))))
(dolist (n nums) (incf (gethash n tab)))
))
(maphash #'(lambda (k v) (push (cons k v) res)) tab) ; copy the table into a list
(setq sort-res (sort res #'< :key #'cdr))
(reverse (nthcdr (- 33 18) (mapcar #'car sort-res))) ))
BTW, what's a better way to fetch the first N elements of a list?
Vatine's answer is technically correct, but probably not super helpful for the immediate problem of someone asking this question. The common case of using a hash table to hold a collection of counters, then selecting the top N items by score can be done like this:
;; convert the hash table into an association list
(defun hash-table-alist (table)
"Returns an association list containing the keys and values of hash table TABLE."
(let ((alist nil))
(maphash (lambda (k v)
(push (cons k v) alist))
table)
alist))
(defun hash-table-top-n-values (table n)
"Returns the top N entries from hash table TABLE. Values are expected to be numeric."
(subseq (sort (hash-table-alist table) #'> :key #'cdr) 0 n))
The first function returns the contents of a hash table as a series of cons'd pairs in a list, which is called an association list (the typical list representation for key/value pairs). Most Lisp enthusiasts already have a variation of this function on hand because it's such a common operation. This version is from the Alexandria library, which is very widely used in the CL community.
The second function uses SUBSEQ to grab the first N items from the list returned by sorting the alist returned by the first function using the CDR of each pair as the key. Changing :key to #'car would sort by hash keys, changing #'> to #'< would invert the sort order.
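For comparison, the same three steps (dump the table to a list of pairs, sort by value in descending order, take the first N) look like this in Python. The function name top_n_values and the sample counts are invented for the example.

```python
def top_n_values(table, n):
    """Return the n (key, count) pairs with the largest counts."""
    pairs = list(table.items())                       # like building the alist
    pairs.sort(key=lambda kv: kv[1], reverse=True)    # like SORT with #'> and :key #'cdr
    return pairs[:n]                                  # like SUBSEQ from 0 to n


counts = {"a": 3, "b": 7, "c": 1, "d": 5}
print(top_n_values(counts, 2))  # [('b', 7), ('d', 5)]
```

One difference worth keeping in mind: Python slicing quietly returns fewer items when n exceeds the number of entries, whereas SUBSEQ signals an error if the end index is past the end of the list.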
A hash-table is inherently unordered. If you want it sorted, you need to initialize some sort of ordered data structure with the contents.
If you want to fetch the first N elements of a sequence, there's always SUBSEQ.