How to find the most common element in a Common Lisp list?

Say we have a list with these elements:
("apple" "pear" "apple" "banana" "pear" "apple")
How to determine that the most common element in this list is "apple"?

The question is quite broad, so below is a utility function that groups elements into bags of equal frequency. The primary return value is a hash table whose keys are positive integers (numbers of occurrences) and whose values are lists of elements. Together, those lists form a partition of the distinct elements.
The secondary return value is the intermediate frequency hash table, which might be useful for the caller. You should be able to find the most frequently occurring element(s) with that.
(defun frequency-bags (elements &key (test #'equal))
  (let ((frequencies (make-hash-table :test test))
        (bags (make-hash-table :test #'eql)))
    ;; count the occurrences of each element
    (dolist (e elements) (incf (gethash e frequencies 0)))
    ;; group the elements by their count
    (maphash (lambda (k v) (push k (gethash v bags))) frequencies)
    (values bags frequencies)))
References
make-hash-table
gethash
maphash
dolist
values
incf
push
Examples
(alexandria:hash-table-alist
 (frequency-bags
  '("apple" "pear" "apple" "banana" "pear" "pear" "apple")))
=> ((1 "banana")
    (3 "pear" "apple"))
(alexandria:hash-table-alist
 (frequency-bags
  '("apple" "apple" "orange" "peach" "banana" "pear" "pear" "apple")))
=> ((2 "pear")
    (1 "banana" "peach" "orange")
    (3 "apple"))
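To get from the bags to the most common element(s), one option is to look up the largest key of the primary hash table. The following is only a sketch under that assumption: frequency-bags is repeated from above so the snippet stands alone, and most-common is a name made up for this illustration.

```lisp
(defun frequency-bags (elements &key (test #'equal))
  (let ((frequencies (make-hash-table :test test))
        (bags (make-hash-table :test #'eql)))
    (dolist (e elements) (incf (gethash e frequencies 0)))
    (maphash (lambda (k v) (push k (gethash v bags))) frequencies)
    (values bags frequencies)))

;; Hypothetical helper, not part of the original answer.
(defun most-common (elements &key (test #'equal))
  "Return the list of most frequent ELEMENTS and, as a second value, their count."
  (let ((bags (frequency-bags elements :test test))
        (best 0))
    ;; The keys of BAGS are occurrence counts; remember the largest one.
    (maphash (lambda (count bag)
               (declare (ignore bag))
               (setf best (max best count)))
             bags)
    ;; For an empty input this returns NIL and 0.
    (values (gethash best bags) best)))

(most-common '("apple" "pear" "apple" "banana" "pear" "apple"))
;; => ("apple"), 3
```

When several elements tie for first place, the first value contains all of them, in no particular order.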

Related

Lisp insertion sort -> defun span() function confusion

I am trying to learn how Lisp works. Below is part of an insertion sort algorithm, and I don't understand what it is doing as a whole. I roughly understand what predicate and list are, but I'm not sure what is being checked. Could someone explain?
(defun span (predicate list)
  (let ((tail (member-if-not predicate list)))
    (values (ldiff list tail) tail)))
(member-if-not p l) returns a tail of l beginning with the first element for which p is false. So (member-if-not #'evenp '(2 3 4)) is (3 4) (and this is eq to the cdr of the original list in this case).
If l2 is a tail of l1 then (ldiff l1 l2) returns a list of the elements of l1 which precede it.
So
(let ((l '(2 3 4)))
  (let ((tail (member-if-not #'evenp l)))
    (values (ldiff l tail) tail)))
will return (2) and (3 4) (and the second value will be the cdr of the original list).
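To see both return values of span at once, they can be captured with multiple-value-bind. This is a small sketch that reuses the definition from the question; the input list and the variable names prefix and suffix are just for illustration.

```lisp
(defun span (predicate list)
  (let ((tail (member-if-not predicate list)))
    (values (ldiff list tail) tail)))

;; PREFIX is the leading run of elements satisfying the predicate,
;; SUFFIX is the remainder, starting at the first element that fails it.
(multiple-value-bind (prefix suffix) (span #'evenp (list 2 4 5 6))
  (list prefix suffix))
;; => ((2 4) (5 6))
```

If every element satisfies the predicate, tail is NIL, so the first value is a copy of the whole list and the second value is NIL.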

Scheme macro pairwise processing question

(For now, please ignore that what I'm after is un-Schemey, because this is for a DSL aimed at non-programmers.)
I'd like to do something equivalent to this:
(pairwise key1 value1 key2 value2)
Which would expand to this, m being another macro I've defined (hence I can't simply use a variadic style function):
(list (cons key1 (m value1)) (cons key2 (m value2)))
I've tried this:
(define-syntax pairwise
  (syntax-rules ()
    ((_ key value ...)
     (list (cons key (m value)) ...))))
But as I guessed it expanded to:
(list (cons key1 (m value1)) (cons key1 (m key2)) (cons key1 (m value2)))
I'm a bit stuck on how to process these elements pairwise in the way I'd like, without requiring a user to add inner brackets.
You can do this with recursion. Instead of having one case that looks like
((_ key value ...)
(list (cons key (m value)) ...))
You can have two cases that look like
((_)
'())
((_ key value . rest)
(cons (cons key (m value)) (pairwise . rest)))
Similar to how you would design a recursive list-processing function, but with the base case as a syntax-rules case (detected at compile-time) instead of an if or cond condition (detected at run-time).

Lisp use member through a list of lists

In Common Lisp I have a tree of symbols like:
(setf a '((shoe (walks(town)) (has-laces(snow)))
(tree (grows(bob)) (is-green(house)) (is tall(work)))))
All of them are symbols.
I want to return the sublist that contains the symbol I search for (in this case I might search using the symbol shoe and return the entire sublist in which it is contained). The keywords are always in the second layer, never deeper.
I am trying to use:
(mapcar #'member (shoe my-list))
but that requires shoe to be a list (because of mapcar?), and things got very convoluted after that. Help please!
Given:
(setf a '((shoe (walks(town)) (has-laces(snow)))
(tree (grows(bob)) (is-green(house)) (is tall(work)))))
We can find the first (shoe ...) sublist like this:
(find 'shoe a :key #'car)
-> (SHOE (WALKS (TOWN)) (HAS-LACES (SNOW)))
I.e. search through the list of objects, which are lists, and use their car as the search key.
If there can be duplicates and we want a list of all of the sublists which start with shoe, then Common Lisp's standard library shows itself a bit clumsy. There isn't a nice function which finds all occurrences of an item; we resort to remove-if-not with a lambda:
(remove-if-not (lambda (x) (eq x 'shoe)) a :key #'car)
We can also write a loop expression:
(loop for (sym . rest) in a and
for whole in a
if (eq sym 'shoe) collect whole)
We can also make ourselves a quick and dirty find-all which can be invoked similarly to find:
(defun find-all (item sequence &key (key #'identity) (test #'eql))
  (remove-if-not (lambda (elem) (funcall test item elem)) sequence :key key))
Then:
(find-all 'shoe a :key #'car)
--> ((SHOE (WALKS (TOWN)) (HAS-LACES (SNOW))))
(find-all 'x '((x 1) (y 2) (x 3) (z 4)) :key #'car)
--> ((X 1) (X 3))
(find 'x '((x 1) (y 2) (x 3) (z 4)) :key #'car)
--> (X 1)

hash-tables, counting duplicate keys, scheme

I'm hoping someone can help me with this while I continue searching for a solution.
I'm confused on how to iterate through a hash table and find duplicate keys. I want to remove the duplicates, but consolidate their values.
So, say I have a list of strings:
(define strings '("abcde" "bcdea" "cdeab" "deabc" "eabcd" "abcde"))
And I store them into a hash table where the values are their index positions in the list.
So, I'm wanting to build a hash table like this:
(abcde (0, 5))
(bcdea 1)
(cdeab 2)
(deabc 3)
(eabcd 4)
Each string is a key, and the value is a list of the indexes where that string is found. Basically, I'm counting the number of occurrences of a substring in a large string, and noting their positions.
I know how to make the hash table:
(define my-hash-table (make-hash))
(for-each (lambda (s v) (hash-set! my-hash-table s v)) strings values) ;;values is a list of 0,1,2,3,4,5
(map (lambda (s) (list s (hash-ref my-hash-table s))) strings)
This just builds a hash table of the keys and their values, it doesn't consider if a key is already present in the table.
I'd appreciate any advice. If someone doesn't mind going through it step-by-step with me I'd be very grateful, I'm trying to learn scheme.
I'm using R5RS.
The trick is to check whether each key already has a value and, if so, to add the new index to its list, since by definition each key can only have one value associated with it. I think you're looking for something like this:
(define strings '("abcde" "bcdea" "cdeab" "deabc" "eabcd" "abcde"))
(define values '(0 1 2 3 4 5))
(define my-hash-table (make-hash))
(for-each (lambda (s v)
            (hash-update! my-hash-table
                          s
                          (lambda (a) (cons v a)) ; add element to list
                          (lambda () '())))       ; we start with '()
          strings
          values)
Alternatively, we can create and update the hash table using a functional style of programming:
(define my-hash-table
  (foldl (lambda (s v a)
           (hash-update a
                        s
                        (lambda (a) (cons v a)) ; add element to list
                        (lambda () '())))       ; we start with '()
         (hash)
         strings
         values))
Either way, it works as expected:
(hash->list my-hash-table) ; we get a keys/values list for free
=> '(("eabcd" 4) ("deabc" 3) ("bcdea" 1) ("cdeab" 2) ("abcde" 5 0))
Are you using SRFI 69? Look into hash-table-update!.

What's the best way to sort a hashtable by value?

Right now I have to copy the hash table to a list before sorting it:
(defun good-red ()
  (let ((tab (make-hash-table)) (res '()))
    (dotimes (i 33) (setf (gethash (+ i 1) tab) 0))
    (with-open-file (stream "test.txt")
      (loop for line = (read-line stream nil)
            until (null line)
            do
               (setq nums (butlast (str2lst (substring line 6))))
               (dolist (n nums) (incf (gethash n tab)))))
    (maphash #'(lambda (k v) (push (cons k v) res)) tab) ; <-- the copy into a list
    (setq sort-res (sort res #'< :key #'cdr))
    (reverse (nthcdr (- 33 18) (mapcar #'car sort-res)))))
BTW, what's a better way to fetch the first N elements of a list?
Vatine's answer is technically correct, but probably not super helpful for the immediate problem of someone asking this question. The common case of using a hash table to hold a collection of counters, then selecting the top N items by score can be done like this:
;; convert the hash table into an association list
(defun hash-table-alist (table)
  "Returns an association list containing the keys and values of hash table TABLE."
  (let ((alist nil))
    (maphash (lambda (k v)
               (push (cons k v) alist))
             table)
    alist))
(defun hash-table-top-n-values (table n)
  "Returns the top N entries from hash table TABLE. Values are expected to be numeric."
  (subseq (sort (hash-table-alist table) #'> :key #'cdr) 0 n))
The first function returns the contents of a hash table as a series of cons'd pairs in a list, which is called an association list (the typical list representation for key/value pairs). Most Lisp enthusiasts already have a variation of this function on hand because it's such a common operation. This version is from the Alexandria library, which is very widely used in the CL community.
The second function uses SUBSEQ to grab the first N items from the list returned by sorting the alist returned by the first function using the CDR of each pair as the key. Changing :key to #'car would sort by hash keys, changing #'> to #'< would invert the sort order.
A hash-table is inherently unordered. If you want it sorted, you need to initialize some sort of ordered data structure with the contents.
If you want to fetch the first N elements of a sequence, there's always SUBSEQ.
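One caveat with SUBSEQ: asking for more elements than the sequence holds is not valid, and in practice implementations typically signal an error. A minimal sketch of a clamped variant follows; the name take is an assumption made for this example, not a standard function.

```lisp
;; Hypothetical helper: TAKE is not part of the Common Lisp standard.
(defun take (n list)
  "Return a fresh list of the first N elements of LIST, or all of them if LIST is shorter."
  ;; Clamp the end index so SUBSEQ never reads past the end of LIST.
  (subseq list 0 (min n (length list))))

(take 2 '(a b c)) ; => (A B)
(take 5 '(a b c)) ; => (A B C)
```

Because SUBSEQ always returns a fresh sequence, the result can safely be passed to destructive functions such as SORT without affecting the original list.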