Can a tail recursive function still get stack overflow? - lisp

I've been solving some challenges at codesignal.com using C-Lisp to learn it and I've been avoiding using loops to make lisp style code.
In this challenge called alternatingSums (which gives you an int array a that can be very large and asks you to return an array/list {sumOfEvenIndexedElements, sumOfOddIndexedElements}), I have been getting a stack overflow error with this code:
(defun alternatingSums (a &optional (index 0) (accumulated '(0 0)))
  (cond ((= index (length a))
         accumulated)
        ((evenp index)
         (alternatingSums
          a
          (1+ index)
          `(,(+ (svref a index) (elt accumulated 0)) ,(elt accumulated 1))))
        ((oddp index)
         (alternatingSums
          a
          (1+ index)
          `(,(elt accumulated 0) ,(+ (svref a index) (elt accumulated 1)))))))
Isn't it tail-recursive, or can tail-recursive functions still get a stack overflow?

Recursive functions which call themselves from tail position can lead to stack overflow; language implementations must support some form of tail call elimination to avoid the problem.
You wrote: "I've been avoiding using loops to make lisp style code."
Common Lisp does not require that implementations do tail call elimination, but Scheme implementations must do so. It is idiomatic in Scheme to use recursion for iteration, but in Common Lisp it is idiomatic to use other iteration devices unless recursion provides a natural solution for the problem at hand.
Although Common Lisp implementations are not required to do tail call elimination, many do. Clisp does support limited tail call elimination, but only in compiled code, and only for self-recursive tail calls. This is not well-documented, but there is some discussion to be found here @Renzo. The OP's posted code will be subject to tail call elimination when compiled in Clisp, since the function alternatingSums calls itself from tail position. This covers most cases in which you may be interested in tail call elimination, but note that tail call elimination is not done for mutually recursive function definitions in Clisp. See the end of this answer for an example.
Defining a function from the REPL, or loading a definition from a source file, will result in interpreted code. If you are working in a development environment like SLIME, it is easy to compile: from the source file buffer either do Ctrl-c Ctrl-k to compile the whole file and send it to the REPL, or place the point inside of or immediately after a function definition and do Ctrl-c Ctrl-c to compile a single definition and send it to the REPL.
You could also compile the source file before loading it, e.g. (load (compile-file "my-file.lisp")). Or you could load the source file, and compile a function after that, e.g. (load "my-file.lisp"), then (compile 'my-function).
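As a small illustration of the second route, the sketch below defines a self-recursive function, compiles it with compile, and checks the result with compiled-function-p. The name count-to-zero is made up for this example; in implementations that always compile (SBCL, for instance) the explicit compile call should simply be harmless.

```lisp
;; COUNT-TO-ZERO is a hypothetical example function, not from the question.
(defun count-to-zero (n)
  "Count down from N to 0 with a self-call in tail position."
  (if (zerop n)
      :done
      (count-to-zero (1- n))))

;; Replace the (possibly interpreted) global definition with a compiled one.
(compile 'count-to-zero)

;; True once the global definition is a compiled function.
(compiled-function-p #'count-to-zero)
```

Whether a deep call such as (count-to-zero 10000000) then runs in constant stack space still depends on the implementation and its optimization settings.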
As already mentioned, idiomatic Common Lisp code would probably not use recursion for this sort of function anyway. Here is a definition using the loop macro that some would find clearer and more concise:
(defun alternating-sums (xs)
  (loop for x across xs
        and i below (length xs)
        if (evenp i) sum x into evens
        else sum x into odds
        finally (return (list evens odds))))
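For readers who find the loop keywords unfamiliar, roughly the same iteration can be sketched with dotimes and two explicit accumulators. The name alternating-sums-dotimes is invented here, and the input is assumed to be a vector, as in the question.

```lisp
;; ALTERNATING-SUMS-DOTIMES is a hypothetical name for this sketch.
(defun alternating-sums-dotimes (xs)
  "Return (EVENS ODDS): the sums of the even- and odd-indexed elements of vector XS."
  (let ((evens 0)
        (odds 0))
    ;; The third element of the DOTIMES header is the form returned at the end.
    (dotimes (i (length xs) (list evens odds))
      (if (evenp i)
          (incf evens (aref xs i))
          (incf odds (aref xs i))))))

(alternating-sums-dotimes #(50 60 60 45 70))   ; => (180 105)
```

Because this is a plain iteration, stack depth does not grow with the length of the input, whether or not the code is compiled.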
The Case of Mutually Recursive Functions in Clisp
Here is a simple pair of mutually recursive function definitions:
(defun my-evenp (n)
  (cond ((zerop n) t)
        ((= 1 n) nil)
        (t (my-oddp (- n 1)))))
(defun my-oddp (n)
  (my-evenp (- n 1)))
Neither function calls itself directly, but my-evenp has a call to my-oddp in tail position, and my-oddp has a call to my-evenp in tail position. One would like for these tail calls to be eliminated to avoid blowing the stack for large inputs, but Clisp does not do this. Here is the disassembly:
CL-USER> (disassemble 'my-evenp)
Disassembly of function MY-EVENP
14 byte-code instructions:
0 (LOAD&PUSH 1)
1 (CALLS2&JMPIF 172 L16) ; ZEROP
4 (CONST&PUSH 0) ; 1
5 (LOAD&PUSH 2)
6 (CALLSR&JMPIF 1 47 L19) ; =
10 (LOAD&DEC&PUSH 1)
12 (CALL1 1) ; MY-ODDP
14 (SKIP&RET 2)
16 L16
16 (T)
17 (SKIP&RET 2)
19 L19
19 (NIL)
20 (SKIP&RET 2)
CL-USER> (disassemble 'my-oddp)
Disassembly of function MY-ODDP
3 byte-code instructions:
0 (LOAD&DEC&PUSH 1)
2 (CALL1 0) ; MY-EVENP
4 (SKIP&RET 2)
Compare with a tail recursive function that calls itself. Here there is no call to factorial in the disassembly, but instead a jump instruction has been inserted: (JMPTAIL 2 5 L0).
(defun factorial (n acc)
  (if (zerop n) acc
      (factorial (- n 1) (* n acc))))
CL-USER> (disassemble 'factorial)
Disassembly of function FACTORIAL
11 byte-code instructions:
0 L0
0 (LOAD&PUSH 2)
1 (CALLS2&JMPIF 172 L15) ; ZEROP
4 (LOAD&DEC&PUSH 2)
6 (LOAD&PUSH 3)
7 (LOAD&PUSH 3)
8 (CALLSR&PUSH 2 57) ; *
11 (JMPTAIL 2 5 L0)
15 L15
15 (LOAD 1)
16 (SKIP&RET 3)
Some Common Lisp implementations do support tail call elimination for mutually recursive functions. Here is the disassembly of my-oddp from SBCL:
;; SBCL
; disassembly for MY-ODDP
; Size: 40 bytes. Origin: #x52C8F9E4 ; MY-ODDP
; 9E4: 498B4510 MOV RAX, [R13+16] ; thread.binding-stack-pointer
; 9E8: 488945F8 MOV [RBP-8], RAX
; 9EC: BF02000000 MOV EDI, 2
; 9F1: 488BD3 MOV RDX, RBX
; 9F4: E8771B37FF CALL #x52001570 ; GENERIC--
; 9F9: 488B5DF0 MOV RBX, [RBP-16]
; 9FD: B902000000 MOV ECX, 2
; A02: FF7508 PUSH QWORD PTR [RBP+8]
; A05: E9D89977FD JMP #x504093E2 ; #<FDEFN MY-EVENP>
; A0A: CC10 INT3 16 ; Invalid argument count trap
This is a little harder to read than the previous examples because SBCL compiles to assembly language instead of byte code, but you can see that a jump instruction has been substituted for the call to my-evenp:
; A05: E9D89977FD JMP #x504093E2 ; #<FDEFN MY-EVENP>

Common Lisp compilers are not required to optimize tail calls. Many do, but not all implementations compile your code by default; you have to compile the file using compile-file, or else the function individually with (compile 'alternatingsums).
CLISP contains both an interpreter, which processes the nested-list representation of Lisp source code, and a byte code compiler. The compiler supports tail recursion, whereas the interpreter doesn't:
$ clisp -q
[1]> (defun countdown (n) (unless (zerop n) (countdown (1- n))))
COUNTDOWN
[2]> (countdown 10000000)
*** - Program stack overflow. RESET
[3]> (compile 'countdown)
COUNTDOWN ;
NIL ;
NIL
[4]> (countdown 10000000)
NIL
Peeking under the hood a little bit:
[5]> (disassemble 'countdown)
Disassembly of function COUNTDOWN
1 required argument
0 optional arguments
No rest parameter
No keyword parameters
8 byte-code instructions:
0 L0
0 (LOAD&PUSH 1)
1 (CALLS2&JMPIF 172 L10) ; ZEROP
4 (LOAD&DEC&PUSH 1)
6 (JMPTAIL 1 3 L0)
10 L10
10 (NIL)
11 (SKIP&RET 2)
NIL
We can see that the virtual machine has a JMPTAIL primitive.
Another approach to tail calling is via macros. Years ago, I hacked up a macro called tlet which lets you define (what look like) lexical functions using syntax similar to labels. The tlet construct compiles to a tagbody form in which the tail calls among the functions are go forms. It does not analyze calls for being in tail position: all calls are unconditional transfers that do not return regardless of their position in the syntax. The same source file also provides a trampoline-based implementation of tail calling among global functions.
Here is tlet in CLISP; note: the expression has not been compiled, yet it doesn't run out of stack:
$ clisp -q -i tail-recursion.lisp
;; Loading file tail-recursion.lisp ...
;; Loaded file tail-recursion.lisp
[1]> (tlet ((counter (n) (unless (zerop n) (counter (1- n)))))
(counter 100000))
NIL
tlet is not an optimizer. The call to counter is semantically a goto, always; it's not a procedure call that can sometimes turn into a goto under the right circumstances. Watch what happens when we add a print:
[2]> (tlet ((counter (n) (unless (zerop n) (print (counter (1- n))))))
(counter 100000))
NIL
That's right; nothing! (counter (1- n)) never returns, and so print is never called.
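To give a feel for the trampoline idea mentioned above, here is a minimal sketch. It is not the code from the author's source file: trampoline, tramp-evenp and tramp-oddp are names made up for this illustration, and the convention assumed here is that a function returns a zero-argument closure to request another bounce and any non-function value as its final result.

```lisp
;; All names below are hypothetical; this is not the tlet author's implementation.
(defun trampoline (result)
  "Keep calling RESULT while it is a function; return the first non-function value."
  (loop while (functionp result)
        do (setf result (funcall result)))
  result)

(defun tramp-evenp (n)
  (if (zerop n)
      t
      ;; Instead of calling TRAMP-ODDP directly, hand a thunk back to the trampoline.
      (lambda () (tramp-oddp (1- n)))))

(defun tramp-oddp (n)
  (if (zerop n)
      nil
      (lambda () (tramp-evenp (1- n)))))

(trampoline (tramp-evenp 1000000))   ; => T
```

Each bounce returns to the loop in trampoline before the next call is made, so the stack stays shallow even for mutually recursive functions and without any compiler support. The cost is one closure allocation per step, and with this convention a function cannot be returned as an ordinary result.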

Related

Racket: Why wont this compile?

I'm attempting to program a simple function that adds integers to a list descending from a range of "high" and "low", incremented by "step"
For example,
if the input is (3 12 3), the expected output is '(12 9 6 3)
Below is the following code:
(define (downSeries step high low [(define ret '())])
(if (< high low)
ret
(cons ret (- high step))
(downSeries (step (- high step) low))))
I'm pretty new to racket, but I'm really not sure why this isn't compiling. Any tips? Thank you.
Since only racket is tagged and no special language is described, it is expected that the first line in the definitions window is #lang racket. The answer will be different for the student languages.
1. The last argument is nested in two sets of parentheses, which is illegal syntax. Default arguments have only one set:
(define (test mandatory (optional '()))
  (list mandatory optional))
(test 1) ; ==> (1 ())
(test 1 2) ; ==> (1 2)
2. You have 4 operands in your if form. In #lang racket it takes exactly 3:
(if predicate-expression
    then-expression
    else-expression)
Looking at the code, you should have the cons expression in the position of the ret argument. Having it before the recursion makes it dead code, so ret will always be (). E.g. this looks similar to a typical fold implementation:
(define (fold-1 combine init lst)
  (if (null? lst)
      init                              ; fully grown init returned
      (fold-1 combine
              (combine (car lst) init)  ; init grows
              (cdr lst))))

Is there a straightforward lisp equivalent of Python's generators?

In Python you can write this:
def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1
What is the lisp equivalent of this?
Existing package
Download, install and load the GENERATORS system with Quicklisp. Then, use package :generators (or preferably, define your own package first).
(ql:quickload :generators)
(use-package :generators)
Define an infinite generator for random values:
(defun dice (n)
  (make-generator ()
    ;; repeatedly return a random value between 1 and N
    (loop (yield (1+ (random n))))))
Use the generator:
(loop
  with dice = (dice 6)
  repeat 20
  collect (next dice))
=> (1 2 6 1 1 4 4 2 4 3 6 2 1 5 6 5 1 5 1 2)
Note however what the author of the library says:
This library is more of an interesting toy, though as far as I know it
does work. I dont think I have ever used this in application code,
though I think that with care, it could be.
See also
The ITERATE package provides a way to define generators for use inside its iteration facility.
The SERIES package provides stream-like data structures and operations on them.
The Snakes library (same approach as GENERATORS as far as I know).
Iterators in generic-cl
Closures
In practice, CL does not rely that much on generators as popularized by Python. What happens instead is that when people need lazy sequences, they use closures:
(defun dice (n)
  (lambda ()
    (1+ (random n))))
Then, the equivalent of next is simply a call to the thunk generated by dice:
(loop
  with dice = (dice 6)
  repeat 20
  collect (funcall dice))
This is the approach that is preferred, in particular because there is no need to rely on delimited continuations like with generators. Your example involves a state, which the dice example does not require (there is a hidden state that influences random, but that's another story). Here is how your counter is typically implemented; like the Python version it produces 0 up to n-1, then returns NIL once exhausted:
(defun first-n (n)
  (let ((counter 0))
    (lambda ()
      (when (< counter n)
        ;; return the current value, then advance
        (prog1 counter (incf counter))))))
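One way to consume such a closure-based generator is to call it until it signals exhaustion. The sketch below assumes the convention that the closure returns NIL when it is done (so it cannot yield NIL itself); make-first-n and drain are names invented for this illustration.

```lisp
;; MAKE-FIRST-N and DRAIN are hypothetical names for this sketch.
(defun make-first-n (n)
  "Return a closure producing 0, 1, ..., N-1 on successive calls, then NIL."
  (let ((next 0))
    (lambda ()
      (when (< next n)
        ;; PROG1 returns the current value before NEXT is incremented.
        (prog1 next (incf next))))))

(defun drain (generator)
  "Call GENERATOR until it returns NIL and collect the values in order."
  (loop for value = (funcall generator)
        while value
        collect value))

(drain (make-first-n 5))   ; => (0 1 2 3 4)
```

Note that 0 is a true value in Common Lisp, so the first element does not terminate the loop. If NIL must be a legal element, a second return value or a distinct end-of-sequence marker would be needed instead.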
Higher-order functions
Alternatively, you design a generator that accepts a callback function which is called by your generator for each value. Any funcallable can be used, which allows the caller to retain control over code execution:
(defun repeatedly-throw-dice (n callback)
  (loop (funcall callback (1+ (random n)))))
Then, you can use it as follows:
(prog ((counter 0) stack)
  (repeatedly-throw-dice 6
    (lambda (value)
      (if (<= (incf counter) 20)
          (push value stack)
          (return (nreverse stack))))))
See documentation for PROG.
do-traversal idiom
Instead of building a function, data sources that provide a custom way of generating values (like matches of a regular expression in a string) also regularly provide a macro that abstracts their control flow. You would use it as follows:
(let ((counter 0) stack)
  (do-repeatedly-throw-dice (value 6)
    (if (<= (incf counter) 20)
        (push value stack)
        (return (nreverse stack)))))
DO-X macros are expected to define a NIL block around their body, which is why the return above is valid.
A possible implementation for the macro is to wrap the body in a lambda form and use the callback-based version defined above:
(defmacro do-repeatedly-throw-dice ((var n) &body body)
  `(block nil (repeatedly-throw-dice ,n (lambda (,var) ,@body))))
Directly expanding into a loop would be possible too:
(defmacro do-repeatedly-throw-dice ((var n) &body body)
  (let ((max (gensym)) (label (make-symbol "NEXT")))
    `(prog ((,max ,n) ,var)
        ,label
        (setf ,var (1+ (random ,max)))
        (progn ,@body)
        (go ,label))))
One step of macroexpansion for above form:
(prog ((#:g1078 6) value)
 #:next
   (setf value (1+ (random #:g1078)))
   (progn
     (if (<= (incf counter) 20)
         (push value stack)
         (return (nreverse stack))))
   (go #:next))
Bindings
Broadly speaking, building a generator with higher-order functions or directly with a do- macro gives the same result. You can implement one with the other (personally, I prefer to define first the macro and then the function using the macro, but doing the opposite is also interesting, since you can redefine the function without recompiling all usages of the macro).
However, there is still a difference: the macro reuses the same variable across iterations, whereas the closure introduces a fresh binding each time. For example:
(let ((list))
  (dotimes (i 10) (push (lambda () i) list))
  (mapcar #'funcall list))
... returns:
(10 10 10 10 10 10 10 10 10 10)
Most (if not all) iterators in Common Lisp tend to work like this [1], and it should not come as a surprise for experienced users (the opposite would be surprising, in fact). If dotimes were implemented by repeatedly calling a closure, the result would be different:
(defmacro my-dotimes ((var count-form &optional result-form) &body body)
  `(block nil
     (alexandria:map-iota (lambda (,var) ,@body) ,count-form)
     ,result-form))
With the above definition, we can see that:
(let ((list))
  (my-dotimes (i 10) (push (lambda () i) list))
  (mapcar #'funcall list))
... returns:
(9 8 7 6 5 4 3 2 1 0)
In order to have the same result with the standard dotimes, you only need to create a fresh binding before building the closure:
(let ((list))
  (dotimes (i 10)
    (let ((j i))
      (push (lambda () j) list))))
Here j is a fresh binding whose value is the current value of i at closure creation time; j is never mutated so the closure will constantly return the same value.
If you wanted to, you could always introduce that inner let from the macro, but this is rarely done.
1: Note that the specification for DOTIMES does not say whether a fresh binding is established at each iteration or a single binding is mutated at each step: "It is implementation-dependent whether dotimes establishes a new binding of var on each iteration or whether it establishes a binding for var once at the beginning and then assigns it on any subsequent iterations." In order to write portably, it is necessary to assume the worst-case scenario (i.e. mutation, which happens to be what most (all?) implementations do) and manually rebind iteration variables if they are to be captured and reused at a later point.
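The portable rebinding pattern can be wrapped up and checked in a few lines. This is only a sketch; the name capture-with-rebinding is made up here.

```lisp
;; CAPTURE-WITH-REBINDING is a hypothetical name for this sketch.
(defun capture-with-rebinding (count)
  "Collect COUNT closures, each capturing its own copy of the loop index."
  (let ((closures '()))
    (dotimes (i count)
      ;; J is a fresh binding per iteration, whatever DOTIMES does with I.
      (let ((j i))
        (push (lambda () j) closures)))
    ;; PUSH built the list in reverse, so restore iteration order first.
    (mapcar #'funcall (nreverse closures))))

(capture-with-rebinding 3)   ; => (0 1 2)
```

Because each closure captures j rather than i, the result does not depend on which of the two permitted DOTIMES strategies the implementation uses.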

What's the difference between (list nil) and '(nil) in Lisp? [duplicate]

First of all, let me say I'm a beginner in Lisp. To be honest I have been a beginner for some time now, but there are still many things I don't know well.
While I was writing this question, I came up with a strange bug in my code.
Here is a function that will return the list (0 1 ... n) with the list e appended. It uses rplacd along the way to keep track of the last element, to avoid a final call to last.
For example, (foo 4 '(x)) returns (0 1 2 3 4 x).
The "head" is stored in a, which is not simply nil, because there is only one nil, and never a copy of it (if I understand correctly), hence I can't simply append to nil.
(defun foo (n e)
  (let* ((a (list nil)) (tail a))
    (loop for i to n
          do (rplacd tail (setf tail (list i)))
          finally (rplacd tail (setf tail e))
                  (return (cdr a)))))
(defun bar (n e)
  (let* ((a '(nil)) (tail a))
    (loop for i to n
          do (rplacd tail (setf tail (list i)))
          finally (rplacd tail (setf tail e))
                  (return (cdr a)))))
The only difference between these functions is the (list nil) replaced by '(nil) in bar. While foo works as expected, bar always returns nil.
My initial guess is this happens because the original cdr of a is indeed nil, and the quoted list may be considered constant. However, if I do (setf x '(nil)) (rplacd x 1) I get (nil . 1) as expected, so I must be at least partially wrong.
When evaluated, '(nil) and (list nil) produce similar lists, but the former can be considered constant when present in source code. You should not perform any destructive operations on a constant quoted list in Common Lisp. See http://l1sp.org/cl/3.2.2.3 and http://l1sp.org/cl/quote. In particular, the latter says "The consequences are undefined if literal objects (including quoted objects) are destructively modified."
Quoted data is considered a constant. If you have two functions:
(defun test (&optional (arg '(0)))
  (setf (car arg) (1+ (car arg)))
  (car arg))
(defun test2 ()
  '(0))
These are two functions both using the constant list (0) right?
The implementation may choose to not mutate constants:
(test) ; ==> Error, into the debugger we go
The implementation can cons the same list twice (the reader might do that for it)
(test2) ; ==> (0)
(test) ; ==> 1
(test) ; ==> 2
(test) ; ==> 3
(test2) ; ==> (0)
The implementation can see it's the same and hence save space:
(test2) ; ==> (0)
(test) ; ==> 1
(test) ; ==> 2
(test) ; ==> 3
(test2) ; ==> (3)
In fact, the last two behaviors might occur in the same implementation, depending on whether the function is compiled or not.
In CLISP both functions work the same. I also see when disassembling with SBCL that the constant actually is mutated so I wonder if perhaps it has constant folded (cdr '(0)) at compile time and doesn't use the mutated list at all. It really doesn't matter since both are considered good "undefined" behavior.
The part from CLHS about this is very short
The consequences are undefined if literal objects (including quoted
objects) are destructively modified.
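A common way to stay clear of this undefined behavior is to build the list at run time, with list or copy-list, whenever it is going to be mutated. The sketch below is an illustration only; bump-fresh is a made-up name.

```lisp
;; BUMP-FRESH is a hypothetical function for this sketch.
(defun bump-fresh (&optional (arg (list 0)))
  "Increment the first element of ARG; the default is a fresh list on every call."
  (incf (car arg)))

(bump-fresh)   ; => 1
(bump-fresh)   ; => 1, because (list 0) is evaluated again for each call

;; COPY-LIST gives a mutable copy when the data starts out as a literal.
(let ((cells (copy-list '(nil))))
  (rplacd cells (list 1))
  cells)       ; => (NIL 1)
```

Since no literal object is ever modified here, the results are the same whether the code is interpreted or compiled.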

Defining setf for function in closure

If I create a closure like this,
(let ((A (make-array '(10) :initial-element 5)))
  (defun h (i)
    (aref a i))
  (defsetf h (i) (x) `(setf (aref ,a ,i) ,x)))
then, as I expect, (h i) will return the i-th element of a:
(h 1) ;; => 5
(h 2) ;; => 5
But although the setf expansion seems to work and correctly sets the i-th element of a, it also produces a warning in SBCL:
(setf (h 1) 10)
; in: SETF (H 1)
; (SETF (AREF #(5 10 5 5 5 5 5 5 5 5) 1) #:G1124)
; --> LET* MULTIPLE-VALUE-BIND LET FUNCALL SB-C::%FUNCALL
; ==>
; ((SETF AREF) #:NEW0 #(5 10 5 5 5 5 5 5 5 5) 1)
;
; caught WARNING:
; Destructive function (SETF AREF) called on constant data.
; See also:
; The ANSI Standard, Special Operator QUOTE
; The ANSI Standard, Section 3.2.2.3
;
; compilation unit finished
; caught 1 WARNING condition
In GCL an error is signalled:
>(setf (h 1) 10)
Error:
Fast links are on: do (si::use-fast-links nil) for debugging
Signalled by LAMBDA-CLOSURE.
Condition in LAMBDA-CLOSURE [or a callee]: INTERNAL-SIMPLE-UNBOUND-VARIABLE: Cell error on A: Unbound variable:
Broken at LIST. Type :H for Help.
1 Return to top level.
In CLISP and ECL, the example works just fine.
I am returning to Common Lisp after writing Scheme for a couple of years, so I may be mixing the two languages, conceptually. I suppose I have triggered behavior that is undefined according to the spec, but I can't see exactly what I did wrong. I would appreciate any help with this!
Your Problem
It is often instructive to try macroexpand:
(macroexpand '(setf (h 2) 7))
==>
(LET* ()
(MULTIPLE-VALUE-BIND (#:G655)
7
(SETF (AREF #(5 5 5 5 5 5 5 5 5 5) 2) #:G655)))
As you can see, your setf call expands into a form which calls setf on a literal array which is a bad idea in general and, in fact, this is precisely what SBCL is warning you about:
Destructive function (SETF AREF) called on constant data.
Note that despite the warning SBCL (and other conformant implementations like CLISP and ECL) will do what you expect them to do.
This is because the literal array is referred to by the local variable which is accessible to the function h.
Solution
I suggest that you use a function instead:
(let ((A (make-array '(10) :initial-element 5)))
  (defun h (i)
    (aref a i))
  (defun (setf h) (x i)
    (setf (aref a i) x)))
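For completeness, here is how a setf function defined this way behaves in use. The sketch is self-contained and uses the made-up names storage and slot-at so that it does not clash with h above.

```lisp
;; STORAGE and SLOT-AT are hypothetical names for this sketch.
(let ((storage (make-array 10 :initial-element 5)))
  (defun slot-at (i)
    (aref storage i))
  ;; A setf function takes the new value first, then the accessor's arguments.
  (defun (setf slot-at) (new-value i)
    (setf (aref storage i) new-value)))

(setf (slot-at 1) 10)
(slot-at 1)          ; => 10
(incf (slot-at 2))   ; => 6, modify macros go through the same setf function
(slot-at 0)          ; => 5
```

Because (setf slot-at) is an ordinary function closing over the variable, nothing is spliced into the caller's code as a literal, so no constant data is modified.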

how to get 64 bit integer in common lisp?

I want to write a bitboard in common lisp, so I need a 64 bit integer. How do I get a 64 bit integer in common lisp? Also, are there any libraries that could help me accomplish this without writing everything from scratch?
You can declare your variables to be of type (signed-byte 64) or (unsigned-byte 64):
CL-USER> (typexpand '(unsigned-byte 64))
(INTEGER 0 18446744073709551615)
T
CL-USER> (typexpand '(signed-byte 64))
(INTEGER -9223372036854775808 9223372036854775807)
T
It depends upon your implementation if it is actually clever enough to really stuff this in 8 consecutive bytes or if it will use a bignum for this. Appropriate optimize-declarations might help.
Here's a (very simple) example of such type declarations, and handling integers in binary:
(let* ((x #b01)
       (y #b10)
       (z (logior x y)))
  (declare ((signed-byte 64) x y z))
  (format t "~a~%" (logbitp 1 x))
  (format t "~a~%" (logbitp 1 (logior x (ash 1 1))))
  (format t "~b~%" z))
Output:
NIL
T
11
Here's a setf-expander definition to get a simple setter for bits in integers, and a corresponding getter:
(define-setf-expander logbit (index place &environment env)
  (multiple-value-bind (temps vals stores store-form access-form)
      (get-setf-expansion place env)
    (let ((i (gensym))
          (store (gensym))
          (stemp (first stores)))
      (values `(,i ,@temps)
              `(,index ,@vals)
              `(,store)
              `(let ((,stemp (dpb ,store (byte 1 ,i) ,access-form))
                     ,@(cdr stores))
                 ,store-form
                 ,store)
              `(logbit ,i ,access-form)))))
(defun logbit (index integer)
  (ldb (byte 1 index) integer))
These can be used like this:
(let ((x 1))
(setf (logbit 3 x) 1)
x)
==> 9
(let ((x 9))
(setf (logbit 3 x) 0)
x)
==> 1
(logbit 3 1)
==> 0
(logbit 3 9)
==> 1
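Building on the same bit operations, a bitboard stored in a single integer might be sketched as follows. The type name bitboard, the helper names, and the square numbering (file + 8 * rank, both counted from 0) are assumptions made for this illustration.

```lisp
;; All names and the square numbering below are hypothetical choices.
(deftype bitboard () '(unsigned-byte 64))

(defun square-index (file rank)
  "Map FILE and RANK (each 0-7) to a bit position 0-63."
  (+ file (* 8 rank)))

(defun set-square (board file rank)
  "Return a new board with the given square's bit set."
  (logior board (ash 1 (square-index file rank))))

(defun clear-square (board file rank)
  "Return a new board with the given square's bit cleared."
  (logandc2 board (ash 1 (square-index file rank))))

(defun square-set-p (board file rank)
  (logbitp (square-index file rank) board))

(let ((board (set-square (set-square 0 0 0) 7 7)))
  (list (square-set-p board 7 7)   ; T
        (square-set-p board 3 3)   ; NIL
        (logcount board)))         ; 2 squares occupied
;; => (T NIL 2)
```

These functions are purely functional: they return new integers rather than modifying a place. Whether a value with bit 63 set is stored unboxed or as a bignum is up to the implementation and the declarations in effect.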
In portable Common Lisp, integers are as large as you like. There is a more efficient subset of integers called fixnums. The exact range of fixnums is implementation-dependent, but it is typically not the full 64 bits (on a 64-bit architecture), since most Common Lisp implementations need type tag bits. For the user there is not much of a difference: fixnums are a subset of integers, and adding two fixnums can produce a non-fixnum integer result. The only differences that may be observable are that computation with non-fixnum integers is slower, needs more storage, and so on. Generally, if you want to do computation with integers, you don't need to declare that you want to calculate with 64 bits. You just use integers and the usual operations for those.
If you want real 64bit large integers (represented in only 64bits, without tags, etc.) and computation with those, you'll leave the portable ANSI CL capabilities. If and how CLISP supports that, is best asked on the CLISP mailing list.
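The fixnum boundary can be inspected portably with the standard constants most-positive-fixnum and most-negative-fixnum. The helper name fixnum-bit-width below is made up, and its result differs between implementations.

```lisp
;; FIXNUM-BIT-WIDTH is a hypothetical helper for this sketch.
(defun fixnum-bit-width ()
  "Bits needed to represent any fixnum in two's complement, sign bit included."
  (1+ (max (integer-length most-positive-fixnum)
           (integer-length most-negative-fixnum))))

(typep most-positive-fixnum 'fixnum)         ; => T
(typep (1+ most-positive-fixnum) 'fixnum)    ; => NIL
(typep (1+ most-positive-fixnum) 'integer)   ; => T, arithmetic just keeps going
```

Stepping past the fixnum range silently yields a larger integer rather than wrapping around, which is why no declaration is needed for correctness.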
Documentation
Type FIXNUM
Type INTEGER
Example usage of bit vectors/arrays to implement a 8x8 bit-board
(starting with brutally and prematurely optimized code just to show a
way to get tight assembler code):
(defun make-bitboard ()
  (make-array '(8 8) :element-type '(mod 2) :initial-element 0))
MAKE-BITBOARD will create an 8x8 bitboard as an array of bits. When
using SBCL, this is internally represented as 1 bit per element (so
you have 64 bits + array instance overhead). If you ask for
optimizations when accessing the board, you'll get fast code.
(declaim (inline get-bitboard))
(defun get-bitboard (bit-board x y)
  (declare (optimize speed (safety 0) (debug 0))
           (type (simple-array (mod 2) (8 8)) bit-board)
           (type fixnum x y))
  (aref bit-board x y))
(declaim (notinline get-bitboard))
The DECLAIMs are there to allow local inlining requests for
GET-BITBOARD.
An example of using GET-BITBOARD:
(defun use-bitboard (bit-board)
  (declare (optimize speed (safety 0) (debug 0))
           (type (simple-array (mod 2) (8 8)) bit-board)
           (inline get-bitboard))
  (let ((sum 0))
    (declare (type fixnum sum))
    (dotimes (i 8)
      (declare (type fixnum i))
      (dotimes (j 8)
        (declare (type fixnum j))
        (incf sum (the (mod 2) (get-bitboard bit-board i j)))))
    sum))
Since there is no SET-BITBOARD yet, an example of using USE-BITBOARD is:
(use-bitboard (make-bitboard))
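A setter could look roughly like the following unoptimized sketch. It is not part of the original code: make-board-array, set-bitboard and count-set-squares are made-up names, and the element type bit is used as a synonym for (mod 2).

```lisp
;; MAKE-BOARD-ARRAY, SET-BITBOARD and COUNT-SET-SQUARES are hypothetical names.
(defun make-board-array ()
  "Create an 8x8 array of bits, all initially 0."
  (make-array '(8 8) :element-type 'bit :initial-element 0))

(defun set-bitboard (bit-board x y value)
  "Store VALUE (0 or 1) at square X, Y and return VALUE."
  (setf (aref bit-board x y) value))

(defun count-set-squares (bit-board)
  "Sum all 64 elements by walking the array in row-major order."
  (loop for k below (array-total-size bit-board)
        sum (row-major-aref bit-board k)))

(let ((board (make-board-array)))
  (set-bitboard board 0 0 1)
  (set-bitboard board 7 7 1)
  (count-set-squares board))   ; => 2
```

Declarations like the ones used for GET-BITBOARD could be added later if a profiler shows that the setter matters.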
Disassembling USE-BITBOARD (SBCL again, Linux x64) shows that the
compiler inlined GET-BITBOARD:
; disassembly for USE-BITBOARD
; 030F96A2: 31F6 XOR ESI, ESI ; no-arg-parsing entry point
; 6A4: 31D2 XOR EDX, EDX
; 6A6: EB54 JMP L3
; 6A8: 90 NOP
; 6A9: 90 NOP
; 6AA: 90 NOP
; 6AB: 90 NOP
; 6AC: 90 NOP
; 6AD: 90 NOP
; 6AE: 90 NOP
; 6AF: 90 NOP
; 6B0: L0: 31DB XOR EBX, EBX
; 6B2: EB3E JMP L2
; 6B4: 90 NOP
; 6B5: 90 NOP
; 6B6: 90 NOP
; 6B7: 90 NOP
; 6B8: 90 NOP
; 6B9: 90 NOP
; 6BA: 90 NOP
; 6BB: 90 NOP
; 6BC: 90 NOP
; 6BD: 90 NOP
; 6BE: 90 NOP
; 6BF: 90 NOP
; 6C0: L1: 488D04D500000000 LEA RAX, [RDX*8]
; 6C8: 4801D8 ADD RAX, RBX
; 6CB: 4C8B4711 MOV R8, [RDI+17]
; 6CF: 48D1F8 SAR RAX, 1
; 6D2: 488BC8 MOV RCX, RAX
; 6D5: 48C1E906 SHR RCX, 6
; 6D9: 4D8B44C801 MOV R8, [R8+RCX*8+1]
; 6DE: 488BC8 MOV RCX, RAX
; 6E1: 49D3E8 SHR R8, CL
; 6E4: 4983E001 AND R8, 1
; 6E8: 49D1E0 SHL R8, 1
; 6EB: 4C01C6 ADD RSI, R8
; 6EE: 4883C302 ADD RBX, 2
; 6F2: L2: 4883FB10 CMP RBX, 16
; 6F6: 7CC8 JL L1
; 6F8: 4883C202 ADD RDX, 2
; 6FC: L3: 4883FA10 CMP RDX, 16
; 700: 7CAE JL L0
; 702: 488BD6 MOV RDX, RSI
; 705: 488BE5 MOV RSP, RBP
; 708: F8 CLC
; 709: 5D POP RBP
; 70A: C3 RET
Not sure why the compiler put in all those NOPs (leaving space for
instrumentation later? alignments?) but if you look at the code at the
end it's pretty compact (not as compact as hand-crafted assembler, of
course).
Now this is an obvious case of premature optimization. The correct way
to start here would be to simply write:
(defun get-bitboard (bit-board x y)
  (aref bit-board x y))
(defun use-bitboard (bit-board)
  (let ((sum 0))
    (dotimes (i 8)
      (dotimes (j 8)
        (incf sum (get-bitboard bit-board i j))))
    sum))
... and then use a profiler when running the game code that uses the
bit-board to see where the CPU bottlenecks are. SBCL includes a nice
statistical profiler.
Starting with the simpler and slower code, with no declarations for
speed, is best. Just compare the size of the code - I started with the
code with plenty of declarations to make the simple code at the end
look even simpler by comparison :-). The advantage here is that you
can treat Common Lisp as a scripting/prototyping language when trying
out ideas, then squeeze more performance out of the code that the
profiler suggests.
The assembly code is obviously not as tight as loading the whole board in
one 64 bit register and then accessing individual bits. But if you
suddenly decide that you want more than 1 bit per square, it's much
easier to change the CL code than to change assembler code (just
change the array type everywhere from '(mod 2) to '(mod 16), for
instance).
You want to use bit vectors, which are arbitrary sized arrays of bits, rather than something like a 64 bit integer. The implementation will deal with the internal representations for you.
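As a rough sketch of that suggestion, a 64-element bit vector can hold one board, and the standard bit-array functions combine boards element-wise. The names make-bit-board and occupied-count are invented for this example.

```lisp
;; MAKE-BIT-BOARD and OCCUPIED-COUNT are hypothetical names for this sketch.
(defun make-bit-board ()
  "Create a simple bit vector with one bit per square, all clear."
  (make-array 64 :element-type 'bit :initial-element 0))

(defun occupied-count (board)
  "Number of squares whose bit is 1."
  (count 1 board))

(let ((white (make-bit-board))
      (black (make-bit-board)))
  (setf (sbit white 0) 1
        (sbit black 63) 1)
  ;; BIT-IOR returns a fresh bit vector holding the element-wise OR.
  (occupied-count (bit-ior white black)))   ; => 2
```

The related functions bit-and, bit-xor and bit-not work the same way, so most set-style bitboard operations are available without touching integer representations.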