The list primitives in Lisp have the operator length defined as follows:
(define (length lst)
  (if (null? lst) 0 (+ 1 (length (cdr lst)))))
Why can't implementers of some Lisp make length a primitive computed in constant time?
Thanks
The reason is the "traditional" implementation of lists in Lisp. Elsewhere, lists are sometimes implemented similarly to an array. So, when you create a list, it's a single, self-contained thing.
An array-like list: [ 1 2 3 4 5 6 ]
If you wanted to tack an item -- '0', for example -- to the beginning of that list (and assuming the list implementation is simplistic), all the items in the list would be shifted up (to the right). Then, the '0' would be installed in the new spot.
Step 0. [ 1 2 3 4 5 6 ]
Step 1. [ _ 1 2 3 4 5 6 ] (Shift Right)
Step 2. [ 0 1 2 3 4 5 6 ] (Insert)
That is not how (traditional) Lisp lists are at all. Every item of the list is a separate piece that points to the next item. The parts of the list are called "conses". (This is called a linked-list.)
[1] -> [2] -> [3] -> [4] -> [5] -> [6] -> nil
Huh? Well, each of those items is a cons cell. And each cons cell holds two values: car and cdr. (When cons cells are used with lists, it's nicer to think of "car" and "cdr" as "first" and "rest". "First" holds whatever data you want the list to hold, like numbers, or even links to other lists. "Rest" holds the rest of the list. You can actually put whatever you want into the "rest" part, too, but we'll ignore that here.) "Nil" is the "empty list", which basically means "the list has ended!"
So, what if we wanted to tack a '0' onto the front again?
[0] -> [1] -> [2] -> [3] -> [4] -> [5] -> [6] -> nil
See? We just created a new cons cell that pointed to the beginning of the old list. You'll hear people call this "consing" a value onto the front. But, you'll notice that the rest of the list is untouched.
Okay, but why does anyone want this? You can share lists. Imagine you wanted two basically identical lists. Only now you wanted one list to start with a '7', and the other with a '4'? Well, with the array-like way, we'd have to have two lists.
[ 7 0 1 2 3 4 5 6 ]
[ 4 0 1 2 3 4 5 6 ]
What if you replaced the '5' with '13' in one of them, and wanted to share the change?
[ 7 0 1 2 3 4 13 6 ]
[ 4 0 1 2 3 4 5 6 ]
You have two totally separate lists. If you want to share the change, you'll have to make the change in both lists.
What happens with traditional Lisp lists? First, sticking a '7' and a '4' to the front:
[7]
   \
    ----> [0] -> [1] -> [2] -> [3] -> [4] -> [5] -> [6] -> nil
   /
[4]
Yup. We just made two conses that both point to the old list. The rest of the list is shared. And, if we replace '5' with '13':
[7]
   \
    ----> [0] -> [1] -> [2] -> [3] -> [4] -> [13] -> [6] -> nil
   /
[4]
Both lists get the change, because the rest is shared.
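The sharing shown in the diagrams can be sketched in Python. This is only an illustration, not how any Lisp is implemented: the `Cons` class and the helpers `from_values` and `to_values` are hypothetical names made up for this example.

```python
class Cons:
    """A cons cell: `first` holds a value, `rest` points to the next cell or None (nil)."""

    def __init__(self, first, rest=None):
        self.first = first
        self.rest = rest


def from_values(values):
    """Build a chain of cons cells from a Python sequence (hypothetical helper)."""
    head = None
    for value in reversed(values):
        head = Cons(value, head)
    return head


def to_values(cell):
    """Walk the chain and collect the values into a Python list (hypothetical helper)."""
    out = []
    while cell is not None:
        out.append(cell.first)
        cell = cell.rest
    return out


shared = from_values([0, 1, 2, 3, 4, 5, 6])
list_a = Cons(7, shared)  # [7] -> shared tail
list_b = Cons(4, shared)  # [4] -> the very same tail

# Replace the 5 with 13 by walking to that cell in the shared tail.
cell = shared
while cell.first != 5:
    cell = cell.rest
cell.first = 13

print(to_values(list_a))  # [7, 0, 1, 2, 3, 4, 13, 6]
print(to_values(list_b))  # [4, 0, 1, 2, 3, 4, 13, 6]
```

Only one cell was modified, yet both lists show the 13, because `list_a.rest` and `list_b.rest` are the same object.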
The point of all this is that, unlike what many folks expect, traditional Lisp lists aren't a single thing. They are a bunch of little things (conses) that point to each other to form a chain, which is the list. If you want to get the length of the chain, you have to follow all the little links until you get to the end, nil. Thus, O(n) length.
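As a rough sketch of why this is O(n), here is the same walk written in Python. The `Cons` class is a hypothetical stand-in for a cons cell, with None playing the role of nil.

```python
class Cons:
    """Hypothetical cons cell: a value plus a link to the next cell (None means nil)."""

    def __init__(self, first, rest=None):
        self.first = first
        self.rest = rest


def length(cell):
    """Count cells by following every link until nil; one step per element."""
    count = 0
    while cell is not None:
        count += 1
        cell = cell.rest
    return count


lst = Cons(1, Cons(2, Cons(3)))
print(length(lst))           # 3
print(length(Cons(0, lst)))  # 4
```

Nothing in the first cell records how many cells follow it, so there is no shortcut: the only way to know is to visit them all.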
Lisp lists could be made O(1) if they were done like arrays. I'm sure some obscure Lisps do this, and get rid of linked lists (conses) altogether. But, conses seem to be one of the things that makes its way into basically every popular Lisp dialect. If you want O(1) length and lookup, most of them provide arrays or vectors, too.
Linked-list fun:
-----------------<------------------------<----------
|                                                   |
--> [0] -> [1] -> [2] -> [3] -> [4] -> [5] -> [6] -->
This linked list eventually points to itself. If you use your length function, it will loop forever, looking for the end, but never finding it.
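If you do need a length function that survives such a list, one option is to detect the loop while counting. The sketch below uses Floyd's "tortoise and hare" cycle detection, which is a technique swapped in for illustration and not something the original length function does. `Cons` and `safe_length` are hypothetical names.

```python
class Cons:
    """Hypothetical cons cell: a value plus a link to the next cell (None means nil)."""

    def __init__(self, first, rest=None):
        self.first = first
        self.rest = rest


def safe_length(cell):
    """Return the number of cells, or None if the chain loops back on itself."""
    slow = fast = cell
    count = 0
    while fast is not None:
        fast = fast.rest          # fast pointer: first of two steps
        count += 1
        if fast is None:
            break
        fast = fast.rest          # fast pointer: second step
        count += 1
        slow = slow.rest          # slow pointer: one step
        if fast is slow:          # the pointers can only meet inside a loop
            return None
    return count


# Build [0] -> [1] -> ... -> [6], then point the last cell back at the first.
cells = [Cons(i) for i in range(7)]
for current, following in zip(cells, cells[1:]):
    current.rest = following
print(safe_length(cells[0]))  # 7

cells[-1].rest = cells[0]
print(safe_length(cells[0]))  # None
```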
Most Lisp dialects don't have lists as a primitive data type. In Common Lisp for example lists are made of cons cells and the empty list (aka NIL). A cons cell is a data structure with two fields: CAR and CDR. With cons cells and NIL one can provide a lot of data structures: singly linked lists, assoc lists, circular lists, trees, ...
Lisp could have implemented lists in a different way (without conses), but generally it hasn't.
Btw., Common Lisp has adjustable arrays with fill pointer. The fill pointer provides the length in O(1).
It's O(n) for a plain linked list, but there are other data structures available. For example, Clojure implements List and Vector with O(1) count.
Because people don't ask for the length of a list often enough to justify the space cost. And if you do, you're doing it wrong.
I am trying to make a contract for data that looks like this:
'(a (b c) (d e) ...) ; a, b, c, d, e are all symbols
which is basically a list consisting of a symbol followed by an arbitrary number of lists of two symbols.
There is list/c but that only lets me make it with a fixed number of elements.
There is also *list/c which takes arbitrary initial values, followed by final fixed values, which is kind of the opposite of what I need.
How do I make a correct contract for my data structure?
You can use cons/c to apply one contract to the head of the list and another to the tail. What you want to express is that the head is a symbol and the tail is a list of pairs of symbols, so that'd be:
(cons/c symbol? (listof (list/c symbol? symbol?)))
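For readers who don't use Racket, the shape this contract accepts can be spelled out as a plain predicate. This Python sketch is only an analogy: it assumes symbols are modelled as strings and lists as Python lists, and `is_symbol`, `is_symbol_pair` and `matches_shape` are hypothetical names.

```python
def is_symbol(x):
    """Assumption for this sketch: a symbol is modelled as a Python string."""
    return isinstance(x, str)


def is_symbol_pair(x):
    """Mirrors (list/c symbol? symbol?): a list of exactly two symbols."""
    return isinstance(x, list) and len(x) == 2 and all(is_symbol(e) for e in x)


def matches_shape(data):
    """Mirrors (cons/c symbol? (listof ...)): a symbol head, then any number of pairs."""
    return (
        isinstance(data, list)
        and len(data) >= 1
        and is_symbol(data[0])
        and all(is_symbol_pair(item) for item in data[1:])
    )


print(matches_shape(["a", ["b", "c"], ["d", "e"]]))  # True
print(matches_shape(["a"]))                          # True
print(matches_shape(["a", ["b", "c", "d"]]))         # False
```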
How should I think of quote in the context of Category Theory? Is quote a monad? What kind of things is it?
I do not think it plays any role in category theory since it does not have anything to do with computation and everything to do with parsing and syntax. It is not a monad.
Imagine you want the string 5 + 1, what do you do? Well, you enclose it in double quotes like "5 + 1" in the code and suddenly the result isn't 6 anymore but the string "5 + 1". Is "" something special in category theory? Is it a monad? I don't think so, since it just tells the compiler to create a data structure that results in that string. In Haskell "hello" is just fancy syntax sugar for ['h', 'e', 'l', 'l', 'o']. In most languages a string is just a series of consecutive bytes, often an array.
The quote special form performs the same operation, such that '(+ 1 2) isn't an expression anymore, but data. The compiler does (cons '+ (cons '1 (cons '2 '()))) and stores the pointer to that for everywhere you have some literal ending with (+ 1 2). Because of that, (eq? '(1 2) (cdr '(+ 1 2))) might be #t, but #f is just as reasonable an outcome, since the compiler might not optimize for shared structure.
Moving forward, you could imagine a fancy language that can dictate how the parser and compiler interpret literals. Almost all languages I know have string literals, but if you wrote code to model complex numbers it would be cool to say in the code that 3+5i should become tmp1 = make_complex 3 5 and that tmp1 is to be put everywhere the literal 3+5i exists in code. Why should numbers, strings, chars and perhaps regular expressions have special treatment?
I am coming across more and more situations in Scratch where I have to convert a number to its ASCII character or vice versa. There is no built-in function in the blocks for this.
My solution is to create a list of size 26 and append letters A-Z into each sequence using a variable called alphabet = "abcdefghijklmnopqrstuvwxyz" and iterating over it with a Repeat block and appending LETTER COUNT of ALPHABET to the list. The result is a list data structure with the letters A-Z at locations 1 to 26, in effect creating my own ASCII table.
To do a conversion, say from number 26 to 'Z', I have to iterate over the list to get the correct CHAR value. It really slows down a program that is heavily dependent on the CHR() feature. Is there a better or more efficient solution?
My solution is to create a list of size 26 and append letters A-Z into each sequence using a variable called alphabet = "abcdefghijklmnopqrstuvwxyz"
Stop right there. You don't even need to convert it into a list. You can just look it up directly from the string.
To get a character from an index is very easy. Just use the (letter () of []) block.
To get the index of a character is more complex. Unfortunately, Scratch doesn't have a built-in way to do that. What I would do here is define an index of [] in [] custom pseudo-reporter block:
define index of (char) in (str)
set [i v] to [1]
repeat until <<(i) = (length of (str))> or <(letter (i) of (str)) = (char)>>
  change [i v] by (1)
end
view online
You can then call the block as index of [a] in (alphabet) and it will set the i variable to 1.
This code doesn't have any case for when the character isn't found, but the link I provided does include that, if you need it.
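For comparison, the same two lookups can be sketched in Python. The function names `letter_of` and `index_of` are hypothetical and chosen to echo the Scratch blocks. Indexes are 1-based as in Scratch, and returning 0 for a missing character is an assumption made for this sketch.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def letter_of(index, text):
    """Number -> character: direct lookup by 1-based position, no list and no loop."""
    return text[index - 1]


def index_of(char, text):
    """Character -> number: scan the string and return the 1-based position.

    Returns 0 when the character is not present (an assumption of this sketch).
    """
    for position, candidate in enumerate(text, start=1):
        if candidate == char:
            return position
    return 0


print(letter_of(26, ALPHABET))  # z
print(index_of("a", ALPHABET))  # 1
print(index_of("?", ALPHABET))  # 0
```

The number-to-character direction needs no iteration at all, which is the part the question says is slow.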
You could also use Snap! which is similar to Scratch, but has more blocks. Snap! has a unicode block, that will convert a character to its ASCII or unicode value.
For the love of the almighty I have yet to understand the purpose of the symbol 'iamasymbol. I understand numbers, booleans, strings... variables. But symbols are just too much for my little imperative-thinking mind to take. What exactly do I use them for? How are they supposed to be used in a program? My grasp of this concept is just fail.
In Scheme and Racket, a symbol is like an immutable string that happens to be interned so that symbols can be compared with eq? (fast, essentially pointer comparison). Symbols and strings are separate data types.
One use for symbols is lightweight enumerations. For example, one might say a direction is either 'north, 'south, 'east, or 'west. You could of course use strings for the same purpose, but it would be slightly less efficient. Using numbers would be a bad idea; represent information in as obvious and transparent a manner as possible.
For another example, SXML is a representation of XML using lists, symbols, and strings. In particular, strings represent character data and symbols represent element names. Thus the XML <em>hello world</em> would be represented by the value (list 'em "hello world"), which can be more compactly written '(em "hello world").
Another use for symbols is as keys. For example, you could implement a method table as a dictionary mapping symbols to implementation functions. To call a method, you look up the symbol that corresponds to the method name. Lisp/Scheme/Racket makes that really easy, because the language already has a built-in correspondence between identifiers (part of the language's syntax) and symbols (values in the language). That correspondence makes it easy to support macros, which implement user-defined syntactic extensions to the language. For example, one could implement a class system as a macro library, using the implicit correspondence between "method names" (a syntactic notion defined by the class system) and symbols:
(send obj meth arg1 arg2)
=>
(apply (lookup-method obj 'meth) obj (list arg1 arg2))
(In other Lisps, what I've said is mostly truish, but there are additional things to know about, like packages and function vs variable slots, IIRC.)
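The method-table idea can be sketched outside Lisp too. In this Python illustration, strings stand in for symbols and a dictionary is the method table. `make_counter`, `lookup_method` and `send` are hypothetical names that loosely mirror the expansion shown above.

```python
def counter_add(self, amount):
    """Implementation function stored under the key "add"."""
    self["count"] += amount
    return self["count"]


def counter_reset(self):
    """Implementation function stored under the key "reset"."""
    self["count"] = 0
    return self["count"]


def make_counter():
    """An object is just its state plus a table mapping method names to functions."""
    return {"count": 0, "methods": {"add": counter_add, "reset": counter_reset}}


def lookup_method(obj, name):
    """Find the implementation registered under `name` in the object's method table."""
    return obj["methods"][name]


def send(obj, name, *args):
    """Rough analogue of (apply (lookup-method obj 'meth) obj (list arg1 arg2))."""
    return lookup_method(obj, name)(obj, *args)


counter = make_counter()
print(send(counter, "add", 5))  # 5
print(send(counter, "add", 2))  # 7
print(send(counter, "reset"))   # 0
```

What Python lacks here is the built-in link between identifiers and symbols: the caller has to write the key as a string, whereas a Lisp macro can turn the identifier meth into the symbol 'meth automatically.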
A symbol is an object with a simple string representation that (by default) is guaranteed to be interned; i.e., any two symbols that are written the same are the same object in memory (reference equality).
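Python has no symbol type, but `sys.intern` gives a rough feel for what interning means. The sketch below is an analogy only: two strings with the same characters, one of them assembled at run time, end up as the same object once both are interned.

```python
import sys

# Intern one string written as a literal and one assembled at run time.
north_a = sys.intern("north")
north_b = sys.intern("".join(["nor", "th"]))

# Same characters, and after interning also the same object in memory,
# so an identity check (a pointer comparison) is enough to compare them.
print(north_a == north_b)  # True
print(north_a is north_b)  # True
```

A Lisp reader does this for every symbol automatically, which is why eq? on symbols is both safe and fast.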
Why do Lisps have symbols? Well, it's largely an artifact of the fact that Lisps embed their own syntax as a data type of the language. Compilers and interpreters use symbols to represent identifiers in a program; since Lisp allows you to represent a program's syntax as data, it provides symbols because they're part of the representation.
What are they useful for apart from that? Well, a few things:
Lisp is commonly used to implement embedded domain-specific languages. Many of the techniques used for that come from the compiler world, so symbols are a useful tool here.
Macros in Common Lisp usually involve dealing with symbols in more detail than this answer provides. (Though in particular, generation of unique identifiers for macro expansions requires being able to generate a symbol that's guaranteed never to be equal to any other.)
Fixed enumeration types are better implemented as symbols than strings, because symbols can be compared by reference equality.
There are many data structures you can construct where you can get a performance benefit from using symbols and reference equality.
Symbols in lisp are human-readable identifiers. They are all singletons. So if you declare 'foo somewhere in your code and then use 'foo again, it will point to the same place in memory.
Sample use: different symbols can represent different pieces on a chessboard.
From Structure and Interpretation of Computer Programs Second Edition by Harold Abelson and Gerald Jay Sussman 1996:
In order to manipulate symbols we need a new element in our language: the ability to quote a data object. Suppose we want to construct the list (a b). We can’t accomplish this with (list a b), because this expression constructs a list of the values of a and b rather than the symbols themselves.
This issue is well known in the context of natural languages, where words and sentences may be regarded either as semantic entities or as character strings (syntactic entities). The common practice in natural languages is to use quotation marks to indicate that a word or a sentence is to be treated literally as a string of characters. For instance, the first letter of “John” is clearly “J.” If we tell somebody “say your name aloud,” we expect to hear that person’s name. However, if we tell somebody “say ‘your name’ aloud,” we expect to hear the words “your name.” Note that we are forced to nest quotation marks to describe what somebody else might say.
We can follow this same practice to identify lists and symbols that are to be treated as data objects rather than as expressions to be evaluated. However, our format for quoting differs from that of natural languages in that we place a quotation mark (traditionally, the single quote symbol ') only at the beginning of the object to be quoted. We can get away with this in Scheme syntax because we rely on blanks and parentheses to delimit objects. Thus, the meaning of the single quote character is to quote the next object.
Now we can distinguish between symbols and their values:
(define a 1)
(define b 2)
(list a b)
(1 2)
(list 'a 'b)
(a b)
(list 'a b)
(a 2)
Lists containing symbols can look just like the expressions of our language:
(* (+ 23 45) (+ x 9))
(define (fact n) (if (= n 1) 1 (* n (fact (- n 1)))))
Example: Symbolic Differentiation
A symbol is just a special name for a value. The value could be anything, but the symbol is used to refer to the same value every time, and this sort of thing is used for fast comparisons. As you say you are imperative-thinking, they are like numerical constants in C, and this is how they are usually implemented (internally stored numbers).
To illustrate the point made by Luis Casillas, it might be useful to observe how symbols eval differently than strings.
The example below is for mit-scheme (Release 10.1.10). For convenience, I use this function as eval:
(define my-eval (lambda (x) (eval x (scheme-report-environment 5))))
A symbol can easily evaluate to the value or function it names:
(define s 2) ;Value: s
(my-eval "s") ;Value: "s"
(my-eval s) ;Value: 2
(define a '+) ;Value: a
(define b "+") ;Value: b
(my-eval a) ;Value: #[arity-dispatched-procedure 12]
(my-eval b) ;Value: "+"
((my-eval a) 2 3) ;Value: 5
((my-eval b) 2 3) ;ERROR: The object "+" is not applicable.
I'm new to programming (just started!) and have hit a wall recently. I am making a fansite for World of Warcraft, and I want to link to a popular site (wowhead.com). The following page shows what I'm trying to figure out: http://www.wowhead.com/?talent#ozxZ0xfcRMhuVurhstVhc0c
From what I understand, the "ozxZ0xfcRMhuVurhstVhc0c" part of the link is a hash. It contains all the information about that particular talent spec on the page, and changes whenever I add or remove points in a talent. I want to be able to recreate this part, so that I can then link my users directly to wowhead to view their talent trees, but I haven't the foggiest idea how to do that. Can anyone provide some guidance?
The first character indicates the class:
0 Druid
c Hunter
o Mage
s Paladin
b Priest
f Rogue
h Shaman
I Warlock
L Warrior
j Death Knight
The remaining characters indicate where in each tree points have been allocated. Each tree is separate, delimited by 'Z'. So if e.g. all the points are in the third tree, then the 2nd and 3rd characters will be "ZZ" indicating "end of first tree" and "end of second tree".
To generate the code for a given tree, split the talents up into pairs, going left-to-right and top-to-bottom. Each pair of talents is represented by a single character. So for example, in the DK's Blood tree segment, the first character will indicate the number of points allocated to Butchery and Subversion, and the second character will stand for Blade Barrier and Bladed Armor.
What character represents each allocation among the pair? I'm sure there's an algorithm, probably based on the ASCII character set, but all I've worked out so far is this lookup table. Find the number of points in the first talent along the top, and the number of points in the second talent along the left side. The encoded character is at the intersection.
    0 1 2 3 4 5
0   0 o b h L x
1   z k d u p t
2   M R r G T g
3   c s f I j e
4   m a w N n v
5   V q i A y E
So if our Death Knight has one point in Butchery and two points in Subversion, the first character is 'R'. If instead we put no points in those two and five in Blade Barrier, the first two characters will be "0x". Trailing '0's (all the other pairs in the tree with no points allocated) can be omitted, as can trailing 'Z' delimiters (when there are no points in the subsequent trees). For one final example, the entire code for a DK with just a single point in Toughness would be "jZ0o": "Death Knight", "End of the first tree", "No points in the first pair of talents", "one point in the first talent of the second pair".
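The rules described so far can be turned into a small encoder. This Python sketch relies only on the lookup table and the rules in this answer, so it is not wowhead's actual algorithm. The names `TABLE`, `encode_pair`, `encode_tree` and `encode_build` are hypothetical, and each tree is assumed to be given as a list of point counts in left-to-right, top-to-bottom order.

```python
# Rows are indexed by points in the SECOND talent of a pair, columns by
# points in the FIRST talent, exactly as in the lookup table above.
TABLE = [
    "0obhLx",
    "zkdupt",
    "MRrGTg",
    "csfIje",
    "mawNnv",
    "VqiAyE",
]


def encode_pair(first, second):
    """Look up the character for one pair of talents."""
    return TABLE[second][first]


def encode_tree(points):
    """Encode one tree, dropping trailing '0' characters (pairs with no points)."""
    padded = list(points)
    if len(padded) % 2:
        padded.append(0)  # an odd talent count gets an empty partner (assumption)
    chars = "".join(
        encode_pair(padded[i], padded[i + 1]) for i in range(0, len(padded), 2)
    )
    return chars.rstrip("0")


def encode_build(class_code, trees):
    """Join the trees with 'Z' and drop trailing 'Z' delimiters."""
    body = "Z".join(encode_tree(tree) for tree in trees)
    return class_code + body.rstrip("Z")


print(encode_pair(1, 2))                              # R
print(encode_tree([0, 0, 5, 0]))                      # 0x
print(encode_build("j", [[], [0, 0, 1, 0], []]))      # jZ0o
```

The three printed values reproduce the three worked examples in the paragraph above.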
Can anyone work out what function generates the lookup table above? There's probably a clue in the codes for the classes: in alphabetical order (except for the DK which was added to the game after the others), they correspond to a series in the lookup table of (0,0), (0,3), (1,0), (1,3), (2,0), etc.
If you go to http://www.wowhead.com/?talent and start using the talent tree you can see the mysterious code being built up in the address bar as you click on the various boxes. So it's definitely not a hash but some kind of encoded structured data.
As the code is built up as you click the logic for building the code will be in the JavaScript on that page.
So my advice is do a view source on the page, download the JavaScript files and have a look at them.
I think it isn't a hash value, because hash values are normally one-way values. This means you cannot (easily) restore the original information from which the hash code was generated.
Best thing would be to contact someone from wowhead.com and ask them how to interpret this information. I am sure they will help you out with some information about what type of encoding they use for the parameters. But without any help of the developers from wowhead.com it is almost impossible to figure out what information is encoded into this parameter.
I am not even sure the parameter you mentioned contains the talents of your character. Maybe it's just a session id or something like that. Take a look into the post data your browser sends to the server, it may contain a hidden field with the value you are searching for (you can use Tamper Data Firefox Addon).
I don't think ozxZ0xfcRMhuVurhstVhc0c is a hash value. I think it is a key (probably encrypted/encoded in some way). The server uses this key to retrieve information from its database. Since you don't have access to the database you don't know which key is needed, let alone how to encode it.
You need the original function that generates the hash.
I don't think that's public though :(
Check this out: hash wikipedia
Good luck learning how to program!
These hashes are hard to 'reverse engineer' unless you know how it was generated.
For example, it could be:
s1 = "random_string-" + score;
hash = encrypt(s1)
...etc
so it is hard to get the original data back from the hash (that is the whole point anyway).
Your best bet would be to link to the profile that would have the latest score, etc.