There is so much intricacy in the equality functions in DrRacket. There are subtle differences between them I can't understand.
Can you explain why these two results differ? Why for instance 'a' == 'a', but "abc" != "abc"?
(eqv? (integer->char 955) (integer->char 955))
; => true
(eqv? (number->string 955) (number->string 955))
; => false
While the two "(number->string 955)" look the same, they return two separate objects in memory. With that in mind, let's compare:
(eq? (number->string 955) (number->string 955))
#f
This is false because eq? cares about identity, that is: are the things being compared exactly the same object in memory? This check is fast, but is often not what you want.
(eqv? (number->string 955) (number->string 955))
#f
This is again false, for the same reason as eq? -- these are not the same objects in memory. eqv? however makes an exception for numbers and characters: it will compare those by value, so two numbers are eqv? if they have the same value. This is still fast, and is usually what you want when you're doing number equality tests.
(equal? (number->string 955) (number->string 955))
#t
Now this is true. Why? The objects are still different, but equal? makes exceptions for strings (and other data types too, but I'll keep it simple). When equal? is given strings, it compares the strings lexically: so if they're the same length and the same sequence of characters, they're "equal". This is the check you want for strings.
eqv? basically does an identity comparison, except that numbers and characters are compared by value instead. This is why two characters with value 955 compare the same.
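To see the three predicates side by side, here is a small sketch. The helper name compare-all is made up for illustration and is not part of Racket; the comments only state what the rules described here guarantee.

```racket
#lang racket
;; compare-all is a hypothetical helper, not a Racket built-in.
;; It returns the results of eq?, eqv?, and equal? as a list.
(define (compare-all x y)
  (list (eq? x y) (eqv? x y) (equal? x y)))

(define s (number->string 955))

;; Same object on both sides: all three predicates answer #t.
(compare-all s s)

;; A second call builds another string: only equal? is certain to be #t.
(compare-all s (number->string 955))

;; Characters are compared by value, so eqv? and equal? are #t.
(compare-all (integer->char 955) (integer->char 955))
```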
What do I mean by identity? Consider this:
(define a (number->string 955))
(define b (number->string 955))
(string-set! a 0 #\0)
(printf "a = ~s, b = ~s, (eqv? a b) = ~a~%" a b (eqv? a b))
You'll notice that only a's string is altered, not b's string. That's because they are different string objects.
The opposite scenario is when aliasing is involved:
(define a (number->string 955))
(define b a)
(string-set! a 0 #\0)
(printf "a = ~s, b = ~s, (eqv? a b) = ~a~%" a b (eqv? a b))
Here, a and b point to the same string object, and the effect of the string-set! is visible in both places.
There are a full two pages in the RNRS specification related to eq?, eqv?, equal? and =. Here is the Draft R7RS Specification. Check it out (pages 30 and 39)!
For eqv?, #t is returned if:
obj1 and obj2 are both characters and are the same character
according to the char=? procedure (section 6.6).
obj1 and obj2 are pairs, vectors, bytevectors, records, or strings
that denote the same location in the store (section 3.4).
In your case number->string returns a new 'location in the store' and thus #f is returned. (The Scheme standards do not require number->string to return a new location; returning the same string would be an optimization.) Whereas integer->char returns the same char.
The Racket documentation says “eq? returns #t if v1 and v2 refer to the same object”, but also that two fixnums that are = are the same according to eq?, and that = “returns #t if all of the arguments are numerically equal”. I can't find any statement about “numbers” and “symbols”, but I found this example:
> (eq? 'yes 'yes)
#t
This seems to contradict the above: nothing there says that symbols are special, so 'yes and 'yes should not be the same object.
This one confuses me even more:
> (eq? (expt 2 100) (expt 2 100))
#f
> (eq? (* 6 7) 42)
#t
If numbers are compared numerically, then (eq? (expt 2 100) (expt 2 100)) should return #t; if instead numbers are compared by reference, then (eq? (* 6 7) 42) should return #f. So I guess neither of these assumptions is right...
Why?!
(expt 2 100)
is too big to be a fixnum. Let's try to evaluate:
(expt 2 100) ; => 1267650600228229401496703205376
(fixnum? (expt 2 100)) ; => #f
(expt 2 10) ; => 1024
(fixnum? (expt 2 10)) ; => #t
This is because large numbers are allocated in several memory cells (think of them as a list of groups of digits).
On the other hand, each symbol, when read, is “internalized”. This means that the first time it is read, a new symbol value is created for it. Subsequently, when it is read again, the system checks if a symbol with that name is already present, and in this case the old symbol value is returned, without creating any new object in memory. So:
(eq? 'yes 'yes) ; => #t
since what are apparently two different symbols with the same name are in effect the same object in memory.
This adds some information to @Renzo's answer.
how can I check if a data type is “internalized”
The answer is that it's complicated.
One factor is the reader:
Symbols, keywords, strings, byte strings, regexps, characters, and numbers produced by the reader in read-syntax mode are interned, which means that such values in the result of read-syntax are always eq? when they are equal? (whether from the same call or different calls to read-syntax). Symbols and keywords are interned in both read and read-syntax mode. Sending an interned value across a place channel does not necessarily produce an interned value at the receiving place. See also datum-intern-literal and datum->syntax.
So (eq? (expt 2 100) (expt 2 100)) returns #f because (expt 2 100) must be computed at run time. On the other hand, (eq? 1267650600228229401496703205376 1267650600228229401496703205376) returns #t because the value is apparent at read time, allowing Racket to intern the number.
Another factor is datatypes. For instance, a fixnum is always interned even if the value is not apparent at reading time, according to https://docs.racket-lang.org/reference/numbers.html
Two fixnums that are = are also the same according to eq?. Otherwise, the result of eq? applied to two numbers is undefined
That means (eq? (+ 1 2) 3) is guaranteed to be #t.
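A quick way to check which case you are in is fixnum?. The sketch below only relies on the guarantee quoted above; the variable names are arbitrary, and for the large number it compares by value because eq? on non-fixnums is left unspecified.

```racket
#lang racket
(define small (+ 1 2))     ; computed at run time, still a fixnum
(define big (expt 2 100))  ; too large to be a fixnum

(fixnum? small)            ; #t
(eq? small 3)              ; #t, guaranteed for fixnums that are =
(fixnum? big)              ; #f

;; For numbers that are not fixnums, compare by value instead of with eq?.
(eqv? big (expt 2 100))    ; #t
(= big (expt 2 100))       ; #t
```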
A symbol is normally interned, but it's possible to make it uninterned via string->uninterned-symbol and gensym.
A symbol is like an immutable string, but symbols are normally interned, so that two symbols with the same character content are normally eq?.
The two procedures string->uninterned-symbol and gensym generate uninterned symbols, i.e., symbols that are not eq?, eqv?, or equal? to any other symbol, although they may print the same as other symbols.
So:
> (eq? (string->symbol "ab") (string->symbol (string-append "a" "b")))
#t
> (eq? (string->uninterned-symbol "ab") (string->uninterned-symbol "ab"))
#f
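For symbols specifically, Racket has a predicate that answers the "is it interned?" question directly: symbol-interned?. I am not aware of a similar predicate for strings or numbers, so treat this as covering symbols only.

```racket
#lang racket
;; Symbols produced by the reader or by string->symbol are interned.
(symbol-interned? 'yes)                              ; #t
(symbol-interned? (string->symbol "yes"))            ; #t

;; gensym and string->uninterned-symbol produce uninterned symbols.
(symbol-interned? (gensym))                          ; #f
(symbol-interned? (string->uninterned-symbol "ab"))  ; #f
```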
Is there a way to get a unique identifier for an object in Racket? For instance, when we use Racket's eq? operator to check whether two variables refer to the same object, what identifier is it using to achieve this comparison?
I'm looking for something like python's id function or Ruby's object_id method, in other words, some function id such that (= (id obj) (id obj2)) means that (eq? obj obj2) is true.
Some relevant docs:
Object Identity and Comparisons
Variables and Locations
Is eq-hash-code what you want?
> (define l1 '(1))
> (define l2 '(1))
> (eq? l1 l2)
#f
> (eq-hash-code l1)
9408
> (eq-hash-code l2)
9412
There's a way to get a C pointer of an object via ffi/unsafe, with the obvious caveat that it's UNSAFE.
;; from https://rosettacode.org/wiki/Address_of_a_variable#Racket
(require ffi/unsafe)
(define (madness v) ; i'm so sorry
  (cast v _racket _gcpointer))
To use it:
(define a (list 1 2))
(define b (list 1 2))
(printf "a and b have different address: ~a ~a\n"
        (equal? (madness a) (madness b))
        (eq? a b))
(printf "a and a have the same address: ~a ~a\n"
        (equal? (madness a) (madness a))
        (eq? a a))
(printf "1 and 1 have the same address: ~a ~a\n"
        (equal? (madness 1) (madness 1))
        (eq? 1 1))
Though the pointer is not a number or an identifier. It's an opaque object... So in a sense, this is kinda useless. You could have used the real objects with eq? instead.
I also don't know any guarantee of this method. In particular, I don't know if the pointer will be updated to its latest value when the copy GC copies objects.
Here is an implementation of such a function using a weak hash table.
Using a weak hash table ensures that objects are garbage collected correctly
even if we have given it an id.
#lang racket
(define ht (make-weak-hasheq))
(define next 0)
(define (get-id x)
  (define id (hash-ref ht x #f))
  (or id
      (begin0
        next
        (hash-set! ht x next)
        (set! next (+ next 1)))))
(get-id 'a)
(get-id 'b)
(get-id 'a)
Note that Sylwester's advice is sound. The standard is to store the value directly.
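To show how the weak-hash-table idea behaves on values that are equal? but not eq?, here is a variation that wraps the table in a closure. The name make-id-table is hypothetical; the logic is the same as in get-id.

```racket
#lang racket
;; make-id-table is a hypothetical wrapper: each call returns a fresh
;; id-assigning procedure with its own weak table and counter.
(define (make-id-table)
  (define ht (make-weak-hasheq))
  (define next 0)
  (lambda (x)
    (or (hash-ref ht x #f)
        (begin0
          next
          (hash-set! ht x next)
          (set! next (+ next 1))))))

(define get-id (make-id-table))
(define l1 (list 1))
(define l2 (list 1))   ; equal? to l1, but a different object
(define id1 (get-id l1))
(define id2 (get-id l2))

(= id1 (get-id l1))    ; #t: same object, same id
(= id1 id2)            ; #f: different objects, different ids
```

So (= (get-id x) (get-id y)) agrees with (eq? x y) for as long as both objects are alive.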
You most likely won't find an identity, but the object itself is only eq? with itself and nothing else. eq? basically compares the address location of the values. So if you want an id you can just store the whole object at that place and it will be unique.
A location is a binding. Think of it as an address you cannot get, which itself holds the address of an object. E.g., in ((lambda (a) a) 10) the binding for a would store the address of the object 10 in the first stack slot, and the code in the body just returns that same address. A location can be changed by set!, but you'll never get its memory address.
It's common for Lisp systems to store values in pointers. That means that some types and values don't really have an object at the address; instead the address itself has a value and type encoded in it that the system knows. Typically small integers, chars, symbols and booleans can be pointer-equal even though they are constructed at different times. E.g., '(1 2 3) would only use 3 pairs and no extra space for the values 1-3 and ().
I was confused about the difference between match and case. The documentation mentions that match supports general pattern matching.
> (define (m x)
    (match x
      [(list a b c)
       #:when (= 6 (+ a b c))
       'sum-is-six]
      [(list a b c) 'sum-is-not-six]))
> (m '(1 2 3))
'sum-is-six
> (m '(2 3 4))
'sum-is-not-six
For this example, I thought I could rewrite it using a case expression, but that seems quite complicated. I would have to get the length of the input x, and maybe use a lambda to sum the elements of x and compare the sum with 6.
So I guess we prefer match when doing pattern matching. Is that true? Are there any differences other than that?
You said it yourself: match does general pattern matching (a very powerful concept!), whereas case only checks whether a value belongs to one of several lists of possible (implicitly quoted) values. case is essentially syntactic sugar for a cond with multiple conditions, for example:
(case (+ 7 5)
  [(1 2 3) 'small]
  [(10 11 12) 'big]
  [else 'other])
... is roughly equivalent to:
(let ((val (+ 7 5)))
  (cond ((or (equal? val 1) (equal? val 2) (equal? val 3))
         'small)
        ((or (equal? val 10) (equal? val 11) (equal? val 12))
         'big)
        (else 'other)))
match, by contrast, does more complex matching: it checks a value against several possible patterns. This is not only about comparing values for equality; it also checks the type and "shape" of the value against the pattern, and we can add further constraints using #:when. To see how rich this can be, look at the grammar section of match's documentation.
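To make the "shape" check concrete, here is roughly what the sum-is-six example from the question looks like when written by hand with cond. This is only a sketch: the name m-manual and the error message are my own, and the real match reports its own kind of error when no clause matches.

```racket
#lang racket
;; m-manual is a hypothetical hand-written counterpart of the match version.
;; Every clause has to repeat the checks that the pattern (list a b c) implies.
(define (m-manual x)
  (cond
    [(and (list? x) (= (length x) 3) (= 6 (apply + x)))
     'sum-is-six]
    [(and (list? x) (= (length x) 3))
     'sum-is-not-six]
    [else (error 'm-manual "no matching clause for ~e" x)]))

(m-manual '(1 2 3)) ; 'sum-is-six
(m-manual '(2 3 4)) ; 'sum-is-not-six
```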
There are two differences:
match is a lot more powerful than case. case doesn't have "patterns" in the way match does, and it implicitly quotes the datums in each "branch question". It only compares the quoted form of the datum against the value, like a switch statement. match has a different and much richer pattern language.
In each branch-question of these two examples, the x is implicitly quoted, as the symbol 'x:
(case 5
  [(x) 10]
  [else 'fail])
;=> 'fail
(case 'x
  [(x) 10]
  [else 'fail])
;=> 10
In match terms, this is equivalent to
(match 5
  ['x 10]
  [_ 'fail])
;=> 'fail
(match 'x
  ['x 10]
  [_ 'fail])
;=> 10
where quoting is one of many options for creating patterns, not the default. If you leave out the quote in a match, x is no longer a symbol; it is a pattern variable that matches anything and binds x to the matched value.
(match 5
  [x (+ x 1)])
;=> 6
This could never happen with case because of case's implicit quoting.
case branch-questions have multiple datums per branch.
These datums must be wrapped in parentheses.
(case expr
  [(datum ...) answer]
  ...)
match, by contrast, has only one pattern per branch (no parentheses):
(match expr
  [pattern answer]
  ...)
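If you want the "several alternatives in one branch" behaviour of case inside match, the or pattern is one way to get it. The function names below are made up for the comparison.

```racket
#lang racket
;; size-case and size-match are hypothetical names; both classify a number.
(define (size-case n)
  (case n
    [(1 2 3) 'small]       ; several datums in one branch-question
    [(10 11 12) 'big]
    [else 'other]))

(define (size-match n)
  (match n
    [(or 1 2 3) 'small]    ; one pattern that combines alternatives
    [(or 10 11 12) 'big]
    [_ 'other]))

(map size-case '(1 12 7))   ; '(small big other)
(map size-match '(1 12 7))  ; '(small big other)
```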
I was asked in an internship interview to write an R5RS program that defines a function, let's say two-subsets. This function has to return #t if the list L contains two subsets with equal sums of elements and equal numbers of elements; otherwise it returns #f. It takes as input the list L (only positive numbers) and some extra parameters (whichever I judge useful; there is no restriction on their number), all equal to 0 at the beginning.
The requirements, as far as I remember, were as follows:
- Do not define other functions and call them inside the "two-subsets" function.
- It can only use the following constructs: null?, cond, car, cdr, else, + ,=, not, and, #t, #f, two-subsets (itself for recursive call), the names of the parameters, such as list, sum, ...etc, numeric constants and parentheses.
Some examples of the expected results were given:
(two-subsets '(7 7) 0 0 0) returns #t. The two subsets are {7} and {7}.
(two-subsets '(7 7 1) 0 0) returns #t. The two subsets are {7} and {7}.
(two-subsets '(5 3 2 4) 0 0) returns #t. The two subsets are {2, 5} and {3, 4}.
(two-subsets '(1 2 3 6 9) 0 0) returns #f.
I started by writing the signature, which it seems to me should be something like this:
(define two-subsets (lambda (L m n ... other parameters)
(cond
The problem is really complicated and its complexity is obviously more than O(n); I read about it at https://en.wikipedia.org/wiki/Partition_problem .
I tried to start by defining the algorithm first before coding it. I thought about taking as parameters: sum of the list L so in my conditions I'll iterate only on the combinations which sum is <= sum(L)/2. By doing that I can reduce a little bit the complexity of the problem, but still I couldn't figure out how to do it.
It looks like an interesting problem and I really want to know more about it.
Here is a version which does not depend on the numbers being all positive. I am reasonably sure that, by knowing they are, you can do much better than this.
Note this assumes that:
the partition does not need to be exhaustive;
but the sets must not be empty.
I'd be very interested to see a version which relies on the elements of the list being positive!
(define (two-subsets? l sl sld ssd)
  ;; l is the list we want to partition
  ;; sl is how many elements we have eaten from it so far
  ;; sld is the length difference in the partitions
  ;; ssd is the sum difference in the partitions
  (cond [(and (not (= sl 0))
              (= sld 0)
              (= ssd 0))
         ;; we have eaten some elements, the differences are zero
         ;; we are done.
         #t]
        [(null? l)
         ;; out of l, failed
         #f]
        ;; this is where I am sure we could be clever about the set containing
        ;; only positive numbers, but I am too lazy to think
        [(two-subsets? (cdr l)
                       (+ sl 1)
                       (+ sld 1)
                       (+ ssd (car l)))
         ;; the left-hand set worked
         #t]
        [(two-subsets? (cdr l)
                       (+ sl 1)
                       (- sld 1)
                       (- ssd (car l)))
         ;; the right-hand set worked
         #t]
        [else
         ;; finally drop the first element of l and try the others
         (two-subsets? (cdr l) sl sld ssd)]))
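As a sanity check, the examples from the question can be run against this function. The definition is restated compactly below so the snippet runs on its own; it uses three extra parameters, so every call passes 0 0 0. Note that it uses -, which the original constraints may not allow.

```racket
#lang racket
;; Compact restatement of the function above (same logic, comments removed).
(define (two-subsets? l sl sld ssd)
  (cond [(and (not (= sl 0)) (= sld 0) (= ssd 0)) #t]
        [(null? l) #f]
        [(two-subsets? (cdr l) (+ sl 1) (+ sld 1) (+ ssd (car l))) #t]
        [(two-subsets? (cdr l) (+ sl 1) (- sld 1) (- ssd (car l))) #t]
        [else (two-subsets? (cdr l) sl sld ssd)]))

(two-subsets? '(7 7) 0 0 0)        ; #t: {7} and {7}
(two-subsets? '(7 7 1) 0 0 0)      ; #t: {7} and {7}
(two-subsets? '(5 3 2 4) 0 0 0)    ; #t: {5, 2} and {3, 4}
(two-subsets? '(1 2 3 6 9) 0 0 0)  ; #f
```

Each element is tried in the left set, in the right set, and left out, so the search is exponential in the length of the list.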
I have a list of elements '(a b c) and I want to find if (true or false) x is in it, where x can be 'a or 'd, for instance. Is there a built in function for this?
If you need to compare using one of the built-in equivalence operators, you can use memq, memv, or member, depending on whether you want to look for equality using eq?, eqv?, or equal?, respectively.
> (memq 'a '(a b c))
'(a b c)
> (memq 'b '(a b c))
'(b c)
> (memq 'x '(a b c))
#f
As you can see, these functions return the sublist starting at the first matching element if they find an element. This is because if you are searching a list that may contain booleans, you need to be able to distinguish the case of finding a #f from the case of not finding the element you are looking for. A list is a true value (the only false value in Scheme is #f) so you can use the result of memq, memv, or member in any context expecting a boolean, such as an if, cond, and, or or expression.
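The point about booleans is easiest to see with a list that actually contains #f. A small sketch (the list and its name are arbitrary):

```racket
#lang racket
(define flags (list 1 #f 2))

;; Found: the result is the tail starting at #f, which is a true value.
(memq #f flags)                      ; '(#f 2)

;; Not found: the result is #f itself.
(memq 'missing flags)                ; #f

;; So the two cases stay distinguishable in a test.
(if (memq #f flags) 'found 'absent)  ; 'found
```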
> (if (memq 'a '(a b c))
      "It's there! :)"
      "It's not... :(")
"It's there! :)"
What is the difference between the three different functions? It's based on which equivalence function they use for comparison. eq? (and thus memq) tests if two objects are the same underlying object; it is basically equivalent to a pointer comparison (or direct value comparison in the case of integers). Thus, two strings or lists that look the same may not be eq?, because they are stored in different locations in memory. equal? (and thus member) performs a deep comparison on lists and strings, and so basically any two items that print the same will be equal?. eqv? is like eq? for almost anything but numbers; for numbers, two numbers that are numerically equivalent will always be eqv?, but they may not be eq? (this is because of bignums and rational numbers, which may be stored in ways such that they won't be eq?).
> (eq? 'a 'a)
#t
> (eq? 'a 'b)
#f
> (eq? (list 'a 'b 'c) (list 'a 'b 'c))
#f
> (equal? (list 'a 'b 'c) (list 'a 'b 'c))
#t
> (eqv? (+ 1/2 1/3) (+ 1/2 1/3))
#t
(Note that some behavior of the functions is undefined by the specification, and thus may differ from implementation to implementation; I have included examples that should work in any R5RS compatible Scheme that implements exact rational numbers)
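Here is how the choice of predicate shows up in the search functions themselves. This is a sketch in Racket; the string is built with (string #\b) so that it is certainly a new object.

```racket
#lang racket
(define names (list "a" "b" "c"))
(define fresh (string #\b))    ; newly allocated, same contents as "b"

;; member uses equal?, so it compares the characters.
(member fresh names)           ; '("b" "c")

;; memq uses eq?, so a new string object is never found.
(memq fresh names)             ; #f

;; memv uses eqv?, which compares numbers by value.
(memv 1.5 (list 1.0 1.5 2.0))  ; '(1.5 2.0)
```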
If you need to search for an item in a list using an equivalence predicate different than one of the built in ones, then you may want find or find-tail from SRFI-1:
> (find-tail (lambda (x) (> x 3)) '(1 2 3 4 5 6))
'(4 5 6)
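In Racket the two SRFI-1 procedures can be tried as follows, assuming the srfi library that ships with the main distribution is installed. Racket also has its own findf and memf, which behave the same way on these inputs. The predicate name big? is just for the example.

```racket
#lang racket
(require (only-in srfi/1 find find-tail))

(define (big? x) (> x 3))

;; find returns the first matching element, find-tail the tail starting there.
(find big? '(1 2 3 4 5 6))       ; 4
(find-tail big? '(1 2 3 4 5 6))  ; '(4 5 6)
(find big? '(1 2 3))             ; #f

;; Racket's own counterparts.
(findf big? '(1 2 3 4 5 6))      ; 4
(memf big? '(1 2 3 4 5 6))       ; '(4 5 6)
```

Keep in mind that find cannot tell "found #f" apart from "found nothing", which is the same issue described above for booleans.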
Here's one way:
> (cond ((member 'a '(a b c)) '#t) (else '#f))
#t
> (cond ((member 'd '(a b c)) '#t) (else '#f))
#f
member returns everything starting from where the element is, or #f. A cond is used to convert this to true or false.
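The same conversion can be written a little shorter with and, which returns its last argument when all the earlier ones are true. The name in-list? is made up for this sketch.

```racket
#lang racket
;; in-list? is a hypothetical name. (and tail #t) turns any tail into #t
;; and leaves #f as #f.
(define (in-list? x lst)
  (and (member x lst) #t))

(in-list? 'a '(a b c)) ; #t
(in-list? 'd '(a b c)) ; #f
```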
You are looking for "find"
Basics - The simplest case is just (find Entry List), usually used as a predicate: "is Entry in List?". If it succeeds in finding the element in question, it returns the first matching element instead of just "t". (Taken from second link.)
http://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node145.html
-or-
http://www.apl.jhu.edu/~hall/Lisp-Notes/Higher-Order.html
I don't know if there is a built-in function, but you can create one:
(define (occurrence x lst)
  (if (null? lst)
      0
      (if (equal? x (car lst))
          (+ 1 (occurrence x (cdr lst)))
          (occurrence x (cdr lst)))))
You will get in return the number of occurrences of x in the list. You can extend it to return true or false too.
(define (member? x list)
  (cond ((null? list) #f)
        ((equal? x (car list)) #t)
        (else (member? x (cdr list)))))
The procedure returns #t (true) or #f (false).
(member? 10 '(4 2 3))
output is #f