How would I translate numbers to words in Scheme? - numbers

We just started with scheme at the university and I need some help doing my task.
We shall write a program that translates numbers into words, EG: 1 to "one".
It must work for all numbers up to 10 ^ 9. I've got no real clue how to do this.
My basic idea would be to create some sort of array or list where I define those numbers as words like this:
('ZERO', 0) ('ONE', 1) ('TWO', 2) ('THREE', 3) ('FOUR', 4) ('FIVE', 5) ('SIX', 6)
('SEVEN', 7) ('EIGHT', 8) ('NINE', 9) ('TEN', 10) ('ELEVEN', 11) ('TWELVE', 12)
('THIRTEEN', 13) ('FOURTEEN', 14) ('FIFTEEN', 15) ('SIXTEEN', 16)
('SEVENTEEN', 17) ('EIGHTEEN', 18) ('NINETEEN', 19) ('TWENTY', 20) ('THIRTY', 30)
('FORTY', 40) ('FIFTY', 50) ('SIXTY', 60) ('SEVENTY', 70) ('EIGHTY', 80)
('NINETY', 90) ('HUNDRED', 100) ('THOUSAND', 1000) ('MILLION', 1000000)
('BILLION', 1000000000) ('TRILLION', 1000000000000)
('QUADRILLION', 1000000000000000)
Then somehow check the input value with this list and substitute the numbers with the words.
But implementing this is giving me some trouble.
The main question that buggers me is if Scheme can do substrings and if I can make a list in which I define those numbers as words (so define 1 as one).

You can't "cheat" and do (define 1 one) as 1 is already defined to be a number :)
I don't think you need to worry about substring in this problem, you are building up a string from substrings, not the opposite...
You do need an operation for string concatenation though. I don't remember how they do strings in scheme though...
The only "hard" problem here is that you need to encode the arbritrary rules from the english language so...
//pseudocode, doing things the "dumb way" tons of ifs, aka cond
define to_str n =
if n == 1 then "one"
...
if n == 3 then "thirteen"
..
if 20 <= n < 30 then
'twenty-' concatenated to (to_str (n - 20))
...
After you get a version like this to work you can think about abstracting the repetitive control-flow via a dictionary or association list similar to the one you mentioned in the question.

Related

Sequence of numbers from 1 to 100 in a List in Mulesoft

I have a list with a sequence like #[[1,2,3,4]. I want to define another list with sequence of numbers from 1 to 100. I tried to use #[[1..100]] but this is not allowed. How can i achieve this in mulesoft ?
In mule 4 and dataweave 2 .. operator is used for different use-case now. For a range use to instead:
#[1 to 100]
In mule 3 and dataweave 2, use .. but if you want it as an inline expression, you need to wrap it in dw function
#[dw('[1..100]', 'application/java')]
Or use the the transform-message component for non inline transforms
Although one can simply use the range operator to to generate a big list of numbers. There are still other ways to attain this, especially by using lambda variables turned functions.
To generate numbers starting from 1, we can try a recursive function call as given below.
%dw 2.0
output application/json
var nums = (k: Number, arr: Array<Number>=[]) -> if(k > 0) nums(k-1, (arr + k)) else arr[-1 to 0]
---
nums(4)
To generate numbers based on a starting and ending range; try the below
%dw 2.0
output application/json
var nums = (j, k: Number, arr: Array<Number>=[]) -> if(k > (j-1)) nums(j,k-1, (arr + k)) else arr[-1 to 0]
---
nums(7, 11)

Calculating prime numbers in Scala: how does this code work?

So I've spent hours trying to work out exactly how this code produces prime numbers.
lazy val ps: Stream[Int] = 2 #:: Stream.from(3).filter(i =>
ps.takeWhile{j => j * j <= i}.forall{ k => i % k > 0});
I've used a number of printlns etc, but nothings making it clearer.
This is what I think the code does:
/**
* [2,3]
*
* takeWhile 2*2 <= 3
* takeWhile 2*2 <= 4 found match
* (4 % [2,3] > 1) return false.
* takeWhile 2*2 <= 5 found match
* (5 % [2,3] > 1) return true
* Add 5 to the list
* takeWhile 2*2 <= 6 found match
* (6 % [2,3,5] > 1) return false
* takeWhile 2*2 <= 7
* (7 % [2,3,5] > 1) return true
* Add 7 to the list
*/
But If I change j*j in the list to be 2*2 which I assumed would work exactly the same, it causes a stackoverflow error.
I'm obviously missing something fundamental here, and could really use someone explaining this to me like I was a five year old.
Any help would be greatly appreciated.
I'm not sure that seeking a procedural/imperative explanation is the best way to gain understanding here. Streams come from functional programming and they're best understood from that perspective. The key aspects of the definition you've given are:
It's lazy. Other than the first element in the stream, nothing is computed until you ask for it. If you never ask for the 5th prime, it will never be computed.
It's recursive. The list of prime numbers is defined in terms of itself.
It's infinite. Streams have the interesting property (because they're lazy) that they can represent a sequence with an infinite number of elements. Stream.from(3) is an example of this: it represents the list [3, 4, 5, ...].
Let's see if we can understand why your definition computes the sequence of prime numbers.
The definition starts out with 2 #:: .... This just says that the first number in the sequence is 2 - simple enough so far.
The next part defines the rest of the prime numbers. We can start with all the counting numbers starting at 3 (Stream.from(3)), but we obviously need to filter a bunch of these numbers out (i.e., all the composites). So let's consider each number i. If i is not a multiple of a lesser prime number, then i is prime. That is, i is prime if, for all primes k less than i, i % k > 0. In Scala, we could express this as
nums.filter(i => ps.takeWhile(k => k < i).forall(k => i % k > 0))
However, it isn't actually necessary to check all lesser prime numbers -- we really only need to check the prime numbers whose square is less than or equal to i (this is a fact from number theory*). So we could instead write
nums.filter(i => ps.takeWhile(k => k * k <= i).forall(k => i % k > 0))
So we've derived your definition.
Now, if you happened to try the first definition (with k < i), you would have found that it didn't work. Why not? It has to do with the fact that this is a recursive definition.
Suppose we're trying to decide what comes after 2 in the sequence. The definition tells us to first determine whether 3 belongs. To do so, we consider the list of primes up to the first one greater than or equal to 3 (takeWhile(k => k < i)). The first prime is 2, which is less than 3 -- so far so good. But we don't yet know the second prime, so we need to compute it. Fine, so we need to first see whether 3 belongs ... BOOM!
* It's pretty easy to see that if a number n is composite then the square of one of its factors must be less than or equal to n. If n is composite, then by definition n == a * b, where 1 < a <= b < n (we can guarantee a <= b just by labeling the two factors appropriately). From a <= b it follows that a^2 <= a * b, so it follows that a^2 <= n.
Your explanations are mostly correct, you made only two mistakes:
takeWhile doesn't include the last checked element:
scala> List(1,2,3).takeWhile(_<2)
res1: List[Int] = List(1)
You assume that ps always contains only a two and a three but because Stream is lazy it is possible to add new elements to it. In fact each time a new prime is found it is added to ps and in the next step takeWhile will consider this new added element. Here, it is important to remember that the tail of a Stream is computed only when it is needed, thus takeWhile can't see it before forall is evaluated to true.
Keep these two things in mind and you should came up with this:
ps = [2]
i = 3
takeWhile
2*2 <= 3 -> false
forall on []
-> true
ps = [2,3]
i = 4
takeWhile
2*2 <= 4 -> true
3*3 <= 4 -> false
forall on [2]
4%2 > 0 -> false
ps = [2,3]
i = 5
takeWhile
2*2 <= 5 -> true
3*3 <= 5 -> false
forall on [2]
5%2 > 0 -> true
ps = [2,3,5]
i = 6
...
While these steps describe the behavior of the code, it is not fully correct because not only adding elements to the Stream is lazy but every operation on it. This means that when you call xs.takeWhile(f) not all values until the point when f is false are computed at once - they are computed when forall wants to see them (because it is the only function here that needs to look at all elements before it definitely can result to true, for false it can abort earlier). Here the computation order when laziness is considered everywhere (example only looking at 9):
ps = [2,3,5,7]
i = 9
takeWhile on 2
2*2 <= 9 -> true
forall on 2
9%2 > 0 -> true
takeWhile on 3
3*3 <= 9 -> true
forall on 3
9%3 > 0 -> false
ps = [2,3,5,7]
i = 10
...
Because forall is aborted when it evaluates to false, takeWhile doesn't calculate the remaining possible elements.
That code is easier (for me, at least) to read with some variables renamed suggestively, as
lazy val ps: Stream[Int] = 2 #:: Stream.from(3).filter(i =>
ps.takeWhile{p => p * p <= i}.forall{ p => i % p > 0});
This reads left-to-right quite naturally, as
primes are 2, and those numbers i from 3 up, that all of the primes p whose square does not exceed the i, do not divide i evenly (i.e. without some non-zero remainder).
In a true recursive fashion, to understand this definition as defining the ever increasing stream of primes, we assume that it is so, and from that assumption we see that no contradiction arises, i.e. the truth of the definition holds.
The only potential problem after that, is the timing of accessing the stream ps as it is being defined. As the first step, imagine we just have another stream of primes provided to us from somewhere, magically. Then, after seeing the truth of the definition, check that the timing of the access is okay, i.e. we never try to access the areas of ps before they are defined; that would make the definition stuck, unproductive.
I remember reading somewhere (don't recall where) something like the following -- a conversation between a student and a wizard,
student: which numbers are prime?
wizard: well, do you know what number is the first prime?
s: yes, it's 2.
w: okay (quickly writes down 2 on a piece of paper). And what about the next one?
s: well, next candidate is 3. we need to check whether it is divided by any prime whose square does not exceed it, but I don't yet know what the primes are!
w: don't worry, I'l give them to you. It's a magic I know; I'm a wizard after all.
s: okay, so what is the first prime number?
w: (glances over the piece of paper) 2.
s: great, so its square is already greater than 3... HEY, you've cheated! .....
Here's a pseudocode1 translation of your code, read partially right-to-left, with some variables again renamed for clarity (using p for "prime"):
ps = 2 : filter (\i-> all (\p->rem i p > 0) (takeWhile (\p->p^2 <= i) ps)) [3..]
which is also
ps = 2 : [i | i <- [3..], and [rem i p > 0 | p <- takeWhile (\p->p^2 <= i) ps]]
which is a bit more visually apparent, using list comprehensions. and checks that all entries in a list of Booleans are True (read | as "for", <- as "drawn from", , as "such that" and (\p-> ...) as "lambda of p").
So you see, ps is a lazy list of 2, and then of numbers i drawn from a stream [3,4,5,...] such that for all p drawn from ps such that p^2 <= i, it is true that i % p > 0. Which is actually an optimal trial division algorithm. :)
There's a subtlety here of course: the list ps is open-ended. We use it as it is being "fleshed-out" (that of course, because it is lazy). When ps are taken from ps, it could potentially be a case that we run past its end, in which case we'd have a non-terminating calculation on our hands (a "black hole"). It just so happens :) (and needs to ⁄ can be proved mathematically) that this is impossible with the above definition. So 2 is put into ps unconditionally, so there's something in it to begin with.
But if we try to "simplify",
bad = 2 : [i | i <- [3..], and [rem i p > 0 | p <- takeWhile (\p->p < i) bad]]
it stops working after producing just one number, 2: when considering 3 as the candidate, takeWhile (\p->p < 3) bad demands the next number in bad after 2, but there aren't yet any more numbers there. It "jumps ahead of itself".
This is "fixed" with
bad = 2 : [i | i <- [3..], and [rem i p > 0 | p <- [2..(i-1)] ]]
but that is a much much slower trial division algorithm, very far from the optimal one.
--
1 (Haskell actually, it's just easier for me that way :) )

Why does ifelse convert a data.frame to a list: ifelse(TRUE, data.frame(1), 0)) != data.frame(1)?

I want to return a data.frame from a function if TRUE, else return NA using return(ifelse(condition, mydf, NA))
However, ifelse strips the column names from the data.frame.
Why are these results different?
> data.frame(1)
X1
1 1
> ifelse(TRUE, data.frame(1), NA)
[[1]]
[1] 1
Some additional insight from dput():
> dput(ifelse(TRUE, data.frame(1), 0))
list(1)
> dput(data.frame(1))
structure(list(X1 = 1), .Names = "X1", row.names = c(NA, -1L),
class = "data.frame")
ifelse is generally intended for vectorized comparisons, and has side-effects such as these: as it says in ?ifelse,
‘ifelse’ returns a value with the same shape as ‘test’ ...
so in this case (test is a vector of length 1) it tries to convert the data frame to a 'vector' (list in this case) of length 1 ...
return(if (condition) mydf else NA)
As a general design point I try to return objects of the same structure no matter what, so I might prefer
if (!condition) mydf[] <- NA
return(mydf)
As a general rule, I find that R users (especially coming from other programming languages) start by using if exclusively, take a while to discover ifelse, then overuse it for a while, discovering later that you really want to use if in logical contexts. A similar thing happens with & and &&.
See also:
section 3.2 of Patrick Burns's R Inferno ...
Why can't R's ifelse statements return vectors?

perfect hash function

I'm attempting to hash the values
10, 100, 32, 45, 58, 126, 3, 29, 200, 400, 0
I need a function that will map them to an array that has a size of 13 without causing any collisions.
I've spent several hours thinking this over and googling and can't figure this out. I haven't come close to a viable solution.
How would I go about finding a hash function of this sort? I've played with gperf, but I don't really understand it and I couldn't get the results I was looking for.
if you know the exact keys then it is trivial to produce a perfect hash function -
int hash (int n) {
switch (n) {
case 10: return 0;
case 100: return 1;
case 32: return 2;
// ...
default: return -1;
}
}
Found One
I tried a few things and found one semi-manually:
(n ^ 28) % 13
The semi-manual part was the following ruby script that I used to test candidate functions with a range of parameters:
t = [10, 100, 32, 45, 58, 126, 3, 29, 200, 400, 0]
(1..200).each do |i|
t2 = t.map { |e| (e ^ i) % 13 }
puts i if t2.uniq.length == t.length
end
On some platforms (e.g. embedded), modulo operation is expensive, so % 13 is better avoided. But AND operation of low-order bits is cheap, and equivalent to modulo of a power-of-2.
I tried writing a simple program (in Python) to search for a perfect hash of your 11 data points, using simple forms such as ((x << a) ^ (x << b)) & 0xF (where & 0xF is equivalent to % 16, giving a result in the range 0..15, for example). I was able to find the following collision-free hash which gives an index in the range 0..15 (expressed as a C macro):
#define HASH(x) ((((x) << 2) ^ ((x) >> 2)) & 0xF)
Here is the Python program I used:
data = [ 10, 100, 32, 45, 58, 126, 3, 29, 200, 400, 0 ]
def shift_right(value, shift_value):
"""Shift right that allows for negative values, which shift left
(Python shift operator doesn't allow negative shift values)"""
if shift_value == None:
return 0
if shift_value < 0:
return value << (-shift_value)
else:
return value >> shift_value
def find_hash():
def hashf(val, i, j = None, k = None):
return (shift_right(val, i) ^ shift_right(val, j) ^ shift_right(val, k)) & 0xF
for i in xrange(-7, 8):
for j in xrange(i, 8):
#for k in xrange(j, 8):
#j = None
k = None
outputs = set()
for val in data:
hash_val = hashf(val, i, j, k)
if hash_val >= 13:
pass
#break
if hash_val in outputs:
break
else:
outputs.add(hash_val)
else:
print i, j, k, outputs
if __name__ == '__main__':
find_hash()
Bob Jenkins has a program for this too: http://burtleburtle.net/bob/hash/perfect.html
Unless you're very lucky, there's no "nice" perfect hash function for a given dataset. Perfect hashing algorithms usually use a simple hashing function on the keys (using enough bits so it's collision-free) then use a table to finish it off.
Just some quasi-analytical ramblings:
In your set of numbers, eleven in all, three are odd and eight are even.
Looking at the simplest forms of hashing - %13 - will give you the following hash values:
10 - 3,
100 - 9,
32 - 6,
45 - 6,
58 - 6,
126 - 9,
3 - 3,
29 - 3,
200 - 5,
400 - 10,
0 - 0
Which, of course, is unusable due to the number of collisions. Something more elaborate is needed.
Why state the obvious?
Considering that the numbers are so few any elaborate - or rather, "less simple" - algorithm will likely be slower than either the switch statement or (which I prefer) simply searching through an unsigned short/long vector of size eleven positions and using the index of the match.
Why use a vector search?
You can fine-tune it by placing the most often occuring values towards the beginning of the vector.
I assume the purpose is to plug in the hash index into a switch with nice, sequential numbering. In that light it seems wasteful to first use a switch to find the index and then plug it into another switch. Maybe you should consider not using hashing at all and go directly to the final switch?
The switch version of hashing cannot be fine-tuned and, due to the widely differing values, will cause the compiler to generate a binary search tree which will result in a lot of comparisons and conditional/other jumps (especially costly) which take time (I've assumed you've turned to hashing for its speed) and require space.
If you want to speed up the vector search additionally and are using an x86-system you can implement a vector search based on the assembler instructions repne scasw (short)/repne scasd (long) which will be much faster. After a setup time of a few instructions you will find the first entry in one instruction and the last in eleven followed by a few instructions cleanup. This means 5-10 instructions best case and 15-20 worst. This should beat the switch-based hashing in all but maybe one or two cases.
I did a quick check and using the SHA256 hash function and then doing modular division by 13 worked when I tried it in Mathematica. For c++ this function should be in the openssl library. See this post.
If you were doing a lot of hashing and lookup though, modular division is a pretty expensive operation to do repeatedly. There is another way of mapping an n-bit hash function into a i-bit indices. See this post by Michael Mitzenmacher about how to do it with a bit shift operation in C. Hope that helps.
Try the following which maps your n values to unique indices between 0 and 12
(1369%(n+1))%13

What is the preferred order for operands in boolean expressions?

Is there any benefit to structuring boolean expressions like:
if (0 < x) { ... }
instead of
if (x > 0) { ... }
I have always used the second way, always putting the variable as the first operand and using whatever boolean operator makes sense, but lately I have read code that uses the first method, and after getting over the initial weirdness I am starting to like it a lot more.
Now I have started to write all my boolean expressions to only use < or <= even if this means the variable isn't the first operand, like the above example. To me it seems to increase readability, but that might just be me :)
What do other people think about this?
Do whatever is most natural for whatever expression you are trying to compare.
If you're wondering about other operations (like ==) there are previous topics comparing the orderings of operands for those comparisons (and the reasons why).
It is mostly done to avoid the problem of using = instead of == in if conditions. To keep the consistency many people use the same for with other operators also. I do not see any problem in doing it.
Use whatever 'reads' best. One thing I'd point out is that if I'm testing to see if a value is within bounds, I try to write it so the bounds are on the 'outside' just like they might be in a mathematical expression:
So, to test that (0 < x <= 10):
if ((0 < x) && (x <= 10)) { ... }
instead of
if ((0 < x) && (10 >= x)) { ... }
or
if ((x > 0) && (10 >= x)) { ... }
I find this pattern make is somewhat easier to follow the logic.
An advantage for putting the number first is that it can prevent bug of using = when == is wanted.
if ( 0 == x ) // ok
if ( 0 = x ) //is a compiler error
compare to the subtle bug:
if ( x = 0 ) // assignment and not comparison. most likely a typo
To be honest it's unusual to write expressions with the variable on the right-side, and as a direct consequence of that unusualness readability suffers. Coding conventions have intrinsic value merely by virtue of being conventions; people are used to code being written in particular standard ways, x >= 0 being one example. Unnecessarily deviating from simple norms like these should be avoided without good cause.
The fact that you had to "get over the initial weirdness" should perhaps be a red flag.
I would not write 0 < x just as I would not use Hungarian notation in Java. When in Rome, do as the Romans do. The Romans write x >= 0. No, it's not a huge deal, it just seems like an unnecessary little quirk.