How does this permutations function work (Scala)? - scala

I'm looking at Pavel's solution to Project Euler question 24, but can't quite work out how this function works - can someone explain what it's doing? Its purpose is to return the millionth lexicographic permutation of the digits 0 to 9.
def ps(s: String): Seq[String] = if(s.size == 1) Seq(s) else
s.flatMap(c => ps(s.filterNot(_ == c)).map(c +))
val r = ps("0123456789")(999999).toLong
I understand that when the input String is length 1, the function returns that character as a Seq, and I think then what happens is that it's appended to the only other character that was left, but I can't really visualise how you get to that point, or why this results in a list of permutations.
(I already solved the problem myself but used the permutations method, which makes it a fairly trivial 1-liner, but would like to be able to understand the above.)

For each letter (flatMap(c => ...)) of the given string s, ps generates a permutation by permutating the remaining letters ps(s.filterNot(_ == c)) and prepending the taken letter in front of this permutation (map(c +)). For the trivial case of a one letter string, it does nothing (if(s.size == 1) Seq(s)).
Edit: Why does this work?
Let’s begin with shuffling a one letter string:
[a]
-> a # done.
Now for two letters, we split the task into subtasks. Take each character in the set, put it to the first position and permutate the rest.
a [b]
-> b
b [a]
-> a
For three letters, it’s the same. Take each character and prepend it to each of the sub-permutations of the remaining letters.
a [b c]
-> b [c]
-> c
-> c [b]
-> b
b [a c]
-> a [c]
-> c
-> c [a]
# ... and so on
So, basically the outermost function guarantees that each letter gets to the first position, the first recursive call guarantees the same for the second position and so on.

Let's write it out in pseudocode:
for each letter in the string
take that letter out
find all permutations of what remains
stick that letter on the front
Because it's working for each letter in the string, what this effectively does is move each letter in turn to the front of the string (which means that the first letter can be any of the letters present, which is what you need for a permutation). Since it works recursively, the remainder is every remaining permutation.
Note that this algorithm assumes all the letters are different (because filterNot is used to remove the selected letter); the permutations method in the collections library does not assume this.

Unrelated to this, but you might be interested in knowing you can compute the millionth lexicographical permutation without computing any of the previous ones.
The idea is pretty simple: for N digits, there are N! permutations. That means 10 digits can yield 3628800 permutations, 9 digits can yield 362880 permutations, and so on. Given that information, we can compute the following table:
First digit First Permutation Last Permutatation
0 1 362880
1 362881 725760
2 725761 1088640
3 1088641 1451520
4 1451521 1814400
5 1814401 2177280
6 2177281 2540160
7 2540161 2903040
8 2903041 3265920
9 3265921 3628800
So the first digit will be 2, because that's the range in which 1000000 is. Or, to put it more simply, the first digit is the one at the index (1000000 - 1) / fat(9). So you just need to apply that recursively:
def fat(n: Int) = (2 to n).foldLeft(1)(_*_)
def permN(digits: String, n: Int): String = if (digits.isEmpty) "" else {
val permsPerDigit = fat(digits.length - 1)
val index = (n - 1) / permsPerDigit
val firstDigit = digits(index)
val remainder = digits filterNot (firstDigit ==)
firstDigit + permN(remainder, n - index * permsPerDigit)
}

Related

Removing duplicates (ints) in an array and replacing them with chars

So i'm trying to make a basic hitori solver, but i am not sure where i should start. I'm still new to Scala.
My first issue is that i'm trying to have an array of some ints (1,2,3,4,2)
and making the program output them like this: (1,2,3,4,B)
notice that the duplicate has become a char B.
Where do i start? Here is what i already did, but didn't do what i excatly need.
val s = lines.split(" ").toSet;
var jetSet = s
for(i<-jetSet){
print(i);
}
One way is to fold over the numbers, left to right, building the Set[Int], for the uniqueness test, and the list of output, as you go along.
val arr = Array(1,2,3,4,2)
arr.foldLeft((Set[Int](),List[String]())){case ((s,l),n) =>
if (s(n)) (s,"B" :: l)
else (s + n, n.toString :: l)
}._2.reverse // res0: List[String] = List(1, 2, 3, 4, B)
From here you can use mkString() to format the output as desired.
What I'd suggest is to break your program into a number of steps and try to solve those.
As a first step you could transform the list into tuples of the numbers and the number of times they have appeared so far ...
(1,2,3,4,2) becomes ((1,1),(2,1),(3,1),(4,1),(2,2)
Next step it's easy to map over this list returning the number if the count is 1 or the letter if it is greater.
That first step is a little bit tricky because as you walk through the list you need to keep track of how many you've seen so far of each letter.
When want to process a sequence and maintain some changing state as you do, you should use a fold. If you're not familiar with fold it has the following signature:
def foldLeft[B](z: B)(op: (B, A) => B): B
Note that the type of z (the initial value) has to match the type of the return value from the fold (B).
So one way to do this would be for type B to be a tuple of (outputList, seensofarCounts)
outputList would accumulate in each step by taking the next number and updating the map of how many of each numbers you've seen so far. "seensofarCounts" would be a map of the numbers and the current count.
So what you get out of the foldLeft is a tuple of (((1,1),(2,1),(3,1),(4,1),(2,2), Map(1 -> 1, 2, 2 ETC ... ))
Now you can map over that first element of the tuple as described above.
Once it's working you could avoid the last step by updating the numbers to letters as you work through the fold.
Usually this technique of breaking things into steps makes it simple to reason about, then when it's working you may see that some steps trivially collapse into each other.
Hope this helps.

Using `quick-look` to find certain element

First of all, I want to say that this is a school assignment and I am only seeking for some guidance.
My task was to write an algorithm that finds the k:th smallest element in a seq using quickselect. This should be easy enough but when running some tests I hit a wall. For some reason if I use input (List(1, 1, 1, 1), 1) it goes into infinite loop.
Here is my implementation:
val rand = new scala.util.Random()
def find(seq: Seq[Int], k: Int): Int = {
require(0 <= k && k < seq.length)
val a: Array[Int] = seq.toArray[Int] // Can't modify the argument sequence
val pivot = rand.nextInt(a.length)
val (low, high) = a.partition(_ < a(pivot))
if (low.length == k) a(pivot)
else if (low.length < k) find(high, k - low.length)
else find(low, k)
}
For some reason (or because I am tired) I cannot spot my mistake. If someone could hint me where I go wrong I would be pleased.
Basically you are depending on this line - val (low, high) = a.partition(_ < a(pivot)) to split the array into 2 arrays. The first one containing the continuous sequence of elements smaller than pivot-element and the second contains the rest.
Then you say that if the first array has length k that means you have already seen k elements smaller that your pivot-element. Which means pivot-element is actually k+1th smallest and you are actually returning k+1th smallest element instead of kth. This is your first mistake.
Also... A greater problem occurs when you have all elements which are same because your first array will always have 0 elements.
Not only that... your code will give you wrong answer for inputs where you have repeating elements among k smallest ones like - (1, 3, 4, 1, 2).
The solution lies in obervation that in the sequence (1, 1, 1, 1) the 4th smallest element is the 4th 1. Meaning you have to use <= instead of <.
Also... Since the partition function will not split the array until your boolean condition is false, you can not use partition for achieving this array split. you will have to write the split yourself.

Fibonnaci Sequence fast implementation

I have written this function in Scala to calculate the fibonacci number given a particular index n:
def fibonacci(n: Long): Long = {
if(n <= 1) n
else
fibonacci(n - 1) + fibonacci(n - 2)
}
However it is not efficient when calculating with large indexes. Therefore I need to implement a function using a tuple and this function should return two consecutive values as the result.
Can somebody give me any hints about this? I have never used Scala before. Thanks!
This question should maybe go to Mathematics.
There is an explicit formula for the Fibonacci sequence. If you need to calculate the Fibonacci number for n without the previous ones, this is much faster. You find it here (Binet's formula): http://en.wikipedia.org/wiki/Fibonacci_number
Here's a simple tail-recursive solution:
def fibonacci(n: Long): Long = {
def fib(i: Long, x: Long, y: Long): Long = {
if (i > 0) fib(i-1, x+y, x)
else x
}
fib(n, 0, 1)
}
The solution you posted takes exponential time since it creates two recursive invocation trees (fibonacci(n - 1) and fibonacci(n - 2)) at each step. By simply tracking the last two numbers, you can recursively compute the answer without any repeated computation.
Can you explain the middle part, why (i-1, x+y, x) etc. Sorry if I am asking too much but I hate to copy and paste code without knowing how it works.
It's pretty simple—but my poor choice of variable names might have made it confusing.
i is simply a counter saying how many steps we have left. If we're calculating the Mth (I'm using M since I already used n in my code) Fibonacci number, then i tells us how many more terms we have left to calculate before we reach the Mth term.
x is the mth term in the Fibonacci sequence, or Fm (where m = M - i).
y is the m-1th term in the Fibonacci sequence, or Fm-1 .
So, on the first call fib(n, 0, 1), we have i=M, x=0, y=1. If you look up the bidirectional Fibonacci sequence, you'll see that F0 = 0 and F-1 = 1, which is why x=0 and y=1 here.
On the next recursive call, fib(i-1, x+y, x), we pass x+y as our next x value. This come straight from the definiton:
Fn = Fn-1 + Fn-2
We pass x as the next y term, since our current Fn-1 is the same as Fn-2 for the next term.
On each step we decrement i since we're one step closer to the final answer.
I am assuming that you don't have saved values from previous computations. If so, it will be faster for you to use the direct formula using the golden ratio instead of the recursive definition. The formula can be found in the Wikipedia page for Fibonnaci number:
floor(pow(phi, n)/root_of_5 + 0.5)
where phi = (1 + sqrt(5)/2).
I have no knowledge of programming in Scala. I am hoping someone on SO will upgrade my pseudo-code to actual Scala code.
Update
Here's another solution again using Streams as below (getting Memoization for free) but a bit more intuitive (aka: without using zip/tail invocation on fibs Stream):
val fibs = Stream.iterate( (0,1) ) { case (a,b)=>(b,a+b) }.map(_._1)
that yields the same output as below for:
fibs take 5 foreach println
Scala supports Memoizations through Streams that is an implementation of lazy lists. This is a perfect fit for Fibonacci implementation which is actually provided as an example in the Scala Api for Streams. Quoting here:
import scala.math.BigInt
object Main extends App {
val fibs: Stream[BigInt] = BigInt(0) #:: BigInt(1) #:: fibs.zip(fibs.tail).map { n => n._1 + n._2 }
fibs take 5 foreach println
}
// prints
//
// 0
// 1
// 1
// 2
// 3

Scala split argument over several lines and parse to Int

This is probably going to end up being very simple, but I ask more to help me learn better Scala idioms (Python guy by trade looking to learn some scala tricks.)
I'm doing some hacker rank problems and the method of input requires is a read over lines from stdin. The spec is quoted below:
The first line contains the number of test cases T. T test cases
follow. Each case contains two integers N and M.
So in the input passed to the script looks something like this:
4
2 2
3 2
2 3
4 4
I'm wondering what would be the proper, idiomatic way to do this. I've thought of a few:
Use io.Source.stdin.readLines.zipWithIndex, then from within a foreach, if the index is greater than 0, split on whitespace and map to (_.toInt)
Use the same readLines function to get the input and then pattern match against the index.
Split on whitespace and newlines to make a single list of digits, map toInt, pop the first element (problem size) and then modulo 2 to make tuples of arguments for my problem function.
I'm wondering what more experienced scala programmers would consider the best way to parse these args, where the 2 element lines would be args to a function and the first, single digit line is just the number of problems to solve.
Maybe you're looking for something like this?
def f(x: Int, y: Int) = { f"do something with $x and $y" }
io.Source.stdin.readLines
.map(_.trim.split("\\s+").map(_.toInt)) // split and convert to ints
.collect { case Array(a, b) => f(a, b) } // pass to f if there are two arguments
.foreach(println) // print the result of each function call
Another way to read the input for Hacker Rank problems is with scala.io.Stdin
import scala.io.StdIn
import scala.collection.mutable.ArrayBuffer
object Solution {
def main(args: Array[String]) = {
val q = StdIn.readInt
var lines = ArrayBuffer[Array[Int]]()
(1 to q).foreach(_ => lines += StdIn.readLine.split(" ").map(_.toInt))
for (a <- lines){
val n = a(0)
val m = a(1)
val ans = n * m
println(ans)
}
}
}
I have tested it on Hacker Rank platform today and the output is:
4
6
6
16

Off by one with sliding?

One of the advantages of not handling collections through indices is to avoid off-by-one errors. That's certainly not the only advantage, but it is one of them.
Now, I often use sliding in some algorithms in Scala, but I feel that it usually results in something very similar to the off-by-one errors, because a sliding of m elements in a collection of size n has size n - m + 1 elements. Or, more trivially, list sliding 2 is one element shorter than list.
The feeling I get is that there's a missing abstraction in this pattern, something that would be part sliding, part something more -- like foldLeft is to reduceLeft. I can't think of what that might be, however. Can anyone help me find enlightenment here?
UPDATE
Since people are not clear one what I'm talking, let's consider this case. I want to capitalize a string. Basically, every letter that is not preceded by a letter should be upper case, and all other letters should be lower case. Using sliding, I have to special case either the first or the last letter. For example:
def capitalize(s: String) = s(0).toUpper +: s.toSeq.sliding(2).map {
case Seq(c1, c2) if c2.isLetter => if (c1.isLetter) c2.toLower else c2.toUpper
case Seq(_, x) => x
}.mkString
I’m taking Owen’s answer as an inspiration to this.
When you want to apply a simple diff() to a list, this can be seen as equivalent to the following matrix multiplication.
a = (0 1 4 3).T
M = ( 1 -1 0 0)
( 0 1 -1 0)
( 0 0 1 -1)
diff(a) = M * a = (1 3 1).T
We may now use the same scheme for general list operations, if we replace addition and multiplication (and if we generalise the numbers in our matrix M).
So, with plus being a list append operation (with flatten afterwards – or simply a collect operation), and the multiplicative equivalent being either Some(_) or None, a slide with a window size of two becomes:
M = (Some(_) Some(_) None None)
(None Some(_) Some(_) None)
(None None Some(_) Some(_))
slide(a) = M “*” a = ((0 1) (1 4) (4 3)).T
Not sure, if this is the kind of abstraction you’re looking for, but it would be a generalisation on a class of operations which change the number of items.
diff or slide operations of order m for an input of length n will need to use Matrixes of size n-m+1 × n.
Edit: A solution could be to transform List[A] to List[Some[A]] and then to prepend or append (slideLeft or slideRight) these with None. That way you could handle all the magic inside the map method.
list.slideLeft(2) {
case Seq(Some(c1), Some(c2)) if c2.isLetter => if (c1.isLetter) c2.toLower else c2.toUpper
case Seq(_, Some(x)) => x
}
I run into this problem all the time in python/R/Matlab where you diff() a vector and then can't line it up with the original one! It is very frustrating.
I think what's really missing is that the vector only hold the dependent variables, and assumes that you, the programmer, are keeping track of the independent variables, ie the dimension that the collection ranges over.
I think the way to solve this is to have the language to some degree keep track of independent variables; perhaps statically through types, or dynamically by storing them along with the vector. Then it can check the independent axes, make sure they line up, or, I don't know if this is possible, shuffle things around to make them line up.
That's the best I've thought of so far.
EDIT
Another way of thinking about this is, why does your collection have order? Why is it not just a Set? The order means something, but the collection doesn't keep track of that -- it's basically using sequential position (which is about as informative as numerical indices) to proxy for the real meaning.
EDIT
Another consequence would be that transformations like sliding actually represent two transformations, one for the dependent variables, and one for their axis.
In your example, I think the code is made more complex because, you basically want to do a map but working with sliding which introduces edge conditions in a way that doesn't work nicely. I think a fold left with an accumulator that remembers the relevant state may be easier conceptually:
def capitalize2(s: String) = (("", true) /: s){ case ((res, notLetter), c) =>
(res + (if (notLetter) c.toUpper else c.toLower), !c.isLetter)
}._1
I think this could be generalized so that notLetter could remember n elements where n is the size of the sliding window.
The transformation you're asking for inherently reduces the size of the data. Sorry--there's no other way to look at it. tail also gives you off-by-one errors.
Now, you might say--well, fine, but I want a convenience method to maintain the original size.
In that case, you might want these methods on List:
initializedSliding(init: List[A]) = (init ::: this).sliding(1 + init.length)
finalizedSliding(tail: List[A]) = (this ::: tail).sliding(1 + tail.length)
which will maintain your list length. (You can envision how to generalize to non-lists, I'm sure.)
This is the analog to fold left/right in that you supply the missing data in order to perform a pairwise (or more) operation on every element of the list.
The off by one problem you describe reminds me in the boundary condition issue in digital signal processing. The problem occurs since the data (list) is finite. It doesn't occur for infinite data (stream). In digital signal processing the issues is remedied by extending the finite signal to an infinite one. This can be done in various ways like repeating the data or repeating the data and reversing it on every repetition (like it is done for the discrete cosine transform).
Borrowing from these approached for sliding would lead to an abstraction which does not exhibit the off by one problem:
(1::2::3::Nil).sliding(2)
would yield
(1,2), (2,3), (3,1)
for circular boundary conditions and
(1,2), (2,3), (3,2)
for circular boundary conditions with reversal.
Off-by-one errors suggest that you are trying to put the original list in one-to-one correspondence with the sliding list, but something strange is going on, since the sliding list has fewer elements.
The problem statement for your example can be roughly phrased as: "Uppercase every character if it (a) is the first character, or (b) follows a letter character". As Owen points, the first character is a special case, and any abstraction should respect this. Here's a possibility,
def slidingPairMap[A, B](s: List[A], f1: A => B, f2: (A, A) => B): List[B] = s match {
case Nil => Nil
case x :: _ => f1(x) +: s.sliding(2).toList.map { case List(x, y) => f2(x, y) }
}
(not the best implementation, but you get the idea). This generalizes to sliding triples, with off-by-two errors, and so on. The type of slidingPairMap makes it clear that special casing is being done.
An equivalent signature could be
def slidingPairMap[A, B](s: List[A], f: Either[A, (A, A)] => B): List[B]
Then f could use pattern matching to figure out if it's working with the first element, or with a subsequent one.
Or, as Owen says in the comments, why not make a modified sliding method that gives information about whether the element is first or not,
def slidingPairs[A](s: List[A]): List[Either[A, (A, A)]]
I guess this last idea is isomorphic to what Debilski suggests in the comments: pad the beginning of the list with None, wrap all the existing elements with Some, and then call sliding.
I realize this is an old question but I just had a similar problem and I wanted to solve it without having to append or prepend anything, and where it would handle the last elements of the sequence in a seamless manner. The approach I came up with is a slidingFoldLeft. You have to handle the first element as a special case (like some others mentioned, for capitalize, it is a special case), but for the end of the sequence you can just handle it like other cases. Here is the implementation and some silly examples:
def slidingFoldLeft[A, B] (seq: Seq[A], window: Int)(acc: B)(
f: (B, Seq[A]) => B): B = {
if (window > 0) {
val iter = seq.sliding(window)
iter.foldLeft(acc){
// Operate normally
case (acc, next) if iter.hasNext => f(acc, next)
// It's at the last <window> elements of the seq, handle current case and
// call recursively with smaller window
case (acc, next) =>
slidingFoldLeft(next.tail, window - 1)(f(acc, next))(f)
}
} else acc
}
def capitalizeAndQuestionIncredulously(s: String) =
slidingFoldLeft(s.toSeq, 2)("" + s(0).toUpper) {
// Normal iteration
case (acc, Seq(c1, c2)) if c1.isLetter && c2.isLetter => acc + c2.toLower
case (acc, Seq(_, c2)) if c2.isLetter => acc + c2.toUpper
case (acc, Seq(_, c2)) => acc + c2
// Last element of string
case (acc, Seq(c)) => acc + "?!"
}
def capitalizeAndInterruptAndQuestionIncredulously(s: String) =
slidingFoldLeft(s.toSeq, 3)("" + s(0).toUpper) {
// Normal iteration
case (acc, Seq(c1, c2, _)) if c1.isLetter && c2.isLetter => acc + c2.toLower
case (acc, Seq(_, c2, _)) if c2.isLetter => acc + c2.toUpper
case (acc, Seq(_, c2, _)) => acc + c2
// Last two elements of string
case (acc, Seq(c1, c2)) => acc + " (commercial break) " + c2
// Last element of string
case (acc, Seq(c)) => acc + "?!"
}
println(capitalizeAndQuestionIncredulously("hello my name is mAtthew"))
println(capitalizeAndInterruptAndQuestionIncredulously("hello my name is mAtthew"))
And the output:
Hello My Name Is Matthew?!
Hello My Name Is Matthe (commercial break) w?!
I would prepend None after mapping with Some(_) the elements; note that the obvious way of doing it (matching for two Some in the default case, as done in the edit by Debilski) is wrong, as we must be able to modify even the first letter. This way, the abstraction respects the fact that simply sometimes there is no predecessor. Using getOrElse(false) ensures that a missing predecessor is treated as having failed the test.
((None +: "foo1bar".toSeq.map(Some(_))) sliding 2).map {
case Seq(c1Opt, Some(c2)) if c2.isLetter => if (c1Opt.map(_.isLetter).getOrElse(false)) c2.toLower else c2.toUpper
case Seq(_, Some(x)) => x
}.mkString
res13: String = "Foo1Bar"
Acknowledgments: the idea of mapping the elements with Some(_) did come to me through Debilski's post.
I'm not sure if this solves your concrete problem, but we could easily imagine a pair of methods e.g. slidingFromLeft(z: A, size: Int) and slidingToRight(z: A, size: Int) (where A is collection's element type) which, when called on e.g.
List(1, 2, 3, 4, 5)
with arguments e.g. (0, 3), should produce respectively
List(0, 0, 1), List(0, 1, 2), List(1, 2, 3), List(2, 3, 4), List(3, 4, 5)
and
List(1, 2, 3), List(2, 3, 4), List(3, 4, 5), List(4, 5, 0), List(5, 0, 0)
This is the sort of problem nicely-suited to an array-oriented functional language like J. Basically, we generate a boolean with a one corresponding to the first letter of each word. To do this, we start with a boolean marking the spaces in a string. For example (lines indented three spaces are inputs; results are flush with left margin; "NB." starts a comment):
str=. 'now is the time' NB. Example w/extra spaces for interest
]whspc=. ' '=str NB. Mark where spaces are=1
0 0 0 1 1 0 0 1 1 0 0 0 1 1 1 1 0 0 0 0
Verify that (*.-.) ("and not") returns one only for "1 0":
]tt=. #:i.4 NB. Truth table
0 0
0 1
1 0
1 1
(*.-.)/"1 tt NB. Apply to 1-D sub-arrays (rows)
0 0 1 0 NB. As hoped.
Slide our tacit function across pairs in the boolean:
2(*.-.)/\whspc NB. Apply to 2-ples
0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0
But this doesn't handle the edge condition of the initial letter, so force a one into the first position. This actually helps as the reduction of 2-ples left us one short. Here we compare lengths of the original boolean and the target boolean:
#whspc
20
#1,2(*.-.)/\whspc
20
1,2(*.-.)/\whspc
1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0
We get uppercase by using the index into the lowercase vector to select from the uppercase vector (after defining these two vectors):
'lc uc'=. 'abcdefghijklmnopqrstuvwxyz';'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
(uc,' '){~lc i. str
NOW IS THE TIME
Check that insertion by boolean gives correct result:
(1,2(*.-.)/\whspc) } str,:(uc,' '){~lc i. str
Now Is The Time
Now is the time to combine all this into one statement:
(1,2(*.-.)/\' '=str) } str,:(uc,' '){~lc i. str
Now Is The Time