Algorithms to check if string has unique characters in Scala. Why is O(N2) verison quicker? - scala

I implemented two versions of the algorithm in Scala (without using sets, by the way). A first one :
def isContained(letter: Char, word: String): Boolean =
if (word == "") false
else if (letter == word.head) true
else isContained(letter, word.tail)
def hasUniqueChars(stringToCheck: String): Boolean =
if (stringToCheck == "") true
else if (isContained(stringToCheck.head, stringToCheck.tail)) false
else hasUniqueChars(stringToCheck.tail)
which is in O(N2).
And a second one :
def hasUniqueChars2Acc(str: String, asciiTable: List[Boolean]): Boolean = {
if (str.length == 0) true
else if (asciiTable(str.head.toByte)) false
else hasUniqueChars2Acc(str.tail, asciiTable.updated(str.head.toByte, true))
}
def hasUniqueChars2(str: String): Boolean = {
val virginAsciiTable = List.fill(128)(false)
if (str.length > 128) false
else hasUniqueChars2Acc(str, virginAsciiTable)
}
which is in O(N).
But when testing, the second version takes as much as 20 times the duration of the first one. Why? Is it related to the .updated method ?

Related

Scala seems to require 'return' keyword for recursive functions

I'm new to Scala and know that the 'return' keyword is redundant. However, when writing a recursive function, the control doesn't return from a call stack when the desired condition is met if the 'return' keyword is missing.
Below is a code to check for balanced parentheses -
Using 'return' (works fine):
def balance(chars: List[Char]): Boolean = {
def parenBalance(char : List[Char]) :Boolean = {
parenBalanceRecur(char, 0 , 0)
}
def parenBalanceRecur(charSequence: List[Char], itr : Int, imbalance : Int) : Boolean = {
if (imbalance < 0) then return false
if (itr == charSequence.length ) {
if (imbalance == 0) then return true else return false
}
if (charSequence(itr) == '(') {
parenBalanceRecur(charSequence, (itr + 1), (imbalance + 1))
}
else if (charSequence(itr) == ')') {
parenBalanceRecur(charSequence, itr + 1, imbalance -1)
}
else {
parenBalanceRecur (charSequence, itr +1, imbalance)
}
}
parenBalance(chars)
}
Without return (Successfully computes #1, but doesn't return, fails at #2 with IndexOutOfBoundsException):
def balance(chars: List[Char]): Boolean = {
def parenBalance(char : List[Char]) :Boolean = {
parenBalanceRecur(char, 0 , 0)
}
def parenBalanceRecur(charSequence: List[Char], itr : Int, imbalance : Int) : Boolean = {
if (imbalance < 0) then false
if (itr == charSequence.length ) {
if (imbalance == 0) then true else false // #1
}
if (charSequence(itr) == '(') { // # 2
parenBalanceRecur(charSequence, (itr + 1), (imbalance + 1))
}
else if (charSequence(itr) == ')') {
parenBalanceRecur(charSequence, itr + 1, imbalance -1)
}
else {
parenBalanceRecur (charSequence, itr +1, imbalance)
}
}
parenBalance(chars)
}
Is this specific to tail recursion? Please help me understand this.
As mentioned in some comments, you can not just short circuit the path and return a value without the return keyword. return is not needed only at the last value of the method. In any case, usually, it's a good practice to avoid short circuits with multiple returns in a method.
In your example, adding two else you can construct the path down to the last value of each possible output without using the return keyword:
def balance(chars: List[Char]): Boolean = {
def parenBalance(char : List[Char]) :Boolean = {
parenBalanceRecur(char, 0 , 0)
}
#tailrec
def parenBalanceRecur(charSequence: List[Char], itr : Int, imbalance : Int) : Boolean = {
if (imbalance < 0) then false
else if (itr == charSequence.length ) {
if (imbalance == 0) then true else false // #1
}
else if (charSequence(itr) == '(') { // # 2
parenBalanceRecur(charSequence, (itr + 1), (imbalance + 1))
}
else if (charSequence(itr) == ')') {
parenBalanceRecur(charSequence, itr + 1, imbalance -1)
}
else {
parenBalanceRecur (charSequence, itr +1, imbalance)
}
}
parenBalance(chars)
}
As currently implemented, parenBalanceRecur() has 3 top-level if expressions, they each evaluate to a boolean value, but the rule in scala is that only the last expression of the function is the return value of the function => the first two are simply ignored.
=> in your second implementation:
def parenBalanceRecur(charSequence: List[Char], itr : Int, imbalance : Int) : Boolean = {
if (imbalance < 0) then false
if (itr == charSequence.length ) {
if (imbalance == 0) then true else false // #1
}
if (charSequence(itr) == '(') { // # 2
parenBalanceRecur(charSequence, (itr + 1), (imbalance + 1))
}
else if (charSequence(itr) == ')') {
parenBalanceRecur(charSequence, itr + 1, imbalance -1)
}
else {
parenBalanceRecur (charSequence, itr +1, imbalance)
}
}
parenBalance(chars)
}
The first expression if (imbalance < 0) then false will evaluate to either true or false, but this expression is disconnected from the rest of the code => the function is not doing anything with that boolean. We could as well write val thisAchievesNothing = if (imbalance < 0) then false. Our thisAchievesNothing will hold some value, which is valid syntax, but is not very useful.
Likewise, this:
if (itr == charSequence.length ) {
if (imbalance == 0) then true else false // #1
}
will evaluate to another boolean that is equally ignored.
Finally, the last if... else if... else will also evaluate to yet another boolean, and since this one is the last expression of the function, that and that only will be the returned value.
Try to re-writing parenBalanceRecur as one single if.. else if... else if ... expression instead of 3, and then the behavior will be the same as the implementation that uses return.
(we tend to avoid return since it's a statement that does something, whereas the habit is to write functions/expression that are some value)
Besides the accepted answer, some things worth considering:
use #tailrec, so your recursive method gets tail-call optimized by the compiler. Otherwise it can produce stack overflow for very large lists. (Although this might not be the case here, always consider using this annotation when implementing tail-recursive methods).
List in Scala is a singly-linked list, so it's not indexed. Using cs(iter) on each recursive call makes your method get the element in linear time for every index. That means you'll get O(n^2) complexity. As an alternative, either use chars: Array[Char] which is indexed-based and gives you each element in O(1) or continue using list but switch to pattern match instead.
minor things: redundant curly braces from if expressions with only a single statement inside; parenBalance can be removed and call directly parenBalanceRecur(char, 0 , 0) as the last statement in balance; use default values for parenBalanceRecur.
My suggestions in code:
def isBalanced(chars: List[Char]): Boolean = {
#tailrec
def balance(cs: List[Char], imbalances: Int = 0): Boolean =
if (imbalances < 0) false
else cs match {
case Nil => imbalances == 0
case x :: xs =>
if (x == '(') balance(xs, imbalances + 1)
else if (x == ')') balance(xs, imbalances - 1)
else balance(xs, imbalances)
}
balance(chars)
}
println(isBalanced("()".toList)) // true
println(isBalanced("()(())".toList)) // true
println(isBalanced(")(".toList)) // false
println(isBalanced("(()))".toList)) // false
println(isBalanced("((((()))))".toList)) // true

How to extend binary with Scala?

I am trying to complete the below exercise:
I have attempted it below, but my code is not acting as expected.
def extend(p: Long): Long = {
var e = p.toBinaryString
if ( e.count(_== '1') % 2 == 0) {
e="0"+e
}else {
e="1"+e
}
e.toLong
}
What am I doing wrong here? I don't understand how I'm supposed to change binary right Hex.
#Test def testExtend() {
assertEquals("extend(0x0000000000000000L)", 0x0000000000000000L, extend(0x0000000000000000L))
assertEquals("extend(0x0000000000000001L)", 0x8000000000000001L, extend(0x0000000000000001L))
assertEquals("extend(0x0000000000000011L)", 0x0000000000000011L, extend(0x0000000000000011L))
assertEquals("extend(0x8000000000000000L)", 0x0000000000000000L, extend(0x8000000000000000L))
assertEquals("extend(0x8000000000F00000L)", 0x0000000000F00000L, extend(0x8000000000F00000L))
assertEquals("extend(0x0000001000300000L)", 0x8000001000300000L, extend(0x0000001000300000L))
}
The first problem is that .toLong assumes that what's being converted is the String representation of a decimal value. So "10" is assumed to represent ten (decimal), not two (binary).
The next problem is that Long has a fixed length. You can't add an extra bit to it. You have to flip an existing bit.
def extend(p: Long): Long =
if (p.toBinaryString.count(_ == '1') % 2 == 0) p
else p ^ Long.MinValue
testing:
0x0000000000000000L == extend(0x0000000000000000L) //res0: Boolean = true
0x8000000000000001L == extend(0x0000000000000001L) //res1: Boolean = true
0x0000000000000011L == extend(0x0000000000000011L) //res2: Boolean = true
0x0000000000000000L == extend(0x8000000000000000L) //res3: Boolean = true
0x0000000000F00000L == extend(0x8000000000F00000L) //res4: Boolean = true
0x8000001000300000L == extend(0x0000001000300000L) //res5: Boolean = true

Scala Do While Loop Not Ending

I'm new to scala and i'm trying to implement a do while loop but I cannot seem to get it to stop. Im not sure what i'm doing wrong. If someone could help me out that would be great. Its not the best loop I know that but I am new to the language.
Here is my code below:
def mnuQuestionLast(f: (String) => (String, Int)) ={
var dataInput = true
do {
print("Enter 1 to Add Another 0 to Exit > ")
var dataInput1 = f(readLine)
if (dataInput1 == 0){
dataInput == false
} else {
println{"Do the work"}
}
} while(dataInput == true)
}
You're comparing a tuple type (Tuple2[String, Int] in this case) to 0, which works because == is defined on AnyRef, but doesn't make much sense when you think about it. You should be looking at the second element of the tuple:
if (dataInput1._2 == 0)
Or if you want to enhance readability a bit, you can deconstruct the tuple:
val (line, num) = f(readLine)
if (num == 0)
Also, you're comparing dataInput with false (dataInput == false) instead of assigning false:
dataInput = false
Your code did not pass the functional conventions.
The value that the f returns is a tuple and you should check it's second value of your tuple by dataInput1._2==0
so you should change your if to if(dataInput1._2==0)
You can reconstruct your code in a better way:
import util.control.Breaks._
def mnuQuestionLast(f: (String) => (String, Int)) = {
breakable {
while (true) {
print("Enter 1 to Add Another 0 to Exit > ")
f(readLine) match {
case (_, 0) => break()
case (_,1) => println( the work"
case _ => throw new IllegalArgumentException
}
}
}
}

I want to pattern match from Array of String with a single String in scala?

val aggFilters = Array["IR*","IR_"]
val aggCodeVal = "IR_CS_BPV"
val flag = compareFilters(aggFilters,aggCodeVal)
As per my requirement I want to compare the patterns given in the aggFilters with aggCodeVal. The first pattern "IR*" is a match with "IR_CS_BPV" but not the second one, hence I want to break out of the for loop after the match is found so that I don't go for the second one "IR_". I don't want to use break statement like java.
def compareFilters(aggFilters: Array[String], aggCodeVal: String): Boolean = {
var flag: Boolean = false
for (aggFilter <- aggFilters) {
if (aggFilter.endsWith("*")
&& aggCodeVal.startsWith(aggFilter.substring(0, aggFilter.length() - 1))) {
flag = true
}
else if (aggFilter.startsWith("*")
&& aggCodeVal.startsWith(aggFilter.substring(1, aggFilter.length()))) {
flag = true
}
else if (((aggFilter startsWith "*")
&& aggFilter.endsWith("*"))
&& aggCodeVal.startsWith(aggFilter.substring(1, aggFilter.length() - 1))) {
flag = true
}
else if (aggFilter.equals(aggCodeVal)) {
flag = true
}
else {
flag = false
}
}
flag
}
If * is your only wild-card character, you should be able to leverage Regex to do your match testing.
def compareFilters(aggFilters: Array[String], aggCodeVal: String): Boolean =
aggFilters.exists(f => s"$f$$".replace("*",".*").r.findAllIn(aggCodeVal).hasNext)
You can use the built-in exists method to do it for you.
Extract a function that compares a single filter, with this signature:
def compareFilter(aggFilter: String, aggCodeVal: String): Boolean
And then:
def compareFilters(aggFilters: Array[String], aggCodeVal: String): Boolean = {
aggFilters.exists(filter => compareFilter(filter, aggCodeVal))
}
The implementation of compareFilter, BTW, can be shortened to something like:
def compareFilter(aggFilter: String, aggCodeVal: String): Boolean = {
(aggFilter.startsWith("*") && aggFilter.endsWith("*") && aggCodeVal.startsWith(aggFilter.drop(1).dropRight(1))) ||
(aggFilter.endsWith("*") && aggCodeVal.startsWith(aggFilter.dropRight(1))) ||
(aggFilter.startsWith("*") && aggCodeVal.startsWith(aggFilter.drop(1))) ||
aggFilter.equals(aggCodeVal)
}
But - double check me on that one, not sure I followed your logic perfectly.

Find String in Char iterator

I have a use case where I need to return a String up to a delimiter String (if found) from an iterator of Char.
The contract:
if iterator is exhausted (only at the begin), return None
if the delimiter String is found, return all characters before it (empty String is fine), delimiter will be dropped
else return the remaining characters
do not eagerly exhaust the iterator!
I do have this working solution, but it feels like Java (which is where I'm coming from)
class MyClass(str: String) {
def nextString(iterator: Iterator[Char]): Option[String] = {
val sb = new StringBuilder
if(!iterator.hasNext) return None
while (iterator.hasNext) {
sb.append(iterator.next())
if (sb.endsWith(str)) return Some(sb.stripSuffix(str))
}
Some(sb.toString())
}
}
Is there a way I can do this in a more functional way (ideally without changing the method signature)?
Update: Here is how I test this
val desmurfer = new MyClass("_smurf_")
val iterator: Iterator[Char] = "Scala_smurf_is_smurf_great_smurf__smurf_".iterator
println(desmurfer.nextString(iterator))
println(desmurfer.nextString(iterator))
println(desmurfer.nextString(iterator))
println(desmurfer.nextString(iterator))
println(desmurfer.nextString(iterator))
println
println(desmurfer.nextString("FooBarBaz".iterator))
println(desmurfer.nextString("".iterator))
Output:
Some(Scala)
Some(is)
Some(great)
Some()
None
Some(FooBarBaz)
None
How about this one:
scala> def nextString(itr: Iterator[Char], sep: String): Option[String] = {
| def next(res: String): String =
| if(res endsWith sep) res dropRight sep.size else if(itr.hasNext) next(res:+itr.next) else res
| if(itr.hasNext) Some(next("")) else None
| }
nextString: (itr: Iterator[Char], sep: String)Option[String]
scala> val iterator: Iterator[Char] = "Scala_smurf_is_smurf_great".iterator
iterator: Iterator[Char] = non-empty iterator
scala> println(nextString(iterator, "_smurf_"))
Some(Scala)
scala> println(nextString(iterator, "_smurf_"))
Some(is)
scala> println(nextString(iterator, "_smurf_"))
Some(great)
scala> println(nextString(iterator, "_smurf_"))
None
scala> println(nextString("FooBarBaz".iterator, "_smurf_"))
Some(FooBarBaz)
What about this one?
def nextString(iterator: Iterator[Char]): Option[String] = {
val t = iterator.toStream
val index = t.indexOfSlice(s)
if(t.isEmpty) None
else if(index == -1) Some(t.mkString)
else Some(t.slice(0,index).mkString)
}
it passed this tests:
val desmurfer = new MyClass("_smurf_")
val iterator: Iterator[Char] = "Scala_smurf_is_smurf_great_smurf__smurf_".iterator
assert(desmurfer.nextString(iterator) == Some("Scala"))
assert(desmurfer.nextString(iterator) == Some("is"))
assert(desmurfer.nextString(iterator) == Some("great"))
assert(desmurfer.nextString(iterator) == Some(""))
assert(desmurfer.nextString(iterator) == None)
assert(desmurfer.nextString("FooBarBaz".iterator) == Some("FooBarBaz"))
assert(desmurfer.nextString("".iterator) == None)
Updated: removed "index == -1 &&" from the first "if condition clause".
This seems to be doing what you'd want. #Eastsun answer motivated me
val str = "hello"
def nextString2(iterator: Iterator[Char]): Option[String] = {
val maxSize = str.size
#tailrec
def inner(collected: List[Char], queue: Queue[Char]): Option[List[Char]] =
if (queue.size == maxSize && queue.sameElements(str))
Some(collected.reverse.dropRight(maxSize))
else
iterator.find(x => true) match {
case Some(el) => inner(el :: collected, if (queue.size == maxSize) queue.dequeue._2.enqueue(el) else queue.enqueue(el))
case None => Some(collected.reverse)
}
if (iterator.hasNext)
inner(Nil, Queue.empty).map(_.mkString)
else
None
}
test(nextString2(Nil.iterator)) === None
test(nextString2("".iterator)) === None
test(nextString2("asd".iterator)) === Some("asd")
test(nextString2("asfhello".iterator)) === Some("asf")
test(nextString2("asehelloasdasd".iterator)) === Some("ase")
But I honestly think it's too complicated to be used. Sometimes you have to use non FP stuff in scala to be performance effecient.
P.S. I didn't know how to match iterator on it's first element, so I've used iterator.find(x => true) which is ugly. Sorry.
P.P.S. A bit of explanation. I recoursively build up collected to fill the elements you are searching for. And I also build queue with last str.size-elements. Then I just check this queue over str each time. This might not be the most efficient way of doing this stuff. You might go with Aho–Corasick algorithm or an analogue if you want more.
P.P.P.S. And I am using iterator as a state, which is probably not FP way
P.P.P.P.S. And you test passes as well:
val desmurfer = new MyClass("_smurf_")
val iterator: Iterator[Char] = "Scala_smurf_is_smurf_great".iterator
test(desmurfer.nextString2(iterator)) === Some("Scala")
test(desmurfer.nextString2(iterator)) === Some("is")
test(desmurfer.nextString2(iterator)) === Some("great")
test(desmurfer.nextString2(iterator)) === None
println()
test(desmurfer.nextString2("FooBarBaz".iterator)) === Some("FooBarBaz")
test(desmurfer.nextString2("".iterator)) === None
Here's one I'm posting just because it's a bit warped :) I wouldn't recommend actually using it:
class MyClass2(str: String) {
val sepLength = str.length
def nextString(iterator: Iterator[Char]): Option[String] = {
if (!iterator.hasNext) return None
val sit = iterator.sliding(sepLength)
val prefix = sit.takeWhile(_.mkString != str).toList
val prefixString = prefix.toList.map(_.head).mkString
if (prefix.head.length < sepLength) Some(prefix.head.mkString)
else if (!iterator.hasNext) Some(prefix.head.mkString + prefix.last.mkString)
else Some(prefixString)
}
}
The idea is that by calling sliding() on our underlying iterator, we can get a sequence, one of which will be our delimiter, if it's present. So we can use takeWhile to find the delimiter. Then the first characters of each of the sliding strings before our delimiter is the string we skipped over. As I said, warped.
I'd really like sliding to be defined so that it produced all subsequences of length n and at the end sequences of length n-1, n-2....1 for this particular use case, but it doesn't, and the horrible if statement at the end is dealing with the various cases.
It passes the test cases :)
Updated: This works without converting the iterator to String
def nextString(iterator: Iterator[Char]): Option[String] = {
if (iterator.isEmpty) None
else Some(iterator.foldLeft("") { (result, currentChar) => if (res.endsWith(str)) result else result + currentChar})
}
A colleague provided the makings of this answer, which is a mixture between his original approach and some polishing from my side. Thanks, Evans!
Then another colleague also added some input. Thanks Ako :-)
class MyClass(str: String) {
def nextString(iterator: Iterator[Char]): Option[String] = {
def nextString(iterator: Iterator[Char], sb: StringBuilder): Option[String] = {
if (!iterator.hasNext || sb.endsWith(str)) {
Some(sb.stripSuffix(str))
} else {
nextString(iterator, sb.append(iterator.next()))
}
}
if (!iterator.hasNext) None
else nextString(iterator, new StringBuilder)
}
}
So far, I like this approach best, so I will accept it in two days unless there is a better answer by then.