Reverse bit positions without toBinaryString - scala

I have been trying to use Scala to reverse bit positions using only shifting, forcing and toggling. I was wondering if somebody could find my error; I have been staring at my code for too long now :)
Examples:
1010 1010 -> 0101 0101
1100 1001 -> 1001 0011
Here is my code atm:
def reverse(word: Byte): Byte = {
var r = 0x00 // Reversed bitstring
for (i <- 0 to 7) {
if ((word >> (7 - i) & 1) == 1) r = r & 1
r >> 1
}
r
}
Old:
def reverse(word: Byte) = {
var reversed = 0xFF.toByte
for (i <- 0 to 7) {
if ((word >> i & 1) == 1) {
reversed = reversed >> 1
}
else reversed = reversed >>> 1
}
reversed
}

Just take any answer for a Java implementation and do it simpler in Scala. (I added an explicit bit-size). Like:
import annotation.tailrec
@tailrec
def reverse(in: Int, n: Int = 8, out: Int = 0): Int =
if (n == 0) out
else reverse(in >>> 1, n - 1, (out << 1) | (in & 1))
For each of the n bits, copy the lowest bit of the input into the output, then shift the input right and the output left. Verify:
assert(reverse(0xAA) == 0x55)
assert(reverse(0xC9) == 0x93)
for (x <- 0x00 to 0xFF) assert(reverse(reverse(x)) == x)
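The same copy-and-shift idea also works as a plain loop on a Byte, which is closer to the code in the question. A minimal sketch (the name reverseByte is mine, not from the question):

```scala
// Reverse the 8 bits of a Byte using only shifts, AND and OR.
def reverseByte(word: Byte): Byte = {
  var r = 0 // accumulates the reversed bits, lowest input bit first
  for (i <- 0 to 7) {
    // make room on the right, then copy bit i of the input into it
    r = (r << 1) | ((word >> i) & 1)
  }
  r.toByte
}
```

Two things go wrong in the question's first attempt: `r >> 1` computes a value and throws it away, and `r & 1` can never set a bit in an `r` that starts at 0.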

This is a weird problem to spend time solving ... Homework?
@annotation.tailrec
def reverse(in: Int, out: Int = 0, n: Int = 0): Int =
if(in == 0) out else reverse(in >> 1, out | (in & 1) << (7-n), n+1)

java.lang.Integer and java.lang.Long have methods for reversing bits (and bytes), but for some silly reason java.lang.Byte doesn't, so if you use Integer.reverse on a byte value, remember to shift the result back down into the low byte:
E.g.: (Integer.reverse(x) >>> 24) & 0xFF
This may be easier than writing all the bitwise operations yourself if you are not quite up to that, and Oracle implements it with a well-optimized version for 32- and 64-bit integers.
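Wrapped up for a Byte, this could look roughly like the following sketch (reverseViaInteger is a name I made up; Integer.reverse is the real JDK method):

```scala
// Integer.reverse flips all 32 bits, so the original low byte ends up
// in the top byte; the unsigned shift brings it back down.
def reverseViaInteger(b: Byte): Byte =
  ((Integer.reverse(b.toInt) >>> 24) & 0xFF).toByte
```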

Related

scala: Loop through a file to read 20 bytes at a time and blank out bytes at 3rd position

I have a code snippet in java that loops through the file byte by byte and blanks out byte at 3rd position on every 20 bytes. This is done using for each loop.
logic:
for(byte b: raw){
if (pos is 3) b = 32;
if (i > 20) i = 0;
i++
}
Since I am learning scala, I would like to know if there is a better way of looping byte by byte in scala.
I have read into byte array as below in scala:
val result = IOUtils.toByteArray(new FileInputStream (new File(fileDir)))
Thanks.
Here is a diametrically opposite solution to that of Tzach Zohar:
def parallel(ba: Array[Byte], blockSize: Int = 2048): Unit = {
val n = ba.size
val numJobs = (n + blockSize - 1) / blockSize
(0 until numJobs).par.foreach { i =>
val startIdx = i * blockSize
val endIdx = n min ((i + 1) * blockSize)
var j = startIdx + ((3 - startIdx) % 20 + 20) % 20
while (j < endIdx) {
ba(j) = 32
j += 20
}
}
}
You see a lot of mutable variables, scary imperative while-loops, and some strange tricks with modular arithmetic. That's actually not idiomatic Scala at all. But the interesting thing about this solution is that it processes blocks of the byte array in parallel. I've compared the time needed by this solution to your naive solution, using various block sizes (average milliseconds per run):
Naive: 38.196
Parallel( 16): 11.676000
Parallel( 32): 7.260000
Parallel( 64): 4.311000
Parallel( 128): 2.757000
Parallel( 256): 2.473000
Parallel( 512): 2.462000
Parallel(1024): 2.435000
Parallel(2048): 2.444000
Parallel(4096): 2.416000
Parallel(8192): 2.420000
At least in this not very thorough microbenchmark (1000 repetitions on a 10 MB array), the more-or-less efficiently implemented parallel version outperformed the for-loop in your question by a factor of about 15.
The question is now: What do you mean by "better"?
My proposal was slightly faster than your naive approach.
@TzachZohar's functional solution could generalize better should the code be moved onto a cluster framework like Apache Spark.
I would usually prefer something closer to @TzachZohar's solution, because it's easier to read.
So, it depends on what you are optimizing for: performance? generality? readability? maintainability? For each notion of "better", you could get a different answer. I've tried to optimize for performance. @TzachZohar optimized for readability and maintainability. That led to two rather different solutions.
Full code of the microbenchmark, just in case someone is interested:
val array = Array.ofDim[Byte](10000000)
def naive(ba: Array[Byte]): Unit = {
var pos = 0
for (i <- 0 until ba.size) {
if (pos == 3) ba(i) = 32
pos += 1
if (pos == 20) pos = 0
}
}
def parallel(ba: Array[Byte], blockSize: Int): Unit = {
val n = ba.size
val numJobs = (n + blockSize - 1) / blockSize
(0 until numJobs).par.foreach { i =>
val startIdx = i * blockSize
val endIdx = n min ((i + 1) * blockSize)
var j = startIdx + ((3 - startIdx) % 20 + 20) % 20
while (j < endIdx) {
ba(j) = 32
j += 20
}
}
}
def measureTime[U](repeats: Long)(block: => U): Double = {
val start = System.currentTimeMillis
var iteration = 0
while (iteration < repeats) {
iteration += 1
block
}
val end = System.currentTimeMillis
(end - start).toDouble / repeats
}
println("Basic sanity check (did I get the modulo arithmetic right?):")
{
val testArray = Array.ofDim[Byte](50)
naive(testArray)
println(testArray.mkString("[", ",", "]"))
}
{
for (blockSize <- List(3, 7, 13, 16, 17, 32)) {
val testArray = Array.ofDim[Byte](50)
parallel(testArray, blockSize)
println(testArray.mkString("[", ",", "]"))
}
}
val Reps = 1000
val naiveTime = measureTime(Reps)(naive(array))
println("Naive: " + naiveTime)
for (blockSize <- List(16,32,64,128,256,512,1024,2048,4096,8192)) {
val parallelTime = measureTime(Reps)(parallel(array, blockSize))
println("Parallel(%4d): %f".format(blockSize, parallelTime))
}
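A note on the modular arithmetic in parallel: the expression startIdx + ((3 - startIdx) % 20 + 20) % 20 yields the first index at or after startIdx that is congruent to 3 modulo 20. The extra + 20 and the second % 20 are needed because % can return a negative result when its left operand is negative. A small sketch that checks this against a brute-force search (firstTarget and firstTargetBrute are hypothetical helper names):

```scala
// First index >= startIdx whose position within its 20-byte record is 3.
def firstTarget(startIdx: Int): Int =
  startIdx + ((3 - startIdx) % 20 + 20) % 20

// Brute-force reference: walk forward until the index matches.
def firstTargetBrute(startIdx: Int): Int =
  Iterator.from(startIdx).find(_ % 20 == 3).get
```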
Here's one way to do this:
// guard: the last group may be shorter than 4 bytes
val updated = result.grouped(20).flatMap { arr => if (arr.length > 3) arr(3) = 32; arr }.toArray
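If mutating the array in place is acceptable, a range with a step avoids both the position counter and the grouping. A minimal sketch, assuming the byte to blank sits at index 3 of each 20-byte record as in the question (blankEvery and its parameter names are made up for illustration):

```scala
// Overwrite the byte at `offset` within every `recordSize`-byte record
// with an ASCII space (32). Mutates the array in place.
def blankEvery(ba: Array[Byte], offset: Int = 3, recordSize: Int = 20): Unit =
  for (j <- offset until ba.length by recordSize) ba(j) = 32
```

Because the range stops before ba.length, a trailing record that is too short is simply skipped.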

Complexity estimation for simple recursive algorithm

I wrote some code in Scala, and now I want to estimate its time and memory complexity.
Problem statement
Given a positive integer n, find the least number of perfect square numbers (for example, 1, 4, 9, 16, ...) which sum to n.
For example, given n = 12, return 3 because 12 = 4 + 4 + 4; given n = 13, return 2 because 13 = 4 + 9.
My code
def numSquares(n: Int): Int = {
import java.lang.Math._
def traverse(n: Int, ns: Int): Int = {
val max = ((num: Int) => {
val sq = sqrt(num)
// a perfect square!
if (sq == floor(sq))
num.toInt
else
sq.toInt * sq.toInt
})(n)
if (n == max)
ns + 1
else
traverse(n - max, ns + 1)
}
traverse(n, 0)
}
I use a recursive solution here. So IMHO the time complexity is O(n), because I need to traverse the sequence of numbers using recursion. Am I right? Have I missed anything?

scala - serialize Int to ArrayBuffer[Byte]. Bit twiddle goes wrong

I want to serialize an Int into a Byte array or array buffer.
I realise that I can use 'java.nio.ByteBuffer' but I am experimenting for fun and trying to do it myself.
The following code works for positive Int but goes wrong when I serialize a negative Int.
Can anyone explain why or show me a correction?
import scala.collection.mutable.ArrayBuffer
object b {
val INTBYTES:Int = 4 // int is 4 bytes
def toArrayBuf(x:Int): ArrayBuffer[Byte] = {
val buf = new ArrayBuffer[Byte](INTBYTES)
for(i <- 0 until INTBYTES) {
buf += ((x >>> (INTBYTES - i - 1 << 3)) & 0xFF).toByte
}
buf
}
}
The following test works as expected. In the REPL it prints:
scala> val test:Int = 0x4f0f0f0f
test: Int = 1326386959
scala> println(test.toBinaryString)
1001111000011110000111100001111
scala> val t1 = b.toArrayBuf(test)
t1: scala.collection.mutable.ArrayBuffer[Byte] = ArrayBuffer(79, 15, 15, 15)
scala> t1.foreach( it => printf("%s ",it.toInt.toBinaryString))
1001111 1111 1111 1111
But this, with a negative Int, does something weird:
scala> val test2:Int = 0x8f0f0f0f
test2: Int = -1894838513
scala> println(test2.toBinaryString)
10001111000011110000111100001111
scala> val t2 = b.toArrayBuf(test2)
t2: scala.collection.mutable.ArrayBuffer[Byte] = ArrayBuffer(-113, 15, 15, 15)
scala> t2.foreach( it => printf("%s ",it.toInt.toBinaryString))
11111111111111111111111110001111 1111 1111 1111
Notice that the first byte has been 1-filled out to the width of a whole Int; it should be '10001111'.
Any ideas?
FYI
I'm using:
scala -version
Scala code runner version 2.10.1 -- Copyright 2002-2013, LAMP/EPFL
java -fullversion
java full version "1.7.0_40-b31"
with OpenJDK
Thanks
Scala's toBinaryString method defers to the Java one on Integer. From its Javadoc:
public static String toBinaryString(int i)
Returns a string representation of the integer argument as an unsigned
integer in base 2. The unsigned integer value is the argument plus
2^32 if the argument is negative; otherwise it is equal to the
argument. This value is converted to a string of ASCII digits in
binary (base 2) with no extra leading 0s.
In other words it's working as specified. Your bit-twiddling seems to be OK, but when you're printing the numbers out, you need to realise that the number of characters is dependent on the length of the data type. (E.g. -1: Int in binary is 11111111111111111111111111111111 while -1: Byte is 11111111.) You get away with it for positive numbers only because the leading zeros are not displayed, as specified above.
Solution: make your own toBinaryString for bytes, or just take the rightmost 8 digits from the Int version, which should work (though less efficiently), i.e.
it.toInt.toBinaryString.takeRight(8)
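Note that takeRight(8) only trims; for a positive byte the string is still shorter than 8 characters. A sketch of a helper that masks off the sign extension and pads with leading zeros (byteToBinary is a hypothetical name):

```scala
// Render a Byte as exactly 8 binary digits.
def byteToBinary(b: Byte): String = {
  // & 0xFF drops the sign-extended upper 24 bits of the widened Int
  val s = (b & 0xFF).toBinaryString
  "0" * (8 - s.length) + s
}
```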
Taking Luigi's advice I hacked up a pimp for Byte that provides a toBinaryString that works properly. In case anyone else is struggling with similar problems, here is what I did.
object b {
val INTBYTES:Int = 4 // int is 4 bytes
val SIZEBYTE:Short = 8
def toArrayBuf(x:Int): ArrayBuffer[Byte] = {
val buf = new ArrayBuffer[Byte](INTBYTES)
for(i <- 0 until INTBYTES) {
buf += ((x >>> (INTBYTES - i - 1 << 3)) & 0xFF).toByte
}
buf
}
def toBinaryString(x: Byte): String = {
val buf = new StringBuilder(SIZEBYTE)
for(i <- 0 until SIZEBYTE) {
buf.append((x >>> (SIZEBYTE - i - 1)) & 0x01)
}
buf.toString()
}
}
//pimp Byte
implicit def fooBar(byte: Byte) = new {def toBinaryString = b.toBinaryString(byte)}
Now when I run the previous experiment it works properly
scala> val test:Int = 0x4f0f0f0f
test: Int = 1326386959
scala> println(test.toBinaryString)
1001111000011110000111100001111
scala> val t1 = toArrayBuf(test)
t1: scala.collection.mutable.ArrayBuffer[Byte] = ArrayBuffer(79, 15, 15, 15)
scala> t1.foreach( it => printf("%s ",it.toBinaryString))
01001111 00001111 00001111 00001111
and
scala> val test2:Int = 0x8f0f0f0f
test2: Int = -1894838513
scala> println(test2.toBinaryString)
10001111000011110000111100001111
scala> val t2 = toArrayBuf(test2)
t2: scala.collection.mutable.ArrayBuffer[Byte] = ArrayBuffer(-113, 15, 15, 15)
scala> t2.foreach( it => printf("%s ",it.toBinaryString))
10001111 00001111 00001111 00001111
Thanks Luigi

Why does converting the char '1' to an Int with toInt result in 49?

I want to convert a char to an int value.
I am a bit puzzled by the way toInt works.
println(("123").toList) //List(1, 2, 3)
("123").toList.head // res0: Char = 1
("123").toList.head.toInt // res1: Int = 49 WTF??????
49 pops up randomly for no reason.
How do you convert a char to int the right way?
For simple digit to int conversions there is asDigit:
scala> "123" map (_.asDigit)
res5: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3)
Use Integer.parseInt("1", 10). Note that the 10 here is the radix.
val x = "1234"
val y = x.slice(0,1)
val z = Integer.parseInt(y)
val z2 = y.toInt //equivalent to the line above, see @Rogach's answer
val z3 = Integer.parseInt(y, 8) //This parses y as a base-8 number (radix of 8)
49 does not pop up randomly. It's the ASCII code of the character '1'. See http://www.asciitable.com/
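To make the difference concrete, here is a small sketch contrasting the character code with the digit value (the val names are mine; the methods are all from the standard library):

```scala
val c: Char = "123".head        // the character '1'

val code = c.toInt              // 49: the character's code
val digit = c.asDigit           // 1: the numeric value of the digit
val viaSubtraction = c - '0'    // 1: distance from the code of '0'

val allDigits = "123".map(_.asDigit) // the digits 1, 2, 3
```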
.toInt will give you the ASCII value. It's probably easiest to write
"123".head - '0'
If you want to handle non-numeric characters, you can do
c match {
case c if '0' <= c && c <= '9' => Some(c - '0')
case _ => None
}
You can also use
"123".head.toString.toInt

Is there a Scala-way to get the length of a number?

I would like to know, if there is a Scala built-in method to get the length of the decimal representation of an integer ?
Example: 45 has length 2; 10321 has length 5.
One could get the length with 10321.toString.length, but this smells a bit because of the overhead when creating a String object. Is there a nicer way or a built-in method ?
UPDATE:
With 'nicer' I mean a faster solution
I am only interested in positive integers
This is definitely personal preference, but I think the logarithm method looks nicer without a branch. For positive values only, the abs can be omitted of course.
def digits(x: Int) = {
import math._
ceil(log(abs(x)+1)/log(10)).toInt
}
Calling toString and taking the length does not work for negative integers, because the minus sign gets counted. This code works for negative numbers as well as positive ones.
def digits(n:Int) = if (n==0) 1 else math.log10(math.abs(n)).toInt + 1;
If you want speed then something like the following is pretty good, assuming random distribution:
def lengthBase10(x: Int) =
if (x >= 1000000000) 10
else if (x >= 100000000) 9
else if (x >= 10000000) 8
else if (x >= 1000000) 7
else if (x >= 100000) 6
else if (x >= 10000) 5
else if (x >= 1000) 4
else if (x >= 100) 3
else if (x >= 10) 2
else 1
Calculating logarithms to double precision isn't efficient if all you want is the floor.
The conventional recursive way would be:
def len(x: Int, i: Int = 1): Int =
if (x < 10) i
else len(x / 10, i + 1)
which is faster than taking logs for integers in the range 0 to 10e8.
lengthBase10 above is about 4x faster than everything else though.
Something like this should do the job:
def numericLength(n: Int): Int = BigDecimal(n).precision
Take log to the base 10, take the floor and add 1.
The easiest way is:
def numberLength(i : Int): Int = i.toString.length
You might add a guard, because for a negative Int the result is the length of its absolute value plus 1 (for the minus sign).
Another possibility can be:
private lazy val lengthList = (1 until 19).map(i => i -> math.pow(10, i).toLong)
def numberSize(x: Long): Int =
if (x >= 0) positiveNumberSize(x)
else positiveNumberSize(-x) + 1
private def positiveNumberSize(x: Long): Int =
lengthList
.collectFirst {
case (l, p) if x < p => l
}
.getOrElse(19)
Most people gave the most efficient answer of (int) log10(number) + 1
But I want to get a bit deeper into understanding why this works.
Let N be a 3-digit number. This means N can be any number from 100 up to, but not including, 1000:
100 <= N < 1000
=> 10^2 <= N < 10^3
The base-10 logarithm is monotonically increasing, therefore:
log10(10^2) <= log10(N) < log10(10^3)
=> 2 <= log10(N) < 3
We can conclude that N's logarithm is at least 2 and less than 3, or in other words, the base-10 logarithm of any 3-digit number lies in the interval [2, 3).
So if we take only the integer part of a number's base-10 logarithm (e.g. the integer part of 2.567 is 2) and add 1, we get the digit length of the number.
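This can be checked against the toString length with a short sketch for positive Ints (digitsByLog is a hypothetical name; math.log10 is the standard base-10 logarithm):

```scala
// Number of decimal digits of a positive Int: floor(log10(n)) + 1.
def digitsByLog(n: Int): Int = {
  require(n > 0, "only defined for positive integers")
  math.log10(n.toDouble).toInt + 1
}
```

The boundary cases are the interesting ones: 99 and 100, or 999 and 1000, sit on either side of an integer logarithm.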
Here is the solution:
number.toString.toCharArray.size
input - output
45 - 2
100 - 3