How to convert a Seq[Byte] into an Array[Boolean] representing each bit in Scala

Is there a better way to convert a sequence of Bytes into a Seq[Boolean] where each element represents one bit of the byte sequence?
I'm currently doing this, but byte2Bools seems a little too heavy:
import scala.collection.mutable.ArrayBuffer
object Main extends App {
  private def byte2Bools(b: Byte) =
    (0 to 7).foldLeft(ArrayBuffer[Boolean]())((bs, i) => bs += isBitSet(b, i))
  private def isBitSet(byte: Byte, bit: Int) =
    ((byte >> bit) & 1) == 1
  val bytes = List[Byte](1, 2, 3)
  val bools = bytes.flatMap(b => byte2Bools(b))
  println(bools)
}
Perhaps the real question is: what's a better implementation of byte2Bools?

First, the accumulator in foldLeft does not need to be a mutable collection:
def byte2Bools(b: Byte): Seq[Boolean] =
  (0 to 7).foldLeft(Vector[Boolean]()) { (bs, i) => bs :+ isBitSet(b)(i) }
Second, you can simply map the range of bit indices with isBitSet:
def byte2Bools(b: Byte): Seq[Boolean] =
  0 to 7 map isBitSet(b)
def isBitSet(byte: Byte)(bit: Int): Boolean =
  ((byte >> bit) & 1) == 1
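One thing the map-based version makes easy to overlook is bit order: mapping 0 to 7 yields the least significant bit first. The sketch below shows both orders side by side; the object name BitOrder and the helpers byte2BoolsMsbFirst and bytes2Bools are illustrative names, not part of the answer.

```scala
object BitOrder {
  // Bit i of the byte, counting from the least significant bit (i = 0).
  def isBitSet(byte: Byte)(bit: Int): Boolean =
    ((byte >> bit) & 1) == 1

  // Least significant bit first, as in the answer above.
  def byte2Bools(b: Byte): Seq[Boolean] =
    (0 to 7).map(isBitSet(b))

  // Hypothetical variant: most significant bit first.
  def byte2BoolsMsbFirst(b: Byte): Seq[Boolean] =
    (7 to 0 by -1).map(isBitSet(b))

  // Flattens a whole byte sequence into bits, eight per byte.
  def bytes2Bools(bytes: Seq[Byte]): Seq[Boolean] =
    bytes.flatMap(byte2Bools)
}
```

With this sketch, `BitOrder.byte2Bools(1)` starts with `true`, while `BitOrder.byte2BoolsMsbFirst(1)` ends with it.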

For whatever it's worth, you can convert a Byte to a binary string and then to a sequence of Booleans. Masking with 0xFF keeps negative bytes at eight bits:
val b1: Byte = 7
(0x100 | (b1 & 0xFF)).toBinaryString.tail.map { case '1' => true; case _ => false }
Results in: Vector(false, false, false, false, false, true, true, true)
And you can go back (Booleans to Byte) with:
val s1 = Vector(false, false, false, false, false, true, true, true)
Integer.parseInt(s1.map { case true => '1'; case false => '0' }.mkString, 2).toByte
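A minimal sketch that wraps the two conversions in functions and checks that they round-trip. It masks the byte with 0xFF so that negative values still produce exactly eight bits; the names BinaryStringRoundTrip, toBools and toByte are illustrative.

```scala
object BinaryStringRoundTrip {
  // (b & 0xFF) keeps only the low 8 bits, so negative bytes work too.
  // OR-ing with 0x100 adds a leading 1 that `tail` drops again, which
  // preserves the leading zeros of small values.
  def toBools(b: Byte): Seq[Boolean] =
    (0x100 | (b & 0xFF)).toBinaryString.tail.map(c => c == '1')

  // Most significant bit first, matching the output of toBools.
  def toByte(bools: Seq[Boolean]): Byte =
    Integer.parseInt(bools.map(b => if (b) '1' else '0').mkString, 2).toByte
}
```

Note that this representation is most significant bit first, the opposite of the shift-based byte2Bools.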

Related

Check sequence String to contain strictly two patterns

I have a list of Strings where each element may start with the prefix AA, BB or CC. How can I check that the list contains at least one String starting with AA and at least one starting with BB, but none starting with CC? This is what I have now, and it works. Is there a better way to do it? Thanks.
private final val ValidPatternA: String = "^AA.*"
private final val ValidPatternB: String = "^BB.*"
private final val InvalidPatternC: String = "^CC.*"
def main(args: Array[String]): Unit = {
println(isValid(Seq())) // false
println(isValid(Seq("AA0"))) // false
println(isValid(Seq("BB1"))) // false
println(isValid(Seq("CC2"))) // false
println(isValid(Seq("AA0", "BB1", "CC2"))) // false
println(isValid(Seq("AA0", "CC2"))) // false
println(isValid(Seq("BB1", "CC2"))) // false
println(isValid(Seq("AA0", "BB1"))) // true
}
private def isValid(listOfString: Seq[String]) =
!listOfString.exists(_.matches(InvalidPatternC)) &&
listOfString.exists(_.matches(ValidPatternA)) &&
listOfString.exists(_.matches(ValidPatternB))
The code you have is clear and expressive, so the only concern can be performance. A recursive function can do one pass efficiently:
def isValid(listOfString: Seq[String]) = {
  @annotation.tailrec
  def loop(rem: List[String], foundA: Boolean, foundB: Boolean): Boolean =
    rem match {
      case Nil => foundA && foundB
      case s"CC$_" :: _ => false
      case s"AA$_" :: tail => loop(tail, true, foundB)
      case s"BB$_" :: tail => loop(tail, foundA, true)
      case _ :: tail => loop(tail, foundA, foundB)
    }
  loop(listOfString.toList, false, false)
}
The @annotation.tailrec annotation makes the compiler verify that loop is tail-recursive, so it is compiled into a fast loop with rem, foundA and foundB stored in local variables and the recursive call becoming a jump back to the start of the function. The s"..." patterns require Scala 2.13 or later.
You can also use a bit mask to reduce the number of collection traversals to one:
private def isValid(listOfString: Seq[String]) =
  listOfString.foldLeft(0) { (mask, str) =>
    mask | str.matches(ValidPatternA).compare(false) | str.matches(ValidPatternB).compare(false) << 1 | str.matches(InvalidPatternC).compare(false) << 2
  } == 3 // bits 1 and 2 set (AA and BB seen), bit 4 clear (no CC seen)
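The same bit-mask idea may be easier to read with named flags. The sketch below assumes plain startsWith checks are acceptable in place of the regular expressions; the object name PrefixMask and the constants HasA, HasB and HasC are illustrative.

```scala
object PrefixMask {
  private val HasA = 1 // bit 0: some element starts with "AA"
  private val HasB = 2 // bit 1: some element starts with "BB"
  private val HasC = 4 // bit 2: some element starts with "CC"

  def isValid(items: Seq[String]): Boolean = {
    // Single pass: OR the flag of each element's prefix into the mask.
    val mask = items.foldLeft(0) { (acc, s) =>
      if (s.startsWith("AA")) acc | HasA
      else if (s.startsWith("BB")) acc | HasB
      else if (s.startsWith("CC")) acc | HasC
      else acc
    }
    // Valid only when exactly the AA and BB flags are set.
    mask == (HasA | HasB)
  }
}
```

Unlike the recursive version, a fold cannot stop early when it meets a CC element, so it always visits the whole sequence.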

Scala Comparison of 2 different Object Sequences

I come from a Java background and am learning Scala now. I have one Seq of Loc objects and one of LocSize objects; the two types share a common member field, code. I have a validateRoom function that matches each Loc instance to a LocSize instance and performs some basic validation. My objective is to make sure all Loc objects pass the validation, hence the Seq.forall in the last println.
I believe that there are better, shorter and prettier FP ways to achieve what I want. How can I do so?
case class Loc(id: String, code: String, isRoom: Boolean, measureUnit: String, allowedSize: Int)
case class LocSize(code: String, width: Int, length: Int, measureUnit: String)
val locs = Seq(
Loc("_1", "loc01", true, "M", 100),
Loc("_2", "loc02", false, "M", 100),
Loc("_3", "loc03", true, "M", 100)
)
val locSizes = Seq(
LocSize("loc01", 5, 10, "M"),
LocSize("loc02", 6, 11, "M"),
LocSize("loc03", 9, 14, "M"),
LocSize("loc04", 8, 13, "M"),
LocSize("loc05", 9, 14, "M"),
)
def validateRoom(locs: Seq[Loc], loSizes: Seq[LocSize]): Seq[Boolean] = {
for (loc <- locs) yield {
if (loc.isRoom) {
val locSize = loSizes.find(_.code == loc.code)
locSize match {
case Some(i) => {
if (i.measureUnit.contains(loc.measureUnit) &&
i.width*i.length < loc.allowedSize)
true
else
false
}
case None => false
}
}
else
true
}
}
// Return false as loc03 size 9*14 > 100
println(validateRoom(locs, locSizes).forall(b => b))
YMMV, and there are probably more elegant solutions, but here is one example:
def validateRooms(locs: Seq[Loc], loSizes: Seq[LocSize]): Boolean = {
  locs.foldLeft(true) { (state, loc) =>
    if (state && loc.isRoom) {
      val locSize = loSizes.find(_.code == loc.code)
      locSize.map(s =>
        s.measureUnit.contains(loc.measureUnit) &&
        s.width * s.length < loc.allowedSize).getOrElse(false)
    } else {
      state
    }
  }
}
In this case the function itself returns true only if all locations are valid, which eliminates the forall. It uses map and foldLeft instead of the for comprehension, and it skips locations that are not rooms, as the original does.
Another option you have is:
def validateRoom(locs: Seq[Loc], locSizes: Seq[LocSize]): Boolean = {
locs.forall(loc => !loc.isRoom || locSizes.find(_.code == loc.code).exists(locSize => {
locSize.measureUnit.contains(loc.measureUnit) &&
locSize.width*locSize.length < loc.allowedSize
}))
}
Why does it work?
forall makes sure that the condition holds for every loc in locs.
What is the condition?
Either loc.isRoom is false, in which case the location passes, or locSizes must contain a LocSize with the same code. The result of find is an Option, which exposes the method exists. exists returns true only if the Option is non-empty and the predicate applied to its value is true.
Code run at Scastie.
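The three cases of Option.exists can be seen in isolation in the small sketch below. The lookup table areaByCode and the function passes are hypothetical stand-ins for the LocSize lookup, not part of the answer.

```scala
object OptionExistsDemo {
  // Hypothetical lookup table: location code -> area.
  val areaByCode: Map[String, Int] = Map("loc01" -> 50, "loc03" -> 126)

  // A non-room always passes; a room passes only if its area is known
  // and smaller than the allowed size.
  def passes(isRoom: Boolean, code: String, allowedSize: Int): Boolean =
    !isRoom || areaByCode.get(code).exists(_ < allowedSize)
}
```

Here a missing code yields None, and exists on None is false, so an unknown room fails without any explicit pattern match.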
A better functional approach is not to rely on Booleans at all, but rather to provide more meaningful types; for example, an Either.
If you are open to using cats, the code is as simple as:
import cats.syntax.all._
// In general, a validation should return a new type, maybe a LocWithSize or something.
// The idea of a new type is to prove to the type system that such a value is already validated.
def validateRoom(locs: List[Loc], locSizes: List[LocSize]): Either[String, List[Loc]] = {
  val sizesByCode = locSizes.iterator.map(ls => ls.code -> ls).toMap
  locs.traverse { loc =>
    sizesByCode
      .get(key = loc.code)
      .toRight(left = s"The code: '${loc.code}' was not found in the sizes")
      .flatMap { locSize =>
        val size = locSize.width * locSize.length
        if (locSize.measureUnit != loc.measureUnit)
          Left(s"The measure unit: '${locSize.measureUnit}' was not applicable to the location: ${loc}")
        else if (size > loc.allowedSize)
          Left(s"The size of location '${loc.code}' was ${size} which is bigger than the allowed size (${loc.allowedSize})")
        else
          Right(loc)
      }
  }
}
Which you can use like this:
val locs = List(
Loc("_1", "loc01", true, "M", 100),
Loc("_2", "loc02", false, "M", 100),
Loc("_3", "loc03", true, "M", 100)
)
val locSizes = List(
LocSize("loc01", 5, 10, "M"),
LocSize("loc02", 6, 11, "M"),
LocSize("loc03", 9, 14, "M"),
LocSize("loc04", 8, 13, "M"),
LocSize("loc05", 9, 14, "M")
)
validateRoom(locs, locSizes)
// res: Either[String, List[Loc]] = Left(The size of location 'loc03' was 126 which is bigger than the allowed size (100))
If you do not want to pull in cats just for this, you can implement your own traverse.
Take a look at this for an example using Try; the logic should be pretty similar.
You can see the code running here.
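A hand-rolled traverse for List and Either could look roughly like the sketch below. It is not the cats implementation, only an assumption of what a minimal replacement might be: it applies the function to each element and stops at the first Left. The object name EitherTraverse is illustrative.

```scala
object EitherTraverse {
  // Applies f to each element in order; returns the first Left encountered,
  // or a Right holding all results if every call succeeds.
  def traverse[E, A, B](as: List[A])(f: A => Either[E, B]): Either[E, List[B]] = {
    @annotation.tailrec
    def loop(rem: List[A], acc: List[B]): Either[E, List[B]] =
      rem match {
        case Nil => Right(acc.reverse)
        case a :: tail =>
          f(a) match {
            case Left(e)  => Left(e)
            case Right(b) => loop(tail, b :: acc)
          }
      }
    loop(as, Nil)
  }
}
```

With such a helper, `locs.traverse { loc => ... }` would become `EitherTraverse.traverse(locs) { loc => ... }` with the same body.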
You can make it cleaner like this:
def validateRoom(locs: Seq[Loc], loSizes: Seq[LocSize]): Boolean = {
val locSizeIndex: Map[String, LocSize] = loSizes.map(loc => loc.code -> loc).toMap
val validation = for {
room <- locs.filter(_.isRoom)
size <- locSizeIndex.get(room.code)
} yield size.measureUnit.contains(room.measureUnit) && size.width * size.length < room.allowedSize
validation.reduceOption(_ && _).getOrElse(true)
}
// Return false as loc03 size 9*14 > 100
println(validateRoom(locs, locSizes))
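Which prints false. Note that a room with no matching LocSize is skipped by the for comprehension rather than failing validation, unlike the original.
Scastie for playing: https://scastie.scala-lang.org/xQbDWZXOR16PJhz6xeaByA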
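If a room without a matching LocSize should fail validation, as it does in the question's code, one possible variant keeps the Map index but uses forall and Option.exists. This is a sketch under that assumption; the object name IndexedValidation is illustrative and the case classes are repeated only to keep the example self-contained.

```scala
object IndexedValidation {
  case class Loc(id: String, code: String, isRoom: Boolean, measureUnit: String, allowedSize: Int)
  case class LocSize(code: String, width: Int, length: Int, measureUnit: String)

  def validateRooms(locs: Seq[Loc], locSizes: Seq[LocSize]): Boolean = {
    // Build the index once so each lookup is a Map access, not a linear find.
    val sizeByCode = locSizes.map(s => s.code -> s).toMap
    locs.filter(_.isRoom).forall { room =>
      // A missing size gives None, and exists on None is false.
      sizeByCode.get(room.code).exists { s =>
        s.measureUnit.contains(room.measureUnit) &&
        s.width * s.length < room.allowedSize
      }
    }
  }
}
```

forall also stops at the first failing room, so no reduce step is needed.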

How to dynamically provide N codecs to process fields as a VectorCodec for a record of binary fields that do not contain size bytes

Considering this function in Decoder:
final def decodeCollect[F[_], A](dec: Decoder[A], limit: Option[Int])(buffer: BitVector)(implicit cbf: Factory[A, F[A]]): Attempt[DecodeResult[F[A]]] = {
What I really need is dec: Vector[Decoder[A]], like this:
final def decodeCollect[F[_], A](dec: Vector[Decoder[A]], limit: Option[Int])(buffer: BitVector)(implicit cbf: Factory[A, F[A]]): Attempt[DecodeResult[F[A]]] = {
to process a binary format that has fields that are not self describing. Early in the file are description records, and from these come field sizes that have to be applied later in data records. So I want to build up a list of decoders and apply it N times, where N is the number of decoders.
I could write a new function modeled on decodeCollect, but it takes an implicit Factory, so I probably would have to compile the scodec library and add it.
Is there a simpler approach using what exists in the scodec library? Either a way to deal with the factory or a different approach?
I finally hacked a solution into the scodec codebase. Now that that door is open, I'll add whatever I need until I succeed.
final def decodeNCollect[F[_], A](dec: Vector[Decoder[A]])(buffer: BitVector)(implicit cbf: Factory[A, F[A]]): Attempt[DecodeResult[F[A]]] = {
val bldr = cbf.newBuilder
var remaining = buffer
var count = 0
val maxCount = dec.length
var error: Option[Err] = None
while (count < maxCount && remaining.nonEmpty) {
dec(count).decode(remaining) match {
case Attempt.Successful(DecodeResult(value, rest)) =>
bldr += value
count += 1
remaining = rest
case Attempt.Failure(err) =>
error = Some(err.pushContext(count.toString))
remaining = BitVector.empty
}
}
Attempt.fromErrOption(error, DecodeResult(bldr.result, remaining))
}
final def encodeNSeq[A](encs: Vector[Encoder[A]])(seq: collection.immutable.Seq[A]): Attempt[BitVector] = {
if (encs.length != seq.length)
return Attempt.failure(Err("encodeNSeq: length of coders and items does not match"))
val buf = new collection.mutable.ArrayBuffer[BitVector](seq.size)
((seq zip (0 until encs.length)): Seq[(A, Int)]) foreach { case (a, i) =>
encs(i).encode(a) match {
case Attempt.Successful(aa) => buf += aa
case Attempt.Failure(err) => return Attempt.failure(err.pushContext(buf.size.toString))
}
}
def merge(offset: Int, size: Int): BitVector = size match {
case 0 => BitVector.empty
case 1 => buf(offset)
case n =>
val half = size / 2
merge(offset, half) ++ merge(offset + half, half + (if (size % 2 == 0) 0 else 1))
}
Attempt.successful(merge(0, buf.size))
}
private[codecs] final class VectorNCodec[A](codecs: Vector[Codec[A]]) extends Codec[Vector[A]] {
def sizeBound = SizeBound(0, Some(codecs.length.toLong))
def encode(vector: Vector[A]) = Encoder.encodeNSeq(codecs)(vector)
def decode(buffer: BitVector) =
Decoder.decodeNCollect[Vector, A](codecs)(buffer)
override def toString = s"vector($codecs)"
}
def vectorOf[A](valueCodecs: Vector[Codec[A]]): Codec[Vector[A]] =
provide(valueCodecs.length).
flatZip { count => new VectorNCodec(valueCodecs) }.
narrow[Vector[A]]({ case (cnt, xs) =>
if (xs.size == cnt) Attempt.successful(xs)
else Attempt.failure(Err(s"Insufficient number of elements: decoded ${xs.size} but should have decoded $cnt"))
}, xs => (xs.size, xs)).
withToString(s"vectorOf($valueCodecs)")
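Stripped of the scodec types, the core idea of decodeNCollect is to run each decoder once, in order, feeding the remainder of one into the next. The library-free sketch below illustrates only that idea. The type alias Dec, the helper fixedString and the function decodeAll are hypothetical and do not exist in scodec; unlike decodeNCollect, this sketch reports an error when the input runs out instead of stopping quietly.

```scala
object SequentialDecoders {
  // Hypothetical stand-in for a decoder: consumes a prefix of the input and
  // returns the decoded value plus the unconsumed remainder, or an error.
  type Dec[A] = Vector[Byte] => Either[String, (A, Vector[Byte])]

  // Fixed-width field with no size prefix: takes exactly n bytes as ASCII text.
  def fixedString(n: Int): Dec[String] = bytes =>
    if (bytes.length < n) Left(s"need $n bytes, got ${bytes.length}")
    else Right((new String(bytes.take(n).toArray, "US-ASCII"), bytes.drop(n)))

  // Runs each decoder once, in order, threading the remainder through.
  def decodeAll[A](decs: Vector[Dec[A]])(input: Vector[Byte]): Either[String, (Vector[A], Vector[Byte])] =
    decs.zipWithIndex.foldLeft[Either[String, (Vector[A], Vector[Byte])]](Right((Vector.empty[A], input))) {
      case (Right((acc, rest)), (dec, i)) =>
        dec(rest) match {
          case Right((a, rest2)) => Right((acc :+ a, rest2))
          case Left(err)         => Left(s"field $i: $err") // tag the error with the field index
        }
      case (failed, _) => failed // keep the first failure
    }
}
```

The vector of decoders can be built at run time from the field sizes found in the description records, which is the situation described in the question.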

How to find the index of an item skipping some values while wrapping to head

I have a list of Boolean
val stacks = List(true, true, false, true, false)
I need a function that takes an index, and returns the next index that is not false, going back to 0 after reaching length.
def nextInvolvedAfter(after: Int): Int = ???
For example:
nextInvolvedAfter(0) // 1
nextInvolvedAfter(1) // 3
nextInvolvedAfter(2) // 3
nextInvolvedAfter(3) // 0
I was thinking iterating over a list like this:
stacks.drop(after + 1) ++ stacks indexWhere(_)
IMHO, this kind of problem is a perfect fit for a tail-recursive algorithm.
def nextInvolvedAfter(data: List[Boolean])(after: Int): Int = {
  @annotation.tailrec
  def loop(remaining: List[Boolean], currentIdx: Int): Int =
    remaining match {
      case true :: _ if (currentIdx > after) => currentIdx
      case _ :: xs => loop(remaining = xs, currentIdx + 1)
      case Nil => nextInvolvedAfter(data)(after = -1) // Start again from the beginning.
    }
  loop(remaining = data, currentIdx = 0)
}
However, if you want a solution using built-in methods, check this:
def nextInvolvedAfter(data: List[Boolean])(after: Int): Int =
  data.iterator.zipWithIndex.collectFirst {
    case (true, idx) if (idx > after) => idx
  }.getOrElse(nextInvolvedAfter(data)(after = -1)) // Start again from the beginning.
Both can be tested like this:
val test = nextInvolvedAfter(List(true, true, false, true, false)) _
// test: Int => Int = $$Lambda$938/1778422985@51a6cc2a
test(0)
// res: Int = 1
test(1)
// res: Int = 3
test(2)
// res: Int = 3
test(3)
// res: Int = 0
test(4)
// res: Int = 0
However, take into account that if all values are false this will end in a StackOverflowError, so use it with care.
Or you may add custom logic to abort after a second iteration from the beginning.
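One way to avoid the unbounded recursion is to visit each index at most once, wrapping with modulo arithmetic, and to return an Option so that the all-false case is explicit. This is a sketch, not the answer's method; it assumes a Vector for constant-time indexing, and the object name NextInvolved is illustrative.

```scala
object NextInvolved {
  // Returns the next index after `after` that holds true, wrapping around
  // at most once; None when no element is true (or the input is empty).
  def nextInvolvedAfter(data: Vector[Boolean])(after: Int): Option[Int] = {
    val n = data.length
    // Visit after + 1, after + 2, ... modulo n, for exactly n steps.
    Iterator.range(1, n + 1)
      .map(step => Math.floorMod(after + step, n))
      .find(i => data(i))
  }
}
```

Because of the modulo, an out-of-range `after` wraps around instead of failing, and the index `after` itself is checked last.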
This seems to be working:
(stacks.zipWithIndex.drop(after + 1) ++ stacks.zipWithIndex).find(_._1).get._2
I think we can simply iterate over indices. getOrElse can be used to do the circular check:
def nextInvolvedAfter(as : List[Boolean], after : Int) : Int =
as.indices.find(i => i > after && as(i))
.getOrElse(
as.indices.find(i => as(i)).getOrElse(-1)
)
This can be improved a little by iterating only over the relevant portions of the list; you can use scala.collection.immutable.Range directly instead of indices.
def nextInvolvedAfter(as: Vector[Boolean], after: Int): Int =
  (after + 1 until as.size).find(as)
    .orElse((0 to after).find(as))
    .getOrElse(-1)
Also note that accessing a List by index is inefficient (each access is O(n)), as mentioned in the comments.
Another thing to note is that in all the solutions in this question (including the accepted one), if the given index is greater than the list size, the function will just return the first true value encountered in the list. A trivial conditional check to make sure the index is within range can remedy this.

How to use fold to do boolean testing

I'd like to know the idiomatic way to approach this problem in scala.
Given a start date and an end date and a collection of dates in between, determine whether the given collection of dates contains all the dates necessary to go from the start date to the end date with no gap dates in between.
Type signature:
def checkDate(start: DateTime, end: DateTime, between: IndexedSeq[DateTime]): Boolean
The "normal" or "not functional" way to do this would be something like this:
def checkDate(start: DateTime, end: DateTime, between: IndexedSeq[DateTime]): Boolean = {
  var d = start.plusDays(1)
  var status = true
  // Walk day by day from the day after start up to (but excluding) end.
  while (status && d.isBefore(end)) {
    if (!between.contains(d)) status = false
    d = d.plusDays(1)
  }
  status
}
How can I do this using a Fold?
Here's my thought process so far:
def checkDate(start: DateTime, end: DateTime, between: IndexedSeq[DateTime]): Boolean = {
// A fold will assume the dates are in order and move left (or right)
// This means the dates must be sorted.
val sorted = between.sortBy(_.getMillis())
val a = sorted.foldLeft(List[Boolean]) {
(acc, current) => {
// How do I access an iterable version of the start date?
if (current == ??) {
acc :: true
} else false
}
}
// If the foldLeft produced any values that could NOT be matched
// to the between list, then the start date does not have an
// uninterrupted path to the end date.
if (a.count(_ == false) > 0) false
else true
}
I just need to figure out how to index the start parameter so I can increase the value of it as the fold iterates over the between collection. Or it's possible that fold isn't what I'm supposed to use at all.
Any help would be appreciated!
You can pass the previous DateTime item in the accumulator:
val a = sortedBetween.foldLeft((List[Boolean](), start)) {
case ((results, prev), current) => {
... calculate res here ...
(results ++ List(res), current)
}
}
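A filled-in version of this accumulator idea might look like the sketch below. It assumes java.time.LocalDate instead of Joda's DateTime, assumes that between holds the days strictly between start and end, and carries a single Boolean rather than a list of results. The object name FoldCheck is illustrative.

```scala
import java.time.LocalDate

object FoldCheck {
  def checkDate(start: LocalDate, end: LocalDate, between: Seq[LocalDate]): Boolean = {
    val sorted = between.sortBy(_.toEpochDay)
    // Accumulator: (last date seen, whether every step so far was exactly one day).
    val (last, contiguous) = sorted.foldLeft((start, true)) {
      case ((prev, ok), current) => (current, ok && prev.plusDays(1) == current)
    }
    // The last date seen must also be exactly one day before end.
    contiguous && last.plusDays(1) == end
  }
}
```

Under these assumptions duplicate dates in between count as a gap, since a repeated day is not one day after its predecessor.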
But for this kind of check you are better off using a combination of sliding and forall:
sortedBetween.sliding(2).forall {
  case Seq(prev, cur) => ..do the check here ..
}
Also, note that you are ignoring the result of sorting between, since IndexedSeq is immutable. The fix is to use another val:
val sortedBetween = between.sortBy(_.getMillis())
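For illustration, the sliding and forall combination could be completed as in the sketch below, assuming java.time.LocalDate in place of Joda's DateTime. The names SlidingCheck and isContiguous are illustrative. The extra case handles windows shorter than two elements, which sliding(2) produces for a single-element input.

```scala
import java.time.LocalDate

object SlidingCheck {
  // True when the sorted dates advance by exactly one day at each step.
  def isContiguous(dates: Seq[LocalDate]): Boolean = {
    val sorted = dates.sortBy(_.toEpochDay)
    sorted.sliding(2).forall {
      case Seq(prev, cur) => prev.plusDays(1) == cur
      case _              => true // fewer than two dates: nothing to compare
    }
  }
}
```

To cover the endpoints as well, the check can be applied to `start +: between :+ end`.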
I think a fold isn't necessary, it's making things too hard.
Suppose you had the following functions:
private def normalizeDateTime( dt : DateTime ) : DateMidnight = ???
private def requiredBetweens( start : DateMidnight, end : DateMidnight ) : Seq[DateMidnight] = ???
Then you could write your function as follows:
def checkDate(start: DateTime, end: DateTime, between: IndexedSeq[DateTime]): Boolean = {
val startDay = normalizeDateTime( start )
val endDay = normalizeDateTime( end )
val available = between.map( normalizeDateTime ).toSet
val required = requiredBetweens( startDay, endDay ).toSet
val unavailable = (required -- available)
unavailable.isEmpty
}
Note that this function imposes no requirement as to the ordering of the betweens, treats the elements as a Set, only requiring that each day be available somewhere.
To implement normalizeDateTime(...) you might get away with something as simple as dt.toDateMidnight, but you should think a bit about Chronology and time zone issues. It's critical that DateTime objects that you mean to represent a day always normalize to the same DateMidnight.
To implement requiredBetweens(...), you might consider using a Stream and takeWhile(...) for an elegant solution. You might want to require that (end isAfter start).
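A possible implementation of requiredBetweens along these lines is sketched below. It assumes java.time.LocalDate instead of DateMidnight, which sidesteps the normalization step, and uses Iterator.iterate with takeWhile rather than a Stream. The object name RequiredDays is illustrative.

```scala
import java.time.LocalDate

object RequiredDays {
  // All days strictly between start and end, in order.
  def requiredBetweens(start: LocalDate, end: LocalDate): Seq[LocalDate] =
    Iterator.iterate(start.plusDays(1))(_.plusDays(1)).takeWhile(_.isBefore(end)).toList

  // True when every required day appears somewhere in between.
  def checkDate(start: LocalDate, end: LocalDate, between: Seq[LocalDate]): Boolean = {
    require(end.isAfter(start), "end must be after start")
    (requiredBetweens(start, end).toSet -- between.toSet).isEmpty
  }
}
```

As in the answer, ordering and duplicates in between do not matter here, and extra days outside the range are ignored.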
I would use filter and then zip and take the difference; consecutive dates should always be one day apart, so check that all the differences are 1.
@ val ls = Array(1, 2, 3, 4, 5, 6, 7) // can use dates in the same way
ls: Array[Int] = Array(1, 2, 3, 4, 5, 6, 7)
@ val ls2 = ls.filter { i => (2 < i) && (i < 6) }
ls2: Array[Int] = Array(3, 4, 5)
@ ls2.zip(ls2.drop(1))
res21: Array[(Int, Int)] = Array((3, 4), (4, 5))
@ ls2.zip(ls2.drop(1)).map { case (x, y) => y-x }
res22: Array[Int] = Array(1, 1)
@ ls2.zip(ls2.drop(1)).map { case (x, y) => y-x }.forall { _ == 1 }
res23: Boolean = true
You also have to check that no dates are missing:
@ ls2.length == 6 - 2 - 1 // beware off-by-one errors
res25: Boolean = true
You may also be able to do this more simply by using the Range object:
@ ls2.zipAll(3 to 5 by 1, 0, 0).forall { case (x, y) => x == y }
res46: Boolean = true
This should work, but may need a slight tweak for DateTime...
@ val today = LocalDate.now
today: LocalDate = 2017-10-19
@ val a = (0 to 9).reverse.map { today.minusDays(_) }
a: collection.immutable.IndexedSeq[LocalDate] = Vector(2017-10-10, 2017-10-11, 2017-10-12, 2017-10-13, 2017-10-14, 2017-10-15, 2017-10-16, 2017-10-17, 2017-10-18, 2017-10-19)
@ a.zip(a.drop(1)).map { case (x, y) => x.until(y) }.forall { _ == Period.ofDays(1) }
res71: Boolean = true
Here is a solution with tail recursion. I'm using ZonedDateTime from Java 8 to represent DateTime. An online version is available on codepad.remoteinterview.io:
import scala.annotation.tailrec
import java.time.ZonedDateTime
object TailRecursionExample {
def checkDate(start: ZonedDateTime, end: ZonedDateTime,
between: Seq[ZonedDateTime]): Boolean = {
// We have dates in range (inclusive) [start, end] with step = 1 day
// All these days should be in between collection
// set for fast lookup
val set = between.toSet
@tailrec
def checkDate(curr: ZonedDateTime, iterations: Int): (Int, Boolean) = {
if (curr.isAfter(end)) (iterations, true)
else if (set.contains(curr)) checkDate(curr.plusDays(1), iterations + 1)
else (iterations, false)
}
val (iterations, result) = if (start.isAfter(end))
(0, false)
else
checkDate(start, 0)
println(s"\tNum of iterations: $iterations")
result
}
def main(args: Array[String]): Unit = {
testWhenStartIsAfterEnd()
println
testWhenStartIsBeforeEnd()
println
testWhenStartIsBeforeEndButBetweenSkipOneDay()
println
()
}
def testWhenStartIsAfterEnd(): Unit = {
val start = ZonedDateTime.now().plusDays(5)
val end = ZonedDateTime.now()
val between = (0 to 5).map(i => start.plusDays(i))
verboseTest("testWhenStartIsAfterEnd", start, end, between)
}
def testWhenStartIsBeforeEnd(): Unit = {
val start = ZonedDateTime.now().minusDays(5)
val end = ZonedDateTime.now()
val between = (0 to 5).map(i => start.plusDays(i))
verboseTest("testWhenStartIsBeforeEnd", start, end, between)
}
def testWhenStartIsBeforeEndButBetweenSkipOneDay(): Unit = {
val start = ZonedDateTime.now().minusDays(5)
val end = ZonedDateTime.now()
val between = (1 to 5).map(i => start.plusDays(i))
verboseTest("testWhenStartIsBeforeEndButBetweenSkipOneDay", start, end, between)
}
def verboseTest(name: String, start: ZonedDateTime, end: ZonedDateTime,
between: Seq[ZonedDateTime]): Unit = {
println(s"$name:")
println(s"\tStart: $start")
println(s"\tEnd: $end")
println(s"\tBetween: ")
between.foreach(t => println(s"\t\t$t"))
println(s"\tcheckDate: ${checkDate(start, end, between)}")
}
}