Scala - ReadLines with linenumber awareness - scala

I am trying to read from a (\n separated) file in Scala. I would like to treat the last and second to last line of the file differently from the others. how do I achieve this? As far as I know, LineNumberedReader which is a BufferedReader can get the current line number but is not aware of the total number of lines(to process the last line differently). Desired:
var aLine = lineNumberedReader.readLine
while (aLine != null) {
val currentLineNum: Int = lineNumberedReader.getLineNumber
if (currentLineNum == total_line_count - 1) {
do_this // To know if its the last line/ second to last, I need the total_line_count available in hand. which I (maybe incorrectly?) believe needs a file iteration by itself
} else {
do_that
}
aLine = lineNumberedReader.readLine
}
Same issues exist with Scala io.Source. it requires a minimum of two file iterations. Any ideas/APIs using which this can be achieved using a single iteration?
EDIT: Extending the question to include case for second-to-last line

Something like this perhaps?
val lines = Source
.fromFile("filename")
.getLines
lines.foreach { line =>
if (lines.hasNext) doThis(line) else doThat(line)
}
For one before last it gets a bit more involved:
val it = Source.fromFile("filename").getLines.sliding(2)
it.foreach {
case x :: _ if it.hasNext => doCommon(x)
case x :: y :: _ => doBeforeLast(x); doLast(y)
}
Or in general for N-before-last (before you edit your question again :D):
val it = Source.fromFile("filename").getLines.sliding(n+1)
it.foreach {
case x :: _ if it.hasNext => doCommon(x)
case tail => doNBeforeLast(tail)
}

Related

Scala, http4s, fs2. Why does fileupload using http4s with fs2 reads just one line while reading as bytes reads completely

I have the following snippet using multi part file uploads. One of the parts reads as byte while the other reads string. One that reads as Byte shows the correct size while the one reading as String reads only one line. What am I missing ?
val routes = HttpRoutes.of[IO] {
case GET -> Root => Ok(html.index())
case req # POST -> Root / "generate" =>
req.decode[Multipart[IO]] { m =>
m.parts.find(_.name == Some("template")) match {
case None => BadRequest("Missing template file")
case Some(templatePart) => {
val templateByteStream = for {
byte <- templatePart.body
} yield byte
m.parts.find(_.name == Some("data")) match {
case Some(datapart) =>
val dataLineStream = for {
line <- datapart.body.through(utf8.decode)
} yield line
Ok {
for {
templateBytes <- templateByteStream.compile.toList
datalines <- dataLineStream.compile.toList
templateSize = templateBytes.size
dataLineCount = datalines.size
} yield s"template size $templateSize && dataline count $dataLineCount"
}
case None => BadRequest("Missing data file")
}
}
}
}
}
Solution:
Because you are reading the whole data as a single string, maybe you
also wanted to through(text.lines)
Credits: "Luis Miguel Mejía Suárez"

How can I emit periodic results over an iteration?

I might have something like this:
val found = source.toCharArray.foreach{ c =>
// Process char c
// Sometimes (e.g. on newline) I want to emit a result to be
// captured in 'found'. There may be 0 or more captured results.
}
This shows my intent. I want to iterate over some collection of things. Whenever the need arrises I want to "emit" a result to be captured in found. It's not a direct 1-for-1 like map. collect() is a "pull", applying a partial function over the collection. I want a "push" behavior, where I visit everything but push out something when needed.
Is there a pattern or collection method I'm missing that does this?
Apparently, you have a Collection[Thing], and you want to obtain a new Collection[Event] by emitting a Collection[Event] for each Thing. That is, you want a function
(Collection[Thing], Thing => Collection[Event]) => Collection[Event]
That's exactly what flatMap does.
You can write it down with nested fors where the second generator defines what "events" have to be "emitted" for each input from the source. For example:
val input = "a2ba4b"
val result = (for {
c <- input
emitted <- {
if (c == 'a') List('A')
else if (c.isDigit) List.fill(c.toString.toInt)('|')
else Nil
}
} yield emitted).mkString
println(result)
prints
A||A||||
because each 'a' emits an 'A', each digit emits the right amount of tally marks, and all other symbols are ignored.
There are several other ways to express the same thing, for example, the above expression could also be rewritten with an explicit flatMap and with a pattern match instead of if-else:
println(input.flatMap{
case 'a' => "A"
case d if d.isDigit => "|" * (d.toString.toInt)
case _ => ""
})
I think you are looking for a way to build a Stream for your condition. Streams are lazy and are computed only when required.
val sourceString = "sdfdsdsfssd\ndfgdfgd\nsdfsfsggdfg\ndsgsfgdfgdfg\nsdfsffdg\nersdff\n"
val sourceStream = sourceString.toCharArray.toStream
def foundStreamCreator( source: Stream[Char], emmitBoundaryFunction: Char => Boolean): Stream[String] = {
def loop(sourceStream: Stream[Char], collector: List[Char]): Stream[String] =
sourceStream.isEmpty match {
case true => collector.mkString.reverse #:: Stream.empty[String]
case false => {
val char = sourceStream.head
emmitBoundaryFunction(char) match {
case true =>
collector.mkString.reverse #:: loop(sourceStream.tail, List.empty[Char])
case false =>
loop(sourceStream.tail, char :: collector)
}
}
}
loop(source, List.empty[Char])
}
val foundStream = foundStreamCreator(sourceStream, c => c == '\n')
val foundIterator = foundStream.toIterator
foundIterator.next()
// res0: String = sdfdsdsfssd
foundIterator.next()
// res1: String = dfgdfgd
foundIterator.next()
// res2: String = sdfsfsggdfg
It looks like foldLeft to me:
val found = ((List.empty[String], "") /: source.toCharArray) {case ((agg, tmp), char) =>
if (char == '\n') (tmp :: agg, "") // <- emit
else (agg, tmp + char)
}._1
Where you keep collecting items in a temporary location and then emit it when you run into a character signifying something. Since I used List you'll have to reverse at the end if you want it in order.

Generate all IP addresses given a string in scala

I was trying my hand at writing an IP generator given a string of numbers. The generator would take as an input a string of number such as "17234" and will return all possible list of ips as follows:
1.7.2.34
1.7.23.4
1.72.3.4
17.2.3.4
I attempted to write a snippet to do the generation as follows:
def genip(ip:String):Unit = {
def legal(ip:String):Boolean = (ip.size == 1) || (ip.size == 2) || (ip.size == 3)
def genips(ip:String,portion:Int,accum:String):Unit = portion match {
case 1 if legal(ip) => println(accum+ip)
case _ if portion > 1 => {
genips(ip.drop(1),portion-1,if(accum.size == 0) ip.take(1)+"." else accum+ip.take(1)+".")
genips(ip.drop(2),portion-1,if(accum.size == 0) ip.take(2)+"." else accum+ip.take(2)+".")
genips(ip.drop(3),portion-1,if(accum.size == 0) ip.take(3)+"." else accum+ip.take(3)+".")
}
case _ => return
}
genips(ip,4,"")
}
The idea is to partition the string into four octets and then further partition the octet into strings of size "1","2" and "3" and then recursively descend into the remaining string.
I am not sure if I am on the right track but it would be great if somebody could suggest a more functional way of accomplishing the same.
Thanks
Here is an alternative version of the attached code:
def generateIPs(digits : String) : Seq[String] = generateIPs(digits, 4)
private def generateIPs(digits : String, partsLeft : Int) : Seq[String] = {
if ( digits.size < partsLeft || digits.size > partsLeft * 3) {
Nil
} else if(partsLeft == 1) {
Seq(digits)
} else {
(1 to 3).map(n => generateIPs(digits.drop(n), partsLeft - 1)
.map(digits.take(n) + "." + _)
).flatten
}
}
println("Results:\n" + generateIPs("17234").mkString("\n"))
Major changes:
Methods now return the collection of strings (rather than Unit), so they are proper functions (rather than working of side effects) and can be easily tested;
Avoiding repeating the same code 3 times depending on the size of the bunch of numbers we take;
Not passing accumulated interim result as a method parameter - in this case it doesn't have sense since you'll have at most 4 recursive calls and it's easier to read without it, though as you're loosing the tail recursion in many case it might be reasonable to leave it.
Note: The last map statement is a good candidate to be replaced by for comprehension, which many developers find easier to read and reason about, though I will leave it as an exercise :)
You code is the right idea; I'm not sure making it functional really helps anything, but I'll show both functional and side-effecting ways to do what you want. First, we'd like a good routine to chunk off some of the numbers, making sure an okay number are left for the rest of the chunking, and making sure they're in range for IPs:
def validSize(i: Int, len: Int, more: Int) = i + more <= len && i + 3*more >= len
def chunk(s: String, more: Int) = {
val parts = for (i <- 1 to 3 if validSize(i, s.length, more)) yield s.splitAt(i)
parts.filter(_._1.toInt < 256)
}
Now we need to use chunk recursively four times to generate the possibilities. Here's a solution that is mutable internally and iterative:
def genIPs(digits: String) = {
var parts = List(("", digits))
for (i <- 1 to 4) {
parts = parts.flatMap{ case (pre, post) =>
chunk(post, 4-i).map{ case (x,y) => (pre+x+".", y) }
}
}
parts.map(_._1.dropRight(1))
}
Here's one that recurses using Iterator:
def genIPs(digits: String) = Iterator.iterate(List((3,"",digits))){ _.flatMap{
case(j, pre, post) => chunk(post, j).map{ case(x,y) => (j-1, pre+x+".", y) }
}}.dropWhile(_.head._1 >= 0).next.map(_._2.dropRight(1))
The logic is the same either way. Here it is working:
scala> genIPs("1238516")
res2: List[String] = List(1.23.85.16, 1.238.5.16, 1.238.51.6,
12.3.85.16, 12.38.5.16, 12.38.51.6,
123.8.5.16, 123.8.51.6, 123.85.1.6)

Idiomatic "do until" collection updating

Scenario:
val col: IndexedSeq[Array[Char]] = for (i <- 1 to n) yield {
val x = for (j <- 1 to m) yield 'x'
x.toArray
}
This is a fairly simple char matrix. toArray used to allow updating.
var west = last.x - 1
while (west >= 0 && arr(last.y)(west) == '.') {
arr(last.y)(west) = ch;
west -= 1;
}
This is updating all . to ch until a non-dot char is found.
Generically, update until stop condition is met, unknown number of steps.
What is the idiomatic equivalent of it?
Conclusion
It's doable, but the trade-off isn't worth it, a lot of performance is lost to expressive syntax when the collection allows updating.
Your wish for a "cleaner, more idiomatic" solution is of course a little fuzzy, because it leaves a lot of room for subjectivity. In general, I'd consider a tail-recursive updating routine more idiomatic, but it might not be "cleaner" if you're more familiar with a non-functional programming style. I came up with this:
#tailrec
def update(arr:List[Char], replace:Char, replacement:Char, result:List[Char] = Nil):List[Char] = arr match {
case `replace` :: tail =>
update(tail, replace, replacement, replacement :: result)
case _ => result.reverse ::: arr
}
This takes one of the inner sequences (assuming a List for easier pattern matching, since Arrays are trivially convertible to lists), and replaces the replace char with the replacement recursively.
You can then use map to update the outer sequence, like so:
col.map { x => update(x, '.', ch) }
Another more reusable alternative is writing your own mapUntil, or using one which is implemented in a supplemental library (Scalaz probably has something like it). The one I came up with looks like this:
def mapUntil[T](input:List[T])(f:(T => Option[T])) = {
#tailrec
def inner(xs:List[T], result:List[T]):List[T] = xs match {
case Nil => Nil
case head :: tail => f(head) match {
case None => (head :: result).reverse ::: tail
case Some(x) => inner(tail, x :: result)
}
}
inner(input, Nil)
}
It does the same as a regular map invocation, except that it stops as soon as the passed function returns None, e.g.
mapUntil(List(1,2,3,4)) {
case x if x >= 3 => None
case x => Some(x-1)
}
Will result in
List[Int] = List(0, 1, 3, 4)
If you want to look at Scalaz, this answer might be a good place to start.
x3ro's answer is the right answer, esp. if you care about performance or are going to be using this operation in multiple places. I would like to add simple solution using only what you find in the collections API:
col.map { a =>
val (l, r) = a.span(_ == '.')
l.map {
case '.' => ch
case x => x
} ++ r
}

Break or shortcircuit a fold in Scala

I have written a simple depth-first search in Scala with a recursive function like that:
search(labyrinth, path, goal)
where labyrinth is a specification of the problem (as graph or whatever), path is a list that holds the path taken so far and goal is a specification of the goal state. The function returns a path to the goal as a List and Nil if no path can be found.
The function expands, e.g. finds all suitable next nodes (candidates) and then has to recursively call itself.
I do this by
candidates.foldLeft(Nil){
(solution, next) =>
if( solution == Nil )
search( labyrinth, next :: path, goal )
else
solution
}
Please note that I have omitted some unescessary details. Everything is working fine so far. But once a solution is found inside the foldLeft call, this solution gets simply copied by the else part of the if-statement. Is there a way to avoid this by breaking the foldLeft or maybe by using a different function instead of foldLeft? Actually I could probably write a version of foldLeft which breaks once "not Nil" is returned myself. But is there one inside the API?
I'm not sure I understand the desire to short-circuit the loop. Is it expensive to iterate through the candidates? Is the candidates list potentially large?
Maybe you could use the "find" method:
candidates.find { c =>
Nil != search( labyrinth, c :: path, goal )
} match {
case Some(c) => c :: path
case None => Nil
}
If the potential depth of the search space is large, you could overflow your stack (fitting, given this site name). But, that is a topic for another post.
For giggles, here is an actual runnable implementation. I had to introduce a local mutable variable (fullPath) mostly out of laziness, but I'm sure you could take those out.
object App extends Application {
// This impl searches for a specific factor in a large int
type SolutionNode = Int
case class SearchDomain(number: Int) {
def childNodes(l: List[Int]): List[Int] = {
val num = if (l.isEmpty) number else l.head
if (num > 2) {
(2 to (num - 1)) find {
n => (num % n)==0
} match {
case Some(i) => List(i, num / i)
case None => List()
}
}
else List()
}
}
type DesiredResult = Int
def search ( labyrinth: SearchDomain, path: List[SolutionNode], goal: DesiredResult ): List[SolutionNode] = {
if ( !path.isEmpty && path.head == goal ) return path
if ( path.isEmpty ) return search(labyrinth, List(labyrinth.number), goal)
val candidates: List[SolutionNode] = labyrinth.childNodes(path)
var fullPath: List[SolutionNode] = List()
candidates.find { c =>
fullPath = search( labyrinth, c :: path, goal )
!fullPath.isEmpty
} match {
case Some(c) => fullPath
case None => Nil
}
}
// Is 5 a factor of 800000000?
val res = search(SearchDomain(800000000), Nil, 5)
println(res)
}
I like Mitch Blevins solution, as it is a perfect match for your algorithm. You may be interested in my own solution to another maze problem.
Ok so there are a lot of interpretations here, your question lacks some degrees of specificity.
I will try to point out all assumptions as I go along.
Existing Function: (First assumptions: Type signatures)
type Labyrinth // Not particularly important to logic
type Position // The grid "position" of the path
def search(labyrinth: Labyrinth, path: List[Position], goal: Position): List[Position]
The function expands, e.g. finds all suitable next nodes (candidates) and then has to recursively call itself.
Assumption: the following code is within the search function
def search(labyrinth: Labyrinth, path: List[Position], goal: Position): List[Position] = {
...
return candidates.foldLeft(Nil){
(solution, next) =>
if( solution == Nil )
search( labyrinth, next :: path, goal )
else
solution
}
}
Please note that I have omitted some unescessary details.
If my assumptions have been correct, then perhaps, but I still had to deduce some things.
Never assume the necessity of implementation details when asking for assistance on implementation, specificity makes answering questions far simpler.
Is there a way to avoid this by breaking the foldLeft or maybe by using a different function instead of foldLeft?
Why not use recursion?
def search(labyrinth: Labyrinth, path: List[Position], goal: Position): List[Position] = {
...
val candidates: List[Position] = ... // However you calculate those
def resultPathOpt(candidates: List[position]): Option[List[Position]] = {
candidates match {
// No candidates
case Nil => None
// At least one candidate
case firstCandidate :: restOfCandidates => {
// Search
search(labyrinth, firstCandidate :: path, goal) match {
// Search didn't pan out, no more candidates, done
case Nil if restOfCandidates.isEmpty => None
// Search didn't pan out, try other candidates
case Nil => resultPathOpt(restOfCandidates)
// Search panned out, done
case validPath => Some(validPath)
}
}
}
val maybeValidResult: Option[List[Position]] = resultPathOpt(candidates)
maybeValidResult.getOrElse(List.empty)
}
Now as soon as we find a "valid" path, note there is no check for this in the logic I have provided, so I am not quite sure if this would ever end as you have implemented it and as I have implemented this, but the concepts are still valid.