Scala: Print separators when using output stream - scala

When we need an array of strings to be concatenated, we can use mkString method:
val concatenatedString = listOfString.mkString
However, when we have a very long list of string, getting concatenated string may not be a good choice. In this case, It would be more appropriated to print out to an output stream directly, Writing it to output stream is simple:
listOfString.foreach(outstream.write _)
However, I don't know a neat way to append separators. One thing I tried is looping with an index:
var i = 0
for(str <- listOfString) {
if(i != 0) outstream.write ", "
outstream.write str
i += 1
}
This works, but it is too wordy. Although I can make a function encapsules the code above, I want to know whether Scala API already has a function do the same thing or not.
Thank you.

Here is a function that do what you want in a bit more elegant way:
def commaSeparated(list: List[String]): Unit = list match {
case List() =>
case List(a) => print(a)
case h::t => print(h + ", ")
commaSeparated(t)
}
The recursion avoids mutable variables.
To make it even more functional style, you can pass in the function that you want to use on each item, that is:
def commaSeparated(list: List[String], func: String=>Unit): Unit = list match {
case List() =>
case List(a) => func(a)
case h::t => func(h + ", ")
commaSeparated(t, func)
}
And then call it by:
commaSeparated(mylist, oustream.write _)

I believe what you want is the overloaded definitions of mkString.
Definitions of mkString:
scala> val strList = List("hello", "world", "this", "is", "bob")
strList: List[String] = List(hello, world, this, is, bob)
def mkString: String
scala> strList.mkString
res0: String = helloworldthisisbob
def mkString(sep: String): String
scala> strList.mkString(", ")
res1: String = hello, world, this, is, bob
def mkString(start: String, sep: String, end: String): String
scala> strList.mkString("START", ", ", "END")
res2: String = STARThello, world, this, is, bobEND
EDIT
How about this?
scala> strList.view.map(_ + ", ").foreach(print) // or .iterator.map
hello, world, this, is, bob,

Not good for parallelized code, but otherwise:
val it = listOfString.iterator
it.foreach{x => print(x); if (it.hasNext) print(' ')}

Here's another approach which avoids the var
listOfString.zipWithIndex.foreach{ case (s, i) =>
if (i != 0) outstream write ","
outstream write s }

Self Answer:
I wrote a function encapsulates the code in the original question:
implicit def withSeparator[S >: String](seq: Seq[S]) = new {
def withSeparator(write: S => Any, sep: String = ",") = {
var i = 0
for (str <- seq) {
if (i != 0) write(sep)
write(str)
i += 1
}
seq
}
}
You can use it like this:
listOfString.withSeparator(print _)
The separator can also be assigned:
listOfString.withSeparator(print _, ",\n")
Thank you for everyone answered me. What I wanted to use is a concise and not too slow representation. The implicit function withSeparator looks like the thing I wanted. So I accept my own answer for this question. Thank you again.

Related

Inverting case in Scala

for (i <- marker to cursor - 1 ){
if (buffer.charAt(i).isUpper){
buffer.charAt(i).toString.toLowerCase
} else if (buffer.charAt(i).isLower) {
buffer.charAt(i).toString.toUpperCase
}
}
I've tried multiple methods to achieve but can't figure a solution and this is where I'm at. While trying other methods I used slice but couldn't get it to return a Bool for an if statement (Converted to a string but isUpper doesn't work on strings). Currently this does nothing to the strings, for context marker/cursor just highlight a selection on a sentence to invert.
Here is a one liner:
val s = "mixedUpperLower"
s.toUpperCase.zip (s).map {case (a, b) => if (a == b) a.toLower else a}.mkString ("")
res3: String = MIXEDuPPERlOWER
Maybe a short method is better readable:
scala> def invertCase (c: Char) : Char = if (c.isLower) c.toUpper else c.toLower
invertCase: (c: Char)Char
scala> s.map (invertCase)
res4: String = MIXEDuPPERlOWER
"aBcDef".map(x => if(x.isLower) x.toUpper else x.toLower)
prints
AbCdEF

Better safe get from an array in scala?

I want to get first argument for main method that is optional, something like this:
val all = args(0) == "all"
However, this would fail with exception if no argument is provided.
Is there any one-liner simple method to set all to false when args[0] is missing; and not doing the common if-no-args-set-false-else... thingy?
In general case you can use lifting:
args.lift(0).map(_ == "all").getOrElse(false)
Or even (thanks to #enzyme):
args.lift(0).contains("all")
You can use headOption and fold (on Option):
val all = args.headOption.fold(false)(_ == "all")
Of course, as #mohit pointed out, map followed by getOrElse will work as well.
If you really need indexed access, you could pimp a get method on any Seq:
implicit class RichIndexedSeq[V, T <% Seq[V]](seq: T) {
def get(i: Int): Option[V] =
if (i < 0 || i >= seq.length) None
else Some(seq(i))
}
However, if this is really about arguments, you'll be probably better off, handling arguments in a fold:
case class MyArgs(n: Int = 1, debug: Boolean = false,
file: Option[String] = None)
val myArgs = args.foldLeft(MyArgs()) {
case (args, "-debug") =>
args.copy(debug = true)
case (args, str) if str.startsWith("-n") =>
args.copy(n = ???) // parse string
case (args, str) if str.startsWith("-f") =>
args.copy(file = Some(???) // parse string
case _ =>
sys.error("Unknown arg")
}
if (myArgs.file.isEmpty)
sys.error("Need file")
You can use foldLeft with initial false value:
val all = (false /: args)(_ | _ == "all")
But be careful, One Liners can be difficult to read.
Something like this will work assuming args(0) returns Some or None:
val all = args(0).map(_ == "all").getOrElse(false)

Simple functionnal way for grouping successive elements? [duplicate]

I'm trying to 'group' a string into segments, I guess this example would explain it more succintly
scala> val str: String = "aaaabbcddeeeeeeffg"
... (do something)
res0: List("aaaa","bb","c","dd","eeeee","ff","g")
I can thnk of a few ways to do this in an imperative style (with vars and stepping through the string to find groups) but I was wondering if any better functional solution could
be attained? I've been looking through the Scala API but there doesn't seem to be something that fits my needs.
Any help would be appreciated
You can split the string recursively with span:
def s(x : String) : List[String] = if(x.size == 0) Nil else {
val (l,r) = x.span(_ == x(0))
l :: s(r)
}
Tail recursive:
#annotation.tailrec def s(x : String, y : List[String] = Nil) : List[String] = {
if(x.size == 0) y.reverse
else {
val (l,r) = x.span(_ == x(0))
s(r, l :: y)
}
}
Seems that all other answers are very concentrated on collection operations. But pure string + regex solution is much simpler:
str split """(?<=(\w))(?!\1)""" toList
In this regex I use positive lookbehind and negative lookahead for the captured char
def group(s: String): List[String] = s match {
case "" => Nil
case s => s.takeWhile(_==s.head) :: group(s.dropWhile(_==s.head))
}
Edit: Tail recursive version:
def group(s: String, result: List[String] = Nil): List[String] = s match {
case "" => result reverse
case s => group(s.dropWhile(_==s.head), s.takeWhile(_==s.head) :: result)
}
can be used just like the other because the second parameter has a default value and thus doesnt have to be supplied.
Make it one-liner:
scala> val str = "aaaabbcddddeeeeefff"
str: java.lang.String = aaaabbcddddeeeeefff
scala> str.groupBy(identity).map(_._2)
res: scala.collection.immutable.Iterable[String] = List(eeeee, fff, aaaa, bb, c, dddd)
UPDATE:
As #Paul mentioned about the order here is updated version:
scala> str.groupBy(identity).toList.sortBy(_._1).map(_._2)
res: List[String] = List(aaaa, bb, c, dddd, eeeee, fff)
You could use some helper functions like this:
val str = "aaaabbcddddeeeeefff"
def zame(chars:List[Char]) = chars.partition(_==chars.head)
def q(chars:List[Char]):List[List[Char]] = chars match {
case Nil => Nil
case rest =>
val (thesame,others) = zame(rest)
thesame :: q(others)
}
q(str.toList) map (_.mkString)
This should do the trick, right? No doubt it can be cleaned up into one-liners even further
A functional* solution using fold:
def group(s : String) : Seq[String] = {
s.tail.foldLeft(Seq(s.head.toString)) { case (carry, elem) =>
if ( carry.last(0) == elem ) {
carry.init :+ (carry.last + elem)
}
else {
carry :+ elem.toString
}
}
}
There is a lot of cost hidden in all those sequence operations performed on strings (via implicit conversion). I guess the real complexity heavily depends on the kind of Seq strings are converted to.
(*) Afaik all/most operations in the collection library depend in iterators, an imho inherently unfunctional concept. But the code looks functional, at least.
Starting Scala 2.13, List is now provided with the unfold builder which can be combined with String::span:
List.unfold("aaaabbaaacdeeffg") {
case "" => None
case rest => Some(rest.span(_ == rest.head))
}
// List[String] = List("aaaa", "bb", "aaa", "c", "d", "ee", "ff", "g")
or alternatively, coupled with Scala 2.13's Option#unless builder:
List.unfold("aaaabbaaacdeeffg") {
rest => Option.unless(rest.isEmpty)(rest.span(_ == rest.head))
}
// List[String] = List("aaaa", "bb", "aaa", "c", "d", "ee", "ff", "g")
Details:
Unfold (def unfold[A, S](init: S)(f: (S) => Option[(A, S)]): List[A]) is based on an internal state (init) which is initialized in our case with "aaaabbaaacdeeffg".
For each iteration, we span (def span(p: (Char) => Boolean): (String, String)) this internal state in order to find the prefix containing the same symbol and produce a (String, String) tuple which contains the prefix and the rest of the string. span is very fortunate in this context as it produces exactly what unfold expects: a tuple containing the next element of the list and the new internal state.
The unfolding stops when the internal state is "" in which case we produce None as expected by unfold to exit.
Edit: Have to read more carefully. Below is no functional code.
Sometimes, a little mutable state helps:
def group(s : String) = {
var tmp = ""
val b = Seq.newBuilder[String]
s.foreach { c =>
if ( tmp != "" && tmp.head != c ) {
b += tmp
tmp = ""
}
tmp += c
}
b += tmp
b.result
}
Runtime O(n) (if segments have at most constant length) and tmp.+= probably creates the most overhead. Use a string builder instead for strict runtime in O(n).
group("aaaabbcddeeeeeeffg")
> Seq[String] = List(aaaa, bb, c, dd, eeeeee, ff, g)
If you want to use scala API you can use the built in function for that:
str.groupBy(c => c).values
Or if you mind it being sorted and in a list:
str.groupBy(c => c).values.toList.sorted

Scala: how to traverse stream/iterator collecting results into several different collections

I'm going through log file that is too big to fit into memory and collecting 2 type of expressions, what is better functional alternative to my iterative snippet below?
def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String, String)]={
val lines : Iterator[String] = io.Source.fromFile(file).getLines()
val logins: mutable.Map[String, String] = new mutable.HashMap[String, String]()
val errors: mutable.ListBuffer[(String, String)] = mutable.ListBuffer.empty
for (line <- lines){
line match {
case errorPat(date,ip)=> errors.append((ip,date))
case loginPat(date,user,ip,id) =>logins.put(ip, id)
case _ => ""
}
}
errors.toList.map(line => (logins.getOrElse(line._1,"none") + " " + line._1,line._2))
}
Here is a possible solution:
def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String,String)] = {
val lines = Source.fromFile(file).getLines
val (err, log) = lines.collect {
case errorPat(inf, ip) => (Some((ip, inf)), None)
case loginPat(_, _, ip, id) => (None, Some((ip, id)))
}.toList.unzip
val ip2id = log.flatten.toMap
err.collect{ case Some((ip,inf)) => (ip2id.getOrElse(ip,"none") + "" + ip, inf) }
}
Corrections:
1) removed unnecessary types declarations
2) tuple deconstruction instead of ulgy ._1
3) left fold instead of mutable accumulators
4) used more convenient operator-like methods :+ and +
def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String, String)] = {
val lines = io.Source.fromFile(file).getLines()
val (logins, errors) =
((Map.empty[String, String], Seq.empty[(String, String)]) /: lines) {
case ((loginsAcc, errorsAcc), next) =>
next match {
case errorPat(date, ip) => (loginsAcc, errorsAcc :+ (ip -> date))
case loginPat(date, user, ip, id) => (loginsAcc + (ip -> id) , errorsAcc)
case _ => (loginsAcc, errorsAcc)
}
}
// more concise equivalent for
// errors.toList.map { case (ip, date) => (logins.getOrElse(ip, "none") + " " + ip) -> date }
for ((ip, date) <- errors.toList)
yield (logins.getOrElse(ip, "none") + " " + ip) -> date
}
I have a few suggestions:
Instead of a pair/tuple, it's often better to use your own class. It gives meaningful names to both the type and its fields, which makes the code much more readable.
Split the code into small parts. In particular, try to decouple pieces of code that don't need to be tied together. This makes your code easier to understand, more robust, less prone to errors and easier to test. In your case it'd be good to separate producing your input (lines of a log file) and consuming it to produce a result. For example, you'd be able to make automatic tests for your function without having to store sample data in a file.
As an example and exercise, I tried to make a solution based on Scalaz iteratees. It's a bit longer (includes some auxiliary code for IteratorEnumerator) and perhaps it's a bit overkill for the task, but perhaps someone will find it helpful.
import java.io._;
import scala.util.matching.Regex
import scalaz._
import scalaz.IterV._
object MyApp extends App {
// A type for the result. Having names keeps things
// clearer and shorter.
type LogResult = List[(String,String)]
// Represents a state of our computation. Not only it
// gives a name to the data, we can also put here
// functions that modify the state. This nicely
// separates what we're computing and how.
sealed case class State(
logins: Map[String,String],
errors: Seq[(String,String)]
) {
def this() = {
this(Map.empty[String,String], Seq.empty[(String,String)])
}
def addError(date: String, ip: String): State =
State(logins, errors :+ (ip -> date));
def addLogin(ip: String, id: String): State =
State(logins + (ip -> id), errors);
// Produce the final result from accumulated data.
def result: LogResult =
for ((ip, date) <- errors.toList)
yield (logins.getOrElse(ip, "none") + " " + ip) -> date
}
// An iteratee that consumes lines of our input. Based
// on the given regular expressions, it produces an
// iteratee that parses the input and uses State to
// compute the result.
def logIteratee(errorPat: Regex, loginPat: Regex):
IterV[String,List[(String,String)]] = {
// Consumes a signle line.
def consume(line: String, state: State): State =
line match {
case errorPat(date, ip) => state.addError(date, ip);
case loginPat(date, user, ip, id) => state.addLogin(ip, id);
case _ => state
}
// The core of the iteratee. Every time we consume a
// line, we update our state. When done, compute the
// final result.
def step(state: State)(s: Input[String]): IterV[String, LogResult] =
s(el = line => Cont(step(consume(line, state))),
empty = Cont(step(state)),
eof = Done(state.result, EOF[String]))
// Return the iterate waiting for its first input.
Cont(step(new State()));
}
// Converts an iterator into an enumerator. This
// should be more likely moved to Scalaz.
// Adapted from scalaz.ExampleIteratee
implicit val IteratorEnumerator = new Enumerator[Iterator] {
#annotation.tailrec def apply[E, A](e: Iterator[E], i: IterV[E, A]): IterV[E, A] = {
val next: Option[(Iterator[E], IterV[E, A])] =
if (e.hasNext) {
val x = e.next();
i.fold(done = (_, _) => None, cont = k => Some((e, k(El(x)))))
} else
None;
next match {
case None => i
case Some((es, is)) => apply(es, is)
}
}
}
// main ---------------------------------------------------
{
// Read a file as an iterator of lines:
// val lines: Iterator[String] =
// io.Source.fromFile("test.log").getLines();
// Create our testing iterator:
val lines: Iterator[String] = Seq(
"Error: 2012/03 1.2.3.4",
"Login: 2012/03 user 1.2.3.4 Joe",
"Error: 2012/03 1.2.3.5",
"Error: 2012/04 1.2.3.4"
).iterator;
// Create an iteratee.
val iter = logIteratee("Error: (\\S+) (\\S+)".r,
"Login: (\\S+) (\\S+) (\\S+) (\\S+)".r);
// Run the the iteratee against the input
// (the enumerator is implicit)
println(iter(lines).run);
}
}

String concatenation gone functional

Suppose there are 3 strings:
protein, starch, drink
Concatenating those, we could say what is for dinner:
Example:
val protein = "fish"
val starch = "chips"
val drink = "wine"
val dinner = protein + ", " + starch + ", " + drink
But what if something was missing, for example the protein, because my wife couldn't catch anything. Then, we will have: ,chips, drink for dinner.
There is a slick way to concatenate the strings to optionally add the commas - I just don't know what it is ;-). Does anyone have a nice idea?
I'm looking for something like:
val dinner = protein +[add a comma if protein is not lenth of zero] + starch .....
It's just a fun exercise I'm doing, so now sweat if it can't be done in some cool way. The reason that I'm trying to do the conditional concatenation in a single assignment, is because I'm using this type of thing a lot in XML and a nice solution will make things..... nicer.
When you say "it may be absent", this entity's type should be Option[T]. Then,
def dinner(components: List[Option[String]]) = components.flatten mkString ", "
You would invoke it like this:
scala> dinner(None :: Some("chips") :: Some("wine") :: Nil)
res0: String = chips, wine
In case you absolutely want checking a string's emptiness,
def dinner(strings: List[String]) = strings filter {_.nonEmpty} mkString ", "
scala> dinner("" :: "chips" :: "wine" :: Nil)
res1: String = chips, wine
You're looking for mkString on collections, maybe.
val protein = "fish"
val starch = "chips"
val drink = "wine"
val complete = List (protein, starch, drink)
val partly = List (protein, starch)
complete.mkString (", ")
partly.mkString (", ")
results in:
res47: String = fish, chips, wine
res48: String = fish, chips
You may even specify a start and end:
scala> partly.mkString ("<<", ", ", ">>")
res49: String = <<fish, chips>>
scala> def concat(ss: String*) = ss filter (_.nonEmpty) mkString ", "
concat: (ss: String*)String
scala> concat("fish", "chips", "wine")
res0: String = fish, chips, wine
scala> concat("", "chips", "wine")
res1: String = chips, wine
scala>
This takes care of the case of empty strings and also shows how you could put other logic for filtering and formatting. This will work fine for a List[String] and generalizes to List[Any].
val input = List("fish", "", "chips", 137, 32, 32.0, None, "wine")
val output = input.flatMap{ _ match {
case None => None
case x:String if !x.nonEmpty => None
case x:String => Some(x)
case _ => None
}}
.mkString(",")
res1: String = fish,chips,wine
The idea is that flatMap takes a List[Any] and uses matching to assign None for any elements that you do not want to keep in the output. The Nones get flattened away and the Somes stay.
If you needed to be able to handle different types (Int, Double, etc) then you could add more cases.
println(s"$protein,$starch,$drink")