Unpacking a tuple directly into a class in Scala

Scala gives you the ability to unpack a tuple into multiple local variables when performing various operations. For example, if I have some data
val infos = Array(("Matt", "Awesome"), ("Matt's Brother", "Just OK"))
then instead of doing something ugly like
infos.map{ person_info => person_info._1 + " is " + person_info._2 }
I can choose the much more elegant
infos.map{ case (person, status) => person + " is " + status }
One thing I've often wondered about is how to directly unpack the tuple into, say, the arguments to be used in a class constructor. I'm imagining something like this:
case class PersonInfo(person: String, status: String)
infos.map{ case (p: PersonInfo) => p.person + " is " + p.status }
or even better if PersonInfo has methods:
infos.map{ case (p: PersonInfo) => p.verboseStatus() }
But of course this doesn't work. Apologies if this has already been asked -- I haven't been able to find a direct answer -- is there a way to do this?

I believe you can get to the methods, at least in Scala 2.11.x. Also, if you haven't heard of it, you should check out The Neophyte's Guide to Scala Part 1: Extractors.
The whole 16-part series is fantastic, but Part 1 deals with case classes, pattern matching and extractors, which is what I think you are after.
Also, I get that java.lang.String complaint in IntelliJ as well; it defaults to that for reasons that are not entirely clear to me. I was able to work around it by explicitly setting the type in the typical "postfix style", i.e. _: String. There must be some way to work around that though.
object Demo {
  case class Person(name: String, status: String) {
    def verboseStatus() = s"The status of $name is $status"
  }

  val peeps = Array(("Matt", "Alive"), ("Soraya", "Dead"))

  peeps.map {
    case p @ (_: String, _: String) => Person.tupled(p).verboseStatus()
  }
}
UPDATE:
So after seeing a few of the other answers, I was curious whether there were any performance differences between them. So I set up what I think might be a reasonable test, using an Array of 1,000,000 random string tuples; each implementation is run 100 times:
import scala.util.Random

object Demo extends App {

  // Utility Code
  def randomTuple(): (String, String) = {
    val random = new Random
    (random.nextString(5), random.nextString(5))
  }

  def timer[R](code: => R)(implicit runs: Int): Unit = {
    var total = 0L
    (1 to runs).foreach { i =>
      val t0 = System.currentTimeMillis()
      code
      val t1 = System.currentTimeMillis()
      total += (t1 - t0)
    }
    println(s"Time to perform code block ${total / runs}ms\n")
  }

  // Setup
  case class Person(name: String, status: String) {
    def verboseStatus() = s"$name is $status"
  }

  object PersonInfoU {
    def unapply(x: (String, String)) = Some(Person(x._1, x._2))
  }

  val infos = Array.fill[(String, String)](1000000)(randomTuple)

  // Timer
  implicit val runs: Int = 100

  println("Using two map operations")
  timer {
    infos.map(Person.tupled).map(_.verboseStatus)
  }

  println("Pattern matching and calling tupled")
  timer {
    infos.map {
      case p @ (_: String, _: String) => Person.tupled(p).verboseStatus()
    }
  }

  println("Another pattern matching without tupled")
  timer {
    infos.map {
      case (name, status) => Person(name, status).verboseStatus()
    }
  }

  println("Using unapply in a companion object that takes a tuple parameter")
  timer {
    infos.map { case PersonInfoU(p) => p.name + " is " + p.status }
  }
}
/*Results
Using two map operations
Time to perform code block 208ms
Pattern matching and calling tupled
Time to perform code block 130ms
Another pattern matching without tupled
Time to perform code block 130ms
WINNER
Using unapply in a companion object that takes a tuple parameter
Time to perform code block 69ms
*/
Assuming my test is sound, it seems the unapply in a companion-ish object was ~2x faster than the pattern matching, and pattern matching another ~1.5x faster than two maps. Each implementation probably has its use cases/limitations.
I'd appreciate it if anyone who sees anything glaringly dumb in my testing strategy would let me know (and sorry about that var). Thanks!
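One caveat worth flagging on the methodology (a hedged sketch of mine, not part of the measured results above): System.currentTimeMillis timings with no JIT warm-up can penalise whichever block happens to run first. A small tweak to the timer, running the block a few unmeasured times before recording, could look like this:
def timerWithWarmup[R](code: => R, warmupRuns: Int = 5)(implicit runs: Int): Unit = {
  (1 to warmupRuns).foreach(_ => code) // warm-up, not measured
  var total = 0L
  (1 to runs).foreach { _ =>
    val t0 = System.currentTimeMillis()
    code
    val t1 = System.currentTimeMillis()
    total += (t1 - t0)
  }
  println(s"Time to perform code block ${total / runs}ms\n")
}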

The extractor for a case class takes an instance of the case class and returns a tuple of its fields. You can write an extractor which does the opposite:
object PersonInfoU {
  def unapply(x: (String, String)) = Some(PersonInfo(x._1, x._2))
}
infos.map { case PersonInfoU(p) => p.person + " is " + p.status }

You can use tupled for the case class:
val infos = Array(("Matt", "Awesome"), ("Matt's Brother", "Just OK"))
infos.map(PersonInfo.tupled)
scala> infos: Array[(String, String)] = Array((Matt,Awesome), (Matt's Brother,Just OK))
scala> res1: Array[PersonInfo] = Array(PersonInfo(Matt,Awesome), PersonInfo(Matt's Brother,Just OK))
and then you can use PersonInfo however you need.
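For example (a small usage sketch of mine, reusing the PersonInfo case class from the question), the resulting Array[PersonInfo] can simply be mapped again to reach the fields or any methods:
infos.map(PersonInfo.tupled).map(p => s"${p.person} is ${p.status}")
// Array("Matt is Awesome", "Matt's Brother is Just OK")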

You mean like this (Scala 2.11.8):
scala> :paste
// Entering paste mode (ctrl-D to finish)
case class PersonInfo(p: String)
Seq(PersonInfo("foo")) map {
  case p @ PersonInfo(info) => s"info=$info / ${p.p}"
}
// Exiting paste mode, now interpreting.
defined class PersonInfo
res4: Seq[String] = List(info=foo / foo)
Methods won't be possible by the way.

Several answers can be combined to produce a final, unified approach:
val infos = Array(("Matt", "Awesome"), ("Matt's Brother", "Just OK"))
object Person {
  case class Info(name: String, status: String) {
    def verboseStatus() = name + " is " + status
  }
  def unapply(x: (String, String)) = Some(Info(x._1, x._2))
}

infos.map{ case Person(p) => p.verboseStatus }
Of course in this small case it's overkill, but for more complex use cases this is the basic skeleton.
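For illustration (my own usage sketch, not part of the original answer), the same extractor also works when matching a single tuple, since unapply is just called with that tuple:
("Matt", "Awesome") match {
  case Person(info) => info.verboseStatus // "Matt is Awesome"
  case _            => "no match"         // unreachable here, but keeps the match total
}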

Combine multiple extractor objects to use in one match statement

Is it possible to run multiple extractors in one match statement?
object CoolStuff {
  def unapply(thing: Thing): Option[SomeInfo] = ...
}

object NeatStuff {
  def unapply(thing: Thing): Option[OtherInfo] = ...
}

// is there some syntax similar to this?
thing match {
  case t @ CoolStuff(someInfo) @ NeatStuff(otherInfo) => process(someInfo, otherInfo)
  case _ => // neither Cool nor Neat
}
The intent here being that there are two extractors, and I don't have to do something like this:
object CoolNeatStuff {
  def unapply(thing: Thing): Option[(SomeInfo, OtherInfo)] = thing match {
    case CoolStuff(someInfo) => thing match {
      case NeatStuff(otherInfo) => Some(someInfo -> otherInfo)
      case _ => None // Cool, but not Neat
    }
    case _ => None // neither Cool nor Neat
  }
}
You can try:
object ~ {
  def unapply[T](that: T): Option[(T, T)] = Some(that -> that)
}

def too(t: Thing) = t match {
  case CoolStuff(a) ~ NeatStuff(b) => ???
}
I've come up with a very similar solution, but I was a bit too slow, so I didn't post it as an answer. However, since @userunknown asks for an explanation of how it works, I'll dump my similar code here anyway, and add a few comments. Maybe someone finds it a valuable addition to cchantep's minimalistic solution (it looks... calligraphic? for some reason, in a good sense).
So, here is my similar, aesthetically less pleasing proposal:
object && {
  def unapply[A](a: A) = Some((a, a))
}

// added some definitions to make your question-code work
type Thing = String
type SomeInfo = String
type OtherInfo = String

object CoolStuff {
  def unapply(thing: Thing): Option[SomeInfo] = Some(thing.toLowerCase)
}

object NeatStuff {
  def unapply(thing: Thing): Option[OtherInfo] = Some(thing.toUpperCase)
}

def process(a: SomeInfo, b: OtherInfo) = s"[$a, $b]"

val res = "helloworld" match {
  case CoolStuff(someInfo) && NeatStuff(otherInfo) =>
    process(someInfo, otherInfo)
  case _ =>
}

println(res)
This prints
[helloworld, HELLOWORLD]
The idea is that identifiers (in particular, && and ~ in cchantep's code) can be used as infix operators in patterns. Therefore, the match-case
case CoolStuff(someInfo) && NeatStuff(otherInfo) =>
will be desugared into
case &&(CoolStuff(someInfo), NeatStuff(otherInfo)) =>
and then the unapply method of && will be invoked, which simply duplicates its input.
In my code, the duplication is achieved by a straightforward Some((a, a)). In cchantep's code, it is done with fewer parentheses: Some(t -> t). The arrow -> comes from ArrowAssoc, which in turn is provided as an implicit conversion in Predef. This is just a quick way to create pairs, usually used in maps:
Map("hello" -> 42, "world" -> 58)
Another remark: notice that && can be used multiple times:
case Foo(a) && Bar(b) && Baz(c) => ...
So... I don't know whether it's an answer or an extended comment to cchantep's answer, but maybe someone finds it useful.
For those who might miss the details of how this magic actually works, I just want to expand on the answers by @cchantep and @Andrey Tyukin (the comment section does not allow me to do that).
Running scalac with the -Xprint:parser option will give something along these lines (scalac 2.11.12):
def too(t: String) = t match {
  case $tilde(CoolStuff((a @ _)), NeatStuff((b @ _))) => $qmark$qmark$qmark
}
This basically shows the initial steps the compiler takes while parsing the source into an AST.
The important note here is that the rules for why the compiler makes this transformation are described in Infix Operation Patterns and Extractor Patterns. In particular, this allows you to use any object as long as it has an unapply method, for example CoolStuff(a) AndAlso NeatStuff(b). In the previous answers && and ~ were picked, but they are not the only valid identifiers available.
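As a quick illustration (a sketch of mine, reusing the CoolStuff, NeatStuff and process definitions from the answer above), an ordinary alphanumeric name works just as well as a symbolic one:
// Hypothetical extractor with a plain name; like && and ~, it simply duplicates its input.
object AndAlso {
  def unapply[A](a: A): Option[(A, A)] = Some((a, a))
}

"helloworld" match {
  case CoolStuff(someInfo) AndAlso NeatStuff(otherInfo) => process(someInfo, otherInfo)
  case _ => "no match"
}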
If running scalac with the option -Xprint:patmat, which is a special phase for translating pattern matching, one can see something similar to this:
def too(t: String): Nothing = {
  case <synthetic> val x1: String = t;
  case9(){
    <synthetic> val o13: Option[(String, String)] = main.this.~.unapply[String](x1);
    if (o13.isEmpty.unary_!)
      {
        <synthetic> val p3: String = o13.get._1;
        <synthetic> val p4: String = o13.get._2;
        {
          <synthetic> val o12: Option[String] = main.this.CoolStuff.unapply(p3);
          if (o12.isEmpty.unary_!)
            {
              <synthetic> val o11: Option[String] = main.this.NeatStuff.unapply(p4);
              if (o11.isEmpty.unary_!)
                matchEnd8(scala.this.Predef.???)
Here ~.unapply will be called on the input parameter t, which will produce Some((t, t)). The tuple values will be extracted into the variables p3 and p4. Then CoolStuff.unapply(p3) will be called and, if the result is not None, NeatStuff.unapply(p4) will be called and also checked to be non-empty. If both are non-empty then, according to Variable Patterns, a and b will be bound to the results returned inside the corresponding Somes.
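To make that concrete, here is a hand-written sketch (mine, reusing the && extractor and the CoolStuff/NeatStuff/process definitions from the earlier answer; ~ behaves the same way) of roughly what the generated code does, expressed with Options:
def processedByHand(thing: String): Option[String] =
  &&.unapply(thing).flatMap { case (left, right) =>
    for {
      someInfo  <- CoolStuff.unapply(left)  // corresponds to o12 above
      otherInfo <- NeatStuff.unapply(right) // corresponds to o11 above
    } yield process(someInfo, otherInfo)
  }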

Filtering Lists in Scala's Monocle

Given the following code:
case class Person(name: String)
case class Group(group: List[Person])
val personLens = GenLens[Person]
val groupLens = GenLens[Group]
how can I "filter" out certain Persons from the selection, NOT by index but by a specific property of Person, like:
val trav: Traversal[Group, Person] = (groupLens(_.group) composeTraversal filterWith((x: Person) => /* expression of type Boolean here */))
I only found the filterIndex function, which only includes elements from the list based on their index, but this is not what I want.
filterIndex takes a function of type: (Int => Boolean)
and I want:
filterWith (a made-up name), that takes an (x => Boolean), where x has the type of the elements of the list, namely Person in this short example.
This seems so practical and common that I assume somebody has thought about it, and I (with my, I must admit, limited understanding of the matter) don't see why it can't be done.
Am I missing this functionality, is it not implemented yet, or is it just plainly impossible for whatever reason (please do explain if you have the time)?
Thank you.
A bad version
I'll start with a naive attempt to write something like this. I'm using a simple list version here, but you could get fancier (with Traverse or whatever) if you wanted.
import monocle.Traversal
import scalaz.Applicative, scalaz.std.list._, scalaz.syntax.traverse._
def filterWith[A](p: A => Boolean): Traversal[List[A], A] =
  new Traversal[List[A], A] {
    def modifyF[F[_]: Applicative](f: A => F[A])(s: List[A]): F[List[A]] =
      s.filter(p).traverse(f)
  }
And then:
import monocle.macros.GenLens
case class Person(name: String)
case class Group(group: List[Person])
val personLens = GenLens[Person]
val groupLens = GenLens[Group]
val aNames = groupLens(_.group).composeTraversal(filterWith(_.name.startsWith("A")))
val group = Group(List(Person("Al"), Person("Alice"), Person("Bob")))
And finally:
scala> aNames.getAll(group)
res0: List[Person] = List(Person(Al), Person(Alice))
It works!
Why it's bad
It works, except…
scala> import monocle.law.discipline.TraversalTests
import monocle.law.discipline.TraversalTests
scala> TraversalTests(filterWith[String](_.startsWith("A"))).all.check
+ Traversal.get what you set: OK, passed 100 tests.
+ Traversal.headOption: OK, passed 100 tests.
! Traversal.modify id = id: Falsified after 2 passed tests.
> Labels of failing property:
Expected List(崡) but got List()
> ARG_0: List(崡)
! Traversal.modifyF Id = Id: Falsified after 2 passed tests.
> Labels of failing property:
Expected List(ᜱ) but got List()
> ARG_0: List(ᜱ)
+ Traversal.set idempotent: OK, passed 100 tests.
Three out of five isn't very good.
A slightly better version
Let's start over:
def filterWith2[A](p: A => Boolean): Traversal[List[A], A] =
  new Traversal[List[A], A] {
    def modifyF[F[_]: Applicative](f: A => F[A])(s: List[A]): F[List[A]] =
      s.traverse {
        case a if p(a) => f(a)
        case a => Applicative[F].point(a)
      }
  }

val aNames2 = groupLens(_.group).composeTraversal(filterWith2(_.name.startsWith("A")))
val aNames2 = groupLens(_.group).composeTraversal(filterWith2(_.name.startsWith("A")))
And then:
scala> aNames2.getAll(group)
res1: List[Person] = List(Person(Al), Person(Alice))
scala> TraversalTests(filterWith2[String](_.startsWith("A"))).all.check
+ Traversal.get what you set: OK, passed 100 tests.
+ Traversal.headOption: OK, passed 100 tests.
+ Traversal.modify id = id: OK, passed 100 tests.
+ Traversal.modifyF Id = Id: OK, passed 100 tests.
+ Traversal.set idempotent: OK, passed 100 tests.
Okay, better!
Why it's still bad
The "real" laws for Traversal aren't encoded in Monocle's TraversalLaws (at least not at the moment), and we additionally want something like this to hold:
For any f: A => A and g: A => A, t.modify(f.compose(g)) should equal t.modify(f).compose(t.modify(g)).
Let's try it:
scala> val graduate: Person => Person = p => Person("Dr. " + p.name)
graduate: Person => Person = <function1>
scala> val kill: Person => Person = p => Person(p.name + ", deceased")
kill: Person => Person = <function1>
scala> aNames2.modify(kill.compose(graduate))(group)
res2: Group = Group(List(Person(Dr. Al, deceased), Person(Dr. Alice, deceased), Person(Bob)))
scala> aNames2.modify(kill).compose(aNames2.modify(graduate))(group)
res3: Group = Group(List(Person(Dr. Al), Person(Dr. Alice), Person(Bob)))
So we're out of luck again. The only way our filterWith could actually be lawful is if we promise never to use it with an argument to modify that might change the result of the predicate.
This is why filterIndex is legit—its predicate takes as an argument something that modify can't touch, so you can't break the t.modify(f.compose(g)) === t.modify(f).compose(t.modify(g)) law.
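For contrast, here is a minimal sketch of the lawful index-based traversal (assuming the scalaz-based Monocle used in this answer; the exact imports providing the FilterIndex instance vary between Monocle versions):
import monocle.function.all._ // filterIndex
import monocle.std.list._     // FilterIndex instance for List

// The predicate only sees the index, which modify can never change, so the composition law holds.
val firstTwo = groupLens(_.group).composeTraversal(filterIndex[List[Person], Int, Person](_ < 2))

firstTwo.getAll(group) // List(Person(Al), Person(Alice))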
Moral of the story
You could write an unlawful Traversal that does unlawful filtering stuff and use it all the time and it's pretty likely that it will never hurt you and that nobody will ever think you are a horrible person. So go for it, if you want. You'll probably never see a filterWith in a decent lens library, though.
You can use UnsafeSelect, https://www.optics.dev/Monocle/docs/unsafe_module.html#unsafeselect
import monocle.macros.GenLens
import org.scalatest.FunSuite
import monocle.function.all._
import monocle.unsafe.UnsafeSelect

case class Person(name: String, age: Int)
case class Group(group: List[Person])

class Example extends FunSuite {
  test("filter elements of list") {
    val group = Group(List(Person("adult1", 2), Person("adult2", 3), Person("child", 4)))
    val filteredGroup = (GenLens[Group](_.group) composeTraversal each composePrism UnsafeSelect.unsafeSelect(_.name.startsWith("adult")) composeLens GenLens[Person](_.age) set 18) (group)
    assert(filteredGroup.group.filter(_.name.startsWith("adult")).map(_.age) == List(18, 18))
  }
}

Filtering inside `for` with pattern matching

I am reading a TSV file and using something like this:
case class Entry(entryType: Int, value: Int)
def filterEntries(): Iterator[Entry] = {
  for {
    line <- scala.io.Source.fromFile("filename").getLines()
  } yield new Entry(line.split("\t").map(x => x.toInt))
}
Now I am interested both in filtering out entries whose entryType is set to 0 and in ignoring lines with a column count greater or less than 2 (which do not match the constructor). I was wondering if there's an idiomatic way to achieve this, maybe using pattern matching and an unapply method in a companion object. The only thing I can think of is using .filter on the resulting iterator.
I will also accept a solution not involving a for loop, as long as it returns Iterator[Entry]. The solutions must be tolerant of malformed inputs.
This is more state-of-arty:
package object liner {
  implicit class R(val sc: StringContext) {
    object r {
      def unapplySeq(s: String): Option[Seq[String]] = sc.parts.mkString.r unapplySeq s
    }
  }
}

package liner {
  case class Entry(entryType: Int, value: Int)

  object I {
    def unapply(s: String): Option[Int] = util.Try(s.toInt).toOption
  }

  object Test extends App {
    def lines = List("1 2", "3", "", " 4 5 ", "junk", "0, 100000", "6 7 8")
    def entries = lines flatMap {
      case r"""\s*${I(i)}(\d+)\s+${I(j)}(\d+)\s*""" if i != 0 => Some(Entry(i, j))
      case __________________________________________________ => None
    }
    Console println entries
  }
}
Hopefully, the regex interpolator will make it into the standard distro soon, but this shows how easy it is to rig up. Also hopefully, a scanf-style interpolator will allow easy extraction with case f"$i%d".
I just started using the "elongated wildcard" in patterns to align the arrows.
There is a pupal or maybe larval regex macro:
https://github.com/som-snytt/regextractor
You can create variables in the head of the for-comprehension and then use a guard:
edit: ensure length of array
for {
  line <- scala.io.Source.fromFile("filename").getLines()
  arr = line.split("\t").map(x => x.toInt)
  if arr.size == 2 && arr(0) != 0
} yield new Entry(arr(0), arr(1))
I have solved it using the following code:
import scala.util.{Try, Success}

val lines = List(
  "1\t2",
  "1\t",
  "2",
  "hello",
  "1\t3"
)

case class Entry(val entryType: Int, val value: Int)

object Entry {
  def unapply(line: String) = {
    line.split("\t").map(x => Try(x.toInt)) match {
      case Array(Success(entryType: Int), Success(value: Int)) => Some(Entry(entryType, value))
      case _ =>
        println("Malformed line: " + line)
        None
    }
  }
}

for {
  line <- lines
  entryOption = Entry.unapply(line)
  if entryOption.isDefined
} yield entryOption.get
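A terser way to consume the same Entry.unapply (just an alternative sketch, not part of the original solution) is to flatMap over the Option it returns:
// The Option acts as a zero-or-one-element collection, so None rows simply disappear.
val entries: List[Entry] = lines.flatMap(line => Entry.unapply(line))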
The left hand side of a <- or = in a for-loop may be a fully-fledged pattern. So you may write this:
def filterEntries(): Iterator[Entry] = for {
  line <- scala.io.Source.fromFile("filename").getLines()
  arr = line.split("\t").map(x => x.toInt)
  if arr.size == 2
  // now you may use pattern matching to extract the array
  Array(entryType, value) = arr
  if entryType != 0
} yield Entry(entryType, value)
Note that this solution will throw a NumberFormatException if a field is not convertible to an Int. If you do not want that, you'll have to encapsulate x.toInt with a Try and pattern match again.
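For instance, a hedged sketch of that Try-based variant (the exact shape is mine) could look like this:
import scala.util.Try

def filterEntries(): Iterator[Entry] = for {
  line <- scala.io.Source.fromFile("filename").getLines()
  // Fields that fail to parse are dropped, so the length guard also rejects malformed lines.
  arr = line.split("\t").flatMap(x => Try(x.toInt).toOption)
  if arr.length == 2
  Array(entryType, value) = arr
  if entryType != 0
} yield Entry(entryType, value)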

Scala extractors: a cumbersome example

Everything started from a couple of considerations:
Extractors are Scala objects that implement some unapply methods with certain peculiarities (directly from «Programming in Scala, 2nd edition», I've checked)
Objects are singletons, lazily initialised in the static scope
I've tried to implement a sort of «parametric extractor» in the form of case classes, to try to have an elegant pattern for SHA1 checking.
I'd like to check a list of SHA1s against a buffer to match which of them apply. I'd like to write something like this:
val sha1: Array[Byte] = ...
val sha2: Array[Byte] = ...
buffer match {
  case SHA1(sha1) => ...
  case SHA1(sha2) => ...
  ...
}
Ok, it looks weird, but don't bother now.
I've tried to solve the problem by simply implementing a case class like this
case class SHA1(sha1: Array[Byte]) {
  def unapply(buffer: Array[Byte]): Boolean = ...
}
and use it like
case SHA1(sha1)() =>
and even
case (SHA1(sha1)) =>
but it doesn't work: compiler fails.
Then I changed the code a little to:
val sha1 = SHA1(sha1)
val sha2 = SHA1(sha2)

buffer match {
  case sha1() => println("sha1 Match")
  case sha2() => println("sha2 Match")
  ...
}
and it works without any issue.
Questions are:
Q1: Are there any subtle implications in using this kind of «extractor»?
Q2: Provided the last example works, which syntax was I supposed to use to avoid defining temporary vals (if any is provided, given the compiler's job with match…case expressions)?
EDIT
The solution proposed by Aaron doesn't work either. A snippet:
case class SHA1(sha1: Array[Byte]) {
  def unapply(buffer: Array[Byte]) = buffer.length % 2 == 0
}

object Sha1Sample {
  def main(args: Array[String]) {
    println("Sha1 Sample")
    val b1: Array[Byte] = Array(0, 1, 2)
    val b2: Array[Byte] = Array(0, 1, 2, 3)
    val sha1 = SHA1(b1)
    List(b1, b2) map { b =>
      b match {
        case sha1() => println("Match") // works
        case `sha1` => println("Match") // compiles, but is semantically incorrect
        case SHA1(`b1`) => println("SOLVED") // won't compile
        case _ => println("Doesn't Match")
      }
    }
  }
}
Short answer: you need to put backticks around lowercase identifiers if you don't want them to be interpreted as pattern variables.
case SHA1(`sha1`) => // ...
See this question.
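A tiny self-contained sketch (mine, using Strings to keep equality simple) of the difference between a backticked identifier and a fresh pattern variable:
val expected = "foo"

"bar" match {
  case `expected` => "matched the pre-existing value of expected"
  case other      => s"bound a new variable: $other" // this branch is taken
}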

Scala: how to traverse stream/iterator collecting results into several different collections

I'm going through a log file that is too big to fit into memory, collecting 2 types of expressions. What is a better functional alternative to my iterative snippet below?
def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String, String)] = {
  val lines: Iterator[String] = io.Source.fromFile(file).getLines()
  val logins: mutable.Map[String, String] = new mutable.HashMap[String, String]()
  val errors: mutable.ListBuffer[(String, String)] = mutable.ListBuffer.empty

  for (line <- lines) {
    line match {
      case errorPat(date, ip) => errors.append((ip, date))
      case loginPat(date, user, ip, id) => logins.put(ip, id)
      case _ => ""
    }
  }

  errors.toList.map(line => (logins.getOrElse(line._1, "none") + " " + line._1, line._2))
}
Here is a possible solution:
def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String, String)] = {
  val lines = Source.fromFile(file).getLines
  val (err, log) = lines.collect {
    case errorPat(inf, ip) => (Some((ip, inf)), None)
    case loginPat(_, _, ip, id) => (None, Some((ip, id)))
  }.toList.unzip
  val ip2id = log.flatten.toMap
  err.collect { case Some((ip, inf)) => (ip2id.getOrElse(ip, "none") + " " + ip, inf) }
}
Corrections:
1) removed unnecessary type declarations
2) tuple deconstruction instead of ugly ._1
3) left fold instead of mutable accumulators
4) used more convenient operator-like methods :+ and +
def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String, String)] = {
  val lines = io.Source.fromFile(file).getLines()

  val (logins, errors) =
    ((Map.empty[String, String], Seq.empty[(String, String)]) /: lines) {
      case ((loginsAcc, errorsAcc), next) =>
        next match {
          case errorPat(date, ip) => (loginsAcc, errorsAcc :+ (ip -> date))
          case loginPat(date, user, ip, id) => (loginsAcc + (ip -> id), errorsAcc)
          case _ => (loginsAcc, errorsAcc)
        }
    }

  // more concise equivalent of
  // errors.toList.map { case (ip, date) => (logins.getOrElse(ip, "none") + " " + ip) -> date }
  for ((ip, date) <- errors.toList)
    yield (logins.getOrElse(ip, "none") + " " + ip) -> date
}
I have a few suggestions:
Instead of a pair/tuple, it's often better to use your own class. It gives meaningful names to both the type and its fields, which makes the code much more readable.
Split the code into small parts. In particular, try to decouple pieces of code that don't need to be tied together. This makes your code easier to understand, more robust, less prone to errors and easier to test. In your case it'd be good to separate producing your input (lines of a log file) and consuming it to produce a result. For example, you'd be able to make automatic tests for your function without having to store sample data in a file.
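As a plain-collections sketch of that decoupling idea (my own illustration, with hypothetical names), the parsing core can take an Iterator[String] while a thin wrapper feeds it from the file:
import java.io.File
import scala.util.matching.Regex

// Core logic: consumes any Iterator[String], so it can be unit-tested with in-memory data.
def collectData(lines: Iterator[String], errorPat: Regex, loginPat: Regex): List[(String, String)] = {
  val logins = scala.collection.mutable.Map.empty[String, String]
  val errors = scala.collection.mutable.ListBuffer.empty[(String, String)]
  lines.foreach {
    case errorPat(date, ip) => errors += ((ip, date))
    case loginPat(date, user, ip, id) => logins += (ip -> id)
    case _ => ()
  }
  errors.toList.map { case (ip, date) => (logins.getOrElse(ip, "none") + " " + ip, date) }
}

// Thin wrapper that wires a file to the core.
def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String, String)] =
  collectData(io.Source.fromFile(file).getLines(), errorPat, loginPat)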
As an example and exercise, I tried to make a solution based on Scalaz iteratees. It's a bit longer (includes some auxiliary code for IteratorEnumerator) and perhaps it's a bit overkill for the task, but perhaps someone will find it helpful.
import java.io._;
import scala.util.matching.Regex
import scalaz._
import scalaz.IterV._

object MyApp extends App {

  // A type for the result. Having names keeps things
  // clearer and shorter.
  type LogResult = List[(String,String)]

  // Represents a state of our computation. Not only does it
  // give a name to the data, we can also put here
  // functions that modify the state. This nicely
  // separates what we're computing and how.
  sealed case class State(
    logins: Map[String,String],
    errors: Seq[(String,String)]
  ) {
    def this() = {
      this(Map.empty[String,String], Seq.empty[(String,String)])
    }

    def addError(date: String, ip: String): State =
      State(logins, errors :+ (ip -> date));

    def addLogin(ip: String, id: String): State =
      State(logins + (ip -> id), errors);

    // Produce the final result from accumulated data.
    def result: LogResult =
      for ((ip, date) <- errors.toList)
        yield (logins.getOrElse(ip, "none") + " " + ip) -> date
  }

  // An iteratee that consumes lines of our input. Based
  // on the given regular expressions, it produces an
  // iteratee that parses the input and uses State to
  // compute the result.
  def logIteratee(errorPat: Regex, loginPat: Regex):
      IterV[String,List[(String,String)]] = {

    // Consumes a single line.
    def consume(line: String, state: State): State =
      line match {
        case errorPat(date, ip) => state.addError(date, ip);
        case loginPat(date, user, ip, id) => state.addLogin(ip, id);
        case _ => state
      }

    // The core of the iteratee. Every time we consume a
    // line, we update our state. When done, compute the
    // final result.
    def step(state: State)(s: Input[String]): IterV[String, LogResult] =
      s(el = line => Cont(step(consume(line, state))),
        empty = Cont(step(state)),
        eof = Done(state.result, EOF[String]))

    // Return the iteratee waiting for its first input.
    Cont(step(new State()));
  }

  // Converts an iterator into an enumerator. This
  // should probably be moved to Scalaz.
  // Adapted from scalaz.ExampleIteratee
  implicit val IteratorEnumerator = new Enumerator[Iterator] {
    @annotation.tailrec def apply[E, A](e: Iterator[E], i: IterV[E, A]): IterV[E, A] = {
      val next: Option[(Iterator[E], IterV[E, A])] =
        if (e.hasNext) {
          val x = e.next();
          i.fold(done = (_, _) => None, cont = k => Some((e, k(El(x)))))
        } else
          None;
      next match {
        case None => i
        case Some((es, is)) => apply(es, is)
      }
    }
  }

  // main ---------------------------------------------------
  {
    // Read a file as an iterator of lines:
    // val lines: Iterator[String] =
    //   io.Source.fromFile("test.log").getLines();

    // Create our testing iterator:
    val lines: Iterator[String] = Seq(
      "Error: 2012/03 1.2.3.4",
      "Login: 2012/03 user 1.2.3.4 Joe",
      "Error: 2012/03 1.2.3.5",
      "Error: 2012/04 1.2.3.4"
    ).iterator;

    // Create an iteratee.
    val iter = logIteratee("Error: (\\S+) (\\S+)".r,
                           "Login: (\\S+) (\\S+) (\\S+) (\\S+)".r);

    // Run the iteratee against the input
    // (the enumerator is implicit).
    println(iter(lines).run);
  }
}
}