i have this very small requirement for learning purposes.
suppose we have the following string
"1.1 This is a test 34"
where
"1.1" is the chapter
"This is a test" is the title of the chapter
"34" is the page number
The overall result should give me some indication whether or not "parsed" line is ok.
Right now its only working for "well formed" lines (this is on purpose).
So far i have 2 ways of solving this...
1) Monad approach (though im not entirely sure this is done right, therefore my question)
trait Mine[A] {
def get(): A
def map(f: A => A): Mine[A]
def flatMap(f: A => Mine[A]): Mine[A]
}
case class Attempt1[A, B](a: (A, B)) extends Mine[(A, B)] {
def get(): (A, B) = a
def map(f: ((A, B)) => (A, B)): Mine[(A, B)] = {
Attempt1(f(a._1, a._2))
}
def flatMap(f: ((A, B)) => Mine[(A, B)]): Mine[(A, B)] = {
f(a._1, a._2)
}
}
and i also have following functions for getting text out of my "string" line
def getChapter2(t: (Result, String)): Mine[(Result, String)] = {
val result = t._1
val state = t._2
result.chapter = state.substring(0, 3)
var newState = state.substring(3)
Attempt1((result, newState))
}
def getTitle2(t: (Result, String)): Mine[(Result, String)] = {
val result = t._1
val state = t._2
result.title = state.substring(0, state.length() - 2)
var newState = state.substring(state.length() - 2)
Attempt1((result, newState))
}
def getPage2(t: (Result, String)): Mine[(Result, String)] = {
val result = t._1
val state = t._2
result.page = state
Attempt1((result, ""))
}
I can think of trying to use a higher order function for code that gets values "out" of Tuple2 and creation of Attempt1 stuff, but right now i want to keep things simple, the important thing for me is the monad stuff.
And finally this is the main logic.
var line = "1.1 Some awesome book 12"
val result = new Result("", "", "")
val at1 = Attempt1((result, line))
val r = for (
o1 <- at1;
o2 <- getChapter2(o1);
o3 <- getTitle2(o2);
o4 <- getPage2(o3)
) yield (o4)
val res = r.get._1
println("chapter " + res.chapter) //1.1
println("title " + res.title) // Some awesome book
println("page " + res.page) // 12
2) Compositional approach
def getChapter(t: (Result, String)): (Result, String) = {
val result = t._1
val state = t._2
result.chapter = state.substring(0, 3)
var newState = state.substring(3)
(result, newState)
}
def getTitle(t: (Result, String)): (Result, String) = {
val result = t._1
val state = t._2
result.title = state.substring(0, state.length() - 2)
var newState = state.substring(state.length() - 2)
(result, newState)
}
def getPage(t: (Result, String)): (Result, String) = {
val result = t._1
val state = t._2
result.page = state
(result, "")
}
as u can see functions are the same except on the return type (not "wrapped" by a Mine type), and i also have this method
def process(s: String, f: ((Result, String)) => (Result, String)): Result = {
val res = new Result("", "", "")
val t = f(res, s)
res
}
and my main logic is as follows
var line = "1.1 Some awesome book 12"
var fx = getChapter _ andThen getTitle _ andThen getPage
var resx = process(line, fx)
printf("title: %s%nchapter: %s%npage: %s%n", resx.title, resx.chapter, resx.page)
returnes values are the same as the "Monad approach".
So finally the questions would be:
Is "Monad approach" really a Monad??
I find compositional approach logic easier, and for this particular case Monad approach might seem overkill but remember this is for learning purposes.
I find that in both approaches logic flow is easy to reason.
If needed in both cases is easy to add or even remove a step in order to parse a string line.
I know this code is so similar and has room for improvement but right now im keeping it simple and maybe in the future i will factor commong things out.
Suggestions are welcome.
First off there is no need for the vars in your code. Second, since you are using substring function of a string, all you need is one partial function which takes the offsets of the sub-strings. This would be a good place to start when refactoring and would allow varied functionality for how you split the lines should the format change.
This would look like
def splitline(method:String)(symbol:String)(s:String) = method match {
case "substring" => val symb = Integer.parseInt(symbol) ;(s.substring(0,symb),s.substring(symb))
}
val getTitle = splitline("substring")("3") _
In terms of composition or monadic code, this falls on preference and the cognitive
load you wish to place on yourself.
Related
I want to generate a List of some class which contains several fields. One of them is Int type and it doesn’t have to repeat. Could you help me to write the code?
I tried next:
case class Person(name: String, age: Int)
implicit val genPerson: Gen[Person] =
for {
name <- arbitrary[String]
age <- Gen.posNum[Int]
} yield Person(name, age)
implicit val genListOfPerson: Gen[scala.List[Person]] = Gen.listOfN(3, genPerson)
The problem is that I got an instance of a person with equal age.
If you're requiring that no two Persons in the generated list have the same age, you can
implicit def IntsArb: Arbitrary[Int] = Arbitrary(Gen.choose[Int](0, Int.MaxValue))
implicit val StringArb: Arbitrary[String] = Arbitrary(Gen.listOfN(5, Gen.alphaChar).map(_.mkString))
implicit val PersonGen = Arbitrary(Gen.resultOf(Person.apply _))
implicit val PersonsGen: Arbitrary[List[Person]] =
Arbitrary(Gen.listOfN(3, PersonGen.arbitrary).map { persons =>
val grouped: Map[Int, List[Person]] = persons.groupBy(_.age)
grouped.values.map(_.head) // safe because groupBy
})
Note that this will return a List with no duplicate ages but there's no guarantee that the list will have size 3 (it is guaranteed that the list will be nonempty, with size at most 3).
If having a list of size 3 is important, at the risk of generation failing if the "dice are against you", you can have something like:
def uniqueAges(persons: List[Person], target: Int): Gen[List[Person]] = {
val grouped: Map[Int, List[Person]] = persons.groupBy(_.age)
val uniquelyAged = grouped.values.map(_.head)
val n = uniquelyAged.size
if (n == target) Gen.const(uniquelyAged)
else {
val existingAges = grouped.keySet
val genPerson = PersonGen.arbitrary.retryUntil { p => !existingAges(p.age) }
Gen.listOf(target - n, genPerson)
.flatMap(l => uniqueAges(l, target - n))
.map(_ ++ uniquelyAged)
}
}
implicit val PersonsGen: Arbitrary[List[Person]] =
Arbitrary(Gen.listOfN(3, PersonGen.arbitrary).flatMap(l => uniqueAges(l, 3)))
You can do it as follows:
implicit def IntsArb: Arbitrary[Int] = Arbitrary(Gen.choose[Int](0, Int.MaxValue))
implicit val StringArb: Arbitrary[String] = Arbitrary(Gen.listOfN(5, Gen.alphaChar).map(_.mkString))
implicit val PersonGen = Arbitrary(Gen.resultOf(Person.apply _))
implicit val PersonsGen: Arbitrary[List[Person]] = Arbitrary(Gen.listOfN(3, PersonGen.arbitrary))
I have an expensive function which I want to run as few times as possible with the following requirement:
I have several input values to try
If the function returns a value below a given threshold, I don't want to try other inputs
if no result is below the threshold, I want to take the result with the minimal output
I could not find a nice solution using Iterator's takeWhile/dropWhile, because I want to have the first matching element included. just ended up with the following solution:
val pseudoResult = Map("a" -> 0.6,"b" -> 0.2, "c" -> 1.0)
def expensiveFunc(s:String) : Double = {
pseudoResult(s)
}
val inputsToTry = Seq("a","b","c")
val inputIt = inputsToTry.iterator
val results = mutable.ArrayBuffer.empty[(String, Double)]
val earlyAbort = 0.5 // threshold
breakable {
while (inputIt.hasNext) {
val name = inputIt.next()
val res = expensiveFunc(name)
results += Tuple2(name,res)
if (res<earlyAbort) break()
}
}
println(results) // ArrayBuffer((a,0.6), (b,0.2))
val (name, bestResult) = results.minBy(_._2) // (b, 0.2)
If i set val earlyAbort = 0.1, the result should still be (b, 0.2) without evaluating all the cases again.
You can make use of Stream to achieve what you are looking for, remember Stream is some kind of lazy collection, that evaluate operations on demand.
Here is the scala Stream documentation.
You only need to do this:
val pseudoResult = Map("a" -> 0.6,"b" -> 0.2, "c" -> 1.0)
val earlyAbort = 0.5
def expensiveFunc(s: String): Double = {
println(s"Evaluating for $s")
pseudoResult(s)
}
val inputsToTry = Seq("a","b","c")
val results = inputsToTry.toStream.map(input => input -> expensiveFunc(input))
val finalResult = results.find { case (k, res) => res < earlyAbort }.getOrElse(results.minBy(_._2))
If find does not get any value, you can use the same stream to find the min, and the function is not evaluated again, this is because of memoization:
The Stream class also employs memoization such that previously computed values are converted from Stream elements to concrete values of type A
Consider that this code will fail if the original collection was empty, if you want to support empty collections you should replace minBy with sortBy(_._2).headOption and getOrElse by orElse:
val finalResultOpt = results.find { case (k, res) => res < earlyAbort }.orElse(results.sortBy(_._2).headOption)
And the output for this is:
Evaluating for a
Evaluating for b
finalResult: (String, Double) = (b,0.2)
finalResultOpt: Option[(String, Double)] = Some((b,0.2))
The clearest, simplest, thing to do is fold over the input, passing forward only the current best result.
val inputIt :Iterator[String] = inputsToTry.iterator
val earlyAbort = 0.5 // threshold
inputIt.foldLeft(("",Double.MaxValue)){ case (low,name) =>
if (low._2 < earlyAbort) low
else Seq(low, (name, expensiveFunc(name))).minBy(_._2)
}
//res0: (String, Double) = (b,0.2)
It calls on expensiveFunc() only as many times as is needed, but it does walk through the entire input iterator. If that's still too onerous (lots of input) then I'd go with a tail-recursive method.
val inputIt :Iterator[String] = inputsToTry.iterator
val earlyAbort = 0.5 // threshold
def bestMin(low :(String,Double) = ("",Double.MaxValue)) :(String,Double) = {
if (inputIt.hasNext) {
val name = inputIt.next()
val res = expensiveFunc(name)
if (res < earlyAbort) (name, res)
else if (res < low._2) bestMin((name,res))
else bestMin(low)
} else low
}
bestMin() //res0: (String, Double) = (b,0.2)
Use view in your input list:
try the following:
val pseudoResult = Map("a" -> 0.6, "b" -> 0.2, "c" -> 1.0)
def expensiveFunc(s: String): Double = {
println(s"executed for ${s}")
pseudoResult(s)
}
val inputsToTry = Seq("a", "b", "c")
val earlyAbort = 0.5 // threshold
def doIt(): List[(String, Double)] = {
inputsToTry.foldLeft(List[(String, Double)]()) {
case (n, name) =>
val res = expensiveFunc(name)
if(res < earlyAbort) {
return n++List((name, res))
}
n++List((name, res))
}
}
val (name, bestResult) = doIt().minBy(_._2)
println(name)
println(bestResult)
The output:
executed for a
executed for b
b
0.2
As you can see, only a and b are evaluated, and not c.
This is one of the use-cases for tail-recursion:
import scala.annotation.tailrec
val pseudoResult = Map("a" -> 0.6,"b" -> 0.2, "c" -> 1.0)
def expensiveFunc(s:String) : Double = {
pseudoResult(s)
}
val inputsToTry = Seq("a","b","c")
val earlyAbort = 0.5 // threshold
#tailrec
def f(s: Seq[String], result: Map[String, Double] = Map()): Map[String, Double] = s match {
case Nil => result
case h::t =>
val expensiveCalculation = expensiveFunc(h)
val intermediateResult = result + (h -> expensiveCalculation)
if(expensiveCalculation < earlyAbort) {
intermediateResult
} else {
f(t, intermediateResult)
}
}
val result = f(inputsToTry)
println(result) // Map(a -> 0.6, b -> 0.2)
val (name, bestResult) = f(inputsToTry).minBy(_._2) // ("b", 0.2)
If you implement takeUntil and use it, you'd still have to go through the list once more to get the lowest one if you don't find what you are looking for. Probably a better approach would be to have a function that combines find with reduceOption, returning early if something is found or else returning the result of reducing the collection to a single item (in your case, finding the smallest one).
The result is comparable with what you could achieve using a Stream, as highlighted in a previous reply, but avoids leveraging memoization, which can be cumbersome for very large collections.
A possible implementation could be the following:
import scala.annotation.tailrec
def findOrElse[A](it: Iterator[A])(predicate: A => Boolean,
orElse: (A, A) => A): Option[A] = {
#tailrec
def loop(elseValue: Option[A]): Option[A] = {
if (!it.hasNext) elseValue
else {
val next = it.next()
if (predicate(next)) Some(next)
else loop(Option(elseValue.fold(next)(orElse(_, next))))
}
}
loop(None)
}
Let's add our inputs to test this:
def f1(in: String): Double = {
println("calling f1")
Map("a" -> 0.6, "b" -> 0.2, "c" -> 1.0, "d" -> 0.8)(in)
}
def f2(in: String): Double = {
println("calling f2")
Map("a" -> 0.7, "b" -> 0.6, "c" -> 1.0, "d" -> 0.8)(in)
}
val inputs = Seq("a", "b", "c", "d")
As well as our helpers:
def apply[IN, OUT](in: IN, f: IN => OUT): (IN, OUT) =
in -> f(in)
def threshold[A](a: (A, Double)): Boolean =
a._2 < 0.5
def compare[A](a: (A, Double), b: (A, Double)): (A, Double) =
if (a._2 < b._2) a else b
We can now run this and see how it goes:
val r1 = findOrElse(inputs.iterator.map(apply(_, f1)))(threshold, compare)
val r2 = findOrElse(inputs.iterator.map(apply(_, f2)))(threshold, compare)
val r3 = findOrElse(Map.empty[String, Double].iterator)(threshold, compare)
r1 is Some(b, 0.2), r2 is Some(b, 0.6) and r3 is (reasonably) None. In the first case, since we use a lazy iterator and terminate early, we only invoke f1 twice.
You can have a look at the results and can play with this code here on Scastie.
Starting my first project with Scala: a poker framework.
So I have the following class
class Card(rank1: CardRank, suit1: Suit){
val rank = rank1
val suit = suit1
}
And a Utils object which contains two methods that do almost the same thing: they count number of cards for each rank or suit
def getSuits(cards: List[Card]) = {
def getSuits(cards: List[Card], suits: Map[Suit, Int]): (Map[Suit, Int]) = {
if (cards.isEmpty)
return suits
val suit = cards.head.suit
val value = if (suits.contains(suit)) suits(suit) + 1 else 1
getSuits(cards.tail, suits + (suit -> value))
}
getSuits(cards, Map[Suit, Int]())
}
def getRanks(cards: List[Card]): Map[CardRank, Int] = {
def getRanks(cards: List[Card], ranks: Map[CardRank, Int]): Map[CardRank, Int] = {
if (cards isEmpty)
return ranks
val rank = cards.head.rank
val value = if (ranks.contains(rank)) ranks(rank) + 1 else 1
getRanks(cards.tail, ranks + (rank -> value))
}
getRanks(cards, Map[CardRank, Int]())
}
Is there any way I can "unify" these two methods in a single one with "field/method-as-parameter"?
Thanks
Yes, that would require high order function (that is, function that takes function as parameter) and type parameters/genericity
def groupAndCount[A,B](elements: List[A], toCount: A => B): Map[B, Int] = {
// could be your implementation, just note key instead of suit/rank
// and change val suit = ... or val rank = ...
// to val key = toCount(card.head)
}
then
def getSuits(cards: List[Card]) = groupAndCount(cards, {c : Card => c.suit})
def getRanks(cards: List[Card]) = groupAndCount(cards, {c: Card => c.rank})
You do not need type parameter A, you could force the method to work only on Card, but that would be a pity.
For extra credit, you can use two parameter lists, and have
def groupAndCount[A,B](elements: List[A])(toCount: A => B): Map[B, Int] = ...
that is a little peculiarity of scala with type inference, if you do with two parameters lists, you will not need to type the card argument when defining the function :
def getSuits(cards: List[Card]) = groupAndCount(cards)(c => c.suit)
or just
def getSuits(cards: List[Card] = groupAndCount(cards)(_.suit)
Of course, the library can help you with the implementation
def groupAndCount[A,B](l: List[A])(toCount: A => B) : Map[A,B] =
l.groupBy(toCount).map{case (k, elems) => (k, elems.length)}
although a hand made implementation might be marginally faster.
A minor note, Card should be declared a case class :
case class Card(rank: CardRank, suit: Suit)
// declaration done, nothing else needed
I'm trying to implement a new type, Chunk, that is similar to a Map. Basically, a "Chunk" is either a mapping from String -> Chunk, or a string itself.
Eg it should be able to work like this:
val m = new Chunk("some sort of value") // value chunk
assert(m.getValue == "some sort of value")
val n = new Chunk("key" -> new Chunk("value"), // nested chunks
"key2" -> new Chunk("value2"))
assert(n("key").getValue == "value")
assert(n("key2").getValue == "value2")
I have this mostly working, except that I am a little confused by how the + operator works for immutable maps.
Here is what I have now:
class Chunk(_map: Map[String, Chunk], _value: Option[String]) extends Map[String, Chunk] {
def this(items: (String, Chunk)*) = this(items.toMap, None)
def this(k: String) = this(new HashMap[String, Chunk], Option(k))
def this(m: Map[String, Chunk]) = this(m, None)
def +[B1 >: Chunk](kv: (String, B1)) = throw new Exception(":( do not know how to make this work")
def -(k: String) = new Chunk(_map - k, _value)
def get(k: String) = _map.get(k)
def iterator = _map.iterator
def getValue = _value.get
def hasValue = _value.isDefined
override def toString() = {
if (hasValue) getValue
else "Chunk(" + (for ((k, v) <- this) yield k + " -> " + v.toString).mkString(", ") + ")"
}
def serialize: String = {
if (hasValue) getValue
else "{" + (for ((k, v) <- this) yield k + "=" + v.serialize).mkString("|") + "}"
}
}
object main extends App {
val m = new Chunk("message_info" -> new Chunk("message_type" -> new Chunk("boom")))
val n = m + ("c" -> new Chunk("boom2"))
}
Also, comments on whether in general this implementation is appropriate would be appreciated.
Thanks!
Edit: The algebraic data types solution is excellent, but there remains one issue.
def +[B1 >: Chunk](kv: (String, B1)) = Chunk(m + kv) // compiler hates this
def -(k: String) = Chunk(m - k) // compiler is pretty satisfied with this
The - operator here seems to work, but the + operator really wants me to return something of type B1 (I think)? It fails with the following issue:
overloaded method value apply with alternatives: (map: Map[String,Chunk])MapChunk <and> (elems: (String, Chunk)*)MapChunk cannot be applied to (scala.collection.immutable.Map[String,B1])
Edit2:
Xiefei answered this question -- extending map requires that I handle + with a supertype (B1) of Chunk, so in order to do this I have to have some implementation for that, so this will suffice:
def +[B1 >: Chunk](kv: (String, B1)) = m + kv
However, I don't ever really intend to use that one, instead, I will also include my implementation that returns a chunk as follows:
def +(kv: (String, Chunk)):Chunk = Chunk(m + kv)
How about an Algebraic data type approach?
abstract sealed class Chunk
case class MChunk(elems: (String, Chunk)*) extends Chunk with Map[String,Chunk] {
val m = Map[String, Chunk](elems:_*)
def +[B1 >: Chunk](kv: (String, B1)) = m + kv
def -(k: String) = m - k
def iterator = m.iterator
def get(s: String) = m.get(s)
}
case class SChunk(s: String) extends Chunk
// A 'Companion' object that provides 'constructors' and extractors..
object Chunk {
def apply(s: String) = SChunk(s)
def apply(elems: (String, Chunk)*) = MChunk(elems: _*)
// just a couple of ideas...
def unapply(sc: SChunk) = Option(sc).map(_.value)
def unapply(smc: (String, MChunk)) = smc match {
case (s, mc) => mc.get(s)
}
}
Which you can use like:
val simpleChunk = Chunk("a")
val nestedChunk = Chunk("b" -> Chunk("B"))
// Use extractors to get the values.
val Chunk(s) = simpleChunk // s will be the String "a"
val Chunk(c) = ("b" -> nestedChunk) // c will be a Chunk: Chunk("B")
val Chunk(c) = ("x" -> nestedChunk) // will throw a match error, because there's no "x"
// pattern matching:
("x" -> mc) match {
case Chunk(w) => Some(w)
case _ => None
}
The unapply extractors are just a suggestion; hopefully you can mess with this idea till you get what you want.
The way it's written, there's no way to enforce that it can't be both a Map and a String at the same time. I would be looking at capturing the value using Either and adding whatever convenience methods you require:
case class Chunk(value:Either[Map[String,Chunk],String]) {
...
}
That will also force you to think about what you really need to do in situations such as adding a key/value pair to a Chunk that represents a String.
Have you considered using composition instead of inheritance? So, instead of Chunk extending Map[String, Chunk] directly, just have Chunk internally keep an instance of Map[String, Chunk] and provide the extra methods that you need, and otherwise delegating to the internal map's methods.
def +(kv: (String, Chunk)):Chunk = new Chunk(_map + kv, _value)
override def +[B1 >: Chunk](kv: (String, B1)) = _map + kv
What you need is a new + method, and also implement the one declared in Map trait.
I currently have a method that uses a scala.collection.mutable.PriorityQueue to combine elements in a certain order. For instance the code looks a bit like this:
def process[A : Ordering](as: Set[A], f: (A, A) => A): A = {
val queue = new scala.collection.mutable.PriorityQueue[A]() ++ as
while (queue.size > 1) {
val a1 = queue.dequeue
val a2 = queue.dequeue
queue.enqueue(f(a1, a2))
}
queue.dequeue
}
The code works as written, but is necessarily pretty imperative. I thought of using a SortedSet instead of the PriorityQueue, but my attempts make the process look a lot messier. What is a more declarative, succinct way of doing what I want to do?
If f doesn't produce elements that are already in the Set, you can indeed use a SortedSet. (If it does, you need an immutable priority queue.) A declarative wayto do this would be:
def process[A:Ordering](s:SortedSet[A], f:(A,A)=>A):A = {
if (s.size == 1) s.head else {
val fst::snd::Nil = s.take(2).toList
val newSet = s - fst - snd + f(fst, snd)
process(newSet, f)
}
}
Tried to improve #Kim Stebel's answer, but I think imperative variant is still more clear.
def process[A:Ordering](s: Set[A], f: (A, A) => A): A = {
val ord = implicitly[Ordering[A]]
#tailrec
def loop(lst: List[A]): A = lst match {
case result :: Nil => result
case fst :: snd :: rest =>
val insert = f(fst, snd)
val (more, less) = rest.span(ord.gt(_, insert))
loop(more ::: insert :: less)
}
loop(s.toList.sorted(ord.reverse))
}
Here's a solution with SortedSet and Stream:
def process[A : Ordering](as: Set[A], f: (A, A) => A): A = {
Stream.iterate(SortedSet.empty ++ as)( ss =>
ss.drop(2) + f(ss.head, ss.tail.head))
.takeWhile(_.size > 1).last.head
}