How to use getOrElseUpdate in scala.collection.mutable.HashMap? - scala

The example code counts the occurrences of each word in a given input file:
import java.io.File
import java.util.Scanner
object Main {
  def main(args: Array[String]): Unit = {
    val counts = new scala.collection.mutable.HashMap[String, Int]
    val in = new Scanner(new File("input.txt"))
    while (in.hasNext()) {
      val s: String = in.next()
      counts(s) = counts.getOrElse(s, 0) + 1 // Here!
    }
    print(counts)
  }
}
Can the line marked with the comment be rewritten using the getOrElseUpdate method?
P.S. I am only at the 4th part of "Scala for the Impatient", so please don't teach me about functional Scala yet, which, I am sure, can be more beautiful here.
Thanks.

If you look at the documentation for getOrElseUpdate, you'll see the following:
If given key is already in this map, returns associated value.
Otherwise, computes value from given expression op, stores with key in
map and returns that value.
However, you need to modify the map on every occurrence of a word, not only when the key is missing, so getOrElseUpdate is not useful here.
Instead, you can define a default value that is returned when the key doesn't exist, and use it as follows:
import java.io.File
import java.util.Scanner
import scala.collection.mutable.HashMap
object Main {
  def main(args: Array[String]): Unit = {
    // Missing keys read as 0, so += works for the first occurrence too.
    val counts = HashMap[String, Int]().withDefaultValue(0)
    val in = new Scanner(new File("input.txt"))
    while (in.hasNext()) {
      val s: String = in.next()
      counts(s) += 1
    }
    print(counts)
  }
}
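For comparison, here is a minimal sketch of what getOrElseUpdate does on its own next to the withDefaultValue counting loop. The object name CountSketch and the in-memory word list are made up for illustration; no file is read.

```scala
import scala.collection.mutable

object CountSketch {
  // Hypothetical in-memory input standing in for the words of input.txt.
  val words = List("a", "b", "a", "c", "a")

  // getOrElseUpdate only inserts when the key is missing;
  // it never changes a value that is already stored.
  val cache = mutable.HashMap[String, Int]()
  val first = cache.getOrElseUpdate("a", 1)   // key absent: stores 1 and returns it
  val second = cache.getOrElseUpdate("a", 99) // key present: returns the stored 1

  // Counting needs an update on every occurrence,
  // which the default value makes concise.
  val counts = mutable.HashMap[String, Int]().withDefaultValue(0)
  for (w <- words) counts(w) += 1
}
```

So getOrElseUpdate fits caching or lazy initialisation better than counting, where the stored value has to change each time.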

Related

How to convert multiple strings in a list to be keys in a map

I am trying to write a function that returns a map in which every word is a key and the values are the pages on which the word shows up. Currently, I am stuck at the point where I have data of the following type: List(List(words),page).
Is there any sensible way to reformat this data? If so, please explain, as I have no idea how to even begin.
object G {
def main(args: Array[String]): Unit = {
stwórzIndeks()
}
def stwórzIndeks(): Unit= {
val linie = io.Source
.fromResource("tekst.txt")
.getLines
.toList
val zippedLinie: List[(String,Int)]=linie.zipWithIndex
val splitt=zippedLinie.foldLeft(List.empty[(List[String],Int)])((acc,curr)=>{
curr match {
case (arr,int) => {
val toAdd=(arr.split("\\s+").toList,zippedLinie.length-int)
toAdd+:acc
}
}
})
}
}
You can replace that foldLeft with a flatMap with an inner map to get one big List of (word, page) pairs.
val wordsAndPage = zippedLinie.flatMap {
  case (line, idx) =>
    // Parentheses are needed: word -> idx + 1 would parse as (word -> idx) + 1.
    line.split("\\s+").toList.map(word => word -> (idx + 1))
}
After that, you can look at the grouping methods in the Scaladoc (for example groupBy) to build the final map.
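As a rough sketch of that last step, assuming Scala 2.13 or later (where groupMap is available) and treating each line index plus one as the page number: the object name IndexSketch and the sample lines are hypothetical.

```scala
object IndexSketch {
  // Hypothetical sample input; in the question the lines come from tekst.txt.
  val lines = List("ala ma kota", "kot ma ale")

  // Flatten every line into (word, page) pairs.
  val wordsAndPage: List[(String, Int)] =
    lines.zipWithIndex.flatMap { case (line, idx) =>
      line.split("\\s+").toList.map(word => word -> (idx + 1))
    }

  // groupMap groups by the word and keeps only the page numbers.
  val index: Map[String, List[Int]] =
    wordsAndPage.groupMap(_._1)(_._2)
}
```

On older Scala versions, groupBy(_._1) followed by mapping each group to its page numbers should give the same shape.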

Using a variable a second time in a function does not return the same value?

I have started learning Scala and wrote the code below. My question: why does a val, which is supposed to be constant, give a different result when I pass it to the same function a second time? How do I write a pure function in Scala?
Also, any comments on whether this way of counting is right?
import java.io.FileNotFoundException
import java.io.IOException
import scala.io.BufferedSource
import scala.io.Source.fromFile
object Main{
def main(args: Array[String]): Unit = {
val fileName: String = if(args.length == 1) args(0) else ""
try {
val file = fromFile(fileName)
/* In file tekst.txt is 4 lines */
println(s"In file $fileName is ${countLines(file)} lines")
/* In file tekst.txt is 0 lines */
println(s"In file $fileName is ${countLines(file)} lines")
file.close
}
catch{
case e: FileNotFoundException => println(s"File $fileName not found")
case _: Throwable => println("Other error")
}
}
def countLines(file: BufferedSource): Long = {
file.getLines.count(_ => true)
}
}
val means that you cannot assign a new value to it. If the value is immutable (a number, an immutable collection, a tuple or a case class of other immutable things), then it will not change over its lifetime. If it is a val inside a function, it stays the same from the assignment until you leave that function. If it is a val in a class, it stays the same between all calls on an instance of that class. If it is in an object, it stays the same for the whole life of the program.
But if the object is itself mutable, then the only immutable part is the reference to it. If you have a val holding a mutable.MutableList, you cannot swap it for another mutable.MutableList, but you can modify the content of the list. Here:
val file = fromFile(fileName)
/* In file tekst.txt is 4 lines */
println(s"In file $fileName is ${countLines(file)} lines")
/* In file tekst.txt is 0 lines */
println(s"In file $fileName is ${countLines(file)} lines")
file.close
file is an immutable reference to a BufferedSource. You cannot replace it with another BufferedSource, but this class has internal state: it keeps track of how much of the file it has already read. So the first time you operate on it you receive the total number of lines in the file, and the second time (since the file has already been read) you receive 0.
If you want that code to be purer, you should contain the mutability so that it is not observable to the user, e.g.
def countFileLines(fileName: String): Either[String, Long] = try {
  // A fresh source is opened on every call, so no read position is shared.
  val file = fromFile(fileName)
  try {
    Right(file.getLines().count(_ => true))
  } finally {
    file.close()
  }
} catch {
  case e: FileNotFoundException => Left(s"File $fileName not found")
  case _: Throwable => Left("Other error")
}
println(s"In file $fileName is ${countFileLines(fileName)} lines")
println(s"In file $fileName is ${countFileLines(fileName)} lines")
Still, there are side effects here, so ideally this would be written using an IO monad. For now, remember that you should aim for referential transparency: if you could replace each call to countLines(file) with the value from val counted = countLines(file) without changing the program's behavior, the call would be referentially transparent. As you have seen, it isn't. So replace it with something that does not change behavior when called twice. One way to do that is to run the whole computation each time, without any state preserved between the calls (e.g. the internal position in BufferedSource). IO monads make that easier, so look into them once you feel comfortable with the syntax itself (to avoid learning too many things at once).
file.getLines returns an Iterator[String], and an iterator is consumable, meaning we can iterate over it only once. For example, consider
val it = Iterator("a", "b", "c")
it.count(_ => true)
// val res0: Int = 3
it.count(_ => true)
// val res1: Int = 0
Looking at the implementation of count
def count(p: A => Boolean): Int = {
var res = 0
val it = iterator
while (it.hasNext) if (p(it.next())) res += 1
res
}
notice the call to it.next(). This call advances the state of the iterator, and once that has happened we cannot go back to the previous state.
As an alternative you could try length instead of count
val it = Iterator("a", "b", "c")
it.length
// val res0: Int = 3
it.length
// val res1: Int = 3
Looking at the definition of length, which just delegates to size,
def size: Int = {
if (knownSize >= 0) knownSize
else {
val it = iterator
var len = 0
while (it.hasNext) { len += 1; it.next() }
len
}
}
notice the guard
if (knownSize >= 0) knownSize
Some collections know their size without having to compute it by iterating over them. For example,
Array(1,2,3).knownSize // 3: I know my size in advance
List(1,2,3).knownSize // -1: I do not know my size in advance so I have to traverse the whole collection to find it
So if the underlying concrete collection of the Iterator knows its size, then the call to length will short-circuit and it.next() will never execute, which means the iterator will not be consumed. This is the case for the default concrete collection used by the Iterator factory, which is an Array:
val it = Iterator("a", "b", "c")
it.getClass
// res6: Class[_ <: Iterator[String]] = class scala.collection.ArrayOps$ArrayIterator
However, this is not true for BufferedSource. To work around the issue, consider creating a new iterator each time countLines is called:
def countLines(fileName: String): Long = {
fromFile(fileName).getLines().length
}
println(s"In file $fileName is ${countLines(fileName)} lines")
println(s"In file $fileName is ${countLines(fileName)} lines")
// In file build.sbt is 22 lines
// In file build.sbt is 22 lines
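The version above never closes the source it opens. A minimal sketch of one way to handle that, assuming Scala 2.13 or later where scala.util.Using is available; the object name LineCountSketch is made up for illustration:

```scala
import scala.io.Source
import scala.util.{Try, Using}

object LineCountSketch {
  // Opens a fresh Source on every call and closes it afterwards.
  // Repeated calls give the same result as long as the file itself
  // does not change; failures (e.g. a missing file) end up in the Try.
  def countLines(fileName: String): Try[Int] =
    Using(Source.fromFile(fileName))(_.getLines().size)
}
```

Returning a Try keeps the error handling with the caller instead of printing inside the function.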
A final point regarding value definitions and immutability. Consider
object Foo { var x = 42 } // object contains mutable state
val foo = Foo // value definition
foo.x
// val res0: Int = 42
Foo.x = -11 // mutation happening here
foo.x
// val res1: Int = -11
Here the identifier foo is an immutable reference to a mutable object.

Scala Future Sequence Mapping: finding length?

I want to return both a Future[Seq[String]] from a method and the length of that Seq[String] as well. Currently I'm building the Future[Seq[String]] using a mapping function from another Future[T].
Is there any way to do this without awaiting the Future?
You can map over the current Future to create a new one with the new data added to the type.
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global // Future.apply and map need an ExecutionContext
val fss: Future[Seq[String]] = Future(Seq("a","b","c"))
val x: Future[(Seq[String],Int)] = fss.map(ss => (ss, ss.length))
If you somehow know what the length of the Seq will be without actually waiting for it, then you could do something like this:
val t: Future[T] = ???
def foo: (Int, Future[Seq[String]]) = {
val length = 42 // ???
val fut: Future[Seq[String]] = t map { v =>
genSeqOfLength42(v)
}
(length, fut)
}
If you don't, then you will have to return Future[(Int, Seq[String])] as jwvh said, or you can easily get the length later in the calling function.
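To illustrate the "get the length later in the calling function" option, here is a small sketch. The names FutureLengthSketch, produce and report are hypothetical, and the ExecutionContext is passed implicitly rather than assumed to be the global one.

```scala
import scala.concurrent.{ExecutionContext, Future}

object FutureLengthSketch {
  // Hypothetical producer standing in for the method that builds the Future.
  def produce()(implicit ec: ExecutionContext): Future[Seq[String]] =
    Future(Seq("a", "b", "c"))

  // The caller derives the length inside map, without blocking;
  // the result is itself a Future.
  def report()(implicit ec: ExecutionContext): Future[String] =
    produce().map(ss => s"${ss.length} items: ${ss.mkString(", ")}")
}
```

Nothing here waits for the result; any code that needs the length keeps working inside map or flatMap.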

Scala equivalent of Haskell's insertWith for Maps

I'm looking to do the simple task of counting words in a String. The easiest way I've found is to use a Map to keep track of word frequencies. Previously with Haskell, I used its Map's function insertWith, which takes a function that resolves key collisions, along with the key and value pair. I can't find anything similar in Scala's library though; only an add function (+), which presumably overwrites the previous value when re-inserting a key. For my purposes though, instead of overwriting the previous value, I want to add 1 to it to increase its count.
Obviously I could write a function to check if a key already exists, fetch its value, add 1 to it, and re-insert it, but it seems odd that a function like this isn't included. Am I missing something? What would be the Scala way of doing this?
Use a map with a default value and then update it with +=:
import scala.collection.mutable
val count = mutable.Map[String, Int]().withDefaultValue(0)
count("abc") += 1
println(count("abc"))
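If an immutable Map is preferred, something close to insertWith can be sketched with updatedWith, assuming Scala 2.13 or later. The helper name insertWith and the object InsertWithSketch are not part of the standard library; they are written here only to mirror the Haskell function.

```scala
object InsertWithSketch {
  // Hypothetical helper modelled on Haskell's insertWith:
  // if the key exists, combine the new value with the old one via f,
  // otherwise just insert the new value.
  def insertWith[K, V](f: (V, V) => V)(key: K, value: V, m: Map[K, V]): Map[K, V] =
    m.updatedWith[V](key) {
      case Some(old) => Some(f(value, old))
      case None      => Some(value)
    }

  // Word frequencies built by folding over the words of the text.
  def wordCounts(text: String): Map[String, Int] =
    text.split("\\s+").filter(_.nonEmpty).foldLeft(Map.empty[String, Int]) {
      (acc, w) => insertWith[String, Int](_ + _)(w, 1, acc)
    }
}
```

For example, InsertWithSketch.wordCounts("he is a good good boy") maps "good" to 2 and every other word to 1.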
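If it's a string, then why not use the split module (this is Haskell, and it counts the distinct words rather than their frequencies):
import Data.List (nub)
import Data.List.Split (splitOn)
let mywords = "he is a good good boy"
length $ nub $ splitOn " " mywords
5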
If you want to stick with Scala's immutable style, you could create your own class with immutable semantics:
class CountMap protected(val counts: Map[String, Int]){
def +(str: String) = new CountMap(counts + (str -> (counts(str) + 1)))
def apply(str: String) = counts(str)
}
object CountMap {
def apply(counts: Map[String, Int] = Map[String, Int]()) = new CountMap(counts.withDefaultValue(0))
}
And then you can use it:
val added = CountMap() + "hello" + "hello" + "world" + "foo" + "bar"
added("hello")
>>2
added("qux")
>>0
You might also add apply overloads on the companion object so that you can directly input a sequence of words, or even a sentence:
object CountMap {
def apply(counts: Map[String, Int] = Map[String, Int]()): CountMap = new CountMap(counts.withDefaultValue(0))
def apply(words: Seq[String]): CountMap = CountMap(words.groupBy(w => w).map { case(word, group) => word -> group.length })
def apply(sentence: String): CountMap = CountMap(sentence.split(" "))
}
And then you can build one even more easily:
CountMap(Seq("hello", "hello", "world", "world", "foo", "bar"))
Or:
CountMap("hello hello world world foo bar")

Finding the last value of a key in HashMap in Scala - how to write it in a functional style

I want to look up a key in the hashmap and, if the key is found, get the last part of the value associated with it. Here is my solution so far:
import scala.collection.mutable.HashMap
object Tmp extends Application {
val hashmap = new HashMap[String, String]
hashmap += "a" -> "288 | object | L"
def findNameInSymboltable(name: String) = {
if (hashmap.get(name) == None)
"N"
else
hashmap.get(name).flatten.last.toString
}
val solution: String = findNameInSymboltable("a")
println(solution) // L
}
Is there maybe a more functional style that would save me some lines of code?
Couldn't quite get your example to work. But maybe something like this would do the job?
hashmap.getOrElse("a", "N").split(" \\| ").last // split takes a regex, so the pipe is escaped
The "getOrElse" will at least save you the if/else check.
In case your "N" is intended for display and not for computation, you can carry the fact that there is no such "a" around as a None until display:
val solution = // this is an Option[String], inferred
  hashmap.get("a"). // if None the map is not done
  map(_.split(" \\| ").last) // returns an Option, perhaps None
Which can also be written:
val solution = // this is an Option[String], inferred
  for (x <- hashmap.get("a"))
  yield {
    x.split(" \\| ").last
  }
And finally:
println(solution.getOrElse("N"))
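Putting the pieces together, a self-contained sketch could look like the following. The object name LastFieldSketch and the lastField helper are made up for illustration, and an immutable Map stands in for the HashMap.

```scala
object LastFieldSketch {
  // Same sample entry as in the question.
  val table = Map("a" -> "288 | object | L")

  // split takes a regular expression, so the pipe is escaped
  // to match the literal " | " separator.
  def lastField(name: String): Option[String] =
    table.get(name).map(_.split(" \\| ").last)
}
```

The fallback is then applied only where the value is displayed, e.g. LastFieldSketch.lastField("b").getOrElse("N").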