Reading Second Value In Line Scala

I'd like to read line of input of the form "x y" from standard input using Scala and assign only y to a var. Here's what I have so far:
val Array(_, t) = readLine.split(" ").map(_.toInt)
This looks pretty ugly, though. I tried val t = readLine.split(" ").map(_.toInt)(1), but the compiler complains. If there's a cleaner solution than using an Array, I'd really appreciate the help. Thanks!

Your solution val Array(_, t) = readLine.split(" ").map(_.toInt) is fine when the string contains valid data.
If you know that the second token is valid, you can skip the intermediate Array and convert just that token:
val t = readLine.split(" ")(1).toInt
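If the line may be malformed, here is a minimal sketch of a safer variant (assuming Scala 2.13+, which provides toIntOption):
// lift(1) yields Option[String]: None if the line has fewer than two tokens.
// toIntOption avoids a NumberFormatException on non-numeric tokens.
val t: Option[Int] = scala.io.StdIn.readLine().split(" ").lift(1).flatMap(_.toIntOption)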

Related

How to extract sequence elements wrapped in Option in Scala?

I am learning Scala and struggling with an Option[Seq[String]] object I need to process. There is a small sequence of strings, Seq("hello", "Scala", "!"), which I need to filter against the charAt(0).isUpper condition.
Doing it on a plain val arr = Seq("hello", "Scala", "!") is as easy as arr.filter(_.charAt(0).isUpper). However, doing the same on Option(Seq("hello", "Scala", "!")) won't work, since you need to call .getOrElse on it first. But even then, how can you apply the condition?
arr.filter(_.getOrElse(false).charAt(0).isUpper) is wrong. I have tried a lot of variants, and searching Stack Overflow didn't help either, so I am wondering if this is at all possible. Is there an idiomatic way to handle Option-wrapped cases in Scala?
If you want to apply f: X => Y to a value x of type X, you write f(x).
If you want to apply f: X => Y to a value ox of type Option[X], you write ox.map(f).
You seem to already know what you want to do with the sequence, so just put the appropriate f into map.
Example:
val ox = Option(Seq("hello", "Scala", "!"))
ox.map(_.filter(_(0).isUpper)) // Some(Seq("Scala"))
You can just call map (or foreach, for side effects) on the Option, i.e. arr.map(seq => seq.filter(_.charAt(0).isUpper))
You can use pattern matching for that case:
Option(Seq("hello", "Scala", "!")) match {
  case Some(x) => x.filter(_.charAt(0).isUpper)
  case None => Seq()
}
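For completeness, that pattern match is just a map plus a default, so the same result can be written without match:
val result: Seq[String] =
  Option(Seq("hello", "Scala", "!"))
    .map(_.filter(_.charAt(0).isUpper)) // filter inside the Option
    .getOrElse(Seq())                   // fall back to Seq() for None
// result: List(Scala)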

Get a different result when calling the same function

I am a Scala newbie.
This is my code. The results of the two ways of calling the same method are different; can anyone explain why?
The thing is that in Scala, all methods (or "operators") whose names end with a colon ':' are right-associative when used in infix notation.
So, for your function,
def ::(t: TG) = ???
when you write
val lxx3 = lxx1 :: lxx2
the function :: associates to the right (i.e. it is invoked on lxx2), so it is actually equivalent to
val lxx3 = lxx2.::(lxx1)
rather than
val lxx3 = lxx1.::(lxx2)
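A self-contained sketch (with hypothetical class and value names) that demonstrates the desugaring:
case class Lxx(items: List[Int]) {
  // The name ends in ':', so in infix position `a :: b` means `b.::(a)`.
  def ::(other: Lxx): Lxx = Lxx(other.items ++ items)
}

val lxx1 = Lxx(List(1, 2))
val lxx2 = Lxx(List(3, 4))

val infix    = lxx1 :: lxx2     // desugars to lxx2.::(lxx1)
val explicit = lxx2.::(lxx1)
println(infix == explicit)      // true
println(infix == lxx1.::(lxx2)) // false: the operands are swapped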

Reduce a List of Map of Tuples

I have the following variable
val x1 = List((List(('a',1), ('e',1), ('t',1)),"eat"), (List(('a',1), ('e',1), ('t',1)),"ate"))
I want to get a Map[List[(Char, Int)], List[String]] that will look something like the following. The idea is to group words by the characters contained in them:
Map(List((a,1), (e,1), (t,1)) -> List(eat, ate))
I have used the following code, which produces the expected value but not quite the right type:
scala> val z1 = x1.groupBy(x => x._1 )
.map(x => Map(x._1 -> x._2.map(z=>z._2)))
.fold(){(a,b) => b}
z1: Any = Map(List((a,1), (e,1), (t,1)) -> List(eat, ate))
However, I would like to get the obvious type Map[List[(Char, Int)],List[String]], not the Any my version returns. Also, I'm wondering if there is a better way of doing the whole thing. Many thanks!
Try this.
scala> x1.groupBy(_._1).mapValues(_.map(_._2))
res2: scala.collection.immutable.Map[List[(Char, Int)],List[String]] = Map(List((a,1), (e,1), (t,1)) -> List(eat, ate))
But, yeah, I think you might want to reconsider your data representation. That List[(List[(Char,Int)], String)] business is rather cumbersome.
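As a sketch of what that reconsideration might look like: if the (Char, Int) list is just the character counts of the word, the key can be derived from each word directly, so the tuples never need to be carried around (the words value here is hypothetical):
val words = List("eat", "ate", "tea", "bat")
// Key each word by its sorted (character, count) list.
val grouped: Map[List[(Char, Int)], List[String]] =
  words.groupBy(w => w.groupBy(identity).map { case (c, cs) => (c, cs.length) }.toList.sorted)
// Map(List((a,1), (b,1), (t,1)) -> List(bat), List((a,1), (e,1), (t,1)) -> List(eat, ate, tea))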

Finding lines that start with a digit in Scala using filter() method

I am a Python programmer, and as the Python API is too slow for my Spark application, I decided to port my code to the Spark Scala API to compare the computation times.
I am trying to filter out the lines that start with numeric characters from a huge file using the Scala API in Spark. In my file, some lines have numbers and some have words, and I want only the lines that start with numbers.
So, in my Python application, I have these lines.
l = sc.textFile("my_file_path")
l_filtered = l.filter(lambda s: s[0].isdigit())
which works exactly as I want.
This is what I have tried so far.
val l = sc.textFile("my_file_path")
val l_filtered = l.filter(x => x.forall(_.isDigit))
This throws an error saying that Char does not have a forall() function.
I also tried taking the first character of the line using s.take(1) and applying the isDigit() function to it, in the following way.
val l = sc.textFile("my_file_path")
val l_filtered = l.filter(x => x.take(1).isDigit)
and this too...
val l = sc.textFile("my_file_path")
val l_filtered = l.filter(x => x.take(1).Character.isDigit)
This also throws an error.
This is basically a small error, and as I am not accustomed to Scala syntax, I am having a hard time figuring it out. Any help would be appreciated.
Edit: As answered for this question, I tried writing the function, but I am unable to use it in the filter() function in my application to apply it to all the lines in the file.
In Scala, indexing syntax uses parens () instead of brackets []. The exact translation of your Python code would be this:
val l = sc.textFile("my_file_path")
val l_filtered = l.filter(_(0).isDigit)
A more idiomatic way to extract the first character is the head method:
val l = sc.textFile("my_file_path")
val l_filtered = l.filter(_.head.isDigit)
Both of these methods would fail if your file contains empty lines.
If that's the case, then you probably want this:
val l = sc.textFile("my_file_path")
val l_filtered = l.filter(_.headOption.map(_.isDigit).getOrElse(false))
UPD. As curious noted, map(predicate).getOrElse(false) on an Option can be shortened to exists(predicate):
val l = sc.textFile("my_file_path")
val l_filtered = l.filter(_.headOption.exists(_.isDigit))
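The empty-line handling is easy to sanity-check locally on a plain List (a sketch, no Spark needed; the String API is the same):
val lines = List("1hello", "world", "", "42 is the answer")
// headOption is None for "", so exists(_.isDigit) safely yields false.
println(lines.filter(_.headOption.exists(_.isDigit)))
// List(1hello, 42 is the answer)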
You can use regular expressions:
scala> List("1hello","2world","good").filter(_.matches("^[0-9].*$"))
res0: List[String] = List(1hello, 2world)
Or you can do it like this, with fewer operations, since the file might contain a huge number of lines to filter:
scala> List("1hello","world").filter(_.headOption.exists(_.isDigit))
res1: List[String] = List(1hello)
Replace the List[String] with your lines l in your case.

Putting two placeholders inside flatMap in Spark Scala to create Array

I am applying flatMap on a Scala array to create another array from it:
val x = sc.parallelize(Array(1,2,3,4,5,6,7))
val y = x.flatMap(n => Array(n,n*100,42))
println(y.collect().mkString(","))
1,100,42,2,200,42,3,300,42,4,400,42,5,500,42,6,600,42,7,700,42
But I am trying to use the placeholder "_" in the second line of the code, where I create y, in the following way:
scala> val y = x.flatMap(Array(_,_*100,42))
<console>:26: error: wrong number of parameters; expected = 1
val y = x.flatMap(Array(_,_*100,42))
^
This is not working. Could someone explain what to do in such cases if I want to use a placeholder?
In Scala, the number of placeholders in a lambda determines the number of parameters the lambda takes.
So the last line is expanded as
val y = x.flatMap((x1, x2) => Array(x1, x2*100, 42))
Long story short, you can't use a placeholder to refer twice to the same element.
You have to use named parameters in this case.
val y = x.flatMap(x => Array(x, x*100, 42))
You can only use the _ placeholder once per parameter. (In your case, flatMap takes a function of one argument, but two underscores tell the compiler to expect two arguments, which is not going to work.)
val y = x.flatMap(i => Array(i, i*100, 42))
should do the trick. If your elements were pairs rather than Ints, a pattern-matching anonymous function would also work (and would probably be more readable):
val y = x.flatMap { case (i1, i2) => Array(i1, i2*100, 42) }
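The expansion rule is easy to check locally on a plain List instead of an RDD (a sketch; the placeholder semantics are identical):
val xs = List(1, 2, 3)
println(xs.map(_ * 100))  // one placeholder, one parameter: List(100, 200, 300)
println(xs.reduce(_ + _)) // two placeholders fit reduce's two-argument function: 6
// To mention the element twice, it must be named:
println(xs.flatMap(n => Array(n, n * 100, 42)))
// List(1, 100, 42, 2, 200, 42, 3, 300, 42)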