Proper way for looping in Scala - scala

Suppose I have an array of strings in Scala:
val strings = Array[String]("1", "2", "3", "4", "5", "6", "7")
What I need is a new array whose elements are the concatenation of every three (or any other number of) consecutive elements of the first array, which should result in ("123", "456", "7").
Being new to Scala, I wrote the following code, which is neither concise nor efficient:
val step = 3
val strings = Array[String]("1", "2", "3", "4", "5", "6", "7")
val newStrings = collection.mutable.ArrayBuffer.empty[String]
for (i <- 0 until strings.length by step) {
  var elem = ""
  for (k <- 0 until step if i + k < strings.length) {
    elem += strings(i + k)
  }
  newStrings += elem
}
What would be the Scala way for doing this?

strings.grouped(3).map(_.mkString).toArray
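or, in infix notation (on recent Scala versions the trailing postfix toArray needs import scala.language.postfixOps):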
strings grouped 3 map (_.mkString) toArray
I personally prefer the first version :)
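If this is needed in more than one place, the first version can be wrapped in a small helper. This is only a sketch: the names GroupingSketch and concatGroups are made up for illustration, and the explicit check on the group size is my own addition.

```scala
object GroupingSketch {
  // Concatenates each run of `size` consecutive elements;
  // the last group may be shorter than `size`.
  def concatGroups(strings: Array[String], size: Int): Array[String] = {
    require(size > 0, "size must be positive")
    strings.grouped(size).map(_.mkString).toArray
  }
}
```

Calling GroupingSketch.concatGroups(strings, 3) on the array from the question should give the ("123", "456", "7") result asked for.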

strings grouped 3 map (_.mkString)
or (in order to really get an Array back)
(strings grouped 3 map (_.mkString)).toArray

... or use sliding
val strings = Array[String]("1", "2", "3", "4", "5", "6", "7")
strings.sliding (3, 3) .map (_.mkString).toArray
res19: Array[String] = Array(123, 456, 7)
Sliding: You take 3, and move forward 3. Variants:
scala> strings.sliding (3, 2) .map (_.mkString).toArray
res20: Array[String] = Array(123, 345, 567)
take 3, but forward 2
scala> strings.sliding (2, 3) .map (_.mkString).toArray
res21: Array[String] = Array(12, 45, 7)
take 2, forward 3 (thereby skipping every third)
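To compare the variants side by side, here is a small sketch. The object name SlidingSketch and the helper windows are hypothetical; the calls themselves are the same sliding and grouped used above.

```scala
object SlidingSketch {
  val strings = Array("1", "2", "3", "4", "5", "6", "7")

  // Takes `size` elements per window and moves forward by `step` each time.
  def windows(size: Int, step: Int): List[String] =
    strings.sliding(size, step).map(_.mkString).toList

  // When size and step are equal, sliding yields the same groups as grouped.
  val sameAsGrouped: Boolean =
    windows(3, 3) == strings.grouped(3).map(_.mkString).toList
}
```

So sliding(n, n) is effectively grouped(n), and the extra step parameter only matters when overlapping or skipping is wanted.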

Related

Merging lists in scala

I need to merge these two lists in a way that produces a third list. I'm not very familiar with Scala, but I'm always interested in learning.
val variable = List("a", "b", "c")
val number = List("1", "2", "3")
After merging, printing each value should produce output like this:
a is equal to 1
b is equal to 2
c is equal to 3
with a list that is now equal to
List("a is equal to 1", "b is equal to 2", "c is equal to 3")
Please help me out
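zip and map would work:
variable.zip(number).map { case (str, int) => s"$str is equal to $int" }
Alternatively, you could use a for-comprehension. Note, however, that two independent generators produce every combination of the two lists (the Cartesian product), as the session below shows, rather than the pairwise result asked for; to pair elements by position, iterate over variable.zip(number) instead.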
Welcome to Scala 3.1.1 (17, Java OpenJDK 64-Bit Server VM).
Type in expressions for evaluation. Or try :help.
scala> val variable = List("a", "b", "c")
val variable: List[String] = List(a, b, c)
scala> val number = List("1", "2", "3")
val number: List[String] = List(1, 2, 3)
scala> for {
| str <- variable
| n <- number
| } yield s"$str is equal to $n"
val res0: List[String] = List(a is equal to 1, a is equal to 2, a is equal to 3, b is equal to 1, b is equal to 2, b is equal to 3, c is equal to 1, c is equal to 2, c is equal to 3)
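A for-comprehension that matches elements by position needs a single generator over the zipped pairs. A minimal sketch, using the lists from the question (the object name MergeSketch and the value name merged are only illustrative):

```scala
object MergeSketch {
  val variable = List("a", "b", "c")
  val number = List("1", "2", "3")

  // One generator over the zipped pairs, so "a" meets only "1", "b" only "2", etc.
  val merged: List[String] =
    for ((str, n) <- variable.zip(number)) yield s"$str is equal to $n"
}
```

MergeSketch.merged.foreach(println) then prints one line per pair. Keep in mind that zip stops at the shorter list, so extra elements in the longer one are dropped.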

Get indexes of elements containing some value and the corresponding elements from another list

The question is a mouthful, but the idea is pretty simple.
I have 3 lists and a string.
val a = List("x", "y", "z")
val b = List("a1", "a2", "b1", "b2", "c1", "c2", "d1", "d2")
val c = List("1", "1", "2", "2", "3", "3", "4", "4")
val d = "xc1b1"
I need to check whether d contains any element of a. If it does, I find the positions of all the elements of b that are present in d and return the set of elements of c at those positions.
The result for the given example is
Set("3", "2")
But when I try
if(a.exists(d.contains)) c(b.indexWhere(d.contains))
I only get
Any = 2
which corresponds to the first matching element of b, i.e. b1.
How would I get the set?
if(a.exists(d.contains)) b.zip(c).collect{
case (x, y) if d.contains(x) => y
}
// res1: Any = List(2, 3)
If you need a Set:
if(a.exists(d.contains)) b.zip(c).collect{
case (x, y) if d.contains(x) => y
}.toSet
// res2: Any = Set(2, 3)
I think I've understood what you need to do here, although the question could do with some clarification.
These are the two ways of getting to your set that I found:
if(a.exists(d.contains)) b.collect {
case x if d.contains(x) => c(b.indexOf(x))
}.toSet
if(a.exists(d.contains)) b.filter(d.contains).map(b.indexOf).map(c).toSet
Both find the elements of b that occur in d, look up their index in b, and take the corresponding elements of c. The first way is more explicit about what it's doing, while the second is more concise.
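All of the snippets above are typed Any because the if has no else branch. Here is a sketch that returns a properly typed Set[String]; falling back to an empty set when d contains nothing from a is my assumption about what you'd want, and the object name LookupSketch is made up.

```scala
object LookupSketch {
  val a = List("x", "y", "z")
  val b = List("a1", "a2", "b1", "b2", "c1", "c2", "d1", "d2")
  val c = List("1", "1", "2", "2", "3", "3", "4", "4")
  val d = "xc1b1"

  // With an else branch the expression is typed Set[String] instead of Any.
  // Assumption: no match on `a` means an empty result.
  val result: Set[String] =
    if (a.exists(s => d.contains(s)))
      b.zip(c).collect { case (x, y) if d.contains(x) => y }.toSet
    else
      Set.empty[String]
}
```

Zipping b with c also avoids the repeated indexOf lookups, which would return the first position only if b ever contained duplicates.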

Filter elements of RDD[Array[String]] by an array of 0s and 1s

I want to select some elements (features) of an RDD based on a binary array. I have an array of 0s and 1s with size 40 that specifies whether the element at each index should be kept or not.
My RDD was created from the kddcup99 dataset:
val rdd = sc.textFile("./data/kddcup.txt")
val data = rdd.map(_.split(','))
How can I filter or select the elements of data (RDD[Array[String]]) whose corresponding index in the binary array has the value 1?
If I understood your question correctly, you have an array like:
val arr = Array(1, 0, 1, 1, 1, 0)
And an RDD[Array[String]] which looks like:
val rdd = sc.parallelize(Array(
Array("A", "B", "C", "D", "E", "F") ,
Array("G", "H", "I", "J", "K", "L")
) )
Now, to get elements at the indices where arr has 1, you need to first get the indices which have 1 as the value in arr
val requiredIndices = arr.zipWithIndex.filter(_._1 == 1).map(_._2)
requiredIndices: Array[Int] = Array(0, 2, 3, 4)
Then, similarly, on the RDD you can use zipWithIndex and contains to check whether each index is in your requiredIndices array:
rdd.map(_.zipWithIndex.filter(x => requiredIndices.contains(x._2) ).map(_._1) )
// after collect(): Array[Array[String]] = Array(Array(A, C, D, E), Array(G, I, J, K))
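The per-row logic can be tried without a Spark context. The sketch below uses a plain Seq as a local stand-in for the RDD; the names MaskSketch and selectColumns are hypothetical, and keeping the indices in a Set is a small variation on the Array used above.

```scala
object MaskSketch {
  val arr = Array(1, 0, 1, 1, 1, 0)

  // Positions where the mask is 1, stored in a Set for quick membership checks.
  val requiredIndices: Set[Int] =
    arr.zipWithIndex.collect { case (1, i) => i }.toSet

  // Keeps only the columns of one row whose index is flagged in the mask.
  def selectColumns(row: Array[String]): Array[String] =
    row.zipWithIndex.collect { case (value, i) if requiredIndices(i) => value }

  // Local stand-in for the RDD rows (not a real RDD).
  val rows = Seq(
    Array("A", "B", "C", "D", "E", "F"),
    Array("G", "H", "I", "J", "K", "L")
  )

  val selected: Seq[Array[String]] = rows.map(selectColumns)
}
```

With Spark, the same function would presumably be passed to rdd.map in place of the inline lambda.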

Counting how many times any of the items appear in a Scala List

I have the following list:
val list = List("this", "this", "that", "there", "here", "their", "where")
I want to count how many times "this" OR "that" appears. I can do something like:
list.count(_ == "this") + list.count(_ == "that")
Is there a more concise way of doing this?
You can count more than one occurrence at a time. No need to call count twice.
scala> list.count(x => x == "this" || x == "that")
res4: Int = 3
scala> list.count(Set("this", "that").contains)
res12: Int = 3
The Set version is shorter and still makes a single pass.
If you need to count words in several different places using the same big list:
val m = list.groupBy(identity).mapValues(_.size).withDefaultValue(0)
will give you a handy Map with all counts, so you could do
scala> m("this") + m("that")
res11: Int = 3
Very similar example:
val s = Seq("apple", "oranges", "apple", "banana", "apple", "oranges", "oranges")
s.groupBy(identity).mapValues(_.size)
And the result is
Map(banana -> 1, oranges -> 3, apple -> 3)
And for a specific item:
s.groupBy(identity).mapValues(_.size)("apple")
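On Scala 2.13 and later, mapValues on a Map is deprecated and returns a lazy view, so the one-liners above may need adjusting there. One possible alternative is groupMapReduce, sketched below with the list from the question (the object name CountSketch is made up):

```scala
object CountSketch {
  val list = List("this", "this", "that", "there", "here", "their", "where")

  // Scala 2.13+: group by the word itself, map every occurrence to 1, then sum.
  val counts: Map[String, Int] =
    list.groupMapReduce(identity)(_ => 1)(_ + _).withDefaultValue(0)

  // Words that never occur fall back to the default of 0.
  val thisOrThat: Int = counts("this") + counts("that")
}
```

This builds the counts directly, without first materialising the intermediate lists that groupBy creates.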

How to convert string array to int array in scala

I am very new to Scala, and I am not sure how this is done. I have googled it with no luck.
Let us assume the code is:
var arr = readLine().split(" ")
Now arr is a string array. Assuming I know that the line I input is a series of numbers e.g. 1 2 3 4, I want to convert arr to an Int (or int) array.
I know that I can convert individual elements with .toInt, but I want to convert the whole array.
Thank you and apologies if the question is dumb.
Applying a function to every element of a collection is done using .map :
scala> val arr = Array("1", "12", "123")
arr: Array[String] = Array(1, 12, 123)
scala> val intArr = arr.map(_.toInt)
intArr: Array[Int] = Array(1, 12, 123)
Note that the _.toInt notation is equivalent to x => x.toInt :
scala> val intArr = arr.map(x => x.toInt)
intArr: Array[Int] = Array(1, 12, 123)
Obviously, this will throw an exception if one of the elements is not an integer:
scala> val arr = Array("1", "12", "123", "NaN")
arr: Array[String] = Array(1, 12, 123, NaN)
scala> val intArr = arr.map(_.toInt)
java.lang.NumberFormatException: For input string: "NaN"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
...
... 33 elided
Starting with Scala 2.13, you might want to use String::toIntOption to safely parse Strings into Option[Int]s and thus also handle items that can't be parsed:
Array("1", "12", "abc", "123").flatMap(_.toIntOption)
// Array[Int] = Array(1, 12, 123)
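If toIntOption isn't available, or you'd rather not drop the bad items silently, wrapping toInt in scala.util.Try is one option. A minimal sketch; ParseSketch, parseInts and rejected are hypothetical names:

```scala
import scala.util.Try

object ParseSketch {
  // Keeps only the items that parse as Int; failures become None and are dropped.
  def parseInts(items: Array[String]): Array[Int] =
    items.flatMap(s => Try(s.toInt).toOption)

  // Returns the items that could not be parsed, e.g. for logging or reporting.
  def rejected(items: Array[String]): Array[String] =
    items.filter(s => Try(s.toInt).isFailure)
}
```

For example, ParseSketch.parseInts(Array("1", "12", "abc", "123")) keeps the three numbers, while ParseSketch.rejected on the same input returns just "abc".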