How to use string.split() without foreach()? - scala

Write a program in Scala that reads a String from the keyboard and counts the number of occurrences of each character, ignoring whether it is upper case or lower case.
ex: Avocado
R: A = 2; v = 1; o = 2; c = 1; d = 1;
So I tried to do it with two for loops iterating over the string, with a conditional that converts the character at position x to upper case and compares it with the upper-cased character at position y, incrementing a counter on every match. ex: Ava -> A = 2; v = 1;
But with this logic when i print the result it comes with:
ex: Avocado
R: A = 2; v = 1; o = 2; c = 1; a = 2; d = 1; o = 2;
It repeats the same character, upper or lower case, in the result...
My teacher asked us to solve this using Scala's split method and yield, but I don't know how to use split without foreach(), which he doesn't allow us to use.
Sorry for the bad English.
object ex8 {
  def main(args: Array[String]): Unit = {
    println("Write a string")
    var string = readLine()
    var cont = 0
    for (x <- 0 to string.length - 1) {
      for (y <- 0 to string.length - 1) {
        if (string.charAt(x).toUpper == string.charAt(y).toUpper)
          cont += 1
      }
      print(string.charAt(x) + " = " + cont + "; ")
      cont = 0
    }
  }
}

Scala 2.13 has added a very handy method to cover this sort of thing.
inputStr.groupMapReduce(_.toUpper)(_ => 1)(_+_)
.foreach{case (k,v) => println(s"$k = $v")}
//A = 2
//V = 1
//C = 1
//O = 2
//D = 1

It might be easier to group the individual elements of the String (i.e. a collection of Chars, made case-insensitive with toLower) to aggregate their corresponding size using groupBy/mapValues:
"Avocado".groupBy(_.toLower).mapValues(_.size)
// res1: scala.collection.immutable.Map[Char,Int] =
// Map(a -> 2, v -> 1, c -> 1, o -> 2, d -> 1)

Scala 2.11
Tried with classic word count approach of map => group => reduce
val exampleStr = "Avocado R"
exampleStr.
toLowerCase.
trim.
replaceAll(" +","").
toCharArray.map(x => (x,1)).groupBy(_._1).
map(x => (x._1,x._2.length))
Answer :
exampleStr: String = Avocado R
res3: scala.collection.immutable.Map[Char,Int] =
Map(a -> 2, v -> 1, c -> 1, r -> 1, o -> 2, d -> 1)
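Since the question asks specifically for split and yield without foreach, here is a minimal sketch of one way that could look. The names CharCount and countChars are made up for illustration, and it assumes that results in order of first appearance are acceptable:

```scala
object CharCount {
  // Counts characters case-insensitively, in order of first appearance.
  def countChars(input: String): Seq[(Char, Int)] = {
    // split("") breaks the string into one-character strings;
    // the filter guards against an empty leading element.
    val letters = input.toUpperCase.split("").filter(_.nonEmpty)
    // distinct keeps one entry per character, so nothing is reported twice.
    for (letter <- letters.distinct.toSeq)
      yield (letter.head, letters.count(_ == letter))
  }

  def main(args: Array[String]): Unit =
    println(countChars("Avocado").map { case (c, n) => s"$c = $n" }.mkString("; "))
}
```

The for ... yield builds the list of (character, count) pairs, and mkString joins them for printing, so no foreach is needed. Iterating over the distinct characters instead of over every position is what removes the repeated entries from the original output.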

Related

For loop with two variables in scala

I have the following java code:
for (int i = 0, j = 0; i < 10 && j < 10; i++, j++)
{
System.out.println("i = " + i + " :: " + "j = " + j);
}
The output is :
i = 0 :: j = 0
i = 1 :: j = 1
i = 2 :: j = 2
i = 3 :: j = 3
i = 4 :: j = 4
i = 5 :: j = 5
....
I would like to do the same thing in scala, I tried this but it does not work:
for (i<- 0 to 9; j <- 0 to 9)
{
println("i = " + i + " :: " + "j = " + j)
}
The output is:
i = 0 :: j = 0
i = 0 :: j = 1
i = 0 :: j = 2
i = 0 :: j = 3
i = 0 :: j = 4
i = 0 :: j = 5
i = 0 :: j = 6
i = 0 :: j = 7
i = 0 :: j = 8
i = 0 :: j = 9
i = 1 :: j = 0
i = 1 :: j = 1
i = 1 :: j = 2
i = 1 :: j = 3
....
I have not found a way to have two variables at the same level.
Thank you for your answer.
Scala's replacement would be
for {
(i, j) <- (0 to 9) zip (0 to 9)
} {
println("i = " + i + " :: " + "j = " + j)
}
To avoid confusion, I suggest reading up on what for is syntactic sugar for (unlike in Java, it is not a specialized while).
Since both variables always have the same value, you actually only need one of them. In Scala, you would generally not use a loop to solve this problem, but use higher-level collection operations instead. Something like:
(0 to 9) map { i => s"i = $i :: j = $i" } mkString "\n"
Note: this will only generate the string that you want to print, but not actually print it. It is generally considered a good thing to not mix generating data and printing data.
If you want to print this, you only need to pass it to println:
println((0 to 9) map { i => s"i = $i :: j = $i" } mkString "\n")
Or, in Scala 2.13+:
import scala.util.chaining._
(0 to 9) map { i => s"i = $i :: j = $i" } mkString "\n" pipe println
You could also write it like this:
(for (i <- 0 to 9) yield s"i = $i :: j = $i") mkString "\n"
Now, you might say, "Wait a minute, didn't you just say that we don't use loops in Scala?" Well, here's the thing: that's not a loop! That is a for comprehension. It is actually syntactic sugar for collection operations.
for (foo <- bar) yield baz(foo)
is actually just syntactic sugar for
bar map { foo => baz(foo) }
A for comprehension simply desugars into calls to map, flatMap, foreach, and withFilter. It is not a loop.
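To make the desugaring concrete, here is a small sketch (the object and value names are only for illustration) showing why two generators produce every combination, while a single generator over zipped ranges advances both variables together:

```scala
object ForDesugar {
  // Two generators: sugar for flatMap + map, i.e. every (i, j) combination.
  val nested: Seq[(Int, Int)] =
    for (i <- 0 to 2; j <- 0 to 2) yield (i, j)

  // The same thing written out by hand.
  val nestedDesugared: Seq[(Int, Int)] =
    (0 to 2).flatMap(i => (0 to 2).map(j => (i, j)))

  // One generator over zipped ranges: i and j move in lockstep.
  val lockstep: Seq[(Int, Int)] =
    for ((i, j) <- (0 to 2) zip (0 to 2)) yield (i, j)
}
```

Here nested and nestedDesugared hold the same nine pairs, while lockstep holds only (0,0), (1,1), (2,2), which matches the behaviour of the Java loop in the question.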
Note that Scala does have a while loop. It exists mainly for performance reasons. Unless you are writing low-level libraries that are going to be used in performance-intensive code by tens of thousands of developers, please just pretend that it doesn't exist.
Also note that if the while loop weren't built into Scala, you could easily write it yourself:
def whiley(cond: => Boolean)(body: => Unit): Unit =
if (cond) { body; whiley(cond)(body) }
You can do it as below (note until rather than to, so that both ranges stop at 9, like the Java loop):
val start = 0; val size = 10;
for ((i, j) <- (start until size) zip (start until size))
{
  println(s"i=$i j=$j")
}
j is just a copy of i so this is one solution:
for {
i <- 0 to 9
j = i
} {
println("i = " + i + " :: " + "j = " + j)
}
This pattern works in any situation where j is just a function of i

How to combine the results of spark computations in the following case?

The question is to calculate average of each of the columns corresponding to each class. Class number is given in the first column.
I am giving a part of test file for better clarity.
2 0.819039 -0.408442 0.120827
3 -0.063763 0.060122 0.250393
4 -0.304877 0.379067 0.092391
5 -0.168923 0.044400 0.074417
1 0.053700 -0.088746 0.228501
2 0.196758 0.035607 0.008134
3 0.006971 -0.096478 0.123718
4 0.084281 0.278343 -0.350414
So the task is to calculate
1: avg(), avg(), avg()
.
.
.
I am very new to Scala. After juggling a lot with the code I came up with the following code
val inputfile = sc.textFile ("testfile.txt")
val myArray = inputfile.map { line =>
(line.split(" ").toList)
}
var Avgmap:Map[String,List[Double]] = Map()
var countmap:Map[String,Int] = Map()
for( a <- myArray ){
//println( "Value of a: " + a + " " + a.size );
if(!countmap.contains(a(0))){
countmap += (a(0) -> 0)
Avgmap += (a(0) -> List.fill(a.size-1)(1.0))
}
var c = countmap(a(0)) + 1
val countmap2 = countmap + (a(0) -> c)
countmap = countmap2
var p = List[Double]()
for( i <- 1 to a.size - 1) {
var temp = (Avgmap(a(0))(i-1)*(countmap(a(0)) - 1) + a(i).toDouble)/countmap(a(0))
// println("i: "+i+" temp: "+temp)
var q = p :+ temp
p = q
}
val Avgmap2 = Avgmap + (a(0) -> p)
Avgmap = Avgmap2;
println("--------------------------------------------------")
println(countmap)
println(Avgmap)
}
When I execute this code I seem to be getting the results in two halves of the dataset. Please help me in combining them.
Edit: About the variables I am using. countmap keeps record of classnumber -> number of vectors encountered. Similarly Avgmap keeps record of average so far of each columns corresponding to the key.
First, use the DataFrame API. Second, what you want is just one row:
df.select(df.columns.map(c => mean(col(c))) :_*).show
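The select above averages whole columns. For the per-class averages the question describes, the computation has the shape "group by the first column, then average each remaining column". A minimal sketch of that shape using plain Scala collections rather than Spark (ClassAverages and averages are hypothetical names, and whitespace-separated input lines are assumed):

```scala
object ClassAverages {
  // Parses lines of the form "class v1 v2 v3" and averages each column per class.
  def averages(lines: Seq[String]): Map[String, List[Double]] =
    lines
      .map(_.trim.split("\\s+").toList)
      // Keep only lines that have a class label and at least one value.
      .collect { case cls :: values if values.nonEmpty => (cls, values.map(_.toDouble)) }
      .groupBy(_._1)
      .map { case (cls, rows) =>
        val vectors = rows.map(_._2)
        // Element-wise sum of all vectors belonging to this class.
        val sums = vectors.reduce((a, b) => a.zip(b).map { case (x, y) => x + y })
        cls -> sums.map(_ / vectors.size)
      }
}
```

The original loop most likely shows results "in two halves" because a for over an RDD runs separately on each partition, and each task updates its own copy of countmap and Avgmap. In Spark the same idea would normally be expressed as a per-key aggregation (for example reduceByKey on a pair RDD, or groupBy on a DataFrame) so that the partial results are combined by the framework instead of in mutable variables.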

Binary search not working

Below is a binary search algorithm, but it's not finding the value.
I don't think this algorithm is correct?
'theArray' is initialised to an array of 0's, with the item at position 7 equal to 4.
object various {
//O(log N)
def binarySerachForValue(value : Int) = {
var arraySize = 100
var theArray = new Array[Int](arraySize)
theArray(7) = 4
var timesThrough = 0
var lowIndex = 0
var highIndex = arraySize - 1
while(lowIndex <= highIndex){
var middleIndex = (highIndex + lowIndex) / 2
if(theArray(middleIndex) < value)
lowIndex = middleIndex + 1
else if(theArray(middleIndex) > value)
highIndex = middleIndex - 1
else {
println("Found match in index " + middleIndex)
lowIndex = highIndex + 1
}
timesThrough = timesThrough + 1
}
timesThrough
} //> binarySerachForValue: (value: Int)Int
binarySerachForValue(4) //> res0: Int = 7
}
Assuming your array is already properly sorted, you could write your search function a little more functionally using tail recursion as follows:
import scala.annotation.tailrec

def binarySearchForValue(value: Int, theArray: Array[Int]): Int = {
  @tailrec
  def doSearch(arr: Array[Int], index: Int = 0): Int =
    if (arr.isEmpty) -1 // nothing left to search: not found
    else {
      val middleIndex = arr.length / 2
      val (left, right) = arr.splitAt(middleIndex)
      val totalIndex = middleIndex + index
      arr(middleIndex) match {
        case i if i == value => totalIndex
        // right starts at the middle element, so skip it and shift the offset
        case i if i < value => doSearch(right.drop(1), totalIndex + 1)
        // left already excludes the middle element and keeps the same offset
        case _ => doSearch(left, index)
      }
    }
  doSearch(theArray)
}
Note that this could also be accomplished slightly differently as follows:
import scala.annotation.tailrec

def binarySearchForValue(value: Int, theArray: Array[Int]): Int = {
  @tailrec
  def doSearch(low: Int, high: Int): Int =
    if (low > high) -1 // range exhausted: not found
    else {
      val mid = low + (high - low) / 2
      val currval = theArray(mid)
      if (currval == value) mid
      else if (currval < value) doSearch(mid + 1, high)
      else doSearch(low, mid - 1)
    }
  doSearch(0, theArray.length - 1)
}
It looks like a proper implementation of the Binary Search Algorithm, but you are providing an array of 0's, with just one number at the index of 7. Binary Search usually takes an array of sorted values (although you can implement sorting as the first step).
Here is an example of why you need a sorted array first:
Searchfor(4)
theArray = [0,4,0,0,0]
First iteration: look at theArray(2), which equals 0. 0 < 4, so use the upper half (i.e. lower index = middle index + 1):
newArray = [0,0]
Then we iterate again and eventually exit the loop because we never found it. With a sorted array, your technique would work well.
For finding a single value in an otherwise unsorted array of 0's, your best bet is to just iterate through the array until you find it. Best of luck.
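The point about sorting can be checked with a small sketch that rebuilds the array from the question (100 zeros with a 4 at index 7) and searches it before and after sorting. SortedDemo and search are illustrative names, not part of the original code:

```scala
import scala.annotation.tailrec

object SortedDemo {
  // Plain binary search; only meaningful when the input is sorted ascending.
  def search(arr: Array[Int], value: Int): Int = {
    @tailrec
    def go(low: Int, high: Int): Int =
      if (low > high) -1
      else {
        val mid = low + (high - low) / 2
        if (arr(mid) == value) mid
        else if (arr(mid) < value) go(mid + 1, high)
        else go(low, mid - 1)
      }
    go(0, arr.length - 1)
  }

  // The array from the question: all zeros except index 7.
  val unsorted: Array[Int] = {
    val a = new Array[Int](100)
    a(7) = 4
    a
  }

  val missed: Int = search(unsorted, 4)       // walks right past index 7
  val found: Int = search(unsorted.sorted, 4) // the 4 moves to the last index
}
```

On the unsorted array every probe sees a 0, so the search keeps moving right and never visits index 7. This also suggests that the 7 shown as res0 in the question is the number of loop iterations (timesThrough), not the index of a match.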
The loop should be like this:
while(lowIndex <= highIndex){
  // note: lowIndex plus half the distance to highIndex (avoids Int overflow)
  var middleIndex = lowIndex + ((highIndex - lowIndex) / 2)
  if(theArray(middleIndex) < value)
    lowIndex = middleIndex + 1
  else if(theArray(middleIndex) > value)
    highIndex = middleIndex - 1
  else return middleIndex
  timesThrough = timesThrough + 1
}
// if the loop finished without returning middleIndex in the last else, return -1 (not found)
return -1

How to do X * diag(Y) in Scala Breeze?

How to do X * diag(Y) in Scala Breeze? X could be for example a CSCMatrix and Y could be a DenseVector?
In MATLAB syntax, this would be:
X * spdiags(Y, 0, N, N )
Or:
X .* repmat( Y', K, 1 )
In SciPy syntax, this would be a 'broadcast multiply':
Y * X
How to do X * diag(Y) in Scala Breeze?
I wrote my own sparse diagonal method, and dense / sparse multiplication method in the end.
Use like this:
val N = 100000
val K = 100
val A = DenseMatrix.rand(N,K)
val b = DenseVector.rand(N)
val c = MatrixHelper.spdiag(b)
val d = MatrixHelper.mul( A.t, c )
Here are the implementations of spdiag and mul:
// Copyright Hugh Perkins 2012
// You can use this under the terms of the Apache Public License 2.0
// http://www.apache.org/licenses/LICENSE-2.0
package root
import breeze.linalg._
object MatrixHelper {
// it's only efficient to put the sparse matrix on the right hand side, since
// it is a column-sparse matrix
def mul( A: DenseMatrix[Double], B: CSCMatrix[Double] ) : DenseMatrix[Double] = {
val resultRows = A.rows
val resultCols = B.cols
var row = 0
val result = DenseMatrix.zeros[Double](resultRows, resultCols )
while( row < resultRows ) {
var col = 0
while( col < resultCols ) {
val rightRowStartIndex = B.colPtrs(col)
val rightRowEndIndex = B.colPtrs(col + 1) - 1
val numRightRows = rightRowEndIndex - rightRowStartIndex + 1
var ri = 0
var sum = 0.0
while( ri < numRightRows ) {
val inner = B.rowIndices(rightRowStartIndex + ri)
val rightValue = B.data(rightRowStartIndex + ri)
sum += A(row,inner) * rightValue
ri += 1
}
result(row,col) = sum
col += 1
}
row += 1
}
result
}
def spdiag( a: Tensor[Int,Double] ) : CSCMatrix[Double] = {
val size = a.size
val result = CSCMatrix.zeros[Double](size,size)
result.reserve(a.size)
var i = 0
while( i < size ) {
result.rowIndices(i) = i
result.colPtrs(i) = i
result.data(i) = a(i)
//result(i,i) = a(i)
i += 1
}
//result.activeSize = size
result.colPtrs(i) = i
result
}
}
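For checking results on small inputs, it may help to remember that X * diag(Y) simply multiplies column j of X by Y(j). A minimal reference sketch using plain Scala arrays instead of Breeze types (DiagScale and mulDiag are hypothetical names):

```scala
object DiagScale {
  // X * diag(Y): column j of X is multiplied by y(j).
  def mulDiag(x: Array[Array[Double]], y: Array[Double]): Array[Array[Double]] =
    x.map { row =>
      require(row.length == y.length, "each row of X needs one entry per element of Y")
      row.zip(y).map { case (value, scale) => value * scale }
    }
}
```

This is only meant as a correctness reference; for large or sparse matrices, something like the MatrixHelper approach above is the one that avoids materialising dense intermediate data.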

Scala - Most elegant way of initialising values inside array that's already been declared?

I have a 3d array defined like so:
val 3dArray = new Array[Array[Array[Int]]](512, 8, 8)
In Javascript I would do the following to assign each element to 1:
for (i = 0; i < 512; i++)
{
3dArray[i] = [];
for (j = 0; j < 8; j++)
{
3dArray[i][j] = [];
for (k = 0; k < 8; k++)
{
3dArray[i][j][k] = 1;
}
}
}
What's the most elegant approach to doing the same?
Not sure there's a particularly elegant way to do it, but here's one way (I use suffix s to indicate dimension, i.e. xss is a two-dimensional array).
val xsss = Array.ofDim[Int](512, 8, 8)
for (xss <- xsss; xs <- xss; i <- 0 until 8)
xs(i) = 1
Or, using transform it gets even shorter:
for (xss <- xsss; xs <- xss)
xs transform (_ => 1)
for {
i <- a.indices
j <- a(i).indices
k <- a(i)(j).indices
} a(i)(j)(k) = 1
or
for {
e <- a
ee <- e
i <- ee.indices
} ee(i) = 1
See: http://www.scala-lang.org/api/current/index.html#scala.Array$
You have Array.fill to initialize an array of one to five dimensions to some given value, and Array.tabulate to initialize an array of one to five dimensions from the current indexes:
scala> Array.fill(2,1,1)(42)
res1: Array[Array[Array[Int]]] = Array(Array(Array(42)), Array(Array(42)))
scala> Array.tabulate(3,2,1){ (x,y,z) => x+y+z }
res2: Array[Array[Array[Int]]] = Array(Array(Array(0), Array(1)), Array(Array(1), Array(2)), Array(Array(2), Array(3)))
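Applied to the dimensions from the question, Array.fill can create and initialise the array in one step. A short sketch (FillDemo and ones are illustrative names):

```scala
object FillDemo {
  // 512 x 8 x 8 array with every element set to 1.
  val ones: Array[Array[Array[Int]]] = Array.fill(512, 8, 8)(1)
}
```

This only helps when the array can be created and filled at the same time; for an array that already exists, one of the for-based approaches above is still needed.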