How to print the arrays in table or dimension format? - scala

Actually this works
object Matrixmul extends App {
  val a = Array(Array(1, 2, 3), Array(4, 5, 6), Array(7, 8, 9))
  val b = Array(Array(1, 2, 3), Array(4, 5, 6), Array(7, 8, 9))
  val c = Array.ofDim[Int](3, 3)
  val sum = Array.ofDim[Int](3, 3)
  println(a.mkString(" "))
  // Flatten the matrix into a single sequence of elements
  val elements = for {
    row <- a
    ele <- row
  } yield ele
  for (array1 <- elements)
    println(" the 1st matrix array elements are : " + array1)
}
This prints the array in the following format:
the 1st matrix array elements are : 1
the 1st matrix array elements are : 2
the 1st matrix array elements are : 3
the 1st matrix array elements are : 4
the 1st matrix array elements are : 5
the 1st matrix array elements are : 6
the 1st matrix array elements are : 7
the 1st matrix array elements are : 8
the 1st matrix array elements are : 9
But I need it in dimension (table) format:
1 2 3
4 5 6
7 8 9

How about the following:
val a = Array(Array(1, 2, 3), Array(4, 5, 6), Array(7, 8, 9))
a.foreach(row => println(row.mkString(" ")))
Which will print your dimensional format:
1 2 3
4 5 6
7 8 9

Small variation on the subject
val a = Array(Array(1, 2, 3), Array(4, 5, 6), Array(7, 8, 9))
println(a.map(_.mkString(" ")).mkString("\n"))
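Both answers line up nicely because every entry is one character wide. If the entries have different widths, one option is to pad each cell to the width of the widest entry. This is only a sketch: MatrixFormat and format are names made up for this example, not part of the standard library.

```scala
object MatrixFormat {
  // Render a 2D array as one line per row, right-aligning every cell
  // to the width of the widest entry so that the columns line up.
  def format(m: Array[Array[Int]]): String = {
    val width = m.flatten
      .map(_.toString.length)
      .foldLeft(0)((a, b) => math.max(a, b))
    val cellFormat = "%" + width + "d" // e.g. "%3d"; only used when a cell exists
    m.map(row => row.map(v => cellFormat.format(v)).mkString(" ")).mkString("\n")
  }
}
```

Calling println(MatrixFormat.format(a)) on the 3x3 array from the question prints the same three lines as the answers above, since the width is 1 there.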

Related

Reformatting Dataframe Containing Array to RowMatrix

I have this dataframe in the following format:
+----------+
| features |
+----------+
|[1,4,7,10]|
|[2,5,8,11]|
|[3,6,9,12]|
+----------+
Script to create sample dataframe:
from pyspark.mllib.linalg.distributed import IndexedRow

rows2 = sc.parallelize([
    IndexedRow(0, [1, 4, 7, 10]),
    IndexedRow(1, [2, 5, 8, 1]),
    IndexedRow(1, [3, 6, 9, 12]),
])
rows_df = rows2.toDF()
row_vec = rows_df.drop("index")
row_vec.show()
The features column holds 4 features per row, and there are 3 rows. I want to convert this data to a RowMatrix whose rows and columns are laid out as in the following mat:
from pyspark.mllib.linalg.distributed import RowMatrix
rows = sc.parallelize([(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12)])
# Convert to RowMatrix
mat = RowMatrix(rows)
# Calculate exact and approximate similarities
exact = mat.columnSimilarities()
approx = mat.columnSimilarities(0.05)
Basically, I want to transpose the dataframe into the new format so that I can run the columnSimilarities() function. I have a much larger dataframe that contains 50 features, and 39000 rows.
Is this what you are trying to do? I dislike using collect(), but I don't think it can be avoided here, since you want to reshape a structured object into a matrix, right?
import numpy as np
X = np.array(row_vec.select("_2").collect()).reshape(-1, 3)
X = sc.parallelize(X)
for i in X.collect():
    print(i)
[1 4 7]
[10 2 5]
[8 1 3]
[ 6 9 12]
I figured it out, I used the following:
from pyspark.mllib.linalg.distributed import RowMatrix
features_rdd = row_vec.select("features").rdd.map(lambda row: row[0])
features_mat = RowMatrix(features_rdd)
from pyspark.mllib.linalg.distributed import CoordinateMatrix, MatrixEntry
coordmatrix_features = CoordinateMatrix(
    features_mat.rows.zipWithIndex().flatMap(
        lambda x: [MatrixEntry(x[1], j, v) for j, v in enumerate(x[0])]
    )
)
transposed_rowmatrix_features = coordmatrix_features.transpose().toRowMatrix()
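To see why going through coordinate entries gives a transpose, here is the same idea on plain local collections, written in Scala rather than PySpark. It is only an illustration of the index swap: Entry and TransposeSketch are made-up names, nothing here is distributed, and it does not use Spark at all.

```scala
object TransposeSketch {
  // One (row, column, value) triple per cell, playing the role MatrixEntry has above.
  final case class Entry(i: Int, j: Int, value: Double)

  def transpose(rows: Seq[Seq[Double]]): Seq[Seq[Double]] = {
    // Tag every value with its row index i and column index j.
    val entries = rows.zipWithIndex.flatMap { case (row, i) =>
      row.zipWithIndex.map { case (v, j) => Entry(i, j, v) }
    }
    // Transposing is just swapping the two indices of every entry.
    val swapped = entries.map(e => Entry(e.j, e.i, e.value))
    // Rebuild rows from the swapped entries, ordered by row and then by column.
    swapped
      .groupBy(_.i)
      .toSeq
      .sortBy { case (rowIndex, _) => rowIndex }
      .map { case (_, rowEntries) => rowEntries.sortBy(_.j).map(_.value) }
  }
}
```

Applied to the 3x4 features table from the question, this yields the four rows (1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12).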

For loop to create tuples of adjacent elements

I have an array
[1,2,2,3,4,6,2,4,6,8,2,3,5]
I want to iterate over this array using a for loop to get a collection of tuples of adjacent elements. How should I code this in Scala?
Expected output :
1-2|2-2|2-3|3-4|4-6|6-2|2-4|4-6|6-8|8-2|2-3|3-5
If you want output like 1-2|2-2|2-3|3-4|..., as you mentioned in your comment, you can try the following:
val arr = Array(1,2,2,3,4,6,2,4,6,8,2,3,5)
// first join the elements of each window with "-", then join the windows with "|"
val str = arr.sliding(2).map(_.mkString("-")).mkString("|")
print(str)
//output
//1-2|2-2|2-3|3-4|4-6|6-2|2-4|4-6|6-8|8-2|2-3|3-5
In Scala you have the sliding function for that.
scala> val arr = Array(1,2,2,3,4,6,2,4,6,8,2,3,5)
arr: Array[Int] = Array(1, 2, 2, 3, 4, 6, 2, 4, 6, 8, 2, 3, 5)
scala> arr.sliding(2).foreach(tuple => println(tuple.mkString(" ")))
1 2
2 2
2 3
3 4
4 6
6 2
2 4
4 6
6 8
8 2
2 3
3 5
scala> arr.sliding(2).map(tuple => tuple.mkString("-")).mkString("|")
res10: String = 1-2|2-2|2-3|3-4|4-6|6-2|2-4|4-6|6-8|8-2|2-3|3-5
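Note that sliding(2) yields two-element arrays, not tuples. If you need actual tuples, as the question asks, here is a small sketch. AdjacentPairs and its method names are invented for this example; zip, drop and the for comprehension are standard.

```scala
object AdjacentPairs {
  // Pair every element with its successor; zip stops at the shorter side,
  // so an empty or single-element input simply gives no pairs.
  def pairs[A](xs: Seq[A]): Seq[(A, A)] = xs.zip(xs.drop(1))

  // The same result written as an index-based for comprehension.
  def pairsWithFor(xs: IndexedSeq[Int]): IndexedSeq[(Int, Int)] =
    for (i <- 0 until xs.length - 1) yield (xs(i), xs(i + 1))

  // Render the pairs in the a-b|c-d form from the question.
  def render(xs: Seq[Int]): String =
    pairs(xs).map { case (a, b) => s"$a-$b" }.mkString("|")
}
```

With the array from the question, AdjacentPairs.render(arr.toSeq) gives the same string as the sliding version.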

Create 2D array and store value into each element of that array in Scala

I am working on a Scala exercise which asks me to create a 2D array of 4 rows and 5 columns and store row index + column index + 5 in each element. I also have to sum the array by rows and then by columns, and print the row totals and the column totals. I am so confused, and I only know how to create an empty array.
val matrix = Array.ofDim[Int](4, 5)
Can you teach me how to do the rest of this exercise?
I will not tell you "the rest of the exercise" but I will try to show one way to create a 2D collection, like an array in this case:
val matrix1D = for {
  rowIndex <- (0 until 4).toArray
  colIndex <- (0 until 5).toArray
} yield rowIndex + colIndex + 5
Where
scala> :t matrix1D
Array[Int]
Now the result of this for-comprehension is the 1D version of your 2D array.
EDIT
I could probably give you a few more hints:
scala> (0 to 11).toArray.grouped(4).toArray
res10: Array[Array[Int]] = Array(Array(0, 1, 2, 3), Array(4, 5, 6, 7), Array(8, 9, 10, 11))
scala> .transpose
res11: Array[Array[Int]] = Array(Array(0, 4, 8), Array(1, 5, 9), Array(2, 6, 10), Array(3, 7, 11))
EDIT
After you create matrix2D from matrix1D:
val matrix2D = matrix1D.??????????????????
Where
scala> :t matrix2D
Array[Array[Int]]
To print it out, you could simply use mkString:
scala> matrix2D.map(_.mkString("\t")).mkString("\n")
res32: String =
5 6 7 8 9
6 7 8 9 10
7 8 9 10 11
8 9 10 11 12
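For comparison, a different route from the matrix1D one above is Array.tabulate, which builds the 2D array directly from a function of the indices. This is a sketch rather than the intended solution to the exercise; MatrixSums and its field names are just for illustration.

```scala
object MatrixSums {
  // Array.tabulate(4, 5)(f) calls f once for every (row, column) index pair.
  val matrix: Array[Array[Int]] = Array.tabulate(4, 5)((row, col) => row + col + 5)

  // Summing each inner array gives the row totals.
  val rowTotals: Array[Int] = matrix.map(_.sum)

  // Transposing first turns columns into rows, so the same map gives column totals.
  val colTotals: Array[Int] = matrix.transpose.map(_.sum)
}
```

Printing works as before, for example println(MatrixSums.rowTotals.mkString(" ")). Both sets of totals should add up to the same grand total, which is a handy sanity check.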

Scala trying to count instances of a digit in a number

This is my first day using Scala. I am trying to build a string of the number of times each digit appears in a number. For instance, the number 4310227 would return "1121100100" because 0 appears once, 1 appears once, 2 appears twice and so on...
def pow(n:Int) : String = {
  val cubed = (n * n * n).toString
  val digits = 0 to 9
  val str = ""
  for (a <- digits) {
    println(a)
    val b = cubed.count(_==a.toString)
    println(b)
  }
  return cubed
}
It doesn't seem to work. I would like some Scala-specific reasons why, and to know whether I should even be going about it in this manner. Thanks!
When you iterate over strings, which is what you are doing when you call String#count(), you are working with Chars, not Strings. You don't want to compare these two with ==, since they aren't the same type of object.
One way to solve this problem is to call Char#toString() before performing the comparison, e.g., amend your code to read cubed.count(_.toString==a.toString).
As Rado and cheeken said, you're comparing a Char with a String, which will never be equal. An alternative to cheeken's answer of converting each character to a string is to create a range from chars, i.e. '0' to '9':
val digits = '0' to '9'
...
val b = cubed.count(_ == a)
Note that if you want the Int that a Char represents, you can call char.asDigit.
Aleksey's, Ren's and Randall's answers are something you will want to strive towards as they separate out the pure solution to the problem. However, given that it's your first day with Scala, depending on what background you have, you might need a bit more context before understanding them.
Fairly simple:
scala> ("122333abc456xyz" filter (_.isDigit)).foldLeft(Map.empty[Char, Int]) ((histo, c) => histo + (c -> (histo.getOrElse(c, 0) + 1)))
res1: scala.collection.immutable.Map[Char,Int] = Map(4 -> 1, 5 -> 1, 6 -> 1, 1 -> 1, 2 -> 2, 3 -> 3)
This is perhaps not the fastest approach, because intermediate data types like String and Char are used, but it is one of the simplest:
def countDigits(n: Int): Map[Int, Int] =
  n.toString.groupBy(x => x) map { case (n, c) => (n.asDigit, c.size) }
Example:
scala> def countDigits(n: Int): Map[Int, Int] = n.toString.groupBy(x => x) map { case (n, c) => (n.asDigit, c.size) }
countDigits: (n: Int)Map[Int,Int]
scala> countDigits(12345135)
res0: Map[Int,Int] = Map(5 -> 2, 1 -> 2, 2 -> 1, 3 -> 2, 4 -> 1)
Where myNumAsString is a String, e.g. "15625":
myNumAsString.groupBy(x => x).map(x => (x._1, x._2.length))
Result = Map(2 -> 1, 5 -> 2, 1 -> 1, 6 -> 1)
i.e. a map containing each digit with its corresponding count.
What this does is take your string and group its characters by value (so for the initial string "15625", it produces a map of 1 -> 1, 2 -> 2, 6 -> 6, and 5 -> 55). The second part then maps each value to the count of how many times it occurs.
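The two steps can be looked at separately. A small sketch follows; GroupByDigits is a name made up here, and identity is the standard-library equivalent of x => x.

```scala
object GroupByDigits {
  // Step 1: group the characters by themselves. Each key is a digit character
  // and each value is the run of equal characters collected under that key.
  val groups = "15625".groupBy(identity)

  // Step 2: replace every run by its length to get the count per digit.
  val counts = groups.map { case (digit, run) => digit -> run.length }
}
```

Here groups('5') is the two-character run of fives, and counts('5') is 2.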
The counts for these hundred digits happen to fit into a hex digit.
scala> val is = for (_ <- (1 to 100).toList) yield r.nextInt(10)
is: List[Int] = List(8, 3, 9, 8, 0, 2, 0, 7, 8, 1, 6, 9, 9, 0, 3, 6, 8, 6, 3, 1, 8, 7, 0, 4, 4, 8, 4, 6, 9, 7, 4, 6, 6, 0, 3, 0, 4, 1, 5, 8, 9, 1, 2, 0, 8, 8, 2, 3, 8, 6, 4, 7, 1, 0, 2, 2, 6, 9, 3, 8, 6, 7, 9, 5, 0, 7, 6, 8, 7, 5, 8, 2, 2, 2, 4, 1, 2, 2, 6, 8, 1, 7, 0, 7, 6, 9, 5, 5, 5, 3, 5, 8, 2, 5, 1, 9, 5, 7, 2, 3)
scala> (new Array[Int](10) /: is) { case (a, i) => a(i) += 1 ; a } map ("%x" format _) mkString
warning: there were 1 feature warning(s); re-run with -feature for details
res7: String = a8c879caf9
scala> (new Array[Int](10) /: is) { case (a, i) => a(i) += 1 ; a } sum
warning: there were 1 feature warning(s); re-run with -feature for details
res8: Int = 100
I was going to point out that no one used a char range, but now I see Kristian did.
def pow(n:Int) : String = {
  val cubed = (n * n * n).toString
  val cnts = for (a <- '0' to '9') yield cubed.count(_ == a)
  (cnts map (c => ('0' + c).toChar)).mkString
}
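One thing none of the versions above address is that n * n * n is computed as an Int, so it silently overflows once |n| is larger than 1290. Here is a sketch of the char-range approach that cubes in Long arithmetic instead. DigitCounts, countsOf and cubeDigitCounts are names invented for this example.

```scala
object DigitCounts {
  // Count how often each digit character '0'..'9' occurs in s, in that order.
  // Counts are appended in decimal, so the result has exactly ten characters
  // only as long as every count stays below 10.
  def countsOf(s: String): String =
    ('0' to '9').map(d => s.count(_ == d)).mkString

  // Cube in Long arithmetic to avoid the Int overflow described above.
  def cubeDigitCounts(n: Int): String =
    countsOf((n.toLong * n * n).toString)
}
```

DigitCounts.countsOf("4310227") gives "1121100100", matching the example in the question. A minus sign in the string is ignored, since only digit characters are counted.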

Scala: Using span with modular arithmetic

I have a List[Int] from 1 to 10 and want to make a List[List[Int]] containing two List[Int]: one list containing even numbers and the other containing odd numbers. The result should be like this:
List(List(2,4,6,8,10),List(1,3,5,7,9))
I tried these things:
1.to(10).toList.span((x:Int) => x % 2 == 0)
and
val lst = 1.to(10).toList; lst span (_%2==0)
However, neither of these worked.
Can someone help me with this?
The method you need to use is partition, not span:
scala> (1 to 10).partition(_ % 2 == 0)
res0: (IndexedSeq[Int], IndexedSeq[Int]) = (Vector(2, 4, 6, 8, 10),Vector(1, 3, 5, 7, 9))
Since you want a List[List[Int]], you could do this:
val lst = (1 to 10).toList
val (evens, odds) = lst.partition(_ % 2 == 0)
val newList = List(evens,odds) // List(List(2, 4, 6, 8, 10), List(1, 3, 5, 7, 9))
The span method can only be used to split a sequence at a single point:
scala> (1 to 10).span(_ < 5)
res1: (Range, Range) = (Range(1, 2, 3, 4),Range(5, 6, 7, 8, 9, 10))
When you tried lst.span(_ % 2 == 0), the program found that the first item, 1, did not pass the test (_ % 2 == 0), so all the elements were put in the second list, leaving none in the first.
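To make the difference concrete, here is a side-by-side sketch. PartitionVsSpan and isEven are just illustrative names; partition and span are the standard List methods.

```scala
object PartitionVsSpan {
  val lst: List[Int] = (1 to 10).toList

  def isEven(x: Int): Boolean = x % 2 == 0

  // partition tests every element and sorts it into one of two lists.
  val (evens, odds) = lst.partition(isEven)

  // span only takes the longest prefix that passes the test; lst starts
  // with 1, which fails, so the prefix is empty and rest is the whole list.
  val (prefix, rest) = lst.span(isEven)

  // span on a list that does start with even numbers stops at the first odd one.
  val (evenPrefix, tail) = List(2, 4, 5, 6).span(isEven)
}
```

So List(evens, odds) is the List[List[Int]] asked for, while span leaves 6 in tail in the last example even though it is even.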