How can I modify each of the values in array? - scala

I would like to change all values in a multidimensional array but I'm receiving the "reassignment to val" error.
Scala code:
var cal = Array.ofDim[Double](300, 10000000);
cal.foreach(x => {
x.foreach({o => o = 5.1} //here it'll be more complicated code
)});
Does anybody know how to reassign the values in an Array?
Thanks

The bulk operation for in-place modification of mutable sequences is transform.
for (y <- 0 until cal.length) {
cal(y).transform(x => 5.1)
}
You can also do:
for (y <- 0 until cal.length) {
val row = cal(y)
for (x <- 0 until row.length) {
row(x) = 5.1
}
}

I am not sure whether this is the best way but it works:
val cal = Array.ofDim[Double](300, 10000000)
(0 until cal.length).foreach(rowIndex => {
val row = cal(rowIndex)
(0 until row.length).foreach(colIndex => {
row(colIndex) = 5.1 //here it'll be more complicated code
})
})
The reason for the error you got is that you are trying to assign a value to the immutable function parameter o in o => o = 5.1.
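To make the in-place approach concrete, here is a minimal sketch on a deliberately small array. The helper name updateAll and its parameter f are hypothetical; f stands in for the "more complicated code" from the question.

```scala
object InPlaceUpdate {
  // Overwrites every cell of a 2D array in place.
  // `f` is a hypothetical stand-in for whatever computes the new value.
  def updateAll(grid: Array[Array[Double]])(f: Double => Double): Unit =
    for (row <- grid; i <- row.indices) row(i) = f(row(i))

  def main(args: Array[String]): Unit = {
    // Small dimensions, for illustration only.
    val cal = Array.ofDim[Double](3, 4)
    updateAll(cal)(_ => 5.1)
    println(cal.map(_.mkString(" ")).mkString("\n"))
  }
}
```

The assignment goes through the index (row(i) = ...), which writes into the array itself, rather than through the lambda parameter, which is only an immutable copy of the element.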

Since you're using var, you can simply reassign it:
var cal = Array.ofDim[Double](300, 10000000);
cal = cal.map{
a=>a.map{
o => 1.5 // complex calculations
}
}
NB: This is not idiomatic Scala or even functional, but it works.

Related

How to append a string to a list or array in a for loop in scala?

var RetainKeyList: mutable.Seq[String] = new scala.collection.mutable.ListBuffer[String]()
for(element<-ts_rdd)
{
var elem1 = element._1
var kpssSignificance: Double = 0.05
var dOpt: Option[Int] = (0 to 2).find
{
diff =>
var testTs = differencesOfOrderD(element._2, diff)
var (stat, criticalValues) = kpsstest(testTs, "c")
stat < criticalValues(kpssSignificance)
}
var d = dOpt match
{
case Some(v) => v
case None => 300000
}
if(d.equals(300000))
{
println("Bad Key: " + elem1)
RetainKeyList += elem1
}
}
Hi all,
I created an empty mutable list buffer var RetainKeyList: mutable.Seq[String] = new scala.collection.mutable.ListBuffer[String]() and I am trying to add a string elem1 to it in a for loop.
When I try to compile the code it hangs with no error message, but if I remove the line RetainKeyList += elem1 I am able to print all of the elem1 strings properly.
What am I doing wrong here? Is there a cleaner way to collect all the elem1 strings generated in the for loop?
Long story short, your code is running in a distributed environment, so the local collection is not modified. Someone asks this question every week; please, if you do not understand the implications of distributed computing, do not use a distributed framework like Spark.
Also, you are overusing mutability throughout. Mutability and a distributed environment don't play nicely together.
Anyway, here is a better way to solve your problem.
val retainKeysRdd = ts_rdd.map {
case (elem1, elem2) =>
val kpssSignificance = 0.05d
val dOpt = (0 to 2).find { diff =>
val testTs = differencesOfOrderD(elem2, diff)
val (stat, criticalValues) = kpsstest(testTs, "c")
stat < criticalValues(kpssSignificance)
}
(elem1 -> dOpt)
} collect {
case (key, None) => key
}
This returns an RDD containing the keys to retain. If you are really sure you need this as a local collection and that it won't blow up your memory, you can do this:
val retainKeysList = retainKeysRdd.collect().toList
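The map-then-collect shape used above can be tried without Spark on a local Seq. This is only a sketch: badKeys and the passes predicate are hypothetical names, and passes stands in for the KPSS check, which is not reproduced here.

```scala
object RetainKeys {
  // Returns the keys for which no differencing order in 0..2 satisfies `passes`.
  // `passes` is a hypothetical stand-in for the KPSS test in the answer above.
  def badKeys[K, V](data: Seq[(K, V)])(passes: (V, Int) => Boolean): Seq[K] =
    data
      .map { case (key, series) => key -> (0 to 2).find(diff => passes(series, diff)) }
      .collect { case (key, None) => key } // keep only keys with no passing order

  def main(args: Array[String]): Unit = {
    // Toy data: the "series" is just an Int, and it passes once diff reaches it.
    val data = Seq("a" -> 1, "b" -> 5)
    println(badKeys(data)((v, diff) => v <= diff)) // List(b)
  }
}
```

Nothing is mutated: the result is a new collection derived from the input, which is the property that lets the same shape run on an RDD.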

Is there any way to replace nested For loop with Higher order methods in scala

I have a MutableList and want to take the sum of its rows and replace rows with other values based on some criteria. The code below works fine for me, but is there any way to get rid of the nested for loops, since for loops slow down performance? I want to use Scala higher-order methods instead of nested for loops. I tried the foldLeft() higher-order method to replace a single for loop, but could not work out how to replace the nested for loop with it.
def func(nVect : Int , nDim : Int) : Unit = {
var Vector = MutableList.fill(nVect,nDimn)(math.random)
var V1Res =0.0
var V2Res =0.0
var V3Res =0.0
for(i<- 0 to nVect -1) {
for (j <- i +1 to nVect -1) {
var resultant = Vector(i).zip(Vector(j)).map{case (x,y) => x + y}
V1Res = choice(Vector(i))
V2Res = choice(Vector(j))
V3Res = choice(resultant)
if(V3Res > V1Res){
Vector(i) = res
}
if(V3Res > V2Res){
Vector(j) = res
}
}
}
}
There are no "for loops" in this code; the for statements are already converted to foreach calls by the compiler, so it is already using higher-order methods. These foreach calls could be written out explicitly, but it would make no difference to the performance.
Making the code compile and then cleaning it up gives this:
def func(nVect: Int, nDim: Int): Unit = {
val vector = Array.fill(nVect, nDim)(math.random)
for {
i <- 0 until nVect
j <- i + 1 until nVect
} {
val res = vector(i).zip(vector(j)).map { case (x, y) => x + y }
val v1Res = choice(vector(i))
val v2Res = choice(vector(j))
val v3Res = choice(res)
if (v3Res > v1Res) {
vector(i) = res
}
if (v3Res > v2Res) {
vector(j) = res
}
}
}
Note that using a single for does not make any difference to the result; it just looks better!
At this point it gets difficult to make further improvements. The only parallelism possible is with the inner map call, but vectorising this is almost certainly a better option. If choice is expensive then the results could be cached, but this cache needs to be updated when vector is updated.
If the choice could be done in a second pass after all the cross-sums have been calculated then it would be much more parallelisable, but clearly that would also change the results.
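To illustrate the earlier point that a for statement is already a chain of foreach calls, here is a small sketch in which both versions visit the same (i, j) pairs. The object and method names are made up for the example.

```scala
object ForDesugar {
  // Pairs (i, j) with i < j, visited with a two-generator for statement.
  def pairsWithFor(n: Int): Vector[(Int, Int)] = {
    val out = Vector.newBuilder[(Int, Int)]
    for { i <- 0 until n; j <- i + 1 until n } out += ((i, j))
    out.result()
  }

  // The same traversal written as explicit nested foreach calls.
  def pairsWithForeach(n: Int): Vector[(Int, Int)] = {
    val out = Vector.newBuilder[(Int, Int)]
    (0 until n).foreach { i =>
      (i + 1 until n).foreach { j => out += ((i, j)) }
    }
    out.result()
  }

  def main(args: Array[String]): Unit = {
    println(pairsWithFor(3))     // Vector((0,1), (0,2), (1,2))
    println(pairsWithForeach(3)) // Vector((0,1), (0,2), (1,2))
  }
}
```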

Spark: calling a function inside of mapPartitionsWithIndex

I got very strange results with the following code.
I only want to take the partition data and iterate for each data, X times.
Here I'm calling to my function for each partition:
val myRDDResult = myRDD.mapPartitionsWithIndex( myFunction(_, _, limit), preservesPartitioning = true)
And the function is:
private def myFunction (partitionIndex: Long,
partitionData: Iterator[Array[(LabeledPoint,Int,Int)]], limit: Int): Iterator[String] = {
var newData = ArrayBuffer[String]()
if (partitionData.nonEmpty){
val partDataMap = partitionData.next.map{ case (lp, _, neighId) => (lp, neighId) }.toMap
var newString:String = ""
for {
(k1,_) <- partDataMap
i <- 0 to limit
_ = {
// ... some code to generate the content for `newString`
newData.+=(newString)
}
}yield ()
}
newData.iterator
}
Here are some values obtained:
partitionData  limit  newData  newData_expected
1640           250    411138   410000 (1640*250)
16256          27     288820   438912
I don't know if I'm misunderstanding some concept in my code.
I've also tried changing the for part for this idea: partDataMap.map{elem=> for (i <- 0 to limit){....}}
Any suggestions?
First, sorry because I downvoted/upvoted (click error) your question and since I didn't cancel it within 10 minutes, SO kept it upvoted.
Regarding your code, I think your expected results are wrong. I took the same code, simplified it a little, and got 411640 elements instead of 410000. The range 0 to limit is inclusive, so each element produces limit + 1 = 251 strings, and 1640 * 251 = 411640. Maybe I copied something incorrectly or overlooked something, but the code giving 411640 looks like this:
import scala.collection.mutable.ArrayBuffer

val limit = 250
val partitionData: Iterator[Array[Int]] = Seq((1 to 1640).toArray).toIterator
var newData = ArrayBuffer[String]()
if (partitionData.nonEmpty){
val partDataMap = partitionData.next.map{ nr => nr.toString }
for {
value <- partDataMap
i <- 0 to limit
_ = {
newData.+=(s"${value}_${i}")
}
} yield ()
}
println(s"new buffer=${newData}")
println(s"Buffer size = ${newData.size}")
Now to answer your question about why the mapPartitionsWithIndex results differ from your expectations: in my opinion it is because of the conversion from Array to Map. If the array contains duplicate keys, each key is kept only once. That would explain why in both cases (taking 411640 as the correct expected number for the first) you get fewer results than expected. To check, compare partDataMap.size with the length of the array it was built from; keep the result of partitionData.next in a val for this, because calling next a second time advances the iterator.
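Both effects discussed in this answer can be checked in isolation. The sketch below uses made-up helper names (generated, distinctKeys) and toy data; it is not the asker's code.

```scala
object CountCheck {
  // Number of strings produced when each of `keys` elements is paired
  // with every i in the inclusive range `0 to limit`.
  def generated(keys: Int, limit: Int): Int = keys * (0 to limit).size

  // Number of entries left after converting key/value pairs to a Map.
  def distinctKeys[K, V](pairs: Array[(K, V)]): Int = pairs.toMap.size

  def main(args: Array[String]): Unit = {
    println((0 to 250).size)      // 251, because `to` includes the upper bound
    println((0 until 250).size)   // 250
    println(generated(1640, 250)) // 411640

    // A duplicate key ("a") is kept only once by toMap.
    val pairs = Array("a" -> 1, "b" -> 2, "a" -> 3)
    println(pairs.length)         // 3
    println(distinctKeys(pairs))  // 2
  }
}
```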

Accessing list of lists inside a for

I have the following bit of code in scala which is not working:
var tacTable : List[List[TACList]] = List(List())
def gen(p: Program) = {
for (i <- 0 to p.classes.length){
for (j <- 0 to p.classes(i).methods.length){
var tacInstr = new TACList()
tacTable(i)(j) = tacInstr //error: application does not take parameters
}
}
}
Apparently, it has to do with the fact that I'm using j to access the list and j is used in for...how can I solve this?
For convenience you can work with this other example which gives the same error:
var l : List[List[Int]] = List(List(1,2),List(3,4))
for (i <- 0 to l.length) {
for (j <- 0 to l.length) {
l(i)(j) = 8
}
}
Lists are immutable.
Try this instead:
val tacTable = p.classes.map { _.methods.map { _ =>
new TACList()
}}
Since I cannot comment on the initial post, a side note here:
In a Scala for-comprehension you can use multiple generators in a single for, so the nesting you used is not necessary. You can write this instead (with until rather than to, because to is inclusive and would run one index past the end):
for (i <- 0 until l.length; j <- 0 until l(i).length) {
// your code
}
Furthermore, although it does not seem to apply in your case, if you wanted a flat-mapped result you should use the yield of the for-comprehension instead of mutating in the body.
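As a sketch of building the nested list with yield instead of assigning into it, the example below uses hypothetical minimal stand-ins for Program, its classes and methods, and TACList, since the real definitions are not shown in the question.

```scala
object BuildTable {
  // Hypothetical minimal stand-ins for the question's types.
  final case class Method(name: String)
  final case class Clazz(methods: List[Method])
  final case class Program(classes: List[Clazz])
  final class TACList

  // One inner list per class, one fresh TACList per method.
  def gen(p: Program): List[List[TACList]] =
    for (c <- p.classes) yield {
      for (_ <- c.methods) yield new TACList
    }

  def main(args: Array[String]): Unit = {
    val p = Program(List(Clazz(List(Method("a"), Method("b"))), Clazz(Nil)))
    println(gen(p).map(_.length)) // List(2, 0)
  }
}
```

The table is produced as the value of the expression, so no index is ever assigned and the immutability of List is not a problem.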

How can I loop though a set and reassign every item in collection with new value?

Hi I want to loop a set of Strings and convert them from String type to ObjectId type.
I tried this way:
followingIds.foreach(e => e = new ObjectId(e))
But I can't do that assignment.
I also tried using "for" but I don't know how to access each position of the Set by Index.
for (i <- 0 until following.size) {
following[i] = new ObjectId(following[i])
}
This doesn't work either.
Can anyone help me? Please!
If you insist on mutability, you can go with something like this (the var needs a type wide enough to hold both the strings and the ObjectIds):
var followingIds: Set[AnyRef] = Set("foo", "bar")
followingIds = followingIds.map(e => new ObjectId(e.toString))
But you can make your code more idiomatic Scala by keeping things immutable:
val followingIds = Set("foo", "bar")
val objectIds = followingIds.map(e => new ObjectId(e))
Now the value names are descriptive as well.
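Here is a self-contained sketch of the immutable version. The ObjectId case class below is a hypothetical stand-in so the example runs without a database driver; the real class would come from whichever driver library is in use.

```scala
object ConvertIds {
  // Hypothetical stand-in for the driver's ObjectId class.
  final case class ObjectId(hex: String)

  // Builds a new Set[ObjectId]; the input set is left untouched.
  def toObjectIds(ids: Set[String]): Set[ObjectId] = ids.map(ObjectId(_))

  def main(args: Array[String]): Unit = {
    val followingIds = Set("foo", "bar")
    val objectIds = toObjectIds(followingIds)
    println(objectIds.size)    // 2
    println(followingIds.size) // 2, the original set is unchanged
  }
}
```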
Java-1.4-like?
val mutableSet: collection.mutable.Set[AnyRef] = collection.mutable.Set[AnyRef]("0", "1", "10", "11")
//mutableSet: scala.collection.mutable.Set[AnyRef] = Set(0, 1, 10, 11)
for (el <- mutableSet) el match {
case s: String =>
mutableSet += ObjectId(s)
mutableSet -= s
s
case s => s
}
mutableSet
//res24: scala.collection.mutable.Set[AnyRef] = Set(ObjectId(0), ObjectId(11), ObjectId(10), ObjectId(1))