Using an if block in Scala for distributed computing is not recommended. I have some code and I want to replace the if with a Scala higher-order method. How can I do that?
The detailed code is given here.
The part of the code that contains the if block is:
var bat = DenseVector.fill(N)(new BAT12(d, MinVal, MaxVal))
bat.foreach { x =>
  x.BestPosition = x.position
  x.fitness = Sphere(x.position)
  x.BestFitness = x.fitness
}
bat.foreach { x =>
  if (x.BestFitness < GlobalBest_Fitness) {
    GlobalBest_Fitness = x.BestFitness
    GlobalBest_Position = x.BestPosition
  }
}
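Try:
bat.filter(_.BestFitness < GlobalBest_Fitness).foreach { x =>
  GlobalBest_Fitness = x.BestFitness
  GlobalBest_Position = x.BestPosition
}
Do the filter before the foreach, using the if condition as the filter predicate, and then run the foreach without any condition. Note that if filter builds its result eagerly, as it does on standard Scala collections, the predicate is evaluated against the value GlobalBest_Fitness had before the loop started, so the last element that passes the filter wins rather than the one with the smallest fitness.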
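If the goal is simply to find the element with the smallest fitness, a reduction avoids mutating shared variables altogether. The following is only a sketch under assumptions: Bat is a hypothetical stand-in for the BAT12 class from the question, and a plain Seq is used instead of a Breeze DenseVector.

```scala
object GlobalBestSketch {
  // Hypothetical stand-in for the question's BAT12 class (assumption).
  final case class Bat(bestPosition: Vector[Double], bestFitness: Double)

  // Returns the better of two bats. Ties keep the earlier one,
  // mirroring the strict `<` comparison used in the question.
  def better(a: Bat, b: Bat): Bat =
    if (b.bestFitness < a.bestFitness) b else a

  // None for an empty collection, otherwise the bat with the smallest bestFitness.
  def globalBest(bats: Seq[Bat]): Option[Bat] =
    bats.reduceOption(better)
}
```

The if here is an expression that returns a value rather than a statement that updates outside state. A pairwise combine like this is the shape a parallel reduce expects, although with ties the choice between equally fit bats may then depend on evaluation order.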
I have the following Scala snippet from my code and I am not able to convert it into a functional style. I managed to do so at other places in my code, but not here. The requirement is that "NA" should be returned for a line only once all the patterns have been tried without a match. The following code does that, but it is not in a functional style (for-yield):
var matches = new ListBuffer[List[String]]()
for (line <- caselist) {
  var count = 0
  for (pat <- pattern if (!pat.findAllIn(line).isEmpty)) {
    count += 1
    matches += pat.findAllIn(line).toList
  }
  if (count == 0) {
    matches += List("NA")
  }
}
return matches.toList
}
Your question is not entirely complete, so I can't be sure, but I believe the following will do the job:
for {
  line <- caselist
  matches = pattern.map(_.findAllIn(line).toList)
} yield matches.flatten match {
  case Nil => List("NA")
  case ms => ms
}
This should do the job. Use map and filter to generate the matches for each line, then check whether any pattern matched at all. Note that foreach returns Unit, so map and flatMap are needed to get a result back.
caselist.flatMap { line =>
  // One list of matches per pattern
  val results = pattern.map(pat => pat.findAllIn(line).toList)
  // Keep only the patterns that actually matched
  val filteredResults = results.filter(_.nonEmpty)
  if (filteredResults.isEmpty) List(List("NA"))
  else filteredResults
}
Functional doesn't mean you can't have intermediate named values.
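As a self-contained sketch of the same idea, the snippet below wraps the logic in a function so it can be tried in isolation. It assumes caselist is a List[String] and pattern is a List[Regex], which the question does not state; the object and function names are made up for illustration.

```scala
import scala.util.matching.Regex

object MatchSketch {
  // For each line, emit one list of matches per matching pattern,
  // or a single List("NA") when no pattern matched that line.
  def matchesFor(caselist: List[String], pattern: List[Regex]): List[List[String]] =
    caselist.flatMap { line =>
      val found = pattern.map(_.findAllIn(line).toList).filter(_.nonEmpty)
      if (found.isEmpty) List(List("NA")) else found
    }
}
```

For example, with the patterns "[0-9]+".r and "[a-c]+".r, the line "abc 123" contributes List("123") and List("abc"), while the line "xyz" contributes List("NA").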
I have a file like below:
0; best wrap ear market pair pair break make
1; time sennheiser product better earphone fit
1; recommend headphone pretty decent full sound earbud design
0; originally buy work gym work well robust sound quality good clip
1; terrific sound great fit toss mine profuse sweater headphone
0; negative experienced sit chair back touch chair earplug displace hurt
...
I want to extract the number and store it in a variable a for each document. I've tried:
var grouped_with_wt = data.flatMap({ (line) =>
  val words = line.split(";").split(" ")
  words.map(w => {
    val a =
    (line.hashCode(),(vocab_lookup.value(w), a))
  })
}).groupByKey()
The expected output is:
(1453543,(best,0),(wrap,0),(ear,0),(market,0),(pair,0),(break,0),(make,0))
(3942334,(time,1),(sennheiser,1),(product,1),(better,1),(earphone,1),(fit,1))
...
After generating the above results, I used them in this code to generate the final results:
val Beta = DenseMatrix.zeros[Int](V, S)
val Beta_c = grouped_with_wt.flatMap(kv => {
  kv._2.map(wt => {
    Beta(wt._1, wt._2) += 1
  })
})
Final results:
1 0
1 0
1 0
1 0
...
This code doesn't work well. Can anybody help me? I want code like the above.
val inputRDD = sc.textFile("input dir ")
val outRDD = inputRDD.map(r => {
  // Split "label; words..." into the label and the text
  val tuple = r.split(";")
  val key = tuple(0)
  val words = tuple(1).trim().split(" ")
  // Pair every word with the label of its line
  val outArr = words.map(w => (w, key))
  (r.hashCode, outArr.mkString(","))
})
outRDD.saveAsTextFile("output dir")
Output:
(-1704185638,(best,0),(wrap,0),(ear,0),(market,0),(pair,0),(pair,0),(break,0),(make,0))
(147969209,(time,5),(sennheiser,5),(product,5),(better,5),(earphone,5),(fit,5))
(1145947974,(recommend,1),(headphone,1),(pretty,1),(decent,1),(full,1),(sound,1),(earbud,1),(design,1))
(838871770,(originally,4),(buy,4),(work,4),(gym,4),(work,4),(well,4),(robust,4),(sound,4),(quality,4),(good,4),(clip,4))
(934228708,(terrific,5),(sound,5),(great,5),(fit,5),(toss,5),(mine,5),(profuse,5),(sweater,5),(headphone,5))
(659513416,(negative,-3),(experienced,-3),(sit,-3),(chair,-3),(back,-3),(touch,-3),(chair,-3),(earplug,-3),(displace,-3),(hurt,-3))
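The per-line logic can be tried without a cluster. The sketch below is plain Scala with no Spark dependency; parseLine and countPairs are hypothetical helper names, and it assumes every line has the form "label; word word ..." with an integer label. The counting step is one possible replacement for incrementing a shared matrix inside a transformation.

```scala
object LabelSketch {
  // Splits "label; words..." and pairs every word with the line's label.
  // Assumes the line contains a ';' and the label is an integer.
  def parseLine(line: String): (Int, List[(String, Int)]) = {
    val parts = line.split(";", 2)
    val label = parts(0).trim.toInt
    val words = parts(1).trim.split("\\s+").toList
    (line.hashCode, words.map(w => (w, label)))
  }

  // Counts how often each (word, label) pair occurs across all documents.
  def countPairs(docs: Seq[(Int, List[(String, Int)])]): Map[(String, Int), Int] =
    docs.flatMap(_._2).groupBy(identity).map { case (pair, hits) => pair -> hits.size }
}
```

In Spark, parseLine would be the function passed to map, and the counts would come from a key-based aggregation rather than from a matrix mutated inside the closure.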
I read a file from HDFS in which each line contains x1,x2,y1,y2, representing an envelope in JTS.
I would like to use this data to build an STRtree in a foreach.
val inputData = sc.textFile(inputDataPath).cache()
val strtree = new STRtree
inputData.foreach(line => {
  val array = line.split(",").map(_.toDouble)
  val e = new Envelope(array(0), array(1), array(2), array(3))
  println("envelope is " + e)
  strtree.insert(e, new Rectangle(array(0), array(1), array(2), array(3)))
})
As you can see, I also print the e object.
To my surprise, when I log the size of strtree, it is zero! It seems that the insert method has no effect here.
By the way, if I hard-code some test data line by line, the strtree is built correctly.
One more thing: the project is packaged into a jar and submitted in the spark-shell.
So, why does the method in foreach not work?
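The function passed to foreach runs on the executors against a serialized copy of strtree, so the tree on the driver is typically never modified. You will have to collect() to do this: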
inputData.collect().foreach(line => {
... // your code
})
Alternatively, do the parsing in a map on the cluster and collect only the resulting pairs, then build the tree on the driver:
val pairs = inputData.map(line => {
  val array = line.split(",").map(_.toDouble)
  val e = new Envelope(array(0), array(1), array(2), array(3))
  println("envelope is " + e)
  (e, new Rectangle(array(0), array(1), array(2), array(3)))
})
pairs.collect().foreach(pair => {
  strtree.insert(pair._1, pair._2)
})
Use .map() instead of .foreach() and assign the result.
foreach does not return the result of the applied function. It is meant for side effects such as sending data somewhere, storing it in a database, or printing.
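The difference between the two can be seen on an ordinary collection. This is only an illustrative sketch without Spark or JTS: Box is a hypothetical stand-in for Envelope, and a mutable buffer stands in for the STRtree.

```scala
import scala.collection.mutable.ArrayBuffer

object EnvelopeSketch {
  // Hypothetical stand-in for the JTS Envelope; it only holds the four coordinates.
  final case class Box(x1: Double, x2: Double, y1: Double, y2: Double)

  // Pure parsing step: the kind of function that would be passed to map.
  def parse(line: String): Box = {
    val a = line.split(",").map(_.trim.toDouble)
    Box(a(0), a(1), a(2), a(3))
  }

  // map returns the parsed values; the local foreach then fills the index
  // in the same process that owns it.
  def buildIndex(lines: Seq[String]): Vector[Box] = {
    val index = ArrayBuffer.empty[Box] // stands in for the STRtree
    lines.map(parse).foreach(box => index += box)
    index.toVector
  }
}
```

The parsing is kept free of side effects so that it could run anywhere, while the insertion happens where the index lives.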
I use this code to read a file into memory:
val lines = Source.fromFile(fileToRead, "utf-8").getLines
To iterate over some of the lines I use:
lines.take(linesToReadFromDataFile).foreach(line => {
Sometimes I may want to iterate over all lines:
lines.foreach(line => {
To decide whether to read all of the lines, I could use a boolean 'useAllLines' and do something like:
if(useAllLines)
  lines.foreach(line => {
else
  lines.take(linesToReadFromDataFile).foreach(line => {
Is there a better way of achieving this in Scala?
I guess this will be enough:
val toIterate =
  if (useAllLines)
    lines
  else
    lines.take(linesToReadFromDataFile)
for (line <- toIterate) {
  ...
}
You could also combine useAllLines and linesToReadFromDataFile in a single variable of type Option[Int]:
val toIterate = optionLinesToReadFromDataFile.map{ lines.take(_) }.getOrElse(lines)
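A minimal sketch of the Option[Int] variant, written as a small function so it can be tested on its own. The name limited is made up for illustration; None means "use all lines" and Some(n) means "use at most n".

```scala
object LineLimitSketch {
  // Returns the iterator unchanged for None, or limited to n elements for Some(n).
  def limited[A](lines: Iterator[A], limit: Option[Int]): Iterator[A] =
    limit.fold(lines)(n => lines.take(n))
}
```

Because an Iterator can only be traversed once, the original lines value should not be reused after it has been passed to this function.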
// lines is an Iterator, so calling lines.length would consume it; Int.MaxValue avoids that
lines.take(if (useAllLines) Int.MaxValue else linesToReadFromDataFile).foreach(