Can't print the items of an Observable after grouping - reactive-programming

I can't understand why the following RxScala code does not work as expected:
import rx.lang.scala.Observable

object MyTest extends App {
  case class ProjectEvent(projectName: String, description: String)
  val projectEvents: Observable[ProjectEvent] = Observable.just(
    ProjectEvent("aaa", "d1"),
    ProjectEvent("bbb", "d2"),
    ProjectEvent("aaa", "d3")
  )
  lazy val grouped = projectEvents.groupBy(_.projectName).map { case (projectName, eventsOfThisProject) =>
    println("projectName: " + projectName)
    eventsOfThisProject.foreach(x => "######### event in project " + projectName + ": " + x)
    (projectName, eventsOfThisProject)
  }
  grouped.foreach(println)
}
I grouped the projectEvents by the projectName and want to print the items of each project. But when I run this code, it only prints:
projectName: aaa
(aaa,rx.lang.scala.JavaConversions$$anon$2@49de17f4)
projectName: bbb
(bbb,rx.lang.scala.JavaConversions$$anon$2@52f6438d)
No "######### event in project" line is printed.
I can't understand why. Is there anything I missed?

You forgot to use println in this line:
eventsOfThisProject.foreach(x => "######### event in project " + projectName + ": " + x)
The function in foreach just converts x to a String but doesn't print it.
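The difference can be seen in a minimal sketch. It uses standard Scala collections instead of RxScala so that it runs without extra dependencies; the object name ForeachPrintDemo and the helper describe are made up for illustration. The same point should carry over to the Observable version: the function passed to foreach has to call println itself.

```scala
object ForeachPrintDemo {
  case class ProjectEvent(projectName: String, description: String)

  val events: List[ProjectEvent] = List(
    ProjectEvent("aaa", "d1"),
    ProjectEvent("bbb", "d2"),
    ProjectEvent("aaa", "d3")
  )

  // Hypothetical helper: builds the line that the original code computed but never printed.
  def describe(projectName: String, event: ProjectEvent): String =
    "######### event in project " + projectName + ": " + event

  def main(args: Array[String]): Unit = {
    // Plain collections stand in for the Observable here (an assumption of this sketch).
    events.groupBy(_.projectName).foreach { case (projectName, eventsOfThisProject) =>
      println("projectName: " + projectName)
      // Prints nothing: the function only returns a String, which foreach discards.
      eventsOfThisProject.foreach(x => describe(projectName, x))
      // Prints each event: the String is handed to println.
      eventsOfThisProject.foreach(x => println(describe(projectName, x)))
    }
  }
}
```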

Related

Scala - Keep Map in foreach

var myMap: Map[String, Int] = Map()
myRDD.foreach { data =>
  println("1. " + data.name + " : " + data.time)
  myMap += (data.name -> data.time)
  println("2. " + myMap)
}
println("Total Map : " + myMap)
Result
A : 1
Map(A -> 1)
B: 2
Map(B -> 2) // deleted key A
C: 3
Map(C -> 3) // deleted Key A and B
Total Map : Map() // nothing
Somehow I cannot store the Map data in foreach. It keeps deleting or reinitializing the previous data when adding a new key and value.
Any idea why?
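Spark closures are serialized and executed in a separate context (remotely, when running on a cluster). Each task works on its own copy of myMap, so the myMap variable on the driver is never updated.
To get the data from the RDD as a map, there's a built-in operation, collectAsMap, which is available on RDDs of key-value pairs:
val myMap = myRDD.map(data => (data.name, data.time)).collectAsMap()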
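The underlying idea, deriving the map from the data as a returned value instead of mutating a captured variable, can be sketched without Spark. This is only a local analogue with standard collections. The Record type and the function name toNameTimeMap are hypothetical stand-ins for the elements of myRDD, whose real type is not shown in the question.

```scala
object MapFromRecordsDemo {
  // Hypothetical record type standing in for the elements of myRDD.
  case class Record(name: String, time: Int)

  // Build the map as a result of a transformation rather than
  // by updating a variable from inside foreach.
  def toNameTimeMap(records: Seq[Record]): Map[String, Int] =
    records.map(r => r.name -> r.time).toMap

  def main(args: Array[String]): Unit = {
    val records = Seq(Record("A", 1), Record("B", 2), Record("C", 3))
    println("Total Map : " + toNameTimeMap(records))
  }
}
```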

support and lift for fp-growth rules in mllib spark/scala

I would like to extract support and lift for association rules generated with FP-growth. Having found the rules with the code below, I manually go through the transactions and calculate support and lift. I wonder if there is a more elegant way to extract this information. Thanks!
val fpg = new FPGrowth()
  .setMinSupport(0.2)
  .setNumPartitions(10)
val model = fpg.run(transactions)
model.freqItemsets.collect().foreach { itemset =>
  println(itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)
}
val minConfidence = 0.8
model.generateAssociationRules(minConfidence).collect().foreach { rule =>
  println(
    rule.antecedent.mkString("[", ",", "]")
      + " => " + rule.consequent.mkString("[", ",", "]")
      + ", " + rule.confidence)
}
Hmm, it's not elegant, but this is what I do (fpgrowth_model and get_rules are helper functions, not part of MLlib):
val freqs = fpgrowth_model(transactions, min_supp=supp)
val supps = freqs.withColumn("support", $"freq" / total_transactions)
val rules = get_rules(transactions, min_supp=supp, min_confidence=conf)
val cross_df = supps.join(rules, $"items" === $"consequent")
.withColumn("lift",$"confidence" / $"support")
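The join above attaches the support of each rule's consequent to the rule, so lift is computed as confidence divided by the support of the consequent. The arithmetic can be checked on a tiny hand-made data set. This is a sketch with plain Scala collections, not the MLlib API. The transactions and the names in LiftDemo are invented for illustration.

```scala
object LiftDemo {
  type Transaction = Set[String]

  // Made-up transactions, only for checking the arithmetic.
  val transactions: Seq[Transaction] = Seq(
    Set("a", "b"),
    Set("a", "b", "c"),
    Set("a"),
    Set("b", "c")
  )

  // support(X): fraction of transactions that contain every item of X.
  def support(ts: Seq[Transaction], items: Set[String]): Double =
    ts.count(t => items.subsetOf(t)).toDouble / ts.size

  // confidence(X => Y) = support(X union Y) / support(X)
  def confidence(ts: Seq[Transaction], antecedent: Set[String], consequent: Set[String]): Double =
    support(ts, antecedent ++ consequent) / support(ts, antecedent)

  // lift(X => Y) = confidence(X => Y) / support(Y)
  def lift(ts: Seq[Transaction], antecedent: Set[String], consequent: Set[String]): Double =
    confidence(ts, antecedent, consequent) / support(ts, consequent)

  def main(args: Array[String]): Unit = {
    val (x, y) = (Set("a"), Set("b"))
    println("support(a,b) = " + support(transactions, x ++ y))
    println("confidence(a => b) = " + confidence(transactions, x, y))
    println("lift(a => b) = " + lift(transactions, x, y))
  }
}
```

Here support(a) and support(b) are both 3/4 and support(a,b) is 2/4, so confidence(a => b) is 2/3 and the lift is 8/9, slightly below 1.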

SignatureDoesNotMatch Aws CloudSearch scala

I keep getting:
"#SignatureDoesNotMatch","error":{"message":"[Deprecated: Use the
outer message field] The request signature we calculated does not
match the signature you provided. Check your AWS Secret Access Key and
signing method. Consult the service documentation for details.
when trying to do a GET request to CloudSearch. I have verified that my canonical string and string-to-sign match the ones sent back in the error message every time, but I keep getting the error. I'm assuming my signature itself isn't being computed correctly, but it is hard to pin down.
def getHash(key: Array[Byte]): String = {
  try {
    val md = MessageDigest.getInstance("SHA-256").digest(key)
    md.map("%02x".format(_)).mkString.toLowerCase()
  } catch {
    case e: Exception => ""
  }
}
.
def HmacSHA256(data: String, key: Array[Byte]): Array[Byte] = {
  val algorithm = "HmacSHA256"
  val mac = Mac.getInstance(algorithm)
  mac.init(new SecretKeySpec(key, algorithm))
  mac.doFinal(data.getBytes("UTF8"))
}
.
...
val algorithm = "AWS4-HMAC-SHA256"
val credential_scope = date + "/us-west-1/cloudsearch/aws4_request"
val string_to_sign = algorithm + "\n" + dateTime + "\n" + credential_scope + "\n" + getHash(canonical_request)
val kSecret = ("AWS4" + config.getString("cloud.secret")).getBytes("utf-8")
val kDate = HmacSHA256(date.toString, kSecret)
val kRegion = HmacSHA256("us-west-1",kDate)
val kService = HmacSHA256("cloudsearch",kRegion)
val kSigning = HmacSHA256("aws4_request",kService)
val signing_key = kSigning
val signature = getHash(HmacSHA256(string_to_sign, kSigning))
val authorization_header = algorithm + " " + "Credential=" + config.getString("cloud.key") + "/" + credential_scope + ", " + "SignedHeaders=" + signed_headers + ", " + "Signature=" + signature
val complexHolder = holder.withHeaders(("x-amz-date",dateTime.toString))
.withHeaders(("Authorization",authorization_header))
.withRequestTimeout(5000)
.get()
val response = Await.result(complexHolder, 10 second)
I just released a helper library to sign your HTTP requests to AWS: https://github.com/ticofab/aws-request-signer . Hope it helps!
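One detail that may be worth checking in the question's code: in Signature Version 4 the final signature is the hex encoding of the HMAC-SHA256 output, whereas getHash applies SHA-256 to its input before hex-encoding it. Below is a minimal sketch of the key derivation and the final step, using only the JDK. The object name SigV4Sketch and all input values in main are placeholders, not real credentials or a real request.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

object SigV4Sketch {
  def hmacSha256(data: String, key: Array[Byte]): Array[Byte] = {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(new SecretKeySpec(key, "HmacSHA256"))
    mac.doFinal(data.getBytes(UTF_8))
  }

  // Lowercase hex encoding only; no additional hashing.
  def toHex(bytes: Array[Byte]): String =
    bytes.map("%02x".format(_)).mkString

  // Derives the signing key: date -> region -> service -> "aws4_request".
  def signingKey(secret: String, date: String, region: String, service: String): Array[Byte] = {
    val kDate = hmacSha256(date, ("AWS4" + secret).getBytes(UTF_8))
    val kRegion = hmacSha256(region, kDate)
    val kService = hmacSha256(service, kRegion)
    hmacSha256("aws4_request", kService)
  }

  // The signature is the hex-encoded HMAC of the string-to-sign.
  def signature(stringToSign: String, key: Array[Byte]): String =
    toHex(hmacSha256(stringToSign, key))

  def main(args: Array[String]): Unit = {
    // Placeholder inputs, assumptions for illustration only.
    val key = signingKey("EXAMPLE-SECRET", "20150101", "us-west-1", "cloudsearch")
    println(signature("placeholder string-to-sign", key))
  }
}
```

In terms of the question's helpers, this corresponds to hex-encoding HmacSHA256(string_to_sign, kSigning) directly rather than passing it through getHash.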

error: value saveAsTextFile is not a member of Unit

I am relatively new to Spark and scala programming.
I was trying to execute the simple pagerank algorithm using scala. But I encountered this error when compiling.
error: value saveAsTextFile is not a member of Unit
I have attached the code I am using.
val output = ranks.collect()
output.foreach(tup => println(tup._1 + " has page rank: " + tup._2)).saveAsTextFile("/user/ssimhadr/ScalaWordCount_Output")
foreach is, as Ryan pointed out, solely for side effects. It returns Unit, not the collection itself. Ergo, no chaining.
Now what you are actually doing is the following:
val output = ranks.collect()
val realoutput: Unit = output.foreach(tup => println(tup._1 + " has page rank: " + tup._2))
realoutput.saveAsTextFile(...)
saveAsTextFile is not a member of Unit, hence the error message.
You should be doing:
ranks.foreach(tup => println(tup._1 + " has page rank: " + tup._2))
ranks.saveAsTextFile(...)
or
ranks.saveAsTextFile(...)
ranks.collect().foreach(tup => println(tup._1 + " has page rank: " + tup._2))
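The return types can be seen in a small sketch with plain Scala collections. It does not use Spark. The sample data and the helper formatRank are made up for illustration.

```scala
object ForeachReturnsUnitDemo {
  // Made-up page ranks standing in for the contents of the RDD.
  val ranks: Seq[(String, Double)] = Seq("a" -> 1.5, "b" -> 0.5)

  // Hypothetical helper producing the line printed in the question.
  def formatRank(tup: (String, Double)): String =
    tup._1 + " has page rank: " + tup._2

  def main(args: Array[String]): Unit = {
    // foreach runs the side effect and returns Unit, so nothing can be chained after it.
    val printed: Unit = ranks.foreach(tup => println(formatRank(tup)))

    // map returns a new collection, which can be passed on or chained.
    val lines: Seq[String] = ranks.map(formatRank)
    println(lines.size + " lines formatted")
  }
}
```

If the goal is to save the formatted lines rather than the raw tuples, the same split should apply to the RDD: transform with map first, then call saveAsTextFile on the result.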

"foreach is not a member of object" when I'm trying to iterate over enumeration [duplicate]

I'm trying to learn some Scala by reading Programming Scala, by Dean Wampler.
I'm trying to replicate a code snippet about Enumeration:
object Breed extends Enumeration {
  val doberman = Value("Doberman Pinscher")
  val yorkie = Value("Yorkshire Terrier")
  val scottie = Value("Scottish Terrier")
  val dane = Value("Great Dane")
  val portie = Value("Portuguese Water Dog")
}
for (breed <- Breed) println(breed.id + "\t" + breed)
But, in the last line of code, I got this error:
value foreach is not a member of object Breed
Am I missing something? How can I solve this?
You need to use .values:
for (breed <- Breed.values) println(breed.id + "\t" + breed)
And why not make it a bit more Scala-y:
Breed.values.foreach(breed => println(breed.id + "\t" + breed))