I'm having an issue with a Future[List] inside a recursion.
When I implemented this method without Futures, I used a ListBuffer and kept adding items to it:
val filtered = ListBuffer.empty[PostMD]
filtered ++= postMd.filter(_.fromID == userID)
Now I'm trying to implement it with Futures, but I can't find a similar solution.
What is the best way to work with a Future[List]?
def getData(url: String, userID: String) = {
val filtered: (List[PostMD]) => Future[List[PostMD]] = Future[List[PostMD]]
def inner(url: String): Unit = {
val chunk: Future[JsValue] = BusinessLogic.Methods.getJsonValue(url)
val postMd: Future[List[PostMD]] = for {
x <- chunk.map(_.\("data").as[List[JsValue]])
y <- x.map(_.\("data").as[PostMD])
} yield y
filtered = postMd.map(_.filter(_.fromID == userID)) // <- returned Future[List[PostMD]]
val next: String = (chunk.map(_.\("paging").\("next"))).toString
if (next != null) inner(next)
}
inner(url)
filtered
}
thanks,
miki
I tried to do what you want with random number generation.
import scala.concurrent.{Await, Future}
import scala.util.Random
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
val RANDOM = new Random()
def futureRec(num: Int, f: Future[List[Int]]): Future[List[Int]] = {
  if (num == 0) {
    f
  } else {
    // prepend one random number, then recurse on the new future
    f.flatMap { l =>
      futureRec(num - 1, Future.successful(RANDOM.nextInt() :: l))
    }
  }
}
val futureResult = futureRec(5, Future.successful(Nil))
Await.result(futureResult, 5.minutes)
So I would do what you want with something like this:
def getData(url: String, userID: String): Future[List[PostMD]] = {
  def inner(url: String, f: Future[List[PostMD]]): Future[List[PostMD]] = {
    val chunk: Future[JsValue] = ???
    chunk.flatMap { ch =>
      val postMd = (ch \ "data").\\("data").map(_.as[PostMD]).toList
      val relatedPostMd = postMd.filter(_.fromID == userID)
      val acc = f.map(l => l ++ relatedPostMd)
      // asOpt gives None on the last page; as[String] would throw rather than return null
      (ch \ "paging" \ "next").asOpt[String] match {
        case Some(next) => inner(next, acc)
        case None => acc
      }
    }
  }
  inner(url, Future.successful(Nil))
}
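To make the recursion pattern easier to try out, here is a minimal self-contained sketch that does not depend on Play JSON. Page, pages and fetch are hypothetical stand-ins for the real HTTP call and JSON parsing, and the accumulator is passed as a plain List rather than a Future[List], which is a small variation on the code above and not the only way to do it.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object PagingSketch {
  // Hypothetical page type: some items plus an optional link to the next page.
  final case class Page(items: List[Int], next: Option[String])

  // Hypothetical in-memory "server" standing in for the remote API.
  private val pages = Map(
    "p1" -> Page(List(1, 2, 3), Some("p2")),
    "p2" -> Page(List(4, 5, 6), Some("p3")),
    "p3" -> Page(List(7, 8), None)
  )

  def fetch(url: String): Future[Page] = Future(pages(url))

  // Follow the `next` links, keeping only the items that satisfy `keep`.
  def collect(url: String, keep: Int => Boolean): Future[List[Int]] = {
    def inner(url: String, acc: List[Int]): Future[List[Int]] =
      fetch(url).flatMap { page =>
        val updated = acc ++ page.items.filter(keep)
        page.next match {
          case Some(n) => inner(n, updated)              // more pages: recurse
          case None    => Future.successful(updated)     // last page: done
        }
      }
    inner(url, Nil)
  }

  // Blocks only for demonstration purposes.
  def run(): List[Int] = Await.result(collect("p1", _ % 2 == 0), 5.seconds)
}
```

Because each recursive call happens inside flatMap, the next page is requested only after the previous one has arrived, and nothing blocks until the final Await.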
I am not able to return Future[List[DiagnosisCode]] from fetchDiagnosisForUniqueCodes
import scala.concurrent._
import ExecutionContext.Implicits.global
case class DiagnosisCode(rootCode: String, uniqueCode: String, description: Option[String] = None)
object Database {
private val data: List[DiagnosisCode] = List(
DiagnosisCode("A00", "A001", Some("Cholera due to Vibrio cholerae")),
DiagnosisCode("A00", "A009", Some("Cholera, unspecified")),
DiagnosisCode("A08", "A080", Some("Rotaviral enteritis")),
DiagnosisCode("A08", "A083", Some("Other viral enteritis"))
)
def getAllUniqueCodes: Future[List[String]] = Future {
Database.data.map(_.uniqueCode)
}
def fetchDiagnosisForUniqueCode(uniqueCode: String): Future[Option[DiagnosisCode]] = Future {
Database.data.find(_.uniqueCode.equalsIgnoreCase(uniqueCode))
}
}
getAllUniqueCodes returns all unique codes from the data list.
fetchDiagnosisForUniqueCode returns the DiagnosisCode whose uniqueCode matches.
From fetchDiagnosisForUniqueCodes, I would like to return a Future[List[DiagnosisCode]] using getAllUniqueCodes() and fetchDiagnosisForUniqueCode(uniqueCode).
def fetchDiagnosisForUniqueCodes: Future[List[DiagnosisCode]] = {
val xa: Future[List[Future[DiagnosisCode]]] = Database.getAllUniqueCodes.map { (xs:
List[String]) =>
xs.map { (uq: String) =>
Database.fetchDiagnosisForUniqueCode(uq)
}
}.map(n =>
n.map(y=>
y.map(_.head))) // Future[List[Future[DiagnosisCode]]]
}
If I understood your post correctly, your question is: "How can I convert a Future[List[Future[DiagnosisCode]]] into a Future[List[DiagnosisCode]]?"
The answer to that question is to use Future.sequence:
// assuming an implicit ExecutionContext is in scope:
val xa: Future[List[Future[DiagnosisCode]]] = // ... your code here
val flattened: Future[List[DiagnosisCode]] =
xa.flatMap { listOfFutures =>
Future.sequence(listOfFutures)
}
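Applied to the code in the question, this could look like the sketch below. Note that fetchDiagnosisForUniqueCode in the question returns a Future[Option[DiagnosisCode]], so the sketch also flattens the Options; dropping codes that are not found is an assumption about the desired behaviour. The data is a shortened copy of the question's list.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequenceSketch {
  final case class DiagnosisCode(rootCode: String, uniqueCode: String, description: Option[String] = None)

  private val data = List(
    DiagnosisCode("A00", "A001", Some("Cholera due to Vibrio cholerae")),
    DiagnosisCode("A08", "A080", Some("Rotaviral enteritis"))
  )

  def getAllUniqueCodes: Future[List[String]] = Future(data.map(_.uniqueCode))

  def fetchDiagnosisForUniqueCode(code: String): Future[Option[DiagnosisCode]] =
    Future(data.find(_.uniqueCode.equalsIgnoreCase(code)))

  def fetchDiagnosisForUniqueCodes: Future[List[DiagnosisCode]] =
    getAllUniqueCodes.flatMap { codes =>
      // List[Future[Option[DiagnosisCode]]] => Future[List[Option[DiagnosisCode]]]
      Future.sequence(codes.map(fetchDiagnosisForUniqueCode))
        .map(_.flatten) // assumption: silently drop codes that were not found
    }

  // Blocks only for demonstration purposes.
  def run(): List[String] =
    Await.result(fetchDiagnosisForUniqueCodes, 5.seconds).map(_.uniqueCode)
}
```

Future.sequence keeps the order of the input list, so the result lines up with the codes returned by getAllUniqueCodes.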
first method
I want to use accumulators to count the "NULL" strings in different columns, so I wrote the Spark code below (simplified). When I add some print statements to appData's map operation, I can see the standard output in the Spark web UI, and the accumulator values increase. But when I read the final values in the driver, the accumulators are always zero. I'd appreciate any help.
val mapAC = collection.mutable.Map[String, LongAccumulator]()
for (ei <- eventList) {
val idNullCN = sc.longAccumulator(ei + "_idNullCN")
mapAC.put(ei + "_idNullCN", idNullCN)
val packNullCN = sc.longAccumulator(ei + "_packNullCN")
mapAC.put(ei + "_packNullCN", packNullCN)
val positionNullCN = sc.longAccumulator(ei + "_positionNullCN")
mapAC.put(ei + "_positionNullCN", positionNullCN)
}
val mapBC = sc.broadcast(mapAC)
val res = appData.map(d => {
val ei = d.eventId
val map = mapBC.value
if (d.id.toUpperCase == "NULL") map(ei + "_idNullCN").add(1)
if (d.pack.toUpperCase == "NULL") map(ei + "_packNullCN").add(1)
if (d.position.toUpperCase == "NULL") map(ei + "_positionNullCN").add(1)
ei
})
res.count()
mapBC.value.foreach(ac=>{
println(ac._1 + ": " + ac._2.value)
})
second method
I've tried another way to calculate the values by creating a map accumulator like this:
import java.util
import java.util.Collections
import org.apache.spark.util.AccumulatorV2
import scala.collection.JavaConversions._
class CountMapAccumulator extends AccumulatorV2[String, java.util.Map[String, Long]] {
private val _map = Collections.synchronizedMap(new util.HashMap[String, Long]())
override def isZero: Boolean = _map.isEmpty
override def copy(): CountMapAccumulator = {
val newAcc = new CountMapAccumulator
_map.synchronized {
newAcc._map.putAll(_map)
}
newAcc
}
override def reset(): Unit = _map.clear()
override def add(key: String): Unit = _map.synchronized{_map.put(key, _map.get(key) + 1L)}
override def merge(other: AccumulatorV2[String, java.util.Map[String, Long]]): Unit = other match {
case o: CountMapAccumulator => for ((k, v) <- o.value) {
val oldValue = _map.put(k, v)
if (oldValue != null) {
_map.put(k, oldValue.longValue() + v)
}
// println("merge key: "+k+" old val: "+oldValue+" new Value: "+v+" current val: "+_map.get(k))
}
case _ => throw new UnsupportedOperationException(
s"Cannot merge ${this.getClass.getName} with ${other.getClass.getName}")
}
override def value: util.Map[String, Long] = _map.synchronized {
java.util.Collections.unmodifiableMap(new util.HashMap[String, Long](_map))
}
def setValue(value: Map[String, Long]): Unit = {
val newValue = mapAsJavaMap(value)
_map.clear()
_map.putAll(newValue)
}
}
Then I invoke it as follows:
val tmpMap = collection.mutable.Map[String, Long]()
for (ei <- eventList) {
tmpMap.put(ei + "_idNullCN", 0L)
tmpMap.put(ei + "_packNullCN", 0L)
tmpMap.put(ei + "_positionNullCN", 0L)
}
val accumulator = new CountMapAccumulator
accumulator.setValue(collection.immutable.Map[String,Long](tmpMap.toSeq:_*))
sc.register(accumulator, "CustomAccumulator")
val res = appData.map(d => {
val ei = d.eventId
if (d.id.toUpperCase == "NULL") accumulator.add(ei + "_idNullCN")
if (d.pack.toUpperCase == "NULL") accumulator.add(ei + "_packNullCN")
if (d.position.toUpperCase == "NULL") accumulator.add(ei + "_positionNullCN")
if (d.modulePos.toUpperCase == "NULL") accumulator.add(ei + "_modulePosNullCN")
ei
})
res.count()
accumulator.value.foreach(println)
But the accumulator value is still zero.
second method correct
Since the program ended without failing, I had not checked the log. When I did take a look, I found this error:
java.lang.UnsupportedOperationException: Cannot merge $line105198665522.$read$$iw$$iw$CountMapAccumulator with $line105198665522.$read$$iw$$iw$CountMapAccumulator
So I changed the pattern matching in the merge method like this:
override def merge(other: AccumulatorV2[String, java.util.Map[String, Long]]): Unit = other match {
case o: AccumulatorV2[String, java.util.Map[String, Long]] => for ((k, v) <- o.value) {
val oldValue: java.lang.Long = _map.get(k)
if (oldValue != null) {
_map.put(k, oldValue.longValue() + v)
} else {
_map.put(k, v)
}
println(s"key: ${k} oldValue: ${oldValue} newValue: ${v} finalValue: ${_map.get(k)}")
}
case _ => throw new UnsupportedOperationException(
s"Cannot merge ${this.getClass.getName} with ${other.getClass.getName}")
}
After changing o's type it finally works, but I'm still confused about why the first method behaves the way it does.
Your custom accumulator has a mistake in its merge function. Here is the correct version:
val oldValue: java.lang.Long = _map.get(k)
if (oldValue != null) {
_map.put(k, oldValue.longValue() + v)
} else {
_map.put(k, v)
}
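The read-then-write merge can be tried outside Spark. The following is only a sketch of that logic on plain java.util maps; mergeCounts and the sample keys are hypothetical names for illustration and are not part of the Spark API.

```scala
import java.util.{HashMap => JHashMap, Map => JMap}

object MergeSketch {
  // Add every count from `from` into `into`, summing the counts of keys present in both.
  def mergeCounts(into: JMap[String, java.lang.Long], from: JMap[String, java.lang.Long]): Unit = {
    val it = from.entrySet().iterator()
    while (it.hasNext) {
      val e = it.next()
      val old = into.get(e.getKey) // null when the key is not present yet
      val sum: Long =
        if (old != null) old.longValue() + e.getValue.longValue()
        else e.getValue.longValue()
      into.put(e.getKey, java.lang.Long.valueOf(sum))
    }
  }

  // Merge two small sample maps and return the result.
  def run(): JMap[String, java.lang.Long] = {
    val a = new JHashMap[String, java.lang.Long]()
    a.put("e1_idNullCN", java.lang.Long.valueOf(2L))
    val b = new JHashMap[String, java.lang.Long]()
    b.put("e1_idNullCN", java.lang.Long.valueOf(3L))
    b.put("e1_packNullCN", java.lang.Long.valueOf(1L))
    mergeCounts(a, b)
    a
  }
}
```

Typing the values as java.lang.Long makes the null check explicit, which is what the corrected merge relies on.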
import java.lang.reflect.Field
case class Keyword(id: Int = 0, words: String)
val my = Keyword(123, "hello")
val fields: Array[Field] = my.getClass.getDeclaredFields
for (i <- fields.indices) {
  println(fields(i).getName + ":" + my.productElement(i))
}
id:123
words:hello
This works.
def outputCaseClass[A](obj:A){
val fields: Array[Field] = obj.getClass.getDeclaredFields
for (i <- fields.indices) {
println(fields(i).getName +":"+ obj.productElement(i))
}
}
outputCaseClass(my)
This does not work.
import scala.reflect.runtime.{universe => ru}
def printCaseClassParams[C: scala.reflect.ClassTag](instance: C):Unit = {
val runtimeMirror = ru.runtimeMirror(instance.getClass.getClassLoader)
val instanceMirror = runtimeMirror.reflect(instance)
val tpe = instanceMirror.symbol.toType
tpe.members
.filter(member => member.asTerm.isCaseAccessor && member.asTerm.isMethod)
.map(member => {
val term = member.asTerm
val termName = term.name.toString
val termValue = instanceMirror.reflectField(term).get
termName + ":" + termValue
})
.toList
.reverse
.foreach(s => println(s))
}
// Now you can use it with any case classes,
case class Keyword(id: Int = 0, words: String)
val my = Keyword(123, "hello")
printCaseClassParams(my)
// id:123
// words:hello
productElement is a method of the Product base trait.
Try to use a method signature like this:
def outputCaseClass[A <: Product](obj:A){ .. }
However it still won't work for inner case classes (fields also reports the $outer-Field, which productElement won't return and so it crashes with IndexOutOfBoundsException).
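If Scala 2.13 or later is available (an assumption; the question does not state a version), Product also provides productElementNames, which avoids getDeclaredFields and therefore the $outer problem. A minimal sketch, with describe as a hypothetical helper name:

```scala
object CaseClassSketch {
  final case class Keyword(id: Int = 0, words: String)

  // Pair each field name with its value.
  // Requires Scala 2.13+, where Product.productElementNames was added.
  def describe[A <: Product](obj: A): List[String] =
    obj.productElementNames
      .zip(obj.productIterator)
      .map { case (name, value) => s"$name:$value" }
      .toList

  // Print one "name:value" line per field.
  def outputCaseClass[A <: Product](obj: A): Unit =
    describe(obj).foreach(println)
}
```

Calling CaseClassSketch.outputCaseClass(CaseClassSketch.Keyword(123, "hello")) prints one line per constructor parameter, in declaration order.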
I don't know if it is possible, but inside my mapPartitions I'd like to split the variable "a" into two lists: a list l that stores all the numbers and another list, say b, that stores all the words, with something like a.mapPartitions((p,v) =>{ val l = p.toList; val b = v.toList; ....}
For example, in my for loop I would have l(i) = 1 and b(i) = "score".
import scala.io.Source
import org.apache.spark.rdd.RDD
import scala.collection.mutable.ListBuffer
val a = sc.parallelize(List(("score",1),("chicken",2),("magnacarta",2)) )
a.mapPartitions(p =>{val l = p.toList;
val ret = new ListBuffer[Int]
val words = new ListBuffer[String]
for(i<-0 to l.length-1){
words+= b(i)
ret += l(i)
}
ret.toList.iterator
}
)
Spark is a distributed computing engine. You can perform operations on data that is partitioned across the nodes of the cluster, and then you need a reduce() step that combines the per-partition results.
Please see this code that should do what you want:
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
object SimpleApp {
class MyResponseObj(var numbers: List[Int] = List[Int](), var words: List[String] = List[String]()) extends java.io.Serializable{
def +=(str: String, int: Int) = {
numbers = numbers :+ int
words = words :+ str
this
}
def +=(other: MyResponseObj) = {
numbers = numbers ++ other.numbers
words = words ++ other.words
this
}
}
def main(args: Array[String]) {
val conf = new SparkConf().setAppName("Simple Application").setMaster("local[2]")
val sc = new SparkContext(conf)
val a = sc.parallelize(List(("score", 1), ("chicken", 2), ("magnacarta", 2)))
val myResponseObj = a.mapPartitions[MyResponseObj](it => {
var myResponseObj = new MyResponseObj()
it.foreach {
case (str :String, int :Int) => myResponseObj += (str, int)
case _ => println("unexpected data")
}
Iterator(myResponseObj)
}).reduce( (myResponseObj1, myResponseObj2) => myResponseObj1 += myResponseObj2 )
println(myResponseObj.words)
println(myResponseObj.numbers)
}
}
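A shorter variation on the same idea, sketched here without Spark so that it can be run on plain iterators, is to unzip each partition into a pair of lists. splitPartition and combine are hypothetical helper names.

```scala
object SplitSketch {
  // Turn one partition's iterator of (word, number) pairs into a single element
  // holding the two lists; this is the function shape that mapPartitions expects.
  def splitPartition(p: Iterator[(String, Int)]): Iterator[(List[String], List[Int])] = {
    val (words, numbers) = p.toList.unzip
    Iterator((words, numbers))
  }

  // Combine the results of two partitions by concatenating both lists.
  def combine(x: (List[String], List[Int]), y: (List[String], List[Int])): (List[String], List[Int]) =
    (x._1 ++ y._1, x._2 ++ y._2)
}
```

With the RDD from the question this would be used as a.mapPartitions(SplitSketch.splitPartition).reduce(SplitSketch.combine). Since reduce expects a commutative and associative function, the order in which partitions are combined should not be relied on.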
I want to be able to do something like this:
Prepare the form:
val formDescription = formBuilder(_.textField[User](_.firstName)
.textField[User](_.lastName)
).build
showForm(formDescription)
Extract data from the form filled in by the user, using User:
//contains data of a form submitted by a user:
val formData: Map[String, String] = getFormData
val newUser = User(id = randomUuid, firstName = formData.extract[User](_.firstName))
One solution I see is to use a dynamic proxy that extends the provided class and remembers which method was invoked on it:
def getFieldName[T:Manifest](foo: T => Any) = {
val clazz = implicitly[Manifest[T]].erasure
val proxy = createDynamicProxy(clazz)
foo(proxy)
proxy.lastInvokedMethodName
}
Is there a better way to do it? Is there any lib that implements it already?
This reflective approach takes a case class and invokes its companion apply, calling getField and fetching default args if the field is not in the data.
import scala.reflect.runtime.{currentMirror => cm, universe => uni}
import uni._
def fromXML(xml: Node): Option[PluginDescription] = {
def extract[A]()(implicit tt: TypeTag[A]): Option[A] = {
// extract one field
def getField(field: String): Option[String] = {
val text = (xml \\ field).text.trim
if (text == "") None else Some(text)
}
val apply = uni.newTermName("apply")
val module = uni.typeOf[A].typeSymbol.companionSymbol.asModule
val ts = module.moduleClass.typeSignature
val m = (ts member apply).asMethod
val im = cm reflect (cm reflectModule module).instance
val mm = im reflectMethod m
def getDefault(i: Int): Option[Any] = {
val n = uni.newTermName("apply$default$" + (i+1))
val m = ts member n
if (m == NoSymbol) None
else Some((im reflectMethod m.asMethod)())
}
def extractArgs(pss: List[List[Symbol]]): List[Option[Any]] =
pss.flatten.zipWithIndex map (p => getField(p._1.name.encoded) orElse getDefault(p._2))
val args = extractArgs(m.paramss)
if (args exists (!_.isDefined)) None
else Some(mm(args.flatten: _*).asInstanceOf[A])
}
// check the top-level tag
xml match {
case <plugin>{_*}</plugin> => extract[PluginDescription]()
case _ => None
}
}
The idea was to do something like:
case class User(id: Int = randomUuid, firstName: String, lastName: String)
val user = extract[User]()
That's my own solution:
package utils
import javassist.util.proxy.{MethodHandler, MethodFilter, ProxyFactory}
import org.specs2.mutable._
import javassist.util.proxy.Proxy
import java.lang.reflect.{Constructor, Method}
class DynamicProxyTest extends Specification with MemberNameGetter {
"Dynamic proxy" should {
"extract field name" in {
memberName[TestClass](_.a) must ===("a")
memberName[TestClass](_.i) must ===("i")
memberName[TestClass](_.b) must ===("b")
memberName[TestClass](_.variable) must ===("variable")
memberName[TestClass](_.value) must ===("value")
memberName[TestClass](_.method) must ===("method")
}
}
}
trait MemberNameGetter {
def memberName[T: Manifest](foo: T => Any) = {
val mf = manifest[T]
val clazz = mf.erasure
val proxyFactory = new ProxyFactory
proxyFactory.setSuperclass(clazz)
proxyFactory.setFilter(new MethodFilter {
def isHandled(p1: Method) = true
})
val newClass = proxyFactory.createClass()
var lastInvokedMethod: String = null
val mh = new MethodHandler {
def invoke(p1: Any, p2: Method, p3: Method, p4: Array[AnyRef]) = {
lastInvokedMethod = p2.getName
p3.invoke(p1, p4: _*)
}
}
val constructor = defaultConstructor(newClass)
val parameters = defaultConstructorParameters(constructor)
// val proxy = constructor.newInstance("dsf", new Integer(0))
val proxy2 = constructor.newInstance(parameters: _*)
proxy2.asInstanceOf[Proxy].setHandler(mh)
foo(proxy2.asInstanceOf[T])
lastInvokedMethod
}
private def defaultConstructor(c: Class[_]) = c.getConstructors.head
private def defaultConstructorParameters(constructor: Constructor[_]) = {
val parameterTypes = constructor.getParameterTypes
// primitive parameters cannot be null, so each primitive type gets a default value
parameterTypes.map{
case Integer.TYPE => Integer.valueOf(0)
case java.lang.Double.TYPE => java.lang.Double.valueOf(0)
case java.lang.Long.TYPE => java.lang.Long.valueOf(0)
case java.lang.Boolean.TYPE => java.lang.Boolean.FALSE
case _ => null
}
}
}
case class TestClass(a: String, i: Int, b: Boolean) {
var variable = "asdf"
val value = "asdfasdfasd"
def method = "method"
}
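When the model type is a plain trait or interface, the same member-name trick can be sketched with the JDK's own java.lang.reflect.Proxy, without javassist. This is only a sketch under stated limitations: UserView is a hypothetical interface, the type must be an interface (so it does not cover case classes such as TestClass), and the accessed members must not return primitives because the handler answers every call with null.

```scala
import java.lang.reflect.{InvocationHandler, Method, Proxy}
import scala.reflect.ClassTag

object ProxySketch {
  // Hypothetical interface standing in for the form's model type.
  trait UserView {
    def firstName: String
    def lastName: String
  }

  // Returns the name of the last method that `f` calls on a recording proxy.
  def memberName[T](f: T => Any)(implicit ct: ClassTag[T]): String = {
    var lastInvoked: String = null
    val handler = new InvocationHandler {
      override def invoke(proxy: AnyRef, method: Method, args: Array[AnyRef]): AnyRef = {
        lastInvoked = method.getName // remember which member was accessed
        null                         // safe only for reference return types
      }
    }
    val cls = ct.runtimeClass
    val proxy = Proxy.newProxyInstance(cls.getClassLoader, Array[Class[_]](cls), handler)
    f(proxy.asInstanceOf[T])
    lastInvoked
  }
}
```

For example, ProxySketch.memberName[ProxySketch.UserView](_.lastName) records the call to lastName on the proxy and returns its name.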