I am new to Scala and I am trying to extract a few columns from each row. Below is my code:
case class Extract(e1:String,e2:String,e3:String){
override def toString = e1+","+e2+","+e3
}
object ScalaSpark {
def main(args: Array[String])
{
val textfile = sc.textFile("/user/cloudera/xxxx/File2")
val word = textfile.filter(x => x.length > 0).map(_.split("\\|"))
val pid = word.filter(_.contains("SSS"))
val pidkeys = pid.map(tuple => Extract(tuple(0),tuple(3),tuple(7)))
val obx = word.filter(_.contains("HHH"))
val obxkeys = obx.map(tuple => Extract(tuple(0),tuple(5)))
val rddall = pidkeys.unionAll(obxkeys)
rddall.saveAsTextFile("/user/xxxx/xxxx/rddsum1")
}
}
With this code I am trying to extract 3 values from rows containing SSS and 2 values from rows containing HHH, but when I run it I get the error below:
error: not enough arguments for method apply: (e1: String, e2: String, e3: String)Extract in object Extract.
I then tried using Opt[String] = None, but that didn't work either. I don't know how to solve this problem; please help.
EDIT:
I used Option[String] and my code is written below
case class Extract(e1:String,e2:String,e3:Option[String]){
override def toString = e1+","+e2+","+e3
}
object ScalaSpark {
def main(args: Array[String])
{
val textfile = sc.textFile("/user/cloudera/xxxx/File2")
val word = textfile.filter(x => x.length > 0).map(_.split('|'))
val pid = word.filter(_.contains("SSS"))
val pidkeys = pid.map(tuple => Extract(tuple(0),tuple(5),tuple(8)))
val obx = word.filter(_.contains("HHH"))
val obxkeys = obx.map(tuple => Extract(tuple(0),tuple(5), None))
val rddall = pidkeys.union(obxkeys)
rddall.coalesce(1).saveAsTextFile("/user/xxx/xxx/rddsum1")
}
}
but I am getting the error below:
error: type mismatch;
found : String
required: Option[String]
val pidkeys = pid.map(tuple => Header(tuple(0),tuple(5),tuple(8)))
^
<console>:38: error: type mismatch;
found : org.apache.spark.rdd.RDD[Extract]
required: org.apache.spark.rdd.RDD[Nothing]
Note: Header >: Nothing, but class RDD is invariant in type T.
You may wish to define T as -T instead. (SLS 4.5)
val rddall = pidkeys.union(obxkeys)
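As far as I understand, your Extract case class has 3 parameters, the last of which is optional.
If so, you should declare it this way:
case class Extract(s1: String, s2: String, s3: Option[String])
and construct it with either Extract("some string", "other string", Some("optional string")) or Extract("some string", "other string", None). In your code that means wrapping the third argument, e.g. Extract(tuple(0), tuple(5), Some(tuple(8))): the first error is reported because tuple(8) is a plain String where an Option[String] is required. The RDD[Nothing] error on the union line is most likely a follow-on of that first error and should disappear once pidkeys type-checks.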
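To make the Some/None usage concrete, here is a minimal sketch on plain arrays rather than an RDD, so it runs without Spark. The helper names fromSss and fromHhh, the sample rows, and the field indices (0, 3, 7 and 0, 5, taken from the first version of the question) are assumptions for illustration only. It also overrides toString with getOrElse, because the original e1+","+e2+","+e3 would print the third field as Some(...) or None.

```scala
// Stand-in for the case class in the question; the third field is optional.
final case class Extract(e1: String, e2: String, e3: Option[String]) {
  // Render a missing third field as an empty string instead of "None".
  override def toString: String = Seq(e1, e2, e3.getOrElse("")).mkString(",")
}

object ExtractSketch {
  // Rows containing "SSS" carry three values: wrap the third one in Some.
  // (Hypothetical helper; indices follow the first version of the question.)
  def fromSss(fields: Array[String]): Extract =
    Extract(fields(0), fields(3), Some(fields(7)))

  // Rows containing "HHH" carry only two values: pass None for the third.
  def fromHhh(fields: Array[String]): Extract =
    Extract(fields(0), fields(5), None)

  def main(args: Array[String]): Unit = {
    // Made-up sample rows, split on a literal pipe as in the question.
    val sss = "SSS|a|b|c|d|e|f|g".split("\\|")
    val hhh = "HHH|a|b|c|d|e".split("\\|")
    println(fromSss(sss)) // SSS,c,g
    println(fromHhh(hhh)) // HHH,e,
  }
}
```

Both helpers return the same type, Extract, so in the Spark version the two mapped RDDs should have the same element type and can be combined with union.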
My scala code goes like this.
def getFileName(fileName: String, postfixList: List[String]): String = {
if (...) {
return fileName + postfixList(0)
}
return fileName + postfixList(1)
}
val getFileNameUdf = udf(getFileName(_: String, _: List[String]): String)
def main(args: Array[String]): Unit = {
val postfixList = List("a", "b")
val rawFsRecordDF = sparkSession.read.option("delimiter", "\t").schema(fsImageSchema)
.withColumn("fileName", getFileNameUdf(col("name"), lit(postfixList)))
}
I want to append a certain postfix to each value in the column.
Running this code, I get
diagnostics: User class threw exception: java.lang.RuntimeException: Unsupported literal type class scala.collection.immutable.Nil$ List()
How should I pass a list to a UDF?
Any comment or link would be appreciated!
lit is good for basic types. For parameterized types, you should use typedLit.
val rawFsRecordDF = sparkSession.read.option("delimiter", "\t").schema(fsImageSchema)
.withColumn("fileName", getFileNameUdf(col("name"), typedLit(postfixList)))
should work.
I am trying to write a macro against Scala 3.0.0-M3. I want the macro to return the inner types of the Option[_] fields of a type. For example, given:
class Professor(val lastName: String, val id: Option[Int], val bossId: Option[Long])
I want to associate id with Int, and bossId with Long.
I have some code that does that for primitive types and compiles ok:
import scala.quoted._
import scala.quoted.staging._
import scala.quoted.{Quotes, Type}
object TypeInfo {
inline def fieldsInfo[T <: AnyKind]: Map[String, Class[Any]] = ${ fieldsInfo[T] }
def fieldsInfo[T <: AnyKind: Type](using qctx0: Quotes): Expr[Map[String, Class[Any]]] = {
given qctx0.type = qctx0
import qctx0.reflect.{given, _}
val uns = TypeTree.of[T]
val symbol = uns.symbol
val innerClassOfOptionFields: Map[String, Class[Any]] = symbol.memberFields.flatMap { m =>
// we only support val fields for now
if(m.isValDef){
val tpe = ValDef(m, None).tpt.tpe
// only if the field is an Option[_]
if(tpe.typeSymbol == TypeRepr.of[Option[Any]].typeSymbol){
val containedClass: Option[Class[Any]] =
if(tpe =:= TypeRepr.of[Option[Int]]) Some(classOf[Int].asInstanceOf[Class[Any]])
else if(tpe =:= TypeRepr.of[Option[Short]]) Some(classOf[Short].asInstanceOf[Class[Any]])
else if(tpe =:= TypeRepr.of[Option[Long]]) Some(classOf[Long].asInstanceOf[Class[Any]])
else if(tpe =:= TypeRepr.of[Option[Double]]) Some(classOf[Double].asInstanceOf[Class[Any]])
else if(tpe =:= TypeRepr.of[Option[Float]]) Some(classOf[Float].asInstanceOf[Class[Any]])
else if(tpe =:= TypeRepr.of[Option[Boolean]]) Some(classOf[Boolean].asInstanceOf[Class[Any]])
else if(tpe =:= TypeRepr.of[Option[Byte]]) Some(classOf[Byte].asInstanceOf[Class[Any]])
else if(tpe =:= TypeRepr.of[Option[Char]]) Some(classOf[Char].asInstanceOf[Class[Any]])
else None
containedClass.map(clazz => (m.name -> clazz))
} else None
} else None
}.toMap
println(innerClassOfOptionFields)
Expr(innerClassOfOptionFields)
}
But if I try to use it, like this:
class Professor(val lastName: String, val id: Option[Int], val bossId: Option[Long])
object Main extends App {
val fields = TypeInfo.fieldsInfo[Professor]
}
the compiler first prints Map(id -> int, bossId -> long) (because of the println in the macro code), which looks right, but then fails with:
[error] 16 | val fields = TypeInfo.fieldsInfo[Professor]
[error] | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[error] | Found: (classOf[Int] : Class[Int])
[error] | Required: Class[Any]
[error] | This location contains code that was inlined from Main.scala:34
What am I doing wrong? Am I not supposed to be able to return a Map from a macro, or maybe not this way?
Note that the if/else logic in my macro doesn't really matter, the problem can be reduced to (everything else being equal):
val result: Map[String, Class[Any]] = Map(
"bossId" -> classOf[scala.Long].asInstanceOf[Class[Any]],
"id" -> classOf[scala.Int].asInstanceOf[Class[Any]]
)
Expr(result)
You can define this missing given, based on the one from the standard library.
import scala.quoted._
given ToExpr[Class[?]] with {
def apply(x: Class[?])(using Quotes) = {
import quotes.reflect._
Ref(defn.Predef_classOf).appliedToType(TypeRepr.typeConstructorOf(x)).asExpr.asInstanceOf[Expr[Class[?]]]
}
}
In the next release of Scala 3, this should no longer be necessary. The given instance of the standard library has been adapted to work for Class[?] too.
Then you can return a well typed Map[String, Class[?]].
inline def fieldsInfo: Map[String, Class[?]] = ${ fieldsInfoMacro }
def fieldsInfoMacro(using Quotes): Expr[Map[String, Class[?]]] = {
val result: Map[String, Class[?]] = Map(
"bossId" -> classOf[scala.Long],
"id" -> classOf[scala.Int]
)
Expr(result)
}
And everything works:
scala> fieldsInfo
val res1: Map[String, Class[?]] = Map(bossId -> long, id -> int)
In the code below, encoded is a JSON string. The JSON.parseFull() function returns an object of the form Some(Map(...)). I am using .get to extract the Map, but I cannot index it because the compiler sees it as type Any. Is there any way to tell the compiler that it is, in fact, a Map?
val parsed = JSON.parseFull(encoded)
val mapped = parsed.get
You can use collect with pattern matching to match on the type:
scala> val parsed: Option[Any] = Some(Map("1" -> List("1")))
parsed: Option[Any] = Some(Map(1 -> List(1)))
scala> val mapped = parsed.collect{
case map: Map[String, Any] => map
}
mapped: Option[Map[String,Any]] = Some(Map(1 -> List(1)))
You can do something like the following in the case of a List value to get values from the List:
scala> mapped.get.map{ case(k, List(item1)) => item1}
res0: scala.collection.immutable.Iterable[Any] = List(1)
I was able to use a combination of the get function and pattern matching similar to what was posted in Tanjin's response to get the desired result.
object ReadFHIR {
def fatal(msg: String) = throw new Exception(msg)
def main (args: Array[String]): Unit = {
val fc = new FhirContext()
val client = fc.newRestfulGenericClient("http://test.fhir.org/r2")
val bundle = client.search().forResource("Observation")
.prettyPrint()
.execute()
val jsonParser = fc.newJsonParser()
val encoded = jsonParser.encodeBundleToString(bundle)
val parsed = JSON.parseFull(encoded)
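val mapped: Map[String, Any] = parsed.get match{
case map: Map[String, Any] => map
// fail with a readable message instead of a MatchError if the top level is not an object
case other => fatal("Expected a JSON object but got: " + other)
}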
println(mapped("resourceType"))
}
}
Consider the following code:
object Foo{def foo(a:Int):List[(String, Int)] = ???}
class Bar{def bar(a:Int, b:Any):Option[(String, Long)] = ???}
Given either the object or class, I need to first find the method names (does not seem to be that difficult).
After that, for each method, I want to find a string description of the Scala return types (not the Java ones). So for instance, for Foo.foo, I would need the String List[(String, Int)] and for Bar.bar, I would need the String Option[(String, Long)].
I saw this and this tutorial but could not figure it out.
EDIT: Here is what I tried based on the comments:
class RetTypeFinder(obj:AnyRef) {
import scala.reflect.runtime.{universe => ru}
val m = ru.runtimeMirror(getClass.getClassLoader)
val im = m.reflect(obj)
def getRetType(methodName:String) = {
ru.typeOf[obj.type].declaration(ru.TermName(methodName)).asMethod.returnType
}
}
object A { def foo(a:Int):String = ??? } // define dummy object
class B { def bar(a:Int):String = ??? } // define dummy class
val a = new RetTypeFinder(A)
a.getRetType("foo") // exception here
val b = new RetTypeFinder(new B)
b.getRetType("bar") // exception here
The error I get is:
scala.ScalaReflectionException: <none> is not a method
at scala.reflect.api.Symbols$SymbolApi$class.asMethod(Symbols.scala:228)
at scala.reflect.internal.Symbols$SymbolContextApiImpl.asMethod(Symbols.scala:84)
at cs.reflect.Test.getRetCls(Test.scala:11)
...
However, this works (tried in REPL):
import scala.reflect.runtime.{universe => ru}
val m = ru.runtimeMirror(getClass.getClassLoader)
object A { def foo(a:Int):String = ??? } // define dummy object
val im = m.reflect(A)
ru.typeOf[A.type].declaration(ru.TermName("foo")).asMethod.returnType
class B { def bar(a:Int):String = ??? } // define dummy class
val im = m.reflect(new B)
ru.typeOf[B].declaration(ru.TermName("bar")).asMethod.returnType
I need to use it in the first way, where I don't know in advance what objects/classes will be passed. Any help will be appreciated.
Once you have a universe.Type you can use the way from the comments to get the return type of one of its methods:
import scala.reflect.runtime.{universe => ru}
def getRetTypeOfMethod(tpe: ru.Type)(methodName: String) =
tpe.member(ru.TermName(methodName)).asMethod.returnType
The easiest way to get a universe.Type is to capture it with an implicit TypeTag:
class RetTypeFinder[T <: AnyRef](obj: T)(implicit tag: ru.TypeTag[T]) {
def getRetType(methodName: String) = {
val tpe = tag.tpe
getRetTypeOfMethod(tpe)(methodName)
}
}
But if you don't have a TypeTag, but just an object of type AnyRef, you can go through a mirror to reflect it. The resulting Type will have some information lost due to Java's type erasure, but it would still be enough to get the return type of a method by name, because that's supported by JVM reflection:
class RetTypeFinder2(obj: AnyRef) {
def getRetType(methodName: String) = {
val mirror = ru.runtimeMirror(getClass.getClassLoader)
val tpe = mirror.reflect(obj).symbol.info
getRetTypeOfMethod(tpe)(methodName)
}
}
Both methods work fine for your problem:
scala> new RetTypeFinder(A).getRetType("foo")
res0: reflect.runtime.universe.Type = String
scala> new RetTypeFinder2(A).getRetType("foo")
res1: reflect.runtime.universe.Type = String
scala> new RetTypeFinder(new B).getRetType("bar")
res2: reflect.runtime.universe.Type = String
scala> new RetTypeFinder2(new B).getRetType("bar")
res3: reflect.runtime.universe.Type = String
Assume I have an instance of MethodMirror created for a certain method of an object. Through the mirror's fields I can easily access the return type and parameters of the method. But I actually need to obtain the type this method would have as a function.
Here is a toy code example that will help me explain what I want to achieve. I'm using Scala 2.11.6.
import scala.reflect.runtime.universe._
object ForStackOverflow {
object Obj {
def method(x:String, y:String):Int = 0
def expectedRetType():((String, String) => Int) = ???
}
def main(args: Array[String]) {
val mirror:Mirror = runtimeMirror(getClass.getClassLoader)
val instanceMirror = mirror.reflect(Obj)
val methodSymbol:MethodSymbol = instanceMirror.symbol.toType.decl(TermName("method")).asMethod
val methodMirror = instanceMirror.reflectMethod(methodSymbol)
println(methodMirror.symbol.returnType)
println(methodMirror.symbol.paramLists(0).map { x => x.info.resultType }.mkString(", "))
val expectedSymbol:MethodSymbol = instanceMirror.symbol.toType.decl(TermName("expectedRetType")).asMethod
println("I would like to produce from a 'methodMirror' this: "+expectedSymbol.returnType)
}
}
I want to produce a Type instance from the methodMirror which would represent a function. For this example it should be (String, String) => Int. I would prefer a solution that doesn't depend too much on Scala's concrete FunctionX classes.
The method getEtaExpandedMethodType below does what you asked, and even handles methods with multiple parameter lists.
On the other hand, it does not handle generic methods. For example, def method[T](x: T) = 123, when eta-expanded, creates a function of type Any => Int, but getEtaExpandedMethodType will report T => Int, which is not only incorrect but does not make sense at all (T has no meaning in this context).
def getEtaExpandedMethodType(methodSymbol: MethodSymbol): Type = {
val typ = methodSymbol.typeSignature
def paramType(paramSymbol: Symbol): Type = {
// TODO: handle the case where paramSymbol denotes a type parameter
paramSymbol.typeSignatureIn(typ)
}
def rec(paramLists: List[List[Symbol]]): Type = {
paramLists match {
case Nil => methodSymbol.returnType
case params :: otherParams =>
val functionClassSymbol = definitions.FunctionClass(params.length)
appliedType(functionClassSymbol, params.map(paramType) :+ rec(otherParams))
}
}
if (methodSymbol.paramLists.isEmpty) { // No arg method
appliedType(definitions.FunctionClass(0), List(methodSymbol.returnType))
} else {
rec(methodSymbol.paramLists)
}
}
def getEtaExpandedMethodType(methodMirror: MethodMirror): Type = getEtaExpandedMethodType(methodMirror.symbol)
REPL test:
scala> val mirror: Mirror = runtimeMirror(getClass.getClassLoader)
mirror: reflect.runtime.universe.Mirror = ...
scala> val instanceMirror = mirror.reflect(Obj)
instanceMirror: reflect.runtime.universe.InstanceMirror = instance mirror for Obj$@21b6e507
scala> val tpe = instanceMirror.symbol.toType
tpe: reflect.runtime.universe.Type = Obj.type
scala> getEtaExpandedMethodType(tpe.decl(TermName("method1")).asMethod)
res28: reflect.runtime.universe.Type = (String, String) => scala.Int
scala> getEtaExpandedMethodType(tpe.decl(TermName("method2")).asMethod)
res29: reflect.runtime.universe.Type = () => String
scala> getEtaExpandedMethodType(tpe.decl(TermName("method3")).asMethod)
res30: reflect.runtime.universe.Type = () => scala.Long
scala> getEtaExpandedMethodType(tpe.decl(TermName("method4")).asMethod)
res31: reflect.runtime.universe.Type = String => (scala.Float => scala.Double)
scala> getEtaExpandedMethodType(tpe.decl(TermName("method5")).asMethod)
res32: reflect.runtime.universe.Type = T => scala.Int
scala> getEtaExpandedMethodType(tpe.decl(TermName("method6")).asMethod)
res33: reflect.runtime.universe.Type = T => scala.Int
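For comparison, the function types reported for the non-generic cases have the same shape as what the compiler produces when it eta-expands a method at compile time. The sketch below shows this without reflection; the object EtaExpansionSketch and the methods twoArgs and curried are hypothetical and are not the method1 to method6 used in the REPL session above.

```scala
object EtaExpansionSketch {
  // Hypothetical methods with the two shapes discussed above:
  // a single parameter list and multiple parameter lists.
  def twoArgs(x: String, y: String): Int = x.length + y.length
  def curried(s: String)(f: Float): Double = s.length * f.toDouble

  // Given an expected function type, the compiler eta-expands the method.
  // One parameter list with two parameters becomes a Function2 ...
  val asFunction2: (String, String) => Int = twoArgs
  // ... and two parameter lists become nested Function1 values.
  val asCurried: String => Float => Double = curried

  def main(args: Array[String]): Unit = {
    println(asFunction2("ab", "c")) // 3
    println(asCurried("ab")(1.5f))  // 3.0
  }
}
```

If the reflective result is meant to describe a value obtained this way, comparing its string form against annotations like these is a quick sanity check.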
Here is probably the most straightforward solution using universe.appliedType. It doesn't work in the case of multiple parameter lists. I post this to show an alternative way of solving this problem.
def getEtaExpandedMethodType2(methodSymbol: MethodSymbol): Type = {
val typesList = methodSymbol.info.paramLists(0).map(x => x.typeSignature) :+ methodSymbol.returnType
val arity = methodSymbol.paramLists(0).size
universe.appliedType(definitions.FunctionClass(arity), typesList)
}