I would like to troubleshoot the issue but couldn't move further. Can anyone please help
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat
class KeyBasedOutput[T >: Null, V <: AnyRef] extends MultipleTextOutputFormat[T , V] {
override def generateFileNameForKeyValue(key: T, value: V, leaf: String) = {
key.toString
}
override def generateActualKey(key: T, value: V) = {
null
}
}
val cp1 =sqlContext.sql("select * from d_prev_fact").map(t => t.mkString("\t")).map{x => val parts = x.split("\t")
val partition_key = parts(3)
val rows = parts.slice(0, parts.length).mkString("\t")
("date=" + partition_key.toString, rows.toString)}
cp1.saveAsHadoopFile(FACT_CP)
I have got an error as below and not able to debug
scala> cp1.saveAsHadoopFile(FACT_CP,classOf[String],classOf[String],classOf[KeyBasedOutput[String, String]])
java.lang.RuntimeException: java.lang.NoSuchMethodException: $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$KeyBasedOutput.<init>()
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:709)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:742)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:674)
The idea is to write the values into multiple folder based on the Key
Put KeyBasedOutput to a jar and start spark-shell --jars /path/to/the/jar
I'm not certain, but I think type erasure combined with reflection may be causing this problem for you. Try defining a non generic subclass of KeyBasedOutput that hard codes the type parameters and use that.
class StringKeyBasedOutput extends KeyBasedOutput[String, String]
Related
I'm building a toy project to learn Scala 3 and i'm stuck in one problem, first of all i'm following the tagless-final approach using cats-effect, the approach is working as expected except for the entity serialization, when i try to create a route using akka-http i have the following problem:
def routes: Route = pathPrefix("security") {
(path("auth") & post) {
entity(as[LoginUserByCredentialsCommand]) {
(command: LoginUserByCredentialsCommand) =>
complete {
login(command)
}
}
}}
F[
Either[com.moralyzr.magickr.security.core.errors.AuthError,
com.moralyzr.magickr.security.core.types.TokenType.Token
]
]
Required: akka.http.scaladsl.marshalling.ToResponseMarshallable
where: F is a type in class SecurityApi with bounds <: [_] =>> Any
For what i understood, akka-http does not know how to serialize the F highly-kinded type, by searching a little bit i found the following solution, it consists of creating an implicit called marshallable to show the akka-http how to serialize the type, however when i implement it i get a StackOverflow error :(
import akka.http.scaladsl.marshalling.ToResponseMarshaller
import cats.effect.IO
trait Marshallable[F[_]]:
def marshaller[A: ToResponseMarshaller]: ToResponseMarshaller[F[A]]
object Marshallable:
implicit def marshaller[F[_], A : ToResponseMarshaller](implicit M: Marshallable[F]): ToResponseMarshaller[F[A]] =
M.marshaller
given ioMarshaller: Marshallable[IO] with
def marshaller[A: ToResponseMarshaller] = implicitly
I'm really stuck right now, does anyone have an idea on how can i fix this problem? The complete code can be found here
Edit: This is the login code
For clarity, here are the class that instantiate the security api and the security api itself
object Magickr extends IOApp:
override def run(args: List[String]): IO[ExitCode] =
val server = for {
// Actors
actorsSystem <- ActorsSystemResource[IO]()
streamMaterializer <- AkkaMaterializerResource[IO](actorsSystem)
// Configs
configs <- Resource.eval(MagickrConfigs.makeConfigs[IO]())
httpConfigs = AkkaHttpConfig[IO](configs)
databaseConfigs = DatabaseConfig[IO](configs)
flywayConfigs = FlywayConfig[IO](configs)
jwtConfig = JwtConfig[IO](configs)
// Interpreters
jwtManager = JwtBuilder[IO](jwtConfig)
authentication = InternalAuthentication[IO](
passwordValidationAlgebra = new SecurityValidationsInterpreter(),
jwtManager = jwtManager
)
// Database
_ <- Resource.eval(
DbMigrations.migrate[IO](flywayConfigs, databaseConfigs)
)
transactor <- DatabaseConnection.makeTransactor[IO](databaseConfigs)
userRepository = UserRepository[IO](transactor)
// Services
securityManagement = SecurityManagement[IO](
findUser = userRepository,
authentication = authentication
)
// Api
secApi = new SecurityApi[IO](securityManagement)
routes = pathPrefix("api") {
secApi.routes()
}
akkaHttp <- AkkaHttpResource.makeHttpServer[IO](
akkaHttpConfig = httpConfigs,
routes = routes,
actorSystem = actorsSystem,
materializer = streamMaterializer
)
} yield (actorsSystem)
return server.useForever
And
class SecurityApi[F[_]: Async](
private val securityManagement: SecurityManagement[F]
) extends LoginUserByCredentials[F]
with SecurityProtocols:
def routes()(using marshaller: Marshallable[F]): Route = pathPrefix("security") {
(path("auth") & post) {
entity(as[LoginUserByCredentialsCommand]) {
(command: LoginUserByCredentialsCommand) =>
complete {
login(command)
}
}
}
}
override def login(
command: LoginUserByCredentialsCommand
): F[Either[AuthError, Token]] =
securityManagement.loginWithCredentials(command = command).value
================= EDIT 2 =========================================
With the insight provided by Luis Miguel, it makes a clearer sense that i need to unwrap the IO into a Future at the Marshaller level, something like this:
def ioToResponseMarshaller[A: ToResponseMarshaller](
M: Marshallable[IO]
): ToResponseMarshaller[IO[A]] =
Marshaller.futureMarshaller.compose(M.entity.unsafeToFuture())
However, i have this problem:
Found: cats.effect.unsafe.IORuntime => scala.concurrent.Future[A]
Required: cats.effect.IO[A] => scala.concurrent.Future[A]
I think i'm close! Is there a way to unwrap the IO keeping the IO type?
I managed to make it work! Thanks to #luismiguel insight, the problem was that the Akka HTTP Marshaller was not able to deal with Cats-Effect IO monad, so the solution was an implementation who unwraps the IO monad using the unsafeToFuture inside the marshaller, that way i was able to keep the Tagless-Final style from point to point, here's the solution:
This implicit fetches the internal marshaller for the type
import akka.http.scaladsl.marshalling.ToResponseMarshaller
import cats.effect.IO
trait Marshallable[F[_]]:
def marshaller[A: ToResponseMarshaller]: ToResponseMarshaller[F[A]]
object Marshallable:
implicit def marshaller[F[_], A: ToResponseMarshaller](implicit
M: Marshallable[F]
): ToResponseMarshaller[F[A]] = M.marshaller
given ioMarshallable: Marshallable[IO] with
def marshaller[A: ToResponseMarshaller] = CatsEffectsMarshallers.ioMarshaller
This one unwraps the IO monad and flatMaps the marshaller using a future, which akka-http knows how to deal with.
import akka.http.scaladsl.marshalling.{
LowPriorityToResponseMarshallerImplicits,
Marshaller,
ToResponseMarshaller
}
import cats.effect.IO
import cats.effect.unsafe.implicits.global
trait CatsEffectsMarshallers extends LowPriorityToResponseMarshallerImplicits:
implicit def ioMarshaller[A](implicit
m: ToResponseMarshaller[A]
): ToResponseMarshaller[IO[A]] =
Marshaller(implicit ec => _.unsafeToFuture().flatMap(m(_)))
object CatsEffectsMarshallers extends CatsEffectsMarshallers
I need to write a function in Scala that returns an Array of byte serializated with AvroOutputStream, but in scala i can't get the class of the generic object i'm passing in input.
Here is my util class:
class AvroUtils {
def createByteArray[T](obj: T): Array[Byte] = {
val byteArrayStream = new ByteArrayOutputStream()
val output = AvroOutputStream.binary[T](byteArrayStream)
output.write(obj)
output.close()
byteArrayStream.toByteArray()
}
}
As you can see if tou test this code is that AvroOutputStream can't recognize the T class so it can't generate a schema for it.
Hope you can help! thanks
PS: Already tried with TypeTag and ClassTag, nothing works.
You need to add the proper implicits for T, namely SchemaFor and ToRecord:
def createByteArray[T : SchemaFor : ToRecord](obj: T): Array[Byte] = {
val byteArrayStream = new ByteArrayOutputStream()
val output = AvroOutputStream.binary[T](byteArrayStream)
output.write(obj)
output.close()
byteArrayStream.toByteArray()
}
I have a case where I wish to apply modifications to an object based on the presence of (a few, say, 5 to 10) optionals. So basically, if I were to do it imperatively, what I'm aiming for is :
var myObject = ...
if (option.isDefined) {
myObject = myObject.modify(option.get)
}
if (option2.isDefined) {
myObject = myObject.someOtherModification(option2.get)
}
(Please note : maybe my object is mutable, maybe not, that is not the point here.)
I thought it'd look nicer if I tried to implement a fluent way of writing this, such as (pseudo code...) :
myObject.optionally(option, _.modify(_))
.optionally(option2, _.someOtherModification(_))
So I started with a sample code, which intelliJ does not highlight as an error, but that actually does not build.
class MyObject(content: String) {
/** Apply a transformation if the optional is present */
def optionally[A](optional: Option[A], operation: (A, MyObject) => MyObject): MyObject =
optional.map(operation(_, this)).getOrElse(this)
/** Some possible transformation */
def resized(length : Int): MyObject = new MyObject(content.substring(0, length))
}
object Test {
val my = new MyObject("test")
val option = Option(2)
my.optionally(option, (size, value) => value.resized(size))
}
Now, in my case, the MyObject type is of some external API, so I created an implicit conversion to help, so what it really does look like :
// Out of my control
class MyObject(content: String) {
def resized(length : Int): MyObject = new MyObject(content.substring(0, length))
}
// What I did : create a rich type over MyObject
class MyRichObject(myObject: MyObject) {
def optionally[A](optional: Option[A], operation: (A, MyObject) => MyObject): MyObject = optional.map(operation(_, myObject)).getOrElse(myObject)
}
// And an implicit conversion
object MyRichObject {
implicit def apply(myObject: MyObject): MyRichObject = new MyRichObject(myObject)
}
And then, I use it this way :
object Test {
val my = new MyObject("test")
val option = Option(2)
import MyRichObject._
my.optionally(option, (size, value) => value.resized(size))
}
And this time, it fails in IntelliJ and while compiling because the type of the Option is unknown :
Error:(8, 26) missing parameter type
my.optionally(option, (size, value) => value.resized(size))
To make it work, I can :
Actively specify a type of the size argument : my.optionally(option, (size: Int, value) => value.resized(size))
Rewrite the optionally to a curried-version
None of them is really bad, but if I may ask :
Is there a reason that a curried version works, but a multi argument version seems to fail to infer the parametrized type,
Could it be written in a way that works without specifying the actual types
and as a bonus (although this might be opinion based), how would you write it (some sort of foldLeft on a sequence of optionals come to my mind...) ?
One option for your consideration:
// Out of my control
class MyObject(content: String) {
def resized(length : Int): MyObject = new MyObject(content.substring(0, length))
}
object MyObjectImplicits {
implicit class OptionalUpdate[A](val optional: Option[A]) extends AnyVal {
def update(operation: (A, MyObject) => MyObject): MyObject => MyObject =
(obj: MyObject) => optional.map(a => operation(a, obj)).getOrElse(obj)
}
}
object Test {
val my = new MyObject("test")
val option = Option(2)
import MyObjectImplicits._
Seq(
option.update((size, value) => value.resized(size)),
// more options...
).foldLeft(my)(_)
}
Might as well just use a curried-version of your optionally, like you said.
A nicer way to think about the need to add the type there is write it this way:
object Test {
val my = new MyObject("test")
val option = Some(2)
my.optionally[Int](option, (size, value) => value.resized(size))
}
Another way, if you only will manage one type since the object creation, is to move the generic to the class creation, but be careful, with this option you only can have one type per instance:
class MyObject[A](content: String) {
def optionally(optional: Option[A], operation: (A, MyObject[A]) => MyObject[A]): MyObject[A] =
optional.map(operation(_, this)).getOrElse(this)
def resized(length : Int): MyObject[A] = new MyObject[A](content.substring(0, length))
}
object Test {
val my = new MyObject[Int]("test")
val option = Some(2)
my.optionally(option, (size, value) => value.resized(size))
}
As you can see, now all the places where the generics was is taken by the Int type, because that is what you wanted in the first place, here is a pretty answer telling why:
(just the part that I think applies here:)
4)When the inferred return type would be more general than you intended, e.g., Any.
Source: In Scala, why does a type annotation must follow for the function parameters ? Why does the compiler not infer the function parameter types?
In order to be able to handle large amounts of different request types I created a .proto file like this:
message Message
{
string typeId = 1;
bytes message = 2;
}
I added the typeId so that one knows what actual protobuf bytes represents. (Self-describing)
Now my problem is handling that different "concrete types" in an elegant way. (Note: All works fine if I simple use a switch-case-like approach!)
I thought about a solution like this:
1) Have a trait the different handlers have to implement, e.g.:
trait Handler[T]
{
def handle(req: T): Any
}
object TestHandler extends Handler[Test]
{
override def handle(req: Test): String =
{
s"A success, $req has been handled by TestHandler
}
}
object OtherHandler extends Handler[Other]
{
override def handle(req: Other): String =
{
s"A success, $req has been handled by OtherHandler
}
}
2) provide some kind of registry to query the right handler for a given message:
val handlers = Map(
Test -> TestHandler,
Other -> OtherHandler
)
3) If a request comes in it identifies itself, so we need another Mapper:
val reqMapper = Map(
"Test" -> Test
"Other" -> Other
)
4) If a request comes in, handle it:
val request ...
// Determine the requestType
val requestType = reqMapper(request.type)
// Find the correct handler for the requestType
val handler = handlers(requestType)
// Parse the actual request
val actualRequest = requestType.parse(...) // type of actualRequest can only be Test or Other in our little example
Now, until here everything looks fine and dandy, but then this line breaks my whole world:
handler.handle(actualRequest)
It leads to:
type mismatch; found : com.trueaccord.scalapb.GeneratedMessage with Product with com.trueaccord.scalapb.Message[_ >: tld.test.proto.Message.Test with tld.test.proto.Message.Other <: com.trueaccord.scalapb.GeneratedMessage with Product] with com.trueaccord.lenses.Updatable[_ >: tld.test.proto.Message.Other with tld.test.proto.Message.Test <: com.trueaccord.scalapb.GeneratedMessage with Product]{def companion: Serializable} required: _1
As far as I understand - PLEASE CORRECT ME HERE IF AM WRONG - the compiler cannot be sure here, that actualRequest is "handable" by a handler. That means it lacks the knowledge that the actualRequest is definitely somewhere in that mapper AND ALSO that there is a handler for it.
It's basically implicit knowledge a human would get, but the compiler cannot infer.
So, that being said, how can I overcome that situation elegantly?
your types are lost when you use a normal Map. for eg
object Test{}
object Other{}
val reqMapper = Map("Test" -> Test,"Other" -> Other)
reqMapper("Test")
res0: Object = Test$#5bf0fe62 // the type is lost here and is set to java.lang.Object
the most idomatic way to approach this is to use pattern matching
request match {
case x: Test => TestHandler(x)
case x: Other => OtherHandler(x)
case _ => throw new IllegalArgumentException("not supported")
}
if you still want to use Maps to store your type to handler relation consider HMap provided by Shapeless here
Heterogenous maps
Shapeless provides a heterogenous map which supports an arbitrary
relation between the key type and the corresponding value type,
I settled for this solution for now (basically thesamet's, a bit adapted for my particular use-case)
trait Handler[T <: GeneratedMessage with Message[T], R]
{
implicit val cmp: GeneratedMessageCompanion[T]
def handle(bytes: ByteString): R = {
val msg: T = cmp.parseFrom(bytes.newInput())
handler(msg)
}
def apply(t: T): R
}
object Test extends Handler[Test, String]
{
override def apply(t: Test): String = s"$t received and handled"
override implicit val cmp: GeneratedMessageCompanion[Test] = Test.messageCompanion
}
One trick you can use is to capture the companion object as an implicit, and combine the parsing and handling in a single function where the type is available to the compiler:
case class Handler[T <: GeneratedMessage with Message[T]](handler: T => Unit)(implicit cmp: GeneratedMessageCompanion[T]) {
def handle(bytes: ByteString): Unit = {
val msg: T = cmp.parseFrom(bytes.newInputStream)
handler(t)
}
}
val handlers: Map[String, Handler[_]] = Map(
"X" -> Handler((x: X) => Unit),
"Y" -> Handler((x: Y) => Unit)
)
// To handle the request:
handlers(request.typeId).handle(request.message)
Also, take a look at any.proto which defines a structure very similar to your Message. It wouldn't solve your problem, but you can take advantage of it's pack and unpack methods.
I'm using scala and slick here, and I have a baserepository which is responsible for doing the basic crud of my classes.
For a design decision, we do have updatedTime and createdTime columns all handled by the application, and not by triggers in database. Both of this fields are joda DataTime instances.
Those fields are defined in two traits called HasUpdatedAt, and HasCreatedAt, for the tables
trait HasCreatedAt {
val createdAt: Option[DateTime]
}
case class User(name:String,createdAt:Option[DateTime] = None) extends HasCreatedAt
I would like to know how can I use reflection to call the user copy method, to update the createdAt value during the database insertion method.
Edit after #vptron and #kevin-wright comments
I have a repo like this
trait BaseRepo[ID, R] {
def insert(r: R)(implicit session: Session): ID
}
I want to implement the insert just once, and there I want to createdAt to be updated, that's why I'm not using the copy method, otherwise I need to implement it everywhere I use the createdAt column.
This question was answered here to help other with this kind of problem.
I end up using this code to execute the copy method of my case classes using scala reflection.
import reflect._
import scala.reflect.runtime.universe._
import scala.reflect.runtime._
class Empty
val mirror = universe.runtimeMirror(getClass.getClassLoader)
// paramName is the parameter that I want to replacte the value
// paramValue is the new parameter value
def updateParam[R : ClassTag](r: R, paramName: String, paramValue: Any): R = {
val instanceMirror = mirror.reflect(r)
val decl = instanceMirror.symbol.asType.toType
val members = decl.members.map(method => transformMethod(method, paramName, paramValue, instanceMirror)).filter {
case _: Empty => false
case _ => true
}.toArray.reverse
val copyMethod = decl.declaration(newTermName("copy")).asMethod
val copyMethodInstance = instanceMirror.reflectMethod(copyMethod)
copyMethodInstance(members: _*).asInstanceOf[R]
}
def transformMethod(method: Symbol, paramName: String, paramValue: Any, instanceMirror: InstanceMirror) = {
val term = method.asTerm
if (term.isAccessor) {
if (term.name.toString == paramName) {
paramValue
} else instanceMirror.reflectField(term).get
} else new Empty
}
With this I can execute the copy method of my case classes, replacing a determined field value.
As comments have said, don't change a val using reflection. Would you that with a java final variable? It makes your code do really unexpected things. If you need to change the value of a val, don't use a val, use a var.
trait HasCreatedAt {
var createdAt: Option[DateTime] = None
}
case class User(name:String) extends HasCreatedAt
Although having a var in a case class may bring some unexpected behavior e.g. copy would not work as expected. This may lead to preferring not using a case class for this.
Another approach would be to make the insert method return an updated copy of the case class, e.g.:
trait HasCreatedAt {
val createdAt: Option[DateTime]
def withCreatedAt(dt:DateTime):this.type
}
case class User(name:String,createdAt:Option[DateTime] = None) extends HasCreatedAt {
def withCreatedAt(dt:DateTime) = this.copy(createdAt = Some(dt))
}
trait BaseRepo[ID, R <: HasCreatedAt] {
def insert(r: R)(implicit session: Session): (ID, R) = {
val id = ???//insert into db
(id, r.withCreatedAt(??? /*now*/))
}
}
EDIT:
Since I didn't answer your original question and you may know what you are doing I am adding a way to do this.
import scala.reflect.runtime.universe._
val user = User("aaa", None)
val m = runtimeMirror(getClass.getClassLoader)
val im = m.reflect(user)
val decl = im.symbol.asType.toType.declaration("createdAt":TermName).asTerm
val fm = im.reflectField(decl)
fm.set(??? /*now*/)
But again, please don't do this. Read this stackoveflow answer to get some insight into what it can cause (vals map to final fields).