Spark: Custom operator for JavaPairRDD - scala

I'm trying to declare a custom operator for JavaPairRDD, here is the code:
import org.apache.spark.api.java.JavaPairRDD
import scala.reflect.ClassTag

object CustomOperators {
  implicit class CustomRDDOperator[K: ClassTag, V: ClassTag](rdd: JavaPairRDD[K, V]) {
    def customOp = {
      // logic
    }
  }
}
But I'm not able to call this function from my JavaPairRDD.
I'm very new to Scala, so there is a good chance that I'm doing something fundamentally wrong. Need some guidance.
What would be the best way to add a custom function to JavaPairRDD?

You should just need to add import CustomOperators._ in the file where you are using it. But if you are working from Scala, you shouldn't end up with a JavaPairRDD in the first place (unless you are using a third-party library intended to be used primarily from Java).
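A minimal, Spark-free sketch of the same mechanism may help: here List[(K, V)] stands in for JavaPairRDD[K, V] (so it runs without a Spark dependency), and customOp is placeholder logic.

```scala
import scala.reflect.ClassTag

object CustomOperators {
  // Same implicit-class pattern as above; List[(K, V)] is a stand-in
  // for JavaPairRDD[K, V], and customOp's body is placeholder logic.
  implicit class CustomPairOps[K: ClassTag, V: ClassTag](pairs: List[(K, V)]) {
    def customOp: Int = pairs.size // placeholder: count the pairs
  }
}

object CustomOpDemo {
  import CustomOperators._ // without this import, customOp is not found

  def run(): Int = List(("a", 1), ("b", 2)).customOp
}
```

The key point is that the implicit class only enriches the receiver type once the enclosing object's members are imported into scope.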

Related

Scala, cats - how to avoid using Applicative[F] explicitly?

I would like to use Applicative[F] without calling it explicitly. Currently I have this simple code:
class BettingServiceMock[F[_] : Async] extends BettingService[F] {
  override def put(bet: Bet): F[Bet] = {
    for {
      created <- Bet(Some(BetId(randomUUID().toString)), bet.stake, bet.name).pure
    } yield created
  }
}
Bet is just a simple case class. I call the pure method explicitly to return F[Bet]. Is there some way to avoid calling pure explicitly like this?
I tried to do something like this:
class BettingServiceMock[F[_] : Async] (implicit a: Applicative[F]) extends BettingService[F] {
  override def put(bet: Bet): F[Bet] = {
    for {
      created <- Bet(Some(BetId(randomUUID().toString)), bet.stake, bet.name)
    } yield created
  }
}
It did not help, because I got an error:
value map is not a member of model.Bet <- (Some(BetId(randomUUID().toString)), bet.stake, bet.name)
I would like to discover good practices in Cats, that's why I'm asking about it. I don't think that explicitly calling methods like pure is good practice. Could you help me with that?
First of all, why do you think it's a bad practice? That's common Applicative syntax. If you want some "magic" that automatically lifts your value Bet to F[Bet], then you would need some sort of implicit conversion, and that would be really bad practice.
Take a look at the scaladoc example of Applicative: https://github.com/typelevel/cats/blob/master/core/src/main/scala/cats/Applicative.scala
Applicative[Option].pure(10)
Here the Applicative[Option] instance was summoned by apply[F[_]](implicit instance: Applicative[F]) which is automatically generated by simulacrum's #typeclass.
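To make the summoner mechanics concrete, here is a hand-rolled, minimal Applicative (not the cats one) with the apply method written out by hand instead of generated by simulacrum:

```scala
import scala.language.higherKinds

// Hand-rolled minimal Applicative, only to show the summoner + pure.
// The real cats typeclass has far more structure (map, ap, laws, ...).
trait Applicative[F[_]] {
  def pure[A](a: A): F[A]
}

object Applicative {
  // the summoner: Applicative[Option] expands to apply[Option](<found instance>)
  def apply[F[_]](implicit instance: Applicative[F]): Applicative[F] = instance

  implicit val optionApplicative: Applicative[Option] = new Applicative[Option] {
    def pure[A](a: A): Option[A] = Some(a)
  }
}

object ApplicativeDemo {
  def run(): Option[Int] = Applicative[Option].pure(10)
}
```

Calling Applicative[Option] just triggers implicit resolution for the Option instance; pure then lifts the plain value into the effect.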

Best Practice to Load Class in Scala

I'm new to Scala (and functional programming as well) and I'm developing a plugin based application to learn and study.
I've created a trait to be the interface of a plugin. So when my app starts, it will load all the classes that implement this trait.
trait Plugin {
  def init(config: Properties)
  def execute(parameters: Map[String, Array[String]])
}
In my learning of Scala, I've read that if I want to program in functional way, I should avoid using var. Here's my problem:
The init method will be called after the class being loaded. And probably I will want to use the values from the config parameter in the execute method.
How to store this without using a var? Is there a better practice to do what I want here?
Thanks
There is more to programming in a functional way than just avoiding vars. One key concept is also to prefer immutable objects. In that respect your Plugin API is already breaking functional principles as both methods are only executed for their side-effects. With such an API using vars inside the implementation does not make a difference.
For an immutable plugin instance you could split plugin creation:
trait PluginFactory {
  def createPlugin(config: Properties): Plugin
}

trait Plugin {
  def execute ...
}
Example:
class MyPluginFactory extends PluginFactory {
  def createPlugin(config: Properties): Plugin = {
    val someValue = ... // extract from config
    new MyPlugin(someValue)
  }
}

class MyPlugin(someValue: String) extends Plugin {
  def execute ... // using someValue
}
You can use a val! It's basically the same thing, but the value of a val field cannot be modified later on. If you were using a class, you could write:
class Plugin(val config: Properties) {
  def init {
    // do init stuff...
  }
  def execute = // ...
}
Unfortunately, a trait cannot have class parameters. If you want to have a config field in your trait, you won't be able to set its value immediately, so it will have to be a var.

Can I define and use functions outside classes and objects in Scala?

I'm beginning to learn Scala and I'm curious whether I can define and use a function without any class or object, as in Haskell, where there is no OOP concept. Can I use Scala entirely without any OOP concepts?
P.S. I use IntelliJ plugin for Scala
Well, you can't really do that, but you can get very close by using package objects:
src/main/scala/yourpackage/package.scala:
package object yourpackage {
  def function(x: Int) = x * x
}
src/main/scala/yourpackage/Other.scala:
package yourpackage
object Other {
  def main(args: Array[String]) {
    println(function(10)) // Prints 100
  }
}
Note how function is used in Other object - without any qualifiers. Here function belongs to a package, not to some specific object/class.
As the other answers have indicated, it's not possible to define a function without an object or class, because in Scala, functions are objects and (for example) Function1 is a class.
However, it is possible to define a function without an enclosing class or object. You do this by making the function itself the object.
src/main/scala/foo/bar/square.scala:
package foo.bar
object square extends (Int => Int) {
  def apply(x: Int): Int = x * x
}
extends (Int => Int) is here just syntactic sugar for extends Function1[Int, Int].
This can then be imported and used like so:
src/main/scala/foo/bar/somepackage/App.scala:
package foo.bar.somepackage

import foo.bar.square

object App {
  def main(args: Array[String]): Unit = {
    println(square(2)) // prints "4"
  }
}
No, functions as "first-class objects" means they are still objects.
However, it is still easy to construct a Scala application that does nothing but invoke a function that is arbitrarily composed.
One Odersky meme is that Scala's fusion of FP and OOP exploits modularity of OOP. It's a truism that Scala doesn't strive for "pure FP"; but OOP has real consequences for everyday programming, such as that a Map is a Function, as is a Set or a String (by virtue of enrichment to StringOps).
So, although it's possible to write a program that just invokes a function and is not structured OOPly, the language does not support active avoidance of OOP in the sense of exposure to inheritance and other unFP things like mutable state and side effects.

How can I add new methods to a library object?

I've got a class from a library (specifically, com.twitter.finagle.mdns.MDNSResolver). I'd like to extend the class (I want it to return a Future[Set], rather than a Try[Group]).
I know, of course, that I could sub-class it and add my method there. However, I'm trying to learn Scala as I go, and this seems like an opportunity to try something new.
The reason I think this might be possible is the behavior of JavaConverters. The following code:
class Test {
  var lst: Buffer[Nothing] = (new java.util.ArrayList()).asScala
}
does not compile, because there is no asScala method on Java's ArrayList. But if I import some new definitions:
class Test {
  import collection.JavaConverters._
  var lst: Buffer[Nothing] = (new java.util.ArrayList()).asScala
}
then suddenly there is an asScala method. So that looks like the ArrayList class is being extended transparently.
Am I understanding the behavior of JavaConverters correctly? Can I (and should I) duplicate that methodology?
Scala supports something called implicit conversions. Look at the following:
val x: Int = 1
val y: String = x
The second assignment does not work, because String is expected, but Int is found. However, if you add the following into scope (just into scope, can come from anywhere), it works:
implicit def int2String(x: Int): String = "asdf"
Note that the name of the method does not matter.
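A runnable version of that example (the body "asdf" is of course arbitrary; scala.language.implicitConversions is imported to silence the feature warning):

```scala
import scala.language.implicitConversions

object ConversionDemo {
  // With this conversion in scope, an Int is accepted where a String
  // is expected; the method's name is irrelevant to resolution.
  implicit def int2String(x: Int): String = "asdf"

  def run(): String = {
    val x: Int = 1
    val y: String = x // compiles only because int2String is in scope
    y
  }
}
```
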
So what is usually done is called the pimp-my-library pattern:
class BetterFoo(x: Foo) {
  def coolMethod() = { ... }
}

implicit def foo2Better(x: Foo) = new BetterFoo(x)
That allows you to call coolMethod on Foo. This is used so often, that since Scala 2.10, you can write:
implicit class BetterFoo(x: Foo) {
  def coolMethod() = { ... }
}
which does the same thing but is obviously shorter and nicer.
So you can do:
implicit class MyMDNSResolver(x: com.twitter.finagle.mdns.MDNSResolver) {
  def awesomeMethod = { ... }
}
And you'll be able to call awesomeMethod on any MDNSResolver, if MyMDNSResolver is in scope.
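A runnable sketch of the same enrichment, using String as a stand-in for MDNSResolver so it doesn't need the Finagle dependency (awesomeMethod's behavior here is made up):

```scala
object Enrichment {
  // Extending AnyVal makes this a value class, so no wrapper is
  // allocated at the call site in most cases.
  implicit class RichString(val s: String) extends AnyVal {
    def awesomeMethod: String = s.toUpperCase + "!"
  }
}

object EnrichmentDemo {
  import Enrichment._ // bring the implicit class into scope

  def run(): String = "hello".awesomeMethod
}
```
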
This is achieved using implicit conversions; this feature allows you to automatically convert one type to another when a method that's not recognised is called.
The pattern you're describing in particular is referred to as "enrich my library", after an article Martin Odersky wrote in 2006. It's still an okay introduction to what you want to do: http://www.artima.com/weblogs/viewpost.jsp?thread=179766
The way to do this is with an implicit conversion. These can be used to define views, and their use to enrich an existing library is called "pimp my library".
I'm not sure if you need to write a conversion from Try[Group] to Future[Set], or you can write one from Try to Future and another from Group to Set, and have them compose.

Generating a Scala class automatically from a trait

I want to create a method that generates an implementation of a trait. For example:
trait Foo {
  def a
  def b(i: Int): String
}

object Processor {
  def exec(instance: AnyRef, method: String, params: AnyRef*) = {
    // whatever
  }
}
class Bar {
  def wrap[T] = {
    // Here create a new instance of the implementing class, i.e. if T is Foo,
    // generate a new FooImpl(this)
  }
}
I would like to dynamically generate the FooImpl class like so:
class FooImpl(val wrapped: AnyRef) extends Foo {
  def a = Processor.exec(wrapped, "a")
  def b(i: Int) = Processor.exec(wrapped, "b", i)
}
Manually implementing each of the traits is not something we would like (lots of boilerplate) so I'd like to be able to generate the Impl classes at compile time. I was thinking of annotating the classes and perhaps writing a compiler plugin, but perhaps there's an easier way? Any pointers will be appreciated.
java.lang.reflect.Proxy could do something quite close to what you want :
import java.lang.reflect.{InvocationHandler, Method, Proxy}

class Bar {
  def wrap[T : ClassManifest]: T = {
    val theClass = classManifest[T].erasure.asInstanceOf[Class[T]]
    theClass.cast(
      Proxy.newProxyInstance(
        theClass.getClassLoader(),
        Array(theClass),
        new InvocationHandler {
          def invoke(proxy: AnyRef, method: Method, params: Array[AnyRef]) =
            Processor.exec(Bar.this, method.getName, params: _*)
        }))
  }
}
With that, you have no need to generate FooImpl.
A limitation is that it will only work for traits in which no methods are implemented. More precisely, if a method is implemented in the trait, calling it will still route to the processor and ignore the implementation.
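Here is a self-contained, runnable variant of the Proxy approach. The Greeter trait and the exec stand-in (which just records the call instead of doing real work) are made up for the demo, and the Class[T] parameter replaces the deprecated ClassManifest machinery:

```scala
import java.lang.reflect.{InvocationHandler, Method, Proxy}

// A pure-abstract Scala trait compiles to a Java interface, which is
// exactly what Proxy.newProxyInstance needs.
trait Greeter {
  def greet(name: String): String
}

object ProxyDemo {
  // Hypothetical stand-in for Processor.exec: records the call it received.
  def exec(method: String, params: AnyRef*): String =
    s"called $method(${params.mkString(", ")})"

  def wrap[T](clazz: Class[T]): T =
    clazz.cast(
      Proxy.newProxyInstance(
        clazz.getClassLoader,
        Array[Class[_]](clazz),
        new InvocationHandler {
          def invoke(proxy: AnyRef, method: Method, args: Array[AnyRef]): AnyRef = {
            // args is null for zero-arg methods, so guard against that
            val params = if (args == null) Seq.empty else args.toSeq
            exec(method.getName, params: _*)
          }
        }))

  def run(): String = wrap(classOf[Greeter]).greet("world")
}
```

Every call on the returned Greeter is routed through invoke, so no FooImpl-style class ever has to be generated.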
You can write a macro (macros are officially a part of Scala since 2.10.0-M3), something along the lines of Mixing in a trait dynamically. Unfortunately now I don't have time to compose an example for you, but feel free to ask questions on our mailing list at http://groups.google.com/group/scala-internals.
You can see three different ways to do this in ScalaMock.
ScalaMock 2 (the current release version, which supports Scala 2.8.x and 2.9.x) uses java.lang.reflect.Proxy to support dynamically typed mocks and a compiler plugin to generate statically typed mocks.
ScalaMock 3 (currently available as a preview release for Scala 2.10.x) uses macros to support statically typed mocks.
Assuming that you can use Scala 2.10.x, I would strongly recommend the macro-based approach over a compiler plugin. You can certainly make the compiler plugin work (as ScalaMock demonstrates) but it's not easy and macros are a dramatically superior approach.