Custom ML Function in Pyspark - pyspark

I have a Python class that fits a gamma CDF model using scipy.stats.gamma.cdf. I want to convert this into a pandas UDF so that it can be used in PySpark on a larger dataset. Any insights would be really helpful.
import scipy
import scipy.stats as stats
class gamma_function:
    def __init__(self, a=None, b=None, p=None):
        self.a = a
        self.b = b
        self.p = p
    def _gamma_cdf_background(self, X, a, b, p):
        return stats.gamma.cdf(X, a, b) * p
    def predict(self, X):
        return self._gamma_cdf_background(X, self.a, self.b, self.p)
    def fit(self, X, y):
        from scipy.optimize import curve_fit
        popt, pcov = curve_fit(self._gamma_cdf_background, X, y, bounds=([.25, 0.1, .1], [3., 3., 3.]), tr_solver='lsmr')
        self.a = popt[0]
        self.b = popt[1]
        self.p = popt[2]
        return self
    def transform(self, X):
        return stats.gamma.cdf(X, self.a, self.b) * self.p
    def get_params(self, deep=False):
        return {'a': self.a, 'b': self.b, 'p': self.p}
    def set_params(self, **parameters):
        # dict.items(), not .intems(), iterates over the keyword arguments
        for parameter, value in parameters.items():
            setattr(self, parameter, value)
        return self

Related

Can implicits be used to disambiguate overloaded definition?

Consider the following overloaded definition of method mean:
def mean[T](data: Iterable[T])(implicit number: Fractional[T]): T = {
  import number._
  val sum = data.foldLeft(zero)(plus)
  div(sum, fromInt(data.size))
}
def mean[T](data: Iterable[T])(implicit number: Integral[T]): Double = {
  import number._
  val sum = data.foldLeft(zero)(plus)
  sum.toDouble / data.size
}
I would like the second definition, which returns Double, to be used only for Integral types. However,
mean(List(1,2,3,4))
results in compiler error
Error: ambiguous reference to overloaded definition,
both method mean in class A$A16 of type [T](data: Iterable[T])(implicit number: Integral[T])Double
and method mean in class A$A16 of type [T](data: Iterable[T])(implicit number: Fractional[T])T
match argument types (List[Int])
mean(List(1,2,3,4))
^
Is there any way to use the fact that Fractional[Int] implicit is not available in order to disambiguate the two overloads?
Scala only considers the first argument list for the overload resolution, according to the specification. Both mean methods are deemed equally specific and ambiguous.
But for implicit resolution the implicits in scope are also considered, so a workaround could be to use a magnet pattern or a type class. Here is an example using the magnet pattern, which I believe is simpler:
def mean[T](data: MeanMagnet[T]): data.Out = data.mean
sealed trait MeanMagnet[T] {
  type Out
  def mean: Out
}
object MeanMagnet {
  import language.implicitConversions
  type Aux[T, O] = MeanMagnet[T] { type Out = O }
  implicit def fromFractional[T](
    data: Iterable[T]
  )(
    implicit number: Fractional[T]
  ): MeanMagnet.Aux[T, T] = new MeanMagnet[T] {
    override type Out = T
    override def mean: Out = {
      import number._
      val sum = data.foldLeft(zero)(plus)
      div(sum, fromInt(data.size))
    }
  }
  implicit def fromIntegral[T](
    data: Iterable[T]
  )(
    implicit number: Integral[T]
  ): MeanMagnet.Aux[T, Double] = new MeanMagnet[T] {
    override type Out = Double
    override def mean: Out = {
      import number._
      val sum = data.foldLeft(zero)(plus)
      sum.toDouble / data.size
    }
  }
}
With this definition it works normally:
scala> mean(List(1,2,3,4))
res0: Double = 2.5
scala> mean(List(1.0, 2.0, 3.0, 4.0))
res1: Double = 2.5
scala> mean(List(1.0f, 2.0f, 3.0f, 4.0f))
res2: Float = 2.5
Here is my attempt at a type class solution, as suggested by others:
trait Mean[In, Out] {
  def apply(xs: Iterable[In]): Out
}
object Mean {
  def mean[In, Out](xs: Iterable[In])(implicit ev: Mean[In, Out]): Out = ev(xs)
  private def meanFractional[T](data: Iterable[T])(implicit number: Fractional[T]): T = {
    import number._
    val sum = data.foldLeft(zero)(plus)
    div(sum, fromInt(data.size))
  }
  private def meanIntegral[T](data: Iterable[T])(implicit number: Integral[T]): Double = {
    import number._
    val sum = data.foldLeft(zero)(plus)
    sum.toDouble / data.size
  }
  implicit val meanBigInt: Mean[BigInt, Double] = meanIntegral _
  implicit val meanInt: Mean[Int, Double] = meanIntegral _
  implicit val meanShort: Mean[Short, Double] = meanIntegral _
  implicit val meanByte: Mean[Byte, Double] = meanIntegral _
  implicit val meanChar: Mean[Char, Double] = meanIntegral _
  implicit val meanLong: Mean[Long, Double] = meanIntegral _
  implicit val meanFloat: Mean[Float, Float] = meanFractional _
  implicit val meanDouble: Mean[Double, Double] = meanFractional _
  import scala.math.BigDecimal
  implicit val meanBigDecimal: Mean[BigDecimal, BigDecimal] = meanFractional _
}
object MeanTypeclassExample extends App {
  import Mean._
  println(mean(List(1,2,3,4)))
  println(mean(List(1d,2d,3d,4d)))
  println(mean(List(1f,2f,3f,4f)))
}
which outputs
2.5
2.5
2.5
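Listing one instance per numeric type can probably be avoided. The following is a minimal sketch, assuming Scala 2 and only the standard library: instances are derived from whichever of Fractional[T] or Integral[T] is available, with the Integral one placed in a lower-priority trait. The names MeanDerived, LowPriorityMean, Aux, integralMean and fractionalMean are made up for this illustration and do not come from the answers above.

```scala
object MeanDerived {
  // Type class with the result type as a type member, so callers need not name it.
  trait Mean[In] {
    type Out
    def apply(xs: Iterable[In]): Out
  }

  // Lower-priority instances: any Integral[T] yields a Double mean.
  trait LowPriorityMean {
    implicit def integralMean[T](implicit num: Integral[T]): Mean.Aux[T, Double] =
      new Mean[T] {
        type Out = Double
        def apply(xs: Iterable[T]): Double =
          num.toDouble(xs.foldLeft(num.zero)(num.plus)) / xs.size
      }
  }

  object Mean extends LowPriorityMean {
    type Aux[In, O] = Mean[In] { type Out = O }

    // Higher-priority instances: any Fractional[T] yields a mean of type T.
    implicit def fractionalMean[T](implicit num: Fractional[T]): Aux[T, T] =
      new Mean[T] {
        type Out = T
        def apply(xs: Iterable[T]): T =
          num.div(xs.foldLeft(num.zero)(num.plus), num.fromInt(xs.size))
      }
  }

  // The dependent result type m.Out is fixed by the instance that was found.
  def mean[T](xs: Iterable[T])(implicit m: Mean[T]): m.Out = m(xs)
}
```

With this sketch, MeanDerived.mean(List(1, 2, 3, 4)) should be typed as Double and MeanDerived.mean(List(1f, 2f, 3f, 4f)) as Float, because only one of the two instances can be constructed for each element type.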

Two implicit definitions with same name for a method

I have two implicit declarations that "redefine" x as an operator:
import scala.io.StdIn._
import util._
import scala.language.postfixOps
case class Rectangle(width: Int, height: Int)
case class Circle(ratio: Integer)
case class Cylinder[T](ratio: T, height: T)
object implicitsExample1 {
  implicit class RectangleMaker(width: Int) {
    def x(height: Int) = Rectangle(width, height)
  }
  implicit class CircleMaker(ratio: Int) {
    def c = Circle(ratio)
  }
  implicit class CylinderMaker[T](ratio: T) {
    def x(height: T) = Cylinder(ratio, height)
  }
  def main(args: Array[String]) {
    val myRectangle = 3 x 4
    val myCircle = 3 c
    val myCylinder = 4 x 5
    println("myRectangle = " + myRectangle)
    println("myCircle = " + myCircle)
    println("myCylinder = " + myCylinder)
  }
}
Here my output gives:
myRectangle = Rectangle(3,4)
myCircle = Circle(3)
myCylinder = Rectangle(4,5)
What do I need to do to get something like:
myCylinder = Cylinder[Int](4,5)
I understand that the chosen implicit conversion is the first one declared but is there a way to specify the use of the Cylinder one?
Try combining RectangleMaker and CylinderMaker into a single ShapeMaker implicit class like so
implicit class ShapeMaker[T](width: T) {
  def x(height: T)(implicit ev: T =:= Int) = Rectangle(width, height)
  def x(height: T) = Cylinder[T](width, height)
}
and provide type ascriptions to value definitions like so
val myRectangle: Rectangle = 3 x 4
val myCircle = 3 c
val myCylinder: Cylinder[Int] = 4 x 5
which outputs
myRectangle = Rectangle(3,4)
myCircle = Circle(3)
myCylinder = Cylinder(4,5)
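If type ascriptions at every call site are undesirable, another option is to bypass implicit selection altogether: an implicit class is still an ordinary class, so it can be instantiated explicitly. A minimal sketch, assuming the RectangleMaker and CylinderMaker definitions from the question (the wrapping object ExplicitMakerExample is hypothetical and only makes the snippet self-contained):

```scala
object ExplicitMakerExample {
  case class Rectangle(width: Int, height: Int)
  case class Cylinder[T](ratio: T, height: T)

  implicit class RectangleMaker(width: Int) {
    def x(height: Int): Rectangle = Rectangle(width, height)
  }
  implicit class CylinderMaker[T](ratio: T) {
    def x(height: T): Cylinder[T] = Cylinder(ratio, height)
  }

  // Implicit selection picks RectangleMaker for Int operands, as in the question.
  val myRectangle = 4 x 5

  // Calling the wrapper class directly leaves no choice to the compiler.
  val myCylinder = new CylinderMaker(4).x(5)

  def main(args: Array[String]): Unit = {
    println("myRectangle = " + myRectangle)
    println("myCylinder = " + myCylinder)
  }
}
```

This gives up the infix DSL for the cylinder case; giving the two methods different names would be another way to keep the short syntax without ambiguity.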

Scala: ReaderT composition with different contexts and dependencies

Example of s3f1 and s3f2 functions that return different ReaderT:
type FailFast[A] = Either[List[String], A]
trait Service1 { def s1f:Option[Int] = Some(10) }
trait Service2 { def s2f:FailFast[Int] = Right(20) }
import cats.instances.option._
def s3f1: ReaderT[Option, Service1, Int] =
  for {
    r1 <- ReaderT((_: Service1).s1f)
  } yield r1 + 1
import cats.syntax.applicative._
import cats.instances.either._
type ReaderService2FF[A] = ReaderT[FailFast, Service2, A]
def s3f2: ReaderService2FF[Int] =
  for {
    r1 <- ReaderT((_: Service2).s2f)
    r2 <- 2.pure[ReaderService2FF]
  } yield r1 + r2
I am trying to compose these two functions, which return readers with different F[_] contexts and dependencies: ReaderT[Option, Service1, Int] and ReaderT[FailFast, Service2, Int]
I somehow have to combine the F[_] contexts, which means combining FailFast with Option. I assume it makes sense to combine them into FailFast[Option]:
type Env = (Service1, Service2)
type FFOption[A] = FailFast[Option[A]]
type ReaderEnvFF[A] = ReaderT[FFOption, Env, A]
How can I compose s3f1 and s3f2?
def c: ReaderEnvFF[Int] =
  for {
    r1 <- //s3f1
    r2 <- //s3f2
  } yield r1 + r2
Since you try to compose monads FailFast and Option in FFOption, you should use one more monad transformer, so FFOption[A] should be OptionT[FailFast, A] rather than just FailFast[Option[A]].
import cats.instances.option._
import cats.instances.either._
import cats.syntax.applicative._
import cats.syntax.either._
import cats.syntax.option._
type Env = (Service1, Service2)
type FFOption[A] = OptionT[FailFast, A]
type ReaderEnvFF[A] = ReaderT[FFOption, Env, A]
def c: ReaderEnvFF[Int] =
  for {
    r1 <- ReaderT[FFOption, Env, Int](p => OptionT(Either.right(s3f1.run(p._1))))
    r2 <- ReaderT[FFOption, Env, Int](p => OptionT(s3f2.run(p._2).map(_.some)))
  } yield r1 + r2
This can be rewritten with local and mapF:
def c: ReaderEnvFF[Int] =
  for {
    r1 <- s3f1.local[Env](_._1).mapF[FFOption, Int](opt => OptionT(opt.asRight))
    r2 <- s3f2.local[Env](_._2).mapF[FFOption, Int](ff => OptionT(ff.map(_.some)))
  } yield r1 + r2

Is there any way to set a default value to generic type variable in Scala?

I want to assign a default value to a variable, but the Scala compiler says:
Error:(20, 16) unbound placeholder parameter
val p: T = _
^
Here is the code.
object InverseFunctionsExample extends App {
  type D = Double
  def f(x: D): D = 5 * x + 10
  def g(x: D): D = 0.2 * x - 2
  printMessage(isInversible(f, g))
  def printMessage(inv: Boolean): Unit = {
    if (inv) print("YES!") else print("NOPE!")
  }
  def isInversible[T](f: (T) => T, g: (T) => T): Boolean = {
    val p: T = _
    if (f(p) == g(p))
      true
    else
      false
  }
}
Is it possible to initialize a val p with default value somehow?
Only var fields (not local variables) can be initialized in this way. If you want to define "default values" for different types, the standard approach is the type-class pattern:
case class Default[T](value: T)
object Default {
  implicit val defaultInt: Default[Int] = Default(0)
  implicit val defaultString: Default[String] = Default("")
  ...
}
def isInversible[T](f: (T) => T, g: (T) => T)(implicit d: Default[T]): Boolean = {
  if (f(d.value) == g(d.value))
    true
  else
    false
  // or just f(d.value) == g(d.value)
}
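For reference, the `= _` initializer mentioned above is accepted on a var member of a class. A minimal sketch, assuming Scala 2 (the names Holder and HolderExample are hypothetical and used only for illustration):

```scala
// A var field may use the default initializer `_`; a local val may not.
class Holder[T] {
  var value: T = _
}

object HolderExample {
  // Value types read back as their zero value, reference types as null.
  val defaultInt: Int = new Holder[Int].value
  val defaultString: String = new Holder[String].value

  def main(args: Array[String]): Unit = {
    println(defaultInt)    // 0
    println(defaultString) // null
  }
}
```

The default obtained this way is always the zero or null value of the type, which is why the type class approach is preferable when a meaningful per-type default is needed.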
You can use reflection to instantiate a new instance of a class, but that's probably not going to be very useful for you here:
class Foo
classOf[Foo].getConstructor().newInstance()
You can read about the reflection API to see how you can pick a suitable constructor.
You could also have a parameter that specifies how to instantiate a new instance:
def isInversible[T](f: T => T, g: T => T, default: => T) = f(default) == g(default)
Since this looks like an inherently math-oriented problem, you might be interested in the Numeric type, which can help facilitate this kind of logic generically for different number types. For example:
def intersectAtOrigin[T](f: T => T, g: T => T)(implicit n: Numeric[T]) = {
  val zero = n.zero
  f(zero) == g(zero)
}
And then you can do:
def f(x: D): D = 5 * x + 10
def g(x: D): D = 0.2 * x - 2
intersectAtOrigin(f, g) //false, working with Doubles
intersectAtOrigin[Int](_ + 1, x => x * x + x + 1) //true, working with Ints
You can read more about Numeric in the docs here.
You could pass in the value as a parameter of type T
def isInversible[T](f: (T) => T, g: (T) => T)(p: T): Boolean = {
  if (f(p) == g(p))
    true
  else
    false
}
For example: printMessage(isInversible(f, g)(10))

scala's spire framework : I am unable to operate on a group

I am trying to use Spire, a math framework, but I get an error message:
import spire.algebra._
import spire.implicits._
trait AbGroup[A] extends Group[A]
final class Rationnel_Quadratique(val n1: Int = 2)(val coef: (Int, Int)) {
  override def toString = {
    coef match {
      case (c, i) =>
        s"$c + $i√$n"
    }
  }
  def a() = coef._1
  def b() = coef._2
  def n() = n1
}
object Rationnel_Quadratique {
  def apply(coef: (Int, Int),n: Int = 2)= {
    new Rationnel_Quadratique(n)(coef)
  }
}
object AbGroup {
  implicit object RQAbGroup extends AbGroup[Rationnel_Quadratique] {
    def +(a: Rationnel_Quadratique, b: Rationnel_Quadratique): Rationnel_Quadratique = Rationnel_Quadratique(coef=(a.a() + b.a(), a.b() + b.b()))
    def inverse(a: Rationnel_Quadratique): Rationnel_Quadratique = Rationnel_Quadratique((-a.a(), -a.b()))
    def id: Rationnel_Quadratique = Rationnel_Quadratique((0, 0))
  }
}
object euler66_2 extends App {
  val c = Rationnel_Quadratique((1, 2))
  val d = Rationnel_Quadratique((3, 4))
  val e = c + d
  println(e)
}
The program is expected to add 1+2√2 and 3+4√2, but instead I get this error:
could not find implicit value for evidence parameter of type spire.algebra.AdditiveSemigroup[Rationnel_Quadratique]
val e = c + d
^
I think there is something essential I have missed (usage of implicits?)
It looks like you are not using Spire correctly.
Spire already has an AbGroup type, so you should be using that instead of redefining your own. Here's an example using a simple type I created called X.
import spire.implicits._
import spire.algebra._
case class X(n: BigInt)
object X {
  implicit object XAbGroup extends AbGroup[X] {
    def id: X = X(BigInt(0))
    def op(lhs: X, rhs: X): X = X(lhs.n + rhs.n)
    def inverse(lhs: X): X = X(-lhs.n)
  }
}
def test(a: X, b: X): X = a |+| b
Note that with groups (as well as semigroups and monoids) you'd use |+| rather than +. To get plus, you'll want to define something with an AdditiveSemigroup (e.g. Semiring, or Ring, or Field or something).
You'll also use .inverse and |-| instead of unary and binary - if that makes sense.
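If you do want to write + and - on your own type, here is a minimal sketch of the additive route. It assumes that your Spire version exposes AdditiveAbGroup in spire.algebra with the members zero, plus and negate; the wrapping object AdditiveExample and the instance name XAdditiveAbGroup are hypothetical.

```scala
import spire.algebra._
import spire.implicits._

object AdditiveExample {
  case class X(n: BigInt)

  object X {
    // An additive instance is what enables the + and - syntax for X.
    implicit object XAdditiveAbGroup extends AdditiveAbGroup[X] {
      def zero: X = X(BigInt(0))
      def plus(lhs: X, rhs: X): X = X(lhs.n + rhs.n)
      def negate(x: X): X = X(-x.n)
    }
  }

  val sum: X = X(BigInt(1)) + X(BigInt(2))  // uses plus
  val diff: X = X(BigInt(1)) - X(BigInt(2)) // uses plus and negate
}
```

Whether to model the type as a Group (with |+|) or as an additive structure (with +) is mostly a matter of which laws and which syntax you want to expose.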
Looking at your code, I am also not sure your actual number type is right. What will happen if I want to add two numbers with different values for n?
Anyway, hope this clears things up for you a bit.
EDIT: Since it seems like you're also getting hung up on Scala syntax, let me try to sketch a few designs that might work. First, there's always a more general solution:
import spire.implicits._
import spire.algebra._
import spire.math._
case class RQ(m: Map[Natural, SafeLong]) {
  override def toString: String = m.map {
    case (k, v) => if (k == 1) s"$v" else s"$v√$k"
  }.mkString(" + ")
}
object RQ {
  implicit def abgroup[R <: Radical](implicit r: R): AbGroup[RQ] =
    new AbGroup[RQ] {
      def id: RQ = RQ(Map.empty)
      def op(lhs: RQ, rhs: RQ): RQ = RQ(lhs.m + rhs.m)
      def inverse(lhs: RQ): RQ = RQ(-lhs.m)
    }
}
object Test {
  def main(args: Array[String]) {
    implicit val radical = _2
    val x = RQ(Map(Natural(1) -> 1, Natural(2) -> 2))
    val y = RQ(Map(Natural(1) -> 3, Natural(2) -> 4))
    println(x)
    println(y)
    println(x |+| y)
  }
}
This allows you to add different roots together without problem, at the cost of some indirection. You could also stick more closely to your design with something like this:
import spire.implicits._
import spire.algebra._
abstract class Radical(val n: Int) { override def toString: String = n.toString }
case object _2 extends Radical(2)
case object _3 extends Radical(3)
case class RQ[R <: Radical](a: Int, b: Int)(implicit r: R) {
  override def toString: String = s"$a + $b√$r"
}
object RQ {
  implicit def abgroup[R <: Radical](implicit r: R): AbGroup[RQ[R]] =
    new AbGroup[RQ[R]] {
      def id: RQ[R] = RQ[R](0, 0)
      def op(lhs: RQ[R], rhs: RQ[R]): RQ[R] = RQ[R](lhs.a + rhs.a, lhs.b + rhs.b)
      def inverse(lhs: RQ[R]): RQ[R] = RQ[R](-lhs.a, -lhs.b)
    }
}
object Test {
  def main(args: Array[String]) {
    implicit val radical = _2
    val x = RQ[_2.type](1, 2)
    val y = RQ[_2.type](3, 4)
    println(x)
    println(y)
    println(x |+| y)
  }
}
This approach creates a fake type to represent whatever radical you are using (e.g. √2) and parameterizes RQ on that type. This way you can be sure that no one will try to add values built on different radicals, which would be invalid.
Hopefully one of these approaches will work for you.