How to extend polars api without creating a namespace - python-polars

Let's say we want to create a single expression and we want to add it to the api.
We can do it through a namespace as seen here but can we add it directly to Expr?
In other words instead of:
import polars as pl

@pl.api.register_expr_namespace("greetings")
class Greetings:
    def __init__(self, expr: pl.Expr):
        self._expr = expr

    def hello(self) -> pl.Expr:
        return (pl.lit("Hello ") + self._expr).alias("hi there")

pl.DataFrame(data=["world", "world!", "world!!"]).select(
    [
        pl.all().greetings.hello(),
    ]
)
is there a way to make pl.all().hello() available?

The simplest approach would be to assign custom methods directly onto the base Expr class, though extension namespaces are definitely a cleaner way to handle it (especially if you are going to have more than one related function). Using the same example as above:
# declare method (with implicit 'self')
def hello( self ) -> pl.Expr:
    return ( pl.lit("Hello ") + self ).alias( "hi there" )

# assign the method to the expression base class
pl.Expr.hello = hello

# can now call the assigned method from any Expr
pl.DataFrame(
    data = ["world", "world!", "world!!"]
).select(
    pl.all().hello(),
)

Related

How to get the body of variable initialisation from outer scope in Scala 3 macros?

Suppose I have this code for extracting the code initialising a variable:
import scala.quoted.*

def extractBodyImpl[T: Type](expr: Expr[T])(using Quotes) =
  import quotes.reflect._
  expr.asTerm.underlyingArgument match
    case ident @ Ident(_) =>
      ident.symbol.tree match
        case ValDef(_,_,rhs) => println(rhs)
        case DefDef(_,_,_,rhs) => println(rhs)
  '{ () }

inline def extractBody[T](inline expr: T) = ${ extractBodyImpl('expr) }
When called on a variable declared in the same scope it works as desired:
@main def hello() =
  val x = 1
  extractBody(x)
prints Some(Literal(Constant(1))).
However, on a variable from outer scope, it prints None:
val x = 1
@main def hello() =
  extractBody(x)
How can I make it work in the second case?
In Scala 3 you just need to switch on
scalacOptions += "-Yretain-trees"
Then
val x = 1
@main def hello() =
  extractBody(x)
will print Some(Literal(Constant(1))) too.
In Scala 2 we had to use the Traverser technique to get the RHS of a definition; see:
Get an scala.MatchError: f (of class scala.reflect.internal.Trees$Ident) when providing a lambda assigned to a val
Def Macro, pass parameter from a value
Creating a method definition tree from a method symbol and a body
Scala macro how to convert a MethodSymbol to DefDef with parameter default values?
How to get the runtime value of parameter passed to a Scala macro?
Can you implement dsinfo in Scala 3? (Can Scala 3 macros get info about their context?) (Scala 3)
You cannot do it in a macro. A function that receives an argument might be called from anywhere. How would static analysis access information that is only available at runtime? The only reliable solution would be to force the user to expand the extractBody macro right after defining the value, and to pass the result in some wrapper that combines both the value and its origin.

How can I combine this context with invokeFunction in Nashorn?

I am trying to call a function in Javascript from Java/Nashorn (in Scala, but that's not material to the question).
// JS
var foo = function(calculator){ // calculator is a Scala object
  return this.num * calculator.calcMult();
}
The context on the Scala side is like this:
case class Thing(
  num: Int,
  stuff: String
)
case class Worker() { // Scala object to bind to calculator
  def calcMult() = { 3 } // presumably some complex computation here
}
I start by getting foo into the JS environment:
jsengine.eval("""var foo = function(calculator){return this.num * calculator.calcMult();}""")
To use this I need two things to be available: 1) the 'this' context to be populated with my Thing object, and 2) the ability to pass a Java/Scala object to my JS function (to call calcMult later). (If needed I can easily JSON-serialize Thing.)
How can I do both and successfully call foo() from Scala?
This may not be the only or cleanest solution, but it does work.
It turns out JavaScript can bind a given 'this' context to a function, which creates a "bound function" that has your 'this' visible within it. You then call invokeFunction() on the bound function as you normally would.
val inv = javascript.asInstanceOf[Invocable]
val myThis: String = ??? // JSON-serialized Map of stuff you want accessible in the JS function
val bindFn = "bind_" + fnName
javascript.eval(bindFn + s" = $fnName.bind(" + myThis + ")")
inv.invokeFunction(bindFn, args: _*)
If you passed myThis into the binding to include {"x":"foo"} then when invoked, any access within your function to this.x will resolve to "foo" as you'd expect.

Value to indicate to use default

In Scala I would like to have something like this
TokenizerExample.scala
class TokenizerExample private (whateva : Any)(implicit val separator : Char = '.') {
  def this(data2Tokenize : String)(implicit s : Char) {
    this("", s) //call to base constructor
  }
  def this(data2Tokenize : Array[Char])(implicit s : Char) {
    this("", s) //call to base constructor
  }
}
What I would like to achieve is to let the user call either of the two public constructors with or without the separator; if they do NOT provide one, the default from the base constructor should be used automatically. Is there a value I can pass to the base constructor so that Scala uses the default defined on the private base constructor?
What I would like to avoid is doing the following in each constructor:
def this(_3rdConstructor : StringBuilder)(implicit s : Char = '.') ...
I tried this in many different ways, with the values being implicit and with the separator as an Option, but I did not get a result that I actually like, especially because Scala complains about having implicit values in multiple constructors (which kind of defeats the purpose of having them). Is there a way to achieve that behavior in a nice way without
1) forcing the user to provide a separator,
2) going into "bad practices" by passing null values and then validating them (especially because that would not allow my separator to be a val in the constructor), or
3) creating YET ANOTHER LANGUAGE just because I dislike a small little thing about one of them :) ?
I would strongly advise you against using implicits for this purpose. The resolution rules are rather complex, and they make the code extremely hard to follow, because it is almost impossible to tell what value the constructor will end up receiving without a debugger.
If all you are trying to do is avoid defining the default in multiple places, just define it in a companion object:
object Foo {
  val defaultParam = ','
}
class Foo {
  import Foo.defaultParam
  def this(data: String, param: Char = defaultParam) = ???
  def this(data: List[Char], param: Char = defaultParam) = ???
  // etc ...
}
If you insist on using implicits, you can use a similar approach to the above: just make defaultParam definition implicit, drop the defaults, replacing them with implicit lists, and then import Foo._ in scope where you are making the call. But, really, don't do that: it adds no value, and only has disadvantages in this case.
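One caveat with the companion-object sketch above: as far as I know, Scala 2 rejects default arguments on more than one overloaded alternative, constructors included. A minimal variant that still keeps the default in one place uses explicit overloads instead. The Tokenizer class, DefaultSeparator, and the tokens method are hypothetical names chosen for illustration, not taken from the question.

```scala
object Tokenizer {
  // The single place where the default separator is defined (hypothetical name).
  val DefaultSeparator: Char = '.'

  // Explicit overloads instead of default arguments on several alternatives.
  def apply(data: String): Tokenizer = new Tokenizer(data, DefaultSeparator)
  def apply(data: String, separator: Char): Tokenizer = new Tokenizer(data, separator)
  def apply(data: Array[Char]): Tokenizer = apply(new String(data))
  def apply(data: Array[Char], separator: Char): Tokenizer = apply(new String(data), separator)
}

// Private constructor: callers go through the companion's apply methods.
class Tokenizer private (data: String, val separator: Char) {
  // Illustrative only: split the input on the chosen separator.
  def tokens: List[String] = data.split(separator).toList
}
```

Callers can then write Tokenizer("a.b.c") or Tokenizer("a,b", ','), and separator stays a val on the instance.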

Understand how to use apply and unapply

I'm trying to get a better understanding of the correct usage of apply and unapply methods.
Considering an object that we want to serialize and deserialize, is this a correct usage (i.e. the Scala way) of using apply and unapply?
case class Foo
object Foo {
  def apply(json: JValue): Foo = json.extract[Foo]
  def unapply(f: Foo): JValue = //process to json
}
Firstly, apply and unapply are not necessarily opposites of each other. Indeed, if you define one on a class/object, you don't have to define the other.
apply
apply is probably the easier to explain. Essentially, when you treat your object like a function, apply is the method that is called, so, Scala turns:
obj(a, b, c) to obj.apply(a, b, c).
unapply
unapply is a bit more complicated. It is used in Scala's pattern matching mechanism and its most common use I've seen is in Extractor Objects.
For example, here's a toy extractor object:
object Foo {
  def unapply(x : Int) : Option[String] =
    if(x == 0) Some("Hello, World") else None
}
So now, if you use this in a pattern match like so:
myInt match {
  case Foo(str) => println(str)
}
Let's suppose myInt = 0. Then what happens? In this case Foo.unapply(0) gets called and, as you can see, returns Some("Hello, World"). The contents of the Option get assigned to str, so in the end the above pattern match prints "Hello, World".
But what if myInt = 1? Then Foo.unapply(1) returns None, so the expression for that pattern does not get evaluated. Since this match has no other case, a scala.MatchError is thrown.
In the case of assignments, like val Foo(str) = x this is syntactic sugar for:
val str : String = Foo.unapply(x) match {
  case Some(s) => s
  case None => throw new scala.MatchError(x)
}
The apply method is like a constructor which takes arguments and creates an object, whereas the unapply takes an object and tries to give back the arguments.
A simple example:
object Foo {
  def apply(name: String, suffix: String) = name + "." + suffix
  def unapply(name: String): Option[(String, String)] = {
    //simple argument extractor
    val parts = name.split("\\.")
    if (parts.length == 2) Some(parts(0), parts(1)) else None
  }
}
when you call
val file = Foo("test", "txt")
It actually calls Foo.apply("test", "txt") and returns test.txt
If you want to deconstruct, call
val Foo(name) = file
This essentially invokes val name = Foo.unapply(file).get and returns (test, txt) (normally use pattern matching instead)
You can also directly unpack the tuple with 2 variables, i.e.
scala> val Foo(name, suffix) = file
val name: String = test
val suffix: String = txt
BTW, the return type of unapply is Option by convention.
So apply and unapply are just defs that have extra syntax support.
Apply takes arguments and by convention will return a value related to the object's name. If we take Scala's case classes as "correct" usage then the object Foo's apply will construct a Foo instance without needing to add "new". You are free of course to make apply do whatever you wish (key to value in Map, set contains value in Set, and indexing in Seq come to mind).
Unapply, if returning an Option or Boolean can be used in match{} and pattern matching. Like apply it's just a def so can do whatever you dream up but the common usage is to extract value(s) from instances of the object's companion class.
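To illustrate the Boolean case, here is a minimal sketch using a hypothetical Even extractor (not from the question). An unapply that returns Boolean acts as a pure test and binds nothing, so the pattern is written with empty parentheses.

```scala
object Even {
  // Boolean extractor: the pattern matches when this returns true; no value is bound.
  def unapply(n: Int): Boolean = n % 2 == 0
}

object ParityDemo {
  // Hypothetical helper that uses the extractor in a match.
  def describe(n: Int): String = n match {
    case Even() => s"$n is even"
    case _      => s"$n is odd"
  }
}
```

The default case matters here: without it, an odd input would fall through and raise a scala.MatchError.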
From the libraries I've worked with serialization/deserialization defs tend to get named explicitly. E.g., write/read, show/read, toX/fromX, etc.
If you want to use apply/unapply for this purpose the only thing I'd suggest is changing to
def unapply(f: Foo): Option[JValue]
Then you could do something like:
val myFoo = Foo("""{name: "Whiskers", age: 7}""".asJson)
// use myFoo
val Foo(jval) = myFoo
// use jval

How to track nested functions in Scala

I'd like to have some basic knowledge of how deeply my function call is nested. Consider the following:
scala> def decorate(f: => Unit) : Unit = { println("I am decorated") ; f }
decorate: (f: => Unit)Unit
scala> decorate { println("foo") }
I am decorated
foo
scala> decorate { decorate { println("foo") } }
I am decorated
I am decorated
foo
For the last call, I'd like to be able to get the following:
I am decorated 2x
I am decorated 1x
foo
The idea is that the decorate function knows how deeply it's nested. Ideas?
Update: As Nikita had thought, my example doesn't represent what I'm really after. The goal is not to produce the strings so much as to be able to pass some state through a series of calls to the same nested function. I think Régis Jean-Gilles is pointing me in the right direction.
You can use the dynamic scope pattern. More prosaically, this means using a thread-local variable (Scala's DynamicVariable is made just for that) to store the current nesting level. See my answer to this other question for a practical example of this pattern: How to define a function that takes a function literal (with an implicit parameter) as an argument?
This is suitable only if you want to know the nesting level for a very specific method, though. If you want a generic mechanism that works for any method, then this won't work (as you'd need a distinct variable for each method). In that case the only alternative I can think of is to inspect the stack, but not only is that not very reliable, it is also extremely slow.
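For the single-method case, the pattern might look like the sketch below. The Nesting object and currentDepth accessor are hypothetical names. Note that the outermost call reports depth 1 and inner calls report larger numbers, which is the reverse of the numbering in the question: a call cannot know how many calls are nested inside it before it has evaluated its body.

```scala
import scala.util.DynamicVariable

object Nesting {
  // Depth of the decorate call currently running on this thread; 0 means "outside decorate".
  private val depth = new DynamicVariable[Int](0)

  // Hypothetical accessor so the body can read the current nesting level.
  def currentDepth: Int = depth.value

  def decorate[T](body: => T): T =
    // withValue installs the incremented depth for the duration of the block,
    // then restores the previous value, even if the body throws.
    depth.withValue(depth.value + 1) {
      println(s"I am decorated ${depth.value}x")
      body
    }
}
```

Inside Nesting.decorate { Nesting.decorate { ... } } the innermost body sees currentDepth equal to 2, and the depth is back to 0 once the outer call returns.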
UPDATE: actually, there is a way to apply the dynamic scope pattern in a generic way (for any possible method). The important part is to be able to implicitly get a unique id for each method. From there, it is just a matter of using this id as a key to associate a DynamicVariable with the method:
import scala.util.DynamicVariable
object FunctionNestingHelper {
  private type FunctionId = Class[_]
  private def getFunctionId( f: Function1[_,_] ): FunctionId = {
    f.getClass // That's it! Beware, implementation dependant.
  }
  private val currentNestings = new DynamicVariable( Map.empty[FunctionId, Int] )
  def withFunctionNesting[T]( body: Int => T ): T = {
    val id = getFunctionId( body )
    val oldNestings = currentNestings.value
    val oldNesting = oldNestings.getOrElse( id, 0 )
    val newNesting = oldNesting + 1
    currentNestings.withValue( oldNestings + ( id -> newNesting) ) {
      body( newNesting )
    }
  }
}
Usage:
import FunctionNestingHelper._
def decorate(f: => Unit) = withFunctionNesting { nesting: Int =>
  println("I am decorated " + nesting + "x") ; f
}
To get a unique id for the method, I actually get an id for the closure passed to withFunctionNesting (which you must call in the method where you need to retrieve the current nesting). And that's where I err on the implementation-dependent side: the id is just the class of the function instance. This does work as expected as of now (because every unary function literal is implemented as exactly one class implementing Function1, so the class acts as a unique id), but the reality is that it might well break (although that is unlikely) in a future version of Scala. So use it at your own risk.
Finally, I suggest that you first evaluate seriously if Nikita Volkov's suggestion of going more functional would not be a better solution overall.
You could return a number from the function and count how many levels you are in on the way back up the stack. But there is no easy way to count on the way down like you have given example output for.
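A minimal sketch of that counting-on-the-way-up idea, assuming the body can be changed to return an Int (the CountUp object is a hypothetical name): each call evaluates its body first, then reports one more than the level the body returned.

```scala
object CountUp {
  // Evaluate the body first, then report a level one higher than the body's.
  def decorate(body: => Int): Int = {
    val level = body + 1
    println(s"I am decorated ${level}x")
    level
  }
}
```

Calling CountUp.decorate { CountUp.decorate { println("foo"); 0 } } prints foo first, then "I am decorated 1x", then "I am decorated 2x": the outer call does get the larger number, but only after the inner calls have finished.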
Since your question is tagged with "functional programming" following are functional solutions. Sure the program logic changes completely, but then your example code was imperative.
The basic principle of functional programming is that there is no state. What you're used to having as shared state in imperative programming, with all the headaches involved (multithreading issues, etc.), is achieved in functional programming by passing immutable data as arguments.
So, assuming the "state" data you wanted to pass was the current cycle number, here's how you'd implement a function using recursion:
def decorated ( a : String, cycle : Int ) : String
  = if( cycle <= 0 ) a
    else "I am decorated " + cycle + "x\n" + decorated(a, cycle - 1)
println(decorated("foo", 3))
Alternatively you could make your worker function non-recursive and "fold" it:
def decorated ( a : String, times : Int )
  = "I am decorated " + times + "x\n" + a
println( (1 to 3).foldLeft("foo")(decorated) )
Both snippets above will produce the following output:
I am decorated 3x
I am decorated 2x
I am decorated 1x
foo