Dynamic arg types for a python function when embedding - python-c-api

I am adding an embedded Python interpreter to Exim. I have copied the embedded Perl interface and expect Python to work much like the long-since-coded embedded Perl interpreter. The goal is to allow the sysadmin to write complex functions in a powerful scripting language (i.e. Python) instead of trying to use Exim's standard ACL commands, because it can get quite complex to do relatively simple things using the Exim ACL language.
My current code as of the time of this writing is located at http://git.exim.org/users/tlyons/exim.git/blob/9b2c5e1427d3861a2154bba04ac9b1f2420908f7:/src/src/python.c . It is working properly in that it can import the sysadmin's custom Python code, call functions in it, and handle the returned values (simple return types only: int, float, or string). However, it does not yet handle values that are passed to a Python function, which is where my question begins.
Python seems to require that any args I pass to the embedded Python function be explicitly converted to one of int, long, double, float, or string using the C API. The problem is that the sysadmin can put anything in that embedded Python code, and on the C side of things in Exim I won't know what those variable types are. I know that Python is dynamically typed, so I was hoping to preserve that behaviour when passing values to the embedded code. But it's not working that way in my testing.
Using the following super-simple Python code:
def dumb_add(a,b):
    return a+b
...and the call from my Exim ACL is:
${python {dumb_add}{800}{100}}
In my C code below, reference counting is omitted for brevity; count is the number of args I'm passing:
pArgs = PyTuple_New(count);
for (i = 0; i < count; ++i)
  {
  pValue = PyString_FromString((const char *)arg[i]);
  PyTuple_SetItem(pArgs, i, pValue);
  }
pReturn = PyObject_CallObject(pFunc, pArgs);
Yes, **arg is a pointer to an array of strings (two strings in this simple case). The problem is that the two values are treated as strings in the Python code, so the result of that C code executing the embedded Python is:
${python {dumb_add}{800}{100}}
800100
If I change the Python to be:
def dumb_add(a,b):
    return int(a)+int(b)
Then the result of that C code executing the Python code is as expected:
${python {dumb_add}{800}{100}}
900
My goal is that I don't want to force a Python user to manually cast all of the numeric parameters they pass to an embedded Python function. If, instead of PyString_FromString(), there were a PyDynamicType_FromString(), I would be ecstatic. Exim's embedded Perl parses the args and does the casting automatically, and I was hoping for the same from the embedded Python. Can anybody suggest whether Python can do this arg parsing to provide the dynamic typing I was expecting?
Or, if I want to maintain that dynamic typing, is my only option going to be to parse each arg and guess at the type to cast it to? I was really really REALLY hoping to avoid that approach. If it comes to that, I may just document: "All parameters passed are strings, so if you are actually trying to pass numbers, you must cast all parameters with int(), float(), double(), or long()". However, and there is always a comma after however, I feel that approach will sour strong Python coders on my implementation. I want to avoid that too.
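For illustration, the guess-and-cast approach I'm hoping to avoid would presumably look something like the sketch below, written against the Python 2 C API my code already uses; arg_to_pyobject() is just a hypothetical helper, not an existing API:

static PyObject *arg_to_pyobject(const char *s)
{
    PyObject *obj, *str;

    obj = PyInt_FromString((char *)s, NULL, 10);  /* whole string a valid int? */
    if (obj) return obj;
    PyErr_Clear();

    str = PyString_FromString(s);
    obj = PyFloat_FromString(str, NULL);          /* whole string a valid float? */
    Py_DECREF(str);
    if (obj) return obj;
    PyErr_Clear();

    return PyString_FromString(s);                /* neither: keep it as a string */
}

Each arg[i] would then go through this helper instead of straight to PyString_FromString() before being placed in the tuple.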
Any and all suggestions are appreciated, aside from "make your app into a python module".

The way I ended up solving this was by finding out how many args the function expected and exiting with an error if the number of args passed to the function didn't match. Rather than try to synthesize missing args or simply omit extra args, for my use case I felt it was best to enforce matching arg counts.
The args are passed to this function as an unsigned char ** arg:
int count = 0;
/* Identify and call appropriate function */
pFunc = PyObject_GetAttrString(pModule, (const char *) name);
if (pFunc && PyCallable_Check(pFunc))
  {
  PyCodeObject *pFuncCode = (PyCodeObject *)PyFunction_GET_CODE(pFunc);
  /* Should not fail if pFunc succeeded, but check to be thorough */
  if (!pFuncCode)
    {
    *errstrp = string_sprintf("Can't check function arg count for %s",
                              name);
    return NULL;
    }
  while (arg[count])
    count++;
  /* Sanity check: calling a python object requires us to state the number of
  vars being passed; bail if it doesn't match the function declaration. */
  if (count != pFuncCode->co_argcount)
    {
    *errstrp = string_sprintf("Expected %d args to %s, was passed %d",
                              pFuncCode->co_argcount, name, count);
    return NULL;
    }
The string_sprintf is a function within the Exim source code which also handles memory allocation, making life easy for me.

Related

How to evaluate an expression in Nim?

I'm trying to perform the equivalent of Python's eval in Nim.
I was under the impression that parseStmt from the macros package should help me with this, but I'm facing a compilation issue that I don't understand.
import macros
echo parseStmt("1 + 2")
I would have expected this to print 3 when executed, but instead the compilation complains that
Error: request to generate code for .compileTime proc: $
I found this thread, and the examples there work; following them, I was able to write the following program, which works as I would expect:
import macros
import strformat
macro eval(value: string): untyped =
  result = parseStmt fmt"{value}"
echo eval("1+2")
But I don't understand why it needs to be written in this way at all. If I inline the statement, let value = "1 + 2"; echo parseStmt fmt"{value}", I get the same compile error as above.
Also, why is parseStmt value different from parseStmt fmt"{value}", in the context of the eval macro above?
What am I missing here?
Thank you in advance for any clarifications!
Unlike Python, which is an interpreted language, Nim is compiled. This means that all code is parsed and turned into machine code at compile-time, and the program you end up with doesn't really know anything about Nim at all (at least as long as you don't import the Nim compiler as a module, which is possible). So parseStmt and all the macro/template expansion stuff in Nim is done completely during compilation.
The error, although maybe a bit hard to read, is trying to tell you that what was passed to $ (which is the convert-to-string operator in Nim, called by echo on all its arguments) is a compile-time thing that can't be used at runtime. In this case it's because parseStmt doesn't return "3", it returns something like NimNode(kind: nnkIntLit, intVal: 3), and the NimNode type is only available during compile-time.
Nim however allows you to run code at compile-time to return other code; this is what a macro does. The eval macro you wrote there takes value, which is any statement that resolves to a string during runtime, passed as a NimNode. This is also why result = parseStmt value won't work in your case: value is not yet a string, but could be something like reading a string from standard input, which would only result in a string at runtime. However, the use of strformat here is a bit confusing and overkill. If you change your code to:
import macros
macro eval(value: static[string]): untyped =
  result = parseStmt value
echo eval("1+2")
It will work just fine. This is because we have now told Nim that value needs to be static, i.e. known during compile-time. In this case the string literal "1+2" is obviously known at compile-time, but this could also be a call to a compile-time procedure, or even the output of staticRead, which reads a file during compilation.
As you can see, Nim is very powerful, but the barrier between compile-time and run-time can sometimes be a bit confusing. Note also that your eval macro doesn't actually evaluate anything at compile-time; it simply returns the Nim code 1 + 2, so your code ends up being echo 1 + 2. If you want to actually run the code at compile-time you might want to look into compile-time procedures.
Hope this helps shed some light on your issue.
Note: while this answer outlines why this happens, keep in mind that what you're trying to do probably won't result in what you want (which I assumed to be runtime evaluation of expressions).
You're trying to pass a NimNode to parseStmt, which expects a string. The fmt macro automatically stringifies anything in the {}; you can omit fmt by doing $value to turn the node into a string.
As I already noted, this will not work as it does in Python: Nim does not have runtime evaluation. The expression in the string is going to be evaluated at compile time, so a simple example like this will not do what you want:
import std/rdstdin
let x = readLineFromStdin(">")
echo eval(x)
Why?
First of all, because you're stringifying the AST you pass to eval, it's not the string behind the x variable that gets passed to the macro - it's the symbol that denotes the x variable. If you stringify a symbol, you get the underlying identifier, which means that parseStmt will receive "x" as its parameter. This will result in the string stored in x being printed out, which is wrong.
What you want instead is the following:
import std/rdstdin
import std/macros
macro eval(value: static string): untyped =
  result = parseStmt(value)
echo eval("1 + 2")
This prevents runtime-known values from being passed to the macro. You can only pass consts and literals to it now, which is the correct behavior.

Using LuaJ with Scala

I am attempting to use LuaJ with Scala. Most things work (actually all things work if you do them correctly!) but the simple task of setting object values has become incredibly complicated thanks to Scala's setter implementation.
Scala:
class TestObject {
  var x: Int = 0
}
Lua:
function myTestFunction(testObject)
  testObject.x = 3
end
If I execute the script or line containing this Lua function and pass a coerced instance of TestObject to myTestFunction, this causes an error in LuaJ. LuaJ is trying to direct-write the value, whereas Scala requires you to go through the implicitly-defined setter (with the horrible name x_=, which is not valid Lua, so even attempting to call that as a function makes your Lua not parse).
As I said, there are workarounds for this, such as defining your own setter or using the #BeanProperty markup. They just make code that should be easy to write much more complicated:
Lua:
function myTestFunction(testObject)
  testObject.setX(testObject, 3)
end
Does anybody know of a way to get luaj to implicitly call the setter for such assignments? Or where I might look in the luaj source code to perhaps implement such a thing?
Thanks!
I must admit that I'm not too familiar with LuaJ, but the first thing that comes to my mind regarding your issue is to wrap the objects within proxy tables to ease interaction with the API. Depending upon what sort of needs you have, this solution may or may not be the best, but it could be a good temporary fix.
local mt = {}

function mt:__index(k)
  return self.o[k] -- Define how your getters work here.
end

function mt:__newindex(k, v)
  return self.o[k .. '_='](v) -- "object.k_=(v)"
end

local function proxy(o)
  return setmetatable({o = o}, mt)
end

-- ...

function myTestFunction(testObject)
  testObject = proxy(testObject)
  testObject.x = 3
end
I believe this may be the least invasive way to solve your problem. As for modifying LuaJ's source code to better suit your needs, I had a quick look through the documentation and source code and found this, this, and this. My best guess says that line 71 of JavaInstance.java is where you'll find what you need to change, if Scala requires a different way of setting values.
f.set(m_instance, CoerceLuaToJava.coerce(value, f.getType()));
Perhaps you should use the method syntax:
testObject:setX(3)
Note the colon ':' instead of the dot '.' which can be hard to distinguish in some editors.
This has the same effect as the function call:
testObject.setX(testObject, 3)
but is more readable.
It can also be used to call static methods on classes:
luajava.bindClass("java.net.InetAddress"):getLocalHost():getHostName()
The part to the left of the ':' is evaluated once, so a statement such as
x = abc[d+e+f]:foo()
will be evaluated as if it were
local tmp = abc[d+e+f]
x = tmp.foo(tmp)

Why are some struct members declared through a preprocessor macro?

Struct members are normally declared in the header file using a plain language data type, but here some members instead have their type and name passed to a preprocessor macro. When should a data type be passed to a preprocessor macro to declare a member, and why are the type and name sent to the preprocessor in this code?
#define DECLARE_REFERENCE(type, name) \
    union { type name; int64_t name##_; }

typedef struct _STRING
{
    int32_t flags;
    int32_t length;
    DECLARE_REFERENCE(char*, identifier);
    DECLARE_REFERENCE(uint8_t*, string);
    DECLARE_REFERENCE(uint8_t*, mask);
    DECLARE_REFERENCE(MATCH*, matches_list_head);
    DECLARE_REFERENCE(MATCH*, matches_list_tail);
    REGEXP re;
} STRING;
Why is this code doing this for declarations? Because, as the body of DECLARE_REFERENCE shows, when a type and name are passed to this macro it does more than just the declaration - it builds something else out of the name as well, for some other unknown purpose. If you only wanted to declare a variable, you wouldn't do this - it does something distinct from simply declaring one variable.
What does it actually do? The unions that the macro declares provide a second name for accessing the same space as a different type. In this case you can get at the references themselves, or also at an unconverted integer representation of their bit pattern. Assuming that int64_t is the same size as a pointer on the target, anyway.
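For instance, here is a small sketch of that dual access, reusing the DECLARE_REFERENCE macro from the question; the EXAMPLE struct and variable names are mine, purely for illustration (and, as in the original, the anonymous union members need C11 or a compiler extension):

#include <stdint.h>

#define DECLARE_REFERENCE(type, name) \
    union { type name; int64_t name##_; }

typedef struct
{
    DECLARE_REFERENCE(char*, identifier);   /* anonymous union member */
} EXAMPLE;

int main(void)
{
    EXAMPLE e;
    e.identifier = "hello";        /* normal access through the typed pointer */
    int64_t raw = e.identifier_;   /* same storage viewed as a 64-bit integer */
    (void)raw;                     /* only fully meaningful where pointers are 64 bits */
    return 0;
}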
Using a macro for this potentially serves several purposes I can think of off the bat:
Saves keystrokes
Makes the code more readable - but only to people who already know what the macros mean
If the secondary way of getting at reference data is only used for debugging purposes, it can be disabled easily for a release build, generating compiler errors on any surviving debug code
It enforces the secondary status of the access path, hiding it from people who just want to see what's contained in the struct and its formal interface
Should you do this? No. This does more than just declare variables; it also does something else, and that other thing is clearly specific to the gory internals of the rest of the containing program. Without seeing the rest of the program we may never fully understand the rest of what it does.
When you need to do something specific to the internals of your program, you'll (hopefully) know when it's time to invent your own thing-like-this (most likely never); but don't copy others.
So the overall lesson here is to identify places where people aren't writing in straightforward C, but are coding to their particular application, and to separate those two, and not take quirks from a specific program as guidelines for the language as a whole.
Sometimes it is necessary to have a number of declarations which are guaranteed to have some relationship to each other. Some simple kinds of relationships such as constants that need to be numbered consecutively can be handled using enum declarations, but some applications require more complex relationships that the compiler can't handle directly. For example, one might wish to have a set of enum values and a set of string literals and ensure that they remain in sync with each other. If one declares something like:
#define GENERATE_STATE_ENUM_LIST \
    ENUM_LIST_ITEM(STATE_DEFAULT, "Default") \
    ENUM_LIST_ITEM(STATE_INIT, "Initializing") \
    ENUM_LIST_ITEM(STATE_READY, "Ready") \
    ENUM_LIST_ITEM(STATE_SLEEPING, "Sleeping") \
    ENUM_LIST_ITEM(STATE_REQ_SYNC, "Starting synchronization") \
    // This line should be left blank except for this comment
Then code can use the GENERATE_STATE_ENUM_LIST macro both to declare an enum type and a string array, and ensure that even if items are added or removed from the list each string will match up with its proper enum value. By contrast, if the array and enum declarations were separate, adding a new state to one but not the other could cause the values to get "out of sync".
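To make that concrete, here is one common way such a list is expanded twice (a sketch; the State typedef, the state_names array, and the STATE_COUNT sentinel are illustrative names, not taken from any particular codebase):

/* First expansion: each list item contributes an enum constant. */
#define ENUM_LIST_ITEM(name, text) name,
typedef enum { GENERATE_STATE_ENUM_LIST STATE_COUNT } State;
#undef ENUM_LIST_ITEM

/* Second expansion: the same list now contributes the matching strings. */
#define ENUM_LIST_ITEM(name, text) text,
static const char *state_names[] = { GENERATE_STATE_ENUM_LIST };
#undef ENUM_LIST_ITEM

Here state_names[STATE_READY] is "Ready", and adding or removing an ENUM_LIST_ITEM line updates both the enum and the string table together.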
I'm not sure what purpose the macros serve in your particular case, but the pattern can sometimes be a reasonable one. The biggest 'question' is whether it's better to (ab)use the C preprocessor so as to allow such relationships to be expressed in valid-but-ugly C code, or whether it would be better to use some other tool to take a list of states and generate the appropriate C code from it.

How does the auto-free()ing work when I use functions like mktemp()?

Greetings,
I'm using mktemp() (iPhone SDK) and this function returns a char * to the new file name where all "X" are replaced by random letters.
What confuses me is the fact that the returned string is automatically free()d. How (and when) does that happen? I doubt it has something to do with the Cocoa event loop. Is it automatically freed by the kernel?
Thanks in advance!
mktemp just modifies the buffer you pass in and returns the same pointer you pass in; there's no extra buffer to be free'd.
That's at least how the OS X manpage describes it (I couldn't find documentation for iPhone), and the POSIX manpage too (although the example in the POSIX manpage looks to be wrong, as it passes in a pointer to a string literal - possibly an old remnant; the function is also marked as legacy - use mkstemp instead. The OS X manpage specifically mentions passing a string literal as being an error).
So, this is what will happen:
char template[] = "/tmp/fooXXXXXX";
char *ptr;

if ((ptr = mktemp(template)) != NULL) {
    assert(ptr == template); // will be true,
                             // mktemp just returns the same pointer you pass in
}
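For completeness, a small sketch of the mkstemp route mentioned above (names are illustrative): it likewise fills in the caller's buffer in place, but additionally creates and opens the file atomically and returns a descriptor.

#include <stdlib.h>
#include <unistd.h>

char name[] = "/tmp/fooXXXXXX";   // must be a writable array, not a string literal
int fd = mkstemp(name);           // on success, name now holds the generated filename
if (fd != -1) {
    // ... use the file through fd ...
    close(fd);
    unlink(name);                 // remove it when finished
}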
If it's like the cygwin function of the same name, then it's returning a pointer to an internal static character buffer that will be overwritten by the next call to mktemp(). On cygwin, the mktemp man page specifically mentions _mktemp_r() and similar functions that are guaranteed reentrant and use a caller-provided buffer.

How can Perl's XSUB die?

I have written a Perl XS wrapper for a C library consisting of about 80 functions. Right now my general strategy is to substitute the error from a C function with PL_sv_undef, and the calling Perl code has to check explicitly whether the return is not undef. (For some C functions it is more complicated, as I convert their output into a HV/AV and use an empty list to report the error.)
Now as I move to writing bigger Perl scripts using that library, I want to simplify the error handling and use e.g. the usual eval {}/die exception-like mechanism to handle errors.
At the moment a simple XSUB in my XS looks like this:
SV *
simple_function( param1, param2 = 0, param3 = 0)
    int param1
    int param2
    int param3
  CODE:
    int rc;
    rc = simple_function( param1, param2, param3 );
    RETVAL = (rc == 0) ? &PL_sv_yes : &PL_sv_undef;
  OUTPUT:
    RETVAL
I have seen that some modules have a global flag like "RaiseError" to die on errors, but I failed to find any example I can borrow from. The few modules I have found handle the "RaiseError" flag inside the .pm, not inside the .xs, and are thus able to use Perl's die. In my case that is rather hard to implement inside the .pm, as many functions require special error checks. It would also lead to code duplication, as the checks are already present inside the XS.
I found nothing relevant in the perlxs/perlguts documentation. In particular, I have seen calls to Perl_croak() in the .c generated from my .xs, but failed to locate any documentation for the function.
What is the XS analog of Perl's die? Or how else can the XSUB report to the Perl run-time that the function has failed and there is no RETVAL to return? How do I properly set $@?
Perl_croak() is documented here on the perlapi man page. As the example on that page shows, you can either pass it a message string, or you can manually set $@ to an exception object and pass NULL.
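Applied to the XSUB from the question, a minimal sketch could look like this (the error message text is illustrative):

SV *
simple_function( param1, param2 = 0, param3 = 0)
    int param1
    int param2
    int param3
  CODE:
    int rc;
    rc = simple_function( param1, param2, param3 );
    if (rc != 0)
        croak("simple_function failed with return code %d", rc);  /* throws, like die */
    RETVAL = &PL_sv_yes;
  OUTPUT:
    RETVAL

The calling Perl code can then wrap the call in eval {} and inspect $@ as usual. To throw an exception object rather than a string, assign the object to $@ (ERRSV on the C side) and then call croak(NULL), as the perlapi page describes.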