Variadic recursive preprocessor macros - is it possible? - macros

I've run into a little theoretical problem. In a piece of code I'm maintaining there's a set of macros like
#define MAX_OF_2(a, b) (a) > (b) ? (a) : (b)
#define MAX_OF_3(a, b, c) MAX_OF_2(MAX_OF_2(a, b), c)
#define MAX_OF_4(a, b, c, d) MAX_OF_2(MAX_OF_3(a, b, c), d)
...etc up to MAX_OF_8
What I'd like to do is replace them with something like this:
/* Base case #1, single input */
#define MAX_OF_N(x) (x)
/* Base case #2, two inputs */
#define MAX_OF_N(x, y) (x) > (y) ? (x) : (y)
/* Recursive definition, arbitrary number of inputs */
#define MAX_OF_N(x, ...) MAX_OF_N(x, MAX_OF_N(__VA_ARGS__))
...which, of course, is not valid preprocessor code.
Ignoring that this particular case should probably be solved using a function rather than a preprocessor macro, is it possible to define a variadic MAX_OF_N() macro?
Just for clarity, the end result should be a single macro that takes an arbitrary number of parameters and evaluates to the largest of them. I've got an odd feeling that this should be possible, but I'm not seeing how.

It's possible to write a macro that evaluates to the number of arguments it's called with. (I couldn't find a link to the place where I first saw it.) So you could write MAX_OF_N() that would work as you'd like, but you'd still need all the numbered macros up until some limit:
#define MAX_OF_1(a) (a)
#define MAX_OF_2(a,b) max(a, b)
#define MAX_OF_3(a,...) MAX_OF_2(a,MAX_OF_2(__VA_ARGS__))
#define MAX_OF_4(a,...) MAX_OF_2(a,MAX_OF_3(__VA_ARGS__))
#define MAX_OF_5(a,...) MAX_OF_2(a,MAX_OF_4(__VA_ARGS__))
...
#define MAX_OF_64(a,...) MAX_OF_2(a,MAX_OF_63(__VA_ARGS__))
// NUM_ARGS(...) evaluates to the literal number of the passed-in arguments.
#define _NUM_ARGS2(X,X64,X63,X62,X61,X60,X59,X58,X57,X56,X55,X54,X53,X52,X51,X50,X49,X48,X47,X46,X45,X44,X43,X42,X41,X40,X39,X38,X37,X36,X35,X34,X33,X32,X31,X30,X29,X28,X27,X26,X25,X24,X23,X22,X21,X20,X19,X18,X17,X16,X15,X14,X13,X12,X11,X10,X9,X8,X7,X6,X5,X4,X3,X2,X1,N,...) N
#define NUM_ARGS(...) _NUM_ARGS2(0, __VA_ARGS__ ,64,63,62,61,60,59,58,57,56,55,54,53,52,51,50,49,48,47,46,45,44,43,42,41,40,39,38,37,36,35,34,33,32,31,30,29,28,27,26,25,24,23,22,21,20,19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0)
#define _MAX_OF_N3(N, ...) MAX_OF_ ## N(__VA_ARGS__)
#define _MAX_OF_N2(N, ...) _MAX_OF_N3(N, __VA_ARGS__)
#define MAX_OF_N(...) _MAX_OF_N2(NUM_ARGS(__VA_ARGS__), __VA_ARGS__)
Now MAX_OF_N(a,b,c,d,e) will evaluate to max(a, max(b, max(c, max(d, e)))). (I've tested on gcc 4.2.1.)
Note that it's critical that the base case (MAX_OF_2) doesn't repeat its arguments more than once in the expansion (which is why I put max in this example). Otherwise, you'd be doubling the length of the expansion for every level, so you can imagine what will happen with 64 arguments :)

You might consider this cheating, since it is not recursive and it doesn't do the work in the preprocessor. And it uses a GCC extension. And it only works for one type. It is, however, a variadic MAX_OF_N macro:
#include <iostream>
#include <algorithm>
#define MAX_OF_N(...) ({\
int ra[] = { __VA_ARGS__ }; \
*std::max_element(&ra[0], &ra[sizeof(ra) / sizeof(int)]); \
})
int main() {
int i = 12;
std::cout << MAX_OF_N(1, 3, i, 6);
}
Oh yes, and because of the potential variable expression in the initializer list, I don't think that an equivalent of this (using its own function to avoid std::max_element) would work in C89. But I'm not sure variadic macros are in C89 either.
Here's something that I think gets around the "only one type" restriction. It's getting a bit hairy, though:
#include <iostream>
#include <algorithm>
#define MAX_OF_N(x, ...) ({\
typeof(x) ra[] = { (x), __VA_ARGS__ }; \
*std::max_element(&ra[0], &ra[sizeof(ra)/sizeof(ra[0])]); \
})
int main() {
int i = 12;
std::cout << MAX_OF_N(i + 1, 1, 3, 6, i);
}

No, because the preprocessor only takes one "swipe" at the file. There's no way to get it to recursively define macros.
The only code that I've seen do something like this was not variadic, but used default values the user had to pass:
x = MAX_OF_8 (a, b, -1, -1, -1, -1, -1, -1)
assuming all values were non-negative.
Inline functions should give you the same for C++ at least. As you state, it's probably better left to a function with variable arguments similar to printf().

I think that, even if you could expand macros recursively, there would be one little problem with your approach in terms of efficiency... when the macros are expanded, if the MAX_OF_[N-1] is greater, then you have to evaluate it again from scratch.
Here is a silly and stupid answer that probably no one will like xD
file "source.c"
#include "my_macros.h"
...
file "Makefile"
myprogram: source.c my_macros.h
gcc source.c -o myprogram
my_macros.h: make_macros.py
python make_macros.py > my_macros.h
file "make_macros.py"
def split(l):
n = len(l)
return l[:n/2], l[n/2:]
def gen_param_seq(n):
return [chr(i + ord("A")) for i in range(n)]
def make_max(a, b):
if len(a) == 1:
parta = "("+a[0]+")"
else:
parta = make_max(*split(a))
if len(b) == 1:
partb = "("+b[0]+")"
else:
partb = make_max(*split(b))
return "("+parta +">"+partb+"?"+parta+":"+partb+")"
for i in range(2, 9):
p = gen_param_seq(i)
print "#define MAX_"+str(i)+"("+", ".join(p)+") "+make_max(*split(p))
then you'll have those pretty macros defined:
#define MAX_2(A, B) ((A)>(B)?(A):(B))
#define MAX_3(A, B, C) ((A)>((B)>(C)?(B):(C))?(A):((B)>(C)?(B):(C)))
#define MAX_4(A, B, C, D) (((A)>(B)?(A):(B))>((C)>(D)?(C):(D))?((A)>(B)?(A):(B)):((C)>(D)?(C):(D)))
#define MAX_5(A, B, C, D, E) (((A)>(B)?(A):(B))>((C)>((D)>(E)?(D):(E))?(C):((D)>(E)?(D):(E)))?((A)>(B)?(A):(B)):((C)>((D)>(E)?(D):(E))?(C):((D)>(E)?(D):(E))))
#define MAX_6(A, B, C, D, E, F) (((A)>((B)>(C)?(B):(C))?(A):((B)>(C)?(B):(C)))>((D)>((E)>(F)?(E):(F))?(D):((E)>(F)?(E):(F)))?((A)>((B)>(C)?(B):(C))?(A):((B)>(C)?(B):(C))):((D)>((E)>(F)?(E):(F))?(D):((E)>(F)?(E):(F))))
#define MAX_7(A, B, C, D, E, F, G) (((A)>((B)>(C)?(B):(C))?(A):((B)>(C)?(B):(C)))>(((D)>(E)?(D):(E))>((F)>(G)?(F):(G))?((D)>(E)?(D):(E)):((F)>(G)?(F):(G)))?((A)>((B)>(C)?(B):(C))?(A):((B)>(C)?(B):(C))):(((D)>(E)?(D):(E))>((F)>(G)?(F):(G))?((D)>(E)?(D):(E)):((F)>(G)?(F):(G))))
#define MAX_8(A, B, C, D, E, F, G, H) ((((A)>(B)?(A):(B))>((C)>(D)?(C):(D))?((A)>(B)?(A):(B)):((C)>(D)?(C):(D)))>(((E)>(F)?(E):(F))>((G)>(H)?(G):(H))?((E)>(F)?(E):(F)):((G)>(H)?(G):(H)))?(((A)>(B)?(A):(B))>((C)>(D)?(C):(D))?((A)>(B)?(A):(B)):((C)>(D)?(C):(D))):(((E)>(F)?(E):(F))>((G)>(H)?(G):(H))?((E)>(F)?(E):(F)):((G)>(H)?(G):(H))))
and the best thing about it is that... it works ^_^

If you're going down this road in C++, take a look at template metaprogramming. It's not pretty, and it may not solve your exact problem, but it will handle recursion.

First, macros don't expand recusrsively. Although, macros can have reentrance by creating a macro for each recursion level and then deducing the recursion level. However, all this repetition and deducing recursion, is taken care of by the Boost.Preprocessor library. You can therefore use the higher order fold macro to calculate the max:
#define MAX_EACH(s, x, y) BOOST_PP_IF(BOOST_PP_GREATER_EQUAL(x, y), x, y)
#define MAX(...) BOOST_PP_SEQ_FOLD_LEFT(MAX_EACH, 0, BOOST_PP_VARIADIC_TO_SEQ(__VA_ARGS__))
MAX(3, 6, 8) //Outputs 8
MAX(4, 5, 9, 2) //Outputs 9
Now, this will understand literal numbers between 0-256. It wont work on C++ variables or expression, because the C preprocessor doesn't understand C++. Its just pure text replacement. But C++ provides a feature called a "function" that will work on C++ expressions, and you can use it to calculate the max value.
template<class T>
T max(T x, T y)
{
return x > y ? x : y;
}
template<class X, class... T>
auto max(X x, T ... args) -> decltype(max(x, max(args...)))
{
return max(x, max(args...));
}
Now, the code above does require a C++11 compiler. If you are using C++03, you can create multiple overloads of the function in order to simulate variadic parameters. Furthermore, we can use the preprocessor to generate this repetitive code for us(thats what it is there for). So in C++03, you can write this:
template<class T>
T max(T x, T y)
{
return x > y ? x : y;
}
#define MAX_FUNCTION(z, n, data) \
template<class T> \
T max(T x, BOOST_PP_ENUM_PARAMS(n, T x)) \
{ \
return max(x, max(BOOST_PP_ENUM_PARAMS(n, x)));\
}
BOOST_PP_REPEAT_FROM_TO(2, 64, MAX_FUNCTION, ~)

There's a nice recursion example here
The "hack" is to have a mid-step in the preprocessor to make it think that the define is not replaced by anything else at a given step.

Related

Verifying programs with heterogeneous arrays in VST

I'm verifying a c program that uses arrays to store heterogeneous data - in particular, the program uses arrays to implement cons cells, where the first element of the array is an integer value, and the second element is a pointer to the next cons cell.
For example, the free operation for this list would be:
void listfree(void * x) {
if((x == 0)) {
return;
} else {
void * n = *((void **)x + 1);
listfree(n);
free(x);
return;
}
}
Note: Not shown here, but other code sections will read the values of the array and treat it as an integer.
While I understand that the natural way to express this would be as some kind of struct, the program itself is written using an array, and I can't change this.
How should I specify the structure of the memory in VST?
I've defined an lseg predicate as follows:
Fixpoint lseg (x: val) (s: (list val)) (self_card: lseg_card) : mpred := match self_card with
| lseg_card_0 => !!(x = nullval) && !!(s = []) && emp
| lseg_card_1 _alpha_513 =>
EX v : Z,
EX s1 : (list val),
EX nxt : val,
!!(~ (x = nullval)) &&
!!(s = ([(Vint (Int.repr v))] ++ s1)) &&
(data_at Tsh (tarray tint 2) [(Vint (Int.repr v)); nxt] x) *
(lseg nxt s1 _alpha_513)
end.
However, I run into troubles when trying to evaluate void *n = *(void **)x; presumably because the specification states that the memory contains an array of ints not pointers.
The issue is probably as follows, and can almost be solved as follows.
The C semantics permit casting an integer (of the right size) to a pointer, and vice versa, as long as you don't actually do any pointer operations to an integer value, or vice versa. Very likely your C program obeys those rules. But the type system of Verifiable C tries to enforce that local variables (and array elements, etc.) of integer type will never contain pointer values, and vice versa (except the special integer value 0, which is NULL).
However, Verifiable C does support a (proved-foundationally-sound) workaround to this stricter enforcement:
typedef void * int_or_ptr
#ifdef COMPCERT
__attribute((aligned(_Alignof(void*))))
#endif
;
That is: the int_or_ptr type is void*, but with the attribute "align this as void*". So it's semantically identical to void*, but the redundant attribute is a hint to the VST type system to be less restrictive about C type enforcement.
So, when I say "can almost be solved", I'm asking: Can you modify the C program to use an array of "void* aligned as void*" ?
If so, then you can proceed. Your VST verification should use int_or_ptr_type, which is a definition of type Ctypes.type provided by VST-Floyd, when referring to the C-language type of these array elements, or of local variables that these elements are loaded into.
Unfortunately, int_or_ptr_type is not documented in the reference manual (VC.pdf), which is an omission that should be correct. You can look at progs/int_or_ptr.c and progs/verif_int_or_ptr.v, but these do much more than you want or need: They axiomatize operators that distinguish odd integers from aligned pointers, which is undefined in C11 (but consistent with C11, otherwise the ocaml garbage collector could never work). That is, those axiomatized external functions are consistent with CompCert, gcc, clang; but you won't need any of them, because the only operations you're doing on int_or_pointer are the perfectly-legal "comparison with NULL" and "cast to integer" or "cast to struct foo *".

Why are macros based on abstract syntax trees better than macros based on string preprocessing?

I am beginning my journey of learning Rust. I came across this line in Rust by Example:
However, unlike macros in C and other languages, Rust macros are expanded into abstract syntax trees, rather than string preprocessing, so you don't get unexpected precedence bugs.
Why is an abstract syntax tree better than string preprocessing?
If you have this in C:
#define X(A,B) A+B
int r = X(1,2) * 3;
The value of r will be 7, because the preprocessor expands it to 1+2 * 3, which is 1+(2*3).
In Rust, you would have:
macro_rules! X { ($a:expr,$b:expr) => { $a+$b } }
let r = X!(1,2) * 3;
This will evaluate to 9, because the compiler will interpret the expansion as (1+2)*3. This is because the compiler knows that the result of the macro is supposed to be a complete, self-contained expression.
That said, the C macro could also be defined like so:
#define X(A,B) ((A)+(B))
This would avoid any non-obvious evaluation problems, including the arguments themselves being reinterpreted due to context. However, when you're using a macro, you can never be sure whether or not the macro has correctly accounted for every possible way it could be used, so it's hard to tell what any given macro expansion will do.
By using AST nodes instead of text, Rust ensures this ambiguity can't happen.
A classic example using the C preprocessor is
#define MUL(a, b) a * b
// ...
int res = MUL(x + y, 5);
The use of the macro will expand to
int res = x + y * 5;
which is very far from the expected
int res = (x + y) * 5;
This happens because the C preprocessor really just does simple text-based substitutions, it's not really an integral part of the language itself. Preprocessing and parsing are two separate steps.
If the preprocessor instead parsed the macro like the rest of the compiler, which happens for languages where macros are part of the actual language syntax, this is no longer a problem as things like precedence (as mentioned) and associativity are taken into account.

MATLAB, mex files, Fortran, warnings and errors

I'm not quite seeing the issue I'm having, and I've tried suggestions I've seen in other forum posts. I am trying to write a (fortran) mex file, per request of my boss. However, I was getting warnings when passing a matrix to my computational routine. If I ignored the warning, my MATLAB shut down. So now I'm trying a simpler program, an inner product. However, I am still getting the warning: "Expected a procedure at (1)" where (1) is at 'call innerProd(x,y,c)' underneath the x. I'm not sure what that means... I've included my code.
#include "fintrf.h"
C======================================================================
#if 0
C
C innerProd.F
C .F file needs to be preprocessed to generate .for equivalent
C
#endif
C
C innerProd.F
C calculates the inner product
C This is a MEX file for MATLAB.
C Copyright 1984-2011 The MathWorks, Inc.
C $Revision: 1.12.2.9 $
C======================================================================
C Gateway routine
subroutine mexFunction(nlhs, plhs, nrhs, prhs)
C Declarations
implicit none
C mexFunction arguments:
mwPointer:: plhs(*), prhs(*)
integer:: nlhs, nrhs
C Function declarations:
mwPointer:: mxCreateDoubleMatrix, mxGetPr,mxGetM, mxGetData
integer:: mxIsNumeric
C Pointers to input/output mxArrays:
mwPointer:: x_ptr, y_ptr, c_ptr
C Array information:
mwSize:: m
C Arguments for computational routine:
real*8:: x,y,c
C----------------------------------------------------------------------
C Check for proper number of arguments.
if (nrhs .ne. 2) then
call mexErrMsgTxt('Error.')
elseif (nlhs .ne. 1) then
call mexErrMsgTxt('One output required.')
endif
C Check to see if inputs are numeric.
if (mxIsNumeric(prhs(1)) .ne. 1 ) then
call mexErrMsgTxt('Input # 1 is not a numeric array.')
elseif (mxIsNumeric(prhs(2)) .ne. 1) then
call mexErrMsgTxt('Input #2 is not a numeric array.')
endif
C Find dimensions of mxArrays
m=mxGetM(prhs(1))
C create Fortran arrays from the input arguments
x_ptr=mxGetData(prhs(1))
call mxCopyPtrToReal8(x_ptr,x,m)
y_ptr= mxGetData(prhs(2))
call mxCopyPtrToReal8(y_ptr,y,m)
C create matrix for the return argument
plhs(1) =mxCreateDoubleMatrix(1,1,0)
c_ptr= mxGetPr(plhs(1))
C Call the computational subroutine.
call innerProd(x,y,c)
C Load the output into a MATLAB array.
call mxCopyReal8ToPtr(c, c_ptr, 1)
return
end subroutine mexFunction
C----------------------------------------------------------------------
C Computational routine
subroutine innerProd(x,y,c)
implicit none
real*8:: x,y,temp,c
integer:: i,m
do i=1,m
temp=temp+x(i)*y(i)
end do
c = temp
return
end subroutine innerProd
I'm just learning this for the first time, and I would appreciate any suggestions. Even if it is where to look for solutions. I've looked through MATLAB mex Fortran aids online. There isn't any help there. I can't get the function to run, so I can't use print statements which is a good way to debug. I think mex has a print function, I will try to get that to work.
Thank you!
The main problem is that you haven't anywhere declared the arguments of innerProd as arrays. That holds for the actual arguments x and y in the subroutine mexFunction and the dummy arguments x and y in innerProd itself.
So, in innerProd the expression x(i) isn't referencing the i-th element of the real*8 array x, but the real*8 result of the function x with argument i. As the x you've passed isn't a function (procedure), this is an error.
There are ways to solve this, but all involve declaring the dummy arguments as arrays. Which brings up another point.
You have in innerProd
integer:: i,m
do i=1,m
temp=temp+x(i)*y(i)
end do
where m is not defined. Crucially, from mexfunction you're expecting the compiler to know that m is the size of the arrays x and y: this isn't true. m is a variable local to innerProd. Instead you may want to pass it as an argument to the subroutine and use that to dimension the arrays:
subroutine innerProd(x,y,c,m)
implicit none
integer :: m
real*8:: x(m),y(m),temp,c
...
end subroutine
[You could, of course, use assumed-shape arrays (and the SIZE intrinsic), but that's an additional complication requiring more substantial changes.] You also need to think about how to declare arrays appropriately in mexfunction, noting that the call to mxCopyPtrToReal8 also requires an array argument.
I couldn't get the Fortran code to work for innerProd, but I did get C code to work. I recommend, if you are having issues with Fortran, to use C. It seems that Matlab is more flexible when it comes to mex files and C. Here is the code:
#include "mex.h"
/*
* innerProd.c
*
*Computational function that takes the inner product of two vectors.
*
*Mex-file for MATLAB.
*/
void innerProd(double *x, double *y,double *c,int m){
double temp;
int i;
for(i=0; i < m; i++){
temp = temp + x[i]*y[i];
}
*c=temp;
}
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
double *x, *y, *c;
size_t m;
/*check for proper number of arguments.*/
if(nrhs != 2){
mexErrMsgIdAndTxt("MATLAB:innerProd:invalidNumInputs","Two input required.");
}
/*The input must be a noncomplex double vector.*/
m = mxGetM(prhs[0]);
if(!mxIsDouble(prhs[0]) || mxIsComplex(prhs[0]) || m==1){
mexErrMsgIdAndTxt("MATLAB:innerProd:inputNotRealDouble","Input must be noncomplex double vector");
}
/*create return argument */
plhs[0] = mxCreateDoubleMatrix(1,1,0);
c=mxGetPr(plhs[0]);
/*Assign pointers to each input and output. */
x = mxGetPr(prhs[0]);
y=mxGetPr(prhs[1]);
/*call subroutine*/
innerProd(x,y,c,m);
}
I will still take suggestions on the Fortran code above, though. I'd like to know how to get it to work. Thanks!

Calling a macro

I have the following macro:
#define testMethod(a, b) \
if (a.length > b.length) \
return a; \
return b;
When I try to call it with:
NSString *s = testMethod(#"fir", #"sec");
I get an error:
"Excepted ";" at end of declaration"
Why?
if is a statement, not an expression. It can't return a value like that.
You probably mean:
#define testMethod(a, b) ((a).length > (b).length ? (a) : (b))
The extra parenthesis around the arguments on the right side are common, and are there to protect you against unexpected precendence-related incidents.
Also note that because the above is pre-processed by doing textual replacement, it will probably construct more objects than the equivalent function would.
If you want to use the macro within expressions, it should be defined as an expression itself, not as a group of statements. You end up with a syntax error because the macro is literally substituted, and statements are not allowed within another statement.
GCC has an extension called "statement expressions" that can help you achieve this, but it is non-standard:
#define testMethod(a, b) ({ \
typeof(a) result = (a).length > (b).length ? (a) : (b); \
result; \
})
Actually, in your case none of this is needed because the statements can be easily converted to an expression:
#define testMethod(a, b) ((a).length > (b).length ? (a) : (b))
You need not to use the return statement... try to use the following code
#define testMethod(a,b) ((a) < (b) ? (a) : (b))
may this will help you..

C language preprocessor behavior

There are different kind of macros in the C language, nested macro is one of them.
Considering a program with the following macro
#define HYPE(x,y) (SQUR(x)+SQUR(y))
#define SQUR(x) (x*x)
Using this we can successfully compile to get the result.
As we all know the C preprocessor replaces all the occurrence of the identifiers with the replacement-string. Considering the above example I would like to know how many times the C preprocessor traverses the program to replace the macro with the replacement values. I assume it cannot be done in one go.
the replacement takes place, when "HYPE" is actually used. it is not expanded when the #define statement occurs.
eg:
1 #define FOO 1
2
3 void foo() {
4 printf("%d\n", FOO);
5 }
so the replacement takes place in line 5, and not in line 1. hence the answer to your question is: once.
A #define'd macro invocation is expanded until there are no more terms to expand, except it doesn't recurse. For example:
#define TIMES *
#define factorial(n) ((n) == 0 ? 1 : (n) TIMES factorial((n)-1))
// Doesn't actually work, don't use.
Suppose you say factorial(2). It will expand to ((2) == 0 ? 1 : (2) * factorial((2)-1)). Note that factorial is expanded, then TIMES is also expanded, but factorial isn't expanded again afterwards, as that would be recursion.
However, note that nesting (arguably a different type of "recursion") is in fact expanded multiple times in the same expression:
#define ADD(a,b) ((a)+(b))
....
ADD(ADD(1,2),ADD(3,4)) // expands to ((((1)+(2)))+(((3)+(4))))