Is it possible to check that a value is a compile-time constant in Rust?

There are cases when inline functions or macros will expand into a lot of code. However, when used with constants, dead branches can be optimized away.
I could add a comment in the code:
// foo arg is always a constant, dead branches will be removed
But I'd rather add some kind of static assertion to ensure this is always the case.
Is there a way in Rust to check if a value is a compile time constant?
Something like GCC's __builtin_constant_p?

If the type of the value or expression is fixed and known in advance, you can define a local constant and initialize it with the value or expression. If you don't otherwise use the constant, prefix its name with an underscore to suppress compiler warnings about the constant being unused. This only works with macros, though.
const _ASSERT_COMPILE_TIME_CONSTANT: i32 = $arg;
The nightly compiler also supports defining "const functions", i.e. functions that can be used in contexts where the compiler requires an expression that can be evaluated at compile time. The body of those functions is subject to restrictions, but call sites of const functions that don't need to be evaluated at compile time can pass expressions that cannot be evaluated at compile time as arguments, so defining a const function doesn't provide the guarantee you're asking for.
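A minimal sketch of that last point, assuming a current stable toolchain (const fn has been stable since Rust 1.31, so this simple case needs no feature gate). The names double and AT_COMPILE_TIME are made up for the illustration.

```rust
// Hypothetical const fn used only for illustration.
const fn double(x: u32) -> u32 {
    x * 2
}

// In a const initializer the call must be evaluated at compile time.
const AT_COMPILE_TIME: u32 = double(21);

fn main() {
    // `n` is only known at run time, yet the call is still accepted:
    // being a const fn does not force the arguments to be constants.
    let n = std::env::args().count() as u32;
    println!("{} {}", AT_COMPILE_TIME, double(n));
}
```

So the const initializer is what forces compile-time evaluation, not the const fn itself.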
If the type of the value or expression cannot be specified in the macro, then we can't omit it, as const requires that the type be specified. However, we can use a generic const function that returns a fixed type in a const initializer!
// at the beginning of the crate
#![feature(const_fn)]
// in the macro's body
const fn _swallow<T>(_x: T) { () }
const _ASSERT_COMPILE_TIME_CONSTANT: () = _swallow($arg);

To add to @Francis's answer, this is a macro that can be used to ensure that a value is a constant expression.
macro_rules! ensure_const_expr {
($value:expr, $t:ty) => {
{
const _IGNORE: $t = $value;
}
}
}
// in a function's body
ensure_const_expr!(some_variable, i32);
Note the extra braces are needed so multiple uses don't fail with:
error: a value named `_IGNORE` has already been defined in this block
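As a rough, self-contained sketch of how this behaves in practice: the macro is repeated so the example compiles on its own, while LIMIT and clamp_to_half_limit are hypothetical names added for the demonstration.

```rust
// Same pattern as the macro above, repeated so this example stands alone.
macro_rules! ensure_const_expr {
    ($value:expr, $t:ty) => {{
        const _IGNORE: $t = $value;
    }};
}

// Hypothetical constant for the demonstration.
const LIMIT: i32 = 4 * 1024;

fn clamp_to_half_limit(x: i32) -> i32 {
    // Both uses compile: each expansion gets its own block scope.
    ensure_const_expr!(LIMIT, i32);
    ensure_const_expr!(LIMIT / 2, i32);
    // ensure_const_expr!(x, i32); // rejected at compile time: `x` is not a constant
    x.min(LIMIT / 2)
}

fn main() {
    println!("{}", clamp_to_half_limit(10_000));
}
```

Uncommenting the third use should make the build fail, which is the static assertion the question asks for.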


Modifying a const list in Dart gives a runtime error, not a compile-time error

In dart, why
const cities = ['Delhi', 'UP', 'Noida'];
//error is in this line
cities[0] = 'Mumbai';
is a runtime error, not a compile time error?
See this answer for the implications of const in Dart.
TL;DR: const values are compile-time constants in Dart, and you cannot modify them at runtime.
const cities = ['Delhi', 'UP', 'Noida'];
cities[0] = 'Mumbai'; // Throws at runtime
Use final or var instead.
final cities = ['Delhi', 'UP', 'Noida'];
cities[0] = 'Mumbai'; // Works OK
https://www.peachpit.com/articles/article.aspx?p=2468332&seqNum=5#:~:text=EXAMPLE%204.12&text=Unlike%20final%20variables%2C%20properties%20of,its%20values%20cannot%20be%20changed.
Use normal variables, since with constants you cannot change the value of the list at runtime.
I assume that is done with var or final, as I'm not a Dart master myself.
There currently is no way of indicating in Dart whether a method mutates its object or is guaranteed to leave it alone. (This is unlike, say, C++ where a method could be marked as const to indicate that it does not (visibly) mutate the object.)
Consequently, there isn't a good way for the Dart compiler to know that operator []= shouldn't be allowed to be invoked on a const object, so unfortunately it isn't known that it violates the const-ness of the object until runtime.
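For contrast only, here is a short sketch in Rust (not Dart) of a language where each method's receiver states whether it mutates the object, in the spirit of the C++ comparison above. The Cities type and its methods are hypothetical.

```rust
struct Cities {
    names: Vec<String>,
}

impl Cities {
    // `&self`: the method promises not to mutate the receiver.
    fn first(&self) -> &str {
        &self.names[0]
    }

    // `&mut self`: the method declares that it mutates the receiver.
    fn set_first(&mut self, name: &str) {
        self.names[0] = name.to_string();
    }
}

fn main() {
    let fixed = Cities { names: vec!["Delhi".to_string(), "UP".to_string()] };
    // fixed.set_first("Mumbai"); // compile-time error: `fixed` is not declared `mut`
    println!("{}", fixed.first());

    let mut open = Cities { names: vec!["Delhi".to_string(), "UP".to_string()] };
    open.set_first("Mumbai"); // accepted: the binding is mutable
    println!("{}", open.first());
}
```

Because the mutation is visible in the method signature, the compiler can reject the call statically. That is the information Dart's operator []= does not carry.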

Adding classes to micropython module

In reference to adding a module in MicroPython, I was trying to create a class which has a local method. The documentation explains how to add local methods and says that the first argument should be of type mp_obj_t, which is the data struct itself. However, how can I pass extra parameters, as with other methods? I tried using mp_obj_t *args as the second argument, but STATIC MP_DEFINE_CONST_FUN_OBJ_1 gives an error. I tried the same with STATIC MP_DEFINE_CONST_FUN_OBJ_VAR, but it does not support passing mp_obj_t as the first argument, as STATIC MP_DEFINE_CONST_FUN_OBJ_VAR needs an int. I am quite new, so I am asking how to add methods to classes which can accept arguments.
You need MP_DEFINE_CONST_FUN_OBJ_2, since you have 2 arguments.
Something like
STATIC mp_obj_t my_class_func(mp_obj_t self, mp_obj_t arg) {
if (MP_OBJ_IS_SMALL_INT(arg)) {
const mp_int_t arg_val = MP_OBJ_SMALL_INT_VALUE(arg);
//...
} else {
//oops, not an int
}
return mp_const_none;
}
MP_DEFINE_CONST_FUN_OBJ_2(my_class_func_obj, my_class_func);
The best source of samples like this is the source code btw.
To elaborate on @stijn's answer: when creating a class, all the MP_DEFINE_CONST_FUN_OBJ_XXXXXX defines work exactly the same as they would if you weren't creating a class. The only difference is that the first of the ACTUAL arguments will always refer to self.
Here's an example:
mp_obj_t Class_method(mp_uint_t n_args, const mp_obj_t *args) { ... }
That is the standard candidate for:
MP_DEFINE_CONST_FUN_OBJ_VAR_BETWEEN(Class_method_obj, 1, 3, Class_method);
However, in this case args[0] will be self.
Let's have another example.
mp_obj_t Class_method(mp_uint_t n_args, const mp_obj_t *args, mp_map_t *kw_args) { ... }
That's a prime candidate for this define
MP_DEFINE_CONST_FUN_OBJ_KW(Class_method_obj, 2, Class_method);
The only difference in this case is that the first index of allowed_args needs to automatically be handled as self. Nothing about how you do these things changes, except now the first ACTUAL argument (ie not including n_args or any other "helper" argument) needs to automatically be considered as self. That being said, you will NEVER use MP_DEFINE_CONST_FUN_OBJ_0 with a class method. '_0' means "zero arguments" and a class method will never have zero arguments because it will ALWAYS at least have self. This also means that you have to add one to however many expected arguments you have on the python end. If your python version accepts 3 arguments ~
(red, green, blue)
then your C_MODULE define has to start at 4 because it's going to get
(self, red, green, blue)

When does Chapel pass by reference and when by constant?

I am looking for examples of Chapel passing by reference. This example works but it seems like bad form since I am "returning" the input. Does this waste memory? Is there an explicit way to operate on a class?
class PowerPuffGirl {
var secretIngredients: [1..0] string;
}
var bubbles = new PowerPuffGirl();
bubbles.secretIngredients.push_back("sugar");
bubbles.secretIngredients.push_back("spice");
bubbles.secretIngredients.push_back("everything nice");
writeln(bubbles.secretIngredients);
proc kickAss(b: PowerPuffGirl) {
b.secretIngredients.push_back("Chemical X");
return b;
}
bubbles = kickAss(bubbles);
writeln(bubbles.secretIngredients);
And it produces the output
sugar spice everything nice
sugar spice everything nice Chemical X
What is the most efficient way to use a function to modify Bubbles?
Whether Chapel passes an argument by reference or not can be controlled by the argument intent. For example, integers normally pass by value but we can pass one by reference:
proc increment(ref x:int) { // 'ref' here is an argument intent
x += 1;
}
var x:int = 5;
increment(x);
writeln(x); // outputs 6
The way that a type passes when you don't specify an argument intent is known as the default intent. Chapel passes records, domains, and arrays by reference by default, but of these only arrays are modifiable inside the function. (Records and domains pass by const ref, meaning they are passed by reference but the function they are passed to cannot modify them. Arrays pass by ref or const ref depending upon what the function does with them; see array default intent.)
Now, to your question specifically, class instances pass by "value" by default, but Chapel considers the "value" of a class instance to be a pointer. That means that instead of allowing a field (say) to be mutated, passing a class instance by ref just means that it could be replaced with a different class instance. There isn't currently a way to say that a class instance's fields should not be modifiable in the function (other than making them to be explicitly immutable data types).
Given all of that, I don't see any inefficiencies with the code sample you provided in the question. In particular, here:
proc kickAss(b: PowerPuffGirl) {
b.secretIngredients.push_back("Chemical X");
return b;
}
the argument accepting b will receive a copy of the pointer to the instance and the return b will return a copy of that pointer. The contents of the instance (in particular the secretIngredients array) will remain stored where it was and won't be copied in the process.
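A rough analogy in Rust (not Chapel), assuming a reference-counted handle stands in for the class-instance pointer: passing and returning the handle copies only the pointer, never the array it points to. The function name add_chemical_x is hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct PowerPuffGirl {
    secret_ingredients: Vec<String>,
}

// The handle (a reference-counted pointer) is passed and returned by value;
// the Vec it points to is never copied.
fn add_chemical_x(b: Rc<RefCell<PowerPuffGirl>>) -> Rc<RefCell<PowerPuffGirl>> {
    b.borrow_mut().secret_ingredients.push("Chemical X".to_string());
    b
}

fn main() {
    let bubbles = Rc::new(RefCell::new(PowerPuffGirl {
        secret_ingredients: vec!["sugar".to_string(), "spice".to_string()],
    }));
    let returned = add_chemical_x(Rc::clone(&bubbles));
    // Both handles refer to the same instance.
    assert!(Rc::ptr_eq(&bubbles, &returned));
    println!("{:?}", bubbles.borrow().secret_ingredients);
}
```

As in the Chapel version, the return value is redundant here: the caller's original handle already sees the mutation.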
One more thing:
This example works but it seems like bad form since I am "returning" the input.
As I said, this isn't really a problem for class instances or integers. What about an array?
proc identity(A) {
return A;
}
var A:[1..100] int;
writeln(identity(A));
In this example, the return A in identity() actually does cause a copy of the array to be made. That copy wasn't created when passing the array in to identity(), since the array was passed with a const ref intent. But, since the function returns something "by value" that was a reference, it's necessary to copy it as part of returning. See also arrays return by value by default in the language evolution document.
In any case, if one wants to return an array by reference, it's possible to do so with the ref or const ref return intent, e.g.:
proc refIdentity(ref arg) ref {
return arg;
}
var B:[1..10] int;
writeln(refIdentity(B));
Now there is no copy of the array and everything is just referring to the same B.
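The same distinction, sketched as an analogy in Rust rather than Chapel: returning by value hands back an independent copy, while returning a reference aliases the original. The function names mirror the Chapel ones but are otherwise hypothetical.

```rust
// Returns by value: the caller gets an independent copy of the data.
fn identity(a: &[i32]) -> Vec<i32> {
    a.to_vec()
}

// Returns by reference: no copy is made, the result aliases the argument.
fn ref_identity(a: &mut Vec<i32>) -> &mut Vec<i32> {
    a
}

fn main() {
    let mut b = vec![0; 10];
    let mut copy = identity(&b);
    copy[0] = 1; // changes only the copy
    ref_identity(&mut b)[1] = 2; // writes through to `b`
    println!("{:?} {:?}", &b[..2], &copy[..2]);
}
```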
Note though that it's currently possible to write programs that return a reference to a variable that no longer exists. The compiler includes some checking in that area but it's not complete. Hopefully improvements in that area are coming soon.

Why is generic instantiation syntax disallowed in Hack?

From the docs:
Note: HHVM allows syntax such as $x = Vector<int>{5,10};, but Hack
disallows the syntax in this situation, instead opting to infer
it.
Is there a specific reason for this? Isn't this a violation of the fail-fast rule?
There are some situations in which this would cause errors to be deferred, which in turn makes them harder to trace back.
For example:
<?hh // strict
function main() : void {
$myVector = new Vector([]); // no generic syntax
$myVector->addAll(require 'some_external_source.php');
}
The above code causes no errors until it is used in a context where the statically-typed collection is actually in place:
class Foo
{
public ?Vector<int> $v;
}
$f = new Foo();
$f->v = $myVector;
Now there is an error if the vector contains something other than int. But one must trace the error back to the point where the flawed data was actually imported. This would not be necessary if one could instantiate the vector using generic syntax in the first place:
$myVector = new Vector<int>([]);
$myVector->addAll(require 'some_external_source.php'); // fail immediately
I work on the Hack type system and typechecker at Facebook. This question has been asked a few times internally at FB, and it's good to have a nice, externally-visible place to have an answer to it written down.
So first of all, your question is premised on the following code:
<?hh // strict
function main() : void {
$myVector = new Vector([]); // no generic syntax
$myVector->addAll(require 'some_external_source.php');
}
However, that code does not pass the typechecker due to the usage of require outside toplevel, and so any result of actually executing it on HHVM is undefined behavior, rendering this whole discussion moot for that code.
But it's still a legitimate question for other potential pieces of code that do actually typecheck, so let me go ahead and actually answer it. :)
The reason that it's unsupported is that the typechecker is actually able to infer the generic correctly, unlike many other languages, and so we made the judgement call that the syntax would get in the way, and decided to disallow it. It turns out that if you just don't worry about it, we'll infer it right, and still give useful type errors. You can certainly come up with contrived code that doesn't "fail fast" in the way you want, but it's, well, contrived. Take for example this fixup of your example:
<?hh // strict
function main(): void {
$myVector = Vector {}; // I intend this to be a Vector<int>
$myVector[] = 0;
$myVector[] = 'oops'; // Oops! Now it's inferred to be a Vector<mixed>
}
You might argue that this is bad, because you intended to have a Vector<int> but actually have a Vector<mixed> with no type error; you would have liked to be able to express this when creating it, so that adding 'oops' into it would cause such an error. But there is no type error only because you never actually tried to use $myVector! If you tried to pull out any of its values, or return it from the function, you'd get some sort of type compatibility error. For example:
<?hh // strict
function main(): Vector<int> {
$myVector = Vector {}; // I intend this to be a Vector<int>
$myVector[] = 0;
$myVector[] = 'oops'; // Oops! Now it's inferred to be a Vector<mixed>
return $myVector; // Type error!
}
The return statement will cause a type error, saying that the 'oops' is a string, incompatible with the int return type annotation -- exactly what you wanted. So the inference is good, it works, and you don't ever actually need to explicitly annotate the type of locals.
But why shouldn't you be able to if you really want? Because annotating only generics when instantiating new objects isn't really the right feature here. The core of what you're getting at with "but occasionally I really want to annotate Vector<int> {}" is actually "but occasionally I really want to annotate locals". So the right language feature is not to let you write $x = Vector<int> {}; but let you explicitly declare variables and write Vector<int> $x = Vector {}; -- which also allows things like int $x = 42;. Adding explicit variable declarations to the language is a much more general, reasonable addition than just annotating generics at object instantiation. (It's however not a feature being actively worked on, nor can I see it being such in the near to medium term future, so don't get your hopes up now. But leaving the option open is why we made this decision.)
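To make the two kinds of annotation concrete, here is a small sketch in Rust (not Hack), which happens to support both forms. It is only an illustration of the distinction: Rust's inference differs from Hack's, in particular it never widens an element type to something like mixed. The helper name build is hypothetical.

```rust
// Hypothetical helper so the three declarations can be inspected together.
fn build() -> (Vec<i32>, Vec<i32>, i32) {
    // Annotating the generic at the instantiation site.
    let mut a = Vec::<i32>::new();
    a.push(0);

    // Annotating the local variable; the right-hand side is inferred from it.
    let mut b: Vec<i32> = Vec::new();
    b.push(0);
    // b.push("oops"); // would be rejected at compile time on this line

    // The declaration form also covers non-generic types.
    let x: i32 = 42;
    (a, b, x)
}

fn main() {
    println!("{:?}", build());
}
```

The second form is the more general one, since the same syntax annotates any local, generic or not.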
Furthermore, allowing either of these syntaxes would be actively misleading at this point in time. Generics are only enforced by the static typechecker and are erased by the runtime. This means that if you get untyped values from PHP or Hack partial mode code, the runtime cannot possibly check the real type of the generic. Noting that untyped values are "trust the programmer" and so you can do anything with them in the static typechecker too, consider the following code, which includes the hypothetical syntax you propose:
<?hh // partial
function get_foo() /* unannotated */ {
return 'not an int';
}
<?hh // strict
function f(): void {
$v = Vector<int> {};
$v[] = 1; // OK
// $v[] = 'whoops'; // Error since explicitly annotated as Vector<int>
// No error from static typechecker since get_foo is unannotated
// No error from runtime since generics are erased
$v[] = get_foo();
}
Of course, you can't have unannotated values in 100% strict mode code, but we have to think about how it interacts with all potential usages, including untyped code in partial mode or even PHP.

Const member function vs const return type

In D I can specify const functions, like in C++:
struct Person {
string name;
// these two are the same?
const string getConstName() { return name; }
string getConstName2() const { return name; }
}
It seems that the above two have the same meaning. Is that true?
If so how can I return a const string rather than define a const function?
The two are identical. Function attributes can go on either side of a function. e.g.
pure Bar foo() {...}
and
Bar foo() pure {...}
are identical. The same goes for pure, nothrow, const, etc. This is probably fine for most attributes, but it becomes quite annoying when const, immutable, or inout is involved, because they can all affect the return type. In order for those attributes to affect the return type, parens must be used. e.g.
const(Bar) foo() {...}
returns a const Bar, whereas
Bar foo() const {...}
and
const Bar foo() {...}
return a mutable Bar, but the member function itself is const. In most cases what you want is probably either
Bar foo() {...}
or
const(Bar) foo() const {...}
since it's frequently the case that having a const member function forces you to return const (particularly if you're returning a member variable), but you can have any combination of const between the member function and its return type just so long as it works with what the function is doing (e.g. returning a mutable reference to a member variable doesn't work from a const function).
Now personally, I wish that putting const on the left-hand side were illegal, particularly when the excuse that all function attributes can go on either side of the function isn't really true anyway (e.g. static, public, and private don't seem to be able to go on the right-hand side), but unfortunately, that's the way it is at this point, and I doubt that it's going to change, because no one has been able to convince Walter Bright that it's a bad idea to let const go on the left.
However, it is generally considered bad practice to put const, immutable, or inout on the left-hand side of the function unless they're using parens and thus affect the return type, precisely because if they're on the left without parens, you immediately have to question whether the programmer who did it meant to modify the function or the return type. So, allowing it on the left is pretty pointless (aside perhaps for generic code, but it's still not worth allowing it IMHO).