Possible to derive attributes *after* a struct declaration? - macros

I'm using a macro to extend a primitive struct:
pub struct MyTypedNumber(pub u32);
struct_impl_my_features!(MyTypedNumber);
The struct_impl_my_features macro can implement functions and traits for MyTypedNumber; however, there is a case where it's useful to use #[derive(PartialEq, Eq)], for example.
Is it possible to use #[derive(...)] after the struct is already declared?
An alternative is to pass in the struct definition as an item argument to a macro:
struct_impl_my_features!(
pub struct MyTypedNumber(pub u32);,
MyTypedNumber
);
This works, so it may be the best option, although it is rather clunky and means the declaration and macro extension must be together.
See this complete example, the macro is called struct_bitflag_impl (second example).
I worked around this by manually implementing PartialEq and Eq; however, I ran into a case where Rust needs #[derive(...)] so that constants of the type can be used as patterns in a match statement:
= warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
= note: for more information, see RFC 1445 <https://github.com/rust-lang/rfcs/pull/1445>
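For reference, this is the sort of code that triggers the warning (ZERO here is just an illustrative constant, not from my real code):
const ZERO: MyTypedNumber = MyTypedNumber(0);

fn is_zero(n: MyTypedNumber) -> bool {
    match n {
        ZERO => true,  // a constant used as a pattern needs structural equality,
        _ => false,    // i.e. #[derive(PartialEq, Eq)], not a hand-written impl
    }
}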

The "complete example" link you provide does show an example of having the macro prefix attributes (see the second macro).
#[derive(PartialEq, Eq, Copy, Clone, Debug)]
$struct_p_def
However, if instead you want to provide derive attributes per struct (e.g., only some of your structs need to derive PartialEq), you can pass them in the first part of your second struct_impl_my_features! example; the attributes are considered part of the item that the macro matches. For example:
struct_impl_my_features!(
#[derive(PartialEq, Eq)]
pub struct MyTypedNumber(pub u32);,
MyTypedNumber
);
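To show that the attributes travel with the item, here is a minimal sketch of a macro that accepts its first argument as an $item fragment (the real struct_impl_my_features isn't shown in the question, so the generated method below is invented):
macro_rules! struct_impl_my_features {
    ($def:item, $name:ident) => {
        $def

        impl $name {
            // invented method, standing in for whatever the real macro implements
            pub fn get(&self) -> u32 {
                self.0
            }
        }
    };
}

struct_impl_my_features!(
    #[derive(PartialEq, Eq)]
    pub struct MyTypedNumber(pub u32);,
    MyTypedNumber
);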
Update
Sorry, I don't have an answer to your main question; as far as I know, it is not possible. However, if your primary concern is the clunkiness, and if your structs are all of similar form, you could make your macro call nicer by adding this to the top of your macro:
($x:ident ( $($v:tt)* ) ) => {
    struct_impl_my_features!(pub struct $x( $($v)* );, $x);
};
And then calling it like:
struct_impl_my_features!(MyTypedNumber(pub u32));

Documentation comment for loop variable in Xcode

I know that we can use
/// index variable
var i = 0
as a documentation comment for a single variable.
How can we do the same for a loop variable?
The following does not work:
var array = [0]
/// index variable
for i in array.indices {
// ...
}
or
var array = [0]
for /** index variable */ i in array.indices {
// ...
}
Background:
The reason I don't use "good" variable names is that I'm implementing a numerical algorithm that is derived using mathematical notation, which in this case uses only single-letter variable names. To make the connection between the derivation and the implementation easier to see, I use the same variable names.
Now I want to comment on the variables in code.
The use of /// is primarily intended for documenting the API of a class, struct, etc. in Swift.
So if it is used before a class, a func, or a var/let in a class/struct, you are attaching documentation to that declaration, and Xcode understands how to show it inline. Xcode doesn't know how to pick up that information for things inside a function, since at this time that is not the intention of /// (it may work for a simple var/let, but likely not fully on purpose).
Instead, use a simple // code comment for the benefit of anyone working in the code. However, avoid over-documenting: good code is largely self-explanatory to anyone versed in the language, and unneeded documentation can get in the way of just reading the code.
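For example, keeping your names (this is just a plain comment; Xcode won't surface it in Quick Help, but anyone reading the loop sees it):
var array = [0]
// i: index variable from the derivation
for i in array.indices {
    // ...
}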
A good reference for code documentation in Swift at this time is Swift Documentation.
I would strongly push back on something like this if I saw it in a PR. i is a widely adopted "term of art" for loop indices. Generally, if a variable declaration needs a comment to explain its name, you need a better variable name. There are some exceptions, such as when the variable stores data with complicated uses or invariants that can't be captured better in the type system.
I think commenting is one area that beginners get wrong, mainly from being misled by teachers or by not yet fully understanding the purpose of comments. Comments don't exist to create an English-based pseudo-programming language in which your entire app will be duplicated. Understanding the programming language is a minimal expectation of contributors to a project. Absolutely no comments should be explaining programming language features, e.g. var x: Int = 0 // declares a new mutable variable called x, initialized to the Int value 0, with the exception of tutorials for learning Swift.
Commenting in this manner might seem helpful, because you could argue it explains things for beginners. That may be the case, but it's suffocating for all other readers. Imagine if a novel had to define every English word it used.
Instead, the goal of documentation is to explain the purpose and the use of things, answering questions such as:
Why did you implement something this way, and not another way?
What purpose does this method serve?
When will this method of my delegate be called?
Case Study: Equatable
For a good example, take a look at the documentation of Equatable
Some things to notice:
It's written for an audience of Swift developers. It uses many things it does not explain, such as arrays, strings, constants, variable declarations, assignment, if statements, method calls (such as Array.contains(_:)), string interpolation, and the print function.
It explains the general purpose of this protocol.
It explains how to use this protocol.
It explains how you can adopt this protocol for your own use.
It documents contractual requirements that cannot be enforced by the type system.
Since equality between instances of Equatable types is an equivalence relation, any of your custom types that conform to Equatable must satisfy three conditions, for any values a, b, and c:
a == a is always true (Reflexivity)
a == b implies b == a (Symmetry)
a == b and b == c implies a == c (Transitivity)
It explains possible misconceptions about the protocol ("Equality is Separate From Identity")
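To make that concrete, here is a small sketch in the spirit of the GridPoint example from those docs; note that the doc comment explains the purpose and the equality contract, not the language syntax:
/// A point on a two-dimensional integer grid.
///
/// Two grid points are equal when they refer to the same cell, i.e. when both
/// coordinates match. Comparing stored `Int` values keeps the conformance
/// reflexive, symmetric, and transitive.
struct GridPoint: Equatable {
    var x: Int
    var y: Int

    static func == (lhs: GridPoint, rhs: GridPoint) -> Bool {
        return lhs.x == rhs.x && lhs.y == rhs.y
    }
}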

Why can't code in #:fallbacks refer to the generic methods?

This code:
(require racket/generic)
;; A holder that assigns ids to the things it holds. Some callers want to know
;; the id that was assigned when adding a thing to the holder, and others don't.
(define-generics holder
(add-new-thing+id holder new-thing) ;returns: holder id (two values)
(add-new-thing holder new-thing) ;returns: holder
#:fallbacks
[(define (add-new-thing holder new-thing) ;probably same code for all holder structs
(let-values ([(holder _) (add-new-thing+id holder new-thing)])
holder))])
produces this error message:
add-new-thing+id: method not implemented in: (add-new-thing+id holder new-thing)
I'm able to fix it by adding a define/generic inside the fallbacks, like this:
#:fallbacks
[(define/generic add add-new-thing+id)
(define (add-new-thing holder new-thing)
(let-values ([(holder _) (add holder new-thing)])
holder))])
but this seems to add complexity without adding value, and I don't understand why one works and the other doesn't.
As I understand #:fallbacks, the idea is that the generic definition can build methods out of the most primitive methods, so structs that implement the generic interface don't always need to reimplement the same big set of methods that usually just call the core methods with identical code, but they can override those "derived" methods if needed, say, for optimization. That's a very useful thing*, but have I misunderstood fallbacks?
It seems strange that fallbacks code couldn't refer to the generic methods. Isn't the main value of fallbacks to call them? The documentation for define/generic says that it's a syntax error to invoke it outside a #:methods clause in a struct definition, so I'm probably misusing it. Anyway, can someone explain what are the rules for code in a #:fallbacks clause? How are you supposed to write it?
* The Clojure world has something similar, in the potemkin library's def-abstract-type and deftype+, but not as well integrated into the language. potemkin/def-map-type illustrates very nicely why fallbacks—as I understand them, anyway—are such a valuable feature.
The second version of your code is correct.
The first version of your code would work if you had a fallback definition of add-new-thing+id; but because you are actually referring to whatever definition of that method exists outside the fallback scope (the one supplied by the concrete struct), you need to import it.
It admittedly feels a bit repetitive to have to name the generic again inside the fallback clause. That is because #:fallbacks works the same way as #:methods, and therefore has the same behavior of overriding the generics with its own definitions.
To make it explicit that you are overriding a method, you need to "import" it inside your clause, using define/generic (which is not really defining anything, it is just importing the generic into the context).
As the documentation for define/generic says:
When used inside the method definitions associated with the #:methods keyword, binds local-id to the generic for method-id. This form is useful for method specializations to use generic methods (as opposed to the local specialization) on other values.
Then in define-generics:
The syntax of the fallback-impls is the same as the methods provided for the #:methods keyword for struct.
Which means #:fallbacks has the same behavior as using #:methods in a struct.
Why?
The logic behind that behavior is that method definition blocks, like #:methods and #:fallbacks, have access to their own definitions of all the generics, so that it's easy to refer to your own context. To explicitly use a generic from outside this context, you need to use define/generic.
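With the second version in place, a concrete struct only needs to implement the primitive method and gets add-new-thing for free from the fallback. A rough sketch (list-holder is made up for illustration):
(struct list-holder (next-id things)
  #:methods gen:holder
  [(define (add-new-thing+id h new-thing)
     (define id (list-holder-next-id h))
     (values (list-holder (add1 id)
                          (cons (cons id new-thing) (list-holder-things h)))
             id))])

;; add-new-thing is not implemented above, so the fallback runs and, through
;; the imported generic, dispatches to list-holder's add-new-thing+id:
(add-new-thing (list-holder 0 '()) 'apple)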

Overloading vs. Overriding in Julia

I am not familiar with Julia, but I've noticed that it seems to allow you to define functions multiple times with different signatures, such as this:
FK5Coords{e}(ra::T, dec::T) where {e,T<:AbstractFloat} = FK5Coords{e, T}(ra, dec)
FK5Coords{e}(ra::Real, dec::Real) where {e} =
    FK5Coords{e}(promote(float(ra), float(dec))...)
To me it looks like this allows you to call FK5Coords with two different signatures.
So I'm wondering (a) if that is true, if Julia allows overloading functions like this, and (b) if Julia allows something like super in a function, which seems like it would conflict with overloading. And (c), what an example snippet of Julia code looks like that shows (1) overloading in one example, and (2) overriding in the other.
The reason I'm asking is because I am wondering how Julia solves the problem of having both super and function overloading, because both require defining the function again and it seems you would have to flag it with some metadata or something to say "in this case I am overriding" or "in this case I am overloading".
Note: If that was not an example of overloading, then (from Wikipedia) this was what I was imagining Julia supported (along these lines):
// volume of a cylinder
double volume(const double r, const int h)
{
    return 3.1415926*r*r*static_cast<double>(h);
}
// volume of a cuboid
long volume(const long l, const int b, const int h)
{
    return l*b*h;
}
So I'm wondering (a) if that is true, if Julia allows overloading functions like this
Julia allows you to write different versions of the same function (different "methods" for the function) that differ in the type/number of arguments. That's pretty similar to overloading, except that overloading usually means the function to be called is decided based on the compile-time type of the arguments, whereas in Julia it's decided based on the run-time type of the arguments. This is commonly called dynamic dispatch. See this C++ example to see what overloading lacks and dispatch gives you.
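A tiny sketch of that difference (describe is a made-up function; the element types below are only known at run time):
describe(x::Integer)       = "an integer"
describe(x::AbstractFloat) = "a float"

xs = Any[1, 2.5]           # statically typed only as Any
println(describe.(xs))     # the method is picked per element at run time:
                           # prints ["an integer", "a float"]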
(b) if Julia allows something like super in a function, which seems like it would conflict with overloading
The reason I'm asking is because I am wondering how Julia solves the problem of having both super and function overloading, because both require defining the function again and it seems you would have to flag it with some metadata or something to say "in this case I am overriding" or "in this case I am overloading".
I'm not sure why you think overloading will conflict with super. In C++, overriding involves having the exact same argument numbers and types, whereas overloading requires having either the number or the type of arguments be different. Compilers are smart enough to easily distinguish between those two cases, and AFAICT C++ can have a super method despite having both overloading and overriding, except that it also has multiple inheritance. I believe (with my limited C++ knowledge) that multiple inheritance is the reason C++ doesn't have a super method call, not overloading.
Anyway, if you peel back behind the Object-oriented curtain and look into method signatures, you'll see that all overriding is really a particular type of overloading: Dog::run(int dist, int dir) can override Animal::run(int dist, int dir) (assume Dog inherits from Animal), but that's equivalent to overloading a run(Animal a, int dist, int dir) function with a run(Dog d, int dist, int dir) definition. (If run was a virtual function, this would be dynamic dispatch instead of overloading, but that's a separate discussion.)
In Julia we do this explicitly, so the definitions would be run(d::Dog, dist::Int, dir::Int) and run(a::Animal, dist::Int, dir::Int). However, in Julia, you can only inherit from abstract types, so here the supertype Animal would be an abstract type, so you can't really call the second method with an Animal instance - the second method definition is really a shorthand way of saying "call this method for any instance of some concrete subtype of Animal, unless that subtype has its own separate method definition" (which Dog does, in this case). I'm not aware of any easy way of calling the second method run(Animal... from the first run(Dog..., which would be the equivalent of a super call.
(You can also 'override' a method from another module with import, but if it has completely the same parameters and parameter types, you'd probably be committing type piracy, which is usually a bad idea. I'm not aware of any way of getting back the original method after this type of overriding. "Overloading" (using dispatch) by defining and using your own types is much more common anyway.)
(c), what an example snippet of Julia code looks like that shows (1) overloading in one example, and (2) overriding in the other.
The first code snippet you posted is an example of using dispatch (which is what Julia uses instead of overloading). For another example, let's first define our base type and function:
abstract type Vehicle end
function move(v::Vehicle, dist::Float64)
    println("Moving by $dist meters")
end
Now we can create another method of this function for dispatch ("overload" it) this way:
struct LightYears; value::Float64; end   # hypothetical distance type, defined here so the example runs

function move(v::Vehicle, dist::LightYears)
    println("Blazing across $dist light years")
end
We can do an object-oriented style "override" too (though at the language level this is just seen as another method for dispatch):
struct Car <: Vehicle
    model::String
end
function move(c::Car, dist::Float64)
    println("$(c.model) is moving $dist meters")
end
This is the equivalent of overriding Vehicle.move(float dist) in a derived class as Car.move(float dist).
And just for the heck of it, the volume function from the question:
# volume of a cylinder
volume(r::Float64, h::Int) = π*r*r*h
volume(l::Int, b::Int, h::Int) = l*b*h;
Now the correct volume method to call will be decided based on the number (and type) of arguments passed (and the return type is automatically inferred by the compiler, Float64 for the first method and Int for the second one).
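Calling them shows the choice being made per call:
volume(2.0, 3)    # two arguments, Float64 first: the cylinder method (≈ 37.7)
volume(2, 3, 4)   # three Ints: the cuboid method (24)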

Why are some struct variables declared through a preprocessor macro?

Struct members are normally declared with the language's data types in the header file. Here, most members are declared that way, but for others the data type and name are passed to a preprocessor macro. When should a data type and variable name be sent to a preprocessor macro to declare a member, and why does this code do it?
#define DECLARE_REFERENCE(type, name) \
union { type name; int64_t name##_; }
typedef struct _STRING
{
    int32_t flags;
    int32_t length;
    DECLARE_REFERENCE(char*, identifier);
    DECLARE_REFERENCE(uint8_t*, string);
    DECLARE_REFERENCE(uint8_t*, mask);
    DECLARE_REFERENCE(MATCH*, matches_list_head);
    DECLARE_REFERENCE(MATCH*, matches_list_tail);
    REGEXP re;
} STRING;
Why is this code doing this for declarations? Because, as the body of DECLARE_REFERENCE shows, when a type and name are passed to this macro it does more than just declare a member: it builds something else out of the name as well, for some other purpose. If you only wanted to declare a variable, you wouldn't do this; it does something distinct from simply declaring one variable.
What does it actually do? The unions that the macro declares provide a second name for accessing the same storage as a different type. In this case you can get at the references themselves, or at an unconverted integer representation of their bit pattern (assuming that int64_t is the same size as a pointer on the target, anyway).
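In code, using the STRING typedef above, that second access path looks something like this (a hypothetical helper, not something from the actual program):
/* Read the raw bit pattern of a reference member through the union's second name. */
static int64_t identifier_bits(const STRING *s)
{
    return s->identifier_;   /* same storage as s->identifier, viewed as int64_t */
}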
Using a macro for this potentially serves several purposes I can think of off the bat:
Saves keystrokes
Makes the code more readable - but only to people who already know what the macros mean
If the secondary way of getting at reference data is only used for debugging purposes, it can be disabled easily for a release build, generating compiler errors on any surviving debug code
It enforces the secondary status of the access path, hiding it from people who just want to see what's contained in the struct and its formal interface
Should you do this? No. This does more than just declare variables; it also does something else, and that other thing is clearly specific to the gory internals of the containing program. Without seeing the rest of the program, we may never fully understand everything it does.
When you need to do something specific to the internals of your program, you'll (hopefully) know when it's time to invent your own thing-like-this (most likely never); but don't copy others.
So the overall lesson here is to identify places where people aren't writing in straightforward C, but are coding to their particular application, and to separate those two, and not take quirks from a specific program as guidelines for the language as a whole.
Sometimes it is necessary to have a number of declarations which are guaranteed to have some relationship to each other. Some simple kinds of relationships such as constants that need to be numbered consecutively can be handled using enum declarations, but some applications require more complex relationships that the compiler can't handle directly. For example, one might wish to have a set of enum values and a set of string literals and ensure that they remain in sync with each other. If one declares something like:
#define GENERATE_STATE_ENUM_LIST \
ENUM_LIST_ITEM(STATE_DEFAULT, "Default") \
ENUM_LIST_ITEM(STATE_INIT, "Initializing") \
ENUM_LIST_ITEM(STATE_READY, "Ready") \
ENUM_LIST_ITEM(STATE_SLEEPING, "Sleeping") \
ENUM_LIST_ITEM(STATE_REQ_SYNC, "Starting synchronization") \
// This line should be left blank except for this comment
Then code can use the GENERATE_STATE_ENUM_LIST macro both to declare an enum type and a string array, and ensure that even if items are added or removed from the list each string will match up with its proper enum value. By contrast, if the array and enum declarations were separate, adding a new state to one but not the other could cause the values to get "out of sync".
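For instance, such a list macro is typically consumed like this (this is the general pattern, not necessarily what your particular program does):
#define ENUM_LIST_ITEM(id, str) id,
enum state { GENERATE_STATE_ENUM_LIST };
#undef ENUM_LIST_ITEM

#define ENUM_LIST_ITEM(id, str) str,
static const char *state_names[] = { GENERATE_STATE_ENUM_LIST };
#undef ENUM_LIST_ITEM

/* state_names[STATE_READY] is "Ready", and the enum and the string array stay
   in sync automatically when items are added or removed. */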
I'm not sure what the purpose of the macros is in your particular case, but the pattern can sometimes be a reasonable one. The biggest question is whether it's better to (ab)use the C preprocessor so that such relationships can be expressed in valid-but-ugly C code, or whether it would be better to use some other tool that takes a list of states and generates the appropriate C code from it.

Redundancy in OCaml type declaration (ml/mli)

I'm trying to understand a specific thing about OCaml modules and their compilation:
am I forced to redeclare types already declared in a .mli inside the specific .ml implementations?
Just to give an example:
(* foo.mli *)
type foobar = Bool of bool | Float of float | Int of int
(* foo.ml *)
type baz = foobar option
This, according to my normal way of thinking about interfaces/implementations, should be OK, but it says:
Error: Unbound type constructor foobar
while trying to compile with
ocamlc -c foo.mli
ocamlc -c foo.ml
Of course the error disappears if I declare foobar inside foo.ml too, but that seems clumsy, since I have to keep the two in sync on every change.
Is there a way to avoid this redundancy, or am I forced to redeclare types every time?
Thanks in advance
OCaml tries to force you to separate the interface (.mli) from the implementation (.ml). Most of the time, this is a good thing; for values, you publish the type in the interface, and keep the code in the implementation. You could say that OCaml is enforcing a certain amount of abstraction (interfaces must be published; no code in interfaces).
For types, very often, the implementation is the same as the interface: both state that the type has a particular representation (and perhaps that the type declaration is generative). Here, there can be no abstraction, because the implementer doesn't have any information about the type that he doesn't want to publish. (The exception is basically when you declare an abstract type.)
One way to look at it is that the interface already contains enough information to write the implementation. Given the interface type foobar = Bool of bool | Float of float | Int of int, there is only one possible implementation. So don't write an implementation!
A common idiom is to have a module that is dedicated to type declarations, and make it have only a .mli. Since types don't depend on values, this module typically comes in very early in the dependency chain. Most compilation tools cope well with this; for example ocamldep will do the right thing. (This is one advantage over having only a .ml.)
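A sketch of that idiom (file and function names are made up): types.mli exists on its own, with no types.ml, and other modules refer to it by module name.
(* types.mli -- compiled with: ocamlc -c types.mli *)
type foobar = Bool of bool | Float of float | Int of int

(* user.ml -- uses the types through the Types module *)
let describe (x : Types.foobar) =
  match x with
  | Types.Bool _ -> "bool"
  | Types.Float _ -> "float"
  | Types.Int _ -> "int"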
The limitation of this approach is when you also need a few module definitions here and there. (A typical example is defining a type foo, then a module OrderedFoo : Map.OrderedType with type t = foo, then a further type declaration involving 'a Map.Make(OrderedFoo).t.) These can't be put in interface files. Sometimes it's acceptable to break down your definitions into several chunks, first a bunch of types (types1.mli), then a module (mod1.mli and mod1.ml), then more types (types2.mli). Other times (for example if the definitions are recursive) you have to live with either a .ml without a .mli or duplication.
Yes, you are forced to redeclare types. The only ways around it that I know of are
Don't use a .mli file; just expose everything with no interface. Terrible idea.
Use a literate-programming tool or other preprocessor to avoid duplicating the interface declarations in the One True Source. For large projects, we do this in my group.
For small projects, we just duplicate type declarations. And grumble about it.
You can let ocamlc generate the mli file for you from the ml file:
ocamlc -i some.ml > some.mli
In general, yes, you are required to duplicate the types.
You can work around this, however, with Camlp4 and the pa_macro syntax extension (findlib package: camlp4.macro). It defines, among other things, an INCLUDE construct. You can use it to factor the common type definitions out into a separate file and include that file in both the .ml and .mli files. I haven't seen this done in a deployed OCaml project, however, so I don't know that it would qualify as recommended practice, but it is possible.
The literate programming solution, however, is cleaner IMO.
No, in the mli file, just say "type foobar". This will work.
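A minimal sketch of what that looks like; note that it makes the type abstract, so code outside foo.ml can no longer see or match on the constructors and must go through whatever functions the interface exposes:
(* foo.mli *)
type foobar   (* abstract: the constructors are hidden from clients *)

(* foo.ml *)
type foobar = Bool of bool | Float of float | Int of int
type baz = foobar option   (* compiles: foobar is defined right here *)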