Why do string macros in Julia use ...?

I was looking at the source for the r_str macro in Julia, which parses r"text" into Regex("text"). The second argument is flags..., which passes flags into the regex, like i for case insensitive, and so on.
I was playing with this myself and got:
julia> macro a_str(p, flags...)
           print(flags)
           p
       end
julia> a"abc"iii
("iii",)"abc"
So it seems that the iii is all passed in as the first flag. In that case, why is there the ... on the flags? Is it possible to pass more than one element of flags to the macro?

When this question was originally asked, a macro expander (i.e. the function defined with the macro keyword, which is called to transform the expressions passed to a macro into a single output expression) was not a generic function but an anonymous function, which was a different kind of function in Julia 0.4 and earlier. At that point, the only way to write an anonymous function signature that could work for either one or two arguments was to use a trailing varargs argument, which is why this pattern was used to define string macros. In Julia 0.5 all functions have become generic functions, including anonymous functions and macro expanders. Thus, you can now write a macro in a variety of ways, including the old way of using a varargs argument after the string argument:
# old style
macro rm_str(raw, rest...)
    remove = isempty(rest) ? "aeiouy" : rest[1]
    replace(raw, collect(remove), "")
end
# new style with two methods
macro rm_str(raw)
    replace(raw, ['a','e','i','o','u','y'], "")
end
macro rm_str(raw, remove)
    replace(raw, collect(remove), "")
end
# new style with default second argument
macro rm_str(raw, remove="aeiouy")
    replace(raw, collect(remove), "")
end
These all result in the same non-standard string literal behavior:
julia> rm"foo bar baz"
"f br bz"
julia> rm"foo bar baz"abc
"foo r z"
The string literal produces the string with the flagged letters stripped from it, defaulting to stripping out all the ASCII vowels ("aeiouy"). The new approach of using a second argument with a default is the easiest and clearest in this case, as it will be in many cases, but now you can use whichever approach is best for the circumstances.

With an explicit call like
@a_str("abc", "iii", "jjj")
you can pass multiple flags. But I'm not aware of a way to make this work with a"abc"ijk syntax.

I don't believe it is possible, and the documentation doesn't provide an example where that would be used. In addition, the mostly-fully-compliant JuliaParser.jl doesn't support multiple flags either. Perhaps open a PR on Julia changing that?

Related

Regex in SV or UVM

What functions do I need to call to use Regular Expressions in Systemverilog/UVM?
Note: I'm not asking how to use regular expressions, just method names.
First, if you want to use regular expressions, you'll need to make sure you're using a UVM library compiled together with its DPI code (i.e. the UVM_NO_DPI define isn't set).
The methods you want to use are located in dpi/uvm_regex.svh. The main function is uvm_re_match(...), which takes as arguments a regular expression and the string to match against. This is basically a wrapper around the regexec(...) C function found in the regex.h library. Note that it returns 0 on a match.
Another function you might want to use is uvm_glob_to_re(...) which can convert from a glob expression (the kind you get in a Linux shell) to a true regular expression.
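To illustrate the two ideas above (a match function that returns 0 on success, and a glob-to-regex conversion), here is a minimal sketch in Python rather than SystemVerilog. The names re_match and glob_to_re are made up for this example and are not the UVM functions; Python's re module is also not the POSIX regex library, so treat this only as a picture of the calling convention.

```python
import fnmatch
import re


def re_match(pattern, text):
    """Hypothetical stand-in: return 0 if pattern matches anywhere in text,
    else 1. This mirrors the regexec-style "0 means match" convention."""
    return 0 if re.search(pattern, text) else 1


def glob_to_re(glob):
    """Hypothetical stand-in: turn a shell-style glob into a regular expression.
    fnmatch.translate does the conversion; the leading \\A anchor is this
    sketch's own choice so that the whole string has to match."""
    return r"\A" + fnmatch.translate(glob)


if __name__ == "__main__":
    # Remember the inverted convention: 0 means "matched".
    if re_match(glob_to_re("uvm_*_top"), "uvm_test_top") == 0:
        print("matched")
```

The thing to watch for is the inverted truth value: code like "if (uvm_re_match(...))" reads naturally but takes the branch when there is no match.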

what is the difference between 'define as' to 'define as computed' in specman?

The difference between the two is not so clear from the Cadence documentation.
Could someone please elaborate on the difference between the two?
A define as macro is just a plain old macro that you probably know from other programming languages. It just means that at some select locations in the macro code you can substitute your own code.
A define as computed macro allows you to construct your output code programmatically, by using control flow statements (if, for, etc.). It acts kind of like a function that returns a string, with the return value being the code that will be inserted in its place by the pre-processor.
With both define as and define as computed macros you define a new syntactic construct of a given syntactic category (for example, <statement> or <action>), and you implement the replacement code that replaces a construct matching the macro match expression (or pattern).
In both cases the macro match expression can have syntactic arguments that are used inside the replacement code and are substituted with the actual code strings used in the matched code.
The difference is that with a define as macro the replacement code is just written in the macro body.
With a define as computed macro you write procedural code that computes the desired replacement code text and returns it as a string. It's effectively a method that returns a string; you can even use the result keyword to assign the resulting string, just like in any e method.
A define as computed macro is useful when the replacement code is not fixed, and can be different depending on the exact macro argument values or even semantic context (for example, in some cases a reflection query can be used to decide on the exact replacement code).
(But it's important to remember that even define as computed macros are executed during compilation and not at run time, so they cannot query actual run time values of fields or variables to decide on the resulting replacement code).
Here are some important differences between the two macro kinds.
A define as macro is more readable and usually easier to write. You just write down the code that you want to be created.
Define as computed macros are more powerful. Everything that can be implemented with define as can also be implemented with define as computed, but not vice versa. When the replacement code is not fixed, define as is not sufficient.
A define as macro can be used immediately after its definition. If the construct it introduces is used in the statement just following the macro, it will already be matched. A define as computed macro can only be used in the next file, and is not usable in the same file in which the macro is defined.
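As a loose analogy only (this is Python, not e, and the generated text is illustrative rather than checked e syntax), the difference can be pictured as a fixed template with holes versus a function that builds the replacement text with ordinary control flow. All names here are invented for the sketch.

```python
from string import Template

# "define as" analogue: the replacement code is written down once,
# and the macro arguments are simply substituted into it.
SWAP_TEMPLATE = Template("var tmp := $a; $a = $b; $b = tmp;")


def expand_fixed(a, b):
    """Substitute the arguments into the fixed replacement text."""
    return SWAP_TEMPLATE.substitute(a=a, b=b)


# "define as computed" analogue: procedural code computes the replacement
# text, so its shape can depend on the argument values.
def expand_computed(name, count):
    """Return one declaration line per index; the amount of generated code
    depends on count, which a fixed template cannot express."""
    lines = []
    for i in range(count):
        lines.append(f"var {name}_{i} : uint;")
    return "\n".join(lines)


if __name__ == "__main__":
    print(expand_fixed("x", "y"))
    print(expand_computed("f", 3))
```

In both cases the expansion happens before the program runs, which matches the point that even computed macros cannot see run-time values.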

Is "my" a function in Perl?

I know that my is used to declare a variable local to a block or file. I have always assumed that my is a keyword in Perl. But I was just told that it's actually a function. One of the proofs is that perldoc puts my under the “Functions” section, see http://perldoc.perl.org/functions/my.html.
How does a function do the job of declaring local variables?
my is not a function, it's just clumped together with functions (in perl documentation) because it works like a function.
If you look at perldoc perlfunc, it says:
Here are Perl's functions (including things that look like functions, like some keywords and named operators) arranged by category...
then a bit below that
Keywords related to scoping
caller, import, local, my, our, package, state, use
Specifically, note that the word “keyword” was used there instead of “function”.
So that implies that you would find some non-functions (e.g. keywords) under Perl functions A-Z.
Another way of saying this: if something is listed under “Functions” in perldoc, it is not necessarily a function – it can be a keyword or named operator which acts like a function.
Yes, by Perl's (very unique) definition, my is a function. The opening paragraph of perlfunc defines "function":
The functions in this section can serve as terms in an expression. They fall into two major categories: list operators and named unary operators.
my is a named operator. But it's special in two ways:
In addition to behaving like a function (that allocates a new variable and returns that variable), it has a compile-time effect.
my ... is a unary operator, but it can accept multiple arguments when parens are used.
If, on the other hand, you were to ask whether my is a function by C's definition, then no. my is not a C function. Neither is print, open, chr, etc. Everything in perlfunc is an operator; none of them are functions.
Finally, print, open and chr are far closer to a person's conception of a function than my. To be more precise, few people would consider my to be a function. It's more of a technicality than anything meaningful that it matches perlfunc's definition of function.
See also:
What are perl built-in operators/functions?
Why does this [my] variable keep its value

What is the equivalent of Matlab's default display function which outputs to a file instead of stdout?

I apologize if this question is badly named, but let me explain through a simple analogy to Java. Please note, this question is about Matlab, not Java.
In Java, the standard way to write to the stdout stream is to say System.out.println.
If we want to print this exact output to a file, we can create a PrintStream object myPrinter, point it at a file, and replace the call to System.out.println with the call to myPrinter.println.
System.out.println("Hello World");
PrintStream myPrinter = new PrintStream(new File("MyJournal.log"));
myPrinter.println("Hello Log");
In Matlab, one standard way to write to standard out is to write an expression, variable or statement without a following semicolon. How would I rewrite my Matlab statements to get this (ascii) output into a file instead?
I would like to avoid solutions that redirect stdout to a file, because I still want some things printed to the console.
Additionally, I do not want to have to specify the type of the object that is being written to file, because the standard way does not require type specification.
Update: In particular, as far as I know, the printf family of functions requires type specification (different format string depending on the type of object). I am trying to avoid this because I seek a more generic solution.
What you want to do is find a generic string conversion that looks like display(), use that to build your output as strings, and then you can pass it to fprintf() or other normal output functions.
The Matlab output produced by omitting the semicolon is done by calling display() on the result of the expression. Matlab provides a display implementation for all the builtin types and a fallback implementation for user-supplied objects. The disp() function displays the value in the same way, but omits the variable name.
This is basically the equivalent of Java's toString function, which gets called behind the scenes when you print arbitrary objects to a PrintStream. The display functionality is mostly not in PrintStream itself, but in the polymorphic toString method that knows how to convert objects to display strings. The difference is that Java's toString returns a String data value, which is easy to work with programmatically and compose with other operations, so you can manipulate it and send it where you want, whereas Matlab's display and disp always output to stdout instead of giving you a string back.
What you want is probably something that will give you display-style output as a string, and then you can output it where you want, using fprintf() or another string output function.
One easy approach is to use evalc() to just call display and capture the output. The evalc function lets you run arbitrary Matlab code, and capture the output that would have gone to stdout to a char variable instead. You can make a generic conversion function and apply it to any values you want.
function out = tostr(x)
    out = evalc('disp(x)');
end
The other option is to define a new dispstr() function, write it to convert Matlab built-in types (maybe using the evalc technique), and override dispstr in the objects you write, and call it polymorphically. More work, but it could end up being cleaner and faster, because evalc is slow and sometimes fragile. Also have a look at mat2str for ideas on an alternate output mechanism.
Then you'd be able to output values generically, without specifying the type, with calls like this. Note that in the second form, the placeholders are all just %s regardless of the type of object.
function displaySomeArbitraryDataValues(fh, foo, bar, baz)
    % fh is a file handle previously opened with fopen()
    fprintf(fh, 'foo=%s, bar=%s and baz=%s', tostr(foo), tostr(bar), tostr(baz));
end
Matlab does not have a formatted output function that automatically calls a display conversion, so you'll have to call the conversions explicitly in a lot of cases, and it'll be a little more wordy than Java. (Unless you want to go all out and write your own polymorphic auto-converting output functions or string wrapper objects.) If you want to use placeholders, it's easy enough by forcing all the args to be converted.
function out = sprintf2(format, varargin)
%SPRINTF2 A sprintf equivalent with polymorphic input conversion
% Use '%s' placeholders for everything in format, regardless of arg type
    args = varargin;
    strs = cell(size(args));   % preallocate; the semicolon suppresses echo
    for i = 1:numel(args)
        strs{i} = tostr(args{i});   % convert every argument to display text
    end
    out = sprintf(format, strs{:});
end
If you want to get fancy, you could do mixed printf conversions by parsing the printf format argument yourself, defining a new placeholder like %y that calls your tostr() on the argument and then prints that string (probably by replacing the placeholder with %s and the argument with the tostr() result), and passes all the normal placeholders along to printf with un-converted arguments. This is kind of difficult to implement and can be computationally expensive, though.
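The placeholder-rewriting idea just described can be sketched as follows. This is Python rather than Matlab, chosen only to show the algorithm compactly; printf_y and tostr are hypothetical names, repr() stands in for a display-style conversion, and the format parser is deliberately simplified (no '*' widths or length modifiers).

```python
import re

# One printf-style conversion: either a literal '%%', or
# '%' + optional flags + optional width + optional .precision + a letter.
_SPEC = re.compile(r"%(?:%|[-+ #0]*\d*(?:\.\d+)?([a-zA-Z]))")


def tostr(x):
    """Hypothetical generic display conversion; repr() is a stand-in."""
    return repr(x)


def printf_y(fmt, *args):
    """Format like the % operator, but treat the made-up '%y' placeholder as
    "convert this argument with tostr() and print it as a string"."""
    out_args = []
    pos = 0  # index of the next argument to consume

    def rewrite(match):
        nonlocal pos
        letter = match.group(1)
        if letter is None:            # literal '%%' consumes no argument
            return match.group(0)
        arg = args[pos]
        pos += 1
        if letter == "y":             # swap '%y' for '%s' and pre-convert
            out_args.append(tostr(arg))
            return match.group(0)[:-1] + "s"
        out_args.append(arg)          # normal placeholder: pass through as-is
        return match.group(0)

    new_fmt = _SPEC.sub(rewrite, fmt)
    return new_fmt % tuple(out_args)


if __name__ == "__main__":
    print(printf_y("n=%d, v=%y, name=%s", 3, [1, 2], "x"))
```

Even in this small form you can see where the cost comes from: every call re-parses the format string before the real formatting happens.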
You could make it behave more like Java by providing char conversions that convert object values to display strings, but that's inconsistent with how Matlab's existing numeric to char conversions work, so it wouldn't be fully generic, and you would run into some odd edge cases, so I wouldn't recommend that. You could do a safer form of this by defining a new @string class that wraps char strings, but does implicit conversions by calling your tostr() instead of char(). This could be a big performance hit, though, because Matlab OOP objects are substantially slower than operations with built-in types and plain functions.
For background: basically, the reason you need to pass explicit conversions to fprintf in Matlab, but you don't need to with the PrintStream stuff in Java is that fprintf is inherited from C, which did not have run time type detection, especially of varargs, in the language. Matlab does, so fprintf is more limited than it needs to be. However, what fprintf does give you is fine-grained control over the formatting of numeric values and a concise calling form. These are useful, as evidenced by how Java programmers complained for years about not having them in Java and eventually got them added. It's just that Matlab sticks to the C-supported fprintf conversions, where they could easily add a couple new generic conversions that examine the type of the input at runtime and give you the generic output support you want.

Can the C preprocessor perform simple string manipulation?

This is a C macro weirdness question.
Is it possible to write a macro that takes a string constant X ("...") as its argument and evaluates to a string Y of the same length, such that each character of Y is a [constant] arithmetic expression of the corresponding character of X?
This is not possible, right?
No, the C preprocessor considers string literals to be a single token and therefore it cannot perform any such manipulation.
What you are asking for should be done in actual C code. If you are worried about runtime performance and wish to delegate this fixed task at compile time, modern optimising compilers should successfully deal with code like this - they can unroll any loops and pre-compute any fixed expressions, while taking code size and CPU cache use patterns into account, which the preprocessor has no idea about.
On the other hand, you may want your code to include such a modified string literal, but do not want or need the original - e.g. you want to have obfuscated text that your program will decode and you do not want to have the original strings in your executable. In that case, you can use some build-system scripting to do that by, for example, using another C program to produce the modified strings and defining them as macros in the C compiler command line for your actual program.
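As a rough sketch of that build-step approach: the answer suggests another C program, and Python is swapped in here only for brevity. The XOR transform, the key value and every name below are assumptions made for illustration. The script emits a #define whose string literal is the transformed text, which could be written to a generated header or adapted into a -D option.

```python
def transform(text, key=0x2A):
    """Apply a fixed per-character arithmetic operation (here XOR with key).
    Caveat: a character equal to key becomes a 0 byte, which C code would
    treat as the end of the string."""
    return bytes(b ^ key for b in text.encode("ascii"))


def c_literal(data):
    """Render bytes as a C string literal using only \\xNN escapes, so no
    byte can be misread as a quote, a backslash, or part of the previous escape."""
    return '"' + "".join(f"\\x{b:02x}" for b in data) + '"'


def define_line(name, text, key=0x2A):
    """Build the '#define NAME "..."' line for a generated header."""
    return f"#define {name} {c_literal(transform(text, key))}"


if __name__ == "__main__":
    # Example: redirect this output into a generated header before compiling.
    print(define_line("SECRET_MSG", "hello"))
```

The output has the same length as the input, and the C program can undo the transform at run time by applying the same XOR to each character.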
As already said by others, the preprocessor sees entire strings as tokens. There is only one exception: the _Pragma operator, which takes a string as its argument and tokenizes its contents to pass them to a #pragma directive.
So unless you're targeting a _Pragma, the only way to do things in the preprocessing phases is to have them written as token sequences, manipulate them, and stringify them at the end.