How to use Julia string macros with a variable?

Apologies if this has already been answered, but I'm having a surprisingly hard time using the "py" string macro included with the PyCall library when I feed it a variable representing a string instead of a simple string.
Examples:
@py_str "2 + 2"
returns 4
z = "2 + 2"
@py_str z
causes an error saying that interpolate_pycode does not take an Expr argument, as does @py_str ($(z)).
How can I pass @py_str a string variable?
(Just to clarify - above was a toy example, I'm using it for an application where it really is necessary).

@py_str "\$\$z" is your friend (I looked at the macro help using ?@py_str in the REPL). It is better to write the same macro call as py"$$z".
From the REPL help (bolded relevant part):
py".....python code....."
Evaluate the given Python code string in the main Python module.
If the string is a single line (no newlines), then the Python
expression is evaluated and the result is returned. If the string is
multiple lines (contains a newline), then the Python code is compiled
and evaluated in the main Python module and nothing is returned.
If the o option is appended to the string, as in py"..."o, then the
return value is an unconverted PyObject; otherwise, it is
automatically converted to a native Julia type if possible.
Any $var or $(expr) expressions that appear in the Python code
(except in comments or string literals) are evaluated in Julia and
passed to Python via auto-generated global variables. This allows
you to "interpolate" Julia values into Python code.
Similarly, ny $$var or $$(expr) expressions in the Python code are
evaluated in Julia, converted to strings via string, and are pasted
into the Python code. This allows you to evaluate code where the code
itself is generated by a Julia expression.
PS the little typo ny instead of any is in the package source.
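To see why @py_str z fails, it may help to look at what a string macro actually receives. The sketch below does not use PyCall at all; it defines a hypothetical toy macro, kind_str, that only reports the type of the argument handed to it. With a literal the macro gets the String itself, but with a variable it gets the unevaluated name, which is consistent with the help text above: $$var is what asks for the Julia value to be evaluated and pasted into the code.

```julia
# Hypothetical toy string macro (not part of PyCall): it returns the type
# of the argument that the macro itself receives at expansion time.
macro kind_str(s)
    return string(typeof(s))
end

z = "2 + 2"

literal_kind  = kind"2 + 2"    # the macro is handed the literal String
variable_kind = @kind_str z    # the macro is handed the Symbol :z, not z's value

println(literal_kind)
println(variable_kind)
```

Because the macro runs before z has a value, nothing the macro does with a bare z can see "2 + 2"; the value has to be spliced in at run time.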

Related

Single quotes in a variable name in Perl?

I was writing some Perl code in vim and accidentally typed a single quote character in a variable name and noticed that it highlighted it in a different color than normal single quoted strings.
I thought that was odd, so I wrote a small test program (shown above) and tried to run it to see how Perl would handle it and I got this error:
"my" variable $var::with::apostrophes can't be in a package
What exactly is going on here? Are there situations where single quotes in variable names are actually valid? If so, what meaning do single quotes have when used in this context?
The single quote is the namespace separator used in Perl 4, replaced by the double colon :: in Perl 5. Because Perl is mostly backwards compatible, this still works. It's great for golfing, but not much else.
Here's an article about it on perl.com that doesn't explain it.

Stata and global variables

I am working with Stata.
I have a variable called graduate_secondary.
I generate a global variable called outcome, because eventually I will use another outcome.
Now I want to replace the variable graduate if a condition involving the global is met, but I get an error. My code is:
global outcome "graduate_secondary"
gen graduate=.
replace graduate=1 if graduate_primary==1 & `outcome'==1
But I receive the error "==1 invalid name".
Does anyone know why?
Something along these lines might work (using a reproducible example):
sysuse auto, clear
global outcome "rep78"
gen graduate=.
replace graduate=1 if mpg==22 & $outcome==3
(2 real changes made)
In your example, just using
replace graduate=1 if graduate_primary==1 & $outcome==1
would work.
Another solution is to replace global outcome "graduate_secondary" with local outcome "graduate_secondary".
Stata has two types of macros: global, which are accessed with a $, and local, which are accessed with a left quote and an apostrophe `' around the name, as you did in your original code.
You get an error message because no local named outcome has a value assigned to it in your workspace. By design, this does not itself produce an error; instead, the reference to the macro evaluates to a blank value, so the condition ends in & ==1, which Stata rejects as an invalid name. You can see the result of evaluating macro references by using display as follows. You can also see all of the macros in your workspace with macro dir (the locals start with an underscore):
display `outcome'
display $outcome
Here is a blog post about using macros in Stata. In general, I only use global macros when I have to pass something between multiple routines, but this seems like a good use case for locals.

In what sense are languages like Elixir and Julia homoiconic?

Homoiconicity in Lisp is easy to see:
(+ 1 2)
is both the function call to + with 1, 2 as arguments, as well as being a list containing +, 1, and 2. It is simultaneously both code and data.
In a language like Julia, though:
1 + 2
I know we can parse this into an Expr in Julia:
:(1 + 2)
And then we can get the AST and manipulate it:
julia> Meta.show_sexpr(:(1+2))
(:call, :+, 1, 2)
So, we can manipulate a program's AST in Julia (and Elixir). But are they homoiconic in the same sense as Lisp: is any snippet of code really just a data structure in the language itself?
I don't see how code like 1 + 2 in Julia is, immediately, data, the way (+ 1 2) in Lisp is just a list. Is it still homoiconic, then?
In the words of Bill Clinton, "It depends upon what the meaning of the word 'is' is". Well, ok, not really, but it does depend on what the meaning of the word "homoiconic" is. This term is sufficiently controversial that we no longer say that Julia is homoiconic – so you can decide for yourself whether it qualifies. Instead of trying to define homoiconicity, I'll quote what Kent Pitman (who knows a thing or two about Lisp) said in a Slashdot interview back in 2001:
I like Lisp's willingness to represent itself. People often explain this as its ability to represent itself, but I think that's wrong. Most languages are capable of representing themselves, but they simply don't have the will to. Lisp programs are represented by lists and programmers are aware of that. It wouldn't matter if it had been arrays. It does matter that it's program structure that is represented, and not character syntax, but beyond that the choice is pretty arbitrary. It's not important that the representation be the Right® choice. It's just important that it be a common, agreed-upon choice so that there can be a rich community of program-manipulating programs that "do trade" in this common representation.
He doesn't define homoiconicity either – he probably doesn't want to get into a definitional argument any more than I do. But he cuts to the heart of the matter: how willing is a language to represent itself? Lisp is willing in the extreme – you can't even avoid it: the representation of the program as data is just sitting right there, staring you in the face. Julia doesn't use S-expression syntax, so the representation of code as data is less obvious, but it's not hidden very deep:
julia> ex = :(2a + b + 1)
:(2a + b + 1)
julia> dump(ex)
Expr
  head: Symbol call
  args: Array(Any,(4,))
    1: Symbol +
    2: Expr
      head: Symbol call
      args: Array(Any,(3,))
        1: Symbol *
        2: Int64 2
        3: Symbol a
      typ: Any
    3: Symbol b
    4: Int64 1
  typ: Any
julia> Meta.show_sexpr(ex)
(:call, :+, (:call, :*, 2, :a), :b, 1)
julia> ex.args[3]
:b
julia> ex.args[3] = :(3b)
:(3b)
julia> ex
:(2a + 3b + 1)
Julia code is represented by the Expr type (and symbols and atoms), and while the correspondence between the surface syntax and the structure is less immediately obvious, it's still there. And more importantly, people know that code is simply data which can be generated and manipulated, so there is a "rich community of program-manipulating programs", as KMP put it.
This is not just a superficial presentation of Julia code as a data structure: this is how Julia represents its code to itself. When you enter an expression in the REPL, it is parsed into Expr objects. Those Expr objects are then passed to eval, which "lowers" them to somewhat more regular Expr objects, which are then passed to type inference, all implemented in Julia. The key point is that the compiler uses the exact same representation of code that you see. The situation is not that different in Lisp. When you look at Lisp code, you don't actually see list objects; those only exist in the computer's memory. What you see is a textual representation of list literals, which the Lisp interpreter parses and turns into list objects which it then evals, just like Julia. Julia's syntax can be seen as a textual representation for Expr literals; the Expr just happens to be a somewhat less general data structure than a list.
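A minimal sketch of that text-to-Expr-to-eval path, using only Base functions (Meta.parse and eval). The variable names and the values bound to a and b are illustrative choices, not part of the original answer:

```julia
# Text is parsed into an Expr, the same structure shown by dump above.
src = "2a + b + 1"
ex = Meta.parse(src)

# The Expr is ordinary data: replace the third argument of the + call.
ex.args[3] = :(3b)

# Evaluate the modified expression with concrete (illustrative) values.
a, b = 1, 2
value = eval(ex)    # evaluates 2a + 3b + 1 in the current module

println(ex)
println(value)
```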
I don't know the details, but I suspect that Elixir is similar – maybe José will chime in.
Update (2019)
Having thought about this more for the past 4+ years, I think the key difference between Lisp and Julia is this:
In Lisp, the syntax for code is the same as the syntax for the data structure that is used to represent that code.
In Julia, the syntax for code is quite different from the syntax for the data structure that represents that code.
Why does this matter? On the pro-Julia side, people like special syntax for things and often find S-expression syntax inconvenient or unpleasant. On the pro-Lisp side, it's much easier to figure out how to do metaprogramming correctly when the syntax of the data structure you're trying to generate (to represent code) is the same as the syntax of the code that you would normally write. This is why one of the best pieces of advice when people are trying to write macros in Julia is to do the following:
1. Write an example of the kind of code you want your macro to generate.
2. Call Meta.@dump on that code to see it as a data structure.
3. Write code to generate that data structure; this is your macro.
In Lisp, you don't have to do step 2 because syntax for the code is already the same as the syntax for the data structure. There are the quasiquoting (in Lisp speak) quote ... end and :(...) constructs in Julia, which allow you to construct the data structures using code syntax, but that's still not as direct as having them use the same syntax in the first place.
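The three steps can be sketched on a deliberately tiny case. The macro name square and the target expression a * a are hypothetical, chosen only to keep the structure small enough to read at a glance:

```julia
# Step 1: an example of the code the macro should generate.
target = :(a * a)

# Step 2: look at that code as a data structure.
# dump(target) or Meta.show_sexpr(target) would print it; here the
# fields are read directly.
head = target.head    # the kind of expression
args = target.args    # the function symbol followed by its arguments

# Step 3: write code that builds the same structure from the macro argument.
macro square(x)
    # esc keeps the caller's expression from being renamed by macro hygiene
    return Expr(:call, :*, esc(x), esc(x))
end

a = 7
result = @square a
println(result)
```

Since the argument is spliced in twice, something like @square f() would call f twice; a more careful macro would bind the value to a temporary first.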
See also:
https://docs.julialang.org/en/v1/manual/metaprogramming/
What is a "symbol" in Julia?

Why do string macros in Julia use ...?

I was looking at the source for the r_str macro in Julia, which parses r"text" into Regex("text"). The second argument is flags..., which passes flags into the regex, like i for case insensitive, and so on.
I was playing with this myself and got:
julia> macro a_str(p, flags...)
           print(flags)
           p
       end
julia> a"abc"iii
("iii",)"abc"
So it seems that the iii is all passed in as the first flag. In that case, why is there the ... on flags? Is it possible to pass in more than one element of flags to the macro?
When this question was originally asked, a macro expander (i.e. the function defined with the macro keyword, which is called to transform the expressions passed to a macro into a single output expression) was not a generic function but an anonymous function, which was a different kind of function in Julia 0.4 and earlier. At that point, the only way to write an anonymous function signature that could work for either one or two arguments was to use a trailing varargs argument, which is why this pattern was used to define string macros. In Julia 0.5, all functions became generic functions, including anonymous functions and macro expanders. Thus, you can now write a macro in a variety of ways, including the old way of using a varargs argument after the string argument:
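# The three definitions below are alternatives. They use the
# pattern => replacement form of replace from Julia 1.x; Julia 0.5
# spelled the same call as replace(raw, collect(remove), "").

# old style
macro rm_str(raw, rest...)
    remove = isempty(rest) ? "aeiouy" : rest[1]
    replace(raw, collect(remove) => "")
end

# new style with two methods
macro rm_str(raw)
    replace(raw, ['a','e','i','o','u','y'] => "")
end
macro rm_str(raw, remove)
    replace(raw, collect(remove) => "")
end

# new style with default second argument
macro rm_str(raw, remove="aeiouy")
    replace(raw, collect(remove) => "")
end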
These all result in the same non-standard string literal behavior:
julia> rm"foo bar baz"
"f br bz"
julia> rm"foo bar baz"abc
"foo r z"
The string literal produces the string with the flagged letters stripped from it, defaulting to stripping out all the ASCII vowels ("aeiouy"). The new approach of using a second argument with a default is the easiest and clearest in this case, as it will be in many cases, but now you can use whichever approach is best for the circumstances.
With an explicit call like
@a_str("abc", "iii", "jjj")
you can pass multiple flags. But I'm not aware of a way to make this work with a"abc"ijk syntax.
I don't believe it is possible, and the documentation doesn't provide an example where that would be used. In addition, the mostly-fully-compliant JuliaParser.jl doesn't support multiple flags either. Perhaps open a PR on Julia changing that?
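The difference between the suffix syntax and an explicit macro call can be checked with a small sketch. The macro name flags_str is hypothetical; it simply returns whatever trailing arguments it was given as a tuple:

```julia
# Hypothetical string macro that returns its trailing arguments as a tuple.
macro flags_str(p, rest...)
    # Build a tuple expression out of the string arguments that were received.
    return Expr(:tuple, rest...)
end

no_flags  = flags"abc"                     # no suffix: nothing after the string
one_flag  = flags"abc"iii                  # the whole suffix arrives as one string
many_flag = @flags_str("abc", "i", "j")    # explicit call: one argument per flag

println(no_flags)
println(one_flag)
println(many_flag)
```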

Can the C preprocessor perform simple string manipulation?

This is a C macro weirdness question.
Is it possible to write a macro that takes a string constant X ("...") as an argument and evaluates to a string Y of the same length such that each character of Y is a [constant] arithmetic expression of the corresponding character of X?
This is not possible, right?
No, the C preprocessor considers string literals to be a single token and therefore it cannot perform any such manipulation.
What you are asking for should be done in actual C code. If you are worried about runtime performance and wish to delegate this fixed task at compile time, modern optimising compilers should successfully deal with code like this - they can unroll any loops and pre-compute any fixed expressions, while taking code size and CPU cache use patterns into account, which the preprocessor has no idea about.
On the other hand, you may want your code to include such a modified string literal, but do not want or need the original - e.g. you want to have obfuscated text that your program will decode and you do not want to have the original strings in your executable. In that case, you can use some build-system scripting to do that by, for example, using another C program to produce the modified strings and defining them as macros in the C compiler command line for your actual program.
As already said by others, the preprocessor sees entire strings as tokens. There is only one exception: the _Pragma operator, which takes a string as its argument and tokenizes its contents to pass them to a #pragma directive.
So unless you're targeting a _Pragma, the only way to do things in the preprocessing phases is to have them written as token sequences, manipulate them, and stringify them at the end.