Parsing complex function declarations using tmlanguage - visual-studio-code

I'm writing a VS Code extension for my programming language. I'm having difficulty coming up with a tmLanguage rule to parse my function declarations.
A function can look like this:
def macro_1 macro_2(foo, 20) function1(a: int, b: double) -> int {
}
macro_1 and macro_2 would be keywords, while function1 should be a function name.
The parser looks for matching parentheses and then looks ahead to see whether they are followed by an identifier, in which case the preceding name is assumed to be a macro invocation. There can be any number of macro invocations on a function. Do note that lines can be continued with a backslash, but I'm not trying to implement that yet.
My first idea was to use a single rule to parse both macro invocations and the function name with parameters, but the problem with this is that the macro invocation can have any expression as an argument while a function parameter can only be an identifier followed by colon and a type.
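To make the lookahead rule described above concrete, here is a minimal Python sketch of the classification (it is not a tmLanguage grammar). The names `classify_declaration` and `skip_balanced_parens` are hypothetical, and the sketch assumes ASCII identifiers, ignores backslash continuations, and does not handle parentheses inside string literals.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")


def skip_balanced_parens(text, pos):
    """Return the index just past the ')' that matches the '(' at text[pos]."""
    depth = 0
    for i in range(pos, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i + 1
    raise ValueError("unbalanced parentheses")


def skip_spaces(text, pos):
    """Return the index of the first non-whitespace character at or after pos."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def classify_declaration(header):
    """Split a 'def ...' header into (macro names, function name)."""
    pos = header.index("def") + len("def")
    names = []
    while True:
        pos = skip_spaces(header, pos)
        match = IDENT.match(header, pos)
        if match is None:
            break  # reached '->', '{', or the end of the input
        names.append(match.group())
        pos = skip_spaces(header, match.end())
        # An optional parenthesised group belongs to the name just read.
        if pos < len(header) and header[pos] == "(":
            pos = skip_balanced_parens(header, pos)
    if not names:
        raise ValueError("no function name found")
    # Any name followed by another identifier is a macro; the last one is the function.
    return names[:-1], names[-1]
```

Calling `classify_declaration` on the header from the question returns the two macro names and `function1`. The point of the sketch is that the decision depends on what follows the closing parenthesis, which is why a single regex-based rule for both cases is awkward.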

Related

In a C++ function definition, are parentheses operators or separators/punctuators?

In this code:
void something () { /*something*/ }
are the () separators or operators?
As far as I know, in a function call the () are operators:
something();
But in a function definition it would be a bit odd to have an operator: an operator is in fact a function, so would there be a function inside a function definition?
Can somebody explain this topic? What exactly are separators/punctuators? Are they tokens that let the compiler tell parts of the code apart, for example two statements?
/*statement1*/;
/*statement2*/;
The ;s separate the statements from each other.
So are they atomic syntactic elements that the compiler uses to understand the source code?
Depends on the context.
() in C++ can fulfill both definitions at once (operator and separator), or only one at a time.
It is an operator, since () is literally defined as the function call operator in the language spec. Since an overloaded operator is still an operator, this is independent of the number of arguments being passed to it (zero, or several).
A separator in terms of (programming) languages is usually defined as one or two tokens that separate some language features from other language features. This is the case when you pass parameters to a function when it is called, since the brackets separate the function name from the function arguments. This is not the case if no argument is passed during the function call, as there is nothing to separate. In this case, () would act as an operator, but not as a separator.
I also almost forgot to mention the fact that round brackets are also used in arithmetic to denote precedence (acting as separators, not operators).
Another example of () acting as operator, but not as a separator, would be a cast.

How to use julia string macros with a variable?

Apologies if this has already been answered, but I'm having a surprisingly hard time using the "py" string macro included with the PyCall library when I feed it a variable representing a string instead of a simple string.
Examples:
@py_str "2 + 2"
returns 4
z = "2 + 2"
@py_str z
causes an error that interpolate_pycode does not take an Expr argument, as does @py_str ($(z)).
How can I pass @py_str a string variable?
(Just to clarify - above was a toy example, I'm using it for an application where it really is necessary).
@py_str "\$\$z" is your friend (I looked at the macro help using ?@py_str in the REPL). It is better to write the same macro as py"$$z".
From the REPL help (bolded relevant part):
py".....python code....."
Evaluate the given Python code string in the main Python module.
If the string is a single line (no newlines), then the Python
expression is evaluated and the result is returned. If the string is
multiple lines (contains a newline), then the Python code is compiled
and evaluated in the main Python module and nothing is returned.
If the o option is appended to the string, as in py"..."o, then the
return value is an unconverted PyObject; otherwise, it is
automatically converted to a native Julia type if possible.
Any $var or $(expr) expressions that appear in the Python code
(except in comments or string literals) are evaluated in Julia and
passed to Python via auto-generated global variables. This allows
you to "interpolate" Julia values into Python code.
Similarly, ny $$var or $$(expr) expressions in the Python code are
evaluated in Julia, converted to strings via string, and are pasted
into the Python code. This allows you to evaluate code where the code
itself is generated by a Julia expression.
PS the little typo ny instead of any is in the package source.

Why do string macros in Julia use ...?

I was looking at the source for the r_str macro in Julia, which parses r"text" into Regex("text"). The second argument is flags..., which passes flags into the regex, like i for case insensitive, and so on.
I was playing with this myself and got:
julia> macro a_str(p, flags...)
           print(flags)
           p
       end
julia> a"abc"iii
("iii",)"abc"
So it seems that the iii is all passed in as the first flag. In that case, why is there a ... on flags? Is it possible to pass in more than one element of flags to the macro?
When this question was originally asked, a macro expander – i.e. the function defined with the macro keyword, which is called to transform the expressions passed to a macro into a single output expression – was not a generic function, but rather an anonymous function, which were a different kind of function in Julia 0.4 and earlier. At that point, the only way to write an anonymous function signature which could work for either one or two arguments was to use a trailing varargs argument, which is why this pattern was used to define string macros. In Julia 0.5 all functions have become generic functions, including anonymous functions and macro expanders. Thus, you can now write a macro a variety of ways, including the old way of using a varargs argument after the string argument:
# old style
macro rm_str(raw, rest...)
    remove = isempty(rest) ? "aeiouy" : rest[1]
    replace(raw, collect(remove), "")
end
# new style with two methods
macro rm_str(raw)
    replace(raw, ['a','e','i','o','u','y'], "")
end
macro rm_str(raw, remove)
    replace(raw, collect(remove), "")
end
# new style with default second argument
macro rm_str(raw, remove="aeiouy")
    replace(raw, collect(remove), "")
end
These all result in the same non-standard string literal behavior:
julia> rm"foo bar baz"
"f br bz"
julia> rm"foo bar baz"abc
"foo r z"
The string literal produces the string with the flagged letters stripped from it, defaulting to stripping out all the ASCII vowels ("aeiouy"). The new approach of using a second argument with a default is the easiest and clearest in this case, as it will be in many cases, but now you can use whichever approach is best for the circumstances.
With an explicit call like
@a_str("abc", "iii", "jjj")
you can pass multiple flags. But I'm not aware of a way to make this work with a"abc"ijk syntax.
I don't believe it is possible, and the documentation doesn't provide an example where that would be used. In addition, the mostly-fully-compliant JuliaParser.jl doesn't support multiple flags either. Perhaps open a PR on Julia changing that?

What is the difference between 'define as' and 'define as computed' in Specman?

The difference between the two is not so clear from the Cadence documentation.
Could someone please elaborate on the difference between the two?
A define as macro is just a plain old macro that you probably know from other programming languages. It just means that at some select locations in the macro code you can substitute your own code.
A define as computed macro allows you to construct your output code programmatically, by using control flow statements (if, for, etc.). It acts kind of like a function that returns a string, with the return value being the code that will be inserted in its place by the pre-processor.
With both define as and define as computed macros you define a new syntactic construct of a given syntactic category (for example, <statement> or <action>), and you implement the replacement code that replaces a construct matching the macro match expression (or pattern).
In both cases the macro match expression can have syntactic arguments that are used inside the replacement code and are substituted with the actual code strings used in the matched code.
The difference is that with a define as macro the replacement code is just written in the macro body.
With a define as computed macro you write procedural code that computes the desired replacement code text and returns it as a string. It's effectively a method that returns a string; you can even use the result keyword to assign the resulting string, just like in any e method.
A define as computed macro is useful when the replacement code is not fixed, and can be different depending on the exact macro argument values or even semantic context (for example, in some cases a reflection query can be used to decide on the exact replacement code).
(But it's important to remember that even define as computed macros are executed during compilation and not at run time, so they cannot query actual run time values of fields or variables to decide on the resulting replacement code).
Here are some important differences between the two macro kinds.
A define as macro is more readable and usually easier to write. You just write down the code that you want to be created.
Define as computed macros are stronger. Everything that can be implemented with define as, can also be implemented with define as computed, but not vice versa. When the replacement code is not fixed, define as is not sufficient.
A define as macro can be used immediately after its definition. If the construct it introduces is used in the statement just following the macro, it will already be matched. A define as computed macro can only be used in the next file, and is not usable in the same file in which the macro is defined.
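As a rough analogy only, written in Python rather than e, the two kinds correspond to a fixed text template versus a function that computes the replacement text. All names here (`expand_fixed`, `expand_computed`) and the generated statements are made up for illustration and are not real Specman syntax.

```python
from string import Template

# "define as" analogue: the replacement text is fixed, and the matched
# arguments are simply substituted into its slots.
FIXED_BODY = Template("log($msg); count = count + 1;")


def expand_fixed(msg):
    return FIXED_BODY.substitute(msg=msg)


# "define as computed" analogue: ordinary procedural code decides what
# replacement text to return, so the result can vary with the arguments.
def expand_computed(msg, repeat):
    if repeat <= 0:
        return ""  # nothing to emit for this argument value
    statements = [f"log({msg});" for _ in range(repeat)]
    statements.append(f"count = count + {repeat};")
    return " ".join(statements)
```

The first form is easier to read because the output is written down literally; the second can emit a different number of statements depending on its arguments, which a fixed template cannot do.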

What is the equivalent of Matlab's default display function which outputs to a file instead of stdout?

I apologize if this question is badly named, but let me explain through a simple analogy to Java. Please note, this question is about Matlab, not Java.
In Java, the standard way to write to the stdout stream is to say System.out.println.
If we want to print this exact output to a file, we can create a PrintStream object myPrinter, point it at a file, and replace the call to System.out.println with the call to myPrinter.println.
System.out.println("Hello World");
PrintStream myPrinter = new PrintStream(new File("MyJournal.log"));
myPrinter.println("Hello Log");
In Matlab, one standard way to write to standard out is to write an expression, variable or statement without a following semicolon. How would I rewrite my Matlab statements to get this (ascii) output into a file instead?
I would like to avoid solutions that redirect stdout to a file, because I still want some things printed to the console.
Additionally, I do not want to have to specify the type of the object that is being written to file, because the standard way does not require type specification.
Update: In particular, as far as I know, the printf family of functions requires type specification (different format string depending on the type of object). I am trying to avoid this because I seek a more generic solution.
What you want to do is find a generic string conversion that looks like display(), use that to build your output as strings, and then you can pass it to fprintf() or other normal output functions.
The Matlab output produced by omitting the semicolon is done by calling display() on the result of the expression. Matlab provides a display implementation for all the builtin types and a fallback implementation for user-supplied objects. The disp() function displays the value in the same way, but omits the variable name.
This is basically the equivalent of Java's toString function, which is called behind the scenes when you print arbitrary objects to a PrintStream. The display functionality is mostly not in PrintStream itself, but in the polymorphic toString method that knows how to convert objects to display strings. The difference is that Java's toString returns a String data value, which is easy to work with programmatically and compose with other operations, so you can manipulate it and send it where you want, but Matlab's display and disp always output to stdout instead of giving you a string back.
What you want is probably something that will give you display-style output as a string, and then you can output it where you want, using fprintf() or another string output function.
One easy approach is to use evalc() to just call display and capture the output. The evalc function lets you run arbitrary Matlab code, and capture the output that would have gone to stdout to a char variable instead. You can make a generic conversion function and apply it to any values you want.
function out = tostr(x)
    out = evalc('disp(x)');
end
The other option is to define a new dispstr() function, write it to convert Matlab built-in types (maybe using the evalc technique), and override dispstr in the objects you write, and call it polymorphically. More work, but it could end up being cleaner and faster, because evalc is slow and sometimes fragile. Also have a look at mat2str for ideas on an alternate output mechanism.
Then you'd be able to output values generically, without specifying the type, with calls like this. Note that in the second form, the placeholders are all just %s regardless of the type of object.
function displaySomeArbitraryDataValues(fh, foo, bar, baz)
    % fh is a file handle previously opened with fopen()
    % Each argument is converted with tostr(), so every placeholder is %s.
    fprintf(fh, 'foo=%s, bar=%s and baz=%s', tostr(foo), tostr(bar), tostr(baz));
end
Matlab does not have a formatted output function that automatically calls a display conversion, so you'll have to call the conversions explicitly in a lot of cases, and it'll be a little more wordy than Java. (Unless you want to go all out and write your own polymorphic auto-converting output functions or string wrapper objects.) If you want to use placeholders, it's easy enough by forcing all the args to be converted.
function out = sprintf2(format, varargin)
    %SPRINTF2 A sprintf equivalent with polymorphic input conversion
    % Use '%s' placeholders for everything in format, regardless of arg type
    args = varargin;
    strs = cell(size(args));  % preallocate; the semicolon suppresses echo
    for i = 1:numel(args)
        strs{i} = tostr(args{i});
    end
    out = sprintf(format, strs{:});
end
If you want to get fancy, you could do mixed printf conversions by parsing the printf format argument yourself, defining a new placeholder like %y that calls your tostr() on the argument and then prints that string (probably by replacing the placeholder with %s and the argument with the tostr() result), and passes all the normal placeholders along to printf with un-converted arguments. This is kind of difficult to implement and can be computationally expensive, though.
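The placeholder-rewriting idea can be sketched as follows. This is in Python rather than Matlab, purely to show the mechanics: `sprintf_y` is a hypothetical name, `tostr` here is a stand-in that uses `repr`, and the regular expression only covers simple specifiers (flags, width, precision), not `*` widths or named fields.

```python
import re

# Matches '%%' or a simple conversion such as '%d', '%-8.3f', '%s', '%y'.
SPEC = re.compile(r"%(?:%|[-+ #0]*\d*(?:\.\d+)?[a-zA-Z])")


def tostr(value):
    """Stand-in for a generic display conversion."""
    return repr(value)


def sprintf_y(fmt, *args):
    """Format like '%' formatting, plus a '%y' placeholder that accepts any value."""
    remaining = iter(args)
    converted = []

    def rewrite(match):
        spec = match.group()
        if spec == "%%":
            return spec  # literal percent sign consumes no argument
        arg = next(remaining)
        if spec.endswith("y"):
            # Replace the custom placeholder with %s and pre-convert the argument.
            converted.append(tostr(arg))
            return spec[:-1] + "s"
        converted.append(arg)  # normal placeholder: pass the argument through
        return spec

    new_fmt = SPEC.sub(rewrite, fmt)
    return new_fmt % tuple(converted)
```

For example, `sprintf_y("n=%d, data=%y", 3, [1, 2])` formats the integer with `%d` as usual and the list through `tostr`. A Matlab version would follow the same two steps, but would need its own format parser, which is where the implementation effort and cost mentioned above come from.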
You could make it behave more like Java by providing char conversions that convert object values to display strings, but that's inconsistent with how Matlab's existing numeric to char conversions work, so it wouldn't be fully generic, and you would run into some odd edge cases, so I wouldn't recommend that. You could do a safer form of this by defining a new @string class that wraps char strings, but does implicit conversions by calling your tostr() instead of char(). This could be a big performance hit, though, because Matlab OOP objects are substantially slower than operations with built-in types and plain functions.
For background: basically, the reason you need to pass explicit conversions to fprintf in Matlab, but you don't need to with the PrintStream stuff in Java is that fprintf is inherited from C, which did not have run time type detection, especially of varargs, in the language. Matlab does, so fprintf is more limited than it needs to be. However, what fprintf does give you is fine-grained control over the formatting of numeric values and a concise calling form. These are useful, as evidenced by how Java programmers complained for years about not having them in Java and eventually got them added. It's just that Matlab sticks to the C-supported fprintf conversions, where they could easily add a couple new generic conversions that examine the type of the input at runtime and give you the generic output support you want.