How to define regexp variables in TM language? - visual-studio-code

In a sublime-syntax file you can define variables to use in regular expressions (for example, - match: "{{SOME_VARIABLE}}"). It looks like you can't do this in tmLanguage (https://macromates.com), but highlighters frequently expand variables, so is there a utility that adds this kind of variable support to the TM language descriptor, so that it can be used with VSCode? I found nothing with a search engine.

I too was looking for this functionality, as the regular expressions get long and complex very quickly, especially when writing the tmLanguage file in JSON, which forces you to escape some characters with \\.
It does not seem to be supported out of the box by TextMate. However, you can have variable support if you don't mind some pre-processing.
I found this kind of solution while browsing the Microsoft TypeScript-TmLanguage GitHub repository.
They define the TypeScript grammar in YAML, which is more readable and requires only one backslash to escape characters. In this YAML file, they define "variables" for frequently used patterns, e.g.:
variables:
  startOfIdentifier: (?<![_$[:alnum:]])(?:(?<=\.\.\.)|(?<!\.))
  endOfIdentifier: (?![_$[:alnum:]])(?:(?=\.\.\.)|(?!\.))
  propertyAccess: (?:(\.)|(\?\.(?!\s*[[:digit:]])))
  propertyAccessPreIdentifier: \??\.\s*
  identifier: '[_$[:alpha:]][_$[:alnum:]]*'
  constantIdentifier: '[[:upper:]][_$[:digit:][:upper:]]*'
  propertyIdentifier: '\#?{{identifier}}'
  constantPropertyIdentifier: '\#?{{constantIdentifier}}'
  label: ({{identifier}})\s*(:)
Then they reuse those "variables" in the pattern definitions (or even in other variables, if you look above, the label variable uses the identifier variable), e.g.:
enum-declaration:
  name: meta.enum.declaration.ts
  begin: '{{startOfDeclaration}}(?:\b(const)\s+)?\b(enum)\s+({{identifier}})'
  beginCaptures:
    '1': { name: keyword.control.export.ts }
    '2': { name: storage.modifier.ts}
    '3': { name: storage.modifier.ts}
    '4': { name: storage.type.enum.ts }
    '5': { name: entity.name.type.enum.ts }
Finally, they use a build script to transform this YAML grammar into a plist or JSON grammar. The build script removes the "variables" property from the grammar, since it is not part of the tmLanguage spec, and loops over the variable definitions to replace their occurrences ({{variable}}) in other variables and in begin, end, and match patterns.
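// TmGrammar, MapLike and transformGrammarRepository are declared elsewhere in the build script (not shown here).
// Apply every known replacer to a single pattern string.
function replacePatternVariables(pattern: string, variableReplacers: VariableReplacer[]) {
    let result = pattern;
    for (const [variableName, value] of variableReplacers) {
        result = result.replace(variableName, value);
    }
    return result;
}
// A regex matching {{name}}, paired with the expanded pattern to substitute for it.
type VariableReplacer = [RegExp, string];
function updateGrammarVariables(grammar: TmGrammar, variables: MapLike<string>) {
    // "variables" is not part of the tmLanguage format, so drop it from the output.
    delete grammar.variables;
    const variableReplacers: VariableReplacer[] = [];
    for (const variableName in variables) {
        // Replace the pattern with earlier variables
        const pattern = replacePatternVariables(variables[variableName], variableReplacers);
        variableReplacers.push([new RegExp(`{{${variableName}}}`, "gim"), pattern]);
    }
    // Expand the variables in every begin, end and match pattern of the grammar.
    transformGrammarRepository(
        grammar,
        ["begin", "end", "match"],
        pattern => replacePatternVariables(pattern, variableReplacers)
    );
    return grammar;
}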
This is not exactly what you (and I) were looking for, but it helps if your grammar is large. For a small grammar, I would not bother with this pre-processing.
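For anyone who wants to try the idea without the TypeScript repository's build script, here is a minimal stand-alone sketch of the same substitution. The names expandVariables and substitute are made up for this example, and it assumes variables only refer to variables defined before them, as in the YAML above.

```typescript
// Hypothetical stand-alone sketch; these names are not from the TypeScript-TmLanguage build script.
type Variables = Record<string, string>;

// Replace each {{name}} with its expanded value. Unknown names are left
// untouched so that a typo stays visible in the generated grammar.
function substitute(pattern: string, expanded: Variables): string {
  return pattern.replace(/\{\{(\w+)\}\}/g, (whole: string, name: string) =>
    Object.prototype.hasOwnProperty.call(expanded, name) ? expanded[name] : whole
  );
}

// Expand variables in definition order, so later ones may use earlier ones.
function expandVariables(variables: Variables): Variables {
  const expanded: Variables = {};
  for (const name of Object.keys(variables)) {
    expanded[name] = substitute(variables[name], expanded);
  }
  return expanded;
}

const variables = expandVariables({
  identifier: "[_$[:alpha:]][_$[:alnum:]]*",
  label: "({{identifier}})\\s*(:)",
});
const match = substitute("\\b(enum)\\s+({{identifier}})", variables);

console.log(variables.label);
console.log(match);
```

A replacer function is used instead of a replacement string so that sequences such as $& inside a pattern are inserted literally rather than being interpreted by String.prototype.replace.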

I made a command-line tool that converts a YAML version of a tmLanguage syntax, with support for these variables, to JSON: https://www.npmjs.com/package/com.matheusds365.vscode.yamlsyntax2json
For more information on the tmLanguage format and on creating language extensions for Visual Studio Code, look at this StackOverflow answer.
You can refer to variables using {{variableName}} syntax.
Install it with NPM:
npm i -g com.matheusds365.vscode.yamlsyntax2json
Here is an example:
# tmLanguage
---
$schema: https://raw.githubusercontent.com/martinring/tmlanguage/master/tmlanguage.json
name: MyLanguageName
scopeName: source.mylang
variables:
  someVar: 'xxx'
patterns:
  - include: '#foo'
repository:
  foo:
    patterns: []
Run:
yamlsyntax2json mylanguage.tmLanguage.yaml mylanguage.tmLanguage.json
Output:
{
  "$schema": "https://raw.githubusercontent.com/martinring/tmlanguage/master/tmlanguage.json",
  "name": "MyLanguageName",
  "patterns": [
    {
      "include": "#foo"
    }
  ],
  "repository": {
    "foo": {
      "patterns": []
    }
  },
  "scopeName": "source.mylang"
}

Related

vs code snippet: how to use variable transforms twice in a row

See the following snippet:
"srcPath":{
  "prefix": "getSrcPath",
  "body": [
    "$TM_FILEPATH",
    "${1:${TM_FILEPATH/(.*)src.(.*)/${2}/i}}",
    "${TM_FILEPATH/[\\\\]/./g}"
  ]
},
The output of the three body lines is:
D:\root\src\view\test.lua
view\test.lua
D:.root.src.view.test.lua
How can I get output like 'view/test.lua'?
Try this snippet:
"srcPath":{
  "prefix": "getSrcPath",
  "body": [
    "$TM_FILEPATH",
    "${TM_FILEPATH/.*src.|(\\\\)/${1:+/}/g}",
    "${TM_FILEPATH/[\\\\]/\\//g}"
  ]
}
.*src.|(\\\\) matches everything up to and including the ...src\ part of the path. It is not saved in a capture group because it is not used in the replacement part of the transform.
The (\\\\) alternative matches any \ in the rest of the path; the g flag is needed to get them all.
Replacement: ${1:+/} means that if capture group 1 matched in .*src.|(\\\\), it is replaced with a /. Note that the rest of the path after src\ is never matched, only the \ characters that follow it. Because those other path parts are not matched, they remain in the result unchanged.
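The first transform can be checked outside VS Code with a rough JavaScript-regex equivalent. This is only a sketch: the conditional ${1:+/} is emulated by hand with a replacer function, and the sample path is the one from the question.

```typescript
// Rough emulation of the snippet transform; this is not VS Code's own code.
const filePath = "D:\\root\\src\\view\\test.lua";

// ${TM_FILEPATH/.*src.|(\\\\)/${1:+/}/g}
// Group 1 only takes part in a match when a lone backslash was matched,
// so emit "/" in that case and nothing for the ".*src." prefix.
const relative = filePath.replace(
  /.*src.|(\\)/g,
  (_whole: string, slash?: string) => (slash ? "/" : "")
);

console.log(relative); // view/test.lua
```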
You were close on this one:
"${TM_FILEPATH/[\\\\]/\\//g}" just replaces any \\\\ with \\/.
With the extension File Templates you can insert a "snippet" that contains a variable and multiple find-replace operations.
With a key binding:
{
  "key": "ctrl+alt+f", // or any other combo
  "command": "templates.pasteTemplate",
  "args": {
    "text": [
      "${relativeFile#find=.*?src/(.*)#replace=$1#find=[\\\\/]#flags=g#replace=.#}"
    ]
  }
}
At the moment this is only possible with a key binding or via multi command (or similar). I will add an issue to also make it possible by prefix.
Also, some of the standard variables are missing.

VSCode language extension: can a language be embedded into another through code?

The extension adds support for the Renpy language, which is very similar to Python. In this language, it's possible to embed Python code in different ways.
Single line statement:
define e = Character("Eileen", who_color="#c8ffc8")
default sidebar = False
$ sampleFunction("Eileen", 1.0)
To embed Python inside single-line statements, I use the following TextMate grammar pattern:
{
  "comment": "Match begin and end of python one line statements",
  "name": "meta.embedded.python.line",
  "begin": "(?<=(\\$|define|default)\\s)",
  "end": "\\R$",
  "patterns": [{ "include": "source.python" }]
}
In this case, I can know when a statement ends.
Python block:
python:
    def foo():
        return "bar"
These blocks can be nested within other language blocks, for example:
init:
    image movie = Movie()
    python:
        povname = ""
        pov = DynamicCharacter("povname", color=(255, 0, 0, 255))
    $ ectc = Character('Eileen', color=(200, 255, 200, 255))
For a block, since it is delimited by indentation, I can't determine where it ends. If these blocks couldn't be nested, I could capture the end with a regular expression such as ^(?=\S), but because they can be nested, I can't detect where they end.
I tried to add the TextMate scope source.python via the SemanticTokenProvider, but it seems that it's not possible to add a TextMate scope using the SemanticTokensBuilder. I also tried TextMate patterns, without success.
I would like to find a way to give the contents of Python blocks the source.python TextMate scope, regardless of whether they are nested or not.
If there is always a line after the block that has the same indent as the word python:, you could try
{
  "begin": "^(\\s*)(python:)",
  "end": "^\\1(?=\\S)",
  "beginCaptures": { "2": { "name": "keyword.other" } },
  "patterns": [{ "include": "source.python" }]
}
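To see what the \1 back-reference in the end pattern buys you, here is a small line-based emulation of that rule. It is only a sketch under stated assumptions: findPythonBlocks is a made-up helper, it uses JavaScript regexes rather than the grammar engine, and it relies on the same assumption as the rule (a later line with exactly the same indent as python:).

```typescript
// Hypothetical helper that mimics the begin/end rule line by line.
// Returns [beginLine, endLine] index pairs; endLine is the first line that
// starts with exactly the indent captured by the begin pattern.
function findPythonBlocks(lines: string[]): Array<[number, number]> {
  const begin = /^(\s*)(python:)/;
  const blocks: Array<[number, number]> = [];
  for (let i = 0; i < lines.length; i++) {
    const m = begin.exec(lines[i]);
    if (!m) continue;
    // Equivalent of "^\\1(?=\\S)": same indent, then a non-space character.
    const end = new RegExp("^" + m[1] + "(?=\\S)");
    let j = i + 1;
    while (j < lines.length && !end.test(lines[j])) j++;
    blocks.push([i, j]);
    i = j - 1; // continue scanning from the line that ended the block
  }
  return blocks;
}

const sample = [
  "init:",
  "    image movie = Movie()",
  "    python:",
  '        povname = ""',
  "        pov = 1",
  "    $ ectc = 2",
];
const blocks = findPythonBlocks(sample);
console.log(blocks); // [ [ 2, 5 ] ]
```

If the next line is less indented than python: (for example, back at column 0), the end pattern never matches and the block runs to the end of the input, which is the limitation mentioned above.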

How to apply multiple transforms to snippet variable

I'm in a file called event-list.tsx, and I'm trying to create a snippet that writes the following code:
const EventList: FC = () => {
  return <div>
  </div>;
};
export default EventList;
Thus far, in typescriptreact.json I've written the following snippet setting, which results in awkward-looking code (it puts out const event-list rather than const EventList):
"react arrow func component": {
  "prefix": "rafce",
  "body": [
    "const ${TM_FILENAME_BASE}: FC = () => {",
    " return <div>",
    " ",
    " </div>;",
    "};",
    "",
    "export default ${TM_FILENAME_BASE};",
    ""
  ]
},
I know how to remove the hyphen from the snippet:
${TM_FILENAME_BASE/-//}
I also figured out how to capitalize the first character:
${TM_FILENAME_BASE/(^.)/${1:/upcase}/}
But I can't figure out how to apply all three of the changes I want. I know the regular expression needed to capitalize every character that comes after a hyphen (a positive lookbehind), but I don't know how to apply it here. Nothing in the documentation suggests that multiple transforms can be chained onto each other.
Try the following global regex:
${TM_FILENAME_BASE/(.)([^-]*)-?/${1:/upcase}${2}/g}
It finds each part before a -, upcases its first letter, and repeats this for the whole string.
"${TM_FILENAME_BASE/(\\w+)-?/${1:/capitalize}/g}",
(\\w+)-? : You only need one capture group if you use /capitalize.
The hyphens are removed by virtue of matching them (-?) but not including them in the output.
The g flag is necessary to keep matching every (\\w+)-? instance and perform a transform for each.
And since you are reusing an earlier transform you can simplify the whole thing like this:
"react arrow func component": {
  "prefix": "rafce",
  "body": [
    "const ${1:${TM_FILENAME_BASE/(\\w*)-?/${1:/capitalize}/g}}: FC = () => {",
    " return <div>",
    " ",
    " </div>;",
    "};",
    "",
    "export default $1;",
    ""
  ]
},
Note that
${1:${TM_FILENAME_BASE/(\\w*)-?/${1:/capitalize}/g}}
stores the result of that transform in variable $1 - which can simply be used later (or earlier) by itself to output the same result!
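Both suggested transforms can be sanity-checked with plain JavaScript regexes. This is a rough emulation, not the snippet engine: /upcase and /capitalize are imitated by hand, and event-list is the file name base from the question.

```typescript
// Rough emulation of the two transforms; VS Code's snippet engine is not involved.
const fileNameBase = "event-list";

// ${TM_FILENAME_BASE/(.)([^-]*)-?/${1:/upcase}${2}/g}
const viaUpcase = fileNameBase.replace(
  /(.)([^-]*)-?/g,
  (_whole: string, first: string, rest: string) => first.toUpperCase() + rest
);

// ${TM_FILENAME_BASE/(\\w+)-?/${1:/capitalize}/g}
const viaCapitalize = fileNameBase.replace(
  /(\w+)-?/g,
  (_whole: string, word: string) => word.charAt(0).toUpperCase() + word.slice(1)
);

console.log(viaUpcase);     // EventList
console.log(viaCapitalize); // EventList
```

In both cases the hyphen is consumed by the match but never written back, which is why it disappears from the result.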

VSCode Snippets for Latex: Combining placeholders and curly braces

I'm in the process of setting up a file for personal snippets in VSCode for LaTeX.
Is there a way to combine placeholders (Syntax: ${1:foo}) and normal curly braces?
In my example I want my code to output:
\fcolorbox {frame}{background}{text}
where every variable is a placeholder. My generated snippet code (.json) looks as follows:
"Colorbox fcolorbox": {
  "prefix": "colbox",
  "body": [
    "\\fcolorbox ${{1:frame}}${{2:background}}${{3:text}}"
  ],
  "description": "Colorbox fcolorbox"
}
But this doesn't work: the $ and {} are output as-is and interpreted as LaTeX symbols.
Is there a way to fix this and make the placeholders work?
This produces the requested result: \fcolorbox {frame}{background}{text}
"Colorbox fcolorbox": {
  "prefix": "colbox",
  "body": [
    "\\fcolorbox {${1:frame}}{${2:background}}{${3:text}}"
  ],
  "description": "Colorbox fcolorbox"
}
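The difference between the two bodies is easier to see with a toy expander that only understands ${n:default} placeholders. This is a deliberate simplification for illustration, not VS Code's snippet parser, and expandDefaults is a made-up name.

```typescript
// Toy expander: replaces each ${n:default} with its default text.
// Anything else, including "${{", is left exactly as written.
function expandDefaults(body: string): string {
  return body.replace(
    /\$\{(\d+):([^}]*)\}/g,
    (_whole: string, _index: string, text: string) => text
  );
}

// Literal braces go outside the placeholder: {${1:frame}}
const working = expandDefaults("\\fcolorbox {${1:frame}}{${2:background}}{${3:text}}");
// Here "${" is followed by "{" instead of a number, so nothing is recognized.
const broken = expandDefaults("\\fcolorbox ${{1:frame}}${{2:background}}${{3:text}}");

console.log(working); // \fcolorbox {frame}{background}{text}
console.log(broken);  // unchanged
```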

VS Code objects passed to registerHoverProvider: what are their "keys" and what are they for?

I am developing an extension for Visual Studio Code, and I want to show a description of some elements using registerHoverProvider. I have found some examples, but it is not clear to me what each element is used for, and I do not know whether there are others. For example:
"format": {
  "prefix": "format",
  "body": "format($1)",
  "text": "format()",
  "description":
    "filter formats a given string by replacing the placeholders (placeholders follows the sprintf notation)",
  "example":
    "{% set foo = \"foo\" %}\n{{ \"I like %s and %s.\"| format(foo, \"bar\") }}\n\n{# outputs I like foo and bar #}"
}
What are prefix, body, and text used for, and when should each one be used?