Where does VSCode store files related to markdown codeblock syntax highlighting?

I'm trying to find the files necessary to add my own syntax highlighting to fenced code blocks in a markdown document within Visual Studio Code. e.g.
```Python
# Comment
print("hello world")
```
will provide syntax highlighting for Python. I'm trying to add syntax highlighting for an unsupported language.
I have tried to find these on my own, but it's been all guesswork. I've found what look like related files under C:\Users\<user>\AppData\Local\Programs\Microsoft VS Code\resources\app\extensions. There's a markdown-basics folder in there containing a syntaxes\markdown.tmLanguage.json file that appears to reference fenced code blocks, but not the actual syntax definitions. There are also folders for a variety of languages. Creating a new definition based on these doesn't seem to work, though.
I would appreciate any help figuring out how fenced code block syntax is defined.

Related

Scala in VSCode: error highlighting not working

I want to use VSCode to work on Scala code, but cannot get error highlighting to work.
See the example code below: what it looks like now vs. how I want it to highlight errors.
I think every proper programming language should have editor support for highlighting obvious errors such as undefined variables, undefined functions, or missing operator overloads.
What can I do to get error highlighting for Scala in VSCode?

How to write diff code with syntax highlight in Github

Github supports syntax highlight as follows:
```javascript
let message = 'hello world!'
```
And it supports diff as follows: (but WITHOUT syntax highlight)
```diff
-let message = 'hello world!'
+let message = 'hello stackoverflow!'
```
How can I get both 'syntax highlight' AND 'diff'?

No, this is not a supported feature at this time.
GitHub documents their processing of lightweight markup languages (including Markdown, among others) in github/markup. Note step 3:
Syntax highlighting is performed on code blocks. See github/linguist for more information about syntax highlighting.
If we follow that link, we find a list of grammars that Linguist uses to provide syntax highlighting on GitHub. Linguist can only apply one of the grammars in that list to a block of code at a time. Of course, one of the grammars is Diff. However, that grammar knows nothing about the language of code being diffed, so you don't get syntax highlighting of that.
Of course, there are other languages which are often combined. For example, HTML is often included in a templating language. Therefore, in addition to the HTML grammar, we also find grammars for HTML+Django, HTML+ECR, HTML+EEX, HTML+ERB, and HTML+PHP. In each case, the single grammar is aware of two languages: the specific templating language and the HTML which is interspersed within the template.
To accomplish the same thing with a diff, you would need a separate "diff" grammar for every single language listed. In other words, the number of grammars would double. Of course, a way to avoid this might be to treat diff differently. When diff is specified, they could run the block through the syntax highlighter twice, once for diff and once for the source language. However, at least when processing code blocks in lightweight markup languages, they have not implemented such a feature.
And if they ever were to implement such a feature in the future, it would likely be more complicated than simply running the code block through twice. After all, every line of the diff has diff-specific content which would confuse the other language's grammar. Therefore, every grammar would need to be diff-aware, or each line would need to be fed to the grammar separately with the diff parts removed. The problem with the latter is that the grammar would not have the context of each line and is more likely to get things wrong. Whether such a solution is possible is outside the scope of this answer, but the point is that it is reasonable to expect that such a feature would be a much lower priority to support due to the complexity involved.
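To make the "diff parts removed" idea concrete, here is a minimal Python sketch. It is not how GitHub or Linguist works; `split_diff_markers` is a hypothetical helper, and it assumes the block contains only lines prefixed with `+`, `-`, or a space (no `---`/`+++`/`@@` headers).

```python
def split_diff_markers(diff_text):
    """Split each diff line into its marker ('+', '-' or ' ') and the code after it."""
    markers, code_lines = [], []
    for line in diff_text.splitlines():
        if line[:1] in ("+", "-", " "):
            markers.append(line[0])
            code_lines.append(line[1:])
        else:
            # Assumption: anything else (e.g. an empty line) is unchanged context.
            markers.append(" ")
            code_lines.append(line)
    return markers, code_lines


# The diff block from the question.
diff_block = "-let message = 'hello world!'\n+let message = 'hello stackoverflow!'"
markers, code_lines = split_diff_markers(diff_block)
```

Note what a language grammar would receive in `code_lines`: the removed and the added line back to back, so `message` appears to be declared twice. That is the loss of context described above.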
So why does GitHub do syntax highlighting in other places on its website? Because, in those cases, it has access to the two source files being diffed and it generates the diff itself. Each source is first highlighted (avoiding the complexity mentioned above), then the diff is created from the two highlighted source files. However, a diff included in a Markdown code block is already a diff when GitHub first sees it. There is no way for them to highlight the pre-diff code first. In other words, the process they currently use would not be transferable to supporting the requested feature.
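The "highlight first, then diff" order can be sketched in Python with the standard `difflib` module. This is only an illustration under stated assumptions, not GitHub's implementation: `toy_highlight` is a hypothetical stand-in for a real highlighter, and the diff is computed on the raw lines and then rendered with the highlighted ones.

```python
import difflib


def toy_highlight(line):
    # Hypothetical stand-in for a real highlighter: tags the keyword `let`.
    return line.replace("let ", "<kw>let</kw> ")


def highlighted_diff(old_lines, new_lines):
    """Highlight both complete versions, then emit a line diff of them."""
    old_hl = [toy_highlight(line) for line in old_lines]
    new_hl = [toy_highlight(line) for line in new_lines]
    out = []
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out += [" " + line for line in old_hl[i1:i2]]
        else:
            # 'replace', 'delete' and 'insert' all reduce to removed + added lines.
            out += ["-" + line for line in old_hl[i1:i2]]
            out += ["+" + line for line in new_hl[j1:j2]]
    return out


old = ["let message = 'hello world!'", "console.log(message)"]
new = ["let message = 'hello stackoverflow!'", "console.log(message)"]
result = highlighted_diff(old, new)
```

Because each version is highlighted as a whole file, the highlighter never sees a `+` or `-` marker. This only works when both source versions are available, which is not the case for a diff pasted into a fenced code block.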

You would need to post-process the output of the git diff in order to add syntax highlighting for the right language of the file being diff'ed.
But since you are asking for GitHub, that post-processing is not in your control, and is not provided by GitHub at the moment in its GFM (GitHub Flavored Markdown Spec).
It is supported for source files, in a regular diff like this one or in a PR: GitHub does the syntax highlighting of the two versions of the file, and then computes the diff.
It is not supported in a regular markdown fenced code block, where the +/- of a diff would throw off the syntax highlighting engine, since no "diff" operation is performed there (the writer is just adding diff +/- symbols by hand).

Is it possible/easy to build a VS Code extension that does syntax highlighting with a lexer?

I am building an experimental lexer generator and I think it would be cool to output simple syntax highlighters for VS Code. The input grammar goes through the classic regular language -> NFA -> DFA transformation, then generates state machine code (it also has some unconventional features to support nested languages). Converting all this back into tmlanguage definitions is a complicated problem, and I'm starting to wonder if a VS Code extension is a better option. The question is:
Are VS Code syntax highlighting internals completely tied to the tmlanguage regex scanner, or would it be possible to write an extension that provides tokens / highlight ranges programmatically?
Is there an API that would make this reasonably straightforward, or would this project be a tour de force?

As of VSCode 1.15, you have to use TextMate grammars for syntax highlighting. There's a feature request open that tracks what you are after: https://github.com/Microsoft/vscode/issues/1967

Adding support for language on a code fence block

doxygen has support for code fence blocks that also have syntax highlighting in the output.
Here is the documentation:
http://www.doxygen.nl/manual/markdown.html#md_fenced
It looks like this:
~~~{.c}
int somefunc(int somevar);
~~~
I want to support .sql; I tried it, but it did not highlight.
My two questions are:
How do I determine what code types doxygen supports for code fence blocks?
Is there some way to define a new one? I am quite happy with just a keyword highlighter; it does not need to be a full parse.

Since my comment, I have looked into adding SQL syntax highlighting to fenced code blocks and \code blocks.
It should now be available if you build from source at https://github.com/doxygen/doxygen or it will be available in the next version (1.8.13).
Here is an example of the syntax highlighting:
If you could test it before the next release, that would be nice, as well.

Shortcut for clike languages comments not working/implemented?

I'm using the Brackets code editor to code in C++ and I'm having a hard time having the shortcut for lineComment and blockComment working...
The shortcuts are [Ctrl+/] and [Ctrl+Shift+/], they work perfectly for CSS, JS.. etc but not with C++ files.
I looked into the clike.js file in the CodeMirror folder of Brackets, the blockCommentStart, blockCommentEnd and lineComment are correctly defined.
Is it a known issue? has anyone found a workaround?
Before that, I was coding with Notepad++, and this feature was the one I used the most. It's really hard not to have it anymore.

You said you saw that blockCommentStart, blockCommentEnd and lineComment are correctly defined in clike.js. From the CodeMirror documentation:
This file defines, in the simplest case, a lexer (tokenizer) for your
language—a function that takes a character stream as input, advances
it past a token, and returns a style for that token. More advanced
modes can also handle indentation for the language.
It is used to highlight the C++ file. It could also be used to comment lines automatically with a shortcut, but that is probably not implemented for C++. For this feature, the comment addon from CodeMirror might be used (http://codemirror.net/addon/comment/comment.js), since the addon also defines a toggleComment command, which will try to uncomment the current selection and, if that fails, line-comments it.

This was a Brackets bug, but it was fixed in the Sprint 39 release.
(Fwiw though, language metadata in Brackets is defined in a file called languages.json - although Brackets extensions can add to / modify this metadata as well).
