Kitchen sink examples of common languages for checking syntax theming

Kitchen sink examples of common languages for checking syntax theming - visual-studio-code

I'm making a theme for VS Code and need to check the syntax highlighting. Is there anywhere that demonstrates all types (e.g. variables, punctuation, comments, functions, objects etc.) for the most popular languages? The code doesn't need to make sense or do anything, just demonstrate the syntax without errors.

There's no such demo, from what I know. Best examples are the existing themes. The patterns used to describe a lexeme differ significantly between languages, so it would be hard to give a common example.
Since vscode uses/supports the TextMate themes you can however start with TM's description about the mapping between a lexem (a string, a number etc.) to identifiers, which in a second step are then translated to colors and font styles. Most theme writers use the xml style .tmtheme files format, while I find the JSON/YAML like style much better readable. See for example they syntax file I created: antlr.json.
The file which defines the mapping between the lexeme ids and styling is then here. The theme file is independent of the syntax file and could be used for other syntaxes as well (and uses their assignment of lexems to the theming ids).

Related

Where Can I Find All Scope Selectors And Definitions for VSC?

I want to create my own Visual Studio Code theme. The JSON file for VSC themes consists of an object called colors, which contains UI colors, and an array called tokenColors, which contains syntax colors (from my understanding).
The VSC documentation for the different colors selectors can be found here. What I haven't found is a list of the different tokenColors and what they mean.
So far, I have found out that you can use the Developer: Inspect Editor Tokens and Scopes command to find out what token/scope every word/symbol in your code belongs to. But I can't write code in every language containing every possible code construct and keyword just to then inspect them and find out the scope they belong to.
I also learnt that these scopes are the same or similar to the ones used in Sublime Text. However, this documentation only contains a small portion of the scopes used in VSC's built in themes. How am I supposed to find out what the remaining scopes stand for?

For syntax highlighting a different set of colors is used, with an own set of token identifiers (aka. scopes). The Syntax Highlighting Guide describes details of the tokenization + theming process and refers to the TextMate rules, which are used also in VS Code.
This page contains section about naming conventions and that's essentially the base set of token IDs that have become the de-facto standard. Extensions often create their own scopes, but you cannot handle all of them. So, focus on the base set instead.
Fortunately, scope selectors are hierarchically organized. When the highlighter cannot find a color for, say, keyword.sql.mysql it tries again with keyword.sql or finally with keyword. That means, as long as all scopes follow these rules, then there will be at least a color available from the base set.

How to write diff code with syntax highlight in Github

Github supports syntax highlight as follows:
```javascript
let message = 'hello world!'
```
And it supports diff as follows: (but WITHOUT syntax highlight)
```diff
-let message = 'hello world!'
+let message = 'hello stackoverflow!'
```
How can I get both 'syntax hightlight' AND 'diff' ?

No, this is not a supported feature at this time.
GitHub documents their processing of lightweight markup languages (including Markdown, among others) in github/markup. Note step 3:
Syntax highlighting is performed on code blocks. See github/linguist for more information about syntax highlighting.
If we follow that link, we find a list of grammars that Linguist uses to provide syntax highlighting on GitHub. Linguist can only apply one of the grammars in that list to a block of code at a time. Of course, one of the grammars is Diff. However, that grammar knows nothing about the language of code being diffed, so you don't get syntax highlighting of that.
Of course, there are other languages which are often combined. For example, HTML is often included in a templating language. Therefore, in addition to the HTML grammar, we also find grammars for HTML+Django, HTML+ECR HTML+EEX, HTML+ERB, and HTML+PHP. In each case, the single grammar is aware of two languages. Both the specific templating language and the HTML which is interspersed within the template.
To accomplish the same thing with a diff, you would need a separate "diff" grammar for every single language listed. In other words, the number of grammars would double. Of course, a way to avoid this might be to treat diff differently. When diff is specified, they could run the block through the syntax highlighter twice, once for diff and once for the source language. However, at least when processing code blocks in lightweight markup languages, they have not implemented such a feature.
And if they ever were to implement such a feature in the future, it would likely be more complicated that simply running the code block through twice. After all, every line of the diff has diff specific content which would confuse the other language grammar. Therefore, every grammar would need to be diff aware, or each line would need to be fed to the grammar separately with the diff parts removed. The problem with the later is that the grammar would not have the context of each line and is more likely to get things wrong. Whether such a solution is possible is outside this cope of this answer, but the point is that it is reasonable to expect that such a feature would be much lower priority to support due to the complexity involved.
So why does GitHub do syntax highlighting in other places on its website? Because, in those cases, it has access to the two source files being diffed and it generates the diff itself. Each source is first highlighted (avoiding the complexity mentioned above), then the diff is created from the two highlighted source files. However, a diff included in a Markdown code block is already a diff when GitHub first sees it. There is no way for them to highlight the pre-diff code first. In other words, the process they currently use would not be transferable to supporting the requested feature.

You would need to post-process the output of the git diff in order to add syntax highlighting for the right language of the file being diff'ed.
But since you are asking for GitHub, that post-processing is not in your control, and is not provided by GitHub at the moment in its GFM (GitHub Flavored Markdown Spec).
It is supported for source files, in a regular diff like this one or in a PR: GitHub does the syntax highlighting of the two versions of the file, and then computes the diff.
It is not supported in a regular markdown fenced code block, where the +/- of a diff would throw off the syntax highlighting engine, considering there is no "diff" operation done here (just the writer trying to add diff +/- symbols)

Visual Studio Code C++ color formatting in type declarations

As you can see, in C++ type declarations (string and new type), the colors of the types are in normal text color, which I find oddly peculiar, since it isn't the case for java source codes (even using the exact same theme).
I've tried customizing it in user settings:
But all that does is change the color of the type in the type definitions, not in the declarations.
I've also tried out various themes, but it's all the same.
Strangely, this doesn't happen for java source codes as shown in the image below:
I might be missing something. Maybe there's some field or attribute in the user settings that I should change instead of the 'type'.
Does anyone know how?

According to the Developer: Inpsect TM scopes command, the entire string x declaration has the same scopes:
source.cpp - the "base" scope for cpp files that everything in them has
meta.block.c - just tells you that it's inside of a block / {}
Consequently, there's not really anything to target with editor.tokenColorCustomizations, since that's based on scopes.
You could search for an extension that replaces the built-in CPP grammar with one that is better in this regard. Note that the only one I've found so far, Reloaded C/C++, doesn't help. Alternatively, you could search for a better grammar elsewhere - TmLanguage grammars are very commonplace and used in many editors, not just VSCode. Even GitHub uses them for their syntax highlighting.

How to style DITA xrefs in structured Framemaker

In the middle of a conversion project from unstructured Framemaker to DITA-compliant, structured Framemaker. Customer wants xrefs to be underlined in the output. Seems straightforward enough, but I've been all over the documentation and all over the internet and can't find what I need. The EDD file shows that we should be using the "link.external" style, which makes perfect sense, but for the life of me I can't figure out where link.external is defined. I've found one piece of documentation in all my searching that sort of comes close to what I need, but the process for styling an xref, according to this document, is long and laborious. I just can't believe that applying a simple style to an element is so hard. Where would I look for the definition of the "link.external" style (or any other style, for that matter)? What obvious point am I missing?

You apply the style in the Cross-Reference panel using building blocks in the cross-reference format(s).
For example:
Section 2.3.4, Volcanoes.
would be styled using the x-ref format below:
Section <$paranumonly>, <Emphasis><$paratext>.
Therefore, to underline all of the x-refs, create an underline character format such as Underline, and use it in a building block within every x-ref format that you have.
<Underline>“<$paratext>” on page\ <$pagenum>
The change only applies to the x-ref, not to the following text.

How to disable special handling of calling convention examples in emacs-lisp-mode?

As described here, emacs-lisp-mode provides for special handling of s-expressions in docstrings that start in the first column. This requires them to be escaped with a backslash to avoid mucking up font-lock later on in the file.
This may be a feature for elisp, but is unfortunate in other lisp modes that reuse emacs-lisp-mode for convenience that don't have special handling of expressions in docstrings, as described/shown here.
My question is, is there any way for such "descendant" modes to configure emacs-lisp-mode to disregard "calling convention expressions" in docstrings?

The short answer is no.
The longer answer is that those other modes are simply broken. They should adapt to Emacs Lisp in this regard. There is no reason not to, is there? It is simply a bad idea to use workarounds (e.g. indent all doc-string lines), such are suggested in the link you provided (and its linked duplicate post).
Emacs doc string are not trivial strings. They have several special properties, including the handling of \\[...], \\{...}, and \\<...>, as well as the property you mention here.
If some mode cannot adjust to Emacs doc strings then it should use macros that define the things it needs without creating Emacs doc strings for them but by handling a different string argument in the special way desired. IOW, create pseudo doc strings that correspond to what the mode wants instead of what Emacs wants.
Of course, that means that you cannot directly take advantage of the Emacs documentation features. You would need to also define mode-specific doc commands that would, for example, wrap the existing doc functions such as describe-function with code that picks up the mode's pseudo-doc string and DTRT, following the mode's conventions instead of the Emacs doc-string conventions.
But I would think that the easiest approach would be to just adapt the mode to the existing Emacs behavior, so that it DTRT.

Many Emacs programming modes, and various Lisp modes are no exception, have been implemented based on parsers with regular expressions. This, unfortunately, gives the editor little idea of the document being edited. Eclipse, for example, has a very different idea of how to edit code, which is more structured, and JetBrain MPS editors are even more rigid and structured in this sense (almost like spreadsheets).
This makes Emacs modes faster and easier to implement, but it also means the code that supports the proper indentation, syntactic validation and highlighting has to re-parse more text every time it is being edited. CEDET, afaik, is trying to address this issue.
Thus, historically, there had been conventions designed to reduce the amount of code to parse on each edit. Parenthesis in the first column is one such convention. However, it also has been known to be an annoyance some times, that's why there's a open-paren-in-column-0-is-defun-start variable one can set to nil to inhibit this behaviour.
But It's hard to say what exactly the performance issues you may face when changing this setting. Lisp grammar is very regular, unless you are using many reader macros, so, perhaps, that won't be a problem.

If beginning-of-defun-function is set accordingly, i.e. checking if inside a comment or string, should be no need for such escaping.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse