How to highlight my own syntax in Emacs?

I am developing my own Domain Specific Language (DSL) and the filename extension is .xyz.
Emacs doesn't know how to highlight syntax in .xyz files, so I usually turn on typescript-mode or json-mode. But the available syntax highlighting modes are not good enough for me, so I am considering writing my own syntax highlighter for Emacs. Any tips on this task? Any toolkit recommendations?
Alternatively, I would be happy with any available mode that highlights common keywords such as class, string, list, variables before the = sign and after the # sign, braces {}, brackets [], the question mark ? and the exclamation mark !. Do any existing languages have a similar syntax?
I am not color-blind and not picky about colors. Any syntax highlighter that highlights the above syntax would solve my problem.

If you are satisfied with simple syntax highlighting for keywords and comments only, there is a helper for this called define-generic-mode, which is documented in the elisp manual.
Some examples of using it can be found in generic-x.el distributed with Emacs.
But highlighting of variable names is not covered by this. For that, you need to be able to parse the DSL using semantic/bovine, as whether a particular string is interpreted as a variable name depends on context, and not just simple regexp matching.
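As a rough illustration of the define-generic-mode route, here is a minimal sketch for the .xyz files from the question. The mode name xyz-mode, the "//" comment starter, and the identifier regexps are all assumptions, since the DSL's real grammar is not given. The two variable-name entries are only a regexp approximation (an identifier before = at the start of a line, or after #), so they will not handle the context-dependent cases mentioned above.

```elisp
(require 'generic)

;; Hypothetical mode for the .xyz DSL; adjust every list to the real grammar.
(define-generic-mode xyz-mode
  ;; Comment starters (assumed; the question does not say how comments look).
  '("//")
  ;; Keywords named in the question.
  '("class" "string" "list")
  ;; Extra font-lock rules.
  '(;; Identifier before "=" at the start of a line (approximation).
    ("^\\s-*\\([[:alpha:]_][[:alnum:]_]*\\)\\s-*=" 1 font-lock-variable-name-face)
    ;; Identifier right after "#" (approximation).
    ("#\\([[:alpha:]_][[:alnum:]_]*\\)" 1 font-lock-variable-name-face)
    ;; Braces, brackets, question mark and exclamation mark.
    ("[][{}?!]" . font-lock-builtin-face))
  ;; Turn the mode on automatically for .xyz files.
  '("\\.xyz\\'")
  ;; No extra setup functions.
  nil
  "Generic mode for a hypothetical .xyz DSL.")
```

After evaluating this, opening a file ending in .xyz (or running M-x xyz-mode) should turn the highlighting on.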

Related

How can I add keywords to common editors?

I'm learning data structures, and I write code to assist my learning. Sometimes I need to highlight some keywords in pseudocode ("SeqList", "SqList", "ElemType") in the same color as the preset keywords ("int").
I know I can add keywords in Vim; for example, :syn keyword type seqlist colors the word "seqlist" the same way as "int". But I wonder whether mainstream editors (VS Code, CLion, Xcode, Sublime) can do this?
I used Vim for some days to highlight keywords I defined myself, using syntax keyword and syntax match.

How to disable special handling of calling convention examples in emacs-lisp-mode?

As described here, emacs-lisp-mode provides for special handling of s-expressions in docstrings that start in the first column. This requires them to be escaped with a backslash to avoid mucking up font-lock later on in the file.
This may be a feature for elisp, but is unfortunate in other lisp modes that reuse emacs-lisp-mode for convenience that don't have special handling of expressions in docstrings, as described/shown here.
My question is, is there any way for such "descendant" modes to configure emacs-lisp-mode to disregard "calling convention expressions" in docstrings?
The short answer is no.
The longer answer is that those other modes are simply broken. They should adapt to Emacs Lisp in this regard. There is no reason not to, is there? It is simply a bad idea to use workarounds (e.g. indenting all doc-string lines) such as are suggested in the link you provided (and its linked duplicate post).
Emacs doc strings are not trivial strings. They have several special properties, including the handling of \\[...], \\{...}, and \\<...>, as well as the property you mention here.
If some mode cannot adjust to Emacs doc strings then it should use macros that define the things it needs without creating Emacs doc strings for them but by handling a different string argument in the special way desired. IOW, create pseudo doc strings that correspond to what the mode wants instead of what Emacs wants.
Of course, that means that you cannot directly take advantage of the Emacs documentation features. You would need to also define mode-specific doc commands that would, for example, wrap the existing doc functions such as describe-function with code that picks up the mode's pseudo-doc string and DTRT, following the mode's conventions instead of the Emacs doc-string conventions.
But I would think that the easiest approach would be to just adapt the mode to the existing Emacs behavior, so that it DTRT.
Many Emacs programming modes, and the various Lisp modes are no exception, have been implemented as parsers built on regular expressions. This, unfortunately, gives the editor little idea of the structure of the document being edited. Eclipse, for example, has a very different, more structured idea of how to edit code, and the JetBrains MPS editors are even more rigid and structured in this sense (almost like spreadsheets).
This makes Emacs modes faster and easier to implement, but it also means the code that supports the proper indentation, syntactic validation and highlighting has to re-parse more text every time it is being edited. CEDET, afaik, is trying to address this issue.
Thus, historically, there have been conventions designed to reduce the amount of code to parse on each edit. An open parenthesis in the first column is one such convention. However, it has also been known to be an annoyance sometimes, which is why there is an open-paren-in-column-0-is-defun-start variable that one can set to nil to inhibit this behaviour.
But it's hard to say exactly what performance issues you may face when changing this setting. Lisp grammar is very regular, unless you are using many reader macros, so perhaps that won't be a problem.
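For what it's worth, a minimal sketch of turning the convention off only in buffers of one mode could look like this. The names my-lisp-like-mode-hook and my-lisp-like-mode-setup are hypothetical placeholders for whatever "descendant" mode is involved.

```elisp
;; Hypothetical setup function for a Lisp-like mode that reuses emacs-lisp-mode.
(defun my-lisp-like-mode-setup ()
  "Stop treating an open paren in column 0 as the start of a defun here."
  ;; Buffer-local, so other buffers keep the default behaviour.
  (setq-local open-paren-in-column-0-is-defun-start nil))

;; `my-lisp-like-mode-hook' is an assumed hook name; use the real mode's hook.
(add-hook 'my-lisp-like-mode-hook #'my-lisp-like-mode-setup)
```

Making the setting buffer-local keeps any performance cost confined to the buffers that actually need it.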
If beginning-of-defun-function is set accordingly, i.e. so that it checks whether point is inside a comment or string, there should be no need for such escaping.

Emacs Mode for a c-like language

I'm trying to write a new emacs mode for a new template c-like language, which I have to use for some academic research.
I want the code to be colored and indented like in c-mode, with the following exceptions:
The '%' is not used as an operator, but as the first character in some specific keywords (like: "%p", "%action", etc.)
The code lines do not end with a semicolon.
Is it possible to create a derived-mode (from c-mode) and set it to ignore the original purposes of '%' and ';'? Is it possible to make the feature of "automatic-indentation after pressing RET" work without ';'?
Are there similar modes for similar languages (with '{}' brackets, but without semicolons) that I could try to patch?
Should I try to write a major mode from scratch?
I thought about patching the R-mode from http://ess.r-project.org/, but this mode does not support comments of the form "/* comment */".
The most important feature that I'm looking for is the brackets-indentation, i.e. indenting code inside a '{}' block after pressing RET (and without the extra-indentation after lines that do not end with ';'). Partial solutions will help too.
More generally, CC-mode has been extended and generalized over time to accommodate ever more languages, and the latest CC-mode is supposed to be reasonably good at isolating the generic code from the language-specific code. So take a look at some of the major modes that use CC-mode (e.g. awk-mode), and get in touch with CC-mode's maintainer, who will be able to help you figure out how to do what you want.
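As a partial starting point before going that deep into CC-mode, here is a rough sketch of a mode derived from c-mode. The mode name my-template-mode is hypothetical, and only the two %-keywords from the question are listed. It is a quick approximation, not the language-definition approach CC-mode itself provides, so expect rough edges in indentation.

```elisp
(require 'cc-mode)

;; Hypothetical derived mode; the name and keyword list are assumptions.
(define-derived-mode my-template-mode c-mode "MyTemplate"
  "Sketch of a major mode for a C-like template language without semicolons."
  ;; Treat % as a symbol constituent so that "%action" reads as one token.
  (modify-syntax-entry ?% "_")
  ;; Highlight the %-keywords mentioned in the question.
  (font-lock-add-keywords
   nil '(("%\\(?:p\\|action\\)\\_>" . font-lock-keyword-face)))
  ;; Without semicolons CC-mode sees most lines as statement continuations;
  ;; give those no extra indentation so block contents line up.
  (c-set-offset 'statement-cont 0))
```

Indentation inside {} blocks still comes from c-mode; the statement-cont offset is only a guess at how to avoid the extra indentation after lines that lack a semicolon.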
If you don't mind something really simple, you can look at Gosu mode. Gosu is a language that has curly braces and no semi-colons, so you should be all set for your minimum. It also uses the same comment syntax as C.
The implementation of the mode for it is really simple and based on generic mode, so modifying it to work the way you want should be easy. It is not based on C-mode.
This is what I used to make a mode for the language I was working on for my compilers class, and it was really easy even with limited elisp experience. On the other hand, the indentation is fairly simple--it works for most code, but is not as complete as C-mode's.
Check out arduino-mode: https://github.com/bookest/arduino-mode/blob/master/arduino-mode.el
It is a C based mode that uses the cc-mode features to quickly create something very useful and unique to arduino programming. Using this as a simple template should help a lot.

Alternatives to font-lock

I'm trying to improve Emacs highlighting of Common Lisp and I'm stuck with the regexp approach to highlighting used by font-lock. Regexps aren't enough, as I want to be able to recognize the structure of forms such as defun: the highlighting of a function's argument list should be different from the highlighting of its body, not just a global search-and-highlight.
So, are there any alternatives to font-lock in Emacs itself or somewhere on the Internet? And if so, do they operate on symbolic expressions?
Emacs' font-lock matching is not restricted to regular expressions; you can use any function as a matcher provided it satisfies a certain protocol. Take a look at the variable font-lock-keywords for more details.
C-h v font-lock-keywords
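To make the protocol concrete: a matcher function is called with one argument, the search limit, and should search forward from point, set the match data, leave point after the match, and return non-nil on success. Below is a minimal sketch that marks the argument list of a defun form. The function name my-cl-match-defun-arglist is made up for illustration, and the choice of font-lock-variable-name-face is arbitrary.

```elisp
;; Hypothetical function matcher for font-lock.
(defun my-cl-match-defun-arglist (limit)
  "Search up to LIMIT for the argument list of a `defun' form.
On success, make match group 1 cover the argument list and return non-nil."
  (let (found)
    (while (and (not found)
                ;; Find "(defun NAME " and stop right before the arglist.
                (re-search-forward
                 "(defun\\s-+\\(?:\\sw\\|\\s_\\)+\\s-*" limit t))
      (when (eq (char-after) ?\()
        (let ((start (point))
              ;; Use the sexp scanner, not a regexp, to find the closing paren.
              (end (ignore-errors (scan-sexps (point) 1))))
          (when (and end (<= end limit))
            ;; Group 0 and group 1 both cover the argument list.
            (set-match-data (list start end start end))
            (goto-char end)
            (setq found t)))))
    found))

;; Register the matcher for lisp-mode; group 1 gets the face.
(font-lock-add-keywords
 'lisp-mode
 '((my-cl-match-defun-arglist 1 font-lock-variable-name-face)))
```

Because the argument list is found with scan-sexps, nested lists inside it are handled; argument lists that extend past the limit are simply skipped in this sketch.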
I think that something like that could be done on the basis of Semantic (part of the CEDET package): you can get syntactic information from the parsed buffer and apply different colors to different types of objects. I don't know of any existing implementation right now, though.

ispell in Emacs LaTeX mode

When I run Emacs command ispell-buffer on an Emacs buffer which is in the LaTeX mode, ispell checks spelling also inside math expressions.
I'd very much like to disable this. Is there an easy way to do it?
I've read about detex but detex does not seem to be integrated into Emacs.
It shouldn't do this, if you are using latexisms (eg. \[ ... \], equation environments, &c) to invoke math mode. Check the contents of ispell-tex-skip-alists; cf. section 6 of the ispell FAQ for what kind of thing should be there.
You can use $..$, $$..$$ to mark out maths using ispell-tex-skip-alists, but beware getting them out of kilter...
Postscript
Check also the value of the ispell-parser variable: this should be 'tex, otherwise ispell will not look for $...$ and $$...$$ regions.
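A small sketch of making sure the parser is set in LaTeX buffers follows; the helper name my-latex-ispell-setup is hypothetical, and which hook applies depends on whether AUCTeX or the built-in mode is in use.

```elisp
;; Hypothetical helper: tell ispell to parse the buffer as TeX.
(defun my-latex-ispell-setup ()
  "Set `ispell-parser' to `tex' in the current buffer."
  (setq-local ispell-parser 'tex))

;; `LaTeX-mode-hook' is AUCTeX's hook; the built-in mode uses `latex-mode-hook'.
(add-hook 'LaTeX-mode-hook #'my-latex-ispell-setup)
(add-hook 'latex-mode-hook #'my-latex-ispell-setup)
```

You can confirm the value in a LaTeX buffer with C-h v ispell-parser.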
Yes, you can: install aspell instead of ispell, and use flyspell with it.
This doesn't answer your question directly, but I have found Flyspell, an on-the-fly spell checker, incredibly useful when editing LaTeX documents. It still spellchecks inside equations, but it is much easier to ignore a few extra red underlines than ispell's interactive commands.
You may know this, but you can press A during spell checking to accept a word and add it to the buffer-local dictionary (that's capital A; lowercase a only accepts the word for the current session, and i inserts it into your personal dictionary). It's not ideal, but this is how I usually suppress spell-checking of technical terms and variable names, etc., in my LaTeX documents.
This AUCTeX mailing list thread, "spell checker (ispell-buffer) complains about products in math modes", has some workarounds, and the answer demonstrates how to use ispell-tex-skip-alists.
Another approach is to use ispell-skip-region-alist. The following example is to exclude org-mode src blocks:
(add-to-list 'ispell-skip-region-alist '("#\\+begin_src" . "#\\+end_src"))