Emacs major-mode font-lock between characters (parentheses, quotes, etc.)

I'm trying to set up an Emacs major mode which essentially just highlights text between lots of different characters, in different colors. I have square brackets working with:
(font-lock-add-keywords nil '(("\\[\\(.*\\)\\]"
1 font-lock-keyword-face prepend)))
but when I try replacing the [ and ] with other characters, it stops working. For example, round parentheses '()' do not work:
(font-lock-add-keywords nil '(("\\(\\(.*\\)\\)"
1 font-lock-function-name-face prepend)))
Trying single, double, or back-quotes, etc. also doesn't work. I'm completely unfamiliar with Lisp syntax; what am I doing wrong? Also: is there any way to include the characters bracketing the expression?

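You're mixing regular expressions and regular strings. In an Emacs regexp, \( and \) (written \\( and \\) inside a Lisp string) delimit a group, while a bare ( or ) matches a literal parenthesis, so your second pattern contains only groups and no literal parentheses.
Try these:
;; square brackets - escape the first one so it doesn't start a [...] character class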
(font-lock-add-keywords nil '(("\\(\\[.*]\\)"
1 font-lock-keyword-face prepend)))
;; parentheses - don't escape the parentheses you want to match!
(font-lock-add-keywords nil '(("\\((.*)\\)"
1 font-lock-keyword-face prepend)))
;; quotes - single escape so you don't break your string:
(font-lock-add-keywords nil '(("\\(\".*\"\\)"
1 font-lock-keyword-face prepend)))
;; other characters - not regexps, so don't escape them:
(font-lock-add-keywords nil '(("\\('.*'\\)"
1 font-lock-keyword-face prepend)))
(font-lock-add-keywords nil '(("\\(<.*>\\)"
1 font-lock-keyword-face prepend)))
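As a quick check of the patterns above, the following sketch runs them against plain strings with string-match instead of font-lock. The helper name my-first-group and the sample strings are made up for illustration and are not part of the answer. It also shows that .* is greedy, so two bracketed spans on one line would be highlighted as a single span; a negated character class is one way to avoid that.

```elisp
;; Sketch only: helper name and sample strings are illustrative assumptions.
(defun my-first-group (regexp string)
  "Return the text matched by group 1 of REGEXP in STRING, or nil."
  (when (string-match regexp string)
    (match-string 1 string)))

;; \\( ... \\) is a group; a bare ( or ) is a literal parenthesis.
(my-first-group "\\((.*)\\)" "call f(x) here")      ; => "(x)"

;; The pattern from the question has two nested groups and no literal
;; parentheses, so it matches the whole line.
(my-first-group "\\(\\(.*\\)\\)" "call f(x) here")  ; => "call f(x) here"

;; .* is greedy: both bracketed spans are matched as one.
(my-first-group "\\(\\[.*]\\)" "[a] and [b]")       ; => "[a] and [b]"

;; A negated character class stops at the first closing bracket.
(my-first-group "\\(\\[[^]\n]*]\\)" "[a] and [b]")  ; => "[a]"
```

Because group 1 wraps the delimiters in each pattern, the bracketing characters are highlighted together with the text between them.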

Related

Writing a major mode: how to set different start and end string characters?

I'm writing a major mode where I can have multiline strings like this:
Text : >abcde
fgh
ijklmonp<
where '>' and '<' indicate the respective start and end of the string. The following syntax table entries only mark >...> and <...< strings, which is not what I want.
(modify-syntax-entry ?> "\"" st)
(modify-syntax-entry ?< "\"" st)
Currently the best solution is using generic string delimiters: ‘|’, but it still messes up my system as I have >...<...< situations sometimes. The best would be if I could use a multiline regexp like
^Text : >.*<$
How can I achieve this?
As thornjad explains, this is not supported directly by syntax-table, so you need to use syntax-propertize-function. E.g.
(defconst my-syntax-propertize
  (syntax-propertize-rules
   (">" (0 (unless (nth 8 (save-excursion (syntax-ppss (match-beginning 0))))
             (string-to-syntax "|"))))
   ("<" (0 (when (eq t (nth 3 (save-excursion
                                (syntax-ppss (match-beginning 0)))))
             (string-to-syntax "|"))))))
then in your major mode function:
(setq-local syntax-propertize-function my-syntax-propertize)
The nth 8 test makes sure > is only marked as a string delimiter if it is not within another string or comment, and the nth 3 test makes sure that < is only marked as a string delimiter when it occurs within a string that was started by another generic string delimiter.
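One way to see these two tests in action is to apply the same rules in a temporary buffer and ask syntax-ppss whether a position is inside a string. This is only a sketch: the names my-demo-propertize and my-demo-in-string-p and the sample texts are assumptions made for illustration, and the temporary buffer uses the standard syntax table, in which > and < are plain punctuation.

```elisp
(require 'syntax)

;; Hypothetical name; the rules are the same as in the answer above.
(defconst my-demo-propertize
  (syntax-propertize-rules
   (">" (0 (unless (nth 8 (save-excursion (syntax-ppss (match-beginning 0))))
             (string-to-syntax "|"))))
   ("<" (0 (when (eq t (nth 3 (save-excursion
                                (syntax-ppss (match-beginning 0)))))
             (string-to-syntax "|"))))))

(defun my-demo-in-string-p (text pos)
  "Return non-nil if buffer position POS of TEXT lies inside a >...< string."
  (with-temp-buffer
    (insert text)
    (setq-local syntax-propertize-function my-demo-propertize)
    ;; Harmless if `syntax-propertize' already sets it.
    (setq-local parse-sexp-lookup-properties t)
    (syntax-propertize (point-max))
    (nth 3 (syntax-ppss pos))))

;; Position 10 is the "b" inside the multiline string.
(my-demo-in-string-p "Text : >abcde\nfgh<\nrest" 10) ; => t
;; Position 21 is in "rest", after the closing <.
(my-demo-in-string-p "Text : >abcde\nfgh<\nrest" 21) ; => nil
;; The second < in a >...<...< situation does not open a new string.
(my-demo-in-string-p "a >b< c< d" 10)                 ; => nil
```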
Unfortunately modify-syntax-entry isn't powerful enough to handle this sort of situation. Luckily we have other options! My orson-mode deals with a similar issue where strings are delimited by double-single quotes ('') instead of double quotes (").
To do this, a regexp looks for the entire string, quotes included, then uses Emacs's string-fence class to mark the quotes as fences.
(defconst orson--string-rx
"\\(''[^']*''\\)")
(defun orson-syntax-propertize-function (start end)
  (save-excursion
    (goto-char start)
    (while (re-search-forward orson--string-rx end 'noerror)
      (let ((a (match-beginning 1))
            (b (match-end 1))
            (string-fence (string-to-syntax "|")))
        (put-text-property a (1+ a) 'syntax-table string-fence)
        (put-text-property (1- b) b 'syntax-table string-fence)))))

Want to replace the underscore and following char with uppercase char in emacs lisp

I want to replace the string abcd_efg with abcdEfg using Emacs Lisp. For some reason the following doesn't work:
(replace-regexp-in-string "_\(.\)" "\,(capitalize \1)" "abcd_efg")
but it works when I do
M-x replace-regexp _\(.\) \,(capitalize \1)
The reason is that \ is an escape character in elisp strings. So, if you want to create a regexp containing a backslash, you will have to write \\.
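Secondly, the \, construct is only available interactively (for example in M-x replace-regexp), not when calling replace-regexp-in-string from Lisp. One alternative is a loop with string-match and replace-match: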
(let ((s "abc_def_ghi_2_")
(pos 0))
(while (string-match "_\\(.\\)" s pos)
(setq pos (match-end 0))
(setq s (replace-match (capitalize (match-string 1 s)) nil nil s 1)))
s)
This returns the following; note that only group 1 is replaced, so the underscores are kept:
"abc_Def_Ghi_2_"

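If the goal is to drop the underscore as well (abcd_efg to abcdEfg, as in the question), one option is to pass a function as the replacement argument of replace-regexp-in-string. This is a sketch rather than part of the answer; the name my-underscore-to-camel is made up for illustration.

```elisp
;; Sketch: REP may be a function that receives the matched text.
(defun my-underscore-to-camel (string)
  "Return STRING with each \"_x\" replaced by \"X\"."
  (replace-regexp-in-string
   "_\\(.\\)"
   ;; MATCH is the whole match, e.g. "_e"; drop the underscore.
   (lambda (match) (upcase (substring match 1)))
   string
   t    ; FIXEDCASE: keep the replacement's case as given
   t))  ; LITERAL: do not interpret backslashes in the replacement

(my-underscore-to-camel "abcd_efg")        ; => "abcdEfg"
(my-underscore-to-camel "abc_def_ghi_2_")  ; => "abcDefGhi2_"
```

A trailing underscore has no following character, so it is left in place.
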
Emacs major-mode for Mathematica based on cc-mode

****Solution to Issue 1 by Stephan - see Answer below****
I mark \ as an escape character in the syntax table, but then override that designation for the Mathematica syntax elements like \[Infinity]. Here is my syntax-propertize-function:
(defconst math-syntax-propertize-function
(syntax-propertize-rules
("\\\\\\[\\([A-Z][A-Za-z]*\\)]" (0 "_"))))
I referenced it from the (defun math-mode () ...) function like so:
(set (make-local-variable 'syntax-propertize-function)
math-syntax-propertize-function)
In my first attempt, I didn't use the make-local-variable function and I was surprised when my elisp buffer highlighting went awry.
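The effect of the rule can be checked outside cc-mode with a small sketch like the one below. The names my-math-demo-propertize and my-math-demo-sexp-end are hypothetical, and the syntax table is a minimal stand-in for the real mode's table (backslash as escape, square brackets as parens).

```elisp
(require 'syntax)

;; Hypothetical names; the rule itself is the one shown above.
(defconst my-math-demo-propertize
  (syntax-propertize-rules
   ("\\\\\\[\\([A-Z][A-Za-z]*\\)]" (0 "_"))))

(defun my-math-demo-sexp-end (text &optional propertize)
  "Insert TEXT in a temp buffer and return where its first [ ... ] ends.
With PROPERTIZE non-nil, apply `my-math-demo-propertize' first."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\\ "\\" table)  ; backslash escapes
      (modify-syntax-entry ?\[ "(]" table)
      (modify-syntax-entry ?\] ")[" table)
      (set-syntax-table table))
    (insert text)
    (when propertize
      (setq-local syntax-propertize-function my-math-demo-propertize)
      (setq-local parse-sexp-lookup-properties t)
      (syntax-propertize (point-max)))
    (goto-char (point-min))
    (search-forward "[")
    (backward-char)
    (forward-sexp)
    (point)))

;; Without the rule, \[ hides the inner bracket and the ] after Pi
;; closes the outer one too early.
(my-math-demo-sexp-end "If[x < \\[Pi], True, False]")    ; => 13
;; With the rule, \[Pi] is one symbol and the match reaches the end.
(my-math-demo-sexp-end "If[x < \\[Pi], True, False]" t)  ; => 27
```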
****End Solution to Issue 1****
I am implementing a major-mode in Emacs derived from cc-mode for editing Mathematica files. The goal is syntax highlighting and indentation. I will leave interfacing with the Mathematica kernel for later.
I have the basic functionality working, but there are a couple of sticking points that are giving me trouble.
**Issue 1 - The \ character is used as an escape character and to prefix multi-character, bracketed keywords.**
Like many languages, Mathematica uses the \ character to escape " and other \ characters in strings.
Mathematica has what are called, in Mathematica-speak, Syntax Characters like \[Times], \[Element], \[Infinity], etc. that represent Mathematica operators and constants.
And, Mathematica makes heavy use of [ and ] instead of ( and ) for function definitions and calls, etc.
So, if I mark \ as an escape character in the syntax-table, then my brackets become mis-matched anywhere I use a Syntax Character. E.g.,
If[x < \[Pi], True, False]
Of course, cc-mode is intent on ignoring the [ right after the \. Given the functional nature of Mathematica, the mode is almost useless if it cannot match brackets. Think lisp without paren matching.
If I don't put \ in the syntax-table as an escape character, then how do I handle escape sequences in comments and strings?
It would be great if I could put Times, Element, Infinity, etc in a keyword list and have everything work correctly.
**Issue 2 - The syntax of Mathematica is different enough from C, C++, Java, ObjC, etc. that cc-mode's built-in syntactic analysis doesn't always produce the desired result.**
Consider the following code block:
FooBar[expression1,
expression2,
expression3];
This formats beautifully because the expressions are recognized as an argument list.
However, if a list is passed as an argument,
FooBar[{expression1,
expression2,
expression3}];
the result is not pretty because the expressions are considered continuations of a single statement within the { and }. Unfortunately, the simple hack of setting c-continuation-offset to 0 breaks actual continuations like,
addMe[x_Real, y_Real] :=
Plus[x, y];
which you want to be indented.
The issue is that in Mathematica { and } delineate lists and not code blocks.
Here is the current elisp file I am using:
(require 'cc-mode)
;; These are required at compile time to get the sources for the
;; language constants.
(eval-when-compile
(require 'cc-langs)
(require 'cc-fonts))
;; Add math mode to the language constant system. This needs to be
;; done at compile time because that is when the language constants
;; are evaluated.
(eval-and-compile
(c-add-language 'math-mode 'c-mode))
;; Function names
(c-lang-defconst c-cpp-matchers
math (append
(c-lang-const c-cpp-matchers c)
;; Abc[
'(("\\<\\([A-Z][A-Za-z0-9]*\\)\\>\\[" 1 font-lock-type-face))
;; abc[
'(("\\<\\([A-Za-z][A-Za-z0-9]*\\)\\>\\[" 1 font-lock-function-name-face))
;; Abc
'(("\\<\\([A-Z][A-Za-z0-9]*\\)\\>" 1 font-lock-keyword-face))
;; abc_
'(("\\<\\([a-z][A-Za-z0-9]*[_]\\)\\>" 1 font-lock-variable-name-face))
))
;; font-lock-comment-face
;; font-lock-doc-face
;; font-lock-string-face
;; font-lock-keyword-face
;; font-lock-function-name-face
;; font-lock-constant-face
;; font-lock-type-face
;; font-lock-builtin-face
;; font-lock-reference-face
;; font-lock-warning-face
;; There is no line comment character.
(c-lang-defconst c-line-comment-starter
math nil)
;; The block comment starter is (*.
(c-lang-defconst c-block-comment-starter
math "(*")
;; The block comment ender is *).
(c-lang-defconst c-block-comment-ender
math "*)")
;; The assignment operators.
(c-lang-defconst c-assignment-operators
math '("=" ":=" "+=" "-=" "*=" "/=" "->" ":>"))
;; The operators.
(c-lang-defconst c-operators
math `(
;; Unary.
(prefix "+" "-" "!")
;; Multiplicative.
(left-assoc "*" "/")
;; Additive.
(left-assoc "+" "-")
;; Relational.
(left-assoc "<" ">" "<=" ">=")
;; Equality.
(left-assoc "==" "=!=")
;; Assignment.
(right-assoc ,@(c-lang-const c-assignment-operators))
;; Sequence.
(left-assoc ",")))
;; Syntax modifications necessary to recognize keywords with
;; punctuation characters.
;; (c-lang-defconst c-identifier-syntax-modifications
;; math (append '((?\\ . "w"))
;; (c-lang-const c-identifier-syntax-modifications)))
;; Constants.
(c-lang-defconst c-constant-kwds
math '("False" "True")) ;; "\\[Infinity]" "\\[Times]" "\\[Divide]" "\\[Sqrt]" "\\[Element]"
(defcustom math-font-lock-extra-types nil
"Extra types to recognize in math mode.")
(defconst math-font-lock-keywords-1 (c-lang-const c-matchers-1 math)
"Minimal highlighting for math mode.")
(defconst math-font-lock-keywords-2 (c-lang-const c-matchers-2 math)
"Fast normal highlighting for math mode.")
(defconst math-font-lock-keywords-3 (c-lang-const c-matchers-3 math)
"Accurate normal highlighting for math mode.")
(defvar math-font-lock-keywords math-font-lock-keywords-3
"Default expressions to highlight in math mode.")
(defvar math-mode-syntax-table nil
"Syntax table used in math mode.")
(message "Setting math-mode-syntax-table to nil to force re-initialization")
(setq math-mode-syntax-table nil)
;; If a syntax table has not yet been set, allocate a new syntax table
;; and setup the entries.
(unless math-mode-syntax-table
(setq math-mode-syntax-table
(funcall (c-lang-const c-make-mode-syntax-table math)))
(message "Modifying the math-mode-syntax-table")
;; character (
;; ( - open paren class
;; ) - matching paren character
;; 1 - 1st character of comment delimiter (**)
;; n - nested comments allowed
(modify-syntax-entry ?\( "()1n" math-mode-syntax-table)
;; character )
;; ) - close paren class
;; ( - matching paren character
;; 4 - 4th character of comment delimiter (**)
;; n - nested comments allowed
(modify-syntax-entry ?\) ")(4n" math-mode-syntax-table)
;; character *
;; . - punctuation class
;; 2 - 2nd character of comment delimiter (**)
;; 3 - 3rd character of comment delimiter (**)
(modify-syntax-entry ?\* ". 23n" math-mode-syntax-table)
;; character [
;; ( - open paren class
;; ] - matching paren character
(modify-syntax-entry ?\[ "(]" math-mode-syntax-table)
;; character ]
;; ) - close paren class
;; [ - matching paren character
(modify-syntax-entry ?\] ")[" math-mode-syntax-table)
;; character {
;; ( - open paren class
;; } - matching paren character
(modify-syntax-entry ?\{ "(}" math-mode-syntax-table)
;; character }
;; ) - close paren class
;; { - matching paren character
(modify-syntax-entry ?\} "){" math-mode-syntax-table)
;; The following characters are punctuation (i.e. they cannot appear
;; in identifiers).
;;
;; / ' % & + - ^ < > = |
(modify-syntax-entry ?\/ "." math-mode-syntax-table)
(modify-syntax-entry ?\' "." math-mode-syntax-table)
(modify-syntax-entry ?% "." math-mode-syntax-table)
(modify-syntax-entry ?& "." math-mode-syntax-table)
(modify-syntax-entry ?+ "." math-mode-syntax-table)
(modify-syntax-entry ?- "." math-mode-syntax-table)
(modify-syntax-entry ?^ "." math-mode-syntax-table)
(modify-syntax-entry ?< "." math-mode-syntax-table)
(modify-syntax-entry ?= "." math-mode-syntax-table)
(modify-syntax-entry ?> "." math-mode-syntax-table)
(modify-syntax-entry ?| "." math-mode-syntax-table)
;; character $
;; _ - symbol constituent class (since $ is allowed in identifier names)
(modify-syntax-entry ?\$ "_" math-mode-syntax-table)
;; character \
;; . - punctuation class (for now treat \ as punctuation
;; until we can fix the \[word] issue).
(modify-syntax-entry ?\\ "." math-mode-syntax-table)
) ;; end of math-mode-syntax-table adjustments
;;
;;
(defvar math-mode-abbrev-table nil
"Abbreviation table used in math mode buffers.")
(defvar math-mode-map (let ((map (c-make-inherited-keymap)))
map)
"Keymap used in math mode buffers.")
;; math-mode
;;
(defun math-mode ()
"Major mode for editing Mathematica code."
(interactive)
(kill-all-local-variables)
(c-initialize-cc-mode t)
(set-syntax-table math-mode-syntax-table)
(setq major-mode 'math-mode
mode-name "Math"
local-abbrev-table math-mode-abbrev-table
abbrev-mode t)
(use-local-map math-mode-map)
(c-init-language-vars math-mode)
(c-common-init 'math-mode)
(run-hooks 'c-mode-common-hook)
(run-hooks 'math-mode-hook)
(c-update-modeline))
(provide 'math-mode)
While cc-mode is designed to be adaptable to various languages, I'm not sure it will serve you well for Mathematica, because its syntax is too far from the C-like syntax that cc-mode supports well. I would suggest trying SMIE (an indentation engine that appeared in Emacs-23.4 and that was originally built for SML but is currently used for a variety of languages). Just like cc-mode, SMIE is not ideal for all languages either, but I wouldn't be surprised if it works better than cc-mode in your case.
For the backslash issue, your best bet is to use syntax-propertize-function to change the escaping-nature of specific backslashes (either set \ as escaping in the syntax-table and then mark the \ of \[foo] as non-escaping, or leave the \ as non-escaping in the syntax-table and then mark those \ of \" and \\ as escaping).
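The second option (leave \ non-escaping in the syntax table and mark only the backslashes of \" and \\ as escaping) could look roughly like the sketch below. The names are hypothetical, the syntax table is a minimal stand-in for the mode's real one, and the rule is deliberately simple: it applies everywhere, not only inside strings, which a real mode may want to restrict.

```elisp
(require 'syntax)

;; Hypothetical names, for illustration only.
(defconst my-escape-demo-propertize
  (syntax-propertize-rules
   ;; Group 1 is the backslash in \" or \\; give it escape syntax.
   ("\\(\\\\\\)[\"\\\\]" (1 "\\"))))

(defun my-escape-demo-string-state (text pos)
  "Return (nth 3 (syntax-ppss POS)) for TEXT after applying the rule."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\\ "." table)   ; backslash is punctuation
      (modify-syntax-entry ?\" "\"" table)  ; double quote delimits strings
      (set-syntax-table table))
    (insert text)
    (setq-local syntax-propertize-function my-escape-demo-propertize)
    (setq-local parse-sexp-lookup-properties t)
    (syntax-propertize (point-max))
    (nth 3 (syntax-ppss pos))))

;; The buffer text is: "a\"b" \[Pi]
;; Position 5 is the b: still inside the string, because \" was escaped.
(my-escape-demo-string-state "\"a\\\"b\" \\[Pi]" 5)  ; => 34, i.e. ?\"
;; Position 8 is the backslash of \[Pi]: outside any string, and that
;; backslash keeps its punctuation syntax, so [ still opens a bracket.
(my-escape-demo-string-state "\"a\\\"b\" \\[Pi]" 8)  ; => nil
```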

Emacs font lock mode: provide a custom color instead of a face

On this page discussing font lock mode, an example is provided which highlights a custom pattern:
(add-hook 'c-mode-hook
(lambda ()
(font-lock-add-keywords nil
'(("\\<\\(FIXME\\):" 1 font-lock-warning-face t)))))
Is there a way to provide a custom color instead of font-lock-warning-face, without defining a new custom face? I want to be able to write something like:
(font-lock-add-keywords nil '(("\\<\\(FIXME\\):" 1 "Blue" t)))
or an RGB color definition:
(font-lock-add-keywords nil '(("\\<\\(FIXME\\):" 1 "#F0F0F0" t)))
Using the double quotes doesn't work. Do you know what will make it work?
(font-lock-add-keywords nil '(("\\<\\(FIXME\\):" 1 '(:foreground "blue") t)))
(font-lock-add-keywords nil '(("\\<\\(FIXME\\):" 1 '(:foreground "#F0F0F0") t)))
A full list of attributes is in the manual.
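Put back into the hook from the question, this might look like the sketch below. The function name my-add-fixme-highlight and the extra :weight attribute are illustrative assumptions. The face slot of a keyword entry is evaluated as an expression, which is why the attribute list carries its own quote; a bare string such as "Blue" would, as far as I know, be taken as the name of a face, and no face with that name exists.

```elisp
;; Sketch only: function name and attribute choice are assumptions.
(defun my-add-fixme-highlight ()
  "Highlight FIXME: in the current buffer with an anonymous face."
  (font-lock-add-keywords
   nil
   ;; The inner quote makes the face expression evaluate to the
   ;; attribute list itself.
   '(("\\<\\(FIXME\\):" 1 '(:foreground "blue" :weight bold) t))))

(add-hook 'c-mode-hook #'my-add-fixme-highlight)
```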

Why might the Emacs "downcase" function refuse to do downcasing?

I'm trying to write a simple Emacs function to convert ids between C-style ones and camelCase ones (i.e. c_style <-> cStyle). But for some reason, Emacs's built-in downcase function leaves the word intact. M-x downcase-word works fine, so I'm completely lost. Any ideas are welcome.
(defun toggle-id-style ()
"Toggle between C-style ids and camel Case ones (i.e. c_style_id -> cStyleId and back)."
(interactive)
(save-excursion
(progn
(re-search-forward "[^A-Za-z0-9_]" nil t)
(let ((end (point))
(case-fold-search nil))
(progn
(re-search-backward "[^A-Za-z0-9_]" nil t)
(let* ((cstyle (if (string-match "_" (buffer-substring-no-properties (point) end)) t nil))
(regexp (if cstyle "_\\(\\w+\\)" "\\([A-Z][a-z0-9]+\\)") )
(func (if cstyle 'capitalize (lambda (s) (concat "_" (downcase s) ) ))))
(progn
(while (re-search-forward regexp end t)
(replace-match (funcall func (match-string 1)) nil nil)))))))))
;;M-x replace-regexp _\(\w+\) -> \,(capitalize \1) ;; c_style -> cStyle
;;M-x replace-regexp \([A-Z][a-z0-9]+\) -> _\,(downcase \1) ;;cStyle -> c_style
It works fine when I convert c_style, but when I try to convert cStyle I get c_Style as the result. Yes, I've checked that this is due to downcase behaviour.
Your problem is the second argument to replace-match. From the documentation:
If second arg fixedcase is non-nil, do not alter case of replacement text.
Otherwise maybe capitalize the whole text, or maybe just word initials,
based on the replaced text.
If the replaced text has only capital letters
and has at least one multiletter word, convert newtext to all caps.
Otherwise if all words are capitalized in the replaced text,
capitalize each word in newtext.
You're passing nil for the fixedcase argument, which causes replace-match to capitalize the replacement when the text being replaced is capitalized. Pass t instead and this bit of the code will work.
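The difference can be reproduced on a plain string. In this sketch the helper name my-snake-first-capital is made up, and the result for a nil FIXEDCASE assumes that _ is not a word constituent in the current syntax table (which is the case in the standard table and matches the c_Style result reported in the question).

```elisp
;; Sketch: hypothetical helper that rewrites the first Capitalized chunk.
(defun my-snake-first-capital (string fixedcase)
  "Replace the first capitalized chunk of STRING with \"_chunk\".
FIXEDCASE is passed straight to `replace-match'."
  (let ((case-fold-search nil))
    (when (string-match "\\([A-Z][a-z0-9]+\\)" string)
      (replace-match (concat "_" (downcase (match-string 1 string)))
                     fixedcase t string))))

;; FIXEDCASE nil: the replaced text "Style" is capitalized, so the
;; replacement gets its word initial upcased again.
(my-snake-first-capital "cStyle" nil)  ; => "c_Style"
;; FIXEDCASE t: the replacement is inserted exactly as given.
(my-snake-first-capital "cStyle" t)    ; => "c_style"
```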
I have two general comments about your code.
All of your uses of progn are unnecessary. The body of save-excursion is an implicit progn and so are the bodies of let and let*.
You search forwards and then backwards to try to find the bounds of the symbol underneath point. Emacs already has a thingatpt library to find things at or near the point. In your case you can just call (bounds-of-thing-at-point 'symbol) which returns a cons cell (START . END) giving the start and end positions of the symbol that was found.
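Putting these remarks together, a reworked version might look like the sketch below. It is not the asker's code: the name my-toggle-id-style is hypothetical, the end of the symbol is kept in a marker because the replacements change the buffer length, and the underscore pattern is written out as a character class so that it does not depend on the buffer's word syntax.

```elisp
(require 'thingatpt)

(defun my-toggle-id-style ()
  "Toggle the symbol at point between c_style and camelCase."
  (interactive)
  (let ((bounds (bounds-of-thing-at-point 'symbol))
        (case-fold-search nil))
    (when bounds
      (save-excursion
        (let* ((end (copy-marker (cdr bounds)))
               (cstyle (string-match-p
                        "_" (buffer-substring-no-properties
                             (car bounds) (cdr bounds))))
               (regexp (if cstyle
                           "_\\([A-Za-z0-9]+\\)"
                         "\\([A-Z][a-z0-9]+\\)"))
               (func (if cstyle
                         #'capitalize
                       (lambda (s) (concat "_" (downcase s))))))
          (goto-char (car bounds))
          (while (re-search-forward regexp end t)
            ;; FIXEDCASE t keeps the replacement's case untouched.
            (replace-match (funcall func (match-string 1)) t t))
          (set-marker end nil))))))
```

With point on cStyleId this should produce c_style_id, and calling it again should give cStyleId back.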
I think you need the second arg of replace-match to be t instead of nil.