Avoid font-locking interfering inside of comments - emacs

In my font-lock-defaults I have:
("\\(^\\| \\|\t\\)\\(![^\n]+\\)\n" 2 'factor-font-lock-comment)
The comment character is ! and this makes it so comments get the right face. This works mostly, except when there is a competing font-locked entity inside the comment, like a string (delimited by double quotes):
! this line is font-locked fine
! this one is "not" because "strings"
How do you get font-lock to understand that the comment is already font-locked fine and it doesn't need to try to font-lock any strings inside of it? The obvious way is to add ! to the comment starter class in the syntax table:
(modify-syntax-entry ?! "< 2b" table)
This solution is not possible because function names and other symbols containing ! are legal, such as map!, filter!, and foo!bar. Adding ! to the comment-start class would cause code containing such names to be highlighted incorrectly.

Generally, it's a bad idea to highlight comments using a font-lock keyword. It's better to use the syntactic phase for this.
Even though the syntax table isn't powerful enough to describe the syntax of your language, it's still possible to highlight comments in the syntactic font-lock phase. The solution is to provide a custom function that assigns syntactic properties to the ! characters that should start a comment. This is done using the variable syntax-propertize-function.
See the elisp manual for details. Also, this tutorial covers this in great detail.
Update: The following is a simple example that defines ! to be the comment start character, but not within identifiers. A real-world example might need a more refined way to check whether something is an identifier.
(defun exmark-syntax-propertize (start end)
  (funcall (syntax-propertize-rules
            ("[[:alnum:]_]\\(!\\)"
             (1 "_")))
           start
           end))
(defvar exmark-mode-syntax-table
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?\n "> " table)
    (modify-syntax-entry ?! "< " table)
    table))
(define-derived-mode exmark-mode prog-mode "!-Mark"
  "Major mode for !-mark."
  (set (make-local-variable 'syntax-propertize-function)
       'exmark-syntax-propertize))

Related

Variable definition for Invisible Text

I'm reading about Invisible Text in the Elisp manual. It defines the variable my-symbol to add or not add ... in place of the hidden text.
;; If you want to display an ellipsis:
(add-to-invisibility-spec '(my-symbol . t))
;; If you don't want ellipsis:
(add-to-invisibility-spec 'my-symbol)
However, I don't get it. Why don't you use (setq my-symbol "...")? What is the difference between (setq my-symbol "...") and '(my-symbol . t)?
This might be a silly question but I'm not an expert or anything in Lisp and I'm playing around with Emacs configurations.
If you were to do (setq my-symbol "...") that would just set the value of variable my-symbol to that string.
What the Elisp manual is describing is the form of a specification, that is, a Lisp data structure (in this case a list) that causes certain parts of the buffer text to be invisible. It causes that behavior because such a spec is handled by Emacs automatically.
As @jenesaisquoi said in a comment, it is the C code of Emacs that does that automatic handling of the buffer invisibility spec. To use the spec, refer to the Elisp manual, node Invisible Text.
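To make the distinction concrete, here is a minimal sketch of how the symbol is used. It assumes nothing beyond the built-in functions named in the manual node; the function name my-invisibility-demo is hypothetical. Note that my-symbol is never given a value: it only serves as a tag that links a text property to an entry in the spec.

```elisp
(defun my-invisibility-demo ()
  "Return (HIDDEN VISIBLE) invisibility status for a demo buffer.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert "visible hidden visible")
    ;; Register my-symbol in this buffer's invisibility spec.
    ;; The (my-symbol . t) form asks for an ellipsis.
    (add-to-invisibility-spec '(my-symbol . t))
    ;; Tag the word "hidden" (positions 9 up to 15) with the symbol.
    (put-text-property 9 15 'invisible 'my-symbol)
    ;; `invisible-p' is non-nil for the tagged text, nil elsewhere.
    (list (invisible-p 9) (invisible-p 1))))
```

Evaluating (my-invisibility-demo) should give a list whose first element is non-nil and whose second element is nil.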

Underscore as part of word for forward-word not working

I am trying to make underscores get treated as part of the word for the forward/backward-word function as described here and here. I am specifically trying to get this to work for nxhtml mode, but would really like it to work like this for all modes.
I have modified my site-start.el file a number of different ways, but to no avail. If I manually execute the command M-x modify-syntax-entry in the buffer, it works just fine. I just can't get this to be the default behavior.
Here is what I tried putting in my site-start.el file:
;; 1
;; thought this would apply it to all modes - no error, but did not work
(modify-syntax-entry ?_ "w")
;; 2
;; thought this would automatically set it on a mode change - no error, but did not work
(defun change-major-mode-hook ()
  (modify-syntax-entry ?_ "w"))
;; 3
;; thought this would apply it to all modes - no error, but did not work
(modify-syntax-entry ?_ "w")
;; 4
;; this produced a symbol's value as variable is void error
(modify-syntax-entry ?_ "w" nxhtml-mode-syntax-table)
What am I missing?
Have you tried modifying the nxhtml-mode-hook directly? Like so:
(add-hook 'nxhtml-mode-hook
          (lambda () (modify-syntax-entry ?_ "w")))
Trey's answered your very specific question.
But Emacs already has a level of character organization that does what you ask: sexp. If you learned sexp commands instead of breaking word commands, then you can delete by "word" or by "symbol" as the situation fits.
Like event_jr, I recommend you learn to use the sexp-based commands like C-M-f and C-M-b. But if you really want to only jump over identifiers (and ignore things like parentheses), then rather than trying to change all syntax tables in all major modes (which may end up having unforeseen consequences), I recommend you simply use new symbol-based movement commands.
E.g. (global-set-key [remap forward-word] 'forward-symbol). For some commands (like kill-word) you'll need to write kill-symbol yourself, but it should be easy to do.
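A kill-symbol command along those lines might look like the sketch below. It mirrors the way kill-word is built on forward-word; the name my-kill-symbol is hypothetical, and forward-symbol is taken from the thingatpt library.

```elisp
(require 'thingatpt)  ; provides `forward-symbol'

(defun my-kill-symbol (arg)
  "Kill from point to the end of the ARGth following symbol.
Hypothetical command modeled on `kill-word'."
  (interactive "p")
  (kill-region (point)
               (progn (forward-symbol arg) (point))))

;; Optional: use it wherever kill-word is currently bound.
;; (global-set-key [remap kill-word] 'my-kill-symbol)
```

With point at the start of foo_bar baz, this should kill foo_bar as a unit, since _ has symbol syntax in most modes.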
Instead of adding hooks for each major mode, as suggested by Trey Jackson, you can modify the standard syntax table. The different modes inherit from the standard syntax table so your modification will hopefully apply to all modes:
(modify-syntax-entry ?_ "w" (standard-syntax-table))
This will affect regular text, XML, Python and others.
Some modes are more stubborn, and modifying the standard syntax table won't affect them as they override some symbols during initialization. One example is cc-mode. You'll still have to modify their tables explicitly.
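For such modes, a per-mode hook is one way to apply the change explicitly. The sketch below assumes cc-mode's c-mode-common-hook, which runs for all CC Mode languages after the mode has set up its syntax table.

```elisp
;; Runs in every CC Mode buffer (C, C++, Java, ...) after the mode
;; has installed its own syntax table, so the change is not overridden.
(add-hook 'c-mode-common-hook
          (lambda () (modify-syntax-entry ?_ "w")))
```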

Define a character as a word boundary

I've defined the \ character to behave as a word constituent in latex-mode, and I'm pretty happy with the results. The only thing bothering me is that a sequence like \alpha\beta gets treated as a single word (which is the expected behavior, of course).
Is there a way to make emacs interpret a specific character as a word "starter"? This way it would always be considered part of the word following it, but never part of the word preceding it.
For clarity, here's an example:
\alpha\beta
^          ^
1          2
If the point is at 1 and I press M-d, the string "\alpha" should be killed.
If the point is at 2 and I press M-<backspace>, the string "\beta" should be killed.
How can I achieve this?
Another thought:
Your requirement is very like what subword-mode provides for camelCase.
You can't customize subword-mode's behaviour -- the regexps are hard-coded -- but you could certainly copy that library and modify it for your purposes.
M-x find-library RET subword RET
That would presumably be a pretty robust solution.
Edit: updated from the comments, as suggested:
For the record, changing every instance of [[:upper:]] to [\\\\[:upper:]] in the functions subword-forward-internal and subword-backward-internal inside subword.el works great =) (as long as "\" is defined as "w" syntax).
Personally I would be more inclined to make a copy of the library than edit it directly, unless for the purpose of making the existing library a little more general-purpose, for which the simplest solution would seem to be to move those regexps into variables -- after which it would be trivial to have buffer-local modified versions for this kind of purpose.
Edit 2: As of Emacs 24.3 (currently a release candidate), subword-mode facilitates this with the new subword-forward-regexp and subword-backward-regexp variables (for simple modifications), and the subword-forward-function and subword-backward-function variables (for more complex modifications).
By making those regexp variables buffer-local in latex-mode with the desired values, you can just use subword-mode directly.
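A rough sketch of that setup follows. It assumes Emacs 24.3 or later, that "\" already has word syntax in the buffer as described in the question, and that the default regexps contain [[:upper:]] as the comment above implies. The function name my-latex-subword-setup is hypothetical. Rather than hard-coding the regexps, it derives them from the defaults using the substitution mentioned above.

```elisp
(require 'subword)

(defun my-latex-subword-setup ()
  "Make subword motion treat a backslash like an uppercase letter.
Hypothetical hook function; assumes \\ already has word syntax."
  (dolist (var '(subword-forward-regexp subword-backward-regexp))
    ;; Buffer-local copy of each regexp with [[:upper:]] widened
    ;; to also accept a backslash.
    (set (make-local-variable var)
         (replace-regexp-in-string
          (regexp-quote "[[:upper:]]") "[\\\\[:upper:]]"
          (default-value var) t t)))
  (subword-mode 1))

(add-hook 'latex-mode-hook #'my-latex-subword-setup)
```

Because the variables are only changed buffer-locally, subword-mode keeps its usual camelCase behaviour in every other buffer.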
You should be able to implement this using syntax text properties:
M-: (info "(elisp) Syntax Properties") RET
Edit: Actually, I'm not sure if you can do precisely this?
The following (which is just experimentation) is close, but M-<backspace> at 2 will only delete "beta" and not the preceding "\".
I suppose you could remap backward-kill-word to a function which checked for that preceding "\" and killed that as well. Fairly hacky, but it would probably do the trick if there's not a cleaner solution.
I haven't played with this functionality before; perhaps someone else can clarify.
(modify-syntax-entry ?\\ "w")
(setq parse-sexp-lookup-properties t)
(setq syntax-propertize-function 'my-propertize-syntax)
(defun my-propertize-syntax (start end)
  "Set custom syntax properties."
  (save-excursion
    (goto-char start)
    (while (re-search-forward "\\w\\\\" end t)
      ;; Mark a backslash that follows a word character as punctuation.
      (put-text-property
       (1- (point)) (point) 'syntax-table (string-to-syntax ".")))))

Modify Alt+f in Emacs for tex-mode

Alt+f in Emacs, when writing in tex mode, does not seem to include the . as part of the word. How do I modify the Alt+f behavior so that it stays exactly the same when moving forward, except that any punctuation is included as part of the word?
I have a separate file that loads for when writing in tex so I will just throw it in there so it doesn't affect normal emacs behavior.
Thanks for any help.
I thought of an addition to this, which is the same related problem: when using Alt+d to delete, I'd like it to delete not only the word but also the punctuation following it, e.g. ( , . ! etc.
The following code should work for you:
(defun unpunctuate-syntax (str)
  "Make the characters of the given string word characters."
  (let ((st (copy-syntax-table (syntax-table))))
    (dotimes (n (length str))
      (modify-syntax-entry (elt str n) "w" st))
    (set-syntax-table st)))
(defun dots-are-not-punctuation ()
  (unpunctuate-syntax "."))
(add-hook 'TeX-mode-hook 'dots-are-not-punctuation)
The way M-f (the forward-word function) works is that it skips over the characters in the buffer that have type "w" (i.e. word) in the current syntax table.
This code makes a modified syntax table and gives it to the buffer and the add-hook bit at the bottom sets it to run when you open a file in TeX-mode. (This method avoids you having to do the separate file thing you described).
You might notice that I make a copy of the syntax table rather than editing the one belonging to the TeX major mode. This is because I always get things wrong when playing with syntax tables and you can mess things up royally... This method means you just have to close the buffer and start again!
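For the Alt+d part of the question, an alternative that leaves the syntax table alone is a small command that kills the word and then any punctuation directly after it. This is only a sketch: the name my-kill-word-and-punctuation is hypothetical, and it relies on the trailing characters having punctuation syntax (".") in the current mode, which is not the case for parentheses.

```elisp
(defun my-kill-word-and-punctuation (arg)
  "Kill ARG words forward plus any punctuation right after them.
Hypothetical command, for illustration."
  (interactive "p")
  (kill-region (point)
               (progn
                 (forward-word arg)
                 ;; Also swallow characters with punctuation syntax.
                 (skip-syntax-forward ".")
                 (point))))

;; Possible binding, limited to TeX buffers:
;; (add-hook 'TeX-mode-hook
;;           (lambda () (local-set-key (kbd "M-d") 'my-kill-word-and-punctuation)))
```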

easily display useful information in custom emacs minor mode -- mode-line woes

Background:
I'm creating a minor mode that gives the user "hints" about whether the buffer they're visiting uses tabs or spaces for indentation (simply by examining the first character of each line in the buffer). Some features I plan to add include an informational display in the mode-line and a few functions to switch between using tabs or spaces, tab-width, etc.
I'm not really concerned about the usefulness of this minor mode. In fact, I would be surprised if there's not already something out there that does this same thing. Mostly this is an exercise in writing minor modes.
Question:
What would be a clean, non-obtrusive way to insert/remove text from the mode-line when enabling/disabling my minor mode? I don't want the user to have to modify their mode-line-format; I just want to non-destructively insert and remove text. Right now I'm using a function that looks something like:
(defun update-indent-hints-mode-line (what-this-buffer-loves)
  (let ((indent-hints-mode-line-text (concat " " "[" what-this-buffer-loves "-loving" "]"))
        (my-mode-line-buffer-identification
         (remq " [Tab-loving]" (remq " [Space-loving]" mode-line-buffer-identification))))
    (setq mode-line-buffer-identification
          (add-to-list 'my-mode-line-buffer-identification
                       indent-hints-mode-line-text
                       t))
    (force-mode-line-update)))
It's working okay but searching for and removing " [Tab-loving]" and " [Space-loving]" seems pretty hackish and ugly... Is there a cleaner way to do it?
Bonus Points:
Any comments on the humble beginnings of my equally humble minor-mode:
https://github.com/mgalgs/indent-hints-mode/blob/master/indent-hints.el
I'm obviously an elisp n00b, but I'm here to learn.
Check out the variable minor-mode-alist, which associates variables with strings in the mode-line. If you change your code to set either the variable tab-loving or space-loving to t (and set the other variable to nil), you can get what you want with:
(setq minor-mode-alist (cons '(space-loving " [Space-loving]")
                             (cons '(tab-loving " [Tab-loving]")
                                   minor-mode-alist)))
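Another option, sketched below under the assumption of Emacs 24.3 or later, is to let define-minor-mode register the minor-mode-alist entry itself through its :lighter keyword. Here the lighter is a buffer-local variable, so the text appears only while the mode is on and nothing has to be searched for and removed by hand. All names ending in -sketch are hypothetical and are not taken from the linked indent-hints.el.

```elisp
(defvar-local indent-hints-sketch-lighter ""
  "Mode-line text shown while `indent-hints-sketch-mode' is enabled.")

(defun indent-hints-sketch-tab-loving-p ()
  "Return non-nil if more lines start with a tab than with a space."
  (> (how-many "^\t" (point-min) (point-max))
     (how-many "^ " (point-min) (point-max))))

(define-minor-mode indent-hints-sketch-mode
  "Show whether this buffer indents mostly with tabs or spaces."
  ;; A symbol here is looked up as a variable each time the
  ;; mode line is redrawn.
  :lighter indent-hints-sketch-lighter
  (when indent-hints-sketch-mode
    (setq indent-hints-sketch-lighter
          (if (indent-hints-sketch-tab-loving-p)
              " [Tab-loving]"
            " [Space-loving]"))))
```

Disabling the mode hides the text automatically, because the mode line only shows a minor-mode-alist entry while its mode variable is non-nil.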