Define a character as a word boundary - emacs

I've defined the \ character to behave as a word constituent in latex-mode, and I'm pretty happy with the results. The only thing bothering me is that a sequence like \alpha\beta gets treated as a single word (which is the expected behavior, of course).
Is there a way to make emacs interpret a specific character as a word "starter"? This way it would always be considered part of the word following it, but never part of the word preceding it.
For clarity, here's an example:
\alpha\beta
^          ^
1          2
If the point is at 1 and I press M-d, the string "\alpha" should be killed.
If the point is at 2 and I press M-<backspace>, the string "\beta" should be killed.
How can I achieve this?

Another thought:
Your requirement is very like what subword-mode provides for camelCase.
You can't customize subword-mode's behaviour -- the regexps are hard-coded -- but you could certainly copy that library and modify it for your purposes.
M-x find-library RET subword RET
That would presumably be a pretty robust solution.
Edit: updated from the comments, as suggested:
For the record, changing every instance of [[:upper:]] to [\\\\[:upper:]] in the functions subword-forward-internal and subword-backward-internal inside subword.el works great =) (as long as "\" is defined as "w" syntax).
Personally I would be more inclined to make a copy of the library than edit it directly, unless for the purpose of making the existing library a little more general-purpose, for which the simplest solution would seem to be to move those regexps into variables -- after which it would be trivial to have buffer-local modified versions for this kind of purpose.
Edit 2: As of Emacs 24.3 (currently a release candidate), subword-mode facilitates this with the new subword-forward-regexp and subword-backward-regexp variables (for simple modifications), and the subword-forward-function and subword-backward-function variables (for more complex modifications).
By making those regexp variables buffer-local in latex-mode with the desired values, you can just use subword-mode directly.
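A minimal sketch of that last suggestion, assuming Emacs 24.3 or later and that the stock regexps still contain the literal text [[:upper:]]. The function name my-latex-subword-setup is made up for illustration; the replacement string is the one quoted in the first edit above.

```elisp
(require 'subword)

(defun my-latex-subword-setup ()
  "Make subword motion treat a backslash like an upper-case letter.
Hypothetical helper; not part of subword.el."
  ;; The regexps only help if the backslash is a word constituent.
  (modify-syntax-entry ?\\ "w")
  (let ((old (regexp-quote "[[:upper:]]"))
        (new "[\\\\[:upper:]]"))
    ;; Derive buffer-local regexps from the stock defaults.
    (setq-local subword-forward-regexp
                (replace-regexp-in-string
                 old new (default-value 'subword-forward-regexp) t t))
    (setq-local subword-backward-regexp
                (replace-regexp-in-string
                 old new (default-value 'subword-backward-regexp) t t)))
  (subword-mode 1))

(add-hook 'latex-mode-hook #'my-latex-subword-setup)
```

With this in place, subword motion in \alpha\beta should stop in front of each backslash, so the subword kill commands take "\alpha" or "\beta" as a unit.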

You should be able to implement this using syntax text properties:
M-: (info "(elisp) Syntax Properties") RET
Edit: Actually, I'm not sure if you can do precisely this?
The following (which is just experimentation) is close, but M-<backspace> at 2 will only delete "beta" and not the preceding "\".
I suppose you could remap backward-kill-word to a function which checked for that preceding "\" and killed that as well. Fairly hacky, but it would probably do the trick if there's not a cleaner solution.
I haven't played with this functionality before; perhaps someone else can clarify.
(modify-syntax-entry ?\\ "w")
(setq parse-sexp-lookup-properties t)
(setq syntax-propertize-function 'my-propertize-syntax)

(defun my-propertize-syntax (start end)
  "Set custom syntax properties."
  (save-excursion
    (goto-char start)
    ;; Give a backslash that directly follows a word character
    ;; punctuation syntax, so that it ends the preceding word.
    (while (re-search-forward "\\w\\\\" end t)
      (put-text-property
       (1- (point)) (point) 'syntax-table (string-to-syntax ".")))))
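The remapping idea mentioned above could look roughly like this. It is only a sketch: my-backward-kill-tex-word is a made-up name, and it simply extends the killed region by one character when a backslash sits directly in front of the word.

```elisp
(defun my-backward-kill-tex-word (arg)
  "Kill ARG words backward, plus a backslash directly in front of them.
Hypothetical replacement for `backward-kill-word'."
  (interactive "p")
  (let ((end (point)))
    (backward-word arg)
    ;; Swallow the command's leading backslash, if there is one.
    (when (eq (char-before) ?\\)
      (backward-char))
    (kill-region (point) end)))

(add-hook 'latex-mode-hook
          (lambda ()
            (local-set-key [remap backward-kill-word]
                           #'my-backward-kill-tex-word)))
```

With point at 2 in the example from the question, M-<backspace> should then kill "\beta" in one go.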

Related

Emacs: Prevent auto-fill mode breaking lines in latex \text{...} inline commands

The latex package polyglossia allows for correct typesetting of foreign languages: it provides an inline command of the form \text[foreign_language]{...}, such as \textspanish{Soy el hijo de Fernando}.
Since I use Emacs, I have to prevent auto-fill mode from breaking up those code blocks: for example, ending a line with \textspanish{Soy el, and beginning the next line with hijo de Fernando}. When auto-fill mode breaks lines in this way, the exporter gets confused.
I tried creating a function to add to the fill-nobreak-predicate hook, but my knowledge of regular expressions and elisp is not good enough. This is how far I got:
(defun foreign-language-nobreak-p ()
  (or (looking-at "[[[:space:]]\|[[:print:]]].*}")
      (save-excursion
        (skip-chars-backward " \t")
        (unless (bolp)
          (backward-char 1)
          (looking-at ".*\\text")))))
(add-hook 'fill-nobreak-predicate #'foreign-language-nobreak-p)
Any ideas on what went wrong?
First of all, thanks for pointing out fill-nobreak-predicate. Never heard about it in my first 23 years of Emacs.
Regarding your regexp question, I'd like to mention the function regexp-opt which takes a list of strings and builds an efficient regexp that matches those strings:
(defvar foreign-lang-re
  (regexp-opt
   '("\\textspanish{"
     "\\textrussian{"
     "\\textfrench{")))
If you factor out the supported languages into yet another variable, you could also build the list of strings with a loop, adding \text and the trailing {.
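For instance, something along these lines; the variable name foreign-languages and its contents are just placeholders for whatever languages you actually use:

```elisp
(defvar foreign-languages '("spanish" "russian" "french")
  "Languages that have a polyglossia text command (placeholder list).")

(defvar foreign-lang-re
  (regexp-opt
   ;; Turn "spanish" into the literal string \textspanish{ and so on.
   (mapcar (lambda (lang) (concat "\\text" lang "{"))
           foreign-languages))
  "Regexp matching the opening of any of the text commands above.")
```

Adding a language then only means adding one string to the list.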
If a rough heuristic is good enough for you, so that auto-fill is suppressed whenever an opening command appears anywhere on the current line, you could use thing-at-point like so:
(defun foreign-language-nobreak-p ()
  (string-match
   foreign-lang-re
   (thing-at-point 'line t)))
This does not work well once you have already closed the command with a }, because the rest of that line is then never filled either. To do better, you'd need to search backward from point for the optimized regexp and forward for a closing curly brace, limiting the searches to (line-beginning-position) and (line-end-position) respectively. This gets really hairy if you start using other commands with curly braces inside the \textspanish command, though.
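A rough sketch of that backward/forward search, under the same assumptions (one-line commands, no nested braces). The name my-foreign-language-nobreak-p is made up, and the defvar is repeated only so the snippet stands on its own:

```elisp
(defvar foreign-lang-re
  (regexp-opt '("\\textspanish{" "\\textrussian{" "\\textfrench{")))

(defun my-foreign-language-nobreak-p ()
  "Return non-nil if point is inside a one-line foreign-language command."
  (save-excursion
    (let ((pos (point))
          (bol (line-beginning-position))
          (eol (line-end-position)))
      (and
       ;; An opening command somewhere before point on this line ...
       (re-search-backward foreign-lang-re bol t)
       ;; ... that has not been closed again before point ...
       (progn (goto-char (match-end 0))
              (not (search-forward "}" pos t)))
       ;; ... and a closing brace after point on the same line.
       (progn (goto-char pos)
              (search-forward "}" eol t))
       t))))

(add-hook 'fill-nobreak-predicate #'my-foreign-language-nobreak-p)
```

Note that this only refuses a break when the closing brace is already on the line, so it may still break a command you are in the middle of typing.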
Hope that makes sense and helps a bit.

indent-[code-]rigidly called from emacs LISP function

I'm trying to write an emacs LISP function to un-indent the region
(rigidly). I can pass prefix arguments to indent-code-rigidly or
indent-rigidly or indent-region and they all work fine, but I don't
want to always have to pass a negative prefix argument to shift things
left.
My current code is as below but it seems to do nothing:
(defun undent ()
  "un-indent rigidly."
  (interactive)
  (list
   (setq fline (line-number-at-pos (region-beginning)))
   (setq lline (line-number-at-pos (region-end)))
   (setq curIndent (current-indentation))
   ;;(indent-rigidly fline lline (- curIndent 1))
   (indent-region fline lline 2)
   ;;(message "%d %d" curIndent (- curIndent 1))
   )
  )
I gather that (current-indentation) won't get me the indentation of the first line
of the region, but of the first line following the region (so a second question is
how to get that!). But even when I just use a constant for the column (as shown),
I don't see this function make any change.
Though if I uncomment the (message) call, it displays reasonable numbers.
GNU Emacs 24.3.1, on Ubuntu. And in case it matters, I use
(setq-default indent-tabs-mode nil) and (cua-mode).
I must be missing something obvious... ?
All of what Tim X said is true, but if you just need something that works, or an example to show you what direction to take your own code, I think you're looking for something like this:
(defun unindent-rigidly (start end arg &optional interactive)
  "As `indent-rigidly', but reversed."
  (interactive "r\np\np")
  (indent-rigidly start end (- arg) interactive))
All this does is call indent-rigidly with an appropriately transformed prefix argument. If you call this with a prefix argument n, it will act as if you had called indent-rigidly with the argument -n. If you omit the prefix argument, it will behave as if you called indent-rigidly with the argument -1 (instead of going into indent-rigidly's interactive mode).
There are a number of problems with your function, including some very
fundamental elisp requirements. I highly recommend reading the Emacs Lisp
Reference Manual (bundled with emacs). If you are new to programming and lisp,
you may also find An Introduction to Emacs Lisp useful (also bundled with
Emacs).
A few things to read about which will probably help:
Read the section on the command loop from the elisp reference. In particular,
look at the node which describes how to define a new command and the use of
'interactive', which you will need if you want to bind your function to a key
or call it with M-x.
Read the section on variables from the lisp reference
and understand variable scope (local v global). Look at using 'let' rather
than 'setq' and what the difference is.
Read the section on 'positions' in the elisp reference. In particular, look at
'save-excursion' and 'save-restriction'. Understanding how to define and use
the region is also important.
It isn't clear if you're writing this function just as a learning exercise or
not. However, just in case you are doing it because it is something you need to
do rather than just something to learn elisp, be sure to go through the Emacs
manual and index. What you appear to need is a common and fairly well supported
requirement. It can get a little complicated if programming modes are involved
(as opposed to plain text). However, with emacs, if what you need seems like
something which would be a common requirement, you can be fairly confident it is
already there - you just need to find it (which can be a challenge at first).
A common convention is for functions/commands to be defined which act 'in
reverse' when supplied with a negative or universal argument. Any command which
has this ability can also be called as a function in elisp code with the
argument necessary to get that behaviour, so understanding the inter-play
between commands, functions and calling conventions is important.
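To make those pointers a bit more concrete, here is one possible shape for such a command. It is a sketch only; my-undent is a made-up name. Two things differ from the code in the question: indent-rigidly is given buffer positions (which is what it expects, rather than line numbers), and the indentation of the region's first line is read inside save-excursion and kept in a let-bound variable.

```elisp
(defun my-undent (start end &optional columns)
  "Shift the lines between START and END left by COLUMNS (default 1).
Never shifts further than the indentation of the region's first line."
  (interactive "r\np")
  (let* ((columns (or columns 1))
         ;; Indentation of the first line of the region, not of the
         ;; line point happens to be on.
         (first-indent (save-excursion
                         (goto-char start)
                         (current-indentation)))
         (shift (min columns first-indent)))
    (when (> shift 0)
      (indent-rigidly start end (- shift)))))
```

Called with M-x my-undent on an active region it shifts by one column; a numeric prefix such as C-u 4 asks for four.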

How do I font lock dollar signs (math mode delimiters) in AUCTeX buffer only outside comments?

I wrote the following code to highlight dollar signs in AUCTeX buffers in a different color, but then I found that it also highlights dollar signs in comments. That was unintended, though I am starting to like it. Still, out of curiosity, I wonder whether it can be avoided.
(defun my-LaTeX-mode-dollars ()
  (font-lock-add-keywords
   nil
   `((,(rx "$") (0 'success t)))
   t))
(add-hook 'LaTeX-mode-hook 'my-LaTeX-mode-dollars)
From the documentation of font-lock-keywords:
MATCH-HIGHLIGHT should be of the form:
  (SUBEXP FACENAME [OVERRIDE [LAXMATCH]])
OVERRIDE and LAXMATCH are flags. If OVERRIDE is t, existing
fontification can be overwritten. If `keep', only parts not already
fontified are highlighted. If `prepend' or `append', existing
fontification is merged with the new, in which the new or existing
fontification, respectively, takes precedence.
In other words, if you drop the t after 'success, it will no longer fontify dollar signs in comments and strings.
EDIT:
Apparently, the above solution is not sufficient in this situation, probably because dollar signs have been colored using another face earlier.
One way that might work is to not pass the HOW parameter (currently t) to font-lock-add-keywords. This means that they should be added to the end of the list. However, this might cause other things to stop working.
If a bigger hammer is needed, you can write a more advanced rule that inspects the existing fontification and decides what to do based on it. For example, Emacs uses the following rule to add a warning face to parentheses placed at column 0 inside strings and comments:
"^\\s("
(0
 (if (memq (get-text-property (match-beginning 0) 'face)
           '(font-lock-string-face font-lock-doc-face font-lock-comment-face))
     (list 'face font-lock-warning-face
           'help-echo "Looks like a toplevel defun: escape the parenthesis"))
 prepend)
A third way to do this is to replace the regexp (rx "$") with the name of a function that searches for $ and checks that it appears in the correct context. One example of such font-lock rules can be found in the standard Emacs package cwarn.
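A sketch of what such a matcher function might look like. The name my-LaTeX-match-dollar-outside-comment is invented here, and it relies on syntax-ppss to tell whether the position is inside a comment, which assumes the mode's syntax table marks % comments correctly:

```elisp
(defun my-LaTeX-match-dollar-outside-comment (limit)
  "Font-lock matcher: find the next \"$\" before LIMIT outside a comment.
Leaves the match data set for the dollar sign that was found."
  (let (found)
    (while (and (not found)
                (search-forward "$" limit t))
      ;; Element 4 of the parser state is non-nil inside a comment.
      (unless (save-match-data (nth 4 (syntax-ppss)))
        (setq found t)))
    found))

(defun my-LaTeX-mode-dollars ()
  (font-lock-add-keywords
   nil
   '((my-LaTeX-match-dollar-outside-comment (0 'success t)))
   t))

(add-hook 'LaTeX-mode-hook 'my-LaTeX-mode-dollars)
```

Because the matcher itself skips comments, the override flag t can stay as it was in the question.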

Underscore as part of word for forward-word not working

I am trying to make underscores get treated as part of the word for the forward/backward-word function as described here and here. I am specifically trying to get this to work for nxhtml mode, but would really like it to work like this for all modes.
I have modified my site-start.el file in a number of different ways, but to no avail. If I manually execute the command M-x modify-syntax-entry in the buffer, it works just fine. I just can't get this to be the default behavior.
Here is what I tried putting in my site-start.el file:
;; 1
;; thought this would apply it to all modes - no error, but did not work
(modify-syntax-entry ?_ "w")
;; 2
;; thought this would automatically set it on a mode change - no error, but did not work
(defun change-major-mode-hook ()
  (modify-syntax-entry ?_ "w"))
;; 3
;; thought this would apply it to all modes - no error, but did not work
(modify-syntax-entry ?_ "w")
;; 4
;; this produced a symbol's value as variable is void error
(modify-syntax-entry ?_ "w" nxhtml-mode-syntax-table)
What am I missing?
Have you tried modifying the nxhtml-mode-hook directly? Like so:
(add-hook 'nxhtml-mode-hook
          (lambda () (modify-syntax-entry ?_ "w")))
Trey's answered your very specific question.
But Emacs already has a level of character organization that does what you ask: the sexp. If you learn the sexp commands rather than altering the word commands, you can delete by "word" or by "symbol" as the situation requires.
Like event_jr, I recommend you learn to use the sexp-based commands like C-M-f and C-M-b. But if you really want to only jump over identifiers (and ignore things like parentheses), then rather than trying to change all syntax tables in all major modes (which may end up having unforeseen consequences), I recommend you simply use new symbol-based movement commands.
E.g. (global-set-key [remap forward-word] 'forward-symbol). For some commands (like kill-word) you'll need to write kill-symbol yourself, but it should be easy to do.
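A possible kill-symbol along those lines; the command names my-kill-symbol and my-backward-kill-symbol are made up, and forward-symbol comes from thingatpt:

```elisp
(require 'thingatpt)                    ; provides `forward-symbol'

(defun my-kill-symbol (arg)
  "Kill ARG symbols forward; with a negative ARG, kill backward."
  (interactive "p")
  (kill-region (point) (progn (forward-symbol arg) (point))))

(defun my-backward-kill-symbol (arg)
  "Kill ARG symbols backward."
  (interactive "p")
  (my-kill-symbol (- arg)))

(global-set-key [remap forward-word] #'forward-symbol)
(global-set-key [remap kill-word] #'my-kill-symbol)
(global-set-key [remap backward-kill-word] #'my-backward-kill-symbol)
```

This leaves every syntax table alone, so it behaves the same in all major modes that mark underscores as symbol constituents.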
Instead of adding hooks for each major mode, as suggested by Trey Jackson, you can modify the standard syntax table. The different modes inherit from the standard syntax table so your modification will hopefully apply to all modes:
(modify-syntax-entry ?_ "w" (standard-syntax-table))
This will affect regular text, XML, Python and others.
Some modes are more stubborn, and modifying the standard syntax table won't affect them as they override some symbols during initialization. One example is cc-mode. You'll still have to modify their tables explicitly.
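Put together, an init-file fragment could look like this sketch. The helper name my-underscore-is-word is made up; c-mode-common-hook is the hook shared by the cc-mode languages:

```elisp
(defun my-underscore-is-word ()
  "Give the underscore word syntax in the current buffer's syntax table."
  (modify-syntax-entry ?_ "w"))

;; Modes that inherit from the standard table pick this up.
(modify-syntax-entry ?_ "w" (standard-syntax-table))

;; cc-mode sets its own entry for "_", so override it after the mode starts.
(add-hook 'c-mode-common-hook #'my-underscore-is-word)
```

Other stubborn modes can be handled the same way by adding the helper to their hooks.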

Modify Alt+f in Emacs for tex-mode

Alt+f in Emacs, when writing in tex mode, does not seem to include the . as part of the word. How do I modify the Alt+f behavior so that it stays exactly the same, except that when moving forward any punctuation is included as part of the word?
I have a separate file that loads for when writing in tex so I will just throw it in there so it doesn't affect normal emacs behavior.
Thanks for any help.
I thought of an addition to this: the same problem comes up when deleting with Alt+d. I would like it to delete not only the word but also the punctuation following it, e.g. , . ! etc.
The following code should work for you:
(defun unpunctuate-syntax (str)
  "Make the characters of the given string word characters."
  (let ((st (copy-syntax-table (syntax-table))))
    (dotimes (n (length str))
      (modify-syntax-entry (elt str n) "w" st))
    (set-syntax-table st)))
(defun dots-are-not-punctuation ()
  (unpunctuate-syntax "."))
(add-hook 'TeX-mode-hook 'dots-are-not-punctuation)
The way M-f (the forward-word function) works is that it skips all characters in the buffer that have type "w" (i.e. word) in the current syntax table.
This code makes a modified syntax table and gives it to the buffer, and the add-hook line at the bottom makes it run whenever you open a file in TeX-mode. (This way you don't need the separate file you described.)
You might notice that I make a copy of the syntax table rather than editing the one belonging to the TeX major mode. This is because I always get things wrong when playing with syntax tables and you can mess things up royally... This method means you just have to close the buffer and start again!
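For the M-d part of the question, the same helper can simply be handed more characters. A sketch, with unpunctuate-syntax repeated from above so the snippet runs on its own, and my-tex-punctuation-is-word as a made-up name:

```elisp
(defun unpunctuate-syntax (str)
  "Make the characters of the given string word characters."
  (let ((st (copy-syntax-table (syntax-table))))
    (dotimes (n (length str))
      (modify-syntax-entry (elt str n) "w" st))
    (set-syntax-table st)))

(defun my-tex-punctuation-is-word ()
  "Treat full stop, comma and exclamation mark as word characters."
  (unpunctuate-syntax ".,!"))

(add-hook 'TeX-mode-hook #'my-tex-punctuation-is-word)
```

Since M-d kills up to wherever M-f would move, it then takes the trailing punctuation along with the word.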