Emacs: Prevent auto-fill mode from breaking lines in LaTeX \text{...} inline commands

The LaTeX package polyglossia allows for correct typesetting of foreign languages: it provides inline commands of the form \text[foreign_language]{...}, such as \textspanish{Soy el hijo de Fernando}.
Since I use Emacs, I have to prevent auto-fill mode from breaking up those code blocks: for example, ending a line with \textspanish{Soy el, and beginning the next line with hijo de Fernando}. When auto-fill mode breaks lines in this way, the exporter gets confused.
I tried creating a function to add as a hook to fill-nobreak-predicate, but my knowledge of regular expressions and elisp is not good enough. This is how far I got:
(defun foreign-language-nobreak-p ()
  (or (looking-at "[[[:space:]]\|[[:print:]]].*}")
      (save-excursion
        (skip-chars-backward " \t")
        (unless (bolp)
          (backward-char 1)
          (looking-at ".*\\text")))))
(add-hook 'fill-nobreak-predicate #'foreign-language-nobreak-p)
Any ideas on what went wrong?

First of all, thanks for pointing out fill-nobreak-predicate. Never heard about it in my first 23 years of Emacs.
Regarding your regexp question, I'd like to mention the function regexp-opt which takes a list of strings and builds an efficient regexp that matches those strings:
(defvar foreign-lang-re
  (regexp-opt
   '("\\textspanish{"
     "\\textrussian{"
     "\\textfrench{")))
If you factor out the supported languages into yet another variable, you could also build the list of strings with a loop, adding \text and the trailing {.
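As a minimal sketch of that idea, the list can be built with mapcar. The names my-foreign-languages and my-foreign-lang-re are made up for this example, and the language list is only illustrative:

```elisp
;; Hypothetical variable: the polyglossia languages to protect.
(defvar my-foreign-languages '("spanish" "russian" "french")
  "Languages whose \\textLANG{...} commands should not be broken.")

;; Build "\textspanish{", "\textrussian{", ... and let regexp-opt
;; quote the backslash and combine the strings into one regexp.
(defvar my-foreign-lang-re
  (regexp-opt
   (mapcar (lambda (lang) (concat "\\text" lang "{"))
           my-foreign-languages))
  "Regexp matching the opening of any supported \\textLANG{ command.")
```

Adding a language then only means adding a string to my-foreign-languages before the regexp is built.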
If a simple heuristic is good enough for you, namely never auto-filling a line when the opening command appears anywhere on it, you could use thing-at-point like so:
(defun foreign-language-nobreak-p ()
  (string-match
   foreign-lang-re
   (thing-at-point 'line t)))
This does not work well once you have already closed the command with a }, because the rest of the line is then never filled either. For that to work better, you'd need to search backward from your current point for the optimized regexp and forward for a closing curly brace, limiting the searches to (line-beginning-position) and (line-end-position) respectively. This would get really hairy if you start to use other commands with curly braces inside the \textspanish command, though.
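A rough sketch of the bounded search could look like the following. It is a variation rather than a tested solution: instead of searching forward for the closing brace, it checks whether a } already appears between the opening command and point. The my- names are invented for the example, and nested braces inside the command are not handled:

```elisp
;; Hypothetical name; same kind of regexp as foreign-lang-re above.
(defvar my-foreign-lang-re
  (regexp-opt '("\\textspanish{" "\\textrussian{" "\\textfrench{"))
  "Regexp matching the opening of a polyglossia inline command.")

(defun my-foreign-language-nobreak-p ()
  "Return non-nil if point is inside an unclosed \\textLANG{ on this line."
  (save-excursion
    (let ((pos (point)))
      ;; Look for the nearest opener between line start and point.
      (and (re-search-backward my-foreign-lang-re
                               (line-beginning-position) t)
           ;; Inside the command only if no "}" closes it before POS.
           (progn
             (goto-char (match-end 0))
             (not (search-forward "}" pos t)))))))

(add-hook 'fill-nobreak-predicate #'my-foreign-language-nobreak-p)
```

Because the search is limited to the current line, a command that already spans several lines is not recognized.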
Hope that makes sense and helps a bit.

Related

How can I automatically close brackets in Emacs? [duplicate]

I've just started using emacs, and there's one feature I'd really like, and searching around a bit was fruitless. I hope someone else has done this because I don't want to learn elisp just yet.
void foo()<cursor>
I would like typing an "{" to cause this to happen
void foo(){
<cursor>
}
I would like this to only happen in cc-mode, and only at the end of a line when not in a string/comment/etc.
The first thing that came to mind was rebinding "{" to do this always (I could figure out how to do this myself), but it would be hard to make it only happen at the right time.
Any hints would be appreciated.
On recent Emacs versions you can use electric-pair-mode:
electric-pair-mode is an interactive compiled Lisp function.
(electric-pair-mode &optional ARG)
Automatically pair-up parens when inserting an open paren.
This is included in Emacs 24.1 (the development version at the time of writing).
This will do it:
(defun my-c-mode-insert-lcurly ()
  (interactive)
  (insert "{")
  (let ((pps (syntax-ppss)))
    (when (and (eolp) (not (or (nth 3 pps) (nth 4 pps)))) ;; EOL and not in string or comment
      (c-indent-line)
      (insert "\n\n}")
      (c-indent-line)
      (forward-line -1)
      (c-indent-line))))
(define-key c-mode-base-map "{" 'my-c-mode-insert-lcurly)
Turn on electric-pair-mode in Emacs 24 or newer:
(electric-pair-mode 1)
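Since the question asks for this in cc-mode only, one possible variation is to enable pairing per buffer instead of globally. This sketch assumes Emacs 25.1 or newer, where electric-pair-local-mode is available:

```elisp
;; Enable bracket pairing only in CC Mode buffers (C, C++, Java, ...)
;; rather than in every buffer.
(add-hook 'c-mode-common-hook #'electric-pair-local-mode)
```

With the option electric-pair-open-newline-between-pairs left at its default, pressing RET between the inserted braces should open an empty line between them, which comes close to the layout asked for.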
I heartily recommend trying out the excellent autopair minor mode: it does a lot more than simply inserting braces and makes Emacs a lot more IDE-like in that area. I guess combining it with the electric braces setting in cc-mode will give you more or less the behavior you seek.
Try yasnippet (see also the yasnippet page on the Emacs Wiki). There are many packages for Emacs which support doing this kind of thing, but yasnippet seems to have momentum currently and is very extensible. Check out the videos.
You will need to delve into emacs-lisp to do this exactly as you wish, since YASnippet will do something nice for you but not exactly what you're asking for.
I think the simplest way to do this would be to bind a function to the RET key, in the cc-mode key-map.
The function should check that the previous character is a { and, if so, perform the required RET, RET, TAB, }, Up, TAB to insert the closing } and put the cursor where you want it.
You can make the feature more robust by having it check for a balanced closing } but this would be more complicated, and I'd recommend seeing how it feels without this additional polishing feature.
If you like I can write the function and the key-map binding for you, but since you asked for an idea of how it's done, I'll leave it up to you to ask for more assistance if you need it.
Alternatively, I find that autopair.el does this nicely enough for me, and I do the newlines myself ;)
You might want to keep the option of an empty function body, in which case you would want the closing brace to stay on the same line. If that is the case, then you can try this alternative solution:
Rely on the packages mentioned in the previous replies to automatically add the closing brace.
When you want to add statements to the function body, you press the Return key (while the automatically added closing brace is still under the cursor). The 'Return' key is bound as follows:
;; automatic first line in function
(defun my-c-mode-insert-funline ()
  (interactive)
  (newline-and-indent)
  (when (looking-at "}")
    (newline-and-indent)
    (forward-line -1)
    (c-indent-line)))
;; Bind in CC Mode buffers only, since the command calls c-indent-line.
(define-key c-mode-base-map (kbd "RET") 'my-c-mode-insert-funline)

How do I font lock dollar signs (math mode delimiters) in AUCTeX buffer only outside comments?

I wrote the following code to highlight dollar signs in AUCTeX buffers in different colors, but then I found that it's even highlighting dollar signs in comments, which was unintended, but I am starting to like it. But now just for curiosity, I wonder if that can be avoided.
(defun my-LaTeX-mode-dollars ()
  (font-lock-add-keywords
   nil
   `((,(rx "$") (0 'success t)))
   t))
(add-hook 'LaTeX-mode-hook 'my-LaTeX-mode-dollars)
From the documentation of font-lock-keywords:
MATCH-HIGHLIGHT should be of the form:
(SUBEXP FACENAME [OVERRIDE [LAXMATCH]])
OVERRIDE and LAXMATCH are flags. If OVERRIDE is t, existing
fontification can be overwritten. If `keep', only parts not already
fontified are highlighted. If `prepend' or `append', existing
fontification is merged with the new, in which the new or existing
fontification, respectively, takes precedence.
In other words, if you drop the t after 'success, it will no longer fontify dollar signs in comments and strings.
EDIT:
Apparently, the above solution is not sufficient in this situation, probably because dollar signs have been colored using another face earlier.
One way that might work is to not pass the HOW parameter (currently t) to font-lock-add-keywords. The keywords are then added to the beginning of the list instead of the end, so they are applied before the mode's own rules. However, this might cause other things to stop working.
If you need a bigger hammer, you can write a more advanced rule that inspects the current fontification and decides what to do based on it. For example, the following is used by Emacs to add a warning face to parentheses placed at column 0 in strings:
"^\\s("
(0
 (if
     (memq
      (get-text-property
       (match-beginning 0)
       'face)
      '(font-lock-string-face font-lock-doc-face font-lock-comment-face))
     (list 'face font-lock-warning-face 'help-echo "Looks like a toplevel defun: escape the parenthesis"))
 prepend)
A third way to do this is to replace the regexp (rx "$") with the name of a function that searches for $ and checks that it appears in the correct context. One example of such font-lock rules can be found in the standard Emacs package cwarn.
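A minimal sketch of such a matcher function, assuming that the mode's syntax table marks % as a comment starter (as LaTeX modes normally do). The function names are invented for this example:

```elisp
(defun my-LaTeX-match-dollar-outside-comment (limit)
  "Font-lock matcher: find the next \"$\" before LIMIT that is not in a comment.
Return non-nil on success, with the match data set to the dollar sign."
  (let (found)
    (while (and (not found) (search-forward "$" limit t))
      ;; (nth 4 STATE) is non-nil when the position is inside a comment.
      (unless (nth 4 (save-match-data (syntax-ppss)))
        (setq found t)))
    found))

(defun my-LaTeX-mode-dollars-outside-comments ()
  (font-lock-add-keywords
   nil
   '((my-LaTeX-match-dollar-outside-comment (0 'success t)))
   t))

(add-hook 'LaTeX-mode-hook #'my-LaTeX-mode-dollars-outside-comments)
```

Font lock calls the matcher with the end of the region as LIMIT and expects it to behave like re-search-forward, which is why the function leaves the match data from search-forward untouched.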

Define a character as a word boundary

I've defined the \ character to behave as a word constituent in latex-mode, and I'm pretty happy with the results. The only thing bothering me is that a sequence like \alpha\beta gets treated as a single word (which is the expected behavior, of course).
Is there a way to make emacs interpret a specific character as a word "starter"? This way it would always be considered part of the word following it, but never part of the word preceding it.
For clarity, here's an example:
\alpha\beta
^ ^
1 2
If the point is at 1 and I press M-d, the string "\alpha" should be killed.
If the point is at 2 and I press M-<backspace>, the string "\beta" should be killed.
How can I achieve this?
Another thought:
Your requirement is very like what subword-mode provides for camelCase.
You can't customize subword-mode's behaviour -- the regexps are hard-coded -- but you could certainly copy that library and modify it for your purposes.
M-x find-library RET subword RET
That would presumably be a pretty robust solution.
Edit: updated from the comments, as suggested:
For the record, changing every instance of [[:upper:]] to [\\\\[:upper:]] in the functions subword-forward-internal and subword-backward-internal inside subword.el works great =) (as long as "\" is defined as "w" syntax).
Personally I would be more inclined to make a copy of the library than edit it directly, unless for the purpose of making the existing library a little more general-purpose, for which the simplest solution would seem to be to move those regexps into variables -- after which it would be trivial to have buffer-local modified versions for this kind of purpose.
Edit 2: As of Emacs 24.3 (currently a release candidate), subword-mode facilitates this with the new subword-forward-regexp and subword-backward-regexp variables (for simple modifications), and the subword-forward-function and subword-backward-function variables (for more complex modifications).
By making those regexp variables buffer-local in latex-mode with the desired values, you can just use subword-mode directly.
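A sketch of what that could look like. It assumes that the default values of the two variables still contain the literal [[:upper:]] class mentioned above, and the function name my-latex-subword-setup is made up for the example:

```elisp
(require 'subword)

(defun my-latex-subword-setup ()
  "Treat a backslash like an upper-case letter for `subword-mode' here."
  ;; The backslash must have word syntax for this to work.  Note that
  ;; this changes the syntax table shared by all buffers of the mode.
  (modify-syntax-entry ?\\ "w")
  (dolist (var '(subword-forward-regexp subword-backward-regexp))
    ;; Buffer-local copy of the default regexp, with every
    ;; [[:upper:]] widened to also accept a backslash.
    (set (make-local-variable var)
         (replace-regexp-in-string
          (regexp-quote "[[:upper:]]") "[\\\\[:upper:]]"
          (default-value var) t t)))
  (subword-mode 1))

(add-hook 'latex-mode-hook #'my-latex-subword-setup)
```

With AUCTeX the hook would be LaTeX-mode-hook instead of latex-mode-hook.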
You should be able to implement this using syntax text properties:
M-: (info "(elisp) Syntax Properties") RET
Edit: Actually, I'm not sure if you can do precisely this?
The following (which is just experimentation) is close, but M-<backspace> at 2 will only delete "beta" and not the preceding "\".
I suppose you could remap backward-kill-word to a function which checked for that preceding "\" and killed that as well. Fairly hacky, but it would probably do the trick if there's not a cleaner solution.
I haven't played with this functionality before; perhaps someone else can clarify.
(modify-syntax-entry ?\\ "w")
(setq parse-sexp-lookup-properties t)
(setq syntax-propertize-function 'my-propertize-syntax)
(defun my-propertize-syntax (start end)
  "Set custom syntax properties."
  (save-excursion
    (goto-char start)
    (while (re-search-forward "\\w\\\\" end t)
      ;; The property value must be a raw syntax descriptor, not a string.
      (put-text-property
       (1- (point)) (point) 'syntax-table (string-to-syntax ".")))))
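The remapping idea mentioned above could be sketched roughly as follows. The command name is hypothetical, and the code only assumes that backward-word stops right after the backslash:

```elisp
(defun my-backward-kill-word-with-backslash (arg)
  "Kill ARG words backward, including a backslash right before the word."
  (interactive "p")
  (let ((end (point)))
    (backward-word arg)
    ;; If the word is a control sequence, take its backslash too.
    (when (eq (char-before) ?\\)
      (backward-char))
    (kill-region (point) end)))
```

It could then be bound in the mode's keymap with a [remap backward-kill-word] binding, so that M-<backspace> picks it up without touching other modes.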

How can you modify two matching delimiters at once with Emacs?

While this question concerns the formatting of LaTeX within Emacs (and maybe Auctex), I believe this can be applied to more general situations in Emacs concerning delimiters like parentheses, brackets, and braces.
I am looking to be able to do the following with Emacs (and elisp), and don't know where to begin. Say I have:
(This is in parentheses)
With some keybinding in Emacs, I want Emacs to find the matching delimiter to whichever one is by my cursor (something I know Emacs can do since it can highlight matching delimiters in various modes) and be able to change both of them to
\left( This is in parentheses \right)
The delimiters I would like this to work with are: (...), [...], \lvert ... \rvert, \langle ... \rangle, \{ ... \}. What elisp would I need to accomplish this task?
More general ways to handle matching delimiters are welcome.
Evaluate the command definition below in Emacs. You can then put the point (text cursor) immediately after a closing paren and do M-x replace-matching-parens to replace the closing ) with \right) and the matching start paren ( with \left(.
(defun replace-matching-parens ()
  (interactive)
  (save-excursion
    (let ((end-point (point)))
      (backward-list)
      (let ((start-point (point)))
        (goto-char end-point)
        (re-search-backward ")" nil t)
        (replace-match " \\\\right)" nil nil)
        (goto-char start-point)
        (re-search-forward "(" nil t)
        (replace-match "\\\\left( " nil nil)))))
The interactive bit indicates that I want a "command", so it can be executed using M-x. To avoid the cursor ending up in a strange place after execution I'm wrapping the logic in save-excursion. The point jumps back to the opening paren using backward-list and holds on to the start and end positions of the paren-matched region. Lastly, starting at the end and working backwards I replace the strings. By replacing backwards rather than forwards I avoid invalidating end-point before I need it.
Generalizing this to handle different kinds of delimiters shouldn't be too bad. backward-list ought to work with any pair of strings emacs recognizes as analogues of ( and ). To add more parenthesis-like string pairs, check out set-syntax-table in this Parenthesis Matching article.
Use global-set-key to setup a key binding to replace-matching-parens.
Fair warning: replace-matching-parens is the first elisp command I've implemented, so it may not align with best practices. To all the gurus out there, I'm open to constructive criticism.
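As one possible generalization of the command above, the sketch below inserts text around whatever pair the syntax table considers matching instead of searching for literal parentheses. The name my-wrap-matching-delimiters is invented here. It only covers single-character delimiters such as (...) and [...]; multi-character pairs like \langle ... \rangle are not handled:

```elisp
(defun my-wrap-matching-delimiters ()
  "Add \\left and \\right to the delimiter pair that ends just before point."
  (interactive)
  (unless (eq (char-syntax (or (char-before) ?\s)) ?\))
    (user-error "Point is not right after a closing delimiter"))
  (save-excursion
    (let ((end (point))
          ;; Position of the matching opening delimiter.
          (start (scan-sexps (point) -1)))
      ;; Edit the closing delimiter first so START stays valid.
      (goto-char (1- end))
      (insert " \\right")
      (goto-char (1+ start))
      (insert " ")
      (goto-char start)
      (insert "\\left"))))
```

With point after the closing paren of (This is in parentheses), this should produce \left( This is in parentheses \right).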

How do I run an Emacs hook when a buffer is modified?

Building on "Getting Emacs to untabify when saving certain file types (and only those file types)", I'd like to run a hook to untabify my C++ files when I start modifying the buffer. I tried adding hooks to untabify the buffer on load, but then it untabifies all my writable files that are autoloaded when Emacs starts.
(For those that wonder why I'm doing this, it's because where I work enforces the use of tabs in files, which I'm happy to comply with. The problem is that I mark up my files to tell me when lines are too long, but the regexp matches the number of characters in the line, not how much space the line takes up. 4 tabs in a line can push it far over my 132 character limit, but the line won't be marked appropriately. Thus, I need a way to tabify and untabify automatically.)
Take a look at the variable "before-change-functions".
Perhaps something along this line (warning: code not tested):
(add-hook 'before-change-functions
          (lambda (&rest args)
            (if (not (buffer-modified-p))
                (untabify (point-min) (point-max)))))
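To limit this to C++ buffers, the function could be installed buffer-locally from the mode hook. This is a sketch with made-up function names and it is as untested in real editing as the snippet above. In particular, changing the buffer from inside before-change-functions may interfere with the edit that triggered it:

```elisp
(defun my-untabify-before-first-change (&rest _args)
  "Untabify the whole buffer if it has not been modified yet."
  (unless (buffer-modified-p)
    (untabify (point-min) (point-max))))

(defun my-c++-untabify-setup ()
  ;; The final t makes the hook buffer-local, so other buffers
  ;; (including files loaded at startup) are left alone.
  (add-hook 'before-change-functions
            #'my-untabify-before-first-change nil t))

(add-hook 'c++-mode-hook #'my-c++-untabify-setup)
```

After the first call the buffer counts as modified, so later edits skip the untabify step until the buffer is saved.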
Here is what I added to my emacs file to untabify on load:
(defun untabify-buffer ()
  "Untabify current buffer"
  (interactive)
  (untabify (point-min) (point-max)))
(defun untabify-hook ()
  (untabify-buffer))
;; Add the untabify hook to any modes you want untabified on load
(add-hook 'nxml-mode-hook 'untabify-hook)
This answer is tangential, but may be of use.
The package wide-column.el changes the cursor color when the cursor is past a given column, and the cursor colors can vary depending on the settings. This sounds like a less intrusive solution than your regular expression code, but it may not suit your needs.
And a different, tangential answer.
You mentioned that your regexp wasn't good enough to tell when the 132 character limit was met. Perhaps a better regexp...
This regexp will match a line when it is wider than 132 columns, assuming a tab width of 4. (I think I got the math right.)
"^\\(?:\t\\|[^\t\n]\\{4\\}\\)\\{33\\}\\(.+\\)$"
The last parenthesized expression is the set of characters that are over the limit. The first parenthesized expression is shy.
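If a regexp is not strictly required, the display width can also be measured directly, since current-column takes tab-width into account. A small sketch with a hypothetical function name:

```elisp
(defun my-line-too-wide-p (&optional limit)
  "Return non-nil if the current line is wider than LIMIT columns.
LIMIT defaults to 132.  Tabs count according to `tab-width'."
  (save-excursion
    (end-of-line)
    (> (current-column) (or limit 132))))
```

This avoids having to encode the tab arithmetic in the pattern, at the cost of not plugging straight into regexp-based highlighting.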