Emacs Lisp has replace-string but has no replace-char. I want to replace "typographic" curly quotes (Emacs code for this character is hexadecimal 53979) with regular ASCII quotes, and I can do so with:
(replace-string (make-string 1 ?\x53979) "'")
I think it would be better with replace-char.
What is the best way to do this?
This is the way I replace characters in elisp:
(subst-char-in-string ?' ?’ "John's")
gives:
"John’s"
Note that this function doesn't accept characters as string. The first and second argument must be a literal character (either using the ? notation or string-to-char).
Also note that this function can be destructive if the optional inplace argument is non-nil.
Why not just use
(replace-string "\x53979" "'")
or
(while (search-forward "\x53979" nil t)
(replace-match "'" nil t))
as recommended in the documentation for replace-string?
which would certainly be better with replace-char. Any way to improve my code?
Is it actually slow to the point where it matters? My elisp is usually ridiculously inefficient and I never notice. (I only use it for editor tools though, YMMV if you're building the next MS live search with it.)
Also, reading the docs:
This function is usually the wrong thing to use in a Lisp program.
What you probably want is a loop like this:
(while (search-forward "’" nil t)
(replace-match "'" nil t))
This answer is probably GPL licensed now.
What about this
(defun my-replace-smart-quotes (beg end)
"replaces ’ (the curly typographical quote, unicode hexa 2019) to ' (ordinary ascii quote)."
(interactive "r")
(save-excursion
(format-replace-strings '(("\x2019" . "'")) nil beg end)))
Once you have that in your dotemacs, you can paste elisp example codes (from blogs and etc) to your scratch buffer and then immediately press C-M-\ (to indent it properly) and then M-x my-replace-smart-quotes (to fix smart quotes) and finally C-x C-e (to run it).
I find that the curly quote is always hexa 2019, are you sure it's 53979 in your case? You can check characters in buffer with C-u C-x =.
I think you can write "’" in place of "\x2019" in the definition of my-replace-smart-quotes and be fine. It's just to be on the safe side.
Related
The latex package polyglossia allows for correct typesetting of foreign languages: it provides an inline command of the form \text[foreign_language]{...}, such as \textspanish{Soy el hijo de Fernando}.
Since I use Emacs, I have to prevent auto-fill mode from breaking up those code blocks: for example, ending a line with \textspanish{Soy el, and beginning the next line with hijo de Fernando}. When auto-fill mode breaks lines in this way, the exporter gets confused.
I tried creating a function to add as hook to fill-nobreak-predicate, but my knowledge of regular expressions and elips is not good enough. This is how far I got:
(defun foreign-language-nobreak-p ()
(or (looking-at "[[[:space:]]\|[[:print:]]].*}")
(save-excursion
(skip-chars-backward " \t")
(unless (bolp)
(backward-char 1)
(looking-at ".*\\text")))))
(add-hook 'fill-nobreak-predicate #'foreign-language-nobreak-p)
Any ideas on what went wrong?
First of all, thanks for pointing out fill-nobreak-predicate. Never heard about it in my first 23 years of Emacs.
Regarding your regexp question, I'd like to mention the function regexp-opt which takes a list of strings and builds an efficient regexp that matches those strings:
(defvar foreign-lang-re
(regexp-opt
'("\\textspanish{"
"\\textrussian{"
"\\textfrench{")))
If you factor out the supported languages into yet another variable, you could also build the list of strings with a loop, adding \text and the trailing {.
If your heuristic would be stable enough that you don't want auto filling to kick in when there is just the opening command somewhere on the current line, you could use thing-at-point like so:
(defun foreign-language-nobreak-p ()
(string-match
foreign-lang-re
(thing-at-point 'line t)))
This does not work when you closed the command already with a }. For that to work better, you'd need to search backward from your current point for the optimized regexp and forward for a closing curly brace, limiting the search to (bolp) and (eolp) respectively. This would get really hairy if you start to use other commands with curly braces inside the \textspanish command, though.
Hope that makes sense and helps a bit.
I'd like (in TeX-related modes) the tilde key to insert itself as usual if point is on anything (in particular a line end), but if a point is on space, I'd like the tilde to overwrite it. (This would be a quite useful feature after pasting something into TeX source file.) I hacked something like this:
(defun electric-tie ()
"Inserts a tilde at point unless the point is at a space
character, in which case it deletes the space first."
(interactive)
(while (equal (char-after) 32) (delete-char 1))
(while (equal (char-before) 32) (delete-char -1))
(insert "~"))
(add-hook 'TeX-mode-hook (lambda () (local-set-key "~" 'electric-tie)))
My questions are simple: is it correct (it seems to work) and can it be done better? (I assume that if the answer to the first question is in the affirmative, the latter is a question of style.)
As mentioned, it's better to use a "character" literal than a number literal. You have the choice between ? , ?\ , and ?\s where the last one is only supported since Emacs-22 but is otherwise the recommended way, since it's (as you say) "more easily visible" and also there' no risk that the space char will be turned into something else (or removed) by things like fill-paragraph or whitespace trimming.
You can indeed use eq instead of equal, but the difference is not important.
Finally, I'd call (call-interactively 'self-insert-command) rather than insert by hand, but the difference is not that important (e.g. it'll let you insert 3 tildes with C-u ~).
Some points:
Instead of 32 use ? (question-mark space) to express character literal.
Instead of defining keys in the major-mode hooks, do it in an eval-after-load block. The difference is that major-mode hook runs every time you use the major-mode, but there is only one keymap per major-mode. So there is no point in repeatedly redefining a key in it.
see: https://stackoverflow.com/a/8139587/903943
It looks like this command should not take a numeric argument, but it's worth understanding interactive specs to know how other commands you write can be made to be more flexible by taking numeric arguments into consideration.
One more note about your new modifications:
Your way to clear spaces around point is not wrong, but I'd do this:
(defun foo ()
(interactive)
(skip-chars-forward " ")
(delete-region (point) (+ (point) (skip-chars-backward " "))))
I've defined the \ character to behave as a word constituent in latex-mode, and I'm pretty happy with the results. The only thing bothering me is that a sequence like \alpha\beta gets treated as a single word (which is the expected behavior, of course).
Is there a way to make emacs interpret a specific character as a word "starter"? This way it would always be considered part of the word following it, but never part of the word preceding it.
For clarity, here's an example:
\alpha\beta
^ ^
1 2
If the point is at 1 and I press M-d, the string "\alpha" should be killed.
If the point is at 2 and I press M-<backspace>, the string "\beta" should be killed.
How can I achieve this?
Another thought:
Your requirement is very like what subword-mode provides for camelCase.
You can't customize subword-mode's behaviour -- the regexps are hard-coded -- but you could certainly copy that library and modify it for your purposes.
M-x find-library RET subword RET
That would presumably be a pretty robust solution.
Edit: updated from the comments, as suggested:
For the record, changing every instance of [[:upper:]] to [\\\\[:upper:]] in the functions subword-forward-internal and subword-backward-internal inside subword.el works great =) (as long as "\" is defined as "w" syntax).
Personally I would be more inclined to make a copy of the library than edit it directly, unless for the purpose of making the existing library a little more general-purpose, for which the simplest solution would seem to be to move those regexps into variables -- after which it would be trivial to have buffer-local modified versions for this kind of purpose.
Edit 2: As of Emacs 24.3 (currently a release candidate), subword-mode facilitates this with the new subword-forward-regexp and subword-backward-regexp variables (for simple modifications), and the subword-forward-function and subword-backward-function variables (for more complex modifications).
By making those regexp variables buffer-local in latex-mode with the desired values, you can just use subword-mode directly.
You should be able to implement this using syntax text properties:
M-: (info "(elisp) Syntax Properties") RET
Edit: Actually, I'm not sure if you can do precisely this?
The following (which is just experimentation) is close, but M-<backspace> at 2 will only delete "beta" and not the preceding "\".
I suppose you could remap backward-kill-word to a function which checked for that preceding "\" and killed that as well. Fairly hacky, but it would probably do the trick if there's not a cleaner solution.
I haven't played with this functionality before; perhaps someone else can clarify.
(modify-syntax-entry ?\\ "w")
(setq parse-sexp-lookup-properties t)
(setq syntax-propertize-function 'my-propertize-syntax)
(defun my-propertize-syntax (start end)
"Set custom syntax properties."
(save-excursion
(goto-char start)
(while (re-search-forward "\\w\\\\" end t)
(put-text-property
(1- (point)) (point) 'syntax-table (cons "." ?\\)))))
While this question concerns the formatting of LaTeX within Emacs (and maybe Auctex), I believe this can be applied to more general situations in Emacs concerning delimiters like parentheses, brackets, and braces.
I am looking to be able to do the following with Emacs (and elisp), and don't know where to begin. Say I have:
(This is in parentheses)
With some keybinding in Emacs, I want Emacs to find the matching delimiter to whichever one is by my cursor (something I know Emacs can do since it can highlight matching delimiters in various modes) and be able to change both of them to
\left( This is in parentheses \right)
The delimiters I would like this to work with are: (...), [...], \lvert ... \rvert, \langle ... \rangle, \{ ... \}. What elisp would I need to accomplish this task?
More general ways to handle matching delimiters are welcome.
Evaluate the command below in Emacs. After reloading you can put the point (text cursor) immediately after a closing paren. Then do M-x replace-matching-parens to replace the closing ) with \right) and the matching start paren ( with \left(.
(defun replace-matching-parens ()
(interactive)
(save-excursion
(let ((end-point (point)))
(backward-list)
(let ((start-point (point)))
(goto-char end-point)
(re-search-backward ")" nil t)
(replace-match " \\\\right)" nil nil)
(goto-char start-point)
(re-search-forward "(" nil t)
(replace-match "\\\\left( " nil nil)))))
The interactive bit indicates that I want a "command", so it can be executed using M-x. To avoid the cursor ending up in a strange place after execution I'm wrapping the logic in save-excursion. The point jumps back to the opening paren using backward-list and holds on to the start and end positions of the paren-matched region. Lastly, starting at the end and working backwards I replace the strings. By replacing backwards rather than forwards I avoid invalidating end-point before I need it.
Generalizing this to handle different kinds of delimiters shouldn't be too bad. backward-list ought to work with any pair of strings emacs recognizes as analogues of ( and ). To add more parenthesis-like string pairs, check out set-syntax-table in this Parenthesis Matching article.
Use global-set-key to setup a key binding to replace-matching-parens.
Fair warning: replace-matching-parens is the first elisp command I've implemented, so it may not align with best practices. To all the gurus out there, I'm open to constructive criticism.
Question as title.
More specifically, I'm rather tired of having to type \(, etc. every time I want a parenthesis in Emacs's (interactive) regexp functions (not to mention the \\( in code). So I wrote something like
(defadvice query-replace-regexp (before my-query-replace-regexp activate)
(ad-set-arg 0 (replace-regexp-in-string "(" "\\\\(" (ad-get-arg 0)))
(ad-set-arg 0 (replace-regexp-in-string ")" "\\\\)" (ad-get-arg 0)))))
in hope that I can conveniently forget about emacs's idiosyncrasy in regexp during "interaction mode". Except I cannot get the regexp right...
(replace-regexp-in-string "(" "\\\\(" "(abc")
gives \\(abc instead of the wanted \(abc. Other variations on the number of slashes just gives errors. Thoughts?
Since I started questioning, might as well ask another one: since lisp code is not supposed to use interactive functions, advicing query-replace-regexp should be okay, am I correct?
The replacement you has works well for me.
Take the text:
hi there mom
hi son!
and try query-replace-regexp with your advice:
M-x query-replace-regexp (hi).*(mom) RET \1 \2! RET
yields
hi mom!
hi son!
I didn't have to put a backslash in front of the parentheses to get them to group. That said, this disables being able to match actual parentheses...
The reason the replace-regexp-in-string yields \\(abc, is that as a string, that is equivalent to an interactively typed \(abc. In a string \ is used to denote that the following character is special, e.g. "\t" is a string with a tab. So, in order to specify just a backslash, you need to use a backslash in front of it "\\" is a string containing a backslash.
Regarding advising interactive functions, lisp code can call interactive functions all it wants. A prime example is find-file - which is called all over the place. To make your advice a little safer, you can wrap the body with a check for interactive-p to avoid mussing with internal calls:
(defadvice query-replace-regexp (before my-query-replace-regexp activate)
(when (interactive-p)
(ad-set-arg 0 (replace-regexp-in-string "(" "\\\\(" (ad-get-arg 0)))
(ad-set-arg 0 (replace-regexp-in-string ")" "\\\\)" (ad-get-arg 0)))))