Elisp macros for cleaning LaTeX-code - emacs

Does anyone know of some good elisp macros for cleaning up LaTeX code?
I do a lot of LaTeX editing of other people's sources, and I'd like to extend my set of clean-up tools, since not everyone organizes their code the way I like it ;-)
One in particular would be interesting: run function X on a buffer and get all LaTeX environments (\begin{...} and \end{...} pairs) to sit on lines of their own, which helps the readability of the code.
I could try this myself, but I would like to hear suggestions on best practice for programming such a function; e.g., it should of course not introduce blank lines of its own.
Suggestions?
Edit: For the archives, here is my current version, based on the answer given (it assumes the use of AUCTeX). It more or less suits my needs at the moment. I added the y-or-n test just to be able to detect corner cases that I had not thought of.
(defun enviro-split ()
  "Find begin and end macros, and put them on their own line."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    ;; loop over document looking for begin and end macros
    (while (re-search-forward "\\\\\\(begin\\|end\\)" nil t)
      (catch 'continue
        ;; if the line is a pure comment, then go to the next match
        (if (TeX-in-commented-line)
            (throw 'continue nil))
        ;; when you find one, back up to the beginning of the macro
        (search-backward "\\")
        ;; If it's not at the beginning of the line, add a newline
        (when (not (looking-back "^[ \t]*" (line-beginning-position)))
          (if (y-or-n-p "newline?")
              (insert "\n")))
        ;; move over the arguments, one or two pairs of matching braces
        (search-forward "{")            ; start of the argument
        (forward-char -1)
        (forward-sexp)                  ; move over the argument
        (if (looking-at "[ \t]*{")      ; is there a second argument?
            (forward-sexp))             ; move over it if so
        (if (looking-at "[ \t]*\\[")    ; is there an optional argument?
            (forward-sexp))             ; move over it if so
        ;; keep a directly following \label on the same line
        (when (looking-at (concat "[ \t]*" (regexp-quote TeX-esc) "label"))
          (goto-char (match-end 0))
          (forward-sexp))
        ;; leave a trailing comment where it is
        (if (looking-at "[ \t]*%")
            (throw 'continue nil))
        ;; If there is anything other than whitespace following the macro,
        ;; insert a newline
        (if (not (looking-at "\\s *$"))
            (if (y-or-n-p "newline (a)?")
                (insert "\n")))))
    (LaTeX-fill-buffer 'left)))

You could probably work up a single regexp and do a regexp replace for this. However, I find the logic of these manipulations becomes pretty hairy, particularly when you want to account for various edge-cases. In your example, you need to deal with some environments taking one argument, while others take two. I think it is easier to combine a series of simple regexps with basic text editing commands for this:
(defun enviro-split ()
  "Find begin and end macros, and put them on their own line."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    ;; loop over document looking for begin and end macros
    (while (re-search-forward "\\\\\\(begin\\|end\\)" nil t)
      ;; when you find one, back up to the beginning of the macro
      (search-backward "\\")
      ;; If it's not at the beginning of the line, add a newline
      (when (not (looking-at "^"))
        (insert "\n"))
      ;; move over the arguments, one or two pairs of matching braces
      (search-forward "{")          ; start of the argument
      (forward-char -1)
      (forward-sexp)                ; move over the argument
      (if (looking-at "\\s *{")     ; is there a second argument?
          (forward-sexp))           ; move over it if so
      ;; If there is anything other than whitespace following the macro,
      ;; insert a newline
      (if (not (looking-at "\\s *$"))
          (insert "\n")))))
This approach has the advantage of using Emacs' built-in functions for moving over sexps, which is much easier than coming up with your own regexp that can handle multiple, potentially nested, expressions inside braces.
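To make the point about nested braces concrete, here is a minimal sketch. The helper names `my/brace-group-after-point` and `my/first-brace-group` are made up for illustration, and the scratch buffer gets an explicit syntax table so that braces count as paired delimiters, as they do in a LaTeX buffer.

```elisp
(defun my/brace-group-after-point ()
  "Return the next brace group after point as a string, or nil if there is none.
Hypothetical helper, for illustration only."
  (save-excursion
    (when (search-forward "{" nil t)
      (forward-char -1)
      (let ((start (point)))
        (forward-sexp)                  ; also steps over any nested groups
        (buffer-substring-no-properties start (point))))))

(defun my/first-brace-group (text)
  "Return the first brace group found in TEXT, or nil."
  (with-temp-buffer
    ;; Assumption: treat { and } as matching delimiters in this scratch buffer.
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\{ "(}" table)
      (modify-syntax-entry ?\} "){" table)
      (set-syntax-table table))
    (insert text)
    (goto-char (point-min))
    (my/brace-group-after-point)))

(my/first-brace-group "\\section{A {nested} title} more text")
;; => "{A {nested} title}"
```

A single regexp such as "{[^}]*}" would stop at the first closing brace here, which is the kind of edge case forward-sexp avoids.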

Related

How do I navigate efficiently through emacs buffer modifying lines

I am an elisp (but not programming) beginner and have some questions about the best practice to implement a function. I have written an elisp function that reformats assembler source code according to certain rules; this function currently works for a single line. It basically uses navigation within the line, looking-at and replace-match calls on subexpressions to achieve the goal.
Now I'd like to apply it to a marked region, processing the region line by line. The behaviour will be similar to the indent-region function.
What is the recommended (and efficient) way to implement this? I consider using (line-number-at-pos ...) applied to (region-beginning) and (region-end) to count line numbers and then move from top to bottom, working through the buffer line by line, modifying these.
Also, what would I need to preserve through this operation? I thought about (save-match-data ...) and am not sure how to handle mark and point. I guess they will be useless because the text extent changed.
Use save-excursion to save and restore point and mark and save-restriction to narrow to the region.
The template would be something like this:
(defun my-process-region (beg end)
  "Apply `my-process-line' to every line in region."
  (interactive "r")
  (save-restriction
    (widen)
    (save-excursion
      (narrow-to-region beg end)
      (goto-char (point-min))
      ;; `my-process-line' must leave point on the next line,
      ;; otherwise this loop never terminates.
      (while (not (eobp))
        (my-process-line)))))
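As a concrete instance of this template, here is a small sketch in which the per-line function just strips trailing whitespace. Both `my/trim-line` and `my/trim-region` are hypothetical names; the only thing that matters is that the per-line function moves point to the next line itself, so the loop can terminate.

```elisp
(defun my/trim-line ()
  "Delete trailing whitespace on the current line, then move to the next line.
Hypothetical stand-in for a real per-line reformatting function."
  (end-of-line)
  (delete-horizontal-space)
  (forward-line 1))

(defun my/trim-region (beg end)
  "Apply `my/trim-line' to every line between BEG and END."
  (interactive "r")
  (save-restriction
    (save-excursion
      (narrow-to-region beg end)
      (goto-char (point-min))
      ;; Narrowing makes `eobp' true at the end of the region,
      ;; so the loop stops there even if the buffer continues.
      (while (not (eobp))
        (my/trim-line)))))

(with-temp-buffer
  (insert "a  \nb\t\nc  ")
  (my/trim-region (point-min) (point-max))
  (buffer-string))
;; => "a\nb\nc"
```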
I accept the answer of sds. In the end, I used the code below. The reason was that I wanted entire lines available for reformatting, not just the marked region. So (narrow-to-region) alone would not have done the job.
I am happy to learn more, and appreciate comments on pros/cons or missing things:
(defun x-mode-reformat-region (beg end)
  "..."
  (interactive "r")
  (save-excursion
    (let ((nlines (+ 1 (apply '- (mapcar 'line-number-at-pos `(,end ,beg)))))
          bol
          ...)
      (goto-char beg)
      (dotimes (i nlines)
        (setq bol (line-beginning-position))
        (goto-char bol)
        ;; do reformatting for this line -- uses bol for calculations
        (forward-line)))))
Next try, modified based on a comment. I did not find a simpler way to extend the selection to include entire lines. Any idea whether the setq / narrow-to-region combination could be simplified further (other than using (progn ...) directly as an argument)?
(defun x-mode-reformat-region (beg end)
  "..."
  (interactive "r")
  (save-restriction
    (widen)
    (save-excursion
      (setq beg (progn (goto-char beg) (line-beginning-position))
            end (progn (goto-char end) (line-end-position)))
      (narrow-to-region beg end)
      (goto-char (point-min))
      (while (not (eobp))
        (insert "*") ;; placeholder for fancy reformatting
        (forward-line)))))
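One possible way to tidy this up, offered as a sketch rather than a definitive answer, is to move the "extend to whole lines" step into a small helper, so that the command itself only narrows and loops. The name `my/whole-line-bounds` is hypothetical.

```elisp
(defun my/whole-line-bounds (beg end)
  "Return (START . END) covering the whole lines touched by BEG and END.
Hypothetical helper; point is left where it was."
  (save-excursion
    (cons (progn (goto-char beg) (line-beginning-position))
          (progn (goto-char end) (line-end-position)))))

(with-temp-buffer
  (insert "one\ntwo\nthree")
  (my/whole-line-bounds 2 6))
;; => (1 . 8)
```

Inside the command this would read something like (let ((bounds (my/whole-line-bounds beg end))) (narrow-to-region (car bounds) (cdr bounds)) ...), which avoids reassigning the function arguments with setq.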

Auto-escaping yanked strings in emacs

I apparently have a powerful itch this weekend to add a ton of functionality to my Emacs environment. I can do some basics on my own, and hunt down other stuff, but I haven't been able to find a solution to this (and am not good enough at Lisp to do it on my own).
I frequently work with strings of HTML, and sometimes if I move them from one block to another (or one language to another) strings are broken where they aren't escaped. So, I want a function that does something like this:
(defun smart-yank-in-string()
(if (stringp) ; Check if the point is in a string
; Check if the region created from the point to the end of the yank ends the string
; (and there is more yank left that isn't ";")
; Escape quotes for those locations recursively by prepending \
; Insert result into buffer # mark
))
Any clever ideas? I think it involves using kill-new to stash a variable and walk through it, but I'm not conversant enough in elisp to solve it.
Next yank should insert the escaped string:
(defun escape-doublequotes-at-car-of-kill-ring ()
  "Escape doublequotes in car of kill-ring."
  (interactive)
  (with-temp-buffer
    (insert (car kill-ring))
    (goto-char (point-min))
    (while (search-forward "\"" nil t 1)
      (replace-match "\\\\\""))
    (kill-new (buffer-substring-no-properties (point-min) (point-max)))))
Here is an alternative
(defun my-yank ()
  (interactive)
  (if (nth 3 (syntax-ppss)) ;; Checks if inside a string
      (insert-for-yank (replace-regexp-in-string "[\\\"]"
                                                 "\\\\\\&"
                                                 (current-kill 0)
                                                 t))
    (call-interactively 'yank)))
When invoked, the command checks whether point is inside a string; if so, it escapes the yanked text, otherwise it yanks normally. One disadvantage is that you cannot use yank-pop after yanking inside a string.
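To see the two building blocks in isolation, here is a small sketch that separates the escaping from the "am I in a string?" test. The function names are hypothetical; the regexp and replacement are the same ones used in my-yank.

```elisp
(defun my/escape-for-string (text)
  "Return TEXT with every backslash and double quote prefixed by a backslash."
  (replace-regexp-in-string "[\\\"]" "\\\\\\&" text t))

(defun my/point-in-string-p ()
  "Return non-nil when point is inside a string, per the current syntax table."
  (nth 3 (syntax-ppss)))

(my/escape-for-string "<a href=\"x\">")
;; => "<a href=\\\"x\\\">"
;; i.e. the text  <a href=\"x\">  when inserted into a buffer
```

Note that the escaping rule here (backslash before quotes and backslashes) is an assumption that suits Lisp, C-like languages and JSON; other languages may need a different rule.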
Maybe you could do it as follows (guaranteed 100% non-functional code ahead):
(defun my-kill-quoted-string (start end)
  "Like `kill-region' but takes care of unquoting/requoting."
  (interactive "r")
  (let ((str (delete-and-extract-region start end)))
    (if (nth 3 (syntax-ppss))
        ;; Unquote according to mode and context. E.g. we should unquote " and things like that in HTML.
        (setq str (replace-regexp-in-string "\\\\\"" "\"" str)))
    ;; Attach the handler to the killed string itself (last argument).
    (put-text-property 0 (length str) 'yank-handler
                       (list (lambda (str)
                               (if (not (nth 3 (syntax-ppss)))
                                   (insert str)
                                 ;; Requote according to mode and context.
                                 (insert (replace-regexp-in-string "[\\\"]" "\\\\\\&" str)))))
                       str)
    (kill-new str)))

Set Emacs to smart auto-line after a parentheses pair?

I have electric-pair-mode on (which isn't really particularly relevant, as this could apply to any auto-pairing mode or even manual parens), but in a nutshell, I'd like it so that in the case I have:
function foo() {|}
(where | is the point)
If I press enter, I would like to have it automatically go to
function foo() {
|
}
It would also mean that
function foo(|) {}
would become
function foo(
|
){}
I already have things to take care of the indentation, but I'm not sure how to say "if I'm inside any empty pair of matched parentheses, when I press return, actually insert two new lines and put me at the first one".
Thanks!
Here is what I have in my init file; I got it from Magnar Sveen's .emacs.d:
(defun new-line-dwim ()
  (interactive)
  (let ((break-open-pair (or (and (looking-back "{") (looking-at "}"))
                             (and (looking-back ">") (looking-at "<"))
                             (and (looking-back "(") (looking-at ")"))
                             (and (looking-back "\\[") (looking-at "\\]")))))
    (newline)
    (when break-open-pair
      (save-excursion
        (newline)
        (indent-for-tab-command)))
    (indent-for-tab-command)))
You can bind it to a key of your choice. I have bound it to M-RET but if you want, you can bind it to RET. The lines
(or (and (looking-back "{") (looking-at "}"))
    (and (looking-back ">") (looking-at "<"))
    (and (looking-back "(") (looking-at ")"))
    (and (looking-back "\\[") (looking-at "\\]")))
check if cursor is at {|}, [|], (|) or >|< (html).
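If the list of pairs grows, the same check can be written against a table of characters instead of a chain of regexps. This is only a sketch: `my/breakable-pairs` and `my/between-empty-pair-p` are hypothetical names, and it compares the characters directly before and after point.

```elisp
(defvar my/breakable-pairs
  '((?\{ . ?\}) (?\( . ?\)) (?\[ . ?\]) (?> . ?<))
  "Hypothetical table of (BEFORE . AFTER) characters that surround point.")

(defun my/between-empty-pair-p ()
  "Return non-nil when point sits directly between a pair from `my/breakable-pairs'."
  (let ((closer (cdr (assq (char-before) my/breakable-pairs))))
    (and closer (eq closer (char-after)))))

(with-temp-buffer
  (insert "function foo() {}")
  (backward-char 1)                     ; point is now between { and }
  (my/between-empty-pair-p))
;; => t
```

The result could replace the (or ...) form bound to break-open-pair; supporting another pair then only means adding one entry to the table.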
You may also want to look into smartparens. Specifically, see the page on insertion hooks.
Here's the config I personally use:
(with-eval-after-load 'smartparens
  (sp-with-modes
      '(c++-mode objc-mode c-mode)
    (sp-local-pair "{" nil :post-handlers '(:add ("||\n[i]" "RET")))))
This has the added benefit of indenting the current line automatically as well. This can easily be generalized to more modes (using sp-pair for global pairs) and paren types (just duplicate the code), if you please.

How to call query-replace-regexp inside a function?

I tend to use query-replace-regexp over an entire buffer rather than from the current position, so I regularly use the sequence C-< (beginning-of-buffer), then C-r (query-replace-regexp).
I'd like to make another function bound to C-S-r (C-R) which does this for me. I thought that if I simply wrapped it all together such as:
(defun query-replace-regexp-whole-buffer ()
  "query-replace-regexp from the beginning of the buffer."
  (interactive)
  (beginning-of-buffer)
  (query-replace-regexp))
that this would be adequate. Unfortunately, I'm getting some errors:
query-replace-regexp-whole-buffer: Wrong number of arguments: #[(regexp to-string &optional delimited start end) "Å Æ
Ç& " [regexp to-string delimited start end perform-replace t nil] 10 1940879 (let ((common (query-replace-read-args (concat "Query replace" (if current-prefix-arg " word" "") " regexp" (if (and transient-mark-mode mark-active) " in region" "")) t))) (list (nth 0 common) (nth 1 common) (nth 2 common) (if (and transient-mark-mode mark-active) (region-beginning)) (if (and transient-mark-mode mark-active) (region-end))))], 0
I can't really see what I'm doing wrong, hopefully someone can help.
When called from Lisp, query-replace-regexp expects to be passed a regular expression and the intended replacement as arguments. If you want to emulate the questions asked when it is invoked interactively, you need to use call-interactively:
(defun query-replace-regexp-whole-buffer ()
  "query-replace-regexp from the beginning of the buffer."
  (interactive)
  (goto-char (point-min))
  (call-interactively 'query-replace-regexp))
Also note that one should never call beginning-of-buffer from Lisp code; it will do unnecessary work, such as pushing the mark and printing a message.
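The mark side effect is easy to observe in a scratch buffer. The helper below is a hypothetical name used only for this demonstration; it runs a movement function at the end of a temporary buffer and reports where the mark ended up.

```elisp
(defun my/mark-after (mover)
  "Call MOVER at the end of a scratch buffer and return the mark afterwards."
  (with-temp-buffer
    (insert "some text")
    (funcall mover)
    (mark t)))

;; Plain movement leaves the mark alone.
(my/mark-after (lambda () (goto-char (point-min))))
;; => nil

;; The interactive command pushes the old position onto the mark ring.
(my/mark-after (lambda () (with-no-warnings (beginning-of-buffer))))
;; => 10
```

In a command that users call repeatedly, those extra marks would pile up on the mark ring and change what C-u C-SPC jumps back to.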
You need to read the arguments yourself and pass them to query-replace-regexp. This can be done by extending your interactive spec, so the function would look something like this:
(defun query-replace-regexp-whole-buffer (regex to-string)
  "query-replace-regexp from the beginning of the buffer."
  (interactive "sRegex to search: \nsString to replace: ")
  (save-excursion
    (goto-char (point-min))
    (query-replace-regexp regex to-string)))

'Semantic' movement across a line

Consider the following line of Lisp code:
(some-function 7 8 | 9) ;; some comment. note the extra indentation
The point is placed between '8' and '9'. If I perform (move-beginning-of-line), the point will be placed at the absolute beginning of the line, rather than at '('.
Same for move-end-of-line: I'd find it more desirable for it to place the point at ')' if I perform it once, and at the absolute end of the line if I perform it a second time. Some IDEs behave like that.
I tried to implement this but got stuck: my solution behaves particularly badly near the end of a buffer, and in the minibuffer as well. Is there a library that provides this functionality?
I don't know of any library, but it can be done in a few lines of Elisp.
For the beginning of line part, the bundled functions beginning-of-line-text and back-to-indentation (M-m) move to the beginning of the “interesting” part of the line. back-to-indentation ignores only whitespace whereas beginning-of-line-text skips over the fill prefix (in a programming language, this is typically the comment marker, if in a comment). See Smart home in Emacs for how to flip between the beginning of the actual and logical line.
For the end of line part, the following function implements what you're describing. The function end-of-line-code moves to the end of the line, except for trailing whitespace and an optional trailing comment. The function end-of-line-or-code does this, except that if the point was already at the target position, or if the line only contains whitespace and a comment, the point moves to the end of the actual line.
(defun end-of-line-code ()
  (interactive "^")
  (save-match-data
    (let* ((bolpos (progn (beginning-of-line) (point)))
           (eolpos (progn (end-of-line) (point))))
      (if (comment-search-backward bolpos t)
          (search-backward-regexp comment-start-skip bolpos 'noerror))
      (skip-syntax-backward " " bolpos))))
(defun end-of-line-or-code ()
  (interactive "^")
  (let ((here (point)))
    (end-of-line-code)
    (if (or (= here (point))
            (bolp))
        (end-of-line))))
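For comparison, here is a sketch of the same "end of code" idea built on the syntax parser state instead of comment-search-backward. It is not the code above, just an alternative under two assumptions: the buffer's major mode defines comment syntax, and the trailing comment starts on the current line. The name `my/end-of-code-position` is hypothetical.

```elisp
(defun my/end-of-code-position ()
  "Return the position just after the last code character on the current line.
A trailing comment and trailing whitespace are skipped.  Hypothetical helper."
  (save-excursion
    (end-of-line)
    (let ((state (syntax-ppss)))
      (when (nth 4 state)               ; end of line is inside a comment
        (goto-char (nth 8 state))))     ; jump to where that comment starts
    (skip-chars-backward " \t" (line-beginning-position))
    (point)))

(with-temp-buffer
  (emacs-lisp-mode)
  (insert "(some-function 7 8 9)   ;; some comment")
  (my/end-of-code-position))
;; => 22
```

Because the parser state knows about strings, a comment character inside a string literal is not mistaken for the start of a comment.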
Some suggestions that almost do what you ask:
In lisp code, you can sort-of do what you want, with the sexp movement commands. To get to the beginning of the expression from somewhere in the middle, use backward-up-list, which is bound to M-C-u. In your example, that would bring you to the open parenthesis. To move backwards over individual elements in the list, use backward-sexp, bound to M-C-b; forward-sexp moves the other way, and is bound to M-C-f. From the beginning of an sexp, you can skip to the next with M-C-n; reverse with M-C-p.
None of these commands are actually looking at the physical line you are on, so they'll go back or forward over multiple lines.
Other options include Ace Jump mode, which is a very slick way to quickly navigate to the beginning of any word visible on the screen. That might eliminate your need to use line-specific commands. For quick movement within a line, I usually use M-f and M-b to jump over words. Holding the M key down while tapping on b or f is quick enough that I end up using that by default most of the time.
Edit:
Forgot one other nice command: back-to-indentation, bound to M-m. This will back you up to the first non-whitespace character in a line. You could advise it to behave normally on the first call, and then to back up to the beginning of the line on the second call:
(defadvice back-to-indentation (around back-to-back)
  (if (eq last-command this-command)
      (beginning-of-line)
    ad-do-it))
(ad-activate 'back-to-indentation)
I just wrote these two functions that have the behavior you are looking for.
(defun move-beginning-indent ()
  (interactive)
  (if (eq last-command this-command)
      (beginning-of-line)
    (back-to-indentation)))
(defun move-end-indent ()
  (interactive)
  (if (eq last-command this-command)
      (end-of-line)
    (end-of-line)
    ;; search backwards, on this line only, for a closing
    ;; delimiter such as ) or ] and stop just after it
    (when (search-backward-regexp "\\s)" (line-beginning-position) t)
      (forward-char 1))))
(global-set-key [f7] 'move-beginning-indent)
(global-set-key [f8] 'move-end-indent)
Just try them out, they should behave exactly the way you'd want them to.
I use this:
(defun beginning-of-line-or-text (arg)
  "Move to BOL, or if already there, to the first non-whitespace character."
  (interactive "p")
  (if (bolp)
      (beginning-of-line-text arg)
    (move-beginning-of-line arg)))
(put 'beginning-of-line-or-text 'CUA 'move)
;; <home> is still bound to move-beginning-of-line
(global-set-key (kbd "C-a") 'beginning-of-line-or-text)
(defun end-of-code-or-line ()
  "Move to EOL. If already there, to EOL sans comments.
That is, the end of the code, ignoring any trailing comment
or whitespace. Note this does not handle 2 character
comment starters like // or /*. Such will not be skipped."
  (interactive)
  (if (not (eolp))
      (end-of-line)
    (skip-chars-backward " \t")
    (let ((pt (point))
          (lbp (line-beginning-position))
          (comment-start-re (concat (if comment-start
                                        (regexp-quote
                                         (replace-regexp-in-string
                                          "[[:space:]]*" "" comment-start))
                                      "[^[:space:]][[:space:]]*$")
                                    "\\|\\s<"))
          (comment-stop-re "\\s>")
          (lim))
      (when (re-search-backward comment-start-re lbp t)
        (setq lim (point))
        (if (re-search-forward comment-stop-re (1- pt) t)
            (goto-char pt)
          (goto-char lim)               ; test here ->
          (while (looking-back comment-start-re (1- (point)))
            (backward-char))
          (skip-chars-backward " \t"))))))
(put 'end-of-code-or-line 'CUA 'move)
;; <end> is still bound to end-of-visual-line
(global-set-key (kbd "C-e") 'end-of-code-or-line)