Emacs macro that only fires on non-empty lines - emacs

Generally, I find myself writing short macros that, e.g. add or remove line comments or correct indentation on a line.
However, with whitespace-mode enabled, I will still have to look out not to fire these macros on blank lines; if the macro tries to delete a character on an empty line, generally it will mess up the entire document.
Is there any solution to this problem which does not involve having some amount of spaces on blank lines, or otherwise altering my document structure?

You could use C-M-s ^. at the beginning of the macro. That is, search for a line that contains at least one character.

Expanding on the legoscia's answer -- here's a version which also notifies if there's no previous line at all:
(defun goto-first-previous-non-empty-line ()
(interactive)
(if (re-search-backward "^." nil t)
(message "First previous non-empty line")
(message "Beginning of buffer")
)
)

Related

Wrong indentation of comments in Emacs

In many languages, the line comment starts with a single symbol, for example # in Python and R.
I find that in Emacs, when writing such line comments, I have to repeat the comment symbol twice to make the correct indentation.
See the following example:
(setq x-select-enable-clipboard t)
;using a single comment symbol indents wrongly
;; repeating the comment symbol indents fine
(setq-default c-basic-offset 4)
With a single ; at the beginning of the line cannot get the correct indentation. How to get the correct setting? Thanks!
EDIT:
I found the solution myself. In ESS's document:
Comments are also handled specially by ESS, using an idea borrowed
from the Emacs-Lisp indentation style. By default, comments beginning
with ‘###’ are aligned to the beginning of the line. Comments
beginning with ‘##’ are aligned to the current level of indentation
for the block containing the comment. Finally, comments beginning with
‘#’ are aligned to a column on the right (the 40th column by default,
but this value is controlled by the variable comment-column,) or just
after the expression on the line containing the comment if it extends
beyond the indentation column. You turn off the default behavior by
adding the line (setq ess-fancy-comments nil) to your .emacs file.
So I put this in my .emacs:
(setq ess-fancy-comments nil) ; this is for ESS
I think for Python mode, it has a similar variable.
Your example use Emacs Lisp, in this language the standard convention is that a single ; is indented to the right, whereas two ;; is indented like code would be indented at that point. I strongly recommend that you stick to this convention, otherwise your code would stand out as being different. And three ;;; is indented to the left. Four ;;;; is left indented, and used for major sections. (See https://www.gnu.org/software/emacs/manual/html_node/elisp/Comment-Tips.html)
For Ruby, comments always indent as code, as far as I know.
The major mode should take care of this properly. If not, consider filing an enhancement request or bug report to the maintainers. Of course, "properly" might be in the eye of the beholder. You can try to make your preferences known, however. And check whether the major-mode code might already have user options for this.
Beyond that, the function that is the value of variable comment-indent-function governs this. Normally, this is set by the major mode. You can set it to any function you want (e.g. on the mode hook, so that your definition overrides the one provided by the major-mode code).
It accepts no arguments, and it returns the column you want the comment to be indented to.
Here is code that indents a comment to column 0, for example:
(defun foo () (setq comment-indent-function (lambda () 0)))
(add-hook 'SOME-MODE-HOOK 'foo 'APPEND)
For Emacs-Lisp mode, for example, you would use (add-hook 'emacs-lisp-mode-hook 'foo 'APPEND).

Highlighting and replacing non-printable unicode characters in Emacs

I have an UTF-8 file containing some Unicode characters like LEFT-TO-RIGHT OVERRIDE (U+202D) which I want to remove from the file. In Emacs, they are hidden (which should be the correct behavior?) by default. How do I make such "exotic" unicode characters visible (while not changing display of "regular" unicode characters like german umlauts)? And how do I replace them afterwards (with replace-string for example. C-X 8 Ret does not work for isearch/replace-string).
In Vim, its quite easy: These characters are displayed with their hex representation per default (is this a bug or missing feature?) and you can easily remove them with :%s/\%u202d//g for example. This should be possible with Emacs?
You can do M-x find-file-literally then you will see these characters.
Then you can remove them using usual string-replace
How about this:
Put the U+202d character you want to match at the top of the kill ring by typing M-:(kill-new "\u202d"). Then you can yank that string into the various searching commands, with either C-y (eg. query-replace) or M-y (eg. isearch-forward).
(Edited to add:)
You could also just call commands non-interactively, which doesn't present the same keyboard-input difficulties as the interactive calls. For example, type M-: and then:
(replace-string "\u202d" "")
This is somewhat similar to your Vim version. One difference is that it only performs replacements from the cursor position to the bottom of the file (or narrowed region), so you'd need to go to the top of the file (or narrowed region) prior to running the command to replace all matches.
I also have this issue, and this is particularly annoying for commits as it may be too late to fix the log message when one notices the mistake. So I've modified the function I use when I type C-x C-c to check whether there is a non-printable character, i.e. matching "[^\n[:print:]]", and if there is one, put the cursor over it, output a message, and do not kill the buffer. Then it is possible to manually remove the character, replace it by a printable one, or whatever, depending on the context.
The code to use for the detection (and positioning the cursor after the non-printable character) is:
(progn
(goto-char (point-min))
(re-search-forward "[^\n[:print:]]" nil t))
Notes:
There is no need to save the current cursor position since here, either the buffer will be killed or the cursor will be put over the non-printable character on purpose.
You may want to slightly modify the regexp. For instance, the tab character is a non-printable character and I regard it as such, but you may also want to accept it.
About the [:print:] character class in the regexp, you are dependent on the C library. Some printable characters may be regarded as non-printable, like some recent emojis (but not everyone cares).
The re-search-forward return value will be regarded as true if and only if there is a non-printable character. This is exactly what we want.
Here's a snippet of what I use for Subversion commits (this is between more complex code in my .emacs).
(defvar my-svn-commit-frx "/svn-commit\\.\\([0-9]+\\.\\)?tmp\\'")
and
((and (buffer-file-name)
(string-match my-svn-commit-frx (buffer-file-name))
(progn
(goto-char (point-min))
(re-search-forward "[^\n[:print:]]" nil t)))
(backward-char)
(message "The buffer contains a non-printable character."))
in a cond, i.e. I apply this rule only on filenames used for Subversion commits. The (backward-char) can be used or not, depending on whether you want the cursor to be over or just after the non-printable character.

Modify Alt+f in Emacs for tex-mode

Alt+f in emacs when writing in tex mode seems to not include the . as part of the word. So how do I modify the alt+f behavior to remain the same exact when going forward if there is punctiation to include that as part of the word.
I have a separate file that loads for when writing in tex so I will just throw it in there so it doesn't affect normal emacs behavior.
Thanks for any help.
Thought of an addition to this but same related problem is when using Alt+d and deleting. Getting it to delete not only the word but also the punctation following eg.. (,.! etc..).
The following code should work for you:
(defun unpunctuate-syntax (str)
"Make the characters of the given string word characters."
(let ((st (copy-syntax-table (syntax-table))))
(dotimes (n (length str))
(modify-syntax-entry (elt str n) "w" st))
(set-syntax-table st)))
(defun dots-are-not-punctuation ()
(unpunctuate-syntax "."))
(add-hook 'TeX-mode-hook 'dots-are-not-punctuation)
The way M-f (the forward-word function) works is that it skips all characters in the buffer that have type "w" (ie word) in the current syntax table.
This code makes a modified syntax table and gives it to the buffer and the add-hook bit at the bottom sets it to run when you open a file in TeX-mode. (This method avoids you having to do the separate file thing you described).
You might notice that I make a copy of the syntax table rather than editing the one belonging to the TeX major mode. This is because I always get things wrong when playing with syntax tables and you can mess things up royally... This method means you just have to close the buffer and start again!

How to set face for keywords which occur in first line?

How do I set face for specified keywords but only ones in first line?
For example having this file
--- cut here ---
hello world <-- this "hello" should have face set
hello world <-- while this "hello" should not
--- cut here ---
only first hello should have face set
I tried this
(defun first-line-hello(limit)
(and (save-excursion (beginning-of-line)
(bobp))
(re-search-forward "hello" limit)))
(font-lock-add-keywords 'emacs-lisp-mode
'((first-line-hello . font-lock-warning-face)))
but it seems that for some reason (bobp) returns always true when used in font-lock-keywords. I also tried using line-number-at-pos with the same result.
You're close, but there are a few problems. 'limit' could be well beyond the end of the first line, you need to not have an error if the search fails, and you need to move point regardless of the search pass/fail. Which all boils down to changing one line in your function:
(defun first-line-hello(limit)
(and (save-excursion (beginning-of-line)
(bobp))
(re-search-forward "hello" (min (point-at-eol) limit) 'go)))
Emacs has a regexp construct that matches the empty string only at the beginning of the buffer, so try this:
(font-lock-add-keywords 'emacs-lisp-mode '(("\\`hello" . font-lock-warning-face)))
The docs say:
‘\`’
matches the empty string, but only at the beginning of the buffer or string being matched against.
First line of a buffer? First line of a paragraph?
Regardless, I think you could do it with your own minor mode, in which you scan the buffer/region the way you like. In your scan you would need to start from the beginning, note the location of the first time you saw each keyword, highlight it, and continue scanning. if you see a repeated keyword during the scan, then don't highlight that one.
I believe you cannot do it with simple font-lock-keywords. They are always keywords, regardless where they appear. But you could do it with custom matchers for font-lock. Get help on font-lock-keywords and you will see:
Each element in a user-level keywords list should have one of these forms:
MATCHER
...
where MATCHER can be either the regexp to search for, or the function name to
call to make the search (called with one argument, the limit of the search;
it should return non-nil, move point, and set match-data appropriately if
it succeeds; like re-search-forward would).
These matcher functions or regexps get called or evaluated at arbitrary positions in the buffer, at seemingly arbitrary times. Using a regexp wouldn't work for that reason, but using a function, you could scan back to beginning-of-buffer, re-search-forward for the keyword, and find out if it's the first time you've seen the keyword, and then take appropriate action. You could also cache the first-seen location of a keyword, and then manage the cache in a after-change hook, but that's getting pretty complicated.
EDIT on re-reading I see you'ver tried this latter idea and it isn't working for you. strange.

How do I run an Emacs hook when a buffer is modified?

Building on Getting Emacs to untabify when saving certain file types (and only those file types) , I'd like to run a hook to untabify my C++ files when I start modifying the buffer. I tried adding hooks to untabify the buffer on load, but then it untabifies all my writable files that are autoloaded when emacs starts.
(For those that wonder why I'm doing this, it's because where I work enforces the use of tabs in files, which I'm happy to comply with. The problem is that I mark up my files to tell me when lines are too long, but the regexp matches the number of characters in the line, not how much space the line takes up. 4 tabs in a line can push it far over my 132 character limit, but the line won't be marked appropriately. Thus, I need a way to tabify and untabify automatically.)
Take a look at the variable "before-change-functions".
Perhaps something along this line (warning: code not tested):
(add-hook 'before-change-functions
(lambda (&rest args)
(if (not (buffer-modified-p))
(untabify (point-min) (point-max)))))
Here is what I added to my emacs file to untabify on load:
(defun untabify-buffer ()
"Untabify current buffer"
(interactive)
(untabify (point-min) (point-max)))
(defun untabify-hook ()
(untabify-buffer))
; Add the untabify hook to any modes you want untabified on load
(add-hook 'nxml-mode-hook 'untabify-hook)
This answer is tangential, but may be of use.
The package wide-column.el link text changes the cursor color when the cursor is past a given column - and actually the cursor colors can vary depending on the settings. This sounds like a less intrusive a solution than your regular expression code, but it may not suit your needs.
And a different, tangential answer.
You mentioned that your regexp wasn't good enough to tell when the 132 character limit was met. Perhaps a better regexp...
This regexp will match a line when it has more than 132 characters, assuming a tabs width is 4. (I think I got the math right)
"^\\(?: \\|[^ \n]\\{4\\}\\)\\{33\\}\\(.+\\)$"
The last parenthesized expression is the set of characters that are over the limit. The first parenthesized expression is shy.