Emacs: How to intelligently handle buffer-modified when setting text properties? - emacs

The documentation on Text Properties says:
Since text properties are considered part of the contents of the buffer (or string), and can affect how a buffer looks on the screen, any change in buffer text properties marks the buffer as modified.
First, I don't understand that policy. Can anyone explain? The text props are not actually saved in the file when the buffer is saved. So why mark the buffer as modified? For me, buffer-modified indicates "some changes have not yet been saved." But understanding the policy is just for my own amusement.
More importantly, is there an already-established way that, in code, I can change syntax text properties on the text in a buffer, while keeping the buffer-modified flag set to whatever it was, prior to those changes? I'm thinking of something like save-excursion.
It would be pretty easy to write, but this seems like a common case and I'd like to use the standard function, if possible.
For more on the scenario: I have a mode that does a full text scan and sets syntax-table properties on the text. After opening a buffer, the scan runs, but it results in a buffer with buffer-modified set to t.
As always, thanks.

Newer versions of Emacs include the macro "with-silent-modifications" for this:
C-h f with-silent-modifications
with-silent-modifications is a Lisp macro in `subr.el'.
(with-silent-modifications &rest BODY)
Execute BODY, pretending it does not modify the buffer.
If BODY performs real modifications to the buffer's text, other
than cosmetic ones, undo data may become corrupted.
Typically used around modifications of text-properties which do not really
affect the buffer's content.
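As a minimal sketch of how the scenario in the question might use this macro: the helper below, `my-demo-mark-dollar-as-punctuation`, is a hypothetical name made up for illustration. It sets a `syntax-table` property on every `$` in the buffer and should leave the modified flag as it found it.

```elisp
(defun my-demo-mark-dollar-as-punctuation ()
  "Give every `$' in the current buffer punctuation syntax.
Hypothetical helper for illustration; the buffer's modified flag
and undo list are left untouched."
  (with-silent-modifications
    (save-excursion
      (goto-char (point-min))
      (while (search-forward "$" nil t)
        ;; Only a text property changes here, never the text itself.
        (put-text-property (1- (point)) (point)
                           'syntax-table (string-to-syntax "."))))))
```

Only property changes belong inside the macro; as the docstring above warns, real text edits there can corrupt undo data.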

Wait! I found this in cc-defs.el
;; The following is essentially `save-buffer-state' from lazy-lock.el.
;; It ought to be a standard macro.
(defmacro c-save-buffer-state (varlist &rest body)
"Bind variables according to VARLIST (in `let*' style) and eval BODY,
then restore the buffer state under the assumption that no significant
modification has been made in BODY. A change is considered
significant if it affects the buffer text in any way that isn't
completely restored again. Changes in text properties like `face' or
`syntax-table' are considered insignificant. This macro allows text
properties to be changed, even in a read-only buffer.
This macro should be placed around all calculations which set
\"insignificant\" text properties in a buffer, even when the buffer is
known to be writeable. That way, these text properties remain set
even if the user undoes the command which set them.
This macro should ALWAYS be placed around \"temporary\" internal buffer
changes \(like adding a newline to calculate a text-property then
deleting it again\), so that the user never sees them on his
`buffer-undo-list'. See also `c-tentative-buffer-changes'.
However, any user-visible changes to the buffer \(like auto-newlines\)
must not be within a `c-save-buffer-state', since the user then
wouldn't be able to undo them.
The return value is the value of the last form in BODY."
  `(let* ((modified (buffer-modified-p)) (buffer-undo-list t)
          (inhibit-read-only t) (inhibit-point-motion-hooks t)
          before-change-functions after-change-functions
          buffer-file-name buffer-file-truename ; Prevent primitives checking
                                                ; for file modification
          ,@varlist)
     (unwind-protect
         (progn ,@body)
       (and (not modified)
            (set-buffer-modified-p nil)))))

Perhaps it is simply because they are considered a part of the string... (like the docs say). Remember, Emacs is buffer-centric, not file-centric, so the fact that the contents get saved out on disk is somewhat irrelevant (when thinking buffer-centric).
Also, the properties are undo-able, and that definitely fits with having the buffer marked as modified.
I don't know that there is a standard way of saving the buffer-modified state, but I do see one in the pabbrev.el library:
(defmacro pabbrev-save-buffer-modified-p (&rest body)
  "Eval BODY without affected buffer modification status"
  `(let ((buffer-modified (buffer-modified-p))
         (buffer-undo-list t))
     ,@body
     (set-buffer-modified-p buffer-modified)))
It doesn't protect against nonlocal exits, so perhaps you'd want to add a call to unwind-protect, like so:
(defmacro save-buffer-modified-p (&rest body)
  "Eval BODY without affected buffer modification status"
  `(let ((buffer-modified (buffer-modified-p))
         (buffer-undo-list t))
     (unwind-protect
         (progn ,@body)
       (set-buffer-modified-p buffer-modified))))


Overriding buffer-read-only in a region

I want most of a buffer to be read-only, except for one small region (part of a line).
I first tried something like
(setq buffer-read-only t)
(let ((inhibit-read-only t))
(add-text-properties start end '(read-only nil)))
but apparently buffer-read-only takes precedence over the read-only property.
I now have buffer-read-only set to nil, and set the read-only property to t on everything except the editable region. (Or read-only nil is regarded as a no-op.)
Is there a better way?
A more detailed description of my use case is that I want my buffer to display the output of an asynchronous process. The output is mostly for read-only viewing. However, a small part of a line is editable. This part will become input to the process if it is run again.
There's no simple way to do what you need in Emacs 24 and earlier. I agree with your solution to mark everything with the read-only property except the parts that you want to be editable.
Emacs 25 will have the inhibit-read-only property, which does exactly what you want. It was implemented by larsi on 16 November, and is used by eww.
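A rough sketch of how that property could be used for the use case above, assuming Emacs 25 or later. The function name `my-demo-make-editable-field` is hypothetical, not part of Emacs:

```elisp
(defun my-demo-make-editable-field (start end)
  "Make the current buffer read-only except between START and END.
Hypothetical helper; relies on the `inhibit-read-only' text
property, so it needs Emacs 25 or later."
  (let ((inhibit-read-only t))
    ;; Characters carrying this property stay editable even when
    ;; `buffer-read-only' is non-nil.
    (put-text-property start end 'inhibit-read-only t))
  (setq buffer-read-only t))
```

As far as I can tell, an insertion is checked against the character after the insertion position, so typing at the very end of the field may still be refused unless the following character also carries the property.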

elisp code clobbering a buffer, instead of saving off elsewhere... why?

I'm having some difficulties when trying to set something up that saves some persistent state, so that I can use the data between emacs invocations.
Using as a starting point some code from another question, I came up with the following little code snippet for something I'm wanting to do:
(defmacro with-output-to-file (path &rest body)
  "record output of commands in body to file"
  `(save-excursion
     (let* ((buf (find-file-noselect ,path))
            (standard-output buf))
       (set-buffer buf)
       (erase-buffer)
       ,@body
       (save-buffer)
       (kill-buffer))))
I then have a function that uses this, like:
(defun my-save-some-data ()
(with-output-to-file my-data-save-file
(prin1 my-data)))
EDIT: These both follow code like the following (previously, these were both setq; thanks to a comment from @phils for inspiring me to switch them to defvar and defcustom):
; note: my actual variable names (and filename value) are different;
; changed for example sake:
(defvar my-data (make-hash-table :test 'equal) "Data for a thing")
(defcustom my-data-save-file "~/tmp/my-data.el" "File to save my data to")
(Note: I also have a function to read the data back in, which happens automatically at load time, or on demand.)
I've set that up to run in a few circumstances (maybe too many? maybe poor choices? Anyway, this is what I set up):
(add-hook 'auto-save-hook 'my-save-some-data)
(add-hook 'kill-emacs-hook 'my-save-some-data)
(add-hook 'post-gc-hook 'my-save-some-data)
Most of the time, this works fine. However, every once in a while, I'm getting a problem where the data gets written to one of my previously-open buffers (killing all previous content there!), and then that buffer gets killed, with the saved changes.
Suffice it to say, this is highly annoying, as the buffer where this happens is frequently somewhere where I've been doing some work, and not necessarily checked it in yet.
I tried altering the macro above, replacing from (set-buffer buf) on with:
(with-current-buffer buf ; because set-buffer wasn't working??
(if (eq buf (current-buffer))
(message "buffer changed?!"))))))
This has somehow managed to cause it to append to the buffer, instead of overwriting it... so my if statement does seem to be working to some degree... however I don't see the message in my *Messages* buffer, so... I'm not quite sure what's going on.
One thing I think I've noticed (though it's hard to be certain, since I may not be actively paying attention when this happens) is that this happens in a not-then-currently-active buffer, rather than a buffer I'm currently editing.
So, the questions:
Am I doing something wrong here?
Are there other/better ways of doing this?
Are there standard ways to save state in a programatic way, that I could be using? (I poked around a bit in apropos, but failed to find anything... though perhaps I just don't know what to look for.)
What can I do to help myself track this down? (is there a way I can set breakpoints or something?)
Are there other protections I could use in code like this?
Any other thoughts welcome. I'm adding some more (message) forms in hopes of getting more debugging info in the mean time.
UPDATE: I've figured out that this only happens with the post-gc-hook. I don't know if my variables were somehow getting clobbered (and perhaps switching to defvar and defcustom will solve that?), or if there's some sort of obscure bug in the post-gc-hook processing... checking for reproducing the test-case with this latest change.
You can indeed set breakpoints. An easy way to do this is to put (edebug) in the place where you want to break. Then you can use n for next, SPC for step, and e to eval. The Edebug chapter of the Emacs Lisp manual has more details.
So you can set a conditional breakpoint as a protection/warning, like this, before your call to (set-buffer):
(when (get-file-buffer my-data-save-file)
  (message "Warning: %s is already being visited by a buffer, contents will be overwritten! Entering edebug" my-data-save-file)
  (edebug))
This will warn you and then enter the debugger if a file you are visiting in some buffer is about to be overwritten by your macro, where you can inspect what is going on.
Here is part of the docstring of find-file-noselect:
Read file FILENAME into a buffer and return the buffer.
If a buffer exists visiting FILENAME, return that one, but
verify that the file has not changed since visited or saved.
My guess is that the my-data-save-file is already being visited by a buffer, so that is the buffer that is returned (and subsequently overwritten). But you can really find out what is happening with (edebug).
Just a quick reply to some of what you said. Your message never appears probably because you test whether the buffer of with-current-buffer is the current-buffer, which it always is, unless body changes the current buffer.
But you are right to use with-current-buffer instead of save-excursion followed by set-buffer.
As for other ways: why not put your data in a temporary buffer and then use write-file, append-to-file, or write-region?
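A minimal sketch of that temporary-buffer approach, under the assumption that the data prints readably with prin1. The names `my-demo-write-data` and `my-demo-read-data` are made up for this example:

```elisp
(defun my-demo-write-data (data file)
  "Write DATA to FILE in `read'-able form without visiting FILE.
Hypothetical helper for illustration."
  (with-temp-buffer
    ;; Make sure long or deeply nested data is not abbreviated.
    (let ((print-length nil)
          (print-level nil))
      (prin1 data (current-buffer)))
    ;; A VISIT argument that is neither t, nil, nor a string keeps
    ;; write-region from visiting the file or printing a message.
    (write-region (point-min) (point-max) file nil 'silent)))

(defun my-demo-read-data (file)
  "Read back the Lisp object stored in FILE."
  (with-temp-buffer
    (insert-file-contents file)
    (read (current-buffer))))
```

Because nothing here calls find-file-noselect, set-buffer, or kill-buffer on a visited buffer, a hook that fires at an awkward moment should have no existing buffer to clobber.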
FWIW, I tried your code briefly and saw no problem. But I just tried a simple (prin1 (symbol-function 'my-save-some-data)) for the body and a constant file name for the file. I tried with pre-existing file or not, and with pre-existing buffer or not, and with pre-existing unsaved modified buffer or not.
Are you testing with the interpreted code (e.g., macro present) or byte-compiled code?

How to keep dir-local variables when switching major modes?

I'm committing to a project where standard indentations and tabs are 3-chars wide, and it's using a mix of HTML, PHP, and JavaScript. Since I use Emacs for everything, and only want the 3-char indentation for this project, I set up a ".dir-locals.el" file at the root of the project to apply to all files/all modes under it:
; Match project's default indent of 3 spaces per level, and don't add tabs
((nil .
  ((tab-width . 3)
   (c-basic-offset . 3)
   (indent-tabs-mode . nil))))
Which works fine when I first open a file. The problem happens when switching major modes- for example to work on a chunk of literal HTML inside of a PHP file. Then I lose all the dir-local variables.
I've also tried explicitly stating all of the modes I use in ".dir-locals.el", and adding to my .emacs file "dir-locals-set-class-variables / dir-locals-set-directory-class". I'm glad to say they all behave consistently, initially setting the dir-local variables, and then losing them as I switch the major mode.
I'm using GNU Emacs 24.3.1.
What's an elegant way of reloading dir-local variables upon switching a buffer's major-mode?
-- edit -- Thanks for the excellent answers and commentary both Aaron and phils! After posting here, I thought it "smelled" like a bug, so entered a report to GNU- will send them a reference to these discussions.
As per comments to Aaron Miller's answer, here is an overview of what happens when a mode function is called (with an explanation of derived modes); how calling a mode manually differs from Emacs calling it automatically; and where after-change-major-mode-hook and hack-local-variables fit into this, in the context of the following suggested code:
(add-hook 'after-change-major-mode-hook 'hack-local-variables)
After visiting a file, Emacs calls normal-mode which "establishes the proper major mode and buffer-local variable bindings" for the buffer. It does this by first calling set-auto-mode, and immediately afterwards calling hack-local-variables, which determines all the directory-local and file-local variables for the buffer, and sets their values accordingly.
For details of how set-auto-mode chooses the mode to call, see C-h i g (elisp) Auto Major Mode RET. It actually involves some early local-variable interaction (it needs to check for a mode variable, so there's a specific look-up for that which happens before the mode is set), but the 'proper' local variable processing happens afterwards.
When the selected mode function is actually called, there's a clever sequence of events which is worth detailing. This requires us to understand a little about "derived modes" and "delayed mode hooks"...
Derived modes, and mode hooks
The majority of major modes are defined with the macro define-derived-mode. (Of course there's nothing stopping you from simply writing (defun foo-mode ...) and doing whatever you want; but if you want to ensure that your major mode plays nicely with the rest of Emacs, you'll use the standard macros.)
When you define a derived mode, you must specify the parent mode which it derives from. If the mode has no logical parent, you still use this macro to define it (in order to get all the standard benefits), and you simply specify nil for the parent. Alternatively you could specify fundamental-mode as the parent, as the effect is much the same as for nil, as we shall see momentarily.
define-derived-mode then defines the mode function for you using a standard template, and the very first thing that happens when the mode function is called is:
(delay-mode-hooks
  (PARENT-MODE)
  [...])
or if no parent is set:
(delay-mode-hooks
  (kill-all-local-variables)
  [...])
As fundamental-mode itself calls (kill-all-local-variables) and then immediately returns when called in this situation, specifying it as the parent is equivalent to specifying nil.
Note that kill-all-local-variables runs change-major-mode-hook before doing anything else, so that will be the first hook which is run during this whole sequence (and it happens while the previous major mode is still active, before any of the code for the new mode has been evaluated).
So that's the first thing that happens. The very last thing that the mode function does is to call (run-mode-hooks MODE-HOOK) for its own MODE-HOOK variable (this variable name is literally the mode function's symbol name with a -hook suffix).
So if we consider a mode named child-mode which is derived from parent-mode which is derived from grandparent-mode, the whole chain of events when we call (child-mode) looks something like this:
(delay-mode-hooks
  (delay-mode-hooks
    (delay-mode-hooks
      (kill-all-local-variables)) ;; runs change-major-mode-hook
    (run-mode-hooks 'grandparent-mode-hook))
  (run-mode-hooks 'parent-mode-hook))
(run-mode-hooks 'child-mode-hook)
What does delay-mode-hooks do? It simply binds the variable delay-mode-hooks, which is checked by run-mode-hooks. When this variable is non-nil, run-mode-hooks just pushes its argument onto a list of hooks to be run at some future time, and returns immediately.
Only when delay-mode-hooks is nil will run-mode-hooks actually run the hooks. In the above example, this is not until (run-mode-hooks 'child-mode-hook) is called.
For the general case of (run-mode-hooks HOOKS), the following hooks run in sequence:
change-major-mode-after-body-hook
delayed-mode-hooks (in the sequence in which they would otherwise have run)
HOOKS (being the argument to run-mode-hooks)
after-change-major-mode-hook
So when we call (child-mode), the full sequence is:
(run-hooks 'change-major-mode-hook) ;; actually the first thing done by
(kill-all-local-variables) ;; <-- this function
(run-hooks 'change-major-mode-after-body-hook)
(run-hooks 'grandparent-mode-hook)
(run-hooks 'parent-mode-hook)
(run-hooks 'child-mode-hook)
(run-hooks 'after-change-major-mode-hook)
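To check this ordering on your own Emacs, something like the following sketch could be used. All the `my-demo-` names are hypothetical stand-ins for the grandparent, parent and child modes discussed above. It records each hook as it fires and returns the order.

```elisp
(defvar my-demo-hook-log nil
  "Hooks seen so far, most recent first (demo only).")

;; Three throwaway modes mirroring the grandparent/parent/child chain.
(define-derived-mode my-demo-grandparent-mode nil "GP")
(define-derived-mode my-demo-parent-mode my-demo-grandparent-mode "P")
(define-derived-mode my-demo-child-mode my-demo-parent-mode "C")

(defun my-demo-hook-order ()
  "Return the order in which hooks run when `my-demo-child-mode' is called."
  (let ((hooks '(change-major-mode-hook
                 change-major-mode-after-body-hook
                 my-demo-grandparent-mode-hook
                 my-demo-parent-mode-hook
                 my-demo-child-mode-hook
                 after-change-major-mode-hook))
        (loggers nil))
    ;; Install one logging function per hook.
    (dolist (hook hooks)
      (let ((fn `(lambda () (push ',hook my-demo-hook-log))))
        (push (cons hook fn) loggers)
        (add-hook hook fn)))
    (setq my-demo-hook-log nil)
    (unwind-protect
        (with-temp-buffer
          (my-demo-child-mode)
          (reverse my-demo-hook-log))
      ;; Always remove the loggers again, even on error.
      (dolist (entry loggers)
        (remove-hook (car entry) (cdr entry))))))
```

Evaluating (my-demo-hook-order) should return the six hook names in the same order as the sequence listed above.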
Back to local variables...
Which brings us back to after-change-major-mode-hook and using it to call hack-local-variables:
(add-hook 'after-change-major-mode-hook 'hack-local-variables)
We can now see clearly that if we do this, there are two possible sequences of note:
We manually change to foo-mode:
(foo-mode)
 => (kill-all-local-variables)
 => [...]
 => (run-hooks 'after-change-major-mode-hook)
     => (hack-local-variables)
We visit a file for which foo-mode is the automatic choice:
(normal-mode)
 => (set-auto-mode)
     => (foo-mode)
         => (kill-all-local-variables)
         => [...]
         => (run-hooks 'after-change-major-mode-hook)
             => (hack-local-variables)
 => (hack-local-variables)
Is it a problem that hack-local-variables runs twice? Maybe, maybe not. At minimum it's slightly inefficient, but that's probably not a significant concern for most people. For me, the main thing is that I wouldn't want to rely upon this arrangement always being fine in all situations, as it's certainly not the expected behaviour.
(Personally I do actually cause this to happen in certain specific cases, and it works just fine; but of course those cases are easily tested -- whereas doing this as standard means that all cases are affected, and testing is impractical.)
So I would propose a small tweak to the technique, so that our additional call to the function does not happen if normal-mode is executing:
(defvar my-hack-local-variables-after-major-mode-change t
  "Whether to process local variables after a major mode change.
Disabled by advice if the mode change is triggered by `normal-mode',
as local variables are processed automatically in that instance.")
(defadvice normal-mode (around my-do-not-hack-local-variables-twice)
  "Prevents `after-change-major-mode-hook' from processing local variables.
See `my-after-change-major-mode-hack-local-variables'."
  (let ((my-hack-local-variables-after-major-mode-change nil))
    ad-do-it))
(ad-activate 'normal-mode)
(add-hook 'after-change-major-mode-hook
          'my-after-change-major-mode-hack-local-variables)
(defun my-after-change-major-mode-hack-local-variables ()
  "Callback function for `after-change-major-mode-hook'."
  (when my-hack-local-variables-after-major-mode-change
    (hack-local-variables)))
Disadvantages to this?
The major one is that you can no longer change the mode of a buffer which sets its major mode using a local variable. Or rather, it will be changed back immediately as a result of the local variable processing.
That's not impossible to overcome, but I'm going to call it out of scope for the moment :)
Be warned that I have not tried this, so it may produce undesired results ranging from your dir-local variables not being applied, to Emacs attempting to strangle your cat; by any sensible definition of how Emacs should behave, this is almost certainly cheating. On the other hand, it's all in the standard library, so it can't be that much of a sin. (I hope.)
Evaluate the following:
(add-hook 'after-change-major-mode-hook 'hack-local-variables)
From then on, when you change major modes, dir-local variables should (I think) be reapplied immediately after the change.
If it doesn't work or you don't like it, you can undo it without restarting Emacs by replacing 'add-hook' with 'remove-hook' and evaluating the form again.
My take on this:
(add-hook 'after-change-major-mode-hook #'hack-local-variables)
and either
(defun my-normal-mode-advice
(function &rest ...)
(let ((after-change-major-mode-hook
(remq #'hack-local-variables after-change-major-mode-hook)))
(apply function ...)))
if you can live with the annoying
Making after-change-major-mode-hook buffer-local while locally let-bound!
message or
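(defun my-normal-mode-advice
    (function &rest ...)
  (remove-hook 'after-change-major-mode-hook #'hack-local-variables)
  ;; Re-add the hook even if the advised function signals an error,
  ;; and return the advised function's value.
  (unwind-protect
      (apply function ...)
    (add-hook 'after-change-major-mode-hook #'hack-local-variables)))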
otherwise and finally
(advice-add #'normal-mode :around #'my-normal-mode-advice)

How to prevent Emacs from setting an undo boundary?

I've written an Emacs Lisp function which calls a shell command to
process a given string and return the resulting string. Here is a
simplified example which just calls tr to convert text to uppercase:
(defun test-shell-command (str)
  "Apply tr to STR to convert lowercase letters to uppercase."
  (let ((buffer (generate-new-buffer "*temp*")))
    (with-current-buffer buffer
      (insert str)
      (call-process-region (point-min) (point-max) "tr" t t nil "'a-z'" "'A-Z'")
      (buffer-string))))
This function creates a temporary buffer, inserts the text, calls
tr, replaces the text with the result, and returns the result.
The above function works as expected, however, when I write a wrapper
around this function to apply the command to the region, two steps are
being added to the undo history. Here's another example:
(defun test-shell-command-region (begin end)
"Apply tr to region from BEGIN to END."
(interactive "*r")
(insert (test-shell-command (delete-and-extract-region begin end))))
When I call M-x test-shell-command-region, the region is replaced
with the uppercase text, but when I press C-_ (undo), the first
step in the undo history is the state with the text deleted. Going
two steps back, the original text is restored.
My question is, how does one prevent the intermediate step from being
added to the undo history? I've read the Emacs documentation on undo,
but it doesn't seem to address this as far as I can tell.
Here's a function which accomplishes the same thing by calling the
built-in Emacs function upcase. Just as before, it is called on the
result of delete-and-extract-region, and its result is handed off to
insert:
(defun test-upcase-region (begin end)
"Apply upcase to region from BEGIN to END."
(interactive "*r")
(insert (upcase (delete-and-extract-region begin end))))
When calling M-x test-upcase-region, there is only one step in the
undo history, as expected. So, it seems to be the case that calling
test-shell-command creates an undo boundary. Can that be avoided?
The key is the buffer name. See Maintaining Undo:
Recording of undo information in a newly created buffer is normally enabled to start with; but if the buffer name starts with a space, the undo recording is initially disabled. You can explicitly enable or disable undo recording with the following two functions, or by setting buffer-undo-list yourself.
with-temp-buffer creates a buffer named ␣*temp* (note the leading whitespace), whereas your function uses *temp*.
To remove the undo boundary in your code, either use a buffer name with a leading space, or explicitly disable undo recording in the temporary buffer with buffer-disable-undo.
But generally, use with-temp-buffer, really. It's the standard way for such things in Emacs, making your intention clear to anyone who reads your code. Also, with-temp-buffer tries hard to clean up the temporary buffer properly.
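The effect of the buffer name can be checked directly. This is a small sketch, with `my-demo-undo-enabled-p` being a hypothetical helper name, that creates a buffer and reports whether it starts out recording undo information:

```elisp
(defun my-demo-undo-enabled-p (name)
  "Create a scratch buffer called NAME and report whether it records undo.
Hypothetical helper for illustration; the buffer is killed again."
  (let ((buffer (generate-new-buffer name)))
    (unwind-protect
        ;; A `buffer-undo-list' of t means undo recording is disabled.
        (not (eq (buffer-local-value 'buffer-undo-list buffer) t))
      (kill-buffer buffer))))
```

With this, (my-demo-undo-enabled-p "*temp*") should return t, while (my-demo-undo-enabled-p " *temp*") (leading space) should return nil.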
As for why undo in the temporary buffer creates an undo boundary in the current one: If the previous change was undoable and made in some other buffer (the temporary one in this case), an implicit boundary is created. From undo-boundary:
All buffer modifications add a boundary whenever the previous undoable change was made in some other buffer. This is to ensure that each command makes a boundary in each buffer where it makes changes.
Hence, inhibiting undo in the temporary buffer removes the undo boundary in the current buffer, too: The previous change is simply not undoable anymore, and thus no implicit boundary is created.
There are many situations where using a temporary buffer is not practical. It's hard to debug what's going on for example.
In these cases you can let-bind undo-inhibit-record-point to prevent Emacs from deciding where to put boundaries:
(let ((undo-inhibit-record-point t))
  ;; Then record boundaries manually
  (undo-boundary))
The solution in this case was to create the temporary output buffer
using with-temp-buffer, rather than explicitly creating one with
generate-new-buffer. The following alternative version of the
first function does not create an undo boundary:
(defun test-shell-command (str)
  "Apply tr to STR to convert lowercase letters to uppercase."
  (with-temp-buffer
    (insert str)
    (call-process-region (point-min) (point-max) "tr" t t nil "'a-z'" "'A-Z'")
    (buffer-string)))
I was not able to determine whether generate-new-buffer is indeed
creating the undo boundary, but this fixed the problem.
generate-new-buffer calls get-buffer-create, which is defined in
the C source code, but I could not quickly determine what was
happening in terms of the undo history.
I suspect that the issue may be related to the following passage in
the Emacs Lisp Manual entry for undo-boundary:
All buffer modifications add a boundary whenever the previous
undoable change was made in some other buffer. This is to ensure
that each command makes a boundary in each buffer where it makes
changes.
Even though the with-temp-buffer macro calls generate-new-buffer
much as in the original function, the documentation for
with-temp-buffer states that no undo information is saved (even
though there is nothing in the Emacs Lisp source that suggests this
would be the case):
By default, undo (see Undo) is not recorded in the buffer created by
this macro (but body can enable that, if needed).

Who is setting the modified flag on my files in Emacs?

Over the course of the last few years I've been slowly growing my Emacs configuration, adding bits and pieces, adding new modes etc. Around a year ago a problem started to occur regularly: some code is setting the modified bit on my buffers. It doesn't actually change anything, it just sets this flag. It is slightly annoying since each time I run compile or save-some-buffers I have to manually discard the changes in these buffers to reset the modified bit. How can I find the offending code?
Contrary to phils, I expect that your modified-p flag is not set by set-buffer-modified-p but rather by actual changes to the buffer. The reason this is possible is that text-properties are treated by Emacs as belonging to the content of the buffer, so changing them sets the modified-p flag, even though in many cases the result is invisible and even if it is visible it is generally not perceived by the user as a modification (which users generally understand as something like "affects the file when I save the buffer").
So, most of the code that sets text-properties needs to be careful to reset the modified-p flag afterwards. The best way to do that is usually by wrapping the code that sets the properties inside a with-silent-modifications.
One way to try to track down the culprit is by trying to undo the modification (e.g. with C-/), but of course, if the modification is not visible, undoing it won't be visible either. So instead you may want to look at C-h v buffer-undo-list RET which is the internal data used to keep track of the modifications. With luck, not only was the modified-p set but the undo-list as well, and that list will tell you what was changed. For example, that list could look like (nil (nil face nil 12345708 . 12345713)) which means that the change was to set the face property to a new value between positions 12345708 and 12345713 and that the old value of that property was nil (that's the 3rd nil in the above). Sometimes looking at the affected positions with M-: (goto-char 12345708) RET is sufficient to figure out who's to blame. Other times looking at M-: (get-text-property 12345708 'face) RET, which gives you the new value that was set, is more useful.
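Reading a long buffer-undo-list by eye can be tedious. The sketch below (the name `my-demo-undo-property-changes` is hypothetical) picks out only the text-property entries, assuming they have the (nil PROPERTY OLD-VALUE BEG . END) shape described above:

```elisp
(defun my-demo-undo-property-changes (&optional buffer)
  "Return the text-property entries of BUFFER's undo list, oldest first.
Each entry has the form (nil PROPERTY OLD-VALUE BEG . END).
BUFFER defaults to the current buffer.  Hypothetical helper."
  (let ((undo-list (buffer-local-value 'buffer-undo-list
                                       (or buffer (current-buffer))))
        (result nil))
    ;; A value of t means undo recording is disabled in the buffer.
    (when (listp undo-list)
      (dolist (entry undo-list)
        ;; Boundaries are plain nil; property changes are conses
        ;; whose car is nil.
        (when (and (consp entry)
                   (null (car entry))
                   (consp (cdr entry)))
          (push entry result))))
    result))
```

Running M-: (my-demo-undo-property-changes) RET in the affected buffer then shows which properties were touched and where.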
If something really is explicitly setting the buffer as modified without changing anything, then I guess it ought to be calling set-buffer-modified-p.
I was originally going to suggest debug-on-entry for set-buffer-modified-p, but a cursory test showed that was extremely disruptive in general, so here's a way you can indicate which buffers you are interested in:
(defvar my-debug-set-buffer-modified-p-buffers nil)
(defadvice set-buffer-modified-p
    (before my-debug-set-buffer-modified-p-advice)
  (when (memq (current-buffer) my-debug-set-buffer-modified-p-buffers)
    (debug)))
(ad-activate 'set-buffer-modified-p)
(defun my-debug-set-buffer-modified-p (buffer)
(interactive (list (current-buffer)))
(if (memq buffer my-debug-set-buffer-modified-p-buffers)
(progn (setq my-debug-set-buffer-modified-p-buffers
(delq buffer my-debug-set-buffer-modified-p-buffers))
(message "Disabled for %s" buffer))
(add-to-list 'my-debug-set-buffer-modified-p-buffers buffer)
(message "Enabled for %s" buffer)))