Underscore as part of word for forward-word not working - emacs

I am trying to make underscores get treated as part of the word for the forward/backward-word function as described here and here. I am specifically trying to get this to work for nxhtml mode, but would really like it to work like this for all modes.
I have modified my site-start.el file in a number of different ways, but to no avail. If I manually execute M-x modify-syntax-entry in the buffer, it works just fine. I just can't get this to be the default behavior.
Here is what I tried putting in my site-start.el file:
;; 1
;; thought this would apply it to all modes - no error, but did not work
(modify-syntax-entry ?_ "w")
;; 2
;; thought this would automatically set it on a mode change - no error, but did not work
(defun change-major-mode-hook ()
(modify-syntax-entry ?_ "w"))
;; 3
;; thought this would apply it to all modes - no error, but did not work
(modify-syntax-entry ?_ "w")
;; 4
;; this produced a symbol's value as variable is void error
(modify-syntax-entry ?_ "w" nxhtml-mode-syntax-table)
What am I missing?

Have you tried modifying the nxhtml-mode-hook directly? Like so:
(add-hook 'nxhtml-mode-hook
          (lambda () (modify-syntax-entry ?_ "w")))

Trey's answered your very specific question.
But Emacs already has a level of character organization that does what you ask: sexp. If you learn the sexp commands instead of breaking the word commands, you can delete by "word" or by "symbol" as the situation requires.

Like event_jr, I recommend you learn to use the sexp-based commands like C-M-f and C-M-b. But if you really want to jump only over identifiers (and ignore things like parentheses), then rather than trying to change all syntax tables in all major modes (which may have unforeseen consequences), I recommend you simply use new symbol-based movement commands.
E.g. (global-set-key [remap forward-word] 'forward-symbol). For some commands (like kill-word) you'll need to write kill-symbol yourself, but it should be easy to do.
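A minimal sketch of what such commands could look like, modeled on the way kill-word wraps forward-word. The names my-kill-symbol and my-backward-kill-symbol are made up for illustration; forward-symbol is assumed to come from thingatpt.

```elisp
(require 'thingatpt) ; provides `forward-symbol'

;; Hypothetical name: symbol-based counterpart of `kill-word'.
(defun my-kill-symbol (arg)
  "Kill characters forward until the end of the ARGth next symbol.
With a negative ARG, kill backward instead."
  (interactive "p")
  (kill-region (point) (progn (forward-symbol arg) (point))))

;; Hypothetical name: symbol-based counterpart of `backward-kill-word'.
(defun my-backward-kill-symbol (arg)
  "Kill characters backward until the beginning of the ARGth previous symbol."
  (interactive "p")
  (my-kill-symbol (- arg)))

;; Remap the word commands globally, as with forward-word above.
(global-set-key [remap kill-word] 'my-kill-symbol)
(global-set-key [remap backward-kill-word] 'my-backward-kill-symbol)
```

Because these are remaps, the existing M-d and M-DEL bindings pick them up without any key being rebound by hand.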

Instead of adding hooks for each major mode, as suggested by Trey Jackson, you can modify the standard syntax table. The different modes inherit from the standard syntax table so your modification will hopefully apply to all modes:
(modify-syntax-entry ?_ "w" (standard-syntax-table))
This will affect regular text, XML, Python and others.
Some modes are more stubborn, and modifying the standard syntax table won't affect them as they override some symbols during initialization. One example is cc-mode. You'll still have to modify their tables explicitly.
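For a mode like cc-mode, the explicit modification might look like the sketch below. It assumes that the buffer's syntax table is already installed when c-mode-common-hook runs, so that modify-syntax-entry without a table argument changes that table.

```elisp
;; Sketch: treat underscore as a word constituent in all CC Mode buffers.
;; With no table argument, `modify-syntax-entry' modifies the syntax
;; table of the current buffer, which here is the mode's own table.
(add-hook 'c-mode-common-hook
          (lambda ()
            (modify-syntax-entry ?_ "w")))
```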


How to disable underscore (_) subscripting in Emacs, TeX input method

In Emacs, while editing a text document of notes for myself (a .txt document, not a .tex document), I am using M-x set-input-method RET TeX in order to get easy access to various Unicode characters. So, for example, typing \to<space> causes a "→" to be inserted into the text, and typing x^2 causes "x²" to be inserted, because the font I am using has support for Unicode codepoints 0x2192 and 0x00B2, respectively.
One of the specially handled characters in the method is for the underscore key, _. However, the font I am using for Emacs does not appear to have support for the codepoints for the various subscript characters, such as subscript zero (codepoint 0x2080), and so when I type _0, I get something rendered as a thin blank in my output. I would prefer to just have the two characters _0 in this case.
I can get _0 by the keystroke sequence _<space><del>0, since the space keystroke in the middle of the sequence causes Emacs to abort the TeX input method. But this is awkward.
So, my question: How can I locally customize my Emacs to not remap the _ key at all when I am in the TeX input method? Or how can I create a modified clone (or extension, etc) of the TeX input method that leaves out underscore from its magic?
Things I have tried so far:
I have already done M-x describe-key on _, but it is just bound to self-insert-command, like many other text characters. I did see a post-self-insert-hook there, but I have not explored using that to subvert the TeX input method.
Things I have not tried so far:
I have not tried learning anything about the input method architecture or its source code. From my quick perusal of the code and methods, it did not seem like something I could quickly jump into.
So here is the solution I just found: Make a personalized copy of the TeX input method, with all of the undesirable entries removed. Then when using M-x set-input-method, select the personalized version instead of TeX.
I would have tried this earlier, but the built-in documentation for set-input-method and its ilk does not provide enough guidance for me to find the actual source of the input methods. It was only after doing another search on SO and finding this: Emacs: Can't activate input method that I had enough information to do this on my own.
1. In Emacs, open /usr/share/emacs/22.1/leim/leim-list.el and find the entry for the input method you want to customize. The entry will look something like the following form:
(register-input-method
 "TeX" "UTF-8" 'quail-use-package
 "\\" "LaTeX-like input method for many characters."
 "quail/latin-ltx")
2. Note the file name prefix referenced in the last element of the form above. Find the corresponding Elisp source file; in this case, it is a relative path to the file quail/latin-ltx.el[.gz]. Open that file in Emacs and look it over; it should have the entries for the method's remappings, both desired and undesired.
3. Make a user-local copy of that Elisp source file among your other Emacs customizations. Open that local copy in Emacs.
4. In your local copy, find the (quail-define-package ...) form and change the name of the package; I used FSK-TeX as my new name, like so:
(quail-define-package
 "FSK-TeX" "UTF-8" "\\" t ;; <-- The first argument here is the important bit to change.
 "LaTeX-like input method for many characters but not as many as you might think."
 ...) ;; remaining arguments unchanged
5. Go through your local copy and delete all the S-expressions for mappings that you don't want.
6. In your .emacs configuration file, register your customized input method, using a form analogous to the one you saw when you looked at leim-list.el in step 1:
(register-input-method
 "FSK-TeX" "UTF-8" 'quail-use-package
 "\\" "FSK-customized LaTeX-like input method for many characters."
 ...) ;; last element: the file name prefix of your local copy
7. Restart Emacs and test your new input method; in my case, by doing M-x set-input-method FSK-TeX, typing a_0, and confirming that a_0 shows up in the buffer.
So there is at least one answer that, once installed, is less awkward than the workaround listed in the question (which, as it turns out, is also officially documented in the Emacs 22 manual as a way to cut off input method processing).
However, I am not really happy with this solution, since I would prefer to inherit future changes to TeX mode, and just have my .emacs remove the undesirable entries on startup.
So I will wait to see if anyone else comes up with a better answer than this.
I did not test this myself, but this seems to be the exact thing you are looking for:
"How to disable underscore subscript in TeX mode in emacs" - source
Two solutions are given in this blog post:
By the author of the blog post: (setq font-lock-maximum-decoration nil) (from maximum)
Mentioned as comment:
(eval-after-load "tex-mode" '(fset 'tex-font-lock-subscript 'ignore))
The evil plugin for vim-like modal keybinding lets you map two consecutive presses of the _ key to the insertion of a single _ character:
(set-input-method 'TeX)
(define-key evil-insert-state-local-map (kbd "_ _")
  (lambda () (interactive) (insert "_")))
(define-key evil-insert-state-local-map (kbd "^ ^")
  (lambda () (interactive) (insert "^")))
When _ and then 1 is pressed, we get ₁ as before, but
when _ and then _ is pressed, we get _.
Analogous for ^.
As already explained in pnkfelix's answer, it seems we have to make a personalized copy of the TeX input method. But here is a lighter way to do that, without any file tweaking. Simply put the following in your .emacs:
(eval-after-load "quail/latin-ltx"
  '(let ((pkg (copy-tree (quail-package "TeX"))))
     (setcar pkg "MyTeX")
     (assq-delete-all ?_ (nth 2 pkg))
     (quail-add-package pkg)))
(set-input-method 'TeX)
(register-input-method "MyTeX" "UTF-8" 'quail-use-package "\\")
(set-input-method 'MyTeX)
The important part is the assq-delete-all line in the middle, which removes all shortcut entries starting with _. It's a bit of a Lisp hack, but it seems to work. Since I'm also annoyed by the shortcuts starting with - and ^, I also use the following two lines to disable them:
(assq-delete-all ?- (nth 2 pkg))
(assq-delete-all ?^ (nth 2 pkg))
Note that afterwards you can run M-x set-input-method at any time and enter TeX or MyTeX to switch between the pristine TeX input method and the customized one.

Define a character as a word boundary

I've defined the \ character to behave as a word constituent in latex-mode, and I'm pretty happy with the results. The only thing bothering me is that a sequence like \alpha\beta gets treated as a single word (which is the expected behavior, of course).
Is there a way to make emacs interpret a specific character as a word "starter"? This way it would always be considered part of the word following it, but never part of the word preceding it.
For clarity, here's an example:
\alpha\beta
^          ^
1          2
If the point is at 1 and I press M-d, the string "\alpha" should be killed.
If the point is at 2 and I press M-<backspace>, the string "\beta" should be killed.
How can I achieve this?
Another thought:
Your requirement is very like what subword-mode provides for camelCase.
You can't customize subword-mode's behaviour -- the regexps are hard-coded -- but you could certainly copy that library and modify it for your purposes.
M-x find-library RET subword RET
That would presumably be a pretty robust solution.
Edit: updated from the comments, as suggested:
For the record, changing every instance of [[:upper:]] to [\\\\[:upper:]] in the functions subword-forward-internal and subword-backward-internal inside subword.el works great =) (as long as "\" is defined as "w" syntax).
Personally I would be more inclined to make a copy of the library than edit it directly, unless for the purpose of making the existing library a little more general-purpose, for which the simplest solution would seem to be to move those regexps into variables -- after which it would be trivial to have buffer-local modified versions for this kind of purpose.
Edit 2: As of Emacs 24.3 (currently a release candidate), subword-mode facilitates this with the new subword-forward-regexp and subword-backward-regexp variables (for simple modifications), and the subword-forward-function and subword-backward-function variables (for more complex modifications).
By making those regexp variables buffer-local in latex-mode with the desired values, you can just use subword-mode directly.
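A sketch of how that could look, under a few assumptions: Emacs 24.3 or later, "\" already given "w" syntax in the mode, and default regexps that still contain the literal text [[:upper:]] (the substitution is the one described in the first edit above). The function name my-latex-subword-setup is made up for illustration.

```elisp
(require 'subword)

;; Hypothetical helper: derive buffer-local variants of the subword
;; regexps in which a backslash counts as an "upper-case" word starter.
(defun my-latex-subword-setup ()
  "Let `subword-mode' treat a backslash as the start of a subword."
  (dolist (var '(subword-forward-regexp subword-backward-regexp))
    (set (make-local-variable var)
         (replace-regexp-in-string
          (regexp-quote "[[:upper:]]") "[\\\\[:upper:]]"
          (default-value var)
          t t)))                        ; fixed case, literal replacement
  (subword-mode 1))

(add-hook 'latex-mode-hook 'my-latex-subword-setup)
```

Deriving the values from the defaults at run time avoids copying the stock regexps into your configuration, at the cost of silently doing nothing if a future Emacs changes their shape.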
You should be able to implement this using syntax text properties:
M-: (info "(elisp) Syntax Properties") RET
Edit: Actually, I'm not sure if you can do precisely this?
The following (which is just experimentation) is close, but M-<backspace> at 2 will only delete "beta" and not the preceding "\".
I suppose you could remap backward-kill-word to a function which checked for that preceding "\" and killed that as well. Fairly hacky, but it would probably do the trick if there's not a cleaner solution.
I haven't played with this functionality before; perhaps someone else can clarify.
(modify-syntax-entry ?\\ "w")
(setq parse-sexp-lookup-properties t)
(setq syntax-propertize-function 'my-propertize-syntax)

(defun my-propertize-syntax (start end)
  "Set custom syntax properties."
  (save-excursion
    (goto-char start)
    (while (re-search-forward "\\w\\\\" end t)
      ;; Mark a backslash that directly follows a word character as punctuation.
      (put-text-property
       (1- (point)) (point) 'syntax-table (string-to-syntax ".")))))
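The remapping idea mentioned above could be sketched as follows. The command name my-backward-kill-word-with-backslash is made up for illustration, and the sketch only handles the simple case of a single "\" directly before the word.

```elisp
;; Hypothetical command: like `backward-kill-word', but also kills a
;; backslash that immediately precedes the word that was moved over.
(defun my-backward-kill-word-with-backslash (arg)
  "Kill ARG words backward, including a directly preceding backslash."
  (interactive "p")
  (let ((start (point)))
    (backward-word arg)
    (when (eq (char-before) ?\\)
      (backward-char))
    (kill-region start (point))))

;; Could be made mode-specific by binding it in the mode's keymap instead.
(global-set-key [remap backward-kill-word]
                'my-backward-kill-word-with-backslash)
```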

Modify Alt+f in Emacs for tex-mode

When writing in tex mode, Alt+f in Emacs does not seem to treat the . as part of the word. How do I modify the Alt+f behavior so that, when moving forward, any punctuation following a word is included as part of that word?
I have a separate file that loads for when writing in tex so I will just throw it in there so it doesn't affect normal emacs behavior.
Thanks for any help.
An addition to this, but the same problem: when deleting with Alt+d, I would like it to delete not only the word but also the punctuation following it (e.g. , . ! etc.).
The following code should work for you:
(defun unpunctuate-syntax (str)
  "Make the characters of the given string word characters."
  (let ((st (copy-syntax-table (syntax-table))))
    (dotimes (n (length str))
      (modify-syntax-entry (elt str n) "w" st))
    (set-syntax-table st)))

(defun dots-are-not-punctuation ()
  (unpunctuate-syntax "."))

(add-hook 'TeX-mode-hook 'dots-are-not-punctuation)
The way M-f (the forward-word function) works is that it skips all characters in the buffer that have type "w" (i.e. word) in the current syntax table.
This code makes a modified syntax table and gives it to the buffer and the add-hook bit at the bottom sets it to run when you open a file in TeX-mode. (This method avoids you having to do the separate file thing you described).
You might notice that I make a copy of the syntax table rather than editing the one belonging to the TeX major mode. This is because I always get things wrong when playing with syntax tables and you can mess things up royally... This method means you just have to close the buffer and start again!

Emacs/Auctex: Automatically enabling/disabling LaTeX-Math-mode

I'm using Emacs in conjunction with AucTeX (running Ubuntu 10.04, if that matters).
Does anyone know if there is a way to automatically enable LaTeX-math-mode (a minor mode of AucTeX) if the point is in any maths environment (i.e. in a $...$, a $$...$$, \begin{equation}...\end{equation}, and so on)?
I suppose there is a relatively easy answer, since syntax highlighting uses the same criterion for coloring math stuff, but I could not find anything.
If andre-r's answer doesn't satisfy you, here's some code that sets up ` to self-insert in text mode and act as a math mode prefix in math mode. LaTeX-math-mode must be off.
(defun LaTeX-maybe-math ()
  "If in math mode, act as a prefix key for `LaTeX-math-keymap'.
Otherwise act as `self-insert-command'."
  (interactive) ; required so the function can be bound to a key
  (if (texmathp)
      (let* ((events (let ((overriding-local-map LaTeX-math-keymap))
                       (read-key-sequence "math: ")))
             (binding (lookup-key LaTeX-math-keymap events)))
        (call-interactively binding))
    (call-interactively 'self-insert-command)))
(define-key LaTeX-mode-map "`" 'LaTeX-maybe-math)
The following improvements are left as exercises:
Make it a minor mode.
Make it more robust towards unexpected input (I've only tested basic operation).
Show a better error message if the user presses an unbound key sequence.
Show help if the user presses C-h or f1.
LaTeX-math-mode is "a special minor mode for entering text with many mathematical symbols." (For those who don't know how, you press e.g. `A and get \forall.) So I guess it doesn't hurt to leave it on, even when you're not entering maths.
The info page therefore suggests:
(add-hook 'LaTeX-mode-hook 'LaTeX-math-mode)
IMHO the only downside would be that you have to press the prefix twice: `` to get `, at least that works with the standard prefix ` customized in LaTeX-math-abbrev-prefix.

Changing Paredit Formatting

When using paredit in programming modes such as C, typing ( will insert a space before the paren when I'm trying to call a function, leaving me with:
foo ()
Is there a way to disable the insertion of the space without changing paredit's source?
Well, the way paredit appears to work is that it checks the syntax tables to see if you're inserting a pair right after a word/symbol/etc., in which case it forces a space to be inserted. You need to override that functionality - which can be done a number of different ways: advice, redefine the function determining space, changing the syntax table, etc.
I'd try the straightforward:
(defun paredit-space-for-delimiter-p (endp delimiter)
  (and (not (if endp (eobp) (bobp)))
       (memq (char-syntax (if endp (char-after) (char-before)))
             (list ?\" ;; REMOVED ?w ?_
                   (let ((matching (matching-paren delimiter)))
                     (and matching (char-syntax matching)))))))
This will obviously apply to all places where you use paredit. If you want something more mode specific, you can add some conditions to that and statement (e.g. (and ... (memq major-mode '(c-mode lisp-mode)))).
So... I guess I did change the "source", but you can do the same thing with a piece of defadvice ... it's all elisp, so the difference is minimal. There doesn't appear to be a setting to control this type of behavior.
See paredit-space-for-delimiter-predicates
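A sketch of how that variable could be used, assuming a paredit version that provides paredit-space-for-delimiter-predicates and calls each predicate with the arguments ENDP and DELIMITER, inserting no space if any predicate returns nil. The function name my-paredit-no-space-before-paren is made up for illustration.

```elisp
(require 'paredit)

;; Hypothetical predicate: veto the space paredit would insert before
;; an opening parenthesis; leave every other case to paredit.
(defun my-paredit-no-space-before-paren (endp delimiter)
  "Return nil when paredit is about to put a space before an opening (."
  (not (and (not endp) (eq delimiter ?\())))

;; Install it buffer-locally in C-like modes only, so Lisp buffers
;; keep paredit's usual spacing.
(add-hook 'c-mode-common-hook
          (lambda ()
            (add-to-list
             (make-local-variable 'paredit-space-for-delimiter-predicates)
             'my-paredit-no-space-before-paren)))
```

Compared with redefining paredit-space-for-delimiter-p, this leaves paredit's own function untouched and keeps the change limited to the modes that run the hook.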
Well, Paredit is ideal for editing languages built of S-expressions. If you just like how it automatically inserts the closing paren, use the skeleton-pair feature instead.
(setq skeleton-pair t)
(global-set-key "(" 'skeleton-pair-insert-maybe)