Emacs incorrectly wraps text with fill-paragraph - emacs

So in Emacs, word-wrapping is called filling and is done by M-q or M-x fill-paragraph. Is there any way to modify this function to respect spaces that should be non-breaking? For example, if we have the following sentence:
This is a black sentence with a yellow word at the end.
and tell Emacs to fill-paragraph at column 50, it wraps it like this:
This is a black sentence with a yellow word at the
end.
However, if I do the same with C-u M-x shell-command-on-region and enter fold -sw 50, I get the following (correct) output:
This is a black sentence with a yellow word at
the end.
A similar problem happens when the end of a sentence is followed by something in parentheses:
This is a black sentence with a yellow word here. (This is something in parens)
The above sentence is wrapped with M-q at column 50 in the following way:
This is a black sentence with a yellow word
here. (This is something in parens)
However, fold -sw 50 wraps it correctly:
This is a black sentence with a yellow word here.
(This is something in parens)
I know I could just write a function that uses fold and use that but I'm curious as to why fill-paragraph behaves like this and if it can be modified.

I am not sure what your definition of "non-breaking" space is.
However, you can achieve what you want in your examples by doing:
(add-to-list 'fill-nobreak-predicate 'fill-single-word-nobreak-p)
(setq sentence-end-double-space nil)
Documentation (and comment) of fill-single-word-nobreak-p says:
"Don't break a line after the first or before the last word of a sentence."
;; Actually, allow breaking before the last word of a sentence, so long as
;; it's not the last word of the paragraph.
You can add your own predicates to the fill-nobreak-predicate list.
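As a rough illustration of such a custom predicate, here is a minimal sketch. The function name and the choice of abbreviations are hypothetical, not from the answer above. A predicate is called with no arguments, with point at the candidate break position, and returns non-nil to forbid a break there. This one forbids breaking right after "Fig." or "No.":

```elisp
;; Hypothetical predicate: the name and the abbreviation list are
;; illustrative assumptions, not part of Emacs.
(defun my-fill-nobreak-after-abbrev-p ()
  "Return non-nil if the text before point is \"Fig.\" or \"No.\".
Trailing spaces and tabs before point are ignored."
  (save-excursion
    ;; Step back over the whitespace at the candidate break position.
    (skip-chars-backward " \t")
    ;; Check whether one of the abbreviations ends exactly here.
    (looking-back "\\b\\(?:Fig\\|No\\)\\." (line-beginning-position))))

;; Register it alongside the built-in predicates.
(add-hook 'fill-nobreak-predicate #'my-fill-nobreak-after-abbrev-p)
```

With this in place, M-q should keep "Fig. 3" on one line instead of leaving "Fig." at the end of a line.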

Related

Display consecutive whitespace as dots in Emacs

This answer nicely provides a way to display characters rather than tabs (in the example it suggests ">", but I confirmed it works for ".").
It uses setting the active window display table to do it.
Now my goal is to display 4 spaces as 4 dots. Using the font-face and a regular expression, I am confident that I can display it nicely. I am aware that I could have Emacs automatically use tab characters rather than whitespaces, but I always prefer to have whitespace characters in my files.
I've also looked at whitespace mode, but even after tweaking many parameters I never get simple dots (with a face that makes them "jump out" a little less).
So: how can I, rather than display tab characters as dots, display 4 spaces elegantly as dots in Emacs?
OK, here's how to mark 4 or more spaces at beginning of line
(setq whitespace-space-regexp "^\\( \\{4,\\}\\)")
And here's how to get rid of the centered dot character for space:
(setq whitespace-display-mappings
      '((space-mark ?\ [?\ ] [?.])
        (space-mark ?\xA0 [?\ ] [?_])
        (newline-mark ?\n [?$ ?\n])
        (tab-mark ?\t [?\u00BB ?\t] [?\\ ?\t])))
The changes do not take effect immediately, only when you revert-buffer or
close the buffer and reopen it with the customizations above already set.
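For orientation, here is a minimal sketch of how these pieces might fit into an init file. The face color and the choice of prog-mode-hook are assumptions made for illustration. One caveat, as far as I understand whitespace-mode: display mappings are applied through the buffer's display table, so the dot is drawn for every space character, while whitespace-space-regexp only decides which spaces receive the whitespace-space face.

```elisp
(require 'whitespace)

;; Only deal with spaces: apply a face to them and replace their glyph.
(setq whitespace-style '(face spaces space-mark))

;; Give the `whitespace-space' face only to runs of 4+ leading spaces.
(setq whitespace-space-regexp "^\\( \\{4,\\}\\)")

;; Draw a space as a plain dot instead of the default centered dot.
(setq whitespace-display-mappings '((space-mark ?\s [?.])))

;; Assumed styling: a muted gray so the dots stand out less.
(set-face-attribute 'whitespace-space nil
                    :foreground "gray60"
                    :background 'unspecified)

;; Assumed hook: enable the mode in programming buffers.
(add-hook 'prog-mode-hook #'whitespace-mode)
```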

How to get literal forward slashes in org-mode?

In text, can I make org-mode ignore forward slashes somehow? Phonetics uses /s/ to denote a certain level of analysis.
I assume that you do not want the text to appear emphasized in the buffer, nor in the output. This is a slightly more complex answer which will achieve that result:
There is a variable defined by Org-mode called org-emphasis-alist which defines the different emphasis modes, what their plain-text syntaxes are, and how they are exported to HTML. You can achieve the result you want by changing the value of this variable before Org-mode has been loaded. That last part is critical so note it well—Org-mode reads the value of org-emphasis-alist when it is loaded and uses that value to generate a regular expression for highlighting ("font-lock") purposes.
Here are two routes to that:
Add the following lines to your .emacs file above the lines that load Org-mode:
(setq org-emphasis-alist
      `(("*" bold "<b>" "</b>")
        ;; ("/" italic "<i>" "</i>")
        ("_" underline "<span style=\"text-decoration:underline;\">" "</span>")
        ("=" org-code "<code>" "</code>" verbatim)
        ("~" org-verbatim "<code>" "</code>" verbatim)
        ("+" ,(if (featurep 'xemacs) 'org-table '(:strike-through t))
         "<del>" "</del>")))
(Notice the commented out line.)
Make the change through Emacs' customization facility:
M-x customize RET
In the search box enter Org Emphasis and click Search.
Click the down arrow next to Org Emphasis Alist to reveal its value.
Find and click the second DEL button—corresponding to the italic list item.
Click the Save for future sessions button at the top of the buffer.
You can use
#+OPTIONS: *:nil
to turn off text emphasis (bold, italics, underline). However, this only affects the export; the emphasis will still be visible in the buffer.
See the manual for other export options.
If you like the standard emphasis functionality of the forward slashes in org-mode, you could also just define a new LaTeX command. I put something like the following in the preamble whenever I do phonetics/phonology typesetting:
\newcommand\uRep[1]{$/$\textipa{#1}$/$}

multi-word abbrev expansion

I found this mailing list message about multi-word abbreviations, but
still can't get expansion to work.
I have these two abbreviations defined:
"agw" 0 "a great whale"
"a g w" 0 "a great whale"
Pressing space after "agw" works, but not after "a g w". However, if I call
(abbrev-expansion "a g w"), then the correct expansion is returned.
The question is how to get Emacs to search beyond one word boundary backwards?
Yes, yasnippet exists and I use it, but abbrev is more seamless (e.g. press
space after "1/2" turns it into unicode half). I also don't want to change
the syntax table.
SPACE, before inserting itself, will look for an abbrev before point to expand. It uses `backward-word', which will stop at the previous space. Thus SPACE can't expand spaces inside abbrevs.
As explained by Andreas, it does not work because by default abbrevs only work if they are made up of chars that are "word constituent" and " " is not a word constituent. You can change this rule, though, for a given table, with something like (abbrev-table-put <table> :regexp <regexp>) where a regexp like "\\<\\(\\w+:?\\)\\W*" would pretty much reproduce the default behavior.
Now for your case, you want a regexp that will match "a g w" and "agw" but note that it shouldn't match the "agw" of "dawg" and neither should it match "a g agw", which makes it a bit tricky. One way to do that is to define your regexp as (concat "\\<" (regexp-opt '("agw" "a g w") t) "\\W*"), where the t argument makes regexp-opt wrap the alternatives in a capturing group, since abbrev takes the abbrev name from submatch 1. This is fairly simple to do but has the downside that it requires changing the regexp any time you add an abbrev.
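The regexp approach above could be wired up roughly as follows. This is a sketch under assumptions: the table name my-multiword-abbrev-table and the use of text-mode-hook are made up for illustration. The t passed to regexp-opt requests a capturing group, because abbrev reads the candidate abbrev name from submatch 1 of :regexp.

```elisp
;; Hypothetical table holding both spellings of the abbrev.
(define-abbrev-table 'my-multiword-abbrev-table
  '(("agw"   "a great whale")
    ("a g w" "a great whale"))
  "Sketch of an abbrev table whose abbrevs may contain spaces."
  ;; Submatch 1 must cover the abbrev name, hence the `t' argument.
  :regexp (concat "\\<" (regexp-opt '("agw" "a g w") t) "\\W*"))

;; Assumed activation: use the table as the local one in text buffers.
;; Note that this replaces the mode's usual local abbrev table.
(add-hook 'text-mode-hook
          (lambda ()
            (setq local-abbrev-table my-multiword-abbrev-table)
            (abbrev-mode 1)))
```

As the answer notes, the list given to regexp-opt has to be kept in sync with the table by hand whenever an abbrev is added.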

How to make part of a word bold in org-mode

How can I make org-mode markup work for a part of a word? For example, I'd like it to work for cases like this:
=Class=es
and this:
/Method/s
Based on my tests it seems like org-mode markup syntax works on complete words only.
These days, there is a way to do this (without using quoted HTML tags):
(setcar org-emphasis-regexp-components " \t('\"{[:alpha:]")
(setcar (nthcdr 1 org-emphasis-regexp-components) "[:alpha:]- \t.,:!?;'\")}\\")
(org-set-emph-re 'org-emphasis-regexp-components org-emphasis-regexp-components)
Explanation
The manual says that org-emphasis-regexp-components can be used to
fine tune what characters are allowed before and after the markup characters [...].
It is a list containing five entries. The first entry lists characters that are allowed to immediately precede markup characters, and the second entry lists characters that are allowed to follow markup characters. By default, letters are not included in either one of these entries. So in order to successfully apply formatting to strings immediately preceded or followed by a letter, we have to add [:alpha:] (which matches any letter) to both entries.
This is what the calls to setcar do. The purpose of the third line is to rebuild the regular expression for emphasis based on the modified version of org-emphasis-regexp-components.
I don't think you can do it so that it shows up in the buffer as bold. If you just need it so that it appears bold when you export it to html, you can use:
th@<b>is is ha@</b>lf bold
See Quoting HTML tags
No, you can't do that. I searched for the same solution before and found nothing. A (very) bad hack is to do something like *Class* es (with a whitespace).
Perhaps you can write a short message to the creator, Carsten Dominik (Homepage), and ask him for a solution. He seems to be a nice guy.
A solution that has not been mentioned is to use a Unicode zero width space (U+200B) between the bolded and unbolded parts of a word.
To get the desired bolding of the word "Classes":
Type 'Class*es' in the buffer (without quotes).
Move the cursor between the '*' and 'e' characters.
Press C-x 8 RET (to execute the insert-char command).
Type 'zero width space' (without quotes) and press RET.
Move the cursor to the beginning of the word and insert a '*' character.
The word "Classes" should now have the desired appearance.
Note that there is the possibility that this will cause problems when exporting.
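If you use this trick often, the C-x 8 RET step can be shortened with a small command. This is only a sketch: the command name and the C-c z binding are arbitrary choices, not anything Org provides.

```elisp
;; Hypothetical helper: inserts U+200B so emphasis markers can sit
;; next to letters without visible whitespace.
(defun my-insert-zero-width-space ()
  "Insert a ZERO WIDTH SPACE (U+200B) at point."
  (interactive)
  (insert-char #x200B))

;; Assumed key binding, installed once Org has been loaded.
(with-eval-after-load 'org
  (define-key org-mode-map (kbd "C-c z") #'my-insert-zero-width-space))
```

Typing *Class*, then C-c z, then es should then give the same result as the manual steps above.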
src_latex{\textbf{Class}es and \textit{Method}s}
Building up on the previous excellent answer
I had to modify it a bit to make it work with Spacemacs. Indeed, the Spacemacs org layer documentation, available here, says:
Because of autoloading, calling to org functions will trigger the
loading up of the org shipped with emacs which will induce conflicts.
One way to avoid conflict is to wrap your org config code in a
with-eval-after-load block [...]
So, I put the following lines inside my dotspacemacs/user-config ():
(eval-after-load "org"
  '(progn
     (setcar org-emphasis-regexp-components " \t('\"{[:alpha:]")
     (setcar (nthcdr 1 org-emphasis-regexp-components) "[:alpha:]- \t.,:!?;'\")}\\")
     (org-set-emph-re 'org-emphasis-regexp-components org-emphasis-regexp-components)))

Getting Emacs fill-paragraph to play nice with javadoc-like comments

I'm writing an Emacs major mode for an APL dialect I use at work. I've gotten
basic font locking to work, and after setting comment-start and
comment-start-skip, comment/uncomment region and fill paragraph also
work.
However, comment blocks often contain javadoc-style comments and I
would like fill-paragraph to avoid gluing together lines starting
with such commands.
If I have this (\ instead of javadoc @):
# This is a comment that is long and should be wrapped.
# \arg Description of argument
# \ret Description of return value
M-q gives me:
# This is a comment that is long and
# should be wrapped. \arg Description
# of argument \ret Description of
# return value
But I want:
# This is a comment that is long and
# should be wrapped.
# \arg Description of argument
# \ret Description of return value
I've tried setting up paragraph-start and paragraph-separate to
appropriate values, but fill-paragraph still doesn't work inside a
comment block. If I remove the comment markers, M-q works as I want
it to, so the regexp I use for paragraph-start seems to work.
Do I have to write a custom fill-paragraph for my major
mode? cc-mode has one that handles cases like this, but it's really
complex, I'd like to avoid it if possible.
The problem was that the paragraph-start regexp has to match the entire line to work, including the actual comment character. The following elisp works for the example I gave:
(setq paragraph-start "^\\s-*\\#\\s-*\\\\\\(arg\\|ret\\).*$")
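In a major mode, this setting would normally be made buffer-local inside the mode definition rather than set globally. A minimal sketch, assuming a hypothetical mode named my-apl-mode with "#" as its comment character (the comment-start-skip value is also an assumption), and reusing the regexp from the answer unchanged:

```elisp
;; Hypothetical major mode; only the settings relevant to filling
;; comments are shown.
(define-derived-mode my-apl-mode prog-mode "MyAPL"
  "Sketch of a major mode whose comments contain \\arg and \\ret lines."
  (setq-local comment-start "# ")
  (setq-local comment-start-skip "#+\\s-*")
  ;; Lines such as "# \arg ..." or "# \ret ..." start a new paragraph,
  ;; so M-q does not merge them into the preceding text.
  (setq-local paragraph-start
              "^\\s-*\\#\\s-*\\\\\\(arg\\|ret\\).*$"))
```

To support more tags, extend the alternation in the final group, for example \\(arg\\|ret\\|see\\).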
Here is a page that has an example regexp for php-mode that does this:
http://barelyenough.org/blog/2006/10/nicer-phpdoc-comments/
There are other modes that use less complex functions for fill-paragraph-function. Browsing through my install, it looks like the ones in ada-mode and make-mode are good examples.
What I do in these cases is open a blank line between the paragraph lines and the argument lines, then use M-q to wrap the paragraph lines, then kill the blank line between them. Not ideal, but it works and is easy enough to record in a macro if you need to repeat it.