How to remove ^M in Emacs

How can I remove ^M characters in Emacs?
Running dos2unix filename or unix2dos filename does not help.
Normally I cannot see any ^M characters, but this is what the command cat -A filename prints:
cat -A ABC.sh
#!/bin/csh -f^M$
^M$
^M$
set input = `ls -1 *.py`^M$
echo $input^M$
Please explain it in plain words and in detail.

[Searched for a duplicate, but didn't find one about replacing (instead of preventing etc.). If there is one then this one or that one should be closed.]
In Emacs, visit the file that has the ^M chars. Go to the beginning of the file (M-<), then:
M-x replace-string RET C-q C-m RET RET
That is, at the prompt for what to replace, use Control + q then Control + m, then Enter. At the prompt for what to replace it with, just hit Enter (replace it with nothing).

I've been using this
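(defun delete-carriage-returns ()
  "Delete every carriage return (^M) in the current buffer."
  (interactive)
  (save-excursion
    ;; Start from the first accessible position in the buffer.
    (goto-char (point-min))
    (while (search-forward "\r" nil :noerror)
      (replace-match ""))))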
I'm sure there is a better way, but this works well enough that I stopped looking.

Related

how to enter tilde in emacs with portuguese keyboard?

In every single other program I have ever used in the last 15 years across Windows, OS X and Linux, I enter a tilde by pressing the tilde key and then space. The Portuguese keyboard has a dedicated tilde key where it is the primary character (no need for Shift); it is used to compose ã and õ by pressing tilde and then a or o. In Emacs, pressing tilde does nothing and reports "dead-tilde is undefined". How can I make Emacs write a '~' when I press the '~' key in the pt layout?
Edit:
I think this is a better solution: It should match your experience in other applications where ~o gives õ and ~ followed by a space gives ~.
Tell Emacs you wish to use the portuguese-prefix input method. Interactively, you can do M-x set-input-method RET portuguese-prefix RET. To make this permanent, add something like this to your config file:
(set-input-method 'portuguese-prefix)
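As far as I can tell, set-input-method only activates the input method for the buffer that is current when it runs, so a bare call in the init file may not reach buffers opened later. A minimal sketch of one way around that, assuming you want it in text buffers (the choice of text-mode-hook is my assumption, not part of the answer above):

```elisp
;; Make portuguese-prefix the method that C-\ (toggle-input-method) selects.
(setq default-input-method "portuguese-prefix")

;; Assumption: switch it on automatically in every text-mode buffer.
;; Use a different hook if you want it elsewhere.
(add-hook 'text-mode-hook
          (lambda () (set-input-method "portuguese-prefix")))
```

With only the first line, nothing changes until you press C-\ in a buffer; the hook removes that step for text buffers.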
Original answer:
self-insert-command doesn't seem to work well with dead keys.
Try this instead:
(defun my-insert-tilde ()
(interactive)
(insert "~"))
(global-set-key (kbd "<dead-tilde>") #'my-insert-tilde)
Add
(require 'iso-transl)
to your Emacs init file (init.el). With this line, tilde+space prints a tilde, and tilde+a prints ã.
This seems to be due to "Emacs and some input method managers (ibus and SCIM) don’t work together".

How to strip CR (^M) and leave LF (^J) characters?

I am trying to use Hexl mode to manually remove some special characters from a text file and don't see how to delete anything in Hexl mode.
What I really want is to remove carriage return and keep linefeed characters.
Is Hexl mode the right way to do this?
No need for find-and-replace. Just use:
M-x delete-trailing-whitespace
You can also set the file encoding through
C-x RET f unix
No need for hexl-mode for this. Just do a global search-and-replace of ^M^J with ^J. Works for me. :) Then save the file, kill the buffer, and revisit the file so the window shows the new file mode (Unix vs DOS).
Note that ^M^J needs to be entered as two literal characters: use C-q C-m C-q C-j for the search string, and C-q C-j for the replacement string.
There's also a command-line tool called unix2dos/dos2unix that exists specifically to convert line endings.
Assuming you want a DOS encoded file to be changed into UNIX encoding, use M-x set-buffer-file-coding-system (C-x RET f) to set the coding-system to "unix" and save the file.
If you want to remove the carriage returns (usually displayed as ^M) and leave the line feeds, you can visit the file without any conversion:
M-x find-file-literally /path/to/file
This is needed because a file with carriage returns is generally displayed in DOS mode, which hides the carriage returns. The mode line will likely display (DOS) on the left side.
Once you've done that, the ^M will show up and you can delete them like you would any character.
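To check from Lisp whether a file contains carriage returns at all, before deciding how to visit it, something like the following might do. It is only a sketch: the function name my-count-carriage-returns is made up, and it reads the file without decoding, in the same spirit as find-file-literally.

```elisp
(defun my-count-carriage-returns (file)
  "Return the number of CR characters in FILE, read without any decoding."
  (with-temp-buffer
    ;; Literal insertion keeps the CRs in the buffer text instead of
    ;; folding them into the end-of-line convention.
    (insert-file-contents-literally file)
    (how-many "\r" (point-min) (point-max))))
```

Evaluating (my-count-carriage-returns "/path/to/file") returns 0 for a file with pure Unix line endings.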
You don't need to use hexl-mode. Instead:
Open the file in a way that shows you those ^M's. See M-x find-file-literally /path/to/file above. In XEmacs you can also do C-u C-x C-f and select binary encoding.
Select the string you want to replace and copy it using M-w.
Do M-% (query replace) and paste what you copied using C-y.
Press Enter when prompted for what to replace it with.
Optionally press ! now to replace all occurrences.
The point is that even if you don't know how to enter what you are trying to replace, you can always select and copy it.
(in hexl mode) I'm not sure that you can delete characters. I've always converted them to spaces or some other character, switched to the regular text editor, and deleted them there.
I use this function:
(defun l/cr-sanitise ()
  "Make sure current buffer uses unix-utf8 encoding.
If necessary remove superfluous ^M.  Buffer will need to be saved
for changes to be permanent."
  (interactive)
  (set-buffer-file-coding-system 'utf-8-unix)
  (delete-trailing-whitespace)
  (message "Please save buffer to persist encoding changes."))
From http://www.xsteve.at/prg/emacs/xsteve-functions.el:
;02.02.2000
(defun xsteve-remove-control-M ()
  "Remove ^M at end of line in the whole buffer."
  (interactive)
  (save-match-data
    (save-excursion
      (let ((remove-count 0))
        (goto-char (point-min))
        (while (re-search-forward (concat (char-to-string 13) "$") (point-max) t)
          (setq remove-count (+ remove-count 1))
          (replace-match "" nil nil))
        (message "%d ^M removed from buffer." remove-count)))))
Add this to your .emacs and run it via M-x xsteve-remove-control-M, or bind it to an easier key. It will strip the ^Ms in any mode.

Hiding ^M in Emacs

Sometimes I need to read log files that have ^M (control-M) in the line endings. I can do a global replace to get rid of them, but then something more is logged to the log file and, of course, they all come back.
Setting Unix-style or dos-style end-of-line encoding doesn't seem to make much difference (but Unix-style is my default). I'm using the undecided-(unix|dos) coding system.
I'm on Windows, reading log files created by log4net (although log4net obviously isn't the only source of this annoyance).
(defun remove-dos-eol ()
  "Do not show ^M in files containing mixed UNIX and DOS line endings."
  (interactive)
  (setq buffer-display-table (make-display-table))
  (aset buffer-display-table ?\^M []))
Solution by Johan Bockgård. I found it here.
Modern versions of emacs know how to handle both UNIX and DOS line endings, so when ^M shows up in the file, it means that there's a mixture of both in the file. When there is such a mixture, emacs defaults to UNIX mode, so the ^Ms are visible. The real fix is to fix the program creating the file so that it uses consistent line-endings.
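To confirm that a buffer really does contain such a mixture, you could count the two kinds of line ending. This is a rough sketch with a made-up function name (my-eol-census). It looks at the buffer text, so it only sees CRs that Emacs left visible, as in the Unix-mode case described above.

```elisp
(defun my-eol-census ()
  "Return (CRLF . LF): counts of CR LF and of bare LF line endings in the buffer."
  (let* ((crlf (how-many "\r\n" (point-min) (point-max)))
         (any-lf (how-many "\n" (point-min) (point-max))))
    ;; Every CR LF also contains an LF, so subtract to get the bare ones.
    (cons crlf (- any-lf crlf))))
```

If both numbers are non-zero, the file mixes conventions. Evaluate it with M-: (my-eol-census) RET in the buffer in question.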
What about?
C-x RET c dos RET C-x C-f FILENAME RET
I made a file that has two lines, with the second having a carriage return. Emacs would open the file in Unix coding, and switching coding system does nothing. However, the universal-coding-system-argument above works.
I believe you can change the line coding system the file is using to the Unix format with
C-x RET f UNIX RET
If you do that, the mode line should change to add the word "(Unix)", and all those ^M's should go away.
If you'd like to view the log files and simply hide the ^M's rather than actually replace them, you can use Drew Adams's highlight extension to do so.
You can either write elisp code or make a keyboard macro to do the following:
select the whole buffer
hlt-highlight-regexp-region
C-q C-m
hlt-hide-default-face
This will first highlight the ^M's and then hide them. If you want them back, use hlt-show-default-face.
Edric's answer should get more attention. Johan Bockgård's solution does address the poster's complaint, insofar as it makes the ^M's invisible, but that just masks the underlying problem, and encourages further mixing of Unix and DOS line-endings.
The proper solution would be to do a global M-x replace-regexp to turn all line endings to DOS ones (or Unix, as the case may be). Then close and reopen the file (not sure if M-x revert-buffer would be enough) and the ^M's will either all be invisible, or all be gone.
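A minimal sketch of such a normalization, assuming the buffer was visited so that the CRs are part of the buffer text (for example with find-file-literally, or in a buffer Emacs opened in Unix mode). The name my-normalize-eol is hypothetical, and it uses plain search-forward loops rather than M-x replace-regexp:

```elisp
(defun my-normalize-eol (&optional to-dos)
  "Make every line ending in the buffer LF, or CR LF if TO-DOS is non-nil.
Interactively, a prefix argument selects CR LF."
  (interactive "P")
  (save-excursion
    ;; Step 1: reduce every CR LF to a bare LF.
    (goto-char (point-min))
    (while (search-forward "\r\n" nil t)
      (replace-match "\n" t t))
    ;; Step 2 (optional): put a CR back in front of every LF.
    (when to-dos
      (goto-char (point-min))
      (while (search-forward "\n" nil t)
        (replace-match "\r\n" t t)))))
```

One caveat: if the buffer's coding system already ends in -dos, Emacs adds its own CR on save, so the TO-DOS variant only makes sense when the CRs are visible in the buffer.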
You can change the display-table entry of the Control-M (^M) character, to make it displayable as whitespace or even disappear totally (vacuous). See the code in library pp-c-l.el (Pretty Control-L) for inspiration. It displays ^L chars in an arbitrary way.
Edited: Oops, I just noticed that @binOr already mentioned this method.
Put this in your .emacs:
(defun dos2unix ()
  "Replace DOS line endings (CR LF) with Unix line endings (LF)."
  (interactive)
  (goto-char (point-min))
  (while (search-forward "\r" nil t) (replace-match "")))
Now you can simply call dos2unix and remove all the ^M characters.
If you encounter ^Ms in received mail in Gnus, you can use W c (wash CRs), or
(setq gnus-treat-strip-cr t)
What about using dos2unix, unix2dos (now tofrodos)?
sudeepdino008's answer did not work for me (I could not comment on his answer, so I had to add my own answer.).
I was able to fix it using this code:
(defun dos2unix ()
  "Replace DOS line endings (CR LF) with Unix line endings (LF)."
  (interactive)
  (goto-char (point-min))
  (while (search-forward (string ?\C-m) nil t) (replace-match "")))
Like binOr said, add this to your %APPDATA%\.emacs.d\init.el on Windows, or wherever your config lives.
;; Windows EOL
(defun hide-dos-eol ()
  "Hide ^M in files containing mixed UNIX and DOS line endings."
  (interactive)
  (setq buffer-display-table (make-display-table))
  (aset buffer-display-table ?\^M []))
(defun show-dos-eol ()
  "Show ^M in files containing mixed UNIX and DOS line endings."
  (interactive)
  (setq buffer-display-table (make-display-table))
  (aset buffer-display-table ?\^M ?\^M))
(add-hook 'text-mode-hook 'hide-dos-eol)

determining the line terminator in Emacs

I'm writing a config file and I need to define if the process expects a windows format file or a unix format file. I've got a copy of the expected file - is there a way I can check if it uses \n or \r\n without exiting emacs?
If it says (DOS) on the modeline when you open the file on Unix, the line endings are Windows-style. If it says (Unix) when you open the file on Windows, the line endings are Unix-style.
From the Emacs 22.2 manual (Node: Mode Line):
If the buffer's file uses carriage-return linefeed, the colon changes to either a backslash ('\') or '(DOS)', depending on the operating system. If the file uses just carriage-return, the colon indicator changes to either a forward slash ('/') or '(Mac)'. On some systems, Emacs displays '(Unix)' instead of the colon for files that use newline as the line separator.
Here's a function that – I think – shows how to check from elisp what Emacs has determined to be the type of line endings. If it looks inordinately complicated, perhaps it is.
(defun describe-eol ()
  "Report the line-ending type Emacs detected for the current buffer."
  (interactive)
  (let ((eol-type (coding-system-eol-type buffer-file-coding-system)))
    (when (vectorp eol-type)
      (setq eol-type (coding-system-eol-type (aref eol-type 0))))
    (message "Line endings are of type: %s"
             ;; `pcase' replaces the obsolete `case' macro from the old cl library.
             (pcase eol-type
               (0 "Unix") (1 "DOS") (2 "Mac") (_ "Unknown")))))
If you go into hexl-mode (M-x hexl-mode), you should see the line termination bytes.
Open the file in emacs using find-file-literally. If lines have ^M symbols at the end, it expects a windows format text file.
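The following Elisp function will return nil if no "\r\n" terminators appear in a file (otherwise it returns the position just after the first occurrence). You can put it in your .emacs and call it with M-x check-eol, which reports the result in the echo area.
(defun check-eol (file)
  "Return the position after the first \"\\r\\n\" in FILE, or nil if there is none."
  (interactive "fFile: ")
  (with-temp-buffer
    (insert-file-contents-literally file)
    ;; The third argument t makes a failed search return nil instead of signaling.
    (let ((pos (search-forward "\r\n" nil t)))
      (when (called-interactively-p 'any)
        (message (if pos "CR LF found, ending at %d" "No CR LF found") pos))
      pos)))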

How to get Emacs to unwrap a block of code?

Say I have a line in an emacs buffer that looks like this:
foo -option1 value1 -option2 value2 -option3 value3 \
-option4 value4 ...
I want it to look like this:
foo -option1 value1 \
-option2 value2 \
-option3 value3 \
-option4 value4 \
...
I want each option/value pair on a separate line. I also want those subsequent lines indented appropriately according to mode rather than to add a fixed amount of whitespace. I would prefer that the code work on the current block, stopping at the first non-blank line or line that does not contain an option/value pair though I could settle for it working on a selected region.
Anybody know of an elisp function to do this?
Nobody had what I was looking for so I decided to dust off my elisp manual and do it myself. This seems to work well enough, though the output isn't precisely what I asked for. In this version the first option goes on a line by itself instead of staying on the first line like in my original question.
(defun tcl-multiline-options ()
  "Spread option/value pairs across multiple lines with continuation characters."
  (interactive)
  (save-excursion
    (tcl-join-continuations)
    (beginning-of-line)
    (while (re-search-forward " -[^ ]+ +" (line-end-position) t)
      (goto-char (match-beginning 0))
      (insert " \\\n")
      (goto-char (+ (match-end 0) 3))
      (indent-according-to-mode)
      (forward-sexp))))
(defun tcl-join-continuations ()
  "Join multiple continuation lines into a single physical line."
  (interactive)
  (while (progn (end-of-line) (char-equal (char-before) ?\\))
    (forward-line 1))
  (while (save-excursion (end-of-line 0) (char-equal (char-before) ?\\))
    (end-of-line 0)
    (delete-char -1)
    (delete-char 1)
    (fixup-whitespace)))
In this case I would use a macro. You can start recording a macro with C-x (, and stop recording it with C-x ). When you want to replay the macro type C-x e.
In this case, I would type, C-a C-x ( C-s v a l u e C-f C-f \ RET SPC SPC SPC SPC C-x )
That would record a macro that searches for "value", moves forward 2, inserts a backslash and newline, and finally spaces the new line over to line up. Then you could repeat this macro a few times.
EDIT: I just realized, your literal text may not be as easy to search as "value1". You could also search for spaces and cycle through the hits. For example, hitting, C-s a few times after the first match to skip over some of the matches.
Note: Since your example is "ad-hoc" this solution will be too. Often you use macros when you need an ad-hoc solution. One way to make the macro apply more consistently is to put the original statement all on one line (can also be done by a macro or manually).
EDIT: Thanks for the comment about ( versus C-(, you were right my mistake!
Personally, I do stuff like this all the time, but I don't write a function to do it unless I'll be doing it every day for a year.
You can easily do it with query-replace, like this:
M-x query-replace " -option" "^Q^J -option"
I say ^Q^J because that is what you type (C-q C-j) to quote a newline and put it in the string.
Then just press 'y' for the strings to replace, and 'n' to skip the weird corner cases you'd find.
Another workhorse function is query-replace-regexp, which can do replacements of regular expressions, and also grep-query-replace, which will perform query-replace by parsing the output of a grep command. This is useful because you can search for "foo" in 100 files, then do the query-replace on each occurrence, skipping from file to file.
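For comparison, the same replacement can be tried on a plain string with replace-regexp-in-string. This is only a sketch: my-split-options is a made-up name, the four-space indent is a fixed assumption rather than mode-aware indentation, and any value that starts with "-" (such as a negative number) would be split off as if it were an option.

```elisp
(defun my-split-options (command)
  "Return COMMAND with each \" -option\" moved onto its own continuation line."
  (replace-regexp-in-string
   " +\\(-[^ ]+\\)"
   ;; A function replacement avoids having to escape the backslash twice.
   (lambda (m) (concat " \\\n    " (match-string 1 m)))
   command t t))
```

Calling it on "foo -option1 value1 -option2 value2" puts every option, including the first, on a line of its own, with a trailing backslash on each line that is continued.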
Your mode may support this already. In C mode and Makefile mode, at least, M-q (fill-paragraph) will insert line continuations at the fill column and wrap your lines.
What mode are you editing this in?