I wrote the following command:
echo -en 'uno\ndue\n' | sed -E 's/^.*(uno|$)/\1/'
expecting the following output:
uno
This is indeed the case with my GNU Sed 4.8.
However, I've verified that BSD Sed outputs
Why is that the case?
I'd say that BSD sed implements only what POSIX requires. POSIX specifies support only for basic regular expressions, which have many limitations (e.g., no support for | (alternation) at all, no direct support for + and ?) and different escaping requirements.
BSD sed is the default on macOS, so the very first thing to do on a new system is to install a GNU-compatible sed: brew install gsed.
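The limitations of basic regular expressions mentioned above can be tried out on a small input. This is only a sketch: the function names are made up for illustration, and it assumes a sed that accepts -E (both GNU and BSD sed do). In a basic regular expression, "one or more" has to be written as the interval \{1,\}, while -E allows + and |.

```shell
# BRE (no -E): "one or more a" written as an interval expression
bre_plus() { sed 's/a\{1,\}/X/'; }

# ERE (-E): the same match written with +
ere_plus() { sed -E 's/a+/X/'; }

# ERE (-E): alternation with |
ere_alt() { sed -E 's/uno|due/X/'; }

printf 'aaab\n' | bre_plus            # prints: Xb
printf 'aaab\n' | ere_plus            # prints: Xb
printf 'uno\ndue\ntre\n' | ere_alt    # prints: X, X, tre (one per line)
```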
This is mostly a curiosity question that arose here.
From the man page of GNU sed 4.8 I read
--posix
disable all GNU extensions.
so I understand that if a command like the following works, it means that -i without an argument is allowed by POSIX:
sed --posix -i -n '1,25p' *.txt
On the other hand, the same command (with or without --posix) doesn't work with macOS's BSD sed, because that version expects -i to be followed by an argument.
I can see only two mutually exclusive possibilities:
GNU sed's --posix option allows more than POSIX, which means it is buggy and needs a bug report
BSD sed is not POSIX-compliant.
What is the truth?
--posix refers to the sed language itself, not the command line interface:
GNU sed includes several extensions to POSIX sed. In order to simplify writing portable scripts, this option disables all the extensions that this manual documents, including additional commands.
POSIX does not specify -i, so an implementation without it can still be POSIX-conforming.
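Since -i is not specified by POSIX and its argument handling differs between GNU and BSD sed, a script that has to run with both could avoid the option altogether. A minimal sketch, assuming a helper called portable_inplace (a hypothetical name, not part of any sed) and assuming mktemp is available, which is common but not required by POSIX:

```shell
# portable_inplace FILE SED_ARGS...
# Run sed on FILE and write the result back, without relying on -i.
portable_inplace() {
    file=$1
    shift
    tmp=$(mktemp) || return 1
    # Copy the result back only if sed succeeded; "cat >" keeps the
    # original file's inode and permissions.
    sed "$@" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demo on a throwaway file: keep only the first two lines.
demo=$(mktemp)
printf 'one\ntwo\nthree\n' > "$demo"
portable_inplace "$demo" -n '1,2p'
cat "$demo"
rm -f "$demo"
```

For the command in the question, this would be called once per file, e.g. portable_inplace notes.txt -n '1,25p' (notes.txt is a placeholder).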
I have this line inside a file:
ULNET-PA,client_sgcib,broker_keplersecurities
,KEPLER
I am trying to get rid of that ^M (carriage return) character, so I used:
sed 's/^M//g'
However, this removes everything after ^M:
[root@localhost tmp]# vi test
ULNET-PA,client_sgcib,broker_keplersecurities^M,KEPLER
[root@localhost tmp]# sed 's/^M//g' test
ULNET-PA,client_sgcib,broker_keplersecurities
What I want to obtain is:
[root@localhost tmp]# vi test
ULNET-PA,client_sgcib,broker_keplersecurities,KEPLER
Use tr:
tr -d '^M' < inputfile
(Note that the ^M character can be entered by pressing Ctrl+V followed by Ctrl+M.)
EDIT: As suggested by Glenn Jackman, if you're using bash, you could also say:
tr -d $'\r' < inputfile
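The tr approach can be checked against the line from the question without typing a literal ^M, because tr interprets the \r escape itself. A small sketch; the sample function is just a stand-in for the real input file:

```shell
# Reproduce the line from the question, with a carriage return (\r)
# embedded before ",KEPLER".
sample() {
    printf 'ULNET-PA,client_sgcib,broker_keplersecurities\r,KEPLER\n'
}

# Make the carriage return visible: od -c shows it as \r.
sample | od -c

# Delete every carriage return.
sample | tr -d '\r'
# prints: ULNET-PA,client_sgcib,broker_keplersecurities,KEPLER
```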
Still the same one-liner:
sed -i 's/^M//g' file
When you type the command, enter ^M by pressing Ctrl+V followed by Ctrl+M.
Actually, if you have already opened the file in vim, you can do it right there:
:%s/^M//g
Again, enter ^M by pressing Ctrl+V followed by Ctrl+M.
You can simply use dos2unix which is available in most Unix/Linux systems. However I found the following sed command to be better as it removed ^M where dos2unix couldn't:
sed 's/\r//g' < input.txt > output.txt
Hope that helps.
Note: ^M is actually the carriage return character, which is written as \r in code.
What dos2unix does is most likely equivalent to:
sed 's/\r$//' < input.txt > output.txt
It removes \r only where it immediately precedes the \n that ends a line, and leaves any other \r untouched. This fails with certain types of files, like one I just tested with.
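The difference between deleting every \r and deleting only the \r that ends a line can be checked on a small input. This is a sketch with hypothetical function names. It builds a literal carriage return with printf, on the assumption that not every sed understands \r inside a regular expression:

```shell
# A literal carriage return, for seds that do not understand \r.
CR=$(printf '\r')

# Delete every carriage return, wherever it occurs.
strip_all_cr() { tr -d '\r'; }

# Delete a carriage return only at the end of a line (CRLF -> LF).
strip_trailing_cr() { sed "s/${CR}\$//"; }

# Input with one \r in the middle of the line and one at the end.
printf 'a\rb\r\n' | strip_all_cr | od -c        # a b \n
printf 'a\rb\r\n' | strip_trailing_cr | od -c   # a \r b \n
```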
alias dos2unix="sed -i -e 's/'\"\$(printf '\015')\"'//g' "
Usage:
dos2unix file
If Perl is an option:
perl -i -pe 's/\r\n$/\n/g' file
-i edits the file in place (write it as -i.bak to also keep a .bak copy of the original file)
\r = carriage return
\n = linefeed
$ = end of line
s/foo/bar/g = globally substitute "foo" with "bar"
In awk:
awk '{ sub(/\r/, ""); print }' inputfile
If it is at the end of the record, sub(/\r/,"",$NF) should suffice; there is no need to scan the whole record.
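Note that sub replaces only the first match in its target, whereas gsub replaces every match. A small sketch of the difference on a line with two carriage returns, assuming an awk that understands the \r escape in regular expressions (the function names are hypothetical):

```shell
# Remove only the first carriage return on each line.
first_cr_only() { awk '{ sub(/\r/, ""); print }'; }

# Remove every carriage return on each line.
every_cr() { awk '{ gsub(/\r/, ""); print }'; }

printf 'a\rb\r\n' | first_cr_only | od -c   # a b \r \n
printf 'a\rb\r\n' | every_cr | od -c        # a b \n
```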
This is a better way to achieve it:
tr -d '\015' < inputfile_name > outputfile_name
Afterwards, rename the output file to the original file name.
I agree with @twalberg (see the comments on the accepted answer, above): dos2unix on Mac OS X covers this. Quoting man dos2unix:
To run in Mac mode use the command-line option "-c mac" or use the
commands "mac2unix" or "unix2mac"
I settled on 'mac2unix', which got rid of my '^M' entries (visible in less), introduced by an Apple 'Messages' transfer of a bash script between two Yosemite (OS X 10.10) Macs!
I installed 'dos2unix' trivially on Mac OS X using the popular Homebrew package installer. I highly recommend it and its companion command, Cask.
This is clean and simple and it works:
sed -i 's/\r//g' file
where \r of course is the equivalent for ^M.
Simply run the following command:
sed -i -e 's/\r$//' input.file
I verified that this works on macOS Monterey.
remove any \r :
nawk 'NF+=OFS=_' FS='\r'
gawk 3 ORS= RS='\r'
remove end of line \r :
mawk2 8 RS='\r?\n'
mawk -F'\r$' NF=1
I am having trouble making sed work on my mac terminal. The original version I have is /usr/bin/sed
I want to see what version it is so I type:
sed --version
I get the following output:
/usr/bin/sed: illegal option -- -
usage: sed script [-Ealn] [-i extension] [file ...]
       sed [-Ealn] [-i extension] [-e script] ... [-f script_file] ... [file ...]
My man page is for sed 4.2, which should have a --version option.
I then downloaded GNU sed from the GNU FTP site, http://ftp.gnu.org/gnu/sed/, and installed it to /usr/local/bin.
I then ran /usr/local/bin/sed --version and still got the same output as with the original version. I am completely confused; can anyone figure out what I am doing wrong?
EDIT: It seems that even though which sed gives me /usr/local/bin/sed, typing sed still runs /usr/bin/sed, so /usr/local/bin/sed is not being invoked. If I invoke it with the full path, it works as expected.
I guess the question is now why which sed gives me /usr/local/bin/sed, yet the command that runs when I type sed is /usr/bin/sed.
Your /usr/bin/sed is BSD sed, which does not support --version, as your error message shows. Its man page is /usr/share/man/man1/sed.1.gz; it does not mention a version at all, although the man page itself is dated May 10, 2005.
I think you are reading the wrong man page, most probably because your MANPATH looks somewhere else first.
As for why /usr/local/bin/sed, which you say is GNU sed, does not honor --version, I am not sure. Can you give more detail about this?
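On the follow-up about which sed and the plain sed command disagreeing: one possible cause, which is an assumption since the question does not confirm it, is that the shell remembers where it first found sed and keeps using that path, while which searches PATH afresh. A minimal sketch for checking this with the standard command -v and hash -r built-ins (resolve_twice is a hypothetical helper name):

```shell
# Print what the current shell resolves for a command name, before and
# after telling the shell to forget its remembered command locations.
resolve_twice() {
    name=$1
    before=$(command -v "$name")
    hash -r                      # clear the shell's command location cache
    after=$(command -v "$name")
    printf 'before: %s\nafter:  %s\n' "$before" "$after"
}

resolve_twice sed
```

If the two lines differ, the shell was using a stale location, and running hash -r (or opening a new terminal) should be enough for sed to resolve to the newly installed binary.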
Currently I'm on a Windows XP SP3 machine with an English interface language. After I installed Cygwin and some packages with it, the sed and awk commands display their messages in another language, as the following example shows.
$ sed
用法: sed [选项]... {脚本(如果没有其他脚本)} [输入文件]...
-n, --quiet, --silent
取消自动打印模式空间
-e 脚本, --expression=脚本
添加“脚本”到程序的运行列表
-f 脚本文件, --file=脚本文件
添加“脚本文件”到程序的运行列表
--follow-symlinks
直接修改文件时跟随软链接
-i[扩展名], --in-place[=扩展名]
直接修改文件(如果指定扩展名就备份文件)
-b, --binary
以二进制方式打开文件 (回车加换行不做特殊处理)
-l N, --line-length=N
指定“l”命令的换行期望长度
--posix
关闭所有 GNU 扩展
-r, --regexp-extended
在脚本中使用扩展正则表达式
-s, --separate
将输入文件视为各个独立的文件而不是一个长的连续输入
-u, --unbuffered
从输入文件读取最少的数据,更频繁的刷新输出
--help 打印帮助并退出
--version 输出版本信息并退出
如果没有 -e, --expression, -f 或 --file 选项,那么第一个非选项参数被视为
sed脚本。其他非选项参数被视为输入文件,如果没有输入文件,那么程序将从标准
输入读取数据。
GNU sed home page: <http://www.gnu.org/software/sed/>.
General help using GNU software: <http://www.gnu.org/gethelp/>.
How can I make these commands use English language?
Thanks
Cygwin defaults to English even on non-English systems, so something must be setting the Cygwin locale to Chinese. The interface language is determined by the LC_ALL, LC_MESSAGES, and LANG environment variables, in that order of priority. If you're using the mintty terminal, the locale can be set on the Text page of its options.
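The priority order described above can be sketched as a small shell function that reports which value is in effect for messages. The function name is hypothetical, and falling back to C when nothing is set is an assumption made for this illustration:

```shell
# Print the locale that decides the message language:
# the first non-empty value among LC_ALL, LC_MESSAGES, LANG.
effective_messages_locale() {
    printf '%s\n' "${LC_ALL:-${LC_MESSAGES:-${LANG:-C}}}"
}

effective_messages_locale
```

If this prints a Chinese locale, look for where that variable is set (for example in a shell startup file or in the Windows environment variables). For a one-off check, prefixing a single command, as in LC_ALL=C sed --help, overrides the locale for that command only.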