Explanation of difference between GNU sed and BSD sed

I wrote the following command
echo -en 'uno\ndue\n' | sed -E 's/^.*(uno|$)/\1/'
expecting the following output
uno
This is indeed the case with my GNU Sed 4.8.
However, I've verified that BSD Sed outputs
Why is that the case?

I'd say that BSD's sed is POSIX-compatible only. POSIX specifies support only for basic regular expressions, which have many limitations (e.g., no support for | (alternation) at all, no direct support for + and ?) and different escaping requirements.
BSD sed is the default on macOS, so the very first thing I do on a new system is install a GNU-compatible sed: brew install gsed.
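If a script has to run on both macOS and Linux, one option is to choose the sed binary at run time instead of assuming which one is installed. This is a minimal sketch, assuming that only GNU sed answers --version with a banner containing "GNU" and that a separately installed GNU sed is on the PATH as gsed. The SED variable name is made up for illustration.

```shell
# Prefer GNU sed when it is available; otherwise fall back to the system sed.
# Assumption: only GNU sed prints a banner containing "GNU" for --version.
case "$(sed --version 2>/dev/null)" in
    *GNU*)
        SED=sed                 # the default sed is already GNU
        ;;
    *)
        if command -v gsed >/dev/null 2>&1; then
            SED=gsed            # assumed name of a separately installed GNU sed
        else
            SED=sed             # BSD/POSIX sed: avoid GNU-only syntax
        fi
        ;;
esac

# Run the command from the question with whichever sed was selected.
printf 'uno\ndue\n' | "$SED" -E 's/^.*(uno|$)/\1/'
```

When the fallback branch is taken, the script still runs, but anything that relies on GNU-specific behavior has to be rewritten in portable syntax.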

Related

-i without argument: is GNU sed --posix option bugged or BSD sed is not POSIX-compliant?

This is mostly a curiosity question that arose here.
From the man page of GNU sed 4.8 I read
--posix
disable all GNU extensions.
so I understand that if a code like the following works, it means that -i without argument is allowed by POSIX:
sed --posix -i -n '1,25p' *.txt
On the other hand, the same code (with or without --posix) doesn't work with macOS's BSD sed, because that version expects -i to be followed by an argument.
I can see only two mutually exclusive possibilities:
GNU sed's --posix option allows more than POSIX, which means it is bugged and needs a bug report.
BSD sed is not POSIX-compliant.
What is the truth?
--posix refers to the sed language itself, not the command line interface:
GNU sed includes several extensions to POSIX sed. In order to simplify writing portable scripts, this option disables all the extensions that this manual documents, including additional commands.
POSIX does not specify -i, so an implementation without it can still be POSIX-conforming.
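Because POSIX leaves -i unspecified, a script that must run on both implementations can avoid the option altogether by writing to a temporary file. This is a minimal sketch, assuming mktemp is available; sed_in_place is a hypothetical helper name, not part of any sed.

```shell
# Hypothetical helper: sed_in_place FILE SED_ARGS...
# Runs sed on FILE and overwrites FILE only if sed succeeded.
sed_in_place() {
    file=$1
    shift
    tmp=$(mktemp) || return 1
    # cat into the original (instead of mv) keeps its permissions and inode.
    sed "$@" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demo on a throwaway 30-line file: keep only lines 1-25.
demo=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 30; i++) print i }' > "$demo"
sed_in_place "$demo" -n '1,25p'
lines=$(wc -l < "$demo" | tr -d ' ')
echo "$lines"
rm -f "$demo"
```

With this helper, the loop from the question could be written as: for f in *.txt; do sed_in_place "$f" -n '1,25p'; done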

sed equivalent of perl -pe

I'm looking for an equivalent of perl -pe. Ideally, I would like to replace it with sed, if that's possible. Any help is highly appreciated.
The code is:
perl -pe 's/^\[([^\]]+)\].*$/$1/g'
$ echo '[foo] 123' | perl -pe 's/^\[([^\]]+)\].*$/$1/g'
foo
$ echo '[foo] 123' | sed -E 's/^\[([^]]+)\].*$/\1/'
foo
sed by default accepts code from command line, so -e isn't needed (though it can be used)
printing the pattern space is default, so -p isn't needed and sed -n is similar to perl -n
-E is used here to be as close as possible to Perl regex. sed supports BRE and ERE (not as feature rich as Perl) and even that differs from implementation to implementation.
with BRE, the command for this example would be: sed 's/^\[\([^]]*\)\].*$/\1/'
\ isn't special inside a character class unless it is part of an escape sequence like \t, \x27 etc
backreferences use the \N format (and are limited to a maximum of 9)
Also note that the g flag isn't needed in either case, as you are using line anchors
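To make the sed -n / perl -n parallel concrete, here is a small sketch that prints only the lines on which the substitution succeeded, using the BRE form of the command. The two input lines are made up for illustration.

```shell
# -n suppresses the automatic print; the p flag prints only the lines where
# the substitution matched, roughly like perl -ne 'print if s/.../.../'.
result=$(printf '%s\n' '[foo] 123' 'no brackets here' |
    sed -n 's/^\[\([^]]*\)\].*$/\1/p')
echo "$result"
```

Without -n and p, the second line would be passed through unchanged, which is what perl -pe does as well.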

Replacing = with in using sed

I have a string like below
abc="where session = '001122' and indicator = 'X'"
I want to convert it to
eng="where session in ('001122') and indicator in ('X')"
I have tried like below using sed in bash
eng=$(echo $abc | sed -r "s/=\s+('[^']+')/in (\1)/g")
I still get the input back unchanged. What am I doing wrong?
You can use unadorned sed, with backslashes to escape the capture group parentheses (\( and \)) as well as the one-or-more quantifier (\+):
$ eng=$(echo "$abc" | sed "s/=\s\+'\([^']\+\)'/in ('\1')/g")
$ echo "$eng"
where session in ('001122') and indicator in ('X')
It is also probably a good idea to quote your expansion of abc, since it has spaces in it, but not strictly necessary in this context.
Your original code may not have worked because -r is a GNU extension. The synonym -E used to be as well, but is now part of the POSIX standard, and should therefore be relatively portable. The following version should therefore have no problems either:
$ eng=$(echo "$abc" | sed -E "s/=\s+'([^']+)'/in ('\1')/g")
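Note that \s and \+ are GNU extensions rather than POSIX syntax, so other seds may not accept them. Here is a sketch that sticks to POSIX character classes, assuming any run of whitespace after = should be accepted:

```shell
abc="where session = '001122' and indicator = 'X'"

# [[:space:]] replaces \s, and -E provides + and unescaped parentheses.
eng=$(printf '%s\n' "$abc" |
    sed -E "s/=[[:space:]]+'([^']+)'/in ('\1')/g")
printf '%s\n' "$eng"
```

Inside double quotes the shell leaves \1 alone, so sed still sees the backreference.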

SED { command fail

On MKS SED for Windows, this
TYPE Q:\temp\curtainssetspread.M3U | SED -E "/z/{s_a_b_}"
fails with
sed: garbage after command
Why?
This usage accords correctly with docs:
a,b{ groups all commands until the next matching }, so that sed executes the entire group only if the { command is selected by its address(es).
According to POSIX, the } must be preceded by a newline. I'm not sure what MKS does, but the beauty of having a standard is that the following should work on all systems (using multiple -es joins each string together with newlines in between):
sed -e "/z/{s_a_b_" -e "}"
If it doesn't work, it's a bug in MKS and should be reported, as they say their sed is POSIX-compliant.
I do suggest following Benjamin's advice and just doing
sed -e '/z/s_a_b_'
if possible, though.
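For reference, the newline that POSIX asks for before } can be supplied either inside a single script or through separate -e arguments. This is a small sketch with made-up input, written for a POSIX shell (quoting under the Windows command prompt differs):

```shell
# Form 1: one script with literal newlines inside the braces.
out1=$(printf 'xyz a\nabc a\n' | sed -e '/z/{
s_a_b_
}')

# Form 2: separate -e fragments, which sed joins with newlines.
out2=$(printf 'xyz a\nabc a\n' | sed -e '/z/{s_a_b_' -e '}')

printf '%s\n' "$out1" "$out2"
```

Both forms should change only the line containing z, turning its first a into b.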

Parsing a line with sed using regular expression

Using sed I want to parse Heroku's log-runtime-metrics like this one:
2016-01-29T00:38:43.662697+00:00 heroku[worker.2]: source=worker.2 dyno=heroku.17664470.d3f28df1-e15f-3452-1234-5fd0e244d46f sample#memory_total=54.01MB sample#memory_rss=54.01MB sample#memory_cache=0.00MB sample#memory_swap=0.00MB sample#memory_pgpgin=17492pages sample#memory_pgpgout=3666pages
the desired output is:
worker.2: 54.01MB (54.01MB being memory_total)
I could not get this to work, although I tried several alternatives, including:
sed -E 's/.+source=(.+) .+memory_total=(.+) .+/\1: \2/g'
What is wrong with my command? How can it be corrected?
The .+ after source= and memory_total= are both greedy, so they accept as much of the line as possible. Use [^ ] to mean "anything except a space" so that it knows where to stop.
sed -E 's/.+source=([^ ]+) .+memory_total=([^ ]+) .+/\1: \2/g'
Putting your content into https://regex101.com/ makes it really obvious what's going on.
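The effect is easy to see on a shortened log line. In this sketch the sample line is a trimmed-down, hypothetical version of the one in the question:

```shell
line='heroku[worker.2]: source=worker.2 dyno=d1 sample#memory_total=54.01MB sample#memory_rss=54.01MB sample#memory_swap=0.00MB'

# Greedy (.+) runs past the first space and swallows neighbouring fields.
greedy=$(printf '%s\n' "$line" |
    sed -E 's/.+source=(.+) .+memory_total=(.+) .+/\1: \2/')

# [^ ]+ stops at the first space, so each group holds exactly one field.
fixed=$(printf '%s\n' "$line" |
    sed -E 's/.+source=([^ ]+) .+memory_total=([^ ]+) .+/\1: \2/')

printf '%s\n' "$greedy" "$fixed"
```

The first result drags the dyno and memory_rss fields along with it, while the second contains only the worker name and the memory_total value.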
I'd go for the old-fashioned, reliable, non-extended sed expressions and make sure that the patterns are not too greedy:
sed -e 's/.*source=\([^ ]*\) .*memory_total=\([^ ]*\) .*/\1: \2/'
The -e is not the opposite of -E, which is primarily a Mac OS X (BSD) sed option; the normal option for GNU sed is -r instead. The -e simply means that the next argument is an expression in the script.
This produces your desired output from the given line of data:
worker.2: 54.01MB
Bonus question: There are some odd lines within the stream, I can usually filter them out using a grep pipe like | grep memory_total. However if I try to use it along with the sed command, it does not work. No output is produced with this:
heroku logs -t -s heroku | grep memory_total | sed.......
Sometimes grep | sed is necessary, but it is often redundant (unless you are using a grep feature that isn't readily supported by sed, such as Perl regular expressions).
You should be able to use:
sed -n -e '/memory_total=/ s/.*source=\([^ ]*\) .*memory_total=\([^ ]*\) .*/\1: \2/p'
The -n means "don't print by default". The /memory_total=/ matches the lines you're after; the s/// content is the same as before. I removed the g suffix that was there previously; the regex would never match multiple times anyway. I added the p to print the line when the substitution occurs.
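A quick way to check the combined filter-and-substitute command is to feed it one odd line and one metrics line. Both input lines here are shortened, hypothetical stand-ins for real log output:

```shell
# The first line has no memory_total= field, so the address skips it and
# -n keeps it out of the output; only the second line is rewritten and printed.
result=$(printf '%s\n' \
    'heroku[router]: at=info method=GET path="/"' \
    'heroku[worker.2]: source=worker.2 dyno=d1 sample#memory_total=54.01MB sample#memory_rss=54.01MB' |
    sed -n -e '/memory_total=/ s/.*source=\([^ ]*\) .*memory_total=\([^ ]*\) .*/\1: \2/p')
printf '%s\n' "$result"
```

This does the job of the separate grep in a single sed process.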