What is the sed b command in Linux?

I am just wondering what is the b command for sed in Linux. For example:
sed "/some pattern/b" a.txt

The b command stands for branching: unconditionally branch to label. The label may be omitted (as in your example), in which case the next cycle is started.
Note that in most cases, use of this command means that you are probably better off programming in some real programming language like awk, Perl or Python.

In POSIX sed and GNU sed — and other variants of sed — the b command jumps to a label in the script, or in the absence of a label name, to the end of the script. It is an unconditional jump. There's a conditional jump too; that's t. And you make a label with :label.
In the context of the script in your question, the b is useless. If the pattern matches, it goes to the end of the script, just as it does if there is no match.
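For contrast, here is a minimal sketch of a case where a label-less b does change the result: another command follows it, so branching to the end of the script skips that command for the matching lines. The two input lines are made up for illustration.

```shell
# Lines starting with '#' hit the label-less b and jump to the end of the
# script, so the s command that follows never runs on them.
out=$(printf '%s\n' '# foo stays as it is' 'foo gets changed' |
      sed -e '/^#/b' -e 's/foo/bar/')
printf '%s\n' "$out"
```

Only the second line is edited; the comment line is printed unchanged by the automatic print at the end of the cycle.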
For an example of when the t command is useful, consider the data file (data):
Be very, very, very, very, very, very scared of sed scripts with branches.
Now consider the outputs of these sed scripts:
$ sed 's/very, very/very/' data
Be very, very, very, very, very scared of sed scripts with branches.
$ sed 's/very, very/very/g' data
Be very, very, very scared of sed scripts with branches.
$ sed -e ':label' -e 's/very, very/very/g' -e 't label' data
Be very scared of sed scripts with branches.
$
Make sure you understand what the s///g modifier does and why the result is as it is.
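As a rough sketch of how the unconditional b fits into the same picture, the loop can also be written with b guarded by an address instead of t. The label name again is arbitrary, and this is just one possible spelling of the loop, not a recommendation over the t version.

```shell
line='Be very, very, very, very, very, very scared of sed scripts with branches.'
# While the line still contains "very, very": collapse one pair, then jump
# back to the label unconditionally. The loop ends when the address no
# longer matches, so the block (and its b) is skipped.
out=$(printf '%s\n' "$line" |
      sed -e ':again' \
          -e '/very, very/{' \
          -e 's/very, very/very/' \
          -e 'b again' \
          -e '}')
printf '%s\n' "$out"
```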
I went poking around my $HOME/bin directory and found 97 scripts in it that use sed, including some very convoluted chunks of sed. I only found one script (rmemptysubdirs) that uses sed labels and branching:
# Remove empty directories from paths
find "$@" -type d -print |
sed -ne 'p
:b
s%/[^/]*$%%
p
t b' |
sort -ur |
xargs rmdir 2>/dev/null
The sed script here converts a line containing:
/absolute/path/name/for/directory
into:
/absolute/path/name/for/directory
/absolute/path/name/for
/absolute/path/name
/absolute/path
/absolute
Note that the script uses a label b, not the b command.
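To watch the expansion on its own, here is a small sketch that feeds one path through the same sed script and the same sort. The relative path top/mid/leaf is made up for illustration. Once no slash is left, the last s fails and p prints the remaining name a second time; sort -u removes that duplicate.

```shell
# Expand one path into itself plus every parent, deepest first.
out=$(printf '%s\n' 'top/mid/leaf' |
      sed -ne 'p
:b
s%/[^/]*$%%
p
t b' |
      LC_ALL=C sort -ur)
printf '%s\n' "$out"
```

The reverse sort puts the deepest directory first, which is the order rmdir needs.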

Related

How to run GnuWin32 sed and a script file?

I'm running GnuWin32 under Windows 10
I'm trying to run the following sed one-liner using the Gnu Bash shell:
sed -f <(sed -E 's_(.+)\t(.+)_s/\1/\2/g_' C:/dictionary.txt) C:/content.txt
The file substitute sed statement converts dictionary entries into sed expressions. The main sed uses them for the content replacements.
It is described in How to awk to read a dictionary and replace words in a file?
dictionary.txt looks like this:
aluminium<tab>aluminum
analyse<tab>analyze
white spirit<tab>mineral spirits
stag night<tab>bachelor party
savoury<tab>savory
potato crisp<tab>potato chip
mashed potato<tab>mashed potatoes
content.txt looks like this:
The container of white spirit was made of aluminium.
We will use an aromatic method to analyse properties of white spirit.
No one drank white spirit at stag night.
Many people think that a potato crisp is savoury, but some would rather eat mashed potato.
...
more sentences
When running GnuWin32/sed in GnuBash-shell under windows 10, I receive the following error message:
syntax error near unexpected token <(s
How to re-formulate the script to run under GnuWin32/sed under windows 10?
with thanks to https://stackoverflow.com/users/2836621/mark-setchell and https://stackoverflow.com/users/5403468/tiw the solution works when using cygwin64
One way is to write the inner sed output to a temporary file first, use it, and then delete it:
sed -r "s_(.+)\t(.+)_s/\1/\2/g_" C:/dictionary.txt>tmp_script.sed
sed -f tmp_script.sed C:/content.txt
del tmp_script.sed
Another way, based on Mark Setchell's comment plus a little tweak, which works in both bash and batch with Cygwin installed:
sed -r "s_(.+)\t(.+)_s/\1/\2/g_" C:/dictionary.txt | sed -f /dev/stdin C:/content.txt
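The temporary-file variant can be tried end to end with a small sketch like the one below. It assumes GNU sed (for \t in the regex) and uses a throwaway directory from mktemp instead of the C:/ paths; only two dictionary entries and one sentence from the question are used.

```shell
tmpdir=$(mktemp -d)
printf 'aluminium\taluminum\nwhite spirit\tmineral spirits\n' > "$tmpdir/dictionary.txt"
printf 'The container of white spirit was made of aluminium.\n' > "$tmpdir/content.txt"

# Step 1: turn each "old<TAB>new" line into an s/old/new/g command.
sed -E 's_(.+)\t(.+)_s/\1/\2/g_' "$tmpdir/dictionary.txt" > "$tmpdir/script.sed"

# Step 2: apply the generated script to the content.
out=$(sed -f "$tmpdir/script.sed" "$tmpdir/content.txt")
printf '%s\n' "$out"

rm -r "$tmpdir"
```

Because the dictionary text is pasted straight into s commands, entries containing / or regex metacharacters would need escaping first.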

Parsing a line with sed using regular expression

Using sed I want to parse Heroku's log-runtime-metrics like this one:
2016-01-29T00:38:43.662697+00:00 heroku[worker.2]: source=worker.2 dyno=heroku.17664470.d3f28df1-e15f-3452-1234-5fd0e244d46f sample#memory_total=54.01MB sample#memory_rss=54.01MB sample#memory_cache=0.00MB sample#memory_swap=0.00MB sample#memory_pgpgin=17492pages sample#memory_pgpgout=3666pages
the desired output is:
worker.2: 54.01MB (54.01MB is being memory_total)
I could not manage although I tried several alternatives including:
sed -E 's/.+source=(.+) .+memory_total=(.+) .+/\1: \2/g'
What is wrong with my command? How can it be corrected?
The .+ after source= and memory_total= are both greedy, so they accept as much of the line as possible. Use [^ ] to mean "anything except a space" so that it knows where to stop.
sed -E 's/.+source=([^ ]+) .+memory_total=([^ ]+) .+/\1: \2/g'
Putting your content into https://regex101.com/ makes it really obvious what's going on.
I'd go for the old-fashioned, reliable, non-extended sed expressions and make sure that the patterns are not too greedy:
sed -e 's/.*source=\([^ ]*\) .*memory_total=\([^ ]*\) .*/\1: \2/'
The -e is not the opposite of -E, which is primarily a Mac OS X (BSD) sed option; the normal option for GNU sed is -r instead. The -e simply means that the next argument is an expression in the script.
This produces your desired output from the given line of data:
worker.2: 54.01MB
Bonus question: There are some odd lines within the stream, I can usually filter them out using a grep pipe like | grep memory_total. However if I try to use it along with the sed command, it does not work. No output is produced with this:
heroku logs -t -s heroku | grep memory_total | sed.......
Sometimes grep | sed is necessary, but it is often redundant (unless you are using a grep feature that isn't readily supported by sed, such as Perl regular expressions).
You should be able to use:
sed -n -e '/memory_total=/ s/.*source=\([^ ]*\) .*memory_total=\([^ ]*\) .*/\1: \2/p'
The -n means "don't print by default". The /memory_total=/ matches the lines you're after; the s/// content is the same as before. I removed the g suffix that was there previously; the regex would never match multiple times anyway. I added the p to print the line when the substitution occurs.
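A quick way to check this is to run it on a couple of lines. In the sketch below the first line is a shortened copy of the sample from the question, and the second line is an invented non-metrics line standing in for the "odd lines" in the stream.

```shell
log='2016-01-29T00:38:43.662697+00:00 heroku[worker.2]: source=worker.2 dyno=heroku.17664470 sample#memory_total=54.01MB sample#memory_rss=54.01MB
2016-01-29T00:38:44.000000+00:00 heroku[worker.2]: State changed from starting to up'

# -n suppresses the default print; only lines where the substitution
# succeeds are printed by the p flag, so no separate grep is needed.
out=$(printf '%s\n' "$log" |
      sed -n -e '/memory_total=/ s/.*source=\([^ ]*\) .*memory_total=\([^ ]*\) .*/\1: \2/p')
printf '%s\n' "$out"
```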

How do I do unbuffered substitution in a perl oneliner?

I've got a bash script that wraps mvn (Apache Maven) to add colour to its output. A cut-down version of what it does is:
mvn "$@" | sed -e "s/^\[INFO\] \-.*/$bldblu&$rst/g"
where $bldblu is the ANSI color escape characters for bold blue, and $rst resets the colours.
The issue I'm having is that sometimes mvn writes a line that doesn't end in a newline, thus (as far as I can tell) sed keeps waiting for input and never prints the prompt (which makes it seem like Maven is hanging). I've tried adding -u to sed but that just forces sed to do line-by-line buffering instead of buffering more than one line - not helpful for me.
So far this is what I've come up with:
mvn "$@" | perl -pe "$| = 1; s/^(\[INFO\] \-.*)/$bldblu\$1$rst/g"
but I think the use of -p is not correct here. Any help?
A substitution may be overkill, especially when the replacement pattern has special characters in it. How about this?
export bldblu
export rst
mvn "$@" | perl -pe 'if(/^.INFO. -/){ $_=$ENV{bldblu}.$_.$ENV{rst} }'
or rather than reinventing the wheel
mvn "$@" | perl -MTerm::ANSIColor -pe '$_=color("bold blue").$_.color("reset") if /^.INFO. -/'
(workaround) Use sed --unbuffered
I couldn't figure out the solution but thankfully this is good enough for my particular usage:
cat - | sed --unbuffered 's/.*?from//g'
But I too would like to know the answer. Perl one line substitution is a key idiom in my toolbelt.
BSD
Looks like there is no common flag for GNU and BSD. For the latter, you'd need:
-l Make output line buffered.

Using variables in sed -f (where sed script is in a file rather than inline)

We have a process which can use a file containing sed commands to alter piped input.
I need to replace a placeholder in the input with a variable value, e.g. in a single -e type of command I can run;
$ echo "Today is XX" | sed -e "s/XX/$(date +%F)/"
Today is 2012-10-11
However I can only specify the sed aspects in a file (and then point the process at the file), E.g. a file called replacements.sed might contain;
s/XX/Thursday/
So obviously;
$ echo "Today is XX" | sed -f replacements.sed
Today is Thursday
If I want to use an environment variable or shell value, though, I can't find a way to make it expand, e.g. if replacements.sed contains;
s/XX/$(date +%F)/
Then;
$ echo "Today is XX" | sed -f replacements.sed
Today is $(date +%F)
Including double quotes in the text of the file just prints the double quotes.
Does anyone know a way to be able to use variables in a sed file?
This might work for you (GNU sed):
cat <<\! > replacements.sed
/XX/{s//'"$(date +%F)"'/;s/.*/echo '&'/e}
!
echo "Today is XX" | sed -f replacements.sed
If you don't have GNU sed, try:
cat <<\! > replacements.sed
/XX/{
s//'"$(date +%F)"'/
s/.*/echo '&'/
}
!
echo "Today is XX" | sed -f replacements.sed | sh
AFAIK, it's not possible. Your best bet will be :
INPUT FILE
aaa
bbb
ccc
SH SCRIPT
#!/bin/bash
STRING="${1//\//\\/}" # bash parameter expansion: escape every / so it cannot end the s command early
shift
sed "
s/aaa/$STRING/
" "$@"
COMMAND LINE
./sed.sh "fo/obar" <file path>
OUTPUT
fo/obar
bbb
ccc
As others have said, you can't use variables in a sed script, but you might be able to "fake" it using extra leading input that gets added to your hold buffer. For example:
[ghoti@pc ~/tmp]$ cat scr.sed
1{;h;d;};/^--$/g
[ghoti@pc ~/tmp]$ sed -f scr.sed <(date '+%Y-%m-%d'; printf 'foo\n--\nbar\n')
foo
2012-10-10
bar
[ghoti@pc ~/tmp]$
In this example, I'm using process substitution to get input into sed. The "important" data is generated by printf. You could cat a file instead, or run some other program. The "variable" is produced by the date command, and becomes the first line of input to the script.
The sed script takes the first line, puts it in sed's hold buffer, then deletes the line. Then for any subsequent line, if it matches a double dash (our "macro replacement"), it substitutes the contents of the hold buffer. And prints, because that's sed's default action.
Hold buffers (g, G, h, H and x commands) represent "advanced" sed programming. But once you understand how they work, they open up new dimensions of sed fu.
Note: This solution only helps you replace entire lines. Replacing substrings within lines may be possible using the hold buffer, but I can't imagine a way to do it.
(Another note: I'm doing this in FreeBSD, which uses a different sed from what you'll find in Linux. This may work in GNU sed, or it may not; I haven't tested.)
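For shells without process substitution, the same idea can be sketched with a plain pipe. This version passes the script with -e rather than -f and uses a fixed date string instead of calling date, purely so the result is reproducible.

```shell
# The first input line carries the "variable"; the rest is the real data.
# 1{h;d;} stores line 1 in the hold space and drops it from the output;
# /^--$/g then swaps any "--" line for the held value.
out=$({ printf '%s\n' '2012-10-10'; printf 'foo\n--\nbar\n'; } |
      sed -e '1{h;d;}' -e '/^--$/g')
printf '%s\n' "$out"
```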
I am in agreement with sputnick. I don't believe that sed would be able to complete that task.
However, you could generate that file on the fly.
You could change the date to a fixed string, like __DAYOFWEEK__. Then create a temp file by using sed to replace __DAYOFWEEK__ with $(date +%A), and parse your file with sed -f $TEMPFILE.
sed is great, but it might be time to use something like perl that can generate the date on the fly.
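The generate-on-the-fly idea above could look roughly like this. The file names template.sed and replacements.sed and the mktemp directory are assumptions for the sketch, and date +%A (weekday name) is used to match the placeholder's name.

```shell
tmpdir=$(mktemp -d)
# Template script with a fixed placeholder instead of a shell expression.
printf '%s\n' 's/XX/__DAYOFWEEK__/' > "$tmpdir/template.sed"

# %A is the weekday name; LC_ALL=C keeps it to plain ASCII letters.
dayname=$(LC_ALL=C date +%A)

# Expand the placeholder into a throwaway script, then hand that to sed -f.
sed "s/__DAYOFWEEK__/$dayname/" "$tmpdir/template.sed" > "$tmpdir/replacements.sed"
out=$(echo "Today is XX" | sed -f "$tmpdir/replacements.sed")
printf '%s\n' "$out"

rm -r "$tmpdir"
```

A value containing / or & would have to be escaped before being pasted into the s command.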
To add a newline in the replacement expression using a sed file, what finally worked for me was escaping a literal newline. For example, this appends a newline after the string NewLineHere:
#! /usr/bin/sed -f
s/NewLineHere/NewLineHere\
/g
Not sure it matters but I am on Solaris unix, so not GNU sed for sure.
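The same backslash-newline replacement can be checked from the command line; a small sketch with a made-up input string:

```shell
# Inside single quotes the backslash and the line break reach sed as-is,
# so the replacement is "NewLineHere" followed by a real newline.
out=$(printf '%s\n' 'beforeNewLineHereafter' |
      sed 's/NewLineHere/NewLineHere\
/g')
printf '%s\n' "$out"
```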

Have sed make substitute on string but SKIP first occurrence

I have been through the sed one-liners but am still having trouble with my goal. I want to substitute matching strings on all but the first occurrence in a line. My exact usage would be:
$ echo 'cd /Users/joeuser/bump bonding/initial trials' | sed <<MAGIC HAPPENS>
cd /Users/joeuser/bump\ bonding/initial\ trials
The line replaced the space in bump bonding with a backslash and a space, giving bump\ bonding, so that I can execute this line (with the spaces unescaped I wouldn't be able to cd to it).
Update: I solved this by just using single quotes and outputting
cd 'blah blah/thing/another space/'
and then using source to execute the command. But it didn't answer my question. I'm still curious though... how would you use sed to fix it?
s/ /\\ /2g
The 2 specifies that the second one should apply, and the g specifies that all the rest should apply too. (This probably only works on GNU sed. According to the Open Group Base Specification, "If both g and n are specified, the results are unspecified.")
You can avoid the problem of combining g and n: replace all of them, then undo the first one:
sed -e 's/ /\\ /g' -e 's/\\ / /1'
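Run against the line from the question, the two-step version behaves like this (a minimal check, nothing beyond the command above):

```shell
# Step 1 escapes every space; step 2 turns the first "\ " back into a
# plain space, so the space after "cd" stays unescaped.
out=$(echo 'cd /Users/joeuser/bump bonding/initial trials' |
      sed -e 's/ /\\ /g' -e 's/\\ / /1')
printf '%s\n' "$out"
```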
Here's another method which uses the t branch-if-substituted command:
sed ':a;s/\([^ ]* .*[^\\]\) \(.*\)/\1\\ \2/;ta'
which has the advantage of leaving existing backslash-space sequences in the input intact.
Use awk:
$ echo cd 'blah blah/thing/another space/' | awk '{for(i=2;i<NF;i++) $i=$i"\\"}1'
cd blah\ blah/thing/another\ space/
$ echo 'cd /Users/joeuser/bump bonding/initial trials' | awk '{for(i=2;i<NF;i++) $i=$i"\\"}1'
cd /Users/joeuser/bump\ bonding/initial\ trials