What is the significance of -n parameter passed to sed command? - sed

Can someone please tell me how sed -n '1!p' works? Below is the full command I am using to sort my pods in k8s by the nodes they are assigned to.
kubectl get pods -o wide --all-namespaces|sort -k8 -r| sed -n '1!p'
The above command works perfectly and the sed part removes the first header line from the final output.
I want to understand how the sed part works and what the parameters passed to sed mean.

From info sed:
-n:
By default, 'sed' prints out the pattern space at the end of each
cycle through the script (*note How 'sed' works: Execution Cycle.).
These options disable this automatic printing, and 'sed' only
produces output when explicitly told to via the 'p' command.
1p: print first line
1!p: print every line except the first (the ! negates the address).
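A minimal sketch of the three behaviours described above, run against a made-up three-line input (the strings "header", "row1" and "row2" are placeholders, not real kubectl output). The variable names are only for illustration.

```shell
# Made-up sample input standing in for a listing with a header line.
input=$(printf 'header\nrow1\nrow2\n')

# Without -n, each line is auto-printed and p prints it a second time.
doubled=$(printf '%s\n' "$input" | sed 'p')

# With -n, only the line selected by the address 1 is printed.
first=$(printf '%s\n' "$input" | sed -n '1p')

# 1!p negates the address: print every line except line 1.
rest=$(printf '%s\n' "$input" | sed -n '1!p')

# Deleting line 1 and letting auto-print do the rest gives the same result.
rest_alt=$(printf '%s\n' "$input" | sed '1d')

printf '%s\n' "$rest"
```

The last two commands produce the same lines, so sed '1d' can be used as a shorter way to drop a header.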

By default, sed outputs every line it processes.
The -n option suppresses this automatic output, so only the lines explicitly printed with the p command are displayed.
In your example, sed -n '1!p' means "display every line but the first".
A more understandable example is when you want to search/replace with sed. If you want to see the whole resulting file you'll use this:
sed 's/from/to/g' file.txt
But if you only want to see which lines have been changed, use this:
sed -n 's/from/to/gp' file.txt
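As a small illustration of the difference between the two commands above, here is a sketch that feeds both of them the same made-up input through a pipe instead of file.txt:

```shell
# Made-up sample: two lines contain "from", one does not.
sample=$(printf 'from here\nunchanged\nfrom from\n')

# No -n: every line is printed, changed or not.
all=$(printf '%s\n' "$sample" | sed 's/from/to/g')

# -n plus the p flag: only lines where a substitution happened are printed.
changed=$(printf '%s\n' "$sample" | sed -n 's/from/to/gp')

printf 'all:\n%s\n\nchanged only:\n%s\n' "$all" "$changed"
```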

Related

Combining sed commands in bash

I am aiming to try and combine the two following sed commands to print out one output. The first command is used to strip the HTML file of its HTML tags and the second is to specify I only want lines 11 through to 16 of the file.
sed -e 's/<[^>]*.//g' file.html
sed -n '11,16p' file.html
I have been playing around with this for a while now and can only ever seem to get the output of lines 11-16 with the HTML tags, or all lines without the HTML, when I am aiming to display the output of lines 11-16 without any HTML tags. Any help would be greatly appreciated, thanks!
The naive way would be to use a pipe:
sed 's/<[^>]*.//g' file.html | sed -n '11,16p'
You may also combine the address and the pattern:
sed -n '11,16 s/<[^>]*.//pg' file.html
Here,
-n will suppress the default line output
11,16 - will set the address, Lines 11 through 16
s/<[^>]*.// - will look for <, then zero or more chars other than > and then any one char (did you mean a >?)
p - print the result of the substitution
g - all occurrences on the line
An online demo (shortened version, Lines 2-4):
#!/bin/bash
s="<111111>aaa<111111>
<22222>bbb<111111>
<33333>ccc<111111>
<44444>ddd<111111>
<55555>eee<111111>"
sed -n '2,4 s/<[^>]*.//pg' <<< "$s"
Output:
bbb
ccc
ddd
If GNU-compatible,
sed -n '11,16{ s/<[^>]*.//g; p; }; 17q;' file.html
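The range can take a block, so both commands run in sequence on each line in the range. Unlike the s///p form, the p here prints every line in the range, whether or not a substitution was made.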
The 17q; just keeps it from wasting time on lines you already know you don't need.
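The two combined forms behave differently on lines that contain no tags. The sketch below shows this on a made-up five-line input with the range shortened to 2,4. It assumes the pattern was meant to end in > (as the first answer suspects), so it uses <[^>]*> rather than the original <[^>]*. pattern.

```shell
# Made-up HTML-like sample; line 3 has no tags at all.
html=$(printf '<h1>Title</h1>\n<p>one</p>\nplain text\n<p>two</p>\n<p>three</p>\n')

# Address + s///gp: a line is printed only if a substitution was made on it.
with_p_flag=$(printf '%s\n' "$html" | sed -n '2,4 s/<[^>]*>//gp')

# Address + block: every line in the range is printed after the substitution.
with_block=$(printf '%s\n' "$html" | sed -n '2,4{
s/<[^>]*>//g
p
}')

printf 'p flag:\n%s\n\nblock:\n%s\n' "$with_p_flag" "$with_block"
```

The block is written with newlines between the commands, which should also be accepted by sed implementations that are strict about semicolons and closing braces.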

How to use SED to find multiple paths in the same line and replace them with a different path?

I have a file with multiple paths in the same line:
cat modules.dep
kernel/mm/zsmalloc.ko:
kernel/crypto/lzo.ko:
kernel/drivers/char/tpm/tpm_vtpm_proxy.ko: kernel/drivers/char/tpm/tpm.ko
kernel/drivers/block/virtio_blk.ko:
kernel/drivers/block/zram/zram.ko: kernel/mm/zsmalloc.ko
kernel/drivers/nvdimm/virtio_pmem.ko: kernel/drivers/nvdimm/nd_virtio.ko
kernel/drivers/nvdimm/nd_virtio.ko:
kernel/drivers/net/virtio_net.ko: kernel/drivers/net/net_failover.ko kernel/net/core/failover.ko
kernel/drivers/net/net_failover.ko: kernel/net/core/failover.ko
extra/virtio_gpu/virtio-gpu.ko: kernel/drivers/virtio/virtio_dma_buf.ko
extra/wlan_simulation/virt_wifi_sim.ko: kernel/drivers/net/wireless/virt_wifi.ko
I would like to change it to:
/lib/modules/zsmalloc.ko:
/lib/modules/lzo.ko:
/lib/modules/tpm_vtpm_proxy.ko: /lib/modules/tpm.ko
/lib/modules/virtio_blk.ko:
/lib/modules/zram.ko: /lib/modules/zsmalloc.ko
/lib/modules/virtio_pmem.ko: /lib/modules/nd_virtio.ko
/lib/modules/nd_virtio.ko:
/lib/modules/virtio_net.ko: /lib/modules/net_failover.ko /lib/modules/failover.ko
/lib/modules/net_failover.ko: /lib/modules/failover.ko
/lib/modules/virtio-gpu.ko: /lib/modules/virtio_dma_buf.ko
/lib/modules/virt_wifi_sim.ko: /lib/modules/virt_wifi.ko
But my attempt:
sed -i 's/\(.*\)\//\/lib\/modules\//g' modules.load
works only if there is just one path per line.
How can I achieve this, via sed, with multiple paths per line?
I am using sed from BusyBox in D(ASH) Standalone.
BusyBox v1.32.1-Magisk (2021-01-21 00:17:27 PST) multi-call binary.
Usage: sed [-i[SFX]] [-nrE] [-f FILE]... [-e CMD]... [FILE]...
or: sed [-i[SFX]] [-nrE] CMD [FILE]...
-e CMD Add CMD to sed commands to be executed
-f FILE Add FILE contents to sed commands to be executed
-i[SFX] Edit files in-place (otherwise sends to stdout)
Optionally back files up, appending SFX
-n Suppress automatic printing of pattern space
-r,-E Use extended regex syntax
If no -e or -f, the first non-option argument is the sed command string.
Remaining arguments are input files (stdin if none).
This sed should work:
sed -E 's~[^[:blank:]]+/~/lib/modules/~g' modules.dep
/lib/modules/zsmalloc.ko:
/lib/modules/lzo.ko:
/lib/modules/tpm_vtpm_proxy.ko: /lib/modules/tpm.ko
/lib/modules/virtio_blk.ko:
/lib/modules/zram.ko: /lib/modules/zsmalloc.ko
/lib/modules/virtio_pmem.ko: /lib/modules/nd_virtio.ko
/lib/modules/nd_virtio.ko:
/lib/modules/virtio_net.ko: /lib/modules/net_failover.ko /lib/modules/failover.ko
/lib/modules/net_failover.ko: /lib/modules/failover.ko
/lib/modules/virtio-gpu.ko: /lib/modules/virtio_dma_buf.ko
/lib/modules/virt_wifi_sim.ko: /lib/modules/virt_wifi.ko
[^[:blank:]]+/ matches one or more non-whitespace characters followed by a /. Because the match is greedy, it extends to the last / of each whitespace-separated path, so the directory part of every path on the line is replaced.
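To see why the original attempt only handles one path per line, here is a sketch that runs both expressions on two lines taken from the modules.dep listing above, fed through a pipe rather than edited in place:

```shell
# Two lines from the question's listing: one path, and three paths.
deps=$(printf '%s\n' \
  'kernel/mm/zsmalloc.ko:' \
  'kernel/drivers/net/virtio_net.ko: kernel/drivers/net/net_failover.ko kernel/net/core/failover.ko')

# The question's attempt: .* is greedy across the whole line, so everything
# up to the last / on the line is replaced in a single match.
greedy=$(printf '%s\n' "$deps" | sed 's/\(.*\)\//\/lib\/modules\//g')

# The answer's version: [^[:blank:]]+ cannot cross whitespace, so each
# path on the line is matched and replaced separately.
fixed=$(printf '%s\n' "$deps" | sed -E 's~[^[:blank:]]+/~/lib/modules/~g')

printf 'attempt:\n%s\n\nanswer:\n%s\n' "$greedy" "$fixed"
```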

Extracting the contents between two different strings using bash or perl

I have tried to scan through the other posts on Stack Overflow for this, but couldn't get my code to work, hence I am posting a new question.
Below is the content of file temp.
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/<env:Body><dp:response xmlns:dp="http://www.datapower.com/schemas/management"><dp:timestamp>2015-01-
22T13:38:04Z</dp:timestamp><dp:file name="temporary://test.txt">XJzLXJlc3VsdHMtYWN0aW9uX18i</dp:file><dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:file></dp:response></env:Body></env:Envelope>
This file contains the base64-encoded contents of two files named test.txt and test1.txt. I want to extract the base64-encoded content of each file to separate files, test.txt and test1.txt respectively.
To achieve this, I have to remove the xml tags around the base64 contents. I am trying below commands to achieve this. However, it is not working as expected.
sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test.txt">##g'|perl -p -e 's#</dp:file>##g' > test.txt
sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test1.txt">##g'|perl -p -e 's#</dp:file></dp:response></env:Body></env:Envelope>##g' > test1.txt
Below command:
sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test.txt">##g'|perl -p -e 's#</dp:file>##g'
produces output:
XJzLXJlc3VsdHMtYWN0aW9uX18i
<dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:response> </env:Body></env:Envelope>
However, in the output I am expecting only the first line, XJzLXJlc3VsdHMtYWN0aW9uX18i. Where am I making a mistake?
When I run the command below, I get the expected output:
sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test1.txt">##g'|perl -p -e 's#</dp:file></dp:response></env:Body></env:Envelope>##g'
It produces below string
lc3VsdHMtYWN0aW9uX18i
I can then easily route this to test1.txt file.
UPDATE
I have edited the question by updating the source file content. The source file doesn't contain any newline characters. The current solution will not work in that case; I have tried it and it failed. wc -l temp must output 1.
OS: solaris 10
Shell: bash
sed -n 's_<dp:file name="\([^"]*\)">\([^<]*\).*_\1 -> \2_p' temp
I added \1 -> to show the link from file name to content; for the content only, just remove that part.
This is a POSIX version, so with GNU sed use --posix.
It assumes that the base64-encoded content is on the same line as the surrounding tag (if it is spread over several lines, the command needs some modification).
Thanks to JID for the full explanation below.
How it works
sed -n
The -n means no automatic printing, so there will be no output from sed unless it is explicitly told to print.
's_
This is to substitute the following regex using _ to separate regex from the replacement.
<dp:file name=
Regular text
"\([^"]*\)"
The escaped parentheses form a capture group; they must be escaped unless the -r option is used (-r is not available in POSIX sed). Everything they enclose is captured. [^"]* means 0 or more occurrences of any character that is not a double quote, so this simply captures everything between the two quotes.
>\([^<]*\)
Again a capture group, this time capturing everything after the > up to the next <.
.*
Everything else on the line
_\1 -> \2
This is the replacement: everything the regex matched is replaced with the first capture group, then ->, then the second capture group.
_p
The closing delimiter followed by the p flag, which prints the line if a substitution was made.
Resources
http://unixhelp.ed.ac.uk/CGI/man-cgi?sed
http://www.grymoire.com/Unix/Sed.html
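A small sketch of the command explained above, under the answer's own assumption that each dp:file element sits on its own line. The two sample lines are cut out of the question's XML. This sketch does not cover the single-line file described in the question's update.

```shell
# Two dp:file elements from the question, one per line (an assumption).
xml=$(printf '%s\n' \
  '<dp:file name="temporary://test.txt">XJzLXJlc3VsdHMtYWN0aW9uX18i</dp:file>' \
  '<dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:file>')

# \1 is the name attribute, \2 is the text up to the next <.
pairs=$(printf '%s\n' "$xml" |
  sed -n 's_<dp:file name="\([^"]*\)">\([^<]*\).*_\1 -> \2_p')

# Content only: drop the first capture group and print just the text.
content=$(printf '%s\n' "$xml" |
  sed -n 's_<dp:file name="[^"]*">\([^<]*\).*_\1_p')

printf '%s\n\n%s\n' "$pairs" "$content"
```

Note that s only replaces the part of the line that the regex matched, so any text before the first <dp:file on the same line would be left in the output.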
/usr/xpg4/bin/sed works well here.
/usr/bin/sed does not work as expected if the file contains just 1 line.
The command below works for a file containing only a single line.
/usr/xpg4/bin/sed -n 's_<env:Envelope\(.*\)<dp:file name="temporary://BackUpDir/backupmanifest.xml">\([^>]*\)</dp:file>\(.*\)_\2_p' securebackup.xml 2>/dev/null
Without 2>/dev/null, this sed command outputs the warning sed: Missing newline at end of file.
The reason is as follows:
The default Solaris sed ignores a last line that lacks a trailing newline, so as not to break existing scripts, because the original Unix implementation required every line to be terminated by a newline.
GNU sed is more relaxed, and the POSIX implementation (/usr/xpg4/bin/sed) accepts such a line but outputs a warning.

Sed command to fetch particular string from full string

I've got a file which contains a lot of strings like the input below.
I need to extract the output below and process it further.
Input:
History={ExecAt=[2013-05-03 03:00:20,2013-05-03 03:00:23,2013-05-03 03:00:26],MId=["msgId3","msgId4","msgId5"]};
Output should be:
MId=["msgId3","msgId4","msgId5"]
Using the command sed 's/^.*,MId=/MId=/' I got the output MId=["msgId3","msgId4","msgId5"]};
but I still want the exact output (the last 2 special characters }; need to be removed).
This works for me:
sed 's/.*\(MId=.*\)\}.*/\1/'
If your grep supports the -o option, you can use it rather than sed:
grep -o 'MId=\[[^]]\+\]'
Using the same regex in sed works fine, just remove anything before and after:
sed -e 's/.*\(MId=\[[^]]\+\]\).*/\1/'
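The sketch below runs the grep and sed approaches on the question's input line. It swaps in [^]]* for [^]]\+ because \+ is a GNU extension in basic regular expressions; that substitution is a choice made here, not part of the answers.

```shell
# The input line from the question.
line='History={ExecAt=[2013-05-03 03:00:20,2013-05-03 03:00:23,2013-05-03 03:00:26],MId=["msgId3","msgId4","msgId5"]};'

# sed: capture MId=[...] and discard everything before and after it.
via_sed=$(printf '%s\n' "$line" | sed 's/.*\(MId=\[[^]]*\]\).*/\1/')

# grep -o: print only the part of the line that matches.
via_grep=$(printf '%s\n' "$line" | grep -o 'MId=\[[^]]*\]')

printf '%s\n%s\n' "$via_sed" "$via_grep"
```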

Have sed ignore non-matching lines

How can I make sed filter matching lines according to some expression, but ignore non-matching lines, instead of letting them print?
As a real example, I want to run scalac (the Scala compiler) on a set of files, and read from its -verbose output the .class files created. scalac -verbose outputs a bunch of messages, but we're only interested in those of the form [wrote some-class-name.class].
What I'm currently doing is this (|& is bash 4.0's shorthand for 2>&1 |, which pipes stderr along with stdout to the next program):
$ scalac -verbose some-file.scala ... |& sed 's/^\[wrote \(.*\.class\)\]$/\1/'
This will extract the file names from the messages we're interested in, but will also let all other messages pass through unchanged! Of course we could instead do this:
$ scalac -verbose some-file.scala ... |& grep '^\[wrote .*\.class\]$' |
sed 's/^\[wrote \(.*\.class\)\]$/\1/'
which works but looks very much like going around the real problem, which is how to instruct sed to ignore non-matching lines from the input. So how do we do that?
If you don't want to print lines that don't match, you can combine
the -n option, which tells sed not to print lines automatically, and
the p flag, which tells sed to print the line when the substitution succeeds.
This gives:
sed -n 's/.../.../p'
Another way with plain sed:
sed -e 's/.../.../;t;d'
s/// is a substitution; t without a label branches to the end of the script when the substitution succeeded, skipping all following commands (so the line is printed); otherwise d deletes the line.
No need for perl or grep.
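Both approaches can be compared on a few made-up lines in the style of scalac -verbose output (the class names Foo and Bar are placeholders). In this sketch the t and d commands are passed as separate -e fragments, which is equivalent to ;t;d but should also work with sed implementations that read everything after t as a label.

```shell
# Made-up sample of verbose compiler messages.
log=$(printf '%s\n' \
  '[parsing some-file.scala]' \
  '[wrote Foo.class]' \
  '[total in 1234ms]' \
  '[wrote Bar.class]')

# -n with the p flag: print only lines where the substitution succeeded.
with_n=$(printf '%s\n' "$log" | sed -n 's/^\[wrote \(.*\.class\)\]$/\1/p')

# s, then t, then d: t jumps to the end of the script (auto-print) after a
# successful substitution; otherwise d deletes the line.
with_t=$(printf '%s\n' "$log" |
  sed -e 's/^\[wrote \(.*\.class\)\]$/\1/' -e 't' -e 'd')

printf '%s\n' "$with_n"
```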
(edited after Nicholas Riley's suggestion)
Rapsey raised a relevant point about multiple substitutions expressions.
First, quoting a Unix SE answer, you can "prefix most sed commands with an address to limit the lines to which they apply".
Second, you can group commands within curly braces {} (separated with a semicolon ; or a newline).
Third, add the print flag p to the last substitution.
Syntax:
sed -n -e '/^given_regexp/ {s/regexp1/replacement1/flags1;[...];s/regexpn/replacementn/flagsnp}'
Example (see Here document for more details):
Code:
sed -n -e '/^ha/ {s/h/k/g;s/a/e/gp}' <<SAMPLE
haha
hihi
SAMPLE
Result:
keke
sed -n '/.../!p'
There is no need for a substitution.