How to use sed to find multiple paths in the same line and replace them with a different path?

I have a file with multiple paths in the same line:
cat modules.dep
kernel/mm/zsmalloc.ko:
kernel/crypto/lzo.ko:
kernel/drivers/char/tpm/tpm_vtpm_proxy.ko: kernel/drivers/char/tpm/tpm.ko
kernel/drivers/block/virtio_blk.ko:
kernel/drivers/block/zram/zram.ko: kernel/mm/zsmalloc.ko
kernel/drivers/nvdimm/virtio_pmem.ko: kernel/drivers/nvdimm/nd_virtio.ko
kernel/drivers/nvdimm/nd_virtio.ko:
kernel/drivers/net/virtio_net.ko: kernel/drivers/net/net_failover.ko kernel/net/core/failover.ko
kernel/drivers/net/net_failover.ko: kernel/net/core/failover.ko
extra/virtio_gpu/virtio-gpu.ko: kernel/drivers/virtio/virtio_dma_buf.ko
extra/wlan_simulation/virt_wifi_sim.ko: kernel/drivers/net/wireless/virt_wifi.ko
I would like to change it to:
/lib/modules/zsmalloc.ko:
/lib/modules/lzo.ko:
/lib/modules/tpm_vtpm_proxy.ko: /lib/modules/tpm.ko
/lib/modules/virtio_blk.ko:
/lib/modules/zram.ko: /lib/modules/zsmalloc.ko
/lib/modules/virtio_pmem.ko: /lib/modules/nd_virtio.ko
/lib/modules/nd_virtio.ko:
/lib/modules/virtio_net.ko: /lib/modules/net_failover.ko /lib/modules/failover.ko
/lib/modules/net_failover.ko: /lib/modules/failover.ko
/lib/modules/virtio-gpu.ko: /lib/modules/virtio_dma_buf.ko
/lib/modules/virt_wifi_sim.ko: /lib/modules/virt_wifi.ko
But my attempt:
sed -i 's/\(.*\)\//\/lib\/modules\//g' modules.load
works only if there is just one path per line.
How can I achieve this, via sed, with multiple paths per line?
I am using sed from BusyBox in D(ASH) Standalone.
BusyBox v1.32.1-Magisk (2021-01-21 00:17:27 PST) multi-call binary.
Usage: sed [-i[SFX]] [-nrE] [-f FILE]... [-e CMD]... [FILE]...
or: sed [-i[SFX]] [-nrE] CMD [FILE]...
-e CMD Add CMD to sed commands to be executed
-f FILE Add FILE contents to sed commands to be executed
-i[SFX] Edit files in-place (otherwise sends to stdout)
Optionally back files up, appending SFX
-n Suppress automatic printing of pattern space
-r,-E Use extended regex syntax
If no -e or -f, the first non-option argument is the sed command string.
Remaining arguments are input files (stdin if none).

This sed should work:
sed -E 's~[^[:blank:]]+/~/lib/modules/~g' modules.dep
/lib/modules/zsmalloc.ko:
/lib/modules/lzo.ko:
/lib/modules/tpm_vtpm_proxy.ko: /lib/modules/tpm.ko
/lib/modules/virtio_blk.ko:
/lib/modules/zram.ko: /lib/modules/zsmalloc.ko
/lib/modules/virtio_pmem.ko: /lib/modules/nd_virtio.ko
/lib/modules/nd_virtio.ko:
/lib/modules/virtio_net.ko: /lib/modules/net_failover.ko /lib/modules/failover.ko
/lib/modules/net_failover.ko: /lib/modules/failover.ko
/lib/modules/virtio-gpu.ko: /lib/modules/virtio_dma_buf.ko
/lib/modules/virt_wifi_sim.ko: /lib/modules/virt_wifi.ko
[^[:blank:]]+/ matches one or more non-blank characters followed by a /. Because + is greedy, each match extends to the last / of a path but cannot cross the space that separates paths, so every path on the line is rewritten separately.
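To see why the bracket expression matters, here is a small sketch that runs both the original attempt and the fixed command on one multi-path line from modules.dep. The variable names are only for illustration.

```shell
# A sample line with three paths, taken from modules.dep above.
line='kernel/drivers/net/virtio_net.ko: kernel/drivers/net/net_failover.ko kernel/net/core/failover.ko'

# Original attempt: .* also matches blanks, so it swallows everything
# up to the last / on the whole line.
greedy=$(printf '%s\n' "$line" | sed 's/\(.*\)\//\/lib\/modules\//g')

# Fixed version: [^[:blank:]]+ cannot cross a space or tab, so each
# path is matched and rewritten separately.
per_path=$(printf '%s\n' "$line" | sed -E 's~[^[:blank:]]+/~/lib/modules/~g')

printf '%s\n' "$greedy"     # /lib/modules/failover.ko
printf '%s\n' "$per_path"   # /lib/modules/virtio_net.ko: /lib/modules/net_failover.ko /lib/modules/failover.ko
```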

Related

sed remove line if neither pattern provided don't match

I am trying to create a filter command to reduce the lines in a log file. Assume each line contains a path with a date component:
/iamthepath01/20200301/file01.txt
/iamthepath02/20200302/file02.txt
....
/iamthepathxx/20210619/filexx.txt
From thousands of lines, I only want to keep the ones with either of two strings in the path
/202106
/202105
and remove all other lines.
I have tried the following command:
sed -i -e '\(/202105\|/202106\)!d' ~/log.txt
The above command threw:
sed: -e expression #1, char 24: unterminated address regex
You can use
sed -i '/\/20210[56]/!d' ~/log.txt
Or, if you need to use more specific alternatives and further enhance the pattern:
sed -i -E '/\/(202105|202106)/!d' ~/log.txt
Details:
-i - GNU sed option for in-place file editing
-E - option enabling POSIX ERE regex syntax
/\/20210[56]/ - regex that matches /20210 and then either 5 or 6
\/(202105|202106) - the POSIX ERE pattern that matches / and then either 202105 or 202106
!d - removes the lines not matching the pattern.
See the online demo:
#!/bin/bash
s='/iamthepath01/20200301/file01.txt
/iamthepath02/20200302/file02.txt
/iamthepathxx/20210619/filexx.txt'
sed '/\/20210[56]/!d' <<< "$s"
Output:
/iamthepathxx/20210619/filexx.txt
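The original command probably failed because a leading backslash in an address (here \() tells sed that the next character is a custom delimiter, so sed went looking for a closing ( that never came. A minimal sketch of two working spellings follows; the second sample line (file03) is made up so that both months appear.

```shell
sample='/iamthepath01/20200301/file01.txt
/iamthepath03/20210512/file03.txt
/iamthepathxx/20210619/filexx.txt'

# ERE alternation inside a normal /.../ address (the slash must be escaped).
ere=$(printf '%s\n' "$sample" | sed -E '/\/(202105|202106)/!d')

# Same filter with # as a custom delimiter, so the slash needs no escaping.
custom=$(printf '%s\n' "$sample" | sed -E '\#/(202105|202106)#!d')

printf '%s\n' "$ere"
printf '%s\n' "$custom"
```

Both commands should print the 20210512 and 20210619 lines and drop the 20200301 one.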
sed is the wrong tool for this. If you want a script that's as fragile as the sed one then use grep as it's the tool that exists solely to do a simple g/re/p (hence the name) like you're doing:
$ grep '/20210[56]' file
/iamthepathxx/20210619/filexx.txt
or if you want a more robust solution that focuses just on the part of the line you want to match and so will avoid false matches, then use awk:
$ awk -F '/' '$3 ~ /^20210[56]/' file
/iamthepathxx/20210619/filexx.txt
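A short sketch of the kind of false match meant here, assuming a made-up file name that happens to start with 202105 while its date directory is from 2020:

```shell
sample='/iamthepath01/20200301/202105_report.txt
/iamthepathxx/20210619/filexx.txt'

# grep looks at the whole line, so /202105_report.txt also matches.
grep_hits=$(printf '%s\n' "$sample" | grep '/20210[56]')

# awk tests only the third /-separated field, i.e. the date directory.
awk_hits=$(printf '%s\n' "$sample" | awk -F '/' '$3 ~ /^20210[56]/')

printf 'grep:\n%s\n' "$grep_hits"   # both lines
printf 'awk:\n%s\n' "$awk_hits"     # only the 20210619 line
```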
This might work for you (GNU sed):
sed -ni '\#/20210[56]#p' file
This uses sed's grep-like -n option to turn off implicit printing and the -i option to edit the file in place.
Normally sed uses /.../ to delimit a match, but other delimiters may be used if the first is escaped, e.g. \#...#.
So the above solution filters the existing file down to lines that contain either /202105 or /202106.
N.B. grep will almost certainly be faster at finding these lines. However, the -i option may be the ultimate reason for choosing sed, although the same outcome can be achieved by appending > tmpFile && mv tmpFile file to a grep solution.
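The grep-plus-mv variant could look like the sketch below. It works on a throwaway file created with mktemp rather than on ~/log.txt, and the variable names are illustrative.

```shell
# Throwaway stand-in for ~/log.txt
log=$(mktemp)
printf '%s\n' \
    '/iamthepath01/20200301/file01.txt' \
    '/iamthepath02/20200302/file02.txt' \
    '/iamthepathxx/20210619/filexx.txt' > "$log"

# Write the matches to a temporary file, then replace the original.
# Because of &&, the original is left untouched when grep finds nothing.
tmp=$(mktemp)
grep '/20210[56]' "$log" > "$tmp" && mv -f "$tmp" "$log"

result=$(cat "$log")
printf '%s\n' "$result"   # /iamthepathxx/20210619/filexx.txt
```

Note that the replaced file takes on the permissions of the temporary file, which may differ from those of the original.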

sed command from cmake does not change input file

I want to modify src/some_file.txt before building my executable:
cmake_minimum_required(VERSION 3.5.1)
project(MyProject)
add_custom_target(run ALL
COMMAND sed -i "s#MY_PATH=\\(.*\\)#MY_PATH=${CMAKE_BINARY_DIR}/\\1#" ${CMAKE_CURRENT_SOURCE_DIR}/some_file.txt
)
add_executable(e main.cpp)
add_dependencies(e run)
src/some_file.txt has content:
MY_PATH=something
Targets run and e get built, but src/some_file.txt remains unchanged. Why?
Either you're not using GNU sed or the file doesn't match the pattern.
My guess would be that you need to escape the backslashes again, but you don't need them anyway, just use:
sed -i 's#MY_PATH=#MY_PATH=${CMAKE_BINARY_DIR}/#'
Or simply
sed -i 's#MY_PATH=#&${CMAKE_BINARY_DIR}/#'
where & expands to the matched pattern. You should use single-quotes not double-quotes unless you specifically want the shell to expand variables.
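The & form can be tried outside CMake with a plain shell variable standing in for ${CMAKE_BINARY_DIR}. The variable build_dir and its value are assumptions for this sketch; double quotes are used here because the shell has to expand the variable.

```shell
build_dir=/tmp/build   # hypothetical stand-in for ${CMAKE_BINARY_DIR}

# & in the replacement is the text matched by the pattern (MY_PATH=),
# so the directory is inserted right after it.
result=$(printf 'MY_PATH=something\n' | sed "s#MY_PATH=#&${build_dir}/#")

printf '%s\n' "$result"   # MY_PATH=/tmp/build/something
```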

Using sed to keep the beginning of a line

I have a file in which some lines start by a >
For these lines, and only these ones, I want to keep the first eleven characters.
How can I do that using sed?
Or maybe something else is better?
Thanks!
Muriel
Let's start with this test file:
$ cat file
line one with something or other
>1234567890abc
other line in file
To keep only the first 11 characters of lines starting with > while keeping all other lines:
$ sed -r '/^>/ s/(.{11}).*/\1/' file
line one with something or other
>1234567890
other line in file
To keep only the first eleven characters of lines starting with > and deleting all other lines:
$ sed -rn '/^>/ s/(.{11}).*/\1/p' file
>1234567890
The above was tested with GNU sed. For BSD sed, replace the -r option with -E.
Explanation:
/^>/ is a condition. It means that the command which follows only applies to lines that start with >
s/(.{11}).*/\1/ is a substitution command. It replaces the whole line with just the first eleven characters.
-r turns on extended regular expression format, eliminating the need for some escape characters.
-n turns off automatic printing. With -n in effect, lines are only printed if we explicitly ask them to be printed. In the second case above, that is done by adding a p after the substitute command.
Other forms:
$ sed -r 's/(>.{10}).*/\1/' file
line one with something or other
>1234567890
other line in file
And:
$ sed -rn 's/(>.{10}).*/\1/p' file
>1234567890
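If neither -r nor -E is available, the same substitution can be written with basic regular expressions by escaping the group and the interval. A minimal sketch on the test file's content:

```shell
sample='line one with something or other
>1234567890abc
other line in file'

# BRE spelling: \( \) for the group and \{11\} for the repeat count.
trimmed=$(printf '%s\n' "$sample" | sed '/^>/ s/^\(.\{11\}\).*/\1/')

printf '%s\n' "$trimmed"
```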

Extracting the contents between two different strings using bash or perl

I have tried to scan through the other posts in stack overflow for this, but couldn't get my code work, hence I am posting a new question.
Below is the content of file temp.
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/<env:Body><dp:response xmlns:dp="http://www.datapower.com/schemas/management"><dp:timestamp>2015-01-
22T13:38:04Z</dp:timestamp><dp:file name="temporary://test.txt">XJzLXJlc3VsdHMtYWN0aW9uX18i</dp:file><dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:file></dp:response></env:Body></env:Envelope>
This file contains the base64-encoded contents of two files named test.txt and test1.txt. I want to extract the base64-encoded content of each file to separate files, test.txt and test1.txt respectively.
To achieve this, I have to remove the XML tags around the base64 contents. I am trying the commands below, but they are not working as expected.
sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test.txt">##g'|perl -p -e 's#</dp:file>##g' > test.txt
sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test1.txt">##g'|perl -p -e 's#</dp:file></dp:response></env:Body></env:Envelope>##g' > test1.txt
Below command:
sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test.txt">##g'|perl -p -e 's#</dp:file>##g'
produces output:
XJzLXJlc3VsdHMtYWN0aW9uX18i
<dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:response> </env:Body></env:Envelope>
However, in the output I am expecting only the first line, XJzLXJlc3VsdHMtYWN0aW9uX18i. Where am I making a mistake?
When I run the command below, I get the expected output:
sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test1.txt">##g'|perl -p -e 's#</dp:file></dp:response></env:Body></env:Envelope>##g'
It produces below string
lc3VsdHMtYWN0aW9uX18i
I can then easily route this to test1.txt file.
UPDATE
I have edited the question by updating the source file content. The source file doesn't contain any newline character. The current solution will not work in that case, I have tried it and failed. wc -l temp must output to 1.
OS: solaris 10
Shell: bash
sed -n 's_<dp:file name="\([^"]*\)">\([^<]*\).*_\1 -> \2_p' temp
I added \1 -> to show the link from file name to content; for the content only, just remove that part.
This is a POSIX version, so with GNU sed you can use --posix.
It assumes that the base64-encoded content is on the same line as the surrounding tags (if it is spread over several lines, some modification is needed).
Thanks to JID for the full explanation below.
How it works
sed -n
The -n means no printing so unless explicitly told to print, then there will be no output from sed
's_
This is to substitute the following regex using _ to separate regex from the replacement.
<dp:file name=
Regular text
"\([^"]*\)"
The escaped parentheses form a capture group; they must be escaped unless the -r option is used (-r is not part of POSIX sed). Everything inside the parentheses is captured. [^"]* means 0 or more occurrences of any character that is not a double quote. So this simply captures everything between the two quotes.
>\([^<]*\)
Again uses a capture group, this time to capture everything between the > and the next <
.*
Everything else on the line
_\1 -> \2
This is the replacement: everything the regex matched is replaced with the first capture group, then ->, then the second capture group.
_p
Means print the line
Resources
http://unixhelp.ed.ac.uk/CGI/man-cgi?sed
http://www.grymoire.com/Unix/Sed.html
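When all the dp:file elements sit on one line, a variant that names the wanted file and consumes the text on both sides of it may be easier to work with. The sketch below uses a shortened version of the XML from the question; extract_b64 is a hypothetical helper, and it assumes the file name contains no _ (the delimiter) and no characters that are special in a regex beyond the dot.

```shell
xml='<dp:response><dp:file name="temporary://test.txt">XJzLXJlc3VsdHMtYWN0aW9uX18i</dp:file><dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:file></dp:response>'

# Hypothetical helper: print the content of the dp:file element whose
# name is temporary://$1. The leading and trailing .* drop the rest of
# the line, so only the captured content is printed.
extract_b64() {
    sed -n 's_.*<dp:file name="temporary://'"$1"'">\([^<]*\)</dp:file>.*_\1_p'
}

first=$(printf '%s\n' "$xml" | extract_b64 test.txt)
second=$(printf '%s\n' "$xml" | extract_b64 test1.txt)

printf '%s\n' "$first"    # XJzLXJlc3VsdHMtYWN0aW9uX18i
printf '%s\n' "$second"   # lc3VsdHMtYWN0aW9uX18i
```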
/usr/xpg4/bin/sed works well here.
/usr/bin/sed does not work as expected if the file contains just one line.
The command below works for a file containing only a single line.
/usr/xpg4/bin/sed -n 's_<env:Envelope\(.*\)<dp:file name="temporary://BackUpDir/backupmanifest.xml">\([^>]*\)</dp:file>\(.*\)_\2_p' securebackup.xml 2>/dev/null
Without 2>/dev/null, this sed command prints the warning sed: Missing newline at end of file.
This is for the following reason:
The default Solaris sed ignores a last line that is not terminated by a newline, so as not to break existing scripts, because the original Unix implementation required every line to end with a newline.
GNU sed is more relaxed, and the POSIX implementation (/usr/xpg4/bin/sed) processes the line but prints a warning.
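One way to sidestep the missing-newline behaviour, regardless of which sed is in use, might be to append a newline on the fly before the data reaches sed. This is a sketch with a made-up one-line input, not something tested on Solaris:

```shell
# Throwaway input file without a trailing newline
input=$(mktemp)
printf '%s' '<a>payload</a>' > "$input"

# cat the file, then echo an extra newline, and feed both to sed.
result=$( { cat "$input"; echo; } | sed -n 's_<a>\([^<]*\)</a>_\1_p' )

printf '%s\n' "$result"   # payload
```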

Replace a word with another set of strings in a UNIX file

When I try to replace a string using the sed command, it works perfectly fine.
For example, when I used the sed command below:
sed 's/DB_ALTER/DB_REPRISE/g' /product/dwhrec1/abc.ksh > /product/dwhrec1/abc1.ksh
This command works perfectly fine: it replaces every "DB_ALTER" with "DB_REPRISE" and writes the result to the abc1.ksh script.
But it fails when I put all such values in a file and loop over them, for example:
cat Repla.txt
DB_ALTER
DB_CMD
DB_GEST_COMM
for i in `cat Repla.txt`
do
sed 's/$i/DB_REPRISE/g' /product/dwhrec1/abc.ksh > /product/dwhrec1/abc1.ksh
done
But this does not work. My Repla.txt here is just an example; the real file has many values.
Can anyone please help me with this command or suggest an alternative?
Thanks
There are two problems with your script. The first is that the $i variable appears within single quotes. That means that bash will not substitute for the value of i. It needs to be in double-quotes.
Secondly, every time that you run sed, it overwrites the previous abc1.ksh file. You should copy abc.ksh to abc1.ksh and then modify in place abc1.ksh as many times as needed:
cp abc.ksh abc1.ksh
for i in `cat Repla.txt`; do
sed -i'' "s/$i/DB_REPRISE/g" abc1.ksh
done
The -i flag to sed causes it to modify the file in place.
Also, bash will apply word splitting to cat Repla.txt. This can surprise people who were expecting it to work line-by-line, not word-by-word.
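For a strictly line-by-line loop, a while read construct avoids the word splitting. The sketch below works in a temporary directory with made-up contents for abc.ksh, and it writes through a temporary file instead of relying on -i. Each name is still interpreted as a regular expression and must not contain a /.

```shell
work=$(mktemp -d)
printf '%s\n' DB_ALTER DB_CMD DB_GEST_COMM > "$work/Repla.txt"
printf '%s\n' 'run DB_ALTER' 'then DB_CMD' > "$work/abc.ksh"   # sample content

cp "$work/abc.ksh" "$work/abc1.ksh"

# read -r takes one whole line per iteration; IFS= keeps leading and
# trailing blanks intact.
while IFS= read -r name; do
    [ -n "$name" ] || continue   # skip empty lines
    sed "s/$name/DB_REPRISE/g" "$work/abc1.ksh" > "$work/abc1.tmp" &&
        mv -f "$work/abc1.tmp" "$work/abc1.ksh"
done < "$work/Repla.txt"

result=$(cat "$work/abc1.ksh")
printf '%s\n' "$result"
```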
Workaround in case your sed does not support -i
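GNU sed (Linux) and BSD sed (Mac OSX) both support -i, although BSD sed requires a separate suffix argument (sed -i '' ...), so the -i'' spelling above only works as intended with GNU sed. If your sed does not support -i at all, try: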
cmd=
for i in `cat Repla.txt`; do
[ "$cmd" ] && cmd="$cmd;"
cmd="$cmd s/$i/DB_REPRISE/g"
done
sed "$cmd" abc.ksh >abc1.ksh
The above puts all the substitution commands that you need in a single shell variable. This way, sed only needs to be run once and -i is not used.
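A related sketch, under the same assumptions about the names (no / and no regex-special characters): let sed turn Repla.txt into a sed script, then apply that script with -f in a single pass. The file replace.sed and the sample content of abc.ksh are made up for the illustration.

```shell
work=$(mktemp -d)
printf '%s\n' DB_ALTER DB_CMD DB_GEST_COMM > "$work/Repla.txt"
printf '%s\n' 'run DB_ALTER' 'then DB_CMD and DB_GEST_COMM' > "$work/abc.ksh"

# Turn each non-empty line NAME into the sed command s/NAME/DB_REPRISE/g.
# | is used as the delimiter so the slashes in the replacement stay literal.
sed -n 's|^..*$|s/&/DB_REPRISE/g|p' "$work/Repla.txt" > "$work/replace.sed"

# Apply all generated substitutions in one run.
sed -f "$work/replace.sed" "$work/abc.ksh" > "$work/abc1.ksh"

result=$(cat "$work/abc1.ksh")
printf '%s\n' "$result"
```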
Another option
If it is acceptable to overwrite the source file, then:
for i in $(cat Repla.txt)
do
sed 's/'$i'/DB_REPRISE/g' abc.ksh >abc1.ksh
mv -f abc1.ksh abc.ksh
done
The above puts in single quotes all of the sed command except for the part that we want the shell to expand. This is not needed in this example but could be useful if your replacement text had shell-active characters. The above also uses the more modern $(...) in place of backquotes for command substitution.
If $i were to contain spaces (it doesn't here), we would need to enclose it in double-quotes to protect it against shell word splitting as in:
for i in $(cat Repla.txt)
do
sed 's/'"$i"'/DB_REPRISE/g' abc.ksh >abc1.ksh
mv -f abc1.ksh abc.ksh
done