Using sed script to replace specific parts of filtered lines - sed

Given an xml file consisting of lines like below:
<dependency field="no_change" name="test" conf="blahblah"/>
<dependency field="to_be_picked_up" name="test" conf="blahREPLACE_ME"/>
I would like to identify lines where the value of field is to_be_picked_up (which can be anything except one specific string, e.g. no_change) and replace the string REPLACE_ME with a specific string.
I have used the following command to make some line-level changes, but I am not sure how to script the logic for replacing REPLACE_ME only in lines where the value of field is anything other than no_change, and how to confine the change to the conf="" attribute.
sed -e 's/<dependency \(.*\)\(\.*\)>/\<dependency \1\/\>/'

Don't use sed to edit XML. Use an XML-aware tool. For example, in xsh, a tool based on libxml I happen to maintain, you can write
open file.xml ;
for //dependency[@field="to_be_picked_up"]/@conf
set . xsh:subst(., 'REPLACE_ME', 'RESULT') ;
save :b ;

sed '/field="no_change"/!s/REPLACE_ME/whatever/'
Using xmlstarlet, it would be:
xmlstarlet ed -u '//dependency[@field!="no_change"]/@conf' -x 'concat(substring-before(.,"REPLACE_ME"), "whatever", substring-after(., "REPLACE_ME"))' file.xml
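For readers who still want to see what the sed one-liner above does, here is a minimal sketch run against the sample lines from the question. The third input line and the function name restrict_to_conf are made up for illustration, and anchoring the pattern on conf="..." is an addition to the posted command, so treat it as an assumption about the intended behaviour rather than the answer's own method.

```shell
# Sample input from the question, plus one extra no_change line (illustrative).
input='<dependency field="no_change" name="test" conf="blahblah"/>
<dependency field="to_be_picked_up" name="test" conf="blahREPLACE_ME"/>
<dependency field="no_change" name="test" conf="REPLACE_ME"/>'

# /field="no_change"/ is an address; "!" inverts it, so the s command runs
# only on lines that do NOT contain field="no_change".
# Capturing conf="... before REPLACE_ME keeps the change inside that attribute.
restrict_to_conf() {
  sed '/field="no_change"/!s/\(conf="[^"]*\)REPLACE_ME/\1whatever/'
}

printf '%s\n' "$input" | restrict_to_conf
```

This is still line-based text matching, so it breaks as soon as an element spans several lines, which is the reason for preferring the XML-aware tools.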

Related

Using sed/awk to print ONLY words that contains matched pattern - Words starting with /pattern/ or Ending with /pattern/

I have the following output:
junos-vmx-x86-64-21.1R1.11.qcow2 metadata-usb-fpc0.img metadata-usb-fpc10.img
metadata-usb-fpc11.img metadata-usb-fpc1.img metadata-usb-fpc2.img metadata-usb-fpc3.img
metadata-usb-fpc4.img metadata-usb-fpc5.img metadata-usb-fpc6.img metadata-usb-fpc7.img
metadata-usb-fpc8.img metadata-usb-fpc9.img metadata-usb-re0.img metadata-usb-re1.img
metadata-usb-re.img metadata-usb-service-pic-10g.img metadata-usb-service-pic-2g.img
metadata-usb-service-pic-4g.img vFPC-20210211.img vmxhdd.img
The output came from the following script:
images_fld=$(for i in $(ls "$DIRNAME_IMG"); do echo ${i%%/}; done)
The previous output is saved in a variable called images_fld.
Problem:
I need to extract the values junos-vmx-x86-64-21.1R1.11.qcow2, vFPC-20210211.img and vmxhdd.img. By values I mean the entire word.
The problem is that this directory containing all the files is always being updated, and new files are added constantly, which means that I can not rely on the line number ($N) to extract the name of those files.
I am trying to use awk or sed to achieve this.
Is there a way to:
match all files ending with .qcow2 and then extract the full file name? Like: junos-vmx-x86-64-21.1R1.11.qcow2
match all files starting with vFPC and then extract the full file name? Like: vFPC-20210211.img
match all files starting with vmxhdd and then extract the full file name? Like: vmxhdd.img
I am using those patterns because the file names tend to change with each version I am deploying. But the patterns .qcow2, vFPC and vmxhdd always remain the same, so I need to extract the entire string by matching only partial patterns. Is it possible? Thanks!
Note: I can not rely on files ending with .img as there are quite a lot of them, so it would make it more difficult to extract the specific file names :/
This might work for you (GNU sed):
sed -nE '/\<\S+\.qcow2\>|\<(vFPC|vmxhdd)\S+\>/{s//\n&\n/;s/[^\n]*\n//;P;D}' file
If a string matches the required criteria, delimit it by newlines.
Delete up to and including the first newline.
Print/delete the first line and repeat.
Thanks to KamilCuk I was able to solve the problem. Thank you! For anyone who may need this in the future, instead of using sed or awk the solution was by using tail.
echo $images_fld | tail -f | tr ' ' '\n' | grep '\.qcow2$\|vFPC\|vmxhdd'
Basically, the problem I was having was only to extract the names of the files ending with .qcow2 and those starting with vFPC or vmxhdd.
Thank you KamilCuk
Another solution given by potong is by using
echo $images_fld | sed -nE '/\<\S+\.qcow2\>|\<(vFPC|vmxhdd)\S+\>/{s//\n&\n/;s/[^\n]*\n//;P;D}'
which gives the same output as KamilCuk's! Thanks both
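As a rough sketch of the same idea with anchored patterns: put one name per line, then keep only names that end in .qcow2 or start with vFPC or vmxhdd. The function name pick_images and the shortened file list are made up for illustration. The grep posted above does not anchor vFPC and vmxhdd to the start of the name, so the ^ anchors here are an assumption about what was intended.

```shell
# Shortened sample of the listing from the question.
images_fld='junos-vmx-x86-64-21.1R1.11.qcow2 metadata-usb-fpc0.img metadata-usb-re.img
metadata-usb-service-pic-4g.img vFPC-20210211.img vmxhdd.img'

pick_images() {
  # tr turns every space into a newline (and squeezes repeated newlines),
  # so each file name sits on its own line and ^ / $ anchor to the name.
  tr -s ' ' '\n' | grep -E '\.qcow2$|^vFPC|^vmxhdd'
}

printf '%s\n' "$images_fld" | pick_images
```

Quoting "$images_fld" keeps the original line breaks; they end up as name separators just like the spaces.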

Using sed, prepend line only once, if there's a match later in file content

I'd like to add a line on top of my output if my input file has a specific word.
However, if I'm just looking for specific string, then as I understand it, it's too late. The first line is already in the output and I can't prepend to it anymore.
Here's an example of input.
one
two
two
three
If I can find a line with, say, the word two, I'd like to add a new line before the first one, with for example FOUND. I want that line prepended only once, even if there are several matches.
So an input file without any two would remain unchanged, and the example file above would become:
FOUND
one
two
two
three
I know how to prepend with i\, but can't get the context right. From what I understood that would be around:
1{
/two/{ # This searches for "two" in the first line only; how do I look for it in the whole file?
1i\
FOUND
}
}
EDIT:
I know how to do it using other languages/methods, that's not my question.
Sed has advanced features to work on several lines at once, append/prepend lines and is not limited to substitution. I have a sed file already filled with expressions to modify a python source file, which is why I'd prefer to avoid using something else. I want to be able to add an import at the beginning of a file if a certain class is used.
A Perl solution:
perl -i.bak -0777 -pE 'say "FOUND" if /two/;' in_file
The Perl one-liner uses these command line flags:
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.
-i.bak : Edit input files in-place (overwrite the input file). Before overwriting, save a backup copy of the original file by appending to its name the extension .bak.
-E : Tells Perl to look for code in-line, instead of in a file. Also enables all optional features. Here, enables say.
-0777 : Slurp files whole.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
sed is for doing s/old/new on individual strings. That's not what you're trying to do, so you shouldn't bother trying to use sed. There are lots of ways to do this; this one is efficient, robust and portable to all Unix systems:
$ grep -Fq 'two' file && echo "FOUND"; cat file
FOUND
one
two
two
three
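To operate on a stream instead of (or in addition to) a file, buffering only the lines up to the first match (if there is no match, the whole input is buffered and printed unchanged at the end):
awk 'f{print; next} {buf[NR]=$0} /two/{print "FOUND"; for (i=1;i<=NR;i++) print buf[i]; f=1} END{if (!f) for (i=1;i<=NR;i++) print buf[i]}'
e.g.:
$ cat file | awk 'f{print; next} {buf[NR]=$0} /two/{print "FOUND"; for (i=1;i<=NR;i++) print buf[i]; f=1} END{if (!f) for (i=1;i<=NR;i++) print buf[i]}'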
FOUND
one
two
two
three
That awk script will also work using any awk in any shell on every Unix box.
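Since the question asks to stay within sed, here is one possible sketch: collect the whole input in the hold space, and on the last line swap it back, test it, and print it. The function name prepend_found is made up for illustration, and the approach assumes the file is small enough to hold in memory, which seems reasonable for a source file.

```shell
prepend_found() {
  # 1h    : first line replaces the hold space
  # 1!H   : every later line is appended to the hold space
  # ${..} : on the last line, swap the collected text into the pattern space,
  #         prepend "FOUND" plus a newline if "two" occurs anywhere, then print.
  sed -n '1h;1!H;${x;/two/s/^/FOUND\
/;p;}'
}

printf 'one\ntwo\ntwo\nthree\n' | prepend_found
```

Because the substitution runs once on the whole buffer, FOUND is added only once regardless of how many lines match, and input without a match comes out unchanged. The same commands can be placed in an existing sed script file, but later commands in that file would then see the whole file as a single pattern space.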

sed backreference not being found

I am trying to use 'sed' to replace a list of paths in a file with another path.
An example string to process is:
/path/to/file/block
I want to replace /path/to/file with something else.
I have tried
sed -r '/\s(\S+)\/block/s/\1/new_path/'
I know it's finding the matching string but I'm getting an invalid back reference error.
How can I do this?
This may do:
echo "/path/to/file/block" | sed -r 's|/\S*/(block)|/newpath/\1|'
/newpath/block
Test
echo "test=/path/file test2=/path/to/file/block test3=/home/root/file" | sed -r 's|/\S*/(block)|/newpath/\1|'
test=/path/file test2=/newpath/block test3=/home/root/file
Back-references always refer to the pattern of the s command, not to any address (before the command).
However, in this case, there's no need for addressing: we can apply the substitution to all lines (and it will change only lines where it matches), so we can write:
s,\s\S+(/block), new_path\1,
(I added a space to the RHS, as I'm guessing you didn't mean to overwrite that; also used a different separator to reduce the need for backslashes.)
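To make the point about addresses concrete, here is a small sketch in which the address only selects lines, while the capture group lives inside the s command that uses \1. The function name replace_prefix and the replacement /new_path are placeholders, and -E is assumed to be available (GNU and BSD sed both accept it).

```shell
replace_prefix() {
  # /block/ is only an address: it picks the lines to work on.
  # The group (/block) is defined in the s pattern itself, so \1 is valid there.
  sed -E '/block/s|/[^[:space:]]*(/block)|/new_path\1|'
}

printf '%s\n' 'test=/path/file test2=/path/to/file/block test3=/home/root/file' |
  replace_prefix
```

Only the path that ends in /block is rewritten; the other two paths on the line contain no /block within their run of non-space characters, so the pattern cannot match them.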

Using grep to correct XML files

I have a folder and sub folder that contains 2000 xml files.
Need to process all the files with BizTalk systems.
But some of the files have wrong tags:
streetName Bombay Crescent /addressRegion
/streetName.
I need to use grep to find and replace only the wrong tags,
i.e. with a for loop: find any XML file with the wrong tag and replace it.
Only the tag "streetName" is affected. And only "addressRegion" is in the wrong place.
I would like to do something like:
grep -Po where streetName and *** /addressRegion if the condition is true
replace /addressRegion with /streetName
Thanks in Advance
The following will look for a <streetName> tag that is closed by </addressRegion>, and will change addressRegion to streetName. It will replace all occurrences on the line. The street name must not contain any < signs, as that would break the matching.
sed -e 's:\(<streetName>[^<]*\)</addressRegion>:\1</streetName>:g'
The command reads its standard input and writes standard output.
sed -i will do the replacement in-place in all its input files:
sed -i -e 's:\(<streetName>[^<]*\)</addressRegion>:\1</streetName>:g' folder/subfolder/*.xml
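The glob above covers only one directory. Because the question mentions a folder with sub folders, a sketch that walks the whole tree with find might look like the following. The function name fix_street_tags and its argument are made up for illustration, and sed -i without a suffix argument assumes GNU sed.

```shell
fix_street_tags() {
  # $1: top-level folder to search (hypothetical argument).
  # find collects every *.xml file below it and passes them to one or more
  # sed invocations; sed -i (GNU sed assumed) edits each file in place.
  find "$1" -type f -name '*.xml' -exec \
    sed -i -e 's:\(<streetName>[^<]*\)</addressRegion>:\1</streetName>:g' {} +
}
```

It would be called as fix_street_tags folder. Running it on a copy of the data first is a sensible precaution, since the in-place edit keeps no backup.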

Manipulate characters with sed

I have a list of usernames and I would like to add possible combinations to it.
Example. Lets say this is the list I have
johna
maryb
charlesc
Is there a way to use sed to edit it so that it looks like
ajohn
bmary
ccharles
And also
john_a
mary_b
charles_c
etc...
Can anyone assist me into getting the commands to do so, any explanation will be awesome as well. I would like to understand how it works if possible. I usually get confused when I see things like 's/\.(.*.... without knowing what some of those mean... anyway thanks in advance.
EDIT ... I changed the usernames
sed 's/\(user\)\(.\)/\2\1/'
Breakdown:
sed 's/string/replacement/' will replace the first instance of string on each line with replacement (add the g flag to replace all instances).
Then, string in that sed expression is \(user\)\(.\). This can be broken down into two
parts: \(user\) and \(.\). Each of these is a capture group - bracketed by \( \). That means that once we've matched something with them, we can reuse it in the replacement string.
\(user\) matches, surprisingly enough, the user part of the string. \(.\) matches any single character - that's what the . means. Then, you have two captured groups - user and a (or b or c).
The replacement part just uses these to recreate the pattern a little differently. \2\1 says "print the second capture group, then the first capture group". Which in this case, will print out auser - since we matched user and a with each group.
ex:
$ echo "usera
> userb
> userc" | sed "s/\(user\)\(.\)/\2\1/"
auser
buser
cuser
You can change the \2\1 to use any string you want - ie. \2_\1 will give a_user, b_user, c_user.
Also, in order to match any preceding string (not just "user"), just replace the \(user\) with \(.*\). Ex:
$ echo "marya
> johnb
> alfredc" | sed "s/\(.*\)\(.\)/\2\1/"
amary
bjohn
calfred
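The underscore variant from the question follows the same pattern; only the replacement changes. A short sketch on the question's own list (the function name to_underscore is made up for illustration):

```shell
names='johna
maryb
charlesc'

to_underscore() {
  # \(.*\) is greedy, so it takes everything except the one character
  # that \(.\) still needs: the last character of the line.
  sed 's/\(.*\)\(.\)/\1_\2/'
}

printf '%s\n' "$names" | to_underscore
```

Swapping the replacement to \2\1 gives the rotated form instead.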
Here's a partial answer to what is probably the easy part. To use sed to change usera to user_a you could use:
sed 's/user/user_/' temp
where temp is the name of the file that contains your initial list of usernames. How this works: It is finding the first instance of "user" on each line and replacing it with "user_"
Similarly for your dot example:
sed 's/user/user./' temp
will replace the first instance of "user" on each line with "user."
Sed does not offer non-greedy regex, so I suggest perl:
perl -pe 's/(.*?)(.)$/$2$1/g' file
ajohn
bmary
ccharles
perl -pe 's/(.*?)(.)$/$1_$2/g' file
john_a
mary_b
charles_c
That way you don't need to know the username before hand.
Simple solution using awk
awk '{a=$NF;$NF="";$0=a$0}1' FS="" OFS="" file
ajohn
bmary
ccharles
and
awk '{a=$NF;$NF="";$0=$0"_" a}1' FS="" OFS="" file
john_a
mary_b
charles_c
By setting FS to nothing, every letter is a field in awk. You can then easily manipulate it.
There is no need to use capturing groups etc., just plain field swapping.
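Splitting into one-character fields with an empty FS works in gawk and several other awks, but POSIX leaves that behaviour unspecified. A sketch that avoids depending on it uses substr instead. The function names last_first and last_underscore are made up for illustration.

```shell
last_first() {
  # Move the last character of each line to the front.
  awk '{ n = length($0); print substr($0, n) substr($0, 1, n - 1) }'
}

last_underscore() {
  # Insert an underscore before the last character of each line.
  awk '{ n = length($0); print substr($0, 1, n - 1) "_" substr($0, n) }'
}

printf 'johna\nmaryb\ncharlesc\n' | last_first
printf 'johna\nmaryb\ncharlesc\n' | last_underscore
```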
This might work for you (GNU sed):
sed -r 's/^([^_]*)_?(.)$/\2\1/' file
This matches any characters other than underscores (captured as the first back reference, \1), an optional underscore, and the last character (captured as the second back reference, \2), and swaps the two captured parts around.