Using sed to change pom.xml element value with regular expression - sed

I have an XML file and I used sed to change the value of the element "own.version":
<properties>
<own.version>1.1.77-SNAPSHOT</own.version>
</properties>
Sed statement:
cat pom.xml | sed -e "s%<own.version>${oldVersion}</own.version>%<own.version>${newVersion}</own.version>%" > pom.xml.transformed
Now my pom file is going to be more generic and properties that I want to change might be "*.own.version", e.g.:
<properties>
<a.own.version>1.1.77SNAPSHOT</a.own.version>
</properties>
How can I use a regular expression with sed to change the value of *.own.version?

This one should work, it captures any prefix of "own.version".
sed "s%<\(.*\)own.version>.*</%<\1own.version>${newVersion}</%" pom.xml > pom.xml.transformed
The pattern is a bit simplified, too, assuming that there is only one such version to be modified (i.e., it captures all old version numbers), and the closing tag identifier is omitted.
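A minimal sketch of this approach on a throwaway file, assuming made-up version numbers and a hypothetical extra `<other>` element. It narrows the prefix to `[^<>]*` and the old value to `[^<]*`, so the match cannot run across other tags on the same line; that tightening is an assumption on top of the answer, not part of it.

```shell
# Sample input; file contents and version numbers are invented for illustration.
workdir=$(mktemp -d)
cat > "$workdir/pom.xml" <<'EOF'
<properties>
<a.own.version>1.1.77-SNAPSHOT</a.own.version>
<other>keep</other>
</properties>
EOF

newVersion="2.0.0"

# \1 holds the optional prefix (for example "a."); [^<]* is the old version text.
# Only the opening tag and the value are rewritten; the rest of the line is kept.
sed "s%<\([^<>]*\)own\.version>[^<]*</%<\1own.version>${newVersion}</%" \
    "$workdir/pom.xml" > "$workdir/pom.xml.transformed"

result=$(cat "$workdir/pom.xml.transformed")
printf '%s\n' "$result"
```

Lines that do not contain an own.version element pass through unchanged, and a plain `<own.version>` (empty prefix) is matched as well.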

Since you mentioned that your Jenkins machine does not have xmlstarlet, could you please try the following awk and let me know how it goes?
##Shell variable
new="1.8"
awk -v new_version="$new" '/own\.version/{sub(/>.*</,">"new_version"<")} 1' Input_file
Output will be as follows.
<properties>
<own.version>1.8</own.version>
</properties>
To save the output back into Input_file itself, append > temp_file && mv temp_file Input_file to the command above.
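The write-to-a-temporary-file-then-rename step can be sketched as follows. The sample file, the `a.` prefix and the narrower `[^<]*` pattern are assumptions made for the illustration.

```shell
# Throwaway input file; contents are invented for illustration.
workdir=$(mktemp -d)
input="$workdir/Input_file"
printf '%s\n' '<properties>' \
    '<a.own.version>1.1.77-SNAPSHOT</a.own.version>' \
    '</properties>' > "$input"

new="1.8"

# Rewrite the text between > and < on any line mentioning own.version,
# write to a temp file, and replace the original only if awk succeeded.
awk -v new_version="$new" \
    '/own\.version>/ { sub(/>[^<]*</, ">" new_version "<") } 1' \
    "$input" > "$workdir/temp_file" && mv "$workdir/temp_file" "$input"

awk_result=$(cat "$input")
printf '%s\n' "$awk_result"
```

Because of the `&&`, the original file is left untouched if awk exits with an error.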

Related

using sed to replace string with special characters

I'm basically trying to modify the Connector tag in Tomcat's server.xml and add an address attribute to it.
I want to find the below string in server.xml
I'm doing the below with sed,
export currlistener=\<Connector\ port\=\"18443\"
export newlistener=\<Connector\ port\=\"18443\"\ address\=\"127.0.0.1\"\
echo $currlistener
echo $newlistener
sed -i -e 's/'$currlistener'/'$newlistener'/g' server.xml
But I get the error
sed: -e expression #1, char 12: unterminated `s' command
I guess sed is interpreting the special characters and erroring out.
How would I do the same using awk?
Regards,
Anand.
Using sed
The problem was that the shell variables were unquoted. Try:
sed -i -e "s/$currlistener/$newlistener/g" server.xml
Using awk
The sed solution requires that you trust the source of your shell variables. For a case like this, awk is safer. Using a modern GNU awk:
awk -i inplace -v a="$currlistener" -v b="$newlistener" '{gsub(a, b)} 1' server.xml
Or, using other awk:
awk -v a="$currlistener" -v b="$newlistener" '{gsub(a, b)} 1' server.xml >tmp && mv tmp server.xml
Simplifying the variable assignments
Separately, the shell variables can be defined without requiring so many escapes:
currlistener='<Connector port="18443"'
newlistener='<Connector port="18443" address="127.0.0.1"'
It is only necessary to export them if they are to be used in a child process.
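To see the difference that quoting makes, here is a small sketch run against a throwaway copy of a Connector line. The sample file and its `protocol` attribute are made up for the demonstration.

```shell
# Throwaway file standing in for server.xml; the content is invented.
workdir=$(mktemp -d)
printf '%s\n' '<Connector port="18443" protocol="HTTP/1.1"/>' > "$workdir/server.xml"

currlistener='<Connector port="18443"'
newlistener='<Connector port="18443" address="127.0.0.1"'

# Unquoted: the shell splits the values at the spaces, so sed receives
# only 's/<Connector' as its script and reports an unterminated s command.
if ! sed -e 's/'$currlistener'/'$newlistener'/g' "$workdir/server.xml" >/dev/null 2>&1; then
    unquoted_failed=yes
fi

# Double-quoted: the whole script reaches sed as one argument.
sed "s/$currlistener/$newlistener/g" "$workdir/server.xml" > "$workdir/server.xml.new"
connector=$(cat "$workdir/server.xml.new")
printf '%s\n' "$connector"
```

This still breaks if either variable contains a `/` or a regex metacharacter, which is the trust issue mentioned in the awk section.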

Extracting the contents between two different strings using bash or perl

I have tried to scan through the other posts in stack overflow for this, but couldn't get my code work, hence I am posting a new question.
Below is the content of file temp.
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/<env:Body><dp:response xmlns:dp="http://www.datapower.com/schemas/management"><dp:timestamp>2015-01-
22T13:38:04Z</dp:timestamp><dp:file name="temporary://test.txt">XJzLXJlc3VsdHMtYWN0aW9uX18i</dp:file><dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:file></dp:response></env:Body></env:Envelope>
This file contains the base64-encoded contents of two files named test.txt and test1.txt. I want to extract the base64-encoded content of each file to separate files, test.txt and test1.txt respectively.
To achieve this, I have to remove the xml tags around the base64 contents. I am trying below commands to achieve this. However, it is not working as expected.
sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test.txt">##g'|perl -p -e 's#</dp:file>##g' > test.txt
sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test1.txt">##g'|perl -p -e 's#</dp:file></dp:response></env:Body></env:Envelope>##g' > test1.txt
Below command:
sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test.txt">##g'|perl -p -e 's#</dp:file>##g'
produces output:
XJzLXJlc3VsdHMtYWN0aW9uX18i
<dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:response> </env:Body></env:Envelope>
However, in the output I am expecting only the first line, XJzLXJlc3VsdHMtYWN0aW9uX18i. Where am I making a mistake?
When I run the command below, I get the expected output:
sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test1.txt">##g'|perl -p -e 's#</dp:file></dp:response></env:Body></env:Envelope>##g'
It produces below string
lc3VsdHMtYWN0aW9uX18i
I can then easily route this to test1.txt file.
UPDATE
I have edited the question to update the source file content. The source file does not contain any newline characters, so the current solution will not work; I tried it and it failed. wc -l temp must output 1.
OS: solaris 10
Shell: bash
sed -n 's_<dp:file name="\([^"]*\)">\([^<]*\).*_\1 -> \2_p' temp
I added \1 -> to show the link from file name to content; for the content only, just remove that part.
This is the POSIX version, so on GNU sed use --posix.
It assumes that the base64-encoded content is on the same line as the surrounding tag (not spread over several lines, which would need some modification).
Thanks to JID for the full explanation below.
How it works
sed -n
The -n means no printing so unless explicitly told to print, then there will be no output from sed
's_
This is to substitute the following regex using _ to separate regex from the replacement.
<dp:file name=
Regular text
"\([^"]*\)"
The escaped parentheses form a capture group; they must be escaped unless the -r option is used (-r is not available in POSIX sed). Everything inside them is captured. [^"]* means zero or more occurrences of any character that is not a double quote, so this simply captures whatever lies between the two quotes.
>\([^<]*\)
Again uses a capture group, this time to capture everything after the > up to the next <
.*
Everything else on the line
_\1 -> \2
This is the replacement, so replace everything in the regex before with the first capture group then a -> and then the second capture group.
_p
Means print the line
Resources
http://unixhelp.ed.ac.uk/CGI/man-cgi?sed
http://www.grymoire.com/Unix/Sed.html
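When the whole document sits on a single line, one possible extension of the command above is to first break the line before every `<dp:file` element and then apply the same kind of substitution to each resulting line. The sketch below assumes a shortened one-line sample and hypothetical `out_` output names; it writes the still-encoded content and leaves the base64 decoding step out.

```shell
# One-line sample modelled on the question (envelope shortened for illustration).
workdir=$(mktemp -d)
printf '%s\n' '<dp:response><dp:file name="temporary://test.txt">XJzLXJlc3VsdHMtYWN0aW9uX18i</dp:file><dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:file></dp:response>' > "$workdir/temp"

# Step 1: insert a newline before every "<dp:file " so each element gets its own line.
sed 's_<dp:file _\
&_g' "$workdir/temp" > "$workdir/split"

# Step 2: reduce each element to "name content".
sed -n 's_.*<dp:file name="temporary://\([^"]*\)">\([^<]*\)<.*_\1 \2_p' \
    "$workdir/split" > "$workdir/pairs"

# Step 3: write each (still encoded) content to a file named after the original.
while read -r name content; do
    printf '%s\n' "$content" > "$workdir/out_$name"
done < "$workdir/pairs"
```

The backslash-newline form of the replacement is used because `\n` in the replacement text is not portable across sed implementations.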
/usr/xpg4/bin/sed works well here.
/usr/bin/sed does not work as expected if the file contains just one line.
The command below works for a file containing only a single line.
/usr/xpg4/bin/sed -n 's_<env:Envelope\(.*\)<dp:file name="temporary://BackUpDir/backupmanifest.xml">\([^>]*\)</dp:file>\(.*\)_\2_p' securebackup.xml 2>/dev/null
Without 2>/dev/null, this sed command outputs the warning sed: Missing newline at end of file.
This happens for the following reason:
The default Solaris sed ignores the last line so as not to break existing scripts, because the original Unix implementation required every line to be terminated by a newline.
GNU sed has more relaxed behavior, and the POSIX implementation accepts the input but outputs a warning.

Get TagValue of nth occurrence of a Tag in XML using sed

My XML:
<?xml version="1.0" encoding="UTF-8" ?>
<Attributes>
<Attribute>123</Attribute>
<Attribute>959595</Attribute>
<Attribute>1233</Attribute>
<Attribute>jiji</Attribute>
</Attributes>
I need to get the tag value of the second occurrence of the Attribute tag, i.e. 959595, using sed.
I used the command
sed -n ':a;$!{N;ba};s#\(<Attribute\)\(.*\)\(</Attribute>\)#\1#2#\2#p' file
The idea was: pattern one, second occurrence, pattern two, value. It doesn't work.
I don't know whether my approach is correct; please correct my command.
The proper way to do this is :
$ xmllint --xpath '/Attributes/Attribute[2]/text()' file.xml
NOTES
xmllint comes with libxml2.
the 2 in [2] selects the second matching element
sed -n '/<Attributes>/,\#</Attributes># {
/<Attribute>/ {
H;g
s#.*<Attribute>\(.*\)</Attribute>.*#\1#
t found
}
b
:found
p;q
}' YourFile
Assuming, as in your sample, that there is only one Attributes block to find, this sed returns only the first. (If the XML content looks just like your sample, the /<Attributes>/,\#</Attributes># selection is not needed.)
This is the POSIX version, so use --posix on GNU sed.
This sed prints all Attribute entries from the Attributes block, then takes the second entry and removes the tags:
sed -n '/<Attributes>/,\#</Attributes>#{/<Attribute>/p}' attrib.txt | sed -n '2p' | sed 's#</Attribute>##;s/<Attribute>//'
Output:
959595
Another way, without pipes, is to use sed commands alone: this goes to the second entry, strips the Attribute tags, and then quits:
sed -n '/<Attributes>/,\#</Attributes>#{/<Attribute>/{n;s#.*<Attribute>\(.*\)</Attribute>.*#\1#;p;q};}' attrib.txt
Or, if the number of Attribute entries changes, you can make it a bit more intuitive by extracting all the values and then using sed to print the entry at the position you want:
sed -n '/<Attributes>/,\#</Attributes>#{/<Attribute>/{s#</Attribute>##;s#<Attribute>##;p}}' attrib.txt | sed -n '2p'
You can change the 2 at the end to whichever Attribute entry you want to display, or take multiple values with sed -n '2p;3p' or sed -n '1,2p'.
I would also follow the xmllint XPath way. However, there seem to be two versions available. According to the man page at https://linux.die.net/man/1/xmllint there is no xpath parameter; it is called "pattern" instead.
Following this documentation, your call would then be
$ xmllint --pattern '/Attributes/Attribute[2]/text()' file.xml
I recommend checking your local man page to see which one to use.
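If xmllint is not available at all, the two-stage sed idea from the answers above can be parameterized on the occurrence number. This is a sketch only: `n` is a hypothetical variable, and it relies on every Attribute element sitting on its own line, as in the sample.

```shell
# Sample file copied from the question.
workdir=$(mktemp -d)
cat > "$workdir/attrib.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8" ?>
<Attributes>
<Attribute>123</Attribute>
<Attribute>959595</Attribute>
<Attribute>1233</Attribute>
<Attribute>jiji</Attribute>
</Attributes>
EOF

n=2   # which occurrence to print (hypothetical variable)

# First sed: print the text of every <Attribute> element, one per line.
# Second sed: keep only line n of that list.
value=$(sed -n 's#.*<Attribute>\(.*\)</Attribute>.*#\1#p' "$workdir/attrib.xml" \
        | sed -n "${n}p")
printf '%s\n' "$value"
```

The `<Attributes>` wrapper lines are not picked up, because the pattern requires a `>` directly after `Attribute`.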

sed stripping hex from start of file including pattern

I've been at this most of this afternoon hacking with sed and it's a bit of a minefield.
I have a file of hex of the form:
485454502F312E31203230300D0A0D0AFFD8FFE000104A46494600
I'm pattern matching on 0D0A0D0A and have managed to delete the contents from the start of the file to there. The problem is that it leaves the 0D0A0D0A, so I have to do a second pass to pick that up.
Is there a way in one command to delete up to and including the pattern that you match to and save it back into the same file ?
thanks in advance.
ID
This should work:
sed -e 's/.*0D0A0D0A//' file.txt
You need to provide better description of your problem.
Based on what you wrote you can use -i switch (Edit files in-place) of sed to save the changed file:
sed -i.bak 's/^.*0D0A0D0A//' file
PS: POSIX sed and some older versions of sed do not have the -i switch. If that's the case, use it like this:
sed 's/^.*0D0A0D0A//' file > _temp && mv _temp file
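One thing to keep in mind with these commands is that `.*` is greedy, so if the marker occurs more than once on the line, everything up to the last occurrence is removed. The sketch below contrasts that with shell parameter expansion, which can remove the shortest matching prefix instead; the sample string is made up and deliberately contains the marker twice.

```shell
# Made-up hex string containing the marker 0D0A0D0A twice.
hex='AAAA0D0A0D0ABBBB0D0A0D0ACCCC'

# sed: .* is greedy, so everything up to the LAST marker is deleted.
greedy=$(printf '%s\n' "$hex" | sed 's/^.*0D0A0D0A//')

# Shell: a single # removes the shortest matching prefix,
# i.e. everything up to and including the FIRST marker.
shortest=${hex#*0D0A0D0A}

printf 'greedy:   %s\n' "$greedy"
printf 'shortest: %s\n' "$shortest"
```

Which behavior is wanted depends on the data; for a file with a single marker the two give the same result.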

In-place replacement

I have a CSV. I want to edit the 35th field of the CSV and write the change back to the 35th field. This is what I am doing on bash:
awk -F "," '{print $35}' test.csv | sed -i 's/^0/+91/g'
So, I am pulling the 35th entry using awk and then replacing the "0" at the start of the string with "+91". This works perfectly and I get the desired output on the console.
Now I want this new entry to be written back to the file. I am thinking of sed's "in-place" replacement feature, but this feature needs an input file. In the above command, I cannot provide an input file because my primary command is awk and sed is taking its input from awk.
Thanks.
You should choose one of the two tools. As for sed, it can be done as follows:
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/' test.csv
Not sure about awk, but @shellter's comment might help with that.
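For the awk side, one possible sketch is to let awk edit the field itself and rebuild the line, rather than printing only that field. The example uses a made-up three-column file and a hypothetical `field` variable (35 in the question), and assumes no field contains a quoted comma.

```shell
# Made-up three-column CSV; in the question the field number would be 35.
workdir=$(mktemp -d)
printf '%s\n' 'a,b,0123' 'c,d,9123' > "$workdir/test.csv"

field=3

# FS splits on commas, OFS rejoins with commas after the field is modified.
# sub() changes only a leading 0 in field n; the trailing 1 prints every line.
awk -v n="$field" 'BEGIN { FS = OFS = "," } { sub(/^0/, "+91", $n) } 1' \
    "$workdir/test.csv" > "$workdir/test.csv.tmp" \
    && mv "$workdir/test.csv.tmp" "$workdir/test.csv"

csv_result=$(cat "$workdir/test.csv")
printf '%s\n' "$csv_result"
```

Lines whose chosen field does not start with 0 are written back unchanged.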
The in-place feature of sed is misnamed, as it does not edit the file in place. Instead, it creates a new file with the same name. eg:
$ echo foo > foo
$ ln -f foo bar
$ ls -i foo bar # These are the same file
797325 bar 797325 foo
$ echo new-text > foo # Changes bar
$ cat bar
new-text
$ printf '/new/s//newer\nw\nq\n' | ed foo # Edit foo "in-place"; changes bar
9
newer-text
11
$ cat bar
newer-text
$ ls -i foo bar # Still the same file
797325 bar 797325 foo
$ sed -i s/new/newer/ foo # Does not edit in-place; creates a new file
$ ls -i foo bar
797325 bar 792722 foo
Since sed is not actually editing the file in place, but writing a new file and then renaming it to the old file, you might as well do the same.
awk ... test.csv | sed ... > test.csv.1 && mv test.csv.1 test.csv
There is the misperception that using sed -i somehow avoids the creation of the temporary file. It does not. It just hides the fact from you. Sometimes abstraction is a good thing, but other times it is unnecessary obfuscation. In the case of sed -i, it is the latter. The shell is really good at file manipulation. Use it as intended. If you do need to edit a file in place, don't use the streaming version of ed; just use ed
So, it turned out there are numerous ways to do it. I got it working with sed as below:
sed -i 's/0\([0-9]\{10\}\)/\+91\1/g' test.csv
But this is a little tricky, as it will edit any entry that matches the criteria. However, in my case it works fine.
Similar implementation of above logic in perl:
perl -p -i -e 's/\b0(\d{10})\b/\+91$1/g;' test.csv
Again, same caveat as mentioned above.
A more precise way of doing it, as shown by Lev Levitsky, because it operates specifically on the 35th field:
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/g' test.csv
For more complex situations, I will have to consider using any of the csv modules of perl.
Thanks everyone for your time and input. I surely know more about sed/awk after reading your replies.
This might work for you:
sed -i 's/[^,]*/+91/35' test.csv
EDIT:
To replace the leading zero in the 35th field:
sed 'h;s/[^,]*/\n&/35;/\n0/!{x;b};s//+91/' test.csv
or more simply:
sed 's/^\(\([^,]*,\)\{34\}\)0/\1+91/' test.csv
If you have moreutils installed, you can simply use the sponge tool:
awk -F "," '{print $35}' test.csv | sed 's/^0/+91/g' | sponge test.csv
sponge soaks up the input, closes the input pipe (stdin) and, only then, opens and writes to the test.csv file. Note that this pipeline writes back only the 35th field, since that is all awk prints.
As of 2015, moreutils is available in package repositories of several major Linux distributions, such as Arch Linux, Debian and Ubuntu.
Another perl solution to edit the 35th field in-place:
perl -i -F, -lane '$F[34] =~ s/^0/+91/; print join ",",@F' test.csv
These command-line options are used:
-i edit the file in-place
-n loop around every line of the input file
-l removes newlines before processing, and adds them back in afterwards
-a autosplit mode – split input lines into the @F array. Defaults to splitting on whitespace.
-e execute the perl code
-F autosplit modifier, in this case splits on ,
@F is the array of words in each line, indexed starting with 0
$F[34] is the 35th element of the array
s/^0/+91/ does the substitution