How to use sed to replace multiline string?


How can I use the bash sed command to change this string:
<Directory /var/www/>
Options Indexes FollowSymLinks
AllowOverride None
Require all granted
</Directory>
into the following string? (only changing the 3rd line of string)
<Directory /var/www/>
Options Indexes FollowSymLinks
AllowOverride All
Require all granted
</Directory>
NOTE 1: I don't just want to target the string 'AllowOverride None' because there are other occurrences in the file that should not be changed. I need to target the entire string starting with <Directory /var/www>
NOTE 2: I also need to overwrite the file. So, take that into account in your answer. And provide different versions for GNU/non-GNU versions of sed just in case.

Since the patterns contain slashes, use \% (for any character %) to mark the search patterns. Then use:
sed -e '\%^<Directory /var/www/>%,\%^</Directory>% s/AllowOverride None/AllowOverride All/'
The search patterns inside \%…% limit the search to lines between the matching patterns, and the s/…/…/ command looks for the desired pattern within that range and makes the replacement. To overwrite the file in place, add -i with GNU sed or -i '' with BSD sed, followed by the file name.
If you don't want to restrict it to a single directory section but to all directory sections, adjust the start pattern appropriately. For example, this will match any <Directory> section:
sed -e '\%^<Directory [^>]*>%,\%^</Directory>% s/AllowOverride None/AllowOverride All/'
You can make it more selective depending on your requirements.
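To make the range logic concrete, here is a minimal Python sketch of what a sed address range does. It is an illustration only, not how sed is implemented; the function name replace_in_range and the sample data are assumptions made for the example.

```python
import re

def replace_in_range(lines, start_re, end_re, old, new):
    """Replace the first `old` on each line, but only inside start..end ranges."""
    start, end = re.compile(start_re), re.compile(end_re)
    inside = False
    result = []
    for line in lines:
        if inside:
            result.append(line.replace(old, new, 1))
            # Like sed, look for the end pattern only after the start line.
            if end.search(line):
                inside = False
        elif start.search(line):
            inside = True
            result.append(line.replace(old, new, 1))
        else:
            result.append(line)
    return result

# Hypothetical sample: only the /var/www/ block should change.
sample = [
    "<Directory />",
    "AllowOverride None",
    "</Directory>",
    "<Directory /var/www/>",
    "Options Indexes FollowSymLinks",
    "AllowOverride None",
    "Require all granted",
    "</Directory>",
]
edited = replace_in_range(sample, r"^<Directory /var/www/>", r"^</Directory>",
                          "AllowOverride None", "AllowOverride All")
```

The AllowOverride line in the first block is left alone because the range has not started yet when that line is read.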

The simple version, relying on the AllowOverride line coming within two lines after <Directory...> and using a GNU sed extension, is this:
sed '/^<Directory/,+2 { s/AllowOverride None/AllowOverride All/g; }'
UPDATE: Here is the version that does not rely on any GNU extension (I tried it first, but made a typo and was surprised that it didn't work, which is why I posted the other version first):
sed '/^<Directory/,/^<\/Directory>/ { s/AllowOverride None/AllowOverride All/; }'

I realize this is not what you asked, but maybe it's worth not using sed?
How about a Python solution? It walks the directory passed as the first parameter to the script, matches exactly the <Directory> element as you wrote it, changes only None to All, and writes the changes back to the file. It also works with different indentation levels while preserving the original indentation. Works on both Python 2 and Python 3.
After all, I am assuming that if you have sed you probably have Python too.
#!/usr/bin/env python
import os
import re
import sys

# Match the whole <Directory /var/www/> block so other AllowOverride lines stay untouched.
r = re.compile(r'(<Directory /var/www/>\s+Options Indexes FollowSymLinks\s+AllowOverride )None(\s+Require all granted\s+</Directory>)', re.MULTILINE)
# Walk the directory given as the first argument and rewrite every .conf file in place.
for root, dirs, files in os.walk(sys.argv[1]):
    for file_name in files:
        if file_name.endswith('.conf'):
            file_path = os.path.join(root, file_name)
            with open(file_path) as fp:
                data = r.sub(r'\1All\2', fp.read())
            with open(file_path, 'w') as fp:
                fp.write(data)

Using GNU sed:
sed -z -i -e 's!\(<Directory /var/www/>[^<]*AllowOverride\) None!\1 All!' ex1.txt
Option -z treats the input as NUL-separated records, so the whole file is one record
and a simple substitution is enough.
[^<]* (multiline) regular expression respects Directory boundaries, and allows flexible format and order.
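The same whole-file idea can be sketched in Python, where the file content is one string and a negated character class such as [^<]* already spans newlines. This is only an illustration of the technique; the function name allow_override_all and the sample text are assumptions.

```python
import re

# Same shape as the sed expression: stay inside the block by refusing to cross a '<'.
PATTERN = re.compile(r"(<Directory /var/www/>[^<]*AllowOverride) None")

def allow_override_all(text):
    """Change None to All in the first /var/www/ block only."""
    return PATTERN.sub(r"\1 All", text, count=1)

# Hypothetical sample with an unrelated block that must stay unchanged.
text = (
    "<Directory />\n"
    "AllowOverride None\n"
    "</Directory>\n"
    "<Directory /var/www/>\n"
    "Options Indexes FollowSymLinks\n"
    "AllowOverride None\n"
    "Require all granted\n"
    "</Directory>\n"
)
result = allow_override_all(text)
```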

Your question is a good illustration of the mantra, don't use sed. Really, you shouldn't use any regex engine for context-free language like XML. But you can get close, maybe close enough, with awk.
#! /usr/bin/awk -f
/<Directory \/var\/www\/>/ {
    line = NR
}
/ AllowOverride None/ && line + 2 == NR {
    gsub( /None/, "All" )
}
{ print }
That way you don't have any fancy, nonstandard regex to read, and your code says exactly what it means: If you find "AllowOverride" 2 lines after the "Directory" line, replace it. The above regexes are both very simple (and Posix compliant) and should work with any version of awk.

Your answer is already given by this user just check here.
Some Reference
In the simplest calling of sed, it has one line of text in the pattern space, ie. 1 line of \n delimited text from the input. The single line in the pattern space has no \n... That's why your regex is not finding anything.
You can read multiple lines into the pattern-space and manipulate things surprisingly well, but with a more than normal effort.. Sed has a set of commands which allow this type of thing... Here is a link to a Command Summary for sed. It is the best one I've found, and got me rolling.
However forget the "one-liner" idea once you start using sed's micro-commands. It is useful to lay it out like a structured program until you get the feel of it... It is surprisingly simple, and equally unusual. You could think of it as the "assembler language" of text editing.
Summary: Use sed for simple things, and maybe a bit more, but in general, when it gets beyond working with a single line, most people prefer something else...
I'll let someone else suggest something else.. I'm really not sure what the best choice would be (I'd use sed, but that's because I don't know perl well enough.)
sed '/^a test$/{
$!{ N # append the next line when not on the last line
s/^a test\nPlease do not$/not a test\nBe/
# now test for a successful substitution, otherwise
#+ unpaired "a test" lines would be mis-handled
t sub-yes # branch_on_substitute (goto label :sub-yes)
:sub-not # a label (not essential; here to self document)
# if no substitution, print only the first line
P # pattern_first_line_print
D # pattern_ltrunc(line+nl)_top/cycle
:sub-yes # a label (the goto target of the 't' branch)
# fall through to final auto-pattern_print (2 lines)
}
}' alpha.txt
Here is the same script, condensed into what is obviously harder to read and work with, but which some would dubiously call a one-liner:
sed '/^a test$/{$!{N;s/^a test\nPlease do not$/not a test\nBe/;ty;P;D;:y}}' alpha.txt
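For readers who find the N/P/D dance hard to follow, the control flow of the script above can be sketched in Python as a two-line window over the input. This is a rough analogue rather than a translation of sed internals; the function name replace_pair and the sample lines are assumptions.

```python
def replace_pair(lines, first, second, new_first, new_second):
    """Replace `first` followed directly by `second` with two new lines."""
    out = []
    i = 0
    while i < len(lines):
        has_next = i + 1 < len(lines)
        if lines[i] == first and has_next and lines[i + 1] == second:
            # Both lines matched: emit the replacement pair (the 't' branch).
            out.extend([new_first, new_second])
            i += 2
        else:
            # No match: emit only the first line and re-examine the next one,
            # which is what P followed by D achieves in the sed script.
            out.append(lines[i])
            i += 1
    return out

# Hypothetical input: an unpaired "a test" followed by a real pair.
lines = ["a test", "a test", "Please do not", "done"]
result = replace_pair(lines, "a test", "Please do not", "not a test", "Be")
```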
Here is my command "cheat-sheet"
: # label
= # line_number
a # append_text_to_stdout_after_flush
b # branch_unconditional
c # range_change
d # pattern_delete_top/cycle
D # pattern_ltrunc(line+nl)_top/cycle
g # pattern=hold
G # pattern+=nl+hold
h # hold=pattern
H # hold+=nl+pattern
i # insert_text_to_stdout_now
l # pattern_list
n # pattern_flush=nextline_continue
N # pattern+=nl+nextline
p # pattern_print
P # pattern_first_line_print
q # flush_quit
r # append_file_to_stdout_after_flush
s # substitute
t # branch_on_substitute
w # append_pattern_to_file_now
x # swap_pattern_and_hold
y # transform_chars

Related

Why is my sed multiline find-and-replace not working as expected?

I have a simple sed command that I am using to replace everything between (and including) //thistest.com-- and --thistest.com with nothing (remove the block altogether):
sudo sed -i "s#//thistest\.com--.*--thistest\.com##g" my.file
The contents of my.file are:
//thistest.com--
zone "awebsite.com" {
type master;
file "some.stuff.com.hosts";
};
//--thistest.com
As I am using # as my delimiter for the regex, I don't need to escape the / characters. I am also properly (I think) escaping the . in .com. So I don't see exactly what is failing.
Why isn't the entire block being replaced?

You have two problems:
Sed doesn't do multiline pattern matches—at least, not the way you're expecting it to. However, you can use multiline addresses as an alternative.
Depending on your version of sed, you may need to escape alternate delimiters, especially if you aren't using them solely as part of a substitution expression.
So, the following will work with your posted corpus in both GNU and BSD flavors:
sed '\#^//thistest\.com--#, \#^//--thistest\.com# d' /tmp/corpus
Note that in this version, we tell sed to match all lines between (and including) the two patterns. The opening delimiter of each address pattern is properly escaped. The command has also been changed to d for delete instead of s for substitute, and some whitespace was added for readability.
I've also chosen to anchor the address patterns to the start of each line. You may or may not find that helpful with this specific corpus, but it's generally wise to do so when you can, and doesn't seem to hurt your use case.
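The difference between the two approaches can be illustrated with a small Python sketch: a substitution applied to one line at a time never sees both markers together, while a range delete tracks state across lines. The function names and the sample lines are assumptions made for this illustration.

```python
import re

def substitute_per_line(lines, pattern, replacement):
    """Apply a substitution to each line separately, as a plain s/// does."""
    return [re.sub(pattern, replacement, line) for line in lines]

def delete_range(lines, start_re, end_re):
    """Drop every line from a start match through the next end match."""
    start, end = re.compile(start_re), re.compile(end_re)
    kept = []
    inside = False
    for line in lines:
        if inside:
            if end.search(line):
                inside = False
            continue
        if start.search(line):
            inside = True
            continue
        kept.append(line)
    return kept

corpus = [
    "keep me",
    "//thistest.com--",
    'zone "awebsite.com" {',
    "type master;",
    "};",
    "//--thistest.com",
    "keep me too",
]
# The original one-line regex finds nothing because no single line holds both markers.
unchanged = substitute_per_line(corpus, r"//thistest\.com--.*--thistest\.com", "")
# The range version removes the block, markers included.
trimmed = delete_range(corpus, r"^//thistest\.com--", r"^//--thistest\.com")
```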

# separation by line with 1 s//
sed -n -e 'H;${x;s#^\(.\)\(.*\)\1//thistest.com--.*\1//--thistest.com#\2#;p}' YourFile
# separation by line with address pattern
sed -e '\#//thistest.com--#,\#//--thistest.com# d' YourFile
# separation only by char (could be CR, CR/LF, ";" or "oneline") with s//
sed -n -e '1h;1!H;${x;s#//thistest.com--.*//--thistest.com##;p}' YourFile
Notes:
The s// versions assume there is only one thistest section per file; otherwise they remove everything from the first opening marker to the last closing marker.
The s// versions are not suited to huge files, because they load the entire file into memory.
The address-pattern version cannot select a section that starts and ends on the same line: it searches for the first pattern to start and for a following line to stop. It is, however, very efficient on big files and on files with multiple sections.

Sed to replace variable length string between 2 known patterns

I'd like to be able to replace a string between 2 known patterns. The catch is that I want to replace it by a string of the same length that is composed only of 'x'.
Let's say I have a file containing:
Hello.StringToBeReplaced.SecondString
Hello.ShortString.SecondString
I'd like the output to be like this:
Hello.xxxxxxxxxxxxxxxxxx.SecondString
Hello.xxxxxxxxxxx.SecondString

Using sed loops
You can use sed, though the thinking required is not wholly obvious:
sed ':a;s/^\(Hello\.x*\)[^x]\(.*\.SecondString\)/\1x\2/;t a'
This is for GNU sed; BSD (Mac OS X) sed and other versions may be fussier and require:
sed -e ':a' -e 's/^\(Hello\.x*\)[^x]\(.*\.SecondString\)/\1x\2/' -e 't a'
The logic is identical in both:
Create a label a
Substitute the lead string and a sequence of x's (capture 1), followed by a non-x, and arbitrary other data plus the second string (capture 2), and replace it with the contents of capture 1, an x and the content of capture 2.
If the s/// command made a change, go back to the label a.
It stops substituting when there are no non-x's between the two marker strings.
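The loop can be mirrored in Python to see why it terminates: each pass turns exactly one non-x character into an x, and the loop ends when a pass changes nothing. This is a minimal sketch; the function name mask_middle is an assumption.

```python
import re

# Same regex as the sed command, written with Python's unescaped groups.
STEP = re.compile(r"^(Hello\.x*)[^x](.*\.SecondString)")

def mask_middle(line):
    """Replace one character per pass until no substitution happens."""
    while True:
        line, changed = STEP.subn(r"\1x\2", line, count=1)
        if changed == 0:  # equivalent to 't a' not branching
            return line

masked = [mask_middle(s) for s in (
    "Hello.StringToBeReplaced.SecondString",
    "Hello.ShortString.SecondString",
)]
```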
Two tweaks to the regex allow the code to recognize two copies of the pattern on a single line. Lose the ^ that anchors the match to the beginning of the line, and change .* to [^.]* (so that the regex is not quite so greedy):
$ echo Hello.StringToBeReplaced.SecondString Hello.StringToBeReplaced.SecondString |
> sed ':a;s/\(Hello\.x*\)[^x]\([^.]*\.SecondString\)/\1x\2/;t a'
Hello.xxxxxxxxxxxxxxxxxx.SecondString Hello.xxxxxxxxxxxxxxxxxx.SecondString
$
Using the hold space
hek2mgl suggests an alternative approach in sed using the hold space. This can be implemented using:
$ echo Hello.StringToBeReplaced.SecondString |
> sed 's/^\(Hello\.\)\([^.]\{1,\}\)\(\.SecondString\)/\1#\3##\2/
> h
> s/.*##//
> s/./x/g
> G
> s/\(x*\)\n\([^#]*\)#\([^#]*\)##.*/\2\1\3/
> '
Hello.xxxxxxxxxxxxxxxxxx.SecondString
$
This script is not as robust as the looping version but works OK as written when each line matches the lead-middle-tail pattern. It first splits the line into three sections: the first marker, the bit to be mangled, and the second marker. It reorganizes that so that the two markers are separated by #, followed by ## and the bit to be mangled. h copies the result to the hold space. Remove everything up to and including the ##; replace each character in the bit to be mangled by x, then copy the material in the hold space after the x's in the pattern space, with a newline separating them. Finally, recognize and capture the x's, the lead marker, and the tail marker, ignoring the newline, the # and ## plus trailing material, and reassemble as lead marker, x's, and tail marker.
To make it robust, you'd recognize the pattern and then group the commands shown inside { and } so they're only executed when the pattern is recognized:
sed '/^\(Hello\.\)\([^.]\{1,\}\)\(\.SecondString\)/{
s/^\(Hello\.\)\([^.]\{1,\}\)\(\.SecondString\)/\1#\3##\2/
h
s/.*##//
s/./x/g
G
s/\(x*\)\n\([^#]*\)#\([^#]*\)##.*/\2\1\3/
}'
Adjust to suit your needs...
Adjusting to suit your needs
[I tried one of your solutions and it worked fine.]
However when I try to replace the 'hello' by my real string (which is
'1.2.840.') and my second string (which is simply a dot '.'), things stop
working. I guess all these dots confuse the sed command.
What I try to achieve is transform this '1.2.840.10008.' to
'1.2.840.xxxxx.'
And this pattern happens several times in my file with variable number
of characters to be replaced between the '1.2.840.' and the next dot '.'
There are times when it is important to get your question close enough to the real scenario, and this may be one of them. Dot is a metacharacter in sed regular expressions (and in most other dialects of regular expression; shell globbing is the notable exception). If the 'bit to be mangled' is always digits, then we can tighten up the regular expressions, though actually (when I look at the code ahead) the tightening really isn't imposing much in the way of a restriction.
Pretty much any solution using regular expressions is a balancing act that has to pit convenience and abbreviation against reliability and precision.
Revised code plus data
cat <<EOF |
transform this '1.2.840.10008.' to '1.2.840.xxxxx.'
OK, and hence 1.2.840.21. and 1.2.840.20992. should lose the 21 and 20992.
EOF
sed ':a;s/\(1\.2\.840\.x*\)[^x.]\([^.]*\.\)/\1x\2/;t a'
Example output:
transform this '1.2.840.xxxxx.' to '1.2.840.xxxxx.'
OK, and hence 1.2.840.xx. and 1.2.840.xxxxx. should lose the 21 and 20992.
The changes in the script are:
sed ':a;s/\(1\.2\.840\.x*\)[^x.]\([^.]*\.\)/\1x\2/;t a'
Add 1\.2\.840\. as the start pattern.
Revise the 'character to replace' expression to 'not x or .'.
Use just \. as the tail pattern.
You could replace the [^x.] with [0-9] if you're sure you only want digits matched, in which case you won't have to worry about spaces as discussed below.
You may decide you don't want spaces to be matched so that a casual comment like:
The net prefix is 1.2.840. And there are other prefixes too.
does not end up as:
The net prefix is 1.2.840.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
In which case, you probably need to use:
sed ':a;s/\(1\.2\.840\.x*\)[^x. ]\([^ .]*\.\)/\1x\2/;t a'
And so the changes continue until you've got something precise enough to do what you want without doing anything you don't want on your current data set. Writing bullet-proof regular expressions requires a precise specification of what you want matched, and can be quite hard.

I'd choose perl:
perl -pe 's/(?<=Hello\.)(.*?)(?=\.SecondString)/ "x" x length($1) /e' file

This awk should do:
awk -F. '{for (i=1;i<=length($2);i++) a=a"x";$2=a;a=""}1' OFS="." file
Hello.xxxxxxxxxxxxxxxxxx.SecondString
Hello.xxxxxxxxxxx.SecondString

Bash Works Too
While the perl, sed and awk solutions are probably the better choice, a Bash solution is not that difficult (just longer). Bash has good character-by-character handling abilities as well:
#!/bin/bash
rep=0   # replace flag
skip=0  # delay reset flag
while IFS= read -r line; do              # read each line
    for ((i=0; i<${#line}; i++)); do     # for each character in the line
        c=${line:i:1}
        # if '.' and replace on, turn off and set skip
        [ "$c" = '.' ] && [ "$rep" -eq 1 ] && { rep=0; skip=1; }
        # print char or "x" depending on replace flag
        if [ "$rep" -eq 0 ]; then printf '%s' "$c"; else printf 'x'; fi
        # if '.' and replace off
        if [ "$c" = '.' ] && [ "$rep" -eq 0 ]; then
            # if skip, turn skip off, else set replace on
            if [ "$skip" -eq 1 ]; then skip=0; else rep=1; fi
        fi
    done
    printf "\n"
done
exit 0
Input
$ cat dat/replacefile.txt
Hello.StringToBeReplaced.SecondString
Hello.ShortString.SecondString
Output
$ bash replacedot.sh < dat/replacefile.txt
Hello.xxxxxxxxxxxxxxxxxx.SecondString
Hello.xxxxxxxxxxx.SecondString

For the sake of your sanity, just use awk:
$ awk 'BEGIN{FS=OFS="."} {gsub(/./,"x",$2)} 1' file
Hello.xxxxxxxxxxxxxxxxxx.SecondString
Hello.xxxxxxxxxxx.SecondString

Use sed to replace line if pattern is on next line

How do I get sed to replace previous line? I only came across examples of delete, insert lines, but what I actually need is that I only make substitution to current line if a condition on following line is met.
My sample file is like this
$ /bin/cat test
Cygwin
Cygwin is a cool emulator for Linux on Windows.
Unix
Maybe
the coolest environment?
Linux
Is also one of the best environments
Solaris
Why did Sun feel copying Java into Unix would matter?
AIX
Unknown
The output I expect is as below. Prepend ::: to strings having max 25 chars but only if the string on next line is longer than 25 chars. Thus, the line having Unix, AIX below should not get prepended with :::, but others would.
$ # See detailed sed expression in my answer below
:::Cygwin
Cygwin is a cool emulator for Linux on Windows.
Unix
Maybe
the coolest environment?
:::Linux
Is also one of the best environments
:::Solaris
Why did Sun feel copying Java into Unix would matter?
AIX
Unknown
What sed expression can help me do this?
I am inclined to use only sed since this is a part of some other script that has other sed expressions going on, so I do not want to deviate if possible.

Here's one sed expression that gives me the output I desire,
/bin/sed -rne '/^\s*$/{d;};{p;}' test | /bin/sed -rne '/(^.{5,26}$)/{$p;h;n;/^.{5,26}$/{x;p;x;p;D;};{x;s/(^.*$)/:::\1/;p;x;p;D;}};{$p;h;p;}'
Specifically, below two sed expressions are piped together above,
/bin/sed -rne '/^\s*$/{d;};{p;}' test
# Remove any empty-lines (optionally containing spaces)
/bin/sed -rne '/(^.{5,26}$)/{$p;h;n;/^.{5,26}$/{x;p;x;p;D;};{x;s/(^.*$)/:::\1/;p;x;p;D;}};{$p;h;p;}'
# This is the killer sed expression I came up with hunting around with my limited knowledge
# The detailed breakdown of this expression is as below,
/(^.{5,26}$)/ # Get a string of at least 5 and at most 26 chars
{
$p; # Print if it's already on last line (since -n is in effect)
h; # Save it to hold space
n; # Get the next line into pattern space
/^.{5,26}$/ # Check if pattern space (i.e. next line) also has min 5, max 26 chars
{ # if above condition passed, execute inside here
x; # Swap pattern with hold space; i.e. Get current line back
p; # Print it (i.e. the first line)
x; # Swap again; to get back next line
p; # Print it (i.e. the second line)
D; # Stop cycle here, and process the next line in the input file
};
{ # else block for above if-condition
x; # Swap pattern with hold space; i.e. Get current line back
s/(^.*$)/:::\1/; # Prepend ::: to the line
p; # Print it (i.e. the first line)
x; # Swap again; to get back next line
p; # Print it (i.e. the second line)
D; # Stop cycle here, and process the next line in the input file
} # End processing next line
} # End if match
{ # Current line is longer than max 26 chars,
$p; # Print if it's already on last line (since -n is in effect)
h; # Remember it in hold space
p; # Print it (i.e. the current line)
}
With above explanation, I am able to achieve what I need.
But I am still not confident that this could not be written or explained in a more concise, or perhaps better, way.

It's pretty simple in awk if you get tired of trying to use the hammer of sed on this particular screw :-)
awk '{x[NR]=$0} END{for(i=1;i<=NR;i++){if(length(x[i])<26 && length(x[i+1])>25)printf ":::";print x[i]}}' file
Save all the lines in array x[]. At the end, go through the lines printing them but prefixing ones that meet your conditions with :::.
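The same look-ahead idea can be sketched in Python: keep all lines in a list and compare each line with the one after it. The function name mark_short_before_long and its parameters are assumptions for this sketch; the 25-character limit comes from the question.

```python
def mark_short_before_long(lines, limit=25, prefix=":::"):
    """Prefix a line of at most `limit` chars when the next line is longer."""
    out = []
    for i, line in enumerate(lines):
        # The last line has no successor, so treat the next line as empty.
        following = lines[i + 1] if i + 1 < len(lines) else ""
        if len(line) <= limit and len(following) > limit:
            out.append(prefix + line)
        else:
            out.append(line)
    return out

sample = [
    "Cygwin",
    "Cygwin is a cool emulator for Linux on Windows.",
    "Unix",
    "Maybe",
    "the coolest environment?",
    "Linux",
    "Is also one of the best environments",
    "Solaris",
    "Why did Sun feel copying Java into Unix would matter?",
    "AIX",
    "Unknown",
]
marked = mark_short_before_long(sample)
```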

This might work for you (GNU sed):
sed -r '$!N;/^.{1,25}\n.{26,}$/s/^/:::/;P;D' file

Perl One-Liner from Command-Line
This perl one-liner will do it (tested just now):
perl -0777 -pe 's/^([^\n]{1,25}$)(?=\n[^\n]{26,}$)/:::$1/smg' yourfile

sed recipe: how to do stuff between two patterns that can be either on one line or on two lines?

Let's say we want to do some substitutions only between some patterns, let them be <a> and </a> for clarity... (all right, all right, they're start and end!.. Jeez!)
So I know what to do if start and end always occur on the same line: just design a proper regex.
I also know what to do if they're guaranteed to be on different lines and I don't care about anything in the line containing end and I'm also OK with applying all the commands in the line containing start before start: just specify the address range as /start/,/end/.
This, however, doesn't sound very useful. What if I need to do a smarter job, for instance, introduce changes inside a {...} block?
One thing I can think of is breaking the input on { and } before processing and putting it back together afterwards:
sed 's/{\|}/\n/g' input | sed 'main stuff' | sed ':a $!{N;ba}; s/\n\(}\|{\)\n/\1/g'
Another option is the opposite:
cat input | tr '\n' '#' | sed 'whatever; s/#/\n/g'
Both of these are ugly, mainly because the operations are not confined within a single command. The second one is even worse because one has to use some character or substring as a "newline holder" assuming it isn't present in the original text.
So the question is: are there better ways or can the above-mentioned ones be optimized? This is quite a regular task from what I read in recent SO questions, so I'd like to choose the best practice once and for all.
P.S. I'm mostly interested in pure sed solutions: can the job be done with one invocation of sed and nothing else? Please no awk, Perl, etc.: this is more of a theoretical question, not a "need the job done asap" one.

This might work for you:
# create multiline test data
cat <<\! >/tmp/a
> this
> this { this needs
> changing to
> that } that
> that
> !
sed '/{/!b;:a;/}/!{$q;N;ba};h;s/[^{]*{//;s/}.*//;s/this\|that/\U&/g;x;G;s/{[^}]*}\([^\n]*\)\n\(.*\)/{\2}\1/' /tmp/a
this
this { THIS needs
changing to
THAT } that
that
# convert multiline test data to a single line
tr '\n' ' ' </tmp/a >/tmp/b
sed '/{/!b;:a;/}/!{$q;N;ba};h;s/[^{]*{//;s/}.*//;s/this\|that/\U&/g;x;G;s/{[^}]*}\([^\n]*\)\n\(.*\)/{\2}\1/' /tmp/b
this this { THIS needs changing to THAT } that that
Explanation:
Read the data into the pattern space (PS). /{/!b;:a;/}/!{$q;N;ba}
Copy the data into the hold space (HS). h
Strip non-data from front and back of string. s/[^{]*{//;s/}.*//
Convert data e.g. s/this\|that/\U&/g
Swap to HS and append converted data. x;G
Replace old data with converted data. s/{[^}]*}\([^\n]*\)\n\(.*\)/{\2}\1/
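Outside sed, the isolate, convert, and put-back steps above collapse into a regex substitution with a callback, which may help when reading the sed version. This is a Python illustration, not a pure sed answer; the function name upper_in_braces is an assumption, and it assumes braces are not nested.

```python
import re

# A block is '{', anything that is not a brace (newlines included), then '}'.
BLOCK = re.compile(r"\{[^{}]*\}")

def upper_in_braces(text):
    """Uppercase 'this' and 'that' only inside {...} blocks."""
    def convert(block):
        return re.sub(r"this|that", lambda w: w.group(0).upper(), block.group(0))
    return BLOCK.sub(convert, text)

multi = "this\nthis { this needs\nchanging to\nthat } that\nthat\n"
single = "this this { this needs changing to that } that that "
multi_out = upper_in_braces(multi)
single_out = upper_in_braces(single)
```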
EDIT:
A more complicated answer which I think caters for more than one block per line.
# slurp file into pattern space (PS)
:a
$! {
N
ba
}
# check for presence of \v if so quit with exit value 1
/\v/q1
# replace original newlines with \v's
y/\n/\v/
# append a newline to PS as a delimiter
G
# copy PS to hold space (HS)
h
# starting from right to left delete everything but blocks
:b
s/\(.*\)\({.*}\).*\n/\1\n\2/
tb
# delete any non-block details from the start of the file
s/.*\n//
# PS contains only block details
# do any block processing here e.g. uppercase this and that
s/th\(is\|at\)/\U&/g
# append ps to hs
H
# swap to HS
x
# replace each original block with its processed one from right to left
:c
s/\(.*\){.*}\(.*\)\n\n\(.*\)\({.*}\)/\1\n\n\4\2\3/
tc
# delete newlines
s/\n//g
# restore original newlines
y/\v/\n/
# done!
N.B. This uses GNU specific options but could be tweaked to work with generic sed's.

How to restrict a find and replace to only one column within a CSV?

I have a 4-column CSV file, e.g.:
0001 # fish # animal # eats worms
I use sed to do a find and replace on the file, but I need to limit this find and replace to only the text found inside column 3.
How can I have a find and replace only occur on this one column?

Are you sure you want to be using sed? What about csvfix? Is your CSV nice and simple with no quotes or embedded commas or other nasties that make regexes...a less than satisfactory way of dealing with a general CSV file? I'm assuming that the # is the 'comma' in your format.
Consider using awk instead of sed:
awk -F# '$3 ~ /pattern/ { OFS= "#"; $3 = "replace"; } { print }'
Arguably, you should have a BEGIN block that sets OFS once. For one line of input, it didn't make any odds (and you'd probably be hard-pressed to measure a difference on a million lines of input, too):
$ echo "pattern # pattern # pattern # pattern" |
> awk -F# '$3 ~ /pattern/ { OFS= "#"; $3 = "replace"; } { print }'
pattern # pattern #replace# pattern
$
If sed still seems appealing, then:
sed '/^\([^#]*#[^#]*\)#pattern#\(.*\)/ s//\1#replace#\2/'
For example (and note the slightly different input and output – you can fix it to handle the same as the awk quite easily if need be):
$ echo "pattern#pattern#pattern#pattern" |
> sed '/^\([^#]*#[^#]*\)#pattern#\(.*\)/ s//\1#replace#\2/'
pattern#pattern#replace#pattern
$
The first regex looks for the start of a line, a field of non-hash characters, a hash, another field of non-hash characters and remembers the lot; it looks for a hash, the pattern (which must be in the third field since the first two fields have been matched already), another hash, and then the residue of the line. When the line matches, it replaces the line with the first two fields (unchanged, as required), then adds the replacement third field, and the residue of the line (unchanged, as required).
If you need to edit rather than simply replace the third field, then you think about using awk or Perl or Python. If you are still constrained to sed, then you explore using the hold space to hold part of the line while you manipulate the other part in the pattern space, and end up re-integrating your desired output line from the hold space and pattern space before printing the line. That's nearly as messy as it sounds; actually, possibly even messier than it sounds. I'd go with Perl (because I learned it long ago and it does this sort of thing quite easily), but you can use whichever non-sed tool you like.
Perl editing the third field. Note that the default output is $_ which had to be reassembled from the auto-split fields in the array @F.
$ echo "pattern#pattern#pattern#pattern" | sh -x xxx.pl
> perl -pa -F# -e '$F[2] =~ s/\s*pat(\w\w)rn\s*/ prefix-$1-suffix /; $_ = join "#", @F; ' "$@"
pattern#pattern# prefix-te-suffix #pattern
$
An explanation. The -p means 'loop, reading lines into $_ and printing $_ at the end of each iteration'. The -a means 'auto-split $_ into the array @F'. The -F# means the field separator is #. The -e is followed by the Perl program. Arrays are indexed from 0 in Perl, so the third field is split into $F[2] (the sigil, @ or $, changes depending on whether you're working with the array as a whole or a value from the array). The =~ is a match operator; it applies the regex on the RHS to the value on the LHS. The substitute pattern recognizes zero or more spaces \s* followed by pat then two 'word' characters which are remembered into $1, then rn and zero or more spaces again; maybe there should be a ^ and $ in there to bind to the start and end of the field. The replacement is a space, 'prefix-', the remembered pair of letters, and '-suffix' and a space. The $_ = join "#", @F; reassembles the input line $_ from the possibly modified separate fields, and then the -p prints that out. Not quite as tidy as I'd like (so there's probably a better way to do it), but it works. And you can do arbitrary transforms on arbitrary fields in Perl without much difficulty. Perl also has a module Text::CSV (and a high-speed C version, Text::CSV_XS) which can handle really complex CSV files.

Essentially break the line into three pieces, with the pattern you're looking for in the middle. Then keep the outer pieces and replace the middle.
/\([^#]*#[^#]*#[^#]*\)pattern\([^#]*#.*\)/s//\1replacement\2/
\([^#]*#[^#]*#[^#]*\) - gather everything before the pattern, including the 2nd # and any third-field text before the match - this becomes \1
pattern - the thing you're looking for
\([^#]*#.*\) - gather everything after the pattern - this becomes \2
Then change that line into \1 then the replacement, then everything after pattern, which is \2

This might work for you:
echo '0001 # fish # animal # eats worms' |
sed 's/#/&\n/2;s/#/\n&/3;h;s/\n#.*//;s/.*\n//;y/a/b/;G;s/\([^\n]*\)\n\([^\n]*\).*\n/\2\1/'
0001 # fish # bnimbl # eats worms
Explanation:
Define the field to be worked on (in this case the 3rd) and insert a newline (\n) before it and directly after it. s/#/&\n/2;s/#/\n&/3
Save the line in the hold space. h
Delete the fields either side s/\n#.*//;s/.*\n//
Now process the field i.e. change all a's to b's. y/a/b/
Now append the original line. G
Substitute the new field for the old field (also removing any newlines). s/\([^\n]*\)\n\([^\n]*\).*\n/\2\1/
N.B. That in step 4 the pattern space only contains the defined field, so any number of commands may be carried out here and the result will not affect the rest of the line.
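For comparison, the isolate-edit-reassemble steps above can be sketched in Python by splitting on the separator, editing one field, and joining again. The function name edit_field and its parameters are assumptions for this sketch, and it assumes the separator never appears inside a field (no quoting).

```python
def edit_field(line, index, transform, sep="#"):
    """Apply `transform` to the field at zero-based `index` and rebuild the line."""
    fields = line.split(sep)
    if index < len(fields):
        fields[index] = transform(fields[index])
    return sep.join(fields)

# Change every 'a' to 'b' in the third field only, like y/a/b/ in the sed answer.
row = edit_field("0001 # fish # animal # eats worms", 2,
                 lambda field: field.replace("a", "b"))
# Replace the third field when it equals 'pattern', leaving the others alone.
swapped = edit_field("pattern#pattern#pattern#pattern", 2,
                     lambda field: "replace" if field == "pattern" else field)
```

Because the transform only ever receives the chosen field, it cannot touch the rest of the line, which is the same guarantee the hold-space version provides.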
