Using sed to replace text between strings - sed

I'm trying to replace a.mysql.com in a file using sed for the following line
'ENGINE': 'a.mysql.com', # MySQL host
How can I replace the text without removing the comment entry?
I tried the below but it isn't working
sed -i -e "s/\('ENGINE': ').*\([^']*\)/\1new.mysql.com\2/g" file.py

The following sed command may help:
sed "s/\(.*: \)\('.*'\)\(.*\)/\1'my_next_text'\3/" Input_file
Output will be as follows.
'ENGINE': 'my_next_text', # MySQL host
EDIT: I also tested it with the OP's edited Input_file, as follows.
cat Input_file
'ENGINE': 'a.mysql.com', # MySQL host
sed "s/\(.*: \)\('.*'\)\(.*\)/\1'my_next_text'\3/" Input_file
'ENGINE': 'my_next_text', # MySQL host
EDIT2: In case the OP wants to restrict the substitution to lines that contain the string ENGINE, the following could be done.
sed "/ 'ENGINE/s/\(.*: \)\('.*'\)\(.*\)/\1'my_next_text'\3/" Input_file

You can use the following logic using GNU sed to achieve your requirement.
sed "/ENGINE/s/'[^']*'/'new.mysql.com'/2" file
The /ENGINE/ address selects the lines containing ENGINE, and on those lines the substitution s/'[^']*'/'new.mysql.com'/2 is applied, which works as follows:
s/ # Substitute
' # Match a single quote
[^']* # Match anything not a single quote
' # Match the closing single quote
/ # Replace with
'new.mysql.com' # The literal value
/2 # The 2 here matches the second quoted string, not the first.
Add the -i option to edit the file in place once the output looks correct.
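As a minimal sketch of the approach above, the following runs the substitution against a scratch copy of the sample line. The temporary file and the result variable are illustrative names and not part of the question.

```shell
# Scratch file holding the sample line from the question (path chosen by mktemp).
tmp=$(mktemp)
printf '%s\n' "'ENGINE': 'a.mysql.com', # MySQL host" > "$tmp"

# On lines containing ENGINE, replace only the second single-quoted string.
result=$(sed "/ENGINE/s/'[^']*'/'new.mysql.com'/2" "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Because -i is left out, the original file is not modified, which makes it easy to check the result before committing to an in-place edit.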

Related

Use sed to remove lines that do not match a pattern but keep header line

I am cleaning up a dataset (csv dataset). I only want to consider registers in which all fields are complete and have the right type of values. This is what I tried:
sed -r '{
/regex_pattern/!d
more commands follow...
}' $1
The program works just fine and does what it is supposed to do. The problem is that it also removes the very first line (header line) since it does not match the specific regex_pattern. I know there is a way to specify the range in which the command should apply so for example:
sed '2,$ s/A/a/'
will do substitutions on data skipping the header line. Based on this logic I tried:
sed -r '{
2,$/regex_pattern/!d
more commands follow...
}' $1
so that the header line would be left untouched; however, this code does not run at all. So what would be the right command to do what I intend, and why?
As an example, imagine my csv file is fruits.csv and that my regex_pattern is [0-9]+,[0-9]+
apples,oranges
20,5
7,3
,4
a,b
12,22
When I call the .sh script that contains the sed commands, it should output:
apples,oranges
20,5
7,3
12,22
So, note that:
Header line was not deleted even though it does not match the regex_pattern.
Line number 4, i.e. ",4" was deleted as it does not match the regex_pattern.
Line number 5, i.e. "a,b" was deleted as it does not match the regex_pattern.
Any help is very much appreciated and I wish to thank you all in advance.
Kind regards.
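sed accepts at most one address or address range per command, so 2,$ cannot be followed directly by a second /regex/ address; the regex-addressed command has to go inside braces. You could write it like this, matching the whole line, starting at the second line: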
sed -r '
2,${/^[0-9]+,[0-9]+$/!d}
' file
Output
apples,oranges
20,5
7,3
12,22
If you also want to allow single numbers or more than just 2 comma separated numbers:
sed -r '
2,${/^[0-9]+(,[0-9]+)*$/!d}
' file
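A small sketch of the range-plus-braces form, run against the sample data from the question. The scratch file and the result variable are assumptions made for the demonstration.

```shell
# Sample data from the question, written to a scratch file.
csv=$(mktemp)
printf '%s\n' 'apples,oranges' '20,5' '7,3' ',4' 'a,b' '12,22' > "$csv"

# From line 2 to the last line, delete every line that is not exactly "number,number".
# Line 1 is outside the range, so the header is never tested against the regex.
result=$(sed -E '2,${
/^[0-9]+,[0-9]+$/!d
}' "$csv")
printf '%s\n' "$result"

rm -f "$csv"
```

Putting the closing brace on its own line avoids relying on how a particular sed parses `d}`.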
Using sed
$ sed '2,${/[0-9]\+,[0-9]\+/!d}' input_file
apples,oranges
20,5
7,3
12,22
Any one of these should work in gawk, mawk1/2, or macOS nawk:
mawk 'NF-_^(NF==NR)' FS='^[0-9]+,[0-9]+$'
nawk '(NF!=NR)!=NF' FS='^[0-9]+,[0-9]+$'
gawk 'NF-(NF!~NR)' FS='^[0-9]+,[0-9]+$'
apples,oranges
20,5
7,3
12,22
more concisely would be
mawk -F'[0-9]+,[0-9]+' '(NF<NR)-NF' # using FS
gawk '/[0-9]+,[0-9]+/^+(NF<NR)' # not using FS
nawk '(NF<NR)<=/([0-9]+,?){2}/' # same approach, rev. order
mawk '(NF~NR)-/[0-9]+,[0-9]+/' # truly fringe but
# concise syntax
nawk '(NF~NR)!=/([0-9]+,?){2}/' # same approach, to
# circumvent nawk peculiarities
sed is a bad choice for working with CSVs since it doesn't have any inbuilt functionality for working with fields, nor literal strings, nor variables, doesn't use EREs by default (all of the answers you have so far will only work with GNU sed), etc. To do what you specifically want with any awk in any shell on every Unix box is simply:
$ awk 'NR==1 || /[0-9]+,[0-9]+/' file
apples,oranges
20,5
7,3
12,22
which says "if the current line number (stored in NR) is 1 or the regexp matches the current line contents then print the line". Anything else you want to do with your CSV will also be easier with awk than with sed.
Meh, I would just preserve first line.
sed -r '
1{p;d}
/regex_pattern/!d
more commands follow...
' "$1"
or apply the commands to every line except the first:
1!{
/regex_pattern/!d
more commands follow...
}
This might work for you (GNU sed):
sed -E '1!{/^[0-9]+,[0-9]+$/!d}' file
If it is not the first line, delete any line that does not match one set of comma separated natural numbers.
Alternative:
sed -E '1b;/^[0-9]+,[0-9]+$/!d' file
Or:
sed -nE '1p;1b;/^[0-9]+,[0-9]+$/p' file
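To illustrate how the variants above relate: b without a label branches to the end of the script, so on line 1 the delete command is skipped and the line is printed by sed's normal auto-print. The sketch below compares the grouped and branching forms on a shortened copy of the sample data; the variable names are illustrative.

```shell
csv=$(mktemp)
printf '%s\n' 'apples,oranges' '20,5' ',4' 'a,b' '12,22' > "$csv"

# Variant 1: negate the line-1 address and group the delete command.
out_group=$(sed -E '1!{/^[0-9]+,[0-9]+$/!d;}' "$csv")

# Variant 2: on line 1, branch to the end of the script; otherwise run the delete.
# Separate -e expressions avoid depending on how "b;" is parsed.
out_branch=$(sed -E -e '1b' -e '/^[0-9]+,[0-9]+$/!d' "$csv")

printf '%s\n' "$out_group"
rm -f "$csv"
```

Both variables should hold the same three lines: the header followed by the two numeric rows.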

sed remove line if neither pattern provided don't match

I am trying to create a filter command to reduce the lines in a log file. Assume each line contains a path with a date partition:
/iamthepath01/20200301/file01.txt
/iamthepath02/20200302/file02.txt
....
/iamthepathxx/20210619/filexx.txt
From thousands of lines, I only want to keep the ones with either of these two strings in the path
/202106
/202105
and remove all other lines.
I have tried the following command:
sed -i -e '\(/202105\|/202106\)!d' ~/log.txt
The above command threw:
sed: -e expression #1, char 24: unterminated address regex
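The expression fails because a regex address needs delimiters: a leading backslash makes sed treat the next character, here (, as a custom delimiter, and no closing ( ever follows. You can use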
sed -i '/\/20210[56]/!d' ~/log.txt
Or, if you need to use more specific alternatives and further enhance the pattern:
sed -i -E '/\/(202105|202106)/!d' ~/log.txt
Details:
-i - GNU sed option for editing the file in place
-E - option enabling POSIX ERE regex syntax
/\/20210[56]/ - regex that matches /20210 and then either 5 or 6
\/(202105|202106) - the POSIX ERE pattern that matches / and then either 202105 or 202106
!d - removes the lines not matching the pattern.
See the online demo:
#!/bin/bash
s='/iamthepath01/20200301/file01.txt
/iamthepath02/20200302/file02.txt
/iamthepathxx/20210619/filexx.txt'
sed '/\/20210[56]/!d' <<< "$s"
Output:
/iamthepathxx/20210619/filexx.txt
sed is the wrong tool for this. If you want a script that's as fragile as the sed one then use grep as it's the tool that exists solely to do a simple g/re/p (hence the name) like you're doing:
$ grep '/20210[56]' file
/iamthepathxx/20210619/filexx.txt
or if you want a more robust solution that focuses just on the part of the line you want to match and so will avoid false matches, then use awk:
$ awk -F '/' '$3 ~ /^20210[56]/' file
/iamthepathxx/20210619/filexx.txt
This might work for you (GNU sed):
sed -ni '\#/20210[56]#p' file
This uses sed's grep-like -n option to turn off implicit printing and the -i option to edit the file in place.
Normally sed uses the /.../ to match but other delimiters may be used if the first is escaped e.g. \#...#.
So the above solution will filter the existing file down to lines that contain either /202105 or /202106.
N.B. grep will almost certainly be faster in finding the above lines however the use of the -i option may be the ultimate reason for choosing sed (although the same outcome can be achieved by tacking on the > tmpFile && mv tmpFile file to a grep solution).
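The grep-plus-temporary-file idea mentioned in the note can be sketched as follows. The file names come from mktemp, and the second sample line (with date 20210512) is a hypothetical addition so that more than one line matches.

```shell
log=$(mktemp)
tmpFile=$(mktemp)
printf '%s\n' \
  '/iamthepath01/20200301/file01.txt' \
  '/iamthepath02/20210512/file02.txt' \
  '/iamthepathxx/20210619/filexx.txt' > "$log"

# Write the matching lines to a temporary file, then replace the original.
# mv runs only when grep exits with status 0, i.e. at least one line matched.
grep '/20210[56]' "$log" > "$tmpFile" && mv "$tmpFile" "$log"

result=$(cat "$log")
printf '%s\n' "$result"
rm -f "$log" "$tmpFile"
```

One consequence of chaining with && is that a log with no matching lines is left unchanged rather than emptied, which may or may not be what you want.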

How to replace only specific spaces in a file using sed?

I have this content in a file where I want to replace spaces at certain positions with a pipe symbol (|). I used sed for this, but it replaces all the spaces in the string, and I don't want to replace the space between the 3rd and 4th strings.
How to achieve this?
Input:
test test test test
My attempt:
sed -e 's/ /|/g' file.txt
Expected Output:
test|test|test test
Actual Output:
test|test|test|test
sed 's/ /\
/3;y/\n / |/'
A newline cannot appear in the pattern space when sed has just read a line, so it is safe to use as a temporary marker: change the third space to a newline, then use y to turn the newline into a space and every space into a pipe.
GNU sed can use \n in the replacement text:
sed 's/ /\n/3;y/\n / |/'
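A short run of the marker technique on the sample input, using the backslash-newline form in the replacement. The result variable is only there for the demonstration.

```shell
# Step 1: s/ /<newline>/3 turns the third space into a newline, which acts as a marker.
# Step 2: y maps newline -> space and space -> pipe in a single pass,
#         so only the marked position ends up as a space.
result=$(printf '%s\n' 'test test test test' | sed 's/ /\
/3;y/\n / |/')
printf '%s\n' "$result"
```

This should print test|test|test test, and unlike a pipe-based marker it does not depend on the input being free of | characters.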
If the original input doesn't contain any pipe characters, you can do
sed -e 's/ /|/g' -e 's/|/ /3' file
to retain the third white space. Otherwise see other answers.
You could replace the 'first space' twice, e.g.
sed -e 's/ /|/' -e 's/ /|/' file.txt
Or, if you want to specify the positions (e.g. the 2nd and 1st spaces):
sed -e 's/ /|/2' -e 's/ /|/1' file.txt
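When positions are given explicitly, the order of the expressions matters, because each substitution changes how the remaining spaces are numbered. The sketch below contrasts the two orders; the variable names are illustrative.

```shell
line='test test test test'

# Highest position first: replacing the 2nd space leaves the 1st space where it was.
high_first=$(printf '%s\n' "$line" | sed -e 's/ /|/2' -e 's/ /|/1')

# Lowest position first: once the 1st space is gone, the original 3rd space
# has become the 2nd remaining space, so the wrong one gets replaced.
low_first=$(printf '%s\n' "$line" | sed -e 's/ /|/1' -e 's/ /|/2')

printf '%s\n' "$high_first" "$low_first"
```

The first result is test|test|test test and the second is test|test test|test, which is why the answer lists the 2nd position before the 1st.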
Using GNU sed to replace the first and second runs of one or more whitespace characters:
sed -i -E 's/\s+/|/;s/\s+/|/' file
See the online demo.
Details
-i - in-place editing enabled
-E - POSIX ERE syntax enabled
s/\s+/|/ - replaces the first run of one or more whitespace chars
; - and then
s/\s+/|/ - replaces the next (originally the second) run of one or more whitespace chars on each line (if present).
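Keep it simple and use awk, e.g. using any awk in any shell on every Unix box no matter what other characters your input contains:
$ awk '{n=NF; for (i=1;i<n-1;i++) sub(/ /,"|")} 1' file
test|test|test test
The above replaces all but the last " " on each line, assuming fields are separated by single spaces. NF is saved in n first because every sub() changes $0, which makes awk re-split the record and recompute NF. If you want to replace a specific number, e.g. 2, then change the loop condition to i<=2.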

How can I achieve the following in sed?

The original text is:
apr_array_pstrcat(anythingbutalwayshereincludingspaces,anythingbutalwayshereincludingspaces, ',')
I want to change it to:
apr_array_pstrcat(samethingasabove,samethingasabove, ", ")
I got the following sed command, but it is not working:
find . -type f -exec sed -i "s/apr_array_pstrcat\((.*),(.*),(.*)','\)/apr_array_pstrcat\($1,$2,$3\", \"\)/g" {} +
How can I do this? I am able to understand PCRE regex, but I am not sure about this sed one.
Issues with OP's attempts:
-E is needed to enable ERE, otherwise \( and ( need to be reversed with default BRE
$1, $2, etc should be \1, \2, etc
there should be only two capture groups as per given sample
also, g flag isn't needed if there can be only one match per line
sed -E "s/apr_array_pstrcat\((.*),(.*)','\)/apr_array_pstrcat\(\1,\2\", \"\)/g"
This can be simplified to:
sed -E "s/(apr_array_pstrcat\(.*),(.*)','\)/\1,\2\", \"\)/g"
# or this one, since using double quotes for entire expression can lead to
# conflict with shell double quote interpretation
sed -E 's/(apr_array_pstrcat\(.*),(.*)\x27,\x27\)/\1,\2", "\)/g'
This can be further simplified depending on what kind of data is present in the input:
# change ',' to ", " if a line contains apr_array_pstrcat(
sed '/apr_array_pstrcat(/ s/\x27,\x27/", "/'
sed has the -E flag for "use extended regular expressions in the script".
I'd also match the arguments with 'anything that's not a comma': "[^,]+"
So this works for me:
sed -E "s/(apr_array_pstrcat\([^,]+, ?[^,]+,) ','\)/\1 \", \")/"
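For a quick check of the simplest form above without relying on \x27 (a GNU sed extension), a literal single quote can be kept in a shell variable. This is only a sketch: the sq variable and the sample argument names arr and p are assumptions made for the demonstration.

```shell
sq="'"   # a literal single quote, so the sed script itself needs no \x27
input="apr_array_pstrcat(arr, p, ',')"

# On lines containing apr_array_pstrcat( replace ',' with ", ".
# In a basic regular expression an unescaped ( is an ordinary character.
result=$(printf '%s\n' "$input" |
  sed "/apr_array_pstrcat(/ s/${sq},${sq}/\", \"/")
printf '%s\n' "$result"
```

The same script can be passed to find with -exec sed -i once the output looks right.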

search a string which contains "/" and replace using sed

How can I use sed to search for a pattern that contains special characters, such as "ranasnfs2:/SA_kits/prod", and remove the matching line?
I tried using a variable to hold the complete string and then recall the variable in sed command but it is not working.
echo $a
ranasnfs2:/SA_kits/prod
sed -i '/"$a"/d' test.txt
cat test.txt | grep -i SA
/SA_kits -rw,suid,soft,retry=4 ranasnfs2:/SA_kits/prod
You need to escape the slash character.
Use this for deleting lines which contain a /:
sed '/\//d' file
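For the variable case in the question, a sketch under two assumptions: the value of a contains no | character, and it contains no regex metacharacters that need escaping. Single quotes stop the shell from expanding $a, so the script is double-quoted, and a custom delimiter (\|...|) avoids having to escape every slash. The second sample line is hypothetical.

```shell
a='ranasnfs2:/SA_kits/prod'
tmp=$(mktemp)
printf '%s\n' \
  '/SA_kits -rw,suid,soft,retry=4 ranasnfs2:/SA_kits/prod' \
  '/home -rw,soft otherhost:/export/home' > "$tmp"

# Double quotes let the shell expand $a; \| selects | as the regex delimiter,
# so the slashes inside $a are ordinary characters.
result=$(sed "\|$a|d" "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

If the string may contain characters that are special in a regex, a fixed-string tool such as grep -vF "$a" is a safer way to drop the matching lines.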