sed search and replace number with another on specific line - sed

I would like to replace in multiple dump files (*.sql):
`this_pattern` varchar(255)
by
`this_pattern` varchar(100)
i.e. only when "this_pattern" is present on the line.
I tried simple with:
echo "varchar(255)" | sed -e 's/\(255\)/100/g'
but I don't know how to put the this_pattern condition.
sed "/^`this_pattern`/ s/\(255\)/100/g"
did nothing.
Thanks for your help.

Try this, using a capturing group ( ) :
$ cat file
`this_pattern` varchar(255)
varchar(255)
$ sed -E '/^`this_pattern`/ s/(varchar)\(255\)/\1(100)/g' file
`this_pattern` varchar(100)
varchar(255)

echo 'this_pattern varchar(255)' | sed '/^this_pattern.*/ s/varchar(255)/varchar(100)/g'
Without -r switch, sed treats round parens just as characters.
If this_pattern is masked by backticks, like in the original post, it's a bit more complicated, but that doesn't look plausible. However, here we go:
echo '`this_pattern` varchar(255)' | sed '/^`this_pattern` .*/ s/varchar(255)/varchar(100)/g'
`this_pattern` varchar(100)
Note: There might be variable whitespace between 'pattern' and 'varchar' and 'varchar' and '(' and so on. So you have to be careful about this. Sometimes, it's a one time job, with dozens of lines, sometimes it's a repeated job. Sometimes the input is of generic nature, and expected to always look the same, sometimes it's man made, and may differ from time to time. You have to take that into account, when doing massive updates.
Keeping a copy of the source is often a good idea. Often editors allow using regular expressions, where mass renaming is possible, but they mark the places so you can oversee the changes, which will be made.

Related

sed backreference not being found

I am trying to use 'sed' to replace a list of paths in a file with another path.
An example string to process is:
/path/to/file/block
I want to replace /path/to/file with something else.
I have Tried
sed -r '/\s(\S+)\/block/s/\1/new_path/'
I know it's finding the matching string but I'm getting an invalid back reference error.
How can I do this?
This may do:
echo "/path/to/file/block" | sed -r 's|/\S*/(block)|/newpath/\1|'
/newpath/block
Test
echo "test=/path/file test2=/path/to/file/block test3=/home/root/file" | sed -r 's|/\S*/(block)|/newpath/\1|'
test=/path/file test2=/newpath/block test3=/home/root/file
Back-references always refer to the pattern of the s command, not to any address (before the command).
However, in this case, there's no need for addressing: we can apply the substitution to all lines (and it will change only lines where it matches), so we can write:
s,\s(\S+)/block/, \1/new_path,
(I added a space to the RHS, as I'm guessing you didn't mean to overwrite that; also used a different separator to reduce the need for backslashes.)

Manipulate characters with sed

I have a list of usernames and i would like add possible combinations to it.
Example. Lets say this is the list I have
johna
maryb
charlesc
Is there is a way to use sed to edit it the way it looks like
ajohn
bmary
ccharles
And also
john_a
mary_b
charles_c
etc...
Can anyone assist me into getting the commands to do so, any explanation will be awesome as well. I would like to understand how it works if possible. I usually get confused when I see things like 's/\.(.*.... without knowing what some of those mean... anyway thanks in advance.
EDIT ... I change the username
sed s/\(user\)\(.\)/\2\1/
Breakdown:
sed s/string/replacement/ will replace all instances of string with replacement.
Then, string in that sed expression is \(user\)\(.\). This can be broken down into two
parts: \(user\) and \(.\). Each of these is a capture group - bracketed by \( \). That means that once we've matched something with them, we can reuse it in the replacement string.
\(user\) matches, surprisingly enough, the user part of the string. \(.\) matches any single character - that's what the . means. Then, you have two captured groups - user and a (or b or c).
The replacement part just uses these to recreate the pattern a little differently. \2\1 says "print the second capture group, then the first capture group". Which in this case, will print out auser - since we matched user and a with each group.
ex:
$ echo "usera
> userb
> userc" | sed "s/\(user\)\(.\)/\2\1/"
auser
buser
cuser
You can change the \2\1 to use any string you want - ie. \2_\1 will give a_user, b_user, c_user.
Also, in order to match any preceding string (not just "user"), just replace the \(user\) with \(.*\). Ex:
$ echo "marya
> johnb
> alfredc" | sed "s/\(.*\)\(.\)/\2\1/"
amary
bjohn
calfred
here's a partial answer to what is probably the easy part. To use sed to change usera to user_a you could use:
sed 's/user/user_/' temp
where temp is the name of the file that contains your initial list of usernames. How this works: It is finding the first instance of "user" on each line and replacing it with "user_"
Similarly for your dot example:
sed 's/user/user./' temp
will replace the first instance of "user" on each line with "user."
Sed does not offer non-greedy regex, so I suggest perl:
perl -pe 's/(.*?)(.)$/$2$1/g' file
ajohn
bmary
ccharles
perl -pe 's/(.*?)(.)$/$1_$2/g' file
john_a
mary_b
charles_c
That way you don't need to know the username before hand.
Simple solution using awk
awk '{a=$NF;$NF="";$0=a$0}1' FS="" OFS="" file
ajohn
bmary
ccharles
and
awk '{a=$NF;$NF="";$0=$0"_" a}1' FS="" OFS="" file
john_a
mary_b
charles_c
By setting FS to nothing, every letter is a field in awk. You can then easy manipulate it.
And no need to using capturing groups etc, just plain field swapping.
This might work for you (GNU sed):
sed -r 's/^([^_]*)_?(.)$/\2\1/' file
This matches any charactes other than underscores (in the first back reference (\1)), a possible underscore and the last character (in the second back reference (\2)) and swaps them around.

SED search and replace substring in a database file

To all,
I have spent alot of time searching for a solution to this but cannot find it.
Just for a background, I have a text database with thousands of records. Each record is delineated by :
"0 #nnnnnn# Xnnn" // no quotes
The records have many fields on a line of their own, but the field I am interested in to search and replace a substring (notice spaces) :
" 1 X94 User1.faculty.ventura.ca" // no quotes
I want to use sed to change the substring ".faculty.ventura.ca" to ".students.moorpark.ut", changing nothing else on the line, globally for ALL records.
I have tested many things with negative results.
How can this be done ?
Thank You for the assistance.
Bob Perez (robertperez1957#gmail.com)
If I understand you correctly, you want this:
sed 's/1 X94 \(.*\).faculty.ventura.ca/1 X94 \1.students.moorpark.ut/' mydatabase.file
This will replace all records of the form 1 X94 XXXXXX.faculty.ventura.ca with 1 X94 XXXXX.students.moorpark.ut.
Here's details on what it all does:
The '' let you have spaces and other messes in your script.
s/ means substitute
1 X94 \(.*\).faculty.ventura.ca is what you'll be substituting. The \(.*\) stores anything in that regular expression for use in the replacement
1 X94 \1.students.moorpark.ut is what to replace the thing you found with. \1 is filled in with the first thing that matched \(.*\). (You can have multiple of those in one line, and the next one would then be \2.)
The final / just tells sed that you're done. If your database doesn't have linefeeds to separate its records, you'll want to end with /g, to make this change multiple times per line.
mydatabase.file should be the filename of your database.
Note that this will output to standard out. You'll probably want to add
> mynewdatabasefile.name
to the end of your line, to save all the output in a file. (It won't do you much good on your terminal.)
Edit, per your comments
If you want to replace 1 F94 bperez.students.Napvil.NCC to 1 F94 bperez.JohnSmith.customer, you can use another set of \(.*\), as:
sed 's/1 X94 \(.*\).\(.*\).Napvil.NCC/1 X94 \1.JohnSmith.customer/' 251-2.txt
This is similar to the above, except that it matches two stored parameters. In this example, \1 evaluates to bperez and \2 evaluates to students. We match \2, but don't use it in the replace part of the expression.
You can do this with any number of stored parameters. (Sed probably has some limit, but I've never hit a sufficiently complicated string to hit it.) For example, we could make the sed script be '\(.\) \(...\) \(.*\).\(.*\).\(.*\).\(.*\)/\1 \2 \3.JohnSmith.customer/', and this would make \1 = 1, \2 = X94, \3 = bperez, \4 = Napvil and \5 = NCC, and we'd ignore \4 and \5. This is actually not the best answer though - just showing it can be done. It's not the best because it's uglier, and also because it's more accepting. It would then do a find and replace on a line like 2 Z12 bperez.a.b.c, which is presumably not what you want. The find query I put in the edit is as specific as possible while still being general enough to suit your tasks.
Another edit!
You know how I said "be as specific as possible"? Due to the . character being special, I wasn't. In fact, I was very generic. The . means "match any character at all," instead of "match a period". Regular expressions are "greedy", matching the most they could, so \(.*\).\(.*\) will always fill the first \(.*\) (which says, "take 0 to many of any character and save it as a match for later") as far as it can.
Try using:
sed 's/1 X94 \(.*\)\.\(.*\).Napvil.NCC/1 X94 \1.JohnSmith.customer/' 251-2.txt
That extra \ acts as an escape sequence, and changes the . from "any character" to "just the period". FYI, since I don't (but should) escape the other periods, technically sed would consider 1 X94 XXXX.StdntZNapvilQNCC as a valid match. Since . means any character, a Z or a Q there would be considered a fit.
The following tutorial helped me
sed - replace substring in file
try the same using a -i prefix to replace in the file directly
sed -i 's/unix/linux/' file.txt

Using sed to remove urls with specific anchor text matches

Trying to parse some spam injection out of a mysql export file, and for some reason this is not working:
sed 's|([^<]*Buy[^<]*)||g'
Which, imo, should match and remove:
Buy Generic Drugs Without Prescription
but for some reason isn't. I can do it in perl no prob, since that supports non-greedy matches, but it is so slow, and since I will probably have to do 7 or 8 passes to get all the different permutations it would be much better if I can get sed to work instead.
Do not forget -r to support extended regexp: sed -r 's|([^<]*Buy[^<]*)||g' or just remove the useless parenthesis (that should be \( and \) without -r)
Are you sure that perl -p -e 's|[^<]*Buy[^<]*||g' is really slower.

capturing groups in sed

I have many lines of the form
ko04062 ko:CXCR3
ko04062 ko:CX3CR1
ko04062 ko:CCL3
ko04062 ko:CCL5
ko04080 ko:GZMA
and would dearly like to get rid of the ko: bit of the right-hand column. I'm trying to use sed, as follows:
echo "ko05414 ko:ITGA4" | sed 's/\(^ko\d{5}\)\tko:\(.*$\)/\1\2/'
which simply outputs the original string I echo'd. I'm very new to command line scripting, sed, pipes etc, so please don't be too angry if/when I'm doing something extremely dumb.
The main thing that is confusing me is that the same thing happens if I reverse the \1\2 bit to read \2\1 or just use one group. This, I guess, implies that I'm missing something about the mechanics of piping the output of echo into sed, or that my regexp is wrong or that I'm using sed wrong or that sed isn't printing the results of the substitution.
Any help would be greatly appreciated!
sed is outputting its input because the substitution isn't matching. Since you're probably using GNU sed, try this:
echo "ko05414 ko:ITGA4" | sed 's/\(^ko[0-9]\{5\}\)\tko:\(.*$\)/\1\2/'
\d -> [0-9] since GNU sed doesn't recognize \d
{} -> \{\} since GNU sed by default uses basic regular expressions.
This should do it. You can also skip the last group and simply use, \1 instead, but since you're learning sed and regex this is good stuff. I wanted to use a non-capturing group in the middle (:? ) but I could not get that to play with sed for whatever reason, perhaps it's not supported.
sed --posix 's/\(^ko[0-9]\{5\}\)\( ko:\)\(.*$\)/\1 \3/g' file > result
And ofcourse you can use
sed --posix 's/ko://'
You don't need sed for this
Here is how you can do it with bash:
var="ko05414 ko:ITGA4"
echo ${var//"ko:"}
${var//"ko:"} replaces all "ko:" with ""
See Manipulating Strings for more info
#OP, if you just want to get rid of "ko:", then
$ cat file
ko04062 ko:CXCR3
ko04062 ko:CX3CR1
ko04062 ko:CCL3
ko04062 ko:CCL5
some text with a legit ko: this ko: will be deleted if you use gsub.
ko04080 ko:GZMA
$ awk '{sub("ko:","",$2)}1' file
ko04062 CXCR3
ko04062 CX3CR1
ko04062 CCL3
ko04062 CCL5
some text with a legit ko: this ko: will be deleted if you use gsub.
ko04080 GZMA
Jsut a note. While you can use pure bash string substitution, its only more efficient when you are changing a single string. If you have a file, especially a big file, using bash's while read loop is still slower than using sed or awk.