sed - what is this curly brace notation called?

I just found this:
sed '/label/{n;n;s/{}/{some comment}/;}'
The intended effect is to seek label, proceed 2 lines down (n;n;) then substitute in (s) some comment.
This is an amazing capability I never knew sed had.
Would someone be kind enough to specify the name of this curly brace notation, and the name of the class of operators inside the braces?

Curly braces group several commands so that they are all executed for the same address or address range. Here, you specify an address (one or two line numbers or patterns) and then apply the group of commands to the matching lines.
The n command is nothing special, and it's documented in the man page, as well as in the linked document. I'm not sure if there's a general name for it.
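A small sketch of the grouping in action, using a made-up four-line input (the sample text is an assumption, not from the question). The braces make n;n;s/// run only when a line matches /label/:

```shell
# Hypothetical sample input: a "label" line, one filler line, then two lines containing {}.
input='label
line a
value {}
other {}'

# On the line matching /label/: each n prints the current line and loads the next one,
# then s/// edits whatever line is now in the pattern space.
grouped=$(printf '%s\n' "$input" | sed '/label/{n;n;s/{}/{some comment}/;}')
printf '%s\n' "$grouped"
```

Only the line two below label should change; the later "other {}" line is outside the group's reach and stays as it was.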
From man sed:
n N Read/append the next line of input into the pattern space.
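To see the difference between the two, here is a minimal sketch on a two-line input (the input is made up for illustration). -n turns off automatic printing so only the explicit p prints:

```shell
# n replaces the pattern space with the next line, so p prints only the second line.
with_n=$(printf 'a\nb\n' | sed -n 'n;p')

# N appends the next line, leaving an embedded newline that s/// can match as \n.
with_N=$(printf 'a\nb\n' | sed -n 'N;s/\n/+/;p')

printf '%s\n%s\n' "$with_n" "$with_N"
```

The first command should print b and the second a+b.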

Related

Replace words but only after a colon

I have been researching this for quite some time but cannot seem to find an answer. Perhaps someone here can help.
I am trying to use sed to replace words in yml / yaml files. Since some of the words are included in the names I want to only replace words that appear after the colon (':').
For example. If the .yml file includes:
en:
label_some_tracker: A tracker
label_all_tracker: All trackers
label_attachment_type_trackers: Select trackers.
tracker_plural: trackers
and I want to replace all occurrences of tracker with issue in all values. The pattern:
s/tracker/issue/
also changes the names of the fields, which breaks my code.
I can reduce the size of the problem somewhat by including terms for all possible variants of a word. For example:
s/trackers/issues/
s/tracker/issue/
but that doesn't deal with all situations.
I have tried inserting a space before the search term:
s/ tracker/ issue/
but that matches names where the search term is at the beginning of the line.
If I search for whole words then it still seems to pick up the names because ':' and '_' are 'non word' characters.
If I put spaces at the beginning and end of the search term, it misses words that are at the end of a line or words with punctuation marks before the trailing space.
The only sure way seems to be to only replace words after a colon (':') but I cannot seem to figure out how to do that with sed.
Does anyone here know how?
With GNU sed:
sed -E 's/(:.*)tracker/\1issue/g' file
Output:
en:
label_some_tracker: A issue
label_all_tracker: All issues
label_attachment_type_trackers: Select issues.
tracker_plural: issues
Or replace only the second occurrence on each line:
sed 's/tracker/issue/2' file
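One caveat with the (:.*) approach: .* is greedy, so each pass only changes the last tracker after the colon, and the g flag does not help because the first match already consumes the rest of the line. If a value can contain the word more than once, a loop is one possible workaround. This is a sketch on a made-up line, not taken from the question's file:

```shell
# Hypothetical line: the key contains "tracker" and the value contains it twice.
line='tracker_note: tracker and trackers'

# Single pass: the greedy (:.*) reaches the last "tracker" after the colon, so only that one changes.
single=$(printf '%s\n' "$line" | sed -E 's/(:.*)tracker/\1issue/')

# Loop: "ta" jumps back to label "a" for as long as the substitution keeps succeeding.
looped=$(printf '%s\n' "$line" | sed -E -e ':a' -e 's/(:.*)tracker/\1issue/' -e 'ta')

printf '%s\n%s\n' "$single" "$looped"
```

The loop ends because the replacement text issue does not itself contain tracker. The key before the colon is never touched, since nothing before the first colon can follow a colon.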

What is \n\nnd supposed to do?

echo n | sed '\n\nnd'
This command prints n with GNU sed. With BSD sed, it doesn't print anything.
The POSIX sed spec. says:
In a context address, the construction \cBREc, where c is any character other than <backslash> or <newline>, shall be identical to /BRE/. If the character designated by c appears following a <backslash>, then it shall be considered to be that literal character, which shall not terminate the BRE. For example, in the context address \xabc\xdefx, the second x stands for itself, so that the BRE is abcxdef.
The escape sequence \n shall match a <newline> embedded in the pattern space. A literal <newline> shall not be used in the BRE of a context address or in the substitute function.
but doesn't elaborate any further on these contradictory statements.
So my question is, which behavior is correct? Or is it intentionally left unspecified?
There's an update; with this commit GNU sed no longer prints n for the command in OP.
According to a reply to my email on the Austin Group mailing list (quoted below), the standard is unclear on this, and both behaviors are correct. HP-UX and Solaris adopted the GNU behavior too; so it's not a bug in implementations, but a lack of clarity in the standard.
Neither is more correct than the other because, as you said yourself, the standard is unclear. A formal interpretation would say "The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this."
Given that implementations differ, we should probably make the behaviour explicitly unspecified.
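For contrast, the unambiguous part of the quoted rule can be tried directly. This sketch uses the spec's own \xabc\xdefx example against two made-up input lines:

```shell
# With x as the delimiter, \x inside the pattern is a literal x, so the BRE is "abcxdef".
matched=$(printf 'abcxdef\nabcdef\n' | sed -n '\xabc\xdefxp')
printf '%s\n' "$matched"

# The contested case from the question; what it prints depends on the sed implementation and version.
# echo n | sed '\n\nnd'
```

Only the line containing abcxdef should be printed. The trouble starts when the delimiter is n, because \n then has two competing meanings.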

Adding a comment character in most simple possible way

I want to search a file for a specific string and then place a comment at the beginning of that string. But I need an answer that avoids regex, global changes, and all the other fancy stuff.
I wrote this line:
sed -i.bak '/PermitRootLogin no/# PermitRootLogin no/' ./sshd_config
but I get an error:
sed: -e expression #1, char 21: comments don't accept any addresses
I assume the issue is that I need to escape the # character, but I'm not finding any resources on how to do that, or even mentioning it. I've tried various combinations of putting ^ or \ or \^ in front of the # but I'm just not getting it right.
Please note I am intentionally repeating the text to be replaced. I would like the most simple possible solution to this question: how to replace "XYZ" with "# XYZ" in the most obvious possible way.
As indicated in the comments by #mlt, you could try adding an s at the beginning of your sed command. Straight from his comment:
s/PermitRootLogin....
I see that you said you're intentionally repeating the text to be replaced. If by that you mean you want it to be the same, maybe consider grouping your matched text. I understand you may have meant that you just want it hand typed. Anyway, here is how to match the grouped text and add the comment character:
s/\(PermitRootLogin\)/# \1/
The escaped parentheses mark the matched text as a group (sed uses basic regular expressions by default, so unescaped parentheses would match literal parentheses), and \1 puts that matched group into the replacement.
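For the literal, hand-typed version asked for in the question, here is a sketch that runs on a sample line through a pipe instead of editing sshd_config in place (the sample input is an assumption). The second form uses &, which sed expands to the whole matched text:

```shell
# The leading s makes this a substitute command; # needs no escaping in the replacement.
typed=$(printf 'PermitRootLogin no\n' | sed 's/PermitRootLogin no/# PermitRootLogin no/')

# Same result without retyping the text: & stands for everything the pattern matched.
with_amp=$(printf 'PermitRootLogin no\n' | sed 's/PermitRootLogin no/# &/')

printf '%s\n%s\n' "$typed" "$with_amp"
```

Both commands should print "# PermitRootLogin no". The original error came from the missing s: without it, sed read /PermitRootLogin no/ as an address and # as the start of a comment.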
I hope this was helpful. Happy coding! Leave a comment if you have any questions.

Sed for partial replacement?

Imagine I have a file that has the following type of line:
FIXED_DATA1 VARIABLE_DATA FIXED_DATA2
I want to change the fixed data and leave the variable data as is. For various reasons, using two sed operations to replace the fixed data will not work. For instance, the fixed fields might be double quotes, and the line may contain double quotes elsewhere, so the regex really has to match a pattern spanning both the variable data and the fixed data.
If I'm bent on using sed, is there a way to change both fixed data fields at once while leaving the variable field unchanged?
Thanks.
You need to partition the line into the three pieces, replace the outer two and leave the middle alone:
sed 's/^FIX1 \(.*\) FIX2$/New \1 End/'
You can make the beginning and end matches more complex as needed.
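As a concrete sketch of the double-quote case mentioned in the question, assuming the fixed fields are the outermost quotes and that they should become square brackets (both the input and the brackets are made up for illustration):

```shell
# The anchors ^ and $ pin the outer quotes; the captured middle keeps its own quotes untouched.
bracketed=$(printf '"say "hi" now"\n' | sed 's/^"\(.*\)"$/[\1]/')
printf '%s\n' "$bracketed"
```

This should print [say "hi" now]: one substitution changes both ends at once and leaves the inner quotes alone.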

Removing a trailing Space from Regex Matched group

I'm using the regular expression library icucore via RegKit on the iPhone to replace a pattern in a large string.
The pattern I'm looking for looks something like this:
| hello world (P1)|
I'm matching this pattern with the following regular expression
\|((\w*|.| )+)\((\w\d+)\)\|
This splits the input string into 3 groups when a match is found, of which group 1 (the string) and group 3 (the string in parentheses) are of interest to me.
I'm converting these formatted strings into HTML links, so the above would be transformed into
Hello world
My problem is the trailing space in the first group. When the link is highlighted and underlined, the underline extends beyond the printed characters.
While I know I could extract all the matches and process them manually, using the search-and-replace feature of the ICU library is a much cleaner solution, so I would rather not do that.
Many thanks as always
Would the following work as an alternate regular expression?
\|((\w*|.| )+)\s+\((\w\d+)\)\|
Inserting the extra \s+ pulls the space outside the 1st group.
Though, given your example & regex, I'm not sure why you don't just do:
\|(.+)\s+\((\w\d+)\)\|
Which will have the same effect. However, both your original regex and my simpler one would fail on:
| hello world (P1)| and on the same line | howdy world (P1)|
where it would roll it up into 1 match.
\|\s*([\w ,.-]+)\s+\((\w\d+)\)\|
will put the trailing space(s) outside the capturing group. This will of course only work if there always is a space. Can you guarantee that?
If not, use
\|\s*([\w ,.-]+(?<!\s))\s*\((\w\d+)\)\|
This uses a lookbehind assertion to make sure the capturing group ends in a non-space character.
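The idea of moving the whitespace outside the capture can be tried quickly from a shell. Note that this is GNU sed with -E, not ICU, so it is only an approximation: POSIX bracket expressions replace [\w ,.-], [0-9] replaces \d, \s and \w are GNU extensions, and the lookbehind variant is not available at all. The <text|id> output format is made up for illustration.

```shell
# Sample line from the question.
line='| hello world (P1)|'

# \s+ sits outside group 1, so the space before "(" cannot end up in the captured text.
linked=$(printf '%s\n' "$line" | sed -E 's/\|\s*([[:alnum:]_ ,.-]+)\s+\((\w[0-9]+)\)\|/<\1|\2>/')
printf '%s\n' "$linked"
```

Group 1 should now end at "world" with no trailing space, and group 2 should hold P1.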