sed: get a line number with regex and insert text at that line

I want to find the first line of a file that is not commented out with a hash, then append a line of text just after that line.
I managed to get the number of the line:
sed -n '/^\s*#/!{=;q}' file # prints 2
and also to insert text (specifying the line manually):
sed '2 a extralinecontent' file
I can't get them working together as a one liner or in a batch.
I tried command substitution (with $(command) and also with backticks) but I get an error from bash:
sed '$(sed -n '/^\s*#/!{=;q}' file) a extralinecontent' file
-bash: !{=: event not found
and also tried many other combinations, but no luck.
I'm using gnu-sed (via brew) on macOS.

This might work for you (GNU sed):
sed -e '/^\s*#/b;a extra line content' -e ':a;n;ba' file
Bail out of any lines beginning with a comment at the beginning of the file, append an extra line following the first line that is not a comment and keep fetching/printing all the remaining lines of the file.
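To see the command above in action, here is a small sketch. It assumes GNU sed (for \s and the one-line form of a); the file name sample.txt and its contents are made up for illustration.

```shell
# Hypothetical sample file: two comment lines, then two data lines.
tmp=$(mktemp -d)
printf '%s\n' '#c1' '  #c2' 'first' 'second' > "$tmp/sample.txt"

# Comment lines branch to the end of the script and are printed unchanged.
# The first other line queues the appended text; the n loop then prints
# the remaining lines without running the first expression again.
result=$(sed -e '/^\s*#/b;a extra line content' -e ':a;n;ba' "$tmp/sample.txt")
printf '%s\n' "$result"
rm -r "$tmp"
```

The extra line should land between first and second, and nowhere else.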

Here's a way to do it with GNU sed without reading the file twice
$ cat ip.txt
#comment
foo baz good
123 456 7889
$ sed -e '0,/^\s*[^#[:space:]]/ {// a XYZ' -e '}' ip.txt
#comment
foo baz good
XYZ
123 456 7889
GNU sed allows the first address to be 0 if the other address is a regex; that way this works even if the first line matches the condition
/^\s*[^#[:space:]]/ since sed doesn't support possessive quantifiers, the regex has to ensure that the first character matched by the character class is neither a # nor a whitespace character
// is a handy shortcut to repeat the last regex
a XYZ appends your required line (note that your question mentions insert, so if you want that, use i instead of a)
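For comparison, the two-pass idea from the question can also be made to work. In the original attempt the inner single quotes end the outer string, which leaves !{= unquoted for bash history expansion, and single quotes would stop $( ) from expanding anyway. A minimal sketch, assuming GNU sed and the same ip.txt as above (the variable names n and two_pass are only illustrative):

```shell
tmp=$(mktemp -d)
printf '%s\n' '#comment' 'foo baz good' '123 456 7889' > "$tmp/ip.txt"

# Pass 1: number of the first line that does not start with optional
# whitespace followed by #.
n=$(sed -n '/^\s*#/!{=;q}' "$tmp/ip.txt")

# Pass 2: double quotes let the shell expand $n before sed sees the script.
two_pass=$(sed "${n} a XYZ" "$tmp/ip.txt")
printf '%s\n' "$two_pass"
rm -r "$tmp"
```

This reads the file twice, and unlike the 0,/regex/ version it treats a blank line as the first non-comment line, so the two approaches can differ on files that start with blank lines.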

Related

sed replace 1 line in file with all lines in file

Lets say I have a line #SYM
I need to replace it with all lines from file1.txt
Is it possible to do that with sed?
I have tried sed 's/#SYM/file1.txt/' updater
But that doesn't work, because I need to load file1.txt as string, and I do not know how to do that.
EDIT: I believe that there could be a way to do it in a shell script somehow.
EDIT2: I also just tried this:
#!/bin/bash
value=$(<tools/symlink)
sed -i 's/#SYM/$value/' META-INF/com/google/android/updater-script
Use r command:
sed -e '/#SYM/ {r tools/symlink' -e 'd}' META-INF/com/google/android/updater-script
/#SYM/ {r tools/symlink if a line contains #SYM, append the contents of tools/symlink
d} then delete the matching line
the two commands are separated using -e option because everything after r is considered as part of filename
Add the -i option once you are satisfied that it is working
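A quick way to check the r and d combination before using -i is to run it on throwaway files. This sketch assumes GNU sed; the file names and contents are invented stand-ins for updater-script and tools/symlink.

```shell
tmp=$(mktemp -d)
# Hypothetical stand-ins for the real files.
printf '%s\n' 'before' '#SYM' 'after' > "$tmp/updater"
printf '%s\n' 'link one' 'link two' > "$tmp/symlink"

# r queues the file contents for output at the end of the cycle,
# d drops the #SYM line itself.
replaced=$(sed -e "/#SYM/ {r $tmp/symlink" -e 'd}' "$tmp/updater")
printf '%s\n' "$replaced"
rm -r "$tmp"
```

The #SYM line should be gone and the two lines of the second file should sit in its place.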

Sed Process Substitution on Insert - Without Backslashes

I have a function that prints a header that needs to be applied across several files, but if I utilize a sed process substitution the lines prior to the last have a backslash \ on them.
E.g.
function print_header() {
cat << EOF
-------------------------------------------------------------------
$(date '+%B %d, %Y # ~ %r') ID:$(echo $RANDOM)
EOF
}
If I then take a file such as test.txt:
line 1
line 2
line 3
line 4
line 5
sed "1 i $(print_header | sed 's/$/\\/g')" test.txt
I get:
-------------------------------------------------------------------\
November 24, 2015 # ~ 11:18:28 AM ID:13187
line 1
line 2
line 3
line 4
line 5
Notice the troublesome backslash at the end of the first line, I'd like to not have that backslash appear. Any ideas?
I would use cat for that:
cat <(print_header) file > file_with_header
This behavior depends on the sed dialect. Unfortunately, it's one of the things which depends on which version you have.
To simplify debugging, try specifying verbatim text. Here's one from a Debian system.
vnix$ sed '1i\
> foo\
> bar' <<':'
> hello
> goodbye
> :
foo
bar
hello
goodbye
Your diagnostics appear to indicate that your sed dialect does not in fact require the backslash after the first i.
Since you are generating the contents of the header programmatically anyway, my recommended solution would be to refactor the code so that you can avoid this conundrum. If you don't want cat - test.txt <<EOF then maybe experiment with sed '1r /dev/stdin' test.txt <<EOF (I could not get 1r- to work, but /dev/stdin should be portable to any Linux). Note that 1r places the text after the first line, not before it.
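As a sketch of the refactoring suggested here, the header can be piped straight into cat, so no sed escaping is involved at all. The print_header below is a simplified stand-in with fixed text (the original prints the date and $RANDOM), and the variable name with_header is only illustrative.

```shell
# Stand-in for the question's print_header, with fixed text so the
# output is reproducible.
print_header() {
cat << EOF
-----
header line two
EOF
}

tmp=$(mktemp -d)
printf '%s\n' 'line 1' 'line 2' > "$tmp/test.txt"

# "-" makes cat read the piped header first, then the file.
with_header=$(print_header | cat - "$tmp/test.txt")
printf '%s\n' "$with_header"
rm -r "$tmp"
```

To update a file, write the result to a new file and move it over the original rather than redirecting onto the file being read.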
Here is my kludgy fix, if you can find something more elegant I'll gladly credit you:
sed "1 i $(print_header | sed 's/$/\\/g;$s/$/\x01/')" test.txt | tr -d '\001'
This puts an unprintable SOH (\x01) ascii Start Of Header character after the inserted text, that precludes the backslashes and then I run it over tr to delete the SOH chars.

Using sed to keep the beginning of a line

I have a file in which some lines start by a >
For these lines, and only these ones, I want to keep the first eleven characters.
How can I do that using sed ?
Or maybe something else is better ?
Thanks !
Muriel
Let's start with this test file:
$ cat file
line one with something or other
>1234567890abc
other line in file
To keep only the first 11 characters of lines starting with > while keeping all other lines:
$ sed -r '/^>/ s/(.{11}).*/\1/' file
line one with something or other
>1234567890
other line in file
To keep only the first eleven characters of lines starting with > and deleting all other lines:
$ sed -rn '/^>/ s/(.{11}).*/\1/p' file
>1234567890
The above was tested with GNU sed. For BSD sed, replace the -r option with -E.
Explanation:
/^>/ is a condition. It means that the command which follows only applies to lines that start with >
s/(.{11}).*/\1/ is a substitution command. It replaces the whole line with just the first eleven characters.
-r turns on extended regular expression format, eliminating the need for some escape characters.
-n turns off automatic printing. With -n in effect, lines are only printed if we explicitly ask them to be printed. In the second case above, that is done by adding a p after the substitute command.
Other forms:
$ sed -r 's/(>.{10}).*/\1/' file
line one with something or other
>1234567890
other line in file
And:
$ sed -rn 's/(>.{10}).*/\1/p' file
>1234567890
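If neither -r nor -E is available, the same substitution can be written with basic regular expressions, which should work in any POSIX sed. A small sketch using the test file from above (the variable name kept is only for illustration):

```shell
tmp=$(mktemp -d)
printf '%s\n' 'line one with something or other' '>1234567890abc' 'other line in file' > "$tmp/file"

# \( \) and \{ \} are the basic-regex spellings of ( ) and { }.
kept=$(sed '/^>/ s/^\(.\{11\}\).*/\1/' "$tmp/file")
printf '%s\n' "$kept"
rm -r "$tmp"
```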

Simple SED statement to truncate everything after first instance of underscore AFTER # sign

Input:
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
Required Output:
ab_cd#yahoo.co.uk
ef_gh#gmail.com
I'm looking for a SED statement to do the above.
Essentially I would like everything from and including the first underscore character AFTER the # sign to be removed from the output.
I'm sorry, I only have basic knowledge of programming. I'm on a Windows machine (I've found a SED editor) and use it to modify simple strings in a batch file from the Windows shell.
Many thanks
give this a try:
sed 's/_[^#]*$//' file
it worked here with your input:
kent$ cat f
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
kent$ sed 's/_[^#]*$//' f
ab_cd#yahoo.co.uk
ef_gh#gmail.com
This can be a way:
$ sed -r 's/(.*#[^_]*).*/\1/' file
ab_cd#yahoo.co.uk
ef_gh#gmail.com
It catches all the text before #, after it and up to _. Then, it prints it back, getting rid of everything coming from _.
Explanation
Matching group: given a sed 's/something/back/' command, whatever you "catch" in the something part can be enclosed in parentheses so that you can refer back to it with \1 (first group), \2 (second group), and so on up to \9.
$ cat file
hello33bye
hello44goodbye
hello55yeah
$ sed -r 's/hello([0-9]*).*/\1/' file
33
44
55
So (.*#[^_]*).* means: capture a block of text up to and including #, followed by any characters other than _. Then match the rest of the text.
Finally, print the captured block back.
To be sure we are not matching a _ within the domain:
sed -r 's/(.*#[^\.]*[^_]*).*/\1/' file
^^^^^^
catch a dot before catching an underscore
Test
$ cat a
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
aaa#gma_il.com_ST_BB
aaa#gma_il.com
$ sed -r 's/(.*#[^\.]*[^_]*).*/\1/' a
ab_cd#yahoo.co.uk
ef_gh#gmail.com
aaa#gma_il.com
aaa#gma_il.com
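The difference between the two sed answers only shows up when the domain itself contains an underscore. A short sketch comparing them on one ordinary line and one such line; -E is used here instead of -r because both GNU and BSD sed accept it, and the variable names are only illustrative.

```shell
input='ab_cd#yahoo.co.uk_DN_135.PNG
aaa#gma_il.com_ST_BB'

# First answer: cut from the first underscore that has no # after it.
simple=$(printf '%s\n' "$input" | sed 's/_[^#]*$//')

# Dot-aware version: run to the first dot after the #, then stop at
# the next underscore.
dot_aware=$(printf '%s\n' "$input" | sed -E 's/(.*#[^\.]*[^_]*).*/\1/')

printf 'simple:\n%s\n' "$simple"        # second line becomes aaa#gma
printf 'dot_aware:\n%s\n' "$dot_aware"  # second line becomes aaa#gma_il.com
```

Both give ab_cd#yahoo.co.uk for the first line, so the simpler command is enough when domains never contain underscores.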
Here is an awk version. (awk is normally easier to understand than sed.)
cat file
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
test#home_net.com_BO_22.jpg
awk -F\. '{NF--;split($NF,a,"_");$NF=a[1]}1' OFS=\. file
ab_cd#yahoo.co.uk
ef_gh#gmail.com
test#home_net.com
It removes the last field when the line is split on ., then splits the new last field on _ and replaces it with the first part.

sed + remove "#" and empty lines with one sed command

How can I remove comment lines (such as # bla bla) and empty lines (lines without characters) from a file with one sed command?
THX
lidia
If you're worried about starting two sed processes in a pipeline for performance reasons, you probably shouldn't be, it's still very efficient. But based on your comment that you want to do in-place editing, you can still do that with distinct commands (sed commands rather than invocations of sed itself).
You can either use multiple -e arguments or separate commands with a semicolon, something like (just one of these, not both):
sed -i -e 's/#.*$//' -e '/^$/d' fileName
sed -i 's/#.*$//;/^$/d' fileName
The following transcript shows this in action:
pax> printf 'Line # with a comment\n\n# Line with only a comment\n' >file
pax> cat file
Line # with a comment

# Line with only a comment
pax> cp file filex ; sed -i 's/#.*$//;/^$/d' filex ; cat filex
Line
pax> cp file filex ; sed -i -e 's/#.*$//' -e '/^$/d' filex ; cat filex
Line
Note how the file is modified in-place even with two -e options. You can see that both commands are executed on each line. The line with a comment first has the comment removed then all is removed because it's empty.
In addition, the original empty line is also removed.
paxdiablo has a good answer, but it can be improved.
(1) The '/^$/d' clause only matches 100% blank lines.
If you want to also match lines that are entirely whitespace (spaces, tabs etc.) use this instead:
'/^\s*$/d'
(2) The 's/#.*$//' clause strips everything from the first # onward, wherever it appears in the line, so it also truncates lines that carry a trailing comment or a # inside data.
If you only want to delete lines whose first non-whitespace character is #, use this instead:
'/^\s*#.*$/d'
The above criteria may not be universal (e.g. within a HEREDOC block, or in a Python multi-line string the different approaches could be significant), but in many cases the conventional definition of "blank" lines include whitespace-only, and "comment" lines include whitespace-then-#.
(3) Lastly, on OSX at least, paxdiablo's solution, in which the first clause turns comment lines into blank lines and the second clause strips blank lines (including what were originally comments), doesn't work. It seems to be more portable to make both clauses /d delete actions, as I've done.
The revised command incorporating the above is:
sed -e '/^\s*#.*$/d' -e '/^\s*$/d' inputFile
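A small sketch of what the revised command keeps and drops. It uses [[:space:]] in place of \s, which matches the same characters in GNU sed and should also be accepted by seds that lack \s; the sample contents of inputFile are made up.

```shell
tmp=$(mktemp -d)
# Hypothetical input: a trailing comment, an indented comment,
# a whitespace-only line, an empty line, and a plain line.
printf '%s\n' 'key = 1  # trailing comment' '   # indented comment' '   ' '' 'other' > "$tmp/inputFile"

filtered=$(sed -e '/^[[:space:]]*#.*$/d' -e '/^[[:space:]]*$/d' "$tmp/inputFile")
printf '%s\n' "$filtered"
rm -r "$tmp"
```

Only whole-line comments and blank lines are deleted; the trailing comment on the first line is left alone.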
This tiny jewel removes all # comments, no matter where they begin in a line (see caution below):
sed -e 's/\s*#.*$//'
Example:
text="
this is a # test
#this is a test
#this is a #test
this is # another #test
"
$ echo "$text" | sed -e 's/\s*#.*$//'
this is a
this is
Next this removes any resulting blank lines:
$ echo "$text" | sed -e 's/\s*#.*$//' | sed -e '/^\s*$/d'
Caution: Depending on the syntax and/or interpretation of the lines you're processing, this might not be an appropriate solution, as it blindly removes the end of every line from the first '#' onward, even if the '#' is part of your data or code. However, for use cases where a hash only ever starts an end-of-line comment, it works fine. So just as with all coding, context must be taken into consideration.
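The two piped sed calls can also be folded into a single invocation with two -e expressions, which is what the question asked for. A minimal sketch with the same sample text, using [[:space:]] rather than \s so it does not depend on GNU sed (the variable name one_pass is only illustrative):

```shell
text="
this is a # test
#this is a test
#this is a #test
this is # another #test
"

# First expression strips the comment and the whitespace before it,
# second expression deletes lines that are now empty or whitespace-only.
one_pass=$(printf '%s\n' "$text" | sed -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d')
printf '%s\n' "$one_pass"
```

The same caution applies: any # is treated as the start of a comment.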
Alternative variant, using grep (note that this drops every line that contains a # anywhere, not just the comment part):
grep -Ev '(#.*$)|(^$)' file.txt
You can use awk to print only lines that are non-blank and do not start with a comment:
awk 'NF && !/^[ \t]*#/' file
The first example (paxdiablo's) is very good. To change the file in place instead of just printing the result, keep the -i option:
sudo sed -i 's/#.*$//;/^$/d' inputFile
On (one of) my linux boxes, sed understands extended regular expressions with the -r option, so:
sed -r '/(^\s*#)|(^\s*$)/d' squid.conf.installed
is very useful for showing all non-blank, non comment lines.
The regex matches either start of line followed by zero or more spaces or tabs followed by either a hash or end of line, and deletes those matching lines from the input.