Why won't the tab be inserted on the first added line? - sed

I am trying to add multiple lines to a file, all with a leading a tab. The lines should be inserted on the first line after matching a string.
Assume a file with only one line, called "my-file.txt" as follows:
foo
I have tried the following sed command:
sed "/^foo\$/a \tinsert1\n\tinsert2" my-file.txt
This produces the following output:
foo
tinsert1
insert2
Notice how the the tab that should be on the first (inserted) line is omitted. Instead it prints an extra leading 't'.
Why? And how can I change my command to print the tab on the first line, as expected?

With GNU sed:
sed '/^foo$/a \\tinsert1\n\tinsert2' file
<---- single quotes! --->
Produces:
foo
insert1
insert2
From the manual:
a \
text Append text, which has each embedded newline preceded by a backslash.
Since the text to be append itself has to to be preceded by a backslash, it needs to be \\t at the beginning.
PS: If you need to use double quotes around the sed command because you want to inject shell variables, you need to escape the \ which precedes the text to be appended:
ins1="foo"
ins2="bar"
sed "/^foo\$/a \\\t${ins1}\n\t${ins2}" file

sed is for doing s/old/new on individual strings, that is all. Just use awk:
$ awk '{print} $0=="foo"{print "\tinsert1\n\tinsert2"}' file
foo
insert1
insert2
The above will work using any awk in any shell on every UNIX box and is trivial to modify to do anything else you might want to do in future.

Related

SED - replace string newline anything with string newline varable

I have the following content in a file
dhcp_option_domain:
- test.domain
And what I need to do is this:
whenever the value 'dhcp_option_domain:' followed by a newline and then ANY string, replace it with 'dhcp_option_domain:' followed by a newline and a variable.
ie if I set a variable of dhcp_domain="different.com" then then string above would convert to:
dhcp_option_domain:
- different.com
Note that both lines have and need to maintain leading 2 spaces.
I do not want to just do a search and replace on 'test.domain' as I have a few cases to use this and the values could be different each time the sed command is run.
I have tried a few methods such as:
dhcp_domain="something.com"
sed -i 's|dhcp_option_domain:\n.*|dhcp_option_domain:\n - $dhcp_domain|g' filename
however cannot get it to work.
Thanks.
As the manual explains:
sed operates by performing the following cycle on each line of input: first, sed reads one line from the input stream, removes any trailing newline, and places it in the pattern space. Then commands are executed
Your regex (dhcp_option_domain:\n.*) does not match because there is no \n in the pattern space in the first place.
A possible solution:
sed '/dhcp_option_domain:$/{n;c\
- '"$dhcp_domain"'
}'
The /dhcp_option_domain:$/ part is an address. The following command is only executed on lines matching that pattern.
The { } command groups multiple commands into a single block.
The n command prints out the current pattern space and replaces it by the next line of input.
The c\ command replaces the current pattern space by whatever follows in the script. Here it gets a bit tricky. We have:
a literal newline in the sed program (required after c\), then
- (placing those characters in the pattern space literally, then
' (part of shell syntax, terminating the single-quoted part started by sed '...), then
" (starting a double-quoted part), then
$dhcp_domain (which, because it's in a double-quoted part, interpolates the contents of the dhcp_domain shell variable), then
" (terminating the double-quoted part), then
' (starting another single-quoted part), then
a literal newline again (terminating the text after c\), then
} (closing the block started by {).
By default, sed works line by line (using newline character to distinguish newlines)
$ cat ip.txt
foo baz
dhcp_option_domain:
- test.domain
123
dhcp_option_domain:
$ dhcp_domain='something.com'
$ sed '/^ dhcp_option_domain:/{n; s/.*/ - '"$dhcp_domain"'/}' ip.txt
foo baz
dhcp_option_domain:
- something.com
123
dhcp_option_domain:
/^ dhcp_option_domain:/ condition to match
{} to group more than one command to be executed when this condition is satisfied
n get next line
s/.*/ - '"$dhcp_domain"'/ replace it as required - note that shell variables won't be expanded inside single quotes, see sed substitution with bash variables
for details
note that last line in the file didn't trigger the change as there was no further line
tested on GNU sed, syntax might vary for other implementations
From GNU sed manual
n
If auto-print is not disabled, print the pattern space, then,
regardless, replace the pattern space with the next line of input. If
there is no more input then sed exits without processing any more
commands.
This might work for you (GNU sed):
sed '/dhcp_option_domain:$/{p;s// - '"${var}"'/;n;d}' file
Match on dhcp_option_domain:, print it, substitute the new domain name (maintaining indent), print the current line and fetch the next (n) and delete it.

Printing all words that start with "#" using sed in BASH

I have a file with a lot of text, but I want to print only words that contain "#" at the beginning. Ex:
My name is #Laura and I live in #London. Name=#Laura. City=#London
How can I print all words that start with #?.I did this the following and it worked, but I want to do it using sed. I tried several patters, but I cannot make it print anything.
grep -o -E "#\w+" file.txt
Thanks
Use this sed command:
sed 's/[^#]*\(#[^ .]*\)/\1\n/g' file.txt
Explanation: we invoke the substitution command of sed. This has following structure: sed 's/regex/replace/options'. We will search for a regex and replace it using the g option. g makes sure the match is made multiple times per line.
We look for a series of non at chars followed by an # and a number of non-spaces #[^ ]*. We put this last part in a group \(\) and sub it during the replacement \1.
Note that we add a newline at the end of each match, you can also get the output on a single line by omitting the \n.

sed pattern negation with a comma separated line

I have a text file full of lines looking like:
Female,"$0 to $25,000",Arlington Heights,0,60462,ZD111326,9/18/13 0:21,Disk Drive
I am trying to change all of the commas , to pipes |, except for the commas within the quotes.
Trying to use sed (which I am new to)... and it is not working. Using:
sed '/".*"/!s/\,/|/g' textfile.csv
Any thoughts?
As a test case, consider this file:
Female,"$0 to $25,000",Arlington Heights,0,60462,ZD111326,9/18/13 0:21,Disk Drive
foo,foo,"x,y,z",foo,"a,b,c",foo,"yes,no"
"x,y,z",foo,"a,b,c",foo,"yes,no",foo
Here is a sed command to replace non-quoted commas with pipe symbols:
$ sed -r ':a; s/^([^"]*("[^"]*"[^"]*)*),/\1|/g; t a' file
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
foo|foo|"x,y,z"|foo|"a,b,c"|foo|"yes,no"
"x,y,z"|foo|"a,b,c"|foo|"yes,no"|foo
Explanation
This looks for commas that appear after pairs of double quotes and replaces them with pipe symbols.
:a
This defines a label a.
s/^([^"]*("[^"]*"[^"]*)*),/\1|/g
If 0, 2, 4, or any an even number of quotes precede a comma on the line, then replace that comma with a pipe symbol.
^
This matches at the start of the line.
(`
This starts the main grouping (\1).
[^"]*
This looks for zero or more non-quote characters.
("[^"]*"[^"]*)*
The * outside the parens means that we are looking for zero or more of the pattern inside the parens. The pattern inside the parens consists of a quote, any number of non-quotes, a quote and then any number on non-quotes.
In other words, this grouping only matches pairs of quotes. Because of the * outside the parens, it can match any even number of quotes.
)
This closes the main grouping
,
This requires that the grouping be followed by a comma.
t a
If the previous s command successfully made a substitution, then the test command tells sed to jump back to label a and try again.
If no substitution was made, then we are done.
using awk could be eaiser:
kent$ cat f
foo,foo,"x,y,z",foo,"a,b,c",foo,"yes,no"
Female,"$0 to $25,000",Arlington Heights,0,60462,ZD111326,9/18/13 0:21,Disk Drive
kent$ awk -F'"' -v OFS='"' '{for(i=1;i<=NF;i++)if(i%2)gsub(",","|",$i)}7' f
foo|foo|"x,y,z"|foo|"a,b,c"|foo|"yes,no"
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
I suggest a language with a proper CSV parser. For example:
ruby -rcsv -ne 'puts CSV.generate_line(CSV.parse_line($_), :col_sep=>"|")' file
Female|$0 to $25,000|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
Here I would have used gnu awks FPAT. It define how a field looks like FS that tells what the separator is. Then you can just set the output separator to |
awk '{$1=$1}1' OFS=\| FPAT="([^,]+)|(\"[^\"]+\")" file
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
If your awk does not support FPAT, this can be used:
awk -F, '{for (i=1;i<NF;i++) {c+=gsub(/\"/,"&",$i);printf "%s"(c%2?FS:"|"),$i}print $NF}' file
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
sed 's/"\(.*\),\(.*\)"/"\1##HOLD##\2"/g;s/,/|/g;s/##HOLD##/,/g'
This will match the text in quotes and put a placeholder for the commas, then switch all the other commas to pipes and put the placeholder back to commas. You can change the ##HOLD## text to whatever you want.

Matching strings even if they start with white spaces in SED

I'm having issues matching strings even if they start with any number of white spaces. It's been very little time since I started using regular expressions, so I need some help
Here is an example. I have a file (file.txt) that contains two lines
#String1='Test One'
String1='Test Two'
Im trying to change the value for the second line, without affecting line 1 so I used this
sed -i "s|String1=.*$|String1='Test Three'|g"
This changes the values for both lines. How can I make sed change only the value of the second string?
Thank you
With gnu sed, you match spaces using \s, while other sed implementations usually work with the [[:space:]] character class. So, pick one of these:
sed 's/^\s*AWord/AnotherWord/'
sed 's/^[[:space:]]*AWord/AnotherWord/'
Since you're using -i, I assume GNU sed. Either way, you probably shouldn't retype your word, as that introduces the chance of a typo. I'd go with:
sed -i "s/^\(\s*String1=\).*/\1'New Value'/" file
Move the \s* outside of the parens if you don't want to preserve the leading whitespace.
There are a couple of solutions you could use to go about your problem
If you want to ignore lines that begin with a comment character such as '#' you could use something like this:
sed -i "/^\s*#/! s|String1=.*$|String1='Test Three'|g" file.txt
which will only operate on lines that do not match the regular expression /.../! that begins ^ with optional whiltespace\s* followed by an octothorp #
The other option is to include the characters before 'String' as part of the substitution. Doing it this way means you'll need to capture \(...\) the group to include it in the output with \1
sed -i "s|^\(\s*\)String1=.*$|\1String1='Test Four'|g" file.txt
With GNU sed, try:
sed -i "s|^\s*String1=.*$|String1='Test Three'|" file
or
sed -i "/^\s*String1=/s/=.*/='Test Three'/" file
Using awk you could do:
awk '/String1/ && f++ {$2="Test Three"}1' FS=\' OFS=\' file
#String1='Test One'
String1='Test Three'
It will ignore first hits of string1 since f is not true.

How to use sed to delete lines which have certain pattern and at some specific line range?

This is the command working for me:
sed "50,99999{/^\s*printf/d}" the_file
So this command delete all the lines between 50 and 99999 which have "printf" in it and there is only whitespace before printf at the line.
Now my questions are:
how to replace 99999 with some meta symbol to indicate the real line number
I tried sed "50,${/^\s*PUTS/d}" the_file, but it is not right.
how to replace "printf" with an environment variable? I tried
set pattern printf
sed "50,99999{/^\s*$pattern/d}" the_file
but it is not right.
Assuming a Bourne-like shell such as bash:
Simply define shell variables and splice them into your sed command string:
endLine=99999
pattern='printf'
sed '50,'"$endLine"'{ /^\s*'"$pattern"'/d; }' the_file
Note that the static parts of the sed command strings are single-quoted, as that protects them from interpretation by the shell (which means you needn't quote $ and `, for instance).
You can put everything into a single double-quoted string so as to be able to embed variable references directly, but distinguishing between what the shell will interpret up front and what sed will interpret can get confusing quickly.
That said, using a double-quoted string for the case at hand is simple:
sed "50,$endLine { /^\s*$pattern/d; }" the_file
sed "50,${/^\s*PUTS/d}" the_file
this line won't work, because you used double quotes, and you need escape the dollar: \$ or use single quote: '50,${/.../d}' file
sed "50,99999{/^\s*$pattern/d}" file
this line should work.
EDIT
wait, I just noticed that you set env var via set... this is not correct if you were with Bash. you should use export PAT="PUT" in your script.
check #Jonathan and #tripleee 's comments