How do I join the previous line with the current line with sed?

I have a file with the following content.
test1
test2
test3
test4
test5
If I want to concatenate all lines into one line separated by commas, I can use vi and run the following command:
:%s/\n/,/g
I then get this, which is what I want
test1,test2,test3,test4,test5,
I'm trying to use sed to do the same thing but I'm missing some unknown command/option to make it work. When I look at the file in vi and search for "\n" or "$", it finds the newline or end of line. However, when I tell sed to look for a newline, it pretends it didn't find one.
$ cat test | sed --expression='s/\n/,/g'
test1
test2
test3
test4
test5
$
If I tell sed to look for end of line, it finds it and inserts the comma but it doesn't concatenate everything into one line.
$ cat test | sed --expression='s/$/,/g'
test1,
test2,
test3,
test4,
test5,
$
What command/option do I use with sed to make it concatenate everything into one line and replace the end of line/newline with a comma?

sed reads one line at a time, so, unless you're doing tricky things, there's never a newline to replace.
Here's the trickiness:
$ sed -n '1{h; n}; H; ${g; s/\n/,/gp}' test.file
test1,test2,test3,test4,test5
h, H, g documented at https://www.gnu.org/software/sed/manual/html_node/Other-Commands.html
With a non-GNU sed, such as the one shipped with macOS, a semicolon is needed before each closing brace, e.g. '1{h; n;}; H; ${g; s/\n/,/gp;}'.
However, paste is really the tool for this job
$ paste -s -d, test.file
test1,test2,test3,test4,test5
If you really want the trailing comma:
printf '%s,\n' "$(paste -sd, file)"
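As a quick check that the two approaches agree, the sketch below runs both on the sample data from the question. The sed command is written with a semicolon before each closing brace, so it should also be accepted by a non-GNU sed; the variable names are only for illustration.

```shell
# Sample input from the question, fed through a pipe instead of a file.
input=$(printf '%s\n' test1 test2 test3 test4 test5)

# Hold-space approach: store line 1, append the rest, substitute on the last line.
joined_sed=$(printf '%s\n' "$input" | sed -n '1{h;n;}; H; ${g;s/\n/,/g;p;}')

# paste approach: -s joins all lines serially, -d, sets the delimiter,
# and - tells paste to read standard input.
joined_paste=$(printf '%s\n' "$input" | paste -s -d, -)

printf '%s\n%s\n' "$joined_sed" "$joined_paste"
```

Both variables should hold the same comma-separated line, without a trailing comma.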

tr instead of sed for this one:
$ tr '\n' ',' < input.txt
test1,test2,test3,test4,test5,
Just straight up translate newlines to commas.
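Note that tr also translates the final newline, which is where the trailing comma comes from, and the output then has no terminating newline. If the trailing comma is unwanted, one option (a sketch, certainly not the only way) is to strip it afterwards:

```shell
# tr turns every newline, including the last one, into a comma;
# the sed step then removes the single trailing comma.
joined=$(printf '%s\n' test1 test2 test3 | tr '\n' ',' | sed 's/,$//')
printf '%s\n' "$joined"
```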

Based on "How can I replace each newline (\n) with a space using sed?":
sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' <file>
testing:
$ cat file.txt
test1
test2
test3
test4
test5
$ sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' file.txt
test1,test2,test3,test4,test5
Of course, if the question were more generic (how do I replace \n with any character using sed?), you would only need to swap the , for the character you want:
export CHAR_TO_REPLACE=','
export FILE_TO_PROCESS=<filename>
sed -e ':a' -e 'N' -e '$!ba' -e "s/\n/${CHAR_TO_REPLACE}/g" "$FILE_TO_PROCESS"
This answer is to satisfy the requirement of using sed. Otherwise, you can use alternatives like tr, awk etc.
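The loop can be read step by step. Here is a minimal sketch that wraps it in a function reading standard input; the name join_with is hypothetical, and the delimiter is assumed not to contain '/', '&' or a backslash, since those are special inside s///.

```shell
# join_with DELIM: read lines on stdin, print them joined by DELIM.
# Assumes DELIM contains no '/', '&' or backslash.
join_with() {
  # :a    define label a
  # N     append the next input line to the pattern space
  # $!ba  if this is not the last line, branch back to a
  # s     once the last line is in, replace every embedded newline with DELIM
  sed -e ':a' -e 'N' -e '$!ba' -e "s/\n/$1/g"
}

joined=$(printf '%s\n' test1 test2 test3 | join_with ';')
printf '%s\n' "$joined"
```

Passing the delimiter as an argument avoids exporting environment variables just to parameterise one command.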

This might work for you (GNU sed):
sed 'H;1h;$!d;x;y/\n/,/' file
Append all lines but the first to the hold space (the first replaces the hold space).
If it is not the last line of the file, delete it.
Otherwise, swap to the hold space and translate all newlines to commas.

Related

How to replace only specific spaces in a file using sed?

I have this content in a file where I want to replace spaces at certain positions with pipe symbol (|). I used sed for this, but it is replacing all the spaces in the string. But I don't want to replace the space for the 3rd and 4th string.
How to achieve this?
Input:
test test test test
My attempt:
sed -e 's/ /|/g' file.txt
Expected Output:
test|test|test test
Actual Output:
test|test|test|test
sed 's/ /\
/3;y/\n / |/'
Because sed reads one line at a time, a newline cannot already be present in the pattern space, so it is safe to use as a temporary marker: change the third space to a newline, then translate newlines to spaces and spaces to pipes.
GNU sed can use \n in the replacement text:
sed 's/ /\n/3;y/\n / |/'
If the original input doesn't contain any pipe characters, you can do
sed -e 's/ /|/g' -e 's/|/ /3' file
to retain the third white space. Otherwise see other answers.
You could replace the 'first space' twice, e.g.
sed -e 's/ /|/' -e 's/ /|/' file.txt
Or, if you want to specify the positions (e.g. the 2nd and 1st spaces):
sed -e 's/ /|/2' -e 's/ /|/1' file.txt
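The order of the two expressions matters here, because each substitution sees the line as the previous one left it. A small sketch comparing both orders on the sample line (variable names are illustrative):

```shell
line='test test test test'

# Higher position first: replacing space 2 does not shift space 1.
right=$(printf '%s\n' "$line" | sed -e 's/ /|/2' -e 's/ /|/1')

# Lower position first: after space 1 is gone, "the 2nd space" is the
# original 3rd one, so the wrong space is replaced.
wrong=$(printf '%s\n' "$line" | sed -e 's/ /|/1' -e 's/ /|/2')

printf '%s\n%s\n' "$right" "$wrong"
```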
Using GNU sed to replace the first and second one or more whitespace chunks:
sed -i -E 's/\s+/|/;s/\s+/|/' file
Details
-i - edits the file in place
-E - enables POSIX ERE syntax
s/\s+/|/ - replaces the first run of one or more whitespace chars
; - and then
s/\s+/|/ - replaces the next run of one or more whitespace chars on each line (if present).
Keep it simple and use awk, e.g. using any awk in any shell on every Unix box no matter what other characters your input contains:
$ awk '{for (i=1;i<NF;i++) sub(/ /,"|")} 1' file
test|test|test test
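For this four-field input the above replaces the first two spaces and leaves the last one. Be aware that NF is re-evaluated after every sub(), because changing $0 re-splits the record, so on longer lines the loop stops well before reaching the last space. If you want to replace a specific number, e.g. 2, use a constant bound instead, i.e. change i<NF to i<=2.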
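If the number of spaces to replace should be fixed rather than derived from NF, a sketch with an explicit count might look like this (the awk variable n is an assumption for illustration, not part of the original answer):

```shell
# Replace only the first n single spaces on each line; n is passed in with -v.
out=$(printf '%s\n' 'a b c d e' | awk -v n=2 '{for (i=1; i<=n; i++) sub(/ /,"|")} 1')
printf '%s\n' "$out"
```

Because the loop bound is a constant, the result does not depend on how many fields the line has.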

Using a single sed call to split and grep

This is mostly by curiosity, I am trying to have the same behavior as:
echo -e "test1:test2:test3"| sed 's/:/\n/g' | grep 1
in a single sed command.
I already tried
echo -e "test1:test2:test3"| sed -e "s/:/\n/g" -n "/1/p"
But I get the following error:
sed: can't read /1/p: No such file or directory
Any idea on how to fix this and combine different types of commands into a single sed call?
Of course this is overly simplified compared to the real use case, and I know I can work around it with multiple calls; again, this is just out of curiosity.
EDIT: I am mostly interested in the sed tool, I already know how to do it using other tools, or even combinations of those.
EDIT2: Here is a more realistic script, closer to what I am trying to achieve:
arch=linux64
base=https://chromedriver.storage.googleapis.com
split="<Contents>"
curl $base \
| sed -e 's/<Contents>/<Contents>\n/g' \
| grep $arch \
| sed -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out
What I would like to simplify is the curl line, turning it into something like:
curl $base \
| sed 's/<Contents>/<Contents>\n/g' -n '/1/p' -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out
Here are some alternatives, awk and sed based:
sed -E "s/(.*:)?([^:]*1[^:]*).*/\2/" <<< "test1:test2:test3"
awk -v RS=":" '/1/' <<< "test1:test2:test3"
# or also
awk 'BEGIN{RS=":"} /1/' <<< "test1:test2:test3"
Or, using your logic, you would need to pipe a second sed command:
sed "s/:/\n/g" <<< "test1:test2:test3" | sed -n "/1/p"
The awk solution looks cleanest.
Details
In the sed solution, the pattern (.*:)?([^:]*1[^:]*).* matches an optional run of any characters ending in :, then captures into Group 2 zero or more characters other than :, a 1, and again zero or more characters other than :, and finally matches the rest of the line. The replacement keeps only the contents of Group 2.
In the awk solution, the record separator is set to : and the /1/ regex is then used to print only the records that contain a 1.
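As for the error in the question: -n takes no argument, and once -e is used every script fragment needs its own -e, so the bare /1/p was taken as a file name. Fixing that still does not emulate grep, because sed applies /1/p to the whole pattern space, embedded newlines included. A sketch, assuming GNU sed (for \n in the replacement):

```shell
# Each script fragment gets its own -e; -n suppresses automatic printing.
out=$(echo 'test1:test2:test3' | sed -n -e 's/:/\n/g' -e '/1/p')

# All three segments are printed: the pattern space as a whole contains a 1,
# and p prints it in full.
printf '%s\n' "$out"
```

This is why the answers either pipe into a second sed or step through the pattern space segment by segment.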
This might work for you (GNU sed):
sed 's/:/\n/;/^[^\n]*1/P;D' file
Replace the first : with a newline; if the first line in the pattern space contains a 1, print it.
Delete that first line and repeat on what remains.
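To make the P/D loop easier to follow, here is the same command with its steps traced in comments on a slightly different input. GNU sed is assumed, as in the answer.

```shell
# Input: a1:b:c1
# pass 1: s gives "a1\nb:c1"; first line has a 1, so P prints "a1"; D leaves "b:c1"
# pass 2: s gives "b\nc1";    first line has no 1;                  D leaves "c1"
# pass 3: nothing to split;   "c1" has a 1, so P prints "c1";       D empties the buffer
out=$(echo 'a1:b:c1' | sed 's/:/\n/;/^[^\n]*1/P;D')
printf '%s\n' "$out"
```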
An alternative:
sed -Ez 's/:/\n/g;s/^[^1]*$//mg;s/\n+/\n/;s/^\n//' file
This slurps the whole file into memory and replaces all colons by newlines. All lines that do not contain 1 are removed and surplus newlines deleted.
An alternative to the really ugly sed is: grep -o '\w*2\w*'
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | grep -o '\w*2\w*'
test2
bob2
fred2
grep -o: print only the matching part of each line
Or: grep -o '[^:]*2[^:]*'
echo -e "test1:test2:test3" | sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;//!D'
sed -n doesn't print unless told to
sed -E allows using parens to match (\n|$) which is newline or the end of the pattern space
P prints the pattern buffer up to the first newline.
D trims the pattern buffer up to the first newline
[^\n] is a character class that matches anything except a newline
// (an empty regex) is sed shorthand for reusing the most recently used regex
//! then applies the following command to anything that does not match that regex
So, after you split into newlines, you want to make sure the 2 character is between the start of the pattern buffer ^ and the first newline.
And, if there is not the character you are looking for, you want to D delete up to the first newline.
At that point, it works for one line of input, with one string containing the character you're looking for.
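To expand to several matches within a line, make the D unconditional so that every segment is tested: whenever a newline remains, D restarts the script on what is left of the pattern space. (Because D always restarts the cycle, the ta branch back to label :a in the command below is never reached and could be dropped.)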
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | \
sed -En ':a s/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;D;ta'
test2
bob2
fred2
This is simply NOT a job for sed. With GNU awk for multi-char RS:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '/1/'
test1
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' 'NR%2'
test1
test3
test5
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '!(NR%2)'
test2
test4
test6
$ echo "foo1:bar1:foo2:bar2:foo3:bar3" | awk -v RS='[:\n]' '/foo/ || /2/'
foo1
foo2
bar2
foo3
With any awk you'd just have to strip the \n from the final record before operating on it:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS=':' '{sub(/\n$/,"")} /1/'
test1

sed to copy part of line to end

I'm trying to copy part of a line to append to the end:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz
becomes:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz
I have tried:
sed 's/\(.*(GCA_\)\(.*\))/\1\2\2)'
$ f1=$'ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz'
$ echo "$f1"
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\1\2\3\/\2\4/' <<<"$f1"
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz
sed -E (or -r on some systems) enables extended regular expressions in sed, so you don't need to escape the grouping parentheses ( ).
The group (GCA_.[^.]*) means "starting at GCA_, take all characters up to, but excluding, the first dot":
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\2/' <<<"$f1"
GCA_900169985
Similarly, (.[^_]*) means "take all characters up to, but excluding, the first _". A negated character class like this is the usual way to emulate a non-greedy (lazy) match in sed; in Perl regex it could be written with a lazy quantifier such as .*? instead.
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\3/' <<<"$f1"
.1
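The difference between a greedy .* and a negated character class can be seen on the file name alone. A small sketch (variable names are illustrative):

```shell
s='GCA_900169985.1_IonXpress_024_genomic.fna.gz'

# Greedy: .* runs up to the LAST underscore before the remainder is dropped.
greedy=$(printf '%s\n' "$s" | sed -E 's/^(.*)_.*/\1/')

# Negated class: [^_]* cannot cross an underscore, so it stops at the FIRST one.
first=$(printf '%s\n' "$s" | sed -E 's/^([^_]*)_.*/\1/')

printf '%s\n%s\n' "$greedy" "$first"
```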
Short sed approach:
s="ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz"
sed -E 's/(GCA_[^._]+)\.([^_]+)/\1.\2\/\1/' <<< "$s"
The output:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz

unescaped newline inside substitute pattern

I have a txt file with a list of 100 countries, without quotation marks around them. I am trying to change this
Canada
USA
into this
countries['Canada']=true
etc.
This is the sed command I am trying, with '\1' representing the country in quotation marks.
sed -e "s/\(.*\)/countries['\1']=true" source.txt > output.txt
The error I'm getting is
unescaped newline inside substitute pattern
What sed command do I need to achieve what I'm trying to do, and why am I getting this error?
You just missed the trailing / that terminates the s command:
$ sed -e "s/\(.*\)/countries['\1']=true/" file
countries['Canada']=true
countries['USA']=true
Note also that you don't need a capture group: just match everything with .* and use & in the replacement to insert the matched text:
$ sed -e "s/.*/countries['&']=true/" a
countries['Canada']=true
countries['USA']=true
I would just add stuff at the beginning and end:
sed -e "s/^/countries['/" -e "s/$/']=true/" source.txt > output.txt

Sed: can't define the pattern correctly, can you please assist?

I'm trying to add "ARG1$" to the end of this line:
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $
I've tried:
sed -e 's/^command\[check_net_speed\]$/$ARG1$/g' /etc/nagios/nrpe.cfg
sed -e 's/.*speed.*/$ARG1$/g' /etc/nagios/nrpe.cfg
But neither did the trick... What's the right way to match the "check_net_speed" command line and add "ARG1$" at the end, so that the line looks like this:
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $ARG1$
Something like
sed -e 's/^command\[check_net_speed\].*/&ARG1$/g' input
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $ARG1$
Change your sed command like below,
sed -e '/^command\[check_net_speed\]/s~$~ARG1$~' file
/^command\[check_net_speed\]/ matches the lines that start with command[check_net_speed], and the replacement is done only on those lines.
$ in the regex part means end of line, so the above command replaces the end-of-line anchor with ARG1$, i.e. it appends ARG1$ to the line.
Example:
$ echo 'command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $' | sed -e '/^command\[check_net_speed\]/s~$~ARG1$~'
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $ARG1$
$ has a special meaning: end of line. To treat it as a literal, you have to escape it:
sed '/^command\[check_net_speed\].*\$/s/$/ARG1$/' file
This will replace the end of line (indicated by $ alone) with the string ARG1$. So at the end, ARG1$ will be appended to the line.
The address /^command\[check_net_speed\].*\$/ restricts the replacement to lines that start with command[check_net_speed] and contain a literal $.
Test
$ cat a
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $
ddd
$ sed '/^command\[check_net_speed\].*\$/s/$/ARG1$/' a
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $ARG1$
ddd
As a supplement to nu11p01n73R's answer. Use
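sed -e '/^command\[check_net_speed\].*\$$/{s/$/ARG1$/;q;}' /etc/nagios/nrpe.cfg
Placing q inside the addressed block makes sed stop processing the file after the first matching line; the lines after it are then not printed. A bare ;q after the s command would instead quit after the first input line, whether or not it matched.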
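Where q sits relative to the address decides when sed stops. The sketch below compares a bare ;q with a q inside an addressed block on a two-line sample in which the target is not the first line (variable names are illustrative):

```shell
# Two-line sample: the target line is NOT the first line.
cfg=$(printf '%s\n' 'ddd' 'command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $')

# q after an unaddressed s runs on line 1, so sed prints "ddd" and stops.
bare=$(printf '%s\n' "$cfg" | sed -e 's/^command\[check_net_speed\].*\$$/&ARG1$/;q')

# q inside the addressed block runs only once the target line has been found;
# -n plus p prints just that modified line.
blocked=$(printf '%s\n' "$cfg" | sed -n -e '/^command\[check_net_speed\].*\$$/{s/$/ARG1$/;p;q;}')

printf '%s\n%s\n' "$bare" "$blocked"
```

Either way, q truncates the output at the point where it fires, so it only suits cases where the rest of the file is not needed.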