sed or awk - deleting strings between patterns

I have a CSV file with lines like this:
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC.DDD,C-name,num1,num2,num3
EEE.FFF.GGGG,E-name,num1,num2,num3
HHH.H-name,num1,num2,num3
...
Some lines have one identifier (like AAA), some have two (like CCC.DDD), and some have three or more (like EEE.FFF.GGGG). Not all identifiers are three characters long. I need to remove all but the first identifier from each line (that is, delete the first period and everything after it up to, but not including, the first comma), producing this:
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC,C-name,num1,num2,num3
EEE,E-name,num1,num2,num3
HHH,H-name,num1,num2,num3
...
I've tried a few pattern-replace methods but am getting tripped up. Does anyone have the syntax I need?

sed 's/^\([^.]\{1,\}\)[^,]*/\1/'
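A note on how this one-liner behaves: [^.]\{1,\} captures everything up to the first dot, and [^,]* then consumes the rest of that field. Because the capture is not limited to the first field, a dot in a later column would also be affected when the first field has none. The sketch below is a hypothetical variant (the name strip_ids and the second sample line are assumptions, not from the question) that also excludes commas from the capture, so only the first field can change:

```shell
# Hypothetical helper: drop everything after the first dot, in field 1 only.
strip_ids() {
  # \([^.,]*\) captures field 1 up to its first dot (or up to the comma);
  # [^,]* then consumes whatever is left of field 1.
  sed 's/^\([^.,]*\)[^,]*/\1/' "$@"
}

printf '%s\n' 'EEE.FFF.GGGG,E-name,num1' 'AAA,A.name,num1' | strip_ids
```

The first sample line is shortened to its first identifier, while the dot in the second column of the second line is left alone.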

Just remove everything between the first dot and the first comma. For the file
$ cat foo
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC.DDD,C-name,num1,num2,num3
EEE.FFF.GGGG,E-name,num1,num2,num3
HHH.H-name,num1,num2,num3
use this sed command:
$ sed 's/\.[^,]*//' foo
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC,C-name,num1,num2,num3
EEE,E-name,num1,num2,num3
HHH,num1,num2,num3
Note that on the last line this removes .H-name, because that line has a dot where the other lines have a comma. That looks like a typo in your example, though.

Using perl
$ perl -pe 's/\.[A-Z.]*?,/,/' input
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC,C-name,num1,num2,num3
EEE,E-name,num1,num2,num3
HHH.H-name,num1,num2,num3
sed
$ sed 's/\.[A-Z.]*,/,/' input
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC,C-name,num1,num2,num3
EEE,E-name,num1,num2,num3
HHH.H-name,num1,num2,num3
and awk
$ awk '/\./{sub(/\.[A-Z.]*,/, ",", $0)}{print}' input
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC,C-name,num1,num2,num3
EEE,E-name,num1,num2,num3
HHH.H-name,num1,num2,num3
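The three commands above leave HHH.H-name untouched because [A-Z.]* cannot match the lowercase letters or the hyphen. If the file is strictly comma-separated, another option is to edit only the first field. A minimal sketch, assuming no quoted commas inside fields (first_id_only is a hypothetical name):

```shell
# Hypothetical helper: trim field 1 at its first dot, leave other fields alone.
first_id_only() {
  # Split on commas, cut field 1 at the first dot, rejoin with commas.
  awk 'BEGIN { FS = OFS = "," } { sub(/\..*/, "", $1) } 1' "$@"
}

printf '%s\n' 'CCC.DDD,C-name,num1,num2,num3' 'HHH.H-name,num1,num2,num3' | first_id_only
```

With this approach the HHH.H-name line loses its .H-name part, which is only the right outcome if that line really is a typo in the sample data.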

Related

How do I join the previous line with the current line with sed?

I have a file with the following content.
test1
test2
test3
test4
test5
If I want to concatenate all lines into one line separated by commas, I can use vi and run the following command:
:%s/\n/,/g
I then get this, which is what I want
test1,test2,test3,test4,test5,
I'm trying to use sed to do the same thing but I'm missing some unknown command/option to make it work. When I look at the file in vi and search for "\n" or "$", it finds the newline or end of line. However, when I tell sed to look for a newline, it pretends it didn't find one.
$ cat test | sed --expression='s/\n/,/g'
test1
test2
test3
test4
test5
$
If I tell sed to look for end of line, it finds it and inserts the comma but it doesn't concatenate everything into one line.
$ cat test | sed --expression='s/$/,/g'
test1,
test2,
test3,
test4,
test5,
$
What command/option do I use with sed to make it concatenate everything into one line and replace the end of line/newline with a comma?
sed reads one line at a time, so, unless you're doing tricky things, there's never a newline to replace.
Here's the trickiness:
$ sed -n '1{h; n}; H; ${g; s/\n/,/gp}' test.file
test1,test2,test3,test4,test5
h, H, g documented at https://www.gnu.org/software/sed/manual/html_node/Other-Commands.html
When using a non-GNU sed, such as the one on macOS, semicolons are needed before the closing braces.
However, paste is really the tool for this job
$ paste -s -d, test.file
test1,test2,test3,test4,test5
If you really want the trailing comma:
printf '%s,\n' "$(paste -sd, file)"
tr instead of sed for this one:
$ tr '\n' ',' < input.txt
test1,test2,test3,test4,test5,
Just straight up translate newlines to commas.
Based on "How can I replace each newline (\n) with a space using sed?":
sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' <file>
testing:
$ cat file.txt
test1
test2
test3
test4
test5
$ sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' file.txt
test1,test2,test3,test4,test5
Of course, if the question had been more generic (how do I replace \n with any character using sed?), you would only need to replace the , with the desired character:
export CHAR_TO_REPLACE=','          # avoid /, & and \, which are special in s///
export FILE_TO_PROCESS='file.txt'   # substitute your own file name
sed -e ':a' -e 'N' -e '$!ba' -e "s/\n/${CHAR_TO_REPLACE}/g" "$FILE_TO_PROCESS"
This answer is to satisfy the requirement of using sed. Otherwise, you can use alternatives like tr, awk etc.
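Since the sed version breaks when the separator is a character that is special in s/// (such as / or &), here is a sketch of the awk alternative mentioned above. The function name join_lines and its argument order are assumptions for illustration:

```shell
# Hypothetical helper: join_lines SEP [FILE...]
join_lines() {
  sep=$1
  shift
  # Print the separator before every line except the first,
  # then finish with a single newline.
  awk -v sep="$sep" 'NR > 1 { printf "%s", sep } { printf "%s", $0 } END { print "" }' "$@"
}

printf '%s\n' test1 test2 test3 | join_lines ','
printf '%s\n' test1 test2 test3 | join_lines '/'
```

Note that awk interprets backslash escapes in -v values, so a separator containing a backslash would need extra care.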
This might work for you (GNU sed):
sed 'H;1h;$!d;x;y/\n/,/' file
Append all lines but the first to the hold space (the first replaces the hold space).
If it is not the last line of the file, delete it.
Otherwise, swap to the hold space and translate all newlines to commas.
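The same steps can be written as one -e expression per command, so each can be read against the description above. This is only a sketch, and join_with_commas is a hypothetical wrapper name:

```shell
# Hypothetical wrapper around the hold-space approach.
join_with_commas() {
  # H      append the current line to the hold space
  # 1h     on line 1, overwrite the hold space instead (no leading newline)
  # $!d    not the last line yet: discard the pattern space, start the next cycle
  # x      last line: swap in the accumulated hold space
  # y      translate every newline to a comma
  sed -e 'H' -e '1h' -e '$!d' -e 'x' -e 'y/\n/,/' "$@"
}

printf '%s\n' test1 test2 test3 test4 test5 | join_with_commas
```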

Parse file and insert new line after each occurrence

On a Unix system I am trying to add a new line in a file using sed or perl but it seems I am missing something.
Supposing my file has multiple lines of texts, always ending like this {TNG:}}${1:F01.
I am trying to find a way to add a newline after the }$, so that {1 always starts on a new line.
I tried escaping the $ sign like this:
perl -e '$/ = "\${"; while (<>) { s/\$}\{$/}\n{/; print; }'
but it does not work.
Any ideas will be appreciated.
give this a try:
sed 's/{TNG:}}\$/&\n/' file > newfile
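By default sed uses basic regular expressions (BRE), in which { and } are literal characters. The $ must be escaped, though, because an unescaped $ at the end of the pattern is the end-of-line anchor. Note that \n in the replacement is supported by GNU sed but is not portable to every sed.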
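If your sed does not accept \n in the replacement, a backslash followed by a literal newline is the traditional spelling. A minimal sketch (split_after_marker is a hypothetical name):

```shell
# Hypothetical helper: break the line right after the {TNG:}}$ marker.
split_after_marker() {
  # & stands for the matched text; the backslash-newline that follows
  # inserts a line break after it.
  sed 's/{TNG:}}\$/&\
/' "$@"
}

printf '%s\n' '{TNG:}}${1:F01.' | split_after_marker
```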
kent$ cat f
{TNG:}}${1:F01.
kent$ sed 's/{TNG:}}\$/&\n/' f
{TNG:}}$
{1:F01.
With perl:
$ cat input.txt
line 1 {TNG:}}${1:F01
line 2 {TNG:}}${1:F01
$ perl -pe 's/TNG:\}\}\$\K/\n/' input.txt
line 1 {TNG:}}$
{1:F01
line 2 {TNG:}}$
{1:F01
(Read up on the -p and -n options in perlrun and use them instead of trying to do what they do in a one-liner yourself)

Sed Remove 3 last digits from string

27211;18:05:03479;20161025;0;0;0;0;10991;0;10991;000;0;0;000;1000000;0;0;000;0;0;0;82
The second semicolon-separated field is a time in the form hh:mm:ss followed by three extra digits (18:05:03479). I want it to be just hh:mm:ss.
Like so:
27211;18:05:03;20161025;0;0;0;0;10991;0;10991;000;0;0;000;1000000;0;0;000;0;0;0;82
I tried cut, but it deletes everything after the nth occurrence of a character. I am stuck for now; please help.
give this one liner a try:
awk -F';' -v OFS=";" 'sub(/...$/,"",$2)+1' file
It removes the last 3 chars from column 2.
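The sub(...)+1 idiom is compact: adding 1 keeps the pattern true, so the line is printed even when sub changed nothing. A more explicit sketch of the same idea, assuming field 2 always carries exactly three extra characters (trim_time is a hypothetical name):

```shell
# Hypothetical helper: cut the last three characters off field 2.
trim_time() {
  awk 'BEGIN { FS = OFS = ";" }
       { $2 = substr($2, 1, length($2) - 3) }   # keep all but the last 3 characters
       1' "$@"
}

printf '%s\n' '27211;18:05:03479;20161025;0;0' | trim_time
```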
update with sed one liner
If you are a fan of sed:
sed -r 's/(;[^;]*)...;/\1;/' file
With sed:
sed -r 's/^([^;]+;[^;]+)...;/\1;/' file
(Or)
sed -r 's/^([^;]+;[0-9]{2}:[0-9]{2}:[0-9]{2})...;/\1;/' file
It can also be something like sed -E 's/(([0-9]{2}:){2}[0-9]{2})[0-9]*;/\1;/' file
It is not very clean, but at least it is clearer to me.
Regards
I'd use perl for this:
perl -pe 's/(?<=:\d\d)\d+(?=;)//' file
That removes any digits between "colon-digit-digit" and the semicolon (first match only, not globally in the line).
If you want to edit the file in-place: perl -i -pe ...
With sed:
sed -E 's/(:[0-9]{2})[0-9]{3}/\1/' file
or perl:
perl -pe's/:\d\d\B\K...//' file

Replacing several lines in a script with a single line using sed

Say I have a script where I want to change several lines for a single line.
For example, I got a new function that can summarize several commands, so that I can replace in my script as follows:
Original
some_code
command1
command2
command3
some_more_code
Edited
some_code
foo()
some_more_code
How would you do that using sed?
sed '/some_code/,/command3/ !b
/some_code/ b
/command3/ a\
foo()
d' YourFile
Be careful about metacharacters (such as & \ ^ $ [ ] { } ( ) .) in any of the patterns (except in your foo() line).
I am answering my own question here.
I couldn't figure out a way to do it in one go, so I split the problem into two parts.
Part 1: replace the first line
sed -e 's/command1/foo()/g' file1 > file2
Part 2: remove the rest of the lines
sed -e '/command2/,+1d' file2 > file3
I'd prefer a more elegant way though, where I can be flexible in the number of lines that I am replacing, possibly matching the last command in the block. Any ideas?
Just use awk:
$ awk -v RS='^$' -v ORS= '{sub(/command1\ncommand2\ncommand3/,"foo()")}1' file
some_code
foo()
some_more_code
The above uses GNU awk for multi-char RS.
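Without GNU awk, the same replacement can be sketched as a small state machine that works line by line. This assumes the block starts at a line matching command1 and ends at the next line matching command3; replace_block is a hypothetical name:

```shell
# Hypothetical helper: replace the command1..command3 block with foo().
replace_block() {
  awk '
    /command1/ { skip = 1 }                            # block starts: stop printing
    !skip      { print }                               # outside the block: copy the line
    /command3/ { if (skip) print "foo()"; skip = 0 }   # block ends: emit the replacement
  ' "$@"
}

printf '%s\n' some_code command1 command2 command3 some_more_code | replace_block
```

Because only the first and last lines of the block are matched, the number of lines in between does not matter.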
This might work for you (GNU sed):
sed '/command1/,/command3/c\foo()' file

Sed: can't define the pattern correctly, can you please assist?

I'm trying to add "ARG1$" to the end of this line:
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $
I've tried:
sed -e 's/^command\[check_net_speed\]$/$ARG1$/g' /etc/nagios/nrpe.cfg
sed -e 's/.*speed.*/$ARG1$/g' /etc/nagios/nrpe.cfg
But none did the trick... what's the right way to catch the pattern of the "check_net_speed" command and add "ARG1$" at the end of the line, so the line will look like this:
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $ARG1$
Something like
sed -e 's/^command\[check_net_speed\].*/&ARG1$/g' input
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $ARG1$
Change your sed command like below,
sed -e '/^command\[check_net_speed\]/s~$~ARG1$~' file
/^command\[check_net_speed\]/ matches the lines that start with command[check_net_speed], and the substitution is applied only to those lines.
$ in the regex part means end of line, so the command above replaces the end-of-line anchor with ARG1$, which appends ARG1$ to the line.
Example:
$ echo 'command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $' | sed -e '/^command\[check_net_speed\]/s~$~ARG1$~'
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $ARG1$
$ has a special meaning: end of line. To treat it as a literal, you have to escape it:
sed '/^command\[check_net_speed\].*\$/s/$/ARG1$/' file
This will replace the end of line (indicated by $ alone) with the string ARG1$. So at the end, ARG1$ will be appended to the line.
The /^command\[check_net_speed\].*\$/ address restricts the replacement to lines that start with command[check_net_speed] and contain a literal $.
Test
$ cat a
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $
ddd
$ sed '/^command\[check_net_speed\].*\$/s/$/ARG1$/' a
command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $ARG1$
ddd
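If the command may be run more than once against the same file, it can help to skip lines that already end in $ARG1$. A sketch under that assumption (append_arg1 is a hypothetical name; the input is read, not edited in place):

```shell
# Hypothetical helper: append ARG1$ to the check_net_speed line at most once.
append_arg1() {
  # On the check_net_speed line, append ARG1$ unless the line already ends in $ARG1$.
  sed '/^command\[check_net_speed\]/{
    /\$ARG1\$$/!s/$/ARG1$/
  }' "$@"
}

line='command[check_net_speed]=/usr/lib64/nagios/plugins/check_net_speed.sh $'
printf '%s\n' "$line" ddd | append_arg1
printf '%s\n' "$line" | append_arg1 | append_arg1   # second pass leaves the line alone
```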
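As a supplement to nu11p01n73R's answer: with GNU sed you can limit the change to the first matching line.
sed -e '0,/^command\[check_net_speed\].*\$$/s//&ARG1$/' /etc/nagios/nrpe.cfg
The 0,/regex/ address range (a GNU extension) ends at the first line matching the regex, and the empty pattern in s//.../ reuses that regex, so only the first match is changed while the rest of the file is printed unchanged. Appending ;q to the substitution would not achieve this: q runs on the first input line whether or not it matched, so sed would stop there and drop the rest of the file.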