Remove all occurrences of ';' in C++ comments with SED - sed

I'm new to sed and can't work out how to use it to remove all ';' characters in the comments of C++ files, i.e. on lines starting with or containing the string "//" (I have already converted "/* ... */" comments to "// ..." comments).
For example :
// lorem; ipsum ; test
int a; // 1 ; 2 ; 3 ;
And I want to have :
// lorem ipsum test
int a; // 1 2 3
For any comment in my C++ files.
********* EDIT *********
Here is a solution with SED in two steps. A solution with AWK is also available in answers.
Put all comments on a new line (the \n in the replacement needs GNU sed) : sed 's/\/\//\n\/\//g'
Remove ';' only on lines starting with "//" : sed '/^\/\// s/;//g'
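The two steps above leave every comment on a line of its own. If the comment should stay where it is, a small loop in a single sed call may be enough. This is only a sketch: the function name strip_comment_semicolons is made up for illustration, and it assumes that the first "//" on a line really starts a comment (a "//" inside a string literal, such as a URL, would be treated the same way).

```shell
# Hypothetical helper: delete every ';' that follows the first "//" on a line.
strip_comment_semicolons() {
    # The s command removes the last ';' found after "//";
    # "ta" jumps back to label a for as long as a substitution was made.
    sed -e ':a' -e 's|\(//.*\);|\1|' -e 'ta' "$@"
}

printf '%s\n' '// lorem; ipsum ; test' 'int a; // 1 ; 2 ; 3 ;' | strip_comment_semicolons
```

The spaces around each removed ';' are kept, and the ';' after "int a" is untouched because it comes before the "//".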

It is straightforward in AWK. Create a file r.awk:
function process(s) {
    gsub(";", "", s)
    return s
}
{
    sep = "//"; ns = length(sep)
    m = match($0, sep)
    if (!m) { print; next }
    body = substr($0, 1, m-1)
    cmnt = substr($0, m+ns)
    print body sep process(cmnt)
}
Usage:
awk -f r.awk input.file

Related

bash SED command explanation with semicolon

What is this sed command doing? and is there any online utility that kind of explains sed a little bit, like regex?
sed -i '1s/$/|,a Type,b Type,c Type/;/./!b;1!s/$/|,,,/' textflile.txt
I think the first part appends the CSV fields a Type, b Type, c Type to the end of the line, but what does the rest of the command do?
I don't know of any such utility, but let me explain using a text editor:
sed -i '1s/$/|,a Type,b Type,c Type/;/./!b;1!s/$/|,,,/' textflile.txt
-i             modify the file in place
1              (in 1s) first line
$              end of line
/./            non-empty lines only
!              (after /./) negation, i.e. empty lines only
b              branch to script end, i.e. skip the rest
1!             first line, negated, i.e. lines 2,3,...
textflile.txt  input file
In other words, it adds |,a Type,b Type,c Type to the first line, doesn't change empty lines, and adds |,,, to all the remaining lines.
sed -i '1s/$/|,a Type,b Type,c Type/;/./!b;1!s/$/|,,,/' textflile.txt
can be written as
sed -i '
1 s/$/|,a Type,b Type,c Type/
/./! b
1! s/$/|,,,/
' textflile.txt
on line 1 only, add some text to the end of the line
if the line is empty ("matches 1 character, not"), goto next "cycle" (i.e., print current line and go to next line)
on every line except line 1, add "|,,," to the end of the line
So, it looks like you're adding some blank fields to a CSV file.
info sed contains the complete sed manual.
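To see the three rules at work without touching a file, the same script can be fed from stdin by dropping -i. A small sketch; the function name annotate_csv and the three sample lines are made up for the demonstration:

```shell
# Hypothetical helper: the script from the question, reading stdin
# (or the files given as arguments) instead of editing in place.
annotate_csv() {
    sed '
1s/$/|,a Type,b Type,c Type/
/./!b
1!s/$/|,,,/
' "$@"
}

# Line 1 gets the header fields, the empty line is left alone,
# and the remaining line gets |,,, appended.
printf 'id\n\nrow1\n' | annotate_csv
```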
This doesn't answer your question, but it's important for people to know and requires more space and formatting than a comment, so: FYI, to do what @choroba says that sed script does, i.e.
it adds |,a Type,b Type,c Type to the first line,
doesn't change empty lines,
and adds |,,, to all the remaining lines.
is just this in awk:
awk '
    NR==1 { print $0 "|,a Type,b Type,c Type"; next }
    !NF   { print }
    NF    { print $0 "|,,," }
'
or if you're familiar with ternary expressions and want to remove the redundant code:
awk '{
    sfx = "|," (NR==1 ? "a Type,b Type,c Type" : ",,")
    print $0 (NF ? sfx : "")
}'

Remove newline depending on the format of the next line

I have a special file with this kind of format :
title1
_1 texthere
title2
_2 texthere
I would like every line starting with "_" to be appended, as a second column, to the line before it.
I tried to do that using sed with this command :
sed 's/_\n/ /g' filename
but it is not giving me what I want to do (doing nothing basically)
Can anyone point me to the right way of doing it ?
Thanks
Try following solution:
In sed, a loop is built by defining a label (:a) and then, for every line except the last ($!), appending the next line (N) and branching back to label a:
:a
$! {
    N
    b a
}
After this the whole file is in the pattern space, so do a global substitution for each _ preceded by a newline:
s/\n_/ _/g
p
All together is:
sed -ne ':a ; $! { N ; ba }; s/\n_/ _/g ; p' infile
That yields:
title1 _1 texthere
title2 _2 texthere
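The attempt in the question does nothing because sed processes one line per cycle and strips the trailing newline first, so the pattern space never contains a \n unless a command such as N appends another line. A minimal sketch of the difference, assuming input made of strict title/_ pairs; the function name join_pairs is made up:

```shell
# Without N, each cycle sees a single line, so \n never matches
# and the input comes out unchanged.
printf 'title1\n_1 texthere\n' | sed 's/\n_/ _/'

# Hypothetical helper: N appends the following line, which puts a
# newline into the pattern space for the substitution to replace.
join_pairs() {
    sed -e '$!N' -e 's/\n_/ _/' "$@"
}

printf 'title1\n_1 texthere\ntitle2\n_2 texthere\n' | join_pairs
```

This short form only suits strictly paired lines; the label-and-branch loop handles any layout.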
If your whole file is like your sample (pairs of lines), then the simplest answer is
paste - - < file
Otherwise
awk '
    NR > 1 && /^_/  {printf "%s", OFS}
    NR > 1 && !/^_/ {print ""}
    {printf "%s", $0}
    END {print ""}
' file
This might work for you (GNU sed):
sed ':a;N;s/\n_/ /;ta;P;D' file
This avoids slurping the file into memory.
or:
sed -e ':a' -e 'N' -e 's/\n_/ /' -e 'ta' -e 'P' -e 'D' file
A Perl approach:
perl -00pe 's/\n_/ /g' file
Here, the -00 causes perl to read the file in paragraph mode where a "line" is defined by two consecutive newlines. In your example, it will read the entire file into memory and therefore, a simple global substitution of \n_ with a space will work.
That is not very efficient for very large files though. If your data is too large to fit in memory, use this:
perl -ne 'chomp;
s/^_// ? print "$l " : print "$l\n" if $. > 1;
$l=$_;
END{print "$l\n"}' file
Here, the file is read line by line (-n) and the trailing newline removed from all lines (chomp). At the end of each iteration, the current line is saved as $l ($l=$_). At each line, if the substitution is successful and a _ was removed from the beginning of the line (s/^_//), then the previous line is printed with a space in place of a newline print "$l ". If the substitution failed, the previous line is printed with a newline. The END{} block just prints the final line of the file.

Add column to middle of tab-delimited file (sed/awk/whatever)

I'm trying to add a column (with the content '0') to the middle of a pre-existing tab-delimited text file. I imagine sed or awk will do what I want. I've seen various solutions online that do approximately this but they're not explained simply enough for me to modify!
I currently have this content:
Affx-11749850 1 555296 CC
I need this content
Affx-11749850 1 0 555296 CC
Using the command awk '{$3=0}1' filename messes up my formatting AND replaces column 3 with a 0, rather than adding a third column with a 0.
Any help (with explanation!) so I can solve this problem, and future similar problems, much appreciated.
Using the implicit { print } rule and appending the 0 to the second column:
awk '$2 = $2 FS "0"' file
Or with sed, assuming single space delimiters:
sed 's/ / 0 /2' file
Or perl:
perl -lane '$, = " "; $F[1] .= " 0"; print @F'
awk '{$2=$2" "0; print }' your_file
tested below:
> echo "Affx-11749850 1 555296 CC"|awk '{$2=$2" "0;print}'
Affx-11749850 1 0 555296 CC
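The commands above split and join on spaces. For input that really is tab-delimited, as in the question, setting both FS and OFS to a tab should keep the tabs intact. A sketch, with the function name insert_zero_column made up for illustration:

```shell
# Hypothetical helper: add a column containing 0 after the second
# column of tab-delimited input.
insert_zero_column() {
    # FS splits on tabs only; assigning to $2 makes awk rebuild the
    # record with OFS, so the output is tab-delimited as well.
    awk 'BEGIN { FS = OFS = "\t" } { $2 = $2 OFS "0" } 1' "$@"
}

printf 'Affx-11749850\t1\t555296\tCC\n' | insert_zero_column
```

The trailing 1 is a pattern that is always true, so awk applies its default action and prints the rebuilt line.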

Joining lines with awk and sed

I would like to join each line that is one of {st,corridor,tunnel} onto the line before it, using AWK or SED
Input
abcd
efgjk
st
wer
dfgh
corridor
weerr
tunnel
twdf
Desired output
abcd
efgjk st
wer
dfgh corridor
weerr tunnel
twdf
One way using awk:
awk '!/^(st|corridor|tunnel)$/ { if (line) print line; line = $0; next } { line = line " " $0 } END { print line }' file.txt
Results:
abcd
efgjk st
wer
dfgh corridor
weerr tunnel
twdf
This might work for you (GNU sed):
sed '$!N;s/\n\(st\|corridor\|tunnel\)\s*$/ \1/;P;D' file
Or, an awk version that reads the whole file into memory first (not recommended for large files):
$ awk 'BEGIN {i=1} {line[i++] = $0} END {j=1; while (j<i) {if (match(line[j+1], /^(st|corridor|tunnel)$/)) {print line[j] " " line[j+1]; j+=2} else print line[j++];}}' streets
abcd
efgjk st
wer
dfgh corridor
weerr tunnel
twdf
I'll leave you with the exercise of doing this one-or-two-lines-at-a-time. :)
With awk
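BEGIN {
    # referencing an element is enough to create the key
    s["st"]; s["corridor"]; s["tunnel"]
}
$1 in s {
    print prev, $1
    prev = ""    # already printed, so do not print it again
    next
}
{
    if (prev != "") print prev
    prev = $0
}
END {
    if (prev != "") print prev    # flush the last pending line
}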

Awk or Sed: File Annotation

Hello, my SO friends, my question is:
Specification: annotate the fields of FILE_2 to the corresponding position of FILE_1.
A field is marked, and hence identified, by a delimiter pair.
I did this job in python before I knew awk and sed, with a couple hundred lines of code.
Now I want to see how powerful and efficient awk and sed can be.
Show me some masterpiece of awk or sed, please!
The delimiter pairs can be configured in FILE_3, but let's assume the first delimiter in a pair is 'Marker (number i) start', the other one is 'Marker (number i) done'
Example:
|-----------------FILE_1------------------|
text text text
text blabla
Marker_1_start
Marker_1_done
any text
in between blabla
Marker_2_start
Marker_2_done
text text
|-----------------FILE_2------------------|
Marker_1_start
11
1111
Marker_1_done
Marker_2_start
2222
22
Marker_2_done
Expected Output:
|-----------------FILE_Out------------------|
text text text
text blabla
Marker_1_start
11
1111
Marker_1_done
any text
in between blabla
Marker_2_start
2222
22
Marker_2_done
text text
awk '
FNR==NR && /Marker_.*_done/ {sep = ""; next}
FNR==NR && /Marker_.*_start/ {marker = $0; next}
FNR==NR {marker_text[marker] = marker_text[marker] sep $0; sep = "\n"; next}
1 {print}
/Marker_.*_start/ {print marker_text[$0]}
' file_2 file_1
There are several ways to approach this. I'm assuming that FILE_2 is smaller than FILE_1 and of a reasonable size.
#!/usr/bin/awk -f
FNR == NR {
    if ($0 ~ /^Marker.*start$/) {
        flag = 1
        idx = $0
        next
    }
    if ($0 ~ /^Marker.*done$/) {
        flag = 0
        nl = ""
        next
    }
    if (flag) lines[idx] = lines[idx] nl $0
    nl = "\n"
    next
}
{
    print
    if (lines[$0]) print lines[$0]
}
To run it:
./script.awk FILE_2 FILE_1
"Now I want to see how powerful and efficient awk and sed can be"
For this type of problem, very efficient. I'm sure my code can be further reduced.
#!/bin/bash
awk '
FNR == NR {
    if ($0 ~ /Marker_1_start/){m1=1;next}
    if ($0 ~ /Marker_2_start/){m2=1;next}
    if ($0 ~ /Marker_1_done/){m1=0}
    if ($0 ~ /Marker_2_done/){m2=0}
    if(m1){a[i++]=$0}
    if(m2){b[j++]=$0}
}
FNR != NR {
    if ($0 ~ /Marker_1_start/){print;n1=1}
    if ($0 ~ /Marker_2_start/){print;n2=1}
    if ($0 ~ /Marker_1_done/){n1=0}
    if ($0 ~ /Marker_2_done/){n2=0}
    if(n1)
        for (k = 0; k < i; k++)
            print a[k]
    else if(n2)
        for (l = 0; l < j; l++)
            print b[l]
    else
        print
}' ./file_2 ./file_1
Output
$ ./filemerge.sh
text text text
text blabla
Marker_1_start
11
1111
Marker_1_done
any text
in between blabla
Marker_2_start
2222
22
Marker_2_done
text text