Sed command only works when split up into two steps - command-line

Can someone explain to me why combining step 1 and step 2 as one sed command doesn't work:
sed -e :a -e 's/^.\{0,127\}$/& /;ta' \
-e '1,46d' -e '/Pharmacom/,+5d' -e 's/^M//g' \
-e ':a;N;$!ba;s/\n//g' -e 's/---*/\n/g' file > result
But the same command split into two steps works:
Step 1:
sed -e :a -e 's/^.\{0,127\}$/& /;ta' -e '1,46d' \
-e '/Pharmacom/,+5d' -e 's/^M//g' FILE > step
Step 2:
sed -e ':a;N;$!ba;s/\n//g' -e 's/---*/\n/g' step > result

I first translated your commands to something readable so I could make sense of it:
# Pad lines with spaces until 128 characters long
:a
s/^.\{0,127\}$/& /
ta
# Delete first 46 lines
1,46d
# Delete line containing 'Pharmacom' and next five lines
/Pharmacom/,+5d
# Remove carriage returns
s/^M//g
# Join rest of lines on single line
:a
N
$!ba
s/\n//g
# Replace two or more dashes with a newline
s/---*/\n/g
Then I reduced it to the problematic parts:
# Pad lines with spaces until 128 characters long
:a
s/^.\{0,127\}$/& /
ta
# Join rest of lines on single line
:a
N
$!ba
s/\n//g
Or, on a single line:
sed ':a;s/^.\{0,127\}$/& /;ta;:a;N;$!ba;s/\n//g'
The problem is that you use the same label name twice, so instead of repeating your first s command, the ta command jumps to the second label :a, and instead of padding to 128 characters, you get just a single space inserted.
This is easily fixed by using two different label names:
sed ':a;s/^.\{0,127\}$/& /;ta;:b;N;$!bb;s/\n//g'
Two remarks:
It doesn't matter whether you use sed -e '...' -e '...' or sed '...;...' in this context; both forms are concatenated into a single script, so label names have to be unique across all of it.
I'd move the d commands to the beginning of the script, or you do all the padding work for nothing on the lines you're deleting anyways.
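To check the label fix on something small, here is a sketch that pads to a hypothetical width of 4 instead of 128 (the two-line input is made up for illustration). It also shows a caveat: lines pulled in by N never pass back through the padding loop, so in a single script only the first line gets padded. Padding each line and collecting the results in the hold space is one possible way to reproduce the two-step behavior in one sed call.

```shell
# Distinct labels: the padding loop (:a) and the join loop (:b) no longer collide.
# Only the first line is padded, because N appends later lines mid-script.
joined=$(printf 'ab\ncd\n' |
    sed -e ':a' -e 's/^.\{0,3\}$/& /' -e 'ta' \
        -e ':b' -e 'N' -e '$!bb' -e 's/\n//g')
printf '[%s]\n' "$joined"

# Alternative sketch: pad every line, append it to the hold space (H),
# and on the last line swap the hold space in (x) and strip the newlines.
padded_all=$(printf 'ab\ncd\n' |
    sed -e ':a' -e 's/^.\{0,3\}$/& /' -e 'ta' \
        -e 'H' -e '$!d' -e 'x' -e 's/\n//g')
printf '[%s]\n' "$padded_all"
```

The first command prints [ab  cd] and the second prints [ab  cd  ], so the second form matches what the two-step pipeline produces.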

Related

How do I join the previous line with the current line with sed?

I have a file with the following content.
test1
test2
test3
test4
test5
If I want to concatenate all lines into one line separated by commas, I can use vi and run the following command:
:%s/\n/,/g
I then get this, which is what I want
test1,test2,test3,test4,test5,
I'm trying to use sed to do the same thing but I'm missing some unknown command/option to make it work. When I look at the file in vi and search for "\n" or "$", it finds the newline or end of line. However, when I tell sed to look for a newline, it pretends it didn't find one.
$ cat test | sed --expression='s/\n/,/g'
test1
test2
test3
test4
test5
$
If I tell sed to look for end of line, it finds it and inserts the comma but it doesn't concatenate everything into one line.
$ cat test | sed --expression='s/$/,/g'
test1,
test2,
test3,
test4,
test5,
$
What command/option do I use with sed to make it concatenate everything into one line and replace the end of line/newline with a comma?
sed reads one line at a time, so, unless you're doing tricky things, there's never a newline to replace.
Here's the trickiness:
$ sed -n '1{h; n}; H; ${g; s/\n/,/gp}' test.file
test1,test2,test3,test4,test5
h, H, g documented at https://www.gnu.org/software/sed/manual/html_node/Other-Commands.html
When using a non-GNU sed, as found on MacOS, semi-colons before the closing braces are needed.
However, paste is really the tool for this job
$ paste -s -d, test.file
test1,test2,test3,test4,test5
If you really want the trailing comma:
printf '%s,\n' "$(paste -sd, file)"
tr instead of sed for this one:
$ tr '\n' ',' < input.txt
test1,test2,test3,test4,test5,
Just straight up translate newlines to commas.
Based on "How can I replace each newline (\n) with a space using sed?":
sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' <file>
testing:
$ cat file.txt
test1
test2
test3
test4
test5
$ sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' file.txt
test1,test2,test3,test4,test5
Of course, if the question had been more generic (how do I replace \n with any character using sed?), one would only need to replace the , with the desired character:
export CHAR_TO_REPLACE=','
export FILE_TO_PROCESS=<filename>
sed -e ':a' -e 'N' -e '$!ba' -e "s/\n/${CHAR_TO_REPLACE}/g" "$FILE_TO_PROCESS"
This answer is to satisfy the requirement of using sed. Otherwise, you can use alternatives like tr, awk etc.
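One thing the parameterized form has to watch for: a replacement containing /, & or \ would be misread by the s command. A minimal sketch of a wrapper that escapes those characters first; join_lines and the sample file are hypothetical names used only for illustration.

```shell
# join_lines DELIM FILE: join all lines of FILE with DELIM (hypothetical helper).
join_lines() {
    # Escape the characters that are special in an s/// replacement:
    # backslash, ampersand, and the / used as the delimiter.
    escaped=$(printf '%s' "$1" | sed 's|[\\&/]|\\&|g')
    sed -e ':a' -e 'N' -e '$!ba' -e "s/\n/${escaped}/g" "$2"
}

sample=$(mktemp)
printf '%s\n' test1 test2 test3 > "$sample"
with_slash=$(join_lines '/' "$sample")
with_amp=$(join_lines ' & ' "$sample")
printf '%s\n%s\n' "$with_slash" "$with_amp"
rm -f "$sample"
```

This prints test1/test2/test3 followed by test1 & test2 & test3. For a single-character delimiter, paste -s -d remains the simpler choice.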
This might work for you (GNU sed):
sed 'H;1h;$!d;x;y/\n/,/' file
Append all lines but the first to the hold space (the first replaces the hold space).
If it is not the last line of the file, delete it.
Otherwise, swap to the hold space and translate all newlines to commas.
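The approaches from the answers above can be compared side by side. This is a small sketch using a temporary sample file (the variable names are made up for illustration); the only difference in output is that tr keeps a trailing comma, because it also translates the final newline.

```shell
sample=$(mktemp)
printf '%s\n' test1 test2 test3 test4 test5 > "$sample"

via_paste=$(paste -s -d, "$sample")
via_loop=$(sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' "$sample")
via_hold=$(sed 'H;1h;$!d;x;y/\n/,/' "$sample")
via_tr=$(tr '\n' ',' < "$sample")   # keeps a trailing comma

printf '%s\n' "$via_paste" "$via_loop" "$via_hold" "$via_tr"
rm -f "$sample"
```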

Programmatically replace first occurrence of string with sed or GNU sed

I want to replace only the first occurrence of version: * in a file.
So I have a working sed command that works with GNU sed (source):
sed -i '0,/\(.*"version"\): "\(.*\)",/s//\1: '"\"${NEW_VERSION}\",/" package-lock.json
My problem is that I am executing this in scripts that can also run without GNU sed.
When I replace it with sed -i '1,/\(.*"version"\): "\(.*\)",/s//\1: '"\"${NEW_VERSION}\",/" package-lock.json, it works without GNU sed, but I get the following error when GNU sed is available:
sed: -e expression #1, char 0: no previous regular expression
EDIT: my main goal
As requested, here is my initial goal:
In a package.json and/or a package-lock.json, I want to replace the first occurrence of version: X.X.X with version: Y.Y.Y, where $NEW_VERSION contains Y.Y.Y.
Using sed:
sed -i.bak -E '/(version: ).*/!{p;d;}
s//\1'"$NEW_VERSION"'/
:a
n
ba
' file
Alternatively this awk would also work:
awk -v ver="$NEW_VERSION" '!done && /^version:/{$2=ver; done=1} 1' file
You could check first occurrence by for example storing something in hold space.
sed '
# If hold space is empty
x;/^$/{x;
# If there is a pattern, replace it and..
/\("version": "\).*",/{
s//\1'"$NEW_VERSION"'",/1
# and hold the line.
h;
};x
};x
'
I'm going to simplify the expressions, since I'm not exactly sure what you're trying to match with the double quotes and the comma, and I think they obscure the main point. To replace just the first occurrence of foo with repl, you can do:
sed -e s/foo/repl/ -e ta -e p -e d -e :a -e n -e ba
The t command branches to the :a after a replacement is made, and the commands after :a just read and print each line without trying the substitution.
eg:
$ printf '%s\n' qux foo bar baz foo | sed -e s/foo/repl/ -e ta -e p -e d -e :a -e n -e ba
qux
repl
bar
baz
foo
But, this is really a lot easier with awk:
awk '/foo/ && !a{sub("foo", "repl"); a = 1}1'
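For context on the original question: in an addr1,/re/ range the end pattern is only tested from the line after addr1, so 1,/re/ cannot end on line 1, and 0,/re/ is a GNU extension. The t-based script sidesteps both. Below is a sketch applied to a made-up lock file; the file contents, the 9.9.9 value, and the simplified regex are assumptions for illustration, and a NEW_VERSION containing / or & would need escaping first.

```shell
NEW_VERSION=9.9.9   # hypothetical value for illustration
lockfile=$(mktemp)
cat > "$lockfile" <<'EOF'
{
  "name": "demo",
  "version": "1.0.0",
  "dependencies": {
    "dep": { "version": "2.0.0" }
  }
}
EOF

# Substitute; on success jump to :a, which only copies the remaining lines.
# Unmatched lines are printed (p) and dropped (d) so the next cycle starts.
result=$(sed -e 's/\("version": "\)[^"]*"/\1'"$NEW_VERSION"'"/' \
    -e ta -e p -e d -e :a -e n -e ba "$lockfile")
printf '%s\n' "$result"
rm -f "$lockfile"
```

Only the top-level "version" line changes; the nested "version": "2.0.0" is left alone because the script never attempts the substitution again after the first success.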

How to replace only specific spaces in a file using sed?

I have this content in a file where I want to replace spaces at certain positions with a pipe symbol (|). I used sed for this, but it replaces all the spaces in the string, and I don't want to replace the space between the 3rd and 4th strings.
How to achieve this?
Input:
test test test test
My attempt:
sed -e 's/ /|/g' file.txt
Expected Output:
test|test|test test
Actual Output:
test|test|test|test
sed 's/ /\
/3;y/\n / |/'
Since a line that has just been read into the pattern space never contains a newline, you can change the third space to a newline as a temporary marker, then use y to translate every newline to a space and every space to a pipe.
GNU sed can use \n in the replacement text:
sed 's/ /\n/3;y/\n / |/'
If the original input doesn't contain any pipe characters, you can do
sed -e 's/ /|/g' -e 's/|/ /3' file
to retain the third white space. Otherwise see other answers.
You could replace the 'first space' twice, e.g.
sed -e 's/ /|/' -e 's/ /|/' file.txt
Or, if you want to specify the positions (e.g. the 2nd and 1st spaces):
sed -e 's/ /|/2' -e 's/ /|/1' file.txt
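The order of the numbered substitutions matters, because each one counts the spaces that are still present when it runs. A small sketch on the sample line from the question (the variable names are made up for illustration):

```shell
line='test test test test'

# Each s/ /|/ replaces the first space that is still present.
twice=$(printf '%s\n' "$line" | sed -e 's/ /|/' -e 's/ /|/')

# Highest position first, so earlier replacements do not shift later ones.
descending=$(printf '%s\n' "$line" | sed -e 's/ /|/2' -e 's/ /|/1')

# Ascending order: after the 1st space is gone, "2" now means the original 3rd.
ascending=$(printf '%s\n' "$line" | sed -e 's/ /|/1' -e 's/ /|/2')

printf '%s\n' "$twice" "$descending" "$ascending"
```

The first two print test|test|test test, while the ascending order prints test|test test|test.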
Using GNU sed to replace the first and second one or more whitespace chunks:
sed -i -E 's/\s+/|/;s/\s+/|/' file
Details
-i - edits the file in place
-E - enables POSIX ERE syntax
s/\s+/|/ - replaces the first run of one or more whitespace chars
; - and then
s/\s+/|/ - replaces the next run of one or more whitespace chars on each line (if present).
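Keep it simple and use awk, e.g. using any awk in any shell on every Unix box no matter what other characters your input contains:
$ awk '{n=NF-1; for (i=1;i<n;i++) sub(/ /,"|")} 1' file
test|test|test test
The above replaces all but the last " " on each line, assuming the fields are separated by single spaces. The field count is saved in n before the loop because every sub() changes $0 and therefore NF. If you want to replace a specific number, e.g. 2, then change the loop condition to i<=2.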

What :a, ba mean in "sed -e :a -e '$d;N;2,5ba' -e 'P;D' file"?

I understand the final result of
sed -e :a -e '$d;N;2,5ba' -e 'P;D' file
but I don't understand what :a and ba mean. Also, why is -e specified 3 times?
Each -e specifies a piece of sed script, of which there are 3 here; sed joins them into a single script.
:a
is a label for use with b and t commands.
$d;N;2,5ba
means: if this is the last line, delete the pattern space. Otherwise, append the next input line to the pattern space. While the current line number is between 2 and 5, branch back to label :a.
The last script prints the pattern space up to the first newline, then deletes up to the first newline in the pattern space and restarts the cycle without reading a new line.
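Taken together, the three pieces keep a sliding window of lines in the pattern space and throw that window away when the last line arrives, which drops the last five lines of the input. A quick sketch on ten made-up numbered lines:

```shell
# Ten numbered lines in; the window held back at the end is deleted by $d.
kept=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 |
    sed -e :a -e '$d;N;2,5ba' -e 'P;D')
printf '%s\n' "$kept"
```

This prints the lines 1 through 5. Changing the 5 in 2,5ba changes how many trailing lines are dropped.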

sed + remove "#" and empty lines with one sed command

How do I remove comment lines (such as # bla bla) and empty lines (lines without characters) from a file with one sed command?
THX
lidia
If you're worried about starting two sed processes in a pipeline for performance reasons, you probably shouldn't be, it's still very efficient. But based on your comment that you want to do in-place editing, you can still do that with distinct commands (sed commands rather than invocations of sed itself).
You can either use multiple -e arguments or separate commands with a semicolon, something like (just one of these, not both):
sed -i -e 's/#.*$//' -e '/^$/d' fileName
sed -i 's/#.*$//;/^$/d' fileName
The following transcript shows this in action:
pax> printf 'Line # with a comment\n\n# Line with only a comment\n' >file
pax> cat file
Line # with a comment

# Line with only a comment
pax> cp file filex ; sed -i 's/#.*$//;/^$/d' filex ; cat filex
Line
pax> cp file filex ; sed -i -e 's/#.*$//' -e '/^$/d' filex ; cat filex
Line
Note how the file is modified in-place even with two -e options. You can see that both commands are executed on each line. The line with a comment first has the comment removed then all is removed because it's empty.
In addition, the original empty line is also removed.
@paxdiablo has a good answer but it can be improved.
(1) The '/^$/d' clause only matches 100% blank lines.
If you want to also match lines that are entirely whitespace (spaces, tabs etc.) use this instead:
'/^\s*$/d'
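(2) The 's/#.*$//' clause strips everything from the first # onward, so it only leaves an empty line (which '/^$/d' then removes) when the # is in column 0. It also truncates lines that have a trailing comment.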
If you want to also match lines that have only whitespace before the first # use this instead:
'/^\s*#.*$/d'
The above criteria may not be universal (e.g. within a HEREDOC block, or in a Python multi-line string the different approaches could be significant), but in many cases the conventional definition of "blank" lines include whitespace-only, and "comment" lines include whitespace-then-#.
(3) Lastly, on OS X at least, the @paxdiablo solution, in which the first clause turns comment lines into blank lines and the second clause strips blank lines (including what were originally comments), doesn't work. It seems to be more portable to make both clauses /d delete actions as I've done.
The revised command incorporating the above is:
sed -e '/^\s*#.*$/d' -e '/^\s*$/d' inputFile
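Since \s is a GNU extension, a more portable sketch of the same command uses the POSIX class [[:space:]] instead. The sample file contents below are made up for illustration.

```shell
conf=$(mktemp)
printf '%s\n' 'key = 1' '' '   ' '# full-line comment' \
    '  # indented comment' 'other = 2 # trailing' > "$conf"

# [[:space:]] is the POSIX character class; \s is a GNU extension.
cleaned=$(sed -e '/^[[:space:]]*#/d' -e '/^[[:space:]]*$/d' "$conf")
printf '%s\n' "$cleaned"
rm -f "$conf"
```

Only key = 1 and other = 2 # trailing survive; the trailing comment is kept on purpose, because both clauses delete whole lines rather than editing them.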
This tiny jewel removes all # comments, no matter where they begin in a line (see caution below):
sed -e 's/\s*#.*$//'
Example:
text="
this is a # test
#this is a test
#this is a #test
this is # another #test
"
$ echo "$text" | sed -e 's/\s*#.*$//'
this is a
this is
Next this removes any resulting blank lines:
$ echo "$text" | sed -e 's/\s*#.*$//' | sed -e '/^\s*$/d'
Caution: Depending on the syntax and/or interpretation of the lines you're processing, this might not be an appropriate solution, as it blindly removes everything from the first # to the end of the line, even if the '#' is part of your data or code. However, for use cases where a hash only ever introduces an end-of-line comment, it works fine. As with all coding, context must be taken into consideration.
Alternative variant, using grep:
cat file.txt | grep -Ev '(#.*$)|(^$)'
You can use awk to drop blank lines and lines whose first non-blank character is #:
awk 'NF && !/^[ \t]*#/' file
The first example (paxdiablo's) is very good, but without -i it only prints the result and does not change the file. If you want to edit the file in place:
sudo sed -i 's/#.*$//;/^$/d' inputFile
On (one of) my linux boxes, sed understands extended regular expressions with the -r option, so:
sed -r '/(^\s*#)|(^\s*$)/d' squid.conf.installed
is very useful for showing all non-blank, non comment lines.
The regex matches either start of line followed by zero or more spaces or tabs followed by either a hash or end of line, and deletes those matching lines from the input.