Sed replace pattern with line number - sed

I need to replace the pattern ### with the current line number.
I managed to Print in the next line with both AWK and SED.
sed -n "/###/{p;=;}" file prints to the next line, without the p;, it replaces the whole line.
sed -e "s/###/{=;}/g" file used to make sense in my head, since the =; returns the line number of the matched pattern, but it will return me the the text {=;}
What am i Missing? I know this is a silly question. I couldn't find the answer to this question in the sed manual, it's not quite clear.
If possible, point me what was i missing, and what to make it work. Thank you

Simple awk oneliner:
awk '{gsub("###",NR,$0);print}'

Given the limitations of the = command, I think it's easier to divide the job in two (actually, three) parts. With GNU sed you can do:
$ sed -n '/###/=' test > lineno
and then something like
$ sed -e '/###/R lineno' test | sed '/###/{:r;N;s/###\([^\n]*\n\)\([^\n]*\)/\2\1/;tr;:c;s/\n\n/\n/;tc}'
I'm afraid there's no simple way with sed because, as well as the = command, the r and GNU extension R commands don't read files into the pattern space, but rather directly append the lines to the output, so the contents of the file cannot be modified in any way. Hence piping to another sed command.
If the contents of test are
fooo
bar ### aa
test
zz ### bar
the above will produce
fooo
bar 2 aa
test
zz 4 bar

This might work for you (GNU sed):
sed = file | sed 'N;:a;s/\(\(.*\)\n.*\)###/\1\2/;ta;s/.*\n//'
An alternative using cat:
cat -n file | sed -E ':a;s/^(\s*(\S*)\t.*)###/\1\2/;ta;s/.*\t//'

As noted by Lev Levitsky this isn't possible with one invocation of sed, because the line number is sent directly to standard out.
You could have sed write a sed-script for you, and do the replacement in two passes:
infile
a
b
c
d
e
###
###
###
a
b
###
c
d
e
###
Find the lines that contain the pattern:
sed -n '/###/=' infile
Output:
6
7
8
11
15
Pipe that into a sed-script writing a new sed-script:
sed 's:.*:&s/###/&/:'
Output:
6s/###/6/
7s/###/7/
8s/###/8/
11s/###/11/
15s/###/15/
Execute:
sed -n '/###/=' infile | sed 's:.*:&s/^/& \&/:' | sed -f - infile
Output:
a
b
c
d
e
6
7
8
a
b
11
c
d
e
15

is this ok ?
kent$ echo "a
b
c
d
e"|awk '/d/{$0=$0" "NR}1'
a
b
c
d 4
e
if match pattern "d", append line number at the end of the line.
edit
oh, you want to replace the pattern not append the line number... take a look the new cmd:
kent$ echo "a
b
c
d
e"|awk '/d/{gsub(/d/,NR)}1'
a
b
c
4
e
and the line could be written like this as well: awk '1+gsub(/d/,NR)' file

one-liner to modify the FILE in place, replacing LINE with the corresponding line number:
seq 1 `wc -l FILE | awk '{print $1}'` | xargs -IX sed -i 'X s/LINE/X/' FILE

Following on from https://stackoverflow.com/a/53519367/29924
If you try this on osx the version of sed is different and you need to do:
seq 1 `wc -l FILE | awk '{print $1}'` | xargs --verbose -IX sed -i bak "X s/__line__/X/" FILE
see https://markhneedham.com/blog/2011/01/14/sed-sed-1-invalid-command-code-r-on-mac-os-x/

Related

Using a single sed call to split and grep

This is mostly by curiosity, I am trying to have the same behavior as:
echo -e "test1:test2:test3"| sed 's/:/\n/g' | grep 1
in a single sed command.
I already tried
echo -e "test1:test2:test3"| sed -e "s/:/\n/g" -n "/1/p"
But I get the following error:
sed: can't read /1/p: No such file or directory
Any idea on how to fix this and combine different types of commands into a single sed call?
Of course this is overly simplified compared to the real usecase, and I know I can get around by using multiple calls, again this is just out of curiosity.
EDIT: I am mostly interested in the sed tool, I already know how to do it using other tools, or even combinations of those.
EDIT2: Here is a more realistic script, closer to what I am trying to achieve:
arch=linux64
base=https://chromedriver.storage.googleapis.com
split="<Contents>"
curl $base \
| sed -e 's/<Contents>/<Contents>\n/g' \
| grep $arch \
| sed -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out
What I would like to simplify is the curl line, turning it into something like:
curl $base \
| sed 's/<Contents>/<Contents>\n/g' -n '/1/p' -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
| sort -V > out
Here are some alternatives, awk and sed based:
sed -E "s/(.*:)?([^:]*1[^:]*).*/\2/" <<< "test1:test2:test3"
awk -v RS=":" '/1/' <<< "test1:test2:test3"
# or also
awk 'BEGIN{RS=":"} /1/' <<< "test1:test2:test3"
Or, using your logic, you would need to pipe a second sed command:
sed "s/:/\n/g" <<< "test1:test2:test3" | sed -n "/1/p"
See this online demo. The awk solution looks cleanest.
Details
In sed solution, (.*:)?([^:]*1[^:]*).* pattern matches an optional sequence of any 0+ chars and a :, then captures into Group 2 any 0 or more chars other than :, 1, again 0 or more chars other than :, and then just matches the rest of the line. The replacement just keeps Group 2 contents.
In awk solution, the record separator is set to : and then /1/ regex is used to only return the record having 1 in it.
This might work for you (GNU sed):
sed 's/:/\n/;/^[^\n]*1/P;D' file
Replace each : and if the first line in the pattern space contains 1 print it.
Repeat.
An alternative:
sed -Ez 's/:/\n/g;s/^[^1]*$//mg;s/\n+/\n/;s/^\n//' file
This slurps the whole file into memory and replaces all colons by newlines. All lines that do not contain 1 are removed and surplus newlines deleted.
An alternative to the really ugly sed is: grep -o '\w*2\w*'
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | grep -o '\w*2\w*'
test2
bob2
fred2
grep -o: only matching
Or: grep -o '[^:]*2[^:]*'
echo -e "test1:test2:test3" | sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;//!D'
sed -n doesn't print unless told to
sed -E allows using parens to match (\n|$) which is newline or the end of the pattern space
P prints the pattern buffer up to the first newline.
D trims the pattern buffer up to the first newline
[^\n] is a character class that matches anything except a newline
// is sed shorthand for repeating a match
//! is then matching everything that didn't match previously
So, after you split into newlines, you want to make sure the 2 character is between the start of the pattern buffer ^ and the first newline.
And, if there is not the character you are looking for, you want to D delete up to the first newline.
At that point, it works for one line of input, with one string containing the character you're looking for.
To expand to several matches within a line, you have to ta, conditionally branch back to label :a:
$ printf "test1:test2:test3\nbob3:bob2:fred2\n" | \
sed -En ':a s/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;D;ta'
test2
bob2
fred2
This is simply NOT a job for sed. With GNU awk for multi-char RS:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '/1/'
test1
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' 'NR%2'
test1
test3
test5
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS='[:\n]' '!(NR%2)'
test2
test4
test6
$ echo "foo1:bar1:foo2:bar2:foo3:bar3" | awk -v RS='[:\n]' '/foo/ || /2/'
foo1
foo2
bar2
foo3
With any awk you'd just have to strip the \n from the final record before operating on it:
$ echo "test1:test2:test3:test4:test5:test6"| awk -v RS=':' '{sub(/\n$/,"")} /1/'
test1

How to force sed to print what it does with my file?

How to force sed to print what it does with my file?
My text01.txt file:
aaa
bbb
ccc
ddd
c
ee
My code:
sed -i 's/c/X/g' ./text01.txt
I want to get in terminal something like this:
sed: line 3 change ccc to XXX
sed: line 5 change c to X
sed -i"bak" 's/c/X/g' text01.txt && diff text01.txt text01.txtbak
will give you a diff summary. like:
3c3
< XXX
---
> ccc
5c5
< X
---
> c
You can read diff man page, to adjust the diff output, e.g. with -c/-u/-y... options as you like.
If you want to get exactly same format you described, you can do some work on diff output as well.
This comes pretty close to your requirement:
$ paste <(cat -n text01.txt) <(sed 's/c/X/g' ./text01.txt)
1 aaa aaa
2 bbb bbb
3 ccc XXX
4 ddd ddd
5 c X
6 ee ee
cat -n prepends line numbers, and the paste command with process substitution prints the file and the sed output next to each other.
Or, more elaborate, with awk:
awk '{ getline mod_line < ARGV[2]
if ($0 != mod_line) {
printf "sed line %d change %s to %s\n", NR, $0, mod_line }
}' text01.txt <(sed 's/c/X/g' text01.txt)
This reads, for each line of text01.txt, the corresponding line as modified by sed. If they are different, the line number and both lines get printed:
sed line 3 change ccc to XXX
sed line 5 change c to X
plus an awk warning because it tries to close an anonymous pipe – this can be suppressed by redirecting stderr, i.e., appending 2> /dev/null to the command.
The closest thing to sed "built-in" debugging is the l command, which prints the current content of the pattern space. If you'd like to go all in, there are proper debuggers, for example sedsed.
This might work for you (GNU sed):
sed -i -e 'h;/c/!b;s//X/g;H;x;s/\n/ to /;s/^/sed: changed /w/dev/stdout' -e 'x' file
This makes a copy of each line in the hold space (HS) and if the substitution pattern does not match, no further action takes place. Otherwise, the substitution is made on the line in the pattern space (PS) and this is appended to the HS. Focus is then changed to the HS and format of before and after effected. The formated line is then written out to the standard output i.e. the terminal and finally focus is reverted to the PS so that the substituted line is included in the original updated file.

Sed Process Substitution on Insert - Without Backslashes

I have function that prints a header that needs to be applied across several files, but if I utilize a sed process substitution the lines prior to the last have a backslash \ on them.
E.g.
function print_header() {
cat << EOF
-------------------------------------------------------------------
$(date '+%B %d, %Y # ~ %r') ID:$(echo $RANDOM)
EOF
}
If I then take a file such as test.txt:
line 1
line 2
line 3
line 4
line 5
sed "1 i $(print_header | sed 's/$/\\/g')" test.txt
I get:
-------------------------------------------------------------------\
November 24, 2015 # ~ 11:18:28 AM ID:13187
line 1
line 2
line 3
line 4
line 5
Notice the troublesome backslash at the end of the first line, I'd like to not have that backslash appear. Any ideas?
I would use cat for that:
cat <(print_header) file > file_with_header
This behavior depends on the sed dialect. Unfortunately, it's one of the things which depends on which version you have.
To simplify debugging, try specifying verbatim text. Here's one from a Debian system.
vnix$ sed '1i\
> foo\
> bar' <<':'
> hello
> goodbye
> :
foo
bar
hello
goodbye
Your diagnostics appear to indicate that your sed dialect does not in fact require the backslash after the first i.
Since you are generating the contents of the header programmatically anyway, my recommended solution would be to refactor the code so that you can avoid this conundrum. If you don't want cat <<EOF test.txt then maybe experiment with sed 1r/dev/stdin' <<EOF test.txt (I could not get 1r- to work, but /dev/stdin should be portable to any Linux.)
Here is my kludgy fix, if you can find something more elegant I'll gladly credit you:
sed "1 i $(print_header | sed 's/$/\\/g;$s/$/\x01/')" test.txt | tr -d '\001'
This puts an unprintable SOH (\x01) ascii Start Of Header character after the inserted text, that precludes the backslashes and then I run it over tr to delete the SOH chars.

Sed or awk: how to call line addresses from separate file?

I have 'file1' with (say) 100 lines. I want to use sed or awk to print lines 23, 71 and 84 (for example) to 'file2'. Those 3 line numbers are in a separate file, 'list', with each number on a separate line.
When I use either of these commands, only line 84 gets printed:
for i in $(cat list); do sed -n "${i}p" file1 > file2; done
for i in $(cat list); do awk 'NR==x {print}' x=$i file1 > file2; done
Can a for loop be used in this way to supply line addresses to sed or awk?
This might work for you (GNU sed):
sed 's/.*/&p/' list | sed -nf - file1 >file2
Use list to build a sed script.
You need to do > after the loop in order to capture everything. Since you are using it inside the loop, the file gets overwritten. Inside the loop you need to do >>.
Good practice is to or use > outside the loop so the file is not open for writing during every loop iteration.
However, you can do everything in awk without for loop.
awk 'NR==FNR{a[$1]++;next}FNR in a' list file1 > file2
You have to >>(append to the file) . But you are overwriting the file. That is why, You are always getting 84 line only in the file2.
Try use,
for i in $(cat list); do sed -n "${i}p" file1 >> file2; done
With sed:
sed -n $(sed -e 's/^/-e /' -e 's/$/p/' list) input
given the example input, the inner command create a string like this: `
-e 23p
-e 71p
-e 84p
so the outer sed then prints out given lines
You can avoid running sed/awk in a for/while loop altgether:
# store all lines numbers in a variable using pipe
lines=$(echo $(<list) | sed 's/ /|/g')
# print lines of specified line numbers and store output
awk -v lineS="^($lines)$" 'NR ~ lineS' file1 > out

How to use a sed one-liner to parse "rec:id=1&name=zz&age=21" into "1 zz 21"?

I can chain multiple sed substitutions and a awk operation to achieve this, but is there a single sed substitution that can do it?
Also is there any other tool that is more suitable for this parsing task?
You could try:
sed -r 's!rec:id=(.*?)&name=(.*?)&age=(.*?)!\1 \2 \3!' input_file
If you don't know the rec:id etc in advance but you know there's three, you could try:
sed -r 's![^=]+=(.*?)&[^=]+=(.*?)&[^=]+=(.*?)!\1 \2 \3!' input_file
If you don't know how many &name=value pairs you're after in advance but want to output all the values, you could try something like:
grep -P -o '(?<==)([^&]*)(?=&|$)' | xargs
where the -P means 'perl regex', the regex says "find the string followed by an & (or end of string) and preceded by and equals sign", the -o means to print just the matches (ie the 1, zz, and 21) each on their own line, and the | xargs moves these from their own line to one line and space separated (ie 1\nzz\n21 to 1 zz 21).
This might work for you:
echo "rec:id=1&name=zz&age=21" | sed 's/[^=]*=\([^&]*\)/\1 /g'
1 zz 21
However this leaves an extra space at the end, to solve this use:
echo "rec:id=1&name=zz&age=21"|sed 's/[^=]*=\([^&]*\)/\1 /g:;s/ $//'
1 zz 21
How about parsing the values directly into variables?
inbound="rec:id=1&name=zz&age=21"
eval $(echo $inbound | cut -c5- | tr \& "\n")
echo "Name:$name, ID:$id, Age:$age"
Or even better, though slightly more arcane:
inbound="rec:id=1&name=zz&age=21"
IFS=\& eval $(cut -c5- <<< $inbound)
echo "Name:$name, ID:$id, Age:$age"