How to replace newline characters with a string in sed - sed

First, this is not a duplicate of, e.g., How can I replace each newline (\n) with a space using sed?
What I want is to exactly replace every newline (\n) in a string, like so:
printf '%s' $'' | sed '...; s/\n/\\&/g'
should result in the empty string
printf '%s' $'a' | sed '...; s/\n/\\&/g'
should result in a (not followed by a newline)
printf '%s' $'a\n' | sed '...; s/\n/\\&/g'
should result in
a\
(the trailing \n of the final line should be replaced, too)
A solution like :a;N;$!ba; s/\n/\\&/g from the other question doesn't do that properly:
printf '%s' $'' | sed ':a;N;$!ba; s/\n/\\&/g' | hd
works;
printf '%s' $'a' | sed ':a;N;$!ba;s/\n/\\&/g' | hd
00000000 61 |a|
00000001
works;
printf '%s' $'a\nb' | sed ':a;N;$!ba;s/\n/\\&/g' | hd
00000000 61 5c 0a 62 |a\.b|
00000004
works;
but when there's a trailing \n on the last line
printf '%s' $'a\nb\n' | sed ':a;N;$!ba;s/\n/\\&/g' | hd
00000000 61 5c 0a 62 0a |a\.b.|
00000005
it doesn't get quoted.

Easier to use perl than sed, since it has (by default, at least) a more straightforward treatment of the newlines in its input:
printf '%s' '' | perl -pe 's/\n/\\\n/' # Empty string
printf '%s' a | perl -pe 's/\n/\\\n/' # a
printf '%s\n' a | perl -pe 's/\n/\\\n/' # a\<newline>
printf '%s\n' a b | perl -pe 's/\n/\\\n/' # a\<newline>b\<newline>
# etc
If your inputs aren't huge, you could use
perl -0777 -pe 's/\n/\\\n/g'
instead, which reads the entire input at once rather than line by line and can be more efficient.

how to replace newline characters with a string in sed
It's not possible. From a sed script's point of view, whether the trailing newline is missing or not makes no difference and is undetectable.
Anyway, use GNU sed with sed -z, which splits the input on NUL characters instead of newlines:
sed -z 's/\n/\\\n/g'
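A quick way to check this against the cases from the question. This is only a sketch: it assumes GNU sed (for -z), the function name escape_newlines is made up for illustration, and od stands in for hd because it is more widely installed.

```shell
# Hypothetical helper: put a backslash in front of every newline on stdin.
# Assumes GNU sed; with -z the input is split on NUL instead of newline,
# so newlines are ordinary characters inside the pattern space.
escape_newlines() {
    sed -z 's/\n/\\\n/g'
}

# Dump each test case as hex bytes (5c = backslash, 0a = newline).
for input in '' 'a' 'a\n' 'a\nb\n'; do
    printf "$input" | escape_newlines | od -An -tx1
done
```

In the last case the final 0a should also be preceded by 5c, which is what the :a;N;$!ba approach fails to do.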

GNU awk can use the RT variable to detect a missing record terminator:
$ printf 'a\nb\n' | gawk '{ORS=(RT != "" ? "\\" : "") RT} 1'
a\
b\
$ printf 'a\nb' | gawk '{ORS=(RT != "" ? "\\" : "") RT} 1'
a\
b$
This adds a "\" before each non-empty record terminator.
Using any awk:
$ printf 'a\nb\n\n' | awk '{printf "%s%s", sep, $0; sep="\\\n"}'
a\
b\
$ printf 'a\nb\n' | awk '{printf "%s%s", sep, $0; sep="\\\n"}'
a\
b$
Or { cat file; echo; } | awk ... – always add a newline to the input.
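Put together, the "always add a newline" variant could look roughly like this. The function name escape_newlines_awk is hypothetical, and the od calls are only there to make the bytes visible.

```shell
# Hypothetical helper: escape every newline on stdin, using any POSIX awk.
# The extra echo guarantees the input ends in a newline, so the last
# record awk sees is exactly the text after the last original newline
# (an empty record if the input already ended in a newline).
escape_newlines_awk() {
    { cat; echo; } | awk '{ printf "%s%s", sep, $0; sep = "\\\n" }'
}

# Hex dumps: 5c = backslash, 0a = newline.
printf 'a\nb\n' | escape_newlines_awk | od -An -tx1   # both newlines escaped
printf 'a\nb'   | escape_newlines_awk | od -An -tx1   # nothing added after b
```

Because the separator is printed before each record rather than after it, the newline appended by echo itself never shows up in the output.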

Related

grep + grep + sed = sed: no input files

Can anybody help me please?
grep " 287 " file.txt | grep "HI" | sed -i 's/HIS/HID/g'
sed: no input files
Tried also xargs
grep " 287 " file.txt | grep HI | xargs sed -i 's/HIS/HID/g'
sed: invalid option -- '6'
This works fine
grep " 287 " file.txt | grep HI
If you want to keep your pipeline:
f=file.txt
tmp=$(mktemp)
grep " 287 " "$f" | grep "HI" | sed 's/HIS/HID/g' > "$tmp" && mv "$tmp" "$f"
Or, simplify:
sed -i -n '/ 287 / {/HI/ s/HIS/HID/p}' file.txt
That will filter out any line that does not contain " 287 " and "HI" -- is that what you want? I suspect you really want this:
sed -i '/ 287 / {/HI/ s/HIS/HID/}' file.txt
For lines that match / 287 /, execute the commands in braces. In there, for lines that match /HI/, search for the first "HIS" and replace with "HID". sed implicitly prints all lines if -n is not specified.
Other commands that do the same thing:
awk '/ 287 / && /HI/ {sub(/HIS/, "HID")} {print}' file.txt > new.txt
perl -i -pe '/ 287 / and /HI/ and s/HIS/HID/' file.txt
awk does not have an "in-place" option (except gawk -i inplace for recent gawk versions)
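To see the difference between the filtering form and the keep-everything form without touching a real file, here is a small sketch. The three sample lines are invented for illustration, and GNU sed is assumed (other seds may want a ; before the closing brace).

```shell
# Sample data made up for this demo; substitute your own file.
sample=$(mktemp)
printf '%s\n' 'a 287 HIS' 'b 287 XYZ' 'c 100 HIS' > "$sample"

# With -n and the p flag, only lines where the substitution happened are printed.
filtered=$(sed -n '/ 287 / {/HI/ s/HIS/HID/p}' "$sample")

# Without -n, every line is printed and only the matching one is changed.
kept=$(sed '/ 287 / {/HI/ s/HIS/HID/}' "$sample")

printf 'filtered:\n%s\n\nkept:\n%s\n' "$filtered" "$kept"
rm -f "$sample"
```

Adding -i to either command writes that same output back to the file, so the -n form would delete the other two lines.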

how to put | between content lines of a text file?

I have a file containing:
L1
L2
L3
.
.
.
L512
I want to change its content to :
L1 | L2 | L3 | ... | L512
It seems so easy, but I have been sitting here for an hour trying to make it work. I tried to do it with sed, but didn't get what I wanted: it seems that sed just inserts empty lines between the content. Any suggestions, please?
With sed, this requires reading the whole input into a buffer and then replacing all newlines with |, like this:
sed ':a;N;$!ba;s/\n/ | /g' input.txt
Part 1 - buffering input
:a defines a label called 'a'
N gets the next line from input and appends it to the pattern buffer
$!ba jumps to a unless the end of input is reached
Part 2 - replacing newlines by |
s/\n/ | /g runs the substitute command on the pattern buffer, replacing every newline with " | "
As you can see, this is very inefficient since it requires you to:
read the complete input into memory
operate three times on the input: 1. reading, 2. substituting, 3. printing
Therefore I would suggest using awk, which can do it in one loop:
awk 'NR==1{printf "%s", $0;next}{printf " | %s", $0}END{print ""}' input.txt
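For reference, the sed buffering loop explained above can also be written one command per line with comments, which may make the control flow easier to follow. This is just a sketch on made-up three-line input and assumes GNU sed.

```shell
# Same program as the sed one-liner, one command per line.
joined=$(printf 'L1\nL2\nL3\n' | sed '
# label "a": the top of the loop
:a
# append the next input line to the pattern buffer
N
# if this was not the last line, jump back to "a"
$!ba
# the buffer now holds the whole input: turn every newline into " | "
s/\n/ | /g
')
echo "$joined"
```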
Here is one sed
sed ':a;N;s/\n/ | /g;ta' file
L1 | L2 | L3 | ... | L512
And one awk
awk '{printf("%s%s",sep,$0);sep=" | "} END {print ""}' file
L1 | L2 | L3 | ... | L512
perl -pe 's/\n/ |/g unless(eof)' file
if space between | is not mandatory
tr "\n" '|' < YourFile
Several options, including those mentioned here:
paste -sd'|' file
sed ':a;N;s/\n/ | /g;ta' file
sed ':a;N;$!ba;s/\n/ | /g' file
perl -0pe 's/\n/ | /g;s/ \| $/\n/' file
perl -0nE 'say join " | ", split /\n/' file
perl -E 'chomp(@x=<>); say join " | ", @x' file
mapfile -t ary < file; (IFS="|"; echo "${ary[*]}")
awk '{printf("%s%s",sep,$0);sep=" | "} END {print ""}' file
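Note that these options do not all produce the same spacing. A small comparison on made-up input, as a sketch (the variable names are only for illustration):

```shell
# paste -s joins all lines into one; -d takes single-character delimiters,
# so this form cannot emit the spaces around the bar.
with_paste=$(printf 'L1\nL2\nL3\n' | paste -sd'|' -)

# awk prints the separator before every line except the first,
# so any multi-character separator works.
with_awk=$(printf 'L1\nL2\nL3\n' | awk '{printf("%s%s",sep,$0);sep=" | "} END {print ""}')

echo "$with_paste"
echo "$with_awk"
```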

Pattern extraction using SED or AWK

How do I extract 68 from v1+r0.68?
Using awk, returns everything after the last '.'
echo "v1+r0.68" | awk -F. '{print $NF}'
Using sed to get the number after the last dot:
echo 'v1+r0.68' | sed 's/.*[.]\([0-9][0-9]*\)$/\1/'
grep is good at extracting things:
kent$ echo " v1+r0.68"|grep -oE "[0-9]+$"
68
Match the digit string before the end of the line using grep:
$ echo 'v1+r0.68' | grep -Eo '[0-9]+$'
68
Or match any digits after a .
$ echo 'v1+r0.68' | grep -Po '(?<=\.)\d+'
68
Print everything after the . with awk:
echo "v1+r0.68" | awk -F. '{print $NF}'
68
Substitute everything before the . with sed:
echo "v1+r0.68" | sed 's/.*\.//'
68
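These approaches agree on the sample string but behave differently on other inputs. A sketch comparing two of them; the helper names and the second test string are made up for illustration:

```shell
# Everything after the last dot, whatever it contains.
after_last_dot() {
    printf '%s\n' "$1" | sed 's/.*\.//'
}

# Only a run of digits at the very end of the line (prints nothing otherwise).
trailing_digits() {
    printf '%s\n' "$1" | grep -Eo '[0-9]+$'
}

for s in 'v1+r0.68' 'v2.10-beta'; do
    printf '%s: after last dot=[%s] trailing digits=[%s]\n' \
        "$s" "$(after_last_dot "$s")" "$(trailing_digits "$s")"
done
```

Which one is right depends on whether the input is guaranteed to end in digits.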
type man grep
and you will see
...
-o, --only-matching
Show only the part of a matching line that matches PATTERN.
then type echo 'v1+r0.68' | grep -o '68'
if you want it any where special do:
echo 'v1+r0.68' | grep -o '68' > anyWhereSpecial.file_ending

Insert comma after certain byte range

I'm trying to turn a big list of data into a CSV. It's basically a giant list with no spaces, and the rows are separated by newlines. I have made a bash script that basically loops through the document, awks out the line, cuts the byte range, and then adds a comma and appends it to the end of the line. It looks like this:
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 1-12 | tr -d '\n' >> $x.tmp
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 13-17 | tr -d '\n' | xargs -I {} sed -i '' -e 's~$~,{}~' $x.tmp
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 18-22 | tr -d '\n' | xargs -I {} sed -i '' -e 's~$~,{}~' $x.tmp
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 23-34 | tr -d '\n' | xargs -I {} sed -i '' -e 's~$~,{}~' $x.tmp
The problem is this is EXTREMELY slow, and the data has about 400k rows. I know there must be a better way to accomplish this. Essentially I just need to add a comma after every 12/17/22/34 etc character of a line.
Any help is appreciated, thank you!
There are many many ways to do this with Perl. Here is one way:
perl -pe 's/(.{12})(.{5})(.{5})(.{12})/$1,$2,$3,$4,/' < input-file > output-file
The matching pattern in the substitution captures four groups of text from the beginning of each line with 12, 5, 5, and 12 arbitrary characters. The replacement pattern places a comma after each group.
With GNU awk, you could write
gawk 'BEGIN {FIELDWIDTHS="12 5 5 12"; OFS=","} {$1=$1; print}'
The $1=$1 part is to force awk to rewrite the line, incorporating the output field separator, without changing anything.
This is very much a job for substr.
use strict;
use warnings;
my @widths = (12, 5, 5, 12);
while (my $line = <DATA>) {
my $offset = 0; # start from the beginning of every line
for my $width (@widths) {
$offset += $width;
substr $line, $offset, 0, ',';
++$offset;
}
print $line;
}
__DATA__
1234567890123456789012345678901234567890
output
123456789012,34567,89012,345678901234,567890

'sed' usage in perl script error

I have the following line in a Perl script:
my $temp = `sed 's/ /\n/g' /sys/bus/w1/devices/w1_bus_master1/10-000802415bef/w1_slave | grep t= | sed 's/t=//'`;
Which throws up the error:
"sed: -e expression #1, char 2: unterminated `s' command"
If I run a shell script as below it works fine:
temp1=`sed 's/ /\n/g' /sys/bus/w1/devices/w1_bus_master1/10-000802415bef/w1_slave | grep t= | sed 's/t=//'`
echo $temp1
Anyone got any ideas?
Perl interprets your \n as a literal newline character. Your command line will therefore look something like this from sed's perspective:
sed s/ /
/g ...
which sed doesn't like. The shell does not interpret it that way.
The proper solution is not to use sed/grep in such a situation at all. Perl is, after all, very, very good at handling text. For example (untested):
use File::Slurp;
my $text = read_file("/sys/bus...");
$text =~ s/ /\n/g;
my @lines = map { s/t=//; $_ } grep { m/t=/ } split m/\n/, $text;
Alternatively escape the \n once, e.g. sed 's/ /\\n/g'....
You need to escape the \n in your first regular expression. The backtick operator in Perl treats it as a control character and inserts a newline instead of passing the two-character string \n to the shell.
                     |
                     V
my $temp = `sed 's/ /\\n/g' /sys/bus/ # ...