How do I capture the last column with sed? - sed

I know this is easy with awk ('{print $NF}'), but for learning purposes I'd like to understand how to do it with sed.
I have a variable with a bunch of white-space separated values, and I'd like to print just that last one. I thought I could do it with a greedy match but it's printing everything after the first space, instead of everything after the last space:
echo $var
"t1": 0.004, "t2": 0.010, "t3": 0.144
echo $var | sed 's/\S*\(.*\)$/\1/'
0.004, "t2": 0.010, "t3": 0.144
When I try putting it in a match group with text afterward, I get even further away:
echo $var | sed 's/[\S*.*]+\(.*\)$/\1/'
"t1": 0.004, "t2": 0.010, "t3": 0.144

$ echo ">$var<"
>"t1": 0.004, "t2": 0.010, "t3": 0.144<
$ echo "$var" | sed 's/\S*\(.*\)$/\1/'
0.004, "t2": 0.010, "t3": 0.144
The regular expression matched as follows:
\S*
_____
"t1": 0.004, "t2": 0.010, "t3": 0.144
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     \(.*\)$
The manual page specifies that a regex will match as early as possible:
In the event that an RE could match more than one substring of a given string, the RE matches the one starting earliest in the string.
To return only the last run of non-space characters after the last colon followed by optional whitespace, use sed -e 's/.*:\s*//':
$ echo "$var" | sed -e 's/.*:\s*//'
0.144
The regular expression matches like this:
.*                             \s*
______________________________ _
"t1": 0.004, "t2": 0.010, "t3": 0.144
                              ^
                              :
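For the question as originally asked (the last whitespace-separated value rather than the text after the last colon), the same greedy-prefix idea can be sketched as follows. The function name last_field is made up for this example, and the POSIX class [[:space:]] is used in place of the GNU-only \s.

```shell
# last_field: print the last whitespace-separated field of each input line.
# The greedy .* runs up to the final whitespace character on the line,
# so only the text after it survives the substitution.
# (last_field is a hypothetical name chosen for this sketch.)
last_field() {
    sed 's/.*[[:space:]]//'
}

var='"t1": 0.004, "t2": 0.010, "t3": 0.144'
printf '%s\n' "$var" | last_field
```

A line with no whitespace passes through unchanged, and a line that ends in whitespace would come out empty, so trailing blanks may need stripping first.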

Related

sed 's/\s+$//g' does not strip trailing space

I expected sed 's/\s+$//g' to strip trailing spaces
echo "'$(echo 'Magnetic ' | sed 's/\s+$//g')'"
outputs 'Magnetic ', as does
echo "'$(echo 'Magnetic ' | sed 's/[\n\s]+$//g')'"
How do I remove the trailing space with sed?
You have to escape the plus sign + because sed uses BRE, so:
echo "'$(echo 'Magnetic ' | sed 's/\s\+$//g')'"
If the -r or -E flag is given, sed uses ERE instead, so you don't have to escape it:
echo "'$(echo 'Magnetic ' | sed -r 's/\s+$//g')'"
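Both \s and \+ are GNU extensions, so a version that avoids them may travel better between sed implementations. A minimal sketch, assuming only POSIX BRE features; the function name rstrip is invented for the example:

```shell
# rstrip: remove trailing whitespace from every input line.
# [[:space:]] is the POSIX character class and * needs no escaping in BRE,
# so neither -r/-E nor the GNU-only \s and \+ are required.
# (rstrip is a hypothetical name chosen for this sketch.)
rstrip() {
    sed 's/[[:space:]]*$//'
}

printf "'%s'\n" "$(printf 'Magnetic \n' | rstrip)"
```

The [\n\s] attempt in the question cannot work for a related reason: \s is not treated as a whitespace class inside a bracket expression.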

How to replace newline characters with a string in sed

First, this is not a duplicate of, e.g., How can I replace each newline (\n) with a space using sed?
What I want is to exactly replace every newline (\n) in a string, like so:
printf '%s' $'' | sed '...; s/\n/\\&/g'
should result in the empty string
printf '%s' $'a' | sed '...; s/\n/\\&/g'
should result in a (not followed by a newline)
printf '%s' $'a\n' | sed '...; s/\n/\\&/g'
should result in
a\
(the trailing \n of the final line should be replaced, too)
A solution like :a;N;$!ba; s/\n/\\&/g from the other question doesn't do that properly:
printf '%s' $'' | sed ':a;N;$!ba; s/\n/\\&/g' | hd
works;
printf '%s' $'a' | sed ':a;N;$!ba;s/\n/\\&/g' | hd
00000000 61 |a|
00000001
works;
printf '%s' $'a\nb' | sed ':a;N;$!ba;s/\n/\\&/g' | hd
00000000 61 5c 0a 62 |a\.b|
00000004
works;
but when there's a trailing \n on the last line
printf '%s' $'a\nb\n' | sed ':a;N;$!ba;s/\n/\\&/g' | hd
00000000 61 5c 0a 62 0a |a\.b.|
00000005
it doesn't get quoted.
Easier to use perl than sed, since it has (by default, at least) a more straightforward treatment of the newlines in its input:
printf '%s' '' | perl -pe 's/\n/\\\n/' # Empty string
printf '%s' a | perl -pe 's/\n/\\\n/' # a
printf '%s\n' a | perl -pe 's/\n/\\\n/' # a\<newline>
printf '%s\n' a b | perl -pe 's/\n/\\\n/' # a\<newline>b\<newline>
# etc
If your inputs aren't huge, you could use
perl -0777 -pe 's/\n/\\\n/g'
instead, which reads the entire input at once rather than line by line and can be more efficient.
How to replace newline characters with a string in sed
It's not possible in portable sed. From the sed script's point of view, whether the trailing newline is present or missing makes no difference and cannot be detected.
With GNU sed, however, you can use sed -z:
sed -z 's/\n/\\\n/g'
GNU awk can use the RT variable to detect a missing record terminator:
$ printf 'a\nb\n' | gawk '{ORS=(RT != "" ? "\\" : "") RT} 1'
a\
b\
$ printf 'a\nb' | gawk '{ORS=(RT != "" ? "\\" : "") RT} 1'
a\
b$
This adds a "\" before each non-empty record terminator.
Using any awk:
$ printf 'a\nb\n\n' | awk '{printf "%s%s", sep, $0; sep="\\\n"}'
a\
b\
$ printf 'a\nb\n' | awk '{printf "%s%s", sep, $0; sep="\\\n"}'
a\
b$
Or { cat file; echo; } | awk ... – always add a newline to the input.
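Putting the last two ideas together, a rough sketch of a filter that handles all the cases from the question might look like this. The function name escape_newlines is made up here; the awk program is the "any awk" one from above, fed with one extra newline so that the final record always ends where the real input ended.

```shell
# escape_newlines: put a backslash in front of every newline on stdin,
# including a trailing one, without adding a newline that was not there.
# One extra newline is appended to the input; awk then prints the records
# joined by backslash-newline, so the extra newline never reaches the output.
# (escape_newlines is a hypothetical name chosen for this sketch.)
escape_newlines() {
    { cat; echo; } | awk '{printf "%s%s", sep, $0; sep="\\\n"}'
}

# od -c makes a present or missing trailing newline visible.
printf 'a\nb\n' | escape_newlines | od -c
printf 'a'      | escape_newlines | od -c
```

Empty input produces empty output, since the only record awk sees is the empty one created by the appended newline.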

Insert comma after certain byte range

I'm trying to turn a big list of data into a CSV. It's basically a giant list with no spaces, and the rows are separated by newlines. I have made a bash script that loops through the document, awks out each line, cuts a byte range, and then adds a comma and appends the piece to the end of the line. It looks like this:
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 1-12 | tr -d '\n' >> $x.tmp
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 13-17 | tr -d '\n' | xargs -I {} sed -i '' -e 's~$~,{}~' $x.tmp
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 18-22 | tr -d '\n' | xargs -I {} sed -i '' -e 's~$~,{}~' $x.tmp
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 23-34 | tr -d '\n' | xargs -I {} sed -i '' -e 's~$~,{}~' $x.tmp
The problem is this is EXTREMELY slow, and the data has about 400k rows. I know there must be a better way to accomplish this. Essentially I just need to add a comma after every 12/17/22/34 etc character of a line.
Any help is appreciated, thank you!
There are many many ways to do this with Perl. Here is one way:
perl -pe 's/(.{12})(.{5})(.{5})(.{12})/$1,$2,$3,$4,/' < input-file > output-file
The matching pattern in the substitution captures four groups of text from the beginning of each line with 12, 5, 5, and 12 arbitrary characters. The replacement pattern places a comma after each group.
With GNU awk, you could write
gawk 'BEGIN {FIELDWIDTHS="12 5 5 12"; OFS=","} {$1=$1; print}'
The $1=$1 part is to force awk to rewrite the line, incorporating the output field separator, without changing anything.
This is very much a job for substr.
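use strict;
use warnings;
my @widths = (12, 5, 5, 12);
while (my $line = <DATA>) {
    my $offset = 0;    # restart at the beginning of every line
    for my $width (@widths) {
        $offset += $width;
        substr $line, $offset, 0, ',';    # insert a comma at the field boundary
        ++$offset;                        # step over the comma just inserted
    }
    print $line;
}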
__DATA__
1234567890123456789012345678901234567890
output
123456789012,34567,89012,345678901234,567890
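Since the question's script already leans on sed, the same fixed-width capture can also be sketched as a single sed substitution that processes the whole file in one pass. The widths 12, 5, 5 and 12 come from the question; the function name add_commas is invented for the example.

```shell
# add_commas: insert a comma after the 12-, 5-, 5- and 12-character fields
# at the start of every line. Only POSIX BRE interval expressions are used,
# so this should behave the same on GNU and BSD sed.
# (add_commas is a hypothetical name chosen for this sketch.)
add_commas() {
    sed 's/^\(.\{12\}\)\(.\{5\}\)\(.\{5\}\)\(.\{12\}\)/\1,\2,\3,\4,/'
}

printf '%s\n' 1234567890123456789012345678901234567890 | add_commas
# prints 123456789012,34567,89012,345678901234,567890
```

Run as add_commas < PROP.txt > PROP.csv (the output name is an assumption), this reads each row once instead of spawning several processes per row. Lines shorter than 34 characters do not match and pass through unchanged.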

'sed' usage in perl script error

I have the following line in a Perl script:
my $temp = `sed 's/ /\n/g' /sys/bus/w1/devices/w1_bus_master1/10-000802415bef/w1_slave | grep t= | sed 's/t=//'`;
Which throws up the error:
"sed: -e expression #1, char 2: unterminated `s' command"
If I run a shell script as below it works fine:
temp1=`sed 's/ /\n/g' /sys/bus/w1/devices/w1_bus_master1/10-000802415bef/w1_slave | grep t= | sed 's/t=//'`
echo $temp1
Anyone got any ideas?
Perl interprets your \n as a literal newline character. Your command line will therefore look something like this from sed's perspective:
sed s/ /
/g ...
which sed doesn't like. The shell does not interpret it that way.
The proper solution is not to use sed/grep in such a situation at all. Perl is, after all, very, very good at handling text. For example (untested):
use File::Slurp;
my $text = read_file("/sys/bus...");
$text =~ s/ /\n/g;
my @lines = map { s/t=//; $_ } grep { m/t=/ } split m/\n/, $text;
Alternatively escape the \n once, e.g. sed 's/ /\\n/g'....
You need to escape the \n in your first regular expression. Perl's backtick operator treats it as a control character and inserts a newline instead of the string \n.
                     |
                     V
my $temp = `sed 's/ /\\n/g' /sys/bus/ # ...
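The difference can be reproduced from a plain shell by handing sed the two script texts directly. This is only a sketch: the variable names and the sample input 'a b' are made up, and GNU sed is assumed for the meaning of \n in the replacement.

```shell
# Script text as sed receives it once \n has already become a real newline.
broken_script='s/ /
/g'
# Script text when the backslash reaches sed intact (what \\n in Perl yields).
working_script='s/ /\n/g'

# The raw newline ends the s command early, so sed refuses to run it.
if ! printf 'a b\n' | sed "$broken_script" 2>/dev/null; then
    echo "sed rejected the script containing a raw newline"
fi

# GNU sed assumed: \n in the replacement stands for a newline.
printf 'a b\n' | sed "$working_script"
```

The second command prints a and b on separate lines, which is what the working shell version of the pipeline relies on.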

sed character replacement in portion of a line

New to sed and could use some help.
I would like to turn this "a/b/c a/b/c" into this "a/b/c a-b-c".
where a/b/c is any path.
thanks
Give this a try:
sed 'h; s/ .*//; x; s/.* //; s:/:-:g; x; G; s/\n/ /'
Since you want to use whitespace to delimit, I'd just use perl:
perl -ane '$F[1] =~ s/\//-/g; print "@F\n"'
You can also use awk:
$ echo "a/b/c a/b/c" | awk '{gsub("/","-",$NF)}1'
a/b/c a-b-c
This might work:
echo "a/b/c a/b/c" | sed ':a;s|\(.* [^/]*\)/|\1-|;ta'
a/b/c a-b-c
Or this:
echo "a/b/c a/b/c" | sed 's/.* //;h;y/\//-/;x;G;y/\n/ /'
a/b/c a-b-c
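The last one-liner reuses the second field as the first, so it relies on both paths being identical, as they are in the question. If the two fields may differ, the loop variant can be wrapped up as below. Splitting the script into -e pieces is a portability choice, since some sed implementations dislike a label followed by a semicolon; the function name dash_last_path and the second sample line are made up.

```shell
# dash_last_path: replace every / with - in the part of the line after the
# last space. Each pass of the loop rewrites one slash that follows the last
# space; "ta" jumps back to :a until no such slash is left.
# (dash_last_path is a hypothetical name chosen for this sketch.)
dash_last_path() {
    sed -e ':a' -e 's|\(.* [^/]*\)/|\1-|' -e 'ta'
}

printf '%s\n' 'a/b/c a/b/c' | dash_last_path   # prints a/b/c a-b-c
printf '%s\n' 'x/y a/b/c'   | dash_last_path   # prints x/y a-b-c
```

Lines without a space contain nothing for the pattern to anchor on and are printed unchanged.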