How to replace whitespace with one blank using sed?

Using
sed "s/[[:blank:]]*/ /g" a>b
doesn't seem to work.

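The asterisk also matches the empty string, so your command inserts a blank between non-blank characters as well. Change the asterisk to a plus sign (\+ is a GNU extension in basic regular expressions):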
sed "s/[[:blank:]]\+/ /g" a>b
or use an alternative that means the same thing:
sed "s/[[:blank:]][[:blank:]]*/ /g" a>b
or
sed "s/[[:blank:]]\{1,\}/ /g" a>b
Also, it's more helpful to post error messages or precise ways that behavior differs from expectations since "doesn't seem to work" conveys very little information.
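A quick way to see the difference is to run both patterns on a line that mixes spaces and tabs. This is a minimal sketch; the sample text is made up for illustration, and the exact output of the asterisk version may vary slightly between sed implementations.

```shell
# Made-up sample: "ab", then space + tab + space, then "c".
input=$(printf 'ab \t c')

# '*' can match zero blanks, so blanks are also inserted where none existed.
star=$(printf '%s\n' "$input" | sed 's/[[:blank:]]*/ /g')

# '\{1,\}' needs at least one blank, so each run collapses to a single space.
plus=$(printf '%s\n' "$input" | sed 's/[[:blank:]]\{1,\}/ /g')

# Brackets make leading and trailing blanks visible.
printf 'star: [%s]\nplus: [%s]\n' "$star" "$plus"
```

The \{1,\} form is used here because it is part of POSIX basic regular expressions, so it should behave the same outside GNU sed.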

Related

How to replace a specific character in bash

I want to replace '_v' with a whitespace and the last dot . into a dash "-". I tried using
sed 's/_v/ /' and tr '_v' ' '
Original Text
src-env-package_v1.0.1.18
output
src-en -package 1.0.1.18
Expected Output
src-env-package 1.0.1-18
This might work for you (GNU sed):
sed -E 's/(.*)_v(.*)\./\1 \2-/' file
This relies on the greediness of .* to find the last occurrence of _v and, likewise, the last . in the line, then substitutes a space for the former and a - for the latter.
If one of the conditions may occur but not necessarily both, use:
sed -E 's/(.*)_v/\1 /;s/(.*)\./\1-/' file
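To check both variants against the sample from the question, something like the following can be used. It is only a sketch: the second input, which has no _v, is an invented example to show where the two commands differ, and the function names are hypothetical.

```shell
# Sample from the question, plus a made-up name without "_v".
full='src-env-package_v1.0.1.18'
no_v='src-env-package-1.0.1.18'

# Single substitution: needs both "_v" and a later "." to match.
one_step() { sed -E 's/(.*)_v(.*)\./\1 \2-/'; }

# Two substitutions: each one is applied independently of the other.
two_step() { sed -E 's/(.*)_v/\1 /;s/(.*)\./\1-/'; }

a=$(printf '%s\n' "$full" | one_step)   # both conditions present
b=$(printf '%s\n' "$no_v" | one_step)   # no "_v": line is left unchanged
c=$(printf '%s\n' "$no_v" | two_step)   # no "_v": last dot is still replaced
printf '%s\n' "$a" "$b" "$c"
```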
With the samples shown, please try the following sed code. It uses sed's ability to store matched parts of the regex in temporary buffers (called capturing groups), and the -E option to enable ERE (extended regular expressions), which makes the regex easier to write.
sed -E 's/^(src-env-package)_v([0-9]+\..*)\.([0-9]+)$/\1 \2-\3/' Input_file
Or, if the value is in a variable that you want to run the sed command on, use the following:
var="src-env-package_v1.0.1.18"
sed -E 's/^(src-env-package)_v([0-9]+\..*)\.([0-9]+)$/\1 \2-\3/' <<<"$var"
src-env-package 1.0.1-18
Bonus solution: a perl one-liner that uses the same capturing-group concept (as explained above) to produce the required output.
perl -pe 's/^(src-env-package)_v((?:[0-9]+\.){1,}[0-9]+)\.([0-9]+)$/\1 \2-\3/' Input_file

How do I use sed like cut -d':' -f2?

I'm trying to learn sed and can't figure this one out.
I found this:
sed -e 's/^[^=]*=//'
for picking up "valueA" from "valueA=value" but the data I'm working with is this:
Continent:Area(sq. mi):Density(People per sq. mi)
Asia:16,920,000:225
Africa:11,730,000:76
North America:9,460,000:54
South America:6,890,000:54
Antarctica:5,300,000:0.00018
Europe:3,930,000:181
Australia:3,478,200:9.3
===========================
The puzzle says to use sed like 'cut -d':' -f2' and I can't seem to get it done right.
This might work for you (GNU sed):
sed -n 's/[^:]*/\n&\n/2;s/.*\n\(.*\)\n.*/\1/p' file
Surround the 2nd field with newlines, then remove everything up to and including the first newline, as well as everything from the second newline to the end of the line.
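The command above is marked as GNU sed. Where a more portable form is wanted, a sketch like the one below may help. It is not the method from the answer: it captures the second field directly and discards the rest of the line. The shortened sample data comes from the question, and the variable names are only for illustration.

```shell
data='Continent:Area(sq. mi):Density(People per sq. mi)
Asia:16,920,000:225
Australia:3,478,200:9.3
==========================='

# Skip the first field and its colon, keep the second field, drop the rest.
# Lines without a colon do not match and are printed unchanged, as cut does.
with_sed=$(printf '%s\n' "$data" | sed 's/^[^:]*:\([^:]*\).*/\1/')

# Reference result from cut for comparison.
with_cut=$(printf '%s\n' "$data" | cut -d':' -f2)

printf '%s\n' "$with_sed"
```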

Decode sed expression

I would like to understand the sed part of this code:
/usr/local/bin/pcsensor -l60 -n | sed -e "s/^.*\$/PUTVAL downloads\/exec-environmental\/temperature-cpu interval=30 N:\0/"
(the input) pcsensor produces:
2016/09/19 22:41:31 Temperature 90.50F 32.50C
The code produces (output):
PUTVAL downloads/exec-environmental/temperature-cpu interval=30 N:32.50
I am hoping that understanding the sed expression will help me to knock the last digit off (so the temp is only 1 decimal place).
Updated: My booboo (it was late):
the -n in the first part of the command outputs this:
32.50
Which works fine in an echo/printf
printf "32.50 %s\n"| sed -e "s/^.*\$/PUTVAL downloads\/exec-environmental\/temperature-cpu interval=30 N:\0/"
About
sed -e "s/^.*\$/PUTVAL downloads\/exec-environmental\/temperature-cpu interval=30 N:\0/"
This is 1 sed command, namely the s/.../.../ for "substitute". In simple terms, it does a single "search and replace" for every line that it gets to work on.
The "search" part is ^.*\$ (inside double quotes the shell turns \$ into a plain $), the "replacement" part is PUTVAL downloads\/exec-environmental\/temperature-cpu interval=30 N:\0 (the final / only closes the command).
^.*$ is a simple regular expression that here stands for "everything" or "the whole line". So, the s command will replace the whole line with
PUTVAL downloads\/exec-environmental\/temperature-cpu interval=30 N:\0
As Benjamin W. pointed out the use of \0 is "weird". It apparently was meant as a so-called reference, so that the part we searched for is appended after the text "PUTVAL(...)val=30 N:".
I have several issues with the way this is presented, though.
\0 is not in the manpage of my Debian GNU Sed 4.2.2.
Quoting the sed command with " is not needed here and makes things unnecessarily complicated and error-prone. Single quotes should be used instead.
A \0 anywhere in a Shell and especially in Sed could very well stand for a null character which here raises even more red flags due to the " quoting.
Using sed just to prepend a text is "useless use of Sed".
Since you asked about sed, here is how I would write it:
sed -e 's/^.*$/PUTVAL downloads\/exec-environmental\/temperature-cpu interval=30 N:&/'
& stands for "what the search part found". In your case, the whole line.
In order to cut off the last decimal, there are many ways to achieve this. A rather simple approach assumes that the input always has 2 decimals. Then we could prepend a command that replaces the last character (.$) with "nothing" (//):
sed -e 's/.$//;s/^[0-9][0-9]*\.[0-9]/PUTVAL downloads\/exec-environmental\/temperature-cpu interval=30 N:&/'
However, as I said, sed is overkill here. You could just use for instance printf:
text='PUTVAL downloads/exec-environmental/temperature-cpu interval=30 N:'
printf "%s%3.1f\n" "$text" $(/usr/local/bin/pcsensor -l60 -n)
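The two sed variants can be tried without the sensor by feeding them a fixed reading. This sketch uses | as the s delimiter so that the slashes in the text need no escaping; the value 32.50 stands in for the pcsensor output, and it assumes the reading is the only thing on the line.

```shell
reading='32.50'   # stand-in for the output of pcsensor -n

# Prepend the text; & is replaced by whatever the pattern matched.
full=$(printf '%s\n' "$reading" |
  sed 's|^.*$|PUTVAL downloads/exec-environmental/temperature-cpu interval=30 N:&|')

# Drop the last character first, then prepend the text.
short=$(printf '%s\n' "$reading" |
  sed 's|.$||;s|^.*$|PUTVAL downloads/exec-environmental/temperature-cpu interval=30 N:&|')

printf '%s\n' "$full" "$short"
```

Note that removing the last character truncates the value, whereas a %.1f format in printf rounds it.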

Remove a hyphen from a specific line in a file

I have a data file that needs to have several unique identifiers stripped of hyphens.
So I have:
(Special_Section "data-values")
and I want to have it replaced with:
(Special_Section "datavalues")
I wanted to use a simple sed find/replace, but the data and values are different each time. Preferably, I'd run this in-place since the file has a lot of other information I want to keep intact.
Does sed or awk have a way to remove the hyphen from the matched portion only?
Currently I can match with: sed -i 's/Special_Section "[a-zA-Z0-9]*-[a-zA-Z0-9]*"/&/g' *myfiles*
But I would like to then run s/-// on & if it's possible.
You seem to be using GNU sed, so something like this might work:
sed -ri '
s/(Special_Section [^-]*)-([^)]*)/\1\2/g
' <your_filename_glob>
Does this work?
sed -i '/(Special_Section ".*-.*")/{s/-//}' yourFile
Close - scan for the lines and then substitute on those that match:
sed -i '/Special_Section "[a-zA-Z0-9]*-[a-zA-Z0-9]*"/s/\( "[a-zA-Z0-9]*\)-\([a-zA-Z0-9]*"\)/\1\2/' *myfiles*
You can split that over several lines to avoid the scroll bar in SO:
sed -i '/Special_Section "[a-zA-Z0-9]*-[a-zA-Z0-9]*"/{
s/\( "[a-zA-Z0-9]*\)-\([a-zA-Z0-9]*"\)/\1\2/
}' *myfiles*
And on further thoughts, you can also do:
sed -i 's/\(Special_Section "[a-zA-Z0-9]*\)-\([a-zA-Z0-9]*"\)/\1\2/' *myfiles*
This is more compact. You can add the g qualifier if you need it. Both solutions use the special \(...\) notation to capture parts of the regular expression.
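To try the compact version without touching a file, the same expression can be run on piped input. In this sketch the first sample line is invented to show that hyphens outside a Special_Section entry are left alone.

```shell
# Second line is from the question; the first one is a made-up control line.
sample='(Other_Section "keep-this")
(Special_Section "data-values")'

# Group 1 holds everything before the hyphen, group 2 everything after it,
# including the closing quote, so only the hyphen itself is dropped.
out=$(printf '%s\n' "$sample" |
  sed 's/\(Special_Section "[a-zA-Z0-9]*\)-\([a-zA-Z0-9]*"\)/\1\2/')

printf '%s\n' "$out"
```

Once the output looks right, -i can be added to edit the files in place.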

capturing groups in sed

I have many lines of the form
ko04062 ko:CXCR3
ko04062 ko:CX3CR1
ko04062 ko:CCL3
ko04062 ko:CCL5
ko04080 ko:GZMA
and would dearly like to get rid of the ko: bit of the right-hand column. I'm trying to use sed, as follows:
echo "ko05414 ko:ITGA4" | sed 's/\(^ko\d{5}\)\tko:\(.*$\)/\1\2/'
which simply outputs the original string I echo'd. I'm very new to command line scripting, sed, pipes etc, so please don't be too angry if/when I'm doing something extremely dumb.
The main thing that is confusing me is that the same thing happens if I reverse the \1\2 bit to read \2\1 or just use one group. This, I guess, implies that I'm missing something about the mechanics of piping the output of echo into sed, or that my regexp is wrong or that I'm using sed wrong or that sed isn't printing the results of the substitution.
Any help would be greatly appreciated!
sed is outputting its input because the substitution isn't matching. Since you're probably using GNU sed, try this:
echo "ko05414 ko:ITGA4" | sed 's/\(^ko[0-9]\{5\}\)\tko:\(.*$\)/\1\2/'
\d -> [0-9] since GNU sed doesn't recognize \d
{} -> \{\} since GNU sed by default uses basic regular expressions.
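For comparison, here is a sketch of the same substitution in basic and extended syntax. It assumes the columns are separated by spaces or tabs and uses [[:blank:]] instead of \t, which not every sed understands; the second group keeps the original separator.

```shell
line='ko05414 ko:ITGA4'

# Basic regular expressions: groups and intervals need backslashes.
bre=$(printf '%s\n' "$line" |
  sed 's/^\(ko[0-9]\{5\}\)\([[:blank:]]\{1,\}\)ko:\(.*\)$/\1\2\3/')

# Extended regular expressions (-E): the same pattern without backslashes.
ere=$(printf '%s\n' "$line" |
  sed -E 's/^(ko[0-9]{5})([[:blank:]]+)ko:(.*)$/\1\2\3/')

printf '%s\n' "$bre" "$ere"
```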
This should do it. You can also skip the last group and simply use \1 instead, but since you're learning sed and regex this is good stuff. I wanted to use a non-capturing group in the middle, (?: ), but I could not get that to work with sed; POSIX regular expressions do not support non-capturing groups.
sed --posix 's/\(^ko[0-9]\{5\}\)\( ko:\)\(.*$\)/\1 \3/g' file > result
And of course you can use
sed --posix 's/ko://'
You don't need sed for this
Here is how you can do it with bash:
var="ko05414 ko:ITGA4"
echo ${var//"ko:"}
${var//"ko:"} replaces all "ko:" with ""
See Manipulating Strings for more info
@OP, if you just want to get rid of "ko:", then
$ cat file
ko04062 ko:CXCR3
ko04062 ko:CX3CR1
ko04062 ko:CCL3
ko04062 ko:CCL5
some text with a legit ko: this ko: will be deleted if you use gsub.
ko04080 ko:GZMA
$ awk '{sub("ko:","",$2)}1' file
ko04062 CXCR3
ko04062 CX3CR1
ko04062 CCL3
ko04062 CCL5
some text with a legit ko: this ko: will be deleted if you use gsub.
ko04080 GZMA
Just a note: while you can use pure bash string substitution, it's only more efficient when you are changing a single string. If you have a file, especially a big one, bash's while read loop is still slower than sed or awk.