Sed to replace single quotation mark - sed

I'm going crazy over a simple thing. I have written the following line, which works surprisingly well:
name=$(echo $name | sed 's/"//g' | sed 's/^ //' | sed 's/ $//' | sed "s/'/\\\'/")
I'm trying to reduce sed to only one command instead of four, and I wrote the following line that doesn't work, and I cannot manage to make it work:
name=$(echo $name | sed 's/"//g; s/^ //; s/ $//; s/\'/\\\'/g')
I get:
sed: 1: "s/"//g; s/^ //; s/ $//; ...": unterminated substitute in regular expression
What am I doing wrong? I can't see the syntax error. I've rewritten it many times: the line with four sed commands works, but the one where I try to put everything into a single sed doesn't.
Thank you very much for your help!

You cannot embed a single quote in a single-quoted string, no matter how many escapes you use: see section 3.1.2.2, Single Quotes, in the Bash manual.
You can use sed's -e option to chain the commands and give you the most quoting flexibility:
sed -e 's/"//g' -e 's/^ //' -e 's/ $//' -e "s/'/\\\'/"
# or
sed -e 's/"//g; s/^ //; s/ $//' -e "s/'/\\\'/"
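As a rough illustration of the chained form, the sketch below wraps it in a function and runs it on a sample value. The function name clean_name and the sample input are made up for this example, and the g flag on the last expression is an addition so that every single quote is escaped rather than only the first.

```shell
# Hypothetical helper (not from the question): strip double quotes, trim one
# leading and one trailing space, then backslash-escape single quotes.
clean_name() {
  # The first script is single-quoted, so sed sees it verbatim.
  # The second is double-quoted so it can contain a single quote; the shell
  # turns \\ into \ and leaves \' alone, so sed receives s/'/\\'/g.
  printf '%s\n' "$1" | sed -e 's/"//g; s/^ //; s/ $//' -e "s/'/\\\'/g"
}

input=" O\"Brien's "    # sample value: leading space, a double quote, a single quote
clean_name "$input"     # prints: OBrien\'s
```

Using printf '%s\n' instead of an unquoted echo keeps the shell from word-splitting or glob-expanding the value before sed sees it.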

This might work for you (GNU sed):
sed -E 's/^ |"| $//g;s/'\''/\\&/g' file
Use alternation to remove a space at the start/end of a line or double quote.
Replace a single quote by an escaped single quote.
A single substitution cannot do everything, because the replacement differs between the cases: the first three remove text, whereas the fourth adds a backslash.
If, however, you intend to remove the single quote, use:
sed -E 's/^ |["'\'']| $//g' file
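The '\'' sequence used in these commands ends the single-quoted string, adds one escaped single quote, and starts a new single-quoted string. A small sketch on a made-up sample value (the variable name sample is only for illustration):

```shell
# Sample value (made up): leading/trailing space, double quotes, one single quote.
sample=" it's \"ok\" "

# '\'' = end the single-quoted string, emit a literal ', start a new one.
# sed therefore receives:  s/^ |"| $//g;s/'/\\&/g
printf '%s\n' "$sample" | sed -E 's/^ |"| $//g;s/'\''/\\&/g'
# prints: it\'s ok

# Variant that removes single quotes as well instead of escaping them.
printf '%s\n' "$sample" | sed -E 's/^ |["'\'']| $//g'
# prints: its ok
```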

You can also abut quotes anywhere you need to switch them. Just be careful of interpolation, as usual with double quotes:
Instead of
name=$(echo $name | sed 's/"//g; s/^ //; s/ $//; s/\'/\\\'/g')
do
$ name=$(echo $name | sed 's/"//g; s/^ //; s/ $//;'" s/\'/\\\'/g")
although I think what you want may actually be
$ name=$(echo $name | sed 's/"//g; s/^ //; s/ $//;'" s/'//g")
Adjacent quotes are just like any other adjacent characters. With no intervening whitespace, they constitute a continuing string.
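To see the concatenation at work, this sketch stores the combined script in a variable (the name script is chosen for the example) and prints what sed will actually receive. It uses the variant that removes single quotes.

```shell
# Two differently quoted pieces with no whitespace between them form one word.
script='s/"//g; s/^ //; s/ $//;'"s/'//g"
printf '%s\n' "$script"
# prints: s/"//g; s/^ //; s/ $//;s/'//g

# Made-up sample input with a leading/trailing space and both kinds of quote.
printf '%s\n' " it's \"ok\" " | sed "$script"
# prints: its ok
```

Printing the assembled script before running it is a cheap way to check what each layer of quoting has left for sed.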

You can use the third method of bash quoting: $'...'. Here C like escapes are possible. So your sed command can be:
sed $'s/"//g; s/^ //; s/ $//; s/\'/\\\'/g'
But if you want the input a'b to become a\'b, then use
sed $'s/"//g; s/^ //; s/ $//; s/\'/\\\\\'/g'
Analyzing \\\\\': bash reads it as \\'. And sed reads it as \'.

Related

How to replace only specific spaces in a file using sed?

I have this content in a file where I want to replace spaces at certain positions with pipe symbol (|). I used sed for this, but it is replacing all the spaces in the string. But I don't want to replace the space for the 3rd and 4th string.
How to achieve this?
Input:
test test test test
My attempt:
sed -e 's/ /|/g' file.txt
Expected Output:
test|test|test test
Actual Output:
test|test|test|test
sed 's/ /\
/3;y/\n / |/'
As newline cannot appear in a sed pattern space, you can change the third space to a newline, then change all newlines and spaces to spaces and pipes.
GNU sed can use \n in the replacement text:
sed 's/ /\n/3;y/\n / |/'
If the original input doesn't contain any pipe characters, you can do
sed -e 's/ /|/g' -e 's/|/ /3' file
to retain the third white space. Otherwise see other answers.
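A short sketch of why the no-pipes condition matters. The second input line is invented to show the failure case.

```shell
# Works when the input has no pipes of its own.
printf '%s\n' 'test test test test' | sed -e 's/ /|/g' -e 's/|/ /3'
# prints: test|test|test test

# Made-up input that already contains a pipe: the third pipe is no longer
# the original third space, so the wrong separator is turned back.
printf '%s\n' 'a|b c d e' | sed -e 's/ /|/g' -e 's/|/ /3'
# prints: a|b|c d|e
```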
You could replace the 'first space' twice, e.g.
sed -e 's/ /|/' -e 's/ /|/' file.txt
Or, if you want to specify the positions (e.g. the 2nd and 1st spaces):
sed -e 's/ /|/2' -e 's/ /|/1' file.txt
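The order of the numbered substitutions matters, because each one counts occurrences in the line as it stands at that moment. A sketch on the sample line (the variable name line is only for illustration):

```shell
line='test test test test'

# Higher occurrence first: replacing the 2nd space does not renumber the 1st.
printf '%s\n' "$line" | sed -e 's/ /|/2' -e 's/ /|/1'
# prints: test|test|test test

# Lower occurrence first: once the 1st space is gone, "2nd" refers to what
# was originally the 3rd space.
printf '%s\n' "$line" | sed -e 's/ /|/1' -e 's/ /|/2'
# prints: test|test test|test
```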
Using GNU sed to replace the first and second one or more whitespace chunks:
sed -i -E 's/\s+/|/;s/\s+/|/' file
Details:
-i - edit the file in place
-E - enable POSIX ERE syntax
s/\s+/|/ - replaces the first run of one or more whitespace characters
; - and then
s/\s+/|/ - replaces the next run of one or more whitespace characters on each line (if present).
Keep it simple and use awk, e.g. using any awk in any shell on every Unix box no matter what other characters your input contains:
$ awk '{for (i=1;i<NF;i++) sub(/ /,"|")} 1' file
test|test|test test
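For this sample, that leaves only the last " " on the line untouched. Be aware that each sub() modifies $0, so NF is recomputed and the loop bound shrinks as the loop runs; on lines with more fields, several trailing spaces are left alone. To replace a specific number of spaces, e.g. 2, use a fixed bound instead: i<=2.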
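Because sub() rewrites $0, awk re-splits the record and NF changes while the loop runs. A minimal sketch that avoids depending on NF by passing the count in explicitly; the function name replace_first_spaces and the awk variable n are hypothetical names introduced here.

```shell
# Replace the first n spaces on each line of standard input.
replace_first_spaces() {
  awk -v n="$1" '{ for (i = 1; i <= n; i++) sub(/ /, "|") } 1'
}

printf '%s\n' 'test test test test' | replace_first_spaces 2
# prints: test|test|test test
```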

Insert linebreak in a file after a string

I have a unique (to me) situation:
I have a file - file.txt with the following data:
"Line1", "Line2", "Line3", "Line4"
I want to insert a linebreak each time the pattern ", is found.
The output of file.txt shall look like:
"Line1",
"Line2",
"Line3",
"Line4"
I am having a tough time trying to escape ", .
I tried sed -i -e "s/\",/\n/g" file.txt, but I am not getting the desired result.
I am looking for a one liner using either perl or sed.
You may use this GNU sed command:
sed -E 's/(",)[[:blank:]]*/\1\n/g' file.txt
"Line1",
"Line2",
"Line3",
"Line4"
Note how using single quotes around the sed command avoids unnecessary escaping.
If you don't have GNU sed, here is a POSIX-compliant sed solution:
sed -E 's/(",)[[:blank:]]*/\1\
/g' file.txt
To save changes inline use:
sed -i.bak -E 's/(",)[[:blank:]]*/\1\
/g' file.txt
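A sketch of the backslash-newline form reading from standard input, so it can be tried without touching a file. The function name split_quoted_fields is made up for the example.

```shell
# Hypothetical helper: put each quoted field on its own line.
split_quoted_fields() {
  # The backslash at the end of the first line makes the newline that
  # follows part of the replacement text. The continuation must start in
  # column 0, or the indentation would end up in the output.
  sed -E 's/(",)[[:blank:]]*/\1\
/g'
}

printf '%s\n' '"Line1", "Line2", "Line3", "Line4"' | split_quoted_fields
```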
You could also try the following, which uses awk's substitution mechanism, in case awk is acceptable:
awk -v s1="\"" -v s2="," '{gsub(/",[[:blank:]]+"/,s1 s2 ORS s1)} 1' Input_file
Here's a Perl solution:
perl -pe 's/",\K/\n/g' file.txt
The substitution pattern matches a double quote followed by a comma, but the \K tells Perl to keep everything matched to its left out of the replacement, so those two characters are not replaced. The substitution therefore just inserts the newline after them.
I used the single quote for the argument to -e, but that doesn't work on Windows where you have to use ". Instead of escaping the ", you can specify it in another way. That's code number 0x22, so you can write:
perl -pe "s/\x22,\K/\n/g" file.txt
Or in octal:
perl -pe "s/\042,\K/\n/g" file.txt
Use this Perl one-liner:
perl -F'/"\K,\s*/' -lane 'print join ",\n", @F;' in_file > out_file
Or this for in-line replacement:
perl -i.bak -F'/"\K,\s*/' -lane 'print join ",\n", @F;' in_file
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-a : Split $_ into array @F on whitespace or on the regex specified in -F option.
-F'/"\K,\s*/' : Split into @F on a double quote, followed by comma, followed by 0 or more whitespace characters, rather than on whitespace. \K : Cause the regex engine to "keep" everything it had matched prior to the \K and not include it in the match. This keeps the double quote in the @F elements, while the comma and whitespace are removed during the split.
-i.bak : Edit input files in-place (overwrite the input file). Before overwriting, save a backup copy of the original file by appending to its name the extension .bak.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlrequick: Perl regular expressions quick start

How to replace \n by space using sed command?

I have to collect a select query data to a CSV file. I want to use a sed command to replace \n from the data by a space.
I'm using this:
query | sed "s/\n/ /g" > file.csv .......
But it is not working. Only \ is getting removed, while it should also remove n and add a space. Please suggest something.
You want to replace newline with space, not necessarily using sed.
Use tr:
tr '\n' ' '
\n is special to sed: it stands for the newline character. To replace a literal \n, you have to escape the backslash:
sed 's/\\n/ /g'
Notice that I've used single quotes. If you use double quotes, the backslash has a special meaning if followed by any of $, `, ", \, or newline, i.e., "\n" is still \n, but "\\n" would become \n.
Since we want sed to see \\n, we'd have to use one of these:
sed "s/\\\n/ /g" – the first \\ becomes \, and \n doesn't change, resulting in \\n
sed "s/\\\\n/ /g" – both pairs of \\ are reduced to \ and sed gets \\n as well
but single quotes are much simpler:
$ sed 's/\\n/ /g' <<< 'my\nname\nis\nrohinee'
my name is rohinee
From comments on the question, it became apparent that sed had nothing to do with removing the backslashes; the OP tried
echo my\nname\nis | sed 's/\n/ /g'
but the backslashes are removed by the shell:
$ echo my\nname\nis
mynnamenis
so even if the correct \\n were used, sed wouldn't find any matches. The correct way is
$ echo 'my\nname\nis' | sed 's/\\n/ /g'
my name is
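The two cases side by side, as a sketch on made-up input. printf is used instead of echo so the result does not depend on how a particular echo treats backslashes.

```shell
# Real newline characters: sed reads input one line at a time, so the pattern
# space normally contains no newline to match. tr handles this case.
printf 'my\nname\nis\n' | tr '\n' ' '
# prints: my name is   (with a trailing space and no final newline)
echo

# Literal backslash followed by n in the data: escape the backslash for sed.
printf '%s\n' 'my\nname\nis' | sed 's/\\n/ /g'
# prints: my name is
```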

sed to copy part of line to end

I'm trying to copy part of a line to append to the end:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz
becomes:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz
I have tried:
sed 's/\(.*(GCA_\)\(.*\))/\1\2\2)'
$ f1=$'ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz'
$ echo "$f1"
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\1\2\3\/\2\4/' <<<"$f1"
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz
sed -E (or -r on some systems) enables extended regex support in sed, so you don't need to escape the grouping parentheses ( ).
The group (GCA_.[^.]*) means "match GCA_ and then all characters up to, but excluding, the first dot":
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\2/' <<<"$f1"
GCA_900169985
Similarly, (.[^_]*) captures one character and then everything up to, but excluding, the first _. This is the usual way to emulate a non-greedy (lazy) capture in sed; in Perl regex it could be written as .*? followed by _.
$ sed -E 's/(.*)(GCA_.[^.]*)(.[^_]*)(.*)/\3/' <<<"$f1"
.1
Short sed approach:
s="ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz"
sed -E 's/(GCA_[^._]+)\.([^_]+)/\1.\2\/\1/' <<< "$s"
The output:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1/GCA_900169985_IonXpress_024_genomic.fna.gz
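The same substitution can be sketched as a filter that reads URLs on standard input. The function name add_accession_dir is hypothetical, and switching the delimiter to # is a stylistic choice made here, not part of the answer.

```shell
# Hypothetical helper: turn GCA_<id>.<version>_rest into
# GCA_<id>.<version>/GCA_<id>_rest.
add_accession_dir() {
  # '#' is the s delimiter, so the '/' in the replacement needs no escaping.
  # [^._]+ stops at the first dot or underscore; [^_]+ stops at the next _.
  sed -E 's#(GCA_[^._]+)\.([^_]+)#\1.\2/\1#'
}

url='ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/169/985/GCA_900169985.1_IonXpress_024_genomic.fna.gz'
printf '%s\n' "$url" | add_accession_dir
```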

Why is sed not matching excess whitespace between non-whitespace characters

I have a sed oneliner which removes excess whitespace:
sed -e 's/^\s*//' -e 's/\s*$//' -e 's/\s{2,}/ /g'
When I test it on " \tone1 two\t3three\t ", the sed removes the whitespace at the beginning and end of the line but doesn't match the excess whitespace between words, and sed returns \tone1 two\t3three. What I want is \tone1 two 3three, so sed -e 's/[ \t]{2,}/ /g' is not functioning.
regexr.com shows the expression as functional.
My version is GNU sed version 4.2.1.
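In the basic regular expression (BRE) mode that sed uses by default, the interval braces must be escaped as \{ and \}; alternatively, enable extended regular expressions with -E (or -r).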
However, you can use this sed with a single substitution with alternation:
sed -E 's/^[[:blank:]]+|[[:blank:]]+$|[[:blank:]]{2,}//g' file
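The POSIX character class [[:blank:]] matches a space or a tab. Note that because the replacement is empty, interior runs of two or more blanks are deleted outright rather than collapsed to a single space.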
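If the goal is to trim the ends and collapse interior runs to one space, two substitutions are needed, since the replacement text differs. A minimal sketch, assuming a sed that accepts -E; the function name squeeze_blanks and the sample line are invented for the example.

```shell
# Hypothetical helper: trim leading/trailing blanks, then collapse each
# interior run of two or more blanks to a single space.
squeeze_blanks() {
  sed -E -e 's/^[[:blank:]]+|[[:blank:]]+$//g' -e 's/[[:blank:]]{2,}/ /g'
}

# printf expands \t in its format string to a real tab character.
printf ' \tone1   two\t\t3three\t \n' | squeeze_blanks
# prints: one1 two 3three
```

A single tab between two words is left as it is, because the second expression only matches runs of at least two blanks.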