Delete line if string between the 4th and 5th delimiter is empty - sed

"text";"text";"text";"text";;"text";"text"
If the 4th delimiter is immediately followed by another delimiter (that is, the 5th field is empty), the line should be deleted.
Currently I'm doing that with sed:
sed -n '/;;/!p' input.txt
Is this a reliable solution?
Thanks for help.

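Guarding, to some extent, against escaped double quotes and a ";" inside a field (thanks to SLePort for the remark). The original line is saved in the hold space, tested on a cleaned copy, and restored with g if it is kept:
sed -e 'h;s/\\"//g' -e ':c' -e 's/^\(\("[^"]*";\)*"[^"]*\);/\1/;t c' -e '/^\([^;]*;\)\{4\};/d;g'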

sed -r '/^([^;]+;){4}\s*;/d' input.txt
awk -F';' '$5' input.txt

To remove lines where a ; immediately follows the fourth delimiter:
sed '/^\("*[^"]*"*;\)\{4\};/d' input.txt

This might work for you (GNU sed):
sed -r '/^("(\\.|[^"])*";){4};/d' file
If the fourth consecutive double-quoted field followed by a semicolon, where each character inside the quotes is either a backslash-escaped character or anything other than a double quote, is itself followed by a further semicolon, then the line is deleted.
A more efficient regexp would be:
sed -r '/^("[^"\\]*(\\.[^"\\]*)*";){4};/d' file
This uses the pattern normal*(abnormal normal*)*
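As a quick check of the last expression, here is a minimal sketch run against made-up sample lines (the data and the helper name sample are assumptions, not from the question). It uses -E, which current GNU and BSD sed both accept in place of -r. The third and fourth sample lines contain ";;" elsewhere, so a plain /;;/ test would delete them as well.

```shell
# Made-up sample: only the first and last lines have an empty 5th field.
sample() {
cat <<'EOF'
"a";"b";"c";"d";;"f";"g"
"a";"b";"c";"d";"e";"f";"g"
"a";;"c";"d";"e";"f";"g"
"a;;x";"b";"c";"d";"e";"f";"g"
"a\"q";"b";"c";"d";;"f";"g"
EOF
}

# Four complete quoted fields (escaped quotes allowed), then an immediate ";".
result=$(sample | sed -E '/^("[^"\\]*(\\.[^"\\]*)*";){4};/d')
printf '%s\n' "$result"
```

Only the three lines whose 5th field is non-empty should remain.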

Related

How to replace only specific spaces in a file using sed?

I have this content in a file where I want to replace spaces at certain positions with pipe symbol (|). I used sed for this, but it is replacing all the spaces in the string. But I don't want to replace the space for the 3rd and 4th string.
How to achieve this?
Input:
test test test test
My attempt:
sed -e 's/ /|/g' file.txt
Expected Output:
test|test|test test
Actual Output:
test|test|test|test
sed 's/ /\
/3;y/\n / |/'
Since a newline cannot appear in the pattern space when a line has just been read, you can change the third space to a newline, then use y to turn that newline into a space and every remaining space into a pipe.
GNU sed can use \n in the replacement text:
sed 's/ /\n/3;y/\n / |/'
If the original input doesn't contain any pipe characters, you can do
sed -e 's/ /|/g' -e 's/|/ /3' file
to retain the third white space. Otherwise see other answers.
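To see why the caveat about existing pipe characters matters, here is a small sketch comparing this approach with the newline-marker approach on a made-up line that already contains a pipe (the input and the variable names are assumptions for illustration).

```shell
# Made-up input that already contains a pipe.
line='a|b c d e'

# Replace every space, then turn the 3rd pipe back into a space.
via_pipe=$(printf '%s\n' "$line" | sed -e 's/ /|/g' -e 's/|/ /3')

# Mark the 3rd space with a newline, then map newline->space and space->pipe.
via_newline=$(printf '%s\n' "$line" | sed 's/ /\
/3;y/\n / |/')

printf '%s\n' "$via_pipe" "$via_newline"
```

The first command counts the pre-existing pipe and restores the wrong position (a|b|c d|e), while the newline marker keeps the third original space (a|b|c|d e).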
You could replace the 'first space' twice, e.g.
sed -e 's/ /|/' -e 's/ /|/' file.txt
Or, if you want to specify the positions (e.g. the 2nd and 1st spaces):
sed -e 's/ /|/2' -e 's/ /|/1' file.txt
Using GNU sed to replace the first and second one or more whitespace chunks:
sed -i -E 's/\s+/|/;s/\s+/|/' file
Details
-i - edit the file in place
-E - enable POSIX ERE syntax
s/\s+/|/ - replaces the first run of one or more whitespace characters
; - and then
s/\s+/|/ - replaces the next run of one or more whitespace characters on each line (if present).
Keep it simple and use awk, e.g. using any awk in any shell on every Unix box no matter what other characters your input contains:
$ awk '{for (i=1;i<NF;i++) sub(/ /,"|")} 1' file
test|test|test test
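On this input the loop replaces the first two spaces. Note that each sub() modifies $0, so NF is recomputed and the loop bound shrinks while the loop runs; in general it does not replace all but the last " ". If you want to replace a specific number, e.g. 2, use a constant bound instead: change i<NF to i<=2.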
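A fixed count can be sketched with a constant loop bound. Each sub() rewrites $0, which makes awk recompute NF, so a bound that depends on NF moves while the loop runs; a constant sidesteps that. The six-word input is made up for illustration.

```shell
# Replace only the first two spaces, whatever the number of fields.
result=$(printf '%s\n' 'a b c d e f' | awk '{for (i=1;i<=2;i++) sub(/ /,"|")} 1')
printf '%s\n' "$result"
```

This should print a|b|c d e f, leaving every space after the second one alone.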

Remove a particular character and newline using sed

Say I have text:
Abc
B\
CH
DEG
It should be
Abc
BCH
DEG
Only new lines preceded by backslash should be removed, together with the backslash.
sed -z 's/\\\n//g' file
Read the whole file as one record (-z, GNU sed) and then replace every backslash followed by a newline with nothing.
sed -e :a -e '/\\$/N; s/\\\n//; ta' myFile
If you are sure that \ is always followed by a newline, you can shorten Raman Sailopal's answer by 1 character:
sed -z 's/\\.//g' file
This might work for you (GNU sed):
sed 'N;s/\\\n//;P;D' file
Open a two-line window and remove a backslash-newline pair between the two lines.
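The looping variant from the earlier answer can be checked on input with two continuation lines in a row, a case the question's sample does not cover. This is a minimal sketch; the extended input and the helper name sample are assumptions.

```shell
# Made-up input with two consecutive continuation lines.
sample() {
cat <<'EOF'
Abc
B\
C\
H
DEG
EOF
}

# While the pattern space ends in a backslash, append the next line
# and delete the backslash-newline pair, then test again.
result=$(sample | sed -e :a -e '/\\$/N; s/\\\n//; ta')
printf '%s\n' "$result"
```

The three pieces B\, C\ and H should come out joined as BCH, with Abc and DEG untouched.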

How to add a quote at the end of a line with sed

sed -i 's/$/\'/g'
sed -i "s/$/\'/g"
How to escape both $ and ' by 1 command?
This might work for you (GNU sed):
sed 's/$/'\''/' file
Adds a single quote to the end of a line.
sed 's/\$/'\''/' file
Replaces a $ by a single quote.
sed 's/\$$/'\''/' file
Replaces a $ at the end of line by a single quote.
N.B. Surrounding sed commands by double quotes is fine for some interpolation but may return unexpected results.
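For comparison, here is a short sketch of three shell spellings that hand sed the same script, s/$/'/. The input abc and the variable names are made up for illustration.

```shell
# Three spellings of the same script: append a single quote to each line.
a=$(printf 'abc\n' | sed 's/$/'\''/')    # close quote, escaped ', reopen quote
b=$(printf 'abc\n' | sed "s/\$/'/")      # double quotes, $ escaped for the shell
c=$(printf 'abc\n' | sed 's/$/'"'"'/')   # the ' itself wrapped in double quotes
printf '%s\n' "$a" "$b" "$c"
```

All three should print abc' because the quoting is resolved by the shell before sed runs.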
Use octal values
sed 's/$/\o47/'
Take care to write a backslash, then the lowercase letter o, then an octal number of 1 to 3 digits (this is a GNU sed extension).
Just don't use single quotes to start the sed script?
sed "s/$/'/"
The g flag at the end means the substitution is applied to every match on each line; you don't need it here, since $ is an anchor that matches only once, at the end of the line.
To add a quote at the end of a line use
sed -i "s/$/'/g" file
sed -i 's/$/'"'"'/g' file
If there are already single quotes, and you want to make sure there is single occurrence at the end of string use
sed -i "s/'*$/'/g" file
sed -i 's/'"'*"'$/'"'"'/g' file
To escape $ and ' chars use
sed -i "s/[\$']/\\\\&/g" file
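[\$'] - matches $ (escaped because inside double quotes the shell could otherwise treat it as the start of a variable expansion) or '
\\\\& - a literal backslash followed by the whole match (&); the shell reduces the four backslashes to two, and sed reads those two as one literal backslash, because a backslash is special in the replacement.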
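A minimal sketch of that command on a made-up string, reading from standard input instead of editing a file in place (the sample text is an assumption):

```shell
# Made-up input containing both characters.
input="cost \$5 isn't"

# The shell turns \$ into $ and \\\\ into \\, so sed sees s/[$']/\\&/g.
result=$(printf '%s\n' "$input" | sed "s/[\$']/\\\\&/g")
printf '%s\n' "$result"
```

Each $ and ' in the output should now be preceded by one backslash.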

How to replace \n by space using sed command?

I have to write the data returned by a select query to a CSV file. I want to use a sed command to replace \n in the data with a space.
I'm using this:
query | sed "s/\n/ /g" > file.csv .......
But it is not working. Only \ is getting removed, while it should also remove n and add a space. Please suggest something.
You want to replace newline with space, not necessarily using sed.
Use tr:
tr '\n' ' '
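A small sketch of the difference, on made-up three-line input: sed reads one line at a time with the trailing newline already stripped, so s/\n/ /g finds nothing to replace, while tr works on the raw stream.

```shell
# sed never sees the newline between lines, so the text comes back unchanged.
unchanged=$(printf 'a\nb\nc\n' | sed 's/\n/ /g')

# tr turns every newline, including the final one, into a space.
joined=$(printf 'a\nb\nc\n' | tr '\n' ' ')

printf '[%s]\n' "$unchanged" "$joined"
```

Note that the tr result ends with a space and has no final newline, which may matter when the output goes into a CSV file.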
\n is special to sed: it stands for the newline character. To replace a literal \n, you have to escape the backslash:
sed 's/\\n/ /g'
Notice that I've used single quotes. If you use double quotes, the backslash has a special meaning if followed by any of $, `, ", \, or newline, i.e., "\n" is still \n, but "\\n" would become \n.
Since we want sed to see \\n, we'd have to use one of these:
sed "s/\\\n/ /g" – the first \\ becomes \, and \n doesn't change, resulting in \\n
sed "s/\\\\n/ /g" – both pairs of \\ are reduced to \ and sed gets \\n as well
but single quotes are much simpler:
$ sed 's/\\n/ /g' <<< 'my\nname\nis\nrohinee'
my name is rohinee
From comments on the question, it became apparent that sed had nothing to do with removing the backslashes; the OP tried
echo my\nname\nis | sed 's/\n/ /g'
but the backslashes are removed by the shell:
$ echo my\nname\nis
mynnamenis
so even if the correct \\n were used, sed wouldn't find any matches. The correct way is
$ echo 'my\nname\nis' | sed 's/\\n/ /g'
my name is

Sed: uppercase lines if they start with an uppercase character

I want the lines starting with one uppercase character to be uppercased, other lines should be not touched.
So this input:
cat myfile
a
b
Cc
should result in this output:
a
b
CC
I tried this command, but it does not match when I use grouping:
cat myfile | sed -r 's/\([A-Z]+.*\)/\U\1/g'
What am I doing wrong?
When you use the -r option, you must not put \ before parentheses used for grouping. So it should be:
sed -r 's/^([A-Z].*)/\U\1/' myfile
Also, notice that you need ^ to match the beginning of the line. The g modifier isn't needed, since you're matching the entire line.
cat myfile | sed 's/^\([A-Z].*\)$/\U\1/'
\U for uppercase conversion is a GNU sed extension.
Alternative for platforms where that is not available (e.g., macOS, with its BSD sed implementation):
awk '/^[A-Z]/ { print toupper($0); next } 1'
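Run against the question's sample, the awk version can be checked like this (a minimal sketch; the helper name sample is an assumption):

```shell
# The three sample lines from the question.
sample() { printf '%s\n' a b Cc; }

# Uppercase lines that start with A-Z; pass every other line through unchanged.
result=$(sample | awk '/^[A-Z]/ { print toupper($0); next } 1')
printf '%s\n' "$result"
```

The output should be a, b and CC on separate lines, matching the expected result in the question.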
sed '/^[A-Z].*[a-z]/ s/.*/\U&/' YourFile
This changes only the lines that need it: those that start with an uppercase letter and still contain a lowercase one.
This might work for you (GNU sed):
sed 's/^[[:upper:]].*/\U&/' file