Negation of sed regular expression

I have a tab-separated table in a text file, and a sed script that selects only the lines that have 83, 86, 173 or 163 in the second column:
sed -n '/^[^\t]\+\t\(83\|89\|147\|163\)/p' test.txt
Now I want to select all the lines that have anything other than those values in the second column. I've tried putting ^ in different places and changing p to d, but didn't succeed.
Can anyone help me please?

Solved my problem using ! before p:
sed -n '/^[^\t]\+\t\(83\|89\|147\|163\)/!p' test.txt
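For illustration, here is a small sketch of the difference between p and !p. The sample rows and variable names are made up for this example, and \t, \+ and \| in the pattern are GNU sed extensions. Note that the pattern only checks how the second column starts, so a value such as 830 also counts as a match. The last command is one possible stricter variant (an assumption about what is wanted, not part of the original solution) that requires the value to be followed by a tab or the end of the line.

```shell
# Hypothetical sample: name in column 1, value in column 2, tab-separated.
sample=$(printf 'r1\t83\nr2\t99\nr3\t163\nr4\t830\n')

# Original selection: print lines whose second column starts with a listed value.
selected=$(printf '%s\n' "$sample" | sed -n '/^[^\t]\+\t\(83\|89\|147\|163\)/p')

# Negated selection: ! makes p run on every line the address does NOT match.
others=$(printf '%s\n' "$sample" | sed -n '/^[^\t]\+\t\(83\|89\|147\|163\)/!p')

# Stricter variant (assumption): the value must end at a tab or at end of line,
# so 830 is no longer treated as a match for 83.
strict_others=$(printf '%s\n' "$sample" | sed -En '/^[^\t]+\t(83|89|147|163)(\t|$)/!p')

printf 'selected:\n%s\n' "$selected"
printf 'others:\n%s\n' "$others"
printf 'strict others:\n%s\n' "$strict_others"
```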

Related

Delete a paragraph from a file using sed

I have a markdown file that looks something like this:
markdown.md
# Title1
line 1
line 2
line 3
# Title2
line 1
line 2
line 3
I'd like to be able to delete one of the paragraphs by searching for the title. I would need to delete the title, the following line, and then every subsequent line that is not blank.
The desired output would be:
# Title2
line 1
line 2
line 3
I was doing some reading about using {} to group multiple commands together but I can't seem to quite get the syntax right.
cat markdown.md | sed '/^# Title1.*/,+1d {/^\s*$/d}'
My thinking was that this would delete the line beginning with '# Title1', then the following line with ,+1d, then subsequent lines until a blank line, but I see the following error:
sed: 1: "/^# Title1.*/,+1d { ...": extra characters at the end of d command
I've tried a few variations but no luck. Any help would be appreciated!
This is the kind of sed puzzle that makes me wish for a slightly different tool.
sed -n -e '/Title1/!{p;d;};n;' -e ':a' -e 'n;/./ba'
Loosely translated: "Don't print anything. If it doesn't contain 'Title1', then all right, print it, then start over with the next line. But if it does contain 'Title1', then grab the next line (which will be blank), enter a loop, and keep grabbing new lines until you come to the next empty line."
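A minimal sketch of that loop in action. It assumes the paragraphs are separated by blank lines, as the loose translation implies (the listing in the question does not show any, so the input below is hypothetical). Without a blank line, the loop would keep consuming lines until the end of the file.

```shell
# Hypothetical input: title, blank line, body, blank line, next title ...
input=$(printf '# Title1\n\nline 1\nline 2\n\n# Title2\n\nline 1\nline 2\n')

# -n suppresses automatic printing. Lines without "Title1" are printed by p and
# the cycle ends with d. On the "Title1" line, n fetches the following line,
# then the :a loop keeps fetching lines while they are non-empty.
result=$(printf '%s\n' "$input" |
  sed -n -e '/Title1/!{p;d;};n;' -e ':a' -e 'n;/./ba')

printf '%s\n' "$result"
```

The blank line that ends the deleted paragraph is consumed as well, so the output starts directly with the next title.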
Using GNU sed
$ sed -z 's/# Title1[^#]*//' input_file
# Title2
line 1
line 2
line 3
This might work for you (GNU sed):
sed '/^# /h;G;/\n# Title1/!P;d' file
If a line begins with "# ", copy it to the hold space.
Append the hold space to each line and, if the result does not contain \n# Title1, print its first line (P).
Delete the pattern space, so nothing else is printed.
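As a sketch, this is how the hold-space version behaves on the sample from the question (no blank lines needed). The variable names are only for this example.

```shell
# Sample data as listed in the question.
input=$(printf '# Title1\nline 1\nline 2\nline 3\n# Title2\nline 1\nline 2\nline 3\n')

# h remembers the most recent title; G appends it to every line, so each line
# carries the title of the section it belongs to; P prints only the original
# line, and only when the remembered title is not "# Title1".
result=$(printf '%s\n' "$input" | sed '/^# /h;G;/\n# Title1/!P;d')

printf '%s\n' "$result"
```

One thing to watch for: the test /\n# Title1/ would also match a title such as "# Title10". Anchoring it as /\n# Title1$/ should avoid that, assuming the title line has no trailing characters.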
Alternative:
sed '/^# Title1/{:a;N;/\n#/!s/\n//;ta;D}' file

Using sed, insert a space at the 3rd last index of each line

I would like to insert a space, before the 3rd last character of each line, to turn this:
CC287999221
CHGFFDTTT34AAA387
CH654AZ0987XX277
Into this:
CC287999 221
CHGFFDTTT34AAA 387
CH654AZ0987XX 277
So far I've tried:
sed -i 's/.*\(...\)/ \1/' file
However, this also removes the preceding text.
Thank you
One way:
sed 's/\(...$\)/ \1/' file
Match just the last 3 characters, and replace them with a space followed by the captured text (\1).
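A short sketch using the sample lines from the question. It also shows why the attempt in the question lost the leading text: .* matched it but nothing put it back, so one possible repair is to capture it too. Variable names are only for this example.

```shell
input=$(printf 'CC287999221\nCHGFFDTTT34AAA387\nCH654AZ0987XX277\n')

# Match only the last three characters; & stands for the matched text.
spaced=$(printf '%s\n' "$input" | sed 's/...$/ &/')

# Repair of the original attempt: capture the leading text as well and
# write both groups back with a space in between.
repaired=$(printf '%s\n' "$input" | sed 's/\(.*\)\(...\)$/\1 \2/')

printf '%s\n' "$spaced"
```

Lines shorter than three characters do not match and are left unchanged.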
With awk, you could try the following:
awk '{print substr($0,1,length($0)-3),substr($0,length($0)-2)}' Input_file
Tested with GNU sed:
sed -E 's/\S{3}\s*$/ &/' file
Another awk proposal:
awk '{sub(/.{3}$/," &")}1' file
CC287999 221
CHGFFDTTT34AAA 387
CH654AZ0987XX 277

How does this sed command: "sed -e :a -e '$d;N;2,10ba' -e 'P;D' " work?

I saw a sed command to delete the last 10 rows of data:
sed -e :a -e '$d;N;2,10ba' -e 'P;D'
But I don't understand how it works. Can someone explain it for me?
UPDATE:
Here is my understanding of this command:
The first script defines a label "a".
The second script first checks whether the line most recently read into the pattern space is the last line. If it is, the "d" command deletes it and starts the next cycle; if not, "d" is skipped. Then "N" appends the next line from the input file to the pattern space, and "2,10ba" jumps to label "a" if the line just read is one of lines 2 to 10.
The third script applies when the line just read into the pattern space is not one of lines 2 to 10: "P" prints the first line of the pattern space, and then "D" deletes the first line of the pattern space.
My understanding of "$d" is that "d" is executed only when sed has read the last line into the pattern space. But it seems that every time "ba" is executed, "d" is executed as well, regardless of whether the line just read into the pattern space is the last line. Why?
:a is a label. $ in the address means the last line, and d deletes the pattern space and starts the next cycle. N appends the next input line to the pattern space. 2,10 means lines 2 to 10, and b means branch (i.e. goto). P prints the first line of the pattern space. D is like d, but it deletes only up to the first newline in the pattern space and, if anything is left, restarts the cycle without reading a new line.
In other words, you create a sliding window of 10 lines. Lines are appended to it, and once it is full, lines start to get printed from the top. Every time a line is printed and removed from the top, the next input line is appended at the bottom. When the last input line has been read, $d deletes the whole sliding window instead of printing it, which removes the last 10 lines.
You can modify the commands to see what's getting deleted (()), stored (<>), and printed by the P ([]):
$ printf '%s\n' {1..20} | \
sed -e ':a ${s/^/(/;s/$/)/;p;d};s/^/</;s/$/>/;N;2,10ba;s/^/[/;s/$/]/;P;D'
[<<<<<<<<<<1>
[<2>
[<3>
[<4>
[<5>
[<6>
[<7>
[<8>
[<9>
[<10>
(11]>
12]>
13]>
14]>
15]>
16]>
17]>
18]>
19]>
20])
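As a quick check of the sliding-window explanation, here is a small sketch that runs the original command on generated input. It assumes seq is available; the variable names are only for this example.

```shell
# 15 input lines: the last 10 stay in the window and are deleted by $d,
# so only lines 1 to 5 are ever printed by P.
kept=$(seq 1 15 | sed -e :a -e '$d;N;2,10ba' -e 'P;D')

# 7 input lines: the window never fills up, so $d fires before anything
# has been printed and the output is empty.
short=$(seq 1 7 | sed -e :a -e '$d;N;2,10ba' -e 'P;D')

printf 'kept:\n%s\n' "$kept"
printf 'short: [%s]\n' "$short"
```

This also shows that $d is tested on every pass through the loop but only fires once the line most recently read is the last one.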
A simpler approach with GNU sed, if your data is in a file named d:
sed -Ez 's/(.*\n)(.*\n){10}$/\1/' d
The 10 in {10} is the number of trailing lines to remove.
Just move the group parentheses to invert the result, i.e. to keep only the last 10 lines:
sed -Ez 's/.*\n((.*\n){10})$/\1/' d

Remove everything from whitespace up to the next comma, but skip the first comma in each line of a file

I am learning sed and awk and trying some more complicated logic, but I couldn't find a solution for the following.
File contents:
This is apple,apple.com 443,apple2.com 80,apple3.com 232,
We talk on 1 banana,banana.com 80,banannna.com 23,
take 5 grape,grape5.com 23,
When I try with
$ cat sample.txt | sed -e 's/[[:space:]][^,]*,/,/g'
,apple.com,apple2.com,apple3.com,
,banana.com,banannna.com,
,grape5.com,
This is almost right, but I want the substitution to skip the first comma in each line, so the expected output is:
This is apple,apple.com,apple2.com,apple3.com,
We talk on 1 banana,banana.com,banannna.com,
take 5 grape,grape5.com,
Any help is appreciated.
If you are using GNU sed, you can do something like
sed -e 's/[[:space:]][^,]*,/,/2g' file
Here the 2g flag combination tells GNU sed to skip the first match and to replace the 2nd match and every match after it.
The output for the above command.
sed -e 's/[[:space:]][^,]*,/,/2g' file
This is apple,apple.com,apple2.com,apple3.com,
We talk on 1 banana,banana.com,banannna.com,
take 5 grape,grape5.com,
An excerpt from the man page of GNU sed
g
Apply the replacement to all matches to the regexp, not just the first.
number
Only replace the numberth match of the regexp.
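To make the flags concrete, here is a small sketch on a made-up string. Combining a number with g is not specified by POSIX, so the last command relies on GNU sed behaviour.

```shell
text='a-a-a-a'

first=$(printf '%s\n' "$text" | sed 's/a/X/')          # first match only
second=$(printf '%s\n' "$text" | sed 's/a/X/2')        # 2nd match only
all=$(printf '%s\n' "$text" | sed 's/a/X/g')           # every match
from_second=$(printf '%s\n' "$text" | sed 's/a/X/2g')  # 2nd match onwards (GNU sed)

printf '%s\n' "$first" "$second" "$all" "$from_second"
```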
awk '{gsub(/[ ]+/," "); gsub(/com [0-9]+/,"com")}1' file
This is apple,apple.com,apple2.com,apple3.com,
We talk on 1 banana,banana.com,banannna.com,
take 5 grape,grape5.com,
The first gsub collapses each run of spaces into a single space, and the second removes the unwanted number between com and the comma.

How to conditionally remove first line only with sed when it matches?

Can I use sed to check the first line of some command's output (to stdout) and delete this very first line if it matches a certain pattern?
Say, the command's output is something like this:
"AB"
"CD"
"E"
"F"
I want it to become:
"CD"
"E"
"F"
But when the first line is "GH", I don't want to delete the line.
I tried this, but it doesn't work:
<some_command> |sed '1/<pattern>/d'
The shell told me:
sed: 0602-403 1/<pattern>/d is not a recognized function.
I only want to use sed to process the first line, leaving the other lines untouched.
What is the correct syntax here?
This might work for you:
sed -e '1!b' -e '/GH/!d' file
You want to reference the 1st line, then say delete:
$ sed '1 d' file
No need for any pattern if you know which line you want to delete.
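With a pattern, GNU sed accepts this syntax, which deletes everything from the start of the input through the first line matching the pattern: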
$ sed '0,/pattern/ d' file
This is what you want:
$ sed '1{/"GH"/!d}' file
sed '1{/<pattern>/{/GH/!d}}' input
The error in your expression can be fixed like this:
sed '1{/<pattern>/d}' input
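A small sketch of the brace form on the data from the question. The original attempt fails because an address (1) cannot be followed directly by a second address (/pattern/); the braces turn the second test into a nested command. The ; before } is not needed by GNU sed but is included here because some other sed implementations require it. Variable names are only for this example.

```shell
# First line matches the pattern: it is deleted.
out1=$(printf '"AB"\n"CD"\n"E"\n"F"\n' | sed '1{/"AB"/d;}')

# First line does not match: nothing is deleted, even though
# a later line does match the pattern.
out2=$(printf '"GH"\n"CD"\n"AB"\n' | sed '1{/"AB"/d;}')

printf '%s\n' "$out1"
printf '%s\n' "$out2"
```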