How can I print lines between two patterns using sed?

How can I print the lines between pattern1 and pattern2? I don't need the lines between pattern1 and pattern3, though.
Please suggest a solution in either sed or awk.
My input looks like this:
pattern1
blah blah blah
blah blah blah
blah blah blah
pattern2
pattern1
blah blah blah
blah blah blah
pattern3
pattern1
blah blah blah
blah blah blah
pattern2
pattern1
blah blah blah
blah blah blah
pattern3
Desired output:
pattern1
blah blah blah
blah blah blah
blah blah blah
pattern2
pattern1
blah blah blah
blah blah blah
pattern2

With sed:
sed -n '/pattern1/{:l N;/pattern3/b;/pattern2/!bl;p}' input
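Description
/pattern1/{ # when pattern1 matches ...
:l N; # start of loop: append the next input line to the pattern space
/pattern3/b # if pattern3 shows up, branch to the end of the script (with -n, nothing is printed)
/pattern2/!bl # otherwise keep looping until pattern2 matches
p # print the accumulated lines
} # end of the block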
Output
pattern1
blah blah blah
blah blah blah
blah blah blah
pattern2
pattern1
blah blah blah
blah blah blah
pattern2
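The one-liner above relies on GNU sed accepting a label followed by more commands on the same line. As a minimal sketch, assuming a sed that expects a label to end at the end of its script chunk, the same logic can be passed as one command per -e expression. The temporary file and the variable names are only there to make the snippet self-contained.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' \
    pattern1 'blah blah blah' 'blah blah blah' 'blah blah blah' pattern2 \
    pattern1 'blah blah blah' 'blah blah blah' pattern3 \
    pattern1 'blah blah blah' 'blah blah blah' pattern2 \
    pattern1 'blah blah blah' 'blah blah blah' pattern3 > "$input"

# Same logic as the one-liner, one command per -e so the label ends cleanly.
result=$(sed -n \
    -e '/pattern1/{' \
    -e ':l' \
    -e 'N' \
    -e '/pattern3/b' \
    -e '/pattern2/!bl' \
    -e 'p' \
    -e '}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Only the two blocks that end in pattern2 should be printed; the blocks that reach pattern3 are dropped because the branch skips the p command.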

One method:
$ awk '/pattern1/{f=1;s=NR} f{p[NR]=$0} /pattern3/{f=0;s=0} /pattern2/&&s{for(i=s;i<=NR;i++)print p[i]; f=0;s=0}' file
pattern1
blah blah blah
blah blah blah
blah blah blah
pattern2
pattern1
blah blah blah
blah blah blah
pattern2

$ awk '/pattern1/{f=1;buf=""} f{buf = buf $0 ORS} /pattern2/{if(f)printf "%s",buf; f=0} /pattern3/{f=0}' file
pattern1
blah blah blah
blah blah blah
blah blah blah
pattern2
pattern1
blah blah blah
blah blah blah
pattern2
To help with comprehension, here is the same script spread across several lines, with wordier variable names:
awk '
/pattern1/ {
    found=1
    buffer=""
}
found {
    buffer = buffer $0 ORS
}
/pattern2/ {
    if (found) {
        printf "%s",buffer
    }
    found=0
}
/pattern3/ {
    found=0
}
' file
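The buffering approach above can also be wrapped so the three patterns are passed in rather than hard-coded. This is only a sketch: the function name print_blocks and the awk variable names start, keep and drop are hypothetical, and the patterns are assumed to be simple regular expressions without backslashes (awk processes escape sequences in -v values). The flag is set to 1 on every start line, so a block restarts at the most recent start pattern.

```shell
# Hypothetical helper: print blocks that begin at pattern $1 and end at
# pattern $2, discarding any block that reaches pattern $3 first.
# $4 is the input file.
print_blocks() {
    awk -v start="$1" -v keep="$2" -v drop="$3" '
        $0 ~ start { found = 1; buffer = "" }   # (re)start buffering
        found      { buffer = buffer $0 ORS }   # collect the current line
        $0 ~ keep  { if (found) printf "%s", buffer; found = 0 }
        $0 ~ drop  { found = 0 }                # throw the block away
    ' "$4"
}

# Try it on the sample input from the question.
input=$(mktemp)
printf '%s\n' \
    pattern1 'blah blah blah' 'blah blah blah' 'blah blah blah' pattern2 \
    pattern1 'blah blah blah' 'blah blah blah' pattern3 \
    pattern1 'blah blah blah' 'blah blah blah' pattern2 \
    pattern1 'blah blah blah' 'blah blah blah' pattern3 > "$input"

result=$(print_blocks pattern1 pattern2 pattern3 "$input")
printf '%s\n' "$result"
rm -f "$input"
```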

I got lost among my hold spaces trying a pure sed solution, so here is an alternative using tac (from GNU coreutils):
$ tac input | sed '/pattern3/,/pattern1/d' | tac
pattern1
blah blah blah
blah blah blah
blah blah blah
pattern2
pattern1
blah blah blah
blah blah blah
pattern2

Related

Skip first 2 lines and remove quotes from row values in pyspark dataframe

I have a CSV file with around 15 columns.
I would like to skip the first 2 lines and use a custom schema.
I would also like to remove the double quotes from the row values.
The CSV is as below.
Header1 blah blah
Header2 blah blah
Name1;"1,456";"City1";"3";"pet"
Name2;"3,450";"City2";"4";"not pet"
delimiter = ";"
salesDF = spark.read.format("csv") \
.option("quote", "") \
.option("sep", delimiter) \
.load("sales_2018.csv")
salesDF = salesDF.replace("\"","")
I tried the above to remove the quotes from the CSV. The delimiter works, but the quotes are not removed.
The results are below: it only added quotes instead of removing them.
Header1 blah blah
Header2 blah blah
"Name1;""1,456"";""City1"";""3"";""pet""
"Name2;""3,450"";""City2"";""4"";""not pet""
My idea is to remove the quotes and then remove the first 2 lines of the dataframe so that I can add my custom schema. Thanks.

Replace newline (\n) except last of each line

My input is split across multiple lines. I want to output it on a single line.
For example, the input is:
1|23|ABC
DEF
GHI
newline
newline
2|24|PQR
STU
LMN
XYZ
newline
Output:
1|23|ABC DEF GHI
2|24|PQR STU LMN XYZ
Well, here is one for awk:
$ awk -v RS="" -F"\n" '{$1=$1}1' file
Output:
1|23|ABC DEF GHI
2|24|PQR STU LMN XYZ
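A short sketch of why this works, assuming the lines shown as "newline" in the question are really empty lines: RS="" puts awk in paragraph mode, so each run of non-empty lines is one record; -F"\n" makes every line of that record a field; and the assignment $1=$1 forces awk to rebuild the record with the default output separator, a single space. The variable name joined is only for illustration.

```shell
# Feed the sample through awk; the empty arguments become empty lines.
joined=$(printf '%s\n' '1|23|ABC' DEF GHI '' '' '2|24|PQR' STU LMN XYZ '' |
    awk -v RS="" -F"\n" '{$1=$1}1')

printf '%s\n' "$joined"
```

Each record should come out as one space-joined line, regardless of how many empty lines separate the records.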

Pattern search and next line copy

I have a text file that looks like this:
AAA
BBB
CCC
AAA
DDD
EEE
It has a specific keyword, for example AAA. After encountering the keyword, I'd like to copy the following line and then write it a second time in my output file.
I want it to look like this:
AAA
BBB
BBB
CCC
AAA
DDD
DDD
EEE
Is there anybody who will help me to do this?
Sed can do it like this:
$ sed '/AAA/{n;p}' infile
AAA
BBB
BBB
CCC
AAA
DDD
DDD
EEE
This looks for the pattern (/AAA/), then reads the next line of input (n) and prints it (p). Because printing is the default action anyway, the line gets printed twice, which is what we want.
awk to the rescue!
$ awk 'd{print;d=0} /AAA/{d=1}1' file
AAA
BBB
BBB
CCC
AAA
DDD
DDD
EEE
Explanation
d{print;d=0}
if the flag d is set, print the line and reset the flag,
/AAA/{d=1}
set the flag so that the line after the given pattern is duplicated,
1
and print all lines.
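The sed and awk one-liners above agree on the sample, but they can differ when the keyword appears on two consecutive lines. The sketch below uses a made-up three-line input to show this: sed's n command consumes the second AAA without testing it, while awk tests every line. The variable names are only for illustration, and the sed script is written as {n;p;} so that it is also accepted by seds that require a terminator before the closing brace.

```shell
# Made-up sample where the keyword appears on two consecutive lines.
sample=$(printf '%s\n' AAA AAA BBB)

# sed: n consumes the second AAA, so it is never tested against /AAA/.
sed_out=$(printf '%s\n' "$sample" | sed '/AAA/{n;p;}')

# awk: every line is tested, so the line after the second AAA is doubled too.
awk_out=$(printf '%s\n' "$sample" | awk 'd{print;d=0} /AAA/{d=1}1')

printf 'sed:\n%s\nawk:\n%s\n' "$sed_out" "$awk_out"
```

With this input sed prints BBB once and awk prints it twice, so which one-liner fits depends on how consecutive keywords should be treated.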
You can use perl for this
perl -e ' $a = undef;
while (<>) {
    chomp;
    if ($a eq "AAA") {
        print "$_\n";
    }
    print "$_\n";
    $a = $_;
}' your_file.txt
This iterates through the file and prints each line. If the previous line is "AAA", the current line is printed twice.
I don't know whether you share my hatred of one-line programs, but this is entirely possible in Perl
$ perl -ne'print; print scalar <> x 2 if /AAA/' aaa.txt
output
AAA
BBB
BBB
CCC
AAA
DDD
DDD
EEE

Sed: Add a word (or a new member) to a set line in a file

Let's imagine I have a file with some lines, and one of the lines has this structure:
blah blah
YYYY :['aaa','ddd']
blah
XXXX :['member1', 'member2']
blah blah
I want a script that automatically adds member3 to the end of the XXXX array. I tried to use sed, but I do not know how to replace the last bracket of the lines starting with XXXX with ", 'member3']". The result should look like this:
blah blah
YYYY :['aaa','ddd']
blah
XXXX :['member1', 'member2', 'member3']
blah blah
Any help?
sed "/^XXXX /s/\]\$/, 'member3']/" < input
This applies a substitution only to the lines that start with "XXXX ", replacing the final ] with , 'member3']
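To make the effect of the address visible, here is a small sketch that runs the same substitution on the sample from the question, with the new member taken from a shell variable. The variable name new_member is hypothetical, and its value is assumed not to contain characters that are special in a sed replacement (such as /, & or \).

```shell
new_member=member3   # hypothetical variable holding the member to append

result=$(printf '%s\n' \
    'blah blah' \
    "YYYY :['aaa','ddd']" \
    'blah' \
    "XXXX :['member1', 'member2']" \
    'blah blah' |
    sed "/^XXXX /s/\]\$/, '$new_member']/")

printf '%s\n' "$result"
```

The YYYY line also ends in ], but it is left alone because the substitution only runs on lines matching /^XXXX /.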
It's a bit unclear; is this what you're after?
$ echo "XXXXX :['member1', 'member2']" | sed "s/]$/, 'member3']/"
XXXXX :['member1', 'member2', 'member3']
Update
$ cat file.txt
bla
bla bla
XXXXX :['member1', 'member2']
bla
bla bla
$ sed "s/]$/, 'member3']/" file.txt
bla
bla bla
XXXXX :['member1', 'member2', 'member3']
bla
bla bla

find lines between two patterns using sed

I have the lines in text.txt as below:
blah blah..
blah abc blah..
blah abc blah
blah blah..
blah blah..
blah blah..
blah efg blah blah
blah blah..
blah abc blah
blah abc blah
blah abc blah
blah abc blah
blah abc blah
blah blah..
blah efg blah blah
blah blah..
blah blah..
I want to output the lines from the last occurrence of "abc" before each "efg" up to that "efg". For the above example, I want to output:
blah abc blah
blah blah..
blah blah..
blah blah..
blah efg blah blah
blah abc blah
blah blah..
blah efg blah blah
I know sed can select ranges using two patterns, like:
sed -n '/abc/,/efg/p' text.txt
However, the output then begins at the first occurrence of "abc" instead of the last one. The output is as follows:
blah abc blah..
blah abc blah
blah blah..
blah blah..
blah blah..
blah efg blah blah
blah abc blah
blah abc blah
blah abc blah
blah abc blah
blah abc blah
blah blah..
blah efg blah blah
Is there any enhancement I can make to the command line so that the output begins at the last occurrence of "abc"?
This might work for you (GNU sed):
sed -n '/\<abc\>/,/\<efg\>/{/\<abc\>/{h;d};H;/\<efg\>/{x;p}}' file
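The sed command above restarts the hold space at every abc line inside the range and prints it when efg arrives; the \< and \> word boundaries are a GNU extension. As a rough awk sketch of the same idea, assuming plain substring matches are good enough (no word boundaries), a buffer can be restarted at every abc line and flushed at efg. The names buf and inblock are only for illustration.

```shell
# Recreate the sample from the question in a temporary file.
input=$(mktemp)
printf '%s\n' \
    'blah blah..' 'blah abc blah..' 'blah abc blah' 'blah blah..' \
    'blah blah..' 'blah blah..' 'blah efg blah blah' 'blah blah..' \
    'blah abc blah' 'blah abc blah' 'blah abc blah' 'blah abc blah' \
    'blah abc blah' 'blah blah..' 'blah efg blah blah' 'blah blah..' \
    'blah blah..' > "$input"

result=$(awk '
    /abc/   { buf = $0; inblock = 1; next }   # restart the buffer at every abc line
    inblock { buf = buf ORS $0 }              # collect lines after the latest abc
    inblock && /efg/ { print buf; inblock = 0 }   # flush when efg closes the block
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```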