how do I delete a regex-delimited range plus a few lines with sed? - sed

I have a file containing a header I want to get rid of. I don't have a good way of addressing either the last line of the header or the first line of the data, but I can address the line before the next-to-last line of the header via a regular expression.
Example input:
a bunch of make output which I don't care about
for junk in blah; do
can't check for done!
done
for test in blurfl; do # this is the addressable line
more garbage
done
line 1
line 2
line 3
line 4
line 5
I've done the obvious 1,/for test in blurfl/d, but that doesn't get the next two lines. I can make the command {N;d} which gets rid of the next line, but {N;N;d} just blows away the rest of the file except the last line, which I figured out is because the range isn't slurped up and treated as a single entity, but instead is processed line-by-line.
I feel like I'm missing something obvious because I don't know some sed idiom, but none of the examples on the web or in the GNU manual have managed to trigger anything useful.
I can do this in awk, but other transformations I need to do make awk somewhat, well, awkward. But GNU sed is acceptable.

I have to disagree about [not] using awk. Anything non-trivial is almost always easier in awk than sed [even the sed manpage says so]. Personally, I'd use perl, but ...
So, here's the awk script:
BEGIN {
phase = 0
}
# initial match -- find second loop
phase == 0 {
if ($0 ~ /for test in blurfl/) {
phase = 1
next
}
}
# wait for end of second loop
phase == 1 {
if ($0 ~ /done/) {
phase = 2
next
}
}
# print phase
phase == 2 {
print($0)
}
If you wish to torture yourself [and sed] for complex changes, well, caveat emptor, but don't say I didn't warn you ...

I don't think you can do multi line matches in sed. First time I went down this rabbit hole I ended up using awk, which can support, but now recently I'd probably use Python or Ruby for this kind of thing.

Related

Search for a match, after the match is found take the number after the match and add 4 to it, it is posible in perl?

I am a beginer in perl and I need to modify a txt file by keeping all the previous data in it and only modify the file by adding 4 to every number related to a specific tag (< COMPRESSED-SIZE >). The file have many lines and tags and looks like below, I need to find all the < COMPRESSED-SIZE > tags and add 4 to the number specified near the tag:
< SOURCE-START-ADDRESS >01< /SOURCE-START-ADDRESS >
< COMPRESSED-SIZE >132219< /COMPRESSED-SIZE >
< UNCOMPRESSED-SIZE >229376< /UNCOMPRESSED-SIZE >
So I guess I need to do something like: search for the keyword(match) and store the number 132219 in a variable and add the second number (4) to it, replace the result 132219 with 132223, the rest of the file must remain unchanged, only the numbers related to this tag must change. I cannot search for the number instead of the tag because the number could change while the tag will remain always the same. I also need to find all the tags with this name and replace the numbers near them by adding 4 to them. I already have the code for finding something after a keyword, because I needed to search also for another tag, but this script does something else, adds a number in front of a keyword. I think I could use this code for what i need, but I do not know how to make the calculation and keep the rest of the file intact or if it is posible in perl.
while (my $row = <$inputFileHandler>)
{
if(index($row,$Data_Pattern) != -1){
my $extract = substr($row, index($row,$Data_Pattern) + length($Data_Pattern), length($row));
my $counter_insert = sprintf "%08d", $counter;
my $spaces = " " x index($row,$Data_Pattern);
$data_to_send ="what i need to add" . $extract;
print {$outs} $spaces . $Data_Pattern . $data_to_send;
$counter = $counter + 1;
}
else
{
print {$outs} $row;
next;
}
}
Maybe you could help me with a block of code for my needs, $Data_Pattern is the match. Thank you very much!
This is a classic one-liner Perl task. Basically you would do something like
$ perl -i.bak -pe's/^< COMPRESSED-SIZE >\K(\d+)/$1 + 4/e' yourfile.txt
Which will in essence copy and replace your file with a new, edited file. This can be very dangerous, especially if you are a Perl newbie. The -i switch is here used with the .bak extension which saves a backup in yourfile.txt.bak. This does not make this operation safe, however, as running the command twice will overwrite the backup.
It is advisable to make a separate backup of the target file before using this command.
-i.bak edit "in-place", the file is overwritten, a backup of the original is created with extension .bak.
-p argument is treated as a file name, which is read, and printed back.
s/ // the substitution operator, which is applied to all lines of the file.
^ inside the regex looks for beginning of line.
\K keep the match that is to the left.
(\d+) capture () 1 or more digits \d+ and store them in $1
/e treat the right hand side of the substitution operator as an expression and use the result as the replacement string. In this case it will increase your number and return the sum.
The long version of this command is
while (<>) {
s/^< COMPRESSED-SIZE >\K(\d+)/$1 + 4/e
}
Which can be placed in a file and run with the -i switch.

Is there any way to encode Multiple columns in a csv using base64 in Shell?

I have a requirement to replace multiple columns of a csv file with its base64 encoding value which should be applied to some columns of the file but keep the first line unaffected as the first line contains the header of the file. I have tried out for 1 column as below but as I have given it to proceed after skipping the first line of the file it is not
gawk 'BEGIN { FS="|"; OFS="|" } NR >=2 { cmd="echo "$4" | base64 -w 0";cmd | getline x;close(cmd); print $1,$2,$3,x}' awktest
o/p:
12|A|B|Qw==
13|C|D|RQ==
36|Z|V|VQ==
Qs: It is not showing the header in the output. What should I do to make produce the header in the output? Also can I use any loop here to replace multiple columns?
input:
10|A|B|C|5|T|R
12|A|B|C|6|eee|ff
13|C|D|E|9|dr|xrdd
36|Z|V|U|7|xc|xd
Required output:
10|A|B|C|5|T|R
12|A|B|encodedvalue|6|encodedvalue|ff
13|C|D|encodedvalue|9|encodedvalue|xrdd
36|Z|V|encodedvalue|7|encodedvalue|xd
Is this possible? Have researched a lot but could not find a proper explanation. I am new to shell. Kindly help. Many thanks!!!!
It looks like you can just sequence conditionals. This may not be the best way of solving the header issue, but it's intuitive.
BEGIN { FS="|"; OFS="|" } NR ==1 {print} NR >=2 { cmd="echo "$4" | base64 -w 0";cmd | getline x;close(cmd); print $1,$2,$3,x}
As for using a loop to affect multiple columns... Loops in bash are hard. Awk is technically its own language, and may have a looping construct of it's own, IDK. But it's not clear you need a loop. If there's only a reasonable number of fields that need modifying, you can just parameterize the existing command (somehow) by the field index, and then pipe through however many instances of it. It won't be as performant as doing it all in a single pass of awk, but that's probably ok.

Read entire line $nick found on

Below I have wrote my basic goal and the code I already have, any help is much appreciated as I am learning myself how IRC Scripting works, thanks guys!
on $*:text:*test*:#: {
if ($date isin $read(test1.txt, 1)) {
if ($nick isin $read(test1.txt, 1)) { write test.txt "entire line $nick was found on in test1.txt" $1- }
}
}
In the future, you should make your question clearer.
Your question looks like this one mIRC Search for multiple words in text file, you can read my answer there for more information, it's mostly the same so I'm copying and pasting it here with edits for your case.
To read a .txt file line by line you need a loop. To use this loop type: /findNick <NICK>
alias findNick {
var %nick = $1
while ($read(test1.txt, nw, $+(*,$date,*), $calc($readn + 1))) {
var %line = $v1
if (%nick isin %line) {
echo -a %nick found on the line: %line
; do your stuff here
}
}
}
$readn is an identifier that returns the line that $read() matched. It is used to start searching for the pattern on the next line. Which is in this case $date. The asteriks means a wildcard, so anything that contains that date.
In the code above, $readn starts at 0. We use $calc() to start at line 1. Every match $read() will start searching on the next line. When no more matches are after the line specified $read will return $null - terminating the loop.
The w switch is used to use a wildcard in your search
The n switch prevents evaluating the text it reads as if it was mSL code. In almost EVERY case you must use the n switch. Except if you really need it. Improper use of the $read() identifier without the 'n' switch could leave your script highly vulnerable.
The result is stored in a variable named %line to use it again to check wheter $nick is in the found line. If the $nick was found, it will echo the result in your active window.
And again, if there's anything unclear, I will try to explain it better.

use identifying symbols to identify and edit line/string, then append line/string to previous line in file

Using standard linux utilities (sed and awk, I am guessing)
Sorry about the vague title, I don't really know how to describe the request much better. An easier way to do so is to provide a simple example. I have a file with the following content:
www.example.com
johnsmith#gmail.com
fredflintstone#gmail.com
bettyboop#gmail.com
www.example2.com
kylejohnson#gmail.com
www.example3.com
chadbrown#gmail.com
joshbeck#gmail.com
www.example4.com
tomtom#gmail.com
jeffjeffries#gmail.com
billnorman#gmail.com
stankubrick#gmail.com
andrewanders#gmail.com
So, what I want to do is convert the above to:
www.example.com,johnsmith#gmail.com,fredflintstone#gmail.com,bettyboop#gmail.com
www.example2.com,kylejohnson#gmail.com
www.example3.com,chadbrown#gmail.com,joshbeck#gmail.com,
www.example4.com,tomtom#gmail.com,jeffjeffries#gmail.com,billnorman#gmail.com,stankubrick#gmail.com,andrewanders#gmail.com
I am thinking that the easiest thing to do would be to execute something along the lines of: if the line contains an "#" symbol, input a comma at the beginning of the line/string and then append that line/string to the preceding line. Anyone have any ideas? It would be simpler, I think, if there were a uniform number of email addresses associated with each website, but this is not the case.
Thanks in advance!
A simple approach
awk '{s=/#/?",":"\n";printf s"%s",$0}' file
www.example.com,johnsmith#gmail.com,fredflintstone#gmail.com,bettyboop#gmail.com
www.example2.com,kylejohnson#gmail.com
www.example3.com,chadbrown#gmail.com,joshbeck#gmail.com
s=/#/?",":"\n" Does line contain # yes set s="," no set s="\n" (newline).
printf s"%s",$0 print $0 using s as format. If line has # print newline, then $0, if not print ,, then $0
Try this awk program:
/^[:space:]*www\./ {
if (f) {print line}
f=1; line=$0;
next
}
f {
line=(line "," $0)
}

Quick Basic Colon Line Separator

I am working on some old qbasic code. It's a mess with all the Goto statements. Am I correct that the following line will always return?
IF FLAG = 0 THEN TARGET = X: GOSUB 55000: TEMP = XI - TEMP2: RETURN
So if I understand this correctly the colon separates statements on the same line. The if only pertains to TARGET = X. The GOSUB, TEMP =, and RETURN will always execute. Correct?
Part of my confusion is because the very next line reads
IF FLAG = 1 THEN STEP = X: GOSUB 115000
And since the label to the second statement is never used in a GOTO I can't see that it would ever get executed.
Yes, I believe your assessment is correct. The colon is a statement separator that lets you have multiple statements on the same line. Assuming your subroutine at 55000 returns, this line should return as well.
I was wrong. Running this program:
if 1=2 then print "Never printed" : print "how about this?"
print "End of program"
on qb64.net prints only End of program. I assume that its grammar details are the same as Qbasic's, although it is a reverse-engineered effort.
As an aside, this code is written in a pre-QBasic style (e.g. using GOSUB and line numbers). There is a script that often came with QBasic (remline.bas, I believe it was called) that is supposed to help translate these kinds of programs to a newer style. I have never used it myself, though.