Using sed to append string from regex pattern in the same file - perl

I'm new to Unix tools such as sed and perl. I've searched but found nothing that matches my case. I need to append a substring taken from an earlier line in the same file. My file, text.txt, contains:
Name: sur.name.custom
Tel: xxx
Address: yyy
Website: www.site.com/id=
Name: sur.name.custom1
Tel: xxx
Address: yyy
Website: www.site.com/id=
I need to append each Name value (sur.name.*) to the Website line in the same block.
Expected output:
Name: sur.name.custom
Tel: xxx
Address: yyy
Website: www.site.com/id=sur.name.custom
Name: sur.name.custom1
Tel: xxx
Address: yyy
Website: www.site.com/id=sur.name.custom1
I've tried the following sed command:
sed -n "/^Website:.*id=$/ s/$/sur.name..*/p" ./text.txt;
But sed just appended the literal replacement text (Website: www.site.com/id=sur.name.*) instead of the name.
I'm sure sed can append text captured by a regex. I'd like both a sed and a perl solution if possible.

Why not use awk for this? Assuming names don't contain spaces, the following command should work:
awk '$1=="Name:"{name=$2} $1=="Website:"{print $0 name;next} 1' file
Perl equivalent:
perl -pale'
$F[0] eq "Name:" and $name = $F[1];
$F[0] eq "Website:" and $_ .= $name;
' file
(Line breaks may be removed.)

Here's a sed solution:
sed '/^Name:/{h;s/Name: *//;x;};/^Website:/{G;s/\n//;}' filename
Translation: if the line begins with Name:, save the name to the hold space; if the line begins with Website:, append the (latest) name from the hold space.
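To try the hold-space approach on the sample data from the question, here is a minimal sketch. The temporary file and the variable name out are only there for illustration; the sed program itself is the one from the answer above.

```shell
# Recreate the sample input from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Name: sur.name.custom
Tel: xxx
Address: yyy
Website: www.site.com/id=
Name: sur.name.custom1
Tel: xxx
Address: yyy
Website: www.site.com/id=
EOF

# Name: line    -> copy it to the hold space, strip the label, then swap, so
#                  the hold space keeps the bare name and the line prints as is.
# Website: line -> append the hold space and delete the joining newline.
out=$(sed '/^Name:/{h;s/Name: *//;x;};/^Website:/{G;s/\n//;}' "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

Every line is printed, and only the Website: lines change.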

Related

Replace one matched pattern with another in multiline text with sed

I have file with this text:
mirrors:
  docker.io:
    endpoint:
      - "http://registry:5000"
  registry:5000:
    endpoint:
      - "http://registry:5000"
  localhost:
    endpoint:
      - "http://registry:5000"
I need to replace it with this text in a POSIX shell script (not bash):
mirrors:
  docker.io:
    endpoint:
      - "http://docker.io"
  registry:5000:
    endpoint:
      - "http://registry:5000"
  localhost:
    endpoint:
      - "http://localhost"
The replacement should be done dynamically everywhere, without hard-coded names: take the substring from the first line of each block ("docker.io", "registry:5000", "localhost") and substitute it for "registry:5000" in the third line.
I've figured out a regex that splits a block into 5 groups: (^ )([^ ]*)(:[^"]*"http:\/\/)([^"]*)(")
Then I tried to use sed to print group 2 in place of group 4, but it didn't work: sed -n 's/\(^ \)\([^ ]*\)\(:[^"]*"http:\/\/\)\([^"]*\)\("\)/\1\2\3\2\5/p'
Please help!
This might work for you (GNU sed):
sed -E '1N;N;/\n.*endpoint:.*\n/s#((\S+):.*"http://)[^"]*#\1\2#;P;D' file
Open up a three line window into the file.
If the second line contains endpoint:, replace the last piece of text following http:// with the first piece of text before :
Print/Delete the first line of the window and then replenish the three line window by appending the next line.
Repeat until the end of the file.
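Since the question asks for a POSIX shell script without hard-coded names, the same idea can also be sketched with plain awk: remember the previous line, and when an endpoint: line appears, use that remembered line as the mirror name for the line that follows. This is only a sketch. The two-space YAML indentation of the sample, the variable names (prev, name, pending) and the temporary file are assumptions, and a mirror name containing & or a backslash would need escaping before being used in sub().

```shell
# Sample input; the two-space indentation is an assumption about the file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
mirrors:
  docker.io:
    endpoint:
      - "http://registry:5000"
  registry:5000:
    endpoint:
      - "http://registry:5000"
  localhost:
    endpoint:
      - "http://registry:5000"
EOF

out=$(awk '
  # An endpoint: line. The previous line holds the mirror name.
  /^[ \t]*endpoint:[ \t]*$/ {
    name = prev
    sub(/^[ \t]*/, "", name)    # drop the indentation
    sub(/:[ \t]*$/, "", name)   # drop only the trailing colon
    pending = 1
    print; prev = $0; next
  }
  # The line right after endpoint: gets its quoted URL rewritten.
  pending {
    sub(/"http:\/\/[^"]*"/, "\"http://" name "\"")
    pending = 0
  }
  { print; prev = $0 }
' "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

Only the trailing colon is stripped from the name, so registry:5000 keeps its port.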
Awk would be a better candidate for this. Pass the replacement string as the variable str and the section to change (" docker.io", " localhost" or " registry:5000") as the variable findstr:
awk -v findstr=" docker.io" -v str="http://docker.io" '
$0 ~ findstr { dockfound=1 # We have found the section passed in findstr and so we set the dockfound marker
}
/endpoint/ && dockfound==1 { # We encounter endpoint after the dockfound marker is set and so we set the found marker
found=1;
print;
next
}
found==1 && dockfound==1 { # We know from the found and the dockfound markers being set that we need to process this line
match($0,/^[[:space:]]+-[[:space:]]"/); # Match the start of the line to the beginning quote
$0=substr($0,RSTART,RLENGTH)str"\""; # Print the matched section followed by the replacement string (str) and the closing quote
found=0; # Reset the markers
dockfound=0
}1' file
One liner:
awk -v findstr=" docker.io" -v str="http://docker.io" '$0 ~ findstr { dockfound=1 } /endpoint/ && dockfound==1 { found=1;print;next } found==1 && dockfound==1 { match($0,/^[[:space:]]+-[[:space:]]"/);$0=substr($0,RSTART,RLENGTH)str"\"";found=0;dockfound=0 }1' file

Is there a way to add characters in between separators in a text file?

I have an input file (customers.txt) that looks like this:
Name, Age, Email,
Hank, 22, hank#mail.com
Nathan, 32, nathan#mail.com
Gloria, 24, gloria#mail.com
I'm trying to write the output to a file (customersnew.txt) so that it looks like this:
Name: Hank Age: 22 Email: hank#mail.com
Name: Nathan Age: 32 Email: nathan#mail.com
Name: Gloria Age: 24 Email: gloria#mail.com
So far, I've only been able to get an output like:
Name: Hank, 22, hank#mail.com
Name: Nathan, 32, nathan#mail.com
Name: Gloria, 24, gloria#mail.com
The code that I'm using is
sed -e '1d'\
-e 's/.*/Name: &/g' customers.txt > customersnew.txt
I've also tried separating the data using -e 's/,/\n/g'\ and then -e '2s/.*Age: &/g'. It doesn't work. Any help would be highly appreciated.
Have you considered using awk for this? Like:
$ awk 'BEGIN {FS=", ";OFS="\t"} NR==1 {sub(/,[[:space:]]*$/,"");split($0,hdr);next} {for(i=1;i<=NF;i++)$i=hdr[i]": "$i} 1' file
Name: Hank Age: 22 Email: hank#mail.com
Name: Nathan Age: 32 Email: nathan#mail.com
Name: Gloria Age: 24 Email: gloria#mail.com
This simply saves headers into an array and prepends each field in following records with <header>:.
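The header-array idea can also be sketched so that it tolerates the trailing comma on the header line and produces the single-space layout shown in the question. The separator pattern ', *', the single-space joining and the temporary file are choices made for this sketch, not part of the answer above.

```shell
# Recreate customers.txt from the question (header ends with a comma).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Name, Age, Email,
Hank, 22, hank#mail.com
Nathan, 32, nathan#mail.com
Gloria, 24, gloria#mail.com
EOF

out=$(awk -F', *' '
  # Header line: remember each non-empty column name, then skip the line.
  NR == 1 { for (i = 1; i <= NF; i++) if ($i != "") hdr[i] = $i; next }
  # Data lines: build "Header: value" pairs joined by single spaces.
  {
    line = ""
    for (i = 1; i <= NF; i++)
      line = line (i > 1 ? " " : "") hdr[i] ": " $i
    print line
  }
' "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

Redirecting the awk output to customersnew.txt instead of capturing it in a variable gives the file asked for in the question.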
This might work for you (GNU sed & column):
sed -E '1h;1d;G;s/^/,/;:a;s/,\s*(.*\n)([^,]+),/\2: \1/;ta;P;d' file | column -t
Copy the header to the hold space.
Append the header to each detail line.
Prepend a comma to the start of the line.
Create a substitution loop that replaces the first comma by the first heading in the appended header.
When all the commas have been replaced, print the first line and delete the rest.
To display in neat columns use the column command with the -t option.
Could you please try the following:
awk '
BEGIN{
FS=", "
OFS="\t"
}
FNR==1{
sub(/,[[:space:]]*$/,"")
for(i=1;i<=NF;i++){
value[i]=$i
}
next
}
{
for(i=1;i<=NF;i++){
$i=value[i] ": " $i
}
}
1
' Input_file
Explanation: Adding explanation for above solution.
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section of this program from here.
FS=", " ##Setting field separator as comma space here.
OFS="\t" ##Setting output field separator as TAB here for all lines.
}
FNR==1{ ##Checking here if this is first line then do following.
sub(/,[[:space:]]*$/,"") ##Removing the trailing comma of the header line so the last header has no comma attached.
for(i=1;i<=NF;i++){ ##Starting a for loop to traverse through all elements of fields here.
value[i]=$i ##Creating an array named value with index variable i and value is current field value.
}
next ##next will skip all further statements from here.
}
{
for(i=1;i<=NF;i++){ ##Traversing through all fields of current line here.
$i=value[i] ": " $i ##Setting current field value adding array value with index i colon space then current field value here.
}
}
1 ##1 will print all lines here.
' Input_file ##Mentioning Input_file name here.

How to replace a block of code between two patterns with blank lines?

I am trying to replace a block of code between two patterns with blank lines.
I tried the command below:
sed '/PATTERN-1/,/PATTERN-2/d' input.pl
But it deletes the lines instead of replacing them with blank lines.
PATTERN-1 : "=head"
PATTERN-2 : "=cut"
input.pl contains the text below:
=head
hello
hello world
world
morning
gud
=cut
Required output :
=head





=cut
Can anyone help me on this?
$ awk '/=cut/{f=0} {print (f ? "" : $0)} /=head/{f=1}' file
=head





=cut
To modify the given sed command, try
$ sed '/=head/,/=cut/{//! s/.*//}' ip.txt
=head





=cut
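//! applies the command to the lines in the range that match neither the start nor the end pattern. An empty regex reuses the most recently applied regex, and whether that resolves dynamically to both range addresses or statically to only one of them may depend on the sed implementation. It works with GNU sed.
s/.*// clears those lines.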
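If the sed at hand does not resolve the empty regex this way, both patterns can be spelled out instead. A minimal sketch, assuming the markers start at the beginning of a line; the ^ anchors, the shortened sample and the temporary file are additions for illustration.

```shell
# Shortened sample with lines before and after the =head ... =cut block.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
before
=head
hello
hello world
=cut
after
EOF

# Inside the range, blank every line that is neither the start marker
# nor the end marker; the marker lines and everything outside stay as is.
out=$(sed '/^=head/,/^=cut/{/^=head/!{/^=cut/!s/.*//;};}' "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

The line count is unchanged, so line numbers in the rest of the file stay the same.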
awk '/=cut/{found=0}found{print "";next}/=head/{found=1}1' infile
# OR
# ^ to take care of line starts with regexp
awk '/^=cut/{found=0}found{print "";next}/^=head/{found=1}1' infile
Explanation:
awk '/=cut/{ # if line contains regexp
found=0 # set variable found = 0
}
found{ # if variable found is nonzero value
print ""; # print ""
next # go to next line
}
/=head/{ # if line contains regexp
found=1 # set variable found = 1
}1 # 1 at the end does default operation
# print current line/row/record
' infile
Test Results:
$ cat infile
=head
hello
hello world
world
morning
gud
=cut
$ awk '/=cut/{found=0}found{print "";next}/=head/{found=1}1' infile
=head





=cut
This might work for you (GNU sed):
sed '/=head/,/=cut/{//!z}' file
Zap the lines between =head and =cut.

delete string for each line with sed

My file contains a number of lines. On each line I would like to remove the strings that come before and after the reference string, at the beginning and end of the line.
The reference string and the strings to remove are separated by spaces.
The file contains :
test.user.passs
test.user.location
global.user
test.user.tel
global.pass
test.user.email string_err
#ttt...> test.user.car ->
test.user.address
è_ 788 test.user.housse
test.user.child
{kl78>&é} global.email
global.foo
test.user.foo
How can I use sed to remove, on each line that contains the "test" string, whatever precedes it at the start of the line and whatever follows it at the end of the line, separated from it by a space or tab?
The desired result is :
test.user.passs
test.user.location
global.user
test.user.tel
global.pass
test.user.email
test.user.car
test.user.address
test.user.housse
test.user.child
{kl78>&é} global.email
global.foo
test.user.foo
I interpret your question as: find the first word that consists of "word characters and at least one dot".
Tcl:
echo '
set fh [open [lindex $argv 1] r]
while {[gets $fh line] != -1} {puts [regexp -inline {\w+(?:\.\w+)+} $line]}
' | tclsh - file
sed
sed -r 's/(.*[[:space:]])?([[:alpha:]]+(\.[[:alpha:]]+)+).*/\2/' file
perl
perl -nE '/(\w+(\.\w+)+)/ and say $1' file
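Using sed (this assumes the wanted string is the second of three space-separated fields, as in the "#ttt...> test.user.car ->" line):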
sed -r 's/^[^ ]+[ ]+([^ ]+)[ ]+[^ ]*/\1/' file
This might work for you (GNU sed):
sed -r 's/.*(test\S+).*/\1/' file

sed/awk/cut/grep - Best way to extract string

I have a results.txt file that is structured in this format:
Uncharted 3: Javithaxx l Rampant l Graveyard l Team Deathmatch HD (D1VpWBaxR8c)
Matt Darey feat. Kate Louise Smith - See The Sun (Toby Hedges Remix) (EQHdC_gGnA0)
The Matrix State (SXP06Oax70o)
Above & Beyond - Group Therapy Radio 014 (guest Lange) (2013-02-08) (8aOdRACuXiU)
I want to create a new file containing the YouTube ID found in the last pair of parentheses on each line, e.g. "8aOdRACuXiU".
I'm trying to build a URL like this in a new file:
http://www.youtube.com/watch?v=8aOdRACuXiU&hd=1
Note that I appended &hd=1 to the URL. I have tried using rev and cut, but rev munges my data. The hard part is that a line can contain several parenthesized entries and I only care about the text in the last pair of parentheses. Each line has a variable length, so fixed offsets don't help either. What about using grep and .$ for the end of the line?
In summary, I want to extract the youtube ID from results.txt and export it to a new file in the following format: http://www.youtube.com/watch?v=8aOdRACuXiU&hd=1
Using awk:
awk '{
v = substr( $NF, 2, length( $NF ) - 2 )
printf "%s%s%s\n", "http://www.youtube.com/watch?v=", v, "&hd=1"
}' infile
It yields:
http://www.youtube.com/watch?v=D1VpWBaxR8c&hd=1
http://www.youtube.com/watch?v=EQHdC_gGnA0&hd=1
http://www.youtube.com/watch?v=SXP06Oax70o&hd=1
http://www.youtube.com/watch?v=8aOdRACuXiU&hd=1
$ sed 's!.*(\(.*\))!http://www.youtube.com/watch?v=\1\&hd=1!' results.txt
http://www.youtube.com/watch?v=D1VpWBaxR8c&hd=1
http://www.youtube.com/watch?v=EQHdC_gGnA0&hd=1
http://www.youtube.com/watch?v=SXP06Oax70o&hd=1
http://www.youtube.com/watch?v=8aOdRACuXiU&hd=1
Here, .*(\(.*\)) looks for the last occurrence of a pair of parentheses, and captures the characters inside those parentheses. The captured group is then inserted into the URL using \1.
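The same capture can be wrapped in a small shell function for trying it on single lines. This is a sketch: the helper name yt_url is hypothetical, and the stricter pattern (no parentheses allowed inside the ID, optional trailing whitespace, anchored to the end of the line, -n with p so that lines without an ID print nothing) is a variation on the command above rather than the answer's own form.

```shell
# Hypothetical helper: turn one results.txt-style line into a watch URL.
# Prints nothing when the line does not end in a parenthesized ID.
yt_url() {
  printf '%s\n' "$1" |
    sed -n 's!.*(\([^()]*\))[[:space:]]*$!http://www.youtube.com/watch?v=\1\&hd=1!p'
}

url=$(yt_url 'Above & Beyond - Group Therapy Radio 014 (guest Lange) (2013-02-08) (8aOdRACuXiU)')
none=$(yt_url 'a line without any ID')
printf '%s\n' "$url"
```

The \& in the replacement is needed because a bare & would insert the whole matched line.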
Using a perl one-liner :
perl -lne 'printf "http://www.youtube.com/watch?v=%s&hd=1\n", $& if /[^\(]+(?=\)$)/' file.txt
Or multi-line version :
perl -lne '
printf(
"http://www.youtube.com/watch?v=%s&hd=1\n",
$&
) if /[^\(]+(?=\)$)/
' file.txt