Using perl to add lines containing only one string to previous line - perl

I have a file of names of this kind:
Josée Maria
Sanchez
Marco Robert
Figaro
Max Theodor Schmitz
John Smith
Maria Catharina Luise
Armino
and I want to add each line containing one single string (only one name) to the previous line:
Josée Maria Sanchez
Marco Robert Figaro
Max Theodor Schmitz
John Smith
Maria Catharina Luise Armino
Is there a good one-liner using perl?

perl -0777 -pe 's/\n(\S+)$/ $1/mg'
Untested, but the idea is to slurp the whole file into a single string (-0777), then replace with a space every newline that precedes a single-word line.

Related

sed: replace only in part of string

I have a simple playlist of song files:
1003 James Brown - The Boss Unknown Artist.mp3
1004 James Brown - Slaughters Theme Unknown Artist.mp3
1005 James Brown - Payback(1) Unknown Artist.mp3
...
I would like them in the following format:
1003 James_Brown_-_The_Boss_Unknown_Artist.mp3
1004 James_Brown_-_Slaughters_Theme_Unknown_Artist.mp3
...
Notice that the whitespace behind the number in front is NOT replaced. I have the following simple sed script:
sed "s/ /_/g"
but that also replaces the space after the number. I know how to form capture groups, but that does not help here either. How can I convince sed to only apply the replacement to a portion of the input string, rather than the whole string?
You could do
sed 's/ /_/g; s/_/ /'
I.e. first turn all spaces into underscores, then turn the first underscore back into a space.
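A small sketch of the two-step substitution on one playlist line from the question. The variable names are illustrative, and the approach assumes the text before the first space (the number) contains no underscore of its own.

```shell
# One sample line from the question.
playlist_line='1003 James Brown - The Boss Unknown Artist.mp3'
# Step 1: every space becomes "_"; step 2: only the first "_" goes back to a space.
renamed=$(printf '%s\n' "$playlist_line" | sed 's/ /_/g; s/_/ /')
printf '%s\n' "$renamed"
```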

Finding all the names with a given surname in a file using sed

Given the input file below, I want to output all the names with the surname Smith. I have done it using grep:
grep -o -w "[A-Za-z]* Smith" filename
But I want to know how it can be done using sed, I have tried
sed -e 's/[A-Za-z]* Smith/&/g' filename
but it is printing the entire line.
Input file text:
John Smith Kent Smith Adam Smith
Adam Jones Devlin Thomas Bill Kate
Mark Taylor Dean Bush Kane King Nicole Smith
John Williams Adam Cole
James Brown Jason Taylor Mark Rose
Rache Davies Christian Williams
Chris Evans Steve Williams Craig Thomas Jack Smith
Jonna Wilson Jack Jones Jason Patt
Chris Thomas Connor Smith
Kat Watson Kat Smith Julia Roberts Greg Smith Bill Smith
Michael Johnson
Try this:
sed '/[A-Za-z]* Smith/!d;s//\n&\n/;s/^[^\n]*\n//;P;D' file
Explanations:
/[A-Za-z]* Smith/!d: deletes every line that does not contain a run of letters followed by " Smith"
s//\n&\n/: the empty pattern reuses the last regex, so this puts a newline (\n) before and after the first match
s/^[^\n]*\n//: removes everything up to and including the first newline, i.e. the unwanted text before the match
The pattern space now starts with the matched name followed by a newline; the multiline commands P and D then act as a loop over the remaining matches:
P prints the first part of the pattern space, up to the first newline
D deletes that same first part and restarts the script on what is left, without reading a new input line
For more about P and D that are part of the multiline commands, please read http://www.grymoire.com/Unix/Sed.html.
But grep is definitely more suited for this job.
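To see the P/D loop in action, here is a minimal sketch that runs the same sed script on three lines of the sample input. It assumes GNU sed (the \n in the replacement and inside the bracket expression are not portable to every sed); the variable name smiths is illustrative.

```shell
# Three sample lines from the question; assumes GNU sed.
smiths=$(printf '%s\n' \
  'John Smith Kent Smith Adam Smith' \
  'Adam Jones Devlin Thomas Bill Kate' \
  'Mark Taylor Dean Bush Kane King Nicole Smith' |
  sed '/[A-Za-z]* Smith/!d;s//\n&\n/;s/^[^\n]*\n//;P;D')
# Each match is printed by P on its own line; D re-runs the script on the rest.
printf '%s\n' "$smiths"
```

The first line yields three names because D restarts the script on the remainder of the same line each time, and the second line is dropped by the initial !d.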
Actually, this is not as easy as I thought it was, since grep -o shows each match on a separate line. If you want to use sed twice, you can say:
sed -e 's/[A-Za-z]* Smith/\n&\n/g' names | sed '/Smith/!d'
you didn't ask but...
awk to the rescue!
$ awk -v sur="Smith" '{for(i=2;i<=NF;i++) if($i==sur) print $(i-1),sur}' file
John Smith
Kent Smith
Adam Smith
Nicole Smith
Jack Smith
Connor Smith
Kat Smith
Greg Smith
Bill Smith

SED remove the spaces in numbers only not in string

sample string
There are 1 123 456 drops of water
Is there a way to take out the space used as a thousands separator with sed?
resulting in
There are 1123456 drops of water
Finding the pattern was not difficult,
but I cannot work out how to remove the space:
sed s/[0-9]' '[0-9]/ ??? /
Thank you in advance.
sed 's/\([0-9]\) \([0-9]\)/\1\2/g'
This should work too -
perl -pe 's/(?<=[0-9])(\s)(?=[0-9])//g'
We use a positive lookbehind and a positive lookahead, each requiring a digit. If a whitespace character sits between two digits, we replace it with nothing.
[jaypal:~] echo "There are 1 123 456 drops of water" | perl -pe 's/(?<=[0-9])(\s)(?=[0-9])//g'
There are 1123456 drops of water

How to delete lines matching a certain pattern in Perl?

I'd like to do something similar to sed in Perl, namely be able to delete lines matching a certain pattern.
Given this input:
abcd
edfd
abcd
derder
abcd
erre
I want to remove the lines containing bc. How can I do this?
I had to use double quotes on Windows:
perl -ne "print unless /bc/" file
This is a FAQ; see perlfaq5:
How do I change, delete, or insert a line in a file, or append to the beginning of a file?
If you're programming in Perl then it's well worth taking a couple of hours to familiarise yourself with the FAQ.

left outer join by comparing 2 files

I have 2 files as shown below:
success.txt
amar
akbar
anthony
john
jill
tom
fail.txt
anthony
tom
I want to remove from success.txt the records that match those in fail.txt.
Expected output:
amar
akbar
john
jill
I'd use fgrep - if available - as you're using fixed strings it should be more efficient.
fgrep -v -x -f fail.txt success.txt
You need the -x option to ensure only whole lines are matched, otherwise fails like tom will match successes like tomas.
awk one-liner: also keep the original order
awk 'NR==FNR{a[$0]=1;next;}!($0 in a)' fail.txt success.txt
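Both approaches can be tried on the sample data with a short sketch like the following. It writes the two files into a temporary directory (the workdir variable and the array name seen are illustrative) and uses grep -F, the current spelling of fgrep.

```shell
# Recreate the two sample files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' amar akbar anthony john jill tom > "$workdir/success.txt"
printf '%s\n' anthony tom > "$workdir/fail.txt"

# Fixed-string, whole-line, inverted match: keep lines not listed in fail.txt.
remaining_grep=$(grep -F -v -x -f "$workdir/fail.txt" "$workdir/success.txt")

# awk: while reading the first file (NR==FNR) remember each line,
# then print lines of the second file that were not remembered.
remaining_awk=$(awk 'NR==FNR { seen[$0] = 1; next } !($0 in seen)' \
  "$workdir/fail.txt" "$workdir/success.txt")

printf '%s\n' "$remaining_grep"
rm -rf "$workdir"
```

Both commands keep the original order of success.txt, since neither one sorts its input.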
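There is a POSIX-standard join(1) program on all modern Unix systems, see man join. It expects both inputs to be sorted on the join field, so sort them first (the <(...) process substitution below assumes bash or a similar shell); the output then comes out in sorted order rather than the original order.
$ join -v 1 <(sort success.txt) <(sort fail.txt)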