I am trying to use sed to find a matching pattern in a file and then delete
only the next line.
Example:
Location
New York <---- delete
USA
Location
London <---- delete
UK
I tried sed '/Location/{n; d}', which works on Linux but does not work on Solaris.
Thanks.
As I mentioned in my answer to your other sed question, Solaris sed is old-school and needs more hand-holding; to put it another way, it is fussier about its syntax.
All you need is an additional ';' character placed after the 'd' command, i.e.
sed '/Location/{n; d;}'
More generally, anything that can be on a new line inside {...} needs a semicolon separator when it is rolled up onto a single line. However, you can't roll up the 'a', 'i', and 'c' commands onto a single line as you can on Linux.
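For comparison, here is a sketch of the same script written with one command per line, where the newlines do the job of the semicolons. The function name is just an illustrative wrapper, and like the rest of this answer it is untested on Solaris.

```shell
# Delete the line that follows each line matching "Location".
# Inside the braces every command sits on its own line, so no ';' is needed.
sed_delete_line_after_location() {
  sed '/Location/{
n
d
}' "$@"
}
```

Usage: sed_delete_line_after_location text, or pipe the data in on stdin.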
In standard Solaris sed, the 'a', 'i', and 'c' commands need a trailing '\' with NO spaces or tabs after it, followed by as much data as you like (probably within some size limit) on \n-terminated lines (NO \r characters), followed by a blank line.
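A minimal sketch of that backslash-newline form, using the 'a' (append) command. It assumes a POSIX-style sed; the appended text and the function name are made up for illustration, and I have not been able to confirm the exact Solaris behaviour.

```shell
# Append one line of text after every line matching "Location".
# The backslash must be the last character on its line, and the text to
# append starts at the very beginning of the following line.
append_note_after_location() {
  sed '/Location/a\
--- checked ---' "$@"
}
```

The 'i' and 'c' commands take their text the same way.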
Newer installations of Solaris may also have /usr/xpg4/bin/sed installed. Try
/usr/xpg4/bin/sed '/Location/{n; d}'
If you're lucky, it will support your shortcut syntax. I no longer have access to any Solaris machines to test this.
Finally, if that doesn't work, there are packages of GNU tools that can be installed, which include a sed that is much more like what you're used to from Linux. Ask your sysadmins whether the GNU tools are already there, or whether they can be installed. I'm not sure which version of GNU sed started to support the 'relaxed' syntax, so don't assume that it will be fixed without testing :-)
I hope this helps.
You can append the next line to the current one and then remove everything that is not Location:
$ cat text
Location
New York <---- delete
USA
Location
London <---- delete
UK
$ sed '/Location/{N;s/Location.*$/Location/;}' text
Location
USA
Location
UK
I do not have a Solaris machine here, so I would like to know if this works.
Does this AWK solution work for you?
[jaypal~/temp]$ cat a.txt
Location
New York <---- delete
USA
Location
London <---- delete
UK
Updated to preserve empty lines:
!NF is used to preserve blank lines. It means: if the number of fields is 0, just print the line. NF is a built-in variable that holds the number of fields in the current record. When we encounter a blank line, we print it, skip the rest of the processing, and go to the next line.
!/Location/ prints every line that does not match Location. Printing is the implicit action in AWK whenever a pattern without an action is true.
The third pattern/action pair handles lines that match the regex /Location/. It prints the matching line, then calls getline twice: the first call loads the line to be deleted, the second replaces it with the line after it, which is then printed.
[jaypal~/temp]$ awk '!NF{print;next}; !/Location/; /Location/{print;getline;getline;print}' INPUT_FILE
Location
USA
Location
UK
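An alternative sketch that avoids getline by using a flag instead. The variable name skip and the function name are my own, and this is a variation on the approach above rather than the same command.

```shell
# Print every line except the one immediately after a "Location" line.
awk_delete_line_after_location() {
  awk 'skip { skip = 0; next }   # drop the line right after a match
       /Location/ { skip = 1 }   # remember to drop the next line
       { print }' "$@"
}
```

Blank lines need no special case here, because a line is only dropped when the flag was set by the line before it.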
My file, x, is in the format \D{5}\d or \D{5}\d\d
example:
aahed9aalii5aargh9abaca9abaci9aback13
The \d part may be 1 or 2 digits; there are no spaces or line breaks in the entire document.
The goal is to create a .csv file separating each \D{5} from the \d{1} or \d{2} that follows it.
I tried Sublime Text, Perl, TextEdit, and Pages.
In Sublime I understand how to find the (\D{5}) group, but not how to replace it with the same five characters followed by a comma, i.e. (\D{5}),
I found the s(dog/cat) substitution example but could not get it to translate to Perl or Sublime.
Found the perl command line idea
(perl -pi.bak -e 's\/D{5}/D{5}\,/g' $filename) may not be exact
But could not decipher all the errors
The reason I chose regex for this is the only commonality to each value is the length of the word is the same throughout the document. There are no tabs, no parens, no spaces, no fixed length fields nothing to get my hooks in.
The question:
How do I retain the original values in the replace/substitution function?
I realize what this board has to deal with in regard to duplicate questions. Do you realize, on my side, how difficult it is to search through all the previous questions when I am not sure what I am looking for?
I am not looking for someone to give me a fish; I am looking for someone to teach me how to fish.
If regex is not the answer, maybe I am missing something. Any guidance would be appreciated.
Thanks
The $1, $2, etc variables may be used to refer back to "captures" (parenthesized parts) within the most recent regexp.
echo aahed9aalii5aargh9abaca9abaci9aback13 | perl -pe 's/(\D{5})(\d*)/$1=$2,/g'
Outputs:
aahed=9,aalii=5,aargh=9,abaca=9,abaci=9,aback=13,
I came across this command in a project I am working on:
sed -i '/regex/,$d' file
I don't understand how the ,$d part works. If I omit any part of ,$d I get errors. In my tests it looks like it replaces the matching line and anything after it with nothing. Example:
File with contents:
first line
second line regex
third line
fourth line
Comes out as after running that command:
first line
I couldn't find any documentation in the man page that explains this, though I could have easily missed it. The man page is hard for me to parse...
This example was tested with GNU sed 4.2.2.
This is not a replacement command; the sed substitute or replace command looks like s/from/to/.
The general form of a sed script is a sequence of commands - typically a single letter, but some of them take arguments, like the s command above - with an optional address expression before each. You are looking at a d (delete line) command preceded by the address expression /regex/,$
The address range specifies lines from the first regex match through to the end of the file ($ in this context specifies the last line) and the action d deletes the specified lines.
Although many people only ever encounter simple sed scripts which use just the s command, this behavior will be described in any basic introduction to sed, as well as in the man page.
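A few small sketches of address expressions in front of the d command, run against the four-line sample from the question. The function names are only illustrative wrappers.

```shell
# An address or address range before a command limits it to those lines.
# Single quotes keep the shell from expanding the $ before sed sees it.
delete_from_match_to_end() { sed '/regex/,$d' "$@"; }  # first match through last line
delete_lines_2_to_3()      { sed '2,3d' "$@"; }        # range given by line numbers
delete_last_line()         { sed '$d' "$@"; }          # single address: the last line
```

On the sample file the first leaves only "first line", the second leaves the first and fourth lines, and the third leaves everything except "fourth line".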
My Sample file:
As I Come Into Your Presence
Key: F
1 As I come into Your presence
Past the gates of praise
Into Your sanctuary
Till we are standing face to face
And look upon Your countenance
I see the fullness of Your glory
And I can only bow down and say
Chorus:
Your awesome in this place
Mighty God
You are awesome in this place
Abba Father
You are worthy of all praise
To You our lives we raise
You are awesome in this place
Mighty God
<--- Empty line here
<--- Empty line here
I wrote this perl one-liner to get <i></i> tags around the entire chorus block:
perl -p0e "s#Chorus:(.*?)\n\n#<i>Chorus:$1</i>#gsm" file
The result:
As I Come Into Your Presence
Key: F
1 As I come into Your presence
Past the gates of praise
Into Your sanctuary
Till we are standing face to face
And look upon Your countenance
I see the fullness of Your glory
And I can only bow down and say
<i>Chorus:</i>%
I can't get the desired result, where the </i> tag would be printed after the entire chorus, i.e. after the final "Mighty God".
Where is the error? How can I achieve this?
Your solution would work if you just put it in single quotes instead of double quotes. You should pretty much always use single quotes for one-liners from the shell, no matter what language/interpreter you're running, to keep shell interpolation from messing things up.
In your code:
perl -p0e "s#Chorus:(.*?)\n\n#<i>Chorus:$1</i>#gsm" file
The $1 is being expanded by the shell before it ever gets to Perl, so Perl sees this:
perl -p0e "s#Chorus:(.*?)\n\n#<i>Chorus:</i>#gsm" file
and happily deletes your chorus. If you use single quotes instead:
perl -p0e 's#Chorus:(.*?)\n\n#<i>Chorus:$1</i>#gsm' file
it will work as intended.
Note, however, that the -0 means any NUL characters that creep into the input will still cause Perl to split it into multiple records at that point. A more correct solution would be to use -0777 instead, which tells Perl that no value should split the input; it is treated as a single record no matter what data it contains.
perl -p0777e 's#Chorus:(.*?)\n\n#<i>Chorus:$1</i>#gsm' file
Alternatively, keep the double quotes and escape the $:
perl -p0777e "s#Chorus:(.*?)\n\n#<i>Chorus:\$1</i>#gsm" file
Also, as @Kenney mentioned in the comments:
Use single quotes on the commandline to wrap perl expressions otherwise the shell expansion will kick in.
I noticed something a bit odd while fooling around with sed. If you try to remove multiple line intervals (by number) from a file, but any interval specified later in the list is fully contained within an interval earlier in the list, then an additional single line is removed after the specified (larger) interval.
seq 10 > foo.txt
sed '2,7d;3,6d' foo.txt
1
9
10
This behaviour was behind an annoying bug for me, since in my script I generated the interval endpoints on the fly, and in some cases the intervals produced were redundant. I can clean this up, but I can't think of a good reason why sed would behave this way on purpose.
Since this question was highlighted as needing an answer in the Stack Overflow Weekly Newsletter email for 2015-02-24, I'm converting the comments above (which provide the answer) into a formal answer. Unattributed comments here were made by me in essentially equivalent form.
Thank you for a concise, complete question. The result is interesting. I can reproduce it with your script. Intriguingly, sed '3,6d;2,7d' foo.txt (with the delete operations in the reverse order) produces the expected answer with 8 included in the output. That makes it look like it might be a reportable bug in (GNU) sed, especially as BSD sed (on Mac OS X 10.10.2 Yosemite) works correctly with the operations in either order. I tested using 'sed (GNU sed) 4.2.2' from an Ubuntu 14.04 derivative.
More data points for you/them. Both of these include 8 in the output:
sed -e '/2/,/7/d' -e '/3/,/6/d' foo.txt
sed -e '2,7d' -e '/3/,/6/d' foo.txt
By contrast, this does not:
sed -e '/2/,/7/d' -e '3,6d' foo.txt
The latter surprised me (even accepting the basic bug).
Beats me. I thought given some of sed's arcane constructs that you might be missing the batman symbol or something from the middle of your command but sed -e '2,7d' -e '3,6d' foo.txt behaves the same way and swapping the order produces the expected results (GNU sed 4.2.2 on Cygwin). /bin/sed on Solaris always produces the expected result and interestingly so does GNU sed 3.02. (Ed Morton)
More data: it only seems to happen with sed 4.2.2 if the 2nd range is a subset of the first: sed '2,5d;2,5d' shows the bug, sed '2,5d;1,5d' and sed '2,5d;2,6d' do not. (glenn jackman)
The GNU sed home page says "Please send bug reports to bug-sed at gnu.org" (except it has an @ in place of ' at '). You've got a good reproduction; be explicit about the output you expect vs the output you get (they'll get the point, but it's best to make sure they can't misunderstand). Point out that the reverse ordering of the commands works as expected, and give the various other commands as examples of working or not working. (You could even give this Q&A URL as a cross-reference, but make sure that the bug report is self-contained so that it can be understood even if no-one follows the URL.)
You can also point to BSD sed (and the Solaris version, and the older GNU 3.02 sed) as behaving as expected. With the old version GNU sed working, it means this is arguably a regression. […After a little experimentation…] The breakage occurred in the 4.1 release; the 4.0.9 release is OK. (I also checked 4.1.5 and 4.2.1; both are broken.) That will help the maintainers if they want to find the trouble by looking at what changed.
The OP noted:
Thanks everyone for the comments and additional tests. I'll submit a bug report to GNU sed and post their response. (santayana)
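In the meantime, a script that generates the intervals on the fly could sidestep the issue by not relying on sed's range handling at all. The following is only a sketch of one such workaround in awk; the function name and the "start,end" argument format are my own invention.

```shell
# Delete every line whose number falls inside at least one "start,end" pair.
# $1 is a space-separated list of pairs; remaining arguments are input files
# (stdin if there are none). Overlapping or nested pairs are harmless here.
delete_line_ranges() {
  _ranges=$1
  shift
  awk -v ranges="$_ranges" '
    BEGIN {
      n = split(ranges, pair, " ")
      for (i = 1; i <= n; i++) {
        split(pair[i], ends, ",")
        lo[i] = ends[1] + 0   # force numeric comparison
        hi[i] = ends[2] + 0
      }
    }
    {
      for (i = 1; i <= n; i++)
        if (NR >= lo[i] && NR <= hi[i]) next
      print
    }' "$@"
}
```

With this, delete_line_ranges '2,7 3,6' foo.txt keeps lines 1, 8, 9 and 10 regardless of the order in which the pairs are listed.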
I want to delete the empty lines that follow lines containing a keyword, e.g. access modifiers such as private:, public:, or protected:, to conform to our coding standard. I need a command-line tool (Linux) for this, so please no Notepad++, Emacs, VS, or Vim solutions if they require user interaction. In other words, I want to do something like:
sed -i 's/private:\s*\n\s*\n/private:\n/g'
I've seen this question but was unable to extend it to my needs.
If I understand correctly, you want to remove empty lines which follow a line containing private:, public:, or protected:.
sed ':loop;/private:\|public:\|protected:/{n;/^$/d;Tloop}' inputfile
Explanation:
:loop create a label
/private:\|public:\|protected:/ will search for lines containing the pattern.
n;/^$/d will load the next line (n), check whether it is an empty line (/^$/), and if it is, delete the line (d).
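Tloop branches to label loop if no s substitution has succeeded since the last input line was read (T is a GNU extension). The script contains no s command, so the branch is always taken when it is reached, which is whenever the line was not empty and therefore not deleted; the newly loaded line is then checked against the pattern again.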
I am no sed guru, there might be more elegant ways to do this. There might also be more elegant ways to do this in awk, perl, python, whatever.
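As one such alternative, here is a sketch in awk that does not depend on GNU sed extensions. The function name is made up, and treating whitespace-only lines as empty is my assumption, not something the question asked for.

```shell
# Drop the first empty (or whitespace-only) line directly after a line
# containing private:, public: or protected:.
strip_blank_after_access_specifier() {
  awk 'skip && /^[ \t]*$/ { skip = 0; next }
       { skip = ($0 ~ /private:|public:|protected:/); print }' "$@"
}
```

Like the sed version, it removes only one empty line per keyword line. It writes to stdout, so redirect to a temporary file and move it into place to edit a file.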
perl -0777 -pi -e 's/([ \t\r]*)(private|protected|public):[ \t\r]*\n[ \t\r]*\n/$1$2:\n/g' file
should do the trick, while also taking trailing and leading whitespace into account, which I didn't specify in the question because it is not a hard requirement.