How to find only the first and last line of a file using sed - sed

I have an Apache log file called error_log and I want to see the first line and the last line of this file using the sed command. Could you please show me how to do that?
I know how to do that with head and tail commands, but I'm curious if it's possible in sed command too.
I have read man sed and googled a lot, but unfortunately found nothing.

This might work for you (GNU sed):
sed '1b;$b;d' file
All sed commands can be prefixed by either an address or a regexp. An address is either a line number or $, which represents the last line. If neither an address nor a regexp is present, the following command applies to every line.
The normal sed cycle presents each line of input (less its newline) in the pattern space. The sed commands are then applied and the final act of the cycle is to re-attach the newline and print the result.
The b command controls command flow; if by itself it jumps out of the following sed commands to the final act of the cycle i.e. where the newline is re-attached and the result printed.
The d command deletes the pattern space and since there is nothing to be printed no further processing is executed (including re-attaching the newline and printing the result).
Thus the solution above prints the first line and the last and deletes the rest.
Sed has some command line options, one of which, -n, turns off the implicit printing of the pattern space. The p command prints the current state of the pattern space. Thus the dual of the above solution is:
sed -n '1p;$p' file
N.B. If the input file is only one line the first solution will only print one line whereas the second solution will print the same line twice. Also if more than one file is input both solutions will print the first line of the first file and last line of the last file unless the -i option is in place, in which case each file will be amended. The -s option replicates this without amending each file but streams the results to stdout as if each file is treated separately.
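The edge cases in the note above can be checked with a small script. This is only a sketch: it assumes GNU sed (for -s and for the one-line 1b;$b;d form), and the scratch file names are made up for the demonstration.

```shell
# Scratch files for the demonstration; the names are hypothetical.
tmp=$(mktemp -d)
printf 'only\n'    > "$tmp/one"
printf 'a\nb\nc\n' > "$tmp/f1"
printf 'x\ny\nz\n' > "$tmp/f2"

# One-line input: the b/d form prints the line once, the p/p form twice.
one_b=$(sed '1b;$b;d' "$tmp/one")
one_p=$(sed -n '1p;$p' "$tmp/one")

# Two files: one continuous stream by default, separate streams with -s (GNU).
joined=$(sed -n '1p;$p' "$tmp/f1" "$tmp/f2")
separate=$(sed -s -n '1p;$p' "$tmp/f1" "$tmp/f2")

printf '%s\n' "one_b:" "$one_b" "one_p:" "$one_p" \
              "joined:" "$joined" "separate:" "$separate"
rm -r "$tmp"
```

With -s each file gets its own first and last line, which is the same per-file treatment that -i applies, minus the rewriting of the files.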

This will work:
sed -n '1p ; $p' error_log
1p will print the first line and $p will print the last line.
As a suggestion, take a look at info sed, not only man sed. You can find some examples relevant to your question in paragraph 2.1.

First line:
sed '2,$d' error_log
Last line:
sed '$!d' error_log

Based on your new requirement to output nothing if the input file is just 1 line (see How to find only the first and last line of a file using sed):
awk 'NR==1{first=$0} {last=$0} END{if (NR>1) print first ORS last}'
Original answer:
This is one of those things that you can, at face value, do easily enough in sed:
$ seq 3 7
3
4
5
6
7
$ seq 3 7 | sed -n '1p; $p'
3
7
but then how to handle edge cases like one line of input is non-obvious, e.g. is this REALLY the correct output:
$ printf 'foo\n' | sed -n '1p; $p'
foo
foo
or is the correct output just:
foo
and if the latter, how do you tweak that sed command to produce that output? @potong suggested a GNU sed command:
$ printf 'foo\n' | sed '1b;$b;d'
foo
which works but may be GNU-only (idk) and more importantly doesn't look much like the command we started with so the tiniest change in requirements meant a complete rewrite using different constructs.
Now, how about if you want to enhance it to, say, only print the first and last line if the file contained foo? I expect that'd be another challenging exercise with sed and probably involve non-portable constructs too.
It's just all pointless to learn how to do this with sed when you can use a different tool like awk and do whatever you like in a simple, consistent, portable syntax:
$ seq 3 7 |
awk 'NR==1{first=$0} {last=$0} END{print first ORS last}'
3
7
$ printf 'foo\n' |
awk 'NR==1{first=$0} {last=$0} END{print first ORS last}'
foo
foo
$ printf 'foo\n' |
awk 'NR==1{first=$0} {last=$0} END{print first (NR>1 ? ORS last : "")}'
foo
$ printf '3\nfoo\n7\n' |
awk 'NR==1{first=$0} /foo/{f=1} {last=$0} END{if (f) print first (NR>1 ? ORS last : "")}'
3
7
$ printf '3\nbar\n7\n' |
awk 'NR==1{first=$0} /foo/{f=1} {last=$0} END{if (f) print first (NR>1 ? ORS last : "")}'
$
Notice that:
Every command looks like every other command.
A minor change in requirements leads to a minor change in the code, not a complete rewrite.
Once you learn how to do any given thing A, how to do similar things B, C, D, etc. just builds on top of the syntax you already used, you don't have to learn a completely different syntax.
Each of those commands will work using any awk in any shell on every UNIX box.
Now, how about if you want to do that for multiple files such as would be created by the following commands?
$ seq 3 7 > file1
$ seq 12 25 > file2
With awk you can just store the lines in an array for printing in the END:
$ awk 'FNR==1{first[++cnt]=$0} {last[cnt]=$0}
END{for (i=1;i<=cnt;i++) print first[i] ORS last[i]}' file1 file2
3
7
12
25
or with GNU awk you can print them from ENDFILE:
$ awk 'FNR==1{first=$0} {last=$0} ENDFILE{print first ORS last}' file1 file2
3
7
12
25
With sed? An exercise left for the reader.

Related

sed: get a line number with regex and insert text at that line

I want to get the first line of a file that is not commented out with a hash, then add a line of text just before that line.
I managed to get the number of the line:
sed -n '/^\s*#/!{=;q}' file # prints 2
and also to insert text (specifying the line manually):
sed '2 a extralinecontent' file
I can't get them working together as a one liner or in a batch.
I tried command substitution (with $(command) and also with backticks) but I get an error from bash:
sed '$(sed -n '/^\s*#/!{=;q}' file) a extralinecontent' file
-bash: !{=: event not found
and also tried many other combinations, but no luck.
I'm using gnu-sed (via brew) on macOS.
This might work for you (GNU sed):
sed -e '/^\s*#/b;a extra line content' -e ':a;n;ba' file
Bail out of any lines beginning with a comment at the beginning of the file, append an extra line following the first line that is not a comment and keep fetching/printing all the remaining lines of the file.
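A minimal sketch of how that command behaves, assuming GNU sed (both \s and the one-line form of a are GNU extensions). The sample input is made up to mirror the question.

```shell
# Sample input modelled on the question: a comment line first, then content.
input=$(printf '#comment\nfoo baz good\n123 456 7889\n')

# /^\s*#/b          leading comment lines skip the rest of the script and are printed as-is
# a extra line ...  queues the text for output after the first non-comment line
# :a;n;ba           prints the current line, fetches the next one, and repeats to end of input
result=$(printf '%s\n' "$input" |
  sed -e '/^\s*#/b;a extra line content' -e ':a;n;ba')

printf '%s\n' "$result"
```

Because the n loop never returns to the top of the script, comment lines that appear later in the file are passed through without another append.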
Here's a way to do it with GNU sed without reading the file twice
$ cat ip.txt
#comment
foo baz good
123 456 7889
$ sed -e '0,/^\s*[^#[:space:]]/ {// a XYZ' -e '}' ip.txt
#comment
foo baz good
XYZ
123 456 7889
GNU sed allows first address to be 0 if the other address is regex, that way this will work even if first line matches the condition
/^\s*[^#[:space:]]/ as sed doesn't support possessive quantifier, need to ensure that the first character being matched by the character class isn't either a # or a whitespace character
// is a handy shortcut to repeat the last regex
a XYZ your required line to be appended (note that your question mentions insert, so if you want that, use i instead of a)
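The two-step approach from the question can also be made to work. The attempt there failed because the outer sed script was in single quotes, so the shell never ran the command substitution. The sketch below assumes GNU sed for the one-line a form; the scratch file is hypothetical and reuses the ip.txt sample.

```shell
# Scratch copy of the sample file; the path is hypothetical.
tmp=$(mktemp -d)
printf '#comment\nfoo baz good\n123 456 7889\n' > "$tmp/ip.txt"

# Step 1: line number of the first line that is not a comment.
n=$(sed -n '/^[[:space:]]*#/!{=;q;}' "$tmp/ip.txt")

# Step 2: double quotes let the shell expand $n before sed parses the script.
result=$(sed "${n}a extralinecontent" "$tmp/ip.txt")

printf '%s\n' "$result"
rm -r "$tmp"
```

This reads the file twice, and if every line is a comment then n is empty and the a command would apply to all lines, so a real script should check n before using it.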

Sed inside a while read loop

I have been reading a lot of questions and answers about using sed within a while loop. I think I have the command down correctly, but I seem to get no output once I put all of the pieces together. Can someone tell me what I am missing?
I have an input file with 700 variables, one on each line. I need to use each of these 700 variables within a sed command. I run the following command to verify variables are outputting correctly:
cat Input_File.txt | while read var; do echo $var; done
I then try to add in the sed command as follows:
cat Input_File.txt | while read var; do sed -n "/$var/,+10p" Multi-BLAST_5814.txt >> Multi_BLAST_Subset; done
This command leaves me without an error, but a blinking cursor as if this is an infinite loop. It should use each of the 700 variables, find the corresponding line in Multi_BLAST_5814.txt and output the search variable line and the 10 lines after the search term into a new file, appending each as it goes. I can execute the sed command alone with a manually set single value variable successfully and I can execute the while loop successfully using the input file. Anyone have a thought as to why this is not working?
User, that is exactly what I have done to this point.
I have a large text file (128 MB) with BLAST output. I need to search through this for a subset of results for 769 samples (Out of the 5814 samples that are in the file).
I have created a .txt file with those 769 sample names.
To test grep and sed, I manually assigned a variable with one of the 769 samples names I need to search and can get the results I need as follows:
$ Otu="S41_Folmer_Otu96;size=12;"
$ grep $Otu -A 10 Multi_BLAST_5814.txt
OR
$ sed -n "/$Otu/,+10p" Multi_BLAST_5814.txt
The OUTPUT is exactly what I want as follows:
Query= S41_Folmer_Otu96;size=12;
Length=101
Sequences producing significant alignments: Score(Bits) E Value
gi|58397553|gb|AY830431.1| Scopelocheirus schellenbergi clone... 180 1E-41
gi|306447543|gb|HQ018876.1| Liposcelis paeta isolate CZ cytoc... 174 6E-40
gi|306447533|gb|HQ018871.1| Liposcelis decolor isolate CQ cyt... 104 9E-19
gi|1043259532|gb|KX130860.1| Batocera rufomaculata isolate Br... 99 4E-17
gi|987210821|gb|KR141076.1| Psocoptera sp. BOLD:ACO1391 vouch... 81 1E-11
To Test to make sure the input file contains the correct variables I run the following:
$ cat Input_File.txt
$ while read Otu; do echo $Otu; done <Input_File.txt
S41_Folmer_Otu96;size=12;
S78_Folmer_Otu15;size=538;
S73_Leray_Otu52;size=6;
S66_Leray_Otu93;size=6;
S10_Folmer_Otu10;size=1612;
... All 769 variables
Again, this is exactly what I expect and is correct.
But when I run either of the following commands, nothing is printed to the screen (if I leave off the write file/append action) or to the file I need to create.
$ cat Input_File.txt | while read Otu; do grep "$Otu" -A 10 Multi_BLAST_5814.txt >> Multi_BLAST_Subset.txt; done
$ cat Input_File.txt | while read Otu; do sed -n "/$Otu/,+10p" Multi_BLAST_5814.txt >> Multi_BLAST_Subset.txt; done
Sed hangs and never closes, leaving me at a blinking cursor. Grep finishes but also gives no output. I am at a loss as to why this is not working. Everything works individually, so I may be left with manually searching all 769 samples copy/paste.
If you have access to GNU grep no need for a sed command, grep "$var" -A 10 will do the same thing and won't break if $var contains the delimiter used in your sed command.
From man grep :
-A NUM, --after-context=NUM
Print NUM lines of trailing context after matching lines.
Places a line containing a group separator (--) between
contiguous groups of matches. With the -o or --only-matching
option, this has no effect and a warning is given.
Not sure whether you have already attempted it but try breaking the problem into smaller chunks. Simple example below :
$ cat Input_File.txt
one
two
three
$
$ cat file.txt
This is line one
This is line two
This is line three
This is another four
This is another five
This is another six
This is another seven
$
$ cat Input_File.txt | while read var ; do echo $var ; sed -n "/$var/,+1p" file.txt ; done
one
This is line one
This is line two
two
This is line two
This is line three
three
This is line three
This is another four
$
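A more defensive version of the loop might look like the sketch below. It is not a diagnosis of the original problem. It assumes, without confirmation from the question, that the name list could carry Windows line endings or blank lines, and it uses grep -F so that characters such as ; and = in the sample names are matched literally. The scratch file contents are made up.

```shell
# Scratch files; the CRLF line endings in the name list are an assumption.
tmp=$(mktemp -d)
printf 'one\r\n\r\nthree\r\n' > "$tmp/Input_File.txt"
printf '%s\n' 'This is line one' 'This is line two' 'This is line three' \
              'This is another four' 'This is another five' > "$tmp/file.txt"

subset=$(
  tr -d '\r' < "$tmp/Input_File.txt" |      # strip carriage returns, if any
  while IFS= read -r var; do
    [ -n "$var" ] || continue               # an empty pattern would match every line
    grep -F -A 1 -- "$var" "$tmp/file.txt"  # fixed-string match plus 1 line of context
  done
)

printf '%s\n' "$subset"
rm -r "$tmp"
```

For the real data, -A 1 would become -A 10 and the scratch files would be replaced by Input_File.txt and Multi_BLAST_5814.txt.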

Sed Process Substitution on Insert - Without Backslashes

I have a function that prints a header that needs to be applied across several files, but if I utilize a sed process substitution the lines prior to the last have a backslash \ on them.
E.g.
function print_header() {
cat << EOF
-------------------------------------------------------------------
$(date '+%B %d, %Y # ~ %r') ID:$(echo $RANDOM)
EOF
}
If I then take a file such as test.txt:
line 1
line 2
line 3
line 4
line 5
sed "1 i $(print_header | sed 's/$/\\/g')" test.txt
I get:
-------------------------------------------------------------------\
November 24, 2015 # ~ 11:18:28 AM ID:13187
line 1
line 2
line 3
line 4
line 5
Notice the troublesome backslash at the end of the first line, I'd like to not have that backslash appear. Any ideas?
I would use cat for that:
cat <(print_header) file > file_with_header
This behavior depends on the sed dialect. Unfortunately, it's one of the things which depends on which version you have.
To simplify debugging, try specifying verbatim text. Here's one from a Debian system.
vnix$ sed '1i\
> foo\
> bar' <<':'
> hello
> goodbye
> :
foo
bar
hello
goodbye
Your diagnostics appear to indicate that your sed dialect does not in fact require the backslash after the first i.
Since you are generating the contents of the header programmatically anyway, my recommended solution would be to refactor the code so that you can avoid this conundrum. If you don't want cat <<EOF test.txt then maybe experiment with sed '1r/dev/stdin' <<EOF test.txt (I could not get 1r- to work, but /dev/stdin should be portable to any Linux.)
Here is my kludgy fix, if you can find something more elegant I'll gladly credit you:
sed "1 i $(print_header | sed 's/$/\\/g;$s/$/\x01/')" test.txt | tr -d '\001'
This puts an unprintable SOH (\x01) ascii Start Of Header character after the inserted text, that precludes the backslashes and then I run it over tr to delete the SOH chars.
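For comparison, a sketch of the cat-based approach that avoids sed's i command, and with it the backslash question, entirely. It uses cat - to read the header from a pipe, so it does not need process substitution. The header function is a simplified stand-in for the one in the question, and the scratch file is hypothetical.

```shell
# Simplified stand-in for the question's print_header function.
print_header() {
  cat << EOF
-------------------------------------------------------------------
$(date '+%B %d, %Y # ~ %r')
EOF
}

# Scratch copy of test.txt; the path is hypothetical.
tmp=$(mktemp -d)
printf 'line %s\n' 1 2 3 > "$tmp/test.txt"

# "-" makes cat read the header from stdin before the file.
result=$(print_header | cat - "$tmp/test.txt")

printf '%s\n' "$result"
rm -r "$tmp"
```

To update the file itself, the output would go to a temporary file that is then moved over the original.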

In-place replacement

I have a CSV. I want to edit the 35th field of the CSV and write the change back to the 35th field. This is what I am doing on bash:
awk -F "," '{print $35}' test.csv | sed -i 's/^0/+91/g'
So, I am pulling the 35th entry using awk and then replacing the "0" in the starting position in the string with "+91". This one works perfectly and I get the desired output on the console.
Now I want this new entry to get written in the file. I am thinking of sed's "in-place" replacement feature, but this feature needs an input file. In the above command, I cannot provide an input file because my primary command is awk and sed is taking its input from awk.
Thanks.
You should choose one of the two tools. As for sed, it can be done as follows:
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/' test.csv
Not sure about awk, but @shellter's comment might help with that.
The in-place feature of sed is misnamed, as it does not edit the file in place. Instead, it creates a new file with the same name. eg:
$ echo foo > foo
$ ln -f foo bar
$ ls -i foo bar # These are the same file
797325 bar 797325 foo
$ echo new-text > foo # Changes bar
$ cat bar
new-text
$ printf '/new/s//newer\nw\nq\n' | ed foo # Edit foo "in-place"; changes bar
9
newer-text
11
$ cat bar
newer-text
$ ls -i foo bar # Still the same file
797325 bar 797325 foo
$ sed -i s/new/newer/ foo # Does not edit in-place; creates a new file
$ ls -i foo bar
797325 bar 792722 foo
Since sed is not actually editing the file in place, but writing a new file and then renaming it to the old file, you might as well do the same.
awk ... test.csv | sed ... > test.csv.1 && mv test.csv.1 test.csv
There is the misperception that using sed -i somehow avoids the creation of the temporary file. It does not. It just hides the fact from you. Sometimes abstraction is a good thing, but other times it is unnecessary obfuscation. In the case of sed -i, it is the latter. The shell is really good at file manipulation. Use it as intended. If you do need to edit a file in place, don't use the streaming version of ed; just use ed.
So, it turned out there are numerous ways to do it. I got it working with sed as below:
sed -i 's/0\([0-9]\{10\}\)/\+91\1/g' test.csv
But this is a little tricky as it will edit any entry which matches the criteria. However, in my case it is working fine.
Similar implementation of above logic in perl:
perl -p -i -e 's/\b0(\d{10})\b/\+91$1/g;' test.csv
Again, same caveat as mentioned above.
More precise way of doing it as shown by Lev Levitsky because it will operate specifically on the 35th field
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/g' test.csv
For more complex situations, I will have to consider using any of the csv modules of perl.
Thanks everyone for your time and input. I surely know more about sed/awk after reading your replies.
This might work for you:
sed -i 's/[^,]*/+91/35' test.csv
EDIT:
To replace the leading zero in the 35th field:
sed 'h;s/[^,]*/\n&/35;/\n0/!{x;b};s//+91/' test.csv
or more simply:
sed 's/^\(\([^,]*,\)\{34\}\)0/\1+91/' test.csv
If you have moreutils installed, you can simply use the sponge tool:
awk -F "," '{print $35}' test.csv | sed 's/^0/+91/g' | sponge test.csv
sponge soaks up the input, closes the input pipe (stdin) and, only then, opens and writes to the test.csv file.
As of 2015, moreutils is available in package repositories of several major Linux distributions, such as Arch Linux, Debian and Ubuntu.
Another perl solution to edit the 35th field in-place:
perl -i -F, -lane '$F[34] =~ s/^0/+91/; print join ",",@F' test.csv
These command-line options are used:
-i edit the file in-place
-n loop around every line of the input file
-l removes newlines before processing, and adds them back in afterwards
-a autosplit mode: split input lines into the @F array. Defaults to splitting on whitespace.
-e execute the perl code
-F autosplit modifier, in this case splits on ,
@F is the array of words in each line, indexed starting with 0
$F[34] is the 35th element of the array
s/^0/+91/ does the substitution

the 'd' command in the sed utility

From the sed documentation:
d Delete the pattern space; immediately start next cycle.
What does it mean by next cycle? My understanding is that sed will not apply the following commands after the d command and it starts to read the next line from the input stream and processes it. But it seems that this is not true. See this example:
[root@localhost ~]# cat -A test.txt
aaaaaaaaaaaaaa$
$
bbbbbbbbbbbbb$
$
$
ccccccccc$
ddd$
$
eeeeeee$
[root@localhost ~]# cat test.txt | sed '/^$/d;p;p'
aaaaaaaaaaaaaa
aaaaaaaaaaaaaa
aaaaaaaaaaaaaa
bbbbbbbbbbbbb
bbbbbbbbbbbbb
bbbbbbbbbbbbb
ccccccccc
ccccccccc
ccccccccc
ddd
ddd
ddd
eeeeeee
eeeeeee
eeeeeee
[root@localhost ~]#
If it immediately started the next cycle, the p commands would not produce any output.
Can anyone help explain this, please? Thanks.
It means that sed will read the next line and start processing it.
Your test script doesn't do what you think. It matches the empty lines and applies the delete command to them. They don't appear, so the print statements don't get applied to the empty lines. The two print commands aren't connected to the pattern for the delete command, so the non-empty lines are printed three times. If you instead try
sed '/./d;p;p' test.txt # matches all non-empty lines
nothing will be printed other than the blank lines, three times each.
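A compact way to see that d really does skip the commands after it is to put a p on each side of it. This is a small sketch with made-up two-line input:

```shell
# For "a": the first p prints it, /a/d ends the cycle, so the second p never runs.
# For "b": d does not apply, so both p commands run.
out=$(printf 'a\nb\n' | sed -n 'p;/a/d;p')
printf '%s\n' "$out"
```

The line matching /a/ appears once and the other line twice, which shows that only the matched line had its cycle cut short.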
a) You can combine multiple commands for one pattern with curly braces:
sed '/^$/{d;p;p}' test.txt
aaaaaaaaaaaaaa
bbbbbbbbbbbbb
ccccccccc
ddd
eeeeeee
In '/^$/d;p;p' the d command is applied only to empty lines; every other line is printed two additional times. To bind the p commands to the pattern, you have to use curly braces. Then the p commands are skipped as well, but because d skips to the next cycle, not because the pattern doesn't match.
b) Useless use of cat. (already shown)