How to get block printed out with sed - sed

The problem
I have the following input file:
test.txt
This is a text2
This is a text3
This is a text4
This is a text5
START
This is a text1
END
This is a text6
This is a text7
This is a text8
This is a text9
I'd like to print the text between the START and END block with sed to
practice with it a little bit but I'm so confused how to achieve that.
What I've tried
I tried the following commands:
cat $(test.txt) | sed -n -e '{/START/=},{/END/=}p'
I hoped here that the {/START/=} and {/END/=} blocks return the line numbers
where the START and END block are.
cat $(test.txt) | sed -n -e '$(sed -n -e "/START/="),$(sed -n -e "/END/=")p'
I tried to get the line numbers of the START and END blocks by embedding
them into $(...).
I'm running out of ideas.
What would you recommend using for this?
Anyways, I'm also wondering if using sed is the best tool for this task. Do you
have any suggestions for other tools which suit more for this task? I'd like to
see an example as well, thanks!

sed -ne '/START/,/END/p' test.txt
This is the typical solution. It applies the p command to every line from the first line that matches START through the next line that matches END, inclusive.
Excluding the start and end of the range is not very clean in sed. One approach is to explicitly match them:
sed -ne '/START/,/END/{/START/d; /END/d; p;}' test.txt
but for this particular use case it's probably cleaner to include those lines in the range of lines that are explicitly deleted:
sed -e '1,/START/d' -e '/END/,$d' test.txt
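Since the question also asks about other tools, here is a minimal awk sketch of the same exclusive range. The sample input and the variable names (input, between) are made up for illustration, and f is just a flag variable:

```shell
# Hypothetical sample input mirroring part of test.txt from the question.
input=$(printf '%s\n' 'This is a text5' 'START' 'This is a text1' 'END' 'This is a text6')

# f is a flag: cleared when END is seen, set when START is seen.
# The bare "f" rule prints the current line while the flag is set.
# Clearing before that rule and setting after it keeps both marker
# lines out of the output.
between=$(printf '%s\n' "$input" | awk '/END/{f=0} f; /START/{f=1}')

printf '%s\n' "$between"
```

Moving the /START/ rule in front of the bare f rule, and the /END/ rule behind it, should make the range inclusive instead.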

Related

Combining sed commands in bash

I am aiming to try and combine the two following sed commands to print out one output. The first command is used to strip the HTML file of its HTML tags and the second is to specify I only want lines 11 through to 16 of the file.
sed -e 's/<[^>]*.//g' file.html
sed -n '11,16p' file.html
I have been playing around with this for a while now and can only ever seem to get the output of lines 11-16 with the HTML tags, or all lines without the HTML, when I am aiming to display the output of lines 11-16 without any HTML tags. Any help would be greatly appreciated, thanks!
The naive way would be to use a pipe:
sed 's/<[^>]*.//g' file.html | sed -n '11,16p'
You may also combine the address and the pattern:
sed -n '11,16 s/<[^>]*.//pg' file.html
Here,
-n will suppress the default line output
11,16 - will set the address, Lines 11 through 16
s/<[^>]*.// - will look for <, then zero or more chars other than > and then any one char (did you mean a >?)
p - print the result of the substitution
g - all occurrences on the line
An online demo (shortened version, Lines 2-4):
#!/bin/bash
s="<111111>aaa<111111>
<22222>bbb<111111>
<33333>ccc<111111>
<44444>ddd<111111>
<55555>eee<111111>"
sed -n '2,4 s/<[^>]*.//pg' <<< "$s"
Output:
bbb
ccc
ddd
If GNU-compatible,
sed -n '11,16{ s/<[^>]*.//g; p; }; 17q;' file.html
The range will take a block, allowing both commands to be done sequentially to each line.
The 17q; just keeps it from wasting time on lines you already know you don't need.
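As a small sketch of the block form, assuming the closing > was indeed what the pattern intended (the answer above raises that question), the following runs on made-up sample HTML with a shortened range of lines 2-3. The variable names are hypothetical:

```shell
# Hypothetical four-line HTML sample.
html=$(printf '%s\n' '<p>one</p>' '<p>two <b>bold</b></p>' '<p>three</p>' '<p>four</p>')

# For lines 2-3: strip every <...> tag, then print the line.
# The second script quits at line 3 so later lines are never processed.
stripped=$(printf '%s\n' "$html" | sed -n -e '2,3{s/<[^>]*>//g;p;}' -e '3q')

printf '%s\n' "$stripped"
```

Using an explicit p inside the block prints every line in the range, whereas the p flag on s prints only lines where a substitution actually happened.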

Parse file and insert new line after each occurrence

On a Unix system I am trying to add a new line in a file using sed or perl but it seems I am missing something.
Supposing my file has multiple lines of texts, always ending like this {TNG:}}${1:F01.
I am trying to find a way to add a new line after the }$, so that {1 always starts on a new line.
I tried it by escaping $ sign using this:
perl -e '$/ = "\${"; while (<>) { s/\$}\{$/}\n{/; print; }' but it does not work.
Any ideas will be appreciated.
give this a try:
sed 's/{TNG:}}\$/&\n/' file > newfile
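sed uses BRE by default, so the { and } are literal characters, but the $ must be escaped. Note that \n in the replacement is understood by GNU sed; other implementations may not treat it as a newline.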
kent$ cat f
{TNG:}}${1:F01.
kent$ sed 's/{TNG:}}\$/&\n/' f
{TNG:}}$
{1:F01.
With perl:
$ cat input.txt
line 1 {TNG:}}${1:F01
line 2 {TNG:}}${1:F01
$ perl -pe 's/TNG:\}\}\$\K/\n/' input.txt
line 1 {TNG:}}$
{1:F01
line 2 {TNG:}}$
{1:F01
(Read up on the -p and -n options in perlrun and use them instead of trying to do what they do in a one-liner yourself)
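If neither GNU sed nor perl is available, the same split can probably be done with awk. This is only a sketch: the sample record comes from the question, the variable names are made up, and the bracket expressions are used just to avoid backslash-escaping }, $ and {:

```shell
# Sample record taken from the question.
record='{TNG:}}${1:F01.'

# Replace every "}${" with "}$", a newline, and "{".
# Inside an awk string, \n is a newline and $ has no special meaning.
split=$(printf '%s\n' "$record" | awk '{ gsub(/[}][$][{]/, "}$\n{"); print }')

printf '%s\n' "$split"
```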

Simple method for finding and replacing string linux

I'm currently trying to find a line in a file
#define IMAX 8000
and replacing 8000 with another number.
Currently, stuck trying to pipe arguments from awk into sed.
grep '#define IMAX' 1d_Euler_mpi_test.c | awk '{print $3}' | sed
Not too sure how to proceed from here.
I would do something like this (the empty '' after -i is the BSD/macOS form; with GNU sed, use -i on its own):
sed -i '' '/^#define IMAX 8000$/s/8000/NEW_NUMBER/' 1d_Euler_mpi_test.c
Could you please try the following? Put the new number's value in place of new_number. (Tested with GNU sed.)
echo "#define IMAX 8000" | sed -E '/#define IMAX /s/[0-9]+$/new_number/'
If you are reading input from a file named Input_file, use the following:
sed -E '/#define IMAX /s/[0-9]+$/new_number/' Input_file
Add the -i flag to the command above if you want to save the output back into Input_file itself. Also note that this matches any digits at the end of a line that contains the string #define IMAX, so if you only want to match 8000 or some other fixed number, change [0-9]+$ to 8000 (or that number) in the commands above.
You may use GNU sed.
sed -i -e 's/IMAX 8000/IMAX 9000/g' /tmp/file.txt
This invokes sed to do an in-place edit because of the -i option. It can be called from bash.
If you really really want to use just bash, then the following can work:
while IFS= read -r a ; do printf '%s\n' "${a//IMAX 8000/IMAX 9000}" ; done < /tmp/file.txt > /tmp/file.txt.t ; mv /tmp/file.txt{.t,}
This loops over each line, doing a substitution, and writing to a temporary file (don't want to clobber the input). The move at the end just moves temporary to the original name.
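To pass the new number in from a shell variable without depending on either flavor of -i, one option is a small wrapper that writes to a temporary file. The function name set_imax and the demo file are hypothetical, and the sketch assumes the new value contains no / or & characters:

```shell
# set_imax FILE NEW_VALUE  (hypothetical helper, not from the answers above)
set_imax() {
    file=$1
    new=$2
    tmp=$(mktemp) || return 1
    # \$ keeps the end-of-line anchor literal inside double quotes, while
    # $new is expanded by the shell before sed sees the script.
    sed "/^#define IMAX /s/[0-9][0-9]*\$/$new/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demo on a throwaway file rather than the real source file.
demo=$(mktemp)
printf '%s\n' '#include <stdio.h>' '#define IMAX 8000' > "$demo"
set_imax "$demo" 9000
result=$(cat "$demo")
rm -f "$demo"
printf '%s\n' "$result"
```

Copying the temporary file's contents back with cat, rather than using mv, leaves the original file's permissions untouched.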

sed + remove "#" and empty lines with one sed command

How do I remove comment lines (such as # bla bla) and empty lines (lines without characters) from a file with one sed command?
THX
lidia
If you're worried about starting two sed processes in a pipeline for performance reasons, you probably shouldn't be, it's still very efficient. But based on your comment that you want to do in-place editing, you can still do that with distinct commands (sed commands rather than invocations of sed itself).
You can either use multiple -e arguments or separate commands with a semicolon, something like (just one of these, not both):
sed -i -e 's/#.*$//' -e '/^$/d' fileName
sed -i 's/#.*$//;/^$/d' fileName
The following transcript shows this in action:
pax> printf 'Line # with a comment\n\n# Line with only a comment\n' >file
pax> cat file
Line # with a comment
# Line with only a comment
pax> cp file filex ; sed -i 's/#.*$//;/^$/d' filex ; cat filex
Line
pax> cp file filex ; sed -i -e 's/#.*$//' -e '/^$/d' filex ; cat filex
Line
Note how the file is modified in-place even with two -e options. You can see that both commands are executed on each line. The line with only a comment first has the comment removed, and is then deleted because it's empty.
In addition, the original empty line is also removed.
@paxdiablo has a good answer but it can be improved.
(1) The '/^$/d' clause only matches 100% blank lines.
If you want to also match lines that are entirely whitespace (spaces, tabs etc.) use this instead:
'/^\s*$/d'
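(2) The 's/#.*$//' clause strips the text from the first # onward but leaves the rest of the line in place, so a comment preceded only by whitespace becomes a whitespace-only line instead of being removed.
If you want to delete lines that have only whitespace before the first #, use this instead: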
'/^\s*#.*$/d'
The above criteria may not be universal (e.g. within a HEREDOC block, or in a Python multi-line string the different approaches could be significant), but in many cases the conventional definition of "blank" lines include whitespace-only, and "comment" lines include whitespace-then-#.
(3) Lastly, on OSX at least, the @paxdiablo solution, in which the first clause turns comment lines into blank lines and the second clause strips blank lines (including what were originally comments), doesn't work. It seems to be more portable to make both clauses /d delete actions, as I've done.
The revised command incorporating the above is:
sed -e '/^\s*#.*$/d' -e '/^\s*$/d' inputFile
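Because \s is a GNU extension, a variant that uses the POSIX character class [[:space:]] may travel better between sed implementations. The sketch below runs on a made-up config sample; the variable names conf and cleaned are hypothetical:

```shell
# Hypothetical config sample: a full-line comment, an empty line, a
# whitespace-only line, a trailing comment, and an indented comment.
conf=$(printf '%s\n' '# full-line comment' '' '   ' 'key = value  # trailing' '  # indented comment' 'other = 1')

# Delete lines whose first non-blank character is #, then delete lines
# that are empty or contain only whitespace.
cleaned=$(printf '%s\n' "$conf" | sed -e '/^[[:space:]]*#/d' -e '/^[[:space:]]*$/d')

printf '%s\n' "$cleaned"
```

Trailing comments such as the one after key = value are deliberately left alone here, since only whole-line comments are deleted.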
This tiny jewel removes all # comments, no matter where they begin in a line (see caution below):
sed -e 's/\s*#.*$//'
Example:
text="
this is a # test
#this is a test
#this is a #test
this is # another #test
"
$ echo "$text" | sed -e 's/\s*#.*$//'
this is a
this is
Next this removes any resulting blank lines:
$ echo "$text" | sed -e 's/\s*#.*$//' | sed -e '/^\s*$/d'
Caution: Depending on the syntax and/or interpretation of the lines you're processing, this might not be an appropriate solution, as it blindly removes everything from the first # to the end of the line, even if the '#' is part of your data or code. However, for use cases where a hash only ever appears as the start of an end-of-line comment, it works fine. As with all coding, context must be taken into consideration.
Alternative variant, using grep (note that this drops every line that contains a #, not just the comment part):
grep -Ev '(#.*$)|(^$)' file.txt
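You can use awk. NF skips empty and whitespace-only lines, and the negated pattern skips lines whose first non-blank character is #:
awk 'NF && !/^[ \t]*#/' file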
The first example (paxdiablo's) is very good; note that without -i sed does not change the file, it just prints the result. If you want to change the file in place:
sudo sed -i 's/#.*$//;/^$/d' inputFile
On (one of) my linux boxes, sed understands extended regular expressions with the -r option, so:
sed -r '/(^\s*#)|(^\s*$)/d' squid.conf.installed
is very useful for showing all non-blank, non comment lines.
The regex matches either start of line followed by zero or more spaces or tabs followed by either a hash or end of line, and deletes those matching lines from the input.

sed script to delete all characters up to & including the 2nd comma on a line

Can anyone explain how to use sed to delete all characters up to & including the 2nd comma on a line in a CSV file?
The beginning of a typical line might look like
1234567890,ABC/DEF, and the number of digits in the first column varies i.e. there might be 9 or 10 or 11 separate digits in random order, and the letters in the second column could also be random. This randomness and varying length makes it impossible to use any explicit pattern searching.
You could do it with sed like this
sed -e 's/^\([^,]*,\)\{2\}//'
I'm not 100% sure about the syntax, but I tried it and it seems to work. It deletes zero or more of anything-but-a-comma followed by a comma, with that whole group matched twice in succession.
But even easier would be to use cut, like this
cut -d, -f3-
which will use comma as a delimiter, and print fields 3 and up.
EDIT:
Just for the record, both sed and cut can work with a file as a parameter, just append it at the end like so
cut -d, -f3- myfile.txt
or you can pipe the output of your program through them
./myprogram | cut -d, -f3-
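A quick way to check that the two approaches agree is to run both on one sample line. The line below is made up (only its first two columns follow the question), and the variable names are hypothetical. Neither command understands quoted CSV fields that contain commas:

```shell
# Hypothetical CSV line; the first two columns follow the question.
line='1234567890,ABC/DEF,2020-01-01,1.2345'

# sed: delete two leading "anything-but-a-comma, then a comma" groups.
via_sed=$(printf '%s\n' "$line" | sed -e 's/^\([^,]*,\)\{2\}//')

# cut: keep field 3 through the end, using a comma as the delimiter.
via_cut=$(printf '%s\n' "$line" | cut -d, -f3-)

printf '%s\n' "$via_sed" "$via_cut"
```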
sed is not the "right" choice of tool (although it can be done). Since you have structured data, you can use a field/delimiter method instead of creating a complicated regex.
You can use cut
$ cut -f3- -d"," file
or gawk
$ gawk -F"," -v OFS="," '{$1=$2=""; sub(/^,,/,"")}1' file
$ gawk -F"," '{for(i=3;i<NF;i++) printf "%s,",$i; print $NF}' file
Thanks for all replies - with the help provided I have written the simple executable script below which does what I want.
#!/bin/bash
cut -d, -f3- ~/Documents/forex_convert/input.csv |
sed -e '1d' \
-e 's/-/,/g' \
-e 's/ /,/g' \
-e 's/:/,/g' \
-e 's/,D//g' > ~/Documents/forex_convert/converted_input
exit