grep all lines from start of file to line containing a string - command-line

If I have an input file containing
nothing here
I want to grep / extract (without using awk) every line from the start of the file up to the first line containing the string "something". How can I do this? grep -B does not work since it needs an exact number of lines.
Desired output:

It's not completely robust, but sure, -B works... just make the -B count huge (at least the number of lines in the file) and stop at the first match with -m 1:
grep -m 1 -B "$(wc -l < filename)" -e 'something' filename

You could use a bash while loop and stop early when you hit the string:
while IFS= read -r line; do
    printf '%s\n' "$line"        # print the line unchanged
    case $line in
        *something*) break ;;    # stop after the first matching line
    esac
done < file

Or look up the line number of the first match and pass it to head:
head -n "$(grep -n -m 1 -e 'something' filename | cut -d: -f1)" filename
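
The same result can also be sketched with sed, which can quit as soon as it has printed the first matching line. This is an alternative to the answers above rather than something from the thread; the sample file contents and the variable names (sample, result) are made up for illustration.

```shell
# Build a small sample file (hypothetical contents).
sample=$(mktemp)
printf '%s\n' 'nothing here' 'still nothing' 'something found' 'after the match' > "$sample"

# Print every line up to and including the first one that matches,
# then quit so the rest of the file is never read.
result=$(sed '/something/q' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Because sed stops reading at the match, this also behaves reasonably on large files.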


Removing a specific line in bash with an exact string

I'm having trouble in getting sed to remove just the specific line I want. Let's say I have a file that looks like this:
Currently I'm using this to remove the line I want:
sed -i "/$1/d" file
The issue is that if I give testfile as input, this deletes all three lines, but I want it to remove only the first line. How do I do this?
With grep
grep -x -F -v -- "$1" file
# or
grep -xFv -- "$1" file
-F is for "fixed strings" -- turns off regex engine.
-x is to match entire line.
-v is for "everything but" the matched line(s).
-- to signal the end of options, in case $1 starts with a hyphen.
To save the file
grep -xFv -- "$1" file | sponge file # `moreutils` package
# or
tmp=$(mktemp) && grep -xFv -- "$1" file > "$tmp" && mv "$tmp" file
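
A minimal end-to-end sketch of the grep approach, assuming a file with three lines that all contain "testfile" (the contents and the names f, target, tmp and remaining are hypothetical). One detail worth noting: grep exits with status 1 when it prints nothing, which happens when every line is removed, so the sketch accepts both 0 and 1 before replacing the file.

```shell
# Hypothetical sample: three lines that all contain "testfile".
f=$(mktemp)
printf '%s\n' 'testfile' 'testfile.txt' 'my testfile' > "$f"

target='testfile'    # stands in for "$1"
tmp=$(mktemp)

# -x whole line, -F literal string, -v invert the match.
grep -xFv -- "$target" "$f" > "$tmp"
status=$?

# 0 = some lines kept, 1 = nothing left to keep, 2 = real error.
if [ "$status" -le 1 ]; then
    mv "$tmp" "$f"
fi

remaining=$(cat "$f")
printf '%s\n' "$remaining"
rm -f "$f"
```

Only the line that is exactly "testfile" disappears; the other two lines survive.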
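With sed, anchor the pattern so that it matches the whole line. Note that $var is still interpreted as a regex, so any regex metacharacters or slashes in it need escaping.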
sed -i '/^'"$var"'$/d' file
# or with " quoting
sed -i "/^$var\$/d" file
You can have fun learning regex online with regex crosswords.

search and select decimal numbers in a text file line

I have XML text files which contain lines of three numbers separated by tabs/spaces, from which I would like to select each number separately.
<tagname1> 110.0912 99.1234 55.1326 </tagname1>
I would like to use sed, awk, grep, etc. perl is fine too. Seems simple, but can't figure out a cleaner line. I've tried:
more FILENAME | grep tagname1 | grep -E -o "[0-9]+*\.[0-9]+" | head -n 1
perl -MRegexp::Common -nE 's/<.*?>//g; say for /($RE{num}{real})/g' file
You can use grep's -o option.
$ cat file
<tagname1> 110.0912 99.1234 55.1326 </tagname1>
$ grep -oE '\b[0-9.]+\b' file
\b defines a word boundary
[0-9.]+ is a character class that matches one or more digits or dots
-o option prints matched pattern only
awk -v which=2 '/<tagname1>(([0-9]*(\.[0-9]*)?)|[ \t])*<\/tagname1>/ {print $(which+1)}' input.txt
Select which number you want printed using the variable which. In this example, which=2 prints the second number.
<tagname1> 110.0912 99.1234 55.1326 </tagname1>
You can use awk
awk '{print $2,$3,$4}' OFS="\n" file
$ cat file
<tagname1> 110.0912 99.1234 55.1326 </tagname1>
$ awk -v tag="tagname1" -v nr=1 '$0~"<"tag">"{print $(nr+1)}' file
$ awk -v tag="tagname1" -v nr=2 '$0~"<"tag">"{print $(nr+1)}' file
$ awk -v tag="tagname1" -v nr=3 '$0~"<"tag">"{print $(nr+1)}' file
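
The grep -o idea can be combined with sed to pick the n-th number, sketched here on the sample line from the question. The variable names (line, n, picked) are illustrative only, and -o is a GNU/BSD grep extension rather than POSIX.

```shell
line='<tagname1> 110.0912 99.1234 55.1326 </tagname1>'
n=2    # which number to pick (hypothetical parameter)

# -o prints each match on its own line; sed -n "${n}p" keeps only the n-th.
# Requiring digits on both sides of the dot skips the "1" in the tag name.
picked=$(printf '%s\n' "$line" | grep -oE '[0-9]+\.[0-9]+' | sed -n "${n}p")
printf '%s\n' "$picked"
```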

AWK/SED. How to remove parentheses in simple text file

I have a text file looking like this:
(-9.1744438E-02,7.6282293E-02) (-9.1744438E-02,7.6282293E-02) ... and so on.
I would like to modify the file by removing all the parentheses and putting each pair on its own line,
so that it looks like this:
Is there a simple way to do that?
Any help is appreciated,
I would use tr for this job:
cat in_file | tr -d '()' > out_file
With the -d switch it just deletes any characters in the given set.
To add new lines you could pipe it through two trs (the first one also deletes the separating spaces, so lines do not start with a blank):
cat in_file | tr -d '( ' | tr ')' '\n' > out_file
As was said, almost:
sed 's/[()]//g' inputfile > outputfile
or in awk:
awk '{gsub(/[()]/,""); print;}' inputfile > outputfile
This would work -
awk -v FS="[()]" '{for (i=2;i<=NF;i+=2) print $i }' inputfile > outputfile
[jaypal:~/Temp] cat file
(-9.1744438E-02,7.6282293E-02) (-9.1744438E-02,7.6282293E-02)
[jaypal:~/Temp] awk -v FS="[()]" '{for (i=2;i<=NF;i+=2) print $i }' file
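This might work for you (GNU sed, which understands \n in the replacement; the g flag handles more than two pairs per line):
echo "(-9.1744438E-02,7.6282293E-02) (-9.1744438E-02,7.6282293E-02)" |
sed 's/) (/\n/g;s/[()]//g'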
I guess we all know this, but just to emphasize:
Simple, specialized tools such as tr or grep tend to take less time than awk or sed doing the same job. For instance, try not to use sed/awk where grep can suffice.
In this particular case, I created a 100000-line file, with each line containing the characters "(" and ")". Then I ran
$ /usr/bin/time -f%E -o log cat file | tr -d "()"
and again,
$ /usr/bin/time -f%E -ao log sed 's/[()]//g' file
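And the results were (note that in the first command /usr/bin/time measures only cat, not tr, since it applies to the first command of the pipeline):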
05.44 sec : Using tr
05.57 sec : Using sed
cat in_file | sed 's/[()]//g' > out_file
Due to formatting issues, it is not entirely clear from your question whether you also need to insert newlines.
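
If one pair per line is what is wanted, another way to sketch it is to let grep -o extract each parenthesized group and then strip the parentheses with tr. This is an alternative to the answers above; the variable names (input, pairs) are illustrative, and -o is a GNU/BSD grep extension.

```shell
input='(-9.1744438E-02,7.6282293E-02) (-9.1744438E-02,7.6282293E-02)'

# In a basic regex, ( and ) are literal, so this matches "(...)" groups.
# grep -o prints each group on its own line; tr then drops the parentheses.
pairs=$(printf '%s\n' "$input" | grep -o '([^()]*)' | tr -d '()')
printf '%s\n' "$pairs"
```

This avoids the leading-space and trailing-blank-line issues that come from translating ")" to a newline.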

How can I check whether a piped content is text with perl

I've written an svn hook for text files. The content test looks like this:
svnlook cat -t $txn $repos $file 2>/dev/null | file - | egrep -q 'text$'
and I was wondering if this could be done with Perl. However something like this doesn't work:
svnlook cat -t $txn $repos $file 2>/dev/null | perl -wnl -e '-T' -
I'm testing the exit status of this invocation ($?) to see if the given file was text or binary. Since I'm getting the content out of svn, I can't use perl's normal file check.
I've done a simulation with the file program and perl with a text and binary file (text.txt, icon.png):
find -type f | xargs -i /bin/bash -c 'if $(cat {} | file - | egrep -q "text$"); then echo "{}: text"; else echo "{}: binary"; fi'
./text.txt: text
./icons.png: binary
find -type f | xargs -i /bin/bash -c 'if $(cat {} | perl -wln -e "-T;"); then echo "{}: text"; else echo "{}: binary"; fi'
./text.txt: text
./icons.png: text
You're testing perl's exit code, but you never set it. You need
perl -le'exit(-T STDIN ?0:1)' < file

How to "grep" out specific line ranges of a file

I often run grep -n whatever file to find what I am looking for. Say the output is:
1234: whatev 1
5555: whatev 2
6643: whatev 3
If I want to then just extract the lines between 1234 and 5555, is there a tool to do that? For static files I have a script that does wc -l of the file and then does the math to split it out with tail & head but that doesn't work out so well with log files that are constantly being written to.
Try using sed. For example, use
sed '2,4!d' somefile.txt
to print from the second line to the fourth line of somefile.txt. (sed is a wonderful tool and well worth learning.)
The following command will do what you asked for "extract the lines between 1234 and 5555" in someFile.
sed -n '1234,5555p' someFile
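
For large or still-growing log files it may help to make sed quit right after the last wanted line instead of reading to the end. A small sketch with a ten-line stand-in file; the names f, first, last and range are illustrative.

```shell
# Ten numbered lines as a stand-in for a large log file.
f=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > "$f"

first=3
last=5

# Print lines first..last, then quit at line "last" so the rest is never read.
range=$(sed -n "${first},${last}p;${last}q" "$f")
printf '%s\n' "$range"

rm -f "$f"
```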
If I understand correctly, you want to find a pattern between two line numbers. The awk one-liner could be
awk '/whatev/ && NR >= 1234 && NR <= 5555' file
You don't need to run grep followed by sed.
Perl one-liner:
perl -ne 'if (/whatev/ && $. >= 1234 && $. <= 5555) {print}' file
Line numbers are OK if you can guarantee the position of what you want. Over the years, my favorite flavor of this has been something like this:
sed "/First Line of Text/,/Last Line of Text/d" filename
which deletes all lines from the first matched line to the last match, including those lines.
Use sed -n with "p" instead of "d" to print those lines instead. Way more useful for me, as I usually don't know where those lines are.
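
The printing variant of the pattern range can be sketched like this, with made-up file contents and BEGIN/END standing in for the first and last lines of interest.

```shell
# Hypothetical sample file with a block delimited by BEGIN and END lines.
f=$(mktemp)
printf '%s\n' 'header' 'BEGIN block' 'payload' 'END block' 'footer' > "$f"

# -n suppresses default output; p prints only lines inside the range,
# including the two delimiter lines themselves.
block=$(sed -n '/BEGIN/,/END/p' "$f")
printf '%s\n' "$block"

rm -f "$f"
```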
Put this in a file and make it executable:
#!/usr/bin/env bash
# Usage: <script> START_PATTERN STOP_PATTERN FILE
# Line number of the first line matching the start pattern.
start=$(grep -n -- "$1" "$3" | head -n1 | cut -d: -f1)
if [ -z "$start" ]; then
    echo "couldn't find start pattern!" 1>&2
    exit 1
fi
# First stop-pattern match at or after the start line, counted from the start line.
stop=$(tail -n +"$start" "$3" | grep -n -- "$2" | head -n1 | cut -d: -f1)
if [ -z "$stop" ]; then
    echo "couldn't find end pattern!" 1>&2
    exit 1
fi
stop=$(( stop + start - 1 ))    # convert to an absolute line number
sed "$start,$stop!d" "$3"
Execute the file with three arguments (they are quoted inside the script, so patterns and paths containing spaces work):
Starting grep pattern
Stopping grep pattern
File path
To use with your example, use arguments: 1234 5555 myfile.txt
Includes lines with starting and stopping pattern.
If I want to then just extract the lines between 1234 and 5555, is
there a tool to do that?
There is also ugrep, a GNU/BSD grep compatible tool but one that offers a -K option (or --range) with a range of line numbers to do just that:
ugrep -K1234,5555 -n '' somefile.log
You can use the usual GNU/BSD grep options and regex patterns (but it also offers a lot more such as -K.)
If you want individual lines instead of line ranges, you can do it with perl. For example, to get lines 1, 3 and 5 from a file, say /etc/passwd:
perl -ne 'print if grep { $_ == $. } 1, 3, 5' /etc/passwd
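
The same selection of individual lines can also be sketched with sed alone, by giving one print command per wanted line. The sample contents and variable names (f, picked) are made up for illustration.

```shell
# Five-line sample file (hypothetical contents).
f=$(mktemp)
printf '%s\n' 'a' 'b' 'c' 'd' 'e' > "$f"

# One "Np" command per wanted line number; -n suppresses everything else.
picked=$(sed -n '1p;3p;5p' "$f")
printf '%s\n' "$picked"

rm -f "$f"
```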