How to find and erase lines inside a text file - sed

I have two text files:
remove.txt
red
green
blue
collors.txt
yellow
red
black
green
grey
blue
I want to remove the occurrences of remove.txt lines inside collors.txt and save it as output.txt. I tried using sed command inside a loop, but couldn't make it work.
Script:
remove='remove.txt'
input='collors.txt'
while read line; do
# I failed to use sed here to do the job
done < $remove

No need to use a Unix tool, cmd can do it itself:
findstr /v /x /g:remove.txt collors.txt > output.txt
See the output of findstr /? to learn about the switches
Output with your example files:
yellow
black
grey
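For readers on a Unix-like system, roughly the same whole-line filtering can be sketched with grep. This is only an illustration: the temporary directory and the recreated sample files are assumptions made for the demo, not part of the original question.

```shell
# Recreate the sample files in a scratch directory (demo assumption).
tmp=$(mktemp -d)
printf '%s\n' red green blue > "$tmp/remove.txt"
printf '%s\n' yellow red black green grey blue > "$tmp/collors.txt"

# -v: invert the match, -x: match whole lines only,
# -F: treat patterns as fixed strings, -f: read patterns from a file.
grep -vxFf "$tmp/remove.txt" "$tmp/collors.txt" > "$tmp/output.txt"
cat "$tmp/output.txt"
```

Because of -x and -F, a line such as darkred would be kept, and regex metacharacters in remove.txt are taken literally.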

This might work for you (GNU sed):
sed 's#.*#/&/d#' removeFile | sed -f - coloursFile >outFile
Create a sed script from the removeFile and apply it to the coloursFile to produce the outFile. The created script will have a line like /colour/d for each line in the remove file where the colour will be replaced by red etc.
N.B. The -f - option applies the output from the previous pipe as an input sed script.
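The generated commands above match anywhere in a line, so /red/d would also delete a line such as darkred. A minimal sketch of an anchored variant follows. It assumes the lines of the remove file contain no regex metacharacters or slashes, and it writes the generated script to a hypothetical file (script.sed) instead of using -f -, purely for portability of the demo.

```shell
# Recreate sample data in a scratch directory (demo assumption).
tmp=$(mktemp -d)
printf '%s\n' red green blue > "$tmp/remove.txt"
printf '%s\n' yellow red black green grey blue darkred > "$tmp/collors.txt"

# Turn each line X of remove.txt into the sed command /^X$/d,
# so only lines that are exactly X get deleted.
sed 's#.*#/^&$/d#' "$tmp/remove.txt" > "$tmp/script.sed"

# Apply the generated script to the colours file.
sed -f "$tmp/script.sed" "$tmp/collors.txt" > "$tmp/output.txt"
cat "$tmp/output.txt"
```

With this data, darkred survives because the anchors require the whole line to match.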

Related

How to edit beginning and ending of file with sed

I'm trying to add array brackets at the beginning and end of a file using sed (after first removing the trailing comma at the end of the file) to put all the content of the file in an array. I'm first using this sed command to remove the last comma from a file
sed '$ s/,$//' "$path"
After that, I'm using the middle command below to add array brackets at beginning and ending of file
sed '$ s/,$//' "$path" | sed 's/^.*$/[&]/' | tee $filename
This sed 's/^.*$/[&]/' was supposed to match everything (from beginning to end ^$) and then put brackets around the whole match [&] (i.e. as if to make it into an array), but it instead put array brackets around the beginning and end of each line.
Question, how to edit the beginning and ending of a file with sed?
whole script
for path in dirname/*; do
name="${path##*/}"
sed '$ s/,$//' "$path" | sed 's/^.*$/[&]/' | tee "newdir/$name"
done
sed is an editor that works line by line, so the command sed 's/^.*$/[&]/' would add brackets to every line. If you want to edit just the beginning and end of a file you need to put line numbers in front of the substitutions ($ stands for the last line):
sed -e '1 s/^/[/' -e '$ s/$/]/'
Since you already have a command that removes trailing ,'s you could combine it with the aforementioned substitutions. Your command line would then look like this:
sed -e '1 s/^/[/' -e '$ s/,*$/]/' "$path" | tee "$filename"
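To see the line-addressed substitutions at work, here is a small sketch on made-up input. The file name items.txt and its contents are assumptions for the demo only.

```shell
# Two comma-terminated entries, one per line (hypothetical sample data).
tmp=$(mktemp -d)
printf '%s\n' '{"a":1},' '{"b":2},' > "$tmp/items.txt"

# Address 1 limits the first substitution to the first line;
# address $ limits the second to the last line, where the trailing
# comma(s) are replaced by the closing bracket.
result=$(sed -e '1 s/^/[/' -e '$ s/,*$/]/' "$tmp/items.txt")
printf '%s\n' "$result"
```

The comma on the first line is left alone because only the last line is touched by the second command.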

Delete lines containing pattern at the end of line

I am almost certainly missing something basic. My file contains lines like
fooLOCATION=sdfmsvdnv
fooLOCATION=
barLOCATION=sadssf
barLOCATION=
and I want to delete all lines ending with LOCATION=.
sed -i '/LOCATION=$/d' file
does not work: it deletes nothing. I have tried endless variations, but I don't get it. What in-place sed command can do this?
There are two approaches here: either print all non-matching lines with
sed -n -i '/LOCATION=$/!p' file
or delete all matching lines with
sed -i '/LOCATION=$/d' file
The first uses the -n command-line option to suppress the default action of printing each line. (Write -n and -i as separate options: GNU sed reads -in as -i with the backup suffix n, so -n would never take effect.) We then test for lines that end in LOCATION= and invert the pattern (only keeping those that don't match). When we get a desirable line, we print it with the p command.
The second looks for lines matching the end of line pattern, and deletes those that do.
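As a quick check that the two approaches are equivalent, the sketch below runs both without -i on a scratch copy of the sample data. The temporary file is an assumption for the demo.

```shell
# Sample data from the question, written to a scratch file.
tmp=$(mktemp -d)
printf '%s\n' 'fooLOCATION=sdfmsvdnv' 'fooLOCATION=' \
              'barLOCATION=sadssf' 'barLOCATION=' > "$tmp/file"

# Approach 1: suppress automatic printing, print only non-matching lines.
kept_by_print=$(sed -n '/LOCATION=$/!p' "$tmp/file")

# Approach 2: delete matching lines, let sed print the rest.
kept_by_delete=$(sed '/LOCATION=$/d' "$tmp/file")

printf '%s\n' "$kept_by_print"
```

Running without -i first is also a cheap way to preview the result before editing the file in place.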
Your file contains blank lines, and both of these keep those. If we don't want to keep those, we can change the first option to
sed -n -i '/^$/!{/LOCATION=$/!p}' file
which first checks if a line is not empty, and only bothers checking if it should be printed if it isn't empty. We can modify the second option to
sed -i '/^$/d;/LOCATION=$/d' file
which deletes blank lines and then checks about deleting the other pattern.
We can modify the options to work with different line ending by specifying the difference in the pattern. The difference between line endings on Unix/Linux (\n) and Windows (\r\n) is the presence of an extra carriage return on Windows. Modifying the four commands above to accept either, we get
sed -n -i '/LOCATION=\r\{0,1\}$/!p' file
sed -i '/LOCATION=\r\{0,1\}$/d' file
sed -n -i '/^\r\{0,1\}$/!{/LOCATION=\r\{0,1\}$/!p}' file
sed -i '/^\r\{0,1\}$/d;/LOCATION=\r\{0,1\}$/d' file
Note that in each of these we allow an optional \r before the end of line. We use the curly bracket notation because POSIX basic regular expressions, which sed uses by default, have no ? quantifier (GNU sed accepts \? as an extension). With the -r or -E option, which enables extended regular expressions in GNU sed, we can replace \{0,1\} with ?.
On a Windows shell, all of the options above require double quotes instead of single quotes.
Your command does work for me:
$ sed -i '/LOCATION=$/d' file
Results, viewed using cat:
$ cat file
fooLOCATION=sdfmsvdnv
barLOCATION=sadssf
Note
If a file has non-Unix line endings such as files from Windows with DOS-formatted line-endings, it can be a reason for failure. A typical remedy is to use dos2unix:
$ dos2unix file
This converter fixes the newline issues, so that file will now have Unix-style line endings. Sed should now properly recognize those line endings, so retry your sed command and it should work.
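If dos2unix is not installed, the effect of the carriage returns can be demonstrated, and worked around, with tr. This is only a sketch on made-up data, and it filters a stream rather than editing the file in place.

```shell
# Build a small file with Windows (CRLF) line endings (demo data).
tmp=$(mktemp -d)
printf 'fooLOCATION=abc\r\nfooLOCATION=\r\nbarLOCATION=\r\n' > "$tmp/file"

# With CRLF endings each line really ends in \r, so the anchored
# pattern matches nothing and all three lines survive.
before=$(sed '/LOCATION=$/d' "$tmp/file" | wc -l)

# Strip the carriage returns first, then the pattern matches as expected.
after=$(tr -d '\r' < "$tmp/file" | sed '/LOCATION=$/d')

echo "lines kept with CRLF: $before"
echo "kept after stripping CR: $after"
```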
This might work for you (GNU sed):
sed -i '/LOCATION=\s*$/d' file
This deletes the line if LOCATION= is at the end of the line or if there is any optional white space following the pattern.

Echo or preview text changed using sed -i

Using sed you can easily change text in multiple files, eg:
sed -i 's/cashtestUS/cheque_usd/g' *.xml
The problem is that this has tremendous power, and a complex regular expression could easily have unforeseen consequences.
Is there a simple way to do either:
1) Echo the changes made
2) Run sed in a preview mode, so that the potential changes can be previewed
Run in preview mode without the -i:
sed -e 's/cashtestUS/cheque_usd/g' *.xml
(The -e is not necessary; it just says the next argument is the sed script, or one part of the sed script.) This writes all the output to standard output. You'd probably pipe it through less (or more), or pass it through grep to see that the changed lines were those you expected. Or you might process one file at a time and run a difference:
for file in *.xml
do
echo "$file"
sed -e 's/cashtestUS/cheque_usd/g' "$file" | diff -u "$file" -
done
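For the other half of the question (echoing the changes after they have been made), one option is to let sed keep a backup and diff against it. The sketch below assumes a scratch file named a.xml; the attached-suffix form -i.bak is accepted by both GNU and BSD sed.

```shell
# Scratch XML file with one line that should change (demo data).
tmp=$(mktemp -d)
printf '%s\n' '<account>cashtestUS</account>' '<other>unchanged</other>' > "$tmp/a.xml"

# Edit in place, keeping the original as a.xml.bak.
sed -i.bak 's/cashtestUS/cheque_usd/g' "$tmp/a.xml"

# diff exits with status 1 when the files differ, which is expected here.
changes=$(diff "$tmp/a.xml.bak" "$tmp/a.xml" || true)
printf '%s\n' "$changes"
```

The .bak files also give a way back if the expression turns out to have unforeseen consequences.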
Or …
sed has several commands that are useful for debugging and display:
= prints the current line number
l prints the current contents of the pattern space, with a $ marking the end
i and a can be used to emit a trace, like i \
Debug trace here
If the hold buffer is not otherwise in use, h;s/.*/Debug trace here/p;g is also useful: it prints the trace immediately rather than at the end of the line's processing, as a does.
sample:
echo "line 1
and two" | sed ':a
=;h;s/.*/Before substitution/p;g;l
s/..$/-/
=;l
/../b a'

Using variables in sed -f (where sed script is in a file rather than inline)

We have a process which can use a file containing sed commands to alter piped input.
I need to replace a placeholder in the input with a variable value, e.g. in a single -e type of command I can run;
$ echo "Today is XX" | sed -e "s/XX/$(date +%F)/"
Today is 2012-10-11
However I can only specify the sed aspects in a file (and then point the process at the file), E.g. a file called replacements.sed might contain;
s/XX/Thursday/
So obviously;
$ echo "Today is XX" | sed -f replacements.sed
Today is Thursday
If I want to use an environment variable or shell value, though, I can't find a way to make it expand, e.g. if replacements.sed contains;
s/XX/$(date +%F)/
Then;
$ echo "Today is XX" | sed -f replacements.sed
Today is $(date +%F)
Including double quotes in the text of the file just prints the double quotes.
Does anyone know a way to be able to use variables in a sed file?
This might work for you (GNU sed):
cat <<\! > replacements.sed
/XX/{s//'"$(date +%F)"'/;s/.*/echo '&'/e}
!
echo "Today is XX" | sed -f replacements.sed
If you don't have GNU sed, try:
cat <<\! > replacements.sed
/XX/{
s//'"$(date +%F)"'/
s/.*/echo '&'/
}
!
echo "Today is XX" | sed -f replacements.sed | sh
AFAIK, it's not possible. Your best bet is a wrapper script:
INPUT FILE
aaa
bbb
ccc
SH SCRIPT
#!/bin/bash
STRING="${1//\//\\/}" # bash parameter expansion: escape each / so it cannot end the s command early
shift
sed "
s/aaa/$STRING/
" "$@"
COMMAND LINE
./sed.sh "fo/obar" <file path>
OUTPUT
fo/obar
bbb
ccc
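The wrapper above only escapes slashes. A value containing & or a backslash would still be misread in the replacement part of s. A minimal sketch of a fuller escape, usable from plain sh, could look like this; the function name escape_replacement is hypothetical.

```shell
# Prefix every backslash, slash and ampersand with a backslash so the
# value is taken literally in the replacement part of s/.../.../.
# Assumption: the value contains no newlines.
escape_replacement() {
  printf '%s' "$1" | sed 's|[\\/&]|\\&|g'
}

value='fo/o&bar'
escaped=$(escape_replacement "$value")
result=$(printf '%s\n' aaa bbb ccc | sed "s/aaa/$escaped/")
printf '%s\n' "$result"
```

Using | as the delimiter inside the helper avoids having to escape the slash within the bracket expression.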
As others have said, you can't use variables in a sed script, but you might be able to "fake" it using extra leading input that gets added to your hold buffer. For example:
[ghoti@pc ~/tmp]$ cat scr.sed
1{;h;d;};/^--$/g
[ghoti@pc ~/tmp]$ sed -f scr.sed <(date '+%Y-%m-%d'; printf 'foo\n--\nbar\n')
foo
2012-10-10
bar
[ghoti@pc ~/tmp]$
In this example, I'm using process substitution to get input into sed. The "important" data is generated by printf. You could cat a file instead, or run some other program. The "variable" is produced by the date command, and becomes the first line of input to the script.
The sed script takes the first line, puts it in sed's hold buffer, then deletes the line. Then for any subsequent line, if it matches a double dash (our "macro replacement"), it substitutes the contents of the hold buffer. And prints, because that's sed's default action.
Hold buffers (g, G, h, H and x commands) represent "advanced" sed programming. But once you understand how they work, they open up new dimensions of sed fu.
Note: This solution only helps you replace entire lines. Replacing substrings within lines may be possible using the hold buffer, but I can't imagine a way to do it.
(Another note: I'm doing this in FreeBSD, which uses a different sed from what you'll find in Linux. This may work in GNU sed, or it may not; I haven't tested.)
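The same hold-buffer idea can be tried without process substitution by piping a command group into sed, which also works in plain sh. The sketch below uses a fixed date string instead of calling date so the result is reproducible; the scratch directory is an assumption for the demo.

```shell
# Write the two-command script: stash line 1 in the hold buffer and drop it;
# replace any later line that is exactly "--" with the hold buffer.
tmp=$(mktemp -d)
printf '1{h;d;}\n/^--$/g\n' > "$tmp/scr.sed"

# The first line of input plays the role of the "variable".
result=$( { echo '2012-10-10'; printf 'foo\n--\nbar\n'; } | sed -f "$tmp/scr.sed" )
printf '%s\n' "$result"
```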
I am in agreement with sputnick. I don't believe that sed would be able to complete that task.
However, you could generate that file on the fly.
You could change the date to a fixed string, like
__DAYOFWEEK__.
Create a temp file, use sed to replace __DAYOFWEEK__ with $(date +%A).
Then parse your file with sed -f "$TEMPFILE".
sed is great, but it might be time to use something like perl that can generate the date on the fly.
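A minimal sketch of generating the script file on the fly, assuming the process only needs a path to a finished sed script: the shell expands the variable while the file is being written, so sed itself never sees a variable. The scratch directory and the variable name today are assumptions for the demo.

```shell
tmp=$(mktemp -d)
today=$(date +%F)

# The here-document delimiter is unquoted, so $today is expanded by the
# shell before the text is written to replacements.sed.
cat > "$tmp/replacements.sed" <<EOF
s/XX/$today/
EOF

result=$(echo "Today is XX" | sed -f "$tmp/replacements.sed")
printf '%s\n' "$result"
```

If the value could contain /, & or a backslash, it would need escaping before being written into the script.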
To add a newline in the replacement expression using a sed file, what finally worked for me is escaping a literal newline. Example: to append a newline after the string NewLineHere, then this worked for me:
#! /usr/bin/sed -f
s/NewLineHere/NewLineHere\
/g
Not sure it matters but I am on Solaris unix, so not GNU sed for sure.

sed + remove "#" and empty lines with one sed command

How can I remove comment lines (such as # bla bla) and empty lines (lines without characters) from a file with one sed command?
THX
lidia
If you're worried about starting two sed processes in a pipeline for performance reasons, you probably shouldn't be, it's still very efficient. But based on your comment that you want to do in-place editing, you can still do that with distinct commands (sed commands rather than invocations of sed itself).
You can either use multiple -e arguments or separate commands with a semicolon, something like (just one of these, not both):
sed -i -e 's/#.*$//' -e '/^$/d' fileName
sed -i 's/#.*$//;/^$/d' fileName
The following transcript shows this in action:
pax> printf 'Line # with a comment\n\n# Line with only a comment\n' >file
pax> cat file
Line # with a comment

# Line with only a comment
pax> cp file filex ; sed -i 's/#.*$//;/^$/d' filex ; cat filex
Line
pax> cp file filex ; sed -i -e 's/#.*$//' -e '/^$/d' filex ; cat filex
Line
Note how the file is modified in-place even with two -e options. You can see that both commands are executed on each line. The comment-only line first has its comment stripped and is then deleted because it has become empty.
In addition, the original empty line is also removed.
@paxdiablo has a good answer but it can be improved.
(1) The '/^$/d' clause only matches 100% blank lines.
If you want to also match lines that are entirely whitespace (spaces, tabs etc.) use this instead:
'/^\s*$/d'
(2) The 's/#.*$//' clause strips the comment text but leaves any whitespace that preceded the #, so an indented comment line becomes a whitespace-only line that '/^$/d' does not remove.
To delete lines that have only whitespace before the first #, use this instead:
'/^\s*#.*$/d'
The above criteria may not be universal (e.g. within a HEREDOC block, or in a Python multi-line string the different approaches could be significant), but in many cases the conventional definition of "blank" lines include whitespace-only, and "comment" lines include whitespace-then-#.
(3) Lastly, on OSX at least, the @paxdiablo solution in which the first clause turns comment lines into blank lines, and the second clause strips blank lines (including what were originally comments) doesn't work. It seems to be more portable to make both clauses /d delete actions as I've done.
The revised command incorporating the above is:
sed -e '/^\s*#.*$/d' -e '/^\s*$/d' inputFile
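Since \s is a GNU extension, a variant written with the POSIX character class [[:space:]] may travel better between sed implementations. The following is a sketch on made-up input. The file name conf and its contents are assumptions for the demo.

```shell
# Sample config: a setting, an empty line, a whitespace-only line,
# two comment lines, and a setting whose value contains a #.
tmp=$(mktemp -d)
printf 'key=value\n\n   \n# full comment\n  # indented comment\nurl=http://host/#frag\n' > "$tmp/conf"

# Delete comment lines (optional whitespace, then #) and blank or
# whitespace-only lines; everything else is printed untouched.
result=$(sed -e '/^[[:space:]]*#/d' -e '/^[[:space:]]*$/d' "$tmp/conf")
printf '%s\n' "$result"
```

Because both clauses only delete whole lines, a # inside a value (as in the url line) is left alone.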
This tiny jewel removes all # comments, no matter where they begin in a line (see caution below):
sed -e 's/\s*#.*$//'
Example:
text="
this is a # test
#this is a test
#this is a #test
this is # another #test
"
$ echo "$text" | sed -e 's/\s*#.*$//'
this is a
this is
Next this removes any resulting blank lines:
$ echo "$text" | sed -e 's/\s*#.*$//' | sed -e '/^\s*$/d'
Caution: Depending on the syntax and/or interpretation of the lines you're processing, this might not be an appropriate solution, as it blindly removes everything from the first # to the end of the line, even if the '#' is part of your data or code. However, for use cases where a hash only ever appears as the start of an end-of-line comment, it works fine. So, as with all coding, context must be taken into consideration.
Alternative variant, using grep (note that this drops every line that contains a #, not only comment-only lines):
grep -Ev '(#.*$)|(^$)' file.txt
You can also use awk:
awk 'NF && !/^[ \t]*#/' file
The first example (paxdiablo's) is very good, except that it does not change the file; it just outputs the result. If you want to change the file in place:
sudo sed -i 's/#.*$//;/^$/d' inputFile
On (one of) my Linux boxes, sed understands extended regular expressions with the -r option, so:
sed -r '/(^\s*#)|(^\s*$)/d' squid.conf.installed
is very useful for showing all non-blank, non comment lines.
The regex matches either start of line followed by zero or more spaces or tabs followed by either a hash or end of line, and deletes those matching lines from the input.