How to check if a file contains a specific set of characters (e.g. ^&^)

I have a file which is delimited with ^&^. Here is a snippet from the file.
XML_DOC^&^NUM^&^GEO_REF_ID^&^GRL
I need to perform some operations based on the delimiter. How can I check if the file has ^&^ ?
I have tried the below code but that did not work.
if grep -q "^&^" "local/filename.txt"; then
echo "has"
else
echo "has not "
fi
Any help is much appreciated.

grep -q '\^&^' filename.txt
A ^ in the first position of a grep pattern anchors the match to the start of the line, so it has to be escaped to match a literal caret.
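Another way to avoid the escaping problem is grep's -F option, which treats the pattern as a fixed string instead of a regular expression. A minimal sketch; the function name has_delimiter and the temporary sample file are made up for illustration:

```shell
#!/bin/sh
# has_delimiter FILE: succeed if FILE contains the literal text ^&^.
# -F matches a fixed string (no regex), -q suppresses output.
has_delimiter() {
    grep -qF -- '^&^' "$1"
}

# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'XML_DOC^&^NUM^&^GEO_REF_ID^&^GRL' > "$sample"

if has_delimiter "$sample"; then
    echo "has"
else
    echo "has not"
fi
rm -f "$sample"
```

With -F there is nothing to escape, so the same call works for other delimiters that happen to contain regex metacharacters.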

Related

Using sed and mv to add characters to files

First off, I'd like to say that I know this is almost an exact duplicate of some posts that I've read, but have not had any luck with referencing.
I have 100+ files that all follow a very strict naming convention of 5_##_<name>.ext. My issue is that when originally making these files I failed to realise that 5_100_ and above would mess up my ordering.
I am now trying to prepend a 0 to every number between 01 and 99. I've written a bash script using sed that works for the file contents (the file name is in the file as well):
#!/bin/bash
for fl in *.tcl; do
echo Filename: $fl
#sed -i 's/5_\(..\)_/5_0\1_/g' $fl
done
However, this only changes the contents and not the filename itself. I've read that mv is the solution (rename is simpler but I do not have it on my system). My current incarnation of my multiple attempts is:
mv "$fl" $(echo "$file" | sed -e 's/5_\(..\)_/5_0\1_/g') but it gives me an error: mv: missing destination file operand after <filename>
Again, I'm sorry about the duplicate but I wasn't able to solve my issue by reading it. I'm sure I'm just using the combination of mv and sed incorrectly.
Solution was entered in the comments. I was using $file instead of $fl.
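With the variable name corrected, the rename loop could look like the sketch below. The function names pad_name and rename_padded are illustrative, and the pattern is tightened to exactly two digits at the start of the name, which is an assumption that goes slightly beyond the original one-liner.

```shell
#!/bin/sh
# pad_name NAME: print NAME with a leading 5_NN_ rewritten as 5_0NN_.
pad_name() {
    printf '%s\n' "$1" | sed -e 's/^5_\([0-9][0-9]\)_/5_0\1_/'
}

# rename_padded DIR: rename every 5_NN_*.tcl file in DIR.
rename_padded() {
    for fl in "$1"/5_[0-9][0-9]_*.tcl; do
        [ -e "$fl" ] || continue              # the glob matched nothing
        new="$1/$(pad_name "${fl##*/}")"
        [ -e "$new" ] || mv -- "$fl" "$new"   # never overwrite an existing file
    done
}
```

Calling rename_padded . processes the current directory. Files such as 5_100_x.tcl do not match the glob and are left alone, so the function can be run more than once.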
Something like this might be useful:
for n in $(seq 99)
do
prefix2="5_$(printf "%02d" ${n})_"
prefix3="5_$(printf "%03d" ${n})_"
for f in ${prefix2}*.tcl
do
# Skip the literal pattern that is left behind when the glob matches nothing.
[[ -e "${f}" ]] || continue
suffix="${f#${prefix2}}"
[[ -r "${prefix3}${suffix}" ]] || mv "${prefix2}${suffix}" "${prefix3}${suffix}"
done
done
Rather than processing every single file, it only looks at the ones that currently have a "5_XX_" prefix, and only renames them if the corresponding "5_XXX_" file doesn't already exist...
#!/bin/bash
for fl in *.tcl
do
NewName="$(echo "${fl}" | sed '/^5_[0-9]\{2\}_/ s/../&0/')"
#echo "Filename: ${fl} -> ${NewName}"
[ ! "${fl}" = "${NewName}" ] && mv "${fl}" "${NewName}"
done
This includes a small safeguard so that the script can be run several times on the same folder: only the files that still need renaming are changed.
On Linux, where GNU sed is not strictly POSIX by default, use sed --posix instead of a plain sed call.

comparing two directories with separate diff output per file

I need to see what has changed between two directories that contain different versions of a program's source code. I have found a way to get a single .diff file, but how can I obtain a separate diff file for each changed file in the two directories? I need this because the combined diff is about 6 MB and I would like something handier.
I came across this problem too, so I ended up with a few lines of shell script. It takes three arguments: the source and destination directories (as used for diff) and a target folder for the output, which must already exist.
It's a bit hacky, but maybe it will be useful for someone. Use it with care, especially if your paths contain special characters.
#!/bin/sh
DIFFARGS="-wb"
LANG=C
TARGET=$3
SRC=`echo $1 | sed -e 's/\//\\\\\\//g'`
DST=`echo $2 | sed -e 's/\//\\\\\\//g'`
if [ ! -d "$TARGET" ]; then
echo "'$TARGET' is not a directory." >&2
exit 1
fi
diff -rqN $DIFFARGS "$1" "$2" | sed "s/Files $SRC\/\(.*\?\) and $DST\/\(.*\?\) differ/\1/" | \
while read file
do
if [ ! -d "$TARGET/`dirname \"$file\"`" ]; then
mkdir -p "$TARGET/`dirname \"$file\"`"
fi
diff $DIFFARGS -N "$1/$file" "$2/$file" > "$TARGET"/"$file.diff"
done
If you want to compare source code, it is better to commit it to a version control system such as svn.
After you have done so, diff the two committed versions and redirect the output to file.diff:
svn diff --old svn:url1 --new svn:url2 > file.diff
A bash for loop will work for you. The following will diff two directories with C source code and produce a separate diff for each file.
for FILE in $(cd <FIRST_DIR> && find . -name '*.[ch]'); do DIFF=<DIFF_DIR>/$(echo $FILE | grep -o '[-_a-zA-Z0-9.]*$').diff; diff -u <FIRST_DIR>/$FILE <SECOND_DIR>/$FILE > $DIFF; done
Use the correct patch level for the lines starting with +++
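The same idea can be written so that it keeps the directory layout and copes with spaces in file names. This is only a sketch under a few assumptions: the function name split_diff is made up, files that exist only in the second directory are not reported, and file names must not contain newlines.

```shell
#!/bin/sh
# split_diff OLD NEW OUT: write one unified diff per changed file into OUT,
# mirroring the directory layout of OLD.
split_diff() {
    old=$1
    new=$2
    out=$3
    # List the files relative to OLD; the loop itself runs in the original directory.
    ( cd "$old" && find . -type f ) | while IFS= read -r rel; do
        rel=${rel#./}
        other="$new/$rel"
        # A file missing from NEW is compared against an empty file.
        [ -f "$other" ] || other=/dev/null
        # Skip files whose contents are identical.
        cmp -s "$old/$rel" "$other" && continue
        mkdir -p "$out/$(dirname "$rel")"
        # diff exits with status 1 when the files differ, which is expected here.
        diff -u "$old/$rel" "$other" > "$out/$rel.diff" || true
    done
}
```

A call such as split_diff v1 v2 diffs would leave, for example, diffs/src/main.c.diff for a changed v1/src/main.c.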

SED Delete lines and replace with new from file

Have been looking at the SED documentation but need a little pointer in the right direction.
I have 200 files I want to modify in a batch.
Source is html file.
Need to create a new file for the changes.
Want to delete the first part of each file up to the first tag (This is 20 or so lines but can vary slightly).
Then insert the contents of a source file (the same for all files) into the new target file starting at line 1, for 30 or so lines. The number of lines to insert does not match the number that are deleted though.
Hope you can help.
Paul
This can certainly be done with sed(1), but I would probably use the vanilla editor ed(1).
$ cat > bigfix.sh
for i in "$@"; do
ed "$i" << \eof
1,/<tag>/-1d
0r otherfile.html
w
q
eof
done
$ sh bigfix.sh file*.html
This shell script takes arguments and runs ed(1) on each arg. It deletes lines starting from the first and ending on the line right before the one with <tag>. It then puts otherfile.html at the top and writes out the result.
For an individual file:
sed -e '1,/tag/{/tag/r insertfile' -e ';d}' inputfile > outputfile
For many files:
find . -name 'criterion*.ext' -type f -exec sh -c 'sed -e "1,/tag/{/tag/r insertfile" -e ";d}" "$1" > "$1.new"' sh {} \;
Edit:
Fixed the find command to use sh because of the redirection. Note the change in quoting from the previous version.
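If the line containing the tag should be kept, a plainer alternative to the r command is to print the replacement header first and then everything from the first tag line onwards. This is a different technique from the answers above, offered as a sketch; the function name replace_head is hypothetical, and the tag is treated as a basic regular expression that must not contain a slash.

```shell
#!/bin/sh
# replace_head HEADER TAG INPUT: print HEADER, then INPUT from the first
# line matching TAG through the last line.
replace_head() {
    cat "$1"
    # -n suppresses default output; the range /TAG/,$ is printed explicitly.
    sed -n "/$2/,\$p" "$3"
}
```

For a batch of files this could be driven by a loop such as: for f in file*.html; do replace_head otherfile.html '<tag>' "$f" > "$f.new"; done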

Unable to filter rows which contain "Is a directory" by SED/AWK

Running the following command gives me the sample data below:
md5deep find * | awk '{ print $1 }'
A sample of the output
/Users/math/Documents/Articles/Number theory: Is a directory
258fe6853b1bfb2d07f512ff6bec52b1
/Users/math/Documents/Articles/Probability and statistics: Is a directory
4811bfb2ad04b9f4318049c01ebb52ef
8aae4ac3694658cf90005dbdea37b4d5
258fe6853b1bfb2d07f512ff6bec52b1
I have tried to filter the rows which contain Is a directory by SED unsuccessfully
md5deep find * | awk '{ print $1 }' | sed s/\/*//g
Its sample output is
/Users/math/Documents/Articles/Number theory: Is a directory
/Users/math/Documents/Articles/Topology: Is a directory
/Users/math/Documents/Articles/useful: Is a directory
How can I filter out each row which contains "Is a directory" with SED/AWK?
[clarification]
I want to filter out the rows which contain Is a directory.
I have not used the md5deep tool, but I believe those lines are error messages; they would be going to standard error instead of standard out, and so they are going directly to your terminal instead of through the pipe. Thus, they won't be filtered by your sed command. You could filter them by merging your standard error and standard output streams, but
It looks like (I'm not sure because you are missing the backquotes) you are trying to call
md5deep `find *`
and find is returning all of the files and directories.
Some notes on what you might want to do:
It looks like md5deep has a -r for "recursive" option. So, you may want to try:
md5deep -r *
instead of the find command.
If you do wish to use a find command, you can limit it to only files using -type f, instead of files and directories. Also, you don't need to pass * into a find command (which may confuse find if there are files whose names look like options that find understands); passing in . will search recursively through the current directory.
find . -type f
In sed if you wish to use slashes in your pattern, it can be a pain to quote them correctly with \. You can instead choose a different character to delimit your regular expression; sed will use the first character after the s command as a delimiter. Your pattern is also lacking a .; in regular expressions, to indicate one instance of any character you use ., and to indicate "zero or more of the preceding expression" you use *, so .* indicates "zero or more of any character" (this is different from glob patterns, in which * alone means "zero or more of any character").
sed "s|/.*||g"
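To see the effect of the alternative delimiter in isolation, here is a small sketch; the function name strip_from_slash and the sample input are invented for the demonstration.

```shell
#!/bin/sh
# strip_from_slash: delete everything from the first slash to the end of
# each line read on standard input. Because "|" is the delimiter, the "/"
# inside the pattern needs no escaping.
strip_from_slash() {
    sed 's|/.*||'
}

printf '%s\n' 'abc/def/ghi' 'no-slash' | strip_from_slash
```

Lines without a slash pass through unchanged, since the substitution simply does not match them.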
If you really do want to be including your standard error stream in your standard output, so it will pass through the pipe, then you can run:
md5deep `find *` 2>&1 | awk ...
If you just want to ignore stderr, you can redirect that to /dev/null, which is a special file that just discards anything that goes into it:
md5deep `find *` 2>/dev/null | awk ...
In summary, I think the command below will help you with your immediate problem, and the other suggestions listed above may help you if I did not understand what you were looking for:
md5deep -r * | awk '{ print $1 }'
To specifically answer the clarification: how to filter out lines using awk and sed:
awk '/Is a directory/ {next} {print}'
sed '/Is a directory/d'
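The two approaches (discarding standard error, or merging it and filtering) can be compared without md5deep installed. In the sketch below, fake_hasher is a hypothetical stand-in that writes one hash line to standard output and one error message to standard error; the paths in it are made up.

```shell
#!/bin/sh
# fake_hasher stands in for md5deep (hypothetical): one hash on stdout,
# one "Is a directory" message on stderr.
fake_hasher() {
    echo '258fe6853b1bfb2d07f512ff6bec52b1  /tmp/a.txt'
    echo '/tmp/dir: Is a directory' >&2
}

# Option 1: discard the error stream entirely.
hashes_only() {
    fake_hasher 2>/dev/null | awk '{ print $1 }'
}

# Option 2: merge both streams, then delete the unwanted lines.
hashes_filtered() {
    fake_hasher 2>&1 | sed '/Is a directory/d' | awk '{ print $1 }'
}

hashes_only
hashes_filtered
```

Without the 2>&1 in the second function, the error line would bypass the pipe and appear on the terminal no matter what the filter does.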
Why not use grep instead?
ie,
md5deep find * | grep "Is a directory" | awk '{ print $1 }'
Edit: I just re-read your question and if you want to remove the lines with Is a directory, use the -v flag of grep, ie:
md5deep find * | grep -v "Is a directory" | awk '{ print $1 }'
I'm not intimately familiar with md5deep, but this may do something like what you are trying to do.
find . -type f -exec md5sum {} +

DOS to UNIX path substitution within a file

I have a file that contains this kind of paths:
C:\bad\foo.c
C:\good\foo.c
C:\good\bar\foo.c
C:\good\bar\[variable subdir count]\foo.c
And I would like to get the following file:
C:\bad\foo.c
C:/good/foo.c
C:/good/bar/foo.c
C:/good/bar/[variable subdir count]/foo.c
Note that the non matching path should not be modified.
I know how to do this with sed for a fixed number of subdir, but a variable number is giving me trouble. Actually, I would have to use many s/x/y/ expressions (as many as the max depth... not very elegant).
Maybe this can be done with awk, but that kind of magic is beyond my skills.
FYI, I need this trick to correct some gcov binary files on a cygwin platform.
I am dealing with binary files; therefore, I might have the following kind of data:
bindata\bindata%bindataC:\good\foo.c
which should be translated as:
bindata\bindata%bindataC:/good/foo.c
The first \ must not be translated, despite that it is on the same line.
However, I have just checked my .gcno files while editing this text and it looks like all the paths are flanked with zeros, so most of the answers below should fit.
sed -e '/^C:\\good/ s/\\/\//g' input_file.txt
I would recommend you look into the cygpath utility, which converts path names from one format to another. For instance on my machine:
$ cygpath `pwd`
/home/jericson
$ cygpath -w `pwd`
D:\root\home\jericson
$ cygpath -m `pwd`
D:/root/home/jericson
Here's a Perl implementation of what you asked for:
$ echo 'C:\bad\foo.c
C:\good\foo.c
C:\good\bar\foo.c
C:\good\bar\[variable subdir count]\foo.c' | perl -pe 's|\\|/|g if /good/'
C:\bad\foo.c
C:/good/foo.c
C:/good/bar/foo.c
C:/good/bar/[variable subdir count]/foo.c
It works directly with the string, so it will work anywhere. You could combine it with cygpath, but it only works on machines that have that path:
perl -pe '$_ = `cygpath -m $_` if /good/'
(Since I don't have C:\good on my machine, I get output like C:goodfoo.c. If you use a real path on your machine, it ought to work correctly.)
You want to substitute '/' for all '\' but only on the lines that match the good directory path. Both sed and awk will let you do this by having a LHS (matching) expression that only picks the lines with the right path.
A trivial sed script to do this would look like:
/[Cc]:\\good/ s/\\/\//g
For a file:
c:\bad\foo
c:\bad\foo\bar
c:\good\foo
c:\good\foo\bar
You will get the output below:
c:\bad\foo
c:\bad\foo\bar
c:/good/foo
c:/good/foo/bar
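The line-based script converts every backslash on a matching line. For the mixed case from the question, where other data with backslashes precedes the path on the same line, a loop inside sed can restrict the change to the part that starts at C:\good. This is a sketch, assuming an upper-case drive letter and that nothing containing a backslash follows the path on the same line; fix_good_paths is a made-up name.

```shell
#!/bin/sh
# fix_good_paths: convert backslashes to slashes only from C:\good onwards.
fix_good_paths() {
    # 1. Turn the C:\good prefix into C:/good.
    # 2. Repeatedly replace the first backslash that follows a C:/good
    #    prefix, until no substitution succeeds any more.
    sed -e 's|C:\\good|C:/good|g' \
        -e ':a' \
        -e 's|\(C:/good[^\\]*\)\\|\1/|' \
        -e 'ta'
}

printf '%s\n' 'C:\bad\foo.c' 'bindata\bindata%bindataC:\good\foo.c' | fix_good_paths
```

The first sample line is printed unchanged, and on the second only the backslashes after C: are converted.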
Here's how I would do it in awk:
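# fixpaths.awk
# Convert backslashes to slashes on lines that contain C:\good.
/C:\\good/ { gsub(/\\/, "/") }
# Print every line so that non-matching paths are kept unchanged.
{ print }
Then run it using the command:
awk -f fixpaths.awk paths.txt > outfile && mv outfile paths.txt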
Or with some help from good ol' Bash:
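#!/bin/bash
# read -r keeps the backslashes intact; IFS= preserves surrounding whitespace.
while IFS= read -r LINE
do
if <bad_condition>
then
printf '%s\n' "$LINE" >> newfile
else
# Single quotes so that sed receives the backslashes untouched.
printf '%s\n' "$LINE" | sed -e 's/\\/\//g' >> newfile
fi
done < file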
try this
sed -re '/\\good\\/ s/\\/\//g' temp.txt
or this
awk -F"\\" '{if($2=="good"){OFS="/"; $1=$1;} print $0}' temp.txt