Iterate over stdin in fish (filter music files by genre with grep)

I have this:
for file in **/*.ogg
    if ffprobe "$file" 2>&1 | sed -E -n 's/^ *GENRE *: (.*)/\1/p' | grep -q "$argv"
        echo "$file"
    end
end
but I would like to turn it into a function which will take a list of filenames on standard input:
$ find . -maxdepth 1 -not -type d -exec du -h {} + | cut -f2 | filterByGenre Classical

You could do:
function filterByGenre
    while read -l line
        # the per-file check from your loop
        if ffprobe "$line" 2>&1 | sed -E -n 's/^ *GENRE *: (.*)/\1/p' | grep -q "$argv"
            echo $line
        end
    end
end
or:
function filterByGenre
    set listOfLines (cat)
    for line in $listOfLines
        # do the same per-line work as above
    end
end
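Either way, you can then pipe any list of filenames into the function. For instance, with a plain find instead of the du pipeline above:
find . -maxdepth 1 -name '*.ogg' | filterByGenre Classical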

Bash or Python efficient substring matching and filtering

I have a set of filenames in a directory, some of which are likely to share substrings, though these are not known in advance. This is a sorting exercise: I want to move the files that share the longest common substring into a subdirectory named for the length of that substring, then work down to progressively shorter matches until no matches of 2 or more letters remain. Extensions should be ignored, matching should be case-insensitive, and special characters should be ignored.
Example.
AfricanElephant.jpg
elephant.jpg
grant.png
ant.png
el_gordo.tif
snowbell.png
Working from the longest matches down to the shortest will result in:
./8/AfricanElephant.jpg and ./8/elephant.jpg
./3/grant.png and ./3/ant.png
./2/snowbell.png and ./2/el_gordo.tif
I'm completely lost on an efficient bash or Python way to do what seems a complex sort.
I found some awk code which is almost there:
{
    count = 0
    while (match($0, /elephant/)) {
        count++
        $0 = substr($0, RSTART+1)
    }
    print count
}
where temp.txt contains a list of the files; it is invoked as, e.g.,
awk -f test_match.awk temp.txt
The drawbacks are that (a) this is hardwired to look for "elephant" (I don't know how to make it take an input string, rather than a file, plus a test string to count against), and
(b) I really just want to call a bash function to do the sort as specified.
If I had that, I could wrap some bash script around this core awk to make it work.
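For what it's worth, awk can take the search string as a variable with -v, so it doesn't have to be hardwired. A minimal sketch (the needle name is mine; index() is used instead of match() so the string is treated literally rather than as a regex):
awk -v needle="elephant" '{
    count = 0
    s = $0
    while ((i = index(s, needle)) > 0) {
        count++
        s = substr(s, i + 1)
    }
    print count
}' temp.txt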
function longest_common_substrings () {
    shopt -s nocasematch
    for file1 in * ; do
        for file in * ; do
            if [[ -f "$file1" && -f "$file" && "$file" != "$file1" ]]; then
                base1=$(basename "$file" | cut -d. -f1)
                base2=$(basename "$file1" | cut -d. -f1)
                echo -n "$file $file1 "
                "$HOME/Scripts/longest_common_substring.sh" "$base1" "$base2" | tr -d '\n' | wc -c | awk '{$1=$1;print}'
            fi
        done
    done | sort -r -k3 | awk '{ print $1, $3 }' > /tmp/filesort_substring.txt
    while IFS= read -r line; do
        file_to_move=$(echo "$line" | awk '{ print $1 }')
        directory_to_move_to=$(echo "$line" | awk '{ print $2 }')
        if [[ -f "$file_to_move" ]]; then
            mkdir -p "$directory_to_move_to"
            \gmv -b "$file_to_move" "$directory_to_move_to"
        fi
    done < /tmp/filesort_substring.txt
    shopt -u nocasematch
}
where $HOME/Scripts/longest_common_substring.sh is
#!/bin/bash
# Prints the longest common substring of $1 and $2, case-insensitively.
shopt -s nocasematch
if ((${#1}>${#2})); then
    long=$1 short=$2
else
    long=$2 short=$1
fi
lshort=${#short}
score=0
# Try each start position in the shorter string, extending the candidate
# one character at a time for as long as it still occurs in the longer string.
for ((i=0; i<lshort-score; ++i)); do
    for ((l=score+1; l<=lshort-i; ++l)); do
        sub=${short:i:l}
        [[ $long != *$sub* ]] && break
        subfound=$sub score=$l
    done
done
if ((score)); then
    echo "$subfound"
fi
shopt -u nocasematch
Kudos to the original solution for computing the match, which I found elsewhere on this site.
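For illustration, running the helper on the basenames from the example yields the 8-character match that names the ./8 directory:
$ ~/Scripts/longest_common_substring.sh AfricanElephant elephant
elephant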

Using find with a loop returning files with spaces in their names

Using Cygwin on Windows 10, I am trying to find files in one directory (dir1) that are not in another (dir2), regardless of the file path.
The idea is to loop through all files in dir1 and, for each, launch a find command in dir2 and display only the missing files:
for f in `ls -R /path/to/dir1` ; do
if [ $( find /path/to/dir2 -name "$f" | wc -l ) == 0 ] ; then
echo $f
fi
done
The problem is that some of the file names have spaces in them, and this is causing the find command to fail.
Any ideas?
Could you do this with find and comm? comm -23 suppresses the lines unique to the second input and the lines common to both, so the following should print files in dir1 which aren't in dir2.
comm -23 <(find dir1 -type f -printf '%f\n' | sort -u) <(find dir2 -type f -printf '%f\n' | sort -u)
It works with spaces, too:
$ mkdir dir1 dir2
$ touch dir1/foo dir1/bar
$ touch dir2/foo dir2/baz
$ touch dir1/'foo bar'
$ comm -23 <(find dir1 -type f -printf '%f\n' | sort -u) <(find dir2 -type f -printf '%f\n' | sort -u)
bar
foo bar
For real safety, you should use NUL-terminated strings, so that filenames containing newlines will work.
comm -z23 <(find dir1 -type f -printf '%f\0' | sort -uz) <(find dir2 -type f -printf '%f\0' | sort -uz) | xargs -0 printf '%s\n'
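If you need this more than once, the NUL-safe pipeline wraps naturally in a small function (the name missing_from is mine; comm -z, find -printf, and xargs -r all assume GNU tools):
missing_from() {
    # Print basenames of files under $1 that have no same-named file under $2.
    comm -z23 \
        <(find "$1" -type f -printf '%f\0' | sort -uz) \
        <(find "$2" -type f -printf '%f\0' | sort -uz) |
    xargs -0r printf '%s\n'
}
missing_from dir1 dir2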

Multiple jar file introspection

How do I write a command on the bash shell that can search a number of jar files in a directory for a specified class or path string? E.g.,
I want to search all the workshop.jar files for this path string:
com/bea/workshop/common/util/fileio/ManifestUtil
try this:
find . -name '*.jar' -exec bash -c 'echo "$1" && jar tvf "$1" | grep ServiceMBean' _ {} \;
To find jars containing a pattern (a class or file), create the following findjars script and run it from the root of the search tree:
#!/bin/bash
JAR="$JAVA_HOME/bin/jar"
if [ $# -ne 1 ]; then
    echo "Usage: $0 pattern"
    exit 1
fi
pattern=$(echo "$1" | sed -e 's/\./\//g')
echo "Searching for: [$pattern]"
if [ ! -e "$JAR" ]; then
    echo "$JAR does not exist"
    exit 1
fi
find . -type f \( -name '*.jar' -o -name '*.zip' \) -print | while IFS= read -r file; do
    if "$JAR" tvf "$file" 2>/dev/null | grep "$pattern" 2>/dev/null; then
        echo "$file"
    fi
done
and use it as
findjars com/bea/workshop/common/util/fileio/ManifestUtil
Alternatively, to list all classes and files within all of the jar files within a directory or directory tree:
#!/bin/bash
JAR="$JAVA_HOME/bin/jar"
if [ ! -e "$JAR" ]; then
    echo "$JAR does not exist"
    exit 1
fi
find . -type f \( -name '*.jar' -o -name '*.zip' \) -print | while IFS= read -r file; do
    echo "$file"
    "$JAR" tvf "$file" 2>/dev/null
done
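Since jar files are just zip archives, you can also manage without a JDK at all. A minimal sketch using unzip with the path string from the question:
find . -name '*.jar' -exec sh -c 'unzip -l "$1" | grep -q "com/bea/workshop/common/util/fileio/ManifestUtil" && echo "$1"' _ {} \;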
Use JarScan.
Usage: java -jar jarscan.jar [-help | /?]
[-dir directory name]
[-zip]
[-showProgress]
<-files | -class | -package>
<search string 1> [search string 2]
[search string n]

Using grep with sed and writing a new file based on the results

I'm very new to some of the command line utilities and have been looking for a while for a command that would accomplish my goal.
The goal is to find files that contain a string of text, replace it with a new string, and then write the results to a file that is named the same as the original, but in a different directory.
Obviously this is not working, so I'm asking how those of you who know this stuff would go about it.
grep -rl 'stringToFind' *.* | sed 's|oldString|newString|g' < fileNameFromGrep > ./new/fileNameFromGrep
for f in $(find /YOUR/SEARCH/DIR/ROOT -type f -exec fgrep -l 'stringToFind' {} \;) ; do
    sed 's|oldString|newString|g' < "$f" > "./new/$f"
done
This will do it for you, as long as the filenames contain no whitespace.
If you have spaces in filenames:
find /PATH -type f -print0 | while IFS= read -r -d '' file
do
    fgrep -l 'stringToFind' "$file" && \
    sed 's|oldString|newString|g' < "$file" > "./new/$file"
done
#!/bin/bash
for file in *; do
if grep -qF 'stringToFind' "$file"; then
sed 's/oldString/newString/g' "$file" > "./new/$file"
fi
done
for file in path/to/dir/*
do
    if grep -q 'pattern' "$file"; then
        sed 's/oldString/newString/g' "$file" > /path/to/newdir/"$file"
    fi
done
You can try:
sed -i -e "s/oldString/newString/g" \
    $(grep -Rsi 'pattern' path/to/dir/ | cut -d: -f1)
Note that this edits the matching files in place rather than writing copies to a new directory.
sed:
i  edit files in place
e  add an expression (script) to execute
grep:
R  recursive
s  suppress error messages
i  ignore case distinctions
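Pulling the pieces together, here is a NUL-safe sketch that also recreates subdirectories under ./new (it assumes GNU grep's -Z option and bash's process substitution):
#!/bin/bash
# List files containing the string, NUL-separated to survive odd names,
# then write edited copies under ./new, mirroring the directory layout.
while IFS= read -r -d '' f; do
    mkdir -p "./new/$(dirname "$f")"
    sed 's/oldString/newString/g' "$f" > "./new/$f"
done < <(grep -rlZ 'stringToFind' .)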

How do I find the largest 10 files in a given directory?

How do I find the largest 10 files in a given directory, with Perl or Bash?
EDIT:
I need this to be recursive.
I only want to see large files, no large directories.
I need this to work on Mac OS X 10.6 (with that OS's version of find).
This prints the 10 largest files recursively from the current directory (note that -printf requires GNU find):
find . -type f -printf "%s %p\n" | sort -nr | head -10 | cut -d' ' -f2-
$ alias ducks
alias ducks='du -cs * | sort -rn | head -11'
(head -11 rather than -10 because the first line of the sorted output is du's grand total.)
This is a way to do it in Perl. (Note: this is the non-recursive version, per an earlier revision of the question.)
perl -wE 'say for ((sort { -s $b <=> -s $a } </given/dir/*>)[0..9]);'
However, I'm sure there are better tools for the job.
ETA: Recursive version, using File::Find:
perl -MFile::Find -wE '
    sub wanted { -f && push @files, $File::Find::name };
    find(\&wanted, "/given/dir");
    @files = sort { -s $b <=> -s $a } @files;
    say for @files[0..9];'
To check file sizes, use e.g. printf("%-10s : %s\n", -s, $_) for @files[0..9]; instead.
How about this (note that the awk '{print $5,$NF}' will truncate filenames containing spaces):
find . -type f -exec ls -l {} + | awk '{print $5,$NF}' | sort -nr | head -n 10
Test:
[jaypal:~/Temp] find . -type f -exec ls -l {} + | awk '{print $5,$NF}' | sort -nr | head -n 10
8887 ./backup/GTP/GTP_Parser.sh
8879 ./backup/Backup/GTP_Parser.sh
6791 ./backup/Delete_HIST_US.sh
6785 ./backup/Delete_NORM_US.sh
6725 ./backup/Delete_HIST_NET.sh
6711 ./backup/Delete_NORM_NET.sh
5339 ./backup/GTP/gtpparser.sh
5055 ./backup/GTP/gtpparser3.sh
4830 ./backup/GTP/gtpparser2.sh
3955 ./backup/GTP/temp1.file