fish shell: Is it possible to conveniently strip extensions?

Is there any convenient way to strip an arbitrary extension from a file name, something à la bash's ${i%%.*}? Or do I stick with my friend sed?

If you know the suffix (e.g. _bak, a common use case), this is possibly more convenient:
for f in *_bak
    mv $f (basename $f _bak)
end

Nope. fish has a much smaller feature set than bash and relies on external commands for this kind of thing:
$ set filename foo.bar.baz
$ set rootname (echo $filename | sed 's/\.[^.]*$//')
$ echo $rootname
foo.bar

You can strip off the extension from a filename using the string command:
echo (string split -r -m1 . $filename)[1]
This will split $filename at the right-most dot and print the first element of the resulting list. If there is no dot, that list will contain a single element, $filename itself.
If you also need to strip off leading directories, combine it with basename:
echo (basename $filename | string split -r -m1 .)[1]
In this example, string reads its input from stdin rather than being passed the filename as a command line argument.
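If you do this a lot, you can wrap the two steps in a small function. A minimal sketch (the name strip_ext is just an example, not a builtin):
function strip_ext --description "print each filename without its last extension"
    for arg in $argv
        echo (basename $arg | string split -r -m1 .)[1]
    end
end
Then strip_ext foo.bar.baz prints foo.bar.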

--- Update 2022-08-02 ---
As of fish 3.5, there is a path command (see the fish docs) which was designed to handle things like stripping extensions:
$ touch test.txt.bak
$ path change-extension '' ./test.txt.bak
test.txt
You can also strip a set number of extensions:
set --local file ./test.txt.1.2.3
for i in (seq 3)
set file (path change-extension '' $file)
end
echo $file
# ./test.txt
Or strip all extensions:
set --local file ./test.txt.1.2.3
while path extension $file
set file (path change-extension '' $file)
end
echo $file
# ./test
--- Original answer ---
The fish string command is still the canonical way to handle this. It has some really nice subcommands that haven't been shown in other answers yet.
split lets you split from the right with a max of 1, so that you just get the last extension.
for f in *
echo (string split -m1 -r '.' "$f")[1]
end
replace lets you use a regex to lop off the extension, defined here as everything from the final dot to the end of the string.
for f in *
string replace -r '\.[^\.]*$' '' "$f"
end
man string for more info and some great examples.
Update:
If your system has proper basename and dirname utilities, you can use something like this:
function stripext \
    --description "strip file extension"
    for arg in $argv
        echo (dirname $arg)/(string replace -r '\.[^\.]+$' '' (basename $arg))
    end
end

With the string match function built into fish you can do
set rootname (string match -r "(.*)\.[^\.]*\$" $filename)[2]
string match returns a list of two items: the first is the whole string that matched, and the second is the first regex capture group (the part inside the parentheses). So we grab the second one with [2].
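For example, with the filename used earlier in the thread:
$ set filename foo.bar.baz
$ string match -r "(.*)\.[^\.]*\$" $filename
foo.bar.baz
foo.bar
$ set rootname (string match -r "(.*)\.[^\.]*\$" $filename)[2]
$ echo $rootname
foo.bar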

I too needed a function to split arbitrary file paths into root and extension. Rather than naively re-implementing the feature and risking the usual caveats (e.g. a dot before the directory separator), I forward the task to Python's built-in path libraries and inherit their expertise.
Here is a humble example of what one may prefer:
function splitext --description "Print filepath(s) root, stem or extension"
    argparse 'e/ext' 's/stem' -- $argv
    for arg in $argv
        if set -q _flag_ext
            set cmd 'import os' \
                "_, ext = os.path.splitext('$arg')" \
                'print(ext)'
        else if set -q _flag_stem
            set cmd 'from pathlib import Path' \
                "p = Path('$arg')" \
                'print(p.stem)'
        else
            set cmd 'import os' \
                "root, _ = os.path.splitext('$arg')" \
                'print(root)'
        end
        python3 -c (string join ';' $cmd)
    end
end
Examples:
$ splitext /this/is.a/test.path
/this/is.a/test
$ splitext --ext /this/is.a/test.path
.path
$ splitext --stem /this/is.a/test.path
test
$ splitext /this/is.another/test
/this/is.another/test

How to Find & Replace a String Within Files with Find / Grep / Sed

I have a folder of 500 *.INI files that I need to manually edit. Within each INI file, I have the line Source =. I would like that line to become Source = C:\software\{filename}.
For instance, a dx4.ini file would need to be fixed to become: Source = C:\software\dx4
Is there a quick way to do this with Find, Grep, or Sed functions?
You can try it with sed.
For example, input file (file.txt) contents:
Source =
some lines..
script:
newstring='Source = C:\software\dx4'
oldstring='Source ='
sed "s/$oldstring/$newstring/g" file.txt > file.tmp && mv file.tmp file.txt
After running the above commands
output:
Source = C:\software\dx4
some lines..
If you want to edit a file in a script, I think ed is the way to go. Combined with a shell for loop:
for file in *.INI; do
base=$(basename "$file" .INI)
ed -s "$file" <<EOF
/^Source =/s/=/= C:\\\\software\\\\$base/
w
EOF
done
(This does assume that filenames will not have newlines or ampersands in their names)
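If your sed supports in-place editing (GNU sed's -i), a similar loop with sed is possible. A rough sketch of the same idea, untested on your files, which rewrites the whole Source line rather than substituting after the =:
for file in *.INI; do
    base=$(basename "$file" .INI)
    sed -i "s|^Source =.*|Source = C:\\\\software\\\\$base|" "$file"
done
It shares the same caveat: a backslash, & or | in a file name would break the substitution.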
With GNU awk for the 3rd arg to match(), gensub(), and "inplace" editing:
awk -i inplace '
match($0,/(.*Source = C:\\software\\){filename}(.*)/,a) {
fname = gensub(/\..*/,"",1,FILENAME)
$0 = a[1] fname a[2]
}
1' *.INI
The above assumes you're running in a UNIX environment, though your use of the term folder instead of directory, and that path starting with C: and containing backslashes, makes me suspicious. If you're on Windows, save the part between the two single quotes (exclusive) in a file named foo.awk and execute it as awk -i inplace -f foo.awk *.INI, or however it is you normally run commands like this on Windows.
find . -name "*.INI" -type f > stack
while read -r line
do
    base=$(basename "$line" .INI)
    sed -i "s#Source =#Source = C:\\\\software\\\\$base#" "$line"
done < stack
Assuming that a) you have a sed with "-i" (the in-place edit flag, which AFAIK is not always portable) and b) sed doesn't choke on the double escape sequence, I think that will work.

Failed to open the file

I am trying to figure out the picture date of files in a folder structure. Some of the folder names contain whitespace. I tried adding quotes, but it doesn't work.
Can anyone give me a hint?
find . -name "*.jpg" -or -name "*.JPG" >> files.txt
sed -e "s/\(.*\)/'\1'/" files.txt >> files2.txt
for fn in `cat files2.txt`; do
DATEI=$( echo "$fn" | cut -c 3-)
EXIF=$(/usr/bin/exiv2 -pa --grep DateTimeOriginal "'"$PWD$DATEI | awk -F" " '{print $4","$5}')
if [ -z "$EXIF" ]
then
:
else
echo "$PWD$DATEI,$EXIF" >> ausgabe.csv
fi
done
echo "DONE!"
EDIT: This is the output that I get:
'/volume1/Intern/path/to/images/IMG_4206.jpg': Failed to open the file
I take it your result is supposed to look like
"/path/to/photo1.jpg","2017:01:15","22:19:15"
"/path/to/another/photo.JPG","2017:01:15","22:10:01"
The absolute path to the photo, then the DateTimeOriginal date/time, all in quotes.
exiv2 can actually take multiple photos in the file argument, so the whole process can be simplified to a pipeline of two commands:
# Need this for the fileglob
shopt -s globstar extglob
exiv2 -pa -g DateTimeOriginal **/*.#(jpg|JPG) |
awk -v pwd="$PWD/" -v dq='"' -v OFS=',' '{
fn = substr($0, 1, match($0, / *Exif\.Photo/)-1)
print dq pwd fn dq, dq $(NF-1) dq, dq $NF dq
}'
The shell options, globstar and extglob, enable the **/*.#(jpg|JPG) expression, which returns all files ending in jpg or JPG for the whole directory tree.
exiv2 returns only something for the files that contain DateTimeOriginal data. The intermediate output looks something like this (some whitespace removed):
dir1/photo1.jpg Exif.Photo.DateTimeOriginal Ascii 20 2017:01:22 10:20:36
dir1/photo3.jpg Exif.Photo.DateTimeOriginal Ascii 20 2017:01:22 10:20:36
dir with space/photo2.JPG Exif.Photo.DateTimeOriginal Ascii 20 2017:01:22 10:20:38
dir with space/photo4.JPG Exif.Photo.DateTimeOriginal Ascii 20 2017:01:22 10:40:09
photo5.jpg Exif.Photo.DateTimeOriginal Ascii 20 2017:01:24 20:06:38
photo6.JPG Exif.Photo.DateTimeOriginal Ascii 20 2017:01:22 10:00:55
This would be straightforward with awk, were it not for the paths with spaces as mentioned in the question. The exiv2 output is space separated, and there doesn't seem to be an option to get tabs, so some awk trickery is required:
The path of the current directory, followed by a slash, is passed into the command using -v pwd="$PWD/".
To avoid messy escaping, we define the double quote with -v dq='"'.
The output field separator is set to a comma with -v OFS=','.
To get the filename, we search for the index of a series of spaces followed by Exif.Photo, then we assign a substring that ends just before that index to fn.
To print quoted and comma separated, we use our dq variable, prepend the filename with the path from pwd, and use $(NF-1) and $NF to get the second to last and last field, respectively.
The result is something like
"/home/benjamin/tinker/space dir/dir1/photo1.jpg","2017:01:22","10:20:36"
"/home/benjamin/tinker/space dir/dir1/photo3.jpg","2017:01:22","10:20:36"
"/home/benjamin/tinker/space dir/dir with space/photo2.JPG","2017:01:22","10:20:38"
"/home/benjamin/tinker/space dir/dir with space/photo4.JPG","2017:01:22","10:40:09"
"/home/benjamin/tinker/space dir/photo5.jpg","2017:01:24","20:06:38"
"/home/benjamin/tinker/space dir/photo6.JPG","2017:01:22","10:00:55"
To get this into a file, a redirection > ausgabe.csv has to be appended to the command.
As for why your command didn't work: let's look at a single file as an example. After the sed step, you have something like './photo5.jpg'. Now, you use cut -c 3-, which gives you /photo5.jpg'.
In your EXIF line, you add another single quote, so now exiv2 is looking for a file literally called '/photo5.jpg', including the single quotes – which doesn't exist.
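So the simplest fix for the original loop is to skip the quoting step entirely and quote the expansion instead. A minimal sketch of that idea, reusing the same exiv2 and awk options as in the question:
find . -name "*.jpg" -o -name "*.JPG" > files.txt
while IFS= read -r fn; do
    EXIF=$(/usr/bin/exiv2 -pa --grep DateTimeOriginal "$fn" | awk '{print $4","$5}')
    if [ -n "$EXIF" ]; then
        echo "$PWD/${fn#./},$EXIF" >> ausgabe.csv
    fi
done < files.txt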

Find and replace variables with SED in a file

I would like to find and replace variables in a conf file in KSH (with sed).
My question is: what is the correct regex pattern to identify KSH variables (like $toto or ${toto}), taking into account that variable names can contain special characters?
Here is an example:
Let's say I have var_1=value1 and var_2=value2 in my current shell (grepable in export -p).
The configuration file before find and replace
PARAM1=$var_1/tata.txt
PARAM2=${var_2}/tata.txt
The configuration file after find and replace
PARAM1=value1/tata.txt
PARAM2=value2/tata.txt
What I have to do:
Find $var_1 and ${var_2} in the conf file (with a generic regex; I don't know in advance whether there are variables in the conf file)
Search for the value of each with export -p | grep var_1 and export -p | grep var_2
Replace these 2 variables in the conf file with the values found by the previous search
Thank you for your answers.
This is one of those cases where the evil and dangerous eval would come in handy:
while read -r line; do
    eval echo "$line"
done < inputfile > outputfile
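For example, with the variables from the question set in the current shell (no export needed, since eval runs in the same shell; "conf" is just an assumed name for your config file):
$ var_1=value1 var_2=value2
$ while read -r line; do eval echo "$line"; done < conf > conf.out
$ cat conf.out
PARAM1=value1/tata.txt
PARAM2=value2/tata.txt
Keep in mind that eval will just as happily execute command substitutions or any other code embedded in the file, which is why it is evil and dangerous.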
Using awk (I do see you are asking for sed)
awk '{gsub(/\${?var_1}?/,x);gsub(/\${?var_2}?/,y)}1' x="$var_1" y="$var_2" file
PARAM1=value1/tata.txt
PARAM2=value2/tata.txt
Src=/tmp/sed.Source
FIn=/tmp/sed.In
FOut=/tmp/sed.Out
# take all variables as source, but the list could be reduced (highly recommended)
set > ${Src}
cp YourFile ${FIn}
sed 's/=/ /' ${Src} | while read vName vValue
do
    sed "s/\$$vName/$vValue/g;s/\${$vName}/$vValue/g" ${FIn} > ${FOut}
    mv ${FOut} ${FIn}
done
#rm ${Src}
cat ${FIn}
#rm ${FIn}
The main concern is when the content (value) of a variable contains regex/sed special characters like / . *, so this is not the best solution in that case.
It could also be done in sed alone, with a preload of the variable values and a marker, then a recursive replacement.
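A common workaround is to escape the value before handing it to sed. A sketch of that idea, added inside the loop before the inner sed call (it escapes / and & in the replacement; values containing newlines are still a problem):
vValue=$(printf '%s\n' "$vValue" | sed -e 's/[\/&]/\\&/g')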
You can try this one-liner workaround:
cat config
PARAM1=$var_1/tata.txt
PARAM2=${var_2}/tata.txt
( export var_1=value1; export var_2=value2; bash -x ./config 2>&1 >/dev/null|sed 's/^+ //' )
PARAM1=value1/tata.txt
PARAM2=value2/tata.txt

perl -pe to manipulate filenames

I was trying to do some quick filename cleanup at the shell (zsh, if it matters). Renaming files. (I'm using cp instead of mv just to be safe)
foreach f (\#*.ogg)
cp $f `echo $f | perl -pe 's/\#\d+ (.+)$/"\1"/'`
end
Now, I know there are tools to do stuff like this, but for personal interest I'm wondering how I can do it this way. Right now, I get an error:
cp: target `When.ogg"' is not a directory
Where 'When.ogg' is the last part of the filename. I've tried adding quotes (see above) and escaping the spaces, but nonetheless this is what I get.
Is there a reason I can't use the output of a perl -pe substitution as the final argument to another command line tool?
It looks like you have a space in the file names being processed, so each of your cp command lines evaluates to something like
cp \#nnnn When.Ogg When.ogg
When the cp command sees more than two arguments, the last one must be a target directory name for all the files to be copied to - hence the error message. Because your source filename ($f) contains a space it is being treated as two arguments - cp sees three args, rather than the two you intend.
If you put double quotes around the first $f that should prevent the two 'halves' of the name from being treated as separate file names:
cp "$f" `echo ...
This is what you need in bash, hope it's good for zsh too.
cp "$f" "`echo $f | perl -pe 's/\#\d+ (.+)$/\1/'`"
If the filename contains spaces, you also have quote the second argument of cp.
I often use
dir /b ... | perl -nle"$o=$_; s/.../.../; $n=$_; rename $o,$n if !-e $n"
The -l chomps the input.
The -e check is to avoid accidentally renaming all the files to one name. I've done that a couple of times.
In zsh (bash would need a for f in ...; do ...; done loop instead), that would be
foreach f (...)
echo "$f" | perl -nle'$o=$_; s/.../.../; $n=$_; rename $o,$n if !-e $n'
end
or
find -name '...' -maxdepth 1 \
| perl -nle'$o=$_; s/.../.../; $n=$_; rename $o,$n if !-e $n'
or
find -name '...' -maxdepth 1 -exec \
perl -e'for (@ARGV) {
    $o=$_; s/.../.../; $n=$_;
    rename $o,$n if !-e $n;
}' {} +
The last supports file names with newlines in them.

I want to use sed to replace every occurrence of /dir with $dir (replace / with $) in every script in a directory

use sed to replace every occurrence of /dir with $dir (replace / with $) in every script in a directory.
sed "s#/dir#$dir#g"
The $ keeps being interpreted as a function or variable call.
Is there a way around this?
thanks
Read your shell's friendly manual:
man sh
In the shell, "double quotes" around text allow variable interpretation inside, while 'single quotes' do not, a convention adopted by later languages such as Perl and PHP (but not e.g. JavaScript).
sed 's#/dir#$dir#g' *
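To see why the quotes matter (dir=/usr/local here is just an illustrative value that might be lurking in your environment):
$ dir=/usr/local
$ echo "s#/dir#$dir#g"
s#/dir#/usr/local#g
$ echo 's#/dir#$dir#g'
s#/dir#$dir#g
With single quotes the shell passes the $ through untouched, which is exactly what you want sed to receive.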
To perform the replacement within the scripts do something like
find * -maxdepth 0 -type f | while read -r f; do mv "$f" "$f.old" && sed 's#/dir#$dir#' "$f.old" > "$f"; done
or just
perl -pi.old -e 's#/dir#\$dir#' * # Perl also interpolates variables in s commands
You can simply escape it with a backslash:
sed "s#/dir#\$dir#g"
shell approach:
for file in file*
do
    if [ -f "$file" ]; then
        > temp
        while read -r line
        do
            case "$line" in
                */dir* ) line=${line//\/dir/\$dir} ;;
            esac
            echo "$line" >> temp
        done < "$file"
        mv temp "$file"
    fi
done