Replacing characters in a sh script - sh

I am writing an sh script and need to replace the . and - with a _
Current:
V123_45_678_910.11_1213-1415.sh
Wanted:
V123_45_678_910_11_1213_1415.sh
I have used a few mv commands, but I am having trouble.
for file in /virtualun/rest/scripts/IOL_Extra/*.sh ; do mv $file ${file//V15_IOL_NVMe_01./V15_IOL_NVMe_01_} ; done

You don't need to match any of the other parts of the file name, just the characters you want to replace. To avoid turning foo.sh into foo_sh, remove the extension first, then add it back to the result of the replacement.
for file in /virtualun/rest/scripts/IOL_Extra/*.sh ; do
base=${file%.sh}
mv -i -- "$file" "${base//[-.]/_}".sh
done
Use the -i option to make sure you don't inadvertently replace one file with another when the modified names coincide.
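Note that ${base//[-.]/_} is a bash/ksh/zsh extension rather than POSIX sh syntax. If the script has to run under a strict sh such as dash, the same replacement can be sketched with tr. This is only a sketch: new_name is a hypothetical helper name, and it assumes every file ends in .sh.

```shell
#!/bin/sh
# Hypothetical helper: print the new path for one *.sh file,
# replacing every '.' and '-' in the base name (not the extension) with '_'.
new_name() {
    dir=$(dirname -- "$1")
    base=$(basename -- "$1" .sh)          # strip directory and the .sh suffix
    printf '%s/%s.sh\n' "$dir" "$(printf '%s' "$base" | tr '.-' '__')"
}

new_name /tmp/V123_45_678_910.11_1213-1415.sh
# prints: /tmp/V123_45_678_910_11_1213_1415.sh
```

Inside the loop this would be used as mv -i -- "$file" "$(new_name "$file")".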

This should work:
#!/usr/bin/env sh
# Fail on error
set -o errexit
# Disable undefined variable reference
set -o nounset
# Enable wildcard character expansion
set +o noglob
# ================
# CONFIGURATION
# ================
# Pattern
PATTERN="/virtualun/rest/scripts/IOL_Extra/*.sh"
# ================
# LOGGER
# ================
# Fatal log message
fatal() {
printf '[FATAL] %s\n' "$@" >&2
exit 1
}
# Info log message
info() {
printf '[INFO ] %s\n' "$@"
}
# ================
# MAIN
# ================
{
# Check directory exists
[ -d "$(dirname "$PATTERN")" ] || fatal "Directory '$(dirname "$PATTERN")' does not exist"
for _file in $PATTERN; do
# Skip if not file
[ -f "$_file" ] || continue
info "Analyzing file '$_file'"
# File data
_file_dirname=$(dirname -- "$_file")
_file_basename=$(basename -- "$_file")
_file_name="${_file_basename%.*}"
_file_extension=
case $_file_basename in
*.*) _file_extension=".${_file_basename##*.}" ;;
esac
# New file name
# Inside a bracket expression a backslash is literal, so no escaping is needed
_new_file_name=$(printf '%s\n' "$_file_name" | sed 's/[.-][.-]*/_/g')
# Skip if equals
[ "$_file_name" != "$_new_file_name" ] || continue
# New file
_new_file="$_file_dirname/${_new_file_name}${_file_extension}"
# Rename
info "Renaming file '$_file' to '$_new_file'"
mv -i -- "$_file" "$_new_file"
done
}

You can try this:
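for f in /virtualun/rest/scripts/IOL_Extra/*.sh; do
base=${f%.sh}
mv -- "$f" "$(printf '%s\n' "$base" | sed 's/[.-]/_/g').sh"
done
The sed command replaces every . and - with _. The .sh extension is stripped first and added back afterwards so that it is not turned into _sh.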

I prefer using sed substitute as posted by oliv.
However, if you are not familiar with regular expressions, using rename is faster/easier to understand:
Example:
$ touch V123_45_678_910.11_1213-1415.sh
$ rename -va '.' '_' *sh
`V123_45_678_910.11_1213-1415.sh' -> `V123_45_678_910_11_1213-1415_sh'
$ rename -va '-' '_' *sh
`V123_45_678_910_11_1213-1415_sh' -> `V123_45_678_910_11_1213_1415_sh'
$ rename -vl '_sh' '.sh' *sh
`V123_45_678_910_11_1213_1415_sh' -> `V123_45_678_910_11_1213_1415.sh'
$ ls *sh
V123_45_678_910_11_1213_1415.sh
Options explained:
-v prints the name of the file before -> after the operation
-a replaces all occurrences of the first argument with the second argument
-l replaces the last occurrence of the first argument with the second argument
Note that this might not be suitable depending on the other files you have in the given directory that would match *sh and that you do NOT want to rename.


Subset a string in POSIX shell

I have a variable set in the following format:
var1="word1 word2 word3"
Is it possible to delete one or more of the space-delimited words portably? What I want to achieve is something like this:
when the --ignore option is supplied with the following arguments
$ cmd --ignore word1 # case 1
$ cmd --ignore "word1 word2" # case2
I want var1 to change so that it has only the following value
"word2 word3" # case1
"word3" #case2
If there is no way to achieve the above, is there a way to improve the efficiency of the following for loop? (The $var1 is used in a for loop, so my alternative idea for achieving something similar was the following code.)
# while loop to get argument from options
# argument of `--ignore` is assigned to `$sh_ignore`
for i in $var1
do
# check $i in $sh_ignore instead of other way around
# to avoid unmatch when $sh_ignore has more than 1 word
if ! echo "$sh_ignore" | grep "$i";
then
# normal actions
else
# skipped
fi
done
-------Update-------
After looking around and reading the comment by @chepner, I am now temporarily using the following code (and am looking for improvements):
sh_ignore=''
while :; do
case $1 in
# some other option handling
--ignore)
if [ "$2" ]; then
sh_ignore=$2
shift
else
# defined `die` as print err msg + exit 1
die 'ERROR: "--ignore" requires a non-empty option argument.'
fi
;;
# handling if no arg is supplied to --ignore
# handling -- and unknown opt
esac
shift
done
if [ -n "$sh_ignore" ]; then
for d in $sh_ignore
do
var1="$(echo "$var1" | sed -e "s,$d,,")"
done
fi
# for loop with trimmed $var1 as downstream
for i in $var1
do
# normal actions
done
One method might be:
var1=$(echo "$var1" |
tr ' ' '\n' |
grep -Fxv -e "$(echo "$sh_ignore" | tr ' ' '\n')" |
tr '\n' ' ')
Note: this will leave a trailing blank, which can be trimmed off via var1=${var1% }
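The sed approach in the question matches substrings, so ignoring word1 would also damage word10. As an alternative that needs no external commands, the filtering can be sketched with a case match on whole words. This is a sketch, not a drop-in replacement: remove_words is a hypothetical helper name, and it assumes the ignore list is separated by single spaces and that the words contain no glob characters.

```shell
#!/bin/sh
# remove_words LIST IGNORE
# Print LIST without any word that also appears in IGNORE (whole-word match).
remove_words() {
    result=
    for w in $1; do                      # unquoted on purpose: split on whitespace
        case " $2 " in
            *" $w "*) ;;                 # word is in the ignore list: skip it
            *) result="${result:+$result }$w" ;;
        esac
    done
    printf '%s\n' "$result"
}

var1="word1 word2 word3"
remove_words "$var1" "word1 word2"       # prints: word3
```

In the script above this would replace the sed loop with var1=$(remove_words "$var1" "$sh_ignore").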

How do I make a zsh function autocomplete from the middle of a word?

I use zsh and wrote a function to replace the cd function. With some help I got it to work like I want it to (mostly). This is a follow-up to one of my other questions.
The function almost works like I want it to, but I still have some problems with syntax highlighting and autocompletion.
For the examples, let's say your directories look like this:
/
a/
b/
c/
d/
some_dir/
I am also assuming the following code has been sourced:
cl () {
local first=$( echo $1 | cut -d/ -f1 )
if [ -d $first ]; then
pushd $1 >/dev/null # If the first argument is an existing normal directory, move there
else
pushd ${PWD%/$first/*}/$1 >/dev/null # Otherwise, move to a parent directory or a child of that parent directory
fi
}
_cl() {
_cd
pth=${words[2]}
opts=""
new=${pth##*/}
local expl
# Generate the visual formatting and store it in `$expl`
_description -V ancestor-directories expl 'ancestor directories'
[[ "$pth" != *"/"*"/"* ]] && middle="" || middle="${${pth%/*}#*/}/"
if [[ "$pth" != *"/"* ]]; then
# If this is the start of the path
# In this case we should also show the parent directories
local ancestor=$PWD:h
while (( $#ancestor > 1 )); do
# -f: Treat this as a file (incl. dirs), so you get proper highlighting.
# -Q: Don't quote (escape) any of the characters.
# -W: Specify the parent of the dir we're adding.
# ${ancestor:h}: The parent ("head") of $ancestor.
# ${ancestor:t}: The short name ("tail") of $ancestor.
compadd "$expl[@]" -fQ -W "${ancestor:h}/" - "${ancestor:t}"
# Move on to the next parent.
ancestor=$ancestor:h
done
else
# $first is the first part of the path the user typed in.
# If it is part of the current directory path, we know the user is trying to go back to a directory
first=${pth%%/*}
# $middle is the rest of the provided path
if [ ! -d $first ]; then
# path starts with parent directory
dir=${PWD%/$first/*}/$first
first=$first/
# List all sub directories of the $dir/$middle directory
if [ -d "$dir/$middle" ]; then
for d in $(ls -a $dir/$middle); do
if [ -d $dir/$middle/$d ] && [[ "$d" != "." ]] && [[ "$d" != ".." ]]; then
compadd "$expl[@]" -fQ -W $dir/ - $first$middle$d
fi
done
fi
fi
fi
}
compdef _cl cl
The problem:
In my zshrc, I have a line:
zstyle ':completion:*' matcher-list 'm:{a-z}={A-Za-z}' '+l:|=* r:|=*'
This should make autocompletion case-insensitive and make sure I can start typing the last part of a directory name and it will still finish the full name.
Example:
$ cd /a
$ cd di[tab] # replaces 'di' with 'some_dir/'
$ cl di[tab] # this does not do anything. I would like it to replace 'di' with 'some_dir'
How do I get it to suggest 'some_dir' when I type 'di'?
The second matcher in your matcher-list never gets called, because _cl() returns "true" (exit status 0, actually) even when it has not added any matches. Returning "true" causes _main_complete() to assume that we're done completing and it will thus not try the next matcher in the list.
To fix this, add the following to the start of _cl():
local -i nmatches=$compstate[nmatches]
and this to the end of _cl():
(( compstate[nmatches] > nmatches ))
That way, _cl() will return "true" only when it has managed to actually add completions.

How to remove YAML frontmatter from markdown files?

I have markdown files that contain YAML frontmatter metadata, like this:
---
title: Something Somethingelse
author: Somebody Sometheson
---
But the YAML is of varying lengths. Can I use a POSIX command like sed to remove that frontmatter when it's at the beginning of a file? Something that just removes everything between --- and ---, inclusive, but also ignores the rest of the file, in case there are ---s elsewhere.
I understand your question to mean that you want to remove the first ----enclosed block if it starts at the first line. In that case,
sed '1 { /^---/ { :a N; /\n---/! ba; d} }' filename
This is:
1 { # in the first line
/^---/ { # if it starts with ---
:a # jump label for looping
N # fetch the next line, append to pattern space
/\n---/! ba; # if the result does not contain \n--- (that is, if the last
# fetched line does not begin with ---), go back to :a
d # then delete the whole thing.
}
}
# otherwise drop off the end here and do the default (print
# the line)
Depending on how you want to handle lines that begin with ---abc or so, you may have to change the patterns a little (perhaps add $ at the end to only match when the whole line is ---). I'm a bit unclear on your precise requirements there.
If you want to remove only the front matter, you could simply run:
sed '1{/^---$/!q;};1,/^---$/d' infile
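If the first line doesn't match ---, sed quits immediately (q still prints that first line, so the rest of a file without front matter is dropped from the output); otherwise it deletes everything from the first line up to and including the next line matching ---, i.e. the entire front matter. Use it only on files known to start with front matter.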
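Since q stops sed at the first line when no front matter is present, a variant that passes such files through unchanged may be useful. A minimal awk sketch, assuming the delimiters are lines consisting of exactly --- (strip_front_matter is a hypothetical helper name, not something from the answers here):

```shell
#!/bin/sh
# Hypothetical helper: read a document on stdin and print it without a
# leading ----delimited block. Input that does not start with --- is
# printed unchanged.
strip_front_matter() {
    awk '
        NR == 1 && /^---$/ { in_fm = 1; next }   # opening delimiter on line 1
        in_fm && /^---$/   { in_fm = 0; next }   # closing delimiter
        !in_fm                                   # print everything outside the block
    '
}

printf '%s\n' '---' 'title: Something' '---' '# Heading' '---' 'body' |
    strip_front_matter
# prints:
# # Heading
# ---
# body
```

The later --- survives because the opening delimiter is only recognised on line 1.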
If you don't mind the "or something" being perl.
Simply print after two instances of "---" have been found:
perl -ne 'if ($i > 1) { print } else { /^---/ && $i++ }' yaml
or a bit shorter if you don't mind abusing ?: for flow control:
perl -ne '$i > 1 ? print : /^---/ && $i++' yaml
Be sure to include -i if you want to replace inline.
You can use a bash script: create script.sh, make it executable with chmod +x script.sh, and run it with ./script.sh.
#!/bin/bash
#folder articles contains a lot of markdown files
# glob directly in the loop and quote "$f" so names with spaces survive
for f in ./articles/*.md;
do
#filename
echo "${f##*/}"
#replace frontmatter title attribute to "title"
sed -i -r 's/^title: (.*)$/title: "\1"/' "$f"
#...
done
This AWK-based solution works for files with and without FrontMatter, doing nothing in the latter case.
#!/bin/sh
# Strips YAML FrontMatter from a file (usually Markdown).
# Exit immediately on each error;
# see: https://vaneyckt.io/posts/safer_bash_scripts_with_set_euxo_pipefail/
set -Ee
print_help() {
echo "Strips YAML FrontMatter from a file (usually Markdown)."
echo
echo "Usage:"
echo " `basename $0` -h"
echo " `basename $0` --help"
echo " `basename $0` -i <file-with-front-matter>"
echo " `basename $0` --in-place <file-with-front-matter>"
echo " `basename $0` <file-with-front-matter> <file-to-be-without-front-matter>"
}
replace=false
in_file="-"
out_file="/dev/stdout"
if [ -n "$1" ]
then
if [ "$1" = "-h" ] || [ "$1" = "--help" ]
then
print_help
exit 0
elif [ "$1" = "-i" ] || [ "$1" = "--in-place" ]
then
replace=true
in_file="$2"
out_file="$in_file"
else
in_file="$1"
if [ -n "$2" ]
then
out_file="$2"
fi
fi
fi
tmp_out_file="$out_file"
if $replace
then
tmp_out_file="${in_file}_tmp"
fi
awk '
BEGIN {
is_first_line=1;
in_fm=0;
}
/^---$/ {
if (is_first_line) {
in_fm=1;
}
}
{
if (! in_fm) {
print $0;
}
}
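# The dots must be escaped, otherwise ... matches any three characters
/^(---|\.\.\.)$/ {
if (! is_first_line) {
in_fm=0;
}
}
# Clear the flag after every line, not only after delimiter lines
{
is_first_line=0;
}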
' "$in_file" >> "$tmp_out_file"
if $replace
then
mv "$tmp_out_file" "$out_file"
fi

Print all lines between the search pattern into different files using perl or any method

Could someone help out on this
I want to print all lines between the search patterns (START and END) to different files (the new file name can be any incremental name).
But the search pattern repeats in the file, so each time it finds the pattern it should dump the lines between them into a different file.
The file is something like this
START --- ./body1/b1
##########################
123body1
abcbody1
##########################
END --- ./body1/b1
START --- ./body2/b2
##########################
123body2
defbody2
##########################
END --- ./body2/b2
perl solution,
perl -MFile::Basename -MFile::Path -ne '
($a) = /^START.+?(\S+)$/;
$b = /^END/;
$a..$b or next;
if ($a){ mkpath(dirname $a); open STDOUT,">",$a; }
$a||$b or print;
' file
Here is my awk solution:
# print_between_patterns.awk
/^START/ { filename = $NF ; next } # On START, use the last field as file name
/^END/ { next } # On END, skip
{ print > filename } # For the rest of the lines, print to file
Assume your data file is called data.txt, the following will do what you want:
awk -f print_between_patterns.awk data.txt
Discussion
After the script ran, you will have ./body1, ./body2, and so on.
If you don't want to skip the START and END lines, remove the next commands.
Update
If you want to control the output filename in a sequential way:
/^START/ { filename = sprintf("out%04d.txt", ++count) ; next }
/^END/ { next }
{ print > filename }
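With many blocks, awk implementations other than gawk can run out of open file descriptors, so it may be worth closing each output file before starting the next one. A small sketch of the sequential-name variant with that change; split_blocks and the output directory argument are assumptions made for this illustration:

```shell
#!/bin/sh
# Hypothetical helper: split_blocks INPUT OUTDIR
# Writes the lines between each START/END pair to OUTDIR/out0001.txt, out0002.txt, ...
split_blocks() {
    awk -v dir="$2" '
        /^START/ { if (out) close(out)                 # release the previous file
                   out = sprintf("%s/out%04d.txt", dir, ++n); next }
        /^END/   { out = ""; next }                    # stop writing until next START
        out      { print > out }
    ' "$1"
}

tmp=$(mktemp -d)
printf '%s\n' 'START --- ./body1/b1' '123body1' 'END --- ./body1/b1' \
              'START --- ./body2/b2' '123body2' 'END --- ./body2/b2' > "$tmp/data.txt"
split_blocks "$tmp/data.txt" "$tmp"
cat "$tmp/out0002.txt"    # prints: 123body2
```

Lines outside any START/END pair are ignored here, which differs slightly from the script above.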
To get automatically generated incremental file names:
awk '
/^END/ { inBlock=0 }
inBlock { print > outfile }
/^START/ { inBlock=1; outfile = "outfile" ++count }
' file
To use the file names from your input:
awk '
/^END/ { inBlock=0 }
inBlock { print > outfile }
/^START/ {
inBlock=1
outdir = outfile = $NF
sub(/\/[^\/]+$/,"",outdir)
system("mkdir -p \"" outdir "\"")
}
' file
The problem @JamesBond was having below was that I wasn't escaping the "/" within the character list in the sub() so I've updated my answer above to do that now. There's absolutely no reason why that should need to be escaped but apparently both nawk and /usr/xpg4/bin/awk require it:
$ cat file
the
quick/brown
dog
$ gawk '/[/]/' file
quick/brown
$ nawk '/[/]/' file
nawk: nonterminated character class [
source line number 1
context is
>>> /[/ <<< ]/
$ /usr/xpg4/bin/awk '/[/]/' file
/usr/xpg4/bin/awk: /[/: [ ] imbalance or syntax error Context is:
>>> /[/ <<<
and gawk doesn't care either way:
$ gawk --lint --posix '/[/]/' file
quick/brown
$ gawk --lint '/[/]/' file
quick/brown
$ gawk --lint --posix '/[\/]/' file
quick/brown
$ gawk --lint '/[\/]/' file
quick/brown
They all work just fine if I escape the slash without putting it in a character list:
$ /usr/xpg4/bin/awk '/\//' file
quick/brown
$ nawk '/\//' file
quick/brown
$ gawk '/\//' file
quick/brown
So I guess that's something worth remembering for portability in future!
Using awk:
awk 'sub(/^START/, ""){out=sprintf("out%d", c++); p=1}
sub(/^END/, ""){print > out; p=0} p{print > out}' file
This will find and store each match between START and END in separate files named out0, out1, etc. (c++ starts counting at 0).
This is one way to do it in Bash.
#!/bin/bash
[ -n "$BASH_VERSION" ] || {
echo "You need Bash to run this script."
exit 1
}
shopt -s extglob || {
echo "Unable to enable extglob shell option."
exit 1
}
IFS=$' \t\n' ## Use default.
while read KEY DASH FILENAME; do
if [[ $KEY == START && $DASH == --- && -n $FILENAME ]]; then
CURRENT_FILENAME=$FILENAME
DIRNAME=${FILENAME%%+([^/])}
if [[ -n $DIRNAME ]]; then
mkdir -p "$DIRNAME" || {
echo "Unable to create directory $DIRNAME."
exit 1
}
fi
exec 4>"$CURRENT_FILENAME" || {
echo "Unable to open $CURRENT_FILENAME for output."
exit 1
}
for (( ;; )); do
IFS= read -r LINE || {
echo "End of file reached finding END block of $CURRENT_FILENAME."
exec 4>&-
exit 1
}
read -r KEY DASH FILENAME <<< "$LINE"
if [[ $KEY == END && $DASH == --- && $FILENAME == "$CURRENT_FILENAME" ]]; then
break
else
echo "$LINE" >&4
fi
done
exec 4>&-
fi
done
Make sure you save the script in UNIX file format then run it as bash script.sh < file.
I guess you need to see this.
perl -lne 'print if((/START/../END/) and ($_!~/START/ and $_!~/END/))' your_file
Tested below:
> cat temp
START --- ./body1
##########################
123body1
abcbody1
##########################
END --- ./body1
START --- ./body2
##########################
123body2
defbody2
##########################
END --- ./body2
> perl -lne 'print if((/START/../END/) and ($_!~/START/ and $_!~/END/))' temp
##########################
123body1
abcbody1
##########################
##########################
123body2
defbody2
##########################
>
This might work for you:
csplit -z file '/^START/' '{*}'
Files will be named xx00 xx01 xx..

Insert Header Row using sed

I need to run a bash script via cron to update a file.
The file is a .DAT (similar to csv) and contains pipe separated values.
I need to insert a header row at the top.
Here's what I have so far:
#!/bin/bash
# Grab the file, make a backup and insert the new line
sed -i.bak 1i"red|blue|green|orange|yellow" thefilename.dat
exit
But how can I save the file as a different file name so that it always takes fileA, edits it and then saves it as fileB
Do you really need to rename the old one to xxx.bak, or can you just save a new copy?
Either way, just use redirection.
sed 1i"red|blue|green|orange|yellow" thefilename.dat > newfile.dat
or if you want the .bak as well
sed 1i"red|blue|green|orange|yellow" thefilename.dat > newfile.dat \
&& mv thefilename.dat thefilename.dat.bak
which would create your new file and then, only if the sed completed successfully, rename the original file.
In case anyone finds it useful, here is what I ended up doing....
Grab the original file, convert it to the desired file type, whilst inserting a new header row and making a log of this.
#!/bin/bash -l
####################################
#
# This script inserts a header row in the file $DAT and resaves the file in a different format
#
####################################
#CONFIG
HOME="/home/rootname"
LOGFILE="$HOME/bash-convert/log-$( date '+%b-%d-%y' ).log"
# grab original file
WKDIR="$HOME/public_html/folder1"
# new location to save
NEWDIR="$HOME/public_html/folder2"
# original file to target
DAT="$WKDIR/original.dat"
# file name and type to convert to
NEW="$NEWDIR/original-converted.csv"
####################################
# insert a new header row
HDR="header-row-1|header-row-2|header-row-2 \r"
# and update the log file
{
echo "---------------------------------------------------------" >> $LOGFILE 2>&1
echo "Timestamp: $(date "+%d-%m-%Y: %T") : Starting work" >> $LOGFILE 2>&1
touch "$LOGFILE" || { echo "Can't create logfile -- Exiting." && exit 1 ;} >>"$LOGFILE"
# check if file is writable
sudo chmod 755 -R "$NEW"
echo "Creating file \"$NEW\", and setting permissions."
touch "$NEW" || {
echo "Can't create file \"$NEW\" -- Operation failed - exiting" && exit 1 ;}
} >>"$LOGFILE" 2>&1
{
echo "Prepending line \"$HDR\" to file $NEW."
{ echo "$HDR" ; cat "$DAT" ;} > "$NEW"
{
if [ "$?" -ne "0" ]; then
echo "Something went wrong with the file conversion."
exit 1
else echo "File conversion successful. Operation complete."
fi
}
} >>"$LOGFILE" 2>&1
exit 0
I find the syntax clearer and more consistent with the 'i' placed inside the single quotes, together with the text to insert.
You can simply add a header, and save it in a different file with:
sed '1i header' file > file2
In your case:
sed '1i red|blue|green|orange|yellow' file > file2
If you wanted to save it on the same file, you'd use -i option:
sed -i '1i red|blue|green|orange|yellow' file
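The one-line 1i text form and the -i option are GNU sed extensions. Where the script may run with another sed, the header can be prepended without sed at all. A sketch along those lines, where add_header and the sample file names are hypothetical and the input and output files must be different:

```shell
#!/bin/sh
# Hypothetical helper: add_header HEADER INFILE OUTFILE
# Writes HEADER followed by the contents of INFILE to OUTFILE.
add_header() {
    { printf '%s\n' "$1"; cat -- "$2"; } > "$3"
}

tmp=$(mktemp -d)
printf '%s\n' '1|2|3|4|5' > "$tmp/fileA.dat"
add_header 'red|blue|green|orange|yellow' "$tmp/fileA.dat" "$tmp/fileB.dat"
head -n 1 "$tmp/fileB.dat"    # prints: red|blue|green|orange|yellow
```

fileA.dat is left untouched, which matches the "take fileA, save as fileB" requirement in the question.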