sh: add command string output to array - sh

In sh I want to create an array with one element that is the current date and time formatted with spaces.
$ date +"%b %d %H:%m"
Jun 23 16:06
Here are some things that don't work:
$ date +"%b %d %H:%m"
Jun 23 16:06
$ d=`date +"%b %d %H:%m"`
$ echo $d
Jun 23 16:06
$ arr=($d)
$ echo ${#arr}
3
$ arr=("$d")
$ echo ${#arr}
12
$ arr=("`date +"%b %d %H:%m"`")
$ echo ${#arr}
12
$ arr=(`date +"%b %d %H:%m"`)
$ echo ${#arr}
3
$ echo ${arr[2]}
$

You have a working solution already, but merely misinterpreted the test:
$ d=$(date +"%b %d %H:%m")
$ a=("$d")
$ echo ${#a[*]}
1
$ echo ${#a} # Gives length of ${a[0]}, not number of elements in the array
12
$ a=("$d" "second entry")
$ echo ${a[0]}
Jun 23 16:06
$ echo ${a[1]}
second entry
$ echo ${#a[*]}
2
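The difference between the two length expansions can be checked with a fixed string in place of the date output. This is a minimal bash sketch; the variable names stamp, quoted and unquoted are illustrative and do not come from the question.

```bash
#!/usr/bin/env bash
# A fixed, hypothetical timestamp keeps the result reproducible.
stamp='Jun 23 16:06'

quoted=("$stamp")   # quoting keeps the spaces: one element
unquoted=($stamp)   # unquoted expansion is word-split: three elements

echo "${#quoted[@]}"    # number of elements in quoted
echo "${#quoted}"       # length of ${quoted[0]}
echo "${#unquoted[@]}"  # number of elements in unquoted
echo "${#unquoted}"     # length of ${unquoted[0]}, i.e. of "Jun"
```

With this input the four lines print 1, 12, 3 and 3, which is why a length of 12 in the question did not mean the array had 12 elements.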

The strangeness is because these three things are sometimes different:
${#arr}
${#arr[*]}
${#arr[@]}
Also, an unquoted echo $IFS can make it look as if IFS cannot be changed, because the value itself is removed by word splitting:
# echo $IFS
# IFS="."
# echo $IFS
#
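The empty output of an unquoted echo $IFS can be misleading, since the expansion is itself subject to word splitting. A minimal bash sketch, assuming a hypothetical sample string in value, that quotes the expansion and then checks whether splitting follows the new separator:

```bash
#!/usr/bin/env bash
old_ifs=$IFS            # remember the default so it can be restored

IFS=.
printf '[%s]\n' "$IFS"  # quoted, so the new value is not split away

value='a.b.c'           # hypothetical sample data
parts=($value)          # unquoted expansion is split on the new IFS
echo "${#parts[@]}"     # number of fields produced by the split

IFS=$old_ifs            # restore the default separator
```

If the assignment took effect, the first line shows the dot between the brackets and the array holds three fields.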


awk - Call external command and populate output before the first column

I have a file that contains some information about daily storage utilization. There are two columns - DD.MM date and usage in KB for every day.
I'm using awk to show the difference between every second line and the previous one in GB as storage usage increases.
Example file:
20.09 10485760
21.09 20971520
22.09 26214400
23.09 27262976
My awk command:
awk 'NR > 1 {a=($2-prev)/1024^2" GB"} {prev=$2} {print $1,$2,a}' file
This outputs:
20.09 10485760
21.09 20971520 10 GB
22.09 26214400 5 GB
23.09 27262976 1 GB
I would also like to add the weekday name before the first column. The date format in the file is always DD.MM, so, to make GNU date accept it as valid input and return the weekday name, I composed this pipeline:
echo '20.09.2022' | awk -v FS=. -v OFS=- '{print $3,$2,$1}' | date -f - +%a
It works, but I want to call it from the first awk for every processed line, with the first-column date as an argument and ".2022" appended so that it works, and put the output of this external pipeline (the weekday name) before the date in the first column.
Example output:
Tue 20.09 10485760
Wed 21.09 20971520 10 GB
Thu 22.09 26214400 5 GB
Fri 23.09 27262976 1 GB
I looked at the system() option in awk, but I couldn't make it work with my pipeline and my first awk command.
1st solution: Using getline within awk, please try the following.
awk '
NR>1{
a=($2-prev)/1024^2" GB"
}
{
split($1,arr,".")
value="2022-"arr[2]"-"arr[1]
dateVal="date -d \"" value "\" +%a"
newVal = ( (dateVal | getline line) > 0 ? line : "N/A" )
close(dateVal)
print newVal,$0,a
prev=$2
}
' Input_file
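The command-to-getline pattern used above can be tried in isolation. The sketch below swaps date for tr so that it runs without GNU date; prefix_upper is a hypothetical helper name, and the sample input is made up.

```bash
#!/usr/bin/env bash
# prefix_upper: hypothetical helper that prepends an upper-cased copy of
# field 1, obtained from an external command, to every input line.
prefix_upper() {
  awk '{
    cmd = "echo " $1 " | tr a-z A-Z"                 # command built per line
    out = ((cmd | getline line) > 0) ? line : "N/A"  # read its first output line
    close(cmd)                                       # close the pipe before the next record
    print out, $0
  }'
}

printf '%s\n' 'a 1' 'b 2' | prefix_upper
```

Because the field value is pasted into a shell command line, this form is only suitable for trusted input.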
2nd solution: With your shown samples, please try the following awk code. system() in awk runs the given commands in a separate shell, so the call chain is awk --> system --> shell --> commands, and its output is printed by that shell rather than returned to awk, which means it can't be merged with awk's own output directly. Instead, get the weekday names for all lines (based on the 1st field of your Input_file) with one awk, and pass them as input to another awk that does the actual space calculations and merges both. We could also do it with a while loop, but IMHO doing it with awk could be faster.
awk '
FNR==NR{
arr[FNR]=$0
next
}
NR>1{
a=($2-prev)/1024^2" GB"
}
{
print arr[FNR],$1,$2,a
prev=$2
}
' <(awk '{split($1,arr,".");system("d=\"2022-" arr[2]"-"arr[1]"\";date -d \"$d\" +%a")}' Input_file) Input_file
Output with shown samples will be as follows:
Tue 20.09 10485760
Wed 21.09 20971520 10 GB
Thu 22.09 26214400 5 GB
Fri 23.09 27262976 1 GB
Since you have GNU date you should also have GNU awk which has builtin time functions that'll be orders of magnitude faster than awk spawning a subshell to call date for each input line:
$ cat tst.sh
#!/usr/bin/env bash
awk '
BEGIN {
year = strftime("%Y")
}
NR > 1 {
diff = ( ($2 - prev) / (1024 ^ 2) ) " GB"
}
{
split($1,dayMth,/[.]/)
secs = mktime(year " " dayMth[2] " " dayMth[1] " 12 0 0")
day = strftime("%a",secs)
print day, $0, diff
prev = $2
}
' "${@:--}"
$ ./tst.sh file
Tue 20.09 10485760
Wed 21.09 20971520 10 GB
Thu 22.09 26214400 5 GB
Fri 23.09 27262976 1 GB
If for some reason you don't have GNU awk and can't get it then this 2-pass approach would work fairly efficiently using GNU date and any awk:
$ cat tst.sh
#!/usr/bin/env bash
awk -v year="$(date +'%Y')" -v OFS='-' '{
split($1,dayMth,/[.]/)
print year, dayMth[2], dayMth[1]
}' "$@" |
date -f- +'%a' |
awk '
NR == FNR {
days[NR] = $1
next
}
FNR > 1 {
diff = ( ($2 - prev) / (1024 ^ 2) ) " GB"
}
{
print days[FNR], $0, diff
prev = $2
}
' - "$@"
$ ./tst.sh file
Tue 20.09 10485760
Wed 21.09 20971520 10 GB
Thu 22.09 26214400 5 GB
Fri 23.09 27262976 1 GB
The downside to that 2nd script is it couldn't read input from a stream, only from a file, since it has to read it twice. If that's an issue and your input isn't too massive to fit a copy on disk then you could always use a temp file, e.g.:
$ cat tst.sh
#!/usr/bin/env bash
tmp=$(mktemp) &&
trap 'rm -f "$tmp"; exit' 0 &&
cat "${@:--}" > "$tmp" || exit 1
awk -v year="$(date +'%Y')" -v OFS='-' '{
split($1,dayMth,/[.]/)
print year, dayMth[2], dayMth[1]
}' "$tmp" |
date -f- +'%a' |
awk '
NR == FNR {
days[NR] = $1
next
}
FNR > 1 {
diff = ( ($2 - prev) / (1024 ^ 2) ) " GB"
}
{
print days[FNR], $0, diff
prev = $2
}
' - "$tmp"
$ ./tst.sh file
Tue 20.09 10485760
Wed 21.09 20971520 10 GB
Thu 22.09 26214400 5 GB
Fri 23.09 27262976 1 GB
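The NR == FNR merge used in the last two scripts can be looked at on its own. This is a minimal sketch, assuming a hypothetical helper named merge_with_labels that reads one label per line on standard input and pairs it with the same-numbered line of a data file.

```bash
#!/usr/bin/env bash
# merge_with_labels: hypothetical helper.
# usage: label_producer | merge_with_labels data_file
merge_with_labels() {
  awk '
    NR == FNR { label[NR] = $0; next }   # first input (stdin): remember the labels
    { print label[FNR], $0 }             # second input: prefix each line with its label
  ' - "$1"
}

data=$(mktemp)                            # made-up sample data
printf '%s\n' '20.09 10485760' '21.09 20971520' > "$data"
printf '%s\n' Tue Wed | merge_with_labels "$data"
rm -f "$data"
```

One caveat of the idiom: if standard input is empty, NR == FNR stays true for the data file as well, so nothing is printed.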
date can process multiple newline-separated dates, therefore I propose the following solution. Let the content of file.txt be
20.09 10485760
21.09 20971520 10 GB
22.09 26214400 5 GB
23.09 27262976 1 GB
then
awk 'BEGIN{FS="[[:space:].]";OFS="-"}{print "2022",$2,$1}' file.txt | date -f - +%a | paste -d ' ' - file.txt
gives output
Tue 20.09 10485760
Wed 21.09 20971520 10 GB
Thu 22.09 26214400 5 GB
Fri 23.09 27262976 1 GB
Explanation: I use GNU AWK to extract and prepare the dates for consumption by date, so 20.09 becomes 2022-09-20 and so on; then date computes the abbreviated name of the day of the week; then paste puts the columns side by side, separated by a space character. The 1st column is -, meaning standard input, and the 2nd column is the unchanged file.txt.
(tested in GNU Awk 5.0.1 and paste (GNU coreutils) 8.30)
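The date-reformatting step can be checked separately from date and paste. A minimal sketch, assuming a hypothetical helper ddmm_to_iso that takes the year as its argument; it uses split() rather than a bracket-class FS so that it does not depend on GNU awk.

```bash
#!/usr/bin/env bash
# ddmm_to_iso: hypothetical helper; turns a leading DD.MM field into YYYY-MM-DD.
# usage: ddmm_to_iso YEAR < file
ddmm_to_iso() {
  awk -v year="$1" '{
    split($1, dayMth, /[.]/)               # dayMth[1] = day, dayMth[2] = month
    print year "-" dayMth[2] "-" dayMth[1]
  }'
}

printf '%s\n' '20.09 10485760' | ddmm_to_iso 2022
```

The output of this step is what date -f - would then consume, one date per line.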
who says you can't use system() to get the weekday ?
this function also comes with auto gnu-date vs. bsd-date detection,
(by way of gnu-date's ability to return up to nanoseconds precision, something that bsd-date lacks),
and adjusts its calling syntax accordingly
jot -w '2022-09-%d' 30 | gtail -n 12 |
mawk 'function ____(_) {
return \
substr("SunMonTueWedThuFriSat",(_=\
system("exit \140 date -" (\
system("exit \140date +\"%s%6N"\
"\" |grep -cF N\140") ? "j -f " \
"\"%Y-%m-%d\"":"d") " \""(_) \
"\" +%w \140")) +_+_+(_^=_<_),_+_+_)
} ($++NF=____($!_))^_'
2022-09-19 Mon
2022-09-20 Tue
2022-09-21 Wed
2022-09-22 Thu
2022-09-23 Fri
2022-09-24 Sat
2022-09-25 Sun
2022-09-26 Mon
2022-09-27 Tue
2022-09-28 Wed
2022-09-29 Thu
2022-09-30 Fri
system() typically can return you an unsigned integer from 0 to 255 if you explicitly set its exit code to be whatever value you desire,
so as long as the range of values needed is within 256 (or can be binned into it), then one can leverage system() and get the results quicker than a full getline routine.
But since this workaround requires numeric value returns, it wouldn't be able to directly just use the built-in formatting code date +'%a'.
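The exit-status idea can be written in a more readable form. This is a sketch only: weekday_from_status is a hypothetical helper, and the command passed to it is assumed to exit with a weekday index from 0 (Sunday) to 6 (Saturday).

```bash
#!/usr/bin/env bash
# weekday_from_status: hypothetical helper that runs a shell command through
# awk's system() and maps its exit status (0-6) to a weekday abbreviation.
weekday_from_status() {
  awk -v cmd="$1" 'BEGIN {
    rc = system(cmd)                                      # exit status of the command
    print substr("SunMonTueWedThuFriSat", rc * 3 + 1, 3)  # pick the 3-letter name
  }'
}

weekday_from_status 'exit 2'
```

With GNU date, the argument could be something like 'exit $(date -d 2022-09-20 +%w)', so that the numeric weekday travels back to awk as the exit status.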

How do I capture first tuesday in a month with zero padded in Unix

#Unix
I am trying to capture first tuesday of every month into a variable and trying to pad Zero against it without luck.
Below is the piece of code I was trying:
cal | sed -e 's/ \([1-9]\) /0\1 /g' -e 's/ \([1-9]\)$/0\1/' | awk 'NR>2{Sfields=7-NF; if (Sfields == 0 ) {printf "%d\n",$3;exit}}'
Can someone tell me what I am missing here?
This awk should do:
cal | awk 'NR>2 && NF>4 {printf "%02d\n",$(NF-4);exit}'
03
To confirm it's working:
for i in {1..12}; do cal -m $i | awk 'NR>2 && NF>4 {printf "%02d\n",$(NF-4);exit}' ; done
06
03
03
07
05
02
07
04
01
06
03
01
Or you can use ncal
ncal | awk '/Tu/ {printf "%02d\n",$2}'
03
If you like a version where you can specify the name of the weekday,
and that also works if Monday is the first day of the week, then this GNU awk should do:
cal | awk 'NR==2 {for (i=1;i<=NF;i++) {sub(/ /,"",$i);a[$i]=i}} NR>2 {if ($a["Tu"]~/[0-9]/) {printf "%02d\n",$a["Tu"];exit}}' FIELDWIDTHS="3 3 3 3 3 3 3 3"
03
It uses FIELDWIDTHS to make sure empty columns at the start of the month do not change the output.
# for monday calendar
cal -m1 | sed -n '1,2b;/^.\{3\} \{0,1\}\([0-9]\{1,2\}\) .*/ {s//0\1/;s/.*\([0-9]\{2\}\)$/\1/p;q;}'
# for sunday calendar
cal -s1 01 01 2015 | sed -n '1,2b;/^.\{6\} \{0,1\}\([0-9]\{1,2\}\) .*/ {s//0\1/;s/.*\([0-9]\{2\}\)$/\1/p;q;}'
The cal options depend on the system (tested here on Red Hat 6.6): -m means Monday as the first day and -s means Sunday (the attached 1 is for a one-month display). Pick the command that matches the output of your cal.
How the sed script works:
don't print lines by default
skip lines 1 and 2
take the first line with a non-empty second (/third) group
take the number in the second (/third) position up to the next one and prefix it with a 0, removing the trailing characters
take the last 2 digits of the first group, remove the rest and print them
quit (no other line is processed)
Thanks to @Jotne for all the remarks about the first wanted day being in the second week (4th line and not 3rd) and about the first day of the week.
I think I got the answer.
cal | awk 'NR>2{Sfields=7-NF; if (Sfields == 0 ) {printf "%02d\n",$3;exit}}'
The above statement works. "%02d" does it for me.
bash and date. May be slower than parsing cal:
y=2015
for m in {1..12}; do
for d in {01..07}; do
if [[ $(date -d "$y-$m-$d" +%w) -eq 2 ]]; then
echo $d
break
fi
done
done
Translating into awk: will be faster as it doesn't have to call date multiple times:
gawk -v y=2015 '
BEGIN {
for (m=1; m<=12; m++) {
for (d=1; d<=7; d++) {
t = mktime( y " " m " " d " 12 0 0" )
if (strftime("%w", t) == 2) {
printf "%02d\n", d
break
}
}
}
}
'
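Where neither GNU date nor gawk is available, the same loop can be sketched in pure bash arithmetic. Note that this swaps in Sakamoto's day-of-week formula, which is not used in any of the answers above; first_tuesday is a hypothetical function name.

```bash
#!/usr/bin/env bash
# first_tuesday: hypothetical helper; prints the zero-padded day of the
# first Tuesday of a month.
# usage: first_tuesday YEAR MONTH
first_tuesday() {
  local y=$((10#$1)) m=$((10#$2))       # 10# avoids octal parsing of "08"/"09"
  local t=(0 3 2 5 0 3 5 1 4 6 2 4)     # month offsets from Sakamoto's formula
  if (( m < 3 )); then
    y=$(( y - 1 ))                      # January/February count as the previous year
  fi
  # Day of week of the 1st of the month, 0 = Sunday ... 6 = Saturday
  local dow=$(( (y + y/4 - y/100 + y/400 + t[m-1] + 1) % 7 ))
  printf '%02d\n' $(( 1 + (2 - dow + 7) % 7 ))
}

for m in {1..12}; do
  first_tuesday 2015 "$m"
done
```

For 2015 this gives the same twelve values as the cal -m loop shown earlier, starting with 06 for January.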

search and print the value inside tags using script

I have a file like this. abc.txt
<ra><r>12.34</r><e>235</e><a>34.908</a><r>23</r><a>234.09</a><p>234</p><a>23</a></ra>
<hello>sadfaf</hello>
<hi>hiisadf</hi>
<ra><s>asdf</s><qw>345</qw><a>345</a><po>234</po><a>345</a></ra>
What I have to do is find each <ra> tag and, for the <a> tags inside it, store their values in some variables which I need to process further. How should I do this?
The values of the <a> tags inside the <ra> tags are:
34.908,234.09,23
345,345
This awk should do:
cat file
<ra><r>12.34</r><e>235</e><a>34.908</a><r>23</r><a>234.09</a><p>234</p><a>23</a></ra><a>12344</a><ra><e>45</e><a>666</a></ra>
<hello>sadfaf</hello>
<hi>no print from this line</hi><a>256</a>
<ra><s>asdf</s><qw>345</qw><a>345</a><po>234</po><a>345</a></ra>
awk -v RS="<" -F">" '/^ra/,/\/ra/ {if (/^a>/) print $2}' file
34.908
234.09
23
666
345
345
It also handles multiple <ra>...</ra> groups on one line.
A small variation:
awk -v RS=\< -F\> '/\/ra/ {f=0} f&&/^a/ {print $2} /^ra/ {f=1}' file
34.908
234.09
23
666
345
345
How does it work:
awk -v RS="<" -F">" ' # This sets the record separator to <, so every < starts a new record
/^ra/,/\/ra/ { # from the record starting with "ra" to the record containing "/ra" do
if (/^a>/) # if the record starts with "a>" do
print $2}' # print field 2
To see how changing RS works try:
awk -v RS="<" '$1=$1' file
ra>
r>12.34
/r>
e>235
/e>
a>34.908
/a>
r>23
/r>
a>234.09
/a>
p>234
...
To store it in a variable you can do as BMW suggested:
var=$(awk ...)
var=$(awk -v RS=\< -F\> '/\/ra/ {f=0} f&&/^a/ {print $2} /^ra/ {f=1}' file)
echo $var
34.908 234.09 23 666 345 345
echo "$var"
34.908
234.09
23
666
345
345
Since there are many values, you can use an array:
array=($(awk -v RS=\< -F\> '/\/ra/ {f=0} f&&/^a/ {print $2} /^ra/ {f=1}' file))
echo ${array[2]}
23
echo ${array[0]}
34.908
echo ${array[*]}
34.908 234.09 23 666 345 345
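Filling the array through an unquoted command substitution relies on word splitting. A sketch of an alternative, assuming bash 4 or later for mapfile; ra_a_values is a hypothetical wrapper, and it compares the tag name exactly instead of using the /^a/ pattern, so a tag such as <abc> would not match.

```bash
#!/usr/bin/env bash
# ra_a_values: hypothetical helper; prints the content of every <a> tag
# that appears between <ra> and </ra>.
ra_a_values() {
  awk -v RS='<' -F'>' '
    $1 == "/ra"         { inside = 0 }   # closing tag: leave the block
    inside && $1 == "a" { print $2 }     # <a> inside the block: print its text
    $1 == "ra"          { inside = 1 }   # opening tag: enter the block
  '
}

# Made-up sample line with one <a> outside the <ra> block.
sample='<ra><r>12.34</r><a>34.908</a><a>23</a></ra><a>12344</a>'

mapfile -t values < <(printf '%s\n' "$sample" | ra_a_values)
echo "${#values[@]}"   # how many values were captured
echo "${values[1]}"    # second captured value
```

mapfile -t stores one output line per array element, so values that contain spaces would stay intact.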
Use GNU grep's lookahead and lookbehind zero-length assertions:
grep -oP "(?<=<ra>).*?(?=</ra>)" file |grep -Po "(?<=<a>).*?(?=</a>)"
Explanation:
The first grep gets the content of each ra tag. Even if there are several ra tags on one line, each one is still identified.
The second grep gets the content of each a tag.

What goes under the hood of `mkvirtualenv` command?

I am curious about what happens under the hood of the mkvirtualenv command and so I am trying to understand how it calls virtualenv.
The lowest hanging fruit is to figure where the virtualenv program is located after installation and where the mkvirtualenv program is located after installation. So:-
Calvins-MacBook-Pro.local ttys006 Mon Apr 23 12:31:07 |~|
calvin$ which mkvirtualenv
Calvins-MacBook-Pro.local ttys006 Mon Apr 23 12:31:10 |~|
calvin$ which virtualenv
/opt/local/library/Frameworks/Python.framework/Versions/2.7/bin/virtualenv
So the strange thing I see here is that which mkvirtualenv does not give any result. Why?
Digging further, in the virtualenvwrapper directory after installing it, I see only 3 python files:-
Calvins-MacBook-Pro.local ttys004 Mon Apr 23 12:28:05 |/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/virtualenvwrapper|
calvin$ ls -la
total 88
drwxr-xr-x 8 root wheel 272 Apr 13 15:07 .
drwxr-xr-x 29 root wheel 986 Apr 15 00:55 ..
-rw-r--r-- 1 root wheel 5292 Apr 13 15:05 hook_loader.py
-rw-r--r-- 1 root wheel 4810 Apr 13 15:07 hook_loader.pyc
-rw-r--r-- 1 root wheel 1390 Apr 13 15:05 project.py
-rw-r--r-- 1 root wheel 2615 Apr 13 15:07 project.pyc
-rw-r--r-- 1 root wheel 7381 Apr 13 15:05 user_scripts.py
-rw-r--r-- 1 root wheel 11472 Apr 13 15:07 user_scripts.pyc
And I suppose that the only reason why mkvirtualenv is now available in my terminal is because I have added a source /opt/local/library/Frameworks/Python.framework/Versions/2.7/bin/virtualenvwrapper.sh line. So, answering the question I asked earlier, this is simply because mkvirtualenv is defined as a bash function and is available in my terminal because I have sourced virtualenvwrapper.sh in my .bashrc or .bash_profile files.
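One way to confirm this is bash's type builtin, which reports shell functions, whereas which only searches PATH for executable files. A minimal sketch with a hypothetical function demo_fn standing in for mkvirtualenv:

```bash
#!/usr/bin/env bash
# demo_fn is a hypothetical function standing in for mkvirtualenv.
demo_fn() { echo "created $1"; }

type -t demo_fn   # reports the kind of command: function, builtin, file, ...
```

Running type mkvirtualenv in a shell that has sourced virtualenvwrapper.sh should likewise identify it as a function.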
Digging into the virtualenvwrapper.sh script, I see
# Create a new environment, in the WORKON_HOME.
#
# Usage: mkvirtualenv [options] ENVNAME
# (where the options are passed directly to virtualenv)
#
function mkvirtualenv {
typeset -a in_args
typeset -a out_args
typeset -i i
typeset tst
typeset a
typeset envname
typeset requirements
typeset packages
in_args=( "$@" )
if [ -n "$ZSH_VERSION" ]
then
i=1
tst="-le"
else
i=0
tst="-lt"
fi
while [ $i $tst $# ]
do
a="${in_args[$i]}"
# echo "arg $i : $a"
case "$a" in
-a)
i=$(( $i + 1 ));
project="${in_args[$i]}";;
-h)
mkvirtualenv_help;
return;;
-i)
i=$(( $i + 1 ));
packages="$packages ${in_args[$i]}";;
-r)
i=$(( $i + 1 ));
requirements="${in_args[$i]}";;
*)
if [ ${#out_args} -gt 0 ]
then
out_args=( "${out_args[@]-}" "$a" )
else
out_args=( "$a" )
fi;;
esac
i=$(( $i + 1 ))
done
set -- "${out_args[@]}"
eval "envname=\$$#"
virtualenvwrapper_verify_workon_home || return 1
virtualenvwrapper_verify_virtualenv || return 1
(
[ -n "$ZSH_VERSION" ] && setopt SH_WORD_SPLIT
\cd "$WORKON_HOME" &&
"$VIRTUALENVWRAPPER_VIRTUALENV" $VIRTUALENVWRAPPER_VIRTUALENV_ARGS "$@" &&
[ -d "$WORKON_HOME/$envname" ] && \
virtualenvwrapper_run_hook "pre_mkvirtualenv" "$envname"
)
typeset RC=$?
[ $RC -ne 0 ] && return $RC
# If they passed a help option or got an error from virtualenv,
# the environment won't exist. Use that to tell whether
# we should switch to the environment and run the hook.
[ ! -d "$WORKON_HOME/$envname" ] && return 0
# If they gave us a project directory, set it up now
# so the activate hooks can find it.
if [ ! -z "$project" ]
then
setvirtualenvproject "$WORKON_HOME/$envname" "$project"
fi
# Now activate the new environment
workon "$envname"
if [ ! -z "$requirements" ]
then
pip install -r "$requirements"
fi
for a in $packages
do
pip install $a
done
virtualenvwrapper_run_hook "post_mkvirtualenv"
}
Here's what I don't understand yet: I don't see any direct reference to virtualenv in this bash function. So how exactly does the bash function mkvirtualenv pass the arguments from the command line (e.g. mkvirtualenv -p python2.7 --no-site-packages mynewproject) to the python virtualenv program?
So this is the line that does the trick.
(
[ -n "$ZSH_VERSION" ] && setopt SH_WORD_SPLIT
\cd "$WORKON_HOME" &&
"$VIRTUALENVWRAPPER_VIRTUALENV" $VIRTUALENVWRAPPER_VIRTUALENV_ARGS "$@" &&
[ -d "$WORKON_HOME/$envname" ] && \
virtualenvwrapper_run_hook "pre_mkvirtualenv" "$envname"
)
$VIRTUALENVWRAPPER_VIRTUALENV is in fact the location of where the current virtualenv program resides.
In terminal,
Calvins-MacBook-Pro.local ttys004 Mon Apr 23 13:24:14 |~|
calvin$ which $VIRTUALENVWRAPPER_VIRTUALENV
/opt/local/library/Frameworks/Python.framework/Versions/2.7/bin/virtualenv
Mystery solved.
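The forwarding mechanism can be reproduced on a small scale. In this sketch TOOL and TOOL_ARGS are hypothetical stand-ins for VIRTUALENVWRAPPER_VIRTUALENV and VIRTUALENVWRAPPER_VIRTUALENV_ARGS, and printf plays the part of the virtualenv program so that the forwarded arguments become visible.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins: TOOL names the program to run, TOOL_ARGS holds
# extra default arguments for it.
TOOL=printf
TOOL_ARGS='%s|'

wrapper() {
  # TOOL_ARGS is deliberately unquoted so it can expand to several words;
  # "$@" forwards each caller argument as exactly one word.
  "$TOOL" $TOOL_ARGS "$@"
}

wrapper -p python2.7 'my env'
echo
```

Each forwarded argument is printed followed by a bar, which shows that 'my env' arrives as a single word.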

Collect numerals at the beginning of the file

I have a text file which contains some numerals, for example,
There are 60 nuts and 35 apples,
but only 24 pears.
I want to collect these numerals (60, 35, 24) at the beginning of the same file, in particular, I want after processing, the file to read
read "60"
read "35"
read "24"
There are 60 nuts and 35 apples,
but only 24 pears.
How could I do this using one of the text manipulating tools available in *nix?
You can script an ed session to edit the file in place:
{
echo 0a # insert text at the beginning of the file
grep -o '[0-9]\+' nums.txt | sed 's/.*/read "&"/'
echo ""
echo . # end insert mode
echo w # save
echo q # quit
} | ed nums.txt
More succinctly:
printf "%s\n" 0a "$(grep -o '[0-9]\+' nums.txt|sed 's/.*/read "&"/')" "" . w q | ed nums.txt
One way to do it is:
egrep -o '[0-9]+' input | sed -re 's/([0-9]+)/read "\1"/' > /tmp/foo
cat input >> /tmp/foo
mv /tmp/foo input
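The temp-file variant can be wrapped in a function that picks a unique temporary name instead of the fixed /tmp/foo. A minimal sketch, assuming a grep that supports -o (as the answers above do); prepend_numerals is a hypothetical name.

```bash
#!/usr/bin/env bash
# prepend_numerals: hypothetical helper; inserts one read "N" line per
# numeral at the top of the given file.
prepend_numerals() {
  local file=$1 tmp
  tmp=$(mktemp) || return 1
  {
    grep -o '[0-9][0-9]*' "$file" | sed 's/.*/read "&"/'  # one line per numeral
    cat "$file"                                           # then the original text
  } > "$tmp" &&
    cat "$tmp" > "$file"     # write back in place, keeping the file's permissions
  local rc=$?
  rm -f "$tmp"
  return "$rc"
}

nums=$(mktemp)               # made-up sample file
printf '%s\n' 'There are 60 nuts and 35 apples,' 'but only 24 pears.' > "$nums"
prepend_numerals "$nums"
cat "$nums"
rm -f "$nums"
```

Unlike the ed version, this does not add an empty line between the read lines and the original text.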