Lowercase to Uppercase of character in Shell Scripting [duplicate] - perl

I have a value such as james,adam,john and I am trying to turn it into James,Adam,John (the first character of each name should be uppercase).
echo 'james,adam,john' | sed 's/\<./\u&/g'
does not work on all systems: it gives the right result on one system but not on another.
A="james adam john"
B=( $A )
echo "${B[@]^}"
throws a syntax error, so I am currently doing it with a long while loop, which is too lengthy.
Is there a shorter way to do this?

There are many ways to define "beginning of a name". This method chooses any letter after a word boundary and transforms it to upper case. As a side effect, this will also work with names such as "Sue Ellen", or "Billy-Bob".
echo "james,adam,john" | perl -pe 's/(\b\pL)/\U$1/g'

With Perl:
echo "james,adam,john" | \
perl -ne 'print join(",", map{ ucfirst } split(/,/))'

You can use awk like this to capitalize first letter of every word in your input:
echo "james,adam,john" | awk 'BEGIN { RS=","; FS=""; ORS=","; OFS=""; }
{ $1=toupper($1); print $0; }'
OUTPUT
James,Adam,John

Same method as TLP but with GNU sed:
echo "james,adam,john,sue ellen,billy-bob" | sed -r 's/\b(.)/\u\1/g'
output:
James,Adam,John,Sue Ellen,Billy-Bob
If only the first letter should be capitalized, use this instead:
echo "james,adam,john,sue ellen,billy-bob" | sed 's/[^,]*/\u&/g'
output:
James,Adam,John,Sue ellen,Billy-bob
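The \u escape used above is a GNU sed extension, which may be why the original command behaves differently from one system to another. As a rough sketch of a more portable route, the following uses only awk features that POSIX specifies (toupper, substr, FS/OFS). The function name capitalize_names is made up for this illustration and is not taken from any of the answers.

```sh
# capitalize_names (hypothetical helper): upper-case the first character
# of every comma-separated field in the string passed as $1.
capitalize_names() {
    printf '%s\n' "$1" | awk 'BEGIN { FS = OFS = "," }
    {
        for (i = 1; i <= NF; i++)
            $i = toupper(substr($i, 1, 1)) substr($i, 2)
        print
    }'
}

capitalize_names "james,adam,john"
```

Like the second sed command, this only touches the first letter after each comma, so "sue ellen" becomes "Sue ellen".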

Related

printf zero padded string

The format of MAC addresses varies with the platform.
E.g. on HPUX I could get something like:
0:0:c:7:ac:1e
While Linux gives me
00:00:0c:07:ac:1e
I used to use awk in a kornshell script on CentOS5 to format this to 00000c07ac1e, as shown below.
MAC="0:0:c:7:ac:1e"
echo $MAC | awk -F: '{printf( "%02s%02s%02s%02s%02s%02s\n", $1,$2,$3,$4,$5,$6)}'
Unfortunately our admin server now is Ubuntu 14LTS with a newer version of awk which doesn't support the zero padding in the %s format anymore and I get an undesired 0 0 c 7ac1e
So I now switched to perl and do:
echo $MAC | perl -ne '{@A=split(":"); printf( "%02s%02s%02s%02s%02s%02s", @A)}'
As this may break too in upcoming releases I am looking for a more robust but still compact way to format the string.
Your Perl snippet will not break in future releases. This is basic functionality. Changing it would break many, many programs. (Plus, Perl has a mechanism for introducing backwards-incompatible changes without breaking existing programs.)
Cleaned up:
echo "$MAC" | perl -ne'@F=split(/:/); printf("%02s%02s%02s%02s%02s%02s\n", @F)'
Shorter:
echo "$MAC" | perl -ne'printf "%02s%02s%02s%02s%02s%02s\n", split /:/'
Without the repetition:
echo "$MAC" | perl -ple'$_ = join ":", map sprintf("%02s", $_), split /:/'
There's -a if you want something more awkish:
echo "$MAC" | perl -F: -aple'$_ = join ":", map sprintf("%02s", $_), @F'
Bit long but should be pretty robust
awk -F: '{for(i=1;i<=NF;i++){while(length($i)<2)$i=0$i;printf "%s",$i;}print ""}'
How it works
1. Loop through the fields.
2. While the field is less than 2 characters long, add zeros to the front.
3. Print the field.
4. Print a newline character at the end.
If you were dealing with a number rather than hex, you could use %.Xd to indicate you want at least X digits.
$ awk -F: '{printf( "%.2d%.2d\n", $1, $2)}' <<< "0:23"
0023
^^
two digits
From The GNU Awk User’s Guide #5.5.3 Modifiers for printf Formats:
.prec
A period followed by an integer constant specifies the precision to
use when printing. The meaning of the precision varies by control
letter:
%d, %i, %o, %u, %x, %X
Minimum number of digits to print.
In this case, you need a more general approach to deal with each one of the blocks of the MAC address. You can loop through the elements and add a 0 in case their length is just 1:
awk -F: '{for (i=1;i<=NF;i++)       #loop through the elements
            {
              if (length($i)==1)    #if length is 1
                printf("0")         #add a 0
              printf ("%s", $i)     #print the rest
            }
          print ""                  #print a new line at the end
         }' <<< "0:0:c:7:ac:1e"
This returns:
00000c07ac1e
^^  ^^  ^^
  ^^  ^^  ^^
Note awk '...' <<< "$MAC" is the same as echo "$MAC" | awk '...'.
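Since every block of a MAC address is a hexadecimal number, another option is to let the shell's own printf do the padding with %02x, which avoids relying on how a given awk treats zero padding for %s. The following is only a sketch. The function name normalize_mac is invented here, and it assumes the input contains nothing but colon-separated hex digits.

```sh
# normalize_mac (hypothetical helper): print a colon-separated MAC address
# as twelve lower-case hex digits. The body runs in a subshell, so the
# IFS and globbing changes do not leak into the caller.
normalize_mac() (
    IFS=:
    set -f              # no pathname expansion while splitting
    set -- $1           # split the address on colons
    for octet in "$@"; do
        printf '%02x' "0x$octet"   # read the block as hex, pad to 2 digits
    done
    printf '\n'
)

normalize_mac "0:0:c:7:ac:1e"
```

Both the HP-UX style 0:0:c:7:ac:1e and the Linux style 00:00:0c:07:ac:1e come out as 00000c07ac1e.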

Using the bash sort command within variable-length filenames

I am trying to numerically sort a series of files output by the ls command which match the pattern either ABCDE1234A1789.RST.txt or ABCDE12345A1789.RST.txt by the '789' field.
In the example patterns above, ABCDE is the same for all files, 1234 or 12345 are digits that vary but are always either 4 or 5 digits in length. A1 is the same length for all files, but value can vary so unfortunately it can't be used as a delimiter. Everything after the first . is the same for all files. Something like:
ls -l *.RST.txt | sort -k +9.13 | awk '{print $9} ' > file-list.txt
will match the shorter filenames but not the longer ones because of the variable length of characters before the field I want to sort by.
Is there a way to accomplish sorting all files without first padding the shorter-length files to make them all the same length?
Perl to the rescue!
perl -e 'print "$_\n" for sort { substr($a, -11, 3) cmp substr($b, -11, 3) } glob "*.RST.txt"'
If your perl is more recent (5.10 or newer), you can shorten it to
perl -E 'say for sort { substr($a, -11, 3) cmp substr($b, -11, 3) } glob "*.RST.txt"'
Because of the parts of the filename which you've identified as unchanging, you can actually build a key which sort will use:
$ echo ABCDE{99999,8765,9876,345,654,23,21,2,3}A1789.RST.txt \
| fmt -w1 \
| sort -tE -k2,2n --debug
ABCDE2A1789.RST.txt
_
___________________
ABCDE3A1789.RST.txt
_
___________________
ABCDE21A1789.RST.txt
__
etc.
What this does is tell sort to separate the fields on character E, then use the 2nd field numerically. --debug arrived in coreutils 8.6, and can be very helpful in seeing exactly what sort is doing.
The conventional way to do this in bash is to extract your sort field. Except for the sort command, the following is implemented in pure bash alone:
sort_names_by_first_num() {
  shopt -s extglob
  for f; do
    first_num="${f##+([^0-9])}"        # strip the leading non-digits
    first_num=${first_num%%[^0-9]*}    # strip everything from the next non-digit on
    [[ $first_num ]] && printf '%s\t%s\n' "$first_num" "$f"
  done | sort -n | while IFS='' read -r name; do name=${name#*$'\t'}; printf '%s\n' "$name"; done
}
sort_names_by_first_num *.RST.txt
That said, newline-delimiting filenames (as this question seems to call for) is a bad practice: Filenames on UNIX filesystems are allowed to contain newlines within their names, so separating them by newlines within a list means your list is unable to contain a substantial subset of the range of valid names. It's much better practice to NUL-delimit your lists. Doing that would look like so:
sort_names_by_first_num() {
  shopt -s extglob
  for f; do
    first_num="${f##+([^0-9])}"        # strip the leading non-digits
    first_num=${first_num%%[^0-9]*}    # strip everything from the next non-digit on
    [[ $first_num ]] && printf '%s\t%s\0' "$first_num" "$f"
  done | sort -n -z | while IFS='' read -r -d '' name; do name=${name#*$'\t'}; printf '%s\0' "$name"; done
}
sort_names_by_first_num *.RST.txt
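The same extract-the-key idea can be aimed at the three digits the question actually asks about. Because everything from those digits to the end of the name has a fixed length, the key can be taken from the end of the name rather than the start. This is a sketch only: the function name sort_by_suffix_digits is made up, it assumes every name ends in exactly three digits followed by .RST.txt, and it is newline-delimited, so the caveat about newlines in filenames applies.

```sh
# sort_by_suffix_digits (hypothetical helper): sort the given names by the
# three digits that sit immediately before the fixed ".RST.txt" suffix.
sort_by_suffix_digits() {
    for f in "$@"; do
        key=${f%.RST.txt}            # drop the fixed extension
        key=${key#"${key%???}"}      # keep only the last three characters
        printf '%s\t%s\n' "$key" "$f"
    done | sort -n | cut -f2-        # sort on the key, then drop it again
}

sort_by_suffix_digits ABCDE1234A1789.RST.txt ABCDE12345A1123.RST.txt
```

In a directory of real files the call would be sort_by_suffix_digits *.RST.txt.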

how to separate a 10-digit phone number into two parts

For example, given a phone number like 9191234567, how can I separate it into two parts, the first containing the three leading digits 919 and the second containing the remaining seven digits 1234567? After that, I want to store these two parts in two different variables in ksh.
I don't know if this could be done with sed?
You could try this :
echo "9191234567" | sed 's/^\([0-9]\{3\}\)\([0-9]\{7\}\)$/\1 \2/'
To store each part in a separate variable, you could do this :
phone="9191234567"
part1=$(echo $phone | sed 's/^\([0-9]\{3\}\)[0-9]\{7\}$/\1/')
part2=$(echo $phone | sed 's/^[0-9]\{3\}\([0-9]\{7\}\)$/\1/')
Or even more concise :
read part1 part2 <<< $(echo "9191234567" | sed 's/^\([0-9]\{3\}\)\([0-9]\{7\}\)$/\1 \2/')
cut should work
echo '9191234567' | cut --characters 1-3,4- --output-delimiter ' '
919 1234567
echo 9191234567 | sed 's/^\([0-9]\{3\}\)\([0-9]*\)/\1-\2/'
This prints 919-1234567
Using bash
$ phone=9191234567
$ regex="^([0-9]{3})([0-9]{7})$"
$ [[ $phone =~ $regex ]] && part1="${BASH_REMATCH[1]}" && part2="${BASH_REMATCH[2]}"
$ echo $part1
919
$ echo $part2
1234567
Pure ksh, take number, print as two separate strings, separated by white space.
function split_at_third {
typeset number=$1 a b
b=${number#???} && a=${number%$b}
print $a $b
}
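The parameter-expansion approach also works in any POSIX shell, and it can be combined with a check that the input really is ten digits. The following is a sketch under that assumption. The name split_phone and the variables area and rest are invented for the example and are not local to the function.

```sh
# split_phone (hypothetical helper): print "AAA NNNNNNN" for a 10-digit
# number, or return 1 if the argument is not exactly ten digits.
split_phone() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;     # empty, or contains a non-digit
    esac
    [ "${#1}" -eq 10 ] || return 1
    rest=${1#???}                    # everything after the first 3 digits
    area=${1%"$rest"}                # what is left in front of that
    printf '%s %s\n' "$area" "$rest"
}

split_phone 9191234567
```

To store the two parts, the output can be split again, for example with set -- $(split_phone 9191234567) followed by part1=$1 and part2=$2.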

sed — joining a range of selected lines

I'm a beginner to sed. I know that it's possible to apply a command (or a set of commands) to a certain range of lines like so
sed '/[begin]/,/[end]/ [some command]'
where [begin] is a regular expression that designates the beginning line of the range and [end] is a regular expression that designates the ending line of the range (but is included in the range).
I'm trying to use this to specify a range of lines in a file and join them all into one line. Here's my best try, which didn't work:
sed '/[begin]/,/[end]/ {
N
s/\n//
}
'
I'm able to select the set of lines I want without any problem, but I just can't seem to merge them all into one line. If anyone could point me in the right direction, I would be really grateful.
One way using GNU sed:
sed -n '/begin/,/end/ { H;g; s/^\n//; /end/s/\n/ /gp }' file.txt
This is straightforward if you only want to select some lines and join them. Use Steve's answer or my pipe-to-tr alternative:
sed -n '/begin/,/end/p' | tr -d '\n'
It becomes a bit trickier if you want to keep the other lines as well. Here is how I would do it (with GNU sed):
join.sed
/\[begin\]/ {
:a
/\[end\]/! { N; ba }
s/\n/ /g
}
So the logic here is:
When [begin] line is encountered start collecting lines into pattern space with a loop.
When [end] is found stop collecting and join the lines.
Example:
seq 9 | sed -e '3s/^/[begin]\n/' -e '6s/$/\n[end]/' | sed -f join.sed
Output:
1
2
[begin] 3 4 5 6 [end]
7
8
9
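For comparison, the same collect-then-join logic can be sketched in awk, which sidesteps the GNU-specific parts of the sed version. This is an illustration only. The function name join_range is made up, and, as in join.sed, the markers are taken to be the literal strings [begin] and [end].

```sh
# join_range (hypothetical helper): read stdin, join every run of lines from
# a "[begin]" line through the next "[end]" line with single spaces, and
# pass all other lines through unchanged.
join_range() {
    awk '
        !joining && /\[begin\]/ { joining = 1; buf = "" }
        joining {
            buf = (buf == "" ? $0 : buf " " $0)   # append the current line
            if (/\[end\]/) { print buf; joining = 0 }
            next
        }
        { print }
        END { if (joining) print buf }            # range never closed
    '
}

printf '%s\n' 1 2 '[begin]' 3 4 '[end]' 5 | join_range
```

If the input ends before [end] is seen, this sketch prints whatever it collected as one joined line.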
I like your question. I also like Sed. Regrettably, I do not know how to answer your question in Sed; so, like you, I am watching here for the answer.
Since no Sed answer has yet appeared here, here is how to do it in Perl:
perl -we 'my $flag = 0; while (<>) { chomp; if (/\[begin\]/) {$flag = 1;} print if $flag; if (/\[end\]/) {print "\n" if $flag; $flag = 0;} } print "\n" if $flag;'

cut off known substring sh

How can I cut a known substring off a string in sh?
For example, I have the string "http://www.myserver.org/very/very/long/path/mystring".
The prefix "http://www.myserver.org/very/very/long/path/" is known. How can I get "mystring"?
Thanks.
E.g. using perl:
echo "http://www.myserver.org/very/very/long/path/mystring" | perl -pe 's|^http://www\.myserver\.org/very/very/long/path/(.*)$|$1|'
E.g. using sed:
echo "http://www.myserver.org/very/very/long/path/mystring" | sed 's|^http://www.myserver.org/very/very/long/path/\(.*\)$|\1|'
E.g. when the search string is held in a variable, here named variable. Use double quotes to expand the variable.
echo "http://www.myserver.org/very/very/long/path/mystring" | sed "s|^${variable}\(.*\)$|\1|"
Tested under /bin/dash
$ S="http://www.myserver.org/very/very/long/path/mystring" && echo ${S##*/}
mystring
where
S is the variable-name
## removes the largest matching prefix pattern
*/ matches everything up to the last slash
For further reading, search "##" in man dash
Some more illustrations:
$ S="/mystring/" ; echo ${S##*/}
$ S="/mystring" ; echo ${S##*/}
mystring
$ S="mystring" ; echo ${S##*/}
mystring
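When the prefix itself is known, as in the question, it can also be removed directly with the single # form of the expansion. The sketch below quotes the prefix inside the expansion so that it is matched literally rather than as a pattern. The function name strip_prefix and the variable names url and prefix are made up for the illustration.

```sh
# strip_prefix (hypothetical helper): print $1 with the literal prefix $2
# removed; if $1 does not start with $2, it is printed unchanged.
strip_prefix() {
    printf '%s\n' "${1#"$2"}"
}

url="http://www.myserver.org/very/very/long/path/mystring"
prefix="http://www.myserver.org/very/very/long/path/"
strip_prefix "$url" "$prefix"
```

Unlike ${S##*/}, this keeps any slashes that appear after the known prefix.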