I've gone through the manual for the tcsh but still can't figure out how it should work in my case or whether it should work at all. I basically need to extract part of the variable whose value is a six digit number. So I need to drop the first two characters and retrieve the last four.
The example below doesn't work (it would probably work in bash but tcsh HAS to be used):
set VAR1 = value1
set VAR2 = `echo ${VAR1:2}`
echo $VAR2
It fails with the error Bad : modifier in $ (2), apparently because that is bash syntax that tcsh doesn't understand, but I can't figure out how to do the same thing in tcsh.
I'm not sure about using modifiers, but you can slice your string using cut or sed:
set VAR1=abcdef
# "cut" characters 3-to-end
echo $VAR1 | cut -c3-
# capture everything (\(.*\)), except for the first 2 characters (..)
echo $VAR1 | sed 's/..\(.*\)$/\1/'
You could also use the perl command line with a regex, like the sed in @shx2's answer above:
echo $VAR1 | perl -pe 's/^\d\d(.*)/$1/'
This drops the two digits the value starts with.
Certainly tcsh doesn't accept bash-style modifiers, or vice versa. They're very different shells.
You say you need to extract the last 4 digits of a 6-digit number. I'll assume that number is a non-negative integer.
If you think of it as an arithmetic problem rather than as a string-processing problem, you can use the built-in @ command:
% set VAR1 = 123456
% @ VAR2 = $VAR1 % 10000
% echo VAR1=$VAR1 VAR2=$VAR2
VAR1=123456 VAR2=3456
%
(Why do you have to use tcsh? Obligatory link: http://www.perl.com/doc/FMTEYEWTK/versus/csh.whynot .)
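One caveat with the arithmetic route, sketched here in POSIX sh rather than tcsh (an assumption made only so the snippet is easy to run; the function names are hypothetical): the modulo result is a number, so leading zeros are lost unless you pad them back, whereas cut keeps the characters exactly as they are.

```shell
# Hypothetical helpers, written for POSIX sh (not tcsh).
last4_arith() {
    # Arithmetic route: modulo, then pad back to four digits.
    printf '%04d\n' "$(( $1 % 10000 ))"
}
last4_cut() {
    # String route: keep characters 3 to the end.
    printf '%s\n' "$1" | cut -c3-
}
last4_arith 120034   # 0034 (a bare modulo would give 34)
last4_cut 120034     # 0034
```

If the value can start with a zero, the string route is probably the safer of the two, since many shells read a leading zero as octal in arithmetic.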
I'm discovering the language Perl. I'm trying to create a script to integrate into my Nagios server, but I get two errors that I'm not able to resolve. Can you help me?
The errors are the following:
Use of uninitialized value $5 in concatenation (.) or string at
check_disque.pl line 53.
Argument "/dev/mapper/centos-root 50G 5,5G 45G 11 /\n" isn't
numeric in numeric lt (<) at check_disque.pl line 55.
My line 53:
$espace_utilise=`df -h / | awk 'FNR == 2 {print $5}' | sed 's/%//g'`;
And line 55:
if ($espace_utilise < $warning) {
$espace_utilise=`df -h / | awk 'FNR == 2 {print $5}' | sed 's/%//g'`;
#                                               ^^--- here
Backticks interpolate variables, so Perl interpolates $5 itself before the shell ever sees the command. You can solve this by escaping the dollar sign with a backslash (\$5), or by using qx'', which does the same as backticks, except that single-quote delimiters disable interpolation. That causes some issues with the quotes in your awk/sed commands, though, which then need more escaping. This is one reason using shell commands inside Perl is a bad idea.
$espace_utilise=`df -h / | awk 'FNR == 2 {print \$5}' | sed 's/%//g'`;
$espace_utilise=qx'df -h / | awk \'FNR == 2 {print $5}\' | sed \'s/%//g\'';
Luckily for you, you can just do the df command directly and use the text processing with Perl commands, which will be a lot easier. I would help you, but I don't know exactly what that awk command does. I would guess:
my @lines = `df -h /`;                       # header line plus the data line
my $df = (split ' ', $lines[1])[4];          # 5th field of the data line (Use%)
$df =~ s/%//g;                               # remove %. Can also use tr/%//d
The other error:
Argument "/dev/mapper/centos-root 50G 5,5G 45G 11 /\n" isn't numeric in numeric lt (<) at check_disque.pl line 55.
...is just because the first statement failed. Perl interpolates $5 even though it warns about it, and it becomes the empty string instead. So your awk line just says { print }, which I assume is the same as printing the whole line. So if you fix the first part, you can ignore this.
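To see the effect described above outside of Perl, here is a small sketch in plain shell. The sample line is the df data line from the error message with the % sign put back (an assumption about what df printed).

```shell
# Sample df data line, reconstructed from the error message above.
line='/dev/mapper/centos-root 50G 5,5G 45G 11% /'

# What the shell was meant to run: awk sees $5 and prints the fifth field.
printf '%s\n' "$line" | awk '{print $5}'    # 11%

# What it actually ran once Perl had replaced $5 with an empty string.
printf '%s\n' "$line" | awk '{print }'      # the whole line, unchanged
```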
I'm discovering the language PERL.
Then take a look at CPAN. Among many modules there is Filesys::DiskSpace, which does what you want. You need to install it first, and for that you need to learn how to install modules from CPAN. The following
cpan App::cpanminus
cpanm Filesys::DiskSpace
should work in your case. Note that if you did not use cpan earlier it might ask you if you want it to autoconfigure itself. Hit enter to say yes.
After installation usage is as simple as
use Filesys::DiskSpace;
($fs_type, $fs_desc, $used, $avail, $fused, $favail) = df $dir;
Note that it does not give you the percentage directly, so you would need to compute it yourself, following the behavior of df:
The percentage of the normally available space that is currently allocated to all
files on the file system. This shall be calculated using the fraction:
<space used>/( <space used>+ <space free>)
expressed as a percentage. This percentage may be greater than 100 if <space free> is less
than zero. The percentage value shall be expressed as a positive integer, with any
fractional result causing it to be rounded to the next highest integer.
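The rule quoted above boils down to an integer ceiling, which can be sketched in shell arithmetic like this. The helper name used_percent is hypothetical, both arguments are assumed to be in the same unit, and the sum is assumed to be non-zero.

```shell
# used_percent USED FREE  (hypothetical helper; both values in the same unit)
used_percent() {
    total=$(( $1 + $2 ))
    # Integer ceiling of 100 * used / (used + free).
    echo $(( ($1 * 100 + total - 1) / total ))
}
# 5.5G used and 45G free, expressed in tenths of a gigabyte.
used_percent 55 450   # 11
```

That matches the 11 that df reported for the root file system in the question.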
I'd like to be able to replace a string between 2 known patterns. The catch is that I want to replace it by a string of the same length that is composed only of 'x'.
Let's say I have a file containing:
Hello.StringToBeReplaced.SecondString
Hello.ShortString.SecondString
I'd like the output to be like this:
Hello.xxxxxxxxxxxxxxxxxx.SecondString
Hello.xxxxxxxxxxx.SecondString
Using sed loops
You can use sed, though the thinking required is not wholly obvious:
sed ':a;s/^\(Hello\.x*\)[^x]\(.*\.SecondString\)/\1x\2/;t a'
This is for GNU sed; BSD (Mac OS X) sed and other versions may be fussier and require:
sed -e ':a' -e 's/^\(Hello\.x*\)[^x]\(.*\.SecondString\)/\1x\2/' -e 't a'
The logic is identical in both:
Create a label a
Substitute the lead string and a sequence of x's (capture 1), followed by a non-x, and arbitrary other data plus the second string (capture 2), and replace it with the contents of capture 1, an x and the content of capture 2.
If the s/// command made a change, go back to the label a.
It stops substituting when there are no non-x's between the two marker strings.
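If the loop is hard to picture, this sketch runs the same s/// command one step at a time from a shell loop and prints the line after each substitution. The function name mask_steps is made up for the illustration, and the shell loop only stands in for what the label and t a do inside sed.

```shell
# mask_steps LINE: print the line after each single substitution (hypothetical helper).
mask_steps() {
    line=$1
    while :; do
        next=$(printf '%s\n' "$line" |
            sed 's/^\(Hello\.x*\)[^x]\(.*\.SecondString\)/\1x\2/')
        [ "$next" = "$line" ] && break   # no change: this is where "t a" stops branching
        line=$next
        printf '%s\n' "$line"
    done
}
mask_steps Hello.abc.SecondString
# Hello.xbc.SecondString
# Hello.xxc.SecondString
# Hello.xxx.SecondString
```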
Two tweaks to the regex allow the code to recognize two copies of the pattern on a single line. Lose the ^ that anchors the match to the beginning of the line, and change .* to [^.]* (so that the regex is not quite so greedy):
$ echo Hello.StringToBeReplaced.SecondString Hello.StringToBeReplaced.SecondString |
> sed ':a;s/\(Hello\.x*\)[^x]\([^.]*\.SecondString\)/\1x\2/;t a'
Hello.xxxxxxxxxxxxxxxxxx.SecondString Hello.xxxxxxxxxxxxxxxxxx.SecondString
$
Using the hold space
hek2mgl suggests an alternative approach in sed using the hold space. This can be implemented using:
$ echo Hello.StringToBeReplaced.SecondString |
> sed 's/^\(Hello\.\)\([^.]\{1,\}\)\(\.SecondString\)/\1#\3##\2/
> h
> s/.*##//
> s/./x/g
> G
> s/\(x*\)\n\([^#]*\)#\([^#]*\)##.*/\2\1\3/
> '
Hello.xxxxxxxxxxxxxxxxxx.SecondString
$
This script is not as robust as the looping version but works OK as written when each line matches the lead-middle-tail pattern. It first splits the line into three sections: the first marker, the bit to be mangled, and the second marker. It reorganizes that so that the two markers are separated by #, followed by ## and the bit to be mangled. h copies the result to the hold space. Remove everything up to and including the ##; replace each character in the bit to be mangled by x, then copy the material in the hold space after the x's in the pattern space, with a newline separating them. Finally, recognize and capture the x's, the lead marker, and the tail marker, ignoring the newline, the # and ## plus trailing material, and reassemble as lead marker, x's, and tail marker.
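As a rough illustration of the first step only, this sketch runs just the initial s/// so you can see the rearranged line that h then saves. The function name rearrange is hypothetical.

```shell
# rearrange LINE: show the line after the first substitution only (hypothetical helper).
rearrange() {
    printf '%s\n' "$1" |
        sed 's/^\(Hello\.\)\([^.]\{1,\}\)\(\.SecondString\)/\1#\3##\2/'
}
rearrange Hello.abc.SecondString   # Hello.#.SecondString##abc
```

The # and ## separators are what the last s/// relies on, so this approach assumes neither marker contains a # of its own.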
To make it robust, you'd recognize the pattern first and put the commands shown inside { and } so they're only executed when the pattern is recognized:
sed '/^\(Hello\.\)\([^.]\{1,\}\)\(\.SecondString\)/{
s/^\(Hello\.\)\([^.]\{1,\}\)\(\.SecondString\)/\1#\3##\2/
h
s/.*##//
s/./x/g
G
s/\(x*\)\n\([^#]*\)#\([^#]*\)##.*/\2\1\3/
}'
Adjust to suit your needs...
Adjusting to suit your needs
[I tried one of your solutions and it worked fine.]
However when I try to replace the 'hello' by my real string (which is
'1.2.840.') and my second string (which is simply a dot '.'), things stop
working. I guess all these dots confuse the sed command.
What I try to achieve is transform this '1.2.840.10008.' to
'1.2.840.xxxxx.'
And this pattern happens several times in my file with variable number
of characters to be replaced between the '1.2.840.' and the next dot '.'
There are times when it is important to get your question close enough to the real scenario, and this may be one of them. Dot is a metacharacter in sed regular expressions (and in most other dialects of regular expression, shell globbing being the notable exception). If the 'bit to be mangled' is always digits, then we can tighten up the regular expressions, though actually (when I look at the code ahead) the tightening really isn't imposing much in the way of a restriction.
Pretty much any solution using regular expressions is a balancing act that has to pit convenience and abbreviation against reliability and precision.
Revised code plus data
cat <<EOF |
transform this '1.2.840.10008.' to '1.2.840.xxxxx.'
OK, and hence 1.2.840.21. and 1.2.840.20992. should lose the 21 and 20992.
EOF
sed ':a;s/\(1\.2\.840\.x*\)[^x.]\([^.]*\.\)/\1x\2/;t a'
Example output:
transform this '1.2.840.xxxxx.' to '1.2.840.xxxxx.'
OK, and hence 1.2.840.xx. and 1.2.840.xxxxx. should lose the 21 and 20992.
The changes in the script are:
sed ':a;s/\(1\.2\.840\.x*\)[^x.]\([^.]*\.\)/\1x\2/;t a'
Add 1\.2\.840\. as the start pattern.
Revise the 'character to replace' expression to 'not x or .'.
Use just \. as the tail pattern.
You could replace the [^x.] with [0-9] if you're sure you only want digits matched, in which case you won't have to worry about spaces as discussed below.
You may decide you don't want spaces to be matched so that a casual comment like:
The net prefix is 1.2.840. And there are other prefixes too.
does not end up as:
The net prefix is 1.2.840.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
In which case, you probably need to use:
sed ':a;s/\(1\.2\.840\.x*\)[^x. ]\([^ .]*\.\)/\1x\2/;t a'
And so the changes continue until you've got something precise enough to do what you want without doing anything you don't want on your current data set. Writing bullet-proof regular expressions requires a precise specification of what you want matched, and can be quite hard.
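For what it's worth, here is a sketch of the digits-only variant mentioned above ([0-9] in place of [^x.]), written with separate -e options so it should work with both GNU and BSD sed. The function name mask_oid and the sample sentence are made up for the illustration.

```shell
# mask_oid: replace the digits between '1.2.840.' and the next dot with x's
# (hypothetical helper; reads stdin, writes stdout).
mask_oid() {
    sed -e ':a' -e 's/\(1\.2\.840\.x*\)[0-9]\([0-9]*\.\)/\1x\2/' -e 't a'
}
printf '%s\n' 'The net prefix is 1.2.840. And 1.2.840.10008. is one use.' | mask_oid
# The net prefix is 1.2.840. And 1.2.840.xxxxx. is one use.
```

Because only digits are ever replaced, the bare prefix followed by a space is left alone without needing the extra space handling.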
I'd choose perl:
perl -pe 's/(?<=Hello\.)(.*?)(?=\.SecondString)/ "x" x length($1) /e' file
This awk should do:
awk -F. '{for (i=1;i<=length($2);i++) a=a"x";$2=a;a=""}1' OFS="." file
Hello.xxxxxxxxxxxxxxxxxx.SecondString
Hello.xxxxxxxxxxx.SecondString
Bash Works Too
While the perl, sed and awk solutions are probably the better choice, a Bash solution is not that difficult (just longer). Bash has good character-by-character handling abilities as well:
#!/bin/bash
rep=0    # replace flag
skip=0   # delay reset flag
while IFS= read -r line; do                 # read each line, keeping leading blanks
    for ((i=0; i<${#line}; i++)); do        # for each character in the line
        c=${line:i:1}
        # if '.' and replace on, turn off and set skip
        [[ $c == '.' && $rep -eq 1 ]] && { rep=0; skip=1; }
        # print char or "x" depending on replace flag
        if [[ $rep -eq 0 ]]; then printf '%s' "$c"; else printf 'x'; fi
        # if '.' and replace off
        if [[ $c == '.' && $rep -eq 0 ]]; then
            # if skip, turn skip off, else set replace on
            if [[ $skip -eq 1 ]]; then skip=0; else rep=1; fi
        fi
    done
    printf "\n"
done
exit 0
Input
$ cat dat/replacefile.txt
Hello.StringToBeReplaced.SecondString
Hello.ShortString.SecondString
Output
$ bash replacedot.sh < dat/replacefile.txt
Hello.xxxxxxxxxxxxxxxxxx.SecondString
Hello.xxxxxxxxxxx.SecondString
For the sake of your sanity, just use awk:
$ awk 'BEGIN{FS=OFS="."} {gsub(/./,"x",$2)} 1' file
Hello.xxxxxxxxxxxxxxxxxx.SecondString
Hello.xxxxxxxxxxx.SecondString
Below is a csh script.
#! /bin/csh
set alpha=10\20\30;
set beta = $alpha.alpha;
perl -p -i.bak -e 's/gamma/'$beta'/' tmp;
The tmp file contains just the word gamma. After running tmp.csh, I expect 10\20\30.alpha in tmp, but it's now 102030.alpha.
How can I preserve the backslashes in this situation?
Note: I'd prefer not to change the definition of the alpha variable, as it is used elsewhere in the script, where it needs to be in this format (10\20\30) only.
Thanks.
In csh, for your alpha assignment, the backslash is being taken to mean 'a literal 2 or 3'. In order to keep csh from doing this, the assignment needs to be enclosed in quotes.
#! /bin/csh
set alpha="10\20\30";
set beta = $alpha.alpha;
perl -p -i.bak -e 's/gamma/'$beta'/' tmp;
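If in doubt, it's often helpful to 'echo' your variables out to see exactly what they contain. I don't understand your final note, as the 'alpha' variable is not equal to 10\20\30 the way you have it originally assigned. Also be aware that Perl treats \20 and \30 in the replacement part of s/// as octal escapes, so if tmp still doesn't end up with literal backslashes, they probably need to be doubled (10\\20\\30) in the string that reaches Perl.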
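The quoting effect is easy to check in isolation. This sketch uses POSIX sh rather than csh (an assumption made so it runs anywhere; for this particular case an unquoted backslash is dropped in both shells), and the variable names are made up.

```shell
# Unquoted: each backslash just quotes the next character and is then removed.
unquoted=10\20\30
# Quoted: the backslashes are kept as ordinary characters.
quoted='10\20\30'
printf '%s\n' "$unquoted"   # 102030
printf '%s\n' "$quoted"     # 10\20\30
```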
If I can have somewhere in my input a series of two or more characters (in my case, >), how can I insert something between each occurrence of >?
For example: >> to >foo>, but also:
>>> to >foo>foo> and:
>>>> to >foo>foo>foo>.
Using 's/>>/>foo>/g' gives me of course >foo>>foo>, which is not what I need.
In other words, how can I push a character back to the pattern space, or match a character without consuming it (does that make any sense?)
Using Perl, you can do it iteratively
$ echo '>>>>' | perl -pe 's/>>/>foo>/ while />>/'
>foo>foo>foo>
or use a look-ahead assertion, which does not consume the 2nd >
$ echo '>>>>' | perl -pe 's/>(?=>)/>foo/g'
>foo>foo>foo>
This should also work
sed ':b; s/>>/>foo>/; tb'
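Here is a sketch of that sed loop run against the examples from the question, with the commands split into separate -e options so it doesn't depend on GNU sed's handling of semicolons after a label. The function name insert_foo is hypothetical.

```shell
# insert_foo: put "foo" between every pair of adjacent '>' characters
# (hypothetical helper; reads stdin, writes stdout).
insert_foo() {
    sed -e ':b' -e 's/>>/>foo>/' -e 't b'
}
printf '%s\n' '>>' '>>>' '>>>>' 'a>b' | insert_foo
# >foo>
# >foo>foo>
# >foo>foo>foo>
# a>b
```

Each pass replaces only the first remaining >>, and the > it leaves behind can take part in the next match, which is what the /g version cannot do.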
I've got a file called 'res' that's 29374 characters of http data in a one-line string. Inside it, there are several http links, but I only want to display those that end in '/idNNNNNNNNN' where N is a digit. In fact I'm only interested in the string 'idNNNNNNNNN'.
I've tried with:
cat res | sed -n '0,/.*\(id[0-9]*\).*/s//\1/p'
but I get the whole file.
Do you know a way to do it?
perl -n -E 'say $1 while m!/id(\d{9})!g' input-file
should work. That assumes exactly 9 digits; that's the {9} in the above. You can match 8 or 9 ({8,9}), 8 or more ({8,}), up to 9 ({0,9}), etc.
Example of this working:
$ echo -n 'junk jumk http://foo/id231313 junk lalala http://bar/id23123 asda' | perl -n -E 'say $1 while m!id(\d{0,9})!g'
231313
23123
That's with the 0 to 9 variant, of course.
If you're stuck with a pre-5.10 perl, use -e instead of -E and print "$1\n" instead of say $1.
How it works
First are the two command-line arguments to Perl. -n tells Perl to read input from standard input or files given on the command line, line by line, setting $_ to each line. $_ is perl's default target for a lot of things, including regular expression matches. -E merely tells Perl that the next argument is a Perl one-liner, using the new language features (vs. -e which does not use the 5.10 extensions).
So, looking at the one liner: say means to print out some value, followed by a newline. $1 is the first regular expression capture (captures are made by parentheses in regular expressions). while is a looping construct, which you're probably familiar with. m is the match operator, the ! after it is the regular expression delimiter (normally, you see / here, but since the pattern contains / it's easier to use something else, so you don't have to escape the / as \/). /id(\d{9}) is the regular expression to match. Keep in mind that the delimiter is !, so the / is not special, it just matches a literal /. The parentheses form a capture group, so $1 will be the number. The ! is the delimiter, followed by g which means to match as many times as possible (as opposed to once). This is what makes it pick up all the URLs in the line, not just the first. As long as there is a match, the m operator will return a true value, so the loop will continue (and run that say $1, printing out the match).
Two-sed solution
I think this is one way to do this with only sed. Much more complicated!
echo 'junk jumk http://foo/id231313 junk lalala http://bar/id23123 asda' | \
sed 's!http://!\nhttp://!g' | \
sed 's!^.*/id\([0-9]*\).*$!\1!'
cat res | perl -ne 'chomp; print "$1\n" if m/\/(id\d*)/'
The trouble is that sed and grep and awk work on lines, and you've only got one line. So, you probably need to split things up so you have more than one line -- then you can make the normal tools work.
tr ':' '\012' < res |
sed -n 's%.*/\(id[0-9][0-9]*\).*%\1%p'
This takes advantage of URLs containing colons and maps colons to newlines with tr, then uses sed to pick up anything up to a slash, followed by id and one or more digits, followed by anything, and prints out the id and digit string (only). Since these only occur in URLs, they will only appear one per line and relatively near the start of the line too.
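A quick way to try the tr-plus-sed pipeline is to feed it the sample string from the earlier answer instead of the real 'res' file. The function name extract_ids is made up for this sketch.

```shell
# extract_ids: split on colons, then keep only the idNNN part of each URL
# (hypothetical helper; reads stdin, writes stdout).
extract_ids() {
    tr ':' '\012' | sed -n 's%.*/\(id[0-9][0-9]*\).*%\1%p'
}
printf '%s\n' 'junk jumk http://foo/id231313 junk lalala http://bar/id23123 asda' | extract_ids
# id231313
# id23123
```

Lines without a /id followed by digits are never printed, because -n suppresses the default output and p only fires after a successful substitution.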
Here's a solution using only one invocation of sed:
sed -n 's| |\n|g;/^http/{s|http://[^/]*/id\([0-9]*\)|\1|;P};D' inputfile
Explanation:
s| |\n|g; - Divide and conquer
/^http/{ - If pattern space begins with "http"
s|http://[^/]*/id\([0-9]*\)|\1|; - capture the id
P - Print the string preceding the first newline
}; - end if
D - Delete the string preceding the first newline regardless of whether it contains "http"
Edit:
This version uses the same technique but is more selective.
sed -n 's|http://|\n&|g;/^\n*http/{s|\n*http://[^/]*/id\([0-9]*\)|\1\n|;P};D' inputfile