Makefiles: how to detach folder name from long path - sed

I need to detach a folder name from a long path, e.g.: ../../folder1/folder2 -> folder2. If it were a file name, the notdir function could be used. But notdir doesn't work with directories.
I have tried to use the following approach:
LIBNAMES := echo $(LIBS) | sed -e 's/\\S*\///g'
where LIBS is a list of long path folders. After implementing it my first impression was that it works (the correct LIBNAMES result was printed). But the following LIBNAMES usage in my makefile ended with strange errors: it seems the sed command itself somehow was added to the result. So, where is my mistake?

Why do you say that notdir doesn't work with directories? It works with any path.
all: ; echo $(notdir $(CURDIR))

$ CURDIR="../../folder1/folder2"
$ echo ${CURDIR##*/}
folder2
Here is the explanation of the # and ## expansions in bash (copied from the bash manual):
${parameter#word}
${parameter##word}
The word is expanded to produce a pattern just as in filename expansion (see Filename Expansion). If the pattern matches the beginning of the expanded value of parameter, then the result of the expansion is the expanded value of parameter with the shortest matching pattern (the ‘#’ case) or the longest matching pattern (the ‘##’ case) deleted. If parameter is ‘@’ or ‘*’, the pattern removal operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with ‘@’ or ‘*’, the pattern removal operation is applied to each member of the array in turn, and the expansion is the resultant list.
${parameter%word}
${parameter%%word}
The word is expanded to produce a pattern just as in filename expansion. If the pattern matches a trailing portion of the expanded value of parameter, then the result of the expansion is the value of parameter with the shortest matching pattern (the ‘%’ case) or the longest matching pattern (the ‘%%’ case) deleted. If parameter is ‘@’ or ‘*’, the pattern removal operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with ‘@’ or ‘*’, the pattern removal operation is applied to each member of the array in turn, and the expansion is the resultant list.
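As a rough illustration of the four forms described above, here is a small sketch applied to the path from the question. The helper name last_component and the second sample path are made up for the example; only the expansion syntax comes from the manual.

```shell
#!/bin/sh
# last_component is a hypothetical helper, not part of bash or make.
# It prints the last component of every path passed as an argument.
last_component() {
    for dir in "$@"; do
        dir=${dir%/}                  # drop one trailing slash, if present
        printf '%s\n' "${dir##*/}"    # delete the longest prefix ending in '/'
    done
}

p=../../folder1/folder2
echo "${p#*/}"     # shortest prefix matching '*/' removed: ../folder1/folder2
echo "${p##*/}"    # longest prefix matching '*/' removed:  folder2
echo "${p%/*}"     # shortest suffix matching '/*' removed: ../../folder1
echo "${p%%/*}"    # longest suffix matching '/*' removed:  ..

# Works on a whole list of paths, much like notdir does in make.
last_component ../../folder1/folder2 /usr/local/lib/
```

The last call prints folder2 and lib, one per line.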

Related

Not able to understand a command in perl

I need help to understand what below command is doing exactly
$abc{hier} =~ s#/tools.*/dfII/?.*##g;
and $abc{hier} contains a path "/home/test1/test2/test3"
Can someone please let me know what the above command is doing exactly. Thanks
s/PATTERN/REPLACEMENT/ is Perl's substitution operator. It searches a string for text that matches the regex PATTERN and replaces it with REPLACEMENT.
By default, the substitution operator works on $_. To tell it to work on a different variable, you use the binding operator - =~.
The default delimiter used by the substitution operator is a slash (/) but you can change that to any other character. This is useful if your PATTERN or your REPLACEMENT contains a slash. In this case, the programmer has used # as the delimiter.
To recap:
$abc{hier} =~ s#PATTERN#REPLACEMENT#;
means "look for text in $abc{hier} that matches PATTERN and replace it with REPLACEMENT".
The substitution operator also has various options that change its behaviour. They are added by putting letters after the final delimiter. In this case we have a g. That means "make the substitution global" - or match and change all occurrences of PATTERN.
In your case, the REPLACEMENT string is empty (we have two # characters next to each other). So we're replacing the PATTERN with nothing - effectively deleting whatever matches PATTERN.
So now we have:
$abc{hier} =~ s#PATTERN##g;
And we know it means, "in the variable $abc{hier}, look for any string that matches PATTERN and replace it with nothing".
The last thing to look at is the PATTERN (or regular expression - "regex"). You can get the full definition of regexes in perldoc perlre. But to explain what we're using here:
/tools : is the fixed string "/tools"
.* : is zero or more of any character
/dfII : is the fixed string "/dfII"
/? : is an optional slash character
.* : is (again) zero or more of any character
So, basically, we're removing bits of a file path from a value that's stored in a hash.
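To see the effect on concrete values, the same substitution can be tried from a shell. This sketch swaps in sed -E for Perl (the pattern happens to mean the same thing in both), and the helper name strip_tools_dfii and the two paths containing /tools are invented for the example.

```shell
#!/bin/sh
# strip_tools_dfii is a hypothetical helper. It uses sed instead of Perl.
strip_tools_dfii() {
    # -E selects extended regular expressions, so /? means "optional slash".
    # '#' is the delimiter, just as in the Perl original.
    printf '%s\n' "$1" | sed -E 's#/tools.*/dfII/?.*##g'
}

strip_tools_dfii /home/test1/test2/test3           # no "/tools": printed unchanged
strip_tools_dfii /home/cad/tools/v6/dfII/bin/run   # prints /home/cad
strip_tools_dfii /home/cad/tools/dfII              # prints /home/cad
```

Note that the path given in the question, /home/test1/test2/test3, contains no /tools component, so the substitution leaves it untouched.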
This =~ means "Do a regex operation on that variable."
(Actually, as ikegami correctly reminds me, it is not necessarily only regex operations, because it could also be a transliteration.)
The operation in question is s#something#else#, which means replace the "something" with something "else".
The g at the end means "Do it for all occurrences of something."
Since the "else" is empty, the replacement has the effect of deleting.
The "something" is a definition according to regex syntax, roughly it means "Starting with '/tools' and later containing '/dfII', followed pretty much by anything until the end."
Note that the regex ends with /?.*. In detail, this means "a slash (/), or maybe not (?), and then absolutely anything (.) any number of times, including zero times (*)". Strictly speaking, it is not necessary to spell out "slash or not" when it is followed by "anything, any number of times", because "anything" includes a slash and "any number of times" includes zero or one. I.e. the /? could be omitted without changing the behaviour.
(Thanks ikegami for confirming.)
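The claim that /? is redundant can be checked empirically. The following sketch uses sed -E as a stand-in for Perl; the function names and sample paths are assumptions made up for the test.

```shell
#!/bin/sh
# Hypothetical helpers: the same substitution with and without the optional slash.
with_optional_slash()    { printf '%s\n' "$1" | sed -E 's#/tools.*/dfII/?.*##g'; }
without_optional_slash() { printf '%s\n' "$1" | sed -E 's#/tools.*/dfII.*##g'; }

for sample in /a/tools/x/dfII /a/tools/x/dfII/ /a/tools/x/dfII/bin /a/tools/x/dfIIextra /a/b; do
    one=$(with_optional_slash "$sample")
    two=$(without_optional_slash "$sample")
    if [ "$one" = "$two" ]; then
        echo "same: $sample -> $one"
    else
        echo "DIFFERENT: $sample -> $one vs $two"
    fi
done
```

Every sample should report "same", since the trailing .* already covers whatever /? could have matched.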
$abc{hier} =~ s#/tools.*/dfII/?.*##g;
The above command uses a regular expression to strip the trailing part of the path, starting at /tools.*/dfII (with or without anything after it), from the value of the hier member of the %abc hash.
It is pretty basic Perl, apart from the non-standard regular expression delimiters (# instead of the standard /). They avoid the need to escape / inside the regular expression (s/\/tools.*\/dfII\/?.*//g).
My personal preferred style-guide would make it s{/tools.*/dfII/?.*}{}g .

Get prev directory path in a variable in linux

I am trying to get the parent directory of a given directory in a variable in linux script but I am unable to get it.
MN_CURR=/home/sshekhar/Desktop
MN_PREV=`$MN_CURR/..`
echo " Displayng $MN_PREV"
I am using CentOS. Can anyone please help?
Following on from my comment, when using POSIX shell, while the parameter expansions are limited compared to a more advanced shell such as bash, ksh, or zsh, POSIX shell does provide expansions to handle string length and substring removal.
In your case you want to remove the last component of the path (the suffix beginning with '/') leaving the parent directory. For that you can use:
MN_PREV=${MN_CURR%/*}
(which will remove all characters from the right -- up to and including the last '/')
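A minimal sketch of that expansion with the variable names from the question, plus two edge cases worth knowing about. The parent_dir helper and the variables top and bare are hypothetical additions for illustration.

```shell
#!/bin/sh
MN_CURR=/home/sshekhar/Desktop
MN_PREV=${MN_CURR%/*}          # remove the shortest suffix matching '/*'
echo "Displaying $MN_PREV"     # Displaying /home/sshekhar

# The expansion is purely textual, so a few inputs need care:
top=/home
echo "[${top%/*}]"             # [] : the result is empty, not /
bare=Desktop
echo "[${bare%/*}]"            # [Desktop] : no '/' to match, nothing is removed

# dirname handles both cases, at the cost of starting a process.
parent_dir() {
    dirname "$1"
}
parent_dir /home               # /
parent_dir Desktop             # .
```

For absolute paths with at least two components, as in the question, the expansion and dirname agree.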
The reference documentation for the POSIX shell parameter expansions can be found at POSIX Programmers Guide - 2.6.2 Parameter Expansion. The expansions concerning string length and substring removal are:
${#parameter}
String Length. The length in characters of the value of parameter shall be substituted. If parameter is '*' or '@', the result of the expansion is unspecified. If parameter is unset and set -u is in effect, the expansion shall fail.
${parameter%[word]}
Remove Smallest Suffix Pattern. The word shall be expanded to produce a pattern. The parameter expansion shall then result in parameter, with the smallest portion of the suffix matched by the pattern deleted. If present, word shall not begin with an unquoted '%'.
${parameter%%[word]}
Remove Largest Suffix Pattern. The word shall be expanded to produce a pattern. The parameter expansion shall then result in parameter, with the largest portion of the suffix matched by the pattern deleted.
${parameter#[word]}
Remove Smallest Prefix Pattern. The word shall be expanded to produce a pattern. The parameter expansion shall then result in parameter, with the smallest portion of the prefix matched by the pattern deleted. If present, word shall not begin with an unquoted '#'.
${parameter##[word]}
Remove Largest Prefix Pattern. The word shall be expanded to produce a pattern. The parameter expansion shall then result in parameter, with the largest portion of the prefix matched by the pattern deleted.
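The difference between the smallest and largest variants is easiest to see on a value where the pattern can match in more than one place. The file name below is an invented example, not something from the question.

```shell
#!/bin/sh
file=archive.tar.gz            # hypothetical sample value

echo "${#file}"                # 14          (string length)
echo "${file%.*}"              # archive.tar (smallest suffix matching '.*' removed)
echo "${file%%.*}"             # archive     (largest suffix matching '.*' removed)
echo "${file#*.}"              # tar.gz      (smallest prefix matching '*.' removed)
echo "${file##*.}"             # gz          (largest prefix matching '*.' removed)
```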

What does the following sed statement mean

sed 's/<img src=\"\([^"]*\).*/\1/g'
input:
<img src="https://geo.yahoo.com/b?s=792600534"; height="1" width="1" style="position: absolute;" />
output:
https://geo.yahoo.com/b?s=792600534
This part is the regular expression to match, with a capturing group that is later referred to as \1 (the first capturing group). It extracts the value of the src attribute.
First part of the regex -> <img src=\"
capturing group -> \([^"]*\)
rest of the regex -> .*
The expression inside the square brackets could be read as: "anything not a double quote".
sed is a scripting language. Its s command performs substitutions using regular expressions. The syntax is s/regex/replacement/flags. In your example, you have the regex
<img src=\"\([^"]*\).*
and the replacement
\1
and the flags
g
The regex is apparently attempting to parse HTML, which earns you a place in a warm location where a friendly gentleman with a pitchfork helps you with motivational issues. Far, far away, God reluctantly ends the life of a fluffy kitten.
The regular expression contains a capturing group, which is simply the text which matched between the parentheses. The replacement \1 refers back to this captured text. So in brief, you are taking away the parts which matched around this captured string.
s/foo\(bar\)baz/\1/
replaces foobarbaz with just bar, retrieving the "bar" part from whatever matched, rather than hard-coding a replacement string.
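A quick way to check what \1 yields is to run the substitution on a couple of strings. The helper name keep_middle and the second input are assumptions made up for this sketch.

```shell
#!/bin/sh
# keep_middle is a hypothetical wrapper around the substitution shown above.
keep_middle() {
    printf '%s\n' "$1" | sed 's/foo\(bar\)baz/\1/'
}

keep_middle foobarbaz        # bar
keep_middle xxfoobarbazyy    # xxbaryy
```

The second call shows that only the matched text is replaced; anything outside the match stays where it was.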
The regular expression .* matches any character any number of times; the regular expression engine will prefer the longest, leftmost possible match.
The regular expression [^"] matches a single character which is not (newline or) ", and the * again says to match as many times as possible. So "\([^"]*\)" finds a double-quoted string, and captures its contents; the negated " prevents the regular expression from matching past the closing quote when matching as many characters as possible. (As noted in comments, the backslash before the first " is unnecessary, but basically harmless. It just tells us that whoever wrote this isn't a regex wizard.)
However, your example just implicitly includes the closing quote in the .* match which will simply match everything from the closing quote through to the end of the line.
The g flag says to repeat the substitution command as many times as possible; so if an input line contains multiple matches, all of them will be replaced. (Without the g flag, sed will just replace the first match it finds on a line.) But since you just removed the rest of the line, the flag isn't actually useful here; there can only ever be a single match.
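Putting the pieces together, here is a sketch of the whole command on made-up input lines (the example.com URL and the a.png line are assumptions). It drops the unnecessary backslash and wraps the command in a hypothetical extract_src function, with a second variant that keeps the g flag for comparison.

```shell
#!/bin/sh
# Hypothetical wrappers around the command from the question.
extract_src()   { sed 's/<img src="\([^"]*\).*/\1/'; }
extract_src_g() { sed 's/<img src="\([^"]*\).*/\1/g'; }

line='<img src="https://example.com/pixel?id=1" height="1" width="1" />'
printf '%s\n' "$line" | extract_src      # https://example.com/pixel?id=1
printf '%s\n' "$line" | extract_src_g    # same output: .* consumed the rest of the line

# Text before the match is not removed, which is easy to overlook.
printf '%s\n' '<p><img src="a.png" alt="x"></p>' | extract_src   # <p>a.png
```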
The gentleman with the pitchfork doesn't want me to tell you this, but this code is not suitable for a general-purpose script. There is no guarantee that the src attribute of the img element will be immediately adjacent to the img opening tag with just a single space in between; HTML allows arbitrary spacing (including a line wrap) and you can have other attributes like id or alt or title which could go before or after the src attribute. The proper solution is to use an HTML parser to extract the src attributes of img tags with proper understanding of the surrounding syntax.
xmlstarlet sel -T -t -m "/img" -m "@src" -v '.' -n
... though the stray semicolon after the src attribute is an HTML syntax violation; is it really there in your input?
(xmlstarlet command line shamelessly adapted from https://stackoverflow.com/a/3174307/874188)

Convert Perl to Shell

I have a Perl script that I use to SNMP walk devices. However, the server I have available to me does not allow me to install all the modules needed, so I need to convert the script to shell (sh). I can run the script on individual devices, but I would like it to read from a text file like it did in Perl. The Perl script starts with:
open(TEST, "cat test.txt |");
@records=<TEST>;
close(TEST);
foreach $line (@records)
{
($field1, $field2, $field3)=split(/\s+/, $line);
# Run and record SNMP walk results.
Depending on exactly what the input is and what you are trying to do, that perl code fragment would likely translate to:
while read field1 field2 field3
do
# Run and record SNMP walk results.
echo "1=$field1 2=$field2 3=$field3"
done <text.txt
For example, if text.txt is:
$ cat text.txt
one two three
i ii iii
Then, the above code produces the output:
1=one 2=two 3=three
1=i 2=ii 3=iii
As you can see, the shell read command reads a line (record) at a time and also does splitting on whitespace. There are many options for read to control whether newlines or something else divide records (-d) and whether splitting is to be done on whitespace or something else (IFS) or whether backslashes in the input are to be treated as escape characters or not (-r). See man bash.
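As a sketch of the IFS and -r options mentioned above, the following reads colon-separated records instead of whitespace-separated ones. The function name parse_records, the field names, and the sample records are all invented for the example.

```shell
#!/bin/sh
# parse_records is a hypothetical helper. It reads colon-separated records from
# standard input. IFS=: applies only to the read command, and -r keeps
# backslashes in the input literal.
parse_records() {
    while IFS=: read -r host community oid; do
        printf 'host=%s community=%s oid=%s\n' "$host" "$community" "$oid"
    done
}

printf '%s\n' 'router1:public:1.3.6.1.2.1.1' 'switch2:private:1.3.6.1.2.1.2' | parse_records
# host=router1 community=public oid=1.3.6.1.2.1.1
# host=switch2 community=private oid=1.3.6.1.2.1.2

# Effect of -r on a backslash in the input:
printf '%s\n' 'C:\temp x' | { read -r first rest; printf '%s\n' "$first"; }   # C:\temp
printf '%s\n' 'C:\temp x' | { read first rest; printf '%s\n' "$first"; }      # C:temp
```

If a record has more fields than variables, the last variable receives the remainder of the line.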
while read string; do
str1=${string%% *}
str3=${string##* }
temp=${string#$str1 }
str2=${temp%% *}
echo $str1 $str2 $str3
done <test.txt
alternate version
while read string; do
str1=${string%% *}
temp=${string#$str1 }
str2=${temp%% *}
temp=${string#$str1 $str2 }
str3=${temp%% *}
echo $str1 $str2 $str3
done <test.txt
POSIX substring parameter expansion
${parameter%word}
Remove Smallest Suffix Pattern. The word shall be expanded to produce
a pattern. The parameter expansion shall then result in parameter,
with the smallest portion of the suffix matched by the pattern
deleted.
${parameter%%word}
Remove Largest Suffix Pattern. The word shall be expanded to produce a
pattern. The parameter expansion shall then result in parameter, with
the largest portion of the suffix matched by the pattern deleted.
${parameter#word}
Remove Smallest Prefix Pattern. The word shall be expanded to produce
a pattern. The parameter expansion shall then result in parameter,
with the smallest portion of the prefix matched by the pattern
deleted.
${parameter##word}
Remove Largest Prefix Pattern. The word shall be expanded to produce a
pattern. The parameter expansion shall then result in parameter, with
the largest portion of the prefix matched by the pattern deleted.

Why doesn't '*' work as a perl regexp in my .Rbuildignore file?

When I try to build a package with the following in my .Rbuildignore file,
*pdf
*Rdata
I get the errors:
Warning in readLines(ignore_file) :
incomplete final line found on '/home/user/project/.Rbuildignore'
and
invalid regular expression '*pdf'
I thought '*' was a wildcard for one or more characters?
There are two styles of pattern matching for files:
regular expressions. These are used for general string pattern matching. See ?regex
globs. These are typically used by UNIX shells. See ?Sys.glob
You seem to be thinking in terms of globs but .Rbuildignore uses regular expressions. To convert a glob to a regular expression try
> glob2rx("*pdf")
[1] "^.*pdf$"
See help(regex) for help on regular expressions, esp. the Perl variant, and try
.*pdf
.*Rdata
instead. The 'dot' matches any character, and the 'star' says that it can repeat zero or more times. I just tried it on a package of mine and this did successfully ignore a pdf vignette as we asked it to.
In a perl regexp, use .*? as a wildcard.
But I think that what you actually want is pdf$ and Rdata$ as entries in .Rbuildignore seem to affect files whose paths they match only partially, too. $ means "end of the path".
* is a quantifier that attaches to a previous expression to allow between 0 and infinite repetitions of it. Since you have not preceded the quantifier with an expression, this is an error.
. is an expression that matches any character. So I suspect that you want .*pdf, .*Rdata, etc.
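The glob versus regex distinction is not specific to R. As a loose analogy only (this is shell, not how R processes .Rbuildignore), the sketch below matches the same made-up file names with a shell glob and with a regular expression through grep -E. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: one uses a glob (case pattern), one a regular expression.
matches_glob()  { case $1 in *pdf) return 0 ;; *) return 1 ;; esac; }
matches_regex() { printf '%s\n' "$1" | grep -Eq 'pdf$'; }

for name in report.pdf notes/draft.pdf pdf_tools.R data.Rdata; do
    if matches_glob "$name";  then g=yes; else g=no; fi
    if matches_regex "$name"; then r=yes; else r=no; fi
    echo "$name glob=$g regex=$r"
done
```

The glob *pdf and the regex pdf$ select the same names here (report.pdf and notes/draft.pdf), which is the translation the answers above describe.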