bash: extract double from string


I have this string in bash:
str=sdk.iphoneos4.1.sdk
and I would like to have a variable with '4.1' in it
is there any way to parse a float/double value in bash ?

In Bash 3.2 or greater:
str=sdk.iphoneos4.1.sdk
pattern='[0-9]+\.[0-9]+'
[[ $str =~ $pattern ]]
echo ${BASH_REMATCH[0]}
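
The same regex match can be wrapped in a small function that also reports when no number is found. This is only a sketch: the function name extract_version is made up for illustration, and the pattern is widened (an assumption beyond the question) so that versions with more than two components, such as 4.0.0.1, are also captured.

```shell
#!/usr/bin/env bash
# extract_version is a hypothetical helper name, not part of the answer above.
# It prints the first dotted number found in its argument and
# returns 1 if the string contains none.
extract_version() {
    local pattern='[0-9]+(\.[0-9]+)+'
    if [[ $1 =~ $pattern ]]; then
        printf '%s\n' "${BASH_REMATCH[0]}"
    else
        return 1
    fi
}

extract_version sdk.iphoneos4.1.sdk       # prints 4.1
extract_version sdk.iphoneos4.0.0.1.sdk   # prints 4.0.0.1
```

Checking the return status of [[ ... =~ ... ]] matters because BASH_REMATCH is not a reliable indicator on its own when the match fails.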

Assuming the surrounding text always stays the same:
str=${str#sdk.iphoneos}
str=${str%.sdk}
This is less portable (bash only), but accepts anything in place of iphoneos:
shopt -s extglob
str=${str##sdk.*([a-z])}
str=${str%.sdk}

assuming no other digits elsewhere
$ str=sdk.iphoneos4.0.0.1.sdk
$ echo $str | grep -Po '(\d+.*\d+)(?=\.)'
4.0.0.1

Related

sed replacing content with star

I can use sed from tcsh like this:
set a = `echo $a | sed -e 's_old_new_'`
Everything is fine, but when I want to do this:
set a = `echo $a | sed -e 's_old_*new_'`
I get "set: No match." How can I escape this star?

I don't know much about tcsh, but a few experiments suggest that set, when assigning a variable, attempts to glob-expand a * on the right-hand side. Here is something that may help:
set a="`echo '2e2' | sed -e 's_e_*_'`"
echo "$a"
2*2
echo $a
echo: No match.
So double quote around back quotes and it will work.
set a = "`echo $a | sed -e 's_old_new_'`"

A command substitution (` or $(..)) that is not enclosed in double quotes is subject to filename expansion (aka 'globbing') and word splitting.
Normally, a variable assignment would suppress filename expansion and word splitting of the RHS (even without double quotes), but apparently not filename expansion in the case of command substitution.
Here's a test I ran for reference purposes:
$ touch randomfile
$ a="*file"
$ var_expand=$a
$ echo "$var_expand"
*file
$
$ cmd_subst=$(echo '*file')
$ echo "$cmd_subst"
randomfile
So I guess it's good practice to always double quote the command substitution when you are assigning to variable.
safe="$(cmd)"
Note: This is tested in Bash but I think tcsh exhibits a similar behavior in this respect.

How can sed replace "\ " (backslash + space)?

In a bash script, files with spaces show up as "File\ with\ spaces.txt" and I want to substitute those slashed-spaces with either _ or +.
How can I tell sed to do that? I had no success using;
$1=~/File\ with\ spaces.txt
ext=$1
web=$(echo "$ext" | sed 's/\ /+/')
I'm open to suggestions if there's a better way than through sed.
[EDIT]: Foo Bah's solution works well, but it substitutes only the first space because the text following it is treated as arguments rather than part of the $1. Any way around this?

sed "s/\\\\ /+/"
Inside double quotes, \\\\ evaluates to \\ at the shell level, and then to a literal \ within sed.

sed recognises "\ " as a space just fine:
bee@i20 ~ $ echo file\ 123 | sed 's/\ /+/'
file+123
Your bash script syntax is all wrong, though.
Not sure what you were trying to do with the script, but here is an example of replacing spaces with +:
ext=~/File\ with\ spaces.txt
web=`echo "$ext" | sed 's/\ /+/g'`
echo $web
Upd:
Oh, and you need the g flag to replace all occurrences of a space, not only the first one. Fixed above.

you want to escape the slash:
web=$(echo "$ext" | sed 's/\\ /_/g')
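
Since the question is open to alternatives to sed, here is a minimal sketch using Bash parameter expansion instead. It assumes the variable really contains literal backslash-space pairs, as described in the question; the variable names follow the question and the sample value is illustrative.

```shell
#!/usr/bin/env bash
# Bash-only sketch: replace every backslash-space pair without calling sed.
ext='File\ with\ spaces.txt'   # single quotes keep the backslashes literal
web="${ext//\\ /+}"            # // replaces all matches; \\ is one literal backslash
printf '%s\n' "$web"           # prints File+with+spaces.txt
```

Swapping the + for _ in the expansion gives the underscore variant.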

single quotes are your friend
the following should be used with single quoted args for $1 and $2
#!/bin/bash
ESCAPE='\\'
if [ $# -ne 2 ];then
echo "$0 <TO_ESCAPE> <IN_STRING>"
echo args should be in single quotes!!
exit 1
fi
TO_ESCAPE="${1}"
IN_STRING="${2}"
if [ "${TO_ESCAPE}" = '\' ];then
TO_ESCAPE='\\'
fi
echo "${IN_STRING}" | sed "s/${TO_ESCAPE}/${ESCAPE}${TO_ESCAPE}/g"

How to remove trailing whitespaces with sed?

I have a simple shell script that removes trailing whitespace from a file. Is there any way to make this script more compact (without creating a temporary file)?
sed 's/[ \t]*$//' $1 > $1__.tmp
cat $1__.tmp > $1
rm $1__.tmp

You can use the in place option -i of sed for Linux and Unix:
sed -i 's/[ \t]*$//' "$1"
Be aware the expression will delete trailing t's on OSX (you can use gsed to avoid this problem). It may delete them on BSD too.
If you don't have gsed, here is the correct (but hard-to-read) sed syntax on OSX:
sed -i '' -E 's/[ '$'\t'']+$//' "$1"
Three single-quoted strings ultimately become concatenated into a single argument/expression. There is no concatenation operator in bash, you just place strings one after the other with no space in between.
The $'\t' resolves as a literal tab-character in bash (using ANSI-C quoting), so the tab is correctly concatenated into the expression.

At least on Mountain Lion, Viktor's answer will also remove the character 't' when it is at the end of a line. The following fixes that issue:
sed -i '' -e's/[[:space:]]*$//' "$1"
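
To check the [[:space:]] expression without touching any file, it can be tried as a plain stdin filter first. This is a sketch only; the function name strip_trailing_ws is made up for illustration. The second input line ends in the letter t followed by spaces, which is the case that [ \t] gets wrong on OS X.

```shell
# strip_trailing_ws is a hypothetical name; it filters stdin to stdout.
strip_trailing_ws() {
    sed 's/[[:space:]]*$//'
}

# First line ends in a space and a tab, second line in the letter t and spaces.
printf 'abc \t\nat  \n' | strip_trailing_ws
# prints:
# abc
# at
```

Because the character class is POSIX, the same expression should behave identically with GNU and BSD sed; only the -i syntax differs between them.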

Thanks to codaddict for suggesting the -i option.
The following command solves the problem on Snow Leopard
sed -i '' -e's/[ \t]*$//' "$1"

It is best to also quote $1:
sed -i.bak 's/[[:blank:]]*$//' "$1"

var1="\t\t Test String trimming "
echo $var1
Var2=$(echo "${var1}" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
echo $Var2

I have a script in my .bashrc that works under OSX and Linux (bash only !)
function trim_trailing_space() {
if [[ $# -eq 0 ]]; then
echo "$FUNCNAME will trim (in place) trailing spaces in the given file (remove unwanted spaces at end of lines)"
echo "Usage :"
echo "$FUNCNAME file"
return
fi
local file=$1
unamestr=$(uname)
if [[ $unamestr == 'Darwin' ]]; then
#specific case for Mac OSX
sed -E -i '' 's/[[:space:]]*$//' "$file"
else
sed -i 's/[[:space:]]*$//' "$file"
fi
}
to which I add:
SRC_FILES_EXTENSIONS="js|ts|cpp|c|h|hpp|php|py|sh|cs|sql|json|ini|xml|conf"
function find_source_files() {
if [[ $# -eq 0 ]]; then
echo "$FUNCNAME will list sources files (having extensions $SRC_FILES_EXTENSIONS)"
echo "Usage :"
echo "$FUNCNAME folder"
return
fi
local folder=$1
unamestr=$(uname)
if [[ $unamestr == 'Darwin' ]]; then
#specific case for Mac OSX
find -E $folder -iregex '.*\.('$SRC_FILES_EXTENSIONS')'
else
#Rhahhh, lovely
local extensions_escaped=$(echo $SRC_FILES_EXTENSIONS | sed s/\|/\\\\\|/g)
#echo "extensions_escaped:$extensions_escaped"
find $folder -iregex '.*\.\('$extensions_escaped'\)$'
fi
}
function trim_trailing_space_all_source_files() {
for f in $(find_source_files .); do trim_trailing_space $f;done
}

For those who are looking for efficiency (many files to process, or huge files), using the + repetition operator instead of * makes the command more than twice as fast.
With GNU sed:
sed -Ei 's/[ \t]+$//' "$1"
sed -i 's/[ \t]\+$//' "$1" # The same without extended regex
I also quickly benchmarked something else: using [ \t] instead of [[:space:]] also significantly speeds up the process (GNU sed v4.4):
sed -Ei 's/[ \t]+$//' "$1"
real 0m0,335s
user 0m0,133s
sys 0m0,193s
sed -Ei 's/[[:space:]]+$//' "$1"
real 0m0,838s
user 0m0,630s
sys 0m0,207s
sed -Ei 's/[ \t]*$//' "$1"
real 0m0,882s
user 0m0,657s
sys 0m0,227s
sed -Ei 's/[[:space:]]*$//' "$1"
real 0m1,711s
user 0m1,423s
sys 0m0,283s

Just for fun:
#!/bin/bash
FILE=$1
if [[ -z $FILE ]]; then
echo "You must pass a filename -- exiting" >&2
exit 1
fi
if [[ ! -f $FILE ]]; then
echo "There is no file '$FILE' here -- exiting" >&2
exit 1
fi
BEFORE=`wc -c "$FILE" | cut --delimiter=' ' --fields=1`
# >>>>>>>>>>
sed -i.bak -e's/[ \t]*$//' "$FILE"
STATUS=$?   # capture sed's exit status before later commands overwrite $?
# <<<<<<<<<<
AFTER=`wc -c "$FILE" | cut --delimiter=' ' --fields=1`
if [[ $STATUS != 0 ]]; then
echo "Some error occurred" >&2
else
echo "Filtered '$FILE' from $BEFORE characters to $AFTER characters"
fi

In the specific case of sed, the -i option that others have already mentioned is far and away the simplest and sanest one.
In the more general case, sponge, from the moreutils collection, does exactly what you want: it lets you replace a file with the result of processing it, in a way specifically designed to keep the processing step from tripping over itself by overwriting the very file it's working on. To quote the sponge man page:
sponge reads standard input and writes it out to the specified file. Unlike a shell redirect, sponge soaks up all its input before writing the output file. This allows constructing pipelines that read from and write to the same file.
https://joeyh.name/code/moreutils/

To remove trailing whitespace for all files in the current directory, I use
ls | xargs sed -i 's/[ \t]*$//'
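
Piping ls into xargs splits file names on whitespace, so a name containing a space is passed to sed as two arguments. A sketch of one alternative using find follows. It assumes GNU sed (for -i without a suffix argument) and a find that supports -maxdepth; trim_dir is a hypothetical helper name. Unlike ls, find also picks up dot files, and -type f skips directories.

```shell
# trim_dir is a hypothetical helper. Assumes GNU sed and find with -maxdepth.
# Strips trailing whitespace from every regular file directly inside the
# given directory (default: current directory), without descending further.
trim_dir() {
    find "${1:-.}" -maxdepth 1 -type f -exec sed -i 's/[[:space:]]*$//' {} +
}
```

Usage would be trim_dir with no argument for the current directory, or trim_dir some/dir for another one.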

These answers confused me. Both of these sed commands worked for me on a Java source file:
sed 's/\s\+$//' filename
sed 's/[[:space:]]\+$//' filename
for test purposes, I used:
$ echo " abc " | sed 's/\s\+$/-xx/'
abc-xx
$ echo -e " abc \t\t " | sed 's/\s\+$/-xx/'
abc-xx
Replacing all trailing whitespace with "-xx".
@Viktor wishes to avoid a temporary file; personally, I would only use -i (in-place) with a backup suffix, at least until I know the command works.
Sorry, I just found the existing responses a little oblique. sed is a straightforward tool, and it is easier to approach it in a straightforward way 90% of the time. Or perhaps I missed something; happy to be corrected there.

To only strip whitespaces (in my case spaces and tabs) from lines with at least one non-whitespace character (this way empty indented lines are not touched):
sed -i -r 's/([^ \t]+)[ \t]+$/\1/' "$file"

sed or grep or awk to match very very long lines

more file
param1=" 1,deerfntjefnerjfntrjgntrjnvgrvgrtbvggfrjbntr*rfr4fv*frfftrjgtrignmtignmtyightygjn 2,3,4,5,6,7,8,
rfcmckmfdkckemdio8u548384omxc,mor0ckofcmineucfhcbdjcnedjcnywedpeodl40fcrcmkedmrikmckffmcrffmrfrifmtrifmrifvysdfn drfr4fdr4fmedmifmitfmifrtfrfrfrfnurfnurnfrunfrufnrufnrufnrufnruf"****
need to match the content of param1 as
sed -n "/$param1/p" file
but because of the line length (a very long line) I can't match the line.
What's the best way to match very long lines?

The problem you are facing is that param1 contains special characters which are being interpreted by sed. The asterisk ('*') is used to mean 'zero or more occurrences of the previous character', so when this character is interpreted by sed there is nothing left to match the literal asterisk you are looking for.
The following is a working bash script that should help:
#!/bin/bash
param1=' 1,deerfntjefnerjfntrjgntrjnvgrvgrtbvggfrjbntr\*rfr4fv\*frfftrjgtrignmtignmtyightygjn 2,3,4,5,6,7,8, rfcmckmfdkckemdio8u548384omxc,mor0ckofcmineucfhcbdjcnedjcnywedpeodl40fcrcmkedmrikmckffmcrffmrfrifmtrifmrifvysdfn'
cat <<EOF | sed "s/${param1}/Bubba/g"
1,deerfntjefnerjfntrjgntrjnvgrvgrtbvggfrjbntr*rfr4fv*frfftrjgtrignmtignmtyightygjn 2,3,4,5,6,7,8, rfcmckmfdkckemdio8u548384omxc,mor0ckofcmineucfhcbdjcnedjcnywedpeodl40fcrcmkedmrikmckffmcrffmrfrifmtrifmrifvysdfn
EOF
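
Rather than escaping the asterisks by hand, the pattern can be escaped programmatically before it is handed to sed. The following is a sketch under assumptions: escape_bre is a made-up helper name, it targets basic regular expressions with / as the address delimiter, and it does not handle patterns that contain newlines.

```shell
# escape_bre is a hypothetical helper: it backslash-escapes the characters
# that are special in a sed basic regular expression, plus the / delimiter.
# The inner sed uses | as its own delimiter so that / can sit in the bracket.
escape_bre() {
    printf '%s\n' "$1" | sed 's|[]\/$*.^[]|\\&|g'
}

needle='a*b'
# Only the line that contains a literal "a*b" is printed.
printf '%s\n' 'x a*b y' 'xaab' | sed -n "/$(escape_bre "$needle")/p"
# prints: x a*b y
```

Without the escaping step, the pattern a*b would be read as "zero or more a, then b" and both input lines would match.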

Maybe the problem is that your $param1 contains special characters? This works for me:
A="$(perl -e 'print "a" x 10000')"
echo $A | sed -n "/$A/p"
($A contains 10 000 a characters).
echo $A | grep -F $A
and
echo $A | grep -P $A
also work (the second requires a grep with built-in PCRE support). If you want pattern matching, use either this or pcregrep; if you don't, use fixed-string grep (grep -F).
echo $A | grep $A
is too slow.
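
A small sketch of the fixed-string approach with a pattern that contains an asterisk, as in the question. The sample strings are shortened stand-ins for the long line, not the original data. Quoting the variable and putting -- before it guards against word splitting and against patterns that start with a dash.

```shell
# Illustrative data: a short fragment standing in for the very long line.
pattern='fv*frf'
printf '%s\n' 'rfr4fv*frfftr' 'rfr4ffrfftr' | grep -F -- "$pattern"
# prints: rfr4fv*frfftr
```

With -F the asterisk is matched literally, so only the first line is printed.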

How can I grep for a value from a shell variable?

I've been trying to grep an exact shell 'variable' using word boundaries,
grep "\<$variable\>" file.txt
but haven't managed to; I've tried everything else but haven't succeeded.
Actually I'm invoking grep from a Perl script:
$attrval=`/usr/bin/grep "\<$_[0]\>" $upgradetmpdir/fullConfiguration.txt`
$_[0] and $upgradetmpdir/fullConfiguration.txt contains some matching "text".
But $attrval is empty after the operation.

@OP, you should do that 'grepping' in Perl. Don't call system commands unnecessarily unless there is no choice.
$mysearch="pattern";
while (<>){
chomp;
@s = split /\s+/;
foreach my $line (@s){
if ($line eq $mysearch){
print "found: $line\n";
}
}
}

I'm not seeing the problem here:
file.txt:
hello
hi
anotherline
Now,
mala@human ~ $ export GREPVAR="hi"
mala@human ~ $ echo $GREPVAR
hi
mala@human ~ $ grep "\<$GREPVAR\>" file.txt
hi
What exactly isn't working for you?

Not every grep supports the ex(1) / vi(1) word boundary syntax.
I think I would just do:
grep -w "$variable" ...
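
A quick sketch of how -w behaves with a shell variable, using the sample words from the earlier answer plus one extra line ("this") added here for illustration, since it contains "hi" inside a longer word.

```shell
variable='hi'
# "this" contains "hi" but not as a whole word, so -w skips it.
printf '%s\n' hello hi anotherline this | grep -w -- "$variable"
# prints: hi
```

The double quotes let the shell expand $variable before grep sees it, and -w replaces the \< and \> anchors that not every grep supports.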

Using single quotes works for me in tcsh:
grep '<$variable>' file.txt
I am assuming your input file contains the literal string: <$variable>

If variable=foo are you trying to grep for "foo"? If so, it works for me. If you're trying to grep for the variable named "$variable", then change the quotes to single quotes.

On a recent Linux it works as expected. You could try egrep instead.

Say you have
$ cat file.txt
This line has $variable
DO NOT PRINT ME! $variableNope
$variable also
Then with the following program
#! /usr/bin/perl -l
use warnings;
use strict;
system("grep", "-P", '\$variable\b', "file.txt") == 0
or warn "$0: grep exited " . ($? >> 8);
you'd get output of
This line has $variable
$variable also
It uses the -P switch to GNU grep that matches Perl regular expressions. The feature is still experimental, so proceed with care.
Also note the use of system LIST that bypasses shell quoting, allowing the program to specify arguments with Perl's quoting rules rather than the shell's.
You could use the -w (or --word-regexp) switch, as in
system("grep", "-w", '\$variable', "file.txt") == 0
or warn "$0: grep exited " . ($? >> 8);
to get the same result.

With single quotes it won't work, because the shell does not expand variables inside them. You should use double quotes.
For example:
this won't work
--------------
for i in 1
do
grep '$i' file
done
this will work
--------------
for i in 1
do
grep "$i" file
done
