Linux bash - Regular Expression to find a string included between a static string and the first number - sed

In Linux bash, I need to extract from a string the characters that lie between a static string and the first number.
A simple example should help:
Base string: hello-world-my_name-1.0.jar
Static string: hello-world-
Target: my_name
I'm trying with
ls *.jar | sed 's/(?<=hello-world-)(.+?(?=-[0-9]))/\1/'
but unfortunately I can't understand where I'm wrong

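sed does not support lookarounds such as (?<=...) and (?=...) or lazy quantifiers such as .+?; those are Perl/PCRE features. Instead, match everything, capture the middle string in a group, and replace the whole match with the captured value: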
printf 'hello-world-my_name-1.0.jar\n' | sed 's/hello-world-\([^0-9]*\).*/\1/'
Output:
my_name-
If the trailing hyphen needs to be removed too, match it outside the group, or remove it with a second substitution:
printf 'hello-world-my_name-1.0.jar\n' | sed 's/hello-world-\([^0-9]*\)-.*/\1/'
# or
printf 'hello-world-my_name-1.0.jar\n' | sed 's/hello-world-\([^0-9]*\).*/\1/;s/-$//'
Alternatively, if the middle string cannot contain hyphens, add the hyphen to the negated character class:
printf 'hello-world-my_name-1.0.jar\n' | sed 's/hello-world-\([^0-9-]*\).*/\1/'
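If the prefix is fixed and the version always starts with a hyphen followed by a digit, the same extraction can also be sketched with POSIX parameter expansion instead of sed. The function name extract_middle is made up for illustration; the hello-world- prefix is the one from the question.

```shell
# Hypothetical helper: print the part of a name between the fixed prefix
# "hello-world-" and the first "-<digit>" that follows it.
extract_middle() {
    rest=${1#hello-world-}              # drop the static prefix
    printf '%s\n' "${rest%%-[0-9]*}"    # drop everything from the first "-<digit>" on
}

extract_middle 'hello-world-my_name-1.0.jar'   # prints my_name
```

Unlike the [^0-9]* patterns, this keeps digits and hyphens inside the middle string, as long as no hyphen there is directly followed by a digit.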

Related

How to extract a specific character inside a parentheses using sed command?

I want to extract the atomic symbol inside parentheses using sed.
The data I have is in the form C(X12), and I only want the X symbol.
For example, this test command:
echo "C(Br12)" | sed 's/[0-9][0-9])$//g'
gives me C(Br.
You can use
sed -n 's/.*(\(.*\)[0-9]\{2\})$/\1/p'
See the online demo:
sed -n 's/.*(\(.*\)[0-9]\{2\})$/\1/p' <<< "c(Br12)"
# => Br
Details
-n - suppresses the default line output
.*(\(.*\)[0-9]\{2\})$ - a regex that matches
.* - any text
( - a ( char
\(.*\) - Capturing group 1: any text up to the last....
[0-9]\{2\} - two digits
)$ - a ) at the end of string
\1 - replaces with Group 1 value
p - prints the result of the substitution.
For example:
echo "C(Br12)" | sed 's/C(\(.\).*/\1/'
C( - matches the literal text C(
\(.\) - matches any single character and remembers it as backreference \1
.* - matches everything after it
\1 - replaces the whole match with the remembered character. Only the first character inside the parentheses is kept, so Br yields B.
Research sed, regex and backreferences for more information.
Try using the following command
echo "C(BR12)" | cut -d "(" -f2 | cut -d ")" -f1 | sed 's/[0-9]*//g'
The cut tool splits the input and returns the string between the parentheses. That string is then passed to sed, which removes the digits from it.
Not a fully sed solution but this will get you the output.
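To keep the whole symbol regardless of how many digits follow it, a sed-only variant can capture just the letters inside the parentheses. This is a sketch that assumes the symbol consists of ASCII letters only; atomic_symbol is a hypothetical helper name.

```shell
# Hypothetical helper: print the letters that directly follow the last "("
# when they are followed by optional digits and a ")".
atomic_symbol() {
    printf '%s\n' "$1" | sed -n 's/.*(\([A-Za-z]*\)[0-9]*).*/\1/p'
}

atomic_symbol 'C(Br12)'   # prints Br
atomic_symbol 'C(X7)'     # prints X
```

Because of -n and the p flag, input that does not contain such a parenthesised group produces no output at all.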

sed replace positional match of unknown string divided by user-defined separator

I want to rename the (known) 3rd folder within an (unknown) file path held in a string, i.e. the folder on the 3rd level when the separator is /.
I need a one-liner explicitly for sed, because I later want to use it for tar --transform=EXPRESSION.
string="/db/foo/db/bar/db/folder"
echo "$string" | sed 's,db,databases,'
sed should replace "db" only on the 3rd level.
Expected result:
/db/foo/databases/bar/db/folder
You could use a capturing group to capture /db/foo/ and then match db. Then use the first capturing group in the replacement via \1:
string="/db/foo/db/bar/db/folder"
echo -e "$string" | sed 's,^\(/[^/]*/[^/]*/\)db,\1databases,'
About the pattern
^ Start of string
\( Start capture group
/[^/]*/[^/]*/ Match the first 2 parts using a negated character class
\) Close capture group
db Match literally
That will give you
/db/foo/databases/bar/db/folder
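The pattern above would also rewrite a third component that merely starts with db, such as dbx. A stricter sketch, assuming a sed that accepts -E for extended regular expressions (GNU and BSD sed do), requires a / or the end of the string right after db. The name rename_third_level is hypothetical.

```shell
# Hypothetical helper: rename the third path component from "db" to
# "databases", but only when that component is exactly "db".
rename_third_level() {
    printf '%s\n' "$1" | sed -E 's,^(/[^/]*/[^/]*/)db(/|$),\1databases\2,'
}

rename_third_level '/db/foo/db/bar/db/folder'   # prints /db/foo/databases/bar/db/folder
rename_third_level '/db/foo/dbx/bar'            # prints /db/foo/dbx/bar (unchanged)
```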
If awk is also an option for this task:
$ awk 'BEGIN{FS=OFS="/"} $4=="db"{$4="databases"} 1' <<<'/db/foo/db/bar/db/folder'
/db/foo/databases/bar/db/folder
FS = OFS = "/" assigns / to both the input and output field separators,
$4 == "db" { $4 = "databases" } if the fourth field is db, make it databases (the first field is the empty string before the leading /),
1 prints the record.
Here is a pure bash way to get this done by setting IFS=/ without calling any external utility:
string="/db/foo/db/bar/db/folder"
string=$(IFS=/; read -a arr <<< "$string"; arr[3]='databases'; echo "${arr[*]}")
echo "$string"
/db/foo/databases/bar/db/folder

Replace every " within string

I have lines in a text file which looks like this example:
"2009217",2015,3,"N","N","2","UPPER DARBY FIREFIGHTERS "PAC"","","","","7235 WEST CHESTER PIKE","","UPPER DARBY","PA","19082","","6106220269",4245.0100,650.0000,.0000
I want to remove every extra double quote inside partial strings like "UPPER DARBY FIREFIGHTERS "PAC"" across the whole file.
So the result should be as below for each instance of the recurring double quotes:
"2009217",2015,3,"N","N","2","UPPER DARBY FIREFIGHTERS PAC","","","","7235 WEST CHESTER PIKE","","UPPER DARBY","PA","19082","","6106220269",4245.0100,650.0000,.0000
I came to this sed line:
cat file.txt | sed "s/\([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,\)\([^,]*\),\(.*\)/\1\2\3/"
But now I don't know how to replace the double quote within \2.
Is that possible with sed?
I would personally use awk for that because it is more readable:
#!/usr/bin/awk -f
BEGIN {
    # Use ',' as the input and output field delimiter
    FS = OFS = ","
}
{
    # Iterate through all fields. (NF is the number of fields.)
    for (i = 1; i <= NF; i++) {
        # If the field starts and ends with a '"'
        if ($i ~ /^".*"$/) {
            # Remove all '"' from the field
            gsub(/"/, "", $i)
            # Wrap in '"' again
            $i = "\"" $i "\""
        }
    }
    # Print the (possibly modified) record
    print
}
This might work for you (GNU sed):
sed -r ':a;s/^((([^",]*,)*("[^",]*",([^",]*,)*)*)"[^",]*)"([^,])/\1\6/;ta' file
This removes extra double quotes from strings that are surrounded by double quotes and delimited by commas.
It does this by skipping over properly constructed double-quoted strings and non-quoted strings (in this example, numbers) and then removing double quotes that are not followed by a comma.
[^",]*, # non double quoted strings
"[^",]*", # properly quoted strings
(([^",]*,)*("[^",]*",([^",]*,)*)*) # eliminate all properly constructed strings
"[^",]*"([^,]) # improper double quotes: the second " (the one not followed by ,) is removed
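A shorter sed loop can do the same job under a stronger assumption: a stray double quote never sits directly next to a comma or at the very start or end of a line. This is only a sketch under that assumption, not a general CSV repair; strip_inner_quotes is a hypothetical name, and -E assumes a sed with extended regular expressions (GNU or BSD).

```shell
# Hypothetical filter: repeatedly delete a double quote that has a
# non-comma character on both sides, until no such quote is left.
strip_inner_quotes() {
    sed -E -e ':a' -e 's/([^,])"([^,])/\1\2/' -e 'ta'
}

printf '%s\n' '"a","UPPER "PAC"","",1.5' | strip_inner_quotes
# prints "a","UPPER PAC","",1.5
```

Quotes that open or close a field always touch a comma or a line boundary, so the loop leaves them alone, including those of empty "" fields.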

cut off known substring sh

How to cut off known substring from the string in sh?
For example, I have string "http://www.myserver.org/very/very/long/path/mystring"
expression "http://www.myserver.org/very/very/long/path/" is known. How can I get "mystring"?
Thanks.
E.g. using perl:
echo "http://www.myserver.org/very/very/long/path/mystring" | perl -pe 's|^http://www.myserver.org/very/very/long/path/(.*)$|\1|'
E.g. using sed:
echo "http://www.myserver.org/very/very/long/path/mystring" | sed 's|^http://www.myserver.org/very/very/long/path/\(.*\)$|\1|'
E.g. when the search string is held in a variable, here named variable. Use double quotes to expand the variable.
echo "http://www.myserver.org/very/very/long/path/mystring" | sed "s|^${variable}\(.*\)$|\1|"
Tested under /bin/dash
$ S="http://www.myserver.org/very/very/long/path/mystring" && echo ${S##*/}
mystring
where
S is the variable-name
## removes the longest matching prefix pattern
*/ matches everything up to the last slash
For further reading, search for "##" in man dash
Some more illustrations:
$ S="/mystring/" ; echo ${S##*/}   # prints an empty line, nothing follows the last slash
$ S="/mystring" ; echo ${S##*/}
mystring
$ S="mystring" ; echo ${S##*/}
mystring
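Since the prefix is known in full, it can also be removed directly with the single # form of the same expansion, which avoids both the regex and the reliance on the last slash. A minimal sketch; the variable names prefix, url and rest are chosen here for illustration.

```shell
prefix='http://www.myserver.org/very/very/long/path/'
url='http://www.myserver.org/very/very/long/path/mystring'

# ${url#"$prefix"} removes $prefix from the start of $url; quoting $prefix
# makes the shell treat it as literal text rather than as a glob pattern.
rest=${url#"$prefix"}
printf '%s\n' "$rest"   # prints mystring
```

If the string does not start with the prefix, the expansion returns it unchanged.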

How do I search for a keyword in a file and print out the string following the keyword?

I want to search for keyword "mykey = " in a file and print out the string that is following the keyword.
I cannot do a "grep", because each line is very long. I just want to extract the string following the keyword.
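Here's what I came up with. It is not the first answer, but it works, and it does not need the final grep. The capture group holds only the value, so the keyword itself is not printed:
grep 'mykey = ' file | sed 's/.*mykey = \([A-Za-z]*\).*/\1/'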
Assuming the keyword is a single word and a space follows it, like this:
mykey = myCoolValue
grep 'mykey' /your/file/here | sed -r 's/.*mykey = ([^ ]*) .*/\1/g' | grep .
If you have pcregrep at hand, you can issue this command in terminal or in a script to get only desired text after mykey =
$ pcregrep -o '(?<=mykey = ).+' file
The regex uses a positive lookbehind, where -o returns only the matched text, not the whole line.
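Where pcregrep is not available, sed alone can filter and extract in one step with -n and the p flag. This sketch assumes the value contains no spaces; value_after_key is a hypothetical name and the filter reads from standard input.

```shell
# Hypothetical filter: for each line containing "mykey = ", print only the
# word that follows it; lines without the keyword print nothing.
value_after_key() {
    sed -n 's/.*mykey = \([^ ]*\).*/\1/p'
}

printf '%s\n' 'foo=1 mykey = myCoolValue bar=2' 'no keyword here' | value_after_key
# prints myCoolValue
```

Because the leading .* is greedy, a line that contains the keyword several times yields the value after its last occurrence.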