I need help to shorten or simplify one sed command

I would like to convert (4)wdcp(9)microsoft(3)com(0) to wdcp.microsoft.com using sed (only) as part of a rex in Splunk.
So: remove the first and last parenthesized numbers, and replace each one in the middle with a dot.
I do manage to do it using these two long and ugly commands:
echo "(4)wdcp(9)microsoft(3)com(0)" | sed -r 's/(^\([0-9]+\)|\([0-9]+\)$)//g;s/\([0-9]+\)/./g'
wdcp.microsoft.com
Can this be simplified, and hopefully shortened to one command instead of two?
PS: the URL may have more than three parts, e.g. www.microsoft.co.uk

This works: sed 's/([0-9]*)//1 ;s//./g; s/\.$//'
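Delete the first match, use the empty pattern // to reuse the same regex and replace the remaining matches with a dot, then remove the trailing dot. Note that this runs without -r, so the parentheses are literal characters in the basic regex.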
That is about as succinct as I can get it while maintaining your functionality. It's much more confusing than your replacements, though.
The space after the 1 seems required for some seds.
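To check the three-step command against a longer name, such as the co.uk case from the question, a small sketch follows. The function name strip_dns_labels is made up for illustration, and the second input string is an assumed example in the same format as the one in the question.

```shell
# strip_dns_labels is a hypothetical helper name, not part of sed or Splunk.
strip_dns_labels() {
    # 1) delete the first "(N)"
    # 2) reuse the same regex via the empty pattern // and turn the
    #    remaining "(N)" groups into dots
    # 3) drop the trailing dot left by the final "(0)"
    sed 's/([0-9]*)//1 ;s//./g; s/\.$//'
}

echo "(4)wdcp(9)microsoft(3)com(0)" | strip_dns_labels
echo "(3)www(9)microsoft(2)co(2)uk(0)" | strip_dns_labels
```

With GNU sed this should print wdcp.microsoft.com and www.microsoft.co.uk, so the number of labels does not matter.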

Related

sed backreference not being found

I am trying to use 'sed' to replace a list of paths in a file with another path.
An example string to process is:
/path/to/file/block
I want to replace /path/to/file with something else.
I have tried:
sed -r '/\s(\S+)\/block/s/\1/new_path/'
I know it's finding the matching string but I'm getting an invalid back reference error.
How can I do this?
This may do:
echo "/path/to/file/block" | sed -r 's|/\S*/(block)|/newpath/\1|'
/newpath/block
Test
echo "test=/path/file test2=/path/to/file/block test3=/home/root/file" | sed -r 's|/\S*/(block)|/newpath/\1|'
test=/path/file test2=/newpath/block test3=/home/root/file
Back-references always refer to the pattern of the s command, not to any address (before the command).
However, in this case, there's no need for addressing: we can apply the substitution to all lines (and it will change only lines where it matches), so we can write:
s,\s\S+(/block), new_path\1,
(I added a space to the RHS, as I'm guessing you didn't mean to overwrite that; also used a different separator to reduce the need for backslashes.)
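The point about back-references can be illustrated with a sketch in which an address is kept only to select lines, while the capture group lives in the s command itself. The function name swap_block_prefix and the sample lines are assumptions for illustration; [^[:space:]] is used in place of \S so the pattern does not rely on GNU extensions.

```shell
# swap_block_prefix is a hypothetical helper name.
swap_block_prefix() {
    # The address \,/block, only selects lines; it captures nothing that
    # the s command can see. The group (/block) belongs to the s pattern
    # itself, so \1 is valid in its replacement.
    sed -E '\,/block, s,[^[:space:]]+(/block),new_path\1,'
}

printf '%s\n' "keep /other/file" "use /path/to/file/block" | swap_block_prefix
```

The first line has no /block and passes through unchanged; the second should come out as "use new_path/block".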

how to replace each ,, with ,?, using sed?

I have tried the following command:
echo "123456,,7,,,,890" | sed 's/,,/\,?,/g'
Result:
123456,?,7,?,,?,890
But the result I want is:
123456,?,7,?,?,?,890
Could anyone help me ?
Thanks
Your problem is that the ,, left in the result was never seen by the g option.
At least one of its two commas comes from the replacement text, and sed does not rescan text it has already written.
To get your desired output (I would have expected only three replacements instead of four...), you need to look at the result of each replacement and replace again, until no more replacements take place.
You can achieve that with a loop: :a defines the label "a", and ta branches back "to label a" after a successful replacement.
(The g becomes unnecessary, but might be more efficient. Time it to find out in your environment.)
sed ':a;s/,,/\,?,/g;ta'
result
"123456,?,7,?,?,?,890"
Regular expressions cannot match overlapping spans. Thus, if you have ,,,, then the first two commas will be the first match, and the third and fourth commas will constitute the second match. There is no way to also match the second and third commas with /,,/.
Typically, this would be done using lookahead, to keep one of the commas out of the match; but sed does not support it. So you can switch to a more powerful regex engine, like that of perl:
echo "123456,,7,,,,890" | perl -pe 's/,(?=,)/,?/g'
Alternately, since in your specific case you will miss every other adjacent comma pair, you can just run your sed twice:
echo "123456,,7,,,,890" | sed 's/,,/,?,/g' | sed 's/,,/,?,/g'
or combine the two operations into one sed invocation:
echo "123456,,7,,,,890" | sed 's/,,/\,?,/g; s/,,/,?,/g'
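The two sed-only approaches from the answers above, the :a/ta loop and the two-pass substitution, can be compared side by side. This is only a sketch: the function names are hypothetical, the second input is an assumed extra test case with five commas in a row, and the loop is split into separate -e arguments because some non-GNU seds do not accept a label followed by ; on the same line.

```shell
# Hypothetical helper names, used only for this comparison.
fill_with_loop() {
    # :a defines label a; ta branches back to it whenever the s command
    # just replaced something, so the line is rescanned until no ",," is left.
    sed -e ':a' -e 's/,,/,?,/' -e 'ta'
}

fill_with_two_passes() {
    # After the first global pass no run of commas is longer than two,
    # so a second global pass catches every pair the first one skipped.
    sed 's/,,/,?,/g; s/,,/,?,/g'
}

for input in "123456,,7,,,,890" "1,,,,,2"; do
    echo "$input" | fill_with_loop
    echo "$input" | fill_with_two_passes
done
```

Both functions should give 123456,?,7,?,?,?,890 for the first input and 1,?,?,?,?,2 for the second.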

Changing a character in between patterns in vi/sed

I am struggling to work out how to get a , out from in between various patterns such as:
500,000
xyz ,CA
I have tried something like:
sed -E "s/\([a-zA-Z]*\),([a-zA-Z]*\)/\([a-zA-Z]*\) ([a-zA-Z]*\)/g" $file -i
It picks up the first pattern, but then overwrites it with the second pattern. I feel like I am missing something very simple and I can't work it out. Any help is really appreciated.
You're missing the notion of capture groups, I think. To refer to a parenthesized portion of the search within the replacement string, use \1 for the first group, \2 for the second group, etc.
The modified line would be:
sed -E "s/([a-zA-Z]),([a-zA-Z])/\1 \2/g" $file -i
Rather than replacing the part that matches the first ([a-zA-Z]) with the literal text "([a-zA-Z])", this modified line just copies the matched portion into the output (and likewise for the second group).
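A quick way to see the capture groups at work is to pipe a couple of sample strings through the modified expression. The function name comma_to_space and the input "xyz,CA" are assumptions made for this sketch.

```shell
# comma_to_space is a hypothetical helper name.
comma_to_space() {
    # \1 and \2 re-insert whatever the two parenthesized groups matched;
    # only the comma between them is replaced by a space.
    sed -E 's/([a-zA-Z]),([a-zA-Z])/\1 \2/g'
}

echo "xyz,CA" | comma_to_space
echo "500,000" | comma_to_space
```

The first line should become "xyz CA". The second stays "500,000", because the character classes only cover letters; a digit or a space next to the comma would need its own class.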

Sed to replace certain number of occurrences

I have the replace sed script below and it works for the first occurrence of every line but I'm trying to make it work for the first 2 occurrences per line instead of one (/1) or the whole line (/g):
sed -r '2,$s/(^ *|, *)([a-z])/\1\U\2/1'
Is there any way to do that either by combining sed commands or creating a script?
The best I can offer is
sed -r '2,$ { s/(^|,) *[a-z]/\U&/; s//\U&/; }'
The \U& trick uses the fact that the upper case version of a space is still a space; this is to make the repetition shorter. Because captures are no longer used, the regex can be simplified a little.
In the second s command, the // is a stand-in for the most recently attempted regex, so the first one is essentially executed a second time (this time matching what was originally the second appearance).
Since /1 doesn't actually do anything (replacing the first occurrence is default), I took the liberty of removing it.
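As a sketch of how the two substitutions cooperate, the command can be run on a small two-line sample. The function name and the sample data are assumptions, and \U is a GNU sed extension, so this is not expected to work with other seds.

```shell
# capitalize_first_two is a hypothetical helper name.
capitalize_first_two() {
    # GNU sed only: \U upper-cases what follows in the replacement and &
    # is the whole match. The first s capitalizes the first field; the
    # empty regex // reuses the same pattern, which now skips that field
    # (it no longer starts with [a-z]) and hits the second one.
    sed -E '2,$ { s/(^|,) *[a-z]/\U&/; s//\U&/; }'
}

printf '%s\n' "name, city, state" "john, paris, texas" | capitalize_first_two
```

Line 1 is outside the 2,$ range and stays as it is; line 2 should become "John, Paris, texas", with the third field untouched.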

Manipulate characters with sed

I have a list of usernames and I would like to add possible combinations to it.
Example: let's say this is the list I have
johna
maryb
charlesc
Is there a way to use sed to edit it so that it looks like
ajohn
bmary
ccharles
And also
john_a
mary_b
charles_c
etc...
Can anyone assist me into getting the commands to do so, any explanation will be awesome as well. I would like to understand how it works if possible. I usually get confused when I see things like 's/\.(.*.... without knowing what some of those mean... anyway thanks in advance.
EDIT: I changed the username.
sed 's/\(user\)\(.\)/\2\1/'
Breakdown:
sed s/string/replacement/ will replace the first instance of string on each line with replacement (add the g flag to replace every instance).
Then, string in that sed expression is \(user\)\(.\). This can be broken down into two
parts: \(user\) and \(.\). Each of these is a capture group - bracketed by \( \). That means that once we've matched something with them, we can reuse it in the replacement string.
\(user\) matches, surprisingly enough, the user part of the string. \(.\) matches any single character - that's what the . means. Then, you have two captured groups - user and a (or b or c).
The replacement part just uses these to recreate the pattern a little differently. \2\1 says "print the second capture group, then the first capture group". Which in this case, will print out auser - since we matched user and a with each group.
ex:
$ echo "usera
> userb
> userc" | sed "s/\(user\)\(.\)/\2\1/"
auser
buser
cuser
You can change the \2\1 to use any string you want - ie. \2_\1 will give a_user, b_user, c_user.
Also, in order to match any preceding string (not just "user"), just replace the \(user\) with \(.*\). Ex:
$ echo "marya
> johnb
> alfredc" | sed "s/\(.*\)\(.\)/\2\1/"
amary
bjohn
calfred
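The same general pattern also covers the underscore form asked for in the question. A minimal sketch, with a made-up function name:

```shell
# add_underscore is a hypothetical helper name.
add_underscore() {
    # \(.*\) greedily takes everything it can while still leaving one
    # character for \(.\), so \2 is always the last character of the line.
    sed 's/\(.*\)\(.\)/\1_\2/'
}

printf '%s\n' johna maryb charlesc | add_underscore
```

This should print john_a, mary_b and charles_c, one per line.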
Here's a partial answer to what is probably the easy part. To use sed to change usera to user_a you could use:
sed 's/user/user_/' temp
where temp is the name of the file that contains your initial list of usernames. How this works: It is finding the first instance of "user" on each line and replacing it with "user_"
Similarly for your dot example:
sed 's/user/user./' temp
will replace the first instance of "user" on each line with "user."
Sed does not offer non-greedy regex, so I suggest perl:
perl -pe 's/(.*?)(.)$/$2$1/g' file
ajohn
bmary
ccharles
perl -pe 's/(.*?)(.)$/$1_$2/g' file
john_a
mary_b
charles_c
That way you don't need to know the username beforehand.
Simple solution using awk
awk '{a=$NF;$NF="";$0=a$0}1' FS="" OFS="" file
ajohn
bmary
ccharles
and
awk '{a=$NF;$NF="";$0=$0"_" a}1' FS="" OFS="" file
john_a
mary_b
charles_c
By setting FS to nothing, every letter is a field in awk (in gawk, at least; POSIX leaves the behavior of an empty FS unspecified). You can then easily manipulate it.
And there is no need to use capturing groups etc., just plain field swapping.
This might work for you (GNU sed):
sed -r 's/^([^_]*)_?(.)$/\2\1/' file
This matches any characters other than underscores (captured as \1), an optional underscore, and the last character (captured as \2), then swaps the two groups around.
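Because the underscore is optional, the same expression should accept both the plain and the underscored form of a name. A short sketch, using -E (equivalent to -r in GNU sed) and a hypothetical function name:

```shell
# move_last_to_front is a hypothetical helper name.
move_last_to_front() {
    # ([^_]*) takes the name, _? swallows an underscore if present,
    # (.)$ is the final character; the replacement emits \2 before \1.
    sed -E 's/^([^_]*)_?(.)$/\2\1/'
}

printf '%s\n' johna john_a | move_last_to_front
```

Both input lines should come out as ajohn.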