How to transpose individual characters using sed

I'm trying to swap the 3rd digit with the 8th digit, for example in this phone number:
(888) 747-7424
After transposition:
(884) 747-7824
This is what I have:
sed -E 's/(..)(.)(....)(.)(.*)/\1\4\3\2\5/'
But it transposes the 2nd and the 5th digits instead.

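You just miscounted characters. Counting the opening parenthesis, the 3rd digit is the 4th character and the 8th digit is the 12th character: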
$ sed -E 's/(.{3})(.)(.{7})(.)/\1\4\3\2/' file
(884) 747-7824

If you want to count only digits and ignore every other character, match each digit after zero or more non-digit characters:
sed -E 's/^(([^0-9]*[0-9]){2}[^0-9]*)([0-9])(([^0-9]*[0-9]){4}[^0-9]*)([0-9])/\1\6\4\3/'
For example:
s='(888) 747-7424'
sed -E 's/^(([^0-9]*[0-9]){2}[^0-9]*)([0-9])(([^0-9]*[0-9]){4}[^0-9]*)([0-9])/\1\6\4\3/' <<< "$s"
# => (884) 747-7824
Details
^ - start of string
(([^0-9]*[0-9]){2}[^0-9]*) - Group 1: two occurrences of zero or more non-digits followed by a digit, then 0 or more non-digits
([0-9]) - Group 3: a digit
(([^0-9]*[0-9]){4}[^0-9]*) - Group 4: four occurrences of zero or more non-digits followed by a digit, then 0 or more non-digits
([0-9]) - Group 6: a digit.
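As a rough sketch of how the same pattern generalizes, the two repetition counts can be derived from the digit positions. The swap_digits function and its variable names are hypothetical (not part of the answer above), and the sketch assumes the first position is smaller than the second and that sed supports -E:

```shell
# swap_digits N M: swap the Nth and Mth digit (N < M) of every input line.
# Hypothetical helper for illustration only.
swap_digits() {
  n=$1
  m=$2
  before=$((n - 1))        # digits to skip before the first target digit
  between=$((m - n - 1))   # digits lying between the two target digits
  # Same group layout as above: \1 prefix, \3 first digit, \4 middle, \6 second digit.
  sed -E "s/^(([^0-9]*[0-9]){${before}}[^0-9]*)([0-9])(([^0-9]*[0-9]){${between}}[^0-9]*)([0-9])/\1\6\4\3/"
}

printf '%s\n' '(888) 747-7424' | swap_digits 3 8
```

With 3 and 8 as arguments, the counts come out as {2} and {4}, which reproduces the command shown above.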

This might work for you (GNU sed):
sed -E 's/[0-9]/\n&/3;s//\n&/8;s/\n(.)(.*)\n(.)/\3\2\1/' file
This delimits each required digit by a newline then uses pattern matching to swap the two digits.
To make this atomic, so that a line with fewer than 8 digits is left unchanged instead of keeping a stray newline, use:
sed -E 's/[0-9]/\n&/8;T;s//\n&/3;s/\n(.)(.*)\n(.)/\3\2\1/' file
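The difference between the two commands only shows on a line that has too few digits. The following sketch compares them on a made-up 7-digit number; it assumes GNU sed, since \n in the replacement and the T command are GNU features:

```shell
short='555-1234'          # hypothetical input with only 7 digits
full='(888) 747-7424'

# First variant: the 3rd digit gets marked even though there is no 8th digit.
naive=$(printf '%s\n' "$short" | sed -E 's/[0-9]/\n&/3;s//\n&/8;s/\n(.)(.*)\n(.)/\3\2\1/')

# Second variant: T skips the rest of the script when the 8th digit is missing.
atomic=$(printf '%s\n' "$short" | sed -E 's/[0-9]/\n&/8;T;s//\n&/3;s/\n(.)(.*)\n(.)/\3\2\1/')

# On a line with enough digits, the second variant still swaps as expected.
swapped=$(printf '%s\n' "$full" | sed -E 's/[0-9]/\n&/8;T;s//\n&/3;s/\n(.)(.*)\n(.)/\3\2\1/')

printf 'naive:   [%s]\n' "$naive"     # the marker newline is left in the output
printf 'atomic:  [%s]\n' "$atomic"    # the line is untouched
printf 'swapped: [%s]\n' "$swapped"
```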

Related

Using sed to extract a number value from json

I have a file in which some lines contain a json object on a single line, and I want to extract the value of the window_indicator property.
A normal regular expression is: "window_indicator":\s*([\-\d\.]+) in which I want the value of the first match group.
Here it is working perfectly well: https://regex101.com/r/w9Iuch/1
I've settled on sed because it seems that grep has to print the whole line and can't limit to the match group value, and perl is overkill.
Unfortunately, sed isn't actually capable of doing this, is it?
# sed 's/("window_indicator:)/\1/' in.txt
sed: -e expression #1, char 26: invalid reference \1 on `s' command's RHS
# sed -E 's/("window_indicator":)/\1/p' in.txt
prints out every line of the file
# sed -rn 's/("window_indicator":)/\1/p' in.txt
prints the whole line
# sed -rn 's/("window_indicator":)/\1/' in.txt
nothing
With sed, you need to match the whole line, capture what you need, and replace the whole match with the Group 1 placeholder. Also suppress the default line output and print the new text only after a successful substitution:
sed -nE 's/.*"window_indicator":[[:space:]]*([-0-9.]+).*/\1/p' in.txt
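If only the first match is to be retrieved, quit after the first successful substitution. A bare q placed after p would run on every line and stop after the first input line whether or not it matched, so tie q to the match (the empty regex in s// reuses the address regex):
sed -nE '/.*"window_indicator":[[:space:]]*([-0-9.]+).*/{s//\1/p;q;}' in.txt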
Note that \d is not supported in POSIX regex, so it is replaced with the 0-9 range in the bracket expression here.
Details
n - suppress default line output
E - enables POSIX ERE flavor
.*"window_indicator":[[:space:]]*([-0-9.]+).* - finds
.* - any text
"window_indicator": - a fixed string
[[:space:]]* - zero or more whitespaces (GNU sed supports \s, too)
([-0-9.]+) - Group 1: one or more digits, - or .
.* - any text
\1 - replaces with Group 1 value
p - prints the result upon successful replacement
q - quits processing the stream.
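To see the substitution at work, here is a small sketch run against a made-up input file. The sample lines and every field name other than window_indicator are assumptions for illustration:

```shell
# Hypothetical sample input; only window_indicator comes from the question.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
plain text line without JSON
{"id": 1, "window_indicator": -0.25, "note": "a"}
{"id": 2, "window_indicator":3.5}
EOF

# -n suppresses normal output; the p flag prints only lines where s succeeded.
values=$(sed -nE 's/.*"window_indicator":[[:space:]]*([-0-9.]+).*/\1/p' "$tmp")
printf '%s\n' "$values"

rm -f "$tmp"
```

The first line has no match, so it produces no output, and the optional whitespace after the colon is absorbed by [[:space:]]*.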
With GNU grep, it is even easier:
grep -oP '"window_indicator":\s*\K[-\d.]+' in.txt
To get the first match only:
grep -oP '"window_indicator":\s*\K[-\d.]+' in.txt | head -n 1
Here,
o - outputs matched texts only
P - enables the PCRE regex engine
"window_indicator":\s*\K[-\d.]+ - matches
"window_indicator": - a fixed string
\s* - zero or more whitespaces
\K - removes the text matched so far from the match value
[-\d.]+ - matches one or more -, . or digits.
1st solution: With your shown samples, please try the following awk code, though it's always advisable to use a JSON parser such as jq. It uses awk's match function with the regex "window_indicator":[0-9]+} to find the needed value. If the regex matches, the variable val is set to the matched substring of the current line. Then "window_indicator": and } are removed from val, and val is printed, which gives the needed value.
awk '
match($0,/"window_indicator":[0-9]+}/){
val=substr($0,RSTART,RLENGTH)
gsub(/"window_indicator":|}/,"",val)
print val
}
' Input_file
2nd solution: Using GNU grep with a positive lookbehind and a positive lookahead to get the expected output:
grep -oP '(?<="window_indicator":)\d+(?=})' Input_file
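One practical difference between the lookbehind form and the \K form is how they treat whitespace after the colon. The sketch below uses two made-up lines and assumes a GNU grep built with PCRE support (-P):

```shell
# Hypothetical sample lines; the second one has a space after the colon.
sample=$(printf '%s\n' '{"a":1,"window_indicator":7}' '{"a":2,"window_indicator": 42}')

# Lookbehind form: the digits must directly follow the colon.
lookbehind=$(printf '%s\n' "$sample" | grep -oP '(?<="window_indicator":)\d+(?=})')

# \K form: \s* absorbs optional whitespace before the match is reset.
with_k=$(printf '%s\n' "$sample" | grep -oP '"window_indicator":\s*\K[-\d.]+')

printf 'lookbehind: %s\n' "$lookbehind"
printf 'with \\K:\n%s\n' "$with_k"
```

The lookbehind form finds only the value on the first line, while the \K form finds both.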
Using sed
$ sed -E 's/.*window_indicator":([0-9]+).*/\1/' input_file
0
Using grep
$ grep -Po '.*window_indicator":\K\d+' input_file
0
Using GNU awk (the three-argument form of match is a gawk extension)
$ awk '{match($0,/.*window_indicator":([0-9]+)/,arr);print arr[1]}' input_file
0

How to concatenate two lines using label in sed

Is there a way to detect the end of a line using sed? I need to join two lines only if the first one ends with a minus sign (-), otherwise not.
sed -e :a -e '/,$/N; s/-\n*/new_line/; ta' test.txt (this is all I have so far, and I need to substitute new_line with an actual newline)
If the file looks something like this:
Here is a random sente-
nc and if random sentec ended with minus it is better to concatenate.
RESULT
Here is a random sentece and if random sentec ended with minus it is better to concatenate.
This might work for you (GNU sed):
sed '/-$/{N;s/-\n//}' file
If the line ends in -, append the next line and remove the last character of the first line and the following newline.
Here is how to do that in any version of sed:
cat file
foo bar_-
Here is a random sente-
nce and if random sentec ended with minus it is better to concatenate.
# and use sed as
sed -e :a -e '/-$/{N;s/-\n//g;ta' -e '}' file
foo bar_Here is a random sentence and if random sentec ended with minus it is better to concatenate.
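The two answers behave differently when several consecutive lines end in a minus sign. A small comparison, using a made-up three-line input and assuming GNU sed for the one-shot form:

```shell
# Hypothetical input: two consecutive lines end with '-'.
input=$(printf '%s\n' 'foo bar_-' 'Here is a random sente-' 'nce, joined.')

# One-shot form: joins a line with its successor once, then prints.
once=$(printf '%s\n' "$input" | sed '/-$/{N;s/-\n//;}')

# Loop form: after a successful join, ta jumps back to :a and re-checks for a trailing '-'.
loop=$(printf '%s\n' "$input" | sed -e :a -e '/-$/{N;s/-\n//;ta' -e '}')

printf 'once:\n%s\n' "$once"
printf 'loop:\n%s\n' "$loop"
```

The one-shot form leaves the second hyphen in place and prints two lines, while the loop form keeps joining until the line no longer ends in a minus sign.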

How to extract from string all words between double quotes using SED

I'm trying to use a regexp, like
.*?"([^"]+).*?"/g
to extract all words between double quotes from string.
For example from:
< Header param1="1" param2="2" param3="" param4="" param5=5 param6="6"
>
I would like to get:
1 2 6
Yes, I know that I can use grep, but it has to be done with sed.
There is no BRE or ERE that can do what you want, so it can't be done with a single regexp in sed. You CAN chain several substitutions in sed instead, if that's acceptable:
$ sed -E 's/^[^"]*"|"[^"]*$//g; s/"[^"]+"/ /g; s/ +/ /g' file
1 2 6
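To make the three substitutions easier to follow, the sketch below runs them one at a time on the header line from the question (taken as a single line, without the trailing >) and keeps each intermediate result. The variable names are illustrative only:

```shell
line='< Header param1="1" param2="2" param3="" param4="" param5=5 param6="6"'

# 1. Drop everything up to the first quote and from the last quote to the end.
step1=$(printf '%s\n' "$line"  | sed -E 's/^[^"]*"|"[^"]*$//g')
# 2. Replace each quote-delimited gap between values with a space.
step2=$(printf '%s\n' "$step1" | sed -E 's/"[^"]+"/ /g')
# 3. Squeeze runs of spaces left behind by the empty values.
step3=$(printf '%s\n' "$step2" | sed -E 's/ +/ /g')

printf '1: [%s]\n2: [%s]\n3: [%s]\n' "$step1" "$step2" "$step3"
```

After the first step the remaining quotes pair up around the text between values rather than around the values themselves, which is why the second step can turn each pair into a separator.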

How do I use sed to delete lines with a single digit instead of double

I have this line.
sed -i '/Total number 1/d' /tmp/test.txt
This will delete lines,
Total number 1
but it also deletes,
Total number 11
Total number 12
Total number 13
How do I set it to delete lines with a single digit only?
Add a dollar sign to the end: sed -i '/Total number 1$/d' /tmp/test.txt
Also, if you want to delete lines with any single digit, replace the 1 with a bracket expression: sed -i '/Total number [0-9]$/d' /tmp/test.txt
Finally, if the number isn't necessarily at the end of the line, you could also have the pattern end when either the end of line or a non-digit is found: sed -i -E '/Total number [0-9]($|[^0-9])/d' /tmp/test.txt
A more generic solution deletes every line that contains a standalone single digit anywhere:
sed '/\b[[:digit:]]\b/d'
\b stands for a word boundary (a GNU sed extension).
Pass the -i option only once you have made sure that the above command works for you, since it modifies your input files in place.
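The effect of each pattern is easiest to check without -i, on a few sample lines fed through a pipe. The input below is made up, and the "5 items" line is an assumption added to show the case where the number is not at the end of the line:

```shell
input=$(printf '%s\n' 'Total number 1' 'Total number 11' 'Total number 5 items' 'Total number 12')

# Unanchored: also removes 11 and 12, because they start with "Total number 1".
loose=$(printf '%s\n' "$input" | sed '/Total number 1/d')

# Anchored with $: removes only the line that ends right after the 1.
anchored=$(printf '%s\n' "$input" | sed '/Total number 1$/d')

# Any single digit followed by end of line or a non-digit.
single=$(printf '%s\n' "$input" | sed -E '/Total number [0-9]($|[^0-9])/d')

printf 'loose:\n%s\n\nanchored:\n%s\n\nsingle:\n%s\n' "$loose" "$anchored" "$single"
```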

Regular expression in sed for masking credit card

We need to mask credit card numbers, masking all but the last 4 digits. I am trying to use sed. As the credit card number length varies from 12 to 19 digits, I am trying to write a regular expression. The following code receives the string. If it contains a string of the form "CARD_NUMBER=3737291039299199", it should mask the first 12 digits.
The problem is how to write a regular expression for credit card numbers 12 to 19 digits long. If I write another expression for 12 digits, it doesn't work. That is, for a 12-digit credit card the first 8 digits should be masked; for a 15-digit credit card, the first 11 digits should be masked.
while read data; do
var1=${#data}
echo "Length is "$var1
echo $data | sed -e "s/CARD_NUMBER=\[[[:digit:]]\{12}/CARD_NUMBER=\[\*\*\*\*\*\*\*\*/g"
done
How about
sed -e :a -e "s/[0-9]\([0-9]\{4\}\)/\*\1/;ta"
(This works in my shell, but you may have to add or remove a backslash or two.) The idea is to replace a digit followed by four digits with a star followed by the four digits, and repeat this until it no longer triggers.
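As a quick check that the loop copes with different lengths, the sketch below wraps it in a hypothetical mask_digits function (the name and the 12-digit sample number are made up) and uses single quotes to avoid shell escaping questions. Note that this loop masks any run of five or more digits on the line, not only the one after CARD_NUMBER=:

```shell
# mask_digits: hypothetical wrapper around the loop above.
# Each pass turns one digit that is followed by four more digits into '*';
# ta repeats the pass until the substitution no longer matches.
mask_digits() {
  sed -e :a -e 's/[0-9]\([0-9]\{4\}\)/*\1/;ta'
}

masked16=$(printf '%s\n' 'CARD_NUMBER=3737291039299199' | mask_digits)
masked12=$(printf '%s\n' 'CARD_NUMBER=123456789012' | mask_digits)

printf '%s\n%s\n' "$masked16" "$masked12"
```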
This does it in one sed command without an embedded newline:
sed -r 'h;s/.*([0-9]{4})/\1/;x;s/CARD_NUMBER=([0-9]*)([0-9]{4})/\1/;s/./*/g;G;s/\n//'
If your sed doesn't have -r:
sed 'h;s/.*\([0-9]\{4\}\)/\1/;x;s/CARD_NUMBER=\([0-9]*\)\([0-9]\{4\}\)/\1/;s/./*/g;G;s/\n//'
If your sed needs -e:
sed -e 'h' -e 's/.*\([0-9]\{4\}\)/\1/' -e 'x' -e 's/CARD_NUMBER=\([0-9]*\)\([0-9]\{4\}\)/\1/' -e 's/./*/g' -e 'G' -e 's/\n//'
Here's what it's doing:
duplicate the number so it's in pattern space and hold space
grab the last four digits
swap them into hold space and the whole number into pattern space
grab all but the last four digits
replace each digit with a mask character
append the last four digits from hold space to the end of the masked digits in pattern space (a newline comes along for free)
get rid of the newline
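The steps above can be watched one at a time by adding p commands to the same script and running it with -n. This is only a tracing sketch, not a replacement for the command; it uses -E, which GNU sed accepts in place of -r:

```shell
# Print the pattern space after the main steps of the hold-space script.
trace=$(printf '%s\n' 'CARD_NUMBER=3737291039299199' | sed -nE '
  h                                           # copy the line to hold space
  s/.*([0-9]{4})/\1/                          # keep only the last four digits
  p
  x                                           # last four -> hold, full line -> pattern
  s/CARD_NUMBER=([0-9]*)([0-9]{4})/\1/        # keep all but the last four digits
  p
  s/./*/g                                     # mask every remaining character
  p
  G                                           # append newline + last four digits
  s/\n//                                      # remove the newline added by G
  p
')
printf '%s\n' "$trace"
```

The four printed lines are the last four digits, the leading digits, the masked leading digits, and the final result.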
Try this; you don't have to create a complicated regex:
var1="CARD_NUMBER=3737291039299199"
IFS="="
set -- $var1
cardnumber=$2
echo $cardnumber | awk 'BEGIN{OFS=FS=""}{for(i=1;i<=NF-4 ;i++){ $i="*"} }1'
output
$ ./shell.sh
************9199
I'm not much of a sed guru, and thus I cannot manage to do it in only one command, though there surely are ways. But with two sed commands, here is what I got:
sed -e 's/CARD_NUMBER=\([0-9]*\)\([0-9]\{4\}\)/\1\
\2/' | sed -e '1s/./x/g ; N ; s/\n//'
Please note the embedded newline.
Because sed works by lines, I first break the card number into the initial part and the last four digits, separating them by a newline (the first sed command). Then, I mask the initial part (1s/./x/g), and remove the newline (N ; s/\n//).
Good luck!