Case insensitive match but matching replacement case - sed

How can I adjust sed -e 's/(match other stuff too)[aA]/\1b/g' to have the replacing b match the case of the a being replaced? In this case only the single character is being replaced but the entire search can/should be case insensitive (I can address that separately with s///I I believe).

This might work for you (GNU sed):
sed -rn 's/$/\nabAB/;:a;s/(match other stuff too)([aA])(.*\n.*\2(.).*)/\1\4\3/;ta;P' file
Append a lookup table to the end of the line and loop until all lookups have been substituted.
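As a rough illustration of the lookup-table loop, the sketch below puts a literal foo in place of "match other stuff too" and wraps the command in a shell function. The word foo, the function name preserve_case_a_to_b and the sample input are assumptions made for the demo, and GNU sed is assumed for -r and for \n in the replacement.

```shell
# Hypothetical helper: replaces the a/A that follows "foo" with b/B, keeping the case.
# Assumes GNU sed; "foo" stands in for "match other stuff too".
preserve_case_a_to_b() {
  # s/$/\nabAB/  appends the lookup table after a newline
  # :a ... ta    repeats the substitution until nothing more matches
  # \2(.)        finds the matched letter in the table and captures its partner
  # P            prints only the text before the newline, so the table is dropped
  sed -rn 's/$/\nabAB/;:a;s/(foo)([aA])(.*\n.*\2(.).*)/\1\4\3/;ta;P'
}

printf '%s\n' 'fooa fooA' | preserve_case_a_to_b
# prints: foob fooB
```

The table pairs each searched character with its replacement (ab, AB), so supporting more letters should only require extending the table and the [aA] bracket expression.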

Related

Getting the first letter of a filtered part in sed

I have a filename e.g. 15736--1_brand-new-image.jpg
My goal is to get the first letter after the _ in this case the b.
With s/\(.*\)\_\(.*\)$/\2/ I am able to extract brand-new-image.jpg
which is partly based on the info found on https://www.oncrashreboot.com/use-sed-to-split-path-into-filename-extension-and-directory
I've already found get first letter of words using sed but fail to combine the two.
To validate my sed statement I've used https://sed.js.org/
How can I combine a new sed statement with the part I've filtered to get the first letter?
With your shown samples could you please try following.
echo "15736--1_brand-new-image.jpg" | sed 's/[^_]*_\(.\).*/\1/'
Explanation: The substitution matches everything up to and including the first _ with [^_]*_, captures the next single character in the back reference \(.\), and lets .* consume the rest of the line. The whole line is then replaced with the first back reference, which is the character right after the first _ (b in this case).
Explanation: Following is only for explanation purposes.
sed ' ##Starting sed program from here.
s/ ##using s to tell sed to perform substitution operation.
[^_]*_\(.\).* ##using regex to match till 1st occurrence of _ then using back reference \(.\) to catch value in temp buffer memory here.
/\1/ ##Substituting whole line with 1st back reference value here which is b in this case.
'
Note that . or \w also matches _, which matters when there are 2 consecutive underscores (__).
If you want to match the first word character without matching the _, you could also use
echo "15736--1_brand-new-image.jpg" | sed 's/[^_]*_\([[:alnum:]]\).*/\1/'
Output
b
This might work for you (GNU sed):
sed -nE 's/^[^_]*_[^[:alpha:]]*([[:alpha:]]).*/\1/p' file
Since this is a filtering operation, use the -n option to print only when there is a positive match.
Match up to the first _ from the start of the line, discard any non-alpha characters until the first alpha character, capture it, and finally discard the remaining characters.
Print the result if there is a match.
N.B. Anchoring the match to the start of the line prevents the result from containing more than one character, e.g. for the string 123_456_abc an unanchored match might otherwise result in 4 or 123_a.
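To make the difference between the three answers concrete, here is a small sketch that wraps each command in a shell function and runs it on the sample name and on a made-up name with two consecutive underscores. The function names and the x__y.jpg input are assumptions for illustration only.

```shell
# Hypothetical wrappers around the three commands from the answers above.
first_char_any()   { sed 's/[^_]*_\(.\).*/\1/'; }
first_char_alnum() { sed 's/[^_]*_\([[:alnum:]]\).*/\1/'; }
first_alpha()      { sed -nE 's/^[^_]*_[^[:alpha:]]*([[:alpha:]]).*/\1/p'; }

# The sample from the question: all three agree.
echo '15736--1_brand-new-image.jpg' | first_char_any     # prints: b
echo '15736--1_brand-new-image.jpg' | first_char_alnum   # prints: b
echo '15736--1_brand-new-image.jpg' | first_alpha        # prints: b

# A made-up name with two consecutive underscores: the results differ.
echo 'x__y.jpg' | first_char_any     # prints: _    (. also matches the second _)
echo 'x__y.jpg' | first_char_alnum   # prints: x_y  (the match starts at the second _, so x_ is left in place)
echo 'x__y.jpg' | first_alpha        # prints: y
```

The x_y result shows why the anchored version is the safer choice when the part before the first letter may contain more than one underscore.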

gnu sed remove portion of line after pattern match with special characters

The goal is to use sed to return only the url from each line of FF extension Mining Blocker which uses this format for its regex lines:
{"baseurl":"*://002.0x1f4b0.com/*", "suburl":"*://*/002.0x1f4b0.com/*"},
{"baseurl":"*://003.0x1f4b0.com/*", "suburl":"*://*/003.0x1f4b0.com/*"},
the result should be:
002.0x1f4b0.com
003.0x1f4b0.com
One way would be to keep everything after suburl":"*://*/ then remove each occurrence of /*"},
I found https://unix.stackexchange.com/questions/24140/return-only-the-portion-of-a-line-after-a-matching-pattern but the special characters are a problem.
this won't work:
sed -n -e s#^.*suburl":"*://*/##g hosts
Would someone please show me how to mark the 2 asterisks in the string so they are seen by regex as literal characters, not wildcards?
edit:
sed -n 's#.*://\*/\([^/]\+\)/.*#\1#p' hosts
doesn't work, unfortunately.
regarding character substitution, thanks for directing me to the references.
I reduced the searched-for string to //*/ and used ASCII character codes like this:
sed -n -e s#^.*\d047\d047\d042\d047##g hosts
Unfortunately, that didn't output any changes to the lines.
My assumptions are:
^.*something specifies everything up to and including the last occurrence of "something" in a line
sed -n -e s#search##g deletes (replace with nothing) "search" within a line
So, this line:
sed -n -e s#^.*\d047\d047\d042\d047##g hosts
Should output everything after //*/ in each line...except it doesn't.
What is incorrect with that line?
Regarding deleting everything including and after the first / AFTER that first operation, yes, that's wanted too.
This might work for you (GNU sed):
sed -n 's#.*://\*/\([^/]\+\)/.*#\1#p' file
Match greedily (the longest string that matches) all characters up to ://*/, followed by a group of characters (which will be referred to as \1) that do not match a /, followed by a / and the rest of the line, and replace all of it with the group \1.
N.B. The sed substitution delimiters are arbitrary; # is chosen here to make matching / easier. The character * on the left-hand side of the substitution command is normally a metacharacter meaning zero or more of the previous character/group, so it is quoted as \* to make it match a literal asterisk. Finally, the -n option turns off the usual printing of the pattern space after all the sed commands have been executed, and the p flag on the substitution command prints the pattern space following a successful substitution, so the output contains only the URLs, or nothing at all.
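For reference, this is how the command behaves on the two sample lines from the question. It is a minimal sketch: the function name extract_hosts is made up, the input is fed through a here-document instead of the hosts file, and GNU sed is assumed because \+ is a GNU extension in basic regular expressions.

```shell
# Hypothetical helper; assumes GNU sed because \+ is a GNU extension in basic regex.
extract_hosts() {
  sed -n 's#.*://\*/\([^/]\+\)/.*#\1#p'
}

extract_hosts <<'EOF'
{"baseurl":"*://002.0x1f4b0.com/*", "suburl":"*://*/002.0x1f4b0.com/*"},
{"baseurl":"*://003.0x1f4b0.com/*", "suburl":"*://*/003.0x1f4b0.com/*"},
EOF
# prints:
# 002.0x1f4b0.com
# 003.0x1f4b0.com
```

Separately from the regex, the attempts in the question that end in ##g print nothing for a simpler reason: -n suppresses automatic printing and no p flag or p command is given. Putting the script in single quotes also keeps the shell from interpreting characters such as * and \ before sed sees them.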

Can sed match one part of a line and replace another part, for multiple lines that each have a different match and replacement?

Given a file containing multiple lines with this format:
define('SOME_NAME', some_value);
How can sed be used to match SOME_NAME and replace some_value with some_other_value, for multiple different lines ?
This is a solution for one line:
sed -re "s|^(define *\('SOME_NAME'\s*,\s*).*(\);)|\1 some_other_value \2|" defs_file
To process a number of similar definitions in a file, I had to script outside sed (this example uses a bash version 4 associative array):
#!/bin/bash
declare -A args
args=([SOME_NAME1]=some_other_value1
[SOME_NAME2]=some_other_value2
[SOME_NAME3]=some_other_value3
[SOME_NAME4]=some_other_value4
[SOME_NAME5]=some_other_value5)
for arg in "${!args[@]}"
do
sed -i -re "s|^(define *\('$arg'\s*,\s*).*(\);)|\1 ${args[$arg]} \2|" defs_file
done
Is there a more elegant way to achieve this result that only relies on sed ?
This might work for you (GNU sed):
sed -r 's/$/\n#NAME:new_value/;s/([^,]*),[^\n]*\n.*#\1:([^#]*).*/\1,\2/;P;d' <<<"NAME,old_value"
This is a simplified version of your example: it appends a lookup table to the pattern space and then uses a regexp with back references to match and rearrange the output. See here for a detailed explanation.
N.B. In the example above (for brevity) I only included one lookup; in a real solution the lookup table would have several, e.g. s/$/\n#NAME1:new_value1#NAME2:new_value2..../
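A sketch of the same idea with more than one lookup entry might look like this. The function name apply_lookup, the keys and the values are placeholders, and GNU sed is assumed (for -r, \n in the replacement, and [^\n] inside a bracket expression).

```shell
# Hypothetical helper; the names and values in the table are placeholders.
# Assumes GNU sed (-r, \n in the replacement, [^\n] in a bracket expression).
apply_lookup() {
  # Table format: #KEY:value#KEY:value... appended after a newline.
  # ([^,]*) captures the key, #\1: finds it in the table, ([^#]*) captures its value.
  # P prints the text before the newline; d drops the rest and starts the next line.
  sed -r 's/$/\n#NAME1:new_value1#NAME2:new_value2/;s/([^,]*),[^\n]*\n.*#\1:([^#]*).*/\1,\2/;P;d'
}

printf '%s\n' 'NAME1,old_value' 'NAME2,old_value' 'NAME3,old_value' | apply_lookup
# prints:
# NAME1,new_value1
# NAME2,new_value2
# NAME3,old_value
```

Lines whose key is not in the table pass through unchanged, because the second substitution fails and P still prints the original text. The trailing : after the key in the table keeps NAME1 from matching an entry such as NAME10.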

Using sed to convert singular/plural words into uppercase

Using one sed command I'm trying to convert all occurrences of test and tests found in a .txt file into all caps. I also want to print only the converted lines, so I'm using -n. I've been playing around for it for over an hour. The problem is that I'm able to convert one or the other (either test or tests) but not both.
Any help would be so greatly appreciated. Thank you!
Use this
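sed -n -e 's/tests/TESTS/g; s/test/TEST/g; T; p;' input.txt
The semicolons let you execute multiple commands. T (a GNU sed extension) skips the p when neither substitution changed the line, and -n suppresses the default output so that only the converted lines are printed.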
This might work for you (GNU sed):
sed 's/\<tests\?\>/\U&/gp;d' file
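This uppercases whole words (\<....\>) consisting of test followed by an optional s (s\?), prints each line in which a substitution was made (the p flag), and deletes everything else (d).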
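The two approaches differ on words that merely contain test. The sketch below runs both on a few made-up lines. The function names and the sample text are assumptions, the first function is the two-substitution approach run with -n, and both assume GNU sed (T, \<, \>, \? and \U are GNU extensions).

```shell
# Hypothetical wrappers; both assume GNU sed (T, \<, \>, \? and \U are GNU extensions).
upper_substring()  { sed -n 's/tests/TESTS/g; s/test/TEST/g; T; p'; }
upper_whole_word() { sed 's/\<tests\?\>/\U&/gp;d'; }

sample() { printf '%s\n' 'a test and two tests' 'nothing here' 'testing a contest'; }

sample | upper_substring
# prints:
# a TEST and two TESTS
# TESTing a conTEST

sample | upper_whole_word
# prints:
# a TEST and two TESTS
```

Which behaviour is wanted depends on whether partial words such as testing should be converted too.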
Sorry for the late response, but here is hopefully an understandable one with basic regex (no extended regex):
sed 's:\<test\(s*\)\>:TEST\U\1:g' < inputFile.txt > outputFile.txt; grep -n TEST outputFile.txt
Explanation:
: delimiter (instead of the usual /)
\<test\> matches test as a whole word. The character before the first t can be any character except a letter, number or underscore. The same applies to the character after the last t.
\(\) remembers what is inside the parentheses.
s* matches zero or more s's.
\U\1 inserts the first remembered match (i.e. any number of s's matched) converted to uppercase; \U is a GNU sed extension. Without it, tests would become TESTs.
The rest is hopefully clear. Otherwise leave a comment.

Insert newline after pattern with changing number in sed

I want to insert a newline after the following pattern
lcl|NC_005966.1_gene_750
The last number (750 in this case) changes; the numbers are in the range 1-3407.
How can I tell sed to keep this pattern together and not split it after the first digit?
So far I found
sed 's/lcl|NC_005966.1_gene_[[:digit:]]/&\n/g' file
But this breaks off, after the first digit.
Try:
sed 's/lcl|NC_005966.1_gene_[[:digit:]]*/&\n/g' file
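(note the *, which lets the match extend over all the digits)
Alternatively, you could say:
sed '/lcl|NC_005966.1_gene_[[:digit:]]/G' file
which appends an empty line after every line that contains the pattern (G appends a newline plus the contents of the hold space, which is empty here), rather than breaking the line right after the number.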
sed 's/lcl|NC_005966\.1_gene_[[:digit:]][[:digit:]]*/&\
/g' file
You need to escape the . because it is an RE metacharacter, you need [[:digit:]][[:digit:]]* to represent one or more digits, and you need \ followed by a literal newline for portability across seds.
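A quick way to check the portable form is to feed it one line where the identifier is followed by other text. In this sketch the function name break_after_gene_id and the trailing ATG are made up for illustration.

```shell
# Hypothetical wrapper around the portable form; the trailing ATG is made-up sample data.
# The line after the backslash must start at column 0, otherwise the leading
# spaces would become part of the replacement text.
break_after_gene_id() {
  sed 's/lcl|NC_005966\.1_gene_[[:digit:]][[:digit:]]*/&\
/g'
}

printf '%s\n' 'lcl|NC_005966.1_gene_750ATG' | break_after_gene_id
# prints:
# lcl|NC_005966.1_gene_750
# ATG
```

Because the digits are matched greedily, the whole number stays on the first line regardless of how many digits it has.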