Duplicating characters using sed - sed

I am working on trying to duplicate characters on certain words, but the sed script I wrote is not quite doing what I want it to. For example, I am trying to duplicate each character in a words like so:
FILE into FFIILLEE
I know how to remove the duplicate letters with :
sed s/\(.\)\1/\1/g' file.txt
This is the sed script I wrote, but it just ends up duplicating the whole word and I just get:
FILE FILE
This is what I wrote:
sed 's/[A-Z]*/& &/g' file.txt
How can I grab each letter and duplicate just the letter?

A slight variation on your first script should work:
sed 's/\(.\)/\1\1/g' file.txt
Translation: For every character seen, replace it by itself followed by itself.

sed 's/[[:alpha:]]/&&/g' file.txt
[:alpha:]class is the whole scope of letter available, you could extend with [:alnum:]including digit
& in replacement pattern is the whole search pattern matching. In this case 1 letter
g for each possible occurence
Your probleme was to use the * in search pattern that mean all occurence of previous pattern so the pattern is the whole word at once and not every letter of this word

Related

How to use Sed to change letter to uppercase in first and second column in text file to upper case

I have text file input.txt which has
april,december,month.gmail.com
lion,tiger,animal.gmail.com
Using sed change first and second columns to uppercase? Is there a way to do it?
With GNU sed:
sed 's/^[a-z]*,[a-z]*,/\U&/' file
s: substitute command
[a-z]*,: search for zero ore more lowercase letter followed by a ,. The pattern is repeated for second field
the \U sequence turns the replacement to uppercase
\U is applied to & which reference the matched string
or if there is only three comma separated fields:
sed 's/^[a-z].*,/\U&/' file
output:
APRIL,DECEMBER,month.gmail.com
LION,TIGER,animal.gmail.com
As #Sundeep suggests, the second sed can be shortened to:
s/^.*,/\U&/
which converts all characters until last , is found
For more on GNU sed substitution command, see this article

Insert specific lines from file before first occurrence of pattern using Sed

I want to insert a range of lines from a file, say something like 210,221r before the first occurrence of a pattern in a bunch of other files.
As I am clearly not a GNU sed expert, I cannot figure how to do this.
I tried
sed '0,/pattern/{210,221r file
}' bunch_of_files
But apparently file is read from line 210 to EOF.
Try this:
sed -r 's/(FIND_ME)/PUT_BEFORE\1/' test.text
-r enables extendend regular expressions
the string you are looking for ("FIND_ME") is inside parentheses, which creates a capture group
\1 puts the captured text into the replacement.
About your second question: You can read the replacement from a file like this*:
sed -r 's/(FIND_ME)/`cat REPLACEMENT.TXT`\1/' test.text
If replace special characters inside REPLACEMENT.TXT beforehand with sed you are golden.
*= this depends on your terminal emulator. It works in bash.
In https://stackoverflow.com/a/11246712/4328188 CodeGnome gave some "sed black magic" :
In order to insert text before a pattern, you need to swap the pattern space into the hold space before reading in the file. For example:
sed '/pattern/ {
h
r file
g
N
}' in
However, to read specific lines from file, one may have to use a two-calls solution similar to dummy's answer. I'd enjoy knowing of a one-call solution if it is possible though.

SED command to remove words at the end of the string

I want to remove last 2 words in the string which is in a file.
I am using this command first to delete the last word. But I couldn't do it. can someone help me
sed 's/\w*$//' <file name>
my strings are like this
Input:
asbc/jahsf/jhdsflk/jsfh/ -0.001 (exam)
I want to remove both numerical value and the one in brackets.
Output:
asbc/jahsf/jhdsflk/jsfh/
Using GNU sed:
$ sed -r 's/([[:space:]]+[-+.()[:alnum:]]+){2}$//' file
asbc/jahsf/jhdsflk/jsfh/
How it works
[[:space:]]+ matches one or more spaces.
[-+.()[:alnum:]]+ matches the 'words' which are allowed to contain any number of plus or minus signs, periods, parens, or any alphanumeric characters.
Note that, when a period is inside square brackets, [.], it is just a period, not a wildcard: it does not need to be escaped.
([[:space:]]+[-+.()[:alnum:]]+) matches one or more spaces followed by a word.
([[:space:]]+[-+.()[:alnum:]]+){2}$ matches two words and the spaces which precede them.
Note the use of character classes like [:space:] and [:alnum:]. Unlike the old-fashioned classes like [a-zA-Z0-9], these classes are unicode safe.
OSX (BSD) sed
The above was tested on GNU sed. For BSD sed, try:
sed -E 's/([[:space:]][[:space:]]*[-+.()[:alnum:][:alnum:]]*){2}$//' file
To remove everything that follows a number with decimal places
This looks for a decimal number with optional sign and removes it, the spaces which precede it, and everything which follows it:
$ sed -r 's/[[:space:]]+[-+]?[[:digit:]]+[.][[:digit:]]+[[:space:]].*//' file
asbc/jahsf/jhdsflk/jsfh/
How it works:
[[:space:]]+ matches one or more spaces
[-+]? matches zero or one signs.
[[:digit:]]+ matches any number of digits.
[.] matches a decimal point (period).
[[:digit:]]+ matches one or more digits following the decimal point.
[[:space:]] matches a space following the number.
.* matches anything which follows.
It looks like there is a tab between what you want to keep and what you want to get rid of. I don't have linux in front of me but try this.
sed 's/\t.*//'
This is assuming your strings are always formatted similarily which is what I take from your comment.
This might work for you (GNU sed):
sed -r 's/\s+\S+\s+\S+\s*$//' file
or if you prefer:
sed -r 's/(\s+\S+){2}\s*$//' file
This matches and removes: one or more whitespaces followed by one or more non-whitespaces twice followed by zero or more whitespaces at the end of the line.

Insert newline after pattern with changing number in sed

I want to insert a newline after the following pattern
lcl|NC_005966.1_gene_750
While the last number(in this case the 750) changes. The numbers are in a range of 1-3407.
How can I tell sed to keep this pattern together and not split them after the first number?
So far i found
sed 's/lcl|NC_005966.1_gene_[[:digit:]]/&\n/g' file
But this breaks off, after the first digit.
Try:
sed 's/lcl|NC_005966.1_gene_[[:digit:]]*/&\n/g' file
(note the *)
Alternatively, you could say:
sed '/lcl|NC_005966.1_gene_[[:digit:]]/G' file
which would add a newline after the specified pattern is encountered.
sed 's/lcl|NC_005966\.1_gene_[[:digit:]][[:digit:]]*/&\
/g' file
You need to escape . as it's an RE metacharacter, and you need [[:digit:]][[:digit:]]* to represent 1-or-more digits and you need to use \ followed by a literal newline for portability across seds.

Replace 3 lines with another line SED Syntax

This is a simple question, I'm not sure if i'm able to do this with sed/awk
How can I make sed search for these 3 lines and replace with a line with a determined string?
<Blarg>
<Bllarg>
<Blllarg>
replace with
<test>
I tried with sed "s/<Blarg>\n<Bllarg>\n<Blllarg>/<test>/g" But it just don't seem to find these lines. Probably something with my break line character (?) \n. Am I missing something?
Because sed usually handles only one line at a time, your pattern will never match. Try this:
sed '1N;$!N;s/<Blarg>\n<Bllarg>\n<Blllarg>/<test>/;P;D' filename
This might work for you:
sed '/<Blarg>/ {N;N;s/<Blarg>\n<Bllarg>\n<Blllarg>/<test>/}' <filename>
It works as follows:
Search the file till <Blarg> is found
Then append the two following lines to the current pattern space using N;N;
Check if the current pattern space matches <Blarg>\n<Bllarg>\n<Blllarg>
If so, then substitute it with <test>
You can use range addresses with regular expressions an the c command, which does exactly what you are asking for:
sed '/<Blarg>/,/<Blllarg>/c<test>' filename