Grep and replace the character from grepped result - sed

I am trying to grep lines from a file and then change one character in the matched lines.
For example:
cat file1.txt
Surjit
Shilpa
cchiku
end of file
I grepped the lines which start with S:
grep -e "S"
Then I want to replace the 4th character with x in all of the grepped lines in file1.txt.
I tried
sed -i "s/./x/4" file1.txt
How can I do this only for grepped results?

You can use the sed '/pattern/s/find/replace/' file syntax:
sed '/^S/s/./x/4' file
#    ^^^^^^^^^^^
#    |   |
#    |   replace the 4th character with x
#    |
#    on lines starting with S
With your file:
$ sed '/^S/s/./x/4' file
Surxit
Shixpa
cchiku
end of file
Note I am using /^S/ as a pattern to match lines starting with S, because if you just say /S/ it will match any line containing S. The anchor ^ indicates the beginning of the line.
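To see why the anchor matters, here is a small sketch on made-up input. The variable names and the line boSsy are hypothetical and not part of the question's file:

```shell
# Hypothetical input: one line starts with S, another only contains an S.
sample=$(printf '%s\n' 'Surjit' 'boSsy' 'cchiku')

# Unanchored address: every line that contains an S is edited.
unanchored=$(printf '%s\n' "$sample" | sed '/S/s/./x/4')

# Anchored address: only lines that start with S are edited.
anchored=$(printf '%s\n' "$sample" | sed '/^S/s/./x/4')

printf '%s\n' "$unanchored"
printf '%s\n' "$anchored"
```

With /S/ the line boSsy is changed as well; with /^S/ it is left alone.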

An alternative to fedorqui's answer is to include the starting with S condition into the pattern itself:
sed 's/^\(S..\)./\1x/' file
The command matches lines starting with S and captures the S and the following two characters in a group. In the replacement, the content of the group is reused (\1) and the character after it is replaced by x.

awk -v FS="" -v OFS="" '/^S/{$4="x"}1' infile
Surxit
Shixpa
cchiku
end of file
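Splitting every character into its own field with an empty FS works in gawk and several other awks, but POSIX leaves that behaviour unspecified. A sketch that avoids it by using substr(), assuming the same file contents as in the question (the variable names are placeholders):

```shell
input=$(printf '%s\n' 'Surjit' 'Shilpa' 'cchiku' 'end of file')

# On lines starting with S that have at least 4 characters, rebuild the line as
# characters 1-3, then x, then everything from character 5 on.
result=$(printf '%s\n' "$input" |
  awk '/^S/ && length($0) >= 4 { $0 = substr($0, 1, 3) "x" substr($0, 5) } 1')

printf '%s\n' "$result"
```

The length check is there so that short lines such as "So" are not padded out to four characters.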

Related

sed: get a line number with regex and insert text at that line

I want to get the first line of a file that is not commented out with a hash, then append a line of text just after that line.
I managed to get the number of the line:
sed -n '/^\s*#/!{=;q}' file   # prints 2
and also to insert text (specifying the line manually):
sed '2 a extralinecontent' file
I can't get them working together as a one liner or in a batch.
I tried command substitution (with $(command) and also with backticks) but I get an error from bash:
sed '$(sed -n '/^\s*#/!{=;q}' file) a extralinecontent' file
-bash: !{=: event not found
and also tried many other combinations, but no luck.
I'm using gnu-sed (via brew) on macOS.
This might work for you (GNU sed):
sed -e '/^\s*#/b;a extra line content' -e ':a;n;ba' file
Skip over the comment lines at the start of the file, append the extra line after the first line that is not a comment, then loop to fetch and print all the remaining lines unchanged.
Here's a way to do it with GNU sed without reading the file twice:
$ cat ip.txt
#comment
foo baz good
123 456 7889
$ sed -e '0,/^\s*[^#[:space:]]/ {// a XYZ' -e '}' ip.txt
#comment
foo baz good
XYZ
123 456 7889
GNU sed allows the first address to be 0 if the other address is a regex; that way this works even if the first line matches the condition.
/^\s*[^#[:space:]]/ : since sed doesn't support possessive quantifiers, the character class has to make sure that the first character it matches is neither a # nor a whitespace character.
// is a handy shortcut to repeat the last regex.
a XYZ appends your required line (note that your question mentions insert, so if you want that, use i instead of a).
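For completeness, the two-step approach from the question can also be made to work by keeping the line number in a shell variable. The attempt in the question fails because the inner single quotes end the outer quoted string, which leaves the ! unquoted (hence the history expansion error), and a $(...) inside single quotes would not be expanded anyway. This is only a sketch, assuming GNU sed (for \s and the one-line a and i forms); the temporary file and the variable n are placeholders:

```shell
tmp=$(mktemp)
printf '%s\n' '#comment' 'foo baz good' '123 456 7889' > "$tmp"

# Step 1: line number of the first line that is not a comment.
n=$(sed -n '/^\s*#/!{=;q}' "$tmp")

# Step 2: double quotes let the shell expand $n before sed sees the script.
appended=$(sed "${n}a XYZ" "$tmp")   # XYZ goes after line n
inserted=$(sed "${n}i XYZ" "$tmp")   # XYZ goes before line n

printf '%s\n' "$appended"
rm -f "$tmp"
```

Unlike the single-pass answers above, this reads the file twice, and it needs a guard for the case where every line is a comment and n ends up empty.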

How to truncate the first digit of a number?

For example, my file has the following data:
$ cat sample.txt
19999119999,string1,dddddd
18888135790,string2,dddddd
15555555500,string3,dddddd
This is sample data. How can we remove ONLY the first digit from each row? My output should be:
$ cat output.txt
9999119999,string1,dddddd
8888135790,string2,dddddd
5555555500,string3,dddddd
Is there any way to parse each line character by character using grep or sed?
Or any other way to get the desired output?
You just need to print from the second character on:
$ cut -c2- file
9999119999,string1,dddddd
8888135790,string2,dddddd
5555555500,string3,dddddd
Or, using sed, remove the first char:
$ sed 's/^.//' file
9999119999,string1,dddddd
8888135790,string2,dddddd
5555555500,string3,dddddd
Try this:
$ sed -r 's/^[0-9](.*)/\1/' sample.txt
Output:
9999119999,string1,dddddd
8888135790,string2,dddddd
5555555500,string3,dddddd
^[0-9] - The first digit of each line
(.*) - The content of each line except the first digit
\1 - Refers to the content captured by (.*)
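The sed forms above differ on lines that do not start with a digit: s/^.// removes whatever the first character is, while a pattern starting with ^[0-9] only removes a digit. A small sketch with a made-up header line (the id,name,code line is an assumption, not part of the sample data); it also shows that the capture group is optional here, since deleting the matched digit is enough:

```shell
data=$(printf '%s\n' 'id,name,code' '19999119999,string1,dddddd')

any_char=$(printf '%s\n' "$data" | sed 's/^.//')        # also eats the "i" of the header
digit_only=$(printf '%s\n' "$data" | sed 's/^[0-9]//')  # header left untouched

printf '%s\n' "$any_char" "$digit_only"
```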
Sorry for my bad English.
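Grep can solve this with a lookbehind. For that you need the -P option (Perl-compatible regular expressions, as in GNU grep):
grep -Po '(?<=^\d)(.+)' file
or, in shorthand:
grep -Po '^\d\K.+' file
(?<=^\d) is a lookbehind that requires the first digit to come right before the match. ^\d\K matches the digit and then \K drops it from the reported match, so both print only what follows the first digit.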

Using sed to keep the beginning of a line

I have a file in which some lines start with a >
For these lines, and only these ones, I want to keep the first eleven characters.
How can I do that using sed?
Or maybe something else is better?
Thanks !
Muriel
Let's start with this test file:
$ cat file
line one with something or other
>1234567890abc
other line in file
To keep only the first 11 characters of lines starting with > while keeping all other lines:
$ sed -r '/^>/ s/(.{11}).*/\1/' file
line one with something or other
>1234567890
other line in file
To keep only the first eleven characters of lines starting with > and deleting all other lines:
$ sed -rn '/^>/ s/(.{11}).*/\1/p' file
>1234567890
The above was tested with GNU sed. For BSD sed, replace the -r option with -E.
Explanation:
/^>/ is a condition. It means that the command which follows only applies to lines that start with >
s/(.{11}).*/\1/ is a substitution command. It replaces the whole line with just the first eleven characters.
-r turns on extended regular expression format, eliminating the need for some escape characters.
-n turns off automatic printing. With -n in effect, lines are only printed if we explicitly ask them to be printed. In the second case above, that is done by adding a p after the substitute command.
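If neither -r nor -E is available, the same substitution can be written in basic regular expression syntax, where the grouping and interval characters are escaped. A sketch, assuming the test file shown above (the variable names are placeholders):

```shell
lines=$(printf '%s\n' 'line one with something or other' '>1234567890abc' 'other line in file')

# BRE form: \( \) for the group and \{11\} for the repeat count.
bre=$(printf '%s\n' "$lines" | sed '/^>/ s/^\(.\{11\}\).*/\1/')

# The same idea in awk, using substr() instead of a regex.
viaawk=$(printf '%s\n' "$lines" | awk '/^>/ { $0 = substr($0, 1, 11) } 1')

printf '%s\n' "$bre"
```

Both variants leave lines shorter than eleven characters as they are.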
Other forms:
$ sed -r 's/(>.{10}).*/\1/' file
line one with something or other
>1234567890
other line in file
And:
$ sed -rn 's/(>.{10}).*/\1/p' file
>1234567890

Simple SED statement to truncate everything after first instance of underscore AFTER # sign

Input:
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
Required Output:
ab_cd#yahoo.co.uk
ef_gh#gmail.com
I'm looking for a SED statement to do the above.
Essentially I would like everything from and including the first underscore character AFTER the # sign to be removed from the output.
I'm sorry, I only have basic knowledge of programming. I'm on a Windows machine, where I've found a SED editor that I use to modify simple strings in a batch file from the Windows shell.
Many thanks
give this a try:
sed 's/_[^#]*$//' file
it worked here with your input:
kent$ cat f
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
kent$ sed 's/_[^#]*$//' f
ab_cd#yahoo.co.uk
ef_gh#gmail.com
This can be a way:
$ sed -r 's/(.*#[^_]*).*/\1/' file
ab_cd#yahoo.co.uk
ef_gh#gmail.com
It captures everything up to and including #, plus the text that follows up to the first _. Then it prints that back, dropping everything from the _ onward.
Explanation
Matching group: given a sed 's/something/back/' command, whatever you "catch" in the something part can be enclosed in parentheses so that you can refer back to it with \1 (first group), \2 (second group), and so on up to \9.
$ cat file
hello33bye
hello44goodbye
hello55yeah
$ sed -r 's/hello([0-9]*).*/\1/' file
33
44
55
So (.*#[^_]*).* means: capture a block of text up to a #, followed by any characters apart from _. Then, match the rest of the text.
Finally, print the captured block back.
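A sketch with two groups, to show how \1 and \2 refer to the first and second parenthesised parts. The key=value input is made up for illustration:

```shell
# -E is accepted by current GNU and BSD seds; older GNU sed may need -r instead.
swapped=$(printf '%s\n' 'user=alice' 'host=example' |
  sed -E 's/^([^=]*)=(.*)$/\2=\1/')

printf '%s\n' "$swapped"
```

Group 1 holds the text before the first =, group 2 the text after it, and the replacement writes them back in the opposite order.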
To be sure we are not matching a _ within the domain:
sed -r 's/(.*#[^\.]*[^_]*).*/\1/' file
              ^^^^^^
              catch a dot before catching an underscore
Test
$ cat a
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
aaa#gma_il.com_ST_BB
aaa#gma_il.com
$ sed -r 's/(.*#[^\.]*[^_]*).*/\1/' a
ab_cd#yahoo.co.uk
ef_gh#gmail.com
aaa#gma_il.com
aaa#gma_il.com
Here is an awk version. (awk is normally easier to understand than sed.)
cat file
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
test#home_net.com_BO_22.jpg
awk -F\. '{NF--;split($NF,a,"_");$NF=a[1]}1' OFS=\. file
ab_cd#yahoo.co.uk
ef_gh#gmail.com
test#home_net.com
It removes the last field when splitting on ., then splits the new last field on _ and replaces it with the first part.

Unix - Split to N files using regexp to name destination file

How do I split a file into N files, using the first 2 chars of each line as the filename?
Ex input file:
AA23409234TEXT
BA23201202Other Text
AA23509234YADA
BA23202202More Text.
C1000000000000000000
Should generate 3 files:
AA.txt
AA23409234TEXT
AA23509234YADA
BA.txt
BA23201202Other Text
BA23202202More Text.
C1.txt
C1000000000000000000
I'm thinking of using a sed script similar to this
/^(..)/w \1
But what that really does is create a file named '\1' instead of the capture group.
Any ideas?
$ awk '{fname = substr($0, 1, 2) ".txt"; print >> fname}' input.txt
Or
$ while IFS= read -r line; do printf '%s\n' "$line" >> "${line:0:2}.txt"; done < input.txt
The first thing you need to do is determine all of your file names:
filenames=$(sed 's/\(..\).*/\1/' listOfStrings.txt | sort | uniq)
Then, loop through those filenames
for filename in $filenames
do
    # Double quotes, so the shell expands $filename inside the sed script
    sed -n "/^$filename/p" listOfStrings.txt > "$filename.txt"
done
I have not tested this, but I think it should work.
This might work for you:
sed 's/\(..\).*/echo "&" >>\1.txt/' file | sh
or if you have GNU sed:
sed 's/\(..\).*/echo "&" >>\1.txt/e' file
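Generating shell commands from the data is fragile if a line contains double quotes, $, or backticks, since sh will interpret them. A sketch that avoids the shell by letting awk do the writing, assuming the two-character prefix is safe to use as a file name. The temporary directory and the names workdir and dir are only there to keep the example self-contained:

```shell
workdir=$(mktemp -d)
printf '%s\n' 'AA23409234TEXT' 'BA23201202Other Text' 'AA23509234YADA' \
  'BA23202202More Text.' 'C1000000000000000000' > "$workdir/input.txt"

# substr() is 1-based in awk; close() keeps the number of open files low
# when there are many distinct prefixes.
awk -v dir="$workdir" '{
  f = dir "/" substr($0, 1, 2) ".txt"
  print >> f
  close(f)
}' "$workdir/input.txt"

ls "$workdir"
```

Because the files are opened in append mode, running it twice duplicates the lines, so remove old output files first.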