Replacing part of pattern with sed - sed

I'm looking to replace a space that always comes after a number with a | in the middle of a pattern. There are also similar patterns later in the line sometimes that I do not wish to replace (see first/second lines in the examples).
example:
12315 asdfea 1 1ffesa
45456 asefasef 1 era
12 asfase
4 4aefs
what I need:
12315|asdfea 1 1ffesa
45456|asefasef 1 era
12|asfase
4|4aefs
I have tried this:
sed 's/\([0-9][ ][a-zA-Z]\)/|/g' file.txt
However this deletes the pattern such that it looks like this:
|sdfea 1 1ffesa
|sefasef 1 era
|sfase
|4aefs
Which is not what I need.

For given sample input/output,
$ sed 's/ /|/' file.txt
12315|asdfea 1 1ffesa
45456|asefasef 1 era
12|asfase
4|4aefs
By default, only the first match on each line is replaced; the g modifier replaces all matches.
To replace the first space that sits between a digit and a letter (what [a-zA-Z] matches depends on the locale too):
$ sed 's/\([0-9]\) \([a-zA-Z]\)/\1|\2/' file.txt
12315|asdfea 1 1ffesa
45456|asefasef 1 era
12|asfase
4 4aefs
This uses capture groups and backreferences. Note that the last line is not modified, because its space is followed by a digit rather than a letter.
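If the goal is to change only the space that follows the number at the start of the line, whatever comes after it, anchoring the pattern is one option. This is a minimal sketch, assuming every line begins with a run of digits; number_pipe is a hypothetical helper name, not something from the question.

```shell
# number_pipe (hypothetical name): replace only the space that follows the
# leading run of digits on each line. Uses POSIX basic regular expressions.
number_pipe() {
  sed 's/^\([0-9][0-9]*\) /\1|/'
}

printf '%s\n' '12315 asdfea 1 1ffesa' '4 4aefs' | number_pipe
```

Because of the ^ anchor there is at most one match per line, so later "digit space" sequences such as the "1 1ffesa" part are left alone.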

Related

Join certain lines with sed

I have an input which looks like this:
1
2

3
4
5

6
And I want to transform it with sed to :
12
345
6
I know it can be easily done with other tools but I want to do it specifically with sed as a learning exercise.
I have attempted this:
sed ':x ; /^ *$/{ N; s/\n// ; bx; }'
But it prints :
123456
Can someone help me fix this?
Quoting from the GNU sed manual:
A common technique to process blocks of text such as paragraphs (instead of line-by-line) is using the following construct:
sed '/./{H;$!d} ; x ; s/REGEXP/REPLACEMENT/'
The first expression, /./{H;$!d} operates on all non-empty lines, and adds the current line (in the pattern space) to the hold space. On all lines except the last, the pattern space is deleted and the cycle is restarted.
The other expressions x and s are executed only on empty lines (i.e. paragraph separators). The x command fetches the accumulated lines from the hold space back to the pattern space. The s/// command then operates on all the text in the paragraph (including the embedded newlines).
And indeed,
sed '/./{H;$!d} ; x ; s/\n//g'
does what you want.
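The hold-space technique can be tried on the sample data directly. This is a minimal sketch, assuming the blocks are separated by empty lines; join_paragraphs is a hypothetical name, and the closing brace is preceded by ';' so that non-GNU seds should accept the script too.

```shell
# join_paragraphs (hypothetical name): join the lines of each block that is
# delimited by empty lines into a single line.
join_paragraphs() {
  # Non-empty lines are appended to the hold space and deleted (except the
  # last one). On an empty line, or at end of input, x swaps the collected
  # block back in and s removes the embedded newlines.
  sed -e '/./{H;$!d;}' -e 'x;s/\n//g'
}

printf '1\n2\n\n3\n4\n5\n\n6\n' | join_paragraphs
```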
FWIW here's how to really do that task in UNIX:
$ awk -v RS= -v OFS= '{$1=$1}1' file
12
345
6
The above will work on any UNIX box.
A GNU awk approach:
$ awk -F"\n" '{gsub("\n","");}1' RS='\n{2,}' file
12
345
6
Note that it will add a trailing newline (\n) after the last line.

How to replace every 2nd tab character with a newline character using sed

given the input
123\t456\tabc\tdef
create the output
123\t456\nabc\tdef
which would display like
123 456
abc def
Note that it needs to work across multiple lines, not just two.
EDIT
a better example might help clarify.
input (there is only expected to be 1 line of input)
1\t2\t3\t4\t5\t6\t7\t8
expected output
1 2
3 4
5 6
7 8
...
With GNU sed:
sed 's/\t/\n/2;P;D;' file
Replaces second occurrence of tab character with newline character.
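The P and D commands are what make this work on more than one pair: P prints the pattern space up to the first newline, and D deletes that part and restarts the script on the remainder without reading new input. The sketch below spells the same idea without relying on GNU's \t and \n escapes; pairs_per_line and the tab variable are hypothetical names chosen for illustration.

```shell
tab=$(printf '\t')   # a literal tab, so the script does not rely on \t support

# pairs_per_line (hypothetical name): break each input line after every 2nd tab.
pairs_per_line() {
  # s///2 turns the 2nd tab into a newline (written as backslash + newline),
  # P prints the text before that newline, D removes it and restarts the
  # script on what is left without reading a new input line.
  sed "s/${tab}/\\
/2;P;D"
}

printf '1\t2\t3\t4\t5\t6\t7\t8\n' | pairs_per_line
```

When fewer than two tabs remain, the substitution does nothing, P prints the rest, and D ends the cycle.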
This little trick should work:
sed 's/\(\t[^\t]*\)\t/\1\n/g' < input_file.txt
EDIT:
Below is an example:
$ cat 1.txt
one two three four five six seven
five six seven
$ sed 's/\(\t[^\t]*\)\t/\1\n/g' < 1.txt
one two
three four
five six
seven
five six
seven
$
EDIT2:
For MacOS' standard sed try this:
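$ sed -E $'s/(\t[^\t]*)\t/\\1\\\n/g' < 1.txt
$'...' is bash's ANSI-C quoting: the shell expands escape sequences such as \t before sed sees the script. The -E option enables extended regular expressions, so the unescaped parentheses form a group.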
Let's say the following is the Input_file:
cat Input_file
123 456 abc def
Then the following puts the fields into 2 columns:
xargs -n2 < Input_file
Output will be as follows.
123 456
abc def

Insert filename into text file with sed

I've been learning about sed and finding it very useful, but cannot find an answer to this in any of the many guides and examples ... I'd like to insert the filename of a text file, minus its path and extension, into a specific line within the text itself. Possible?
In such cases, the right starting point is the man pages. The sed manual does not offer a way for sed to manipulate the "filename" (GNU sed's F command only prints it unchanged), but sed does support inserting text before or after a line.
As a result, you need to isolate the filename separately, store it in a variable, and inject that text before or after the line you wish.
Example:
$ a="/home/gv/Desktop/PythonTests/cpu.sh"
$ a="${a##*/}";echo "$a"
cpu.sh
$ a="${a%.*}"; echo "$a"
cpu
$ cat file1
LOCATION 0 X 0
VALUE 1a 2 3
VALUE 1b 2 3
VALUE 1c 2 3
$ sed "2a $a" file1 # Inject the contents of variable $a after line2
LOCATION 0 X 0
VALUE 1a 2 3
cpu
VALUE 1b 2 3
VALUE 1c 2 3
$ sed "2i $a" file1 # Inject the contents of variable $a before line 2
LOCATION 0 X 0
cpu
VALUE 1a 2 3
VALUE 1b 2 3
VALUE 1c 2 3
$ sed "2a George" file1 #Inject a fixed string "George" after line 2
LOCATION 0 X 0
VALUE 1a 2 3
George
VALUE 1b 2 3
VALUE 1c 2 3
Explanation:
a="${a##*/}" : Removes all chars from the beginning of the string up to the last slash / (longest match)
a="${a%.*}" : Removes all chars from the end of the string back to the last dot . (shortest match). Use %% instead to remove everything from the first dot onward (longest match).
sed "2a $a" : Inserts the contents of variable $a after line 2
sed "2i $a" : Inserts the contents of variable $a before line 2
Optionally, you can use sed -i to apply the changes to the file in place.
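The two steps can be combined into one small function. This is a minimal sketch, assuming the file name contains no backslashes or newlines (they would need escaping before being placed in the sed script); insert_basename is a hypothetical helper name.

```shell
# insert_basename FILE LINE (hypothetical helper): print FILE with its own
# base name (no directory, no last extension) added after line LINE.
insert_basename() {
  file=$1
  line=$2
  name=${file##*/}   # drop everything up to the last slash
  name=${name%.*}    # drop the last dot and what follows it
  # "a\" followed by a newline is the portable spelling of the append command.
  sed "${line}a\\
${name}" "$file"
}

# Demo on a throwaway copy of the sample data.
demo_dir=$(mktemp -d)
printf '%s\n' 'LOCATION 0 X 0' 'VALUE 1a 2 3' 'VALUE 1b 2 3' > "$demo_dir/cpu.sh"
insert_basename "$demo_dir/cpu.sh" 2
rm -rf "$demo_dir"
```

The function only prints the result; redirect it to a new file, or switch to sed -i once the output looks right, to keep the change.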
Regarding "I've been learning about sed": you may have been wasting your time, as there isn't a lot to learn about sed beyond s/old/new. Sure, there's a ton of other language constructs and things you could do with sed, but in practice you should avoid them all and simply use awk instead. If you edit your question to include concise, testable sample input and expected output and add an awk tag, then we can show you how to do whatever you want to do the right way.
Meanwhile, it sounds like all you need is:
$ cat /usr/tmp/file
a
b
c
d
e
$ awk 'NR==3{print gensub(/.*\//,"",1,FILENAME)} 1' /usr/tmp/file
a
b
file
c
d
e
The above inserts the current file name before line 3 of the open file. It uses GNU awk for gensub(), with other awks you'd just use sub() and a variable.

Sed to replace last character on condition

I have a file which has following lines
172XI207 X123955 1
412XE401 XE05689 1
412XI402 XI9515 1
412XI403 XI06702 1
412XE404 XE75348 1
I want to change the last column to 2 if the first two characters of the second column are XE.
The result should be like below
172XI207 X123955 1
412XE401 XE05689 2
412XI402 XI9515 1
412XI403 XI06702 1
412XE404 XE75348 2
I want to use sed (not awk). Can someone please let me know how this can be achieved using sed?
Many sed commands take an address or address range (see the man page for the gory details). The most common command is probably s, and it is among those that take an address, meaning it doesn't need to apply to every line. An address can be a regular expression. The s command then has the form:
{address}s/pattern/replacement/
For you the address (the matching RE) is / XE/ (assuming your columns are space-separated; change that to a tab if necessary), the pattern is 1$ and the replacement is 2. Therefore:
/ XE/s/1$/2/
or as a command line
sed -e '/ XE/s/1$/2/' < oldfile > newfile
EDIT: oops, second column, not start of line.
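The address / XE/ matches " XE" anywhere in the line. A slightly stricter sketch ties it to the second column; it assumes the columns are separated by single spaces, and bump_xe_rows is a hypothetical name.

```shell
# bump_xe_rows (hypothetical name): change a trailing 1 to 2, but only on
# lines whose second space-separated column starts with XE.
bump_xe_rows() {
  # Address: first column (no spaces), one space, then XE.
  sed '/^[^ ]* XE/s/1$/2/'
}

printf '%s\n' '172XI207 X123955 1' '412XE401 XE05689 1' | bump_xe_rows
```

The first column of the second sample line also contains XE, but it is not preceded by a space at that position, so only the second column decides whether the substitution runs.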
This command should do the trick (providing you are looking at myfile.txt)
sed -e '/ XE/ s/1$/2/' myfile.txt
You can make the replacement permanent by adding the -i option, which modifies the file in place; make sure the output is exactly what you expect before doing so, though.
Edit: based on question in comments, here is a command that matches on 3rd column and replaces on fifth.
sed -e 's/^\(\(\w\+\W\+\)\{2\}XE\(\w\+\W\+\)\{2\}\)1/\12/'
Or, as an alternative, you can first select the line and then substitute:
sed -e '/^\(\w\+\W\+\)\{2\}XE/ s/^\(\(\w\+\W\+\)\{4\}\)1/\12/'

sed delete remaining characters in line except first 5

What would be the sed command to delete all characters in a line except the first 5?
I've tried going 'backwards' on this (reverted deleting), but it's not the most elegant solution.
This might work for you (GNU sed):
echo '1234567890' | sed 's/.//6g'
12345
Or:
echo '1234567890' | cut -c-5
12345
Try this (it captures the first 5 characters of the line in the first group, matches whatever follows, and replaces the whole match with the first group):
sed 's/^\(.\{5\}\).*/\1/'
Or the alternative suggested by mouviciel:
sed 's/^\(.....\).*/\1/'
(it is more readable as long as the number of first characters you want does not grow too large)
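When the count needs to vary, the interval form can take it from a shell variable. This is a minimal sketch, assuming the argument is a positive integer; keep_first is a hypothetical name.

```shell
# keep_first N (hypothetical name): keep only the first N characters of each
# input line. Lines shorter than N do not match and pass through unchanged.
keep_first() {
  n=$1
  # After shell quoting, the script is: s/^\(.\{N\}\).*/\1/
  sed "s/^\\(.\\{${n}\\}\\).*/\\1/"
}

echo '1234567890' | keep_first 5
```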