How to replace every 2nd tab character with a newline character using sed - sed

given the input
123\t456\tabc\tdef
create the output
123\t456\nabc\tdef
which would display like
123 456
abc def
Note that it needs to work across multiple lines, not just two.
EDIT
a better example might help clarify.
input (there is only expected to be 1 line of input)
1\t2\t3\t4\t5\t6\t7\t8
expected output
1 2
3 4
5 6
7 8
...

With GNU sed:
sed 's/\t/\n/2;P;D;' file
Replaces second occurrence of tab character with newline character.

This little trick should work:
sed 's/\(\t[^\t]*\)\t/\1\n/g' < input_file.txt
EDIT:
Below is an example:
$ cat 1.txt
one two three four five six seven
five six seven
$ sed 's/\(\t[^\t]*\)\t/\1\n/g' < 1.txt
one two
three four
five six
seven
five six
seven
$
EDIT2:
For MacOS' standard sed try this:
$ sed $'s/(\t[^\t]*\t/\\1\\\n/g' < 1.txt
$ is used for replacing escape characters on the bash-level.

Let's say following is the Input_file:
cat Input_file
123 456 abc def
Then to get them into 2 columns following may help you in same.
xargs -n2 < Input_file
Output will be as follows.
123 456
abc def

Related

Join certain lines with sed

I have an input which looks like this:
1
2
3
4
5
6
And I want to transform it with sed to :
12
345
6
I know it can be easily done with other tools but I want to do it specifically with sed as a learning exercise.
I have attempted this:
sed ':x ; /^ *$/{ N; s/\n// ; bx; }'
But it prints :
123456
Can someone help me fix this?
Quoting from the GNU sed manual:
A common technique to process blocks of text such as paragraphs (instead of line-by-line) is using the following construct:
sed '/./{H;$!d} ; x ; s/REGEXP/REPLACEMENT/'
The first expression, /./{H;$!d} operates on all non-empty lines, and adds the current line (in the pattern space) to the hold space. On all lines except the last, the pattern space is deleted and the cycle is restarted.
The other expressions x and s are executed only on empty lines (i.e. paragraph separators). The x command fetches the accumulated lines from the hold space back to the pattern space. The s/// command then operates on all the text in the paragraph (including the embedded newlines).
And indeed,
sed '/./{H;$!d} ; x ; s/\n//g'
does what you want.
FWIW here's how to really do that task in UNIX:
$ awk -v RS= -v OFS= '{$1=$1}1' file
12
345
6
The above will work on any UNIX box.
A GNU awk approach:
$ awk -F"\n" '{gsub("\n","");}1' RS='\n{2,}' file
12
345
6
Note it will add a trailing newline\n after last line.

Replacing part of pattern with sed

I'm looking to replace a space that always comes after a number with a | in the middle of a pattern. There are also similar patterns later in the line sometimes that I do not wish to replace (see first/second lines in the examples).
example:
12315 asdfea 1 1ffesa
45456 asefasef 1 era
12 asfase
4 4aefs
what I need:
12315|asdfea 1 1ffesa
45456|asefasef 1 era
12|asfase
4|4aefs
I have tried this:
sed 's/\([0-9][ ][a-zA-Z]\)/|/g' file.txt
However this deletes the pattern such that it looks like this:
|sdfea 1 1ffesa
|sefasef 1 era
|sfase
|4aefs
Which is not what I need.
For given sample input/output,
$ sed 's/ /|/' file.txt
12315|asdfea 1 1ffesa
45456|asefasef 1 era
12|asfase
4|4aefs
By default, only first match will be replaced. g modifier will replace all matches
To replace first matched space between a digit and alphabet (matching depends on locale too)
$ sed 's/\([0-9]\) \([a-zA-Z]\)/\1|\2/' file.txt
12315|asdfea 1 1ffesa
45456|asefasef 1 era
12|asfase
4 4aefs
This uses capture group and backreferences. Note that last line is not modified

Insert filename into text file with sed

I've been learning about sed and finding it very useful, but cannot find an answer to this in any of the many guides and examples ... I'd like to insert the filename of a text file, minus its path and extension, into a specific line within the text itself. Possible?
In such cases, the correct starting point should be man pages. Manual of sed does not provide a feature for sed to understand "filename", but sed does support inserting a text before/after a line.
As a result you need to isolate the filename separatelly , store the text to a variable and inject this text after/before the line you wish.
Example:
$ a="/home/gv/Desktop/PythonTests/cpu.sh"
$ a="${a##*/}";echo "$a"
cpu.sh
$ a="${a%.*}"; echo "$a"
cpu
$ cat file1
LOCATION 0 X 0
VALUE 1a 2 3
VALUE 1b 2 3
VALUE 1c 2 3
$ sed "2a $a" file1 # Inject the contents of variable $a after line2
LOCATION 0 X 0
VALUE 1a 2 3
cpu
VALUE 1b 2 3
VALUE 1c 2 3
$ sed "2i $a" file1 # Inject the contetns of variable $a before line2
LOCATION 0 X 0
cpu
VALUE 1a 2 3
VALUE 1b 2 3
VALUE 1c 2 3
$ sed "2a George" file1 #Inject a fixed string "George" after line 2
LOCATION 0 X 0
VALUE 1a 2 3
George
VALUE 1b 2 3
VALUE 1c 2 3
Explanation:
a="${a##*/}" : Removes all chars from the beginning of string up to last found slash / (longer match)
a="${a%.*}" : Remove all chars starting from the end of the string up to the first found dot . (short match) . You can also use %% for the longest found dot.
sed "2a $a" : Insert after line 2 the contents of variable $a
sed "2i $q" : Insert before line 2 the contents of $a
Optionally you can use sed -i to make changes in-place / in file under process
wrt I've been learning about sed then you may have been wasting your time as there isn't a lot TO learn about sed beyond s/old/new. Sure there's a ton of other language constructs and things you could do with sed, but in practice you should avoid them all and simply use awk instead. If you edit your question to include concise, testable sample input and expected output and add an awk tag then we can show you how to do whatever you want to do the right way.
Meanwhile, it sounds like all you need is:
$ cat /usr/tmp/file
a
b
c
d
e
$ awk 'NR==3{print gensub(/.*\//,"",1,FILENAME)} 1' /usr/tmp/file
a
b
file
c
d
e
The above inserts the current file name before line 3 of the open file. It uses GNU awk for gensub(), with other awks you'd just use sub() and a variable.

How to remove empty lines to one empty line between sentences in text files?

I have a text file with many empty lines between sentences. I used sed, gawk, grep but they dont work. :(. How can I do now? Thanks.
Myfile: Desired file:
a a
b b
c c
. .
d d
e e
f f
g g
. .
h
i
h j
i k
j .
k
.
You can use awk for this:
awk 'BEGIN{prev="x"}
/^$/ {if (prev==""){next}}
{prev=$0;print}' inputFile
or the compressed one liner:
awk 'BEGIN{p="x"}/^$/{if(p==""){next}}{p=$0;print}' inFl
This is a simple state machine that collapses multi-blank-lines into a single one.
The basic idea is this. First, set the previous line to be non-empty.
Then, for every line in the file, if it and the previous one are blank, just throw it away.
Otherwise, set the previous line to that value, print the line, and carry on.
Sample transcript, the following command:
$ echo '1
2
3
4
5
6
7
8
9
10' | awk 'BEGIN{p="x"}/^$/{if(p==""){next}}{p=$0;print}'
outputs:
1
2
3
4
5
6
7
8
9
10
Keep in mind that this is for truly blank lines (no content). If you're trying to collapse lines that have an arbitrary number of spaces or tabs, that will be a little trickier.
In that case, you could pipe the file through something like:
sed 's/^\s*$//'
to ensure lines with just whitespace become truly empty.
In other words, something like:
sed 's/^\s*$//' infile | awk 'my previous awk command'
To suppress repeated empty output lines with GNU cat:
cat -s file1 > file2
Here's one way using sed:
sed ':a; N; $!ba; s/\n\n\+/\n\n/g' file
Otherwise, if you don't mind a trailing blank line, all you need is:
awk '1' RS= ORS="\n\n" file
The Perl solution is even shorter:
perl -00 -pe '' file
You could do like this also,
awk -v RS="\0" '{gsub(/\n\n+/,"\n\n");}1' file
Explanation:
RS="\0" Once we set the null character as Record Seperator value, awk will read the whole file as single record.
gsub(/\n\n+/,"\n\n"); this replaces one or more blank lines with a single blank line. Note that \n\n regex matches a blank line along with the previous line's new line character.
Here is an other awk
awk -v p=1 'p=="" {p=1;next} 1; {p=$0}' file

sed + count words on field 3

I used the following awk in order to count all words that appears in field 4
awk '{print $4}' file | awk '{print NF}' | grep -c 1
How we can to the same in sed?
Example of file:
1 2 3 4
1 2
1 2 3 4 5
1 2
1 2 3
1 2 3 4
From file sed should return the results 3 (three words on field 4)
yael
First of all, your awk is quite inefficient. Try this:
awk '$4{c++}END{print c}' file
Why do you want it in sed, BTW? This is what awk does well. If you really want it in sed, I guess something like this:
sed '/^\s*\S*\s*\S*\s*\S*\s*$/d' file | wc -l
awk explanation: In every line where fourth field is non-null, increment c. At the end, print c.
sed explanation: delete each line which matches the regexp. Then with wc count the lines of the sed output. The regexp basically says there can be maximum of two whitespace groups in the line, not counting initial and final ones, which then means there can be at most 3 fields in the line.
cut can also be used:
cut -f 5 -d' ' file | wc -w
Select the 5. column (the first one is empty due to the leading blank). The delimiter is a blank.
This might work for you:
sed 's/ *[^ ]*/&/4;t;d;' file | sed -n '$='