Prepend text to every nth line in a text file - sed

This sed command-line script prepends text to every line in a file:
sed -i 's/^/to be prepended/g' text.txt
How can I make it do that only on every nth line?
I am working with sequencing data. In the "normal" multi-FASTA format, there is first an identifier line starting with a > followed by additional text.
The next line holds a random DNA sequence such as "AATTGCC"; when that string ends, a new line starts with the next identifier. How can I prepend text (additional bases) to the beginning of each sequence line?

Just use the following GNU sed syntax:
sed '0~Ns/^/to be prepended/'
#    ^^^
#    set N to the number you want!
For example, to prepend HA to the lines whose numbers are multiples of 4:
$ seq 10 | sed '0~4s/^/HA/'
1
2
3
HA4
5
6
7
HA8
9
10
Or to those of the form 4N+1:
$ seq 10 | sed '1~4s/^/HA/'
HA1
2
3
4
HA5
6
7
8
HA9
10
From the sed manual → 3.2. Selecting lines with sed:
first~step
This GNU extension matches every stepth line starting with line first. In particular, lines will be selected when there exists a non-negative n such that the current line-number equals first + (n * step). Thus, to select the odd-numbered lines, one would use 1~2; to pick every third line starting with the second, ‘2~3’ would be used; to pick every fifth line starting with the tenth, use ‘10~5’; and ‘50~0’ is just an obscure way of saying 50.
By the way, there is no need to use /g for global replacement, since ^ can match only once per line.
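The first~step rule quoted above can be sketched in a few lines of Python. This is only an illustration of the selection arithmetic, not how sed implements it; the function name prepend_every and its parameters are made up for this example.

```python
def prepend_every(lines, first, step, prefix):
    """Prepend prefix to the lines a GNU sed first~step address would select."""
    out = []
    for number, line in enumerate(lines, start=1):  # sed counts lines from 1
        if step > 0:
            # Selected when number == first + n * step for some n >= 0.
            selected = number >= first and (number - first) % step == 0
        else:
            # A step of 0 selects only line `first`.
            selected = number == first
        out.append(prefix + line if selected else line)
    return out


numbers = [str(n) for n in range(1, 11)]
print(prepend_every(numbers, 0, 4, "HA"))  # same selection as 0~4
print(prepend_every(numbers, 1, 4, "HA"))  # same selection as 1~4
```

With first set to 0 the first selected line is line step itself, which is why 0~4 hits lines 4 and 8 while 1~4 hits lines 1, 5 and 9.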

$ seq 10 | perl -pe's/^/to be prepended / unless $. % 3'
1
2
to be prepended 3
4
5
to be prepended 6
7
8
to be prepended 9
10
$ seq 10 | perl -pe's/^/to be prepended / unless $. % 3 - 1'
to be prepended 1
2
3
to be prepended 4
5
6
to be prepended 7
8
9
to be prepended 10
$ seq 10 | perl -pe's/^/to be prepended / unless $. % 3 - 2'
1
to be prepended 2
3
4
to be prepended 5
6
7
to be prepended 8
9
10
You get the idea.

seq 15|awk -v line=4 'NR%line==0{$0="Prepend this text : " $0}1'
1
2
3
Prepend this text : 4
5
6
7
Prepend this text : 8
9
10
11
Prepend this text : 12
13
14
15
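For the FASTA case described in the question, counting lines is one option; another is to key on the > marker instead. The sketch below takes that second route in Python. It assumes every line that does not start with > is a sequence line that should receive the extra bases; the function name and the sample records are invented for illustration.

```python
def prepend_to_sequences(lines, bases):
    """Prepend bases to every line that is not a FASTA identifier line."""
    return [line if line.startswith(">") else bases + line for line in lines]


# Made-up records: identifier line, then a single sequence line.
records = [">seq1 sample", "AATTGCC", ">seq2 sample", "GGCCTTA"]
print("\n".join(prepend_to_sequences(records, "ACGT")))
```

If a sequence is wrapped over several lines, this would prefix every one of them, so it only fits the two-lines-per-record layout described in the question.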

Related

How to skip a line every two lines starting by skipping the first line?

Here's my code: ls -lt | sed -n 'p;n'
That code skips every other line when listing file names, but it doesn't start by skipping the first one. How can I make that happen?
Here's an example without the skip code, to make it clear:
And here's an example of when I use the skip code:
You have to invert your sed command: it should be n;p instead of p;n:
Your code:
for x in {1..20}; do echo $x ; done | sed -n 'p;n'
1
3
5
7
9
11
13
15
17
19
The version with sed inverted:
for x in {1..20}; do echo $x ; done | sed -n 'n;p'
Output:
2
4
6
8
10
12
14
16
18
20
You can use GNU sed's ~ address form, first~step:
$ seq 1 10 | sed -n '1~2p'
1
3
5
7
9
$ seq 1 10 | sed -n '2~2p'
2
4
6
8
10
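The same odd/even selection can be expressed with list slicing in Python, shown here only as a comparison with the sed addresses above. The variable names are arbitrary.

```python
lines = [str(n) for n in range(1, 11)]

odd_lines = lines[0::2]   # lines 1, 3, 5, ... like sed -n '1~2p' or 'p;n'
even_lines = lines[1::2]  # lines 2, 4, 6, ... like sed -n '2~2p' or 'n;p'

print(odd_lines)
print(even_lines)
```

Slices are 0-based while sed line numbers are 1-based, so the slice start is always one less than the sed first value.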

Replace a word every N lines with sed between a specific line interval

I have an input file:
Line 1 a
Line 2 b
Line 3 c
Line 4 d
Line 5 e
Line 6 f
Line 7 g
Line 8 h
Line 9 i
Line 10 j
Line 11 k
Line 12 l
Line 13 m
Line 14 n
Line 15 o
Line 16 p
Line 17 q
.
.
.
Within a specific line interval, say between line 3 and line 17 of the file, I want to use sed to replace the last word of every 4th line with another word.
In this case, let's say I want to put a Z in line 3 of the file, then line 7 of the file (i.e., 3+4), then line 11 of the file (i.e., 7+4), then line 15 of the file (i.e., 11+4).
Is there a way to do this with sed while opening the file I want to change only once?
The expected output would be:
Line 1 a
Line 2 b
Line 3 Z
Line 4 d
Line 5 e
Line 6 f
Line 7 Z
Line 8 h
Line 9 i
Line 10 j
Line 11 Z
Line 12 l
Line 13 m
Line 14 n
Line 15 Z
Line 16 p
Line 17 q
.
.
.
If you have GNU sed, you can use the first~step line addressing form:
sed '3,17{3~4s/\S*$/Z/}' infile
First, we limit all actions to an address range with 3,17{...}.
Then, within the curly braces, we run this:
3~4s/\S*$/Z/
"On line 3 and every 4th line after, replace the last word of the line (\S*$ – longest sequence of non-space characters) with Z".
Using POSIX sed, you can do:
sed '3,17{s/[^ ]*$/Z/;n;n;n;}'
An alternative could be awk which can be made a bit more flexible:
awk 'NR==3,NR==17{if (c++%4==0) { $NF="Z" }}1'
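For comparison, the same range-plus-step logic can be written out explicitly in Python. This is a sketch under the assumption that words are separated by single spaces, as in the sample input; the function name replace_last_word and its parameters are hypothetical.

```python
def replace_last_word(lines, start, end, step, word):
    """Replace the last word of every step-th line between start and end."""
    out = []
    for number, line in enumerate(lines, start=1):
        # Inside the range, pick lines start, start+step, start+2*step, ...
        if start <= number <= end and (number - start) % step == 0:
            head, sep, _last = line.rpartition(" ")
            line = head + sep + word
        out.append(line)
    return out


# Rebuild the sample input: "Line 1 a" through "Line 17 q".
sample = [f"Line {i} {chr(96 + i)}" for i in range(1, 18)]
print("\n".join(replace_last_word(sample, 3, 17, 4, "Z")))
```

Keeping the range test and the step test as two separate conditions mirrors the 3,17{3~4...} structure of the GNU sed command.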

kdb/q: how to reshape a list into nRows, where nRows is a variable

If I am to split a list into 2 rows, I can use:
q)2 0N#til 10
However, the following syntax does not work:
q)n:2
q)n 0N#til 10
How can I achieve such reshaping?
You need parentheses and a semicolon:
q)2 0N#til 10
0 1 2 3 4
5 6 7 8 9
q)n:2
q)(n;0N)#til 10
0 1 2 3 4
5 6 7 8 9
Here is the general syntax for reshaping a list into matrix form:
(list1)#(list2)
As you can see, both the left and the right operand of '#' are lists. Here is one example:
q)list1: (4;3) / or simply (4 3)
q)list2: til 12
q)list1#list2
We can make an integer list in 2 ways:
Using semicolons, as in list1:(2;3;4)
Using spaces, as in list1:(2 3 4)
But when you have a variable, option 2 doesn't work:
q)list1: (n 3) / where n:2
q) `type error
So for your question, the solution is to use a semicolon to create the list:
q) list1:(n;0N)
q) list1#til 10
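To make explicit what (n;0N)#list computes in the example above, here is a rough Python equivalent. It only covers the case where the list length divides evenly by the number of rows, as with til 10 and n:2, and makes no claim about how q treats other lengths; the function name reshape_rows is made up.

```python
def reshape_rows(items, n_rows):
    """Split items into n_rows rows of equal width, like (n;0N)#items in q."""
    items = list(items)
    if n_rows <= 0 or len(items) % n_rows != 0:
        # Assumption: only the evenly divisible case is handled here.
        raise ValueError("length must be a positive multiple of n_rows")
    width = len(items) // n_rows  # the column count that 0N leaves open
    return [items[r * width:(r + 1) * width] for r in range(n_rows)]


n = 2
print(reshape_rows(range(10), n))
```

The row count comes from a variable here, which is the point of the question: the column count is derived from the data rather than written out.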

print every 4 columns to one row in perl or awk

Would you please help me convert every 4 sequential rows into one tab-separated row?
convert:
A
1
2
3
3
3
4
1
to :
A 1 2 3
3 3 4 1
A simple way to do this is to use xargs:
$ xargs -n4 < file
A 1 2 3
3 3 4 1
With awk you would do:
$ awk '{printf "%s%s",$0,(NR%4?FS:RS)}' file
A 1 2 3
3 3 4 1
Another flexible approach is to use pr:
$ pr -tas' ' --columns 4 file
A 1 2 3
3 3 4 1
Both the awk and pr solutions can easily be modified to change the output separator to a TAB:
$ pr -at --columns 4 file
A 1 2 3
3 3 4 1
$ awk '{printf "%s%s",$0,(NR%4?OFS:RS)}' OFS='\t' file
A 1 2 3
3 3 4 1
$ perl -pe 's{\n$}{\t} if $. % 4' old.file > new.file
or simply (thanks to mpapec's comment):
$ perl -pe 'tr_\n_\t_ if $. % 4' old.file > new.file
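The grouping that all of these one-liners perform can also be spelled out in Python, which may help when the group size or separator needs to vary. This is a minimal sketch; the function name rows_to_columns is invented, and a trailing group shorter than the requested width is simply emitted as a shorter row.

```python
def rows_to_columns(lines, width, sep="\t"):
    """Join every `width` consecutive lines into one sep-separated row."""
    return [sep.join(lines[i:i + width]) for i in range(0, len(lines), width)]


rows = ["A", "1", "2", "3", "3", "3", "4", "1"]
for row in rows_to_columns(rows, 4):
    print(row)
```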

Find common elements in a file

The program that I would like to write has the same aim as the one in "File row confrontation". This time the file is laid out differently:
1 2
1 3
1 4
2 1
2 3
2 4
2 5
3 1
...
8 6
8 7
8 9
9 8
I want to find:
when the first element of a row appears in the second position of other rows, and whether the first element of those rows appears alongside the row under examination;
if it is found, then I want to print "I have found the link x y";
if the "link" exists, then I want to count how many "neighbours" they share, where by neighbours I mean how many elements in the second column they have in common, and print "I found z triangles".
The file is sorted.
In this case the program will start by looking for the first "couple" 1 2 reversed, and it will find it at the 4th row (2 1). Then it checks whether 3 (second row, a neighbour of 1) is also a neighbour of 2 (it is, because the row 2 3 exists), and so on. At the end it will have found that "there is the link 1 2" and that it "found 2 triangles" (1 - 2 - 3 and 1 - 2 - 4). I think the answer should not be so different from the answer in the link above, but I don't know how to handle a file laid out like this.
The first part of the problem is to find only the index of the inverted matching pairs? While reading this problem yesterday I had the feeling that grep may be of use;
#!/usr/bin/perl
use warnings;
use strict;
my @parry;
while (<DATA>){
    push @parry, [split(' ',$_)];
}
# @remind holds the indices of rows whose reversed pair also occurs
my @remind = grep {
    my $ind = $_;
    grep { # reverse @{$parry[$_]} == @{$parry[$ind]} did not appear to work.
        $parry[$_][0] == $parry[$ind][1] &&
        $parry[$_][1] == $parry[$ind][0];
    } 0..$#parry
} 0..$#parry;
# print each matching index followed by its pair
print $_,': ',@{$parry[$_]},$/ for @remind;
__END__
1 2
1 3
1 4
2 1
2 3
2 4
2 5
3 1
8 6
8 7
8 9
9 8
output is
0: 12
1: 13
3: 21
7: 31
10: 89
11: 98
from here you then want to find say for
7[0] 7[1] (3 1) with neighbour row 6 and 8 with col 2?
6[1]
7[1] (1 5) and/or
7[1] (1 6) exist in the original set (in @parry)?
8[1]
Which they do not so no triangle.
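Putting both parts of the question together, one possible approach is to build a set of second-column neighbours for each first-column value, then intersect the sets of any two values that list each other. The Python sketch below follows the question's own definition of a "triangle" (a shared second-column element); the function name find_links is made up, and the pairs are the truncated sample data used in the answer above.

```python
from collections import defaultdict


def find_links(pairs):
    """Map each reciprocal pair (a, b) to its number of shared neighbours."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)

    links = {}
    for a, b in pairs:
        # Report each reciprocal pair once, with the smaller value first.
        if a < b and a in neighbours.get(b, set()):
            links[(a, b)] = len(neighbours[a] & neighbours[b])
    return links


pairs = [(1, 2), (1, 3), (1, 4), (2, 1), (2, 3), (2, 4), (2, 5),
         (3, 1), (8, 6), (8, 7), (8, 9), (9, 8)]
for (x, y), triangles in find_links(pairs).items():
    print(f"I have found the link {x} {y}")
    print(f"I found {triangles} triangles")
```

On this sample the link 1 2 shares the neighbours 3 and 4, which matches the two triangles the question expects; the links 1 3 and 8 9 are reciprocal but share no neighbours in the truncated data.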