insert word between lines - sed

I have a PDB (Protein Data Bank) file that has thousands of lines:
REMARK 1 PDB file generated by ptraj (set 1000)
ATOM 1 O22 DDM 1 2.800 4.419 20.868 0.00 0.00
ATOM 2 H22 DDM 1 3.427 4.096 20.216 0.00 0.00
ATOM 3 C22 DDM 1 3.351 5.588 21.698 0.00 0.00
ATOM 4 H42 DDM 1 3.456 5.274 22.736 0.00 0.00
ATOM 5 C23 DDM 1 2.530 6.846 21.639 0.00 0.00
ATOM 6 H43 DDM 1 2.347 7.159 20.611 0.00 0.00
ATOM 7 O23 DDM 1 1.313 6.498 22.334 0.00 0.00
ATOM 8 H23 DDM 1 0.903 5.837 21.771 0.00 0.00
ATOM 9 C24 DDM 1 3.073 8.109 22.266 0.00 0.00
ATOM 10 H44 DDM 1 3.139 7.837 23.319 0.00 0.00
ATOM 11 O24 DDM 1 2.218 9.278 22.007 0.00 0.00
ATOM 12 H24 DDM 1 1.278 9.184 22.179 0.00 0.00
ATOM 13 C25 DDM 1 4.494 8.317 21.764 0.00 0.00
ATOM 14 H45 DDM 1 4.391 8.452 20.687 0.00 0.00
I want to insert the word "TER" every 81 lines in that file, which contains more than 20,000 lines, ignoring the first line since it is a comment.
I browsed the internet, and it seems sed can do it, but I am lost.
Can anyone guide me?
Thanks in advance.

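Try this (GNU sed; the first~step address form is a GNU extension):
sed -i -e '82~81 a\TER' file
This appends TER after line 82 (the comment line plus 81 data lines) and after every 81st line from there on.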

I'm partial to awk myself:
awk '{print} FNR>1 && (FNR-1)%81==0 {print "TER"}' file
I find this a lot easier to understand and debug than the sed equivalent. The only magic is that FNR is the line number within the current file; subtracting 1 skips the comment line when counting.
You might have to fiddle with the numbers in the condition to get it exactly the way you want it.

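The more verbose shell (bash) commands would be
{
  IFS= read -r header
  printf '%s\n' "$header"
  i=0
  # IFS= and -r keep the fixed-width PDB columns and any backslashes intact
  while IFS= read -r line; do
    printf '%s\n' "$line"
    if (( ++i == 81 )); then
      echo TER
      i=0
    fi
  done
} < infile > outfile &&
mv outfile infile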
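
For comparison, here is a minimal Python sketch of the same counting logic, in case a script is easier to adapt than a one-liner. The function name insert_ter and its parameters are made up for illustration; it assumes the first line is the only comment line and that TER belongs after each complete group of 81 data lines.

```python
def insert_ter(lines, every=81, marker="TER"):
    """Return a new list with `marker` added after each group of `every`
    lines, not counting the first (header) line."""
    if not lines:
        return []
    out = [lines[0]]  # the header line passes through untouched
    for count, line in enumerate(lines[1:], start=1):
        out.append(line)
        if count % every == 0:  # a full group of `every` data lines just ended
            out.append(marker)
    return out
```

To use it on a file, read the file with splitlines(), pass the list to insert_ter, and write the result joined with newlines to a new file before replacing the original.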

Minimal Powershell script with Process block gives system process list (MacOS, Pwsh 7.1.3)

I was writing a Powershell script using a pipeline with a Process block and it started doing something unexpected: listing all the running processes and then dumping the script contents. I kept minimizing the script to try to figure out what was going on and ended up with this:
[CmdletBinding()]
Param()
$varname = "huh"
Process
{
# nothing here
}
So it looks like this:
PS /Volumes/folder> cat ./test.ps1
[CmdletBinding()]
Param()
$varname = "huh"
Process
{
# nothing here
}
PS /Volumes/folder> pwsh ./test.ps1
NPM(K) PM(M) WS(M) CPU(s) Id SI ProcessName
------ ----- ----- ------ -- -- -----------
0 0.00 0.00 0.00 0 639
0 0.00 0.00 0.00 1 1
0 0.00 0.00 0.00 60 60
0 0.00 0.00 0.00 61 61
0 0.00 0.00 0.00 65 65
0 0.00 0.00 0.00 67 67
0 0.00 0.00 0.00 68 68
0 0.00 0.00 0.00 69 69
0 0.00 0.00 0.00 71 71
0 0.00 0.00 0.00 73 73
0 0.00 0.00 0.00 75 75
0 0.00 25.60 75.82 68475 1 Activity Monito
0 0.00 11.74 97.63 1053 1 Adobe Crash Han
0 0.00 11.76 97.62 1084 1 Adobe Crash Han
0 0.00 11.69 97.64 1392 1 Adobe Crash Han
0 0.00 112.50 83.59 973 1 Adobe Desktop S
0 0.00 11.94 97.31 986 1 AdobeCRDaemon
0 0.00 16.95 105.99 966 1 AdobeIPCBroker
0 0.00 61.52 168.92 721 1 Adobe_CCXProces
0 0.00 18.57 3.01 454 1 adprivacyd
0 0.00 16.46 23.16 700 1 AGMService
0 0.00 13.65 4.43 701 1 AirPlayUIAgent
--snip--
0 0.00 9.11 12.72 89003 …03 VTDecoderXPCSer
0 0.00 13.32 4.69 418 1 WiFiAgent
0 0.00 12.21 1.58 543 543 WiFiProxy
# nothing here
I haven't done much in PowerShell for a long time, so if this is something stupidly simple I'm going to laugh, but I couldn't find anything searching the net.
Can someone tell me what's happening?

In order to use a process block (possibly alongside a begin, end, and, in v7.3+, the clean block), there must not be any code OUTSIDE these blocks - see the conceptual about_Functions help topic.
Therefore, remove $varname = "huh" from the top-level scope of your function body (possibly move it into one of the aforementioned blocks).
As for what you tried:
By having $varname = "huh" in the top-level scope of your function body, you've effectively turned the function into one whose code runs in an implicit end block only.
process - because it is on its own line - was then interpreted as a command, which - due to the best-avoided default-verb logic - was interpreted as an argument-less call to the Get-Process cmdlet.
The output therefore included the list of all processes on your system.
The { ... } on the subsequent lines was then interpreted as a script block literal. Since that script block wasn't invoked, it was implicitly output, which results in its stringification, which is its verbatim content, excluding { and }, resulting in output of the following string:
# nothing here

What does `A(::2,3) = -1.0` do in Fortran?

I have a matrix A declared as real :: A(7,8) and initialised so that all entries are 0.0.
The following statement does not produce any compile errors.
A(::2,3) = -1.0
I realise that the columns affected will be only column 3.
What about the rows? Does ::2 mean rows 1 and 2? Or something else?
I printed out the matrix, but couldn't understand the pattern produced.
Here (for completeness):
do, i=1,7
write(*, "(f5.2)") ( A(i,j), j=1,8 )
enddo
0.00 i = 1
0.00
-1.00
0.00
0.00
0.00
0.00
0.00 ----
0.00 i = 2
0.00
0.00
0.00
0.00
0.00
0.00
0.00 ----
0.00 i = 3
0.00
-1.00
0.00
0.00
0.00
0.00
0.00 ----
0.00 i = 4
0.00
0.00
0.00
0.00
0.00
0.00
0.00 ----
0.00 i = 5
0.00
-1.00
0.00
0.00
0.00
0.00
0.00 ----
0.00 i = 6
0.00
0.00
0.00
0.00
0.00
0.00
0.00 ----
0.00 i = 7
0.00
-1.00
0.00
0.00
0.00
0.00
0.00
Looking at it now, it looks like it starts at i=1 and adds 2 to i until it reaches the bounds of the matrix. Is this correct?
Does this mean that ::2 is equivalent to 1:7:2 ("from 1 to 7 with a step of 2")?

Looking at the documentation, we see:
print array-expression [first-expression : last-expression : stride-expression]
where:
array-expression: Expression that should evaluate to an array type.
first-expression: First element in a range, also first element to be printed. Defaults to lower bound.
last-expression: Last element in a range, but might not be the last element to be printed if stride is not equal to 1. Defaults to upper bound.
stride-expression: Length of the stride. Defaults to 1.
So if first-expression and last-expression are omitted, they default to lower bound and upper bound respectively.

Yes, that's right: it's the same as 1:7:2. As you can see from the output, it sets the element in column 3 to -1.0 in every second row (rows 1, 3, 5 and 7).
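
To make the defaulting rule concrete, here is a small Python sketch that lists which indices a section like first:last:stride selects along one dimension. The helper section_indices is hypothetical and only models the rule described above (omitted first and last fall back to the declared lower and upper bounds); it is not part of any Fortran tooling.

```python
def section_indices(lower_bound, upper_bound, first=None, last=None, stride=1):
    """Indices picked by a Fortran-style section first:last:stride along one
    dimension whose declared bounds are lower_bound..upper_bound (inclusive)."""
    if stride == 0:
        raise ValueError("stride must be non-zero")
    start = lower_bound if first is None else first
    stop = upper_bound if last is None else last
    # range() excludes its end point, so step one past `stop` in the
    # direction of travel to make the bound inclusive.
    end = stop + 1 if stride > 0 else stop - 1
    return list(range(start, end, stride))


# Model A(7,8) as a list of rows, then apply A(::2,3) = -1.0.
A = [[0.0] * 8 for _ in range(7)]
rows = section_indices(1, 7, stride=2)
for i in rows:
    A[i - 1][3 - 1] = -1.0  # convert 1-based Fortran indices to 0-based
```

Here rows comes out as [1, 3, 5, 7], which matches the rows holding -1.00 in the printed output.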

Can I use awk to specify in which column to place string?

I have a tab separated text file called test.txt with multiple columns that I'm trying to make identical to another file called output.txt.
The test.txt looks as follows
t m sx sy sz rx ry rz
49.07 0 -1.00 0.00 -0.11 20.00 0.00 -2.18
49.47 0 -1.00 0.00 -0.11 22.00 0.00 -2.33
50.89 0 -1.00 0.00 -0.11 34.00 0.00 -3.21
.
:
42.06 0 29.00 0.00 -2.86 12.00 0.00 -1.44
The problem is that, no matter what type of delimiter I use, the result still does not have the same form as the desired output file, output.txt.
In output.txt, each of these columns starts at a specific character position, so
t m sx sy sz rx ry rz
Ln1,col1 Ln1,col9 Ln1,col17 Ln1,col25 Ln1,col33 Ln1,col41 Ln1,col49 Ln1,col57
I'm a bit new to awk, sed and most other Unix commands. Any suggestions?

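Have you tried something like this?
awk -F'\t' 'BEGIN{print "t\tm\tsx\tsy\tsz\trx\try\trz"}{print $1"\t"$9"\t"$17"\t"$25"\t"$33"\t"$41"\t"$49"\t"$57}' test.txt
You print the header at the beginning and then, for each line, the desired fields. Note that $9, $17, ... are field numbers rather than character positions, so this only applies if the input really has that many tab-separated fields.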

An example using printf, padding each field to 8 characters so that the columns start at positions 1, 9, 17, ...:
awk '{for (i=1;i<=NF;i++) printf("%-8s", $i); print ""}' test.txt
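
The same padding idea can be sketched in Python, which may be easier to tweak if some columns need different widths later. The function name to_fixed_width is invented for this example; it assumes tab-separated input, left-aligned fields, and that every field is shorter than the column width (a longer field would push the following columns to the right).

```python
def to_fixed_width(line, width=8):
    """Pad each tab-separated field to `width` characters so that fields
    start at character positions 1, 1 + width, 1 + 2 * width, ..."""
    fields = line.rstrip("\n").split("\t")
    padded = "".join(f"{field:<{width}}" for field in fields)
    return padded.rstrip()  # drop the padding after the last field
```

Applying it to every line of test.txt and writing the results out line by line gives a file whose columns begin at positions 1, 9, 17 and so on.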

match a pattern and print subsequent lines

There are 200 files named File1_0.pdb, File1_60.pdb, etc. Each looks like:
ATOM 1 N VAL 1 8.897 -21.545 -7.276 1.00 0.00
ATOM 2 H1 VAL 1 9.692 -22.015 -6.868 1.00 0.00
ATOM 3 H2 VAL 1 9.228 -20.766 -7.827 1.00 0.00
ATOM 4 H3 VAL 1 8.289 -22.236 -7.693 1.00 0.00
TER
ATOM 5 CA VAL 1 8.124 -20.953 -6.203 1.00 0.00
ATOM 6 HA VAL 1 8.072 -19.874 -6.345 1.00 0.00
ATOM 7 CB VAL 1 6.693 -21.515 -6.176 1.00 0.00
ATOM 8 HB VAL 1 6.522 -22.024 -5.227 1.00 0.00
ATOM 9 CG1 VAL 1 5.684 -20.370 -6.330 1.00 0.00
ATOM 10 1HG1 VAL 1 5.854 -19.861 -7.279 1.00 0.00
I have to extract the part after TER and put it in a different file, and this has to be done for all 200 files. I did something like sed '1,/TER/d' File1_0.pdb > 1_0.pdb, but this works on one file at a time. Is there a solution for all 200 files in one go? The output file has the same name, only with "File" removed.

for i in File*.pdb; do sed '1,/TER/d' "$i" > "${i#File}"; done

This might work:
seq 0 200 | xargs -I{} cp File1_{}.pdb 1_{}.pdb # copy the files under the new names
sed -si '1,/TER/d' 1_{0..200}.pdb # edit each file separately, in place (GNU sed)
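
If a script is preferable to a shell loop, the batch job could look roughly like this in Python. The names lines_after_marker and strip_all are invented for this sketch; it assumes the files sit in one directory, are named File*.pdb, and that the search for TER starts at the second line, as it does with sed '1,/TER/d'.

```python
from pathlib import Path


def lines_after_marker(lines, marker="TER"):
    """Return the lines that follow the first line containing `marker`,
    searching from the second line onward."""
    for index in range(1, len(lines)):
        if marker in lines[index]:
            return lines[index + 1:]
    return []  # no marker found: nothing is kept, as with the sed range


def strip_all(directory=".", prefix="File"):
    """Write the part after TER of each <prefix>*.pdb file to a new file
    whose name has the prefix removed."""
    for source in sorted(Path(directory).glob(f"{prefix}*.pdb")):
        target = source.with_name(source.name[len(prefix):])
        kept = lines_after_marker(source.read_text().splitlines(keepends=True))
        target.write_text("".join(kept))
```

Calling strip_all() in the directory that holds the files leaves the originals untouched and writes 1_0.pdb, 1_60.pdb and so on next to them.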

How can I profile template performance in Template::Toolkit?

What's the best method for benchmarking the performance of my various templates when using Template::Toolkit?
I want something that will break down how much cpu/system time is spent processing each block or template file, exclusive of the time spent processing other templates within. Devel::DProf, for example, is useless for this, since it simply tells me how much time is spent in the various internal methods of the Template module.

It turns out that Googling for template::toolkit profiling yields the best result, an article from November 2005 by Randal Schwartz. I can't copy and paste any of the article here due to copyright, but suffice it to say that you simply get his source and load it as a module after Template, like so:
use Template;
use My::Template::Context;
And you'll get output like this to STDERR when your script runs:
-- info.html at Thu Nov 13 09:33:26 2008:
cnt clk user sys cuser csys template
1 0 0.06 0.00 0.00 0.00 actions.html
1 0 0.00 0.00 0.00 0.00 banner.html
1 0 0.00 0.00 0.00 0.00 common_javascript.html
1 0 0.01 0.00 0.00 0.00 datetime.html
1 0 0.01 0.00 0.00 0.00 diag.html
3 0 0.02 0.00 0.00 0.00 field_table
1 0 0.00 0.00 0.00 0.00 header.html
1 0 0.01 0.00 0.00 0.00 info.html
1 0 0.01 0.01 0.00 0.00 my_checklists.html
1 0 0.00 0.00 0.00 0.00 my_javascript.html
1 0 0.00 0.00 0.00 0.00 qualifier.html
52 0 0.30 0.00 0.00 0.00 referral_options
1 0 0.01 0.00 0.00 0.00 relationship_block
1 0 0.00 0.00 0.00 0.00 set_bgcolor.html
1 0 0.00 0.00 0.00 0.00 shared_javascript.html
2 0 0.00 0.00 0.00 0.00 table_block
1 0 0.03 0.00 0.00 0.00 ticket.html
1 0 0.08 0.00 0.00 0.00 ticket_actions.html
-- end
Note that blocks as well as separate files are listed.
This is, IMHO, much more useful than the CPAN module Template::Timer.