text processing: awk, sed: change a number followed by a character - sed

I have a dot file consisting of 100+ nodes, such as
n12 -> n23
n14 -> n35
I want to increase the number by 1 in the node label, if the number after 'n' is greater than 20.
So the above two lines would become:
n12 -> n24
n14 -> n36
What is the nice way to do it, using awk, sed, or anything else?
(I can not use 'cut' to delete 'n' and compare the number, because that would delete some attributes with 'n' as well.)
Thanks!

Perl solution:
perl -pe 's/([0-9]+)/$1 > 20 ? $1 + 1 : $1/ge' INPUT_FILE
To change the input file in place, add the -i~ option.
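Note that the pattern above matches every run of digits on the line, so a number over 20 inside an attribute would be incremented as well. A minimal awk sketch that only touches digits directly after an n could look like this. The function name bump_nodes is made up for illustration, and it assumes node names are exactly n plus digits (an n followed by digits inside a longer word would match too, and leading zeros are dropped).

```shell
# Hypothetical helper: increment N in every "nN" token when N > 20.
bump_nodes() {
  awk '{
    out = ""; rest = $0
    # Walk the line left to right, one n<digits> match at a time.
    while (match(rest, /n[0-9]+/)) {
      num = substr(rest, RSTART + 1, RLENGTH - 1) + 0
      if (num > 20) num++
      out = out substr(rest, 1, RSTART - 1) "n" num
      rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
  }' "$@"
}

# The attribute value 30 is left alone because no n precedes it.
printf '%s\n' 'n12 -> n23' 'n14 -> n35' 'n25 [width=30]' | bump_nodes
```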

Related

Sed to replace last character on condition

I have a file which has following lines
172XI207 X123955 1
412XE401 XE05689 1
412XI402 XI9515 1
412XI403 XI06702 1
412XE404 XE75348 1
I want to replace last column to 2 if the first two characters in the second column matches to XE.
The result should be like below
172XI207 X123955 1
412XE401 XE05689 2
412XI402 XI9515 1
412XI403 XI06702 1
412XE404 XE75348 2
I wanted to use sed (not awk). Can someone please let me know how this can be achieved using sed?
Many sed commands take an address or address range (see the man page for the gory details). The most common command is probably s, and it is among those that take an address, meaning it doesn't need to apply to every line. An address can be a regular expression. The s command is:
{address}s/pattern/replacement/
For you the address (the matching RE) is / XE/ (assuming your columns are space separated; change that to a tab if necessary), the pattern is 1$ and the replacement is 2. Therefore:
/ XE/s/1$/2/
or as a command line
sed -e '/ XE/s/1$/2/' < oldfile > newfile
EDIT: oops, second column, not start of line.
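To see the address at work, the sketch below runs that command on the sample data from the question. The names mark_xe and sample are only for illustration, and single spaces between columns are assumed.

```shell
# Hypothetical wrapper around the addressed substitution.
mark_xe() {
  # Only lines containing " XE" reach the s command;
  # on those lines a trailing 1 becomes 2.
  sed -e '/ XE/s/1$/2/' "$@"
}

sample='172XI207 X123955 1
412XE401 XE05689 1
412XI402 XI9515 1
412XI403 XI06702 1
412XE404 XE75348 1'

printf '%s\n' "$sample" | mark_xe
```

Because 1$ matches any line that ends in 1, a last column of 11 would become 12. Including the separator in the pattern (s/ 1$/ 2/) would be stricter.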
This command should do the trick (providing you are looking at myfile.txt)
sed -e '/ XE/ s/1$/2/' myfile.txt
You can add the -i option to modify the file in place; make sure the output is exactly what you expect before doing so.
Edit: based on a question in the comments, here is a command that matches on the 3rd column and replaces in the fifth.
sed -e 's/^\(\(\w\+\W\+\)\{2\}XE\(\w\+\W\+\)\{2\}\)1/\12/'
Or, as an alternative, you can first select the line and then substitute:
sed -e '/^\(\w\+\W\+\)\{2\}XE/ s/^\(\(\w\+\W\+\)\{4\}\)1/\12/'
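\w, \W and \+ are GNU extensions. As a rough portable sketch, the same select-then-substitute idea can be written with plain POSIX basic regular expressions, assuming columns are separated by exactly one space and the fifth column is the last one. The function name fix_fifth and the sample rows are invented for illustration.

```shell
# Hypothetical helper: set column 5 to 2 when column 3 starts with XE.
fix_fifth() {
  # Address: two space-terminated columns, then XE.
  # Substitution: keep four columns (group 1), turn a final 1 into 2.
  sed -e '/^\([^ ]* \)\{2\}XE/ s/^\(\([^ ]* \)\{4\}\)1$/\12/' "$@"
}

# Invented sample rows; only rows 1 and 3 have XE in the third column.
printf '%s\n' 'a1 b1 XE01 d1 1' 'a2 b2 XI02 d2 1' 'a3 b3 XE03 d3 1' | fix_fifth
```

In the replacement, \12 is backreference \1 followed by a literal 2, since sed backreferences are single digits.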

Limiting the sed search to the 2nd column in a file

Below is the content of my file (sample.txt):
CQUAD4 5600000 560005 5602371 5602367 5602374 5602372 0. -1.75
CQUAD4 5600003 560005 5600000 5602367 5602374 5602372 0. -1.75
I am using the below command:
sed -i "s#\(\s*\w*\s*\)\(5600000\)\(\s*\)\([0-9]*\)\(.*\)#\1\2\36000 \5#g" sample.txt
I want to restrict the pattern matching 5600000 to only second column and then do a replace with '6000 '.
Can somebody help me, please?
Here's a possible solution with GNU sed. Anchor the search to start of line with ^.
sed -i -r "s#^(\s*\S+\s+)5600000\s+#\16000 #" sample.txt
awk might be a little more natural for this:
awk '$2=="5600000"{$2="6000"} 1' sample.txt
That basically says "if the second field is 5600000, replace it with 6000"; the trailing 1 then prints every line, changed or not.
The one downside I see is that assigning to a field makes awk rebuild that line with single spaces between the fields, so it will collapse multiple spaces down to one on the lines it changes, which may mess with the alignment of your columns. You'll have to decide if that's a problem or not...
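The collapsing happens because assigning to a field makes awk rebuild the line with the output field separator. A sketch that leaves the spacing alone by editing $0 directly is shown below. The function name replace_col2 and the variables old and new are hypothetical, and columns are assumed to be separated by spaces or tabs.

```shell
# Hypothetical helper: replace an exact value in column 2, keeping whitespace.
replace_col2() {
  # $1 = value to look for in column 2, $2 = replacement; then files or stdin
  old=$1; new=$2; shift 2
  awk -v old="$old" -v new="$new" '
    ($2 "") == (old "") {          # force a string comparison
      # Length of leading blanks + first column + the blanks after it.
      match($0, /^[ \t]*[^ \t]+[ \t]+/)
      $0 = substr($0, 1, RLENGTH) new substr($0, RLENGTH + length(old) + 1)
    }
    { print }' "$@"
}

# Only the second column of the first row changes; spacing is untouched.
printf '%s\n' 'CQUAD4   5600000   560005 5600000' 'CQUAD4   5600003   5600000' |
  replace_col2 5600000 6000
```

Since the replacement is shorter than the original value, later columns shift left by three characters; padding new to the old width would keep them aligned.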

How to use 'sed or gawk' to delete a text block until the third line previous the last one

Good day,
I was wondering how to delete a text block like this:
1
2
3
4
5
6
7
8
and delete everything after the second line up to the third line before the last one, to obtain:
1
2
6
7
8
Thanks in advance!!!
BTW, this text block is just an example; the real text blocks I am working on are huge, and each one has a different number of lines.
Getting the number of lines with wc and using awk to print the requested range:
$ awk 'NR<M || NR>N-M' M=3 N="$(wc -l < file)" file
1
2
6
7
8
This allows you to easily change the range by just changing the value of M.
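A variant that avoids the separate wc call reads the file twice inside awk: the first pass counts the lines, the second prints both ends. This is only a sketch and not part of the answer above; the name trim_middle and the variables h and t are illustrative, and it needs a regular file because a stream cannot be read twice.

```shell
# Hypothetical helper: keep the first h and the last t lines of a file.
trim_middle() {
  # $1 = file, $2 = leading lines to keep, $3 = trailing lines to keep
  awk -v h="$2" -v t="$3" '
    NR == FNR { total = FNR; next }   # first pass: remember the line count
    FNR <= h || FNR > total - t       # second pass: print both ends
  ' "$1" "$1"
}

tmpfile=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 7 8 > "$tmpfile"
trim_middle "$tmpfile" 2 3
rm -f "$tmpfile"
```

Here h and t are counts of lines to keep, which differs slightly from M above, where NR<M keeps M-1 leading lines.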
This might work for you (GNU sed):
sed '3,${:a;$!{N;s/\n/&/3;Ta;D}}' file
or if you prefer:
sed '1,2b;:a;$!{N;s/\n/&/3;Ta;D}' file
These always print the first two lines, then build a running window of three lines.
Unless the end of file is reached the first line is popped off the window and deleted. At the end of file the remaining 3 lines are printed.
Since you mentioned that the files are huge and that their line counts differ, I would suggest this awk one-liner:
awk 'NR<3{print;next}{delete a[NR-3];a[NR]=$0}END{for(x=NR-2;x<=NR;x++)print a[x]}' file
It processes the input file only once, without (pre)calculating the total number of lines.
It keeps minimal data in memory: at any time only 3 lines are stored.
If you want to change the filtering criteria, for example removing from line x to $-y, you simply change the offsets in the one-liner.
add a test:
kent$ seq 8|awk 'NR<3{print;next}{delete a[NR-3];a[NR]=$0}END{for(x=NR-2;x<=NR;x++)print a[x]}'
1
2
6
7
8
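To make "change the offsets" concrete, here is a parameterised sketch of the same single-pass idea, using a ring buffer indexed by NR modulo the number of trailing lines. The function name keep_ends and the variables h and t are hypothetical, and t is assumed to be at least 1.

```shell
# Hypothetical helper: keep the first h and the last t lines of a stream.
keep_ends() {
  # $1 = leading lines to keep, $2 = trailing lines to keep (>= 1)
  h=$1; t=$2; shift 2
  awk -v h="$h" -v t="$t" '
    NR <= h { print; next }
    { buf[NR % t] = $0 }            # ring buffer holding the last t lines
    END {
      start = NR - t + 1
      if (start <= h) start = h + 1 # short input: do not repeat head lines
      for (i = start; i <= NR; i++) print buf[i % t]
    }' "$@"
}

printf '%s\n' 1 2 3 4 5 6 7 8 | keep_ends 2 3
```

Overwriting the slot NR % t replaces the explicit delete in the one-liner, so the buffer never holds more than t lines.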
Using sed:
sed -n '
## Append second line, print first two lines and delete them.
N;
p;
s/^.*$//;
## Read next three lines removing leading newline character inserted
## by the "N" command.
N;
s/^\n//;
N;
:a;
N;
## I will keep three lines in buffer until last line when I will print
## them and exit.
$ { p; q };
## Not last line yet, so remove one line of buffer based in FIFO algorithm.
s/^[^\n]*\n//;
## Goto label "a".
ba
' infile
It yields:
1
2
6
7
8

Finding lines which are greater than 120 characters length using sed

I want to get a list of the lines in a batch file that are longer than 120 characters. For this I thought of using sed. I tried but was not successful. How can I achieve this?
Is there any other way to get such a list, other than using sed?
Thanks..
Another way to do this using awk:
awk 'length($0) > 120' file
You can use grep and its repetition quantifier:
grep '.\{121\}' script.sh
Using sed, you have some alternatives:
sed -e '/.\{121\}/!d'
sed -e '/^.\{0,120\}$/d'
sed -ne '/.\{121\}/p'
The first option matches lines that don't have at least 121 characters (the ! after the expression executes the command on lines that don't match the pattern before it), and deletes them (i.e. doesn't print them).
The second option matches lines that from start (^) to end ($) have a total of zero to 120 characters. These lines are also deleted.
The third option uses the -n flag, which tells sed not to print lines by default, and only print something if we tell it to. In this case, we match lines that have at least 121 characters, i.e. more than 120, and use p to print them.
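If the line numbers are wanted in the list as well, awk can report them together with the length. The following is a sketch; the function name long_lines and the output format are made up for illustration. Depending on the awk implementation and locale, length may count bytes rather than characters.

```shell
# Hypothetical helper: report lines longer than a given limit.
long_lines() {
  # $1 = maximum allowed length; then files or stdin
  max=$1; shift
  awk -v max="$max" 'length($0) > max { printf "%d: %d chars\n", FNR, length($0) }' "$@"
}

# A 120-character line followed by a 121-character one; only the second is reported.
{ printf '%0120d\n' 0; printf '%0121d\n' 0; } | long_lines 120
```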

sed scripting : how to search after n no. of lines?

How can I capture the first occurrence of a pattern after 'n' lines in a large file, using grep?
For instance,
I have 1000 lines of code in which 'wire' occurs before and after the 451st line.
I want to grab the first occurrence of wire after the 451st line.
You can use sed's range expressions to perform this task easily. For example:
sed -n '452,$ { /wire/ {p;q} }' /tmp/foo
This will skip the first 451 lines, then scan each line until EOF for "wire." When found, it will print the pattern space and then quit.
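The same skip-then-stop logic can be sketched in awk with the line count and pattern passed in as parameters. The function name first_after and the variables skip and pat are hypothetical, and the pattern is treated as an extended regular expression.

```shell
# Hypothetical helper: print the first match after a number of lines.
first_after() {
  # $1 = number of lines to skip, $2 = pattern; then files or stdin
  skip=$1; pat=$2; shift 2
  # Print the first matching line past the skipped ones, then stop reading.
  awk -v skip="$skip" -v pat="$pat" 'NR > skip && $0 ~ pat { print; exit }' "$@"
}

# The match on line 1 is ignored; the first match after line 2 is printed.
printf '%s\n' 'wire a' 'reg b' 'wire c' 'wire d' | first_after 2 wire
```

As with q in the sed version, exit stops processing at the first hit, so the rest of a large file is never read.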