Limiting the sed search to the 2nd column in a file - sed

Below is the content of my file (sample.txt):
CQUAD4 5600000 560005 5602371 5602367 5602374 5602372 0. -1.75
CQUAD4 5600003 560005 5600000 5602367 5602374 5602372 0. -1.75
I am using the below command:
sed -i "s#\(\s*\w*\s*\)\(5600000\)\(\s*\)\([0-9]*\)\(.*\)#\1\2\36000 \5#g" sample.txt
I want to restrict the match on 5600000 to the second column only, and then replace it with '6000 '.
Can somebody help me, please?

Here's a possible solution with GNU sed. Anchor the search to start of line with ^.
sed -i -r "s#^(\s*\S+\s+)5600000\s+#\16000 #" sample.txt
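A quick way to check the anchoring is to run the same substitution on the two sample lines (shortened here) through a pipe instead of editing the file in place. This is only a sketch: it assumes GNU sed, since \s and \S are GNU extensions, and the variable names are made up for the demo.

```shell
# Two shortened sample lines: 5600000 is in column 2 of the first line
# and in column 4 of the second.
input='CQUAD4 5600000 560005 5602371
CQUAD4 5600003 560005 5600000'

# Same substitution as above, without -i, reading from a pipe.
result=$(printf '%s\n' "$input" | sed -E 's#^(\s*\S+\s+)5600000\s+#\16000 #')

printf '%s\n' "$result"
```

Only the first line should change; the 5600000 in the fourth column of the second line is left alone because the pattern is anchored at the start of the line.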

awk might be a little more natural for this:
awk '$2=="5600000"{$2="6000"} 1' sample.txt
That basically says "if the second field is 5600000, replace it with 6000; then print every line, changed or not".
The one downside is that assigning to a field makes awk rebuild that line with single spaces between the fields, which collapses runs of spaces and may mess with the alignment of your columns. You'll have to decide if that's a problem or not...
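To see the spacing issue concretely, here is a small sketch on a made-up line with three spaces between columns. The awk version collapses the spacing. The sed version is one possible way to keep it, by capturing the whitespace around the field and putting it back; it assumes the second column is followed by at least one whitespace character.

```shell
line='CQUAD4   5600000   560005'

# awk: assigning to $2 rebuilds the record with single spaces (OFS).
collapsed=$(printf '%s\n' "$line" | awk '$2=="5600000"{$2="6000"} 1')

# sed: capture the whitespace on both sides and put it back unchanged.
preserved=$(printf '%s\n' "$line" |
  sed -E 's/^([[:space:]]*[^[:space:]]+[[:space:]]+)5600000([[:space:]])/\16000\2/')

printf '[%s]\n[%s]\n' "$collapsed" "$preserved"
```

The brackets in the output make the remaining spaces easy to count.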

Related

GREP Print Blank Lines For Non-Matches

I want to extract strings between two patterns with GREP, but when no match is found, I would like to print a blank line instead.
Input
This is very new
This is quite old
This is not so new
Desired Output
is very
is not so
I've attempted:
grep -o -P '(?<=This).*?(?=new)'
But this does not print a blank line for the second input line in the above example. I have searched for over an hour and tried a few things, but nothing has worked.
I will happily use a solution in sed if that's easier!
You can use
#!/bin/bash
s='This is very new
This is quite old
This is not so new'
sed -En 's/.*This(.*)new.*|.*/\1/p' <<< "$s"
See the online demo yielding
is very
is not so
Details:
-E - enables POSIX ERE regex syntax
-n - suppresses default line output
s/.*This(.*)new.*|.*/\1/ - matches any text, This, any text (captured into Group 1, \1), new, and then any text again, or else the whole string (in sed, the line), and replaces the match with the Group 1 value.
p - prints the result of the substitution.
And this is what you need for your actual data:
sed -En 's/.*"user_ip":"([^"]*).*|.*/\1/p'
See this online demo. The [^"]* matches zero or more chars other than a " char.
With your shown samples, please try following awk code.
awk -F'This\\s+|\\s+new' 'NF==3{print $2;next} NF!=3{print ""}' Input_file
OR
awk -F'This\\s+|\\s+new' 'NF==3{print $2;next} {print ""}' Input_file
Explanation: This sets This\\s+ or \\s+new as the field separator for every line of Input_file. The main program then checks whether NF (the number of fields) is 3; if so, it prints the 2nd field, and next skips to the next line. Otherwise it simply prints a blank line. Note that \s is a GNU awk extension, so this needs gawk.
sed:
sed -E '
/This.*new/! s/.*//
s/.*This(.*)new.*/\1/
' file
first line: for lines not matching "This.*new", remove all characters, leaving a blank line
second line: for lines matching the pattern, keep only the "middle" text
Note that this is not the PCRE non-greedy match: the line
This is new but that is not new
will produce the output
is new but that is not
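The two-command script and its greedy behaviour can be checked with a short sketch. The input is the question's three lines plus the greedy example; the bracket-wrapping at the end is only there to make blank lines and surrounding spaces visible, and the variable names are invented for the demo.

```shell
input='This is very new
This is quite old
This is not so new
This is new but that is not new'

result=$(printf '%s\n' "$input" | sed -E '
/This.*new/! s/.*//
s/.*This(.*)new.*/\1/
')

# Brackets make the blank line and the surrounding spaces visible.
printf '%s\n' "$result" | sed 's/.*/[&]/'
```

Note that the captured text keeps the spaces on either side of the match, which the bare output in the question hides.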
To continue to use PCRE, use perl:
perl -lpe '$_ = /This(.*?)new/ ? $1 : ""' file
This might work for you:
sed -E 's/.*This(.*)new.*|.*/\1/' file
If the first match is made, the line is replaced by everything between This and new.
Otherwise the second match will remove everything.
N.B. The substitution will always match one of the conditions. The solution was suggested by Wiktor Stribiżew.

Select specific items from a file using sed

I'm very much a junior when it comes to the sed command, and my Bruce Barnett guide sits right next to me, but one thing has been troubling me. With a file, can you filter it using sed to select only specific items? For example, in the following file:
alpha|november
bravo|october
charlie|papa
alpha|quebec
bravo|romeo
charlie|sahara
Would it be possible to set a command to return only the bravos, like:
bravo|october
bravo|romeo
With sed:
sed '/^bravo|/!d' filename
Alternatively, with grep (because it's sort of made for this stuff):
grep '^bravo|' filename
or with awk, which works nicely for tabular data,
awk -F '|' '$1 == "bravo"' filename
The first two use a regular expression, selecting those lines that match it. In ^bravo|, ^ matches the beginning of the line and bravo| the literal string bravo|, so this selects all lines that begin with bravo|.
The awk way splits the line across the field separator | and selects those lines whose first field is bravo.
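As a sanity check, the three commands can be run on the sample data from the question and compared. This is just a sketch that feeds the data through a pipe instead of a file; the variable names are made up.

```shell
input='alpha|november
bravo|october
charlie|papa
alpha|quebec
bravo|romeo
charlie|sahara'

with_sed=$(printf '%s\n' "$input" | sed '/^bravo|/!d')
with_grep=$(printf '%s\n' "$input" | grep '^bravo|')
with_awk=$(printf '%s\n' "$input" | awk -F '|' '$1 == "bravo"')

printf '%s\n' "$with_sed"
```

All three variables should hold the same two bravo lines.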
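You could also use a regex with awk (the | must be escaped here, because awk uses extended regular expressions, where a bare | means alternation):
awk '/^bravo\|/' filename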
...but I don't think this plays to awk's strengths in this case.
Another solution with sed:
sed -n '/^bravo|/p' filename
-n option => no printing by default.
If line begins with bravo|, print it (p)
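There are (at least) two ways with sed.
Removing unwanted lines:
sed '/^bravo|/ !d' YourFile
Printing only wanted lines:
sed -n '/^bravo|/ p' YourFile
If there are no other constraints or actions, both are equivalent, and grep is the better tool.
If further commands follow, the behavior differs: d ends the cycle and moves directly to the next line, while p prints the line and then continues with the following commands.
Note that the pipe must not be escaped here: in the basic regular expressions sed uses by default, | is a literal character, and GNU sed treats \| as alternation. With -E (extended regular expressions) it is the other way round, and a literal pipe is written \|.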

Using sed to swap columns X and X+1 inline in delimited file

I have a file with multiple lines and for line 2 to the end of the file I want to swap fields 8 and 9. The file is comma separated and I'd like to do the swap inline so I can run it on a batch of files using * wildcard. If this can be accomplished similarly with awk then that works for me too.
example:
header1,header2,header3,...,header8,header9,...,headerN
field1.1,...,field1.9,field1.8,...,field1.N
field2.1,...,field2.9,field2.8,...,field2.N
field3.1,...,field3.9,field3.8,...,field3.N
...
I think the command would look similar to sed -r -i '2,$s/^(([^,]*,){8})([^,]*,)([^,]*,)(.*)/\1\3\2\4/' temp*.log,
but \2 is not what I expect, it is the 7th field. I know that \2 will not be the 8th field because I have double parentheses there, but I'm not sure how to fix it. Could somebody please explain what this expression is doing, and specifically what [^,] is doing and how the {8} is applied?
Thanks in advance.
In awk, you might use:
awk -F',' 'BEGIN {OFS=","} {t = $8; $8 = $9; $9 = t; print}'
In sed, the command is more convoluted, but it could be done.
sed -e 's/^\(\([^,]*,\)\{7\}\)\([^,]*,\)\([^,]*,\)/\1\4\3/'
Add the -i .bak option if your version of sed (e.g. GNU or BSD) supports it.
This uses the universally available sed regexes (it would work on even archaic versions of sed). You could lose most of the backslashes if you used 'extended regular expressions' instead:
sed -r -i 's/^(([^,]*,){7})([^,]*,)([^,]*,)(.*)/\1\4\3\5/'
Note the nested remembered (captured) patterns. The outer set is \1, the inner set would be \2 but that gets repeated 7 times, so you'd have the seventh field as \2. Anyway, that's why the eighth and ninth columns are switched with \4 and \3. \5 are the remaining columns.
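The group numbering can be tried out on a made-up ten-column file. This is a sketch under a few assumptions: GNU or BSD sed with -E, a 2,$ address to skip the header as the question asks, and a (.*) group at the end so that \5 exists. The awk line is an equivalent for comparison, with NR > 1 added to skip the header.

```shell
input='h1,h2,h3,h4,h5,h6,h7,h8,h9,h10
a1,a2,a3,a4,a5,a6,a7,a8,a9,a10'

# \1 = first seven fields, \2 = last repetition inside \1 (unused),
# \3 = field 8, \4 = field 9, \5 = the rest of the line.
swapped=$(printf '%s\n' "$input" |
  sed -E '2,$s/^(([^,]*,){7})([^,]*,)([^,]*,)(.*)/\1\4\3\5/')

# awk equivalent, also skipping the header line.
swapped_awk=$(printf '%s\n' "$input" |
  awk -F',' -v OFS=',' 'NR > 1 {t = $8; $8 = $9; $9 = t} 1')

printf '%s\n' "$swapped"
```

The header line stays as it is, and a8 and a9 trade places on the data line.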
(I note in passing that it would have been helpful to have some sample data in sufficiently the correct format to test with. It was a nuisance having to edit what is shown in the question to be able to test the code.)
If you need to do much CSV work, then either use Perl and its CSV modules (Text::CSV and Text::CSV_XS) or Python and its CSV module, or get CSVfix.
\2 is the second group in the RE.
Groups are numbered by the position of their opening parenthesis (.
So in
'2,$s/^(([^,]*,){8})([^,]*,)([^,]*,)(.*)/\1\3\2\4/'
you can see (following the alignment):
\1 = (([^,]*,){8})
\2 = ([^,]*,) (the inner group, nested inside \1)
\3 = ([^,]*,)
\4 = ([^,]*,)
and finally \5 = (.*)
In this specific case, \2 holds the last of the eight matches ({8}).
it seems that awk is the right tool:
awk -F',' -v OFS=',' '{t=$8;$8=$9;$9=t}7' file
This might work for you (GNU sed):
sed -ri '1!s/(,[^,]*)(,[^,]*)/\2\1/4' file
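This swaps the 8th and 9th fields: each match consumes two comma-prefixed fields (fields 2-3, 4-5, 6-7, 8-9), so the 4th match (8 / 2 = 4) is the pair to swap. If you wanted to swap the 7th with the 8th, prepend a comma first so that the pairs start at field 1: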
sed -ri '1!{s/^/,/;s/(,[^,]*)(,[^,]*)/\2\1/4;s/^,//}' file
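The pairing arithmetic is easier to follow on a single made-up line. The sketch below drops the 1! address and the -i flag, uses -E instead of -r, and splits the second command into separate -e expressions; those are choices for the demo, not part of the answer above.

```shell
line='a1,a2,a3,a4,a5,a6,a7,a8,a9,a10'

# Pairs start at field 2: (2,3) (4,5) (6,7) (8,9); the 4th pair is swapped.
swap_8_9=$(printf '%s\n' "$line" | sed -E 's/(,[^,]*)(,[^,]*)/\2\1/4')

# A temporary leading comma shifts the pairs to (1,2) (3,4) (5,6) (7,8).
swap_7_8=$(printf '%s\n' "$line" |
  sed -E -e 's/^/,/' -e 's/(,[^,]*)(,[^,]*)/\2\1/4' -e 's/^,//')

printf '%s\n%s\n' "$swap_8_9" "$swap_7_8"
```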

Matching strings even if they start with white spaces in SED

I'm having issues matching strings even if they start with any number of white spaces. It's been very little time since I started using regular expressions, so I need some help
Here is an example. I have a file (file.txt) that contains two lines
#String1='Test One'
String1='Test Two'
I'm trying to change the value on the second line without affecting line 1, so I used this:
sed -i "s|String1=.*$|String1='Test Three'|g"
This changes the values for both lines. How can I make sed change only the value of the second string?
Thank you
With gnu sed, you match spaces using \s, while other sed implementations usually work with the [[:space:]] character class. So, pick one of these:
sed 's/^\s*AWord/AnotherWord/'
sed 's/^[[:space:]]*AWord/AnotherWord/'
Since you're using -i, I assume GNU sed. Either way, you probably shouldn't retype your word, as that introduces the chance of a typo. I'd go with:
sed -i "s/^\(\s*String1=\).*/\1'New Value'/" file
Move the \s* outside of the parens if you don't want to preserve the leading whitespace.
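Here is a small sketch of that approach on the two lines from the question, with some leading whitespace added to the second line for the demo. It uses [[:space:]] rather than \s so that it does not depend on GNU sed, and it reads from a pipe instead of editing a file in place.

```shell
input="#String1='Test One'
   String1='Test Two'"

# [[:space:]] is the portable spelling of GNU sed's \s.
result=$(printf '%s\n' "$input" |
  sed "s/^\([[:space:]]*String1=\).*/\1'Test Three'/")

printf '%s\n' "$result"
```

The commented line does not match because # is not whitespace, and the indentation of the second line is preserved by \1.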
There are a couple of solutions you could use to go about your problem
If you want to ignore lines that begin with a comment character such as '#' you could use something like this:
sed -i "/^\s*#/! s|String1=.*$|String1='Test Three'|g" file.txt
which will only operate on lines that do not match (the trailing !) the regular expression /^\s*#/, i.e. lines that do not begin (^) with optional whitespace (\s*) followed by an octothorpe (#).
The other option is to include the characters before 'String' as part of the substitution. Doing it this way means you'll need to capture \(...\) the group to include it in the output with \1
sed -i "s|^\(\s*\)String1=.*$|\1String1='Test Four'|g" file.txt
With GNU sed, try:
sed -i "s|^\s*String1=.*$|String1='Test Three'|" file
or
sed -i "/^\s*String1=/s/=.*/='Test Three'/" file
Using awk you could do:
awk '/String1/ && f++ {$2="Test Three"}1' FS=\' OFS=\' file
#String1='Test One'
String1='Test Three'
It ignores the first line containing String1, because f++ evaluates to 0 (false) the first time; on later matches it is non-zero, so the value is replaced.

Append text to a line on multiple conditions

I am very new to sed so please bear with me... I have a file with contents like
a=1
b=2,3,4
c=3
d=8
.
.
I want to append 'x' to a line which starts with 'c=' and does not contain an 'x'. What I am using right now is
sed -i '/^c=/ s/$/x/'
but this does not cover the second part of my explanation, the 'x' should only be appended if the line did not have it already and hence if I run the command twice it makes the line "c=3xx" which I do not want.
Any help here would be highly appreciated and I know there are a lot of sharp heads around here :) I understand that this can be handled pretty easily through bash but using sed here is a hard requirement.
You can do something like this:
sed -i '/^c=/ {/x/b; s/$/x/}'
Curly brackets are used for grouping. The b command branches to the end of the script (stops the processing of the current line).
b label
Branch to label; if label is omitted, branch to end of script.
Edit: as William Pursell suggests in the comment, a shorter version would be
sed -i '/^c=/ { /x/ !s/$/x/ }'
awk is probably a better choice here as you can easily combine regular expression matches with logical operators. Given the input:
$ cat file
a=1
b=2,3,4
c=3
c=x
c=3
d=8
The command would be:
$ awk '/^c=/ && !/x/ {$0=$0"x"} 1' file
a=1
b=2,3,4
c=3x
c=x
c=3x
d=8
Where $0 is the awk variable that contains the current line being read.
This might work for you (GNU sed):
sed -i '/^c=[^x]*$/s/$/x/' file
or:
sed -i 's/^c=[^x]*$/&x/' file
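The point of the question is that running the command twice must not add a second x. A minimal sketch of that check follows, using the first of the two commands above without -i; append_x is a hypothetical helper name introduced only for the demo.

```shell
input='a=1
b=2,3,4
c=3
c=x
d=8'

append_x() {
  # Only lines that start with c= and contain no x get the suffix.
  sed '/^c=[^x]*$/s/$/x/'
}

once=$(printf '%s\n' "$input" | append_x)
twice=$(printf '%s\n' "$once" | append_x)

printf '%s\n' "$twice"
```

If the command is idempotent, the second pass leaves the output of the first pass unchanged.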