Remove lines from AWK output

Remove lines from AWK output - sed

I would like to remove lines that have less than 2 columns from a file:
awk '{ if (NF < 2) print}' test
one two
Is there a way to store these lines into variable and then remove it with xargs and sed, something like
awk '{ if (NF < 2) VARIABLE}' test | xargs sed -i /VARIABLE/d

GNU sed
I would like to remove lines that have less than 2 columns
less than 2 = remove lines with only one column
sed -r '/^\s*\S+\s+\S+/!d' file

If you would like to split the input into two files (named "pass" and "fail"), based on condition:
awk '{if (NF > 1 ) print > "pass"; else print > "fail"}' input
If you simply want to filter/remove lines with NF < 2:
awk '(NF > 1){print}' input

Related

How do I match multiple addresses in sed?

I want to execute some sed command for any line that matches either the and or or of multiple commands: e.g., sed '50,70/abc/d' would delete all lines in range 50,70 that match /abc/, or a way to do sed -e '10,20s/complicated/regex/' -e '30,40s/complicated/regex/ without having to retype s/compicated/regex/

Logical-and
The and part can be done with braces:
sed '50,70{/abc/d;}'
Further, braces can be nested for multiple and conditions.
(The above was tested under GNU sed. BSD sed may differ in small but frustrating details.)
Logical-or
The or part can be handled with branching:
sed -e '10,20{b cr;}' -e '30,40{b cr;}' -e b -e :cr -e 's/complicated/regex/' file
10,20{b cr;}
For all lines from 10 through 20, we branch to label cr
30,40{b cr;}
For all lines from 30 through 40, we branch to label cr
b
For all other lines, we skip the rest of the commands.
:cr
This marks the label cr
s/complicated/regex/
This performs the substitution on lines which branched to cr.
With GNU sed, the syntax for the above can be shortened a bit to:
sed '10,20{b cr}; 30,40{b cr}; b; :cr; s/complicated/regex/' file

To delete lines from 10 to 20 and 30 to 40 matching your complicated regex with GNU sed:
sed -e '10,20bA;30,40bA;b;:A;s/complicated/regex/;d' file
or:
sed -e '10,20bA' -e '30,40bA' -e 'b;:A;s/complicated/regex/;d' file
bA: jump to label :A
b: a jump without label -> jump to end of script
d: delete line

I don't think sed has the facility for multiple selection criteria, my advice would be to step up to awk, where you can do something like:
awk 'NR >= 50 && NR <= 70 && /abc/ {next} {print}' inputFile
awk '(NR >= 10 and NR <= 20) || (NR >= 30 && NR <= 40) {
sub("from-regex", "to-string", $0); print }'

sed is excellent for simple substitutions on individual lines but for anything else just use awk for clarity, robustness, portability, maintainability, etc...
awk '
(NR>=50 && NR<=70) && /abc/ { next }
(NR>=10 && NR<=20) || (NR>=30 && NR<=40) { sub(/complicated/,"regex") }
{ print }
' file

Make some replacements on a bunch of files depending the number of columns per line

I'm having a problem dealing with some files. I need to perform a column count for every line in a file and depending the number of columns i need to add severals ',' in in the end of each line. All lines should have 36 columns separated by ','
This line solves my problem, but how do I run it in a folder with several files in a automated way?
awk ' BEGIN { FS = "," } ;
{if (NF == 32) { print $0",,,," } else if (NF==31) { print $0",,,,," }
}' <SOURCE_FILE> > <DESTINATION_FILE>
Thank you for all your support
R&P

The answer depends on your OS, which you haven't told us. On UNIX and assuming you want to modify each original file, it'd be:
for file in *
do
awk '...' "$file" > tmp$$ && mv tmp$$ "$file"
done
Also, in general to get all records in a file to have the same number of fields you can do this without needing to specify what that number of fields is (though you can if appropriate):
$ cat tst.awk
BEGIN { FS=OFS=","; ARGV[ARGC++] = ARGV[ARGC-1] }
NR==FNR { nf = (NF > nf ? NF : nf); next }
{
tail = sprintf("%*s",nf-NF,"")
gsub(/ /,OFS,tail)
print $0 tail
}
$
$ cat file
a,b,c
a,b
a,b,c,d,e
$
$ awk -f tst.awk file
a,b,c,,
a,b,,,
a,b,c,d,e
$
$ awk -v nf=10 -f tst.awk file
a,b,c,,,,,,,
a,b,,,,,,,,
a,b,c,d,e,,,,,

It's a short one-liner with Perl:
perl -i.bak -F, -alpe '$_ .= "," x (36-#F)' *

if this is only a single folder without subfolders, use:
for oldfile in /path/to/files/*
do
newfile="${oldfile}.new"
awk '...' "${oldfile}" > "${newfile}"
done
if you also want to include subdirectories recursively, it's probably easiest to put the awk+redirection into a small shell-script, like this:
#!/bin/bash
oldfile=$1
newfile="${oldfile}.new"
awk '...' "${oldfile}" > "${newfile}"
and then run this script (let's calls it runawk.sh) via find:
find /path/to/files/ -type f -not -name "*.new" -exec runawk.sh \{\} \;

Joining lines in order of different blocks in the same text file

I have a file split in blocks like the following:
AGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTGGGG
AGGTAGTTATTATTTTTTTGGTTTTTAGTATTTAATTGAGTGTTT
ATGTAGGTGTTTATGTATTAGTTTTTTTTAGGTTTAGGGTGTTGT
ATTTAGGTTTTGTGTTTTGTGTATTATTGAATTTAATTAAAGTTA
AGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTTTTT
AGTTTTTTTTTATTTGTCGGGATATTTTAGTTGATTTTAGATTGC
TATATTTTTAGTTTCGATTCGTCGTAAGTTTTATTTTTTTTTAAT
GGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTTTTTT
I've truncated/wrapped the lines for clarity's sake, but imagine very long lines. The point of my question is that I want a final file that looks like this:
AGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTGGGGAGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTTTTT
AGGTAGTTATTATTTTTTTGGTTTTTAGTATTTAATTGAGTGTTTAGTTTTTTTTTATTTGTCGGGATATTTTAGTTGATTTTAGATTGC
ATGTAGGTGTTTATGTATTAGTTTTTTTTAGGTTTAGGGTGTTGTTATATTTTTAGTTTCGATTCGTCGTAAGTTTTATTTTTTTTTAAT
ATTTAGGTTTTGTGTTTTGTGTATTATTGAATTTAATTAAAGTTAGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTTTTTT
Where this new block has:
the same number of lines as the initial blocks,
each of the lines of the resulting block is a concatenation of the lines with the same line-number in the initial blocks.
this concatenation should be in-order (i.e. "1st line of 1st block" + "1st line of 2nd block", etc
Is it possible to achieve this final block using sed and/or awk, could you show me how it could be done?

In bash with paste:
$ paste <(head -4 file) <(tail -4 file) | tr -d '\t'
AGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTGGGGAGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTTTTT
AGGTAGTTATTATTTTTTTGGTTTTTAGTATTTAATTGAGTGTTTAGTTTTTTTTTATTTGTCGGGATATTTTAGTTGATTTTAGATTGC
ATGTAGGTGTTTATGTATTAGTTTTTTTTAGGTTTAGGGTGTTGTTATATTTTTAGTTTCGATTCGTCGTAAGTTTTATTTTTTTTTAAT
ATTTAGGTTTTGTGTTTTGTGTATTATTGAATTTAATTAAAGTTAGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTTTTTT

try this:
awk -vOFS="" '$0{a[NR]=$0}END{for(i=1;i<=NR/2;i++)print a[i],a[i+5]}' file
test with your example:
kent$ cat tmp.txt
AGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTGGGG
AGGTAGTTATTATTTTTTTGGTTTTTAGTATTTAATTGAGTGTTT
ATGTAGGTGTTTATGTATTAGTTTTTTTTAGGTTTAGGGTGTTGT
ATTTAGGTTTTGTGTTTTGTGTATTATTGAATTTAATTAAAGTTA
AGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTTTTT
AGTTTTTTTTTATTTGTCGGGATATTTTAGTTGATTTTAGATTGC
TATATTTTTAGTTTCGATTCGTCGTAAGTTTTATTTTTTTTTAAT
GGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTTTTTT
kent$ awk -vOFS="" '$0{a[NR]=$0}END{for(i=1;i<=NR/2;i++)print a[i],a[i+5]}' tmp.txt
AGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTGGGGAGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTTTTT
AGGTAGTTATTATTTTTTTGGTTTTTAGTATTTAATTGAGTGTTTAGTTTTTTTTTATTTGTCGGGATATTTTAGTTGATTTTAGATTGC
ATGTAGGTGTTTATGTATTAGTTTTTTTTAGGTTTAGGGTGTTGTTATATTTTTAGTTTCGATTCGTCGTAAGTTTTATTTTTTTTTAAT
ATTTAGGTTTTGTGTTTTGTGTATTATTGAATTTAATTAAAGTTAGGATAGGTTTTGGTGTTTGAGGTTAATTTTGTTTTATTTTTTTTT

awk -F'\n' -v RS= '{for (i=1;i<=NF;i++) {str[i] = str[i] $i} END {for (i=1;i<=NF;i++) print str[i]}' file

Brocade alishow merge two consecutive lines awk sed

How would like to join two lines usung awk or sed?
For example, I have data like below:
abcd
12:12:12:12:12:12:12:12
efgh001_01
45:45:45:45:45:45:45:45
ijkl7464746
78:78:78:78:78:78:78:78
and I need output like below:
abcd 12:12:12:12:12:12:12:12
efgh001_01 45:45:45:45:45:45:45:45
ijkl7464746 78:78:78:78:78:78:78:78
Running this almost works, but I need the space or tab:
awk '!(NR%2){print$0p}{p=$0}'

You're almost there:
awk '(NR % 2 == 0) {print p, $0} {p = $0}'

With sed you can do that as follows:
sed -n 'N;s/\n/ /p' file
where:
N reads next line
s replaces the new line character with a space to join both lines properly
p prints the result

This might work for you:
sed '$!N;s/\n/ /' file
or this:
paste -sd' \n' file

Combine columns from different files

I have two text files that have these structures:
File 1
Column1:Column2
Column1:Column2
...
File 2
Column3
Column3
...
I would like to create a file that has this file structure:
Column1:Column3
Column1:Column3
...
Open to any suggestions, but it would be nice if the solution can be done from a Bash shell, or sed / awk / perl / etc...

cut -d: -f1 "File 1" | paste -d: - "File 2"
This cuts field 1 from File 1 (delimited by a colon) and pastes it with the only column in File 2, separating the output fields with a colon.

Here's an awk solution. It assumes file1 and file2 have an equal number of lines.
awk -F : '{ printf "%s:",$1; getline < "file2"; print }' < file1

Since a pure bash implementation hasn't been suggested, also assuming an equal number of lines (bash v4 only):
mapfile -t file2 < file2
index=0
while IFS=: read -r column1 _; do
echo "$column1:${file2[index]}"
((index++))
done < file1
bash v3:
IFS=$'\n' read -r -d '' file2 < file2
index=0
while IFS=: read -r column1 _; do
echo "$column1:${file2[index]}"
((index++))
done < file1

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Remove lines from AWK output - sed

I would like to remove lines that have less than 2 columns from a file: awk '{ if (NF < 2) print}' test one two Is there a way to store these lines into variable and then remove it with xargs and sed, something like awk '{ if (NF < 2) VARIABLE}' test | xargs sed -i /VARIABLE/d

GNU sed I would like to remove lines that have less than 2 columns less than 2 = remove lines with only one column sed -r '/^\s*\S+\s+\S+/!d' file

If you would like to split the input into two files (named "pass" and "fail"), based on condition: awk '{if (NF > 1 ) print > "pass"; else print > "fail"}' input If you simply want to filter/remove lines with NF < 2: awk '(NF > 1){print}' input

Related

How do I match multiple addresses in sed?

Make some replacements on a bunch of files depending the number of columns per line

Joining lines in order of different blocks in the same text file

Brocade alishow merge two consecutive lines awk sed

Combine columns from different files

Categories

Resources