Perl: print certain rows based on certain values of a column

Hey guys, I'm a beginner in Perl programming. In my list.txt I have 5 rows and 7 columns, and what I want to do is print certain rows based on the value a column has. For example:
NO. RES REF ERRORS WARNING PROB_E PROB_C
1 k C 0 0 0.240 0.713
2 l C 16 2 0.365 0.568
3 n C 7 4 0.365 0.568
4 f E 0 0 0.613 0.342
I want to print, looking at columns 3 and 4 (ERRORS and WARNING), all the rows that have a value different from 0. In this case the output should be rows 2 and 3. I hope I'm making myself clear :) sorry for my poor English.

Try this:
perl -ane 'print if $. > 1 and ($F[3] or $F[4])' list.txt
With -a each line is autosplit into @F, so $F[3] and $F[4] are the ERRORS and WARNING fields; the $. > 1 test skips the header line, whose text fields would otherwise count as true.

Related

How to read a block of rows into a single record with PowerShell?

How can the columns of data from a block of text:
nicholas@mordor:~/powershell$
nicholas@mordor:~/powershell$ cat multiple_lines.data
a 4
b 5
d 6
e 7
nicholas@mordor:~/powershell$
nicholas@mordor:~/powershell$ datamash transpose < multiple_lines.data > transposed.data
nicholas@mordor:~/powershell$
nicholas@mordor:~/powershell$ cat transposed.data
a 4 b 5 d 6 e 7
nicholas@mordor:~/powershell$
nicholas@mordor:~/powershell$ datamash transpose < transposed.data
a 4
b 5
d 6
e 7
nicholas@mordor:~/powershell$
be fed into a CSV-type file, so that column a has the value 4, and so on? Here c has been omitted, but it can be assumed to be present, or at least that missing columns can be added.
No doubt awk would be fantastic at grabbing the above numbers, but I'm looking to use PowerShell here. Output to JSON or XML would be just as good as CSV; almost any data-interchange format would be fine.
Assuming an array of such blocks.
Use Import-Csv instead of ConvertFrom-Csv when reading from a file. Note that both default to the comma delimiter, so pass -Delimiter ' ' for space-separated data:
$txt = @"
a 4
b 5
d 6
e 7
"@
$table = $txt | ConvertFrom-Csv -Delimiter ' '
$table
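The question notes that awk would be good at grabbing these numbers; for comparison outside PowerShell, a minimal awk sketch that turns the two-column block into a CSV header row and value row:

```shell
# Collect column names ($1) and values ($2) row by row, then emit them
# as two comma-separated lines.
printf 'a 4\nb 5\nd 6\ne 7\n' |
awk '{hdr = hdr sep $1; val = val sep $2; sep = ","} END {print hdr; print val}'
```

This prints a,b,d,e on the first line and 4,5,6,7 on the second, i.e. column a has the value 4, and so on.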

split a line into its components

I need to split the lines of an input file into its columns.
ATOM 0 HB3 ALA C 999 28.811 -7.680 12.279 1.00 57.53 H
ATOM 7637 N PRO C1000 27.299 -5.667 10.647 1.00216.82 N
The code I have works fine as long as the 6th column is below 1000, i.e. shorter than 4 digits:
($ATOM, $atom_num, $atom_type, $res, $chain, $res_num) = split(" ", $pdb);
However, as soon as column 6 reaches 1000, the two columns run together (C1000) and the whitespace split no longer separates them. I am no expert in Perl, but the code I am dealing with is Perl, so I need to figure out how to split this by the width of each column instead.
Any suggestions?
I solved it by using unpack and defining the length of each column.
$format = 'A6 A6 A5 A4 A1 A5';
($ATOM, $atom_num, $atom_type, $res, $chain, $res_num) = unpack($format, $pdb);

How to convert spaces into new line using perl?

I have the following input, which is stored in a single scalar variable named $var1.
Input (i.e. stored in $var1):
Gain Dead_coverage Export_control Functional_coverage Function_logic top dac_decoder Datapath System_Level Black_DV Sync_logic temp1 temp2 temp3 temp4 temp5 temp6 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263
Expected output:
Gain
Dead_coverage
Export_control
Functional_coverage
Function_logic
top
dac_decoder
Datapath
System_Level
Black_DV
Sync_logic
temp1
temp2
temp3
temp4
temp5
temp6
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
My code:
I tried the following regular expression.
$var1=tr{\s}{\n};
The above does not produce my expected output.
Note: the numbers may range up to n, and the words may start or end with upper- or lower-case letters. Either way I need to produce the expected output; which regular expression can I use for that?
Requirements:
1.split space into new line.
2.for numbers(i.e 123456789101112.....) it should be considered as follows
1
2
3
4
5
6
7
8
9
10
11
12
.
.
.
so on,...
After digit 9, each following number should be treated as two digits.
tr is a transliteration. That only works with individual characters, not patterns. You need to use s/// with the /g modifier.
$var1 =~ s/\s/\n/g;
You can also do this with split and join.
$var1 = join "\n", split / /, $var1;
It shouldn't make a difference in terms of performance, even if the string is long.

average range of data and plot in gnuplot

I have this kind of data:
label-> 1 2 3 4 5
val1 1.67E+07 2.20E+07 3.04E+07 7.89E+07 1.24E+08
val2 1.71E+07 2.35E+07 2.70E+07 7.80E+07 1.31E+08
val3 1.48E+07 2.15E+07 2.74E+07 7.18E+07 1.17E+08
val4 1.57E+07 2.07E+07 2.49E+07 7.46E+07 1.27E+08
val5 1.32E+07 2.23E+07 3.07E+07 7.50E+07 1.16E+08
I need to plot the label vs the average of each val column, like this:
label-> 1 2 3 4 5
val1 1.67E+07 2.20E+07 3.04E+07 7.89E+07 1.24E+08
val2 1.71E+07 2.35E+07 2.70E+07 7.80E+07 1.31E+08
val3 1.48E+07 2.15E+07 2.74E+07 7.18E+07 1.17E+08
val4 1.57E+07 2.07E+07 2.49E+07 7.46E+07 1.27E+08
val5 1.32E+07 2.23E+07 3.07E+07 7.50E+07 1.16E+08
mean 1.55E+07 2.20E+07 2.81E+07 7.57E+07 1.23E+08
Is there any way to perform this operation in gnuplot, or should I stick with Excel?
You could do it using awk and gnuplot. Assume your example data (without mean row) is in data.txt.
Then you can calculate the mean of each column, starting from the second column (i=2), while treating the header (record, or row, NR==1) specially: there you do not summate, you only store the labels. For that purpose, use an awk condition: if (NR==1) {...store labels...} else {...summate...}.
Awk reads the data row by row. In each data row, iterate over the fields and add the value from column i to the array element a[i]:
{for(i=2;i<=NF;i++) a[i]+=$i;}
When iterating over the first row (NR==1), save the column labels into a label array instead:
if (NR==1) {for(i=2;i<=NF;i++) label[i]=$i;}
At the END of the awk script (all rows processed), divide each sum by the number of data rows, NR-1 (all records minus the header row), to get the mean values. Note that the code below assumes rectangular data (NF constant).
Then print the labels and mean values, one row per label:
for(i=2;i<=NF;i++) {printf label[i]" "; print a[i]/(NR-1)}
The final data table looks like this:
1 15500000
2 22000000
3 28080000
4 75660000
5 123000000
Then you could plot one column against the other.
Note, the final data for gnuplot should be formatted in columns, not rows.
The following code performs the described operations:
gnuplot> unset key
gnuplot> plot "<export LC_NUMERIC=C; awk '{if (NR==1) {for(i=2;i<=NF;i++) label[i]=$i;} else {for(i=2;i<=NF;i++) a[i]+=$i;};} END {for(i=2;i<=NF;i++) {printf label[i]\" \"; print a[i]/(NR-1)}};' data.txt"
Note that double-quote characters inside the plot string must be escaped with a backslash (\") for gnuplot.
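The same awk program can be run outside gnuplot to check the means; this sketch recreates the table from the question in data.txt:

```shell
# Recreate the question's table, then average each value column over the
# five data rows (NR-1 = records minus the header).
cat > data.txt <<'EOF'
label-> 1 2 3 4 5
val1 1.67E+07 2.20E+07 3.04E+07 7.89E+07 1.24E+08
val2 1.71E+07 2.35E+07 2.70E+07 7.80E+07 1.31E+08
val3 1.48E+07 2.15E+07 2.74E+07 7.18E+07 1.17E+08
val4 1.57E+07 2.07E+07 2.49E+07 7.46E+07 1.27E+08
val5 1.32E+07 2.23E+07 3.07E+07 7.50E+07 1.16E+08
EOF
LC_NUMERIC=C awk 'NR==1 {for(i=2;i<=NF;i++) label[i]=$i; next}
     {for(i=2;i<=NF;i++) a[i]+=$i}
     END {for(i=2;i<=NF;i++) print label[i], a[i]/(NR-1)}' data.txt
```

The output is the five label/mean pairs (1 15500000 through 5 123000000), matching the mean row shown above.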

Parse negative numbers from string in perl

How do I parse a negative number from a string in perl? I have this piece of code:
print 3 - int("-2");
It gives me 5, but I need to have 3. How do I do it?
Perl automatically converts between strings and numbers as needed; there is no need for int() unless you actually want to truncate a floating-point value (whether stored as a number or in a string) to an integer. So you can just do:
my $string = "-2";
print 3 - $string;
and get 5 (because 3 minus negative 2 is 5).
Well, 3 - (-2) really is 5. I'm not really sure what you want to achieve, but if you want to filter out negative values, why not do something like this:
my $i = int("-2");
$i = $i < 0 ? 0 : $i;
This turns negative values into 0 but lets positive numbers pass.
It seems to be parsing it correctly.
3 - (-2) is 5.
If it had mistakenly parsed -2 as 2, it would have printed 3 - 2 = 1.
No matter how you add/subtract 2 from 3, you will never get 3.
You are probably thinking of some other function instead of 'int'.
try:
use List::Util qw 'max';
...
print 3 - max("-2", 0);
if you want to get 3 as result.
Regards
rbo