I was able to use boost::multiprecision and print a sample number:
#include <boost/multiprecision/cpp_dec_float.hpp>
#include <iomanip>   // setprecision
#include <iostream>
#include <string>
using namespace std;
using boost::multiprecision::cpp_dec_float_50;
int main() {
    std::string number = "92233720368.54775807";
    cpp_dec_float_50 decimal(number);
    cout << fixed << setprecision(50) << "boost: " << decimal << endl;
}
The output is:
boost: 92233720368.54775807000000000000000000000000000000000000000000
How can I print it without trailing zeros, exactly as 92233720368.54775807?
And is it possible to print it to std::wcout?
I have a simple c++ program that raises SIGSEGV:
#include <iostream>
#include <signal.h>
int main() {
std::cout << "Raising seg fault..." << std::endl;
raise(SIGSEGV);
return 0;
}
When running this program I get the following output
Raising seg fault...
Segmentation fault
But when I run my program from a Perl script through a pipe, the segmentation fault message disappears.
Here is my Perl script:
use strict;
use warnings;
my $cmd ="./test";
open (OUTPUT, "$cmd 2>&1 |") || die();
while (<OUTPUT>) {
print;
}
close (OUTPUT);
my $exit_status = $?>>8;
print "exit status: $exit_status\n";
I get the following output when running the script:
Raising seg fault...
exit status: 0
How could this be possible? Where is the segmentation fault and why is the exit status 0?
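You are discarding the parts of $? that indicate whether the process was killed by a signal. The low 7 bits of $? hold the signal number and only the high byte holds the exit code, so $? >> 8 is 0 for a process that died from SIGSEGV. The "Segmentation fault" message itself is normally printed by the shell that launched the program, not by the program, which is why it does not show up in the pipe.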
Replace
my $exit_status = $?>>8;
print "exit status: $exit_status\n";
with
die("Killed by signal ".( $? & 0x7F )."\n") if $? & 0x7F;
die("Exited with error ".( $? >> 8 )."\n") if $? >> 8;
print("Completed successfully\n");
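The bit layout behind that fix can be illustrated with plain integer arithmetic. The following is a minimal sketch in POSIX sh; decode_wait_status is a hypothetical helper name, and its argument is the raw 16-bit status as Perl reports it in $? (not a shell's own $?, which encodes signals differently).

```shell
#!/bin/sh
# decode_wait_status STATUS
# STATUS is a raw wait status as found in Perl's $? (assumption: 16-bit layout,
# low 7 bits = signal number, bit 7 = core dump flag, high byte = exit code).
decode_wait_status() {
    status=$1
    sig=$((status & 127))     # same as $? & 0x7F in Perl
    core=$((status & 128))    # same as $? & 0x80 in Perl
    code=$((status >> 8))     # same as $? >> 8 in Perl
    if [ "$sig" -ne 0 ]; then
        if [ "$core" -ne 0 ]; then
            echo "killed by signal $sig (core dumped)"
        else
            echo "killed by signal $sig"
        fi
    elif [ "$code" -ne 0 ]; then
        echo "exited with error $code"
    else
        echo "completed successfully"
    fi
}

decode_wait_status 11     # SIGSEGV without a core dump
decode_wait_status 139    # SIGSEGV with the core dump bit set (128 + 11)
decode_wait_status 256    # normal exit with code 1
```

In both signal cases the value shifted right by 8 is 0, which is why the script in the question reported "exit status: 0".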
How can I iterate over $@ after it has been stored in another variable in another function?
Note this is about the sh shell, not bash.
My code (super simplified):
#! /bin/sh
set -- a b "c d"
args=
argv() {
    shift # pretend handling options
    args="$@" # remaining arguments
}
fun() {
    for arg in "$args"; do
        echo "+$arg+"
    done
}
argv "$@"
fun
Output:
+b c d+
I want:
+b+
+c d+
The special parameter $@ holds the arguments with their whitespace preserved. A for loop over "$@" preserves the whitespace too.
set -- a b "c d"
for arg in "$@"; do
    echo "+$arg+"
done
Output:
+a+
+b+
+c d+
But once $@ is assigned to another variable, the whitespace preservation is gone.
set -- a b "c d"
args="$@"
for arg in "$args"; do
    echo "+$arg+"
done
Output
+a b c d+
Without quotes:
for arg in $args; do
echo "+$arg+"
done
Output:
+a+
+b+
+c+
+d+
In bash it can be done using arrays.
set -- a b "c d"
args=("$@")
for arg in "${args[@]}"; do
echo "+$arg+"
done
Output:
+a+
+b+
+c d+
Can that be done in the sh shell?
You could use shift again inside fun if you know the shift has been performed in argv.
#! /bin/sh
set -- a b "c d"
args=
argv() {
    shifted=1 # pretend handling options
    shift $shifted
}
fun() {
    # quote the variable: an unquoted empty value would make [ -n ] true
    [ -n "$shifted" ] && shift "$shifted"
    for arg; do
        echo "+$arg+"
    done
}
argv "$@"
fun "$@"
Output:
+b+
+c d+
Here are two workarounds. Both have caveats.
First workaround: put newlines between arguments then use read.
set -- a b " c d "
args=
argv() {
    shift
    for arg in "$@"; do
        args="$args$arg\n"
    done
}
fun() {
    printf "$args" | while IFS= read -r arg; do
        echo "+$arg+"
    done
}
argv "$@"
fun
Output:
+b+
+ c d +
Note that even the spaces before and after are preserved.
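Caveat: if the arguments contain newlines you are screwed. The same goes for % characters and backslash escapes, because the string is used as the printf format.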
Second workaround: put quotes around arguments then use eval.
set -- a b " c d "
args=
argv() {
    shift
    for arg in "$@"; do
        args="$args \"$arg\""
    done
}
fun() {
    for arg in "$@"; do
        echo "+$arg+"
    done
}
argv "$@"
eval fun "$args"
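Caveat: if the arguments contain double quotes you are screwed. The same goes for $, backticks, and backslashes, which eval also interprets inside double quotes.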
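A variation on the eval workaround that avoids the quoting caveat is to wrap each argument in single quotes and escape any embedded single quote, a technique commonly used in portable sh scripts. The sketch below assumes a POSIX sh and sed; save_args is a hypothetical helper name, not something the shell provides.

```shell
#!/bin/sh
# save_args ARG...: print the arguments as one string that eval can safely
# turn back into the same argument list (hypothetical helper).
save_args() {
    for arg do
        # Turn every ' into '\'' , open a quote on the first line and close it
        # (followed by a line continuation) on the last line of the argument.
        printf '%s\n' "$arg" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"
    done
    echo " "
}

fun() {
    for arg do
        echo "+$arg+"
    done
}

set -- a b " c d " "it's \"quoted\""
shift                     # pretend handling options
args=$(save_args "$@")    # remaining arguments, stored in a plain variable
set --                    # the positional parameters are gone now
eval "set -- $args"       # restore them from the saved string
fun "$@"
```

This prints +b+, + c d + and +it's "quoted"+ on separate lines. Because sed handles the argument line by line, arguments containing newlines survive as well.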
My output file has 800,000 rows and 8 fields for 3 samples; I show just 2 rows here. I want to extract only some specific information from each line:
chr, position, SNP-ID, Quality, DP, QD, and genotypes (./., 0/0, 0/1, or 1/1). I need a script that extracts this information and writes it to a new file. Could you please advise? Thanks.
#chr pos SNP-ID Qual Info geno(sample1) geno(sample2) geno(sample3)
chrM 152 rs117135796 7427.14 AC=2;AF=0.333;AN=6;BaseQRankSum=-20.485;DB;DP=702;DS;Dels=0.00;FS=167.659;HaplotypeScore=2.6106;MLEAC=2;MLEAF=0.333;MQ=50.00;MQ0=0;MQRankSum=-1.507;QD=36.77;ReadPosRankSum=12.041 0/0:250,0:237:99:0,701,10320 0/0:250,0:238:99:0,713,10507 1/1:0,202:192:99:7465,572,0
chr10 5874 rs118203891 33.13 AC=1;AF=0.167;AN=6;BaseQRankSum=1.454;DB;DP=657;DS;Dels=0.00;FS=124.424;HaplotypeScore=5.1214;MLEAC=1;MLEAF=0.167;MQ=45.31;MQ0=0;MQRankSum=2.462;QD=0.15;ReadPosRankSum=-8.096 0/1:204,24:206:64:64,0,6345 0/0:203,0:193:99:0,473,6944 0/0:226,0:215:99:0,524,6448
Try:
awk -f ext.awk data.txt > summary.txt
where data.txt is your input data file, and ext.awk is the following (note that the three-argument form of match() used here is a gawk extension):
NR>1 {
match($5,/(DP=[^;]+);/,a)
DP=a[1]
match($5,/(QD=[^;]+);/,a)
QD=a[1]
match($6,/^([^:]+\/[^:]+):/,a)
gt1=a[1]
match($7,/^([^:]+\/[^:]+):/,a)
gt2=a[1]
match($8,/^([^:]+\/[^:]+):/,a)
gt3=a[1]
print $1,$2,$3,$4,DP,QD,gt1,gt2,gt3
}
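If gawk is not available, the same extraction can be sketched with split(), which every POSIX awk provides. The wrapper below is only an illustration: extract_fields is a hypothetical name, and it assumes whitespace-separated columns with a single header line, as in the sample data.

```shell
#!/bin/sh
# extract_fields [FILE...]: print chr, pos, SNP-ID, Qual, DP, QD and one
# genotype per sample (hypothetical helper, POSIX awk only).
extract_fields() {
    awk '
    NR > 1 {
        dp = ""; qd = ""
        # The Info column is a semicolon-separated list of KEY=VALUE pairs
        # (plus a few bare flags such as DB).
        n = split($5, kv, ";")
        for (i = 1; i <= n; i++) {
            if (kv[i] ~ /^DP=/) dp = kv[i]
            else if (kv[i] ~ /^QD=/) qd = kv[i]
        }
        printf "%s %s %s %s %s %s", $1, $2, $3, $4, dp, qd
        # The genotype is the part of each sample column before the first colon.
        for (i = 6; i <= NF; i++) {
            split($i, g, ":")
            printf " %s", g[1]
        }
        printf "\n"
    }' "$@"
}
```

It would be called as extract_fields data.txt > summary.txt, and it handles any number of sample columns.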
Update
Assuming the genotypes are given by the first 3 characters of each field (from $6 to $NF), you could try the following:
NR>1 {
match($5,/(DP=[^;]+);/,a)
DP=a[1]
match($5,/(MQ=[^;]+);/,a)
MQ=a[1]
printf "%s %s %s %s %s %s ", $1,$2,$3,$4,DP,MQ
for (i=6; i<=NF; i++) {
printf "%s", substr($i,1,3)
if (i<NF) printf " "
else printf "\n"
}
}
Update
If you want to:
delete any line where DP<10 or MQ<50;
convert genotypes as follows: (NA, 0, 1, 2)
convert ./. to "NA",
convert 0/0 to "0"
convert 0/1 to "1"
convert 1/1 to "2"
then you can try:
BEGIN {
geno["./."]="NA"
geno["0/0"]="0"
geno["0/1"]="1"
geno["1/1"]="2"
}
NR>1 {
match($5,/(DP=[^;]+);/,a)
DP=a[1]
match(DP,/=(.*)$/,a)
dpv=a[1]
match($5,/(MQ=[^;]+);/,a)
MQ=a[1]
match(MQ,/=(.*)$/,a)
mqv=a[1]
if (dpv<10 || mqv<50) next
else {
printf "%s %s %s %s %s %s ", $1,$2,$3,$4,DP,MQ
for (i=6; i<=NF; i++) {
type=substr($i,1,3)
printf "%s", geno[type]
if (i<NF) printf " "
else printf "\n"
}
}
}
Perl gives a nice terse program:
perl -ane '
BEGIN {$, = " "}
@fields = @F[0..3];
push @fields, $1, $2 if $F[4] =~ /(DP=.+?);.*(QD=.+?);/;
push @fields, (split /:/)[0] for @F[5,6,7];
print @fields, "\n";
' <<END
chrM 152 rs117135796 7427.14 AC=2;AF=0.333;AN=6;BaseQRankSum=-20.485;DB;DP=702;DS;Dels=0.00;FS=167.659;HaplotypeScore=2.6106;MLEAC=2;MLEAF=0.333;MQ=50.00;MQ0=0;MQRankSum=-1.507;QD=36.77;ReadPosRankSum=12.041 0/0:250,0:237:99:0,701,10320 0/0:250,0:238:99:0,713,10507 1/1:0,202:192:99:7465,572,0
chr10 5874 rs118203891 33.13 AC=1;AF=0.167;AN=6;BaseQRankSum=1.454;DB;DP=657;DS;Dels=0.00;FS=124.424;HaplotypeScore=5.1214;MLEAC=1;MLEAF=0.167;MQ=45.31;MQ0=0;MQRankSum=2.462;QD=0.15;ReadPosRankSum=-8.096 0/1:204,24:206:64:64,0,6345 0/0:203,0:193:99:0,473,6944 0/0:226,0:215:99:0,524,6448
END
chrM 152 rs117135796 7427.14 DP=702 QD=36.77 0/0 0/0 1/1
chr10 5874 rs118203891 33.13 DP=657 QD=0.15 0/1 0/0 0/0
The Perl wrapper below executes commands in parallel, saving STDOUT
and STDERR to /tmp files:
open(A,"|parallel");
for $i ("date", "ls", "pwd", "factor 17") {
print A "$i 1> '/tmp/$i.out' 2> '/tmp/$i.err'\n";
}
close(A);
How do I obtain the exit status values from the individual commands?
To get the exit status of the individual jobs, parallel would need to write the info somewhere. I don't know if it does or not. If it doesn't, you can do that yourself.
my %jobs = (
"date" => "date",
"ls" => "ls",
"pwd" => "pwd",
"factor" => "factor 17",
);
open(my $parallel, "|parallel");
for my $id (keys(%jobs)) {
print $parallel
$jobs{$id}
." 1> '/tmp/$id.out'"
." 2> '/tmp/$id.err' ; "
."echo \$?"
." > '/tmp/$id.exit'\n";
}
close($parallel);
my $exit_status = $? >> 8;
if ($exit_status >= 255) {
print("Failed\n");
} else {
printf("%d failed jobs\n", $exit_status);
}
for my $id (keys(%jobs)) {
...grab output and exit code from files...
}
Update:
I went and installed parallel.
It has an option called --joblog {file} which produces a report with exit codes. It accepts - for file name if you want it to output to STDOUT.
Note that parallel doesn't recognise abnormal death by signal, so this is not included in the --joblog report. Using the solution I posted above, a missing .exit file would indicate an abnormal death. (You must make sure it doesn't exist in the first place, though.)
Update:
@Ole Tange mentions that the limitation of --joblog {file} I mentioned above, the lack of logging of death by signal, has been addressed in version 20110722.
GNU Parallel 20110722 has exit val and signal in --joblog:
parallel --joblog /tmp/log false ::: a
cat /tmp/log
Seq Host Starttime Runtime Send Receive Exitval Signal Command
1 : 1311332758 0 0 0 1 0 false a
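A log in this layout can be filtered for failed jobs with a few lines of awk. This is a minimal sketch that assumes the column order shown above (Exitval in column 7, Signal in column 8, the command from column 9 onward); failed_jobs is a hypothetical helper name.

```shell
#!/bin/sh
# failed_jobs LOGFILE: list jobs whose exit value or signal is non-zero.
# Assumes the --joblog column order shown above; runs of whitespace inside
# the command are collapsed to single spaces in the output.
failed_jobs() {
    awk 'NR > 1 && ($7 != 0 || $8 != 0) {
        cmd = $9
        for (i = 10; i <= NF; i++) cmd = cmd " " $i
        printf "seq=%s exit=%s signal=%s cmd=%s\n", $1, $7, $8, cmd
    }' "$1"
}
```

For the log above, failed_jobs /tmp/log would print seq=1 exit=1 signal=0 cmd=false a.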
If you want to avoid the wrapper you could consider:
cat foo | parallel "{} >\$PARALLEL_SEQ.out 2>\$PARALLEL_SEQ.err; echo \$? >\$PARALLEL_SEQ.status"
Version 20110422 or later makes it even shorter:
cat foo | parallel "{} >{#}.out 2>{#}.err; echo \$? >{#}.status"
If your lines do not contain ' then this should work too:
cat foo | parallel "{} >'{}'.out 2>'{}'.err; echo \$? >'{}'.status"
Instead of wrapping parallel, you can use any of the tons of modules available from CPAN providing similar functionality.
For instance:
use feature 'say';
use Proc::Queue size => 10, qw(run_back);
my @pids;
for my $i ("date", "ls", "pwd", "factor 17") {
    push @pids, run_back {
        # double quotes, so that $i is interpolated into the file names
        open STDOUT, '>', "/tmp/$i.out";
        open STDERR, '>', "/tmp/$i.err";
        exec $i;
    }
}
for (@pids) {
    1 while waitpid($_, 0) <= 0;
    say "process $_ exit code: ", ($? >> 8);
}