MongoDB verbose logging. What does every v add to output? - mongodb

What does each 'v' (from one to five) add to the log output?
Sure, I can experiment, but can anybody give a concrete answer?

There isn't a prescriptive list of what log lines each level of verbosity adds. Most of the extra details are really only meaningful for the MongoDB developers (particularly as the log levels increase).
You can grep the source code for the log statements if you're curious.
For example, to see what's logged at level 1:
$ grep -r "LOG(1)" * | wc -l
185
$ grep -r "LOG(1)" * | head
client/connpool.cpp: LOG(1) << "Exception thrown when checking pooled connection to " <<
client/dbclient.cpp: LOG(1) << "creating new connection to:" << _servers[0] << endl;
client/dbclient.cpp: LOG(1) << "connected connection!" << endl;
client/dbclient_rs.cpp: LOG(1) << "checking replica set: " << name << endl;
client/dbclient_rs.cpp: if( wasFound ){ LOG(1) << "slave '" << prev << ( wasMaster ? "' is master node, trying to find another node" :
client/dbclient_rs.cpp: else{ LOG(1) << "slave '" << prev << "' was not found in the replica set" << endl; }
client/dbclient_rs.cpp: else LOG(1) << "slave '" << prev << "' is not initialized or invalid" << endl;
client/dbclient_rs.cpp: LOG(1) << "dbclient_rs getSlave falling back to a non-local secondary node" << endl;
client/dbclient_rs.cpp: LOG(1) << "dbclient_rs getSlave no member in secondary state found, "
client/dbclient_rs.cpp: LOG(1) << "_check : " << getServerAddress() << endl;
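To compare all five levels at once, the same grep can be wrapped in a loop. This is only a sketch: it assumes the source tree still uses the LOG(n) macro shown above, and count_log_lines is a made-up helper name, not something MongoDB ships.

```sh
#!/bin/sh
# count_log_lines DIR: for each verbosity level 1-5, count the source lines
# under DIR (default: current directory) that contain a LOG(n) call.
# Hypothetical helper; assumes the LOG(n) macro style from the answer above.
count_log_lines() {
    dir=${1:-.}
    for level in 1 2 3 4 5; do
        # -F treats the pattern as a fixed string; tr strips the padding
        # that some wc implementations put in front of the number.
        count=$(grep -rF "LOG($level)" "$dir" | wc -l | tr -d ' ')
        printf 'LOG(%s): %s\n' "$level" "$count"
    done
}
```

Run it from the root of a source checkout as `count_log_lines .`. Messages logged at level n should show up whenever the verbosity is n or higher, so the per-level counts add up as you add v's.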

Related

Print boost::multiprecision::cpp_dec_float_50 without trailing zeros

I was able to use boost::multiprecision and print a sample number:
#include <boost/multiprecision/cpp_dec_float.hpp>
#include <iomanip>   // setprecision
#include <iostream>
#include <string>
using namespace std;
using boost::multiprecision::cpp_dec_float_50;
int main() {
    std::string number = "92233720368.54775807";
    cpp_dec_float_50 decimal(number);
    cout << fixed << setprecision(50) << "boost: " << decimal << endl;
}
the output is:
boost: 92233720368.54775807000000000000000000000000000000000000000000
How to print it without trailing zeros, exactly as 92233720368.54775807?
And is it possible to print it to std::wcout?

Segmentation fault disappears when running command from Perl

I have a simple c++ program that raises SIGSEGV:
#include <iostream>
#include <signal.h>
int main() {
    std::cout << "Raising seg fault..." << std::endl;
    raise(SIGSEGV);
    return 0;
}
When running this program I get the following output
Raising seg fault...
Segmentation fault
But when I run my program inside a perl script using pipe, the segmentation fault disappears.
Here is my Perl script:
use strict;
use warnings;
my $cmd ="./test";
open (OUTPUT, "$cmd 2>&1 |") || die();
while (<OUTPUT>) {
print;
}
close (OUTPUT);
my $exit_status = $?>>8;
print "exit status: $exit_status\n";
I get the following output when running the script:
Raising seg fault...
exit status: 0
How could this be possible? Where is the segmentation fault and why is the exit status 0?
You are specifically ignoring the parts of $? that indicate whether the process was killed by a signal: $? >> 8 keeps only the exit code, while the signal number lives in the low 7 bits.
Replace
my $exit_status = $?>>8;
print "exit status: $exit_status\n";
with
die("Killed by signal ".( $? & 0x7F )."\n") if $? & 0x7F;
die("Exited with error ".( $? >> 8 )."\n") if $? >> 8;
print("Completed successfully\n");
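The arithmetic behind this can be worked through by hand. The sketch below decodes a status value using the layout that perlvar documents for $? (low 7 bits: signal, bit 0x80: core dump flag, high byte: exit code). It is written in sh only to show the bit operations; decode_status is an illustrative name, and the value 11 for SIGSEGV is an assumption that holds on Linux.

```sh
#!/bin/sh
# decode_status STATUS: interpret a 16-bit wait status laid out like Perl's $?.
decode_status() {
    status=$1
    signal=$(( status & 0x7F ))    # low 7 bits: terminating signal, 0 if none
    core=$(( status & 0x80 ))      # bit 7: set when a core file was dumped
    code=$(( status >> 8 ))        # high byte: exit code
    if [ "$signal" -ne 0 ]; then
        if [ "$core" -ne 0 ]; then
            echo "killed by signal $signal (core dumped)"
        else
            echo "killed by signal $signal"
        fi
    else
        echo "exited with code $code"
    fi
}

decode_status 11     # SIGSEGV (assumed to be signal 11, as on Linux)
decode_status 139    # 11 | 0x80: same signal, core dumped
decode_status 256    # 1 << 8: normal exit with code 1
```

With a status of 11 or 139, shifting right by 8 gives 0, which is why the script in the question printed "exit status: 0".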

Iterate over $@ stored in another variable in another function

How can I iterate over $@ after it has been stored in another variable in another function?
Note this is about the sh shell, not bash.
My code (super simplified):
#! /bin/sh
set -- a b "c d"
args=
argv() {
    shift # pretend handling options
    args="$@" # remaining arguments
}
fun() {
    for arg in "$args"; do
        echo "+$arg+"
    done
}
argv "$@"
fun
Output:
+b c d+
I want:
+b+
+c d+
The special variable $@ stores argv preserving whitespace. The for loop can loop over $@ also preserving whitespace.
set -- a b "c d"
for arg in "$@"; do
    echo "+$arg+"
done
Output:
+a+
+b+
+c d+
But once $@ is assigned to another variable the whitespace preserving is gone.
set -- a b "c d"
args="$@"
for arg in "$args"; do
    echo "+$arg+"
done
Output
+a b c d+
Without quotes:
for arg in $args; do
echo "+$arg+"
done
Output:
+a+
+b+
+c+
+d+
In bash it can be done using arrays.
set -- a b "c d"
args=("$@")
for arg in "${args[@]}"; do
    echo "+$arg+"
done
Output:
+a+
+b+
+c d+
Can that be done in the sh shell?
You could use shift again inside fun if you know the shift has been performed in argv.
#! /bin/sh
set -- a b "c d"
args=
argv() {
    shifted=1 # pretend handling options
    shift $shifted
}
fun() {
    # quote the variable: an unquoted empty value would make [ -n ] true
    [ -n "$shifted" ] && shift "$shifted"
    for arg; do
        echo "+$arg+"
    done
}
argv "$@"
fun "$@"
Output:
+b+
+c d+
Here are two workarounds. Both have caveats.
First workaround: put newlines between arguments then use read.
set -- a b " c d "
args=
argv() {
    shift
    for arg in "$@"; do
        args="$args$arg\n"
    done
}
fun() {
    printf "$args" | while IFS= read -r arg; do
        echo "+$arg+"
    done
}
argv "$@"
fun
Output:
+b+
+ c d +
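Note that even the leading and trailing spaces are preserved.
Caveat: if the arguments contain newlines you are screwed. The string is also used as the printf format, so a % or a backslash in an argument gets interpreted too.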
Second workaround: put quotes around arguments then use eval.
set -- a b " c d "
args=
argv() {
    shift
    for arg in "$@"; do
        args="$args \"$arg\""
    done
}
fun() {
    for arg in "$@"; do
        echo "+$arg+"
    done
}
argv "$@"
eval fun "$args"
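Caveat: if the arguments contain double quotes you are screwed. Because eval re-parses each argument inside double quotes, any $, backtick, or backslash in an argument is expanded as well.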
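One way around the quoting caveat is to wrap each argument in single quotes and escape embedded single quotes before handing the string to eval. The following is a sketch of that widely used POSIX sh idiom, not a definitive implementation; save_args is a name introduced here for illustration, and it assumes a POSIX sed is available.

```sh
#!/bin/sh
# save_args: print every argument as a single-quoted shell word, one per
# line, each ending in a backslash so the lines join when passed to eval.
# An embedded single quote is rewritten as '\'' (close, escaped quote, reopen).
save_args() {
    for arg in "$@"; do
        printf '%s\n' "$arg" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/"
    done
    echo " "
}

args=
argv() {
    shift                     # pretend handling options
    args=$(save_args "$@")    # remaining arguments, quoted for eval
}

fun() {
    eval "set -- $args"       # restore the saved words as "$@" inside fun
    for arg in "$@"; do
        printf '+%s+\n' "$arg"
    done
}

set -- a b " c d " "it's" 'say "hi" to $HOME'
argv "$@"
fun
```

Since everything sits inside single quotes when eval parses it, spaces, double quotes, and dollar signs in the arguments are taken literally.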

how to extract my specific information from each line of output file

My output file has 800,000 rows and 8 fields for 3 samples; I show just 2 rows here. From each line I want to extract only some specific information:
chr, position, SNP-ID, Quality, DP, QD, and the genotypes (./., 0/0, 0/1, or 1/1). I need a script that extracts this information and writes it to a new file. Could you please advise? Thanks
#chr pos SNP-ID Qual Info geno(sample1) geno(sample2) geno(sample3)
chrM 152 rs117135796 7427.14 AC=2;AF=0.333;AN=6;BaseQRankSum=-20.485;DB;DP=702;DS;Dels=0.00;FS=167.659;HaplotypeScore=2.6106;MLEAC=2;MLEAF=0.333;MQ=50.00;MQ0=0;MQRankSum=-1.507;QD=36.77;ReadPosRankSum=12.041 0/0:250,0:237:99:0,701,10320 0/0:250,0:238:99:0,713,10507 1/1:0,202:192:99:7465,572,0
chr10 5874 rs118203891 33.13 AC=1;AF=0.167;AN=6;BaseQRankSum=1.454;DB;DP=657;DS;Dels=0.00;FS=124.424;HaplotypeScore=5.1214;MLEAC=1;MLEAF=0.167;MQ=45.31;MQ0=0;MQRankSum=2.462;QD=0.15;ReadPosRankSum=-8.096 0/1:204,24:206:64:64,0,6345 0/0:203,0:193:99:0,473,6944 0/0:226,0:215:99:0,524,6448
Try:
awk -f ext.awk data.txt > summary.txt
where data.txt is your input data file, and ext.awk is:
NR>1 {
match($5,/(DP=[^;]+);/,a)
DP=a[1]
match($5,/(QD=[^;]+);/,a)
QD=a[1]
match($6,/^([^:]+\/[^:]+):/,a)
gt1=a[1]
match($7,/^([^:]+\/[^:]+):/,a)
gt2=a[1]
match($8,/^([^:]+\/[^:]+):/,a)
gt3=a[1]
print $1,$2,$3,$4,DP,QD,gt1,gt2,gt3
}
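The script above relies on the three-argument match(), which is a gawk extension. Where only a POSIX awk is available, roughly the same result can be had by splitting the Info column on ";" and looking up keys by name. This is a sketch under that assumption, wrapped in a shell function whose name (extract_fields) is made up for illustration:

```sh
#!/bin/sh
# extract_fields: read the table on stdin, skip the header line, and print
# chr, pos, SNP-ID, Qual, DP, QD and the genotype of every sample column.
extract_fields() {
    awk '
    NR > 1 {
        split("", info)                  # portable way to empty the array
        n = split($5, parts, ";")        # Info column: KEY=VALUE;KEY=VALUE;...
        for (i = 1; i <= n; i++) {
            eq = index(parts[i], "=")
            if (eq > 0)                  # flags without a value (DB, DS) are skipped
                info[substr(parts[i], 1, eq - 1)] = substr(parts[i], eq + 1)
        }
        line = $1 " " $2 " " $3 " " $4 " DP=" info["DP"] " QD=" info["QD"]
        for (i = 6; i <= NF; i++) {
            split($i, sample, ":")       # the genotype is the part before the first ":"
            line = line " " sample[1]
        }
        print line
    }'
}
```

It would be called as `extract_fields < data.txt > summary.txt`, and it handles any number of sample columns rather than exactly three.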
Update
Assuming the genotypes are given by the first three characters of each field (from $6 to $NF), you could try the following:
NR>1 {
match($5,/(DP=[^;]+);/,a)
DP=a[1]
match($5,/(MQ=[^;]+);/,a)
MQ=a[1]
printf "%s %s %s %s %s %s ", $1,$2,$3,$4,DP,MQ
for (i=6; i<=NF; i++) {
printf "%s", substr($i,1,3)
if (i<NF) printf " "
else printf "\n"
}
}
Update
If you want to:
- delete every line where DP<10 or MQ<50;
- convert the genotypes to NA, 0, 1, or 2 as follows:
  ./. to "NA",
  0/0 to "0",
  0/1 to "1",
  1/1 to "2";
then you can try:
BEGIN {
geno["./."]="NA"
geno["0/0"]="0"
geno["0/1"]="1"
geno["1/1"]="2"
}
NR>1 {
match($5,/(DP=[^;]+);/,a)
DP=a[1]
match(DP,/=(.*)$/,a)
dpv=a[1]
match($5,/(MQ=[^;]+);/,a)
MQ=a[1]
match(MQ,/=(.*)$/,a)
mqv=a[1]
if (dpv<10 || mqv<50) next
else {
printf "%s %s %s %s %s %s ", $1,$2,$3,$4,DP,MQ
for (i=6; i<=NF; i++) {
type=substr($i,1,3)
printf "%s", geno[type]
if (i<NF) printf " "
else printf "\n"
}
}
}
Perl gives a nice terse program:
perl -ane '
BEGIN {$, = " "}
@fields = @F[0..3];
push @fields, $1, $2 if $F[4] =~ /(DP=.+?);.*(QD=.+?);/;
push @fields, (split /:/)[0] for @F[5,6,7];
print @fields, "\n";
' <<END
chrM 152 rs117135796 7427.14 AC=2;AF=0.333;AN=6;BaseQRankSum=-20.485;DB;DP=702;DS;Dels=0.00;FS=167.659;HaplotypeScore=2.6106;MLEAC=2;MLEAF=0.333;MQ=50.00;MQ0=0;MQRankSum=-1.507;QD=36.77;ReadPosRankSum=12.041 0/0:250,0:237:99:0,701,10320 0/0:250,0:238:99:0,713,10507 1/1:0,202:192:99:7465,572,0
chr10 5874 rs118203891 33.13 AC=1;AF=0.167;AN=6;BaseQRankSum=1.454;DB;DP=657;DS;Dels=0.00;FS=124.424;HaplotypeScore=5.1214;MLEAC=1;MLEAF=0.167;MQ=45.31;MQ0=0;MQRankSum=2.462;QD=0.15;ReadPosRankSum=-8.096 0/1:204,24:206:64:64,0,6345 0/0:203,0:193:99:0,473,6944 0/0:226,0:215:99:0,524,6448
END
chrM 152 rs117135796 7427.14 DP=702 QD=36.77 0/0 0/0 1/1
chr10 5874 rs118203891 33.13 DP=657 QD=0.15 0/1 0/0 0/0

Obtaining exit status values from GNU parallel

The Perl wrapper below executes commands in parallel, saving STDOUT
and STDERR to /tmp files:
open(A,"|parallel");
for $i ("date", "ls", "pwd", "factor 17") {
print A "$i 1> '/tmp/$i.out' 2> '/tmp/$i.err'\n";
}
close(A);
How do I obtain the exit status values from the individual commands?
To get the exit status of the individual jobs, parallel would need to write the info somewhere. I don't know if it does or not. If it doesn't, you can do that yourself.
my %jobs = (
"date" => "date",
"ls" => "ls",
"pwd" => "pwd",
"factor" => "factor 17",
);
open(my $parallel, "|parallel");
for my $id (keys(%jobs)) {
print $parallel
$jobs{$id}
." 1> '/tmp/$id.out'"
." 2> '/tmp/$id.err' ; "
."echo \$?"
." > '/tmp/$id.exit'\n";
}
close($parallel);
my $exit_status = $? >> 8;
if ($exit_status >= 255) {
print("Failed\n");
} else {
printf("%d failed jobs\n", $exit_status);
}
for my $id (keys(%jobs)) {
    # ...grab output and exit code from files...
}
Update:
I went and installed parallel.
It has an option called --joblog {file} which produces a report with exit codes. It accepts - for file name if you want it to output to STDOUT.
Note that parallel doesn't recognise abnormal death by signal, so this is not included in the --joblog report. Using the solution I posted above, a missing .exit file would indicate an abnormal death. (You must make sure it doesn't exist in the first place, though.)
Update:
@Ole Tange mentions that the limitation of --joblog {file} I mentioned above, the lack of logging of death by signal, has been addressed in version 20110722.
GNU Parallel 20110722 has exit val and signal in --joblog:
parallel --joblog /tmp/log false ::: a
cat /tmp/log
Seq Host Starttime Runtime Send Receive Exitval Signal Command
1 : 1311332758 0 0 0 1 0 false a
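Once --joblog has written its report, the failing jobs can be picked out by column. A rough sketch, assuming the log is tab-separated with Exitval in column 7 and Signal in column 8 as in the header above; failed_jobs is an illustrative helper, not part of GNU Parallel.

```sh
#!/bin/sh
# failed_jobs LOGFILE: list the jobs whose exit value or signal is non-zero.
# Assumes a tab-separated joblog with the column order shown above.
failed_jobs() {
    awk -F '\t' '
        NR > 1 && ($7 != 0 || $8 != 0) {
            printf "seq=%s exitval=%s signal=%s command=%s\n", $1, $7, $8, $9
        }
    ' "$1"
}
```

Called as `failed_jobs /tmp/log` on the log above, this would report the single `false a` job with exit value 1.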
If you want to avoid the wrapper you could consider:
cat foo | parallel "{} >\$PARALLEL_SEQ.out 2>\$PARALLEL_SEQ.err; echo \$? >\$PARALLEL_SEQ.status"
Version 20110422 or later makes it even shorter:
cat foo | parallel "{} >{#}.out 2>{#}.err; echo \$? >{#}.status"
If your lines do not contain ' then this should work too:
cat foo | parallel "{} >'{}'.out 2>'{}'.err; echo \$? >'{}'.status"
Instead of wrapping parallel, you can use any of the tons of modules available from CPAN providing similar functionality.
For instance:
use feature 'say';
use Proc::Queue size => 10, qw(run_back);
my @pids;
for my $i ("date", "ls", "pwd", "factor 17") {
    push @pids, run_back {
        # double quotes so that $i is interpolated into the file names
        open STDOUT, '>', "/tmp/$i.out";
        open STDERR, '>', "/tmp/$i.err";
        exec $i;
    }
}
for (@pids) {
    1 while waitpid($_, 0) <= 0;
    say "process $_ exit code: ", ($? >> 8);
}