WildcardError - No values given for wildcard - Snakemake - workflow

I am at a loss as to how to fix this error.
I am running Snakemake to perform some post-alignment quality checks.
My code looks like this:
SAMPLES = ["Exome_Tumor_sorted_mrkdup_bqsr", "Exome_Norm_sorted_mrkdup_bqsr",
           "WGS_Tumor_merged_sorted_mrkdup_bqsr", "WGS_Norm_merged_sorted_mrkdup_bqsr"]
rule all:
    input:
        expand("post-alignment-qc/flagstat/{sample}.txt", sample=SAMPLES),
        expand("post-alignment-qc/CollectInsertSizeMetics/{sample}.txt", sample=SAMPLES),
        expand("post-alignment-qc/CollectAlignmentSummaryMetrics/{sample}.txt", sample=SAMPLES),
        expand("post-alignment-qc/CollectGcBiasMetrics/{sample}_summary.txt", samples=SAMPLES) # this is the problem causing line
rule flagstat:
    input:
        bam = "align/{sample}.bam"
    output:
        "post-alignment-qc/flagstat/{sample}.txt"
    log:
        err='post-alignment-qc/logs/flagstat/{sample}_stderr.err'
    shell:
        "samtools flagstat {input} > {output} 2> {log.err}"
rule CollectInsertSizeMetics:
    input:
        bam = "align/{sample}.bam"
    output:
        txt="post-alignment-qc/CollectInsertSizeMetics/{sample}.txt",
        pdf="post-alignment-qc/CollectInsertSizeMetics/{sample}.pdf"
    log:
        err='post-alignment-qc/logs/CollectInsertSizeMetics/{sample}_stderr.err',
        out='post-alignment-qc/logs/CollectInsertSizeMetics/{sample}_stdout.txt'
    shell:
        "gatk CollectInsertSizeMetrics -I {input} -O {output.txt} -H {output.pdf} 2> {log.err}"
rule CollectAlignmentSummaryMetrics:
    input:
        bam = "align/{sample}.bam",
        genome= "references/genome/ref_genome.fa"
    output:
        txt="post-alignment-qc/CollectAlignmentSummaryMetrics/{sample}.txt",
    log:
        err='post-alignment-qc/logs/CollectAlignmentSummaryMetrics/{sample}_stderr.err',
        out='post-alignment-qc/logs/CollectAlignmentSummaryMetrics/{sample}_stdout.txt'
    shell:
        "gatk CollectAlignmentSummaryMetrics -I {input.bam} -O {output.txt} -R {input.genome} 2> {log.err}"
rule CollectGcBiasMetrics:
    input:
        bam = "align/{sample}.bam",
        genome= "references/genome/ref_genome.fa"
    output:
        txt="post-alignment-qc/CollectGcBiasMetrics/{sample}_metrics.txt",
        CHART="post-alignment-qc/CollectGcBiasMetrics/{sample}_metrics.pdf",
        S="post-alignment-qc/CollectGcBiasMetrics/{sample}_summary.txt"
    log:
        err='post-alignment-qc/logs/CollectGcBiasMetrics/{sample}_stderr.err',
        out='post-alignment-qc/logs/CollectGcBiasMetrics/{sample}_stdout.txt'
    shell:
        "gatk CollectGcBiasMetrics -I {input.bam} -O {output.txt} -R {input.genome} -CHART = {output.CHART} "
        "-S {output.S} 2> {log.err}"
The error message says the following:
WildcardError in line 9 of Snakefile:
No values given for wildcard 'sample'.
File "Snakefile", line 9, in <module>
In my code above I have marked the line that causes the problem. When I simply remove this line, everything runs perfectly. I am really confused, because I pretty much copied and pasted each rule, and this is the only rule that causes any problems.
If someone could point out what I did wrong, I would be very thankful!
Cheers!

This looks like a spelling mistake: in the highlighted line you write samples=SAMPLES, but the wildcard is called {sample}, without the "s". expand() therefore receives no value for {sample}, which is what the WildcardError reports.
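The mismatch can be reproduced outside Snakemake. The following is a minimal sketch, assuming that expand() behaves roughly like Python's str.format applied to every combination of the keyword values. expand_sketch is a hypothetical stand-in written for illustration, not Snakemake's actual implementation, and the sample names are shortened.

```python
from itertools import product


def expand_sketch(pattern, **values):
    """Hypothetical stand-in for Snakemake's expand(): fill the pattern
    with every combination of the given keyword values."""
    names = list(values)
    results = []
    for combo in product(*(values[name] for name in names)):
        try:
            results.append(pattern.format(**dict(zip(names, combo))))
        except KeyError as missing:
            # Snakemake reports this situation as a WildcardError;
            # a plain ValueError is used here to stay self-contained.
            raise ValueError(f"No values given for wildcard {missing}.") from None
    return results


SAMPLES = ["Exome_Tumor", "Exome_Norm"]  # shortened names, for illustration only
pattern = "post-alignment-qc/CollectGcBiasMetrics/{sample}_summary.txt"

# Keyword matches the wildcard name: one path per sample.
good = expand_sketch(pattern, sample=SAMPLES)

# Keyword is "samples" but the pattern asks for "sample": nothing to fill in.
try:
    expand_sketch(pattern, samples=SAMPLES)
    error = None
except ValueError as exc:
    error = str(exc)

print(good)
print(error)
```

The keyword passed to expand() has to match the name inside the braces exactly; the name of the Python list on the right-hand side (SAMPLES) does not matter.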

Related

Snakemake workflow, ChildIOException or MissingInputException

I am trying to add a file renaming step to my current workflow to make it easier for some of the other users. What I want to do is take the contigs.fasta file from a SPAdes assembly directory and rename it to include the sample name (i.e., foo_de_novo/contigs.fasta to foo_de_novo/foo.fasta).
Here is my code as it currently stands:
configfile: "config.yaml"
import os
def is_file_empty(file_path):
    """ Check if file is empty by confirming if its size is 0 bytes"""
    # Check if singleton file exist and it is empty from bbrepair output
    return os.path.exists(file_path) and os.stat(file_path).st_size == 0
rule all:
    input:
        expand("{sample}_de_novo/{sample}.fasta", sample = config["names"]),
rule fastp:
    input:
        r1 = lambda wildcards: config["sample_reads_r1"][wildcards.sample],
        r2 = lambda wildcards: config["sample_reads_r2"][wildcards.sample]
    output:
        r1 = temp("clean/{sample}_r1.trim.fastq.gz"),
        r2 = temp("clean/{sample}_r2.trim.fastq.gz")
    shell:
        "fastp --in1 {input.r1} --in2 {input.r2} --out1 {output.r1} --out2 {output.r2} --trim_front1 20 --trim_front2 20"
rule bbrepair:
    input:
        r1 = "clean/{sample}_r1.trim.fastq.gz",
        r2 = "clean/{sample}_r2.trim.fastq.gz"
    output:
        r1 = temp("clean/{sample}_r1.fixed.fastq"),
        r2 = temp("clean/{sample}_r2.fixed.fastq"),
        singles = temp("clean/{sample}.singletons.fastq")
    shell:
        "repair.sh -Xmx10g in1={input.r1} in2={input.r2} out1={output.r1} out2={output.r2} outs={output.singles}"
rule spades:
    input:
        r1 = "clean/{sample}_r1.fixed.fastq",
        r2 = "clean/{sample}_r2.fixed.fastq",
        s = "clean/{sample}.singletons.fastq"
    output:
        directory("{sample}_de_novo")
    run:
        isempty = is_file_empty("clean/{sample}.singletons.fastq")
        if isempty == "False":
            shell("spades.py --careful --phred-offset 33 -1 {input.r1} -2 {input.r2} -s {input.singletons} -o {output}")
        else:
            shell("spades.py --careful --phred-offset 33 -1 {input.r1} -2 {input.r2} -o {output}")
rule rename_spades:
    input:
        "{sample}_de_novo/contigs.fasta"
    output:
        "{sample}_de_novo/{sample}.fasta"
    shell:
        "cp {input} {output}"
When I write it like this I get a MissingInputException, and when I change it to this:
rule rename_spades:
    input:
        "{sample}_de_novo"
    output:
        "{sample}_de_novo/{sample}.fasta"
    shell:
        "cp {input} {output}"
I get a ChildIOException.
I think I understand why Snakemake is unhappy with both versions. The first fails because I don't explicitly declare "{sample}_de_novo/contigs.fasta" as an output; it's just one of several files SPAdes writes. The second fails because Snakemake doesn't like how I am asking it to look into the directory. However, I am at a loss as to how to fix this.
Is there a way to ask Snakemake to look into a directory for a file and then perform the requested task?
Thank you,
Sean
EDIT: File structure of the SPAdes output
Sample_de_novo
|-corrected/
|-K21/
|-K33/
|-K55/
|-K77/
|-misc/
|-mismatch_corrector/
|-tmp/
|-assembly_graph.fastg
|-assembly_graph_with_scaffolds.gfa
|-before_rr.fasta
|-contigs.fasta
|-contigs.paths
|-dataset.info
|-input_dataset.yaml
|-params.txt
|-scaffolds.fasta
|-scaffolds.paths
|-spades.log
Make {sample}_de_novo/contigs.fasta the output of spades and parse its path to get the directory that is passed to spades -o. Snakemake won't mind if other files are created in addition to contigs.fasta. This should run in --dry-run mode:
rule all:
    input:
        expand('{sample}_de_novo/{sample}.fasta', sample=['A', 'B']),
rule spades:
    output:
        fasta='{sample}_de_novo/contigs.fasta',
    run:
        outdir=os.path.dirname(output.fasta)
        shell(f'spades ... -o {outdir}')
rule rename:
    input:
        fasta='{sample}_de_novo/contigs.fasta',
    output:
        fasta='{sample}_de_novo/{sample}.fasta',
    shell:
        r"""
        mv {input.fasta} {output.fasta}
        """
Nope, spoke too soon. It didn't name the output directory correctly, so I moved the directory to params and now it is finally working the way I wanted.
rule spades:
    input:
        r1 = "clean/{sample}_r1.fixed.fastq",
        r2 = "clean/{sample}_r2.fixed.fastq",
        s = "clean/{sample}.singletons.fastq"
    output:
        "{sample}_de_novo/contigs.fasta"
    params:
        # plain string; directory() is only meant for output declarations
        outdir = "{sample}_de_novo"
    run:
        # is_file_empty() returns a bool, so test it directly rather than
        # comparing with the string "False"; input.s is the resolved path
        if not is_file_empty(input.s):
            shell("spades.py --isolate --phred-offset 33 -1 {input.r1} -2 {input.r2} -s {input.s} -o {params.outdir}")
        else:
            shell("spades.py --isolate --phred-offset 33 -1 {input.r1} -2 {input.r2} -o {params.outdir}")
rule rename_spades:
    input:
        "{sample}_de_novo/contigs.fasta"
    output:
        "{sample}_de_novo/{sample}.fasta"
    shell:
        "cp {input} {output}"

Generate many files with wildcard, then merge into one

I have two rules in my Snakefile: one generates several sets of files using wildcards, the other merges everything into a single set of files. This is how I wrote it:
chr = range(1,23)
rule generate:
    input:
        og_files = config["tmp"] + '/chr{chr}.bgen',
    output:
        out = multiext(config["tmp"] + '/plink/chr{{chr}}',
                       '.bed', '.bim', '.fam')
    shell:
        """
        plink \
        --bgen {input.og_files} \
        --make-bed \
        --oxford-single-chr \
        --out {config[tmp]}/plink/chr{chr}
        """
rule merge:
    input:
        plink_chr = expand(config["tmp"] + '/plink/chr{chr}.{ext}',
                           chr = chr,
                           ext = ['bed', 'bim', 'fam'])
    output:
        out = multiext(config["tmp"] + '/all',
                       '.bed', '.bim', '.fam')
    shell:
        """
        plink \
        --pmerge-list-dir {config[tmp]}/plink \
        --make-bed \
        --out {config[tmp]}/all
        """
Unfortunately, this does not let Snakemake track the files from the first rule to the second rule:
$ snakemake -s myfile.smk -c1 -np
Building DAG of jobs...
MissingInputException in line 17 of myfile.smk:
Missing input files for rule merge:
[list of all the files made by expand()]
What can I use to generate the 22 sets of files with the wildcard chr in generate, while still being able to track them in the input of merge? Thank you in advance for your help.
In rule generate, I don't think you want to escape the {chr} wildcard; otherwise it doesn't get replaced. I.e.:
out = multiext(config["tmp"] + '/plink/chr{{chr}}',
               '.bed', '.bim', '.fam')
should be:
out = multiext(config["tmp"] + '/plink/chr{chr}',
               '.bed', '.bim', '.fam')
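The effect of doubled braces can be seen with plain Python string formatting. This is a sketch under the assumption that expand() fills its pattern in a str.format-like way (which is why {{chr}} is useful there to keep a wildcard intact), whereas multiext() only appends extensions and does no formatting. The prefix value is a hypothetical stand-in for config["tmp"] + '/plink/chr'.

```python
prefix = "tmp/plink/chr"  # hypothetical stand-in for config["tmp"] + '/plink/chr'

# Single braces: the placeholder is filled in.
single = (prefix + "{chr}").format(chr=1)

# Double braces: formatting collapses them to literal braces and fills nothing.
double = (prefix + "{{chr}}").format(chr=1)

# Roughly what multiext() returns: the prefix with each extension appended,
# with the braces left exactly as they were written.
extensions = [".bed", ".bim", ".fam"]
outputs = [prefix + "{chr}" + ext for ext in extensions]

print(single)
print(double)
print(outputs)
```

Since nothing strips the extra braces in the multiext() call, the output pattern of generate never lines up with the chr1.bed, chr2.bed, ... paths that merge asks for.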

Merging several vcf files using snakemake

I am trying to merge several VCF files by chromosome using Snakemake. My files are named like this and, as you can see, each covers a different coordinate range. What is the best way to merge all chr1A files and all chr1B files?
chr1A:0-2096.filtered.vcf
chr1A:2096-7896.filtered.vcf
chr1B:0-3456.filtered.vcf
chr1B:3456-8796.filtered.vcf
My pseudocode:
chromosomes=["chr1A","chr1B"]
rule all:
    input:
        expand("{sample}.vcf", sample=chromosomes)
rule merge:
    input:
        I1="path/to/file/{sample}.xxx.filtered.vcf",
        I2="path/to/file/{sample}.xxx.filtered.vcf",
    output:
        outf ="{sample}.vcf"
    shell:
        """
        java -jar picard.jar GatherVcfs I={input.I1} I={input.I2} O={output.outf}
        """
EDIT:
workdir: "/media/prova/Maxtor2/vcf2/merged/"
import subprocess
d = {"chr1A": ["chr1A:0-2096.flanking.view.filtered.vcf", "chr1A:2096-7896.flanking.view.filtered.vcf"],
     "chr1B": ["chr1B:0-3456.flanking.view.filtered.vcf", "chr1B:3456-8796.flanking.view.filtered.vcf"]}
rule all:
    input:
        expand("{sample}.vcf", sample=d)
def f(w):
    return d.get(w.chromosome, "")
rule merge:
    input:
        f
    output:
        outf ="{chromosome}.vcf"
    params:
        lambda w: "I=" + " I=".join(d[w.chromosome])
    shell:
        "java -jar /home/Documents/Tools/picard.jar GatherVcfs {params[0]} O={output.outf}"
I was able to reproduce your bug. When constraining the wildcards, it works:
d = {"chr1A": ["chr1A:0-2096.flanking.view.filtered.vcf", "chr1A:2096-7896.flanking.view.filtered.vcf"],
     "chr1B": ["chr1B:0-3456.flanking.view.filtered.vcf", "chr1B:3456-8796.flanking.view.filtered.vcf"]}
chromosomes = list(d)
rule all:
    input:
        expand("{sample}.vcf", sample=chromosomes)
# these tell Snakemake exactly what values the wildcards may take
# we use "|" to create the regex chr1A|chr1B
wildcard_constraints:
    chromosome = "|".join(chromosomes)
rule merge:
    input:
        # a lambda is an unnamed function
        # the first argument is the wildcards
        # we merely use it to look up the appropriate files in the dict d
        lambda w: d[w.chromosome]
    output:
        outf = "{chromosome}.vcf"
    params:
        # here we create the string
        # "I=chr1A:0-2096.flanking.view.filtered.vcf I=chr1A:2096-7896.flanking.view.filtered.vcf"
        # for use in our command
        lambda w: "I=" + " I=".join(d[w.chromosome])
    shell:
        "java -jar /home/Documents/Tools/picard.jar GatherVcfs {params[0]} O={output.outf}"
It should have worked without the constraints too; this seems like a bug in Snakemake.
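The two string manipulations used in this answer can be checked in plain Python. This is only a sketch: gather_args is a hypothetical helper name, and the regex check illustrates what the constraint allows, which may be why it helps (without it, a pattern like {chromosome}.vcf could also match the per-interval input files).

```python
import re

d = {
    "chr1A": ["chr1A:0-2096.flanking.view.filtered.vcf",
              "chr1A:2096-7896.flanking.view.filtered.vcf"],
    "chr1B": ["chr1B:0-3456.flanking.view.filtered.vcf",
              "chr1B:3456-8796.flanking.view.filtered.vcf"],
}
chromosomes = list(d)

# Same expression as in wildcard_constraints: the regex "chr1A|chr1B".
constraint = "|".join(chromosomes)


def gather_args(chromosome):
    """Hypothetical helper: build the repeated I=... arguments for GatherVcfs."""
    return "I=" + " I=".join(d[chromosome])


# The constraint accepts the chromosome names themselves ...
accepts_chromosome = re.fullmatch(constraint, "chr1A") is not None
# ... but not the stem of a per-interval input file.
accepts_interval = re.fullmatch(constraint, "chr1A:0-2096.flanking.view.filtered") is not None

print(constraint)
print(gather_args("chr1A"))
print(accepts_chromosome, accepts_interval)
```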

If matched then print all using awk

I have a file which contains many sub-sections each starting with [begin] and ending with [end]:
[begin li1_1378184738754_91]
header=7075|lime|0|0|109582|0|1|2700073||0|0|0|[355]|1|0|ssb-li1-1378184738754-90||0||LIME |0|saved=true|0.002406508312038836|0|[ser=zu1:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=uzu6:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=xzs5:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=sv-stda-zu3:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=hzu8:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=lzu3:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=yzu2:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=xzu7:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer]|0|null|false|40||false|
attrs=0|0|0||0|
ptitle=690751404|1|1|1|Rest:1998636||||||2700401|175619|900.5636134725806|0.985486|39.166666666666664|$9.99|100.0|1|||
seller=1998636|1|9.99|1|-1||0|||||true||4.7937584|10412|false|
ptitle=5543369186|2|1|1|Rest:1533891||||||2700211|19615|886.8211044369053|0.776121|34.0|$119.99|100.0|1|||
seller=1533891|1|119.99|3|-1|1.0:text,In+size+6.0%2C7.0%2C8.0%2C8.5%2C9.0%2C9.5%2C10.0%2C...,0.0,,,,0,0,|2|||||true||2.95|20|true|
ptitle=622529158|3|1|1|||||||2700408|67402|796.5289827432475|0.893899|63.0|$5.27|100.0|1|||
seller=4281413|1|5.27|1|-1||0|||||true||4.695052|1769|true|
ptitle=5507199621|4|1|1|||||||2700220|56412|706.9031281251306|0.791171|45.0|$99.99|100.0|1|||
seller=4806107|1|-1.0|1|-1|1.0:sale,$,30.000000000000014,0.0,,,,0,0,:text,In+size+6.0%2C6.5%2C7.0%2C7.5%2C8.0%2C8.5%2C9.0%2C9...,0.0,,,,0,0,|2||||$130 $30.00 off|false||5.0|1|false|
ptitle=5502728013|5|1|1|||||||900000|0|698.7772340643119|0.836740|75.0|$40.95|100.0|1|||
seller=955448|1|40.95|1|-1||0|||||false||4.142857|7|false|
ptitle=840662011|6|1|1|Rest:265238||||||300233|62718|683.2927820751431|0.995513|52.0|$22.95|100.0|1|||
seller=265238|1|22.95|1|-1||0|||||false||4.478261|23|false|
ptitle=848084980|8|1|1|||||||2700073|145653|670.4809846773688|0.880587|60.0|$24.99|100.0|1|||
seller=5267046|1|24.99|1|-1||0|||||true||0.0|0|false|
ptitle=891200492|9|1|1|Rest:1030132||||||2701003|17215|668.8437575254773|0.825491|32.0|$519.99|100.0|1|||
seller=1030132|1|519.99|1|-1||0|||||false||4.7391305|23|false|
ptitle=641974054|10|1|1|||||||900000|69433|667.6678790058678|0.752129|57.0|$4.19|100.0|1|||
seller=3365158|1|4.19|1|-1||0|||||true||4.70907|4410|true|
ptitle=517591869|12|1|1|Rest:4802895||||||2700408|127644|643.0972570735605|0.893899|17.25|$23.95|100.0|1|||
seller=4318776|1|-1.0|3|-1||0|||||false||0.0|0|false|
ptitle=541549480|13|1|1|Rest:1180414||||||2702000|105832|597.4904572011968|0.752129|24.666666666666664|$8.27|100.0|1|||
seller=4636561|1|8.27|1|-1||0|||||false||4.8283377|734|true|
ptitle=1020561900|14|1|1|||||||2700063|159813|594.4717491579845|0.934869|75.0|$5.39|100.0|1|||
seller=4722645|1|5.39|1|-1|1.0:sale,$,0.6000000000000005,0.0,,,,0,0,:text,Free+Shipping+on+All+Orders%21,0.0,201301010000/,,,0,0,|2||||$5.99 $0.60 off|true||4.3942246|1593|true|
ptitle=507792308|15|1|1|Rest:4683455||||||2702000|105832|591.7739184402442|0.768311|22.5|$9.48|100.0|1|||
seller=4910651|1|-1.0|2|-1||0|||||false||5.0|1|false|
ptitle=1090571346|16|1|1|Rest:4452919||||||2700211|20824|776.4814913363535|0.776121|35.0|$59.99|100.0|1|||
seller=1533891|1|59.99|1|-1|1.0:sale,$,49.99999999999999,0.0,,,,0,0,:text,In+size+7.5%2C8.0%2C8.5%2C9.0%2C9.5%2C10.0%2C10.5...,0.0,,,,0,0,|2||||$110 $50.00 off|true||2.95|20|true|
ptitle=573017390|17|1|1|||||||2700073|91937|679.695660577044|0.880587|33.5|$14.85|100.0|1|||
seller=4281413|1|14.85|1|-1||0|||||true||4.695052|1769|true|
ptitle=5502723300|18|1|1|||||||900000|0|639.3095640940136|0.836740|75.0|$50.95|100.0|1|||
seller=955448|1|50.95|1|-1||0|||||false||4.142857|7|false|
ptitle=940022974|20|1|1|||||||2700600|58701|569.9503499778303|0.875839|59.0|$14.40|100.0|1|||
seller=4825227|1|14.4|1|12||0|||||true||4.0289855|276|true|
ptitle=5513277553|21|1|1|||||||2700220|56412|565.2712749001105|0.776121|44.33333333333333|$129.95|100.0|1|||
seller=4825252|1|129.95|1|23||0|||||true||4.0289855|276|true|
ptitle=890329961|22|1|1|||||||2700408|133796|564.7642425785796|0.837916|34.75|$61.95|100.0|1|||
seller=4825235|1|61.95|4|19||0|||||true||4.0289855|276|true|
ptitle=753852910|24|1|1|||||||2700073|146738|557.7419123688652|0.934869|47.69230769230769|$26.99|100.0|1|||
seller=4722645|1|26.99|10|-1|1.0:sale,$,3.0,0.0,,,,0,0,:text,Free+Shipping+on+All+Orders%21,0.0,201301010000/,,,0,0,|2||||$29.99 $3.00 off|true||4.3942246|1593|true|
ptitle=654738989|26|1|1|||||||900000|84012|554.7756559595525|0.752129|57.0|$3.19|100.0|1|||
seller=3365158|1|3.19|1|-1||0|||||true||4.70907|4410|true|
ptitle=707747307|27|1|1|Rest:4736009||||||2700063|76249|552.234395428327|0.889614|19.857142857142854|$6.39|100.0|1|||
seller=4736009|1|6.39|1|-1||0|||||false||4.8071113|15356|true|
ptitle=63531001|28|1|1|||||||2700408|82712|625.0421885589608|0.893899|47.166666666666664|$7.69|100.0|1|||
seller=4281413|1|7.69|3|-1||0|||||true||4.695052|1769|true|
ptitle=5502728016|29|1|1|||||||900000|0|605.9895507237038|0.836740|75.0|$503.00|100.0|1|||
seller=955448|1|503.0|1|-1||0|||||false||4.142857|7|false|
ptitle=507792308|31|1|1|Rest:4683455||||||2702000|105832|559.6902659046442|0.752129|22.5|$8.99|100.0|1|||
seller=5105812|1|-1.0|1|-1||0|||||false||0.0|0|false|
ptitle=753852910|32|1|1|||||||2700073|146738|545.9987095658629|0.870929|47.69230769230769|$22.49|100.0|1|||
seller=4143386|1|22.49|6|-1|1.0:sale,$,7.5,0.0,,,,0,0,:text,Free+Shipping+on+Orders+Over+%24100,0.0,201109010000/201409302359,,,0,0,|2||||$29.99 $7.50 off|false||4.7316346|2355|true|
ptitle=5513277553|33|1|1|Rest:1533891||||||2700220|56412|653.3133907916089|0.825491|44.33333333333333|$149.99|100.0|1|||
seller=1533891|1|149.99|3|-1|1.0:text,In+size+5.0%2C5.5%2C6.0%2C6.5%2C7.0%2C7.5%2C8.0%2C8...,0.0,,,,0,0,|2|||||true||2.95|20|true|
ptitle=63531001|34|1|1|||||||2700408|82712|541.8233547780552|0.893899|47.166666666666664|$7.72|100.0|1|||
seller=2370155|1|7.72|4|-1||0|||||false||4.85|40|false|
ptitle=1018957017|35|1|1|||||||2700073|145653|540.6093714604533|0.860614|56.0|$25.95|100.0|1|||
seller=5036683|1|25.95|1|-1||0|||||false||4.8405056|366|false|
ptitle=743682867|36|1|1|||||||2700073|63437|539.5985846455641|0.870929|58.0|$46.99|100.0|1|||
seller=193176|1|46.99|1|-1||0|||||true||4.8511987|1418|true|
ptitle=679858288|37|1|1|||||||2700063|188669|535.1360632897284|0.902031|30.0|$12.41|100.0|1|||
seller=4143386|1|12.41|2|-1|1.0:sale,$,1.379999999999999,0.0,,,,0,0,:text,Free+Shipping+on+Orders+Over+%24100,0.0,201109010000/201409302359,,,0,0,|2||||$13.79 $1.38 off|false||4.7316346|2355|true|
ptitle=994328713|38|1|1|||||||2700073|71463|534.7715925279717|0.870929|58.0|$1.29|100.0|1|||
seller=1787388|1|1.29|1|-1||0|||||false||4.680464|3624|false|
ptitle=886915818|40|1|1|||||||2700444|201835|529.7519801432289|0.934869|65.5|$44.99|100.0|1|||
seller=4561883|1|44.99|2|-1||0|||||true||4.7913384|508|false|
seller_hidden=227502|990765963|1147436601|-1
seller_hidden=5310958|622529158|5645627277|-1
seller_hidden=4825254|5543369186|5651114316|23
seller_hidden=5289138|5548930281|5653769481|-1
[end li1_1378184738754_91]
I am trying to run this command:
cat /home/nextag/logs/OutpdirImpressions.log.2013-09-02 | awk -F "$begin" '{print $0}' | awk '$0 ~ "header=7075" {print $0}'
With this command I want to split the file into sub-sections, each starting at the word 'begin', and then keep only those sub-sections that contain 'header=7075'.
The expected output is the entire sub-section (for every sub-section containing that string), but I only get this portion as output:
header=7075|lime|0|0|109582|0|1|2700073||0|0|0|[355]|1|0|ssb-li1-1378184738754-90||0||LIME
|0|saved=true|0.002406508312038836|0|[ser=zu1:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=uzu6:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=xzs5:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=sv-stda-zu3:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=hzu8:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=lzu3:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=yzu2:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer][ser=xzu7:mtu=model_other_20120806calibex.csv:mu=model_other_20120806calibex.csv:scorerClassUsed=LinearPersonalizedProductSearchScorer]|0|null|false|40||false|
I have tried using if in various ways, but it doesn't work.
I also tried awk -F "$begin" '{if($0 ~ "header=7075") {print $0}}' /home/nextag/logs/OutpdirImpressions.log.2013-09-02, which gave the same result.
Can somebody please suggest how I can get the complete sub-section in the result?
Try this awk one-liner:
awk '$1=="[end"{p=0}/^header=7075/{p=1}p' file
In parts:
$1=="[end"{p=0}: when you reach a line whose first word is "[end", set the flag to zero.
/^header=7075/{p=1}: when you reach a line that begins with "header=7075", set the flag to one.
p: if the flag is non-zero, print the current line (equivalent to p{print}, p{print $0} or p!=0{print $0}).
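The same flag logic can be written out step by step in Python, which may make the order of the three checks easier to follow. This is a sketch, not a replacement for the one-liner; filter_sections and the shortened sample lines are made up for illustration.

```python
def filter_sections(lines, header_prefix="header=7075"):
    """Mimic the awk one-liner: keep lines from a matching header line
    up to (but not including) the next line whose first word is "[end"."""
    printing = False
    kept = []
    for line in lines:
        fields = line.split()
        # $1=="[end"{p=0}
        if fields and fields[0] == "[end":
            printing = False
        # /^header=7075/{p=1}
        if line.startswith(header_prefix):
            printing = True
        # p
        if printing:
            kept.append(line)
    return kept


sample = [
    "[begin li1_1]",
    "header=7075|lime|0",
    "ptitle=690751404|1",
    "[end li1_1]",
    "[begin li1_2]",
    "header=1234|other|0",
    "ptitle=5543369186|2",
    "[end li1_2]",
]

result = filter_sections(sample)
print(result)
```

As with the awk command, the [begin ...] and [end ...] marker lines themselves are not part of the output, because the flag is only switched on at the header line and is switched off before the [end line is printed.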

Shell variable name queried from Matlab has additional character

I'm working with the following script, run_test:
#!/bin/sh
temp=$1;
cat <<EOF | matlab
[status name] = unix('echo $temp');
disp(name);
% some Matlab code
test_complete = 1;
save(name)
exit
EOF
I want to pass a name to the script, run some code then save a .mat file with the name that was passed. However, there is a curious piece of behavior:
[energon2] ~ $ ./run_test 'run1'
Warning: No display specified. You will not be able to display graphics on the screen.
< M A T L A B (R) >
Copyright 1984-2010 The MathWorks, Inc.
Version 7.12.0.635 (R2011a) 64-bit (glnxa64)
March 18, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
>> >> >> >> run1
>> >> >> >> >>
[energon2] ~ $ ls *.mat
run1?.mat
There is a "?" at the end of the file name when it's saved, but not when displayed on command line. This is acceptable for my needs, but a bit irritating to not know why it's occurring. Any explanation would be appreciated.
Edits, solution:
Yuk was correct below in the underlying cause and the use of save('$temp'). I'm now using the following script
#!/bin/sh
temp=$1;
cat <<EOF | matlab
% some Matlab code
test_complete = 1;
save('$temp')
exit
EOF
Thanks for the help.
Your name variable has an end-of-line character as its last character. When you run echo run1 in Unix, the command displays run1 and then "hits enter". In your script, all of the output of echo is saved to the name variable.
You can confirm it with the following:
>> format compact
>> [status, name] = unix('echo run1')
status =
0
name =
run1
>> numel(name)
ans =
5
>> int8(name(end))
ans =
10
>> int8(sprintf('\n'))
ans =
10
Apparently this character can be part of a file name in Unix, but ls displays it as ?.
Can't you do save('$temp') instead? The shell expands $temp inside the here-document before MATLAB reads it, and the quotes make MATLAB treat the result as a string.
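The trailing newline is not specific to MATLAB; any program that captures the output of echo sees it. Below is a small Python sketch of the same check, assuming a Unix-like system where echo is available as an executable. Stripping the newline (in MATLAB, strtrim(name) would be the analogous step) gives a clean name.

```python
import subprocess

# Capture everything echo writes, just as unix('echo run1') does in MATLAB.
result = subprocess.run(["echo", "run1"], capture_output=True, text=True, check=True)
name = result.stdout

# The captured text is one character longer than "run1": a trailing line feed.
print(repr(name), len(name), ord(name[-1]))

# Removing the trailing newline yields a name that is safe to use as a file name.
cleaned = name.rstrip("\n")
print(repr(cleaned), len(cleaned))
```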