Is there any way to get the diff tool to report only the relative path of files in a recursive diff?

Is there any way to get the diff tool to report only the relative path of files in a recursive diff? - command-line

I'm trying to get a unified diff between many pairs of directories so I can ensure the comparison between pairs is consistent, and I want to know if there's a way to get diff to format the output with relative rather than absolute paths.
Right now if I use diff -r -u PATH1 PATH2 then I get this kind of output:
diff -r -u PATH1/some/subfile.txt PATH2/some/subfile.txt
--- PATH1/some/subfile.txt Tue Feb 07 09:16:31 2017
+++ PATH2/some/subfile.txt Tue Feb 07 09:16:32 2017
## -70,7 +70,7 ##
*
* some stuff
*
- * I am Tweedledee and you are not
+ * I am Tweedledum and you are not
*/
void twiddle(void)
{
## -88,7 +88,7 ##
* check whether we should destroy everything
* and then destroy everything in either case
*/
-inline static void Tweedledee(void)
+inline static void Tweedledum(void)
{
if (should_destroy_everything())
{
I would rather get just the relative paths... is there any way to get diff to do this? example:
diff -r -u PATH1/some/subfile.txt PATH2/some/subfile.txt
--- some/subfile.txt
+++ some/subfile.txt
## -70,7 +70,7 ##
*
* some stuff
*
- * I am Tweedledee and you are not
+ * I am Tweedledum and you are not
*/
void twiddle(void)
{
## -88,7 +88,7 ##
* check whether we should destroy everything
* and then destroy everything in either case
*/
-inline static void Tweedledee(void)
+inline static void Tweedledum(void)
{
if (should_destroy_everything())
{
This would make it easier to compare diff reports which are expected to be the same. (in my case PATH1 and PATH2 differ in each case, whereas the relative paths to files, and the exact content differences are the same)
Otherwise I have to filter this information out (either manually or with a script)

I would pipe the output of your diff command to a sed script something like this:
$ diff -r -u PATH1/some/subfile.txt PATH2/some/subfile.txt | sed '1s/PATH1\///' | sed '2s/PATH2\///'
The script says": on line 1, replace "PATH1", followed by a single forward slash, by nothing, then, on line 2, replace "PATH2", followed by a single forward slash, by nothing. I'd have to create some content to test it, so I haven't tested it.

I bit the bullet and did this parsing in Python; it removes the diff blah blah blah statements and relativizes the paths to a pair of specified root directories, also removing timestamps:
udiff_re = re.compile(r'^## -(\d+),(\d+) \+(\d+),(\d+) ##$')
diffitem_re = re.compile(r'^(\+\+\+|---) (.*)\s+.{7} \d\d \d\d:\d\d:\d\d \d{4}$')
def process_diff_output(output, dir1, dir2):
state = 0
lines_pending = [0,0]
result = []
for line in output.splitlines():
if state == 0:
if line.startswith('##'):
m = udiff_re.search(line)
if m:
nums = [int(n) for n in m.groups()]
else:
raise ValueError('Huh?\n' + line)
result.append(line)
lines_pending = [nums[1],nums[3]]
state = 1
elif line.startswith('--- ') or line.startswith('+++ '):
m = diffitem_re.search(line)
whichone = m.group(1)
filename = m.group(2)
dirx = dir1 if whichone == '---' else dir2
result.append('%s %s' % (whichone, os.path.relpath(filename, dirx)))
elif line.startswith('diff '):
pass # leave the diff cmd out
else:
raise ValueError('unknown header line\n'+line)
elif state == 1:
result.append(line)
if line.startswith('+'):
lines_pending[1] -= 1
elif line.startswith('-'):
lines_pending[0] -= 1
else:
lines_pending[0] -= 1
lines_pending[1] -= 1
if lines_pending == [0,0]:
state = 0
return '\n'.join(result)

Related

How do I adjust the number of spaces ipython uses for indentation in the terminal?

I use ipython in a terminal (NOT in a notebook), and by default it autoindents with 4 spaces.
How do I change the number of automatically-inserted spaces?

Number of spaces inserted by the TAB key
Assuming you are on Linux, you can locate your ipython installation directory with:
which ipython
It will return you a path which ends in /bin/ipython. Change directory to that path without the ending part /bin/ipython.
Then locate the shortcuts.py file where the indent buffer is defined:
find ./ -type f -name "shortcuts.py"
And in that file, replace 4 in the below function by 2:
def indent_buffer(event):
event.current_buffer.insert_text(' ' * 4)
Unfortunately, the 4 above is not exposed as a configuration, so we currently have to edit each ipython installation. That's cumbersome when working with many environments.
Number of spaces inserted by autoindent
Visit /path/to/your/IPython/core/inputtransformer2.py and modify two locations where the number of spaces is hard-coded as 4:
diff --git a/IPython/core/inputtransformer2.py b/IPython/core/inputtransformer2.py
index 37f0e7699..7f6f4ddb7 100644
--- a/IPython/core/inputtransformer2.py
+++ b/IPython/core/inputtransformer2.py
## -563,6 +563,7 ## def show_linewise_tokens(s: str):
# Arbitrary limit to prevent getting stuck in infinite loops
TRANSFORM_LOOP_LIMIT = 500
+INDENT_SPACES = 2 # or whatever you prefer!
class TransformerManager:
"""Applies various transformations to a cell or code block.
## -744,7 +745,7 ## def check_complete(self, cell: str):
ix += 1
indent = tokens_by_line[-1][ix].start[1]
- return 'incomplete', indent + 4
+ return 'incomplete', indent + INDENT_SPACES
if tokens_by_line[-1][0].line.endswith('\\'):
return 'incomplete', None
## -778,7 +779,7 ## def find_last_indent(lines):
m = _indent_re.match(lines[-1])
if not m:
return 0
- return len(m.group(0).replace('\t', ' '*4))
+ return len(m.group(0).replace('\t', ' '*INDENT_SPACES))
class MaybeAsyncCompile(Compile):

Snakemake workflow, ChildIOException or MissingInputException

I am trying to add a file renaming step in my current workflow to make it easier on some of the other users. What I want to do is take the contigs.fasta file from a spades assembly directory and rename it to include the sample name. (i.e foo_de_novo/contigs.fasta to foo_de_novo/foo.fasta)
here is my code... well currently.
configfile: "config.yaml"
import os
def is_file_empty(file_path):
""" Check if file is empty by confirming if its size is 0 bytes"""
# Check if singleton file exist and it is empty from bbrepair output
return os.path.exists(file_path) and os.stat(file_path).st_size == 0
rule all:
input:
expand("{sample}_de_novo/{sample}.fasta", sample = config["names"]),
rule fastp:
input:
r1 = lambda wildcards: config["sample_reads_r1"][wildcards.sample],
r2 = lambda wildcards: config["sample_reads_r2"][wildcards.sample]
output:
r1 = temp("clean/{sample}_r1.trim.fastq.gz"),
r2 = temp("clean/{sample}_r2.trim.fastq.gz")
shell:
"fastp --in1 {input.r1} --in2 {input.r2} --out1 {output.r1} --out2 {output.r2} --trim_front1 20 --trim_front2 20"
rule bbrepair:
input:
r1 = "clean/{sample}_r1.trim.fastq.gz",
r2 = "clean/{sample}_r2.trim.fastq.gz"
output:
r1 = temp("clean/{sample}_r1.fixed.fastq"),
r2 = temp("clean/{sample}_r2.fixed.fastq"),
singles = temp("clean/{sample}.singletons.fastq")
shell:
"repair.sh -Xmx10g in1={input.r1} in2={input.r2} out1={output.r1} out2={output.r2} outs={output.singles}"
rule spades:
input:
r1 = "clean/{sample}_r1.fixed.fastq",
r2 = "clean/{sample}_r2.fixed.fastq",
s = "clean/{sample}.singletons.fastq"
output:
directory("{sample}_de_novo")
run:
isempty = is_file_empty("clean/{sample}.singletons.fastq")
if isempty == "False":
shell("spades.py --careful --phred-offset 33 -1 {input.r1} -2 {input.r2} -s {input.singletons} -o {output}")
else:
shell("spades.py --careful --phred-offset 33 -1 {input.r1} -2 {input.r2} -o {output}")
rule rename_spades:
input:
"{sample}_de_novo/contigs.fasta"
output:
"{sample}_de_novo/{sample}.fasta"
shell:
"cp {input} {output}"
When I have it written like this I get the MissingInputError and when I change it to this.
rule rename_spades:
input:
"{sample}_de_novo"
output:
"{sample}_de_novo/{sample}.fasta"
shell:
"cp {input} {output}"
I get the ChildIOException
I feel I understand why snakemake is unhappy with both versions. The first one is becasue I don't explicitly output the "{sample}_de_novo/contigs.fasta" file. Its just one of several files spades outputs. And the other error is because it doesn't like how I am asking it to look into the directory. I however am at a loss on how to fix this.
Is there a way to ask snakmake to look into a directory for a file and then perform the task requested?
Thank you,
Sean
EDIT File Structure of Spades output
Sample_de_novo
|-corrected/
|-K21/
|-K33/
|-K55/
|-K77/
|-misc/
|-mismatch_corrector/
|-tmp/
|-assembly_graph.fastg
|-assembly_graph_with_scaffolds.gfa
|-before_rr.fasta
|-contigs.fasta
|-contigs.paths
|-dataset.info
|-input_dataset.ymal
|-params.txt
|-scaffolds.fasta
|-scaffolds.paths
|spades.log

Make {sample}_de_novo/contigs.fasta to be the output of spades and parse its path to get the directory that will be the argument to spades -o. Snakemake won't mind if there are other files created in addition to contigs.fasta. This should run --dry-run mode:
rule all:
input:
expand('{sample}_de_novo/{sample}.fasta', sample=['A', 'B']),
rule spades:
output:
fasta='{sample}_de_novo/contigs.fasta',
run:
outdir=os.path.dirname(output.fasta)
shell(f'spades ... -o {outdir}')
rule rename:
input:
fasta='{sample}_de_novo/contigs.fasta',
output:
fasta='{sample}_de_novo/{sample}.fasta',
shell:
r"""
mv {input.fasta} {output.fasta}
"""

Nope, spoke too soon. It didn't name the output directory correctly, so I moved it to the params and, now, finailly is working the way I wanted.
rule spades:
input:
r1 = "clean/{sample}_r1.fixed.fastq",
r2 = "clean/{sample}_r2.fixed.fastq",
s = "clean/{sample}.singletons.fastq"
output:
"{sample}_de_novo/contigs.fasta"
params:
outdir = directory("{sample}_de_novo/")
run:
isempty = is_file_empty("clean/{sample}.singletons.fastq")
if isempty == "False":
shell("spades.py --isolate --phred-offset 33 -1 {input.r1} -2 {input.r2} -s {input.singletons} -o {params.outdir}")
else:
shell("spades.py --isolate --phred-offset 33 -1 {input.r1} -2 {input.r2} -o {params.outdir}")
rule rename_spades:
input:
"{sample}_de_novo/contigs.fasta"
output:
"{sample}_de_novo/{sample}.fasta"
shell:
"cp {input} {output}"

Save job output from SDSF into a PDS and using ISPF functions in REXX

We periodically runs jobs and we need to save the output into a PDS and then parse the output to extract parts of it to save into another member. It needs to be done by issuing a REXX command using the percent sign and the REXX member name as an SDSF command line. I've attempted to code a REXX to do this, but it is getting an error when trying to invoke an ISPF service, saying the ISPF environment has not been established. But, this is SDSF running under ISPF.
My code has this in it (copied from several sources and modified):
parse arg PSDSFPARMS "(" PUSERPARMS
parse var PSDSFPARMS PCURRPNL PPRIMPNL PROWTOKEN PPRIMCMD .
PRIMCMD=x2c(PPRIMCMD)
RC = isfquery()
if RC <> 0 then
do
Say "** SDSF environment does not exist, exec ending."
exit 20
end
RC = isfcalls("ON")
Address SDSF "ISFGET" PPRIMPNL "TOKEN('"PROWTOKEN"')" ,
" (" VERBOSE ")"
LRC = RC
if LRC > 0 then
call msgrtn "ISFGET"
if LRC <> 0 then
Exit 20
JOBNAME = value(JNAME.1)
JOBNBR = value(JOBID.1)
SMPDSN = "SMPE.*.OUTPUT.LISTINGS"
LISTC. = ''
SMPODSNS. = ''
SMPODSNS.0 = 0
$ = outtrap('LISTC.')
MSGVAL = msg('ON')
address TSO "LISTC LVL('"SMPDSN"') ALL"
MSGVAL = msg(MSGVAL)
$ = outtrap('OFF')
do LISTCi = 1 to LISTC.0
if word(LISTC.LISTCi,1) = 'NONVSAM' then
do
parse var LISTC.LISTCi . . DSN
SMPODSNS.0 = SMPODSNS.0 + 1
i = SMPODSNS.0
SMPODSNS.i = DSN
end
IX = pos('ENTRY',LISTC.LISTCi)
if IX <> 0 then
do
IX = pos('NOT FOUND',LISTC.LISTCi,IX + 8)
if IX <> 0 then
do
address ISPEXEC "SETMSG MSG(IPLL403E)"
EXITRC = 16
leave
end
end
end
LISTC. = ''
if EXITRC = 16 then
exit 0
address ISPEXEC "TBCREATE SMPDSNS NOWRITE" ,
"NAMES(TSEL TSMPDSN)"
I execute this code by typing %SMPSAVE next to the spool output line on the "H" SDSF panel and it runs fine until it gets to this point in the REXX:
114 *-* address ISPEXEC "TBCREATE SMPDSNS NOWRITE" ,
"NAMES(TSEL TSMPDSN)"
>>> "TBCREATE SMPDSNS NOWRITE NAMES(TSEL TSMPDSN)"
ISPS118S SERVICE NOT INVOKED. A VALID ISPF ENVIRONMENT DOES NOT EXIST.
+++ RC(20) +++
Does anyone know why it says I don't have a valid ISPF environment and how I can get around this?
I've done quite a bit in the past with REXX, including writing REXX code to handle line commands, but this is the first time I've tried to use ISPEXEC commands within this code.
Thank you,
Alan

Generate many files with wildcard, then merge into one

I have two rules on my Snakefile: one generates several sets of files using wildcards, the other one merges everything into a single file. This is how I wrote it:
chr = range(1,23)
rule generate:
input:
og_files = config["tmp"] + '/chr{chr}.bgen',
output:
out = multiext(config["tmp"] + '/plink/chr{{chr}}',
'.bed', '.bim', '.fam')
shell:
"""
plink \
--bgen {input.og_files} \
--make-bed \
--oxford-single-chr \
--out {config[tmp]}/plink/chr{chr}
"""
rule merge:
input:
plink_chr = expand(config["tmp"] + '/plink/chr{chr}.{ext}',
chr = chr,
ext = ['bed', 'bim', 'fam'])
output:
out = multiext(config["tmp"] + '/all',
'.bed', '.bim', '.fam')
shell:
"""
plink \
--pmerge-list-dir {config[tmp]}/plink \
--make-bed \
--out {config[tmp]}/all
"""
Unfortunately, this does not allow me to track the file coming from the first rule to the 2nd rule:
$ snakemake -s myfile.smk -c1 -np
Building DAG of jobs...
MissingInputException in line 17 of myfile.smk:
Missing input files for rule merge:
[list of all the files made by expand()]
What can I use to be able to generate the 22 sets of files with the wildcard chr in generate, but be able to track them in the input of merge? Thank you in advance for your help

In rule generate I think you don't want to escape the {chr} wildcard, otherwise it doesn't get replaced. I.e.:
out = multiext(config["tmp"] + '/plink/chr{{chr}}',
'.bed', '.bim', '.fam')
should be:
out = multiext(config["tmp"] + '/plink/chr{chr}',
'.bed', '.bim', '.fam')

bitbake patching is failing

I am trying to add few checks in watchdog.c file in below package, but getting patch failure error as below, manual patch works fine without any issue, also patch is created from the file of this pkg only so there is no issue with version mismatch or older version.
https://sourceforge.net/projects/watchdog/files/watchdog/5.13/
ERROR: Command Error: exit status: 1 Output:
Applying patch watchdog.patch
patching file src/watchdog.c
Hunk #2 succeeded at 948 with fuzz 1.
Hunk #3 FAILED at 1057.
1 out of 3 hunks FAILED -- rejects in file src/watchdog.c
Patch watchdog.patch does not apply (enforce with -f)
ERROR: Function failed: patch_do_patch
Tried to search the answers, but
Bitbake recipe not applying patch as expected
Why does this patch applied with a fuzz of 1, and fail with fuzz of 0?
Hunk #1 FAILED at 1. What's that mean?
Above posts mention about pkg or improper patch issue, which doesnt look to be for my pkg, requesting for help on this.
diff --git a/src/watchdog.c b/src/watchdog.c
index 8a09632..bb4a189 100644
--- a/src/watchdog.c
+++ b/src/watchdog.c
## -570,6 +570,7 ##
struct stat s;
struct icmp_filter filt;
filt.data = ~(1<<ICMP_ECHOREPLY);
+ int check = FALSE;
#if USE_SYSLOG
char *opts = "d:i:n:Ffsvbql:p:t:c:r:m:a:";
## -947,6 +948,11 ##
(void) fclose(fp);
}
+ check = check_product();
+ if(check == TRUE)
+ {
+ setup_files();
+ }
/* set signal term to set our run flag to 0 so that */
/* we make sure watchdog device is closed when receiving SIGTERM */
signal(SIGTERM, sigterm_handler);
## -1051,6 +1057,11 ##
do_check2(check_bin(act->name, timeout, 1), act->name, rbinary, NULL);
#endif
+ if(check == TRUE)
+ {
+ do_check(test_files(), repair_bin, NULL);
+ }
+
/* finally sleep some seconds */
usleep(tint * 500000); /* this should make watchdog sleep tint seconds alltogther */
/* sleep(tint); */
.bbappend file
# changes for patching the disk IO check
FILESEXTRAPATHS_prepend := "${THISDIR}/${PN}-${PV}:"
FILESEXTRAPATHS_append := "${THISDIR}/files"
SRC_URI += " \
file://watchdog.patch \
"
Above "SRC_URI" & "SRC_URI_machineoverride" tried one-by-one and did not work
SRC_URI_machineoverride += "file://watchdog.patch"

The problem may be that you have file://watchdog.patch twice in your SRC_URI, and bitbake attempts to apply it twice. Thus the second fails.
Try removing the SRC_URI_machineoverride line.

your current patch file is mismatching with the source found in https://sourceforge.net/projects/watchdog/files/watchdog/5.13/
I have added changes manually to file from https://sourceforge.net/projects/watchdog/files/watchdog/5.13/ and patch is as below,
--- a/src/watchdog.c 2020-07-02 13:57:45.771971643 +0200
+++ b/src/watchdog.c 2020-07-02 13:57:55.979985941 +0200
## -567,6 +567,7 ##
pid_t child_pid;
int oom_adjusted = 0;
struct stat s;
+ int check = FALSE
#if USE_SYSLOG
char *opts = "d:i:n:Ffsvbql:p:t:c:r:m:a:";
## -1053,6 +1054,10 ##
do_check2(check_bin(act->name, timeout, 1), act->name, rbinary, NULL);
#endif
+ if(check == TRUE)
+ {
+ do_check(test_files(), repair_bin, NULL);
+ }
/* finally sleep some seconds */
usleep(tint * 500000); /* this should make watchdog sleep tint seconds alltogther */
/* sleep(tint); */

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Is there any way to get the diff tool to report only the relative path of files in a recursive diff? - command-line

Related

How do I adjust the number of spaces ipython uses for indentation in the terminal?

Snakemake workflow, ChildIOException or MissingInputException

Save job output from SDSF into a PDS and using ISPF functions in REXX

Generate many files with wildcard, then merge into one

bitbake patching is failing

Categories

Resources