Comparing baseline differences including file contents - diff

I know that I can use the diffbl command to compare two baselines. But how can I also ask the command to print out the diffs of all the files that were different?
Is this even possible or do I need to write a script or something? Any pointers?

The closest you can get with cleartool alone is cleartool diffbl -ver
For each version, you can make a cleartool diff -dif -pred <aversion> in order to print the actual diff (using the -dif format, where differences are reported in the style of the UNIX and Linux diff utility)
As mentioned in the comments, the diff -pred only prints the diff introduced by a specific version.
The concatenation of all those diffs represents all the modifications introduced by the new baseline compared to the old one.
In other word, that concatenation of diffs represents "print out the diffs of all the files that were different". What was originally asked for.

Related

What's the best way to perform a differential between a list of directories?

I am interested in looking at a list of directories and comparing the previous list with a current list of directories and setting up a script to do so. Maybe in perl or as a shell script.
Should I use something like diff? Programatically, what would be an ideal way to do this? For example let say I output the diff to an output file, if there is no diff then exit, if there is results, I want to see it.
Let's for example I have the following directories today:
/foo/bar/staging/abc
/foo/bar/staging/def
/foo/bar/staging/a1b2c3
Next day would look like this where a directory is either added, or renamed:
/foo/bar/staging/abc
/foo/bar/staging/def
/foo/bar/staging/ghi
/foo/bar/staging/a1b2c4
There might be better ways, but the way I typically do something like this is to run a find command in each directory root, and pipe the output to separate files. You can then diff the files using the diff tool of your choice. If you want to filter out certain directories or files, you can throw in some grep or grep -v commands in the pipeline, or you can experiment with options on the find command.
The other main option is to find a diff tool that offers directory/folder comparisons. Most of the goods ones support this, but I like the command line method, because you get more control over what you're diffing.
cd /my/directory/one
find . -print | sort > /temp/one.txt
cd /my/directory/two
find . -print | sort > /temp/two.txt
diff /temp/one.txt /temp/two.txt
also check the inotifywait command. it allows you to monitor files in RT.
You might also consider the find command using the -newer switch.
The usage is:
find . -newer timefile.txt -print
The -newer switch makes find return a list of files that are created or updated after the specified file's modification time. In the example above, any file created or updated after timefile.txt would be returned. You'd have to create a timefile.txt file, most likely once per day. Some versions of find have variations of newer that compare against other time stamps for a file (last modified, last accessed, last created, etc.)
This technique would not report a file that was deleted, however. A daily diff of the file listings could report that.

Version control for DOCX and PDF?

I've been playing around with git and hg lately and then suddenly it occurred to me that this kind of thing will be great for documents.
I've a document which I edit in DOCX and export as PDF. I tried using both git and hg to version control it and turns out with hg you end up tracking only binary and diff-ing isn't meaningful. Although with git I can meaningfully diff DOCX (haven't tried on PDF yet) I was wondering if there is a better way to do it than I'm doing it right now. (Ideally, not having to leave Word to diff will be the best solution.)
There are two different concepts here - one is "can the version control system make some intelligent judgements about the contents of files?" - so that it can store just delta information between revisions (and do things like assign responsibility to individual parts of a file).
The other is 'do I have a file comparison tool which is useful for the types of files I have in the version control system'. Version control systems tend to come with file comparison tools which are inferior to dedicated alternatives. But they can pretty much always be linked to better diff programs - either for all file types or specific ones.
So it's common to use, for example, Beyond Compare as a general compare tool, with Word as a dedicated Word document comparer.
Different version control systems differ as to how good people perceive them to be at handling 'binaries', but that's often as much to do with handling huge files and providing exclusive locking as it is to do with file comparison.
http://tortoisehg.bitbucket.io/ includes a plugin called docdiff that integrates Word and Excel diff'ing.
You can use Beyond Compare as external diff tool for hg. Add to/change your user mercurial.ini as:
[extdiff]
cmd.vdiff = c:/path/to/BCompare.exe
Then get Beyond Compare file viewer rule for docx.
Now you should be able to compare two versions of docx in Beyond Compare.
This article outlines the solution for Docx using Pandoc
While this post outlines solution for PDF using pdf2html.
Only for docx, I compiled instructions for multiple places here: https://gist.github.com/nachocab/6429893
# download docx2txt by Sandeep Kumar
wget -O docx2txt.pl http://www.cs.indiana.edu/~kinzler/home/binp/docx2txt
# make a wrapper
echo '#!/bin/bash
docx2txt.pl $1 -' > docx2txt
chmod +x docx2txt
# make sure docx2txt.pl and docx2txt are your current PATH. Here's a guide
http://shapeshed.com/using_custom_shell_scripts_on_osx_or_linux/
mv docx2txt docx2txt.pl ~/bin/
# set .gitattributes (unfortunately I don't this can't be set by default, you have to create it for every project)
echo "*.docx diff=word" > .git/info/attributes
# add the following to ~/.gitconfig
[diff "word"]
binary = true
textconv = docx2txt
# add a new alias
[alias]
wdiff = diff --color-words
# try it
git init
# create my_file.docx, add some content
git add my_file.docx
git commit -m "Initial commit"
# change something in my_file.docx
git wdiff my_file.docx
# awesome!
It works great on OSX
If you happen to use a Mac, I wrote a git merge driver that can use Microsoft Word and tracked changes to merge and show conflicts between any file types Word can read & write.
http://github.com/jasmas/wordMerge
I say 'if you happen to use a Mac' because the driver I wrote uses AppleScript, primarily to accomplish this task.
It'd be nice to add a vbscript version to the project, but at the moment I don't have a Windows environment for testing. Anyone with some basic scripting knowledge should be able to take a look at what I'm doing and duplicate it in vbscript, powershell or whatever on Windows.
I used SVN (yes, in 2020 :-)) with TortoiseSVN on Windows. It has a built-in function to compare DOCX files (it opens Microsoft Word in a mode where your screen is divided into four parts: the file after the changes, before the changes, with changes highlighted and a list of changes). Screenshot below (sorry for the Polish version of MS Word). I also checked TortoiseGIT and it also has this functionality. I've read that TortoiseHG has it as well.

showing differences within a line in diff output

This StackOverflow answer has an image of KDiff3 highlighting intra-line differences. Does someone know of a tool which can show the same (ex, via color) on the command line?
Another way to think of this is wanting to diff each difference in a patch file.
I don't know if this is sufficiently command line for your purpose, but vimdiff can do this (even does colour). See for example the image in this related question.
I tried all the tools I found: wdiff, dwdiff, kdiff3, vimdiff to show the difference between two long and slightly different lines. My favourite is diff-highlight (part of git contrib)
it supports diff format - great advantage over tools requiring two files like (dwdiff), e.g. if you need to visualize the output of unit tests
it highlights with black+white or with color if you connect it to colordiff
highlights characterwise - helpful for comparing long lines without spaces (better than wdiff)
Installation
On Ubuntu, you probably already have it as part of git contrib (installed within the git deb package).
Copy or link it into your ~/bin folder from /usr/share/doc/git/contrib/diff-highlight/diff-highlight
Usage example
cat tmp.diff | diff-highlight | colordiff
Result:
Another intuitive way to see all word-sized differences (though not side-by-side) is to use wdiff together with colordiff (you might need to install both). An example of this would be:
wdiff -n {file-A} {file-A} | colordiff
You can optionally pipe this into less -R to scroll through the output (-R is used to show the colors in less).
I had a similar problem and wanted to avoid using vimdiff. I found dwdiff (which is available in Debian) to have several advantages over wdiff.
The most useful feature of dwdiff is that you can customise the delimiters with -d [CHARS], so it's useful for comparing all kinds of output. It also has color built in with the -c flag.
You might be able to use colordiff for this.
In their man page:
Any options passed to colordiff are
passed through to diff except for the
colordiff-specific option 'difftype',
e.g.
colordiff --difftype=debdiff file1
file2
Valid values for 'difftype' are: diff,
diffc, diffu, diffy, wdiff, debdiff;
these correspond to plain diffs,
context diffs, unified diffs,
side-by-side diffs, wdiff output and
debdiff output respectively. Use these
overrides when colordiff is not able
to determine the diff-type
automatically.
I haven't tested it, but the side-by-side output (as produced by diff -y file1 file2) might give you the equivalent of in-line differences.
ccdiff is a convenient dedicated tool for the task. Here is what an example may look like with it:
By default, it highlights the differences in color, but it can be used on a console without color support too.
The package is included in the main repository of Debian:
ccdiff is a colored diff that also colors inside changed lines.
All command-line tools that show the difference between two files fall short in showing minor changes visuably useful. ccdiff tries to give the look and feel of diff --color or colordiff, but extending the display of colored output from colored deleted and added lines to colors for deleted and addedd characters within the changed lines.

multiple changes in one line with diff tool?

Normally, 'diff' tool finds only changes between lines. For example, if i compare 'abcdef' and 'AbcdEf', diff will show that 'abcde' is changed and 'f' is unchanged. Is it possible to find multiple changes per line, so in example above i will see that it's only 'a' changed to 'A' and 'e' changed to 'E'? Or diff outut format does not support such?
There are multiple diff tools that will do what you're asking for.
Off the top of my head I know Winmerge and TortoiseMerge does that.
I recommend KDiff3 which highlights with different colours changes on the same line.
I wrote a tool to diff web code regardless of differences from comments and whitespace. This means my tool can diff a completely minified file against a similar beautified file. It is written entirely in JavaScript so you try it directly in your browser without downloading or installing anything. This does highlight differences per line and highlights differences per characters on those lines.
http://prettydiff.com/

How do you use the diff command against two source trees

I tried running 'diff' against two source directories get a patch file with a 'diff' between the two directories.
diff -rupN flyingsaucer-R8pre2_b/ flyingsaucer-R8pre2/ > a.patch
The command above does not seem to work, it generates a diff of everything and I get a 13 MB file, when in reality, it should be a couple of changes.
Should work with any recent version of gnu diff (tested here with gnu diff 2.8.1.)
You might want to add -b (and perhaps -B) to ignore difference in white space which perhaps generate large patch files unnecessarily.
I don't see any reason why it wouldn't work. Try adding "wb" to the argument list to ignore whitespace changes. Are you sure you got the trailing slashes the same on both sides?