How to sync .ackrc and .gitignore_global? - gitignore

I find that the file/directory types in my .gitignore_global (which changes fairly regularly) are typically those that I'd like to ignore in my ack searches.
Is there a way to sync or "pipe" the ignored files / directories from my .gitignore_global to my .ackrc?

Edit: I now use ripgrep for this type of thing.
Recently found Ag (a.k.a. the silver searcher) and git-grep, both of which serve my purpose.

ack does not currently look at .gitignore, but we have plans to allow for plugins in the future and that will be one of the first ones we do.
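Until something like that exists, one workaround is to generate the .ackrc lines from the ignore file yourself. Here is a minimal sketch, not an official ack feature: the function name `gitignore_to_ack_options` is made up for illustration, and it assumes ack 2 or later, where `--ignore-dir=NAME` and `--ignore-file=ext:EXT` / `--ignore-file=is:NAME` are available. It only handles the simplest pattern shapes and reports everything else as skipped rather than guessing.

```python
GLOB_CHARS = "*?["


def has_glob(text):
    """True if the text contains a gitignore wildcard character."""
    return any(ch in text for ch in GLOB_CHARS)


def gitignore_to_ack_options(lines):
    """Translate simple gitignore patterns into ack options (assumes ack 2+).

    Returns (options, skipped). Only three shapes are translated:
      name/   -> --ignore-dir=name
      *.ext   -> --ignore-file=ext:ext
      name    -> --ignore-file=is:name  (files only; gitignore would also match dirs)
    """
    options, skipped = [], []
    for raw in lines:
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue  # blank line or comment
        if pattern.startswith("!") or "/" in pattern.rstrip("/"):
            skipped.append(pattern)  # negations and path patterns: no simple equivalent
        elif pattern.endswith("/"):
            name = pattern.rstrip("/")
            if has_glob(name):
                skipped.append(pattern)
            else:
                options.append(f"--ignore-dir={name}")
        elif pattern.startswith("*.") and not has_glob(pattern[2:]) and "." not in pattern[2:]:
            options.append(f"--ignore-file=ext:{pattern[2:]}")
        elif not has_glob(pattern):
            options.append(f"--ignore-file=is:{pattern}")
        else:
            skipped.append(pattern)
    return options, skipped


if __name__ == "__main__":
    sample = ["# editors", "*.pyc", "build/", ".DS_Store", "!keep.log", "*~"]
    opts, left_over = gitignore_to_ack_options(sample)
    print("\n".join(opts))
    print("skipped:", left_over)
```

You would still have to rerun it whenever .gitignore_global changes (and review the skipped patterns by hand), so it is a stopgap rather than a real sync.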

Related

How to expand some version keywords in Mercurial?

In CVS I could put $LOG$ into the source file and when the file is checked in $LOG$ will be expanded into true logs in the file.
But how to implement this in Mercurial? Of course I also mean the other keywords, such as the latest check-in date and time.
For most of the problems keyword expansion solves, it creates a whole heap more. It isn't recommended in Mercurial (see "CVS/RCS-like Keyword Substitution - Why You Don't Need It"); however, how to do it is documented if you really need to.
I'm not the only one to advise against keyword expansion. Although there are times it can be useful, one really needs to think hard before doing it.
Use the built-in keyword extension.
A couple of important things:
Add ONLY the specific files that need keyword expansion to the filename patterns in the [keyword] section of hgrc.
The expansion is LOCAL. When your changeset is pushed to another repo, unless that repo also has the same keyword setup, keyword is NOT expanded.
I agree that it should be avoided whenever possible. It cannot be avoided when you need to distribute a few selected files (for example, API headers) to other people (for example, API users) who have no way to use hg to find out the version info.

Code formatting and source control diffs

What source control products have a "diff" facility that ignores white space, braces, etc., in calculating the difference between checked-in versions? I seem to remember that Clearcase's diff did this but Visual SourceSafe (or at least the version I used) did not.
The reason I ask is probably pretty typical. Four perfectly reasonable developers on a team have four entirely different ways of formatting their code. Upon checking out the code last changed by someone else, each will immediately run some kind of program or editor macro to format things the way they like. They make actual code changes. They check in their changes. They go on vacation. Two days later that program, which had been running fine for two years, blows up. The developer assigned to the bug does a diff between versions and finds 204 differences, only 3 of which are of any significance, because the diff algorithm is lame.
Yes, you can have coding standards. Most everyone finds them dreadful. A solution where everyone can have their cake and eat it too seems far more preferable.
=========
EDIT: Thanks to everyone for some great suggestions.
What I take away from this is:
(1) A source control system with plug-in type diffs is preferable.
(2) Find a diff with suitable options.
(3) Use a good source formatting program and settle on a check-in standard.
Sounds like a plan. Thanks again.
Git does have these options:
--ignore-space-at-eol
    Ignore changes in whitespace at EOL.
-b, --ignore-space-change
    Ignore changes in amount of whitespace. This ignores whitespace at line end, and considers all other sequences of one or more whitespace characters to be equivalent.
-w, --ignore-all-space
    Ignore whitespace when comparing lines. This ignores differences even if one line has whitespace where the other line has none.
I am not sure if brace changes can be ignored using Git's diff.
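To make the difference between those three options concrete, here is a rough sketch in Python. It is not how Git implements them; it just normalizes each line the way the option descriptions read and then runs a plain line diff with `difflib`. The function names and the mode names ("eol", "change", "all") are my own.

```python
import difflib
import re


def normalize(line, mode=None):
    """Normalize one line roughly the way the named option treats whitespace."""
    if mode == "eol":      # like --ignore-space-at-eol
        return line.rstrip()
    if mode == "change":   # like -b: trailing space dropped, other runs collapsed
        return re.sub(r"\s+", " ", line.rstrip())
    if mode == "all":      # like -w: whitespace removed entirely
        return re.sub(r"\s+", "", line)
    return line


def changed_hunks(old_text, new_text, mode=None):
    """Return the non-equal opcodes of a line diff after normalization."""
    old = [normalize(l, mode) for l in old_text.splitlines()]
    new = [normalize(l, mode) for l in new_text.splitlines()]
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


if __name__ == "__main__":
    old = "if (x) {\n    y = 1;\n}\n"
    new = "if (x) {\n\ty  =  1;   \n}\n"
    for mode in (None, "eol", "change", "all"):
        print(mode, len(changed_hunks(old, new, mode)))
```

Note that "a=1" versus "a = 1" still counts as a change in the "change" mode and only disappears in the "all" mode, which matches the wording of -b and -w above. Neither mode knows anything about braces moving between lines.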
If it is C/C++ code, you can define Astyle rules and then convert the source code's brace style to the one that you want, using Astyle. A git diff will then produce sane output.
Choose one (dreadful) coding standard, write it down in some official coding standards document, and get on with your life. Messing with whitespace is not productive work.
And remember you are a professional developer: it's your job to get the project done. Changing anything in the code because of a personal style preference hurts the project. It won't only make diffing more difficult, it can also introduce hard-to-find problems if your source formatter or compiler has bugs (and your fancy diff tool won't save you when two co-workers start fighting over casing).
And if someone just doesn't agree to work with the selected style, just remind him (or her) that he is programming as a profession, not as a hobby; see http://www.ericsink.com/entries/No_Great_Hackers.html
Maybe you should choose one format and run some indentation tool before checking in so that each person can check out, reformat to his/her own preferences, do the changes, reformat back to the official standard and then check in?
A couple of extra steps but they already use indentation tools when working. Maybe it can be a triggered check-in script?
Edit: this would perhaps also solve the brace problem.
(I haven't tried this solution myself, hence the "perhaps" and "maybes", but I have been in projects with the same problems, and it is a pain to go through diffs with hundreds of irrelevant changes that are not limited to whitespace but include the formatting itself.)
As explained in "Is it possible for git-merge to ignore line-ending differences?", it is more a matter of associating the right diff tool with your favorite VCS than of relying on the right VCS option (even though Git does have some options regarding whitespace, like the ones mentioned in Alan's answer, they will never be as complete as one would like).
DiffMerge is the most complete on those "ignore" options, as it can ignore not only spaces but also other "variations" based on the programming language used in a given file.
Subversion apparently supports this, either natively in the latest versions, or by using an alternate diff like GNU diff.
Beyond Compare does this (and much, much more) and you can integrate it in either Subversion or SourceSafe as an external diff tool.

How do you prevent file confusion if version-control keywords are forbidden?

At least two brilliant programmers, Linus Torvalds and Guido van Rossum, disparage the practice of putting keywords into a file that expand to show the version number, last author, etc.
I know how keyword differences clutter up diffs. One of the reasons I like SlickEdit's DiffZilla is because it can be set to skip leading comments.
However, I have vivid memories of team-programming where we had four versions of a file (two different releases, a customer one-off, and the development version) all open for patching at the same time, and it was quite helpful to verify at a glance that each time we navigated to an included header we got the proper one, and each time we pasted code the source and destination were what we expected.
There is also the where-did-this-file-come-from problem that arises when a hasty developer copies a file from one place to another using the file system, rather than checking it out of the repository using the tool; or, more defensibly, when files under control in locations A, B, and C need to be marshalled (with cherry-picking) into a distribution location D.
In places where VCS keywords are banned, how do you cope?
I've never used VCS keywords in my entire career, over 30 years. From the most primitive VCS system I've used, up to the present (TFS), I've used some other structure to understand "where I am".
I am rarely in a situation where I've only got one file to work with. I've usually got all the other files necessary to build the project or set of projects. I usually use branching (or streams on one occasion), and I'm working on some slice of the given branch or stream.
If I'm working on multiple branches or streams, I'll have one directory tree for each. All I need to do to know what file I'm working on is check the file path, at the very worst.
At the very best, the version control system will tell you exactly which version of the file you're working on, what the change history is, who else is working on different versions of the file, and anything else you'd care to know.
This doesn't exactly answer your question, but I imagine Linus and Guido have reasons for disliking keywords that don't apply to small-team corporate development.
An $Id$ tag for instance, has what you could consider to be a global version number. Linux and I guess also Python development is fragmented enough that no number can be global. Lots of people have their own repositories all over the place that would fill in their own $Id$ values and then those patches might be sent to Linus or Guido's repositories where they don't make any sense.
However, in your environment, you probably have one central repository which would assign these and it would be fine. Sounds like you're using git. I wonder if it's possible to configure the central git repository to do tag substitution while the local developer repositories don't. Or perhaps it's better to get the commit hash in the tag.

CLI Patterns/Antipatterns for usability

What patterns contribute or detract from the usability of a CLI interface?
As an example, consider the CLI for ClearCase. The CLI is very comprehensive (+1), but it has several glaring opportunities for improvement. Recently, I wanted to force the files to lower case into ClearCase using clearfsimport. Unfortunately I wound up on the documentation for its cousin clearimport. It may seem slight, but it cost me more hours than I care to admit. The variation in the middle got me.
Why provide such nearly identical functionality with such nearly identical names? There are many better options, in my opinion:
clearimport -fs
fsclearimport
clear_fs_import
clearimport_fs
Anything would be better than what they went with. The code I am working on IS a CLI and this experience made me look at my own choices. I think I have all the basics covered (standard help, long-form vs short-form, short meaningful names, providing examples, eliminate ambiguity, accurately handling spaces within quotes, etc).
There is some literature on this subject.
Perhaps a bad CLI is no different than a bad API. A CLI is a type of API in some sense. The goals are naturally common: flexibility, readability, and completeness. Several factors differentiate a CLI from a typical API. One is that a CLI needs to support scriptability (participating, perhaps many times, in a series of pipes). Another is that autocompletion and namespaces don't exist in the same way. You don't always have a nice colorful GUI doing stuff for you. CLIs must document themselves externally, directly to the customer. And finally, the audience of a CLI is vastly different from that of a standard API. I appreciate any insight you may have.
I like the subcommand pattern, which I'm most familiar with as it's implemented in the command-line Subversion client.
svn [subcommand] [options] [files]
Without the subcommands, subversion would have waaaaay too many different options for me to remember them effectively, and the help system would be a pain to slog through.
But, if I don't remember how any particular subcommand works, I can just type:
svn help [subcommand]
...and it shows me only the relevant portions of the help documentation.
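If your CLI happens to be written in Python, the same shape falls out of the standard `argparse` module almost for free. A small sketch follows; the program name `tool` and the `add` / `commit` subcommands are invented for illustration and have nothing to do with svn's actual implementation.

```python
import argparse


def build_parser():
    """Build a parser of the form: tool <subcommand> [options] [files]."""
    parser = argparse.ArgumentParser(prog="tool")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Each subcommand gets its own options and its own help text.
    add = subparsers.add_parser("add", help="schedule files for addition")
    add.add_argument("files", nargs="+")

    commit = subparsers.add_parser("commit", help="record changes")
    commit.add_argument("-m", "--message", required=True)
    commit.add_argument("files", nargs="*")

    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Here `tool --help` lists only the subcommands, while `tool commit --help` shows just the options for commit, which is roughly the behaviour I like in `svn help [subcommand]`.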
As noted above, this format:
[master verb] [subverb] [optionally, noun] [options]
is good in terms of remembering what commands are available. cvs, svn, Perforce, git, all adhere to this. It improves discoverability of commands, a major CLI problem. One wrinkle that occurs here is options for the master-verb vs. options for the subverb. I.e.,
cvs -d dir command bar
is different than
cvs command -d dir bar
This was a confusing situation in cvs, which svn "fixed" by allowing options specified in any order. Your own solution may vary; if you have a very good reason to pass options to the master verb, okay, just be aware of the overhead.
Looking to API usability is a good idea too, but beware that there is no real typing in CLI commands, and there is a lot of richness in what CLI commands 'return', since you've got both a return code and an output to work with. In the unixy/streams world, the output is usually much more important than the return code. Getting the format of your output right is crucial. Also, while tempting, I've found that sending different things to stdout vs. stderr is not always useful; it confuses novice and even intermediate users (because they both get dumped to console in most cases), and is rarely useful to advanced users. So unless there's a real need for it I avoid it; it's too easy for (e.g.) someone to get very confused about why the output of a command was '' in an error condition just because the programmer nicely dumped the errors to stderr.
Another issue in design is the "what next" problem. In a GUI, the next steps for the user are spelled out by the available buttons, menus, etc. In a CLI, the user can literally type any command next, and pipe any command to any other. (Or try, at least.) I design my commands to give hints (either in the help or the output) as to what potential next steps might be in a typical workflow.
Another good pattern is allowing user customization of the output. While it is possible for users to use cut, sort, etc. to tailor the output, being able to specify a format string magnifies the utility of a command. The example I cite here is top, which lets you tell it which columns you want.
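A format-string option does not need much machinery. As a sketch (the `render` function and the field names `pid`, `user` and `cpu` are hypothetical, and this is not how top does it), a command can keep its records as dictionaries and let the user supply a Python-style template, for example through a `--format` option:

```python
def render(rows, template):
    """Format each record (a dict) with a user-supplied template string."""
    return [template.format_map(row) for row in rows]


if __name__ == "__main__":
    rows = [
        {"pid": 1, "user": "root", "cpu": 0.5},
        {"pid": 4242, "user": "alice", "cpu": 12.25},
    ]
    # Tab-separated for scripts, aligned columns for humans.
    print("\n".join(render(rows, "{pid}\t{user}")))
    print("\n".join(render(rows, "{pid:>5} {cpu:.1f}")))
```

A template that names a field the command does not have raises `KeyError`, so a real tool would want to catch that and list the valid field names in the error message.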

How well does Python's whitespace dependency interact with source control with regards to merging?

I'm wondering if the need to alter the indentation of code to adjust the nesting has any adverse effects on merging changes in a system like SVN.
I've used Python with SVN and Mercurial, and have had no hassles merging.
It all depends on how the diffing is done - and I suspect that it is character-by-character, which would notice the difference between one level of indent and another.
It works fine so long as everyone on the project has agreed to use the same whitespace style (spaces or tabs).
But I've seen cases where a developer has converted an entire file from spaces to tabs (I think Eclipse had that as a feature, bound to Ctrl+Tab!), which makes spotting diffs near impossible.
Generally source control systems merge on a line-by-line basis by default. I have found that merging Python code is no different from merging any other source code that is reasonably indented. If one programmer wraps a block of code in an if statement (indenting the whole block), and another programmer modifies something inside the block, then there will be a merge conflict. Fortunately, the conflict in this case is super easy to resolve.
If you use an external merge tool, then your tool may support more detailed textual merging algorithms that take the above scenario into account automatically.
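The if-wrapping scenario can be reproduced with a toy model. This sketch uses Python's `difflib` as a stand-in for a line-based diff and treats "both sides touched the same base line" as a conflict, which is a simplification of what a real three-way merge does. The sample lines and the helper name `touched_base_lines` are made up.

```python
import difflib

BASE = ["x = compute()", "log(x)", "save(x)"]
# Developer A wraps the block in an if statement.
WRAPPED = ["if enabled:", "    x = compute()", "    log(x)", "    save(x)"]
# Developer B modifies one line inside the block.
EDITED = ["x = compute()", "log(x, level=2)", "save(x)"]


def touched_base_lines(base, other, ignore_indent=False):
    """Indices of base lines that a line diff reports as replaced or deleted."""
    if ignore_indent:
        base = [line.lstrip() for line in base]
        other = [line.lstrip() for line in other]
    matcher = difflib.SequenceMatcher(None, base, other, autojunk=False)
    touched = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            touched.update(range(i1, i2))
    return touched


if __name__ == "__main__":
    a = touched_base_lines(BASE, WRAPPED)
    b = touched_base_lines(BASE, EDITED)
    print("line-based overlap:", sorted(a & b))
    a_loose = touched_base_lines(BASE, WRAPPED, ignore_indent=True)
    print("indent-insensitive overlap:", sorted(a_loose & b))
```

With exact line comparison, A's re-indent touches every line in the block, so it overlaps B's edit. Once leading whitespace is ignored, A's change is seen as a single inserted line and the overlap disappears, which is the kind of thing a smarter merge tool can exploit. In Python you would of course still have to re-apply the indentation to the merged result.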