Eclipse won't ignore CRLF in team synchronization - eclipse

First, let me explain what I am doing. I have a CVS repository in which I store 5,000 Data Definition Language files. These files are generated by an external data modeling application; they are text files with Windows CRLF line endings. During development, if I need to make a change, I regenerate all 5,000 files and overwrite the contents of my local CVS workspace in Eclipse. I do the full overwrite to make sure that I don't miss any updated files. After overwriting the files, I use Eclipse to do a Team > Synchronize with Repository. When I do this, the comparison flags every single file as an outgoing change, apparently because it is not ignoring CRLFs in its comparison. I have "Ignore white space" checked, and the Eclipse documentation states that it should be ignoring CRLFs:
Ignore whitespace option:
Causes the comparison to ignore differences which are whitespace characters
(spaces, tabs, etc.). Also causes differences in line terminators (LF
versus CRLF) to be ignored.
When I open the files in text compare, it shows no diffs, but there is an extra CRLF at the top of one of the files. Is this a bug, or is there an option I am missing in Eclipse? It looks like the problem is that it doesn't ignore CRLFs that are on their own line.

The Eclipse compare dialog doesn't have a bug; you're just confused because you're seeing the effects of several independent problems.
The "Ignore white space" option only reduces the number of changes that the compare dialog shows; it has no effect whatsoever on the differences that CVS sees. So as long as the files have the wrong line endings, CVS will complain.
Some version control systems allow you to specify converters to solve this issue; CVS doesn't. So you really need to generate files with the correct line endings.
The "single file with extra CRLF" really does have an extra CRLF. Find out why and fix that to make the difference go away.
When generating files, you should never use PrintStream or PrintWriter. It is tempting, but these two have many bugs (like close() not calling flush(), violating their API contract), plus they use platform-dependent line endings, which is almost never what you want. Yes, it might work by accident, but trust me on this: that's not what you want. You don't want your paycheck filed by accident, either, right?
If you use neither PrintStream nor PrintWriter, avoid the system property line.separator as well, for the same reasons.
I suggest writing a helper class that has many of the methods of PrintStream / PrintWriter but none of the bugs. It should also let you set the line delimiter to whatever you need.
Note: If you use a Writer, make sure you also specify the charset / encoding, or the character-to-byte conversion will be as random as the line endings.
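To make the helper-class suggestion a little more concrete, here is a minimal sketch. The answer is about Java; the sketch uses Python purely for illustration, and the class name LineWriter and its methods are hypothetical, not part of any library. What it is meant to show is that both the line delimiter and the encoding are chosen explicitly by the caller and never taken from the platform.

```python
import io


class LineWriter:
    """Hypothetical helper: writes text with a fixed line delimiter
    and an explicit encoding, independent of the platform defaults."""

    def __init__(self, stream, line_delimiter="\n", encoding="utf-8"):
        # `stream` must be a binary stream, so no hidden newline or
        # charset translation can happen underneath this class.
        self._stream = stream
        self._delimiter = line_delimiter
        self._encoding = encoding

    def write(self, text):
        # Encode explicitly instead of relying on a default charset.
        self._stream.write(text.encode(self._encoding))

    def println(self, text=""):
        # Always append the configured delimiter, never os.linesep.
        self.write(text + self._delimiter)

    def flush(self):
        self._stream.flush()

    def close(self):
        # Flush before closing so no buffered output is lost.
        self.flush()
        self._stream.close()


# Toy usage: write three lines with CRLF endings into an in-memory buffer.
buffer = io.BytesIO()
writer = LineWriter(buffer, line_delimiter="\r\n", encoding="utf-8")
writer.println("CREATE TABLE customer (")
writer.println("    id INTEGER")
writer.println(");")
writer.flush()
output = buffer.getvalue()
```

A generator built this way produces byte-identical files on every machine, which is what a byte-oriented comparison such as the one CVS performs needs.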

Related

Can VSCode display a binary file if there is an executable that will convert it to an equivalent text file?

I sometimes use VSCode for a Delphi 7 project because I like VSCode's git functionality and for a few other reasons (superior string search, diff, etc).
Delphi 7 is a pain, and to get it to compile consistently I need to convert the dfm files to their binary version (all 2300 of them). This of course makes them unviewable, both in the diff viewer and when I just open the file.
Is there a setting so that, when I open such a file, it is first passed through the convert.exe util (that's its actual name) and can be viewed as text? I understand that this might be read-only, which would be sufficient for my needs (though if on save it could just pass it back through, that'd be great too).
I'm having trouble figuring out what exactly to search for on Google (the keywords seem too generic), but I can imagine some generalized functionality that would work for other environments beyond just Delphi/Pascal.

Edmx update model add blank lines from autogeneration

I have an annoying problem and can't seem to figure out what's causing it. On my machine, when I use Update Model from Database... on an Edmx file (EF Database First approach), the autogenerated model has blank lines between properties. This doesn't seem to occur on other developers' machines, even though we have the same versions of VS, extensions, etc.
The problem is that even when I add, for example, one new table, the refresh automatically adds blank lines for all mapped tables. Later all of this shows up as conflicts during merge operations in Git.
I would really appreciate any help, since I didn't find a single shred of information on this issue anywhere and it really disrupts work.
I compared the files (Model.tt on my machine and on my friend's) using the Notepad++ compare plugin, and it said there are no differences but the encoding is different. When I copied Model.tt over manually and did the update, the blank lines were gone... Must be some kind of quirk.
Posting as an answer since I wasted a few hours on this and someone might have a similar problem.
What worked for me
💡Turns out it was how my OS was ending lines
I work on Windows. I had earlier disabled automatic CRLF line-ending conversion in my global Git configuration, so I re-enabled it:
git config --global core.autocrlf true
FYI, 'nix/Mac ends lines with LF only; Windows ends lines with CR + LF.
Opened up *DataModel.tt and *DataModel.Context.tt in Notepad++
Edit > EOL Conversions > Windows (CR LF) > Save
Refresh EDMX
Looking for a better terminal-based solution; sounds like dos2unix will come into play at some point. Will amend this as soon as I've ironed this out.
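In the meantime, the Notepad++ step could be scripted roughly as follows. This is a sketch under assumptions, not a tested replacement for a tool like unix2dos: the function names to_crlf and convert_file are made up for illustration, and the byte-level replacement assumes the templates are stored in an ASCII-compatible encoding such as UTF-8 (it would corrupt UTF-16 files).

```python
import re


def to_crlf(data: bytes) -> bytes:
    """Return `data` with every line ending normalized to CRLF."""
    # The alternation tries CRLF first, so an existing CRLF pair is
    # replaced by itself and never doubled; lone CR or LF become CRLF.
    return re.sub(rb"\r\n|\r|\n", b"\r\n", data)


def convert_file(path: str) -> bool:
    """Rewrite `path` in place with CRLF endings.

    Returns True if the file content changed, False if it already
    used CRLF everywhere.
    """
    with open(path, "rb") as handle:
        original = handle.read()
    converted = to_crlf(original)
    if converted != original:
        with open(path, "wb") as handle:
            handle.write(converted)
    return converted != original
```

Because the conversion is idempotent, running it over *DataModel.tt and *DataModel.Context.tt before every EDMX refresh should be harmless when the files are already in CRLF form.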

Search code history for closest matching version based on content

I have a file that was forked from a project at an unknown moment in the past. I want to identify as closely as possible the moment of that fork. The file has been changed since the fork-moment.
WinMerge highlights about 20% of the lines, with about half of those being just a few characters within the line: a path change, or an inline function turned into a variable or function call, for instance. (That is 20% after ignoring whitespace changes and enabling moved-block detection; it is closer to ~40% without that.)
I don't have to worry about branches, the original version control system was CVS. (I don't have access to the CVS file system). I have a git imported version with tags corresponding to the CVS commits, and could generate the same with Mercurial for little effort if need be.
I don't care about matching the specific CVS commit date/time/number/whatever. The goal is to identify when the content of the new file started drifting, and then step forward through the revision history, cherry-picking what to merge into the forked file.
For this project I could brute force it: there are only a dozen or so revisions where the fork has most likely occurred, and the file is less than 500 lines. However, it's not hard to imagine a scenario where this is not feasible, and I'm curious about what an elegant solution might be.
How would you go about solving this?
"Brute force" sounds as if you were contemplating testing all revisions. Normally one would use a binary search. To decide if it was a good match, I'd normally use just the numbers from diffstat (since you say there are post-fork changes). Accounting for block-moves complicates things, though.

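The binary-search idea from the answer above could look roughly like this. It is a sketch, not a definitive method: it swaps the diffstat numbers for the similarity ratio from Python's difflib, and it assumes that similarity rises steadily up to the fork point and falls afterwards, which real histories may violate (ties or noisy scores can mislead the search, in which case scoring every revision is the safe fallback). The names similarity and closest_revision are hypothetical.

```python
import difflib
from functools import lru_cache


def similarity(revision_text: str, fork_text: str) -> float:
    """Line-based similarity in [0, 1]; 1.0 means identical files."""
    matcher = difflib.SequenceMatcher(
        None,
        revision_text.splitlines(),
        fork_text.splitlines(),
        autojunk=False,  # do not discard frequently repeated lines
    )
    return matcher.ratio()


def closest_revision(revisions, fork_text):
    """Return the tag of the revision most similar to `fork_text`.

    `revisions` is a chronological list of (tag, text) pairs.
    Assumes the similarity scores form a single peak.
    """
    if not revisions:
        raise ValueError("no revisions given")

    @lru_cache(maxsize=None)
    def score(index):
        return similarity(revisions[index][1], fork_text)

    low, high = 0, len(revisions) - 1
    while low < high:
        middle = (low + high) // 2
        if score(middle) < score(middle + 1):
            low = middle + 1  # still climbing: peak is to the right
        else:
            high = middle  # flat or falling: peak is here or to the left
    return revisions[low][0]


# Toy history: each revision appends one line to the previous one.
lines = list("abcdefgh")
history = [(f"rev{i}", "\n".join(lines[: 3 + i])) for i in range(6)]
# The fork was taken from rev3, then edited (one line changed, one added).
fork = "a\nB\nc\nd\ne\nf\nz"
best = closest_revision(history, fork)
```

With real data, the (tag, text) pairs would come from checking out the file at each imported tag; only about log2(n) revisions need to be compared against the fork.
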
CVS keeps adding code at the end of the file I want to commit

I have trouble with 4 files in my CVS project. Each time I commit one of those files, CVS adds the same line of code at the end of it. This line is a repeat of a line already in the file (but not its last line).
I've tried several things: update, delete lines and commit, delete all lines and commit, add lines and commit, add a header and commit. But I always get the same line of code added to the end of my file. I could delete all the files and recreate them, but I would lose all my history data.
I find it odd that CVS modifies my file when I commit. Isn't that counterproductive, since it may introduce errors into otherwise valid code?
I should add that my file is a .strings file (text file, Unicode). I'm working on a branch, but recently merged it into the trunk.
More Details:
I'm using TortoiseSVN on a virtual Windows machine, which has access to my Documents folder of Mac OS X via a Network Drive between those two.
It turns out that my colleague, who has the same project but in a real Windows folder, could commit without any problem.
And now that he has done that, the problem is solved for me too.
But I have no idea what happened. My only clue would be a hidden character from Mac OS X that breaks TortoiseSVN. Is that possible?
I haven't experienced this issue with CVS, but note that you mention that the file you are editing is Unicode text (you don't mention if this means UTF8 or UTF16, but either can cause issues).
Depending on how your CVS server was built, and how (and on what platform) it is being run, it is highly possible that the server is not Unicode-aware. This can cause a whole range of possible issues, including expanding RCS-style $ tags in places where the second (or later) byte of a Unicode character is equal to ASCII '$'.
The workaround for this is to mark Unicode source files as binary objects. From the command line, this can be done using
cvs add -kb file-name
when adding a new file, or
cvs admin -kb file-name
for an existing file (replace file-name with the name of your file).
In the latter case, I'd recommend removing the (local copy of the) file and running 'cvs update' to get it back after changing the type.
Note that doing this is unlikely to help with changes you're already seeing in the file, so make sure to check the file, and fix any existing problem after making this change.
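The "byte equal to ASCII '$'" failure mode can be illustrated with a short check. This is only a sketch for inspecting a file's text, assuming you know (or can guess) its encoding; the function name dollar_byte_positions is hypothetical. It lists characters that are not '$' themselves but whose encoded bytes contain 0x24, which is what a server that is not Unicode-aware would see as a dollar sign.

```python
def dollar_byte_positions(text: str, encoding: str) -> list:
    """Return indexes of characters that are not '$' but whose
    encoded form contains the byte 0x24 (ASCII '$')."""
    positions = []
    for index, character in enumerate(text):
        if character != "$" and b"$" in character.encode(encoding):
            positions.append(index)
    return positions


# U+0124 (capital H with circumflex) is encoded as the two bytes
# 0x24 0x01 in UTF-16LE, so its first byte looks like '$'.
sample = "\u0124ello"
utf16_hits = dollar_byte_positions(sample, "utf-16-le")
utf8_hits = dollar_byte_positions(sample, "utf-8")
```

For this particular failure mode, UTF-16 is the encoding to worry about: in UTF-8 every byte of a multi-byte character is 0x80 or above, so it cannot collide with '$'. Marking the file as binary with -kb avoids the question either way.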

CVS keyword substitution and Microsoft Word file

CVS has the keyword substitution feature: in a text file you write $Header$ and, when you commit the file, CVS substitutes $Header$ with something like $Header: /repo/src.cpp,v 1.6 2009/03/12 14:53:14 luser Exp $
Is it possible to get the same feature when dealing with a binary Microsoft Word file?
Thank you.
The basic problem you have with a Word file is that it is effectively a binary file (as opposed to a plain-text file), so you cannot be sure a key string like "$Header$" doesn't appear somewhere (VB macro code, for example) by accident. CVS would expand that key string, and suddenly something apparently unrelated (VB macro code, for example...) stops working.
Using CVS? Not likely. Even if $Header$ doesn't appear anywhere in your Word document (as DevSolar suggested it might), where do you place that string? Word stores text in its proprietary binary format, but CVS looks for plain text.
On the other hand, I'm sure you can achieve the effect by using either an XML Word format, or a Word macro.
Seems almost impossible with the traditional .doc format. Some creative work might allow you to create a process for making it happen with the newer XML format. I'm not sure CVS can do the job even then, but using a post-commit hook in subversion might make it more reasonable to pull off.