SVN Commit and difference between a file and a formatted file - eclipse

How do you manage the maintenance/right format of code without problems with merge and difference history?
For example, I have a very old file in the repository with bad formatting, like:
/**
* Old code
*/
public static void main(String[] args)
{
for (int i=0;i<10;i++) {System.out.println(i);}
}
Then I format the code into:
/**
 * Formatted code (ctrl-shift-f in Eclipse)
 */
public static void main(String[] args) {
    for (int i = 0; i < 10000; i++) {
        System.out.println(i);
    }
}
If I compare this code with the repository, I see a lot of changes and I can't see the real difference (I changed the limit from 10 to 10000).
How do you manage versioning between "format code" and "activity code"?

You don't. Subversion doesn't understand programming languages or context - it just recognizes that the file changed. When you reformat the code, you'll get lots of minute changes. Your options:
Don't ever auto-format your code
Always auto-format your code, from the moment you create a new file
Deal with it.

In order to handle something like that you'll need a tool that actually understands the language. Version control systems don't include such capabilities since they're designed to be generalized and the language doesn't really matter (assuming the language is in text and not binary data).
The only such tool that I'm aware of is SemanticMerge, which knows about C# and Java only at this point. It can be configured with your version control system to replace the built-in merging.
One suggestion I'd make: don't mix formatting and functional changes. If you want to reformat, do that in a separate commit from the functional change. Typically I commit these changes with a note in the commit message saying there is no functional change. This tells anyone merging it that nothing should behave differently after the change, and that any change in behavior is an error in their merge.
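A rough way to double-check that a commit really is formatting-only is to compare the two versions with all whitespace removed. The sketch below is only an illustration, not an SVN or Eclipse feature; the class and method names are made up, and the comparison is crude (it also ignores whitespace inside string literals and cannot tell `int i` from `inti`).

```java
final class FormatOnlyCheck {
    /** Removes every whitespace character (crude: string literals are affected too). */
    static String stripWhitespace(String source) {
        return source.replaceAll("\\s+", "");
    }

    /** True when the two versions differ at most in whitespace. */
    static boolean isFormatOnlyChange(String before, String after) {
        return stripWhitespace(before).equals(stripWhitespace(after));
    }

    public static void main(String[] args) {
        String old = "for (int i=0;i<10;i++) {System.out.println(i);}";
        String formatted = "for (int i = 0; i < 10; i++) {\n    System.out.println(i);\n}";
        String changed = "for (int i = 0; i < 10000; i++) {\n    System.out.println(i);\n}";
        System.out.println(isFormatOnlyChange(old, formatted)); // pure reformat
        System.out.println(isFormatOnlyChange(old, changed));   // limit was changed as well
    }
}
```

With the loop from the question, the pure reformat passes the check, while the version that also changes the limit to 10000 does not, which is exactly the kind of change that belongs in its own commit.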

Related

GetClassifier() from IClassifierProvider called twice?

I implemented a VS2013 extension in the form of a VSPackage that also exports a classifier for a particular file extension. Everything is working fine, and the only thing that bothers me is that I get GetClassifier() called twice in my implementation of IClassifierProvider. That leads to the creation of two classifiers, both processing the same changes. The implementation of IClassifierProvider is as simple as shown below.
[Export(typeof(IClassifierProvider))]
[ContentType(MyConstants.MyContentType)]
public sealed class MyClassifierProvider : IClassifierProvider
{
    public IClassifier GetClassifier(ITextBuffer textBuffer)
    {
        return new MyClassifier(textBuffer);
    }
}
I tried to minimize my package by removing everything not related to classification, to no avail. I would really appreciate any advice on this one.
UPDATE: I was wrong about text buffers being different in GetClassifier calls. Updated this post accordingly.
One thing you should describe is what gesture resulted in multiple calls with different buffers. You'll most definitely get a call for GetClassifier for each file that is open, and you could possibly get multiple calls for the same text buffer as well. For the same text buffer, the common pattern is to have some other component be watching for file changes, process it once, and then report the changes via all classifiers.
There are also "fancy" cases where text buffers can contain the contents of other text buffers, which get used for various features. That might explain what you're seeing too.

Eclipse and Java 8 content assist

I wanted to check Java 8 integration with Eclipse Luna so I downloaded the M7 Luna from eclipse.org.
After configuring the JDK to jdk8u5, I started some tests.
Let's say you have a nice Runnable like
Runnable r = new Runnable() {
    @Override
    public void run() {
        System.out.println("foo");
    }
};
If you select the
new Runnable() {
    @Override
    public void run() {
        System.out.println("foo");
    }
}
block and press Ctrl-1 (Quick Fix), you get the suggestion to change it to a lambda, resulting in Runnable r = () -> System.out.println("foo");, which is pretty cool.
But a nicer thing would be to actually get help creating a lambda expression.
For instance, if you type Runnable r = | (with | being the cursor location) and press Ctrl+Space (content assist), I would have expected to find a "create a lambda expression from this functional interface" option in the displayed popup. But nothing new is available.
Do you know if this will be implemented in the future?
I think it might have something to do with the templates (Java/Editor/Templates in preferences) but I actually never experimented with them.
Providing a good proposal right after the = is rather tricky, as almost anything could be placed on the right-hand side of an assignment.
Even the old way of implementing a function using an anonymous inner class was not proposed right after the equal sign. You had to type the four characters new␣ before the suggestion came up. And four characters is exactly what you have to type to create a lambda, ()->, but at this place proposing the creation of a lambda makes no sense anymore as you have already created it.
So proposing a lambda would require lifting its priority compared to other proposals so that it appears right after the equal sign, but it would still have rather limited benefit. You would have to press Ctrl+Space (unless you use the automatic popup), then select “create lambda”, just to get either the four characters ()-> or something like name-> inserted, and the parameter name(s) are likely to be changed after the proposal is inserted anyway.
For an inner class (read: method overriding) it makes sense to propose parameters, as you have to repeat all parameter types exactly, but for a lambda, where you can omit all the bulk, the saving is very limited.
So I don’t expect a proposal of lambda creation to ever appear in the list.
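To make the size difference concrete, here is a small side-by-side sketch. The class and field names are made up for illustration; only Runnable and java.util.function.Function are standard library types.

```java
import java.util.function.Function;

final class LambdaForms {
    // Anonymous inner class: the method signature has to be repeated exactly,
    // which is where a content-assist proposal saves real typing.
    static final Runnable VERBOSE = new Runnable() {
        @Override
        public void run() {
            System.out.println("foo");
        }
    };

    // Lambda: "() ->" is everything that has to be typed before the body.
    static final Runnable CONCISE = () -> System.out.println("foo");

    // One parameter: the type is inferred and the name is a free choice,
    // so a generated name would likely be renamed right away.
    static final Function<String, Integer> LENGTH = s -> s.length();

    public static void main(String[] args) {
        VERBOSE.run();
        CONCISE.run();
        System.out.println(LENGTH.apply("foo"));
    }
}
```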

How to internationalize java source code?

EDIT: I completely re-wrote the question since it seems like I was not clear enough in my first two versions. Thanks for the suggestions so far.
I would like to internationalize the source code for a tutorial project (please notice, not the runtime application). Here is an example (in Java):
/** A comment */
public void doSomething() {
    System.out.println("Something was done successfully");
}
in English, and then have the French version be something like:
/** Un commentaire */
public void faitQuelqueChose() {
    System.out.println("Quelque chose a été fait avec succès.");
}
and so on. And then have something like a properties file somewhere to edit these translations with usual tools, such as:
com.foo.class.comment1=A comment
com.foo.class.method1=doSomething
com.foo.class.string1=Something was done successfully
and for other languages:
com.foo.class.comment1=Un commentaire
com.foo.class.method1=faitQuelqueChose
com.foo.class.string1=Quelque chose a été fait avec succès.
I am trying to find the easiest, most efficient and unobtrusive way to do this with the least amount of manual grunt work (other than obviously translating the actual text). Preferably working under Eclipse. For example, the original code would be written in English, then externalized (to properties, preferably leaving the original source untouched), translated (humanly) and then re-generated (as a separate source file / project).
Some leads I have found (other than what AlexS suggested):
ANTLR, a language parser / generator. There seems to be a supporting Eclipse plugin.
Using Eclipse's AST (Abstract Syntax Tree) and, I guess, building some kind of plugin.
I am just surprised there isn't a tool out there that does this already.
I'd use unique strings as method names (or as anything else you want replaced by localized versions).
public void m37hod_1() {
    System.out.println(m355a6e_1);
}
then I'd define a propertyfile for each language like this:
m37hod_1=doSomething
m355a6e_1="Something was done successfully"
And then I'd write a small program that parses the source files and replaces the strings. So everything happens outside Eclipse.
Or I'd use the Ant task Replace together with the property files, instead of a standalone translation program.
Something like that:
<replace
    dir="${src}"
    value="defaultvalue"
    propertyFile="${language}.properties">
  <replacefilter
      token="m37hod_1"
      property="m37hod_1"/>
  <replacefilter
      token="m355a6e_1"
      property="m355a6e_1"/>
</replace>
Using one of these methods you won't have to explain anything about localization in your tutorials (unless you want to), and can concentrate on your real topic.
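The standalone replacement program mentioned in this answer could be as small as the following sketch. It assumes the placeholder tokens from the example (m37hod_1, m355a6e_1); the class and method names are made up. Note that plain text replacement breaks as soon as one token is a prefix of another (m37hod_1 and m37hod_10), so tokens would need a fixed width or a terminator.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class TokenLocalizer {
    /** Replaces every occurrence of each property key in the source text with its value. */
    static String localize(String source, Properties translations) {
        String result = source;
        for (String token : translations.stringPropertyNames()) {
            result = result.replace(token, translations.getProperty(token));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // In a real build this would be read from e.g. en.properties on disk.
        Properties english = new Properties();
        english.load(new StringReader(
                "m37hod_1=doSomething\n"
              + "m355a6e_1=\"Something was done successfully\"\n"));

        String template =
                "public void m37hod_1() {\n"
              + "    System.out.println(m355a6e_1);\n"
              + "}\n";
        System.out.print(localize(template, english));
    }
}
```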
What you want is a massive code change engine.
ANTLR won't do the trick; ASTs are necessary but not sufficient. See my essay on Life After Parsing. Eclipse's "AST" may be better, if the Eclipse package provides some support for name and type resolution; otherwise you'll never be able to figure out how to replace each "doSomething" (might be overloaded or local), unless you are willing to replace them all identically (and you likely can't do that, because some symbols refer to Java library elements).
Our DMS Software Reengineering Toolkit could be used to accomplish your task. DMS can parse Java to ASTs (including comment capture), traverse the ASTs in arbitrary ways, analyze/change ASTs, and then export the modified ASTs as valid source code (including the comments).
Basically you want to enumerate all comments, strings, and declarations of identifiers, export them to an external "database" to be mapped (manually? by Google Translate?) to an equivalent. In each case you want to note not only the item of interest, but its precise location (source file, line, even column) because items that are spelled identically in the original text may need different spellings in the modified text.
Enumeration of strings is pretty easy if you have the AST; simply crawl the tree and look for tree nodes containing string literals. (ANTLR and Eclipse can surely do this, too).
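To show what "capture each string with its precise location" produces, here is a deliberately simplified sketch. It uses a regular expression instead of an AST walk (so it is not what DMS, ANTLR or Eclipse would do, and it is fooled by quotes inside comments, char literals and text blocks); all names in it are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StringLiteralScanner {
    /** One captured literal together with its 1-based line and column. */
    static final class Found {
        final int line;
        final int column;
        final String text;

        Found(int line, int column, String text) {
            this.line = line;
            this.column = column;
            this.text = text;
        }
    }

    // A double-quoted literal, allowing backslash escapes inside it.
    private static final Pattern LITERAL = Pattern.compile("\"(?:\\\\.|[^\"\\\\])*\"");

    /** Scans the source line by line and records every literal with its position. */
    static List<Found> scan(String source) {
        List<Found> result = new ArrayList<>();
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            Matcher matcher = LITERAL.matcher(lines[i]);
            while (matcher.find()) {
                result.add(new Found(i + 1, matcher.start() + 1, matcher.group()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Found f : scan("x(\"a\");\ny(\"b\", \"c\");")) {
            System.out.println(f.line + ":" + f.column + " " + f.text);
        }
    }
}
```

The line/column pair is what lets two identically spelled strings receive different translations later on.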
Enumeration of comments is also straightforward if the parser you have captures comments. DMS does. I'm not quite sure if ANTLR's Java grammar does, or the Eclipse AST engine; I suspect they are both capable.
Enumeration of declarations (classes, methods, fields, locals) is relatively straightforward; there's rather more cases to worry about (e.g., anonymous classes containing extensions to base classes). You can code a procedure to walk the AST and match the tree structures, but here's the place that DMS starts to make a difference: you can write surface-syntax patterns that look like the source code you want to match. For instance:
pattern local_for_loop_index(i: IDENTIFIER, t: type, e: expression, e2: expression, e3:expression): for_loop_header
= "for (\t \i = \e,\e2,\e3)"
will match declarations of local for loop variables, and return subtrees for the IDENTIFIER, the type, and the various expressions; you'd want to capture just the identifier (and its location, easily done by taking it from the source position information that DMS stamps on every tree node). You'd probably need 10-20 such patterns to cover the cases of all the different kinds of identifiers.
Capture step completed, something needs to translate all the captured entities to your target language. I'll leave that to you; what's left is to put the translated entities back.
The key to this is the precise source location. A line number isn't good enough in practice; you may have several translated entities in the same line, in the worst case, some with different scopes (imagine nested for loops for example). The replacement process for comments, strings and the declarations is straightforward; rescan the tree for nodes that match any of the identified locations, and replace the entity found there with its translation. (You can do this with DMS and ANTLR. I think Eclipse AST requires you to generate a "patch" but I guess that would work.)
The fun part comes in replacing the identifier uses. For this, you need to know two things:
For any use of an identifier, which declaration does it use? If you know this, you can replace it with the new name for the declaration; DMS provides full name and type resolution as well as a usage list, making this pretty easy.
Do renamed identifiers shadow one another in scopes differently than the originals? This is harder to do in general. However, for the Java language, we have a "shadowing" check, so you can at least decide after renaming that you have an issue. (There's even a renaming procedure that can be used to resolve such shadowing conflicts.)
After patching the trees, you simply rewrite the patched tree back out as a source file using DMS's built-in prettyprinter. I think Eclipse AST can write out its tree plus patches. I'm not sure ANTLR provides any facilities for regenerating source code from ASTs, although somebody may have coded one for the Java grammar. This is harder to do than it sounds, because of all the picky detail. YMMV.
Given your goal, I'm a little surprised that you don't want a sourcefile "foo.java" containing "class foo { ... }" to get renamed to .java. This would require not only writing the transformed tree to the translated file name (pretty easy) but perhaps even reconstructing the directory tree (DMS provides facilities for doing directory construction and file copies, too).
If you want to do this for many languages, you'd need to run the process once per language. If you wanted to do this just for strings (the classic internationalization case), you'd replace each string (that needs changing, not all of them do) by a call on a resource access with a unique resource id; a runtime table would hold the various strings.
One approach would be to finish the code in one language, then translate to others.
You could use Eclipse to help you.
Copy the finished code to language-specific projects.
Then:
Identifiers: In the Outline view (Window>Show View>Outline), select each item and Refactor>Rename (Alt+Shift+R). This takes care of renaming the identifier wherever it's used.
Comments: Use Search>File to find all instances of "/*" or "//". Click on each and modify.
Strings:
Use Source>Externalize strings to find all of the literal strings.
Search>File for "Messages.getString()".
Click on each result and modify.
On each file, Edit>Find/Replace, replacing "//\$NON-NLS-.*\$" with the empty string.
For the printed/logged strings, Java has some internationalization functionality, namely ResourceBundle. There is a tutorial about this on the Oracle site.
Eclipse also has a feature for this ("Externalize Strings", as I recall).
For the function names, I don't think there is anything out there, since this would require you to maintain the source code in many versions...
regards
Use a .properties file, like:
Locale locale = new Locale(language, country);
ResourceBundle captions = ResourceBundle.getBundle("Messages", locale);
This way, Java picks the Messages properties file that matches the given locale (pass Locale.getDefault() if you want the one taken from the operating system or Java locale settings).
The file should be on the classpath, called Messages.properties (the default one), or Messages_de.properties for German, etc.
See this for a complete tutorial:
http://docs.oracle.com/javase/tutorial/i18n/intro/steps.html
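As a self-contained illustration of the lookup, the sketch below builds the bundles from in-memory text with PropertyResourceBundle, so that it runs without any files on the classpath; in a real project ResourceBundle.getBundle("Messages", locale) would do the loading instead. The key name and the class name are made up.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class BundleLookup {
    /** Parses .properties text into a bundle, standing in for a Messages_xx.properties file. */
    static ResourceBundle parse(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle english = parse("done=Something was done successfully\n");
        ResourceBundle french = parse("done=Quelque chose a \u00e9t\u00e9 fait avec succ\u00e8s.\n");

        // The calling code only knows the key; the bundle decides the language.
        System.out.println(english.getString("done"));
        System.out.println(french.getString("done"));
    }
}
```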
As far as the source code goes, I'd strongly recommend staying with English. Method names like getUnternehmen() are worse for the average developer than plain English ones.
If you need to familiarize foreign developers with your code, write proper developer documentation in their language.
If you'd like to have Javadoc in both English and other languages, see this SO thread.
You could write your code using FreeMarker templates (or another templating language such as Velocity).
doSomething.tml
/** ${lang['doSomething.comment']} */
public void ${lang['doSomething.methodName']}() {
    System.out.println("${lang['doSomething.message']}");
}
lang_en.prop
doSomething.comment=A comment
doSomething.methodName=doSomething
doSomething.message=Something was done successfully
And then merge the template with each language prop file during your build (using Ant / Gradle / Maven etc.)

Can CodeDom add Source Code Files to a Project?

I have been using CodeDom to do some code generation. It works great, but I haven't found a way to include the generated source code files in a project. I started using T4 and the T4Toolbox to generate code because it supports integration with project files.
Does anyone know if CodeDom supports this functionality too? I'd consider taking a second look at CodeDom if it only supported this one feature.
Here is an example of how I make a source code file with CodeDom:
protected void CreateSourceFile(CodeCompileUnit codeCompileUnit,
                                string fileName,
                                out string fileNameWithExtension)
{
    fileNameWithExtension = string.Format("{0}.{1}",
                                          fileName,
                                          CodeProvider.FileExtension);
    var indentedTextWriter =
        new IndentedTextWriter(new StreamWriter(fileNameWithExtension,
                                                false),
                               TabString);
    CodeProvider.GenerateCodeFromCompileUnit(codeCompileUnit,
                                             indentedTextWriter,
                                             new CodeGeneratorOptions());
    indentedTextWriter.Close();
}
That works fine, but it just outputs the file to the hard drive somewhere (probably the bin folder).
Here is a second example of some code I use with T4, this one specifies the output as part of the project the template is transformed in:
public class RDFSClassGenerator : Generator
{
    private readonly string rootNamespace;
    private readonly string ontologyLocation;

    public RDFSClassGenerator(
        string rootNamespace,
        string ontologyLocation)
    {
        this.rootNamespace = rootNamespace;
        this.ontologyLocation = ontologyLocation;
    }

    protected override void RunCore()
    {
        XElement ontology = XElement.Load(ontologyLocation);
        var service = new RDFSGeneratorService(ontology);
        foreach (MetaClass metaClass in service.MetaClasses)
        {
            var rdfsClassTemplate = new RDFSClassTemplate(rootNamespace, metaClass);
            rdfsClassTemplate.Output.File = "Domain/" + metaClass.Name + ".cs";
            rdfsClassTemplate.Render();
        }
    }
}
So the T4 code will output the file into the "Domain" folder of my project. But the CodeDom code just writes the file to disk and doesn't update the project file.
Yes, it can. Here is how: http://www.olegsych.com/2009/09/t4-and-codedom-better-together/
Short answer is no, but I could be wrong (ever try to prove a negative?)
Your question was a little confusing, as CodeDom isn't exactly equivalent to T4. T4 templates are a convenient way of generating code files the same way that, for example, ASP.NET generates HTML files: mixing text and code that gets executed to generate a file that is then interpreted by something else (such as a compiler or a browser). CodeDom is usually used to generate assemblies at runtime rather than files, although you can do it (as you have discovered).
While T4 makes it easy to add files to the solution, you can do this with CodeDom as well. I don't believe it supports interaction with the solution directly, but you can manage this using EnvDTE, or the automation model for Visual Studio.
The problem with this is that the automation model isn't easy to work with. EnvDTE is a wrapper around COM classes, which is always fun to code against. Also, you have to be careful when attempting to get the object. The naive implementation will get the object from the first instance of Visual Studio loaded. You have to poll the Running Object Table to find the current instance. Once you have it, you must deal with searching through the dte for the location you're looking for, deal with source control, locked files, etc etc.
Working with it, you start to learn why T4 was created in the first place.
The question you have to ask yourself is, "Does CodeDom give me enough that T4 doesn't to make up for all its shortcomings?"

Are there any merge tools for source control that understand code?

I've recently been working through a large codebase, refactoring and generally improving design to increase coverage. Also, in quite a few files I've removed excess using statements, moved methods so that similar functionality is close together, added regions etc. but not actually changed the functionality of the code in the file.
Meanwhile, elsewhere on the team other developers are fixing bugs and changing lines of code here and there. Obviously when it comes to merging this can be an issue since line numbers no longer match and methods may have moved.
Now, I understand the general rule that in a source controlled environment it can be a dangerous thing to move methods around, and we decided that the benefit outweighed the cost. What I don't understand however is why it should be this way.
Say that my initial file was a simple calculator:
public class Calculator
{
    public int Subtract(int a, int b)
    {
        return a + b;
    }
    public int Add(int a, int b)
    {
        return a + b;
    }
}
And I decided that I wanted the methods to be alphabetical:
public class Calculator
{
    public int Add(int a, int b)
    {
        return a + b;
    }
    public int Subtract(int a, int b)
    {
        return a + b;
    }
}
While another developer fixed the bug in the subtract method
public class Calculator
{
    public int Subtract(int a, int b)
    {
        return a - b;
    }
    public int Add(int a, int b)
    {
        return a + b;
    }
}
A standard merge tool would probably require you to manually merge these two files, but one that understood the functionality of the code would easily be able to reconcile these two changes. The same applies to removing or adding other methods, comments, regions or using statements.
So, to (finally!) get to the question: Are there any merge tools out there that have an intelligent understanding of the functionality of code and could merge the two files above without any human intervention? If not, why not? Are there any complications which make this an unsolvable problem (of course I understand it isn't as simple as I'm implying, but is it impossible for some reason that I can't see?)
I use C# in my source code and would love something that worked with that, but I'm interested in whether this exists anywhere in the world of programming...
I'm already really concerned about the length of this question, but edited to add how I would expect the intelligent source system to work:
When the initial calculator file was checked in the system would parse the file and create a hierarchy of the class:
File: Calculator.cs
|
|--Class[0]: Calculator
    |
    |--Method[0]: Subtract
    |   |
    |   |--Line[0]: return a + b;
    |
    |--Method[1]: Add
        |
        |--Line[0]: return a + b;
(With extra lines in there for braces etc...)
When I check in my code (making the methods alphabetical) it updates the hierarchy above so that Subtract becomes Method[1] and Add becomes Method[0].
The second developer checks in his code (which the source control system obviously knows was based on the original) and the system notices the change to the first line in Subtract. Now, rather than finding that line by line number in the overall file, it knows that it can find it at Calculator.cs/Calculator/Subtract/0, and the fact that the method has changed location doesn't matter; it can still make the merge work.
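The idea above can be sketched as a three-way merge keyed by method name instead of line number. This is only a toy model written in Java for illustration, not how any existing tool works: each version is a map from method name to body, the order is taken from one side, and methods added or deleted by the other side are not modelled. All names are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MethodLevelMerge {
    /**
     * Three-way merge keyed by method name. Method order comes from "mine";
     * each body comes from whichever side changed it relative to "base".
     */
    static Map<String, String> merge(Map<String, String> base,
                                     Map<String, String> mine,
                                     Map<String, String> theirs) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : mine.entrySet()) {
            String name = entry.getKey();
            String original = base.get(name);
            String myBody = entry.getValue();
            String theirBody = theirs.get(name);
            if (theirBody == null || theirBody.equals(original)) {
                merged.put(name, myBody);     // other side has no differing version
            } else if (myBody.equals(original) || myBody.equals(theirBody)) {
                merged.put(name, theirBody);  // only the other side changed it
            } else {
                throw new IllegalStateException("Conflict in method " + name);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> base = new LinkedHashMap<>();
        base.put("Subtract", "return a + b;");
        base.put("Add", "return a + b;");

        Map<String, String> mine = new LinkedHashMap<>();   // methods reordered
        mine.put("Add", "return a + b;");
        mine.put("Subtract", "return a + b;");

        Map<String, String> theirs = new LinkedHashMap<>(); // bug fixed in Subtract
        theirs.put("Subtract", "return a - b;");
        theirs.put("Add", "return a + b;");

        System.out.println(merge(base, mine, theirs));
    }
}
```

For the calculator example this keeps the alphabetical order and picks up the fixed Subtract body without a conflict, because neither change is looked up by line number.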
Our approach with Plastic SCM is still far from being "complete", but it's already released and can help in this kind of situation. Take a look at Xmerge. Of course, feedback will be more than welcome and will grant some free licenses ;-)
I think that Source Code in Database is one potential answer to your question. The general idea is that you don't version files, you version blocks of code. The versioning system knows about the code DOM, and lets you query on the code DOM in order to check out functions, classes, what-have-you, for editing, compiling, etc.
Since the order of the methods doesn't necessarily matter, they're not stored in the Database with any order in mind. When you check out the class, you can specify the order that you like best (alphabetical, public/protected/private, etc). The only changes that matter are the ones like where you switch the + to a -. You won't have a conflict due to reordering the methods.
Unfortunately, SCID is still VERY young and there aren't many tools out there for it. However, it is quite an interesting evolution in the way one views and edits code.
Edit: Here's another reference for SCID