How do you evaluate a framework, library, or tool before adding it to your project?

There are so many cool ideas out there (Ninject, AutoMapper, SpecFlow, etc.) that look like they would help, but I don't want to add something, tell others about it, and try using it, just for it to be added to the growing heap of ideas that didn't quite work out. How can I determine whether the promised benefits will materialize, and that it won't end up as something to be ignored or worked around?

1) Have a problem.
2) Identify the cost of having the problem, or the value of solving it.
3) Prioritize it against other problems.
4) When it's the top priority, look for a solution that solves the problem at a proportional cost.
Do you have the problem that Ninject solves? Is it an important problem to solve? Is it the most important? What value will you get from solving it?

I don't think that you can tell whether any framework will deliver on your expectations until you try it, and try it in anger and in context. This is usually time-consuming, and inevitably you'll have a few misses before you get any hits. Don't commit yourself on the strength of a simple sample from the author's website or how-to files; these will always work and may impress, but until you try to use the framework in the context of your billion-user, multilingual, real-time on- and off-line application, you're not going to find its shortcomings.

Related

Paster vs. ArchGenXML

I'm asking myself: is it better to use Paster or ArchGenXML for creating content types, browser views, portlets, etc.?
Which of the two creates better source code?
Is there an advantage to using one or the other?
Thanks.
They're two quite different things.
Paster creates an initial skeleton. Once it's done, you're on your own.
ArchGenXML needs a UML model first, and then it can create quite complex systems of code. As long as you modify your code only within prescribed regions of the .py file, you can change your model and rerun ArchGenXML as often as you wish.
Either one only generates code as good as its authors have provided, and while I use ArchGenXML extensively, I see a fair bit of deprecated code generated. On the other hand, I've never seen it generate completely invalid code.
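To illustrate the "prescribed regions": ArchGenXML brackets the hand-editable spots in its generated .py files with paired code-section comments, and only edits between those markers survive a regeneration. A rough sketch of the convention (class and method names here are hypothetical):

    class PressRelease:  # regenerated from the UML class of the same name

        ##code-section class-header  #fill in your manual code here
        # anything you write between these markers is preserved
        # the next time ArchGenXML is run against the model
        ##/code-section class-header

        def generated_method(self):
            # generated from the model; hand edits here are lost on rerun
            pass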
I use ArchGenXML because I like having my original source in UML.
Just to give you another point of view:
I think that ArchGenXML is a tool for someone who doesn't really want to get his/her hands dirty for as long as possible: a tool that lets you take a little more time planning and a little less time coding (I see this as a negative point). Paster, on the contrary, is just a comfort to speed up some work, and you will get your hands dirty very soon.
I started coding in Plone using paster; then, after a while, when I felt secure, I abandoned it (like the baby walker :D). And then you learn to run, and after some time you get old and lazy, and you realize that, after all, paster is still your friend.
ArchGenXML, on the other hand, I think is an obstruction to your learning.
Plus, if you use paster and at the end of the project your code sucks, you have only yourself to blame (I see this as a good point).
These are just my 2¢.

Reuse vs. maintainability and ease of testing

Everyone likes to talk about reusability. Where I work, whenever some new idea is being tossed around or tested out, the question of reusability always comes up. "We want to maximize our investment in this, let's make it reusable." "Reusability will bring higher quality with less work." And so on and so on.
What I've found is that when a reusable component or idea is introduced, everyone is immediately afraid of it and writes it off as a bad idea. Once applications become dependent on it, they say, it won't be maintainable, and any changes will result in the need to do regression testing on everything that uses it. People here point to one component in particular that has been around a long time and has a whole lot of dependents and grouse that it's become impossible to change because we don't know what the changes will break.
My responses to this complaint are:
1) It's good that change to a component that has many dependents is slow, because it forces the designers to really think through the changes.
2) Time should be taken to get the component right in the first place. Corollary: if you're finding the need to change it all the time, it was never very reusable to begin with, was it?
3) Software development is hard and requires work. So does testing. You just gotta do it.
Unfortunately, what people hear in these responses are "slow," "time" and "effort."
I would love it if there were a magic "make this reusable" switch I could flip on things I build so as to win brownie points from management, but things don't work that way. Making something reusable takes time and effort, and you're still not guaranteed to get it right.
How do you deal with the request for "reusability" when delivering on it seems to bring nothing but complaints?
Reusability is only worthwhile if something will actually be reused. Make sure you have some practical reuse cases before you write something reusable.
Even if a reusable library is 10x harder to maintain than an ad-hoc version of itself, you're still saving on maintenance overall if the reusable library is used in place of ad-hoc versions in 10 different places.
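A back-of-envelope illustration (numbers hypothetical): say each ad-hoc copy costs 1 unit of effort per maintenance cycle. Ten ad-hoc copies cost 10 units per cycle, and every fix has to be found and re-applied 10 separate times; any copy you miss stays broken. A shared library at 10x the per-change cost also costs 10 units, but each fix is made exactly once and lands everywhere, and past 10 call sites the shared version is a straight win.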
Reusability makes sense when code shares genuinely similar behavior or an "is-a" relationship. If you just want to reuse a block because you see it appearing again and again, but the occurrences have no characteristic in common, you are better off leaving them alone, loosely coupled. That way we keep more flexibility to modify them later.
One thing that we often do is to use versions and avoid the constant retesting. Just because there is a new version of common code doesn't mean everything has to use the new version right away. When something is getting updated for other reasons, update to the new version of the common code.
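A minimal sketch of how that looks in practice, using Python-style pinned dependency files (library name and version numbers are hypothetical):

    # app A's requirements.txt: stays on the proven release until
    # its own next maintenance cycle
    common-lib==1.4.2

    # app B's requirements.txt: under active development anyway, so it
    # takes the new common code and retests it as part of its own cycle
    common-lib==2.0.0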

I'm rewriting my code around 10 times before finishing. Is this wrong?

When I start writing something complex, I find that I restart the writing something like 10 times before I end up with what I want, often discarding hundreds of lines of code.
Am I doing something wrong, or do others have workflows like this?
EDIT: Right now, I'm working on a modular compiler. The last project I was working on was a server in java. Before that it was some concurrency stuff.
I do a fair bit of planning, and I never start coding before I've got interfaces for everything.
Given that, is it normal to just wipe the slate clean repeatedly?
Discarding many lines of code is usually a positive aspect of refactoring. That's great. But starting over ten times means that you probably haven't analyzed your problem and solution. It's fine to backtrack and sometimes to start over but not that often. You should lay out your code in such a way that when you backtrack and refactor, you keep most of what you created because it will exist in nicely isolated and logical chunks. (Using vague language since language of choice wasn't specified.)
From author's comment:
Usually I restart because I get confused by all the stuff going on in my code.
Study your craft and make good use of design patterns and other best programming philosophies to lend your code a well-defined structure... something you'll recognize months and even days down the road.
If you're starting something complex, a little planning before you start writing would seem to be a good idea.
Design first.
Perfectly normal. No matter how much I plan ahead, I very often have an "Aha!" moment once the hands hit the keyboard.
Just as often it's a "What the heck was I thinking?" moment.
It's all good. You're making better code.
All the suggestions here are valid; however, remember that there's a moment in a program's lifetime when it is "good enough". It's easy to fall into the trap of never-ending refactoring, just because you see that "yes, this could be done better!". Well, face the truth: unless your program has just a few lines, there's ALWAYS a way to do it better.
I believe there are happy programmers out there that don't suffer from that, but at least I need to keep reminding myself that there's a line that's called "good enough".
And it's especially true if you're writing code for someone else -- nobody will note that you did something "nicer", all that counts is "does it work well?".
Also, a VERY GOOD practice is at least to get it to WORK before rewriting. Then you can always fall back to a working previous solution.
(I've been constantly rewriting a game for 12 years now, and I'm nowhere near the end...)
On a complex problem, this is common. If you aren't totally stabbing in the dark, it really helps to sketch out your ideas first, but then again you're just moving the "retries" from code to paper.
If it helps you get to a good solution, how can it be wrong?
It depends on how well I know the problem space. If it is familiar territory, then I'd be worried if it took 10 iterations. If it is unfamiliar territory, then it might take as many as 10 iterations, but at least some of them would be reduced to prototype - or an attempt at a prototype - before being discarded.
If you are learning something with each iteration, then there probably isn't a problem. Deadlines and pesky things like that might make your life a little more difficult. :)
When I am working on a new problem, I like to pseudocode it out in comments in the actual function handler, as part of generating the stub for my TDD. Then I add the code in under each step I noted in the comments of the function body.
It helps me keep focused on the problem I am solving and not get lost in the details too early.
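A minimal Python sketch of that style (all names hypothetical): the numbered comments are the pseudocode plan, the code is filled in under each one, and the test exists before the body does:

    from dataclasses import dataclass

    @dataclass
    class Order:
        id: int

    @dataclass
    class Payment:
        order_id: int

    def reconcile(orders, payments):
        """Match each order to its payment and flag the leftovers."""
        # 1. index payments by order id
        by_order = {p.order_id: p for p in payments}
        # 2. pair every order with its payment, if any
        matched = [(o, by_order.get(o.id)) for o in orders]
        # 3. anything unpaired goes in the exception report
        unmatched = [o for o, p in matched if p is None]
        return matched, unmatched

    # the TDD side: this assertion was written before the body above
    assert reconcile([Order(1)], [])[1] == [Order(1)]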
The single biggest change you can do to help yourself would be to plan your code first. On paper.
Your plan doesn't have to be super in-depth (although sometimes that's good too). Just sketch out a rough idea of what you want your program to do. Jot down the key functionality points. "I want it to do this, this and this".
Once you have that, sketch out a rough design. Classes, methods, functions, objects. Give it a little form. Do a rough allocation of functionality to various portions of your design.
Depending on the nature of the project, I might take a rough design such as that and turn it into a much more detailed design. If it's a small project, maybe not. But no matter the projected complexity, time spent designing will reward you with better code, and less time spent coding. If you have obvious mistakes that require you to refactor large portions of your program, they should be apparent in your initial design and you can adjust it. You won't have wasted hundreds of lines of code on an obvious mistake.
Compilers are very complex applications, and you can't write an optimizing compiler from start to finish in one pass - no matter how much thought you put into it at first. Usually you attempt to get something to work correctly from start to finish and then go back to modularizing it and adding new features like optimizations. This process means lots of refactoring and replacing whole sections outright. This is also part of the learning processes - as no one can know everything and remember it!
(I'm also working on a .NET compiler as part of the MOSA project - www.mosa-projet.org.)
Solve your problem on paper... don't be in such a rush to type.
There are two situations. (1) I've been able to confidently plan ahead and isolated my abstractions. (2) I haven't.
For (1), an effective technique is to put in dummy versions of certain classes or functions just to drive the rest of the code. (or conversely, to write said classes and functions and drive them with a test script.) This allows you to tackle only part of the complexity in each pass.
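A minimal Python sketch of technique (1), with hypothetical names: the dummy returns canned data, so the calling code can be written and exercised before the real class exists:

    class DummyPriceFeed:
        """Stand-in for the real feed; returns canned data."""

        def latest(self, symbol):
            return 100.0  # a fixed price is enough to drive the callers

    def portfolio_value(feed, holdings):
        # the code under development, driven by the dummy for now
        return sum(qty * feed.latest(sym) for sym, qty in holdings.items())

    print(portfolio_value(DummyPriceFeed(), {"ACME": 5}))  # 500.0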
As much as everyone says people should plan in advance, it often doesn't work that way, resulting in situation (2). Here, be careful to manage what you are trying to accomplish in one iteration of code. As soon as you find your brain unable to juggle all the things you are doing, scale back your ambition for what you want to achieve before the next compile-and-test. Allow your code to be flawed but easy-to-write on the first pass, and then develop it through refactoring. This improves efficiency over repeatedly wiping the slate clean.
For example, one way I used to get into messes was by sniffing out common code and refactoring into subroutines too early, before I really knew the shape of the code. I've since started allowing myself to duplicate code on the first pass, and then going back and factoring it into subroutines later. It has helped tremendously.
It's called refactoring, buddy, and it's good; you just need to limit it so you don't end up spending all your time refactoring code that you have and that works, instead of writing new code.
Some of the reasons why one must refactor are:
Enhancing performance.
Organizing code.
You need to write your code in a different way to get something to work.
To do something in a different way because it saves a lot of work (e.g., using MXML instead of ActionScript).
You used the wrong name for a variable.
Consider learning some framework in whatever language you're using (or in any language for that matter).
I think that learning frameworks made my code a million times better. By learning the frameworks (and more importantly how they work) I learned not just design patterns, but how to implement them realistically.
Consider looking at Rails, CakePHP, or Django (assuming you're using a scripting language; I don't know any desktop-language frameworks, sorry!). Then see how their pieces fit together.

Getting your head around other people's code

I'm occasionally unfortunate enough to have to make alterations to very old, poorly documented and poorly designed code.
It often takes a long time to make a simple change because there is not much structure to the existing code and I really have to read a lot of code before I have a feel for where things would be.
What I think would help a lot in cases like this is a tool that would allow one to visualise an overview of the code, and then maybe even drill down for more detail. I suspect such a tool would be very hard to get right, given that is trying to find structure where there is little or none.
I guess this is not really a question but rather a musing, so I'll make it into a question: what do others do to get their head around other people's code, the good and the bad?
Hmm, this is a hard one; so much to say, so little time...
1) If you can run the code, it makes life soooo much easier. Breakpoints (especially conditional ones) are your friend.
2) A purist's approach would be to write a few unit tests for known functionality, then refactor to improve the code and your understanding, then re-test; if things break, create more unit tests. Repeat until bored/old/moved to a new project. (A sketch of this follows the list.)
3) ReSharper is good at showing where things are being used (what's calling a method, for instance). It's static analysis, but a good start, and it helps with refactoring.
4) Many .NET events are coded as public, and events can be a pain to debug at the best of times. Recode them to be private and use a property with add/remove; you can then use a breakpoint to see what is listening on an event.
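For point 2, a minimal characterization-test sketch in Python (the module, function, and expected value are all hypothetical): you assert what the legacy code currently does, not what it should do, so later refactoring can't silently change the behaviour:

    import unittest

    import legacy_billing  # hypothetical legacy module under test

    class CharacterizationTests(unittest.TestCase):
        def test_known_invoice_total(self):
            # pins today's observed output; if a refactoring breaks
            # this, you've changed behaviour, not just structure
            self.assertEqual(legacy_billing.total(invoice_id=42), 118.50)

    if __name__ == "__main__":
        unittest.main()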
BTW, I'm playing in the .NET space and, like Joel, would love a tool to help do this kind of stuff. Does anyone out there know of a good dynamic code-reviewing tool?
I have been asked to take ownership of some NASTY code in the past - both work and "play".
Most of the amateurs I took over code for had just sort of evolved the code to do what they needed over several iterations. It was always a giant incestuous mess of library A calling B, calling back into A, calling C, calling B, etc. A lot of the time they'd use threads and not a critical section was to be seen.
I found the best/only way to get a handle on the code was to start at the OS entry point [main()] and build my own call-stack diagram showing the call tree. You don't really need to build a full tree at the outset; just trace through the section(s) you're working on at each stage and you'll get a good enough handle on things to be able to run with it.
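If the language has tracing hooks, you can automate a first cut of that call tree; here is a minimal Python sketch (the main()/helper() program under study is a hypothetical stand-in):

    import sys

    depth = 0

    def tracer(frame, event, arg):
        """Print one indented line per call, forming a rough call tree."""
        global depth
        if event == "call":
            print("  " * depth + frame.f_code.co_name)
            depth += 1
        elif event == "return":
            depth -= 1
        return tracer  # keep tracing inside the new frame

    def helper():
        pass

    def main():  # stand-in for the program's real entry point
        helper()

    sys.settrace(tracer)
    main()
    sys.settrace(None)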
To top it all off, use the biggest slice of dead tree you can find and a pen. Laying it all out in front of you so you don't have to jump back and forward on screens or pages makes life so much simpler.
EDIT: There's a lot of talk about coding standards... they will just make poor code look consistent with good code (and usually harder to spot). Coding standards don't always make maintaining code easier.
I do this on a regular basis and have developed some tools and tricks.
Try to get a general overview (object diagram or other).
Document your findings.
Test your assumptions (especially for vague code).
The problem with this is that at most companies you are judged by results. That's why some programmers write poor code fast and move on to a different project. So you are left with the garbage, and your boss compares your sluggish progress with the quick-and-dirty guy's. (Luckily my current employer is different.)
I generally use UML sequence diagrams of various key ways that the component is used. I don't know of any tools that can generate them automatically, but many UML tools such as BoUML and EA Sparx can create classes/operations from source code which saves some typing.
The definitive text on this situation is Michael Feathers' Working Effectively with Legacy Code. As S. Lott says, get some unit tests in to establish the behaviour of the legacy code. Once you have those in, you can begin to refactor. There seems to be a sample chapter available on the Object Mentor website.
I strongly recommend BOUML. It's a free UML modelling tool, which:
is extremely fast (fastest UML tool ever created, check out benchmarks),
has rock solid C++ import support,
has great SVG export support, which is important, because viewing large graphs in vector format, which scales fast in e.g. Firefox, is very convenient (you can quickly switch between "birds eye" view and class detail view),
is full featured and intensively developed (look at the development history; it's hard to believe that such fast progress is possible).
So: import your code into BOUML and view it there, or export to SVG and view it in Firefox.
See Unit Testing Legacy ASP.NET Webforms Applications for advice on getting a grip on legacy apps via unit testing.
There are many similar questions and answers. Here's the search https://stackoverflow.com/search?q=unit+test+legacy
The point is that getting your head around legacy is probably easiest if you are writing unit tests for that legacy.
I haven't had great luck with tools to automate the review of poorly documented/executed code, because a confusing/badly designed program generally translates into a less-than-useful model. It's not exciting or immediately rewarding, but I've had the best results with picking a spot and following the program execution line by line, documenting and adding comments as I go, and refactoring where applicable.
A good IDE (Emacs or Eclipse) can help in many cases. Also, on a UNIX platform there are some tools for cross-referencing (etags, ctags) and checking (lint), or gcc with many, many warning options turned on.
First, before trying to comprehend a function/method, I would refactor it a bit to fit your coding conventions (spaces, braces, indentation) and remove most of the comments if they seem to be wrong.
Then I would refactor and comment the parts I understood, and try to find/grep those parts across the whole source tree and refactor them there as well.
Over time, you get nicer code that you like to work with.
I personally do a lot of drawing of diagrams, and figuring out the bones of the structure.
The fad du jour (and possibly quite rightly) has got me writing unit tests to test my assertions and to build up a safety net for changes I make to the system.
Once I get to a point where I'm comfortable enough knowing what the system does, I'll take a stab at fixing bugs in the sanest way possible, and hope my safety net has neared completion.
That's just me, however. ;)
I have actually been using the refactoring features of ReSharper to help me get a handle on a bunch of projects that I inherited recently. So, to figure out another programmer's very poorly structured, undocumented code, I actually start by refactoring it.
Cleaning up the code, renaming methods, classes and namespaces properly, and extracting methods are all structural changes that can shed light on what a piece of code is supposed to do. It might sound counterintuitive to refactor code that you don't "know", but trust me, ReSharper really allows you to do this. Take, for example, the issue of red-herring dead code. You see a method in a class or perhaps a strangely named variable. You can start by trying to look up usages or, ugh, do a text search, but ReSharper will actually detect dead code and color it gray. As soon as you open a file, you see in gray (and via scroll-bar flags) what would in the past have been confusing red herrings.
There are dozens of other tricks and probably a number of other tools that can do similar things, but I am a ReSharper junkie.
Cheers.
Get to know the software intimately from a user's point of view. A lot can be learnt about the underlying structure by studying and interacting with the user interface(s).
Printouts
Whiteboards
Lots of notepaper
Lots of Starbucks
Being able to scribble all over the poor thing is the most useful method for me. Usually I turn up a lot of "huh, that's funny..." moments while trying to make basic code-structure diagrams, and in the end those turn out to be more useful than the diagrams themselves. Automated tools are probably more helpful than I give them credit for, but the value of finding those funny bits exceeds the value of rapidly generated diagrams for me.
For diagrams, I look for mostly where the data is going. Where does it come in, where does it end up, and what does it go through on the way. Generally what happens to the data seems to give a good impression of the overall layout, and some bones to come back to if I'm rewriting.
When I'm working on legacy code, I don't attempt to understand the entire system. That would result in complexity overload and subsequent brain explosion.
Rather, I take one single feature of the system and try to understand completely how it works, from end to end. I will generally debug into the code, starting from the point in the UI code where I can find the specific functionality (since this is usually the only thing I'll be able to find at first). Then I will perform some action in the GUI, and drill down in the code all the way down into the database and then back up. This usually results in a complete understanding of at least one feature of the system, and sometimes gives insight into other parts of the system as well.
Once I understand what functions are being called and what stored procedures, tables, and views are involved, I then do a search through the code to find out what other parts of the application rely on these same functions/procs. This is how I find out if a change I'm going to make will break anything else in the system.
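A minimal sketch of that search, as a Python script (the root directory, extensions, and proc name are all hypothetical):

    import pathlib

    def find_usages(root, name, exts=(".cs", ".sql")):
        """Report every file/line under `root` that mentions `name`."""
        for path in pathlib.Path(root).rglob("*"):
            if path.is_file() and path.suffix in exts:
                lines = path.read_text(errors="ignore").splitlines()
                for lineno, line in enumerate(lines, 1):
                    if name in line:
                        print(f"{path}:{lineno}: {line.strip()}")

    find_usages("src", "usp_GetInvoiceTotals")  # hypothetical stored proc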
It can also sometimes be useful to attempt to make diagrams of the database and/or code structure, but sometimes it's just so bad or so insanely complex that it's better to ignore the system as a whole and just focus on the part that you need to change.
My big problem is that I (currently) have very large systems to understand in a fairly short space of time (I pity contract developers on this point) and don't have a lot of experience doing this (having previously been fortunate enough to be the one designing from the ground up.)
One method I use is to try to understand the meaning of the naming of variables, methods, classes, etc. This is useful because it (hopefully increasingly) embeds a high-level view of a train of thought from an atomic level.
I say this because typically developers will name their elements (with what they believe are) meaningfully and providing insight into their intended function. This is flawed, admittedly, if the developer has a defective understanding of their program, the terminology or (often the case, imho) is trying to sound clever. How many developers have seen keywords or class names and only then looked up the term in the dictionary, for the first time?
It's all about the standards and coding rules your company is using.
If everyone codes in a different style, then it's hard to maintain another programmer's code, etc. If you decide which standard you'll use and have some rules, everything will be fine :) Note that you don't have to make a lot of rules, because people should have the possibility to code in the style they like; otherwise you can be very surprised.

Is there a good obfuscator for Perl code?

Does anyone know of a good code obfuscator for Perl? I'm being asked to look into the option of obfuscating code before releasing it to a client. I know obfuscated code can still be reverse engineered, but that's not our main concern.
Some clients are making small changes to the source code that we give them, and it's giving us nightmares when something goes wrong and we have to fix it, or when we release a patch that doesn't work with what they've changed. So the intention is just to make it difficult for them to make their own changes to the code (they're not supposed to be doing that anyway).
I've been down this road before and it's an absolute nightmare when you have to work on "obfuscated" code because it drives up costs tremendously trying to debug a problem on the client's server when you, the developer, can't read the code. You wind up with "deobfuscators", copying the "real code" to the client's server or any of a number of other issues which just become a real hassle to maintain.
I understand where you're coming from, but it sounds like management has a problem and they're looking to you to implement a chosen solution rather than figuring out what the correct solution is.
In this case, it sounds like it's really a licensing or contractual issue. Let 'em have the code open source, but make it a part of the license that any changes they submit have to come back to you and be approved. When you push out patches, check the md5 sums of all code and if it doesn't match what's expected, they're in license violation and will be charged accordingly (and it should be a far, far higher rate). (I remember one company which let us have the code open source, but made it clear that if we changed anything, we've "bought" the code for $25,000 and they were no longer responsible for any bug fixes or upgrades unless we bought a new license).
Don't. Just don't.
Write it into the contract (or revise the contract if you have to), that you are not responsible for changes they make to the software. If they're f-ing up your code and then expecting you to fix it, you have client problems that aren't going to be solved by obfuscating the code. And if you obfuscate it and they encounter an actual problem, good luck in getting them to accurately report line number, etc., in the bug report.
Please don't do that. If you don't want people to alter your Perl code, then put it under an appropriate licence and enforce that licence. If people change your code when your licence says they shouldn't, then it's not your problem when your updates no longer work with their installation.
See perlfaq3's answer to "How can I hide the source for my Perl programs?" for more details.
It would seem your main issue is clients modifying code, which then makes it difficult for you to support it. I would suggest you ask for checksums (md5, sha, etc.) of their files when they come to you for support, and similarly check files' checksums when patching. For example, you can ask the client to provide the output of a provided program which goes through their install and checksums all the files (a sketch of such a program follows below).
Ultimately they have the code, so they can do whatever they want to it. The best you can do is enforce your licenses and to make sure you only support unmodified code.
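A minimal sketch of such a checksum program (written in Python for brevity; the version you ship could just as well be Perl, and the install root and file pattern here are hypothetical):

    import hashlib
    import pathlib

    def manifest(root):
        """Print 'md5  path' for every module under the install root."""
        for path in sorted(pathlib.Path(root).rglob("*.pm")):
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            print(f"{digest}  {path}")

    # run on the client's install and diff against your release manifest
    manifest("lib")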
In this case obfuscating is the wrong approach.
When you release the code to the client you should keep a copy of the code you send them (either on disk or preferably in your version control as a tag/branch).
Then if your client makes changes you can compare the code they have to the code you sent them and easily spot the changes. After all if they feel the need to make changes there is a problem somewhere and you should fix it in the master codebase.
Another alternative for converting your program into a binary is the free PAR-Packer tool on CPAN. There are even filters for code obfuscation, though as others have said, that's possibly more trouble than it's worth.
I agree with the previous suggestions.
However if you really want to, you can look into PAR and/or Filter::Crypto CPAN modules. You can also use them together.
I used the latter (Filter::Crypto) as a really lightweight form of "protection" when we were shipping our product on optical media. It doesn't "protect" you, but it will stop 90% of the people that want to modify your source files.
This isn't a serious suggestion; however, take a look at Acme::Buffy.
It will at least brighten your day!
An alternative to obfuscation is converting your script to a binary using something like ActiveState's Perl Dev Kit.
I am running a Windows OS and use perl2exe from IndigoSTAR. The resulting .EXE file is unlikely to be changed on-site.
As others have said, "how do I obfuscate it" is the wrong question. "How do I stop the customer from changing the code" is the right one.
The checksum and contract ideas are good for preventing the "problems" you describe, but if the cost to you is the difficulty of rolling out upgrades and bug fixes, how are your clients making changes that don't pass the comprehensive test suite? If they are capable of making these changes (or at least, making a change which expresses what they want the code to do), why not simply make it easy/automated for them to open a support ticket and upload the patch? The customer is always right about what the customer wants (they might not have a clue how to do it "the right way", but that's why they are paying you).
A better reason to want an obfuscator would be for mass-market desktop deployment where you don't have every customer on a standing contract. In that case, something like PAR -- anything which packs the encryption/obfuscation logic into a compiled binary is the way to go.
As several folks have already said: don't.
It's pretty much implicit, given the nature of the Perl interpreter, that anything you do to obfuscate the Perl must be undoable before Perl gets its hands on it, which means you need to leave the de-obfuscation script/binary lying around where the interpreter (and thus your customer) can find it :)
Fix the real problem: checksums and/or a suitably worded license. And support staff trained to say 'you changed it? we're invoking clause 34b of our license, and that'll be $X,000 before we touch it'....
Also, read "Why should I use obfuscation?" for a more general answer.
I would just invite them into my SVN tree on their own branch so they can provide changes and I can see them and integrate their changes into my development tree.
Don't fight it, embrace it.
As Ovid says, it's a contractual, social problem. If they change the code, they invalidate the warranty. Charge them a lot to fix that, but at the same time, give them a channel where they can suggest changes. Also, look at what they want to change and make that part of the configuration if you can. They have something they want to do, and until you satisfy that, they are going to keep trying to get around you.
In Mastering Perl, I talk a bit about defeating obfuscators. Even if you do things like making nonsense variable names and the like, modules such as B::Deparse and B::Deobfuscate, along with Perl tools such as Perl::Tidy, make it pretty easy for a knowledgeable and motivated person to get your source. You don't have to worry about the unknowledgeable and unmotivated so much, because they don't know what to do with the code anyway.
When I talk to managers about this, we go through the normal cost benefit analysis. There is all sorts of stuff you could do, but not much of it costs less than the benefit you get.
Good luck,
Another not-so-serious suggestion is to use Acme::Bleach; it will make your code very clean ;-)