Which Roblox/Luau classes can have malware/scripts hidden inside? Which classes will still be executed as a script? Which classes cannot contain a malicious script? Audio?
Since the complaint has been made that it's not clear what I'm asking, I've added emphasis above and put the title in there, too.
Ok, so I'm trying to learn how to detect and remove malware from things in the Roblox Studio Toolbox. That is a tall order, since I'm still learning Luau and there are many ways to conceal malware, including obfuscation techniques (odd spacing, reversed strings, reversed ASCII strings, getfenv(), hidden teleports, nested scripts, scripts that were reclassified as something else, like a weld, etc.).
Reclassified malware is the thing I have the most trouble with, although long scripts and scripts split across different files can be a pain, too. I do things by trial and error, as in the case of the Sakura Tree model by TreelingDeveloper (rbxassetid://6787294322). I stripped it of everything except the Trunk and its Mesh, the Falling Leaves and Particle Emitter, and the Leaves and their Mesh, and it still looks much the same, despite my removing a couple dozen pieces, including two scripts that were nested inside several welds and claimed to weld the bark on.
Edit: I rechecked the Sakura tree after posting. Deleting all those parts did reduce its visual appeal, though not drastically. There were a lot of "Bark," "Other" and "Weld" parts that I deleted, and even the ThumbnailCamera. As it turns out, keeping all of the "Bark" and "Other" parts adds detail to the trunk. I can't see a use for the ThumbnailCamera or for the welds and "auto-weld" scripts.
It's not terribly hard to use Ctrl+Shift+F to search for words like "getfenv," "string.reverse," "require," "eriuqer," and "teleport," but it is beyond my level of ability to find everything.
If you have any suggestions or tips on the question, or on the larger issue of malware in Roblox assets, I'd love to hear them. Thanks!
Antimalware Plug-ins
Thus far, I have reviewed roughly ten plug-ins for detecting malware. None of them seems to have behavioral or real-time detection. They all appear to use simplistic heuristic detection, often relying on common words and phrases associated with known malware, as well as certain Luau functions and obfuscation techniques. Those that I thought were worth using, inadequate as they are, are GameGuard, Guardian Angel Defender, Mirror Egg, and Ro-Protect. Unfortunately, they produce a fair number of false positives (Mirror Egg, for example, suggests that anything named "Fire," including fire effects I added myself, may be malware). GAD has the best UI and seems to find more than anything else, but it also flags a lot of "empty objects" that I'm not sure what to make of. None of them work in real time, and none is good enough on its own.
Explanation
I'm not artistic, so it really isn't possible for me to develop my own models, meshes, and such. People who say that those who use the Toolbox are lazy and get what they deserve are apparently unaware that not everyone is a master craftsman when it comes to CG.
Roblox Studio's Toolbox is jam-packed with assets, which is great, except many of them contain malware.
Beyond this point is mostly a rant. Feel free to ignore, unless you're going to tell me to contact Roblox Customer Service or go to the Roblox DevForum.
You may ask why I don't post this on the DevForum. I've been there since April and, despite 7 hours of reading and over 1k likes, I still have not become a "regular". I asked customer service about this, but they gave me the run-around, referring me back to the (deliberately vague) rules for becoming a regular and generally being eager to get rid of me as fast as possible without actually helping.
You may ask why I didn't ask Roblox for help.
I have asked both the Appeals team and Customer Service. The Appeals team had given me a 3-day IP ban and permanently suspended my unlisted, *private*, still-in-development game because, they stated, I'd added an "inappropriate model" from the Toolbox (a model I didn't modify), and they then rejected my appeal without giving any valid reason. Customer Service gave me the run-around and did their very best not to help me and to get rid of me as rapidly as possible. So now I'm trying to learn how to protect myself. In short, Roblox apparently refuses to take any real steps to deal with the plethora of malware, and its help pages DO NOTHING to teach developers how to find it. I've learned more on my own, which isn't nearly enough.
If I sound irritated, it's because I am. They have some of the worst customer support I've ever encountered in the 36 years of my adult life, and I've got many years of experience in that field, including tech support.
The Roblox tutorial pages have this to say when they introduce the Toolbox:
> Anyone can upload an item to the Toolbox, so make sure your game still works after adding an item before settling on it. To learn how to inspect a model before inserting it, see [Item Inspection][1].
The linked page is woefully inadequate, and so is the [What Is This Infected Model On My Place?][2] help article that Customer Support referred me to.
[1]: https://developer.roblox.com/en-us/resources/studio/Toolbox#item-inspection
[2]: https://en.help.roblox.com/hc/articles/203312920
When it comes to things that can execute code, the answer is Scripts and LocalScripts.
Some things to know: Scripts are only active in a few locations. According to the docs:
The instant that the following conditions are met, a Script's Lua code is run in a new thread:

- The Disabled property is false
- The Script object is a descendant of the Workspace or ServerScriptService
Similarly for LocalScripts:

A LocalScript will only run Lua code if it is a descendant of one of the following objects:

- A Player's Backpack, such as a child of a Tool
- A Player's character model
- A Player's PlayerGui
- A Player's PlayerScripts
- The ReplicatedFirst service
None of this applies in Edit Mode; scripts only run like this when you are testing in Play Mode.
The thing is, a Script instance can be inserted into the hierarchy of anything. Audio, Meshes, Decals, and so on: none of these things execute code on their own, but they are often Trojan horses for delivering Scripts into the Workspace. And because the Toolbox inserts things into the Workspace by default, that creates the avenue for the exploits you're describing in your question.
This is why I recommended the Venom plugin by pa00: it allows you to strip out any and all Scripts that might come with an asset. It is an easy countermeasure when you are only looking for simple things. The suggestion falls apart once you start looking at more complicated assets like vehicles, Tools, and guns, where the interactions have to be scripted, but it's a starting point.
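If you'd rather audit an asset outside Studio, one low-tech option is to save the place in Roblox's XML format (.rbxlx) and scan the file with a short script. Here is a minimal sketch in Python; the Item/ProtectedString layout reflects my understanding of the .rbxlx format (verify it against a place file you trust), and the keyword list just mirrors the naive Ctrl+Shift+F search from the question, so it inherits the same blind spots against novel obfuscation.

```python
import sys
import xml.etree.ElementTree as ET

# Classes whose instances contain runnable Luau code.
SCRIPT_CLASSES = {"Script", "LocalScript", "ModuleScript"}

# The same naive keyword list as the manual Ctrl+Shift+F search.
SUSPICIOUS = ["getfenv", "setfenv", "loadstring", "string.reverse",
              "require", "eriuqer", "teleport"]

def scan(path):
    tree = ET.parse(path)
    for item in tree.iter("Item"):
        cls = item.get("class", "")
        if cls not in SCRIPT_CLASSES:
            continue
        name_el = item.find("Properties/string[@name='Name']")
        name = name_el.text if name_el is not None else "?"
        # A script's code lives in a ProtectedString property named "Source"
        # (my understanding of the .rbxlx layout; check on a trusted file).
        src_el = item.find("Properties/ProtectedString[@name='Source']")
        source = (src_el.text or "").lower() if src_el is not None else ""
        hits = [w for w in SUSPICIOUS if w in source]
        verdict = "SUSPICIOUS " + str(hits) if hits else "no keyword hits"
        print(cls, repr(name), "->", verdict)

if __name__ == "__main__":
    scan(sys.argv[1])  # usage: python scan_rbxlx.py MyPlace.rbxlx
```

One useful property of this approach: it lists every instance whose class is actually Script, LocalScript, or ModuleScript, no matter what it is named or how deeply it is nested, so a script calling itself "Weld" and buried inside a model still shows up.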
Still very new to MATLAB and to programming in general, I'm stuck trying to understand how best to profit from what is documented in the official MATLAB documentation in the chapter Calling External Functions. I am, as yet, not capable of judging which of the many pathways on offer might be the most effective one to pursue, in terms of how easily a beginner can learn it and which one can, over the long term, be applied in MATLAB code most easily and clearly, over and over again.
To put it as a very practical question: since, for instance, the third-party image-processing libraries ITK and OpenCV provide Java, C++, C, and Python interfaces, and MATLAB has functionality to address such interfaces, which interface should a beginner in programming choose? Is one of them laid out more clearly, and thus easier to get comfortable with and quicker to learn to apply?
I'm afraid I'll now hear from everybody something like "well, it depends on what you want to do," and my answer could only be "I don't know yet; I'm learning programming and would prefer to first gain some general success by taking the cleaner-designed, easier-to-understand approach, so I'd like a recommendation on where to start."
Please let me add this to my question and concerns: the highly respected Yair Altman states on his site undocumentedmatlab.com, in the advertisement for his book Undocumented Secrets of MATLAB-Java Programming, that the MATLAB programming environment relies on Java for numerous tasks, including networking, data-processing algorithms, and the graphical user interface (GUI). I take from this statement that learning to connect MATLAB to Java in particular will have significant advantages; The MathWorks itself seems to have decided to take advantage of such a connection when implementing MATLAB.
But I can also see that The MathWorks, by providing the MEX functionality for MATLAB, seems to lean towards tight C/C++ integration, offering MEX alongside the other ways to call external C functions.
For me as a beginner, it is confusing to work out which route of connectivity to external languages should be taken as the "standard" or "first to be recommended" one. Would any of you experienced programmers have some arguments for me as to which route to focus on first? Learning to program is a long journey, and I would not like to waste time on poorly chosen pathways.
This question sounds like: "I am still learning how to drive, still not a very experienced driver. Please give me your tips about how to change a flat tire, what is the best tire to get a flat tire on, passenger side rear? what are the best places to get a flat tire, is it the mall or my office parking lot or the middle of the street."
Let me give you some tips:
Changing a tire will not make you a more knowledgeable driver. You will learn very few things from doing it; it is a frustrating experience, and it is not worth your time right now. Learn how to drive.
Explanation: Making MATLAB call Java/C++/C or whatever other language will not make you a better MATLAB programmer, and frankly it is of secondary importance. As long as the first sentence of your question is "I'm still new to MATLAB and programming in general," you're wasting your time. Like changing a flat tire, connecting MATLAB to other languages is not something cool or interesting; in fact it is the opposite: it is frustrating, error-prone, and boring.
The day will come when you will have a flat tire. That day the location where you get it and which tire it is will become secondary. You will need to learn how to change it and you will. Trust me you will.
Explanation: You don't get to decide in what language the code that solves the exact problem you have right now is written. The same way you don't get to decide where you get the flat tire. The day will come when you already know C++ and need MATLAB to call into some C++ code (either your code or someone else's). That day you will need to learn how to write a mex file in C++ and compile it for your platform and invoke your code. Or the day will come when you need to invoke Java, and then you will learn how to call into Java.
Obsessing over this when you don't know what you need to do and you're clearly not technically equipped to do it is just a waste of time.
To start with, you can look at interfacing MATLAB and Java (it is the easiest way to learn).

Afterwards, go for interfacing MATLAB and C++:

1. Create classes and a gateway in C++ for MATLAB, and build an executable MEX file
2. Create a MATLAB function and wrap it in C++ (the shared C/C++ library approach)

After that, go for creating Excel add-ins and invoking those add-ins in Excel sheets.

Meanwhile, you can look at DLL referencing from a C#.NET or VB.NET application.
I have inherited a broad, ill-designed web portfolio at my job. Most pages are written in Perl, as most of the data ingested, processed, and displayed on the site comes in the form of flat files, which then have to be meticulously regexed and loaded into our MySQL and Oracle databases.
As the first IT-trained manager of this environment, I am taking it upon myself to scrub the websites and lay down some structure for the development process. One of the decisions I have been given is whether or not to continue in Perl. There is substantial in-house talent for Java, and PHP is rather easy to learn. I have considered taking the reins off the developers and letting them choose whatever language they want for their pages, but that sounds like trouble if the guy who chose PHP gets hit by a bus and no one else can fix his code.
As the years go by, hiring Perl programmers becomes more and more difficult, and the complexity of maintaining legacy Perl code from previous developers, whose main focus may have been just getting a page up and running, is becoming very resource-consuming. A previous (non-IT) manager was more focused on quick turnaround and the immediate gratification of new pages than on ensuring things were done right the first time (he has since been promoted outside our branch).
The production server is Solaris. MySQL holds most of our data, but new projects have begun using Oracle more and more (for GIS data). Web servers are universally Apache. We live on an intranet disconnected from the regular internet. Our development is conducted in an Agile, iterative manner.
Whatever language is chosen to push forward with, there are resources to have the existing development staff retrained. No matter what, the data that comes into our environment will have to be regexed to death, so Perl isn't going away anytime soon. My question to the community is: what are the pros and cons of the following languages for the web-dev environment defined above: Perl, PHP, Java, Python, and --insert your favorite language here--? If you had it to do all over again, which language set-up would you have chosen?
Edits and Clarifications:
Let me clarify my original post a bit. I'm not throwing everything away; I've been given the opportunity to adjust the course of the ship to what I believe is a better heading. Even if I choose a new language, the Perl code will be around for some time to come.
Hypothetically speaking, if I chose Assembly as my new language (haha) I would have to get the old developers up to speed probably by sending them to some basic assembly classes. New pages/projects would be in the new language and the old pages/projects would have to play nice with the new pages/projects. Some might one day be rewritten into the new language, some may never be changed.
What will likely always be in Perl are the parsing scripts we wrote years ago to sift through the flat files and database their information. But that's okay, because they don't interface with the webpages; they interface with the database.
Thank you all for your input, it has been very very helpful thus far.
It seems that your problem is more one of legacy code and informal development methodology than of the language per se. So, since you already have Perl developers on staff, why not start modernizing your methods and your code base, instead of switching to a new language and creating a heterogeneous code base?
Modern Perl offers a lot in terms of good practices and powerful tools: testing is emphasized, with the Test::* modules and WWW::Mechanize; database interaction can be done through plain DBI, but also via ORM modules like the excellent DBIx::Class; OO with Moose is now on par with more modern languages; and mod_perl gives you access to a lot of power within Apache. There are also quite a few MVC frameworks for Perl; one that's getting a lot of attention is Catalyst.
Invest in a few copies of Perl Best Practices, bring in a proper trainer for a few classes on modern development methods, and start changing the culture of your group.
And if you have trouble finding developers that are already proficient in Perl, you can always hire good PHP people and train them, that shouldn't be too difficult. At least their willingness to learn a new language would be a good sign of their flexibility and will to improve.
It is always tempting to blame the state of your code on the language it's written in, but in your case I am not sure that's fair. Lots of big companies seem to have no problem managing huge code bases in Perl; the list is long, but the main web companies are all there, along with a number of financial institutions.
I would bring in someone who is very good at Perl, to at least look at the current design. They would be able to tell you how bad the Perl code really is, and what needs to be done to get it into good shape.
At that point I would start considering my options. If the Perl code is salvageable, well then, great: hire someone proficient in Perl, and also train some of your existing staff to help with the existing code base. If you don't have someone proficient in Perl in charge of the Perl code, the code base may become even worse than it already is.
Only if it were in terrible shape would I consider abandoning it for another language. What that language should be is something you're going to have to think about yourself.
p.s. I'm a bit biased, I prefer Perl
If regex is important, I would choose a language with good regex support.
If you used Java, you would not be able to just copy-paste regexes from the Perl code, because the backslashes have to be escaped in Java string literals. So I would vote against Java.
I'm not familiar enough with PHP to know its regex features, but given your choices I would go for Python. You can write cleaner code in Python.
Would Ruby also be an option? It also has good Perl-like regex support, and Rails supports agile web development out of the box.
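To make the escaping point concrete, here is a small sketch (the date pattern is an invented example): the Perl pattern drops into a Python raw string unchanged, while the Java equivalent needs every backslash doubled.

```python
import re

# Perl: /(\d{4})-(\d{2})-(\d{2})/ copies into a Python raw string verbatim:
DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# The same pattern as a Java string literal doubles every backslash:
#   Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})")

m = DATE.search("shipped 2010-07-14 from warehouse 9")
if m:
    print(m.groups())  # ('2010', '07', '14')
```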
First off, let me point out that MySQL's spatial extensions work with GIS data.
Second, if you have a bunch of Perl programmers who will need to work on the new sites, then your best bet is to choose something they won't have too much trouble understanding. The obvious "something" there is PHP. When I learned PHP, I'd done some Perl years earlier and picked PHP up in no time at all.
Switching to something like Java or .Net (or even Ruby on Rails) would be a far more dramatic shift in design.
Plus with Apache servers you already have your environment set up and you can probably stage any development as a mix of Perl and PHP reasonably easily.
As to the last part of your question, what would you do it in if you could do it over again? To me that's a separate and basically irrelevant question. The fact is you're not rewinding and starting over; you're just... starting over. So legacy support, transition, skills development, and all those other issues are far more important than any hypothetical question of what you'd choose in a perfect world where all other things were equal.
Love it or hate it, PHP is popular and is going nowhere anytime soon. Finding skilled people to do it is not too difficult (well, the difficulty is filtering them out from the self-taught cowboy script jockeys who think they can code but can't), and it's not a far step from web-based Perl.
If your developers are any good, they'll be able to handle anything thrown at them. Deciding what language to use is quite a tricky strategic position, but I recommend you think very carefully before introducing any MORE (i.e. don't).
Unless of course, there is something that you absolutely cannot do (or cannot do sanely) with what you've got.
I think choosing a single language is key, and if your database is primarily MySQL, then PHP seems like the obvious choice. It works naturally with your database, it's open source, there are tons of documentation and example code, it doesn't require compiling, and so on.
People come and go through positions, and any website will evolve over time. If you have the ability to set some guidelines and rules, I would choose something that is forgiving, commonplace, and easy (or at least easier) to learn.
I'd also suggest writing it down so people in the future don't re-invent the wheel.
I've heard a ton of complaining over the years about inherited projects that we developers have to work with. The WTF site has tons of examples of code that make me actually mutter under my breath, "WTF?"
But have any of you actually been presented with code that made you go, "Holy crap this was well thought out!" or "Wow, I never thought of that!"
What inherited code have you had to work with that made you smile and why?
Long ago, I was responsible for the Turbo C/C++ run-time library. Tanj Bennett wrote the original 80x87 floating point emulator in 16-bit assembler. I hadn't looked closely at Tanj's code since it worked well and didn't require attention. But we were making the move to 32-bits and the task fell to me to stretch the emulator.
If programming could ever be said to have something in common with art this was it.
Tanj's core math functions managed to keep an 80-bit floating-point temporary result in five 16-bit registers without having to save and restore them from memory. x86 assembly programmers will understand just what an accomplishment this was. Register space was scarce, and keeping five registers as your temp while simultaneously doing complex math was a beautiful sight to behold.
If it was only a matter of clever coding that would have been enough to qualify it as art but it was more than that. Tanj had carefully picked the underlying math algorithms that would be most suitable for keeping the temp in registers. The result was a blazing-fast floating point emulator which was an important selling point for many of our customers.
By the time the 386 came along most people who cared about floating-point performance weren't using an emulator but we had to support Intel's 386SX so the emulator needed an overhaul. I rewrote the instruction-decode logic and exception handling but left the core math functions completely untouched.
In my first job, I was amazed to discover a "safe ID" class in the codebase (C++), which wrapped numerical IDs in a class templated on an empty tag class, ensuring that the compiler would complain if you tried, for example, to compare a UserId with an OrderId or assign one to the other.
Not only did I make sure that I had an equivalent ID class in all subsequent codebases I worked in, but it actually opened my eyes to what the compiler could do to guarantee correctness and help me write stronger code.
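For anyone outside C++: a rough analogue of the same idea, sketched here in Python with typing.NewType (the UserId/OrderId names are hypothetical). Unlike the C++ template, nothing is enforced at runtime; a static checker such as mypy plays the compiler's role.

```python
from typing import NewType

# Distinct ID types over the same underlying int. At runtime these are
# plain ints; a static checker such as mypy enforces the distinction.
UserId = NewType("UserId", int)
OrderId = NewType("OrderId", int)

def cancel_order(order: OrderId) -> None:
    print(f"cancelling order {order}")

user = UserId(42)
order = OrderId(42)

cancel_order(order)  # fine
cancel_order(user)   # runs, but mypy rejects it: incompatible type "UserId"
```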
The code that impresses me the most, and which I try to emulate - is code that seems too simple and easy to understand.
It is damn difficult to write that kind of code. :-)
I have a funny story to tell here.
I was working on a Java-ish application, filled with getters and setters that did nothing but get or set, with interfaces and everything else ever invented to make code unreadable. One day I stumbled upon some code that seemed very well crafted: it was basically an algorithm implementation that looked very elegant, a few lines of readable code, even though it respected every possible rule the project had to adhere to (it was automatically checkstyled).
I couldn't figure out who on the team could have written such code. I was dying to discuss it with him and share thoughts. Thankfully, we had switched to Subversion (from CVS) a few months earlier, and I quickly ran an 'svn blame'. I LOLed all over the place, seeing my own name next to the implementation.
I had heard stories about people not remembering code they wrote 6 months back, code that is a nightmare to maintain. I could not believe such a thing could happen: how can you forget code you wrote? Well, now I'm convinced it can happen. Thankfully the code was alright and easy to extend, so I've only experienced half of the story.
Some VB6 code I came across, by another programmer at my company, handled error conditions very well (whether dealing with them directly or logging them).
Along with some rather complex code that was well commented.
I know this will bring a lot of answers like "I've never found good code before I stepped in," and variations.

I think the real problem is not that there aren't good coders or excellent projects out there; it's that there's an excess of NIH syndrome, plus the fact that nobody likes other people's code. The latter is just because you have to make an intellectual effort to understand it, a much bigger effort than you need for your own code, so you end up disliking it (it makes you think and work, after all).
Personally, I can remember (as everyone can, I guess) some cases of really bad code, but also some pretty well-documented, elegant code.
Currently, the project that has most impressed me was a very potent, dynamic workflow engine; it impressed me not only with its simplicity but also with the way it was coded. I can remember some very clever snippets here and there, as well as a beautiful metaprogramming library based on a full IDL, developed by some friends of mine (Aspl.es).
I inherited a large bunch of code that was SO well written I actually spent the $40 online to find the guy, I went to his house and thanked him.
I think Rocky Lhotka should get the credit, but I had to touch a CSLA.NET application recently (in my private practice on the side) and I was very impressed with the orderliness of the code. The app worked extremely well, but the client needed a few extensions. The original author had died tragically, and the new guy was unsophisticated. He didn't understand CSLA.NET's business-object-based approach, and he wanted to do it all over again in cut-and-dried VB.NET, without any fancy framework.
So I got the call. Looking at a working example of WinForm binding and CSLA.NET was pretty instructive about a lot of things.
Symbian OS: the old core bit of it, anyway; the bit that dated back to the Psion days, or to those who even today keep that spirit alive.
And sitting right alongside it, and all over it, is all the new crap created by the lowest bidders hired by the big phone corporations. It was startling; you could actually feel in your bones whether a bit of the code base was old or new, somehow.
I remember when I wrote my bachelor's thesis on type inference: my Pascal-to-Pascal 'compiler' was an extension of a parser my supervisor had programmed (in Java). It had a pretty good structure as far as I can remember, and for me, who had never done any serious object-oriented programming, it was quite a revelation.
I've been doing a lot of Eclipse plug-in development and often had to debug into the actual Eclipse source code. While I haven't "inherited" it in the sense that I'm not continuing work on it, I've always been impressed with the design and quality of the early core.
I'm occasionally unfortunate enough to have to make alterations to very old code that is poorly documented (when it is documented at all) and poorly designed.
It often takes a long time to make a simple change, because there is not much structure to the existing code, and I really have to read a lot of it before I get a feel for where things are.
What I think would help a lot in cases like this is a tool that would let you visualise an overview of the code and then drill down for more detail. I suspect such a tool would be very hard to get right, given that it is trying to find structure where there is little or none.
I guess this is not really a question, but rather a musing. So let me make it into one: what do others do to get their heads around other people's code, the good and the bad?
Hmm, this is a hard one, so much to say so little time ...
1) If you can run the code, it makes life soooo much easier. Breakpoints (especially conditional ones) are your friend.
2) A purist's approach would be to write a few unit tests for known functionality, then refactor to improve the code and your understanding, then re-test. If things break, create more unit tests; repeat until bored/old/moved to a new project.
3) ReSharper is good at showing where things are used (what's calling a method, for instance). It's static analysis, but a good start, and it helps with refactoring.
4) Many .NET events are coded as public, and events can be a pain to debug at the best of times. Recode them to be private and use a property with add/remove accessors; you can then use a breakpoint to see what is listening on an event.
BTW, I'm playing in the .NET space and would love a tool to help do this kind of stuff. Like Joel, does anyone out there know of a good dynamic code-reviewing tool?
I have been asked to take ownership of some NASTY code in the past - both work and "play".
Most of the amateurs I took over code for had just sort of evolved the code to do what they needed over several iterations. It was always a giant incestuous mess of library A calling B, calling back into A, calling C, calling B, etc. A lot of the time they'd use threads, and not a critical section was to be seen.
I found the best/only way to get a handle on the code was to start at the OS entry point [main()] and build my own call-stack diagram showing the call tree. You don't really need to build a full tree at the outset; just trace through the section(s) you're working on at each stage, and you'll get a good enough handle on things to be able to run with it.
To top it all off, use the biggest slice of dead tree you can find and a pen. Laying it all out in front of you so you don't have to jump back and forward on screens or pages makes life so much simpler.
EDIT: There's a lot of talk about coding standards... they will just make poor code look consistent with good code (and usually make it harder to spot). Coding standards don't always make maintaining code easier.
I do this on a regular basis. And have developed some tools and tricks.
Try to get a general overview (object diagram or other).
Document your findings.
Test your assumptions (especially for vague code).
The problem with this is that at most companies you are judged by results. That's why some programmers write poor code fast and move on to a different project: you are left with the garbage, and your boss compares your sluggish progress with the quick-and-dirty guy's. (Luckily my current employer is different.)
I generally use UML sequence diagrams of the various key ways the component is used. I don't know of any tools that can generate them automatically, but many UML tools, such as BOUML and EA Sparx, can create classes/operations from source code, which saves some typing.
The definitive text on this situation is Michael Feathers' Working Effectively with Legacy Code. As S. Lott says, get some unit tests in place to establish the behaviour of the legacy code. Once you have those, you can begin to refactor. There seems to be a sample chapter available on the Object Mentor website.
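Feathers calls these characterization tests: before refactoring, you pin down what the code currently does, right or wrong. A minimal sketch, with a made-up legacy_tax() standing in for whatever inherited function you're about to change:

```python
def legacy_tax(amount, region):
    # Made-up stand-in for the inherited code you are about to touch.
    rates = {"NY": 0.08875, "CA": 0.0725}
    return round(amount * rates.get(region, 0.0), 3)

def test_characterize_legacy_tax():
    # Expected values were captured by *running* the existing code, not
    # taken from a spec; they document current behaviour, bugs included.
    assert legacy_tax(100.0, "NY") == 8.875
    assert legacy_tax(100.0, "??") == 0.0  # silently ignores unknown regions

test_characterize_legacy_tax()
print("current behaviour pinned down")
```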
I strongly recommend BOUML. It's a free UML modelling tool, which:
is extremely fast (fastest UML tool ever created, check out benchmarks),
has rock solid C++ import support,
has great SVG export support, which is important because viewing large graphs in a vector format that scales quickly in, e.g., Firefox is very convenient (you can rapidly switch between a "bird's eye" view and a class-detail view),
is full-featured and intensively developed (look at the development history; it's hard to believe such fast progress is possible).
So: import your code into BOUML and view it there, or export to SVG and view it in Firefox.
See Unit Testing Legacy ASP.NET Webforms Applications for advice on getting a grip on legacy apps via unit testing.
There are many similar questions and answers. Here's the search https://stackoverflow.com/search?q=unit+test+legacy
The point is that getting your head around legacy code is probably easiest if you are writing unit tests for it.
I haven't had great luck with tools that automate the review of poorly documented/executed code, because a confusing or badly designed program generally translates into a less-than-useful model. It's not exciting or immediately rewarding, but I've had the best results by picking a spot and following the program's execution line by line, documenting and adding comments as I go, and refactoring where applicable.
A good IDE (Emacs or Eclipse) can help in many cases. Also, on a Unix platform there are tools for cross-referencing (etags, ctags) and checking (lint), or gcc with many, many warning options turned on.
First, before trying to comprehend a function/method, I would refactor it a bit to fit your coding conventions (spaces, braces, indentation) and remove most of the comments if they seem to be wrong.

Then I would refactor and comment the parts you understood, and try to find/grep those parts over the whole source tree and refactor them there as well.

Over time, you get nicer code that you like to work with.
I personally do a lot of drawing of diagrams, and figuring out the bones of the structure.
The fad du jour (and possibly quite rightly) has me writing unit tests to test my assertions and to build up a safety net for changes I make to the system.
Once I get to a point where I'm comfortable enough knowing what the system does, I'll take a stab at fixing bugs in the sanest way possible, and hope my safety net is near completion.
That's just me, however. ;)
I have actually been using the refactoring features of ReSharper to help me get a handle on a bunch of projects I inherited recently. So, to figure out another programmer's very poorly structured, undocumented code, I actually start by refactoring it.

Cleaning up the code, renaming methods, classes, and namespaces properly, and extracting methods are all structural changes that can shed light on what a piece of code is supposed to do. It might sound counterintuitive to refactor code that you don't "know," but trust me, ReSharper really allows you to do this. Take, for example, the issue of red-herring dead code. You see a method in a class, or perhaps a strangely named variable. You can start by trying to look up usages or, ugh, do a text search, but ReSharper will actually detect dead code and color it gray. As soon as you open a file, you see in gray, and with scroll-bar flags, what would in the past have been confusing red herrings.

There are dozens of other tricks, and probably a number of other tools that can do similar things, but I am a ReSharper junkie.
Cheers.
Get to know the software intimately from a user's point of view. A lot can be learnt about the underlying structure by studying and interacting with the user interface(s).
Printouts
Whiteboards
Lots of notepaper
Lots of Starbucks
Being able to scribble all over the poor thing is the most useful method for me. Usually I turn up a lot of "huh, that's funny..." moments while trying to make basic code-structure diagrams, and those moments turn out to be more useful than the diagrams themselves in the end. Automated tools are probably more helpful than I give them credit for, but the value of finding those funny bits exceeds the value of rapidly generated diagrams for me.
For diagrams, I look for mostly where the data is going. Where does it come in, where does it end up, and what does it go through on the way. Generally what happens to the data seems to give a good impression of the overall layout, and some bones to come back to if I'm rewriting.
When I'm working on legacy code, I don't attempt to understand the entire system. That would result in complexity overload and subsequent brain explosion.
Rather, I take one single feature of the system and try to understand completely how it works, from end to end. I will generally debug into the code, starting from the point in the UI code where I can find the specific functionality (since this is usually the only thing I'll be able to find at first). Then I will perform some action in the GUI, and drill down in the code all the way down into the database and then back up. This usually results in a complete understanding of at least one feature of the system, and sometimes gives insight into other parts of the system as well.
Once I understand what functions are being called and what stored procedures, tables, and views are involved, I then do a search through the code to find out what other parts of the application rely on these same functions/procs. This is how I find out if a change I'm going to make will break anything else in the system.
It can also sometimes be useful to attempt to make diagrams of the database and/or code structure, but sometimes it's just so bad or so insanely complex that it's better to ignore the system as a whole and just focus on the part that you need to change.
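For the "what else relies on these same functions/procs" step, even a throwaway script (or plain grep -r) does the job. A sketch, with hypothetical identifier names and source path:

```python
import os

IDENTIFIERS = ["usp_GetOrderTotals", "vw_ActiveCustomers"]  # hypothetical
ROOT = "src"  # hypothetical source tree

for dirpath, _dirs, files in os.walk(ROOT):
    for fname in files:
        path = os.path.join(dirpath, fname)
        try:
            with open(path, encoding="utf-8", errors="ignore") as fh:
                text = fh.read()
        except OSError:
            continue
        for ident in IDENTIFIERS:
            if ident in text:
                print(path, "references", ident)
```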
My big problem is that I (currently) have very large systems to understand in a fairly short space of time (I pity contract developers on this point) and don't have a lot of experience doing this (having previously been fortunate enough to be the one designing from the ground up).
One method I use is to try to understand the meaning of the naming of variables, methods, classes, etc. This is useful because it (hopefully) builds up, from the atomic level, a high-level view of the original train of thought.
I say this because developers will typically name their elements (with what they believe are) meaningful names, providing insight into the intended function. This is flawed, admittedly, if the developer has a defective understanding of their program or of the terminology, or (often the case, IMHO) is trying to sound clever. How many developers have seen keywords or class names and only then looked the term up in the dictionary for the first time?
It's all about the standards and coding rules your company uses.

If everyone codes in a different style, it's hard to maintain another programmer's code, and so on. If you decide what standard you'll use and have some rules, everything will be fine. :) Note that you don't have to make a lot of rules, because people should have the possibility to code in a style they like; otherwise you can be very surprised.