Which Roblox/LUAU classes can have malware/scripts hidden inside? - class

Which Roblox/LUAU classes can have malware/scripts hidden inside? Which classes will still be executed as a script? Which classes cannot contain a malicious script? Audio?
Since the complaint has been made that it's not clear what I'm asking, I've added emphasis above and put the title in there, too.
Ok, so I'm trying to learn how to detect and remove malware from things in the Roblox Studio Toolbox. That is a tall order, since I'm still learning LUAU and there are many ways to conceal malware, including obfuscation techniques (spacing, reversed strings, reversed ascii strings, getfenv(), hidden teleports, nested scripts, scripts that were reclassified to something else, like a weld, etc.).
Reclassified malware is the thing I have the most trouble with, although long scripts and scripts split into different files can be a pain, too. I do things by trial and error, like in the case of the Sakura Tree model by TreelingDeveloper (rbxassetid://6787294322). I stripped it of everything except the Trunk and Mesh, Falling Leaves and Particle Emitter, and Leaves and Mesh, and it is still similar, despite removing a couple dozen pieces, including two scripts that were nested inside several welds and claimed to weld the bark on.
Edit: I rechecked the Sakura tree after posting. Deleting all those parts reduced its visual appeal, although not completely. There were a lot of "Bark," "Other" and "Welds" that I deleted, and even the ThumbnailCamera. As it turns out, keeping all of the "Bark" and "Others" adds additional details to the trunk. I can't see a use for the ThumbnailCamera or the welds and "auto-weld" scripts.
It's not terribly hard to use CTRL-SHIFT-F to search for words like "getfenv," "string.reverse," "require," "eriuqer," and "teleport" but it is beyond my level of ability to find everything.
If you have any suggestions or tips on the question or the larger issues of malware in Roblox assets, I'd love to hear about it. Thanks!
Antimalware Plug-ins
Thus far, I have reviewed several (~10) plug-ins for detecting malware. None of them have seem to have behavioral or real-time detection. They all seem to use simplistic heuristic detection, often relying on common words and phrases associated with known malware, as well as certain LUAU commands and obfuscation techniques. Those that I thought were worth using, as inadequate as they were, are GameGuard, Guardian Angel Defender, Mirror Egg and Ro-Protect. Unfortunately, they get a fair number of false positives (Mirror Egg, for example, suggests that anything with the name "Fire," including fire effects I added, may be malware). GAD has the best UI and seems to find more than anything else, but it also finds a lot of "empty objects" that I'm not sure what to make of. None of them are real-time, and none are good enough on their own.
Explanation
I'm not artistic, so it really isn't possible for me to develop my own models, meshes and such, and I think people who say those who use the toolbox are lazy and get what they deserve are apparently unaware that not everyone is a master craftsman when it comes to CG.
Roblox Studio's Toolbox is jam-packed with assets, which is great, except many of them contain malware.
Beyond this point is mostly a rant. Feel free to ignore, unless you're going to tell me to contact Roblox Customer Service or go to the Roblox DevForum.
You may ask why I don't post this on the DevForum. I've been there since April and, despite 7 hours of reading and over 1k likes, I still have not become a "regular". I asked customer service about this, but they gave me the run-around, referring me back to the rules to become a regular (which are deliberately vague) and generally being eager to get rid of me as fast as possible without actually helping.
You may ask why I didn't ask Roblox for help.
Having asked both the Appeals team (which had given me a 3-day IP ban and permanently suspended my unlisted, >private< game that is still in development because, they stated, I'd added an "inappropriate model" from the toolbox - a model that I didn't modify, and then rejected my appeal without any valid reason) and Customer Service, which gave me the run-around and did their very best not to help me, and to get rid of me as rapidly as possible, I'm trying to learn how to protect myself. In short, Roblox apparently refuses to take any real steps to deal with the plethora of malware, and help pages DO NOTHING to teach developers how to find malware. I've learned more on my own - which isn't nearly enough.
If I sound irritated, it's because I am. They have some of the worst customer support I've ever encountered in the 36 years of my adult life, and I've got many years of experience in that field, including tech support.
The Roblox tutorial pages have this to say when it introduces the toolbox:
> Anyone can upload an item to the Toolbox, so make sure your game still works after adding an item before settling on it. To learn how to inspect a model before inserting it, see [Item Inspection][1].
The linked page is woefully inadequate, and the page that Customer Support referred me to is, too. [What Is This Infected Model On My Place? help article.][2]
[1]: https://developer.roblox.com/en-us/resources/studio/Toolbox#item-inspection
[2]: https://en.help.roblox.com/hc/articles/203312920

When it comes to things that can execute code, the answer is Scripts and LocalScripts.
Some things to know, Scripts are only active in a few locations. According to the docs :
The instant that the following conditions are met, a Script’s Lua code is run in a new thread:
Disabled property is false
The Script object is a descendant of the Workspace or
ServerScriptService
Similarly for LocalScripts :
A LocalScript will only run Lua code if it is a descendant of one of the following objects:
A Player’s Backpack, such as a child of a Tool
A Player’s character model
A Player’s PlayerGui
A Player’s PlayerScripts.
The ReplicatedFirst service
This doesn't apply for Edit Mode, just when you are testing it in Play Mode.
The thing is, a Script instance can be inserted into the hierarchy of anything. Audio, Meshes, Decals, etc. none of these things execute code on their own, but they are often Trojan Horses for delivering Scripts into the Workspace. And because the Toolbox inserts things into the Workspace by default, it creates the avenue for exploits that you're describing in your question.
This is why I recommended the Venom plugin by pa00, because it allows you to strip out any and all Scripts that might come with an asset. It is an easy counter measure for when you are only looking for simple things. This suggestion falls apart once you start looking for more complicated assets like vehicles, Tools, and guns, where interactions have to be scripted, but it's a starting point.

Related

Best way to organize bioinformatics projects? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
I come from a computer science. background, but I am now doing genomics.
My projects include a lot of bioinformatics typically involving: aligning sequences, comparing overlap, etc. between sequences and various genome-annotation-features, from different classes of biological samples, time-course data, microarray, high-throughput sequencing ("next-generation" sequencing, though it's the current generation actually) data, this kind of stuff.
The workflow with this kind of analyses is quite different from what I experienced during my computer science studies: no UML and thoughtfully designed objects shining with sublime elegance, no version management, no proper documentation (often no documentation at all), no software engineering at all.
Instead, what everyone does in this field is hacking out one Perl-script or AWK-one-liner after the other, usually for one-time usage.
I think the reason is that the input data and formats change so fast, the questions need to be answered so soon (deadlines!), that there seems to be no time for project organization.
One example to illustrate this: Let's say you want to write a raytracer. You would probably put a lot of effort into the software engineering first. Then program it, finally in some highly-optimized form. Because you would use the raytracer countless of times with different input data and would make changes to the source code over a duration of years to come. So good software engineering is paramount when coding a serious raytracer from scratch. But imagine you want to write a raytracer, where you already know that you will use it to raytrace one, single picture ever. And that picture is of a reflecting sphere over a checkered floor. In this case you would just hack it together somehow. Bioinformatics is like the latter case only.
You end up with whole directory trees with the same information in different formats until you have reached the one particular format necessary for the next step, and dozen of files with names like "tmp_SNP_cancer_34521_unique_IDs_not_Chimp.csv" where you don't have the slightest idea one day later why you created this file and what it exactly is.
For a while I was using MySQL which helped, but now the speed in which new data is generated and changes formats is such that it is not possible to do proper database design.
I am aware of one single publication which deals with these issues (Noble, W. S. (2009, July). A quick guide to organizing computational biology projects. PLoS Comput Biol 5 (7), e1000424+). The author sums the goal up quite nicely:
The core guiding principle is simple:
Someone unfamiliar with your project
should be able to look at your
computer files and understand in
detail what you did and why.
Well, that's what I want, too! But I am following the same practices as that author already, and I feel it is absolutely insufficient.
Documenting each and every command you issue in Bash, commenting it with why exactly you did it, etc., is just tedious and error-prone. The steps during the workflow are just too fine-grained. Even if you do it, it can be still an extremely tedious task to figure out what each file was for, and at which point a particular workflow was interrupted, and for what reason, and where you continued.
(I am not using the word "workflow" in the sense of Taverna; by workflow I just mean the steps, commands and programs you choose to execute to reach a particular goal).
How do you organize your bioinformatics projects?
I'm a software specialist embedded in a team of research scientists, though in the earth sciences, not the life sciences. A lot of what you write is familiar to me.
One thing to bear in mind is that much of what you have learned in your studies is about engineering software for continued use. As you have observed a lot of what research scientists do is about one-off use and the engineered approach is not suitable. If you want to implement some aspects of good software engineering you are going to have to pick your battles carefully.
Before you start fighting any battles, you are going to have to critically examine your own ideas to ensure that what you learned in school about general-purpose software engineering is valid for your current situation. Don't assume that it is.
In my case the first battle I picked was the implementation of source code control. It wasn't hard to find examples of all the things that go wrong when you don't have version control in place:
some users had dozens of directories each with different versions of the 'same' code, and only the haziest idea of what most of them did that was unique, or why they were there;
some users had lost useful modifications by overwriting them and not being able to remember what they had done;
it was easy to find situations where people were working on what should have been the same program but were in fact developing incompatibly in different directions;
etc etc etc
Once I had gathered the information -- and make sure you keep good notes about who said what and what it cost them -- it became relatively easy to paint a picture of a better world with source code control.
Next, well, next you have to choose your own next battle. But one of the seeds of doubt you have to sow in your scientist-colleagues minds is 'reproducibility'. Scientific experiments are not valid if they are not reproducible; if their experiments involve software (and they always do) then careful software engineering is essential for reproducibility. A lot of this is about data provenance, but that's a topic for another day.
Part of the issue here is the distinction between documentation for software vs documentation for publication.
For software development (and research plan) design, the important documentation is structural and intentional. Thus, modeling the data, reasons why you are doing something, etc. I strongly recommend using the skills you've learned in CS for documenting your research plan. Having a plan for what you want to do gives you a lot of freedom to multi-task while long analyses are running.
On the other hand, a lot of bioinformatics work is analysis. Here, you need to treat documentation like a lab notebook, and not necessarily a project plan. You want to be document what you did, maybe a brief comment why (e.g. when you are troubleshooting data), and what the outputs and results are.
What I do is fairly simple.
First, I start in a directory and create a git repo. Then, whenever I change some file, I commit it to the repo. As much as possible, I try to name data outputs in a way that I can drop then into my git ignore files.
Then, as much as possible, I work on a single terminal session for a project at a time, and when I hit a pause point (like when I've got a set of jobs sent up to the grid, I run 'history |cut -c 8-' and paste that into a lab notes file. I then edit the file to add comments for what I did, and remember, change the git add/commit lines to git checkout (I have a script that does this based on the commit messages). As long as I start it in the right directory, and my external data doesn't go away, this means that I can recreate the entire process later.
For any even slightly complex processing tasks, I write a script to do it, so that my notebook, as much as possible, looks clean. To an approximation, a helper script can be viewed as a subroutine in a larger project, and should be documented internally to at least that level.
Your question is about project management. Bad project management is not unique to bioinformatics. I find it hard to believe that the entire industry of bioinformatics is commited to bad software design.
About the presure... Again there are others in this world that have very challenging deadlines, and they are still using good software designs.
In many cases, following a good software design does not hold down the projects and may even speed its design and maintainance (at least on the long run).
Now to your real question... You can offer your manager to redesign small parts of the code that have no influence on the rest of the code as a proof of concept (POC), but it's really hard to stop a truck from keep on moving, so don't get upset if he feels "we worked this way for years - we know what we are doing, and we don't need a child to teach us how to do our work". Learn to work like the rest and when you will gain their trust, you could
do your thing once in a while (I hope you will have time and the devotion to do the right thing).
Good luck.

What content have you made/seen made using procedural techniques

I was looking at some study i have to do in the future to do with procedural generation techniques and i was wondering what type of content you have:
Developed
Helped Develop
Seen implemented
Tried to develop
and what methods/techniques/procedures you used to develop it.
If you feel generous maybe you can even go into specifics of it such as data structures ad algorithms you have used to develop it.
If this needs to be put as community wiki because it is not me asking for a problem to be solved just let me know.
This is not a homework thread because it is a research unit that i'm not taking yet ;)
Introversion software, the makers of the games Defcon, Uplink and Darwinia (among others) have started working on a game about a year ago which extensively uses PCG for city generation, here is a video of their work, and you can read more about it on the development diary of the game (start from the first part at the bottom of the page!).
This immediately got me extremely interested, and seeing the potential for games I immediately started researching the technology. I have amassed a folder of 18 PDFs about the subject (research papers, SIGGRAPH presentations, etc). Here, I uploaded it for you.
The main approach is to use L-Systems, however, I never got around to understanding enough of that to make something out of this. I tried other, less successful approaches like using Voronois, recursively splitting a rectangular area into more smaller areas and shifting the boundaries a little to obtain a bit of randomness and polygon division.
The last method I had gotten from Mike's Code Blog's posts (here and here). The screenshots shown on his blog make me drool, it is my biggest programmer's dream to ever get something that looks like that. I emailed him to ask how he did it, and here is the relevant part of his reply, I'm sure he wouldn't mind me posting this here:
L-Systems is definitely one way to go, but that isn't what I'm doing. The basis of my method is polygon subdivision. I start with a simple polygon that represents the entire area of the city. Then, I split it (roughly) in half, and then split those two polygons, etc. until I get down to city-block size. At that point, the edges of all my polygons represent roads. I then use the same subdivision method to break the blocks down into building-size lots.
The devil is in the details, of course, but that is the basic method.
I for one still haven't managed to fully implement a solution of which I'm satisfied of, but it remains one of, if not my single biggest programmer's dream to ever achieve something like this.
Here are a few of the leaders in procedurally generated terrain (and to a lesser extent foliage). If you don't get a detailed answer here regarding methods and techniques, you might want to look in / ask in their forums. I have seen some discussions of techniques there.
TerraGen 2
World Builder
World Machine
Natural Graphics
Noone mentioned the demoscene that ONLY use procedural stuff?
So, go search for Werkkzeug, Kkrieger, MilkyTracker to start. Also you can visit the site pouet and see the wonder of well done procedural videos (yes, procedural videoclips! With music and graphics, all procedural!)
Allegorithmic's products are used in actual shipping titles. These guys focus on texture generation (both offline and at runtime).
They have some very pretty screenshots and demos.

What inherited code has impressed or inspired you?

I've heard a ton of complaining over the years about inherited projects that us developers have to work with. The WTF site has tons of examples of code that make me actually mutter under my breath "WTF?"
But have any of you actually been presented with code that made you go, "Holy crap this was well thought out!" or "Wow, I never thought of that!"
What inherited code have you had to work with that made you smile and why?
Long ago, I was responsible for the Turbo C/C++ run-time library. Tanj Bennett wrote the original 80x87 floating point emulator in 16-bit assembler. I hadn't looked closely at Tanj's code since it worked well and didn't require attention. But we were making the move to 32-bits and the task fell to me to stretch the emulator.
If programming could ever be said to have something in common with art this was it.
Tanj's core math functions managed to keep an 80-bit floating point temporary result in five 16-bit registers without having to save and restore them from memory. X86 assembly programmers will understand just what an accomplishment this was. Register space was scarce and keeping five registers as your temp while simultaneously doing complex math was a beautiful site to behold.
If it was only a matter of clever coding that would have been enough to qualify it as art but it was more than that. Tanj had carefully picked the underlying math algorithms that would be most suitable for keeping the temp in registers. The result was a blazing-fast floating point emulator which was an important selling point for many of our customers.
By the time the 386 came along most people who cared about floating-point performance weren't using an emulator but we had to support Intel's 386SX so the emulator needed an overhaul. I rewrote the instruction-decode logic and exception handling but left the core math functions completely untouched.
In my first job, I was amazed to discover a "safe ID" class in the codebase (c++), which was wrapping numerical IDs in a class templated with an empty tag class, that ensured that the compiler would complain if you tried for example to compare or assign a UserId into an OrderId.
Not only did I made sure that I had an equivalent Id class in all subsequent codebases I would be using, but it actually opened my eyes on what the compiler could do to guarantee correctness and help writing stronger code.
The code that impresses me the most, and which I try to emulate - is code that seems too simple and easy to understand.
It is damn difficult to write that kind of code. :-)
I have a funny story to tell here.
I was working on this Javaish application, filled with getters & setters that did nothing but get or set and interfaces and everything ever invented to make code unreadable. One day I stumbled upon some code which seemed very well crafted -- it was basically an algorithm implementation that looked very elegant = few lines of readable code, even though it respected every possible rule the project had to adhere to (it was checkstyled automatically).
I couldn't figure out who on the team could have written such code. I was dying to discuss with him and share thoughts. Thankfully, we had switched to subversion (from cvs) a few months earlier and I quickly ran am 'svn blame'. I loled all over the place, seeing my name next to the implementation.
I had heard stories about people not remembering code they wrote 6 months back, code that is a nightmare to maintain. I could not believe such a thing could happen: how can you forget code you wrote? Well, now I'm convinced it can happen. Thankfully the code was alright and easy to extend, so I've only experienced half of the story.
Some VB6 code by another programmer at my company I came across that handled the error conditions very well (whether it be deal with them directly or log them).
Along with some rather complex code that was well commented.
I know this will bring a lot of answers like,
"I've never find good code before I step in" and variations.
I think the real problem there is not that there isn't good coders or excellent projects out there, is that there's an excess of NIH syndrome and the fact that no body likes code from others. The latter is just because you have to make an intellectual effort to understand it, a much bigger effort than you need to understand you own code so that you dislike it (it's making you think and work after all).
Personally I can remember (as everyone I guess) some cases of really bad code but also I remember some pretty well documented, elegant code.
Currently, the project that most impressed me was a very potent, Dynamic Workflow Engine, not only by the simplicity but also for the way it is coded. I can remember some very clever snippets here and there, as well as a beautiful metaprogramming library based on a full IDL developed by some friends of mine (Aspl.es)
I inherited a large bunch of code that was SO well written I actually spent the $40 online to find the guy, I went to his house and thanked him.
I think Rocky Lhotka should get the credit, but I had to touch a CSLA.NET application recently {in my private practice on the side} and I was very impressed with the orderliness of the code. The app worked extremely well, but the client needed a few extensions. The original author had died tragically, and the new guy was unsophisticated. He didn't understand CSLA.NET's business object based approach, and he wanted to do it all over again in cut-and-dried VB.NET, without any fancy framework.
So I got the call. Looking at a working example of WinForm binding and CSLA.NET was pretty instructive about a lot of things.
Symbian OS - the old core bit of it anyway, the bit that dated back to the Psion days or those who even today keep that spirit alive.
And sitting right along side it and all over it is all the new crap created by the lowest bidders hired by the big phone corporations. It was startling, you could actually feel in your bones whether a bit of the code-base was old or new somehow.
I remember when I wrote my bachelor thesis on type inference, my Pascal-to-Pascal 'compiler' was an extension of a Parser my supervisor programmed (in Java). It had a pretty good structure as far as I can remember, and for me who had never done any serious Object-oriented programming, it was quite a revelation.
I've been doing a lot of Eclipse plug-in development and often had to debug into the actual Eclipse source code. While I haven't "inherited" it in the sense that I'm not continuing work on it, I've always been impressed with the design and quality of the early core.

Getting your head around other people's code

I'm occasionally unfortunate enough to have to make alterations to very old, poorly not documented and poorly not designed code.
It often takes a long time to make a simple change because there is not much structure to the existing code and I really have to read a lot of code before I have a feel for where things would be.
What I think would help a lot in cases like this is a tool that would allow one to visualise an overview of the code, and then maybe even drill down for more detail. I suspect such a tool would be very hard to get right, given that is trying to find structure where there is little or none.
I guess this is not really a question, but rather a musing. I should make it into a question - What do others do to assist in getting their head around other peoples code, the good and the bad?
Hmm, this is a hard one, so much to say so little time ...
1) If you can run the code it makes life soooo much easier, breakpoints (especially conditional) break points are you friend.
2) A purists' approach would be to write a few unit tests, for known functionality, then refactor to improve code and understanding, then re-test. If things break, then create more unit tests - repeat until bored/old/moved to new project
3) ReSharper is good at showing where things are being used, what's calling a method for instance, it's static but a good start, and it helps with refactoring.
4) Many .net events are coded as public, and events can be a pain to debug at the best of times. Recode them to be private and use a property with add/remove. You can then use break point to see what is listening on an event.
BTW - I'm playing in the .Net space, and would love a tool to help do this kind of stuff, like Joel does anyone out there know of a good dynamic code reviewing tool?
I have been asked to take ownership of some NASTY code in the past - both work and "play".
Most of the amateurs I took over code for had just sort of evolved the code to do what they needed over several iterations. It was always a giant incestuous mess of library A calling B, calling back into A, calling C, calling B, etc. A lot of the time they'd use threads and not a critical section was to be seen.
I found the best/only way to get a handle on the code was start at the OS entry point [main()] and build my own call stack diagram showing the call tree. You don't really need to build a full tree at the outset. Just trace through the section(s) you're working on at each stage and you'll get a good enough handle on things to be able to run with it.
To top it all off, use the biggest slice of dead tree you can find and a pen. Laying it all out in front of you so you don't have to jump back and forward on screens or pages makes life so much simpler.
EDIT: There's a lot of talk about coding standards... they will just make poor code look consistent with good code (and usually be harder to spot). Coding standards don't always make maintaining code easier.
I do this on a regular basis. And have developed some tools and tricks.
Try to get a general overview (object diagram or other).
Document your findings.
Test your assumptions (especially for vague code).
The problem with this is that on most companies you are appreciated by result. That's why some programmers write poor code fast and move on to a different project. So you are left with the garbage, and your boss compares your sluggish progress with the quick and dirtu guy. (Luckily my current employer is different).
I generally use UML sequence diagrams of various key ways that the component is used. I don't know of any tools that can generate them automatically, but many UML tools such as BoUML and EA Sparx can create classes/operations from source code which saves some typing.
The definitive text on this situation is Michael Feathers' Working Effectively with Legacy Code. As S. Lott says get some unit tests in to establish behaviour of the lagacy code. Once you have those in you can begin to refactor. There seems to be a sample chapter available on the Object Mentor website.
I strongly recommend BOUML. It's a free UML modelling tool, which:
is extremely fast (fastest UML tool ever created, check out benchmarks),
has rock solid C++ import support,
has great SVG export support, which is important, because viewing large graphs in vector format, which scales fast in e.g. Firefox, is very convenient (you can quickly switch between "birds eye" view and class detail view),
is full featured, intensively developed (look at development history, it's hard to believe that so fast progress is possible).
So: import your code into BOUML and view it there, or export to SVG and view it in Firefox.
See Unit Testing Legacy ASP.NET Webforms Applications for advice on getting a grip on legacy apps via unit testing.
There are many similar questions and answers. Here's the search https://stackoverflow.com/search?q=unit+test+legacy
The point is that getting your head around legacy is probably easiest if you are writing unit tests for that legacy.
I haven't had great luck with tools to automate the review of poorly documented/executed code, cause a confusing/badly designed program generally translates to a less than useful model. It's not exciting or immediately rewarding, but I've had the best results with picking a spot and following the program execution line by line, documenting and adding comments as I go, and refactoring where applicable.
a good IDE (EMACS or Eclipse) could help in many cases. Also on a UNIX-platform, there are some tools for crossreferencing (etags, ctags) or checking (lint) or gcc with many many warning options turned on.
First, before trying to comprehend a function/method, i would refactor it a bit to fit your coding conventions (spaces, braces, indentation) and remove most of the comments if they seem to be wrong.
Then I would refactor and comment the parts you understood, and try to find/grep those parts over the whole source tree and refactor them there also.
Over the time, you get a nicer code, you like to work with.
I personally do a lot of drawing of diagrams, and figuring out the bones of the structure.
The fad de jour (and possibly quite rightly) has got me writing unit tests to test my assertions, and build up a safety net for changes I make to the system.
Once I get to a point where I'm comfortable enought knowing what the system does, I'll take a stab at fixing bugs in the sanest way possible, and hope my safety nets neared completion.
That's just me, however. ;)
i have actuaally been using the refactoring features of ReSharper to help m get a handle on a bunch of projects that i inherited recently. So, to figure out another programmer's very poorly structured, undocumented code, i actually start by refactoring it.
Cleaning up the code, renaming methods, classes and namespaces properly, extracting methods are all structural changes that can shed light on what a piece of code is supposed to do. It might sound counterintuitive to refactor code that you don't "know" but trut me, ReSharper really allows you to do this. Take for example the issue of red herring dead code. You see a method in a class or perhaps a strangely named variable. You can start by trying to lookup usages or, ungh, do a text search, but ReSharper will actually detect dead code and color it gray. As soon as you open a file you see in gray and with scroll bar flags what would have in the past been confusing red herrings.
There are dozens of other tricks and probably a number of other tools that can do similar things but i am a ReSharper junky.
Cheers.
Get to know the software intimately from a user's point of view. A lot can be learnt about the underlying structure by studying and interacting with the user interface(s).
Printouts
Whiteboards
Lots of notepaper
Lots of Starbucks
Being able to scribble all over the poor thing is the most useful method for me. Usually I turn up a lot of "huh, that's funny..." while trying to make basic code structure diagrams that turns out to be more useful than the diagrams themselves in the end. Automated tools are probably more helpful than I give them credit for, but the value of finding those funny bits exceeds the value of rapidly generated diagrams for me.
For diagrams, I look for mostly where the data is going. Where does it come in, where does it end up, and what does it go through on the way. Generally what happens to the data seems to give a good impression of the overall layout, and some bones to come back to if I'm rewriting.
When I'm working on legacy code, I don't attempt to understand the entire system. That would result in complexity overload and subsequent brain explosion.
Rather, I take one single feature of the system and try to understand completely how it works, from end to end. I will generally debug into the code, starting from the point in the UI code where I can find the specific functionality (since this is usually the only thing I'll be able to find at first). Then I will perform some action in the GUI, and drill down in the code all the way down into the database and then back up. This usually results in a complete understanding of at least one feature of the system, and sometimes gives insight into other parts of the system as well.
Once I understand what functions are being called and what stored procedures, tables, and views are involved, I then do a search through the code to find out what other parts of the application rely on these same functions/procs. This is how I find out if a change I'm going to make will break anything else in the system.
It can also sometimes be useful to attempt to make diagrams of the database and/or code structure, but sometimes it's just so bad or so insanely complex that it's better to ignore the system as a whole and just focus on the part that you need to change.
My big problem is that I (currently) have very large systems to understand in a fairly short space of time (I pity contract developers on this point) and don't have a lot of experience doing this (having previously been fortunate enough to be the one designing from the ground up.)
One method I use is to try to understand the meaning of the naming of variables, methods, classes, etc. This is useful because it (hopefully increasingly) embeds a high-level view of a train of thought from an atomic level.
I say this because typically developers will name their elements (with what they believe are) meaningfully and providing insight into their intended function. This is flawed, admittedly, if the developer has a defective understanding of their program, the terminology or (often the case, imho) is trying to sound clever. How many developers have seen keywords or class names and only then looked up the term in the dictionary, for the first time?
It's all about the standards and coding rules your company is using.
if everyone codes in different style, then it's hard to maintain other programmer code and etc, if you decide what standard you'll use have some rules, everything will be fine :) Note: that you don't have to make a lot of rules, because people should have possibility to code in style they like, otherwise you can be very surprised.

How important is it to write functional specs? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 10 months ago.
Improve this question
I've never written functional specs, I prefer to jump into the code and design things as I go. So far its worked fine, but for a recent personal project I'm writing out some specs which describe all the features of the product, and how it should 'work' without going into details of how it will be implemented, and I'm finding it very valuable.
What are your thoughts, do you write specs or do you just start coding and plan as you go, and which practice is better?
If you're driving from your home to the nearest grocery store, you probably don't need a map. But...
If you're driving to a place you've never been before in another state, you probably do.
If you're driving around at random for the fun of driving, you probably don't need a map. But...
If you're trying to get somewhere in the most effective fashion (minimize distance, minimize time, make three specific stops along the way, etc.) you probably do.
If you're driving by yourself and can take as long as you like, stopping any time you see something interesting or to reconsider your destination or route, you may not need a map. But...
If you're driving as part of a convoy, and all need to make food and overnight lodging stops together, and need to arrive together, you probably do.
If you think I'm not talking about programming, you probably don't need a functional spec, story cards, narrative, CRCs, etc. But...
If you think I am, you might want to consider at least one of the above.
;-)
For someone who "jumps into the code" and "design[s] as they go", I would say writing anything including a functional spec is better than your current methods. A great deal of time and effort can be saved if you take the time to think it through and design it before you even start.
Requirements help define what you need to make.
Design helps define what you are planning on making.
User Documentation defines what you did make.
You'll find that most places will have some variation of these three documents. The functional spec can be lumped into the design document.
I'd recommend reading Rapid Development if you're not convinced. You truely can get work done faster if you take more time to plan and design.
Jumping "straight to code" for large software projects would almost surely lead to failure (as immediatley starting posing bricks to build a bridge would).
The guys at 37 Signals would say that is better to write a short document on paper than writing a complex spec. I'd say that this could be true for mocking up quickly new websites (where the design and the idea could lead better than a rigid schema), but not always acceptable in other real life situations.
Just think of the (legal, even) importance a spec document signed by your customer can have.
The morale probably is: be flexible, and plan with functional or technical specs as much as you need, according to your project's scenario.
For one-off hacks and small utilities, don't bother.
But if you're writing a serious, large application, and have demanding customers and has to run for a long time, it's a MUST. Read Joel's great articles on the subject - they're a good start.
I do it both ways, but I've learned something from Test Driven Development...
If you go into coding with a roadmap you will get to the end of the trip a helluva lot faster than you will if you just start walking down the road without having any idea of how it is going to fork in the middle.
You don't have to write down every detail of what every function is going to do, but define you basics so that way you know what you should get done to make everything work well together.
All that being said, I needed to write a series of exception handlers yesterday and I just dove right in without trying to architect it out at all. Maybe I should reread my own advice ;)
What a lot of people don't want to admit or realize is that software development is an engineering discipline. A lot can be learned as to how they approach things. Mapping out what your going to do in an application isn't necessarily vital on small projects as it is normally easier to quickly go back and fix your mistakes. You don't see how much time is wasted compared to writing down what the system is going to do first.
In reality in large projects its almost necessary to have road map of how the system works and what it does. Call it a Functional Spec if you will, but normally you have to have something that can show you why step b follows step a. We all think we can think it up on the fly (I am definitely guilty of this too), but in reality it causes us problems. Think back and ask yourself how many times you encountered something and said to yourself "Man I wish I would have thought of that earlier?" Or someone else see's what you've done, and showed you that you could have take 3 steps to accomplish a task where you took 10.
Putting it down on paper really forces you to think about what your going to do. Once it's on paper it's not a nebulous thought anymore and then you can look at it and evaluate if what you were thinking really makes sense. Changing a one page document is easier than changing 5000 lines of code.
If you are working in an XP (or similar) environment, you'll use stories to guide development along with lots of unit and hallway useability testing (I've drunk the Kool-Aid, I guess).
However, there is one area where a spec is absolutely required: when coordinating with an external team. I had a project with a large insurance company where we needed to have an agreement on certain program behaviors, some aspects of database design and a number of file layouts. Without the spec, I was wide open to a creative interpretation of what we had promised. These were good people - I trusted them and liked working with them. But still, without that spec it would have been a death march. With the spec, I could always point out where they had deviated from the agreed-to layout or where they were asking for additional custom work ($$!). If working with a semi-antagonistic relationship, the spec can save you from even worse: a lawsuit.
Oh yes, and I agree with Kieveli: "jumping right to code" is almost never a good idea.
I would say it totally "depends" on the type of problem. I tend to ask myself am I writing it for the sake of it or for the layers above you. I also had debated this and my personal experience says, you should since it keeps the project on track with the expectations (rather than going off course).
I like to decompose any non trivial problems loosely on paper first, rather than jumping in to code, for a number of reasons;
The stuff i write on paper doesn't have to compile or make any sense to a computer
I can work at arbitrary levels of abstraction on paper
I can add pictures and diagrams really easily
I can think through and debug a concept very quickly
If the problem I'm dealing with is likely to involve either a significant amount of time, or a number of other people, I'll write it up as an outline functional spec. If I'm being paid by someone else to develop the software, and there is any potential for ambiguity, I will add enough extra detail to remove this ambiguity. I also like to use this documentation as a starting point for developing automated test cases, once the software has been written.
Put another way, I write enough of a functional specification to properly understand the software I am writing myself, and to resolve any possibile ambiguities for anyone else involved.
I rarely feel the need for a functional spec. OTOH I always have the user responsible for the feature a phone call away, so I can always query them for functional requirements as I go.
To me a functional spec is more of a political tool than technical. I guess once you have a spec you can always blame the spec if you later discover problems with the implementation. But who to blame is really of no interest to me, the problem will still be there even if you find a scapegoat, better then to revisit the implementation and try to do it right.
It's virtually impossible to write a good spec, because you really don't know enough of either the problem or the tools or future changes in the environment to do it right.
Thus I think it's much more important to adapt an agile approach to development and dedicate enough resources and time to revisit and refactor as you go.
It's important not to write them: There's Nothing Functional about a Functional Spec