How to publish a game? - embedded-resource

I don't just mean publish, but pretty much everything between when the pure coding is finished and the first version is released. For example, how do games make it so that their save files are hidden/unhackable, how do they include their resources within the game as opposed to having a resource file containing all of the sprites, etc., how do they make it so that there are special file extensions like .rect and .screen_mode, and so on and so forth.
So does anyone know any good books, articles, websites, etc. that explain the process between completing the pure code for a game and the release of it?

I don't think developers make much of an effort to ensure saves are hidden or unhackable. PC games usually just save out to a folder, one file per save, and any obfuscation is likely the result of using a binary file format (which requires some level of effort to reverse-engineer) or plaintext values that aren't very meaningful out of context, not deliberate attempts to prevent hacking. There are probably a ton of PC games that have shipped with very easily hackable text or XML save files, but I've never been a save hacker so I don't have any specific examples. On consoles the save files go to a memory card or the console's hard drive, which makes them inherently inconvenient to access, but beyond that I don't think console developers make much of an effort to encrypt or otherwise obfuscate save data. That energy would more likely be directed towards securing the game against cheating if it's an online game, or just making other systems work better.
Special file extensions come from just using your own extensions and/or defining your own file formats. You can use any extension for any file, so there are tons of "special" file formats that are just text files with a different extension; I've done this plenty of times myself. In other cases, developers define their own binary file formats, which means they also write their own parsers to process those files at runtime.
I don't know what platforms you have in mind, but for PC and console games, resources are not embedded in the executable. You will generally see a separate executable and then various archives and configuration files. Depending on the game, it may be a single resource pack, or perhaps a handful of packs for related resources like graphics, sound, level data, etc. As a general observation console games are more aggressively archived (to minimize file operations on slow optical media, and perhaps to overcome limitations of the native file systems on more primitive platforms). Some PC games have very loose assets, with even script files hanging out in the open.

If you develop for Windows or Xbox 360, Microsoft might offer some help here. Check out their Game Development tools for Visual Studio C++ Express Edition.

If you are looking for books the Game Development Essentials series should answer your questions.

To deter save-file modification, you can implement a simple encryption scheme: encrypt the data when saving and decrypt it when loading. File extensions are simply a matter of choice.
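One way to do that in Python, as a hedged sketch (the cryptography package, file name, and save-data layout here are my own choices, not anything from the question):

# pip install cryptography
import json
from cryptography.fernet import Fernet

# A real game would persist or derive this key somewhere non-obvious rather
# than generating a new one per run; this only sketches the save/load round trip.
KEY = Fernet.generate_key()
cipher = Fernet(KEY)

def save_game(state: dict, path: str = "save1.dat") -> None:
    plaintext = json.dumps(state).encode("utf-8")
    with open(path, "wb") as f:
        f.write(cipher.encrypt(plaintext))   # opaque bytes on disk

def load_game(path: str = "save1.dat") -> dict:
    with open(path, "rb") as f:
        return json.loads(cipher.decrypt(f.read()))

save_game({"level": 3, "hp": 72})
print(load_game())  # {'level': 3, 'hp': 72}

Keep in mind this only deters casual editing; anyone with the game binary can recover the key.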

To use special file extensions in your game, just do the following:
Create some files in a format of your choice that have that extension, and then
write some code that knows how to read that format, and point it at those files.
File extensions are conventions, nothing more; there's nothing magic about them.
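For instance, a minimal sketch of a made-up ".rect" format in Python (the extension comes from the question; the layout and reader below are entirely hypothetical):

# A made-up ".rect" format: one rectangle per line as "x y width height".
from typing import List, Tuple

def load_rects(path: str) -> List[Tuple[int, int, int, int]]:
    rects = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue                     # allow blank lines and comments
            x, y, w, h = (int(v) for v in line.split())
            rects.append((x, y, w, h))
    return rects

# ui_layout.rect might contain:
#   # main menu button
#   10 20 200 50
# buttons = load_rects("ui_layout.rect")

The extension is arbitrary; only your loader needs to understand the contents.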
ETA: As for embedding resources, there are a few different ways to approach that problem. One common technique is to keep all your resources bundled together in a small number of files - maybe only one (Guild Wars takes that approach).
At the other extreme, you can leave your resources spread across many files in a directory tree, maybe in a custom format that requires special tools to modify, and maybe not. Civilization 4 does things this way, as do all the Turbine games I'm familiar with. This is a matter of taste, and not very important either way.
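To make the single-pack approach concrete, here is a rough sketch (my own illustration, not how Guild Wars or any particular engine does it) using Python's standard zipfile module as the archive format:

import zipfile

# Build the resource pack once, at packaging time.
def build_pack(asset_paths, pack_path="assets.pak"):
    with zipfile.ZipFile(pack_path, "w", compression=zipfile.ZIP_DEFLATED) as pack:
        for path in asset_paths:
            pack.write(path)

# At runtime the game pulls individual assets out of the single pack file.
def read_asset(name, pack_path="assets.pak") -> bytes:
    with zipfile.ZipFile(pack_path) as pack:
        return pack.read(name)

# build_pack(["sprites/player.png", "levels/level1.txt"])
# sprite_bytes = read_asset("sprites/player.png")

Real engines usually use their own archive formats with indexes tuned for streaming, but the idea is the same: one file on disk, many logical assets inside.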

I think a better solution is to break your images into tiles of some known size and then join them back together in some random order in a new file. This random order is known only to you, so only you know how to rearrange the tiles to get the original image back.
The approach would be to maintain a one-dimensional array that records the position of each tile. Then use the crop functions of MIDP to extract each tile and render it back to the screen.
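This answer has MIDP/Java ME in mind, but the same shuffle-and-reassemble idea can be sketched in Python with the Pillow library standing in for MIDP's crop operations (tile size, file names, and the seed below are all made up):

import random
from PIL import Image

TILE = 32  # tile size in pixels; assumed to divide the image evenly

def scramble(src="sprite.png", dst="sprite_scrambled.png", seed=1234):
    img = Image.open(src)
    cols, rows = img.width // TILE, img.height // TILE
    order = list(range(cols * rows))
    random.Random(seed).shuffle(order)            # the "secret" tile order
    out = Image.new(img.mode, img.size)
    for dst_idx, src_idx in enumerate(order):
        sx, sy = (src_idx % cols) * TILE, (src_idx // cols) * TILE
        dx, dy = (dst_idx % cols) * TILE, (dst_idx // cols) * TILE
        out.paste(img.crop((sx, sy, sx + TILE, sy + TILE)), (dx, dy))
    out.save(dst)
    return order  # keep this (or just the seed) to reassemble at load time

def unscramble(src, order):
    img = Image.open(src)
    cols = img.width // TILE
    out = Image.new(img.mode, img.size)
    for dst_idx, src_idx in enumerate(order):
        # the tile sitting at dst_idx in the scrambled image belongs at src_idx
        sx, sy = (dst_idx % cols) * TILE, (dst_idx // cols) * TILE
        dx, dy = (src_idx % cols) * TILE, (src_idx // cols) * TILE
        out.paste(img.crop((sx, sy, sx + TILE, sy + TILE)), (dx, dy))
    return out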
If you need, I can post the code for you.

I would suggest checking out the presentation from the developers of World of Goo (great game):
http://2dboy.com/public/eyawtkagibwata.pdf.

Related

Can I extract data from a .dat file the same way the software does, and how?

I want to know if I can extract data from a .dat file by imitating the way I do it inside the software.
Normally, I load the .dat file in the software through the "Load Data" option, and then I'm interested in obtaining 2 files, generated through the "Params to ASCII" and "Data to ASCII" options inside the software. As you can see, I obtain 2 ASCII files, which are easily read with a text editor.
The problem is that I do it all manually, and there are a lot of .dat files, so I spend a lot of hours doing nothing but clicking.
So, I want to know if there is some way to automate those operations; anything would help. With my limited knowledge, I'm thinking of scripts that imitate what I do manually (I don't know how to do it), or something more complex involving reverse engineering (I also don't know how to do that, or whether it's possible). Or maybe using PowerShell...
Maybe you guys could help me; surely you have more brilliant minds!
Kind regards!
There are at least four options that I can think of. Sadly, .dat is not a well-defined file format like .pdf, but a general extension used for all kinds of data files. Do you know the name of the software you open the files with? That would help in finding a solution. Anyway, here are some general ideas; recommending any of them, or being more practical, requires knowing the software.
Use the application vendor's API or libraries to read the file. Vendors often provide a .NET library for reading the file from disk or via an API call. This would be the clean and supported way. For example, to read dBase database files, there's a library on GitHub.
Read the file as raw binary (as explained in the article linked by Abraham Zinala). I'd rather not try this first, as it requires some reverse engineering and might produce unexpected errors.
Use UI automation. That is, create a script that uses SendKeys to simulate pressing keyboard keys; there are tools such as AutoIt that make this easier (see the sketch after this list). This is kind of a last resort, as it is error-prone and cumbersome. If the software supports macros or has internal automation capability, try that before 3rd-party tools.
Ask whether the system sending you the .dat files can provide the data in some other, easier-to-process format. While this would be the easiest solution for you, the other party might not agree.
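Since the software and its menus are unknown, here is only a rough, hypothetical sketch of option 3 in Python (using the pyautogui package instead of SendKeys/AutoIt); every shortcut, delay, and path below is made up and would have to be adapted to the real program:

# pip install pyautogui
import time
from pathlib import Path
import pyautogui

DATA_DIR = Path(r"C:\data")          # hypothetical folder full of .dat files

for dat_file in DATA_DIR.glob("*.dat"):
    pyautogui.hotkey("ctrl", "o")    # pretend Ctrl+O opens the "Load Data" dialog
    time.sleep(1)                    # give the dialog time to appear
    pyautogui.write(str(dat_file))   # type the file path
    pyautogui.press("enter")
    time.sleep(2)                    # wait for the data to load

    pyautogui.hotkey("alt", "p")     # hypothetical shortcut for "Params to ASCII"
    time.sleep(1)
    pyautogui.hotkey("alt", "d")     # hypothetical shortcut for "Data to ASCII"
    time.sleep(1)

If the program exposes real menu accelerators, macros, or a command line, driving those is far more robust than blind keystrokes with fixed sleeps.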

Why Fuzz images?

I am reading about fuzzing. I have some basic questions regarding fuzzing. I searched but couldn't find any good explanation.
Why are image files popular and common targets for fuzzing? What is the benefit of using image files?
Why are PNG files popular and common for fuzzing?
Why is libpng popular and common for fuzzing?
Is fuzzing PNG images with libpng the best starting point for beginners? Why?
If someone can answer, it will be very helpful for me.
Thank you in advance.
You don't fuzz image files themselves; you fuzz the software that parses them. Typically developers don't write code to parse images, but use third-party libraries like libpng. As a developer you don't need to fuzz third-party libraries, only the code of your own project. As a security engineer you can fuzz them.
It is easy to set up fuzzing of such an open-source library: you can build it statically instrumented, create a small application that calls into it, and fuzz it with an easy-to-set-up fuzzer like AFL. This, and the fact that such libraries are widely used (so errors in them can have a big impact on a lot of applications), makes them a good target for fuzzing.
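The answer above has C libraries and AFL in mind; purely as a loose illustration of what a harness looks like, here is a minimal Python equivalent using the atheris fuzzer and Pillow's image decoder as stand-ins (both are my substitutions, not tools named in the answer):

# pip install atheris pillow
import io
import sys

import atheris

with atheris.instrument_imports():
    from PIL import Image

def test_one_input(data: bytes) -> None:
    try:
        img = Image.open(io.BytesIO(data))
        img.load()                    # force the decoder to actually run
    except Exception:
        pass                          # malformed input is expected; we care
                                      # about crashes and hangs, not exceptions

atheris.Setup(sys.argv, test_one_input)
atheris.Fuzz()

Note that Pillow's decoders are largely native code, so for meaningful coverage you would build that native code instrumented as well; the snippet only shows the shape of a harness: take bytes, feed the parser, swallow expected errors.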
But image files are not the only files that are widely used and have popular libraries to handle them. Most fuzzers are unaware of the input structure of the tested binary. They mostly use input mutation at the bit/byte level: changing the value of some bit or byte of the input, feeding it to the tested application, and watching its behaviour. When the input is highly structured, such a fuzzer fails to test deep into the code. For example, to test a browser by feeding HTML files to it, a fuzzer has to create inputs with the correct lexical and syntactic structure. Typically the code for lexical/syntax handling is autogenerated from a language grammar; by changing bits/bytes in HTML you most likely get bad keywords, which get rejected by that autogenerated code, so you mostly test the parser front end and never get deeper. Image files are typically not as highly structured and are easier to fuzz deeply, so they can be fuzzed with better coverage.
It is also faster to fuzz a small input than a bigger one: fewer bits to change. And it's easier to create a small image file just by taking a small image as a seed than it is to create, say, a small HTML file.
I don't know whether PNG files are more popular for fuzzing than other binary media files, but their structure can include multiple headers/chunks of different types, which results in more distinct handling paths in the code and thus makes it more likely to contain errors.
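As a small illustration of that chunk structure (my own addition, not part of the original answer), this is roughly how the chunks of a PNG can be walked in Python; the layout (an 8-byte signature followed by length/type/data/CRC chunks) comes from the PNG specification:

import struct

def list_png_chunks(path: str):
    # Yield (chunk_type, length) for each chunk in a PNG file.
    with open(path, "rb") as f:
        if f.read(8) != b"\x89PNG\r\n\x1a\n":
            raise ValueError("not a PNG file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            length, ctype = struct.unpack(">I4s", header)
            f.seek(length + 4, 1)     # skip the chunk data and its 4-byte CRC
            yield ctype.decode("ascii"), length

# for ctype, length in list_png_chunks("example.png"):
#     print(ctype, length)            # e.g. IHDR 13, IDAT ..., IEND 0

Each chunk type (IHDR, PLTE, IDAT, tEXt, ...) takes a different code path in a decoder, which is what gives a fuzzer more surface to hit.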
As I said, it's open source, widely used, easy to set up, and can be fuzzed fast: it's much faster to run a small application than, for example, a browser.
I'm not sure there can be a 'best' criterion, but it's easy and therefore good for beginners.

MIT-Scratch adding/removing language features

I am seeking a way to allow my non-tech users to specify a workflow and execute it (if anyone is interested, I want them to specify and execute test cases). Visual programming seems a good way to go.
Can I modify the Scratch IDE to remove some categories (such as sound, motion, etc), and add some of my own? Ditto for individual keywords (obviously, I then need to handle new keywords).
I have Googled, but the answer is not immediately apparent.
[Update] I have just found Google's Blockly
Blockly was influenced by App Inventor, which in turn was influenced by Scratch, which in turn was influenced by StarLogo.
It looks very promising, especially when it says:
Exportable code. Users can extract their programs as JavaScript, Python, PHP, Dart or another language, so that when they outgrow Blockly they can keep learning.
Open source. Everything about Blockly is open: you can fork it, hack it, and use it in your own websites.
Extensible. Make Blockly fit with your application by adding custom blocks for your API and removing unneeded blocks and functionality.
One possible snag is that it is browser based, but if my management don't like that, then I can create a dummy Windows based app consisting of little but a TWebBrowser component.
I will investigate and report back - unless someone else posts an acceptable answer first.
The short answer to your initial question is: no. You can't customize Scratch, or not to the extent that you seem to ask/want.
That said, look at:
custom blocks
Scratch extensions
variants like Snap!
using Scratch's source code in Squeak to make your own variant
other systems inspired by Scratch, like App Inventor and Blockly
Only the first two are compatible with the Scratch web site.
A word on the site: depending on your purpose with Scratch, the exchange between users is a powerful part of it. Check how cooperation is supported, for example through the backpack. There's also a good wiki that documents much of the above.

Communication between applications written in different languages

I am looking at linking a few applications together (all written in different languages like C#, C++, Python) and I am not sure how to go about it.
What do I mean by linking? The system I am working on consists of small programs, each responsible for a particular processing task. I need to be able to transfer a data set from one application to another easily (the data set in question is not huge, probably a few megabytes), and I also need some way to control the current state of the operation (this is where a client-server model rings a bell).
It seems like sockets or maybe SOAP would be a universal solution but just wanted to get some opinions as to what people think about this subject.
Comments/suggestions will be appreciated, thanks!
I'm personally partial to ØMQ. It's a library that has a familiar BSD-sockets-like interface for passing messages, but you'll find it implements interesting patterns for distributing tasks.
It sounds like you want to arrange several processes in a pipeline. ØMQ allows you to do that using push and pull sockets. (And afterwards, you'll find it's even possible to scale up across multiple processes and machines with little effort.) Take a look at the guide to get started, and the zmq_socket(3) manpage specifically for how push and pull work.
Bindings are available for all the languages you mention.
As for the contents of the message, ØMQ doesn't concern itself with that, they are just blocks of raw data. You can use any format that suits you, such as JSON, or perhaps Protocol Buffers.
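For a concrete picture, here is a minimal sketch of a two-stage pipeline using the Python bindings (pyzmq); the port number and JSON payload are arbitrary choices for the example:

# pip install pyzmq
import zmq

# Producer process: one end of the pipeline.
def producer():
    ctx = zmq.Context()
    push = ctx.socket(zmq.PUSH)
    push.bind("tcp://*:5557")            # arbitrary port for the example
    for i in range(10):
        push.send_json({"task_id": i, "payload": [i] * 3})

# Worker process: the next stage (it could just as well be written in C# or C++).
def worker():
    ctx = zmq.Context()
    pull = ctx.socket(zmq.PULL)
    pull.connect("tcp://localhost:5557")
    while True:
        task = pull.recv_json()          # blocks until a message arrives
        print("processing", task["task_id"])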
What I'm not sure about is the ‘controlling state’ you mention. Are you interested in, for example, cancelling a job halfway through?
For C# to C# you can use Windows Communication Foundation. You may be able to use it with Python and C++ as well.
You may also want to check out named pipes.
I would think about moving to a model where you eliminate the issue by having centralized data that all of the applications look at. Keep "one source of the truth" so to speak.
Most outside software has trouble linking against C++ code, due to the name-mangling algorithm it uses for its symbols. For that reason, when interfacing with programs written in other languages, it is often best to declare wrapper functions as extern "C", or place them inside an extern "C" { ... } block.
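To show why that matters from the consuming side, here is a hedged sketch of a Python process calling into such a wrapper via ctypes; the library name and function are hypothetical, and the C++ side would have to export process_data with extern "C" so the symbol name is predictable:

import ctypes

# Hypothetical shared library built from the C++ code with an extern "C" wrapper:
#   extern "C" int process_data(const char* input, int length);
lib = ctypes.CDLL("./libprocessing.so")      # made-up name; a .dll on Windows

lib.process_data.argtypes = [ctypes.c_char_p, ctypes.c_int]
lib.process_data.restype = ctypes.c_int

data = b"serialized data set goes here"
result = lib.process_data(data, len(data))
print("C++ side returned", result)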
I need to be able to transfer a data set from one application to another easily (the data set in question is not huge, probably a few megabytes)
Use the file system.
and I also need some form of way to control the current state of the operation
Again, use the file system. A "current_state.json" file with a JSON serialized object is perfect for multiple languages to work with.
It seems like sockets or maybe SOAP would be a universal solution.
Perhaps. But it's overkill for this kind of thing. Your OS already has all the facilities you need. Just use the file system. It's very simple and very reliable.
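A minimal sketch of that shared state file from the Python side (the current_state.json name comes from the answer above; the fields are made up), which the C# and C++ programs can read and write with their own JSON libraries:

import json
import os
import tempfile

STATE_FILE = "current_state.json"

def write_state(state: dict) -> None:
    # Write to a temp file and rename, so readers never see a half-written file.
    fd, tmp = tempfile.mkstemp(dir=".")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, STATE_FILE)

def read_state() -> dict:
    with open(STATE_FILE) as f:
        return json.load(f)

write_state({"stage": "preprocessing", "progress": 0.4, "cancelled": False})
print(read_state()["stage"])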
There are many ways to do interprocess communication. As you said, sockets may be a universal solution. SOAP, I think, is somewhat overkill. You may also use mailslots; I wrote a C++ application using them a couple of years ago. Named pipes could also be a solution, but if you are coding on Windows, it may be difficult.
In my opinion:
Sockets
Mailslots
Are the best candidates.

Writing my own file versioning program

There is what seems to be a plethora of version control systems. Therefore, to draw a bad conclusion, it must be easy to write one.
What are some issues that must be considered in order to write a simple file versioning system? (What are the minimum necessary functions?)
Is it a feasible task for one person?
A good place to learn about version control is Eric Sink's Weblog. His most recent article is Time and Space Tradeoffs in Version Control Storage, for one example.
Another good example is his series of articles Source Control HOWTO. Yes, it's all about how to use source control, but it has a lot of information about the decisions and tradeoffs developers have to make when designing the system. The best example of this is probably his article on Repositories, where he explains different methods of storing versions. I really learned a lot from this series.
How simple?
You could arguably write a version control system with a single-line shell script, upversion.sh:
cp -r "$WORKING_COPY" "$REPO/$(date +%s)"
For large binary assets, that is basically all you need! It could be improved quite easily, say by making the version folders read-only, or by recording metadata with each version (you could have a text file at $REPO/$(date...).meta, for example).
That sounds like a huge simplification, but it's not far off the asset-management systems many film post-production facilities use (for example).
You really need to know what you wish to version, and why.
With large binary assets (video, say), you need to focus on tools to visually compare versions. You also probably need to deal with dependencies ("I need image123.jpg and video321.avi to generate this image").
With code, you need to focus on things like making diffs between any two versions really easy. Also, since edits to source code are usually small (a few characters in a project with many thousands of lines), it would be horribly inefficient to copy the entire project for each version; instead you only store the differences between each version (delta encoding).
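As a rough sketch of delta storage (my own illustration, not the format any particular VCS uses), Python's standard difflib can produce the difference between two versions of a text file, which is all you would store:

import difflib

def make_delta(old_lines, new_lines):
    # Keep only the unified diff between two versions of a file.
    return list(difflib.unified_diff(old_lines, new_lines, lineterm=""))

v1 = ["def greet():", "    print('hello')"]
v2 = ["def greet(name):", "    print(f'hello {name}')"]

delta = make_delta(v1, v2)
print("\n".join(delta))
# ---/+++ headers followed by @@ hunks: only the changed lines are stored, so a
# ten-thousand-line file with a one-line edit costs a few lines, not a full copy.

Reconstructing old versions then means applying the stored deltas in sequence, which is where formats like the one used by RCS come in.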
To version a database, you probably want to store information on the schema, tracking new tables, or columns, or adjustments to existing ones (rather than calculating deltas of the database files, or making copies like the previous two systems)
There's no perfect way to version everything; you have to focus on doing one thing well. Git is great for text, but not for binary files. Adobe Version Cue is great with binary files (images), but useless for text.
I suppose the things to consider can be summarised as:
What do you want to version?
Why can I not use (or extend/modify) an existing system?
How will I track differences between versions? (Entire files? Deltas?)
What other data do I need to attach to versions? (Author? Timestamp? Dependencies?)
What tasks would a user commonly need to do? (Diffing? Reverting specific files?)
Have a look at the question "core concepts" about (D)VCS.
In short, writing a VCS involves making decisions about each of these core concepts (central vs. distributed, linear vs. DAG, file-centric vs. repository-centric, ...).
Not a "quick" project, I believe ;)
If you're Linus Torvalds, you can write something like Git in a month.
But "a version control system" is such a vague and stretchable concept, that your question is really unanswerable.
I'd consider asking yourself what you want to achieve (learn about VCS, learn a language, ...) and then define some clear goal. It's good to have a project, but it's also good to have a reachable goal in a small amount of time. Small successes are good for your morale.
That IS really a bad conclusion. My personal opinion here is that the problem domain is so wide and generally hard that nobody has gotten it "right" yet, so people keep trying to solve it over and over again, from different angles and under different assumptions. That of course doesn't mean you shouldn't try. Just be warned that many smart people were there before you, so you should do your homework.
What could give you a good overview in a less technical manner is The Git Parable.
It is a nice abstraction of the principles of git, and it gives a very good understanding of what a VCS should be able to do. Everything beyond that is a rather "low-level" decision.
A good delta algorithm, good compression and network efficiency.
A simple one is doable by one person for a learning opportunity. One issue you might consider is how to efficiently store plain text deltas. A very popular delta format is the one from RCS (used by many version control programs). You might want to study it to get ideas.
To write a proof of concept, you probably could pull it off, implementing or borrowing the tools Alan mentions.
IMHO, the most important aspect of a VCS is ease of use. This sounds like an odd statement, but when you think about it, hard drive space is one of the easiest IT commodities to scale, so bad compression or even really sloppy deltas will be tolerated. The main reason people demand improvement in versioning systems is to do common tasks more intuitively, or to support more features that droves of people eventually demand but that weren't obvious before release. And since versioning tools tend to be monolithic and thoroughly integrated at a company, the cost to switch is high, and it may not be possible to support a new feature without breaking an existing repo.
The very minimal necessary prerequisite is an exhaustive and accurate test suite. Nobody (including you) will want to use your new system unless you can demonstrate that it works, reliably and completely error free.