Can I extract data from a .dat file the way the software does, and how? - powershell

I want to know if I can extract data from a .dat file by imitating the way I do it inside the software.
Normally I load the .dat file into the software through the "Load Data" option; I am then interested in obtaining 2 files, generated through the "Params to ASCII" and "Data to ASCII" options inside the software. As you can see, I obtain 2 ASCII files, which are easily read with a text editor.
The problem is that I do it all manually, and there are a lot of .dat files, so I spend many hours doing nothing but clicking.
So, I want to know if there is some way to automate those operations; anything will do. With my limited knowledge, I am thinking of scripts that imitate what I do manually (I don't know how to write them), or something more complex that involves reverse engineering (I also don't know how to do that, or whether it's possible). Or maybe using PowerShell...
Maybe you could help me; surely you have more brilliant minds!
Kind regards!

There are at least four options that I can think of. Sadly, .dat is not a well-defined file format like .pdf, but a generic extension used for all kinds of data files. Do you know the name of the software you open the files with? That would help in finding a solution. Anyway, here are some general ideas; recommending any of them, or being more practical, requires knowing the software.
1. Use the application vendor's API or libraries to read the file. Vendors often provide a .NET library for reading the file from disk or via an API call. This would be the clean and supported way. For example, to read dBase database files, there's a library on GitHub.
2. Read the file as raw binary (as explained in the article linked by Abraham Zinala); see the sketch after this list. I'd rather not try this first, as it requires some reverse engineering and might produce unexpected errors.
3. Use UI automation. That is, create a script that uses SendKeys to simulate pressing keyboard keys. There are tools such as AutoIt that make this easier. This is something of a last resort, as it is error prone and cumbersome. If the software supports macros or has internal automation capabilities, try those before third-party tools.
4. Hope that the system sending you the .dat files can offer the data in some other, easier-to-process format. While this is the easiest solution for you, the other party might not agree.
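To illustrate option 2, here is a minimal Python sketch of reading a .dat file as raw binary. The record layout it assumes (a 4-byte little-endian record count followed by pairs of doubles) is purely hypothetical; the real layout would have to be reverse engineered from the vendor's format.

    # a sketch of parsing a hypothetical .dat layout with the struct module;
    # the actual structure depends entirely on the vendor's format
    import struct

    with open("measurement.dat", "rb") as f:
        raw = f.read()

    # assumed layout: 4-byte little-endian record count, then pairs of doubles
    (count,) = struct.unpack_from("<I", raw, 0)
    values = struct.unpack_from(f"<{count * 2}d", raw, 4)
    records = list(zip(values[0::2], values[1::2]))
    print(records[:5])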

Related

Why fuzz images?

I am reading about fuzzing. I have some basic questions regarding fuzzing. I searched but couldn't find any good explanation.
Why are image files popular and common targets for fuzzing? What is the benefit of using image files?
Why are PNG files popular and common for fuzzing?
Why is libpng popular and common for fuzzing?
Is fuzzing PNG images with libpng the best choice for beginners? Why?
If someone can answer, it will be very helpful for me.
Thank you in advance.
You don't fuzz the image files themselves, but the software that parses them. Typically, developers don't write their own code to parse images; they use third-party libraries like libpng. As a developer you don't need to fuzz third-party libraries, only the code of your own project. As a security engineer, you can fuzz them.
It is easy to set up fuzzing for such an open-source library: you can build it statically instrumented, create a small application that calls into it, and fuzz it with an easy-to-set-up fuzzer like afl. This, together with the fact that such libraries are widely used (so errors in them can have a big impact on a lot of applications), makes them a good target for fuzzing.
But image files are not the only files that are widely used and have popular libraries to handle them. Most fuzzers are unaware of the input structure of the tested binary. They mostly use input mutation techniques at the bit/byte level: changing the values of some bits/bytes of the input, feeding it to the tested application, and watching its behaviour. When the input is highly structured, a fuzzer fails to test deep into the code. For example, to test a browser by feeding it HTML files, a fuzzer would need to create inputs with correct lexical and syntactic structure. Typically, the code for lexical/syntax handling is autogenerated from a language grammar. By changing bits/bytes in HTML you most likely produce bad keywords, which are rejected by that autogenerated code, so you mostly test that code and never get deeper. Image files are typically not highly structured and are easier to fuzz deeply, so they can be fuzzed with better coverage.
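To make the mutation technique concrete, here is a toy Python sketch of a byte-level mutation loop. Real fuzzers like afl add coverage feedback and smarter mutation strategies; the target path and seed file are illustrative assumptions.

    # toy byte-level mutation fuzzing: mutate a seed, run the target, watch
    # for crashes; no coverage feedback, unlike real fuzzers such as afl
    import random
    import subprocess

    with open("seed.png", "rb") as f:
        seed = f.read()

    for i in range(1000):
        data = bytearray(seed)
        # flip a handful of random bytes in the input
        for _ in range(random.randint(1, 8)):
            data[random.randrange(len(data))] = random.randrange(256)
        with open("mutated.png", "wb") as f:
            f.write(bytes(data))
        # feed the mutated input to the tested application
        result = subprocess.run(["./target_app", "mutated.png"])
        if result.returncode < 0:  # killed by a signal, e.g. SIGSEGV
            print("crash on iteration", i)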
It is also faster to fuzz a small input than a bigger one: there are fewer bits to change. And it's easier to create a small image file, just by taking a small image as a seed, than, say, a small HTML file.
I don't know whether PNG files are more popular for fuzzing than other binary media files, but their structure can include multiple headers/chunks of different types, which results in more distinct handling paths in the code and thus makes errors more likely.
As I said, libpng is open source, widely used, easy to set up, and fast to fuzz: it's much faster to run a small application than, for example, a browser.
I'm not sure there can be a 'best' criterion, but it's easy, and therefore good for beginners.

How to read disks?

In programming, how do you go about reading raw data off of disks? Note: not with a hex editor; I know how to do that. I basically want to make my own tool.
For example, I want to be able to read the raw data off of a flash drive or some other disk so I can find deleted data. Is it as simple as opening a file and reading a stream? Can someone point me in the right direction?
Obviously, I would want the data to appear in hex so that I can scan for file signatures (http://www.garykessler.net/library/file_sigs.html).
C and Python are the languages I am really curious about. Will the standard libraries allow you to open a disk and read data from it directly?
Linux and Windows are the two OSes I use.
Thanks
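On both systems a disk can in fact be opened much like a file, so the standard libraries are enough. A minimal Python sketch of the idea follows; the device paths are examples, and reading them requires root/administrator privileges (on Windows, raw reads must also be in multiples of the sector size).

    # read the first sector (512 bytes) of a raw disk and dump it as hex;
    # device paths are examples and need root/administrator rights
    import sys

    # Linux exposes whole disks as e.g. /dev/sdb;
    # Windows exposes them as \\.\PhysicalDrive0, \\.\PhysicalDrive1, ...
    path = "/dev/sdb" if sys.platform.startswith("linux") else r"\\.\PhysicalDrive0"

    with open(path, "rb") as disk:
        sector = disk.read(512)

    # hex output so you can scan for file signatures
    print(sector.hex(" "))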

Communication between applications written in different languages

I am looking at linking a few applications together (all written in different languages like C#, C++, and Python) and I am not sure how to go about it.
What do I mean by linking? The system I am working on consists of small programs, each responsible for a particular processing task. I need to be able to transfer a data set from one application to another easily (the data set in question is not huge, probably a few megabytes), and I also need some way to control the current state of the operation (this is where a client-server model rings a bell).
It seems like sockets or maybe SOAP would be a universal solution, but I just wanted to get some opinions on the subject.
Comments/suggestions will be appreciated, thanks!
I personally have a liking for ØMQ. It's a library with a familiar BSD-sockets-like interface for passing messages, but you'll find it implements interesting patterns for distributing tasks.
It sounds like you want to arrange several processes in a pipeline. ØMQ allows you to do that using push and pull sockets. (And afterwards, you'll find it's even possible to scale up across multiple processes and machines with little effort.) Take a look at the guide to get started, and at the zmq_socket(3) manpage specifically for how push and pull work.
Bindings are available for all the languages you mention.
As for the contents of the messages, ØMQ doesn't concern itself with that; they are just blocks of raw data. You can use any format that suits you, such as JSON, or perhaps Protocol Buffers.
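A minimal sketch of such a pipeline using the Python bindings (pyzmq); the port number and the JSON payload are illustrative assumptions.

    # a two-stage ØMQ pipeline: a producer pushes tasks, a worker pulls them
    # (requires the pyzmq bindings: pip install pyzmq)
    import zmq

    def producer():
        ctx = zmq.Context.instance()
        push = ctx.socket(zmq.PUSH)
        push.bind("tcp://127.0.0.1:5557")
        for i in range(10):
            # messages are just bytes; send_json serializes for us
            push.send_json({"task_id": i, "payload": [i, i * 2]})

    def worker():
        ctx = zmq.Context.instance()
        pull = ctx.socket(zmq.PULL)
        pull.connect("tcp://127.0.0.1:5557")
        while True:
            msg = pull.recv_json()
            print("processing task", msg["task_id"])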
What I'm not sure about is the ‘controlling state’ you mention. Are you interested in, for example, cancelling a job halfway through?
For C# to C# you can use Windows Communication Foundation. You may be able to use it with Python and C++ as well.
You may also want to check out named pipes.
I would think about moving to a model where you eliminate the issue by having centralized data that all of the applications look at. Keep 'one source of truth', so to speak.
Most outside software has trouble linking against C++ code, due to the name-mangling algorithm it uses for its symbols. For that reason, when interfacing with programs written in other languages, it is often best to declare wrapper functions as extern "C" or inside an extern "C" { } block.
I need to be able to transfer a data set from one application to another easily (the data set in question is not huge, probably a few megabytes)
Use the file system.
and I also need some form of way to control the current state of the operation
Again, use the file system. A "current_state.json" file with a JSON serialized object is perfect for multiple languages to work with.
It seems like sockets or maybe SOAP would be a universal solution.
Perhaps. But it's overkill for this kind of thing. Your OS already has all the facilities you need. Just use the file system. It's very simple and very reliable.
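A minimal Python sketch of the file-system approach; the file name and the state fields are illustrative assumptions. Writing to a temporary file and renaming it keeps readers from ever seeing a half-written state file.

    # share state between processes through a JSON file on disk
    import json
    import os
    import tempfile

    STATE_FILE = "current_state.json"

    def write_state(state):
        # write to a temp file, then atomically rename over the real one
        fd, tmp = tempfile.mkstemp(dir=".")
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
        os.replace(tmp, STATE_FILE)

    def read_state():
        with open(STATE_FILE) as f:
            return json.load(f)

    write_state({"stage": "preprocessing", "progress": 0.4})
    print(read_state())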
There are many ways to do interprocess communication. As you said, sockets may be a universal solution. SOAP, I think, is somewhat overkill. You may also use mailslots; I wrote a C++ application using them a couple of years ago. Named pipes could also be a solution, but that may be difficult if you are coding on Windows.
In my opinion, sockets and mailslots are the best candidates.

How can I hide Perl code?

I've written some Perl programs and am planning on distributing them. They're part of a large binary distribution (mostly compiled C/C++). If possible, I'd prefer to give up as little as possible (I'm responsible for delivering working software, not delivering clever algorithms). What is my best bet for hiding the Perl code, so that if someone really wants to see the source, they'd have to put in a bit more effort than simply opening the file in an editor?
You could encrypt your code and then, at run time, decrypt it and pipe it to perl's stdin (of course, the decryptor itself would not be encrypted).
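A minimal sketch of that decrypt-and-pipe idea; the XOR 'encryption' and the file name are illustrative assumptions only, not a serious scheme (the sketch is in Python, but the launcher could be any compiled stub).

    # decrypt an obfuscated Perl script and feed it to perl on stdin
    import subprocess

    KEY = 0x5A

    def decrypt(data: bytes) -> bytes:
        # toy XOR scrambling; a real launcher would use something stronger
        return bytes(b ^ KEY for b in data)

    with open("script.pl.enc", "rb") as f:
        source = decrypt(f.read())

    # 'perl -' reads the program from standard input
    subprocess.run(["perl", "-"], input=source, check=True)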
I got some minify/compile answers to my question How can I compile my Perl script so to reduce startup time?
Acme::Bleach
Filter::Crypto (potentially via PAR::Filter::Crypto) is clearly the most advanced open-source tool for this job (barring perlcc, which doesn't work well for many things; YMMV).
If all you want is to hide the code from casual tinkerers, that's more than sufficient. Hiding it from determined and/or capable people is practically impossible.
It won't make it harder to open the files, but an obfuscator can make it more difficult to understand and modify your code. Have a look here or here for a start.

How to publish a game?

I don't just mean publish; I mean pretty much everything between when the pure coding is finished and the first version is released. For example, how do games make their save files hidden/unhackable, how do they include their resources within the game as opposed to having a resource file containing all of the sprites, how do they end up with special file extensions like .rect and .screen_mode, and so on?
So does anyone know any good books, articles, websites, etc. that explain the process between completing the pure code for a game and the release of it?
I don't think developers make much of an effort to ensure saves are hidden or unhackable. PC games usually just save to a folder, one file per save, and any obfuscation is likely the result of using a binary file format (which requires some level of effort to reverse engineer) or plaintext values that aren't very meaningful out of context, not deliberate attempts to prevent tampering. There are probably a ton of PC games that have shipped with easily hackable text or XML save files, but I've never been a save hacker, so I don't have any specific examples.
On consoles, save files go to a memory card or the console's hard drive, which makes them inherently inconvenient to access, but beyond that I don't think console developers make much of an effort to encrypt or otherwise obfuscate save data. That energy is more likely directed towards securing the game against cheating, if it's an online game, or just making other systems work better.
Special file extensions come from simply using your own extensions and/or defining your own file formats. You can use any extension for any file, so there are tons of 'special' file formats that are just text files with a different extension; I've done this plenty of times myself. In other cases, where developers have defined their own binary file format, that means they also have their own file parsers to process those files at runtime.
I don't know what platforms you have in mind, but for PC and console games, resources are not embedded in the executable. You will generally see a separate executable and then various archives and configuration files. Depending on the game, it may be a single resource pack, or perhaps a handful of packs for related resources like graphics, sound, level data, etc. As a general observation console games are more aggressively archived (to minimize file operations on slow optical media, and perhaps to overcome limitations of the native file systems on more primitive platforms). Some PC games have very loose assets, with even script files hanging out in the open.
If you develop for Windows or XBox 360, Microsoft might offer some help here. Check out their Game Development tools for Visual Studio C++ Express Edition.
If you are looking for books, the Game Development Essentials series should answer your questions.
To discourage modification of save files, you can implement a simple encryption algorithm and use it to encrypt saves, then decrypt them when loading; see the sketch below. File extensions are simply a matter of choice.
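A minimal sketch of one way to implement that idea in Python, using an HMAC to detect tampering rather than hiding the contents; the key, file name, and state fields are illustrative assumptions (and a key that ships with the game can always be extracted by a determined player).

    # sign save files with an HMAC and refuse to load modified ones
    import hashlib
    import hmac
    import json

    SECRET = b"not-really-secret-once-it-ships-with-the-game"

    def save_game(state, path="save.dat"):
        payload = json.dumps(state).encode()
        tag = hmac.new(SECRET, payload, hashlib.sha256).digest()
        with open(path, "wb") as f:
            f.write(tag + payload)

    def load_game(path="save.dat"):
        with open(path, "rb") as f:
            blob = f.read()
        tag, payload = blob[:32], blob[32:]
        expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("save file has been modified")
        return json.loads(payload)

    save_game({"level": 3, "hp": 72})
    print(load_game())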
To use special file extensions in your game, just do the following:
1. Create some files in a format of your choice that have that extension, and then
2. write some code that knows how to read that format, and point it at those files.
File extensions are conventions, nothing more; there's nothing magic about them.
ETA: As for embedding resources, there are a few different ways to approach that problem. One common technique is to keep all your resources bundled together in a small number of files - maybe only one (Guild Wars takes that approach).
At the other extreme, you can leave your resources spread across many files in a directory tree, maybe in a custom format that requires special tools to modify, and maybe not. Civilization 4 does things this way, as do all the Turbine games I'm familiar with. This is a matter of taste, and not very important either way.
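A minimal sketch of the single-bundle approach, using a zip archive as the container format; the archive and asset names are illustrative assumptions (many games use a custom archive format instead, but the principle is the same).

    # pack loose assets into one resource file, then read them back at runtime
    import zipfile

    # build step: collect assets into a single bundle
    with zipfile.ZipFile("assets.pak", "w") as pak:
        pak.write("sprites/player.png")
        pak.write("levels/level1.txt")

    # runtime: load an asset straight out of the bundle, no loose files needed
    with zipfile.ZipFile("assets.pak") as pak:
        sprite_bytes = pak.read("sprites/player.png")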
I think a better solution is to break your images into tiles of some known size and then join them back together in some random order in a new file. That random order is known only to you, so only you know how to rearrange the tiles to recover the original image.
The approach would be to maintain a one-dimensional array that holds the position of each tile. Then use the crop functions of MIDP to extract each tile and render it back to the screen.
If you need it, I can post the code for you.
I would suggest checking out the presentation from the developers of World of Goo (great game):
http://2dboy.com/public/eyawtkagibwata.pdf