Guids vs. Protocols in EDK2

I was trying to understand the different sections in the package declaration file (.dec) of an EDK2 module, but I can't figure out why some GUID definitions are under the [Guids] section while others are under the [Protocols] or [Ppis] section. Is there a reason they should not all be under the same section, especially from the perspective of the EDK2 build process?

So, this is half an answer at most, but:
A GUID, ultimately, is nothing other than a 128-bit value statistically guaranteed to be unique (if generated using the defined method).
The [Guids] section of the .dec defines GUIDs that identify generic things: data structures, variable namespaces, and so on.
The [Protocols] section defines discoverable UEFI APIs, whereas [Ppis] defines PEI (Pre-EFI Initialization) APIs.
Ultimately, this becomes relevant when processing module .inf files, which declare which [Guids], [Protocols] and [Ppis] they require to build. I.e., you could possibly get away with just declaring everything as GUIDs - but then you'd lose any sanity checking preventing you from using PPIs in DXE, or the other way around.
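As a rough illustration (the package, module, GUID names and values below are made up, not real EDK2 declarations), a .dec declares one entry per section, and a consuming module's .inf then lists what it uses under the matching section - which is where that sanity checking comes from:

    ## ExamplePkg.dec (hypothetical package declaration)
    [Guids]
      gExampleConfigDataGuid  = { 0x11111111, 0x2222, 0x3333, { 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, 0xaa, 0xbb }}

    [Protocols]
      gExampleIoProtocolGuid  = { 0xaaaaaaaa, 0xbbbb, 0xcccc, { 0xdd, 0xee, 0xff, 0x00, 0x11, 0x22, 0x33, 0x44 }}

    [Ppis]
      gExampleInitPpiGuid     = { 0x99999999, 0x8888, 0x7777, { 0x66, 0x55, 0x44, 0x33, 0x22, 0x11, 0x00, 0xff }}

    ## ExampleDxeDriver.inf (hypothetical DXE module that consumes them)
    [Guids]
      gExampleConfigDataGuid          ## CONSUMES

    [Protocols]
      gExampleIoProtocolGuid          ## PRODUCES

    ## Listing gExampleInitPpiGuid under [Ppis] here would be suspect,
    ## since a DXE driver has no business consuming a PEI-phase interface.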


Finding class and function names from an x86-64 executable

I am wondering whether it is always possible, in some way, to obtain the function and class names when reversing an application (in this case a game). I have tried for around a month to reverse a game (Assassin's Creed Unity, Anvil engine) but still have no luck getting the function names. I have found a way to obtain the class names, but no clue on function names.
So my question is: is it possible to actually obtain the function names without having the documentation, and to build a class hierarchy? (I am doing this to get better at reversing and to learn new things (x64 asm).)
Any tips and tricks related to reversing classes/structures are appreciated.
No, function and class names aren't needed for compiled code to work, and usually aren't part of an executable that's had its symbol table stripped.
The exception to that would be calls across DLL boundaries, where you might get some mangled C++ names containing function and class names, or error-check / assert messages left in the release build, in which case some names might show up in the strings.
C++ with RTTI (RunTime Type Information) might have type names somewhere, perhaps mapping vtable pointers to strings; for classes with no virtual members, probably only if typeid was ever actually used. (Or not at all if compiled with RTTI disabled - see "activate RTTI in c++".)
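A minimal C++ sketch of that situation (class names invented): built with RTTI enabled, the type_info name string survives into the binary; built with -fno-rtti (GCC/Clang) or /GR- (MSVC) it generally does not.

    #include <iostream>
    #include <typeinfo>

    struct Entity {                     // polymorphic: has a vtable, so RTTI records exist
        virtual ~Entity() = default;
    };
    struct Player : Entity {};

    int main() {
        Entity* e = new Player();
        // Prints an implementation-defined (usually mangled) name, e.g. "6Player"
        // with GCC/Clang; that string literal is stored somewhere in the executable.
        std::cout << typeid(*e).name() << '\n';
        delete e;
        return 0;
    }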
Even exception-handling I think doesn't need class names in the binary.
Other than that, there's no need for class names or function names in the compiled binary. Definitely not in the machine code itself; that's of course all pointers / relative offsets, even for classes with virtual functions (see "How do objects work in x86 at the assembly level?").
C++ does not generally support introspection, unlike Java, so there's no default need for any of the info you're looking for to be in the executable anywhere.

How is Perl useful as a metadata tool?

In The Pragmatic Programmer:
Normally, you can simply hide a third-party product behind a well-defined, abstract interface. In fact, we've always been able to do so on any project we've worked on. But suppose you couldn't isolate it that cleanly. What if you had to sprinkle certain statements liberally throughout the code? Put that requirement in metadata, and use some automatic mechanism, such as Aspects (see page 39) or Perl, to insert the necessary statements into the code itself.
Here the author is referring to Aspect Oriented Programming and Perl as tools that support "automatic mechanisms" for inserting metadata.
In my mind I envision some type of run-time injection of code. How does Perl allow for "automatic mechanisms" for inserting metadata?
Skip ahead to the section on Code Generators. The author provides a number of examples of processing input files to generate code, including this one:
Another example of melding environments using code generators happens when different programming languages are used in the same application. In order to communicate, each code base will need some information in common: data structures, message formats, and field names, for example. Rather than duplicate this information, use a code generator. Sometimes you can parse the information out of the source files of one language and use it to generate code in a second language. Often, though, it is simpler to express it in a simpler, language-neutral representation and generate the code for both languages, as shown in Figure 3.4 on the following page. Also see the answer to Exercise 13 on page 286 for an example of how to separate the parsing of the flat file representation from code generation.
The answer to Exercise 13 is a set of Perl programs used to generate C and Pascal data structures from a common input file.
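To make that concrete, here is a small illustrative sketch in the same spirit (the flat-file format and names are invented; this is not the book's actual exercise code): a Perl script reads "name : type" field definitions and emits a C struct, and a second template could just as easily emit the matching Pascal record from the same input.

    #!/usr/bin/perl
    # fields.txt -> C struct generator (illustrative only)
    use strict;
    use warnings;

    my @fields;
    while (<>) {
        chomp;
        s/^\s+|\s+$//g;                          # trim whitespace
        next if /^(#|$)/;                        # skip comments and blank lines
        my ($name, $type) = split /\s*:\s*/;     # e.g. "retries : int"
        push @fields, [ $name, $type ];
    }

    print "typedef struct {\n";
    printf "    %s %s;\n", $_->[1], $_->[0] for @fields;
    print "} Config;\n";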

How can I set the SHA digest size in java.security.MessageDigest?

I am kind of playing with the SHA-1 algorithm. I want to find out the differences and variations in the results if I change a few values in the SHA-1 algorithm, for a college report. I have found a piece of Java code to generate the hash of a text. It's done by importing the
java.security.MessageDigest
class. However, I want to change the h0-h4 values, but I don't know where to find them. I had a look inside the MessageDigest class but couldn't find them there. Please help me out!
Thanks in advance.
I don't believe you can do that. Java doesn't provide any API in its MessageDigest class that would allow you to change those values.
However, there are some workarounds (none of which I've ever tried). Take a look at this answer to the question "How to edit Java Platform Package (Built-in API) source code?"
If you're playing around with tweaks to an algorithm, you shouldn't be using a built-in class implementing that algorithm. The class you mention is designed to implement standard algorithms for people who just want to use them in production; if you're using SHA-1 (or any cryptographic algorithm) instead of playing around and tweaking it, it's never a good idea to change the algorithm yourself (e.g. by changing the initial hash value), so the class does not support modifying those constants.
Just implement the algorithm yourself; from Wikipedia's pseudocode, it doesn't look like it's all that complicated. I know that "don't implement your own crypto, use a standard and well-tested implementation" is a common mantra here, but that only applies to production-type code -- if you're playing around with an algorithm to see what effect tweaking it has, you should implement it yourself, so you have more flexibility in modifying it and seeing the effect of the modifications.
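For example, here is a minimal from-scratch SHA-1 in Java following the public pseudocode, with the five initialization words h0-h4 pulled out as fields so they can be edited for the experiment (a sketch for playing around, not production code):

    import java.nio.charset.StandardCharsets;

    public class TweakableSha1 {

        // Standard SHA-1 initialization words -- edit these to see the effect.
        static int h0 = 0x67452301, h1 = 0xEFCDAB89, h2 = 0x98BADCFE,
                   h3 = 0x10325476, h4 = 0xC3D2E1F0;

        static byte[] digest(byte[] msg) {
            int a0 = h0, b0 = h1, c0 = h2, d0 = h3, e0 = h4;

            // Padding: a 0x80 byte, zeros, then the original length in bits (big-endian).
            long bitLen = (long) msg.length * 8;
            int padded = ((msg.length + 8) / 64 + 1) * 64;
            byte[] m = new byte[padded];
            System.arraycopy(msg, 0, m, 0, msg.length);
            m[msg.length] = (byte) 0x80;
            for (int i = 0; i < 8; i++)
                m[padded - 1 - i] = (byte) (bitLen >>> (8 * i));

            int[] w = new int[80];
            for (int chunk = 0; chunk < padded; chunk += 64) {
                // Break the 512-bit chunk into 16 big-endian words, then extend to 80.
                for (int i = 0; i < 16; i++)
                    w[i] = ((m[chunk + 4 * i] & 0xff) << 24) | ((m[chunk + 4 * i + 1] & 0xff) << 16)
                         | ((m[chunk + 4 * i + 2] & 0xff) << 8) | (m[chunk + 4 * i + 3] & 0xff);
                for (int i = 16; i < 80; i++)
                    w[i] = Integer.rotateLeft(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

                int a = a0, b = b0, c = c0, d = d0, e = e0;
                for (int i = 0; i < 80; i++) {
                    int f, k;
                    if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999; }
                    else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1; }
                    else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
                    else             { f = b ^ c ^ d;                   k = 0xCA62C1D6; }
                    int t = Integer.rotateLeft(a, 5) + f + e + k + w[i];
                    e = d; d = c; c = Integer.rotateLeft(b, 30); b = a; a = t;
                }
                a0 += a; b0 += b; c0 += c; d0 += d; e0 += e;
            }

            // Produce the 20-byte big-endian digest.
            byte[] out = new byte[20];
            int[] hs = { a0, b0, c0, d0, e0 };
            for (int i = 0; i < 5; i++)
                for (int j = 0; j < 4; j++)
                    out[4 * i + j] = (byte) (hs[i] >>> (24 - 8 * j));
            return out;
        }

        public static void main(String[] args) {
            byte[] d = digest("abc".getBytes(StandardCharsets.US_ASCII));
            StringBuilder hex = new StringBuilder();
            for (byte b : d) hex.append(String.format("%02x", b));
            // With the standard h0..h4 this prints a9993e364706816aba3e25717850c26c9cd0d89d
            System.out.println(hex);
        }
    }

Changing any of the h0-h4 fields (or the round constants) and re-running main shows immediately how far the output diverges from the standard digest.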
Basically adding to #Rahil's answer but too much for comments:
Even without API access, if MessageDigest were the implementation, you could use reflection to reach into it. But it's not the implementation.
Most of the java standard library is just commonly-useful classes in the usual way, e.g. java.util.ArrayList contains the implementation of ArrayList (or ArrayList<?> since 6), java.io.FileInputStream contains the implementation of FileInputStream (although it may use other classes in that implementation), etc. Java Cryptography uses a more complicated scheme where the implementations are not in the API classes but instead in "providers" that are mostly in their own jars (in JRE/lib and JRE/lib/ext) not rt.jar and mostly(?) don't have source in src.zip.
Thus the java.security.MessageDigest class does not contain the code implementing SHA-1, SHA-256, MD5, etc. Instead it has code to search the JVM's current list of crypto providers for an implementation of whatever algorithm is asked for, and to instantiate and use that. Normally the list of providers used is the set included in the JRE distribution, although an admin or program can change it.
With the normal JRE7 providers, SHA1 is implemented by sun.security.provider.SHA.
In effect the API classes like MessageDigest, Signature, Cipher, KeyGenerator, etc. function more like interfaces or facades, presenting the behavior that is common to possibly multiple underlying implementations, although in Java code terms they are actual classes and not interfaces.
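You can see that delegation from ordinary Java code; this small check (output varies by JRE, "SUN" being typical for Oracle/OpenJDK) shows that getInstance() hands back the facade while the actual digest work comes from a registered provider:

    import java.security.MessageDigest;

    public class WhichProvider {
        public static void main(String[] args) throws Exception {
            // getInstance() searches the registered providers for one that
            // offers "SHA-1" and wraps that provider's implementation.
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            System.out.println(md.getAlgorithm());             // SHA-1
            System.out.println(md.getProvider().getName());    // typically "SUN"
            System.out.println(md.getProvider().getClass());   // the provider class, not MessageDigest itself
        }
    }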
This was designed back in 1990 or so to cope with legal restrictions on crypto in effect then, especially on export from the US. It allowed the base Java platform to be distributed easily because by itself it did no crypto. To use it -- and even if you don't do "real" crypto on user data in Java you still need things like verification of signed code -- you need to add some providers; you might have one set of providers, with complete and strong algorithms, used in US installations, and a different set, with fewer and weaker algorithms, used elsewhere. This capability is now much less needed since the US officially relaxed and in practice basically dropped enforcement about 2000, although there are periodically calls to bring it back. There is still one residual bit, however: JCE (in Oracle JREs) contains a policy that does not allow symmetric keys over 128 bits; to enable that you must download from the Oracle website and install an additional (tiny) file "JCE Unlimited Strength Policy".
TLDR: don't try to alter the JCE implementation. As #cpast says, in this case where you want to play with something different from the standard algorithm, do write your own code.

What code would you consider using a code generator like CodeSmith for?

I use CodeSmith for the PLINQO templates, to build my DAL from my DB objects; it works great!
I believe that's the primary use of code generator apps, but I'm curious... what other code would you consider using a code generator for? Do you have any CodeSmith templates that you use frequently (if so, what does it do)?
I haven't used CodeSmith, but I've done a fair bit of code generation. Notably, I wrote most of a configuration management (CM) system for a WiMAX system, where the CM code was generated for 3 different platforms. The only difference was the CM model for each platform.
The model was written in a custom Domain Specific Language (DSL) that we built a parser for. The language was a basic container/element style where containers could nest and have an identifier, and elements were of pre-defined types. Documentation was an attribute of elements and containers. You could add Lua snippets to the element and container definitions to do semantic validation (e.g., is the value in the correct range; if it's an IP address, is it in a CIDR range defined elsewhere; etc.).
The parser generated a syntax tree that we then pushed at templates. The template language was a partial C implementation of StringTemplate. We used it to generate:
A model specific C API that applications could call into to get configuration values,
The collected Lua code for validating the model and providing useful error messages,
Two "backends" for the API that would manage values in memory (for temporary manipulation of a model), and in a database system (for sharing amongst processes),
A configuration file parser and writer,
HTML documentation, and
A Command Line Interface (CLI) implementation for interactive viewing and changing of a configuration.
In retrospect, I should have simply used Lua directly as the DSL. It would have been more verbose, but having the parser already there and lots of Lua templating choices available to me would have saved a lot of development effort.
For things that have a repetitive structure and well-defined rules about what those things need to do, code generation can be a wonderful thing.

Are there guidelines for updating C++Builder applications for C++Builder 2009?

I have a range of Win32 VCL applications developed with C++Builder from BCB5 onwards, and want to port them to ECB2009 or whatever it's now called.
Some of my applications use the old TNT/TMS unicode components, so I have a good mix of AnsiStrings and WideStrings throughout the code. The new version introduces UnicodeString, and a bunch of #defines that change the way functions like c_str behave.
I want to modify my code in a way that is as backwards-compatible as possible, so that the same code base can still be compiled and run (in a non-unicode fashion) on BCB2007 if necessary.
Particular areas of concern are:
Passing strings to/from Win32 API
functions
Interop with TXMLDocument
'Raw' strings used for RS232 comms, etc.
Rather than knife-and-fork the changes, I'm looking for guidelines that I can apply to ease the migration, while keeping backwards compatibility wherever possible.
If no such guidelines already exist, maybe we can formulate some here?
The biggest issue is compatibility between C++Builder 2009 and previous versions; the Unicode differences are part of it, but the project configuration files have changed as well. From the discussions I've been following on the CodeGear forums, there are not a whole lot of choices in the matter.
I think the first place to start, if you have not done so, is the C++Builder 2009 release notes.
The biggest thing seen has been the TCHAR mapping (to wchar_t or char); using the STL string varieties may be a help, since they shouldn't be very different between the two versions. The mapping existed in C++Builder 2007 as well (with the tchar header).
For any code that does not need to be explicitly Ansi or explicitly Unicode, you should consider using the System::String, System::Char, and System::PChar typedefs as much as possible. That will help ease a lot of migration, and they work in previous versions.
When passing a System::String to an API function, you have to take into account the new "TCHAR maps to" setting in the Project options. If you try to pass AnsiString::c_str() when "TCHAR maps to" is set to "wchar_t", or UnicodeString::c_str() when "TCHAR maps to" is set to "char", you will have to perform appropriate typecasts; when the mapping matches the string type, c_str() can be passed straight through. Technically, UnicodeString::t_str() does the same thing as TCHAR does in the API, however t_str() can be very dangerous if you misuse it (when "TCHAR maps to" is set to "char", t_str() transforms the UnicodeString's internal data to Ansi).
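As a rough sketch of what that looks like in practice (function and variable names invented), here is a call that stays source-compatible between BCB2007 and C++Builder 2009 as long as "TCHAR maps to" agrees with the compiler version in use:

    #include <vcl.h>
    #include <windows.h>

    // System::String is AnsiString in BCB2007 and UnicodeString in C++Builder 2009.
    // With "TCHAR maps to" = char, SetWindowText resolves to SetWindowTextA and
    // AnsiString::c_str() matches; with "TCHAR maps to" = wchar_t it resolves to
    // SetWindowTextW and UnicodeString::c_str() matches, so no casts are needed.
    void UpdateCaption(HWND hWnd, const System::String &caption)
    {
        SetWindowText(hWnd, caption.c_str());
    }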
For "raw" strings, you can use the new RawByteString type (though I do not recommend it), or TBytes instead (which is an array of bytes - recommended). You should not be using Ansi/Wide/UnicodeString for non-character data to begin with. Most people used AnsiString as makeshift data buffers in past versions. Do not do that anymore. This is particularly important because AnsiString is now codepage-aware, and thus your data might get converted to other codepages when you least expect it.