Where does Scala store information that cannot be represented in Java? - scala

There are some constructs that don't have equivalents in java. Examples would be
named parameters
instance private members
Where/How does Scala store the information necessary for this stuff (some kind of flag in the first case, the parameter names in the second case?
If I get it right this has to get stored in the byte code, since it works even if I just have a compiled library without the source code!?

This information is captured in an annotation named ScalaSig in the class file (see this answer for an example).
You can view the (not very human-friendly) annotation with javap -verbose, or parse it using an internal API, but in general neither should be necessary.

Related

How to read scala documentation using reflection

Is there anyway we can read scala doc comments using reflection. My requirement is to read the #group tag value and use it for counting how many functions are there for each group
No, you can't use Scala reflection to access documentation comments. The reason is simple: comments are, almost by definition, not part of the program. Therefore, it is logically impossible for them to be available via reflection.
In Python, for example, documentation is available from the running program (in fact, even without using reflection), because the documentation is not hidden away in comments, but rather simply assigned to a field of the object that is being documented. Many Lisps (e.g. Clojure), and also Ioke and Seph work that way, too.
In Newspeak, what they call "comments" is available using reflection, but that's because what they call "comments" are not really comments, it is more like arbitrary metadata that can be attached to objects. It is in fact more similar to an annotation in Scala than a comment.
In Scala, documentation is written in comments, and comments are not part of the program (they are literally equivalent to whitespace in the Scala Language Specification), and therefore, cannot possibly be part of the program and thus cannot possibly be accessed via reflection.

Finding class and function names from an x86-64 executable

I am wondering about if it is always possible, in some way to obtain the function and class names when reversing an application. (in this case a game) I have tried for around 1 month to reverse a game (Assassin's Creed Unity (anvil engine)) but still no luck getting the function names. I have found a way to obtain the class names but no clue on function names.
So my question is, is it possible to actually obtain the function name out having the documentation, and create a hierarchy. (I ame doing this to get better at reversing and to learn new things (asm x64))
Any tips and tricks related to reversing classes/structers are appreciated.
No, function and class names aren't needed for compiled code to work, and usually aren't part of an executable that's had its symbol table stripped.
The exception to that would be calls across DLL boundaries where you might get some mangled C++ names containing function and class names, or if there are any error-check / assert messages in the release build then some names might show up in strings.
C++ with RTTI (RunTime Type Info) might have type names somewhere, maybe mapping vtable pointers to strings, or for classes with no virtual members probably only if typeid was ever actually used. (Or not at all if compiled with RTTI disabled. activate RTTI in c++)
Even exception-handling I think doesn't need class names in the binary.
Other than that, there's no need for class names or function names in the compiled binary. Definitely not in the machine code itself; that's of course all pointers / relative offsets, even for classes with virtual functions. How do objects work in x86 at the assembly level?.
C++ does not generally support introspection, unlike Java, so there's no default need for any of the info you're looking for to be in the executable anywhere.

Scala: Difference between file.class and file$.class from scalac

When I use scalac to compile file.scala, I end up with 2 outputs, file.class and file$.class. What is the difference between these files and which is the appropriate one to then run? I get distinctly different error messages between executing "scala file" vs "scala file$".
Scala objects get compiled to classes ending in "$" because you're allowed to have an "ordinary" class with the same name. But the object's methods are also exposed as static methods on the "ordinary" class, so that they can be called under the names you would expect. This is an artifact of trying to represent the scala semantics in a way that make sense to Java / the JVM, and I would encourage you to regard it as an implementation detail rather than something important.
(#MattPutnam's answer is correct that anonymous classes, including closures, are compiled to class files with $es in their name, but that's not what's causing your file$.class in this particular instance)
Use scala file. If you're interested in the implementation details you might also want to try java -cp /path/to/scala-library.jar file.
file$.class is some inner anonymous class. In Java they're very explicit, but they can be easy to miss in Scala. If you use any method that takes a function, there's an implicit anonymous class there. Post the code and I'll point it out.

scala: analogy to metaclasses in python?

in scala i need to implement something similar to python metaclasses. in my case the goal of using the metaclasses is usually to create a registry of all the subclasses of a particular base class - that is, a mapping from say a string representation of the class to a reference to the class. in python it's very convenient to put a metaclass on the base class so that nothing special needs to be done on every subclass. i'm looking to do something similar in scala. is there any way to emulate metaclasses, or otherwise do this a different way? thanks!
If you know the fully qualified name of the class, you can load it using the usual Java reflection methods in java.lang.Class, namely Class.forName(String fqClassName). Given the resulting instance of Class, instantiation is easy only if there's a zero-argument constructor, otherwise you get entangled in the messy world of all the Java reflection types.
If you want a kind of "discovery" where classes unknown at compile time and whose names are not supplied as an input or parameter of the program in some way, then the classloader approach is probably the only answer.
There's nothing similar to python's metaclasses. The registry you speak of might be possible using custom class loaders or reflection.

GWT Generators - Determine Whether a Class is Referenced Anywhere

I have GWT project that uses Generators to create light dynamic reflection objects.
I was wondering if anybody knows of a way to determine whether or not a particular class is referenced in the dependency tree beginning at all EntryPoints. If I could do this, I could avoid generating reflection data for classes that will never be used anyway.
My understanding is that when GWT does its compiling, it performs a similar check so that it can reduce the total size of the compiled code, but I haven't been able to find any related methods in TypeOracle or anything like that.
This is an indirect method of accomplishing what you are getting at. I believe each GWT module, is fully packaged into a regular java package. You can use
TypeOracle.findPackage(String pkgName)
to get the JPackage instance, and on that instance you use findType(String typeName) to see if a type is present in that package. If present, its likely that it is referenced in some file and GWT will compile it.
There is also this method getPackages() which returns an array of all packages known to this type oracle - therefore reachable for GWT compiler.
JPackage[] getPackages()
You can iteratively findType() on each package to find if the type is going to be compiled or not.
The BEST method is to define a custom annotation and whitelist all the classes that you do want to generate reflection code. You can annotate the required classes with it, and checking for that presence of annotation before generating code for it.
My favorite is to follow a naming convention over annotation, (I did both together), and thus maintain a whitelist, and make the convention (its usually a REGEX) a "setting" that can be changed however the team wants.