Are generics specialized during compilation, or are they just like Java generics, only for compile-time checks? (Swift)

There are three ways to implement generics:
1. Just a tool for compile-time checks, where every template instance is compiled to the same byte/assembly code implementation (Java; as noted in the comments, a "type erasure" implementation)
2. Each template instantiation is compiled to specialized code (C++, C#)
3. A combination of #1 and #2
Which one is implemented in Swift?

Swift starts by compiling a single implementation that does dynamic type checking, but the optimizer can then choose to clone off specialized implementations for particular types when the speed vs code size tradeoffs make sense. Ideally, this gets 90% of the speedup of always cloning, without the code size and compilation time exploding.
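For contrast, approach #2 can be sketched in Rust, which (like C++) always monomorphizes: the compiler emits a separate specialized copy of a generic function for every concrete type it is used with. This is an illustrative sketch, not Swift code:

```rust
// Approach #2 (C++/Rust style): each concrete instantiation of `largest`
// below is compiled to its own specialized machine code (monomorphization).
// Assumes a non-empty slice; panics on an empty one.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
    let mut max = items[0];
    for &x in &items[1..] {
        if x > max {
            max = x;
        }
    }
    max
}
```

Calling `largest(&[3, 9, 4])` and `largest(&[1.5, 0.5])` causes the compiler to emit one copy of the function specialized for i32 and another for f64. Swift's hybrid model instead starts from one dynamically-checked implementation and lets the optimizer decide when such cloning is worth the code size.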

Related

Idiomatic Rust plugin system

I want to outsource some code for a plugin system. Inside my project, I have a trait called Provider which is the code for my plugin system. If you activate the feature "consumer" you can use plugins; if you don't, you are an author of plugins.
I want plugin authors to get their code into my program by compiling it to a shared library. Is a shared library a good design decision? The plugins are limited to Rust anyway.
Does the plugin host have to go the C way of loading a shared library: loading an unmangled function?
I just want authors to use the trait Provider for implementing their plugins and that's it.
After taking a look at sharedlib and libloading, it seems impossible to load plugins in an idiomatic Rust way.
I'd just like to load trait objects into my ProviderLoader:
// lib.rs
pub struct Sample { /* ... */ }

pub trait Provider {
    fn get_sample(&self) -> Sample;
}

pub struct ProviderLoader {
    plugins: Vec<Box<dyn Provider>>,
}
When the program is shipped, the file tree would look like:
.
├── fancy_program.exe
└── providers
├── fp_awesomedude.dll
└── fp_niceplugin.dll
Is that possible if plugins are compiled to shared libs? This would also affect the decision of the plugins' crate-type.
Do you have other ideas? Maybe I'm on the wrong path so that shared libs aren't the holy grail.
I first posted this on the Rust forum. A friend advised me to give it a try on Stack Overflow.
UPDATE 3/27/2018:
After using plugins this way for some time, I have to caution that in my experience things do get out of sync, and it can be very frustrating to debug (strange segfaults, weird OS errors). Even in cases where my team independently verified the dependencies were in sync, passing non-primitive structs between the dynamic library binaries tended to fail on OS X for some reason. I'd like to revisit this, find what cases it happens in, and perhaps open an issue with Rust, but I'm going to advise caution with this going forward.
LLDB and valgrind are near-essential to debug these issues.
Intro
I've been investigating things along these lines myself, and I've found there's little official documentation for this, so I decided to play around!
First let me note, as there is little official word on these properties please do not rely on any code here if you're trying to keep planes in the air or nuclear missiles from errantly launching, at least not without doing far more comprehensive testing than I've done. I'm not responsible if the code here deletes your OS and emails an erroneous tearful confession of committing the Zodiac killings to your local police; we're on the fringes of Rust here and things could change from one release or toolchain to another.
I have personally tested this on Rust 1.20 stable in both debug and release configurations on Windows 10 (stable-x86_64-pc-windows-msvc) and CentOS 7 (stable-x86_64-unknown-linux-gnu).
Approach
The approach I took was a common crate that both the host and plugin crates list as a dependency, defining the shared struct and trait definitions. At first, I was also going to test having a struct with the same structure, or a trait with the same definitions, defined independently in both libraries, but I opted against it because it's too fragile and you wouldn't want to do it in a real design. That said, if anybody wants to test this, feel free to do a PR on the repository above and I will update this answer.
In addition, the Rust plugin was declared with crate-type dylib. I'm not sure how compiling as cdylib would interact, since I think it would mean that upon loading the plugin there are two copies of the Rust standard library hanging around (I believe cdylib statically links the Rust stdlib into the shared object).
Tests
General Notes
The structs I tested were not declared #[repr(C)]. This could provide an extra layer of safety by guaranteeing a layout, but I was most curious about writing "pure" Rust plugins with as little "treating Rust like C" fiddling as possible. We already know you can use Rust via FFI by wrapping things in opaque pointers, manually dropping, and such, so it's not very enlightening to test this.
The function signature I used was pub fn foo(args) -> output with the #[no_mangle] attribute. It turns out that rustfmt automatically changes extern "Rust" fn to simply fn. I'm not sure I agree with this in this case, since they are most certainly "extern" functions here, but I will choose to abide by rustfmt.
Remember that even though this is Rust, this has elements of unsafety, because libloading (or the unstable DynamicLib functionality) will not type check the symbols for you. At first I thought my Vec test was proving you couldn't pass Vecs between host and plugin, until I realized that on one end I had Vec<i32> and on the other I had Vec<usize>.
Interestingly, there were a few times I pointed an optimized test build at an unoptimized plugin and vice versa and it still worked. However, I still can't in good faith recommend building plugins and host applications with different toolchains, and even if you do, I can't promise that rustc/LLVM won't decide to apply certain optimizations to one version of a struct and not another. In addition, I'm not sure whether passing types through FFI prevents certain optimizations, such as the null pointer optimization, from occurring.
You're still limited to calling bare functions, no Foo::bar because of the lack of name mangling. In addition, due to the fact that functions with trait bounds are monomorphized, generic functions and structs are also out. The compiler can't know you're going to call foo<i32> so no foo<i32> is going to be generated. Any functions over the plugin boundary must take only concrete types and return only concrete types.
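As a sketch of that restriction (function names here are invented, not from the original tests): generic code can still live inside the plugin, but every exported symbol must be a concrete instantiation, because monomorphization happens when the plugin is compiled, not when the host calls it.

```rust
// Generic code can exist inside the plugin crate, but it is monomorphized
// at the plugin's compile time, so only concrete instantiations can be
// exported as loadable symbols.
fn sum_generic<T: std::iter::Sum<T>>(values: Vec<T>) -> T {
    values.into_iter().sum()
}

// Hypothetical exported symbols: one per concrete type the host needs.
#[no_mangle]
pub extern "Rust" fn sum_i32(values: Vec<i32>) -> i32 {
    sum_generic(values)
}

#[no_mangle]
pub extern "Rust" fn sum_f64(values: Vec<f64>) -> f64 {
    sum_generic(values)
}
```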
Similarly, you have to be careful with lifetimes for similar reasons: since there's no static lifetime checking across the boundary, Rust is forced to believe you when you say a function returns &'a when it really returns &'b.
Native Rust
The first tests I performed used no custom structures: just pure, native Rust types. This would give a baseline for whether this is even possible. I chose three baseline types: &mut i32, &mut Vec, and Option<i32> -> Option<i32>. These were all chosen for very specific reasons: the &mut i32 because it tests a reference, the &mut Vec because it tests growing heap memory allocated in the host application, and the Option as a dual-purpose test of passing by move and matching a simple enum.
All three work as expected. Mutating the reference mutates the value, pushing to a Vec works properly, and the Option works properly whether Some or None.
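A hedged sketch of what those three baseline signatures might look like on the plugin side (function names are mine, not from the original tests):

```rust
// The three baseline shapes tested: a mutable reference, a Vec allocated
// by the host and grown by the plugin, and an Option passed by move.
#[no_mangle]
pub extern "Rust" fn bump(x: &mut i32) {
    *x += 1;
}

#[no_mangle]
pub extern "Rust" fn push_one(v: &mut Vec<i32>) {
    // May reallocate: the plugin grows memory the host allocated.
    v.push(1);
}

#[no_mangle]
pub extern "Rust" fn increment_opt(o: Option<i32>) -> Option<i32> {
    o.map(|x| x + 1)
}
```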
Shared Struct Definition
This was meant to test whether you could pass a non-builtin struct, with a common definition on both sides, between plugin and host. This works as expected, but as mentioned in the "General Notes" section, I can't promise Rust won't optimize a structure's layout on one side and not the other. Always test your specific use case, and use CI in case it changes.
Boxed Trait Object
This test uses a struct defined only on the plugin side, which implements a trait defined in a common crate; the plugin returns it as a Box<dyn Trait>. This works as expected: calling trait_obj.fun() works properly.
At first I actually anticipated there would be issues with dropping without making the trait explicitly have Drop as a bound, but it turns out Drop is properly called as well (this was verified by setting the value of a variable declared on the test stack via raw pointer from the struct's drop function). (Naturally I'm aware drop is always called even with trait objects in Rust, but I wasn't sure if dynamic libraries would complicate it).
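A minimal sketch of this shape (type and function names are mine, echoing the question's Provider trait; in the real setup the trait lives in the shared crate, not the plugin):

```rust
// The trait would live in the common crate shared by host and plugin.
pub trait Provider {
    fn get_sample(&self) -> i32;
}

// The concrete struct is known only to the plugin side.
struct AwesomeProvider;

impl Provider for AwesomeProvider {
    fn get_sample(&self) -> i32 {
        42
    }
}

impl Drop for AwesomeProvider {
    fn drop(&mut self) {
        // As observed in the tests above, Drop runs correctly even though
        // the concrete type is unknown to the host.
        println!("plugin value dropped");
    }
}

// Hypothetical exported constructor symbol the host would load.
#[no_mangle]
pub extern "Rust" fn create_provider() -> Box<dyn Provider> {
    Box::new(AwesomeProvider)
}
```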
NOTE:
I did not test what would happen if you load a plugin, create a trait object, then drop the plugin (which would likely close it). I can only assume this is potentially catastrophic. I recommend keeping the plugin open as long as the trait object persists.
Remarks
Plugins work much as you'd expect from linking a crate directly, albeit with some restrictions and pitfalls. As long as you test, I think this is a very natural way to go. It makes symbol loading more bearable, for instance, if you only need to load a single function and then receive a trait object implementing an interface. It also avoids nasty C memory leaks from forgetting (or being unable) to load a drop/free function. That said, be careful, and always test!
There is no official plugin system, and you cannot do plugins loaded at runtime in pure Rust. I saw some discussions about doing a native plugin system, but nothing is decided for now, and maybe there will never be any such thing. You can use one of these solutions:
You can extend your code with native dynamic libraries using FFI. To use the C ABI, you have to use repr(C), the no_mangle attribute, extern, etc. You will find more information by searching for Rust FFI on the internets. With this solution, you must use raw pointers: they come with no safety guarantees (i.e. you must use unsafe code).
Of course, you can write your dynamic library in Rust, but to load it and call its functions, you must go through the C ABI. This means that the safety guarantees of Rust do not apply there. Furthermore, you cannot use Rust's higher-level features, such as traits and enums, across the boundary between the library and the binary.
If you do not want this complexity, you can use a language designed to extend Rust, with which you can dynamically add functions to your code and execute them with the same guarantees as in Rust. This is, in my opinion, the easier way to go: if you have the choice, and if execution speed is not critical, use this to avoid tricky C/Rust interfaces.
Here is a (not exhaustive) list of languages that can easily extend Rust:
Gluon, a functional language like Haskell
Dyon, a small but powerful scripting language intended for video games
Lua with rlua or hlua
You can also use Python or JavaScript, or see the list in awesome-rust.

Is it possible/useful to transpile Scala to golang?

Scala Native has been released recently, but the garbage collector it uses (for now) is extremely rudimentary and makes it unsuitable for serious use.
So I wonder: why not just transpile Scala to Go (a la Scala.js)? It's going to be a fast, portable runtime. And their GC is getting better and better. Not to mention the inheritance of a great concurrency model: channels and goroutines.
So why did scala-native choose to go so low level with LLVM?
What would be the catch with a golang transpiler?
There are two kinds of languages that are good targets for compilers:
1. Languages whose semantics closely match the source language's semantics.
2. Languages which have very low-level and thus very general semantics (or, one might argue, no semantics at all).
Examples for #1 include: compiling ECMAScript 2015 to ECMAScript 5 (most language additions were specifically designed as syntactic sugar for existing features; you just have to desugar them), compiling CoffeeScript to ECMAScript, compiling TypeScript to ECMAScript (basically, after type checking, just erase the types and you are done), compiling Java to JVM bytecode, compiling C♯ to CIL bytecode, compiling Python to CPython bytecode, compiling Python to PyPy bytecode, compiling Ruby to YARV bytecode, compiling Ruby to Rubinius bytecode, compiling ECMAScript to SpiderMonkey bytecode.
Examples for #2 include: machine code for a general purpose CPU (RISC even more so), C--, LLVM.
Compiling Scala to Go fits neither of the two. Their semantics are very different.
You need either a language with powerful low-level semantics as the target language, so that you can build your own semantics on top, or you need a language with closely matching semantics, so that you can map your own semantics into the target language.
In fact, even JVM bytecode is already too high-level! It has constructs such as classes that do not match constructs such as Scala's traits, so there has to be a fairly complex encoding of traits into classes and interfaces. Likewise, before invokedynamic, it was actually pretty much impossible to represent dynamic dispatch on structural types in JVM bytecode. The Scala compiler had to resort to reflection, or in other words, deliberately stepping outside of the semantics of JVM bytecode (which resulted in a terrible performance overhead for method dispatch on structural types compared to method dispatch on other class types, even though both are the exact same thing).
Proper Tail Calls are another example: we would like to have them in Scala, but because JVM bytecode is not powerful enough to express them without a very complex mapping (basically, you have to forego using the JVM's call stack altogether and manage your own stack, which destroys both performance and Java interoperability), it was decided to not have them in the language.
Go has some of the same problems: in order to implement Scala's expressive non-local control-flow constructs such as exceptions or threads, we need an equally expressive non-local control-flow construct to map to. For typical target languages, this "expressive non-local control-flow construct" is either continuations or the venerable GOTO. Go has GOTO, but it is deliberately limited in its "non-localness". For writing code by humans, limiting the expressive power of GOTO is a good thing, but for a compiler target language, not so much.
It is very likely possible to rig up powerful control-flow using goroutines and channels, but now we are already leaving the comfortable confines of just mapping Scala semantics to Go semantics, and start building Scala high-level semantics on top of Go high-level semantics that weren't designed for such usage. Goroutines weren't designed as a general control-flow construct to build other kinds of control-flow on top of. That's not what they're good at!
So why did scala-native choose to go so low level with LLVM?
Because that's precisely what LLVM was designed for and is good at.
What would be the catch with a golang transpiler?
The semantics of the two languages are too different for a direct mapping and Go's semantics are not designed for building different language semantics on top of.
their GC is getting better and better
So can Scala Native's. As far as I understand, the current choice of the Boehm-Demers-Weiser collector is basically one of laziness: it's there, it works, you can drop it into your code, and it'll just do its thing.
Note that changing the GC is under discussion. There are other GCs which are designed as drop-ins rather than being tightly coupled to the host VM's object layout. E.g. IBM is currently in the process of re-structuring J9, their high-performance JVM, into a set of loosely coupled, independently reusable "runtime building block" components, released under a permissive open source license.
The project is called "Eclipse OMR" (source on GitHub) and it is already production-ready: the Java 8 implementation of IBM J9 was built completely out of OMR components. There is a Ruby + OMR project which demonstrates how the components can easily be integrated into an existing language runtime, because the components themselves assume no language semantics and no specific memory or object layout. The commit which swaps out the GC and adds a JIT and a profiler clocks in at just over 10000 lines. It isn't production-ready, but it boots and runs Rails. They also have a similar project for CPython (not public yet).
why not just transpile Scala to Go (a la Scala.js)?
Note that Scala.js has a lot of the same problems I mentioned above. But they are doing it anyway, because the gain is huge: you get access to every web browser on the planet. There is no comparable gain for a hypothetical Scala.go.
There's a reason why there are initiatives for getting low-level semantics into the browser such as asm.js and WebAssembly, precisely because compiling a high-level language to another high-level language always has this "semantic gap" you need to overcome.
In fact, note that even for lowish-level languages that were specifically designed as compilation targets for a specific language, you can still run into trouble. E.g. Java has generics, JVM bytecode doesn't. Java has inner classes, JVM bytecode doesn't. Java has anonymous classes, JVM bytecode doesn't. All of these have to be encoded somehow, and specifically the encoding (or rather non-encoding) of generics has caused all sorts of pain.

What is the difference between Clojure REPL and Scala REPL?

I've been working with the Scala language for a few months and have already created a couple of projects in Scala. I've found the Scala REPL (at least its IntelliJ worksheet implementation) quite convenient for quick development: I can write code, see what it does, and it's nice. But I only do this for individual functions (not a whole program). I can't start my application and change it on the spot. Or at least I don't know how (so if you do, you're welcome to give me a piece of advice).
Several days ago my associate told me about the Clojure REPL. He uses Emacs for development and can change code on the spot and see the results without restarting. For example, he starts the process, and if he changes the implementation of a function, the running program picks up the new behavior without a restart. I would like to have the same thing with Scala.
P.S. I want to discuss neither which language is better nor whether functional programming is better than object-oriented programming. I want to find a good solution. If Clojure is the better language for the task, then so be it.
The short answer is that Clojure was designed to use a very simple, single-pass compiler which reads and compiles a single s-expression or form at a time. For better or worse, there is no global type information, no global type inference, and no global analysis or optimization. Clojure uses clojure.lang.Var instances to create global bindings through a series of hashmaps from textual symbols to transactional values. def forms all create bindings at global scope in this global binding map. So where in Scala a "function" (method) will be resolved to an instance or static method on a given JVM class, in Clojure a "function" (def) is really just a reference to an entry in the table of var bindings. When a function is invoked, there isn't a static link to another class; instead the var is referenced by symbolic name, then dereferenced to get an instance of a clojure.lang.IFn object, which is then invoked.
This layer of indirection means that it is possible to re-evaluate only a single definition at a time, and that the re-evaluation becomes globally visible to all clients of the re-defined var.
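That indirection can be sketched as a mutable name-to-function table that every call dereferences at invocation time. This is a toy model (in Rust, with invented names), not Clojure's actual implementation:

```rust
use std::collections::HashMap;

// A crude model of Clojure's var table: symbol name -> current definition.
type Var = Box<dyn Fn(i64) -> i64>;

// Every "call" looks the var up by name at invocation time, then invokes it.
fn call(vars: &HashMap<&str, Var>, name: &str, arg: i64) -> i64 {
    vars.get(name).expect("undefined var")(arg)
}

fn demo() -> (i64, i64) {
    let mut vars: HashMap<&str, Var> = HashMap::new();
    vars.insert("double", Box::new(|x| x * 2));
    let before = call(&vars, "double", 21);

    // "Re-defining" the var: nothing that references it is recompiled;
    // every caller simply sees the new definition on the next lookup.
    vars.insert("double", Box::new(|x| x * 3));
    let after = call(&vars, "double", 21);
    (before, after)
}
```

The per-call lookup is exactly the ongoing cost the answer mentions: flexibility at the price of an extra dereference on every invocation.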
In comparison, when a definition in Scala changes, scalac must reload the changed file, macroexpand, type infer, type check, and compile it. Then, due to the semantics of classloading on the JVM, scalac must also reload all classes which depend on methods of the class that changed. Also, all existing values which are instances of the changed class become garbage.
Both approaches have their strengths and weaknesses. Obviously Clojure's approach is simpler to implement; however, it pays an ongoing performance cost due to continual function lookup operations, to say nothing of correctness concerns from the lack of static types and so on. This is arguably suitable for contexts in which lots of change is happening in a short timeframe (interactive development), but less suitable for contexts where code is mostly static (deployment; hence Oxcart). Some work I did suggests that the slowdown of Clojure programs from the lack of static method linking is on the order of 16-25%. This is not to call Clojure slow or Scala fast; they just have different priorities.
Scala chooses to do more work up front so that the compiled application will perform better which is arguably more suitable for application deployment when little or no reloading will take place, but proves a drag when you want to make lots of small changes.
Some material I have on hand about compiling Clojure code, in more or less chronological publication order, since Nicholas influenced my GSoC work a lot:
Clojure Compilation [Nicholas]
Clojure Compilation: Full Disclojure [Nicholas]
Why is Clojure bootstrapping so slow? [Nicholas]
Oxcart and Clojure [me]
Of Oxen, Carts and Ordering [me]
Which I suppose leaves me in the unhappy place of saying simply "I'm sorry, Scala wasn't designed for that the way Clojure was" with regards to code hot swapping.

Side effects in Scala

I am learning Scala right in these days. I have a slight familiarity with Haskell, although I cannot claim to know it well.
Parenthetical remark for those who are not familiar with Haskell
One trait that I like in Haskell is that not only functions but also side effects (let me call them actions) are first-class citizens. An action that, when executed, will endow you with a value of type a belongs to the specific type IO a. You can pass these actions around pretty much like any other value and combine them in interesting ways.
In fact, combining side effects is the only thing you can do with them in Haskell, as you cannot execute them yourself. Rather, the program that gets executed is the combined action returned by your main function. This is a neat trick that allows functions to be pure while letting your program actually do something other than consume power.
The main advantage of this approach is that the compiler is aware of the parts of the code where you perform side effects, so it can help you catch errors with them.
Actual question
Is there some way in Scala to have the compiler type check side effects for you, so that - for instance - you are guaranteed not to execute side effects inside a certain function?
No, this is not possible in principle in Scala, as the language does not enforce referential transparency -- the language semantics are oblivious to side effects. Your compiler will not track and enforce freedom from side effects for you.
You will be able to use the type system to tag some actions as being of IO type however, and with programmer discipline, get some of the compiler support, but without the compiler proof.
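The "tag actions with an IO type" discipline is easy to sketch. Here is a toy deferred-computation type (written in Rust for a concrete, runnable example; this is an illustration, not Scala's or any effect library's actual API): an action is just a value wrapping a computation, composable without being run, and by convention only `run` performs it.

```rust
// A toy IO type: a deferred computation you can pass around and compose.
// Nothing executes until `run` is called, so by programmer discipline,
// functions that neither take nor return Io values perform no effects.
struct Io<A: 'static>(Box<dyn FnOnce() -> A>);

impl<A: 'static> Io<A> {
    fn pure(a: A) -> Io<A> {
        Io(Box::new(move || a))
    }
    fn map<B: 'static>(self, f: impl FnOnce(A) -> B + 'static) -> Io<B> {
        Io(Box::new(move || f((self.0)())))
    }
    fn and_then<B: 'static>(self, f: impl FnOnce(A) -> Io<B> + 'static) -> Io<B> {
        Io(Box::new(move || (f((self.0)()).0)()))
    }
    // The one place effects actually happen, analogous to unsafePerformIO
    // at the edge of the program.
    fn run(self) -> A {
        (self.0)()
    }
}
```

Nothing stops you from calling `run` anywhere, which is exactly the answer's point: you get the tagging convention, but not a compiler-enforced proof.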
The ability to enforce referential transparency is pretty much incompatible with Scala's goal of having a class/object system that is interoperable with Java.
Java code can be impure in arbitrary ways (and may not be available for analysis when the Scala compiler runs) so the Scala compiler would have to assume all foreign code is impure (assigning them an IO type). To implement pure Scala code with calls to Java, you would have to wrap the calls in something equivalent to unsafePerformIO. This adds boilerplate and makes the interoperability much less pleasant, but it gets worse.
Having to assume that all Java code is in IO unless the programmer promises otherwise would pretty much kill inheriting from Java classes. All the inherited methods would have to be assumed to be in the IO type; this would even be true of interfaces, since the Scala compiler would have to assume the existence of an impure implementation somewhere out there in Java-land. So you could never derive a Scala class with any non-IO methods from a Java class or interface.
Even worse, even for classes defined in Scala, there could theoretically be an untracked subclass defined in Java with impure methods, whose instances might be passed back in to Scala as instances of the parent class. So unless the Scala compiler can prove that a given object could not possibly be an instance of a class defined by Java code, it must assume that any method call on that object might call code that was compiled by the Java compiler without respecting the laws of what functions returning results not in IO can do. This would force almost everything to be in IO. But putting everything in IO is exactly equivalent to putting nothing in IO and just not tracking side effects!
So ultimately, Scala encourages you to write pure code, but it makes no attempt to enforce that you do so. As far as the compiler is concerned, any call to anything can have side effects.

For Scala are there any advantages to type erasure?

I've been hearing a lot about different JVM languages, still in vaporware mode, that propose to implement reification somehow. I have this nagging half-remembered (or wholly imagined, don't know which) thought that somewhere I read that Scala somehow took advantage of the JVM's type erasure to do things that it wouldn't be able to do with reification. Which doesn't really make sense to me since Scala is implemented on the CLR as well as on the JVM, so if reification caused some kind of limitation it would show up in the CLR implementation (unless Scala on the CLR is just ignoring reification).
So, is there a good side to type erasure for Scala, or is reification an unmitigated good thing?
See Ola Bini's blog. As we all know, Java has use-site covariance, implemented by having little question marks wherever you think variance is appropriate. Scala has definition-site covariance, implemented by the class designer. He says:
Generics is a complicated language feature. It becomes even more complicated when added to an existing language that already has subtyping. These two features don't play very well together in the general case, and great care has to be taken when adding them to a language. Adding them to a virtual machine is simple if that machine only has to serve one language - and that language uses the same generics. But generics isn't done. It isn't completely understood how to handle correctly and new breakthroughs are happening (Scala is a good example of this). At this point, generics can't be considered "done right". There isn't only one type of generics - they vary in implementation strategies, feature and corner cases.
...
What this all means is that if you want to add reified generics to the JVM, you should be very certain that that implementation can encompass both all static languages that want to do innovation in their own version of generics, and all dynamic languages that want to create a good implementation and a nice interfacing facility with Java libraries. Because if you add reified generics that doesn't fulfill these criteria, you will stifle innovation and make it that much harder to use the JVM as a multi language VM.
i.e. If we had reified generics in the JVM, most likely those reified generics wouldn't be suitable for the features we really like about Scala, and we'd be stuck with something suboptimal.