How to call Rust functions in Flutter (Dart) via FFI, but with convenience and safety? - flutter

I know we can call Rust from Flutter/Dart via FFI. But Flutter only allows the C ABI when doing FFI, so I have to write boilerplate code by hand. In particular, I have to write unsafe Rust, since I have to deal with lots of raw pointers :(
Are there any approaches that do this in a safe way? Rust itself is very safe (thanks to its ownership-based memory management), and Dart/Flutter is also very safe (thanks to GC). I do not want the FFI call to be the Achilles heel that destroys the safety of my app!

There are several ways to do it.
a. JSON/Protobuf-based Approach
The first way, which I have used in production for a year, is to use JSON or Protobuf to pass all data between Rust and Dart/Flutter. This way you do not need to write tons of boilerplate code to allocate/free a String, a list of bytes, a struct/class, etc. All you need is one single function that accepts a byte array payload and returns a byte array result. By "one" function I mean that you can put an action field in your JSON/Protobuf, so calls to different Rust functions are multiplexed over this one thin interface.
Despite its convenience (only a bit of unsafe boilerplate), the drawback is also evident. Serialization and deserialization do not come for free: you pay CPU time and memory for them, sometimes a lot. Moreover, you cannot easily pass around big objects. For example, if you have an image (at least megabytes in size), serializing it to Protobuf and then deserializing it again wastes both CPU and memory on useless copies! Even worse, since Flutter/Dart FFI has no convenient support for async FFI, you have to run the call in a separate worker isolate, which means one more memory copy. You can see more here: https://github.com/dart-lang/language/issues/1862 (this is an issue that I opened).
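A minimal sketch of what the single-entry-point idea could look like on the Rust side. To stay dependency-free it assumes a one-byte action tag at the start of the payload instead of a real JSON/Protobuf message; the names handle_request, free_buffer, ByteBuffer and the two actions are hypothetical, not part of any library.

```rust
use std::ptr;
use std::slice;

// Hypothetical action tags; a real app would carry the action inside JSON/Protobuf.
const ACTION_ECHO: u8 = 0;
const ACTION_SUM: u8 = 1;

/// Safe core: all real work happens here, without raw pointers.
pub fn dispatch(payload: &[u8]) -> Vec<u8> {
    match payload.split_first() {
        Some((&tag, body)) => match tag {
            ACTION_ECHO => body.to_vec(),
            ACTION_SUM => {
                let total: u32 = body.iter().map(|&b| u32::from(b)).sum();
                total.to_le_bytes().to_vec()
            }
            _ => Vec::new(), // unknown action: empty reply
        },
        None => Vec::new(),
    }
}

/// Result buffer handed back to the caller, who must pass it to `free_buffer`.
#[repr(C)]
pub struct ByteBuffer {
    pub ptr: *mut u8,
    pub len: usize,
}

// The only unsafe boilerplate: one entry point and one matching free function.
// To export them from a shared library, add #[no_mangle]
// (spelled #[unsafe(no_mangle)] on the 2024 edition).
pub unsafe extern "C" fn handle_request(ptr: *const u8, len: usize) -> ByteBuffer {
    let payload: &[u8] = if ptr.is_null() {
        &[]
    } else {
        unsafe { slice::from_raw_parts(ptr, len) }
    };
    let boxed: Box<[u8]> = dispatch(payload).into_boxed_slice();
    let len = boxed.len();
    ByteBuffer { ptr: Box::into_raw(boxed) as *mut u8, len }
}

pub unsafe extern "C" fn free_buffer(buf: ByteBuffer) {
    if !buf.ptr.is_null() {
        // Rebuild the boxed slice so Rust frees it with the right allocator.
        drop(unsafe { Box::from_raw(ptr::slice_from_raw_parts_mut(buf.ptr, buf.len)) });
    }
}
```

The Dart side would then only ever need to bind these two symbols, whatever the number of logical Rust functions behind them.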
b. Code generator
The second way, which I have used more recently, is to write a code generator. The boilerplate follows several common patterns, such as "allocate - fill data - call FFI - free", so it is not that hard to write a generator that does this automatically. The idea is to mimic what a human does when writing the boilerplate by hand.
I did hope that some existing code generator could be used directly, but it seemed that none existed... So, go and write one yourself.
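For illustration, here is roughly the kind of Rust-side wrapper such a generator might emit for one String-taking, String-returning function. This is a hand-written sketch, not output from any real generator; greet, wire_greet and wire_free_string are hypothetical names.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

/// The safe function the user actually wrote.
fn greet(name: &str) -> String {
    format!("Hello, {}!", name)
}

/// Wrapper: decode the C string, call the safe function, allocate the result.
/// To export it, add #[no_mangle] (#[unsafe(no_mangle)] on the 2024 edition).
pub unsafe extern "C" fn wire_greet(name: *const c_char) -> *mut c_char {
    if name.is_null() {
        return ptr::null_mut();
    }
    let name = unsafe { CStr::from_ptr(name) }.to_string_lossy();
    match CString::new(greet(&name)) {
        Ok(reply) => reply.into_raw(), // ownership moves to the caller
        Err(_) => ptr::null_mut(),     // result contained an interior NUL byte
    }
}

/// Matching "free" step: the caller hands the string back to Rust to drop it.
pub unsafe extern "C" fn wire_free_string(ptr: *mut c_char) {
    if !ptr.is_null() {
        drop(unsafe { CString::from_raw(ptr) });
    }
}
```

The caller allocates and fills the input, calls wire_greet, copies the result out, and then calls wire_free_string, which is exactly the "allocate - fill data - call FFI - free" sequence.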
c. Use existing open-source code generator
After writing the code generator, I guessed other people might have the same problem, so I open-sourced it: https://github.com/fzyzcjy/flutter_rust_bridge
My code generator not only solves the problem above, but also has rich type support and allows zero-copy transfers, async programming, and direct calls from the main isolate, all of which a code generator can implement but which would require lots of boilerplate code if done by hand.
Disclaimer: This is a Q&A-style answer showing my thoughts and what I have done about this problem, which is critical to my own app in production. I used the JSON approach starting last year and later refactored to the code generator approach. I hope it also helps other people who face the same situation!

Related

Idiomatic Rust plugin system

I want to outsource some code for a plugin system. Inside my project, I have a trait called Provider which is the code for my plugin system. If you activate the feature "consumer" you can use plugins; if you don't, you are an author of plugins.
I want authors of plugins to get their code into my program by compiling to a shared library. Is a shared library a good design decision? The limitation of the plugins is using Rust anyway.
Does the plugin host have to go the C way for loading the shared library: loading an unmangled function?
I just want authors to use the trait Provider for implementing their plugins and that's it.
After taking a look at sharedlib and libloading, it seems impossible to load plugins in an idiomatic Rust way.
I'd just like to load trait objects into my ProviderLoader:
// lib.rs
pub struct Sample { ... }
pub trait Provider {
    fn get_sample(&self) -> Sample;
}
pub struct ProviderLoader {
    // `dyn` is the current spelling of a trait object type
    plugins: Vec<Box<dyn Provider>>
}
When the program is shipped, the file tree would look like:
.
├── fancy_program.exe
└── providers
    ├── fp_awesomedude.dll
    └── fp_niceplugin.dll
Is that possible if plugins are compiled to shared libs? This would also affect the decision of the plugins' crate-type.
Do you have other ideas? Maybe I'm on the wrong path so that shared libs aren't the holy grail.
I first posted this on the Rust forum. A friend advised me to give it a try on Stack Overflow.
UPDATE 3/27/2018:
After using plugins this way for some time, I have to caution that in my experience things do get out of sync, and it can be very frustrating to debug (strange segfaults, weird OS errors). Even in cases where my team independently verified the dependencies were in sync, passing non-primitive structs between the dynamic library binaries tended to fail on OS X for some reason. I'd like to revisit this, find what cases it happens in, and perhaps open an issue with Rust, but I'm going to advise caution with this going forward.
LLDB and valgrind are near-essential to debug these issues.
Intro
I've been investigating things along these lines myself, and I've found there's little official documentation for this, so I decided to play around!
First let me note, as there is little official word on these properties please do not rely on any code here if you're trying to keep planes in the air or nuclear missiles from errantly launching, at least not without doing far more comprehensive testing than I've done. I'm not responsible if the code here deletes your OS and emails an erroneous tearful confession of committing the Zodiac killings to your local police; we're on the fringes of Rust here and things could change from one release or toolchain to another.
I have personally tested this on Rust 1.20 stable in both debug and release configurations on Windows 10 (stable-x86_64-pc-windows-msvc) and Cent OS 7 (stable-x86_64-unknown-linux-gnu).
Approach
The approach I took was to have a shared common crate, listed as a dependency by both the host and the plugin crate, that defines the common structs and traits. At first, I was also going to test having a struct with the same structure, or a trait with the same definitions, defined independently in both libraries, but I opted against it because it's too fragile and you wouldn't want to do it in a real design. That said, if anybody wants to test this, feel free to do a PR on the repository above and I will update this answer.
In addition, the Rust plugin was declared dylib. I'm not sure how compiling as cdylib would interact, since I think it would mean that upon loading the plugin there are two versions of the Rust standard library hanging around (since I believe cdylib statically links the Rust stdlib into the shared object).
Tests
General Notes
The structs I tested were not declared #[repr(C)]. This could provide an extra layer of safety by guaranteeing a layout, but I was most curious about writing "pure" Rust plugins with as little "treating Rust like C" fiddling as possible. We already know you can use Rust via FFI by wrapping things in opaque pointers, manually dropping, and such, so it's not very enlightening to test this.
The function signature I used was pub fn foo(args) -> output with the #[no_mangle] directive. It turns out that rustfmt automatically changes extern "Rust" fn to simply fn. I'm not sure I agree with this in this case since they are most certainly "extern" functions here, but I will choose to abide by rustfmt.
Remember that even though this is Rust, this has elements of unsafety because libloading (or the unstable DynamicLib functionality) will not type check the symbols for you. At first I thought my Vec test was proving you couldn't pass Vecs between host and plugin until I realized on one end I had Vec<i32> and on the other I had Vec<usize>.
Interestingly, there were a few times I pointed an optimized test build to an unoptimized plugin and vice versa and it still worked. However, I still can't in good faith recommend building plugins and host applications with different toolchains, and even if you use the same one, I can't promise that for some reason rustc/llvm won't decide to do certain optimizations on one version of a struct and not another. In addition, I'm not sure if this means that passing types through FFI prevents certain optimizations such as Null Pointer Optimizations from occurring.
You're still limited to calling bare functions, no Foo::bar because of the lack of name mangling. In addition, due to the fact that functions with trait bounds are monomorphized, generic functions and structs are also out. The compiler can't know you're going to call foo<i32> so no foo<i32> is going to be generated. Any functions over the plugin boundary must take only concrete types and return only concrete types.
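A small sketch of the usual workaround for the monomorphization limit: keep the generic function internal and expose one concrete wrapper per type you need across the boundary. The names largest and largest_i32 are hypothetical.

```rust
/// Generic helper: only instantiated for the types used inside this crate,
/// so a host cannot ask a compiled plugin for an arbitrary `largest::<T>`.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}

/// Concrete wrapper: takes and returns only concrete types, so code for it is
/// always generated. In a plugin it would also carry #[no_mangle].
pub fn largest_i32(items: &[i32]) -> Option<i32> {
    largest(items)
}
```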
Similarly, you have to be careful with lifetimes: since there's no static lifetime checking across the plugin boundary, Rust is forced to believe you when you say a function returns &'a when it's really &'b.
Native Rust
The first tests I performed were on no custom structures; just pure, native Rust types. This would give a baseline for if this is even possible. I chose three baseline types: &mut i32, &mut Vec, and Option<i32> -> Option<i32>. These were all chosen for very specific reasons: the &mut i32 because it tests a reference, the &mut Vec because it tests growing the heap from memory allocated in the host application, and the Option as a dual purpose of testing passing by move and matching a simple enum.
All three work as expected. Mutating the reference mutates the value, pushing to a Vec works properly, and the Option works properly whether Some or None.
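The original test bodies are not shown here, so the following is only a guess at what three such plugin-side functions could look like; the names bump, push_one and double are hypothetical. In a real plugin each would carry #[no_mangle] and be looked up by name from the host.

```rust
/// Tests a reference: mutates a value owned by the host.
pub fn bump(value: &mut i32) {
    *value += 1;
}

/// Tests growing a heap buffer that was allocated by the host.
pub fn push_one(items: &mut Vec<i32>) {
    items.push(1);
}

/// Tests passing by move and matching on a simple enum.
pub fn double(input: Option<i32>) -> Option<i32> {
    match input {
        Some(x) => Some(x * 2),
        None => None,
    }
}
```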
Shared Struct Definition
This was meant to test if you could pass a non-builtin struct with a common definition on both sides between plugin and host. This works as expected, but as mentioned in the "General Notes" section, I can't promise you that Rust won't optimize a structure definition on one side and not on the other. Always test your specific use case and use CI in case it changes.
Boxed Trait Object
This test uses a struct whose definition is only defined on the plugin side, but implements a trait defined in a common crate, and returns a Box<Trait>. This works as expected. Calling trait_obj.fun() works properly.
At first I actually anticipated there would be issues with dropping without making the trait explicitly have Drop as a bound, but it turns out Drop is properly called as well (this was verified by setting the value of a variable declared on the test stack via raw pointer from the struct's drop function). (Naturally I'm aware drop is always called even with trait objects in Rust, but I wasn't sure if dynamic libraries would complicate it).
NOTE:
I did not test what would happen if you load a plugin, create a trait object, then drop the plugin (which would likely close it). I can only assume this is potentially catastrophic. I recommend keeping the plugin open as long as the trait object persists.
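A sketch of the shape of this test, under some assumptions: the Sample field, AwesomeProvider and create_provider are hypothetical, and the constructor is passed in as a plain function pointer instead of being looked up from a shared library, so that the example runs on its own.

```rust
// --- common crate (shared by host and plugin) ---
pub struct Sample {
    pub value: i32, // hypothetical field
}

pub trait Provider {
    fn get_sample(&self) -> Sample;
}

// --- plugin side: the concrete type is never visible to the host ---
struct AwesomeProvider;

impl Provider for AwesomeProvider {
    fn get_sample(&self) -> Sample {
        Sample { value: 42 }
    }
}

/// The one symbol the host has to find; it would carry #[no_mangle] in a plugin.
pub fn create_provider() -> Box<dyn Provider> {
    Box::new(AwesomeProvider)
}

// --- host side ---
pub struct ProviderLoader {
    plugins: Vec<Box<dyn Provider>>,
    // A real loader would also keep the library handles here, in a field
    // declared AFTER `plugins`: fields drop in declaration order, so the
    // trait objects would be dropped before the code backing them is unloaded.
}

impl ProviderLoader {
    pub fn new() -> Self {
        ProviderLoader { plugins: Vec::new() }
    }

    /// `constructor` stands in for the symbol obtained from the shared library.
    pub fn register(&mut self, constructor: fn() -> Box<dyn Provider>) {
        self.plugins.push(constructor());
    }

    pub fn sample_values(&self) -> Vec<i32> {
        self.plugins.iter().map(|p| p.get_sample().value).collect()
    }
}
```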
Remarks
Plugins work exactly as you'd expect just linking a crate naturally, albeit with some restrictions and pitfalls. As long as you test, I think this is a very natural way to go. It makes symbol loading more bearable, for instance, if you only need to load a new function and then receive a trait object implementing an interface. It also avoids nasty C memory leaks because you couldn't or forgot to load a drop/free function. That said, be careful, and always test!
There is no official plugin system, and you cannot do plugins loaded at runtime in pure Rust. I saw some discussions about doing a native plugin system, but nothing is decided for now, and maybe there will never be any such thing. You can use one of these solutions:
You can extend your code with native dynamic libraries using FFI. To use the C ABI, you have to use repr(C), the no_mangle attribute, extern, etc. You will find more information by searching for "Rust FFI". With this solution, you must use raw pointers: they come with no safety guarantee (i.e. you must use unsafe code).
Of course, you can write your dynamic library in Rust, but to load it and call the functions, you must go through the C ABI. This means that the safety guarantees of Rust do not apply there. Furthermore, you cannot use Rust's higher-level features such as traits, enums, etc. between the library and the binary.
If you do not want this complexity, you can use a scripting language designed to extend Rust, with which you can dynamically add functions to your code and execute them with the same guarantees as in Rust. This is, in my opinion, the easier way to go: if you have the choice, and if the execution speed is not critical, use this to avoid tricky C/Rust interfaces.
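To make the C-ABI route concrete, here is a sketch of how a trait-like interface is commonly flattened into a repr(C) table of function pointers plus an opaque pointer. All names (SampleC, ProviderVTable, Counter, plugin_vtable) are hypothetical, and the table is obtained by a direct call here rather than by loading a library.

```rust
use std::os::raw::c_void;

/// C-compatible stand-in for a richer Rust return type.
#[repr(C)]
pub struct SampleC {
    pub value: i32,
}

/// What a trait becomes at the C ABI: plain function pointers over an opaque pointer.
#[repr(C)]
pub struct ProviderVTable {
    pub create: extern "C" fn() -> *mut c_void,
    pub get_sample: unsafe extern "C" fn(*const c_void) -> SampleC,
    pub destroy: unsafe extern "C" fn(*mut c_void),
}

// --- plugin side ---
struct Counter {
    value: i32,
}

extern "C" fn counter_create() -> *mut c_void {
    Box::into_raw(Box::new(Counter { value: 7 })) as *mut c_void
}

unsafe extern "C" fn counter_get_sample(this: *const c_void) -> SampleC {
    // Nothing checks that `this` really points at a Counter: that is the lost safety.
    let counter = unsafe { &*(this as *const Counter) };
    SampleC { value: counter.value }
}

unsafe extern "C" fn counter_destroy(this: *mut c_void) {
    if !this.is_null() {
        drop(unsafe { Box::from_raw(this as *mut Counter) });
    }
}

/// The exported entry point (it would carry #[no_mangle] in a real library).
pub extern "C" fn plugin_vtable() -> ProviderVTable {
    ProviderVTable {
        create: counter_create,
        get_sample: counter_get_sample,
        destroy: counter_destroy,
    }
}
```

Forgetting to call destroy leaks the object, and calling it twice is undefined behavior; neither mistake is caught by the compiler.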
Here is a (not exhaustive) list of languages that can easily extend Rust:
Gluon, a functional language like Haskell
Dyon, a small but powerful scripting language intended for video games
Lua with rlua or hlua
You can also use Python or Javascript, or see the list in awesome-rust.

coffeescript and repetition of code. Is there a solution?

So - I am really really digging coffeescript. But, I am curious how the possibility of repetition of code is dealt with across a large repository of code.
For instance.
Lets say I create a simple class.
class Cart
  constructor: (@session, @group) ->
class Shoes extends Cart
compiler will create __extends and __hasProp methods.
Mind you, this is just one example -- pretty much the same happens with loops etc. So, granted, each bit of code usually lives in its own walled garden. BUT there could be many, many copies of the same methods throughout a code base, because the compiler just creates generic helper methods that are all the same.
Anyone else have to contend with this or deal with that possible bloat?
That is probably a lot more specific to what build tool you are using to manage a large codebase. grunt-contrib-coffee for example provides the ability to concatenate before compilation which means something like the __extends method should only get declared once. Likewise, I believe, asset pipeline in rails makes similar optimizations through the require statements.

Object class members as pointers to avoid #include in headers - is it good practice?

This is really a question of precedence: which is more preferred in C++, avoiding pointers or avoiding #includes in header files?
"Don't Use #include in header files."
There seems to be some ambiguity based on my research. In this SO question, the top answer says "...make sure you actually need an include, [don't use one] when a forward declaration or even leaving it out completely will do." (From Header files and include best practice)
And this article explains the negative effect excess header inclusions can have on compile-time: http://blog.knatten.org/2012/11/09/another-reason-to-avoid-includes-in-headers/
As well as this tutorial, stating, "...you should try to put all of your code in the CPP class and only the class declaration in the HPP file.": https://github.com/LaurentGomila/SFML/wiki/Tutorial%3A-Basic-Game-Engine#wiki-declarations
"Don't Use Pointers."
But, there is also evidence that pointers should be avoided most often as well:
c++: when to use pointers?
https://softwareengineering.stackexchange.com/questions/56935/why-are-pointers-not-recommended-when-coding-with-c
Which preference takes precedence?
If my understanding about avoiding #includes in header files is correct, this can easily be done by changing things like class members to pointers so I can use a forward declaration instead, but is this a good idea for class members whose lifetime only lasts as long as the class itself?
It's not really a "one or the other". Both statements are true, but you need to understand the reasoning behind them.
tl;dr: Use forward declaration where possible to reduce compile time. Use stack objects or references as much as possible and pointers only in rare cases.
"Don't Use #include in header files."
This is a rather general statement, which as is, would be wrong. The more important part behind this statement actually is: "Use forward declarations wherever possible". Includes in header files are not something bad per se, but they often aren't needed either.
Forward declarations can be used if the included type/class/etc. is used only as a pointer or reference in the new type/class/etc. declaration within the given header. A forward declaration just tells the compiler: "Somewhere along the way you'll find the actual declaration of type X." The include can even be removed if the type isn't used at all in the declaration. The reason is that the compiler doesn't need to know anything about these types to calculate the required memory layout for the new type. For example a pointer has "always" the same size. Including the file additionally in the header would potentially only waste processing power, since the compiler would have to open and parse the file, thus adding expensive seconds to the compile time. So in most cases you'll do yourself a favor by reducing the unnecessary includes in the header files and instead using forward declarations.
For the sake of completeness: forward declarations are explicitly needed if you get circular references (class A depends on class B, which depends on class C, which depends on class A). However this can often also reveal either bad design and/or old/outdated coding standards, which leads us to the second topic.
"Don't use pointers."
Again the statement is a tiny bit too general. One might rather want to say: "Don't use raw pointers."
With C++11 and soon C++1y the language itself has changed a lot. As many bad C++ books as the world has seen, even more outdated C++ books float around nowadays (here's a good list however). While in the past we were mostly stuck with pointers, new and delete for memory management, we've evolved to better, more readable, less risky and 100% memory leak free ways to manage the data in memory. One of the magic words is RAII - since you linked something from SFML above, here's a nice demonstration of the power of RAII. I see many people use pointers and new and delete just because, or maybe because they are thinking in Java or C# terms, where objects get instantiated with the new keyword. In C++ however objects don't need new to be allocated, and it's mostly preferable to keep things on the stack instead of the heap. This works for many, many things, especially when using STL containers, which will hide the dynamic management in the background. Using the heap is in most cases only preferable if you need the data to be dynamic, non-"local", or you need a lot of it. However when you use the heap, make sure to use smart pointers such as std::unique_ptr or std::shared_ptr depending on the use case, but certainly not raw pointers. In modern C++ raw pointers should never own an object anymore. There are cases where it's okay to return a raw pointer to reference an object, but there's really no reason in modern C++ to call new on a raw pointer.
Let's get back to the original question though. The "Don't use raw pointers" rule is essentially more of a design question and quite unrelated to the whole header issue. While there might be some cases where you'll have to switch to raw pointers due to circular references, the use of forward declarations is otherwise just about compilation time (and maybe clean code), but it's not as essential for the programming itself.
In short: Don't use raw pointers to avoid inclusions in header files, but use forward declarations wherever possible and utilize smart pointers as much as possible.

Does it matter if there are unused functions I put into a big CoolFunctions.h / CoolFunctions.m file that's included everywhere in my project?

I want to create a big file for all cool functions I find somehow reusable and useful, and put them all into that single file. Well, for the beginning I don't have many, so it's not worth thinking much about making several files, I guess. I would use pragma marks to separate them visually.
But the question: Would those unused methods bother in any way? Would my application explode or have less performance? Or is the compiler / linker clever enough to know that function A and B are not needed, and thus does not copy their "code" into my resulting app?
This sounds like an absolute architectural and maintenance nightmare. As a matter of practice, you should never make a huge blob file with a random set of methods you find useful. Add the methods to the appropriate classes or categories. See here for information on the blob anti-pattern, which is what you are doing here.
To directly answer your question: no, methods that are never called will not affect the performance of your app.
No, they won't directly affect your app. Keep in mind though, all that unused code is going to make your functions file harder to read and maintain. Plus, writing functions you're not actually using at the moment makes it easy to introduce bugs that aren't going to become apparent until much later on when you start using those functions, which can be very confusing because you've forgotten how they're written and will probably assume they're correct because you haven't touched them in so long.
Also, in an object oriented language like Objective-C global functions should really only be used for exceptional, very reusable cases. In most instances, you should be writing methods in classes instead. I might have one or two global functions in my apps, usually related to debugging, but typically nothing else.
So no, it's not going to hurt anything, but I'd still avoid it and focus on writing the code you need now, at this very moment.
The code would still be compiled and linked into the project, it just wouldn't be used by your code, meaning your resultant executable will be larger.
I'd probably split the functions into separate files, depending on the common areas they are to address, so I'd have a library of image functions separate from a library of string manipulation functions, then include whichever are pertinent to the project in hand.
I don't think having unused functions in the .h file will hurt you in any way. If you compile all the corresponding .m files containing the unused functions in your build target, then you will end up making a bigger executable than is required. Same goes for if you include the code via static libraries.
If you do use a function but you didn't include the right .m file or library, then you'll get a link error.

Why use a post compiler?

I am battling to understand why a post compiler, like PostSharp, should ever be needed?
My understanding is that it just inserts code where attributed in the original code, so why doesn't the developer just do that code writing themselves?
I expect that someone will say it's easier to write since you can use attributes on methods and then not clutter them up boilerplate code, but that can be done using DI or reflection and a touch of forethought without a post compiler. I know that since I have said reflection, the performance elephant will now enter - but I do not care about the relative performance here, when the absolute performance for most scenarios is trivial (sub millisecond to millisecond).
Let's try to take an architectural point on the issue. Say you are an architect (everyone wants to be an architect ;)
You need to deliver the architecture to your team:
a selected set of libraries, architectural patterns, and design patterns. As a part of your design, you say: "we will implement caching using the following design pattern:"
string key = string.Format("[{0}].MyMethod({1},{2})", this, param1, param2 );
T value;
if ( !cache.TryGetValue( key, out value ) )
{
    using ( cache.Lock(key) )
    {
        if (!cache.TryGetValue( key, out value ) )
        {
            // Do the real job here and store the value into variable 'value'.
            cache.Add( key, value );
        }
    }
}
This is a correct way to do caching. Developers are going to implement this pattern thousands of times, so you write a nice Word document telling how you want the pattern to be implemented. Yeah, a Word document. Do you have a better solution? I'm afraid you don't. Classic code generators won't help. Functional programming (delegates)? It works fairly well for some aspects, but not here: you need to pass method parameters to the pattern. So what's left? Describe the pattern in natural language and trust that developers will implement it.
What will happen?
First, some junior developer will look at the code and say "Hm. Two cache lookups. Kinda useless. One is enough." (that's not a joke -- ask the DNN team about this issue). And your pattern ceases to be thread-safe.
As an architect, how do you ensure that the pattern is properly applied? Unit testing? Fair enough, but you will hardly detect threading issues this way. Code review? That's maybe the solution.
Now, what if you decide to change the pattern? For instance, you detect a bug in the cache component and decide to use your own? Are you going to edit thousands of methods? It's not just refactoring: what if the new component has different semantics?
What if you decide that a method is not going to be cached any more? How difficult will it be to remove caching code?
The AOP solution (whatever the framework is) has the following advantages over plain code:
It reduces the number of lines of code.
It reduces the coupling between components, so you don't have to change many things when you decide to change the logging component (just update the aspect), which improves the capacity of your source code to cope with new requirements over time.
Because there is less code, the probability of bugs is lower for a given set of features, therefore AOP improves the quality of your code.
So if you put it all together:
Aspects reduce both development costs and maintenance costs of software.
I have a 90 min talk on this topic and you can watch it at http://vimeo.com/2116491.
Again, the architectural advantages of AOP are independent of the framework you choose. The differences between frameworks (also discussed in this video) influence principally the extent to which you can apply AOP to your code, which was not the point of this question.
Suppose you already have a class which is well-designed, well-tested etc. You want to easily add some timing on some of the methods. Yes, you could use dependency injection, create a decorator class which proxies to the original but with timing for each method - but even that class is going to be a mess of repetition...
... or you can add reflection to the mix and use a dynamic proxy of some description, which lets you write the timing code once, but requires you to get that reflection code just right - which isn't as easy as it might be, especially if generics are involved.
... or you can add an attribute to each method that you want timed, write the timing code once, and apply it as a post-compile step.
I know which seems more elegant to me - and more obvious when reading the code. It can be applied even in situations where DI isn't appropriate (and it really isn't appropriate for every single class in a system) and with no other changes elsewhere.
AOP (PostSharp) is for attaching code to all sorts of points in your application, from one location, so you don't have to place it there.
You cannot achieve what PostSharp can do with Reflection.
I personally don't see a big use for it, in a production system, as most things can be done in other, better, ways (logging, etc).
You may like to review the other threads on this matter:
Anyone with Postsharp experience in production?
Other than logging, and transaction management what are some practical applications of AOP?
Aspect Oriented Programming: What do you use PostSharp for?
etc (search)
Aspects take away all the copy & paste - code and make adding new features faster.
I hate nothing more than, for example, having to write the same piece of code over and over again. Gael has a very nice example regarding INotifyPropertyChanged on his website (www.postsharp.net).
This is exactly what AOP is for. Forget about the technical details, just implement what you are being asked for.
In the long run, I think we all should say goodbye to the way we are writing software now. It's tedious and plainly stupid to write boilerplate code and iterate manually.
The future belongs to declarative, functional style being held together by an object oriented framework - and the cross cutting concerns being handled by aspects.
I guess the only people who will not get it soon are the guys who are still paid for lines of code.