Auto-Wrap huge C++ libs to C for import in Swift / Go - swift

Let's say i have a huge lib in C++ (with tons of dependencies, it needs about 3h for a full build under GCC). I want to build upon that lib but don't want to do so in C++ but rather in a more productive language. How can i actually bridge or wrap that extern lib package so i can access it in another language and program on top of it?
Languages considered:
Swift
Go
What i found is, that both languages do provide auto bridging or wrapping for C libs and code (I don't actually know whats the difference between wrapping / bridging). So, if i have some c code, i can just throw it in the same Swift or Go project and can use it with a simple import in my project.
This doesn't work in both languages for C++ code however. So i googled how to transform C++ libs to C code or generate autowrappers. I found the following:
swig.org - auto wrapper for C++ libs
Comeau C++ compiler - automatically transfers C++ to C code
LLVM - should be able to take any input and transform it to any output that LLVM is capable of.
Question:
Is it even in the realms of usable / realistic / managable to build
on top of such a huge lib in other languages like Swift / Go, if
using auto wrapping or auto bridging?
What of the 3 listed libs / programs / frameworks works best for the process of C++ -> C (because Swift and Go both provide C auto
wrapping).
Are there better alternatives than what i considered so far?
Would it be better to just "stick with C++" as using any other tools to do the wrapping / bridging process would be far to much
work to equal out the benefit of using a more productive language
like Swift / Go?
Thanks:)
Disclaimer: There is also the possibility to manually wrap a C++ lib in C but that would take an unbearable amount of work for such a huge lib.

Q1: Is it realistic?
Not realistic, because any large complicated C++ interop is going to get too complicated. Automatic tools are likely to fail and manual work is too hard.
Q2: What's best?
I don't know and given A1 it does not seem to matter.
Q3: Alternative?
Q4: Is C++ only the best alternative?
If you want to take advantage of existing C++ code from another language regardless of the language involved the best option in complex scenarios is to use a hybrid approach.
Most languages provide interop to C and not C++ due to non-standard C++ naming convention. In other words, just about every language provides access to plain C-functions, but C++ is frequently not supported.
Since your library is complex, the best solution would be based on "Facade" pattern. Create a new C-library and implement application specific logic that utilizes C++ library. Try to design this library to be as thin as possible. The goal is not to write all business logic, but to provide C-functions that hold on C++ objects and call C++ functions. The GO-level language code would then call this library to use C++ library underneath. This approach differs from Q1 approach. In Q1 you attempt to have one interop call on per C++ function or object's method. In Facade you attempt to implement C++ usage scenarios that are unique to your application.
With Facade you reduce the scope of interop work, because you target your application scenarios. At the same time you mitigate away from C++ complexity at GO language level.
For example, you need to read a temperature sensor using C++ library.
In C++ you'd have to do:
open file
read stream until you find SLIP terminator
read one "record"
close file
With facade you create a single function called "readTemperature(deviceFileName)" and that C function executes 4 calls at once.
That's a fake example, just to show the point.
With facade you might want to hide original C++ objects and at this point it becomes a small layer. The goal here is to stay focused and balance your application needs with generalization to support your application.
Interestingly enough Facade approach is a way to improve interop performance. Interop in just about every language is more expensive than normal operations due to need to marshal from langauage runtime environment and keep it protected. Lots of interop calls slow down application (we are talking about millions here). For example, having 10 interop calls combined into 1 improves performance, because amount of itnerop operations is reduced.

I was successful wrapping a large (although perhaps not "huge") C++ library (hundreds of header files) in Swift using a relatively simple process. You directly link your project to the library. The only thing you have to wrap are any new functions that you write (to be invoked in Swift) that actually use the library (in the C++ wrapper file). The verbose stuff can be left in the wrapper file, mostly without any modification. There is a simple little tutorial which helped me: https://www.swiftprogrammer.info/swift_call_cpp.html
(FYI, there is one step he omitted: Set your library search paths in Build Settings => Search Paths => Library Search Paths (both Debug and Release) )

Related

Idiomatic Rust plugin system

I want to outsource some code for a plugin system. Inside my project, I have a trait called Provider which is the code for my plugin system. If you activate the feature "consumer" you can use plugins; if you don't, you are an author of plugins.
I want authors of plugins to get their code into my program by compiling to a shared library. Is a shared library a good design decision? The limitation of the plugins is using Rust anyway.
Does the plugin host have to go the C way for loading the shared library: loading an unmangled function?
I just want authors to use the trait Provider for implementing their plugins and that's it.
After taking a look at sharedlib and libloading, it seems impossible to load plugins in a idiomatic Rust way.
I'd just like to load trait objects into my ProviderLoader:
// lib.rs
pub struct Sample { ... }
pub trait Provider {
fn get_sample(&self) -> Sample;
}
pub struct ProviderLoader {
plugins: Vec<Box<Provider>>
}
When the program is shipped, the file tree would look like:
.
├── fancy_program.exe
└── providers
├── fp_awesomedude.dll
└── fp_niceplugin.dll
Is that possible if plugins are compiled to shared libs? This would also affect the decision of the plugins' crate-type.
Do you have other ideas? Maybe I'm on the wrong path so that shared libs aren't the holy grail.
I first posted this on the Rust forum. A friend advised me to give it a try on Stack Overflow.
UPDATE 3/27/2018:
After using plugins this way for some time, I have to caution that in my experience things do get out of sync, and it can be very frustrating to debug (strange segfaults, weird OS errors). Even in cases where my team independently verified the dependencies were in sync, passing non-primitive structs between the dynamic library binaries tended to fail on OS X for some reason. I'd like to revisit this, find what cases it happens in, and perhaps open an issue with Rust, but I'm going to advise caution with this going forward.
LLDB and valgrind are near-essential to debug these issues.
Intro
I've been investigating things along these lines myself, and I've found there's little official documentation for this, so I decided to play around!
First let me note, as there is little official word on these properties please do not rely on any code here if you're trying to keep planes in the air or nuclear missiles from errantly launching, at least not without doing far more comprehensive testing than I've done. I'm not responsible if the code here deletes your OS and emails an erroneous tearful confession of committing the Zodiac killings to your local police; we're on the fringes of Rust here and things could change from one release or toolchain to another.
I have personally tested this on Rust 1.20 stable in both debug and release configurations on Windows 10 (stable-x86_64-pc-windows-msvc) and Cent OS 7 (stable-x86_64-unknown-linux-gnu).
Approach
The approach I took was a shared common crate both crates listed as a dependency defining common struct and trait definitions. At first, I was also going to test having a struct with the same structure, or trait with the same definitions, defined independently in both libraries, but I opted against it because it's too fragile and you wouldn't want to do it in a real design. That said, if anybody wants to test this, feel free to do a PR on the repository above and I will update this answer.
In addition, the Rust plugin was declared dylib. I'm not sure how compiling as cdylib would interact, since I think it would mean that upon loading the plugin there are two versions of the Rust standard library hanging around (since I believe cdylib statically links the Rust stdlib into the shared object).
Tests
General Notes
The structs I tested were not declared #repr(C). This could provide an extra layer of safety by guaranteeing a layout, but I was most curious about writing "pure" Rust plugins with as little "treating Rust like C" fiddling as possible. We already know you can use Rust via FFI by wrapping things in opaque pointers, manually dropping, and such, so it's not very enlightening to test this.
The function signature I used was pub fn foo(args) -> output with the #[no_mangle] directive, it turns out that rustfmt automatically changes extern "Rust" fn to simply fn. I'm not sure I agree with this in this case since they are most certainly "extern" functions here, but I will choose to abide by rustfmt.
Remember that even though this is Rust, this has elements of unsafety because libloading (or the unstable DynamicLib functionality) will not type check the symbols for you. At first I thought my Vec test was proving you couldn't pass Vecs between host and plugin until I realized on one end I had Vec<i32> and on the other I had Vec<usize>
Interestingly, there were a few times I pointed an optimized test build to an unoptimized plugin and vice versa and it still worked. However, I still can't in good faith recommending building plugins and host applications with different toolchains, and even if you do, I can't promise that for some reason rustc/llvm won't decide to do certain optimizations on one version of a struct and not another. In addition, I'm not sure if this means that passing types through FFI prevents certain optimizations such as Null Pointer Optimizations from occurring.
You're still limited to calling bare functions, no Foo::bar because of the lack of name mangling. In addition, due to the fact that functions with trait bounds are monomorphized, generic functions and structs are also out. The compiler can't know you're going to call foo<i32> so no foo<i32> is going to be generated. Any functions over the plugin boundary must take only concrete types and return only concrete types.
Similarly, you have to be careful with lifetimes for similar reasons, since there's no static lifetime checking Rust is forced to believe you when you say a function returns &'a when it's really &'b.
Native Rust
The first tests I performed were on no custom structures; just pure, native Rust types. This would give a baseline for if this is even possible. I chose three baseline types: &mut i32, &mut Vec, and Option<i32> -> Option<i32>. These were all chosen for very specific reasons: the &mut i32 because it tests a reference, the &mut Vec because it tests growing the heap from memory allocated in the host application, and the Option as a dual purpose of testing passing by move and matching a simple enum.
All three work as expected. Mutating the reference mutates the value, pushing to a Vec works properly, and the Option works properly whether Some or None.
Shared Struct Definition
This was meant to test if you could pass a non-builtin struct with a common definition on both sides between plugin and host. This works as expected, but as mentioned in the "General Notes" section, can't promise you Rust won't fail to optimize and/or optimize a structure definition on one side and not another. Always test your specific use case and use CI in case it changes.
Boxed Trait Object
This test uses a struct whose definition is only defined on the plugin side, but implements a trait defined in a common crate, and returns a Box<Trait>. This works as expected. Calling trait_obj.fun() works properly.
At first I actually anticipated there would be issues with dropping without making the trait explicitly have Drop as a bound, but it turns out Drop is properly called as well (this was verified by setting the value of a variable declared on the test stack via raw pointer from the struct's drop function). (Naturally I'm aware drop is always called even with trait objects in Rust, but I wasn't sure if dynamic libraries would complicate it).
NOTE:
I did not test what would happen if you load a plugin, create a trait object, then drop the plugin (which would likely close it). I can only assume this is potentially catastrophic. I recommend keeping the plugin open as long as the trait object persists.
Remarks
Plugins work exactly as you'd expect just linking a crate naturally, albeit with some restrictions and pitfalls. As long as you test, I think this is a very natural way to go. It makes symbol loading more bearable, for instance, if you only need to load a new function and then receive a trait object implementing an interface. It also avoids nasty C memory leaks because you couldn't or forgot to load a drop/free function. That said, be careful, and always test!
There is no official plugin system, and you cannot do plugins loaded at runtime in pure Rust. I saw some discussions about doing a native plugin system, but nothing is decided for now, and maybe there will never be any such thing. You can use one of these solutions:
You can extend your code with native dynamic libraries using FFI. To use the C ABI, you have to use repr(C), no_mangle attribute, extern etc. You will find more information by searching Rust FFI on the internets. With this solution, you must use raw pointers: they come with no safety guarantee (i.e. you must use unsafe code).
Of course, you can write your dynamic library in Rust, but to load it and call the functions, you must go through the C ABI. This means that the safety guarantees of Rust do not apply there. Furthermore, you cannot use the highest level Rust's functionalities as trait, enum, etc. between the library and the binary.
If you do not want this complexity, you can use a language adapted to expand Rust: with which you can dynamically add functions to your code and execute them with same guarantees as in Rust. This is, in my opinion, the easier way to go: if you have the choice, and if the execution speed is not critical, use this to avoid tricky C/Rust interfaces.
Here is a (not exhaustive) list of languages that can easily extend Rust:
Gluon, a functional language like Haskell
Dyon, a small but powerful scripting language intended for video games
Lua with rlua or hlua
You can also use Python or Javascript, or see the list in awesome-rust.

Using Inline::CPP vs SWIG - when?

In this question i saw two different answers how to directly call functions written in C++
Inline::CPP (and here are more, like Inline::C, Inline::Lua, etc..)
SWIG
Handmade (as daxim told - majority of modules are handwritten)
I just browsed nearly all questions in SO tagged [perl][swig] for finding answer for the next questions:
What are the main differences using (choosing between) SWIG and Inline::CPP or Handwritten?
When is the "good practice" - recommented to use Inline::CPP (or Inline:C) and when is recommented to use SWIG or Handwritten?
As I thinking about it, using SWIG is more universal for other uses, like asked in this question and Inline::CPP is perl-specific. But, from the perl's point of view, is here some (any) significant difference?
I haven't used SWIG, so I cannot speak directly to it. But I'm pretty familiar with Inline::CPP.
If you would like to compose C++ code that gets compiled and becomes callable from within Perl, Inline::CPP facilitates this. So long as the C++ code doesn't change, it should only compile once. If you base a module on Inline::CPP, the code will be compiled at module install time, so another user never really sees the first time compilation lag; it happens at install time, just before the testing phase.
Inline::CPP is not 100% free of portability isues. The target user must have a C++ compiler that is of similar flavor to the C compiler used to build Perl, and the C++ standard libraries should be of versions that produce binary-compatible code with Perl. Inline::CPP has about a 94% success rate with the CPAN testers. And those last 6% almost always boil down to issues of the installation process not correctly deciphering what C++ compiler and libraries to use. ...and of those, it usually comes down to the libraries.
Let's assume you as a module author find yourself in that 95% who have no problem getting Inline::CPP installed. If you know that your target audience will fall into that same category, then producing a module based on Inline::CPP is simple. You basically have to add a couple of directives (VERSION and NAME), and swap out your Makefile.PL's ExtUtils::MakeMaker call to Inline::MakeMaker (it will invoke ExtUtils::MakeMaker). You might also want a CONFIGURE_REQUIRES directive to specify a current version of ExtUtils::MakeMaker when you create your distribution; this insures that your users have a cleaner install experience.
Now if you're creating the module for general consumption and have no idea whether your target user will fit that 94% majority who can use Inline::CPP, you might be better off removing the Inline::CPP dependency. You might want to do this just to minimize the dependency chain anyway; it's nicer for your users. In that case, compose your code to work with Inline::CPP, and then use InlineX::CPP2XS to convert it to a plain old XS module. Your user will now be able to install without the process pulling Inline::CPP in first.
C++ is a large language, and Inline::CPP handles a large subset of it. Pay attention to the typemap file to determine what sorts of parameters can be passed (and converted) automatically, and what sorts are better dealt with using "guts and API" calls. One feature I wouldn't recommend using is automatic string conversion, as it would produce Unicode-unfriendly conversions. Better to handle strings explicitly through API calls.
The portion of C++ that isn't handled gracefully by Inline::CPP is template metaprogramming. You're free to use templates in your code, and free to use the STL. However, you cannot simply pass STL type parameters and hope that Inline::CPP will know how to convert them. It deals with POD (basic data types), not STL stuff. Furthermore, if you compose a template-based function or object method, the C++ compiler won't know what context Perl plans to call the function in, so it won't know what type to apply to the template at compiletime. Consequently, the functions and object methods exposed directly to Inline::CPP need to be plain functions or methods; not template functions or classes.
These limitations in practice aren't hard to deal with as long as you know what to expect. If you want to expose a template class directly to Inline::CPP, just write a wrapper class that either inherits or composes itself of the template class, but gives it a concrete type for Inline::CPP to work with.
Inline::CPP is also useful in automatically generating function wrappers for existing C++ libraries. The documentation explains how to do that.
One of the advantages to Inline::CPP over Swig is that if you already have some experience with perlguts, perlapi, and perlcall, you will feel right at home already. With Swig, you'll have to learn the Swig way of doing things first, and then figure out how to apply that to Perl, and possibly, how to do it in a way that is CPAN-distributable.
Another advantage of using Inline::CPP is that it is a somewhat familiar tool in the Perl community. You are going to find a lot more people who understand Perl XS, Inline::C, and to some extent Inline::CPP than you will find people who have used Swig with Perl. Although XS can be messy, it's a road more heavily travelled than using Perl with Swig.
Inline::CPP is also a common topic on the inline#perl.org mailing list. In addition to myself, the maintainer of Inline::C and several other Inline-family maintainers frequent the list, and do our best to assist people who need a hand getting going with the Inline family of modules.
You might also find my Perl Mongers talk on Inline::CPP useful in exploring how it might work for you. Additionally, Math::Prime::FastSieve stands as a proof-of-concept for basing a module on Inline::CPP (with an Inline::CPP dependency). Furthermore, Rob (sisyphus), the current Inline maintainer, and author of InlineX::CPP2XS has actually included an example in the InlineX::CPP2XS distribution that takes my Math::Prime::FastSieve and converts it to plain XS code using his InlineX::CPP2XS.
You should probably also give ExtUtils::XSpp a look. I think it requires you to declare a bit more stuff than Inline::CPP or SWIG, but it's rather powerful.

C++ Libraries on iPhone/Android: Generation of Wrapper classes

I have the following question:
Let's assume I have many different C++ libraries(algorithms), which are written in the same style . (They need some inputs and give back some outputs).
I've done some research and wanted to ask if its possible to auto-generate Wrapper classes (by using an algorithm which are given the input and the outputs of the c++ algorithm), which can be easily used in Objective-C/Java (iOS/Android) then .
The app-programming part isn't really time-consuming.
You'll want to look at SWIG. This generates bindings for other languages from a C based API. Objective-C support is in there as is Java.
I'm not sure what happened to objective-C support in the later versions, but its in v1.1 and you can see the branch where it was added.

Mixing ObjC and C++ with C++ template classes in more than one source

I'm just thinking of porting some old C++ sources held in my archive to iOS thus supplying a ObjC GUI, using wrappers for some C++ stuff and leave the important data working stuff within the C++ code. So, the problem is that the old sources come from Win32 MFC thus using CString class for strings and I want to replace that with Joe O'Leary's CStdString which is a C++ template class that will do it just fine ... but:
I have to use the string class definition along with a big bunch of different C++ sources and so each of them will include the CStdString template on their own. Normally I would write a wrapper for the whole string class, but better if I needn't.
Will I have a problem with instantiation of strings in the different sources? Could it make a problem to pass a templated string from one source to another? In fact I don't know if the compiler generates the code for a template only once or multiple times having the fact that the same instantiation type is used for the template.
Can you fill some light into this?
Thanks...
MFC and CString may only work properly on Windows OS so they aren't good candidates to be putting in any kind of library that will be potentially used by a platform other than windows.
I'm not familiar with Joe O'Leary's CStdString classes but I'd recommend using std::string as much as possible and char* with "extern C" exports and wrapping functions for use outside of C++ land as the c-style string is more easily compatible with other languages that may need to call into your C++ library.
As far as templates all the variations are generated at compile time and then the correct implementation is chosen at run time as far as I know. However your problem will most likely be in translation from one kind of string to another which may require you to create some middle layer or wrapper to marshal from string type of one language to another.
I agree with CString, as long as you stay with std::string or some other multi-platform string implementation for C++ you are not going to face any issues ( even boost works on iOS ).
I've been integrating C++/Obj-C for about two years now so you can be sure that keeping model classes in C++ ( even with heavily templated code ), is not a problem. I would advice you to do what you could do best with Obj-C in Obj-C though... ( avoiding being a hammer developer :) )
Good luck!

Code generation for Java JVM / .NET CLR

I am doing a compilers discipline at college and we must generate code for our invented language to any platform we want to. I think the simplest case is generating code for the Java JVM or .NET CLR. Any suggestion which one to choose, and which APIs out there can help me on this task? I already have all the semantic analysis done, just need to generate code for a given program.
Thank you
From what I know, on higher level, two VMs are actually quite similar: both are classic stack-based machines, with largely high-level operations (e.g. virtual method dispatch is an opcode). That said, CLR lets you get down to the metal if you want, as it has raw data pointers with arithmetic, raw function pointers, unions etc. It also has proper tailcalls. So, if the implementation of language needs any of the above (e.g. Scheme spec mandates tailcalls), or if it is significantly advantaged by having those features, then you would probably want to go the CLR way.
The other advantage there is that you get a stock API to emit bytecode there - System.Reflection.Emit - even though it is somewhat limited for full-fledged compiler scenarios, it is still generally enough for a simple compiler.
With JVM, two main advantages you get are better portability, and the fact that bytecode itself is arguably simpler (because of less features).
Another option that i came across what a library called run sharp that can generate the MSIL code in runtime using emit. But in a nicer more user friendly way that is more like c#. The latest version of the library can be found here.
http://code.google.com/p/runsharp/
In .NET you can use the Reflection.Emit Namespace to generate MSIL code.
See the msdn link: http://msdn.microsoft.com/en-us/library/3y322t50.aspx