Sandboxing Guile by deleting unwanted libraries?

I have an open-source GUI for which I've just implemented a very basic extension mechanism that allows the user to embed a snippet of Lisp (Guile) code in a document to allow certain functions to be customized. Currently the use case (my own) is that in a certain situation I just want a certain number to be divided by 10 automatically.
In principle this means that if someone uses my GUI, someone else can send them a document containing a Trojan horse attack in the form of malicious Guile code. This seems unlikely in reality due to my very small user base and the relevant social factors, but anyway I would like to protect against this by sandboxing the code.
Guile 2.2.1 has a sandboxing mechanism: https://www.gnu.org/software/guile/manual/html_node/Sandboxed-Evaluation.html However, this seems entirely focused on preventing excessive use of resources (and many users will not have such a current version of Guile, e.g., I don't right now).
Is it possible in a Guile program, after the interpreter has started up, to delete libraries such as POSIX so that any later code can't do things like read files? If so, then I could just have my GUI prepend the sandboxing code to the potentially untrusted code supplied by the document. My goal would basically be to ensure that the untrusted code would have to be 100% without side effects.
It seems like other people must have run into this issue before, since Guile has been promoted as a standard extension language for Linux.
EDIT: After coming across "How to undefine a variable in Scheme?", it seems to me that I could probably create a new scope, override the definition of any dangerous function by doing a set! within that scope, and then execute the untrusted code within that scope. I suppose if I could read the entire symbol table somehow, I could do this kind of overriding on every function that isn't on a whitelist. This just seems clumsy and inefficient, and might depend on reading the symbol table using some implementation-specific mechanism. It also might be vulnerable to syntactic tricks, for example if the untrusted code uses a ) to pop itself out of the scope of the sandbox.
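The whitelist idea in the EDIT can be illustrated outside Guile. What follows is a minimal sketch in Python, not Guile, and it assumes the untrusted snippet only needs arithmetic (such as dividing a number by 10). Instead of overriding dangerous functions one by one, it parses the snippet first and rejects every syntax node that is not explicitly allowed. The names safe_eval, BINARY_OPS and UNARY_OPS are hypothetical, chosen for illustration only.

```python
import ast
import operator

# Whitelisted binary and unary operators; anything else is rejected.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(source, variables):
    """Evaluate an arithmetic expression over whitelisted names only."""
    # Parsing the whole snippet up front means a stray delimiter causes
    # a SyntaxError instead of escaping the sandbox.
    tree = ast.parse(source, mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            return BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
            return UNARY_OPS[type(node.op)](walk(node.operand))
        # Calls, attribute access, unknown names, etc. all end up here.
        raise ValueError("disallowed syntax: " + type(node).__name__)

    return walk(tree)
```

With this sketch, safe_eval("x / 10", {"x": 250}) evaluates the division, while a snippet containing a function call or an unbalanced parenthesis is rejected before anything runs. It does not address resource exhaustion (for example, very deeply nested expressions), which would need separate limits.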

Related

ebpf: bpf_prog_load() vs bpf_object__load()

I have not used libbpf in a while. Now, when I'm looking at the source code and examples, it looks to me as if the whole API is now built around bpf_object, whereas before it was based on program FDs (at least at the user-facing level). I believe that the fd is now hidden inside bpf_object or similar.
Of course it keeps backward compatibility and I can still use bpf_prog_load, for example; however, it looks like the preferred way of writing application code with libbpf is the bpf_object API?
Correct me if I'm wrong. Thanks!
Sounds mostly correct to me.
Low-Level Wrappers
If I remember correctly, the functions returning file descriptors in libbpf, mostly defined in tools/lib/bpf/bpf.c, have always been very low-level. This is the case for bpf_load_program() for example, which is no more than a wrapper around the bpf() system call for loading programs. Such functions are still available, but their use may be tedious for complex use cases.
bpf_prog_load()
Some more advanced functions have long been provided. bpf_prog_load(), which you mention, is one of them, but it returns an error code, not a file descriptor. It is still available as one option to load programs with the library.
bpf_object__*()
Although I don't think there are strict guidelines, I believe it is true that most examples now use the bpf_object__*() functions. One reason is that they provide a more consistent user experience, being organised around the manipulation of an object file to extract all the relevant bytecode and metadata, and then to load and attach the program. Another reason, I think, is that since this model has been favoured over the last releases, these functions have better support for recent eBPF features and offer features that the older bpf_prog_load() workflow does not support.
Libbpf Evolves
Finally, it's worth mentioning that libbpf's API is currently undergoing some review and will likely be reworked as part of a major v1.0 release. You may want to have a look at the work document linked in the announcement: Some bpf_object__ functions may be deprecated, and similarly there is currently a proposal to:
Deprecate bpf_prog_load() and bpf_prog_load_xattr() in favor of bpf_object__open_{mem, file}() and bpf_object__load() combo.
There is nothing certain yet regarding the v1.0 release, so I wouldn't worry too much about “deprecation” at the moment - I don't expect all functions to be removed just yet. But that's something you may want to consider when building your next applications.

REST Server in TCL

I would like to add a REST interface to an existing TCL codebase (so that programs in other languages can use the existing TCL code).
I found a list of web servers with TCL support, but I have no idea which one would be a good solution to quickly map our TCL functions to HTTP/REST calls without tons of boilerplate code.
Has anyone here already done something like this and can tell me which of these servers would be a good (or bad/difficult) solution?
Is there maybe another server/framework that is even better for this use case?
Consider Naviserver. Tcl is its embedded interpreter language. It has a low memory overhead, and is regularly maintained and tested for performance and low latency.
For what you’re describing, you might consider Wapp. It’ll do exactly the boilerplate elimination you want, and it’s easy to dive into. You’d probably want to use it as a library, rather than an app, given that you’ve got an existing codebase, but its operation past the initial setup is the same for that use case.

black box function export in MATLAB [duplicate]

If a company works on MATLAB projects, how do they deliver the project to the client? I mean, which files do they send to the client, given that they cannot hand over all of the code and data?
It would depend on lots of things, such as the nature of the product you are building for the client, your relationship and contractual agreement with them, and whether they need to modify the product in the future.
When I carry out consultancy on MATLAB projects for a company, I usually supply them with MATLAB source code. Part of the contract would typically say that they own the code (and the copyright to the code) that I produce for them, and they can then do pretty much whatever they want with it.
If you have a different relationship, where you continue to own the code and need to prevent them from reading it and/or modifying it, then the issue is really the same as it is for any other language: you rely on a mixture of technological restrictions and legal restrictions, designed to be as restrictive as you need while minimizing inconvenience for the end-user.
For example,
You can obfuscate your code using the command pcode. That will prevent almost everyone who isn't extremely determined from seeing your code and modifying it (there are some loopholes though), but they will still be able to run it within MATLAB. Downsides might be that your code may become unexecutable in a future version of MATLAB, so you may need to support it again to fix that later. To mitigate this, you could specify in your contract or license agreement that only specific versions of MATLAB will be supported.
You can use MATLAB Compiler to produce a standalone library or executable that contains the code in an encrypted form. Downsides might be that they would rather use the code from within MATLAB. An upside would be that unlike the first option it doesn't require MATLAB, so you're not vulnerable to backward-compatibility issues in future.
You can include licence-management code within your MATLAB application. You can either roll your own, perhaps by calling a bit of Java for the cryptography (you will likely not be able to make it very secure, unless you're very talented, but you'll probably be able to make something simple and workable), or you can buy third-party C libraries that do it well, and call them from MATLAB.
You can simply put copyright lines in your code saying that you own the copyright, and licence the code to them under specific terms, such as that they may view it, use it, but not modify or redistribute it. If you really want, you could ask them to also sign a non-disclosure agreement requiring them not to discuss the content of the code with third parties.
Although the technological restrictions available are a little different in MATLAB than they would be for a compiled language such as C or Java, at the end of the day those are only ever there to keep honest people honest - anyone determined will be able to get around them eventually, and they may well inconvenience the honest people, annoying them into disliking your product or service.
Better to use a mixture of very light technological restrictions, crystal-clear contract and licensing terms, and trust.
<advert> One of the consultancy services I offer is advice and help in preparing MATLAB code for deployment, including protecting it. If you think you'd benefit from that, please get in touch. </advert>
You can use the MATLAB Compiler to compile your code into an .exe file for Windows. This is what is usually expected by a company. Some who have R&D themselves might ask you for the original m-code, or specific functions, depending on your relationship/contract with them. I've been asked to give m-code several times, and it says in my contract that I am supposed to give this information to them. (UK based)

Allowing the user a sandboxed version of a programming language

Note: I'd appreciate some tag suggestions for this one..
I'd like to provide my users with a method of programmatically manipulating data on the server. This would be done by using an in-browser code editor to be executed at a later date, not dissimilar to the manner https://www.onx.ms employ.
I'd like to avoid writing a DSL (a barrier to adoption?), and would prefer the language that the user writes to be either JavaScript or Ruby based.
My obvious concern is security. I understand the perils of allowing user generated code to run server-side, but what steps can I take to eliminate the risk?
Do sites like http://railsforzombies.com actually use irb, or is it far simpler than that?
Would you consider Java (or other JVM languages such as JRuby, Scala, Clojure etc)? If so - there is a wealth of power in the JVM to restrict the privileges of a sandboxed app. See this other question for details: How do I create a Java sandbox?
Google Caja lets you safely embed user-specified Javascript in your website, but I think it might be aimed at running the code in the user's browser rather than on your server. I haven't used it myself.
I don't know if there are ready-made solutions for other languages, but I think a custom solution would involve recompiling the interpreter yourself after removing all API libraries that allow the user to write to disk, open network connections, fork processes/threads, and do any other dangerous or denial-of-service operation. Whitelisting "safe" libraries is the only approach that could work for that.
It would be safer if you had separate virtual servers for individual users.

What does "monolithic" mean?

I've seen it in the context of classes. I suspect it means that the class could use being broken down into logical subunits, but I can't find a good definition. Could you give some examples?
Thanks for the help.
Edit: I love the smart replies, but I'm obviously referring to "monolithic" within a software context. I know about monoliths, megaliths, dolmens, and all the stone-related contexts. Gee, I have enough of them in my country...
Interesting question. I don't think there are any formal definitions of what a monolithic class is, but you've got the idea. A class that contains multiple components that are logically unconnected, or pointlessly coupled, is a monolithic class.
If you've read The Pragmatic Programmer, which I strongly recommend, you can define a monolithic class as an anti-pattern that goes against almost everything from that book.
As for examples, you'll find more in the realm of chip and OS design, where there are formal definitions of monolithic chips/kernels, which are similar to a monolithic class. Here are some examples, although each of them can be argued against being on this list:
JOGL - Java bindings for OpenGL. This could be arguable, and with good reason.
Most academic projects - For obvious reasons.
If you started programming alone, rather than joining a team, then chances are you can open one of your first projects, and there will be a class that is monolithic.
If you look up the etymology of the word you'll see it comes from the Greek monos (single) and lithos (stone). In the context of software as you mention it, it describes a single-tiered application in which the code for the user interface and the data access are combined into a single program from a single platform.
"Monolithic" is a term that has been used to flame successful software. This link exposes the assumptions inherent in the term, and their limited usefulness.
The basic assumption is that a system works better if it is built from software components that each have an individual, well-defined task. Intuitively, this seems right. If each component works, the entire system must work, right?
In reality, it's not that easy. A larger, compositional (non-monolithic) system can miss a critical function, even when there is no single component to blame. This happens when the architectural design fails to allocate a function to any specific component. This can happen especially if it's a function which doesn't map cleanly to a single component.
Now Linux (to continue with the linked example) in reality is not monolithic. It has a modular userspace on top of a monolithic kernel, a userspace that comes with many separate utilities. Except when it doesn't.
My definition of a monolithic design in software development is a design which requires additional functionality to be added to a single indivisible block of code.
PRO:
Everything is in one place, and therefore easy to find
Can be simpler, given there are fewer relations to consider (can also be more complex, see cons)
CONS:
Over time, as functionality is added, the complexity of the system may increase exponentially, to the point where new features are extremely hard or impossible to implement
Can make it difficult for multiple developers to work together, e.g., Entity Framework EDMX files have the entire database in a single file, which can be extremely difficult for multiple developers to work on.
Reduced reusability: by definition it does not have smaller components which can then be reused and repurposed to solve other problems, unless a complete copy of the code is made and then modified.
A monolithic architecture is a model of software structure which is created as one piece, where all Rails tools (ActionMailer, ActiveJob, ActionCable, etc.) are gathered together with the code that uses them. The tools are not connected with each other, but they are also not autonomous.
If one feature needs changes, it will influence the work of the whole process and other features because they are parts of one process.
Let’s recall what Ruby on Rails is, what it can offer, its pros and cons. Its most important benefit is that it is easy to work with.
If you write rails new you immediately get a new application, then you can create any REST API you want and use Rails helpers and generators, which makes development even easier.
If you need to send emails in your Rails app, then use Rails ActionMailer. When you need to do some hard processing, ActiveJob will help you. With Rails 5 you will also be able to use websockets out of the box. Thus, it will be easy to create chats or make your application more interactive.
If you use the correct DSL syntax, you can use all that and even more immediately. Moreover, you don’t have to know everything about the internal implementation of these tools: you just follow the DSL and receive the expected result.
It means something is the opposite of modular. A modular application can have parts, referred to as modules, replaced without requiring replacement of the entire application, whereas a monolithic application, after having a part fixed or upgraded, must be replaced in its entirety.
From Wikipedia: "Modularity is desirable, in general, as it supports reuse of parts of the application logic and also facilitates maintenance by allowing repair or replacement of parts of the application without requiring wholesale replacement."
So in the context of a monolithic class, all its features are self-contained, and if you want to add or alter a feature of the class you would need to alter/add code in the class and recompile it. Conversely, a modular class exposes access to functionality which is implemented externally. For example, a "Calculator" class may use a separate "Add" class for actually adding numbers; call a "Multiply" function from a separate library; or even call an "Amortize" function from a web service. As long as each of these functional parts can be altered externally from the class, it is modular.
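The "Calculator" contrast can be sketched in a few lines. This is only an illustration in Python under assumed names (MonolithicCalculator, ModularCalculator, compute and the free functions add and multiply are all hypothetical): the first class implements every operation itself, while the second delegates to components handed to it from outside.

```python
class MonolithicCalculator:
    """All behaviour lives in the class; changing it means editing the class."""

    def add(self, a, b):
        return a + b

    def multiply(self, a, b):
        return a * b


class ModularCalculator:
    """Delegates each operation to a component supplied from outside."""

    def __init__(self, operations):
        # Mapping of operation name -> callable, owned by the caller.
        self._operations = dict(operations)

    def compute(self, name, *args):
        return self._operations[name](*args)


# Externally implemented parts; these could equally come from another
# library or wrap a remote service.
def add(a, b):
    return a + b


def multiply(a, b):
    return a * b


calc = ModularCalculator({"add": add, "multiply": multiply})
```

Replacing the "add" entry with a different callable changes the behaviour of ModularCalculator without touching the class, whereas changing how MonolithicCalculator adds requires editing the class itself.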