What is the principle of Codepad.org website? - webserver

I wondering the principle of Codepad.org website. (Principle of online C compiler)
I think the principle follow these steps.
User submit the C code.
Website send to GCC installed on server.
GCC compile the code.
GCC return the strings and send to Website(Webserver)
Webserver return the strings to user.
Is that steps right?
Then, how to protect from malignant code such as deleting all file from server?

From http://codepad.org/about:
Code execution is handled by a supervisor based on geordi. The strategy is to run everything under ptrace, with many system calls disallowed or ignored. Compilers and final executables are both executed in a chroot jail, with strict resource limits. The supervisor is written in Haskell.
Also:
Paranoia
When your app is remote code execution, you have to expect security problems. Rather than rely on just the chroot and ptrace supervisor, I've taken some additional precautions:
The supervisor processes run on virtual machines, which are firewalled such that they are incapable of making outgoing connections.
The machines that run the virtual machines are also heavily firewalled, and restored from their source images periodically.

Related

Pass SIGTSTP signal to all processes in a job in LSF

The problem statement in short: Is there a way in LSF to pass a signal SIGCONT/SIGTSTP to all processes running within a job?
I have a Perl wrapper script that runs on LSF (Version 9.1.2) and starts a tool (Source not available) on the same LSF machine as the Perl script.
The tool starts 2 processes, one for license management and another for doing the actual work. It also supports an option where sending SIGSTSP/SIGCONT to both processes will release/reacquire the license (which is what I wish to achieve).
Running bkill -s SIGCONT <JOB_ID> only resumes the tool process and not the license process, which is a problem.
I tried to see if I can send the signals to the Perl script's own PGID, but the license process starts its own process group.
Any suggestions to move forward through Perl or LSF options are welcome.
Thanks,
Abhishek
I tried to see if I can send the signals to the Perl script's own PGID, but the license process starts its own process group.
This is likely your problem right here. LSF keeps track of "processes running within the job" by process group. If your job spawns a process that runs within its own process group (say by daemonizing itself) then it essentially is a runaway process out of LSF's control -- it becomes your job's responsibility to manage it.
For reference, see the section on "Detached processes" here.
As for options:
I think the cgroups tracking functionality helps in a lot of these cases, you can ask your admin if LSF_PROCESS_TRACKING and LSF_LINUX_CGROUP_ACCT are set in lsf.conf. If they aren't, then you can ask him to set them and see if that helps for your case (you need to make sure the host you're running on supports cgroups). In 9.1.2 this feature is turned on at installation time, so this option might not actually help you for various reasons (your hosts don't have cgroups enabled for example).
Manage the license process yourself. If you can find out the PID/PGID of the license process from within your perl script, you can install custom signal handlers for SIGCONT/SIGSTP in your script using sigtrap or the like and forward them to the license process yourself when your script receives them through bkill. See here.

Implementing a distributed grep

I'm trying to implement a distributed grep. How can I access the log files from different systems? I know I need to use the network but I don't know whether you use ssh, telnet, or anything else? What information do I need to know about the machines I am going to connect to from my machine? I want to be able to connect to different Linux machines and read their log files and pipe it back to my machine.
Your system contains a number of Linux machine which produce log data(SERVERs), and one machine which you operate(CLIENT). Right?
Issue 1) file to be accessed.
In general, log file is locked by a software which produce log data, because the software has to be able to write data into log file at any time.
To access the log file from other software, you need to prepare unlocked log data file.
Some modification of the software's setup ane/or the software(program) itself.
Issue 2) program to serve log files.
To get log data from SERVER, each SERVERs have to run some server program.
For remote shell access, rshd (remote shell deamon) is needed. (ssh is combination of rsh and secure communication).
For FTP access, ftpd (file transfer protocol deamon) is needed.
The software to be needed is depend how CLIENT accesses SERVERs.
Issue 3) distribued grep.
You use words 'distribued grep'. What do you mean by the words?
What are distribued in your 'distributed grep'?
Many senarios came in my mind.
a) Log files are distribued in SERVERs. All log data are collected to CLIENT, and grep program works for collected log data at CLIENT.
b) Log files are distribued in SERVERs. Grep function are implemented on each SERVERs also. CLIENT request to each SERVERs for getting the resule of grep applied to log data, and results are collected to CLIENT.
etc.
What is your plan?
Issue 4) access to SERVERs.
Necessity of secure communication is depend on locations of machines and networks among them.
If all machines are in a room/house, and networks among machines are not connected the Internet, secure communication is not necessary.
If the data of log is top secret, you may need encript the data before send the data on the network.
How is your log data important?
At very early stage of development, you should determing things described above.
This is my advice.

Duplicated code sections - move to a service?

I have a C# application that enables users to write a test and execute it (client). It also supports distributed execution over multiple machines using a central server and agents on said machines.
The agent is practically a duplication of the original execution ability but it is in a standalone solution.
We'd like to refactor that because:
Code duplication.
If a user will try to write and execute on a machine that runs an agent, there will be a problematic collision.
I'm considering 2 options:
Move this execution to a service, that both client and agent will use. I mean a service that will run locally, not a web service.
Merge client and agent - we'll have no agent, but the server will communicate with the client as an agent instead.
I have no experience in working with services. Are there any known advantages/disadvantages to either options?
A common library shared by both client and agent sounds more appropriate to allow simple cases such as just using the client and avoid the overhead of having to set up an extra service locally.

Need an opinion on a method for pull data from a file with Perl

I am having a conflict of ideas with a script I am working on. The conflict is I have to read a bunch of lines of code from a VMware file. As of now I just use SSH to probe every file for each virtual machine while the file stays on the server. The reason I am now thinking this is a problem is because I have 10 virtual machines and about 4 files that I probe for filepaths and such. This opens a new SSH channel every time I refer to the ssh object I have created using Net::OpenSSH. When all is said and done I have probably opened about 16-20 ssh objects. Would it just be easier in a lot of ways if I SCP'd the files over to the machine that needs to process them and then have most of the work done on the local side. The script I am making is a backup script for ESXi and it will end up storing the files anyway, the ones that I need to read from.
Any opinion would be most helpful.
If the VM's do the work locally, it's probably better in the long run.
In the short term, the ~equal amount of resources will be used, but if you were to migrate these instances to other hardware, then of course you'd see gains from the processing distribution.
Also from a maintenance perspective, it's probably more convenient for each VM to host the local process, since I'd imagine that if you need to tweak it for a specific box, it would make more sense to keep it there.
Aside from the scalability benefits, there isn't really any other pros/cons.

Can I run my mod_perl application as an ordinary user

Can I run my mod_perl aplication as an ordinary user similar to running a plain vanilla CGI application under suexec?
From the source:
Is it possible to run mod_perl enabled Apache as suExec?
The answer is No. The reason is that
you can't "suid" a part of a process.
mod_perl lives inside the Apache
process, so its UID and GID are the
same as the Apache process.
You have to use mod_cgi if you need
this functionality.
Another solution is to use a crontab
to call some script that will check
whether there is something to do and
will execute it. The mod_perl script
will be able to create and update this
todo list.
A more nuanced answer with some possible workarounds from "Practical mod_perl" book:
(I hope that's not a pirated content, if it is please edit it out)
mod_perl 2.0 improves the situation,
since it allows a pool of Perl
interpreters to be dedicated to a
single virtual host. It is possible to
set the UIDs and GIDs of these
interpreters to be those of the user
for which the virtual host is
configured, so users can operate
within their own protected spaces and
are unable to interfere with other
users.
Additional solutions from the sme book are in appendix C2
As mod_perl runs within the apache process, I would think the answer is generally no. You could, for example, run a separate apache process as this ordinary user and use the main apache process as a proxy for it.