Diagnose crashes of FiftyOne app – logs or other tools

We need to make a FiftyOne instance available to multiple users via a web browser, which means starting a process that keeps running even after we log off from the session that launched it.
I’m using the following command to start the process. It runs in a Docker container on an Ubuntu host on AWS EC2.
$ nohup fiftyone app launch --remote > fiftyone.log 2>&1 &
If I launch this command from the terminal, it launches processes that allow a web browser to connect to the FiftyOne app. These persist after I log out.
However, these processes sometimes become unavailable. For example, after running for over 20 hours, FiftyOne crashed with the following in the log file ~/.fiftyone/var/lib/mongo/log/mongo.log.
(produced by cat ~/.fiftyone/var/lib/mongo/log/mongo.log | jq '{msg,t}')
{
  "msg": "CMD fsync",
  "t": {
    "$date": "2021-09-01T15:04:24.152+00:00"
  }
}
{
  "msg": "Received signal",
  "t": {
    "$date": "2021-09-01T15:04:24.181+00:00"
  }
}
{
  "msg": "Signal was sent by kill(2)",
  "t": {
    "$date": "2021-09-01T15:04:24.181+00:00"
  }
}
How might I get more information about why this crashed?

The open-source version of FiftyOne is designed primarily for individual users. The best experience for multi-user collaboration is FiftyOne Teams. You can sign up here: https://voxel51.com/#teams-form
About this error specifically:
On the backend, calling fiftyone app launch --remote in effect runs the following Python commands:
session = fo.launch_app(remote=True)
session.wait()
For remote sessions, the session.wait() call will block until something connects to it, and then will continue blocking until all connected tabs are closed.
A built-in timeout handles the case where a tab is refreshed, so that the session is not closed immediately. In some cases, we have noticed that the refresh takes longer than the timeout, and sessions are closed prematurely. This is being looked into.
The next release provides an argument that will cause wait to block indefinitely. You will be able to call fiftyone app launch --remote --wait 0.
In the meantime, I would recommend writing and calling a small script (launch_app.py) that blocks permanently until it is killed.
import time

import fiftyone as fo

session = fo.launch_app(remote=True)

# Block indefinitely; sleeping avoids burning a CPU core with a busy loop
while True:
    time.sleep(60)
python launch_app.py
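As before, the script can be launched with nohup so it survives logout; the redirection mirrors the original command, and fiftyone.log is just an example filename:
$ nohup python launch_app.py > fiftyone.log 2>&1 &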

Related

Stop restart loop in kiosk mode with crashing application

I have a Windows 10 LTSB 2016 machine set up in kiosk mode to run a single application. I followed the PowerShell instructions at https://learn.microsoft.com/en-us/windows/configuration/kiosk-shelllauncher#configure-a-custom-shell-using-powershell to build the install script, and it works well. In that script, I can set the behavior of the shell if the application crashes: I can restart the application, restart the PC, or shut down the PC.
In general this is all good, but occasionally someone does something, such as editing a file, that causes the application to crash instantly on startup. Because I have the script set to restart the application, the shell gets stuck in a loop where the application rapidly crashes and restarts. This fills my logs and is generally a poor experience for the end user.
I'd like to condition the behavior so the application only attempts to restart a limited number of times and then shuts down (or some other behavior). Is there any way to achieve this? Can the custom shell access a counter, or accept a return code from the application (e.g. to differentiate between an intended application shutdown/restart and an application crash)?

How to make Rust run gracefully in the background and daemonize?

This is what I want to achieve:
root> ./webserver start // Does not block the terminal after startup; runs in the background and the process is supervised
root>
My current implementation logic:
Logic running in the background
use std::env;
use std::process::Command;

fn main() {
    let args: Vec<String> = env::args().collect();
    if args.len() == 2 && args[1] == "start" {
        // Main process: spawn a copy of this binary as the child
        let child = Command::new(&args[0])
            .spawn()
            .expect("Child process failed to start.");
        println!("child pid: {}", child.id());
        // Main process exits here, leaving the child running
    } else {
        // Main business logic: run the web server here
    }
}
This way, the Rust program runs in the background without blocking the terminal, but the output it prints is still displayed on the terminal, and the program exits when the terminal is closed.
Process daemon logic
My idea is to monitor the system's exit signals and not act on the termination requests:
SIGHUP 1 /* Hangup (POSIX). */
SIGINT 2 /* Interrupt (ANSI). */
SIGQUIT 3 /* Quit (POSIX). */
SIGTERM 15 /* Termination (ANSI). */
code:
use signal_hook::{iterator::Signals, SIGHUP, SIGINT, SIGQUIT, SIGTERM};
use std::{thread, time::Duration};

pub fn process_daemon() {
    // Register handlers so these signals no longer terminate the process
    let signals = match Signals::new(&[SIGHUP, SIGINT, SIGQUIT, SIGTERM]) {
        Ok(t) => t,
        Err(e) => panic!("Failed to register signal handlers: {}", e),
    };
    // Handle incoming signals on a dedicated thread
    thread::spawn(move || {
        for sig in signals.forever() {
            println!("Received signal {:?}", sig);
        }
    });
    thread::sleep(Duration::from_secs(2));
}
Is there any more elegant approach? Looking forward to your reply.
TL;DR: if you really want your process to act like a service (and never quit), it is probably worth doing the work to set up a service manager. Otherwise, just let it be a normal process.
Daemonizing a Process
One thing to notice right off the bat is that most of the considerations about daemonizing have nothing to do with Rust as a language and are more about:
The underlying system your processes are targeted for
The exact behavior of your daemon processes once spawned
By looking at your question, it seems you have realized most of this. Unfortunately to properly answer your question we have to delve a bit into the intricacies of processes and how they are managed. It should be noted that existing 'service' managers are a great solution if you are OK with significant platform dependence in your launching infrastructure.
Linux: systemd
FreeBSD: rc
MacOS: launchd
Windows: sc
As you can see, it is no simple feat if you want a deployment that just works (provided it is compiled for the relevant system). And these are just the bare-metal service managers; if you want to support virtual environments, you will have to think about Kubernetes services, dockerizing, etc.
These tools exist because there are many considerations to managing a long-running process on a system:
Should my daemon behave like a service and respawn if killed (or if the system is rebooted)? The tools above will allow for this.
If a service, should my daemon have status states associated with it to help with maintenance? This can help with managing outages and building tooling to scale horizontally.
If the daemon shouldn't be a service (unlikely in your case given your binary's name) there are even more questions: should it be attached to the parent process? Should it be attached to the login process group?
My suggestion for your process, given how complex this can become: simply run the process directly. Don't daemonize at all.
For testing, (if you are in a unix-like environment) you can run your process in the background:
./webserver start &
This will spawn the new process in the background, but attach it to your shell's process list. This can be nice for testing, because if that shell goes away the system will clean up these attached processes along with it.
The above will direct stderr and stdout file descriptors back to your terminal and print them. If you wish to avoid that, you can always redirect the output somewhere else.
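For example, a minimal sketch (webserver.log is just an illustrative filename):
./webserver start > webserver.log 2>&1 &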
Disabling signals for a process like this doesn't seem like the right approach to me. Reserve these signals for gracefully exiting your process when you need to save state or send a termination message to a client. If you do the above, your daemon will only be killable by kill -9 <pid>, by rebooting, or by finding some non-standard signal you haven't overridden whose default behavior is to terminate.
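For illustration, here is a minimal sketch of that graceful-exit pattern, using the same signal_hook API style as the snippet above; the shutdown flag, sleep interval, and loop body are placeholders, not part of any real server:

use signal_hook::{iterator::Signals, SIGINT, SIGTERM};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::{thread, time::Duration};

fn main() {
    // Shared flag: flipped by the signal thread, polled by the main loop
    let shutdown = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&shutdown);

    let signals = Signals::new(&[SIGINT, SIGTERM])
        .expect("Failed to register signal handlers");
    thread::spawn(move || {
        for sig in signals.forever() {
            println!("Received signal {:?}, shutting down gracefully", sig);
            flag.store(true, Ordering::SeqCst);
        }
    });

    // Main loop: keep serving until a termination signal arrives
    while !shutdown.load(Ordering::SeqCst) {
        // ... handle requests here ...
        thread::sleep(Duration::from_millis(100));
    }
    // Save state / notify clients here, then exit cleanly
}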

How to run a Swift server as a single process

I'm trying to run a Swift-based web server using Kitura on Ubuntu.
This is the command to start the hello world server:
.build/debug/helloworld
I can launch a standalone process using .build/debug/helloworld &, but launching it that way creates multiple processes if I execute it again. Otherwise I have to kill the old process and then start a new one if I want only a single process running.
I've followed the tutorial below to get the server up and running, but I don't want to use Bluemix to deploy the application. Instead I want to launch it on AWS Ubuntu.
http://www.kitura.io/en/starter/gettingstarted.html
I assume there must be an easier and more proper way to do this. As you can see, I'm almost a newbie when it comes to servers.
You have to kill the Kitura process in order to stop a Kitura Server app - there is no other way to stop it.
If you just want to test your server you can run it inside a screen session. Screen is an essential utility for managing remote servers via ssh.
If you want to run it properly as a service/daemon you should look into systemd.
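For illustration, a minimal systemd unit for this server might look like the following; this is a sketch, and the unit name, paths, and user are placeholder assumptions rather than anything from the Kitura docs:

# /etc/systemd/system/helloworld.service (hypothetical name and path)
[Unit]
Description=Kitura hello world server
After=network.target

[Service]
# Adjust to wherever your compiled binary actually lives
ExecStart=/home/ubuntu/myapp/.build/debug/helloworld
Restart=on-failure
User=ubuntu

[Install]
WantedBy=multi-user.target

After installing the unit, systemctl enable helloworld and systemctl start helloworld start it at boot and on demand; systemd also restarts it on failure, and starting an already-active unit is a no-op, which addresses the duplicate-process problem.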

Connect to console app running as a system task on Windows server

I run several game servers on a single Windows Server 2012 R2 machine. Many of the game servers run as console applications. I have created scheduled tasks to run each on Windows startup even if I'm not logged on. I would like to be able to attach to the consoles of those apps when logged on to the server, similar to what can be done in Linux. Perhaps I'm going about this in the wrong way. Is there a way to attach to console apps running as tasks? Is there a software tool that accommodates this sort of thing?
Update:
Been searching high and low for a solution but haven't found anything yet. Have decided to write a wrapper for the console app that will redirect the stdin, stdout, and stderr of a process to a Telnet connection. Will use nssm to run the wrapper as a service.
I produced a solution: https://github.com/ccourson/Banjo
Banjo will launch a specified console application and route its streams to and from a telnet connection.
Pull requests welcome.

Install4j: Silent updater exits on start

I am running my installer in silent mode (-q option).
After starting, the installer quits. The error.log shows the entry:
"The application is running. Please close instances and run this installer again".
However, there is no other instance of the installer running.
Note - The installer is launched in the Windows local system account.
I encountered the exact same issue with an updater installer being run in unattended (silent) mode and launched from a Windows service. I discovered that there were two reasons for the updater failing to complete:
The unattended mode installer was being invoked with arguments -q -wait 20 (these are the defaults when generating an updater application using the install4j UI). The -wait argument causes the installer to wait for running applications to close before progressing to the stage where the Welcome screen would normally be shown in an interactive install. It does not attempt to close them itself. If the applications don't disappear within 20 seconds, the installer exits at this early stage with the error.log message "The application is running. Please close instances and run this installer again". (In interactive mode the user would be shown a screen at this stage asking them to close the running applications.)
(Note: -wait only seems to check for the presence of application processes (user-launched apps, UIs, etc.). Installed services that are running are not considered a blocker, probably because they are automatically stopped as part of the Install files action.)
I had already added actions later in my installer to close down the application processes before installing the update, but the -wait argument was not allowing it to progress that far. I removed the argument from the "Set installer arguments" action in the updater app and this allowed the installer to progress beyond the welcome phase.
The 'Check for running processes' action, using a Close strategy of 'Soft close immediately', failed to close any running user-launched applications (UI applications in my case) when the installer was launched from the Windows service. This in turn caused the unattended update to fail due to locking issues when overwriting the installed files.
However using the Close strategy of 'Terminate immediately' did allow the service account to successfully kill the running applications and allowed the installation to complete.
I ended up using a sequence of "Stop a service", "Check for running processes (soft close)" (which works when the installer is run in interactive mode by a user), then "Check for running processes (Terminate immediately)" at the start of the Installation section to cover all bases.