What does the EC2 command line say when a machine won't start?

When starting an instance on Amazon EC2, how would I detect a failure, for instance, if there's no machine available to fulfill my request? I'm using one of the less-common machine types and am concerned it won't start up, but am having trouble finding out what message to look for to detect this.
I'm using the EC2 command-line tools to do this. I know I can look for 'running' when I do ec2-describe-instances to see if the machine is up, but I don't know what to look for to see if the startup failed.
Thanks!

The output from ec2-start-instances only returns the previous state (stopped) and the new state (pending); as you say, you need to use ec2-describe-instances to retrieve the current state.
For that, you have a couple of choices: you can use a loop to check instance-state-name, looking for a result of running or stopped; alternatively, you could look at the reason or state-reason-code fields. Unfortunately, you'll need to trigger the failure you're worried about in order to learn the values that indicate it.
The batch file I use to wait for a successful startup (fill in the underscores):
@echo off
set EC2_HOME=C:\tools\ec2-api-tools
set EC2_PRIVATE_KEY=C:\_\pk-_.pem
set EC2_CERT=C:\_\cert-_.pem
set JAVA_HOME=C:\Program Files (x86)\Java\jre6
%EC2_HOME%\bin\ec2-start-instances i-_
:docheck
rem Count how many instances are still reported as 'stopped'.
%EC2_HOME%\bin\ec2-describe-instances | C:\tools\gnuwin32\bin\grep.exe -c stopped > %EC2_HOME%\temp.txt
findstr /m "1" %EC2_HOME%\temp.txt > nul
rem errorlevel 0 means the instance is still stopped: print a dot, wait, poll again.
if %errorlevel%==0 (c:\tools\gnuwin32\bin\echo -n "."
timeout /t 5 > nul
goto docheck)
del %EC2_HOME%\temp.txt

ec2-start-instances will return the previous state (after the last command to the instance) and the current state (after your command). ec2-stop-instances does the same thing. THE PROBLEM IS, if you are scripting and you use -start- on a 'stopping' instance, OR you use -stop- on a 'pending' instance, these will cause exceptions in the command line tool and NASTILY exit your scripts all the way to the original console (VERY BAD BEHAVIOR, AMAZON). So you have to go all the way through parsing the ec2-describe-instances [instance-id] result. HOWEVER, that still leaves you vulnerable to that tiny little bit of time between when you GET the status from your instance and when you APPLY A COMMAND. If someone else, or Amazon, puts you into 'pending' or 'stopping', and you then do 'stop' or 'start' respectively, your script will break. I really don't know how to catch such an exception with a script. Bad Amazon AWS, BAD DOG!
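To make that race window at least explicit, one approach is to check the state immediately before issuing a command and refuse to act on a transitional state. Below is a minimal sh sketch of the idea; the instance id is a placeholder, and the awk field position assumes the tab-separated INSTANCE line layout of the classic EC2 API tools, which varies between versions, so verify it against your own output.

#!/bin/sh
# Sketch: only issue ec2-stop-instances when the instance is in a stable state.
# Assumption: instance-state-name is field 6 of the INSTANCE line; check this
# against your version of the API tools.
INSTANCE=i-XXXXXXXX   # placeholder instance id

state=$(ec2-describe-instances "$INSTANCE" | awk '/^INSTANCE/ { print $6 }')

case "$state" in
  running)
    ec2-stop-instances "$INSTANCE" || echo "stop failed; re-check the state" >&2
    ;;
  stopped)
    echo "already stopped" ;;
  pending|stopping)
    echo "instance is '$state'; refusing to issue a command" >&2 ;;
  *)
    echo "unexpected state '$state'" >&2 ;;
esac

This doesn't close the window, it only narrows it; if the state changes between the describe and the stop, the tool can still throw, so script defensively around its exit status.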

Related

How to launch a script that executes a check disk (chkdsk) without manual confirmation in PowerShell?

I have a supervision tool that can deploy scripts on customers' end devices.
I'm trying to make two PowerShell scripts.
The first one is supposed to launch a "chkdsk disk_name: /f /r".
The second one is supposed to extract the result of the chkdsk after the reboot from the event viewer.
The second script is operational. My problem is with the first one.
I think that when I launch my job from my administration tool, the script is launched on the end device. But when you type "chkdsk disk_name: /f /r" at a command prompt, it asks whether you want to run the chkdsk at the next start of the machine, because the disk is currently in use. I think the letter "Y" that you have to type to confirm is blocking the execution of the command (and my script by consequence).
I didn't find any method in the command's documentation to launch it with a "default confirmation".
Do you have any idea of what I can do to automate this?
Sorry for my English, it's not my native language.
Thank you all!
I tried to launch the script (it's in admin mode when my administration tool launches its job), but the result was that my job ran indefinitely, and after the restart of the machine the check disk was not performed.

exec PowerShell script having corrective action every time

Getting the corrective action for exec while using PowerShell to add users and groups to the local admin group. Please note I am not a scripting guy; I'm not sure what I am doing wrong.
By default, an Exec resource is applied on every run. That is mediated, where desired, by the resource's unless, onlyif, and/or creates parameters, as described in the resource type's documentation.
The creates parameter is probably not appropriate for this particular case, so choose one of unless or onlyif. Each one is expected to specify a command for Puppet to run, whose success or failure (as judged by its exit status) determines whether the Exec should be applied. These two parameters differ primarily in how they interpret the exit status:
unless interprets exit status 0 (success) as indicating that the Exec's main command should not be run
onlyif interprets exit statuses other than 0 (failure) as indicating that the Exec's main command should not be run
I cannot advise you about the specific command to use here, but the general form of the resource declaration would be:
exec { 'Add-LocalGroupMember Administrators built-in':
  command  => '... PowerShell command to do the work ...',
  unless   => '... PowerShell command that exits with status 0 if the work is already done ...',
  provider => 'powershell',
}
(That assumes that the puppetlabs-powershell module is installed, which I take to be the case for you based on details presented in the question.)
I see your comment on the question claiming that you tried this approach without success, but this is the answer. If your attempts to implement this were unsuccessful then you'll need to look more deeply into what went wrong with those. You haven't presented any of those details, and I'm anyway not fluent in PowerShell, but my first guess would be that the exit status of your unless or onlyif script was computed wrongly.
Additionally, you probably should set the Exec's refresh parameter to a command that succeeds without doing anything. I'm not sure what that would be on Windows, but on most other systems that Puppet supports, /bin/true would be idiomatic. (That's not correct for Windows; I give it only as an example of the kind of thing I mean.) This will prevent the main command from running twice in the same Puppet run in the event that the Exec receives an event.

Perl script file run manually but not in crontab

I have a Perl script file that was running fine in crontab, but it suddenly stopped running without any modification.
cd /home/user/public_html/crons && ./script.pl 2>&1 >/dev/null
The top of the script file is #!/usr/bin/perl -X
The output expected from this script is changes in the database
I have another script file with the same modification and it still works fine
When I run the file in the browser it works fine and executes all lines without any problem
I tried the full path /usr/bin/perl but it didn't work
I tried perl at the beginning but it didn't work
I ran the command from SSH using PuTTY but nothing happened
I checked the log file /var/log/cron but there were no errors at all
I created a temporary log file with cd /home/user/public_html/crons && ./script.pl > /tmp/temp.log 2>&1 to see the errors, but the log is empty
Here is the solution:
I found the issue: there was a stuck process for the same cron file, so I killed that process and it's fixed.
You can find your file process like this
ps aux | grep 'your cron file here'
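If pgrep/pkill are available (the procps tools on most Linux systems), a slightly more direct version of the same check might look like this; the script name is a placeholder for whatever appears in your crontab:

# list any running copies, matching against the full command line (-f);
# -a also prints the command line so you can eyeball what you found
pgrep -af script.pl
# once you've confirmed they are stuck, kill them with the same match
pkill -f script.pl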
This is a really common antipattern people seem to tend toward with cron.
Cron sends you an email with the output of your script, if it generates any output. People often redirect output to /dev/null to prevent cron from sending the email. This is bad because now the output of your script is lost entirely. Even if the script has some built-in logging, it might generate errors before it gets the log file opened and those are lost. It also might crash in a way that doesn't get written to the logging mechanism.
At a bare minimum, you should just remove 2>&1 >/dev/null to start receiving the email. (And also test your mail setup using a temporary cron job like 1 * * * * echo "Test".)
The next better solution is to change it to >> /var/log/myscript/current.log, then set up something to rotate the log files (like logrotate), and make sure to create that directory with permissions that allow the script's user to write to it. By redirecting only STDOUT of the script, any errors or warnings it writes to STDERR will cause you to get an email, while if there are no errors/warnings the output goes to the log file and no email gets sent.
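For the rotation half of that, a minimal logrotate snippet might look like the following; the path and retention numbers are hypothetical, and it would live in a file under /etc/logrotate.d/:

/var/log/myscript/current.log {
    # rotate once a week, keep eight old logs, gzip them
    weekly
    rotate 8
    compress
    # tolerate a missing or empty log
    missingok
    notifempty
}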
Neither of those changes solve the root problem though, which is that when cron runs your script it does so with a different environment than you have on the command line. What you really want is a way to run the script with a consistent environment, and log it. The "ultimate solution" is to define your task in some kind of service manager, and then use cron to occasionally start it. For instance, you could use systemd and define a service that doesn't restart, then use systemctl start my_custom.service in your cron job. Now you can test independent of cron, and your tests will have the same exact environment, and be logged by the service manager. As extra bonuses, you are protected from accidentally running your script twice at once, and you get a clean way to stop a running cron job without the danger of stale pid files.
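To illustrate, a oneshot unit along these lines would do; the unit name and paths here are hypothetical:

# /etc/systemd/system/myscript.service (hypothetical name and paths)
[Unit]
Description=Perl cron job, run once per invocation

[Service]
Type=oneshot
WorkingDirectory=/home/user/public_html/crons
ExecStart=/home/user/public_html/crons/script.pl

The crontab entry then becomes something like 0 2 * * * systemctl start myscript.service, and stdout/stderr from the script land in the journal where journalctl -u myscript can find them.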
I don't particularly advocate systemd myself, but thankfully there are lots of alternatives:
Runit : http://smarden.org/runit/runsvdir.8.html
S6 : https://skarnet.org/software/s6/
Perp : http://b0llix.net/perp/site.cgi?page=perpd.8
(But installing and configuring a service manager is a bigger task than just using systemd if your distro is already based on systemd.) Each of these allows you to define a service that doesn't restart. Then you use a shell command to issue a "run once" directive to the supervisor, which runs the task as a child. Now you can easily launch the jobs yourself and see all the errors in the log, then add that command to the crontab and know that it will run identically when cron starts it.
Back to your original problem: once you get some logging, you are likely to discover it is a permission problem or an upgraded module in the system perl.

Automating Month End Process

Can the month-end process be automated in Progress-based applications like nessie? I already searched for it, and I think maybe it can be done by scheduling it through background jobs.
Scheduling jobs is a function of the OS or of 3rd party applications that specialize in such things (generally used in large enterprises with IT groups that obsess over that kind of stuff).
If you are using UNIX then you want to look into "cron".
If you are using Windows then "scheduled tasks".
In any event you will need to create a "wrapper" script that properly sets the background job environment and launches a Progress session. If you are using Windows, you should be aware that a batch process is "headless" and that, unless your batch process is doing something very strange, it will not be using GUI components, so you should probably run _progres.exe rather than prowin32.exe.
A generic (UNIX) example:
#!/bin/sh
#
# Point DLC at the Progress install and put its bin directory on the PATH.
DLC=/usr/dlc
PATH=$DLC/bin:$PATH
export DLC PATH
# -b runs a batch (headless) session; all output is captured in logfile.
_progres -b -db /path/dbname -p batchjob.p > logfile 2>&1 &
(That is "_progres" with just 1 "s" -- this is from the days when file names were restricted to 8 characters on some operating systems.)
Windows is very similar:
@echo off
set DLC=c:\progress
set PATH=%DLC%\bin;%PATH%
_progres.exe -b -db \path\dbname -p batchjob.p > logfile 2>&1
But there are a lot of "gotchas" with Windows. If, for instance, you run a job using a login id that might actually log in, then you will have the problem that on logout all of that user's scheduled tasks are "helpfully" killed by the OS. Aside from stopping your job when you probably don't want it stopped, this may have other negative side effects, like crashing the db. To get around that problem on Windows, you either create a "service account" that never logs in, or use a 3rd party scheduler that runs jobs "as a service".
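One way to sidestep the logout problem with the built-in scheduler is to register the wrapper under the SYSTEM account, which never logs in or out. A sketch using schtasks, with a hypothetical task name and script path (verify the flags against your Windows version):

schtasks /create /tn "MonthEndBatch" /tr "c:\progress\scripts\monthend.bat" /sc monthly /d 1 /st 02:00 /ru SYSTEM

Running as SYSTEM also means the task does not have your network credentials, so keep the paths local or use a dedicated service account instead.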

how to use a shell script to supervise a program?

I've searched around but haven't quite found what I'm looking for. In a nutshell, I have created a bash script to run in an infinite while loop, sleeping and checking if a process is running. The only problem is that even if the process is running, it says it is not and opens another instance.
I know I should check by process name and not process id, since another process could jump in and take the id. However, all perl programs are named Perl5.10.0 on my system, and I intend to have multiple instances of the same perl program open.
The following "if" always returns false; what am I doing wrong here?
while true; do
  if [ ps -p $pid ]; then
    echo "Program running fine"
    sleep 10
  else
    echo "Program being restarted\n"
    perl program_name.pl &
    sleep 5
    read -r pid < "${filename}_pid.txt"
  fi
done
Get rid of the square brackets. It should be:
if ps -p $pid; then
The square brackets are syntactic sugar for the test command. This is an entirely different beast and does not invoke ps at all:
if test ps -p $pid; then
In fact that yields "-bash: [: -p: binary operator expected" when I run it.
Aside from the syntax error already pointed out, this is a lousy way to ensure that a process stays alive.
First, you should find out why your program is dying in the first place; this script doesn't fix a bug, it tries to hide one.
Secondly, if it is so important that the program remain running, why do you expect your (at least once already) buggy shell script to do the job? Use a system facility that is specifically designed to restart server processes. If you say what platform you are using and the nature of your server process, I can offer more concrete advice.
added in response to comment:
Sure, there are engineering exigencies, but as the OP noted in the question, there is still a bug in this attempt at a solution:
I know I should check by process name
and not process id, since another
process could jump in and take the id.
So now you are left with a PID tracking script, not a process "nanny". Although the chances are small, the script as it now stands has a ten second window in which
the "monitored" process fails
I start up my week long emacs process which grabs the same PID
the nanny script continues on blissfully unaware that its dependent has failed
The script isn't merely buggy, it is invalid because it presumes that PIDs are stable identifiers of a process. There are ways that this could be better handled even at the shell script level. The simplest is to never detach the execution of perl from the script since the script is doing nothing other than watching the subprocess. For example:
while true; do
  if perl program_name.pl; then
    echo "program_name terminated normally, restarting"
  else
    echo "oops program_name died again, restarting"
  fi
done
Which is not only shorter and simpler, but it actually blocks for the condition that you are really interested in: the run-state of the perl program. The original script repeatedly checks a bad proxy indication of the run state condition (the PID) and so can get it wrong. And, since the whole purpose of this nanny script is to handle faults, it would be bad if it were faulty itself by design.
I totally agree that fiddling with the PID is nearly always a bad idea. The while true ; do ... done script is quite good; however, for production systems there are a couple of process supervisors which do exactly this and much more, e.g.:
enable you to send signals to the supervised process (without knowing its PID)
check how long a service has been up or down
capture its output and write it to a log file
Examples of such process supervisors are daemontools or runit. For a more elaborate discussion and examples, see Init scripts considered harmful. Don't be disturbed by the title: traditional init scripts suffer from exactly the same problem you do (they start a daemon, keep its PID in a file, and then leave the daemon alone).
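To give a sense of how little is involved, a runit service is just a directory containing a run script; a minimal sketch, with a hypothetical service directory and script path:

#!/bin/sh
# hypothetical /etc/sv/program_name/run
# fold stderr into stdout so the logger captures everything
exec 2>&1
exec perl /path/to/program_name.pl

runsv then supervises the process directly as its own child, so no PID file is involved, and sv up/down/term program_name controls it without you ever touching a PID.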
I agree that you should find out why your program is dying in the first place. However, an ever-running shell script is probably not a good idea. What if this supervising shell script dies? (And yes, get rid of the square brackets around ps -p $pid. You want the exit status of the ps -p $pid command. The square brackets are a replacement for the test command.)
There are two possible solutions:
Use cron to run your "supervising" shell script to see if the process you're supervising is still running, and if it isn't, restart it. The supervised process can output its PID into a file. Your supervising program can then cat this file and get the PID to check.
If the program you're supervising provides a service on a particular port, make it an inetd service. This way, it isn't running at all until there is a request on that port. If you set it up correctly, it will terminate when not needed and restart when needed. It takes fewer resources, and the OS will handle everything for you.
That's what kill -0 $pid is for. It returns success if a process with pid $pid exists.
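A minimal sketch of that check; note that kill -0 sends no signal at all, and it can also fail with "operation not permitted" when the process exists but belongs to another user, so the stderr redirect keeps the loop quiet:

if kill -0 "$pid" 2>/dev/null; then
  echo "process $pid is alive (or at least signalable)"
else
  echo "no process with pid $pid that we can signal"
fi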