Embedded Linux LED-flashing daemon: does it exist?

I've seen embedded boards before that have an LED that flashes like a heartbeat to show that the board is still executing code. I'd like to do something similar on an embedded Linux board I'm working on. Given that it's a fairly trivial bit of code, it seems likely to me that someone has already written a daemon for Linux that does this, but I haven't been able to find any evidence.
Note that OS X Server's heartbeatd and the High-Availability Linux heartbeat daemon are not what I'm looking for: they both coordinate system availability over IP networks, or something like that.
Assuming what I'm looking for doesn't exist, I'm also interested in advice about how to write a daemon that toggles a pin while minimizing resource usage. At what update rate does cron become a stupid idea?
(I'd also rather not hear gushing about the LED on the sleeping MacBook Pro, if that seems relevant for some reason.)
Thanks.

The LED heartbeat is a built-in kernel function. Assuming you have a device driver for your LED, turning on the heartbeat is done thus:
$ echo "heartbeat" > /sys/class/leds/MyLed/trigger
To see the list of available triggers (MMC activity, heartbeat, etc.):
$ cat /sys/class/leds/MyLed/trigger
See drivers/leds/ledtrig-heartbeat.c and http://www.avrfreaks.net/wiki/index.php/Documentation:Linux/LEDs
The interesting thing about the heartbeat is that the pattern is dynamic. The basic pattern is thump-thump-pause, just like a human heartbeat. But the rate of the heartbeat is controlled by the load average! Light loads beat at about 50 beats per minute. Heavier loads cause faster beating until it maxes out at about 180 bpm.
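If you would rather do the same thing from a program than from the shell, here is a minimal Python sketch of the two commands above. It assumes the usual LED class layout, where the trigger file lists all available triggers on one line and marks the active one with square brackets. The function names are made up for illustration, and MyLed is just the placeholder name used above.

```python
from pathlib import Path


def read_triggers(led_dir):
    """Return (available_triggers, active_trigger) for an LED sysfs directory."""
    words = (Path(led_dir) / "trigger").read_text().split()
    names = []
    active = None
    for word in words:
        # The active trigger is shown in brackets, e.g. "none [heartbeat] timer".
        if word.startswith("[") and word.endswith("]"):
            word = word[1:-1]
            active = word
        names.append(word)
    return names, active


def set_trigger(led_dir, name):
    """Select a trigger by writing its name, like `echo name > trigger`."""
    (Path(led_dir) / "trigger").write_text(name + "\n")


# Typical use on a board (normally needs root):
#   names, active = read_triggers("/sys/class/leds/MyLed")
#   if "heartbeat" in names:
#       set_trigger("/sys/class/leds/MyLed", "heartbeat")
```

Checking the list first avoids a write error on kernels built without the heartbeat trigger.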

I wouldn't use cron. It's just not the right tool. A very simple solution is to run a shell script from your inittab.
Example:
#!/bin/sh
# Runs forever; init respawns it if it ever exits.
while true
do
    logger "blink!" # to be replaced with the board-specific LED toggle
    sleep 1
done
Save this to /bin/blink.sh, make it executable, add the following line to your inittab, and have init reread the tab by running init q.
bl:2345:respawn:/bin/blink.sh
Of course you have to adjust the blink.sh script to your environment. How an LED can be toggled from user space (device driver file, sysfs entry, ...) depends heavily on the particular board.
If you need something more efficient you might rewrite the loop in C, but it might not be worth the effort.
One thing to think about is what you want to signal with a pulsing LED. With the approach outlined above we can only show that the board is still alive (kernel is running, the process executing blink.sh is scheduled and blink.sh is doing what it is supposed to do). For some use cases this might be fine but more often you actually want to signal that the application running on an embedded board is still OK (doesn't hang, hasn't crashed, ...). To implement such functionality you need to integrate the code that toggles the LED into the main loop of your application.
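As a rough illustration of that last point, here is a small Python sketch of a heartbeat that only advances when the application's main loop actually runs. It assumes the LED is exposed through a sysfs brightness file (with its trigger set to none so the kernel does not drive it) and that any non-zero value turns it on. The class name, the path and do_application_work() are hypothetical.

```python
import time
from pathlib import Path


class Heartbeat:
    """Toggle an LED at most once per `interval` seconds when beat() is called."""

    def __init__(self, brightness_path, interval=0.5, clock=time.monotonic):
        self.path = Path(brightness_path)
        self.interval = interval
        self.clock = clock
        self.state = False
        self.last_toggle = clock()

    def beat(self):
        """Call once per main-loop iteration; returns True if the LED changed."""
        now = self.clock()
        if now - self.last_toggle < self.interval:
            return False  # too early, keep the current LED state
        self.state = not self.state
        self.path.write_text("1\n" if self.state else "0\n")
        self.last_toggle = now
        return True


# Intended use inside the application (names are placeholders):
#   led = Heartbeat("/sys/class/leds/MyLed/brightness")
#   while True:
#       do_application_work()
#       led.beat()
```

If the main loop hangs, beat() is never called and the LED freezes in its last state, which is exactly the signal you want.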

Related

Sleep mode and duty cycling in UnetStack, and adding energy consumed in idle listening, and sleeping modes into a simple energy model

I have two questions:
We want to consider a very low transmission duty cycle in our underwater sensor network, as it is the power consumption in listening and sleep mode that will dominate our network lifetime in practice.
I noticed the Scheduler Commands (addsleep, showsleep, etc.) in the new version of the UnetStack Simulator, version 3.2.0. I downloaded the latest version of the simulator and tried to use those commands, both in the shell and inside Groovy scripts, and tried to import org.arl.unet.scheduler, but none of the Scheduler commands worked and I kept receiving errors.
For example, I tried addsleep 20.s.later, but the simulator does not recognise "later"; I also received errors for import org.arl.unet.scheduler.
Can anyone help me with how to use the addsleep command, for example?
Another question:
Besides consuming energy in transmitting and receiving, our modem draws 2.5 mA from a 5V supply while listening for the start of a packet, and can go to sleep and draw about 0.24 mA from a 5V supply, with the ability to wake up and return to the listening mode after a programmable time period.
So my question is, is there a way to consider energy consumed in idle listening and sleeping in a simple energy model?
We implemented a very simple energy model, something like the following (found this example in stackoverflow):
class MyHalfDuplexModem extends HalfDuplexModem {
  float energy = 1000
  @Override
  boolean send(Message m) {
    if (m instanceof TxFrameNtf) energy -= 10   // cost of transmitting a frame
    if (m instanceof RxFrameNtf) energy -= 1    // cost of receiving a frame
    return super.send(m)
  }
}
How can we add the energy consumed in idle listening and sleeping to the above code? Do we need to use something like WakeFromSleepNtf?
Thanks and any help is much appreciated.
Marwa
The scheduler service is usually hardware dependent, as it requires interaction with the specific single board computer (SBC) to put it into a sleep state and allow it to be woken up. On modems, this is usually the modem driver agent.
The HalfDuplexModem simulated modem doesn't provide this service, and so it won't work out of the box. Since HalfDuplexModem doesn't have an energy model built into it, "sleep" doesn't mean much to it. If you wanted to simulate networks where nodes slept and consumed less energy during the sleep, it would be possible to extend the HalfDuplexModem to implement the SCHEDULER service. The service is quite simple, with just 4 messages (AddScheduledSleepReq, RemoveScheduledSleepReq, GetSleepScheduleReq and WakeFromSleepNtf). Your implementation could keep track of the energy used by each node, based on whether it is sleeping, listening or transmitting, since you can keep track of the sleep schedule and hence know how much time the node has been awake/sleeping.
The commands addsleep, showsleep, etc. are simply convenience shortcuts in the shell extension that use the above 4 messages to do the actual work. They are enabled in the shell by loading the SchedulerShellExt, and you can simply use the messages directly from agents or in simulation scripts.
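The bookkeeping itself is independent of UnetStack, so here is a minimal sketch of it in plain Python rather than Groovy. It is not UnetStack API: the EnergyMeter class and its method names are hypothetical, the currents and supply voltage are the figures from the question, and the per-frame costs are treated as joules purely as an assumption. In an extended modem you would call set_state() when a scheduled sleep begins and again when the node wakes up.

```python
SUPPLY_VOLTS = 5.0
# Currents quoted in the question: 2.5 mA listening, 0.24 mA sleeping.
CURRENT_AMPS = {"listen": 2.5e-3, "sleep": 0.24e-3}


class EnergyMeter:
    """Track remaining energy (joules) across listen/sleep periods and frames."""

    def __init__(self, budget_joules, start_time=0.0, state="listen"):
        self.remaining = budget_joules
        self.state = state
        self.since = start_time

    def _settle(self, now):
        # Charge the time spent in the current state: E = V * I * t.
        elapsed = now - self.since
        self.remaining -= SUPPLY_VOLTS * CURRENT_AMPS[self.state] * elapsed
        self.since = now

    def set_state(self, state, now):
        """Switch between 'listen' and 'sleep' at simulation time `now` (seconds)."""
        self._settle(now)
        self.state = state

    def charge_event(self, joules, now):
        """Deduct a fixed per-frame cost (transmit or receive)."""
        self._settle(now)
        self.remaining -= joules
```

For example, 10 s of listening costs 5 V x 2.5 mA x 10 s = 0.125 J, while 100 s of sleep costs 5 V x 0.24 mA x 100 s = 0.12 J, which shows why the sleep schedule dominates the lifetime at low duty cycles.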

TestRealTime: How to test a realtime Operating System with Rational Test Real Time

On an AUTOSAR real-time operating system (OS), the software architecture is split into separate layers (user space, system call interface, kernel space). Switching between user context and kernel context is handled by hardware-specific infrastructure, and the context-switching handler is typically written in assembly code.
IBM® Rational® Test RealTime v8.0.1 (RTRT) currently treats embedded assembly code as described in the Q&A below.
https://www.ibm.com/support/pages/how-treat-embedded-assembly-code ( ** )
RTRT uses code insertion technology (technically known as instrumentation) to insert its own code and measure the code coverage of the system under test.
In my case, the OS has a fully pre-emptive design and no termination points. As a result, the OS runs continuously unless the power supply is lost. If there is no work, the OS sleeps (normally an idle state that does nothing). If any unexpected error or exception occurs, the OS shuts down and enters an infinite loop. In other words, the OS is always running.
I followed ( ** ) and made sure that context switching works correctly.
But I don't know how to get RTRT to finish its postprocessing (consisting of attolcov and attolpostpro) properly. Note that the OS has already run correctly through all my tasks, as confirmed with a debugger. The OS shutdown procedure executed correctly and the OS ended up in the infinite loop (such as while(1){};).
After RTRT ends all its processes, the coverage report for the OS module is still empty.
Based on IBM guideline for RTRT
https://www.ibm.com/developerworks/community/forums/atom/download/attachment_14076432_RTRT_User_Guide.pdf?nodeId=de3b0048-968c-4111-897e-b73654af32af
RTRT provides two breakpoints to mark the logging point (priv_writeln) and the termination point (priv_close) of its process.
I already tried to drive execution from the infinite loop (my OS) to priv_close (RTRT) by manipulating the PC register and all context-switching registers with the Lauterbach debugger, but the RTRT coverage report was empty even though no errors occurred. The absence of errors means that the context switch from kernel space to user space works and that main() returned correctly.
Solved the problem.
It definitely came from the context-switching process of the operating system.
In my case, I did a RAM dump to see what the user context memory looks like before starting the OS.
After that, I back up all context areas and restore them, in the exact same order, when leaving Sleep or the infinite loop.
This way, RTRT knows the return point and can reach the _exit of its own main() function to finish generating the report.
Finally, the code coverage report has been generated.

OS system calls in x86

While working on an educational, simplistic RISC processor, I was wondering how system calls work when implementing my software interrupt function. For example, let's say our program calls sys_end, which ends the current process. I know this would go to a vector table and then to the code that ends the current process.
My question is: is the code that ends the process run in supervisor mode or user mode? Nowhere I look seems to specify this. I'm assuming that if it runs in normal user mode, that could pose a very significant problem, as a user-mode process could do something evil like:
for (i = 0; i < 10000; i++) {
    int sys_fork // creates child process
}
which could be very bad. I thought the OS would have some say in how many times a process could repeat itself, not to mention what other harmful things a process could do by changing the code in the system call itself.
System calls run in supervisor mode for the duration of the system call. Supervisor mode is necessary for accessing hardware (the screen, the keyboard) and for keeping user processes isolated from each other.
There are (or can be configured) limits on the amount of cpu, number of processes, etc. a user process may use or request, which can offer some protection against the kind of runaway program you describe.
But the default Linux configuration allows 10k processes to be created in a tight loop; I've done it myself (both intentionally and accidentally).
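To see what limit actually applies on a given machine, you can query it. The sketch below uses Python's standard resource module, which wraps getrlimit() and is available on Unix-like systems only; RLIMIT_NPROC is the limit on the number of processes the calling user may have. The helper names are made up for illustration.

```python
import resource


def process_limit():
    """Return the (soft, hard) limit on the number of processes for this user."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def describe(limit):
    """Render a limit value, mapping the 'no limit' sentinel to a word."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


soft, hard = process_limit()
print("soft:", describe(soft), "hard:", describe(hard))
```

Once the soft limit is reached, further fork calls from that user fail with an error instead of creating more processes, which is what stops a runaway loop like the one in the question.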

How do OSes Handle context switching?

As I understand it, every OS needs some mechanism to periodically check whether it should run some tasks and suspend others.
One way would be some kind of timer on whose expiry the OS will check if it should run/suspend some task.
Generally, say on an ARM system, that would probably be some kind of ISR.
My real problem is that I've only been able to visualize this and have not seen it anywhere. Could someone point me to some free/open RTOS code where I can actually see the code that handles the preemption/scheduling?
freertos.org. The entire OS is open source, and right there for you to see. And there are dozens of different ports to compare and contrast. For the context switch code, you will want to look in the ports directory, in any one of many files called port.c, port.asm, etc. And yes, in the case of freertos all context switches are performed in interrupts (a tick timer ISR, or any other SysCall interrupt).
A context switch is very-much processor specific, as the list of registers to save and the assembly code to save them varies between processor families, and sometimes within a given family. As a result each port has a separate file for this code.
The scheduling (selection of next task to run), on the other hand, is done in a file called tasks.c, which is common to all ports and references the port-specific code.
It is not the case that an RTOS simply context switches periodically - that is how most GPOSes work. In an RTOS the scheduler runs on any scheduling event. These include system-tick, but also message post, event trigger, semaphore give, or mutex unlock for example.
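To make the "scheduler runs on any scheduling event" point concrete, here is a toy model in Python. It is only a sketch of the idea, not code from FreeRTOS or any other RTOS: there is no real register save/restore, the class and method names are invented, and a higher number is assumed to mean higher priority.

```python
class Scheduler:
    """Toy priority scheduler that re-evaluates on every scheduling event."""

    def __init__(self):
        self.ready = {}      # task name -> priority (higher number wins)
        self.running = None
        self.switches = []   # record of (event, task switched to)

    def _reschedule(self, event):
        # Pick the highest-priority ready task; switch only if it changed.
        best = max(self.ready, key=self.ready.get, default=None)
        if best != self.running:
            self.running = best
            self.switches.append((event, best))

    def make_ready(self, task, priority, event):
        """A task becomes runnable, e.g. after a semaphore give or message post."""
        self.ready[task] = priority
        self._reschedule(event)

    def block(self, task, event):
        """A task stops being runnable, e.g. while waiting on a mutex."""
        self.ready.pop(task, None)
        self._reschedule(event)

    def tick(self):
        """The periodic system tick is just one more scheduling event."""
        self._reschedule("tick")
```

In this model a tick causes no switch when nothing has changed, while a semaphore give that readies a higher-priority task preempts immediately, without waiting for the next tick.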
On ARM Cortex-M, CMSIS 3.x includes an RTOS API (intended primarily for RTOS developers rather than being a complete RTOS itself); the source for this will include a context switching mechanism.
If you want a detailed description for a simple RTOS you might consider reading µC/OS-II: The Real-Time Kernel or the slightly more sophisticated µC/OS-III: The Real-Time Kernel .
FreeRTOS is increasingly popular, though perhaps a little unconventional architecturally. A more complete (in that it is not just a scheduling kernel but a more complete OS) and very powerful option is eCos.
You can take a look at xv6.
It's not an RTOS; it is just a skeleton OS (based on V6 Unix) meant for academic purposes.
In the xv6 book, take a look at chapter 4; it explains, along with the code, how scheduling is done on a small OS like xv6. xv6 puts a process to sleep when it is waiting for disk or some other I/O operation, and there is also a timer interrupt every 100 msec to switch processes.
There is also an explanation, with code, of how the context switch takes place, what information is saved (the context frame of a process), and how the switch from user to kernel mode happens when the scheduler has to run.
The best part is that you have to do very little reading to understand these concepts, unlike with some reference books on OSes :) The code is relatively small; you can in fact run xv6 on QEMU, set breakpoints in sched, swtch and other functions, and actually see the information saved during a context switch.
You don't have to read the previous chapters to understand chapter 4. There isn't much dependency: xv6 uses struct proc to identify a process and ptable for all the current processes in the system; proc->context refers to the state the process is in (register values etc.), which is saved by the scheduler.
Cheers :)

How can I avoid zombies in Perl CGI scripts run under Apache 1.3?

Various Perl scripts (Server Side Includes) are calling a Perl module with many functions on a website.
EDIT:
The scripts are using use lib to reference the libraries from a folder.
During busy periods the scripts (not the libraries) become zombies and overload the server.
The server lists:
319 ? Z 0:00 [scriptname1.pl] <defunct>
320 ? Z 0:00 [scriptname2.pl] <defunct>
321 ? Z 0:00 [scriptname3.pl] <defunct>
I have hundreds of instances of each.
EDIT:
We are not using fork, system or exec, apart from the SSI directive
<!--#exec cgi="/cgi-bin/scriptname.pl"-->
As far as I know, in this case httpd itself will be the owner of the process.
MaxRequestsPerChild is set to 0, which should not let the parents die before the child process is finished.
So far we have figured out that temporarily suspending some of the scripts helps the server cope with the defunct processes and prevents it from falling over; however, zombie processes are still forming.
Apparently gbacon is closest to the truth with his theory that the server is not able to cope with the load.
What could lead to httpd abandoning these processes?
Is there any best practice to prevent these from happening?
Thanks
Answer:
The point goes to Rob.
As he says, CGI scripts that generate SSI's will not have those SSI's handled. The evaluation of SSI's happens before the running of CGI's in the Apache 1.3 request cycle. This was fixed with Apache 2.0 and later so that CGI's can generate SSI commands.
Since we were running on Apache 1.3, for every page view the SSI's turned into defunct processes. Although the server was trying to clear them, it was way too busy with the running tasks to be able to succeed. As a result, the server fell over and became unresponsive.
As a short term solution we reviewed all SSI's and moved some of the processes to client side to free up server resources and give it time to clean up.
Later we upgraded to Apache 2.2.
More Band-Aid than best practice, but sometimes you can get away with simple
$SIG{CHLD} = "IGNORE";
According to the perlipc documentation
On most Unix platforms, the CHLD (sometimes also known as CLD) signal has special behavior with respect to a value of 'IGNORE'. Setting $SIG{CHLD} to 'IGNORE' on such a platform has the effect of not creating zombie processes when the parent process fails to wait() on its child processes (i.e., child processes are automatically reaped). Calling wait() with $SIG{CHLD} set to 'IGNORE' usually returns -1 on such platforms.
If you care about the exit statuses of child processes, you need to collect them (commonly referred to as "reaping") by calling wait or waitpid. Despite the creepy name, a zombie is merely a child process that has exited but whose status has not yet been reaped.
If your Perl programs themselves are the child processes becoming zombies, that means their parents (the ones that are forking-and-forgetting your code) need to clean up after themselves. A process cannot stop itself from becoming a zombie.
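For reference, this is what "cleaning up" looks like on the parent side. The sketch is in Python rather than Perl, but os.fork and os.waitpid map onto the same POSIX calls as Perl's fork and waitpid. It is Unix-only, and the function name is just for illustration.

```python
import os


def run_and_reap(exit_code):
    """Fork a child that exits at once, then reap it and return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, skipping the parent's cleanup handlers.
        os._exit(exit_code)
    # Parent: until waitpid() is called, the exited child stays a zombie.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


print(run_and_reap(7))
```

If the parent never makes the waitpid() call (and has not set SIGCHLD to be ignored), every exited child lingers as a <defunct> entry in the process table, which matches the ps listing in the question.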
I just saw your comment that you are running Apache 1.3 and that may be associated with your problem.
SSI's can run CGI's. But CGI scripts that generate SSI's will not have those SSI's handled. The evaluation of SSI's happens before the running of CGI's in the Apache 1.3 request cycle. This was fixed with Apache 2.0 and later so that CGI's can generate SSI commands.
As I'd suggested above, try running your scripts on their own and have a look at the output. Are they generating SSI's?
Edit: Have you tried launching a trivial Perl CGI script that simply prints out a Hello World type HTTP response?
Then, if this works, add a trivial SSI directive such as
<!--#printenv -->
and see what happens.
Edit 2: Just realised what is probably happening. Zombies occur when a child process exits and isn't reaped. These processes are hanging around and slowly using up resources within the process table. A process without a parent is an orphaned process.
Are you forking off processes within your Perl script? If so, have you added a waitpid() call to the parent?
Have you also got the correct exit within the script?
CORE::exit(0);
As you have all the bits yourself, I'd suggest running the individual scripts one at a time from the command line to see if you can spot the ones that are hanging.
Does a ps listing show an inordinate number of instances of one particular script running?
Are you running the CGI's using mod_perl?
Edit: Just saw your comments regarding SSI's. Don't forget that SSI directives can run Perl scripts themselves. Have a look to see what the CGI's are trying to run?
Are they dependent on yet another server or service?