How to log kernel messages to /var/log/messages on Yocto system with systemd?

We have a fairly standard Yocto system with Busybox and systemd. We have noticed that kernel messages (visible with dmesg etc) are not logged to /var/log/messages, but are only visible with "journalctl".
The system runs all of these:
/sbin/klogd -n
/sbin/syslogd -n
/lib/systemd/systemd-journald
I have noticed that kernel messages are actually logged in /var/log/messages if journald is stopped. That service provides little of value to us, so disabling or removing it altogether could be a solution, but how can this be done?

Related

Why does systemd remove my SHM file but not PostgreSQL's?

I'm developing a daemon application on an Ubuntu server that is managed by systemd.
I create a SHM file in /dev/shm/ using shm_open, and close the file descriptor after calling mmap. At first the file exists, but it disappears after a while, possibly when I log out from the server.
Perhaps this is controlled by the option RemoveIPC=yes in /etc/systemd/logind.conf.
My questions are:
Why does systemd clean up my shm file but not the one created by PostgreSQL?
How can I modify my app to behave like PostgreSQL, so that we can reduce the management and maintenance work in production?
I found that the shared memory is still usable after the file has been cleaned up by systemd. Does this mean that I can ignore the removal and continue to use the memory without recreating it?
I think your suspicion is right; see the documentation for details:
If systemd is in use, some care must be taken that IPC resources (including shared memory) are not prematurely removed by the operating system. This is especially of concern when installing PostgreSQL from source. Users of distribution packages of PostgreSQL are less likely to be affected, as the postgres user is then normally created as a system user.
The setting RemoveIPC in logind.conf controls whether IPC objects are removed when a user fully logs out. System users are exempt. This setting defaults to on in stock systemd, but some operating system distributions default it to off.
[...]
A “user logging out” might happen as part of a maintenance job or manually when an administrator logs in as the postgres user or something similar, so it is hard to prevent in general.
What is a “system user” is determined at systemd compile time from the SYS_UID_MAX setting in /etc/login.defs.
Packaging and deployment scripts should be careful to create the postgres user as a system user by using useradd -r, adduser --system, or equivalent.
Alternatively, if the user account was created incorrectly or cannot be changed, it is recommended to set
RemoveIPC=no
in /etc/systemd/logind.conf or another appropriate configuration file.
While this is talking about PostgreSQL, the same applies to your software. So take one of the recommended measures.
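To check whether the account your daemon runs under would count as a system user, something like the following Python sketch may help. It rests on assumptions: it reads SYS_UID_MAX from the current /etc/login.defs, whereas the documentation quoted above says systemd fixes the value at compile time, so the two can differ. The fallback of 999 and the helper names parse_sys_uid_max and is_system_uid are made up for illustration.

```python
import os
import re

# Assumed fallback when SYS_UID_MAX is absent from login.defs.
# The value your systemd build actually uses may differ.
DEFAULT_SYS_UID_MAX = 999


def parse_sys_uid_max(login_defs_text, default=DEFAULT_SYS_UID_MAX):
    """Return SYS_UID_MAX from the text of a login.defs file."""
    for line in login_defs_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        match = re.fullmatch(r"SYS_UID_MAX\s+(\d+)", line)
        if match:
            return int(match.group(1))
    return default


def is_system_uid(uid, sys_uid_max):
    """Treat UIDs at or below SYS_UID_MAX as system users."""
    return uid <= sys_uid_max


if __name__ == "__main__":
    try:
        with open("/etc/login.defs") as f:
            text = f.read()
    except OSError:
        text = ""  # fall back to the assumed default
    limit = parse_sys_uid_max(text)
    uid = os.getuid()
    print(f"uid={uid} SYS_UID_MAX={limit} system_user={is_system_uid(uid, limit)}")
```

Run it as the user that owns the daemon. If it reports system_user=False, the RemoveIPC cleanup described above probably applies to that account.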

Supervisor kills Prefect agent with SIGTERM unexpectedly

I'm using a Raspberry Pi 4 running version 10 (buster).
I installed supervisor per the instructions here: http://supervisord.org/installing.html
Except I changed "pip" to "pip3" because I want to monitor programs that run on Python 3.
I'm using Prefect, and the supervisord.conf is running the program with command=/home/pi/.local/bin/prefect "agent local start" (I tried this with and without double quotes)
Looking at the supervisord.log file, it seems the Prefect agent does start: I see the ASCII art that normally shows up when I start it from the command line. But then the log shows it was "terminated by SIGTERM; not expected", followed by "WARN received SIGTERM indicating exit request".
I saw this post: Supervisor gets a SIGTERM for some reason, quits and stops all its processes but I don't even have that 10Periodic file it references.
Anyone know why/how Supervisor processes are getting killed by sigterm?
It could be that your process exits immediately because you don't have an API key in your command, and a key is required to connect your agent to the Prefect Cloud API. Additionally, it's a best practice to always assign a unique label to your agents; below is an example with "raspberry" as a label.
You can also check the logs/status:
supervisorctl status
Here is a command you can try. You can also specify a directory in your supervisor config (I'm not sure whether the environment variables are needed, but I saw them in another Raspberry Pi supervisor user's config):
[program:prefect-agent]
command=prefect agent local start -l raspberry -k YOUR_API_KEY --no-hostname-label
directory=/home/pi
user=pi
environment=HOME="/home/pi",USER="pi"
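On the quoting tried in the question: the short Python sketch below shows how shell-style splitting treats the two variants of the command line. It assumes supervisor tokenizes command= with rules like those of Python's shlex module, which is worth verifying against your supervisor version. The helper name split_command is made up for illustration.

```python
import shlex


def split_command(command):
    """Split a command line into argv using shell-like quoting rules."""
    return shlex.split(command)


quoted = split_command('/home/pi/.local/bin/prefect "agent local start"')
unquoted = split_command("/home/pi/.local/bin/prefect agent local start")

print(quoted)    # the quoted words arrive as ONE argument
print(unquoted)  # each word is its own argument
```

With the double quotes, prefect would be handed the single argument "agent local start" rather than three separate words, so the unquoted form is the one that matches what you type on the command line.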

Artemis: AMQ222210: Storage usage is beyond max-disk-usage. System will start blocking producers

I'm sending a message from Application A to Artemis but I'm getting this error from Application A:
AMQ212054: Destination address=my-service is blocked. If the system is configured to block make sure you consume messages on this configuration.
Looking at the logs of artemis starting up this is what I see which I believe is the cause:
AMQ222210: Storage usage is beyond max-disk-usage. System will start blocking producers
I've looked at the documentation here and found nothing that could help. Also have logged into the running container and changed the 'max-disk-usage' to 100 as per my google research and so far nothing has helped.
I'm running artemis using the following command:
docker run -it --rm -e ARTEMIS_USERNAME=artemis -e ARTEMIS_PASSWORD=artemis -p 8161:8161 -p 61616:61616 vromero/activemq-artemis
Any help is appreciated~ Thank you
You are receiving this message because your computer's disk is over 90% full, and Artemis blocks producers once this happens. To solve your problem you can either:
1) Free up disk space on your computer so that usage is below 90%.
2) Increase how full your disk can be before Artemis blocks producers. To do this, modify the broker configuration file, which is located at:
path-to-broker\artemis\etc\broker.xml
In this file, there is a tag labeled max-disk-usage, which is set to 90 by default. Simply increase this to 100 (or whatever value you feel comfortable with).
Note that Artemis blocks producers once disk usage reaches 90% so that a message backlog cannot use up all of your disk space.
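To see how close a disk is to that limit, a rough Python sketch follows. It assumes the broker compares used space against max-disk-usage as a simple percentage and blocks only when usage is strictly above it. The real check may be computed differently (reserved blocks, for example, can make the numbers differ from df), and would_block is a hypothetical helper, not part of Artemis.

```python
import shutil


def usage_percent(used, total):
    """Percentage of the disk that is in use."""
    return 100.0 * used / total


def would_block(used, total, max_disk_usage=90):
    """Assumed rule: block producers when usage exceeds the limit."""
    return usage_percent(used, total) > max_disk_usage


if __name__ == "__main__":
    # "/" is a placeholder; point this at the filesystem holding the broker data.
    usage = shutil.disk_usage("/")
    percent = usage_percent(usage.used, usage.total)
    print(f"disk usage: {percent:.1f}% blocks at 90: {would_block(usage.used, usage.total)}")
```

When the broker runs in a container, as in the question, run the check inside the container so it measures the filesystem the broker actually sees.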
I've downloaded a different version and this issue hasn't occurred anymore.

Systemd - always have a service running and reboot if service stops more than X times

I need a systemd service that runs continuously. The system in question is an embedded Linux built with Yocto.
If the service stops for any reason (either failure or just completed), it should be restarted automatically
If restarted more than X times, system should reboot.
What options are there for having this? I can think of the following two, but both seem suboptimal
1) having a cron job which will literally do the check above and keep the number of retries somewhere in /tmp or other tmpfs
2) having the service itself track the number times it has been started (again in some tmpfs location) and rebooting if necessary. Systemd would just have to continuously try to start the service if it's not running
edit: as suggested by an answer, I modified the service to use the StartLimitAction as given below. It causes the unit to correctly restart, but at no point does it reboot the system, even if I continuously kill the script:
[Unit]
Description=myservice system
[Service]
Type=simple
WorkingDirectory=/home/root
ExecStart=/home/root/start_script.sh
Restart=always
StartLimitAction=reboot
StartLimitIntervalSec=600
StartLimitBurst=5
[Install]
WantedBy=multi-user.target
This in your service file should do something very close to your requirements:
[Service]
Restart=always
[Unit]
StartLimitAction=reboot
StartLimitIntervalSec=60
StartLimitBurst=5
It will restart the service if it stops, except if there are more than 5 restarts in 60 seconds: in that case it will reboot.
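As a rough illustration of how such a burst/interval limit behaves, here is a small Python model. It is a simplified fixed-window sketch written for this answer, not systemd's implementation, and the function name first_limit_hit is made up. It returns the time of the first start attempt that would exceed the limit, the point at which StartLimitAction would be expected to fire.

```python
def first_limit_hit(start_times, interval_sec=60, burst=5):
    """Return the time of the first start that exceeds the limit, or None.

    Simplified model: a window opens at the first start, up to `burst`
    starts are allowed inside it, and a start arriving after the window
    has expired opens a new window.
    """
    window_begin = None
    count = 0
    for t in sorted(start_times):
        if window_begin is None or t - window_begin > interval_sec:
            window_begin = t  # open a new window with this start
            count = 1
        elif count < burst:
            count += 1        # still within the allowed burst
        else:
            return t          # limit exceeded at this start
    return None


# Six quick starts: the sixth exceeds a burst of 5.
print(first_limit_hit([0, 1, 2, 3, 4, 5]))
# Starts spread out so that the window expires before the burst is used up.
print(first_limit_hit([0, 20, 40, 61, 80, 100, 120]))
```

In this model a service that takes a while to fail, or that has a long restart delay, may never reach the burst count inside the interval, which is one thing to check if the reboot never happens.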
You may also want to look at the WatchdogSec value, but this software watchdog functionality requires support from the service itself (very easy to add though, see the documentation for WatchdogSec).
My understanding is that the Restart= line should be in [Service], as in the example, but the StartLimitxxxxx= lines should be in [Unit].
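A quick way to spot this in a unit file is sketched below in Python. It only flags StartLimit keys that appear under [Service]; whether systemd ignores them there or accepts them for compatibility depends on the key and the systemd version, so treat a hit as a hint rather than a verdict. The function name misplaced_start_limit_keys is made up, and configparser is only an approximation of systemd's unit file syntax.

```python
import configparser

START_LIMIT_KEYS = {"StartLimitAction", "StartLimitIntervalSec", "StartLimitBurst"}


def misplaced_start_limit_keys(unit_text):
    """Return the StartLimit* keys that appear under [Service]."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key case; the default lowercases keys
    parser.read_string(unit_text)
    if not parser.has_section("Service"):
        return []
    return sorted(key for key in parser["Service"] if key in START_LIMIT_KEYS)


unit_from_question = """
[Unit]
Description=myservice system
[Service]
Type=simple
ExecStart=/home/root/start_script.sh
Restart=always
StartLimitAction=reboot
StartLimitIntervalSec=600
StartLimitBurst=5
[Install]
WantedBy=multi-user.target
"""

print(misplaced_start_limit_keys(unit_from_question))
```

For the unit from the question this lists all three StartLimit keys; after moving them to [Unit] it returns an empty list.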

Lustre hangs while mounting oss

I have installed the parallel file system Lustre from RPMs, following this slide deck.
I set up two nodes, A and B.
I installed the MDS and MDT on node A, and mounting it was successful.
But after formatting the OSS on node B using mkfs.lustre, I tried to mount it and the mount hung indefinitely.
It prints this error every 120 seconds:
INFO: task mount.lustre:1541 blocked for more than 120 seconds.
Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Why does this occur? Can you point me to a better tutorial or share your experience? The Lustre version is 2.7.0.
Thanks a lot.
It is an informational message. As the message says, you can echo 0 to "hung_task_timeout_secs" to stop the message from showing up, but I would not recommend it.
Try lowering the mark for flushing the cache from 40% to 10% by setting "vm.dirty_ratio=10" and "vm.dirty_background_ratio=5" in /etc/sysctl.conf. Activate the change with the sysctl -p command; there is no need to reboot the system.
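To confirm that the settings took effect after sysctl -p, a sketch like the following compares what is written in /etc/sysctl.conf with the live values under /proc/sys. The helper names parse_sysctl and read_live are made up for illustration, and the parser handles only plain key = value lines, which is an assumption about how your file is written.

```python
from pathlib import Path

WATCHED = ("vm.dirty_ratio", "vm.dirty_background_ratio")


def parse_sysctl(text):
    """Parse 'key = value' lines from sysctl.conf text into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blanks and comments
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def read_live(key, proc_root="/proc/sys"):
    """Read the running kernel's value, e.g. vm.dirty_ratio -> /proc/sys/vm/dirty_ratio."""
    return Path(proc_root, *key.split(".")).read_text().strip()


if __name__ == "__main__":
    try:
        configured = parse_sysctl(Path("/etc/sysctl.conf").read_text())
    except OSError:
        configured = {}
    for key in WATCHED:
        try:
            live = read_live(key)
        except OSError:
            live = "unavailable"
        print(f"{key}: configured={configured.get(key, 'not set')} live={live}")
```

If the configured and live values differ, the file was probably edited without running sysctl -p, or another file under /etc/sysctl.d overrides it.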