RAUC and Yocto on Jetson Nano - Unable to find primary boot slot

This is a continuation of my other post.
I've managed to create an image with U-Boot and RAUC.
I've made a simple rauc system.conf:
[system]
compatible=Jetson Nano
bootloader=uboot
#
[slot.rootfs.0]
device=/dev/mmcblk0p1
type=ext4
bootname=system0
#
[slot.rootfs.1]
device=/dev/mmcblk0p13
type=ext4
bootname=system1
[UPDATED]:
Pretty much copy pasted the contrib uboot.sh script.
Then I've added a bb file from here into my bsp layer.
And added rauc to my IMAGE_INSTALL.
When I boot up the Nano with my image, RAUC isn't working as it should. When I check the status of the service with systemctl status rauc-mark-good.service, it returns:
● rauc-mark-good.service - Rauc Good-marking Service
Loaded: loaded (/lib/systemd/system/rauc-mark-good.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Tue 2019-10-01 07:51:22 UTC; 4s ago
Process: 4147 ExecStart=/usr/bin/rauc status mark-good (code=exited, status=0/SUCCESS)
Main PID: 4147 (code=exited, status=0/SUCCESS)
Oct 01 07:51:22 jetson-nano systemd[1]: Started Rauc Good-marking Service.
Oct 01 07:51:22 jetson-nano rauc[4147]: Failed getting primary slot: Failed getting primary slot: Unable to find primary boot slot
Oct 01 07:51:22 jetson-nano rauc[4147]: rauc mark: marked slot rootfs.0 as good
Oct 01 07:51:22 jetson-nano systemd[1]: rauc-mark-good.service: Succeeded.
systemctl status rauc returns:
● rauc.service - Rauc Update Service
Loaded: loaded (/lib/systemd/system/rauc.service; static; vendor preset: enabled)
Active: active (running) since Tue 2019-10-01 07:49:36 UTC; 2min 0s ago
Docs: https://rauc.readthedocs.io
Main PID: 4092 (rauc)
Tasks: 3 (limit: 4178)
Memory: 4.4M
CGroup: /system.slice/rauc.service
└─4092 /usr/bin/rauc --mount=/run/rauc service
Oct 01 07:49:36 jetson-nano systemd[1]: Starting Rauc Update Service...
Oct 01 07:49:36 jetson-nano systemd[1]: Started Rauc Update Service.
Oct 01 07:49:48 jetson-nano rauc[4092]: Failed getting primary slot: Failed getting primary slot: Unable to find primary boot slot
Oct 01 07:49:48 jetson-nano rauc[4092]: Failed to load status file /slot.raucs: No such file or directory
Oct 01 07:49:48 jetson-nano rauc[4092]: mounting slot /dev/mmcblk0p13
Oct 01 07:49:48 jetson-nano rauc[4092]: Failed to load status file /run/rauc/rootfs.1/slot.raucs: No such file or directory
Oct 01 07:51:22 jetson-nano rauc[4092]: Failed getting primary slot: Failed getting primary slot: Unable to find primary boot slot
Oct 01 07:51:22 jetson-nano rauc[4092]: rauc mark: marked slot rootfs.0 as good
And rauc status returns:
(rauc:4195): rauc-WARNING **: 07:51:46.126: Failed getting primary slot: Failed getting primary slot: Unable to find primary boot slot
Compatible: Jetson Nano
Variant:
Booted from: rootfs.0 (/dev/mmcblk0p1)
Activated: (null) ((null))
slot states:
rootfs.0: class=rootfs, device=/dev/mmcblk0p1, type=ext4, bootname=system0
state=booted, description=, parent=(none), mountpoint=/
boot status=bad
rootfs.1: class=rootfs, device=/dev/mmcblk0p13, type=ext4, bootname=system1
state=inactive, description=, parent=(none), mountpoint=(none)
boot status=bad
So there is no /slot.raucs file, and RAUC failed to find the primary boot slot.
After that, systemctl status rauc-mark-good reports that the rootfs.0 slot was marked as good in the end, but rauc status still shows boot status=bad.
What am I missing here?

I edited the uboot script to the following:
test -n "${BOOT_ORDER}" || setenv BOOT_ORDER "system0 system1"
test -n "${BOOT_system0_LEFT}" || setenv BOOT_system0_LEFT 3
test -n "${BOOT_system1_LEFT}" || setenv BOOT_system1_LEFT 3
setenv bootargs
for BOOT_SLOT in "${BOOT_ORDER}"; do
if test "x${bootargs}" != "x"; then
# skip remaining slots
elif test "x${BOOT_SLOT}" = "xsystem0"; then
if test ${BOOT_system0_LEFT} -gt 0; then
setexpr BOOT_system0_LEFT ${BOOT_system0_LEFT} - 1
echo "Found valid slot system0, ${BOOT_system0_LEFT} attempts remaining"
setenv distro_bootpart "1"
setenv boot_line "mmc 1:1 any ${scriptaddr} /boot/extlinux/extlinux.conf"
fi
elif test "x${BOOT_SLOT}" = "xsystem1"; then
if test ${BOOT_system1_LEFT} -gt 0; then
setexpr BOOT_system1_LEFT ${BOOT_system1_LEFT} - 1
echo "Found valid slot system1, ${BOOT_system1_LEFT} attempts remaining"
setenv distro_bootpart "13"
setenv boot_line "mmc 1:D any ${scriptaddr} /boot/extlinux/extlinux.conf"
fi
fi
done
if test -n "${bootargs}"; then
saveenv
else
echo "No valid slot found, resetting tries to 3"
setenv BOOT_system0_LEFT 3
setenv BOOT_system1_LEFT 3
saveenv
reset
fi
sysboot ${boot_line}
And it ended up working. Apparently there were some issues with BOOT_ORDER "system0 system1" in the U-Boot script, which somehow did not match the RAUC system.conf. After I rewrote the script, there were no issues and RAUC ran fine.
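One way to spot this kind of mismatch from the running system is to compare the U-Boot BOOT_ORDER variable with the bootname values in system.conf. The sketch below is only an illustration: it assumes fw_printenv is installed and can read the U-Boot environment on the target, and the helper name missing_bootnames is hypothetical.

```shell
#!/bin/sh
# Print every bootname that does NOT appear as a word in the given BOOT_ORDER.
# $1 = value of BOOT_ORDER, remaining arguments = bootnames from system.conf.
missing_bootnames() {
    order=$1
    shift
    for name in "$@"; do
        case " $order " in
            *" $name "*) ;;          # bootname is listed in BOOT_ORDER
            *) echo "$name" ;;       # bootname is missing from BOOT_ORDER
        esac
    done
}

# On the target (assumes fw_printenv can read the U-Boot environment):
#   missing_bootnames "$(fw_printenv -n BOOT_ORDER)" system0 system1
```

If the helper prints nothing, every bootname from system.conf is present in BOOT_ORDER; any name it prints is a candidate explanation for the "Unable to find primary boot slot" warning.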

Related

MongoDB doesn't start, exit code 203

I installed MongoDB on Raspberry Pi Desktop in a VM, then I started it with the following command:
sudo service mongod start
The result of this command is the following:
systemctl list-unit-files --state enabled
mongodb.service enabled
Then when I check the status using
sudo service mongod status
I get the following error:
● mongod.service - MongoDB Database Server
Loaded: loaded (/lib/systemd/system/mongod.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2020-06-08 12:33:17 CEST; 3s ago
Docs: https://docs.mongodb.org/manual
Process: 22168 ExecStart=/usr/bin/mongod --config /etc/mongod.conf (code=exited, status=203/EXEC)
Main PID: 22168 (code=exited, status=203/EXEC)
Jun 08 12:33:17 raspberry systemd[1]: Started MongoDB Database Server.
Jun 08 12:33:17 raspberry systemd[22168]: mongod.service: Failed to execute command: Exec format error
Jun 08 12:33:17 raspberry systemd[22168]: mongod.service: Failed at step EXEC spawning /usr/bin/mongod: Exec format error
Jun 08 12:33:17 raspberry systemd[1]: mongod.service: Main process exited, code=exited, status=203/EXEC
Jun 08 12:33:17 raspberry systemd[1]: mongod.service: Failed with result 'exit-code'.
The other issue is that there's no .sock file in my /tmp folder.
PS: I removed MongoDB and reinstalled it to try to fix the issue, but I always get the same problem.
Can anyone help me please?
Thank you in advance.
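status=203/EXEC together with "Exec format error" generally means the kernel could not execute the file at all, which often happens when a binary was built for a different CPU architecture than the machine (for example, an ARM build on an x86 VM). Whether that is the cause here is an assumption, not something the post confirms. A minimal sketch for checking it, assuming the file utility is installed; the helper names are hypothetical and the keywords are the ones file commonly prints for ELF binaries:

```shell
#!/bin/sh
# Map a `uname -m` value to the keyword `file` usually prints for that architecture.
expected_elf_keyword() {
    case "$1" in
        x86_64)  echo "x86-64" ;;
        i?86)    echo "Intel 80386" ;;
        aarch64) echo "ARM aarch64" ;;
        armv*)   echo "ARM," ;;
        *)       echo "" ;;
    esac
}

# $1 = machine name (uname -m), $2 = description of the binary (file -b).
# Returns 0 on a match, 1 on a mismatch, 2 if the machine name is unknown.
binary_matches_machine() {
    keyword=$(expected_elf_keyword "$1")
    [ -n "$keyword" ] || return 2
    case "$2" in
        *"$keyword"*) return 0 ;;
        *) return 1 ;;
    esac
}

# On the affected machine:
#   binary_matches_machine "$(uname -m)" "$(file -b /usr/bin/mongod)" \
#       || echo "mongod does not look like a binary for this machine"
```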

PostgreSQL failed in Ubuntu environment with error message: socket "/var/run/postgresql/.s.PGSQL.5432"?

One day, my PostgreSQL server stopped working. I checked the log; it had been shut down somehow.
root@ip_address:/# tail /var/log/postgresql/postgresql-10-main.log
2020-02-19 06:47:49.215 CET [23497] LOG: received smart shutdown request
2020-02-19 06:47:49.477 CET [23497] LOG: worker process: logical replication launcher (PID 23512) exited with exit code 1
2020-02-19 06:47:49.482 CET [23507] LOG: shutting down
2020-02-19 06:47:49.546 CET [23497] LOG: database system is shut down
When I run,
root@ip_address:/# psql
psql: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
It complained that there is no such file or directory, so I checked whether PostgreSQL was running.
root@ip_address:/# systemctl status postgresql
● postgresql.service - PostgreSQL RDBMS
Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor preset: enabled)
Active: active (exited) since Sun 2020-03-08 16:19:24 CET; 26min ago
Process: 30136 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
Main PID: 30136 (code=exited, status=0/SUCCESS)
Mar 08 16:19:24 vps584959 systemd[1]: Starting PostgreSQL RDBMS...
Mar 08 16:19:24 vps584959 systemd[1]: Started PostgreSQL RDBMS.
It was running, but when I checked the PostgreSQL clusters:
root@ip_address:/# pg_lsclusters
Ver Cluster Port Status Owner Data directory Log file
10 main 5432 down postgres /var/lib/postgresql/10/main /var/log/postgresql/postgresql-10-main.log
It was down, so I tried:
root@ip_address:/# pg_ctlcluster 10 main start
Error: Config owner (deploy:1003) and data owner (postgres:114) do not match, and config owner is not root
I wasn't able to make it work. Then I tried:
sudo chown -R deploy:postgres /var/lib/postgresql/10/ && sudo chmod -R u=rwX,go= /var/lib/postgresql/10/
and tried again:
root@ip_address:/# pg_ctlcluster 10 main start
Job for postgresql@10-main.service failed because the service did not take the steps required by its unit configuration.
See "systemctl status postgresql@10-main.service" and "journalctl -xe" for details.
root@ip_address:/# systemctl status postgresql@10-main.service
● postgresql@10-main.service - PostgreSQL Cluster 10-main
Loaded: loaded (/lib/systemd/system/postgresql@.service; indirect; vendor preset: enabled)
Active: failed (Result: protocol) since Sun 2020-03-08 16:59:53 CET; 2min 52s ago
Process: 31635 ExecStart=/usr/bin/pg_ctlcluster --skip-systemctl-redirect 10-main start (code=exited, status=1/FAILURE)
Main PID: 23497 (code=exited, status=0/SUCCESS)
Mar 08 16:59:53 vps584959 systemd[1]: Starting PostgreSQL Cluster 10-main...
Mar 08 16:59:53 vps584959 postgresql@10-main[31635]: Error: /usr/lib/postgresql/10/bin/pg_ctl /usr/lib/postgresql/10/bin/pg_ctl start -D /var/lib/postgresql/10/main -l /var/log/postgre
Mar 08 16:59:53 vps584959 systemd[1]: postgresql@10-main.service: Can't open PID file /var/run/postgresql/10-main.pid (yet?) after start: No such file or directory
Mar 08 16:59:53 vps584959 systemd[1]: postgresql@10-main.service: Failed with result 'protocol'.
Mar 08 16:59:53 vps584959 systemd[1]: Failed to start PostgreSQL Cluster 10-main.
I don't know what else to do. Has anybody had the same problem?
More info:
root@ip_address:/var/run/postgresql# ls
total 0
drwxrwsr-x 3 postgres postgres 60 Feb 19 06:47 .
drwxr-xr-x 28 root root 1060 Mar 8 13:58 ..
drwxr-s--- 2 postgres postgres 40 Feb 19 06:47 10-main.pg_stat_tmp
pg_ctlcluster 10 main start
Error: Config owner (deploy:1003) and data owner (postgres:114) do not match, and config owner is not root
That's pretty clear, isn't it?
The Ubuntu PostgreSQL startup script requires postgresql.conf and/or pg_hba.conf to be owned by the user postgres; otherwise it refuses to proceed.
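The ownership comparison behind that error can be reproduced by hand. This is a rough sketch, assuming GNU stat and the usual Debian/Ubuntu layout for cluster 10/main; the config path is an assumption (the question only shows the data directory), and same_owner is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed only if both paths are owned by the same user (uses GNU stat).
same_owner() {
    [ "$(stat -c %U "$1")" = "$(stat -c %U "$2")" ]
}

# Assumed paths for cluster 10/main on Ubuntu:
#   same_owner /etc/postgresql/10/main/postgresql.conf /var/lib/postgresql/10/main \
#       || echo "config owner and data owner differ"
# If they differ, handing both trees back to postgres is one possible fix:
#   sudo chown -R postgres:postgres /etc/postgresql/10/main /var/lib/postgresql/10/main
```

Note that the chown to deploy:postgres shown in the question changed the data directory owner, so both the config and the data directory may need to be returned to postgres.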

Fatal error when starting orion context broker

My Orion Context Broker does not start. When I enter the command
/etc/init.d/contextBroker start
I get this message:
[root@context-broker ~]# /etc/init.d/contextBroker start
Starting contextBroker (via systemctl): Job for contextBroker.service failed because the control process exited with error code. See "systemctl status contextBroker.service" and "journalctl -xe" for details.
[FAILED]
The systemctl status contextBroker.service command gives this message:
[root@context-broker ~]# systemctl status contextBroker.service
● contextBroker.service - LSB: run contextBroker
Loaded: loaded (/etc/rc.d/init.d/contextBroker; bad; vendor preset: disabled)
Active: failed (Result: exit-code) since Fri 2019-05-24 11:38:50 UTC; 1min 11s ago
Docs: man:systemd-sysv-generator(8)
Process: 9782 ExecStart=/etc/rc.d/init.d/contextBroker start (code=exited, status=1/FAILURE)
May 24 11:38:47 context-broker.novalocal systemd[1]: Starting LSB: run contextBroker...
May 24 11:38:48 context-broker.novalocal contextBroker[9782]: contextBroker is stopped
May 24 11:38:48 context-broker.novalocal contextBroker[9782]: Starting...
May 24 11:38:48 context-broker.novalocal su[9788]: (to orion) root on none
May 24 11:38:50 context-broker.novalocal contextBroker[9782]: Starting contextBroker... cat: /var/run/contextBroker/contextBroker.pid...irectory
May 24 11:38:50 context-broker.novalocal systemd[1]: contextBroker.service: control process exited, code=exited status=1
May 24 11:38:50 context-broker.novalocal contextBroker[9782]: pidfile not found[FAILED]
May 24 11:38:50 context-broker.novalocal systemd[1]: Failed to start LSB: run contextBroker.
May 24 11:38:50 context-broker.novalocal systemd[1]: Unit contextBroker.service entered failed state.
May 24 11:38:50 context-broker.novalocal systemd[1]: contextBroker.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
Also the /tmp/contextBroker.log file looks like this
time=2019-05-24T11:41:12.971Z | lvl=FATAL | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=rest.cpp[1753]:restStart | msg=Fatal Error (error starting REST interface)
I checked if mongodb is running and it is running correctly.
UPDATE
After some searching, I realised I had to kill the process by its PID. After I did that, the service starts successfully according to the messages, but I find it doesn't actually work. When I ask for the status, I get the following:
[root@context-broker centos]# /etc/init.d/contextBroker status
● contextBroker.service - LSB: run contextBroker
Loaded: loaded (/etc/rc.d/init.d/contextBroker; bad; vendor preset: disabled)
Active: active (exited) since Sun 2019-05-26 18:34:49 UTC; 4min 56s ago
Docs: man:systemd-sysv-generator(8)
Process: 16295 ExecStop=/etc/rc.d/init.d/contextBroker stop (code=exited, status=0/SUCCESS)
Process: 16319 ExecStart=/etc/rc.d/init.d/contextBroker start (code=exited, status=0/SUCCESS)
May 26 18:34:47 context-broker.novalocal systemd[1]: Starting LSB: run contextBroker...
May 26 18:34:47 context-broker.novalocal contextBroker[16319]: contextBroker is stopped
May 26 18:34:47 context-broker.novalocal contextBroker[16319]: Starting...
May 26 18:34:47 context-broker.novalocal su[16325]: (to orion) root on none
May 26 18:34:49 context-broker.novalocal systemd[1]: Started LSB: run contextBroker.
May 26 18:34:49 context-broker.novalocal contextBroker[16319]: Starting contextBroker... [ OK ]
The log file has the same message as previously.
After some more searching, I believe the cause is that the service doesn't have a daemon(??). If that is the case, how do I add one?
Normally, when you get the "error starting REST interface" message, it's because a broker is already running, which means the port is already taken. Make sure no broker is already running.
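A quick way to check for a leftover broker is to look for a listener on the broker port and for a surviving process. The sketch below assumes Orion's default port 1026 (the post does not state which port is configured) and that ss and pgrep are available; port_is_listening is a hypothetical helper that parses ss -ltn output.

```shell
#!/bin/sh
# Read `ss -ltn` output on stdin; succeed if some socket listens on port $1.
port_is_listening() {
    awk -v suffix=":$1" '
        # Column 4 is "Local Address:Port"; compare its tail with ":<port>".
        substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
        END { exit !found }
    '
}

# On the broker host:
#   ss -ltn | port_is_listening 1026 && echo "port 1026 is already taken"
#   pgrep -l contextBroker    # list broker processes that are still running
```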

Should systemctl show output when start/stop fails?

I've googled every variation of this question I can think of, but I'm just getting questions about failed services, not about how systemctl treats them. I have a service that I've been running as an init.d script. We're using systemctl now, fine. I created a service file that's a lightly modified version of the file automatically generated by systemd-sysv-generator. For ExecStart and ExecStop, it calls a bash script that returns 0 if the start/stop was successful, and non-zero if it was not.
I understand that there's no output from "systemctl start/stop" if it was successful. But I also don't get any output if either of the calls failed. The return code of the systemctl start/stop command is always 0 even if the return code of the source script is not. It's very clear it did fail because it shows as failed when I run the status command.
Is that expected behavior? Should it not give any indication that something failed unless you run a separate status command? And if that's not how it should behave, how can I make it indicate that a failure occurred?
Service file below.
[Unit]
SourcePath=/my/service/script.sh
[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
ExecStart=/my/service/script.sh start
ExecStop=/my/service/script.sh stop
Working fine here, CentOS 7. Maybe double check that script.sh is really returning non zero?
$ pwd
/etc/systemd/system
$ cat me.service
[Unit]
[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
ExecStart=/etc/systemd/system/die.sh
$ cat die.sh
#!/bin/bash
echo "dying"
exit 1
$ sudo systemctl start me
Job for me.service failed because the control process exited with error code. See "systemctl status me.service" and "journalctl -xe" for details.
$ sudo systemctl status me.service
● me.service
Loaded: loaded (/etc/systemd/system/me.service; static; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2019-09-12 17:55:02 GMT; 8s ago
Process: 19758 ExecStart=/etc/systemd/system/die.sh (code=exited, status=1/FAILURE)
Sep 12 17:55:02 dpsdev-wkr01 systemd[1]: Starting me.service...
Sep 12 17:55:02 dpsdev-wkr01 die.sh[19758]: dying
Sep 12 17:55:02 dpsdev-wkr01 systemd[1]: me.service: control process exited, code=exited status=1
Sep 12 17:55:02 dpsdev-wkr01 systemd[1]: Failed to start me.service.
Sep 12 17:55:02 dpsdev-wkr01 systemd[1]: Unit me.service entered failed state.
Sep 12 17:55:02 dpsdev-wkr01 systemd[1]: me.service failed.
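To make the exit status visible rather than relying on the printed message, the start call can be wrapped in a small function. This is only a sketch: start_unit and the SYSTEMCTL override variable are hypothetical names introduced here, and the override exists only so the wrapper can be tried without a real unit.

```shell
#!/bin/sh
# Start a unit and report the result based on the exit status of systemctl.
# SYSTEMCTL may be overridden for a dry run; it defaults to systemctl.
start_unit() {
    unit=$1
    if ${SYSTEMCTL:-systemctl} start "$unit"; then
        echo "started $unit"
    else
        rc=$?    # exit status of the failed start command
        echo "failed to start $unit (exit status $rc)" >&2
        return "$rc"
    fi
}

# Usage:
#   start_unit me || systemctl status me.service
```

If this wrapper reports success for a unit that then shows as failed, that points back at the start script returning 0, as suggested above.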

Filebeat Service will not start on RHEL 7

I have a problem with my Filebeat installation.
When I try to start it with "service filebeat start", it says "Starting Filebeat". After "service filebeat status" I get 4 PIDs (up to here everything looks normal):
[root@(Server) run]# service filebeat status
Filebeat is running with pid: 30650 30657 30658 30659
But after checking the PID, we see that it is not running:
[root@(Server) run]# ps -ef | grep 30650
root 30665 31360 0 16:27 pts/0 00:00:00 grep --color=auto 30650
Trying to start it with systemctl doesn't help:
[root@(Server) run]# systemctl start filebeat
Job for filebeat.service failed because a configured resource limit was exceeded. See "systemctl status filebeat.service" and "journalctl -xe" for details.
Status says:
[root@Server run]# systemctl status filebeat
● filebeat.service - LSB: start and stop filebeat
Loaded: loaded (/etc/rc.d/init.d/filebeat; bad; vendor preset: disabled)
Active: failed (Result: resources) since Tue 2017-09-26 16:30:33 CEST; 1min 41s ago
Docs: man:systemd-sysv-generator(8)
Process: 32118 ExecStart=/etc/rc.d/init.d/filebeat start (code=exited, status=0/SUCCESS)
Sep 26 16:30:33 Server... systemd[1]: Starting LSB: start and stop filebeat...
Sep 26 16:30:33 Server... filebeat[32118]: Starting Filebeat
Sep 26 16:30:33 Server... su[32119]: (to user) root on none
Sep 26 16:30:33 Server... systemd[1]: PID file /var/run/filebeat.pid not readable (yet?) after start.
Sep 26 16:30:33 Server... systemd[1]: Failed to start LSB: start and stop filebeat.
Sep 26 16:30:33 Server... systemd[1]: Unit filebeat.service entered failed state.
Sep 26 16:30:33 Server... systemd[1]: filebeat.service failed.
Does anybody have any idea?
Regards
The problem was ownership ("chown") permissions. I had installed Filebeat as a non-root user, but the "data" directory was owned by the root user and group. After changing that, it runs and starts automatically after boot.
Regards
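A sketch of how such an ownership mismatch could be found, assuming a standard find; the data path, user and group below are placeholders, since the post does not name them, and not_owned_by is a hypothetical helper.

```shell
#!/bin/sh
# List everything under directory $1 that is NOT owned by user $2.
not_owned_by() {
    find "$1" ! -user "$2" -print
}

# Example with placeholder values (not taken from the post):
#   not_owned_by /path/to/filebeat/data someuser
#   sudo chown -R someuser:somegroup /path/to/filebeat/data
```

An empty listing means every file in the directory already belongs to the expected user.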