How do I resolve a failure to start the service for a private build agent on an Azure Linux VM? - azure-devops

I have installed a private build agent for Azure DevOps on Linux and run it interactively.
However, when I follow the documentation to set it up as a service, it fails to run. The install usually completes successfully, but starting the service always returns an error.
Configuration: new VM running Ubuntu 18.04 LTS, secured with AAD and JIT, logged in with VM Admin permissions.
Error:
$ sudo ./svc.sh install
Creating launch agent in /etc/systemd/system/vsts.agent.xxx.linux-agent-01.service
Run as user: xxx#microsoft.com
Run as uid: 1613914
gid: 1613914
$ sudo ./svc.sh start
Failed to start vsts.agent.xxx.linux-agent-01.service: Unit vsts.agent.xxx.linux-agent-01.service is not loaded properly: Exec format error.
See system logs and 'systemctl status vsts.agent.xxx.linux-agent-01.service' for details.
Failed: failed to start vsts.agent.xxx.linux-agent-01.service
$
When I check the status I get this:
$ sudo ./svc.sh status
/etc/systemd/system/vsts.agent.edgewebui.LinuxAgent03.service
● vsts.agent.edgewebui.LinuxAgent03.service - VSTS Agent (edgewebui.LinuxAgent03)
Loaded: error (Reason: Exec format error)
Active: inactive (dead)
Feb 28 18:59:18 build-agent-linux systemd[1]: /etc/systemd/system/vsts.agent.edgewebui.LinuxAgent03.service:7: Invalid user/group…osoft.com
Hint: Some lines were ellipsized, use -l to show in full.
Any suggestions on why this isn't working?
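For what it's worth, the "Invalid user/group" line above suggests that systemd is rejecting the AAD-style account name (xxx#microsoft.com) written into the generated unit's User= setting, since systemd only accepts a restricted character set for user and group names. A minimal workaround sketch, assuming the agent directory is /home/xxx/myagent and using a hypothetical local account named buildagent:
# create a plain local user for the agent to run as (name is hypothetical)
sudo useradd -m buildagent
sudo chown -R buildagent: /home/xxx/myagent    # agent directory path assumed
# remove the broken registration and re-register the service for that user
sudo ./svc.sh uninstall
sudo ./svc.sh install buildagent
sudo ./svc.sh start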

Related

When trying to connect Jenkins and Kubernetes, the Jenkins job throws the following error:

Started by user admin.
Running as SYSTEM.
Building in workspace /var/lib/jenkins/workspace/myjob
[myjob] $ /bin/sh -xe /tmp/jenkins8491647919256685444.sh
+ sudo kubectl get pods
error: the server doesn't have a resource type "pods"
Build step 'Execute shell' marked build as failure
Finished: FAILURE
It looks to me like the authentication credentials were not set correctly. Did you copy the kubeconfig file /etc/kubernetes/admin.conf to ~/.kube/config? Also check that the KUBECONFIG variable is set.
It would also help to increase the verbosity level using the flag --v=99.
Please take a look: kubernetes-configuration.
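As a rough sketch of the above, assuming a kubeadm-style cluster and a Jenkins agent running as the jenkins user with home directory /var/lib/jenkins (both assumptions):
# copy the admin kubeconfig into the Jenkins user's home
sudo mkdir -p /var/lib/jenkins/.kube
sudo cp /etc/kubernetes/admin.conf /var/lib/jenkins/.kube/config
sudo chown -R jenkins:jenkins /var/lib/jenkins/.kube
# or point kubectl at the config explicitly inside the job's shell step
export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl get pods --v=99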

Gitlab not starting after upgrade to Ubuntu 18.04

I successfully upgraded GitLab to 12.1.6 on Ubuntu 16.04 and checked that all was working. After making sure Ubuntu was fully up to date I checked again: GitLab worked.
I then used the do-release-upgrade command to upgrade to Ubuntu 18.04. After the restart everything seemed to work OK, but GitLab refuses to start.
I get the following errors:
fail: alertmanager: runsv not running
fail: gitaly: runsv not running
fail: gitlab-exporter: runsv not running
fail: gitlab-workhorse: runsv not running
fail: grafana: runsv not running
fail: logrotate: runsv not running
fail: nginx: runsv not running
fail: node-exporter: runsv not running
fail: postgres-exporter: runsv not running
fail: postgresql: runsv not running
fail: prometheus: runsv not running
fail: redis: runsv not running
fail: redis-exporter: runsv not running
fail: sidekiq: runsv not running
fail: unicorn: runsv not running
I tried:
gitlab-ctl reconfigure --> runs successfully
I installed runit successfully and rebooted the machine, but the errors remain.
I found a similar issue here on Stack Overflow and followed the instructions (substituting apt for yum), still with no success,
and here on GitLab, which advised running
sudo systemctl restart gitlab-runsvdir
sudo gitlab-ctl restart
But the first command never finishes.
I found this on GitLab, which says to run
sudo gitlab-rake gitlab:env:info --trace
Output:
** Invoke gitlab:env:info (first_time)
** Invoke gitlab_environment (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute gitlab_environment
** Execute gitlab:env:info
System information
System: Ubuntu 18.04
Current User: git
Using RVM: no
Ruby Version: 2.6.3p62
Gem Version: 2.7.9
Bundler Version:1.17.3
Rake Version: 12.3.3
Redis Version: 3.2.12
Git Version: 2.24.1
Sidekiq Version:5.2.7
Go Version: unknown
rake aborted!
PG::ConnectionBad: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/opt/gitlab/postgresql/.s.PGSQL.5432"?
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/pg-1.1.4/lib/pg.rb:56:in `initialize'
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/pg-1.1.4/lib/pg.rb:56:in `new'
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/pg-1.1.4/lib/pg.rb:56:in `connect'
which suggests that the PostgreSQL server is not running. I have no idea how to start it. Any ideas?
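Under Omnibus GitLab the bundled PostgreSQL is managed through gitlab-ctl rather than as a separate system service, so a minimal sketch of checking and starting just that service with the standard Omnibus commands would be:
sudo gitlab-ctl status postgresql    # shows the run/down state of the bundled database
sudo gitlab-ctl start postgresql     # try to start only PostgreSQL
sudo gitlab-ctl tail postgresql      # follow its logs if it still will not come up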
UPDATE: Since then I found an answer which said that, for some reason, the runsv process gets stuck on reboot. To solve this you have to run:
sudo service gitlab-runsvdir start
sudo gitlab-ctl reconfigure
sudo gitlab-ctl restart
I wanted to post the same question. This happens to me when I do the monthly upgrades/updates to our GitLab. I tried your solution before, but it didn't solve the problem for me.
I present my solution, but use these tips with caution, as I still sometimes struggle with this problem in different ways.
I run a combination of these commands; the order of the steps is really important (!):
sudo systemctl disable gitlab-runsvdir.service
sudo gitlab-ctl reconfigure
sudo gitlab-rake db:migrate
sudo gitlab-ctl restart gitlab-runsvdir (has to be aborted with Ctrl-C)
sudo /opt/gitlab/embedded/bin/runsvdir-start (has to be aborted with Ctrl-C)
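After those steps, a quick sketch of confirming that the services actually came back, using standard Omnibus commands:
sudo gitlab-ctl status          # every line should now start with 'run:' instead of 'fail:'
sudo gitlab-rake gitlab:check   # optional, deeper health check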
Reference:
In this issue Stan Hu's answer helped me:
https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/3744
Yesterday I messed up and didn't do the first step, which may have caused my problems: the PostgreSQL server was then down for 10 hours and couldn't be brought back up.
If I had stuck to the order, maybe it wouldn't have happened; it didn't happen in the past when I upgraded/updated GitLab several times.
This was the connection error yesterday, the same issue as the questioner's.
Automatic restart this morning
This morning, at 7:13 am, the server was magically accepting connections again (although I had tried to restart it yesterday, it made no difference then).
GitLab still wasn't reachable from a web browser as of 7:30 am.
One hour later, and after a few runsvdir restart/start commands (which I added to the command order above), GitLab is working. I have no idea why.
I got the same problem and ended up running /opt/gitlab/embedded/bin/runsvdir-start manually. That fixed my problem; I then launched reconfigure and it worked from there.
I ran into a similar runsv error, but only saw it for one service, not the whole list you have. These steps are a log of my attempts to get it working - probably not a direct line, but my local GitLab does work now:
In the CentOS vm:
vi /etc/gitlab/gitlab.rb
change the external_url from http://example.gitlab.com to http://192.168.1.131
sudo gitlab-ctl reconfigure
first observed the error runsv not running
yum update -y
sudo gitlab-ctl status
sudo gitlab-ctl restart
sudo gitlab-ctl reconfigure
systemctl start gitlab-runsvdir.service
systemctl status gitlab-runsvdir.service
sudo gitlab-ctl reconfigure
still saw an error about runsv not running, several times, but it was never a blocker and the reconfigure was successful
On the host:
navigate to 192.168.1.131
see the prompt for the root password
As for the issue with Postgres, I'm not sure.

Failed to restart mongod.service : Unit mongod.service not found

There are a lot of variations of this question on different forums, and I have tried a lot of things to get it to work. I am using AWS EC2 and MEAN by Bitnami. I tried connecting using Node.js and realized that my MongoDB service is not running. I checked it by running this in the terminal (connected using PuTTY):
service mongod status
This is the error I get:
mongodb.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)
To try my luck, I tried
sudo service mongod restart
And I get this error:
Failed to restart mongod.service : Unit mongod.service not found
Now, just to probe more, I tried to check whether I have this service installed at all.
I ran this command: ls /lib/systemd/system
And it gave a huge list, but I couldn't find mongod.service anywhere.
My Ubuntu version: 16.04
I am guessing it's not present, or maybe I am looking in the wrong place. Please let me know how I can get the service to run. I am somewhat new to MongoDB and Bitnami.
Each Bitnami MEAN stack includes a control script that lets you easily stop, start and restart services.
The script is located at /opt/bitnami/ctlscript.sh.
To start all services:
sudo /opt/bitnami/ctlscript.sh start
To start a single service:
sudo /opt/bitnami/ctlscript.sh start <service name>
So to answer your question:
sudo /opt/bitnami/ctlscript.sh start mongod
You can obtain a list of available services and operations by running the script without any arguments:
sudo /opt/bitnami/ctlscript.sh
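A brief usage sketch with the same script (the exact MongoDB service name can be confirmed from the script's own listing):
sudo /opt/bitnami/ctlscript.sh status             # list the bundled services and their state
sudo /opt/bitnami/ctlscript.sh restart mongodb    # restart a single service (name assumed; it may be mongodb or mongod depending on the stack)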

Weird systemctl behavior

I installed PostgreSQL 9.6 on my Ubuntu 16.04 machine through apt-get, and I installed the same version on an AWS Ubuntu instance to try replication.
I was following the steps from this blog:
http://www.rassoc.com/gregr/weblog/2013/02/16/zero-to-postgresql-streaming-replication-in-10-mins/
Now I can log in to the postgres user on both machines, run psql and execute all database operations. But when I check the status of the server (on both machines) using sudo systemctl status postgres.service, I get this:
● postgres.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)
Which is weird because I can run psql, right?
And this prevents me from creating remote connections.
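A minimal sketch of how one might track down the actual unit name, assuming the Debian/Ubuntu packaging, which usually registers postgresql.service plus per-cluster postgresql@<version>-<cluster>.service units rather than postgres.service:
systemctl list-units --type=service | grep -i postgres   # find the real unit name(s)
sudo systemctl status postgresql.service                 # umbrella service from the apt packages
sudo systemctl status postgresql@9.6-main.service        # per-cluster unit; cluster name assumed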

flocker-docker-plugin not working on centos7.2

I am trying to integrate Flocker with Docker, and for that I found the flocker-docker-plugin plugin. I installed it by running the following commands on my Flocker agents:
$ yum install -y clusterhq-flocker-docker-plugin
$ systemctl enable flocker-docker-plugin
$ systemctl restart flocker-docker-plugin
It shows that flocker-docker-plugin is running. However, after a few seconds, when I checked the status using $ systemctl status flocker-docker-plugin, I got an error saying
flocker-docker-plugin.service: main process exited, code=killed, status=11/SEGV
Based on the information you have given, there could be multiple reasons for this error:
Check if you can reach the Flocker control service, and in particular whether your node agents can reach the control service.
Check if the flocker-dataset-agent and the flocker-container-agent are running on your nodes.
Check if you have provided certificates for the flocker-docker-plugin as mentioned on their site (https://docs.clusterhq.com/en/latest/docker-integration/generate-api-plugin.html).
While installing Flocker I also got the same error: we had just installed the Docker plugin, and by default it doesn't start up.
First use the command systemctl start flocker-docker-plugin and then check its running status using systemctl status flocker-docker-plugin.
Make sure the control service and dataset agent are running correctly first. You can find logs by looking in /var/log/flocker/, running journalctl -u flocker-dataset-agent, or running flocker-diagnostics.
Read through any errors in these logs, such as communication issues with the control service, certificate issues, agent.yml configuration issues, etc., or feel free to post them for more help.
You can also find the flocker-docker-plugin logs the same way to see the specific errors that may be occurring.
Here is more information about how to debug Flocker.
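A compact sketch of those checks on an agent node (unit names assumed from the standard Flocker packaging; flocker-control normally runs on the control node instead):
systemctl status flocker-dataset-agent flocker-container-agent flocker-docker-plugin
journalctl -u flocker-docker-plugin --no-pager | tail -n 50    # recent plugin errors
journalctl -u flocker-dataset-agent --no-pager | tail -n 50    # recent dataset-agent errors
ls /var/log/flocker/                                           # file-based logs, if present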