Failed to start LSB: Bring up/down networking [closed] - centos

I am new to CentOS 7 and am configuring a static IP address, so I edited the file /etc/sysconfig/network-scripts/ifcfg-eth0 as follows:
TYPE=Ethernet
BOOTPROTO=none
Device=eth0
ONBBOOT=yes
IPADDR=192.168.4.196
NETMASK=255.255.255.0
GATEWAY=192.168.88.254
DNS1=8.8.8.8
USERCTL=no
But when I issue the command
systemctl restart network
I am getting the error
Failed to start LSB: Bring up/down networking
ip route show gives me no output.
I have tried the suggested fix of stopping NetworkManager, but I get the same error.
I am able to configure DHCP and get a dynamic IP address, but not a static one.
What are the possible solutions?

It's because of an interface naming issue.
The solution that worked for me was:
Check which interfaces are available (for example with ip a).
cp ifcfg-eno16780032 ifcfg-ens192
vi ifcfg-ens192 and change the NAME and DEVICE fields to ens192
systemctl disable NetworkManager
systemctl status NetworkManager -> inactive
systemctl stop network
systemctl start network
After that, check ip a
to see the IP details and confirm that you can ping that IP.

You should change BOOTPROTO to static and move your DNS config to your /etc/resolv.conf file, for example:
TYPE=Ethernet
BOOTPROTO=static
PHYSDEV=eth0
ONBOOT=yes
IPADDR=192.168.4.196
NETMASK=255.255.255.0
GATEWAY=192.168.88.254
USERCTL=no
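One thing worth checking in a static configuration like the one above is whether GATEWAY actually lies inside the subnet defined by IPADDR and NETMASK, since a gateway outside the local subnet is normally not reachable directly. The following is only a minimal Python sketch using the standard ipaddress module; the function name gateway_on_link and the second gateway value (192.168.4.254) are made up for illustration and do not come from the question.

```python
import ipaddress


def gateway_on_link(ipaddr, netmask, gateway):
    """Return True if gateway is inside the subnet defined by ipaddr/netmask."""
    # ip_interface accepts "address/netmask" and derives the network from it.
    network = ipaddress.ip_interface(f"{ipaddr}/{netmask}").network
    return ipaddress.ip_address(gateway) in network


if __name__ == "__main__":
    # Values taken from the ifcfg file in the question.
    print(gateway_on_link("192.168.4.196", "255.255.255.0", "192.168.88.254"))
    # Hypothetical gateway inside the same /24, for comparison only.
    print(gateway_on_link("192.168.4.196", "255.255.255.0", "192.168.4.254"))
```

If the first call reports that the gateway is not on the local subnet, that mismatch may be worth fixing before looking at BOOTPROTO or NetworkManager.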

When this issue derailed autossh on my roaming laptop, I decided to pick apart the relevant Mageia code to understand the root cause. I did not have NetworkManager installed, so I knew for sure it was not the culprit.
The issue I found can be described as a kind of live-lock between the SysV and systemd ways of managing the network service. Potentially, many conditions could trigger it (NetworkManager is one example); in my case it was misconfigured vboxnet ifaces from VMWare.
There are two critical blockers, one on each side of the SysV/systemd balance, that can start triggering each other in a loop. On the SysV side, the init.d/network script eventually calls "ifup $device boot", which, in response to the 'boot' parameter, starts the ifplugd daemon for pluggable ifaces. The problem with this daemon is that, despite the '-I' switch (used to ignore errors), it still fails with exit code 4 when it detects that an instance of itself is already in memory. The only proper way to shut down this daemon from the network script is to issue the "ifdown $device boot" command, which is supposed to be executed when the network service is stopped by the 'service' or 'systemctl' commands.
The interesting part of the question: why is ifplugd already in memory before the network service starts? In my case, the WiFi iface was brought up before the misconfigured vbox iface, but the latter caused the entire initscript to fail. So the network was started on boot, but the service status was recorded as failed. But what prevents us from just stopping the network service and thereby killing ifplugd via the ifdown/boot command? The answer is systemd, in its ingenious way of handling the ExecStop directive in the unit file (which is auto-generated on the fly for the network service). Basically, "systemctl stop" just ignores the ExecStop directive if it believes that the service is not started. And of course it is not, because it previously failed after stumbling on an unexpected ifplugd instance! So there is no way to stop the service, hence no way to get rid of ifplugd, hence no way of (re)starting the service, and so on.
Conclusion: there is no single recipe for this sort of trouble, because the compatibility balance between the network script and the systemd approach is very fragile, and many unexpected factors can interfere. To troubleshoot this scenario, several status checks might be useful:
network service: systemctl status network
ifplugd service: ps ax|grep ifplugd
network link status: ifconfig / iwconfig
autogenerated unit: cat /var/run/systemd/generator.late/network.service
other places running ifup independently: grep -rs ifup /etc
and of course, "bash -x" and debugging "echo Bump" instruction. :-)
The long-term solution is fixing ifplugd to honour the '-I' switch in this scenario. The mid-term solution is fixing /etc/sysconfig/network-scripts/ifup-eth to ignore the ifplugd return code. The short-term solution seems to be the trickiest: removing every config factor that can trigger this live-lock. But it is the only one that survives system autoupdates...

Append the line "blacklist ideapad_laptop" to /etc/modprobe.d/*blacklist*.conf, for example:
echo "blacklist ideapad_laptop" | tee -a /etc/modprobe.d/*blacklist*.conf
Then reboot. This should unblock your Wi-Fi.

I came here looking for an answer to my case, so I'll share it; maybe it will help someone else. I'd like to thank the cPanel staff for pointing this out to me:
As for the reported issues, we have seen that when CloudLinux servers running a kernel version lower than "3.10.0-862" update to CloudLinux 7.7, they get an update to the 'iproute' package.
The 'iproute' package needs either a newer kernel or to be excluded from updating on the server initially.
This information has been reported. You can find some more information about it here:
https://www.cloudlinux.com/cloudlinux-os-blog/entry/cloudlinux-os-7-7-released

In my case, running
journalctl -xe
showed a duplicate interface configuration, eth0 and eno1, using the same UUID:
Nov 06 09:35:41 4200-150-137 /etc/sysconfig/network-scripts/ifup-eth[27549]: Device eno1 does not seem to be present, del
Nov 06 09:35:41 4200-150-137 network[27401]: [FAILED]
Nov 06 09:35:41 4200-150-137 network[27401]: Bringing up interface eth0: [ OK ]
Removing the unused interface ifcfg file solved the problem for me.
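To spot this kind of duplicate before restarting the network service, the ifcfg files can be scanned for repeated UUID values. The following is only a rough Python sketch: the function name find_duplicate_uuids and the sample UUID are hypothetical, and the parser only looks for plain UUID=... lines rather than evaluating the files the way the network scripts do.

```python
import re
from collections import defaultdict

# Matches lines such as UUID=abcd-1234 or UUID="abcd-1234".
UUID_LINE = re.compile(r'^UUID="?([^"\s]+)"?', re.MULTILINE)


def find_duplicate_uuids(configs):
    """Map each UUID used by more than one file to the sorted file names.

    configs is a dict of {file name: file contents}.
    """
    files_by_uuid = defaultdict(list)
    for name, text in configs.items():
        match = UUID_LINE.search(text)
        if match:
            # Compare case-insensitively, since UUIDs are hexadecimal.
            files_by_uuid[match.group(1).lower()].append(name)
    return {
        uuid: sorted(names)
        for uuid, names in files_by_uuid.items()
        if len(names) > 1
    }


if __name__ == "__main__":
    # Hypothetical file contents standing in for real ifcfg files.
    sample = {
        "ifcfg-eth0": "DEVICE=eth0\nUUID=aaaa-1111\nONBOOT=yes\n",
        "ifcfg-eno1": 'DEVICE=eno1\nUUID="aaaa-1111"\nONBOOT=yes\n',
    }
    print(find_duplicate_uuids(sample))
```

On a real system the dictionary would be built by reading the ifcfg-* files under /etc/sysconfig/network-scripts; any UUID reported here points at the files to compare by hand.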

After several trials including restarting of network manager, commenting out the UUID on the interface concerned (mine being ifcfg-eth0), it finally boiled down to a missing file which apparently needs to be included despite the fact that its values can be included directly in the interface file.
vi /etc/sysconfig/network
then add the correct values for your setup and save:
NETWORKING=yes
HOSTNAME=xxx.xxx.xxx
GATEWAY=x.x.x.x
I hope this helps someone. It was tested on CentOS 7 as a guest VM on Hyper-V on Windows 10.

I have a VPS with OVH and have been struggling with a similar issue.
Just wanna share my solution as it may help some people.
Boot used to be delayed by 5 minutes because dhclient was checking IPv6 on the ifup call.
Set this to no:
DHCPV6C=no
inside /etc/sysconfig/network-scripts/ifcfg-eth0

I know this is an old discussion, but I had this problem on my bare-metal server from OVH after the NetworkManager service was disabled by installing cPanel.
The issue was solved by adding the parameters below to ifcfg-eno1 (or, in your case, the file for whichever interface is active):
LINKDELAY=31
NM_CONTROLLED=no
ONBOOT=yes
DHCPV6C=no
Also make sure that you have activated the network service.

Related

MongoDB: Error connecting to 127.0.0.1:27017, No connection could be made because the target machine actively refused it

I've been using Mongo for a while now, and I never had any kind of errors. But today, I tried running the mongo command in my terminal and I got the following error:
Error connecting to 127.0.0.1:27017 :: caused by :: No connection could be made because the target machine actively refused it. :
I have my PATH variable for Mongo properly configured in my environment variables as follows:
C:\Program Files\MongoDB\Server\4.4\bin
so I doubt that is the issue. I remember going through my Task Manager yesterday and accidentally terminating a Mongo-related program running in the background, but I can't remember exactly what it was called. I really think that is the root of my problem, because before terminating that program I had never run across this connection problem.
By terminating a program in the background, I'm going to assume you didn't just end a process; otherwise a simple computer restart would fix your issue, and in some cases that same program would have relaunched when you launched MongoDB. But if you disabled a service and need to find which service must be running for you to connect to MongoDB, then I would suggest going through your Windows Services list, seeing which ones you disabled, and looking for one relating to TCP or SNMP.
This is because the MongoDB Wire Protocol is a simple socket-based, request-response style protocol. You communicate with the database server through a regular TCP/IP socket. Since you can't remember which program you "terminated", and any number of networking-related services can leave a dependency missing, I can't be more specific about which one you need to turn back on; you'll have to find it through trial and error. I can at least offer some guidance.
Specifically, you can either
Run System Configuration using
msconfig
in a Run box, navigate to the Services tab, and order the list by Date Disabled to find the service that was disabled around the time you were snooping through Task Manager, or
Run Task Manager, navigate to the Services tab, choose Open Services, order the services by Status or by Name, and look for any service that includes TCP/IP, COM+, Port direction, etc. to see which one is disabled. Change its configuration to anything but Disabled, start it manually, and run MongoDB again.
That's about as specific as I can get without knowing more than that you terminated some program running in the background, but I hope it helps.
The background process (daemon) for MongoDB is called 'mongod'. It's an executable in your bin directory inside your mongodb installation. You can just execute it in the terminal.
Run:
"C:\Program Files\MongoDB\Server\4.4\bin\mongod.exe"
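"Actively refused" usually means nothing is listening on the port, so before and after starting mongod it can help to check whether 127.0.0.1:27017 accepts TCP connections at all. Here is a small Python sketch using only the standard socket module; the function name is_port_open is hypothetical, and the check says nothing about whether the listener is really MongoDB.

```python
import socket


def is_port_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection raises OSError (e.g. connection refused) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port from the error message in the question.
    print(is_port_open("127.0.0.1", 27017))
```

If this prints False while mongod is supposedly running, the server process is probably not up (or is bound to a different address or port).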

Apache CloudStack: No templates showing when adding instance

I have set up Apache CloudStack on a CentOS 6.8 machine following the quick installation guide. The management server and KVM are set up on the same machine. The management server is running without problems. I was able to add a zone, pod, cluster, and primary and secondary storage from the web interface. But when I tried to add an instance, no templates were shown in the second stage, as you can see in the screenshot.
However, I am able to see two templates under the Templates link in the web UI.
But when I select a template and navigate to the Zone tab, I see "Timeout waiting for response from storage host" and the Ready field shows no.
When I check the management server logs, there seems to be an error when CloudStack tries to mount the secondary storage. The segment below from the cloudstack-management.log file shows this error.
2017-03-09 23:26:43,207 DEBUG [c.c.a.t.Request] (AgentManager-Handler-
14:null) (logid:) Seq 2-7686800138991304712: Processing: { Ans: , MgmtId:
279278805450918, via: 2, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":
{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException:
GetRootDir for nfs://172.16.10.2/export/secondary failed due to
com.cloud.utils.exception.CloudRuntimeException: Unable to mount
172.16.10.2:/export/secondary at /mnt/SecStorage/6e26529d-c659-3053-8acb-
817a77b6cfc6 due to mount.nfs: Connection timed out\n\tat
org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.getRootDir(Nf
sSecondaryStorageResource.java:2080)\n\tat
org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.execute(NfsSe
condaryStorageResource.java:1829)\n\tat
org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.executeReques
t(NfsSecondaryStorageResource.java:265)\n\tat
com.cloud.agent.Agent.processRequest(Agent.java:525)\n\tat
com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:833)\n\tat
com.cloud.utils.nio.Task.call(Task.java:83)\n\tat
com.cloud.utils.nio.Task.call(Task.java:29)\n\tat
java.util.concurrent.FutureTask.run(FutureTask.java:262)\n\tat
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\
n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\
n\tat java.lang.Thread.run(Thread.java:745)\n","wait":0}}] }
Can anyone please guide me on how to resolve this issue? I have been trying to figure it out for some hours now and don't know how to proceed further.
Edit 1: Please note that my LAN address was 10.103.72.50, which I assume is not a /24 address. I tried to give CentOS a static IP with the following settings in the ifcfg-eth0 file:
DEVICE=eth0
HWADDR=52:54:00:B9:A6:C0
NM_CONTROLLED=no
ONBOOT=yes
BOOTPROTO=none
IPADDR=172.16.10.2
NETMASK=255.255.255.0
GATEWAY=172.16.10.1
DNS1=8.8.8.8
DNS2=8.8.4.4
But doing this would cut off my internet access. As a workaround, I reverted these changes and installed all the packages first. Then I changed the IP to static with the same configuration settings as above and started the CloudStack management server. Everything worked fine until I bumped into this template problem. Please help me figure out what might have gone wrong.
I know I'm late, but for people trying this out in the future, here it goes:
I hope you successfully added a host as described in the Quick Install Guide before you changed your IP to static, as that step autoconfigures VLANs for the different traffic types and creates two bridges, generally named 'cloud' or 'cloudbr'. CloudStack uses the Secondary Storage System VM (SSVM) for all storage-related operations in each Zone and Cluster. The problem seems to be that the SSVM is not able to communicate with the management server on port 8250. If that is not the case, try manually mounting the NFS server's mount points in the SSVM shell. You can ssh into the SSVM using the command below:
ssh -i /var/cloudstack/management/.ssh/id_rsa -p 3922 root@<Private or Link local Ip address of SSVM>
I suggest you run /usr/local/cloud/systemvm/ssvm-check.sh after you ssh into the SSVM (assuming it is running and has its private, public and link-local IP addresses). If that doesn't help much, take a look at the secondary storage troubleshooting docs at CloudStack.
I would further recommend that anyone who runs into similar issues in future check that the SSVM is running and in the "Up" state in the System VMs section of the Infrastructure tab, and that you can open a console session to it from the browser. If that works, go on to run the ssvm-check.sh script mentioned above, which systematically checks every operation the SSVM performs. Even if a console session cannot be opened, you can still ssh in using the link-local IP address of the SSVM, which you can find by opening the SSVM's details, and then execute the script.
If it says it cannot communicate with the management server on port 8250, I recommend you check the iptables rules of the management server and make sure all traffic is allowed on port 8250. A command to check this is nc -v <mngmnt-server-ip> 8250. You can do a simple search to learn how to add port 8250 to your iptables rules if it is not open.
Next, you mentioned you use CentOS 6.8, which probably uses an older version of NFS, so execute exportfs -a on your NFS server to make sure all the NFS shares are properly exported and there are no errors. I would recommend that you wait for the download of the CentOS 5.5 no GUI KVM template to complete and its Ready status to show 'Yes' before you start importing your own templates and ISOs to run on VMs.
Finally, if the ssvm-check.sh script shows everything is good and the download still does not start, you can run service cloud restart and then check that the service has actually got a PID using service cloud status, as older versions of the system VM templates sometimes need the cloud service to be started manually with service cloud start even after the restart command. Restarting the cloud service in the SSVM restarts the download of all remaining templates and ISOs. Side note: the system VMs use a Debian kernel, in case you want to do some more troubleshooting. Hope this helps.

Concourse result keeps loading

I'm new to concourse and really excited to start working with it but I have a problem running the hello world example described here: https://concourse-ci.org/hello-world.html
I'm running this example on a concourse docker setup described here: https://concourse-ci.org/docker-repository.html.
Everything seems to work just fine but when I want to verify the results of both examples it keeps saying loading:
Task result loading (image)
Any idea why this would happen? I'm running docker-compose on Mac OS X (El Capitan) but that shouldn't matter right? Is there some additional configuration that I'm missing?
I also noticed when checking the network trace that the following request doesn't return any value: /api/v1/builds/<buildnumber>/events
It keeps saying 'pending'. Is that normal? I assume it isn't but I don't know the cause of this. Is there any logging I can check?
EDIT:
It seems to have something to do with the fact that it isn't running on localhost. When I use port forwarding and open concourse on localhost:8080 the results are shown just fine. Also mapping another hostname to 127.0.0.1 with port forwarding enabled works. So only when I communicate directly with the opened docker ports it doesn't work. Am I missing something?
After much frustration I found out that the cause of this issue was Sophos Anti-Virus blocking Concourse's server-sent events...
https://community.sophos.com/products/free-antivirus-tools-for-desktops/f/sophos-anti-virus-for-mac-home-edition/5750/sophos-av-blocks-server-sent-events-sse-on-mac-os-x-yosemite

Proxy setting in gsutil tool

I use the gsutil tool to download archives from Google Storage.
I use the following CMD command:
python c:\gsutil\gsutil cp gs://pubsite_prod_rev_XXXXXXXXXXXXX/YYYYY/*.zip C:\Tmp\gs
Everything works fine, but if I run that command from behind the corporate proxy, I receive this error:
Caught socket error, retrying: [Errno 10051] A socket operation was attempted to an unreachable network
I tried several times to set the proxy settings in the .boto file, but to no avail.
Has anyone faced this problem?
Thanks!
Please see the section "I'm connecting through a proxy server, what do I need to do?" at https://developers.google.com/storage/docs/faq#troubleshooting
Basically, you need to configure the proxy settings in your .boto file, and you need to ensure that your proxy allows traffic to accounts.google.com as well as to *.storage.googleapis.com.
A change was merged into GitHub yesterday that fixes some of the proxy support. Please try it out; specifically, overwrite your current copy of this file with the version here:
https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/util.py
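For the proxy settings in the .boto file mentioned above, the relevant options normally live in the [Boto] section. The Python sketch below just renders such a section with configparser so it can be compared with what is already in the file. It assumes the boto-style option names proxy, proxy_port, proxy_user and proxy_pass; the function name and the host and port values are placeholders, not values from the question.

```python
import configparser
import io


def boto_proxy_section(host, port, user=None, password=None):
    """Render a [Boto] section with proxy options as .boto-style text."""
    # interpolation=None so a '%' in a password is written out unchanged.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["Boto"] = {"proxy": host, "proxy_port": str(port)}
    if user is not None:
        cfg["Boto"]["proxy_user"] = user
    if password is not None:
        cfg["Boto"]["proxy_pass"] = password
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder proxy host and port; substitute your corporate proxy.
    print(boto_proxy_section("proxy.example.com", 3128))
```

The rendered text is meant to be merged by hand into the existing .boto file rather than to replace it, since that file also holds credentials.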
I believe I am having the same problem, with the proxy settings being ignored under Linux (Ubuntu 12.04.4 LTS) and gsutil 4.2 (downloaded today).
I've been watching tcpdump on the host to confirm that gsutil is attempting to route directly to Google IPs instead of to my proxy server.
It seems that on the first execution of a simple command like "gsutil -d ls" it uses the proxy settings specified in .boto for the first POST and then switches back to attempting to route directly to Google.
Then if I press CTRL-C and re-run the exact same command, the proxy setting is no longer used at all. This difference in behaviour baffles me. If I wait long enough, I think it works for the initial request again, which suggests some form of caching is taking place. I'm not 100% sure of this behaviour yet because I haven't been able to predict when it occurs.
I also noticed that it always first tries to connect to 169.254.169.254 on port 80 regardless of proxy settings. A grep shows that it's hardcoded into oauth2_client.py, test_utils.py, layer1.py, and utils.py (under different subdirectories of the gsutil root).
I've tried setting the http_proxy environment variable, but it appears that there is code that unsets it.

Attempt to access remote folder mounted with CIFS hangs when disconnected

This question is an extension of that question.
Yet again: I'm working under CentOS 6.0 and I have a remote Win7 folder, mounted with:
mount -t cifs //PC128/mnt /media/net -o "username=WORKGROUP\user,password=pwd,rw,noexec,soft,uid=user,gid=user"
When the remote folder is not available (e.g. the network cable is pulled out), an attempt to access the remote folder locks up the application I'm working on. At first I detected that QDir::exists() blocked for 20-90 seconds (I still can't find out why there is such a difference); later I found that any call to the stat() function locks up the application.
I followed the advice given in the topic above and moved the QDir::exists() call (and later the call to the stat() function) to another thread, but this didn't solve the problem. The application still hangs when the connection is suddenly lost. The Qt trace shows that the lock is somewhere in the kernel:
0 __kernel_vsyscall
1 __xstat64@GLIBC_2.1 /lib/libc.so.6
2 QFSFileEnginePrivate::doStat stat.h
I also tried to check whether the remote share is still mounted before trying to access the folder itself, but it didn't help. Approaches such as:
mount | grep /media/net
show that the shared folder is still mounted even if there is no active connection to the network.
Checking folder status differences such as:
stat -fc%t:%T /media/net/ != stat -fc%t:%T /media/net/..
also hangs for ~20 seconds.
So I have several questions:
Is there any way to change CIFS timeouts? I tried to find out, but it seems that there are no appropriate parameters and no CIFS config.
How can I check whether the remote folder is still mounted without getting locked?
How can I check whether the folder exists without getting locked?
Your problem, an unreachable network filesystem, is a very well-known example of what triggers a Linux hung task, which is not the same thing as a zombie process at all (killing the parent PID won't do anything).
A hung task is a task that triggered a system call which causes a problem in the kernel, so that the system call never returns.
The main particularity is that the scheduler puts the task in the "D" state, which means the program is in an uninterruptible state. This means you can do nothing to stop your program: you can send any signal to the task and it will not respond. Sending hundreds of SIGTERM/SIGKILL signals does nothing!
This is the case with my old kernel: when my NFS server crashes, I need to reboot the client to kill the tasks using the filesystem. I compiled it a long time ago (I still have the build tree on my HDD), and during configuration I saw this in lib/Kconfig.debug:
config DETECT_HUNG_TASK
bool "Detect Hung Tasks"
depends on DEBUG_KERNEL
default LOCKUP_DETECTOR
help
Say Y here to enable the kernel to detect "hung tasks",
which are bugs that cause the task to be stuck in
uninterruptible "D" state indefinitiley.
When a hung task is detected, the kernel will print the
current stack trace (which you should report), but the
task will stay in uninterruptible state. If lockdep is
enabled then all held locks will also be reported. This
feature has negligible overhead.
It only offered to detect such tasks, or to panic on detection. I haven't checked whether recent kernels can actually solve the problem (that seems to be the case judging by your question), but I didn't think it was worth enabling.
There is a second point: normally, detection occurs after 120 seconds, but I also saw a Kconfig option for this:
config DEFAULT_HUNG_TASK_TIMEOUT
int "Default timeout for hung task detection (in seconds)"
depends on DETECT_HUNG_TASK
default 120
help
This option controls the default timeout (in seconds) used
to determine when a task has become non-responsive and should
be considered hung.
It can be adjusted at runtime via the kernel.hung_task_timeout_secs
sysctl or by writing a value to
/proc/sys/kernel/hung_task_timeout_secs.
A timeout of 0 disables the check. The default is two minutes.
Keeping the default should be fine in most cases.
This also applies to kernel threads. For example, set up a loop device on a file that lives on a FUSE filesystem, then crash the userspace program controlling the FUSE filesystem!
You should get a kthread named loopX (X normally corresponds to your loopback device number) hanging!
Web links:
https://unix.stackexchange.com/questions/5642/what-if-kill-9-does-not-work (look at the answer written by ultrasawblade)
http://www.linuxquestions.org/questions/linux-general-1/kill-a-hung-task-when-kill-9-doesn't-help-697305/
http://forums-web2.gentoo.org/viewtopic-t-811557-start-0.html
http://comments.gmane.org/gmane.linux.kernel/1189978
http://comments.gmane.org/gmane.linux.kernel.cifs/7674 (This is a case similar to yours)
As for your three questions, you have the answer: this is probably due to a well-known bug in the Linux kernel VFS layer! (There are no CIFS timeouts.)
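For the second and third questions, one workaround is to keep the caller responsive even though the blocked call itself cannot be interrupted: run the check in a daemon thread and stop waiting after a deadline, without ever joining the stuck thread unconditionally. The sketch below shows the idea in Python rather than Qt; exists_with_timeout and its probe parameter are hypothetical names. The worker thread stays blocked in the kernel until the CIFS call returns, so this only bounds how long the caller waits.

```python
import os
import threading


def exists_with_timeout(path, timeout_s=2.0, probe=os.path.exists):
    """Return probe(path) as a bool, or None if it did not finish in time."""
    result = []

    def worker():
        try:
            result.append(bool(probe(path)))
        except OSError:
            result.append(False)

    # daemon=True so a thread stuck in an uninterruptible call
    # does not prevent the program from exiting.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)  # wait at most timeout_s, then give up waiting
    return result[0] if result else None


if __name__ == "__main__":
    # Mount point from the question; None would mean "no answer in time".
    print(exists_with_timeout("/media/net", timeout_s=2.0))
```

A None result can then be treated as "share unreachable" by the application, which avoids calling stat() on the mount point from the main thread.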
After much trial & error I found a solution that persists.
# vim /etc/fstab
//192.168.1.122/myshare /mnt/share cifs username=user,password=password,_netdev 0 0
The _netdev option is important since we are mounting a network device. Clients may hang during the boot process if the system encounters any difficulties with the network.
https://www.redhat.com/sysadmin/samba-windows-linux