How to resolve DNS lookup error when trying to run example microservice application using minikube - kubernetes

Dear StackOverflow community!
I am trying to run the https://github.com/GoogleCloudPlatform/microservices-demo locally on minikube, so I am following their development guide: https://github.com/GoogleCloudPlatform/microservices-demo/blob/master/docs/development-guide.md
After I successfully set up minikube (using the virtualbox driver; I also tried hyperkit, with the same results) and execute skaffold run, it eventually fails with the following error:
Building [shippingservice]...
Sending build context to Docker daemon 127kB
Step 1/14 : FROM golang:1.15-alpine as builder
---> 6466dd056dc2
Step 2/14 : RUN apk add --no-cache ca-certificates git
---> Running in 0e6d2ab2a615
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/main/x86_64/APKINDEX.tar.gz
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.13/main: DNS lookup error
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/community/x86_64/APKINDEX.tar.gz
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.13/community: DNS lookup error
ERROR: unable to select packages:
git (no such package):
required by: world[git]
Building [recommendationservice]...
Building [cartservice]...
Building [emailservice]...
Building [productcatalogservice]...
Building [loadgenerator]...
Building [checkoutservice]...
Building [currencyservice]...
Building [frontend]...
Building [adservice]...
unable to stream build output: The command '/bin/sh -c apk add --no-cache ca-certificates git' returned a non-zero code: 1. Please fix the Dockerfile and try again..
The error message suggests that DNS does not work. I tried adding 8.8.8.8 to /etc/resolv.conf on the minikube VM, but it did not help. I noticed that after I re-run skaffold run and it fails again, the content of /etc/resolv.conf returns to its original state, with 10.0.2.3 as the only DNS entry. Reaching the outside internet and pinging 8.8.8.8 from within the minikube VM works.
Could you point me in the right direction on how to fix the problem, and on how DNS inside minikube/Kubernetes works? I've heard that DNS problems inside a Kubernetes cluster are common.
Thanks for your answers!
Best regards,
Richard

Tried it with docker driver, i.e. minikube start --driver=docker, and it works. Thanks Brian!

It sounds like the issue was resolved for the OP, but if you are using Docker inside minikube, the suggestion below worked for me.
Ref: https://github.com/kubernetes/minikube/issues/10830
minikube ssh
$>sudo vi /etc/docker/daemon.json
# Add "dns": ["8.8.8.8"]
# save and exit
$>sudo systemctl restart docker
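If /etc/docker/daemon.json already contains other settings, the "dns" key has to be merged into the existing JSON object rather than pasted in as a second object. A minimal sketch of that merge follows, assuming Python is available wherever you prepare the file (it may not be inside the minikube VM). The function name add_dns_servers and the sample existing content are hypothetical, not part of minikube or Docker.

```python
import json


def add_dns_servers(daemon_json_text, servers):
    """Return daemon.json text with `servers` merged into the "dns" list.

    Existing keys are preserved; empty input is treated as an empty object.
    """
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    dns = list(config.get("dns", []))
    for server in servers:
        if server not in dns:
            dns.append(server)
    config["dns"] = dns
    return json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    # Hypothetical existing content; the real file may be empty or absent.
    existing = '{"exec-opts": ["native.cgroupdriver=systemd"]}'
    print(add_dns_servers(existing, ["8.8.8.8"]))
```

Write the result back to /etc/docker/daemon.json and restart Docker as shown above.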

Related

Docker nuget connection timeout

I'm trying to use the official jetbrains/teamcity-agent image on Kubernetes. I've managed to run Docker in Docker there, but building an ASP.NET Core image with the docker build command fails on dotnet restore with
The HTTP request to 'GET https://api.nuget.org/v3/index.json' has timed out after 100000ms.
When I connect to the pod itself and curl the URL, it's super fast, so I assume the network is not the issue. Thanks for any advice.
Update
Running a simple dotnet restore step from a container worked, but not from inside docker build.
Update 2
I've isolated the problem: it has nothing to do with NuGet or TeamCity. It is network-related, on the Kubernetes host.
Running a simple docker build with this Dockerfile:
FROM praqma/network-multitool AS build
RUN route
RUN ping -c 4 google.com
produces output:
Step 1/3 : FROM praqma/network-multitool AS build
---> 3619cb81e582
Step 2/3 : RUN route
---> Running in 80bda13a9860
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 172.17.0.1 0.0.0.0 UG 0 0 0 eth0
172.17.0.0 * 255.255.0.0 U 0 0 0 eth0
Removing intermediate container 80bda13a9860
---> d79e864eafaf
Step 3/3 : RUN ping -c 4 google.com
---> Running in 76354a92a413
PING google.com (216.58.201.110) 56(84) bytes of data.
--- google.com ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 53ms
Pods orchestrated by Kubernetes can access the internet normally. I'm using Calico as the network layer.
I fixed this issue by passing the --disable-parallel argument to the restore command, which disables restoring multiple projects in parallel.
RUN dotnet restore --disable-parallel
I have exactly the same behaviour:
I have a solution which contains several NuGet dependencies.
It builds without any issue on my local machine.
It builds without any issue on a Windows build agent.
It builds without any issue on the Docker host machine.
But when I try to build it in a build agent in Docker, I get a lot of messages such as the following:
Failed to download package 'System.Threading.4.0.11' from 'https://api.nuget.org/v3-flatcontainer/system.threading/4.0.11/system.threading.4.0.11.nupkg'.
The download of 'https://api.nuget.org/v3-flatcontainer/system.threading/4.0.11/system.threading.4.0.11.nupkg' timed out because no data was received for 60000ms
I can ping and curl pages from nuget.org normally from the Docker container.
So I think this is some special case. I found some information about MTU, but I have not tested it.
UPDATE: the initial problem may be connected to k8s. My container runs inside a k8s cluster based on Ubuntu 18.04 with flannel and k8s v1.16.
On my local machine (Windows-based) everything works without any issue. But it is strange, because I have many services that work in this cluster without any problems (such as harbor, graylog, jaeger etc.).
UPDATE 2: OK, now I can't understand anything.
I tried to execute
curl https://api.nuget.org/v3/index.json
and got the file content without any errors.
After this I tried to run
wget https://api.nuget.org/v3-flatcontainer/system.threading/4.0.11/system.threading.4.0.11.nupkg
and the package downloaded successfully.
But when I run dotnet restore I still get timeout errors.
UPDATE 3
I tried to reproduce the problem locally in Docker rather than in the k8s cluster.
I ran the container
docker run -it -v d:/project/test:/mnt/proj teamcity-agent-core3.1 bash
teamcity-buildagent-core3.1 is my image based on jetbrains/teamcity-agent, which contains the .NET Core 3.1 SDK.
and then executed this command inside the interactive session:
dotnet restore test.sln
which failed with the following messages:
Failed to download package 'System.Runtime.InteropServices.4.3.0' from 'https://api.nuget.org/v3-flatcontainer/system.runtime.interopservices/4.3.0/system.runtime.interopservices.4.3.0.nupkg'.
Received an unexpected EOF or 0 bytes from the transport stream.
The download of 'https://api.nuget.org/v3-flatcontainer/system.text.encoding.extensions/4.3.0/system.text.encoding.extensions.4.3.0.nupkg' timed out because no data was received for 60000ms.
Exception of type 'System.TimeoutException' was thrown.
In my case the solution was pointed out here
As noted in the comment, "So maybe the issue needs to be fixed by microsoft by changing the default nuget.config inside of mcr.microsoft.com/dotnet/sdk:5.0."
This was my problem: I was building with Docker from sdk:5.0. The solution, which seems to do the job, is to add a nuget.config file to the root of the solution.
Contents of nuget.config (again, from posts in that issue):
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<config>
<add key='maxHttpRequestsPerSource' value='10' />
</config>
</configuration>
I had a similar issue. My mistake was not specifying the exact dotnet version of the Docker image.
FROM mcr.microsoft.com/dotnet/core/sdk AS build
My project targets dotnet 2.2. What I did not know was that this was pulling the latest dotnet SDK, 3.1. So when dotnet restore ran, it timed out.
So this is what I did.
FROM mcr.microsoft.com/dotnet/core/sdk:2.2 AS build
I had to specify an exact version. I'm not sure if this is related to your problem, but I hope it sends you in the right direction. Always be explicit with the image version.
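To catch this kind of mistake before a build, you could scan a Dockerfile for FROM lines that do not pin a version. The following is only a rough sketch: the function name unpinned_base_images is hypothetical, and the parsing is deliberately simple (it ignores line continuations and skips base images set through ARG variables).

```python
def unpinned_base_images(dockerfile_text):
    """Return base images from FROM lines that have no explicit tag or digest.

    Images tagged :latest are also reported, since that tag moves over time.
    """
    stage_names = set()
    unpinned = []
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        # Drop flags such as --platform=linux/amd64.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if not args:
            continue
        image = args[0]
        is_earlier_stage = image.lower() in stage_names
        # Remember the alias from "FROM image AS name" for later FROM lines.
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_names.add(args[2].lower())
        # Skip earlier build stages, scratch, and ARG-substituted names.
        if is_earlier_stage or image == "scratch" or "$" in image:
            continue
        if "@" in image:
            continue  # pinned by digest
        # Only the last path component can carry a tag; a colon earlier
        # in the name belongs to a registry port such as localhost:5000.
        last_component = image.rsplit("/", 1)[-1]
        tag = last_component.split(":", 1)[1] if ":" in last_component else None
        if tag is None or tag == "latest":
            unpinned.append(image)
    return unpinned
```

Passing it the first FROM line above reports mcr.microsoft.com/dotnet/core/sdk, while the line with the :2.2 tag is not reported.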
I had an issue similar to @NIMROD MAINA's and @Anatoly Kryzhanovsky's when building in a Docker container from gitlab-runner (docker).
When I ran dotnet restore outside the Docker container, everything worked!
In my case it didn't work when nuget.config was inside the project folder.
I put nuget.config in the solution folder (outside the project folder) and it worked again.
For me the solution was to set these Docker (Windows) options:
Expose daemon on tcp://localhost:2375 without TLS (true) and
Use Docker Compose V2 (true)
It's a temporary solution, but it works.
Check your DNS settings (A record). Try running nslookup yourfeeddomain and make sure that the name resolves to a single IP address.
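If nslookup is not installed in the build container, the same check can be approximated with a few lines of Python, assuming Python is available there. This is only a sketch: the helper name resolved_addresses is hypothetical, and it uses the system resolver through socket.getaddrinfo, so it shows what the container itself would resolve.

```python
import socket


def resolved_addresses(hostname):
    """Return the sorted, de-duplicated IP addresses that `hostname` resolves to."""
    # Restricting to TCP avoids one duplicate entry per socket type.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})
```

Call it with your feed host, for example resolved_addresses("api.nuget.org"). A socket.gaierror means the lookup itself failed, and more than one address in the result means the name does not resolve to a single IP.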

docker-compose portmapping gives failed to create endpoint hnsCall failed in Win32: The specified port already exists

I have started a new (.NET Core 3.0) project in Visual Studio, with Docker support (Windows).
I added Docker support (right-click on the project, Add -> Docker support) and added Docker Compose support in the same way.
If I just click the "play" button for Docker Compose, the project starts and everything works well.
But when I run docker-compose up from the solution folder I get
Cannot start service testproj30: failed to create endpoint
testproj30_testproj30_1 on network nat: hnsCall failed in Win32: The
specified port already exists.
(I have closed my VS solution.) If I remove the port mapping in docker-compose.override.yaml I don't get this error message. I have done the most common tricks, such as restarting the Docker service, the HNS service and so on. Nothing helps.
I don't want to depend on all the VS voodoo from the project file and God knows what other files that are involved.
I can run docker run -p 8080:80 -p 443:443 without any port problems.
I fixed a similar problem by removing some terminated containers and then pruning networks.
List terminated containers:
docker ps -a
Remove them (Cygwin syntax):
docker rm $(docker ps -aq)
You will get an error message for running containers.
Clean your networks:
docker network prune
For me, the main cause was that Docker killing the process skipped the port-releasing mechanism of my application.

minikube install package using toolbox but the container does not have an internet connection

I'm wondering how I can install a package inside the minikube VM. I need some tools.
I have tried the /bin/toolbox container, but it does not have an internet connection.
[root@docker-fedora-24 ~]# dnf update --verbose
cachedir: /var/cache/dnf
DNF version: 1.1.9
Cannot download 'https://mirrors.fedoraproject.org/metalink?repo=updates-released-f24&arch=x86_64': Cannot prepare internal mirrorlist: Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=updates-released-f24&arch=x86_64 [Could not resolve host: mirrors.fedoraproject.org].
Error: Failed to synchronize cache for repo 'updates'
I have tried the same toolbox script on my computer and it works properly.
What configuration parameters am I missing in minikube or systemd-nspawn?
Or how can I build a customized minikube VM?
Thanks a lot
You can run minikube without a VM on your local Docker (if you use Linux):
minikube start --vm-driver=none
An alternative: run toolbox with docker run --net=host ... to make the network more transparent for the container. Troubleshoot your internet connection with nslookup, traceroute/tracepath, curl -v, ifconfig.
http://www.linuxhomenetworking.com/wiki/index.php/Quick_HOWTO_:Ch04:_Simple_Network_Troubleshooting#.WfY1xGi0OUk
Minikube is not meant to be tweaked. The advised method is to prepare a Helm chart for your application. As part of the Helm chart you can add whatever tools you need in your Dockerfile, including make. Then you can install or upgrade your package in Kubernetes/minikube using Helm.
I had a similar problem when I wanted to use tcpdump in the minikube VM.
I ended up using minikube mount SRC-dir:DST-dir to mount the host folder inside the VM and copying the tcpdump binary along with dependent libs (libcrypto and libpcap) to the mount point.
Then I executed tcpdump from the minikube VM and it worked.
Note: My host arch and the minikube VM arch (x86_64) were the same.
Note also: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:DST-dir has to be done.

How to start up a Kubernetes cluster using Rocket?

I'm using a Chromebook Pixel 2, and it's easier to get Rocket working than Docker. I recently installed Rocket 1.1 into /usr/local/bin, and have a clone of the Kubernetes GitHub repo.
When I try to use ./hack/local-up-cluster.sh to start a cluster, it eventually fails with this message:
Failed to successfully run 'docker ps', please verify that docker is installed and $DOCKER_HOST is set correctly.
According to the docs, k8s supports Rocket. Can someone please guide me about how to start a local cluster without a working Docker installation?
Thanks in advance.
You need to set three environment variables before running ./hack/local-up-cluster.sh:
$ export CONTAINER_RUNTIME=rkt
$ export RKT_PATH=$PATH_TO_RKT_BINARY
$ export RKT_STAGE1_IMAGE=$PATH_TO_STAGE1_IMAGE
This is described in the docs for getting started with a local rkt cluster.
Try running export CONTAINER_RUNTIME="rocket" and then re-running the script.

Can't clone public repo from within Dockerfile

I have the following line in my Dockerfile:
RUN git clone https://github.com/assafg/youtube-remote.git ./youtube-remote
When executing sudo docker build -t 'yremote' .
I get the following error:
Cloning into './youtube-remote'... fatal: unable to access
'https://github.com/assafg/youtube-remote.git/': Could not resolve
host: github.com The command '/bin/sh -c git clone
https://github.com/assafg/youtube-remote.git ./youtube-remote'
returned a non-zero code: 128
Running clone command from command line works fine.
This can happen if your container can't connect to the internet. Possibly because it was started with a weird networking option? Run this command to check default internet connectivity:
docker run ubuntu sh -c 'apt-get update && apt-get install -y git &&
git clone https://github.com/assafg/youtube-remote.git ./youtube-remote'
If that container successfully pulls down the repo, it probably means the first container has a networking problem. Try to restart, or change networking settings.
Docker Network just became a first class citizen in the Docker ecosystem. It's a really fast-moving project. This advice applies to v1.8
This is not a very scientific answer but sometimes docker restart helps especially in cases connected with docker network.