Alertmanager can't send email when running in OpenShift (Error: getsockopt: connection timed out)

I have Alertmanager and Prometheus running in OpenShift. Alertmanager receives and shows the alerts from Prometheus, but when it tries to send them through any SMTP server (I'm using Gmail now, but I've tried others), I get the following error:
time="2017-05-30T08:47:21Z" level=error msg="Notify for 1 alerts failed: dial tcp 74.125.206.109:587: getsockopt: connection timed out" source="dispatch.go:261"
I have a config.yml that worked when I tried it locally with Alertmanager and Prometheus. I received the email alerts, so I don't get why it doesn't work when running in OpenShift. I've run out of ideas.
My config file:
global:
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'emailtestingxxx@gmail.com'
  smtp_auth_username: 'emailtestingxxx@gmail.com'
  smtp_auth_password: 'ABCD1234'

templates:
- '/etc/alertmanager/template/*.tmpl'

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 1m
  group_interval: 1m
  repeat_interval: 1m
  receiver: team-X-mails
  routes:
  - match:
      severity: page
    receiver: team-X-mails

receivers:
- name: 'team-X-mails'
  email_configs:
  - to: 'myemail@myemail.com'

Solved. The problem was a firewall configuration that was blocking the outgoing SMTP requests.
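If the egress restriction had been enforced by OpenShift itself rather than an external firewall, an EgressNetworkPolicy along these lines would be one way to open outbound SMTP. This is only a sketch: it assumes the cluster uses openshift-sdn, and the policy name, project name and the trailing Deny rule are illustrative rather than taken from the original setup.

apiVersion: network.openshift.io/v1
kind: EgressNetworkPolicy
metadata:
  name: allow-smtp            # illustrative name
  namespace: monitoring       # placeholder for the project running Alertmanager
spec:
  egress:
  # allow outbound traffic to the SMTP provider used in the question
  - type: Allow
    to:
      dnsName: smtp.gmail.com
  # keep everything else blocked (optional, shown for completeness)
  - type: Deny
    to:
      cidrSelector: 0.0.0.0/0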

Related

Webhook failing in rabbitmq

I have installed a RabbitMQ cluster using the RabbitMQ cluster operator. I have also added the RabbitMQ topology operator. I am trying to create queues with the topology operator using the following yml file:
apiVersion: rabbitmq.com/v1beta1
kind: Queue
metadata:
  name: software-results
  namespace: rabbitmq-system
spec:
  name: software-results # name of the queue
  type: quorum # without providing a queue type, rabbitmq creates a classic queue
  autoDelete: false
  durable: true # setting 'durable' to false means this queue won't survive a server restart
  rabbitmqClusterReference:
    name: client-queues
I am getting this error:
Error from server (InternalError): error when creating "singleQueue.yml": Internal error occurred: failed calling webhook "vqueue.kb.io": failed to call webhook: Post "https://webhook-service.rabbitmq-system.svc:443/validate-rabbitmq-com-v1beta1-queue?timeout=10s": dial tcp 10.97.65.156:443: connect: connection refused
I tried to search for the same but didn't find much. Can anyone help me understand what exactly is going wrong here?

viz extension crashloop with Request failed error unauthorized connection on server proxy-admin

I just tried to install the Linkerd viz extension following the official documentation, but all the pods are in a crash loop.
linkerd viz install | kubectl apply -f -
Linkerd-getting-started
Logs for server proxy-admin:
[ 29.797889s] INFO ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}: linkerd_app_inbound::policy::authorize::http: Request denied server=proxy-admin tls=None(NoClientHello) client=50.50.55.177:47068
[ 29.797910s] INFO ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:rescue{client.addr=50.50.55.177:47068}: linkerd_app_core::errors::respond: Request failed error=unauthorized connection on server proxy-admin
[ 29.817790s] INFO ThreadId(01) linkerd_proxy::signal: received SIGTERM, starting shutdown
The error appeared on a Kubernetes cluster, Server Version: v1.21.5-eks-bc4871b.
The issue was the policy that comes with the default installation. It authorizes unauthenticated requests only from IPs in the clusterNetworks configuration; if the source IP (<public-ip-address-of-hel-k1>) is not in that list, the connections are denied. To fix this, the authorization policy can be updated with the following:
spec:
  client:
    unauthenticated: true
    networks:
    - cidr: 0.0.0.0/0
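For context, a complete ServerAuthorization with that client block would look roughly like the sketch below. The apiVersion and the proxy-admin/linkerd-viz names are assumptions based on the resources listed further down, not copied verbatim from the cluster.

apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  name: proxy-admin            # assumed; see the list of ServerAuthorizations below
  namespace: linkerd-viz
spec:
  server:
    name: proxy-admin
  client:
    # allow clients that don't present an mTLS identity ...
    unauthenticated: true
    # ... from any source network, not only the clusterNetworks ranges
    networks:
    - cidr: 0.0.0.0/0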
The default policy was missing this part under client:
networks:
- cidr: 0.0.0.0/0
To update the policy, get the ServerAuthorization resources:
k get ServerAuthorization -n linkerd-viz
NAME          SERVER
admin         admin
grafana       grafana
metrics-api   metrics-api
proxy-admin   proxy-admin
Now edit admin, grafana, metrics-api and proxy-admin and add the networks part to each:
k edit ServerAuthorization metrics-api -n linkerd-viz
After fixing this I was also getting errors for grafana, which I fixed the same way by adding the networks part:
[ 32.278014s] INFO ThreadId(01) inbound:server{port=3000}:rescue{client.addr=50.50.53.140:44718}: linkerd_app_core::errors::respond: Request failed error=unauthorized connection on server grafana
[ 38.176927s] INFO ThreadId(01) inbound:server{port=3000}: linkerd_app_inbound::policy::authorize::http: Request denied server=grafana tls=None(NoClientHello) client=50.50.55.177:33170

Accessing an SMTP server when Istio is enabled

I'm getting the error curl: (56) response reading failed while trying to send email via SMTP using curl. I checked the istio-proxy logs of the sidecar but don't see any error logs related to this host. I also tried the solution mentioned in How to access external SMTP server from within Kubernetes cluster with Istio Service Mesh, but it didn't work.
My ServiceEntry:
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: smtp
spec:
  addresses:
  - 192.168.8.45/32
  hosts:
  - smtp.example.com
  location: MESH_EXTERNAL
  ports:
  - name: tcp-smtp
    number: 2255
    protocol: TCP
Most probably the port number is causing the error; if not, try deleting the mesh policies.
Also, please validate the points below (a ServiceEntry sketch follows the list):
1. If you recently updated Istio, try downgrading it.
2. Look again in the sidecar logs for any conflicts, or try disabling the sidecar.
3. With a curl (56) error, a packet transmission limit could be the problem.
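If the ServiceEntry itself needs adjusting, one variant worth trying spells out the resolution explicitly. Everything below is taken from the question except the resolution: DNS line, which is an assumption, not something confirmed in the original posts.

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: smtp
spec:
  hosts:
  - smtp.example.com
  addresses:
  - 192.168.8.45/32
  location: MESH_EXTERNAL
  # assumption: let the sidecar resolve smtp.example.com via DNS
  resolution: DNS
  ports:
  - name: tcp-smtp
    number: 2255
    protocol: TCP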
When Istio is enabled, curl requests from the primary container are routed via the sidecar, and the response returned to the primary container actually comes from the sidecar, which was quite misleading.
Upon disabling Istio and retrying curl on the SMTP port, the request failed with Failed to connect to smtp.example.com port 2255: Operation timed out, which was because the firewall from the cluster to the SMTP server port was not open.
While Istio was enabled, curl didn't give a timeout error but curl: (56) response reading failed, which misled me into thinking that the response was coming from the SMTP server.

How to monitor filebeat stats with metricbeat

I'm trying to set up Metricbeat to monitor Filebeat stats, but when I try the beat module in my Metricbeat config, I get this error:
Error message from the Metricbeat logs:
Error fetching data for metricset beat.stats: error making http request: Get http://filebeat_ip:5044/stats: dial tcp filebeat_ip:5044: connect: connection refused
metricbeat.yml file
metricbeat.modules:
- module: beat
  metricsets:
  - stats
  - state
  period: 10s
  hosts: ["filebeat_ip:5044"]
where filebeat_ip is the IP where my Filebeat is running, which is the same machine as my Metricbeat.
Can someone please help me understand why I'm getting this error?
If it's the same machine I would just use localhost or 127.0.0.1.
PS: If not running on localhost, I'd double-check that the port is actually reachable and not blocked by a firewall. Something like telnet <ip> 5044 should be a quick sanity check.
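A minimal sketch of that change in metricbeat.yml, keeping the port from the question and assuming Filebeat's HTTP stats endpoint is actually listening there (it has to be enabled with http.enabled: true in filebeat.yml, and by default it uses a different port):

metricbeat.modules:
- module: beat
  metricsets:
  - stats
  - state
  period: 10s
  # same machine, so point at loopback instead of the external IP;
  # port kept from the question, adjust to wherever the http endpoint listens
  hosts: ["localhost:5044"]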

Server gave HTTP response to HTTPS client

I am trying to set up Prometheus to monitor nodes, services and endpoints for my Kubernetes cluster [1 master, 7 minions]. For that I have a very basic prometheus.yml file:
scrape_configs:
- job_name: 'kubernetes-pods'
  tls_config:
    insecure_skip_verify: true
  kubernetes_sd_configs:
  - role: pod
Before starting the Prometheus application, I ran these two commands:
export KUBERNETES_SERVICE_HOST=172.9.25.6
export KUBERNETES_SERVICE_PORT=8080
I can access the Kubernetes API server at http://172.9.25.6:8080.
The connection is made over HTTP and NOT HTTPS.
Now when I start the application, I get the error below:
level=info ts=2017-12-13T20:39:05.312987614Z caller=kubernetes.go:100 component="target manager" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2017-12-13T20:39:05.313443232Z caller=main.go:371 msg="Server is ready to receive requests."
level=error ts=2017-12-13T20:39:05.316618074Z caller=main.go:211 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:205: Failed to list *v1.Pod: Get https://172.9.25.6:8080/api/v1/pods?resourceVersion=0: http: server gave HTTP response to HTTPS client"
I also tried adding scheme: http to my prometheus.yml config, but it does not work. How can I configure the client to accept HTTP responses?
Try specifying api_server inside kubernetes_sd_configs; that way service discovery uses the address and scheme you give it (plain HTTP here) instead of the in-cluster HTTPS defaults:
scrape_configs:
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
    api_server: http://172.9.25.6:8080