We have a Concourse Web container and a Concourse Worker container running on Server A (212.77.7.255; the real IP is concealed). We use the latest Concourse version, 7.8.1.
As we ran out of Worker resources, we added another Concourse Worker Container running on Server B. The Worker on Server B has been running fine for about five days, but all of a sudden it is not able to connect anymore to Concourse Web on Server A.
The logs of the Worker on Server B say:
{
"timestamp": "2022-07-12T11:15:59.542985762Z",
"level": "error",
"source": "worker",
"message": "worker.container-sweeper.tick.failed-to-connect-to-tsa",
"data": {
"error": "dial tcp 212.77.7.255:2222: i/o timeout",
"session": "6.4"
}
}{
"timestamp": "2022-07-12T11:15:59.543044656Z",
"level": "error",
"source": "worker",
"message": "worker.container-sweeper.tick.dial.failed-to-connect-to-any-tsa",
"data": {
"error": "all worker SSH gateways unreachable",
"session": "6.4.2"
}
}{
"timestamp": "2022-07-12T11:15:59.543060804Z",
"level": "error",
"source": "worker",
"message": "worker.container-sweeper.tick.failed-to-dial",
"data": {
"error": "all worker SSH gateways unreachable",
"session": "6.4"
}
}{
"timestamp": "2022-07-12T11:15:59.543068953Z",
"level": "error",
"source": "worker",
"message": "worker.container-sweeper.tick.failed-to-get-containers-to-destroy",
"data": {
"error": "all worker SSH gateways unreachable",
"session": "6.4"
}
}{
"timestamp": "2022-07-12T11:15:59.554118751Z",
"level": "error",
"source": "worker",
"message": "worker.volume-sweeper.tick.failed-to-connect-to-tsa",
"data": {
"error": "dial tcp 212.77.7.255:2222: i/o timeout",
"session": "7.4"
}
}{
"timestamp": "2022-07-12T11:15:59.554164844Z",
"level": "error",
"source": "worker",
"message": "worker.volume-sweeper.tick.dial.failed-to-connect-to-any-tsa",
"data": {
"error": "all worker SSH gateways unreachable",
"session": "7.4.3"
}
}{
"timestamp": "2022-07-12T11:15:59.554172593Z",
"level": "error",
"source": "worker",
"message": "worker.volume-sweeper.tick.failed-to-dial",
"data": {
"error": "all worker SSH gateways unreachable",
"session": "7.4"
}
}{
"timestamp": "2022-07-12T11:15:59.554179789Z",
"level": "error",
"source": "worker",
"message": "worker.volume-sweeper.tick.failed-to-get-volumes-to-destroy",
"data": {
"error": "all worker SSH gateways unreachable",
"session": "7.4"
}
}{
"timestamp": "2022-07-12T11:16:04.580220012Z",
"level": "error",
"source": "worker",
"message": "worker.beacon-runner.beacon.failed-to-connect-to-tsa",
"data": {
"error": "dial tcp 212.77.7.255:2222: i/o timeout",
"session": "4.1"
}
}{
"timestamp": "2022-07-12T11:16:04.580284659Z",
"level": "error",
"source": "worker",
"message": "worker.beacon-runner.beacon.dial.failed-to-connect-to-any-tsa",
"data": {
"error": "all worker SSH gateways unreachable",
"session": "4.1.10"
}
}{
"timestamp": "2022-07-12T11:16:04.580335377Z",
"level": "error",
"source": "worker",
"message": "worker.beacon-runner.beacon.failed-to-dial",
"data": {
"error": "all worker SSH gateways unreachable",
"session": "4.1"
}
}{
"timestamp": "2022-07-12T11:16:04.580359868Z",
"level": "error",
"source": "worker",
"message": "worker.beacon-runner.beacon.exited-with-error",
"data": {
"error": "all worker SSH gateways unreachable",
"session": "4.1"
}
}{
"timestamp": "2022-07-12T11:16:04.580372552Z",
"level": "debug",
"source": "worker",
"message": "worker.beacon-runner.beacon.done",
"data": {
"session": "4.1"
}
}{
"timestamp": "2022-07-12T11:16:04.580394879Z",
"level": "error",
"source": "worker",
"message": "worker.beacon-runner.failed",
"data": {
"error": "all worker SSH gateways unreachable",
"session": "4"
}
}
The logs on Concourse Web on Server A show no entries of the Worker on Server B trying to connect. On Server B I'm able to connect to Concourse Web on Server A:
$ nc 212.77.7.255 2222
SSH-2.0-Go
We had this problem before and solved it by upgrading Concourse to the latest version, 7.8.1. Now I'm running out of ideas for debugging this. What I've tried:
restarting the workers
restarting the web container
pruning the stalled worker of Server B
docker system prune on Server B
Nothing helps. What can I do to debug this further and make the Worker on Server B connect again?
You said it happened with an earlier version, you "ran out of Worker resources", and I'm seeing I/O timeouts in the logs... the one component you didn't mention is the DB.
It might be that the maximum number of connections on the DB has been reached, especially if the DB is used for purposes other than just Concourse. That's where I'd look next.
We couldn't find out why the Docker network did not allow connections to Server A. As connections from the host machine were going through, we told Docker to use the host network:
services:
  concourse-worker:
    ...
    network_mode: host
    ...
This solved the issue. It is not a pretty workaround, as the Docker container should have its own separate network, but as there is nothing else running on this server it's fine.
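One way to narrow this kind of problem down is to run the same TCP check from the host and from inside the worker container, then compare the results. The sketch below is a minimal Python probe and is not part of Concourse. The function names probe_tsa and describe_tsa are made up for illustration, and it assumes Python 3 is available wherever it is run.

```python
import socket


def probe_tsa(host: str, port: int = 2222, timeout: float = 5.0) -> str:
    """Open a TCP connection and return the first bytes the server sends.

    For a Concourse TSA this is expected to be an SSH banner such as
    "SSH-2.0-Go", the same thing `nc <host> 2222` prints.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        banner = sock.recv(256)
    return banner.decode("ascii", errors="replace").strip()


def describe_tsa(host: str, port: int = 2222, timeout: float = 5.0) -> str:
    """Return a one-line, human-readable result instead of raising."""
    try:
        return f"reachable, banner: {probe_tsa(host, port, timeout)}"
    except socket.timeout:
        # Corresponds to the "i/o timeout" the worker logs report.
        return f"timed out after {timeout}s"
    except OSError as exc:
        # Connection refused, no route to host, and similar errors.
        return f"connection failed: {exc}"
```

If describe_tsa("212.77.7.255") reports a banner on the host of Server B but times out inside the container, the problem sits in the container's network path rather than in Concourse itself.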
I received this error from Facebook, for a specific guest user:
"message": "[FacebookClient] response",
"status": 400,
"request": {
"baseURL": "https://graph.facebook.com/v8.0/",
"url": "/me/messages?access_token=EAAT9ZAq9Nm..."
},
"data": {
"error": {
"message": "(#-1) Unexpected internal error",
"type": "OAuthException",
"code": -1,
"error_subcode": 2018012,
"fbtrace_id": "AJcFpXZa28zOMut3XAUxorW"
}
},
....
"error_key": "E-1_S2018012"
The Facebook documentation does not mention this error, and I found very little about it online. The causes suggested there (text not encoded as Unicode, sending the same message payload at the same time and timezone) do not apply to me.
Please help me understand what this error is and how to solve it. Thanks a lot!
I logged in to Vault with a root token.
I try to run
$ vault token lookup
but I keep getting
Error looking up token: Error making API request.
URL: GET https://106.120.137.192:8200/v1/auth/token/lookup-self
Code: 403. Errors:
* permission denied
I have vault logs on Trace level, but there is no related event.
I enabled audit logs to see what's going on but they give me no hint.
[
{
"time": "2021-10-21T15:34:17.647568529Z",
"type": "request",
"auth": {
"token_type": "default"
},
"request": {
"id": "1d5d7f5f-94ca-e281-c0b2-5ffbceccb0dc",
"operation": "read",
"mount_type": "token",
"client_token": "hmac-sha256:75f6fc0b19c105af0f2c27fd180742eef282c38d346fc732771bfaa2d1ce2ea6",
"namespace": {
"id": "root"
},
"path": "auth/token/lookup-self",
"remote_address": "172.18.0.1"
},
"error": "permission denied"
},
{
"time": "2021-10-21T15:34:17.647692649Z",
"type": "response",
"auth": {
"token_type": "default"
},
"request": {
"id": "1d5d7f5f-94ca-e281-c0b2-5ffbceccb0dc",
"operation": "read",
"mount_type": "token",
"client_token": "hmac-sha256:75f6fc0b19c105af0f2c27fd180742eef282c38d346fc732771bfaa2d1ce2ea6",
"namespace": {
"id": "root"
},
"path": "auth/token/lookup-self",
"remote_address": "172.18.0.1"
},
"response": {
"mount_type": "token",
"data": {
"error": "hmac-sha256:9493ed1bac12e9a7fae0e03c488dd1d5f46bcc33ea36ee2c1e5ca92acd683c81"
}
},
"error": "1 error occurred:\n\t* permission denied\n\n"
}
]
What else can I do?
I am running Vault 1.7.0
OK, I found that the problem does not happen when I run the same command on localhost, i.e., against a local instance of Vault.
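To compare the remote and local instances without the CLI in between, the same lookup-self call can be made directly over HTTP. The sketch below is only an illustration: it uses the path shown in the error output (/v1/auth/token/lookup-self) and Vault's X-Vault-Token header, while the function names are hypothetical. It uses Python's default TLS verification, so a self-signed certificate on the remote address will raise an error before any request is sent.

```python
import urllib.error
import urllib.request


def build_lookup_self_request(vault_addr: str, token: str) -> urllib.request.Request:
    """Build the GET request that `vault token lookup` issues for the current token."""
    url = vault_addr.rstrip("/") + "/v1/auth/token/lookup-self"
    return urllib.request.Request(url, headers={"X-Vault-Token": token}, method="GET")


def lookup_self(vault_addr: str, token: str, timeout: float = 10.0):
    """Return (HTTP status, response body as text) for the given address and token."""
    req = build_lookup_self_request(vault_addr, token)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Vault answers 403 with a JSON body listing the errors.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling lookup_self once with the remote address and once with the local one, passing the same token explicitly, shows whether the 403 depends on the address or on the token the CLI happens to send.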
When I attempt to execute any request (through PgBouncer), I get this exception:
Caused by: io.vertx.pgclient.PgException: { "message": "unnamed prepared statement does not exist", "severity": "ERROR", "code": "26000", "file": "postgres.c", "line": "1620", "routine": "exec_bind_message" }
But if I execute the request against Postgres directly (not through PgBouncer), everything is OK.
Can anyone please tell me how to download or read files from Google Cloud Storage without login/authentication?
When I tried the link below, I got this response:
https://www.googleapis.com/storage/v1/b/igvideos/o/test1235.txt
{
"error": {
"errors": [
{
"domain": "global",
"reason": "required",
"message": "Login Required",
"locationType": "header",
"location": "Authorization"
}
],
"code": 401,
"message": "Login Required"
}
}
Thanks
Thanigaivelan
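For reference, an unauthenticated download only works if the object has been made publicly readable; otherwise the request keeps failing with an authorization error. Assuming that has been done for the object above, a minimal Python sketch of an anonymous download could look like this. The function names are made up for illustration; the URL form https://storage.googleapis.com/BUCKET/OBJECT is the public endpoint for object data.

```python
import urllib.parse
import urllib.request


def public_object_url(bucket: str, object_name: str) -> str:
    """Build the public download URL for an object, percent-encoding its name."""
    encoded = urllib.parse.quote(object_name, safe="/")
    return f"https://storage.googleapis.com/{bucket}/{encoded}"


def download_public_object(bucket: str, object_name: str, timeout: float = 10.0) -> bytes:
    """Fetch the object's bytes without credentials.

    Raises urllib.error.HTTPError if the object is not publicly readable.
    """
    url = public_object_url(bucket, object_name)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()
```

For the file in the question, public_object_url("igvideos", "test1235.txt") gives the URL to open in a browser or pass to download_public_object.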