EDIT
It seems that the second server DOES occasionally get this error, which makes me nearly certain it's a config problem. Could it be one of:
net.ipv4.tcp_fin_timeout = 2
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
Version information as requested:
Meteor: 1.5.0
OS: Ubuntu 16.04
Provider: AWS EC2
I'm getting the following error, intermittently and seemingly randomly, on both processes running on one server (of a pair). The other server never gets this error, and the error doesn't refer to any code I've written, so I can only assume it's (a) a bug in Meteor or (b) a bug in my server config. The server whose processes are crashing is also hosting two other Meteor sites, both of which occasionally get this error:
Error: write after end
at writeAfterEnd (_stream_writable.js:167:12)
at PassThrough.Writable.write (_stream_writable.js:212:5)
at IncomingMessage.ondata (_stream_readable.js:542:20)
at emitOne (events.js:77:13)
at IncomingMessage.emit (events.js:169:7)
at IncomingMessage.Readable.read (_stream_readable.js:368:10)
at flow (_stream_readable.js:759:26)
at resume_ (_stream_readable.js:739:3)
at nextTickCallbackWith2Args (node.js:511:9)
at process._tickDomainCallback (node.js:466:17)
Things I've already checked:
memory limits (nowhere near the limit)
connection limits - very small, around 20 per server at the time of failure, and the processes were bumped to the second server within 1 minute, which handled them plus its own just fine
process limits - both processes on server 1 failed within 7 minutes of each other.
server config - while I was trying to eke out a little extra performance during load testing, I modified sysctl.conf based on a post I saw for high-load Node.js servers. This is the contents of the faulty server's sysctl.conf; however, the functioning server has an identical config.
fs.file-max = 1000000
fs.nr_open = 1000000
ifs.file-max = 70000
net.nf_conntrack_max = 1048576
net.ipv4.netfilter.ip_conntrack_max = 32768
net.ipv4.tcp_fin_timeout = 2
net.ipv4.tcp_max_orphans = 8192
net.ipv4.ip_local_port_range = 16768 61000
net.ipv4.tcp_max_syn_backlog = 10024
net.ipv4.tcp_max_tw_buckets = 360000
net.core.netdev_max_backlog = 2500
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.core.somaxconn = 20048
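To rule out drift between the file and the running kernel, the live values can be compared directly on both servers with the stock sysctl tool, querying just the three suspect keys from the edit above:
sysctl net.ipv4.tcp_fin_timeout net.ipv4.tcp_tw_recycle net.ipv4.tcp_tw_reuse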
I have an NGINX balancer on server 1 which load balances across the 4 processes (2 per server). The NGINX error log is littered with lines like the following:
2017/08/17 16:15:01 [warn] 1221#1221: *6233472 an upstream response is buffered to a temporary file /var/lib/nginx/proxy/1/46/0000029461 while reading upstream, client: 164.68.80.47, server: server redacted, request: "GET path redacted HTTP/1.1", upstream: "path redacted", host: "host redacted", referrer: "referrer redacted"
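Those buffering warnings are probably just nginx spilling a large upstream response to disk; if they need tuning or silencing, the standard proxy buffer directives control it, e.g. in the relevant location block (illustrative values, not a recommendation):
location / {
    proxy_buffers 16 16k;           # per-connection in-memory buffers
    proxy_buffer_size 16k;          # buffer used for the response headers
    proxy_max_temp_file_size 0;     # 0 = never spill a response to a temp file
}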
At the time of the error, I see a pair of lines like this:
2017/08/17 15:07:19 [error] 1222#1222: *6215301 connect() failed (111: Connection refused) while connecting to upstream, client: ip redacted, server: server redacted, request: "GET /admin/sockjs/info?cb=o2ziavvsua HTTP/1.1", upstream: "http://127.0.0.1:8080/admin/sockjs/info?cb=o2ziavvsua", host: "hostname redacted", referrer: "referrer redacted"
2017/08/17 15:07:19 [warn] 1222#1222: *6215301 upstream server temporarily disabled while connecting to upstream, client: ip redacted, server: server redacted, request: "GET /admin/sockjs/info?cb=o2ziavvsua HTTP/1.1", upstream: "http://127.0.0.1:8080/admin/sockjs/info?cb=o2ziavvsua", host: "hostname redacted", referrer: "referrer redacted"
If it matters at all, I'm using a 3-node MongoDB replica set, with both servers pointing at all 3 nodes.
I'm also using a custom-hosted version of Kadira (since the official service went offline).
If there is no way to stop the errors, is there any way to stop them from taking down the entire process? There are times when 50-100 users are connected per process, and booting them all because of one error seems excessive.
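For context, the generic Node options I'm aware of are a supervisor that restarts the process (forever, pm2, systemd) or a last-resort uncaughtException handler, something like:
// last-resort safety net: the process may be in an undefined state after this fires,
// so log, clean up, and ideally exit so a supervisor restarts the app
process.on('uncaughtException', function (err) {
  console.error('Uncaught exception:', err.stack || err);
  // e.g. flush logs / notify monitoring here, then process.exit(1)
});
but neither prevents the error itself.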
It's been two days without a crash, so I think the solution was changing:
net.ipv4.tcp_fin_timeout = 2
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
to
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 0
I don't know which of those was causing the problem (probably the timeout). I still think it's a "bug" that a single "write after end" error crashes the entire Meteor process. Perhaps this should simply be logged.
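For anyone making the same change, applying it is standard sysctl usage (Ubuntu 16.04 paths):
# edit /etc/sysctl.conf (or a file under /etc/sysctl.d/), then reload it:
sudo sysctl -p
# or set the values on the running kernel directly:
sudo sysctl -w net.ipv4.tcp_fin_timeout=15 net.ipv4.tcp_tw_recycle=0 net.ipv4.tcp_tw_reuse=0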
Given:
client <-- HTTP --> spray web service <-- HTTP --> other web service
The client receives the following HTTP status code and entity body when hitting the spray web service:
504 Gateway Timeout - Empty
Per this answer, my understanding is that the spray web service is not receiving a timely response from the other web service, and thus sending an HTTP 504 to the client.
Assuming that's reasonable, per https://github.com/spray/spray/blob/master/spray-http/src/main/scala/spray/http/StatusCode.scala#L158, I'm guessing that one of these server config values is responsible for the 504 in the spray web service's HTTP response to the client.
What config or implicit would cause the spray web service to reply with a 504 to the client?
I think you are using the default Spray timeouts, and you will probably need to increase them. If that's the case, there are two values you have to configure:
idle-timeout: how long a connection to your server may sit idle before it is closed (default 60 s).
request-timeout: how long the server may take to answer a request (here, essentially how long the call to the other web service may take) before it times out (default 20 s).
The first value must always be higher than the second; otherwise the idle-timeout would close the connection before the request-timeout ever fires.
So just override the configuration in your application.conf like this:
spray.can {
  server {
    idle-timeout = 120 s
    request-timeout = 20 s
  }
}
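One more thing worth checking, in case the 504 is produced while the spray service is itself calling the other web service through spray's client-side infrastructure: the outgoing side has its own timeouts under spray.can.client, separate from spray.can.server. A sketch, with key names as in spray-can's reference.conf and purely illustrative values:
spray.can {
  client {
    request-timeout = 60 s   # how long an outgoing request may take
    idle-timeout = 90 s      # idle time before an outgoing connection is closed
  }
}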
I had a responsive Akka Http app, made some changes, and then got these messages:
14:55:56.128 INFO bootstrap.akka.HttpActor - Accepted new connection from /127.0.0.1:55192
14:56:23.938 INFO bootstrap.akka.HttpActor - Accepted new connection from /127.0.0.1:55193
14:56:26.684 ERROR akka.actor.ActorSystemImpl - Internal server error, sending 500 response
akka.stream.impl.io.TcpStreamActor$$anonfun$handleSubscriptionTimeout$1$$anon$1: Publisher was not attached to upstream within deadline (5000) ms
14:56:26.684 ERROR akka.actor.ActorSystemImpl - Internal server error, sending 500 response
akka.stream.impl.io.TcpStreamActor$$anonfun$handleSubscriptionTimeout$1$$anon$1: Publisher was not attached to upstream within deadline (5000) ms
I cannot figure out what they mean. They are not mentioned in the docs, and Google turns up nothing. Could anyone share some insight?
akka.version=2.3.12
akka.stream.version=1.0
scala.version=2.11.6
jdk.version=1.8.0_60
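My only lead so far: the 5000 ms deadline looks like it could be the stream subscription timeout (a materialized Publisher that nothing subscribed to within 5 seconds). Assuming that setting is exposed in this version's reference.conf (I have not been able to confirm the exact path for akka-stream 1.0), it would be tuned roughly like:
akka.stream.materializer {
  subscription-timeout {
    mode = cancel   # cancel | warn | noop
    timeout = 5s
  }
}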
I am doing a load test of a web server. Currently I am using Tomcat 6 to test my code. While the test is running, the server resets the connection a few minutes after it starts receiving continuous GET requests for the same page. If I send GET requests with some gap between them (say 500 ms), it works fine. If I send GET requests 10 ms or less apart, the server resets the connection a few seconds after the start of the test. Please help me figure out how to fix this problem. What is the reason for the reset? Is the server overloaded, or do I have to perform some operation while establishing the connection?
My GET request format is:
GET /index.html HTTP/1.1
Host: 180.168.40.40
Connection: keep-alive
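For reference, the knobs on the stock Tomcat 6 HTTP connector that govern this behaviour live in server.xml (attribute names from the connector docs; values below are illustrative, not recommendations). In particular, maxKeepAliveRequests closes a keep-alive connection after that many requests (100 by default), which a client firing requests every 10 ms can perceive as a dropped or reset connection, and acceptCount/maxThreads bound how many connections can queue before new ones are refused:
<!-- HTTP connector in conf/server.xml; illustrative values -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxThreads="200"
           acceptCount="100"
           maxKeepAliveRequests="-1" />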
I want to download the result of an Express.js REST API which is very slow to respond (~10 minutes). I tried a few timeout options with wget, but it gives up after a few minutes even though I ask it to wait for roughly ~60,000 years.
wget "http://localhost:5000/slowstuff" --http-user=user --http-password=password --read-timeout=1808080878708 --tries=1
--2015-02-26 11:14:21-- http://localhost:5000/slowstuff
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:5000... connected.
HTTP request sent, awaiting response... 401 Unauthorized
Authentication selected: Basic realm="Authorization Required"
Reusing existing connection to [localhost]:5000.
HTTP request sent, awaiting response... No data received.
Giving up.
EDIT:
The problem doesn't come from the wget timeout value. With a timeout set to 4 seconds, the error is different: Read error (Connection timed out) in headers. And I have exactly the same problem with curl.
I think the problem comes from my API. It looks like a timeout of 2 minutes is set by default in NodeJS.
Now, I need to find how to change this value.
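For the record, 2 minutes matches the default socket timeout of Node's http.Server, and assuming a standard Express setup it can be raised on the server returned by app.listen(). A sketch, where /slowstuff is just the route from the example above:
var express = require('express');
var app = express();

app.get('/slowstuff', function (req, res) {
  // ... the ~10 minutes of processing, then res.send(...)
});

var server = app.listen(5000);

// http.Server drops sockets idle for more than 120000 ms by default;
// raise the limit (or pass 0 to disable the timeout entirely)
server.setTimeout(15 * 60 * 1000);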
This
--http-password=password--read-timeout=1808080878708
is missing a blank. Use
--http-password=password --read-timeout=1808080878708