Redis standalone stops working (timeouts) on master-replica sync - Kubernetes

We use a Redis cache in our Kubernetes cluster which randomly stops working. It's a standalone deployment based on this image: bitnami/redis:6.0.15
As custom parameters we use:
MASTER true
REDIS_AOF_ENABLED no
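For context, a rough sketch of how these could be set as container environment variables and verified on the running pod (the StatefulSet name redis is an assumption based on the pod name redis-0 in the logs below):
```
# Hypothetical example: set the two parameters on the workload's pod template
kubectl set env statefulset/redis MASTER=true REDIS_AOF_ENABLED=no

# Verify what the running container actually sees
kubectl exec redis-0 -- env | grep -E 'MASTER|REDIS_AOF_ENABLED'
```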
Every time Redis stops working I see the following logs:
Jul 5 13:30:27 redis-0 redis 1:M 05 Jul 2022 11:30:27.060 * 10000 changes in 60 seconds. Saving...
Jul 5 13:30:27 redis-0 redis 1:M 05 Jul 2022 11:30:27.090 * Background saving started by pid 364
Jul 5 13:31:34 redis-0 redis 364:C 05 Jul 2022 11:31:34.307 * DB saved on disk
Jul 5 13:31:34 redis-0 redis 364:C 05 Jul 2022 11:31:34.341 * RDB: 431 MB of memory used by copy-on-write
Jul 5 13:31:34 redis-0 redis 1:M 05 Jul 2022 11:31:34.488 * Background saving terminated with success
Jul 5 13:32:35 redis-0 redis 1:M 05 Jul 2022 11:32:35.022 * 10000 changes in 60 seconds. Saving...
Jul 5 13:32:35 redis-0 redis 1:M 05 Jul 2022 11:32:35.052 * Background saving started by pid 365
-----
Jul 5 13:32:40 redis-0 redis 1:S 05 Jul 2022 11:32:40.436 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
Jul 5 13:32:40 redis-0 redis 1:S 05 Jul 2022 11:32:40.436 * REPLICAOF 178.20.40.200:8886 enabled (user request from 'id=71457 addr=10.0.16.46:14072 fd=12 name= age=0 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=47 qbuf-free=32721 argv-mem=24 obl=0 oll=0 omem=0 tot-mem=61488 events=r cmd=slaveof user=default')
Jul 5 13:32:41 redis-0 redis 1:S 05 Jul 2022 11:32:41.316 * Connecting to MASTER 178.20.40.200:8886
Jul 5 13:32:41 redis-0 redis 1:S 05 Jul 2022 11:32:41.316 * MASTER <-> REPLICA sync started
Jul 5 13:32:41 redis-0 redis 1:S 05 Jul 2022 11:32:41.362 * Non blocking connect for SYNC fired the event.
Jul 5 13:32:41 redis-0 redis Error 1:S 05 Jul 2022 11:32:41.409 # Error reply to PING from master: '-Reading from master: Connection reset by peer'
Jul 5 13:32:42 redis-0 redis 1:S 05 Jul 2022 11:32:42.316 * Connecting to MASTER 178.20.40.200:8886
Jul 5 13:32:42 redis-0 redis 1:S 05 Jul 2022 11:32:42.317 * MASTER <-> REPLICA sync started
Jul 5 13:32:42 redis-0 redis 1:S 05 Jul 2022 11:32:42.366 * Non blocking connect for SYNC fired the event.
Jul 5 13:32:42 redis-0 redis Error 1:S 05 Jul 2022 11:32:42.415 # Error reply to PING from master: '-Reading from master: Connection reset by peer'
Jul 5 13:32:43 redis-0 redis 1:S 05 Jul 2022 11:32:43.317 * Connecting to MASTER 178.20.40.200:8886
Jul 5 13:32:43 redis-0 redis 1:S 05 Jul 2022 11:32:43.317 * MASTER <-> REPLICA sync started
Jul 5 13:32:43 redis-0 redis 1:S 05 Jul 2022 11:32:43.366 * Non blocking connect for SYNC fired the event.
Jul 5 13:32:43 redis-0 redis Error 1:S 05 Jul 2022 11:32:43.416 # Error reply to PING from master: '-Reading from master: Connection reset by peer'
Jul 5 13:32:44 redis-0 redis 1:S 05 Jul 2022 11:32:44.320 * Connecting to MASTER 178.20.40.200:8886
Jul 5 13:32:44 redis-0 redis 1:S 05 Jul 2022 11:32:44.320 * MASTER <-> REPLICA sync started
Jul 5 13:32:44 redis-0 redis 1:S 05 Jul 2022 11:32:44.370 * Non blocking connect for SYNC fired the event.
Then I see the queue on the client increase, but I need to kill the pod to restart Redis; otherwise it never recovers.
```
next: GET 6126674261995698486,
inst: 1,
qu: 0, // queue => waiting operations
qs: 17,
aw: False,
rs: ReadAsync,
ws: Idle,
in: 0, // bytes waiting from input stream
in-pipe: 0,
out-pipe: 0,
serverEndpoint: redis.default.svc.cluster.local:6379,
mc: 1/1/0,
mgr: 10 of 10 available, // thread pool
clientName: production-9bbd94544-nlmv7,
IOCP: (Busy=0,Free=1000,Min=5,Max=1000), // no busy threads
WORKER: (Busy=14,Free=32753,Min=256,Max=32767),
v: 2.2.4.27433
```

```
Timeout performing GET (3000ms),
next: 2865582319381864083,
inst: 0,
qu: 0,
qs: 333,
aw: False,
rs: ReadAsync,
ws: Idle,
in: 0,
in-pipe: 0,
out-pipe: 0,
serverEndpoint: redis.default.svc.cluster.local:6379,
mc: 1/1/0,
mgr: 10 of 10 available,
clientName: production-58c7874fd8-tdcpz,
IOCP: (Busy=0,Free=1000,Min=1,Max=1000),
WORKER: (Busy=3,Free=32764,Min=256,Max=32767),
v: 2.2.4.27433
next: GET 6126674261995698486,
inst: 47,
qu: 0,
qs: 21368,
aw: False,
rs: ReadAsync,
ws: Idle,
in: 0,
in-pipe: 0,
out-pipe: 0,
serverEndpoint: redis.default.svc.cluster.local:6379,
mc: 1/1/0,
mgr: 10 of 10 available,
clientName: production-9bbd94544-nlmv7,
IOCP: (Busy=0,Free=1000,Min=5,Max=1000),
WORKER: (Busy=162,Free=32605,Min=256,Max=32767),
v: 2.2.4.27433
```
Does anyone have an idea?
Thank you.

Can you check if the redis service is running? `kubectl get service/redis`. It seems like the service is unable to receive traffic, which is possible if there are no pods to receive it.
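Something like the following could confirm that (a rough sketch; the Service name redis and the label app=redis are assumptions, and it assumes no Redis AUTH is configured):
```
# Does the Service exist and does it have ready endpoints behind it?
kubectl get service/redis
kubectl get endpoints/redis

# Is the pod still up, and does Redis still think it is a master?
kubectl get pods -l app=redis
kubectl exec redis-0 -- redis-cli INFO replication
```
If INFO replication reports role:slave, that matches the REPLICAOF log lines above.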

Related

HeadlessChrome 84.0.4147 (Linux 0.0.0) ERROR

I'm trying to run tests using Karma + HeadlessChrome from an amzn2 AMI (I have Jenkins installed there).
The tests run successfully from my local machine; however, I get stuck when trying to execute them in Jenkins.
I changed the log level of Karma to debug and this is the relevant output:
14 07 2020 06:51:39.801:INFO [karma-server]: Karma v4.1.0 server started at http://0.0.0.0:9876/
14 07 2020 06:51:39.804:INFO [launcher]: Launching browsers ChromeHeadlessNoSandbox with concurrency unlimited
14 07 2020 06:51:39.815:INFO [launcher]: Starting browser ChromeHeadless
14 07 2020 06:51:39.815:DEBUG [launcher]: null -> BEING_CAPTURED
14 07 2020 06:51:39.816:DEBUG [temp-dir]: Creating temp dir at /tmp/karma-99510655
[...]
14 07 2020 06:51:56.165:DEBUG [karma-server]: A browser has connected on socket 1HnLJZHDqfyg-11TAAAA
14 07 2020 06:51:56.187:DEBUG [HeadlessChrome 84.0.4147 (Linux 0.0.0)]: undefined -> CONNECTED
14 07 2020 06:51:56.187:INFO [HeadlessChrome 84.0.4147 (Linux 0.0.0)]: Connected on socket 1HnLJZHDqfyg-11TAAAA with id 99510655
14 07 2020 06:51:56.189:DEBUG [launcher]: BEING_CAPTURED -> CAPTURED
14 07 2020 06:51:56.189:DEBUG [launcher]: ChromeHeadless (id 99510655) captured in 16.382 secs
14 07 2020 06:51:56.191:DEBUG [HeadlessChrome 84.0.4147 (Linux 0.0.0)]: CONNECTED -> CONFIGURING
[...]
14 07 2020 06:51:57.356:DEBUG [middleware:source-files]: Fetching /_karma_webpack_/vendor.js
14 07 2020 06:51:58.119:DEBUG [middleware:source-files]: Requesting /_karma_webpack_/vendor.js
14 07 2020 06:51:58.128:DEBUG [middleware:source-files]: Fetching /_karma_webpack_/vendor.js
14 07 2020 06:51:59.599:DEBUG [HeadlessChrome 84.0.4147 (Linux 0.0.0)]: Disconnected during run, waiting 2000ms for reconnecting.
14 07 2020 06:51:59.605:DEBUG [HeadlessChrome 84.0.4147 (Linux 0.0.0)]: CONFIGURING -> EXECUTING_DISCONNECTED
14 07 2020 06:52:01.610:WARN [HeadlessChrome 84.0.4147 (Linux 0.0.0)]: Disconnected (0 times)reconnect failed before timeout of 2000ms (transport close)
14 07 2020 06:52:01.610:DEBUG [HeadlessChrome 84.0.4147 (Linux 0.0.0)]: EXECUTING_DISCONNECTED -> DISCONNECTED
HeadlessChrome 84.0.4147 (Linux 0.0.0) ERROR
Disconnectedreconnect failed before timeout of 2000ms (transport close)
HeadlessChrome 84.0.4147 (Linux 0.0.0) ERROR
Disconnectedreconnect failed before timeout of 2000ms (transport close)
14 07 2020 06:52:01.683:DEBUG [launcher]: CAPTURED -> BEING_KILLED
14 07 2020 06:52:01.684:DEBUG [launcher]: BEING_KILLED -> BEING_FORCE_KILLED
14 07 2020 06:52:01.709:DEBUG [karma-server]: Run complete, exiting.
14 07 2020 06:52:01.710:DEBUG [launcher]: Disconnecting all browsers
14 07 2020 06:52:01.710:DEBUG [launcher]: BEING_FORCE_KILLED -> BEING_FORCE_KILLED
14 07 2020 06:52:01.736:DEBUG [launcher]: Process ChromeHeadless exited with code null and signal SIGTERM
Following https://github.com/karma-runner/karma-chrome-launcher/issues/137 I'm using Puppeteer:
I've added `process.env.CHROME_BIN = require('puppeteer').executablePath()` in `karma.conf.js` and:
logLevel: config.LOG_DEBUG,
autoWatch: true,
browsers: ['ChromeHeadlessNoSandbox'],
customLaunchers: {
  ChromeHeadlessNoSandbox: {
    base: 'ChromeHeadless',
    flags: ['--no-sandbox']
  }
},
I don't know what else to try. Any ideas?

Master turning into slave after redis sentinel failover

I am trying out Redis master-slave replication using sentinels.
I have 1 master, 2 slaves, and 3 sentinels, all running as separate pods.
My issue is:
1) When I delete the master pod, one of the slaves turns into the master.
2) Ideally, there should now be a new master with only one slave. For some reason, the IP of the master I deleted turns into a slave of the newly elected master.
3) Is this desirable behaviour? The sentinel shows two slaves attached to the newly elected master, but in fact only one slave pod exists, because the master pod was deleted.
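For reference, the sentinels' own view of the topology can be checked directly (a rough sketch; the sentinel port 26379 and the master group name mymaster are the defaults and may differ, and the host is a placeholder):
```
# Which address does this sentinel currently consider the master?
redis-cli -h <sentinel-pod-ip> -p 26379 SENTINEL get-master-addr-by-name mymaster

# Which slaves does it still know about? Deleted pods can linger here
# until they are cleared with SENTINEL RESET mymaster.
redis-cli -h <sentinel-pod-ip> -p 26379 SENTINEL slaves mymaster
```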
Below are the logs:
:M 29 May 2020 07:32:19.569 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
8:M 29 May 2020 07:32:19.569 # Server initialized
8:M 29 May 2020 07:32:19.569 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
8:M 29 May 2020 07:32:19.569 * Ready to accept connections
8:M 29 May 2020 07:33:22.329 * Replica 172.16.2.12:6379 asks for synchronization
8:M 29 May 2020 07:33:22.329 * Full resync requested by replica 172.16.2.12:6379
8:M 29 May 2020 07:33:22.329 * Starting BGSAVE for SYNC with target: disk
8:M 29 May 2020 07:33:22.330 * Background saving started by pid 12
12:C 29 May 2020 07:33:22.333 * DB saved on disk
12:C 29 May 2020 07:33:22.334 * RDB: 2 MB of memory used by copy-on-write
8:M 29 May 2020 07:33:22.355 * Background saving terminated with success
8:M 29 May 2020 07:33:22.356 * Synchronization with replica 172.16.2.12:6379 succeeded
8:M 29 May 2020 07:33:23.092 * Replica 172.16.4.48:6379 asks for synchronization
8:M 29 May 2020 07:33:23.092 * Full resync requested by replica 172.16.4.48:6379
8:M 29 May 2020 07:33:23.092 * Starting BGSAVE for SYNC with target: disk
8:M 29 May 2020 07:33:23.092 * Background saving started by pid 13
13:C 29 May 2020 07:33:23.097 * DB saved on disk
13:C 29 May 2020 07:33:23.097 * RDB: 2 MB of memory used by copy-on-write
8:M 29 May 2020 07:33:23.158 * Background saving terminated with success
8:M 29 May 2020 07:33:23.158 * Synchronization with replica 172.16.4.48:6379 succeeded
8:M 29 May 2020 07:36:26.866 # Connection with replica 172.16.2.12:6379 lost.
8:M 29 May 2020 07:36:27.871 # Connection with replica 172.16.4.48:6379 lost.
8:S 29 May 2020 07:36:37.926 * Before turning into a replica, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
8:S 29 May 2020 07:36:37.927 * REPLICAOF 172.16.2.12:6379 enabled (user request from 'id=21 addr=172.16.3.135:56721 fd=9 name=sentinel-5261eb21-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=151 qbuf-free=32617 obl=36 oll=0 omem=0 events=r cmd=exec')
8:S 29 May 2020 07:36:37.933 # CONFIG REWRITE executed with success.
8:S 29 May 2020 07:36:38.284 * Connecting to MASTER 172.16.2.12:6379
8:S 29 May 2020 07:36:38.284 * MASTER <-> REPLICA sync started
8:S 29 May 2020 07:36:38.284 * Non blocking connect for SYNC fired the event.
8:S 29 May 2020 07:36:38.285 * Master replied to PING, replication can continue...
8:S 29 May 2020 07:36:38.285 * Trying a partial resynchronization (request 563ca4b5f67f1e24c129729eaa74800b108902a3:52568).
8:S 29 May 2020 07:36:38.321 * Full resync from master: f21b8c35187b109b621605b375ef62e61b301834:52901
8:S 29 May 2020 07:36:38.321 * Discarding previously cached master state.
8:S 29 May 2020 07:36:38.356 * MASTER <-> REPLICA sync: receiving 178 bytes from master
8:S 29 May 2020 07:36:38.356 * MASTER <-> REPLICA sync: Flushing old data
8:S 29 May 2020 07:36:38.356 * MASTER <-> REPLICA sync: Loading DB in memory
8:S 29 May 2020 07:36:38.356 * MASTER <-> REPLICA sync: Finished with success
I am using Redis 5.0. Earlier I was using Redis 4.0 and did not face this issue.

Plex Media Server on Raspberry Pi Zero W not working

I'm trying to install Plex Media Server on my Raspberry Pi Zero W and I keep getting this error:
plexmediaserver.service - Plex Media Server for Linux
Loaded: loaded (/lib/systemd/system/plexmediaserver.service; enabled;
vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2018-09-06 18:22:09 UTC; 12s ago
Process: 1043 ExecStart=/bin/sh -c LD_LIBRARY_PATH=/usr/lib/plexmediaserver "/usr/lib/plexmediaserv
Process: 1039 ExecStartPre=/bin/sh -c /usr/bin/test -d "${PLEX_MEDIA_SERVER_APPLICATION_SUPPORT_DIR
Main PID: 1043 (code=exited, status=132)
Sep 06 18:22:04 raspberrypi systemd[1]: plexmediaserver.service: Unit entered failed state.
Sep 06 18:22:04 raspberrypi systemd[1]: plexmediaserver.service: Failed with result 'exit-code'.
Sep 06 18:22:09 raspberrypi systemd[1]: plexmediaserver.service: Service hold-off time over, scheduli
Sep 06 18:22:09 raspberrypi systemd[1]: Stopped Plex Media Server for Linux.
Sep 06 18:22:09 raspberrypi systemd[1]: plexmediaserver.service: Start request repeated too quickly.
Sep 06 18:22:09 raspberrypi systemd[1]: Failed to start Plex Media Server for Linux.
Sep 06 18:22:09 raspberrypi systemd[1]: plexmediaserver.service: Unit entered failed state.
Sep 06 18:22:09 raspberrypi systemd[1]: plexmediaserver.service: Failed with result 'exit-code'.
If anyone can help I'd really appreciate it, thanks!
(BTW, I'm quite a noob when it comes to Raspberry Pis.)
As far as I know it is not supported anymore: due to many API changes, Plex discontinued support for ARMv6 (the Pi Zero W is ARMv6) and now only "supports" ARMv7 and ARMv8 (64-bit).
See also: https://forums.plex.tv/t/pi-zero-w-problems/191899/7
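If in doubt, the CPU architecture can be checked directly on the Pi, for example:
```
uname -m
# prints armv6l on a Pi Zero W; current Plex builds target armv7 / arm64
```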

Zookeeper / Exhibitor recurring JMX Error

I don't know why this is occurring, but occasionally I will get this series of repeating errors and the zookeeper instances will go into a bad state.
Tue Feb 16 07:05:04 EST 2016 ERROR ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Tue Feb 16 07:05:04 EST 2016 ERROR ZooKeeper Server: JMX enabled by default
Tue Feb 16 07:05:04 EST 2016 INFO Process started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh
Tue Feb 16 07:05:03 EST 2016 INFO Kill attempted result: 0
Tue Feb 16 07:05:03 EST 2016 INFO Attempting to start/restart ZooKeeper
Tue Feb 16 07:05:03 EST 2016 INFO Attempting to stop instance
Tue Feb 16 07:05:03 EST 2016 INFO Restarting down/not-serving ZooKeeper after 60037 ms pause
Tue Feb 16 07:04:33 EST 2016 INFO ZooKeeper down/not-serving waiting 30026 of 40000 ms before restarting
Tue Feb 16 07:04:05 EST 2016 INFO ZooKeeper Server: Starting zookeeper ... STARTED
Tue Feb 16 07:04:04 EST 2016 ERROR ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Tue Feb 16 07:04:04 EST 2016 ERROR ZooKeeper Server: JMX enabled by default
The Exhibitor setup uses shared storage on a NAS. The servers are CentOS 6.6. It is a three-node ensemble, and the one noticeable problem I have seen is that the "ensemble" connection string in the Exhibitor GUI suddenly becomes different between the three nodes (one node may "forget" about some of the other nodes in the ensemble).
I don't even know where to look to dig into the causes. Any help or direction would be greatly appreciated. It's truly odd...
Update: versions
zk: 3.4.6
Exhibitor: 1.5.5
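One way to start digging (a rough sketch, assuming the default ZooKeeper client port 2181; the hostnames are placeholders) is to ask each node directly whether it is serving and what mode it is in, and compare that with what Exhibitor shows:
```
# Run against each of the three ensemble nodes
echo ruok | nc zk-node-1 2181   # should reply "imok"
echo stat | nc zk-node-1 2181   # shows Mode: leader/follower and connected clients
echo srvr | nc zk-node-1 2181   # server version, zxid, and node count
```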

Cannot get MongoDB connected with scrapy-linkedin

Mon Oct 31 18:40:34 [initandlisten] fd limit hard:1024 soft:1024 max conn: 819
Mon Oct 31 18:40:34 [initandlisten] waiting for connections on port 27017
Mon Oct 31 18:40:34 BackgroundJob starting: snapshot
Mon Oct 31 18:40:34 BackgroundJob starting: ClientCursorMonitor
Mon Oct 31 18:40:34 BackgroundJob starting: PeriodicTask::Runner
Mon Oct 31 18:40:34 [websvr] fd limit hard:1024 soft:1024 max conn: 819
Mon Oct 31 18:40:34 [websvr] admin web console waiting for connections on port 28017
The command prompt appears to hang for a long time when I start mongod.exe.
I have already created the data\db directory.
http://docs.mongodb.org/manual/tutorial/install-mongodb-on-windows/#mongodb-as-a-windows-service
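The page linked above boils down to roughly the following (a sketch; the install path C:\mongodb and data directory C:\data\db are assumptions, and the commands need an elevated command prompt):
```
rem mongod --install requires a log file, so create a log directory first
mkdir C:\data\log

rem Register mongod as a Windows service named "MongoDB", then start it
C:\mongodb\bin\mongod.exe --dbpath "C:\data\db" --logpath "C:\data\log\mongod.log" --install
net start MongoDB
```
Once it runs as a service, the console no longer appears to "hang"; a foreground mongod sitting at "waiting for connections" is actually running normally.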