red 5 - connecting via rtmpt issues - streaming

I have a Red5 server up and running successfully using regular RTMP.
I have also made the necessary changes as per this link.
Watching the logs of a typical working connection to Red5 via RTMP, I see the following:
==> /var/log/red5/error.log <==
2011-12-12 10:48:41,261 [http-8088-exec-2] ERROR o.r.server.net.rtmp.RTMPHandshake - Unable to validate client
==> /var/log/red5/red5.log <==
2011-12-12 10:48:41,261 [http-8088-exec-2] ERROR o.r.server.net.rtmp.RTMPHandshake - Unable to validate client
2011-12-12 10:48:41,484 [http-8088-exec-3] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action connect
2011-12-12 10:48:41,492 [http-8088-exec-3] INFO o.red5.server.net.rtmp.RTMPHandler - Connecting to: [WebScope#df8b14 Depth = 1, Path = '/default', Name = 'splitstream']
2011-12-12 10:48:41,731 [http-8088-exec-4] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action authorize1
2011-12-12 10:48:41,971 [http-8088-exec-5] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action authorize2
2011-12-12 10:48:42,200 [http-8088-exec-1] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action releaseStream
2011-12-12 10:48:42,200 [http-8088-exec-1] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action FCPublish
2011-12-12 10:48:42,202 [http-8088-exec-1] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action createStream
2011-12-12 10:48:42,432 [http-8088-exec-2] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action publish
2011-12-12 10:48:42,440 [http-8088-exec-2] INFO o.r.s.stream.ClientBroadcastStream - Provider connect
2011-12-12 10:48:42,441 [http-8088-exec-2] INFO o.r.s.stream.ClientBroadcastStream - Stream start
2011-12-12 10:48:42,442 [http-8088-exec-2] INFO o.r.s.stream.ClientBroadcastStream - Provider connect
2011-12-12 10:48:43,118 [http-8088-exec-5] INFO o.r.s.stream.codec.ScreenVideo2 - Allocating memory for 510 compressed blocks
When I switch to RTMPT, I no longer see the "Provider connect" and "Stream start" messages, and of course my stream never starts.
==> /var/log/red5/error.log <==
2011-12-12 10:57:52,177 [http-8088-exec-2] ERROR o.r.server.net.rtmp.RTMPHandshake - Unable to validate client
==> /var/log/red5/red5.log <==
2011-12-12 10:57:52,177 [http-8088-exec-2] ERROR o.r.server.net.rtmp.RTMPHandshake - Unable to validate client
2011-12-12 10:57:52,405 [http-8088-exec-3] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action connect
2011-12-12 10:57:52,411 [http-8088-exec-3] INFO o.red5.server.net.rtmp.RTMPHandler - Connecting to: [WebScope#db38a4 Depth = 1, Path = '/default', Name = 'splitstream']
2011-12-12 10:57:52,613 [http-8088-exec-4] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action authorize1
2011-12-12 10:57:52,847 [http-8088-exec-5] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action authorize2
2011-12-12 10:57:53,079 [http-8088-exec-1] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action releaseStream
2011-12-12 10:57:53,079 [http-8088-exec-1] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action FCPublish
2011-12-12 10:57:53,079 [http-8088-exec-1] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action createStream
2011-12-12 10:57:53,316 [http-8088-exec-2] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action publish
I also notice this in my logs:
2011-12-13 04:54:00,980 [http-8088-exec-2] INFO o.red5.server.net.rtmp.RTMPHandler - Scope :80/splitstream not found on dev-100.host.com:80:80
==> /var/log/red5/error.log <==
2011-12-13 04:54:05,105 [Red5_Scheduler_Worker-1] WARN o.r.server.net.rtmp.RTMPConnection - Closing RTMPTConnection from 127.0.0.1 : 47814 to localhost.localdomain (in: 3626 out 3265 ), with id 6 due to long handshake
==> /var/log/red5/red5.log <==
2011-12-13 04:54:05,105 [Red5_Scheduler_Worker-1] WARN o.r.server.net.rtmp.RTMPConnection - Closing RTMPTConnection from 127.0.0.1 : 47814 to localhost.localdomain (in: 3626 out 3265 ), with id 6 due to long handshake
2011-12-13 04:54:06,402 [http-8088-exec-1] INFO o.r.s.n.r.codec.RTMPProtocolDecoder - Action connect
2011-12-13 04:54:06,403 [http-8088-exec-1] INFO o.red5.server.net.rtmp.RTMPHandler - Connecting to: [WebScope#19c6163 Depth = 1, Path = '/default', Name = 'splitstream']
Any ideas?

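Change the value of rtmpt.max_inactivity in the red5.properties file. Also change the value of maxHandshakeTimeout in the rtmpt section of the red5-core.xml file and retry. The "due to long handshake" warning in your log suggests the current handshake timeout is too short for your RTMPT connections.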

zookeeper - [NIOServerCnxn#383] - Exception causing close of session 0x0: Len error 1195725856

I am trying to install ZooKeeper on Windows. I get the error below no matter which suggestion I follow from "zookeeper + Kafka - Unable to create data directory".
I am running it as Administrator and I have tried all these options:
#dataDir=/tmp/zookeeper
#dataDir=:\zookeeper-3.4.14\
#dataDir=C:\\_d\\WSs\\kafka\\zookeeper-3.4.14\\data
#dataDir=:\\\\zookeeper\\\\data
dataDir=C:\\_d\\WSs\\kafka\\zookeeper-3.4.14
I don't think it is relevant, but let me add it here: I have Java 11.
Any idea why it is happening will be appreciated.
Full logs
C:\Windows\system32>zkserver
C:\Windows\system32>call "C:\Program Files\Java\jdk-11.0.2"\bin\java "-Dzookeeper.log.dir=C:\_d\WSs\kafka\zookeeper-3.4.14\bin\.." "-Dzookeeper.root.logger=INFO,CONSOLE" -cp "C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\build\classes;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\build\lib\*;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\*;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\lib\*;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\conf" org.apache.zookeeper.server.quorum.QuorumPeerMain "C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\conf\zoo.cfg"
2019-04-18 15:17:42,629 [myid:] - INFO [main:QuorumPeerConfig#136] - Reading configuration from: C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\conf\zoo.cfg
2019-04-18 15:17:42,644 [myid:] - INFO [main:DatadirCleanupManager#78] - autopurge.snapRetainCount set to 3
2019-04-18 15:17:42,644 [myid:] - INFO [main:DatadirCleanupManager#79] - autopurge.purgeInterval set to 0
2019-04-18 15:17:42,644 [myid:] - INFO [main:DatadirCleanupManager#101] - Purge task is not scheduled.
2019-04-18 15:17:42,644 [myid:] - WARN [main:QuorumPeerMain#116] - Either no config or no quorum defined in config, running in standalone mode
2019-04-18 15:17:42,769 [myid:] - INFO [main:QuorumPeerConfig#136] - Reading configuration from: C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\conf\zoo.cfg
2019-04-18 15:17:42,769 [myid:] - INFO [main:ZooKeeperServerMain#98] - Starting server
2019-04-18 15:17:47,344 [myid:] - INFO [main:Environment#100] - Server environment:zookeeper.version=3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf, built on 03/06/2019 16:18 GMT
2019-04-18 15:17:47,344 [myid:] - INFO [main:Environment#100] - Server environment:host.name=DESKTOP-AKCNE7F
2019-04-18 15:17:47,344 [myid:] - INFO [main:Environment#100] - Server environment:java.version=11.0.2
2019-04-18 15:17:47,344 [myid:] - INFO [main:Environment#100] - Server environment:java.vendor=Oracle Corporation
2019-04-18 15:17:47,344 [myid:] - INFO [main:Environment#100] - Server environment:java.home=C:\Program Files\Java\jdk-11.0.2
2019-04-18 15:17:47,344 [myid:] - INFO [main:Environment#100] - Server environment:java.class.path=C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\build\classes;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\build\lib\*;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\zookeeper-3.4.14.jar;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\lib\audience-annotations-0.5.0.jar;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\lib\jline-0.9.94.jar;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\lib\log4j-1.2.17.jar;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\lib\netty-3.10.6.Final.jar;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\lib\slf4j-api-1.7.25.jar;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\lib\slf4j-log4j12-1.7.25.jar;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\..\conf
2019-04-18 15:17:47,344 [myid:] - INFO [main:Environment#100] - Server environment:java.library.path=C:\Program Files\Java\jdk-11.0.2\bin;C:\Windows\Sun\Java\bin;C:\Windows\system32;C:\Windows;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\iCLS\;C:\Program Files\Intel\Intel(R) Management Engine Components\iCLS\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\OpenSSH\;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files\Java\jdk-11.0.2\bin;C:\Program Files\Git\cmd;C:\ProgramData\chocolatey\bin;C:\_d\tools\apache-maven-3.6.0\bin;C:\_d\WSs\kafka\zookeeper-3.4.14\bin\;C:\Users\jimis\AppData\Local\Programs\Python\Python37-32\Scripts\;C:\Users\jimis\AppData\Local\Programs\Python\Python37-32\;C:\Users\jimis\AppData\Local\Microsoft\WindowsApps;.
2019-04-18 15:17:47,360 [myid:] - INFO [main:Environment#100] - Server environment:java.io.tmpdir=C:\Users\jimis\AppData\Local\Temp\
2019-04-18 15:17:47,360 [myid:] - INFO [main:Environment#100] - Server environment:java.compiler=<NA>
2019-04-18 15:17:47,360 [myid:] - INFO [main:Environment#100] - Server environment:os.name=Windows 10
2019-04-18 15:17:47,360 [myid:] - INFO [main:Environment#100] - Server environment:os.arch=amd64
2019-04-18 15:17:47,376 [myid:] - INFO [main:Environment#100] - Server environment:os.version=10.0
2019-04-18 15:17:47,376 [myid:] - INFO [main:Environment#100] - Server environment:user.name=jimis
2019-04-18 15:17:47,376 [myid:] - INFO [main:Environment#100] - Server environment:user.home=C:\Users\jimis
2019-04-18 15:17:47,376 [myid:] - INFO [main:Environment#100] - Server environment:user.dir=C:\Windows\system32
2019-04-18 15:17:47,391 [myid:] - INFO [main:ZooKeeperServer#836] - tickTime set to 2000
2019-04-18 15:17:47,391 [myid:] - INFO [main:ZooKeeperServer#845] - minSessionTimeout set to -1
2019-04-18 15:17:47,391 [myid:] - INFO [main:ZooKeeperServer#854] - maxSessionTimeout set to -1
2019-04-18 15:17:47,782 [myid:] - INFO [main:ServerCnxnFactory#117] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
2019-04-18 15:17:47,797 [myid:] - INFO [main:NIOServerCnxnFactory#89] - binding to port 0.0.0.0/0.0.0.0:2181
2019-04-18 15:18:00,365 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#222] - Accepted socket connection from /127.0.0.1:54057
2019-04-18 15:18:00,375 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#222] - Accepted socket connection from /127.0.0.1:54058
2019-04-18 15:18:00,378 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#383] - Exception causing close of session 0x0: Len error 1195725856
2019-04-18 15:18:00,379 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1056] - Closed socket connection for client /127.0.0.1:54057 (no session established for client)
*** Edited
*** The answer to my question is: you can ignore the error I get when running curl against 127.0.0.1:port. Kafka is working anyway.
Are you trying to do a "HTTP GET" against the zookeeper client port?
The error comes from NIOServerCnxn.java:readLength, which expects either a 4-letter command or a buffer whose first 4 bytes represent its size.
The number 1195725856 in hex is 0x47455420, which is "GET " in ASCII.
So the error message appears when you do an "HTTP GET" against port 2181.
$ curl http://0.0.0.0:2181/
curl: (52) Empty reply from server
$ sudo tail /var/log/zookeeper/zookeeper.out
...
2019-04-19 12:56:25,303 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#215] - Accepted
2019-04-19 12:56:25,304 [myid:3] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#383] - Exception causing close of session 0x0: Len error 1195725856
2019-04-19 12:56:25,304 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1040] - Closed socket connection for client /127.0.0.1:33011 (no session established for client)
This WARN message is safe to ignore since ZooKeeper will just close that client session which is implied by the curl response.
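The hex reading of the "Len error" value described above can be checked with a small sketch. The function name decode_len_error is made up for illustration; it only reinterprets the reported length as the 4 big-endian bytes ZooKeeper read from the socket.

```python
def decode_len_error(value: int) -> bytes:
    """Turn the 'Len error' number back into the 4 bytes that were read as a length."""
    # ZooKeeper reads the first 4 bytes of a request as a big-endian integer.
    return value.to_bytes(4, byteorder="big")


LEN_ERROR = 1195725856  # value taken from the WARN line in the log

print(hex(LEN_ERROR))               # 0x47455420
print(decode_len_error(LEN_ERROR))  # b'GET '
```

Any HTTP client that sends "GET " as the first bytes of its request would produce the same number in this warning.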

JobManager doesn't automatically redirect all requests to the remaining / running TaskManager

Problem Description
I have 2 computers (203, 204).
I created a standalone-mode HA Flink v1.6.1 cluster.
Each computer runs both a JobManager and a TaskManager (2 task slots).
After I start a job (examples SocketWindowWordCount.jar ./flink run ../examples/streaming/SocketWindowWordCount.jar --hostname 10.1.2.9 --port 9000) on the JobManager node, I kill the working TaskManager instance.
In the Web Dashboard I can see the job being cancelled and then failed.
flink-conf.yaml
state.backend: filesystem
state.checkpoints.dir: hdfs://10.1.2.109:8020/wulin/flink-checkpoints
rest.port: 9081
blob.server.port: 6124
query.server.port: 6125
web.tmpdir: /home/flink/deploy/webTmp
web.log.path: /home/flink/deploy/log
io.tmp.dirs: /home/flink/deploy/taskManagerTmp
high-availability: zookeeper
high-availability.zookeeper.quorum: 10.0.1.79:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: flink
high-availability.storageDir: hdfs://10.1.2.109:8020/wulin
security.kerberos.login.principal: xxxx
security.kerberos.login.keytab: /home/ctu/flink/flink-1.6/conf/user.keytab
full logs
log-standalonesession-203
log-taskexecutor-203
log-standalonesession-204
exception
When I kill the working TM, I get an exception like this:
2018-12-28 11:04:27,877 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink#hz203:42861] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink#hz203:42861]] Caused by: [Connection refused: hz203/10.0.0.203:42861]
2018-12-28 11:04:28,660 WARN akka.remote.transport.netty.NettyTransport - Remote connection to [null] failed with java.net.ConnectException: Connection refused: hz203/10.0.0.203:42861
2018-12-28 11:04:28,660 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink#hz203:42861] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink#hz203:42861]] Caused by: [Connection refused: hz203/10.0.0.203:42861]
2018-12-28 11:04:28,678 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - The heartbeat of TaskManager with id 0f41bca09600cd25000e19801076fa1f timed out.
2018-12-28 11:04:28,678 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Closing TaskExecutor connection 0f41bca09600cd25000e19801076fa1f because: The heartbeat of TaskManager with id 0f41bca09600cd25000e19801076fa1f timed out.
2018-12-28 11:04:28,678 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Unregister TaskManager dcf3bb5b7ed2208cf45b658d212fd8d2 from the SlotManager.
2018-12-28 11:04:28,678 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Socket Stream -> Flat Map (1/1) (88aa62ad152f4df6b39a969dd32c0249) switched from RUNNING to FAILED.
org.apache.flink.util.FlinkException: The assigned slot 0f41bca09600cd25000e19801076fa1f_0 was removed.
at org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.removeSlot(SlotManager.java:786)
at org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.removeSlots(SlotManager.java:756)
at org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.internalUnregisterTaskManager(SlotManager.java:948)
at org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.unregisterTaskManager(SlotManager.java:372)
at org.apache.flink.runtime.resourcemanager.ResourceManager.closeTaskManagerConnection(ResourceManager.java:803)
at org.apache.flink.runtime.resourcemanager.ResourceManager$TaskManagerHeartbeatListener$1.run(ResourceManager.java:1116)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:332)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:158)
at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2018-12-28 11:04:28,680 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job Socket Window WordCount (61f55876e79934d515c163d095d706a6) switched from state RUNNING to FAILING.
submit job
When I run ./bin/flink run -d ./examples/streaming/SocketWindowWordCount.jar --port 9000 --hostname 10.1.2.9, I get JM logs like this:
2018-12-28 19:20:01,354 INFO org.apache.flink.runtime.jobmaster.JobMaster - Starting execution of job Socket Window WordCount (5cdb91c15ee12ec6e74256eed10b5291)
2018-12-28 19:20:01,354 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job Socket Window WordCount (5cdb91c15ee12ec6e74256eed10b5291) switched from state CREATED to RUNNING.
2018-12-28 19:20:01,356 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Socket Stream -> Flat Map (1/1) (e30439b9f548c6013d8b8689e30d0dd7) switched from CREATED to SCHEDULED.
2018-12-28 19:20:01,359 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Window(TumblingProcessingTimeWindows(5000), ProcessingTimeTrigger, ReduceFunction$1, PassThroughWindowFunction) -> Sink: Print to Std. Out (1/1) (102d04f5aa6fc50cfe5088e20902c72e) switched from CREATED to SCHEDULED.
2018-12-28 19:20:01,364 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPool - Cannot serve slot request, no ResourceManager connected. Adding as pending request [SlotRequestId{e33a40832a3922897470fb76bcf76b29}]
2018-12-28 19:20:01,367 INFO org.apache.flink.runtime.jobmaster.JobMaster - Connecting to ResourceManager akka.tcp://flink#hz203:46596/user/resourcemanager(b22f96303e74df23645fe4567f884b9e)
2018-12-28 19:20:01,370 INFO org.apache.flink.runtime.jobmaster.JobMaster - Resolved ResourceManager address, beginning registration
2018-12-28 19:20:01,370 INFO org.apache.flink.runtime.jobmaster.JobMaster - Registration at ResourceManager attempt 1 (timeout=100ms)
2018-12-28 19:20:01,371 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Starting ZooKeeperLeaderRetrievalService /leader/5cdb91c15ee12ec6e74256eed10b5291/job_manager_lock.
2018-12-28 19:20:01,371 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Registering job manager 9a31e8b4e8dfbf7b31d6ed3d227648b6#akka.tcp://flink#hz203:46596/user/jobmanager_0 for job 5cdb91c15ee12ec6e74256eed10b5291.
2018-12-28 19:20:01,431 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Registered job manager 9a31e8b4e8dfbf7b31d6ed3d227648b6#akka.tcp://flink#hz203:46596/user/jobmanager_0 for job 5cdb91c15ee12ec6e74256eed10b5291.
2018-12-28 19:20:01,432 INFO org.apache.flink.runtime.jobmaster.JobMaster - JobManager successfully registered at ResourceManager, leader id: b22f96303e74df23645fe4567f884b9e.
2018-12-28 19:20:01,433 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPool - Requesting new slot [SlotRequestId{e33a40832a3922897470fb76bcf76b29}] and profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} from resource manager.
2018-12-28 19:20:01,434 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Request slot with profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} for job 5cdb91c15ee12ec6e74256eed10b5291 with allocation id AllocationID{f7a24e609e2ec618ccb456076049fa3b}.
2018-12-28 19:20:01,510 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Socket Stream -> Flat Map (1/1) (e30439b9f548c6013d8b8689e30d0dd7) switched from SCHEDULED to DEPLOYING.
2018-12-28 19:20:01,511 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Source: Socket Stream -> Flat Map (1/1) (attempt #0) to hz203
2018-12-28 19:20:01,515 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Window(TumblingProcessingTimeWindows(5000), ProcessingTimeTrigger, ReduceFunction$1, PassThroughWindowFunction) -> Sink: Print to Std. Out (1/1) (102d04f5aa6fc50cfe5088e20902c72e) switched from SCHEDULED to DEPLOYING.
2018-12-28 19:20:01,515 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Window(TumblingProcessingTimeWindows(5000), ProcessingTimeTrigger, ReduceFunction$1, PassThroughWindowFunction) -> Sink: Print to Std. Out (1/1) (attempt #0) to hz203
2018-12-28 19:20:01,674 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Window(TumblingProcessingTimeWindows(5000), ProcessingTimeTrigger, ReduceFunction$1, PassThroughWindowFunction) -> Sink: Print to Std. Out (1/1) (102d04f5aa6fc50cfe5088e20902c72e) switched from DEPLOYING to RUNNING.
2018-12-28 19:20:01,708 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Socket Stream -> Flat Map (1/1) (e30439b9f548c6013d8b8689e30d0dd7) switched from DEPLOYING to RUNNING.
2018-12-28 19:20:43,267 INFO org.apache.flink.runtime.blob.BlobClient - Downloading null/t-61808afb630553305c73a0a23f9231ffd6b2b448-513fbe1e6ddf69d10689eccf4c65da97 from hz203/10.0.0.203:6124
2018-12-28 19:20:48,339 INFO org.apache.flink.runtime.blob.BlobClient - Downloading null/t-dd915bb9821ff6ced34dd5e489966b674de5a48f-7ea2600930e5fc5a4fbb7d47ee198789 from hz203/10.0.0.203:6124
2018-12-28 19:20:52,623 INFO org.apache.flink.runtime.blob.BlobClient - Downloading null/t-61808afb630553305c73a0a23f9231ffd6b2b448-0bd1ab86fa4cc54daeb472079bfbea8c from hz203/10.0.0.203:6124
kill TM
The body is limited to 30000 characters, so please read these JM logs from when the TM was killed.
The logs indicate that your RestartStrategy has depleted its restart attempts or that no RestartStrategy has been configured. Please check whether you specified a RestartStrategy in your program via env.setRestartStrategy(RestartStrategies.fixedDelayRestart(10, 0L)) or in flink-conf.yaml via restart-strategy: fixed-delay. If you want to learn more about Flink's restart strategies check out the documentation.

Why can't I connect to Kafka/Zookeeper? (In a Docker)

I'm running Kafka (0.10.0.0) in a Docker on a Mac (w/docker-machine). I derived my Dockerfile from Spotify's, which means Kafka and Zookeeper run in the same image.
My instance starts cleanly and poking around inside it appears everything is normal/OK.
Docker maps ports 2181 and 9092 to high-ports 32822 and 32820 in this case. From outside my running Kafka Docker I am able to successfully telnet 192.168.99.100 32822 (where 192.168.99.100 is the IP of my docker-machine). From there I can issue a zookeeper command and get expected output.
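The manual telnet check described above can also be scripted. This is a minimal sketch, assuming the server answers ZooKeeper's four-letter admin commands (such as ruok) and closes the connection after replying. The function name four_letter_word is made up for illustration, and the host and port in the comment are the ones from this question.

```python
import socket


def four_letter_word(host: str, port: int, word: str = "ruok", timeout: float = 5.0) -> str:
    """Send a ZooKeeper four-letter admin command and return the raw reply as text."""
    if len(word) != 4:
        raise ValueError("ZooKeeper admin commands are exactly four letters")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(word.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection once it has replied
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example call against the mapped port from this question (not executed here):
# print(four_letter_word("192.168.99.100", 32822))
```

A healthy server normally answers ruok with imok. A reply here only shows that the mapped ZooKeeper port is reachable from the host. It says nothing about whether the Kafka broker address registered in ZooKeeper is reachable.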
It all seems so encouraging, but... I then try this code:
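import kafka.admin.AdminUtils
import kafka.utils.ZkUtils

val numPartitions = 4
val replicationFactor = 1
val topicConfig = new java.util.Properties
// zookeeper = "192.168.99.100:32822"
// Arguments: ZooKeeper URL, session timeout (ms), connection timeout (ms), ZK security enabled
val zkClient = ZkUtils(zookeeper, 10000, 10000, false)
try {
  AdminUtils.createTopic(zkClient, topic, numPartitions, replicationFactor, topicConfig)
} catch {
  case k: kafka.common.TopicExistsException => // do nothing...topic exists
} finally {
  zkClient.close() // always release the ZooKeeper connection
}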
This produces this error output:
DEBUG ZkConnection - Creating new ZookKeeper instance to connect to 192.168.99.100:32822.
INFO ZkEventThread - Starting ZkClient event thread.
INFO ZooKeeper - Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
INFO ZooKeeper - Client environment:host.name=172.25.42.82
INFO ZooKeeper - Client environment:java.version=1.8.0_60
INFO ZooKeeper - Client environment:java.vendor=Oracle Corporation
INFO ZooKeeper - Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home/jre
INFO ZooKeeper - Client environment:java.class.path=/usr/local/Cellar/sbt/0.13.11/libexec/sbt-launch.jar
INFO ZooKeeper - Client environment:java.library.path=/Users/wmy965/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
INFO ZooKeeper - Client environment:java.io.tmpdir=/var/folders/ph/ccz4n1qs62n0bn8mqdg94gswt1jlwk/T/
INFO ZooKeeper - Client environment:java.compiler=<NA>
INFO ZooKeeper - Client environment:os.name=Mac OS X
INFO ZooKeeper - Client environment:os.arch=x86_64
INFO ZooKeeper - Client environment:os.version=10.11.5
INFO ZooKeeper - Client environment:user.name=wmy965
INFO ZooKeeper - Client environment:user.home=/Users/wmy965
INFO ZooKeeper - Client environment:user.dir=/Users/wmy965/git/LateKafka
INFO ZooKeeper - Initiating client connection, connectString=192.168.99.100:32822 sessionTimeout=10000 watcher=org.I0Itec.zkclient.ZkClient#55397e3
DEBUG ClientCnxn - zookeeper.disableAutoWatchReset is false
DEBUG ZkClient - Awaiting connection to Zookeeper server
INFO ZkClient - Waiting for keeper state SyncConnected
INFO ClientCnxn - Opening socket connection to server 192.168.99.100/192.168.99.100:32822. Will not attempt to authenticate using SASL (unknown error)
WARN ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
DEBUG ClientCnxnSocketNIO - Ignoring exception during shutdown input
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:780)
at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:399)
at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:200)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1185)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1110)
DEBUG ClientCnxnSocketNIO - Ignoring exception during shutdown output
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:797)
at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:407)
at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:207)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1185)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1110)
INFO ClientCnxn - Opening socket connection to server 192.168.99.100/192.168.99.100:32822. Will not attempt to authenticate using SASL (unknown error)
INFO ClientCnxn - Socket connection established to 192.168.99.100/192.168.99.100:32822, initiating session
DEBUG ClientCnxn - Session establishment request sent on 192.168.99.100/192.168.99.100:32822
INFO ClientCnxn - Session establishment complete on server 192.168.99.100/192.168.99.100:32822, sessionid = 0x155225c51720000, negotiated timeout = 10000
DEBUG ZkClient - Received event: WatchedEvent state:SyncConnected type:None path:null
INFO ZkClient - zookeeper state changed (SyncConnected)
DEBUG ZkClient - Leaving process event
DEBUG ZkClient - State is SyncConnected
DEBUG ClientCnxn - Reading reply sessionid:0x155225c51720000, packet:: clientPath:null serverPath:null finished:false header:: 1,8 replyHeader:: 1,1,-101 request:: '/brokers/ids,F response:: v{}
It looks like I can't connect (presumably to zookeeper). Why not?
In newer Kafka versions, the producer and the Kafka broker (in Docker) must be able to resolve each other. Kafka advertises its own hostname, which in Docker is the container ID (you can see it in /etc/hosts inside the Kafka container), and expects clients to connect to that name.
Summary:
Map the Kafka container's hostname to the docker-machine IP in /etc/hosts on the Mac.
For help editing the /etc/hosts file on a Mac:
https://www.tekrevue.com/tip/edit-hosts-file-mac-os-x/
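A cleaner approach is to set advertised.listeners (e.g. advertised.listeners=PLAINTEXT://host-ip:port) in the Kafka server.properties file, since advertised.host.name and advertised.port are deprecated.
Setting the host in listeners to 0.0.0.0 makes the broker accept requests on every interface, but that is insecure. advertised.listeners itself needs an address that clients can actually reach.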

Not able to start zookeeper

I am trying to manually start zookeeper. I run
# source zkServer.sh start
It outputs:
JMX enabled by default
Using config: /opt/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
When I run #jps, it outputs
15360 QuorumPeerMain
15412 Jps
From what I read online, Zookeeper is the same process as QuorumPeerMain listed above. But then when I check its status using
source zkServer.sh status
It hangs at:
JMX enabled by default
Using config: /opt/zookeeper-3.4.6/bin/../conf/zoo.cfg
So I run
#source zkServer.sh status > templogs.txt
Running the above makes the terminal flash the output below for a moment and then close (I had to run the command many times to read what was printed before it closed):
JMX enabled by default
Using config:
grep: No such file or directory
grep: No such file or directory
The following is written to templogs.txt:
Error contacting service. It is probably not running.
When I open zookeeper.out, I can see this output:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/zookeeper-3.4.6/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hive-0.12.0-cdh5.0.3/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase-0.96.1.1-cdh5.0.3/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/spark-0.9.0-cdh/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/splicemachine/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2015-06-29 15:49:40,831 [myid:] - INFO [main:QuorumPeerConfig#103] - Reading configuration from: /opt/zookeeper-3.4.6/bin/../conf/zoo.cfg
2015-06-29 15:49:40,836 [myid:] - INFO [main:DatadirCleanupManager#78] - autopurge.snapRetainCount set to 3
2015-06-29 15:49:40,836 [myid:] - INFO [main:DatadirCleanupManager#79] - autopurge.purgeInterval set to 0
2015-06-29 15:49:40,836 [myid:] - INFO [main:DatadirCleanupManager#101] - Purge task is not scheduled.
2015-06-29 15:49:40,837 [myid:] - WARN [main:QuorumPeerMain#113] - Either no config or no quorum defined in config, running in standalone mode
2015-06-29 15:49:40,847 [myid:] - INFO [main:QuorumPeerConfig#103] - Reading configuration from: /opt/zookeeper-3.4.6/bin/../conf/zoo.cfg
2015-06-29 15:49:40,847 [myid:] - INFO [main:ZooKeeperServerMain#95] - Starting server
2015-06-29 15:49:40,896 [myid:] - INFO [main:Environment#100] - Server environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2015-06-29 15:49:40,896 [myid:] - INFO [main:Environment#100] - Server environment:host.name=ingester
2015-06-29 15:49:40,896 [myid:] - INFO [main:Environment#100] - Server environment:java.version=1.8.0_25
2015-06-29 15:49:40,896 [myid:] - INFO [main:Environment#100] - Server environment:java.vendor=Oracle Corporation
2015-06-29 15:49:40,900 [myid:] - INFO [main:Environment#100] - Server environment:java.home=/usr/java/jdk1.8.0_25/jre
2015-06-29 15:49:40,901 [myid:] - INFO [main:Environment#100] - Server environment:java.class.path=/opt/zookeeper-3.4.6/bin/../build/classes:/opt
I omit the huge path string that follows, and give below the remaining log:
2015-06-29 15:49:40,902 [myid:] - INFO [main:Environment#100] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2015-06-29 15:49:40,902 [myid:] - INFO [main:Environment#100] - Server environment:java.io.tmpdir=/tmp
2015-06-29 15:49:40,902 [myid:] - INFO [main:Environment#100] - Server environment:java.compiler=<NA>
2015-06-29 15:49:40,903 [myid:] - INFO [main:Environment#100] - Server environment:os.name=Linux
2015-06-29 15:49:40,903 [myid:] - INFO [main:Environment#100] - Server environment:os.arch=amd64
2015-06-29 15:49:40,903 [myid:] - INFO [main:Environment#100] - Server environment:os.version=3.17.8-200.fc20.x86_64
2015-06-29 15:49:40,903 [myid:] - INFO [main:Environment#100] - Server environment:user.name=root
2015-06-29 15:49:40,904 [myid:] - INFO [main:Environment#100] - Server environment:user.home=/root
2015-06-29 15:49:40,904 [myid:] - INFO [main:Environment#100] - Server environment:user.dir=/root
2015-06-29 15:49:40,909 [myid:] - INFO [main:ZooKeeperServer#755] - tickTime set to 2000
2015-06-29 15:49:40,909 [myid:] - INFO [main:ZooKeeperServer#764] - minSessionTimeout set to -1
2015-06-29 15:49:40,909 [myid:] - INFO [main:ZooKeeperServer#773] - maxSessionTimeout set to -1
2015-06-29 15:49:40,918 [myid:] - INFO [main:NIOServerCnxnFactory#94] - binding to port 0.0.0.0/0.0.0.0:2181
Is ZooKeeper stuck trying to bind to port 2181?
But when I run lsof -i:2181 -s, it outputs:
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
java 15360 root 467u IPv6 18340028 TCP *:eforward (LISTEN)
The PID is that of QuorumPeerMain.
Running source zkServer.sh status on another PC correctly gives
Mode: standalone
But on this PC I am stuck. Can anyone help me?
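One way to tell a hung server from a healthy one that is simply sitting in the foreground is to send ZooKeeper's four-letter command ruok to the client port; a healthy server replies imok and closes the connection. Below is a minimal Python sketch of that check. It assumes the server listens on 127.0.0.1:2181 and that four-letter commands are enabled (newer releases may require whitelisting them through 4lw.commands.whitelist). The function name zk_ruok is made up for illustration.

```python
import socket

def zk_ruok(host="127.0.0.1", port=2181, timeout=3.0):
    """Return True if the server at host:port answers 'ruok' with 'imok'.

    Returns False on connection refused, timeout, or any other reply.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # The server closes the connection after answering,
            # so read until recv() returns an empty chunk.
            while True:
                chunk = sock.recv(64)
                if not chunk:
                    break
                reply += chunk
            return reply.strip() == b"imok"
    except OSError:
        # Covers ConnectionRefusedError and socket timeouts alike.
        return False
```

Calling zk_ruok() with the defaults probes the port from the log above. If it returns True, the "binding to port" line was just the last message of a normal startup rather than a sign of a stuck process.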
Have you checked that the config file is valid?
Try running ZooKeeper with the full path to the config file.
Example: zkServer.sh start /etc/zookeeper/conf/zoo.cfg
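As a rough illustration of what "validating" a zoo.cfg could mean, here is a small Python sketch that parses the key=value lines and flags a few common omissions. The checks are my own choice and not ZooKeeper's actual validation logic; the function names parse_zoo_cfg and check_zoo_cfg are hypothetical.

```python
def parse_zoo_cfg(text):
    """Parse zoo.cfg-style 'key=value' lines into a dict, skipping comments."""
    cfg = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        cfg[key.strip()] = value.strip()
    return cfg

def check_zoo_cfg(cfg):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key in ("dataDir", "clientPort"):
        if key not in cfg:
            problems.append(f"missing {key}")
    port = cfg.get("clientPort", "")
    if port and not (port.isdigit() and 0 < int(port) < 65536):
        problems.append(f"clientPort is not a valid port: {port}")
    # server.N lines put the server in replicated mode, which needs these limits.
    if any(k.startswith("server.") for k in cfg):
        for key in ("initLimit", "syncLimit"):
            if key not in cfg:
                problems.append(f"{key} is required when server.N lines are present")
    return problems
```

Reading the file with open(path).read() and passing the text through both functions gives a quick sanity check before starting the server.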
There seems to be a problem with the port binding:
"binding to port 0.0.0.0/0.0.0.0:2181"
The problem should be resolved by adding the following entry to the hosts file:
127.0.0.1 localhost
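In my case, with ZooKeeper 3.7.0, I just needed to shut down the Tomcat server (probably because Tomcat was holding port 8080, which ZooKeeper's AdminServer also uses by default).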
In my case Java was not installed, so ZooKeeper did not start.
Download and install a JDK, then try again.

"java.net.ConnectException: Connection refused" in zookeeper

I installed ZooKeeper as follows:
wget http://archive.cloudera.com/cdh/3/zookeeper-3.3.3-cdh3u1.tar.gz
Here is my zoo.cfg:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/home/reach121/basf/data/zookeeper/data1
# maximum client connection
maxClientCnxns=500
# the port at which the clients will connect
clientPort=2183
server.1=localhost:2878:3878
server.2=localhost:2879:3879
server.3=localhost:2880:3880
I started it with:
/bin/zkServer.sh start zoo.cfg
When I run:
bin/zkCli.sh -server 127.0.0.1:2183
it gives me this error:
Connecting to 127.0.0.1:2183
2011-10-13 14:11:28,433 - INFO [main:Environment#97] - Client environment:zookeeper.version=3.3.3-cdh3u1--1, built on 07/18/2011 15:17 GMT
2011-10-13 14:11:28,437 - INFO [main:Environment#97] - Client environment:host.name=cignexnew
2011-10-13 14:11:28,437 - INFO [main:Environment#97] - Client environment:java.version=1.6.0_22
2011-10-13 14:11:28,438 - INFO [main:Environment#97] - Client environment:java.vendor=Sun Microsystems Inc.
2011-10-13 14:11:28,438 - INFO [main:Environment#97] - Client environment:java.home=/usr/lib/jvm/java-6-openjdk/jre
2011-10-13 14:11:28,439 - INFO [main:Environment#97] - Client environment:java.class.path=/home/reach121/basf/zookeeper-3.3.3-cdh3u1/bin/../build/classes:/home/reach121/basf/zookeeper-3.3.3-cdh3u1/bin/../build/lib/*.jar:/home/reach121/basf/zookeeper-3.3.3-cdh3u1/bin/../zookeeper-3.3.3-cdh3u1.jar:/home/reach121/basf/zookeeper-3.3.3-cdh3u1/bin/../lib/log4j-1.2.15.jar:/home/reach121/basf/zookeeper-3.3.3-cdh3u1/bin/../lib/jline-0.9.94.jar:/home/reach121/basf/zookeeper-3.3.3-cdh3u1/bin/../src/java/lib/*.jar:/home/reach121/basf/zookeeper-3.3.3-cdh3u1/bin/../conf:
2011-10-13 14:11:28,439 - INFO [main:Environment#97] - Client environment:java.library.path=/usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server:/usr/lib/jvm/java-6-openjdk/jre/lib/amd64:/usr/lib/jvm/java-6-openjdk/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2011-10-13 14:11:28,440 - INFO [main:Environment#97] - Client environment:java.io.tmpdir=/tmp
2011-10-13 14:11:28,440 - INFO [main:Environment#97] - Client environment:java.compiler=<NA>
2011-10-13 14:11:28,441 - INFO [main:Environment#97] - Client environment:os.name=Linux
2011-10-13 14:11:28,441 - INFO [main:Environment#97] - Client environment:os.arch=amd64
2011-10-13 14:11:28,441 - INFO [main:Environment#97] - Client environment:os.version=2.6.35.4-rscloud
2011-10-13 14:11:28,442 - INFO [main:Environment#97] - Client environment:user.name=reach121
2011-10-13 14:11:28,443 - INFO [main:Environment#97] - Client environment:user.home=/home/reach121
2011-10-13 14:11:28,443 - INFO [main:Environment#97] - Client environment:user.dir=/home/reach121/basf/zookeeper-3.3.3-cdh3u1
2011-10-13 14:11:28,446 - INFO [main:ZooKeeper#373] - Initiating client connection, connectString=127.0.0.1:2183 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher#5311a775
Welcome to ZooKeeper!
2011-10-13 14:11:28,472 - INFO [main-SendThread():ClientCnxn$SendThread#1041] - Opening socket connection to server /127.0.0.1:2183
JLine support is enabled
2011-10-13 14:11:28,487 - WARN [main-SendThread(localhost:2183):ClientCnxn$SendThread#1161] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
[zk: 127.0.0.1:2183(CONNECTING) 0] 2011-10-13 14:11:30,374 - INFO [main-SendThread(localhost:2183):ClientCnxn$SendThread#1041] - Opening socket connection to server localhost/127.0.0.1:2183
2011-10-13 14:11:30,376 - WARN [main-SendThread(localhost:2183):ClientCnxn$SendThread#1161] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
Are the servers coming up? Likely not given:
server.1=localhost:2878:3878
server.2=localhost:2879:3879
server.3=localhost:2880:3880
If you are running all three servers on the same host, each one needs its own config: in particular the dataDir location (and the clientPort) must be different, and you need to ensure that each dataDir contains a myid file whose number matches that server's line in the config (i.e. the # in server.#).
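To make the per-server layout concrete, here is a hedged Python sketch that writes three config files and matching myid files for servers sharing one host. The quorum ports follow the server.N lines quoted above; the base directory layout, the file names zoo1.cfg to zoo3.cfg, the client ports, and the function name write_ensemble are assumptions for illustration only.

```python
import os

def write_ensemble(base_dir, n=3, first_client_port=2181):
    """Create a config file and a data dir (with myid) for n servers on one host.

    Returns the list of config file paths, one per server.
    """
    # Every server lists the whole ensemble; ports match server.1..3 above.
    servers = [f"server.{i}=localhost:{2877 + i}:{3877 + i}" for i in range(1, n + 1)]
    paths = []
    for i in range(1, n + 1):
        data_dir = os.path.join(base_dir, f"data{i}")
        os.makedirs(data_dir, exist_ok=True)
        # myid holds just the number N from this server's server.N line.
        with open(os.path.join(data_dir, "myid"), "w") as f:
            f.write(f"{i}\n")
        lines = [
            "tickTime=2000",
            "initLimit=10",
            "syncLimit=5",
            f"dataDir={data_dir}",                        # distinct per server
            f"clientPort={first_client_port + i - 1}",    # distinct per server
        ] + servers
        cfg_path = os.path.join(base_dir, f"zoo{i}.cfg")
        with open(cfg_path, "w") as f:
            f.write("\n".join(lines) + "\n")
        paths.append(cfg_path)
    return paths
```

Each returned path would then be passed to its own zkServer.sh start invocation, so that three separate processes come up, each with its own dataDir and client port.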
Typically when you want to run in distributed mode you need to have separate hosts. In this case why not just run in standalone (default) mode?
I'd suggest you read more in the admin guide first: http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html
Make sure all required services are running.
Step 1: Check whether hbase-master is running:
sudo /etc/init.d/hbase-master status
If not, start it:
sudo /etc/init.d/hbase-master start
Step 2: Check whether hbase-regionserver is running:
sudo /etc/init.d/hbase-regionserver status
If not, start it:
sudo /etc/init.d/hbase-regionserver start
Step 3: Check whether zookeeper-server is running:
sudo /etc/init.d/zookeeper-server status
If not, start it:
sudo /etc/init.d/zookeeper-server start
Or simply run these three commands in a row:
sudo /etc/init.d/hbase-master restart
sudo /etc/init.d/hbase-regionserver restart
sudo /etc/init.d/zookeeper-server restart
After that, don't forget to check the status:
sudo /etc/init.d/hbase-master status
sudo /etc/init.d/hbase-regionserver status
sudo /etc/init.d/zookeeper-server status
You might find that ZooKeeper is still not running.
In that case, restart it directly:
sudo /usr/lib/zookeeper/bin/zkServer.sh stop
sudo /usr/lib/zookeeper/bin/zkServer.sh start
After that, check the status again and make sure it's running:
sudo /etc/init.d/zookeeper-server status
This should work.
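The status checks above can also be scripted. The following Python sketch builds the three status commands and reports which services look healthy. It assumes the script is run as root (sudo is left out) and that the init scripts follow the usual convention of exit status 0 for "running"; the names SERVICES, status_commands, and run_all are hypothetical.

```python
import subprocess

SERVICES = ["hbase-master", "hbase-regionserver", "zookeeper-server"]

def status_commands(services=SERVICES, init_dir="/etc/init.d"):
    """Build one '<init script> status' command per service."""
    return [[f"{init_dir}/{name}", "status"] for name in services]

def run_all(commands, runner=subprocess.call):
    """Run each command; map its text to True when the exit code is 0."""
    results = {}
    for cmd in commands:
        try:
            results[" ".join(cmd)] = runner(cmd) == 0
        except OSError:
            # The init script does not exist or cannot be executed.
            results[" ".join(cmd)] = False
    return results
```

Calling run_all(status_commands()) returns a dictionary such as one entry per service, and any False value points at the service to restart.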
I had the same issue connecting from client code to MapR M3 out of the box.
The issue was that the client was trying to connect to the M3 ZooKeeper at localhost:
/opt/mapr/conf/mapr-clusters.conf on my M3 cluster was pointing to localhost.
I changed it to the IP address of the M3 machine, and the connection from the client worked.
In /opt/mapr/conf/cldb.conf, likewise put the IP address in place of localhost,
then restart ZooKeeper.