Zookeeper: java.io.IOException: No snapshot found, but there are log entries. Something is broken - apache-kafka

I have been working with Kafka 2.4.0 (Scala 2.11), and yesterday I had to forcefully terminate the process for some unknown reason. Since then I have been unable to start ZooKeeper due to the following error:
[2020-01-11 11:12:43,783] ERROR Unexpected exception, exiting abnormally (org.apache.zookeeper.server.ZooKeeperServerMain)
java.io.IOException: No snapshot found, but there are log entries. Something is broken!
at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:222)
at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:240)
at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:290)
at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:450)
at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:764)
at org.apache.zookeeper.server.ServerCnxnFactory.startup(ServerCnxnFactory.java:98)
at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:144)
at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:106)
at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:64)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:128)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
When I searched for this problem I found the reported issue ZOOKEEPER-3513, which may or may not explain it. What I find strange, however, is that the problem persists even if I delete the Kafka/ZooKeeper directory and download it again from scratch. Does anyone know how I can solve this?
Thank you for your help

Look for the tmp/zookeeper folder on the drive where your Kafka folder lives (let's say D:/) and delete the tmp folder. It will be recreated automatically the next time you run ZooKeeper.

Try changing your zookeeper data directory.
Your zookeeper data directory is defined in zookeeper.properties (I think the default is /tmp/zookeeper).
Perhaps you're not deleting the correct zookeeper directory?
I had the same problem, and this solution worked.
NOTE: I'm experimenting with Kafka, and not using it in production. I have no idea what else the above does, apart from fix this error...

I faced the same issue with ZooKeeper after updating from version 3.4.x to 3.5.6. As described here, I:
added an empty snapshot.0 file to the data directory
added the property 'zookeeper.snapshot.trust.empty=true' to the ZooKeeper configuration file (zoo.cfg by default)

On Windows:
Go to the tmp folder where the ZooKeeper data is stored and delete the existing log files.
Directory path = d:\tmp\zookeeper\version-2
On Linux:
Path = /tmp/zookeeper/version-2
Remove the existing log files, for example with rm -r log.1
ZooKeeper creates the log files again automatically on the next start, which resolves the issue.
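The cleanup described in these answers can be sketched as a small script. This is only an illustration, assuming a throwaway development setup: removing the transaction logs discards whatever state ZooKeeper had stored. The helper names (read_data_dir, find_txn_logs, remove_txn_logs) are made up for this sketch, and the properties parsing is deliberately simple (it does not handle backslash escapes).

```python
from pathlib import Path


def read_data_dir(properties_path):
    """Return the dataDir value from a ZooKeeper properties file, or None."""
    for raw in Path(properties_path).read_text().splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines without a key=value pair
        key, value = line.split("=", 1)
        if key.strip() == "dataDir":
            return value.strip()
    return None


def find_txn_logs(data_dir):
    """List transaction log files (log.*) under <dataDir>/version-2."""
    version_dir = Path(data_dir) / "version-2"
    if not version_dir.is_dir():
        return []
    return sorted(p for p in version_dir.glob("log.*") if p.is_file())


def remove_txn_logs(data_dir, dry_run=True):
    """Delete the transaction logs; with dry_run=True only report them."""
    logs = find_txn_logs(data_dir)
    if not dry_run:
        for path in logs:
            path.unlink()
    return logs
```

Calling remove_txn_logs(read_data_dir("config/zookeeper.properties")) first, with the default dry_run=True, shows which files would be removed before anything is actually deleted.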

I faced the same issue on macOS.
Solution: I went to /tmp/zookeeper/version-2 (cd /tmp/zookeeper/version-2) and deleted the log.1 file. It worked for me.

If you are on Windows, make sure you escape the backslashes in the location of the ZooKeeper temp directory.
dataDir=d:\\tmp\\zookeeper

Created a new dir for logs and configured the same path in zoo.cfg.
It worked:)

I use macOS, and my solution was to delete everything in the dataDir. The default value should be /usr/local/var/lib/zookeeper.

For those who are using Docker, I'll share my experience:
I've been running ZooKeeper (confluentinc/cp-zookeeper:5.2.1) as follows:
docker run \
--network kafka-net --name=zookeeper \
-e ALLOW_ANONYMOUS_LOGIN=yes \
-e ZOOKEEPER_CLIENT_PORT=2181 \
-v /tmp/zookeeper-data:/var/lib/zookeeper/data \
-v /tmp/zookeeper-txn-logs:/var/lib/zookeeper/log \
-p 2181:2182 confluentinc/cp-zookeeper:5.2.1
As expected, I could see a few files placed in /tmp/zookeeper-txn-logs and /tmp/zookeeper-data on the host. After cleaning up /tmp/zookeeper-data and running the container again, I got the error No snapshot found, but there are log entries.
In my case, I just had to purge the data in /tmp/zookeeper-txn-logs as well. For a dev/production environment, I'd recommend following the docs: https://access.redhat.com/documentation/en-us/red_hat_amq/6.3/html/fabric_guide/ensemble-purgetxnlog

Related

Unable to start kafka with zookeeper (kafka.common.InconsistentClusterIdException)

These are the steps that lead to the issue:
Launch ZooKeeper
Launch Kafka: .\bin\windows\kafka-server-start.bat .\config\server.properties
The error happens at the second step:
ERROR Fatal error during KafkaServer startup. Prepare to shutdown
(kafka.server.KafkaServer)
kafka.common.InconsistentClusterIdException: The Cluster ID
Reu8ClK3TTywPiNLIQIm1w doesn't match stored clusterId
Some(BaPSk1bCSsKFxQQ4717R6Q) in meta.properties. The broker is trying
to join the wrong cluster. Configured zookeeper.connect may be wrong.
at kafka.server.KafkaServer.startup(KafkaServer.scala:220)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44)
at kafka.Kafka$.main(Kafka.scala:84)
at kafka.Kafka.main(Kafka.scala)
When I run .\bin\windows\kafka-server-start.bat .\config\server.properties, the ZooKeeper console prints:
INFO [SyncThread:0:FileTxnLog#216] - Creating new log file: log.1
How can I fix this issue and get Kafka running?
Edit: You can find the proper question on the right site (Server Fault) here.
Edit: Here is the answer.
I managed to solve this issue with the following steps:
Delete all the log/data files created (or generated) by ZooKeeper and Kafka.
Run ZooKeeper
Run Kafka
[Since this post is open again, I am posting my answer here so that everything is in the same post.]
1. The easiest solution is to remove the Kafka logs and start again.
2. But the root cause is that Kafka saved a cluster ID in meta.properties that no longer matches the one ZooKeeper reports.
Try deleting kafka-logs/meta.properties from your tmp folder, which by default is C:/tmp on Windows; on Linux the default log directory is /tmp/kafka-logs.
If Kafka is running in Docker containers, the log path may be specified by the volume config in the docker-compose file; see docs.docker.com/compose/compose-file/compose-file-v2/#volumes -- Chris Halcrow
3. How to find the Kafka log path:
Open the server.properties file, which is located in your Kafka folder at kafka_2.11-2.4.0\config\server.properties (depending on your version of Kafka, the folder name is kafka_<kafka_version>):
Then search for the log.dirs entry to see where the logs are located:
log.dirs=/tmp/kafka-logs
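The lookup in step 3 can be automated with a short sketch. It assumes a plain key=value server.properties file and that log.dirs may hold a comma-separated list of directories. The function names read_log_dirs and find_meta_properties are hypothetical, chosen for this example only.

```python
from pathlib import Path


def read_log_dirs(server_properties):
    """Return the directories listed in log.dirs (comma-separated)."""
    for raw in Path(server_properties).read_text().splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines without a key=value pair
        key, value = line.split("=", 1)
        if key.strip() == "log.dirs":
            return [Path(item.strip()) for item in value.split(",") if item.strip()]
    return []


def find_meta_properties(server_properties):
    """Return every existing <log dir>/meta.properties file."""
    candidates = [d / "meta.properties" for d in read_log_dirs(server_properties)]
    return [path for path in candidates if path.is_file()]
```

The returned paths are the files that step 2 suggests deleting. The sketch only locates them and leaves the decision to remove them to you.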
On a Mac, the following steps are needed:
Stop the Kafka service: brew services stop kafka
Open the Kafka server.properties file: vim /usr/local/etc/kafka/server.properties
Find the value of log.dirs in this file. For me, it is /usr/local/var/lib/kafka-logs
Delete the path-to-log.dirs/meta.properties file
Start the Kafka service: brew services start kafka
There is no need to delete the Kafka log/data files. Check the Kafka error log to find the new cluster ID, update the meta.properties file with that cluster ID, and then restart Kafka.
/home/kafka/logs/meta.properties
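As a rough sketch of that update, the snippet below rewrites only the cluster.id line and keeps every other line of meta.properties as it was. The function name set_cluster_id is invented for illustration, and it assumes the file holds one key=value pair per line. In the error message quoted in the question, the ID to write would be the first one (the one the broker reports from ZooKeeper), not the one inside Some(...).

```python
from pathlib import Path


def set_cluster_id(meta_path, new_cluster_id):
    """Rewrite the cluster.id line in meta.properties, keeping other lines."""
    path = Path(meta_path)
    updated = []
    found = False
    for line in path.read_text().splitlines():
        if line.strip().startswith("cluster.id="):
            updated.append("cluster.id=" + new_cluster_id)
            found = True
        else:
            updated.append(line)
    if not found:
        # No cluster.id entry yet: append one rather than fail.
        updated.append("cluster.id=" + new_cluster_id)
    path.write_text("\n".join(updated) + "\n")
```

Stop Kafka before editing the file, and keep a backup copy of meta.properties in case the change needs to be reverted.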
To resolve this issue permanently, do the following.
Check your zookeeper.properties file, look for the dataDir path, and change it from the tmp location to another location that is not cleared when the server restarts.
/home/kafka/kafka/config/zookeeper.properties
Copy the zookeeper folder and its files to the new (non-tmp) location, then restart ZooKeeper and Kafka.
cp -r /tmp/zookeeper /home/kafka/zookeeper
Now a server restart won't affect the Kafka startup.
If you use Embedded Kafka with Testcontainers in your Java project like myself, then simply delete your build/kafka folder and Bob's your uncle.
The mentioned meta.properties can be found under build/kafka/out/embedded-kafka.
I had some old volumes lingering around. I checked the volumes like this:
docker volume list
And pruned old volumes:
docker volume prune
And also removed the ones that were kafka:
example:
docker volume rm test_kafka
I deleted the following directories:
a.) the logs directory at the Kafka server's configured location, i.e. the log.dir property path.
b.) the tmp directory at the Kafka broker's location.
log.dirs=../tmp/kafka-logs-1
I was using docker-compose to re-set up Kafka on a Linux server, with a known, working docker-compose.config that sets up a number of Kafka components (broker, zookeeper, connect, rest proxy), and I was getting the issue described in the OP. I fixed this for my dev server instance by doing the following
docker-compose down
backup kafka-logs directory using cp kafka-logs -r kafka-logs-bak
delete the kafka-logs/meta.properties file
docker-compose up -d
Note for users of docker-compose:
My log files weren't in the default location (/tmp/kafka-logs). If you're running Kafka in Docker containers, the log path can be specified by volume config in the docker-compose e.g.
volumes:
- ./kafka-logs:/tmp/kafka-logs
This specifies SOURCE:TARGET. ./kafka-logs is the source (i.e. a directory named kafka-logs, in the same directory as the docker-compose file). It is mounted at /tmp/kafka-logs as a volume inside the Kafka container. So the logs can be deleted either from the source folder on the host machine or from the mounted volume after doing a docker exec into the Kafka container.
see https://docs.docker.com/compose/compose-file/compose-file-v2/#volumes
For me, meta.properties was in /usr/local/var/lib/kafka-logs
After I removed it, Kafka started working.
I also deleted all the content of the folder containing the data generated by Kafka. I found the folder in my .yml file:
kafka:
  image: confluentinc/cp-kafka:7.0.0
  ports:
    - '9092:9092'
  environment:
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
    KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
    KAFKA_BROKER_ID: 1
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE: "true"
  volumes:
    - ./kafka-data/data:/var/lib/kafka/data
  depends_on:
    - zookeeper
  networks:
    - default
The location is listed under volumes:. So, in my case, I deleted all files in the data folder located under kafka-data.
I tried deleting the meta.properties file, but that didn't work.
In my case, the problem was solved by deleting legacy Docker images.
The drawback is that this deletes all previous data.
So be careful: if you want to keep the old data, this is not the right solution for you.
docker rm $(docker ps -q -f 'status=exited')
docker rmi $(docker images -q -f "dangling=true")

Unable to debug java app through stack driver in google kubernetes cluster

I am trying to debug a Java app on a GKE cluster through Stackdriver.
I have created a GKE cluster with "Allow full access to all Cloud APIs" enabled.
I am following the documentation: https://cloud.google.com/debugger/docs/setup/java
Here is my Dockerfile:
FROM openjdk:8-jdk-alpine
VOLUME /tmp
ARG JAR_FILE
COPY ${JAR_FILE} alnt-watchlist-microservice.jar
ENTRYPOINT ["java","-Djava.security.egd=file:/dev/./urandom","-jar","/alnt-watchlist-microservice.jar"]
The documentation says to add the following lines to the Dockerfile:
RUN mkdir /opt/cdbg && \
wget -qO- https://storage.googleapis.com/cloud-debugger/compute-java/debian-wheezy/cdbg_java_agent_gce.tar.gz | \
tar xvz -C /opt/cdbg
RUN java -agentpath:/opt/cdbg/cdbg_java_agent.so
-Dcom.google.cdbg.module=tpm-watchlist
-Dcom.google.cdbg.version=v1
-jar /alnt-watchlist-microservice.jar
When I build the Dockerfile, it fails with tar: invalid magic , tar: short read.
In the Stackdriver Debug console, it always shows 'No deployed application found'. Which application will it show? I already have 2 services deployed on my Kubernetes cluster.
I have already executed
gcloud debug source gen-repo-info-file --output-directory="WEB-INF/classes/"
in my project's directory.
It generated source-context.json. After that, I tried building the Docker image, and it is failing.
The debugger will be ready for use once you deploy your containerized app. You are getting the No deployed application found error because the debugger agent fails to download or unpack in your Dockerfile.
Please check this discussion to resolve the tar: invalid magic , tar: short read error.
Unfortunately, it looks like Alpine isn't regularly tested with Debugger. There's a sample setup here that might help you: https://github.com/GoogleCloudPlatform/cloud-debug-java#alpine-linux
I resolved the issue.
Firstly, you have to use the Java image "gcr.io/google-appengine/openjdk" instead of the Alpine one.
Secondly, I was writing the entry point arguments without separating them with commas (basically, in the wrong format). The corrected line:
ENTRYPOINT ["java","-agentpath:/opt/cdbg/cdbg_java_agent.so", "-Djava.security.egd=file:/dev/./urandom" ,"-Dcom.google.cdbg.module=watchlist"]

zookeeper + Kafka - Unable to create data directory

I'm using ZooKeeper 3.4.8 on a single node and am trying to use Kafka.
When I run this command:
zookeeper-server-start.sh /usr/local/kafka_2.9.2-0.8.2.2/config/zookeeper.properties
I get the error below:
[2016-02-22 17:32:41,661] ERROR Unexpected exception, exiting abnormally (org.apache.zookeeper.server.ZooKeeperServerMain)
java.io.IOException: Unable to create data directory /var/zookeeper/version-2
at org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:85)
at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:104)
at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
Any advice?
One reason could be an inappropriate path specified in the zoo.cfg file.
A lot of solutions on the web specify the path as ":\zookeeper-3.4.7\data".
Instead of that format, specify the full path from your C: drive to the data folder. It worked for me. (Don't forget to use a double backslash \\ instead of a single one if you're on Windows.)
I got this problem with this setting on a Windows PC:
dataDir=c:/data/zoo/
which led to this error:
2016-12-02 15:29:25,327 [myid:] - ERROR [main:ZooKeeperServerMain#64] - Unexpected exception, exiting abnormally
java.io.IOException: Unable to create data directory ??:\data\zoo\version-2
The problem was solved by changing the setting to the following (I have ZooKeeper unpacked on the C: drive):
dataDir=/data/zoo/
Also, run the command line tool as Administrator if needed.
I faced the same issue, and it works with
sudo bin/zookeeper-server-start.sh config/zookeeper.properties
You probably don't have permission to write to the data directory (see dataDir in zookeeper.properties). Change the directory to a different one, change the permissions of the current directory, or run ZooKeeper as a different user. You can use the command ls -l /var/zookeeper to see the current permissions and then chmod to change them.
The reason is that ZooKeeper has no permission. Try using the administrator role to install it.
For a Windows machine
Solved: use double backslashes inside the path when defining dataDir
dataDir=E:\\tools\\zookeeperdata\\data
On my Windows 10 system, using ZooKeeper 3.4.10, the dataDir attribute should be set like :\\\\zookeeper\\\\data, not d:\zookeeper\data. It can also be set with the Linux file system separator (d:/zookeeper/data). Then the problem should go away. On Linux, I think it is a permission problem. It can also occur when dataDir is on drive C in a Windows system.
If you're running ZooKeeper on a Windows 10 machine, you need to specify the dataDir property like this:
"dataDir=C:\zookeeper-3.4.13\data"
In my windows 10 system, using zookeeper 3.4.13, the following example path is working:
"dataDir=C:\\dev\\tools\\zookeeper-3.4.13\\data"
You have to use double backslashes.
In zoo.cfg you need to change the directory to the one above or something similar:
dataDir=C:/zookeeper-3.4.14/zookeeper-3.4.14/data
For Windows, set dataDir to a full path where you have no access restrictions, without quotes (""):
dataDir=C:\\your-path\
dataDir=C:\\zk\tmp\
Note: I have observed the command failing for some paths (despite full access); running the command prompt as administrator solved it.
For Windows, the following works too:
dataDir=C:\\zookeeper-3.4.14\\zookeeper-3.4.14\\data
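Several answers in this thread come down to backslash handling. The configuration file is read as a Java properties file, where a backslash starts an escape sequence. The sketch below is a simplified Python approximation of that decoding for a single-line value, written only to show why single backslashes corrupt a Windows path. The function name unescape_properties_value is made up, and \uXXXX escapes and line continuations are left out on purpose.

```python
# Escape letters that map to control characters in a properties value.
ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "f": "\f"}


def unescape_properties_value(raw):
    """Approximate how a Java properties loader decodes one single-line value."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(raw):
            break  # trailing backslash: a line continuation in a real file
        nxt = raw[i + 1]
        # Known escapes become control characters; for any other character
        # the backslash is simply dropped.
        out.append(ESCAPES.get(nxt, nxt))
        i += 2
    return "".join(out)
```

With this approximation, the raw text d:\tmp\zookeeper decodes to "d:", a tab character, and "mpzookeeper" run together, while E:\\tools\\zookeeperdata\\data decodes to E:\tools\zookeeperdata\data. Forward slashes pass through untouched, which is why d:/zookeeper/data also works.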

MongoDB not using /etc/mongodb.conf after I changed dbpath

Ever since I changed the dbpath in /etc/mongodb.conf, MongoDB has not been starting automatically, nor using the new dbpath. Prior to the change, MongoDB would be running when the computer started and I was able to simply run the command mongo to get into the console or start my Ruby on Rails server with no issues.
After I made the modification (in order to switch to a new drive with more space), the only way I can get everything to work is by manually running the command mongod --config /etc/mongodb.conf. If I don't run that, the service doesn't seem to be running, and running mongod without the --config option gives me the following error: ERROR: dbpath (/data/db/) does not exist. even though the config file says nothing about data/db.
Some other notes:
In addition to changing /etc/mongodb.conf, I moved all files out of /var/lib/mongodb and into /home/nick/appdev/mongodb.
I changed the owner and group from root to nick. Tried changing it back, but it didn't seem to fix anything.
I'm running Ubuntu 12.10 Beta 1 and Mongo 2.2.0 with Ruby on Rails 3.2.8
A late follow-up on the above question...
I had a similar issue after moving the db to an EBS volume on EC2.
It turns out that just running mongod still points the dbpath to /data/db/ (which exists).
The /etc/mongodb.conf file is completely ignored unless mongod is explicitly pointed at it.
I managed to work around this by using the --config option or just --dbpath (both work).
But I was left wondering where mongod takes its defaults from...?!
I was unable to locate and override these defaults anywhere.
Anyone?
Note:
I am really annoyed by this behaviour of mongod... This is just bad design and bad documentation.
It turns out that I needed to set the owner and group to mongodb. When I transferred the files to the new directory, I had set the owner and group to my user account nick and also tried root, neither of which worked.
To do so, I used the following commands:
sudo chown mongodb /home/nick/appdev/mongodb -R
sudo chgrp mongodb /home/nick/appdev/mongodb -R
To confirm that it worked, you can check the file permissions with:
ls -l /home/nick/appdev/mongodb
After checking all permissions in the data, journal and log folders as suggested, my problem was solved by giving permission to the mongodb-27017.sock file in the /tmp folder:
sudo chown mongod:mongod mongodb-27017.sock
I was running it on an AWS Amazon Linux instance. I figured that out by running mongod as the mongod user, as shown below, and then researching the error code. It might be useful for other troubleshooting.
sudo -S -u mongod mongod -f /etc/mongod.conf
MongoDB 1.6 is very old; the latest production version is 2.2, which contains a large number of bug fixes and enhancements since 1.6.
Am I correct that you haven't installed 1.6 via a package manager such as yum or aptitude? As far as I know, there are no packages for 1.6 at present. Therefore, mongod is behaving correctly, as you have not started MongoDB with a control script.
Please see this link on configuration file options.

Zookeeper: FAILED TO WRITE PID

So I'm trying to get started with Accumulo. I installed Hadoop and it runs without problems, but when I try to start ZooKeeper I get:
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
-n Starting zookeeper ...
/opt/zookeeper/bin/zkServer.sh: line 103: /tmp/zookeeper/zookeeper_server.pid: No such file or directory
FAILED TO WRITE PID
I've looked around but can't seem to find an answer.
I have had the same problem. In my case, it helped to start ZooKeeper and specify the configuration file directly:
/bin/zkServer.sh start conf/zoo.conf
I have never heard of zookeeper, but it could be a permissions issue trying to write the file zookeeper_server.pid or perhaps the directory /tmp/zookeeper/ doesn't exist and the shell script isn't accounting for that possibility. Check the permissions and existence of those directories.
ZooKeeper is distributed with a default configuration that uses /tmp/zookeeper as dataDir, just as an example. It is suggested to change this value in /path/to/zookeeper/conf/zoo.cfg to /var/lib/zookeeper.
Creating /var/lib/zookeeper needs root access, so sudo is required. Once created, the directory has the following permissions:
ls -al /var/lib/zookeeper/
drwxr-xrwx 4 root wheel 128 May 9 14:03 .
When ZooKeeper is started without root permission, it cannot write to this directory, so it fails with the error
... /usr/local/zookeeper/bin/zkServer.sh: line 169: /var/lib/zookeeper/zookeeper_server.pid: Permission denied
FAILED TO WRITE PID
You need to grant write permission so that the user starting ZooKeeper can write to /var/lib/zookeeper. In my case, as I am using it locally, I used the following command and it worked:
sudo chmod o+w /var/lib/zookeeper
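Both causes mentioned in this thread (the directory does not exist, or the current user cannot write to it) can be told apart with a quick check before touching any permissions. The following is a minimal sketch. The function name diagnose_pid_dir and its return strings are invented for this example, and the check reflects the user running the script, so it should be run as the same user that starts ZooKeeper.

```python
import os
from pathlib import Path


def diagnose_pid_dir(pid_file):
    """Explain why a process might fail to write pid_file."""
    parent = Path(pid_file).parent
    if not parent.exists():
        return "missing"
    if not parent.is_dir():
        return "not a directory"
    # Creating a file needs both write and search (execute) permission
    # on the containing directory.
    if not os.access(parent, os.W_OK | os.X_OK):
        return "not writable"
    return "ok"
```

For the first error in the question, diagnose_pid_dir("/tmp/zookeeper/zookeeper_server.pid") would be the call to try. A "missing" result points at creating the directory, and "not writable" points at the chmod or chown fixes described above.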