Cassandra upgrade from 0.6 to 0.7.2

I followed the instructions in NEWS.txt to upgrade cassandra 0.6 to 0.7.2.
The process to upgrade is:
1) run "nodetool drain" on each 0.6 node. When drain finishes (log
message "Node is drained" appears), stop the process.
2) Convert your storage-conf.xml to the new cassandra.yaml using
"bin/config-converter".
3) Rename any of your keyspace or column family names that do not adhere
to the '^\w+' regex convention.
4) Start up your cluster with the 0.7 version.
5) Initialize your Keyspace and ColumnFamily definitions using
"bin/schematool import". You only need to do
this to one node.
I did the first three steps: drained the node, stopped Cassandra 0.6, and converted the old storage-conf.xml to cassandra.yaml.
I start Cassandra 0.7.2 using "bin/cassandra -f", but it always complains with the following error. I am wondering whether I followed the right instructions. If so, how can I fix this problem?
"Fatal configuration error
org.apache.cassandra.config.ConfigurationException: saved_caches_directory missing"

The default location for saved_caches_directory is /var/lib/cassandra/saved_caches (from the wiki). Try creating that directory manually (don't forget user permissions).
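A minimal sketch of that fix, assuming the default path from the wiki and a cassandra service user (adjust both to your setup):
sudo mkdir -p /var/lib/cassandra/saved_caches
sudo chown -R cassandra:cassandra /var/lib/cassandra/saved_caches
and make sure cassandra.yaml points at it:
saved_caches_directory: /var/lib/cassandra/saved_caches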

I figured it out: I had to delete the old commit log and the files in the system directory.
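If you kept the stock 0.6 defaults, that cleanup would look roughly like the following; the paths are assumptions, so double-check the ones in your old storage-conf.xml before deleting anything:
# stop Cassandra first, then remove the old commit log and system keyspace files
rm -rf /var/lib/cassandra/commitlog/*
rm -rf /var/lib/cassandra/data/system/*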

Related

Command confluent local services start gives an error: Starting ZooKeeper Error: ZooKeeper failed to start

I'm trying to run this command: confluent local services start
I don't know why, but each time it gives me an error before passing to the next step, so I had to run it again over and over until it passed all the steps.
What is the reason for the error and how do I solve the problem?
You need to open the log files to inspect the errors that are actually happening.
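As a hedged pointer on where to look: recent versions of the Confluent CLI have a confluent local current command that prints the temporary directory the local services write into; the exact command and directory layout vary by CLI version, so verify with confluent local --help.
confluent local current
# then look under the printed directory for the zookeeper subfolder and read its log/stdout files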
But it's possible the services are hitting a race condition: Schema Registry requires Kafka, and REST Proxy and Connect require the Schema Registry... Maybe they are not waiting for the previous components to start.
Or maybe your machine does not have enough resources to start all the services. E.g. I believe at least 6GB of RAM is necessary; if you have 8GB on the machine but Chrome and lots of other services are running, then you wouldn't have 6GB readily available.

How to run MirrorMaker 2.0 in production?

From the documentation, I see MirrorMaker 2.0 being run like this on the command line:
./bin/connect-mirror-maker.sh mm2.properties
In my case I can go to an EC2 instance and enter this command.
But what is the correct practice if this needs to run for many days? It is possible that the EC2 instance gets terminated, etc.
Therefore, I am trying to find out the best practices if you need to run MirrorMaker 2.0 for a long period and want to make sure that it is able to stay up and running without any manual intervention.
You have many options; these include:
Add it as a service to systemd. Then you can specify that it should be started automatically and restarted on failure (a sketch unit file is shown after this list). systemd is very common now, but if you're not running systemd, there are many other process managers. https://superuser.com/a/687188/80826.
Running in a Docker container, where you can specify a restart policy.
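For the systemd option, here is a minimal sketch of a unit file such as /etc/systemd/system/mirror-maker2.service; the install path, properties path, and kafka user are assumptions to adapt to your instance:
[Unit]
Description=Kafka MirrorMaker 2.0
After=network.target

[Service]
User=kafka
# assumed locations for the Kafka distribution and the MM2 properties file
ExecStart=/opt/kafka/bin/connect-mirror-maker.sh /opt/kafka/config/mm2.properties
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
After saving it, sudo systemctl daemon-reload followed by sudo systemctl enable --now mirror-maker2 starts the process and keeps it running across failures and reboots.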

Session isn't active Pyspark in an AWS EMR cluster

I have spun up an AWS EMR cluster, and in a PySpark3 Jupyter notebook I run this code:
"..
textRdd = sparkDF.select(textColName).rdd.flatMap(lambda x: x)
textRdd.collect().show()
.."
I got this error:
An error was encountered:
Invalid status code '400' from http://..../sessions/4/statements/7 with error payload: {"msg":"requirement failed: Session isn't active."}
Running the line:
sparkDF.show()
works!
I also created a small subset of the file and all my code runs fine.
What is the problem?
I had the same issue, and the reason for the timeout is the driver running out of memory. Since you run collect(), all the data gets sent to the driver. By default the driver memory is 1000M when creating a Spark application through JupyterHub, even if you set a higher value through config.json. You can see that by executing the following code from within a Jupyter notebook:
spark.sparkContext.getConf().get('spark.driver.memory')
1000M
To increase the driver memory, just do:
%%configure -f
{"driverMemory": "6000M"}
This will restart the application with increased driver memory. You might need to use higher values for your data. Hope it helps.
From this Stack Overflow question's answer, which worked for me:
Judging by the output, if your application is not finishing with a FAILED status, it sounds like a Livy timeout error: your application is likely taking longer than the defined timeout for a Livy session (which defaults to 1h), so even though the Spark app succeeds, your notebook will receive this error if the app takes longer than the Livy session's timeout.
If that's the case, here's how to address it:
1. edit the /etc/livy/conf/livy.conf file (in the cluster's master node)
2. set livy.server.session.timeout to a higher value, like 8h (or larger, depending on your app); see the snippet after these steps
3. restart Livy to update the setting: sudo restart livy-server in the cluster's master
4. test your code again
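A minimal sketch of that edit in /etc/livy/conf/livy.conf, using the 8h value from the steps above as an example:
# allow notebook-driven Spark apps to run up to 8 hours before Livy kills the session
livy.server.session.timeout = 8h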
An alternative way to edit this setting is described here: https://allinonescript.com/questions/54220381/how-to-set-livy-server-session-timeout-on-emr-cluster-boostrap
Just a restart helped solve this problem for me. In your Jupyter notebook, go to Kernel -> Restart.
Once done, if you run a cell with the "spark" command you will see that a new Spark session gets established.
You might get some insights from this similar Stack Overflow thread: Timeout error: Error with 400 StatusCode: "requirement failed: Session isn't active."
The solution might be to increase spark.executor.heartbeatInterval. The default is 10 seconds.
See EMR's official documentation on how to change Spark defaults:
You change the defaults in spark-defaults.conf using the spark-defaults configuration classification or the maximizeResourceAllocation setting in the spark configuration classification.
Insufficient reputation to comment: I tried increasing the heartbeat interval to a much higher value (100 seconds), still the same result. FWIW, the error shows up in under 9 seconds.
What worked for me is adding {"Classification": "spark-defaults", "Properties": {"spark.driver.memory": "20G"}} to the EMR configuration.
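For context, here is a sketch of how that snippet sits inside the JSON list EMR accepts as cluster configuration; the 20G value comes from the answer above, so adjust it for your workload:
[
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.driver.memory": "20G"
    }
  }
]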

How do I upgrade Apache Kafka on Linux?

I have a novice question on Kafka upgrades. This is the first time I'm upgrading Kafka on Linux.
My current version is "kafka_2.11-1.0.0.tgz"; when I initially set it up, it created a folder named kafka_2.11-1.0.0.
Now I have downloaded a new version, "kafka_2.12-2.3.0.tgz". Extracting it is going to create a new folder, kafka_2.12-2.3.0, which will result in two independent Kafka installations, each with its own server.properties.
As per the documentation, I have to update server.properties with the two properties below:
inter.broker.protocol.version=2.3
log.message.format.version=2.3
How does this work if the new version is going to be installed in a new directory with a new server.properties?
How can I merge server.properties and do the upgrade? Please share any documents or steps you have.
It's fairly simple to upgrade Kafka.
It would have been easier if you had separated the config files from the binary directories; as a result, from what I understand, your config file lives inside the untarred package folder.
You can put the config file in /etc/kafka the next time you package it on your Linux server.
What you can do here is, after untarring your kafka_2.12-2.3.0.tgz file, copy the former server.properties (and any other config files you use) over the one in the 2.3.0 directory tree.
But be careful: for the inter.broker.protocol.version=2.3 and log.message.format.version=2.3 parameters, you must first specify the former version for those parameters (and message.format is not mandatory to change; double-check the doc for this one) before doing your rolling restart.
If you are using 1.0 now, just put the following :
inter.broker.protocol.version=1.0 and log.message.format.version=1.0
then restart your brokers one by one (using the new package folder this time)
Then edit them again as follows :
inter.broker.protocol.version=2.3 and log.message.format.version=2.3 and do a second rolling restart.
Then you should be good. A sketch of the two phases in server.properties follows below.
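A minimal sketch of what the two phases look like in the copied server.properties, using the values from the steps above; keep only one pair of lines active at a time:
# phase 1: run the 2.3.0 binaries but keep speaking the old protocol during the first rolling restart
inter.broker.protocol.version=1.0
log.message.format.version=1.0
# phase 2: once every broker runs 2.3.0, switch both lines and do the second rolling restart
# inter.broker.protocol.version=2.3
# log.message.format.version=2.3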
More details here :
https://kafka.apache.org/documentation/#upgrade_2_3_0
Yannick

Replica set never finishes cloning primary node

We're working with an average sized (50GB) data set in MongoDB and are attempting to add a third node to our replica set (making it primary-secondary-secondary). Unfortunately, when we bring the nodes up (with the appropriate command line arguments associating them with our replica set), the nodes never exit the RECOVERING stage.
Looking at the logs, it seems as though the nodes ditch all of their data as soon as the recovery completes and start syncing again.
We're using version 2.0.3 on all of the nodes and have tried adding the third node both from a "clean" (empty db) state and from a bootstrapped state (using mongodump to take a snapshot of the primary database and mongorestore to load that snapshot into the new node); each attempt failed.
We've observed this recurring phenomenon over the past 24 hours and any input/guidance would be appreciated!
It's hard to be certain without looking at the logs, but it sounds like you're hitting a known issue in MongoDB 2.0.3. Check out http://jira.mongodb.org/browse/SERVER-5177. The problem is fixed in 2.0.4, which has an available release candidate.
I don't know if it helps, but when I got that problem, I erased the replica's DB and re-initiated it. It started from scratch and replicated OK. Worth a try, I guess.
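A hedged sketch of that wipe-and-resync, assuming a default dbpath of /var/lib/mongodb, a mongodb service name, and a node already configured with --replSet (adjust all three, and only do this on the stuck secondary):
# stop the stuck secondary, wipe its data so it performs a fresh initial sync, then start it again
sudo service mongodb stop
rm -rf /var/lib/mongodb/*
sudo service mongodb start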