Upgrading to Spark 1.5.0 on Ambari

I am using the Hortonworks release which uses Spark 1.3.
When I ran my app using Spark API 1.5 it failed, but switching to 1.3 worked.
There are new features in 1.5 that I want to use on the server, but apparently that is not possible without upgrading the installation.
How do I upgrade Spark from 1.3 to 1.5 on Ambari?
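If a managed upgrade is not yet available for your stack, one commonly used workaround is to run a newer Spark build side by side with the Ambari-managed one: Spark on YARN is essentially a client-side installation, so you can unpack a 1.5 distribution on an edge node and point it at the cluster's Hadoop configuration. A minimal sketch (the tarball name, paths, and the application class/jar are placeholders; pick the build matching your Hadoop version):

    # Unpack Spark 1.5 next to the Ambari-managed 1.3 installation:
    wget https://archive.apache.org/dist/spark/spark-1.5.0/spark-1.5.0-bin-hadoop2.6.tgz
    tar -xzf spark-1.5.0-bin-hadoop2.6.tgz
    export SPARK_HOME="$PWD/spark-1.5.0-bin-hadoop2.6"
    export HADOOP_CONF_DIR=/etc/hadoop/conf   # the Ambari-managed Hadoop config
    # Submit against the existing YARN cluster:
    "$SPARK_HOME"/bin/spark-submit --master yarn-client \
      --class com.example.MyApp myapp.jar

On HDP you may additionally need to pass the stack version to the JVM (e.g. -Dhdp.version=...) via spark.driver.extraJavaOptions and spark.yarn.am.extraJavaOptions; the Ambari-managed Spark 1.3 service itself stays untouched.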

Related

Upgrade Apache Kafka from version 2.7 to 3.X without ZooKeeper on a production Kafka cluster

We have a production Kafka cluster on version 2.7, with 5 Kafka brokers running on RHEL 7.9.
We want to upgrade to Kafka 3.X.
The 3.X line can run without ZooKeeper, so we are wondering whether we can upgrade without any data loss.
In Kafka 2.7, the cluster metadata (broker IDs, topic names, etc.) is stored on the ZooKeeper servers.
Is it possible to perform a rolling upgrade from 2.7 to 3.x without any data loss?
The upgrade guide should contain all the information you need.
While KRaft mode (without ZooKeeper) has been production-ready since 3.3, ZooKeeper is still kept around for compatibility until the 4.0 release.
Furthermore, if I understand correctly, it is currently only possible to set up a fresh cluster in KRaft mode, not to migrate an existing ZooKeeper-based one. Kafka 3.5 is intended as the migration version, i.e. the release with which you migrate from ZooKeeper to KRaft.
This is explained quite nicely in Kafka's release notes, especially those for Kafka 3.3, and in the release video.
As long as your Kafka brokers are not still running on Java 8, you can simply do a rolling upgrade from 2.7 to 3.X as you are used to.
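For orientation, here is a rough sketch of what that rolling upgrade looks like (the property values are illustrative; follow the official upgrade guide for your exact target release):

    # On each broker, one at a time (rolling):
    # 1. In server.properties, pin the inter-broker protocol to the old version:
    #      inter.broker.protocol.version=2.7
    #    (If you have overridden log.message.format.version, leave it at 2.7 too.)
    # 2. Stop the broker, install the 3.X binaries, and start it again.
    #
    # Once ALL brokers run 3.X and the cluster is healthy:
    # 3. Bump the protocol version:
    #      inter.broker.protocol.version=3.4   # your target 3.X version
    # 4. Perform one more rolling restart, broker by broker.

The topic data itself stays on the brokers' disks throughout, and ZooKeeper keeps serving metadata during the upgrade, so no data loss is expected as long as the brokers are restarted one at a time.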

How to get the appropriate Kafka version when installing it from HDF 3.4 over HDP 3.1

I am building a Hortonworks cluster with both HDP and HDF to be installed. I first installed HDP and then installed/integrated HDF on top of it.
Environment Details:
OS: Red Hat Enterprise Linux Server release 7.6 (Maipo)
Versions: Ambari 2.7.3, HDP 3.1, HDF 3.4.0
Basically, HDP 3.1 ships Kafka 1.0.1, while the HDF packages provide Kafka 2.1.0, and I need the HDF version of Kafka to be available. Although I installed Kafka from HDF, Ambari shows Kafka version 1.0.1. After the integration with HDF, Kafka 2.1.0 does not show up in the Add Service list.
I need to know how I can get Kafka 2.1.0 installed in the cluster.
Also, is it possible that the version shown is 1.0.1 even though Kafka 2.1.0 is installed?
Is it possible that the version shown is 1.0.1 even though Kafka 2.1.0 is installed?
Doubtful. Ambari parses the packages that are installed on the machine to determine the versions.
My suggestion would be to manually SSH to each machine and try installing Kafka from yum to see which versions are available.
If you have set up the HDF YUM repos appropriately, then that version of Kafka should be available.
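As a quick check (a sketch; the exact repository and package names depend on how the HDF repos were registered on each host):

    # See which repos the host knows about and which Kafka builds they offer:
    yum repolist enabled | grep -i -e hdp -e hdf
    yum list available --showduplicates | grep -i kafka
    # HDP/HDF packages typically encode the stack version in the package name;
    # if only the HDP build (Kafka 1.0.1) shows up, the HDF repo is missing or misconfigured.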
Alternatively, you could always install Kafka/ZooKeeper externally and manage it outside of Ambari.

JHipster and Kafka 2.x well supported?

JHipster 5.7 generates a kafka.yml file referencing Kafka 1.0.0.
Is Kafka 2.x well supported (and already tested) by JHipster? If yes, is the move to Kafka 2.x planned?
Thanks

spark-cassandra-connector for Spark 1.4 and Cassandra 3.0

Spark-Cassandra experts: will Apache Spark 1.4 work with Apache Cassandra 3.0 in DataStax installations? We are considering several options for migrating from DSE 4.8 (Spark 1.4 and Cassandra 2.1) to DSE 5.0 (Spark 1.6 and Cassandra 3.0). One option is to upgrade the Cassandra cluster to DSE 5.0 and leave the Spark cluster on DSE 4.8, which means we would have to make Apache Spark 1.4 work with Apache Cassandra 3.0. We use https://github.com/datastax/spark-cassandra-connector versions 1.4 (DSE 4.8) and 1.6 (DSE 5.0). Has anyone tried using Spark 1.4 (DSE 4.8) with Cassandra 3.0 (DSE 5.0)?
As far as I can see on Maven Central, Spark Cassandra Connector 1.4.5 used version 2.1.7 of the Java driver. According to the compatibility matrix in the official documentation, the 2.1.x driver won't work with Cassandra 3.0. You can of course test it, but I doubt it will work: the driver is usually backward compatible, but not forward compatible.
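You can verify this yourself by inspecting the connector's POM on Maven Central (a sketch; the coordinates assume the Scala 2.10 build of connector 1.4.5, and the driver version may be declared via a property elsewhere in the POM):

    # Fetch the POM and look for the Java driver dependency it declares:
    curl -s https://repo1.maven.org/maven2/com/datastax/spark/spark-cassandra-connector_2.10/1.4.5/spark-cassandra-connector_2.10-1.4.5.pom \
      | grep -B 2 -A 3 'cassandra-driver-core'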
I recommend performing the migration to DSE 5.0, and then moving on to 5.1 fairly quickly, as 5.0 could reach EOL soon.
P.S. If you have more questions, I recommend joining the DataStax Academy Slack; there is a dedicated channel for the Spark Cassandra Connector there.

Spark Job Server for Spark 1.6.0

Is there a specific Spark Job Server version matching Spark 1.6.0?
As per the version information in https://github.com/spark-jobserver/spark-jobserver, I see that SJS is available only for 1.6.1, not for 1.6.0.
Our Cloudera-hosted Spark is running 1.6.0.
I deployed SJS with the Spark home configured for 1.6.1. When I submit jobs, I see that job IDs are generated, but I can't see the job results.
Any inputs?
No, there is no SJS version tied to Spark 1.6.0, but it should be easy for you to compile it against 1.6.0. You could modify the Spark version pinned in https://github.com/spark-jobserver/spark-jobserver/blob/master/project/Versions.scala#L10 and try.
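A minimal sketch of that approach (the sed pattern and the sbt task are assumptions; check the project's README for the exact build and packaging steps, e.g. bin/server_package.sh):

    git clone https://github.com/spark-jobserver/spark-jobserver.git
    cd spark-jobserver
    # Pin the Spark version in project/Versions.scala (replacement pattern assumed):
    sed -i 's/1\.6\.1/1.6.0/' project/Versions.scala
    # Rebuild the job server assembly against Spark 1.6.0:
    sbt job-server/assembly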