Kafka Connect using REST API with Strimzi with kind: KafkaConnector - scala

I'm trying to use the Kafka Connect REST API to manage connectors; for simplicity, consider the following pause implementation:
import java.net.URI
import java.net.http.{HttpClient, HttpRequest}
import java.net.http.HttpRequest.BodyPublishers
import java.net.http.HttpResponse.BodyHandlers
import java.time.Duration

def pause(): Unit = {
  logger.info(s"pause() Triggered")
  val response = HttpClient.newHttpClient.send(
    HttpRequest
      .newBuilder(URI.create(config.connectUrl + s"/connectors/${config.connectorName}/pause"))
      .PUT(BodyPublishers.noBody)
      .timeout(Duration.ofMillis(config.timeout.toMillis))
      .build(),
    BodyHandlers.ofString
  )
  if (response.statusCode() != HTTPStatus.Accepted) {
    throw new Exception(s"Could not pause connector: ${response.body}")
  }
}
Since I'm using a KafkaConnector resource, I cannot use the Kafka Connect REST API: the Cluster Operator treats the KafkaConnector resources as its single source of truth, so manual changes such as pausing made directly through the Kafka Connect REST API are reverted by it.
So, to pause the connector, I need to edit the resource in some way.
I'm struggling to change the logic of the current function; it would be great to have some practical examples of how to handle KafkaConnector resources.
I checked out the Using Strimzi docs but couldn't find any practical example.
Thanks!
After help from @Jakub I managed to create my new client:
import java.util.concurrent.TimeUnit

import com.typesafe.scalalogging.StrictLogging
import io.fabric8.kubernetes.client.{Config, DefaultKubernetesClient}
import io.strimzi.api.kafka.Crds
import io.strimzi.api.kafka.model.KafkaConnector

import scala.util.{Failure, Success, Try}

class KubernetesService(config: Configuration) extends StrictLogging {

  private[this] val client = new DefaultKubernetesClient(Config.autoConfigure(config.connectorContext))

  def setPause(pause: Boolean): Unit = {
    logger.info(s"[KubernetesService] - setPause($pause) Triggered")
    val connector = getConnector()
    connector.getSpec.setPause(pause)
    Crds.kafkaConnectorOperation(client).inNamespace(config.connectorNamespace).withName(config.connectorName).replace(connector)
    // Wait until the operator reports the desired state in the resource status
    Crds.kafkaConnectorOperation(client)
      .inNamespace(config.connectorNamespace)
      .withName(config.connectorName)
      .waitUntilCondition(connector => {
        connector != null &&
        connector.getSpec.getPause == pause && {
          val desiredState = if (pause) "Paused" else "Running"
          connector.getStatus.getConditions.stream().anyMatch(_.getType.equalsIgnoreCase(desiredState))
        }
      }, config.timeout.toMillis, TimeUnit.MILLISECONDS)
  }

  def delete(): Unit = {
    logger.info(s"[KubernetesService] - delete() Triggered")
    Crds.kafkaConnectorOperation(client).inNamespace(config.connectorNamespace).withName(config.connectorName).delete()
    // Wait until the resource is actually gone
    Crds.kafkaConnectorOperation(client)
      .inNamespace(config.connectorNamespace)
      .withName(config.connectorName)
      .waitUntilCondition(_ == null, config.timeout.toMillis, TimeUnit.MILLISECONDS)
  }

  def create(oldKafkaConnect: KafkaConnector): Unit = {
    logger.info(s"[KubernetesService] - create(${oldKafkaConnect.getMetadata}) Triggered")
    Crds.kafkaConnectorOperation(client).inNamespace(config.connectorNamespace).withName(config.connectorName).create(oldKafkaConnect)
    // Wait until the connector reports a Running condition
    Crds.kafkaConnectorOperation(client)
      .inNamespace(config.connectorNamespace)
      .withName(config.connectorName)
      .waitUntilCondition(connector => {
        connector != null &&
        connector.getStatus.getConditions.stream().anyMatch(_.getType.equalsIgnoreCase("Running"))
      }, config.timeout.toMillis, TimeUnit.MILLISECONDS)
  }

  def getConnector(): KafkaConnector = {
    logger.info(s"[KubernetesService] - getConnector() Triggered")
    Try {
      Crds.kafkaConnectorOperation(client).inNamespace(config.connectorNamespace).withName(config.connectorName).get
    } match {
      case Success(connector) => connector
      case Failure(_: NullPointerException) => throw new NullPointerException(s"Failure on getConnector(${config.connectorName}) on ns: ${config.connectorNamespace}, context: ${config.connectorContext}")
      case Failure(exception) => throw exception
    }
  }
}

To pause the connector, you can edit the KafkaConnector resource and set the pause field in .spec to true (see the docs). There are several options for how to do it. You can use kubectl and either apply the new YAML from a file (kubectl apply) or edit it interactively using kubectl edit.
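For example, the paused resource would look roughly like this (a sketch only: the apiVersion depends on your Strimzi version, and the name, cluster label, and connector class below are placeholders):
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: my-connector                       # placeholder
  labels:
    strimzi.io/cluster: my-connect-cluster # placeholder Connect cluster name
spec:
  class: org.example.MySourceConnector     # placeholder connector class
  tasksMax: 1
  pause: true                              # setting this to true pauses the connector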
If you want to do it programmatically, you will need to use a Kubernetes client to edit the resource. In Java, you can also use Strimzi's api module, which has all the structures for editing the resources. I put together a simple example of pausing the Kafka connector in Java using the Fabric8 Kubernetes client and the api module:
package cz.scholz.strimzi.api.examples;

import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.dsl.MixedOperation;
import io.fabric8.kubernetes.client.dsl.Resource;
import io.strimzi.api.kafka.Crds;
import io.strimzi.api.kafka.KafkaConnectorList;
import io.strimzi.api.kafka.model.KafkaConnector;

public class PauseConnector {
    public static void main(String[] args) {
        String namespace = "myproject";
        String crName = "my-connector";

        KubernetesClient client = new DefaultKubernetesClient();
        MixedOperation<KafkaConnector, KafkaConnectorList, Resource<KafkaConnector>> op = Crds.kafkaConnectorOperation(client);

        KafkaConnector connector = op.inNamespace(namespace).withName(crName).get();
        connector.getSpec().setPause(true);
        op.inNamespace(namespace).withName(crName).replace(connector);

        client.close();
    }
}
(See https://github.com/scholzj/strimzi-api-examples for the full project)
I'm not a Scala user, but I assume it should be usable from Scala as well; I leave rewriting it from Java to Scala to you.

Related

Akka Streams for server streaming (gRPC, Scala)

I am new to Akka Streams and gRPC. I am trying to build an endpoint where the client sends a single request and the server sends multiple responses.
This is my protobuf:
syntax = "proto3";
option java_multiple_files = true;
option java_package = "customer.service.proto";
service CustomerService {
rpc CreateCustomer(CustomerRequest) returns (stream CustomerResponse) {}
}
message CustomerRequest {
string customerId = 1;
string customerName = 2;
}
message CustomerResponse {
enum Status {
No_Customer = 0;
Creating_Customer = 1;
Customer_Created = 2;
}
string customerId = 1;
Status status = 2;
}
The idea is that the client sends a customer request, the server first checks and responds with No_Customer, then sends Creating_Customer, and finally Customer_Created.
I have no idea where to start with the implementation; I have looked for hours but am still clueless. I would be very thankful if anyone could point me in the right direction.
The place to start is the Akka gRPC documentation and, in particular, the service walkthrough. It is pretty straightforward to get the samples working in a clean project.
The relevant server sample method is this:
override def itKeepsReplying(in: HelloRequest): Source[HelloReply, NotUsed] = {
  println(s"sayHello to ${in.name} with stream of chars...")
  Source(s"Hello, ${in.name}".toList).map(character => HelloReply(character.toString))
}
The problem now is to create a Source that returns the right results, but that depends on how you plan to implement the server, so it is difficult to answer in general. Check the Akka Streams documentation for various options.
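For the protobuf above, a minimal sketch of the server method could simply stage the three statuses in order. This assumes the classes Akka gRPC/ScalaPB generate from your proto (CustomerRequest, CustomerResponse and its Status enum); the one-second throttle is only there to make the staged responses observable:
import akka.NotUsed
import akka.stream.scaladsl.Source

import scala.concurrent.duration._

override def createCustomer(in: CustomerRequest): Source[CustomerResponse, NotUsed] =
  Source(
    List(
      CustomerResponse.Status.No_Customer,
      CustomerResponse.Status.Creating_Customer,
      CustomerResponse.Status.Customer_Created
    )
  )
    .map(status => CustomerResponse(customerId = in.customerId, status = status))
    .throttle(1, 1.second) // emit one status per second so the client sees each stage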
The client code is simpler: just call runForeach on the Source that is returned by CreateCustomer, as in the sample:
def runStreamingReplyExample(): Unit = {
  val responseStream = client.itKeepsReplying(HelloRequest("Alice"))
  val done: Future[Done] =
    responseStream.runForeach(reply => println(s"got streaming reply: ${reply.message}"))

  done.onComplete {
    case Success(_) =>
      println("streamingReply done")
    case Failure(e) =>
      println(s"Error streamingReply: $e")
  }
}

How to use google pubsub library with scala

I'm writing a Google Pub/Sub client in Scala using the Java API. The problem with this code is that it is not idiomatic Scala, with its use of null and the while(true) loop:
val receiver = new MessageReceiver() {
  // React to each received message, then ack/nack it
  override def receiveMessage(message: PubsubMessage, consumer: AckReplyConsumer): Unit = {
    System.out.println("Id : " + message.getMessageId)
    System.out.println("Data : " + message.getData.toStringUtf8)
    throw new RuntimeException("This is just an exception")
    consumer.ack() // note: unreachable because of the test exception above
  }
}

var subscriber: ApiService = null
try {
  // Create a subscriber for "my-subscription-id" bound to the message receiver
  subscriber = Subscriber.newBuilder(subscriptionName, receiver).build
  subscriber.startAsync
  // ...
} finally {
  // stop receiving messages
  if (subscriber != null) subscriber.stopAsync()
}

while (true) {
  Thread.sleep(1000)
}
How do I transform this code to use Scala's Future or cats IO?
Have you considered using Lightbend's Alpakka Google Cloud Pub/Sub connector?
https://developer.lightbend.com/docs/alpakka/current/google-cloud-pub-sub.html
It works very well and is idiomatic.
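For illustration, here is a rough sketch along the lines of the Alpakka documentation. It is an assumption-laden outline rather than tested code: exact signatures vary between Alpakka versions, and projectId, clientEmail, and privateKey are placeholders you must supply.
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.alpakka.googlecloud.pubsub.scaladsl.GooglePubSub
import akka.stream.alpakka.googlecloud.pubsub.{AcknowledgeRequest, PubSubConfig}

import scala.concurrent.duration._

implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer() // implicit via the ActorSystem on Akka 2.6+

val config = PubSubConfig(projectId, clientEmail, privateKey) // placeholders
val subscription = "my-subscription-id"

// A backpressured Source of messages replaces the callback and the while(true) loop
GooglePubSub
  .subscribe(subscription, config)
  .map { received =>
    println("Id : " + received.message.messageId)
    received.ackId
  }
  .groupedWithin(10, 1.second)
  .map(AcknowledgeRequest.apply) // check the constructor for your Alpakka version
  .runWith(GooglePubSub.acknowledge(subscription, config))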
There is a Pub/Sub Scala client in cats style - https://github.com/hyjay/fs2-google-cloud-pubsub
Disclaimer: I'm the author.

docker container based library to support elastic4s

I'm using elastic4s and I'm also interested in a Docker-container-based testing environment for my Elasticsearch.
There are a few libraries, such as testcontainers-scala and docker-it-scala, but I can't find how to integrate elastic4s with them. Has anyone used a Docker-container-based testing environment?
Currently my spec is very simple:
import org.scalatest.concurrent.ScalaFutures
import org.scalatest.time.{Millis, Seconds, Span}
import org.scalatest.{BeforeAndAfterAll, FreeSpec, Matchers}
// elastic4s DSL imports omitted (version-specific)

class ElasticSearchApiServiceSpec extends FreeSpec with BeforeAndAfterAll with ScalaFutures with Matchers {

  implicit val defaultPatience = PatienceConfig(timeout = Span(100, Seconds), interval = Span(50, Millis))

  val configuration: Configuration = app.injector.instanceOf[Configuration]
  val elasticSearchApiService = new ElasticSearchApiService(configuration)

  override protected def beforeAll(): Unit = {
    elasticSearchApiService.elasticClient.execute {
      index into s"peopleIndex/person" doc StringDocumentSource(PeopleFactory.rawStringGoodPerson)
    }
    // since ES is eventually consistent, wait for the document to become searchable
    Thread.sleep(3000)
  }

  override protected def afterAll(): Unit = {
    elasticSearchApiService.elasticClient.execute {
      deleteIndex("peopleIndex")
    }
  }

  "ElasticSearchApiService Tests" - {
    "elastic search service should retrieve person info properly - case existing person" in {
      val personInfo = elasticSearchApiService.getPersonInfo("2324").futureValue
      personInfo.get.name shouldBe "john"
    }
  }
}
When I run it, I have Elasticsearch running in the background from my terminal, but I want to use containers now so the tests are less environment-dependent.
I guess you don't want to depend on an ES server running on your local machine for the tests. Then the simplest approach would be to use testcontainers-scala's GenericContainer to run the official ES Docker image, like this:
import java.net.URL

import com.dimafeng.testcontainers.{ForAllTestContainer, GenericContainer}
import org.scalatest.FlatSpec
import org.testcontainers.containers.wait.strategy.Wait

import scala.io.Source

class GenericContainerSpec extends FlatSpec with ForAllTestContainer {
  override val container = GenericContainer(
    "docker.elastic.co/elasticsearch/elasticsearch:5.5.1",
    exposedPorts = Seq(9200),
    waitStrategy = Wait.forHttp("/")
  )

  "GenericContainer" should "start ES and expose 9200 port" in {
    assert(
      Source.fromInputStream(
        new URL(s"http://${container.containerIpAddress}:${container.mappedPort(9200)}/_status")
          .openConnection()
          .getInputStream)
        .mkString
        .contains("ES server is successfully installed"))
  }
}
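To connect this back to elastic4s, the missing piece is pointing the client at the container's mapped port instead of a hard-coded localhost:9200. A rough sketch, assuming an elastic4s-5.x-era HTTP client (the exact client constructor depends on your elastic4s version, and ElasticSearchApiService would need to accept the client or the address):
import com.dimafeng.testcontainers.{ForAllTestContainer, GenericContainer}
import com.sksamuel.elastic4s.ElasticsearchClientUri
import com.sksamuel.elastic4s.http.HttpClient
import org.scalatest.FreeSpec
import org.testcontainers.containers.wait.strategy.Wait

class ElasticSearchApiServiceContainerSpec extends FreeSpec with ForAllTestContainer {
  override val container = GenericContainer(
    "docker.elastic.co/elasticsearch/elasticsearch:5.5.1",
    exposedPorts = Seq(9200),
    waitStrategy = Wait.forHttp("/")
  )

  // lazy so the container is already started when the client is first used
  lazy val elasticClient = HttpClient(
    ElasticsearchClientUri(container.containerIpAddress, container.mappedPort(9200))
  )

  // ... build ElasticSearchApiService around this client and keep the
  // beforeAll/afterAll fixtures from the original spec ...
}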

Colossus Background Task

I am building an application using Tumblr's new Colossus framework (http://tumblr.github.io/colossus/). There is still limited documentation on it (and the fact that I'm still very new to Akka doesn't help), so I was wondering if someone could chime in on whether my approach is correct.
The application is simple and consists of two key components:
A thin web service layer that will queue tasks into Redis
A background worker which will poll the same Redis instance for available tasks and process them as they become available
I made a simple example to demonstrate that my concurrency model will work (and it does), which I posted below. However, I would like to make sure that there is not a more idiomatic way to do this.
import colossus.IOSystem
import colossus.protocols.http.Http
import colossus.protocols.http.HttpMethod.Get
import colossus.protocols.http.UrlParsing._
import colossus.service.{Callback, Service}
import colossus.task.Task

object QueueProcessor {
  implicit val io = IOSystem() // Create a separate IOSystem for the worker

  Task { ctx =>
    while (true) {
      // Below code is for testing purposes only. This is where the Redis loop will live,
      // using a blocking call to get the next available task.
      Thread.sleep(5000)
      println("task iteration")
    }
  }

  def ping = println("starting") // Method to launch this processor
}

object Main extends App {
  implicit val io = IOSystem() // Primary IOSystem for the web service

  QueueProcessor.ping // Launch worker

  Service.serve[Http]("app", 8080) { ctx =>
    ctx.handle { conn =>
      conn.become {
        case req @ Get on Root => Callback.successful(req.ok("Here"))
        // The methods to add tasks to the queue will live here
      }
    }
  }
}
I tested the above model and it works. The background loop continues running while the service happily accepts requests. But I think there might be a better way to do this with workers (I found nothing in the documentation), or perhaps Akka Streams?
I got it working with something that seems semi-idiomatic to me. However, new answers and feedback are still welcome!
import akka.actor.{Actor, Props}
import colossus.IOSystem
import colossus.protocols.http.Http
import colossus.protocols.http.HttpMethod.Get
import colossus.protocols.http.UrlParsing._
import colossus.service.{Callback, Service}

import scala.concurrent.{Future, blocking}

class Processor extends Actor {
  import scala.concurrent.ExecutionContext.Implicits.global

  override def receive = {
    case "start" => self ! "next"
    case "next" =>
      Future {
        blocking {
          // Blocking call here to wait on Redis (BRPOP/BLPOP)
          self ! "next"
        }
      }
  }
}

object Main extends App {
  implicit val io = IOSystem()

  val processor = io.actorSystem.actorOf(Props[Processor])
  processor ! "start"

  Service.serve[Http]("app", 8080) { ctx =>
    ctx.handle { conn =>
      conn.become {
        // Queue here
        case req @ Get on Root => Callback.successful(req.ok("Here\n"))
      }
    }
  }
}
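Since the question also mentions Akka Streams: the same polling loop can be expressed as a stream. This is a sketch rather than tested Colossus integration; the Redis call is a placeholder, and a dedicated dispatcher is advisable for real blocking I/O:
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.{Future, blocking}

implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer() // implicit via the ActorSystem on Akka 2.6+
import system.dispatcher

Source
  .repeat(())                        // an infinite stream of poll "ticks"
  .mapAsync(parallelism = 1) { _ =>
    Future(blocking {
      // blocking BRPOP/BLPOP call here; return the fetched task
      "task"
    })
  }
  .runWith(Sink.foreach(task => println(s"processing $task")))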

scala - how to subscribe to the akka leader up event

I am using Akka with Play, and I want something to be done only once when a new leader comes up.
In other words, I am looking for something like this:
class LeaderUpHook {
  def onLeaderUp {
    log.log("a new leader is up")
  }
}
I searched the clustering docs, but I still don't know how to do this.
You should be able to use the cluster events to figure this out. I'm basing my code example on the documentation under the Subscribe to Cluster Events section of the docs here. So, in short, you create an actor that subscribes to the relevant cluster events in order to determine who the leader is and when that leader is up. That actor could look like this:
import akka.actor._
import akka.cluster._

class LeaderUpHandler extends Actor {
  import ClusterEvent._

  val cluster = Cluster(context.system)
  cluster.subscribe(self, classOf[MemberUp])
  cluster.subscribe(self, classOf[LeaderChanged])

  var leader: Option[Address] = None

  def receive = {
    case state: CurrentClusterState =>
      println(s"Got current state: $state")

    case MemberUp(member) =>
      println(s"member up: $member")
      leader.filter(_ == member.address) foreach { address =>
        println("leader is now up...")
      }

    case LeaderChanged(address) =>
      println(s"leader changed: $address")
      leader = address
  }
}
Then to test this code, you could do the following:
val cfg = """
akka {
actor {
provider = "akka.cluster.ClusterActorRefProvider"
}
remote {
netty.tcp {
hostname = "127.0.0.1"
port = 2552
}
}
cluster {
seed-nodes = [
"akka.tcp://clustertest#127.0.0.1:2552"
]
auto-down = on
}
}
"""
val config = ConfigFactory.parseString(cfg).withFallback(ConfigFactory.load)
val system = ActorSystem("clustertest", config)
system.actorOf(Props[LeaderUpHandler])
When you run the above code, you should see the leader get determined to be up. This is an oversimplified example; I'm just trying to show that you can use the cluster events to find what you are looking for.