I'm very new to Kafka and Confluent. I wrote a Producer nearly identical to the tutorial at https://www.confluent.fr/blog/schema-registry-avro-in-spring-boot-application-tutorial/ with my own dummy model. The application.yaml is the same as well. When I send the message to ccloud, the messages being received are gibberish.
Any idea how to fix this? When I do a System.out.println of the Avro POJO before sending it to Kafka, the object looks good with all the proper values.
{
"locationId": 1,
"time": 1575950400,
"temperature": 9.45,
"summary": "Overcast",
"icon": "cloudy",
"precipitationProbability": 0.24,
...
Whereas when I download the message from ccloud, the value looks like this
[
{
"topic":"Weather",
"partition":0,
"offset":14,
"timestamp":1576008230509,
"timestampType":"CREATE_TIME",
"headers":[],
"key":"dummyKey",
"value":"\u0000\u0000\u0001��\u0002\u0002����\u000b\u0002fffff�\"#\
...
}
You're actually doing everything right :) What you're hitting is just a current limitation of the Confluent Cloud GUI when rendering Avro messages.
If you consume the message as Avro you'll see that everything is fine. Here's an example of consuming the message from Confluent Cloud using kafkacat:
$ source .env
$ docker run --rm edenhill/kafkacat:1.5.0 \
-X security.protocol=SASL_SSL -X sasl.mechanisms=PLAIN \
-X ssl.ca.location=./etc/ssl/cert.pem -X api.version.request=true \
-b ${CCLOUD_BROKER_HOST}:9092 \
-X sasl.username="${CCLOUD_API_KEY}" \
-X sasl.password="${CCLOUD_API_SECRET}" \
-r https://"${CCLOUD_SCHEMA_REGISTRY_API_KEY}":"${CCLOUD_SCHEMA_REGISTRY_API_SECRET}"@${CCLOUD_SCHEMA_REGISTRY_HOST} \
-s avro \
-t mssql-04-mssql.dbo.ORDERS \
-f 'Topic %t[%p], offset: %o (Time: %T)\nHeaders: %h\nKey: %k\nPayload (%S bytes): %s\n' \
-C -o beginning -c1
Topic mssql-04-mssql.dbo.ORDERS[2], offset: 110 (Time: 1576056196725)
Headers:
Key:
Payload (53 bytes): {"order_id": {"int": 1345}, "customer_id": {"int": 11}, "order_ts": {"int": 18244}, "order_total_usd": {"double": 2.4399999999999999}, "item": {"string": "Bread - Corn Muffaleta Onion"}}
This is the same topic that the Confluent Cloud GUI shows with a raw binary Avro value field.
I am using the REST API to get the topic data from the Kafka instance on Confluent Cloud.
I am using the below curl command:
curl "http://(myhost):9092/topics/(topicname)" --output op.txt
but I am getting junk values in op.txt, like
"U^C^C^#^B^BP"
Is there any solution to this?
You can't use REST to consume from Confluent Cloud (yet). With your data in Confluent Cloud you can use tools like ccloud or kafkacat to access your data from the command line.
Confluent Cloud CLI (ccloud)
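If the CLI isn't set up yet, you first need to log in and point it at your cluster and API key. A minimal sketch, where the cluster id lkc-12345 and the API key name are placeholders for your own values:
$ ccloud login
$ ccloud kafka cluster use lkc-12345
$ ccloud api-key use MY_API_KEY --resource lkc-12345
Then you can consume from the topic: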
$ ccloud kafka topic consume --from-beginning rmoff_test_topic_01
Starting Kafka Consumer. ^C or ^D to exit
Hello world!
This is a message on a topic in Confluent Cloud
kafkacat
You can run kafkacat locally, or use it from Docker.
docker run --rm --interactive edenhill/kafkacat:1.6.0 \
-X security.protocol=SASL_SSL -X sasl.mechanisms=PLAIN \
-X ssl.ca.location=./etc/ssl/cert.pem -X api.version.request=true \
-b $CCLOUD_BROKER_HOST \
-X sasl.username="$CCLOUD_API_KEY" \
-X sasl.password="$CCLOUD_API_SECRET" \
-t rmoff_test_topic_01 -C -u -e
Hello world!
This is a message on a topic in Confluent Cloud
You can also output the message in JSON which can be useful:
docker run --rm --interactive edenhill/kafkacat:1.6.0 \
-X security.protocol=SASL_SSL -X sasl.mechanisms=PLAIN \
-X ssl.ca.location=./etc/ssl/cert.pem -X api.version.request=true \
-b $CCLOUD_BROKER_HOST \
-X sasl.username="$CCLOUD_API_KEY" \
-X sasl.password="$CCLOUD_API_SECRET" \
-t rmoff_test_topic_01 -C -u -J -e
{
"topic": "rmoff_test_topic_01",
"partition": 0,
"offset": 0,
"tstype": "create",
"ts": 1604571163960,
"broker": 7,
"key": null,
"payload": "Hello world!"
}
{
"topic": "rmoff_test_topic_01",
"partition": 3,
"offset": 0,
"tstype": "create",
"ts": 1604571168723,
"broker": 1,
"key": null,
"payload": "This is a message on a topic in Confluent Cloud"
}
I need to produce batch messages to Kafka, so I have a file that I feed to kafkacat:
kafkacat -b localhost:9092 -t <my_topic> -T -P -l /tmp/msgs
The content of /tmp/msgs is as follows
-H "id=1"
{"key" : "value0"}
-H "id=2"
{"key" : "value1"}
When I run the kafkacat command above, it inserts four messages into Kafka - one message per line in /tmp/msgs.
I need to instruct kafkacat to parse the file correctly - that is, -H "id=1" is the header for the message {"key" : "value0"}.
How do I achieve this?
Thanks
You need to pass the headers as follows.
kcat -b localhost:9092 -t topic-name -P -H key1=value1 -H key2=value2 /temp/payload.json
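To check that the headers actually made it onto the topic, you can consume with kcat's format string, where %h prints the message headers. A quick sketch, assuming the same local broker and topic name:
kcat -b localhost:9092 -t topic-name -C -e -f 'Headers: %h\nPayload: %s\n'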
What I intend to achieve is having a Scala Spark program (in a jar) receive a POST message from a client e.g. curl, take some argument values, do some Spark processing and then return a result value to the calling client.
From the Apache Livy documentation available, I cannot find a way to invoke a compiled and packaged Spark program from a client (e.g. curl) via Livy in interactive, i.e. session, mode. Such a request/reply scenario via Livy can be done with Scala code passed in plain text to the Spark shell. But how can I do it with a Scala class in a packaged jar?
curl -k --user "admin:mypassword" -v \
-H "Content-Type: application/json" -X POST \
-d @Curl-KindSpark_ScalaCode01.json \
"https://myHDI-Spark-Clustername.azurehdinsight.net/livy/sessions/0/statements" \
-H "X-Requested-By: admin"
Instead of Scala source code as data (-d @Curl-KindSpark_ScalaCode01.json) I would rather pass the path and filename of the jar file, a class name, and argument values. But how?
Make an uber jar of your Spark app with the sbt-assembly plugin.
Upload jar file from the previous step to your HDFS cluster:
hdfs dfs -put /home/hduser/PiNumber.jar /user/hduser
Execute your job via livy:
curl -X POST -d '{"conf": {"kind": "spark" , "jars": "hdfs://localhost:8020/user/hduser/PiNumber.jar"}}' -H "Content-Type: application/json" -H "X-Requested-By: user" localhost:8998/sessions
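Once the session is up, post a statement that calls into your class from the jar. A sketch - the package, object, and method names here are hypothetical:
curl -X POST -d '{"code": "import com.example.PiNumber; PiNumber.run(sc)"}' -H "Content-Type: application/json" -H "X-Requested-By: user" localhost:8998/sessions/0/statements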
check it:
curl localhost:8998/sessions/0/statements/3
{"id":3,"state":"available","output":{"status":"ok","execution_count":3,"data":{"text/plain":"Pi is roughly 3.14256"}}}
P.S.
The Livy API for Scala/Java requires an uber jar file. sbt-assembly doesn't build the fat jar instantly, which annoys me.
Usually, I use Livy's Python API for smoke tests and tweaking.
Sanity checks with Python:
curl localhost:8998/sessions/0/statements -X POST -H 'Content-Type: application/json' -d '{"code":"print(\"Sanity check for Livy\")"}'
You can put more complicated logic into the code field.
BTW, that's how popular notebooks for Spark work - they send the source code to the cluster via Livy.
Thanks, I will try this out. In the meantime I found another solution:
$ curl -k --user "admin:" -v -H "Content-Type: application/json" -X POST -d @Curl-KindSpark_BrandSampleModel_SessionSetup.json "https://mycluster.azurehdinsight.net/livy/sessions"
with a JSON file containing
{
"kind": "spark",
"jars": ["adl://skylytics1.azuredatalakestore.net/skylytics11/azuresparklivy_2.11-0.1.jar"]
}
with the uploaded jar in the Azure Data Lake Gen1 account containing the Scala object, and then posted the statement
$ curl -k --user "admin:myPassword" -v -H "Content-Type: application/json" -X POST -d @Curl-KindSpark_BrandSampleModel_CodeSubmit.json "https://mycluster.azurehdinsight.net/livy/sessions/4/statements" -H "X-Requested-By: admin"
with the content
{
"code": "import AzureSparkLivy_GL01._; val brandModelSamplesFirstModel = AzureSparkLivyFunction.SampleModelOfBrand(sc, \"Honda\"); brandModelSamplesFirstModel"
}
So I told Livy to start an interactive Spark session and load the specified jar and passed some code to invoke a member of the object in the jar. It works. Will check your advice too.
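To read back the result of that statement, the statements endpoint of the same session can be polled. A sketch - the statement id 0 is an assumption:
$ curl -k --user "admin:myPassword" "https://mycluster.azurehdinsight.net/livy/sessions/4/statements/0"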
I have an AWS EC2 Instance running Ubuntu. I have installed Parse Server on it and it runs on localhost.
I've added a new object to the server using this command:
$ curl -X POST \
-H "X-Parse-Application-Id: APPLICATION_ID" \
-H "Content-Type: application/json" \
-d '{"score":1337,"playerName":"Sean Plott","cheatMode":false}' \
http://localhost:1337/parse/classes/GameScore
After I've added this sample object, I used this command in order to get that object back:
$ curl -X GET \
-H "X-Parse-Application-Id: APPLICATION_ID" \
http://localhost:1337/parse/classes/GameScore/2ntvSpRGIK
And I got this output, which means that it is in the database:
{
"objectId": "2ntvSpRGIK",
"score": 1337,
"playerName": "Sean Plott",
"cheatMode": false,
"updatedAt": "2016-03-11T23:51:48.050Z",
"createdAt": "2016-03-11T23:51:48.050Z"
}
But for some reason, the class GameScore isn't shown on the Parse dashboard or in my migrated MongoDB database. MongoDB is on the same server as the Parse Server.
Is this okay, or is there a problem?
I solved it by adding the database URI when starting Parse Server:
--databaseURI <DatabaseURI>
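For example, a minimal sketch of starting Parse Server against the local MongoDB - the app ID, master key, and database name here are placeholders:
parse-server --appId APPLICATION_ID --masterKey MASTER_KEY \
--databaseURI mongodb://localhost:27017/parse \
--port 1337
With the URI pointing at the right MongoDB instance, the class should then show up in the dashboard and in the database.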
I am using a Visual Recognition curl command to add a classification to an image:
curl -u "user":"password" \
-X POST \
-F "images_file=@image0.jpg" \
-F "classifier_ids=classifierlist.json" \
"https://gateway.watsonplatform.net/visual-recognition-beta/api/v2/classifiers?version=2015-12-02"
json file:
{
"classifiers": [
{
"name": "tomato",
"classifier_id": "tomato_1",
"created": "2016-03-23T17:43:11+00:00",
"owner": "xyz"
}
]
}
(Also tried without the classifiers array. Got the same error)
and getting an error:
{"code":400,"error":"Cannot execute learning task : no classifier name given"}
Is something wrong with the json?
To specify the classifiers you want to use you need to send a JSON object similar to:
{"classifier_ids": ["Black"]}
An example using Black as the classifier id in curl:
curl -u "user":"password" \
-X POST \
-F "images_file=@image0.jpg" \
-F "classifier_ids={\"classifier_ids\":[\"Black\"]}" \
"https://gateway.watsonplatform.net/visual-recognition-beta/api/v2/classify?version=2015-12-02"
If you want to list the classifier ids in a JSON file then:
curl -u "user":"password" \
-X POST \
-F "images_file=@image0.jpg" \
-F "classifier_ids=@classifier_ids.json" \
"https://gateway.watsonplatform.net/visual-recognition-beta/api/v2/classify?version=2015-12-02"
Where classifier_ids.json has:
{
"classifier_ids": ["Black"]
}
You can test the Visual Recognition API in the API Explorer.
Learn more about the service in the documentation.
The model schema you are referencing, and what is listed in the API reference, is the format of the response json. It is an example of how the API will return your results.
The format of the json that you use to specify classifiers should be a simple json object, as German suggests. In a file, it would be:
{
"classifier_ids": ["tomato_1"]
}
You also need to use < instead of @ for the service to read the contents of the JSON file correctly. (And you might need to quote the < character on the command line, since it has special meaning: input redirection.) So your curl would be:
curl -u "user":"password" \
-X POST \
-F "images_file=@image0.jpg" \
-F "classifier_ids=<classifier_ids.json" \
"https://gateway.watsonplatform.net/visual-recognition-beta/api/v2/classify?version=2015-12-02"