Unexpected "End-of-Transmission character" (\x04 or ^D) in binary - apache-beam

msg:b'\x04{"origin": "fluentd", "timestamp": "2021-12-18T01:21:17.683Z", "collected_at": "2021-12-18T01:21:17.683Z"}'
This happens with apache_beam.io.kafka (Apache Beam 2.30), using the Python API. What could be the cause?
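While the root cause isn't identified here, a minimal diagnostic sketch in Python (assuming ReadFromKafka hands you the raw value bytes, as in the message above) that strips leading ASCII control bytes before JSON parsing; note this masks the symptom rather than explaining where the \x04 comes from:

import json

def parse_value(raw: bytes) -> dict:
    # Drop any leading ASCII control bytes (such as the \x04 above)
    # before decoding the JSON payload. Workaround only.
    cleaned = raw.lstrip(bytes(range(0x20)))
    return json.loads(cleaned.decode("utf-8"))

msg = b'\x04{"origin": "fluentd", "timestamp": "2021-12-18T01:21:17.683Z", "collected_at": "2021-12-18T01:21:17.683Z"}'
print(parse_value(msg))  # {'origin': 'fluentd', ...}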

Related

Using the Beam Python SDK and PortableRunner to connect to Kafka with SSL

I have the code below for connecting to Kafka using the Python Beam SDK. I know that the ReadFromKafka transform is run in a Java SDK harness (Docker container), but I have not been able to figure out how to make ssl.truststore.location and ssl.keystore.location accessible inside the SDK harness's Docker environment. The job_endpoint argument points to java -jar beam-runners-flink-1.10-job-server-2.27.0.jar --flink-master localhost:8081
pipeline_args.extend([
    '--job_name=paul_test',
    '--runner=PortableRunner',
    '--sdk_location=container',
    '--job_endpoint=localhost:8099',
    '--streaming',
    "--environment_type=DOCKER",
    f"--sdk_harness_container_image_overrides=.*java.*,{my_beam_sdk_docker_image}:{my_beam_docker_tag}",
])

with beam.Pipeline(options=PipelineOptions(pipeline_args)) as pipeline:
    kafka = pipeline | ReadFromKafka(
        consumer_config={
            "bootstrap.servers": "bootstrap-server:17032",
            "security.protocol": "SSL",
            "ssl.truststore.location": "/opt/keys/client.truststore.jks",  # how do I make this available to the Java SDK harness?
            "ssl.truststore.password": "password",
            "ssl.keystore.type": "PKCS12",
            "ssl.keystore.location": "/opt/keys/client.keystore.p12",  # how do I make this available to the Java SDK harness?
            "ssl.keystore.password": "password",
            "group.id": "group",
            "basic.auth.credentials.source": "USER_INFO",
            "schema.registry.basic.auth.user.info": "user:password"
        },
        topics=["topic"],
        max_num_records=2,
        # expansion_service="localhost:56938"
    )
    kafka | beam.Map(lambda x: print(x))
I tried specifying the image override option as --sdk_harness_container_image_overrides='.*java.*,beam_java_sdk:latest', where beam_java_sdk:latest is a Docker image I based on apache/beam_java11_sdk:2.27.0 and that pulls the credentials in its entrypoint.sh. But Beam does not appear to use it; I see
INFO org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory - Still waiting for startup of environment apache/beam_java11_sdk:2.27.0 for worker id 1-1
in the logs, which is soon inevitably followed by
Caused by: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: Failed to load SSL keystore /opt/keys/client.keystore.p12 of type PKCS12
In conclusion, my question is this: in Apache Beam, is it possible to make files available inside the Java SDK harness Docker container from the Python Beam SDK? If so, how might it be done?
Many thanks.
Currently, there is no straightforward way to achieve this. There are ongoing discussions and tracking issues about supporting this kind of expansion service customization (see here, here, BEAM-12538 and BEAM-12539). That is the short answer.
The long answer is yes, you can do that. You would have to copy & paste ExpansionService.java into your codebase and build your custom expansion service, where you specify the default environment (DOCKER) and the default environment config (your image) here. You then have to run this expansion service manually and specify its address using the expansion_service parameter of ReadFromKafka.
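For illustration, a minimal sketch of the Python side once such a customized expansion service is up and running; the jar name and port below are placeholders, and only the expansion_service argument differs from the usual ReadFromKafka call:

import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

# Hypothetical address of your custom expansion service, started separately,
# e.g. java -jar my-custom-expansion-service.jar 8097 (name and port assumed).
CUSTOM_EXPANSION_SERVICE = "localhost:8097"

with beam.Pipeline(options=PipelineOptions(pipeline_args)) as pipeline:
    kafka = pipeline | ReadFromKafka(
        consumer_config={"bootstrap.servers": "bootstrap-server:17032"},
        topics=["topic"],
        # Point the transform at the custom service instead of letting
        # the SDK start the default one.
        expansion_service=CUSTOM_EXPANSION_SERVICE,
    )
    kafka | beam.Map(print)

Since the custom service's default environment is your own image, the keystore files can be baked into that image or mounted by whatever starts the container.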

AWS Kinesis throwing CloudWatchException

I am trying out Scala code that uses the KCL library to read a Kinesis stream. I keep getting this CloudWatchException and I would like to know why:
16:16:06.629 [aws-akka-http-akka.actor.default-dispatcher-20] DEBUG software.amazon.awssdk.request - Received error response: 400
16:16:06.638 [cw-metrics-publisher] WARN software.amazon.kinesis.metrics.CloudWatchMetricsPublisher - Could not publish 16 datums to CloudWatch
software.amazon.awssdk.services.cloudwatch.model.CloudWatchException: When Content-Type:application/x-www-form-urlencoded, URL cannot include query-string parameters (after '?'): '/?Action=PutMetricData&Version=2010-08-01&Namespace=......
Any idea what's causing this? Or is it, as I suspect, perhaps a bug in the Kinesis library?

POST to Spring/Hibernate backend with Postgres

I'm trying to send this JSON to my Java backend
{
    "title": "TITOLO",
    "description": "DESCRIZIONE",
    "mediaType": "MEDIATYPE"
}
but for some reason, I cannot write it correctly to my DB. Some teammates can do it without problems, but I get this error from Postman:
{
    "timestamp": 1526753419760,
    "status": 500,
    "error": "Internal Server Error",
    "message": "Type definition error: [simple type, class isssr.ticketsystem.entity.Ticket]; nested exception is com.fasterxml.jackson.databind.exc.InvalidDefinitionException: Cannot construct instance of `isssr.ticketsystem.entity.Ticket` (no Creators, like default construct, exist): cannot deserialize from Object value (no delegate- or property-based Creator)\n at [Source: (PushbackInputStream); line: 2, column: 2]",
    "path": "/ticketsystem/ticket"
}
I can write to their local DBs without problems, but they cannot do the same to mine. Why is that?
My OS is macOS High Sierra 10.13.4, and my IDE is IntelliJ IDEA.
UPDATE
The problem was Java 10. After downgrading to Java 8, everything works fine.

Starting KsqlRestApplication from Scala and getting NoSuchMethodError org.apache.kafka.streams.StreamsConfig.getConsumerConfigs

I am trying to write a program that lets me run predefined KSQL operations on Kafka topics in Scala, but I don't want to open the KSQL CLI every time. Therefore, I want to start the KSQL "server" from within my Scala program. If I understand the KSQL source code correctly, I have to build and start a KsqlRestApplication:
def restServer = KsqlRestApplication.buildApplication(
  new KsqlRestConfig(defaultServerProperties),
  true,
  new VersionCheckerAgent {
    override def start(ksqlModuleType: KsqlModuleType, properties: Properties): Unit = ???
  }
)
But when I try doing that, I get the following error:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.kafka.streams.StreamsConfig.getConsumerConfigs(Ljava/lang/String;Ljava/lang/String;)Ljava/util/Map;
at io.confluent.ksql.rest.server.BrokerCompatibilityCheck.create(BrokerCompatibilityCheck.java:62)
at io.confluent.ksql.rest.server.KsqlRestApplication.buildApplication(KsqlRestApplication.java:241)
I looked into the function call in BrokerCompatibilityCheck: in its create function it calls StreamsConfig.getConsumerConfigs() with two Strings as parameters, instead of the parameters defined in
https://kafka.apache.org/0102/javadoc/org/apache/kafka/streams/StreamsConfig.html#getConsumerConfigs(StreamThread,%20java.lang.String,%20java.lang.String).
Are my KSQL and Kafka versions simply not compatible, or am I doing something wrong?
I am using KSQL version 4.1.0-SNAPSHOT and Kafka version 1.0.0.
Yes, NoSuchMethodError typically indicates a version incompatibility between libraries.
The link you posted is to the Javadoc for Kafka 0.10.2. The method hasn't changed in 1.0, but in the upcoming 1.1 it indeed takes only two Strings:
https://kafka.apache.org/11/javadoc/org/apache/kafka/streams/StreamsConfig.html#getConsumerConfigs(java.lang.String,%20java.lang.String)
That suggests the version of KSQL you're using (4.1.0-SNAPSHOT) depends on version 1.1 of Kafka Streams, which is currently in the release-candidate phase and should be out soon:
https://lists.apache.org/thread.html/780c4458b16590e99261b69d7b41b9ec374a3226d72c8d38885a008a#%3Cusers.kafka.apache.org%3E
As per that email, you can find the latest (1.1.0-rc2) artifacts in the Apache staging repo:
https://repository.apache.org/content/groups/staging/

Gremlin Server RESTful API: Error encountered evaluating script

I am running a Gremlin Server and using the RESTful API to query it, but I'm confused by this error:
{"message":"Error encountered evaluating script: g.V().next()"}
But it is such a simple script. For other scripts like "100-1" and "g", the query result is OK. I have checked the spelling and characters many times, and I have also checked the Gremlin Server logs, but there were no related records. So I'm asking for your help. Thanks!
You would run into that error if you don't have any data in the graph because it is an unchecked traversal. You should try a query like:
if (g.V().hasNext()) { g.V().next() }
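For reference, a minimal sketch of submitting that guarded script over the HTTP/REST endpoint from Python (localhost:8182 is the Gremlin Server default address; the requests library is an assumption, not part of the original setup):

import requests

# Gremlin Server's HTTP endpoint accepts a JSON body with a "gremlin" key.
resp = requests.post(
    "http://localhost:8182",
    json={"gremlin": "if (g.V().hasNext()) { g.V().next() }"},
)
print(resp.status_code, resp.json())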
If you're using Apache TinkerPop 3.1.2 or later, you will see a more informative stack trace in the Gremlin Server logs:
[WARN] HttpGremlinEndpointHandler - Invalid request - responding with 500 Internal Server Error and Error encountered evaluating script: g.V().next()
org.apache.tinkerpop.gremlin.process.traversal.util.FastNoSuchElementException