Substream Source(EntitySource) cannot be materialized more than once - scala

I'am trying to implement file upload functionality in my application using Akka HTTP. I am using Akka version 2.5.32 and Scala version 2.11.12. I write a simple code with directive storeUploadedFile, like in documentation (https://doc.akka.io/docs/akka-http/current/routing-dsl/directives/file-upload-directives/storeUploadedFile.html):
def uploadFile: Route = (post & path("uploadFile")) {
storeUploadedFile("uploadFile", getDestination) {
(info, file) => complete(JsObject("path" -> JsString(file.getAbsolutePath)))
}
}
private def getDestination(info: FileInfo): File = File.createTempFile(tmpDir, info.fileName)
But when i try to send a request to my application:
curl -F "uploadFile=#data.csv" -X POST localhost:8080/uploadFile
I'll get internal server error and exception:
ERROR [ActorSystemImpl] - Error during processing of request: 'Substream Source(EntitySource) cannot be materialized more than once'. Completing with 500 Internal Server Error response. To change default exception handling behavior, provide a custom ExceptionHandler.
java.lang.IllegalStateException: Substream Source(EntitySource) cannot be materialized more than once
What should i do to upload file without this exception?

Related

Spark Session returned an error : Apache NiFi

We are trying to run a spark program using NiFi. This is the basic sample we tried to follow.
We have configured Apache-Livy server in 127.0.0.1:8998.
ExecutiveSparkInteractive processor is used to run sample Spark code.
val gdpDF = spark.read.json("gdp.json")
val gdpRDD = gdpDF.rdd
gdpRDD.count()
LivyController is confiured for 127.0.0.1 port 8998 and Session Type : spark.
When we run the processor we get following error :
Spark Session returned an error, sending the output JSON object as the flow file content to failure (after penalizing)
We just want to output the line count in JSON file. How to redirect it to flowfile?
NiFi User log :
2020-04-13 21:50:49,955 INFO [NiFi Web Server-85]
org.apache.nifi.web.filter.RequestLogger Attempting request for
(anonymous) GET
http://localhost:9090/nifi-api/flow/controller/bulletins (source ip:
127.0.0.1)
NiFi app.log
ERROR [Timer-Driven Process Thread-3]
o.a.n.p.livy.ExecuteSparkInteractive
ExecuteSparkInteractive[id=9a338053-0173-1000-fbe9-e613558ad33b] Spark
Session returned an error, sending the output JSON object as the flow
file content to failure (after penalizing)
I have seen several people struggling with this example. I recommend following this example from the Cloudera Community (especially note part 2).
https://community.cloudera.com/t5/Community-Articles/HDF-3-1-Executing-Apache-Spark-via-ExecuteSparkInteractive/ta-p/247772
The key points I would be concerned with:
Does your spark work in general
Does your livy work in general
Is the Spark sample code good

How to query flink's queryable state

I am using flink 1.8.0 and I am trying to query my job state.
val descriptor = new ValueStateDescriptor("myState", Types.CASE_CLASS[Foo])
descriptor.setQueryable("my-queryable-State")
I used port 9067 which is the default port according to this, my client:
val client = new QueryableStateClient("127.0.0.1", 9067)
val jobId = JobID.fromHexString("d48a6c980d1a147e0622565700158d9e")
val execConfig = new ExecutionConfig
val descriptor = new ValueStateDescriptor("my-queryable-State", Types.CASE_CLASS[Foo])
val res: Future[ValueState[Foo]] = client.getKvState(jobId, "my-queryable-State","a", BasicTypeInfo.STRING_TYPE_INFO, descriptor)
res.map(_.toString).pipeTo(sender)
but I am getting :
[ERROR] [06/25/2019 20:37:05.499] [bvAkkaHttpServer-akka.actor.default-dispatcher-5] [akka.actor.ActorSystemImpl(bvAkkaHttpServer)] Error during processing of request: 'org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:9067'. Completing with 500 Internal Server Error response. To change default exception handling behavior, provide a custom ExceptionHandler.
java.util.concurrent.CompletionException: org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:9067
what am I doing wrong ?
how and where should I define QueryableStateOptions
So If You want to use the QueryableState You need to add the proper Jar to Your flink. The jar is flink-queryable-state-runtime, it can be found in the opt folder in Your flink distribution and You should move it to the lib folder.
As for the second question the QueryableStateOption is just a class that is used to create static ConfigOption definitions. The definitions are then used to read the configurations from flink-conf.yaml file. So currently the only option to configure the QueryableState is to use the flink-conf file in the flink distribution.
EDIT: Also, try reading this]1 it provides more info on how does Queryable State works. You shouldn't really connect directly to the server port but rather You should use the proxy port which by default is 9069.

Authenticate with ECE ElasticSearch Sink from Apache Fink (Scala code)

Compiler error when using example provided in Flink documentation. The Flink documentation provides sample Scala code to set the REST client factory parameters when talking to Elasticsearch, https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/elasticsearch.html.
When trying out this code i get a compiler error in IntelliJ which says "Cannot resolve symbol restClientBuilder".
I found the following SO which is EXACTLY my problem except that it is in Java and i am doing this in Scala.
Apache Flink (v1.6.0) authenticate Elasticsearch Sink (v6.4)
I tried copy pasting the solution code provided in the above SO into IntelliJ, the auto-converted code also has compiler errors.
// provide a RestClientFactory for custom configuration on the internally created REST client
// i only show the setMaxRetryTimeoutMillis for illustration purposes, the actual code will use HTTP cutom callback
esSinkBuilder.setRestClientFactory(
restClientBuilder -> {
restClientBuilder.setMaxRetryTimeoutMillis(10)
}
)
Then i tried (auto generated Java to Scala code by IntelliJ)
// provide a RestClientFactory for custom configuration on the internally created REST client// provide a RestClientFactory for custom configuration on the internally created REST client
import org.apache.http.auth.AuthScope
import org.apache.http.auth.UsernamePasswordCredentials
import org.apache.http.client.CredentialsProvider
import org.apache.http.impl.client.BasicCredentialsProvider
import org.apache.http.impl.nio.client.HttpAsyncClientBuilder
import org.elasticsearch.client.RestClientBuilder
// provide a RestClientFactory for custom configuration on the internally created REST client// provide a RestClientFactory for custom configuration on the internally created REST client
esSinkBuilder.setRestClientFactory((restClientBuilder) => {
def foo(restClientBuilder) = restClientBuilder.setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
override def customizeHttpClient(httpClientBuilder: HttpAsyncClientBuilder): HttpAsyncClientBuilder = { // elasticsearch username and password
val credentialsProvider = new BasicCredentialsProvider
credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(es_user, es_password))
httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider)
}
})
foo(restClientBuilder)
})
The original code snippet produces the error "cannot resolve RestClientFactory" and then Java to Scala shows several other errors.
So basically i need to find a Scala version of the solution described in Apache Flink (v1.6.0) authenticate Elasticsearch Sink (v6.4)
Update 1: I was able to make some progress with some help from IntelliJ. The following code compiles and runs but there is another problem.
esSinkBuilder.setRestClientFactory(
new RestClientFactory {
override def configureRestClientBuilder(restClientBuilder: RestClientBuilder): Unit = {
restClientBuilder.setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
override def customizeHttpClient(httpClientBuilder: HttpAsyncClientBuilder): HttpAsyncClientBuilder = {
// elasticsearch username and password
val credentialsProvider = new BasicCredentialsProvider
credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(es_user, es_password))
httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider)
httpClientBuilder.setSSLContext(trustfulSslContext)
}
})
}
}
The problem is that i am not sure if i should be doing a new of the RestClientFactory object. What happens is that the application connects to the elasticsearch cluster but then discovers that the SSL CERT is not valid, so i had to put the trustfullSslContext (as described here https://gist.github.com/iRevive/4a3c7cb96374da5da80d4538f3da17cb), this got me past the SSL issue but now the ES REST Client does a ping test and the ping fails, it throws an exception and the app shutsdown. I am suspecting that the ping fails because of the SSL error and maybe it is not using the trustfulSslContext i setup as part of new RestClientFactory and this makes me suspect that i should not have done the new, there should be a simple way to update the existing RestclientFactory object and basically this is all happening because of my lack of Scala knowledge.
Happy to report that this is resolved. The code i posted in Update 1 is correct. The ping to ECE was not working for two reasons:
The certificate needs to include the complete chain including the root CA, the intermediate CA and the cert for the ECE. This helped get rid of the whole trustfulSslContext stuff.
The ECE was sitting behind an ha-proxy and the proxy did the mapping for the hostname in the HTTP request to the actual deployment cluster name in ECE. this mapping logic did not take into account that the Java REST High Level client uses the org.apache.httphost class which creates the hostname as hostname:port_number even when the port number is 443. Since it did not find the mapping because of the 443 therefore the ECE returned a 404 error instead of 200 ok (only way to find this was to look at unencrypted packets at the ha-proxy). Once the mapping logic in ha-proxy was fixed, the mapping was found and the pings are now successfull.

Pass Arraylist in post request to neo4j

I want to pass an array list to post request to Neo4j server, instead I get an warning at startup when I try run systemctl status neo4j.
WARN Failed to load plugin [ServerPlugin[KShortestPaths]]:
Unsupported parameter type: java.util.ArrayList
I want to pass something like:
{
"concept1":"C0000039",
"vocabulary": ["MSH", "NDFRT"]
}

Disable SSL with Scala Dispatch Library

I am currently in the process of moving all our Rest tests to the CI server and have noticed that all tests are failing due to the an SSL handshake, now I have successfully disabled this with the TrustManager with our Java test suite, but am unsure how to do it with Scala dispatch library, and havent been able to find many examples that could apply in this scenario.
val JSONstr = "{samplekey:samplevalue}"
val response:String = Http(url("https://www.host.com/path/to/post")
<< (checkInJSONstr, "application/json") as_str)
The following exception is occuring as expected:
javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated at com.sun.net.ssl.internal.ssl.SSLSessionImpl.getPeerCertificates(SSLSessionImpl.java:352)
...
Is there a way to do it cleanly syntactically ignore SSL with the dispatch library?
import dispatch._
val http = new Http with HttpsLeniency
val response:String = http(url("https://www.host.com/path/to/post")
<< (checkInJSONstr, "application/json") as_str)
viktortnk's answer doesn't work anymore with the newest version of Dispatch (0.13.2). For the newest version, you can use the following to create an http client that accepts any certificate:
val myHttp = Http.withConfiguration(config => config.setAcceptAnyCertificate(true))
Then you can use it for GET requests like this:
myHttp(url("https://www.host.com/path").GET OK as.String)
I found this out here: Why does dispatch throw "java.net.ConnectException: General SSLEngine ..." and "unexpected status" exceptions for a particular URL?