How to enable JavaScript in Druid

I have been using Druid for the past week and wanted to enable JavaScript for some postAggregations.
I think I followed the outlined steps and updated the common.runtime.properties file in ../conf/druid/_common/ to include druid.javascript.enabled=true. I then stopped the current processes and re-ran the quickstart procedures, but it still says that JavaScript is disabled:
{
  "error" : "Unknown exception",
  "errorMessage" : "Instantiation of [simple type, class io.druid.query.aggregation.post.JavaScriptPostAggregator] value failed: JavaScript is disabled. (through reference chain: java.util.ArrayList[0])",
  "errorClass" : "com.fasterxml.jackson.databind.JsonMappingException",
  "host" : null
}
I am currently running it in the 'Quickstart' configuration - single local machine. Any pointers? Thanks!

JavaScript query for Druid aggregation. Save the query below as query.body and run the curl request.
This is a sample query for an average value:
curl -X POST "http://localhost:8082/druid/v2/?pretty" \
  -H 'Content-Type: application/json' -d @query.body
{
  "queryType": "groupBy",
  "dataSource": "whirldata",
  "granularity": "all",
  "dimensions": [],
  "aggregations": [
    { "name": "rows", "type": "count", "fieldName": "rows" },
    { "name": "TargetDOS", "type": "doubleSum", "fieldName": "Target DOS" }
  ],
  "postAggregations": [
    {
      "type": "javascript",
      "name": "Target DOS Average",
      "fieldNames": ["TargetDOS", "rows"],
      "function": "function(TargetDOS, rows) { return Math.abs(TargetDOS) / rows; }"
    }
  ],
  "intervals": [ "2006-01-01T00:00:00.000Z/2020-01-01T00:00:00.000Z" ]
}

The part you are missing is likely that the quickstart reads configs from conf-quickstart rather than conf. So try editing conf-quickstart/druid/_common/common.runtime.properties.
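For reference, the property itself is the same one the question already sets, just in the quickstart config file:

# conf-quickstart/druid/_common/common.runtime.properties
druid.javascript.enabled=true

Restart the quickstart processes afterwards so the property is picked up.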

Related

OpenSearch 1.2 - Index pattern programmatically

I am trying to create an index_pattern for a dashboard in OpenSearch programmatically.
Since I didn't find anything related to it in the OpenSearch documentation, I tried the Elasticsearch saved_objects API:
POST /api/saved_objects/index-pattern/my_index_pattern
{
  "attributes": {
    "title": "my_index"
  }
}
But when I call it, I get the following error:
{
  "error" : "no handler found for uri [/api/saved_objects/index-pattern/my_index_pattern?pretty=true] and method [POST]"
}
How can I solve it? Is there a different call in OpenSearch to create an index_pattern?
curl -k -X POST "http://<host>:5601/api/saved_objects/index-pattern/logs-*" \
  -H "osd-xsrf: true" -H "content-type: application/json" -d '
{
  "attributes": {
    "title": "logs-*",
    "timeFieldName": "@timestamp"
  }
}'
This works for me.
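Note that this call goes to the OpenSearch Dashboards port (5601) rather than to the OpenSearch node itself (9200), which is why the saved_objects route returned "no handler found" in the question. To verify the pattern was created, a GET against the same saved-objects path should return it (a sketch, using the same placeholder host as above):

curl -k "http://<host>:5601/api/saved_objects/index-pattern/logs-*"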

Spark Job SUBMITTED but not RUNNING after submit via REST API

Following the instructions on this website, I'm trying to submit a job to Spark via the REST API /v1/submissions.
I tried submitting the SparkPi example:
$ ./create.sh
{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20211212044718-0003",
  "serverSparkVersion" : "3.1.2",
  "submissionId" : "driver-20211212044718-0003",
  "success" : true
}
$ ./status.sh driver-20211212044718-0003
{
  "action" : "SubmissionStatusResponse",
  "driverState" : "SUBMITTED",
  "serverSparkVersion" : "3.1.2",
  "submissionId" : "driver-20211212044718-0003",
  "success" : true
}
create.sh:
curl -X POST http://172.17.197.143:6066/v1/submissions/create \
  --header "Content-Type:application/json;charset=UTF-8" --data '{
  "appResource": "/home/ruc/spark-3.1.2/examples/jars/spark-examples_2.12-3.1.2.jar",
  "sparkProperties": {
    "spark.master": "spark://172.17.197.143:7077",
    "spark.driver.memory": "1g",
    "spark.driver.cores": "1",
    "spark.app.name": "REST API - PI",
    "spark.jars": "/home/ruc/spark-3.1.2/examples/jars/spark-examples_2.12-3.1.2.jar",
    "spark.driver.supervise": "true"
  },
  "clientSparkVersion": "3.1.2",
  "mainClass": "org.apache.spark.examples.SparkPi",
  "action": "CreateSubmissionRequest",
  "environmentVariables": {
    "SPARK_ENV_LOADED": "1"
  },
  "appArgs": [ "400" ]
}'
status.sh:
export DRIVER_ID=$1
curl http://172.17.197.143:6066/v1/submissions/status/$DRIVER_ID
But when I try to get the status of the job (even after a few minutes), I get "SUBMITTED" rather than "RUNNING" or "FINISHED".
Then I looked at the logs and found:
21/12/12 04:47:18 INFO master.Master: Driver submitted org.apache.spark.deploy.worker.DriverWrapper
21/12/12 04:47:18 WARN master.Master: Driver driver-20211212044718-0003 requires more resource than any of Workers could have.
# ...
21/12/12 04:49:02 WARN master.Master: Driver driver-20211212044718-0003 requires more resource than any of Workers could have.
However, in my spark-env.sh, I have
export SPARK_WORKER_MEMORY=10g
export SPARK_WORKER_CORES=2
I have no idea what happened. How can I make it run normally?
Since you've checked the resources and have enough, it might be a network issue: the executor may not be able to connect back to the driver program. Allow traffic on both the master and the workers.
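As a quick sanity check of what the master actually sees, the standalone master's web UI also serves the cluster state as JSON (on the web UI port, 8080 by default; adjust host/port to your setup):

curl http://172.17.197.143:8080/json/

The workers array in the response lists each registered worker's total and free cores and memory, so you can confirm the workers really advertise the 10g / 2 cores you configured and are in the ALIVE state.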

Request body matching with XML declaration in Wiremock-standalone

I have a stub described in the ./mappings/*.json file.
"request": {
"method": "POST",
"url": "/some/thing",
"bodyPatterns" : [ {
"matchesXPath" : {
"expression": "//nodeA/text()",
"contains": "999"
}
} ]
}
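For context, a complete mapping file wraps a request block like this together with a response; a minimal sketch, assuming a plain 200 reply is all the stub needs:

{
  "request": {
    "method": "POST",
    "url": "/some/thing",
    "bodyPatterns": [ {
      "matchesXPath": {
        "expression": "//nodeA/text()",
        "contains": "999"
      }
    } ]
  },
  "response": {
    "status": 200,
    "body": "matched"
  }
}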
Wiremock (ver.2.26.2) is started in standalone mode.
When I call the service like this:
curl -d "<request><nodeA>999</nodeA></request>" -X POST http://localhost:8888/some/thing
I am getting the response from the stub as expected. The problem is that the request has to be sent with an XML declaration tag, e.g.
curl -d "<?xml version="1.0" encoding="UTF-8"?><request><nodeA>999</nodeA></request>" -X POST http://localhost:8888/some/thing
and in this case the request isn't matched. I tried to find something about this in the documentation, but so far no luck.
I found out that the problem was with the curl command I used. It was malformed because I used the same double quotes in the XML declaration as around the request body. Now I load the request body from a file and everything works just fine:
curl -H "Content-Type: application/xml" -d @path_to_request_file -X POST http://localhost:8888/some/thing

What is GraphQL query to determine if GitHub package exists within an Organization?

How do I query GitHub Packages to determine whether a package already exists? I want to prevent CI/CD from attempting to publish a Maven package if that package already exists.
I'm trying a query that looks like:
curl -X POST -H "Authorization: bearer <some token>" -H "Accept: application/vnd.github.packages-preview+json" -d <my query> https://api.github.com/graphql
where <my query> looks like:
query {
  organization(login: "myorg") {
    registryPackagesForQuery(packageType: MAVEN, first: 100) {
      edges {
        node {
          name
          id
        }
      }
    }
  }
}
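Note that when this query is sent through curl as above, the POST body itself has to be JSON with the query string under a "query" key and the inner quotes escaped, along these lines:

{ "query": "query { organization(login: \"myorg\") { registryPackagesForQuery(packageType: MAVEN, first: 100) { edges { node { name id } } } } }" }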
I am getting my packages returned:
{
  "data": {
    "organization": {
      "registryPackagesForQuery": {
        "edges": [
          {
            "node": {
              "name": "com.example.myorg",
              "id": "<some id>"
            }
          }
        ]
      }
    }
  }
}
However, registryPackagesForQuery also takes a query argument (of type String), and I don't know what format that query string should take. Any attempt to add a value seems to result in no data being returned.
What is the correct format of the query, and how does this query vary if I want to check whether a specific version of a package exists?
The base Query docs are here: https://developer.github.com/v4/query/.
Organization: https://developer.github.com/v4/object/organization/
RegistryPackageConnection: https://developer.github.com/v4/object/registrypackageconnection
RegistryPackage: https://developer.github.com/v4/object/registrypackage/
This may be off-topic, but wouldn't it be simpler to use a Maven plugin for that, like https://github.com/chonton/exists-maven-plugin?

How to create Avro tables using HCatalog REST API before Hive 0.14?

I couldn't find any docs explaining the syntax or grammar for the JSON used by the HCatalog REST API.
The official guide (https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+PutTable) only gives a very simple case without saying anything useful about how the JSON part is defined.
I tried the following, but without luck:
curl -X PUT -H "Content-Type: application/json" -d '
{
  "format": {
    "storedAs": "INPUTFORMAT \"org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat\" OUTPUTFORMAT \"org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat\"",
    "rowFormat": {
      "serde": {
        "name": "org.apache.hadoop.hive.serde2.avro.AvroSerDe"
      }
    }
  },
  "tblproperties": {
    "avro.schema.url": "hdfs://xxxx"
  }
}' \
'http://<host>:50111/templeton/v1/ddl/database/default/table/table1?user.name=hive'
Any ideas? Thanks!
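One workaround while the PutTable JSON format remains unclear: WebHCat also has a generic ddl resource that executes a raw HiveQL statement, so the pre-0.14 Avro DDL can be issued directly. A sketch, assuming the same host, table, schema URL, and user as above:

curl -s -d user.name=hive \
  --data-urlencode 'exec=CREATE TABLE table1
    ROW FORMAT SERDE "org.apache.hadoop.hive.serde2.avro.AvroSerDe"
    STORED AS INPUTFORMAT "org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat"
    OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat"
    TBLPROPERTIES ("avro.schema.url"="hdfs://xxxx")' \
  'http://<host>:50111/templeton/v1/ddl'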