Is it possible to query a KSQL Table/Materialized view via HTTP? - rest

I have a materialized view created using
CREATE TABLE average_latency AS SELECT DEVICENAME, AVG(LATENCY) AS AVG_LATENCY FROM metrics WINDOW TUMBLING (SIZE 1 MINUTE) GROUP BY DEVICENAME EMIT CHANGES;
I would like to query the table average_latency via a REST API call to get the AVG_LATENCY and DEVICENAME columns in the response.
HTTP Client -> KSQL Table/Materialized view
Is this use-case possible? If so, how?

It is possible to query the internal state store used by Kafka Streams by exposing an RPC endpoint on the Streams application.
Check out the following documentation and examples provided by Confluent.
https://docs.confluent.io/platform/current/streams/developer-guide/interactive-queries.html#streams-developer-guide-interactive-queries-rpc-layer
https://github.com/confluentinc/kafka-streams-examples/blob/7.1.1-post/src/main/java/io/confluent/examples/streams/interactivequeries/kafkamusic/KafkaMusicExample.java
https://github.com/confluentinc/kafka-streams-examples/blob/4.0.x/src/main/java/io/confluent/examples/streams/interactivequeries/kafkamusic/MusicPlaysRestService.java
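With this approach the HTTP endpoint is whatever you build into your own Streams application, so the call below is only a hypothetical sketch: the host, port and /state/average-latency/... path are placeholders you would choose when wiring up a REST layer like the MusicPlaysRestService above.
# Hypothetical endpoint exposed by your own Streams application's RPC layer;
# host, port and path are placeholders defined by your REST service.
curl -s "http://streams-app-host:7070/state/average-latency/device-1"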

You can get the result of the query over HTTP. Send a POST request with your query to the ksqlDB server's /query endpoint:
curl -X "POST" "http://localhost:8088/query" \
-H "Accept: application/vnd.ksql.v1+json" \
-d $'{
"ksql": "SELECT * FROM TEST_STREAM EMIT CHANGES;",
"streamsProperties": {}
}'
This is taken from the ksqlDB developer guide:
https://docs.ksqldb.io/en/latest/developer-guide/api/
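For the table in the question, a pull query against AVERAGE_LATENCY is the closest fit. A minimal sketch, assuming a ksqlDB version that supports pull queries on windowed tables and that DEVICENAME is the table's key ('device-1' is a placeholder value):
# Pull query against the materialized table; returns the current aggregate
# for one key instead of a continuously streaming result.
curl -X "POST" "http://localhost:8088/query" \
-H "Accept: application/vnd.ksql.v1+json" \
-d $'{
"ksql": "SELECT DEVICENAME, AVG_LATENCY FROM AVERAGE_LATENCY WHERE DEVICENAME = \'device-1\';",
"streamsProperties": {}
}'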

Related

Copying data from one Cassandra table to another with TTL

We are changing the partition key of one of our tables by removing one column from it. Every record in this table also has a TTL, and we want to preserve the data in that table along with its TTL.
How can we do it?
We can create a new table with the desired schema and then copy data from the old table to the new one. However, we lose the TTL in this process.
For further information - this Cassandra table is populated by an Apache Storm application which reads events from Kafka. We could re-hydrate the Kafka messages, but Kafka contains some unwanted messages that we don't want to process.
NOTE - the TTL is derived from a date column whose value never changes, so the TTL is always the same for all columns.
Before going to a specific implementation, it's important to understand that a TTL may exist on an individual cell as well as on all cells in a row. When you perform an INSERT or UPDATE operation, you can apply only one TTL value to all columns specified in the query, so if you have two columns with different TTLs you'll need to perform two queries - one per column, each with its own TTL.
Regarding the tooling - there are two more or less ready-to-use options here:
Use DSBulk. This approach is described in detail in example 30.1 of this blog post. Basically, you unload the data to disk using a query that extracts the column values and their TTLs, and then load the data back by generating a batch for every column that has a separate TTL (a sketch adapted to your single-TTL case follows after the options). From the example:
dsbulk unload -h localhost -query \
"SELECT id, petal_length, WRITETIME(petal_length) AS w_petal_length, TTL(petal_length) AS l_petal_length, .... FROM dsbulkblog.iris_with_id" \
-url /tmp/dsbulkblog/migrate
dsbulk load -h localhost -query \
"BEGIN BATCH INSERT INTO dsbulkblog.iris_with_id(id, petal_length) VALUES (:id, :petal_length) USING TIMESTAMP :w_petal_length AND TTL :l_petal_length; ... APPLY BATCH;" \
-url /tmp/dsbulkblog/migrate --batch.mode DISABLED
Use the Spark Cassandra Connector - it supports reading & writing data with TTL & WriteTime, but you'll need to develop the code that does it and correctly handle things such as collections, static columns, etc. (or wait for SPARKC-596 to be implemented).
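Since in your case every column carries the same TTL, a single unload/load pair with DSBulk may be enough. A rough sketch adapted to that, where my_ks.old_table, my_ks.new_table, pk, ck and val are placeholder names standing in for your own schema:
# Unload the data together with the TTL of one representative column.
dsbulk unload -h localhost -query \
"SELECT pk, ck, val, TTL(val) AS l_val FROM my_ks.old_table" \
-url /tmp/migrate

# Load into the new table, re-applying that TTL to the whole inserted row.
dsbulk load -h localhost -query \
"INSERT INTO my_ks.new_table (pk, ck, val) VALUES (:pk, :ck, :val) USING TTL :l_val" \
-url /tmp/migrate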

How to write a nested query in Druid?

I am new to Druid. I have worked with MySQL databases so far. I want to know how to write the nested MySQL query below as a Druid query:
Select distinct(a.userId) as userIds
from transaction as a
where
a.transaction_type = 1
and a.userId IN (
select distinct(b.userId) from transaction as b where b.transaction_type = 2
)
I really appreciate your help.
There are a couple of things you might be interested to know since you are new to Druid.
Druid now supports SQL. It does not support every fancy and complex feature that a full SQL database does, but it does cover many standard SQL constructs. It also provides a way to submit a SQL query wrapped in Druid JSON.
Here's more detail on that, with examples:
http://druid.io/docs/latest/querying/sql
Your query is simple enough that you can use the Druid SQL feature as below:
{
  "query" : "<your_sql_query>",
  "resultFormat" : "object"
}
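As a concrete sketch of sending that JSON, assuming the SQL endpoint is enabled on the broker, that your datasource is called transaction with userId and transaction_type columns, and that your Druid version's SQL supports the IN (subquery) form:
# POST the SQL (wrapped in JSON) to the broker's SQL endpoint.
curl -X POST 'http://<broker_host>:<port>/druid/v2/sql/' \
-H 'Content-Type: application/json' \
-d '{"query":"SELECT DISTINCT a.userId FROM transaction a WHERE a.transaction_type = 1 AND a.userId IN (SELECT b.userId FROM transaction b WHERE b.transaction_type = 2)","resultFormat":"object"}'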
If you want to build a native JSON query for the query above and don't want to write the entire big JSON by hand, then try this cool trick:
Run the SQL query against the broker node with -q, and it will print the JSON query for you, which you can then use and modify as necessary. Here's the syntax for that:
curl -X POST '<queryable_host>:<port>/druid/v2/?pretty' -H 'Content-Type:application/json' -H 'Accept:application/json' -q <druid_sql_query>
In addition to this, you can also use the DruidDry library, which provides support for writing fancy Druid queries in Java.

Join on Apache Ignite REST API

Does anyone know if you can do a join using the REST API for Apache Ignite? I have two objects, account and customer, loaded into the Apache Ignite server. Both objects are loaded with data and stored in their own caches, an account object cache and a customer object cache. I am able to query both objects separately using the REST API, i.e.
http://localhost:8080/ignite?cmd=qryfldexe&pageSize=1000&cacheName=CustomerCache&qry=select+id+from+customer
http://localhost:8080/ignite?cmd=qryfldexe&pageSize=1000&cacheName=AccountCache&qry=select+id+from+account
However, I would like to execute a join on the account and customer cache. Is this supported and if so, does anyone have any examples? I can't find any documentation on this.
You need to specify one cache in cacheName and reference the second table in the JOIN via its schema name (by default that's the cache name). This is not unique to the REST API; the Java API works the same way.
The query should be something like:
SELECT *
FROM Customer AS c
JOIN "AccountCache".Account AS a
ON c.id = a.customerId
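Translated to the REST call from the question (with the SQL URL-encoded and assuming the SQL table names inside the caches are Customer and Account), that would look something like:
# Same join, sent through qryfldexe; the query string is URL-encoded
# ("AccountCache" -> %22AccountCache%22, '=' -> %3D, spaces -> '+').
curl "http://localhost:8080/ignite?cmd=qryfldexe&pageSize=1000&cacheName=CustomerCache&qry=SELECT+c.id%2C+a.id+FROM+Customer+c+JOIN+%22AccountCache%22.Account+a+ON+c.id+%3D+a.customerId"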
Try to use qryfldexe:
For example, I create the following caches:
http://apache-ignite-users.70518.x6.nabble.com/file/t1704/ss1.java
It creates two caches with the same structure.
Now I am going to execute the following command:
SELECT * FROM "mycache1".Value V1 join "mycache2".Value V2 on V1.key=V2.key
Let's use the following converter to URL-encode the query string:
https://meyerweb.com/eric/tools/dencoder/
The encoded query string will be:
SELECT%20*%20FROM%20%22mycache1%22.Value%20V1%20join%20%22mycache2%22.Value%20V2%20on%20V1.key%3DV2.key
Run the following in a browser:
http://127.0.0.1:8080/ignite?cmd=qryfldexe&pageSize=10&cacheName=mycache1&qry=SELECT%20*%20FROM%20%22mycache1%22.Value%20V1%20join%20%22mycache2%22.Value%20V2%20on%20V1.key%3DV2.key
Output:
{"successStatus":0,"error":null,"response":{"items":[[0,"Value 0",0,"Value
0"],[1,"Value 1",1,"Value 1"],[2,"Value 2",2,"Value 2"],[3,"Value
3",3,"Value 3"],[4,"Value 4",4,"Value 4"],[5,"Value 5",5,"Value
5"],[6,"Value 6",6,"Value 6"],[7,"Value 7",7,"Value 7"],[8,"Value
8",8,"Value 8"],[9,"Value 9",9,"Value
9"]],"last":false,"queryId":10,"fieldsMetadata":[{"schemaName":"mycache1","typeName":"VALUE","fieldName":"KEY","fieldTypeName":"java.lang.Integer"},{"schemaName":"mycache1","typeName":"VALUE","fieldName":"VALUE","fieldTypeName":"java.lang.String"},{"schemaName":"mycache2","typeName":"VALUE","fieldName":"KEY","fieldTypeName":"java.lang.Integer"},{"schemaName":"mycache2","typeName":"VALUE","fieldName":"VALUE","fieldTypeName":"java.lang.String"}]},"sessionToken":null}

HBase data query using REST API

To get data from an HBase table using REST we can use:
http://ip:port/tablename/base64_encoded_key
My key is a byte array of
prefix + customer_id + timestamp
byte[] rowKey = Bytes.add(Bytes.toBytes(prefix),Bytes.toBytes(customer_id),Bytes.toBytes(timestamp));
My sample key
3\x00\x00\x00\x02I9\xB1\x8B\x00\x00\x01a\x91\x88\xEFp
How do I get data from HBase using REST?
How do I get data from HBase using customer_id and a time range?
You must send an HTTP request to get your value. For example, on Linux you can easily try a GET request to fetch a single value. This example retrieves, from the table users, the row with id row1 and the cell cf:a (column a in column family cf):
curl -vi -X GET \
-H "Accept: text/xml" \
"http://example.com:20550/users/row1/cf:a"
You can see more here, including how to retrieve data with a timestamp.
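For the customer_id + time range case, two things help: the binary row key must be base64-encoded before it goes into a REST URL or body, and because the timestamp is the last part of your key, a time range for one customer is just a row range. A rough sketch using the stateful scanner endpoint, where BASE64_START_KEY / BASE64_END_KEY are the base64-encoded keys prefix + customer_id + start_ts and prefix + customer_id + end_ts, and mytable / SCANNER_ID are placeholders:
# 1. Create a scanner bounded by the base64-encoded start and end row keys;
#    the Location response header contains the new scanner's URL.
curl -vi -X PUT \
-H "Content-Type: text/xml" \
-d '<Scanner batch="100" startRow="BASE64_START_KEY" endRow="BASE64_END_KEY"/>' \
"http://example.com:20550/mytable/scanner"

# 2. Fetch batches of rows from the scanner returned above.
curl -s -H "Accept: application/json" \
"http://example.com:20550/mytable/scanner/SCANNER_ID"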

Is there any way I can get raw JSON data from Splunk for a given query?

Is there any way I can get raw JSON data from Splunk for a given query in a RESTful way?
Consider the following timechart query:
index=* earliest=<from_time> latest=<to_time> | timechart span=1s count
The key things in the query are: 1. the start/end time, 2. the time span (say, seconds), and 3. the value (say, a count).
The expected JSON response would be:
{"fields":["_time","count","_span"],
"rows":[ ["2014-12-25T00:00:00.000-06:00","1460981","1"],
...,
["2014-12-25T01:00:00.000-06:00","536889","1"]
]
}
This is what the XHR (Ajax) calls with output_mode=json_rows return, but that approach requires session and authentication setup.
I'm looking for a RESTful implementation of the same, with authentication.
You can do something like this using the curl command:
curl -k -u admin:changeme --data-urlencode search="search index=* earliest=<from_time> latest=<to_time> | timechart span=1s count" -d "output_mode=json" https://localhost:8089/servicesNS/admin/search/search/jobs/export
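If you would rather get a single JSON document shaped like the json_rows payload from the UI's XHR (instead of the streamed export output), a job-based two-step call should also work; a rough sketch, again assuming default ports and the admin credentials:
# 1. Create a search job; the XML response contains a <sid> element.
curl -k -u admin:changeme https://localhost:8089/services/search/jobs \
--data-urlencode search="search index=* earliest=<from_time> latest=<to_time> | timechart span=1s count"

# 2. Once the job is done, fetch its results in the json_rows format.
curl -k -u admin:changeme \
"https://localhost:8089/services/search/jobs/<sid>/results?output_mode=json_rows&count=0"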