How to publish a table in DolphinDB? - streaming

I have a table like the following on a DolphinDB server:
n=100000
t=streamTable(rand(100,n) as id, rand(100.0,n) as val)
share t as st;
How can I get the data on the client side, given that the client isn't a DolphinDB node?

If the client isn't a DolphinDB data node, use the corresponding language API to subscribe to the stream table on a DolphinDB data node. The supported API languages are Java, C++, C#, Go, Rust, Python, R, and JavaScript (web). See https://github.com/dolphindb for the corresponding language APIs.
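For instance, with the Python API, a subscription to the shared stream table st might look like the sketch below. The host name, ports, action name, and the running-sum handler are all placeholders for illustration; by default the handler receives one record at a time as a list of column values.

```python
from collections import defaultdict

# Running totals per id, updated as rows arrive from the stream.
totals = defaultdict(float)

def handle_row(row):
    # With default subscribe options, each message is one record as a
    # plain list of column values, here [id, val].
    record_id, val = row
    totals[record_id] += val

def main():
    # Assumption: the official DolphinDB Python API (pip install dolphindb)
    # and a data node at server_host:8848 publishing the shared table `st`.
    import dolphindb as ddb
    s = ddb.session()
    s.enableStreaming(8900)  # local port on which this client receives data
    s.subscribe("server_host", 8848, handle_row,
                tableName="st", actionName="sumByIdDemo", offset=0)

# Call main() to start the subscription against a live server.
```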

Related

Is it possible to publish a message to Google Pub/Sub whenever data is inserted or updated in Google Cloud SQL?

I'm new to Google Cloud SQL and Pub/Sub. I couldn't find documentation anywhere about this, but another question's accepted and upvoted answer seems to say it is possible to publish a Pub/Sub message whenever an insert happens in the database. Excerpt from that answer:
2 - The ideal solution would be to create the Pub/Sub topic and publish to it when you insert new data to the database.
But since my question is a different one, I asked a new question here.
Background: I'm using a combination of Google Cloud SQL, Firestore and Realtime Database for my app for its own unique strengths.
What I want to do is to be able to write into Firestore and Realtime Database once an insert succeeds in Google Cloud SQL. According to the answer above, these are the steps I should take:
The app calls a Cloud Function to insert data into the Google Cloud SQL database (PostgreSQL). Note: the Postgres tables have some important constraints and triggers that call Postgres functions; that's why we want to start here.
When the insert is successful I want Google Cloud SQL to publish a message to Pub/Sub.
Then there is another Cloud Function that subscribes to the Pub/Sub topic. This function will write into Firestore / Realtime Database accordingly.
I got steps #1 & #3 all figured out. The solution I'm looking for is for step #2.
The answer in the other question is simply suggesting that your code do both of the following:
Write to Cloud SQL.
If the write is successful, send a message to a pubsub topic.
There isn't anything that will automate or simplify either of these tasks. There are no triggers for Cloud Functions that will respond to writes to Cloud SQL. You write code for task 1, then write the code for task 2. Both of these things should be straightforward and covered in product documentation. I suggest making an attempt at both (separately), and posting again with the code you have that isn't working the way you expect.
If you need to get started with Pub/Sub, there are SDKs for pretty much every major server platform, and its documentation covers sending a message.
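The "write, then publish on success" shape of the Cloud Function can be kept decoupled and testable by injecting both operations. This is a sketch with hypothetical names; in the real function, insert_fn would run the INSERT against Cloud SQL and publish_fn would wrap the Pub/Sub client's publish call.

```python
import json

def insert_and_publish(record, insert_fn, publish_fn):
    """Insert `record` via insert_fn; only on success, publish it.

    insert_fn:  callable(record) -> inserted row id (raises on failure,
                so nothing is published for a failed write)
    publish_fn: callable(message_bytes) -> None, e.g. a thin wrapper
                around a Pub/Sub publisher client (assumption).
    """
    row_id = insert_fn(record)  # task 1: write to Cloud SQL
    message = json.dumps({"id": row_id, **record}).encode("utf-8")
    publish_fn(message)         # task 2: notify the Pub/Sub topic
    return row_id
```

Because both dependencies are injected, each task can be built and tested separately, which matches the advice above to attempt the two tasks independently.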
While Google Cloud SQL doesn't manage triggers automatically, you can create a trigger in Postgres:
CREATE OR REPLACE FUNCTION notify_new_record() RETURNS TRIGGER AS $$
BEGIN
  PERFORM pg_notify('on_new_record', row_to_json(NEW)::text);
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER on_insert
  AFTER INSERT ON your_table
  FOR EACH ROW EXECUTE FUNCTION notify_new_record();
Then, in your client, listen to that event:
import pg from 'pg'

const client = new pg.Client()
client.connect()
client.query('LISTEN on_new_record') // same channel name as passed to pg_notify
client.on('notification', msg => {
  console.log(msg.channel) // on_new_record
  console.log(msg.payload) // {"id":"...",...}
  // ... do stuff
})
In the listener, you can either push to pubsub or cloud tasks, or, alternatively, write to firebase/firestore directly (or whatever you need to do).
Source: https://edernegrete.medium.com/psql-event-triggers-in-node-js-ec27a0ba9baa
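If your listener is written in Python rather than Node, the same channel can be consumed with psycopg2 (an assumption; the connection itself is passed in, and the loop follows psycopg2's documented LISTEN/NOTIFY pattern). The payload decoding is a small pure helper:

```python
import json
import select

def decode_notification(payload):
    # pg_notify sends row_to_json(NEW)::text, so the payload is a JSON object.
    return json.loads(payload)

def listen_forever(conn):
    # Assumption: `conn` is a psycopg2 connection in autocommit mode.
    cur = conn.cursor()
    cur.execute("LISTEN on_new_record")  # same channel as pg_notify
    while True:
        # Block until the connection's socket has data, then drain
        # all pending notifications.
        select.select([conn], [], [])
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            row = decode_notification(note.payload)
            print(note.channel, row)  # ...push to Pub/Sub, Firestore, etc.
```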
You could also check out Supabase, which now supports triggering cloud functions (in beta) after a row has been created/updated/deleted (it essentially does what the code above does, but you get a nice UI to configure it).

How to create HTTP GET and POST methods in kdb

What is the best way to set up HTTP GET and POST methods with a kdb database?
I'd like to be able to extract the column names from a kdb table to create a simple form with fillable fields in the browser, allow users to input text into the fields, and then upsert and save that text to my table.
For example if I had the following table...
t:([employeeID:`$()]fName:`$(); mName:`$(); lName:`$())
So far I know how to open a port (\p 9999) and view that table in the browser by connecting to the local host at http://localhost:9999, and I know how to get only the column names: cols t.
Though I'm unsure how to build a useful REST API from this table that achieves the above objective, mainly updating the table with the inputted data. I'm aware of .Q.hg and .Q.hp from this blog post and the Kx reference, but there is little information and I'm still unsure how to make them work for my particular purpose.
Depending on your front-end (client) technology, you can use either HTTP requests or WebSockets. Using HTTP requests will require extra work to customize the output, as by default kdb returns HTML data.
If your client supports WebSockets, as JavaScript does, then they are the easier option.
Basically, you need to do two things to set up WebSockets:
1) Start your kdb server and set up a handler function for WebSocket requests. The handler function is .z.ws. For example, a simple handler would be something like:
q) .z.ws:{neg[.z.w].Q.s #[value;x;{`$ "'",x}]}
2) Set up a message handler function on the client side, open a WebSocket connection from the client, and send a request to the kdb server.
Details: https://code.kx.com/v2/wp/websockets/
Example: https://code.kx.com/v2/wp/websockets/#a-simpledemohtml
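For the plain-HTTP route, kdb's built-in web server evaluates the query string after the ? in a GET against the port opened with \p 9999, returning the result marked up as HTML by default. A small Python helper for building and issuing such requests (host and port are placeholders; only the URL construction is exercised here, since a GET needs a live kdb process):

```python
from urllib.parse import quote
from urllib.request import urlopen

def kdb_url(host, port, query):
    # kdb+ evaluates the expression after '?', so it must be URL-encoded.
    return "http://%s:%d/?%s" % (host, port, quote(query))

def kdb_get(host, port, query):
    # Returns the raw response body (HTML by default from kdb's web server).
    with urlopen(kdb_url(host, port, query)) as resp:
        return resp.read().decode()

# e.g. kdb_get("localhost", 9999, "cols t") against a running kdb session
```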

Dynamic routing to IO sink in Apache Beam

Looking at the example for ClickHouseIO for Apache Beam the name of the output table is hard coded:
pipeline
    .apply(...)
    .apply(ClickHouseIO.<POJO>write(
        "jdbc:clickhouse:localhost:8123/default", "my_table"));
Is there a way to dynamically route a record to a table based on its content?
I.e. if the record contains table=1, it is routed to my_table_1, table=2 to my_table_2 etc.
Unfortunately, ClickHouseIO is still in development and does not support this. BigQueryIO does support dynamic destinations, so it is possible with Beam.
The limitation in the current ClickHouseIO is around transforming data to match the destination table's schema. As a workaround, if your destination tables are known at pipeline-construction time, you could create one ClickHouseIO per table, then use the data to route each record to the correct instance of the IO.
You might want to file a feature request in the Beam bug tracker for this.
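The per-table workaround amounts to partitioning records by the table key before writing. The routing itself is easy to sketch outside Beam (record shape and table-name prefix are assumptions taken from the question):

```python
from collections import defaultdict

def route_records(records, table_prefix="my_table_"):
    """Group records by their `table` field, keyed by the destination
    table name; this mirrors having one ClickHouseIO.write per table."""
    routed = defaultdict(list)
    for rec in records:
        routed[table_prefix + str(rec["table"])].append(rec)
    return dict(routed)
```

In Beam itself the same split is expressed with a Partition transform (or tagged outputs), with each branch feeding its own ClickHouseIO.write instance.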

REST API for processing data stored in hbase

I have a lot of records (millions) in an HBase store, like this:
key = user_id:service_id:usage_timestamp value = some_int
That means a user used service_id for some_int at usage_timestamp.
Now I want to provide a REST API for aggregating that data, for example "find the sum of all values for the requested user" or "find the max of them", and so on. So I'm looking for best practice; a simple Java application doesn't meet my performance expectations.
My current approach aggregates data via an Apache Spark application, which looks good enough, but there are issues using it behind a Java REST API, since Spark doesn't support a request-response model (I have also taken a look at spark-job-server, which seems raw and unstable).
Any ideas? Thanks.
I would suggest HBase + Solr if you are using Cloudera (i.e., Cloudera Search), with the Solrj API for aggregating data (instead of Spark) and for interacting with REST services.
Solr solution (in Cloudera, it's Cloudera Search):
Create a collection (similar to an HBase table) in Solr.
Indexing: use the NRT Lily indexer or a custom MapReduce Solr document creator to load the data as Solr documents. If you don't like the NRT Lily indexer, you can use a Spark or MapReduce job with Solrj to do the indexing. For example, Spark-Solr provides tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ.
Data retrieval: use Solrj to get the Solr docs from your web service call. In Solrj:
There is FieldStatsInfo, through which Sum, Max, etc. can be computed.
There are facets and facet pivots to group data.
Pagination is supported for REST API calls.
You can integrate Solr results with Jersey or some other web service framework, as we have already implemented it this way.
/** Returns the records for the specified rows from the Solr server;
    you can integrate this with any REST API such as Jersey. */
public SolrDocumentList getData(int start, int pageSize, SolrQuery query) throws SolrServerException {
    query.setStart(start);   // start of your page
    query.setRows(pageSize); // number of rows per page
    LOG.info(ClientUtils.toQueryString(query, true));
    // POST is important when querying a huge result set; GET will fail for huge results
    final QueryResponse queryResponse = solrCore.query(query, METHOD.POST);
    final SolrDocumentList solrDocumentList = queryResponse.getResults();
    if (isResultEmpty(solrDocumentList)) { // check if the list is empty
        LOG.info("No records found for this query");
    }
    return solrDocumentList;
}
Also look at my answer to "Create indexes in Solr on top of HBase", and:
https://community.hortonworks.com/articles/7892/spark-dataframe-to-solr-cloud-runs-on-sandbox-232.html
Note: I think the same can be achieved with Elasticsearch as well, but based on my experience, I'm confident with Solr + Solrj.
I see two possibilities:
Livy REST Server - a new REST server created by Cloudera, through which you can submit Spark jobs via REST. It is new and backed by Cloudera, one of the biggest Big Data / Spark companies, so it is likely to keep being developed rather than abandoned.
You can run Spark Thrift Server and connect to it just like a normal database via JDBC; see its documentation. Workflow: read the data, preprocess it, then share it via Spark Thrift Server.
If you want to isolate third-party apps from Spark, you can create a simple application that exposes a user-friendly endpoint and translates queries received by the endpoint into Livy Spark jobs, or into SQL for Spark Thrift Server.
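Whichever serving layer you pick, the aggregations the question asks for are a fold over the user_id:service_id:usage_timestamp keys. A sketch of just that logic (key format taken from the question; function names are hypothetical):

```python
def parse_key(key):
    # Keys look like "user_id:service_id:usage_timestamp".
    user_id, service_id, ts = key.split(":")
    return user_id, service_id, int(ts)

def aggregate_for_user(rows, user_id, op=sum):
    """rows: iterable of (key, value) pairs; returns op over the user's values.
    op=sum answers "sum of all values", op=max answers "max of them"."""
    values = [v for k, v in rows if parse_key(k)[0] == user_id]
    return op(values) if values else None
```

In production this filter-and-fold is what Solr stats/facets or a Spark SQL query would execute server-side; the sketch only pins down the expected semantics.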

Auto generation of Sequence Numbers for nodes in Cypher REST API - Neo4J

I have a requirement to auto-generate sequence numbers when inserting nodes into the Neo4j DB. This sequence number will act like an ID for the node and can be used to generate external URLs that access the node directly from the UI.
This is similar to the auto-generated sequence property in MySQL. How can we do this in Neo4j via Cypher? I did some research and found these links:
Generating friendly id sequence in Neo4j
http://neo4j.com/api_docs//1.9.M05/org/neo4j/graphdb/Transaction.html
However, these links are useful when doing this programmatically in transactional mode; in my case it's all done via the Cypher REST API.
Please advise.
Thanks,
Deepesh
You can use MERGE to mimic sequences:
MERGE (s:Sequence {name:'mysequenceName'})
ON CREATE SET s.current = 0
ON MATCH SET s.current = s.current + 1
WITH s.current AS sequenceCounter
MATCH .... <-- your statement continues here
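Since everything here goes through the Cypher REST API, the MERGE statement can be sent to Neo4j's transactional Cypher endpoint (POST /db/data/transaction/commit in the 2.x line this question dates from). This helper only builds the request body; the endpoint path and the pre-3.x {name} parameter syntax are assumptions about that server generation, and no live server is contacted:

```python
import json

def cypher_payload(statement, parameters=None):
    """Body for Neo4j's transactional Cypher endpoint:
    {"statements": [{"statement": ..., "parameters": ...}]}."""
    stmt = {"statement": statement}
    if parameters:
        stmt["parameters"] = parameters
    return json.dumps({"statements": [stmt]})

# {name} is the pre-3.x parameter placeholder; newer servers use $name.
seq_cypher = (
    "MERGE (s:Sequence {name: {name}}) "
    "ON CREATE SET s.current = 0 "
    "ON MATCH SET s.current = s.current + 1 "
    "RETURN s.current AS sequenceCounter"
)
body = cypher_payload(seq_cypher, {"name": "mysequenceName"})
```

POSTing this body and reading sequenceCounter from the response row gives the next number in one round trip.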
If your unique ID does not need to be numeric or sequential, you can just generate and use a GUID whenever you want to create a node. You have to do this programmatically, and you should pass the value as a parameter, but there are good libraries for GUID generation in all languages and on all platforms.
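For the GUID route, Python's standard library is enough (the node label and property name below are just examples):

```python
import uuid

def new_node_id():
    # uuid4 is random; the collision probability is negligible in practice.
    return str(uuid.uuid4())

# Then pass the value as a Cypher parameter, e.g. for the statement
#   CREATE (n:Item {guid: {guid}})
# with parameters {"guid": new_node_id()}.
```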