As the title says, can you have more than one query in an Azure Stream Analytics job? If so, how should that be structured?
Yes, you can have multiple queries in a Stream Analytics job. You would structure it something like this:
SELECT * INTO type1Output FROM inputSource WHERE type = 1
SELECT * INTO type2Output FROM inputSource WHERE type = 2
The job has two outputs defined, called type1Output and type2Output. Each query writes to a different output.
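If both branches share some pre-processing, you can also factor the common logic into a WITH step and have each SELECT ... INTO read from it. A minimal sketch, reusing the inputSource and outputs from above (the FilteredEvents step name and its filter are just illustrative):

-- shared step; each SELECT ... INTO below writes to its own output
WITH FilteredEvents AS (
    SELECT *
    FROM inputSource
    WHERE type IS NOT NULL
)
SELECT * INTO type1Output FROM FilteredEvents WHERE type = 1
SELECT * INTO type2Output FROM FilteredEvents WHERE type = 2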
I am trying to create a streaming visualization from ksqlDB using the Arcadia BI tool. I am able to establish a connection and see the streams and tables in ksqlDB from Arcadia, but while trying to sample data I am getting an error. Can anyone help?
Error running query:

{"@type":"statement_error","error_code":40001,"message":"Pull queries don't support LIMIT clauses. Refer to https://cnfl.io/queries for info on query types. If you intended to issue a push query, resubmit with the EMIT CHANGES clause

Query syntax in KSQL has changed. There are now two broad categories of queries:
- Pull queries: query the current state of the system, return a result, and terminate.
- Push queries: query the state of the system in motion and continue to output results until they meet a LIMIT condition or are terminated by the user.

'EMIT CHANGES' is used to to indicate a query is a push query. To convert a pull query into a push query, which was the default behavior in older versions of KSQL, add `EMIT CHANGES` to the end of the statement before any LIMIT clause.

For example, the following are pull queries:
    'SELECT * FROM X WHERE ROWKEY=Y;' (non-windowed table)
    'SELECT * FROM X WHERE ROWKEY=Y AND WINDOWSTART>=Z;' (windowed table)

The following is a push query:
    'SELECT * FROM X EMIT CHANGES;'

Note: Persistent queries, e.g. `CREATE TABLE AS ...`, have an implicit `EMIT CHANGES`, but we recommend adding `EMIT CHANGES` to these statements.","stackTrace":[],"statementText":"select * from table_name limit 10;","entities":[]}
This is an issue with Arcadia and the version of KSQL that you're running.
There was a breaking change in ksqlDB 0.6 (shipped with Confluent Platform 5.4) that changed query syntax: pull queries were added, and the previous "continuous queries" are now known as "push queries", denoted by a mandatory EMIT CHANGES clause.
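In practice, the statement in the error (select * from table_name limit 10;) just needs EMIT CHANGES added before the LIMIT to run as a push query; pull queries remain key lookups and don't accept LIMIT. A quick sketch against the same table_name (the 'some-key' value is a placeholder):

-- push query: the pre-0.6 "continuous query" behaviour; LIMIT is allowed
SELECT * FROM table_name EMIT CHANGES LIMIT 10;

-- pull query: returns the current value for a key and terminates
SELECT * FROM table_name WHERE ROWKEY = 'some-key';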
We currently have multiple CloudWatch log streams per EC2 instance. This is horrible to debug; searching for "ERROR XY" across all instances involves either digging into each log stream (time-consuming) or using the AWS CLI (time-consuming queries).
I would prefer to have a log stream combining the log data of all instances of a specific type, let's say all "webserver" instances log their "apache2" log data to one central stream and "php" log data to another central stream.
Obviously, I still want to be able to figure out which log entry stems from which instance - as I would be with central logging via syslogd.
How can I add the custom field "instance id" to the logs in cloudwatch?
The best way to organize logs in CloudWatch Logs is as follows:
The log group represents the log type. For example: webserver/prod.
The log stream represents the instance id (i.e. the source).
For querying, I highly recommend using the Insights feature (I helped build it when I worked at AWS). The log stream name will be available with each log record as a special @logStream field.
You can query across all instances like this:
filter @message like /ERROR XY/
Or inside one instance like this:
filter @message like /ERROR XY/ and @logStream = "instance_id"
I have an application with a workflow that takes an object through a set of states, from S_1 -> S_N. I would like to log all of the state changes with the relevant data, like so:
event: {id, timestamp, state }
Once I have gathered the data, I would like to process it to understand how long each step of the work is taking.
An example SQL query for this data looks like this (please excuse my SQL skills if this query is poorly architected):
SELECT age(m2.event_time, m1.event_time)
FROM events m1
INNER JOIN events m2
  ON m1.m_id = m2.m_id AND m1.state = 'STATE_X' AND m2.state = 'STATE_Y';
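To actually summarise the step durations from that join, I would then run something like the query below (assuming PostgreSQL, since age() returns an interval there; untested):

-- average and worst-case time spent between STATE_X and STATE_Y
SELECT avg(age(m2.event_time, m1.event_time)) AS avg_duration,
       max(age(m2.event_time, m1.event_time)) AS max_duration
FROM events m1
INNER JOIN events m2
  ON m1.m_id = m2.m_id AND m1.state = 'STATE_X' AND m2.state = 'STATE_Y';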
This solution requires a relational database, which creates concerns for scalability and maintainability. We already have a time series database in production and I would ideally like to use that for this purpose. Is there a way to use something like ELK or Prometheus or InfluxDB to accomplish this task? Have I designed the solution incorrectly? Nothing I have found allows for queries of this nature.
I have created a Data Factory source for Salesforce, from which I am querying leads, and I want to pass each lead's email ID as an argument (POST request) to a REST endpoint.
I am not sure what sink I should use in the pipeline, and if it is an HTTP file dataset, how do I pass the email ID from source to sink as the argument?
The only way I know to surface data in ADF itself is through a Lookup activity. You can then iterate through the results of the lookup using a ForEach activity. Reference the result of the lookup in the ForEach items parameter:
@activity('nameoflookupactivity').output.value
If you need to add multiple IDs at once, I think the only way would be to concatenate your IDs in a sink like a SQL database. You would first have to copy your IDs to a table in SQL. Then, in Azure SQL DB/SQL Server 2017, you could use the following query:
SELECT STRING_AGG([salesforce_Id], ',') AS Ids FROM salesforceLeads;
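The staging table itself can be minimal; the schema below is only an assumption about which columns you copy from Salesforce (adjust it to your lead fields):

-- hypothetical staging table for the lead data copied from Salesforce
CREATE TABLE salesforceLeads (
    salesforce_Id NVARCHAR(18),
    Email NVARCHAR(320)
);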
The ADF tutorial for incrementally loading multiple tables discusses the ForEach activity extensively; you can find it here:
ADF Incrementally Loading multiple Tables Tutorial
For more information about the STRING_AGG function, check out: STRING_AGG Documentation
Thanks,
Jan
What I want to do is write an order service without using a traditional RDBMS (e.g. MySQL). After reading the Kafka and Confluent docs for several days, I think Kafka Streams and its state stores could help me implement this without an RDBMS.
But how do I store data that produces a long list of results in a state store, especially when I need to query the results with something like limit and offset?
For example, I have a table with the columns:
| userId | orderId | ... |
One user may have many order rows, so I need a way to query the state store for the key userId with a start offset and a limit. But I can't find such a method in the Kafka Streams state store interface.
Do I have to do this using an RDBMS? What is the standard method for me to implement an application like this?