Unable to set up an Azure alert on resource-specific events - azure-data-factory

In the past, it was possible to set up an Azure alert on a single event for a resource, e.g. on a single Data Factory RunFinished event where the status is Failed*.
This appears to have been superseded by "Activity Log Alerts".
However, these alerts only seem to work either on a metric threshold (e.g. number of failures in 5 minutes) or on events related to the general administration of the resource (e.g. whether it has been deployed), not on the operations of the resource.
A threshold doesn't make sense for Data Factory: a data factory may only run once a day, and if a failure happens and doesn't recur X minutes later, that doesn't mean it has been resolved.
The activity event alerts don't seem to cover things like failures.
Am I missing something?
Is it because this is expected to be done in OMS Log Analytics now? Or perhaps even in Event Grid later?
*N.B. it is still possible to create these alert types via ARM templates, but you can no longer see them in the portal.

The events you're describing are part of a resource type's diagnostic logs, which are not alertable in the same way that the Activity Log is. I suggest routing the data to Log Analytics and setting the alert there: https://learn.microsoft.com/en-us/azure/data-factory/monitor-using-azure-monitor
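For concreteness, here is a minimal Python sketch of what the Log Analytics side could look like once the Data Factory diagnostic logs are routed there, using the Log Analytics query REST endpoint. The workspace ID and token are placeholders, and the table and column names (AzureDiagnostics, status_s) are assumptions that depend on how your diagnostic settings are configured:

```python
import requests

# Placeholders -- substitute your own workspace ID and an AAD bearer token.
WORKSPACE_ID = "00000000-0000-0000-0000-000000000000"
TOKEN = "<aad-bearer-token>"

# Once Data Factory diagnostics are routed to Log Analytics, failed pipeline
# runs typically land in the AzureDiagnostics table; the table and column
# names below are assumptions to verify against your own workspace.
QUERY = """
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Category == "PipelineRuns" and status_s == "Failed"
| where TimeGenerated > ago(1d)
"""

resp = requests.post(
    f"https://api.loganalytics.io/v1/workspaces/{WORKSPACE_ID}/query",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"query": QUERY},
)
resp.raise_for_status()
for table in resp.json()["tables"]:
    for row in table["rows"]:
        print(row)
```

The same query can then back an alert rule in Log Analytics, firing on any failed run rather than on a threshold over a time window.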

Related

How to automate bots to monitor for successful queues on Orchestrator?

I have a project that deals with queues being loaded successfully and unsuccessfully. At the moment I monitor this manually, which is tedious and also prone to false positives: Orchestrator can state that new queue items have been added, but when I access the actual job (process), nothing has been added.
I would like to know: is there a way to monitor queue success and failure rates on Orchestrator instead of monitoring it manually?
You can access pretty much any information via the Orchestrator API.
You can find the "Orchestrator HTTP Request" activity, which will allow you to access any relevant endpoint.
Note that the provisioned Robot in Orchestrator needs to have the right access permissions, so please have a look at which roles are associated with the Robot user.
The API reference can be found here:
https://docs.uipath.com/orchestrator/reference
You will see it mentions Swagger, which in turn will give you all the information you need to access the relevant APIs.
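As a hedged illustration of that approach, the following Python sketch authenticates against a classic on-premises Orchestrator and counts successful vs. failed queue items through the OData endpoint. The URL, tenant and credentials are placeholders, and exact endpoints and fields can differ by Orchestrator version, so verify them against the reference above:

```python
import requests

BASE = "https://your-orchestrator.example.com"  # placeholder Orchestrator URL

# Authenticate (classic on-premises Orchestrator; cloud uses OAuth instead).
auth = requests.post(f"{BASE}/api/account/authenticate", json={
    "tenancyName": "Default",
    "usernameOrEmailAddress": "monitor-user",   # assumed account
    "password": "<password>",
})
auth.raise_for_status()
headers = {"Authorization": f"Bearer {auth.json()['result']}"}

# Count successful vs. failed queue items via the OData endpoint.
for status in ("Successful", "Failed"):
    r = requests.get(
        f"{BASE}/odata/QueueItems",
        headers=headers,
        params={"$filter": f"Status eq '{status}'", "$count": "true", "$top": "0"},
    )
    r.raise_for_status()
    print(status, r.json()["@odata.count"])
```

Scheduled as a small job, this replaces the manual check and can be extended to post the rates wherever you need them.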

Is it possible to track down very rare failed requests using Linkerd?

Linkerd's docs explain how to track down failing requests using the tap command, but in some cases the success rate might be very high, with only a single failed request every hour or so. How is it possible to track down those requests that are considered "unsuccessful"? Perhaps a way to log them somewhere?
It sounds like you're looking for a way to configure Linkerd to trap requests that fail and dump the request data somewhere, which is not supported by Linkerd at the moment.
You do have a couple of options with the current functionality to derive some of the information you're looking for. The Linkerd proxies record error rates as Prometheus metrics, which are consumed by Grafana to render the dashboards. When you observe one of these infrequent errors, you can use the time window functionality in Grafana to find the precise time that the error occurred, then refer to the service log to see if there are any corresponding error messages there. If the error is coming from the service itself, you can add as much logging info about the request as you need in order to help solve the problem.
Another option, which I haven't tried myself, is to integrate linkerd tap into your monitoring system to collect the request info and save the data for the requests that fail. There's a caveat here: be careful about leaving a tap command running, because it will continuously collect data from the tap control plane component, which will add load to that service.
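A rough Python sketch of that idea follows: it leaves `linkerd tap` running as a subprocess and keeps only failing requests. It assumes your Linkerd version supports JSON output from tap, and the JSON field names used to read the status code are guesses you should verify against a few raw events from your own cluster:

```python
import json
import subprocess

# Keep tap running and persist only failures. "deploy/web" is a placeholder
# target; check `linkerd tap --help` for the output flags your version has,
# and mind the extra load a long-running tap puts on the tap service.
proc = subprocess.Popen(
    ["linkerd", "tap", "deploy/web", "-o", "json"],
    stdout=subprocess.PIPE,
    text=True,
)

with open("failed-requests.log", "a") as sink:
    for line in proc.stdout:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        # Field names below are assumptions about the tap JSON schema;
        # inspect a few raw lines from your own cluster to adjust them.
        status = event.get("responseInit", {}).get("http", {}).get("httpStatus")
        if status and int(status) >= 500:
            sink.write(line)
```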
Perhaps a more straightforward approach would be to ensure that all the proxy logs and service logs are written to a long-term store like Splunk, an ELK stack (Elasticsearch, Logstash, and Kibana), or Loki. Then you can set up alerting (Prometheus Alertmanager, for example) to send a notification when a request fails, and match the time of the failure with the logs that have been collected.
You could also look into adding distributed tracing to your environment. Depending on the implementation that you use (jaeger, zipkin, etc.) I think the interface will allow you to inspect the details of the request for each trace.
One final thought: since Linkerd is an open source project, I'd suggest opening a feature request with specifics on the behavior that you'd like to see and work with the community to get it implemented. I know the roadmap includes plans to be able to see the request bodies using linkerd tap and this sounds like a good use case for having those bodies.

Getting many welcome messages from the same user

I am getting many welcome messages from the same user, is it some kind of a monitoring system by Google?
How can I learn to ignore those requests?
Yes, Google periodically issues a health check against your Action, usually about every 5-10 minutes. Your Action should respond to it normally so Google knows if there is something wrong. If there is, you will receive email that your Action is unavailable because it is unhealthy. They will continue to monitor it and, when healthy again, will restore it.
You don't need to ignore those requests; however, you may wish to, either to save on resources or to avoid logging them all the time.
A library such as multivocal detects the health check and responds automatically - there is nothing you need to do. For other libraries, you will need to examine the raw input sent in the body of your webhook request.
If you are using the Action SDK, you should examine the inputs array to see if there is one with an argument named "is_health_check". If you are using Dialogflow, then you would need to look under originalDetectIntentRequest.data.inputs.
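As an illustrative sketch (not an official snippet), a Python webhook could detect the health check along the lines below. Only the key paths (`inputs`, `arguments`, `is_health_check`, `originalDetectIntentRequest.data`) come from the answer above; Flask and the response shape are assumptions:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def is_health_check(body: dict) -> bool:
    """Detect Google's health-check ping in a raw webhook body.

    Covers both shapes described above: the Actions SDK `inputs` array,
    and a Dialogflow request where the same array sits under
    originalDetectIntentRequest.data.
    """
    inputs = body.get("inputs") or (
        body.get("originalDetectIntentRequest", {}).get("data", {}).get("inputs", [])
    )
    for inp in inputs:
        for arg in inp.get("arguments", []):
            if arg.get("name") == "is_health_check":
                return True
    return False

@app.route("/webhook", methods=["POST"])
def webhook():
    body = request.get_json(force=True)
    if is_health_check(body):
        # Respond normally, but skip your own logging/analytics.
        return jsonify({"fulfillmentText": "OK"})
    # ... normal fulfillment logic ...
    return jsonify({"fulfillmentText": "Hello!"})
```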

Creating Log Analytics alerts using Azure PowerShell or the Azure CLI

I'm trying to create alerts in Log Analytics in the Azure portal. I need to create 6 alerts for 5 databases, so I would have to create 30 alerts manually, which is time-consuming.
Hence I would like an automated approach.
I tried Creating Alerts Using Azure PowerShell, but this creates the alerts under Alerts (Classic) in Monitor, which is not what is required; I need them created in Log Analytics.
My next approach was Create a metric alert with a Resource Manager template, but this creates a metric alert, not a Log Analytics alert.
Finally I tried Create and manage alert rules in Log Analytics with REST API, but this is a tedious process: you need to get the search ID, schedule ID, threshold ID and action ID. Even when trying to create the threshold or the action, the error I'm facing is "404 - File or directory not found." (as in the image).
Could someone please suggest how to proceed, or is there any other way to create these alerts apart from manual creation?
If you use Add activity log alert to add a rule, you will find it under Alerts in Log Analytics in the portal.
Please refer to the Log Analytics Documentation,
Alerts are created by alert rules in Azure Monitor and can automatically run saved queries or custom log searches at regular intervals.
Update:
Please refer to my test screenshots; I think you should check the specific resource group, among other things.
Even so, an activity log alert belongs to Alerts (Classic); Alerts is the newer metric alert type. You could check the link new metric alert type in this article, which points to Alerts. It is not currently supported by PowerShell or the CLI.
Please refer to:
1. Use PowerShell to create alerts for Azure services
2. Use the cross-platform Azure CLI to create classic metric alerts in Azure Monitor for Azure services
As mentioned in the two articles:
This article describes how to create older classic metric alerts. Azure Monitor now supports newer, better metric alerts. These alerts can monitor multiple metrics and allow for alerting on dimensional metrics. PowerShell support for newer metric alerts is coming soon.
This article describes how to create older classic metric alerts. Azure Monitor now supports newer, better metric alerts. These alerts can monitor multiple metrics and allow for alerting on dimensional metrics. Azure CLI support for newer metric alerts is coming soon.
@Shashikiran: You can use the script published on GitHub: https://github.com/microsoft/manageability-toolkits/tree/master/Alert%20Toolkit
It can create a few sample alerts. For now it includes some sample core machine monitoring alerts (CPU, hardware failures, SQL, etc.), and these are log alerts only. You can use it as sample code and come up with your own.
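If you prefer to script it yourself, here is a hedged Python sketch of the legacy REST workflow from the question (saved search, then schedule, then threshold action), looped over the five databases. The subscription, workspace, query and token are placeholders, and the property casing follows the legacy 2017-03-15-preview documentation, so double-check it against the REST reference:

```python
import requests

SUB, RG, WS = "<subscription-id>", "<resource-group>", "<workspace>"
API = "api-version=2017-03-15-preview"   # legacy Log Analytics alert API
ROOT = (f"https://management.azure.com/subscriptions/{SUB}/resourceGroups/{RG}"
        f"/providers/Microsoft.OperationalInsights/workspaces/{WS}/savedSearches")
headers = {"Authorization": "Bearer <arm-token>"}  # ARM token is assumed

for db in ["db1", "db2", "db3", "db4", "db5"]:     # your five databases
    sid = f"alert-{db}-errors"
    # 1. Saved search: the query the alert rule will run.
    requests.put(f"{ROOT}/{sid}?{API}", headers=headers, json={"properties": {
        "Category": "Alerts",
        "DisplayName": f"{db} errors",
        "Query": f'search "{db}" "error"',         # placeholder query
        "Version": 1,
    }}).raise_for_status()
    # 2. Schedule: how often the query runs and over what time span (minutes).
    requests.put(f"{ROOT}/{sid}/schedules/{sid}-sched?{API}", headers=headers,
                 json={"properties": {
        "Interval": 15, "QueryTimeSpan": 15, "Enabled": True,
    }}).raise_for_status()
    # 3. Threshold action: when the alert fires.
    requests.put(f"{ROOT}/{sid}/schedules/{sid}-sched/actions/{sid}-action?{API}",
                 headers=headers, json={"properties": {
        "Type": "Alert",
        "Name": f"{db} error alert",
        "Severity": "critical",
        "Threshold": {"Operator": "gt", "Value": 0},
    }}).raise_for_status()
```

Creating the three resources in this order (search, schedule, action) under the correct resource group and workspace should also avoid the 404 mentioned in the question, which typically indicates a wrong URL path.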

Cognitive Service Recommendation API Upload Usage Event

The Upload Usage Event method of the Cognitive Services Recommendations API does not work as I expect.
Implementation technique:
I created, in order, the model, catalog, file, and build in the Cognitive Services Recommendations API.
The response of "Upload Usage Event" is a successful 201 status code.
I call "Update model".
I call "Download usage file" and "Get item to item recommendation".
I tried to confirm that the item from "Upload Usage Event" is reflected.
However, it was not reflected.
I want to know how to get the Upload Usage Event item reflected in the build.
Is my implementation procedure wrong?
After uploading a usage event you need to create a new build in that model for the usage event to be considered as part of the recommendations request.
Note that a single usage event may not significantly change the model. Usually you retrain the model once a week (or more or less often, depending on the level of traffic you receive); by that point you will have sent hundreds or thousands of usage events that may actually impact the model.
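A minimal Python sketch of that flow (upload events, then trigger a new build) against the preview v4.0 endpoints might look like the following; the region, model ID, key and payload fields are placeholders to verify against the API reference:

```python
import requests

# Sketch of "upload usage events, then retrain" against the (preview)
# Recommendations API v4.0. Host region, model ID and key are placeholders;
# check the API reference for the exact payload shapes.
BASE = "https://westus.api.cognitive.microsoft.com/recommendations/v4.0"
MODEL_ID = "<model-id>"
headers = {"Ocp-Apim-Subscription-Key": "<key>"}

# 1. Upload a usage event (a 201 means it was stored, not yet modeled).
requests.post(f"{BASE}/models/{MODEL_ID}/usage/events", headers=headers, json={
    "userId": "user-1",
    "events": [{"eventType": "Purchase", "itemId": "item-42"}],
}).raise_for_status()

# 2. Trigger a new build; only recommendations served from this build
#    will reflect the uploaded events.
build = requests.post(f"{BASE}/models/{MODEL_ID}/builds", headers=headers, json={
    "description": "retrain with new usage events",
    "buildType": "recommendation",
})
print(build.status_code, build.json())
```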