Azure Media Services v3 - Event Grid - deleting an asset doesn't trigger any storage events

The goal is to use Event Grid to detect asset changes such as asset created or asset deleted.
Is there a way to get events for the asset blob containers themselves, rather than for the individual blobs?
Many Microsoft.Storage.BlobCreated events are sent during a live event - actually too many for what I need.
Deleted events, however, are sent only for the deletion of the live event preview blobs (preview.ism and preview.ismc):
{
  "topic": "/subscriptions/123/resourceGroups/ResGroup/providers/Microsoft.Storage/storageAccounts/my_storage",
  "subject": "/blobServices/default/containers/asset-90fc157d-b4a3-4862-a7fe-ff4df7fa5ee7/blobs/preview.ismc",
  "eventType": "Microsoft.Storage.BlobDeleted",
  "eventTime": "2018-12-05T06:38:32.997468Z",
  "id": "e8416467-b01e-00a3-2965-8ccf53060fe2",
  "data": {
    "api": "DeleteBlob",
    "clientRequestId": "05549d31-e9be-4f15-961f-befbba482f6c",
    "requestId": "e8416467-b01e-00a3-2965-8ccf53000000",
    "eTag": "0x8D65A7C46CFD798",
    "contentType": "application/octet-stream",
    "contentLength": 3809,
    "blobType": "BlockBlob",
    "url": "https://my_storage.blob.core.windows.net/asset-90fc157d-b4a3-4862-a7fe-ff4df7fa5ee7/preview.ismc",
    "sequencer": "0000000000000000000000000000137600000000003f399c",
    "storageDiagnostics": {
      "batchId": "06e102aa-d2ec-4aaf-8c4c-0d89dfae5ffb"
    }
  },
  "dataVersion": "",
  "metadataVersion": "1"
}

First of all, according to the official document Reacting to Blob storage events:
Blob storage events are available in general-purpose v2 storage
accounts and Blob storage accounts.
So if you are using general-purpose v2 storage, created events (Microsoft.Storage.BlobCreated) will be sent to your event subscription for Azure Storage; please first make sure which kind of Azure Storage account you are using.
Secondly, if you only care about deleted events (Microsoft.Storage.BlobDeleted), you can select just that event type under EVENT TYPES in the Event Subscription for your storage account in the Azure portal.
All Blob storage events relate to blobs themselves, not to containers. There is an event-filtering feature (under the Additional Features tab) that can match events whose subject begins or ends with a given string, but you would still receive events for the individual asset blobs.
A workaround is to filter the blob events yourself in a webhook handler, or to combine Event Hubs with Stream Analytics to filter and derive container-level events.
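Below is a minimal sketch of the webhook-filtering workaround, assuming a Python/Flask endpoint; the route name and the "asset-" prefix check are illustrative, not from the original answer. It answers the Event Grid subscription validation handshake and then keeps only BlobDeleted events whose subject points at an asset container:

    # Minimal sketch: Flask webhook that filters Event Grid blob events per asset container.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/events", methods=["POST"])
    def handle_events():
        events = request.get_json()
        for event in events:
            # Event Grid sends a one-time validation event when the subscription is created
            if event["eventType"] == "Microsoft.EventGrid.SubscriptionValidationEvent":
                return jsonify({"validationResponse": event["data"]["validationCode"]})

            if event["eventType"] == "Microsoft.Storage.BlobDeleted":
                # subject looks like /blobServices/default/containers/asset-.../blobs/<name>
                container = event["subject"].split("/containers/")[1].split("/")[0]
                if container.startswith("asset-"):
                    # react to the asset-level change here (e.g. mark the asset as deleted)
                    print(f"Blob deleted in asset container {container}: {event['subject']}")
        return "", 200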

Related

Passing Cloud Storage custom metadata into Cloud Storage Notification

We have a Python script that copies/creates files in a GCS bucket.
# let me know if my setting of the custom-metadata is correct
blob.metadata = {"file_capture_time": some_timestamp_var}
blob.upload_from_filename(...)  # note: the client exposes upload_from_filename/upload_from_string rather than a generic upload()
We want to configure the bucket so that it generates Cloud Storage notifications whenever an object is created. We also want the custom metadata above to be passed along with the Pub/Sub message to the topic, and to use it as an ordering key on the subscription side. How can we do this?
The recommended way to receive a notification when an event occurs on the intended GCS bucket is to create a Cloud Pub/Sub topic for new objects and to configure the bucket to publish messages to that topic when new objects are created.
First, make sure you've activated the Cloud Pub/Sub API, and use a gsutil command similar to the one below:
gsutil notification create -f json -e OBJECT_FINALIZE gs://example-bucket
The -e flag specifies that you're only interested in OBJECT_FINALIZE messages (objects being created).
The -f flag specifies that you want the payload of the messages to be the object metadata for the JSON API.
The -m flag (not shown in the command above) specifies a key:value attribute that is appended to the set of attributes sent to Cloud Pub/Sub for all events associated with this notification config; you may specify it multiple times to set multiple attributes.
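If you prefer to do this from the same Python script, a rough equivalent using the google-cloud-storage client might look like the sketch below; the bucket name, topic name, and custom attribute are placeholders. With the JSON API payload format, the object's custom metadata (such as file_capture_time) should arrive inside the message body's metadata field, while the -m/custom_attributes values become Pub/Sub message attributes:

    # Minimal sketch: create the bucket notification with the Python client instead of gsutil.
    from google.cloud import storage
    from google.cloud.storage.notification import (
        JSON_API_V1_PAYLOAD_FORMAT,
        OBJECT_FINALIZE_EVENT_TYPE,
    )

    client = storage.Client()
    bucket = client.bucket("example-bucket")        # placeholder bucket name

    notification = bucket.notification(
        topic_name="new-objects-topic",             # placeholder Pub/Sub topic
        event_types=[OBJECT_FINALIZE_EVENT_TYPE],   # same as -e OBJECT_FINALIZE
        payload_format=JSON_API_V1_PAYLOAD_FORMAT,  # same as -f json
        custom_attributes={"source": "copy-script"},  # same as -m source:copy-script
    )
    notification.create()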
There is also a full Firebase example that explains parsing the filename and other info from the event's context/data.
Here is a good example with a similar context.

How do I add entities/types on Google Actions in bulk using a CSV or a JSON?

I am creating a Google Assistant app in Actions Builder, and I have some use cases which convert a company name to its code. For example, BMW becomes BMWG.DE.
In Actions Builder, under the Types section, I can see a way to add entries.
The problem is that the list is VERY long and I cannot find a way to upload it using a CSV or a JSON. In Dialogflow one can upload these entities/types in bulk using a CSV or JSON, which is quite cool.
Does someone know how to do this, or is it not supported in Actions Builder?
I cannot migrate the Dialogflow entity list to Actions because it is a one-time migration (quite angry about that) and I have already used it.
Projects in Actions Builder are backed by a YAML-based file structure in the Actions SDK. If you pull your project to a local environment, you can convert your JSON entities to types using the YAML structure and then push the changes back.
Example type:
synonym:
  entities:
    "0":
      synonyms:
        - first
    "1":
      synonyms:
        - second
    "2":
      synonyms:
        - third
  matchType: EXACT_MATCH
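As a rough illustration of the bulk conversion (not part of the original answer), a short script could turn a two-column name,code CSV into such a type file; the file names, the headerless two-column CSV layout, and keying each entry by its code are assumptions:

    # Minimal sketch: convert companies.csv (name,code rows) into an Actions SDK type file.
    import csv
    import yaml  # pip install pyyaml

    entities = {}
    with open("companies.csv", newline="") as f:
        for name, code in csv.reader(f):          # assumes two columns and no header row
            entities[code] = {"synonyms": [name]}  # key each entry by its code

    type_definition = {
        "synonym": {
            "entities": entities,
            "matchType": "EXACT_MATCH",
        }
    }

    # write into the pulled project, e.g. custom/types/company.yaml
    with open("custom/types/company.yaml", "w") as f:
        yaml.safe_dump(type_definition, f, allow_unicode=True, sort_keys=False)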

Azure DevOps Webhooks (Service Hooks) Missing Fields like Description or Repro Steps

I am working on an ASP.NET Core 5 project which will use a subscription to my organization's Azure DevOps Service Hooks (webhooks). I will analyze the event payload's data (and metadata).
I checked what the event payloads contain here:
https://learn.microsoft.com/en-us/azure/devops/service-hooks/events?view=azure-devops#workitem.updated
And also downloaded this NuGet package: https://www.nuget.org/packages/Microsoft.AspNet.WebHooks.Receivers.vsts
But there is a problem: I cannot find (in the docs, or in the NuGet package) the work item's "Description" field or a bug's "Repro Steps" field. These two are the most important fields in the payload for my project.
Are these fields hidden somewhere? Or is it possible to include them in the payload?
Querying the Work Item Types Field - List REST API, which is used to get a list of fields for a work item type with detailed references, returns entries like this:
{
  "alwaysRequired": false,
  "defaultValue": null,
  "allowedValues": [],
  "dependentFields": [],
  "referenceName": "System.Description",
  "name": "Description",
  "url": "https://dev.azure.com/fabrikam/_apis/wit/fields/System.Description"
}
The referenceName of the Description field is System.Description.
As you have pointed out, it seems this field is not included in the webhook event payload.
You may have to use the Work Items REST API to query the corresponding fields.
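As a rough sketch (the organization and project names and the use of a personal access token are assumptions), the webhook handler could fetch just those two fields with the Work Items REST API:

    # Minimal sketch: fetch Description and Repro Steps for a work item by id.
    import os
    import requests

    ORG = "fabrikam"              # assumption: your organization
    PROJECT = "MyProject"         # assumption: your project
    PAT = os.environ["AZDO_PAT"]  # personal access token with work item read scope

    def get_description_and_repro_steps(work_item_id: int) -> dict:
        url = f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/wit/workitems/{work_item_id}"
        params = {
            "fields": "System.Description,Microsoft.VSTS.TCM.ReproSteps",
            "api-version": "6.0",
        }
        # PATs use basic auth with an empty user name
        resp = requests.get(url, params=params, auth=("", PAT))
        resp.raise_for_status()
        return resp.json()["fields"]

    # e.g. take the work item id from the webhook payload, then:
    # fields = get_description_and_repro_steps(work_item_id)
    # fields.get("System.Description"), fields.get("Microsoft.VSTS.TCM.ReproSteps")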

How to execute a Google Data Fusion pipeline from event-based triggers in CDAP

Is there any way to run a Google Data Fusion pipeline from CDAP event-based triggers?
The 1st requirement is that whenever a new file arrives in a GCS bucket, it should trigger the Data Fusion pipeline to run automatically.
The 2nd requirement is pipeline dependency: for example, Pipeline B cannot run if Pipeline A has not started or has failed.
Thanks
Reviewing your initial use case, I assume that for the 2nd requirement you might consider looking at CDAP-native components: Schedules, Workflows and Triggers.
Generally, to design the run flow for the underlying pipelines with a conditional execution scheme, you create a Schedule object by defining the specific Workflow that holds the logical combination of the conditions between pipelines, and apply the trigger model that matches your event occurrence.
According to the CDAP documentation:
Workflows can be controlled by the CDAP CLI and the Lifecycle
HTTP RESTful API.
With the above in mind, you need to compose an appropriate HTTP request to the CDAP REST API, containing a JSON object that stores the details of the schedule to be created. Based on the example from the documentation, and for further reference, I've created the schedule below, where Pipeline_2 triggers only when Pipeline_1 succeeds:
{
  "name": "Schedule_1",
  "description": "Triggers Pipeline_2 on the succeeding execution of Pipeline_1",
  "namespace": "<Pipeline_2-namespace>",
  "application": "Pipeline_2",
  "version": "<application version of the Pipeline_2>",
  "program": {
    "programName": "Workflow_name",
    "programType": "WORKFLOW"
  },
  "trigger": {
    "type": "PROGRAM_STATUS",
    "programId": {
      "namespace": "<Pipeline_1-namespace>",
      "application": "Pipeline_1",
      "version": "<application version of the Pipeline_1>",
      "type": "WORKFLOW",
      "entity": "PROGRAM",
      "program": "Workflow_name"
    },
    "programStatuses": ["COMPLETED"]
  }
}
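As a sketch of how that JSON could be submitted (the CDAP endpoint of your Data Fusion instance, the access token handling, and the standard PUT .../schedules/<name> path from the CDAP Lifecycle API are assumptions to verify against your instance):

    # Minimal sketch: add the schedule above through the CDAP Lifecycle REST API.
    import json
    import requests

    CDAP_ENDPOINT = "<CDAP API endpoint of your Data Fusion instance>"  # assumption
    ACCESS_TOKEN = "<google-access-token>"                              # assumption

    with open("schedule_1.json") as f:   # the JSON object shown above
        schedule = json.load(f)

    url = (f"{CDAP_ENDPOINT}/v3/namespaces/{schedule['namespace']}"
           f"/apps/{schedule['application']}/schedules/{schedule['name']}")

    resp = requests.put(
        url,
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}",
                 "Content-Type": "application/json"},
        data=json.dumps(schedule),
    )
    resp.raise_for_status()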
For the 1st requirement, I'm not sure whether it is feasible with Data Fusion/CDAP native instruments, since I'm not able to see an event type that matches continuous discovery of a GCS bucket:
Triggers are fired by events such as creation of a new partition in a
dataset, or fulfillment of a cron expression of a time trigger, or the
status of a program.
In such a case I would look at GCP Cloud Functions and GCP Composer; there is a nicely written example that depicts how to use Cloud Functions for event-based DAG triggers, assuming that in the particular Composer DAG file you can invoke the sequential Data Fusion pipeline execution. Check out this Stack thread for more details.
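Alternatively (a simplification of the above, not from the original answer), a Cloud Function triggered by google.storage.object.finalize could skip Composer and start the pipeline's workflow directly through the CDAP REST API; the endpoint, token handling, and the DataPipelineWorkflow program name are assumptions to verify against your instance:

    # Minimal sketch: GCS-triggered Cloud Function that starts a Data Fusion batch pipeline.
    import requests

    CDAP_ENDPOINT = "<CDAP API endpoint of your Data Fusion instance>"  # assumption
    ACCESS_TOKEN = "<google-access-token>"                              # assumption
    NAMESPACE = "default"
    PIPELINE = "Pipeline_A"

    def on_new_file(event, context):
        """Background Cloud Function fired by google.storage.object.finalize."""
        url = (f"{CDAP_ENDPOINT}/v3/namespaces/{NAMESPACE}/apps/{PIPELINE}"
               f"/workflows/DataPipelineWorkflow/start")
        # pass the arriving bucket/object as runtime arguments for the pipeline
        runtime_args = {"gcs.bucket": event["bucket"], "gcs.object": event["name"]}
        resp = requests.post(url,
                             headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
                             json=runtime_args)
        resp.raise_for_status()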

CQRS and Event Sourcing coupled with Relational Database Design

Let me start by saying, I do not have real world experience with CQRS and that is the basis for this question.
Background:
I am building a system where a new key requirement is allowing admins to "play back" user actions (admins want to be able to step through every action that has happened in the system up to any particular point). The caveats are: the company already has reports generated off of their current SQL db that they will not change (at least not in parallel with this new requirement), so the store of record will be SQL. I do not have access to SQL's Change Data Capture, and creating a bunch of history tables with triggers would be incredibly difficult to maintain, so I'd like to avoid that if at all possible. Lastly, there are potentially (not currently) a lot of data entry points that go through a versioning lifecycle that will result in changes to the SQL db (adding/removing fields), so if I tried to implement change tracking in SQL I'd have to maintain the tables that handled the older versions of the data (a nightmare).
Potential Solution
I am thinking about using NoSQL (Azure DocumentDB) to handle data storage (writes) and then have command handlers handle updating the current SQL (Azure SQL) with the relevant data to be queried (reads). That way the audit trail is created and that idea of "playing back" can be handled while also not disturbing the current back end functionality that is provided.
This approach would handle the requirement and satisfy the caveats. I wouldn't use CQRS for the entire app, just for the pieces that need this "playback" functionality. I know that I would have to mitigate failure points along the path Client -> write to DocumentDB -> respond to user with success/fail -> on success, write to SQL, but my novice CQRS eyes can't see a reason why this isn't a great way to handle this.
Any advice would be greatly appreciated.
This article explains the CQRS pattern and provides an example of a CQRS implementation; please refer to it.
I am thinking about using NoSQL (Azure DocumentDB) to handle data storage (writes) and then have command handlers handle updating the current SQL (Azure SQL) with the relevant data to be queried (reads).
Here is my suggestion: when a user performs a write operation to update a record, we could always do an insert instead, so that admins can audit the user's operations. For example, if a user wants to update a record, we insert a new version of the entity with a property that indicates whether the current operation has been audited by admins, instead of updating the record directly.
Original data in document
{
  "version1_data": {
    "data": {
      "id": "1",
      "name": "jack",
      "age": 28
    },
    "isaudit": true
  }
}
To update the age field, we insert an entity with the updated information instead of updating the original data directly:
{
  "version1_data": {
    "data": {
      "id": "1",
      "name": "jack",
      "age": 28
    },
    "isaudit": true
  },
  "version2_data": {
    "data": {
      "id": "1",
      "name": "jack",
      "age": 29
    },
    "isaudit": false
  }
}
The admin can then check the current document to audit the user's operations and determine whether the updates should be written to the SQL database.
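A minimal sketch of that append-only idea in plain Python (no particular SDK; the helper names are made up for illustration) could look like this:

    # Minimal sketch: add the next versionN_data entry instead of mutating the document.
    import copy

    def _version_number(key: str) -> int:
        # keys look like "version1_data", "version2_data", ...
        return int(key[len("version"):-len("_data")])

    def append_version(document: dict, changes: dict) -> dict:
        """Return a copy of the document with one more, not-yet-audited version."""
        latest_key = max(document, key=_version_number)
        new_data = copy.deepcopy(document[latest_key]["data"])
        new_data.update(changes)                       # e.g. {"age": 29}

        new_doc = copy.deepcopy(document)
        new_doc[f"version{_version_number(latest_key) + 1}_data"] = {
            "data": new_data,
            "isaudit": False,                          # admin flips this after review
        }
        return new_doc

    # append_version(original_document, {"age": 29}) produces the second JSON shown above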
One potential way to think about this is to create a transaction object that has a unique id and represents the work that needs to be done. The transaction in this case would be "write an object to DocumentDB" or "write an object to the SQL DB". It could contain the in-memory object to be written and the destination DB (DocumentDB, SQL, etc.) connection parameters.
Once you define your transaction, you would need to adjust your workflow for proper CQRS. Instead of the client writing to DocumentDB directly and waiting on the result of that call, let the client create a transaction with a unique id (which could be something like DateTime tick counts or an incremental transaction id, for instance) and write this transaction to a message queue such as Azure Queue storage or Service Bus. Once the transaction is written to the queue, return success to the user at that point. Create worker roles that read the transaction messages from this queue, process them, and write the objects to DocumentDB. That is not overwriting the same entity in DocumentDB, but rather writing the transaction with its unique incremental id to DocumentDB for that particular entity. You could also use Azure Table storage for that, as far as I know.
After successfully writing the transaction to DocumentDB, the same worker role could write the transaction to a different message queue, which would be processed by its own set of worker roles that update the entity in the SQL DB. If anything goes wrong in the interim, keep an error table, record failures there, and query and retry them later.
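A rough sketch of the first leg of that flow, assuming azure-storage-queue v12, a queue named doc-db-transactions, and a persist_to_doc_db callback standing in for the real DocumentDB write (all of these names are illustrative):

    # Minimal sketch: client enqueues a transaction and returns; a worker drains the queue.
    import json
    import os
    import uuid
    from datetime import datetime, timezone

    from azure.storage.queue import QueueClient

    queue = QueueClient.from_connection_string(
        os.environ["STORAGE_CONNECTION_STRING"], "doc-db-transactions")

    def enqueue_transaction(entity: dict) -> str:
        """Client side: record the intent to write and return right away."""
        transaction = {
            "id": str(uuid.uuid4()),                        # unique transaction id
            "created": datetime.now(timezone.utc).isoformat(),
            "destination": "docdb",
            "entity": entity,
        }
        queue.send_message(json.dumps(transaction))
        return transaction["id"]                            # report success to the user

    def worker_loop(persist_to_doc_db) -> None:
        """Worker role: process queued transactions and hand off to the next stage."""
        for message in queue.receive_messages():
            transaction = json.loads(message.content)
            persist_to_doc_db(transaction)                  # append-only write, keyed by id
            queue.delete_message(message)                   # ack only after a successful write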