GCS Transfer via Source: URL list "errorCode": "UNKNOWN" - google-cloud-storage

I'm trying to transfer 7,860,379 files, using the transfer system via URL list, however always encounter the same error:
{ //...
"errorBreakdowns": [
{
"errorCode": "UNKNOWN",
"errorCount": "1",
"errorLogEntries": [
{
"url": " or ",
"errorDetails": [
""
]
}
]
}
]
// ...
}
All the my URLs are valid and the file format as documented:
TsvHttpData-1.0
^([^ ]+)\t([0-9]+)\t([a-f0-9]{32})$
The error I find the API is very generic, someone went through the same problem?
Since, I thank you.

Based on your regex, I suspect you are not providing a base-64 encoded MD5, as it often contains '=' characters. To do this, you need to compute the binary version of your MD5 and then convert it to base64.
Example: Hk2gdsIpWTDz3kQssoTqKg==

Related

Batch create transcription always results in: The recordings URI contains invalid data

I would like to use Azure Speech Services Batch Transcription APIs to create a transcription of my audio file. I've already had success using the Speech Service SDK (for Node.js), but was interested in trying out one of the newer features available in v3.1 preview version of the api (displayFormWordLevelTimestampsEnabled), so I figured I had to do use the REST API service to do that.
Overall my problem is that for whatever input I've feed the Create Transcript API for contentUrls, I always end up getting the same error:
"error": {
"code": "InvalidData",
"message": "The recordings URI contains invalid data."
}
After a little digging, I found some tips through the Azure portal to use sox to handle transcoding the audio file in the specific format requested.
The specific format they mention in the portal documentation shows:
If you are using REST API, make sure that it uses one of the formats in this table:
Format
Codec
Bit rate
Sample Rate
WAV
PCM
256 kbps
16 kHz, mono
OGG
OPUS
256 kpbs
16 kHz, mono
With the sox specific commands being:
Activity
SoX command
Check the audio file format.
sox --i
Convert the audio file to single channel, 16-bit, 16 KHz.
sox -b 16 -e signed-integer -c 1 -r 16k -t wav .wav
I ran my mp3 through the second command and verified the file with the first, and the contents of the file looks like:
Input File : 'out5.wav'
Channels : 1
Sample Rate : 16000
Precision : 16-bit
Duration : 00:00:30.09 = 481488 samples ~ 2256.97 CDDA sectors
File Size : 963k
Bit Rate : 256k
Sample Encoding: 16-bit Signed Integer PCM
Finally, I uploaded the file to a public S3 bucket, to use as my content url for my request:
POST https://westus.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions
{
"contentUrls": [
"https://s3.us-west-1.amazonaws.com/xxxx/out5.wav"
],
"locale": "en-US",
"displayName": "Test"
}
Still it failed with the same error that I posted above. Any insights into what might be wrong? Thanks!
Update:
The answer below mentioned being able to reference a reports.json file on the Get Transcript/Create Transcript api call.
When I use the Create Transcript API my payload is:
{
"self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/transcriptions/02815462-e9c0-4fdc-8bbe-7b0e78152f95",
"model": {
"self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/models/base/c3b008fa-eb47-4f6d-a5b9-71dd37870bb7"
},
"links": {
"files": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/transcriptions/02815462-e9c0-4fdc-8bbe-7b0e78152f95/files"
},
"properties": {
"diarizationEnabled": false,
"wordLevelTimestampsEnabled": false,
"displayFormWordLevelTimestampsEnabled": false,
"channels": [
0,
1
],
"punctuationMode": "DictatedAndAutomatic",
"profanityFilterMode": "Masked"
},
"lastActionDateTime": "2022-09-13T23:37:09Z",
"status": "NotStarted",
"createdDateTime": "2022-09-13T23:37:09Z",
"locale": "en-US",
"displayName": "Test"
}
Calling the Get Transcript I see:
{
"self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/transcriptions/02815462-e9c0-4fdc-8bbe-7b0e78152f95",
"model": {
"self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/models/base/c3b008fa-eb47-4f6d-a5b9-71dd37870bb7"
},
"links": {
"files": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/transcriptions/02815462-e9c0-4fdc-8bbe-7b0e78152f95/files"
},
"properties": {
"diarizationEnabled": false,
"wordLevelTimestampsEnabled": false,
"displayFormWordLevelTimestampsEnabled": false,
"channels": [
0,
1
],
"punctuationMode": "DictatedAndAutomatic",
"profanityFilterMode": "Masked",
"error": {
"code": "InvalidData",
"message": "The recordings URI contains invalid data."
}
},
"lastActionDateTime": "2022-09-13T23:37:22Z",
"status": "Failed",
"createdDateTime": "2022-09-13T23:37:09Z",
"locale": "en-US",
"displayName": "Test"
}
And finally looking at the transcript files I'm getting an empty list:
{
"values": []
}
I see no reference to a reports.json, or any data populated here at all.
In many cases you can get a detailed error information by doing a GET on https://westus.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions/<transcription_id>/files and looking at the report.json that is referenced there.
If that doesn't help, you could post transcription id(s) of failed transcription so someone from the team (I am one of them) can look at the service logs.

Capture text between two different char. Between the first{ and the last }

Capture text between two different char using PowerShell. Between the first{ and the last } . Basically there is text with Json in it and I want to capture the json from it. I have looked for examples but so far no luck.
PROJECT Description: Azure Test Project Description
PROJECT ADMINISTRATORS: jjohnson
CONTRIBUTORS: jdoe
BOARD PROCESS: Agile
SPECIAL INSTRUCTIONS:
{
"organization": "https://dev.azure.com/cloudops",
"projectName": "Test Project",
"projectDescription": "Azure Test Project Description",
"projectProcessType": "Agile",
"specialInstructions": "",
"adminMembers": [
{
"userSamAccountName": "jjohnson",
"userEmailAddress": "jjohnson#test.com",
"userPrincipalName": "jjohnson#test.com",
"projectGroupType": "projectAdministrator"
}
],
"contribMembers": [
{
"userSamAccountName": "jdoe",
"userEmailAddress": "jdoe#test.com",
"userPrincipalName": "jdoe#test.com",
"projectGroupType": "projectContributor"
}
]
}
Is this what you were looking for?
[Regex]::Match((Get-Content "sampleinputfile.txt" -Raw),
'^{.+}',
[Text.RegularExpressions.RegexOptions]::Multiline -bor
[Text.RegularExpressions.RegexOptions]::Singleline).Value
Basically this converts the input file to a single (newline-delimited) string (Get-Content -Raw), and then uses the .NET Framework's Match method to do a regular expression match for the lines of text between the { and } characters (inclusive).

Encountered unsupported property ComparisonOperator

Cloudformation stack throws Error "Encountered unsupported property Comparison Operator" , while creating an AWS::CloudWatch::Alarm using cloudformation.
As per AWS documentation ComparisionOperator value GreaterThanOrEqualtoThreshold is valid.
I use AWSTemplateFormatVersion as 2010-09-09
Any help would be appreciated :)
"CPUHighAlarm":{
"Type":"AWS::CloudWatch::Alarm",
"Properties":{
"AlarmDescription":"High CPU utilization",
"MetricName":"CPUUtilization",
"Namespace":"AWS/EC2",
"AlarmActions":[{"Ref":"asgScaleOut"}],
"ComparisionOperator": "GreaterThanOrEqualtoThreshold",
"EvaluationPeriods": "1",
"Threshold": "70",
"Period":"180",
"Statistic": "Average",
"Dimensions": [
{
"Name": "AutoScalingGroupName",
"Value": {
"Ref": "asg"
}
}
]
}
},
Just a typo. Should be ComparisonOperator instead of ComparisionOperator.
The CloudFormation Linter can help you catch these quicker and the Visual Studio Code extension can help prevent typos with autocompletion:
E3002: Invalid Property Resources/CPUHighAlarm/Properties/ComparisionOperator
Maybe it's case sensitive. Try GreaterThanOrEqualToThreshold.

PropertyParams when deploying VM from OVF

I am using the VMWare vCenter REST API to deploy new Virtual Machines from OVF library items. Part of the API allows for additional_paramaters but I am unable to get it to function properly. Specifically, I would like to set the PropertyParams for custom OVF template properties.
When deploying VM from OVF, I am using the following REST API:
POST https://{server}/rest/com/vmware/vcenter/ovf/library-item/id:{ovf_library_item_id}?~action=deploy
I have tried many structures and either end up with the POST succeeding but the parameters completely ignored, or with a 500 Internal Server error with a message about failing to convert the properties structure:
Could not convert field 'properties' of structure 'com.vmware.vcenter.ovf.property_params'
The payload that seems correct from the documentation (but fails with the error above):
deployment_spec : {
/* ... */
additional_parameters : [
{
type : 'PropertyParams',
properties : [
{
id : 'my_property_name',
value : 'foo',
}
]
}
]
}
Given an OVF that contains the following:
<ProductSection>
<Info>Information about the installed software</Info>
<Product>MyProduct</Product>
<Vendor>MyCompany</Vendor>
<Version>1.0</Version>
<Category>Config</Category>
<Property ovf:userConfigurable="true" ovf:type="string" ovf:key="my_property_name" ovf:value="">
<Label>My Property</Label>
<Description>A custom property</Description>
</Property>
</ProductSection>
This also fails for other property types such as boolean.
Note that I have posted on the vCenter forums as well.
I had the same issue, i success to solve it by browsing the vapi structure /com/vmware/vapi/metadata/metamodel/structure/id:<idstructure>
Here is my finding :
firstly, get your properties structure by using the filter api :
https://{{vc}}/rest/com/vmware/vcenter/ovf/library-item/id:300401a5-4561-4c3d-ac67-67bc7a1a6
Then, to deploy, use the class com.vmware.vcenter.ovh.property_params. It will be more clear with the exemple :
{
"deployment_spec": {
"accept_all_EULA": true,
"name": "clientok",
"default_datastore_id": "datastore-10",
"additional_parameters": [
{
"#class": "com.vmware.vcenter.ovf.property_params",
"properties":
[
{
"instance_id": "",
"class_id": "",
"description": "The gateway IP for this virtual appliance.",
"id": "gateway",
"label": "Default Gateway Address",
"category": "LAN",
"type": "ip",
"value": "10.1.2.1",
"ui_optional": true
}
],
"type": "PropertyParams"
}
]
}

how to translate curl to elixir httpoison

I have an example curl command from an api:
curl -sd '{"inputs":[{"addresses": ["add42af7dd58b27e1e6ca5c4fdc01214b52d382f"]}],"outputs":[{"addresses": ["884bae20ee442a1d53a1d44b1067af42f896e541"], "value": 4200000000000000}]}' https://api.blockcypher.com/v1/eth/main/txs/new?token=YOURTOKEN
I have no idea how to translate this into HTTPoison for Elixir. I've been trying for hours. I can't even begin to mention all the iterations I've gone through, but here's where I'm at now:
Connect.post( "https://api.blockcypher.com/v1/eth/main/txs/new?token=#{#token}",
"",
[
{ "inputs", "{addresses, #{address_from}}"},
{ "outputs", "[{addresses, #{address_to}}, {value, #{eth_amount}}]"}
]
)
unlike most everything I've tried before this actually gets to their servers and gives a response:
"{\"error\": \"Couldn't deserialize request: EOF\"}"
%{"error" => "Couldn't deserialize request: EOF"}
** (FunctionClauseError) no function clause matching in Base.encode16/2
(elixir) lib/base.ex:175: Base.encode16(nil, [])
(blockcypher) lib/blockcypher/handler.ex:55: Blockcypher.Handler.post_transa
ction_new/4
iex(46)>
can you help me out? I tried putting the data in the body portion instead of the headers but with no luck.
The data should be the second argument to HTTPoison.post/2 and should be encoded to JSON. Your data is also in the wrong format.
This should work:
Connect.post(
"https://api.blockcypher.com/v1/eth/main/txs/new?token=#{#token}",
"",
Poison.encode!(
%{"inputs" => [%{"addresses" => [address_from]}],
"outputs" => [%{"addresses" => [address_to],
"value" => eth_amount}]})
)