Deleting a service gives an error - ibm-cloud

When trying to delete a service, I get errors. "Microservices-OrdersAPI-marcvl-1526" has a route to an ElephantSQL instance. The error I get is:
{
  "code": 10001,
  "description": "Service instance myMicroservicesSQL: Service broker error: {\"description\"=>\"Error 500 received from broker url https://bluemix-eu-gb.marketplace.ibmcloud.com/api/custom/cloudfoundry/v2/service_instances/ffa09d28-1774-46b5-ac36-8e8cc87d098c/service_bindings/57f38a3e-a522-4de2-9f09-179a6c770088?plan_id=4dfed884-0a35-4a76-ab61-e122117f3efe&service_id=45c4b1ce-ae64-11e3-8e2c-00259086a7bc\"}",
  "error_code": "CF-ServiceBrokerBadResponse",
  "http": {
    "uri": "https://provision-broker.eu-gb.bluemix.net/bmx/provisioning/brokers/29240c6e-d36b-4b4f-8628-4c5de43d6c61/v2/service_instances/ffa09d28-1774-46b5-ac36-8e8cc87d098c/service_bindings/57f38a3e-a522-4de2-9f09-179a6c770088",
    "method": "DELETE",
    "status": 500
  }
}
Please advise how to delete these two. Deleting others did work.
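For reference, the usual cf CLI sequence for cleaning up a stuck instance looks roughly like this (only a sketch: the app and instance names are taken from the question and the error above, and the purge command requires admin rights):
# List services and the apps bound to them in the current org/space
cf services

# Unbind the app before deleting the instance
cf unbind-service Microservices-OrdersAPI-marcvl-1526 myMicroservicesSQL

# Remove any service keys, then delete the instance
cf service-keys myMicroservicesSQL
cf delete-service myMicroservicesSQL -f

# If the broker keeps returning 500, an admin can purge the instance
# from Cloud Foundry without calling the broker at all
cf purge-service-instance myMicroservicesSQL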

Related

Concourse Worker on another server loses connection to Concourse Web

We have a Concourse Web container and a Concourse Worker container running on Server A (212.77.7.255 - the real IP is concealed). We use the latest Concourse version, 7.8.1.
As we ran out of worker resources, we added another Concourse Worker container running on Server B. The Worker on Server B had been running fine for about five days, but all of a sudden it is no longer able to connect to Concourse Web on Server A.
The logs of the Worker on Server B say:
{
  "timestamp": "2022-07-12T11:15:59.542985762Z",
  "level": "error",
  "source": "worker",
  "message": "worker.container-sweeper.tick.failed-to-connect-to-tsa",
  "data": {
    "error": "dial tcp 212.77.7.255:2222: i/o timeout",
    "session": "6.4"
  }
}{
  "timestamp": "2022-07-12T11:15:59.543044656Z",
  "level": "error",
  "source": "worker",
  "message": "worker.container-sweeper.tick.dial.failed-to-connect-to-any-tsa",
  "data": {
    "error": "all worker SSH gateways unreachable",
    "session": "6.4.2"
  }
}{
  "timestamp": "2022-07-12T11:15:59.543060804Z",
  "level": "error",
  "source": "worker",
  "message": "worker.container-sweeper.tick.failed-to-dial",
  "data": {
    "error": "all worker SSH gateways unreachable",
    "session": "6.4"
  }
}{
  "timestamp": "2022-07-12T11:15:59.543068953Z",
  "level": "error",
  "source": "worker",
  "message": "worker.container-sweeper.tick.failed-to-get-containers-to-destroy",
  "data": {
    "error": "all worker SSH gateways unreachable",
    "session": "6.4"
  }
}{
  "timestamp": "2022-07-12T11:15:59.554118751Z",
  "level": "error",
  "source": "worker",
  "message": "worker.volume-sweeper.tick.failed-to-connect-to-tsa",
  "data": {
    "error": "dial tcp 212.77.7.255:2222: i/o timeout",
    "session": "7.4"
  }
}{
  "timestamp": "2022-07-12T11:15:59.554164844Z",
  "level": "error",
  "source": "worker",
  "message": "worker.volume-sweeper.tick.dial.failed-to-connect-to-any-tsa",
  "data": {
    "error": "all worker SSH gateways unreachable",
    "session": "7.4.3"
  }
}{
  "timestamp": "2022-07-12T11:15:59.554172593Z",
  "level": "error",
  "source": "worker",
  "message": "worker.volume-sweeper.tick.failed-to-dial",
  "data": {
    "error": "all worker SSH gateways unreachable",
    "session": "7.4"
  }
}{
  "timestamp": "2022-07-12T11:15:59.554179789Z",
  "level": "error",
  "source": "worker",
  "message": "worker.volume-sweeper.tick.failed-to-get-volumes-to-destroy",
  "data": {
    "error": "all worker SSH gateways unreachable",
    "session": "7.4"
  }
}{
  "timestamp": "2022-07-12T11:16:04.580220012Z",
  "level": "error",
  "source": "worker",
  "message": "worker.beacon-runner.beacon.failed-to-connect-to-tsa",
  "data": {
    "error": "dial tcp 212.77.7.255:2222: i/o timeout",
    "session": "4.1"
  }
}{
  "timestamp": "2022-07-12T11:16:04.580284659Z",
  "level": "error",
  "source": "worker",
  "message": "worker.beacon-runner.beacon.dial.failed-to-connect-to-any-tsa",
  "data": {
    "error": "all worker SSH gateways unreachable",
    "session": "4.1.10"
  }
}{
  "timestamp": "2022-07-12T11:16:04.580335377Z",
  "level": "error",
  "source": "worker",
  "message": "worker.beacon-runner.beacon.failed-to-dial",
  "data": {
    "error": "all worker SSH gateways unreachable",
    "session": "4.1"
  }
}{
  "timestamp": "2022-07-12T11:16:04.580359868Z",
  "level": "error",
  "source": "worker",
  "message": "worker.beacon-runner.beacon.exited-with-error",
  "data": {
    "error": "all worker SSH gateways unreachable",
    "session": "4.1"
  }
}{
  "timestamp": "2022-07-12T11:16:04.580372552Z",
  "level": "debug",
  "source": "worker",
  "message": "worker.beacon-runner.beacon.done",
  "data": {
    "session": "4.1"
  }
}{
  "timestamp": "2022-07-12T11:16:04.580394879Z",
  "level": "error",
  "source": "worker",
  "message": "worker.beacon-runner.failed",
  "data": {
    "error": "all worker SSH gateways unreachable",
    "session": "4"
  }
}
The logs on Concourse Web on Server A show no entries of the Worker on Server B trying to connect. On Server B I'm able to connect to Concourse Web on Server A:
$ nc 212.77.7.255 2222
SSH-2.0-Go
We had this problem before, and we solved it then by upgrading Concourse to the latest version, 7.8.1. Now I'm running out of options for debugging this. What I've tried:
restarting the workers
restarting the web container
pruning the stalled worker of Server B
docker system prune on Server B
Nothing helps. What can I do to debug this further and make the Worker on Server B connect again?
You said it happened with an earlier version, you "ran out of Worker resources", and I'm seeing I/O timeouts in the logs... the one component you didn't mention is the DB.
It might be that the maximum number of connections on the DB has been reached, especially if the DB is used for purposes other than just Concourse. That's where I'd look next.
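If you want to check that quickly, you can query the backing database directly (a sketch assuming a PostgreSQL database and psql access; host, user and database name are placeholders):
# Compare the configured connection limit with the connections currently open
psql -h <db-host> -U <db-user> -d <concourse-db> -c "SHOW max_connections;"
psql -h <db-host> -U <db-user> -d <concourse-db> -c "SELECT count(*) FROM pg_stat_activity;"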
We couldn't find out why the Docker network did not allow connections to Server A. As connections from the host machine were going through, we told Docker to use the host network:
services:
  concourse-worker:
    ...
    network_mode: host
    ...
This solved the issue. It's not a pretty workaround, as the Docker container should have its own separate network, but since nothing else is running on this server it's fine.
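For anyone hitting the same thing: after this change the worker should register with the web node again. A quick way to confirm (assuming a logged-in fly target named ci; the worker name is a placeholder) is:
# The worker should be listed in state "running" once it has reconnected
fly -t ci workers

# If the old registration is still shown as stalled, prune it
fly -t ci prune-worker -w <worker-name>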

Azure Data Factory CI/CD release errors

I am trying to deploy to the UAT environment and followed the steps shown in a blog post and a YouTube video.
However, I keep getting failures.
If I run it in 'validation only' mode it passes fine. But when I actually deploy it under 'incremental', I receive the following errors:
2022-03-01T17:24:27.6268601Z ##[error]At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.
2022-03-01T17:24:27.6301466Z ##[error]Details:
2022-03-01T17:24:27.6306699Z ##[error]BadRequest: Failed to encrypt sub-resource payload {
"Id": "/subscriptions/a834838a-11d5-4657-a9c3-bc8b2ebdaa59/resourceGroups/adf-rg-uat-uks/providers/Microsoft.DataFactory/factories/adf-df-uat-uks/linkedservices/adfSQLDB_Dev",
"Name": "adfSQLDB_Dev",
"Properties": {
"annotations": [],
"type": "AzureSqlDatabase",
"typeProperties": {
"connectionString": "********************"
}
}
} and error is: Message for the errorCode not found..
2022-03-01T17:24:27.6314634Z ##[error]BadRequest: Failed to encrypt sub-resource payload {
"Id": "/subscriptions/a834838a-11d5-4657-a9c3-bc8b2ebdaa59/resourceGroups/adf-rg-uat-uks/providers/Microsoft.DataFactory/factories/adf-df-uat-uks/linkedservices/adfSQLDB_Prod",
"Name": "adfSQLDB_Prod",
"Properties": {
"annotations": [],
"type": "AzureSqlDatabase",
"typeProperties": {
"connectionString": "********************"
}
}
} and error is: Message for the errorCode not found..
2022-03-01T17:24:27.6322501Z ##[error]BadRequest: Failed to encrypt sub-resource payload {
"Id": "/subscriptions/a834838a-11d5-4657-a9c3-bc8b2ebdaa59/resourceGroups/adf-rg-uat-uks/providers/Microsoft.DataFactory/factories/adf-df-uat-uks/linkedservices/adfBlobStorage",
"Name": "adfBlobStorage",
"Properties": {
"annotations": [],
"type": "AzureBlobStorage",
"typeProperties": {
"connectionString": "********************"
}
}
} and error is: Expecting connection string of format "key1=value1; key2=value2"..
2022-03-01T17:24:27.6328134Z ##[error]BadRequest: Failed to encrypt sub-resource payload {
"Id": "/subscriptions/a834838a-11d5-4657-a9c3-bc8b2ebdaa59/resourceGroups/adf-rg-uat-uks/providers/Microsoft.DataFactory/factories/adf-df-uat-uks/linkedservices/adfSQLDB",
"Name": "adfSQLDB",
"Properties": {
"annotations": [],
"type": "AzureSqlDatabase",
"typeProperties": {
"connectionString": "********************"
}
}
} and error is: Message for the errorCode not found..
2022-03-01T17:24:27.6336169Z ##[error]BadRequest: Failed to encrypt sub-resource payload {
"Id": "/subscriptions/a834838a-11d5-4657-a9c3-bc8b2ebdaa59/resourceGroups/adf-rg-uat-uks/providers/Microsoft.DataFactory/factories/adf-df-uat-uks/linkedservices/adfSQLDB_UAT",
"Name": "adfSQLDB_UAT",
"Properties": {
"annotations": [],
"type": "AzureSqlDatabase",
"typeProperties": {
"connectionString": "********************"
}
}
} and error is: Message for the errorCode not found..
2022-03-01T17:24:27.6339122Z ##[error]Check out the troubleshooting guide to see if your issue is addressed: https://learn.microsoft.com/en-us/azure/devops/pipelines/tasks/deploy/azure-resource-group-deployment?view=azure-devops#troubleshooting
2022-03-01T17:24:27.6342204Z ##[error]Task failed while creating or updating the template deployment.
Regards
Mark
Are you using a Self-Hosted Integration Runtime? If yes, please check that it is in the available state during the deployment. Alternatively, your connection string may not have the SecureString property:
"typeProperties": {
  "connectionString": {
    "type": "SecureString",
    "value": .............
  }
}
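The state of the target factory can also be checked from the CLI before the release runs (a sketch assuming the az datafactory extension is installed; the factory and resource group names are taken from the error output, and the runtime name is a placeholder):
# List the linked services that already exist in the UAT factory
az datafactory linked-service list \
  --factory-name adf-df-uat-uks \
  --resource-group adf-rg-uat-uks \
  --query "[].name" -o table

# Confirm the self-hosted integration runtime is online before deploying
az datafactory integration-runtime get-status \
  --factory-name adf-df-uat-uks \
  --resource-group adf-rg-uat-uks \
  --name <shir-name>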

Google People API updateContact, batchCreateContacts and batchUpdateContacts throwing Errors

The new Google People API methods
batchCreateContacts - https://developers.google.com/people/api/rest/v1/people/batchCreateContacts and batchUpdateContacts - https://developers.google.com/people/api/rest/v1/people/batchUpdateContacts give the following response when making a request:
{
"error": {
"code": 500,
"message": "Internal error encountered.",
"status": "INTERNAL"
}
}
updateContact - https://developers.google.com/people/api/rest/v1/people/updateContact gives the following response when CalendarUrl is sent in the update contact request:
{
"error": {
"code": 400,
"message": "Invalid updatePersonFields mask path: \"calendar_urls\". Valid paths are documented at https://developers.google.com/people/api/rest/v1/people/updateContact.",
"status": "INVALID_ARGUMENT"
}
}
Can someone help with these issues?
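For reference, a bare-bones updateContact call looks roughly like this (only a sketch: the resource name, etag and fields are placeholders, and the access token needs a contacts scope):
# PATCH people/{id}:updateContact with an explicit updatePersonFields mask
# (resource name, etag and field values below are placeholders)
curl -X PATCH \
  "https://people.googleapis.com/v1/people/<contact-id>:updateContact?updatePersonFields=names,emailAddresses" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "etag": "<etag-from-people.get>",
        "names": [{"givenName": "Jane", "familyName": "Doe"}]
      }'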

Azure Automation Registration Endpoint is corrupted when used to pull DSC configuration

For some reason, I keep getting these weird issues...
In this case, I have a Key and an Endpoint URL for the Automation Account stored as Secrets in a Key Vault (I don't know of a way to extract them natively from the Automation Account using ARM).
I can extract these values perfectly, and they are published to the Template that runs a PowerShell extension to pull a DSC Configuration.
For example, as seen in the Inputs when deploying the Template:
"RegistrationUrl":"https://ase-agentservice-prod-1.azure-automation.net/accounts/e0799801-a8da-8934-b0f3-9a43191dd7e6"
However, I receive the following error (note the URL in the error has 3 forward slashes):
"code": "VMExtensionProvisioningError",
"message": "VM has reported a failure when processing extension 'dscLcm'.
Error message: "DSC Configuration 'ConfigureLCMforAAPull' completed with error(s). Following are the first few: The attempt to 'get an action' for AgentId 11A5A267-6D00-11E7-B07F-000D3AE0FB1B from server URL https://ase-agentservice-prod-1.azure-automation.net///accounts/e0799801-a8da-8934-b0f3-9a43191dd7e6/Nodes(AgentId='11A5A267-6D00-11E7-B07F-000D3AE0FB1B')/GetDscAction failed with server error 'ResourceNotFound(404)'.
For further details see the server error message below or the DSC debug event log with ID 4339.
ServerErrorMessage:- 'No NodeConfiguration was found for the agent.'\"."
The Endpoint URL is passed as a Secure String. I tried passing it as a normal string - same problem.
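A quick sanity check is to read the secret straight back out of the Key Vault and compare it character for character with what the extension receives (the vault and secret names below are placeholders):
# Show the raw registration URL stored in the Key Vault secret
az keyvault secret show \
  --vault-name <vault-name> \
  --name <dsc-url-secret-name> \
  --query value -o tsv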
The Key and Endpoint are fed into the Template as Parameters:
"dscKeySecret": {
"type": "securestring",
"metadata": {
"description": "Key for PowerShell DSC Configuration."
}
},
"dscUrlSecret": {
"type": "securestring",
"metadata": {
"description": "Url for PowerShell DSC Configuration."
}
},
These values are used to create a parameter to be passed to the next template that runs the VM Extension.
"extn-settings": {
  "value": {
    "configuration": {
      "url": "[concat(variables('urls').dscScripts, '/', 'lcm-aa-pull', '/', 'lcm-aa-pull', '.zip')]",
      "script": "[concat('lcm-aa-pull', '.ps1')]",
      "function": "ConfigureLCMforAAPull"
    },
    "configurationArguments": {
      "registrationKey": {
        "username": "dsckeySecret",
        "password": "[parameters('dscKeySecret')]"
      },
      "registrationUrl": "[parameters('dscUrlSecret')]",
      "configurationMode": "ApplyAndMonitor",
      "configurationModeFrequencyMins": 15,
      "domain": "[variables('names').domain]",
      "name": "dscLcm",
      "nodeConfigurationName": "[variables('names').config.ad]",
      "rebootNodeIfNeeded": true,
      "refreshFrequencyMins": 30
    },
    "protectedSettings": null
  }
}
The next template receives the Parameters and uses them in the Properties of the VM's Resources section:
"properties": {
"publisher": "Microsoft.Powershell",
"type": "DSC",
"typeHandlerVersion": "2.22",
"autoUpgradeMinorVersion": true,
"settings": {
"configuration": "[parameters('extn-settings').configuration]",
"configurationArguments": "[parameters('extn-settings').configurationArguments]"
},
"protectedSettings": "[parameters('extn-settings').protectedSettings]"
}
So why is the URL being corrupted, with the first '/' being changed to '///'?
I don't know why the Endpoint URL has 3 x '/', but that wasn't the issue... I wish I had found the issue before I posted this question...
I found that the Node Configuration Name was wrong due to a spelling mistake (hangs head in shame).
Thanks anyway!

presto: Cannot connect to discovery server

Recently I built Presto in cluster mode with 1 coordinator & 1 worker, and it works.
Then I repackaged "presto-main-0.148.jar" without any change and deployed it to the production environment, and it doesn't work! I always get the response "No worker nodes available".
I searched the server.log and see the messages below:
ERROR Discovery-0 io.airlift.discovery.client.CachingServiceSelector Cannot connect to discovery server for refresh (collector/general): Lookup of collector failed for ht*p://10.3.2.33:18080/v1/service/collector/general
ERROR Discovery-0 io.airlift.discovery.client.CachingServiceSelector Cannot connect to discovery server for refresh (presto/general): Lookup of presto failed for ht*p://10.3.2.33:18080/v1/service/presto/general
INFO Discovery-1 io.airlift.discovery.client.CachingServiceSelector Discovery server connect succeeded for refresh (collector/general)
INFO Discovery-2 io.airlift.discovery.client.CachingServiceSelector Discovery server connect succeeded for refresh (presto/general)
So I guessed the discovery server was not started, but when I use the command curl "h*tp://10.3.2.33:18080/v1/service/collector/general"
I get the response below, and I also see the coordinator status as 'ACTIVE':
{
"environment": "presto_**_flt",
"services": [
{
"id": "954e886d-7506-4f00-b954-eeab49209835",
"nodeId": "4c0f2596-7e6e-11e6-ae22-56b6b6499611",
"type": "presto",
"pool": "general",
"location": "/4c0f2596-7e6e-11e6-ae22-56b6b6499611",
"properties": {
"node_version": "a0e36ae",
"coordinator": "false",
"http": "h*tp://10.3.2.24:18080",
"http-external": "h*tp://10.3.2.24:18080",
"datasources": "hive,system"
}
},
{
"id": "6790b522-cd17-48ef-b077-e4e8fa97e310",
"nodeId": "4c0f2366-7e6e-11e6-ae22-56b6b6499611",
"type": "presto",
"pool": "general",
"location": "/4c0f2366-7e6e-11e6-ae22-56b6b6499611",
"properties": {
"node_version": "c34bef3-dirty",
"coordinator": "true",
"http": "h*tp://10.3.2.33:18080",
"http-external": "h*tp://10.3.2.33:18080",
"datasources": ""
}
}
]
}
I think this is because you have two different node_version values in these two services.
If you are repackaging presto-main or any other component, make sure you are using the same binaries on all the nodes.
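A quick way to verify that is to query the /v1/info endpoint on each node and compare the reported versions (a sketch; the hosts and port are taken from the question):
# "nodeVersion" should be identical on the coordinator and every worker
curl -s http://10.3.2.33:18080/v1/info
curl -s http://10.3.2.24:18080/v1/info

# The coordinator also lists the worker nodes it currently sees
curl -s http://10.3.2.33:18080/v1/node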