Failed to start service VisualStudioRemoteDeployer - deployment

We are using on-site DevOps and have a similar problem to the one described in the linked Example from SO.
But ours is intermittent.
Our environment uses two build and deploy machines, with each deploy machine having two worker agents.
For one of our projects, when it is deployed, we frequently get the error:
The VisualStudioRemoteDeployerc4d3852f-411b-48ba-97d8-5e09c8d07ce4 service failed to start due to the following error:
%%2
But here is the rub: it does not fail every time. Sometimes the deployment completes without error.
Other projects that use the same deployment machine and the same target server work each and every time without fail.
The deployment log reports "The WSMan provider host process did not return a proper response." as an error.
Checking the allocated memory quota, as described in PowerShell Out of Memory, we found ours set at 2.1 billion.
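For reference, the quota mentioned above can be inspected (and raised) on the target server with something along these lines; the values shown are illustrative, not our actual settings:

    # Inspect the WSMan shell memory quota (value is in MB)
    Get-Item WSMan:\localhost\Shell\MaxMemoryPerShellMB
    Get-Item WSMan:\localhost\Plugin\Microsoft.PowerShell\Quotas\MaxMemoryPerShellMB
    # Raise it and restart WinRM for the change to take effect
    Set-Item WSMan:\localhost\Shell\MaxMemoryPerShellMB 2048
    Restart-Service WinRM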

This is an interesting issue that I have uncovered. The problem stems from the interaction with McAfee Endpoint Security.
When the remote PowerShell script was invoked over WSMan, McAfee treated the call as a viral payload and cancelled the deployment by stopping the service and deleting the payload. This has been reported to McAfee as an issue. In the meantime, the internal network security settings for McAfee have had to be modified to ignore the processes used by PowerShell during remote deployment.

Related

How Do Service Connections Work For On-Prem Agents Connecting To On-Prem Services?

This question is purposefully general because I'm trying to understand things more from an architectural perspective, since that will determine which group I need to contact. My team is using Azure DevOps (cloud) with on-prem build agents. The agents connect to ADO via a proxy.
We use several tools in-house provided by vendors with ADO plugins in the Marketplace that require us to set up service connections. Because the services are installed on-prem, the endpoints we enter are not available via the Web (e.g. https://vendor-product.my-company.com).
If I log into the build machine and open up IE, I am able to connect to the service endpoint URL. However, whenever I try to run a task from ADO, it fails with some kind of connection-related issue ("The underlying connection was closed: An unexpected error occurred on a send", "Task ended with an exception: Error: read ECONNRESET", etc.).
The way I thought it worked, all the work takes place on the build machine itself, so the calls would be going from my-build-server.my-company.com to https://vendor-product.my-company.com. Those error messages though make me wonder if the connection is actually coming from https://dev.azure.com.
So the questions I have are:
For situations like this, is the connection to a service endpoint going to be seen as coming from my on-prem build agent, or from ADO (or does it vary based on how the vendor writes their plugin)?
If the answer to #1 is "it varies", is there any way for me to tell just from the plugin itself without having to contact the vendor? (In my experience some of the vendor reps don't understand how the cloud works.)
and/or
Because my build agent was configured to use a proxy when I set it up, is it going to use that proxy for all connections, even internal ones? I think I can set up a proxy bypass list for the agents but I presently only have read access to the build box. I can request temporary elevated access but I'd need some level of confidence that's what the issue is.
Hope I explained the situation clearly, thanks in advance for any insight.
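If the proxy turns out to be the culprit in question 3, the Azure Pipelines agent supports a bypass list in a .proxybypass file in the agent root (one regex per line; matching URLs skip the configured proxy). A sketch, with placeholder hostnames and a placeholder service name, that would need the agent restarted afterwards:

    # Sketch: write a proxy bypass list in the agent's root folder.
    # Each line is a regex; URLs matching any line bypass the configured proxy.
    Set-Content -Path .\.proxybypass -Value @(
        'vendor-product\.my-company\.com',
        'localhost'
    )
    # Restart the agent service so it re-reads its proxy settings
    Get-Service vstsagent* | Restart-Service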

VSTS Agent service can't get code coverage data when running as Local System

Short version: Two builds, A and B, for the same commit, both running on our build server using the VSTS agent service
Build A:
Agent running as Network Service
Saves a .coverage file of 267kb, showing non-zero % code coverage
Runs successfully, no errors, same test logs as build B
Build B:
Agent running as Local System
Saves a .coverage file of 1kb, showing 0% code coverage
Runs successfully, no errors (except that a quality gate fails due to the 0% code coverage, but that's intentional), same test logs as build A
Extra info:
The VSTS Agent service normally ran on our build server as "Network Service", and all was well, until we had to modify the agent service to run as "Local System" so it could access a cert in the "LocalMachine" store, which we need for Azure AD service auth. After that, it still claimed to do everything successfully, except that the code coverage file is tiny and claims 0% code coverage, which is weird because the unit tests are certainly being run. The logs from the two test tasks are exactly identical (apart from things like timestamps and build numbers), with no helpful warnings or errors in there.
It's probably not ideal to run the agent as Local System, but that account has more permissions than Network Service does, so I don't know how this could be a permissions issue. I've probably just made a mistake in setting something up, but it seems like the only ways out of this are to:
give Network Service extra permissions (bad)
regenerate / move the Azure AD service principal cert into the "CurrentUser" cert store for Network Service (feels bad but I'm not sure why)
set up a new service account and resign ourselves to having permissions issues forevermore (ugh)
Can we somehow diagnose what exactly is going on with this test task without resorting to procmon? Or is there a better way to manage this stuff?
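A possible middle ground between those options: leave the agent on Network Service and grant that account read access to the certificate's private key. A rough sketch, assuming an RSA/CSP key in the LocalMachine\My store (the thumbprint is a placeholder):

    # Grant NETWORK SERVICE read access to the cert's private key file
    $thumbprint = '<cert thumbprint>'
    $cert    = Get-ChildItem Cert:\LocalMachine\My | Where-Object { $_.Thumbprint -eq $thumbprint }
    $keyName = $cert.PrivateKey.CspKeyContainerInfo.UniqueKeyContainerName
    $keyPath = Join-Path $env:ProgramData "Microsoft\Crypto\RSA\MachineKeys\$keyName"
    $acl  = Get-Acl $keyPath
    $rule = New-Object System.Security.AccessControl.FileSystemAccessRule('NETWORK SERVICE', 'Read', 'Allow')
    $acl.AddAccessRule($rule)
    Set-Acl -Path $keyPath -AclObject $acl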
Well this is rather annoying: I fixed it, but I don't know how. While demonstrating it to a colleague, all I did was repeat my previous steps of rebooting the server and switching the agent service back and forth between the two accounts a couple of times, at which point the problem stopped being reproducible. It seems this is one of those mysteriously vanishing problems that hides whenever you try too hard to investigate it. Hopefully it doesn't come back...

Visual studio release management - deploy with ps/dsc encountered error with server certificate

I'm trying to run a simple ps script on a target computer (my local machine) from our RM server through the RM client. However the release falls over when it reaches deploy using ps/dsc. The error message reads:
Connecting to remote server ### failed with the following error message : The server certificate on the destination computer (###:5985) has the following errors:
Encountered an internal error in the SSL library.
However as you can see by the winrm port number, I'm using HTTP not HTTPS to communicate with my machine, so surely SSL should not come into it. So has anyone else come across this or have any idea what I could be doing wrong?
UPDATE: the machines are part of the same domain.
In the Deploy Using PS/DSC action, keep the UseHTTPS variable set to false and skipCACheck set to true, just in case.
BTW, how long does it take for the action to show this error message in the logs? Also, as someone mentioned in the comments, are you able to manually run the script using PS remoting?
If none of the above helps, we would need more details. Try looking into the event logs on the target machine right after the deployment fails and check for any errors.
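For the manual PS remoting check mentioned above, something along these lines from the RM server would confirm whether WinRM itself is healthy over HTTP ("target-machine" is a placeholder):

    # Basic WinRM reachability check over HTTP (port 5985)
    Test-WSMan -ComputerName target-machine -Port 5985
    # Try executing a trivial script block remotely
    Invoke-Command -ComputerName target-machine -ScriptBlock { $env:COMPUTERNAME }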
I came across the same issue; installing the Azure service certificate on the VM resolved it.

Azure deployment with PowerShell, "New-AzureDeployment : There was no endpoint listening at https://management.core.windows.net/..."

Following the guide and PowerShell script from this article:
https://www.windowsazure.com/en-us/develop/net/common-tasks/continuous-delivery/
I've run into an extremely odd error:
9/4/2012 9:02 PM - Creating New Deployment: In progress
New-AzureDeployment : There was no endpoint listening at https://management.core.windows.net/5921d8af-88a1-4f63-9673-5e1ae1df7e8a/services/storageservices/Build_2012-09-04_02-27.1/dist/LNEC_Admin.Azure.cspkg/keys that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details.
It's odd because we're on build "Build_2012-09-04_08-16.1", not the one mentioned in the URL above (which no longer even exists on the filesystem). This is under Jenkins CI, which runs under the NETWORK SERVICE account. If I run it by hand with my own account the same error results, but with lnecint in place of the build directory: https://management.core.windows.net/5921d8af-88a1-4f63-9673-5e1ae1df7e8a/services/storageservices/lnecint/keys
That keyword "lnecint" isn't mentioned anywhere in any config (I've searched every file on the entire machine and the TFS server). It was the name of a storage account, but it was deleted long ago.
VS 2012, Azure SDK 1.7.1
There's definitely an issue with your endpoint. Can you check what parameters you're passing to the "New-AzureDeployment" Cmdlet?
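For comparison, a typical call from that continuous-delivery script looks roughly like this (the service name, slot, paths and label are placeholders for the classic Service Management cmdlet); it is worth checking that -Package and -Configuration point at the current build output rather than a stale path or blob URL:

    # Classic (pre-ARM) Azure Service Management deployment; all values are placeholders
    New-AzureDeployment -ServiceName "LNEC-Admin" `
                        -Slot "Staging" `
                        -Package ".\dist\LNEC_Admin.Azure.cspkg" `
                        -Configuration ".\dist\ServiceConfiguration.Cloud.cscfg" `
                        -Label "Build_2012-09-04_08-16.1"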

When doing a "build" with AnthillPro, I get an error: com.urbancode.command.CommandException: java.net.ConnectException: Connection refused

Any idea how to resolve this in AnthillPro? I am running the AnthillPro server on Ubuntu 10.10.
Knowing what version you are on would help, as would knowing which step is failing. But I assume there is a connectivity problem with the agent - either agent to server or server to agent.
Validate from the agent configuration that it is seen by the server and shows as online. In newer versions you can run an explicit communication test. In older versions you can go to the agent's variable screen, as that used to be pulled on each request rather than cached.
Then go to System -> Server Settings and find the connectivity URLs that are passed to the agent. Ensure that you can hit those URLs - exactly as they are on that screen - from the agent. If you can't, the agent won't be able to hit the Server's web services and you would see some sort of connectivity error - perhaps like this one.
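As a quick way to verify that last point, from an agent machine (assuming a Windows agent with PowerShell; the URL is a placeholder for whatever appears under Server Settings) something like this should return a response rather than a connection refusal:

    # Hit the server's connectivity URL exactly as configured on the Server Settings screen
    Invoke-WebRequest -Uri "http://anthill-server.example.com:8080/" -UseBasicParsing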