Turning on email notifications breaks MS Release Management - deployment

I am running TFS 2013 Update 4, Release Management Client Update 4, Release Management Server Update 4, and Update 4 Deployment Agents. I am using ReleaseTfvcTemplate.12.xml.
When a developer checks in code, TFS Build compiles the code, and if it completes then it is released to the DEV stage. This works fine.
However, turning on emails creates a problem.
Let's say I need to notify 10 people of a deployment and then send those same 10 people "approval" emails after the deployment is accepted, which it automatically is. That's 20 emails.
I turned on verbose logging on the RM server and I see that each email takes 30 seconds to send. They send one at a time, one after the other. So it takes ten minutes to send twenty emails.
The emails start sending as soon as the deployment starts. The actual deployment usually takes around 1 minute. Release Management marks the build as deployed and keeps sending the "deploying" and "approval" emails. Meanwhile the TFS Build Configuration log is stuck waiting at:
Process each ConfigurationsToRelease
Release the build
Run the Release Management build process for the current configruation
If a deployment finishes sending its emails, either because they are turned off or because there are only 3-4 to send, then the TFS Build Configuration log completes the release and the build is marked successful. However, TFS Build will only wait 5 minutes at the "Release the build" part of the ReleaseTfvcTemplate workflow. If it takes longer than 5 minutes to send 20 emails, which it does, the build fails. How do I increase this timeout? I have raised the timeout on every component/tool I could find in Release Management. I even changed some web.config timeout settings.
The end result is I end up with deployed code, Release Management thinks everything went fine, and TFS Build thinks the build failed.
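The arithmetic behind this can be checked with a quick sketch. The 30 seconds per email, the 20 emails, and the 5-minute wait are the figures reported above; the constant and function names are illustrative only and do not correspond to any Release Management setting.

```python
# Figures reported in the question; the names are illustrative, not RM settings.
SECONDS_PER_EMAIL = 30        # observed in the verbose RM server log
BUILD_WAIT_SECONDS = 5 * 60   # how long TFS Build waits at "Release the build"


def total_send_seconds(email_count: int) -> int:
    """Time to send all emails when they go out strictly one after another."""
    return email_count * SECONDS_PER_EMAIL


def build_times_out(email_count: int) -> bool:
    """True if serial email sending alone outlasts the build's wait."""
    return total_send_seconds(email_count) > BUILD_WAIT_SECONDS


for count in (4, 10, 20):
    print(count, total_send_seconds(count), build_times_out(count))
```

With these figures, 20 emails need 600 seconds, twice the 300 seconds the build is willing to wait, while 3-4 emails fit comfortably. The sketch ignores the roughly one minute the deployment itself takes.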
Edit:
Here are some lines I pulled from the verbose RM server logs. Notice the timestamps. (I cut some lines out)
7/28/2015 3:49:48 PM - Verbose - (13008, 12024) - A workflow execution is completed.
7/28/2015 3:49:48 PM - Information - (13008, 12024) - DeploymentControllerServiceProcessor.OnActivityComplete: Workflow completed successfully, accept the deployment step. LocalReleaseId: 596, LocalReleaseStepId: 2158
7/28/2015 3:54:47 PM - Information - (13008, 6952) - DeploymentControllerServiceProcessor.PrepareNotificationForDeployerImplementation: NextActivityReadyForDeployment:
7/28/2015 3:54:47 PM - Information - (13008, 6952) - DeploymentControllerServiceProcessor.GetNextComponentReadyForDeployment: DeploymentEvent:
7/28/2015 3:54:49 PM - Information - (13008, 12024) - Exception in DeploymentControllerServiceProcessor.OnActivityComplete, app.Completed
7/28/2015 3:54:49 PM - Verbose - (13008, 12024) - The request was aborted: The request was canceled.: \r\n\r\n
at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
at Microsoft.TeamFoundation.Release.Data.WebRequest.PlatformHttpClient.EndGetResponse(IAsyncResult asyncResult)
at Microsoft.TeamFoundation.Release.Data.WebRequest.RestClientResponseRetriever.EndGetAsyncMemoryStreamFromResponse(IAsyncResult asyncResult, IPlatformHttpClient platformHttpClient)
at Microsoft.TeamFoundation.Release.Data.WebRequest.RestClientResponseRetriever.EndDownloadString(IAsyncResult asyncResult, IPlatformHttpClient platformHttpClient)
at Microsoft.TeamFoundation.Release.Data.WebRequest.RestClient.EndPost(IAsyncResult asyncResult)
at Microsoft.TeamFoundation.Release.Data.Proxy.RestProxy.HttpRequestor.<>c__DisplayClass1.b__0(String url, String body)
at Microsoft.TeamFoundation.Release.Data.Proxy.RestProxy.BaseNotificationServiceProxy.SendNotification(Int32 releaseId, String releaseName, String applicationVersionName, String stageTypeName, String environmentName, Int32 releaseStepId, Int32 releaseStepTypeId, Boolean releaseStepIsAutomated)
at Microsoft.TeamFoundation.Release.Workflow.Services.ReleaseWorkflowService.CreateNextReleaseStep(Release release, Stage stage, StageStep stageStep, Int32 releaseStageRank, Int32 trialNumber)
at Microsoft.TeamFoundation.Release.Workflow.Services.ReleaseWorkflowService.MoveToNextReleaseStep(Release release, Stage currentStage, ReleaseStep currentReleaseStep)
at Microsoft.TeamFoundation.Release.Workflow.Services.ReleaseWorkflowService.MoveWorkflowForward(Release release, ReleasePath releasePath, Stage currentStage, ReleaseStep currentReleaseStep, Int32 lastStepRankOfCurrentStage)
at Microsoft.TeamFoundation.Release.Workflow.Services.ReleaseWorkflowService.AcceptStep(Release release, Int32 releaseStepId, Int32 actualApproverId, String approverComment, Nullable`1 deferredDateTime)
at Microsoft.TeamFoundation.Release.Workflow.Services.ReleaseWorkflowService.CreateNextReleaseStep(Release release, Stage stage, StageStep stageStep, Int32 releaseStageRank, Int32 trialNumber)
at Microsoft.TeamFoundation.Release.Workflow.Services.ReleaseWorkflowService.MoveToNextReleaseStep(Release release, Stage currentStage, ReleaseStep currentReleaseStep)
at Microsoft.TeamFoundation.Release.Workflow.Services.ReleaseWorkflowService.MoveWorkflowForward(Release release, ReleasePath releasePath, Stage currentStage, ReleaseStep currentReleaseStep, Int32 lastStepRankOfCurrentStage)
at Microsoft.TeamFoundation.Release.Workflow.Services.ReleaseWorkflowService.AcceptStep(Release release, Int32 releaseStepId, Int32 actualApproverId, String approverComment, Nullable`1 deferredDateTime)
at Microsoft.TeamFoundation.Release.ServiceProcessor.Processor.DeploymentControllerServiceProcessor.OnActivityComplete(String workflow, WorkflowApplicationCompletedEventArgs e)

There is a setting on the "Administration" tab under "Settings" for "TFS Trigger Deployment Timeout". If you increase that, the build won't fail after 5 minutes.
I'd invest some time looking at why it takes 30 seconds to send each email, though. I've never seen that particular problem... it could be a network issue, or an issue with your mail server.
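The timestamps in the question's log are consistent with a limit of about five minutes. A minimal Python sketch, using only the two timestamps quoted in the log excerpt, measures the gap between the deployment step being accepted and the notification request being cancelled. The format string is an assumption based on how the log prints its dates.

```python
from datetime import datetime

# Timestamps copied from the log excerpt in the question.
ACCEPTED = "7/28/2015 3:49:48 PM"  # "Workflow completed successfully, accept the deployment step"
ABORTED = "7/28/2015 3:54:49 PM"   # "The request was aborted: The request was canceled."

# Assumed from the log's appearance: month/day/year with a 12-hour clock.
LOG_TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two timestamps in the RM log's format."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


print(seconds_between(ACCEPTED, ABORTED))  # 301.0
```

A gap of 301 seconds lines up with a 5-minute timeout cancelling the pending request. It does not prove which setting is responsible, but it fits the behavior described.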

Related

Suddenly getting ##[error]System.ArgumentNullException: Value cannot be null. (Parameter 'input')

Problem appeared just today - previously it was working fine. Suddenly getting the error:
##[error]System.ArgumentNullException: Value cannot be null. (Parameter 'input')
at System.Text.RegularExpressions.Regex.Replace(String input, String replacement)
at Microsoft.VisualStudio.Services.Agent.Util.StringUtil.DeactivateVsoCommands(String input)
at Microsoft.VisualStudio.Services.Agent.Worker.WorkerUtilities.DeactivateVsoCommandsFromJobMessageVariables(AgentJobRequestMessage message)
at Microsoft.VisualStudio.Services.Agent.Worker.Worker.RunAsync(String pipeIn, String pipeOut)
at Microsoft.VisualStudio.Services.Agent.Worker.Program.MainAsync(IHostContext context, String[] args)
Error reported in diagnostic logs. Please examine the log for more details.
- /home/vsts/agents/2.213.1/_diag/Worker_20221110-110312-utc.log
Pool: Azure Pipelines
Image: ubuntu-latest
Queued: Today at 13:00 [manage parallel jobs]
The agent request is already running or has already completed.
The stage seems to wait for the agent and then fails with the above message.
After cancelling the pipeline and restarting the stage, I get the same error. What is the problem? What would be a temporary workaround?
Our pipelines are defined with YAML.
I have no access to the log indicated above, so I cannot provide more details from there.
Regards,
Roman.
UPDATE 14/11/2022:
After applying patch 2.213.2 to MS hosted agents Microsoft has resolved the issue
===============================================
Update 11/14
According to the latest agent release, the error has been fixed in version 2.213.2.
You could try to update again and test to see if the issue persists.
If the issue is still observed, you could share the latest status with us.
This issue also started for us yesterday. It is indeed related to the pipeline using a Subversion source. We mitigated the issue by switching to our own private agent pool and moving away from the public Azure DevOps agents.
Update 1:
The mitigation worked for a few hours but all pipelines are now failing with:
##[error]System.ArgumentOutOfRangeException: The startIndex argument must be greater than or equal to zero. (Parameter 'startIndex')
at System.Collections.Concurrent.ConcurrentStack`1.ValidatePushPopRangeInput(T[] items, Int32 startIndex, Int32 count)
at System.Collections.Concurrent.ConcurrentStack`1.PushRange(T[] items, Int32 startIndex, Int32 count)
at System.Collections.Concurrent.ConcurrentStack`1.PushRange(T[] items)
at Agent.Plugins.Repository.SvnCliManager.<>c__DisplayClass3_0.<GetSvnWorkingCopyPaths>b__0(String fld)
at System.Linq.Parallel.ForAllOperator`1.ForAllEnumerator`1.MoveNext(TInput& currentElement, Int32& currentKey)
at System.Linq.Parallel.ForAllSpoolingTask`2.SpoolingWork()
at System.Linq.Parallel.SpoolingTaskBase.Work()
at System.Linq.Parallel.QueryTask.BaseWork(Object unused)
at System.Linq.Parallel.QueryTask.<>c.<.cctor>b__10_0(Object o)
at System.Threading.Tasks.Task.InnerInvoke()
at System.Threading.Tasks.Task.<>c.<.cctor>b__274_0(Object obj)
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
We had the same issue, which started yesterday (11/10/2022), with Subversion and Azure-hosted build agents. We had to set up an on-premises agent and point the build pipeline at it instead of the Azure-hosted one.

Azure Devops - Release Pipeline when re-running failed tests azure devops shows failure status even if re-run succeeded

I use SpecFlow with SpecRunner+. I am using Default.srprofile to re-run failed tests 3 times. In Visual Studio it shows 2 passed, 1 failed, but the status of the test is a failure. The same goes for Azure DevOps: if a re-run test passes, the outcome of the run is still a failure. The failures are sometimes caused by locator timeouts or server timeouts. This does not happen often, but we saw it a few times, which is why we decided to implement a re-run.
Could anyone help on this?
2022-02-09T12:40:13.8607507Z Test Run Failed.
2022-02-09T12:40:13.8608607Z Total tests: 37
2022-02-09T12:40:13.8609271Z Passed: 36
2022-02-09T12:40:13.8609858Z Failed: 1
2022-02-09T12:40:13.8617476Z Total time: 7.4559 Minutes
2022-02-09T12:40:13.9226929Z ##[warning]Vstest failed with error. Check logs for failures. There might be failed tests.
2022-02-09T12:40:14.0075402Z ##[error]Error: The process 'D:\Microsoft_Visual_Studio\2019\Common7\IDE\Extensions\TestPlatform\vstest.console.exe' failed with exit code 1
2022-02-09T12:40:14.8164576Z ##[error]VsTest task failed.
But the report states that the test was retried 3 times and that 2 of the retries were successful, yet the Azure DevOps run still has a failure status.
The behavior of the report is the correct one and sadly this can't be configured to be changed.
What you can do is to adjust how the results are reported back to Azure DevOps.
You can configure it via the VSTest element in the srprofile file.
This example means that at least one retry has to pass:
<VSTest testRetryResults="Unified" passRateAbsolute="1"/>
Docs: https://docs.specflow.org/projects/specflow-runner/en/latest/Profile/VSTest.html
Be aware that we have stopped the development of the SpecFlow+ Runner. More details here: https://specflow.org/using-specflow/the-retirement-of-specflow-runner/
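To make the rule behind passRateAbsolute="1" concrete, here is a minimal Python sketch of the decision described above: a test counts as passed when at least that many of its executions passed. It only illustrates the rule as stated in this answer. It is not SpecFlow+ Runner's code, and the function name is hypothetical.

```python
from typing import Sequence


def unified_outcome(executions: Sequence[bool], pass_rate_absolute: int = 1) -> bool:
    """Return True if at least `pass_rate_absolute` executions passed.

    `executions` holds one boolean per run of the same test
    (the first attempt plus any retries).
    """
    return sum(executions) >= pass_rate_absolute


# One failure followed by two passing retries, as in the question.
print(unified_outcome([False, True, True]))    # True
# Every attempt failed, so the test is still reported as failed.
print(unified_outcome([False, False, False]))  # False
```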

VSTS Build jobs freeze sporadically

We are using Visual Studio Team Services online with an in-house build agent. While running a job, the build agent will randomly freeze: the job is still active, but there are no updates to the console, no errors in the event logs, etc. If I open the agent's _diag folder and look at the log, it just repeats what is below until it decides to continue work.
17:02:19.850546 LogFileTimer_Callback - enter (20)
17:02:19.850546 LogFileTimer_Callback - processing job 7b9229d0-524e-4138-b6b3-33f630d109c6
17:02:19.850546 LogFileTimer_Callback - found 0 records for job 7b9229d0-524e-4138-b6b3-33f630d109c6
17:02:19.850546 LogFileTimer_Callback - leave
17:02:20.100159 StatusTimer_Callback - enter (27)
17:02:20.100159 StatusTimer_Callback - processing job 7b9229d0-524e-4138-b6b3-33f630d109c6
17:02:20.100159 StatusTimer_Callback - leave
17:02:20.240566 ConsoleTimer_Callback - enter (17)
17:02:20.240566 ConsoleTimer_Callback - Inside Lock
17:02:20.240566 ConsoleTimer_Callback - processing job 7b9229d0-524e-4138-b6b3-33f630d109c6
17:02:20.240566 ConsoleTimer_Callback - leave
17:02:20.755392 ConsoleTimer_Callback - enter (22)
17:02:20.755392 ConsoleTimer_Callback - Inside Lock
17:02:20.755392 ConsoleTimer_Callback - processing job 7b9229d0-524e-4138-b6b3-33f630d109c6
17:02:20.755392 ConsoleTimer_Callback - leave
17:02:20.864598 StatusTimer_Callback - enter (18)
17:02:20.864598 StatusTimer_Callback - processing job 7b9229d0-524e-4138-b6b3-33f630d109c6
17:02:20.864598 StatusTimer_Callback - leave
We have tried deleting the work folder and uninstalling and reinstalling the agent, and it still seems to freeze on random jobs. Any idea what else I could look into as to why this is happening?
I just checked one of my own logs and found the same entries scattered throughout the file, for example while restoring packages, uploading logs, or retrieving files.
These entries do not indicate an error. You could try creating a new agent on another machine to see whether the freeze also occurs there.
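Since the timer entries themselves are harmless, it may help to measure how long the agent log shows nothing but those entries, so the freeze can be matched against whatever the job was doing at that time. The following Python sketch assumes the line format shown in the question (a time with microseconds followed by a `...Timer_Callback` name) and assumes the stretch does not cross midnight. The function name is hypothetical.

```python
import re
from datetime import datetime
from typing import Iterable, Optional, Tuple

# Matches lines such as "17:02:19.850546 LogFileTimer_Callback - enter (20)".
TIMER_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) \w+Timer_Callback - ")
STAMP_FORMAT = "%H:%M:%S.%f"


def longest_timer_only_stretch(lines: Iterable[str]) -> Optional[Tuple[str, str, float]]:
    """Return (start, end, seconds) of the longest run of consecutive timer lines."""
    best = None
    run_start = None
    for line in lines:
        match = TIMER_LINE.match(line.strip())
        if not match:
            run_start = None  # any other log entry ends the idle stretch
            continue
        stamp = match.group(1)
        run_start = run_start or stamp
        seconds = (datetime.strptime(stamp, STAMP_FORMAT)
                   - datetime.strptime(run_start, STAMP_FORMAT)).total_seconds()
        if best is None or seconds > best[2]:
            best = (run_start, stamp, seconds)
    return best


sample = [
    "17:02:19.850546 LogFileTimer_Callback - enter (20)",
    "17:02:19.850546 LogFileTimer_Callback - leave",
    "17:02:20.864598 StatusTimer_Callback - leave",
]
print(longest_timer_only_stretch(sample))
```

To run it against a real log, pass the open file object instead of `sample`. Lines that do not match the assumed format simply end the current stretch.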

Not able to send/receive email from Jenkins using Email-ext plugin

I am using Jenkins ver. 1.463 running on 32-bit Windows Server. I have installed Email-ext plugin version 2.30.2.
I am unable to get any email notifications.
What I am trying to do is send an email after every job, irrespective of whether the result is success, failure, abort, not-built, etc.
I am using Jenkins to run Automated Test Suites.
The way I have configured a test job: in the Post-Build Actions I selected "Editable Email Notification" and filled in the required fields (recipient list, etc.). Under Advanced, I selected all the triggers from the dropdown, such as Success, Failure, Aborted, Regression, Fixed, Not-Built, and Stable.
I have verified the SMTP server and the recipients, but I still do not get any email.
In the Console Output of the job I see the following lines:
Email was triggered for: Success
Sending email for trigger: Success.
In "jenkins.err.log" on the Jenkins server I see the following exception being thrown, but I don't know the exact cause:
Aug 27, 2013 5:41:57 PM hudson.model.Run run
INFO: TestJob-for-Email #7 main build action completed: SUCCESS
Aug 27, 2013 5:41:58 PM hudson.model.Executor run
SEVERE: Executor threw an exception
java.lang.NoSuchMethodError: hudson.model.AbstractBuild.getPreviousBuild()Lhudson/model/AbstractBuild;
at hudson.plugins.emailext.plugins.content.BuildStatusContent.evaluate(BuildStatusContent.java:71)
at org.jenkinsci.plugins.tokenmacro.DataBoundTokenMacro.evaluate(DataBoundTokenMacro.java:177)
at org.jenkinsci.plugins.tokenmacro.TokenMacro.expand(TokenMacro.java:177)
at org.jenkinsci.plugins.tokenmacro.TokenMacro.expandAll(TokenMacro.java:219)
at hudson.plugins.emailext.plugins.ContentBuilder.transformText(ContentBuilder.java:63)
at hudson.plugins.emailext.ExtendedEmailPublisher.setSubject(ExtendedEmailPublisher.java:687)
at hudson.plugins.emailext.ExtendedEmailPublisher.createMail(ExtendedEmailPublisher.java:485)
at hudson.plugins.emailext.ExtendedEmailPublisher.sendMail(ExtendedEmailPublisher.java:319)
at hudson.plugins.emailext.ExtendedEmailPublisher._perform(ExtendedEmailPublisher.java:311)
at hudson.plugins.emailext.ExtendedEmailPublisher.perform(ExtendedEmailPublisher.java:271)
at hudson.tasks.BuildStepMonitor$3.perform(BuildStepMonitor.java:36)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:710)
at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:685)
at hudson.maven.MavenModuleSetBuild$RunnerImpl.cleanUp(MavenModuleSetBuild.java:1018)
at hudson.model.Run.run(Run.java:1478)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:477)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:239)
Could someone please help? Thanks a lot!
You need to upgrade Jenkins. Your version is too old for the version of the email-ext plugin that you're using. See https://issues.jenkins-ci.org/browse/JENKINS-18728

ClickOnce: DeploymentDownloadException: The operation has timed out

Symptom: ClickOnce installation starts and stops after around 600 kB (out of 2 MB).
Progress bar always stops at the same value (tried ten times).
Error log says that The operation has timed out (in inner exception) and fails with "DeploymentDownloadException (Unknown subtype)".
Error log details (irrelevant information trimmed):
ERROR DETAILS
Following errors were detected during this operation.
System.Deployment.Application.DeploymentDownloadException (Unknown subtype)
- Downloading http://fullpath/name.dll.deploy did not succeed.
- Source: System.Deployment
- Stack trace:
at System.Deployment.Application.SystemNetDownloader.DownloadSingleFile(DownloadQueueItem next)
at System.Deployment.Application.SystemNetDownloader.DownloadAllFiles()
at System.Deployment.Application.FileDownloader.Download(SubscriptionState subState)
--- Inner Exception ---
System.Net.WebException
- The operation has timed out.
- Source: System
- Stack trace:
at System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Deployment.Application.SystemNetDownloader.DownloadSingleFile(DownloadQueueItem next)
This only happens for two customers. The install works OK for thousands of others. I have found numerous posts via Google with either no answer or a generic "firewall is the issue" or "customer was using dialup".
Has anyone solved this? Is this a ClickOnce bug?
Disabling firewall software on the machine did not help because a hardware firewall installed on the network was the cause (FortiGate 30B).
I doubt that it's a bug. However, it seems like it gets stuck at one file in the deployment path. Maybe it is a type of file that is blocked by a firewall.
I would just remove all files but one from the build and see if that gets downloaded ok, and then add the rest of the files one by one (or maybe type by type) and see at what file ClickOnce gets stuck downloading.
If that doesn't seem to do anything, I'd build a dummy app and deploy it with ClickOnce and see if it installs at all on the customer's box.
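A quicker way to try the file-by-file idea, without rebuilding the deployment each time, might be to fetch every deployment file individually from the affected machine and see which one fails. The Python sketch below is only an approximation: it does not use ClickOnce's downloader, the URLs are placeholders (the real ones would come from your own deployment manifest), and `probe_files` and `fake_fetch` are hypothetical names.

```python
import http.client
import urllib.request
from typing import Callable, Dict, Iterable


def fetch_size(url: str, timeout: float = 30.0) -> int:
    """Download one file completely and return its size in bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return len(response.read())


def probe_files(urls: Iterable[str], fetch: Callable[[str], int] = fetch_size) -> Dict[str, str]:
    """Try each deployment file on its own and record whether it downloaded."""
    results = {}
    for url in urls:
        try:
            results[url] = f"ok ({fetch(url)} bytes)"
        except (OSError, http.client.HTTPException) as error:
            # URLError and socket timeouts are OSError subclasses.
            results[url] = f"failed: {error}"
    return results


def fake_fetch(url: str) -> int:
    """Stand-in for a real download so the sketch runs without a network."""
    if url.endswith("blocked.dll.deploy"):
        raise TimeoutError("The operation has timed out")
    return 1024


placeholder_urls = [
    "http://example.invalid/ok.dll.deploy",
    "http://example.invalid/blocked.dll.deploy",
]
for url, status in probe_files(placeholder_urls, fake_fetch).items():
    print(url, "->", status)
```

On the customer's machine you would call `probe_files` with the real URLs and the default `fetch_size`. A file that consistently fails while the others succeed is a good candidate for whatever the firewall is blocking.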