Service Fabric .Net Framework 4.5.1 and 4.6 - azure-service-fabric

After changing the target framework from 4.5.1 to 4.6, the service fails in Azure, but the local deployment works.
Do I need to add .NET 4.6 support? I'm unable to find where I can see which frameworks are available in my Azure cluster.
Thank you
ApplicationName : fabric:/Lending20.Service.IdentityManagement
AggregatedHealthState : Error
UnhealthyEvaluations :
Unhealthy services: 100% (1/1), ServiceType='IdentityManagementServiceType', MaxPercentUnhealthyServices=0%.
Unhealthy service: ServiceName='fabric:/Lending20.Service.IdentityManagement/IdentityManagementService', AggregatedHealthState='Error'.
Unhealthy partitions: 100% (1/1), MaxPercentUnhealthyPartitionsPerService=0%.
Unhealthy partition: PartitionId='7c68b397-fda3-491d-9e17-921cd24217ca', AggregatedHealthState='Error'.
Error event: SourceId='System.FM', Property='State'.
ServiceHealthStates :
ServiceName : fabric:/Lending20.Service.IdentityManagement/IdentityManagementService
AggregatedHealthState : Error
DeployedApplicationHealthStates :
ApplicationName : fabric:/Lending20.Service.IdentityManagement
NodeName : _lending1
AggregatedHealthState : Ok
HealthEvents :
SourceId : System.CM
Property : State
HealthState : Ok
SequenceNumber : 3464
SentAt : 11/21/2015 12:38:08 PM
ReceivedAt : 11/21/2015 12:38:08 PM
TTL : Infinite
Description : Application has been created.
RemoveWhenExpired : False
IsExpired : False
Transitions : Warning->Ok = 11/21/2015 12:38:08 PM, LastError = 1/1/0001 12:00:00 AM

You can use the following ARM template to install .NET 4.6.1. Note that it depends on this script (used by Service Profiler); you can replace it with any other PowerShell script.
The vmName parameter is the base name of the nodes. So if your nodes are named VM0, VM1, and so on, set vmName = 'VM'. The vmExtensionLoop count is set to 5, which covers VM0 through VM4; change it to match the number of nodes in your cluster.
If you use an ARM template to deploy your cluster, you can include this as part of it. Note it can slow down the deployment of the scale set, since it requires a restart.
{
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"vmName": {
"type": "string",
"metadata": {
"description": "Virtual machine name."
}
}
},
"resources": [
{
"apiVersion": "2015-05-01-preview",
"type": "Microsoft.Compute/virtualMachines/extensions",
"name": "[concat(parameters('vmName'),copyIndex(0), '/CustomScriptExtensionInstallNet461')]",
"location": "[variables('location')]",
"tags": {
"displayName": "CustomScriptExtensionInstallNet461"
},
"properties": {
"publisher": "Microsoft.Compute",
"type": "CustomScriptExtension",
"typeHandlerVersion": "1.4",
"autoUpgradeMinorVersion": true,
"settings": {
"fileUris": [ "https://gist.githubusercontent.com/aelij/7ea90dda4a187a482584/raw/a3e0f946d4a22b0af803edb503d0a30a263fba2c/InstallNetFx461.ps1" ],
"commandToExecute": "powershell.exe -ExecutionPolicy Unrestricted -File InstallNetFx461.ps1"
}
},
"copy": {
"name": "vmExtensionLoop",
"count": 5
}
}
]
}
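To make the naming in the copy loop concrete, here is a minimal Python sketch of what the expression concat(parameters('vmName'), copyIndex(0), '/CustomScriptExtensionInstallNet461') expands to for each iteration. The function name extension_names is made up for this illustration; it only mimics the string building and does not call Azure.

```python
from typing import List


def extension_names(vm_name: str, count: int,
                    suffix: str = "CustomScriptExtensionInstallNet461") -> List[str]:
    """Mimic concat(vmName, copyIndex(0), '/<suffix>') for every copy iteration.

    copyIndex(0) starts at 0, so a count of 5 yields indexes 0 through 4.
    """
    return [f"{vm_name}{index}/{suffix}" for index in range(count)]


# With vmName = 'VM' and count = 5, as in the template above.
names = extension_names("VM", 5)
for name in names:
    print(name)
```

If the node names in the cluster do not follow the base-name-plus-index pattern, the generated names will not match any VM, and the extension resources will fail to deploy.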

.NET 4.6 is not yet available in the default Windows Server 2012 image used in Azure. At this point, your only option is to log into each VM and install it.

Use the Windows Server 2016 image to get .NET 4.6.1 preinstalled: set vmImageSku to "2016-Datacenter" when provisioning the cluster.

Another option is to use an Azure resource group template that includes a DSC extension to provision your VMs with .NET 4.6 installed.
Here is the snippet in my DSC PowerShell script that handles the installation of .NET 4.6.1:
code, or see the gist for a more complete script

Until 4.6 is supported by Azure natively, I'd use a custom VM image with .NET 4.6 preinstalled. See this article for details on how to create and use one.

.NET 4.6 and above is now available as of the release of SDK 2.5.216 and Runtime 5.5.216.
For more details please see: https://azure.microsoft.com/en-us/blog/announcing-azure-service-fabric-5-5-and-sdk-2-5/

Related

Pod identity on AKS cluster creation

Right now, it's impossible to have user-assigned identities assigned in ARM templates (and Terraform) on cluster creation. I have already tried a lot of things, and updates work great after adding the identity manually with:
az aks pod-identity add --cluster-name my-aks-cn --resource-group myrg --namespace myns --name example-pod-identity --identity-resource-id /subscriptions/......
But I want this done in one go with the deployment, so I need to add the pod user identities to the cluster automatically. I also tried to run the command using DeploymentScripts, but deployment scripts cannot yet use the preview AKS extension.
My config looks like this:
{
"type": "Microsoft.ContainerService/managedClusters",
"apiVersion": "2021-02-01",
"name": "[variables('cluster_name')]",
"location": "[variables('location')]",
"dependsOn": [
"[resourceId('Microsoft.Network/virtualNetworks', variables('vnet_name'))]"
],
"properties": {
....
"podIdentityProfile": {
"allowNetworkPluginKubenet": null,
"enabled": true,
"userAssignedIdentities": [
{
"identity": {
"clientId": "[reference(resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', 'managed-indentity'), '2018-11-30').clientId]",
"objectId": "[reference(resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', 'managed-indentity'), '2018-11-30').principalId]",
"resourceId": "[resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', 'managed-indentity')]"
},
"name": "managed-indentity",
"namespace": "myns"
}
],
"userAssignedIdentityExceptions": null
},
....
},
"identity": {
"type": "SystemAssigned"
}
},
I'm always getting the same issue:
"statusMessage": "{\"error\":{\"code\":\"InvalidTemplateDeployment\",\"message\":\"The template deployment 'deployment_test' is not valid according to the validation procedure. The tracking id is '.....'. See inner errors for details.\",\"details\":[{\"code\":\"PodIdentityAddonUserAssignedIdentitiesNotAllowedInCreation\",\"message\":\"Provisioning of resource(s) for container service cluster-12344 in resource group myrc failed. Message: {\\n \\\"code\\\": \\\"PodIdentityAddonUserAssignedIdentitiesNotAllowedInCreation\\\",\\n \\\"message\\\": \\\"PodIdentity addon does not support assigning pod identities on creation.\\\"\\n }. Details: \"}]}}",
The Product team has shared the answer here: https://github.com/Azure/aad-pod-identity/issues/1123
which says:
This is a known limitation in the existing configuration. We will fix
this in the V2 implementation.
For others who are facing the same issue, please refer to the GitHub issue above.
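Until that limitation is lifted, one possible workaround is to split the work in two: create the cluster without userAssignedIdentities, then add the identities with the az aks pod-identity add command from the question. The Python sketch below only prepares those two pieces from a resource definition. It assumes the cluster can be created with the profile enabled but empty, which the error message suggests but which is not confirmed here. The function name and the resolved resource ID in the sample are hypothetical.

```python
import copy
import shlex


def split_pod_identities(cluster_resource, cluster_name, resource_group):
    """Return (resource_without_identities, follow_up_commands).

    The input is left untouched; the returned resource has the
    userAssignedIdentities list removed from podIdentityProfile.
    """
    resource = copy.deepcopy(cluster_resource)
    profile = resource.get("properties", {}).get("podIdentityProfile", {})
    identities = profile.pop("userAssignedIdentities", None) or []

    commands = []
    for entry in identities:
        args = [
            "az", "aks", "pod-identity", "add",
            "--cluster-name", cluster_name,
            "--resource-group", resource_group,
            "--namespace", entry["namespace"],
            "--name", entry["name"],
            "--identity-resource-id", entry["identity"]["resourceId"],
        ]
        # Quote each argument so the line can be pasted into a shell.
        commands.append(" ".join(shlex.quote(arg) for arg in args))
    return resource, commands


# Hypothetical, already-resolved resource ID (not an ARM expression).
identity_id = ("/subscriptions/0000/resourceGroups/myrg/providers/"
               "Microsoft.ManagedIdentity/userAssignedIdentities/managed-indentity")
cluster = {
    "type": "Microsoft.ContainerService/managedClusters",
    "properties": {
        "podIdentityProfile": {
            "enabled": True,
            "userAssignedIdentities": [
                {"identity": {"resourceId": identity_id},
                 "name": "managed-indentity",
                 "namespace": "myns"}
            ],
        }
    },
}

creation_resource, commands = split_pod_identities(cluster, "my-aks-cn", "myrg")
print(commands[0])
```

The commands still have to run somewhere that has the preview AKS CLI extension installed, for example a pipeline step after the template deployment.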

How to figure out the physical machines on which the build tasks were run with Azure DevOps REST Api?

We have an on-premise Azure DevOps 2019. I need to know what builds run on what machines, not agents.
Motivation: When the builds are slow I would like to know what builds are running on a particular physical machine (which could be a virtual machine, but for this purpose I call them physical to distinguish from the agents). This could help us figure out if some builds should not be running on the same machines.
Given a build object, I can extract the worker name from the build tasks:
C:\> (Invoke-RestMethod $Build._links.timeline.href -UseDefaultCredentials).records.workerName |? { $_ } | sort -unique
TDC5DFC1BLD10_02
C:\>
So, I know all the tasks in the build were run on the build agent TDC5DFC1BLD10_02. But I want to know the physical machine name. So, I query the agent using its name:
C:\> (Invoke-RestMethod "$TfsInstanceUrl/_apis/distributedtask/pools/$($build.queue.pool.id)/agents?agentName=TDC5DFC1BLD10_02" -UseDefaultCredentials).value
_links : @{self=; web=}
maxParallelism : 1
createdOn : 2019-05-16T19:33:31.567Z
authorization : @{clientId=c4cebb22-e14f-4fdb-844c-079150766efc; publicKey=}
id : 308
name : TDC5DFC1BLD10_02
version : 2.131.0
osDescription : Microsoft Windows 10.0.14393
enabled : True
status : online
provisioningState : Provisioned
C:\>
But it does not give me the physical machine. I have no idea what queue or pool is, but I can check them too:
C:\> $Build.queue | ConvertTo-Json
{
"id": 1929,
"name": "GC-Master-TDC5DFC1BLD08-11",
"pool": {
"id": 90,
"name": "GC-Master-TDC5DFC1BLD08-11"
}
}
C:\> Invoke-RestMethod "$TfsInstanceUrl/SharpTop/_apis/distributedtask/queues/1929" -UseDefaultCredentials | ConvertTo-Json
{
"id": 1929,
"projectId": "ecff38d6-a219-4739-8b97-5e5d8d00e7ed",
"name": "GC-Master-TDC5DFC1BLD08-11",
"pool": {
"id": 90,
"scope": "a984b12d-89d2-47d6-998e-b9bfaa69ee85",
"name": "GC-Master-TDC5DFC1BLD08-11",
"isHosted": false,
"poolType": "automation",
"size": 8
}
}
C:\> Invoke-RestMethod "$TfsInstanceUrl/_apis/distributedtask/pools/90" -UseDefaultCredentials
createdOn : 2019-05-16T19:13:33.493Z
autoProvision : True
autoSize :
agentCloudId :
createdBy : @{displayName=Doe, John;
url=http://tdc1tfsapp01.xyz.com:8080/tfs/_apis/Identities/cc71b5eb-9dd6-436a-b722-6790d7ef4877; _links=;
id=cc71b5eb-9dd6-436a-b722-6790d7ef4877; uniqueName=xyz\P120A76; imageUrl=http://tdc1tfsapp01.xyz.com:8080/t
fs/_api/_common/identityImage?id=cc71b5eb-9dd6-436a-b722-6790d7ef4877;
descriptor=win.Uy0xLTUtMjEtNDg3MjU1NDc3LTE2MzE1MjcwMjItMzUxNzQ0NDQyLTE1NzQy}
owner : @{displayName=Doe, John;
url=http://tdc1tfsapp01.xyz.com:8080/tfs/_apis/Identities/cc71b5eb-9dd6-436a-b722-6790d7ef4877; _links=;
id=cc71b5eb-9dd6-436a-b722-6790d7ef4877; uniqueName=xyz\P120A76; imageUrl=http://tdc1tfsapp01.xyz.com:8080/t
fs/_api/_common/identityImage?id=cc71b5eb-9dd6-436a-b722-6790d7ef4877;
descriptor=win.Uy0xLTUtMjEtNDg3MjU1NDc3LTE2MzE1MjcwMjItMzUxNzQ0NDQyLTE1NzQy}
id : 90
scope : a984b12d-89d2-47d6-998e-b9bfaa69ee85
name : GC-Master-TDC5DFC1BLD08-11
isHosted : False
poolType : automation
size : 8
And I still have no idea of the physical machine. How do I do it?
Agent.ComputerName holds the hostname of the machine on which the agent executing your job runs.
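As a rough sketch of how that value could be used, the Python below maps agents to machines from an agents response. It assumes the agents endpoint was queried with includeCapabilities=true, so that each agent carries a systemCapabilities object containing Agent.ComputerName. The sample response is hand-written for illustration, and the machine name TDC5DFC1BLD10 and the second agent are hypothetical.

```python
from collections import defaultdict


def machine_for_agents(agents_response):
    """Map each agent name to its Agent.ComputerName capability (or None)."""
    machines = {}
    for agent in agents_response.get("value", []):
        capabilities = agent.get("systemCapabilities") or {}
        machines[agent["name"]] = capabilities.get("Agent.ComputerName")
    return machines


def agents_per_machine(agents_response):
    """Group agent names by the machine they run on."""
    grouped = defaultdict(list)
    for name, machine in machine_for_agents(agents_response).items():
        grouped[machine].append(name)
    return {machine: sorted(names) for machine, names in grouped.items()}


# Hand-written sample in the shape of the agents response; values are made up.
sample = {
    "value": [
        {"id": 308, "name": "TDC5DFC1BLD10_02",
         "systemCapabilities": {"Agent.ComputerName": "TDC5DFC1BLD10"}},
        {"id": 307, "name": "TDC5DFC1BLD10_01",
         "systemCapabilities": {"Agent.ComputerName": "TDC5DFC1BLD10"}},
    ]
}

by_agent = machine_for_agents(sample)
by_machine = agents_per_machine(sample)
print(by_machine)
```

Combining this with the workerName values from the build timeline gives the machine for each build, which is what is needed to spot builds competing for the same host.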

Problem with cloudformation stack update and launch template version / autoscaling group

I have a stack in CloudFormation (ECS cluster, App LB, Autoscaling Group, launch templates, etc.). It all works fine, and we have been using it in production and pre-production environments for a while.
A problem recently arose while trying to push a stack update. I made some changes to UserData in the AWS::EC2::LaunchTemplate. If I launch a new stack from this template, it works great.
BUT:
If I make a change set and apply a stack update, CloudFormation creates a NEW launch template version; however, the autoscaling group still references the OLD version.
Looking at the AWS docs for AWS::AutoScaling::AutoScalingGroup LaunchTemplateSpecification
I see:
"AWS CloudFormation does not support specifying $Latest, or $Default for the template version number."
Has anyone wrangled with stack updates creating new versions of resources that need to be referenced elsewhere? I feel like I am missing something obvious.
yay, I'm dumb:
use Fn::GetAtt
ok, make fun of me for using JSON not YAML
...
"ECSAutoScalingGroup": {
"Type": "AWS::AutoScaling::AutoScalingGroup",
"Properties": {
"VPCZoneIdentifier": {"Ref" : "Subnets"},
"MinSize": "1",
"MaxSize": "10",
"DesiredCapacity": { "Ref": "DesiredInstanceCount" },
"MixedInstancesPolicy": {
"InstancesDistribution" :
{
"OnDemandBaseCapacity" : "0",
"OnDemandPercentageAboveBaseCapacity" : { "Ref" : "PercentOnDemand"}
},
"LaunchTemplate" : {
"LaunchTemplateSpecification" : {
"LaunchTemplateId" : {"Ref" : "ECSLaunchTemplate"},
"Version" : { "Fn::GetAtt" : [ "ECSLaunchTemplate", "LatestVersionNumber" ] }
},
"Overrides" : [ {"InstanceType": "m5.xlarge"},{"InstanceType": "t3.xlarge"},{"InstanceType": "m4.xlarge" },{"InstanceType": "r4.xlarge"},{"InstanceType": "c4.xlarge"}]
}
}
},
...
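To catch the same problem in other templates, a small check like the following Python sketch can flag Auto Scaling groups whose launch template version is hard-coded instead of resolved with Fn::GetAtt. The function name and the sample template are made up for this illustration; it inspects a template already loaded as a dict and does not talk to AWS.

```python
def hard_coded_versions(template):
    """Return logical IDs of Auto Scaling groups whose launch template
    Version is not an Fn::GetAtt reference."""
    findings = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::AutoScaling::AutoScalingGroup":
            continue
        props = resource.get("Properties", {})
        # The specification can sit directly on the group or inside
        # MixedInstancesPolicy, as in the snippet above.
        specs = [
            props.get("LaunchTemplate"),
            props.get("MixedInstancesPolicy", {})
                 .get("LaunchTemplate", {})
                 .get("LaunchTemplateSpecification"),
        ]
        for spec in specs:
            if not spec:
                continue
            version = spec.get("Version")
            uses_getatt = isinstance(version, dict) and "Fn::GetAtt" in version
            if not uses_getatt and logical_id not in findings:
                findings.append(logical_id)
    return findings


# Hypothetical template with one good and one stale reference.
template = {
    "Resources": {
        "GoodGroup": {
            "Type": "AWS::AutoScaling::AutoScalingGroup",
            "Properties": {"MixedInstancesPolicy": {"LaunchTemplate": {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateId": {"Ref": "ECSLaunchTemplate"},
                    "Version": {"Fn::GetAtt": ["ECSLaunchTemplate",
                                               "LatestVersionNumber"]},
                }}}},
        },
        "StaleGroup": {
            "Type": "AWS::AutoScaling::AutoScalingGroup",
            "Properties": {"LaunchTemplate": {
                "LaunchTemplateId": {"Ref": "ECSLaunchTemplate"},
                "Version": "1",
            }},
        },
    }
}

stale = hard_coded_versions(template)
print(stale)
```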

Standalone Service Fabric - AWS - FileStoreService - Copy-ServiceFabricApplicationPackage Fails

I have a 3-node standalone Windows Service Fabric setup in AWS. The TestConfiguration and CreateCluster scripts run successfully; however, on attempting to deploy any application into the cluster I get the following error from PowerShell.
Copy-ServiceFabricApplicationPackage -ApplicationPackagePath .\pkg\<packagename> -ImageStoreConnectionString fabric:ImageStore
Copy-ServiceFabricApplicationPackage : An error occurred during this operation. Please check the trace logs for more
details.
At line:1 char:1
+ Copy-ServiceFabricApplicationPackage -ApplicationPackagePath .\pkg\ ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (:) [Copy-ServiceFabricApplicationPackage], FabricException
+ FullyQualifiedErrorId : CopyApplicationPackageErrorId,Microsoft.ServiceFabric.Powershell.CopyApplicationPackage
I'm not sure which trace logs would be useful in diagnosing the error; however, checking the Windows event log on one of the nodes, I see the following errors, all for the FileStoreService.
ImpersonateAndCopyFile for SourcePath:\\<ipaddress>\StoreShare_Node3\131601795137630192\6.0.232.9494_0\131601794828730764_8589934592_1.ClusterManifest.xml, DestinationPath:C:\ProgramData\SF\Node1\Fabric\work\Applications\__FabricSystem_App4294967295\work\Store\131601795317314061\6.0.232.9494_0\131601794828730764_8589934592_1.ClusterManifest.xml failed: 0x8007052e. Have tried all access tokens.
CopyFile: SourcePath:\\<ip address>\StoreShare_Node3\131601795137630192\6.0.232.9494_0\131601794828730764_8589934592_1.ClusterManifest.xml, DestinationPath:C:\ProgramData\SF\Node1\Fabric\work\Applications\__FabricSystem_App4294967295\work\Store\131601795317314061\6.0.232.9494_0\131601794828730764_8589934592_1.ClusterManifest.xml, Error:0x8007052e, ElapsedTime:80
CopyFile: no new token is found. current token count: 2
Any ideas what this could be? I have recreated a new cluster with no security, and the firewall has all ports opened both in AWS and on the node machines (to remove anything that could be blocking the copy). Within AWS I am using SimpleAD, so all nodes run with the same AD administrator and can communicate to create the cluster.
Below is the cluster config I'm using, kept it as simple as I could to try to limit the causes of the problems.
Any help with diagnosing the copy file issues, or even pointing me at the relevant trace logs would be great.
Additionally I notice the ImageStoreService is showing warnings within Service Fabric Explorer
Unhealthy event: SourceId='System.FM', Property='State', HealthState='Warning', ConsiderWarningAsError=false.
Partition reconfiguration is taking longer than expected.
ImageStoreService 3 3 00000000-0000-0000-0000-000000003000
P/P Ready Node3 131601795137630192
S/S InBuild Node1 131601795317314061
S/S InBuild Node2 131601795317314062
(Showing 3 out of 3 replicas. Total available replicas: 1)
EDIT
Additional Information
On investigating the problem further, I ran Copy-ServiceFabricApplicationPackage with the -Debug flag, and it now gives the error below. This suggests that the user name or password is incorrect, either for uploading the package from my computer into the cluster or for the cluster distributing it from node to node. I presume that for node-to-node copies it uses the local accounts it creates (ending in fffff), but I don't know why it would create invalid user credentials. If the problem is between the computer uploading the package and the cluster, I'm currently running with no security turned on, so I don't know why this would be an issue. Any help much appreciated.
Copy-ServiceFabricApplicationPackage -ApplicationPackagePath ..\pkg\Release -ImageStoreConnectionString fabric:imagestore -Debug
VERBOSE: System.Fabric.FabricException: An error occurred during this operation. Please check the trace logs for more details. ---> System.Runtime.InteropServices.COMException: The user name or password is incorrect. (Exception from HRESULT: 0x8007052E)
Thanks
{
"name": "SampleCluster",
"clusterConfigurationVersion": "1.0.0",
"apiVersion": "08-2017",
"nodes": [
{
"nodeName": "Node1",
"iPAddress": "<node 1 internal ip address>",
"nodeTypeRef": "StandardNodeType",
"faultDomain": "fd:/0",
"upgradeDomain": "UD0"
},
{
"nodeName": "Node2",
"iPAddress": "<node 2 internal ip address>",
"nodeTypeRef": "StandardNodeType",
"faultDomain": "fd:/1",
"upgradeDomain": "UD1"
},
{
"nodeName": "Node3",
"iPAddress": "<node 3 internal ip address>",
"nodeTypeRef": "StandardNodeType",
"faultDomain": "fd:/2",
"upgradeDomain": "UD2"
}
],
"properties": {
"diagnosticsStore": {
"metadata": "Please replace the diagnostics store with an actual file share accessible from all cluster machines.",
"dataDeletionAgeInDays": "7",
"storeType": "FileShare",
"IsEncrypted": "false",
"connectionstring": "c:\\ProgramData\\SF\\DiagnosticsStore"
},
"nodeTypes": [
{
"name": "StandardNodeType",
"clientConnectionEndpointPort": "19000",
"clusterConnectionEndpointPort": "19001",
"leaseDriverEndpointPort": "19002",
"serviceConnectionEndpointPort": "19003",
"httpGatewayEndpointPort": "19080",
"reverseProxyEndpointPort": "19081",
"applicationPorts": {
"startPort": "20000",
"endPort": "30000"
},
"ephemeralPorts": {
"startPort": "49152",
"endPort": "65534"
},
"isPrimary": true
}
],
"fabricSettings": [
{
"name": "Setup",
"parameters": [
{
"name": "FabricDataRoot",
"value": "C:\\ProgramData\\SF"
},
{
"name": "FabricLogRoot",
"value": "C:\\ProgramData\\SF\\Log"
}
]
}
],
"addOnFeatures": [
"DnsService",
"RepairManager"
]
}
}
After more investigation, I discovered the cause was that File Sharing was not correctly enabled on the Windows boxes. Although it was shown as enabled in the properties of the network adapter, I failed to realise that the settings also needed to be enabled under Advanced sharing settings (Control Panel\Network and Internet\Network and Sharing Center\Advanced sharing settings).
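For anyone reading similar logs, the 0x8007052e value is an HRESULT that wraps a Win32 error code, and decoding it points at a logon failure on the share rather than a missing file. The short Python sketch below splits the value into its standard fields; the function name is made up for this illustration.

```python
def decode_hresult(hresult):
    """Split a 32-bit HRESULT into (severity, facility, code)."""
    severity = (hresult >> 31) & 0x1    # 1 means failure
    facility = (hresult >> 16) & 0x7FF  # 7 is FACILITY_WIN32
    code = hresult & 0xFFFF             # the wrapped Win32 error code
    return severity, facility, code


severity, facility, code = decode_hresult(0x8007052E)
print(severity, facility, code)
```

Facility 7 with code 1326 is the Win32 error ERROR_LOGON_FAILURE, which matches the "The user name or password is incorrect" text in the -Debug output and is consistent with the nodes being unable to authenticate to each other's StoreShare.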

How to add an ETW provider to an existing Service Fabric cluster using PowerShell?

I have already created a Service Fabric cluster with Azure Diagnostics, and it is currently functional with my services deployed into it. My service has an ETW EventSource that I would like to start collecting events from, because my service code already uses it to write service-related events. Since the cluster is already enabled for Azure Diagnostics and my services are already deployed into it, I think it is a simple matter of adding my event source to the ETW providers in this Service Fabric cluster. Here is the exported template (only the part relevant to Azure Diagnostics is shown):
{
"properties": {
"publisher": "Microsoft.Azure.Diagnostics",
"type": "IaaSDiagnostics",
"typeHandlerVersion": "1.5",
"autoUpgradeMinorVersion": true,
"settings": {
"WadCfg": {
"DiagnosticMonitorConfiguration": {
"overallQuotaInMB": "50000",
"EtwProviders": {
"EtwEventSourceProviderConfiguration": [
{
"provider": "Microsoft-ServiceFabric-Actors",
"scheduledTransferKeywordFilter": "1",
"scheduledTransferPeriod": "PT5M",
"DefaultEvents": {
"eventDestination": "ServiceFabricReliableActorEventTable"
}
},
{
"provider": "Microsoft-ServiceFabric-Services",
"scheduledTransferPeriod": "PT5M",
"DefaultEvents": {
"eventDestination": "ServiceFabricReliableServiceEventTable"
}
},
{
"provider": "Bb.ServiceFabric.Infrastructure.Container",
"scheduledTransferPeriod": "PT1M",
"DefaultEvents": {
"eventDestination": "ServiceFabricReliableServiceEventTable"
}
}
],
"EtwManifestProviderConfiguration": [
{
"provider": "cbd93bc2-71e5-4566-b3a7-595d8eeca6e8",
"scheduledTransferLogLevelFilter": "Information",
"scheduledTransferKeywordFilter": "4611686018427387904",
"scheduledTransferPeriod": "PT5M",
"DefaultEvents": {
"eventDestination": "ServiceFabricSystemEventTable"
}
}
]
}
}
},
"StorageAccount": "sfdgsmsraghuplaygrou6827"
}
},
"name": "VMDiagnosticsVmExt_vmNodeType0Name"
}
I would like to update the EtwProviders/EtwEventSourceProviderConfiguration above to contain the following section (MyCompany.MyServices.MyStatelessService is the name of my service's EventSource):
{
"provider": "MyCompany.MyServices.MyStatelessService",
"scheduledTransferPeriod": "PT5M",
"DefaultEvents": {
"eventDestination": "ServiceFabricReliableServiceEventTable"
}
}
Here are my questions:
Is this the correct way of inserting an ETW provider/EventSource (from my service) into an existing cluster (that is already enabled with Azure Diagnostics)?
Can I add this event source (as an ETW event source provider) using PowerShell commands?
If so, what is the exact PowerShell command (using all the information from the above code fragment)?
Note: I am using .NET Framework 4.5.2.
All seems good with the added configuration above. Just be aware that for ETWProviders the EventDestination cannot contain hyphens (-), yours don't so you are ok.
To update the Windows Azure Diagnostics (WAD) agent configuration, you can use either PowerShell or Cloud Explorer in Visual Studio.
For the former, simply update the ARM template and use the New-AzureRmResourceGroupDeployment cmdlet. See here for further information: https://azure.microsoft.com/en-us/documentation/articles/service-fabric-diagnostics-how-to-setup-wad/#update-diagnostics-to-collect-and-upload-logs-from-new-eventsource-channels
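As an illustration of the template update itself, the following Python sketch adds an event source entry to the EtwEventSourceProviderConfiguration list of a diagnostics extension loaded as a dict, and rejects destinations containing a hyphen as noted above. The function name is made up, and the trimmed extension used as input is a stand-in for the exported template; redeploying the result is still done with the cmdlet mentioned above.

```python
import copy


def add_event_source(extension, provider, destination, period="PT5M"):
    """Return a copy of the extension with the provider added (if missing)."""
    if "-" in destination:
        raise ValueError("eventDestination must not contain hyphens")
    updated = copy.deepcopy(extension)
    providers = (updated["properties"]["settings"]["WadCfg"]
                 ["DiagnosticMonitorConfiguration"]["EtwProviders"]
                 ["EtwEventSourceProviderConfiguration"])
    # Skip the append when the provider is already configured.
    if not any(entry["provider"] == provider for entry in providers):
        providers.append({
            "provider": provider,
            "scheduledTransferPeriod": period,
            "DefaultEvents": {"eventDestination": destination},
        })
    return updated


# Trimmed stand-in for the exported extension shown in the question.
extension = {"properties": {"settings": {"WadCfg": {
    "DiagnosticMonitorConfiguration": {"EtwProviders": {
        "EtwEventSourceProviderConfiguration": [
            {"provider": "Microsoft-ServiceFabric-Services",
             "scheduledTransferPeriod": "PT5M",
             "DefaultEvents": {
                 "eventDestination": "ServiceFabricReliableServiceEventTable"}}
        ]}}}}}}

updated = add_event_source(extension,
                           "MyCompany.MyServices.MyStatelessService",
                           "ServiceFabricReliableServiceEventTable")
providers = (updated["properties"]["settings"]["WadCfg"]
             ["DiagnosticMonitorConfiguration"]["EtwProviders"]
             ["EtwEventSourceProviderConfiguration"])
print([entry["provider"] for entry in providers])
```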
To use Cloud Explorer in Visual Studio, browse to your Virtual Machine Scale Set (as this is the Azure resource that holds the WAD configuration). Right-click and choose Update Diagnostics. In the dialog shown, you have the option to upload a private and a public configuration file. Simply take a .json document containing the {"WadCfg": {}} element and upload it as the public configuration.
If you need to update the private configuration, specify the storage account name and access key:
{
"storageAccountName": "",
"storageAccountKey": "",
"storageAccountEndPoint": "https://core.windows.net"
}
Hope this helps.
Mikkel