Azure Service Fabric IPv6 networking issues - azure-service-fabric

We are having trouble deploying our Service Fabric cluster to Azure so that it handles both IPv4
and IPv6 traffic.
We are developing an application that has mobile clients on iOS and Android which communicate with
our Service Fabric cluster. The communication consists of both HTTP traffic and TCP socket communication.
We need to support IPv6 in order to have Apple accept the app in their App Store.
We are using an ARM template to deploy to Azure, as it seems the portal does not support configuring
a load balancer with IPv6 for Virtual Machine Scale Sets (ref: url). The linked page also lists other limitations
of the IPv6 support, such as that private IPv6 addresses cannot be deployed to VM Scale Sets. However, according
to this page, the ability to assign private IPv6 addresses to VM Scale Sets is available in preview
(although the page was last updated 07/14/2017).
For this question I have tried to keep things as general as possible, and based the ARM template on a template found
in this tutorial. The template is called "template_original.json" and can be downloaded from
here. It is a basic template for a Service Fabric cluster, with no security for simplicity.
I will link the entire modified ARM template at the bottom of this post, but will highlight the
main modified parts first.
Public IPv4 and IPv6 addresses that are associated with the load balancer. These are associated with their respective backend pools:
"frontendIPConfigurations": [
{
"name": "LoadBalancerIPv4Config",
"properties": {
"publicIPAddress": {
"id": "[resourceId('Microsoft.Network/publicIPAddresses',concat(parameters('lbIPv4Name'),'-','0'))]"
}
}
},
{
"name": "LoadBalancerIPv6Config",
"properties": {
"publicIPAddress": {
"id": "[resourceId('Microsoft.Network/publicIPAddresses',concat(parameters('lbIPv6Name'),'-','0'))]"
}
}
}
],
"backendAddressPools": [
{
"name": "LoadBalancerIPv4BEAddressPool",
"properties": {}
},
{
"name": "LoadBalancerIPv6BEAddressPool",
"properties": {}
}
],
Load balancing rules for the frontend ports on the respective public IP addresses, both IPv4 and IPv6.
This amounts to four rules in total, two per frontend port. I have added port 80 for HTTP and port 5607 for the socket connection.
Note that I have changed the backend port for IPv6 port 80 to 8081 and for IPv6 port 8507 to 8517.
{
"name": "AppPortLBRule1Ipv4",
"properties": {
"backendAddressPool": {
"id": "[variables('lbIPv4PoolID0')]"
},
"backendPort": "[parameters('loadBalancedAppPort1')]",
"enableFloatingIP": "false",
"frontendIPConfiguration": {
"id": "[variables('lbIPv4Config0')]"
},
"frontendPort": "[parameters('loadBalancedAppPort1')]",
"idleTimeoutInMinutes": "5",
"probe": {
"id": "[concat(variables('lbID0'),'/probes/AppPortProbe1')]"
},
"protocol": "tcp"
}
},
{
"name": "AppPortLBRule1Ipv6",
"properties": {
"backendAddressPool": {
"id": "[variables('lbIPv6PoolID0')]"
},
/*"backendPort": "[parameters('loadBalancedAppPort1')]",*/
"backendPort": 8081,
"enableFloatingIP": "false",
"frontendIPConfiguration": {
"id": "[variables('lbIPv6Config0')]"
},
"frontendPort": "[parameters('loadBalancedAppPort1')]",
/*"idleTimeoutInMinutes": "5",*/
"probe": {
"id": "[concat(variables('lbID0'),'/probes/AppPortProbe1')]"
},
"protocol": "tcp"
}
},
{
"name": "AppPortLBRule2Ipv4",
"properties": {
"backendAddressPool": {
"id": "[variables('lbIPv4PoolID0')]"
},
"backendPort": "[parameters('loadBalancedAppPort2')]",
"enableFloatingIP": "false",
"frontendIPConfiguration": {
"id": "[variables('lbIPv4Config0')]"
},
"frontendPort": "[parameters('loadBalancedAppPort2')]",
"idleTimeoutInMinutes": "5",
"probe": {
"id": "[concat(variables('lbID0'),'/probes/AppPortProbe2')]"
},
"protocol": "tcp"
}
},
{
"name": "AppPortLBRule2Ipv6",
"properties": {
"backendAddressPool": {
"id": "[variables('lbIPv6PoolID0')]"
},
"backendPort": 8517,
"enableFloatingIP": "false",
"frontendIPConfiguration": {
"id": "[variables('lbIPv6Config0')]"
},
"frontendPort": "[parameters('loadBalancedAppPort2')]",
/*"idleTimeoutInMinutes": "5",*/
"probe": {
"id": "[concat(variables('lbID0'),'/probes/AppPortProbe2')]"
},
"protocol": "tcp"
}
}
I also added one probe per load balancing rule, but omitted them here for clarity.
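The four rules above follow a regular pattern (one rule per frontend port and IP version), so they could be generated instead of written by hand. The following is only a sketch: the helper name build_lb_rules and the keys of its input dicts are hypothetical, and it assumes the variable and probe names used in the snippet above.

```python
import json


def build_lb_rules(ports):
    """Return one load balancing rule per (port, IP version) pair.

    Each entry in `ports` is a dict with hypothetical keys:
      index        -> rule/probe number (1, 2, ...)
      frontend     -> frontend port (number or ARM expression)
      backend_ipv6 -> backend port used by the IPv6 rule
    """
    rules = []
    for port in ports:
        idx = port["index"]
        for ip in ("IPv4", "IPv6"):
            is_v4 = ip == "IPv4"
            properties = {
                "backendAddressPool": {"id": f"[variables('lb{ip}PoolID0')]"},
                # IPv4 keeps the same port on both sides; IPv6 gets its own backend port.
                "backendPort": port["frontend"] if is_v4 else port["backend_ipv6"],
                "enableFloatingIP": False,
                "frontendIPConfiguration": {"id": f"[variables('lb{ip}Config0')]"},
                "frontendPort": port["frontend"],
                "probe": {"id": f"[concat(variables('lbID0'),'/probes/AppPortProbe{idx}')]"},
                "protocol": "tcp",
            }
            if is_v4:
                # The template above sets an idle timeout only on the IPv4 rules.
                properties["idleTimeoutInMinutes"] = 5
            rules.append({"name": f"AppPortLBRule{idx}Ipv{ip[-1]}", "properties": properties})
    return rules


if __name__ == "__main__":
    example = build_lb_rules([
        {"index": 1, "frontend": "[parameters('loadBalancedAppPort1')]", "backend_ipv6": 8081},
        {"index": 2, "frontend": "[parameters('loadBalancedAppPort2')]", "backend_ipv6": 8517},
    ])
    print(json.dumps(example, indent=2))
```

The printed JSON can be pasted into the loadBalancingRules array of the load balancer resource; the probes still have to be defined separately.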
The apiVersion for the VM Scale Set is set to "2017-03-30" per the recommendation in the aforementioned preview solution.
The network interface configurations follow those recommendations as well.
"networkInterfaceConfigurations": [
{
"name": "[concat(parameters('nicName'), '-0')]",
"properties": {
"ipConfigurations": [
{
"name": "[concat(parameters('nicName'),'-IPv4Config-',0)]",
"properties": {
"privateIPAddressVersion": "IPv4",
"loadBalancerBackendAddressPools": [
{
"id": "[variables('lbIPv4PoolID0')]"
}
],
"loadBalancerInboundNatPools": [
{
"id": "[variables('lbNatPoolID0')]"
}
],
"subnet": {
"id": "[variables('subnet0Ref')]"
}
}
},
{
"name": "[concat(parameters('nicName'),'-IPv6Config-',0)]",
"properties": {
"privateIPAddressVersion": "IPv6",
"loadBalancerBackendAddressPools": [
{
"id": "[variables('lbIPv6PoolID0')]"
}
]
}
}
],
"primary": true
}
}
]
With this template the deployment to Azure succeeds. IPv4 communication with the
cluster works as expected, but I am unable to get any IPv6 traffic through at all. This is the
same for both ports 80 (HTTP) and 5607 (socket).
When viewing the list of backend pools for the load balancer in the Azure portal, it displays the
following information message, which I have been unable to find any information about. I am unsure
whether this affects anything.
Backend pool 'loadbalanceripv6beaddresspool' was removed from Virtual machine scale set 'Node1'. Upgrade all the instances of 'Node1' for this change to apply Node1
(Screenshot: load balancer error message)
I am not sure why I cannot get the traffic through on IPv6. It might be that I have missed
something in the template, or made some other error. If any additional information is required,
don't hesitate to ask.
Here is the entire ARM template. Due to the length and post length limitations I have not embedded it, but here is a Pastebin link to the full ARM Template (Updated).
Update
Some information regarding debugging the IPv6 connectivity. I have tried slightly altering the ARM template to forward the IPv6 traffic on port 80 to backend port 8081 instead. So IPv4 is 80=>80 and IPv6 80=>8081. The ARM template has been updated (see link in previous section).
On port 80 I am running Kestrel as a stateless web server. I have the following entries in the ServiceManifest.xml:
<Endpoint Protocol="http" Name="ServiceEndpoint1" Type="Input" Port="80" />
<Endpoint Protocol="http" Name="ServiceEndpoint3" Type="Input" Port="8081" />
I have been a bit unsure which addresses Kestrel should listen on. FabricRuntime.GetNodeContext().IPAddressOrFQDN always returns the IPv4 address, and that is currently what we use to start it. For debugging, I collect all the IPv6 addresses of the host and, as a hardcoded hack, listen on one of them for port 8081. For port 80 I use IPAddress.IPv6Any; however, this always defaults to the IPv4 address returned by FabricRuntime.GetNodeContext().IPAddressOrFQDN.
protected override IEnumerable<ServiceInstanceListener> CreateServiceInstanceListeners()
{
    var endpoints = Context.CodePackageActivationContext.GetEndpoints()
        .Where(endpoint => endpoint.Protocol == EndpointProtocol.Http ||
                           endpoint.Protocol == EndpointProtocol.Https);
    var strHostName = Dns.GetHostName();
    var ipHostEntry = Dns.GetHostEntry(strHostName);
    var ipv6Addresses = new List<IPAddress>();
    ipv6Addresses.AddRange(ipHostEntry.AddressList.Where(
        ipAddress => ipAddress.AddressFamily == AddressFamily.InterNetworkV6));
    var listeners = new List<ServiceInstanceListener>();
    foreach (var endpoint in endpoints)
    {
        var instanceListener = new ServiceInstanceListener(serviceContext =>
            new KestrelCommunicationListener(
                serviceContext,
                (url, listener) => new WebHostBuilder()
                    .UseKestrel(options =>
                    {
                        if (endpoint.Port == 8081 && ipv6Addresses.Count > 0)
                        {
                            // change idx to test different IPv6 addresses found
                            options.Listen(ipv6Addresses[0], endpoint.Port);
                        }
                        else
                        {
                            // always defaults to ipv4 address
                            options.Listen(IPAddress.IPv6Any, endpoint.Port);
                        }
                    })
                    .ConfigureServices(
                        services => services
                            .AddSingleton<StatelessServiceContext>(serviceContext))
                    .UseContentRoot(Directory.GetCurrentDirectory())
                    .UseServiceFabricIntegration(listener, ServiceFabricIntegrationOptions.None)
                    .UseStartup<Startup>()
                    .UseUrls(url)
                    .Build()), endpoint.Name);
        listeners.Add(instanceListener);
    }
    return listeners;
}
Here are the endpoints shown in Service Fabric Explorer for one of the nodes: (screenshot: endpoint addresses)
Regarding the socket listener, I have also changed the template so that IPv6 is forwarded to backend port 8517 instead of 8507. Similarly to the Kestrel web server, the socket listener opens two listening instances on the respective addresses with the appropriate ports.
I hope this information is of any help.

It turns out I made a very stupid mistake that is completely my fault: I forgot to verify that my ISP fully supports IPv6. It turns out they don't!
Testing from a provider with full IPv6 support works as it should, and I get full connectivity to the nodes in the Service Fabric cluster.
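For anyone hitting the same symptom, it may be worth ruling out the client network before touching the template. The sketch below is one possible check, not part of the original setup: global_ipv6_addresses filters a list of address strings down to globally routable IPv6 addresses, and can_connect_ipv6 attempts a TCP connection over IPv6 only. Both function names are hypothetical, and the host name in the usage comment is a placeholder.

```python
import ipaddress
import socket


def global_ipv6_addresses(addresses):
    """Return the entries that are globally routable IPv6 addresses."""
    result = []
    for text in addresses:
        # Drop a zone suffix such as "%eth0" that link-local addresses may carry.
        ip = ipaddress.ip_address(text.split("%", 1)[0])
        if ip.version == 6 and ip.is_global:
            result.append(str(ip))
    return result


def can_connect_ipv6(host, port, timeout=3.0):
    """Try a TCP connection to host:port using IPv6 only."""
    try:
        infos = socket.getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # the name has no usable IPv6 address
    for family, socktype, proto, _, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            continue  # try the next address, if any
    return False


# Example (placeholder host name):
#   can_connect_ipv6("mycluster.example.com", 80)
```

A local address list without any global IPv6 entry, or a False result against a host known to serve IPv6, suggests the problem is on the client side rather than in the cluster. Having a global address does not by itself prove that the ISP routes IPv6, which is why the connection test is included.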
Here is the working ARM template for anyone that needs a fully working example of Service Fabric cluster with IPv4 and IPv6 support:
Not allowed to post Pastebin links without an accompanying code snippet...
Update:
Due to length constraints the template could not be pasted in this thread in its entirety; however, I cross-posted this on the GitHub Issues page for Service Fabric. The ARM template is posted as a comment in that thread, where it will hopefully remain available longer than the Pastebin link. View it here.

Related

what does the `port` mean in kafka zookeeper path `/brokers/ids/$id`

I have two Kafka listeners with this config:
listeners=PUBLIC_SASL://0.0.0.0:5011,PUBLIC_PLAIN://0.0.0.0:5010
advertised.listeners=PUBLIC_SASL://192.168.181.2:5011,PUBLIC_PLAIN://192.168.181.2:5010
listener.security.protocol.map=PUBLIC_SASL:SASL_PLAINTEXT,PUBLIC_PLAIN:PLAINTEXT
inter.broker.listener.name=PUBLIC_SASL
5010 is plaintext, 5011 is sasl_plaintext.
After startup, I found this information in zookeeper(/brokers/ids/$id):
{
"listener_security_protocol_map": {
"PUBLIC_SASL": "SASL_PLAINTEXT",
"PUBLIC_PLAIN": "PLAINTEXT"
},
"endpoints": [
"PUBLIC_SASL://192.168.181.2:5011",
"PUBLIC_PLAIN://192.168.181.2:5010"
],
"jmx_port": -1,
"features": { },
"host": "192.168.181.2",
"timestamp": "1658485899402",
"port": 5010,
"version": 5
}
What does the port field mean? Why is the port 5010? Could I change it to 5011?
What you're seeing are the legacy advertised.host and advertised.port values, which may be derived from the advertised.listeners list for backward compatibility. Both are deprecated, however; the Kafka protocol now uses the listener security protocol map and the corresponding endpoints list instead.
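To illustrate reading the endpoints list rather than the legacy host and port fields, here is a small sketch that looks up the address advertised for a named listener. It assumes the znode JSON layout shown in the question; the function name endpoint_for_listener is hypothetical, and the sample data is abbreviated from the question.

```python
import json

# Abbreviated copy of the znode content from the question.
ZNODE = """
{
  "listener_security_protocol_map": {
    "PUBLIC_SASL": "SASL_PLAINTEXT",
    "PUBLIC_PLAIN": "PLAINTEXT"
  },
  "endpoints": [
    "PUBLIC_SASL://192.168.181.2:5011",
    "PUBLIC_PLAIN://192.168.181.2:5010"
  ],
  "host": "192.168.181.2",
  "port": 5010,
  "version": 5
}
"""


def endpoint_for_listener(broker_json, listener_name):
    """Return (host, port) advertised for the named listener, or None."""
    info = json.loads(broker_json)
    for endpoint in info.get("endpoints", []):
        name, _, address = endpoint.partition("://")
        if name == listener_name:
            # Split on the last colon so the host part is kept whole.
            host, _, port = address.rpartition(":")
            return host, int(port)
    return None


if __name__ == "__main__":
    print(endpoint_for_listener(ZNODE, "PUBLIC_SASL"))
```

With the data above, the SASL listener resolves to port 5011 regardless of what the top-level port field says.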

Error core: failed to lookup token: error=failed to read entry, dial tcp [::1]:8500: getsockopt: connection refused in Vault log

We are load testing our application with JMeter. The application uses Consul and Vault as backend services for reading and storing configuration data. During the load test, the application queries Vault for authentication data on every incoming request. Initially it runs fine for some time (10 to 15 minutes) and I can see successful responses in JMeter, but eventually the responses start failing for all requests. I see the following error in the Vault log for each request, but no error or exception in the Consul log.
Error in Vault log
[ERROR] core: failed to lookup token: error=failed to read entry: Get http://localhost:8500/v1/kv//vault/sys/token/id/87f7b82131cb8fa1ef71aa52579f155d4cf9f095: dial tcp [::1]:8500: getsockopt: connection refused
Currently the load is 100 requests (users) every 10 milliseconds with a ramp-up period of 60 seconds, executed in a loop. What could be the cause of this error? Is it due to a limit on connections to port 8500?
Below is my vault and consul configuration
Vault
backend "consul" {
address = "localhost:8500"
path = "app/vault/"
}
listener "tcp" {
address = "10.88.97.216:8200"
cluster_address = "10.88.97.216:8201"
tls_disable = 0
tls_min_version = "tls12"
tls_cert_file = "/var/certs/vault.crt"
tls_key_file = "/var/certs/vault.key"
}
Consul
{
"data_dir": "/var/consul",
"log_level": "info",
"server": true,
"leave_on_terminate": true,
"ui": true,
"client_addr": "127.0.0.1",
"ports": {
"dns": 53,
"serf_lan": 8301,
"serf_wan" : 8302
},
"disable_update_check": true,
"enable_script_checks": true,
"disable_remote_exec": false,
"domain": "primehome",
"limits": {
"http_max_conns_per_client": 1000,
"rpc_max_conns_per_client": 1000
},
"service": {
"name": "nginx-consul-https",
"port": 443,
"checks": [{
"http": "https://localhost/nginx_status",
"tls_skip_verify": true,
"interval": "10s",
"timeout": "5s",
"status": "passing"
}]
}
}
I have also configured http_max_conns_per_client and rpc_max_conns_per_client, thinking the problem might be the per-client connection limit, but I still see this error in the Vault log.
After taking another look at this, the issue appears to be that Vault is attempting to contact Consul over the IPv6 loopback address (likely because both the IPv4 and IPv6 addresses are present for localhost in /etc/hosts), but Consul is only listening on the IPv4 loopback address.
You can likely resolve this through one of the following methods.
Use 127.0.0.1 instead of localhost for Consul's address in the Vault config.
backend "consul" {
address = "127.0.0.1:8500"
path = "app/vault/"
}
Configure Consul to listen on both the IPv4 and IPv6 loopback addresses.
{
"client_addr": "127.0.0.1 [::1]"
}
(Rest of the config omitted for brevity.)
Remove the localhost hostname from the IPv6 loopback in /etc/hosts
127.0.0.1 localhost
# Old hosts entry for ::1
#::1 localhost ip6-localhost ip6-loopback
# New entry
::1 ip6-localhost ip6-loopback
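To see which addresses localhost maps to on a given host, a quick diagnostic along these lines may help. It is only a sketch: the function name loopback_targets is hypothetical, it uses the system resolver through Python, and Vault's own resolver (Go) may order or filter the results differently.

```python
import socket


def loopback_targets(host, port, resolver=socket.getaddrinfo):
    """List (family, address) pairs a client may try for host:port, in resolver order."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    targets = []
    for family, _, _, _, sockaddr in resolver(host, port, type=socket.SOCK_STREAM):
        entry = (names.get(family, str(family)), sockaddr[0])
        if entry not in targets:  # the resolver may repeat an address
            targets.append(entry)
    return targets


if __name__ == "__main__":
    # 8500 is the Consul HTTP port from the Vault config above.
    for family, address in loopback_targets("localhost", 8500):
        print(family, address)
```

If the output includes an IPv6 entry for ::1 while Consul's client_addr is 127.0.0.1 only, connections that pick the IPv6 entry will be refused, which matches the error in the Vault log.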

Drools stateful session per request

We are trying to use Drools as our rule engine service. What we have done so far is listed below:
Deployed Workbench 7.2.Final
Deployed KIE Server 7.2.0.Final
Configured some data objects and rules, deployed the changes to KIE Server, and we are able to execute the rules using the REST API
Most of our requirements are satisfied by a stateless session (give it a set of data, execute the rules and return the data, that's it). But with a stateless session we have to give up many of the important features provided by a Drools stateful session.
So we are trying to use a stateful session per request, which means the session should be disposed as soon as the request ends. Also, parallel requests should not interfere with each other even if the session name is the same.
We found the container runtime strategy configuration (Workbench > Deploy > {any container} > Process Configuration > Runtime strategy).
But even after setting the container strategy to Per Request, it still behaves the same as Singleton (the session is not disposed after each request).
In a few places we read that the runtime strategy is only implemented in jBPM.
The way we make requests to KIE Server is shown below:
Request: POST {HOST}/kie-server/services/rest/server/containers/instances/TestRequest_1.0.4
{
"lookup": "ab-session", //stateful session
"commands": [
{
"insert": {
"out-identifier": "125",
"object": {
"com.myteam.testrequest.Product": {
"id": "123",
"name": "Hoo Hoo",
"count": 0
}
},
"return-object": "true"
}
},
{
"insert": {
"out-identifier": "126",
"object": {
"com.myteam.testrequest.Product": {
"id": "123",
"name": "Hoo Hoo",
"count": 0
}
},
"return-object": "true"
}
},
{"fire-all-rules": "hf2"}
]
}
We need help achieving this requirement. Also, please help us understand whether we have done something wrong.
In kmodule.xml you may try adding the "prototype" scope, because the default is "singleton":
<ksession name="SessionName" type="stateful" default="false" clockType="realtime" scope="prototype"/>

Sample messages from IOT sensors for MQTT communications

There is an M2M application that needs to talk to temperature sensors in the field, i.e. send and receive messages using the MQTT pub/sub protocol.
I have set up both IOTDM and an Eclipse oneM2M server using Mosquitto. However, I am looking for some sample APIs/commands through which an M2M application can send a message to the MQTT client and vice versa.
If any of you could point me to the appropriate call flows, that would also be helpful.
Any help would be highly appreciated.
Here is a GET MQTT message example:
topic: /oneM2M/req/{{origin}}/{{cse-id}}/json
message:
{
"m2m:rqp": {
"op": "2",
"to": "{{resource_uri}}",
"fr": "{{origin}}",
"rqi": 12345,
"pc": ""
}
}
{{resource_uri}} is the relative path of a resource existing on the
oneM2M server (e.g. /my_cse_base/my_ae)
{{origin}} is the origin enabled (by ACP) to retrieve the resource
{{cse-id}} is the CSEbase ID
The message received could be similar to:
topic: /oneM2M/resp/{{origin}}/{{cse-id}}/json
message:
{
"m2m:rsp": {
"rsc": 2000,
"rqi": 12345,
"pc": {
"m2m:ae": {
"pi": "Sy2XMSpbb",
"ty": 2,
"ct": "20170706T085259",
"ri": "r1NX_cOiVZ",
"rn": "my_ae",
"lt": "20170706T085259",
"et": "20270706T085259",
"acpi": ["/my_cse_base/acp_my_ae"],
"aei": "my_ae_id",
"rr": true
}
}
}
}
A POST example:
topic: /oneM2M/req/{{origin}}/{{cse-id}}/json
message:
{
"m2m:rqp": {
"op": "1",
"to": "{{resource_uri}}",
"fr": "{{origin}}",
"rqi": 12345,
"ty": "4",
"pc": {
"m2m:cin": {
"cnf": "text/plain:0",
"con": "123",
"lbl": ["test"]
}
}
}
}
{{resource_uri}} is the relative path of a resource existing on the
oneM2M server (e.g. /my_cse_base/my_ae)
{{origin}} is the origin enabled (by ACP) to create a new resource
{{cse-id}} is the CSEbase ID
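The two requests above differ only in the operation code, the optional resource type and the content, so an application could build them with one helper. This is a sketch under the assumption that the topic and payload layout are exactly as in the examples; the function name build_onem2m_request and its parameter names are hypothetical. Publishing the result is left to whichever MQTT client library is in use.

```python
import json


def build_onem2m_request(origin, cse_id, resource_uri, op, request_id,
                         content=None, resource_type=None):
    """Return (topic, payload) for a oneM2M request sent over MQTT."""
    topic = f"/oneM2M/req/{origin}/{cse_id}/json"
    primitive = {"op": op, "to": resource_uri, "fr": origin, "rqi": request_id}
    if resource_type is not None:
        primitive["ty"] = resource_type  # only present when creating a resource
    # The retrieve example sends an empty string as content.
    primitive["pc"] = content if content is not None else ""
    return topic, json.dumps({"m2m:rqp": primitive})


def response_topic(origin, cse_id):
    """Topic on which the matching response is expected."""
    return f"/oneM2M/resp/{origin}/{cse_id}/json"


if __name__ == "__main__":
    # Retrieve (op "2"), as in the GET example.
    print(build_onem2m_request("my_origin", "my_cse", "/my_cse_base/my_ae", "2", 12345))
    # Create a content instance (op "1", ty "4"), as in the POST example.
    cin = {"m2m:cin": {"cnf": "text/plain:0", "con": "123", "lbl": ["test"]}}
    print(build_onem2m_request("my_origin", "my_cse", "/my_cse_base/my_ae", "1", 12346,
                               content=cin, resource_type="4"))
```

The application would subscribe to the response topic before publishing and match responses to requests through the rqi value.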
For a JS talk I made an app to measure soil moisture. I used MQTT to send data from my Arduino to a server written in Node.js. I don't know whether you are familiar with JS, but you can see the code in my GitHub repo. I hope this solution helps you.

How to add a ETW provider to an existing service fabric cluster using powershell?

I have already created a service fabric cluster with azure diagnostics and it is functional currently with my services deployed into that cluster. I have an ETW EventSource in my service that I would like to start collecting events from because my service code already uses this event source to write my service related events. Since the cluster is already enabled for azure diagnostics and my services are already deployed into that cluster, I think it is a simple matter of updating the ETW provider with my event source in this service fabric cluster. Here is the exported template (only a partial is shown that is relevant for azure diagnostics):
{
"properties": {
"publisher": "Microsoft.Azure.Diagnostics",
"type": "IaaSDiagnostics",
"typeHandlerVersion": "1.5",
"autoUpgradeMinorVersion": true,
"settings": {
"WadCfg": {
"DiagnosticMonitorConfiguration": {
"overallQuotaInMB": "50000",
"EtwProviders": {
"EtwEventSourceProviderConfiguration": [
{
"provider": "Microsoft-ServiceFabric-Actors",
"scheduledTransferKeywordFilter": "1",
"scheduledTransferPeriod": "PT5M",
"DefaultEvents": {
"eventDestination": "ServiceFabricReliableActorEventTable"
}
},
{
"provider": "Microsoft-ServiceFabric-Services",
"scheduledTransferPeriod": "PT5M",
"DefaultEvents": {
"eventDestination": "ServiceFabricReliableServiceEventTable"
}
},
{
"provider": "Bb.ServiceFabric.Infrastructure.Container",
"scheduledTransferPeriod": "PT1M",
"DefaultEvents": {
"eventDestination": "ServiceFabricReliableServiceEventTable"
}
}
],
"EtwManifestProviderConfiguration": [
{
"provider": "cbd93bc2-71e5-4566-b3a7-595d8eeca6e8",
"scheduledTransferLogLevelFilter": "Information",
"scheduledTransferKeywordFilter": "4611686018427387904",
"scheduledTransferPeriod": "PT5M",
"DefaultEvents": {
"eventDestination": "ServiceFabricSystemEventTable"
}
}
]
}
}
},
"StorageAccount": "sfdgsmsraghuplaygrou6827"
}
},
"name": "VMDiagnosticsVmExt_vmNodeType0Name"
}
I would like to update the EtwProviders/EtwEventSourceProviderConfiguration section to contain the following entry (MyCompany.MyServices.MyStatelessService is the name of my service's EventSource):
{
"provider": "MyCompany.MyServices.MyStatelessService",
"scheduledTransferPeriod": "PT5M",
"DefaultEvents": {
"eventDestination": "ServiceFabricReliableServiceEventTable"
}
}
Here are my questions:
Is this the correct way of inserting an ETW provider/EventSource (from my service) into an existing cluster (that is already enabled with azure diagnostics)?
Can I add this event source (as a ETW event source provider) using a powershell command(s)?
If so, what is the exact powershell command (using all the information from the above code fragment)?
Note: I am using .net framework 4.5.2.
Everything looks good with the added configuration above. Just be aware that for ETWProviders the EventDestination cannot contain hyphens (-); yours doesn't, so you are OK.
To update the Windows Azure Diagnostics (WAD) agent configuration, you can use either PowerShell or Cloud Explorer in Visual Studio.
For the former, simply update the ARM template and use the New-AzureRmResourceGroupDeployment cmdlet. See here for further information: https://azure.microsoft.com/en-us/documentation/articles/service-fabric-diagnostics-how-to-setup-wad/#update-diagnostics-to-collect-and-upload-logs-from-new-eventsource-channels
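Editing the template by hand works, but the change can also be scripted before the deployment step. The sketch below adds a provider entry to an exported diagnostics extension only if it is not already present. It assumes the JSON structure shown in the question; the function name add_etw_event_source is hypothetical, and locating the extension inside the full template is left out.

```python
import json


def add_etw_event_source(extension, provider, destination, period="PT5M"):
    """Add an EventSource provider to a WAD extension dict; return True if it was added."""
    providers = (extension["properties"]["settings"]["WadCfg"]
                 ["DiagnosticMonitorConfiguration"]["EtwProviders"]
                 ["EtwEventSourceProviderConfiguration"])
    if any(entry.get("provider") == provider for entry in providers):
        return False  # already configured, leave the template unchanged
    providers.append({
        "provider": provider,
        "scheduledTransferPeriod": period,
        "DefaultEvents": {"eventDestination": destination},
    })
    return True


if __name__ == "__main__":
    # Minimal stand-in for the exported extension shown in the question.
    extension = {"properties": {"settings": {"WadCfg": {"DiagnosticMonitorConfiguration": {
        "EtwProviders": {"EtwEventSourceProviderConfiguration": []}}}}}}
    add_etw_event_source(extension, "MyCompany.MyServices.MyStatelessService",
                         "ServiceFabricReliableServiceEventTable")
    print(json.dumps(extension, indent=2))
```

After saving the modified template, it is deployed with New-AzureRmResourceGroupDeployment as described above.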
To use Cloud Explorer in Visual Studio, browse to your Virtual Machine Scale Set (as this is the Azure resource that holds the WAD configuration). Right-click and choose Update Diagnostics. In the dialog shown, you have the option to upload a private and a public configuration file. Simply take a .json document containing the {"WadCfg": {}} element, and upload that as the public configuration.
If you need to update the private configuration, it specifies the storage account name and access key:
{
"storageAccountName": "",
"storageAccountKey": "",
"storageAccountEndPoint": "https://core.windows.net"
}
Hope this helps.
Mikkel