Occasional UnknownHostException for a service within Kubernetes

I have a Kubernetes cluster set up on AWS. When I call elasticsearch-client.default.svc.cluster.local from a pod, I occasionally get an UnknownHostException. It must have something to do with name resolution, because hitting the service IP directly works fine.
Note: I already have the kube-dns autoscaler enabled, and I also tried manually with about 6 kube-dns pods, so I don't think it is caused by DNS pod scaling.
When I set upstreamNameservers in the kube-dns ConfigMap to the Google nameservers (8.8.8.8 and 8.8.4.4), the issue goes away. I assume it is caused by API rate limiting done by AWS on Route 53, but I don't know why the name resolution request would go to the AWS nameservers.

Here's a good write-up that may be related to your problem; also check out this one by Weaveworks.
Over the last year, a number of issues have been opened in the Kubernetes GitHub issue tracker that deal with various DNS latencies and problems from within a cluster.
It is also worth mentioning, although it is not a fix for every DNS-related problem, that CoreDNS has been generally available since Kubernetes 1.11 and is, or will become, the default DNS add-on for clusters, replacing kube-dns.
Here are a couple of issues that might be related to the problem you're experiencing:
#47142
#45976
#56903
Hopefully this helps you move forward.

I also faced a similar issue with my custom Kubernetes cluster, MySQL, and Solr. The kube-dns checks suggested by the tutorial on the official site (https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/) all passed, so I had to apply the following retry logic to the data source and the Solr client:
...
import org.apache.commons.dbcp.BasicDataSource;
...
public class CommunicationSafeDataSource extends BasicDataSource {
private static final Logger LOGGER = LoggerFactory.getLogger(CommunicationSafeDataSource.class);
@Override
public Connection getConnection() throws SQLException {
for (int i = 1; i <= 10; i++) {
try {
return super.getConnection();
} catch (Exception e) {
if ((e instanceof CommunicationsException) || (e.getCause() instanceof CommunicationsException)) {
LOGGER.warn("Communication exception occurred, retry " + i);
try {
Thread.sleep(i * 1000);
} catch (InterruptedException ie) {
//
}
} else {
throw e;
}
}
}
throw new IllegalStateException("Cannot get connection");
}
}
...
import org.apache.solr.client.solrj.impl.HttpSolrClient;
...
public class CommunicationSafeSolrClient extends HttpSolrClient {
private static final Logger LOGGER = LoggerFactory.getLogger(CommunicationSafeSolrClient.class);
protected CommunicationSafeSolrClient(Builder builder) {
super(builder);
}
@Override
protected NamedList<Object> executeMethod(HttpRequestBase method, ResponseParser processor, boolean isV2Api)
throws SolrServerException {
for (int i = 1; i <= 10; i++) {
try {
return super.executeMethod(method, processor, isV2Api);
} catch (Exception e) {
if ((e instanceof UnknownHostException) || (e.getCause() instanceof UnknownHostException)
|| (e instanceof ConnectException) || (e.getCause() instanceof ConnectException)) {
LOGGER.warn("Communication exception occurred, retry " + i);
try {
Thread.sleep(i * 1000);
} catch (InterruptedException ie) {
//
}
} else {
throw e;
}
}
}
throw new IllegalStateException("Cannot execute method");
}
}
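Both wrappers above repeat the same loop: try, check whether the failure looks like a transient network error, sleep a little longer each time, and give up after a fixed number of attempts. As a rough sketch only, that loop could be factored into one helper. The class name Retry and its parameters are made up for illustration and are not part of the original answer; it uses only the JDK and checks for UnknownHostException and ConnectException (the MySQL CommunicationsException would need the driver on the classpath).

```java
import java.net.ConnectException;
import java.net.UnknownHostException;
import java.util.concurrent.Callable;

// Hypothetical helper, not from the original answer: retries transient network failures.
final class Retry {
    private Retry() {}

    /** True if the exception or its direct cause is a DNS or connection failure. */
    static boolean isTransient(Throwable t) {
        Throwable cause = t.getCause();
        return t instanceof UnknownHostException || t instanceof ConnectException
                || cause instanceof UnknownHostException || cause instanceof ConnectException;
    }

    /**
     * Calls the action up to maxAttempts times, sleeping attempt * baseDelayMillis
     * between attempts. Non-transient exceptions are rethrown immediately.
     */
    static <T> T call(Callable<T> action, int maxAttempts, long baseDelayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (!isTransient(e)) {
                    throw e;
                }
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(attempt * baseDelayMillis);
                    } catch (InterruptedException ie) {
                        // Keep the interrupt flag set and stop retrying.
                        Thread.currentThread().interrupt();
                        throw e;
                    }
                }
            }
        }
        throw last; // all attempts failed with a transient error
    }
}
```

With such a helper, the body of getConnection() could shrink to something like Retry.call(() -> super.getConnection(), 10, 1000), again assuming the transient check is extended to the exception types that matter for the driver in use. Unlike the loops above, the sketch rethrows the last real exception instead of a generic IllegalStateException, so the root cause stays visible in the logs.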

@Transactional with error handling and db-inserts in catch block (Spring Boot)

I would like to roll back the transaction for the data in case of an error and, at the same time, write the error to the db.
I can't manage to do this with @Transactional annotations.
The following code produces a runtime error (1/0) but still writes the data to the db, and it also writes the error to the error table.
I tried several variations and followed similar questions on Stack Overflow, but I didn't succeed.
Does anyone have a hint on how to do this?
@Service
public class MyService {
    @Transactional(rollbackFor = Exception.class)
    public void updateData() {
        try {
            processAndPersist(); // <- db operation with inserts
            int i = 1/0; // <- Runtime error
        } catch (Exception e) {
            persistError(e.getMessage());
            trackReportError(filename, e.getMessage());
        }
    }
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void persistError(String message) {
        persistError2Db(message); // <- db operation with insert
    }
}
To roll back the transaction, updateData() has to throw an exception. At the same time, the transaction of persistError() must not be rolled back.
@Transactional(rollbackFor = Exception.class)
public void updateData() {
    try {
        processAndPersist(); // <- db operation with inserts
        int i = 1/0; // <- Runtime error
    } catch (Exception e) {
        persistError(e.getMessage());
        trackReportError(filename, e.getMessage());
        throw e; // rethrowing alone is not enough, see below
    }
}
Just rethrowing the exception will not help, because persistError() will run in the same transaction as updateData(): it is called through the this reference rather than through the Spring proxy, so its @Transactional annotation is ignored.
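The effect of calling through this can be reproduced without Spring. The sketch below is only an illustration: it stands in for Spring's transactional proxy with a plain JDK dynamic proxy that records which calls it intercepts, and all names in it (ErrorService, ErrorServiceImpl, SelfInvocationDemo) are made up for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical service interface, loosely modelled on the question's MyService.
interface ErrorService {
    void updateData();
    void persistError(String message);
}

class ErrorServiceImpl implements ErrorService {
    @Override
    public void updateData() {
        // Plain Java call on `this`: it never passes through the proxy.
        persistError("failed");
    }

    @Override
    public void persistError(String message) {
        // A real implementation would insert into the error table here.
    }
}

final class SelfInvocationDemo {
    /** Wraps the target in a JDK proxy that records each intercepted method name. */
    static ErrorService proxyOf(ErrorService target, List<String> intercepted) {
        InvocationHandler handler = (proxy, method, args) -> {
            // This is the point where a transactional proxy would begin or join a transaction.
            intercepted.add(method.getName());
            return method.invoke(target, args);
        };
        return (ErrorService) Proxy.newProxyInstance(
                ErrorService.class.getClassLoader(),
                new Class<?>[] {ErrorService.class},
                handler);
    }

    /** Calls updateData() through the proxy and returns the intercepted method names. */
    static List<String> interceptedCalls() {
        List<String> intercepted = new ArrayList<>();
        ErrorService service = proxyOf(new ErrorServiceImpl(), intercepted);
        service.updateData();
        return intercepted;
    }

    public static void main(String[] args) {
        System.out.println(interceptedCalls());
    }
}
```

Only updateData shows up in the list. The nested persistError() call goes straight to the implementation, which is why an annotation that relies on the proxy has no effect on it.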
Options to solve this:
Use a self reference.
Use self injection (see "Spring self injection for transactions").
Move the call to persistError() outside updateData() (and outside its transaction). Remove @Transactional from persistError() (it will not work there) and rely on the repository's transaction in persistError2Db().
Move persistError() to a separate service. It will then be called through a proxy.
Don't use declarative transactions (the @Transactional annotation). Use programmatic transaction management to set the transaction boundaries manually: https://docs.spring.io/spring-framework/docs/3.0.0.M3/reference/html/ch11s06.html
Also keep in mind that persistError() itself can fail (and very likely will at some point).
Using a self reference
With a self reference to MyService you get a transaction, because you call the method on the Spring proxy rather than directly on MyServiceImpl.
@Service
public class MyServiceImpl implements MyService {
public void doWork(MyService self) {
DataEntity data = loadData();
try {
self.updateData(data);
} catch (Exception ex) {
log.error("Error for dataId={}", data.getId(), ex);
self.persistError("Error");
trackReportError(filename, ex);
}
}
@Transactional
public void updateData(DataEntity data) {
persist(data); // <- db operation with inserts
}
@Transactional
public void persistError(String message) {
try {
persistError2Db(message); // <- db operation with insert
} catch (Exception ex) {
log.error("Error for message={}", message, ex);
}
}
}
public interface MyService {
void doWork(MyService self);
void updateData(DataEntity data);
void persistError(String message);
}
To use
MyService service = ...;
service.doWork(service);

Service is not running in some devices and unable to stop running service

I have a location service implemented in my app that runs every x minutes.
The service is started when the user logs in. Whenever the app is killed and reopened, I check in OnStart() whether the service is running.
In App.xaml.cs:
Code to stop service:
DependencyService.Get<IDeviceService>().StopLocationService();
Code to check if the service is running:
bool IsLocationServiceRunning = DependencyService.Get<IServiceRunning>().IsServiceRunning();
In DeviceService.cs:
public void StopLocationService()
{
try
{
Android.App.Application.Context.StopService(new Intent(Android.App.Application.Context, typeof(LocationService)));
}
catch (Exception ex)
{
Utility.LogMessage("Error in StopLocationService(M): " + ex.Message, LogMessageType.error);
}
}
ServiceRunning.cs:
public bool IsServiceRunning()
{
try
{
ActivityManager manager = (ActivityManager)Forms.Context.GetSystemService(Context.ActivityService);
Type serviceClass = typeof(LocationService);
foreach (var service in manager.GetRunningServices(int.MaxValue))
{
if (service.Service.ClassName.EndsWith(typeof(LocationService).Name))
{
return true;
}
}
}
catch(Exception ex)
{
Utility.LogMessage("Error in IsServiceRunning(M): ", LogMessageType.error);
Utility.LogError(ex, LogMessageType.error);
}
return false;
}
First I stop the service, but when I then check IsServiceRunning(), it still returns true.
Any idea how to stop a service?
I need to stop the service and start it again in some scenarios.
Thanks

netty issue when writeAndFlush called from different InboundChannelHandlerAdapter.channelRead

I've got an issue, for which I am unable to post full code (sorry), due to security reasons. The gist of my issue is that I have a ServerBootstrap, created as follows:
bossGroup = new NioEventLoopGroup();
workerGroup = new NioEventLoopGroup();
final ServerBootstrap b = new ServerBootstrap();
b.group(bossGroup, workerGroup)
.channel(NioServerSocketChannel.class)
.childHandler(new ChannelInitializer<SocketChannel>() {
@Override
public void initChannel(SocketChannel ch) throws Exception {
ch.pipeline().addFirst("idleStateHandler", new IdleStateHandler(0, 0, 3000));
//Adds the MQTT encoder and decoder
ch.pipeline().addLast("decoder", new MyMessageDecoder());
ch.pipeline().addLast("encoder", new MyMessageEncoder());
ch.pipeline().addLast(createMyHandler());
}
}).option(ChannelOption.SO_BACKLOG, 128).option(ChannelOption.SO_REUSEADDR, true)
.option(ChannelOption.TCP_NODELAY, true)
.childOption(ChannelOption.SO_KEEPALIVE, true);
// Bind and start to accept incoming connections.
channelFuture = b.bind(listenAddress, listenPort);
createMyHandler() basically returns an extended implementation of ChannelInboundHandlerAdapter.
I also have a "client" listener that listens for incoming connection requests and is set up as follows:
final String host = getHost();
final int port = getPort();
nioEventLoopGroup = new NioEventLoopGroup();
bootStrap = new Bootstrap();
bootStrap.group(nioEventLoopGroup);
bootStrap.channel(NioSocketChannel.class);
bootStrap.option(ChannelOption.SO_KEEPALIVE, true);
bootStrap.handler(new ChannelInitializer<SocketChannel>() {
@Override
public void initChannel(SocketChannel ch) throws Exception {
ch.pipeline().addFirst("idleStateHandler", new IdleStateHandler(0, 0, getKeepAliveInterval()));
ch.pipeline().addAfter("idleStateHandler", "idleEventHandler", new MoquetteIdleTimeoutHandler());
ch.pipeline().addLast("decoder", new MyMessageDecoder());
ch.pipeline().addLast("encoder", new MyMessageEncoder());
ch.pipeline().addLast(MyClientHandler.this);
}
})
.option(ChannelOption.SO_REUSEADDR, true)
.option(ChannelOption.TCP_NODELAY, true);
// Start the client.
try {
channelFuture = bootStrap.connect(host, port).sync();
} catch (InterruptedException e) {
throw new MyException("Exception", e);
}
Here MyClientHandler is again a subclass of ChannelInboundHandlerAdapter. Everything works fine: I get messages coming in on the "server" adapter, I process them, and I send responses back on the same context. The same goes for the "client" handler.
The problem happens when I have to proxy some messages from the server or client handler to the other connection. Again, I am very sorry for not being able to post much code, but the gist of it is that I'm calling from:
serverHandler.channelRead(ChannelHandlerContext ctx, Object msg) {
    if (msg instanceof myProxyingMessage) {
        if (ctx.channel().isActive()) {
            ctx.channel().writeAndFlush(someOtherMessage);
            getClientHandler().writeAndFlush(myProxyingMessage); // <- the (client) write in question
        }
    }
}
Now here's the problem: the marked (client) writeAndFlush never actually writes the message bytes, and it doesn't throw any errors. The returned ChannelFuture reports false for everything (success, cancelled, done). If I sync on it, it eventually times out for other reasons (a connection timeout set within my code).
I know I haven't posted all of my code, but I'm hoping that someone has some tips and/or pointers for how to isolate the problem of WHY it is not writing to the client context. I'm not a Netty expert by any stretch, and most of this code was written by someone else. They are both subclassing ChannelInboundHandlerAdapter
Feel free to ask any questions if you have any.
EDIT:
I tried to proxy the request back to a DIFFERENT context/channel (ie, the client channel) using the following test code:
public void proxyPubRec(int messageId) throws MQTTException {
logger.log(logLevel, "proxying PUBREC to context: " + debugContext());
PubRecMessage pubRecMessage = new PubRecMessage();
pubRecMessage.setMessageID(messageId);
pubRecMessage.setRemainingLength(2);
logger.log(logLevel, "pipeline writable flag: " + ctx.pipeline().channel().isWritable());
MyMQTTEncoder encoder = new MyMQTTEncoder();
ByteBuf buff = null;
try {
buff = encoder.encode(pubRecMessage);
ctx.channel().writeAndFlush(buff);
} catch (Throwable t) {
logger.log(Level.SEVERE, "unable to encode PUBREC");
} finally {
if (buff != null) {
buff.release();
}
}
}
public class MyMQTTEncoder extends MQTTEncoder {
public ByteBuf encode(AbstractMessage msg) {
PooledByteBufAllocator allocator = new PooledByteBufAllocator();
ByteBuf buf = allocator.buffer();
try {
super.encode(ctx, msg, buf);
} catch (Throwable t) {
logger.log(Level.SEVERE, "unable to encode PUBREC, " + t.getMessage());
}
return buf;
}
}
But the above at line: ctx.channel().writeAndFlush(buff) is NOT writing to the other channel - any tips/tricks on debugging this sort of issue?
someOtherMessage has to be a ByteBuf.
So, take this:
serverHandler.channelRead(ChannelHandlerContext ctx, Object msg) {
    if (msg instanceof myProxyingMessage) {
        if (ctx.channel().isActive()) {
            ctx.channel().writeAndFlush(someOtherMessage);
            getClientHandler().writeAndFlush(myProxyingMessage);
        }
    }
}
... and replace it with this:
serverHandler.channelRead(ChannelHandlerContext ctx, Object msg) {
    if (msg instanceof myProxyingMessage) {
        if (ctx.channel().isActive()) {
            ctx.channel().writeAndFlush(byteBuf); // placeholder: someOtherMessage as a ByteBuf
            getClientHandler().writeAndFlush(myProxyingMessage);
        }
    }
}
Actually, this turned out to be a threading issue. One of my threads was blocked/waiting while other threads were writing to the context, and because of this the writes were buffered and not sent, even with a flush. Problem solved!
Essentially, I moved the code for the first message into a Runnable submitted to an Executor, which lets it run separately so that the second write/response is able to write to the context. There are still potentially some issues with this (in terms of message ordering), but that is off topic for the original question. Thanks for all your help!
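For readers who want to see this class of problem in isolation, here is a small simulation that uses only java.util.concurrent and no Netty at all. It is a sketch under assumptions, not a reconstruction of the original code: a single-threaded executor stands in for the channel's event loop, a no-op task stands in for the write, and the names BlockedLoopDemo and writeCompletes are invented for the example. It shows one plausible way a write can sit in the queue forever, namely when the thread that would execute it is itself blocked waiting.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical simulation: a single-threaded executor plays the role of an event loop.
final class BlockedLoopDemo {
    /**
     * Runs a handler that queues a "write" on the loop and then blocks waiting for it.
     * The handler runs on the loop thread itself, or on a worker thread if offload is true.
     * Returns true if the write finished within 200 ms.
     */
    static boolean writeCompletes(boolean offload) throws Exception {
        ExecutorService loop = Executors.newSingleThreadExecutor();
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<Boolean> outcome = new CompletableFuture<>();
            Runnable handler = () -> {
                // The write is queued on the loop thread, like a write to a channel.
                Future<?> write = loop.submit(() -> { /* pretend to flush bytes */ });
                try {
                    write.get(200, TimeUnit.MILLISECONDS); // blocking wait
                    outcome.complete(true);
                } catch (TimeoutException e) {
                    outcome.complete(false); // the queued write never got a chance to run
                } catch (Exception e) {
                    outcome.completeExceptionally(e);
                }
            };
            (offload ? worker : loop).submit(handler);
            return outcome.get(5, TimeUnit.SECONDS);
        } finally {
            loop.shutdownNow();
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("blocking on the loop thread: " + writeCompletes(false));
        System.out.println("blocking on a worker thread: " + writeCompletes(true));
    }
}
```

When the handler blocks on the loop thread, the queued write cannot run and the call returns false. When the same handler is handed to a separate executor, the loop thread stays free and the write completes. As noted above, moving work to another thread can change the order in which messages are written.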

Vert.x - Get deployment ID within currently running verticle

I'm looking for the deployment ID for the currently running verticle.
The goal is to allow a verticle to undeploy itself. I currently pass the deploymentID into the deployed verticle over the event bus to accomplish this, but would prefer some direct means of access.
container.undeployVerticle(deploymentID)
There are two ways to get the deployment ID. If you have a verticle that starts and handles all the module deployments, you can add an async result handler and get the deployment ID from it. Alternatively, you can get the platform manager from the container using reflection.
The async handler looks as follows:
container.deployVerticle("foo.ChildVerticle", new AsyncResultHandler<String>() {
public void handle(AsyncResult<String> asyncResult) {
if (asyncResult.succeeded()) {
System.out.println("The verticle has been deployed, deployment ID is " + asyncResult.result());
} else {
asyncResult.cause().printStackTrace();
}
}
});
Access Platform Manager as follows:
protected final PlatformManagerInternal getManager() {
try {
Container container = getContainer();
Field f = DefaultContainer.class.getDeclaredField("mgr");
f.setAccessible(true);
return (PlatformManagerInternal)f.get(container);
}
catch (Exception e) {
e.printStackTrace();
throw new ScriptException("Could not access verticle manager");
}
}
protected final Map<String, Deployment> getDeployments() {
try {
PlatformManagerInternal mgr = getManager();
Field d = DefaultPlatformManager.class.getDeclaredField("deployments");
d.setAccessible(true);
return Collections.unmodifiableMap((Map<String, Deployment>)d.get(mgr));
}
catch (Exception e) {
throw new ScriptException("Could not access deployments");
}
}
References:
http://grepcode.com/file/repo1.maven.org/maven2/io.vertx/vertx-platform/2.1.2/org/vertx/java/platform/impl/DefaultPlatformManager.java#DefaultPlatformManager.genDepName%28%29
https://github.com/crashub/mod-shell/blob/master/src/main/java/org/vertx/mods/VertxCommand.java
http://vertx.io/core_manual_java.html#deploying-a-module-programmatically
In Vert.x 3 and later, you can call this from anywhere inside the verticle:
context.deploymentID();

Write to Event Log - The source was not found, but some or all event logs could not be searched. Inaccessible logs: Security.

I am trying to write some messages to the Windows Event Log.
A security exception is thrown when calling EventLog.SourceExists().
private bool CheckIfEventLogSourceExits()
{
try
{
if (!EventLog.SourceExists(this.BaseEventLog))
{
return false;
}
return true;
}
catch (System.Security.SecurityException)
{
return false;
}
}
All answers to this question explain how to resolve the issue MANUALLY, like in this Stack Overflow thread, where the solution is to change some registry keys.
But you can't expect everyone who uses your application to be aware of these changes.
So my question is: how can we solve this issue programmatically?
My code is below:
try
{
    string sLog = "Application";
    // Create the source only if it does not exist yet.
    if (!CheckIfEventLogSourceExits())
    {
        EventLog.CreateEventSource(this.BaseEventLog, sLog);
    }
    EventLog.WriteEntry(this.BaseEventLog, message, eventLogEntryType);
}
catch (Exception ex)
{
    ex.Source = "WriteToEventLog";
    throw; // rethrow without resetting the stack trace
}
I know this is a late reply, but the answer I found from similar experience is that the account your service runs under doesn't have administrative rights on the machine and thus can't write to the logs.
It's easy enough to figure out whether an app is running with admin rights. You can add something like this to your code, together with a message box advising the user to run it as an administrator.
private void GetServicePermissionLevel()
{
bool bAdmin = false;
try {
SecurityIdentifier sidAdmin = new SecurityIdentifier(WellKnownSidType.BuiltinAdministratorsSid, null);
AppDomain myDomain = Thread.GetDomain();
myDomain.SetPrincipalPolicy(PrincipalPolicy.WindowsPrincipal);
WindowsPrincipal myPrincipal = (WindowsPrincipal)Thread.CurrentPrincipal;
if (myPrincipal.IsInRole(sidAdmin)) {
bAdmin = true;
} else {
bAdmin = false;
}
} catch (Exception ex) {
throw new Exception("Error in GetServicePermissionlevel(): " + ex.Message + " - " + ex.StackTrace);
} finally {
_ServiceRunAsAdmin = bAdmin;
}
}