Spring Cloud | Feign Hystrix | First Call Timeout - spring-cloud

I have a service that uses 3 Feign clients. Each time I start my application, I get a TimeoutException on the first call to any Feign client.
I have to trigger each Feign client at least once before everything is stable. From what I found online, the cause is that something inside Feign or Hystrix is lazily loaded, and the suggested solution is a configuration class that overrides the Spring defaults. I've tried that with the code below and it does not help; I still see the same issue. Does anyone know a fix for this? Is the only solution to call the Feign client twice via a Hystrix callback?
@FeignClient(value = "SERVICE-NAME", configuration = ServiceFeignConfiguration.class)

@Configuration
public class ServiceFeignConfiguration {

    @Value("${service.feign.connectTimeout:60000}")
    private int connectTimeout;

    @Value("${service.feign.readTimeOut:60000}")
    private int readTimeout;

    @Bean
    public Request.Options options() {
        return new Request.Options(connectTimeout, readTimeout);
    }
}
Spring Cloud - Brixton.SR4
Spring Boot - 1.4.0.RELEASE
This is all running in docker
Ubuntu - 12.04
Docker - 1.12.1
Docker-Compose - 1.8

I found that the default Hystrix properties were the problem. They have a very small timeout window, so the request always timed out on the first try. I added these properties to the application.yml file in my config service, and now all of my services can use Feign with no problems and I don't have to code around the first-call timeout:
hystrix:
  threadpool.default.coreSize: "20"
  threadpool.default.maxQueueSize: "500000"
  threadpool.default.keepAliveTimeMinutes: "2"
  threadpool.default.queueSizeRejectionThreshold: "500000"
  command:
    default:
      fallback.isolation.semaphore.maxConcurrentRequests: "20"
      execution:
        timeout:
          enabled: "false"
        isolation:
          strategy: "THREAD"
          thread:
            timeoutInMilliseconds: "30000"
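To make the effect concrete, here is a minimal plain-Java sketch of a lazily initialized client behind a timeout. It is not Hystrix or Feign code: LazyClient, callWithTimeout and the millisecond values are hypothetical names and numbers chosen only to illustrate why a first call can exceed a short timeout while later calls do not.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Plain-Java illustration only; this is not Hystrix or Feign code.
class FirstCallTimeoutSketch {

    // Hypothetical stand-in for a client that initializes lazily on first use.
    static class LazyClient {
        private boolean initialized = false;
        private final long initMillis;

        LazyClient(long initMillis) {
            this.initMillis = initMillis;
        }

        synchronized String call() throws InterruptedException {
            if (!initialized) {
                Thread.sleep(initMillis); // one-off setup cost paid by the first caller
                initialized = true;
            }
            return "ok";
        }
    }

    // Runs the call on a worker thread and gives up after timeoutMillis.
    static String callWithTimeout(LazyClient client, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = client::call;
            Future<String> result = pool.submit(task);
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timeout";
        } catch (InterruptedException | ExecutionException e) {
            return "error";
        } finally {
            pool.shutdownNow(); // interrupts a worker that is still initializing
        }
    }
}
```

With a short timeout the first call on a fresh client gives up before initialization finishes. With a generous timeout the first call completes, and later calls on the same client fit comfortably inside the short timeout.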

Related

Spring Cloud Dataflow errorChannel not working

I'm attempting to create a custom exception handler for my Spring Cloud Dataflow stream to route some errors to be requeued and others to be DLQ'd.
To do this I'm utilizing the global Spring Integration "errorChannel" and routing based on exception type.
This is the code for the Spring Integration error router:
package com.acme.error.router;

import com.acme.exceptions.DlqException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.integration.annotation.MessageEndpoint;
import org.springframework.integration.annotation.Router;
import org.springframework.integration.transformer.MessageTransformationException;
import org.springframework.messaging.Message;

@MessageEndpoint
@EnableBinding({ ErrorMessageChannels.class })
public class ErrorMessageMappingRouter {

    private static final Logger LOGGER = LoggerFactory.getLogger(ErrorMessageMappingRouter.class);
    public static final String ERROR_CHANNEL = "errorChannel";

    // Routes transformation failures to the DLQ or the requeue channel;
    // any other payload goes to the DLQ.
    @Router(inputChannel = ERROR_CHANNEL)
    public String onError(Message<Object> message) {
        LOGGER.debug("ERROR ROUTER - onError");
        if (message.getPayload() instanceof MessageTransformationException) {
            MessageTransformationException exception = (MessageTransformationException) message.getPayload();
            Message<?> failedMessage = exception.getFailedMessage();
            if (exceptionChainContainsDlq(exception)) {
                return ErrorMessageChannels.DLQ_QUEUE_NAME;
            }
            return ErrorMessageChannels.REQUEUE_CHANNEL;
        }
        return ErrorMessageChannels.DLQ_QUEUE_NAME;
    }
    ...
}
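The router calls exceptionChainContainsDlq, whose body is elided in the question. As an illustration only, a cause-chain check could look like the plain-Java sketch below. SampleDlqException, chainContains and the depth limit are hypothetical stand-ins, not the original code.

```java
// Hypothetical stand-in for the application's own DLQ marker exception.
class SampleDlqException extends RuntimeException {
    SampleDlqException(String message) {
        super(message);
    }
}

class ExceptionChainSketch {

    // Assumed safety cap so that a malformed cause chain cannot loop forever.
    private static final int MAX_DEPTH = 50;

    // Returns true if the error or any of its causes is an instance of the given type.
    static boolean chainContains(Throwable error, Class<? extends Throwable> type) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < MAX_DEPTH; t = t.getCause()) {
            if (type.isInstance(t)) {
                return true;
            }
            depth++;
        }
        return false;
    }
}
```

A router could then call chainContains(exception, SampleDlqException.class) to choose between the DLQ and the requeue channel.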
The error router is picked up by each of the stream apps through a package scan on the Spring Boot App for each:
@ComponentScan(basePackages = { "com.acme.error.router" })
@SpringBootApplication
public class StreamApp {}
When this is deployed and run with the local Spring Cloud Dataflow server (version 1.5.0-RELEASE), and a DlqException is thrown, the message is successfully routed to the onError method in the errorRouter and then placed into the dlq topic.
However, when this is deployed as a docker container with SCDF Kubernetes server (also version 1.5.0-RELEASE), the onError method is never hit. (The log statement at the beginning of the router is never output)
In the startup logs for the stream apps, it looks like the bean is picked up correctly and registers as a listener for the errorChannel, but for some reason, when exceptions are thrown they do not get handled by the onError method in our router.
Startup Logs:
o.s.i.endpoint.EventDrivenConsumer : Adding {router:errorMessageMappingRouter.onError.router} as a subscriber to the 'errorChannel' channel
o.s.i.channel.PublishSubscribeChannel : Channel 'errorChannel' has 1 subscriber(s).
o.s.i.endpoint.EventDrivenConsumer : started errorMessageMappingRouter.onError.router
We are using all default settings for the spring cloud stream and kafka binder configurations:
spring.cloud:
  stream:
    binders:
      kafka:
        type: kafka
        environment.spring.cloud.stream.kafka.binder.brokers: brokerlist
        environment.spring.cloud.stream.kafka.binder.zkNodes: zklist
Edit: Added pod args from kubectl describe <pod>
Args:
--spring.cloud.stream.bindings.input.group=delivery-stream
--spring.cloud.stream.bindings.output.producer.requiredGroups=delivery-stream
--spring.cloud.stream.bindings.output.destination=delivery-stream.enricher
--spring.cloud.stream.binders.xdkafka.environment.spring.cloud.stream.kafka.binder.zkNodes=<zkNodes>
--spring.cloud.stream.binders.xdkafka.type=kafka
--spring.cloud.stream.binders.xdkafka.defaultCandidate=true
--spring.cloud.stream.binders.xdkafka.environment.spring.cloud.stream.kafka.binder.brokers=<brokers>
--spring.cloud.stream.bindings.input.destination=delivery-stream.config-enricher
One other idea we attempted was trying to use the Spring Cloud Stream - spring integration error channel support to send to a broker topic on errors, but since messages don't seem to be landing in the global Spring Integration errorChannel at all, that didn't work either.
Is there anything special we need to do in SCDF Kubernetes to enable the global Spring Integration errorChannel?
What am I missing here?
Update with solution from the comments:
After reviewing your configuration I am now pretty sure I know what
the issue is. You have a multi-binder configuration scenario. Even if
you only deal with a single binder instance the existence of
spring.cloud.stream.binders.... is what's going to make framework
treat it as multi-binder. Basically this a bug -
github.com/spring-cloud/spring-cloud-stream/issues/1384. As you can
see it was fixed but you need to upgrade to Elmhurst.SR2 or grab the
latest snapshot (we're in RC2 and 2.1.0.RELEASE is in few weeks
anyway) – Oleg Zhurakousky
This was indeed the problem with our setup. Instead of upgrading, we just eliminated our multi-binder usage for now and the issue was resolved.

Spring cloud gateway routes from consul

I can't figure out if spring-cloud-gateway supports Route reading from consul registry, like it is with Zuul.
I added the spring-cloud-starter-consul-discovery dependency and @EnableDiscoveryClient, and configured the Consul properties in application.yml. However, /actuator/gateway/routes doesn't show any routes from Consul.
I also tried setting spring.cloud.gateway.discovery.locator.enabled: true, but it didn't change anything.
Sample configuration below:
spring:
  cloud:
    consul:
      discovery:
        register: false
        locator:
          enabled: true
        acl-token: d3ee84e2-c99a-5d84-e4bf-b2cefd7671ba
        enabled: true
So the main question: is it even supposed to work?
EDIT: I probably should have mentioned that this is version 2.0.0.M5, with Spring Boot 2.0.0.M7.
Also I launched with --debug and there is this line:
GatewayDiscoveryClientAutoConfiguration#discoveryClientRouteDefinitionLocator:
  Did not match:
    - @ConditionalOnBean (types: org.springframework.cloud.client.discovery.DiscoveryClient; SearchStrategy: all) did not find any beans of type org.springframework.cloud.client.discovery.DiscoveryClient (OnBeanCondition)
  Matched:
    - @ConditionalOnProperty (spring.cloud.gateway.discovery.locator.enabled) matched (OnPropertyCondition)
I solved it by declaring the following bean, a DiscoveryClientRouteDefinitionLocator (reference):
@Configuration
@EnableDiscoveryClient
public class AutoRouting {

    @Bean
    public DiscoveryClientRouteDefinitionLocator discoveryClientRouteDefinitionLocator(DiscoveryClient discoveryClient, DiscoveryLocatorProperties properties) {
        return new DiscoveryClientRouteDefinitionLocator(discoveryClient, properties);
    }
}
P.S: You need to include "spring-cloud-consul"

@RibbonClient annotation does not work together with RestTemplate

I am trying to configure Ribbon with RestTemplate, based on the bookmark service example, but without luck. Here is my code:
@SpringBootApplication
@RestController
@RibbonClient(name = "foo", configuration = SampleRibbonConfiguration.class)
public class BookmarkServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(BookmarkServiceApplication.class, args);
    }

    @Autowired
    RestTemplate restTemplate;

    @RequestMapping("/hello")
    public String hello() {
        String greeting = this.restTemplate.getForObject("http://foo/hello", String.class);
        // One format specifier per argument; a second "%s" without an argument would throw.
        return String.format("%s!", greeting);
    }
}
It fails with the error page below:
Whitelabel Error Page
This application has no explicit mapping for /error, so you are seeing this as a fallback.
Tue Mar 22 19:59:33 GMT+08:00 2016
There was an unexpected error (type=Internal Server Error, status=500).
No instances available for foo
But if I remove the @RibbonClient annotation, everything works fine:
@RibbonClient(name = "foo", configuration = SampleRibbonConfiguration.class)
Here is the SampleRibbonConfiguration implementation:
public class SampleRibbonConfiguration {

    @Autowired
    IClientConfig ribbonClientConfig;

    @Bean
    public IPing ribbonPing(IClientConfig config) {
        return new PingUrl();
    }

    @Bean
    public IRule ribbonRule(IClientConfig config) {
        return new AvailabilityFilteringRule();
    }
}
Is it because @RibbonClient cannot work together with RestTemplate?
My other question: can Ribbon settings such as the load-balancing rule be configured in the application.yml file?
According to the Ribbon wiki, parameters such as NFLoadBalancerClassName and NFLoadBalancerRuleClassName can be set in a property file. Does Spring Cloud support this as well?
I'm going to assume you're using Eureka for Service Discovery.
Your particular error:
No instances available for foo
can happen for a couple of reasons
1.) All services are down
All of the instances of your foo service could legitimately be DOWN.
Solution: Try visiting your Eureka Dashboard and ensure all the services are actually UP.
If you're running locally, the Eureka Dashboard is at http://localhost:8761/
2.) Waiting for heartbeats
When you first register a service with Eureka, there's a period of time where the service is UP but not available. From the documentation:
A service is not available for discovery by clients until the
instance, the server and the client all have the same metadata in
their local cache (so it could take 3 heartbeats)
Solution: Wait a good 30 seconds after starting your foo service before you try calling it via your client.
In your particular case I'm going to guess #2 is likely what's happening to you. You're probably starting the service and trying to call it immediately from the client.
When it doesn't work, you stop the client, make some changes and restart. By that time though, all of the heartbeats have completed and your service is now available.
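If waiting by hand is inconvenient, the client can retry for a while instead. The following is a minimal plain-Java sketch of a fixed-delay retry. RetrySketch, its parameters and the delay values are hypothetical and not part of Eureka, Ribbon or Spring Cloud.

```java
import java.util.function.Supplier;

class RetrySketch {

    // Calls the supplier up to maxAttempts times, sleeping delayMillis between failed attempts.
    // Rethrows the last failure if every attempt fails.
    static <T> T retry(Supplier<T> call, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delayMillis);
                    } catch (InterruptedException interrupted) {
                        Thread.currentThread().interrupt(); // preserve the interrupt flag
                        throw e;
                    }
                }
            }
        }
        throw last;
    }
}
```

For example, the RestTemplate call could be wrapped as retry(() -> restTemplate.getForObject("http://foo/hello", String.class), 6, 5000), which under these assumed values keeps trying for roughly the 30 seconds mentioned above.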
For your second question, look at the "Customizing the Ribbon Client using properties" section in the reference documentation.

Spring Cloud Sidecar cannot unregister Node.js service once it is shut down

I suspect this is a bug. Can anyone help me check?
In my sideCar application, I have application.yml:
server:
  port: 5678
spring:
  application:
    name: nodeservice
sidecar:
  port: ${nodeServer.instance.port:3000}
  health-uri: http://localhost:${nodeServer.instance.port:3000}/app/health.json
eureka:
  instance:
    hostname: ${host.instance.name:localhost}
    leaseRenewalIntervalInSeconds: 5 #default is 30, recommended to keep default
    metadataMap:
      instanceId: ${spring.application.name}:${spring.application.instance_id:${random.value}}
  client:
    serviceUrl:
      defaultZone: http://localhost:8761/eureka/
And in my main spring config app, I have:
String url_node = "";
try {
    InstanceInfo instance = discoveryClient.getNextServerFromEureka("nodeservice", false);
    // InstanceInfo instance = discoveryClient.getNextServerFromEureka("foo", false);
    url_node = instance.getHomePageUrl();
} catch (Exception e) {
}
Now when I start my Node.js server, the Spring app prints:
url for nodeService is: http://SJCC02MT0NUFD58.local:3000/
This is perfect, but after I shut down my Node.js server, the
http://localhost:3000/app/health.json URL is completely down, yet the main Java Spring app still shows the same output.
So it seems that even when the Node.js service is no longer available, Eureka still keeps it in memory.
Is anything wrong with my configuration?
Another question: why is the URL discovered by Spring http://SJCC02MT0NUFD58.local:3000/ and not http://localhost:3000? I already configured Eureka.server.instance.host to be localhost.
Thanks
You are seeing the expected behavior. Eureka and Ribbon are built to be very resilient (AP in CAP). In the case you described, where a service had at least one instance and then none, the Ribbon Eureka client keeps the last known list of servers around as a last resort. You're just printing the names; if you try to connect to that service, it will fail. This is where you use the Hystrix circuit breaker, which can provide a fallback when no instances are up.
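The fallback idea can be sketched in plain Java as follows. This is not the Hystrix API and it keeps no circuit state; FallbackSketch and withFallback are hypothetical names used only to show the shape of "try the call, otherwise return a default".

```java
import java.util.function.Supplier;

class FallbackSketch {

    // Runs the primary call; if it fails with a runtime exception, returns the fallback value instead.
    static <T> T withFallback(Supplier<T> primary, Supplier<T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            return fallback.get();
        }
    }
}
```

In the scenario above, the primary supplier would be the actual call to the Node.js service (not just the lookup of its URL), and the fallback would return whatever default the application can live with.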

Zuul timing out in long-ish requests

I am using a front end Spring Cloud application (micro service) acting as a Zuul proxy (@EnableZuulProxy) to route requests from an external source to other internal micro services written using Spring Cloud (Spring Boot).
The Zuul server is straight out of the applications in the samples section
@SpringBootApplication
@Controller
@EnableZuulProxy
@EnableDiscoveryClient
public class ZuulServerApplication {

    public static void main(String[] args) {
        new SpringApplicationBuilder(ZuulServerApplication.class).web(true).run(args);
    }
}
I ran this set of services locally and it all seems to work fine but if I run it on a network with some load, or through a VPN, then I start to see Zuul forwarding errors, which I am seeing as client timeouts in the logs.
Is there any way to change the timeout on the Zuul forwards so that I can eliminate this issue from my immediate concerns? What accessible parameter settings are there for this?
In my case I had to change the following property:
zuul.host.socket-timeout-millis=30000
The properties to set are: ribbon.ReadTimeout in general and <service>.ribbon.ReadTimeout for a specific service, in milliseconds. The Ribbon wiki has some examples. This javadoc has the property names.
I have experienced the same problem: in long requests, Zuul's hystrix command kept timing out after around a second in spite of setting ribbon.ReadTimeout=10000.
I solved it by disabling timeouts completely:
hystrix:
  command:
    default:
      execution:
        timeout:
          enabled: false
An alternative that also works is to change Zuul's Hystrix isolation strategy to THREAD:
hystrix:
  command:
    default:
      execution:
        isolation:
          strategy: THREAD
          thread:
            timeoutInMilliseconds: 10000
This worked for me: I had to set the connection and socket timeouts in application.yml:
zuul:
  host:
    connect-timeout-millis: 60000 # starting the connection
    socket-timeout-millis: 60000 # monitor the continuous incoming data flow
I had to alter two timeouts to stop Zuul from timing out long-running requests. Even if Hystrix timeouts are disabled, Ribbon will still time out.
hystrix:
  command:
    default:
      execution:
        timeout:
          enabled: false
ribbon:
  ReadTimeout: 100000
  ConnectTimeout: 100000
If Zuul uses service discovery, you need to configure these timeouts with the ribbon.ReadTimeout and ribbon.SocketTimeout Ribbon properties.
If you have configured Zuul routes by specifying URLs, you need to use zuul.host.connect-timeout-millis and zuul.host.socket-timeout-millis
By routes I mean:
zuul:
  routes:
    dummy-service:
      path: /dummy/**
I had a similar issue. I was trying to set the timeout globally, and the order in which the Hystrix and Ribbon timeouts are set also matters.
After spending plenty of time, I ended up with this solution. My service was taking up to 50 seconds because of a huge volume of data.
Points to consider before changing the default timeout values:
The Hystrix timeout should be greater than the combined Ribbon ReadTimeout and ConnectTimeout.
Set it for the specific service only; setting it globally did not work for me.
I mean use this:
command:
  your-service-name:
instead of this:
command:
  default:
Working solution:
hystrix:
  command:
    your-service-name:
      execution:
        isolation:
          strategy: THREAD
          thread:
            timeoutInMilliseconds: 95000
your-service-name:
  ribbon:
    ConnectTimeout: 30000
    ReadTimeout: 60000
    MaxTotalHttpConnections: 500
    MaxConnectionsPerHost: 100
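The rule that the Hystrix timeout must exceed the combined Ribbon timeouts is simple arithmetic, which the small plain-Java sketch below spells out. TimeoutBudgetSketch and its method names are hypothetical helpers, not part of Hystrix, Ribbon or Zuul, and the sketch only covers the single-attempt sum described in this answer.

```java
class TimeoutBudgetSketch {

    // Combined Ribbon budget for a single attempt, in milliseconds.
    static long ribbonBudgetMillis(long connectTimeoutMillis, long readTimeoutMillis) {
        return connectTimeoutMillis + readTimeoutMillis;
    }

    // True when the Hystrix timeout is strictly greater than the Ribbon budget.
    static boolean hystrixCoversRibbon(long hystrixTimeoutMillis,
                                       long connectTimeoutMillis,
                                       long readTimeoutMillis) {
        return hystrixTimeoutMillis > ribbonBudgetMillis(connectTimeoutMillis, readTimeoutMillis);
    }
}
```

With the values from the working solution, the Ribbon budget is 30000 + 60000 = 90000 ms, which is below the Hystrix timeout of 95000 ms, so hystrixCoversRibbon(95000, 30000, 60000) holds.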
Only these settings in application.yml worked for me:
ribbon:
  ReadTimeout: 90000
  ConnectTimeout: 90000
  eureka:
    enabled: true
zuul:
  host:
    max-total-connections: 1000
    max-per-route-connections: 100
  semaphore:
    max-semaphores: 500
hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 1000000
Hope it helps someone!