How to use nifi.web.proxy.host and nifi.web.proxy.context.path? - HAProxy

I have deployed NiFi with Kerberos in a cluster, and I am using HAProxy to access the UI. I can access the NiFi UI through the individual node URLs, but it does not work with the load balancer URL; I get the following error:
System Error
The request contained an invalid host header
I think it can be fixed with the nifi.web.proxy.host and nifi.web.proxy.context.path parameters. I tried these two parameters, but the problem still remains.

This host header check was introduced in NiFi 1.5 (NIFI-4761).
To resolve this issue, whitelist the hostname used to access NiFi with the following parameter in the nifi.properties configuration file:
nifi.web.proxy.host = <host:port>
It's a comma-separated list of allowed HTTP Host header values to consider when NiFi is running securely and will be receiving requests to a different host[:port], for example when running in a Docker container or behind a proxy (e.g. localhost:18443, proxyhost:443). By default, this value is blank, meaning NiFi only allows requests sent to the host[:port] that NiFi is bound to.
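For example, if HAProxy is reachable at haproxy.example.com:443 and the nodes listen on port 9443 (placeholder names, substitute your own), the property would look like:
nifi.web.proxy.host=haproxy.example.com:443,node1.example.com:9443,node2.example.com:9443
Set this on every node and restart NiFi for the change to take effect.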

CORS and database URI problem in scaling-out architecture

I have a React frontend and a Spring Boot backend with MongoDB behind it.
I have issues with setting two parameters in the Spring Boot service.
The first is the address of MongoDB, which is currently set to localhost:27017 in application.properties.
It works on localhost, but since I plan to scale out using Kubernetes and Docker images, I would like to know how
and where to define it for the case in which I have mongo1, mongo2, and mongo3 database hosts and want to pass all three URIs.
The second issue is trickier! The React frontend doesn't work in Chrome until I put an allow-cross-origin annotation on my Spring REST endpoint. I hardcoded localhost:3000 there, but when I scale out using Kubernetes this won't work if the frontend gets data from another host in the cluster. What should I do here?
To answer your first question, you can configure multiple data sources; see the Spring Boot documentation on how to configure more than one data source (80.2 Configure Two DataSources).
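For MongoDB in particular, a minimal sketch (host names, database name, and replica-set name are placeholders) is to list all replica-set members in a single connection URI in application.properties:
spring.data.mongodb.uri=mongodb://mongo1:27017,mongo2:27017,mongo3:27017/mydb?replicaSet=rs0
The MongoDB driver then discovers the topology and fails over between the listed hosts, so you don't need a separate data source per host.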
For the second question, you can simply use a wildcard CORS origin, or, if you know all of your load-balanced frontend server URLs, you can pass them as a list of CORS origins.
– * – means that all origins are allowed.
– If undefined, all origins are allowed.
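For example, with Spring's @CrossOrigin annotation (the controller name and origin URLs are placeholders for illustration):
@CrossOrigin(origins = {"http://frontend1.example.com", "http://frontend2.example.com"})
@RestController
public class TrackingController {
    // endpoints here accept cross-origin requests from the listed origins
}
Using origins = "*" allows all origins, which is the simplest option but the least restrictive.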
RECOMMENDATION
Build your React app via yarn and deploy it on Apache or nginx. Once you have set up a domain or subdomain for the frontend and load-balanced it, you are no longer required to run the frontend on explicit ports.

Call NiFi processor as a REST API

I want to call a NiFi custom processor as a REST API, pass parameters to it at run time through PySpark, and retrieve the results in the response object.
Can anyone suggest different approaches for this?
Use the following sequence of processors:
HandleHttpRequest
extract parameters
your other processors...
prepare response
HandleHttpResponse
The steps are:
1. Configure the HandleHttpRequest processor.
2. Enable the required HTTP methods (GET, POST, DELETE, etc.).
3. Set the listening port.
4. Attach the HTTP Context Map to a controller service (the listener).
5. Enable the service and the processor.
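Once the flow is running, you can call it from the PySpark driver like any HTTP endpoint. A minimal sketch, assuming HandleHttpRequest listens on port 8011 of the NiFi host and the requests library is installed (the endpoint path and parameter names are made up for illustration):

import requests  # third-party HTTP client, assumed available in the driver environment

# Hypothetical endpoint exposed by HandleHttpRequest.
url = "http://nifi-host:8011/run"

# Query parameters arrive in the flow as http.query.param.* attributes;
# the POST body becomes the FlowFile content.
resp = requests.post(url, params={"jobId": "42"}, json={"input": "payload"}, timeout=30)
resp.raise_for_status()

# The body written by HandleHttpResponse comes back here.
print(resp.status_code, resp.text)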
Bonus:
If you run NiFi in a Docker container, as I do, you should get the container's IP:
docker inspect <container-name> --format='{{.NetworkSettings.IPAddress}}'
Now you can send the request from Postman, and the HandleHttpRequest processor will pick it up.
I created a simple template to illustrate this scenario: the HTTP request's body is saved into a directory.

Not able to load balance using hardcoded URLs in Spring Cloud Zuul

I am testing Spring Cloud Zuul. I want to test round-robin request forwarding using Zuul routes, without a Eureka setup.
zuul.ignoredServices=*
ribbon.eureka.enabled=false
server.port=9000
zuul.routes.trackingv1.path=/tracking/v1/**
zuul.routes.trackingv1.stripPrefix=false
zuul.routes.trackingv1.serviceId=trackingv1
trackingv1.ribbon.listOfServers=http://localhost:8080/trackingv1,http://localhost:8081/trackingv1
But I am getting errors like:
Caused by: com.netflix.client.ClientException: Load balancer does not have available server for client: trackingv1
Any idea what could be wrong?
It's the same old problem with properties files: an extra trailing space in the value part of a key. I had an extra space in
zuul.routes.trackingv1.serviceId=trackingv1<space>
The next problem is that, from the list of servers
trackingv1.ribbon.listOfServers=http://localhost:8080/trackingv1,http://localhost:8081/trackingv1 it is picking only the host:port portion. How do I add the context path "trackingv1"?
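Since Ribbon ignores the path segment in listOfServers entries, one way around it, as a sketch not confirmed in this thread: make the Zuul route path match the backend context path and keep stripPrefix=false, so the matched prefix is forwarded to the backend as-is:

zuul.ignoredServices=*
ribbon.eureka.enabled=false
server.port=9000
zuul.routes.trackingv1.path=/trackingv1/**
zuul.routes.trackingv1.stripPrefix=false
zuul.routes.trackingv1.serviceId=trackingv1
trackingv1.ribbon.listOfServers=http://localhost:8080,http://localhost:8081

With this, a call to http://localhost:9000/trackingv1/foo is forwarded round-robin to http://localhost:8080/trackingv1/foo or http://localhost:8081/trackingv1/foo.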

Connection to a S3 instance using a service-connector

I'm trying to create a service-connector to my S3 instance like this:
cf service-connector 13001 mybucketname.ds31s3.swisscom.com:443
But I get the following error:
Server-Error 403: Check of security groups failed (no access)
I have created my service key according to this documentation.
Connecting to my MongoDB works perfectly using a service connector.
You can access Swisscom's S3 directly without the service connector.
The error message suggests that your current org and space do not have access to the S3 instance. This is usually the case if there is no app binding for that service in the current space. Please check whether you created your service key in the right org and space.
There was a misconfiguration due to security changes. We fixed the issue, so connecting to S3 with the service-connector should now work.

Is there an API for Ganglia?

Hello, I would like to ask whether there is an API that can be used to retrieve Ganglia stats for all clients from a single Ganglia server.
The Ganglia gmetad component listens on ports 8651 and 8652 by default and replies with XML metric data. The XML data type definition (DTD) can be found in the Ganglia source on GitHub.
Gmetad needs to be configured to allow XML replies to be sent to specific hosts or all hosts. By default only localhost is allowed. This can be changed in /etc/ganglia/gmetad.conf.
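For example, in /etc/ganglia/gmetad.conf (the second host name is a placeholder for your monitoring host):
trusted_hosts 127.0.0.1 monitoring.example.com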
Connecting to port 8651 will get you a default XML report of all metrics as a response.
Port 8652 is the interactive port which allows for customized queries. Gmetad will recognize raw text queries sent to this port, i.e. not HTTP requests.
Here are examples of some queries:
/?filter=summary (returns a summary of the whole grid, i.e. all clusters)
/clusterName (returns raw data of a cluster called "clusterName")
/clusterName/hostName (returns raw data for host "hostName" in cluster "clusterName")
/clusterName?filter=summary (returns a summary of only cluster "clusterName")
The ?filter=summary parameter changes the output to contain the sum of each metric value over all hosts. The number of hosts is also provided for each metric so that the mean value may be calculated.
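As a minimal sketch of querying the interactive port from Python (the gmetad host and cluster name are placeholders):

import socket

# Hypothetical gmetad host; 8652 is the default interactive port.
with socket.create_connection(("ganglia-server", 8652), timeout=10) as sock:
    # gmetad expects a raw text query, not an HTTP request.
    sock.sendall(b"/clusterName?filter=summary\n")
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:  # gmetad closes the connection after replying
            break
        chunks.append(data)

print(b"".join(chunks).decode())  # XML summary for clusterName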
Yes, there's an API for Ganglia: https://github.com/guardian/ganglia-api
You should check this presentation from 2012 Velocity Europe - it was really a great talk: http://www.guardian.co.uk/info/developer-blog/2012/oct/04/winning-the-metrics-battle
There is also an API you can install from PyPI with 'pip install gangliarest'; it sets up a configurable REST API backed by a Redis cache and indexer to improve performance.
https://pypi.python.org/pypi/gangliarest