How do I connect Locust workers to a master running with an HTTPS URL?

My company writes medical software and as such is subject to HIPAA requirements. We are running our code in GCP and I am trying to implement load testing using Locust.
I am able to get the Locust master up and running on one of our clusters with an external address, but only via https://locustmaster.gcp.mycompany
I am trying to figure out how to get the workers to connect to this. There are TLS and web auth command line options, but those are for connecting to the target URL, not to the Locust master.
Any ideas on how to get this to work?
Oh, and I am using Locust v1.4.4.

Worker and master communicate over ZeroMQ (unfortunately they can't talk over HTTP, so you can't route the workers through the HTTPS URL). You point the workers at the master using --master-host=X.X.X.X and --master-port=XYZ
https://docs.locust.io/en/stable/running-locust-distributed.html#options
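A minimal sketch, assuming the workers can reach the master's internal address directly (locust-master.internal is a placeholder here) on the ZeroMQ port, which defaults to 5557:
# on the master
locust -f locustfile.py --master --master-bind-port 5557
# on each worker, pointing at the master's internal address rather than the HTTPS URL
locust -f locustfile.py --worker --master-host locust-master.internal --master-port 5557
On GCP that usually means exposing port 5557 on an internal address (internal load balancer, Kubernetes Service, or the master's VM/pod IP), not through the HTTPS ingress.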

Related

Connect PySpark session to DataProc

I'm trying to connect a PySpark session running locally to a DataProc cluster. I want to be able to work with files on GCS without downloading them. My goal is to perform ad-hoc analyses using local Spark, then switch to a larger cluster when I'm ready to scale. I realize that DataProc runs Spark on YARN, and I've copied yarn-site.xml over locally. I've also opened an SSH tunnel from my local machine to the DataProc master node and set up port forwarding for the ports identified in yarn-site.xml. It doesn't seem to be working, though: when I try to create a session in a Jupyter notebook it hangs indefinitely, and there is nothing in stdout or the DataProc logs that I can see. Has anyone had success with this?
For anyone interested, I eventually abandoned this approach. I'm instead running Jupyter Enterprise Gateway on the master node, setting up port forwarding, and then launching my notebooks locally to connect to kernel(s) running on the server. It works very nicely so far.
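A rough sketch of that setup, assuming Jupyter Enterprise Gateway is listening on its default port 8888 on the DataProc master node; the instance name and zone below are placeholders:
# forward the gateway port from the DataProc master node to localhost
gcloud compute ssh my-cluster-m --zone=us-central1-a -- -N -L 8888:localhost:8888
# start a local notebook whose kernels are launched on the cluster via the gateway
jupyter notebook --gateway-url=http://localhost:8888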

Chronos Cluster with High Availability

I have three servers A, B, C; on each machine I'm running Chronos, ZooKeeper, mesos-master, and mesos-slave.
Chronos contacts mesos-master using the ZooKeeper URL, so it automatically picks the leading master even if some node is down. I have high availability there.
Chronos also runs in cluster mode, so whichever Chronos instance I access I see the same list of jobs, and everything works fine.
The problem I have is that Chronos is accessible via any of the three URLs:
http://server_node_1:4400
http://server_node_2:4400
http://server_node_3:4400
I have another application which schedules jobs in Chronos using the REST API. Which URL does my application have to talk to in order to run in high availability mode?
Let's say my application talks to http://server_node_1:4400 to schedule jobs; if Chronos on server_node_1 is down, I'm not able to schedule a job.
My application needs to talk to a single URL in order to schedule jobs in Chronos, and even if some Chronos node is down, I should still be able to schedule the job. Do I need some kind of load balancer between my application and the Chronos cluster to pick a running Chronos node for job scheduling? How can I achieve high availability in my scenario?
Use HAProxy for routing to a Chronos instance. This way you can access a Chronos instance using e.g. curl loadbalancer:8081.
haproxy.cfg:
listen chronos_8081
  bind 0.0.0.0:8081
  mode http
  balance roundrobin
  option allbackups
  option http-no-delay
  # 'check' lets HAProxy health-check each node and stop routing to a dead one
  server chronos01 server_node_1:4400 check
  server chronos02 server_node_2:4400 check
  server chronos03 server_node_3:4400 check
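As an illustration, your scheduling application then always posts to the load balancer. This is a hedged sketch with a placeholder job definition; the endpoint path depends on the Chronos version (/scheduler/iso8601 on 2.x, /v1/scheduler/iso8601 on newer releases):
# schedule a job through HAProxy rather than a specific Chronos node
curl -L -X POST http://loadbalancer:8081/scheduler/iso8601 \
  -H 'Content-Type: application/json' \
  -d '{"name":"sample-job","command":"echo hello","schedule":"R/2021-01-01T00:00:00Z/PT24H","owner":"ops@example.com"}'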
Or, even better, start Chronos via Marathon, which will keep a given number of instances running. The HAProxy configuration can then be generated by:
marathon-lb
bamboo

How do I deploy an entire environment (group of servers) using Chef?

I have an environment (Graphite) that looks like the following:
N worker servers
1 relay server that forwards work to these worker servers
1 web server that can query the relay server.
I would like to use Chef to set up and deploy this environment in EC2 without having to create each worker server individually, get their IPs and set them as attributes in the relay cookbook, create the relay, get its IP, set it as an attribute in the web server cookbook, and so on.
Is there a way using chef in which I can make sure that the environment is properly deployed, configured and running without having to set the IPs manually? Particularly, I would like to be able to add a worker server and have the relay update its worker list, or swap the relay server for another one and have the web server update its reference accordingly.
Perhaps this is not what Chef is intended for, and it is more for per-server configuration and deployment; if that is the case, what would be a technology that facilitates this?
Things you will need are:
knife-ec2 - This is used to start/stop Amazon EC2 instances.
chef-server - To be able to use search in your recipes. It should also be accessible from your EC2 instances.
search - with this you will be able to find, among the nodes provisioned by Chef, exactly the ones you need, using different queries.
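A hedged sketch of the provisioning step (the AMI, flavor, key and role names are placeholders, and exact flag names vary across knife-ec2 versions); bootstrapping each worker with a role is what later makes it discoverable via search:
# create an EC2 instance and register it with the Chef server under the lr-node role
knife ec2 server create \
  --image ami-12345678 \
  --flavor t2.small \
  --ssh-user ubuntu \
  --identity-file ~/.ssh/mykey.pem \
  --run-list 'role[lr-node]'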
I recently wrote an article, How to Run Dynamic Cloud Tests with 800 Tomcats, Amazon EC2, Jenkins and LiveRebel. It involves installing a load balancer, and the load balancer must know all the IP addresses of the servers it balances. You can check out how the recipe for a balanced node looks up the load balancer:
# pick the node carrying the lr-loadbalancer role
search(:node, "roles:lr-loadbalancer").first
And check out the load balancer recipe, which looks up all the balanced nodes and updates the Apache config file:
# find every node that carries the lr-node role
lr_nodes = search(:node, "role:lr-node")
# render the proxy balancer config with the current list of worker nodes
template ::File.join(node[:apache2][:home], 'conf.d', 'httpd-proxy-balancer.conf') do
  mode 0644
  variables(:lr_nodes => lr_nodes)
  # restart Apache whenever the rendered file changes
  notifies :restart, 'service[apache2]'
end
Perhaps you are looking for this?
http://www.infochimps.com/platform/ironfan

can a Heroku worker job with external socket connection run in parallel?

Can a worker job on Heroku make a socket (e.g. POP3) connection to an external server?
I guess scaling the worker process to 2 or more will run jobs in parallel, and they will all be trying to connect to the same server/port from the same client/port. Am I right, or am I missing something?
Yes - Heroku workers can connect to the outside world. However, there is no built-in provision for handling the sort of problems that you mention - you'd need to do that bit yourself.
Just look at the workers as a variety of separate EC2 instances.
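On the scaling side, a minimal example, assuming your Procfile defines a worker process type; each dyno is an isolated container making its own outbound connections from its own ephemeral source ports, so parallel workers will not clash on a client port (though they may well poll the same mailbox, which is the coordination problem left to you):
# run two worker dynos in parallel
heroku ps:scale worker=2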

Condor central manager could not see the other computing nodes

I connected three servers to form an HPC cluster using Condor as middleware. When I run the command condor_status from the central manager, it does not show the other nodes. I can run jobs on the central manager and connect to the other nodes via SSH, but it seems that something is missing in the Condor configuration files, where I set the central manager as the Condor host and allow reading and writing for everyone. I keep the MASTER and STARTD daemons in the daemon list for the worker nodes.
When I run condor_status on the central manager it just shows the central manager, and when I run it on a compute node it gives me the error "CEDAR:6001:Failed to connect to" followed by the central manager's IP and port number.
I managed to solve it. The problem was the central manager's firewall (in my case iptables), which was running.
When I stopped the firewall (su -c "service iptables stop"), all the nodes appeared normally when typing condor_status.
The firewall status can be checked using "service iptables status".
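For reference, the commands involved, plus a hedged alternative that keeps iptables running and only opens HTCondor's default collector port (9618); if the shared port daemon is not enabled, other Condor daemons may need additional ports as well:
# check and stop the firewall (the quick fix described above)
service iptables status
service iptables stop
# or keep the firewall and allow the collector port instead
iptables -I INPUT -p tcp --dport 9618 -j ACCEPT
iptables -I INPUT -p udp --dport 9618 -j ACCEPT
service iptables save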
There are a number of things that could be going on here. I'd suggest you follow this tutorial and see if it resolves your problems -
http://spinningmatt.wordpress.com/2011/06/12/getting-started-creating-a-multiple-node-condor-pool/
In my case the "condor.exe" service was not running on the server; I had stopped it manually. I just started it and everything went fine.