TCurl/Yab Command to check Cadence server health - cadence-workflow

I'm looking for a command-line tool to check the Cadence server health. I've come across TCurl/Yab, but I'm not able to figure out the exact command to run.

The tcurl command to check Cadence server health is:
tcurl -p localhost:7933 cadence-frontend --health
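If you want to call this from a script or a cron-style check, here is a minimal wrapper sketch around the same command; it assumes tcurl exits with a non-zero status when the health check fails, which is worth verifying against your tcurl version:

#!/bin/sh
# Hypothetical wrapper; assumes tcurl exits non-zero on a failed health check
if tcurl -p localhost:7933 cadence-frontend --health > /dev/null 2>&1; then
    echo "cadence-frontend is healthy"
else
    echo "cadence-frontend health check failed" >&2
    exit 1
fi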

Related

Possible JMeter bug when using JDBC Connection over SSH?

I need to perform a load test against a pgbouncer. All the clients establish an SSH tunnel before opening a database connection to the database (through the pgbouncer). It's something like this:
sshpass -p 'MY_PSW' ssh -o StrictHostKeyChecking=no -N -L LOCAL_PORT:127.0.0.1:63666 PGBOUNCER_USER@PGBOUNCER_ADDRESS -p PORT >/dev/null 2>&1 &
My JMeter project has three Thread Groups at the moment:
SetUp Thread Group: In which I make a connection to a different database to select a random username and schema
Query Thread Group: In which I perform the JDBC connection using the previous user (turned into a property with props.put("schema", vars.get("schema_1")); in a BeanShell Assertion) and run the queries
TearDown Thread Group: In which I close the ssh Tunnel.
Now, the first time I run the test from the GUI, the data select (JDBC request) in the Query Thread Group gives me an error:
Cannot create PoolableConnectionFactory (FATAL: "trust" authentication failed)
After that, if I run the test again, everything works. I checked the content of the variables and properties with a Debug sampler and everything is correct.
The main problem starts when I run the test without the GUI. It always fails because of that error.
I actually don't like the fact that I have to establish the SSH tunnel by running the command with an OS Process Sampler, but I can't find any better solution. The SSH tunnel is part of the test; I don't need it for the master/slave configuration of JMeter.
I would really appreciate a solution or a suggestion to make this work. Thanks.
If you're running the command via the OS Process Sampler, it's executed in the background, so my expectation is that the OS Process Sampler returns its SampleResult immediately, while the tunnel is not up yet.
Then, when the "Query Thread Group" starts, as per the JMeter Test Elements execution order, the JDBC Connection Configuration tries to establish the connection using a local port that is not fully set up yet. The fact that the issue is reproducible in non-GUI mode might confirm my guess, as JMeter works much faster in non-GUI mode because it doesn't need to waste time and resources on GUI refreshes and propagating sample results to listeners.
My expectation is that if you add, for example, a Flow Control Action sampler to the setUp Thread Group and configure it to "sleep" for a couple of seconds, it should resolve your issue. If it doesn't, try increasing JMeter logging verbosity for the JDBC test elements by adding the following line to the log4j2.xml file:
<Logger name="org.apache.jmeter.protocol.jdbc" level="debug" />
and compare the entries for "successful" and "failed" executions in the jmeter.log file.
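For orientation, here is a sketch of where that line sits in the log4j2.xml that ships in JMeter's bin directory; only the JDBC Logger line is new, the rest of the file stays as shipped:

<Loggers>
    <!-- existing Logger entries stay as they are -->
    <!-- added: verbose logging for the JDBC test elements -->
    <Logger name="org.apache.jmeter.protocol.jdbc" level="debug" />
</Loggers>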
OK, I managed to find the solution. Thanks to Dmitri T's answer, I could track down the problem, which was the property not being set properly.
I was using a BeanShell Assertion to set the property after the result of the JDBC request in the setUp Thread Group. Apparently, the BeanShell Assertion is executed at the end of the entire run, so the property was only being set at the end. During the first execution the property was empty.
I used a BeanShell Sampler instead and now it works.
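For anyone hitting the same issue, a minimal sketch of what that BeanShell Sampler contains, reusing the property and variable names from the question (adapt them to your own JDBC request):

// BeanShell Sampler placed in the setUp Thread Group, after the JDBC request.
// Runs as part of the setUp thread's flow, so the property is set before the
// Query Thread Group starts reading it.
props.put("schema", vars.get("schema_1"));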

Kubernetes, configure startupProbe to read stdout logs

I have a Java application running on Kubernetes and I wanted to set a startupProbe that looks for the 'tomcat started' text in the Java logs written to stdout; any ideas how to make that happen? I saw the documentation, but there are only references to checking/curling some endpoint or running a command. The question is how a pod can check its own logs.
Also, I see that stdout logs are temporarily stored in the /var/log/containers directory (that is on the NODE, not the POD), so that's not very useful.
In your question you are focused a bit more on the solution than on the actual problem, so let's see how we can tackle the problem from a different angle. This is the Tomcat echo command that you are trying to base your probe on:
https://github.com/apache/tomcat/blob/0a2ee9b1ba7ded327c2aa2361cccff6a16cdef84/bin/catalina.sh#L506
As you can see, this indeed tells you that Tomcat has started, but it does not validate anything for you, as the code continues to run. You will also notice that this output does not come from Tomcat itself but from the script that launches Tomcat.
An open port, on the other hand, is a much better way to validate that the web server is up and running. Here's an example of how this can be checked:
If you curl the Tomcat port that is open, the exit code will be 0, which tells you that the server has started:
curl -s localhost:8080 > /dev/null
Here we use echo $? to check the exit code of the previous command and validate it:
/usr/local/tomcat# echo $?
0
Moving forward, let's now run the same test against a port that is not open:
We use the same check as in the previous steps; the only difference is the port (which is not open):
/usr/local/tomcat# curl -s localhost:8010 > /dev/null
And then use echo $? to check the exit code of the command:
/usr/local/tomcat# echo $?
7
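Putting the two steps together, here is a small sketch of the kind of command a probe could run itself; port 8080 is an assumption for the Tomcat connector port:

# succeeds (exit code 0) only when something answers on the port
if curl -s localhost:8080 > /dev/null; then
    echo "tomcat is up"
else
    echo "tomcat is not reachable yet"
    exit 1
fi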
An alternative way, described in this answer, would be to query the Kubernetes API for the pod logs:
GET /api/v1/namespaces/{namespace}/pods/{name}/log
Having said that, the best way to build this would be to have an actual health check endpoint in your application.
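To connect this back to the original question, here is a minimal sketch of a startupProbe built on the port check instead of log parsing; it goes under the container entry in the pod spec, and the port number and thresholds are assumptions to adapt to your container:

startupProbe:
  tcpSocket:
    port: 8080          # assumed Tomcat connector port
  periodSeconds: 5      # check every 5 seconds
  failureThreshold: 30  # allow up to ~150 seconds for startup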

Workload Scheduler job won't enable

I'm trying to create a Workload Scheduler job that executes a curl command.
In Steps I've selected Start a program for the step and RP_CLOUD as the Agent (it's the only option). I pasted my curl command into Program.
Now when I try to enable the job I get a popup saying: AWSUI4177E Unable to update the Process. AWSUI4299E An internal error has occurred: AWSPRE001E The user "paul.carron@anaeko.com.5c81ed484ccf4c54aa9e348e" cannot create a job of type "executable" on the "RP_CLOUD" workstation. Download and install a Workload Automation Agent on a different machine.
The curl statement works when executed in my terminal. What am I doing wrong?
There are some security constraints on running jobs on the agents provided by the infrastructure.
I see two options:
Use the restful job type (since you are invoking a curl command)
Install an agent

Apache CloudStack: No templates showing when adding instance

I have set up Apache CloudStack on a CentOS 6.8 machine following the quick installation guide. The management server and KVM are set up on the same machine. The management server is running without problems, and I was able to add a zone, pod, cluster, and primary and secondary storage from the web interface. But when I try to add an instance, no templates are shown in the second stage, as you can see in the screenshot.
However, I am able to see two templates under the Templates link in the web UI.
But when I select the template and navigate to the Zone tab, I see "Timeout waiting for response from storage host" and the Ready field shows "no".
When I check the management server logs, it seems there is an error when CloudStack tries to mount the secondary storage for use. The segment below from the cloudstack-management.log file shows this error.
2017-03-09 23:26:43,207 DEBUG [c.c.a.t.Request] (AgentManager-Handler-14:null) (logid:) Seq 2-7686800138991304712: Processing: { Ans: , MgmtId: 279278805450918, via: 2, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":
{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException:
GetRootDir for nfs://172.16.10.2/export/secondary failed due to
com.cloud.utils.exception.CloudRuntimeException: Unable to mount
172.16.10.2:/export/secondary at /mnt/SecStorage/6e26529d-c659-3053-8acb-817a77b6cfc6
due to mount.nfs: Connection timed out
    at org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.getRootDir(NfsSecondaryStorageResource.java:2080)
    at org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.execute(NfsSecondaryStorageResource.java:1829)
    at org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.executeRequest(NfsSecondaryStorageResource.java:265)
    at com.cloud.agent.Agent.processRequest(Agent.java:525)
    at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:833)
    at com.cloud.utils.nio.Task.call(Task.java:83)
    at com.cloud.utils.nio.Task.call(Task.java:29)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
","wait":0}}] }
Can anyone please guide me how to resolve this issue? I have been trying to figure it out for some hours now and don't know how to proceed further.
Edit 1: Please note that my LAN address was 10.103.72.50, which I assume is not a /24 address. I tried to give CentOS a static IP by making the following settings in the ifcfg-eth0 file:
DEVICE=eth0
HWADDR=52:54:00:B9:A6:C0
NM_CONTROLLED=no
ONBOOT=yes
BOOTPROTO=none
IPADDR=172.16.10.2
NETMASK=255.255.255.0
GATEWAY=172.16.10.1
DNS1=8.8.8.8
DNS2=8.8.4.4
But doing this would cut off my internet access. As a workaround, I reverted these changes and installed all the packages first. Then I changed the IP to static using the same configuration settings as above and started the CloudStack management server. Everything worked fine until I bumped into this template issue. Please help me figure out what might have gone wrong.
I know I'm late, but for people trying this out in the future, here it goes:
I hope you successfully added a host as mentioned in the Quick Install Guide before you changed your IP to static, as that step auto-configures VLANs for the different traffic types and creates two bridges, generally named 'cloud' or 'cloudbr'. CloudStack uses the Secondary Storage System VM (SSVM) for all storage-related operations in each Zone and Cluster. What seems to be the problem is that the SSVM is not able to communicate with the management server on port 8250. If that is not the case, try manually mounting the NFS server's mount points from the SSVM shell. You can ssh into the SSVM using the command below:
ssh -i /var/cloudstack/management/.ssh/id_rsa -p 3922 root@<private or link-local IP address of the SSVM>
I suggest you run /usr/local/cloud/systemvm/ssvm-check.sh after ssh-ing into the SSVM (assuming it is running and has its private, public, and link-local IP addresses). If that doesn't help you much, take a look at the secondary storage troubleshooting docs for CloudStack.
I would further recommend, if anyone runs into similar issues in the future, checking that the SSVM is running and in the "Up" state in the System VMs section of the Infrastructure tab, and that you are able to open a console session to it from the browser. If that works, go on to run the ssvm-check.sh script mentioned above, which systematically checks every point of operation the SSVM performs. Even if a console session cannot be opened, you can still ssh in using the link-local IP address of the SSVM (shown in the SSVM's details) and then execute the script.
If the script says it cannot communicate with the management server on port 8250, check the iptables rules of the management server and make sure all traffic is allowed on port 8250. A quick command to check this is nc -v <mngmnt-server-ip> 8250. You can do a simple search to learn how to add port 8250 to your iptables rules if it is not open.
Next, you mentioned you used CentOS 6.8, which probably ships older versions of NFS, so execute exportfs -a on your NFS server to make sure all the NFS shares are properly exported and there are no errors.
I would also recommend waiting for the download of the "CentOS 5.5 no GUI" KVM template to complete and its Ready status to show 'Yes' before you start importing your own templates and ISOs to run on VMs.
Finally, if ssvm-check.sh says everything is good and the download still does not start, run service cloud restart and check that the service actually got a PID with service cloud status, as older system VM templates sometimes need the cloud service to be started manually with service cloud start even after the restart command. Restarting the cloud service in the SSVM triggers the download of all remaining templates and ISOs to resume.
Side note: the system VMs use a Debian kernel if you want to do some more troubleshooting. Hope this helps.
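To make that checklist easier to follow, here are the same commands gathered in one place; the upper-case placeholders are mine and need to be replaced with your own addresses:

# On the management server: confirm port 8250 is reachable and, if needed, open it
nc -v MANAGEMENT_SERVER_IP 8250
iptables -I INPUT -p tcp --dport 8250 -j ACCEPT   # one possible rule; adapt to your firewall setup

# On the NFS server: re-export all shares and watch for errors
exportfs -a

# ssh into the SSVM over its link-local address and run the built-in check
ssh -i /var/cloudstack/management/.ssh/id_rsa -p 3922 root@SSVM_LINK_LOCAL_IP
/usr/local/cloud/systemvm/ssvm-check.sh

# Inside the SSVM: restart the cloud service and confirm it got a PID
service cloud restart
service cloud status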

pactl called from systemd service always reports "pa_context_connect() failed connection refused"

I've set up a systemd service file to perform some pactl operations at system startup for a test process. While the commands work fine when run from a terminal, I always get "pa_context_connect() failed connection refused" when the same script is run by starting the systemd service. I'm also using the 'User=' directive in the service file to ensure that the auto-login user matches the user used to run the service commands.
I've read that this is somehow related to the pulseaudio session not being valid in the environmentless context of the systemd service but I haven't been able to figure that out further.
Although it might be a bit late for whatever project you might have been working on, here's what I found out.
The regular systemd instance, PID 1, indeed cannot access the environment variables of the current user when launching a service. Since pactl relies on those variables to find which instance of pulseaudio it needs to connect to, it is unable to do so when launched through a service. I'm sure there's a fairly dirty workaround for this, but I found something better.
Most systems have a second instance of systemd running in user space (accessible through systemctl --user when not connected as root). This instance can indeed access all the user's environment variables, and I found that pactl doesn't return any errors when called either directly or through a script.
All you need to do is put your services in /usr/lib/systemd/user/, /etc/systemd/user/, or ~/.config/systemd/user/, remove the User= directive from your service file, and run systemctl --user daemon-reload as the regular user to make sure they've been detected.
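As an illustration, here is a minimal user unit that would fit this setup; the unit name, script path, and description are placeholders to adapt:

# ~/.config/systemd/user/pactl-setup.service  (hypothetical name)
[Unit]
Description=Run pactl setup commands at login

[Service]
Type=oneshot
ExecStart=%h/bin/pactl-setup.sh

[Install]
WantedBy=default.target

After saving it, run systemctl --user daemon-reload and then systemctl --user enable --now pactl-setup.service as the regular user.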