Codeship Pro on_fail across steps - codeship

Is the on_fail directive of a step run when a previous step has failed?
I'm using these steps:
- name: fail intentionally
  service: busybox
  command: false
- name: check if onfail is called
  service: busybox
  command: true
  on_fail:
    - command: echo reporting failure
Calling jet steps produces the following output:
(step: fail intentionally)
(image: busybox) (service: busybox) Image exists, using cached image
(step: fail intentionally) error ✗
(step: fail intentionally) container exited with a 1 code
My on_fail is not run.
Is that an issue with the jet utility, or would things behave the same in Codeship?

You have defined the on_fail contingency on the second step, which does not fail. The build stops at the first step, so the second step and its on_fail never run. If the on_fail were set on the first step (the one that fails and stops the build), you would have seen the echoed statement.
This behavior would be consistent with a build running in CodeShip Pro.

Related

Github Action failing to Build Images for the plugins being used in workflow

I am trying to use a plugin in my EKS-based Kubernetes cluster.
I am using a GitHub Actions controller that spawns on-demand containers as self-hosted runners.
When a GitHub Action starts this plugin, or any other plugin that needs to build itself as a Docker image, the build fails with the error below. Any thoughts or ideas?
This is my self-hosted runner image: Link
FYI: if I run a standalone Alpine container in the cluster, all the typical commands work, and the build also works with the default Ubuntu-based self-hosted runner, so I don't think the cluster is the problem.
/usr/local/bin/docker build -t 60e226:1b6fc15462134e6fb8520b7df48cf7fd -f "/runner/_work/_actions/aquasecurity/trivy-action/master/Dockerfile" "/runner/_work/_actions/aquasecurity/trivy-action/master"
Sending build context to Docker daemon 644.6kB
Step 1/5 : FROM ghcr.io/aquasecurity/trivy:0.37.1
0.37.1: Pulling from aquasecurity/trivy
c158987b0551: Pulling fs layer
67a7d067ef7d: Pulling fs layer[6]Download complete
67a7d067ef7d: Pull complete
2ec1cdd48f38: Verifying Checksum
2ec1cdd48f38: Download complete
2ec1cdd48f38: Pull complete
fe56e6aa700e: Pull complete
Digest: sha256:7c167f7f3002948f1ec099555aa968bd8b8b097780603a38cc801fe965da0a69
Status: Downloaded newer image for ghcr.io/aquasecurity/trivy:0.37.1
---> c3e68408cd24
Step 2/5 : COPY entrypoint.sh /
---> 1f1da443ea86
Step 3/5 : RUN apk --no-cache add bash curl npm
---> Running in 647f7f479cac
fetch https://dl-cdn.alpinelinux.org/alpine/v3.17/main/x86_64/APKINDEX.tar.gz
48ABC73BEB7F0000:error:0A000086:SSL routines:tls_post_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1889:
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.17/main: Permission denied
fetch https://dl-cdn.alpinelinux.org/alpine/v3.17/community/x86_64/APKINDEX.tar.gz
48ABC73BEB7F0000:error:0A000086:SSL routines:tls_post_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1889:
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.17/community: Permission denied
ERROR: unable to select packages:
bash (no such package):
required by: world[bash]
curl (no such package):
required by: world[curl]
npm (no such package):
required by: world[npm]
The command '/bin/sh -c apk --no-cache add bash curl npm' returned a non-zero code: 3
Warning: Docker build failed with exit code 3, back off 6.807 seconds before retry.
I expected the Docker image to build and the GitHub Actions workflow to proceed.
I tried different flavors of the runner image and nothing worked except ubuntu-latest.
The plugin in question:
- name: Run Trivy vulnerability scanner
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: 'test:latest'
    format: 'table'
    exit-code: '1'
    ignore-unfixed: true
    vuln-type: 'os,library'
    severity: 'CRITICAL,HIGH'

How to run e2e tests on custom cluster within Kubernetes.

https://github.com/kubernetes/community/blob/master/contributors/devel/e2e-tests.md#testing-against-local-clusters
I have been following the above guide, but I keep getting this error:
2017/07/12 09:53:58 util.go:131: Step './cluster/kubectl.sh version --match-server-version=false' finished in 20.604745ms
2017/07/12 09:53:58 util.go:129: Running: ./hack/e2e-internal/e2e-status.sh
WARNING: The bash deployment for AWS is obsolete. The
v1.5.x releases are the last to support cluster/kube-up.sh with AWS.
For a list of viable alternatives, (...)
2017/07/12 09:53:58 util.go:131: Step './hack/e2e-internal/e2e-status.sh' finished in 18.71843ms
2017/07/12 09:53:58 main.go:216: Something went wrong: encountered 2 errors: [error during ./cluster/kubectl.sh version --match-server-version=false: exit status 1 error during ./hack/e2e-internal/e2e-status.sh: exit status 1]
2017/07/12 09:53:58 e2e.go:78: err: exit status 1
How do I fix this, what am I doing wrong?
If you just want to run the e2e tests without setting up the whole cluster, you can compile them from the Kubernetes repository: make all WHAT=test/e2e/e2e.test, and then run the compiled e2e binary against your cluster: ./e2e.test --host="<your apiserver>" --provider=local --kubeconfig=<kubeconfig location> --ginkgo.focus="\[Conformance\]". Conformance tests should pass on any Kubernetes cluster, but of course you can set any filter you want. To list all available tests, run: ./e2e.test --ginkgo.dryRun.
Some supplements
You can also compile ginkgo:
make WHAT=vendor/github.com/onsi/ginkgo/ginkgo
Some useful options (run ginkgo --help for details):
-flakeAttempts
-focus
-nodes
-outputdir
-skip
-v
To run tests in parallel (set --nodes=1 for serial tests):
./_output/bin/ginkgo --nodes=25 --flakeAttempts=2 \
  ./_output/bin/e2e.test -- --host="http://127.0.0.1:8080" \
  --provider="local" --ginkgo.v=true --kubeconfig="$HOME/.kube/config" \
  --ginkgo.focus="Conformance" --ginkgo.skip="Serial|Slow" \
  --ginkgo.failFast=false
And if you want to launch a local cluster for e2e testing, hack/local-up-cluster.sh is handy.

Starting Parpool in MATLAB

I tried to start a parallel pool in MATLAB R2015b with the following command:
parpool('local',3);
This command should allocate 3 workers. Instead, I received an error stating that the parallel pool failed to start. The error message is as follows:
Error using parpool (line 94)
Failed to start a parallel pool. (For information in addition to
the causing error, validate the profile 'local' in the Cluster Profile
Manager.)
A similar question was posted (https://nl.mathworks.com/matlabcentral/answers/196549-failed-to-start-a-parallel-pool-in-matlab2015a). I followed the same procedure and validated the local profile as suggested there.
Setting distcomp.feature('LocalUseMpiexec', false); or distcomp.feature('LocalUseMpiexec', true); in startup.m made no difference. Validating the local profile still fails, with the following output:
VALIDATION DETAILS
Profile: local
Scheduler Type: Local
Stage: Cluster connection test (parcluster)
Status: Passed
Description:Validation Passed
Command Line Output:(none)
Error Report:(none)
Debug Log:(none)
Stage: Job test (createJob)
Status: Failed
Description:The job errored or did not reach state finished.
Command Line Output:
Failed to determine if job 24 belongs to this cluster because: Unable to
read file 'C:\Users\varad001\AppData\Roaming\MathWorks\MATLAB
\local_cluster_jobs\R2015b\Job24.in.mat'. No such file or directory..
Error Report:(none)
Debug Log:(none)
Stage: SPMD job test (createCommunicatingJob)
Status: Failed
Description:The job errored or did not reach state finished.
Command Line Output:
Failed to determine if job 25 belongs to this cluster because: Unable to
read file 'C:\Users\varad001\AppData\Roaming\MathWorks\MATLAB
\local_cluster_jobs\R2015b\Job25.in.mat'. No such file or directory..
Error Report:(none)
Debug Log:(none)
Stage: Pool job test (createCommunicatingJob)
Status: Skipped
Description:Validation skipped due to previous failure.
Command Line Output:(none)
Error Report:(none)
Debug Log:(none)
Stage: Parallel pool test (parpool)
Status: Skipped
Description:Validation skipped due to previous failure.
Command Line Output:(none)
Error Report:(none)
Debug Log:(none)
I receive these errors only on my cluster machine; launching parpool on my standalone PC works perfectly. Is there a way to fix this?

msdeploy - stop deploy in postsync if presync fails

I am using msdeploy -presync to backup the current deployment of a website in IIS before the -postsync deploys it, however I recently had a situation where the -presync failed (raised a warning due to a missing dll) and the -postsync continued and overwrote the code.
Both the presync and postsync run batch files.
Obviously this is bad as the backup failed so there is no backout route if the deployment has bugs or fails.
Is there any way to stop the postsync if the presync raises warnings with msdeploy?
Perhaps the issue here is that the presync failure was raised as a warning not an error.
Pass the successReturnCodes parameter, set to 0 (the conventional success return code), to the preSync option, for example:
-preSync:runCommand="your script",successReturnCodes=0
More info at: http://technet.microsoft.com/en-us/library/ee619740(v=ws.10).aspx

Inspect and retry resque jobs via redis-cli

I am unable to run resque-web on my server because of some issues I have yet to work through, but I still need to check and retry failed jobs in my Resque queues.
Does anyone have experience with peeking at the failed jobs queue to see what the error was, and then retrying the job from the redis-cli command line?
thanks,
Found a solution on the following link:
http://ariejan.net/2010/08/23/resque-how-to-requeue-failed-jobs
In the rails console we can use these commands to check and retry failed jobs:
1 - Get the number of failed jobs:
Resque::Failure.count
2 - Check each error's exception class and backtrace:
Resque::Failure.all(0, 20).each { |job|
  puts "#{job["exception"]} #{job["backtrace"]}"
}
The job object is a hash with information about the failed job; inspect it for more details. Note that this lists only the first 20 failed jobs. I am not sure how to list them all in one call, so to see the rest you have to vary the two arguments (start offset and number of entries), for example (20, 20) for the next 20.
3 - Retry all failed jobs:
(Resque::Failure.count-1).downto(0).each { |i| Resque::Failure.requeue(i) }
4 - Reset the failed jobs count:
Resque::Failure.clear
Retrying all the jobs does not reset the counter. We must clear it so that it goes back to zero.
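
For step 2, one possible way to list every failed job rather than only the first 20 is to walk the list page by page. The following is a minimal sketch and not part of the original answer: all_failed_jobs, FAKE_FAILURES and FAKE_FETCH are hypothetical names, and the fake data stands in for Redis so that the snippet runs on its own. It assumes that Resque::Failure.all(offset, limit) returns the entries in that window, as in the commands above.

```ruby
# Collect every entry of a paged list by requesting fixed-size windows.
# count     - total number of entries (e.g. Resque::Failure.count)
# fetch     - callable taking (offset, limit) and returning that window
# page_size - number of entries requested per call
def all_failed_jobs(count, fetch, page_size = 20)
  jobs = []
  0.step(count - 1, page_size) do |offset|
    page = fetch.call(offset, page_size)
    # Defensive assumption: some Resque versions may return a bare hash
    # instead of an array when only one entry is requested.
    page = [page] unless page.is_a?(Array)
    jobs.concat(page)
  end
  jobs
end

# Stand-in data (hypothetical) so the sketch runs without Redis or Resque.
FAKE_FAILURES = Array.new(45) { |i| { "exception" => "Error#{i}", "backtrace" => [] } }
FAKE_FETCH = ->(offset, limit) { FAKE_FAILURES[offset, limit] }

all_failed_jobs(FAKE_FAILURES.size, FAKE_FETCH).each do |job|
  puts "#{job["exception"]} #{job["backtrace"]}"
end
```

In the Rails console, the fake pieces would be replaced by the real calls, along the lines of all_failed_jobs(Resque::Failure.count, ->(offset, limit) { Resque::Failure.all(offset, limit) }). With a very large failed list it may be better to print each page as it arrives instead of collecting everything in memory.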