I'm using thoughworks go for a build pipeline as shown below:
The "Test" stage fetches artefacts from the build stage and runs each of it's jobs in parallel (unit tests, integration test, acceptance tests, package) on different ages. However, each of those jobs is a shell script.
When those tasks are run on a different agent they are failing because permission is denied. Each job is a shell script, and when I ssh into the agent I can see it does not have executable permissions as shown below:
drwxrwxr-x 2 go go 4096 Mar 4 09:48 .
drwxrwxr-x 8 go go 4096 Mar 4 09:48 ..
-rw-rw-r-- 1 go go 408 Mar 4 09:48 aa_tests.sh
-rw-rw-r-- 1 go go 443 Mar 4 09:48 Dockerfile
-rw-rw-r-- 1 go go 121 Mar 4 09:48 run.sh
However, in the git repository they have executable permission and they seem to execute fine on the build agent that clones the git repository.
I solved the problem by executing the script with bash. E.g "bash sriptname.sh" as the command for the task.
Related
We have installed postgres v12 on ubuntu 20.04 (with apt install -y postgresql postgresql-contrib) and wish to enable archiving to /data/db/postgres/archive by setting the following in postgresql.conf:
max_wal_senders=2
wal_keep_segments=256
wal_sender_timeout=60s
archive_mode=on
archive_command=cp %p /data/db/postgres/archive/%f
However the postgres service fails to write there:
2022-11-15 15:02:26.212 CET [392860] FATAL: archive command failed with exit code 126
2022-11-15 15:02:26.212 CET [392860] DETAIL: The failed archive command was: archive_command=cp pg_wal/000000010000000000000002 /data/db/postgres/archive/000000010000000000000002
2022-11-15 15:02:26.213 CET [392605] LOG: archiver process (PID 392860) exited with exit code 1
sh: 1: pg_wal/000000010000000000000002: Permission denied
This directory /data/db/postgres/archive/ is owned by the postgres user and when we su postgres we are able to create and delete files without a problem.
Why can the postgresql service (running as postgres) not write to a directory it owns?
Here are the permissions on all the parents of the archive directory:
drwxr-xr-x 2 postgres root 6 Nov 15 14:59 /data/db/postgres/archive
drwxr-xr-x 3 root root 21 Nov 15 14:29 /data/db/postgres
drwxr-xr-x 3 root root 22 Nov 15 14:29 /data/db
drwxr-xr-x 5 root root 44 Nov 15 14:29 /data
2022-11-15 15:02:26.212 CET [392860] DETAIL: The failed archive command was: archive_command=cp pg_wal/000000010000000000000002 /data/db/postgres/archive/000000010000000000000002
So, your archive_command is apparently set to the peculiar string archive_command=cp %p /data/db/postgres/archive/%f.
After the %variables are substituted, the result is passed to the shell. The shell does what it was told, which is to set the (unused) environment variable 'archive_command' to be 'cp', and then tries to execute the file pg_wal/000000010000000000000002, which is not allowed to because it doesn't have the execute bit set.
I don't know how you managed to get such a deformed archive_command, but it didn't come from anything you showed us.
I'm running a WSL2 Ubuntu terminal with docker for windows, and every time I run docker-compose up the permissions of the folder that contains the project get changed.
Before running docker-compose:
drwxr-xr-x 12 cesarvarela cesarvarela 4096 Jun 24 15:37 .
After:
drwxr-xr-x 12 999 cesarvarela 4096 Jun 24 15:37
This prevents me from changing git branch, editing files, etc. I have to chown the folder again to my user to do that, but I would like to not having to do this everytime.
I've been trying to install custom pack using these links on a single node K8 cluster.
https://github.com/StackStorm/st2packs-dockerfiles/
https://github.com/stackstorm/stackstorm-ha
Stackstorm is installed successfully with default dashboard but when I tried to build custom pack and helm upgrade it's not working.
Here is my stackstorm pack dir and image Dockerfile:
/home/manisha.tanwar/st2packs-dockerfiles # ll st2packs-image/packs/st2_chef/
total 28
drwxr-xr-x. 4 manisha.tanwar domain users 4096 Apr 28 16:11 actions
drwxr-xr-x. 2 manisha.tanwar domain users 4096 Apr 28 16:11 aliases
-rwxr-xr-x. 1 manisha.tanwar domain users 211 Apr 28 16:11 pack.yaml
-rwxr-xr-x. 1 manisha.tanwar domain users 65 Apr 28 16:11 README.md
-rwxr-xr-x. 1 manisha.tanwar domain users 293 Apr 28 17:47 requirements.txt
drwxr-xr-x. 2 manisha.tanwar domain users 4096 Apr 28 16:11 rules
drwxr-xr-x. 2 manisha.tanwar domain users 4096 Apr 28 16:11 sensors
/home/manisha.tanwar/st2packs-dockerfiles # cat st2packs-image/Dockerfile
ARG PACKS="file:///tmp/stackstorm-st2"
FROM stackstorm/st2packs:builder AS builder
COPY packs/st2_chef /tmp/stackstorm-st2/
RUN ls -la /tmp/stackstorm-st2
RUN git config --global http.sslVerify false
# Install custom packs
RUN /opt/stackstorm/st2/bin/st2-pack-install ${PACKS}
###########################
# Minimize the image size. Start with alpine:3.8,
# and add only packs and virtualenvs from builder.
FROM stackstorm/st2packs:runtime
Image is created using command
docker build -t st2_chef:v0.0.2 st2packs-image
And then I changed values.yaml as below:
packs:
configs:
packs.yaml: |
---
# chef pack
image:
name: st2_chef
tag: 0.0.1
pullPolicy: Always
And run
helm upgrade <release-name>.
but it doesn't show anything on dashboard as well as cmd.
Please help, We are planning to upgrade to Stackstorm HA from standalone stackstorm and I need to get POC done for that.
Thanks in advance!!
Got it working with the help of community. Here's the link if anyone want to follow:
https://github.com/StackStorm/stackstorm-ha/issues/128
I wasn't using docker registery to push the image and use it in helm configuration.
Updated values.yaml as :
packs:
# Custom StackStorm pack configs. Each record creates a file in '/opt/stackstorm/configs/'
# https://docs.stackstorm.com/reference/pack_configs.html#configuration-file
configs:
core.yaml: |
---
image:
# Uncomment the following block to make the custom packs image available to the necessary pods
#repository: your-remote-docker-registry.io
repository: manishatanwar
name: st2_nagios
tag: "0.0.1"
pullPolicy: Always
I'm running Postgres 9.4 installed on Ubuntu 16.04.3. Postgres was installed using apt-get, I downloaded the sources and dependencies with apt-get too. I downloaded pg_rewind REL9_4_STABLE branch and built it. When I try to run my pg_rewind command I get the following:
The servers diverged at WAL position 0/6148D50 on timeline 1.
Rewinding from Last common checkpoint at 0/5000098 on timeline 1
SQL command failed
CREATE OR REPLACE FUNCTION rewind_support.rewind_support_ls_dir(text, boolean) RETURNS SETOF text AS '$libdir/pg_rewind_support' LANGUAGE C STRICT;
ERROR: could not access file "$libdir/pg_rewind_support": No such file or directory
Failure, exiting
I found the pg_rewind_support.so library file and I placed it in the locations returned by pg_config --libdir and --pkglibdir with no success. I even created a copy without .so extension.
$ls -la $(pg_config --pkglibdir)/pg_rewind_support*
-rw-r--r-- 1 root root 18768 Jul 16 17:59 /usr/lib/postgresql/9.4/lib/pg_rewind_support
-rw-r--r-- 1 root root 18768 Jul 16 17:50 /usr/lib/postgresql/9.4/lib/pg_rewind_support.so
$ls -la $(pg_config --libdir)/pg_rewind_support*
-rw-r--r-- 1 root root 18768 Jul 16 17:59 /usr/lib/x86_64-linux-gnu/pg_rewind_support
-rw-r--r-- 1 root root 18768 Jul 16 17:44 /usr/lib/x86_64-linux-gnu/pg_rewind_support.so
Any ideas how can I make my apt-get installed Postgres recognize the pg_rewind library? I don't want to end up running in production a full postgres that was packaged and built in-house.
In working through this with the OP, the steps to build pg_rewind were:
Download the appropriate PostgreSQL 9.4.18 tarball, unpack.
Download pg_rewind, move into contrib/
Configure PostgreSQL to match the directory layout that Debian/Ubuntu uses:
./configure --libdir=/usr/lib/postgresql/9.4/lib --bindir=/usr/lib/postgresql/9.4/bin
Do a "make" on PostgreSQL.
Do a "make" and a "sudo make install" on pg_rewind.
pg_rewind must be installed on both the source system (so that the .so is available there) and on the target system (so the pg_rewind binary is available there).
I try to run simple spring batch task on Spring Cloud Data Flow for Yarn. Unfortunatelly while running it I got error message in ResourceManager UI:
Application application_1473838120587_5156 failed 1 times due to AM Container for appattempt_1473838120587_5156_000001 exited with exitCode: 1
For more detailed output, check application tracking page:http://ip-10-249-9-50.gc.stepstone.com:8088/cluster/app/application_1473838120587_5156Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1473838120587_5156_01_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
More informations from Appmaster.stderror stated that:
Log Type: Appmaster.stderr
Log Upload Time: Mon Nov 07 12:59:57 +0000 2016
Log Length: 106
Error: Unable to access jarfile spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.BUILD-SNAPSHOT.jar
If it comes to Spring Cloud Data Flow I'm trying to run in dataflow-shell:
app register --type task --name simple_batch_job --uri https://github.com/spring-cloud/spring-cloud-dataflow-samples/raw/master/tasks/simple-batch-job/batch-job-1.0.0.BUILD-SNAPSHOT.jar
task create foo --definition "simple_batch_job"
task launch foo
Its really hard to know why this error occurs. I'm sure that connection from dataflow-server to yarn works fine because in standard HDFS localization (/dataflow) some files was copied (servers.yml, jars with jobs and utilities) but it is unaccessible in some way.
My servers.yml config:
logging:
level:
org.apache.hadoop: DEBUG
org.springframework.yarn: DEBUG
maven:
remoteRepositories:
springRepo:
url: https://repo.spring.io/libs-snapshot
spring:
main:
show_banner: false
hadoop:
fsUri: hdfs://HOST:8020
resourceManagerHost: HOST
resourceManagerPort: 8032
resourceManagerSchedulerAddress: HOST:8030
datasource:
url: jdbc:h2:tcp://localhost:19092/mem:dataflow
username: sa
password:
driverClassName: org.h2.Driver
I'll be glad to hear any informations or spring-yarn tips&tricks to make this work.
PS: As hadoop environment I use Amazon EMR 5.0
EDIT: Recursive path from hdfs:
drwxrwxrwx - user hadoop 0 2016-11-07 15:02 /dataflow/apps
drwxrwxrwx - user hadoop 0 2016-11-07 15:02 /dataflow/apps/stream
drwxrwxrwx - user hadoop 0 2016-11-07 15:04 /dataflow/apps/stream/app
-rwxrwxrwx 3 user hadoop 121 2016-11-07 15:05 /dataflow/apps/stream/app/application.properties
-rwxrwxrwx 3 user hadoop 1177 2016-11-07 15:04 /dataflow/apps/stream/app/servers.yml
-rwxrwxrwx 3 user hadoop 60202852 2016-11-07 15:04 /dataflow/apps/stream/app/spring-cloud-deployer-yarn-appdeployerappmaster-1.0.0.RELEASE.jar
drwxrwxrwx - user hadoop 0 2016-11-04 14:22 /dataflow/apps/task
drwxrwxrwx - user hadoop 0 2016-11-04 14:24 /dataflow/apps/task/app
-rwxrwxrwx 3 user hadoop 121 2016-11-04 14:25 /dataflow/apps/task/app/application.properties
-rwxrwxrwx 3 user hadoop 2101 2016-11-04 14:24 /dataflow/apps/task/app/servers.yml
-rwxrwxrwx 3 user hadoop 60198804 2016-11-04 14:24 /dataflow/apps/task/app/spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.RELEASE.jar
drwxrwxrwx - user hadoop 0 2016-11-04 14:25 /dataflow/artifacts
drwxrwxrwx - user hadoop 0 2016-11-07 15:06 /dataflow/artifacts/cache
-rwxrwxrwx 3 user hadoop 12323493 2016-11-04 14:25 /dataflow/artifacts/cache/https-c84ea9dc0103a4754aeb9a28bbc7a4f33c835854-batch-job-1.0.0.BUILD-SNAPSHOT.jar
-rwxrwxrwx 3 user hadoop 22139318 2016-11-07 15:07 /dataflow/artifacts/cache/log-sink-rabbit-1.0.0.BUILD-SNAPSHOT.jar
-rwxrwxrwx 3 user hadoop 12590921 2016-11-07 12:59 /dataflow/artifacts/cache/timestamp-task-1.0.0.BUILD-SNAPSHOT.jar
There's clearly a mix of wrong versions as hdfs has spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.RELEASE.jar and error complains about spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.BUILD-SNAPSHOT.jar.
Not sure how you got snapshots unless you built distribution manually?
I'd recommend picking 1.0.2 from http://cloud.spring.io/spring-cloud-dataflow-server-yarn. See "Download and Extract Distribution" from ref docs. Also delete old /dataflow directory from hdfs.