How to configure Kubernetes initContainers to run after another one has finished - kubernetes

I have a situation where I want to speed up deployment time by caching git resources in a shared PVC.
This is the bash script I use to check out the resources and save them into a shared PVC folder:
#!/bin/bash
src="$1"
dir="$2"
echo "Check for existence of directory: $dir"
if [ -d "$dir" ]
then
    echo "$dir found, no need to clone the git repo"
else
    echo "$dir not found, clone $src into $dir"
    mkdir -p "$dir"
    chmod -R 777 "$dir"
    git clone "$src" "$dir"
    echo "cloned $dir"
fi
I have a Deployment with more than one pod, and each pod has this initContainer. The problem with this approach is that all the initContainers start at almost the same time.
They all check for the existence of the git resource directory. Say that on the first deployment the directory does not exist yet: the first initContainer creates the directory and starts cloning, but the second and third initContainers then see that the directory already exists and finish immediately, even though the clone is still in progress.
Is there a way to make the other initContainers wait for the first one to finish?
After reading the Kubernetes documentation, I don't think this is supported out of the box.
Edit 1:
A second solution I can think of is to deploy with only 1 pod; after a successful deployment we would scale it out automatically. However, I still don't know how to do this.
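For reference, a minimal sketch of that idea, assuming the Deployment is named my-deployment, lives in deployment.yaml with replicas: 1, and kubectl can reach the cluster (all of these names are placeholders):
#!/bin/bash
# roll out with a single replica so only one initContainer warms the shared PVC
kubectl apply -f deployment.yaml
# wait until that first pod (and therefore its initContainer) is done
kubectl rollout status deployment/my-deployment --timeout=600s
# the cache is populated now, so scale out
kubectl scale deployment/my-deployment --replicas=3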

I have found a workaround. The idea is to create a lock file and write a script that waits until the lock file is removed. In the initContainer, I prepare the script like this:
#!/bin/bash
src="$1"
dir="$2"
echo "Check for existence of directory: $dir/src"
if [ -d "$dir/src" ]
then
    echo "$dir/src found, check if .lock exists"
    until [ ! -f "$dir/.lock" ]
    do
        sleep 5
        echo 'After 5 seconds, .lock is still there, I will check again'
    done
    echo "Clone finished in another init container, I can die now"
    exit 0
else
    echo "$dir/src not found, clone $src into $dir/src"
    mkdir -p "$dir/src"
    echo "create .lock, make my friends wait for me"
    touch "$dir/.lock"
    ls -la "$dir"
    chmod -R 777 "$dir"
    git clone "$src" "$dir/src"
    echo "cloned $dir/src"
    echo "remove .lock now"
    rm "$dir/.lock"
fi
This is kind of a cheat, but it works. The script makes the other initContainers wait until .lock is removed; by then the project has already been cloned.
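One caveat with the check-then-touch approach: two initContainers can both see $dir/src missing before either of them has created .lock, and then both will clone. A sketch of the same idea that closes that gap by using mkdir as the lock, since mkdir is atomic and only one container can win the race (treat it as an untested variation, not what I actually run):
#!/bin/bash
src="$1"
dir="$2"
mkdir -p "$dir"
# mkdir is atomic: exactly one initContainer succeeds here and performs the clone
if mkdir "$dir/.lock" 2>/dev/null
then
    if [ ! -d "$dir/src" ]
    then
        git clone "$src" "$dir/src"
    fi
    rmdir "$dir/.lock"
else
    # another init container holds the lock; wait until it removes it
    until [ ! -d "$dir/.lock" ]
    do
        echo "clone in progress in another init container, checking again in 5 seconds"
        sleep 5
    done
fi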

Related

postgres backup with WAL

I am in no way a DB admin, so please don't shoot me if I'm doing it completely wrong...
I have to add some archiving to a production Postgres database (newest version, in a Docker container) and I'm trying to build some scripts to use with WAL.
The idea is to have a weekly script that does a full backup to a new directory and then creates a symlink to this new directory, which is used by the WAL archive script to write its logs. The weekly script also deletes backups older than 30 days.
I would be very happy about any comments on this...
db settings
wal_level = replica
archive_mode = on
archive_command = '/archive/archive_wal.sh "%p" "%f"'
archive_timeout = 300
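To double-check that these settings are actually active after a restart, something like this should work (assuming psql can be run as the postgres superuser inside the container):
psql -U postgres -c "SELECT name, setting FROM pg_settings WHERE name IN ('wal_level','archive_mode','archive_command','archive_timeout');"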
weekly script:
#!/bin/bash
#create base archive dir
#base_arch_dir=/tmp/archive/
base_arch_dir=/archive/
if [ ! -d "$base_arch_dir" ]; then
mkdir "$base_arch_dir"
chown -R postgres:postgres "$base_arch_dir"
fi
#create dir for week
dir="$base_arch_dir"$(date '+%Y_%m_%d__%H_%M_%S')
if [ ! -d "$dir" ]; then
mkdir "$dir"
chown -R postgres:postgres "$dir"
fi
#change/create the symlink
newdir="$base_arch_dir"wals
ln -fsn "$dir" "$newdir"
chown -R postgres:postgres "$newdir"
#do the base backup to the wals dir
if pg_basebackup -D "$newdir" -F tar -R -X fetch -z -Z 9 -U postgres; then
find "$base_arch_dir"* -type d -mtime +31|xargs rm -rf
fi
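To actually run this weekly I would hook the script into cron; the crontab entry would look roughly like this (Sunday at 03:00, and the script path /archive/weekly_backup.sh is just where I plan to keep it):
# m h dom mon dow  command
0 3 * * 0  /archive/weekly_backup.sh >> /archive/weekly_backup.log 2>&1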
archive script:
#!/bin/bash
set -e
arch_dir=/archive/wals
arch_log="$arch_dir/arch.log"
if [ ! -d "$arch_dir" ]; then
    echo "arch_dir $arch_dir does not exist" >> "$arch_log"
    exit 1
fi
#get the variables from postgres
p=$1
f=$2
if [ -f "$arch_dir/$f.xz" ]; then
    echo "wal file $arch_dir/$f.xz already exists"
    exit 1
fi
pxz -2 -z --keep -c "$p" > "$arch_dir/$f.xz"
Thank you in advance
It's not terribly difficult to put together your own archiving scripts, but there are a few things you need to keep track of, because when you need your backups you really need them. There are some packaged backup systems for PostgreSQL. You may find these two a good place to start, but others are available.
https://www.pgbarman.org/
https://pgbackrest.org/
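Just as an illustration of what day-to-day use looks like once such a tool is configured, pgBackRest boils down to a couple of commands per stanza (the stanza name main below is made up):
# take a full backup of the configured stanza
pgbackrest --stanza=main --type=full backup
# list existing backups and the WAL they cover
pgbackrest --stanza=main info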

How does the copy artifacts job work in Kubernetes

I am trying to run a Hyperledger Fabric blockchain network on Kubernetes using https://github.com/IBM/blockchain-network-on-kubernetes as the reference. In one of the steps, the artifacts (chaincode, configtx.yaml) are copied into the volume using the yaml file below:
https://github.com/IBM/blockchain-network-on-kubernetes/blob/master/configFiles/copyArtifactsJob.yaml
I am unable to understand how the files are copied into the shared persistent volume. Does the entrypoint command on line 24 copy the artifacts to the persistent volume? I do not see cp there, so how does the copy happen?
command: ["sh", "-c", "ls -l /shared; rm -rf /shared/*; ls -l /shared; while [ ! -d /shared/artifacts ]; do echo Waiting for artifacts to be copied; sleep 2; done; sleep 10; ls -l /shared/artifacts; "]
Actually, this job does not copy anything. It is just used to wait until the copy is complete.
Look at the setup_blockchainNetwork.sh script. The actual copy happens at line 82:
kubectl cp ./artifacts $pod:/shared/
This line copies the contents of ./artifacts into the /shared directory of the shared-pvc volume.
The job just makes sure that the copy has completed before further tasks are processed. When the copy is done, the job finds the files in the /shared/artifacts directory and goes to completion. Once the job has completed, the script proceeds with the next tasks. Look at the condition here.
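Putting the two pieces together, the overall pattern is roughly the following (the pod and job names are placeholders for whatever the setup script uses):
# copy the artifacts into the shared volume through a pod that mounts it
kubectl cp ./artifacts "$pod":/shared/
# block until the copy job reports completion before moving on
kubectl wait --for=condition=complete job/copyartifacts --timeout=300s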

Bluemix: cf push using DIEGO instead of DEA architecture

When deploying an application into dedicated Bluemix it uses DEA architecture by default. How can I force it to use DIEGO architecture instead?
You have to use a few more steps: deploy without starting, switch to Diego, then start.
cf push APPLICATION_NAME --no-start
cf enable-diego APPLICATION_NAME
cf start APPLICATION_NAME
Ref Deploying Apps
I built a bash exec to do this, which will use your existing manifest.yml file and pack all of this into a single request. The contents of the bash exec follow:
#!/bin/bash
filename="manifest.yml"
if [ -e "$filename" ];
then
    echo "using manifest.yml file in this directory"
else
    echo "no manifest.yml file found. exiting"
    exit 2
fi
shopt -s nocasematch
string='name:'
targetName=""
echo "Retrieving name from manifest file"
while read -r line
do
    name="$line"
    variable=${name%%:*}
    if [[ $variable == *"name"* ]]
    then
        inBound=${name#*:}
        targetName="$(echo -e "${inBound}" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')"
    fi
done < "$filename"
if [ "$targetName" == "" ];
then
    echo "Could not find name of application in manifest.yml file. Cancelling build."
    echo "application name is identified by the 'name: ' term in the manifest.yml file"
    exit 1
else
    echo "starting cf push for $targetName"
    cf push --no-start
    echo "cf enable-diego $targetName"
    cf enable-diego "$targetName"
    echo "cf start $targetName"
    cf start "$targetName"
    exit 0
fi
Just put this code into your editor as a new file and then make the file executable. I keep a copy of this exec in each of my repos in the root directory. After doing a copy-paste and executing this exec, you may get the following error:
/bin/bash^M: bad interpreter: No such file or directory
If you do, just run the dos2unix command and it will fix up the line endings to match your OS.
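For example, if you saved the exec as push-diego.sh (any file name works):
dos2unix push-diego.sh && chmod +x push-diego.sh && ./push-diego.sh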

Postgres 9.4 restore not working

I take a backup of a PostgreSQL 9.4 database as per the Postgres docs on WAL archiving.
After the backup, I create 2 records in the DB.
Now when I try to restore the DB, the last 2 records that I created above do not come back.
WAL archive steps:
cd /etc/postgresql/9.4/
mkdir archives
mkdir backups
chown postgres:postgres archives
chown postgres:postgres backups
cd /etc/postgresql/9.4/main/
echo 'max_wal_senders=1' >> postgresql.conf
echo 'wal_level=hot_standby' >> postgresql.conf
echo 'archive_mode=on' >> postgresql.conf
echo "archive_command='test ! -f /etc/postgresql/9.4/archives/%f && cp %p /etc/postgresql/9.4/archives/%f'" >> postgresql.conf
echo 'local replication postgres trust' >> pg_hba.conf
service postgresql restart
Backup steps:
cd /etc/postgresql/9.4/backups
rm -rf *
pg_basebackup --xlog -U postgres --format=t -D /etc/postgresql/9.4/backups/
Restore steps:
service postgresql stop
cd /var/lib/postgresql/9.4/
if [ ! -d "/var/lib/postgresql/9.4/tmp/" ]
then
mkdir tmp
else
rm -rf tmp
fi
mkdir tmp
mv /var/lib/postgresql/9.4/main/* /var/lib/postgresql/9.4/tmp/
cd /var/lib/postgresql/9.4/main/
rm -rf *
cd /etc/postgresql/9.4/backups
tar -xf base.tar -C /var/lib/postgresql/9.4/main/
cd /var/lib/postgresql/9.4/main/
FROMDIR="/etc/postgresql/9.4/archives/"
TODIR="/var/lib/postgresql/9.4/tmp/pg_xlog/"
if [ ! -d "$FROMDIR" ]
then
echo "Directory $FROMDIR does not exist!!"
exit
fi
if [ ! -d "$TODIR" ]
then
echo "Directory $TODIR does not exist!!"
exit
fi
cd $FROMDIR
for i in `find . -type f`
do
    if [ ! -f $TODIR/$i ]
    then
        echo "copying file $i"
        cp $i /var/lib/postgresql/9.4/main/pg_xlog/$i
    fi
done
cd /var/lib/postgresql/9.4/main/pg_xlog/
chown -R postgres:postgres *
cd /var/lib/postgresql/9.4/main/
FILE="recovery.done"
if [ -f $FILE ]
then
    mv $FILE recovery.conf
else
    echo "restore_command = 'cp /etc/postgresql/9.4/archives/%f %p'" >> recovery.conf
fi
su postgres service postgresql start
exit
Changes appear in the archive (/etc/postgresql/9.4/archives/ in your case) when the current WAL segment (usually 16 MB) is filled up. Let me quote the documentation:
The archive_command is only invoked for completed WAL segments. Hence, if your server generates little WAL traffic (or has slack periods where it does so), there could be a long delay between the completion of a transaction and its safe recording in archive storage. To limit how old unarchived data can be, you can set archive_timeout to force the server to switch to a new WAL segment file periodically. When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last segment file switch, and there has been any database activity, including a single checkpoint. (Increasing checkpoint_timeout will reduce unnecessary checkpoints on an idle system.) Note that archived files that are closed early due to a forced switch are still the same length as completely full files. Therefore, it is unwise to use a very short archive_timeout — it will bloat your archive storage.
If you just want to test the restore process, you can simply run select pg_switch_xlog(); after creating some records, to force a switch to a new WAL segment. Then verify that a new file has appeared in the archive directory.
Also, you do not need to copy files from the archive directory to pg_xlog/ yourself; restore_command will do that for you.
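For instance, with the paths from your question (pg_switch_xlog() is the 9.4-era name; in PostgreSQL 10+ it is pg_switch_wal()):
# complete the current WAL segment so archive_command picks it up
psql -U postgres -c "SELECT pg_switch_xlog();"
# a new segment file should appear in the archive shortly afterwards
ls -lt /etc/postgresql/9.4/archives/ | head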

Delete a non-git directory in Git Bash on Windows

xx#xx-PC ~/xampp/htdocs/sites
$ rmdir /s "yo-2"
rmdir: `/s': No such file or directory
rmdir: `yo-2': Directory not empty
xx#xx-PC ~/xampp/htdocs/sites
$ rmdir "yo-2"
rmdir: `yo-2': Directory not empty
I can't seem to get rmdir to work in Git Bash. It's not in a git repo and I've tried the above. mkdir works as expected, so why doesn't this?
rmdir will not work if the directory is not empty.
Try
rm -rf yo-2
Git Bash is a Linux-like shell.
If you are trying to remove an entire directory regardless of contents, you could use:
rm <dirname> -rf
just use the command below:
rm -rfv mydirectory
After trying out a couple of other commands, this worked for me:
rm dirname -rf
A bit late, but I believe it can still help someone with performance problems on Windows systems. Deleting this way in Git Bash is REALLY FAST compared with a plain rm -rf. The trick is to move the file or directory, under a random name, into a temporary directory on the same drive (on Windows) or the same partition (on *nix systems) and invoke rm -rf on it in background mode. You don't have to wait for a blocking IO task, and the OS will perform the deletion as soon as it gets idle.
Depending on the system you are using, you may need to install the realpath program (e.g. on macOS). Another alternative is to write a portable bash function as in this post: bash/fish command to print absolute path to a file.
fast_rm() {
    path=$(realpath "$1") # getting the absolute path
    echo "$path"
    if [ -e "$path" ]; then
        export TMPDIR="$(dirname $(mktemp -u))"
        kernel=$(uname | awk '{print tolower($0)}')
        # if windows, make sure to use the same drive
        if [[ "${kernel}" == "mingw"* ]]; then # git bash
            export TMPDIR=$(echo "${path}" | awk '{ print substr($0, 1, 2)"/temp"}')
            if [ ! -e "$TMPDIR" ]; then mkdir -p "$TMPDIR"; fi
        fi
        if [ "${kernel}" == "darwin" ]; then MD5=md5; else MD5=md5sum; fi
        rnd=$(echo $RANDOM | $MD5 | awk '{print $1}')
        to_remove="${TMPDIR}/$(basename ${path})-${rnd}"
        mv "${path}" "${to_remove}"
        nohup rm -rf "${to_remove}" > /dev/null 2>&1 &
    fi
}
# invoking the function
directory_or_file=./yo-2
fast_rm "$directory_or_file"
I have faced the same issue; this is what worked for me.
rimraf is a Node.js package, which is the UNIX command rm -rf for Node, so you will need to install Node.js (which includes npm). Then you can run:
npm install -g rimraf
Then you can run rimraf from the command line.
rimraf directoryname
visit https://superuser.com/questions/78434/how-to-delete-directories-with-path-names-too-long-for-normal-delete
I found this solution because npm itself was causing this problem due to the way it nests dependencies.
Late reply, but for those who are searching for a solution: for me the
rm <dirname> -rf
wasn't good; I always got "directory not empty" or "path too long" errors on node directories.
A really simple solution:
Move the directory you want to delete to the root of your disk (to shorten your path) and then you can delete it normally.
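For example, in Git Bash (the long path below is made up; substitute your own):
# shorten the path first, then delete as usual
mv "/c/projects/my-app/packages/web/node_modules" /c/nm-to-delete
rm -rf /c/nm-to-delete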