How to narrow down results of the Ansible find module? [duplicate]

This question already has answers here:
Get sorted list of folders with Ansible
(4 answers)
Closed 6 years ago.
Instead of using the Ansible shell or command modules, I am trying to use the find module to delete old backup directories and only keep the latest n backups. Currently, I use the following code to get a list of all the backup directories (so that in a second step I could delete the unwanted ones):
- find:
    paths: "/opt/"
    patterns: "backup_*"
    file_type: "directory"
Unfortunately, I don't see any way of narrowing down the resulting list of directories...
The find module doesn't seem to support sorting... can that be done in any way?
Does Ansible provide any means to manipulate a JSON list... to keep only n elements in a list and remove all others?
Has anyone successfully used the find module for similar purposes?

You can sort with the sort filter.
You can take the first N elements of a list with the [:N] syntax.
- find:
    path: "/tmp/"
    pattern: "file*"
  register: my_files
- debug: msg="{{ (my_files.files | sort(attribute='ctime'))[:-3] | map(attribute='path') | list }}"
Sort the files by ctime, take all but the last three elements, keep only the path attribute, and form a list.
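For the second step, a sketch of how that trimmed list could feed a deletion task (keeping the three newest backups; the register name and keep-count are just examples):
- find:
    paths: "/opt/"
    patterns: "backup_*"
    file_type: "directory"
  register: backup_dirs
- name: Remove all but the three newest backups
  file:
    path: "{{ item }}"
    state: absent
  loop: "{{ (backup_dirs.files | sort(attribute='ctime'))[:-3] | map(attribute='path') | list }}"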

How to append more than 33 files in a gcloud bucket?

I have been appending datasets to a bucket in gcloud using:
gsutil compose gs://bucket/obj1 [gs://bucket/obj2 ...] gs://bucket/composite
However, today when I tried to append some data the terminal prints the error CommandException: The compose command accepts at most 33 arguments.
I didn't know about this restriction. How can I append more than 33 files in my bucket? Is there another command-line tool? I would like to avoid creating a virtual machine for what looks like a rather simple task.
I checked the help using gsutil help compose, but it didn't help much. There is only a warning saying "Note that there is a limit (currently 32) to the number of components that can be composed in a single operation." but no hint at a workaround.
Could you not do it recursively, in batches?
I've not tried this.
Given an arbitrary list of files (FILES)
While there is more than 1 file in FILES:
Take the first n files, where n<=32, from FILES and gsutil compose them into a temp file
If that succeeds, replace the n names in FILES with the 1 temp file.
Repeat
The file that remains is everything composed.
Update
The question piqued my curiosity and gave me an opportunity to improve my bash ;-)
A rough-and-ready proof-of-concept bash script that generates batches of gsutil compose commands for arbitrary numbers of files (limited only by the %04d string formatting).
#!/usr/bin/env bash

GSUTIL="gsutil compose"
BATCH_SIZE="32"

# These may be the same (or no) bucket
SRC="gs://bucket01"
DST="gs://bucket02"

# Generate a test list of files
FILES=()
for N in $(seq -f "%04g" 1 100); do
  FILES+=("${SRC}/file-${N}")
done

function squish() {
  LST=("$@")
  LEN=${#LST[@]}
  if [ "${LEN}" -le "1" ]; then
    # Zero or one name left; nothing to compose
    return 1
  fi
  # Only unique for this configuration; be careful
  COMPOSITE=$(printf "${DST}/composite-%04d" "${LEN}")
  if [ "${LEN}" -le "${BATCH_SIZE}" ]; then
    # Remaining batch can be composed with one command
    echo "${GSUTIL} ${LST[@]} ${COMPOSITE}"
    return 0
  fi
  # Compose the 1st batch of files
  # NB Provide start:size
  echo "${GSUTIL} ${LST[@]:0:${BATCH_SIZE}} ${COMPOSITE}"
  # Remove the batch from LST
  # NB Provide start (to the end is implied)
  REM=("${LST[@]:${BATCH_SIZE}}")
  # Prepend the composite from the batch above to the next run
  NXT=("${COMPOSITE}" "${REM[@]}")
  squish "${NXT[@]}"
}

squish "${FILES[@]}"
Running with BATCH_SIZE=3, no buckets and 12 files yields:
gsutil compose file-0001 file-0002 file-0003 composite-0012
gsutil compose composite-0012 file-0004 file-0005 composite-0010
gsutil compose composite-0010 file-0006 file-0007 composite-0008
gsutil compose composite-0008 file-0008 file-0009 composite-0006
gsutil compose composite-0006 file-0010 file-0011 composite-0004
gsutil compose composite-0004 file-0012 composite-0002
NOTE how composite-0012 is created by the first command and then fed into the subsequent command.
I'll leave it to you to improve throughput by not chaining the output of each step into the next: parallelize the gsutil compose commands across the list chopped into batches, and then compose the resulting batch composites.
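For instance, a parallel sketch reusing FILES and DST from the script above (the batch object names are placeholders):
# Compose disjoint batches of up to 32 files concurrently,
# then compose the per-batch composites in a final pass.
gsutil compose "${FILES[@]:0:32}"  "${DST}/batch-00" &
gsutil compose "${FILES[@]:32:32}" "${DST}/batch-01" &
gsutil compose "${FILES[@]:64:32}" "${DST}/batch-02" &
gsutil compose "${FILES[@]:96:4}"  "${DST}/batch-03" &
wait
gsutil compose "${DST}"/batch-0{0..3} "${DST}/final"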
The docs say that you may only combine 32 components in a single operation, but there is no limit to the number of components that can make up a composite object.
So, if you have more than 32 objects to concatenate, you may perform multiple compose operations, composing 32 objects at a time until you eventually get all of them composed together.
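For example (a sketch with hypothetical object names part-0001 ... part-0100; each call stays within the 32-component limit because the running composite counts as one component):
gsutil compose gs://my-bucket/part-00{01..32} gs://my-bucket/combined
gsutil compose gs://my-bucket/combined gs://my-bucket/part-00{33..63} gs://my-bucket/combined
gsutil compose gs://my-bucket/combined gs://my-bucket/part-00{64..94} gs://my-bucket/combined
gsutil compose gs://my-bucket/combined gs://my-bucket/part-0{095..100} gs://my-bucket/combined
The brace expansions assume a bash-like shell.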

how to rename multiple files in a folder with a specific format? perl syntax explanation [duplicate]

This question already has answers here:
How to rename multiple files in a folder with a specific format?
(2 answers)
Closed 2 years ago.
I asked a similar question previously, but need help to understand the Perl commands that achieve the renaming process. I have many files in a folder with format '{galaxyID}-psf-calexp-pdr2_wide-HSC-I-{#}-{#}-{#}.fits'. Here are some examples:
7-psf-calexp-pdr2_wide-HSC-I-9608-7,2-205.41092-0.41487.fits
50-psf-calexp-pdr2_wide-HSC-I-9332-6,8-156.64674--0.03277.fits
124-psf-calexp-pdr2_wide-HSC-I-9323-4,3-143.73514--0.84442.fits
I want to rename all .fits files in the directory to match the following format:
7-HSC-I-psf.fits
50-HSC-I-psf.fits
124-HSC-I-psf.fits
namely, I want to remove "psf-calexp-pdr2_wide", all of the numbers after "HSC-I", and add "-psf" to the end of each file after HSC-I. I have tried the following command:
rename -n -e 's/-/-\d+-calexp-/-\d+pdr2_wide; /-/-//' *.fits
which gave me the error message: Argument list too long. You can probably tell I don't understand the Perl syntax. Thanks in advance!
First of all, Argument list too long doesn't come from perl; it comes from the shell because you have so many files that *.fits expanded to something too long.
To fix this, use
# GNU
find . -maxdepth 1 -name '*.fits' -exec rename ... {} +
# Non-GNU
find . -maxdepth 1 -name '*.fits' -print0 | xargs -0 rename ...
But your Perl code is also incorrect. All you need is
s/^(\d+).*/$1-HSC-I-psf.fits/
which can also be written as
s/^\d+\K.*/-HSC-I-psf.fits/
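Putting the two together (a sketch, assuming the Perl-based rename from the File::Rename distribution; -n only prints what would be renamed). Because find passes paths such as ./7-psf-....fits, the pattern anchors on the last path component rather than on ^:
find . -maxdepth 1 -name '*.fits' -exec rename -n 's{(\d+)[^/]*$}{$1-HSC-I-psf.fits}' {} +
Drop -n once the dry-run output looks correct.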

google cloud storage with php app engine. Get all sub directories list

How would it be possible to get a list with all directories inside a specific directory of a Cloud Storage bucket using an App Engine application in PHP?
You can try two approaches to list all the directories inside a specific directory in a bucket:
Use the code snippet from the PHP docs samples in the Google Cloud Platform GitHub repository and modify it so that the list_objects_with_prefix function also passes a delimiter, not only a prefix. I have written such a function in Python in this SO topic; you can use it as a reference. Here prefix needs to be the name of the parent directory, e.g. 'my_directory/', and the delimiter is simply '/', indicating that we want to group results at elements ending with '/' (hence, directories). A sketch of this approach is shown after this list.
Use the gsutil ls command to list objects in a directory from within PHP. You will need to use the shell_exec function:
$execCommand = "gsutil ls gs://bucket";
$output = shell_exec($execCommand);
$output will be a string in this case, and it will also contain file names if any are present in the parent directory.
This SO topic might also be informative; there the question was to list the whole directory (together with the files).
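A minimal sketch of the first approach, assuming the google/cloud-storage PHP client and that its ObjectIterator exposes the rolled-up directory names via prefixes(); the bucket and directory names are placeholders:
<?php
require 'vendor/autoload.php';

use Google\Cloud\Storage\StorageClient;

// List the "subdirectories" directly under my_directory/ by combining
// a prefix with the '/' delimiter.
$storage = new StorageClient();
$bucket = $storage->bucket('my-bucket');
$objects = $bucket->objects([
    'prefix' => 'my_directory/',
    'delimiter' => '/',
]);

// prefixes() is populated while the results are iterated.
foreach ($objects as $object) {
    // Objects directly under my_directory/ (files); ignored here.
}
foreach ($objects->prefixes() as $prefix) {
    printf('Directory: %s' . PHP_EOL, $prefix);
}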

Google Cloud Storage : What is the easiest way to update timestamp of all files under all subfolders

I have date-wise folders in the form of root-dir/yyyy/mm/dd, under which there are many files present. I want to update the timestamp of all the files falling under a certain date range, for example two weeks, i.e. 14 folders, so that these files can be picked up by my file-streaming data ingestion process.
What is the easiest way to achieve this? Is there a way in the UI console, or is it through gsutil?
Please help.
GCS objects are immutable, so the only way to "update" the timestamp would be to copy each object on top of itself, e.g., using:
gsutil cp gs://your-bucket/object1 gs://your-bucket/object1
(and looping over all objects you want to do this to).
This is a fast (metadata-only) operation, which will create a new generation of each object, with a current timestamp.
Note that if you have versioning enabled on the bucket, doing this will create an extra version of each file you copy this way.
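A sketch with placeholder bucket/path names, looping over a 14-day range and copying each object onto itself (a fast, metadata-only rewrite that refreshes the timestamp):
for day in 2016/12/{01..14}; do
  gsutil ls "gs://your-bucket/root-dir/${day}/**" | while read -r obj; do
    gsutil cp "$obj" "$obj"
  done
done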
When you say "folders in the form of root-dir/yyyy/mm/dd", do you mean that you're copying those objects into your bucket with names like gs://my-bucket/root-dir/2016/12/25/christmas.jpg? If not, see Mike's answer; but if they are named with that pattern and you just want to rename them, you could use gsutil's mv command to rename every object with that prefix:
$ export BKT=my-bucket
$ gsutil ls gs://$BKT/**
gs://my-bucket/2015/12/31/newyears.jpg
gs://my-bucket/2016/01/15/file1.txt
gs://my-bucket/2016/01/15/some/file.txt
gs://my-bucket/2016/01/15/yet/another-file.txt
$ gsutil -m mv gs://$BKT/2016/01/15 gs://$BKT/2016/06/20
[...]
Operation completed over 3 objects/12.0 B.
# We can see that the prefixes changed from 2016/01/15 to 2016/06/20
$ gsutil ls gs://$BKT/**
gs://my-bucket/2015/12/31/newyears.jpg
gs://my-bucket/2016/06/20/file1.txt
gs://my-bucket/2016/06/20/some/file.txt
gs://my-bucket/2016/06/20/yet/another-file.txt

Get ansible to read value from mysql and/or perl script

There may be a much better way to do what I need altogether. I'll give the background first, then my current (non-working) approach.
The goal is to migrate a bunch of servers from SLES 11 to SLES 12 making use of Ansible playbooks. The problem is that the new server and the old server are supposed to have the same NFS-mounted dir. This has to be done at the beginning of the playbook so that all of the other tasks can be completed. The name of the dir being created can be determined in two ways: on the old server directly, or from a MySQL db query for the volume name of that old server. The new servers are named migrate-(oldservername). I tried to prompt for the volume name in Ansible, but that would then apply the same name to every server. Goal recap: the dir name must be determined from the old server and created on the new server.
Approach 1: I've created a Perl script that Ansible will copy to the new server; it will execute the MySQL query and create the dir itself. There are two problems with this: 1) mysql-client needs to be installed on each server, which is completely unnecessary for these servers and would have to be uninstalled after the query is run. 2) Copying files and remotely executing them seems like a bad approach in general.
Approach 2: Create a version of the above, except run it on the Ansible control machine (where mysql-client is already installed) and store the values as key:value pairs in a file. Problems: 1) I cannot figure out how to determine, in the Perl script, which hosts Ansible is running against, and would have to enter them into the Perl script manually. 2) I cannot figure out how to get Ansible to import those values correctly from the generated file.
Here's the relevant Perl code I have for this -
my $newserver = "migrate-old.server.com";
# Split "migrate-<oldhost>" on the first dash only, in case the hostname itself contains dashes
my ($mig, $oldhost) = split(/\-/, $newserver, 2);
# Query the volume name for the old host (the -p flag prompts for the MySQL password)
my $volname = `mysql -u idm-ansible -p -s -N -e "select vol_name from assets.servers where hostname like '$oldhost'"`;
chomp $volname;
open(FH, ">", "vols.yml") or die "Cannot open vols.yml: $!";
print FH "$newserver: $volname\n";
close(FH);
My Ansible code is all over the place as I've tried and commented out a ton of things. I can share that here if it is helpful.
Approach 3: Do it completely in Ansible: basically, a MySQL query in a loop over each host. Problem: I have absolutely no idea how to do this; I'm way too unfamiliar with Ansible. I think this is what I would prefer to try, though.
What is the best approach here? How do I go about getting the right value into ansible to create the correct directory?
Please let me know if I can clarify anything.
Goal recap: the dir name must be determined from the old server and created on the new server.
Will magic variables do?
Something like this:
---
- hosts: old-server
  tasks:
    - shell: "/get-my-mount-name.sh"
      register: my_mount_name

- hosts: new-server
  tasks:
    - shell: "/mount-me.sh --mount_name={{ hostvars['old-server'].my_mount_name.stdout }}"
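If you would rather drive it from the MySQL query (your Approach 3), a rough sketch, assuming the mysql client is available on the control machine and that the new hosts are named migrate-<oldhost>; the group name, mysql_password variable, and /srv base path are placeholders:
---
- hosts: new-servers
  tasks:
    - name: Look up the volume name for the matching old server (runs on the control node)
      shell: >
        mysql -u idm-ansible -p"{{ mysql_password }}" -s -N -e
        "select vol_name from assets.servers
        where hostname like '{{ inventory_hostname | regex_replace('^migrate-', '') }}'"
      delegate_to: localhost
      register: vol_name
      changed_when: false

    - name: Create the directory on the new server
      file:
        path: "/srv/{{ vol_name.stdout | trim }}"
        state: directory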