What is the idiomatic way to build to/move compiled Cljs to an arbitrary directory when using Boot?

I'm using ClojureScript, Boot and Boot-Cljs to build a Firefox Add-On. Firefox's Add-On SDK assumes a certain directory structure; in my case, I'll need a project-root/data directory, which will house my contentScriptFile.
How should I go about building a ClojureScript file that lives in project-root/src/foo/core.cljs and either outputting it to, or moving it to, project-root/data? I've tried boot sift --move (I admittedly don't fully understand how this task is supposed to work, which arguments are required, etc.) and a main.cljs.edn manifest with tweaks to its location, :asset-path, :output-path, etc., all to no avail.
;; src/main.cljs.edn
{:require [foo.core]
 :init-fns [foo.core/init!]
 :compiler-options {:output-path "data"}}
;; build.boot
(set-env!
 :source-paths #{"src" "data"}
 :dependencies '[[org.clojure/clojure "1.7.0"]
                 [org.clojure/clojurescript "1.7.170"]
                 [adzerk/boot-cljs "1.7.170-3"]])
(require '[adzerk.boot-cljs :refer [cljs]])
(deftask build []
  (comp (cljs :optimizations :whitespace)))
I'm intrigued by Boot and want to figure this out, but I have to admit, I've wasted a lot of time on this (and it looks like I'm not alone). Seeing as I already have a simple, working Clojure script which calls the ClojureScript compiler directly - and uses :output-to to do exactly what I need - I may revert back to that approach in order to wrap this experiment up.

The weird thing about Boot that looks to be tripping you up a bit is that tasks - such as the cljs task - don't necessarily interact directly with the filesystem. They interact with the FileSet, a type of immutable value they are passed and expected to return.
At the beginning of the build, Boot gathers files on disk into the build's first FileSet. Then, it threads the FileSet through the task stack. Finally, Boot writes the FileSet returned by the last task to the target directory.
Boot's sift task can be used to move files around within the FileSet, returning a new FileSet, but not to move files outside of the target directory.
Long story short, I think you can do what you want by specifying a target directory to Boot like so:
(set-env! :target-path "data" ...)
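To make the FileSet idea concrete, here is a toy model in Python. It is only a sketch of the concept described above, not Boot's API: the function names (compile_cljs, sift_move, run_build) are made up for illustration, and a real FileSet tracks far more than a set of paths.

```python
import re
from functools import reduce


def compile_cljs(fileset):
    """Toy 'cljs' task: returns a new fileset that also contains main.js."""
    if any(path.endswith(".cljs") for path in fileset):
        return fileset | {"main.js"}
    return fileset


def sift_move(pattern, replacement):
    """Toy 'sift --move': builds a task that renames paths inside the fileset."""
    def task(fileset):
        return frozenset(re.sub(pattern, replacement, path) for path in fileset)
    return task


def run_build(initial_paths, tasks, target_dir):
    """Thread an immutable fileset through the tasks, then 'write' it once.

    Only this final step mentions the target directory; the tasks never do.
    """
    final = reduce(lambda fileset, task: task(fileset), tasks, frozenset(initial_paths))
    return sorted(f"{target_dir}/{path}" for path in final)


print(run_build({"foo/core.cljs"},
                [compile_cljs, sift_move(r"^main\.js$", "js/main.js")],
                "data"))
```

The point of the model is that no task touches the disk: each one takes a value and returns a new one, and the target directory only matters at the very end.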

This task might help:
(deftask public []
  (comp (production)
        (build)
        (sift :invert true :include #{#"js/app.out" #"js/app.cljs.edn"})
        (target :dir #{"data"})))

Related

How to make Snakemake recognize Globus remote files using Globus CLI?

I am working in a high performance computing grid environment, where large-scale data transfers are done via Globus. I would like to use Snakemake to pull data from a Globus path, process the data, and then push the processed data to a different Globus path. Globus has a command-line interface.
Pulling the data is no problem, for I'd just create a rule that would run globus transfer to create the requisite local file. But for pushing the data back to Globus, I think I'll need a rule that can "see" that the file is missing at the remote location, and then work backwards to determine what needs to happen to create the file.
I could create local "proxy" files that represent the remote files. For example I could make a rule for creating 'processed_data_1234.tar.gz' output files in a directory. These files would just be created using touch (thus empty), and the same rule will run globus transfer to push the files remotely. But then there's the overhead of making sure that the proxy files don't get out of sync with the real Globus-hosted files.
Is there a more elegant way to do this, akin to the Remote File capability? Would it be difficult to add Globus CLI support to Snakemake? Thanks in advance for any advice!
Would it help to create a utility function that generates a list of all desired files and compares it against the list of files available on Globus? Something like this (pseudocode):
def return_needed_files():
    list_needed_files = []  # either hard-coded or specified with some logic
    list_available = []  # as appropriate, e.g. using globus ls
    return [i for i in list_needed_files if i not in list_available]
# include all the needed files in the all rule
rule all:
    input: return_needed_files()
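A slightly fuller sketch of that helper, under some assumptions: it assumes that globus ls ENDPOINT:PATH prints one entry per line (with a trailing slash on directories), and the function names here are hypothetical. The parsing and comparison are kept separate from the CLI call so they can be checked without a Globus login.

```python
import subprocess


def parse_ls_output(text):
    """Turn listing output (one name per line) into a set of names."""
    return {line.strip().rstrip("/") for line in text.splitlines() if line.strip()}


def list_remote(endpoint_path, run=subprocess.run):
    """List names at a Globus path.

    Assumption: `globus ls ENDPOINT:PATH` prints one entry per line.
    """
    result = run(["globus", "ls", endpoint_path],
                 capture_output=True, text=True, check=True)
    return parse_ls_output(result.stdout)


def missing_files(needed, available):
    """Return the needed names that are not yet present remotely, in order."""
    return [name for name in needed if name not in available]
```

In a Snakefile, missing_files(needed, list_remote("<endpoint>:<path>")) could then feed the input of rule all, with the endpoint and path filled in for your setup.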

Azure batch Application package not getting copied to Working Directory of Task

I have created an Azure Batch pool with Linux machines and specified an Application Package for the pool.
My command line is
command='python $AZ_BATCH_APP_PACKAGE_scriptv1_1/tasks/XXX/get_XXXXX_data.py',
and the task fails with:
python3: can't open file '$AZ_BATCH_APP_PACKAGE_scriptv1_1/tasks/XXX/get_XXXXX_data.py':
[Errno 2] No such file or directory
When I connect to the node and look at the working directory, none of the Application Package files are present there.
How do I make sure that files from the Application Package are available in the working directory, or that I can invoke/execute files under the Application Package from the command line?
Make sure that your async operations have a proper await in place before you start using the package in your code.
Also, please share your design or pseudo-code and how you are approaching the problem.
Further to add:
This looks like a pool-level package.
The error suggests that the application environment variable is either used incorrectly or there is some other user-level issue. Please check the link below, especially the section that covers the use of the environment variable.
This looks like a user-level issue because, if downloading the package resource had failed, the error would be visible to you through your exception handler, or at the tool level if you are using Batch Explorer / BatchLabs.
https://learn.microsoft.com/en-us/azure/batch/batch-application-packages
Reason / rationale:
If a pool-level or task-level application package has an error, an error list comes back: the failure is returned as a UserError or an AppPackageError, which is visible in your code's exception handling.
Also, you can always RDP into your node and check whether the package is available; information here: https://learn.microsoft.com/en-us/azure/batch/batch-api-basics#connecting-to-compute-nodes
I once created a small sample to help people out, so that resource might help you check the usage here.
Hope the rest helps.
On Linux, the application package with version string is formatted as:
AZ_BATCH_APP_PACKAGE_{0}_{1}
On Windows it is formatted as:
AZ_BATCH_APP_PACKAGE_APPLICATIONID#version
Where 0 is the application name and 1 is the version.
$AZ_BATCH_APP_PACKAGE_scriptv1_1 will take you to the root folder where the application was unzipped.
Does this "exact" path exist in that location?
tasks/XXX/get_XXXXX_data.py
You can see more information here:
https://learn.microsoft.com/en-us/azure/batch/batch-application-packages
Edit: Just saw this question: "or can I invoke/execute files under Application Package from command line"
Yes you can invoke and execute files from the application package directory with the environment variable above.
If you type env on the node you will see the environment variables that have been set.
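One more thing worth checking: the error message shows the literal text $AZ_BATCH_APP_PACKAGE_scriptv1_1, which means the variable was never expanded. Batch task command lines are not run under a shell, so the command has to be wrapped in one for expansion to happen. Below is a minimal sketch of building such a command line; the helper names and the tasks/run.py path are hypothetical, and the flattening rule (periods, hyphens and number signs become underscores on Linux) is taken from the application packages documentation linked above.

```python
import re
import shlex


def linux_app_package_var(app_id, version):
    """Name of the Linux environment variable for an application package."""
    flat = re.sub(r"[.\-#]", "_", f"{app_id}_{version}")
    return f"AZ_BATCH_APP_PACKAGE_{flat}"


def shell_command(script_relpath, app_id, version, interpreter="python3"):
    """Wrap the command in bash so that the variable is expanded on the node."""
    var = linux_app_package_var(app_id, version)
    inner = f"{interpreter} ${var}/{script_relpath}"
    return f"/bin/bash -c {shlex.quote(inner)}"


# Hypothetical script path inside the package:
print(shell_command("tasks/run.py", "scriptv1", "1"))
```

The resulting string is what you would pass as the task's command line instead of the bare python ... invocation.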

~/.felix folder contains massive number of files

On one of the accounts we use on a cluster there is a hidden folder in the home directory:
/home/user/.felix/
This contains a huge number of directories:
[user@gateway .felix]$ ls | head -10
osgi-cache1050e0f4_15774cb91f4_-7ffe
osgi-cache-1063880a_15289337854_-7ffe
osgi-cache-10716929_155ac249b99_-7ffe
osgi-cache-1076af32_1567b76f77c_-7ffe
osgi-cache10fdd858_15288297a76_-7ffe
osgi-cache1145761a_1567b157a97_-7ffe
osgi-cache-1158de5c_15775794758_-7ffe
osgi-cache-117b5c79_1577655ca87_-7ffe
osgi-cache-1188faa3_154532959fc_-7fff
osgi-cache11bf2822_1528906f443_-7ffe
In each of these folders:
osgi-cache-37166e7_1545cb3b7e0_-7ffe/bundle10
[user@gateway bundle4]$ cat bundle.location
reference:file:/gpfs22/local/centos6/matlab/2013a/java/jar/toolbox/bioinfo.jar
So I'm thinking these files are created by matlab somehow.
This .felix folder contains about 150k files, which is pushing us over our quota of 300k files. Is there a way to:
disable the creation of these files
clean them up in a safe way (maybe a cron)
possibly move the location where these files are created?
Technically it's the Apache Felix bundle cache (http://felix.apache.org/documentation/subprojects/apache-felix-framework/apache-felix-framework-usage-documentation.html) and I'm afraid there's no safe way to remove any of these without contacting the user (even when migrating the path).
I noticed that Matlab is creating about 7k files in /tmp/.felix. The space usage is pretty minimal (184k). I was able to delete them by:
find /tmp/.felix -user <my username> -exec rm -r {} \;
But when I run my Matlab code it recreates many (all?) of the files. So at least in the Matlab usage case it seems relatively safe to delete them, but I could imagine there being problems if this info is actively being updated.
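If you want something more cautious than a blanket rm, for example to run from cron, here is a minimal sketch that only removes osgi-cache* directories that have not been modified for a number of days. It rests on an assumption I have not verified: that a cache directory untouched for that long is no longer used by a running Matlab session. The function names are hypothetical, and the default is a dry run.

```python
import shutil
import time
from pathlib import Path


def stale_cache_dirs(felix_dir, max_age_days, now=None):
    """Return osgi-cache* directories whose mtime is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    root = Path(felix_dir)
    if not root.is_dir():
        return []
    return sorted(path for path in root.glob("osgi-cache*")
                  if path.is_dir() and path.stat().st_mtime < cutoff)


def remove_stale_caches(felix_dir, max_age_days, dry_run=True):
    """Delete stale cache directories; with dry_run=True only report them."""
    stale = stale_cache_dirs(felix_dir, max_age_days)
    if not dry_run:
        for path in stale:
            shutil.rmtree(path)
    return stale
```

For example, remove_stale_caches(Path.home() / ".felix", 30) lists what would be deleted, and passing dry_run=False actually deletes it.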
Digging into the Felix docs a bit (mentioned in the other answer), I googled "Felix bundle cache" and found that it is used to store pointers to Java jar files, and perhaps state as well. There are indeed parameters that you can configure to control the location and flushing of this cache (see "configuring Felix bundle cache").
Mathworks also has Matlab specific suggestions. In the case mentioned there, this seemed to be triggered by plotting. Names in the stack trace there suggest it may have to do with implementation of key bindings (keyboard shortcuts).
Rob

How do I use Puppet's ralsh with resource types provided by modules?

I have installed the postgresql module from Puppetforge.
How can I query Postgresql resources using ralsh ?
None of the following works:
# ralsh postgresql::db
# ralsh puppetlabs/postgresql::db
# ralsh puppetlabs-postgresql::db
I was hoping to use this to get a list of databases (including attributes such as character sets) and user names/passwords from the current system in a form that I can paste into a puppet manifest to recreate that setup on a different machine.
In principle, any Puppet client gets the current state of your system from another program called Facter. You should create a custom fact (a Facter plugin) and then include it in your Puppet client. Afterwards, I think you could call this custom fact from ralsh.
More information about creating a custom fact can be found here.
In your own fact, you would execute your SQL query and then save the result into a particular variable.
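As a rough illustration of the "run a query and expose the result" idea, here is a sketch written as a script that prints key=value lines, which is the format Facter accepts from executable external facts. Several things here are assumptions: the fact names (pg_db_<name>_encoding) are made up, the script assumes psql can connect without a password as the current user, and it only reports encodings, not users or passwords.

```python
import subprocess

# Assumption: the current user may query pg_database through psql.
QUERY = ("SELECT datname, pg_encoding_to_char(encoding) "
         "FROM pg_database WHERE NOT datistemplate")


def query_databases(run=subprocess.run):
    """Run the query with unaligned, tuples-only output (name|encoding per line)."""
    result = run(["psql", "-At", "-c", QUERY],
                 capture_output=True, text=True, check=True)
    return result.stdout


def facts_from_rows(text):
    """Map each 'name|encoding' row to a hypothetical fact name."""
    facts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, encoding = line.split("|", 1)
        facts[f"pg_db_{name}_encoding"] = encoding
    return facts


def format_facts(facts):
    """Render facts as key=value lines, one per fact."""
    return "\n".join(f"{key}={value}" for key, value in sorted(facts.items()))
```

A script that ends with print(format_facts(facts_from_rows(query_databases()))) and is made executable could then be dropped into Facter's external facts directory.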

What's the best Perl module for hierarchical and inheritable configuration?

If I have a greenfield project, what is the best practice Perl based configuration module to use?
There will be a Catalyst app and some command line scripts. They should share the same configuration.
Some features I think I want ...
Hierarchical Configurations to cleanly maintain different development and live settings.
I'd like to define "global" configurations once (eg, results_per_page => 20), have those inherited but override-able by my dev/live configs.
Global:
  results_per_page: 20
  db_dsn: DBI:mysql;
  db_name: my_app
Dev:
  inherit_from: Global
  db_user: dev
  db_pass: dev
Dev_New_Feature_Branch:
  inherit_from: Dev
  db_name: my_app_new_feature
Live:
  inherit_from: Global
  db_user: live
  db_pass: secure
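To pin down the semantics I have in mind for inherit_from, here is a language-neutral sketch in Python rather than Perl. The resolve function is hypothetical and is not taken from any existing configuration module; it just walks the inheritance chain and lets child values override parent values.

```python
def resolve(configs, name, _seen=()):
    """Flatten one named section by following its inherit_from chain."""
    if name in _seen:
        raise ValueError(f"inheritance cycle at {name!r}")
    section = dict(configs[name])
    parent = section.pop("inherit_from", None)
    if parent is None:
        return section
    merged = resolve(configs, parent, _seen + (name,))
    merged.update(section)  # child settings win over inherited ones
    return merged


CONFIGS = {
    "Global": {"results_per_page": 20, "db_dsn": "DBI:mysql;", "db_name": "my_app"},
    "Dev": {"inherit_from": "Global", "db_user": "dev", "db_pass": "dev"},
    "Dev_New_Feature_Branch": {"inherit_from": "Dev", "db_name": "my_app_new_feature"},
    "Live": {"inherit_from": "Global", "db_user": "live", "db_pass": "secure"},
}

print(resolve(CONFIGS, "Dev_New_Feature_Branch"))
```

So Dev_New_Feature_Branch ends up with the Dev credentials, the Global defaults, and its own db_name.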
When I deploy a project to a new server, or branch/fork/copy it somewhere new (eg, a new development instance), I want to (one time only) set which configuration set/file to use, and then all future updates are automatic.
I'd envisage this could be achieved with a symlink:
git clone example.com:/var/git/my_project . # or any equiv vcs
cd my_project/etc
ln -s live.config to_use.config
Then in the future
git pull # or any equiv vcs
I'd also like something akin to FindBin, so that my configs can use either absolute paths or paths relative to the current deployment. Given
/home/me/development/project/
  bin
  lib
  etc/config
where /home/me/development/project/etc/config contains:
tmpl_dir: templates/
when my perl code looks up the tmpl_dir configuration it'll get:
/home/me/development/project/templates/
But on the live deployment:
/var/www/project/
  bin
  lib
  etc/config
The same code would magically return
/var/www/project/templates/
Absolute values in the config should be honoured, so that:
apache_config: /etc/apache2/httpd.conf
would return "/etc/apache2/httpd.conf" in all cases.
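Again as a language-neutral sketch (Python, hypothetical function names), this is the lookup behaviour I mean: relative values are resolved against the deployment root, which is derived from the location of etc/config, and absolute values pass through untouched.

```python
from pathlib import PurePosixPath


def base_dir_for(config_file):
    """Deployment root, assuming the config file lives at <root>/etc/config."""
    return str(PurePosixPath(config_file).parent.parent)


def resolve_path(value, base_dir):
    """Resolve a configured path: absolute values are kept, relative ones are
    anchored at the deployment root."""
    path = PurePosixPath(value)
    if path.is_absolute():
        return str(path)
    return str(PurePosixPath(base_dir) / path)


root = base_dir_for("/var/www/project/etc/config")
print(resolve_path("templates/", root))
print(resolve_path("/etc/apache2/httpd.conf", root))
```

Note that this sketch normalises away the trailing slash on templates/; whether that matters depends on how the value is used.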
Rather than a FindBin style approach, an alternative might be to allow configuration values to be defined in terms of other configuration values?
tmpl_dir: $base_dir/templates
I'd also like a pony ;)
Catalyst::Plugin::ConfigLoader supports multiple overriding config files. If your Catalyst app is called MyApp, then it has three levels of override: 1) MyApp.pm can have a __PACKAGE__->config(...) directive, 2) it next looks for MyApp.yml in the main directory of the app, 3) it looks for MyApp_local.yml. Each level may override settings in each other level.
In a Catalyst app I built, I put all of my immutable settings in MyApp.pm, my debug settings in MyApp.yml, and my production settings in MyApp_<servertype>.yml and then symlinked MyApp_local.yml to point at MyApp_<servertype>.yml on each deployed server (they were all a little different...).
That way, all of my config was in SVN and I just needed one ln -s step to manually config a server.
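The layering idea can be sketched as follows. This is an illustration in Python of "later layers override earlier ones", not Catalyst's actual merge code; merge_layers and the sample settings are hypothetical.

```python
def merge_layers(*layers):
    """Merge config layers left to right; later layers override earlier ones,
    and nested dictionaries are merged key by key."""
    merged = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = merge_layers(merged[key], value)
            else:
                merged[key] = value
    return merged


# Hypothetical stand-ins for MyApp.pm, MyApp.yml and MyApp_local.yml:
in_code = {"name": "MyApp", "db": {"dsn": "DBI:mysql;", "user": "dev"}}
main_file = {"debug": True}
local_file = {"debug": False, "db": {"user": "live"}}

print(merge_layers(in_code, main_file, local_file))
```

With these sample layers, the local file switches off debug and replaces only db.user, while db.dsn and name survive from the earlier layers.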
Perl Best Practices warns against exactly what you want. It states that config files should be simple and avoid the sort of baroque features you desire. It goes on to recommend three modules (none of which are Core Perl): Config::General, Config::Std, and Config::Tiny.
The general rationale behind this is that config files tend to be edited by non-programmers, and the more complicated you make your config files, the more likely they are to screw them up.
All of that said, you might take a look at YAML. It provides a full-featured, human-readable*, serialization format. I believe the currently recommended parser in Perl is YAML::XS. If you do go this route, I would suggest writing a configuration tool for end users to use instead of having them edit the files directly.
ETA: Based on Chris Dolan's answer it sounds like YAML is the way to go for you since Catalyst is already using it (.yml is the de facto extension for YAML files).
* I have heard complaints that blind people may have difficulty with it
YAML is hateful for config - it's not non-programmer friendly partly because yaml in pod is by definition broken as they're both white-space dependent in different ways. This addresses the main problem with Config::General. I've written some quite complicated config files with C::G in the past and it really keeps out of your way in terms of syntax requirements etc. Other than that, Chris' advice seems on the money.