I'm in a bit of a pickle...
I work on a project that is multi-site. Unfortunately, the VOB sync between the two sites is not working properly right now, and our ClearCase admins are too busy with other work to get it fixed.
I need to take code from a Dynamic View on one server and merge it to a Dynamic View on another server.
Usually we check everything in, label it, and then once the VOB syncs merge from the label on the other side.
Any tips or tricks on how to do this merge?
Ok, here's what I've got so far:
- I made sure that my source view & my target view were based on the same (slightly older) label that had synced properly.
Running the following command tells me what files have changed in my branch on the 1st server:
ct find . -version 'version (.../branch-name/LATEST)' -nxn -print
Running this command will give me a GNU style diff against the labeled version:
ct diff -diff FILENAME `cleartool find FILENAME -version 'lbtype(LABEL)' -print`
Now I need to chain these together to create a patchset file that I can then use GNU merge to merge into the 2nd view that's based on the same label.
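Here's the rough, untested Python sketch I have in mind for chaining them into a single patchset (the branch name and label below are placeholders for the real ones):
import subprocess

BRANCH_QUERY = "version(.../branch-name/LATEST)"   # placeholder branch
LABEL = "LABEL"                                     # placeholder label

# files changed on my branch (same query as the ct find above)
out = subprocess.run(
    ["cleartool", "find", ".", "-version", BRANCH_QUERY, "-nxname", "-print"],
    capture_output=True, text=True, check=True).stdout
changed = sorted(set(line.strip() for line in out.splitlines() if line.strip()))

with open("patchset.diff", "w") as patch:
    for f in changed:
        # version of this file that carries the common label
        base = subprocess.run(
            ["cleartool", "find", f, "-version", "lbtype(%s)" % LABEL, "-print"],
            capture_output=True, text=True).stdout.strip()
        if not base:
            continue  # file doesn't exist at the label; handle it separately
        # GNU-style diff, labeled version first so the patch applies the
        # branch changes on top of the label in the other view
        diff = subprocess.run(["cleartool", "diff", "-diff", base, f],
                              capture_output=True, text=True)
        patch.write(diff.stdout)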
You need to get the data back somehow from the other site of the replicated environment.
If the mkreplica did work but the ship process failed, you could try asking for a shared file replica, which could then be imported (see the mkreplica help, section Imports).
multitool mkreplica -export -workdir /tmp/ms_workdir -c "make a new replica for sanfran_hub" -out /tmp/sanfran_hub_packet
multitool mkreplica -import -workdir /tmp/ms_workdir -tag /vobs/dev -vob /net/goldengate/vobstg/dev.vbs -preserve -c "create sanfran_hub replica" /tmp/sanfran_hub_packet
But if your CC admins are that busy, all that is left is the "poor man's replica":
some kind of zip, and a merge with a third party tool between your local view and said zip.
I am sure you could extract any relevant data from a source dynamic view which would not be up-to-date anyway.
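For that poor man's approach, even just zipping up whatever ct find reports as changed on your branch would be enough to carry across; a sketch, reusing the placeholder branch query from the question:
import subprocess, zipfile

out = subprocess.run(
    ["cleartool", "find", ".", "-version",
     "version(.../branch-name/LATEST)", "-nxname", "-print"],
    capture_output=True, text=True, check=True).stdout

with zipfile.ZipFile("branch_changes.zip", "w") as zf:
    for f in sorted(set(out.split())):
        zf.write(f)   # the view-selected (branch) content of each changed file
Carry the zip to the other site and merge it file by file against the target view with whatever diff/merge tool you prefer.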
The admins finally got around to cleaning it up before I could finish my solution, so I don't need this anymore. Hopefully they will keep it up and running.
I am working in a high performance computing grid environment, where large-scale data transfers are done via Globus. I would like to use Snakemake to pull data from a Globus path, process the data, and then push the processed data to a different Globus path. Globus has a command-line interface.
Pulling the data is no problem: I'd just create a rule that runs globus transfer to create the requisite local file. But for pushing the data back to Globus, I think I'll need a rule that can "see" that the file is missing at the remote location, and then work backwards to determine what needs to happen to create it.
I could create local "proxy" files that represent the remote files. For example I could make a rule for creating 'processed_data_1234.tar.gz' output files in a directory. These files would just be created using touch (thus empty), and the same rule will run globus transfer to push the files remotely. But then there's the overhead of making sure that the proxy files don't get out of sync with the real Globus-hosted files.
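To make that concrete, here is roughly what I'm picturing; the endpoint IDs and paths are invented, and the push rule uses a small marker file as the local proxy:
# placeholder endpoint IDs/paths throughout; a real rule would also wait for
# the transfer task to finish (e.g. with `globus task wait`)
rule pull_raw_data:
    output:
        "raw/{sample}.tar.gz"
    shell:
        "globus transfer REMOTE_EP:/data/{wildcards.sample}.tar.gz "
        "LOCAL_EP:$(pwd)/{output}"

# the touched marker file is the local proxy for the pushed remote file
rule push_processed_data:
    input:
        "processed/{sample}.tar.gz"
    output:
        touch("pushed/{sample}.done")
    shell:
        "globus transfer LOCAL_EP:$(pwd)/{input} "
        "REMOTE_EP:/results/{wildcards.sample}.tar.gz"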
Is there a more elegant way to do this, akin to the Remote File capability? Is it difficult to add Globus CLI support to Snakemake? Thanks in advance for any advice!
Would it help to create a utility function that generates a list of all desired files and compares it against the list of files available on Globus? Something like this (a sketch; the globus ls endpoint and path are placeholders):
import subprocess

def return_needed_files(wildcards):  # Snakemake passes the wildcards object to input functions
    # desired files: hard-coded here, or built with expand()/glob_wildcards()
    list_needed_files = ["processed_data_1234.tar.gz"]
    # files already available remotely, listed via the Globus CLI (placeholder endpoint/path)
    result = subprocess.run(["globus", "ls", "ENDPOINT_ID:/path/to/output"],
                            capture_output=True, text=True, check=True)
    list_available = result.stdout.split()
    return [i for i in list_needed_files if i not in list_available]

# include all the needed files in the all rule
rule all:
    input: return_needed_files
On one of the accounts we use on a cluster there is a hidden folder in the home directory:
/home/user/.felix/
This contains a huge number of directories:
[user@gateway .felix]$ ls | head -10
osgi-cache1050e0f4_15774cb91f4_-7ffe
osgi-cache-1063880a_15289337854_-7ffe
osgi-cache-10716929_155ac249b99_-7ffe
osgi-cache-1076af32_1567b76f77c_-7ffe
osgi-cache10fdd858_15288297a76_-7ffe
osgi-cache1145761a_1567b157a97_-7ffe
osgi-cache-1158de5c_15775794758_-7ffe
osgi-cache-117b5c79_1577655ca87_-7ffe
osgi-cache-1188faa3_154532959fc_-7fff
osgi-cache11bf2822_1528906f443_-7ffe
In each of these folders:
osgi-cache-37166e7_1545cb3b7e0_-7ffe/bundle10
[user@gateway bundle4]$ cat bundle.location
reference:file:/gpfs22/local/centos6/matlab/2013a/java/jar/toolbox/bioinfo.jar
So I'm thinking these files are created by matlab somehow.
This .felix folder contains roughly 150k files, which is pushing us over our quota of 300k files. Is there a way to:
disable the creation of these files
clean them up in a safe way (maybe via a cron job)
possibly move the location where these files are created?
Technically it's the Apache Felix bundle cache (http://felix.apache.org/documentation/subprojects/apache-felix-framework/apache-felix-framework-usage-documentation.html), and I'm afraid there's no safe way to remove any of these without contacting the user (even when migrating the path).
I noticed that Matlab is creating about 7k files in /tmp/.felix. The space usage is pretty minimal (184k). I was able to delete them by:
find /tmp/.felix -user <my username> -exec rm -r {} \;
But when I run my Matlab code it recreates many (all?) of the files. So at least in the Matlab usage case it seems relatively safe to delete them, but I could imagine there being problems if this info is actively being updated.
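If we do decide to clean them up via cron, something along these lines would do it, with the caveat above that I can't promise an actively used cache is safe to delete (the 30-day cutoff is arbitrary):
# weekly, at 03:00 on Sundays: remove cache directories untouched for 30+ days
0 3 * * 0 find $HOME/.felix -maxdepth 1 -name 'osgi-cache*' -mtime +30 -exec rm -rf {} +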
Digging into the Felix docs a bit (mentioned in the answer), I googled "Felix bundle cache" and found that this cache is used to store pointers to Java jar files, and perhaps some state as well. There are indeed parameters you can configure to control the location and flushing of this cache (see "configuring the Felix bundle cache" in the Felix framework documentation).
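For reference, the standard OSGi/Felix properties for this look like the lines below; I have not verified whether MATLAB will pick them up from an external config.properties, or whether they have to be passed as Java system properties (-Dorg.osgi.framework.storage=...):
# bundle cache location (placeholder path)
org.osgi.framework.storage=/tmp/felix-cache
# wipe the cache every time the framework starts
org.osgi.framework.storage.clean=onFirstInit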
Mathworks also has MATLAB-specific suggestions. In the case mentioned there, this seemed to be triggered by plotting. Names in the stack trace there suggest it may have to do with the implementation of key bindings (keyboard shortcuts).
Rob
I have a config file in my project that includes some info that is machine-dependent (db username, password, path). I understand that in this particular case I could force everybody to use the same username, db path, and password to keep things simple, but there must be another way to deal with this problem.
I use mercurial, if you care, but I am ok with just a theoretical answer if you are unfamiliar with hg specifics.
A common way to handle this is to put a config.example or similar under version control and force the user to copy it and make any necessary changes. That way the user can pull down the overall structure of the file from your repository without overwriting local changes.
Alternatively, you could make your config file provide only defaults, with the option to source a subset of variables from a higher-priority custom config file (in the same format) which the user may or may not provide.
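A minimal sketch of that defaults-plus-override idea, assuming an INI-style file and Python's configparser (the file names are arbitrary):
import configparser

config = configparser.ConfigParser()
# versioned defaults first; the ignored, per-machine file overrides any
# subset of keys it defines, and is simply skipped if it doesn't exist
config.read(["config/defaults.ini", "config/local.ini"])

db_user = config.get("database", "user")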
You'll want to use the .hgignore file to not include the config file in the repository.
This will allow everyone to have their own version of the config file.
Basically, you just want to add the relative path to the config file and Mercurial commands will ignore it. So the file would look like this:
config/dbconfig.ext
Edit
I just realized you still want to be able to version control the config file (I misunderstood the question). So I suggest moving the parts of the config file that are machine-dependent into their own config file and then applying the fix above. That way, you can still keep the regular config information under version control while the machine-specific part stays separate on each person's machine.
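So the layout would end up something like this (file names are just examples):
config/dbconfig.ext      <- machine-specific values, listed in .hgignore as above
config/appconfig.ext     <- shared settings, kept under version control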
I have per machine databases for my PHP projects. What I do is check the hostname at runtime. If it is one host, I feed it certain credentials. If another, feed it different credentials.
On some systems I create a list of credentials and then just go down the line trying them until one of the connections works. If the list is exhausted, the connection cannot be made.
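The idea, sketched in Python here rather than PHP (hostnames and credentials are made up):
import socket

# credentials per known host; the "go down the line" variant would instead
# loop over a list and try each entry until a connection succeeds
CREDENTIALS = {
    "dev-laptop":  {"user": "dev", "password": "devpass", "db": "app_dev"},
    "prod-server": {"user": "app", "password": "secret",  "db": "app"},
}

creds = CREDENTIALS.get(socket.gethostname())
if creds is None:
    raise RuntimeError("no database credentials configured for this host")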
I've never found a solid method for handling this type of configuration file. My final solution was to just maintain a version of each file and use symbolic links. That way each server has the same file path, but a different underlying file.
Without knowing exactly what is in your config file, I'm going to assume your file has some stuff that is machine-dependent (e.g., db password, paths) and other stuff that is not (db hostname, maybe some paths relative to a path that is configured on a per-machine basis, etc.)
If that's the case, what you want to do is refactor your config file so that you have two config files: one for the common stuff, one for the machine-specific stuff. Check the common one in, and add the machine-specific one to the ignore file.
The wiki mentions it's possible to do this under hg serve, but there aren't any examples (such as a sample webdir-conf file). Yes I know it would be better to do this all under Apache, but this is a local machine and hg serve just makes sense for us.
As you've hinted, you use the hg serve --webdir-conf FILE invocation, and the webdir.conf format is the same as it is for hgweb.cgi, so those examples apply to you too:
https://www.mercurial-scm.org/wiki/HgWebDirStepByStep#Preparing_the_config
so at your most basic you can do:
[paths]
/repos = /webdata/hg_repos/*
where /webdata/hg_repos/ is the directory on your local system that contains the repositories, and /repos is the URL path they are served under.
(and you're right it would be much better to take the time to do this under Apache).
Use something like this as your webdir config (for example):
cat > foo.config << EOL
[paths]
power = power/Repo
billable = /path/to/billable/Repo
EOL
hg serve --webdir-conf foo.config
Assuming your repos live in different places...
As an alternative you can use RhodeCode; it's a standalone app written in Pylons.
"RhodeCode is Pylons framework based Mercurial repository browser/management with build in push/pull server and full text search and permissions system."
A demo can be viewed here.
http://demo.rhodecode.org
Regards
Currently Buildbot does not support multiple repositories. If one desires this, then separate instances of Buildbot need to be run.
Still I'm curious if anyone has come up with a creative workaround to get this feature working anyway.
Update
This answer received a few downvotes recently, please note that this answer applies to the releases of buildbot that were published/used around the end of 2012/beginning of 2013 and may not be applicable for future versions.
Original Answer
As @Macke said, buildbot (>= 0.8.x) supports multiple projects/repositories. This is done with configuration like the following:
# buildbot 0.8.x import paths for the classes used below
from buildbot.changes.gitpoller import GitPoller
from buildbot.schedulers.basic import SingleBranchScheduler
from buildbot.changes.filter import ChangeFilter

# Set configuration to watch the Git repository for possible
# changes. When a change does occur the schedulers will be
# notified with the project data (TestProj).
c['change_source'] = []
c['change_source'].append(
GitPoller(
repourl ='git://github.com/SO/my_test_project.git',
project = 'TestProj',
branch = 'master',
workdir = '/home/buildmaster/repos/TestProj'
)
)
# Set the schedule to run on each change, but only for the project
# specified above via the project information.
c['schedulers'] = []
c['schedulers'].append(
SingleBranchScheduler(
name = "TestProj-master",
builderNames = ['TestProj-master-builder'],
change_filter = ChangeFilter(
project = 'TestProj',
branch = 'master'
)
)
)
You can see that the project parameter in the change source is then used again in the scheduler's change_filter property to ensure that the scheduler only responds to that particular change source. This allows you to configure multiple change sources and multiple schedulers responding to explicitly chosen change sources.
Since the 0.8.7p1 release, buildbot supports multiple codebases
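Roughly, that is wired up in master.cfg like this (a sketch following the 0.8.7+ docs; repository URLs and codebase names are invented):
# map each repository URL to a codebase name
all_repositories = {
    'git://github.com/SO/my_test_project.git': 'testproj',
    'https://hg.example.com/other_project': 'otherproj',
}

def codebaseGenerator(chdict):
    return all_repositories[chdict['repository']]

c['codebaseGenerator'] = codebaseGenerator
Schedulers then take a codebases argument giving the default repository and branch for each codebase name.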
Indeed, I don't see why you say that it does not support multiple repositories... You can create a poller for each repository and multiple schedulers that watch the different pollers and trigger builds for many different repositories (either on the same machine where the master runs, or on a dedicated slave on a different box).
You want to avoid having multiple instances, but, for example, a master and a slave can coexist on the same machine, even if it is a pain to start and stop them in the right order; otherwise you get conflict errors :)
> Currently Buildbot does not support multiple repositories.
I don't really understand the question... sorry. Do you mean that you have to run multiple master servers? It is actually advised by the buildbot devs to do so, but the contrary works for me: you can have in the same master.cfg multiple slaves (columns in the waterfall) and, for each of them, a BuildFactory with different first steps of the type Git(repourl=...) and/or Mercurial(repourl=...) etc.
Each will clone/pull from different repositories, and you can even add further checkouts that are needed in subsequent steps (using maven or your SCM client directly). The only issue with having a single master.cfg file is that all builders will have only one method for getting notifications of changes; we have, for example, PBChangeSource() (the master is notified by remote code and has nothing to do itself). If, for instance, you have an SCM with good PBChangeSource support (e.g., svn, hg, git) and another with bad support (e.g., MKS), then you should have two master server instances in order to cope with that.
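A trimmed-down sketch of that single-master.cfg layout, using the same Git(repourl=...) / Mercurial(repourl=...) steps mentioned above (buildbot 0.8.x import paths assumed; URLs, builder and slave names are placeholders):
from buildbot.process.factory import BuildFactory
from buildbot.steps.source.git import Git
from buildbot.steps.source.mercurial import Mercurial
from buildbot.config import BuilderConfig

git_factory = BuildFactory()
git_factory.addStep(Git(repourl='git://example.com/project_a.git'))
# ... compile/test steps for project A ...

hg_factory = BuildFactory()
hg_factory.addStep(Mercurial(repourl='https://hg.example.com/project_b'))
# ... compile/test steps for project B ...

c['builders'] = [
    BuilderConfig(name='project-a', slavenames=['slave1'], factory=git_factory),
    BuilderConfig(name='project-b', slavenames=['slave1'], factory=hg_factory),
]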
Hope it'll help.