rsync'ed outputs are always marked as outdated - remote-server

I'm using {targets} to manage a workflow, and it's great.
We don't have a proper cluster setup, but I have access to remote machines with much better specs than my laptop, so I can use git to keep the plan in sync locally and remotely.
When I want to work with something locally, I use rsync to move the files over.
rsync -avxP -e "ssh -p ..." remote:path/to/_targets .
When I query the remote cache with tar_network, I see that a bunch of my targets are "uptodate".
When I query the local cache after the rsync above, those same targets are "outdated".
I'm wondering if there are better calls to rsync or certain arguments to tar_network() that would help, or if this is a bug and the targets should stay "uptodate" after an rsync like this?

OK, so I figured this out.
It was because I was being foolish about what was a target in this case. To make sure that a package dependency was being captured, I was using something that grabbed the entire DESCRIPTION of the package (packageDescription(), I think). The problem is that when you install the package using remotes::install_github(), it adds some extra information to the DESCRIPTION on installation (the Packaged and Built fields), and that information differed between the installation on my local machine and the installation on the remote machine.
What I really wanted was just the GithubSHA1 field from packageDescription(), to verify that I was using the same package at the same commit from my GitHub repo. Once I switched to using that instead of the entire DESCRIPTION, targets had no issues with the meta information, and things would stay current when rsync'ing them between machines.
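As a minimal sketch of the idea (the package name mypkg is a placeholder), you can print just the pinned commit from a shell on either machine and compare:

# print the GitHub commit SHA recorded at install time by remotes::install_github()
Rscript -e 'cat(packageDescription("mypkg")$GithubSHA1)'

If both machines print the same SHA, the dependency target hashes identically on both.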

Related

Use Git list output to copy files for archiving

I'm currently helping to maintain a project for a client remotely. I'm the only developer, hence some of my unorthodox approaches/thinking.
the problem
The client is using Visual Studio 2010 + Team Foundation Server for their source control. I am working on a Mac over VPN and have tried several approaches to make committing to their TFS workable. I've tried the TFS plugin for Eclipse with no luck (the VPN really hoses the connection to TFS). Currently I am having to do a full "checkout for edit" through a virtual machine to the TFS, then transfer the project over the VPN to overwrite those files. Not a sustainable solution, to say the least.
the solution?
I'm wondering if there is a way to:
get a list of changed files from Git (I think this is the solution: How to list all the files in a commit?)
then use that list as a means to go in and fetch those files, maintaining their folder structure
from there I can do my dump over VPN into the VM that has the project mapped in TFS.
Or if there is something I've overlooked or hadn't thought of, please do recommend them, I'm all ears.
First, I'm assuming you are running the VM on or near the TFS server, not on your Mac. If not, you can just share a directory using VMware/VirtualBox and edit away on your Mac...
It sounds like you could achieve what you want with plain old Git. If you:
Create a bare repository on the VM (git init --bare)
Add a post-receive hook to copy the files from the master branch (for example) into the TFS directory, overwriting merrily (http://git-scm.com/book/en/Customizing-Git-Git-Hooks); a sketch of such a hook follows this list
Initialise your local copy of the source as a Git repository (git init)
Add the remote repository. Assuming it's a Windows box, you can use an SMB shared folder over the VPN so your remote is "local" as far as Git is concerned (git remote add tfsserver file:///Volumes/tfsmount/code)
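A minimal post-receive hook sketch, assuming the TFS working directory path below is adjusted to wherever the project is mapped:

#!/bin/sh
# hooks/post-receive in the bare repo: force a checkout of master into the TFS directory
GIT_WORK_TREE=/path/to/tfs/workspace git checkout -f master

Make it executable (chmod +x hooks/post-receive) and every push will overwrite the TFS directory with the latest master.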
Your first push will be expensive (but you could prepopulate the remote repo to get around that), but subsequent pushes would be just the changesets. The post-receive hook would then take care of updating the files, and you're laughing.
Of course, you then get to impress them with how amazing Git is, get them to migrate, and your problem goes away forever :).
Update: Here's a link which describes these steps in more detail, under the guise of updating a remote website: http://toroid.org/ams/git-website-howto.

GitHub and coding processes

I've been programming for a little while now and have built a little application which is now hosted on a dedicated server.
Now I have been rolling out different versions of my app with no real understanding of how to manage the process properly.
Is this the proper way to manage a build of an application when using a product like GitHub?
Upload my entire application onto github.
Each time I work on it, download it and install it on my dev server.
When I'm done working on it and it appears to be OK, do I then upload just the changed files for the current project I am working on, am I meant to update the entire lot, or am I meant to create a new version of the project?
Once all my changes are updated, is there any way of pushing these to a production machine from GitHub, or of generating a listing of the newly changed files so I can update the production machine easily with a checklist of some kind?
My application has about 900 files spread across various folder structures; it's a server-based app (ColdFusion, to be precise), and as I work alone the majority of the time, I'm struggling to understand how to manage the development of an app...
I also have no idea about using the command line, and my desktop machine is a Mac, with a VM running all my required server apps (Windows Server 2012, MSSQL 2012, etc.).
I really want to keep my dev process in order, but I've struggled to understand how to manage a server-side app's development when I'm using a Mac as my desktop and a Windows machine as my dev machine; I feel like I'm stuck in the middle.
You make it sound more complicated than it is.
Upload my entire application onto github.
Well, this is actually two steps: first, create a local Git repo (git init), then push your repo up to GitHub.
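A sketch of those two steps, with a placeholder remote URL:

git init
git add .
git commit -m "Initial commit"
git remote add origin git@github.com:you/yourapp.git
git push -u origin master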
Each time I work on it, download it and install it on my dev server.
Well, you only need to "download" it once to a new dev box. After that, just git pull (or git fetch depending on workflow), which ensures any changes on the server are pulled down. Just the deltas are sent.
Git is a distributed version control system. That means every git repo has the full history of the entire project. So only deltas need to be sent. (This really helps when multiple people are hacking on a project).
When I'm done working on it and it appears to be OK, do I then upload just the changed files for the current project I am working on, am I meant to update the entire lot, or am I meant to create a new version of the project?
Hmm, you are using fuzzy terminology here. When you are done editing, you first commit locally (git add ...; git commit), then you push the changes to github (git push). Only the deltas are sent. Every commit is "a new version" if you squint.
Later on, if you want to think in terms of "software releases" (i.e. releasing "version 1.1" after many commits), you can use git tags. But don't worry about that right away.
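For example, tagging and publishing a release later might look like this (the version number is illustrative):

git tag -a v1.1 -m "Version 1.1"
git push origin v1.1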
Once all my changes are updated, is there any way of pushing these to a production machine from GitHub, or of generating a listing of the newly changed files so I can update the production machine easily with a checklist of some kind?
Never mess around with files manually on your server. The server should ONLY be allowed to run a valid, checked-out version of your software. If your production server is running random bits of code, nobody will be able to reproduce problems, because those bits aren't in the version control system.
The super-simple way to deploy is to do a git clone on your server (one time), then git pull to update the code. So you push a change to github, then pull the change from your server.
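A sketch of that flow, with placeholder paths and URL:

# one time, on the production server
git clone git@github.com:you/yourapp.git /var/www/yourapp
# on each deploy, after pushing from the dev box
cd /var/www/yourapp && git pull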
More advanced: you will want something like Capistrano, which will manage the checkouts for you and separate "checking out" from "deploying" to allow for easier rollback, etc. There may be Windows-specific ways of doing that too. (Sorry, I'm a Linux guy.)

Jenkins: FTP / SSH deployment, including deletion and moving of files

I was wondering how to get my web projects deployed using FTP and/or SSH.
We currently have a self-made deployment system which is able to handle this, but I want to switch to Jenkins.
I know there are publishing plugins and they work well when it comes to uploading build artifacts. But they can't delete or move files.
Do you have any hints, tips, or ideas regarding my problem?
The Publish Over SSH plugin enables you to send commands using ssh to the remote server. This works very well, we also perform some moving/deleting files before deploying the new version, and had no problems whatsoever using this approach.
The easiest way to handle deleting and moving items is to delete everything on the server before you deploy a new release using one of the 'Publish over' extensions. I'd say that really is the only way to know the deployed version is the one you want. If you want more versioning-system-style behavior, you either need to use a versioning system, or maybe rsync, which will cover part of it.
If your demands are very specific you could develop your own convention to mark deletions and have them be performed by a separate script (like you would for database changes using Liquibase or something like that).
By the way: I would recommend not automatically updating your live sites after every build using the 'Publish over ...' extensions. In cases where we really want a live site automatically updated, we rely on the Promoted Builds Plugin to keep it nearly fully automated while adding a little safety.
I came up with a simple solution to remove deleted files and upload changes to a remote FTP server as a build action in Jenkins using a simple lftp mirror script. Lftp Manual Page
In short, you create a config file ~/.netrc in your Jenkins user's home directory and populate it with your FTP credentials:
machine ftp.remote-host.com
login mySuperSweetUsername
password mySuperSweetPassword
Create an lftp script, deploy.lftp, and drop it in the root of your Git repo:
set ftp:list-options -a
set cmd:fail-exit true
open ftp.remote-host.com
mirror --reverse --verbose --delete --exclude .git/ --exclude deploy.lftp --ignore-time --recursion=always
Then add an "Exec Shell" build action to execute lftp on the script.
lftp -f deploy.lftp
The lftp script will
mirror: copy all changed files
reverse: push local files to a remote host. A regular mirror pulls from the remote host to local.
verbose: dump all the notes about what files were copied where to the build log
delete: remove remote files no longer present in the git repo
exclude: don't publish .git directory or the deploy.lftp script.
ignore-time: won't decide what to publish based on file timestamps. Without this, in my case, all files got published, since a fresh clone of the Git repo updates the file timestamps. It still works quite well though; even files modified by adding a single space were identified as different and uploaded.
recursion: will analyze every file rather than depending on folders to determine whether any files in them were possibly modified. This isn't technically necessary since we're ignoring timestamps, but I have it in here anyway.
I wrote an article explaining how I keep FTP in sync with Git for a WordPress site I could only access via FTP. The article explains how to sync from FTP to Git then how to use Jenkins to build and deploy back to FTP. This approach isn't perfect but it works. It only uploads changed files and it deletes files off the host that have been removed from the git repo (and vice versa)

How to actually use a source control system?

So I get that most of you are frowning at me for not currently using any source control. I want to, I really do, now that I've spent some time reading the questions / answers here. I am a hobby programmer and really don't do much more than tinker, but I've been bitten a couple of times now not having the 'time machine' handy...
I still have to decide which product I'll go with, but that's not relevant to this question.
I'm really struggling with the flow of files under source control, so much so I'm not even sure how to pose the question sensibly.
Currently I have a directory hierarchy where all my PHP files live in a Linux Environment. I edit them there and can hit refresh on my browser to see what happens.
As I understand it, my files now live in a different place. When I want to edit, I check it out and edit away. But what is my substitute for F5? How do I test it? Do I have to check it back in, then hit F5? I admit to a good bit of trial and error in my work. I suspect I'm going to get tired of checking in and out real quick for the frequent small changes I tend to make. I have to be missing something, right?
Can anyone step me through where everything lives and how I test along the way, while keeping true to the goal of having a 'time machine' handy?
Eric Sink has a great series of posts on source control basics. His company (Sourcegear) makes a source control tool called Vault, but the how-to is generally pretty system agnostic.
Don't edit your code on production.
Create a development environment, with the appropriate services (apache w/mod_php).
The application directory within your dev environment is where you do your work.
Put your current production app in there.
Commit this directory to the source control tool. (now you have populated source control with your application)
Make changes in your new development environment, hitting F5 when you want to see/test what you've changed.
Merge/Commit your changes to source control.
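With Subversion, for instance (all paths and URLs here are placeholders), that setup might look like:

# one time: put the current production code under version control
svn import /var/www/production/app file:///home/you/repos/app/trunk -m "Initial import"
# check out a working copy into the dev environment
svn checkout file:///home/you/repos/app/trunk /var/www/dev/app
# edit in /var/www/dev/app, hit F5 against the dev server, then:
svn commit -m "Describe the change" /var/www/dev/app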
Actually, your files, while stored in a source repository (a big word for another place on your hard drive, or a hard drive somewhere else), can also exist on your local machine, right where they are now.
So, all files that aren't checked out would be marked as "read only", if you are using VSS (not sure about SVN, CVS, etc.). You could still run your website by hitting F5, and it will reload the files from where they currently are. If you check one out and are editing it, it is no longer read-only, and you can change it.
Regardless, the web server that you are running will load readonly/writable files with the same effect.
You still have all the files on your hard drive, ready for F5!
The difference is that you can "checkpoint" your files into the repository. Your daily life doesn't have to change at all.
You can do a "checkout" to the same directory where you currently work so that doesn't have to change. Basically your working directory doesn't need to change.
This is a wildly open-ended question, because how you use an SCM depends heavily on which SCM you choose. A distributed SCM like Git works very differently from a centralized one like Subversion.
SVN is way easier to digest for the new user, but Git can be a little more powerful and improve your workflow. Subversion also has really great docs and tool support (like Trac), and an online book that you should read:
http://svnbook.red-bean.com/
It will cover the basics of source control management which will help you in some way no matter which SCM you ultimately choose, so I recommend skimming the first few chapters.
Edit: Let me point out why people are frowning at you, by the way: SCM is more than simply a "backup of your code". Having a "time machine" is nothing like an SCM. With an SCM you can go back through your change history and see what you actually changed and when, which is something you'll never get with blobs of code. I'm sure you've asked yourself on more than one occasion "how did this code get here?" or "I thought I fixed that bug"; if you have, that's why you need SCM.
You don't "have" to change your workflow in a drastic way. You could, and in some cases you should, but that's not something version control dictates.
You just use the files as you would normally. Only under version control, once you reach a certain state of "finished" or at least "working" (solved an issue in your issue tracker, finished a certain method, tweaked something, etc), you check it in.
If you have more than one developer working on your codebase, be sure to update regularly, so you're always working against a recent (merged) version of the code.
Here is the general workflow that you'd use with a centralized source control system like CVS or Subversion: at first you import your current project into the so-called repository, a versioned storage of all your files. Take care to import only hand-generated files (source, data files, makefiles, project files). Generated files (object files, executables, generated documentation) should not be put into the repository.
Then you have to check out your working copy. As the name implies, this is where you will do all your local edits, where you will compile, and where you will point your test server. It's basically the replacement for where you worked before. You only need to do these steps once per project (although you could check out multiple working copies, of course).
This is the basic work cycle: first you update your working copy with any changes made in the repository. When working in a team, this brings in any changes other team members made since your last update. Then you do your work. When you've finished a set of work, you update again and resolve possible conflicts due to changes by other team members (in a disciplined team, this is usually not a problem). Test, and when everything works as expected, you commit (check in) your changes. Then you can continue working, and once you've finished again: update, resolve conflicts, commit. Note that you should only commit changes that have been tested and work. How often you commit is a matter of taste, but a general rule says you should commit your changes at least once at the end of each day. Personally, I commit much more often than that, basically whenever I've made a set of related changes that pass all the tests.
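In Subversion terms, one pass through that cycle is just the following (the commit message is illustrative):

svn update                      # pull in teammates' changes
# ...edit, build, test...
svn update                      # merge again and resolve any conflicts
svn commit -m "Fix issue 123"   # check in once the tests pass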
Great question. With source control you can still do your "F5" refresh process. But after each edit (or a few minor edits) you want to check your code in so you have a copy backed up.
Depending on the source control system, you don't have to explicitly check out the file each time. Just editing the file will check it out. I've written a visual guide to source control that many people have found useful when grokking the basics.
I would recommend a distributed version control system (mercurial, git, bazaar, darcs) rather than a centralized version control system (cvs, svn). They're much easier to setup and work with.
Try mercurial (which is the VCS that I used to understand how version control works) and then if you like you can even move to git.
There's a really nice introductory tutorial on Mercurial's homepage: Understanding Mercurial. That will introduce you to the basic concepts on VCS and how things work. It's really great. After that I suggest you move on to the Mercurial tutorials: Mercurial tutorial page, which will teach you how to actually use Mercurial. Finally, you have a free ebook that is a really great reference on how to use Mercurial: Distributed Revision Control with Mercurial
If you're feeling more adventurous and want to start off with Git straight away, then this free ebook is a great place to start: Git Magic (Very easy read)
In the end, no matter what VCS tool you choose, what you'll end up doing is the following:
Have a repository that you don't manually edit; it is only for the VCS.
Have a working directory, where you make your changes as usual.
Change what you like, press F5 as many times as you wish. When you like what you've done and think you would like to save the project the way it is at that very moment (much like you would do when you're, for example, writing something in Word) you can then commit your changes to the repository.
If you ever need to go back to a certain state in your project you now have the power to do so.
And that's pretty much it.
If you are using Subversion, you check out your files once. Then, whenever you have made big changes (or are going to lunch or whatever), you commit them to the server. That way you can keep your old workflow by pressing F5, but every time you commit you save a copy of all the files in their current state in your SVN repository.
It depends on the source control system you use. For example, with Subversion and CVS your files can reside in a remote location, but you always check out your own copy of them locally. This local copy (often referred to as the working copy) is just regular files on the filesystem with some metadata that lets you upload your changes back to the server.
If you are using Subversion here's a good tutorial.
Depending on the source control system, 'checkout' may mean different things. In the SVN world, it just means retrieving the latest copy from the repository (could be an update, could be a new file). In the SourceSafe world, it generally means updating the existing file and locking it. The text below uses the SVN meaning:
Using PHP, what you want to do is check out your entire project/site to a working folder on a test Apache site. You should have the repository set up so this can happen with a single checkout, including any necessary subfolders. You check out your project to set this up one time.
Now you can make your changes and hit F5 to refresh as normal. When you're happy with a set of changes that supports a particular fix or feature, you can commit them as a unit (with appropriate comments, of course). This puts the latest version in the repository.
Checking out/committing one file at a time would be a hassle.
A source control system is generally a storage place for your files and their history, and it is usually separate from the files you're currently working on. It depends a bit on the type of version control system, but suppose you're using something CVS-like (such as Subversion): then all your files will live in two (or more) places. You have the files in your local directory, the so-called "working copy", and one copy in the repository, which can be located in another local folder or on another machine, usually accessed over the network. Usually, after the first import of your files into the repository, you check them out under a working folder where you continue working on them. I assume that would be the folder where your PHP files now live.
Now what happens when you've checked out a copy and you made some non-trivial changes that you want to "save"? You simply commit those changes in your working copy to the version control system. Now you have a history of your changes. Should you at any point wish to go back to the version at which you committed those changes, then you can simply revert your working copy to an older revision (the name given to the set of changes that you commit at once).
Note that this is all very CVS/SVN-specific; Git works slightly differently. I'd recommend starting with Subversion and reading the first few chapters of the very excellent SVN Book to get you started.
This is all very subjective, depending on the source control solution that you decide to use. One that you will definitely want to look into is Subversion.
You mentioned that you're doing PHP, but are you doing it in a Linux environment or on Windows? It's not really important, but what I typically did when I worked in a PHP environment was to have a production branch and a development branch. This allowed me to configure a cron job (a scheduled task on Windows) to automatically pull from the production-ready branch for the production server, while pulling from the development branch for my dev server.
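As a sketch of that cron idea (paths, schedule, and branch names are placeholders, and it's shown here with Git even though any VCS with branches works the same way):

# crontab entry on the production server: update from the production branch every 15 minutes
*/15 * * * * cd /var/www/site && git pull origin production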
Once you decide on a tool, you should really spend some time learning how it works. The concepts of checking in and checking out don't apply to all source control solutions, for example. Either way, I'd highly recommend that you pick one that permits branching. This article goes over a great (in my opinion) source control model to follow in a production environment.
Of course, I state all this having not "tinkered" in years. I've been doing professional development for some time and my techniques might be overkill for somebody in your position. Not to say that there's anything wrong with that, however.
I just want to add that the system that I think was easiest to set up and work with was Mercurial. If you work alone and not in a team, you just initialize it in your normal work folder and then go on from there. The normal flow is to edit any file using your favourite editor and then do a check-in (commit).
I haven't tried Git, but I assume it is very similar. Monotone was a little bit harder to get started with. These are all distributed source control systems.
It sounds like you're asking about how to use source control to manage releases.
Here's some general guidance that's not specific to websites:
Use a local copy for developing changes
Compile (if applicable) and test your changes before checking in
Run automated builds and tests as often as possible (at least daily)
Version your daily builds (have some way of specifying the exact bits of code corresponding to a particular build and test run)
If possible, use separate branches for major releases (or have a development and a release branch)
When necessary, stabilize your code base (define a set of tests such that passing all of those tests means you are confident enough in the quality of your product to release it, then drive toward 0 test failures, i.e. ban any checkins to the release branch other than fixes for the outstanding issues)
When you have a build which has the features you want and has passed all of the necessary tests, deploy it.
If you have a small team, a stable product, a fast build, and efficient, high-quality tests then this entire process might be 100% automated and could take place in minutes.
I recommend Subversion. Setting up a repository and using it is actually fairly trivial, even from the command line. Here's how it would go:
if you haven't set up your repo (repository)
1) Make sure you've got Subversion installed on your server
$ which svn
/usr/bin/svn
which is a tool that tells you the path to another tool. If it returns nothing, that tool is not installed on your system.
1b) If not, get it
$ apt-get install subversion
apt-get is a tool that installs other tools onto your system
If that's not the right name for subversion in apt, try this
$ apt-cache search subversion
or this
$ apt-cache search svn
Find the right package name and install it using apt-get install packagename
2) Create a new repository on your server
$ cd /path/to/directory/of/repositories
$ svnadmin create my_repository
svnadmin create reponame creates a new repository in the present working directory (pwd) with the name reponame
You are officially done creating your repository
if you have an existing repo, or have finished setting it up
1) Make sure you've got Subversion installed on your local machine per the instructions above
2) Check out the repository to your local machine
$ cd /repos/on/your/local/machine
$ svn co svn+ssh://www.myserver.com/path/to/directory/of/repositories/my_repository
svn co is the command you use to check out a repository
3) Create your initial directory structure (optional)
$ cd /repos/on/your/local/machine
$ cd my_repository
$ svn mkdir branches
$ svn mkdir tags
$ svn mkdir trunk
$ svn commit -m "Initial structure"
svn mkdir runs a regular mkdir, creating a directory in the present working directory with the name you supply, and then schedules it for addition to the repository.
svn commit -m "" sends your changes to the repository and updates it. Whatever you place in the quotes after -m is the comment for this commit (make it count!).
The "working copy" of your code would go in the trunk directory. branches is used for working on individual projects outside of trunk; each directory in branches is a copy of trunk for a different sub project. tags is used more releases. I suggest just focusing on trunk for a while and getting used to Subversion.
working with your repo
1) Add code to your repository
$ cd /repos/on/your/local/machine
$ svn add my_new_file.ext
$ svn add some/new/directory
$ svn add some/directory/*
$ svn add some/directory/*.ext
The second to last line adds every file in that directory. The last line adds every file with the extension .ext.
2) Check the status of your repository
$ cd /repos/on/your/local/machine
$ svn status
That will tell you if there are any new files, updated files, files with conflicts (differences between your local version and the version on the server), etc.
3) Update your local copy of your repository
$ cd /repos/on/your/local/machine
$ svn up
Updating pulls any new changes from the server you don't already have
svn up does care what directory you're in. If you want to update your entire repository, make sure you're in the root directory of the repository (above trunk).
That's all you really need to know to get started. For more information I recommend you check out the Subversion Book.

How do you update your web application on the server?

I am aware of Capistrano, but it is a bit too heavyweight for me. Personally, I set up two Mercurial repositories, one on the production server and another on my local dev machine. Regularly, when a new feature is ready, I push changes from the repository on my local machine to the one on the server, then run an update on the server. This is a pretty simple and quick way to keep files in sync on several computers, but it does not help with updating databases.
What is your solution to the problem?
I used to use git push to publish to my web server, but lately I've just been using rsync. I try to make my site as agnostic as possible about where it's running (using relative paths, etc.), and so far it's worked pretty well. The only challenge is keeping databases in sync; for that I usually use the production database as the master and make regular backups and imports into my testing database.
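For the record, a typical rsync publish along those lines (paths and host are placeholders) is a one-liner:

# push the local site to the server, deleting remote files that no longer exist locally
rsync -avz --delete --exclude='.git/' ./site/ user@webserver:/var/www/site/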
Or Fabric, if you prefer Python.
What's heavyweight about Capistrano? If you want to sync files, then sure, rsync is great. But if you're then going to need to do DB updates, maybe Cap isn't so bad?
I'm assuming you're speaking of Ruby on Rails.
Check out the HowTo wiki:
http://wiki.rubyonrails.com/rails/pages/Howtos#deployment
#Andrew
To use git push to deploy your site, you first need to set up a remote in your .git/config file to push to. Then you need to configure a hook that will basically perform a git reset --hard to copy the code you just pushed to the repository into the working directory.
I know this is a little vague, but I actually deleted the server-side .git folder once I switched to rsync, so I don't have the exact scripts that I used to make the magic happen. That might be a good candidate for a full question though, so you might get more responses that way.
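For reference, the usual shape of that setup (a reconstruction under assumptions, since the exact scripts are gone; paths are placeholders, and on current Git you also have to allow pushing to the checked-out branch):

# on the server, inside the live checkout
git config receive.denyCurrentBranch ignore
# install a post-receive hook that resets the working tree after each push
cat > .git/hooks/post-receive <<'EOF'
#!/bin/sh
cd ..
GIT_DIR=.git git reset --hard
EOF
chmod +x .git/hooks/post-receive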
Edit: I know it's been a while, but I eventually found what I was using again:
Deploy a project using Git push