Perforce: Prevent keywords from being expanded when syncing files out of the depot?

I have a situation where I'd like to diff two branches in Perforce. Normally I'd use diff2 to do a server-side diff but in this case the files on the branches are so large that the diff2 call ends up filling up /tmp on my server trying to diff them and the diff fails.
I can't bring down my server to rectify this, so I'm looking at checking out the content to disk and using diff on the command line to inspect and compare the content.
The trouble is: most of the files have RCS keywords in them that are being expanded.
I know I can remove keyword expansion from a file by opening the files for edit and removing the +k modifier from the file type in the process, but that seems a bit brute force. I was hoping I could just tell the p4 sync command not to expand the keywords on checkout, but I can't seem to find a way to do this. Is it possible?
As a possible alternative solution, does anyone know if you can tell p4 diff2 which directory to use for temporary space when you call it? If I could tell it to use abundant NAS space instead of /tmp on the Perforce server I might be able to make it work.
I'm using a 2010.x version of Perforce, if that changes the answer in any way.

There's no way I know of to disable keyword expansion on sync. Here's what I would try:
1) Create a branch spec between the two sets of files
2) Run "p4 files //path/to/files/... | cut -d '#' -f 1 > tmp"
Path to files above should be the right hand side of the branch spec you created
3) p4 -x tmp diff2 -b <name of the branch spec from step 1>
This tells p4 to iterate over the lines of text in 'tmp' and treat them as arguments to the command. I think /tmp on your server will get cleared in-between each file this way, preventing it from filling up.
I unfortunately don't have files large enough to test that it works, so this is entirely theoretical.
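If you would rather build the file list from a script than with cut, here is a minimal Python sketch of step 2. It is untested against a real server and assumes that `p4 files` prints lines of the form `//depot/path#rev - action change N (type)`; the helper names and the sample paths are made up for illustration.

```python
def strip_revision(p4_files_line):
    """Return the depot path from one line of `p4 files` output.

    Assumes a line such as '//depot/dir/file.c#3 - edit change 42 (text)'.
    Perforce stores a literal '#' in a file name as '%23', so the first
    '#' on the line is taken to start the revision number.
    """
    return p4_files_line.split("#", 1)[0].strip()


def depot_paths(p4_files_output):
    """Turn the full text output of `p4 files` into a list of depot paths."""
    return [
        strip_revision(line)
        for line in p4_files_output.splitlines()
        if line.strip()
    ]


# Hypothetical sample output, only here to show the shape of the data.
sample = (
    "//depot/branchB/big/file1.dat#3 - edit change 1201 (text+k)\n"
    "//depot/branchB/big/file2.dat#1 - add change 1187 (text+k)\n"
)
for path in depot_paths(sample):
    print(path)
```

Writing these paths one per line to a file gives the same input that p4 -x reads in step 3.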
To change the temp directory that p4d uses, just set TEMP or TMP to a different path and restart p4d. If you're on Windows, make sure to call 'p4 set -S perforce TMP=' to set the variable for the Perforce service; without the -S perforce you'll just set it for the current user.

Related

How to read a file without checking out in perforce

I'm writing a syntax check tool to parse several files on different branches.
Is there a way for me to read the contents without checking out the file?
The tool is written in Perl.
`p4 print //depot/path/to/file`;
(Usual requirements for running a p4 command apply -- make sure the p4 executable is in your PATH, make sure you're authenticated with p4 login, make sure you're connecting to the right server, etc.)
See p4 help print for more info on the print command -- you might find the -q and/or -o flags helpful depending on what exactly you need to do with the output.
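For comparison, here is a rough sketch of the same call from a script, in Python rather than Perl. It only uses the -q flag mentioned above; the function names are my own, and `read_depot_file` assumes the p4 executable is on your PATH and that you are already logged in.

```python
import subprocess


def p4_print_command(depot_path, quiet=True):
    """Build the argument list for `p4 print`."""
    cmd = ["p4", "print"]
    if quiet:
        # -q suppresses the header line with the file name and revision
        cmd.append("-q")
    cmd.append(depot_path)
    return cmd


def read_depot_file(depot_path):
    """Return the file contents as bytes without syncing or opening the file."""
    result = subprocess.run(
        p4_print_command(depot_path), capture_output=True, check=True
    )
    return result.stdout


# Show the command that would be run for the path used in the answer.
print(" ".join(p4_print_command("//depot/path/to/file")))
```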

Can we wget with file list and renaming destination files?

I have this wget command:
sudo wget --user-agent='some-agent' --referer=http://some-referrer.html -N -r -nH --cut-dirs=x --timeout=xxx --directory-prefix=/directory/for/downloaded/files -i list-of-files-to-download.txt
-N will check if there is actually a newer file to download.
-r will turn the recursive retrieving on.
-nH will disable the generation of host-prefixed directories.
--cut-dirs=X will avoid the generation of the host's subdirectories.
--timeout=xxx will, well, timeout :)
--directory-prefix will store files in the desired directory.
This works nicely, no problem.
Now, to the issue:
Let's say my files-to-download.txt has these kind of files:
http://website/directory1/picture-same-name.jpg
http://website/directory2/picture-same-name.jpg
http://website/directory3/picture-same-name.jpg
etc...
You can see the problem: on the second download, wget will see we already have a picture-same-name.jpg, so it won't download the second or any of the following ones with the same name. I cannot mirror the directory structure because I need all the downloaded files to be in the same directory. I can't use the -O option because it clashes with -N, and I need that. I've tried to use -nd, but it doesn't seem to work for me.
So, ideally, I need to be able to:
a.- wget from a list of URLs the way I do now, keeping my parameters.
b.- get all files into the same directory and be able to rename each file.
Does anybody have any solution to this?
Thanks in advance.
I would suggest 2 approaches -
Use the "-nc" or the "--no-clobber" option. From the man page -
-nc
--no-clobber
If a file is downloaded more than once in the same directory, Wget's behavior depends on a few options, including -nc. In certain cases, the local file will be clobbered, or overwritten, upon repeated download. In other cases it will be preserved.
When running Wget without -N, -nc, -r, or -p, downloading the same file in the same directory will result in the original copy of file being preserved and the second copy being named file.1. If that file is downloaded yet again, the third copy will be named file.2, and so on. (This is also the behavior with -nd, even if -r or -p are in effect.) When -nc is specified, this behavior is suppressed, and Wget will refuse to download newer copies of file. Therefore, "no-clobber" is actually a misnomer in this mode---it's not clobbering that's prevented (as the numeric suffixes were already preventing clobbering), but rather the multiple version saving that's prevented.
When running Wget with -r or -p, but without -N, -nd, or -nc, re-downloading a file will result in the new copy simply overwriting the old. Adding -nc will prevent this behavior, instead causing the original version to be preserved and any newer copies on the server to be ignored.
When running Wget with -N, with or without -r or -p, the decision as to whether or not to download a newer copy of a file depends on the local and remote timestamp and size of the file. -nc may not be specified at the same time as -N.
A combination with -O/--output-document is only accepted if the given output file does not exist.
Note that when -nc is specified, files with the suffixes .html or .htm will be loaded from the local disk and parsed as if they had been retrieved from the Web.
As you can see from this man page entry, the behavior might be unpredictable/unexpected. You will need to see if it works for you.
Another approach would be to use a bash script. I am most comfortable using bash on *nix, so forgive the platform dependency. However the logic is sound, and with a bit of modifications, you can get it to work on other platforms/scripts as well.
Sample pseudocode bash script -
# read one URL per line; quoting "$url" keeps odd characters intact
while IFS= read -r url; do
    wget <all your flags except the -i flag> "$url" -O /path/to/custom/directory/filename
done < list-of-files-to-download.txt
You can modify the script to download each file to a temporary file, parse $i to get the filename from the URL, check if the file exists on the disk, and then take a decision to rename the temp file to the name that you want.
This offers much more control over your downloads.
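One way to pick the target name for each URL is to fold the directory part of the URL into the file name. This is only a sketch of the renaming step, in Python for brevity; the `local_name` helper is my own invention, and it only guarantees distinct names for URLs shaped like the ones in the question.

```python
from urllib.parse import urlparse


def local_name(url):
    """Build a flat file name from the path part of a URL.

    'http://website/directory1/pic.jpg' becomes 'directory1_pic.jpg',
    so files with the same base name in different directories no longer
    collide when they are stored in one directory.
    """
    path = urlparse(url).path.strip("/")
    return path.replace("/", "_")


urls = [
    "http://website/directory1/picture-same-name.jpg",
    "http://website/directory2/picture-same-name.jpg",
    "http://website/directory3/picture-same-name.jpg",
]
for url in urls:
    print(local_name(url))
```

The resulting name can then be used as the argument to -O in the loop above.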

Creating a script that compares multiple files in multiple servers

I have several different linux servers, all of which are essentially mirrors of each other. However, some of them have gone out of sync (file A in machine 1 is different from file B in machine 2).
I'm in the process of designing a script (shell or Perl only) that will systematically walk through certain directories and diff the corresponding files in the different machines against each other, and generate a meaningful report. Later on, I will try to sync up the files.
These are my thoughts so far on how to approach this:
sftp files to /tmp and diff locally
using ssh and diff
using rsync
My question is: what is the best way to systematically compare two files that are in different machines (but similar directory structure), and are there any built-in Perl utilities that may be helpful?
rsync will figure out the difference and sync your files by sending only the diff. Once two folders get synced, it will be pretty quick. (But the 1st time to sync will take some time)
You can also use git here. One possible workflow: just check in all the files you want to compare (or complete directories using git add -A). Then create an empty git repository on your local workstation, which is used to fetch all the other repositories and to do the comparisons:
git init
git remote add firstmachine ssh://user@firstmachine/path/to/directory
git remote add othermachine ssh://user@othermachine/path/to/directory
git fetch --all
Now the contents of two machines may be compared:
git diff remotes/firstmachine/master remotes/othermachine/master
Or just compare the contents of a specific file:
git diff remotes/firstmachine/master remotes/othermachine/master -- file/to/compare
It's not strictly necessary to use a third machine for the comparisons. You can also git-fetch the contents from othermachine to firstmachine.
I worked on a similar tool (in Python). It ran a cron job at a set time each night that brought tar-bzipped copies of the files to one server, extracted the directories and ran a recursive diff on them. The diff output was then fed through some Python scripts that analysed the diff hunks (+ lines, ! lines, etc.) to measure the amount of change.
I'm not sure whether there are pre-built modules for this in Perl or Python, but helper utilities are likely to be available in one of them.
If you need to know the difference between some local and remote file systems, the following method minimizes the network load:
make a local copy ($C) of the local directory ($D) you want to compare. I.e.:
cp -R $D $C
use rsync to copy the remote directory ($R) you want to compare over $C:
rsync -av --delete $remote_host:$R $C
compare $D to $C:
diff -u $D $C
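If you want a report rather than raw diff output, the last step can be scripted. Below is a minimal Python sketch (the question asked for shell or Perl, so treat it as an illustration of the logic only); `compare_trees` is a name I made up, and it compares file contents byte by byte using the standard filecmp module.

```python
import filecmp
import os


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def compare_trees(left, right):
    """Return (only_in_left, only_in_right, differing) as sorted lists."""
    left_files = list_files(left)
    right_files = list_files(right)
    differing = sorted(
        path
        for path in left_files & right_files
        # shallow=False forces a comparison of the actual contents
        if not filecmp.cmp(
            os.path.join(left, path), os.path.join(right, path), shallow=False
        )
    )
    return (
        sorted(left_files - right_files),
        sorted(right_files - left_files),
        differing,
    )
```

Calling compare_trees(D, C) with the two directories from the steps above gives three lists that can be printed as the report.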

What is the command line syntax to delete files in Perforce?

I am creating some build scripts that interact with Perforce and I would like to mark for delete a few files. What exactly is the P4 syntax using the command line?
p4 delete filename
(output of p4 help delete)
delete -- Open an existing file to delete it from the depot
p4 delete [ -c changelist# ] [ -n ] file ...
Opens a file that currently exists in the depot for deletion.
If the file is present on the client it is removed. If a pending
changelist number is given with the -c flag the opened file is
associated with that changelist, otherwise it is associated with
the 'default' pending changelist.
Files that are deleted generally do not appear on the have list.
The -n flag displays what would be opened for delete without actually
changing any files or metadata.
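For a build script, the synopsis above maps onto an argument list fairly directly. The following Python sketch only builds the command and does not run it; the function name, the changelist number 1234 and the file names are placeholders of mine.

```python
def p4_delete_command(files, changelist=None, preview=False):
    """Build the argument list for `p4 delete [-c changelist#] [-n] file ...`."""
    files = list(files)
    if not files:
        raise ValueError("p4 delete needs at least one file")
    cmd = ["p4", "delete"]
    if changelist is not None:
        # associate the opened files with a pending changelist
        cmd += ["-c", str(changelist)]
    if preview:
        # -n only shows what would be opened for delete
        cmd.append("-n")
    return cmd + files


# Preview the deletion of two hypothetical files in changelist 1234.
print(p4_delete_command(["old_a.txt", "old_b.txt"], changelist=1234, preview=True))
```

Passing the list to subprocess.run would execute it; keeping preview=True first is a cheap way to check what a script is about to do.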
Teach a man to fish:
p4 help - gets you general command syntax
p4 help commands - lists the commands
p4 help <command name> - provides detailed help for a specific command
http://www.perforce.com/perforce/doc.062/manuals/boilerplates/quickstart.html
Deleting files
To delete files from both the Perforce server and your workspace, issue the p4 delete command. For example:
p4 delete demo.txt readme.txt
The specified files are removed from your workspace and marked for deletion from the server. If you decide you don't want to delete the files after all, issue the p4 revert command. When you revert files opened for delete, Perforce restores them to your workspace.
Admittedly, it takes a (small) number of steps to find the (excellent!) Perforce user guide online in the version that matches your installation and get to the chapter with the information you need.
Whenever I find myself in need of anything about the p4 command line client, I rely on the help Perforce have built into it. Accessing it could not be easier:
on the command line, enter p4
This gets you to the information Michael Burr has shown in his answer (and some more).
If you do not get a help screen right away, something is wrong with your client configuration, e.g. P4PORT is not set properly. You obviously need to fix that first.

How to find untracked files in a Perforce tree? (analogue of svn status)

Anybody have a script or alias to find untracked (really: unadded) files in a Perforce tree?
EDIT: I updated the accepted answer on this one since it looks like P4V added support for this in the January 2009 release.
EDIT: Please use p4 status now. There is no need for jumping through hoops anymore. See @ColonelPanic's answer.
In the Jan 2009 version of P4V, you can right-click on any folder in your workspace tree and click "reconcile offline work..."
This will do a little processing then bring up a split-tree view of files that are not checked out but have differences from the depot version, or not checked in at all. There may even be a few other categories it brings up.
You can right-click on files in this view and check them out, add them, or even revert them.
It's a very handy tool that's saved my ass a few times.
EDIT: ah the question asked about scripts specifically, but I'll leave this answer here just in case.
On linux, or if you have gnu-tools installed on windows:
find . -type f -print0 | xargs -0 p4 fstat >/dev/null
This will show an error message for every unaccounted file. If you want to capture that output:
find . -type f -print0 | xargs -0 p4 fstat >/dev/null 2>mylogfile
Under Unix:
find -type f ! -name '*~' -print0| xargs -0 p4 fstat 2>&1|awk '/no such file/{print $1}'
This will print out a list of files that are not added in your client or the Perforce depot. I've used ! -name '*~' to exclude files ending with ~.
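Note that awk's $1 stops at the first space, so file names containing blanks get cut short. Here is a small Python sketch of the same filtering that keeps the whole name. It assumes the error lines end in " - no such file(s)." and that you have captured the stderr of p4 fstat as text; the function name and sample lines are mine.

```python
def untracked_from_fstat(stderr_text):
    """Extract file names from 'name - no such file(s).' error lines."""
    marker = " - no such file(s)."
    paths = []
    for line in stderr_text.splitlines():
        line = line.rstrip()
        if line.endswith(marker):
            # drop the marker, keep everything before it (spaces included)
            paths.append(line[: -len(marker)])
    return paths


# Hypothetical captured stderr, just to show the idea.
sample = (
    "./src/new_module.c - no such file(s).\n"
    "./notes/todo list.txt - no such file(s).\n"
    "some unrelated warning\n"
)
for path in untracked_from_fstat(sample):
    print(path)
```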
Ahh, one of the Perforce classics :) Yes, it really sucks that there is STILL no easy way for this built into the default commands.
The easiest way is to run a command to find all files under your clients root, and then attempt to add them to the depot. You'll end up with a changelist of all new files and existing files are ignored.
E.g dir /s /b /A-D | p4 -x - add
(use 'find . -type f -print' from a *nix command line).
If you want a physical list (in the console or file) then you can pipe on the results of a diff (or add if you also want them in a changelist).
If you're running this within P4Win you can use $r to substitute the client root of the current workspace.
Is there an analogue of svn status or git status?
Yes, BUT.
As of Perforce version 2012.1, there's the command p4 status and in P4V 'reconcile offline work'. However, they're both very slow. To exclude irrelevant files you'll need to write a p4ignore.txt file per https://stackoverflow.com/a/13126496/284795
2021-07-16: THIS ANSWER MAY BE OBSOLETE.
I am reasonably sure that it was accurate in 2016, for whatever version of Perforce I was using then (which was not necessarily the most current). But it seems that this problem or design limitation has been remedied in subsequent releases of Perforce. I do not know what the Stack Overflow etiquette for this is -- should this answer be removed?
2016 ANSWER
I feel impelled to add an answer, since the accepted answer, and some of the others, have what I think is a significant problem: they do not understand the difference between a read-only query command, and a command that makes changes.
I don't expect any credit for this answer, but I hope that it will help others avoid wasting time and making mistakes by following the accepted but IMHO incorrect answer.
---+ BRIEF
Probably the most convenient way to find all untracked files in a perforce workspace is p4 reconcile -na.
-a says "give me files that are not in the repository, i.e. that should be added".
-n says "make no changes" - i.e. a dry-run. (Although the messages may say "opened for add", mentally you must interpret that as "would be opened for add if not -n")
Probably the most convenient way to find all local changes made while offline - not just files that might need to be added, but also files that might need to be deleted, or which have been changed without being opened for editing via p4 edit, is p4 reconcile -n.
Several answers provided scripts, often involving p4 fstat. While I have not verified all of those scripts, I often use similar scripts to make up for the deficiencies of perforce commands such as p4 reconcile -n - e.g. often I find that I want local paths rather than Perforce depot paths or workspace paths.
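As an illustration of that kind of post-processing, here is a minimal Python sketch that pulls the "would be added" paths out of the text printed by p4 reconcile -na. The line format (`path#rev - opened for add`) is an assumption based on the messages mentioned above, so check it against your own server's output; the function name and sample lines are made up.

```python
def would_be_added(reconcile_output):
    """Return the paths reported as 'opened for add' in a dry run."""
    suffix = " - opened for add"
    paths = []
    for line in reconcile_output.splitlines():
        line = line.rstrip()
        if line.endswith(suffix):
            path_with_rev = line[: -len(suffix)]
            # strip the trailing '#rev' if there is one
            paths.append(path_with_rev.rsplit("#", 1)[0])
    return paths


# Hypothetical dry-run output.
sample = (
    "//depot/main/src/new_file.c#1 - opened for add\n"
    "//depot/main/src/changed.c#4 - opened for edit\n"
    "//depot/main/docs/notes.txt#1 - opened for add\n"
)
for path in would_be_added(sample):
    print(path)
```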
---+ WARNING
p4 status is NOT the counterpart to the status commands on other version control systems.
p4 status is NOT a read-only query. p4 status actually finds the same sort of changes that p4 reconcile does, and adds them to the repository. p4 status does not seem to have a -n dry-run option like p4 reconcile does.
If you do p4 status, look at the files and think "Oh, I don't need those", then you will have to p4 revert them if you want to continue editing in the same workspace. Or else the changes that p4 status added to your changeset will be checked in the next time.
There seems to be little or no reason to use p4 status rather than p4 reconcile -n, except for some details of local workspace vs depot pathname.
I can only imagine that whoever chose 'status' for a non-read-only command had limited command of the English language and other version control tools.
---+ P4V GUI
In the GUI p4v, the reconcile command finds local changes that may need to be added, deleted, or opened for editing. Fortunately it does not add them to a changelist by default; but you still may want to be careful to close the reconcile window after inspecting it, if you don't want to commit the changes.
Alternatively, from P4Win, use the "Local Files not in Depot" option on the left hand view panel.
I don't use P4V much, but I think the equivalent is to select "Hide Local Workspace Files" in the filter dropdown of the Workspace view tab.
p4 help fstat
In P4V 2015.1 you'll find these options under the filter button.
I use the following in my tool that backs up any files in the workspace that differ from the repository (for Windows). It handles some odd cases that Perforce doesn't like much, like embedded blanks, stars, percents, and hashmarks:
dir /S /B /A-D | sed -e "s/%/%25/g" -e "s/@/%40/g" -e "s/#/%23/g" -e "s/\*/%2A/g" | p4 -x- have 1>NUL:
"dir /S /B /A-D" lists all files at or below this folder (/S) in "bare" format (/B) excluding directories (/A-D). The "sed" changes dangerous characters to their "%xx" form (a la HTML), and the "p4 have" command checks this list ("-x-") against the server discarding anything about files it actually locates in the repository ("1>NUL:"). The result is a bunch of lines like:
Z:\No_Backup\Workspaces\full\depot\Projects\Archerfish\Portal\Main\admin\html\images\nav\navxx_background.gif - file(s) not on client.
Et voilà!
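The escaping step on its own, as a short Python sketch. It covers the four characters Perforce treats specially in file names (%, @, # and *); the only subtle point is that % has to be replaced first, otherwise the percent signs introduced by the other replacements would be escaped a second time. The function name is my own.

```python
def p4_escape(path):
    """Replace Perforce's special file name characters with %xx codes."""
    replacements = (
        ("%", "%25"),  # must come first, see the note above
        ("@", "%40"),
        ("#", "%23"),
        ("*", "%2A"),
    )
    for char, code in replacements:
        path = path.replace(char, code)
    return path


print(p4_escape("a%b@c#d*e"))
```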
Quick 'n Dirty: In p4v right-click on the folder in question and add all files underneath it to a new changelist. The changelist will now contain all files which are not currently part of the depot.
The following commands produce status-like output, but none is quite equivalent to svn status or git status, providing a one-line summary of the status of each file:
p4 status
p4 opened
p4 diff -ds
I don't have enough reputation points to comment, but Ross' solution also lists files that are open for add. You probably do not want to use his answer to clean your workspace.
The following uses p4 fstat (thanks Mark Harrison) instead of p4 have, and lists the files that aren't in the depot and aren't open for add.
dir /S /B /A-D | sed -e "s/%/%25/g" -e "s/@/%40/g" -e "s/#/%23/g" -e "s/\*/%2A/g" | p4 -x- fstat 2>&1 | sed -n -e "s/ - no such file[(]s[)]\.$//gp"
===Jac
A fast method, but a little unorthodox. If the codebase doesn't add new files or change its view too often, you could create a local git repository out of your checkout. From a clean Perforce sync, run git init, then add and commit all files locally. git status is fast and will show files not previously committed.
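To turn that into a plain list of candidate files, you can parse the machine-readable form of git status. A minimal Python sketch, assuming you have captured the text of `git status --porcelain`, where untracked entries start with "?? "; the function name and sample are mine, and names with unusual characters may appear quoted in real output.

```python
def untracked_from_porcelain(porcelain_output):
    """Return the paths of untracked entries ('?? path' lines)."""
    return [
        line[3:]
        for line in porcelain_output.splitlines()
        if line.startswith("?? ")
    ]


# Hypothetical captured output of `git status --porcelain`.
sample = "?? new.c\n M changed.c\n?? docs/notes.txt\n"
for path in untracked_from_porcelain(sample):
    print(path)
```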
The p4 fstat command lets you test whether a file exists in the workspace. Combine it with find to locate the files to check, as in the following Perl example:
use strict;
use warnings;

# file that collects the error output of p4 fstat (the name is a placeholder)
my $outputFile = 'p4_fstat_errors.txt';

# find:
#   -type f  :- only look at files
#   -print0  :- terminate strings with \0s to support filenames with spaces
# xargs:
#   groups its input into command lines
#   -0       :- read input strings terminated with \0s
# p4:
#   fstat    :- fetch workspace stat on files
# stdout is thrown away; stderr (the errors) goes to the output file
system "(find . -type f -print0 | xargs -0 p4 fstat > /dev/null) 2> $outputFile";

# read the output file
open my $fh, '<', $outputFile or die "$!\n";
# iterate over all the lines
while (<$fh>) {
    # remove the trailing newline
    chomp;
    # keep lines which contain 'no such file' or 'not in client'
    if (m/no such file/ || m/not in client/) {
        # remove the message that follows the file name
        s/\s+-\s.*//;
        # optional: strip the leading './'
        s/^\.\///;
        print "$_\n";
    }
}
close $fh;
Or you can use p4 reconcile -n -m ...
If it is 'opened for delete' then it has been removed from the workspace. Note that the above command is running in preview mode (-n).
I needed something that would work in either Linux, Mac or Windows. So I wrote a Python script for it. The basic idea is to iterate through files and execute p4 fstat on each. (of course ignoring dependencies and tmp folders)
You can find it here: https://gist.github.com/givanse/8c69f55f8243733702cf7bcb0e9290a9
This command can give you a list of files that need to be added, edited or removed:
p4 status -aed ...
You can use the flags separately too:
p4 status -a ...
p4 status -e ...
p4 status -d ...
In P4V, under the "View" menu item choose "Files in Folder" which brings up a new tab in the right pane.
To the far right of the tabs there is a little icon that brings up a window called "Files in Folder" with 2 icons.
Select the left icon that looks like a funnel and you will see several options. Choose "Show items not in Depot" and all the files in the folder will show up.
Then just right-click on the file you want to add and choose "Mark for Add...". You can verify it is there in the "Pending" tab.
Just submit as normal (Ctrl+S).