What's the best way to perform a differential between a list of directories? - perl

I am interested in comparing a previous list of directories with the current list, and setting up a script to do so, maybe in Perl or as a shell script.
Should I use something like diff? Programmatically, what would be an ideal way to do this? For example, say I write the diff to an output file: if there are no differences I want to exit, and if there are differences I want to see them.
Say, for example, I have the following directories today:
/foo/bar/staging/abc
/foo/bar/staging/def
/foo/bar/staging/a1b2c3
The next day it would look like this, where a directory has been added or renamed:
/foo/bar/staging/abc
/foo/bar/staging/def
/foo/bar/staging/ghi
/foo/bar/staging/a1b2c4

There might be better ways, but the way I typically do something like this is to run a find command in each directory root, and pipe the output to separate files. You can then diff the files using the diff tool of your choice. If you want to filter out certain directories or files, you can throw in some grep or grep -v commands in the pipeline, or you can experiment with options on the find command.
The other main option is to find a diff tool that offers directory/folder comparisons. Most of the good ones support this, but I like the command line method, because you get more control over what you're diffing.
cd /my/directory/one
find . -print | sort > /temp/one.txt
cd /my/directory/two
find . -print | sort > /temp/two.txt
diff /temp/one.txt /temp/two.txt
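To address the "if there is no diff then exit, otherwise show it" part of the question, here is a rough sketch of how that could be scripted (the /tmp paths and file names are placeholders, not anything from the original post):
#!/bin/sh
# Rough sketch: snapshot the directory list, diff it against yesterday's
# snapshot, and only produce output when something changed.
STAGING=/foo/bar/staging
TODAY=/tmp/staging.today.txt
YESTERDAY=/tmp/staging.yesterday.txt
DIFFOUT=/tmp/staging.diff.txt

find "$STAGING" -print | sort > "$TODAY"

if [ -f "$YESTERDAY" ]; then
    if diff "$YESTERDAY" "$TODAY" > "$DIFFOUT"; then
        :                        # exit status 0: no differences, nothing to do
    else
        cat "$DIFFOUT"           # or mail it, log it, etc.
    fi
fi

mv "$TODAY" "$YESTERDAY"         # today's list is the baseline for the next run
Run it from cron once a day and it stays quiet unless the directory list changed.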

Also check out the inotifywait command; it allows you to monitor files in real time.
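For example (inotifywait is part of the Linux inotify-tools package, so this assumes a Linux box), watch the staging directory and print a line whenever something is created, deleted, or moved in or out of it:
# -m keeps watching instead of exiting after the first event
inotifywait -m -e create -e delete -e move /foo/bar/staging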

You might also consider the find command using the -newer switch.
The usage is:
find . -newer timefile.txt -print
The -newer switch makes find return the list of files created or modified after the specified file's modification time. In the example above, any file created or updated after timefile.txt would be returned. You'd have to create (or touch) timefile.txt, most likely once per day. Some versions of find also have variations of -newer that compare against a file's other timestamps (access time, status-change time, and so on).
This technique would not report a file that was deleted, however. A daily diff of the file listings could report that.
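A sketch of how the daily cycle might look (the paths are illustrative, and the time stamp file has to be created once, e.g. with touch, before the first run):
# report anything changed since the last run, then reset the reference file
cd /foo/bar/staging || exit 1
find . -newer /var/tmp/timefile.txt -print > /var/tmp/changed_today.txt
touch /var/tmp/timefile.txt                                          # reference point for the next run
[ -s /var/tmp/changed_today.txt ] && cat /var/tmp/changed_today.txt  # only show output when something changed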

Is there a way to skip given files when performing a cleartool findmerge?

In our development environment, we have certain files that are autogenerated by some parsing tools, and they should never be merged from one branch to another. We do have them under source control, however, so that only one user needs to run the generation tool for any given branch; all other users will get the generated files automatically.
Is there a way to tell "findmerge" to skip these files when it traverses the VOB? If findmerge cannot resolve the differences in a file, it loads the default diff tool so the user can resolve the differences manually. For these autogenerated files, this is a waste of time; the user just has to cancel it and then run the autogeneration tool when the findmerge is complete.
If it matters, we use dynamic views.
You might consider the same approach as the one used for binary files:
Your project manager can overcome this problem by creating a special element type for the binary file type and specifying one of the following mergetypes:
never: A merge or findmerge operation ignores versions whose element type has never as a mergetype.
So, following the documentation, something like:
cleartool mkeltype -supertype file -mergetype never -nc FILE_NEVER_MERGE
And then, in the folder with your generated files (here for instance for XML files):
ct find path/to/generated -type f -ele "{eltype(xml)}" -exec "cleartool chtype -force FILE_NEVER_MERGE %CLEARCASE_PN%"
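(ct above is presumably just an alias for cleartool.) The command uses the Windows-style %CLEARCASE_PN% variable; on Unix/Linux, as far as I recall, the same thing would be written with single quotes so that the calling shell does not expand $CLEARCASE_PN before cleartool sees it:
# Unix/Linux variant of the same idea (check the quoting against your setup)
cleartool find path/to/generated -type f -ele "{eltype(xml)}" -exec 'cleartool chtype -force FILE_NEVER_MERGE "$CLEARCASE_PN"'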

Search Eclipse local history

Does Eclipse provide (possibly via a plug-in) the ability to search the local history?
E.g., I have a lot of history files and don't want to have to trawl through them all, because I know that the version I want is the most recent one that contains the string "slithy toves".
[Update] People answering similar questions on this site have not read the question – or it was badly phrased.
I am not looking to go to the local history (which I do know how to find) and manually search through every entry, version by version. I want a single search function which will do that for me.
[Update++] The referenced question does not contain an acceptable answer. The only solution offered there involves creating a dummy project, which is more overhead than I care for.
Note: if it helps anyone, I found that the local history can be found in
Workspace\.metadata\.plugins\org.eclipse.core.resources\.history\
from which, I can use any file search tool of my choice
Manually search the local history
My answer to the question is "No, you cannot do that search from within Eclipse".
Nevertheless, in this answer to another question you can see the (linux) command line (and the explanation of what it does) to search the local history structure if you know specific code that was in the file you seek (like you indicated "slithy toves"):
fgrep -r -c "slithy toves" * | grep -v ":0" | cut -d : -f 1 | xargs ls -l
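Briefly, that pipeline counts matches per file (-c), drops the files with no match (grep -v ":0"), keeps just the file names (cut), and lists them with ls -l. A slightly shorter variant, assuming GNU grep and ls and run from inside the .history directory, lists the matching history files newest first:
# names of history files containing the string, sorted newest first
fgrep -r -l "slithy toves" . | xargs ls -lt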
As noted in the question, the local history is stored in
Workspace\.metadata\.plugins\org.eclipse.core.resources\.history\
so any file search tool of my choice can be pointed at it.
However, the file names all look like 001ad08cc7790016142da217e60cb1a5, so I can't search for widget.c :-(
There also seems to be no index: I searched for files containing 001ad08cc7790016142da217e60cb1a5, hoping to find some XML that maps it back to the original file name, but I found nothing :-(
Also, some of the saved files seem to be binary, and I can't see how to configure Eclipse to save only *.c and *.h

Bourne Shell delete oldest file with DOY extension

I am relatively new to Bourne scripting (running on Solaris), and I am struggling with this simple problem for some reason. I am creating a script that will run in a directory and try to delete files older than a certain date.
The files are of the form: log.DOY, so for example log.364, log.365, log.001, etc.
Now this would be easy if it wasn't for the pesky rollover, especially with it not always being 365 as a max (leap years).
I have debated using find -mtime, but it would be preferable to use the file extension if possible.
Do any of you scripting magicians have any suggestions?
Your choice of find with -mtime is close, but there is a potentially easier way. You say you would like to remove files older than the date of some measuring file (say all files older than log.287 -- including log.287).
find provides the -newer option, which will do just that. The following is a short script that takes the measuring filename as its first argument and prints (you can add the delete yourself) all files in that directory, non-recursively thanks to the -maxdepth 1 option. The printf is there for testing, to ensure there are no "oops" accidents. Let me know if you have questions:
#!/bin/sh
## usage: /bin/sh thisscript.sh log.287   (script name is illustrative; log.287 is the measuring file)
## list every regular file in the current directory that is NOT newer than "$1"
find . -maxdepth 1 -type f ! -newer "$1" |
while read filenm; do
    printf "%s\n" "$filenm"     ## you can add rm to remove the file
done
Note: check your version of read. The POSIX compliant use is shown above, but if you have the -r option, I would suggest its use as well.
I don't have Solaris handy to check, but I don't think this is practical purely in shell script unless you happen to have non-standard CLI tools available (such as GNU Coreutils).
Specifically, figuring out the end-of-year wrap depends on knowing what day of the year it is right now, and I don't see a way to do that in the documentation I can find. (It can be done in GNU date using +%j as the format.)
However, the docs do say that you should have perl, so I would look to use that.
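As a rough, untested sketch of that idea (the KEEP_DAYS retention window is my own assumption, and the rollover handling presumes no log file is more than a year old): use perl to get today's day of year, then compare it with each log.DOY extension:
#!/bin/sh
# Sketch only: KEEP_DAYS is an assumed retention window.
KEEP_DAYS=30

# today's day of year, 1..366 (perl's yday is zero-based, hence the +1)
doy=`perl -e 'my @t = localtime; print $t[7] + 1'`

for f in log.*; do
    ext=`echo "$f" | sed 's/^log\.//'`
    case "$ext" in
        [0-9][0-9][0-9]) ;;                    # only act on log.DOY names
        *) continue ;;
    esac
    if [ "$ext" -gt "$doy" ]; then
        age=`expr "$doy" + 366 - "$ext"`       # file is from last year (rollover)
    else
        age=`expr "$doy" - "$ext"`
    fi
    if [ "$age" -gt "$KEEP_DAYS" ]; then
        echo "would remove $f (about $age days old)"   # swap echo for: rm -f "$f"
    fi
done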

Comparing baseline differences including file contents

I know that I can use the diffbl command to compare two baselines. But how can I also ask the command to print out the diffs of all the files that were different?
Is this even possible or do I need to write a script or something? Any pointers?
The closest you can get with cleartool alone is cleartool diffbl -versions (often abbreviated to -ver), which lists the versions that differ between the two baselines.
For each of those versions, you can then run cleartool diff -dif -pred <aversion> to print the actual diff against its predecessor (the -dif format reports differences in the style of the UNIX and Linux diff utility).
As mentioned in the comments, the diff -pred only prints the diff introduced by a specific version.
The concatenation of all those diffs represents all the modifications introduced by the new baseline compared to the old one.
In other words, that concatenation of diffs represents "print out the diffs of all the files that were different", which is what was originally asked for.
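Something like the following sketch could glue the two commands together. The baseline selectors are placeholders, and the parsing assumes diffbl -versions prints one version per line after a ">>" or "<<" marker, so check that against your own output and adjust as needed:
#!/bin/sh
# placeholders: replace with your own baseline selectors (bl_name@/pvob_tag)
BL_OLD=baseline:MY_OLD_BL@/vobs/my_pvob
BL_NEW=baseline:MY_NEW_BL@/vobs/my_pvob

cleartool diffbl -versions "$BL_OLD" "$BL_NEW" |
while read marker version; do
    case "$marker" in
        ">>"|"<<") cleartool diff -dif -pred "$version" ;;   # diff against its predecessor
    esac
done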

SAS- Reading multiple compressed data files

I hope you are all well.
So my question is about the procedure to open multiple raw data files that are compressed.
My files' names are ordered, so I have, for example: o_equities_20080528.tas.zip, o_equities_20080529.tas.zip, o_equities_20080530.tas.zip, ...
Thank you all in advance.
How much work this will be depends on whether:
You have enough space to extract all the files simultaneously into one folder
You need to be able to keep track of which file each record has come from (i.e. you can't tell just from looking at a particular record).
If you have enough space to extract everything and you don't need to track which records came from which file, then the simplest option is to use a wildcard infile statement, allowing you to import the records from all of your files in one data step:
infile "c:\yourdir\o_equities_*.tas" <other infile options as per individual files>;
This syntax works regardless of OS - it's a SAS feature, not shell expansion.
If you have enough space to extract everything in advance but you need to keep track of which records came from each file, then please refer to this page for an example of how to do this using the filevar option on the infile statement:
http://www.ats.ucla.edu/stat/sas/faq/multi_file_read.htm
If you don't have enough space to extract everything in advance, but you have access to 7-zip or another archive utility, and you don't need to keep track of which records came from each file, you can use a pipe filename and extract to standard output. If you're on a Linux platform then this is very simple, as you can take advantage of shell expansion:
filename cmd pipe "nice -n 19 gunzip -c /yourdir/o_equities_*.tas.zip";
infile cmd <other infile options as per individual files>;
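To sanity-check the decompression before wiring it into SAS, the same pipeline can be tried at the shell prompt first (same illustrative path as above):
# peek at the first few lines that SAS would read from the pipe
gunzip -c /yourdir/o_equities_*.tas.zip | head -20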
On Windows it's the same sort of idea, but since you can't rely on shell expansion, you have to construct a separate filename statement for each zip file, or use some of 7-Zip's more arcane command-line options, e.g.:
filename cmd pipe "7z.exe e -an -ai!C:\yourdir\o_equities_*.tas.zip -so -y";
This will extract all files from all of the matching archives to standard output. You can narrow this down further via the 7-zip command if necessary. You will have multiple header lines mixed in with the data - you can use findstr to filter these out in the pipe before SAS sees them, or you can just choose to tolerate the odd error message here and there.
Here, the -an tells 7-zip not to read the zip file name from the command line, and the -ai tells it to expand the wildcard.
If you need to keep track of what came from where and you can't extract everything at once, your best bet (as far as I know) is to write a macro that processes one file at a time using the above techniques, adding the source-file information as you import each dataset.