I need to zip a large and deep directory tree with thousands of files on various levels of the tree.
The problem is that the whole tree is under SVN version control. SVN keeps its hidden ".svn" metadata directories in every directory, which inflates the size of the resulting ZIP by more than 100% (unacceptable, since the archive is intended for online distribution).
Currently I'm using this:
7z u archive.zip baseDir\*.png
7z u archive.zip baseDir\*\*.png
7z u archive.zip baseDir\*\*\*.png
7z u archive.zip baseDir\*\*\*\*.png
...where the number of * levels is the maximum theoretical depth of the tree. All of this is repeated for every extension that can possibly appear in the tree. This works, and it builds the archive exactly as it should, but it takes far too long (a few minutes), since the whole tree has to be traversed many times.
And I want to make it faster, since I need to repeat this for every debug session.
Is there a more efficient way to select the "real" files in the directory tree?
Thanks for any help!
Try -xr!.svn
Stupid site won't recognise my answer because it was too simple...
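To spell the switch out in a full command, here is a minimal sketch. The function name zip_without_svn is made up for illustration, and archive.zip and baseDir are the names from the question. It assumes a 7z binary on the PATH. The single quotes only matter in shells such as bash, where an unquoted "!" can trigger history expansion; under cmd.exe the switch can be typed bare.

```shell
# Hypothetical helper: add a whole tree to a ZIP archive while skipping
# every directory or file named ".svn" at any depth.
#   -tzip    : write a ZIP archive
#   -xr!.svn : exclude (-x), recursively (r), anything named ".svn"
zip_without_svn() {
    archive=$1
    tree=$2
    7z a -tzip "$archive" "$tree" '-xr!.svn'
}
```

Called as zip_without_svn archive.zip baseDir, this walks the tree once instead of once per extension and depth level.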
Changing the path of a Yocto environment is not a good idea, as I found out. This also explains why e.g. bitbake can be run regardless of the current working directory. Absolute paths are stored in many places during the build process, and even subdirectory structures mirroring them are created inside the tmp directory tree. I ended up rebuilding from scratch, which takes a long time.
Here is how I tried to modify all the paths:
find . -name '*.conf' -exec sed -i 's/media\/rob\/3210bcd4-49ef-473e-97a6-e4b7a2c1973e/home/g' {} +
This step replaces absolute paths, within many dynamic conf files (from xx/xx/linux to /home/linux - where linux was chosen for historical reasons. I could mount the partition also as /home/yocto or whatever name).
Next was deletion of subdirectory structures with the old path in the hope that the build process would recognize these deletions, and still rebuild quickly:
find . -name '*3210bcd4-49ef-473e-97a6-e4b7a2c1973e*' -exec fakeroot rm -r {} +
It was not recognized. Then I gave up.
From a user new to Yocto, familiar with former/classic crossbuild environments based on make menuconfig etc.
My question is:
Why are absolute paths generated & used throughout tmp instead of treating everything as relative?
Or, asked differently:
Why not use something like ${TOPDIR}/tmp throughout the build configuration, instead of hardcoding the absolute path to tmp?
Something seems seriously broken with infozip.exe (same as zip.exe in cygwin).
I've spent FORTY MINUTES and I can't create a zip named anything other than "1.zip", whether I use -r (for a new zip) or -FS (for an existing zip).
When no zip is present, I do:
[info]zip.exe "S:\Dropbox\BACKUPS\winamp.zip" -r "C:\Program Files\winamp\"
When the zip is present, I do:
[info]zip.exe "S:\Dropbox\BACKUPS\winamp.zip" -FS -r "C:\Program Files\winamp\"
This should be the easiest thing in the world. I could do this just fine in 1990 (but they had no -FS option back then). Or using wzzip (but I don't want to; it requires an install, and has no -FS option). Or using my command line's built-in primitive zip (which has no -FS option). But I need to use infozip, because I intend to use -FS to sync only changed files to the zip file later.
Of course, -FS does the same thing: The file that is created is always s:\dropbox\backups\1.zip
The error that I get is always:
zip warning: name not matched: S:\Dropbox\BACKUPS\winamp.zip
Of course the name is not matched! I'm creating a new zip! This should be easy!
Unfortunately, only [info]zip.exe has this behavior, and only [info]zip.exe has the -FS functionality that I need in order to NOT re-zip the whole thing every time. This utility will be used for many folder backups, and I don't want to re-zip every file every time.
Clarification: The zip.exe that comes with cygwin is infozip.exe.
In my workflow, I have lots of xxx.smr files in a folder and I need to convert them into another file format, xxx_step3.mat, by importing some data from xxx_info.xlsx. I learned that GNU make is good at keeping all the files up to date.
In a very simple "explicit" format (without sophisticated wild card usage), Makefile for this process would look like this. To handle multiple xxx.smr files and their descendants, I should be able to do that by modifying this file.
.PHONY: all clean
all: xxx_step3.mat
xxx_step3.mat: xxx_step2.mat xxx_info.xlsx
	matlab -r "merge2files('xxx_step2.mat', 'xxx_info.xlsx')"
xxx_step2.mat: xxx_step1.mat
	matlab -r "convertmat('xxx_step1.mat')"
xxx_info.xlsx: master.xlsx
	matlab -r "extractfromMasterxlsx('master.xlsx', 'xxx_info.xlsx')"
xxx_step1.mat: xxx_step0.smr
	@echo "\nCreate " $@
	# I can't do this step from the command line so I leave message
clean:
	rm -f xxx_step1.mat xxx_step2.mat xxx_step3.mat xxx_info.xlsx
However, I realized that when some of the xxx.smr files turn out to be surplus and are deleted, running GNU make with this Makefile does not delete the obsolete descendant files that depend on the deleted xxx.smr files, i.e. all the intermediate files and the final xxx_step3.mat files.
For example, I start with the three xxx.smr files and run Make.
A.smr, B.smr, C.smr
It will create all the descendants, including the final target files:
A_step3.mat, B_step3.mat, C_step3.mat
Later, say, I find that B.smr contained a fatal error and decide to delete it from the folder.
A.smr, C.smr
Running Make at this stage will result in ... no change, because both A_step3.mat and C_step3.mat are newer than their direct prerequisites (and than A.smr and C.smr). However, I actually need to remove all the descendants of B.smr, such as B_step1.mat, B_step2.mat, B_step3.mat, and B_info.xlsx. If those obsolete files are kept, the final target B_step3.mat will be included in the subsequent analyses and affect the results.
I wonder if there is a "smart" way of removing xxx_step1.mat, xxx_step2.mat, xxx_step3.mat, xxx_info.xlsx files, when their corresponding xxx.smr files have been deleted.
Or should I just implement this with MATLAB or Python etc?
Since a Makefile is a collection of shell commands, on your clean: target, you can collect and remove all the files that correspond to your xxx.smr files using a for loop and parameter expansion/substring matching. To find all files that correspond to each xxx.smr file, find all xxx.smr files. Then for each xxx.smr, extract xxx and remove all xxx_step?.* and xxx_info.* files. After each of the step? and info files are removed, then remove xxx.smr. In multi-line form it would look like:
for i in *.smr; do
    for j in "${i%.*}"; do
        rm -f "$j"_step?.* "$j"_info.*
    done
    rm -f "$i"
done
Or, in a single line (inside a Makefile recipe, write each $ as $$):
for i in *.smr; do for j in "${i%.*}"; do rm -f "$j"_step?.* "$j"_info.*; done; rm -f "$i"; done
Note this will remove all xxx_step... and xxx_info... files for each xxx.smr file. Make sure this is what you intend and run on a test directory first. You can tighten the extensions above to just remove xxx_info.xlsx by replacing xxx_info.* with xxx_info.xlsx, etc...
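For the case described in the question, where B.smr is already gone and only its descendants are left behind, the loop can be turned around: start from the derived files and check whether their source still exists. The following is only a sketch under assumptions. The function name prune_orphans is made up, and the patterns xxx_step<digit>.mat and xxx_info.xlsx are taken from the question's naming.

```shell
# Hypothetical helper: delete derived files in a directory whose
# source xxx.smr no longer exists. Existing .smr files are never touched.
prune_orphans() {
    dir=${1:-.}
    for f in "$dir"/*_step[0-9].mat "$dir"/*_info.xlsx; do
        [ -e "$f" ] || continue              # unmatched glob stays literal; skip it
        base=${f##*/}                        # file name without the directory
        case $base in
            *_info.xlsx) stem=${base%_info.xlsx} ;;
            *)           stem=${base%_step[0-9].mat} ;;
        esac
        if [ ! -e "$dir/$stem.smr" ]; then   # source is gone, so this file is obsolete
            rm -f -- "$f"
        fi
    done
}
```

Running prune_orphans . before make (or from a separate phony target) would leave A_* and C_* alone and remove the B_* leftovers. As with the loop above, try it on a test directory first.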
I have a Makefile where I currently have two files that should be copied to different directories. Currently, I've tested
echo ${dirs} | xargs -n 1 cp ${sources}
I understand that this will not work, since it tries to copy both source files into one of the directories every time. But is there a way to execute the copy command once for each source file and its own directory?
Best regards,
Simon
I think it is possible to deduce what you want from what you wrote, but as others pointed out, you should be more clear, so we don't have to spend time deducing it.
Anyway, since you want to not copy all files to all directories, you must somehow tell Make where you want to copy which files. The easiest way is to list the full paths of the copies you want in a variable such as $(COPIES), and not just ${dirs}. In this answer I am going to assume the destination directories already exist.
.PHONY: all
all: $(COPIES)
PERCENT := %
.SECONDEXPANSION:
$(COPIES): %: $$(filter $$(PERCENT)/$$(notdir $$*), $(sources)) Makefile
	cp $< $@
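Outside of Make, the same idea of stating explicitly which file goes where can be sketched in plain shell. This is an illustration only: the function name copy_pairs and the "source:directory" argument format are made up here, and paths containing a colon are not handled. Like the rule above, it assumes the destination directories already exist.

```shell
# Hypothetical helper: each argument is "source:directory";
# copy every source into its own directory.
copy_pairs() {
    for pair in "$@"; do
        src=${pair%%:*}      # text before the first colon
        dst=${pair#*:}       # text after the first colon
        cp -- "$src" "$dst"/ || return 1
    done
}
```

For example, copy_pairs a.txt:dir1 b.txt:dir2 copies a.txt into dir1 and b.txt into dir2, and nothing else.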
My company has a local production server I want to download files from that have a certain naming convention. However, I would like to exclude certain elements based on a portion of the name. Example:
folder client_1234
file 1234.jpg
file 1234.ai
file 1234.xml
folder client_1234569
When wget is run, I want it to bypass all folders and files with "1234". I have researched this and ran across ‘--exclude list’, but that appears to be only for directories, and ‘reject = rejlist’, which appears to be for file extensions. Am I missing something in the manual here?
EDIT:
this should work.
wget has options -A <accept_list> and -R <reject_list>, which, according to the manual page, accept either suffixes or patterns. These are separate from the -I <include_dirs> and -X <exclude_dirs> options, which, as you note, only deal with directories. Given the example you list, and since you want to skip those names rather than keep them, something along the lines of -R "1234.*" for the files, plus -X for the client_1234 directory, might be what you need, although I'm not entirely sure that's exactly the naming convention you're after...