Copy (multiple) files only if filesize is smaller

I'm trying to make my image reference library take up less space. I know how to make Photoshop batch-save directories of images with a particular amount of compression, BUT some of my images were originally saved with more compression than I would have used.
So I wind up with two directories of images: some of the newer files have a larger filesize, some smaller, and some the same. I want to copy the new images over into the old directory, excluding any files that have a larger filesize (or the same size, though those probably aren't numerous enough for me to care about the extra time to process them).
I obviously don't want to sit there and compare each file by hand, but other than that I'm not picky about how it gets tackled.
running Windows 10, btw.

We have similar situations. Instead of Photoshop, I use FFmpeg (with its qscale option) to batch re-encode multiple images into a subfolder, then use XXCOPY to overwrite only the larger original source images. In fact, I ended up creating my own batch file which lets FFmpeg do the batch re-encoding (using its "best" qscale setting), then lets ExifTool batch-copy the metadata to the newly encoded images, then lets XXCOPY copy only the smaller newly created images. It's all automated, and the "new" folder with its leftover newly created but larger-sized images gets deleted too. Thus I save considerable disk space, as I have many images categorized/grouped in many different folders. But you should make a test run first or back up your images. I hope this works for you.
Here is my XXCOPY command line:
xxcopy "C:\SOURCE" "C:\DESTINATION" /s /bzs /y
The original post/forum where I learned this from is:
overwrite only files wich are smaller
https://groups.google.com/forum/#!topic/alt.msdos.batch.nt/Agooyf23kFw
Just to add: XXCOPY can also do it the other way round, if the larger file size is wanted instead, which I think is /BZL. I think that's also mentioned in the original post/forum.

Related

How to create a script that uses a path list as a reference for copying files, in PowerShell or a .bat script

I'm looking for a way to automate archiving so that after I plug in my two external drives I can copy all my resources. The problem is that I have different file structures on my laptop and on both external drives, so I need to select specific folders to be copied. That means I can't select one root folder and copy it straight over. I tried to find a way to declare more than one path in the cp command and in the copy command, without success. An example path:
/my_programming_stuff
/folder1
/folder2
/folder3
/folder4
I want to select only the first 3 folders and copy them to external drive 1 and external drive 2. The idea is to create a .bat file that will copy everything at once (in the best-case scenario it will be copied to both external drives simultaneously, so it will be much faster). Another problem is that there needs to be a way to bypass the NTFS long-path limitation (max. 260 characters).
Flags that I want to use:
- Copy the files and directories and all of their attributes, including ownerships and permissions.
- Recursively copy directories and their contents.
- When copying files from one directory to another, only copy files that either don't exist or are newer than the existing corresponding files in the destination directory.
- Data verification (so it's certain that the copy is correct).
- A progress bar with a time ETA.
Until now I have been using Total Commander to do this, but every day I need to pick the individual folders to be copied, which takes time and is inefficient.
I have experience with Bash and PowerShell but I am not sure how to handle this topic.
Create a static batch file with robocopy commands. I think /copyall is the only switch you need to specify for all this. Other defaults should satisfy requirements.
https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/robocopy
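A hedged sketch of what that static batch file could look like; the drive letters and folder names are made up, and the switches are my picks from the robocopy documentation (/E recurse, /COPYALL copy data plus attributes, timestamps, security, owner and auditing info, /XO skip files older than the destination copy, /ETA show estimated time per file):
@echo off
rem backup.bat - copy the selected source folders to both external drives.
rem Run from an elevated prompt: /COPYALL needs extra privileges for the auditing info.
set SRC=C:\my_programming_stuff

for %%D in (E: F:) do (
    robocopy "%SRC%\folder1" "%%D\backup\folder1" /E /COPYALL /XO /ETA /R:2 /W:5
    robocopy "%SRC%\folder2" "%%D\backup\folder2" /E /COPYALL /XO /ETA /R:2 /W:5
    robocopy "%SRC%\folder3" "%%D\backup\folder3" /E /COPYALL /XO /ETA /R:2 /W:5
)
Robocopy copes with paths past the 260-character limit and skips unchanged files by default, but note that this writes to the two drives one after the other rather than simultaneously, and as far as I know robocopy has no built-in post-copy verification, so that requirement would need a separate checksum pass.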
I think your time will be better spent learning how to use either FastCopy or FreeFileSync. I used FreeFileSync some years ago but got fed up with the constantly changing format of the XML file it uses for starting a backup, so I switched to FastCopy. But it looks like FreeFileSync may be getting its act together, and I aim to do some experiments over the summer to see if I want to switch back to it.
Both can handle the long-filename issues, both can be executed from a batch file, and both seem to be of good quality, but FreeFileSync has more features - and is more bloated because of them. Speed-wise, though, I think FastCopy is probably one of the better products out there, and very streamlined in use and design.

How to list only the first n files in a folder?

I have about 1500 image files in a folder, and listing all of them using dir() takes too long and sometimes crashes AppDesigner. Is there a way to list only the first image, or the first n images, at a given time?
No, this is not possible, because dir essentially is a wrapper around an operating-system function (such as the Windows dir and the Linux ls). So you have two options (actually three, but you're not gonna like at least one of them ^^):
1. It shouldn't cause much trouble to list 1500 files. Check whether it really is the dir command that causes the trouble: use the Run and Time functionality in the Editor.
2. Use dir with a pattern. E.g. lst = dir() returns an array of structs with everything in the current folder -- including . and .., which are usually nothing you need. Use lst = dir('*.m') instead to get all .m files. In your case, it can be lst = dir('d_Seite_*.jpg'). I am not sure that saves you much time, but it at least saves memory (see the sketch after this list).
3. Restructure how your data (images) is stored so that there are fewer files in the folder. You could run a background task that moves only the n latest files into a folder latest_files and moves them out again when there are newer ones...
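A minimal MATLAB sketch of option 2 combined with simply keeping the first n entries; the folder path and n are placeholders, and the pattern is the one mentioned above:
n      = 20;                                     % how many images to handle at once
folder = 'C:\images';                            % hypothetical folder
lst    = dir(fullfile(folder, 'd_Seite_*.jpg')); % pattern match, so '.' and '..' never appear
lst    = lst(1:min(n, numel(lst)));              % dir still scans everything, but the app
                                                 % only holds/processes the first n entries
for k = 1:numel(lst)
    img = imread(fullfile(folder, lst(k).name));
    % ... display or process img ...
end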

Get Maximum Compression from 7zip compression algorithm

I am trying to compress some of my large document files, but most of the files are getting compressed by only 10% at most. I am using 7zip terminal commands.
7z a filename.7z -m0=LZMA -mx=9 -mmt=on -aoa -mfb=64 filename.pptx
Any suggestions on changing the parameters? I need at least a 30% compression ratio.
.pptx and .docx files are internally .zip archives. You cannot expect a lot of compression on an already-compressed file.
The documentation states that LZMA2 copes better with data that cannot be compressed, so you can try:
7z a -m0=lzma2 -mx filename.7z filename.pptx
But the required 30% is almost unreachable.
If you really need that compression, you could use the fact that a pptx is just a fancy zip file:
Unzip the pptx, then compress it with 7zip. To recover an equivalent (but not identical) pptx, decompress with 7zip and recompress with zip.
There are probably some complications; for example, with epub there is a certain file that must be stored uncompressed, as the first file in the archive, at a certain offset from the start. I'm not familiar with pptx, but it might have similar requirements.
I think it's unlikely that the small reduction in file size is worth the trouble, but it's the only approach I can think of.
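As a rough command-line sketch of that idea (file names are made up, and whether PowerPoint accepts the rebuilt file should be tested, as noted above):
rem Unpack the pptx (it is just a ZIP container).
7z x presentation.pptx -ounpacked

rem Archive the unpacked contents with LZMA2, which now sees the raw XML/media.
cd unpacked
7z a -t7z -m0=lzma2 -mx=9 ..\presentation.7z *
cd ..

rem Later, to rebuild an equivalent pptx: extract the 7z and re-zip the contents.
7z x presentation.7z -orestored
cd restored
7z a -tzip ..\restored.pptx *
cd ..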
Depending on what's responsible for the size of the pptx, you could also try to compress the contained files, for example by recompressing PNG files with a better compressor, stripping unnecessary data (e.g. metadata or change histories), or applying lossy compression with lower quality settings to JPEG files.
Well, just an idea to maximize compression:
'recompress' these .zip archives (the .docx, .pptx, .jar, ...) using -m0 (store = no compression), and then
apply LZMA2 to them.
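A possible 7-Zip rendition of those two steps (names are illustrative; the 'nulled', stored zip from step 1 is also exactly what the PAQ suggestion below needs):
rem Step 1: rebuild the document as an uncompressed ("store") zip.
7z x report.docx -otmp
cd tmp
7z a -tzip -mx=0 ..\report_store.zip *
cd ..

rem Step 2: apply LZMA2 to the stored zip.
7z a -t7z -m0=lzma2 -mx=9 report_store.7z report_store.zip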
LZMA2 is pretty good. However, if the file contains many JPEGs, consider giving the open-source packer PeaZip, or more specifically paq8o, a try. Paq8 has a built-in JPEG compressor and supports range compression, so it will also cope with JPEGs that are inside some other file. WinZip's zipx, in contrast, requires pure JPEG files and is useless in this case.
But again, to make PAQ compress your target file effectively, you'll need to 'null' the zip/deflate compression, i.e. turn it into an uncompressed zip.
Well, PAQ is probably a little exotic; however, in my eyes it's more honest and clear than zipx. PAQ is unsupported, so, as always, it's a good idea to just google for whatever you don't have/know, and you will find something.
Zipx, in contrast, may appear a little misleading, since it looks like a normal zip and the files are listed properly in WinRAR or 7zip, but extracting the JPEGs will fail, so if the user is not experienced it may seem like the zip is corrupted. It's much harder to find out that it is a zipx, which so far only WinZip or The Unarchiver (unar.exe) can handle properly.
PPTX, XLSX, and DOCX files can indeed be compressed effectively if there are many of them. By unzipping each of them into their directories, an archiver can find commonalities between them, deduplicating the boilerplate XML as well as any common text between them.
If you must use the ZIP format, first create a zero-compression "store" archive containing all of them, then ZIP that. This is necessary because each file in a ZIP archive is compressed from scratch without taking advantage of redundancies across different files.
By taking advantage of boilerplate deduplication, 30% should be a piece of cake.
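A sketch of that unpack-then-deduplicate approach as a batch file (the names are assumed; a solid 7z archive lets LZMA2 exploit the boilerplate shared across the documents):
rem Unpack every Office file into its own folder, so the archiver sees the raw
rem XML parts instead of the already-deflated streams inside each document.
for %%F in (*.pptx *.docx *.xlsx) do 7z x "%%F" -o"unpacked\%%F"

rem One solid 7z archive over all of them; -ms=on keeps solid mode enabled.
7z a -t7z -m0=lzma2 -mx=9 -ms=on docs.7z unpacked\
(Use a single % instead of %% if you run the loop directly in a console rather than from a .bat file.)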

Reading image files serially from a folder in matlab

I tried to read .jpg files from a folder in MATLAB using the dir command, but I am not getting them starting from the first image stored in the folder. Instead it starts from the 10th image. I want to know how to read the files serially, starting from the beginning.
I am almost certain that if you are using a simple enough command, it will give you all files that are there. However, this may not seem to be the case because of this little line in the description:
Results appear in the order returned by the operating system.
This may mean that you will first see files like 1, 100, 1000, 1999 and only later a file with number 2. Of course you can sort the results after you have collected them, and then process them in your desired order (see the sketch below).
For completeness, one would like to have something like:
dir *.jpg
or if you want to be sure to catch everything that even remotely resembles .jpg:
dir *.*j*p*g*
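A small MATLAB sketch of the sorting step mentioned above: read the files in numeric filename order instead of the order the operating system returns them in (the folder path is a placeholder; adjust the regular expression to your actual naming scheme):
folder = 'C:\images';                                  % hypothetical path
lst    = dir(fullfile(folder, '*.jpg'));
names  = {lst.name};

% Pull the first run of digits out of each name and sort by it,
% so 'img2.jpg' comes before 'img10.jpg'.
nums       = str2double(regexp(names, '\d+', 'match', 'once'));
[~, order] = sort(nums);

for k = order                                          % loop in numeric order
    img = imread(fullfile(folder, names{k}));
    % ... process img ...
end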

A rotating log file in perl

I have implemented a log file that stores the CPU and memory state of a process every minute. I have limited the maximum size of the file to 3 MB (that's enough for my purpose).
The script is called by a cron job every minute; it logs the details for that minute and renames the file to "Log_.log".
When the size reaches "3MB - 100 bytes", I reset the file pointer to the beginning, overwrite the first entry in the log file, and now rename the file to "Log_<0+some offset>.log".
As I am renaming the file every minute to update the file pointer position, is this a good/efficient way of doing it?
I do not want to maintain more than one log file for this purpose.
Another option for me is to maintain the file pointer position in a file, but... another file!! I'm not interested in maintaining one if this approach is good. :)
Thanks in Advance.
Are you an engineer? This is a nice example of a simple task solved by a perfectly working but overly complex solution.
Unless the content you put in takes exactly as many bytes as the content you take out, writing "into" the middle of a file will actually cause the whole part after your writing position to be rewritten to disk. Appending is much cheaper.
Renaming the file to store the pointer works, but it's not very elegant and makes things more complex (for one, your process needs write rights to the directory in which the file resides; otherwise write access to just the two files would be sufficient).
Unless disk space is an issue (and really, it rarely is), your approach is less efficient than, say, appending everything to a file and rotating the file when it reaches its maximum size. This way you always have the last 3 MB of logs available, plus at most 3 MB more in your current file. It will make parsing the files a lot easier too, with no recalculating of the file-pointer position.
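A minimal Perl sketch of that append-and-rotate approach; the 3 MB limit comes from the question, while the log path and the two stub subs are placeholders for whatever the existing script already collects:
#!/usr/bin/perl
use strict;
use warnings;

my $log = '/var/log/proc_stats.log';   # hypothetical path
my $old = "$log.1";                    # the previous 3 MB of history
my $max = 3 * 1024 * 1024;             # 3 MB limit from the question

# Rotate first if the current file is already full.
if (-e $log && -s $log >= $max) {
    rename $log, $old or die "rotate failed: $!";
}

# Append this minute's entry; no file-pointer bookkeeping needed.
open my $fh, '>>', $log or die "open failed: $!";
printf {$fh} "%s cpu=%s mem=%s\n", scalar localtime, get_cpu(), get_mem();
close $fh;

# Stubs: replace with however the script currently gathers its numbers.
sub get_cpu { return 'TODO' }
sub get_mem { return 'TODO' }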
Update to answer your comment:
Renaming a file every minute (or even every second) shouldn't slow down your system significantly, don't worry about that.
Our concerns are mainly with why you think you need to rename the file. It's not better technically, it's not better from a logical point of view, and it makes a lot of other (future) tasks harder. You could store the file pointer in a separate file, or at the end of your file, and there are better^H^H^H^H^H^H simpler solutions that don't require the file pointer at all.
I'm confused why you would rename your file. What does this accomplish?
Are the log entries fixed size? Or variable size?
If the entries are fixed size, then there is no trouble in re-writing the existing file from the start: you won't ever have incomplete entries in your file, and if you are writing a counter or timestamps to the file, it should be clear where the 'cursor' is located.
If the entries are variable size, then you should probably not begin re-writing the file from the beginning without somehow making it clear where the 'cursor' is located in the file, and write code that is resilient to reading truncated log entries.
Can you re-use existing tools such as RRDtool?