msysgit large installation size

I installed (extracted) msysgit portable (PortableGit-1.9.5-preview20150319.7z).
The compressed archive is 23 MB, but once extracted the contents take up 262 MB. This is mostly due to the git command binaries (under 'libexec\git-core'); almost all of the binaries are identical, they just have different names.
Why did the developers build the project like this? I suppose they need an executable for each command to support the CLI in Windows cmd.exe.
But isn't there a way to avoid having ~100 identical binaries, each 1.5 MB in size (e.g. using batch files)?

Why did the developers build the project like this? I suppose they need an executable for each command to support the CLI in Windows cmd.exe.
Under Unix-like OSes, you can have symbolic links to another file that behave exactly like the original file; if you do that for an executable, the program can look at argv[0] to find out which name it was invoked under. That's a very common trick for many programs.
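As a minimal sketch of that trick (the command names here are made up, and this is not git's actual dispatch code), one binary can behave differently depending on the name it was started under:

    /* argv[0]-based dispatch: one binary, many names.
     * If this executable is linked or copied as "frobnicate" or
     * "defrobnicate" (both hypothetical), it picks its behaviour from
     * the name it was invoked under. */
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        /* Strip any leading directory from the invocation name. */
        const char *name = strrchr(argv[0], '/');
        name = name ? name + 1 : argv[0];

        if (strcmp(name, "frobnicate") == 0)
            puts("running as frobnicate");
        else if (strcmp(name, "defrobnicate") == 0)
            puts("running as defrobnicate");
        else
            printf("unknown personality: %s\n", name);
        return 0;
    }

BusyBox is the classic example of this approach: a single binary installed under dozens of names via links.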
Under Windows, and especially without an installer, it's (to my knowledge) impossible to get the same behaviour out of the file system -- there's just no symbolic-link equivalent, especially when you consider that programs meant to run from USB drives have to cope with ancient filesystems such as FAT32!
Thus, the executables were simply copied over: same functionality, more storage. However, on a modern machine that you'd run Windows on, you really don't care about 200 MB give or take for a tool as versatile as git.
In conclusion: the developers had no choice here; since Windows (though it has some POSIX abstraction layer) has no proper filesystem support for symbolic links, that was the only good way to port this Unix-originating program. Either don't worry about it, or use an OS that behaves better in that respect. I'd recommend the latter, but OS choices often aren't made freely...

Related

Packing a Tcl program for deployment in Linux

We are getting ready to deploy a Tcl application, but I'm having trouble figuring out how to do it. Currently, I'm experimenting with tclkit and sdx.kit. I can pack a single Tcl file and run it, but the whole application contains folders, images, and C files that work together with the Tcl code. I have two folders, and inside them a bunch of C files, Tcl files, and other stuff. How would I go about wrapping the whole thing? What tool do you recommend other than tclkit, and why?
The main way that you're recommended to distribute applications is as a tclkit. There are a few alternatives (e.g., TOBE, ActiveState's commercial tooling) but they're pretty similar as they all build on top of Tcl's virtual filesystem layer. (NB. This isn't the same as the Linux VFS stuff; this is a VFS in a single application.) Indeed, the ActiveState tooling actually is a rebadged tclkit (plus some other stuff like code obfuscation). I believe that TOBE uses ZIP archives instead of metakit databases.
The advantage of using a VFS-based solution is that lots of things work inside it, particularly both source (for pulling in another .tcl file) and load (for loading a binary library). In fact, you can put your application, the packages it depends on, and the resources (images, etc.) inside the VFS and be fairly sure that things will work. About the only things we know run into real problems are cases where you want to exec something in the archive (the VFS mount is process-local; you have to copy the subsidiary file out if you want it to be seen by subprocesses) and where you want to load certificates or private keys with the tls package (because the underlying OpenSSL library doesn't delegate to Tcl to handle that part of its I/O, for some reason, AIUI).
When you're building these things, effectively you make a directory (and its subdirectories) that have everything laid out right. Then you run the packager (sdx for tclkits) and it builds the overall application for you. Attach the result to a runtime (the standard tclkit) and you're ready to test and deploy.
We don't generally do tool recommendations here on Stack Overflow, but the ActiveState Tcl Dev Kit is actually rather widely used. Many other people use sdx/tclkit. TOBE is quite a lot rarer. (There are other packaging techniques, but I wouldn't recommend them these days; a packaged VFS works very well indeed.)

Should we deploy scrrun.dll (Windows Scripting Runtime)?

Our VB6 application relies heavily on the use of scrrun.dll (Windows Scripting Host). Until a year ago we used to deploy this DLL with our installer. Since the Windows Scripting Host is supposed to be part of Windows, we removed the DLL from the installation package. However, now and then customers surface who have a non-functional scrrun.dll on their system, and we have to help them reinstall or re-register it.
So, should we put scrrun.dll back in the installation package? Should we perform some check on installation? Or should we just live with the fact that we have to provide hands-on support to some of our customers to set their systems right?
Don't try to deploy these libraries as part of a normal setup.
Microsoft Scripting Runtime must be installed through the use of a self-extracting .exe file. For versions of Scripting Runtime mentioned at the beginning of this article, the only way to distribute it is to use the complete self-extracting .exe file located at the following locations...
It is possible that some users employ an older anti-malware suite, many of which tried to disable scripting. It is more likely, though, that some users have managed to break their Windows installation, either themselves or by using applications improperly packaged to include these libraries - applications that then blindly remove them from the system on uninstall (cough, cough - Inno).
The libraries involved have been tailored, OS-version-specific code for some time now. This is why the ancient .CAB file was "recalled" long ago. There is no single copy of them intended to run on any random version of Windows, and there are no redist packs for any modern version of Windows. The correct fix is a system restore or a repair install.
While this can't be blamed directly on InnoSetup, because it is the result of poorly authored scripts, it is frustrating enough and common enough that I won't cry when its signature is added to anti-malware suites. There are just too many poorly written examples loose in the wild, copy/pasted by too many people.
I spend plenty of time undoing the damage caused by uninstalls of these applications and have grown quite weary of it. Where possible I use isolated assemblies now in self-defense, which helps a lot. Windows File Protection is getting better about preventing abusive action for system files too.
But in general you are much better off avoiding any dependency on scripting tools in an application. There isn't very much that they can do as well as straight code anyway, though it may take some time to write alternative logic.
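That said, if you keep the dependency and only want the install-time check the question mentions, one option is simply to probe whether Scripting.FileSystemObject can be created at all. A rough sketch (assumes a Win32 C build, linked against ole32.lib and uuid.lib; this is illustrative, not an officially sanctioned check):

    /* Probe whether the Scripting Runtime (scrrun.dll) is registered and
     * usable by trying to create the Scripting.FileSystemObject object. */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        CLSID clsid;
        IDispatch *disp = NULL;
        HRESULT hr = CoInitialize(NULL);
        if (FAILED(hr)) return 1;

        /* Resolve the ProgID that scrrun.dll registers. */
        hr = CLSIDFromProgID(L"Scripting.FileSystemObject", &clsid);
        if (SUCCEEDED(hr))
            hr = CoCreateInstance(&clsid, NULL, CLSCTX_INPROC_SERVER,
                                  &IID_IDispatch, (void **)&disp);

        if (SUCCEEDED(hr)) {
            printf("scrrun.dll looks healthy.\n");
            disp->lpVtbl->Release(disp);
        } else {
            printf("Scripting.FileSystemObject could not be created (hr=0x%08lx).\n",
                   (unsigned long)hr);
        }

        CoUninitialize();
        return SUCCEEDED(hr) ? 0 : 1;
    }

If the probe fails, point the user at a repair install rather than trying to redistribute the DLL yourself.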

same hash, different behavior

I have two files that give the same hash, and even the same hexdump. File A and File B start on Linux Box 1 and Linux Box 2, respectively. I then copy both files to a Windows share, and read them from a Windows machine. The files still seem to be byte-by-byte identical with the Windows utility Fc (with /b option -- binary mode). However, when I open the two different files, they appear to have different encoding (newlines/line-wrap). Why wasn't this uncovered by the hashes/hexdump/Fc?
What am I overlooking here?
Don't use WordPad for that. Actually, don't use WordPad at all. Note that Microsoft often does not keep to standards and in many cases (e.g. the browser) simply takes an informed guess at file or stream content, using the header as some kind of magic. Sometimes it guesses right, sometimes it doesn't.
You could calculate the hash on the Windows machine as well; there are plenty of lightweight utilities that calculate secure hashes from within Windows Explorer. You could also install command-line utilities such as OpenSSL on Windows (or take it a step further and install Cygwin, which I always have running on my Windows machine).
Windows has never had a real strategy regarding line endings, except keeping to its own two-character standard. In later versions of Windows you can use Notepad, which does (finally) understand Unix newlines, if you must (though maybe it screws up UTF-16 this time around).
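If you want to see exactly what the editors are reacting to, checking the raw bytes settles it: identical hashes already guarantee identical bytes, so the only question is which line-ending convention those bytes use. A small sketch (pass the file to inspect on the command line):

    /* Count LF and CR bytes to see which line-ending convention a file uses. */
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }

        FILE *f = fopen(argv[1], "rb");   /* binary mode: no newline translation */
        if (!f) { perror("fopen"); return 1; }

        unsigned long lf = 0, cr = 0;
        int c;
        while ((c = fgetc(f)) != EOF) {
            if (c == '\n') lf++;
            else if (c == '\r') cr++;
        }
        fclose(f);

        printf("LF: %lu, CR: %lu -> %s\n", lf, cr,
               cr == 0 ? "Unix (LF-only) line endings"
                       : "CR bytes present (probably Windows CRLF)");
        return 0;
    }

If both files report CR: 0, they really do use Unix newlines, and the difference you see is purely in how WordPad or Notepad choose to render them.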

How should I improve my perl application deployment process?

I develop and maintain a bioinformatics application suite of 50+ scripts and its deployment process is a mess:
Entire suite is in one big git repository. It has lots of CPAN dependencies, and dozens of internal modules as well.
Development platform is Linux.
Deployment platforms are Windows (20+ users), Mac (10+), Linux (2-3). Most are not 'power users'.
For Windows, I have one installer (made with NSIS) for Strawberry Perl plus the required modules (i.e., I installed Strawberry on a Windows box, installed all the modules, and zipped C:\strawberry), and another installer for the suite -- I did this because the suite is updated a lot more often than the list of required modules.
For Mac, I bundle Perl 5.14, all required CPAN modules, and the application suite into a double-clickable installer. I don't use the system perl because it tends to be out of date. Unlike on Windows, I bundle everything together because I suck at Mac.
For Linux, I handle their installations manually since there are only a few of them, and they use different distros.
This is obviously a mess that grew organically over several generations of developers. Ideally I would like to create CPAN-installable distributions out of the internal libraries and the various groups of related scripts, and use module dependencies to let cpan install them for me.
But I'm not sure what the best approach for this is, because I'd still need to distribute perl itself, would have to write some sort of non-command-line interface to CPAN, control the exact versions of third-party CPAN modules, point it by default to my "DarkPAN" where I would store our modules, figure out how I would push updates, etc., etc.
I don't think I can use PerlApp or PAR, since as far as I know those are for bundling single scripts, not an entire suite of them.
Any advice highly appreciated.
Besides the 3 platforms mentioned (more, if you count the Linux variants), you really have a couple of different problems:
Deployment of a standard known-good Perl executable and libraries (CPAN modules).
Deployment of your Perl scripts and modules.
Once upon a time, I supported a large Solaris Perl installation. I tried for a while to stand up a Linux Perl installation "side-by-side", re-using the same CPAN modules. Didn't work. The big problem for me is that a fair number of Perl modules require compilation, which means they target a specific platform. I ended up just with 2 installs, and always remembered to install a new CPAN module in both areas.
We're now 100% Windows, so I don't have the same issue. However, we do run Perl off a shared network drive. All the users map this drive, and run a Registry script that associates .PL files with the network install of Perl. (See my answer to this other Perl question.)
So, besides the mapped drive and the Registry script, users don't need to install anything. Even the CPAN modules are picked up from the network. This solves item #1 (for Windows only users).
Same thing holds true for item #2: the scripts are stored on a network drive (same one) and the users run another Registry script to include the scripts folder in their search PATH. We edit our scripts in one area, and have a "Check-In 'n Release" ("CINR") that we use to, well, check-in and release scripts to the area the users point to. The users can double-click the scripts in Explorer, run them in DOS, or even better yet get them included in the contextual-menu in Explorer, etc. (Actually, we use a .NET application to map the drive and make all these settings for the user, but it can be done much simpler.)
So, how does this help with the other platforms, Linux and Mac? As I found in my Solaris/Linux experiment, I think you're stuck with a different Perl installation for each of the 3 platforms, although you should be able to reach the same network drive for your Perl scripts and modules.
The Perl installation might even be OK on a network drive for the Linux users; it's probably easier for them than for the Windows users. Mac users are tough. I administer a home Mac network, and I think network drives are very difficult to do in Mac OS X compared to other OSes. It should be as easy as on Linux, since so much is the same, but there are very strange problems (for me) mapping NFS and SMB drives. AFP drives are a little easier for the user to map manually, but not so easy to map programmatically.
My Mac recommendation is to try Platypus. It's definitely great at bundling scripts into a double-clickable application, although your interface options are limited to output only (no user input during execution, as far as I can tell). I'm not sure whether you can put an entire Perl installation into the Platypus app, but if you could get the paths figured out, you might be able to.
Good luck!
You may wish to check out CAVA packager. It can deal with multiple scripts in a single package.

Dependency of a C program on CPU and OS

Let's think about a simple C program compiled in Windows.
I can compile the program on an Intel CPU machine and run it on an AMD CPU one (same operating system). So does that mean the instruction sets of the CPUs are the same?
Why doesn't the same program run on a machine with different OS and the same CPU?
Intel's and AMD's processor lines in general have a big overlap in the instruction sets they implement (sometimes one or the other invents something new and there is a gap until the other company catches up) - that is why you can run programs on both. The same is the reason you cannot run them on other CPU architectures: they don't have the same instruction set for starters, and many other things are different as well.
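As a small illustration of that shared instruction set (a sketch that assumes GCC or Clang on an x86/x86-64 machine): the CPUID instruction is part of the common subset and works the same on both vendors' chips; only the vendor string it reports differs.

    /* Read the CPU vendor string via CPUID (GCC/Clang, x86 only). */
    #include <stdio.h>
    #include <string.h>
    #include <cpuid.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;
        char vendor[13] = {0};

        if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx)) {
            puts("CPUID not supported");
            return 1;
        }

        /* The vendor string is returned in EBX, EDX, ECX, in that order. */
        memcpy(vendor + 0, &ebx, 4);
        memcpy(vendor + 4, &edx, 4);
        memcpy(vendor + 8, &ecx, 4);

        printf("CPU vendor: %s\n", vendor);  /* "GenuineIntel" or "AuthenticAMD" */
        return 0;
    }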
Operating systems also have their differences. For example, when you compile a program under Windows, you generally get an .exe file. That .exe has a format that only Windows understands and is very different from the format used by Linux for example.
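In fact, a few bytes at the start of the file are enough to tell the two formats apart (a sketch, not a tool either OS actually ships):

    /* Check a file's magic bytes: Windows PE starts with "MZ",
     * Linux ELF starts with 0x7F 'E' 'L' 'F'. */
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }

        FILE *f = fopen(argv[1], "rb");
        if (!f) { perror("fopen"); return 1; }

        unsigned char magic[4] = {0};
        size_t n = fread(magic, 1, sizeof magic, f);
        fclose(f);

        if (n >= 2 && magic[0] == 'M' && magic[1] == 'Z')
            puts("looks like a Windows PE executable (MZ header)");
        else if (n >= 4 && memcmp(magic, "\x7f" "ELF", 4) == 0)
            puts("looks like a Linux/Unix ELF binary");
        else
            puts("neither PE nor ELF magic found");
        return 0;
    }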
Also, the support that the OS gives is completely different: Windows has different kernel functions that you can call compared to, say, OpenBSD. Even on more abstract levels it's incompatible: Windows uses drive letters such as C:\ and D:\ to mark drives, while under Linux there is one big filesystem where you mount different partitions, e.g. under /media.
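To make that concrete, here is a hedged sketch of the same logical operation ("open a file") written against the two platforms' native APIs; the paths are placeholders:

    /* The same operation uses different OS functions and path conventions. */
    #include <stdio.h>

    #ifdef _WIN32
    #include <windows.h>

    int open_log(void)
    {
        /* Win32 API call, drive-letter path. */
        HANDLE h = CreateFileA("C:\\temp\\app.log", GENERIC_READ,
                               FILE_SHARE_READ, NULL, OPEN_EXISTING,
                               FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE) return -1;
        CloseHandle(h);
        return 0;
    }
    #else
    #include <fcntl.h>
    #include <unistd.h>

    int open_log(void)
    {
        /* POSIX call, single rooted filesystem. */
        int fd = open("/var/log/app.log", O_RDONLY);
        if (fd < 0) return -1;
        close(fd);
        return 0;
    }
    #endif

    int main(void)
    {
        printf("open_log: %s\n", open_log() == 0 ? "ok" : "failed");
        return 0;
    }

The same source compiles on both systems, but the resulting PE and ELF binaries - and the system calls they make - are not interchangeable.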
There are different attempts, such as Wine and Cygwin, to execute programs from one platform on another. Using Wine, you can execute Windows executables on Linux directly, as it tries to emulate what Windows provides (not everything works, though). Cygwin is a different kind of product: it lets you build programs for Windows that work similarly to GNU programs on Linux, but they need to be specially compiled - which just goes to show that these really are two separate worlds.
That is why Java and .NET (with Mono support on Linux) try to bring these two together. When you write a Java application, you should be able to run it on Linux with more or less the same code - some things might not be the same, but the majority is.
They're the same, or at least your program is using only a common subset.
For your second question, a few common reasons include:
different OSes require different formats of executables
different OSes will typically have different functions for the program to use
different OSes use different ways of invoking what they provide
1) Intel CPUs and AMD CPUs are intentionally produced this way. You cannot run a program compiled for, say, a SPARC CPU on, say, an ARM CPU.
2) In theory it can. For example, Linux has Wine to emulate Windows, and many Windows programs run perfectly on Linux under Wine.