Why are there different packages for the same architecture, but different OSes? - operating-system

My question is rather conceptual. I noticed that there are different packages for the same architecture, such as x86-64, but for different OSes. For example, rpmfind lists different wget packages for Fedora and openSUSE for the same x86-64 architecture: http://www.rpmfind.net/linux/rpm2html/search.php?query=wget - not to mention the different packages served up by YUM and APT (for Ubuntu), all for x86-64.
My understanding is that a package contains binary instructions suitable for a given CPU architecture, so that as long as the CPU is of that architecture, it should be able to execute those instructions natively. So why do packages built for the same architecture differ across OSes?

Considering just different Linux distros:
Besides being compiled against different library versions (as Hadi described), the packaging itself and the default config files can differ. Maybe one distro wants /etc/wget.conf while another wants /etc/default/wget.conf, or wants those files to have different contents. (I forget whether wget specifically has a global config file; some packages definitely do, and not just servers like Exim or Apache.)
Or different distros could enable/disable different sets of compile-time options. (Traditionally set with ./configure --enable-foo --disable-bar before make -j4 && make install.)
For wget, the choices may include which TLS library to compile against (OpenSSL vs. GnuTLS), not just which version.
So ABIs (library versions) are important, but there are other reasons why every distro has their own package of things.
Completely different OSes, like Linux vs. Windows vs. OS X, have different executable file formats: ELF vs. PE vs. Mach-O. All three of those formats contain x86-64 machine code, but the metadata is different. (And OS differences mean you'd want the machine code to do different things.)
For example, opening a file on Linux or OS X (or any POSIX OS) can be done with the int open(const char *pathname, int flags, mode_t mode); system call. So the same source code works for both of those platforms, although it can still compile to different machine code. In this case it actually compiles to very similar machine code that calls a libc wrapper around the system call (OS X and Linux use the same function calling convention), but with a different symbol name: OS X compiles it to a call to _open, while Linux doesn't prepend underscores to symbol names, so the dynamic-linker symbol name is open.
The flag constants for open might be different, too. For example, maybe OS X defines O_RDWR as 4 while Linux defines it as 2. That would be an ABI difference: the same source compiles to different machine code, and the program and the library have to agree on what each value means.
But Windows isn't a POSIX system. The WinAPI function for opening a file is HFILE WINAPI OpenFile(LPCSTR lpFileName, LPOFSTRUCT lpReOpenBuff, UINT uStyle);
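As a minimal sketch (the file name demo.txt and the flag choice are just illustrative), this is the kind of source that compiles unchanged on Linux and OS X, even though the two binaries end up referencing different symbol names and possibly different numeric flag values:

#include <fcntl.h>   /* open() and the O_* flags */
#include <unistd.h>  /* write(), close() */

int main(void) {
    /* Identical source on Linux and OS X. The compiled call goes to the
       libc wrapper, named _open on OS X and open on Linux, and the numeric
       value of O_RDWR is whatever each platform's ABI says it is. */
    int fd = open("demo.txt", O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return 1;
    write(fd, "hello\n", 6);
    close(fd);
    return 0;
}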
If you want to do anything invented more recently than opening/closing files, especially drawing a GUI, things are even less similar between platforms, and you will use different libraries. (Or a cross-platform GUI library will use a different back-end on each platform.)
OS X and Linux both have Unix heritage (real or as a clone implementation), so the low-level file stuff is similar.

These packages contain native binaries that require a particular Application Binary Interface (ABI) to run. The CPU architecture is only one part of the ABI. Different Linux distros have different ABIs and therefore the same binary may not be compatible across them. That's why there are different packages for the same architecture, but different OSes. The Linux Standard Base project aims at standardizing the ABIs of Linux distros so that it's easier to build portable packages.

Related

Determine machine architecture reliably in a BitBake recipe

I am writing a recipe for a package which needs to be aware of the underlying machine's architecture. In other words, I would like a string such as aarch64 or arm64 for a 64-bit Arm system, and x86_64 for a 64-bit Intel system.
So far, I have identified:
MACHINE - This seems to be whatever the meta-* layer author decides to name their machine; it may or may not contain the architecture. For example, beaglebone is no use.
MACHINE_ARCH - This seems to be close to what I'm looking for. However, taking this BSP layer as an example and doing a quick search, it doesn't seem as though this variable is set anywhere; it is only read from in a few packages.
TUNE_PKGARCH - May be the best bet so far. But what format is this variable in? What architecture naming conventions are used? Also, the aforementioned BSP layer, again, doesn't seem to set this anywhere.
I would have thought that knowing the machine architecture in a well-defined format is important, but it doesn't seem to be so simple. Any advice?
I'm accustomed to doing this with uname -m (Windows fans can use the output of SET processor), so for me in Yocto it ends up being a toss-up between a few variables. According to the Mega-Manual entry for TARGET_ARCH:
TARGET_ARCH
The target machine's architecture. The OpenEmbedded build system supports many architectures. Here is an example list of architectures supported. This list is by no means complete as the architecture is configurable:
arm
i586
x86_64
powerpc
powerpc64
mips
mipsel
uname -m is a bit better since you get sub-architecture information as well. From the machines I have access to at the moment:
Intel-based Nuc build system: x86_64
ARM embedded system: armv7l
Raspberry Pi 4B: aarch64
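For what it's worth, uname -m just reports the machine field of the POSIX uname(2) call; here is a minimal C sketch of the same lookup:

#include <stdio.h>
#include <sys/utsname.h>

int main(void) {
    struct utsname u;
    if (uname(&u) < 0)
        return 1;
    /* The same string uname -m prints: x86_64, armv7l, aarch64, ... */
    printf("%s\n", u.machine);
    return 0;
}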
I have found that the GNU automake (native) and libtool (available for target) packages compute a useful variable named UNAME_MACHINE_ARCH. If you are using libtool already, or are willing to take it on just for the purpose of having this done for you :-#), you can solve it that way: look in the build tree for files named config.guess.
You may be able to get by more generically than with libtool by using Yocto's BUILD_ARCH:
BUILD_ARCH
Specifies the architecture of the build host (e.g. i686). The OpenEmbedded build system sets the value of BUILD_ARCH from the machine name reported by the uname command.
So play with these and make your own choice depending on your project's circumstances.

Looking for a lua obfuscator to protect code

I have written a plugin for vanilla Lua. I wish to protect this plugin, and I have heard of obfuscation. I tried XFuscator, but even after fixing line 5's logic, it doesn't work. Are there any newer, better ones floating out there?
Thanks!
If you are going to run your Lua script on the same machine you build it on (I mean, same Lua version, same machine architecture), you could just compile it to bytecode using luac, like this:
luac -s -o example.out example.lua
Then distribute the .out file, which doesn't contain the Lua source code (the stock interpreter runs precompiled chunks directly, e.g. lua example.out).
Note that Lua bytecode is platform-specific (endianness, word size), and it could change in future Lua versions (in fact it already did in the past). For that reason, if you compile it on, say, an Intel x86-64 with Lua 5.3, you should run the generated .out only on that kind of machine or compatible ones.

Under cygwin64 and gtk2, how to specify includes and libraries?

I am using cygwin64 installed in C:/cygwin64, with Eclipse and GTK 2.0. Although #include <gtk/gtk.h> is in the source, and C:/cygwin64/usr/include/gtk-2.0 is in the include path (I added it), many things in a simple gtk2 example are still not recognized, such as GtkWidget, gpointer, and GTK_WINDOW_TOPLEVEL. I got the whole of GTK2 via cygwin setup. I was and am reluctant to download all of GTK2 separately and install it on top of cygwin, since wouldn't that result in multiple locations for the same thing? How may I resolve this? Would a separate download and installation not result in redundancy, and possibly alternate or even conflicting aliases?
A secondary question: I am confused about the general library requirements. Cygwin is a package which runs on Windows but offers a Linux/Unix-like interface. This argues that the libraries should be .a and .so. But since it is Windows, I also see a lot of .dll files within C:\cygwin64. Normally, I would expect that only cygwin proper would contain .dll files and all other code would be Linux code. Yet that seems not to be the case. Often I see both .dll and .so libraries with the same base name. Which is it: .dll, or .so and .a, etc.?
A tertiary question relating to the one above involves the main gtk2 library. The projected usage is not developing these programs, but just using GTK2 in applications. The documentation says to use glib, but there are many candidates: some are named glib2.so, others glib2 or cygglib2.0.0.dll. Which of these is appropriate, or is it some other library? How do I set the Eclipse library path? (Since I unexpectedly encountered the problem with gtk.h, I am trying to anticipate and head off the corresponding problem with the library implementing gtk2.)

msysgit large installation size

I installed (extracted) msysgit portable (PortableGit-1.9.5-preview20150319.7z)
The compressed archive is 23 MB, but once extracted the contents take up 262 MB. This is mostly due to the git command binaries (under libexec\git-core). Almost all of the binaries are identical; they just have different names.
Why did the developers build the project like this? I suppose they need an executable for each command to support the CLI on Windows cmd.exe.
But isn't there a way to avoid having ~100 identical binaries, each 1.5 MB in size (e.g. using batch files)?
Why did the developers build the project like this? I suppose they need an executable for each command to support the CLI on Windows cmd.exe.
Under unixoid OSes, you can have symbolic links to another file that behave exactly like the original file; if you do that for your executable, it can look at argv[0] to find out how it was called. That's a very common trick for many programs.
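As a sketch of that trick (the names git-add and git-commit below are hypothetical placeholders, not the actual msysgit dispatch logic), a single C binary can branch on the name it was invoked under:

#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    /* Strip any directory part to get the name we were invoked as. */
    const char *name = strrchr(argv[0], '/');
    name = name ? name + 1 : argv[0];

    if (strcmp(name, "git-add") == 0)
        puts("acting as git-add");        /* hypothetical personality */
    else if (strcmp(name, "git-commit") == 0)
        puts("acting as git-commit");     /* hypothetical personality */
    else
        printf("unknown name: %s\n", name);
    return 0;
}

On a unixoid system you would then create links such as ln -s tool git-add, and the one binary answers to many names.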
Under Windows, and especially without installers, it's (to my knowledge) impractical to get the same behaviour out of the file system: NTFS symbolic links require special privileges, and programs that are meant to run from USB drives have to cope with ancient filesystems such as FAT32, which have no link equivalent at all!
Thus, the executables simply were copied over. Same functionality, more storage. However, on a modern machine that you'd run Windows on, you really don't care about 200 MB, give or take, for a tool as versatile as git.
In conclusion: the developers had little choice here; since Windows (though it has some POSIX abstraction layer) has no reliably available filesystem support for symbolic links, copying was the only good way to port this Unix-originating program. You either shouldn't care, or should use an OS that behaves better in that respect. I'd recommend the latter, but OS choices often aren't made freely...

Dependency of a C program on CPU and OS

Let's think about a simple C program compiled in Windows.
I can compile the program on an Intel CPU machine and run it on an AMD CPU machine (same operating system). So does this mean that the instruction sets of the CPUs are the same?
Why doesn't the same program run on a machine with a different OS and the same CPU?
The Intel and AMD lines of processors in general have a big overlap in the instruction set that they implement (sometimes one or the other invents something new, and there is a gap until the other company catches up); that is why you can run programs on both. The same is the reason you cannot run them on other CPU architectures: they don't have the same instruction set, for starters, and many other things differ as well.
Operating systems also have their differences. For example, when you compile a program under Windows, you generally get an .exe file. That .exe has a format that only Windows understands and is very different from the format (ELF) used by Linux, for example.
Also, the support that the OS gives is completely different: Windows has different kernel functions that you can call compared to, e.g., OpenBSD. Even on more abstract levels things are incompatible: Windows uses drive letters such as C:\ and D:\ to mark drives, while under Linux there is one big filesystem where you mount different partitions, e.g. under /media or so.
There are different attempts, such as Wine and Cygwin, to execute programs from one platform on another. Using Wine, you can execute Windows executables on Linux directly, as it tries to emulate what Windows provides (not everything works, though). Cygwin is a different kind of product: you can build Windows programs that work similarly to GNU programs on Linux, but they need to be specially compiled, which just hints that these really are two separate worlds.
That is why Java and .NET (with Mono support on Linux) try to bring the two together. When you write a Java application, you should be able to run it on Linux with more or less the same code; some things might not be identical, but the majority is.
They're the same, or at least your program is using only a common subset of the two instruction sets.
For your second question, a few common reasons include:
different OSes require different formats of executables
different OSes will typically have different functions for the program to use (see the sketch after this list)
different OSes use different ways of invoking what they provide
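As a minimal sketch of the second point: even something as small as pausing for one second uses a different function, header, and system library on each platform, so the same C source needs per-OS branches and the compiled binaries are not interchangeable.

#ifdef _WIN32
#include <windows.h>  /* Sleep() takes milliseconds, lives in kernel32 */
#else
#include <unistd.h>   /* sleep() takes seconds, lives in libc */
#endif

int main(void) {
#ifdef _WIN32
    Sleep(1000);  /* Win32 API call */
#else
    sleep(1);     /* POSIX API call */
#endif
    return 0;
}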
1) Intel CPUs and AMD CPUs are intentionally produced this way. You cannot run a program compiled for, say, a SPARC CPU on, say, an ARM CPU.
2) In theory it can. For example, Linux has Wine to emulate Windows, and many Windows programs run perfectly on Linux under Wine.