How does the NX flag work? - cpu-architecture

Could you please explain what the NX flag is and how it works (please be technical)?

It marks a memory page non-executable in the virtual memory system and in the TLB (a structure used by the CPU for resolving virtual memory mappings). If any program code is going to be executed from such page, the CPU will fault and transfer control to the operating system for error handling.
Programs normally have their binary code and static data in a read-only memory section and if they ever try to write there, the CPU will fault and then the operating-system normally kills the application (this is known as segmentation fault or access violation).
For security reasons, the read/write data memory of a program is usually NX-protected by default. This prevents an attacker from supplying some application his malicious code as data, making the application write that to its data area and then having that code executed somehow, usually by a buffer overflow/underflow vulnerability in the application, overwriting the return address of a function in stack with the location of the malicious code in the data area.
Some legitimate applications (most notably high-performance emulators and JIT compilers) also need to execute their data, as they compile the code at runtime, but they specifically allocate memory with no NX flag set for that.

From Wikipedia
The NX bit, which stands for No
eXecute, is a technology used in CPUs
to segregate areas of memory for use
by either storage of processor
instructions (or code) or for storage
of data, a feature normally only found
in Harvard architecture processors.
However, the NX bit is being
increasingly used in conventional von
Neumann architecture processors, for
security reasons.
An operating system with support for
the NX bit may mark certain areas of
memory as non-executable. The
processor will then refuse to execute
any code residing in these areas of
memory. The general technique, known
as executable space protection, is
used to prevent certain types of
malicious software from taking over
computers by inserting their code into
another program's data storage area
and running their own code from within
this section; this is known as a
buffer overflow attack.

Have a look at this 'DEP' found on wikipedia which uses the NX bit. As for supplying the technical answer, sorry, I do not know enough about this but to quote:
Data Execution Prevention (DEP) is a security feature included in modern
Microsoft Windows operating systems that is intended to prevent an
application or service from executing code from a non-executable memory region.
....
DEP was introduced in Windows XP Service Pack 2 and is included in Windows XP
Tablet PC Edition 2005, Windows Server 2003 Service Pack 1 and later, Windows
Vista, and Windows Server 2008, and all newer versions of Windows.
...
Hardware-enforced DEP enables the NX bit on compatible CPUs, through the
automatic use of PAE kernel in 32-bit Windows and the native support on 64-bit
kernels.
Windows Vista DEP works by marking certain parts of memory as being intended to
hold only data, which the NX or XD bit enabled processor then understands as
non-executable.
This helps prevent buffer overflow attacks from succeeding. In Windows Vista,
the DEP status for a process, that is, whether DEP is enabled or disabled for a
particular process can be viewed on the Processes tab in the Windows Task
Manager.
See also here from the MSDN's knowledge base about DEP. There is a very detailed explanation here on how this works.
Hope this helps,
Best regards,
Tom.

Related

Does CPU architecture influence and define OS design?

I am currently reading a manual dedicated to OS development. The first part of the book describes how to write a boot sector, which is followed by a more dense text devoted to kernel development. Also, I have not been questioning myself a lot about the code structure until I have encountered following instruction sequence:
lgdt; load global descriptor table
mov eax, cr0; switch adressing mode and paging control via cr0
or eax, 0x1
mov cr0, eax
Actually, I have always been persuaded that all this stuff related to memory paging, the concept of kernel, protection rings was just a fancy idea born somewhere deep in some OS designer's mind. However, the existence of these instructions shows that GDT and everything that goes together is dictated by the CPU design. So, basically, according to my understanding, there is no conceptual difference between *nix and Windows (I mean mainstream OS). If my premise is correct, I would like to ask several questions.
1) If OS is written accordingly to CPU architecture, ie following the same design rules established since the dawn of ages, why OS developer job openings are often tagged as "R&D engineering"? Besides using 64-bit registers, what kind of novelty can be introduced into OS development?
2) Is there any way to build an OS which doesn't have any of aforementioned features and still can use all the capacities of CPU, ie not to be restricted to 16-bit instruction set? What about Kolibri OS or Menuet OS? Besides being written on ASM, do these OS conceptually differ from MS Windowes or *nix? Do they also need to establish GDT, enter protected mode, etc?
3) Who influences who? Gates calls to Intel designers and says he needs a GDT feature in CPU? Or electrical engineers from Intel call Gates and point him that he has no choice but have to live all his life with GDT and paging? In general how do CPU designers interact with software community?

how do operating system call bios functions?

since we know that operating system works in protected mode
and BIOS works in a Real mode ( 16 bit). so when an interrupt is called from
either Operating System or application program, does CPU switch mode back and forth everytime?
In general; hardware is capable of doing lots of things at once (playing sounds while generating 3D graphics while sending data to network while transferring data to multiple disks while waiting for user input while all CPUs are busy doing actual processing); and BIOS functions are not capable of allowing more than one thing to happen at time (e.g. will waste 100% of CPU time waiting for a hard disk controller to transfer data while the CPU does nothing and while nothing else can use any other BIOS service for anything).
For this reason alone, BIOS services are not usable and not used by any modern OS (except briefly during boot).
Of course it's not the only reason - no IO prioritisation, no support for hot-plug of any kind, no support for power management, no support for system management (e.g. https://en.wikipedia.org/wiki/S.M.A.R.T. ), no support for GPU, no support for any sound card (other than "PC speaker beep"), no support for networking (excluding PXE), no support for IO APICs, etc. Ironically, out of all of the problems, "BIOS services are designed for real mode" is the least important problem because it's the only problem that an OS can work around.
Instead, each OS has native drivers that don't have any of these limitations.
Note: this is partly why it's relatively easy for modern operating systems to support UEFI (where none of the BIOS services exist at all) - for most, they replace the boot loader/boot code and don't need to change much else.

OS memory isolation

I am trying to write a very thin hypervisor that would have the following restrictions:
runs only one operating system at a time (ie. no OS concurrency, no hardware sharing, no way to switch to another OS)
it should be able only to isolate some portions of RAM (do some memory translations behind the OS - let's say I have 6GB of RAM, I want Linux / Win not to use the first 100MB, see just 5.9MB and use them without knowing what's behind)
I searched the Internet, but found close to nothing on this specific matter, as I want to keep as little overhead as possible (the current hypervisor implementations don't fit my needs).
What you are looking for already exists, in hardware!
It's called IOMMU[1]. Basically, like page tables, adding a translation layer between the executed instructions and the actual physical hardware.
AMD calls it IOMMU[2], Intel calls it VT-d (please google:"intel vt-d" I cannot post more than two links yet).
[1] http://en.wikipedia.org/wiki/IOMMU
[2] http://developer.amd.com/documentation/articles/pages/892006101.aspx
Here are a few suggestions / hints, which are necessarily somewhat incomplete, as developing a from-scratch hypervisor is an involved task.
Make your hypervisor "multiboot-compliant" at first. This will enable it to reside as a typical entry in a bootloader configuration file, e.g., /boot/grub/menu.lst or /boot/grub/grub.cfg.
You want to set aside your 100MB at the top of memory, e.g., from 5.9GB up to 6GB. Since you mentioned Windows I'm assuming you're interested in the x86 architecture. The long history of x86 means that the first few megabytes are filled with all kinds of legacy device complexities. There is plenty of material on the web about the "hole" between 640K and 1MB (plenty of information on the web detailing this). Older ISA devices (many of which still survive in modern systems in "Super I/O chips") are restricted to performing DMA to the first 16 MB of physical memory. If you try to get in between Windows or Linux and its relationship with these first few MB of RAM, you will have a lot more complexity to wrestle with. Save that for later, once you've got something that boots.
As physical addresses approach 4GB (2^32, hence the physical memory limit on a basic 32-bit architecture), things get complex again, as many devices are memory-mapped into this region. For example (referencing the other answer), the IOMMU that Intel provides with its VT-d technology tends to have its configuration registers mapped to physical addresses beginning with 0xfedNNNNN.
This is doubly true for a system with multiple processors. I would suggest you start on a uniprocessor system, disable other processors from within BIOS, or at least manually configure your guest OS not to enable the other processors (e.g., for Linux, include 'nosmp'
on the kernel command line -- e.g., in your /boot/grub/menu.lst).
Next, learn about the "e820" map. Again there is plenty of material on the web, but perhaps the best place to start is to boot up a Linux system and look near the top of the output 'dmesg'. This is how the BIOS communicates to the OS which portions of physical memory space are "reserved" for devices or other platform-specific BIOS/firmware uses (e.g., to emulate a PS/2 keyboard on a system with only USB I/O ports).
One way for your hypervisor to "hide" its 100MB from the guest OS is to add an entry to the system's e820 map. A quick and dirty way to get things started is to use the Linux kernel command line option "mem=" or the Windows boot.ini / bcdedit flag "/maxmem".
There are a lot more details and things you are likely to encounter (e.g., x86 processors begin in 16-bit mode when first powered-up), but if you do a little homework on the ones listed here, then hopefully you will be in a better position to ask follow-up questions.

Why does syscall need to switch into kernel mode?

I'm studying for my operating systems final and was wondering if someone could tell me why the OS needs to switch into kernel mode for syscalls?
A syscall is used specifically to run an operating in the kernel mode since the usual user code is not allowed to do this for security reasons.
For example, if you wanted to allocate memory, the operating system is privileged to do it (since it knows the page tables and is allowed to access memory of other processes), but you as a user program should not be allowed to peek or ruin the memory of other processes.
It's a way of sandboxing you. So you send a syscall requesting the operating system to allocate memory, and that happens at the kernel level.
Edit: I see now that the Wikipedia article is surprisingly useful on this
Since this is tagged "homework", I won't just give the answer away but will provide a hint:
The kernel is responsible for accessing the hardware of the computer and ensuring that applications don't step on one another. What would happen if any application could access a hardware device (say, the hard drive) without the cooperation of the kernel?

Virtualization and why it is good for programmers

Why does it help to know about virtualization from a programmer's perspective? Except testing and developing on several different platforms without the need of switching between operating systems is there a particular reason why virtualization is important for a programmer? Are there any details that must be kept in mind before developing on virtual instances?
I use it for testing our installer, because it is important to check whether the application will work on a clean installation of the operating system.
I used to do these tests by keeping a hard drive with a fresh operating system installation and making a copy of that disk for (almost) every new test run. This was very time consuming, and the virtual machine solution has saved me a lot of time. Note that this even allows you to do remote debugging as easily as when using two non-virtual machines.
Note: If you're interested, I'm using VirtualBox, which is a very good and free virtualization tool.
If you develop a driver or something very close to the hardware with a high risk to crash the machine, you will be glad to be working on a virtual machine.
Reverting to an old state is easier than to repair a damaged OS.
One of the main advantages is having your entire development environment as a single image file. I have a perfectly configured version of Windows Server, Visual Studio, ReSharper, etc. I can easily try a new version of something on a copy of this virtual machine without worrying about it causing problems.
I can also back up my entire dev environment to transfer it to another physical machine very easily. I've been through 3 machines in this office alone so that was a lifesaver in itself.
The only real trade-off I see is performance. You generally have to use less physical CPU cores than you actually have and less memory. With a sufficiently powerful machine this is not much of a problem though.
Edit: As nader said, I/O is obviously important for most projects as well. Although developing on a virtual machine does mean a fairly large I/O penalty compared with a native OS install, in practice I rarely find it to be a problem. The superior random access capabilities of SSDs are helping to mitigate this drawback as well.
Being able to completely reset the state of the system is very useful to debug applications which modify their environment - If the actions are repeated after a reset, and they're constrained to the sandbox environment of the VM, you are pretty much guaranteed to get the same result.
We have a large number of different versions / customer customisations of our software, and its not possible for 2 installs of our software to coexist on the same machine. Virtualisation allows us to replace the 50-60 physical machines that we need to maintain for testing and problem reproduction with 2-3 virtual servers - it takes around 10 miniutes to make a copy of a VHD template we have and create a new virtual machine, and as long as you allocate 1-2Gb of RAM the performance is comparable to that of a (slow) physical machine.
Virtual machines are also great for build machines.
Personally I do all of my development on my deskop machine for best performance, and remote debug into VMs. I dont run virtual machines on my desktop as it uses up too much RAM, we have dedicated virtual servers for that.
Good for developing, because you have same server configuration in virtual machine like on production server.
https://stackoverflow.com/questions/905926/developer-software-setup
From a user space application there should be no difference developing for a virtualised OS versus a normal OS. There may be some gotchas if your code makes explicit assumptions of the machines memory size and number of processors and believes what the hypervisor tells you.
I'm surprised no one has mentioned the ease of deployment. All you need to do is get the build down on the virtual O/S and then you can copy the image to as many new servers (running some kind of virtualization solution [like VMWare]) as you want, easily scaling your application.
Record the state of a bug in a program, and send it to the developer (along with the entire "machine").
Testing your code on various O.S, some of which you don't have.
Working in a more protected environment, making sure that the code doesn't harm your system -useful for understanding dangerous programs, like viruses, and developing security against it, for writing potentially wrong hard-drive programs, and anything that can have catastrophic effects on your system.
Easily Write your own O.S without the need to write on 'real' boot sectors, a potentially harmful act (Hope this is not new...).
Quickly use tools and programs not found on your own O.S.
Demonstrate a program at various times, by restoring a virtual machine,
quicker and less prone to failure, than trying to recreate the state at the minutes before the demo.
Less directly connected to programing, but surfing vie a virtual machine (for example to see documentation) has the added value that your own important system (and code) is less likely to be harmed by malicious programs.
From my experience in most cases the answer is typically "no" (When testing and targeting multiple platforms is removed) Both are huge reasons to be familiar with "desktop" VM solutions. Others have done an excellent job of listing rare exceptions like debugging kernel codes.
There are some quirks one must be aware of when running on a virtual machine. This is hardly an exhaustive list:
Loss of precsision or even time reversal in high resolution timers due to emulation of hardware resources (depends somewhat on the vm platform and operating system)
Virtual network interfaces ususally bridged. We've seen some extremely odd behavior in the host system with an application that sets up its own bridge between virtual interfaces -behavior which logically should not effect the host in one of the leading VM solutions.
Usage models - If your product has orwellian licensing codes or records state dependant behavior when interacting with remote systems you should account for what would happen if a system were "paused" and "restarted" or restarted from an earlier "state". Normally this kind of thing would be taken into account anyway in a robust implementation.
If you are developing in a virtual environment you will want to make sure you know what specifications were used to create the environment. If you have say a 4 Gig machine and create a virtual environment with 1 Gig you will want to make sure things in your development do not grow to a point that it overruns the memory. This will cause slight performance problems. I personally ran into this and it was a pretty tricky thing to track down. The scenario was that I was fixing a bug and testing it in a virtual environment. I did not setup the virtual environment by the way... The application took a performance hit because of all of the memory swapping that was taking place.
A very good use for a virtual environment is when you are developing applications that mess with the Windows Gina. It's much easier to reinstall a virtual environment than an entire PC....(been here done that too).
I do all of my development on a virtual XP instance under VMWare Fusion so that I can use a Mac for everything and still write .NET code ;-)
Sometimes they are necessary, because the platform you are programming doesn't support the standard developer environment. One such example is Sharepoint. As of Sharepoint 2007 you still need a server OS to install Sharepoint 2007, WSS, and the Visual Studio Sharepoint Extensions (VseWSS).
Thus for Sharepoint I have to use a Window Server VM to do my development work. As for Sharepoint 2010 they are supporting installations on Vista and 7 x64, but I will still use a VM, because I don't want to have Sharepoint on my main machine slowing everything down. Rather I want it in a VM where the services are on when needed and off when I don't without having to manually turn off/on each service. This in addition to the many great answers posted above.