MS-DOS: what determines the memory model selection?

In this article we can see that 16-bit systems have different memory models.
Through that answer we know that a COM application always uses the Tiny model (all segments are the same one), but for other executables, what makes the operating system use one model or another?
I did not see any flag in the MS-DOS header that would help with this choice, so how does MS-DOS determine which memory model to use?

The memory model is selected by a compiler option; it is not determined by the OS. You can assume that DOS itself always works with the Large memory model (far pointers for both code and data).
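To make the near/far distinction concrete, here is a minimal sketch in standard C (not DOS compiler syntax) of the two pointer shapes a 16-bit compiler chooses between when a memory model is selected. The type and function names are hypothetical; the only hardware rule assumed is the real-mode one, linear address = segment * 16 + offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a far pointer: 16-bit segment plus 16-bit offset. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} far_ptr;

/* Real-mode address translation: linear = segment * 16 + offset. */
uint32_t far_to_linear(far_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* A near pointer is only an offset; the segment is whatever the program
 * loaded into DS (data) or CS (code) beforehand. */
uint32_t near_to_linear(uint16_t current_segment, uint16_t near_offset)
{
    far_ptr p = { current_segment, near_offset };
    return far_to_linear(p);
}

/* Build one (of many possible) far pointers for a linear address
 * inside the 1 MiB real-mode address space. */
far_ptr linear_to_far(uint32_t linear)
{
    assert(linear <= 0xFFFFFu);            /* must fit in 20 bits */
    far_ptr p = { (uint16_t)(linear >> 4), (uint16_t)(linear & 0xFu) };
    return p;
}
```

In the Tiny and Small models the compiler emits only the near form, while the Large model uses the far form for both code and data. Either way the executable file carries no record of that choice, which is consistent with the header having no flag for it.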

Related

Is UEFI required to map 4k pages on x64?

I am creating a kernel for x64 which boots with UEFI. While the kernel has to be loaded at a low-ish address (I believe, because UEFI requires identity-mapped pages, so it cannot be mapped higher than the highest physical address), I want to relocate it up to the end of memory. During this process I intend to create new paging structures, and in order to reduce memory consumption I wanted to reuse the page tables used to map the image in the lower half. However, these page tables will only exist if 4k paging is used by UEFI, so my question is whether or not UEFI is required to use 4k paging on x64. I believe the answer is no, but I hope otherwise and wanted to see if this is true.
Now I understand that UEFI allocates memory via BootServices->AllocatePages in 4k chunks it refers to as pages, but is this required to translate to the actual mapping structure used? I noticed that section 2.3.6 of the UEFI 2.8 specification, the section referring to AArch64 calling conventions, states:
MMU configuration: Implementations must use only 4k pages [...]
There is no similar statement in section 2.3.4, on the x64 calling conventions, which is why I believe the answer is no.
EDIT:
Based upon what I've already seen and the comment by Peter Cordes, I believe the standard does not specify exactly what it should be. Thus a revised version of the question is: Does the standard specify 4k translation granularity? If not, do most UEFI vendors on x64 use 4k pages?
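Whatever the standard says, a kernel can check at run time which granularity the firmware actually used before deciding to reuse a page table. The following is a minimal sketch, assuming 4-level paging and the bit layout documented in the Intel and AMD manuals (bit 0 = present, bit 7 = page size in a PDPT or PD entry). The struct, enum, and function names are hypothetical, and reading CR3 and locating the four entries for a given virtual address is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRY_PRESENT   (1ull << 0)  /* P bit: the entry is valid */
#define ENTRY_PAGE_SIZE (1ull << 7)  /* PS bit: PDPTE/PDE maps a large page */

typedef enum { MAP_NONE, MAP_4K, MAP_2M, MAP_1G } map_size;

/* Hypothetical container: the raw entries that cover one virtual address. */
typedef struct {
    uint64_t pml4e;
    uint64_t pdpte;
    uint64_t pde;
    uint64_t pte;   /* ignored when a higher level maps a large page */
} walk_entries;

/* Classify how a virtual address is mapped, given its four raw entries. */
map_size classify_mapping(const walk_entries *w)
{
    assert(w != NULL);
    if (!(w->pml4e & ENTRY_PRESENT) || !(w->pdpte & ENTRY_PRESENT))
        return MAP_NONE;
    if (w->pdpte & ENTRY_PAGE_SIZE)
        return MAP_1G;              /* no page directory exists below */
    if (!(w->pde & ENTRY_PRESENT))
        return MAP_NONE;
    if (w->pde & ENTRY_PAGE_SIZE)
        return MAP_2M;              /* no page table exists below */
    return (w->pte & ENTRY_PRESENT) ? MAP_4K : MAP_NONE;
}
```

Only a MAP_4K result means a last-level page table exists for that range and could be reused; for MAP_2M or MAP_1G the kernel would have to build its own.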

What exactly is a machine instruction?

The user's program in main memory consists of machine instructions and
data. In contrast, the control memory holds a fixed microprogram that
cannot be altered by the occasional user. The microprogram consists of
microinstructions that specify various internal control signals for
execution of register microoperations. Each machine instruction
initiates a series of microinstructions in control memory. These
microinstructions generate the microoperations to fetch the instruction
from main memory; to evaluate the effective address, to execute the
operation specified by the instruction, and to return control to the
fetch phase in order to repeat the cycle for the next instruction.
I don't exactly understand the difference here between machine instructions, microinstructions and microoperations. I do understand that, according to the paragraph given, microinstructions are the intermediate level of instructions, but which of the other two is closer to machine language? Are CLA, ADD, STA, BUN, BSA, AND etc. machine instructions or microoperations?
A CPU presents itself to the outside as a device capable of executing machine instructions. For example,
mov (%esi,%ebx,4), %edx
is a machine instruction that moves 4 bytes of data at address ESI+4*EBX into register EDX. Machine instructions are public: they are published by the CPU manufacturer in a user manual. Compilers such as gcc output files that contain machine instructions, and these typically end up in EXE/DLL files.
If you look closely at the above instruction, you will see that it is a fairly complex operation. It involves some arithmetic (a multiplication and an addition) to get the memory address, then moving data from that address into a register. From the CPU's perspective, it makes sense to reuse the arithmetic unit that is already there for the address calculation. So it is natural to break this instruction down into microinstructions. In essence, the mov instruction is implemented internally by the CPU as a microprogram written in microinstructions. This is, however, an implementation detail of the CPU. Microinstructions are internal to the CPU and are invisible to anybody except the CPU manufacturer.
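As an illustration only, the steps hidden inside that single mov can be modelled in C. The register struct, the function name, and the four-step split are assumptions made for this sketch; real CPUs do not publish their internal breakdown. Memory is a plain byte array, and the load is assembled little-endian as on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file holding only the registers this example uses. */
typedef struct {
    uint32_t esi, ebx, edx;
} regs;

/* Models mov (%esi,%ebx,4), %edx as a sequence of simpler internal steps. */
void exec_mov_scaled_load(regs *r, const uint8_t *mem, size_t mem_size)
{
    /* Step 1: scale the index register (multiply by 4 with a shift). */
    uint32_t scaled = r->ebx << 2;

    /* Step 2: add the base register to get the effective address. */
    uint32_t addr = r->esi + scaled;

    /* Sketch only: a real CPU would raise a fault, not abort. */
    assert((size_t)addr + 4 <= mem_size);

    /* Step 3: load 4 bytes from memory, least significant byte first. */
    uint32_t value = (uint32_t)mem[addr]
                   | ((uint32_t)mem[addr + 1] << 8)
                   | ((uint32_t)mem[addr + 2] << 16)
                   | ((uint32_t)mem[addr + 3] << 24);

    /* Step 4: write the result back to the destination register. */
    r->edx = value;
}
```

The program only ever sees the one machine instruction; steps like these stay inside the CPU and can differ from one model to the next.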
Microinstructions have several benefits:
they simplify internal CPU architecture, design and testing, thus lowering cost per unit
they make it easy to create rich and powerful sets of machine instructions (you just have to combine microinstructions in different ways)
they provide a consistent machine language across different CPUs (e.g. Xeon and Pentium both implement the basic x86_64 instruction set even though they are very different in hardware)
they allow optimizations (e.g. the same instruction can be implemented directly in hardware on one CPU and emulated in microinstructions on another)
they allow bugs to be fixed in the field (e.g. a microcode update can address the Spectre vulnerability while the machine is running, without buying a new CPU or opening your server)
For more information, see https://en.wikipedia.org/wiki/Micro-operation
I think the answer to your question is in these three sentences:
The user's program in main memory consists of machine instructions and data
Each machine instruction initiates a series of micro-instructions in control memory.
These micro-instructions generate micro-operations.
So:
The user supplies machine instructions
Those get translated into micro-instructions
Those get translated into micro-operations
The mnemonics you mentioned are what the user might use to write or read a list of machine instructions (the actual instructions just being patterns of bits understood by the processor). The "occasional user" (i.e. everyone other than the chip's designer) never needs to deal directly in micro-instructions or micro-operations, so would never know individual names for them.

ARM11/ARMv6 cache flushing on VM mapping changes?

I'm writing a toy operating system for the Raspberry Pi, which is based around an ARM11/ARMv6. I want to use basic memory mapping features, mainly so I can swap code in and out of a particular virtual address. I'm intending to use the 1MB sections because they look pretty simple and they're big enough that I only need to change one at a time.
However, there are two things I haven't been able to figure out yet --- the ARM ARM is nigh impenetrable...
When changing a mapping by updating a TLB table entry, do I need to invalidate that region of virtual address space? Some of the diagrams I've seen indicate that the caches are attached to physical memory, which suggests no, but the caching behaviour is controlled by flags on the TLB table entry, which suggests yes.
If I have two regions of virtual memory pointing at the same physical location, are they cache coherent? Can I write to one and then assume that the data is immediately readable from the other? It'd make life loads easier if it were...
Does anyone know the answers for sure?

Writing to hard disk from contiguous physical memory

I have an ARM based device, running linux, which is connected to a camera, and I'm trying to store captured frames to HD efficiently.
I'm developing in user space, but can modify drivers at will
I'm coding in C
Frames are written into memory using DMA, and I have their physical memory pointer.
I am able to control the whole frame capturing flow, and I can tell when the frame buffers are stable (dequeued from the video4linux driver)
Linux version is 3.0.35
I'm familiar with kernel source code, not an expert, but I'm able to find my way in it and figure out things, as long as I get some hints...
I believe I have 2 alternatives:
Find the optimal configuration for my filesystem, for opening the file and writing into it. I'm now using ext4, and standard fopen() fwrite() functions. I understand I can also use mmap, or add O_DIRECT flag when calling open(), but didn't try it yet.
Find a way to pass the physical address of the buffer (I can get it
from my Video4Linux driver) directly to the filesystem/hard drive driver,
so the data will be transferred directly from there.
I found method 1 to be slow, having memory transactions as my bottleneck, since fwrite involves copying data from userspace to kernel space, and then again into some sort of cache, and then on to DMA. Too many memory transactions for a simple store...
Regarding method 2 - I don't know if that's possible, but if I was the one designing this system from scratch, this is what I would do.
Any thoughts?
Regarding method 1 (using open() and write(), mmap() and/or O_DIRECT),
can you recommend optimal settings for my purpose?
Is method 2 (storing to HD directly from an existing DMA buffer) possible? If so - can you point me to an example?
The only problem with writing into a file via mmap on UNIX systems is that you either have to deal with signals when the disk runs out of space,
or you have to make certain that the file is not sparse,
so that all the needed disk space is already allocated.
I think an up-to-date G++ provides a method of converting signals into C++ exceptions,
but I'm not certain how well supported this is on systems other than Mac OS.
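The "not sparse" option can be sketched in C as follows. This is a minimal illustration assuming a Linux/POSIX system; write_frame_mmap is a hypothetical helper, not part of any driver or library. It reserves the disk blocks with posix_fallocate() before mapping, so running out of space shows up as an ordinary error return rather than as a signal (SIGBUS on Linux) while the mapping is being written. It still copies the frame once with memcpy(), so it does not by itself give the zero-copy path asked about in method 2.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store one frame in a new file through mmap.
 * Returns 0 on success, -1 on failure. */
int write_frame_mmap(const char *path, const void *frame, size_t len)
{
    assert(path != NULL && frame != NULL && len > 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Allocate real blocks for the whole file, so it is not sparse.
     * posix_fallocate returns an error number directly, not -1. */
    if (posix_fallocate(fd, 0, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    void *dst = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (dst == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(dst, frame, len);            /* one copy into the page cache */
    int rc = msync(dst, len, MS_SYNC);  /* block until the data is written out */

    munmap(dst, len);
    close(fd);
    return rc == 0 ? 0 : -1;
}
```

For a stream of frames it would probably be better to preallocate one large file once and map windows of it, rather than creating a file per frame; that choice is outside what the answer above covers.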

Data management in matlab versus other common analysis packages

Background:
I am analyzing large amounts of data using an object-oriented composition structure for sanity and easy analysis. Often the highest level of my OO hierarchy is an object that is about 2 gigs when saved. Loading the data into memory is not an issue always, and populating sub-objects and then higher-level objects based on their content is much more Java-memory efficient than just loading in a lot of MAT-files directly.
The Problem:
Saving these objects when they are > 2 gigs will often fail. It is a somewhat well-known problem that I have gotten around by deleting a number of sub-objects until the total size is below 2-3 gigs. This happens regardless of how powerful the computer is; a machine with 16 gigs of RAM, 8 cores, etc. will still fail to save the objects correctly. Back-versioning the save also does not help.
Questions:
Is this a problem that others have solved somehow in MATLAB? Is there an alternative that I should look into that still has a lot of high level analysis and will NOT have this problem?
Questions welcome, thanks.
I am not sure this will help, but here goes: are you making sure to use a recent version of the MAT-file format? See, for instance, the documentation for save. Quoting from the page:
'-v7.3' | 7.3 (R2006b) or later | Version 7.0 features plus support for data items greater than or equal to 2 GB on 64-bit systems.
'-v7' | 7.0 (R14) or later | Version 6 features plus data compression and Unicode character encoding. Unicode encoding enables file sharing between systems that use different default character encoding schemes.
Also, could your object by any chance be, or contain, a graphics handle object? In that case, it is wise to use hgsave.