Why do you need blocks when you have sectors, and why is the block size a multiple of the sector size?

While reading about disk structure, I came across the statement that block size is a multiple of sector size. My first thought is: why do you even need blocks when you have sectors? And secondly, why is the block size a multiple of the sector size, like 1, 2, or 4?
Why can't it be half a sector? What's the rationale here? This is not for homework.

A block is an abstraction of the filesystem. Filesystem operations can access data only in multiples of the block size; in other words, the smallest logically addressable unit for a filesystem is the block, not the sector.
The smallest addressable unit on a block device is a sector. The sector size is a physical property of the block device and is the fundamental unit of all block devices.
Most block devices have 512-byte sectors (although other sizes exist; some CD-ROM discs, for example, have 2 KB sectors), while block sizes are commonly 512 bytes, 1 KB, or 4 KB. Since the hardware cannot transfer less than a sector at a time, a block cannot be smaller than a sector and must map onto whole sectors. That is why the block size is a multiple of the sector size.
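If you're curious, on Linux you can query both numbers for a device with a couple of ioctls. A minimal C sketch (/dev/sda is an example path and the program typically needs read permission on the device):
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

int main(void) {
    int fd = open("/dev/sda", O_RDONLY);   /* example device path */
    if (fd < 0) { perror("open"); return 1; }

    int sector_size = 0;  /* logical sector size: the hardware unit     */
    int block_size = 0;   /* soft block size used by the kernel layers  */
    ioctl(fd, BLKSSZGET, &sector_size);
    ioctl(fd, BLKBSZGET, &block_size);

    printf("sector size: %d bytes\n", sector_size);
    printf("block size:  %d bytes\n", block_size);
    close(fd);
    return 0;
}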

Early in the computing industry, the term "block" was loosely used to refer to a small chunk of data. Later the term referring to the data area was replaced by sector, and block became associated with the data packets that are passed in various sizes by different types of data streams.
Read more here: http://en.wikipedia.org/wiki/Disk_sector

Related

Error using zeros Out of memory

When I try running
Adj = zeros(x*y);
I am receiving the following error:
Error using zeros
Out of memory. Type HELP MEMORY for your options.
where x*y = 37901. (Screenshot of my PC's drive occupancy omitted.)
I know the C drive doesn't have much space, but 34.2 GB should be more than enough for creating a 37901*37901 matrix.
When I run the memory command this is what I got:
>> memory
Maximum possible array: 4825 MB (5.059e+09 bytes) *
Memory available for all arrays: 4825 MB (5.059e+09 bytes) *
Memory used by MATLAB: 12369 MB (1.297e+10 bytes)
Physical Memory (RAM): 12218 MB (1.281e+10 bytes)
* Limited by System Memory (physical + swap file) available.
How can I solve this issue? (I am using MATLAB 2017b)
Actually, on the coding side, variables are normally stored in memory (your computer's RAM) rather than on the hard disk. That's what your error is complaining about: you don't have enough memory to store the variable you want to allocate.
The default numeric type in MATLAB is double, which represents a double-precision floating-point value and takes up 8 bytes of memory. Hence, you are trying to allocate:
37901 * 37901 * 8 = 11,491,886,408 bytes
~= 10.7 gigabytes
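As a quick sanity check of that figure, a minimal C sketch (the matrix dimension is taken from the question):
#include <stdio.h>

int main(void) {
    /* 37901 x 37901 matrix of 8-byte doubles */
    unsigned long long n = 37901ULL;
    unsigned long long bytes = n * n * sizeof(double);
    printf("%llu bytes (%.1f GiB)\n", bytes, bytes / (1024.0 * 1024 * 1024));
    return 0;
}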
And that is the problem: you only have something like 11.9 gigabytes of physical memory, and MATLAB is telling you that it can't allocate an array greater than about 4.7 gigabytes. As a workaround, I suggest you take a look at Tall Arrays, a MATLAB feature tailored to handling very big data flows:
Tall arrays are used to work with out-of-memory data that is backed by a datastore. Datastores enable you to work with large data sets in small chunks that individually fit in memory, instead of loading the entire data set into memory at once. Tall arrays extend this capability to enable you to work with out-of-memory data using common functions.
What is a Tall Array?
Since the data is not loaded into memory all at once, tall arrays can be arbitrarily large in the first dimension (that is, they can have any number of rows). Instead of writing special code that takes into account the huge size of the data, such as with techniques like MapReduce, tall arrays let you work with large data sets in an intuitive manner that is similar to the way you would work with in-memory MATLAB® arrays. Many core operators and functions work the same with tall arrays as they do with in-memory arrays. MATLAB works with small chunks of the data at a time, handling all of the data chunking and processing in the background, so that common expressions, such as A+B, work with big data sets.
Benefits of Tall Arrays
Unlike in-memory arrays, tall arrays typically remain unevaluated until you request that the calculations be performed using the gather function. This deferred evaluation allows you to work quickly with large data sets. When you eventually request output using gather, MATLAB combines the queued calculations where possible and takes the minimum number of passes through the data. The number of passes through the data greatly affects execution time, so it is recommended that you request output only when necessary.

Advantages of segmentation in the 8086 microprocessor

What are the advantages of segmentation in the 8086 microprocessor?
I'm not getting the importance of segmentation. Is it for managing more memory?
The instruction set used in the 8086 is a 16-bit instruction set. This means that a register can only store values in the range 0x0000 to 0xFFFF, and instructions mostly performed 16-bit operations (16-bit addition, 16-bit subtraction, etc.). If a register contains an address/pointer, that works out to a maximum of 64 KiB of address space (some for ROMs, some for RAM), and this wasn't enough for the market at the time.
Segmentation was a way to allow the 16-bit CPU to support a larger address space. Essentially, combining two 16-bit registers together, so that addresses/pointers could be much larger. Unfortunately (likely, to avoid "unnecessary at the time" costs of having more address lines on the CPU's bus), instead of using two 16-bit registers as a 32-bit address, Intel did an "address = segment * 16 + offset" thing to end up with a 20-bit address, giving the 8086 a 1 MiB address space.
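To make that concrete, here is the real-mode address calculation as a minimal C sketch (the segment:offset values are just examples):
#include <stdio.h>
#include <stdint.h>

/* 8086 real-mode address: 20-bit physical = segment * 16 + offset */
uint32_t phys_addr(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + offset;
}

int main(void) {
    /* Example: segment 0x1234, offset 0x5678 -> 0x179B8 */
    printf("0x%05X\n", phys_addr(0x1234, 0x5678));
    return 0;
}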
Later (early 1980s) there was a push towards "protected objects" where "objects" (in object oriented programming) could be given access controls and limits that are enforced/checked by hardware, and around the same time there were "virtual memory" ideas floating around. These ideas led to the ill-fated iAPX 432 CPU; but also led to the idea of associating protection (attributes and limits) to the segments that 8086 already had, which resulted in the "protected mode" introduced with 80286 (and extended in 80386).
Essentially, the original reason for (advantage of) segments was to increase the address space (without the cost of a 32-bit instruction set, etc.); things like protection and memory management were retrofitted afterwards (and then barely used by software before being abandoned in favour of paging).
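As a rough illustration of that retrofitted protection, here is a toy base/limit check in C. It is loosely modelled on 286-style segment checks; the descriptor layout is invented for the sketch, not the real format:
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

/* Toy segment descriptor: base, limit, and a writable flag. */
typedef struct {
    uint32_t base;
    uint32_t limit;    /* highest valid offset within the segment */
    bool     writable;
} segment_desc;

/* Returns the linear address, or -1 on a protection violation
   (where real hardware would raise a general-protection fault). */
int64_t seg_access(const segment_desc *seg, uint32_t offset, bool is_write) {
    if (offset > seg->limit)        return -1;  /* out of bounds     */
    if (is_write && !seg->writable) return -1;  /* read-only segment */
    return (int64_t)seg->base + offset;
}

int main(void) {
    segment_desc code = { .base = 0x10000, .limit = 0x3FFF, .writable = false };
    printf("%lld\n", (long long)seg_access(&code, 0x100, false));  /* ok    */
    printf("%lld\n", (long long)seg_access(&code, 0x100, true));   /* fault */
    return 0;
}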
Answer
Memory is divided into segments of various sizes.
A segment is just an area in memory.
The process of dividing memory in this way is called segmentation.
data → bytes → specific address.
The 8086 has a 20-line address bus,
so it can address 2^20 bytes = 1 MB.
There are 4 types of segments:
Code Segment
Data Segment
Stack Segment
Extra Segment
Each of these segments is addressed by an address stored in the corresponding segment register. These registers are 16 bits in size and hold the upper 16 bits of the 20-bit base address of the corresponding segment.

Addressing a word inside memory frames

Suppose we have a 64-bit processor and 8 GB of RAM, with a frame size of 1 KB.
The main memory size is 2^33 B,
so the number of frames is 2^33 / 2^10 = 2^23 frames.
So we need 23 bits to uniquely identify every frame,
and the address split would be 23 | 10, where 10 bits are required to identify each byte in a frame (1024 bytes in total).
If the machine is word-addressable with each word = 8 B, will the address split now be 23 | 7, since we have 2^7 words in each frame?
Also, can the data bus size be different from the word size?
If the data bus size is 128 bits, does that mean we can address two words and transfer two words at a time in a single bus cycle, but can only perform 64-bit operations?
Most of the answers depend on how the system is designed. Also, there is a bit more to the picture than your question suggests.
There is something called the addressable space of a system. For a 32-bit application this is 2^32, and for a 64-bit application it is 2^64. This is called virtual memory. Then there is physical memory, commonly referred to as RAM. If the application is built as 64-bit, it can work as if 2^64 bytes of memory were available. The underlying hardware may not have 2^64 bytes of RAM; that is taken care of by the memory management unit. Basically, it breaks both virtual and physical memory into pages (what you referred to as frames) and keeps the most frequently used pages in RAM. The rest are stored on the hard disk.
Now, you state that the RAM is 8 GB, which supports 2^33 addressable locations. When you say the processor is 64-bit, I presume you are talking about a 64-bit system that supports 2^64 addressable locations. Remember that an application is free to access any of these 2^64 locations, so the number of virtual pages is 2^64 / 2^10 = 2^54. We need to know which virtual page is mapped to which physical page; a table called the page table holds this information. So we take the upper 54 bits of the address and index into this table, which returns the physical page (frame) number. That frame number is 23 bits wide, because there are 2^33 / 2^10 = 2^23 frames. We combine these 23 bits with the lowest 10 bits of the virtual address, which gives us the physical address. In a typical CPU, once the address is calculated we don't just go and fetch it: first we check whether it's available in the cache, all the way down the hierarchy. If it's not available, a fetch request is issued. When a cache issues a fetch request to main memory, it fetches an entire cache line (which is usually a few words).
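Schematically, the translation looks like this (a toy C sketch; real systems use multi-level page tables, since a flat table over a 2^54-page virtual space could never fit in memory, so this demo covers only 4 pages):
#include <stdio.h>
#include <stdint.h>

#define PAGE_BITS 10                    /* 1 KB pages/frames */
#define PAGE_SIZE (1u << PAGE_BITS)

/* Toy page table: virtual page number -> physical frame number. */
static const uint32_t page_table[4] = { 7, 3, 0, 5 };

uint64_t translate(uint64_t vaddr) {
    uint64_t vpn    = vaddr >> PAGE_BITS;        /* upper bits: page #  */
    uint64_t offset = vaddr & (PAGE_SIZE - 1);   /* lower 10 bits       */
    uint64_t frame  = page_table[vpn];           /* 23-bit frame number */
    return (frame << PAGE_BITS) | offset;        /* physical address    */
}

int main(void) {
    /* Virtual address 0x412 = page 1, offset 0x12 -> frame 3 -> 0xC12 */
    printf("0x%llx\n", (unsigned long long)translate(0x412));
    return 0;
}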
I'm not sure what you mean by the following question.
If the machine is word-addressable with each word = 8 B, will the address split now be 23 | 7, since we have 2^7 words in each frame?
Memories are typically designed to be byte-addressable. Therefore the split stays 23 | 10: you still need all 33 bits to locate a byte, with 10 offset bits per frame, even though the CPU operates on 8-byte words.
Also can the data bus size be different than word size ?
Yes, you can design a data bus of any width, though having it narrower than a byte would be painful.
If the data bus size is 128 bits, does that mean we can address two words and transfer two words at a time in a single bus cycle, but can only perform 64-bit operations?
Again, the question is a bit unclear. If the data bus is 128 bits wide and your cache line is wider than 128 bits, it will take multiple bus cycles to return the data in response to a cache miss (for example, a 64-byte line over a 16-byte bus takes four transfers). You won't be doing operations on partial data in the cache (at least to the best of my knowledge), so you'll wait until the entire cache line is returned. Once it's there, there is no restriction on what operations you can perform on that line.

Difference between paging and segmentation

I am trying to understand both paradigms of memory management; however, I fail to see the big picture and the difference between the two. Paging consists of taking fixed-size pages from secondary storage into primary storage in order to do some task requested by a process. Segmentation consists of assigning each unit in a process its own address space, so that it is allowed to grow. I don't quite see how they are related, and that's because there are still a lot of holes in my understanding. Can someone fill them in?
I think you have something confused. One problem you have is that the term "segment" had multiple meanings.
Segmentation is a method of memory management. Memory is managed in segments that are of variable or fixed length, depending upon the processor. Segments originated on 16-bit processors as a means to access more than 64K of memory.
On the PDP-11, programmers used segments to map different memory into the 64K address space. At any given time a process could only access 64K of memory but the memory that made up that 64K could change.
The 8086 and its successors used segments with base registers. Each segment could address 64K (a limit that grew with later processors), and a process could have 4 segments (more in later processors).
Paging allows a process to have a larger address space than there is physical memory available.
The 8086's successors used the kludge of paging on top of segments. However, that bit of ugliness has finally gone away in 64-bit mode.
You got your answer right there: paging relates to fixed-size pages in storage, while segmentation deals with units within a process. 'Segments' are objects in the class 'Page'.

How big can a memory-mapped file be?

What limits the size of a memory-mapped file? I know it can't be bigger than the largest contiguous chunk of unallocated address space, and that there should be enough free disk space. But are there other limits?
You're being too conservative: a memory-mapped file can be larger than the address space. The view of the memory-mapped file is limited by OS memory constraints, but that's only the part of the file you're looking at at one time. (And I guess technically you could map multiple views of discontinuous parts of the file at once, so aside from overhead and page-length constraints, it's only the total number of bytes you're looking at that poses a limit. You could look at bytes [0 to 1024] and bytes [2^40 to 2^40 + 1024] with two separate views.)
In MS Windows, look at the MapViewOfFile function. It effectively takes a 64-bit file offset and a 32-bit length.
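For example, a view at a large 64-bit offset can be mapped like this (a minimal Win32 C sketch with most error handling omitted; the file name, offset, and view length are just examples):
#include <windows.h>
#include <stdio.h>

int main(void) {
    HANDLE file = CreateFileA("big.dat", GENERIC_READ, FILE_SHARE_READ,
                              NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE) return 1;

    HANDLE mapping = CreateFileMappingA(file, NULL, PAGE_READONLY, 0, 0, NULL);
    if (!mapping) return 1;

    /* View 1 MB starting at a 64-bit offset (example: 8 GB into the file).
       The offset must be a multiple of the system allocation granularity. */
    ULONGLONG offset = 8ULL * 1024 * 1024 * 1024;
    void *view = MapViewOfFile(mapping, FILE_MAP_READ,
                               (DWORD)(offset >> 32),        /* high 32 bits */
                               (DWORD)(offset & 0xFFFFFFFF), /* low 32 bits  */
                               1024 * 1024);                 /* view length  */
    if (view) {
        printf("first byte of window: %d\n", ((unsigned char *)view)[0]);
        UnmapViewOfFile(view);
    }
    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}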
This has been my experience when using memory-mapped files under Win32:
If you map the entire file into one segment, it normally tops out at around 750 MB, because it can't find a bigger contiguous block of memory. If you split it up into smaller segments, say 100 MB each, you can get around 1500-1800 MB depending on what else is running.
If you use the /3GB switch you can get more than 2 GB, up to about 2700 MB, but OS performance is penalized.
I'm not sure about 64-bit; I've never tried it, but I presume the maximum file size is then limited only by the amount of physical memory you have.
Under Windows: "The size of a file view is limited to the largest available contiguous block of unreserved virtual memory. This is at most 2 GB minus the virtual memory already reserved by the process. "
From MSDN.
I'm not sure about Linux/OS X/whatever else, but it's probably also related to address space.
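For what it's worth, the same windowing idea on POSIX systems looks roughly like this (a hedged C sketch using mmap; the file name and sizes are examples, and a 64-bit off_t is assumed, e.g. -D_FILE_OFFSET_BITS=64 on 32-bit builds):
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void) {
    int fd = open("big.dat", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* Map a 1 MB window starting 8 GB into the file; the offset must be
       a multiple of the page size (see sysconf(_SC_PAGESIZE)). */
    off_t offset = (off_t)8 * 1024 * 1024 * 1024;
    size_t length = 1024 * 1024;
    unsigned char *p = mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, offset);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    printf("first byte of window: %d\n", p[0]);
    munmap(p, length);
    close(fd);
    return 0;
}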
Yes, there are limits to memory-mapped files. Most shocking is:
Memory-mapped files cannot be larger than 2GB on 32-bit systems.
When a memmap causes a file to be created or extended beyond its current size in the filesystem, the contents of the new part are unspecified. On systems with POSIX filesystem semantics, the extended part will be filled with zero bytes.
Even on my 64-bit, 32GB RAM system, I get the following error if I try to read in one big numpy memory-mapped file instead of taking portions of it using byte-offsets:
OverflowError: memory mapped size must be positive
Big datasets are really a pain to work with.
The limit of the virtual address space is more than 16 terabytes on 64-bit Windows systems. The issue discussed here is most probably related to mixing DWORD with SIZE_T.
There should be no other limits. Aren't those enough? ;-)