I am trying to understand the OS bootstrapping process, and some points are not clear to me.
One of them is:
How does the bootstrap code in a Volume Boot Record know the absolute LBA of sector 0 of the partition in which that Volume Boot Record resides?
Within the VBR is a structure called a BIOS Parameter Block, named after the BIOS, the bottom-half of the traditional MS-DOS kernel structure. Within the BIOS Parameter Block is a field denoting the number of hidden sectors between the partition and the (MBR-style) partition table entry that encloses it. The VBR code simply reads that field out of itself and adds it to the Volume-relative Block Address to produce the LBA to read from the disc.
This is why it is impossible to boot operating systems such as Windows NT, MS/PC/DR-DOS, and OS/2 from secondary partitions directly, without assistance. In primary partitions, the BPB field is simply the LBA of the start of the volume, because the partition table that it is relative to is the primary MBR in block #0 of the disc, and everything works. In secondary partitions, because of a quirk of MS-DOS version 3 that everyone has had to remain compatible with ever since, the BPB field is only the offset of the "logical drive" within the "extended partition" containing it, so the boot code fails: it looks for the rest of the boot volume in completely the wrong place on the disc.
Boot managers provide assistance by fixing up the BPB on the fly. The VBR code of course reads the in-memory copy of itself, not the on-disc copy. So boot managers simply adjust the field of the BPB for secondary partitions to the correct absolute value, as they are loading the VBR into memory. Then everything works.
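The arithmetic and the fix-up described above can be sketched in a few lines of Python. This is only an illustration, not real boot code: it assumes the 32-bit hidden-sectors field at offset 0x1C of the boot sector (the DOS 3.31-and-later BPB layout), and the helper make_vbr and the sector numbers in the usage note are made up for the example.

```python
import struct

# Assumption: 32-bit little-endian "hidden sectors" field at offset 0x1C
# of the boot sector, as in the DOS 3.31-and-later BPB layout.
HIDDEN_SECTORS_OFFSET = 0x1C


def hidden_sectors(vbr: bytes) -> int:
    """Read the hidden-sectors field from an in-memory copy of a VBR."""
    return struct.unpack_from("<I", vbr, HIDDEN_SECTORS_OFFSET)[0]


def absolute_lba(vbr: bytes, volume_relative_block: int) -> int:
    """The sum the VBR code forms before asking for a disc read."""
    return hidden_sectors(vbr) + volume_relative_block


def fix_up_hidden_sectors(vbr: bytes, partition_start_lba: int) -> bytes:
    """What a boot manager does for a secondary partition: patch the
    in-memory copy of the VBR, leaving the on-disc copy untouched."""
    patched = bytearray(vbr)
    struct.pack_into("<I", patched, HIDDEN_SECTORS_OFFSET, partition_start_lba)
    return bytes(patched)


def make_vbr(hidden: int) -> bytes:
    """Hypothetical helper: a blank 512-byte sector with only the field set."""
    sector = bytearray(512)
    struct.pack_into("<I", sector, HIDDEN_SECTORS_OFFSET, hidden)
    return bytes(sector)
```

For instance, if a logical drive's on-disc VBR says 63 but the drive really starts at LBA 1000063 (made-up numbers), absolute_lba(make_vbr(63), 10) gives 73, the wrong place, while the same call on fix_up_hidden_sectors(make_vbr(63), 1000063) gives 1000073.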
We usually learn about virtual memory and paging at the same time in Operating Systems courses, and they seem to depend on each other. However, I wonder whether they can exist independently of each other?
The answer to your question depends on how you define "Virtual Memory". If you define it just as "the addresses that the application sees", then yes Virtual Memory can exist without paging.
Prior to paging, systems used segmentation to isolate user processes. Put simply, every process has its own segment. All the addresses it "sees" are just offsets inside the segment. The hardware implicitly adds the segment base to the address requested by the application to get the physical address. Just like the page table, the segment bases can be modified only by the kernel, so it can effectively isolate memory between processes while still allowing some parts of memory to be shared between them.
Segments also have limits, which are checked before every access to ensure that the user doesn't use a very big offset and spill into another process's memory.
Segmentation has been largely removed from the Intel x86_64 architecture: in 64-bit mode the segment registers still exist, but the bases of %cs, %ds, %es and %ss are treated as 0. Only %fs and %gs keep a usable base address, and the hardware performs no limit checks on them. The OS now uses these two segments for thread-local storage.
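The base-plus-offset translation with a limit check described above can be modelled in a few lines of Python. This is a toy model, not how any particular CPU implements it; the names translate and SegmentFault are made up for the illustration.

```python
class SegmentFault(Exception):
    """Raised when an offset falls outside the segment's limit."""


def translate(base: int, limit: int, offset: int) -> int:
    """Toy model of segmented address translation.

    base and limit stand for values that only the kernel can set;
    offset is the address the application "sees".
    """
    # The limit check happens before the access is allowed to proceed.
    if not 0 <= offset < limit:
        raise SegmentFault(f"offset {offset:#x} outside limit {limit:#x}")
    # The hardware implicitly adds the segment base to the offset.
    return base + offset
```

With a segment based at 0x4000 and a limit of 0x1000, translate(0x4000, 0x1000, 0x10) yields 0x4010, while an offset of 0x1000 raises SegmentFault instead of reaching the next process's memory.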
I am developing a low level SATA driver for an FPGA based embedded system. The driver is running fine.
When I read sectors from the disk using the dd command, I can see that the SCSI read(10) command (opcode 0x28) is received by my low-level driver, which I think is correct. But when I write sectors to the disk using dd, the SCSI driver first sends the read(10) command (opcode 0x28) several times and only then a few write(10) commands (opcode 0x2A).
Can someone explain why the SCSI driver sends read() commands during a write operation?
Edited: During a file write operation I can see that the driver first reads (DMA mode) from LBA 0 up to some 8 sectors. Then it writes (DMA) sg blocks, and then it reads (PIO) disk-specific information. After that it takes some random LBAs and performs several reads (DMA); finally it stops by reading device-specific data (PIO). This is the sequence for dd'ing a 1KB file. The disk has no partition table and no file system (verified with the fdisk utility).
Is this normal behaviour for the driver? If so, doesn't it reduce the speed of the operation? Overall, reading a file is faster than writing one because of the extra reads in the write operation.
Thank you
It's hard to say concretely without knowing more about your system. Two possibilities come to mind:
Linux is looking for partition tables. This is likely the case if the reads are to LBA 0 and the first few logical blocks, or if the reads are to the very end of the device, where there is a secondary GPT header.
You're dd'ing a file on a filesystem, and the filesystem is reading in uncached metadata.
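To make the first possibility concrete, here is a rough Python sketch of what a partition scan has to look at. It is not the kernel's actual probing code, which checks more formats than this. It assumes 512-byte sectors and relies only on two well-known markers: the 0x55 0xAA signature in the last two bytes of an MBR sector, and the "EFI PART" signature at the start of a GPT header. The function names are made up.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical blocks


def has_mbr_signature(sector0: bytes) -> bool:
    """True if LBA 0 ends with the 0x55 0xAA boot signature."""
    return len(sector0) >= SECTOR_SIZE and sector0[510:512] == b"\x55\xaa"


def has_gpt_header(sector: bytes) -> bool:
    """True if the sector starts with the GPT header signature."""
    return sector[:8] == b"EFI PART"


def probe_lbas(total_sectors: int) -> list:
    """LBAs a scan for MBR and GPT would need to read: the MBR,
    the primary GPT header, and the backup GPT header at the end."""
    return [0, 1, total_sectors - 1]
```

Note that even on a disk with no partition table, these blocks still have to be read once to find out that nothing is there.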
Folks, this question is for anyone who is familiar with Debian Linux, and more precisely with Raspbian, the version that runs on the Raspberry Pi board:
As all Raspberry Pi users should know, the operating system is installed on an SD card. The problem is that an SD card is flash memory, and this type of memory supports only a limited number of write operations.
I would like to know whether Raspbian writes to the SD card when it is idle. If it does, how can I disable that?
I found this:
Tips for running Linux on a flash device by David Härdeman
If you are running your NSLU2 on a USB flash key, there are a number
of things you might want to do in order to reduce the wear and tear on
the underlying flash device (as it only supports a limited number of
writes).
Note: this document currently describes Debian etch (4.0) and needs to
be updated to Debian squeeze (6.0) and Debian wheezy (7.0). Some of
the hints may still apply, but some may not.
The ext3 filesystem per default writes metadata changes every five
seconds to disk. This can be increased by mounting the root filesystem
with the commit=N parameter which tells the kernel to delay writes to
every N seconds.
The kernel writes a new atime for each file that has been read which
generates one write for each read. This can be disabled by mounting
the filesystem with the noatime option.
Both of the above can be done by adding e.g. noatime,commit=120,... to /etc/fstab. This can also be done on an
already mounted filesystem by running the command:
mount -o remount,noatime,commit=120 /
The system will run updatedb every day which creates a database of all
files on the system for use with the locate command. This will also
put some stress on the filesystem, so you might want to disable it by
adding
exit 0
early in the /etc/cron.daily/find script.
syslogd will in the default installation sync a lot of log files to
disk directly after logging some new information. You might want to
change /etc/syslog.conf so that every filename starts with a - (minus)
which means that writes are not synced immediately (which increases
the risk that some log messages are lost if your system crashes). For
example, a line such as:
kern.* /var/log/kern.log
would be changed to:
kern.* -/var/log/kern.log
You also might want to disable some classes of messages altogether by
logging them to /dev/null instead, see syslog.conf(5) for details.
In addition, syslogd likes to write -- MARK -- lines to log files
every 20 minutes to show that syslog is still running. This can be
disabled by changing SYSLOGD in /etc/default/syslogd so that it reads
SYSLOGD="-m 0"
After you've made any changes, you need to restart syslogd by running
/etc/init.d/syslogd restart
If you have a swap partition or swap file on the flash device, you
might want to move it to a different part of the disk every now and
then to make sure that different parts of the disk gets hit by the
frequent writes that it can generate. For a swap file this can be done
by creating a new swap file before you remove the old one.
If you have a swap partition or swap file stored on the flash device,
you can make sure that it is used as little as possible by setting
/proc/sys/vm/swappiness to zero.
The kernel also has a setting known as laptop_mode, which makes it
delay writes to disk (initially intended to allow laptop disks to spin
down while not in use, hence the name). A number of files under
/proc/sys/vm/ controls how this works:
/proc/sys/vm/laptop_mode: How many seconds after a read should a
writeout of changed files start (this is based on the assumption that
a read will cause an otherwise spun down disk to spin up again).
/proc/sys/vm/dirty_writeback_centisecs: How often the kernel should
check if there is "dirty" (changed) data to write out to disk (in
centiseconds).
/proc/sys/vm/dirty_expire_centisecs: How old "dirty" data should be
before the kernel considers it old enough to be written to disk. It is
in general a good idea to set this to the same value as
dirty_writeback_centisecs above.
/proc/sys/vm/dirty_ratio: The maximum amount of memory (in percent) to
be used to store dirty data before the process that generates the data
will be forced to write it out. Setting this to a high value should
not be a problem as writeouts will also occur if the system is low on
memory.
/proc/sys/vm/dirty_background_ratio: The lower amount of memory (in
percent) where a writeout of dirty data to disk is allowed to stop.
This should be quite a bit lower than the above dirty_ratio to allow
the kernel to write out chunks of dirty data in one go.
All of the above kernel parameters can be tuned by using a custom init
script, such as this example script. Store it to e.g.
/etc/init.d/kernel-params, make it executable with
chmod a+x /etc/init.d/kernel-params
and make sure it is executed by running
update-rc.d kernel-params defaults
Note: Most of these settings reduce the number of writes to disk by
increasing memory usage. This increases the risk for out of memory
situations (which can trigger the dreaded OOM killer in the kernel).
This can even happen when there is free memory available (for example
when the kernel needs to allocate more than one contiguous page and
there are only fragmented free pages available).
As with any tweaks, you are advised to keep a close eye on the amount
of free memory and adapt the tweaks (e.g. by using less aggressive
caching and increasing the swappiness) depending on your workload.
This article has been contributed by David Härdeman
http://www.cyrius.com/debian/nslu2/linux-on-flash/
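The syslog.conf change described in the quoted article (prefixing each log file name with a minus sign) can be sketched in Python as follows. This is only an illustration under simple assumptions: each rule is a selector followed by whitespace and an action, only actions that are file paths are touched, and anything under /dev/ is left alone. The function names are made up, and the result should be reviewed by hand before it replaces a real configuration file.

```python
import re

# Selector (not a comment), whitespace, then a file path not under /dev/.
_RULE = re.compile(r"^([^#\s]\S*\s+)(/(?!dev/)\S*)")


def defer_sync(line: str) -> str:
    """Prefix the log file of one syslog.conf rule with '-' so that
    writes to it are not synced immediately. Other lines are unchanged."""
    return _RULE.sub(r"\1-\2", line, count=1)


def defer_sync_config(text: str) -> str:
    """Apply defer_sync to every line of a syslog.conf-style text."""
    return "\n".join(defer_sync(line) for line in text.splitlines())
```

Applied to the article's own example, defer_sync("kern.* /var/log/kern.log") returns "kern.* -/var/log/kern.log"; lines that are comments, already prefixed, or not file paths come back unchanged.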
Does anyone have more tips?
I have been using various Raspberry Pi setups and haven't had SD card troubles to date (fingers crossed). That being said, there is a bit of evidence for SD card lifespan related issues.
A quick Google search does show a few more tips though:
A bigger card is better: it reduces the load on specific sections
Write temporary files to RAM
Only store the boot partition on the SD card and leave the OS on a USB drive
(http://www.makeuseof.com/tag/extend-life-raspberry-pis-sd-card/)
Anyway, it'll be interesting to hear from someone who has a raspberry cluster or some such on their SD card lifespans!
(https://resin.io/blog/what-would-you-do-with-a-120-raspberry-pi-cluster/)
You can put files in tmpfs after startup and write them back before shutdown using the script from http://www.observium.org/wiki/Persistent_RAM_disk_RRD_storage
But it can be detrimental:
tmpfs loses all changes on a power outage, so you must use a UPS;
Raspberry Pi RAM is far from big, so don't waste it.
If your Pi often writes small files, this can work for you.
I would like to know if it's possible to move the files on a given partition to the beginning of the disk, where reads and writes are faster.
I currently have Windows Vista installed on the first partition of the disk with a few more partitions for work related files and other stuff.
Would this guarantee that the OS partition (which was the first to be created) is located at the beginning of the disk?
Would a defrag program help with this, or do the various partitions limit the location of the files on the disk?
No, it's not possible. Physical partitions separate a disk into circular regions. The first partition occupies the outer ring of the disk, and the next partition occupies the ring inside that one. You can move files within a partition, but all the files of one partition come before any files of any other partition.
If partitions were wedge-shaped, then what you request would be possible.
Ultimate Defrag (sorry, no link, just google it, it's free) has an option to move recently used files to the fastest area of the disk.
Greetz,
GHad
I need to figure out the hard drive name for a Solaris box, and it is not clear to me what the device name is. On Linux, it would be something like /dev/hda or /dev/sda, but on Solaris I am getting a bit lost in the partitions and what the device is called. I think that entries like /dev/rdsk/c0t0d0s0 are the partitions; how is the whole hard drive referenced?
/dev/rdsk/c0t0d0s0 means controller 0, SCSI target (ID) 0, disk 0, and slice (partition) 0.
Typically, by convention, s2 is the entire disk. This partition overlaps with the other partitions.
prtvtoc /dev/rdsk/c0t0d0s0 will show you the partition table for the disk, to make sure.
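The naming scheme above can be illustrated with a small Python sketch that splits a device name into its parts. It covers only the cXtYdZ forms that appear in this thread, with an optional sN (slice) or pN suffix; the names SolarisDisk and parse are made up for the example and are not part of any Solaris tool.

```python
import re
from typing import NamedTuple, Optional


class SolarisDisk(NamedTuple):
    controller: int
    target: int
    disk: int
    part_kind: Optional[str]    # "s" for a slice, "p" for a pN name, None if absent
    part_number: Optional[int]


# cXtYdZ optionally followed by sN or pN, e.g. c0t0d0s2 or c0t0d0p0.
_NAME = re.compile(r"^c(\d+)t(\d+)d(\d+)(?:([sp])(\d+))?$")


def parse(name: str) -> SolarisDisk:
    """Split e.g. /dev/rdsk/c0t0d0s2 into its numbered components."""
    base = name.rsplit("/", 1)[-1]  # drop a leading /dev/rdsk/ or /dev/dsk/
    match = _NAME.match(base)
    if match is None:
        raise ValueError(f"not a cXtYdZ-style device name: {name!r}")
    c, t, d, kind, number = match.groups()
    return SolarisDisk(int(c), int(t), int(d), kind,
                       int(number) if number is not None else None)
```

For example, parse("/dev/rdsk/c0t0d0s2") returns controller 0, target 0, disk 0, slice 2, and a bare name with no suffix comes back with part_kind set to None.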
If you run Solaris on non-SPARC hardware and don't use EFI, the whole hard drive is not c0t0d0s2 but c0t0d0p0; in that case s2 is just the Solaris primary partition.
What do you want to do to the whole disk? Look at the EXAMPLES section of the man page for the command in question to see how much of a disk name the command requires.
zpool doesn't require a partition, as in: c0t0d0
newfs does: c0t0d0s0
dd would use the whole disk partition: c0t0d0s2
Note: s2 as the entire disk is just a convention. A root user can use the Solaris format command and change the extent of any of the partitions.
The comments about slice 2 are only correct for drives with an SMI label.
If the drive is greater than 1TB, or if the drive has been used for ZFS, the drive will have an EFI label and slice 2 will NOT be the entire disk. With an EFI label, slice 2 is "just another slice". You would then refer to the whole disk by using the device name without a slice, e.g. c0t0d0.
There are two types of disk label: one is SMI (VTOC), the other is GPT (EFI).
On the x86 platform, when the disk is SMI labeled (the default behavior):
cXtXdXp0 is the whole physical disk.
cXtXdXp1-cXtXdXp4 are the primary partitions, including the Solaris partitions.
cXtXdXs0-cXtXdXs8 are the partitions (slices) of the active Solaris partition.
cXtXdXs2 is the whole active Solaris partition, which may not be the whole disk.
Hope I am clear.
/Meng
C0 - Controller
T0 - Target
D0 - Disk
S- - Slice
c0t0d0s0 is the entire drive. The breakdown is:
/dev/[r]dsk/c C t A d0s S
...where C is the controller number, A is the SCSI address, and S is the "slice". Slice 0 is the whole disk; the other slices are the partition numbers.
See this for more info.
cXtYdZs2 is the whole drive. period.