I'm doing a paper on NTFS vs FAT32 and showing a comparison between both file systems.
As far as I know, NTFS keeps a record for every file and directory in the MFT, whereas FAT32's allocation table only records, for each cluster, which cluster comes next in a file's chain. This means that FAT32 doesn't know a priori where a file starts; it has to find the file's directory entry first by looking it up in the containing directory.
My question is the following: since NTFS holds all of the file system's metadata in one file, does that mean it will be faster when doing a raw search for a filename like "test.txt" across the whole volume? From what I know, FAT has to scan every directory on the drive and check each one for the filename, whereas NTFS only needs to scan the MFT file, which is contiguous, for a record with the name "test.txt".
Am I right, or am I missing something?
I don't know for certain (probably yes, at maybe a 40% bet), but to turn your problem into something on-topic for Stack Overflow, here are some resources where you can find your answer and post a self-answer:
probably, by reading "text":
NTFS.com: NTFS Basic
Microsoft TechNet: File Systems Technologies → How NTFS Works
for sure, by reading "code":
GitHub: /torvalds/linux/fs/ntfs - C source code of the NTFS file system driver used by Linux - "Linux is a clone of the operating system Unix, written from scratch by Linus Torvalds with assistance from a loosely-knit team of hackers across the Net. It aims towards POSIX and Single UNIX Specification compliance"
svn.reactos.org: /reactos/drivers/filesystems/ntfs - C source code of the NTFS file system driver used by ReactOS - "free open source operating system based on the best design principles found in the Windows NT® architecture (Windows versions such as Windows XP, Windows 7, Windows Server 2012 are built on Windows NT architecture). Written completely from scratch, ReactOS is not a Linux based system, and shares none of the UNIX architecture"
All of the above resources should agree on the concept, and the last two show how machines really treat the problem, without any marketing talk (machines don't understand marketing and advertising very well).
First of all, hello to all; I hope you are well and safe.
I have some questions about machine code, hardware, and operating systems.
1. I searched here and on the net for how pure machine code works, but what I found wasn't enough to answer my questions, since I'm new to low-level programming. How do I write pure machine code, just 0s and 1s? Does pure machine code have a file extension, the way assembly has .asm and executables have .exe? I want to write code that is placed directly in RAM and talks to the processor to do what I wrote, for example open My Computer or open a text file.
2. I know each CPU has its own machine language and they differ, so there is no single machine code that all CPUs understand. I also want to know: suppose we have two CPUs, both x64 (or x86), but one runs Windows and the other runs Linux. Will x64 machine code built for Windows also work on the x64 CPU under Linux?
Thank you for giving me your time and reading this.
For now I am just gathering information.
An operating system provides the capability to run programs. So, one program, like the desktop or command line shell, can ask the operating system to run another program.
When that happens, the operating system creates an environment to run the program called a process, and then loads a program file from disc into the process, and directs the CPU to begin executing that program file starting at its beginning.
The operating system has a loader, whose job is to load the disc-based program file into memory of the process.
The loader knows about certain executable file formats. Different operating systems have different loaders, and they generally understand different executable formats.
Machine code is contained in these program files, stored on disc using those file formats. There are other ways to load machine code into memory, though a program file stored on disc loaded by the loader is the most common approach.
Asm (.asm) is a human-readable text file that stores program code in assembly language. To use it, the text file is given as input to a build system, which converts the human-readable program code into a program file containing equivalent machine code, for later loading into a process by the operating system.
Not only do different operating systems support different file formats for program files, they also support different ways to interact with the operating system, which goes to their programming model as described by an Application Binary Interface, aka ABI. All programs need to interact with the operating system for basic services like input, output, mouse, keyboard, etc. Because ABIs differ between operating systems, the machine code in a program written for one operating system won't necessarily run on a different operating system, even if the processor is exactly the same.
Most disc-based file formats for executable program files contain indicators telling what processor the program will run on, so the same operating system on different processors requires different machine code, and hence usually different executable program files. (Some file formats support "fat" binaries meaning that the machine code for several different processors is in one program file.)
Operating systems also have features that allow execution of new machine code within an existing process. That machine code can be generated on the fly, as with JIT compilers, or loaded by the application program itself rather than by the operating system loader. Further, most operating system loaders support dynamically loading additional program file content (shared libraries / DLLs) from executable program files.
So, there's lots of ways to get machine code into the memory of a process for execution — support for machine code is one of the fundamental features of operating systems.
Let's also note that no real program is pure machine code — programs use machine code & data together, so all executable file formats store both machine code and data (and metadata).
I installed (extracted) msysgit portable (PortableGit-1.9.5-preview20150319.7z)
The compressed archive is 23 MB, but once extracted the contents take up 262 MB. This is mostly due to the git command binaries (under 'libexec\git-core'). Almost all of the binaries are identical, they just have different names.
Why did the developers build the project like this? I suppose they need an executable for each command to support the CLI on windows cmd.exe.
But isn't there a way to avoid having ~100 identical binaries, each 1.5 MB in size (ex: using batch files)?
"Why did the developers build the project like this? I suppose they need an executable for each command to support the CLI on windows cmd.exe."
Under unixoid OSes, you can have symbolic links to another file that behave exactly like the original file; if you do that for your executable, your executable can look into argv[0] to find out how it was called. That's a very common trick for many programs.
Under Windows, the picture is worse. NTFS does have symbolic links (via mklink, since Vista), but creating them traditionally requires elevated privileges, and a portable distribution without an installer can't rely on them at all. Especially when you consider that programs meant to run from USB drives have to cope with ancient filesystems such as FAT32, which has no link support whatsoever!
Thus, the executables were simply copied over. Same functionality, more storage. However, on a modern machine that you'd run Windows on, you really don't care about 200 MB, give or take, for a tool as versatile as git.
In conclusion: the developers had little choice here; since a portable Windows program (even with some POSIX abstraction layer available) can't count on filesystem support for symbolic links, copying the binaries was the only robust way to port this Unix-originating program. You either shouldn't care, or should use an OS that behaves better in that respect. I'd recommend the latter, but OS choices often aren't made freely...
I was going through basics of File System implementation. While implementing for looking up for a file, how does the OS distinguish a file and the directory which it is in?
For example: if I want to look up a file foo.c given the path /home/mac/work/foo.c, how does the OS decide that home, mac, and work are directories and that foo.c is the file inside the work directory?
I will assume this question pertains to Linux operating systems.
A file, by definition, sits at the leaf level of the tree. Therefore, any path component that is followed by a / cannot be a regular file.
The leaf is another story. foo.c might be a file or it might be a directory. The OS has to look at it in order to determine which it is. Internally, a directory is technically a file, but it behaves differently.
To complicate things, Linux has symbolic links and hard links. A symbolic link is a special file that points to another file or directory by path; a hard link is simply another directory entry for the same file. And indeed a directory might be the mount point for an entire file system; it's quite common to mount a separate partition or drive as /home. You don't really have to worry about these here. You are mostly concerned with the addressing.
If you want to find out what a file is in Linux, use /usr/bin/stat.
I have two files that give the same hash, and even the same hexdump. File A and File B start on Linux Box 1 and Linux Box 2, respectively. I then copy both files to a Windows share, and read them from a Windows machine. The files still seem to be byte-by-byte identical with the Windows utility Fc (with /b option -- binary mode). However, when I open the two different files, they appear to have different encoding (newlines/line-wrap). Why wasn't this uncovered by the hashes/hexdump/Fc?
What am I overlooking here?
Don't use WordPad for that. Actually, don't use WordPad at all. Note that Microsoft often does not keep to standards and in many cases (e.g. the browser) simply takes an informed guess at file or stream content, using the header as a kind of magic number. Sometimes it guesses right, sometimes it doesn't.
You could calculate the hash on the Windows machine as well; there are plenty of lightweight utilities that calculate secure hashes from within Windows Explorer. You could also install command-line utilities such as OpenSSL on Windows (or take it a step further and install Cygwin, which I always have running on my Windows machine).
Windows has never had a real strategy regarding line endings beyond keeping to its own two-character CRLF convention. In later versions of Windows you may use Notepad, which does (finally) understand Unix newlines, if you must (let's hope it doesn't screw up UTF-16 this time around).
I've been using RCS (revision control system) from MKS Source Integrity for several old projects. I have to move to a new Windows 7 computer. The old version I have does not install on Windows 7, and a new version of the software is very expensive.
What is the best free or cheap source of RCS for Windows 7? Also, will it be compatible with the MKS Toolkit, which I am still going to install?
The official website for RCS has Windows 32-bit binaries, but they are dated. YMMV. http://www.cs.purdue.edu/homes/trinkle/RCS/.
Edit: I just tried the binaries (from the first zip file). They seem to work on a trivial text file.
I put them in a directory. Then I created an "RCS" directory. Then I created a text file. Then I ran "set TZ=EST" in my cmd.exe window (the tools require a timezone). Then I was able to check the text file in and out with the RCS command line tools.
Note that large files are probably not supported given the date of the binaries.
If you want the binaries to be available system-wide, place them in a location on your Windows PATH and set the TZ environment variable to the time zone you need in your account's environment.
RCS offers reverse merge, which is useful when you want to apply selected fixes to an ECO version of your software without pulling in less-tested product enhancements. I was able to produce an ECO version of a real-time control system with several hundred fixes without the assistance of the software engineers working on the next release of the product. ClearCase did not offer a similar capability at the time.
We used RCS and gmake; build scripts were written in Perl, and everything ran on native Windows. I wish the idiots at the software company in Washington would use / instead of \ as the path separator.
On 64-bit Windows 7, SETUP32.EXE fails with "not compatible with the version of Windows you're running".
My workaround:
Copy sintcm32.dll from a 32-bit machine into c:\windows\syswow64.
In Explorer, double-click a .pj file, set the file-type description to "MKS Source Integrity Project/Sandbox File" and the associated program to "[network location]\mkssi\mkssi32.exe"
Create start menu shortcuts to "[network location]\mkssi\mkssi32.exe" etc.
Git (msysGit) works on Windows 7 and is free. There is a learning curve associated with Git, which I think is worth it for the benefits you get, especially from a distributed VCS, but some may disagree. The bundle comes with bash and an ssh client (useful for synchronizing with remote repositories).
EDIT: For RCS specifically, there is an RCS package available via Cygwin, as well as an independent package from the Purdue RCS homepage (the latter says "The latest PC (OS/2 DOS Win95 NT)", but I guess it might work on Windows 7; I'm fairly sure the Cygwin package works on Windows 7).