Other than cache, what other kinds of on-chip memory are there, and can they be explicitly addressed? - cpu-cache

I understand SRAM to be on-chip memory. Moving towards the latest technologies, is High Bandwidth Memory another kind of on-chip memory? I would also like to know what the latest technologies used in processors are, in terms of on-chip memory.

You can find a lot of information about CPU caches in the paper What Every Programmer Should Know About Memory by Ulrich Drepper, especially in section 2.1.1 and all of section 3.
If you are interested in current technologies, you should read the available documentation and optimisation guides for modern processors. You can find many details about processors at wikichip.org, for example https://en.wikichip.org/wiki/amd/microarchitectures/zen_2.
In the case of AMD/Intel processors, search for the "AMD64 Architecture Programmer's Manual" or the "Intel® 64 and IA-32 Architectures Software Developer's Manual".
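Drepper's section 3 covers cache behaviour in depth. As a very crude illustration of why cache locality matters (a sketch only; Python interpreter overhead dominates the numbers, so treat the absolute times as meaningless):

```python
import random
import time

N = 1_000_000
buf = bytearray(N)

def touch(indices):
    """Time reading one byte of buf at each index in the given order."""
    t0 = time.perf_counter()
    s = 0
    for i in indices:
        s += buf[i]
    return time.perf_counter() - t0

seq = range(N)
rand = list(seq)
random.shuffle(rand)

print("sequential:", touch(seq))
print("random:    ", touch(rand))  # usually slower: defeats prefetch, poor locality
```

The random walk touches the same total data but usually runs slower, because it defeats hardware prefetching and reuses cache lines poorly.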


Can the AMD64 ISA work without licensing the x86 ISA? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 2 years ago.
I know that AMD64 (aka x86-64) is AMD's own proprietary technology and can be licensed by third parties, as Intel, VIA, etc. have done.
I also know that the "big thing" about the AMD64 ISA is that it extends the x86 ISA, so compatibility is a fundamental advantage over Intel's IA-64.
But (here comes my question ;)) since AMD64 relies on the basic x86 instruction set, does this mean that if AMD could not license that from Intel, AMD64 would be just an extension to x86 without the x86 instruction set itself? Or does AMD64 "reimplement/redefine" the whole x86 ISA, making the x86 license unnecessary in this regard? (I guess AMD's licensing of x86 is not just about having a complete ISA with AMD64, so this is just a "what if" question to help me better understand how AMD64 depends on, or is free from, x86.)
If a manufacturer wanted to make a CPU purely with the AMD64 ISA, would it be possible to make an OS that runs on it? Would it involve the x86 instruction set? Or can AMD64 not be defined without x86, so that there is a set of basic instructions that are not part of AMD64, without which a CPU cannot work at all?
Unlike AArch64 vs. ARM 32-bit, it's not even a separate machine-code format. I think you'd have a hard time justifying an x86-64 as separate from x86, even if you left out "legacy mode" (i.e. the ability to work exactly like a 32-bit-only CPU, until/unless you enable 64-bit mode).
x86-64's 64-bit mode uses the same opcodes and instruction formats (with mostly just a new prefix, REX). https://wiki.osdev.org/X86-64_Instruction_Encoding. I doubt anyone could argue it was substantially different from x86, or whatever the required standard is for patents. (Although any patents on that might be long expired, if they date back to the 8086.)
Especially given that long mode includes 32/16-bit "compat" sub-modes (https://en.wikipedia.org/wiki/X86-64#Operating_modes), and uses Intel's existing PAE page-table format.
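The encoding reuse is easy to see in the machine-code bytes themselves (a sketch; the bytes below are the standard encodings, which you can verify with any assembler or disassembler):

```python
# 32-bit mode: 'add eax, ecx' encodes as opcode 0x01 plus ModRM 0xC8.
add_eax_ecx = bytes([0x01, 0xC8])

# 64-bit mode: 'add rax, rcx' is the *same* opcode and ModRM byte, just
# preceded by a REX.W prefix (0x48) to select 64-bit operand size.
REX_W = 0x48
add_rax_rcx = bytes([REX_W]) + add_eax_ecx

print(add_rax_rcx.hex())  # 4801c8
```

One prefix byte is the entire difference between the two encodings, which is exactly why the same decoder hardware can handle both modes.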
But note that a lot of the patent-sharing stuff between Intel and AMD is for implementation techniques, for example a "stack engine" that handles the modification-to-stack-pointer part of push/pop/call/ret, letting it decode to 1 uop and avoiding a latency chain through RSP. Or IT-TAGE branch prediction (Intel Haswell, AMD Zen 2). Or perhaps the whole concept of decoding to uops, which Intel first did with P6 (Pentium Pro) in ~1995.
Presumably there are also patents on ISA extensions like SSE4.1 and AVX that would make it unattractive to sell a CPU without them, for most purposes. (SSE2 is baseline for x86-64, so you need that. Again, the instructions and machine-code formats are identical to 32-bit mode.)
BTW, you'd have to invent a way for it to boot, starting in long mode which requires paging to be enabled. So maybe boot with a direct-mapping of some range of addresses? Or invent a new sub-mode of long mode that allows paging to be disabled, using physical addresses directly.
The firmware could handle this and boot via 64-bit UEFI, potentially allowing 64-bit OSes to run unmodified as long as they never switch out of long mode.
Note that AMD, when designing AMD64, intentionally kept x86's variable-length hard-to-decode machine-code format as unchanged as possible, and made as few other changes as possible.
This means the CPU hardware doesn't need separate decoders, or separate handling in execution units, to run in 64-bit mode. AMD weren't sure AMD64 was going to catch on, and presumably didn't want to be stuck needing a lot of extra transistors to implement 64-bit mode when hardly anybody was going to take advantage of it.
(Which was definitely true even in practice for their first generation K8 chips; it was years before 64-bit Windows was common, and GNU/Linux users running still-evolving amd64 ports of distros were only a small fraction of the market back in 2003.)
Unfortunately this means that, unlike AArch64, AMD64 missed the opportunity to clean up some of x86's minor warts. (Providing setcc r/m32 instead of the inconvenient setcc r/m8 is my favourite example of a change I would have made to the semantics of an opcode in 64-bit mode vs. 16- and 32-bit mode.)
I can see why they didn't want to totally redesign the machine-code format and need an entirely new decoding method; as well as costing silicon, that would force toolchain software (assemblers / disassemblers) to change more, instead of mostly minor changes to existing tools. That would slightly raise the barrier to adoption of their extension to x86, which was critical for them to beat IA-64.
(IA-64 was Intel's 64-bit ISA at the time, whose semantics are very different from x86 and thus couldn't even share much of a back-end. It would have been possible to redesign machine-code for mostly the same instruction semantics as x86 though. See Could a processor be made that supports multiple ISAs? (ex: ARM + x86) for more about this point: separate front-ends to feed a common back-end can totally work if the ISAs are basically the same, like just a different machine-code format for most of the same semantics.)

How much is known publicly about the details of how Apple processors work internally?

Edit: in an attempt to avoid this question being closed as a reference request (though I still would appreciate references!), I will give a few general, non-link-only questions for concreteness. I would accept an answer for any of these, but the more the better.
Is the A12 in-order, or out-of-order?
How many instructions can it retire per cycle?
How many pipeline stages does it have?
What sort of cache hierarchy does it have?
Does it architecturally resemble modern Intel processors, and if not, what are the major differences?
Original question: There is a lot of publicly available documentation about how the current mainstream Intel core design works (Pentium Pro and all its descendants): both Intel's own optimization manuals and descriptions published by WikiChip and Agner Fog.
Any curious person can learn what the pipeline stages are, what each part of the core does, and so on.
I can’t find anything similar for the Apple Ax series. Does it exist?
Apple is an ARM architectural licensee and they have developed several generations of ARM64 chips. A resource for some of the micro-architectural detail on their chips is the Cyclone LLVM scheduler model analyzed here. This is upstreamed into LLVM and also released by Apple as open source. I think the Cyclone model covers all their chips.
Other resources are WikiChip and Wikipedia, which aggregate information and cite sources. Apple's patent filings provide other information. Benchmarks and reviews are available, but not at the level of Agner Fog's material.
First, Wikipedia says the A12 is out-of-order, in a big.LITTLE-style configuration. The big cores (Vortex) on the A12 decode 7-wide and the little cores (Tempest) 3-wide, with 13 and 5 execution ports respectively. I can't find retire rates.

Why the size of operating systems (clean install) is increasing?

OK, this is just a simple question, but I would really like answers from people who create (Linux) distributions, or from people involved with OS X or Windows.
The size after a clean installation seems to be increasing: Windows 10 requires 20 GB of disk space (64-bit). I suppose the kernel is not the problem, so the problem must be in the applications (i.e. user space). But I cannot see a big increase in the number of applications packaged with the OS, so the problem is in how they are written, the runtime support, etc.
Could someone comment on this?
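One way to ground a question like this is to measure where the space actually goes: walk the install tree and aggregate file sizes by top-level directory. A quick sketch (the demo directory names are purely illustrative; point `root` at a real install tree to use it):

```python
import os
import tempfile

def sizes_by_top_dir(root):
    """Total file bytes under each top-level directory of `root`."""
    totals = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = rel.split(os.sep)[0] if rel != "." else "."
        for name in filenames:
            try:
                path = os.path.join(dirpath, name)
                totals[top] = totals.get(top, 0) + os.path.getsize(path)
            except OSError:
                pass  # skip unreadable or vanished files
    return totals

# Tiny self-contained demo on a temporary tree:
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "apps"))
    with open(os.path.join(root, "apps", "big.bin"), "wb") as f:
        f.write(b"\0" * 1024)
    print(sizes_by_top_dir(root))  # {'apps': 1024}
```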
While I don't think this question is suitable here, I'll point out that programmers tend to rely too heavily on increasing memory capacity and processor speed.
Another thing to consider is that newer versions of software usually need to keep backwards compatibility with older versions. This results in multiplied memory requirements and possibly processor requirements.
Newer versions of software introduce new features (whether substantial ones or simply eye-candy ones does not matter in this context). The results are the same as with backwards compatibility.
Many people may disagree here, but others will argue there is this thing called "planned obsolescence": working hardware becomes obsolete simply because software requirements increase.

what operating system concepts should every programmer be aware of? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I am compiling various lists of competencies that self-taught programmers must have.
Among all subjects, operating systems is the trickiest one, because creating even a toy operating system is a rather non-trivial task. At the same time, an application developer (who may not have formally studied CS) should at least be aware of, and hopefully have implemented, some key concepts, to appreciate how an OS works and to be a better developer.
I have a few specific questions:
What key concepts of operating systems are important for self-taught programmers to understand, so they can become better software developers (albeit working on regular application development)?
Is it even remotely possible to learn such a subject in byte-sized practical pieces? (Even a subject like compiler construction can be learned in a hands-on way, at a rather low level of complexity.)
I would suggest reading Andrew S. Tanenbaum's ( http://en.wikipedia.org/wiki/Andrew_S._Tanenbaum ) book Modern Operating Systems (ISBN 978-0-13-600663-3), as everything is there.
However, from the book's index we can identify the minimum key topics:
Processes
Memory management
File systems
Input/output
And the easiest way to start playing with these topics is to download MINIX:
http://www.minix3.org/
and study the code. Older versions of this operating system might be easier to understand.
Another useful resource is Mike Saunders' How to write a simple operating system, which shows you how to write and build your first operating system in x86 assembly language:
http://mikeos.sourceforge.net/write-your-own-os.html
Every OS designer must understand the concepts behind Multics. One of its most brilliant ideas is the notion of a vast virtual memory partitioned into directly readable and writable segments with full protection, plus multiprocessor support to boot; with 64-bit pointers, we have enough bits to address everything on the planet directly. These ideas are from the 1960s, yet timeless IMHO.
The apparent loss of such knowledge got us "Eunuchs" now instantiated as Unix then Linux and an equally poor design from Microsoft, both of which organize the world as a flat process space and files. Those who don't know history are doomed to doing something dumber.
Do anything you can to get a copy of Organick's book on Multics, and read it, cover to cover. (Elliott I. Organick, The Multics System: An Examination of Its Structure).
The wikipedia site has some good information; Corbato's papers are great.
I believe it depends on the type of application you are developing and the OS platform you are developing for. For example, if you are developing a website, you don't need to know much about the OS; you need to know more about your web server. There are different things you need to know when working on Windows, Linux, Android, or some embedded system, and sometimes you need to know nothing beyond what your API provides. In general, it is always good for a developer or CS person to know the following:
What lies in the responsibility of the application, the toolchain, and the OS.
Inter-process communication, and the different IPC mechanisms the OS's system calls provide.
The OS is quite an interesting subject, but it mostly consists of theory, and that theory comes into action when you are working on embedded systems. For the average desktop application, you don't see where all of it fits in.
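As a minimal illustration of one such IPC mechanism, here is the classic Unix pipe() system call through Python's os module (shown within a single process for brevity; normally the two ends live in different processes after a fork):

```python
import os

# pipe() asks the kernel for two file descriptors: a read end and a
# write end. Data written to one end is buffered by the kernel and
# comes back out of the other.
r, w = os.pipe()
os.write(w, b"hello through the kernel")
os.close(w)

data = os.read(r, 1024)
os.close(r)
print(data.decode())  # hello through the kernel
```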
OK, operating system concepts that a good programmer should be aware of, practically speaking: if you are writing in a cross-OS language and are not concerned about performance, none.
If you care about performance:
The cost of user/system transitions.
How the OS handles locking/threads/deadlocks, and how best to use them.
Virtual memory/paging/thrashing, and the cost thereof.
Memory allocation: how the OS does it, when you should use the OS allocator directly (see point 1), and when to allocate a large block from the OS and sub-allocate from it.
As mentioned earlier, process creation and inter-process communication.
How the OS reads from and writes to disk by default, and how to read/write optimally (see why databases use B-trees).
Bonus, below the OS level: what cache sizes and cache lines can mean for performance.
But generally it boils down to: what does the OS provide you that isn't generic, what does it cost and why, and what will cost too much (too much CPU, too much disk usage, too much I/O, too much network, etc.).
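The first item in the list, the cost of user/system transitions, is easy to demonstrate roughly. A sketch, assuming os.stat issues a real system call on each iteration (absolute numbers vary wildly by OS and hardware; only the ratio matters):

```python
import os
import time

N = 100_000

# Pure user-space work: no kernel involvement.
t0 = time.perf_counter()
for _ in range(N):
    x = 1 + 1
user_time = time.perf_counter() - t0

# Each os.stat crosses into the kernel and back.
t0 = time.perf_counter()
for _ in range(N):
    os.stat(".")
syscall_time = time.perf_counter() - t0

print(f"user-space loop: {user_time:.4f}s  syscall loop: {syscall_time:.4f}s")
```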
Well, that depends on the needs of the developer. For example:
Point. Applications such as web browsers and email tools are performing an increasingly important role in modern desktop computer systems. To fulfill this role, they should be incorporated as part of the operating system. By doing so, they can provide better performance and better integration with the rest of the system. In addition, these important applications can have the same look-and-feel as the operating system software.
Counterpoint. The fundamental role of the operating system is to manage system resources such as the CPU, memory, I/O devices, etc. In addition, its role is to run software applications such as web browsers and email applications. By incorporating such applications into the operating system, we burden the operating system with additional functionality. Such a burden may result in the operating system performing a less-than-satisfactory job at managing system resources. In addition, we increase the size of the operating system, thereby increasing the likelihood of system crashes and security violations.
There are also many other important points one must understand to get a better grip on operating systems, such as multithreading, multitasking, virtual memory, demand paging, memory management, processor management, and more.
I would start with What Every Programmer Should Know About Memory. (Not completely OS, but all of it is useful information. And chapter 4 covers virtual memory, which is the first thing that came to mind reading your question.)
To learn the rest piecemeal, pick any system call and learn exactly what it does. This will often mean learning about the kernel objects it manipulates.
Of course, the details will differ from OS to OS... But so does the answer to your question.
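For example, taking open() as the system call to study, a minimal sketch using Python's os wrappers (mkstemp creates and opens a temporary file for us):

```python
import os
import tempfile

# The returned fd is just a small integer handle; the open-file object
# it names lives inside the kernel.
fd, path = tempfile.mkstemp()
print(type(fd))            # <class 'int'>

os.write(fd, b"data")      # write(2) operates on the kernel object via the fd
os.close(fd)               # close(2) drops this process's reference to it

with open(path, "rb") as f:
    print(f.read())        # b'data'
os.remove(path)
```

Chasing what the kernel does with that descriptor (file table, inode or equivalent, buffering) teaches a surprising amount of OS internals from a single call.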
Simply put:
Threads and processes.
Kernel space/threads vs. user space/threads (probably some kernel-level programming).
Followed by the very fundamental concepts of process deadlocks.
And thereafter monitors vs. semaphores vs. mutexes.
How memory works and talks to processes and devices.
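To make the deadlock point concrete: the classic deadlock recipe is two threads acquiring two locks in opposite orders; imposing one global lock order prevents the circular wait. A minimal sketch:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
counter = 0

def worker():
    global counter
    for _ in range(1000):
        with lock_a:          # every thread acquires in the same order:
            with lock_b:      # a first, then b, so no circular wait can form
                counter += 1

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 2000
```

If one worker took lock_b before lock_a instead, the two threads could each hold one lock while waiting forever for the other.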
Every self-taught programmer and computer scientist alike should know the OSI model, and know it well. It helps to identify where a problem could lie and whom to contact when there are problems; the scope is defined there, and many issues can be filtered out early.
This is because there is simply too much in an operating system to learn it all. As a web developer I usually work at the application level, and when an issue goes outside that scope I know I need help. Also, many people simply do not care about certain components; they want to create things as quickly as possible. The OSI model is a place where someone can find the area they need to focus on.
http://en.wikipedia.org/wiki/OSI_model

Comparison of embedded operating systems?

I've been involved in embedded operating systems of one flavor or another, and have generally had to work with whatever the legacy system had. Now I have the chance to start from scratch on a new embedded project.
The primary constraints on the system are:
It needs a web-based interface.
Inputs are required to be processed in real-time (so a true RTOS is needed).
The memory available is 32MB of RAM and FLASH.
The operating systems that the team has used previously are VxWorks, ThreadX, uCos, pSOS, and Windows CE.
Does anyone have a comparison or trade study regarding operating system choice?
Are there any other operating systems that we should consider? (We've had eCos and RT-Linux suggested).
Edit - Thanks for all the responses to date. A pity I can't flag all as "accepted".
I think it would be wise to evaluate carefully what you mean by "RTOS". I have worked for years at a large company that builds high-performance embedded systems, and they refer to them as "real-time", although that's not what they really are. They are low-latency and have deterministic schedulers, and 9 times out of 10, that's what people are really after when they say RTOS.
True real-time requires hardware support and is likely not what you really mean. If all you want is low latency and deterministic scheduling (again, I think this is what people mean 90% of the time when they say "real-time"), then any Linux distribution would work just fine for you. You could probably even get by with Windows (I'm not sure how you control the Windows scheduler though...).
Again, just be careful what you mean by "Real-time".
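One way to see the soft/hard distinction in practice is to measure wakeup jitter around a timed sleep. On a general-purpose OS the overshoot past the deadline is unbounded in principle, which is exactly what a hard RTOS guarantees against. A rough sketch (numbers vary wildly by OS and load):

```python
import time

target = 0.001            # ask to sleep 1 ms
overshoots = []
for _ in range(50):
    t0 = time.perf_counter()
    time.sleep(target)
    # How far past the 1 ms deadline did we actually wake up?
    overshoots.append(time.perf_counter() - t0 - target)

print(f"max wakeup overshoot: {max(overshoots) * 1e6:.0f} µs")
```

A low-latency "soft real-time" system keeps this overshoot small on average; a hard real-time system bounds its worst case.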
It all depends on how much time your team has been allocated to learn a "new" RTOS.
Are there any reasons you don't want to use something that people already have experience with?
I have plenty of experience with vxWorks and I like it, but disregard my opinion as I work for WindRiver.
uC/OS-II has the advantage of being fully documented (as in, the source code is actually explained) in Labrosse's book. I don't know about web support, though.
I know pSOS is no longer available.
You can also take a look at this list of RTOSes
I worked with QNX many years ago, and have nothing but great things to say about it. Even back then, QNX 4 (which is positively chunky compared to the Neutrino microkernel) was perfectly suited for low memory situations (though 32MB is oodles compared to the 1-2MB that we had to play with), and while I didn't explicitly play with any web-based stuff, I know Apache was available.
I purchased some development hardware from NetBurner.
It has been very easy to work with and is very well documented. It is an RTOS running uClinux. The company is great to work with.
It might be a wise decision to select an OS that your team is experienced with. However, I would like to promote two good open-source options:
eCos (as you mentioned)
RTEMS
Both have a lot of features and drivers for a wide variety of architectures. You haven't mentioned which architecture you will be using. They provide POSIX layers, which is nice if you want to stay as portable as possible.
Also, the license for both eCos and RTEMS is the GPL, but with an exception so that executables produced by linking against the kernel are not covered by the GPL.
The communities are very active and there are companies which provide commercial support and development.
We've been very happy with the Keil RTX system....light and fast and meets all of our tight real time constraints. It also has some nice debugging features built in to monitor stack overflow, etc.
I have been pretty happy with Windows CE, although it is 'heavier'.
Posting to agree with Ben Collins: you really need to determine whether you have a soft real-time requirement (primarily for human interaction) or a hard real-time requirement (for interfacing with timing-sensitive devices).
Soft can also mean that you can tolerate some hiccups every once in a while.
What are the reliability requirements? My experience with more general-purpose operating systems like Linux in embedded use is that they tend to experience random hiccups due to smart average-case optimizations that try to avoid starvation and the like for individual tasks.
VxWorks is good:
good documentation;
friendly development tools;
low latency;
deterministic scheduling.
However, I suspect that WindRiver will shift its main attention to Linux, and that WindRiver Linux will eat into the market for WindRiver VxWorks. A smaller market means less demand for engineers.
Here is the latest study; the previous one was done more than 8 years ago, so this is the most relevant. The tables can be used to add additional RTOS choices. You'll note that this comparison focuses on lighter machines, but it is equally applicable to heavier machines provided virtual memory is not required.
http://www.embedded.com/design/operating-systems/4425751/Comparing-microcontroller-real-time-operating-systems