Since the term “CPU” is generally defined as a device for software execution, the earliest devices that could rightly be called CPUs came with the advent of the stored-program computer.
Most modern CPUs are primarily von Neumann in design, but CPUs with the Harvard architecture are seen as well, especially in embedded applications; for instance, the Atmel AVR microcontrollers are Harvard architecture processors.
As microelectronic technology advanced, an increasing number of transistors were placed on ICs, decreasing the number of individual ICs needed for a complete CPU. MSI and LSI ICs increased transistor counts to hundreds, and then thousands.
In stark contrast with its SSI and MSI predecessors, the first LSI implementation of the PDP-11 contained a CPU composed of only four LSI integrated circuits.
Microprocessors
In the 1970s, the fundamental inventions by Federico Faggin changed the design and implementation of CPUs forever.
Combined with the advent and eventual success of the ubiquitous personal computer, the term CPU is now applied almost exclusively[a] to microprocessors.
Several CPUs can be combined in a single processing chip.
Previous generations of CPUs were implemented as discrete components and numerous small integrated circuits on one or more circuit boards.
Microprocessors, on the other hand, are CPUs manufactured on a very small number of ICs; usually just one.
The overall smaller CPU size, as a result of being implemented on a single die, means faster switching time because of physical factors like decreased gate parasitic capacitance.
As the ability to construct exceedingly small transistors on an IC has increased, the complexity and number of transistors in a single CPU have increased manyfold.
While the complexity, size, construction, and general form of CPUs have changed enormously since 1950, the basic design and function have not changed much at all.
Almost all common CPUs today can be very accurately described as von Neumann stored-program machines.
Operation
The fundamental operation of most CPUs, regardless of the physical form they take, is to execute a sequence of stored instructions that is called a program.
Nearly all CPUs follow the fetch, decode and execute steps in their operation, which are collectively known as the instruction cycle.
Decode
The instruction that the CPU fetches from memory determines what the CPU will do.
In the decode step, performed by the circuitry known as the instruction decoder, the instruction is converted into signals that control other parts of the CPU. The way in which the instruction is interpreted is defined by the CPU’s instruction set architecture.
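As a rough illustration of the decode step, the sketch below splits a 16-bit instruction word into fields. The encoding (a 4-bit opcode, two 4-bit register numbers and a 4-bit immediate), the opcode table and the name `decode` are hypothetical and do not correspond to any real instruction set architecture.

```python
from typing import NamedTuple

# Hypothetical 16-bit format (an assumption for illustration only):
# bits 15-12 opcode, bits 11-8 destination register,
# bits 7-4 source register, bits 3-0 immediate value.
OPCODES = {0x0: "NOP", 0x1: "LOAD", 0x2: "ADD", 0x3: "STORE"}


class Decoded(NamedTuple):
    mnemonic: str
    rd: int
    rs: int
    imm: int


def decode(word: int) -> Decoded:
    """Split an instruction word into the fields that steer the rest of the CPU."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction word must fit in 16 bits")
    opcode = (word >> 12) & 0xF
    if opcode not in OPCODES:
        raise ValueError(f"undefined opcode {opcode:#x}")
    return Decoded(OPCODES[opcode], (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF)


print(decode(0x2315))  # Decoded(mnemonic='ADD', rd=3, rs=1, imm=5)
```

In hardware the same mapping is done by combinational logic or a microprogram rather than by a lookup table in software.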
In some cases the memory that stores the microprogram is rewritable, making it possible to change the way in which the CPU decodes instructions.
During each action, various parts of the CPU are electrically connected so they can perform all or part of the desired operation and then the action is completed, typically in response to a clock pulse.
Very often the results are written to an internal CPU register for quick access by subsequent instructions.
The actual mathematical operation for each instruction is performed by a combinational logic circuit within the CPU’s processor known as the arithmetic logic unit or ALU. In general, a CPU executes an instruction by fetching it from memory, using its ALU to perform an operation, and then storing the result to memory.
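The fetch, decode and execute steps can be sketched as a loop over a toy accumulator machine. The four instructions, the 8-bit accumulator and the function name `run` are assumptions made for this example; real CPUs differ considerably in detail.

```python
def run(program, memory):
    """Run a toy accumulator machine until HALT and return the memory list."""
    pc = 0   # program counter: index of the next instruction
    acc = 0  # accumulator register holding intermediate results
    while True:
        instruction = program[pc]   # fetch
        pc += 1
        op, operand = instruction   # decode (pre-split into a tuple here)
        if op == "LOAD":            # execute
            acc = memory[operand]
        elif op == "ADD":
            acc = (acc + memory[operand]) & 0xFF  # 8-bit ALU wraps around
        elif op == "STORE":
            memory[operand] = acc
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown operation {op!r}")


memory = [7, 5, 0]
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
print(run(program, memory)[2])  # 12
```

The program loads a value, adds a second value with the ALU step, and stores the result back to memory, mirroring the sequence described above.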
The frequency of the clock pulses determines the rate at which a CPU executes instructions; the faster the clock, the more instructions the CPU will execute each second.
To ensure proper operation of the CPU, the clock period is longer than the maximum time needed for all signals to propagate through the CPU. By setting the clock period to a value well above the worst-case propagation delay, it is possible to design the entire CPU and the way it moves data around the "edges" of the rising and falling clock signal.
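The relationship between worst-case propagation delay and clock rate can be worked through numerically. The sketch below is a simplification: the path delays and the 25% safety margin are made-up figures, and `max_clock_hz` is a name chosen for this example.

```python
def max_clock_hz(path_delays_ns, margin=0.1):
    """Highest clock frequency whose period still exceeds the slowest path.

    path_delays_ns: propagation delays of the signal paths, in nanoseconds.
    margin: fractional safety margin added on top of the worst-case delay.
    """
    worst_case_ns = max(path_delays_ns)
    period_ns = worst_case_ns * (1 + margin)
    return 1e9 / period_ns


# Three hypothetical paths; the 2.0 ns path sets the limit.
freq = max_clock_hz([0.8, 1.6, 2.0], margin=0.25)
print(f"{freq / 1e6:.0f} MHz")  # 400 MHz
```

Only the slowest path matters here, which is why every part of a globally synchronous CPU ends up waiting for its slowest element.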
Architectural improvements alone do not solve all of the drawbacks of globally synchronous CPUs.
As clock rate increases, so does energy consumption, causing the CPU to require more heat dissipation in the form of CPU cooling solutions.
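A common first-order approximation for the dynamic (switching) power of CMOS logic is P ≈ α·C·V²·f, where α is the fraction of the circuit switching per cycle, C the switched capacitance, V the supply voltage and f the clock frequency. This model is not taken from the text above and ignores leakage; the numbers below are illustrative assumptions only.

```python
def dynamic_power_watts(activity, capacitance_f, voltage_v, freq_hz):
    """First-order CMOS switching power: activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * freq_hz


# Hypothetical chip: 20% activity, 1 nF switched capacitance, 1.0 V supply.
base = dynamic_power_watts(0.2, 1e-9, 1.0, 2e9)
doubled = dynamic_power_watts(0.2, 1e-9, 1.0, 4e9)
print(f"{base:.2f} W at 2 GHz, {doubled:.2f} W at 4 GHz")  # 0.40 W at 2 GHz, 0.80 W at 4 GHz
```

Under this model power grows linearly with clock rate at a fixed voltage; in practice higher clock rates often also require a higher supply voltage, which raises power further.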
Integer range
Every CPU represents numerical values in a specific way.
In the case of a binary CPU, this is measured by the number of bits that the CPU can process in one operation, which is commonly called "word size", "bit width", "data path width", "integer precision", or "integer size".
A CPU’s integer size determines the range of integer values it can directly operate on.
An 8-bit CPU can directly manipulate integers represented by eight bits, which have a range of 256 discrete integer values.
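The range for a given integer size follows directly from the bit count. The sketch below assumes two's complement for the signed case, which the text above does not specify; `integer_range` is a name chosen for this example.

```python
def integer_range(bits, signed=False):
    """Return (min, max) of the integers representable in the given bit count."""
    if bits <= 0:
        raise ValueError("bits must be positive")
    if signed:
        # Two's complement (an assumption): one bit pattern more below zero.
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


print(integer_range(8))               # (0, 255)
print(integer_range(8, signed=True))  # (-128, 127)
```

Either way, eight bits give 256 distinct values; only the interpretation of the bit patterns changes.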
Integer range can also affect the number of memory locations the CPU can directly address.
If a binary CPU uses 32 bits to represent a memory address then it can directly address 2^32 (4,294,967,296) memory locations.
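The address-space arithmetic can be checked in a few lines. Treating each location as one byte is an assumption made here for the size conversion; the function name is likewise illustrative.

```python
def addressable_locations(address_bits):
    """Number of distinct addresses expressible in the given number of bits."""
    if address_bits < 0:
        raise ValueError("address_bits must be non-negative")
    return 1 << address_bits


locations = addressable_locations(32)
print(locations)              # 4294967296
# If every location holds one byte (an assumption), that is 4 GiB.
print(locations // 2 ** 30)   # 4
```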
CPUs with larger word sizes require more circuitry and consequently are physically larger, cost more, and consume more power.
As a result, smaller 4- or 8-bit microcontrollers are commonly used in modern applications even though CPUs with much larger word sizes are available.
A CPU can have internal data paths shorter than the word size to reduce size and cost.
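One way to picture a data path narrower than the word size is an addition carried out in several passes through a small ALU, with the carry handed from one pass to the next. The sketch below assumes a 32-bit word and an 8-bit ALU; the function and parameter names are hypothetical.

```python
def add_with_narrow_alu(a, b, word_bits=32, alu_bits=8):
    """Add two word-sized integers using an ALU that handles alu_bits at a time."""
    if word_bits % alu_bits != 0:
        raise ValueError("word_bits must be a multiple of alu_bits")
    mask = (1 << alu_bits) - 1
    result = 0
    carry = 0
    # One pass per chunk, least significant chunk first.
    for shift in range(0, word_bits, alu_bits):
        chunk = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
        result |= (chunk & mask) << shift
        carry = chunk >> alu_bits  # 0 or 1, fed into the next pass
    return result  # the final carry is dropped, so the sum wraps at word_bits


print(hex(add_with_narrow_alu(0x0000FFFF, 0x00000001)))  # 0x10000
```

With these parameters the addition takes four passes instead of one, trading speed for a smaller and cheaper ALU.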
To gain some of the advantages afforded by both lower and higher bit lengths, many instruction sets have different bit widths for integer and floating-point data, allowing CPUs implementing that instruction set to have different bit widths for different portions of the device.
The description of the basic operation of a CPU offered in the previous section describes the simplest form that a CPU can take.
Most modern CPU designs are at least somewhat superscalar, and nearly all general purpose CPUs designed in the last decade are superscalar.
In later years some of the emphasis in designing high-ILP computers has been moved out of the CPU’s hardware and into its software interface, or ISA. The strategy of the very long instruction word causes some ILP to become implied directly by the software, reducing the amount of work the CPU must perform to boost ILP and thereby reducing the design’s complexity.
Due to specific capabilities of modern CPUs, such as hyper-threading and uncore, which involve sharing of actual CPU resources while aiming at increased utilization, monitoring performance levels and hardware utilization gradually became a more complex task.
As a response, some CPUs implement additional hardware logic that monitors actual utilization of various parts of a CPU and provides various counters accessible to software; an example is Intel’s Performance Counter Monitor technology.