Computer Architecture - Essentials at a glance

"In the beginning was the word... And the word was..." Oh wait! That's too far back... pressing the fast-forward button: Big Bang -> cavemen -> doubling of the cerebral cortex -> Greek guys with unpronounceable names poisoned or beheaded -> Isaac Newton -> Michael Faraday -> Charles Babbage -> John von Neumann -> William Shockley... and we stop on a beautiful morning in the spring of 2015, in the lecture halls of UC Berkeley, where the following knowledge is being revealed!

The following is the simplified picture you need to keep in mind in order to understand the architecture of a modern computer:

[Figure: processor (control path + data path) connected to main memory (RAM) over a bus]

A computer has a processor, and a memory (RAM) external to it. The computer also has a bus, which provides the pathway for the processor to move data to and from main memory. The picture above shows that the processor is partitioned into the control path and the data path. The control path has all the circuitry necessary to govern the flow of electrical signals through the data path and the main memory. The data path has a bunch of registers, which are memory locations baked on-chip that can store data, and also the ALU (Arithmetic Logic Unit), which is the digital circuitry that does all the computations. One register of particular importance is the PC (Program Counter, also called the Instruction Pointer), which holds the memory address of the next instruction to be executed.

Given the above brief summary, the following definitions are very useful:

Word length of a processor is the number of bits a register can hold, and also the number of bits that a memory address is made of. The total number of distinct addresses is therefore 2^(word length). So a 32-bit processor can address 2^32 locations; with byte-addressable memory, that is 2^32 bytes, or 4 GiB. The width of the bus is also determined by the word length.

Instruction Set Architecture (ISA) of a processor is the set of instructions that can be combined in different ways to build a program. The hardware circuitry of the processor is designed to fetch the instruction pointed to by the Program Counter register from memory and then execute it. Note that both the program and the data are just bits in main memory; this is called the "stored program" concept. So every program you write in C, or any other language, is converted into the ISA of a specific processor in order to run on it.

A processor, like any digital circuit, runs off a clock signal. Things in the processor change on every clock cycle, so the higher the clock frequency, the faster things move inside the processor. To recap: a program contains a bunch of instructions, each instruction takes a finite number of clock cycles, and the duration of each cycle is the clock cycle time (1 / clock frequency).

This leads to what is called the Golden Rule of computer architecture:

Execution time of a program = Total number of instructions × Cycles per instruction × Clock cycle time

[What is the clock frequency of our brain? Mine varies but can go as low as it can when I go Uhhhhhhhh??? before I decide I should probably get away from the oncoming truck]

So as the designer of an application, one might immediately jump up and say: why not choose the processor that produces the lowest number of instructions for a given C program? If I write a program and the generated instruction stream is the shortest, shouldn't that processor be my choice? The no-free-lunch theorem hits you with a hammer.

For a given program, if the total number of generated instructions is low, then each of those instructions must be doing more work, which means the second variable in our golden rule, "Cycles per instruction", increases. So one can say that each of these instructions is "complex", and such a processor is called a Complex Instruction Set Computer (CISC) processor. Another thing to note here: for a CISC processor, if each instruction does a lot but only one specific thing, you also need a lot of instructions in your ISA in order to do slightly different variations of the same thing. Intel's x86 is an example of a CISC architecture.

The counterpart to the above is called RISC (Reduced Instruction Set Computer), which means the ISA is very minimal and the cycles per instruction are lower. But for a given program, the total number of instructions generated will be larger.

Another interesting thing to note: for a RISC processor, most of the complexity of executing a program is delegated to software engineers. As an application programmer, you might wonder, "But I care about no such thing!" True! The real people who took the hit are the brave souls who designed the compiler. For a RISC processor, the C compiler needs to do a lot more work than for a CISC processor, which delegates most of the complexity to the hardware designers. In the former case, the compiler has to figure out which combinations of these teeny instructions to arrange to make the CPU do something, whereas in the latter case, the hardware circuitry handles it and is where the complexity lies.

Let us take a brief moment to look at the software stack involved in executing a program:

[Figure: compiler -> assembler -> linker -> loader]

The chain above is the compilation pipeline. The compiler converts the C program into assembly code specific to the architecture; the assembler replaces the assembly mnemonics with opcodes (machine code) and arranges things in a specific format; the linker resolves references to other programs (libraries) that your program depends on. Now the binary file is ready, sitting on the filesystem. The loader loads the program from the filesystem into RAM; in reality, the operating system performs the function of the loader.

Hopefully, the above article paints a high-level picture of what goes on in a computer. "Bruh! This is pretty basic! I knew all this stuff when I was in the uterus!" If that sounds like you, stay tuned!