
Low Level Programming For Noobs - An introduction to machine code and Assembly Language

29 Apr, 2023

We all know that computers are pretty cool, but we often underestimate just how extraordinary they truly are. Only a few decades ago, many of the tasks we perform on computers today would have been deemed impossible feats of magic.

Today, we have the ability to access the entirety of human knowledge within seconds, communicate instantaneously with people across the world, post Rust memes on Twitter (and bluesky), control spacecraft traversing the solar system, and accomplish countless other incredible things thanks to the computers we've created.

But have you ever paused to consider the intricate inner workings of these machines? Have you ever wondered, "How do computers really work 🤔"? Perhaps you're already familiar with computers; perhaps you're even a software developer or programmer. But do you actually understand the technology that underpins the software you write?

While it's not necessary to comprehend every facet of computer architecture to exploit their potential and tackle complex problems, understanding the underlying principles of computing is an enjoyable and enlightening experience that ultimately enhances your ability to use computers effectively, and write better software.

Some motivation for you

Endless possibilities await you as you deepen your understanding of the fundamental aspects of computing! With this knowledge, you'll have the power to bring even the most complex programs to life, and who knows, you might be inspired to create your very own operating system, programming language, or Twitter clone. The only limit is your imagination, and excitement is guaranteed!

While I strive to impart as much knowledge as possible, I understand that I cannot teach you everything you need to know. You must find your own way in this vast world of computers, discovering every bit and every byte along the way. My goal is to inspire you and show you that it is not as hard as it seems. Here's a poem to inspire you:

β€œinvent yourself and then reinvent yourself, don't swim in the same slough. invent yourself and then reinvent yourself and stay out of the clutches of mediocrity.

invent yourself and then reinvent yourself, change your tone and shape so often that they can never categorize you.

reinvigorate yourself and accept what is but only on the terms that you have invented and reinvented.

be self-taught.

and reinvent your life because you must; it is your life and its history and the present belong only to you.”

― Charles Bukowski, The Pleasures of the Damned.

This tutorial is targeted at programmers with zero prior knowledge of machine code and low-level programming concepts. There may be over-simplifications, and certain explanations may not be the best way to present the concepts discussed. I try my best to be as accurate as possible, while maintaining simplicity. I am not an expert, just a software hobbyist.

How do computers work?

Let us begin by asking the question "How do computers work"?

hiro dinosaur image
Well, I guess programmers write software that..

Yes!! Programmers write software, but they do so by using other software: operating systems, programming language compilers, interpreters, code editors, IDEs, and web browsers, all of which are software made by other programmers.

hiro dinosaur image
Oh no! Another chicken and egg problem 😆

I know, right? They pop up everywhere. Say you have a computer with no software at all. No operating system, web browser, programming language, absolutely nothing but hardware. How would you get the computer to do something like show text on the screen or play a tone from the speaker? In other words, what is the most hello-worldy software that a computer can execute?

hiro dinosaur image
I'm guessing it involves binary thingies like ones and zer..

Yes!! Machine code!

Sorry to interrupt, but I'm quite enthusiastic about this stuff. Modern computers are electronic devices, and as such they operate only by channeling electricity around their electronic components. Let us do a quick history lesson.

hiro dinosaur image
History lessons! I hat.... I mean love them 🙂

Welcome to history class 👨‍🏫

In the early days, machines were built primarily to simplify and streamline daily tasks. Mundane and repetitive tasks are a part of everyday life, and people sought a way to automate these tasks so they could focus on more significant endeavors. However, these machines were inflexible and rigid, making them unsuitable for complex tasks that require decision-making abilities.

The notion of a "programmable" machine emerged as a solution to this problem. One of the most significant contributions to this idea was made by Alan Turing, a British mathematician who lived in the early 20th century. Turing introduced the idea of a "Turing machine," a theoretical device capable of performing any calculation that a human mathematician could accomplish.

hiro dinosaur image
So how did this Turing machine work?

Good question. Imagine a machine with a long strip of paper that has a series of symbols written on it. The machine has a "read-write head" that can move back and forth along the strip of paper, reading the symbols and writing new ones.

turing machine

The machine also has a set of rules that tell it what to do based on the symbol it's currently reading.

For example, let's say that the machine is reading a symbol that says "add." The machine would then look at the next few symbols on the strip of paper and perform an addition operation based on those symbols. Then it would write the result back onto the strip of paper and move the read-write head to the next symbol.

hiro dinosaur image
That's interesting

It is. This idea paved the way for the development of early computers, which were essentially just large, mechanical Turing machines. These machines could be programmed to perform a wide variety of tasks, from basic arithmetic to more complex operations like sorting and searching.

hiro dinosaur image
Alright so how does this relate to modern computers?

The idea of a Turing machine was revolutionary at its inception, but as such machines were scaled up, they got more difficult to build, maintain and operate. At the time, machines were mostly constructed as mechanical devices. There were no screens or keyboards used, just hard metallic parts and vacuum tubes with switches and knobs. Such machines were only used for scientific and military purposes as they were very expensive to build and needed a lot of expertise to operate.

Soon enough, scientists began working on electronic Turing machines that could harness the power of electricity to perform the same tasks done by their mechanical counterparts.

At this point, the word "computer" was already being used to refer to these machines.

hiro dinosaur image
Why were they called computers?

Because they were mostly used for mathematical "computations".

With the advent of electronic computers, a new technology emerged that would revolutionize the field of computing: transistors. They replaced the bulky vacuum tubes and made electronic computers smaller, more reliable, and cheaper to build.

Look, Mom! I'm a computer 🤭

Transistors are tiny electronic switches that can be turned on or off to represent the binary digits 0 and 1. Now imagine you have a small stream of water that you can control with a series of gates. When a gate is closed, the water can't flow through, and when it's open, the water can flow freely. This is similar to how transistors are used in computers.

Instead of a stream of water, we have a flow of electricity that can be controlled by the gates. When the transistor gate is closed, no electricity can pass through, and when it's open, electricity can flow freely. These gates are very tiny and can be turned on or off millions of times per second, allowing the computer to perform complex operations quickly.

Just like how you can direct the flow of water in different directions by opening and closing different gates, the computer can direct the flow of electricity through different transistors to perform different operations. By combining these simple operations, the computer can perform more complex tasks.

In the same way that gates in a water channel can be combined to create different water flow patterns, electronic circuits made up of transistors can be combined to create logic gates that perform simple operations like addition and subtraction.

There are a few different types of logic gates, but two of the most common are the AND gate and the OR gate. The AND gate only produces a "1" output when both of its inputs are "1", and produces a "0" output in all other cases. The OR gate produces a "1" output if either of its inputs are "1", and produces a "0" output only when both inputs are "0".

By combining these simple logic gates, we can create more complex operations. For example, an XOR (exclusive OR) gate produces a "1" output only when exactly one of its inputs is "1". An XOR gate computes the sum bit of a one-bit addition, while an AND gate computes the carry bit, and by chaining these one-bit adders together we can add whole binary numbers.

For example, let's say we want to add the binary numbers 101 and 010 together. We can represent these numbers using transistors and use XOR gates to add them together, like this:

   1     0     1
  XOR   XOR   XOR
   0     1     0
-----------------
   1     1     1

Of course, this is a simplified example and real computers use much more complex circuits to perform addition and other operations, but the basic idea is the same.
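To make the idea concrete, here is a minimal Rust sketch that models gates as boolean functions (the names `and`, `or`, `xor`, and `half_adder` are my own, not standard hardware terms). XOR gives the sum bit of a one-bit addition and AND gives the carry; in the 101 + 010 example above no column ever has two 1s, so every carry is 0 and XOR alone produces the right answer:

```rust
// Logic gates modeled as boolean functions.
fn and(a: bool, b: bool) -> bool { a && b }
fn or(a: bool, b: bool) -> bool { a || b }
fn xor(a: bool, b: bool) -> bool { a != b }

// A half adder: XOR produces the sum bit, AND produces the carry bit.
fn half_adder(a: bool, b: bool) -> (bool, bool) {
    (xor(a, b), and(a, b))
}

fn main() {
    // OR combines carries when one-bit adders are chained into a full adder.
    assert!(or(false, true));

    // Adding the bits of 101 and 010 column by column: no column has two 1s,
    // so every carry is 0 and the sum bits spell out 111.
    for (a, b) in [(true, false), (false, true), (true, false)] {
        let (sum, carry) = half_adder(a, b);
        println!("{} + {} = sum {}, carry {}", a as u8, b as u8, sum as u8, carry as u8);
    }
}
```

Wiring two half adders and an OR gate together gives a full adder, which is how real circuits add numbers of any width.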

hiro dinosaur image
Oh I see where this is going :)
hiro dinosaur image
Once you have the basic building blocks of math, you can then combine them to form even more complex calculations

That is absolutely right. Once you have addition, you can add multiple times to get multiplication. Now I would like to go into more detail about how an electronic computer works by building one with you, but implementing a real computer may be too complex.

hiro dinosaur image
How about we create an imaginary computer that has only a few simple features?

Brilliant idea Dino! Brilliant! Let us call it COM1 as in "Computer 1" since we may build even more complex imaginary computers in the future.

COM1

COM1 is a simple imaginary computer. It is a Turing machine that has a tape with an infinite number of slots for instructions, and a read/write HEAD.

[i0, i1, i2, i3, i4, i5, i6, i7, i8, i9, i10, ... ]
              ^
             HEAD

Each slot can contain a single instruction. The HEAD will then read each instruction starting from i0, and execute them one after the other. The three dots "..." indicate that there are more slots, but they have been omitted for brevity. The instructions inside these slots can be basic tasks like addition and subtraction.

hiro dinosaur image
Hmmmmm... 🤔
hiro dinosaur image
If the tape can contain only instructions like add and subtract, where do we put the numbers to be added/subtracted and where do we keep the result of the computation?

That's a good observation. This means we need a place to store numbers. A place where we can access values at any time during execution. Let's call it a memory unit.

So how would this memory unit (MU) be structured? A good way to have it is to place another tape elsewhere, containing a series of numbers that can be accessed by the first tape.

hiro dinosaur image
I guess instead of calling it 'the first tape', we should probably give it a name

You're right. Let us call it the Central Unit or CU.

So we have a CU that looks like this:

[i0, i1, i2, i3, i4, i5, i6, i7, i8, i9, i10, ...]
              ^
             HEAD

And an MU that looks like this:

[44, 23, 34, 27, ...]

The MU does not need a head as it is simply a bucket for storing raw numbers. Each number has an address that we can use to locate it. We can simply use the index as the address, so the above number 44 has address 0, the number 23 has address 1, the number 34 has address 2 and the number 27 has address 3, and so on.

hiro dinosaur image
So what are the possible instructions that each slot in the CU can have?

Every possible instruction will be represented by a number because the computer can only work with numbers. Therefore, each instruction is already implemented into the hardware of the computer, and we can tell the computer what instruction to execute by giving it the corresponding number to that instruction. Here are a few example instructions:

1 -> Add the number at MU address 0 to the number in MU address 1 and save the result in MU address 2.

2 -> Add the number at MU address 3 to the number in MU address 2 and save the result in MU address 1.

Saying to the computer "Add the number at MU address 0 to the number in..." is impossible because it does not understand language. But it does understand numbers so we can use a unique number to represent every possible instruction.

If we combine everything we have described above, we can now write our first machine code for COM1.

I will give you the code in the Central Unit (CU), and the numbers in the memory unit (MU), then see if you can work out the final values that will be in the MU when all the instructions have been executed.

CU -> [1, 2, 2]

MU -> [2, 4, 1, 1]

Solution

The HEAD starts at the first instruction which is 1. This corresponds to the instruction "Add the number at MU address 0 to the number in MU address 1 and save the result in MU address 2". When this instruction is executed, the resulting state of the CU would be:

[1, 2, 2]
 ^
HEAD

And the state of the MU would be:

[2, 4, 6, 1]

Notice that the number in MU address 2 has changed from 1 to 6. This is because the instruction dictates that we overwrite its value and save only the result of the computation. Also note that when addressing the memory, we start counting from 0 instead of 1.

Now we step forward to the next instruction which has the value 2. The instruction 2 corresponds to "Add the number at MU address 3 to the number in MU address 2 and save the result in MU address 1". So the state of the CU after execution would be:

[1, 2, 2]
    ^
   HEAD

And the MU would be:

[2, 7, 6, 1]

Finally, we step forward to the last instruction which has the value 2. The instruction 2 corresponds to "Add the number at MU address 3 to the number in MU address 2 and save the result in MU address 1". So the state of the CU after execution would be:

[1, 2, 2]
       ^
      HEAD

And the MU would be:

[2, 7, 6, 1]

Notice that no value was changed in the MU? This is because we only modify the destination address, and leave the values being added together unchanged. In this case, the result being stored in the destination address of 1 was 7 and the value it contained previously was also 7 so no change is noticed.
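The worked example above can be simulated in a few lines of Rust. This is my own sketch of COM1 (the function name `run` and the panic on unknown instructions are my additions), hard-coding the two instructions we defined:

```rust
// A sketch of the Harvard-style COM1: instructions live on the CU tape,
// data lives on the separate MU tape.
fn run(cu: &[u8], mu: &mut [u8]) {
    for &instruction in cu {
        match instruction {
            // 1 -> MU[2] = MU[0] + MU[1]
            1 => mu[2] = mu[0] + mu[1],
            // 2 -> MU[1] = MU[3] + MU[2]
            2 => mu[1] = mu[3] + mu[2],
            _ => panic!("unknown instruction {instruction}"),
        }
    }
}

fn main() {
    let cu = [1, 2, 2];
    let mut mu = [2, 4, 1, 1];
    run(&cu, &mut mu);
    println!("{:?}", mu); // prints [2, 7, 6, 1]
}
```

Running it reproduces the final MU state we worked out by hand: [2, 7, 6, 1].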

hiro dinosaur image
Woah! That was fun!

Yeah! But did you notice anything strange in our implementation of COM1?

hiro dinosaur image
The name?

Not that. We previously agreed that computers only know the binary states of 1 and 0, but our instructions were being represented as normal decimal numbers, and the numbers stored in the MU were also represented as decimal numbers.

hiro dinosaur image
Yeah! How is that possible?

I only used decimal numbers for the convenience of explaining them. In an ideal imaginary computer, the instructions we encoded as [1, 2, 2] would look something like [0b1, 0b10, 0b10] when written in binary, and the MU we wrote as [2, 4, 1, 1] would be [0b10, 0b100, 0b1, 0b1]. The brackets are not important except for showing that the numbers are in a sequence, and the 0b prefix is just to differentiate binary numbers from decimal numbers.
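Incidentally, the 0b prefix is not just a paper convention; Rust (like many languages) accepts it directly in source code. A quick check that the two spellings name the same numbers:

```rust
fn main() {
    // The encoded instructions [1, 2, 2] written with binary literals:
    assert_eq!([0b1, 0b10, 0b10], [1, 2, 2]);
    // ...and the MU contents [2, 4, 1, 1]:
    assert_eq!([0b10, 0b100, 0b1, 0b1], [2, 4, 1, 1]);
    println!("binary and decimal notations agree");
}
```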


This system of separating the instructions being executed from the data being manipulated (like we did with the CU and MU) is known as the Harvard architecture. However, modern computers use a different architecture:

Von Neumann architecture

Von Neumann architecture is a theoretical model for a digital computer that describes how a computer should be organized and how it should function. The architecture is named after John von Neumann, a prominent mathematician and computer scientist.

The Von Neumann architecture has five main components: an arithmetic logic unit (ALU) that performs calculations, a control unit that fetches and executes instructions, a memory unit that holds both instructions and data, and input and output devices.

Our first implementation of COM1 uses a Harvard architecture because the instructions are stored in the Central Unit (CU) and the data is stored separately in the Memory Unit (MU).

Let us now modify our imaginary computer to use a Von Neumann architecture instead.

COM1 will no longer store instructions separately from the numbers being manipulated. Instead, we will store both instructions and numbers on the same tape. The Central Unit (CU) will be responsible for finding the next instruction and executing it.

We will no longer call it the Central Unit but the Processor from now on. Also, the tape containing both the data and instructions will be called the Memory. Since we have combined these two tapes, we will no longer use a HEAD to specify the current instruction but use something called the Program Counter or PC.

Therefore, the memory would look like this:

[op1, op2, op3, op4, op5, op6, op7, ...]
            ^
           PC
hiro dinosaur image
What does the op mean?

op means "opcode", short for operation code. Strictly speaking, an opcode is the number that identifies an instruction, but since instructions and data are all just numbers sitting in the same memory, we will loosely call every cell an opcode.

hiro dinosaur image
How would the computer differentiate code from data?

The flow of execution guides the computer on what should be executed as instructions and what should be used as data. As we run some examples, it will be clear how this works in practice.

The following is an example of values in the memory of COM1:

[1, 4, 5, 2, 0, 12, 3]
 ^
PC

Any one of these opcodes could potentially be executed by the processor as an instruction, or used as input for an instruction. For example, the first opcode above is 1. The opcode 1 could mean "add the next two numbers together and store the result in the next position, then advance the PC by 3". If we write this in pseudocode, it would look like this:

PC + 1 = (PC + 1) + (PC + 2)
PC = PC + 3

Breaking this down, it would mean:

PC + 1 represents the next memory address after the program counter (i.e. the address of the number 4 from our above example). We will be assigning the result of the computation to this address.

(PC + 1) represents the value stored at the next memory address after the program counter (i.e. the number 4).

(PC + 2) represents the value stored two memory addresses after the program counter (i.e. the number 5).

PC = PC + 3 means after we add the two values and save the result, we will then move the PC forward by three addresses. When this instruction has been executed, the content of the memory will then look like this:

[1, 9, 5, 2, 0, 12, 3]
          ^
         PC

As you have seen in this new implementation, the program counter (PC) starts in memory address 0 and increments when an instruction is executed. The next instruction to be executed will depend on the location of the PC after the last instruction is done. In the above example, the number 1 was the first instruction, the numbers 4 and 5 were the input to that instruction, and the result (9) was stored in the position of the number 4.

Because the program counter was moved forward by the last instruction, the processor does not accidentally use the second and third opcodes as instructions and instead executes the fourth opcode 2 as the next instruction.

Let us define one more instruction:

Opcode 2 -> GOTO (PC + 1)

This translates to: the opcode 2 means the program counter should go to the memory address represented by the number in the next address. Writing it in pseudocode would be:

PC = (PC + 1)

Incidentally, the program counter in our previous example is currently pointing at the number 2. What do you think the memory would look like after this instruction is executed?

hiro dinosaur image
I think the program counter will move back to the first instruction 1

That is correct. The next number after the instruction 2 is 0. Therefore, the program counter goes to the address 0 which is the first instruction.

hiro dinosaur image
OMG we have a loop 😱

Yes!! You can see how simple instructions can be combined to make even more complex computations. If we allow the above program to continue running, the value at address 1 will keep increasing forever (at least on an imaginary machine; a real memory cell holds a fixed number of bits and would eventually overflow).

Now COM1 looks a lot more like a real computer than an imaginary computer.

hiro dinosaur image
What do you mean?

This is how modern computers work. They have a central processing unit that executes code from the memory, and use the surrounding data as input. They are more complex than this and they can have thousands of different kinds of instructions, but the principle is the same.

hiro dinosaur image
That's pretty cool 😇

It is that simple, but there is one part we're getting wrong. It is misleading to use decimal numbers when writing machine code, but the machine code we've been writing so far does exactly that. Ideally, the above example machine code should be written in binary numbers to look more like what the computer sees:

[1, 100, 101, 10, 0, 1100, 11]
hiro dinosaur image
This looks errr... clunky 😫

Understandable. It is inconvenient to write code in binary numbers, especially when we're simply representing the code on paper. Using decimal numbers to represent machine code is not so bad, but there is another number system that sits just right between the binary numbers that the machine understands, and the decimal numbers we find convenient. They are called hexadecimal numbers.

Hexadecimal Numbers

We hate binary. It's too clunky. Thankfully, hexadecimal numbers are much easier to work with than binary. Hexadecimal numbers are base 16, which means there are 16 symbols used to represent a number instead of the 2 symbols used in binary (0 and 1) or the 10 symbols used in decimal (0-9).

The symbols used in hexadecimal are the numbers 0-9 and the letters A-F. So instead of counting from 0 to 9 before moving to the next digit like in decimal, we count from 0 to F before moving to the next digit.

For example, the number 15 in decimal is represented as F in hexadecimal. The number 255 in decimal is represented as FF in hexadecimal.

hiro dinosaur image
Why not just use decimal instead of using a new number system?

Well, there's a lot to it. It is easier to convert hexadecimal numbers into binary than decimal numbers, and hex numbers are well suited for representing bytes of data which we handle very often in low-level programming. If you want to know more about the advantages of hexadecimal numbers, I have written an article that goes into more detail about it.
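If you want to play with these conversions yourself, Rust's formatting machinery handles all three notations. A small sketch:

```rust
fn main() {
    // Formatting: {:x} prints hexadecimal, {:b} prints binary.
    assert_eq!(format!("{:x}", 255), "ff");
    assert_eq!(format!("{:b}", 12), "1100");

    // Parsing: from_str_radix reads a number written in the given base.
    assert_eq!(u8::from_str_radix("ff", 16).unwrap(), 255);
    assert_eq!(u8::from_str_radix("1100", 2).unwrap(), 12);

    println!("all conversions agree");
}
```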

Now when we convert the example machine code we wrote earlier from decimal numbers to hexadecimal numbers:

Before:

[1, 4, 5, 2, 0, 12, 3]
 ^
PC

After:

[0x1, 0x4, 0x5, 0x2, 0x0, 0xc, 0x3]
  ^
 PC

The 0x before each opcode is just a prefix we use when writing hexadecimal numbers to avoid confusing them for decimal numbers.

talk is cheap

Implementing COM1 in Rust

We have theoretically designed a simple imaginary computer, but having it only on paper is not so fun. How about we make a COM1 virtual machine?

A virtual machine or VM is basically just software that tries to mimic the operation of hardware. By making a virtual machine, we essentially build a computer from the ground up using software, without having to work directly with physical hardware. It also gives us the flexibility of inspecting the operation of the computer and probing every instruction it executes.

We have developed COM1 enough to simulate it in software, so hopefully by implementing it in the Rust programming language, it will help us understand how COM1 works in practice.

hiro dinosaur image
But why Rust?

If you know anything about me, then you should know that I love Rust. So why not Rust? Frankly, you don't need to know Rust to follow along, heck you don't even need to know any programming language. The goal is to illustrate how COM1 works in practice. I will keep it as simple as possible, and explain every part of the code. It'll be fun I promise.

To follow along, you need to install Rust. In this tutorial, I assume you are using a Unix-like operating system such as Linux or macOS. If you are on Windows, you can translate the Unix commands to their Windows equivalents, or just use WSL.

Create a new Rust project:

cargo new com-1

Then move into the created directory:

cd com-1

Now if you run cargo run in the terminal, you should see "Hello, world!" printed on your screen.

The plan for this virtual machine is to read a text file containing a sequence of hexadecimal numbers, then store these numbers in an array that will represent the memory of the virtual machine. Then it reads each number from the memory and performs some actions depending on what opcode the number represents.

To get started, we need to read the contents of the text file. Edit the main.rs file in the src directory and add the following:

fn main() {
    let source = std::fs::read_to_string("./code.txt").unwrap();
}

We used the read_to_string() function from the fs module in the standard library to read a file code.txt into a string. Whatever the file contains will simply be copied into a string variable called source. The unwrap() call makes the program crash with an error if the file cannot be read.

You can now create a code.txt file in the root directory of the rust project:

touch code.txt

Edit the text file and add the following:

[0x1, 0x4, 0x5, 0x2, 0x0, 0xc, 0x3]

Now when the program runs, the content of the text file will be stored in the source variable. This will be the starting value of the virtual machine's memory.

It will be difficult to work with the string as it is, so we need to deserialize it into an array of numbers. To do this, we will use the Ron library.

hiro dinosaur image
Hey! why not the serde_json library?

serde_json does not allow you to have hexadecimal numbers, so Ron is more suitable for this project.

To add Ron as a dependency, modify the Cargo.toml file and add the following:

[package]
name = "com-1"
version = "0.1.0"
edition = "2021"

[dependencies]
ron = "0.8.0" # new

Now we can use Ron to deserialize the source:

fn main() {
    let source = std::fs::read_to_string("./code.txt").unwrap();
    let mut memory: Vec<u8> = ron::from_str(&source).unwrap();
}

We declared a mutable variable called memory with the type Vec<u8>, a growable array of 8-bit unsigned numbers (bytes), each holding a value from 0 to 255. Then we assign to it the value returned from the function ron::from_str(). This function takes the source string as an argument and deserializes it into a Vec<u8>.

Now we have the virtual machine's memory. We declared it as mutable because we will be changing its value during execution. For sanity, we can print the memory to see its content:

fn main() {
    let source = std::fs::read_to_string("./code.txt").unwrap();
    let mut memory: Vec<u8> = ron::from_str(&source).unwrap();

    println!("STARTING MEMORY => {:?}", memory);
}

By running cargo run, you should see the following text printed in the terminal:

STARTING MEMORY => [1, 4, 5, 2, 0, 12, 3]

As you can see, the starting value of the memory is exactly what we wrote in the code.txt file, but look! The hexadecimal numbers now appear as decimal numbers. Ron parsed the hexadecimal text into plain numeric values, and println! simply displays those values in decimal by default.

Now we have to define a few variables to be used later:

// after println
let clock_speed = 1000;
let memory_length = memory.len();
let mut pc = 0;

Every computer processor has a clock speed, and so does our virtual machine. The clock speed determines the number of instructions that a processor can execute per second. In other words, it indicates how quickly the processor can perform calculations and process data.

The clock speed is typically measured in gigahertz (GHz), with modern processors operating at speeds ranging from a few hundred megahertz to several gigahertz. The higher the clock speed, the faster the processor can perform computations and process data.

In this case, the clock_speed variable we defined is in milliseconds (strictly speaking, it is a clock period: the bigger the number, the slower the machine). To simulate a clock speed, we sleep for that many milliseconds before executing the next instruction.

We defined memory_length so that we know how many opcodes are in the memory for later use.

Finally, we define the program counter (PC). The pc variable holds.... the program counter of our virtual machine. We defined it as mutable because its value will be modified during execution.

Now, the fun part:

// after pc
while pc < memory_length {
    std::thread::sleep(std::time::Duration::from_millis(clock_speed));

    execute(&mut pc, &mut memory);
}

We write a while loop that repeatedly calls the execute function which we will define later, but before every iteration, it sleeps for a few milliseconds depending on the value of clock_speed. We previously defined the value of clock_speed as 1000 milliseconds which is equal to 1 second.

Putting it all together, our main function should look like this:

fn main() {
    let source = std::fs::read_to_string("./code.txt").unwrap();
    let mut memory: Vec<u8> = ron::from_str(&source).unwrap();

    println!("STARTING MEMORY => {:?}", memory);

    let clock_speed = 1000;
    let memory_length = memory.len();
    let mut pc = 0;


    while pc < memory_length {
        std::thread::sleep(std::time::Duration::from_millis(clock_speed));

        execute(&mut pc, &mut memory);
    }
}

Now we write a new function called execute:

// after main function
fn execute(pc: &mut usize, memory: &mut Vec<u8>) {
    let opcode = memory[*pc];

    match opcode {
        // NOP
        0 => {
            *pc += 1;
        }

        // ADD
        1 => {
            memory[*pc + 1] = memory[*pc + 1] + memory[*pc + 2];

            *pc += 3;
        }

        // GOTO
        2 => {
             *pc = memory[*pc + 1] as usize;
        }

        _ => {
            panic!("Invalid opcode {}", opcode);
        }
    }

    println!("Processor executed opcode {} - MEMORY => {:?} - PC => {}", opcode, memory, pc);
}

There you have it. A virtual machine that works just like our imaginary COM1 computer.

hiro dinosaur image
Mind explaining what the execute function is doing?

Before the execute function was called, we defined some values: the memory and the pc. When we call the execute function, we pass these two values as parameters.

Inside the execute function, we first index into the memory and store the value that the pc is pointing at in the opcode variable. We then use a match statement to check if the opcode matches any of the defined instructions. When it finds a match, it executes the instruction and increments the pc by a number depending on the instruction.

I have defined three instructions: 0, 1, and 2. If an opcode does not match any of these instructions, it will match the underscore _ case which will cause the program to crash and print an error showing the opcode.

The 0 instruction is a NOP instruction which means "No operation". It does nothing except increment the pc by 1.

The 1 instruction is our "add the next two numbers together and store the result in the next position, then advance the PC by 3" instruction.

The 2 instruction is our GOTO instruction.

When an instruction is done executing, we print the opcode that was just executed, the value of the memory, and the value of the pc. By printing these values, we see what changes the instruction has made. For example, if the instruction was a GOTO instruction, we should see the value of the pc change.

Putting it all together, our COM1 virtual machine's code should look like this:

fn main() {
    let source = std::fs::read_to_string("./code.txt").unwrap();
    let mut memory: Vec<u8> = ron::from_str(&source).unwrap();

    println!("STARTING MEMORY => {:?}", memory);

    let clock_speed = 1000;
    let memory_length = memory.len();
    let mut pc = 0;


    while pc < memory_length {
        std::thread::sleep(std::time::Duration::from_millis(clock_speed));

        execute(&mut pc, &mut memory);
    }
}

fn execute(pc: &mut usize, memory: &mut Vec<u8>) {
    let opcode = memory[*pc];

    match opcode {
        // NOP
        0 => {
            *pc += 1;
        }

        // ADD
        1 => {
            memory[*pc + 1] = memory[*pc + 1] + memory[*pc + 2];

            *pc += 3;
        }

        // GOTO
        2 => {
            *pc = memory[*pc + 1] as usize;
        }

        _ => {
            panic!("Invalid opcode {}", opcode);
        }
    }

    println!("Processor executed opcode {} - MEMORY => {:?} - PC => {}", opcode, memory, pc);
}

Now you can run it with:

cargo run

If everything works fine, we should see values printed on the terminal every 1 second. The virtual machine executes an instruction, waits for 1 second, then executes the next instruction, printing the values of the executing opcode, memory, and pc. You should see something like this:

STARTING MEMORY => [1, 4, 5, 2, 0, 12, 3]
Processor executed opcode 1 - MEMORY => [1, 9, 5, 2, 0, 12, 3] - PC => 3
Processor executed opcode 2 - MEMORY => [1, 9, 5, 2, 0, 12, 3] - PC => 0
Processor executed opcode 1 - MEMORY => [1, 14, 5, 2, 0, 12, 3] - PC => 3
Processor executed opcode 2 - MEMORY => [1, 14, 5, 2, 0, 12, 3] - PC => 0
Processor executed opcode 1 - MEMORY => [1, 19, 5, 2, 0, 12, 3] - PC => 3
Processor executed opcode 2 - MEMORY => [1, 19, 5, 2, 0, 12, 3] - PC => 0
Processor executed opcode 1 - MEMORY => [1, 24, 5, 2, 0, 12, 3] - PC => 3
Processor executed opcode 2 - MEMORY => [1, 24, 5, 2, 0, 12, 3] - PC => 0
Processor executed opcode 1 - MEMORY => [1, 29, 5, 2, 0, 12, 3] - PC => 3
Processor executed opcode 2 - MEMORY => [1, 29, 5, 2, 0, 12, 3] - PC => 0
Processor executed opcode 1 - MEMORY => [1, 34, 5, 2, 0, 12, 3] - PC => 3
Processor executed opcode 2 - MEMORY => [1, 34, 5, 2, 0, 12, 3] - PC => 0
Processor executed opcode 1 - MEMORY => [1, 39, 5, 2, 0, 12, 3] - PC => 3
Processor executed opcode 2 - MEMORY => [1, 39, 5, 2, 0, 12, 3] - PC => 0
Processor executed opcode 1 - MEMORY => [1, 44, 5, 2, 0, 12, 3] - PC => 3
Processor executed opcode 2 - MEMORY => [1, 44, 5, 2, 0, 12, 3] - PC => 0
hiro dinosaur image
That is soooo cooool!!! 😱

Yes, it is. Recall that the machine code currently being executed by the virtual machine is written in the code.txt file:

[0x1, 0x4, 0x5, 0x2, 0x0, 0xc, 0x3]

If we modify these values, the behavior of the virtual machine should change. Can you program it to do something different?

hiro dinosaur image
yes! yes! let me try!
hiro dinosaur image
[0x1, 0x4, 0x5, 0x1, 0x0, 0xc, 0x2, 0x3]
STARTING MEMORY => [1, 4, 5, 1, 0, 12, 2, 3]
Processor executed opcode 1 - MEMORY => [1, 9, 5, 1, 0, 12, 2, 3] - PC => 3
Processor executed opcode 1 - MEMORY => [1, 9, 5, 1, 12, 12, 2, 3] - PC => 6
Processor executed opcode 2 - MEMORY => [1, 9, 5, 1, 12, 12, 2, 3] - PC => 3
Processor executed opcode 1 - MEMORY => [1, 9, 5, 1, 24, 12, 2, 3] - PC => 6
Processor executed opcode 2 - MEMORY => [1, 9, 5, 1, 24, 12, 2, 3] - PC => 3
Processor executed opcode 1 - MEMORY => [1, 9, 5, 1, 36, 12, 2, 3] - PC => 6
Processor executed opcode 2 - MEMORY => [1, 9, 5, 1, 36, 12, 2, 3] - PC => 3
Processor executed opcode 1 - MEMORY => [1, 9, 5, 1, 48, 12, 2, 3] - PC => 6
Processor executed opcode 2 - MEMORY => [1, 9, 5, 1, 48, 12, 2, 3] - PC => 3

That's a good one! Can you explain what the computer is doing?

hiro dinosaur image
It starts with an add instruction, then another add instruction, then a GOTO instruction that moves the PC to the second add instruction

That is correct. The effect of that machine code is that the value at memory address 0x4 keeps increasing forever. Well, not quite forever: if we leave it running for some time, the program crashes once the addition overflows past 255. This is a limitation of our virtual machine because we defined each memory cell to be 8 bits wide, so it can only hold values from 0 to 255.

hiro dinosaur image
Can we add more instructions now?

Yes, we can. Let us add a subtract instruction and a write instruction:

fn execute(pc: &mut usize, memory: &mut Vec<u8>) {
    let opcode = memory[*pc];

    match opcode {
        // NOP
        0 => {
            *pc += 1;
        }

        // ADD
        1 => {
            memory[*pc + 1] = memory[*pc + 1] + memory[*pc + 2];

            *pc += 3;
        }

        // GOTO
        2 => {
            *pc = memory[*pc + 1] as usize;
        }

        // SUBTRACT
        3 => {
            memory[*pc + 1] = memory[*pc + 1] - memory[*pc + 2];

            *pc += 3;
        }

        // WRITE
        4 => {
            let target = memory[*pc + 1] as usize;
            let value = memory[*pc + 2];
              
            memory[target] = value;

            *pc += 3;
        }

        _ => {
            panic!("Invalid opcode {}", opcode);
        }
    }

    println!(
        "Processor executed opcode {} - MEMORY => {:?} - PC => {}",
        opcode, memory, pc
    );
}

The SUBTRACT instruction is quite similar to the ADD instruction. It uses opcode 3, and we simply replace the + with a -.

The WRITE instruction is for directly writing a value to an address in memory. It uses opcode 4: it reads a target address from the position right after the pc, reads a value from the position after that, and stores that value at the target address. In pseudocode, it looks like this:

MEMORY(MEMORY(PC + 1)) = MEMORY(PC + 2)

PC = PC + 3

To test this, we can write a machine code that uses both the subtract and the write instructions:

[0x3, 0x9, 0x1, 0x4, 0x2, 0xF]

When you write the above machine code in code.txt and run it, we should see the following output:

STARTING MEMORY => [3, 9, 1, 4, 2, 15]
Processor executed opcode 3 - MEMORY => [3, 8, 1, 4, 2, 15] - PC => 3
Processor executed opcode 4 - MEMORY => [3, 8, 15, 4, 2, 15] - PC => 6

First, we see that the 3 opcode is executed, which is a subtract instruction. It subtracts 1 from 9 and writes the result in memory address 1. Therefore memory address 1 which had a value of 9 at the start of the program now has a value of 8. Then the PC is incremented to a value of 3.

The opcode under the PC is now 4. This corresponds to the write instruction which writes the value 15 to the memory address 2. We can see that the value in that address is now 15, and the PC has been incremented to a value of 6. The program now exits as there are no more instructions to execute.

hiro dinosaur image
How did you determine what opcodes should represent what instruction?

I used whatever number I could think of since it is just an imaginary computer. However, if you are programming a real computer, the instructions have been pre-determined by the manufacturer of the processor. The set of instructions a processor uses and understands is known as the Instruction set of the processor.

Instruction Set Architecture (ISA)

An Instruction Set Architecture (ISA) is a term used to describe the set of instructions and features that a particular processor supports. An ISA is essentially the interface between the hardware and the software that runs on the processor. It defines how a processor executes a set of instructions, how data is stored and manipulated, and how input and output operations are performed.

There are many different ISAs in use today, ranging from simple embedded processors to complex server-class CPUs.

Each ISA has its own set of advantages and disadvantages. Some ISAs are designed to be simple and efficient, with a small set of instructions that can be executed quickly. These ISAs are often used in embedded systems and other applications where low power consumption and small size are important factors. Other ISAs are designed to be more complex and feature-rich, with a larger set of instructions that can be used to implement more advanced algorithms and data structures. These ISAs are often used in high-performance computing and other applications where speed and functionality are more important than power consumption.

One of the advantages of an ISA is that it provides a standardized interface between hardware and software. By providing a set of instructions that abstract away the details of how the hardware actually works, an ISA allows software developers to focus on the higher-level algorithms and data structures that are central to their application.

Think of the instruction set of a processor as a table with all of the processor's opcodes, and what each one of those opcodes means to the processor.

Some of the most common ISAs are x86, ARM, RISC-V, MIPS, and PowerPC.

Notice that Instruction Set Architecture, Architecture, and Instruction Set are sometimes used interchangeably.

For now, we will suspend our work on the COM1 imaginary computer and focus on the x86 architecture that real computers use, then subsequently we will explore more architectures such as RISC-V and ARM.

The x86 Instruction Set

The x86 architecture is one of the most widely used for computer processors, with a history that dates back to the early days of personal computing. It has a complex and diverse instruction set, with hundreds (or thousands) of opcodes that enable a wide range of operations, from basic arithmetic to advanced system-level tasks.

The x86 instruction set is infamously known to be overly complicated with a f*** ton of instructions. The complexity of x86 is further compounded by the requirement to maintain compatibility with earlier versions of the architecture.

While preserving legacy code is certainly advantageous, it can also create a considerable burden for programmers who work with the architecture. Achieving a delicate balance to ensure compatibility with older software and hardware systems can be challenging.

For this reason (and many others), directly writing machine code with a processor's raw opcodes is time-consuming and impractical. This is not a problem specific to x86, but it is more pronounced there. If only there was a way for us to write machine code without actually writing opcodes as hexadecimal numbers πŸ€”.

Assembly language

What if we could simplify the process of writing machine code such that we could easily understand what the code is doing just by looking at it?

We can achieve this by creating a special program that takes in a series of words like ADD, SUBTRACT, WRITE as input, and generates machine code that corresponds to these words. We refer to these short words as mnemonics.

If you recall the COM1 virtual machine we built earlier, the opcode for the ADD instruction was 1. Therefore, we can write a program that takes in a file containing mnemonics instead of opcodes, and generates a new file containing the corresponding opcodes of those mnemonics. If it encounters an ADD mnemonic, it outputs a 1 opcode.

We can then define a set of standard mnemonics that correspond to all the opcodes of an instruction set. This set of mnemonics and the rules for how they are converted to machine code is known as an "Assembly language".

hiro dinosaur image
Does this mean that each Instruction set has its own special assembly language?

That is correct. There is an x86 assembly language, there is an ARM assembly language, and so on. The program that converts the mnemonics of an assembly language into machine code is known as an assembler.

hiro dinosaur image
But how do you create the assembler? do you write it using just opcodes?

Yes! The first assembler may be written manually with the opcodes of a processor, and then using that first assembler, more advanced assemblers can be built.

In software development, the process of using a primitive and old tool to make a better tool is known as bootstrapping. Just like you use a not-so-sharp tool to make a sharper tool, and then use that sharper tool to make an even sharper tool, you can use raw machine code to make a limited assembler, and then use that assembler to make a more advanced assembler.

Luckily for us, other programmers have done the dirty work of defining assembly languages and making assemblers for them, so we only need to have an assembler to write machine code for a particular processor architecture.

hiro dinosaur image
If I write assembly code and assemble it using an assembler on my computer, can I run the resulting machine code on another computer?

Yes, you can, but only if both computers have the same architecture (and, in practice, a compatible operating system). If you write x86 assembly code, you cannot run the resulting machine code on a computer with an ARM architecture.

x86 Assembly Language

As much as I would love to continue this tutorial, it is getting quite lengthy. I tell you what, how about we make this a series? If you would like me to make this a low-level programming series, tweet at me or message me on my discord server. I would love to hear your feedback and answer your questions. Thanks for reading, and see you in the next one.

Also, consider donating or becoming a sponsor to help me continue making open-source software and writing cool programming content.
