Introduction#
Before you attend this week’s lab, make sure:
- you can use VSCode to edit assembly (
.S
) files - you can connect to a debugging session on your microbit (or emulator)1
- you have completed the midsem break lab
- you have read the lab content below
If you’re not confident about that stuff then feel free to have a look over the midsem lab again or ask your tutor for help.
In this week’s lab you will:
- write your own first pieces of machine code
- learn about how instructions (i.e. the program) are represented and executed on your microbit
- learn some assembler directives for getting data into your program
- translate simple mathematical expressions into sequences of assembly instructions
- watch the status register and monitor the condition flags
- branch (jump) around in your program, including conditional branches
- learn how to write an if statement, while loop and for loop
- learn how to use the record and array data structures
This lab is long, and covers a large amount of content. This is because most of it should be a refresher and slight extension on what you have already seen in the first half of the course.
In the first half of the course, you saw how your CPU/MCU is made of logic gates that form different components. You also saw how you can use specially crafted words (i.e. numbers) stored in memory to instruct your CPU to do things (e.g., add two values). You may remember this as forming opcodes (operation codes) and instructions.
Today you will be having a look at the form these opcodes and instructions take in the ARM assembly language, which you’ll be using for the rest of the course.
You’ll also start to see more clearly the connections between what we’ve been
covering in this course and the higher-level programming languages you’re used
to, with “high-level” if
statements, for
loops, and other structures. This
process of “demystifying” programming is a big part of what this course is
about, so take the time to reflect on what you’re doing and how it fits in with
what you know and do in other programming situations.
Some of the content in this lab might feel like a refresher to what was covered in
first half of the course. That’s a good thing! it means you’re actively engaging
with and learning the content. Just keep in mind that ARM does differ to QuAC, so make sure
you’re still reading everything here carefully.
A good example of this is the fact that the microbit uses a 32bit CPU, whereas with QuAC we were
working with a 16bit CPU. This gives us more room to fit things in the instruction encoding,
so you won’t find things like MOVL
anymore, we just have MOV
.
Some new resources are now available for Microbits! Check out the following!
Task 0: Opening the Lab Packs#
Plug in your microbit, fork & clone the lab pack template
to your machine if you haven’t already, and open the lab-07
folder in VSCode,
then open and modify the src/main.S
file as usual.
Make sure to have the lab-07
folder in particular open; the Makefile needs to be in the
top level directory for the Build/Upload commands to work!
Once you have done this, build and run your program with the debugger.
Throughout this lab you may find the following tool to convert between binary, hexadecimal and decimal representations of 32-bit numbers useful.
Decimal | |
Hex | |
Binary |
Task 1: Bit Vectors#
You can store arbitrary text into your program using the
.ascii
compiler
directive. This will store the text following the .ascii
directive into the
program using the ASCII
encoding.
We won’t really be using the .ascii
directive much beyond this exercise. The
more common directives you will see are:
This directive uses the ASCII character code to determine exactly what bytes the assembler puts in your program. These bytes are put into memory in the order they are listed — the lowest byte in memory is used to store the first character, the second lowest byte is used to store the second character, and so on.
We can use .ascii
to load some new data into register r1
by making your
main.S
look like:
.syntax unified
.global main
cope:
.ascii "COPE"
.type main, %function
main:
@ load "COPE" into r1
ldr r1, cope
@ fill in the rest of the instructions here!
@ infinite catch loop
inf_loop:
nop
b inf_loop
This code introduces a new compiler directive: labels. From the documentation:
A label is written as a symbol immediately followed by a colon
:
. The symbol then represents the current value of the active location counter, and is, for example, a suitable instruction operand. You are warned if you use the same symbol to represent two different locations: the first definition overrides any other definitions.
So in the code above there are two new labels: cope
and inf_loop
. A label
is a way of attaching a human-readable name to a location in your sequence of
instructions (i.e. your program). But always remember that it’s just a location
marker, and once your program is running on your board it will have a specific
memory address which you can store in a register, do arithmetic on, etc.
Using the Memory Viewer#
Once you’ve started running this program in the debugger, you can inspect the value stored at the “cope” label directly.
When the debugger has stopped at main, use VsCode’s Command Palette to run the “COMP2300: View Memory (New Tab)” command:
This will ask you for a memory address (in this case enter 0x200
) and a length
(in this case, any number greater than 4 will suffice. The following example
uses a length of “64”). This will open a new
tab in your editor that shows you the values currently stored in memory starting
at address 0x200
and continuing for length bytes:
This is similar to how you would inspect the memory of your QuAC CPU by right-clicking on the RAM component.
What will you see in r1
after the ldr r1, cope
line?
Your goal in this exercise is to isolate and shift individual bytes within the
"COPE"
word:
- first, change it into
"HOPE"
and store inr2
- then, change it into
"HOPS"
and store inr3
Note: you do not have to store the word back into memory! In fact, you will not be able to do this because the “HOPS” is written in the code section of memory, which is read-only.
Each of these steps requires isolating and manipulating one 8-bit (1-character) part of the 32-bit word without messing with the rest of it. We will provide you with the following hints to get started:
- The character “C” is encoded as
0x43
(67), the character “O” as0x4F
(79), the character “P” as0x50
(80) and “E” as0x45
(69). You should look at the “Printable Character Table” linked above to find out what “H” and “S” are encoded as. - The value you will see in
r1
afterldr r1, cope
is executed will be0x45504f43
. - In the ASCII character encoding you can add 5 to change a
C
into anH
, and add 14 to change anE
into anS
. - The new characters need to be shifted back into the correct position and then need to replace the appropriate character in the original word.
- If you have completed this exercise correctly then
r2
will equal0x45504f48
andr3
will equal0x53504f48
.
It might be helpful to use a piece of paper here: write out what the "COPE"
data looks like in memory (remember endianness!), and figure out what
shuffles, logical operations, or arithmetic operations you need to make the
transformations into "HOPS"
. If you’re stuck, think what bit-vector
operations will remove, replace or combine information in your registers (you
have a cheat sheet for this! ARM assembly cheat sheet).
“Shuffles” in this case mean shift operations. In ARM, you have the lsl
and lsr
instructions to shift values in registers around by numbers of bits. There are other
types of shifts that you can consult the lectures or the cheat sheet to learn about.
Each individual ascii character takes up one byte, which is equal to 8 bits. What does this tell you about how many bits you need to shift by in order to “target” the different ascii characters?
Here’s a simple exercise to check your understanding:
Suppose you had a number 0xAABBCCDD
stored in r2
and the number 1
stored
in register r3
. Determine how many bits (and in which direction) would r3
need to be shifted by to:
- Add
1
to0xDD
(the least significant/0th byte) - Add
1
to0xCC
(the second least significant/1st byte) - Add
1
to0xBB
(the second most significant/2nd byte) - Add
1
to0xAA
(the most significant/3rd byte) - Add
64
to0xCC
(the second least significant/1st byte)
Click here to see the answers and check your understanding
- To add
1
to0xDD
, you wouldn’t need to shift the value at all. -
To add
1
to0xCC
, you will first need to shiftr3
left by 8 bits:lsl r3, 8 @ becomes 0x100 add r2, r3
-
To add
1
to0xBB
, you will first need to shiftr3
left by 16 bits:lsl r3, 16 @ becomes 0x10000 add r2, r3
-
To add
1
to0xAA
, you will first need to shiftr3
left by 24 bits:lsl r3, 24 @ becomes 0x1000000 add r2, r3
-
To add
64
to0xCC
you will need to shiftr3
left by 14 bits. This is because1 << 6
is equal to 64, and to target the 1st byte we also need to shift left by8
:lsl r3, 14 @ becomes 0x4000 add r2, r3
There are several ways to do this, how many can you think of? Show your program to your neighbour or tutor to get ideas about how it could be done differently.
Finalise your program so that the main
function performs the "COPE"
->
"HOPE"
-> "HOPS"
transformation and leaves the "HOPS"
value in r3
.
Copy the code into tasks/task-1.S
. Commit and push your changes with the
message “completed task 1”. The CI will run a test of your code to verify you
have completed this task correctly.
Task 2: Using Memory#
In the previous we introduced the ascii
directive to store the word "COPE"
in your program memory (the section of memory where your microbit looks for
instructions to execute). While it is okay to store data there, we have a
dedicated section of memory for storing data to (the RAM). To access that
section of memory, we need to look at memory sections.
Sections in Memory#
Sections in your program are directives (so they start with a .
) to the
assembler that the different parts of our program should go in different
parts of the microbit’s memory space. Some parts of this address space are
for instructions which the microbit will execute, but other parts contain
data that your program can use.
Your program can have as many sections as you like (with whatever names you like) but there are a couple of sections which the IDE & toolchain will do useful things with by default:
-
if you use a
.text
section in your program, then anything after that (until the next section) will be put inFlash
as program code for your microbit to execute (referred to a ‘text memory’ from this point) -
if you use a
.data
section, then anything after that (until the next section) will be put inRAM
as memory that your program can use to read/write the data it needs to do useful things (referred to as ‘data memory’ from this point)
If you’re interested in what other directives exist, you can check them out here.
When you create a new main.S
file, any instructions you put are put in the
.text
section until the assembler sees a new section directive.
Here’s an example:
main:
ldr r0, =main
ldr r1, =storage
.data
storage:
.word 2, 3, 0, 0
.asciz "Computer Architecture"
Looking at the microbit’s address space map and running the program above,
where do you think the main
and storage
parts of your program are ending up?
Can you find the string “Computer Architecture” in memory?
Try and find it in the memory view.
You can interleave the sections in your program if it makes sense:
.text
program:
@ ...
.data
storage:
@ ...
.text
more_program:
@ ...
.data
more_storage:
@ ...
When you hit build (or debug, which triggers a build) the toolchain will figure out how to put all the various bits in the right places, and you can use the labels as values in your program to make sure you’re reading and writing to the right locations.
If you’re interested in seeing how it’s done, you can look at your project’s linker script, located in your project folder at
lib/link.ld
The Task#
For this task, we’re going to move "COPE"
from text memory to data memory.
Copy the following code block into your main.S
file.
.syntax unified
.global main
.type main, %function
main:
@ load the address for "COPE" into r0
@ load "C" into r1
@ load "O" into r2
@ load "P" into r3
@ load "E" into r4
@ modify your bit shifting solution to combine the individual characters and turn "COPE" -> "HOPS"
@ store "HOPS" back into memory at 'hops'
@ infinite catch loop
inf_loop:
nop
b inf_loop
.data
@ load COPE from here, don't forget that each ASCII character here is taking up 32 bits (4 bytes) of memory because we are storing them using .word
cope:
.word 0x43 @ "C"
.word 0x4f @ "O"
.word 0x50 @ "P"
.word 0x45 @ "E"
@ store HOPS here once you have finished
hops:
.word 0
Comparing the code above with the code from task 2, you’ll notice that we
have added the .data
directive and moved cope
there. By doing so, we
are now storing "COPE"
in data memory instead of text memory.
This also has implications on how we are loading the value from memory, previously we were able to load from text memory using the label directly without loading the address first. This time we aren’t able to do that, instead we have to first get the address of the cope label first. We can do this by using the following syntax:
ldr r0, =cope @ load the address of 'cope' into `r0`.
Once we have the address of the label, we can use
immediate loads
to get the individual bytes of "COPE"
into the registers. Then we can use the bit vector
techniques we learned in task 1 to alter the characters and combine them into a single
register.
To verify that your implementation is correct, you need to look at what’s stored at the
hops
address in the memory viewer. To do this, you need to get the address of hops
;
you can do this by hovering over the hops
label while running the program, or by
loading the address of hops
into a register and reading the register.
Why do you think we didn’t have to first get a memory address when loading the cope
data in task 1? If you’re curious, you can have a look at
this section of the cheat sheet
We have split apart the individual bytes for "COPE"
and are storing them using
the .word
directive. This means that each character is taking up 32 bits (4 bytes)
of space. What implications does this have for you? Hint: have a look at the cope
address in the memory viewer if you get stuck.
Complete your program such that the main
function loads “COPE” from memory,
performs the same "COPE"
-> "HOPE"
-> "HOPS"
transformation and stores the
result back into memory at the address labelled hops
. The result should be
stored as a single 32-bit value.
Verify that your code works by using the memory viewer, then copy the code into
tasks/task-2.S
.
Commit and push your changes with the message “completed task 4”.
Task 3: Records#
For the following 3 exercises we will be creating a basic inventory system for the Thunder Muffin paper company. The company only stocks 1 item at the moment, but has plans to increase that.
As such it has asked that you create an inventory system with the following information:
- Product Identification Number (10 digits, starting with a 1)
- Stock level (a value between 0 and 100)
- Restock level (a value between 0 and 100)
- Customer order count (a value between 0 and 9999)
This information will be organised into a record data structure. Data structures are, as the name suggests, used to structure data in a way that is useful for the program, but assembly has the additional challenge that we are working directly with memory. For each piece of data we want to access, we need to be able to compute its memory address.
For records, we have a base address, as well as a static offset associated to each element. Then, to access an element, we just add its associated offset to the base to get its memory address, then load from that address. In this task, we will use a record that looks like this:
Records can contain elements of different size. E.g. you can store a 1-byte character at offset 0, a 4-byte integer at offset 1 and a 24-byte string at offset 5. But the elements must be of fixed size, i.e. cannot grow during the program’s execution. Look at the assembly data structures lecture for more info.
The Task#
The company wants you to initialize their first product in the inventory system, they have provided you with the following information:
- Product Identification Number (PIN):
1234567890
- Customer order count:
2008
- Stock level:
53
- Restock level:
50
Copy the following code into your main.S
file:
.syntax unified
.global main
.type main, %function
main:
@ code to initialize plain_a4 goes here
@ infinite catch loop
inf_loop:
nop
b inf_loop
.data
plain_a4:
.word 0, 0, 0, 0
This line:
.word 0, 0, 0, 0
Is equivalent to the following, which you saw in the last task:
.word 0
.word 0
.word 0
.word 0
The difference being it is much more compact. It is okay to use either in your code, and you should use whichever style, or combination of them, that makes sense.
The plain_a4
record has been initialized with 4 0 values. The record is expected to appear
in the order of:
- Product Identification Number (PIN)
- Customer order count
- Stock level
- Restock level
The load and store section of the
cheatsheet will be useful to you. It is also worth noting that you can use
ldr r0, =SOME_NUMBER_HERE
to put a number of any size into a register
- eg:
ldr r1, =5403035
As in Task 4, you can verify your implementation works correctly by looking at the
data at the memory location pointed to by plain_a4
in the memory viewer. Follow
the same method as there to find the value of the label’s address.
Remember that the syntax for ldr is ldr rd, [ra]
not ldr rd, ra
! If you forget
the square brackets around the second register, it’ll give you a very perplexing error
message (something about a T32_OFFSET_IMM
)!
Complete your program such that the main
function initializes the plain_a4
record
with the information provided above. Verify that your code works
by using the memory viewer, then copy the code into tasks/task-3.S
.
Commit and push your changes with the message “completed task 3”.
Task 4: Customer Orders#
Now that you have the record initialized correctly, its time to start adding some functionality to the inventory system. Here is the first action to implement
- customer order:
- Customer order count := Customer order count + 1
- Stock level := Stock level - 1
Write a series of assembly instructions to perform this action after you have initialized plain_a4
.
At this point, the inventory system only has one action, so the only way for
the code to proceed is to keep performing that action. We can do that with a
branch instruction: b
.
This instruction tells your microbit to “branch” (sometimes called a jump on other
CPU types2) to a different part of the code. You can specify the “destination” of
the branch in a bunch of different ways, including using a label, or a constant
value (if you know exactly what address you want to go to ahead of time) or even
the address in a register. If you’ve wondered how to get your program to do
something other than just keep following the instructions from top to bottom,
branching is the answer.
Add a label and a branch instruction to modify your program so that the inventory system keeps processing customer orders (one after the other) indefinitely.
Hit the continue (play) button in the debug toolbar and let the program run for a while, pausing every now and again to check the inventory system values—what do you notice?
If you’re feeling stuck, we can provide the following hint for how you can:
- Complete the “Customer order count := Customer order count + 1” step
- Repeat the entire operation by branching back to a label
Click here for a hint about how you can approach this task.
@ this goes below your initialisation code
customer_order_loop:
@ increment customer order count
ldr r0, =plain_a4
ldr r1, [r0, 4]
add r1, 1
str r1, [r0, 4]
@ decrement stock level
@ ... you'll have to write this part yourself! ...
@ branch back to the "customer_order_loop" label
b customer_order_loop
Complete your program such that the main
function initializes the plain_a4
record
and then performs customer orders indefinitely. Verify that your code works
by using the memory viewer, then copy the code into tasks/task-4.S
.
Commit and push your changes with the message “completed task 4”.
Task 5: Status Flags and Condition Codes#
One thing you may have noticed in the previous task is that the stock level continually decreases below zero.
How can you deal with this problem? The answer lies is in the program status
register in every ARMv7 CPU (including our little microbit). You can see it in the
Variables viewlet in VSCode under xPSR
. Hover over the “xPSR” text to expand the
program status register, as shown below.
Remember we covered the status flags in the first half of the course. Some of these flag values should look familiar to you, as they were a part of the QuAC ISA.
When the microbit executes any instruction with an s
suffix (e.g. adds
) it
updates these status flags according to the result of the operation. That’s all
the s
does—add
and adds
will leave the exact same result in the
destination register, but adds
will update the flags to leave some
“breadcrumbs” about the result (which can be helpful, as you’ll soon see).
In addition to this, if you look at the Tests section of the cheat
sheet then
you can see that there are some instructions specifically used to update the
flags without changing the values in the general purpose registers
(r0
- r12
). For example, cmp r0, 10
is the same as subs r0, 10
except
that the value in r0 is left untouched.
Sometimes the status flags are called status bits, or condition flags, or condition codes, or some other combination of those words. They all refer to the same thing—the bits in the program status register.
Write a series of simple programs (e.g. mov
some values into registers, then
do an arithmetic operation on those registers) to set
- (a) the negative flag bit
- (b) the zero flag bit
- (c) the carry flag bit and
- (d) the overflow flag bit.
Copy the following into your main.S
file, but make sure you saved your previous task first
as we will be coming back to that after this.
.syntax unified
.global main
.type main, %function
main:
@ set the negative flag
@ ... your instruction(s) go here ...
@ set the zero flag
@ ... your instruction(s) go here ...
@ set the carry flag
@ ... your instruction(s) go here ...
@ set the overflow flag
@ ... your instruction(s) go here ...
@ infinite catch loop
inf_loop:
nop
b inf_loop
If you’re getting bored of stepping through every instruction, don’t forget you
can set breakpoints, these control exactly where your debugger will pause after
clicking ‘continue’ (the green button). You can do this by clicking in the
left-hand “gutter” (or margin) of the code view. You should see a little red
dot appear:
Once your program is working and setting the correct flags, copy the code into
tasks/task-5.S
. Commit and push your changes with the message “completed task 5”.
If you're feeling stuck, click here for a sample solution.
mov r0, 0
mov r1, 1
mov r2, 0x7fffffff
subs r3, r0, r1 @ set the negative flag
adds r0, r0 @ set the zero flag
adds r4, r3, 2 @ set the carry flag
adds r5, r2, r2 @ set the overflow flag
Note that the last instruction sets both the negative and overflow flags. As an extension: Can you think of a way to set just the overflow flag?
It might seem like this carry/overflow stuff isn’t worth worrying about because it’ll never happen in real life. But that’s not true. It can cause serious problems, like literally causing rockets to explode. So understanding and checking the status flags really matters :)
Task 6: Restocking Items#
Now that you have a grasp on the condition codes, its time to add another action to our inventory system.
- restock product:
- Stock level := 100
The catch here is that we don’t always want to restock an item. We only want to restock it
if the stock level is <=
to the restock level.
Bring your code from Task 4 back into main.S
and modify it to perform the following:
- Initialize plain_a4
- Check the stock level
- Restock if stock level is
<=
to the restock level - Process a customer order (like you did in Task 4)
- Repeat step 2
To achieve this, you will have to combine your knowledge of condition codes and branching. In short you are going to need to set up a conditional branch that performs a restock when needed.
A general if/else statement using conditional branches looks like this:
@ ... some code
cmp r0, r1
beq if_cond
b else_cond
if_cond:
@ code here is executed if cmp r0, r1 set the Z flag
b end_if
else_cond:
@ code here is executed if cmp r0, r1 did NOT set the Z flag
end_if:
@ resume execution after if / else section
If there is only an if
and no else then it would look like:
@ ... some code
cmp r0, r1
beq if_cond
b end_if
if_cond:
@ code here is executed if cmp r0, r1 set the Z flag
end_if:
@ resume execution after if / else section
The labels in the above examples are chosen for demonstration purposes, you can label them with whatever you believe makes sense. Remember, labels need to be unique!
Also, you now also have the tools to easily design a “while” loop, which repeats the loop until a certain condition stops holding, using branches and conditional branches. See the assembly control flow lectures for more details.
Complete your program to the above spec. Verify that your code works by stepping through it
and using the memory viewer, then copy the code into tasks/task-6.S
.
The example if/else above used the suffix eq
to indicate that we should branch
if two registers are equal. Which
condition suffix will
let you branch when the value of a register is Less than or Equal to
another?
Commit and push your changes with the message “completed task 6”.
Task 7: Loops And Arrays#
In the previous tasks, you loaded values from memory, modified them and then stored them back. For this exercise, we are going to be doing this across a series of elements (an array) using a loop.
An array is another common data structure. Unlike a record, which can hold items of different sizes
and types, an array holds values of the same type/size. However, in exchange, an array can be a variable
length of items (whereas records are a fixed size). Arrays in assembly have a base address, and you access
elements at a particular index (e.g. the i
th element) by accessing the memory location base + i * size_of_elements
.
See the data structures lecture for more info.
In your standard, high level programming languages. If you were asked to add 1 an array of elements, you may end up with something like this:
int[] elements = {1, 2, 5, 6, 3};
for (int i = 0; i < elements.length; i++) {
elements[i] = elements[i] + 1;
}
In assembly we can do something similar, here is some starter code, copy it into your main.S
file:
.syntax unified
.global main
.type main, %function
main:
@ initialize your loop variables
@ start your loop
add_1_loop:
@ check if you no longer meet your loop condition, exit if you don't, continue if you do
@ load your element, add 1, store it back
@ increase your loop variables
@ begin the loop again
@ finish the loop
end_add_1_loop:
nop
@ infinite catch loop
inf_loop:
nop
b inf_loop
.data
elements:
.word 1, 2, 5, 6, 3
elements_end:
.word 0 @ This is just here as a place holder, if you had more values in memory then it
@ wouldn't be necessary
Loop variables#
To perform a loop in assembly, we need a loop condition. For this example we’re going to be
doing a basic for
loop over the values in an array, so we will need the following:
- A way to keep track of how many iterations we’ve done
- A way to know how many iterations we need to do
For 1., we can initialize a register, say r2
, to 0 and then add to it every loop iteration.
But 2. is where things get a bit more tricky.
Length of an array#
In assembly, we don’t have access to things like .length
or len()
to determine
what the length of an array of elements is, it is up to us to determine this.
We can do this in a few different ways:
- Zero termination
- The array ends with a 0 value, so when iterating over the array, we check if the value is 0, if it isn’t then we continue. If it is 0, then we know we have reached the end of the array. This has the limitation of not allowing a 0 value in the array contents.
- Known value termination
- Similar to zero termination, this uses some known specific value to determine the end of the array. So like with zero termination, we’re checking for a specific value that will let us know that the array has ended, but again this means that the value can’t be a valid value of the array. This works well when we have a limited range of accepted values in the array.
- Explicit length value
- The length of the array is explicitly defined ahead of time. This could be at a different label, or the first element of the array. If doing this then be careful to be consistent with where the length is stored, and when adding elements to the array the length also needs to be updated.
- Memory length calculation
- If you know the starting address of an array and the ending address of an array, then you can calculate how many memory addresses, and therefore the length, of the array.
For this task, we’re going to use “memory length calculation” to determine the length of the array. In assembly we can do this by:
- Loading the address of the start of the array
- Loading the address of the end of the array
- subtracting the start address from the end address
- optional: divide the result by the size of an element in the array (in bytes)
When we’re dealing with memory address calculations our results will be in # of addresses, where every memory address holds a byte (8 bits) of information. If your elements are longer than this, here they are 4 bytes (32 bits), then the result from subtracting the start address from the end address will not be the amount of elements, but instead the number of memory addresses the elements use.
In this example, we will get a result of 20 for the length of the array (5 elements, 4 bytes each, 5*4 = 20). So if we want to get the number of elements, then we’d need to divide our result (20) by 4 (the amount of bytes each element uses).
Why may we not want to divide the result to get the amount of elements?
Generally, the following instruction(s) will be of interest to you:
Complete your program such that the add_1_loop
function uses a loop to increment
all of the values in the elements
array by 1. Verify that your code works
by using the memory viewer, then copy the code into tasks/task-7.S
.
Commit and push your changes with the message “completed task 7”.
At this point you’ve completed the main part of this weeks lab — good job!
Extension Exercises#
These exercises are less concerned with teaching you about coding in ARM assembly and more concerned with teaching you about how instructions in ARM assembly are translated to machine code.
While an interesting topic in its own right, these exercises will not directly prepare you for completing the rest of the labs or your second assignment. As a result we recommend that if you do not finish these tasks by the end of your week 7 lab you skip them (for now) and move onto lab 8.
Task 8 (Extension): Reverse Engineering#
Remember the 2 + 2 task
from the midsem lab? In this exercise we’re going to have a look under the hood
at the program you wrote and leave no bit unturned. Copy your 2 + 2 program into
the main.S
file. Once you’ve done that, start a debugging session and wait until you
get to main
, then leave it “hanging”—do not execute any further.
If you don't have the program from the previous lab then you can use this example one.
.syntax unified
.global main
.type main, %function
main:
mov r1, 2
add r1, 2
end_check:
nop
end_loop:
nop
b end_loop
Do you know where each of your “number 2”s are represented in your program when it’s actually running on your microbit? This is not dissimilar to what you were seeing in the first half of the course.
To do that we need to be able to view the memory in microbit. In VSCode you can look at your memory directly using the Memory View command.
There are actually two versions of Memory View in the COMP2300 extension: Cortex Debug: View Memory
and Cortex Debug: View Memory (legacy)
. The difference is that the legacy version opens a new tab for memory view while the normal version uses a panel at the bottom of the screen. The COMP2300 extension has wrappers for these with more descriptive names. Some students have found that the legacy version is more robust (the normal one sometimes doesn’t update immediately after stepping past store instructions) so you may prefer that.
This might look overwhelming, but the 2D “grid” layout is pretty simple: the blue hex numbers down the left hand side are the base memory addresses, and the blue hex numbers along the top represent the “offset” of that particular byte from the base address. The white hex numbers are the value at the corresponding address.
Eagle eyed students may have also noticed the green values to the right, what do you think these could be?
Spoiler
It’s our memory values interpreted as ASCII.
You can definitely find the bytes in memory which correspond to the
instructions you wrote in your main.S
file in this view. The trick is figuring
out where to look—what should the starting address be?
Even your humble microbit has a lot of addressable memory. Discuss with your
lab neighbour—where should you look to find your program? How did your QuAC
CPU know where the instruction you were executing was stored in memory?
If you get stuck, here’s a trick which is often helpful in reverse engineering things—try and put some known values into the program so that we know what we’re looking for when we’re staring at the memory view.
There’s an assembler directive called
.hword
which you
can use to put 16-bit numbers into your program (“hword” is short for half-word,
and comes from the fact that your microbit’s CPU uses 32-bit “words”).
Modify your program to use the .hword
directive to put some data into your
program.
.syntax unified
.global main
.type main, %function
main:
nop
.hword 0xdead
.hword 0xbeef
b main
.size main, .-main
Build & upload your program and open a memory view and go to the address of your
first instruction (or a bit before—as long as the number of bytes you display
spans over the memory region you’re interested in). You might need to re-size
the different viewlets to make sure you can see all the columns of the memory
view. Can you see the 0xdead
and 0xbeef
values you put into your program? If
you can’t see them exactly, can you see something which looks suspiciously
like them? What do you notice?
Endianness#
To make sense of the numbers displayed in the memory view, we need to talk about endianness.
One of the key differences between the QuAC CPU you built and ARM is that your
registers are 32 bits. Memory on ARM, however, is still byte addressable, i.e. stored as
individual bytes (i.e. 8-bit numbers, which can be represented with two hex digits).
Memory addresses refer to individual bytes, but the ldr
instruction will load 4 bytes
at a time since registers can hold 4 bytes.
Endianness refers to the order in which these small 8-bit bytes are arranged into larger numbers (e.g. 32-bit words, the size your registers hold). In the little-endian format, the byte stored at the lowest address is the least significant byte; where as in big-endian format, the byte stored at the lowest address is the most significant byte.
Here’s an example: suppose we have the number 0x01
stored at a lower memory
address (e.g. 0x000001e0
), and the number 0xF1
stored at a higher memory
address (0x000001e1
), as shown below:
Under the little-endian format, when these two 8-bit numbers are read as one
16-bit half-word number by the CPU, it represents 0xF101
(the 0x01
at the
lower address is treated as less significant). And under the big-endian
format, it represents 0x01F1
(the 0x01
at the lower address is treated as
more significant).
When reading four bytes from the memory, the CPU can read them as two 16-bit half-words, or as one 32-bit word. The endianness format applies everytime when combining bytes into bigger words. The following diagram illustrates this using the little-endian format:
You need to be aware of this byte ordering to make sense of the memory view. It might be painful and confusing at the start, but you’ll get used to it.
According to Wikipedia, Danny Cohen introduced the terms Little-Endian and Big-Endian for byte ordering in an article from 1980. In this technical and political examination of byte ordering issues, the “endian” names were drawn from Jonathan Swift’s 1726 satire, Gulliver’s Travels, in which civil war erupts over whether the big end or the little end of a boiled egg is the proper end to crack open, which is analogous to counting from the end that contains the most significant bit or the least significant bit.
Instruction Encoding(s)#
Now that you know how bytes fit together into words, let’s get back to the task of figuring out how the instructions in your program are encoded in memory.
To help, you can use the known half-words you put into your program earlier to help you out. Update your program like so to add a single instruction (i.e. one line of assembly code) from the your 2+2 program you wrote in Task 1.
.syntax unified
.global main
.type main, %function
main:
nop
.hword 0xdead
@ put a single "real" assembly instruction here from your 2+2 program
.hword 0xbeef
inf_loop:
nop
b inf_loop
Build, upload and start a new debug session (but don’t run it, just leave it paused) then find your program again in the memory view. What does your instruction look like in memory? If it helps, try making a note of the bytes, then modify the instruction slightly (e.g. change the number, or the register you’re using) and see how the bytes change in memory (you’ll need to stop the current debugging session and start a new one to view the changes in the memory).
Thinking back to the QuAC instruction encodings of the first half of the course, can you figure out what the different bits (and bytes) in the instruction encoding mean? If you’ve figured that out, why doesn’t your program actually work? If you’ve got questions, either ask your lab neighbour what they think or ask a tutor.
To fully make sense of these instruction encodings you need more than just your cheat sheet, you need the ARM®v7-M Architecture Reference Manual. You need to dig to the deepest levels, by going to section A7.7 Alphabetical list of ARMv7-M Thumb instructions (page A7-184). Use the bookmarks in your pdf viewer to navigate to the relevant instructions inside this huge 400 page section.
For each instruction you will see a number of encodings. They detail bit-by-bit the different ways of specifying the machine instructions that your microbit CPU understands. You may also find the number format conversion tool helpful.
Put a comment in your “reverse engineering” program about what your instruction
looks like in memory (it doesn’t actually have to run at this stage),
copy the code into tasks/task-8.S
. Commit and push your changes with the message “completed task 8”.
If you are curious about the encoding format used in ARM, you can read more about it here.
Task 9 (Extension): Hand-crafted, Artisanal Organic Instructions#
Now that you’ve identified the spot in memory where your instructions live, in this exercise we turn our approach around and program the CPU by writing specific numbers directly to memory locations.
This is very similar to what you did in the first half of the course when encoding the QuAC instructions. The main difference here is that you might have multiple options presented to you for each of the instructions.
Now, instead of calculating 2+2
, you are going to make the CPU calculate 3-1
by putting the right numbers into memory.
In fact, you have been doing this all along, except that the assembler has helped you in calculating those numbers. Replace your program with the following assembly code:
.syntax unified
.global main
.type main, %function
main:
nop
.hword 0xffff
.hword 0xffff
inf_loop:
nop
b inf_loop
only this time instead of the 0xffff
s you need to figure out the actual values
which will make the CPU load 3
into r1
and subtract 1
from it. Remember
that it’ll be similar to the words you looked at in the memory viewer earlier,
but some of the bits will be different (since we’re dealing with -
, 3
and
1
instead of +
, 2
and 2
).
For help with the instruction encodings, you can look at the same location in the reference manual as the previous exercise.
When running your hand-crafted program in the debugger, it may not be able to stop on
the hand-crafted instructions. However, you’ll be able to verify the result in r1
is
correct once your program reaches the b main
instruction. You may need to place a
breakpoint on the b main
instruction for this.
Another useful feature of the debugger is that it can disassemble code for
you! The disassembler view can be opened by typing Open Disassembly View
in
the command palette during an active debug session (so make sure you start a
debug session before doing this). It will then ask for which function to
disassemble, type the function name (e.g. main
). It will look something like
this:
Bring up the disassembler view for main
—you’re now looking at the program as
it will be understood by the CPU. You can even do step debugging in disassembly
view; it’s automatic based on which source tab you have active3.
Looking at the disassembled code (i.e. the way the CPU will interpret the
instructions) did you see what you intended? Will your new, hand-assembled
program show the correct result in r1
after it has been run? If it doesn’t,
what might have gone wrong?
You can now cheer as loud as your upbringing allows that you will never again have to hand-craft the bits needed to instruct the CPU to do this, and you can leave this job to the assembler from now on.
As a side effect, you also learnt something about security: your system can be compromised by injecting some data into memory (an array of numbers, a string or anything which the host system would accept) and making the CPU somehow stumble into executing it. Classically this is done in form of buffer overruns in C programs (very few other languages allow that sort of thing) where you can make your program write data into places where the executable code lives… and bingo!
Once your hand-crafted assembly instruction code works,
copy the code into tasks/task-9.S
. Commit and push your changes with the message “completed task 9”.
Extension Task 10: Further Understanding#
Starting to Debug#
What’s all that code that appears when you start debugging? The board actually
needs to set up a few things before it can run your code. Once it’s done, it
starts executing the main
function which you have defined.
Which Instruction?#
How might you figure out on your microbit’s Cortex-M4 CPU whether a 32-bit instruction is read as one 32-bit word or as two 16-bit half-words? (hint: have a look at A5.1 in the ARM®v7-M Architecture Reference Manual).
Which Encoding?#
Can you tell which specific encoding has been used? Note that not every encoding can express every version of the instruction, but sometimes a more complex encoding can also express what the a simpler form could have done as well. Can you hint the assembler towards the specific encoding you want?
Arbitrary Code Execution#
If you’re interested in Arbitrary Code Execution, it is often used in speed runs of video games to create possibilities that wouldn’t exist normally, such as allowing the player to skip entire sections of the game etc. Here is a video explaining how ACE works in Zelda: Ocarina of Time.
How Do Those Numbers Fit?#
You might have used one (or more) large numbers in your bit manipulation adventures. How many bits is that number? Do you think it will fit into the instruction encoding? Have a look at the disassembly, and cross check it with the instruction manual. Can you make sense of what’s happening here? (Hint: this blog post might be helpful to you. It uses a different instruction set, but the idea is the same). Give it a crack, but we will revisit this again later in the course.