To understand functions in ARM we first need to get familiar with the structural parts of a function, which are:
- Prologue
- Body
- Epilogue
The purpose of the prologue is to save the previous state of the program (by storing the values of LR and R11 on the Stack) and to set up the Stack for the local variables of the function. The exact implementation of the prologue depends on the compiler used, but generally it is done with PUSH/ADD/SUB instructions. An example of a prologue looks like this:
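Function prologue, step by step:
(1) Save the previous state: push R11 and LR onto the Stack (the push instruction below).
(2) Set up the Frame Pointer (FP, R11). The example below uses fp = sp + 0. Compilers such as GCC commonly use fp = sp + 4 instead: the push has already moved SP down by two words, so sp + 4 points at the saved LR.
(3) Set up SP. The push has already moved it by two words, so only the remaining space the function needs is subtracted.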
push {r11, lr} /* Start of the prologue. Saving Frame Pointer and LR onto the stack */
add r11, sp, #0 /* Setting up the bottom of the stack frame */
sub sp, sp, #16 /* End of the prologue. Allocating some buffer on the stack. This also allocates space for the Stack Frame */
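To make the effect of these three instructions concrete, here is a minimal sketch in Python that models the registers and the Stack as dictionaries. It is a toy model, not an emulator: the starting values of sp, r11 and lr are made up, and the helper name push is ours.

```python
# Toy model: registers are a dict, the stack is a dict of address -> 32-bit word.
# The starting values below are assumptions chosen only for illustration.
regs = {"sp": 0x1000, "r11": 0xAAAA0000, "lr": 0x00010400}
memory = {}


def push(names):
    """Model PUSH {...} on a full-descending stack.

    SP drops by 4 for every register, and the lowest-numbered register
    ends up at the lowest address.
    """
    for name in reversed(names):  # names are listed lowest register first
        regs["sp"] -= 4
        memory[regs["sp"]] = regs[name]


push(["r11", "lr"])            # push {r11, lr}
regs["r11"] = regs["sp"] + 0   # add  r11, sp, #0
regs["sp"] = regs["sp"] - 16   # sub  sp, sp, #16

print("r11 =", hex(regs["r11"]), "sp =", hex(regs["sp"]))
```

In this model the saved R11 sits at the address R11 now points to, the saved LR sits one word above it, and the 16 bytes between the new SP and R11 are the space reserved for local variables.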
The body part of the function is usually responsible for some kind of unique and specific task. This part of the function may contain various instructions, branches (jumps) to other functions, etc. An example of a body section of a function can be as simple as the following few instructions:
mov r0, #1 /* setting up local variables (a=1). This also serves as setting up the first parameter for the function max */
mov r1, #2 /* setting up local variables (b=2). This also serves as setting up the second parameter for the function max */
bl max /* Calling/branching to function max */
The sample code above shows a snippet of a function that sets up local variables and then branches to another function. It also shows that the parameters of a function (here, max) are passed via registers: the first four arguments go in R0 to R3. When there are more than four parameters, the remaining ones are passed on the Stack. The result of a function is returned in R0, so whatever max returns can be picked up from R0 right after the call returns. In some situations the result is 64 bits long and does not fit in a 32-bit register; in that case R0 and R1 together hold the 64-bit result.
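In short: up to four input parameters are passed in registers, any parameters beyond the fourth are passed on the Stack, and the return value comes back in R0.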
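The rule above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: every argument is a plain 32-bit integer, the Stack is modelled as a list, the machine is little-endian (so R0 holds the low word of a 64-bit result), and the function names are hypothetical.

```python
def place_arguments(args):
    """Split word-sized arguments between registers and the stack.

    The first four go in r0-r3; anything beyond that is passed on the
    stack, which is modelled here as a plain list.
    """
    registers = {f"r{i}": value for i, value in enumerate(args[:4])}
    stack_args = list(args[4:])
    return registers, stack_args


def split_return_value(value):
    """Split a 64-bit result across r0 (low word) and r1 (high word).

    Assumes a little-endian target.
    """
    return {"r0": value & 0xFFFFFFFF, "r1": (value >> 32) & 0xFFFFFFFF}


registers, stack_args = place_arguments([1, 2, 3, 4, 5, 6])
print(registers, stack_args)
print(split_return_value(0x1122334455667788))
```

With six arguments, the first four land in r0 to r3 and the last two are left for the Stack.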
The last part of the function, the epilogue, is used to restore the program’s state to its initial one (before the function call) so that it can continue from where it left off. For that we need to readjust the Stack Pointer. This is done by using the Frame Pointer register (R11) as a reference and performing an add or sub operation. Once the Stack Pointer is readjusted, we restore the register values saved in the prologue by popping them from the Stack into their respective registers. Depending on the function type, the POP instruction might be the final instruction of the epilogue. Alternatively, after restoring the register values, a BX instruction may be used to leave the function. An example of an epilogue looks like this:
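Function epilogue, restoring the initial state:
(1) Reset SP using R11 (FP) as the reference. The example below uses sp = r11 - 0; when the prologue set fp = sp + 4, the matching epilogue uses sp = r11 - 4.
(2) Restore the saved R11 and LR values from the Stack into R11 and PC.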
sub sp, r11, #0 /* Start of the epilogue. Readjusting the Stack Pointer */
pop {r11, pc} /* End of the epilogue. Restoring Frame Pointer from the Stack, jumping to previously saved LR via direct load into PC. The Stack Frame of a function is finally destroyed at this step. */
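The following Python sketch runs a prologue and the epilogue above back to back on a toy register/Stack model, to show that the epilogue undoes exactly what the prologue did. The starting register values, the placeholder value written to lr in the "body", and the helper names push and pop are assumptions made for illustration.

```python
# Toy model: registers are a dict, the stack is a dict of address -> word.
regs = {"sp": 0x1000, "r11": 0xAAAA0000, "lr": 0x00010400, "pc": 0}
memory = {}


def push(names):
    """Model PUSH {...}: lowest-numbered register at the lowest address."""
    for name in reversed(names):
        regs["sp"] -= 4
        memory[regs["sp"]] = regs[name]


def pop(names):
    """Model POP {...}: load registers from ascending addresses."""
    for name in names:
        regs[name] = memory[regs["sp"]]
        regs["sp"] += 4


entry_sp = regs["sp"]

# Prologue
push(["r11", "lr"])            # push {r11, lr}
regs["r11"] = regs["sp"] + 0   # add  r11, sp, #0
regs["sp"] -= 16               # sub  sp, sp, #16

# Body: a call made here would overwrite lr (placeholder value).
regs["lr"] = 0xDEADBEEF

# Epilogue
regs["sp"] = regs["r11"] - 0   # sub  sp, r11, #0
pop(["r11", "pc"])             # pop  {r11, pc}

print("sp =", hex(regs["sp"]), "r11 =", hex(regs["r11"]), "pc =", hex(regs["pc"]))
```

After the epilogue, SP is back at its value on entry, R11 holds the caller's Frame Pointer again, and PC holds the LR value that was saved on entry, even though LR itself was overwritten in between.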
So now we know that:
- Prologue sets up the environment for the function;
- Body implements the function’s logic and stores the result in R0;
- Epilogue restores the state so that the program can resume from where it left off before calling the function.
Another key point to know about functions is their types: leaf and non-leaf. A leaf function is a function that does not call/branch to another function from itself. A non-leaf function is a function that, in addition to its own logic, does call/branch to another function. The implementations of these two kinds of functions are similar, but they have some differences. To analyze these differences we will use the following piece of code:
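To restate the key point: functions are either leaf or non-leaf. A leaf function calls no other function; a non-leaf function goes on to call other functions.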
/* azeria@labs:~$ as func.s -o func.o && gcc func.o -o func && gdb func */
.global main

main:
    push {r11, lr} /* Start of the prologue. Saving Frame Pointer and LR onto the stack */
    add r11, sp, #0 /* Setting up the bottom of the stack frame */
    sub sp, sp, #16 /* End of the prologue. Allocating some buffer on the stack */
    mov r0, #1 /* setting up local variables (a=1). This also serves as setting up the first parameter for the max function */
    mov r1, #2 /* setting up local variables (b=2). This also serves as setting up the second parameter for the max function */
    bl max /* Calling/branching to function max */
    sub sp, r11, #0 /* Start of the epilogue. Readjusting the Stack Pointer */
    pop {r11, pc} /* End of the epilogue. Restoring Frame pointer from the stack, jumping to previously saved LR via direct load into PC */

max:
    push {r11} /* Start of the prologue. Saving Frame Pointer onto the stack */
    add r11, sp, #0 /* Setting up the bottom of the stack frame */
    sub sp, sp, #12 /* End of the prologue. Allocating some buffer on the stack */
    cmp r0, r1 /* Implementation of if(a<b) */
    movlt r0, r1 /* if r0 was lower than r1, store r1 into r0 */
    add sp, r11, #0 /* Start of the epilogue. Readjusting the Stack Pointer */
    pop {r11} /* restoring frame pointer */
    bx lr /* End of the epilogue. Jumping back to main via LR register */
The example above contains two functions: main, which is a non-leaf function, and max, which is a leaf function. As mentioned before, a non-leaf function calls/branches to another function, which is true in our case because main branches to max. The function max does not branch to another function within its body, which makes it a leaf function.
Another key difference is the way the prologues and epilogues are implemented. The following example compares the prologues of a non-leaf and a leaf function. The main difference is that the first instruction of the non-leaf prologue saves more registers onto the stack. The reason is that, by the nature of a non-leaf function, LR gets modified during its execution, so the value of this register needs to be preserved so that it can be restored later. Generally, the prologue can save even more registers if necessary.
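Prologue summary: a non-leaf function calls other functions, which changes LR, so its prologue must push R11 and LR onto the Stack together. A leaf function calls nothing else, so LR stays unchanged and does not need to be pushed.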
/* A prologue of a non-leaf function */
push {r11, lr} /* Start of the prologue. Saving Frame Pointer and LR onto the stack */
add r11, sp, #0 /* Setting up the bottom of the stack frame */
sub sp, sp, #16 /* End of the prologue. Allocating some buffer on the stack */

/* A prologue of a leaf function */
push {r11} /* Start of the prologue. Saving Frame Pointer onto the stack */
add r11, sp, #0 /* Setting up the bottom of the stack frame */
sub sp, sp, #12 /* End of the prologue. Allocating some buffer on the stack */
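Why the non-leaf function has to save LR can be seen in a small Python sketch of what a call does to that register. The addresses are made up, the helper bl is a hypothetical model of the BL instruction in ARM state (where every instruction is 4 bytes long), and the saved_lr variable stands in for the copy that push {r11, lr} places on the Stack.

```python
def bl(regs, bl_address, target):
    """Model BL in ARM state.

    lr receives the address of the instruction after the BL, then
    execution continues at target.
    """
    regs["lr"] = bl_address + 4
    regs["pc"] = target


# On entry to main, lr holds the return address into main's caller.
regs = {"lr": 0x00010300, "pc": 0x00010400}
saved_lr = regs["lr"]          # the copy that `push {r11, lr}` preserves

bl(regs, bl_address=0x00010414, target=0x00010420)   # main executes `bl max`
clobbered_lr = regs["lr"]      # now points back into main, not at main's caller

regs["pc"] = saved_lr          # what `pop {r11, pc}` achieves for main

print(hex(clobbered_lr), hex(regs["pc"]))
```

After the call, LR no longer holds the address main needs to return to; only the saved copy does. A leaf function never executes a BL, so its LR is still intact when it reaches bx lr.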
The comparison of the epilogues of the leaf and non-leaf functions, which we see below, shows us that the program’s flow is controlled in different ways: by branching to an address stored in the LR register in the leaf function’s case and by direct POP to PC register in the non-leaf function.
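Epilogue summary: a leaf function can simply execute bx lr and continue at the address in LR, because LR has not changed. BX stands for Branch and eXchange (between ARM and Thumb modes).
A non-leaf function must load the previously saved LR value into PC to continue execution.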
/* An epilogue of a leaf function */
add sp, r11, #0 /* Start of the epilogue. Readjusting the Stack Pointer */
pop {r11} /* restoring frame pointer */
bx lr /* End of the epilogue. Jumping back to main via LR register */

/* An epilogue of a non-leaf function */
sub sp, r11, #0 /* Start of the epilogue. Readjusting the Stack Pointer */
pop {r11, pc} /* End of the epilogue. Restoring Frame pointer from the stack, jumping to previously saved LR via direct load into PC */
Finally, it is important to understand the use of the BL and BX instructions here. In our example, we branched to a leaf function by using a BL instruction, with the label of the function as its operand. When the program is assembled and linked, the label is replaced with a memory address. Before jumping to that location, BL saves (links) the address of the next instruction in the LR register so that we can return to where we left off when the function max is finished.
In other words, when BL executes, the address of the instruction following the call is saved (linked) in the LR register.
The BX instruction, which is used to leave the leaf function, takes the LR register as its operand. As mentioned earlier, before jumping to the function max, the BL instruction saved the address of the next instruction of main in the LR register. Because a leaf function is not supposed to change the value of LR during its execution, this register can now be used to return to the parent (main) function. As explained in the previous chapter, the BX instruction can eXchange between the ARM and Thumb modes while branching. It does this by inspecting the last bit of the LR register: if the bit is 1, the CPU switches to (or stays in) Thumb mode; if it is 0, the CPU switches to (or stays in) ARM mode. This is a nice design feature that allows functions to be called from different modes.
In short, the last bit of the LR register in a BX LR instruction also selects between ARM and Thumb mode.
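The mode selection done by BX can be written as a short Python sketch. The function name bx and the example addresses are hypothetical, and the model ignores everything except the last bit of the register value.

```python
def bx(register_value):
    """Model BX <reg>: the last bit selects the mode and is not part of
    the address that execution continues at."""
    mode = "Thumb" if register_value & 1 else "ARM"
    address = register_value & ~1
    return address, mode


print(bx(0x00010514))   # last bit is 0
print(bx(0x00010515))   # last bit is 1
```

Both calls continue at the same address, 0x10514; only the mode differs. This is why a return address captured while running Thumb code carries a 1 in its last bit.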
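One last example of leaf and non-leaf functions follows. The GIF animation is long, so a GIF editing tool can be used to pause it and step through the frames.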