In the previous post, we looked at C call and Std call. Now we will discuss Fast call on both the x86 and x64 architectures.
Fast call
This is the default calling convention in x64 machines. It is sometimes used in x86 machines as well.
This calling convention uses registers to pass the first few arguments. In x86, the first two arguments are put into ECX and EDX, and the rest are pushed onto the stack from right to left. In x64, the first four arguments are put into RCX, RDX, R8 and R9, and the remaining arguments are placed on the stack from right to left.
In x86, the called routine is responsible for cleaning up the stack, typically by executing RET N. In x64, the caller allocates the stack space for arguments and releases it afterwards, typically with sub RSP, N in its prolog and add RSP, N in its epilog.
In x86, function names are decorated with an @ prefix and suffixed with @ followed by the number of bytes in the parameters, for example @func@8.
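As a rough illustration of the x86 rules above, here is a small sketch that works out the decorated name, the location of each argument, and the N in RET N for a fast call function. The helper names (fastcall_decorated_name, fastcall_x86_locations, fastcall_x86_ret_bytes) are made up for this post, and the sketch assumes that every argument is an integer or pointer that fits in one 4-byte slot.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Decorated name of an x86 fast call function:
// '@' + name + '@' + total bytes of parameters.
// Assumption: every argument occupies exactly one 4-byte slot.
std::string fastcall_decorated_name(const std::string& name, std::size_t arg_count) {
    return "@" + name + "@" + std::to_string(arg_count * 4);
}

// Location of each argument (listed left to right) on entry to the callee.
// The first two arguments travel in ECX and EDX. The rest were pushed right
// to left, so the leftmost stack argument sits just above the return address.
std::vector<std::string> fastcall_x86_locations(std::size_t arg_count) {
    std::vector<std::string> where;
    std::size_t stack_offset = 4;  // [ESP] holds the return address
    for (std::size_t i = 0; i < arg_count; ++i) {
        if (i == 0) {
            where.push_back("ECX");
        } else if (i == 1) {
            where.push_back("EDX");
        } else {
            where.push_back("[ESP+" + std::to_string(stack_offset) + "]");
            stack_offset += 4;
        }
    }
    return where;
}

// Bytes the callee pops with RET N: only the stack-passed arguments count.
std::size_t fastcall_x86_ret_bytes(std::size_t arg_count) {
    return arg_count > 2 ? (arg_count - 2) * 4 : 0;
}
```

For a hypothetical four-argument function sum, fastcall_decorated_name("sum", 4) gives @sum@16, the last two arguments are found at [ESP+4] and [ESP+8], and the callee returns with RET 8.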
The x64 ABI supports only the fast calling convention. The first four parameters are passed in RCX, RDX, R8 and R9; further parameters are passed on the stack. The x64 Application Binary Interface also requires compiler writers to reserve a stack backing slot for each argument of a function, one slot per argument. The x64 architecture provides 16 general-purpose registers as well as 16 XMM registers available for floating-point use. The following are some of the rules of the x64 calling convention:
x64 supports only the fast call convention. The first four parameters are passed in RCX, RDX, R8 and R9. Arguments of type float or double are passed in XMM0L, XMM1L, XMM2L and XMM3L instead. Aggregates (structures/classes) larger than 64 bits are passed as a pointer.
A scalar return value that fits into 64 bits is returned through RAX. Non-scalar types, including floats and doubles, are returned in XMM0.
The caller allocates space on the stack for the parameters it passes to the callee. The x64 spec also states that the caller should allocate backing space (parameter homing space) for the parameters passed through registers; the callee expects that. The actual register values may or may not be stored in the homing area: that depends on the caller and the callee’s prolog.
A function’s prolog is responsible for allocating stack space for local variables, saved registers, stack parameters, and register parameters. It also has to reserve space for the parameters that this function may pass when it calls other functions; this is called the stack params area. This is required for the debugger to rebuild the stack in the absence of a frame pointer.
The registers RAX, RCX, RDX, R8, R9, R10, R11 are considered volatile and must be considered destroyed on function calls (unless otherwise safety-provable by analysis such as whole program optimization).
The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, and R15 are considered nonvolatile and must be saved and restored by a function that uses them.
Volatile registers are scratch registers that the caller assumes to be destroyed across a call.
Nonvolatile registers are those that the callee must preserve, so that the caller's values stay alive across a function call. The callee must save and restore any nonvolatile registers that it uses.
Here’s a table summarizing their usage:
| Register | Status | Use |
|---|---|---|
| RAX | Volatile | Return value register |
| RCX | Volatile | First integer argument |
| RDX | Volatile | Second integer argument |
| R8 | Volatile | Third integer argument |
| R9 | Volatile | Fourth integer argument |
| R10:R11 | Volatile | Must be preserved as needed by caller; used in syscall/sysret instructions |
| R12:R15 | Nonvolatile | Must be preserved by callee |
| RDI | Nonvolatile | Must be preserved by callee |
| RSI | Nonvolatile | Must be preserved by callee |
| RBX | Nonvolatile | Must be preserved by callee |
| RBP | Nonvolatile | May be used as a frame pointer; must be preserved by callee |
| RSP | Nonvolatile | Stack pointer |
| XMM0 | Volatile | First FP argument |
| XMM1 | Volatile | Second FP argument |
| XMM2 | Volatile | Third FP argument |
| XMM3 | Volatile | Fourth FP argument |
| XMM4:XMM5 | Volatile | Must be preserved as needed by caller |
| XMM6:XMM15 | Nonvolatile | Must be preserved as needed by callee |
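To make the argument-passing rules concrete, here is a minimal sketch that maps a list of argument kinds to the register or stack slot each one would use. The names ArgKind and x64_arg_locations are made up for this post. The sketch assumes the Microsoft x64 behaviour in which the register is chosen by the argument's position (so a double in the second position uses XMM1, not XMM0), and it treats aggregates passed by pointer as ordinary integer arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Integer also covers pointers, including pointers to large aggregates.
enum class ArgKind { Integer, Floating };

// Location of each argument (listed left to right) on entry to the callee.
std::vector<std::string> x64_arg_locations(const std::vector<ArgKind>& args) {
    static const char* const int_regs[4] = {"RCX", "RDX", "R8", "R9"};
    static const char* const fp_regs[4] = {"XMM0", "XMM1", "XMM2", "XMM3"};
    std::vector<std::string> where;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (i < 4) {
            // The slot is chosen by position; the kind picks the register file.
            where.push_back(args[i] == ArgKind::Floating ? fp_regs[i] : int_regs[i]);
        } else {
            // [RSP] holds the return address and the next 32 bytes are the
            // homing space for the four register arguments, so the fifth
            // argument is found at RSP+40, the sixth at RSP+48, and so on.
            where.push_back("[RSP+" + std::to_string(8 + 8 * i) + "]");
        }
    }
    return where;
}
```

For a hypothetical function taking (int, double, int, double, int), this yields RCX, XMM1, R8, XMM3 and [RSP+40].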
All memory beyond the current RSP (that is, at addresses below it) is volatile, and functions should not keep data there. The function prolog allocates space on the stack for local variables, saved registers, stack-based parameters and the backing store for register parameters. The number of parameter slots allocated is 4, or the maximum number required by any function call made within this function, whichever is larger. For C++, the this pointer is always passed through RCX.
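The slot rule above can be sketched as a small calculation. The function name outgoing_param_area_bytes is made up for this post, and the sketch assumes 8-byte slots and that a leaf function (one that calls nothing) reserves no parameter area at all. It ignores locals, saved registers and stack alignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kSlotBytes = 8;  // one stack slot per parameter on x64
constexpr std::size_t kMinSlots = 4;   // homing space for RCX, RDX, R8, R9

// Bytes a function reserves for the parameters of the functions it calls.
// callee_arg_counts holds the argument count of every call made in the body.
std::size_t outgoing_param_area_bytes(const std::vector<std::size_t>& callee_arg_counts) {
    if (callee_arg_counts.empty()) {
        return 0;  // assumption: a leaf function needs no parameter area
    }
    const std::size_t max_args =
        *std::max_element(callee_arg_counts.begin(), callee_arg_counts.end());
    // At least four slots, even if every callee takes fewer arguments.
    return std::max(kMinSlots, max_args) * kSlotBytes;
}
```

A function that calls one two-argument routine and one six-argument routine would reserve 6 slots, or 48 bytes, while a function that only calls one-argument routines still reserves the minimum of 32 bytes.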
This call ( __thiscall )
This is the default calling convention for C++ member functions. Class member functions need a mechanism to know the “this” pointer at any point in time. This convention makes sure that when a member function is called, the this pointer is passed implicitly on the programmer's behalf.
The this pointer is passed in the ECX register. (A variation of this is COMCALL, in which the “this” pointer is passed on the stack.) The remaining arguments are passed on the stack.
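Conceptually, the implicit this pointer behaves like a hidden first argument. The sketch below shows a member function next to a hand-written free function that takes the object pointer explicitly. Counter and Counter_add are hypothetical names used only for illustration, and the sketch models the idea at the source level rather than showing the actual register usage.

```cpp
#include <cassert>

struct Counter {
    int value;

    // Member function: the compiler passes the address of the object
    // as a hidden "this" argument.
    int add(int delta) {
        value += delta;  // shorthand for this->value += delta
        return value;
    }
};

// Hand-written equivalent where the hidden argument is spelled out.
int Counter_add(Counter* self, int delta) {
    self->value += delta;
    return self->value;
}
```

Calling c.add(5) on a Counter has the same effect as calling Counter_add(&c, 5); the calling convention only decides where that hidden pointer travels.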