x64 Return Address Spoofing
Introduction
Antivirus and EDR solutions decide whether an activity is malicious based on behavioural thresholds. If a process's activity crosses its threshold, the process is categorized as suspicious or outright malicious and is likely to be terminated. The threshold differs between processes: trusted processes, such as system services, are given more leeway than a never-before-seen or unsigned executable. Many kinds of activity are monitored. One common indicator is the execution of sensitive APIs from unbacked memory regions, that is, executable memory that does not map to a module on disk. For instance, if shellcode loaded into memory starts calling WinAPIs such as "InternetConnectA", "HttpOpenRequestA", or "HttpSendRequestA", the process can be flagged as malicious with high confidence.
To determine the memory location from which a WinAPI is called, one just needs to look at its return address. This is the address from which the program will continue execution after returning from the WinAPI. By modifying the return address, we can make it look like the function is being called from a different location. This technique is called "Return Address Spoofing". While it is neither new nor perfect, and can be detected because it breaks the call stack chain, it is a fundamental technique to build upon. There are various blogs and great resources on this technique, and going through them inspired me to develop a POC that works with all WinAPIs.
The code for this project can be found on my GitHub.
Let's begin!
First, let's see what the call stack looks like when shellcode in unbacked memory calls MessageBox.
As we can see, the originating call is from an unbacked memory region.
Let's take a look at how the code below looks inside a debugger.
The breakpoint is set at the first instruction of MessageBoxA. Since this API is called within the "main()" function, we can see its return address at the top of the stack. Our objective will be to modify this value to an address within a trusted DLL, such as KERNEL32.DLL. This will make it look as if MessageBoxA is being called from KERNEL32.DLL. When the MessageBoxA API returns, the execution will be directed at the address in KERNEL32.DLL, and the instructions at this address will be executed.
The return address needs to be selected in such a way that the instructions located there redirect the execution flow back to us, or else the program will crash. Such instructions are usually referred to as gadgets. We will use the following gadget in our program: jmp QWORD PTR [rbx].
Because the gadget jumps to the address stored at the memory location "rbx" points to, we select an address of our liking and place a pointer to it in the "rbx" register. "rbx" is a non-volatile register in the Windows x64 calling convention, so the target function preserves its value. When the gadget in KERNEL32.DLL is executed, control flow is directed back to our chosen address, handing control back to us.
You can alternatively use any DLL which contains these instructions. Here is a simple Python script to check if a DLL can be used.
Implementation
Now that we have an idea of what to do, let's start with the implementation.
C Program
The following function retrieves the memory location of our gadget from a loaded DLL.
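As a rough illustration of the search itself, here is a minimal sketch that scans a byte buffer for the gadget's encoding. "jmp QWORD PTR [rbx]" assembles to the two bytes FF 23. The helper name find_gadget is hypothetical and not taken from the project. On Windows, the buffer would be the executable section of the loaded DLL, which this portable sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encoding of "jmp QWORD PTR [rbx]": opcode FF /4 with ModRM byte 0x23. */
static const uint8_t kGadget[] = { 0xFF, 0x23 };

/*
 * Hypothetical helper: returns a pointer to the first occurrence of the
 * gadget bytes inside [base, base + size), or NULL if none is found.
 */
const uint8_t *find_gadget(const uint8_t *base, size_t size)
{
    assert(base != NULL);

    if (size < sizeof(kGadget))
        return NULL;

    for (size_t i = 0; i + sizeof(kGadget) <= size; i++) {
        if (base[i] == kGadget[0] && base[i + 1] == kGadget[1])
            return base + i;
    }
    return NULL;
}
```

The returned pointer is what would be used as the spoofed return address. A real implementation should restrict the scan to executable memory, since a match in a data section cannot be executed.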
We will store all the required arguments inside a structure with the following members:
"pRopGadget" holds the address of our gadget.
"pTarget" holds the address of the function to be called, e.g. MessageBoxA, VirtualAlloc, etc.
"dwNumberOfArgs" specifies the number of arguments our target function takes, e.g. 4 for MessageBoxA.
"pEbx" holds the address of our liking, which is reached through "rbx" when the gadget is executed.
"pArgs" points to the arguments of our target function.
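A structure with these members could look roughly like the sketch below. The member names come from the description above. The structure name SPOOF_CONFIG, the member order, and the portable C types are assumptions made for illustration, and the project's actual definition may differ. The member offsets matter because the assembly code reads the members by offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The layout below only holds on a 64-bit target. */
static_assert(sizeof(void *) == 8, "sketch assumes a 64-bit target");

/* Member names follow the article; name, order and types are assumptions. */
typedef struct SPOOF_CONFIG {
    void     *pRopGadget;      /* offset 0:  address of the gadget           */
    void     *pTarget;         /* offset 8:  function to call                */
    uint32_t  dwNumberOfArgs;  /* offset 16: argument count (4 bytes + pad)  */
    void     *pEbx;            /* offset 24: value associated with rbx       */
    void    **pArgs;           /* offset 32: array of argument values        */
} SPOOF_CONFIG;
```

Note the four bytes of padding after the 32-bit counter. Assembly code that indexes into the structure has to account for it.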
We will use a function to initialize this structure and the required arguments for our target function.
This function enforces a minimum of four arguments, and if the number of arguments is odd, it is incremented by one. This keeps the stack aligned without requiring extra alignment logic in our assembly code.
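The adjustment of the argument count can be sketched as follows. The function name normalize_arg_count is hypothetical. It only illustrates the rule described above, not the project's actual initializer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: applies the two rules from the text.
 *  - fewer than four arguments are treated as four (register arguments)
 *  - an odd count is rounded up to the next even number
 */
uint32_t normalize_arg_count(uint32_t n)
{
    assert(n < UINT32_MAX);   /* guard against overflow on the increment */

    if (n < 4)
        return 4;
    if (n % 2 != 0)
        n += 1;
    return n;
}
```

With an even count of at least four, the number of stack arguments (count minus four) is also even, so they always occupy a multiple of 16 bytes.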
Before proceeding to the assembly section, let's understand the Windows x64 calling convention and the stack alignment requirement.
Windows x64 uses the fastcall calling convention. In this convention, the first four arguments to a function are passed in registers in the following order: rcx, rdx, r8, and r9. Any additional arguments are placed on the stack from right to left, i.e. the 5th argument ends up at the lowest address (closest to the top of the stack), and the last argument at the highest. If the number of arguments is odd, 8 bytes of padding are added to maintain stack alignment.
Additionally, 32 bytes must be reserved on the stack before calling the function, known as the shadow space. The callee can use this space to store the first four arguments from the registers. Windows also requires the stack to be 16-byte aligned when a function is called. In other words, the value in rsp must be divisible by 16 at the point of the call instruction; the call then pushes the 8-byte return address, so rsp is 8 bytes off a 16-byte boundary at the callee's first instruction.
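To make the arithmetic concrete, the sketch below computes how many bytes a caller reserves below an aligned rsp for a call with a given number of arguments, and where each stack argument sits. The function names are hypothetical. The numbers follow directly from the rules above: 8-byte slots, 32 bytes of shadow space, and 16-byte alignment.

```c
#include <assert.h>
#include <stddef.h>

enum { SLOT = 8, SHADOW = 32, ALIGN = 16 };

/* Bytes reserved below an aligned rsp just before the call instruction. */
size_t call_frame_bytes(size_t nargs)
{
    size_t stack_args = nargs > 4 ? nargs - 4 : 0;
    size_t bytes = SHADOW + stack_args * SLOT;

    /* Round up to the next multiple of 16 (adds 8 bytes of padding if needed). */
    return (bytes + ALIGN - 1) & ~(size_t)(ALIGN - 1);
}

/*
 * Offset of stack argument `index` (1-based, so 5 is the first stack
 * argument) from rsp just before the call instruction.
 */
size_t stack_arg_offset(size_t index)
{
    assert(index >= 5);
    return SHADOW + (index - 5) * SLOT;
}
```

For example, a function with five arguments needs 32 + 8 = 40 bytes, which is rounded up to 48. Once the call has pushed the return address, every offset seen by the callee grows by 8, so the 5th argument is found at rsp + 40 on entry.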
We will define a prototype for our Spoof function, which will be coded in assembly. This function takes a pointer to our config structure as its parameter and is responsible for spoofing the return address.
Spoof Assembly Code
This is where we get our hands dirty and work directly with assembly. This part is necessary as we want granular control over the stack and the register values.
To get started with coding in assembly, check out g3tsyst3m's "x64 Assembly" series.
Assembly Code
First, we pop the return address, which points back into our "main" function, into the rdi register. We want to continue execution from this address once our shenanigans are complete. After the pop instruction executes, the stack is 16-byte aligned, and we need to keep it that way. The next set of instructions loads the first four arguments into the registers. If our target function takes more than four arguments, the number of arguments to be placed on the stack is stored in the r12 register.
We now have to store the additional arguments on the stack, which is accomplished using a loop. First, space is created on the stack by subtracting the required number of bytes from rsp. As we have ensured that the number of arguments is even, the stack will still be aligned.
Now that our stack is ready, we have to do the following steps:
Allocate the shadow space
Push the gadget's address onto the stack, where it will act as the return address
Configure rbx
Jump to the target function
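The stack that the target function sees on entry after these steps can be modelled in plain C. This is only an illustrative model of the layout, not the assembly itself: the function name build_spoofed_frame is hypothetical, and the shadow space is zero-filled here although its contents do not matter in practice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fills `stack` with the qwords the target sees at entry, lowest address
 * first, so stack[0] models [rsp]. `stack` must have room for nargs + 1
 * slots, and `nargs` must already be even and at least 4.
 * Returns the number of slots written.
 */
size_t build_spoofed_frame(uint64_t *stack, uint64_t gadget,
                           const uint64_t *args, size_t nargs)
{
    assert(stack != NULL && args != NULL);
    assert(nargs >= 4 && nargs % 2 == 0);

    size_t i = 0;
    stack[i++] = gadget;                /* [rsp]: spoofed return address    */
    for (size_t s = 0; s < 4; s++)      /* [rsp+0x08..0x27]: shadow space   */
        stack[i++] = 0;
    for (size_t a = 4; a < nargs; a++)  /* [rsp+0x28...]: 5th argument on   */
        stack[i++] = args[a];
    return i;
}
```

The first four arguments do not appear in the frame because they travel in rcx, rdx, r8 and r9. The frame always holds an odd number of 8-byte slots, which leaves rsp 8 bytes off a 16-byte boundary at the target's first instruction, exactly as after a regular call.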
The final section is the cleanup, which restores the stack to its original state and returns to our "main" function.
Example
Let's see an example of executing a shellcode using this technique.
We can see that our shellcode has been successfully executed. Let's analyse this inside a debugger.
Breakpoint at VirtualAlloc
Breakpoint at CreateThread
Breakpoint at WaitForSingleObject
References
https://sabotagesec.com/the-stack-series-return-address-spoofing-on-x64/
https://g3tsyst3m.github.io/shellcoding/assembly/debugging/x64-Assembly-&-Shellcoding-101/