In this blog post I’m writing about something that I find really interesting and fascinating from an engineering perspective. I’m a huge fan of computer security and am intrigued by the many branches (ha!) it covers, top to bottom, with so many different aspects that you really cannot know everything. There’s always so much more to discover. This is not supposed to be a deep dive or a complete guide to anything; it’s just a brief introduction to something that you might also find interesting and want to study on your own. Please, if you don’t read this post, at least check out the link in the second-to-last paragraph!
So back in the day CPU architectures introduced the stack to allow better control of a program’s execution flow and, for instance, to make recursion possible (if that’s not clear to you, read this post and try to come up with the reason why you cannot have recursion without the stack!). Nowadays the stack is a must and, as far as I know, you cannot find a piece of hardware executing software without one. The stack is interesting. It grows with push operations and shrinks with pop operations. Every time the CPU calls a function the stack grows, and every time a function finishes the stack shrinks. What goes in there, specifically, are the function’s local variables and the return address to use once it is done.
So let’s say the CPU calls a void function named count that is supposed to print the numbers from 0 to 10. What gets pushed to the stack is the return address, which is the address of the instruction to execute right after the function is done, plus space for count’s local variables (it presumably needs a loop and a loop variable). When count is done, the stack shrinks again: count’s local variables are discarded and execution returns to the address that was pushed earlier. The whole thing grows and shrinks automatically, keeps each function’s environment intact, and stores the addresses needed to continue execution. It’s very simple but really powerful at the same time.
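To make the count example a bit more concrete, here is a tiny Python sketch of the idea. It is only a toy model, not how a real CPU does it: the stack is a plain list, and the names `call`, `ret`, and `RETURN_TO`, as well as the address 0x1004, are made up for illustration.

```python
# Toy model of a call stack: a Python list whose end is the top of the stack.
# A real CPU keeps a stack pointer into memory; this only mimics the idea.

stack = []
output = []

def call(return_address, local_count):
    """Simulate a call: push the return address, then reserve space for locals."""
    stack.append(return_address)
    stack.extend([0] * local_count)

def ret(local_count):
    """Simulate a return: drop the locals, then pop the return address."""
    del stack[len(stack) - local_count:]
    return stack.pop()

def count():
    """The count function from the text, with one local: the loop variable."""
    for i in range(11):
        stack[-1] = i              # the loop variable lives in count's stack frame
        output.append(stack[-1])   # stands in for printing the number

RETURN_TO = 0x1004                 # hypothetical address of the instruction after the call

call(RETURN_TO, local_count=1)
depth_during_call = len(stack)     # return address + one local
count()
resume_at = ret(local_count=1)

print(output)
print(hex(resume_at))
```

After the return the list is empty again and `resume_at` holds the address that was pushed before count started, which is the whole trick in miniature.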
The problem is… what if an adversary could write malicious data onto your stack and overwrite the bits that hold the information about where to go next? Specifically, overwrite the return address on the stack with an address of the adversary’s own choosing. That would make the control flow jump to whatever the malicious party wanted. That’s what the bad guys started doing, and it led to DEP and ASLR. DEP stands for Data Execution Prevention, a technology that separates the data a program writes at run time from the instructions the CPU is allowed to execute: memory areas such as the stack are marked non-executable. ASLR stands for Address Space Layout Randomization, which randomizes where the program, the kernel code, the operating system’s libraries, and so forth are placed in memory, so that the malicious party cannot know where certain pieces of code lie in the address space, which makes jumping to known locations of useful code a no-go. DEP and ASLR are both really fascinating in themselves, and if you’re interested, go check them out in more detail; my few words here don’t really cover them at all.
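Here is a rough Python sketch of how overwriting the return address works. It assumes a simplified frame layout where a fixed-size buffer sits directly in front of the saved return address; the names `run_function`, `LEGIT_RETURN`, and `ATTACKER_TARGET` and both addresses are hypothetical. Python itself is memory safe, so this only models what an unchecked copy in a language like C would do.

```python
# Toy stack frame: an 8-slot local buffer followed directly by the saved return address.
BUF_SIZE = 8
LEGIT_RETURN = 0x1004      # hypothetical: where the caller wants to resume
ATTACKER_TARGET = 0xBAD0   # hypothetical: where the attacker wants execution to go

def run_function(user_input, check_bounds):
    """Copy user_input into the local buffer and return the address we would jump to."""
    frame = [0] * BUF_SIZE + [LEGIT_RETURN]   # locals first, return address after them
    if check_bounds:
        user_input = user_input[:BUF_SIZE]    # the length check vulnerable code forgets
    for offset, value in enumerate(user_input):
        frame[offset] = value                 # copy without looking at the buffer size
    return frame[BUF_SIZE]                    # "ret" jumps to whatever is stored here

# Eight filler values to fill the buffer, then one extra value that lands on the return address.
payload = [0x41] * BUF_SIZE + [ATTACKER_TARGET]

hijacked = run_function(payload, check_bounds=False)
safe = run_function(payload, check_bounds=True)

print(hex(hijacked))   # the attacker's address
print(hex(safe))       # the legitimate return address
```

With the bounds check in place the extra value never reaches the return address slot; without it, the function "returns" to wherever the input says.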
Return-Oriented Programming (ROP) is a way around DEP: instead of injecting new code, the attacker takes control of the stack and fills it with return addresses that point to short snippets of existing code, each sitting at the end of some subroutine right before a return instruction. The CPU then hops from snippet to snippet, and the program flow diverges onto the adversary’s route. Done correctly, that can lead to a takeover of the program and, in the worst case, the operating system, since the malicious party can execute whatever it wishes. Now Intel and Microsoft have come up with a new piece of fascinating technology called Control-Flow Enforcement Technology (CET) that adds more protection specifically to stack operations and branches. There are two new things to fight the problem:
- An ENDBRANCH instruction (ENDBR32/ENDBR64) is added to the x86/x64 instruction sets. When software is compiled with a CET-supporting compiler, every legal (valid) target of an indirect call or jump starts with an endbranch instruction. That is to say, whenever a subroutine is called through a pointer, the first instruction found inside the subroutine must be an endbranch. If it’s not, the CPU throws an exception and all hell breaks loose. So the program gets compiled in a way that forces the call/jump flow to obey the original intent of the programmer, thus the name Control-Flow Enforcement. Today’s processors can jump to any executable address no matter what instructions lie there, but that’s going to change with CET. It prevents an attacker from redirecting the program flow, for example, into the middle of a library function, which could then lead to a takeover. The brilliance of the endbranch instruction is that it’s treated as a NOP on existing Intel chips, so it’s 100% backwards compatible and requires no tricks from the programmer.
- A shadow stack, which is a separate, hidden, you-cannot-touch-me stack living alongside the usual stack. For every call the software makes, the return address is pushed to the normal stack, but on the new chips it is also pushed automatically to the shadow stack. Then, when it’s time to return to the previous location, the CPU checks whether the shadow stack and the normal stack hold the same return address. If they match, ok, continue execution. If they don’t match, it means that someone has had their fingers on the normal stack’s return address and execution should be stopped. The programmer has no direct control over the shadow stack; it operates in the background and is implemented in hardware so that ordinary code cannot trick or touch it.
Together these two additions will make it absurdly hard to alter the execution flow.
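The two checks can be sketched as a toy Python model. This is only an illustration under my own assumptions, not how the hardware is built: `ToyCPU`, `ControlFlowViolation`, the tiny `code` map, and all the addresses are invented for the example.

```python
class ControlFlowViolation(Exception):
    """Stands in for the exception a CET-enabled CPU would raise."""

class ToyCPU:
    def __init__(self, code):
        self.code = code          # address -> instruction name (hypothetical program)
        self.stack = []           # normal stack: software can read and write it
        self.shadow_stack = []    # shadow stack: only call() and ret() touch it

    def call(self, return_address):
        # A call pushes the return address to both stacks.
        self.stack.append(return_address)
        self.shadow_stack.append(return_address)

    def ret(self):
        # A return pops both copies and compares them.
        address = self.stack.pop()
        expected = self.shadow_stack.pop()
        if address != expected:
            raise ControlFlowViolation("return address does not match shadow stack")
        return address

    def indirect_jump(self, target):
        # An indirect call or jump must land on an endbranch instruction.
        if self.code.get(target) != "endbranch":
            raise ControlFlowViolation("indirect branch target is not an endbranch")
        return target

code = {0x2000: "endbranch", 0x2001: "mov", 0x2002: "ret"}
cpu = ToyCPU(code)

# Normal call and return: both stacks agree.
cpu.call(0x1004)
normal_return = cpu.ret()

# The attacker overwrites the return address on the normal stack only.
cpu.call(0x1004)
cpu.stack[-1] = 0xBAD0
try:
    cpu.ret()
    tamper_caught = False
except ControlFlowViolation:
    tamper_caught = True

# Jumping to the start of a function is fine; jumping into its middle is not.
legal_target = cpu.indirect_jump(0x2000)
try:
    cpu.indirect_jump(0x2001)
    midjump_caught = False
except ControlFlowViolation:
    midjump_caught = True

print(tamper_caught, midjump_caught)
```

In the model, the tampered return and the jump into the middle of the function both end in an exception, while the untouched call and the jump to the endbranch go through.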
If you found any of this interesting, do your homework on DEP, ASLR, and CET, and before that, take a look at Ars Technica’s brilliant article How security flaws work: The buffer overflow. That’s a really nice piece on the whole stack and the security questions around it, and it gives you many pointers to more things to discover.
Some links: Intel’s blog post, ROP on Wikipedia, the TWIT.tv episode of Security Now