A buffer overflow occurs when too many bytes are written to a data structure or variable (a “buffer”) in memory. When this happens, the contents of the memory locations immediately after the buffer are modified. This inadvertent change can corrupt other variables in the same function, corrupt return addresses (if the buffer is on the stack, as is the case when it is a local/automatic variable), or damage the heap (if the buffer is a dynamically allocated area).
An attacker can exploit a buffer overflow vulnerability if an application does not properly check the length of its input arguments (which, in these scenarios, are typically supplied by the attacker). If a caller is allowed to provide both the input data and the size of that data to a vulnerable function, and that function trusts the length provided by the caller (the attacker), the application can be fooled into overwriting the subsequent bytes in memory when it copies that input data into a local buffer/variable.
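A minimal sketch of this pattern in C may help. The names `NAME_BUF_SIZE`, `store_name_unchecked`, and `store_name_checked` are hypothetical and chosen only for illustration. The first function trusts the caller's length; the second rejects any length that would not fit in the destination.

```c
#include <stddef.h>
#include <string.h>

#define NAME_BUF_SIZE 16  /* hypothetical size of the destination buffer */

/* UNSAFE (illustration only): trusts the caller-supplied length.
 * If len > NAME_BUF_SIZE, memcpy writes past the end of dst. */
void store_name_unchecked(char dst[NAME_BUF_SIZE], const char *src, size_t len)
{
    memcpy(dst, src, len);
}

/* Safer: validate the caller-supplied length before copying.
 * Returns 0 on success, -1 if the input is missing or would not fit. */
int store_name_checked(char dst[NAME_BUF_SIZE], const char *src, size_t len)
{
    if (src == NULL || len >= NAME_BUF_SIZE) {
        return -1;  /* reject; one byte is reserved for the terminating NUL */
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

The checked version compares the length against the size of the destination, which the function itself knows, rather than relying on anything the caller claims.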
Altering neighboring bytes in memory can allow an attacker to take control of an application. This can happen through several different mechanisms:
- If the buffer being overflowed is on the stack (e.g., a local/automatic variable in C/C++), the attacker may be able to overwrite stack frames, including the return address of the currently executing function. When the function returns to its caller, control flow can be hijacked and directed to code of the attacker’s choice (e.g., a ROP-style attack). This type of control flow hijacking is possible because modern CPU architectures and ABIs use the stack for both local variable storage and control flow.
- If the buffer being overflowed is on the heap (e.g., memory dynamically allocated with malloc() or new), the attacker may be able to overwrite the heap implementation’s internal metadata. If the heap implementation uses a linked list or sequence of buffers, the attacker may be able to fool the heap implementation’s free() function into writing arbitrary values elsewhere in memory. This technique can be used to write to known locations containing function pointers (such as C++ object vtables), influencing control flow the next time those functions are called.
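The heap case can be sketched with a simplified model. This is not the code of any particular allocator; `struct chunk`, `unlink_unchecked`, and `unlink_checked` are hypothetical names. The sketch assumes free chunks are kept on a doubly linked list whose links live inside the chunks themselves, so an overflow from a neighboring buffer can overwrite them.

```c
#include <stddef.h>

/* Hypothetical free-list node: the links are stored inside the freed
 * chunk, right next to user-controlled data. */
struct chunk {
    struct chunk *fd;  /* next free chunk */
    struct chunk *bk;  /* previous free chunk */
};

/* Naive unlink: trusts fd and bk. If an overflow has replaced them,
 * these two stores write attacker-chosen pointers to attacker-chosen
 * locations. */
void unlink_unchecked(struct chunk *c)
{
    c->fd->bk = c->bk;
    c->bk->fd = c->fd;
}

/* Hardened unlink: confirm both neighbors still point back at c before
 * touching them. Returns 0 on success, -1 if the links look corrupted. */
int unlink_checked(struct chunk *c)
{
    if (c->fd->bk != c || c->bk->fd != c) {
        return -1;
    }
    c->fd->bk = c->bk;
    c->bk->fd = c->fd;
    return 0;
}
```

In the naive version, whoever controls `fd` and `bk` decides where the two stores land. A consistency check like the one in `unlink_checked` is one way an allocator can detect tampered links before acting on them.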
Control flow hijacking is not the only danger posed by buffer overflow vulnerabilities. If adjacent bytes are overwritten with garbage or junk data, the application may crash, giving an attacker a way to mount a DoS (denial of service) attack.
Buffer overflows don’t always need to be of the “write” variety; buffers can be overflowed in a “read” scenario as well. If a caller of a function controls the “length” or “size” parameter of data to be returned by a function, and that function does not validate that the amount of data requested is no more than the actual amount (or max amount) of data available, a read operation may return bytes past the end of the correct data. This can allow an attacker to retrieve data unrelated to the original request, which may include sensitive data residing in memory after the buffer.
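As a rough sketch of the read case, the function below validates a caller-supplied request size before copying anything out. The name `read_record` and its parameters are hypothetical; the key assumption is that the callee tracks how many bytes of the record are actually valid (`record_len`), separately from whatever the caller asks for.

```c
#include <stddef.h>
#include <string.h>

/* Copies `requested` bytes of a stored record into `out`.
 * `record_len` is the number of valid bytes held in `record`;
 * `out_cap` is the capacity of the output buffer.
 * Returns 0 on success, -1 if the request is invalid. */
int read_record(const unsigned char *record, size_t record_len,
                size_t requested, unsigned char *out, size_t out_cap)
{
    if (record == NULL || out == NULL) {
        return -1;
    }
    if (requested > record_len) {
        return -1;  /* would read past the end of the valid data */
    }
    if (requested > out_cap) {
        return -1;  /* would write past the end of the output buffer */
    }
    memcpy(out, record, requested);
    return 0;
}
```

Without the `requested > record_len` check, a caller asking for 100 bytes of a 5-byte record would receive 95 bytes of whatever happened to follow it in memory.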
Buffer overflows that can be used to influence control flow are more common in memory-unsafe languages like C/C++. Languages that manage memory via an interpreter may be more resilient to these attacks, although bugs and implementation problems in the interpreters themselves could still allow for control flow hijacking. Memory-safe languages may offer a structural barrier against buffer overflows, but may not protect a developer against logic errors that cause overflows within a managed region of memory (e.g., returning 100 bytes from a 1000-byte buffer when only 50 bytes should have been returned).
Mitigating Buffer Overflows
Buffer overflow exploits can be mitigated using a number of safe programming techniques:
- Don’t use functions that operate without buffer length checks. This includes functions like strcpy/strcat/sprintf and similar variants. These functions date from the 1970s, and better alternatives (such as strlcpy) are almost always available.
- Use memory safe languages to provide an additional layer of safety.
- Use fuzzing tools (during test) on functions accepting caller-provided “length” or “size” function parameters, in an attempt to discover buffer overflow susceptibility before code reaches production.
- Practice good code reviews; always treat “length” or “size” function input parameters as suspect. Ensure these parameters are checked before any data manipulation occurs.
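As one illustration of replacing an unbounded function, the sketch below builds a string with the standard `snprintf` instead of strcpy/strcat/sprintf and treats truncation as an error. The helper name `join_path` is hypothetical, and treating truncation as a hard failure is a design assumption; some applications may prefer to accept truncated output.

```c
#include <stddef.h>
#include <stdio.h>

/* Builds "<dir>/<file>" into dst, never writing more than dst_size bytes.
 * Returns 0 on success, -1 on bad arguments, a formatting error, or if
 * the result would not fit (dst is then set to the empty string). */
int join_path(char *dst, size_t dst_size, const char *dir, const char *file)
{
    if (dst == NULL || dst_size == 0 || dir == NULL || file == NULL) {
        return -1;
    }
    /* snprintf returns the length the full result would have had,
     * excluding the terminating NUL, so n >= dst_size means truncation. */
    int n = snprintf(dst, dst_size, "%s/%s", dir, file);
    if (n < 0 || (size_t)n >= dst_size) {
        dst[0] = '\0';
        return -1;
    }
    return 0;
}
```

Passing `sizeof buffer` as `dst_size` at the call site keeps the length check tied to the real size of the destination rather than to a value derived from the input.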
References:
https://en.wikipedia.org/wiki/Buffer_overflow
https://en.wikipedia.org/wiki/Return-oriented_programming