Tuesday, October 5, 2010

HOWTO Smash Your C Stack and Learn More !!!

Stack is an abstract data type (ADT) in C. It is a memory visualization, where the memory contents are put inside one on top of the other. Just as a pile of plates. We have to take each plate rom the top, one by one. Hence, a stack can be also called a Last In First Out (LIFO) structure.

Its main uses include storing the current instruction address of a program when it calls another function, contents of relevant registers like the program counter, etc. Other uses of stack are seen while performing string operations, one of which I am going to use now.


A simple implementation of builtin strcat()
---------------------------------------------------------
The program can be as follows:
Fig 1. The pseudo code for strcat()












Its time to play with the test data. Initially,
Fig 2. Initial test data









The output will be printed graciously as,
Fig 3. Initial output







My name is 'hari john kuriakose'
---------------------------------------------
It is only natural that I try to concatenate the two 'halves'
of my name. So my new test data is,
Fig 4. Test data 2






Surprisingly, we get a peculiar output. What we see is,
Fig 5. Output for test data 2 - part I




Fig 6. Output for test data 2 - part II













Note that the length of destination (here its a) is 7. The stack will be smashed for the lengths 8 and 9 too, but not for other arbitrary lengths. Also, this behaviour is true only when length of the source (here its b) in all these cases is >1.
Fig 7. Test data 3

Fig 8. Test data 4











If the length of the source is 0 or 1, no smashing occurs.

Similarly, when concatenation is in the reverse order, the stack smashing willl occur only when the length of destination (here its  b) is 7 and length of source (here its a) is >1.
Fig 9. Test data 5






Here too, no smashing is observed when length of source is 0 or 1.


Analyzing with gdb
---------------------------
On analysis, the program works fine. There will be no problem till the very last line. Consider the test data is one among the problematic cases mentioned above. On exiting from main() after seeing the closing curly bracket, this odd behaviour suddenly springs up. But now, gdb informs that two more things have happened.
1. Program received signal SIGABRT, Aborted
2. 0x0012d422 in __kernel_vsyscall ()
Fig 10. gdb output - part I

Fig 11. gdb output - part II
















This, is interesting. What are these things?


SIGABRT
--------------
It is a signal. It is defined in signal.h . It is used to tell a program to abort. Its signal number is 6.

On some platforms, SIGIOT (signal for I/O transfer) is taken as a synoym for SIGABRT. It is also the signal a process sends itself when it calls the abort libc function, defined in stdlib.h . The SA_SIGINFO macro would be interesting too.

SIGABRT can be caught, but it cannot be trapped. When the signal handler returns, all streams must be flushed out and the program terminated. Hence, an abort function will not return. This signal is used to indicate fatal conditions in the supporting libraries, when the program cannot continue, but the main() can cleanup while it exits. Typical example is, on the failure of an assertion.

If a SIGINFO is called after the occurrence of a SIGABRT, a pointer to a structure of type ABRT_t will be returned. This strcuture contains a printable form of the ABEND code. Since SIGABRT cannot return to the point of interrupt, an attempt to do so will reissue a ABEND.


__kernel_vsyscall()
-------------------------
This is the method used by linuxgate.so (a part of the kernel) to make fast system calls, usually using sysenter.

The linuxgate.so is a Virtual Dynamically-linked Shared Object (VDSO), a kernel-provided shared library that helps userspace perform a few kernel actions without the overhead of a system call, as well as automatically choosing the most efficient syscall mechanism. Also called the “vsyscall page”.
This is as per the KernelGlossary.

Or, the linux gate is the actual interface between the kernel and the user space. Originally it was named linux-vsyscall.so.1, but to make it more meaningful, it was changed.

To view the linuxgate.so, print the shared library dependencies for the shell.
    ldd /bin/sh
Fig 12. The Linux gate





But the address you see there, does not actually exist. It is just a shared object used by kernel to facilitate a VDSO. To know whether your kernel is VDSO enabled, you can see the contents of a file in /proc, which contains the currently mapped memory regions and their access permissions.
    cat /proc/self/maps

Fig 13. cat /proc/self/maps - part I

Fig 14. cat /proc/self/maps - part II














In the kernel, all processes will therefore share this same object, which enables us to do a simple trick with the convert and copy ability in linux, the dd tool. On successive objdump, the '__kernel_vsyscall' can be spotted. On further objdumping, the underlying 'sysenter' can be seen too.

What are 'system calls'?
They are the means by which a user can access the services offered by a kernel. Those services include storage, process management, etc. In C, the stubs are the interface to invoke system calls.

Is that all?
No. In a system call, the actual code running is a part of the kernel itself. It runs with a privilege level 0 (CPL 0), which is the highest level in x86 architecture. It is termed 'ring 0'. All user processes run in ring 3. Thus, to sucessfully make a system call, we have to call a ring 0 code from a ring 3 code.

What are these 'faster system calls' ?
Originally, in the x86 architecture, all system calls were implemented as interrupts. Linux and Unix-like kernels used 0x80. For the newer members in the x86 family, it was observed that interrupting via 0x80 was getting much slower! A Pentium IV was more slower than a Pentium III in this particular respect. Intel recognized this problem early, and introduced two instructions, 'sysenter' and 'sysexit'. But the hardware bugs were so plenty, they could not boast about it.

That was until Linus Torvalds came along. He was very skillful in manipulating the fact that the return point of 'sysenter' instruction is totally arbitrary, and he wrote some highly tricky code that ultimately came out as the solution!!! Anyways, he himself says his final solution is disgusting!

For the techier techies, 


N. B. 
---------
The pseudo code can be tweaked slightly and made theoretically correct.