Binary reversing is an essential skill for malware analysis and for solving wargame challenges. Programs written in C are common, and there are various tutorials about reversing them (calling conventions, dynamic libraries, stack, variables and so on). Once assembly language is learned, reversing an application is just a matter of patience (anti-reversing techniques aside, of course).

However, the assembly generated from C++ code is harder to analyze because of object-oriented constructs. These tutorials study how high-level constructs, such as namespaces, operators, classes and their relationships, are converted into assembly code and how to reverse them when analyzing a binary.

PART 1: NAME MANGLING AND FUNCTIONS

First of all, the compiler gives each function a symbol name suitable for the compiler and the linker, encoding its namespace and parameter types. This process is called name mangling (see below for references).

namespace
{
    // _ZN12_GLOBAL__N_17ScroogeEv
    int Scrooge()
    {
        return 5;
    }
}

// _Z11GlobalPlutov
int GlobalPluto()
{
    return 4;
}

// _ZL11GoofyStaticv
static int GoofyStatic()
{
    return 3;
}

namespace Donald
{
    // _ZN6Donald12GlobalDonaldEv
    int GlobalDonald()
    {
        return 1;
    }

    // _ZN6DonaldL12StaticDonaldEv
    // (the L in the mangled name marks internal linkage, hence static)
    static int StaticDonald()
    {
        return 2;
    }
}
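
The mangled names in the comments above follow a length-prefixed scheme: `_Z` opens the name, `N...E` wraps a qualified (nested) name, `L` marks internal linkage, and each identifier is preceded by its length in decimal. As a rough sketch of how to read them by hand, the hypothetical helper below (`split_mangled_identifiers` is an illustrative name, not part of any library) extracts the identifiers from such simple names. It assumes plain functions without templates, operators, substitutions or class-type parameters, all of which it does not handle.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits the length-prefixed identifiers out of a
// simple Itanium-ABI mangled name such as "_ZN6Donald12GlobalDonaldEv".
// Only plain (optionally nested) names are understood; anything else
// stops the scan early.
std::vector<std::string> split_mangled_identifiers(const std::string& mangled)
{
    std::vector<std::string> parts;
    if (mangled.compare(0, 2, "_Z") != 0)
        return parts;                        // not a mangled C++ name

    std::size_t i = 2;
    if (i < mangled.size() && mangled[i] == 'N')
        ++i;                                 // start of a nested (qualified) name

    while (i < mangled.size()) {
        if (mangled[i] == 'L') {             // internal-linkage marker
            ++i;
            continue;
        }
        if (!std::isdigit(static_cast<unsigned char>(mangled[i])))
            break;                           // 'E' or the parameter encoding

        // Read the decimal length prefix, then that many characters.
        std::size_t len = 0;
        while (i < mangled.size() &&
               std::isdigit(static_cast<unsigned char>(mangled[i])))
            len = len * 10 + static_cast<std::size_t>(mangled[i++] - '0');

        if (len > mangled.size() - i)
            break;                           // truncated input
        parts.push_back(mangled.substr(i, len));
        i += len;
    }
    return parts;
}
```

For example, `split_mangled_identifiers("_ZN6Donald12GlobalDonaldEv")` yields `Donald` and `GlobalDonald`, while the anonymous namespace shows up as the compiler-chosen identifier `_GLOBAL__N_1`.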

Thus by reading the following code:

push   %ebp
mov %esp,%ebp
lea -0x1018(%esp),%esp
orl $0x0,(%esp)
lea 0x1010(%esp),%esp
call 80487fd
call 80487c1
call 8048749
call 8048785
call 804870d
mov $0x0,%eax
leave
ret

We can say that the function calls two functions outside any namespace, GoofyStatic and GlobalPluto, two functions inside the ‘Donald’ namespace and one function residing in an anonymous namespace (Scrooge). GDB can also demangle names automatically:

gdb> set print asm-demangle on
gdb> disass main
Dump of assembler code for function main:
0x0804873f : push %ebp
0x08048740 : mov %esp,%ebp
0x08048742 : and $0xfffffff0,%esp
0x08048745 : lea -0x1010(%esp),%esp
0x0804874c : orl $0x0,(%esp)
0x08048750 : lea 0x1010(%esp),%esp
0x08048757 : call 0x8048659
0x0804875c : call 0x80485e9
0x08048761 : call 0x8048707
0x08048766 : call 0x8048621
0x0804876b : call 0x8048723
0x08048770 : mov $0x0,%eax
0x08048775 : leave
0x08048776 : ret
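
Demangling can also be done programmatically. The sketch below assumes a GCC or Clang toolchain that follows the Itanium C++ ABI and uses `abi::__cxa_demangle` from `<cxxabi.h>`; the wrapper name `demangle` is hypothetical and chosen only for illustration.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical wrapper: returns the human-readable form of a mangled
// symbol, or the input unchanged when it cannot be demangled.
std::string demangle(const std::string& mangled)
{
    int status = 0;
    // With a null output buffer, the function allocates the result with
    // malloc, so the caller must release it with free.
    char* readable =
        abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr)
        return mangled;

    std::string result(readable);
    std::free(readable);
    return result;
}
```

With this wrapper, `demangle("_ZN6Donald12GlobalDonaldEv")` gives `Donald::GlobalDonald()`, and the anonymous-namespace symbol `_ZN12_GLOBAL__N_17ScroogeEv` becomes `(anonymous namespace)::Scrooge()`. The `c++filt` command-line tool from binutils performs the same translation on symbol names read from standard input.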

References:
http://www.int0x80.gr/papers/name_mangling.pdf
http://www.ofb.net/gnu/gcc/gxxint_15.html
http://en.wikipedia.org/wiki/Name_mangling#Name_mangling_in_C.2B.2B
http://stackoverflow.com/a/1962381
