Prashant's Log - VIDA-NYU/reprozip GitHub Wiki
Dtruss is a shell script wrapped around Dtrace's system call tracing. Dtruss can give us precise details about the system calls made by executable programs on macOS in a nice text format. It is "supposedly" better than Strace on other Unix systems. Please see this link: Strace and Dtrace.
Now comes the new security feature that macOS has introduced, called "System Integrity Protection", aka "SIP". Read more about SIP. Essentially, it adds an extra layer of security to system folders such as /usr, /bin, /sbin etc., so that even root programs cannot modify their contents. This is why it is called rootless mode.
Now, Dtruss is a root-mode program. Since OS X El Capitan, if we want to trace system calls we have two options.
- Disable SIP (Completely or Partially)
- Run Dtruss in unprotected directories
Let me go through both of these approaches one by one. I will discuss the merits and demerits of both in the following sections.
- Disable completely
Caution, this could be dangerous!! At least that's what Apple is saying. The process itself is not very complicated; the only problem is that we can't automate it.
- Restart your Mac.
- As OS X starts up, hold down Command-R and keep it held down until you see an Apple icon and a progress bar. Release. This boots you into Recovery.
- From the Utilities menu, select Terminal.
- At the prompt type exactly the following and then press Return:
$ csrutil disable
- Terminal should display a message that SIP was disabled.
- Restart OS X.
Now, you might have noticed that we need to boot the computer into recovery mode. That is the step we can't automate using a shell script.
- Disable SIP partially
The process is essentially the same, but we can disable SIP for Dtrace only. All the other components stay protected. The steps are as follows.
- Restart your Mac.
- As OS X starts up, hold down Command-R and keep it held down until you see an Apple icon and a progress bar. Release. This boots you into Recovery.
- From the Utilities menu, select Terminal.
- At the prompt type exactly the following and then press Return:
$ csrutil enable --without dtrace
- Terminal should display a message confirming the new SIP configuration.
- Restart OS X.
Again, the problem remains the same: we cannot automate the process. The user has to follow all the steps manually in order to run Dtruss on their system. Even if we could somehow automate the process, we would need to give the user a document explaining the complete procedure and all the "potential risks" they are taking.
I read the Apple docs for Dtruss and Dtrace; nothing came up that can help us in this regard. Maybe I am missing something. I am new to OS X, I have used it only once or twice. I can discuss it with Remi next week.
I have tried to use Fakeroot from the following repository. As you can see on the GitHub page of the repository, it says OS X El Capitan is not supported.
Besides the above repo, here is what I tried.
- Manually compiling fakeroot from source and then running it.
- Installing dpkg with brew install dpkg [fakeroot comes with dpkg].
Neither of those worked as promised. Remi pointed out it might be because of SIP. I disabled SIP completely and then tried again. No luck so far.
In order to trace calls we tried C code injection. I came across these resources which got me started.
I tried both approaches. In the first one the code compilation issue was daunting: it needed a specific architecture to work on. In the second approach I wrote a small C program which overrides these functions:
- open
- read
- fopen
The problem remains the same: if I inject libraries after disabling SIP, it works; otherwise it does not, unless we copy the binaries from system directories to local directories and then inject the libs. I am working on another approach which Remi told me to take a look at: Binary Run.
UPDATE: I managed to load the code from memory. Essentially we follow the steps below.
- Give the location of the binary to the program (a physical location such as /bin/ls or ./program)
- The program reads the binary and loads it into a buffer
- We can run the binary using a command such as ./loadFromMemory /bin/cat
The code is written below.
#include <fcntl.h>
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <dirent.h>
#include <mach-o/loader.h>
#include <mach-o/nlist.h>
#include <CoreServices/CoreServices.h>
#include <mach-o/dyld.h>
#define EXECUTABLE_BASE_ADDR 0x100000000
#define DYLD_BASE 0x00007fff5fc00000
int IS_SIERRA = -1;
int is_sierra(void) {
    // returns 1 if running on Sierra, 0 otherwise
    // this works because /bin/rcp was removed in Sierra
    if (IS_SIERRA == -1) {
        struct stat statbuf;
        IS_SIERRA = (stat("/bin/rcp", &statbuf) != 0);
    }
    return IS_SIERRA;
}
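int find_macho(unsigned long addr, unsigned long *base, unsigned int increment, unsigned int dereference) {
    unsigned long ptr;
    // find a Mach-O header by searching from address.
    // note: the loop only ends when a header is found
    *base = 0;
    while (1) {
        ptr = addr;
        if (dereference) ptr = *(unsigned long *)ptr;
        // probe the address by passing it to chmod() as a path: an unmapped
        // address fails with EFAULT, a mapped one normally fails with ENOENT
        errno = 0;  // clear any stale value before the probe
        chmod((char *)ptr, 0777);
        if (errno == ENOENT &&
            ((unsigned int *)ptr)[0] == MH_MAGIC_64 /*0xfeedfacf*/) {
            *base = ptr;
            return 0;
        }
        addr += increment;
    }
    return 1;
}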
int find_epc(unsigned long base, struct entry_point_command **entry) {
    // find the entry point command by searching through base's load commands
    struct mach_header_64 *mh;
    struct load_command *lc;
    *entry = NULL;
    mh = (struct mach_header_64 *)base;
    lc = (struct load_command *)(base + sizeof(struct mach_header_64));
    for (uint32_t i = 0; i < mh->ncmds; i++) {
        if (lc->cmd == LC_MAIN) { //0x80000028
            *entry = (struct entry_point_command *)lc;
            return 0;
        }
        lc = (struct load_command *)((unsigned long)lc + lc->cmdsize);
    }
    return 1;
}
unsigned long resolve_symbol(unsigned long base, unsigned int offset, unsigned int match) {
    // Parse the symbols in the Mach-O image at base and return the address of the one
    // matched by the offset / int pair (offset, match).
    // Returns (unsigned long)-1 when nothing matches.
    struct load_command *lc;
    struct segment_command_64 *sc, *linkedit, *text;
    struct symtab_command *symtab;
    struct nlist_64 *nl;
    char *strtab;
    symtab = 0;
    linkedit = 0;
    text = 0;
    lc = (struct load_command *)(base + sizeof(struct mach_header_64));
    for (uint32_t i = 0; i < ((struct mach_header_64 *)base)->ncmds; i++) {
        if (lc->cmd == 0x2/*LC_SYMTAB*/) {
            symtab = (struct symtab_command *)lc;
        } else if (lc->cmd == 0x19/*LC_SEGMENT_64*/) {
            sc = (struct segment_command_64 *)lc;
            switch (*((unsigned int *)&((struct segment_command_64 *)lc)->segname[2])) { //skip __
            case 0x4b4e494c: //LINK
                linkedit = sc;
                break;
            case 0x54584554: //TEXT
                text = sc;
                break;
            }
        }
        lc = (struct load_command *)((unsigned long)lc + lc->cmdsize);
    }
    if (!linkedit || !symtab || !text) return -1;
    unsigned long file_slide = linkedit->vmaddr - text->vmaddr - linkedit->fileoff;
    strtab = (char *)(base + file_slide + symtab->stroff);
    nl = (struct nlist_64 *)(base + file_slide + symtab->symoff);
    for (uint32_t i = 0; i < symtab->nsyms; i++) {
        char *name = strtab + nl[i].n_un.n_strx;
        if (*(unsigned int *)&name[offset] == match) {
            if (is_sierra()) {
                return base + nl[i].n_value;
            } else {
                return base - DYLD_BASE + nl[i].n_value;
            }
        }
    }
    return -1;
}
int load_from_disk(char *filename, char **buf, unsigned int *size) {
    /*
    What, you say? this isn't running from memory! You're loading from disk!!
    Put down the pitchforks, please. Yes, this reads a binary from disk...into
    memory. The code is then executed from memory. This here is a POC; in
    real life you would probably want to read into buf from a socket.
    */
    int fd;
    struct stat s;
    if ((fd = open(filename, O_RDONLY)) == -1) return 1;
    if (fstat(fd, &s)) {
        close(fd);
        return 1;
    }
    *size = s.st_size;
    if ((*buf = mmap(NULL, *size, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED | MAP_ANON, -1, 0)) == MAP_FAILED) {
        *buf = NULL;  // never hand MAP_FAILED back to the caller
        close(fd);
        return 1;
    }
    if (read(fd, *buf, *size) != (ssize_t)*size) {
        // the buffer came from mmap(), so release it with munmap(), not free()
        munmap(*buf, *size);
        *buf = NULL;
        close(fd);
        return 1;
    }
    close(fd);
    return 0;
}
int load_and_exec(char *filename, unsigned long dyld) {
    // Load the binary specified by filename using dyld
    char *binbuf = NULL;
    unsigned int size = 0;
    unsigned long addr;
    NSObjectFileImageReturnCode (*create_file_image_from_memory)(const void *, size_t, NSObjectFileImage *) = NULL;
    NSModule (*link_module)(NSObjectFileImage, const char *, unsigned long) = NULL;
    // resolve symbols for NSCreateObjectFileImageFromMemory & NSLinkModule
    // 0x4d6d6f72 is "romM" at offset 25 of _NSCreateObjectFileImageFromMemory
    addr = resolve_symbol(dyld, 25, 0x4d6d6f72);
    if (addr == (unsigned long)-1) {
        fprintf(stderr, "Could not resolve symbol: _sym[25] == 0x4d6d6f72.\n");
        goto err;
    }
    create_file_image_from_memory = (NSObjectFileImageReturnCode (*)(const void *, size_t, NSObjectFileImage *)) addr;
    // 0x4d6b6e69 is "inkM" at offset 4 of _NSLinkModule
    addr = resolve_symbol(dyld, 4, 0x4d6b6e69);
    if (addr == (unsigned long)-1) {
        fprintf(stderr, "Could not resolve symbol: _sym[4] == 0x4d6b6e69.\n");
        goto err;
    }
    link_module = (NSModule (*)(NSObjectFileImage, const char *, unsigned long)) addr;
    // load filename into a buf in memory
    if (load_from_disk(filename, &binbuf, &size)) goto err;
    // change the filetype to a bundle (filetype is the 4th 32-bit header field)
    int type = ((int *)binbuf)[3];
    if (type != MH_BUNDLE) ((int *)binbuf)[3] = MH_BUNDLE; //0x8
    // create file image
    NSObjectFileImage fi;
    if (create_file_image_from_memory(binbuf, size, &fi) != NSObjectFileImageSuccess) {
        fprintf(stderr, "Could not create image.\n");
        goto err;
    }
    // link image
    NSModule nm = link_module(fi, "mytest", NSLINKMODULE_OPTION_PRIVATE |
                                            NSLINKMODULE_OPTION_BINDNOW);
    if (!nm) {
        fprintf(stderr, "Could not link image.\n");
        goto err;
    }
    // find entry point and call it
    if (type == MH_EXECUTE) { //0x2
        unsigned long execute_base;
        struct entry_point_command *epc;
        if (find_macho((unsigned long)nm, &execute_base, sizeof(int), 1)) {
            fprintf(stderr, "Could not find execute_base.\n");
            goto err;
        }
        if (find_epc(execute_base, &epc)) {
            fprintf(stderr, "Could not find ec.\n");
            goto err;
        }
        int (*main)(int, char **, char **, char **) = (int (*)(int, char **, char **, char **))(execute_base + epc->entryoff);
        char *argv[] = {"test", NULL};
        int argc = 1;
        char *env[] = {NULL};
        char *apple[] = {NULL};
        return main(argc, argv, env, apple);
    }
err:
    // binbuf was allocated with mmap(), so release it with munmap()
    if (binbuf) munmap(binbuf, size);
    return 1;
}
int main(int ac, char **av) {
    if (ac != 2) {
        fprintf(stderr, "usage: %s <filename>\n", av[0]);
        exit(1);
    }
    unsigned long binary, dyld;
    // find dyld based on os version
    if (is_sierra()) {
        if (find_macho(EXECUTABLE_BASE_ADDR, &binary, 0x1000, 0)) return 1;
        if (find_macho(binary + 0x1000, &dyld, 0x1000, 0)) return 1;
    } else {
        if (find_macho(DYLD_BASE, &dyld, 0x1000, 0)) return 1;
    }
    // load and execute the specified binary
    return load_and_exec(av[1], dyld);
}
To run the program, follow the steps below.
- Save the code as LoadFromBin.c
- Run the command $ gcc -o LoadFromBin LoadFromBin.c
- Load a binary such as /bin/ls like this: $ ./LoadFromBin /bin/ls
The program takes its first argument as the binary to be executed; we can change that argument however we want.
I updated the above program with a small patch so that it forwards command-line arguments to the invoked program, such as ./LoadFromBin /bin/cat filename.txt or ./LoadFromBin /bin/ls -ls.
Here is the program.
#include <fcntl.h>
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <dirent.h>
#include <mach-o/loader.h>
#include <mach-o/nlist.h>
#include <mach-o/dyld.h>
#include <dlfcn.h>
#define EXECUTABLE_BASE_ADDR 0x100000000
#define DYLD_BASE 0x00007fff5fc00000
int IS_SIERRA = -1;
int is_sierra(void) {
    // returns 1 if running on Sierra, 0 otherwise
    // this works because /bin/rcp was removed in Sierra
    if (IS_SIERRA == -1) {
        struct stat statbuf;
        IS_SIERRA = (stat("/bin/rcp", &statbuf) != 0);
    }
    return IS_SIERRA;
}
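int find_macho(unsigned long addr, unsigned long *base, unsigned int increment, unsigned int dereference) {
    unsigned long ptr;
    // find a Mach-O header by searching from address.
    // note: the loop only ends when a header is found
    *base = 0;
    while (1) {
        ptr = addr;
        if (dereference) ptr = *(unsigned long *)ptr;
        // probe the address by passing it to chmod() as a path: an unmapped
        // address fails with EFAULT, a mapped one normally fails with ENOENT
        errno = 0;  // clear any stale value before the probe
        chmod((char *)ptr, 0777);
        if (errno == ENOENT &&
            ((unsigned int *)ptr)[0] == MH_MAGIC_64 /*0xfeedfacf*/) {
            *base = ptr;
            return 0;
        }
        addr += increment;
    }
    return 1;
}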
int find_epc(unsigned long base, struct entry_point_command **entry) {
    // find the entry point command by searching through base's load commands
    struct mach_header_64 *mh;
    struct load_command *lc;
    *entry = NULL;
    mh = (struct mach_header_64 *)base;
    lc = (struct load_command *)(base + sizeof(struct mach_header_64));
    for (uint32_t i = 0; i < mh->ncmds; i++) {
        if (lc->cmd == LC_MAIN) { //0x80000028
            *entry = (struct entry_point_command *)lc;
            return 0;
        }
        lc = (struct load_command *)((unsigned long)lc + lc->cmdsize);
    }
    return 1;
}
unsigned long resolve_symbol(unsigned long base, unsigned int offset, unsigned int match) {
    // Parse the symbols in the Mach-O image at base and return the address of the one
    // matched by the offset / int pair (offset, match).
    // Returns (unsigned long)-1 when nothing matches.
    struct load_command *lc;
    struct segment_command_64 *sc, *linkedit, *text;
    struct symtab_command *symtab;
    struct nlist_64 *nl;
    char *strtab;
    symtab = 0;
    linkedit = 0;
    text = 0;
    lc = (struct load_command *)(base + sizeof(struct mach_header_64));
    for (uint32_t i = 0; i < ((struct mach_header_64 *)base)->ncmds; i++) {
        if (lc->cmd == 0x2/*LC_SYMTAB*/) {
            symtab = (struct symtab_command *)lc;
        } else if (lc->cmd == 0x19/*LC_SEGMENT_64*/) {
            sc = (struct segment_command_64 *)lc;
            switch (*((unsigned int *)&((struct segment_command_64 *)lc)->segname[2])) { //skip __
            case 0x4b4e494c: //LINK
                linkedit = sc;
                break;
            case 0x54584554: //TEXT
                text = sc;
                break;
            }
        }
        lc = (struct load_command *)((unsigned long)lc + lc->cmdsize);
    }
    if (!linkedit || !symtab || !text) return -1;
    unsigned long file_slide = linkedit->vmaddr - text->vmaddr - linkedit->fileoff;
    strtab = (char *)(base + file_slide + symtab->stroff);
    nl = (struct nlist_64 *)(base + file_slide + symtab->symoff);
    for (uint32_t i = 0; i < symtab->nsyms; i++) {
        char *name = strtab + nl[i].n_un.n_strx;
        if (*(unsigned int *)&name[offset] == match) {
            if (is_sierra()) {
                return base + nl[i].n_value;
            } else {
                return base - DYLD_BASE + nl[i].n_value;
            }
        }
    }
    return -1;
}
int load_from_disk(char *filename, char **buf, unsigned int *size) {
    /*
    Yes, this reads the binary from disk, but only to get it into memory.
    The code is then executed from memory. This here is a POC; in
    real life you would probably want to read into buf from a socket.
    */
    int fd;
    struct stat s;
    if ((fd = open(filename, O_RDONLY)) == -1) return 1;
    if (fstat(fd, &s)) {
        close(fd);
        return 1;
    }
    *size = s.st_size;
    if ((*buf = mmap(NULL, *size, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED | MAP_ANON, -1, 0)) == MAP_FAILED) {
        *buf = NULL;  // never hand MAP_FAILED back to the caller
        close(fd);
        return 1;
    }
    if (read(fd, *buf, *size) != (ssize_t)*size) {
        // the buffer came from mmap(), so release it with munmap(), not free()
        munmap(*buf, *size);
        *buf = NULL;
        close(fd);
        return 1;
    }
    close(fd);
    return 0;
}
int load_and_exec(int argument_count, char **arguments, unsigned long dyld)
{
    // arguments[0] is the binary to load; the whole vector is forwarded to it
    char *filename = arguments[0];
    char *binbuf = NULL;
    unsigned int size = 0;
    unsigned long addr;
    NSObjectFileImageReturnCode (*create_file_image_from_memory)(const void *, size_t, NSObjectFileImage *) = NULL;
    NSModule (*link_module)(NSObjectFileImage, const char *, unsigned long) = NULL;
    // resolve symbols for NSCreateObjectFileImageFromMemory & NSLinkModule
    // 0x4d6d6f72 is "romM" at offset 25 of _NSCreateObjectFileImageFromMemory
    addr = resolve_symbol(dyld, 25, 0x4d6d6f72);
    if (addr == (unsigned long)-1) {
        fprintf(stderr, "Could not resolve symbol: _sym[25] == 0x4d6d6f72.\n");
        goto err;
    }
    create_file_image_from_memory = (NSObjectFileImageReturnCode (*)(const void *, size_t, NSObjectFileImage *)) addr;
    // 0x4d6b6e69 is "inkM" at offset 4 of _NSLinkModule
    addr = resolve_symbol(dyld, 4, 0x4d6b6e69);
    if (addr == (unsigned long)-1) {
        fprintf(stderr, "Could not resolve symbol: _sym[4] == 0x4d6b6e69.\n");
        goto err;
    }
    link_module = (NSModule (*)(NSObjectFileImage, const char *, unsigned long)) addr;
    // load filename into a buf in memory
    if (load_from_disk(filename, &binbuf, &size)) goto err;
    // change the filetype to a bundle (filetype is the 4th 32-bit header field)
    int type = ((int *)binbuf)[3];
    if (type != MH_BUNDLE) ((int *)binbuf)[3] = MH_BUNDLE; //0x8
    // create file image
    NSObjectFileImage fi;
    if (create_file_image_from_memory(binbuf, size, &fi) != NSObjectFileImageSuccess) {
        fprintf(stderr, "Could not create image.\n");
        goto err;
    }
    // link image
    NSModule nm = link_module(fi, "mytest", NSLINKMODULE_OPTION_PRIVATE |
                                            NSLINKMODULE_OPTION_BINDNOW);
    if (!nm) {
        fprintf(stderr, "Could not link image.\n");
        goto err;
    }
    // find entry point and call it
    if (type == MH_EXECUTE) { //0x2
        unsigned long execute_base;
        struct entry_point_command *epc;
        if (find_macho((unsigned long)nm, &execute_base, sizeof(int), 1)) {
            fprintf(stderr, "Could not find execute_base.\n");
            goto err;
        }
        if (find_epc(execute_base, &epc)) {
            fprintf(stderr, "Could not find ec.\n");
            goto err;
        }
        int (*main)(int, char **, char **, char **) = (int (*)(int, char **, char **, char **))(execute_base + epc->entryoff);
        char *env[] = {NULL};
        char *apple[] = {NULL};
        // forward the caller's arguments instead of a fixed argv
        return main(argument_count, arguments, env, apple);
    }
err:
    // binbuf was allocated with mmap(), so release it with munmap()
    if (binbuf) munmap(binbuf, size);
    return 1;
}
int main(int ac, char **av) {
    unsigned long binary, dyld;
    if (ac < 2) {
        fprintf(stderr, "usage: %s <filename> [args...]\n", av[0]);
        return 1;
    }
    // find dyld based on os version
    if (is_sierra()) {
        if (find_macho(EXECUTABLE_BASE_ADDR, &binary, 0x1000, 0)) return 1;
        if (find_macho(binary + 0x1000, &dyld, 0x1000, 0)) return 1;
    } else {
        if (find_macho(DYLD_BASE, &dyld, 0x1000, 0)) return 1;
    }
    // load and execute the specified binary, dropping our own name from argv
    return load_and_exec(ac - 1, av + 1, dyld);
}
- When I tried to override the function calls made by invoked binaries such as /bin/cat or /bin/ls, I built a dynamic library of the overriding functions and then tried to inject it into the process. Unfortunately, with SIP in place, unless we use DYLD_FORCE_FLAT_NAMESPACE=1 we cannot inject the code in such a way that the binary uses our library instead of a standard library such as /usr/lib/libSystem.B.dylib.
- There are interesting blog posts which give some hints about doing so (Here and Here), but again SIP comes into the picture, so we cannot use DYLD_INTERPOSE without DYLD_FORCE_FLAT_NAMESPACE=1.
- I have written a small library which contains the overridden functions for open() and malloc(). If you compile the code and then inject it using DYLD_FORCE_FLAT_NAMESPACE=1, it will successfully trace those function calls.
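//
// lib_override.c
// MachOverride
//
// Created by Prashant on 6/9/17.
// Copyright © 2017 NYU Vida Labs. All rights reserved.
//
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>   // malloc
#include <string.h>
#include <stdarg.h>
#include <fcntl.h>    // open, O_CREAT
#include <dlfcn.h>
#include "dyld_interposing.h"
/**
 * Replacement for open(): forwards to the real open() and logs the call.
 *
 * @param filename path being opened
 * @param flags    open flags; a mode argument follows when O_CREAT is set
 **/
int my_open(const char *filename, int flags, ...)
{
    mode_t mode = 0;
    if (flags & O_CREAT) {
        // open() is variadic: the mode is only passed together with O_CREAT
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    int fd = open(filename, flags, mode);
    fprintf(stderr, "open(%s, %i) = %i\n", filename, flags, fd);
    return fd;
}
DYLD_INTERPOSE(my_open, open);
//--------------------------------------------------------
/**
 * Replacement for malloc(): forwards to the real malloc() and logs the call.
 *
 * @param size number of bytes requested
 */
void *my_malloc(size_t size)
{
    void *p = malloc(size);
    fprintf(stderr, "malloc(%zu) = %p\n", size, p);
    return p;
}
DYLD_INTERPOSE(my_malloc, malloc);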
- You can compile the above code into a dynamic library, for example override.dylib.
- I tried using NSLookupAndBindSymbolWithHint() from the link. My attempt to integrate it into the above code was not successful.